Hi everyone,

 

I’m new to the Nutch world and I am currently implementing an Intranet
search project using it. The preliminary results have been really good, but
I do have some questions that I’d like to pose:

 

- Is there any way to perform form-based authentication? I know this is a
common request, but I haven’t found a “good enough” answer to it. The only
references I’ve found are about basic auth, which I’d prefer to avoid. I ask
because I’ve noticed that SearchBlox, which uses Nutch internally, has an
option to support form-based auth. Was this something they developed on
their own?

- Another issue I have is authorization support. The intranet I’m working on
has different security profiles, with sensitive content that must be hidden
from some users but searchable by others. What is the best way to do this?
Should I keep one index per profile?

- What is the best reference for implementing incremental indexing? I would
rather not rebuild my index in every crawl session; I would prefer to have
it updated incrementally. Is this possible?

- Can the companion web app (the search web app included in the Nutch
distribution) perform the crawling process too? I ask because I’ve noticed
that it includes a nutch-default.xml file. Maybe it uses Quartz or something
similar to perform asynchronous processing?

- Can Nutch perform stemming?

 

Please feel free to answer only one of these questions at a time. (I know
there are a lot of questions).

 

Thanks,

Gonçalo Gaiolas