Hi everyone,
I'm new to the Nutch world and am currently implementing an intranet search project with it. The preliminary results have been really good, but I do have some questions that I'd like to pose:

- Is there any way to perform form-based authentication? I know this is a common request, but I haven't found a good enough answer to it. The only references I've found are about basic auth, which I'd prefer to avoid. I ask because I've noticed that SearchBlox, which uses Nutch internally, has an option to support form-based auth. Was this something they developed on their own?

- Another issue I have is authorization support. The intranet I'm working on has different security profiles, with sensitive content that must be hidden from some users but has to be searchable by others. What is the best way to do this? Should I keep one index per profile?

- What is the best reference for implementing incremental indexing? I'd rather not rebuild my index in every crawl session; I would prefer to have it updated incrementally. Is this possible?

- Can the companion web app (the search web app included in the Nutch distribution) perform the crawling process too? I ask because I've noticed that it includes a nutch-default.xml file. Maybe it uses Quartz or something similar to perform asynchronous processing?

- Can Nutch perform stemming?

Please feel free to answer only one of these questions at a time (I know there are a lot of them).

Thanks,
Gonçalo Gaiolas
