On Wed, Oct 1, 2008 at 4:35 PM, Yoav Shapira <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I would like to use Nutch to crawl and index an intranet web site for
> internal use. The site requires authentication, and stores the
> credentials in a cookie. I've got a valid login and I have the cookie
> saved, no problem. How do I tell Nutch to use it?
>
> I did some research online before asking, but unfortunately I couldn't
> find a step-by-step answer for a newbie like myself. I see there's an
> http-client plugin that can support some authentication. Is that what
> I should use for cookies? If so, how do I configure it?
>
> Or is there something else I should be doing? If the documentation /
> answer exists, sorry for the hassle and please just point me to it ;)
>

Unfortunately, Nutch doesn't have such a feature yet. (One of the
problems is that we do not have a place to store cookies in a
distributed setup.)

> --
> Thanks,
>
> Yoav
>

--
Doğacan Güney
