Yoav - I'm not 100% certain about this, as I haven't had to deal with 
Nutch+cookies myself, but I did see some logging that made me think "ah, this 
thing handles cookies like a browser".  Yes, that's most likely behavior that 
comes with httpclient, so try enabling protocol-httpclient and disabling 
protocol-http.  Want to try it and report back?
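
For what it's worth, here is a rough sketch of that swap as an override in 
conf/nutch-site.xml.  Only the protocol-http -> protocol-httpclient change is 
the point; the rest of the plugin list is an assumption based on a typical 
default install, so copy the actual value from your own nutch-default.xml and 
change just that one entry.

```xml
<!-- Sketch of an override in conf/nutch-site.xml; goes inside the existing <configuration> element. -->
<!-- Assumption: every entry after protocol-httpclient mirrors a typical default list and may differ in your version. -->
<property>
  <name>plugin.includes</name>
  <value>protocol-httpclient|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

Listing protocol-httpclient in place of protocol-http should be enough to 
disable the latter, since a plugin is only loaded if plugin.includes matches it.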

Found this: 
http://wiki.apache.org/nutch/HttpPostAuthentication?highlight=%28cookies%29

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Yoav Shapira <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Monday, May 5, 2008 8:49:50 PM
> Subject: How to authenticate with cookies?
> 
> Hi,
> 
> I'm using Nutch to crawl an intranet site that is behind form
> authentication.  I know Nutch doesn't support form authentication yet
> (right?), but I think this site would also work with cookies.  I have
> the right set of cookie names and values, at least for testing, but I
> don't know how to have Nutch use these cookies with every HTTP
> request during its crawl.
> 
> I saw a reference to a "protocol-httpclient" plugin.  Is that true / relevant?
> 
> Any help on configuring Nutch to use cookies for authentication would
> be appreciated.
> 
> -- 
> Thanks,
> 
> Yoav
> 

