nutch 2.1 and session cookies

Michael Gang Tue, 08 Jan 2013 07:08:41 -0800

Hi,

I am looking for a way to scrape pages that require you to log in first.
I searched Google and found this JIRA issue:
https://issues.apache.org/jira/browse/NUTCH-827
"
I've created a patch against the trunk which adds support for very
rudimentary POST-based authentication support. It takes a link from
nutch-site.xml with a site to POST to and its respective parameters
(username, password, etc.). It then checks upon every request whether any
cookies have been initialized, and if none have, it fetches them from the
given link.
".
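
As I understand the description, the logic would be roughly as sketched
below. This is only an illustration in Python using the standard library,
not the NUTCH-827 patch itself. The class name LoginSession and its
parameters are hypothetical; in Nutch the login link and its parameters
would come from nutch-site.xml.

```python
import http.cookiejar
import urllib.parse
import urllib.request


class LoginSession:
    """Fetch pages, POSTing to a login link first if no cookies are held."""

    def __init__(self, login_url, login_params):
        # Hypothetical inputs: the link to POST to and its form fields,
        # e.g. {"username": "...", "password": "..."}.
        self.login_url = login_url
        self.login_params = login_params
        self.cookies = http.cookiejar.CookieJar()
        # The cookie processor stores cookies from responses in the jar
        # and sends them back on later requests to the same site.
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(self.cookies))

    def _login(self):
        body = urllib.parse.urlencode(self.login_params).encode("utf-8")
        # Passing data makes urllib send a POST request.
        self.opener.open(self.login_url, data=body).close()

    def fetch(self, url):
        # Check on every request whether any cookies have been
        # initialized; if none have, fetch them from the login link.
        if len(self.cookies) == 0:
            self._login()
        with self.opener.open(url) as response:
            return response.read()
```

With this sketch, the first call to fetch() triggers the login POST and
later calls reuse the stored cookies. It does not handle expired sessions
or failed logins.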


I wanted to ask whether this patch will be included in Nutch 2.

Thanks,
David
