Hi Talat,

Thanks a lot for the reply. I will go through it and try it out.

Thanks,
Tizy

On Tue, Dec 16, 2014 at 2:25 PM, Talat Uyarer <[email protected]> wrote:
>
> Hi Tizy,
>
> There is some discussion. You can find it at NUTCH-827 [1]. IMHO we need
> some help. If we create this feature it will be useful.
>
> Talat
>
> [1] https://issues.apache.org/jira/browse/NUTCH-827
>
> 2014-12-16 10:44 GMT+02:00 Tizy Ninan <[email protected]>:
> > Hi,
> >
> > Thanks for the reply.
> > Is there any alternative way to do this authentication? Does the fetcher
> > job of Nutch accept cookies for fetching the web sites from the same
> > domain? Could you suggest any workaround to do form based authentication
> > using Nutch?
> >
> > Thanks,
> > Tizy
> >
> > On Tue, Dec 16, 2014 at 1:08 PM, Halil Ibrahim Simsek <[email protected]>
> > wrote:
> >>
> >> Hello Tizy,
> >>
> >> As far as I know, currently the development version of Nutch can do
> >> Basic, Digest and NTLM based authentication. [1] Nutch cannot do POST
> >> based authentication that depends on cookies. BTW there is a document
> >> which is supposed to provide this feature, but as far as I can see no
> >> code has been developed yet. [2]
> >>
> >> [1] https://wiki.apache.org/nutch/HttpAuthenticationSchemes
> >> [2] https://wiki.apache.org/nutch/HttpPostAuthentication
> >>
> >> Halil
> >>
> >> 2014-12-16 7:16 GMT+02:00 Tizy Ninan <[email protected]>:
> >> >
> >> > Hi,
> >> >
> >> > I am trying to develop a custom crawler to crawl websites that require
> >> > form based authentication using Nutch v1.9 in Java. I am following the
> >> > HttpPostAuthentication feature of Nutch to implement it.
> >> >
> >> > The login parameters required for authentication, such as the HTML
> >> > form id and the login POST data (username, password), are specified as
> >> > key-value pairs in a configuration file. What is required to identify
> >> > the HTML login form (id or name of the HTML form)? How can the HTML
> >> > form parameters be identified if the id or name of the form is not
> >> > specified?
> >> >
> >> > I have also posted the question to the developer mailing list, but did
> >> > not receive any reply. I have been stuck on this for a while. Could
> >> > somebody provide a solution on how to specify the HTML form parameters
> >> > of websites to be crawled to perform form based authentication?
> >> >
> >> > Thanks and Regards,
> >> > Tizy
> >> >
> >>
> >
> > --
> > Thanks and Regards,
> > Tizy
> >
>
> --
> Talat UYARER
> Website: http://talat.uyarer.com
> Twitter: http://twitter.com/talatuyarer
> Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
>

--
Thanks and Regards,
Tizy
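On the original question of how to identify a login form that has no id or name: one common heuristic is to pick the first form that contains a password input, collect its named inputs (including hidden ones), and overlay the credentials from the configuration. The sketch below illustrates that idea in plain Java. It is not Nutch code and is not taken from NUTCH-827 or the HttpPostAuthentication page; the class name, method names and the field names `user` and `pass` are all hypothetical, and the regex matching is a simplification that only handles quoted attribute values (a real crawler would more likely use a proper HTML parser).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: finds a login form by its
// password field instead of by form id or name.
final class LoginFormSketch {

    private static final Pattern FORM = Pattern.compile(
            "<form\\b[^>]*>(.*?)</form>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern INPUT = Pattern.compile(
            "<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns the quoted value of an attribute inside a tag, or null if absent.
    static String attr(String tag, String name) {
        Matcher m = Pattern.compile(
                "\\s" + Pattern.quote(name) + "\\s*=\\s*([\"'])(.*?)\\1",
                Pattern.CASE_INSENSITIVE).matcher(tag);
        return m.find() ? m.group(2) : null;
    }

    // Returns name -> default value for the first form that has a named
    // password input; an empty map if no such form exists.
    static Map<String, String> findLoginFields(String html) {
        Matcher form = FORM.matcher(html);
        while (form.find()) {
            Map<String, String> fields = new LinkedHashMap<>();
            boolean hasPassword = false;
            Matcher input = INPUT.matcher(form.group(1));
            while (input.find()) {
                String tag = input.group();
                String name = attr(tag, "name");
                if (name == null) {
                    continue; // unnamed inputs are not submitted
                }
                String value = attr(tag, "value");
                fields.put(name, value == null ? "" : value);
                if ("password".equalsIgnoreCase(attr(tag, "type"))) {
                    hasPassword = true;
                }
            }
            if (hasPassword) {
                return fields;
            }
        }
        return Collections.emptyMap();
    }

    // Overlays configured key-value pairs on the form defaults and builds
    // an application/x-www-form-urlencoded POST body.
    static String buildPostBody(Map<String, String> formFields,
                                Map<String, String> configured) {
        Map<String, String> merged = new LinkedHashMap<>(formFields);
        merged.putAll(configured);
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : merged.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Made-up page: a search form followed by an unnamed login form.
        String html = "<form action=\"/search\"><input type=\"text\" name=\"q\"></form>"
                + "<form action=\"/login\" method=\"post\">"
                + "<input type=\"hidden\" name=\"token\" value=\"abc123\">"
                + "<input type=\"text\" name=\"user\">"
                + "<input type=\"password\" name=\"pass\">"
                + "<input type=\"submit\" value=\"Log in\">"
                + "</form>";
        Map<String, String> configured = new LinkedHashMap<>();
        configured.put("user", "tizy");      // hypothetical config entries
        configured.put("pass", "s3 cret");
        System.out.println(buildPostBody(findLoginFields(html), configured));
    }
}
```

Keeping the hidden fields matters because many login pages include a per-session token that must be echoed back in the POST. The session cookie returned by the login response would still have to be carried on later requests, which is the part the thread says Nutch does not yet handle.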

