Hi Talat,

Thanks a lot for the reply. I will go through it and try it out.

Thanks,
Tizy

On Tue, Dec 16, 2014 at 2:25 PM, Talat Uyarer <[email protected]> wrote:
>
> Hi Tizy,
>
> There is some discussion of this; you can find it at NUTCH-827 [1]. IMHO we
> need some help there. If we create this feature, it will be useful.
>
> Talat
>
> [1] https://issues.apache.org/jira/browse/NUTCH-827
>
> 2014-12-16 10:44 GMT+02:00 Tizy Ninan <[email protected]>:
> > Hi,
> >
> > Thanks for the reply.
> > Is there any alternative way to do this authentication? Does the fetcher
> > job of Nutch accept cookies for fetching the web sites from the same
> > domain? Could you suggest any workaround to do form based authentication
> > using Nutch?
> >
> > Thanks,
> > Tizy
> >
> > On Tue, Dec 16, 2014 at 1:08 PM, Halil Ibrahim Simsek <[email protected]> wrote:
> >>
> >> Hello Tizy,
> >>
> >> As far as I know, the current development version of Nutch can do Basic,
> >> Digest and NTLM based authentication [1]. Nutch cannot do POST based
> >> authentication that depends on cookies. BTW, there is a document that is
> >> supposed to describe this feature, but as far as I can see no code has
> >> been developed yet [2].
> >>
> >> [1] https://wiki.apache.org/nutch/HttpAuthenticationSchemes
> >> [2] https://wiki.apache.org/nutch/HttpPostAuthentication
> >>
> >> Halil
> >>
> >> 2014-12-16 7:16 GMT+02:00 Tizy Ninan <[email protected]>:
> >> >
> >> > Hi,
> >> >
> >> > I am trying to develop a custom crawler in Java, using Nutch v1.9, to
> >> > crawl websites that require form based authentication. I am following
> >> > the HttpPostAuthentication wiki page of Nutch to implement it.
> >> >
> >> > The login parameters required for authentication, such as the HTML
> >> > form id and the login POST data (username, password), are specified as
> >> > key-value pairs in a configuration file. What is required to identify
> >> > the HTML login form (the id or the name of the form)? How can the form
> >> > parameters be identified if the form has no id or name?
> >> >
> >> > I have also posted the question to the developer mailing list, but did
> >> > not receive any reply. I have been stuck on this for a while. Could
> >> > somebody suggest how to specify the HTML form parameters of the
> >> > websites to be crawled in order to perform form based authentication?
> >> >
> >> > Thanks and Regards,
> >> > Tizy
> >> >
> >>
> >
> >
> > --
> > Thanks and Regards,
> > Tizy
>
>
>
> --
> Talat UYARER
> Website: http://talat.uyarer.com
> Twitter: http://twitter.com/talatuyarer
> Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
>


-- 
Thanks and Regards,
Tizy
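
On the original question of how to identify a login form that has no id or
name: one common heuristic is to pick the first form that contains a password
input and read the field names from it. The sketch below illustrates only that
idea. It is not part of Nutch, the class and method names (LoginFormFinder,
loginFieldNames) are hypothetical, and the regular expressions are a
simplification that assumes quoted attribute values; a real HTML parser would
be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Nutch API: finds the login form by looking for
// a password field instead of relying on the form's id or name.
public class LoginFormFinder {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    private static final Pattern FORM =
            Pattern.compile("<form\\b[^>]*>(.*?)</form>", FLAGS);
    private static final Pattern INPUT =
            Pattern.compile("<input\\b[^>]*>", FLAGS);
    private static final Pattern PASSWORD =
            Pattern.compile("type\\s*=\\s*[\"']?password", FLAGS);
    private static final Pattern NAME =
            Pattern.compile("\\bname\\s*=\\s*[\"']([^\"']*)[\"']", FLAGS);

    /**
     * Returns the names of all input fields in the first form that contains
     * a password input, or an empty list if no such form exists.
     */
    public static List<String> loginFieldNames(String html) {
        Matcher form = FORM.matcher(html);
        while (form.find()) {
            String body = form.group(1);
            if (!PASSWORD.matcher(body).find()) {
                continue; // no password field, so probably not a login form
            }
            List<String> names = new ArrayList<>();
            Matcher input = INPUT.matcher(body);
            while (input.find()) {
                Matcher name = NAME.matcher(input.group());
                if (name.find()) {
                    names.add(name.group(1));
                }
            }
            return names;
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        String html = "<form action='/search'><input name='q'></form>"
                + "<form action='/login' method='post'>"
                + "<input type='text' name='user'>"
                + "<input type='password' name='pass'>"
                + "<input type='hidden' name='token' value='abc'>"
                + "</form>";
        System.out.println(loginFieldNames(html)); // prints [user, pass, token]
    }
}
```

The returned names (including hidden fields such as a CSRF token) are the keys
that would have to be sent in the login POST; the session cookie from the
response then has to accompany later requests to the same domain.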
