On 9/8/06, Jim Wilson <[EMAIL PROTECTED]> wrote:
> Dear Nutch User List,
>
> I am desperately trying to index an Intranet with the following
> characteristics
>
> 1) Some sites require no authentication - these already work great!
> 2) Some sites require basic HTTP Authentication.
> 3) Some sites require NTLM Authentication.
> 4) No sites require both HTTP and NTLM (only one or the other).
> 5) The same Username/Password should work on all sites which require either
> type of Authentication.
> 6) For sites requiring NTLM Authentication, the same Domain is always used.
> 7) If a site requires authentication, but the Username/Password mentioned
> above fails, the site doesn't matter and does not need fetched/indexed.
>
> My question is this: How can I provide a default Username/Password/Domain
> for Nutch to use when answering HTTP or NTLM challenges?
>
> (I really hope all I need is a couple of <property> tags in my
> nutch-site.xml, but I'm beginning to doubt it).
>
> I love Nutch, and really want to use it. Please help if you know the
> answer. Thanks!

I'm also very interested in hearing more on the topic. The only mention of
a solution to (a part of) this problem I found is
http://www.dehora.net/journal/2005/11/nutch_with_basic_authentication.html

t.n.a.
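
For reference, the Basic half of the question (answering a challenge with one default username/password) can be sketched in plain Java. This is only an illustration of what a fetcher would have to send, not Nutch configuration and not Nutch API: the class name, method names and the example credentials are all hypothetical. NTLM is only detected here, since it needs a multi-step handshake that is normally left to an HTTP client library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: picks a reply to an HTTP auth challenge.
class DefaultCredentials {

    /** Returns the scheme named in a WWW-Authenticate header value, lower-cased. */
    static String challengeScheme(String wwwAuthenticate) {
        String trimmed = wwwAuthenticate.trim();
        int space = trimmed.indexOf(' ');
        String scheme = space < 0 ? trimmed : trimmed.substring(0, space);
        return scheme.toLowerCase(Locale.ROOT);
    }

    /** Builds the Authorization header value for Basic authentication. */
    static String basicAuthorization(String user, String password) {
        byte[] pair = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(pair);
    }

    /**
     * Returns the header to retry with, or null when the site should be skipped
     * (unknown scheme) or the scheme needs a full handshake (NTLM).
     */
    static String answer(String wwwAuthenticate, String user, String password) {
        if (challengeScheme(wwwAuthenticate).equals("basic")) {
            return basicAuthorization(user, password);
        }
        return null; // NTLM or anything else: not handled in this sketch
    }

    public static void main(String[] args) {
        // Example credentials are placeholders, not real accounts.
        System.out.println(answer("Basic realm=\"intranet\"", "Aladdin", "open sesame"));
        System.out.println(answer("NTLM", "Aladdin", "open sesame"));
    }
}
```

If the retried request still comes back with a 401, point 7 of the original list applies and the site can simply be dropped from the crawl.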
