Hi Bob,

It's explained as:

"Example:-
    <credentials username="susam" password="masus">
      <default realm="sso"/>
      <authscope host="192.168.101.33" port="80" realm="login"/>
      <authscope host="example" port="8080" realm="blogs"/>
      <authscope host="example" port="8080" realm="wiki"/>
      <authscope host="example" port="80" realm="quiz" scheme="NTLM"/>
    </credentials>
    <credentials username="admin" password="nimda">
      <authscope host="example" port="8080"/>
    </credentials>

  In the above example, 'example:8080' server has pages with multiple
  authentication realms. The first set of credentials would be used for
  'blogs' and 'wiki' authentication realms. The second set of
  credentials would be used for all other realms. For 'login' realm of
  '192.168.101.33', the first set of credentials would be used. For any
  other realm of '192.168.101.33' authentication would not be done. For
  the NTLM authentication required by 'example:80', the first set of
  credentials would be used. For 'sso' realms of all other servers, the
  first set of credentials would be used, since it is configured as
  'default'.

  NTLM does not use the notion of realms. The domain name may be
  specified as the value for 'realm' attribute in case of NTLM."

So, do you set the realm?
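
If it helps, here is a minimal sketch of an httpclient-auth.xml for your case, modeled on the NTLM line in the example above. It assumes that host and port are matched through an <authscope> element rather than through <default>, which I have not verified against your setup. The host, username, password, and domain values are placeholders, not values from your configuration:

```xml
<auth-configuration>
  <!-- Placeholder credentials: substitute your own domain account. -->
  <credentials username="DOMAINUSER" password="DOMAINPASSWORD">
    <!-- Assumption: host and port must match the URL being fetched.
         For NTLM, the 'realm' attribute carries the domain name. -->
    <authscope host="iis75.intranet" port="80" realm="DOMAIN" scheme="NTLM"/>
  </credentials>
</auth-configuration>
```

Your log says "Pre-configured credentials with scope - host: <iis75.intranet>; port: 80; not found", which might mean that no authscope entry matched that host and port.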

Kind Regards,
Furkan KAMACI

On Wed, Nov 2, 2016 at 9:49 PM, Bell, Bob <bob.b...@austintexas.gov> wrote:

> Furkan,
>
> Same results. I tried both domain\\user and domain\user. Do I need
> to put a trace on the traffic and see what packets are being
> sent by Nutch?
>
> Thanks,
> Bob
>
> -----Original Message-----
> From: Bell, Bob [mailto:bob.b...@austintexas.gov]
> Sent: Wednesday, November 02, 2016 2:31 PM
> To: user@nutch.apache.org
> Subject: RE: Nutch 1.12 NTLM authentication IIS 7.5 Intranet
>
> Yes, I will check that. I cranked up the logging and ran again, to see
> if you might spot something odd.
>
>
> -----Original Message-----
> From: Furkan KAMACI [mailto:furkankam...@gmail.com]
> Sent: Wednesday, November 02, 2016 2:20 PM
> To: user@nutch.apache.org
> Cc: Bell, Bob <bob.b...@austintexas.gov>
> Subject: Re: Nutch 1.12 NTLM authentication IIS 7.5 Intranet
>
> Hi Bob,
>
> The server may require the domain as part of the username, for example
> "domain\\user". Could you check that?
>
> Kind Regards,
> Furkan KAMACI
>
> On Wed, Nov 2, 2016 at 9:11 PM, Bell, Bob <bob.b...@austintexas.gov>
> wrote:
>
> > <iis75.intranet> is just a string replacement for our actual intranet
> > name, something like blah.intranet.org. I use the <> convention when
> > I am obscuring actual data.
> >
> > What might the log4j.properties entry for httpclient.Http be? I see it
> > is only at INFO level logging, but I do not know the proper object
> > path to set it up.
> >
> > Thanks,
> > Bob
> >
> > >Hi Bob,
> > >
> > >Do you write host as <iis75.intranet> or iis75.intranet ?
> > >
> > >Kind Regards,
> > >Furkan KAMACI
> >
> > -----Original Message-----
> > From: Bell, Bob
> > Sent: Wednesday, November 02, 2016 12:17 PM
> > To: 'user@nutch.apache.org' <user@nutch.apache.org>
> > Cc: Bell, Bob <bob.b...@austintexas.gov>
> > Subject: Nutch 1.12 NTLM authentication IIS 7.5 Intranet
> >
> > I have been trying for more than a year to get NTLM to work with IIS 7.5
> > without success. I was happy to see the recent 1.12 release and thought,
> > OK, I will give it a shot again. I am almost at the point where I do not
> > believe it works with NTLM, or it does not know how to handle the multiple
> > 401s that are returned, or I have some fundamental problem somewhere.
> > I have tried everything I could think of and am at a loss on how to solve
> > this mystery. My Nutch server is a CentOS 7 machine in VirtualBox. I am
> > using the httpclient as indicated in the docs, but with no love. I can
> > fetch with anonymous, but I need NTLM to work.
> >
> > I am using plugin.includes = protocol-httpclient
> >
> > nutch-site.xml:
> > <property>
> > <name>http.auth.file</name>
> > <value>httpclient-auth.xml</value>
> > <description>Authentication configuration file for 'protocol-httpclient'
> > plugin.
> > </description>
> > </property>
> >
> > httpclient-auth.xml for local user:
> > <auth-configuration>
> >     <credentials username="nutch" password="<somepassword>">
> >         <default  scheme="basic" port="80"/>
> >     </credentials>
> > </auth-configuration>
> >
> > Here is the output with a local user account on the server. One thing I
> > notice is that I cannot force authentication to be anything other
> > than NTLM, even though I support NTLM, basic, and digest. Notice the
> > scheme was basic, but it goes through NTLM regardless.
> >
> > [root@localhost nutch]# nutch parsechecker http://<iis75.intranet>
> > fetching: http://<iis75.intranet>
> > Whitelisted hosts: [<iis75.intranet>]
> > http.proxy.host = null
> > http.proxy.port = 8080
> > http.proxy.exception.list = false
> > http.timeout = 36000
> > http.content.limit = 65536
> > http.agent = APL-Nutch-Spider/Nutch-1.12
> > http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3
> > http.accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> > Credentials - username: nutch; set as default for realm: ; scheme: basic
> > Pre-configured credentials with scope -  host: <iis75.intranet>; port: 80; not found for url: http://<iis75.intranet>
> > Authorization required
> > Supported authentication schemes in the order of preference: [ntlm, digest, basic]
> > ntlm authentication scheme selected
> > Using authentication scheme: ntlm
> > Authorization challenge processed
> > Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> > Credentials required
> > Credentials provider not available
> > No credentials available for NTLM <any realm>@<iis75.intranet>:80
> > url: http://<iis75.intranet>; status code: 401; bytes received: 0; Content-Length: 0
> > 401 Authentication Required
> > Fetch failed with protocol status: access_denied(17), lastModified=0: Authentication required: http://<iis75.intranet>
> > [root@localhost nutch]#
> >
> >
> > httpclient-auth.xml for domain  user:
> > <auth-configuration>
> >     <credentials username="<domainuser>" password="<domainpassword>
> >         <default host="<iis75.intranet>" scheme="ntlm" port="80"
> > realm="<domain>"/>
> >     </credentials>
> > </auth-configuration>
> >
> > note: doesn’t matter what I put in the host, doesn’t seem to change
> > anything.
> >
> > [root@localhost nutch]# nutch parsechecker http://<iis75.intranet>
> > fetching: http://<iis75.intranet>
> > Whitelisted hosts: [<iis75.intranet>]
> > http.proxy.host = null
> > http.proxy.port = 8080
> > http.proxy.exception.list = false
> > http.timeout = 36000
> > http.content.limit = 65536
> > http.agent = APL-Nutch-Spider/Nutch-1.12
> > http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3
> > http.accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> > Credentials - username: <domainuser>"; set as default for realm: =<domain>; scheme: ntlm
> > Pre-configured credentials with scope -  host: <iis75.intranet>; port: 80; not found for url: http://<iis75.intranet>
> > Authorization required
> > Supported authentication schemes in the order of preference: [ntlm, digest, basic]
> > ntlm authentication scheme selected
> > Using authentication scheme: ntlm
> > Authorization challenge processed
> > Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> > Retry authentication
> > Authenticating with NTLM <any realm>@<iis75.intranet>:80
> > enter NTLMScheme.authenticate(Credentials, HttpMethod)
> > Authorization required
> > Using authentication scheme: ntlm
> > Authorization challenge processed
> > Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> > Retry authentication
> > Authenticating with NTLM <any realm>@<iis75.intranet>:80
> > enter NTLMScheme.authenticate(Credentials, HttpMethod)
> > Authorization required
> > Using authentication scheme: ntlm
> > Authorization challenge processed
> > Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> > Credentials required
> > Credentials provider not available
> > Failure authenticating with NTLM <any realm>@<iis75.intranet>:80
> > url: http://<iis75.intranet>; status code: 401; bytes received: 0; Content-Length: 0
> > 401 Authentication Required
> > Fetch failed with protocol status: access_denied(17), lastModified=0: Authentication required: http://<iis75.intranet>
> >
> > Last entry in  Hadoop.log:
> >
> > 2016-11-02 12:08:49,568 INFO  parse.ParserChecker - fetching: http://
> > <iis75.intranet>
> > 2016-11-02 12:08:50,040 DEBUG util.ObjectCache - No object cache found
> > for
> > conf=Configuration: core-default.xml, core-site.xml,
> > nutch-default.xml, nutch-site.xml, instantiating a new object cache
> > 2016-11-02 12:08:50,119 INFO  protocol.RobotRulesParser - Whitelisted
> > hosts: [<iis75.intranet>]
> > 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.proxy.host = null
> > 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.proxy.port = 8080
> > 2016-11-02 12:08:50,119 INFO  httpclient.Http -
> > http.proxy.exception.list = false
> > 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.timeout = 36000
> > 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.content.limit =
> > 65536
> > 2016-11-02 12:08:50,119 INFO  httpclient.Http - http.agent =
> > APL-Nutch-Spider/Nutch-1.12 (bob.b...@austintexas.gov)
> > 2016-11-02 12:08:50,120 INFO  httpclient.Http - http.accept.language =
> > en-us,en-gb,en;q=0.7,*;q=0.3
> > 2016-11-02 12:08:50,120 INFO  httpclient.Http - http.accept =
> > text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> > 2016-11-02 12:08:50,133 TRACE httpclient.Http - Credentials - username:
> > <domainuser>; set as default for realm: <domain>; scheme: ntlm
> > 2016-11-02 12:08:50,134 TRACE httpclient.Http - Pre-configured
> > credentials with scope -  host: <iis75.intranet>; port: 80; not found
> > for url: http:// <iis75.intranet>
> > 2016-11-02 12:08:50,313 DEBUG httpclient.HttpMethodDirector -
> > Authorization required
> > 2016-11-02 12:08:50,320 DEBUG auth.AuthChallengeProcessor - Supported
> > authentication schemes in the order of preference: [ntlm, digest,
> > basic]
> > 2016-11-02 12:08:50,320 INFO  auth.AuthChallengeProcessor - ntlm
> > authentication scheme selected
> > 2016-11-02 12:08:50,320 DEBUG auth.AuthChallengeProcessor - Using
> > authentication scheme: ntlm
> > 2016-11-02 12:08:50,320 DEBUG auth.AuthChallengeProcessor -
> > Authorization challenge processed
> > 2016-11-02 12:08:50,320 DEBUG httpclient.HttpMethodDirector -
> > Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> > 2016-11-02 12:08:50,320 DEBUG httpclient.HttpMethodDirector - Retry
> > authentication
> > 2016-11-02 12:08:50,321 DEBUG httpclient.HttpMethodDirector -
> > Authenticating with NTLM <any realm>@<iis75.intranet>:80
> > 2016-11-02 12:08:50,321 TRACE auth.NTLMScheme - enter
> > NTLMScheme.authenticate(Credentials, HttpMethod)
> > 2016-11-02 12:08:50,351 DEBUG httpclient.HttpMethodDirector -
> > Authorization required
> > 2016-11-02 12:08:50,352 DEBUG auth.AuthChallengeProcessor - Using
> > authentication scheme: ntlm
> > 2016-11-02 12:08:50,352 DEBUG auth.AuthChallengeProcessor -
> > Authorization challenge processed
> > 2016-11-02 12:08:50,352 DEBUG httpclient.HttpMethodDirector -
> > Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> > 2016-11-02 12:08:50,352 DEBUG httpclient.HttpMethodDirector - Retry
> > authentication
> > 2016-11-02 12:08:50,352 DEBUG httpclient.HttpMethodDirector -
> > Authenticating with NTLM <any realm>@<iis75.intranet>:80
> > 2016-11-02 12:08:50,352 TRACE auth.NTLMScheme - enter
> > NTLMScheme.authenticate(Credentials, HttpMethod)
> > 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector -
> > Authorization required
> > 2016-11-02 12:08:50,393 DEBUG auth.AuthChallengeProcessor - Using
> > authentication scheme: ntlm
> > 2016-11-02 12:08:50,393 DEBUG auth.AuthChallengeProcessor -
> > Authorization challenge processed
> > 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector -
> > Authentication scope: NTLM <any realm>@<iis75.intranet>:80
> > 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector -
> > Credentials required
> > 2016-11-02 12:08:50,393 DEBUG httpclient.HttpMethodDirector -
> > Credentials provider not available
> > 2016-11-02 12:08:50,393 INFO  httpclient.HttpMethodDirector - Failure
> > authenticating with NTLM <any realm>@<iis75.intranet>:80
> > 2016-11-02 12:08:50,395 TRACE httpclient.Http - url:
> > http://<iis75.intranet>; status code: 401; bytes received: 0;
> > Content-Length: 0
> > 2016-11-02 12:08:50,681 DEBUG util.ObjectCache - No object cache found
> > for
> > conf=Configuration: core-default.xml, core-site.xml,
> > nutch-default.xml, nutch-site.xml, instantiating a new object cache
> > 2016-11-02 12:08:50,804 TRACE httpclient.Http - 401 Authentication
> > Required
> >
> > Any help is appreciated, as I am about to move on to another spider
> > for Solr.
> >
> > Thanks,
> > Bob
> >
> >
>
