Hello,

The log doesn't say much beyond the failure itself; it gives no reason. You might want to set the log level to TRACE, since the authenticator logs at that level. You could also check whether there are any server-side messages.
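A minimal sketch of what raising the log level could look like, assuming the stock conf/log4j.properties (log4j 1.x) that ships with Nutch 1.x. The logger names are inferred from the class names in the quoted log and from the Commons HttpClient 3.x package layout, so treat them as assumptions and adjust them if your setup differs:

```
# Nutch's protocol-httpclient plugin (shows up as "httpclient.Http" in the log)
log4j.logger.org.apache.nutch.protocol.httpclient=TRACE

# Commons HttpClient 3.x, including auth.AuthChallengeProcessor
# and HttpMethodDirector
log4j.logger.org.apache.commons.httpclient=TRACE

# Commons HttpClient wire log, headers only; this should show the
# WWW-Authenticate / Authorization exchange
log4j.logger.httpclient.wire.header=DEBUG
```

With these lines added, rerunning the same parsechecker command should write the extra detail to wherever the root logger already sends its output.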
Regards,
Markus

-----Original message-----
> From: Larry.Santello <[email protected]>
> Sent: Thursday 25th April 2019 15:28
> To: [email protected]
> Subject: Nutch NTLM to IIS 8.5 - issues!
>
> All -
>
> I've tried several 1.x versions of Nutch and a variety of configurations and
> simply can NOT get NTLM authentication working with Nutch. I need help
> desperately!
>
> Here are the relevant configuration points:
> Note: "user", "password", and "ntdomain" are, of course, fillers for real
> values
>
> httpclient-auth.xml:
> <credentials username="user" password="password" >
>   <default realm="ntdomain" />
> </credentials>
>
> nutch-site.xml:
> <property>
>   <name>plugin.includes</name>
>   <value>protocol-(http|httpclient)|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
>   <description> </description>
> </property>
>
> logged problem (note that, yes, this is from 1.5.1, but 1.15 produces
> similar results):
> 2019-04-25 07:38:47,641 INFO parse.ParserChecker - fetching: http://url.com/crawltest.html
> 2019-04-25 07:38:47,650 INFO plugin.PluginRepository - Plugins: looking in: C:\nutch\apache-nutch-1.5.1\plugins
> 2019-04-25 07:38:47,728 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
> 2019-04-25 07:38:47,729 INFO plugin.PluginRepository - Registered Plugins:
> 2019-04-25 07:38:47,729 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
> 2019-04-25 07:38:47,729 INFO plugin.PluginRepository - HTTP Framework (lib-http)
> 2019-04-25 07:38:47,729 INFO plugin.PluginRepository - Http / Https Protocol Plug-in (protocol-httpclient)
> 2019-04-25 07:38:47,729 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Tika Parser Plug-in (parse-tika)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - URL Validator (urlfilter-validator)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Registered Extension-Points:
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
> 2019-04-25 07:38:47,733 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
> 2019-04-25 07:38:47,761 INFO httpclient.Http - http.proxy.host = null
> 2019-04-25 07:38:47,762 INFO httpclient.Http - http.proxy.port = 8080
> 2019-04-25 07:38:47,763 INFO httpclient.Http - http.timeout = 10000
> 2019-04-25 07:38:47,763 INFO httpclient.Http - http.content.limit = -1
> 2019-04-25 07:38:47,763 INFO httpclient.Http - http.agent = Ulinenet Spider/Nutch-1.5.1
> 2019-04-25 07:38:47,764 INFO httpclient.Http - http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3
> 2019-04-25 07:38:47,764 INFO httpclient.Http - http.accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> 2019-04-25 07:38:47,835 DEBUG auth.AuthChallengeProcessor - Supported authentication schemes in the order of preference: [ntlm, digest, basic]
> 2019-04-25 07:38:47,836 INFO auth.AuthChallengeProcessor - ntlm authentication scheme selected
> 2019-04-25 07:38:47,837 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
> 2019-04-25 07:38:47,837 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
> 2019-04-25 07:38:47,847 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
> 2019-04-25 07:38:47,847 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
> 2019-04-25 07:38:48,335 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
> 2019-04-25 07:38:48,336 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
> 2019-04-25 07:38:48,337 INFO httpclient.HttpMethodDirector - Failure authenticating with NTLM <any realm>@url.com:80
> 2019-04-25 07:38:48,507 INFO crawl.SignatureFactory - Using Signature impl: org.apache.nutch.crawl.MD5Signature
> 2019-04-25 07:38:48,509 INFO parse.ParserChecker - parsing: http://url.com/crawltest.html
> 2019-04-25 07:38:48,509 INFO parse.ParserChecker - contentType: application/xhtml+xml
> 2019-04-25 07:38:48,510 INFO parse.ParserChecker - signature: 495abb7f991fb4dd6a056f748908a2d9
>
> The way I'm testing:
> bin/nutch parsechecker http://url.com/crawltest.html
>
> Finally, I should note that the following curl command DOES work:
> curl --ntlm --user user:password http://url.com/crawltest.html
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Nutch-User-f603147.html

