[ 
https://issues.apache.org/jira/browse/CONNECTORS-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650673#comment-13650673
 ] 

Karl Wright commented on CONNECTORS-682:
----------------------------------------

Hi Erlend,

The "null" means just that the message from the causing exception is null; it 
is not an NPE.

Looking at the log, the exception is here:

{code}
 WARN 2013-04-30 08:18:35,782 (Worker thread '4') - IO exception during 
indexing: null
org.apache.http.client.ClientProtocolException
        at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
        at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at 
org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:277)
        at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
        at 
org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:897)
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry 
request with a non-repeatable request entity.
        at 
org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:693)
        at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
        at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        ... 6 more
 WARN 2013-04-30 08:18:35,787 (Worker thread '4') - Service interruption 
reported for job 1365066502674 connection 'Web': IO exception during indexing: 
null
{code}

Right above that, the thread "Thread-9392430" ends with:

{code}
DEBUG 2013-04-30 08:18:35,778 (Thread-9392430) - Cannot retry non-repeatable 
request
DEBUG 2013-04-30 08:18:35,779 (Thread-9392430) - Connection 
0.0.0.0:36826<->129.240.120.16:443 shut down
DEBUG 2013-04-30 08:18:35,779 (Thread-9392430) - Connection 
0.0.0.0:36826<->129.240.120.16:443 closed
DEBUG 2013-04-30 08:18:35,779 (Thread-9392430) - Connection released: [id: 
197122][route: {s}->https://solr-prod01.uio.no:443][total kept alive: 0; route 
allocated: 0 of 2; total allocated: 0 of 1]
{code}

Looking further back in the trace for Thread-9392430, you will note that the 
entire page content is transmitted followed by:

{code}
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Receiving response: HTTP/1.1 
401 Authorization Required
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - << HTTP/1.1 401 Authorization 
Required
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - << Date: Tue, 30 Apr 2013 
06:18:33 GMT
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - << Server: Apache/2.2.24 (Unix)
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - << WWW-Authenticate: Basic 
realm="resin"
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - << Content-Length: 527
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - << Connection: close
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - << Content-Type: text/html; 
charset=iso-8859-1
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Authentication required
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - solr-prod01.uio.no:443 
requested authentication
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Authentication schemes in the 
order of preference: [negotiate, Kerberos, NTLM, Digest, Basic]
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Challenge for negotiate 
authentication scheme not available
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Challenge for Kerberos 
authentication scheme not available
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Challenge for NTLM 
authentication scheme not available
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Challenge for Digest 
authentication scheme not available
DEBUG 2013-04-30 08:18:35,776 (Thread-9392430) - Selected authentication 
options: [BASIC]
{code}

In other words, for some reason expect/continue is not happening for this 
thread.  Going further back, the exchange begins with:

{code}
DEBUG 2013-04-30 08:18:33,766 (Thread-9392430) - Connection request: [route: 
{s}->https://solr-prod01.uio.no:443][total kept alive: 0; route allocated: 0 of 
2; total allocated: 0 of 1]
DEBUG 2013-04-30 08:18:33,766 (Thread-9392430) - Connection leased: [id: 
197122][route: {s}->https://solr-prod01.uio.no:443][total kept alive: 0; route 
allocated: 1 of 2; total allocated: 1 of 1]
DEBUG 2013-04-30 08:18:33,766 (Thread-9392430) - Connecting to 
solr-prod01.uio.no:443
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - CookieSpec selected: best-match
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - Auth cache not set in the 
context
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - Target auth state: UNCHALLENGED
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - Proxy auth state: UNCHALLENGED
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - Attempt 1 to execute request
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - Sending request: POST 
/solr/uio/update/extract HTTP/1.1
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "POST 
/solr/uio/update/extract HTTP/1.1[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "Content-Charset: 
UTF-8[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "User-Agent: 
Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "Transfer-Encoding: 
chunked[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "Content-Type: 
multipart/form-data; boundary=PhMY-DzFkcY_3PIP1FuBGiOtqscwJceHD[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "Host: 
solr-prod01.uio.no:443[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "Connection: 
Keep-Alive[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "Expect: 
100-continue[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> "[\r][\n]"
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> POST 
/solr/uio/update/extract HTTP/1.1
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> Content-Charset: UTF-8
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> User-Agent: 
Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> Transfer-Encoding: chunked
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> Content-Type: 
multipart/form-data; boundary=PhMY-DzFkcY_3PIP1FuBGiOtqscwJceHD
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> Host: solr-prod01.uio.no:443
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> Connection: Keep-Alive
DEBUG 2013-04-30 08:18:33,768 (Thread-9392430) - >> Expect: 100-continue
{code}

... but then HttpClient doesn't stop and wait for the continue response from 
Solr; it keeps right on going!!

{code}
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> "d8[\r][\n]"
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> 
"--PhMY-DzFkcY_3PIP1FuBGiOtqscwJceHD[\r][\n]"
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> "Content-Disposition: 
form-data; name="literal.id"[\r][\n]"
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> "Content-Type: text/plain; 
charset=UTF-8[\r][\n]"
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> "Content-Transfer-Encoding: 
8bit[\r][\n]"
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> "[\r][\n]"
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> 
"https://www.duo.uio.no/handle/10852/12997/statistics";
DEBUG 2013-04-30 08:18:35,770 (Thread-9392430) - >> "[\r][\n]"
DEBUG 2013-04-30 08:18:35,771 (Thread-9392430) - >> "af[\r][\n]"
DEBUG 2013-04-30 08:18:35,771 (Thread-9392430) - >> "[\r][\n]"
...
{code}

This is clearly wrong, and an HttpClient bug, and I will have to raise this 
issue as an HttpClient ticket.  We'll see what else is required to diagnose the 
problem.

                
> Solr connector is still apparently retrying indexing under some conditions
> --------------------------------------------------------------------------
>
>                 Key: CONNECTORS-682
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-682
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Lucene/SOLR connector
>    Affects Versions: ManifoldCF 1.2
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.3
>
>
> The Solr connector, when configured to use Basic Auth, is still getting 
> exceptions that look like this:
> {code}
> ERROR 2013-04-28 00:38:04,539 (Worker thread '5') - Exception tossed: 
> Repeated service interruptions - failure processing document: null
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service 
> interruptions - failure processing document: null
>       at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
> Caused by: org.apache.http.client.ClientProtocolException
>       at 
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>       at 
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>       at 
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>       at 
> org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:277)
>       at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>       at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>       at 
> org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:885)
> Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry 
> request with a non-repeatable request entity.
>       at 
> org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:693)
>       at 
> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
>       at 
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>       ... 6 more
> {code}
> This was working, but apparently things changed when we adopted HttpClient 
> 4.2.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to