[ 
https://issues.apache.org/jira/browse/CONNECTORS-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650669#comment-13650669
 ] 

Karl Wright commented on CONNECTORS-682:
----------------------------------------

Erlend says:

Hello Karl,

It has been difficult for me to investigate this case further from the US, 
since I wasn't able to access the server from my iPad. Anyway, I think I have 
found the reason why the last job stopped: an HttpClient issue 
(NonRepeatableRequestException).

I'm sending this directly to you since the ticket is closed.

The last job was only fetching articles from www.duo.uio.no and stopped with 
the status "Error: Repeated service interruptions - failure processing 
document: null". In the simple history, this is the entry for the error that 
made the job stop:
04-30-2013 08:18:33.766  document ingest (Solr)  https://www.duo.uio.no/handle/10852/12997/statistics  FAILED  19051  2014  IOException occured when talking to server at: https://solr-prod01.uio.no:443/solr/uio: null

The "null" sounds like a NullPointerException. From manifoldcf.log:

ERROR 2013-04-30 08:18:35,803 (Worker thread '4') - Exception tossed: Repeated service interruptions - failure processing document: null
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
Caused by: org.apache.http.client.ClientProtocolException
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:277)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
        at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:897)
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:693)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        ... 6 more
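
For readers unfamiliar with this exception, here is a minimal sketch of the 
general idea behind "non-repeatable request entity". It does not use 
HttpClient or the connector code: the Body interface, StreamBody, 
BufferedBody and sendWithOneRetry are hypothetical names made up for 
illustration. The assumption is that the first attempt is rejected (for 
example by an authentication challenge) and the client has to send the same 
body a second time, which only works if the body can be produced again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class RepeatableBodyDemo {

    /** Hypothetical stand-in for a request entity; not the HttpClient interface. */
    interface Body {
        boolean isRepeatable();
        byte[] read() throws IOException;
    }

    /** Reads straight from an InputStream, so the content is gone after one send. */
    static final class StreamBody implements Body {
        private final InputStream in;
        StreamBody(InputStream in) { this.in = in; }
        public boolean isRepeatable() { return false; }
        public byte[] read() throws IOException { return drain(in); }
    }

    /** Copies the stream into memory up front, so every send sees the same bytes. */
    static final class BufferedBody implements Body {
        private final byte[] bytes;
        BufferedBody(InputStream in) throws IOException { this.bytes = drain(in); }
        public boolean isRepeatable() { return true; }
        public byte[] read() { return bytes.clone(); }
    }

    /** Reads the stream to its end and returns everything that was read. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /**
     * Simulates a server that rejects the first attempt and accepts the second.
     * Returns the number of bytes delivered on the successful attempt.
     */
    static int sendWithOneRetry(Body body) throws IOException {
        body.read(); // first attempt: the body is transmitted, then rejected
        if (!body.isRepeatable()) {
            throw new IOException(
                "Cannot retry request with a non-repeatable request entity.");
        }
        return body.read().length; // second attempt resends the same bytes
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = "<add><doc/></add>".getBytes(StandardCharsets.UTF_8);
        int sent = sendWithOneRetry(new BufferedBody(new ByteArrayInputStream(doc)));
        System.out.println("buffered body delivered " + sent + " bytes");
        try {
            sendWithOneRetry(new StreamBody(new ByteArrayInputStream(doc)));
        } catch (IOException e) {
            System.out.println("stream body failed: " + e.getMessage());
        }
    }
}
```

Buffering trades memory for the ability to resend, which may matter for 
large documents. Whether this is what actually happens inside the Solr 
connector in this case is not established by the log above.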

Here's the last manifoldcf.log:
http://folk.uio.no/erlendfg/manifoldcf/manifoldcf2.log

I also briefly wondered about the commit-within strategy that is enabled in 
the Solr output connection. I mention it only for completeness; it does not 
seem to be the cause here.

I also generated another thread dump, although I don't think it will give us 
much valuable information:
http://folk.uio.no/erlendfg/manifoldcf/threaddump6.txt



                
> Solr connector is still apparently retrying indexing under some conditions
> --------------------------------------------------------------------------
>
>                 Key: CONNECTORS-682
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-682
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Lucene/SOLR connector
>    Affects Versions: ManifoldCF 1.2
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.3
>
>
> The Solr connector, when configured to use Basic Auth, is still getting 
> exceptions that look like this:
> {code}
> ERROR 2013-04-28 00:38:04,539 (Worker thread '5') - Exception tossed: Repeated service interruptions - failure processing document: null
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
>       at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
> Caused by: org.apache.http.client.ClientProtocolException
>       at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>       at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>       at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>       at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:277)
>       at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>       at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>       at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:885)
> Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
>       at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:693)
>       at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
>       at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>       ... 6 more
> {code}
> This was working, but apparently things changed when we adopted HttpClient 
> 4.2.5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
