[
https://issues.apache.org/jira/browse/CONNECTORS-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650669#comment-13650669
]
Karl Wright commented on CONNECTORS-682:
----------------------------------------
Erlend says:
Hello Karl,
It has been difficult for me to investigate this case further from the US,
since I wasn't able to access the server from my iPad. Anyway, I think I have
found the reason why the last job stopped. It is an HttpClient issue
(NonRepeatableRequestException).
I'm sending this directly to you since the ticket is closed.
The last job was only fetching articles from www.duo.uio.no and stopped with
the status "Error: Repeated service interruptions - failure processing
document: null". While investigating the simple history, here's the error that
made the job stop:
04-30-2013 08:18:33.766 document ingest (Solr)
https://www.duo.uio.no/handle/10852/12997/statistics
FAILED 19051 2014 IOException occured when talking to server at:
https://solr-prod01.uio.no:443/solr/uio: null
The "null" sounds like a NullPointerException. From manifoldcf.log:
ERROR 2013-04-30 08:18:35,803 (Worker thread '4') - Exception tossed: Repeated service interruptions - failure processing document: null
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
Caused by: org.apache.http.client.ClientProtocolException
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
    at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:277)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:897)
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:693)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    ... 6 more
Here's the last manifoldcf.log:
http://folk.uio.no/erlendfg/manifoldcf/manifoldcf2.log
I also briefly wondered about the commit-within strategy, which is enabled in
the Solr output connection. I'm just mentioning it, but it does not seem to be
the cause here.
I also generated another thread dump. I don't think it will give us much
valuable information, but here it is anyway:
http://folk.uio.no/erlendfg/manifoldcf/threaddump6.txt
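For readers unfamiliar with the NonRepeatableRequestException in the trace above: HttpClient refuses to resend a request whose body can only be read once, for example a body backed directly by an InputStream. The following is only a rough, self-contained model of that behaviour, not the connector's code and not HttpClient's API. The names Body, StreamBody, BufferedBody, sendWithOneRetry and survivesRetry are all hypothetical, and the "first attempt is rejected" step is a stand-in for whatever triggers the resend (possibly an authentication challenge, since the issue mentions Basic Auth).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class RepeatableBodySketch {

    /** Hypothetical stand-in for a request body; not an HttpClient type. */
    interface Body {
        boolean isRepeatable();
        InputStream open() throws IOException;
    }

    /** Body backed by a one-shot stream: it can be read exactly once. */
    static final class StreamBody implements Body {
        private final InputStream in;
        private boolean consumed = false;

        StreamBody(InputStream in) { this.in = in; }

        public boolean isRepeatable() { return false; }

        public InputStream open() throws IOException {
            if (consumed) throw new IOException("stream already consumed");
            consumed = true;
            return in;
        }
    }

    /** Body held in memory: every call to open() yields a fresh stream. */
    static final class BufferedBody implements Body {
        private final byte[] bytes;

        BufferedBody(byte[] bytes) { this.bytes = bytes.clone(); }

        /** Reads the whole stream into memory so the body can be resent. */
        static BufferedBody copyOf(InputStream in) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) out.write(chunk, 0, n);
            return new BufferedBody(out.toByteArray());
        }

        public boolean isRepeatable() { return true; }

        public InputStream open() { return new ByteArrayInputStream(bytes); }
    }

    /**
     * Simulated send: the first attempt is always rejected (assumption made
     * for illustration), the second succeeds. Returns the attempt count.
     */
    static int sendWithOneRetry(Body body) throws IOException {
        int attempts = 0;
        while (true) {
            attempts++;
            drain(body.open());               // "transmit" the body
            if (attempts > 1) return attempts; // second attempt succeeds
            if (!body.isRepeatable()) {
                throw new IOException(
                    "Cannot retry request with a non-repeatable request entity");
            }
        }
    }

    /** True if the simulated send completes, false if the resend is refused. */
    static boolean survivesRetry(Body body) {
        try {
            sendWithOneRetry(body);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    private static void drain(InputStream in) throws IOException {
        byte[] chunk = new byte[8192];
        while (in.read(chunk) != -1) { /* discard */ }
    }

    public static void main(String[] args) {
        byte[] doc = "<add><doc/></add>".getBytes(StandardCharsets.UTF_8);
        System.out.println("stream body survives retry: "
            + survivesRetry(new StreamBody(new ByteArrayInputStream(doc))));
        System.out.println("buffered body survives retry: "
            + survivesRetry(new BufferedBody(doc)));
    }
}
```

In this model the stream-backed body fails as soon as a resend is needed, while the buffered one goes through on the second attempt. The trade-off is memory: buffering means holding the whole document in memory, which may or may not be acceptable for large documents.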
> Solr connector is still apparently retrying indexing under some conditions
> --------------------------------------------------------------------------
>
> Key: CONNECTORS-682
> URL: https://issues.apache.org/jira/browse/CONNECTORS-682
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 1.2
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.3
>
>
> The Solr connector, when configured to use Basic Auth, is still getting
> exceptions that look like this:
> {code}
> ERROR 2013-04-28 00:38:04,539 (Worker thread '5') - Exception tossed: Repeated service interruptions - failure processing document: null
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
> Caused by: org.apache.http.client.ClientProtocolException
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>     at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:277)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>     at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:885)
> Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
>     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:693)
>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>     ... 6 more
> {code}
> This was working, but apparently things changed when we adopted HttpClient
> 4.2.5.