[ https://issues.apache.org/jira/browse/CONNECTORS-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555567#comment-13555567 ]

Karl Wright commented on CONNECTORS-608:
----------------------------------------
David,

It is difficult to diagnose without knowing exactly what you are doing, but if
you have multiple JVMs all running ManifoldCF and all going to the same Solr
instance, that might be where the problem is. ManifoldCF is not able to keep
track of the number of connections outstanding in that situation. And, even
with one job being active at a time, the other JVMs may be keeping open
connections around, because ManifoldCF pools these. It should, over time,
release them - but only after they've been idle a while.

Since the number of connections on your Solr instance is limited, that means
you have an excellent chance of a connection attempt having to wait an extended
period of time for a Solr connection to free up, so you get connection
timeouts. This all makes sense.
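
To illustrate the reasoning above, here is a minimal sketch in plain Java. It is
not ManifoldCF or Solr code: the class name, method names, and all numbers are
hypothetical, and the Solr connection limit is simply modeled as a semaphore.
It only shows how idle pooled connections held by other JVMs can leave a new
connection attempt waiting until it times out.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PooledConnectionStarvation {

    /** Worst case: every JVM fills its own pool, unaware of the other JVMs. */
    static int worstCaseOpenConnections(int jvmCount, int poolSizePerJvm) {
        return jvmCount * poolSizePerJvm;
    }

    /**
     * Tries to obtain one server-side connection slot within the timeout.
     * Returns false when no slot frees up in time (the "connection timeout").
     */
    static boolean tryConnect(Semaphore serverSlots, long timeoutMillis) {
        try {
            return serverSlots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        int solrMaxConnections = 10; // hypothetical server-side limit
        int jvmCount = 3;            // hypothetical number of ManifoldCF JVMs
        int poolSizePerJvm = 5;      // hypothetical per-JVM pool size

        System.out.println("worst case open connections = "
                + worstCaseOpenConnections(jvmCount, poolSizePerJvm)
                + ", server limit = " + solrMaxConnections);

        Semaphore serverSlots = new Semaphore(solrMaxConnections);

        // Two JVMs have no active job but still hold idle pooled connections.
        serverSlots.acquireUninterruptibly(2 * poolSizePerJvm);

        // The third JVM now tries to connect and has to wait for a free slot.
        System.out.println("connected = " + tryConnect(serverSlots, 100));

        // One idle JVM eventually expires its pooled connections.
        serverSlots.release(poolSizePerJvm);
        System.out.println("connected after idle release = " + tryConnect(serverSlots, 100));
    }
}
```

In this model the first attempt fails only because idle connections still
occupy every slot, and it succeeds once one pool has released them.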

As for the one document that is not working right - is it special in some way?
Extremely long, perhaps?

> Solr connector gets socket timeouts on slow documents
> -----------------------------------------------------
>
> Key: CONNECTORS-608
> URL: https://issues.apache.org/jira/browse/CONNECTORS-608
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 1.1
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.1
>
> Attachments: mcf-jetty-error.txt
>
>
> The Solr connector fails on some documents with the following exception.
> {code}
> ERROR 2013-01-11 11:13:59,372 (Worker thread '36') - Exception tossed: Repeated service interruptions - failure processing document: Software caused connection abort: recv failed
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: Software caused connection abort: recv failed
> at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> Caused by: java.net.SocketException: Software caused connection abort: recv failed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(Unknown Source)
> at java.net.SocketInputStream.read(Unknown Source)
> at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
> at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
> at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
> at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
> at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:61)
> at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
> at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
> at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
> at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
> at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
> at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
> at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:716)
> at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:521)
> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
> at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
> at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:742)
> {code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira