[ https://issues.apache.org/jira/browse/CONNECTORS-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555369#comment-13555369 ]
Karl Wright edited comment on CONNECTORS-608 at 1/16/13 7:31 PM:
-----------------------------------------------------------------
Hi David,
The Jetty exception comes from a crawler-UI page that failed to render because
of a socket exception while sending the output to a browser:
{code}
Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
{code}
It's interesting only in that this is yet another socket timeout. I don't know
what Jetty's default socket timeout value is, but if it could not send the
page's HTML to the browser in a reasonable time, your system must have been in
a very unusual state, or you closed your browser in the midst of a page render,
or something like that.
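For reference, here is a minimal sketch of how a socket read timeout shows up at
the java.net level. It is not ManifoldCF, Jetty, or SolrJ code: the class name
ReadTimeoutSketch, the method readTimesOut, and the 200 ms value are all made up
for illustration. The "server" is just a listening socket that never sends a
byte, standing in for a peer that is too slow to respond.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {
    /** Returns true if the read gave up before the silent peer sent anything. */
    public static boolean readTimesOut(int timeoutMillis) throws IOException {
        // The listening socket never calls accept() or writes; the OS backlog
        // still completes the TCP handshake, so the client connects normally.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // read timeout in milliseconds
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks until data arrives or the timeout fires
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the peer was too slow for the configured timeout
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

A timeout of this kind is raised on the waiting side; the "connection was
aborted" message above is what the other side, or a later write, typically sees
once the connection has been dropped.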
As for the problem with overwhelming Solr, can you experiment a bit in the
following way:
- Make sure that no other jobs are running; if they are, pause them and wait
for them to finish pausing
- Reduce the max connection count in your Solr connection definition down to 1,
and save it
- Restart ManifoldCF, to be certain that the new maximum takes effect
- Run the problematic job, and see if it succeeds this time
I would expect it to succeed, albeit slowly. If it still fails, then clearly
we are somehow leaking Solr connections, which I can try to diagnose here this
evening.
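
To illustrate why a limit of 1 makes a leak obvious, here is a rough sketch. It
is not the Solr connector's actual pooling code: ConnectionLimitSketch and its
methods are hypothetical, and a Semaphore stands in for the connection pool.
With one slot, a single connection that is never handed back starves every
later request.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class ConnectionLimitSketch {
    // Hypothetical stand-in for a connection pool capped at maxConnections.
    private final Semaphore permits;

    public ConnectionLimitSketch(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** Tries to obtain a connection slot, waiting at most waitMillis. */
    public boolean acquire(long waitMillis) throws InterruptedException {
        return permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
    }

    /** Hands a connection slot back to the pool. */
    public void release() {
        permits.release();
    }

    /** Issues sequential requests against a pool of size 1; returns how many got a slot. */
    public static int run(int requests, boolean leak) throws InterruptedException {
        ConnectionLimitSketch pool = new ConnectionLimitSketch(1);
        int served = 0;
        for (int i = 0; i < requests; i++) {
            if (!pool.acquire(100)) {
                continue; // no slot became free in time: this request is starved
            }
            served++;
            if (!leak) {
                pool.release(); // a well-behaved caller returns the slot
            }
        }
        return served;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("no leak: " + run(3, false)); // prints "no leak: 3"
        System.out.println("leak:    " + run(3, true));  // prints "leak:    1"
    }
}
```

With a larger limit the same leak would take longer to surface, which is why
dropping the count to 1 is a useful test.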
> Solr connector gets socket timeouts on slow documents
> -----------------------------------------------------
>
> Key: CONNECTORS-608
> URL: https://issues.apache.org/jira/browse/CONNECTORS-608
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 1.1
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.1
>
> Attachments: mcf-jetty-error.txt
>
>
> The Solr connector fails on some documents with the following exception.
> {code}
> ERROR 2013-01-11 11:13:59,372 (Worker thread '36') - Exception tossed: Repeated service interruptions - failure processing document: Software caused connection abort: recv failed
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: Software caused connection abort: recv failed
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> Caused by: java.net.SocketException: Software caused connection abort: recv failed
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(Unknown Source)
>     at java.net.SocketInputStream.read(Unknown Source)
>     at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
>     at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
>     at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:61)
>     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
>     at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
>     at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
>     at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
>     at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
>     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
>     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:716)
>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:521)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>     at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:742)
> {code}