This is because HttpClient retries a failed request up to three times
by default.  This has to be disabled in the Solr connector, or the rest
of the logic won't work right.
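
As a rough illustration of why the retry has to go, here is a minimal
sketch in plain Java, not the connector's actual code. The names
SingleAttemptSketch, Attempt and post are hypothetical, and the lambda
only stands in for a socket write that fails. With a retry budget of 0,
the first IOException reaches the caller unchanged instead of being
wrapped by a retry that cannot replay a streamed body.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;

final class SingleAttemptSketch {
    /** Hypothetical stand-in for one HTTP write of a streamed request body. */
    interface Attempt {
        void send(InputStream body) throws IOException;
    }

    /**
     * Calls attempt.send(body) until it succeeds or the retry budget is spent,
     * and returns the number of attempts made. With maxRetries == 0 the first
     * IOException is rethrown unchanged, so the caller sees the real cause.
     */
    static int post(InputStream body, Attempt attempt, int maxRetries) throws IOException {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                attempt.send(body);
                return attempts;
            } catch (IOException e) {
                if (attempts > maxRetries) {
                    throw e;
                }
                // Looping here would resend a stream that may already be
                // partly consumed, which is why retries are unsafe for it.
            }
        }
    }

    public static void main(String[] args) {
        InputStream body = new ByteArrayInputStream(new byte[] {1, 2, 3});
        try {
            // Simulated failure: read one byte, then fail like a closed socket.
            post(body, b -> { b.read(); throw new SocketException("Broken pipe"); }, 0);
        } catch (IOException e) {
            System.out.println("failed without retrying: " + e.getMessage());
        }
    }
}
```

In the real connector the equivalent step would be configuring the
HttpClient instance not to retry; the sketch only shows the intended
behavior.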

I've opened a ticket (CONNECTORS-610) for this problem too.
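
For the 413 case from my earlier message (quoted below), the handling
could look roughly like the following. This is only a sketch under
assumptions: SolrStatusSketch, Action and classify are hypothetical
names, and only the 413 mapping comes from this thread. Treating 2xx
as accepted and everything else as unhandled is a placeholder.

```java
final class SolrStatusSketch {
    /** Hypothetical outcomes; the names are not taken from the ManifoldCF code. */
    enum Action { ACCEPTED, SKIP_DOCUMENT, UNHANDLED }

    /** Maps an HTTP status returned by Solr to what the connector should do. */
    static Action classify(int httpStatus) {
        if (httpStatus >= 200 && httpStatus < 300) {
            return Action.ACCEPTED;
        }
        if (httpStatus == 413) {
            // Request entity too large: resending the same document cannot
            // succeed, so skip it rather than retry or abort the job.
            return Action.SKIP_DOCUMENT;
        }
        // Placeholder (assumption): everything else keeps the existing handling.
        return Action.UNHANDLED;
    }

    public static void main(String[] args) {
        System.out.println("413 -> " + classify(413));
    }
}
```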

Karl

On Mon, Jan 14, 2013 at 10:13 AM, Ahmet Arslan <[email protected]> wrote:
> Hi Karl,
>
> Thanks for the quick fix.
>
> I am still seeing the following error after 'svn up' and 'ant build':
>
> ERROR 2013-01-14 17:09:41,949 (Worker thread '6') - Exception tossed: Repeated service interruptions - failure processing document: null
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
>         at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> Caused by: org.apache.http.client.ClientProtocolException
>         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>         at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>         at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:790)
> Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.  The cause lists the reason the original request failed.
>         at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:692)
>         at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:523)
>         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>         ... 6 more
> Caused by: java.net.SocketException: Broken pipe
>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
>         at org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:110)
>         at org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:165)
>         at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:92)
>         at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
>         at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
>         at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
>         at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
>         at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
>         at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
>         at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>         at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:718)
>         ... 8 more
>
>
>
> --- On Mon, 1/14/13, Karl Wright <[email protected]> wrote:
>
>> From: Karl Wright <[email protected]>
>> Subject: Re: Repeated service interruptions - failure processing document: null
>> To: [email protected]
>> Date: Monday, January 14, 2013, 3:30 PM
>> Hi Ahmet,
>>
>> The exception that seems to be causing the abort is a socket exception
>> coming from a socket write:
>>
>> > Caused by: java.net.SocketException: Broken pipe
>>
>> This makes sense in light of the http code returned from Solr, which
>> was 413:  http://www.checkupdown.com/status/E413.html .
>>
>> So there is nothing actually *wrong* with the .aspx documents, but
>> they are just way too big, and Solr is rejecting them for that reason.
>>
>> Clearly, though, the Solr connector should recognize this code as
>> meaning "never retry", so instead of killing the job, it should just
>> skip the document.  I'll open a ticket for that now.
>>
>> Karl
>>
>>
>> On Mon, Jan 14, 2013 at 8:22 AM, Ahmet Arslan <[email protected]> wrote:
>> > Hello,
>> >
>> > I am indexing a SharePoint 2010 instance using mcf-trunk (at revision 1432907).
>> >
>> > There is no problem with a Document library that contains Word, Excel, etc.
>> >
>> > However, I receive the following errors with a Document library that has *.aspx files in it.
>> >
>> > Status of Jobs => Error: Repeated service interruptions - failure processing document: null
>> >
>> >  WARN 2013-01-14 15:00:12,720 (Worker thread '13') - Service interruption reported for job 1358009105156 connection 'iknow': IO exception during indexing: null
>> > ERROR 2013-01-14 15:00:12,763 (Worker thread '13') - Exception tossed: Repeated service interruptions - failure processing document: null
>> > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
>> >         at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
>> > Caused by: org.apache.http.client.ClientProtocolException
>> >         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>> >         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>> >         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>> >         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
>> >         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>> >         at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>> >         at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:768)
>> > Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.  The cause lists the reason the original request failed.
>> >         at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:692)
>> >         at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:523)
>> >         at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>> >         ... 6 more
>> > Caused by: java.net.SocketException: Broken pipe
>> >         at java.net.SocketOutputStream.socketWrite0(Native Method)
>> >         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>> >         at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
>> >         at org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:110)
>> >         at org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:165)
>> >         at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:92)
>> >         at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
>> >         at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
>> >         at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
>> >         at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
>> >         at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
>> >         at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
>> >         at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>> >         at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:718)
>> >         ... 8 more
>> >
>> > Status of Jobs => Error: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
>> >
>> > ERROR 2013-01-14 15:10:42,074 (Worker thread '15') - Exception tossed: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
>> > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
>> >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrException(HttpPoster.java:360)
>> >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.indexPost(HttpPoster.java:477)
>> >         at org.apache.manifoldcf.agents.output.solr.SolrConnector.addOrReplaceDocument(SolrConnector.java:594)
>> >         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
>> >         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
>> >         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
>> >         at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
>> >         at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1559)
>> >         at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>> >         at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
>> >
>> > On the Solr side I see:
>> >
>> > INFO: Creating new http client, config:maxConnections=200&maxConnectionsPerHost=8
>> > 2013-01-14 15:18:21.775:WARN:oejh.HttpParser:Full [671412972,-1,m=5,g=6144,p=6144,c=6144]={2F736F6C722F616 ...long long chars ... 2B656B6970{}
>> >
>> > Thanks,
>> > Ahmet
>>
