I checked in a fix for this ticket on trunk.  Please let me know if it
resolves this issue.

Karl
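
For readers following the thread, here is a minimal sketch of the retry behavior described in the quoted message below. It is a simplified model, not the HttpClient or ManifoldCF code: the names RetrySketch, Attempt, NonRepeatableException, execute and outcome are all hypothetical. It only illustrates why a retry of a streamed request body ends in a wrapper exception, and why turning retries off lets the original "Broken pipe" surface directly.

```java
import java.io.IOException;
import java.net.SocketException;

/** Hypothetical illustration; none of these names come from HttpClient or ManifoldCF. */
final class RetrySketch {
    /** A single attempt to send a request body. */
    interface Attempt {
        void run() throws IOException;
    }

    /** Thrown when a retry is requested but the body has already been consumed. */
    static final class NonRepeatableException extends IOException {
        NonRepeatableException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs the attempt once, then up to maxRetries more times on IOException.
     * A streamed (non-repeatable) body cannot be re-sent, so a retry of it
     * fails with a wrapper exception that carries the original error as its cause.
     */
    static void execute(Attempt attempt, boolean repeatableBody, int maxRetries)
            throws IOException {
        int retriesDone = 0;
        while (true) {
            try {
                attempt.run();
                return;
            } catch (IOException e) {
                if (retriesDone >= maxRetries) {
                    throw e; // retries exhausted or disabled: surface the original error
                }
                if (!repeatableBody) {
                    throw new NonRepeatableException("cannot retry a streamed body", e);
                }
                retriesDone++;
            }
        }
    }

    /** Returns the simple class name of whatever execute() throws, or "ok". */
    static String outcome(boolean repeatableBody, int maxRetries) {
        try {
            execute(() -> { throw new SocketException("Broken pipe"); },
                    repeatableBody, maxRetries);
            return "ok";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

In this model, a streamed body with retries enabled yields the wrapper exception, while the same body with maxRetries set to 0 yields the underlying SocketException, which is the case the connector's own error handling can reason about.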

On Mon, Jan 14, 2013 at 10:20 AM, Karl Wright <[email protected]> wrote:
> This is because httpclient retries three times on error by
> default.  Retries have to be disabled in the Solr connector, or the
> rest of the logic won't work right.
>
> I've opened a ticket (CONNECTORS-610) for this problem too.
>
> Karl
>
> On Mon, Jan 14, 2013 at 10:13 AM, Ahmet Arslan <[email protected]> wrote:
>> Hi Karl,
>>
>> Thanks for the quick fix.
>>
>> I am still seeing the following error after 'svn up' and 'ant build':
>>
>> ERROR 2013-01-14 17:09:41,949 (Worker thread '6') - Exception tossed: 
>> Repeated service interruptions - failure processing document: null
>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service 
>> interruptions - failure processing document: null
>>         at 
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
>> Caused by: org.apache.http.client.ClientProtocolException
>>         at 
>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>>         at 
>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>>         at 
>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>>         at 
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
>>         at 
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>         at 
>> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>>         at 
>> org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:790)
>> Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot 
>> retry request with a non-repeatable request entity.  The cause lists the 
>> reason the original request failed.
>>         at 
>> org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:692)
>>         at 
>> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:523)
>>         at 
>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>>         ... 6 more
>> Caused by: java.net.SocketException: Broken pipe
>>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>>         at 
>> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>         at 
>> org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
>>         at 
>> org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:110)
>>         at 
>> org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:165)
>>         at 
>> org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:92)
>>         at 
>> org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
>>         at 
>> org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
>>         at 
>> org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
>>         at 
>> org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
>>         at 
>> org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
>>         at 
>> org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
>>         at 
>> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>         at 
>> org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:718)
>>         ... 8 more
>>
>>
>>
>> --- On Mon, 1/14/13, Karl Wright <[email protected]> wrote:
>>
>>> From: Karl Wright <[email protected]>
>>> Subject: Re: Repeated service interruptions - failure processing document: 
>>> null
>>> To: [email protected]
>>> Date: Monday, January 14, 2013, 3:30 PM
>>> Hi Ahmet,
>>>
>>> The exception that seems to be causing the abort is a socket
>>> exception
>>> coming from a socket write:
>>>
>>> > Caused by: java.net.SocketException: Broken pipe
>>>
>>> This makes sense in light of the HTTP status code returned from
>>> Solr, which was 413:  http://www.checkupdown.com/status/E413.html .
>>>
>>> So there is nothing actually *wrong* with the .aspx
>>> documents, but
>>> they are just way too big, and Solr is rejecting them for
>>> that reason.
>>>
>>> Clearly, though, the Solr connector should recognize this
>>> code as
>>> meaning "never retry", so instead of killing the job, it
>>> should just
>>> skip the document.  I'll open a ticket for that now.
>>>
>>> Karl
>>>
>>>
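
The handling proposed in the message above (treat 413 as "never retry" and skip the document instead of killing the job) could be sketched as a small status classifier. This is only an illustration under stated assumptions: SolrStatusSketch, Action and classify are hypothetical names, not part of the Solr connector, and only the 413 rule comes from the thread. The treatment of 2xx, 5xx and other codes is an assumption added to make the example complete.

```java
/** Hypothetical sketch; not the Solr connector's actual code. */
final class SolrStatusSketch {
    /** What the crawler should do with a document after an indexing response. */
    enum Action { ACCEPT, SKIP_DOCUMENT, RETRY_LATER, ABORT_JOB }

    /** Maps an HTTP status code from the indexing request to an action. */
    static Action classify(int status) {
        if (status >= 200 && status < 300) {
            return Action.ACCEPT;        // assumption: any 2xx means the document was indexed
        }
        if (status == 413) {
            return Action.SKIP_DOCUMENT; // from the thread: rejected as too large, never retry
        }
        if (status >= 500 && status < 600) {
            return Action.RETRY_LATER;   // assumption: server-side errors may be transient
        }
        return Action.ABORT_JOB;         // assumption: anything else is treated as unhandled
    }
}
```

The point of the design is that a permanent, per-document rejection is separated from transient service interruptions, so one oversized document does not count toward the "repeated service interruptions" that abort the job.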
>>> On Mon, Jan 14, 2013 at 8:22 AM, Ahmet Arslan <[email protected]>
>>> wrote:
>>> > Hello,
>>> >
>>> > I am indexing a SharePoint 2010 instance using
>>> mcf-trunk (at revision 1432907).
>>> >
>>> > There is no problem with a Document library that
>>> contains Word, Excel, etc.
>>> >
>>> > However, I receive the following errors with a Document
>>> library that has *.aspx files in it.
>>> >
>>> > Status of Jobs => Error: Repeated service
>>> interruptions - failure processing document: null
>>> >
>>> >  WARN 2013-01-14 15:00:12,720 (Worker thread '13')
>>> - Service interruption reported for job 1358009105156
>>> connection 'iknow': IO exception during indexing: null
>>> > ERROR 2013-01-14 15:00:12,763 (Worker thread '13') -
>>> Exception tossed: Repeated service interruptions - failure
>>> processing document: null
>>> >
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException:
>>> Repeated service interruptions - failure processing
>>> document: null
>>> >         at
>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
>>> > Caused by:
>>> org.apache.http.client.ClientProtocolException
>>> >         at
>>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>>> >         at
>>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>>> >         at
>>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>>> >         at
>>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
>>> >         at
>>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>> >         at
>>> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>>> >         at
>>> org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:768)
>>> > Caused by:
>>> org.apache.http.client.NonRepeatableRequestException: Cannot
>>> retry request with a non-repeatable request entity.
>>> The cause lists the reason the original request failed.
>>> >         at
>>> org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:692)
>>> >         at
>>> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:523)
>>> >         at
>>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>>> >         ... 6 more
>>> > Caused by: java.net.SocketException: Broken pipe
>>> >         at
>>> java.net.SocketOutputStream.socketWrite0(Native Method)
>>> >         at
>>> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>>> >         at
>>> java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>> >         at
>>> org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
>>> >         at
>>> org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:110)
>>> >         at
>>> org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:165)
>>> >         at
>>> org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:92)
>>> >         at
>>> org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
>>> >         at
>>> org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
>>> >         at
>>> org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
>>> >         at
>>> org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
>>> >         at
>>> org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
>>> >         at
>>> org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
>>> >         at
>>> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>> >         at
>>> org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:718)
>>> >         ... 8 more
>>> >
>>> > Status of Jobs => Error: Unhandled Solr exception
>>> during indexing (0): Server at http://localhost:8983/solr/all returned non 
>>> ok
>>> status:413, message:FULL head
>>> >
>>> >         ERROR 2013-01-14
>>> 15:10:42,074 (Worker thread '15') - Exception tossed:
>>> Unhandled Solr exception during indexing (0): Server at 
>>> http://localhost:8983/solr/all returned non ok
>>> status:413, message:FULL head
>>> >
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException:
>>> Unhandled Solr exception during indexing (0): Server at 
>>> http://localhost:8983/solr/all returned non ok
>>> status:413, message:FULL head
>>> >         at
>>> org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrException(HttpPoster.java:360)
>>> >         at
>>> org.apache.manifoldcf.agents.output.solr.HttpPoster.indexPost(HttpPoster.java:477)
>>> >         at
>>> org.apache.manifoldcf.agents.output.solr.SolrConnector.addOrReplaceDocument(SolrConnector.java:594)
>>> >         at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
>>> >         at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
>>> >         at
>>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
>>> >         at
>>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
>>> >         at
>>> org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1559)
>>> >         at
>>> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>>> >         at
>>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
>>> >
>>> > On the Solr side I see:
>>> >
>>> > INFO: Creating new http client,
>>> config:maxConnections=200&maxConnectionsPerHost=8
>>> > 2013-01-14 15:18:21.775:WARN:oejh.HttpParser:Full
>>> [671412972,-1,m=5,g=6144,p=6144,c=6144]={2F736F6C722F616
>>> ...long long chars ... 2B656B6970{}
>>> >
>>> > Thanks,
>>> > Ahmet
>>>
