Hi Karl,
Thanks for the quick fix.
I am still seeing the following error after 'svn up' and 'ant build':
ERROR 2013-01-14 17:09:41,949 (Worker thread '6') - Exception tossed: Repeated service interruptions - failure processing document: null
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
Caused by: org.apache.http.client.ClientProtocolException
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:790)
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity. The cause lists the reason the original request failed.
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:692)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:523)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    ... 6 more
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
    at org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:110)
    at org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:165)
    at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:92)
    at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
    at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
    at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
    at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
    at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:718)
    ... 8 more
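
For reference, here is a rough sketch of the "never retry" handling Karl describes in the quoted message below: a 413 from Solr would cause the document to be skipped instead of aborting the job. This is only an illustration. The class, enum, and method names are hypothetical and are not ManifoldCF or SolrJ code, and treating 503/504 as retryable is my own assumption.

```java
/**
 * Hypothetical helper, NOT part of ManifoldCF or SolrJ.
 * Maps an HTTP status returned by the indexing server to what the
 * output connector should do with the current document.
 */
final class IndexingStatusPolicy {

    enum Action { ACCEPT, SKIP_DOCUMENT, RETRY_LATER, ABORT_JOB }

    // 413: the request was too large; resending the same document cannot succeed.
    static final int REQUEST_TOO_LARGE = 413;
    // Assumption: these two usually indicate a temporary server-side condition.
    static final int SERVICE_UNAVAILABLE = 503;
    static final int GATEWAY_TIMEOUT = 504;

    static Action classify(int status) {
        if (status >= 200 && status < 300) {
            return Action.ACCEPT;
        }
        if (status == REQUEST_TOO_LARGE) {
            // "Never retry": drop this document and let the job continue.
            return Action.SKIP_DOCUMENT;
        }
        if (status == SERVICE_UNAVAILABLE || status == GATEWAY_TIMEOUT) {
            return Action.RETRY_LATER;
        }
        // Anything else is unexpected, so stop and let a person look at it.
        return Action.ABORT_JOB;
    }

    public static void main(String[] args) {
        int[] samples = {200, 413, 503, 500};
        for (int status : samples) {
            System.out.println(status + " -> " + classify(status));
        }
    }
}
```

With something along these lines, the oversized .aspx documents would be logged and skipped, while a real outage would still be retried.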
--- On Mon, 1/14/13, Karl Wright <[email protected]> wrote:
> From: Karl Wright <[email protected]>
> Subject: Re: Repeated service interruptions - failure processing document: null
> To: [email protected]
> Date: Monday, January 14, 2013, 3:30 PM
> Hi Ahmet,
>
> The exception that seems to be causing the abort is a socket exception
> coming from a socket write:
>
> > Caused by: java.net.SocketException: Broken pipe
>
> This makes sense in light of the http code returned from Solr, which
> was 413: http://www.checkupdown.com/status/E413.html .
>
> So there is nothing actually *wrong* with the .aspx documents, but
> they are just way too big, and Solr is rejecting them for that reason.
>
> Clearly, though, the Solr connector should recognize this code as
> meaning "never retry", so instead of killing the job, it should just
> skip the document. I'll open a ticket for that now.
>
> Karl
>
>
> On Mon, Jan 14, 2013 at 8:22 AM, Ahmet Arslan <[email protected]> wrote:
> > Hello,
> >
> > I am indexing a SharePoint 2010 instance using mcf-trunk (at revision 1432907).
> >
> > There is no problem with a Document library that contains Word, Excel, etc.
> >
> > However, I receive the following errors with a Document library that has *.aspx files in it.
> >
> > Status of Jobs => Error: Repeated service interruptions - failure processing document: null
> >
> > WARN 2013-01-14 15:00:12,720 (Worker thread '13') - Service interruption reported for job 1358009105156 connection 'iknow': IO exception during indexing: null
> > ERROR 2013-01-14 15:00:12,763 (Worker thread '13') - Exception tossed: Repeated service interruptions - failure processing document: null
> > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
> >     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> > Caused by: org.apache.http.client.ClientProtocolException
> >     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
> >     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
> >     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> >     at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:768)
> > Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity. The cause lists the reason the original request failed.
> >     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:692)
> >     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:523)
> >     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
> >     ... 6 more
> > Caused by: java.net.SocketException: Broken pipe
> >     at java.net.SocketOutputStream.socketWrite0(Native Method)
> >     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >     at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >     at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
> >     at org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:110)
> >     at org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:165)
> >     at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:92)
> >     at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
> >     at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
> >     at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
> >     at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
> >     at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
> >     at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
> >     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
> >     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:718)
> >     ... 8 more
> >
> > Status of Jobs => Error: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
> >
> > ERROR 2013-01-14 15:10:42,074 (Worker thread '15') - Exception tossed: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
> > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
> >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrException(HttpPoster.java:360)
> >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.indexPost(HttpPoster.java:477)
> >     at org.apache.manifoldcf.agents.output.solr.SolrConnector.addOrReplaceDocument(SolrConnector.java:594)
> >     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
> >     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
> >     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
> >     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
> >     at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1559)
> >     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
> >     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
> >
> > On the Solr side I see:
> >
> > INFO: Creating new http client, config:maxConnections=200&maxConnectionsPerHost=8
> > 2013-01-14 15:18:21.775:WARN:oejh.HttpParser:Full [671412972,-1,m=5,g=6144,p=6144,c=6144]={2F736F6C722F616 ...long long chars ... 2B656B6970{}
> >
> > Thanks,
> > Ahmet
>