Hello,

I am indexing a SharePoint 2010 instance using mcf-trunk (at revision 1432907).

There is no problem with a document library that contains Word, Excel, etc. files.

However, I receive the following errors with a document library that has *.aspx
files in it.

Status of Jobs => Error: Repeated service interruptions - failure processing document: null

 WARN 2013-01-14 15:00:12,720 (Worker thread '13') - Service interruption reported for job 1358009105156 connection 'iknow': IO exception during indexing: null
ERROR 2013-01-14 15:00:12,763 (Worker thread '13') - Exception tossed: Repeated service interruptions - failure processing document: null
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
Caused by: org.apache.http.client.ClientProtocolException
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
        at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:768)
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.  The cause lists the reason the original request failed.
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:692)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:523)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        ... 6 more
Caused by: java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
        at org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:110)
        at org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:165)
        at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:92)
        at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
        at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
        at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
        at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
        at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:718)
        ... 8 more
        
Status of Jobs => Error: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
        
ERROR 2013-01-14 15:10:42,074 (Worker thread '15') - Exception tossed: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled Solr exception during indexing (0): Server at http://localhost:8983/solr/all returned non ok status:413, message:FULL head
        at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrException(HttpPoster.java:360)
        at org.apache.manifoldcf.agents.output.solr.HttpPoster.indexPost(HttpPoster.java:477)
        at org.apache.manifoldcf.agents.output.solr.SolrConnector.addOrReplaceDocument(SolrConnector.java:594)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
        at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
        at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1559)
        at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
        
On the Solr side I see:

INFO: Creating new http client, config:maxConnections=200&maxConnectionsPerHost=8
2013-01-14 15:18:21.775:WARN:oejh.HttpParser:Full [671412972,-1,m=5,g=6144,p=6144,c=6144]={2F736F6C722F616 ...long long chars ... 2B656B6970{}

Thanks,
Ahmet
