Hello,

Using the Solr output connector and the SP2010 repository connector, I am indexing a 
document library named Documents. This library contains some scanned PDF documents. 
The very first crawl indexes all 91 documents.
When I click "Re-ingest all associated documents" and start a second crawl, I get: 
"Error: Unexpected jobqueue status - record id 1344907007021, expecting active 
status, saw 3"

The document involved, 
http://iknowtest/Documents/ik_docs/vize_evraklari/ticaret_sicil_gazetesi.pdf, 
is an image (scanned) PDF.

Here is the stack trace:

WARN 2012-08-14 05:13:22,068 (Worker thread '39') - SharePoint: Error closing connection to file 'http://iknowtest/Documents/ik_docs/vize_evraklari/ticaret_sicil_gazetesi.pdf': Connection reset
java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:113)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at org.apache.commons.httpclient.ContentLengthInputStream.read(Unknown Source)
        at org.apache.commons.httpclient.ContentLengthInputStream.read(Unknown Source)
        at org.apache.commons.httpclient.ChunkedInputStream.exhaustInputStream(Unknown Source)
        at org.apache.commons.httpclient.ContentLengthInputStream.close(Unknown Source)
        at java.io.FilterInputStream.close(FilterInputStream.java:155)
        at org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(Unknown Source)
        at org.apache.commons.httpclient.AutoCloseInputStream.close(Unknown Source)
        at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1457)
        at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:549)
DEBUG 2012-08-14 05:13:22,072 (Worker thread '42') - SharePoint: Path attribute name is null
 WARN 2012-08-14 05:13:22,081 (Worker thread '39') - SharePoint: IOException thrown: Connection reset
java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:168)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at org.apache.commons.httpclient.ContentLengthInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at org.apache.commons.httpclient.AutoCloseInputStream.read(Unknown Source)
        at java.io.FilterInputStream.read(FilterInputStream.java:90)
        at org.apache.commons.httpclient.AutoCloseInputStream.read(Unknown Source)
        at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1447)
        at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:549)
 WARN 2012-08-14 05:13:22,186 (Worker thread '39') - Service interruption reported for job 1344906886879 connection 'SP2010': SharePoint is down attempting to read 'http://iknowtest/Documents/ik_docs/vize_evraklari/ticaret_sicil_gazetesi.pdf', retrying: Connection reset
ERROR 2012-08-14 05:13:22,230 (Worker thread '39') - Exception tossed: Unexpected jobqueue status - record id 1344907007021, expecting active status, saw 3
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1344907007021, expecting active status, saw 3
        at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:711)
        at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2435)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:745)
