I was crawling web sites with links to HTML and PDF files using the provided multiprocess-example agent. After a few hours, Simple History started showing result code -104 with the message "Interrupted: Job no longer active".
After the same error had occurred about 40 times in a row, the job status changed to "Aborting" and the job finally stopped with "Error: Repeated service interruptions - failure processing document: Ingestion HTTP error code 500".

Does anyone know what situation causes "Repeated service interruptions" and makes a job stop? Also, under what circumstances does result code -104 occur, and what does it mean? If you have any ideas, please advise me on how to avoid this error.

I am using the following:

Solr 1.4 (Extracting Request Handler is set)
ManifoldCF 0.4 (multiprocess-example)
- Repository connector: WEB
- Output connector: Solr
Tomcat 6.0.29
PostgreSQL 9.1.3

Here is MCF’s debug log from right before the job was interrupted:

DEBUG 2012-03-15 20:04:16,325 (Worker thread '4') - WEB: Attempting to get connection to http://xx.xx.xx.xx:80 (95697 ms)
DEBUG 2012-03-15 20:04:16,325 (Worker thread '4') - WEB: Waiting 3895 ms before starting fetch on http://xx.xx.xx.xx:80
DEBUG 2012-03-15 20:04:20,221 (Worker thread '4') - WEB: Attempting to get connection to http://xx.xx.xx.xx:80 (99593 ms)
DEBUG 2012-03-15 20:04:20,221 (Worker thread '4') - WEB: Successfully got connection to http://xx.xx.xx.xx:80 (99593 ms)
DEBUG 2012-03-15 20:04:20,221 (Worker thread '4') - WEB: Waiting for an HttpClient object
DEBUG 2012-03-15 20:04:20,221 (Worker thread '4') - WEB: Got an HttpClient object after 0 ms.
DEBUG 2012-03-15 20:04:20,221 (Worker thread '4') - WEB: Get method for '/xx/xx.pdf'
DEBUG 2012-03-15 20:04:20,222 (Worker thread '4') - WEB: For http://xx.xx/xx/xx.pdf, setting virtual host to xx.xx
DEBUG 2012-03-15 20:04:20,315 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 128 ms.
DEBUG 2012-03-15 20:04:20,445 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,509 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,573 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,637 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,701 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,765 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,829 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,893 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:20,957 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:21,021 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:21,085 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:21,149 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:21,213 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
DEBUG 2012-03-15 20:04:21,277 (Worker thread '4') - WEB: Performing a read wait on bin 'xx.xx' of 62 ms.
INFO 2012-03-15 20:04:21,344 (Worker thread '4') - WEB: FETCH URL| http://xx.xx/xx/xx.pdf|1331809460221+1122|-104|65536|org.apache.manifoldcf.core.interfaces.ManifoldCFException|Interrupted: Job no longer active
DEBUG 2012-03-15 20:04:21,344 (Worker thread '4') - WEB: Fetch exception for 'http://xx.xx/xx/xx.pdf'
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Interrupted: Job no longer active
    at org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledConnection.noteInterrupted(ThrottledFetcher.java:1735)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.getDocumentVersions(WebcrawlerConnector.java:743)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:318)
Caused by: org.apache.manifoldcf.agents.interfaces.ServiceInterruption: Job no longer active
    at org.apache.manifoldcf.crawler.system.WorkerThread$VersionActivity.checkJobStillActive(WorkerThread.java:1223)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.addData(DataCache.java:135)
    at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.getDocumentVersions(WebcrawlerConnector.java:713)
    ... 1 more
WARN 2012-03-15 20:04:21,345 (Worker thread '4') - Pre-ingest service interruption reported for job 1331716457096 connection 'web': Job no longer active
DEBUG 2012-03-15 20:04:23,871 (Job reset thread) - Stopped job 1331716457096
DEBUG 2012-03-15 20:04:24,236 (Job notification thread) - Found job 1331716457096 in need of notification
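
In case it helps anyone reproduce the count of -104 results from a log like the one above, here is a minimal Python sketch that tallies the result codes on "FETCH URL" lines. The field layout (URL, then start time + elapsed ms, then result code) is only my reading of the single INFO line shown above, not something taken from the ManifoldCF documentation, and the function names parse_fetch_line and tally_result_codes are made up for this example.

```python
from collections import Counter

MARKER = "FETCH URL|"

def parse_fetch_line(line):
    """Return a dict for a 'FETCH URL' log line, or None for any other line.

    Assumed layout (inferred from one sample line, not from documentation):
    FETCH URL| <url>|<start ms>+<elapsed ms>|<result code>|<further fields...>
    A URL that itself contains '|' would break this simple split.
    """
    idx = line.find(MARKER)
    if idx < 0:
        return None
    fields = line[idx + len(MARKER):].strip().split("|")
    if len(fields) < 3:
        return None
    try:
        code = int(fields[2])
    except ValueError:
        return None
    return {
        "url": fields[0].strip(),
        "timing": fields[1],
        "code": code,
        "extra": fields[3:],  # size, exception class, message (when present)
    }

def tally_result_codes(lines):
    """Count how many FETCH URL lines were logged for each result code."""
    counts = Counter()
    for line in lines:
        record = parse_fetch_line(line)
        if record is not None:
            counts[record["code"]] += 1
    return counts
```

Passing the lines of the log file to tally_result_codes (for example, an open file object) gives a Counter keyed by result code, so the entry for -104 shows how many fetches were interrupted.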
