[ https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217519#comment-13217519 ]
Karl Wright edited comment on CONNECTORS-288 at 2/27/12 8:25 PM:
-----------------------------------------------------------------

bq. I tried to execute the test now and I think that now tests are runned correctly but it seems that it can't delete the job from Manifold at the end of the test:

Right, the problem is that the job deletion hangs, because it's trying to delete the documents from the index and something goes wrong with that. Earlier I posted the manifoldcf.log output associated with this failure:

{code}
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
	at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
	at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}

The issue is that the "Server/page not found" error seems to occur intermittently on many different requests. These requests are usually retried, but during the final delete phase the delete threads wait 5 minutes before retrying. The test only waits 2 minutes, so it fails.

The real problem is that we should not be getting these intermittent random errors at all, which is why I think we need to look at data that is kept around in the connector from request to request, namely the cached data structures. I am certain these are the source of the problem.
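The timing mismatch described above can be checked with simple arithmetic. What follows is a minimal Java sketch, not ManifoldCF code: the class and method names are hypothetical, the 5-minute and 2-minute values are the ones quoted in the comment, and it assumes each delete attempt itself takes negligible time.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; nothing here is part of ManifoldCF.
final class DeleteRetryTiming {

    /**
     * Number of retries that fire before the waiter gives up, assuming the
     * first attempt fails at time zero and retries are evenly spaced.
     * retryInterval must be non-zero.
     */
    static long retriesBeforeTimeout(Duration retryInterval, Duration waitLimit) {
        return waitLimit.toMillis() / retryInterval.toMillis();
    }

    /**
     * True if a delete that fails `failures` times in a row can still
     * succeed before the wait limit expires.
     */
    static boolean canFinishInTime(int failures, Duration retryInterval, Duration waitLimit) {
        Duration needed = retryInterval.multipliedBy(failures);
        return needed.compareTo(waitLimit) <= 0;
    }

    public static void main(String[] args) {
        Duration retryInterval = Duration.ofMinutes(5); // delete-thread retry wait, from the comment
        Duration testWait = Duration.ofMinutes(2);      // how long the test waits, from the comment

        // No retry can fire inside the test's wait window.
        System.out.println(retriesBeforeTimeout(retryInterval, testWait));
        // A single intermittent failure is already enough to time the test out.
        System.out.println(canFinishInTime(1, retryInterval, testWait));
        // Without any failure the deletion fits in the window.
        System.out.println(canFinishInTime(0, retryInterval, testWait));
    }
}
```

Under these assumptions a single "Server/page not found" during the delete phase is enough to push job deletion past the test's wait window, which matches the hang being reported. Lengthening the test wait would only hide the intermittent errors, not remove them.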
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-patch, manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of ManifoldCF