[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217519#comment-13217519
 ] 

Karl Wright edited comment on CONNECTORS-288 at 2/27/12 8:25 PM:
-----------------------------------------------------------------

bq. I tried to execute the test now and I think that now tests are runned 
correctly but it seems that it can't delete the job from Manifold at the end of 
the test:

Right, the problem is that the job deletion hangs: it is trying to delete the 
documents from the index, and those delete requests fail.  I posted the 
manifoldcf.log output associated with this failure earlier:

{code}
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
        at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
        at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
        at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
        at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}

The "Server/page not found" error seems to occur intermittently on many 
different requests.  These failures are usually retried, but during the final 
delete phase the delete threads wait 5 minutes before retrying, while the test 
only waits 2 minutes, so the test fails.  The real problem is that we should 
not be getting these intermittent errors at all, which is why I think we need 
to look at the data that is kept around in the connector from request to 
request, namely the cached data structures.  I am certain these are the source 
of the problem.
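
To make the timing mismatch concrete, here is a minimal sketch.  The class and 
method names are hypothetical and are not ManifoldCF code; only the 5-minute 
retry delay and the 2-minute test wait come from the description above.  It 
just checks whether a failed delete could be retried before the waiter gives 
up:

```java
import java.time.Duration;

// Hypothetical illustration only; none of these names exist in ManifoldCF.
final class DeleteRetryTiming {
    // Figures taken from the comment: delete threads retry after 5 minutes,
    // and the test waits 2 minutes for the job to be deleted.
    static final Duration DELETE_RETRY_DELAY = Duration.ofMinutes(5);
    static final Duration TEST_WAIT = Duration.ofMinutes(2);

    /** Number of retries that can start at or before the end of the wait window. */
    static long retriesWithin(Duration wait, Duration retryDelay) {
        if (retryDelay.isZero() || retryDelay.isNegative()) {
            throw new IllegalArgumentException("retryDelay must be positive");
        }
        return wait.toMillis() / retryDelay.toMillis();
    }

    /** True if a single failed delete can be retried before the waiter gives up. */
    static boolean canRecover(Duration wait, Duration retryDelay) {
        return retriesWithin(wait, retryDelay) >= 1;
    }

    public static void main(String[] args) {
        // 2-minute wait, 5-minute retry delay: no retry happens in time.
        System.out.println(canRecover(TEST_WAIT, DELETE_RETRY_DELAY));
        // A 6-minute wait would leave room for one retry.
        System.out.println(canRecover(Duration.ofMinutes(6), DELETE_RETRY_DELAY));
    }
}
```

Under these assumptions a single intermittent failure during the delete phase 
is enough to make the test time out, so lengthening the wait would only mask 
the underlying errors.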


                
      was (Author: kwri...@metacarta.com):
    bq. I tried to execute the test now and I think that now tests are runned 
correctly but it seems that it can't delete the job from Manifold at the end of 
the test:

Right, the problem is that the job deletion hangs, because it's trying to 
delete the documents from the index and something goes wrong with that.  I 
posted earlier the manifoldcf.log output associated with this failure:

{code}
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
        at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
        at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.<init>(ElasticSearchDelete.java:35)
        at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
        at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}

The issue is that the "Server/page not found" error seems to occur 
intermittently on many different requests.  These are usually retried, but at 
the end during the delete phase they wait 5 minutes before being retried, which 
is why the test fails.  The real problem is that we should not be getting 
intermittent random errors at all, which is why I think we need to look at data 
that is kept around in the connector from request to request, namely the cached 
data structures.  I am certain these are the source of the problem.


                  
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, 
> manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
> manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
> manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
> manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of 
> ManifoldCF

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
