Raintung Li created SOLR-4073:
---------------------------------

             Summary: Overseer will miss operations in some cases.
                 Key: SOLR-4073
                 URL: https://issues.apache.org/jira/browse/SOLR-4073
             Project: Solr
          Issue Type: Bug
          Components: SolrCloud
    Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA
         Environment: Solr cloud
            Reporter: Raintung Li


An overseer loses its connection to ZooKeeper, but its overseer thread is still 
handling request (A) from the DistributedQueue. For example, once the overseer 
thread reconnects to ZooKeeper, it tries to remove the request at the top of 
the queue with "workQueue.remove();".

Meanwhile, another server takes over the overseer role because the old overseer 
disconnected. It starts its overseer thread, handles request (A) from the queue 
again, removes request (A) from the queue, and then goes to get the request now 
at the top (B), but has not fetched it yet. At this point the old overseer 
reconnects to ZooKeeper and removes the request at the top of the queue. The 
top request is now B, so B is removed by the old overseer server. The new 
overseer server never handles request (B) because it was deleted by the old 
overseer server, so in the end the operation in request (B) is lost.
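
The interleaving above can be replayed in a small single-threaded sketch. This 
is not Solr code: HeadOnlyQueue is a hypothetical in-memory stand-in for the 
ZooKeeper-backed queue, and the "qn-" node names are only an assumption made to 
keep the entries ordered. It exists just to show that a remove() that always 
deletes the head can drop a request nobody handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LostOperationDemo {

    // Hypothetical stand-in for the work queue: sorted node name -> request.
    static final class HeadOnlyQueue {
        private final TreeMap<String, String> nodes = new TreeMap<>();
        private int seq = 0;

        void offer(String request) {
            nodes.put(String.format("qn-%010d", seq++), request);
        }

        // Returns the request at the head without removing it, or null if empty.
        String peek() {
            Map.Entry<String, String> head = nodes.firstEntry();
            return head == null ? null : head.getValue();
        }

        // Removes whatever is at the head right now, with no identity check.
        void remove() {
            nodes.pollFirstEntry();
        }
    }

    // Replays the steps from the report and returns which requests were handled.
    static List<String> run() {
        HeadOnlyQueue queue = new HeadOnlyQueue();
        queue.offer("A");
        queue.offer("B");
        List<String> handled = new ArrayList<>();

        // Old overseer handles A, then loses its connection before remove().
        handled.add("old:" + queue.peek());

        // New overseer takes over, handles A again and removes it.
        handled.add("new:" + queue.peek());
        queue.remove();

        // Old overseer reconnects and runs its pending remove(): the head is now B.
        queue.remove();

        // New overseer looks for the next request, but the queue is already empty.
        String next = queue.peek();
        if (next != null) {
            handled.add("new:" + next);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [old:A, new:A]
    }
}
```

Request B never appears in the handled list, which is the lost operation 
described above.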

Ideally, DistributedQueue.peek would return the ID of the request, and that ID 
would be passed to workQueue.remove(ID), so that the specific request is 
removed rather than whatever is at the top of the queue.
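
A minimal sketch of that suggestion, under the same assumptions as the scenario 
above: IdAwareQueue, peekId(), get(id) and remove(id) are hypothetical names in 
an in-memory stand-in, not the actual DistributedQueue API. It only illustrates 
that removing by ID turns the stale remove from the old overseer into a no-op.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class RemoveByIdDemo {

    // Hypothetical stand-in for the work queue: sorted node ID -> request.
    static final class IdAwareQueue {
        private final TreeMap<String, String> nodes = new TreeMap<>();
        private int seq = 0;

        void offer(String request) {
            nodes.put(String.format("qn-%010d", seq++), request);
        }

        // Returns the ID of the head entry, or null if the queue is empty.
        String peekId() {
            return nodes.isEmpty() ? null : nodes.firstKey();
        }

        String get(String id) {
            return nodes.get(id);
        }

        // Removes only the entry with this ID; returns false if it is already gone.
        boolean remove(String id) {
            return nodes.remove(id) != null;
        }
    }

    // Replays the same interleaving, but every remove names the entry it handled.
    static List<String> run() {
        IdAwareQueue queue = new IdAwareQueue();
        queue.offer("A");
        queue.offer("B");
        List<String> handled = new ArrayList<>();

        // Old overseer handles A and remembers its ID, then disconnects.
        String oldId = queue.peekId();
        handled.add("old:" + queue.get(oldId));

        // New overseer handles A again and removes exactly that entry.
        String newId = queue.peekId();
        handled.add("new:" + queue.get(newId));
        queue.remove(newId);

        // Old overseer reconnects: its remove targets A, which is gone, so B survives.
        boolean removedByOld = queue.remove(oldId);
        handled.add("oldRemoved:" + removedByOld);

        // New overseer now finds B at the head and handles it.
        String nextId = queue.peekId();
        if (nextId != null) {
            handled.add("new:" + queue.get(nextId));
            queue.remove(nextId);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [old:A, new:A, oldRemoved:false, new:B]
    }
}
```

Note that request A is still handled twice in this sketch; removing by ID only 
prevents B from being deleted unhandled.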

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org