Bruce Schuchardt created GEODE-5385:
---------------------------------------

             Summary: hang trying to create a bucket
                 Key: GEODE-5385
                 URL: https://issues.apache.org/jira/browse/GEODE-5385
             Project: Geode
          Issue Type: Bug
            Reporter: Bruce Schuchardt


It's possible for partitioned region bucket allocation to hang even though 
there appears to be plenty of storage available.  This can happen if one server 
is creating the partitioned region at the same time the region is being 
destroyed by another server.

The server creating the partitioned region sends a ForceReattemptException 
back to the server destroying the region, and that exception is ignored.  The 
server creating the PR is then left holding a region with a dangling ID 
that has been removed from the PR metadata region.  If another server then 
recreates the PR, it assigns a new ID to it, and the servers end up with 
mismatched IDs.  The IDs are sent in partitioned region messages such as 
manage-bucket.
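
The sequence above can be sketched as a small, self-contained simulation. 
This is only an illustration under stated assumptions, not Geode code: the 
class PrIdSkewSketch, the map standing in for the PR metadata region, the 
ID counter, and the region name "orders" are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class PrIdSkewSketch {
    // Hypothetical stand-in for the PR metadata region: region name -> PR ID.
    static final Map<String, Integer> metadata = new HashMap<>();
    // Hypothetical ID allocator; the real allocation scheme is not described in this issue.
    static int nextId = 1;

    // Registers the region if absent and returns the ID recorded in the metadata.
    static int createRegion(String name) {
        return metadata.computeIfAbsent(name, n -> nextId++);
    }

    // Removes the region's entry from the metadata.
    static void destroyRegion(String name) {
        metadata.remove(name);
    }

    // Returns {ID kept by the creating server, ID assigned on recreation}.
    static int[] simulateRace() {
        metadata.clear();
        nextId = 1;
        int idOnCreator = createRegion("orders");   // server A is creating the PR
        destroyRegion("orders");                    // server B destroys it concurrently;
                                                    // A's exception is ignored, so A keeps its ID
        int idOnRecreator = createRegion("orders"); // server C recreates the PR with a new ID
        return new int[] {idOnCreator, idOnRecreator};
    }

    public static void main(String[] args) {
        int[] ids = simulateRace();
        System.out.println("creator holds ID " + ids[0]
            + ", recreator holds ID " + ids[1]);
    }
}
```

In this sketch the creating server keeps an ID that no longer exists in the 
metadata, while the recreating server holds a different one, which is the 
mismatch described above.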

The distribution advisors don't recognize that there is a mismatch, and our 
logs show nothing about it because a safety mechanism in PRSanityCheckMessage 
was accidentally turned off by an engineer.  This message performs a check of 
the IDs in the servers to make sure they're consistent.
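
The kind of consistency check described here might look roughly like the 
following. This is a hedged sketch, not the actual PRSanityCheckMessage 
logic: the class PrIdSanityCheckSketch, the method findMismatches, and the 
member names are hypothetical, and it assumes each server can report the ID 
it holds for a given PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PrIdSanityCheckSketch {
    // Returns the members whose reported PR ID differs from the local one.
    static List<String> findMismatches(int localId, Map<String, Integer> idsByMember) {
        List<String> mismatched = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : idsByMember.entrySet()) {
            if (entry.getValue() != localId) {
                mismatched.add(entry.getKey());
            }
        }
        return mismatched;
    }

    public static void main(String[] args) {
        // Hypothetical IDs reported by other servers for the same PR.
        Map<String, Integer> reported = new LinkedHashMap<>();
        reported.put("serverB", 2);
        reported.put("serverC", 2);

        // The local server still holds the dangling ID 1.
        for (String member : findMismatches(1, reported)) {
            System.out.println("PR ID mismatch with " + member);
        }
    }
}
```

With a check like this enabled, a server holding a dangling ID would at 
least produce a log entry instead of failing silently.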



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)