Bruce Schuchardt created GEODE-5385:
---------------------------------------
Summary: hang trying to create a bucket
Key: GEODE-5385
URL: https://issues.apache.org/jira/browse/GEODE-5385
Project: Geode
Issue Type: Bug
Reporter: Bruce Schuchardt
It's possible for partitioned region bucket allocation to hang even though
there appears to be plenty of storage available. This can happen if one server
is creating the partitioned region at the same time the region is being
destroyed by another server.
The server creating the partitioned region sends a ForceReattemptException
back to the server destroying the region, but that exception is ignored. The
server creating the PR is then left holding a region with a dangling ID
that has been removed from the PR metadata region. If another server then
recreates the PR, it assigns a new ID to it, and the servers end up with
mismatched (skewed) IDs for the same region. These IDs are sent in
partitioned region messages such as manage-bucket.
The distribution advisors don't recognize that there is a skew, and our logs
show nothing about it because a safety mechanism in PRSanityCheckMessage was
accidentally turned off by an engineer. This message performs a check of the
IDs in the servers to make sure they're consistent.
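
To make the failure mode concrete, here is a minimal sketch of the kind of ID consistency check described above. It is not Geode code and does not reflect how PRSanityCheckMessage is implemented. The class PrIdSkewSketch, the method findSkewedMembers, the member names and the ID values are all hypothetical. It assumes each member holds one integer ID for the region and that the metadata region records the current ID.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names are Geode classes.
public class PrIdSkewSketch {

    /**
     * Returns the members whose locally held region ID differs from the ID
     * recorded in the (simulated) PR metadata, mapped to the stale ID they hold.
     */
    static Map<String, Integer> findSkewedMembers(int metadataId,
                                                  Map<String, Integer> localIds) {
        Map<String, Integer> skewed = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : localIds.entrySet()) {
            if (entry.getValue().intValue() != metadataId) {
                skewed.put(entry.getKey(), entry.getValue());
            }
        }
        return skewed;
    }

    public static void main(String[] args) {
        Map<String, Integer> localIds = new LinkedHashMap<>();
        // serverA was creating the PR while another server destroyed it,
        // so it still holds the old (dangling) ID. Values are made up.
        localIds.put("serverA", 7);
        // serverC recreated the PR afterwards and was assigned a new ID.
        localIds.put("serverC", 8);

        // Assumed: the metadata region now records only the new ID.
        int metadataId = 8;

        Map<String, Integer> skewed = findSkewedMembers(metadataId, localIds);
        if (skewed.isEmpty()) {
            System.out.println("Region IDs are consistent");
        } else {
            // Logging the mismatch is the visibility the report says is missing.
            System.out.println("Region ID skew detected: " + skewed);
        }
    }
}
```

With a check like this enabled, the stale ID on the server that raced with the destroy would show up in the logs instead of surfacing only as a hang during bucket creation.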