[
https://issues.apache.org/jira/browse/GEODE-5385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bruce Schuchardt resolved GEODE-5385.
-------------------------------------
Resolution: Fixed
Fix Version/s: 1.7.0
> hang trying to create a bucket
> ------------------------------
>
> Key: GEODE-5385
> URL: https://issues.apache.org/jira/browse/GEODE-5385
> Project: Geode
> Issue Type: Bug
> Reporter: Bruce Schuchardt
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> It's possible for partitioned region bucket allocation to hang even though
> there appears to be plenty of storage available. This can happen if one
> server is creating the partitioned region at the same time the region is
> being destroyed by another server.
> The server creating the partitioned region (PR) will send a
> ForceReattemptException back to the server destroying the region, and that
> exception is ignored. The server creating the PR will then be stuck with a
> region that has a dangling ID, one that has been removed from the PR
> metadata region. If another server then recreates the PR, it will assign a
> new ID to it and the servers will have skewed IDs. The IDs are sent in
> partitioned region messages such as manage-bucket.
> The distribution advisors don't recognize that there is a skew, and our logs
> show nothing about it because the safety mechanism in PRSanityCheckMessage
> was accidentally turned off. This message checks the IDs held by the servers
> to make sure they're consistent.
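
The kind of consistency check described above can be sketched roughly as follows. This is a minimal illustration only, not the Geode implementation: the class PrIdSanityCheck, the method findSkewedServers, the server names, and the ID values are all hypothetical, and the sketch assumes each server's local ID for the region can be compared against the ID recorded in the PR metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; none of these names come from the Geode code base.
class PrIdSanityCheck {

    /**
     * Returns the names of servers whose local ID for a partitioned region
     * differs from the ID recorded in the PR metadata.
     */
    static List<String> findSkewedServers(int metadataId, Map<String, Integer> localIdsByServer) {
        List<String> skewed = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : localIdsByServer.entrySet()) {
            if (entry.getValue() != metadataId) {
                skewed.add(entry.getKey());
            }
        }
        return skewed;
    }

    public static void main(String[] args) {
        // Assumed example values: server1 kept the ID of the destroyed region,
        // while the other servers hold the ID assigned when the PR was recreated.
        Map<String, Integer> localIds = new LinkedHashMap<>();
        localIds.put("server1", 7);
        localIds.put("server2", 8);
        localIds.put("server3", 8);

        List<String> skewed = findSkewedServers(8, localIds);
        if (!skewed.isEmpty()) {
            // Logging the mismatch is what makes the skew visible to an operator.
            System.out.println("PR ID skew detected on: " + skewed);
        }
    }
}
```

With a check of this shape enabled, a server holding a dangling ID would be reported in the logs instead of silently failing to handle messages such as manage-bucket.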