[
https://issues.apache.org/jira/browse/GEODE-7569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bill Burcham resolved GEODE-7569.
---------------------------------
Fix Version/s: 1.12.0
Resolution: Fixed
> Hang during StateFlush due to new flipping the containsRegionContentChange on
> PartitionMessageWithDirectReply
> -------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-7569
> URL: https://issues.apache.org/jira/browse/GEODE-7569
> Project: Geode
> Issue Type: Bug
> Components: membership
> Reporter: Dan Smith
> Assignee: Dan Smith
> Priority: Major
> Fix For: 1.12.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> The recent changes for GEODE-7435 in commit e3a31e190031f094ac3bd1517722d6bead710418
> have caused a distributed deadlock when making a copy of a bucket.
> These changes flipped the value of containsRegionContentChange for
> PartitionMessageWithDirectReply.
> That flag controls which messages participate in a state flush operation. Now,
> many new messages are part of a state flush, including messages which trigger
> bucket creation. This causes the following distributed deadlock:
> 1. Member A is waiting for a StateFlush to finish
> 2. Member B is stuck in StateStabilizationMessage, waiting for messages to be
> processed
> 3. Member B is in the middle of processing some messages, which is what is
> holding up the StateStabilizationMessage
> 4. Some of those messages are PartitionMessageWithDirectReply messages that
> end up triggering createBucketAtomically. That method blocks, waiting for
> bucket creation in Member A to finish.
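
The four steps above form a cycle of waits. As a rough illustration only, the sketch below models them as a wait-for graph and checks for a cycle. It does not use any Geode API: the class name, node labels, and methods are hypothetical, and the last edge (bucket creation on Member A cannot finish until the state flush finishes) is an assumption inferred from the description rather than stated in it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the hang; not Geode code.
class StateFlushDeadlockSketch {
    static final String A_FLUSH = "A: StateFlush";
    static final String B_STABILIZE = "B: StateStabilizationMessage";
    static final String B_DIRECT_REPLY = "B: PartitionMessageWithDirectReply";
    static final String A_BUCKET = "A: bucket creation";

    // Wait-for edges: each key is blocked until its value completes.
    private final Map<String, String> waitsFor = new HashMap<>();

    void addWait(String waiter, String blocker) {
        waitsFor.put(waiter, blocker);
    }

    // Follows the chain of waits from start; revisiting a node means a deadlock.
    boolean hasCycleFrom(String start) {
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null) {
            if (!seen.add(current)) {
                return true;
            }
            current = waitsFor.get(current);
        }
        return false;
    }

    // directReplyInFlush mirrors whether PartitionMessageWithDirectReply
    // messages participate in the state flush (the flipped flag).
    static StateFlushDeadlockSketch build(boolean directReplyInFlush) {
        StateFlushDeadlockSketch graph = new StateFlushDeadlockSketch();
        // Step 1: Member A waits for the state flush, which needs Member B.
        graph.addWait(A_FLUSH, B_STABILIZE);
        // Steps 2-3: B's stabilization waits on in-flight messages, but only
        // on those that count as part of the flush.
        if (directReplyInFlush) {
            graph.addWait(B_STABILIZE, B_DIRECT_REPLY);
        }
        // Step 4: the message blocks until bucket creation on A finishes.
        graph.addWait(B_DIRECT_REPLY, A_BUCKET);
        // Assumption: bucket creation on A cannot finish before the flush does.
        graph.addWait(A_BUCKET, A_FLUSH);
        return graph;
    }

    public static void main(String[] args) {
        System.out.println("flag flipped:  deadlock = " + build(true).hasCycleFrom(A_FLUSH));
        System.out.println("flag restored: deadlock = " + build(false).hasCycleFrom(A_FLUSH));
    }
}
```

In this model, removing the single edge from the stabilization wait to the direct-reply message breaks the cycle, which is the effect one would expect from restoring the previous value of the flag.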