[
https://issues.apache.org/jira/browse/GEODE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316887#comment-16316887
]
ASF subversion and git services commented on GEODE-4051:
--------------------------------------------------------
Commit 61077fba1b18c54ed0d253e7904bad69a10b93ac in geode's branch
refs/heads/develop from [~dschneider]
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=61077fb ]
GEODE-4051: change StateMarkerMessage to always reply
One of the calls in StateMarkerMessage process would not end
up sending a reply if waitForCurrentOperations threw an exception.
It now has a finally block that makes sure the reply message is
always sent.
> Two server jvms crashed at same time and caused some primary and redundant
> buckets to be cleared. Causing some buckets to get locked and not able to
> recover also after bouncing all servers
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-4051
> URL: https://issues.apache.org/jira/browse/GEODE-4051
> Project: Geode
> Issue Type: Bug
> Components: regions
> Affects Versions: 1.2.0
> Reporter: Igor Barchak
> Assignee: Darrel Schneider
>
> "Pooled Waiting Message Processor 5" tid=0x162
> java.lang.Thread.State: TIMED_WAITING
> at sun.misc.Unsafe.park(Native Method)
> - waiting on java.util.concurrent.CountDownLatch$Sync@1993a5
> at
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
> at
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:64)
> at
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:715)
> at
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:644)
> at
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:624)
> at
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:519)
> at
> org.apache.geode.internal.cache.StateFlushOperation.flush(StateFlushOperation.java:243)
> at
> org.apache.geode.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:349)
> at
> org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1168)
> at
> org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1023)
> at
> org.apache.geode.internal.cache.BucketRegion.initialize(BucketRegion.java:253)
> at
> org.apache.geode.internal.cache.LocalRegion.createSubregion(LocalRegion.java:962)
> at
> org.apache.geode.internal.cache.PartitionedRegionDataStore.createBucketRegion(PartitionedRegionDataStore.java:726)
> at
> org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucket(PartitionedRegionDataStore.java:414)
> - locked org.apache.geode.internal.cache.ProxyBucketRegion@6820a0b6
> at
> org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucketRecursively(PartitionedRegionDataStore.java:272)
> at
> org.apache.geode.internal.cache.PartitionedRegionDataStore.grabBucket(PartitionedRegionDataStore.java:2815)
> at
> org.apache.geode.internal.cache.partitioned.ManageBackupBucketMessage.operateOnPartitionedRegion(ManageBackupBucketMessage.java:148)
> at
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:332)
> Seems like it was introduced in this fix
> https://github.com/apache/geode/commit/3a1062e245b3ded52ea3f6b6de0aff94ce846fa3?diff=split
> See StateMarkerMessage.process
> The first if condition doesn't have a finally block.
> The else has a finally block.
> The first if condition didn't have a 'waitFor' operation earlier - it was
> introduced in this commit
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)