[ https://issues.apache.org/jira/browse/GEODE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaojian Zhou updated GEODE-9191:
---------------------------------
Labels: GeodeOperationAPI (was: )
> PR clear should not miss clearing bucket which lost primary
> -----------------------------------------------------------
>
> Key: GEODE-9191
> URL: https://issues.apache.org/jira/browse/GEODE-9191
> Project: Geode
> Issue Type: Sub-task
> Reporter: Xiaojian Zhou
> Priority: Major
> Labels: GeodeOperationAPI
>
> This scenario was found while introducing a GII test case for PR clear. The
> sequence is:
> (1) There are 3 servers: server1 is an accessor; server2 and server3 are
> datastores.
> (2) Shut down server2.
> (3) Send a PR clear from server1 (the accessor) and restart server2 at the
> same time. Because of a race, server2 does not receive the
> PartitionedRegionClearMessage.
> (4) server2 finishes GII.
> (5) Only server3 receives the PartitionedRegionClearMessage, and it hosts all
> the primary buckets. While the PR clear thread iterates through these primary
> buckets one by one, some of them might lose primary to server2.
> (6) BR.cmnClearRegion returns immediately because the bucket is no longer
> primary, but clearedBuckets.add(localPrimaryBucketRegion.getId()); is still
> called. So from the caller's point of view, this bucket is cleared, and
> PartitionedRegionPartialClearException is not even thrown.
> The problem is that nothing verifies the bucket is still primary:
> before calling cmnClearRegion, we should call BR.doLockForPrimary to make
> sure it is still primary. If it is not, throw an exception, so that
> clearedBuckets.add(localPrimaryBucketRegion.getId()); is not called for
> this bucket.
> The expected behavior in this scenario is to throw
> PartitionedRegionPartialClearException.
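
The proposed check can be sketched in plain Java as below. This is a minimal illustration, not Geode code: `Bucket`, `lockForPrimary`, `unlockPrimary`, `PartialClearException` and `clearLocalPrimaries` are hypothetical stand-ins for BR, BR.doLockForPrimary, PartitionedRegionPartialClearException and the PR clear loop. Continuing with the remaining buckets and throwing once at the end is also an assumption; the description only says an exception should be thrown.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

final class PrClearSketch {

  /** Hypothetical stand-in for PartitionedRegionPartialClearException. */
  static class PartialClearException extends RuntimeException {
    PartialClearException(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for a bucket region (BR). */
  static class Bucket {
    final int id;
    volatile boolean primary;
    final List<String> entries = new ArrayList<>();
    private final ReentrantLock primaryLock = new ReentrantLock();

    Bucket(int id, boolean primary) {
      this.id = id;
      this.primary = primary;
    }

    /** Mirrors the reported behavior: a silent no-op when not primary. */
    void cmnClearRegion() {
      if (!primary) {
        return;
      }
      entries.clear();
    }

    /** Plays the role of BR.doLockForPrimary: lock, then verify primary. */
    void lockForPrimary() {
      primaryLock.lock();
      if (!primary) {
        primaryLock.unlock();
        throw new IllegalStateException("bucket " + id + " is no longer primary");
      }
    }

    void unlockPrimary() {
      primaryLock.unlock();
    }
  }

  /** Clears each bucket only after confirming it is still primary. */
  static Set<Integer> clearLocalPrimaries(List<Bucket> buckets) {
    Set<Integer> clearedBuckets = new LinkedHashSet<>();
    List<Integer> skipped = new ArrayList<>();
    for (Bucket bucket : buckets) {
      try {
        bucket.lockForPrimary();
      } catch (IllegalStateException e) {
        skipped.add(bucket.id); // lost primary: never recorded as cleared
        continue;
      }
      try {
        bucket.cmnClearRegion();
        clearedBuckets.add(bucket.id); // reached only while primary is held
      } finally {
        bucket.unlockPrimary();
      }
    }
    if (!skipped.isEmpty()) {
      throw new PartialClearException("buckets not cleared: " + skipped);
    }
    return clearedBuckets;
  }

  public static void main(String[] args) {
    Bucket stillPrimary = new Bucket(0, true);
    Bucket lostPrimary = new Bucket(1, false); // e.g. primary moved to server2
    stillPrimary.entries.add("k0");
    lostPrimary.entries.add("k1");
    try {
      clearLocalPrimaries(List.of(stillPrimary, lostPrimary));
    } catch (PartialClearException e) {
      System.out.println("partial clear reported: " + e.getMessage());
    }
  }
}
```

With the check in place, a bucket that lost primary is never added to clearedBuckets, and the caller sees a partial-clear failure instead of a false success.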
--
This message was sent by Atlassian Jira
(v8.3.4#803005)