[ 
https://issues.apache.org/jira/browse/GEODE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaojian Zhou updated GEODE-9191:
---------------------------------
        Parent: GEODE-7665
    Issue Type: Sub-task  (was: Bug)

> PR clear should not miss clearing bucket which lost primary
> -----------------------------------------------------------
>
>                 Key: GEODE-9191
>                 URL: https://issues.apache.org/jira/browse/GEODE-9191
>             Project: Geode
>          Issue Type: Sub-task
>            Reporter: Xiaojian Zhou
>            Priority: Major
>
> This scenario was found while introducing a GII test case for PR clear. The 
> sequence is:
> (1) There are 3 servers: server1 is an accessor; server2 and server3 are 
> datastores.
> (2) Shut down server2.
> (3) Send a PR clear from server1 (the accessor) and restart server2 at the 
> same time. There is a race in which server2 does not receive the 
> PartitionedRegionClearMessage.
> (4) server2 finishes GII.
> (5) Only server3 receives the PartitionedRegionClearMessage, and it hosts all 
> the primary buckets. While the PR clear thread iterates through these primary 
> buckets one by one, some of them might lose primary to server2. 
> (6) BR.cmnClearRegion returns immediately for a bucket that is no longer 
> primary, but clearedBuckets.add(localPrimaryBucketRegion.getId()); is still 
> called. So from the caller's point of view, this bucket is cleared, and 
> PartitionedRegionPartialClearException is not even thrown.
> The problem is that the caller does not verify primary status itself: 
> before calling cmnClearRegion, we should call BR.doLockForPrimary to make 
> sure the bucket is still primary. If it is not, throw an exception, so that 
> clearedBuckets.add(localPrimaryBucketRegion.getId()); is not called for 
> this bucket. 
> The expected behavior in this scenario is to throw 
> PartitionedRegionPartialClearException.
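
The check-before-clear idea described above can be sketched roughly as follows. This is a simplified, self-contained model, not Geode code: Bucket, lockForPrimary, PartialClearException and clearLocalPrimaries are hypothetical stand-ins for BucketRegion, BR.doLockForPrimary, PartitionedRegionPartialClearException and the PR clear loop, and the real signatures and locking behavior may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PrClearSketch {
  /** Hypothetical stand-in for PartitionedRegionPartialClearException. */
  static class PartialClearException extends RuntimeException {
    PartialClearException(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for a bucket region; not Geode's BucketRegion. */
  static class Bucket {
    final int id;
    volatile boolean primary;
    final List<String> entries = new ArrayList<>();

    Bucket(int id, boolean primary) {
      this.id = id;
      this.primary = primary;
      this.entries.add("key-" + id); // one sample entry per bucket
    }

    /** Assumed analogue of BR.doLockForPrimary: fails if primary was lost. */
    void lockForPrimary() {
      if (!primary) {
        throw new PartialClearException("Bucket " + id + " is no longer primary");
      }
    }

    /** Assumed analogue of BR.cmnClearRegion for a bucket that is primary. */
    void clear() {
      entries.clear();
    }
  }

  /**
   * Clears each local bucket, verifying primary status first. A bucket id is
   * recorded in clearedBuckets only after its clear actually ran.
   */
  static void clearLocalPrimaries(List<Bucket> buckets, Set<Integer> clearedBuckets) {
    for (Bucket bucket : buckets) {
      bucket.lockForPrimary(); // throws before the bucket can be recorded
      bucket.clear();
      clearedBuckets.add(bucket.id);
    }
  }

  public static void main(String[] args) {
    List<Bucket> buckets = new ArrayList<>();
    buckets.add(new Bucket(0, true));
    buckets.add(new Bucket(1, false)); // lost primary to another member
    buckets.add(new Bucket(2, true));

    Set<Integer> cleared = new LinkedHashSet<>();
    try {
      clearLocalPrimaries(buckets, cleared);
    } catch (PartialClearException e) {
      System.out.println("Partial clear: " + e.getMessage());
    }
    System.out.println("Recorded as cleared: " + cleared);
  }
}
```

In this sketch the exception propagates on the first bucket that lost primary, so the caller learns the clear was partial instead of seeing the bucket reported as cleared. Whether the real fix should stop at the first such bucket or continue and report all of them is a design choice the issue does not settle.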



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
