[ 
https://issues.apache.org/jira/browse/GEODE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339313#comment-17339313
 ] 

Xiaojian Zhou edited comment on GEODE-9191 at 5/4/21, 10:28 PM:
----------------------------------------------------------------

Further investigation found that primary buckets can switch at any time, 
especially when they are not balanced (this usually happens during GII). We 
need to lock the primaries to prevent them from moving.

The revised design is:
(1) The coordinator (a server) calls assignAllBuckets, then waits for all the 
primaries to show up. 
(2) The coordinator sends a lock message to all members. 
(3) Upon receiving the lock message, each datastore server saves its current 
primary bucket count for future reference. 
(4) Each datastore iterates through its local primary bucket list to lock 
each primary from moving and to lock its rvv. If either the total number of 
locked primary buckets or the total number of locked rvv buckets at this 
member differs from the previously saved primary bucket count, the member 
unlocks all of them and returns a RetryException to the coordinator. 
(5) If the coordinator receives a RetryException, it resends the lock message 
and keeps retrying until it succeeds.
(6) After all the members' primary buckets are locked, the coordinator sends a 
clear message to all the members.
(7) Each member clears its primary buckets one by one and returns the number 
of buckets cleared.
(8) The coordinator collects the numbers cleared; if the total is less than the 
expected bucket count, it throws PartialClearException to the caller. This can 
happen when a member goes offline in the middle of the clear. 
(9) If any member exits in the middle of the clear, the membership listener at 
the coordinator is notified. The coordinator unlocks all the locks and retries, 
first locking and then clearing. In the retry, if the missing member's buckets 
have been recreated on another member, the retry succeeds. Otherwise, the 
total number of cleared buckets is still lower than expected (i.e. 
PartitionOffline happened), and the coordinator throws PartialClearException. 
(10) If the coordinator exits in the middle of the clear, all the locks are 
unlocked and PartialClearException is thrown.
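
The lock, retry, and clear flow described above can be sketched as a small 
single-JVM simulation. This is only an illustration under stated assumptions, 
not the Geode implementation: ClearSketch, Member, bucketToLose and moveTarget 
are hypothetical names, messaging is replaced by direct method calls, the rvv 
lock is folded into a single "locked" set, and unlocking every member before a 
retry is an assumption about how the coordinator resets state.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical simulation of the revised design; none of these are Geode classes. */
class ClearSketch {
  static class RetryException extends Exception {}

  static class PartialClearException extends RuntimeException {
    PartialClearException(String msg) { super(msg); }
  }

  /** Stand-in for one datastore member. */
  static class Member {
    final Set<Integer> primaries = new HashSet<>(); // bucket ids this member is primary for
    final Set<Integer> locked = new HashSet<>();    // bucket ids locked for the clear
    Integer bucketToLose; // simulation hook: this primary moves away once while locking
    Member moveTarget;    // simulation hook: member that receives the moved primary

    /** Save the primary count, lock each primary, and ask for a retry on a mismatch. */
    void lockPrimaries() throws RetryException {
      int savedCount = primaries.size();
      for (Integer bucket : new ArrayList<>(primaries)) {
        if (bucket.equals(bucketToLose)) {
          // The primary moved before it could be locked.
          primaries.remove(bucket);
          moveTarget.primaries.add(bucket);
          bucketToLose = null;
          continue;
        }
        locked.add(bucket);
      }
      if (locked.size() != savedCount) {
        locked.clear();
        throw new RetryException();
      }
    }

    /** Clear the locked primaries and report how many were cleared. */
    int clearLockedPrimaries() { return locked.size(); }

    void unlock() { locked.clear(); }
  }

  /** Coordinator side: lock everywhere (retrying as needed), clear, then count. */
  static int clear(List<Member> members, int expectedBuckets) {
    try {
      while (true) { // the design retries until locking succeeds
        try {
          for (Member m : members) m.lockPrimaries();
          break;
        } catch (RetryException e) {
          for (Member m : members) m.unlock(); // assumption: reset all members, then retry
        }
      }
      int cleared = 0;
      for (Member m : members) cleared += m.clearLockedPrimaries();
      if (cleared < expectedBuckets) {
        throw new PartialClearException(
            "cleared " + cleared + " of " + expectedBuckets + " buckets");
      }
      return cleared;
    } finally {
      for (Member m : members) m.unlock();
    }
  }
}
```

In this sketch, a primary that moves during the first lock attempt forces one 
retry, after which the bucket is locked and cleared on its new owner. If a 
member is missing from the list, the cleared total falls short of the expected 
count and PartialClearException is thrown.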



was (Author: zhouxj):
Further investigation found that primary buckets can switch at any time, 
especially when they are not balanced (this usually happens during GII). We 
need to lock the primaries to prevent them from moving.

The revised design is:
(1) The coordinator (a server) calls assignAllBuckets, then waits for all the 
primaries to show up. 
(2) The coordinator sends a lock message to all members. 
(3) Upon receiving the lock message, each datastore server calls 
lockBucketCreationForRegionClear(), then saves its current primary bucket 
count. 
(4) Each datastore iterates through its local primary bucket list to lock 
each primary from moving and to lock its rvv. If either the total number of 
locked primary buckets or the total number of locked rvv buckets at this 
member differs from the previously saved primary bucket count, the member 
unlocks all of them and returns a RetryException to the coordinator. 
(5) If the coordinator receives a RetryException, it resends the lock message 
and keeps retrying until it succeeds.
(6) After all the members' primary buckets are locked, the coordinator sends a 
clear message to all the members.
(7) Each member clears its primary buckets one by one and returns the number 
of buckets cleared.
(8) The coordinator collects the numbers cleared; if the total is less than the 
expected bucket count, it throws PartialClearException to the caller. This can 
happen when a member goes offline in the middle of the clear. 
(9) If any member exits in the middle of the clear, bucket creation is not 
allowed during the clear, so all the locks are unlocked and the number of 
buckets cleared is returned to the coordinator. The coordinator will finally 
throw PartialClearException. 
(10) If the coordinator exits in the middle of the clear, all the locks are 
unlocked and PartialClearException is thrown.


> PR clear could miss clearing bucket which lost primary
> ------------------------------------------------------
>
>                 Key: GEODE-9191
>                 URL: https://issues.apache.org/jira/browse/GEODE-9191
>             Project: Geode
>          Issue Type: Sub-task
>            Reporter: Xiaojian Zhou
>            Assignee: Xiaojian Zhou
>            Priority: Major
>              Labels: GeodeOperationAPI, pull-request-available
>
> This scenario was found when introducing a GII test case for PR clear. The 
> sequence is:
> (1) There are 3 servers: server1 is an accessor; server2 and server3 are 
> datastores.
> (2) Shut down server2.
> (3) Send PR clear from server1 (the accessor) and restart server2 at the same 
> time. There is a race in which server2 does not receive the 
> PartitionedRegionClearMessage.
> (4) server2 finishes GII.
> (5) Only server3 received the PartitionedRegionClearMessage, and it hosts all 
> the primary buckets. While the PR clear thread iterates through these primary 
> buckets one by one, some of them might lose primary to server2. 
> (6) BR.cmnClearRegion returns immediately since the bucket is no longer 
> primary, but clearedBuckets.add(localPrimaryBucketRegion.getId()); is still 
> called. So from the caller's point of view, this bucket is cleared. It does 
> not even throw PartitionedRegionPartialClearException.
> The problem is:
> before calling cmnClearRegion, we should call BR.doLockForPrimary to make 
> sure the bucket is still primary. If it is not, throw an exception. Then 
> clearedBuckets.add(localPrimaryBucketRegion.getId()); will not be called for 
> this bucket. 
> The expected behavior in this scenario is to throw 
> PartitionedRegionPartialClearException.
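
The difference between recording a bucket as cleared unconditionally and 
checking primary status first can be illustrated with a minimal sketch. It 
does not use Geode classes: Bucket, clearIfPrimary, clearWithoutCheck and 
clearWithCheck are hypothetical names, and a plain boolean stands in for the 
primary lock. A real fix would also have to hold the primary lock so the 
bucket cannot move between the check and the clear, which a single-threaded 
sketch cannot show.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the bug and the proposed check; not Geode code. */
class PrimaryCheckSketch {
  static class PartialClearException extends RuntimeException {
    PartialClearException(String msg) { super(msg); }
  }

  static class Bucket {
    final int id;
    boolean primary;
    boolean hasData = true;

    Bucket(int id, boolean primary) {
      this.id = id;
      this.primary = primary;
    }

    /** Mirrors the described behavior: returns immediately when not primary. */
    void clearIfPrimary() {
      if (!primary) {
        return;
      }
      hasData = false;
    }
  }

  /** Shape of the reported bug: the id is recorded even if nothing was cleared. */
  static List<Integer> clearWithoutCheck(List<Bucket> buckets) {
    List<Integer> clearedBuckets = new ArrayList<>();
    for (Bucket b : buckets) {
      b.clearIfPrimary();
      clearedBuckets.add(b.id);
    }
    return clearedBuckets;
  }

  /** Proposed shape: verify primary status first and fail loudly otherwise. */
  static List<Integer> clearWithCheck(List<Bucket> buckets) {
    List<Integer> clearedBuckets = new ArrayList<>();
    for (Bucket b : buckets) {
      if (!b.primary) {
        throw new PartialClearException("bucket " + b.id + " is no longer primary");
      }
      b.clearIfPrimary();
      clearedBuckets.add(b.id);
    }
    return clearedBuckets;
  }
}
```

With one bucket that has lost primary, clearWithoutCheck reports it as cleared 
although its data is untouched, while clearWithCheck throws and never records 
it.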



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
