[ 
https://issues.apache.org/jira/browse/GEODE-10039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500431#comment-17500431
 ] 

Jianxia Chen commented on GEODE-10039:
--------------------------------------

When a member m1 moves a primary bucket to another member m2 because of a 
rebalance, m2 sends a RemoveBucketMessage to m1. While processing the 
RemoveBucketMessage, m1 sends a DestroyRegionMessage to all members hosting 
the partitioned region. The DestroyRegionMessage removes the bucket profile 
for m1, because m1 no longer hosts the primary bucket. Note that this 
DestroyRegionMessage does not destroy any bucket; it only removes the bucket 
profile from the members that host the partitioned region. 

When a member m3 is in the process of creating the partitioned region while 
the primary bucket is being moved from m1 to m2, it is possible that the 
DestroyRegionMessage is sent to m2 only, because at the time of sending, m1 
does not yet know that m3 also hosts the partitioned region. As a result, m3 
misses the DestroyRegionMessage and keeps two bucket profiles, one for m1 and 
one for m2. If m2 later shuts down, after m3 has successfully created the 
partitioned region, the shutdown removes m2's bucket profile from m3. At this 
point there is no primary bucket, but m3 still has a stale bucket profile 
showing that m1 hosts the bucket.
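
The sequence above can be sketched as a toy model. This is only an 
illustration of the ordering problem, not Geode code: the class 
StaleProfileSketch, the Cluster type, and the addMember/removeProfile methods 
are hypothetical names, and the recipient list passed to removeProfile stands 
in for the set of members the sender knows about when it sends the message.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Geode classes.
class StaleProfileSketch {

  static final class Cluster {
    // For each member, the set of members it believes host the bucket.
    final Map<String, Set<String>> profilesByMember = new HashMap<>();

    void addMember(String member, String... knownHosts) {
      profilesByMember.put(member, new HashSet<>(List.of(knownHosts)));
    }

    // Stand-in for a profile removal: only the listed recipients drop
    // the profile for the given host.
    void removeProfile(String host, List<String> recipients) {
      for (String recipient : recipients) {
        profilesByMember.get(recipient).remove(host);
      }
    }
  }

  static Set<String> simulate() {
    Cluster cluster = new Cluster();
    // Mid-move, both m2 and the newly joining m3 hold profiles for m1 and m2.
    cluster.addMember("m2", "m1", "m2");
    cluster.addMember("m3", "m1", "m2");

    // m1 gives up the bucket, but its recipient list does not include m3
    // because m1 does not yet know that m3 hosts the region.
    cluster.removeProfile("m1", List.of("m2"));

    // Later m2 shuts down; by now m3 is known, so it is notified.
    cluster.removeProfile("m2", List.of("m3"));

    // What m3 still believes about who hosts the bucket.
    return cluster.profilesByMember.get("m3");
  }

  public static void main(String[] args) {
    System.out.println("m3's view of bucket hosts: " + simulate());
  }
}
```

Running main prints "m3's view of bucket hosts: [m1]": in this model m3 is 
left believing m1 hosts the bucket even though no member does.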

This can be reproduced in a distributed test.

> BucketProfiles can be stale in rare cases.
> ------------------------------------------
>
>                 Key: GEODE-10039
>                 URL: https://issues.apache.org/jira/browse/GEODE-10039
>             Project: Geode
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.15.0
>            Reporter: Mark Hanson
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeOperationAPI, blocks-1.15.0
>
> When a server is starting as a member of a partitioned region during a 
> rebalance, it is possible for the starting server to not get a profile 
> removal for a bucket that has been relocated.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
