[jira] [Commented] (GEODE-10409) Rebalance Model Missing Collocated Regions At Server Startup

ASF subversion and git services (Jira) Mon, 12 Sep 2022 01:07:06 -0700


    [ 
https://issues.apache.org/jira/browse/GEODE-10409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17602982#comment-17602982
 ]


ASF subversion and git services commented on GEODE-10409:
---------------------------------------------------------

Commit 0852113f1b8086203ffdd99bae1afa250c2eaa3e in geode's branch 
refs/heads/develop from WeijieEST
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=0852113f1b ]

GEODE-10409: Fix rebalance load model missing collocated regions at s… (#7839)

* GEODE-10409: Fix rebalance load model missing collocated regions at server 
startup

Assume regions A1 and A2 are collocated with region A, and A is the leader 
region. When a rebalance runs at server startup, it happens after the 
collocation of all 3 regions has completed, which generally occurs in 
region A2.
When the rebalance load model is calculated from the view of region A2, only 
the leader region A and A2 itself are added to the model. This commit fixes 
the issue so that A1 is also added to the model.

* add test cases to test rebalance model and remove the static mock

* change test case to avoid changing existing methods for testing

* improve test case
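
The difference described in the commit message can be illustrated with a small 
standalone sketch. This is not Geode's code: the class name, the method names 
and the plain map of "region -> region it is collocated with" are all 
hypothetical, chosen only to contrast the narrow view (leader plus the 
starting region) with the full collocation group. It assumes the collocation 
links contain no cycles.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Geode.
public class ColocationModelSketch {

  // Follow "collocated with" links upward until a region with no parent is found.
  static String leaderOf(Map<String, String> colocatedWith, String region) {
    String current = region;
    while (colocatedWith.containsKey(current)) {
      current = colocatedWith.get(current);
    }
    return current;
  }

  // Narrow view: only the leader and the region the calculation started from.
  static Set<String> leaderAndSelf(Map<String, String> colocatedWith, String region) {
    Set<String> result = new TreeSet<>();
    result.add(leaderOf(colocatedWith, region));
    result.add(region);
    return result;
  }

  // Full view: the leader plus every region directly or transitively collocated with it.
  static Set<String> fullGroup(Map<String, String> colocatedWith, String region) {
    Set<String> result = new TreeSet<>();
    Deque<String> pending = new ArrayDeque<>();
    pending.add(leaderOf(colocatedWith, region));
    while (!pending.isEmpty()) {
      String parent = pending.remove();
      if (!result.add(parent)) {
        continue; // already visited
      }
      for (Map.Entry<String, String> entry : colocatedWith.entrySet()) {
        if (entry.getValue().equals(parent)) {
          pending.add(entry.getKey());
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> colocatedWith = new HashMap<>();
    colocatedWith.put("A1", "A"); // A1 is collocated with leader A
    colocatedWith.put("A2", "A"); // A2 is collocated with leader A

    System.out.println(leaderAndSelf(colocatedWith, "A2")); // [A, A2]
    System.out.println(fullGroup(colocatedWith, "A2"));     // [A, A1, A2]
  }
}
```

Starting from A2, the narrow view omits A1, which matches the symptom in the 
commit message; walking up to the leader first and then collecting everything 
below it yields the same group regardless of which region triggers the 
calculation.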

> Rebalance Model Missing Collocated Regions At Server Startup
> ------------------------------------------------------------
>
>                 Key: GEODE-10409
>                 URL: https://issues.apache.org/jira/browse/GEODE-10409
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Weijie Xu
>            Assignee: Weijie Xu
>            Priority: Major
>              Labels: needsTriage, pull-request-available
>         Attachments: server2.log, test.tar.gz
>
>
> The following steps reproduce the issue:
> Run the start.gfsh in the attached example, which configures a Geode system 
> with a partitioned region, a gateway sender and a region collocated with the 
> partitioned region. So there are three regions in total: the leader region, 
> the collocated region and the queue region.
> Then run the example code, which loads ~400M of data and 5 times that amount 
> of events into the system.
> Then stop one of the servers and revoke the disk files of that server.
> Then start the server, which triggers a bucket recovery.
> From the attached log (line596, line598 and line5958), we can see that the 
> queue region is not included in the rebalance model, neither in the data size 
> column nor in the max size column.
> Then do a manual rebalance after the server is up; this time the log shows 
> the queue region is added to the model (line6010, line6012, line6014 and 
> line6028).
>  
> The inconsistent behavior leads to 2 negative results:
> 1) The rebalance result differs between the server startup phase and a 
> manual trigger: the startup rebalance reports that everything is OK and the 
> rebalance finished, but the manually triggered rebalance reports that there 
> is not enough space, because its model includes the queue region, which has 
> 5 times the data size of the leader region.
> 2) A mismatch between the rebalance model and the actual data being 
> rebalanced (the queue region data is in fact rebalanced even though the 
> region is not included in the model during the server startup phase).
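
The first negative result above can be made concrete with a rough sketch. 
Everything here is hypothetical: the class and method names are made up, the 
sizes are illustrative numbers in arbitrary units that only preserve the 
"queue is 5 times the leader" ratio from the report, and the capacity value is 
an assumption, not a figure from the attached logs. It shows only how the same 
space check can pass or fail depending on which regions the model includes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not how Geode computes its model.
public class ModelSizeSketch {

  // Sum the sizes of only those regions that the model includes.
  static long modeledSize(Map<String, Long> regionSizes, Set<String> included) {
    long total = 0;
    for (String name : included) {
      total += regionSizes.getOrDefault(name, 0L);
    }
    return total;
  }

  public static void main(String[] args) {
    Map<String, Long> sizes = new HashMap<>();
    sizes.put("leader", 400L);  // illustrative size, arbitrary units
    sizes.put("queue", 2000L);  // 5 times the leader, as in the report
    long capacity = 1000L;      // assumed capacity, not from the logs

    Set<String> startupModel = new HashSet<>(Arrays.asList("leader"));
    Set<String> manualModel = new HashSet<>(Arrays.asList("leader", "queue"));

    System.out.println(modeledSize(sizes, startupModel) <= capacity); // true
    System.out.println(modeledSize(sizes, manualModel) <= capacity);  // false
  }
}
```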



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
