[
https://issues.apache.org/jira/browse/GEODE-10409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17602982#comment-17602982
]
ASF subversion and git services commented on GEODE-10409:
---------------------------------------------------------
Commit 0852113f1b8086203ffdd99bae1afa250c2eaa3e in geode's branch
refs/heads/develop from WeijieEST
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=0852113f1b ]
GEODE-10409: Fix rebalance load model missing collocated regions at s… (#7839)
* GEODE-10409: Fix rebalance load model missing collocated regions at server
startup
Assume regions A1 and A2 are collocated with region A, and A is the leader
region. When a rebalance runs at server startup, it happens after collocation
of all three regions has completed, and this generally happens in region A2.
When the rebalance load model is calculated from the view of region A2, only
the leader region A and A2 itself are added to the model. This commit fixes
the issue so that A1 is also added to the model.
* add test cases to test rebalance model and remove the static mock
* change test case to avoid changing existing methods for testing
* improve test case
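
The collocation problem described in the commit message can be illustrated
with a minimal Java sketch. This is not the Geode implementation: the class
ColocationModelSketch, its methods, and the name-to-parent map are all
hypothetical and only model the idea. It assumes the collocation graph is
acyclic, and contrasts collecting "leader plus the current region" with
collecting every region collocated with the leader.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a collocation hierarchy; not Geode's API.
public class ColocationModelSketch {
    // Maps a region name to the name of the region it is collocated with.
    private final Map<String, String> colocatedWith = new LinkedHashMap<>();

    public void colocate(String child, String parent) {
        colocatedWith.put(child, parent);
    }

    // Walk up the hierarchy until a region with no parent is found.
    // Assumes the hierarchy has no cycles.
    public String leaderOf(String region) {
        String current = region;
        while (colocatedWith.containsKey(current)) {
            current = colocatedWith.get(current);
        }
        return current;
    }

    // Behavior described as the bug: only the leader and the region itself.
    public List<String> leaderAndSelf(String region) {
        List<String> result = new ArrayList<>();
        String leader = leaderOf(region);
        result.add(leader);
        if (!leader.equals(region)) {
            result.add(region);
        }
        return result;
    }

    // Behavior described as the fix: the leader plus every region
    // collocated with it, directly or transitively (breadth-first).
    public List<String> allColocated(String region) {
        List<String> result = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(leaderOf(region));
        while (!pending.isEmpty()) {
            String current = pending.poll();
            result.add(current);
            for (Map.Entry<String, String> entry : colocatedWith.entrySet()) {
                if (entry.getValue().equals(current)) {
                    pending.add(entry.getKey());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ColocationModelSketch sketch = new ColocationModelSketch();
        sketch.colocate("A1", "A");
        sketch.colocate("A2", "A");
        System.out.println(sketch.leaderAndSelf("A2")); // [A, A2]
        System.out.println(sketch.allColocated("A2"));  // [A, A1, A2]
    }
}
```

Seen from A2, the first method misses A1, while the second starts from the
leader and therefore reaches every collocated region regardless of which
region triggered the rebalance.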
> Rebalance Model Missing Collocated Regions At Server Startup
> ------------------------------------------------------------
>
> Key: GEODE-10409
> URL: https://issues.apache.org/jira/browse/GEODE-10409
> Project: Geode
> Issue Type: Bug
> Reporter: Weijie Xu
> Assignee: Weijie Xu
> Priority: Major
> Labels: needsTriage, pull-request-available
> Attachments: server2.log, test.tar.gz
>
>
> The following steps reproduce the issue:
> Run the start.gfsh in the attached example, which configures a Geode system
> with a partitioned region, a gateway sender, and a region collocated with the
> partitioned region. So there are three regions in total: the leader region,
> the collocated region, and the queue region.
> Then run the example code, which loads ~400M of data and five times that
> amount of events into the system.
> Then stop one of the servers and revoke that server's disk file.
> Then start the server, which triggers a bucket recovery.
> From the attached log (line596, line598 and line5958), we can see that the
> queue region is not included in the rebalance model, in either the data size
> column or the max size column.
> Then do a manual rebalance after the server is up; this time the log shows
> that the queue region is added to the model (line6010, line6012, line6014
> and line6028).
>
> The inconsistent behavior leads to two negative results:
> 1) The rebalance result differs between the server startup phase and a manual
> trigger: the startup rebalance reports that everything is OK and the
> rebalance finished, but the manually triggered rebalance reports insufficient
> space, because its model includes the queue region, which has five times the
> data size of the leader region.
> 2) A mismatch between the rebalance model and the actual data being
> rebalanced (the queue region data is in fact rebalanced even though the
> region is not included in the model during the server startup phase).
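
The first negative result can be made concrete with a small arithmetic
sketch in Java. Everything here is an assumption for illustration: the class
RebalanceModelSizeSketch, the reading of "~400M" as 400 MB, the 1024 MB
capacity, and the simple "model size <= capacity" check, which is not how
Geode decides whether space is sufficient. The collocated region is left out
for brevity.

```java
// Hypothetical arithmetic only; not Geode's rebalance logic.
public class RebalanceModelSizeSketch {
    // Sum the sizes of the regions that were added to the model.
    public static long modelSize(long... regionSizes) {
        long total = 0;
        for (long size : regionSizes) {
            total += size;
        }
        return total;
    }

    // Simplified capacity check assumed for this sketch.
    public static boolean fits(long modelSize, long capacity) {
        return modelSize <= capacity;
    }

    public static void main(String[] args) {
        long leaderMb = 400;          // assumed reading of "~400M"
        long queueMb = 5 * leaderMb;  // queue region is five times the leader
        long capacityMb = 1024;       // hypothetical capacity

        long startupModel = modelSize(leaderMb);          // queue region missing
        long manualModel = modelSize(leaderMb, queueMb);  // queue region included

        System.out.println("startup: " + startupModel + " MB, fits=" + fits(startupModel, capacityMb));
        System.out.println("manual:  " + manualModel + " MB, fits=" + fits(manualModel, capacityMb));
    }
}
```

Under these assumed numbers the startup model sees 400 MB and passes the
check, while the manual model sees 2400 MB and fails it, although the same
data is moved in both cases.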
--
This message was sent by Atlassian Jira
(v8.20.10#820010)