ZuebeyirEser commented on issue #2473:
URL: https://github.com/apache/fluss/issues/2473#issuecomment-3796867119
@swuferhong I’ve been looking into this and was able to reproduce the error
locally.
It appears to be a transient metadata state. When `rebalance()` is called
shortly after a table is created, the `CoordinatorContext` sometimes has the
leader elected, but the replica assignment list hasn't been fully updated yet.
This causes the optimizer to hit the "Leader replica is not in the bucket" check.
I wrote a small loop to trigger rebalance during concurrent table creation.
Without the change, it failed consistently within a few iterations. With the
fix, it successfully completed 50 iterations.
```java
@Test
void testRebalanceDuringConcurrentTableCreation() throws Exception {
    // Allow the guest principal to perform cluster-level WRITE operations.
    rootAdmin.createAcls(Collections.singletonList(
                    new AclBinding(
                            Resource.cluster(),
                            new AccessControlEntry(
                                    guestPrincipal,
                                    WILD_CARD_HOST,
                                    OperationType.WRITE,
                                    PermissionType.ALLOW))))
            .all().get();

    for (int i = 0; i < 20; i++) {
        TablePath transientTable =
                TablePath.of("test_db_1", "transient_rebalance_table_" + i);
        // Not awaited, so rebalance runs while the table's metadata is still settling.
        rootAdmin.createTable(transientTable, DATA1_TABLE_DESCRIPTOR_PK, false);

        // Verification: ensure rebalance doesn't fail due to transient metadata
        assertThatCode(() -> guestAdmin.rebalance(Collections.emptyList()).get())
                .doesNotThrowAnyException();

        rootAdmin.dropTable(transientTable, true).get();
    }
}
```
My proposed solution:
I’ve tested a change in `RebalanceManager` to gracefully skip buckets that are
in this transient state (where the leader is -1 or not yet in the assignment
list). This allows the rebalance to continue for all other stable tables. As
far as I can see, this is safe because the skipped bucket will simply be picked
up in the next rebalance cycle once its state is consistent.
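
To make the skip condition concrete, here is a minimal, self-contained sketch of the check described above. It is not the actual `RebalanceManager` code: the class `TransientBucketFilter`, the methods `isStable` and `stableBuckets`, the `NO_LEADER` constant, and the use of plain string keys for buckets are all hypothetical names chosen for illustration. The only assumption taken from the description is that a bucket is transient when its leader is -1 or is not yet in the assignment list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Fluss.
final class TransientBucketFilter {

    // Assumed sentinel for "no leader elected yet", per the description above.
    static final int NO_LEADER = -1;

    // A bucket is treated as stable only if it has a leader and that leader
    // already appears in the bucket's replica assignment.
    static boolean isStable(int leader, List<Integer> assignment) {
        return leader != NO_LEADER
                && assignment != null
                && assignment.contains(leader);
    }

    // Returns the keys of the buckets that pass the check; all others are
    // skipped and left for a later rebalance.
    static List<String> stableBuckets(
            Map<String, Integer> leaders, Map<String, List<Integer>> assignments) {
        List<String> stable = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : leaders.entrySet()) {
            if (isStable(entry.getValue(), assignments.get(entry.getKey()))) {
                stable.add(entry.getKey());
            }
        }
        return stable;
    }

    public static void main(String[] args) {
        Map<String, Integer> leaders = new LinkedHashMap<>();
        Map<String, List<Integer>> assignments = new LinkedHashMap<>();

        leaders.put("stable_table/0", 1);      // leader is in the assignment
        assignments.put("stable_table/0", List.of(1, 2, 3));

        leaders.put("new_table/0", NO_LEADER); // no leader elected yet
        assignments.put("new_table/0", List.of(1, 2, 3));

        leaders.put("new_table/1", 2);         // leader elected, assignment not updated
        assignments.put("new_table/1", List.of());

        System.out.println(stableBuckets(leaders, assignments));
    }
}
```

Filtering up front rather than throwing means one freshly created table cannot fail the rebalance for every other table.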
I'd be happy to submit a PR with this fix and the regression test if this
approach sounds reasonable to you.