[ 
https://issues.apache.org/jira/browse/IGNITE-18692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699693#comment-17699693
 ] 

Aleksandr Polovtcev commented on IGNITE-18692:
----------------------------------------------

There are several problems with this test:
1. Rebalance does not happen, because the disabled node takes too long to leave 
the Logical Topology. I fixed this by decreasing the Logical Topology removal 
timeout. Here's the branch with the fix: 
https://github.com/gridgain/apache-ignite-3/tree/ignite-18692
2. The shorter timeout fixes the problem described in this ticket, but the 
test still fails: the first read after the Rebalance times out. This happens 
for the following reason:
    * `PartitionReplicaListener` blocks every read until its `safeTime` object 
catches up with the provided read timestamp.
    * `PartitionListener` is responsible for updating the `safeTime` object 
(both listeners must share the same `safeTime` instance).
    * However, during the Rebalance procedure (see 
`TableManager#updateAssignmentInternal`), we always restart the ReplicaService 
(i.e. create a new `PartitionReplicaListener`), but we don't restart the Raft 
node (i.e. a new `PartitionListener` does not get created). This means that the 
new `PartitionReplicaListener` and the old `PartitionListener` use different 
`safeTime` objects, so the instance the new listener waits on is never 
advanced and the read times out.
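
The mismatch described in item 2 can be reproduced in isolation. The sketch 
below is not Ignite code: `SafeTime`, `WriteSideListener` and 
`ReadSideListener` are hypothetical stand-ins for the real `safeTime` tracker, 
`PartitionListener` and `PartitionReplicaListener`, and the timestamps and the 
100 ms timeout are made up for illustration. It only shows that a reader 
waiting on one tracker never sees updates applied to another.

```java
import java.util.concurrent.TimeUnit;

class SafeTimeMismatchSketch {
    /** Hypothetical stand-in for the safe-time tracker shared by both listeners. */
    static final class SafeTime {
        private long current;

        /** Advances the tracked time and wakes up any waiting readers. */
        synchronized void update(long ts) {
            current = Math.max(current, ts);
            notifyAll();
        }

        /** Waits until the tracked time reaches ts; returns false on timeout. */
        synchronized boolean waitFor(long ts, long timeoutMillis) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            try {
                while (current < ts) {
                    long remainingNanos = deadline - System.nanoTime();
                    if (remainingNanos <= 0) {
                        return false;
                    }
                    TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
                }
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Stand-in for the Raft-side listener: it only advances safe time. */
    static final class WriteSideListener {
        private final SafeTime safeTime;

        WriteSideListener(SafeTime safeTime) {
            this.safeTime = safeTime;
        }

        void onCommand(long ts) {
            safeTime.update(ts);
        }
    }

    /** Stand-in for the replica-side listener: it blocks reads on safe time. */
    static final class ReadSideListener {
        private final SafeTime safeTime;

        ReadSideListener(SafeTime safeTime) {
            this.safeTime = safeTime;
        }

        boolean read(long readTs, long timeoutMillis) {
            return safeTime.waitFor(readTs, timeoutMillis);
        }
    }

    /** Models a rebalance that re-creates only the read-side listener. */
    static boolean readAfterRebalance(boolean reuseSafeTime) {
        SafeTime original = new SafeTime();
        // The write side survives the rebalance and keeps the original tracker.
        WriteSideListener writeSide = new WriteSideListener(original);
        // The read side is re-created, either with the same tracker or a fresh one.
        ReadSideListener readSide =
                new ReadSideListener(reuseSafeTime ? original : new SafeTime());

        writeSide.onCommand(10);
        return readSide.read(10, 100);
    }

    public static void main(String[] args) {
        System.out.println("shared safeTime: read completed = " + readAfterRebalance(true));
        System.out.println("separate safeTime: read completed = " + readAfterRebalance(false));
    }
}
```

With a shared tracker the read returns immediately; with a fresh tracker it 
waits for the full timeout and fails, which matches the symptom seen in the 
test.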

> Rebalance test is failed
> ------------------------
>
>                 Key: IGNITE-18692
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18692
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Uttsel
>            Assignee: Aleksandr Polovtcev
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> org.apache.ignite.internal.rebalance.ItRebalanceTest#assignmentsChangingOnNodeLeaveNodeJoin
>  failed.
> The failure is caused by commits:
> db8f1e38 "IGNITE-18397 Rework Watches based on Raft Learners (#1490)"
> ff27d76d "IGNITE-18598 Fix compilation after merge (#1560)"
> I created a separate branch with this test: 
> [https://github.com/gridgain/apache-ignite-3/tree/ignite-18088_test], which 
> is based on ff27d76d "IGNITE-18598 Fix compilation after merge (#1560)"
>  
> {code:java}
> org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
>     at app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
>     at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40)
>     at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35)
>     at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:179)
>     at app//org.apache.ignite.internal.rebalance.ItRebalanceTest.assignmentsChangingOnNodeLeaveNodeJoin(ItRebalanceTest.java:132)
>  {code}


