[jira] [Assigned] (IGNITE-21565) ReplicasSafeTimePropagationTest#testSafeTimeReorderingOnLeaderReElection is flaky

Alexander Lapin (Jira) Wed, 13 Mar 2024 00:09:05 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Alexander Lapin reassigned IGNITE-21565:
----------------------------------------

    Assignee: Alexander Lapin

> ReplicasSafeTimePropagationTest#testSafeTimeReorderingOnLeaderReElection is 
> flaky
> ---------------------------------------------------------------------------------
>
>                 Key: IGNITE-21565
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21565
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexander Lapin
>            Assignee: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code:java}
> java.lang.AssertionError: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException
>   at org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
>   at org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
>   at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
>   at org.apache.ignite.distributed.ReplicasSafeTimePropagationTest.sendSafeTimeSyncCommand(ReplicasSafeTimePropagationTest.java:231)
>  {code}
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleTable/7867487?expandBuildDeploymentsSection=false&hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildTestsSection=true&expandBuildProblemsSection=true]
> h3. Upd #1
> SafeTimeReorderingException occurred because of a race between the 
> maxObservableSafeTime update on leader election and SafeTimeSyncCommand 
> processing within onBeforeApply. To fix that:
>  *  I've added a synchronous onBeforeLeaderStart callback that is now used to 
> update maxObservableSafeTime with clock.now() + CLOCK_SKEW instead of the 
> previously used asynchronous onLeaderStart. By asynchronous here I mean 
> asynchronous to onBeforeApply.
>  *  I've also added a maxObservableSafeTime update to Long.MAX_VALUE on each 
> leader stop, and the same value is used as the initial one. Long.MAX_VALUE is 
> greater than any possible safeTime propagated with commands, so we actually 
> block any SafeTimeSyncCommand processing (**even within onBeforeApply**) until 
> a leader election that includes proper maxObservableSafeTime initialization.
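
The two changes above can be illustrated with a minimal sketch. This is a simplified model and not the actual Ignite code: the class SafeTimeGuardSketch, the onLeaderStop method name, the long-based timestamps, the CLOCK_SKEW value and the exact comparison in onBeforeApply are all assumptions made for illustration. It assumes a command is rejected when its safe time is below maxObservableSafeTime.

```java
import java.util.function.LongSupplier;

/** Hypothetical, simplified model of the guard described above; not the Ignite implementation. */
class SafeTimeGuardSketch {
    /** Assumed clock skew bound, in the same units as the clock (illustrative value). */
    static final long CLOCK_SKEW = 500;

    /** Stand-in for the exception named in the issue. */
    static class SafeTimeReorderingException extends RuntimeException { }

    private final LongSupplier clock;

    /** Long.MAX_VALUE initially and after every leader stop, so every command is rejected. */
    private long maxObservableSafeTime = Long.MAX_VALUE;

    SafeTimeGuardSketch(LongSupplier clock) {
        this.clock = clock;
    }

    /** Runs synchronously with onBeforeApply (same monitor), so the two cannot race. */
    synchronized void onBeforeLeaderStart() {
        maxObservableSafeTime = clock.getAsLong() + CLOCK_SKEW;
    }

    /** Blocks command processing until the next leader election re-initializes the value. */
    synchronized void onLeaderStop() {
        maxObservableSafeTime = Long.MAX_VALUE;
    }

    /** Assumed check: accept only safe times that do not go backwards. */
    synchronized void onBeforeApply(long proposedSafeTime) {
        if (proposedSafeTime < maxObservableSafeTime) {
            throw new SafeTimeReorderingException();
        }
        maxObservableSafeTime = proposedSafeTime;
    }
}
```

In this model, a command arriving before onBeforeLeaderStart or after onLeaderStop always fails the check, which matches the blocking behaviour described in the second bullet.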



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
