[ 
https://issues.apache.org/jira/browse/IGNITE-28704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pavlov updated IGNITE-28704:
-----------------------------------
    Ignite Flags:   (was: Docs Required,Release Notes Required)

> CheckpointBufferDeadlockTest.testFourCheckpointThreads times out in Cache 7
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-28704
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28704
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ignite TC Bot
>            Priority: Major
>              Labels: flaky-test, teamcity, test-failure
>
> CheckpointBufferDeadlockTest.testFourCheckpointThreads is flaky in the TeamCity
> Cache 7 suite on master and often hits the framework default timeout of 300000 ms.
> Observed failures:
> - Suite: IgniteTests24Java8_Cache7 / IgniteCacheTestSuite7
> - Test: 
> org.apache.ignite.internal.processors.cache.persistence.db.CheckpointBufferDeadlockTest.testFourCheckpointThreads
> - Builds: 9075284, 9075427, 9075719, 9076052, 9077938
> - Failure: java.util.concurrent.TimeoutException: Test has been timed out 
> [test=testFourCheckpointThreads, timeout=300000]
> - History from TC bot context: base branch flaky, approx. 51.5% failure rate.
> Likely cause:
> testFourCheckpointThreads repeats runDeadlockScenario SF.applyLB(10, 3) 
> times. With the default TEST_SCALE_FACTOR=1.0 this is 10 full iterations. 
> Each iteration starts a persistent node, loads about 125 MB (500 MB / 4) of
> 900-byte entries, enables an artificially slow checkpoint, sleeps for 10
> seconds to trigger throttling, then forces checkpoint completion and 
> restarts/cleans persistence. The method can legitimately still be in the 
> artificial slow-checkpoint window when the 300s GridAbstractTest timeout 
> expires. Thread dumps show expected load-runner throttling / idle checkpoint 
> threads rather than a product assertion or deterministic deadlock.
> Proposed fix:
> Keep the regression coverage but cap the four-checkpoint-threads variant to 
> the lower-bound repeat count, so the test still exercises repeated 
> restarts/checkpoints while staying inside the suite timeout budget.
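
The timing argument in the description can be checked with a small sketch.
It assumes that SF.applyLB(val, lowerBound) scales val by the test scale
factor and never returns less than lowerBound; that behaviour is inferred
from the description, not copied from Ignite. The class and helper names
below are hypothetical, and only the 300000 ms timeout, the 10-second sleep
and the (10, 3) arguments come from the issue text.

```java
public class RepeatBudgetSketch {
    /** Hypothetical stand-in for SF.applyLB: scale val, but never go below lowerBound. */
    static int applyLB(int val, int lowerBound, double scaleFactor) {
        return Math.max((int) Math.round(val * scaleFactor), lowerBound);
    }

    /** Average wall-clock budget per iteration if all iterations must fit in timeoutMs. */
    static long perIterationBudgetMs(long timeoutMs, int iterations) {
        return timeoutMs / iterations;
    }

    public static void main(String[] args) {
        long timeoutMs = 300_000L; // timeout reported in the failure message
        long sleepMs = 10_000L;    // fixed throttling sleep in each iteration

        int current = applyLB(10, 3, 1.0); // repeat count with the default scale factor
        int capped = 3;                    // proposed cap: the lower-bound repeat count

        for (int iterations : new int[] {current, capped}) {
            long budget = perIterationBudgetMs(timeoutMs, iterations);
            // Time left for node start, data load, slow checkpoint and restart.
            long remaining = budget - sleepMs;
            System.out.println(iterations + " iterations: " + budget
                + " ms each, " + remaining + " ms besides the sleep");
        }
        // prints:
        // 10 iterations: 30000 ms each, 20000 ms besides the sleep
        // 3 iterations: 100000 ms each, 90000 ms besides the sleep
    }
}
```

Under these assumptions, 10 repeats leave an average of 20 seconds per
iteration for everything except the sleep, while 3 repeats leave 90 seconds.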



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
