[
https://issues.apache.org/jira/browse/IGNITE-28704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18082236#comment-18082236
]
Dmitry Pavlov commented on IGNITE-28704:
----------------------------------------
https://ci2.ignite.apache.org/test/-3817933280317769007?currentProjectId=IgniteTests24Java8&branch=%3Cdefault%3E
CheckpointBufferDeadlockTest.testFourCheckpointThreads
> CheckpointBufferDeadlockTest.testFourCheckpointThreads times out in Cache 7
> ---------------------------------------------------------------------------
>
> Key: IGNITE-28704
> URL: https://issues.apache.org/jira/browse/IGNITE-28704
> Project: Ignite
> Issue Type: Bug
> Reporter: Ignite TC Bot
> Assignee: Dmitry Pavlov
> Priority: Major
> Labels: MakeTeamcityGreenAgain, flaky-test, teamcity, test-failure
>
> CheckpointBufferDeadlockTest.testFourCheckpointThreads is flaky on TeamCity
> Cache 7 and often times out at the framework default 300000 ms on master.
> Observed failures:
> - Suite: IgniteTests24Java8_Cache7 / IgniteCacheTestSuite7
> - Test:
> org.apache.ignite.internal.processors.cache.persistence.db.CheckpointBufferDeadlockTest.testFourCheckpointThreads
> - Builds: 9075284, 9075427, 9075719, 9076052, 9077938
> - Failure: java.util.concurrent.TimeoutException: Test has been timed out
> [test=testFourCheckpointThreads, timeout=300000]
> - History (from TC bot context): flaky on the base branch, approx. 51.5% failure rate.
> Likely cause:
> testFourCheckpointThreads runs runDeadlockScenario SF.applyLB(10, 3) times,
> which is 10 full iterations with the default TEST_SCALE_FACTOR=1.0.
> Each iteration starts a persistent node, loads about 500 MB / 4 (roughly
> 125 MB) of 900-byte entries, enables an artificially slow checkpoint, sleeps
> for 10 seconds to trigger throttling, then forces checkpoint completion and
> restarts/cleans persistence. The method can legitimately still be in the
> artificial slow-checkpoint window when the 300s GridAbstractTest timeout
> expires. Thread dumps show expected load-runner throttling / idle checkpoint
> threads rather than a product assertion or deterministic deadlock.
> Proposed fix:
> Keep the regression coverage but cap the four-checkpoint-threads variant to
> the lower-bound repeat count, so the test still exercises repeated
> restarts/checkpoints while staying inside the suite timeout budget.
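
The timeout budget behind the proposed fix can be sketched roughly as follows. This is a standalone illustration, not Ignite code: the class RepeatBudget and both of its methods are hypothetical, and scaledWithLowerBound only assumes that SF.applyLB(val, lowerBound) scales val by the test scale factor and never returns less than lowerBound. The 300000 ms timeout, the 10-second sleep, and the 10 and 3 repeat counts are taken from the description above.

```java
/** Hypothetical helper for the arithmetic only; not part of the Ignite test framework. */
class RepeatBudget {
    /** Assumed semantics of a lower-bounded scale helper: scale the value, never go below the bound. */
    static int scaledWithLowerBound(int val, int lowerBound, double scaleFactor) {
        return Math.max((int) (val * scaleFactor), lowerBound);
    }

    /** Average wall-clock time available per iteration if all iterations must fit in the timeout. */
    static long perIterationBudgetMs(long timeoutMs, int iterations) {
        return timeoutMs / iterations;
    }

    public static void main(String[] args) {
        long timeoutMs = 300_000L;  // framework default timeout quoted in the failure message
        long sleepMs = 10_000L;     // fixed sleep per iteration used to trigger throttling

        int current = scaledWithLowerBound(10, 3, 1.0); // repeat count at the default scale factor
        int capped = 3;                                 // lower-bound repeat count from the proposed fix

        for (int repeats : new int[] {current, capped}) {
            System.out.println("repeats=" + repeats
                + ", budget per iteration ms=" + perIterationBudgetMs(timeoutMs, repeats)
                + ", fixed sleep total ms=" + repeats * sleepMs);
        }
    }
}
```

Under these assumptions, 10 repeats leave about 30 s per iteration, 10 s of which is the fixed sleep, for node start, data load, the slow checkpoint, and restart; 3 repeats leave about 100 s per iteration.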
--
This message was sent by Atlassian Jira
(v8.20.10#820010)