[
https://issues.apache.org/jira/browse/KAFKA-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17535843#comment-17535843
]
Chris Egerton commented on KAFKA-12657:
---------------------------------------
I've reviewed the test runs and managed to reproduce locally by invoking
{{./gradlew :connect:runtime:integrationTest}}. It appears that the issue
is environmental; every failure occurs because some condition isn't met in
time, and oftentimes, that condition is simply starting an embedded Connect
worker at the beginning of the test.
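For illustration, the failure mode described above can be sketched as a poll-with-timeout helper. This is a minimal sketch under stated assumptions, not Kafka's actual {{TestUtils.waitForCondition}}: the class {{WaitForConditionSketch}}, its method signature, and the simulated "worker" conditions are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class WaitForConditionSketch {

    // Hypothetical helper (not Kafka's TestUtils): polls `condition` every
    // `pollMs` milliseconds until it returns true or `timeoutMs` has elapsed.
    // Returns true if the condition was met in time, false otherwise.
    static boolean waitForCondition(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long startNs = System.nanoTime();
        long timeoutNs = timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - startNs >= timeoutNs) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report the condition as unmet.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated worker that reports "started" on the third poll.
        AtomicInteger polls = new AtomicInteger();
        boolean startedInTime = waitForCondition(() -> polls.incrementAndGet() >= 3, 5_000, 10);
        System.out.println("fast worker started in time: " + startedInTime);

        // Simulated worker on a starved machine: never ready within the timeout.
        boolean starvedInTime = waitForCondition(() -> false, 50, 10);
        System.out.println("starved worker started in time: " + starvedInTime);
    }
}
```

The check itself is the same in both cases; only the time the condition takes to become true differs. On an overloaded node, worker startup can take longer than the deadline, which is consistent with the "Condition not met within timeout 60000" failure quoted in the issue description.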
I was able to achieve a completely green run by adding {{-PmaxParallelForks=5}}
to my test run, which suggests that reducing the number of active test cases
(or even just running threads in general) would alleviate the issue.
Unfortunately, it looks like our Jenkinsfile [already sets that value even
lower|https://github.com/apache/kafka/blob/85cfa70f59162d3b7ae23c55bb3f3fe97e56ba80/Jenkinsfile#L40]
to just 2.
[~mjsax] [~cadonna] do you have any details about the CI architecture these
tests are run on? Are the Jenkins nodes that carry out these tests shared
concurrently by different Jenkins jobs/stages? How many cores are available on
them? And, as a sanity check, are we still using the project's {{Jenkinsfile}}
to run these tests (and set {{-PmaxParallelForks=2}})?
> Flaky Tests BlockingConnectorTest.testWorkerRestartWithBlockInConnectorStop
> ---------------------------------------------------------------------------
>
> Key: KAFKA-12657
> URL: https://issues.apache.org/jira/browse/KAFKA-12657
> Project: Kafka
> Issue Type: Test
> Components: KafkaConnect
> Reporter: Matthias J. Sax
> Priority: Critical
> Labels: flaky-test
>
> [https://github.com/apache/kafka/pull/10506/checks?check_run_id=2327377745]
> {quote} {{org.opentest4j.AssertionFailedError: Condition not met within
> timeout 60000. Worker did not complete startup in time ==> expected: <true>
> but was: <false>
> at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
> at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40)
> at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:193)
> at org.apache.kafka.test.TestUtils.lambda$waitForCondition$3(TestUtils.java:319)
> at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:367)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:316)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:300)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:290)
> at org.apache.kafka.connect.integration.BlockingConnectorTest.setup(BlockingConnectorTest.java:133)}}
> {quote}