[
https://issues.apache.org/jira/browse/KAFKA-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17563004#comment-17563004
]
Matthew de Detrich edited comment on KAFKA-14014 at 7/6/22 7:27 AM:
--------------------------------------------------------------------
[~cadonna] So I did some debugging on this ticket over the past week and I
found out some interesting things.
To start off, I managed to reproduce the failure predictably, and its
flakiness does appear to be related to load, i.e. the test is flakier the
fewer CPU resources it has. I am running the tests inside the Docker gradle
image and using the --cpus flag to limit resources.
Interestingly, I have gone up to 5 CPUs and it still flakes, albeit less
often. I tried increasing the various timeouts used in the test, but this had
no effect, so I am going to dig a bit further.
Note that 5 CPUs is already considered "high" (at least for a machine that
would reasonably run Kafka Streams). The machine I am running on has 12 "CPUs"
(6 cores, 12 threads), and when running with all of the machine's resources I
couldn't reproduce the flaky test.
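For anyone who wants to try the same kind of CPU-limited run, here is a rough
sketch of how it could be scripted. It is not the exact setup used above: the
image tag, the mount path, the Gradle task names and the helper functions are
all assumptions to adapt to your own checkout. The only Docker option it relies
on is the CPU limit flag of docker run.

```python
import subprocess

# Assumptions (not taken from the ticket): image tag, container path and
# Gradle task names are placeholders; adjust them to your environment.
IMAGE = "gradle:jdk17"
TEST_CLASS = "NamedTopologyIntegrationTest"


def docker_command(cpus, kafka_dir):
    """Build a `docker run` invocation that caps the container at `cpus` CPUs."""
    return [
        "docker", "run", "--rm",
        f"--cpus={cpus}",              # CPU limit for the container
        "-v", f"{kafka_dir}:/kafka",   # mount the Kafka checkout
        "-w", "/kafka",
        IMAGE,
        "./gradlew",
        ":streams:cleanTest",          # force the test task to run again
        ":streams:test",
        "--tests", TEST_CLASS,
    ]


def count_failures(cpus, kafka_dir, runs):
    """Run the test `runs` times and count the runs that exit non-zero."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(docker_command(cpus, kafka_dir))
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Hypothetical checkout location; compare failure counts per CPU limit.
    for cpus in (1, 2, 5):
        print(cpus, count_failures(cpus, "/path/to/kafka", runs=10))
```

Comparing the failure counts across CPU limits gives a crude picture of how
strongly the flakiness depends on available CPU.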
was (Author: mdedetrich-aiven):
[~cadonna] So I did some debugging on this ticket over the past week and I
found out some interesting things.
To start off, the test is flakier the fewer CPU resources it has. I am
running the tests inside the Docker gradle image and using the --cpus flag to
limit resources.
Interestingly, I have gone up to 5 CPUs and it still flakes, albeit less
often. I tried increasing the various timeouts used in the test, but this had
no effect, so I am going to dig a bit further.
Note that 5 CPUs is already considered "high" (at least for a machine that
would reasonably run Kafka Streams). The machine I am running on has 12 "CPUs"
(6 cores, 12 threads), and when running with all of the machine's resources I
couldn't reproduce the flaky test.
> Flaky test
> NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets()
> ----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-14014
> URL: https://issues.apache.org/jira/browse/KAFKA-14014
> Project: Kafka
> Issue Type: Test
> Components: streams
> Reporter: Bruno Cadonna
> Priority: Critical
>
> {code:java}
> java.lang.AssertionError:
> Expected: <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 2)]>
> but: was <[KeyValue(B, 1), KeyValue(A, 2), KeyValue(C, 1)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
> at org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets(NamedTopologyIntegrationTest.java:540)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.lang.Thread.run(Thread.java:833)
> {code}
> https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/
> https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-12310/2/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_17_and_Scala_2_13___shouldAllowRemovingAndAddingNamedTopologyToRunningApplicationWithMultipleNodesAndResetsOffsets/