[
https://issues.apache.org/jira/browse/KAFKA-16701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868797#comment-17868797
]
Greg Harris commented on KAFKA-16701:
-------------------------------------
[~bboyleonp] Thanks for looking into this more, and for identifying that some
sort of inter-test interaction is happening.
> and causes the following error that is not seen under JDK 11.
I see that exception under JDK 11. It looks to be intentionally part of
processDisconnectedException. I wouldn't worry about this; it seems unrelated.
I also believe tests within a single class will run serially with our
parallelism settings. If you're seeing evidence otherwise, please share.
I explored a bit further, and managed to sometimes reproduce the timeout with
just two tests running serially: closingChannelSendFailure then
closingChannelWithBufferedReceivesFailedSend, with all other tests Disabled. It
looks like the full suite almost always fails, and tests on their own almost
always pass, while running just two tests sometimes passes and sometimes fails.
I wasn't able to get a single-test failure with the timeout, or a deterministic
failure.
While these are JUnit tests, they aren't fully isolated from one another;
opening and closing sockets will inherently change the state of the process,
and so it's always possible that sockets from one test are being picked up by
another. I think we need to keep investigating.
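To make the isolation concern concrete, here is a minimal standalone sketch
(not taken from SocketServerTest; the class and method names are hypothetical)
showing that a port released by one piece of code can be bound again right
away by the next one in the same JVM. It only illustrates that port state is
shared at the process/OS level. It is an assumption that this kind of reuse is
related to the timeouts, not a diagnosis.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; none of these names exist in Kafka.
class PortReuseSketch {

    // Binds to an OS-chosen ephemeral port on loopback, closes the socket,
    // and returns the port number that was in use.
    static int bindAndRelease() {
        try (ServerSocket first = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            return first.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Binds a fresh socket to the given port and returns the port actually bound.
    // Throws if the port is not available (for example, taken by another process).
    static int rebind(int port) {
        try (ServerSocket second = new ServerSocket()) {
            second.setReuseAddress(true);
            second.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return second.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int released = bindAndRelease(); // stands in for the "earlier test"
        int reused = rebind(released);   // stands in for the "later test"
        System.out.println("released=" + released + " reused=" + reused);
    }
}
```

If a later test ends up on a port that an earlier test's client or server was
still associated with, the two could in principle interact. Whether that is
what happens here is still an open question.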
> Some SocketServerTest buffered close tests flaky failing locally
> ----------------------------------------------------------------
>
> Key: KAFKA-16701
> URL: https://issues.apache.org/jira/browse/KAFKA-16701
> Project: Kafka
> Issue Type: Test
> Components: core, unit tests
> Affects Versions: 3.5.0, 3.6.0, 3.7.0
> Reporter: Greg Harris
> Assignee: bboyleonp
> Priority: Major
> Labels: flaky-test
>
> These tests are failing for me on a local development environment, but don't
> appear to be flaky or failing in CI. They only appear to fail for JDK >= 17.
> I'm using an M1 Mac, so it is possible that either the Mac's linear port
> allocation, or a native implementation is impacting this.
> closingChannelSendFailure()
>
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.makeChannelWithBufferedRequestsAndCloseRemote(SocketServerTest.scala:690)
> at kafka.network.SocketServerTest.$anonfun$verifySendFailureAfterRemoteClose$1(SocketServerTest.scala:1434)
> at kafka.network.SocketServerTest.verifySendFailureAfterRemoteClose(SocketServerTest.scala:1430)
> at kafka.network.SocketServerTest.closingChannelSendFailure(SocketServerTest.scala:1425)
> {noformat}
> closingChannelWithBufferedReceivesFailedSend()
>
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
> at kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
> at kafka.network.SocketServerTest.closingChannelWithBufferedReceivesFailedSend(SocketServerTest.scala:1520)
> {noformat}
> closingChannelWithCompleteAndIncompleteBufferedReceives()
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
> at kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
> at kafka.network.SocketServerTest.closingChannelWithCompleteAndIncompleteBufferedReceives(SocketServerTest.scala:1511)
> {noformat}
> remoteCloseWithBufferedReceives()
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
> at kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
> at kafka.network.SocketServerTest.remoteCloseWithBufferedReceives(SocketServerTest.scala:1453)
> {noformat}