[
https://issues.apache.org/jira/browse/KAFKA-16701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869149#comment-17869149
]
bboyleonp commented on KAFKA-16701:
-----------------------------------
[~gharris1727] I agree that we should keep investigating this issue.
It seems the inter-test interaction that caused `closingChannelSendFailure` to
execute in the middle of `processDisconnectedException` *is not reproducible*
in my environment anymore, which aligns with our earlier thinking. I am going
to set aside that earlier finding and look for some other potential flaw.
> It looks like the full suite almost always fails, and tests on their own
> almost always pass
I see the same thing. The problem does not seem to lie in the individual test
cases, but in the execution of the whole suite. I am going to investigate in
this direction to see if I can find any clues.
> Opening and closing sockets will inherently change the state of the process,
> and so its always possible that sockets from one test are being picked up by
> another.
That is a great point. I printed the PID and TID in
`processDisconnectedException` and `closingChannelSendFailure` for testing
purposes, and both tests run with the same PID and TID. There's a chance that
the failure is caused by resource reuse.
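A minimal sketch of this kind of check, for anyone who wants to repeat it.
`TestContext` and `describe` are hypothetical names for illustration and are
not part of SocketServerTest; the JDK calls themselves (`ProcessHandle`,
`Thread.currentThread()`) can be called the same way from the Scala test.

```java
// Hypothetical helper, not part of SocketServerTest.
final class TestContext {

    // Returns a line such as "<testName> pid=<pid> tid=<tid> thread=<name>"
    // describing the process and thread that is currently executing.
    static String describe(String testName) {
        long pid = ProcessHandle.current().pid(); // available since Java 9
        Thread thread = Thread.currentThread();
        return String.format("%s pid=%d tid=%d thread=%s",
                testName, pid, thread.getId(), thread.getName());
    }

    public static void main(String[] args) {
        // Assumption: in the real suite, one call would sit at the start of each test method.
        System.out.println(describe("processDisconnectedException"));
        System.out.println(describe("closingChannelSendFailure"));
    }
}
```

Identical PID and TID values only show that the two tests share a JVM and a
thread. They do not prove on their own that a socket or port is being reused.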
I will update if I have any new findings.
> Some SocketServerTest buffered close tests flaky failing locally
> ----------------------------------------------------------------
>
> Key: KAFKA-16701
> URL: https://issues.apache.org/jira/browse/KAFKA-16701
> Project: Kafka
> Issue Type: Test
> Components: core, unit tests
> Affects Versions: 3.5.0, 3.6.0, 3.7.0
> Reporter: Greg Harris
> Assignee: bboyleonp
> Priority: Major
> Labels: flaky-test
>
> These tests are failing for me on a local development environment, but don't
> appear to be flaky or failing in CI. They only appear to fail for JDK >= 17.
> I'm using an M1 Mac, so it is possible that either the Mac's linear port
> allocation, or a native implementation is impacting this.
> closingChannelSendFailure()
>
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.makeChannelWithBufferedRequestsAndCloseRemote(SocketServerTest.scala:690)
> at kafka.network.SocketServerTest.$anonfun$verifySendFailureAfterRemoteClose$1(SocketServerTest.scala:1434)
> at kafka.network.SocketServerTest.verifySendFailureAfterRemoteClose(SocketServerTest.scala:1430)
> at kafka.network.SocketServerTest.closingChannelSendFailure(SocketServerTest.scala:1425){noformat}
> closingChannelWithBufferedReceivesFailedSend()
>
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
> at kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
> at kafka.network.SocketServerTest.closingChannelWithBufferedReceivesFailedSend(SocketServerTest.scala:1520){noformat}
> closingChannelWithCompleteAndIncompleteBufferedReceives()
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
> at kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
> at kafka.network.SocketServerTest.closingChannelWithCompleteAndIncompleteBufferedReceives(SocketServerTest.scala:1511)
> {noformat}
> remoteCloseWithBufferedReceives()
> {noformat}
> java.lang.AssertionError: receiveRequest timed out
> at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
> at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
> at kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
> at kafka.network.SocketServerTest.remoteCloseWithBufferedReceives(SocketServerTest.scala:1453){noformat}