[
https://issues.apache.org/jira/browse/THRIFT-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876057#comment-17876057
]
Yuxuan Wang commented on THRIFT-5814:
-------------------------------------
So far I have tried a few approaches; none of them actually fixes the flakiness:
* Replace the TCP connection with a Unix domain socket
* After the client establishes the connection and sleeps for a short period of
time, do a connectivity check to make sure the connection is still good, and if
not, retry establishing the client connection
* Disable TCP keep-alive
* Change TCP keep-alive to a much shorter interval
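
For illustration only, the connectivity-check and keep-alive attempts could look
roughly like the sketch below. It uses only the standard net package, not the
Thrift test code; connAlive, dialIdle and checkConnAliveDemo are hypothetical
names, and the sleep and deadline values are arbitrary assumptions. It shows the
kind of check that was tried, not a fix for the flakiness.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// connAlive (hypothetical helper) reports whether an idle connection still
// looks open. It assumes the peer never sends data: a read that times out
// means "still connected", while EOF or any other error means "broken".
func connAlive(conn net.Conn, wait time.Duration) bool {
	if err := conn.SetReadDeadline(time.Now().Add(wait)); err != nil {
		return false
	}
	defer conn.SetReadDeadline(time.Time{}) // clear the deadline again
	var buf [1]byte
	_, err := conn.Read(buf[:])
	return errors.Is(err, os.ErrDeadlineExceeded)
}

// dialIdle (hypothetical helper) dials with TCP keep-alive disabled, sleeps
// for settle, then re-dials if the connection no longer looks alive.
func dialIdle(addr string, settle time.Duration, attempts int) (net.Conn, error) {
	d := net.Dialer{KeepAlive: -1} // a negative value disables keep-alive probes
	for i := 0; i < attempts; i++ {
		conn, err := d.Dial("tcp", addr)
		if err != nil {
			return nil, err
		}
		time.Sleep(settle)
		if connAlive(conn, 10*time.Millisecond) {
			return conn, nil
		}
		conn.Close()
	}
	return nil, fmt.Errorf("no healthy connection after %d attempts", attempts)
}

// checkConnAliveDemo exercises the helpers against a throwaway listener.
func checkConnAliveDemo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()
	accepted := make(chan net.Conn, 1)
	go func() {
		c, err := ln.Accept()
		if err != nil {
			close(accepted)
			return
		}
		accepted <- c
	}()
	conn, err := dialIdle(ln.Addr().String(), 20*time.Millisecond, 3)
	if err != nil {
		return err
	}
	defer conn.Close()
	srv, ok := <-accepted
	if !ok {
		return errors.New("accept failed")
	}
	srv.Close() // the server hangs up, so the client should now see EOF
	if connAlive(conn, 50*time.Millisecond) {
		return errors.New("closed connection was reported as alive")
	}
	return nil
}

func main() {
	if err := checkConnAliveDemo(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Note that a check like this only proves the connection was healthy at the moment
of the check; it cannot stop the connection from breaking afterwards.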
> go: Flaky test TestNoHangDuringStopFromClientNoDataSendDuringAcceptLoop
> -----------------------------------------------------------------------
>
> Key: THRIFT-5814
> URL: https://issues.apache.org/jira/browse/THRIFT-5814
> Project: Thrift
> Issue Type: Task
> Components: Go - Library
> Affects Versions: 0.20.0
> Reporter: Yuxuan Wang
> Priority: Minor
>
> Currently the
> [TestNoHangDuringStopFromClientNoDataSendDuringAcceptLoop|https://github.com/apache/thrift/blob/cb9ceada554f47aa5ebbedfe3984de0983cf0226/lib/go/thrift/simple_server_test.go#L164]
> test in the Go library is flaky (it fails roughly 1 run in 100).
> What this test does is roughly:
> # Create a local server listening on a random local port (via localhost:0)
> # Create a TCP client that connects to the server (via net.Dial) but does
> nothing after establishing the connection (so from the server's PoV this is
> an idle client)
> # Try to shut down the server
> # Verify that shutting down the server took at least the configured
> timeout, i.e. that the server waited that long before forcefully closing idle
> client connections
> Step 4 can occasionally (rarely) fail because the server shuts down much faster
> than expected. I did some digging, and the reason seems to be that the
> client-server TCP connection is broken after being established (killed by the
> OS or something?).
> So to fix the flakiness of this test, we need to find a way to keep the
> connection alive until the server kills it.
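
As a rough illustration of the four steps in the description above, here is a
minimal sketch built on the standard net package only. idleServer is a
hypothetical stand-in for the Thrift server, not its real implementation, and
the 100ms timeout is an arbitrary assumption. Like the real test, the check
would fail if the client connection broke before the timeout, because Stop
would then return early.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"time"
)

// idleServer is a hypothetical stand-in for the server under test: it accepts
// connections and, on Stop, waits up to a timeout for clients to leave before
// forcefully closing whatever is still connected.
type idleServer struct {
	ln       net.Listener
	mu       sync.Mutex
	conns    []net.Conn
	wg       sync.WaitGroup
	accepted chan struct{} // signalled once per accepted connection
}

func newIdleServer() (*idleServer, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // step 1: random local port
	if err != nil {
		return nil, err
	}
	s := &idleServer{ln: ln, accepted: make(chan struct{}, 16)}
	go s.acceptLoop()
	return s, nil
}

func (s *idleServer) acceptLoop() {
	for {
		c, err := s.ln.Accept()
		if err != nil {
			return // listener closed
		}
		s.mu.Lock()
		s.conns = append(s.conns, c)
		s.mu.Unlock()
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			var buf [1]byte
			c.Read(buf[:]) // blocks until data, client hang-up, or forced close
			c.Close()
		}()
		select {
		case s.accepted <- struct{}{}:
		default:
		}
	}
}

// Stop closes the listener, then waits for the handlers. If they are still
// running after timeout, it force-closes the remaining connections.
func (s *idleServer) Stop(timeout time.Duration) {
	s.ln.Close()
	done := make(chan struct{})
	go func() { s.wg.Wait(); close(done) }()
	select {
	case <-done: // every client left on its own (the fast path)
	case <-time.After(timeout):
		s.mu.Lock()
		for _, c := range s.conns {
			c.Close()
		}
		s.mu.Unlock()
		<-done
	}
}

// checkIdleClientShutdown runs steps 1-4 and reports an error if shutdown
// finished before the timeout elapsed.
func checkIdleClientShutdown() error {
	const timeout = 100 * time.Millisecond
	s, err := newIdleServer()
	if err != nil {
		return err
	}
	conn, err := net.Dial("tcp", s.ln.Addr().String()) // step 2: idle client
	if err != nil {
		s.Stop(0)
		return err
	}
	defer conn.Close()
	<-s.accepted // make sure the server has seen the client before stopping
	start := time.Now()
	s.Stop(timeout) // step 3
	if elapsed := time.Since(start); elapsed < timeout { // step 4
		return fmt.Errorf("server stopped after %v, want at least %v", elapsed, timeout)
	}
	return nil
}

func main() {
	if err := checkIdleClientShutdown(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok: shutdown waited for the full timeout")
}
```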
--
This message was sent by Atlassian Jira
(v8.20.10#820010)