Thanks for suggesting this course of action, Michael!

This flushed out two additional bugs in our stack, which I believe
are probably at the root of some of the rare intermittent failures
we have been observing in the Throwing* tests:

  - SocketTube: I found an issue where the scheduler might not
       be restarted if resuming or pausing an event from within
       the scheduler loop (which runs in the selector manager
       thread) failed because the socket had been asynchronously
       closed by another thread.
       That could cause some tests to fail with a timeout.

  - Http2Connection/Stream: there was an issue where DataFrames
       could be sent after a ResetFrame was sent. That caused the
       server to close down the connection. The next test would
       start opening a new stream on the same connection while
       the server was concurrently closing it, and the test
       would eventually fail - sometimes with a message saying
       "EOF reached while reading".

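To illustrate the invariant behind the second item, here is a minimal
sketch. It is not the code in the webrev: ResetGuardedStream, sendData,
sendReset and the wire list are hypothetical names invented for this
example, and the real Http2Connection/Stream classes are structured
differently. It only shows one way to guarantee that no data frame is
written for a stream once a reset has gone out:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an HTTP/2 stream; not the JDK's internal class. */
class ResetGuardedStream {
    private final Object sendLock = new Object();
    private boolean resetSent;                    // guarded by sendLock
    final List<String> wire = new ArrayList<>();  // frames "written", guarded by sendLock

    /** Writes a DATA frame unless a reset was already sent; returns whether it was written. */
    boolean sendData(String payload) {
        synchronized (sendLock) {
            if (resetSent) {
                return false;                     // drop: the stream has been reset
            }
            wire.add("DATA:" + payload);
            return true;
        }
    }

    /** Writes RST_STREAM at most once; returns whether this call wrote it. */
    boolean sendReset() {
        synchronized (sendLock) {
            if (resetSent) {
                return false;
            }
            resetSent = true;
            wire.add("RST_STREAM");
            return true;
        }
    }
}
```

Checking the flag and writing the frame under the same lock is what
closes the window: a thread that loses the race to sendReset can no
longer slip a data frame in after the reset.
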
The following webrev includes these two additional fixes, and I now
have very stable test runs. I wonder if I should try to extract those
two fixes though - as it might be worthwhile to backport
them independently:

http://cr.openjdk.java.net/~dfuchs/webrev_8245462/webrev.01/index.html

best regards,

-- daniel

On 28/07/2020 15:19, Daniel Fuchs wrote:
On 28/07/2020 15:04, Michael McMahon wrote:
The code is technically racy on the GET test, but it's often the case
that when you want something to be racy, it turns out not to be in
practice, 99 times out of 100 anyway (figures made up). I was thinking
you could put a random sleep on the client side before calling cancel
(say between 1ms and the SERVER_LATENCY constant). Print out the random
value too in case it finds a problem.

Oh - that's a good point. Let me try that.
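
A minimal sketch of that suggestion, under stated assumptions:
CancelDelay, randomDelay and delayThenCancel are hypothetical names,
and the SERVER_LATENCY value below is a placeholder rather than the
constant actually defined in the test. The idea is simply to pick a
random delay, print it so a failing run can be reproduced, sleep, and
only then cancel:

```java
import java.util.concurrent.ThreadLocalRandom;

class CancelDelay {
    // Placeholder value for illustration; the real test defines its own constant.
    static final long SERVER_LATENCY = 75;

    /** Picks a delay in the range [1, maxMillis] milliseconds. */
    static long randomDelay(long maxMillis) {
        // nextLong's upper bound is exclusive, hence the + 1
        return ThreadLocalRandom.current().nextLong(1, maxMillis + 1);
    }

    /** Sleeps for a random delay, printing it first, then runs the cancel action. */
    static long delayThenCancel(Runnable cancel) {
        long delay = randomDelay(SERVER_LATENCY);
        System.out.println("Delaying cancel by " + delay + " ms");
        try {
            Thread.sleep(delay);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt status
        }
        cancel.run();
        return delay;
    }
}
```

In a test, the Runnable would wrap whatever cancellation call is being
exercised, for instance a lambda calling cancel(true) on the future the
test is holding.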

best regards,

-- daniel

