On 11/11/2020 22:37, Rémy Maucherat wrote:
> On Wed, Nov 11, 2020 at 9:44 PM <ma...@apache.org> wrote:
> 
>> This is an automated email from the ASF dual-hosted git repository.
>>
>> markt pushed a commit to branch master
>> in repository https://gitbox.apache.org/repos/asf/tomcat.git
>>
>>
>> The following commit(s) were added to refs/heads/master by this push:
>>      new 45aeed6  Fix NIO concurrency issue that removes connections from
>> the poller.
>> 45aeed6 is described below
>>
>> commit 45aeed655771308d5185d9dbab8e29a73d87509b
>> Author: Mark Thomas <ma...@apache.org>
>> AuthorDate: Wed Nov 11 20:43:04 2020 +0000
>>
>>     Fix NIO concurrency issue that removes connections from the poller.
>>
>>     This is the source of the intermittent WebSocket test failure so this
>>     commit also removes the associated debug code for that issue.
>>
> 
> Great fix. I never expected this one ...

Thanks. It took me long enough to find it.

It only occurred in roughly 1 in 15 test runs. When a full test run takes
~8.5 mins it is a slow process. I was trying to narrow down the set of
tests that triggered it but it was hard to determine if the failure was
still being triggered. After about a day of getting nowhere I decided to
start from the other end and ran the single test in a loop until it
failed. That meant I could reproduce the failure in less than a minute.
Things moved a lot faster from that point.
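
As a rough illustration of the "run the single test in a loop until it
fails" approach described above, here is a minimal Java sketch. It is not
how the Tomcat test suite was actually driven. The class name
RepeatUntilFailure, the runUntilFailure helper and the random stand-in
test are all hypothetical, and a real setup would invoke the actual test
(for example through the build tool) instead of a lambda.

```java
import java.util.Random;
import java.util.concurrent.Callable;

public class RepeatUntilFailure {

    /**
     * Runs the given test repeatedly. Returns the 1-based attempt on which
     * it first failed, or -1 if it passed maxAttempts times in a row.
     */
    public static int runUntilFailure(Callable<Boolean> test, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            boolean passed;
            try {
                passed = test.call();
            } catch (Exception e) {
                // Treat any exception thrown by the test as a failure.
                passed = false;
            }
            if (!passed) {
                return attempt;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for an intermittent test that fails on
        // about 1 in 15 runs.
        Random random = new Random();
        int failedOn = runUntilFailure(() -> random.nextInt(15) != 0, 1000);
        System.out.println(failedOn < 0
                ? "No failure seen"
                : "Failed on attempt " + failedOn);
    }
}
```

The point of looping on one short test rather than the full suite is that
each attempt costs seconds instead of minutes, so a rare failure shows up
quickly enough to iterate on debug statements.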

Once I could reproduce the issue, it was just a case of adding debug
statements to track down the root cause. Some of those statements
altered the timing enough to prevent the failure but even that helped as
it meant the issue was occurring after that point.

The root cause surprised me as well. I'd suspected some sort of issue
along these lines and had been looking at the source code during the
longer test runs. Knowing NioChannel instances were being re-used I'd
explicitly looked for places where this sort of mix-up could happen and
completely failed to find this one.
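
To make the general class of problem concrete, the following Java sketch
shows how a late event that still holds a reference to a recycled object
can act on the wrong connection. It is a simplified, single-threaded
illustration only. It is not Tomcat's code and does not reproduce the
actual bug or the fix in the commit. Channel, CloseEvent, the owner field
and the owner check are all hypothetical names and choices made for this
example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RecycledChannelSketch {

    /** Hypothetical pooled object; not Tomcat's NioChannel. */
    static final class Channel {
        String owner;       // connection currently using this instance
        boolean registered; // whether it is registered for polling
    }

    /** Hypothetical event that deregisters a channel when processed. */
    static final class CloseEvent {
        final Channel channel;
        final String expectedOwner;

        CloseEvent(Channel channel) {
            this.channel = channel;
            this.expectedOwner = channel.owner;
        }

        /** Acts on whatever connection currently owns the instance. */
        void processUnchecked() {
            channel.registered = false;
        }

        /** Skips the event if the instance has been handed to someone else. */
        void processChecked() {
            if (expectedOwner.equals(channel.owner)) {
                channel.registered = false;
            }
        }
    }

    /**
     * Returns whether connection B is still registered after a late event
     * created for connection A is processed.
     */
    public static boolean simulate(boolean checkOwner) {
        Deque<Channel> pool = new ArrayDeque<>();

        // Connection A uses the channel and an event is created for it.
        Channel channel = new Channel();
        channel.owner = "A";
        channel.registered = true;
        CloseEvent lateEvent = new CloseEvent(channel);

        // A closes and the instance is recycled before the event runs.
        channel.owner = null;
        channel.registered = false;
        pool.push(channel);

        // Connection B picks up the same instance from the pool.
        Channel reused = pool.pop();
        reused.owner = "B";
        reused.registered = true;

        // The late event for A is finally processed.
        if (checkOwner) {
            lateEvent.processChecked();
        } else {
            lateEvent.processUnchecked();
        }
        return reused.registered;
    }

    public static void main(String[] args) {
        System.out.println("B registered, unchecked: " + simulate(false));
        System.out.println("B registered, checked:   " + simulate(true));
    }
}
```

In the unchecked case connection B silently loses its registration, which
is the kind of mix-up that is easy to overlook when reading the code
because every individual step looks correct in isolation.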

It will be interesting to see if any other intermittent issues disappear
suggesting they had the same root cause.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
