[
https://issues.apache.org/jira/browse/THRIFT-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089493#comment-17089493
]
Mario Emmenlauer edited comment on THRIFT-4282 at 4/22/20, 9:40 AM:
--------------------------------------------------------------------
I agree that StressTestNonBlocking is unstable on Windows. I think it would be
better to disable it for all users on Windows, and I've submitted a PR that
does so.
Also, looking at the test results, my impression is that the implementation of
StressTestNonBlocking itself may be flaky. On average, one out of two runs of
StressTestNonBlocking dies on MSVC with errors such as:
{code}
[...]
433: Thrift: Mon Apr 20 10:42:23 2020 TNonblockingServer: IO thread #0 entering loop...
433: Launch 20 client threads
433: workers :4, client : 20, loops : 1000, rate : 132.213
433: echoVoid => 20000
433: done.
433: Thrift: Mon Apr 20 10:44:55 2020 TNonblocking: notifyHandler read() failed: : errno = 10093
433: Thrift: Mon Apr 20 10:
23/26 Test #433: StressTestNonBlocking ...................... Passed 152.30 sec
{code}
The "curious" lines are
{code}
433: Thrift: Mon Apr 20 10:44:55 2020 TNonblocking: notifyHandler read() failed: : errno = 10093
433: Thrift: Mon Apr 20 10:
{code}
My impression is that there is a crash or other failure, possibly in a separate
thread, that is not propagated to the main application: the test is still
marked as "Passed"! In roughly one out of five runs, however, the test fails
harder and is terminated in an error state.
So, while disabling the flaky test seems reasonable, it may be good to check
whether the error handling for multi-threading is correct and complete.
Something may be fishy there. Better error handling may help to pinpoint the
underlying problem.
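For context, errno 10093 on Windows is the Winsock code WSANOTINITIALISED,
which would be consistent with a thread still reading from a socket after
Winsock has been cleaned up, although that is only a guess from the log. To
illustrate the kind of error propagation meant above, here is a minimal sketch
in standard C++. It is not taken from Thrift or from the actual test:
`ErrorCollector` and `runWorkers` are hypothetical names, and the sketch assumes
that worker failures surface as exceptions, which may not match how
TNonblockingServer reports them.

```cpp
#include <exception>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not part of Thrift): remembers the first exception
// thrown on any worker thread so that the main thread can report it.
class ErrorCollector {
public:
  void capture(std::exception_ptr e) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!first_) {
      first_ = e;
    }
  }

  // Rethrows the stored exception on the calling thread, if there is one.
  void rethrowIfAny() const {
    std::exception_ptr e;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      e = first_;
    }
    if (e) {
      std::rethrow_exception(e);
    }
  }

private:
  mutable std::mutex mutex_;
  std::exception_ptr first_;
};

// Hypothetical test driver: runs every task on its own thread, joins them all,
// and returns 0 on success or 1 if any task threw. On failure, `message`
// receives the text of the first captured exception.
int runWorkers(const std::vector<std::function<void()>>& tasks,
               std::string& message) {
  ErrorCollector errors;
  std::vector<std::thread> threads;
  for (const auto& task : tasks) {
    threads.emplace_back([&errors, &task] {
      try {
        task();
      } catch (...) {
        // Without this, an escaping exception would call std::terminate;
        // an error that is merely logged would be lost entirely.
        errors.capture(std::current_exception());
      }
    });
  }
  for (auto& t : threads) {
    t.join();
  }
  try {
    errors.rethrowIfAny();
  } catch (const std::exception& e) {
    message = e.what();
    return 1;
  } catch (...) {
    message = "unknown error";
    return 1;
  }
  return 0;
}
```

A test's main() could return the value of `runWorkers` as its exit code, so
that a failure in any worker thread turns the CTest result into "Failed"
instead of "Passed" with a stray log line.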
was (Author: emmenlau):
I agree that StressTestNonBlocking is unstable on Windows. Maybe it should be
disabled in CMakeLists for MSVC? If it's only disabled for the Apache Thrift
AppVeyor build, it may confuse users.
Also, when looking at the test results, I'm under the impression that the
implementation of StressTestNonBlocking is flaky. On average one out of two
runs of StressTestNonBlocking dies on MSVC with errors such as:
{code}
[...]
433: Thrift: Mon Apr 20 10:42:23 2020 TNonblockingServer: IO thread #0 entering loop...
433: Launch 20 client threads
433: workers :4, client : 20, loops : 1000, rate : 132.213
433: echoVoid => 20000
433: done.
433: Thrift: Mon Apr 20 10:44:55 2020 TNonblocking: notifyHandler read() failed: : errno = 10093
433: Thrift: Mon Apr 20 10:
23/26 Test #433: StressTestNonBlocking ...................... Passed 152.30 sec
{code}
The "curious" lines are
{code}
433: Thrift: Mon Apr 20 10:44:55 2020 TNonblocking: notifyHandler read() failed: : errno = 10093
433: Thrift: Mon Apr 20 10:
{code}
My impression is that there is a crash or other failure, possibly in a separate
thread, that is not propagated to the main application: the test is still
marked as "Passed"! In roughly one out of five runs, however, the test fails
harder and is terminated in an error state.
So, while disabling the flaky test seems reasonable, it would be quite good to
check whether the error handling for multi-threading is correct and complete.
Something may be fishy there.
> StressTestNonBlocking is disabled in Appveyor as it is unstable on Windows in
> general
> -------------------------------------------------------------------------------------
>
> Key: THRIFT-4282
> URL: https://issues.apache.org/jira/browse/THRIFT-4282
> Project: Thrift
> Issue Type: Bug
> Components: C++ - Library
> Affects Versions: 0.10.0
> Environment: Windows, Appveyor
> Reporter: James E. King III
> Priority: Major
>
> I have not been able to complete most of my local builds including
> StressTestNonBlocking because the test fails. It is disabled in AppVeyor so
> the builds will pass. We need this test to be fixed so we can enable it for
> CI builds again.