[
https://issues.apache.org/jira/browse/ARROW-14734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456274#comment-17456274
]
Weston Pace commented on ARROW-14734:
-------------------------------------
The actual culprit appears to be SignalCancelTest. At least, I am unable to
reproduce this by running CountingSemaphore.Basic alone. However, if I run
SignalCancelTest and CountingSemaphore.Basic at the same time then it will fail
(sometimes without output that matches the description of this issue so that it
looks like CountingSemaphore.Basic failed).
I can also reproduce it by running SignalCancelTest on repeat. So far only
with RegisterUnregister. I was able to capture a stack trace in Visual Studio
and it looks like the detached thread in cancel_test.cc:182 is raising a signal
that is not caught by any custom handler and thus exits the application.
I can only reproduce it if I stress the CPU.
My guess is that the test tears down (removing the cancellation guard)
before the signalling thread actually raises the signal.
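To illustrate the suspected ordering problem, here is a minimal sketch of the
safe variant, in which the signalling thread is joined before the handler is
removed. It is not the code in cancel_test.cc: RecordSignal,
RaiseWhileHandlerInstalled and the plain std::signal handler are hypothetical
stand-ins for the test's helper thread and the cancellation guard.

```cpp
#include <cassert>
#include <chrono>
#include <csignal>
#include <thread>

// Written only by the signal handler, read after the thread has been joined.
static volatile std::sig_atomic_t g_signal_seen = 0;

// Hypothetical stand-in for the handler installed by the cancellation guard.
extern "C" void RecordSignal(int /*signum*/) { g_signal_seen = 1; }

// Returns true if the signal raised by the helper thread reached our handler.
bool RaiseWhileHandlerInstalled() {
  g_signal_seen = 0;

  // "Register": install the custom handler and remember the previous one.
  auto previous = std::signal(SIGINT, RecordSignal);
  assert(previous != SIG_ERR);

  // Helper thread that raises the signal after a short delay.
  std::thread signaller([] {
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    std::raise(SIGINT);
  });

  // Joining (rather than detaching) guarantees that raise() has returned
  // before the handler is removed. With a detached thread, a slow scheduler
  // could run raise() after the restore below, and the default SIGINT
  // action would then terminate the process.
  signaller.join();

  // "Unregister": restore the previous handler.
  std::signal(SIGINT, previous);
  return g_signal_seen == 1;
}
```

Calling RaiseWhileHandlerInstalled() in a loop should keep returning true even
under CPU stress. If join() is replaced with detach(), the failure described
above becomes possible.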
> [C++][CI] CountingSemaphore sporadic test crash
> -----------------------------------------------
>
> Key: ARROW-14734
> URL: https://issues.apache.org/jira/browse/ARROW-14734
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Continuous Integration
> Reporter: Antoine Pitrou
> Assignee: Weston Pace
> Priority: Major
>
> This may be a fluke, but this crash appeared on CI:
> https://github.com/apache/arrow/runs/4234285140?check_suite_focus=true#step:8:110
> {code}
> [==========] Running 134 tests from 23 test suites.
> [----------] Global test environment set-up.
> [----------] 7 tests from CancelTest
> [ RUN ] CancelTest.StopBasics
> [ OK ] CancelTest.StopBasics (0 ms)
> [ RUN ] CancelTest.StopTokenCopy
> [ OK ] CancelTest.StopTokenCopy (0 ms)
> [ RUN ] CancelTest.RequestStopTwice
> [ OK ] CancelTest.RequestStopTwice (0 ms)
> [ RUN ] CancelTest.Unstoppable
> [ OK ] CancelTest.Unstoppable (0 ms)
> [ RUN ] CancelTest.SourceVanishes
> [ OK ] CancelTest.SourceVanishes (0 ms)
> [ RUN ] CancelTest.ThreadedPollSuccess
> [ OK ] CancelTest.ThreadedPollSuccess (11 ms)
> [ RUN ] CancelTest.ThreadedPollCancel
> [ OK ] CancelTest.ThreadedPollCancel (11 ms)
> [----------] 7 tests from CancelTest (23 ms total)
> [----------] 2 tests from SignalCancelTest
> [ RUN ] SignalCancelTest.Register
> [ OK ] SignalCancelTest.Register (1 ms)
> [ RUN ] SignalCancelTest.RegisterUnregister
> [ OK ] SignalCancelTest.RegisterUnregister (111 ms)
> [----------] 2 tests from SignalCancelTest (113 ms total)
> [----------] 3 tests from CountingSemaphore
> [ RUN ] CountingSemaphore.Basic
> {code}