westonpace commented on pull request #10845: URL: https://github.com/apache/arrow/pull/10845#issuecomment-904986082
Digging further I can reproduce it reliably if I configure the producer (the background generator in test_util.h) to be eventually slow and the source node to be initially slow (the Loop method in start producing). Since the loop method is initially slow the futures are already finished and it runs on the unit test thread. Once the loop moves faster and the background generator slows down then the futures are not already finished and they get queued on the thread pool. The test itself seems to by trying to test a parallel case (parallel=true) yet it still isn't specifying an exec_context with an executor. Adding an executor to the exec_context triggers a different race condition here: https://github.com/apache/arrow/blob/523b618d4e7317bc8a09ca7025ae4688c07b0bdc/cpp/src/arrow/compute/exec/hash_join_node.cc#L213 cached is a reference to an element in `cached_probe_batches` which is soon cleared. So, my recommendation for this PR is to change all tests (parallel or not) to use an exec context which specifes an executor. Then work through those issues. I don't think it is valid to NOT have an executor (and I will be opening a separate JIRA dedicated to that issue). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
