westonpace commented on pull request #10845:
URL: https://github.com/apache/arrow/pull/10845#issuecomment-904986082


   Digging further I can reproduce it reliably if I configure the producer (the 
background generator in test_util.h) to be eventually slow and the source node 
to be initially slow (the Loop method in start producing).  Since the loop 
method is initially slow the futures are already finished and it runs on the 
unit test thread.  Once the loop moves faster and the background generator 
slows down then the futures are not already finished and they get queued on the 
thread pool.
   
   The test itself seems to by trying to test a parallel case (parallel=true) 
yet it still isn't specifying an exec_context with an executor.  Adding an 
executor to the exec_context triggers a different race condition here: 
https://github.com/apache/arrow/blob/523b618d4e7317bc8a09ca7025ae4688c07b0bdc/cpp/src/arrow/compute/exec/hash_join_node.cc#L213
   
   cached is a reference to an element in `cached_probe_batches` which is soon 
cleared.
   
   So, my recommendation for this PR is to change all tests (parallel or not) 
to use an exec context which specifes an executor.  Then work through those 
issues.
   
   I don't think it is valid to NOT have an executor (and I will be opening a 
separate JIRA dedicated to that issue).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to