westonpace opened a new pull request #12263:
URL: https://github.com/apache/arrow/pull/12263


   The test could fail during the write because of a race condition.  If the 
batches are delivered grouped by partition (`AAAAABBBBBCCCCC...`), then by the 
time we need to close a file to make space, we can close one that is already 
complete (so we never have to open a new file for that partition later), and we 
end up with only 5 files for 5 partitions.
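
To make the ordering effect concrete, here is a minimal Python sketch of a toy writer. It is not Arrow's dataset writer: the function name `count_files_written`, the least-recently-used eviction policy, and the limit of 3 open files are all assumptions chosen for illustration.

```python
from collections import OrderedDict


def count_files_written(batch_partitions, max_open_files):
    """Count files created by a toy writer (hypothetical, not Arrow code).

    The writer keeps at most `max_open_files` files open. When it needs
    room it closes the least recently used file (an assumed policy), and
    a closed partition needs a brand-new file if more data arrives for it.
    """
    open_files = OrderedDict()  # partition -> None, ordered by last use
    files_created = 0
    for partition in batch_partitions:
        if partition in open_files:
            # File is still open: append to it and mark it as recently used.
            open_files.move_to_end(partition)
            continue
        if len(open_files) >= max_open_files:
            # Make space by closing the least recently used file.
            open_files.popitem(last=False)
        open_files[partition] = None
        files_created += 1
    return files_created


# Batches grouped by partition: every closed file is already complete.
grouped = list("AAAAABBBBBCCCCCDDDDDEEEEE")
# Batches interleaved across partitions: closed files must be replaced.
interleaved = list("ABCDE" * 5)

print(count_files_written(grouped, max_open_files=3))
print(count_files_written(interleaved, max_open_files=3))
```

In this toy model the grouped order creates one file per partition, while the interleaved order creates many more, so the number of files depends on the delivery order of the batches.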
   
   Adding `use_threads=False` to the `write_dataset` call was not sufficient 
on its own.  The `arrow::dataset::FileSystemDataset::Write` method always used 
the CPU executor for the exec plan.  Other scanner methods choose the CPU 
executor based on the scan options (`nullptr` if `scan_options->use_threads` is 
`false`).  With both changes (passing `use_threads=False` and having `Write` 
honor the scan options in the same way), the test seems to pass reliably.
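
The executor selection described above can be sketched in Python as an analogy. This is not the Arrow C++ code: `choose_executor` and `run_tasks` are hypothetical names, and a `ThreadPoolExecutor` stands in for the CPU executor, with `None` playing the role of `nullptr`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def choose_executor(use_threads):
    """Return a thread pool if threading is requested, else None.

    None plays the role of the `nullptr` executor (run serially).
    """
    return ThreadPoolExecutor(max_workers=4) if use_threads else None


def run_tasks(tasks, use_threads):
    """Run zero-argument callables and return results in completion order."""
    executor = choose_executor(use_threads)
    if executor is None:
        # Serial path: results always come back in submission order.
        return [task() for task in tasks]
    with executor:
        futures = [executor.submit(task) for task in tasks]
        # Threaded path: completion order is not guaranteed.
        return [future.result() for future in as_completed(futures)]


# One task per partition; each simply returns its partition label.
tasks = [lambda p=p: p for p in "ABCDE"]

serial_result = run_tasks(tasks, use_threads=False)
threaded_result = run_tasks(tasks, use_threads=True)
print(serial_result)
```

Only the serial path guarantees a fixed delivery order, which is why a writer that ignores `use_threads` can still see batches arrive in a varying order.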
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

