westonpace opened a new pull request #12263: URL: https://github.com/apache/arrow/pull/12263
The test could fail when writing due to a race condition. If the batches were delivered grouped by partition (`AAAAABBBBBCCCCC...`), then by the time we need to close a file to make space, we can close a file that is already complete. We never have to open a new file for that partition later, so we end up with 5 files for 5 partitions, which is not what the test expects.

Adding `use_threads=False` to the `write_dataset` call was not sufficient on its own. The `arrow::dataset::FileSystemDataset::Write` method was always using the CPU executor for the exec plan. In other scanner methods we base the CPU executor on the scan options (`nullptr` if `scan_options->use_threads` is `false`).

Making both of these changes together seems to make the test reliably pass.
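
To illustrate why delivery order matters, here is a toy model of a writer that may keep only a limited number of files open. It is a sketch, not Arrow's implementation: the function name `count_files_written`, the limit of 3 open files, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


def count_files_written(batch_partitions, max_open_files):
    """Toy model: count files created for a stream of batches.

    Each batch is identified only by its partition key. Assumption: when
    the open-file limit is hit, the least recently used file is closed,
    and a closed file is never reopened (a new file is created instead).
    """
    open_files = OrderedDict()  # partition -> None, oldest use first
    files_created = 0
    for partition in batch_partitions:
        if partition in open_files:
            # File already open: write to it and mark it most recently used.
            open_files.move_to_end(partition)
            continue
        if len(open_files) >= max_open_files:
            # Make space by closing the least recently used file.
            open_files.popitem(last=False)
        open_files[partition] = None
        files_created += 1
    return files_created


partitions = "ABCDE"
# Batches grouped by partition: AAAAABBBBBCCCCC...
grouped = [p for p in partitions for _ in range(5)]
# Batches interleaved across partitions: ABCDEABCDE...
interleaved = [p for _ in range(5) for p in partitions]

print(count_files_written(grouped, max_open_files=3))      # 5
print(count_files_written(interleaved, max_open_files=3))  # 25
```

In this model the grouped order only ever closes files that are already complete, so it produces one file per partition. The interleaved order keeps closing files that still have data coming and has to create new ones.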
