wgtmac opened a new issue, #33710:
URL: https://github.com/apache/arrow/issues/33710

   ### Describe the enhancement requested
   
   [PR #33656](https://github.com/apache/arrow/pull/33656) enhances 
`parquet::arrow::FileWriter` to write columns in parallel when 
`ArrowWriterProperties::use_threads` is set to true, optionally with a 
user-provided executor. As the [review 
comment](https://github.com/apache/arrow/pull/33656#pullrequestreview-1250353638)
 explains, a nested parallelism deadlock may occur if the file writer itself is 
running in the same executor: the writer then blocks a pool thread while 
waiting for column tasks queued on that same pool.
   
   To improve usability, we can adopt the suggestion from @westonpace to 
add a `WriteRecordBatchAsync` method that calls `ParallelForAsync` and returns 
the future. This method can then be called safely in parallel, even from thread 
pool threads (assuming they don't block on that future but instead fold it into 
a higher-level `AllComplete` call later). The `WriteRecordBatch` method could 
then simply return `WriteRecordBatchAsync(...).status()`.
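
The proposed split between an async method and a thin blocking wrapper can be sketched with the standard library alone. This is only an illustration of the shape of the API, not Arrow code: it uses `std::future` and `std::async` in place of Arrow's `Future` and `ParallelForAsync`, and `Status` and `ColumnTask` are hypothetical stand-ins introduced for this example. Because `std::async` starts its own threads rather than sharing a fixed pool, the sketch does not reproduce the deadlock itself.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for arrow::Status (not the real class).
struct Status {
  bool ok = true;
  std::string message;
  static Status OK() { return {}; }
  static Status Error(std::string msg) { return {false, std::move(msg)}; }
};

// Hypothetical per-column job: writes one column and reports a Status.
using ColumnTask = std::function<Status()>;

// Async variant: starts one task per column and returns a future
// without waiting for any of the columns to finish.
std::future<Status> WriteRecordBatchAsync(std::vector<ColumnTask> columns) {
  std::vector<std::future<Status>> pending;
  pending.reserve(columns.size());
  for (auto& task : columns) {
    assert(task && "column task must be callable");
    pending.push_back(std::async(std::launch::async, std::move(task)));
  }
  // Deferred launch: the combining step runs only when the caller
  // asks for the result, so this function itself never blocks.
  return std::async(std::launch::deferred,
                    [pending = std::move(pending)]() mutable {
                      Status result = Status::OK();
                      for (auto& f : pending) {
                        Status s = f.get();
                        // Keep the first error, but still drain every future.
                        if (result.ok && !s.ok) result = std::move(s);
                      }
                      return result;
                    });
}

// Blocking variant: a thin wrapper over the async one, mirroring
// `WriteRecordBatchAsync(...).status()` from the proposal.
Status WriteRecordBatch(std::vector<ColumnTask> columns) {
  return WriteRecordBatchAsync(std::move(columns)).get();
}
```

A caller that is itself running on a worker thread would hold on to the future returned by `WriteRecordBatchAsync` and combine it with others later, while callers outside the pool can keep using the blocking `WriteRecordBatch`.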
   
   ### Component(s)
   
   C++, Parquet


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
