Andy Grove created ARROW-11012:
----------------------------------
Summary: [Rust] [DataFusion] Make write_csv and write_parquet concurrent
Key: ARROW-11012
URL: https://issues.apache.org/jira/browse/ARROW-11012
Project: Apache Arrow
Issue Type: Improvement
Components: Rust - DataFusion
Reporter: Andy Grove
ExecutionContext.write_csv and write_parquet currently iterate over the output
partitions serially: each partition is executed and its results written out
before the next one starts. We should run these as tokio tasks so the partitions
can execute concurrently. This should, in theory, help with memory usage when
the plan contains repartition operators.
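
To illustrate the shape of the proposed change, here is a minimal sketch of writing all partitions concurrently. It is not DataFusion code: the issue proposes tokio tasks, but to stay dependency-free this sketch substitutes scoped OS threads from the standard library. The `Partition` type and both function names are hypothetical, and the output goes to in-memory buffers rather than files.

```rust
use std::io::{self, Write};
use std::thread;

/// Hypothetical stand-in for one output partition: rows of (id, name).
type Partition = Vec<(u32, String)>;

/// Serialize one partition as CSV into any writer.
fn write_partition_csv<W: Write>(rows: &[(u32, String)], out: &mut W) -> io::Result<()> {
    writeln!(out, "id,name")?;
    for (id, name) in rows {
        writeln!(out, "{},{}", id, name)?;
    }
    Ok(())
}

/// Write every partition concurrently, one scoped thread per partition.
/// Returns one CSV buffer per partition, in partition order.
/// (Assumption: std threads stand in for the tokio tasks named in the issue.)
fn write_partitions_concurrently(partitions: &[Partition]) -> io::Result<Vec<Vec<u8>>> {
    thread::scope(|scope| {
        // Start all partitions before waiting on any of them.
        let handles: Vec<_> = partitions
            .iter()
            .map(|rows| {
                scope.spawn(move || {
                    let mut buf = Vec::new();
                    write_partition_csv(rows, &mut buf)?;
                    Ok::<Vec<u8>, io::Error>(buf)
                })
            })
            .collect();

        // Join in partition order; a panicked worker becomes an io::Error.
        handles
            .into_iter()
            .map(|handle| {
                handle.join().unwrap_or_else(|_| {
                    Err(io::Error::new(io::ErrorKind::Other, "writer thread panicked"))
                })
            })
            .collect()
    })
}
```

The important detail is that every worker is spawned before the first join. Joining inside the spawning loop would reproduce the current one-at-a-time behaviour.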
Should we also add a configuration option to choose between serial and
parallel writes?
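
One possible shape for such an option is sketched below. `WriteMode` and `write_all` are hypothetical names, not existing DataFusion settings or APIs, and standard-library scoped threads again stand in for tokio tasks so the example has no dependencies. The per-partition work is passed in as a closure that returns a string, purely for illustration.

```rust
use std::thread;

/// Hypothetical setting; not an existing DataFusion configuration option.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WriteMode {
    Serial,
    Parallel,
}

/// Run `write_one` for every partition index, serially or concurrently
/// depending on `mode`. Results come back in partition order either way.
fn write_all<F>(num_partitions: usize, mode: WriteMode, write_one: F) -> Vec<String>
where
    F: Fn(usize) -> String + Sync,
{
    match mode {
        // One partition at a time, as the writers behave today.
        WriteMode::Serial => (0..num_partitions).map(|i| write_one(i)).collect(),
        // All partitions started up front, then joined in order.
        WriteMode::Parallel => thread::scope(|scope| {
            let write_one = &write_one;
            let handles: Vec<_> = (0..num_partitions)
                .map(|i| scope.spawn(move || write_one(i)))
                .collect();
            handles
                .into_iter()
                .map(|handle| handle.join().expect("writer thread panicked"))
                .collect()
        }),
    }
}
```

Because both modes return results in partition order, callers would not need to change when the setting is flipped.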
--
This message was sent by Atlassian Jira
(v8.3.4#803005)