Wes McKinney created ARROW-9983:
-----------------------------------

             Summary: [C++][Dataset] Use larger default batch size than 32K for Datasets API
                 Key: ARROW-9983
                 URL: https://issues.apache.org/jira/browse/ARROW-9983
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Wes McKinney
             Fix For: 2.0.0

Dremio uses 64K batch sizes. We could probably get away with even larger batch sizes (e.g. 256K or 1M) and allow memory-constrained users to elect a smaller batch size.

See ARROW-9924 for an example of performance issues related to this.
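
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in plain Python (it does not call Arrow). It assumes batch sizes are counted in rows, that "K" means 1024 and "M" means 1024 * 1024, and it uses a made-up table of 100 million rows at 64 bytes per row. The function names and all figures are hypothetical and only illustrate how the number of batches shrinks while per-batch memory grows.

```python
# Candidate batch sizes in rows, assuming "K" = 1024 and "M" = 1024 * 1024.
BATCH_SIZES = {
    "32K": 32 * 1024,
    "64K": 64 * 1024,
    "256K": 256 * 1024,
    "1M": 1024 * 1024,
}


def batch_stats(total_rows, bytes_per_row, batch_size):
    """Return (number of batches, bytes held by one full batch).

    bytes_per_row is a hypothetical average row width; real Arrow
    batches vary with column types and encodings.
    """
    # Integer ceiling division: the last batch may be partially filled.
    num_batches = (total_rows + batch_size - 1) // batch_size
    # A batch never holds more rows than the table itself.
    rows_in_largest_batch = min(batch_size, total_rows)
    return num_batches, rows_in_largest_batch * bytes_per_row


def summarize(total_rows, bytes_per_row):
    """Compute batch_stats for every candidate batch size."""
    return {
        label: batch_stats(total_rows, bytes_per_row, size)
        for label, size in BATCH_SIZES.items()
    }


if __name__ == "__main__":
    # Hypothetical workload: 100 million rows, 64 bytes per row.
    for label, (count, nbytes) in summarize(100_000_000, 64).items():
        print(f"{label:>4}: {count:>5} batches, {nbytes / 2**20:.1f} MiB per batch")
```

Under these assumptions, going from 32K to 1M rows per batch cuts the batch count from 3052 to 96, while the memory held by a single batch rises from 2 MiB to 64 MiB. That is why a memory-constrained user might still want to elect a smaller size.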