Weston Pace created ARROW-12371:
-----------------------------------
Summary: [C++] Allow EnumeratingGenerator to be async-reentrant
Key: ARROW-12371
URL: https://issues.apache.org/jira/browse/ARROW-12371
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Weston Pace

The combination of EnumeratingGenerator and ResequencingGenerator can be used
to process items in a "first available" fashion. This is currently used in the
scanner to compensate for fragments that are intermittently slow.
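
As a rough illustration of the pattern described above, here is a minimal, synchronous sketch: items are tagged with their index, processed in whatever order they become available, and then put back into their original order. The names Enumerated, Enumerate, Resequencer and ProcessFirstAvailable are hypothetical stand-ins written for this example. They are not the Arrow classes, and the real generators are asynchronous and built on futures.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an enumerated item: a value plus its position
// in the original stream. Not the Arrow type.
template <typename T>
struct Enumerated {
  T value;
  int index;
};

// Tags each item with its index in the input order.
template <typename T>
std::vector<Enumerated<T>> Enumerate(std::vector<T> items) {
  std::vector<Enumerated<T>> out;
  const int n = static_cast<int>(items.size());
  for (int i = 0; i < n; ++i) {
    out.push_back({std::move(items[i]), i});
  }
  return out;
}

// Accepts enumerated items in any order and emits them in index order.
template <typename T>
class Resequencer {
 public:
  explicit Resequencer(std::function<void(const T&)> sink)
      : sink_(std::move(sink)) {}

  void Push(Enumerated<T> item) {
    assert(item.index >= next_ && "index was already emitted");
    pending_.emplace(item.index, std::move(item.value));
    // Flush every buffered item that is now contiguous with the output.
    for (auto it = pending_.find(next_); it != pending_.end();
         it = pending_.find(next_)) {
      sink_(it->second);
      pending_.erase(it);
      ++next_;
    }
  }

  // Number of items held back while waiting for an earlier index.
  int buffered() const { return static_cast<int>(pending_.size()); }

 private:
  std::function<void(const T&)> sink_;
  std::map<int, T> pending_;
  int next_ = 0;
};

// Simulates four batches where batch 0 is slow, so the batches become
// available in the order 1, 2, 0, 3. Each batch is "processed" (here, a
// marker is appended) as soon as it arrives, then resequenced.
std::vector<std::string> ProcessFirstAvailable() {
  auto batches = Enumerate<std::string>({"a", "b", "c", "d"});
  std::vector<std::string> emitted;
  Resequencer<std::string> reseq(
      [&](const std::string& s) { emitted.push_back(s); });
  for (int arrival : {1, 2, 0, 3}) {
    Enumerated<std::string> item = batches[arrival];
    item.value += "!";  // the per-item work happens out of order
    reseq.Push(std::move(item));
  }
  return emitted;
}
```

In this sketch the work on batches 1 and 2 finishes while batch 0 is still outstanding, and the resequencer only holds back the already processed results, so downstream consumers still see the original order.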

A potential further improvement would be to use this same pattern for
out-of-order readahead. For example, when reading a Parquet file or an IPC
file from S3, the reader may request multiple batches in parallel. If the next
batch is slow but the later batches are fast, we could start processing the
later batches while we wait for the next batch. To do this, the
EnumeratingGenerator would need to be async-reentrant so that several items
can be requested from it at once.

This would be a fairly minor improvement to latency (it probably won't affect
throughput much), so I don't think it is a very high priority. It may be best
to wait until profiling shows this is an issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)