Hi, Ivan and I were debugging some behavior of the source node this morning, and I was hoping to confirm that our understanding is correct.
We observed that when using the source node with a generator:
https://github.com/apache/arrow/blob/66c66d040bbf81a4819b276aee306625dc02837c/cpp/src/arrow/compute/exec/options.h#L54
the source node becomes "sequential" (batches come out in order, one at a time), even with GetCpuThreadPool() attached to the plan.

We traced the code into this class:
https://github.com/apache/arrow/blob/78fb2edd30b602bd54702896fa78d36ec6fefc8c/cpp/src/arrow/util/async_generator.h#L316
and it seems that, because of the synchronization in this class, batches are generated sequentially.

Is this understanding correct, and is it intentional that the source node is sequential when backed by a generator? (This is actually the behavior we want.)
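
To make the question concrete, here is a minimal standalone sketch of the effect we think we are seeing. It does not use Arrow at all: `SerializedGenerator` and `PullConcurrently` are hypothetical names made up for illustration, and a plain mutex stands in for whatever synchronization the real class uses. The point is only that when every pull goes through one synchronized generator, batches are produced strictly in order no matter how many threads are pulling.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a generator-backed source. NOT an Arrow class.
class SerializedGenerator {
 public:
  explicit SerializedGenerator(int64_t num_batches) : num_batches_(num_batches) {}

  // Returns the index of the next batch, or std::nullopt once exhausted.
  // The mutex means only one caller can produce a batch at a time.
  std::optional<int64_t> Next() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (next_ >= num_batches_) return std::nullopt;
    production_order_.push_back(next_);  // record the order of production
    return next_++;
  }

  // Snapshot of the order in which batches were produced.
  std::vector<int64_t> production_order() {
    std::lock_guard<std::mutex> lock(mutex_);
    return production_order_;
  }

 private:
  std::mutex mutex_;
  int64_t next_ = 0;
  int64_t num_batches_;
  std::vector<int64_t> production_order_;
};

// Pulls from one shared generator using several threads (a stand-in for a
// CPU thread pool) and returns the order in which batches were produced.
std::vector<int64_t> PullConcurrently(int num_threads, int64_t num_batches) {
  assert(num_threads > 0);
  SerializedGenerator gen(num_batches);
  std::vector<std::thread> workers;
  for (int i = 0; i < num_threads; ++i) {
    workers.emplace_back([&gen] {
      // Each worker keeps pulling until the generator is exhausted.
      while (gen.Next().has_value()) {
      }
    });
  }
  for (auto& worker : workers) worker.join();
  return gen.production_order();
}
```

In this sketch, `PullConcurrently(4, 100)` returns the batch indices in increasing order even though four threads share the generator. Note that it only shows the order of production; what happens to a batch after it leaves the generator is a separate question.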