alamb opened a new issue, #21598:
URL: https://github.com/apache/datafusion/issues/21598

   ## Is your feature request related to a problem or challenge?
   
   Follow-on from [#20529](https://github.com/apache/datafusion/issues/20529) 
and the morsel API work in 
[#21342](https://github.com/apache/datafusion/pull/21342).
   
   In [this review 
comment](https://github.com/apache/datafusion/pull/21342#discussion_r3070184104),
 Adrian raised a good question about whether 
[`Morsel`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource/src/morsel/mod.rs#L52)
 should return an iterator rather than a stream. We tried that change locally 
and found that the API is straightforward for mock and fully CPU-ready morsels, 
and it fits well with the current 
[`ScanState`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource/src/file_stream/scan_state.rs#L65)
 / `FileStream` scheduler. However, Parquet is structurally harder because 
[`ParquetPushDecoder`](https://docs.rs/parquet/latest/parquet/arrow/push_decoder/struct.ParquetPushDecoder.html)
 can discover `NeedsData` only after some batches have already been produced 
from the current decode state, which makes a synchronous morsel boundary 
difficult to express cleanly in [
 
`ParquetOpenState`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource-parquet/src/opener.rs#L202),
 
[`PushDecoderState`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource-parquet/src/opener.rs#L411),
 and 
[`ParquetStreamMorsel`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource-parquet/src/opener.rs#L507).
   
   Three-sentence summary of what we found in testing: switching `Morsel` from 
`Stream` to `Iterator` worked cleanly for `FileStream` and the mock morsel 
infrastructure. For Parquet, a simple iterator wrapper required blocking 
because the decoder can hit `NeedsData` mid-iteration, after already yielding 
some batches. A handoff-based design removed the blocking bridge in principle, 
but the current `ParquetPushDecoder` / reader ownership model made it easy to 
introduce hangs in Parquet reverse-scan tests.
   
   ## Describe the solution you'd like
   
   Add Parquet-side support that makes an iterator-based morsel contract 
practical without blocking.
   
   Two likely enablers are:
   - an API on 
[`ParquetPushDecoder`](https://docs.rs/parquet/latest/parquet/arrow/push_decoder/struct.ParquetPushDecoder.html)
 that can tell the caller how much already-buffered output remains, or 
otherwise expose a clearer “this many rows/batches are ready before the next 
`NeedsData`” boundary
   - an efficient way to hand off or clone the 
[`AsyncFileReader`](https://docs.rs/parquet/latest/parquet/arrow/async_reader/trait.AsyncFileReader.html)
 state needed to start the next async read without coupling the active morsel 
iterator and the pending planner so tightly
   
   More concretely, it would be useful if Parquet exposed one or both of:
   - a way to ask the decoder whether more output can be produced synchronously 
before the next I/O boundary
   - a cheap way to split or clone the reader / read context so the next 
planner I/O can be prepared independently of the currently active morsel 
iterator
   
   That would let DataFusion produce a 
[`Morsel`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource/src/morsel/mod.rs#L52)
 that is honestly CPU-only, while keeping future I/O in 
[`PendingMorselPlanner`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource/src/morsel/mod.rs#L188)
 rather than hiding async work behind a blocking iterator bridge.
   
   ## Describe alternatives you've considered
   
   - Keep 
[`Morsel`](https://github.com/apache/datafusion/blob/04dbbbf6694a4b162f76aee0091fdc3a47d2f9f0/datafusion/datasource/src/morsel/mod.rs#L52)
 as a stream for now. This is the cleanest fit for current Parquet behavior, 
but it gives up the simpler synchronous morsel contract.
   - Use `futures::executor::block_on` or `block_on_stream` inside Parquet 
morsels. This works mechanically but blocks executor threads and is not a good 
long-term design.
   - Use a channel handoff where the morsel iterator returns batches until 
`NeedsData`, then sends decoder state to a waiting planner future. This is 
conceptually closer to the desired model, but our prototype still had 
lifecycle/handoff complexity and hung in some Parquet tests.
   
   ## Additional context
   
   Relevant links:
   - [#20529](https://github.com/apache/datafusion/issues/20529)
   - [#21342](https://github.com/apache/datafusion/pull/21342)
   - Adrian’s review question: [#21342 
comment](https://github.com/apache/datafusion/pull/21342#discussion_r3070184104)
   - [`ParquetPushDecoder` source in 
arrow-rs](https://github.com/apache/arrow-rs/blob/master/parquet/src/arrow/push_decoder/mod.rs)
   
   If helpful, this issue could explicitly track the Parquet-specific API work 
needed before revisiting an iterator-based `Morsel` contract in DataFusion.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to