westonpace commented on issue #7983:
URL: https://github.com/apache/arrow-rs/issues/7983#issuecomment-3164659936

   FWIW, Lance uses something kind of like a push decoder and I've been very 
happy with the performance.  There's no reason I'm aware of that it couldn't 
work with Parquet.  It's described a little bit 
[here](https://blog.lancedb.com/splitting-scheduling-from-decoding/) but the 
overall structure is that there are three components: the scheduler, the I/O
manager, and the decoder.
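
   To make that concrete, here is a rough sketch of how the three pieces could be wired together in Rust with channels.  All of the names below are mine for illustration, not Lance's actual types; the real thing has a lot more going on.

```rust
use std::ops::Range;
use tokio::sync::{mpsc, oneshot};

/// Byte ranges the scheduler wants fetched for one column chunk, plus a
/// channel the I/O manager completes once the bytes have been read.
pub struct IoRequest {
    pub ranges: Vec<Range<u64>>,
    pub done: oneshot::Sender<Vec<bytes::Bytes>>,
}

/// What the scheduler pushes onto the decode queue: a description of the
/// column chunk and a handle to its in-flight I/O.
pub struct DecodeMessage {
    pub chunk: ColumnChunkMeta,
    pub data: oneshot::Receiver<Vec<bytes::Bytes>>,
}

/// Placeholder for whatever per-chunk metadata the format needs.
pub struct ColumnChunkMeta {
    pub column_index: usize,
    pub num_rows: u64,
}

/// The queues that wire the three components together.
pub type IoQueue = mpsc::Sender<IoRequest>;
pub type DecodeQueue = mpsc::Receiver<DecodeMessage>;
```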
   
   The scheduler is a dedicated thread (a spawned tokio task) that is started when 
the read starts.  It goes through the file metadata and calculates, based on 
the rows requested, what I/O ranges will be needed.  This is also where you'd 
apply filter-based pruning in Parquet, I think.  For each column chunk the 
scheduler submits a request to the I/O manager (with the byte ranges needed), 
gets back a handle, and then puts a message on the decode queue (with a handle 
to the column chunk description and a handle to the I/O request).
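
   Reusing the types from the sketch above, the scheduler could be a single spawned task along these lines (`byte_ranges_for` is a hypothetical stand-in for the metadata walk and any pruning):

```rust
use std::ops::Range;
use tokio::sync::{mpsc, oneshot};

/// Spawned once per read.  Walks the metadata, works out which column
/// chunks (and which byte ranges within them) the requested rows need,
/// hands each range set to the I/O manager, and queues one decode
/// message per chunk.
pub fn spawn_scheduler(
    chunks: Vec<ColumnChunkMeta>, // result of the metadata walk / pruning
    io_queue: mpsc::Sender<IoRequest>,
    decode_queue: mpsc::Sender<DecodeMessage>,
) -> tokio::task::JoinHandle<()> {
    tokio::spawn(async move {
        for chunk in chunks {
            // Hypothetical helper: the byte ranges needed for this chunk.
            let ranges: Vec<Range<u64>> = byte_ranges_for(&chunk);

            let (done, data) = oneshot::channel();

            // Ask the I/O manager for the bytes...
            if io_queue.send(IoRequest { ranges, done }).await.is_err() {
                break; // I/O manager has shut down
            }
            // ...and tell the decoder where those bytes will show up.
            if decode_queue.send(DecodeMessage { chunk, data }).await.is_err() {
                break; // the reader went away
            }
        }
        // Dropping `decode_queue` ends the stream once everything is queued.
    })
}
```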
   
   The I/O manager receives the I/O requests.  There is one I/O request per 
column chunk, and it contains all of the desired ranges for that chunk.  We do 
coalescing here (similar to the C++ implementation).  When the I/O is finished 
it marks the future complete.
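
   Coalescing is basically a sort-and-merge over the requested ranges before the reads are issued.  A sketch of that loop, with `read_ranges` as a hypothetical stand-in for the actual file or object-store reads:

```rust
use std::ops::Range;
use tokio::sync::mpsc;

/// Merge ranges that are adjacent or within `gap` bytes of each other so
/// that many small column-chunk reads become a few larger ones.
fn coalesce(mut ranges: Vec<Range<u64>>, gap: u64) -> Vec<Range<u64>> {
    ranges.sort_by_key(|r| r.start);
    let mut merged: Vec<Range<u64>> = Vec::new();
    for r in ranges {
        match merged.last_mut() {
            Some(last) if r.start <= last.end + gap => {
                last.end = last.end.max(r.end);
            }
            _ => merged.push(r),
        }
    }
    merged
}

/// A single task that services every I/O request and completes each
/// request's channel once its bytes are available.
pub async fn run_io_manager(mut requests: mpsc::Receiver<IoRequest>) {
    while let Some(IoRequest { ranges, done }) = requests.recv().await {
        let merged = coalesce(ranges, 4096);
        // Hypothetical read: fetch the coalesced ranges from the file or
        // object store (and re-slice per original range if needed).
        let buffers = read_ranges(&merged).await;
        // The receiver may already be gone if the read was cancelled.
        let _ = done.send(buffers);
    }
}
```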
   
   The decoder is a stream polled by the reader threads.  There can be 
as many reader threads as you want (we tie these to partitions in DataFusion).  
Each time the decoder is polled it grabs the next message from the decode 
queue, awaits the I/O handle (this is where the readers block if we are I/O 
bound), and then, once the data is ready, it decodes the data into record batches.
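
   The reader-facing side then just pops the queue and awaits the handle.  A sketch, again reusing the types above, with `decode_chunk` as a hypothetical stand-in for the real per-chunk decode logic:

```rust
use arrow_array::RecordBatch;
use tokio::sync::mpsc;

/// Called by the reader side: pops the next decode message, waits for its
/// bytes, and does the CPU work of turning them into a batch.  Returns
/// `None` once the scheduler has queued everything and the queue drains.
pub async fn next_batch(
    decode_queue: &mut mpsc::Receiver<DecodeMessage>,
) -> Option<RecordBatch> {
    let msg = decode_queue.recv().await?;
    // Await the I/O handle; this is where the reader blocks when I/O bound.
    let buffers = msg.data.await.ok()?;
    // Hypothetical helper: decode this chunk's raw buffers into Arrow data.
    Some(decode_chunk(&msg.chunk, buffers))
}
```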

