[PR] feat(parquet/file): Add SeekToRow for Column readers [arrow-go]

via GitHub Mon, 17 Feb 2025 09:47:40 -0800


zeroshade opened a new pull request, #283:
URL: https://github.com/apache/arrow-go/pull/283


   ### Rationale for this change
   This addresses the comments in 
https://github.com/apache/arrow-go/issues/278#issuecomment-2657578697 by 
allowing reads to be optimized: entire pages can be skipped, using the offset 
index if it exists.
   
   ### What changes are included in this PR?
   Deprecating the old `NewColumnChunkReader` and `NewPageReader` methods, as 
they aren't safe to use outside of the package and have proved difficult to 
evolve without breaking changes. Instead, users should rely on the 
`RowGroupReader` to create the column readers and page readers, which is 
generally what consumers already do.
   
   Adding a `SeekToRow` method on the ColumnChunkReader to allow skipping to a 
particular row in the column chunk (which also allows quickly resetting back to 
the beginning of a column!), along with a `SeekToPageWithRow` method on the page 
reader. Also updates the `Skip` method to properly skip *rows* in a repeated 
column, not just values.
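   
   To illustrate the two ideas above, here is a minimal, standalone sketch. It 
is not the code in this PR and does not use the arrow-go API: `pageForRow` and 
`levelsToSkip` are hypothetical helper names, and the inputs are plain slices. 
The first function assumes the first row index of each page is known (the kind 
of information a Parquet offset index records) and finds the page holding a 
target row. The second relies on the Parquet convention that a repetition level 
of 0 marks the start of a new row, which is why skipping rows in a repeated 
column differs from skipping values.

```go
package main

import "sort"

// pageForRow returns the index of the page that contains row, given the
// first row index of every page in ascending order. It returns -1 when
// there are no pages or when row precedes the first page.
// Hypothetical helper for illustration only.
func pageForRow(firstRowIdx []int64, row int64) int {
	// Find the first page that starts strictly after row;
	// the page just before it is the one that holds row.
	i := sort.Search(len(firstRowIdx), func(i int) bool {
		return firstRowIdx[i] > row
	})
	return i - 1
}

// levelsToSkip returns how many leading entries of repLevels belong to the
// first nrows rows, assuming repLevels starts at a row boundary. A level of
// 0 starts a new row; any higher level continues the current row.
// Hypothetical helper for illustration only.
func levelsToSkip(repLevels []int16, nrows int) int {
	started := 0 // number of rows whose first entry has been seen
	for i, lvl := range repLevels {
		if lvl == 0 {
			if started == nrows {
				// This entry begins the first row we want to keep.
				return i
			}
			started++
		}
	}
	// Fewer than nrows+1 rows are present: everything is skipped.
	return len(repLevels)
}
```

   With pages starting at rows 0, 100 and 250, `pageForRow` maps row 100 to the 
second page (index 1). With repetition levels `0,1,1,0,0,1` (three rows holding 
three, one and two entries), skipping one row consumes three entries, whereas 
skipping one value would stop in the middle of the first row.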
   
   ### Are these changes tested?
   Yes, tests are included.
   
   ### Are there any user-facing changes?
   Just the new methods. The deprecated methods remain available for now.
   

