I have picked an approach that reduces the number of GET calls for small
files from 3 (2 for the footer + 1 for the actual data read) to 1.
For small files we can buffer the whole file instead of issuing separate
calls.
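
To make the idea concrete, here is a rough sketch of the buffering step. It
is not the code in the PR; RangeReader, CountingReader, open and the
threshold parameter are hypothetical names I made up for illustration, and
each read() call stands in for one GET against the object store.

```java
import java.util.Arrays;

/** Hypothetical sketch; none of these names are Iceberg or parquet-mr APIs. */
final class SmallFileBuffering {

    /** Stand-in for an object-store client: each call models one GET. */
    interface RangeReader {
        byte[] read(long offset, int length);
    }

    /** Wraps an in-memory "object" and counts the range requests made. */
    static final class CountingReader implements RangeReader {
        private final byte[] object;
        int getCalls = 0;

        CountingReader(byte[] object) {
            this.object = object;
        }

        @Override
        public byte[] read(long offset, int length) {
            getCalls++;
            return Arrays.copyOfRange(object, (int) offset, (int) offset + length);
        }
    }

    /**
     * For files at or below the (assumed) threshold, fetch the whole object
     * with a single GET and serve every later range from memory. Larger
     * files keep using the remote reader unchanged.
     */
    static RangeReader open(RangeReader remote, long fileLength, long threshold) {
        if (fileLength > threshold) {
            return remote;
        }
        byte[] whole = remote.read(0, (int) fileLength); // the only GET
        return (offset, length) ->
            Arrays.copyOfRange(whole, (int) offset, (int) offset + length);
    }
}
```

With a reader returned by open(), the two footer reads and the data read
all hit the in-memory copy, so only the initial GET reaches the store. The
right threshold is an open question and would need measuring.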

High-level implementation: https://github.com/apache/iceberg/pull/16729
A similar change in parquet-mr (like the one in arrow-rs) could result in a
much cleaner approach here: the caller provides a size hint, and the reader
fetches that many bytes from the end of the file instead of only the last
8, so an additional GET call to fetch the footer may not be needed.
For that reason I have started a discussion to figure out the best
approach:
https://lists.apache.org/thread/yb8nom3w2zplb703m0p052kcc1wwotrr
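
A minimal sketch of how a size hint could work, assuming a plain
(unencrypted) Parquet file whose last 8 bytes are the 4-byte little-endian
footer length followed by the "PAR1" magic. FooterHintSketch, RangeReader
and readFooter are hypothetical names for illustration only, not the
parquet-mr or arrow-rs API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch of a hinted footer read; not a real library API. */
final class FooterHintSketch {

    /** Stand-in for an object-store client: each call models one GET. */
    interface RangeReader {
        byte[] read(long offset, int length);
    }

    /** 4-byte little-endian footer length followed by the "PAR1" magic. */
    static final int TAIL_LEN = 8;

    /** Returns the serialized footer bytes, using one GET when the hint is large enough. */
    static byte[] readFooter(RangeReader in, long fileLength, int hint) {
        if (fileLength < TAIL_LEN) {
            throw new IllegalArgumentException("file too small to be Parquet");
        }
        // Read at least the 8-byte tail, at most the whole file.
        int tailLen = (int) Math.min(fileLength, Math.max(hint, TAIL_LEN));
        byte[] tail = in.read(fileLength - tailLen, tailLen); // first GET

        String magic = new String(tail, tailLen - 4, 4, StandardCharsets.US_ASCII);
        if (!"PAR1".equals(magic)) {
            throw new IllegalArgumentException("missing PAR1 magic");
        }
        int footerLen = ByteBuffer.wrap(tail, tailLen - TAIL_LEN, 4)
            .order(ByteOrder.LITTLE_ENDIAN)
            .getInt();

        if (footerLen + TAIL_LEN <= tailLen) {
            // The hint covered the footer: slice it out of the bytes we already have.
            int from = tailLen - TAIL_LEN - footerLen;
            return Arrays.copyOfRange(tail, from, from + footerLen);
        }
        // Hint too small: fall back to a second GET for the footer itself.
        return in.read(fileLength - TAIL_LEN - footerLen, footerLen);
    }
}
```

If the hint is at least the footer size plus 8 bytes, the footer comes back
with the first request; otherwise the cost is the same two calls as today.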

I would appreciate input and feedback there.
Thanks,
-- 
Lakhyani Varun
Indian Institute of Technology Roorkee
