I have picked an approach that reduces the number of GET calls for small
files from 3 (2 for the footer + 1 for the actual data read) to 1.
For small files we can buffer the whole file instead of issuing separate calls.
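
A minimal sketch of the difference in GET counts, not the code from the PR.
It assumes a simplified Parquet-like layout (magic, data, footer, 4-byte
little-endian footer length, magic) and that the file length is already
known to the reader. CountingStore, make_file, read_separately and
read_buffered are hypothetical names used only for illustration.

```python
import struct

MAGIC = b"PAR1"


class CountingStore:
    """Hypothetical in-memory stand-in for an object store; counts range GETs."""

    def __init__(self, blob):
        self.blob = blob
        self.gets = 0

    def get_range(self, start, end):
        self.gets += 1
        return self.blob[start:end]


def make_file(data, footer):
    # Simplified layout: magic, data, footer, footer length (LE uint32), magic.
    return MAGIC + data + footer + struct.pack("<I", len(footer)) + MAGIC


def read_separately(store, size):
    tail = store.get_range(size - 8, size)             # GET 1: length + magic
    (footer_len,) = struct.unpack("<I", tail[:4])
    footer_start = size - 8 - footer_len
    footer = store.get_range(footer_start, size - 8)   # GET 2: footer bytes
    data = store.get_range(4, footer_start)            # GET 3: actual data
    return data, footer


def read_buffered(store, size):
    buf = store.get_range(0, size)                     # single GET: whole file
    (footer_len,) = struct.unpack("<I", buf[-8:-4])
    footer_start = size - 8 - footer_len
    return buf[4:footer_start], buf[footer_start:size - 8]
```

Both functions return the same (data, footer) pair; only the number of
range requests differs.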

High-level implementation: https://github.com/apache/iceberg/pull/16729
A similar change in parquet-mr (along the lines of arrow-rs) could lead to a
much cleaner approach here: given a hint, the reader would fetch that many
bytes from the end of the file instead of only the last 8, so the footer may
not need an additional GET call.
For that reason I have started a discussion to figure out the best approach:
https://lists.apache.org/thread/yb8nom3w2zplb703m0p052kcc1wwotrr
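
To illustrate the hint idea, here is a rough sketch. It is neither the
parquet-mr nor the arrow-rs API. It assumes the footer is followed by a
4-byte little-endian footer length and a 4-byte magic, and that the file
length is known. CountingStore and read_footer are hypothetical names.

```python
import struct


class CountingStore:
    """Hypothetical in-memory stand-in for an object store; counts range GETs."""

    def __init__(self, blob):
        self.blob = blob
        self.gets = 0

    def get_range(self, start, end):
        self.gets += 1
        return self.blob[start:end]


def read_footer(store, size, hint=8):
    """Fetch the last `hint` bytes; issue a second GET only if the footer
    does not fit inside what was already fetched."""
    hint = max(8, min(hint, size))
    tail = store.get_range(size - hint, size)
    (footer_len,) = struct.unpack("<I", tail[-8:-4])
    if footer_len + 8 <= len(tail):
        # Footer is already in the fetched tail: no extra GET.
        end = len(tail) - 8
        return tail[end - footer_len:end]
    # Footer is larger than the hint: fall back to a second GET.
    return store.get_range(size - 8 - footer_len, size - 8)
```

With the default hint of 8 bytes the footer always costs two GETs (unless
it is empty); with a hint at least as large as the footer plus 8 bytes it
costs one.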

I would appreciate input and feedback there.
Thanks,
--
Lakhyani Varun
