Hi,

The C++ Parquet implementation in Apache Arrow (parquet-cpp) has supported the Page Index since 13.0.0, and SizeStatistics support was added more recently in 19.0.0. Both features are currently disabled by default.

We ran a benchmark, and the results show that we can enable both by default with acceptable penalties. I have therefore opened a PR [1] to turn them on by default. The benchmark results are also available in that PR.

Any feedback is welcome. If there are no objections, we will merge this PR and release it with Apache Arrow 20.0.0.

[1] https://github.com/apache/arrow/pull/45249

Best,
Gang