Hi Renato,

Sounds like a useful feature to have (being able to inspect data page metadata without decoding all the data inside). You'll need to propose the change and submit a patch to Apache Parquet.
Speaking of which, we're having a discussion on the Arrow and Parquet mailing lists about easing the Parquet-related development process for both communities:

https://lists.apache.org/thread.html/4bc135b4e933b959602df48bc3d5978ab7a4299d83d4295da9f498ac@%3Cdev.parquet.apache.org%3E

- Wes

On Mon, Jul 30, 2018 at 12:02 PM, Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com> wrote:
> Hi Arrow devs,
>
> I am trying to separate reading only pageHeaders from reading
> (reading+uncompressing+serializing) its entire content.
> The current SerializedPageReader::NextPage() does both things at the same
> time.
> I tried importing format::PageHeader into a separate project linking
> against a build of parquet-cpp, but I can't, I guess it is because it is
> not exported, right?
> Any suggestions/pointers/ideas are highly appreciated!
> Thanks!
>
>
> Renato M.
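
To make the header-only read concrete, here is a minimal sketch of the idea on a toy page layout. It does not use parquet-cpp at all: ToyPageHeader, AppendToyPage and ReadToyPageHeaders are hypothetical names, and the fixed-size, host-endian header is an assumption made for illustration. Real Parquet page headers are Thrift-encoded and variable-length, so an actual patch would still have to deserialize each header, but it could then use the header's compressed_page_size to skip the payload in the same way.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy header. The real format::PageHeader is a Thrift struct
// with more fields and a variable-length encoding.
struct ToyPageHeader {
  uint32_t num_values;
  uint32_t compressed_page_size;
};

// Append one toy page (fixed-size header followed by an opaque payload).
void AppendToyPage(std::string* out, uint32_t num_values,
                   const std::string& payload) {
  ToyPageHeader h{num_values, static_cast<uint32_t>(payload.size())};
  out->append(reinterpret_cast<const char*>(&h), sizeof(h));
  out->append(payload);
}

// Collect every header, seeking past each payload instead of reading it.
std::vector<ToyPageHeader> ReadToyPageHeaders(std::istream& in) {
  std::vector<ToyPageHeader> headers;
  ToyPageHeader h;
  while (in.read(reinterpret_cast<char*>(&h), sizeof(h))) {
    headers.push_back(h);
    // Skip the payload; nothing is decompressed or decoded here.
    in.seekg(h.compressed_page_size, std::ios::cur);
  }
  return headers;
}
```

Calling ReadToyPageHeaders on a std::istringstream built with AppendToyPage returns one entry per page, and the payload bytes are never copied out of the stream.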