Hi Hatem,

I agree that the levels should be included, as per the specification. I also checked the implementation in parquet-mr, and it includes the levels in both the uncompressed and the compressed page sizes.
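To make the arithmetic concrete, here is a minimal sketch of that interpretation. It assumes a DataPageV2 where the repetition and definition levels are stored uncompressed and only the values are compressed. The function and field names are hypothetical, chosen for illustration; they are not the parquet-mr or Arrow APIs.

```python
from dataclasses import dataclass


@dataclass
class V2PageSizes:
    """Hypothetical container mirroring the two size fields of the page header."""
    uncompressed_page_size: int
    compressed_page_size: int


def v2_page_sizes(rep_levels_len: int,
                  def_levels_len: int,
                  values_uncompressed_len: int,
                  values_compressed_len: int) -> V2PageSizes:
    """Compute both page sizes under the 'levels are included' interpretation.

    The level bytes are never compressed in V2, so they contribute the same
    number of bytes to both sizes; only the values section differs.
    """
    levels_len = rep_levels_len + def_levels_len
    return V2PageSizes(
        uncompressed_page_size=levels_len + values_uncompressed_len,
        compressed_page_size=levels_len + values_compressed_len,
    )


# Example: 10 bytes of repetition levels, 20 bytes of definition levels,
# 1000 bytes of values that compress down to 400 bytes.
sizes = v2_page_sizes(10, 20, 1000, 400)
print(sizes)
```

Under this reading, a reader would subtract the two level lengths from compressed_page_size to find how many bytes to hand to the decompressor.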
Cheers,
Gabor

On Wed, Mar 25, 2020 at 1:02 PM Hatem Helal <[email protected]> wrote:

> I've recently done some work on adding support for DataPageV2 to the cpp
> code base [1]. A question came up if the uncompressed_page_size includes
> the levels which are not compressed in the V2 format anyway.
>
> My understanding of the thrift specification [2] is that the levels are
> included in this size. Can someone help confirm whether this
> interpretation is correct?
>
> Thanks,
>
> Hatem
>
> [1] https://github.com/apache/arrow/pull/6481
> [2] https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L623
