Thanks Gabor, that is very helpful to know.

Best wishes,
Hatem

On Wed, Mar 25, 2020 at 2:15 PM Gabor Szadovszky <[email protected]> wrote:

> Hi Hatem,
>
> I agree that the levels shall be included as per the specification. I
> checked the implementation in parquet-mr as well and it also includes the
> levels in both uncompressed and compressed values.
>
> Cheers,
> Gabor
>
> On Wed, Mar 25, 2020 at 1:02 PM Hatem Helal <[email protected]> wrote:
>
> > I've recently done some work on adding support for DataPageV2 to the cpp
> > code base [1]. A question came up if the uncompressed_page_size includes
> > the levels which are not compressed in the V2 format anyway.
> >
> > My understanding of the thrift specification [2] is that the levels are
> > included in this size. Can someone help confirm whether this
> > interpretation is correct?
> >
> > Thanks,
> >
> > Hatem
> >
> > [1] https://github.com/apache/arrow/pull/6481
> > [2]
> > https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L623
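
For anyone finding this thread later, here is a rough sketch of the size accounting as I understand it from the discussion above. It is not the parquet-cpp or parquet-mr code. The function name `data_page_v2_sizes` and the dataclass are hypothetical, and zlib only stands in for whichever codec the column actually uses. The field names mirror those in parquet.thrift.

```python
import zlib
from dataclasses import dataclass


@dataclass
class DataPageV2Sizes:
    repetition_levels_byte_length: int
    definition_levels_byte_length: int
    uncompressed_page_size: int
    compressed_page_size: int


def data_page_v2_sizes(rep_levels: bytes, def_levels: bytes, values: bytes) -> DataPageV2Sizes:
    """Hypothetical helper: compute the header sizes for one V2 data page."""
    # Only the values section goes through the codec in the V2 layout;
    # zlib is an assumption standing in for the real column codec.
    compressed_values = zlib.compress(values)

    # The levels are written as-is, so their byte length is counted
    # once in each of the two page sizes.
    levels_size = len(rep_levels) + len(def_levels)

    return DataPageV2Sizes(
        repetition_levels_byte_length=len(rep_levels),
        definition_levels_byte_length=len(def_levels),
        uncompressed_page_size=levels_size + len(values),
        compressed_page_size=levels_size + len(compressed_values),
    )
```

A reader can then locate the values section by subtracting the two level lengths from either page size, which is why both sizes need to count the levels consistently.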
