Thanks Gabor, that is very helpful to know.

Best wishes,

Hatem

On Wed, Mar 25, 2020 at 2:15 PM Gabor Szadovszky
<[email protected]> wrote:

> Hi Hatem,
>
> I agree that the levels should be included as per the specification. I
> checked the implementation in parquet-mr as well, and it also includes the
> levels in both the uncompressed and compressed page sizes.
>
> Cheers,
> Gabor
>
> On Wed, Mar 25, 2020 at 1:02 PM Hatem Helal <[email protected]> wrote:
>
> > I've recently done some work on adding support for DataPageV2 to the cpp
> > code base [1].  A question came up as to whether uncompressed_page_size
> > includes the levels, which are not compressed in the V2 format anyway.
> >
> > My understanding of the thrift specification [2] is that the levels are
> > included in this size.  Can someone help confirm whether this
> > interpretation is correct?
> >
> > Thanks,
> >
> > Hatem
> >
> > [1] https://github.com/apache/arrow/pull/6481
> > [2] https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L623
> >
>
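
For readers skimming the thread, the size accounting agreed on above can be
sketched roughly as follows. This is only an illustration, not the parquet-cpp
or parquet-mr code: the function name data_page_v2_sizes is hypothetical, and
zlib stands in for whatever codec the column actually uses. The point is that
the repetition and definition level bytes are stored uncompressed in a V2 data
page, yet they count toward both uncompressed_page_size and
compressed_page_size.

```python
import zlib


def data_page_v2_sizes(rep_levels: bytes, def_levels: bytes, values: bytes):
    """Return (uncompressed_page_size, compressed_page_size) for a V2 data page.

    Hypothetical helper for illustration only. Levels are written as-is in
    DataPageV2; only the values section is passed through the codec.
    """
    # Level bytes are never compressed in V2, so they contribute the same
    # number of bytes to both sizes.
    levels_len = len(rep_levels) + len(def_levels)

    # zlib is an assumption here, standing in for the column's real codec.
    compressed_values = zlib.compress(values)

    uncompressed_page_size = levels_len + len(values)
    compressed_page_size = levels_len + len(compressed_values)
    return uncompressed_page_size, compressed_page_size


# Example: 4 bytes of repetition levels, 6 bytes of definition levels,
# and 300 bytes of highly repetitive values.
rep = b"\x00" * 4
dfl = b"\x01" * 6
vals = b"abc" * 100
uncompressed, compressed = data_page_v2_sizes(rep, dfl, vals)
print(uncompressed, compressed)
```

Under this accounting, a reader can locate the compressed values section by
skipping the combined level byte lengths from the start of the page body, and
the remaining compressed_page_size minus those lengths is the compressed
values length.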
