Re: Interpretation of PageHeader uncompressed_page_size

2020-03-26 Thread Hatem Helal
Thanks Gabor, that is very helpful to know.

Best wishes,
Hatem

On Wed, Mar 25, 2020 at 2:15 PM Gabor Szadovszky wrote:
> Hi Hatem,
>
> I agree that the levels shall be included as per the specification. I
> checked the implementation in parquet-mr as well and it also includes the
> levels in

Re: Interpretation of PageHeader uncompressed_page_size

2020-03-25 Thread Gabor Szadovszky
Hi Hatem,

I agree that the levels shall be included as per the specification. I checked the implementation in parquet-mr as well and it also includes the levels in both uncompressed and compressed values.

Cheers,
Gabor

On Wed, Mar 25, 2020 at 1:02 PM Hatem Helal wrote:
> I've recently done

Interpretation of PageHeader uncompressed_page_size

2020-03-25 Thread Hatem Helal
I've recently done some work on adding support for DataPageV2 to the cpp code base [1]. The question came up of whether uncompressed_page_size includes the levels, which are not compressed in the V2 format anyway. My understanding of the Thrift specification [2] is that the levels are included in this
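
For readers following the thread, here is a minimal Python sketch of the arithmetic implied by the reading agreed on above, namely that both uncompressed_page_size and compressed_page_size count the level bytes in a DataPageV2. The class DataPageV2Sizes and the function values_section_sizes are hypothetical names made up for illustration; they are not part of parquet-cpp or parquet-mr. Only the four field names are taken from the Thrift definition.

```python
from dataclasses import dataclass


@dataclass
class DataPageV2Sizes:
    """Hypothetical holder for the size fields of a V2 data page.

    Field names mirror the Thrift PageHeader / DataPageHeaderV2 fields;
    the class itself is an illustration, not a real library type.
    """
    uncompressed_page_size: int
    compressed_page_size: int
    repetition_levels_byte_length: int
    definition_levels_byte_length: int


def values_section_sizes(sizes: DataPageV2Sizes) -> tuple[int, int]:
    """Return (compressed, uncompressed) byte counts of the values section.

    Assumes, as discussed in this thread, that both page sizes include the
    repetition and definition levels, which are stored uncompressed in V2.
    """
    levels = (sizes.repetition_levels_byte_length
              + sizes.definition_levels_byte_length)
    if levels > sizes.compressed_page_size or levels > sizes.uncompressed_page_size:
        raise ValueError("level bytes exceed the declared page size")
    # Subtract the level bytes from each total to isolate the values section.
    return (sizes.compressed_page_size - levels,
            sizes.uncompressed_page_size - levels)
```

Under this assumption, a reader would hand only the first number of bytes to the decompressor and expect the second number of bytes back, with the level bytes read directly from the front of the page.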