I have a similar question and am interested in finding out more.
Do we know which query engines can create Parquet files with DataPageHeaderV2
(either by default or through an option)?
Which query engines support reading them, and which do not?
> On Oct 18, 2016, at 6:07 AM, Zivkovic, Vladimir
> <vladimir.zivko...@capitalone.com> wrote:
> We have started using the Parquet file format in one of our projects to store
> data efficiently in S3 and later query it in Redshift.
> How do we make sure we use the latest version of Parquet, and how do we know
> which version is supported by different query engines (e.g. Presto, Redshift)?
> Is this a good place to track versions?
> Judging by this video, version 2.x of the Parquet format seems to be in
> progress? https://www.youtube.com/watch?v=MZNjmfx4LMc
> Also, how can we know which version of Parquet is supported by Spark 2.0,
> and is it possible to change the Parquet version in Spark (and if so, how)?
> And how would we know whether different versions of Parquet will work with
> different query engines?
> Please refer me to a guide or blog post, if you have one, that covers any of
> these questions.
> Thank you,
> Data Manager