Ok, thanks! Is there any source of information about the differences between v1 and v2, and the benefits of v2?
Btw, I tested v2 and it failed for me on delta-encoded columns when invoking the skip() API: https://issues.apache.org/jira/browse/PARQUET-623

cheers
Johannes

> On 20 May 2016, at 22:05, Ryan Blue <[email protected]> wrote:
>
> Johannes,
>
> Parquet v1 is well-defined and closed, while v2 is still evolving and not
> yet final. Parquet provides forward-compatibility within a format version,
> so once v2 is finalized and released you will be able to read any v2 file
> with the initial reader, even if it is written by a later version of the
> library. Because v2 is still evolving, the risk is that we may add
> something to v2 that can't be read with current readers. Other than that
> concern, you should be able to use it now.
>
> rb
>
> On Thu, May 19, 2016 at 10:29 AM, Johannes Zillmann <
> [email protected]> wrote:
>
>> Hey guys,
>>
>> I started digging into the Parquet universe… the Java version… 1.8.1 and
>> 2.3.1 format...
>>
>> Saw that there are two different page representations, v1 & v2.
>> org.apache.parquet.hadoop.ParquetWriter is using v1 by default.
>>
>> So my question is…
>> What's the difference, and is v2 safe to use?
>>
>> best
>> Johannes
>
> --
> Ryan Blue
> Software Engineer
> Netflix
