Ok, thanks!

Is there any source of information on the differences between v1 and v2, and 
on the benefits of v2?

By the way, I tested v2 and it failed for me on delta-encoded columns when 
invoking the skip() API: https://issues.apache.org/jira/browse/PARQUET-623
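
To illustrate why skip() is a code path of its own on delta-encoded data, 
here is a toy sketch in plain Java. It is not parquet-mr code, all class and 
method names are made up for illustration, and it makes no claim about the 
actual cause of PARQUET-623. It only shows that a delta reader has to keep 
its running value up to date even for values it skips:

```java
// Toy illustration only: NOT parquet-mr code. All names here are hypothetical.
public class DeltaSkipSketch {

    /** Encodes values as the first value followed by neighbour differences. */
    static long[] encode(long[] values) {
        long[] deltas = new long[values.length];
        long previous = 0;
        for (int i = 0; i < values.length; i++) {
            deltas[i] = values[i] - previous;
            previous = values[i];
        }
        return deltas;
    }

    /** Sequential reader over delta-encoded data. */
    static final class Reader {
        private final long[] deltas;
        private int position = 0;
        private long previous = 0;

        Reader(long[] deltas) {
            this.deltas = deltas;
        }

        /** Returns the next decoded value. */
        long read() {
            previous += deltas[position++];
            return previous;
        }

        /** Correct skip: the delta is still folded into the running value. */
        void skip() {
            previous += deltas[position++];
        }

        /** Broken skip: advances the position but drops the delta. */
        void brokenSkip() {
            position++;
        }
    }

    public static void main(String[] args) {
        long[] deltas = encode(new long[] {10, 12, 15, 19});

        Reader good = new Reader(deltas);
        good.read();
        good.skip();
        System.out.println(good.read()); // 15, the real third value

        Reader bad = new Reader(deltas);
        bad.read();
        bad.brokenSkip();
        System.out.println(bad.read()); // 13, because the skipped delta was lost
    }
}
```

With plain (non-delta) encoding a skip can simply advance the position, which 
is why this kind of problem would only show up on delta-encoded columns.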

cheers
Johannes

> On 20 May 2016, at 22:05, Ryan Blue <[email protected]> wrote:
> 
> Johannes,
> 
> Parquet v1 is well-defined and closed, while v2 is still evolving and not
> yet final. Parquet provides forward-compatibility within a format version,
> so once v2 is finalized and released you will be able to read any v2 file
> with the initial reader, even if it is written by a later version of the
> library. Because v2 is still evolving, the risk is that we may add
> something to v2 that can't be read with current readers. Other than that
> concern, you should be able to use it now.
> 
> rb
> 
> On Thu, May 19, 2016 at 10:29 AM, Johannes Zillmann <
> [email protected]> wrote:
> 
>> Hey guys,
>> 
>> I started digging into the Parquet universe: the Java version, 1.8.1, and
>> the 2.3.1 format.
>> 
>> Saw that there are two different page representations, v1 & v2.
>> org.apache.parquet.hadoop.ParquetWriter is using v1 by default.
>> 
>> So my question is:
>> What's the difference, and is v2 safe to use?
>> 
>> best
>> Johannes
> 
> 
> 
> 
> -- 
> Ryan Blue
> Software Engineer
> Netflix
