Hi Benjamin,

Several people were on vacation over the holidays, which is why the
response on the dev@ list was slow. The issue you're reporting is not a
bug, but you may be using a different Parquet encoding version than the
one you expect.

Currently, Parquet has two encoding versions, PARQUET_1_0 and PARQUET_2_0.
PARQUET_2_0 is an experimental feature in which different encodings are
applied per column type, such as the ones you mention and those described
in https://github.com/apache/parquet-format/blob/master/Encodings.md. Only
Parquet 2.x versions have PARQUET_2_0 enabled by default. Parquet 1.x
versions have PARQUET_1_0 enabled by default, but I think they should
still support PARQUET_2_0.
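
To illustrate why a delta-based encoding suits an epoch column, here is a
rough sketch in plain Java. It is not Parquet's implementation; the class
name, method names, and timestamps are made up for illustration. It only
mimics the idea described in Encodings.md: take the differences between
consecutive values, subtract the minimum delta, and bit-pack what is left.

```java
public class DeltaSketch {
    // Number of bits needed to store the largest value in the array.
    // Assumes all values are non-negative; returns 0 for an empty array.
    public static int bitWidth(long[] values) {
        long max = 0;
        for (long v : values) {
            max = Math.max(max, v);
        }
        return 64 - Long.numberOfLeadingZeros(max);
    }

    // Differences between consecutive values, shifted by the smallest
    // difference so that every result is non-negative.
    public static long[] deltasMinusMin(long[] values) {
        long[] deltas = new long[Math.max(0, values.length - 1)];
        long min = Long.MAX_VALUE;
        for (int i = 1; i < values.length; i++) {
            deltas[i - 1] = values[i] - values[i - 1];
            min = Math.min(min, deltas[i - 1]);
        }
        for (int i = 0; i < deltas.length; i++) {
            deltas[i] -= min;
        }
        return deltas;
    }

    public static void main(String[] args) {
        // Hypothetical epoch-millisecond timestamps from consecutive log lines.
        long[] epochMillis = {
            1451522040000L, 1451522040250L, 1451522041000L, 1451522041100L
        };
        System.out.println("bits per raw value: " + bitWidth(epochMillis));
        System.out.println("bits per delta:     "
            + bitWidth(deltasMinusMin(epochMillis)));
    }
}
```

With these sample values the raw timestamps need 41 bits each, while the
adjusted deltas fit in 10 bits, which is the kind of saving you would
expect from DELTA_BINARY_PACKED on a slowly increasing column.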

How are you writing your data to Parquet? Did you write your own
application, or are you using Hive, Impala, or something else?

On Tue, Jan 5, 2016 at 4:39 PM, Nong Li <[email protected]> wrote:

> Have we enabled the 2.0 encodings?
>
> On Wed, Dec 30, 2015 at 5:34 PM, Benjamin Anderson <[email protected]>
> wrote:
>
> > Hi there - I'm working on a small Parquet project and encountering
> > some surprising results with regard to encoding decisions.
> >
> > My dataset consists of ~1.5MM log lines parsed to an Avro schema and
> > written to a Parquet file via AvroParquetWriter. According to its log
> > output, Parquet is writing all int/long columns out with either
> > [BIT_PACKED, PLAIN] or [BIT_PACKED, PLAIN_DICTIONARY]. This surprised
> > me - at least one of those columns is an epoch value that should be
> > quite amenable to DELTA_BINARY_PACKED. What's the best way to
> > understand Parquet's encoding choices?
> >
> > Secondary question: Is DELTA_BINARY_PACKED supported for INT64
> > columns? The documentation[1] says it is, but the code[2] suggests
> > otherwise.
> >
> > Cheers,
> > --
> > b
> >
> > [1]: https://github.com/apache/parquet-format/blob/master/Encodings.md#delta-encoding-delta_binary_packed--5
> > [2]: https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/Encoding.java#L166-L168
> >
>
