Lars (Cloudera Impala): Zoltan proposal to get to a more stable release or
Qinghui, Benoit, Miguel, Justin (Criteo): Pull request. Parquet-proto.
Gidon (IBM): encryption JIRA. On track
Ryan (Netflix): getting 1.10 out
Zoltan (Cloudera): column index fixes from Gabor, ideas on list
Anna (Cloudera): Compatibility issues.
Compatibility issues and flags:
- Define standard flags for features that are supported or not:
- New Compression algorithms: Brotli, ZStandard, ...
- New Encodings (since v1): Delta-int, …
- Flags are standard across Parquet implementations, to limit usage of
features to a set supported by all components
- Define (a few) profiles with the sets of features supported for a
given version (1.0, 2.0, 3.0)
- These are goals for any implementation to support.
- To be discussed: optional features that can be ignored and don’t
prevent reading the file (e.g. bloom filters, page index)
- Zoltan to create a JIRA and a Google doc with a design proposal
- Criteo to validate and give +1:
- New feature needed:
- support for distinguishing an empty list from a null list.
- Criteo will create a JIRA and submit a new PR
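
To make the flags/profiles idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the flag names, the assignment of features to profiles, and the helper functions are hypothetical and are not part of parquet-format or any existing implementation. The notes only say that profiles per version (1.0, 2.0, 3.0) should be defined, not what goes in each one.

```python
from enum import Enum


class Feature(Enum):
    # Hypothetical flag names, one per feature mentioned in the notes.
    BROTLI = "compression:brotli"
    ZSTD = "compression:zstd"
    DELTA_INT = "encoding:delta-int"
    BLOOM_FILTER = "optional:bloom-filter"
    PAGE_INDEX = "optional:page-index"


# Optional features: a reader can ignore them and still read the file.
IGNORABLE = frozenset({Feature.BLOOM_FILTER, Feature.PAGE_INDEX})

# Illustrative profiles only. Which feature belongs to which version
# was not decided in the sync; this mapping is made up.
PROFILES = {
    "1.0": frozenset(),
    "2.0": frozenset({Feature.DELTA_INT}),
    "3.0": frozenset({Feature.DELTA_INT, Feature.BROTLI, Feature.ZSTD}),
}


def unsupported_features(file_features, profile):
    """Features used by a file that a reader limited to `profile` cannot handle."""
    supported = PROFILES[profile]
    return {f for f in file_features
            if f not in supported and f not in IGNORABLE}


def can_read(file_features, profile):
    """True if every non-ignorable feature of the file is in the profile."""
    return not unsupported_features(file_features, profile)
```

With this sketch, a writer restricted to a profile would refuse to emit any non-ignorable feature outside that profile, and a reader could call `can_read` on the features recorded for a file before attempting to decode it.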
Column indexes: (By Gabor) PR: https://github.com/apache/parquet-mr/pull/456
- Needs modification in parquet-format utils (not the thrift metadata)
=> new release
- first version writing into parquet-mr
- Ryan to review
- Ryan and Zoltan to follow up on making a parquet-format release
On Wed, Feb 14, 2018 at 9:02 AM, Julien Le Dem <julien.le...@wework.com>
> starting now on google hangout: