Notes:

Attendees:
- Julien (WeWork): proto, release
- Marcel: Iceberg
- Zoltan, Gabor, Anna (Cloudera): bug with null values.
  - https://issues.apache.org/jira/projects/PARQUET/issues/PARQUET-1222
  - https://issues.apache.org/jira/projects/PARQUET/issues/PARQUET-1217
- Lars, Zoltan Borok-nagy (Cloudera Impala): new way of merging changes after moving to gitbox.
- Deepak (Vertica): encryption in C++
- Benoit, Singhue (Criteo): protobuf, merging.
  - https://github.com/apache/parquet-mr/pull/411
  - PARQUET-968
- Chao (Uber): encryption, native Rust implementation.
- Gidon (IBM): encryption JIRA, status and next steps.

Protobuf:
- https://github.com/apache/parquet-mr/pull/411
- In use for a few weeks.
- Introduces a breaking change:
  - Empty maps become null maps.
- Will add a flag to avoid the compatibility break.

Rust:
- In development for 1 year.
- 2 contributors.
- Read implementation only for now.
- Want to contribute to the Parquet project.
- Plan to have Parquet-rust use Arrow-rust.
- Personal project.

Encryption: https://issues.apache.org/jira/browse/PARQUET-1178
- Needs review: https://github.com/apache/parquet-format/pull/84/files
- Chao: Hive tables use the Parquet format. Different engines (e.g. Presto) use the data, so security should be implemented at the node level.
- Deepak: make sure there are no incompatibility issues.
- Gidon: has been looking at the C++ implementation. Cross compatibility is working.
- Action:
  - Provide feedback on the PR and doc.
  - Gidon to share the Java implementation.
  - Deepak to take a look and provide the C++ point of view.

Bugs:
- PARQUET-1222: Handling of NaN and +0/-0:
  - 1: fix current behavior (ignore NaN and +0/-0 in stats)
  - 2: provide a better total ordering, including NaN etc.
- PARQUET-1217: if null_count is populated but min/max are not, old Parquet readers assume a default min/max of 0 for numeric columns.
  - Needs a fix in parquet-mr.
  - Old readers will have problems:
    - Possibly provide a 1.8.3 release with the bug fix for projects depending on an old version.
    - For example Spark: https://github.com/apache/spark/blob/34811e0b908449fd59bca476604612b1d200778d/pom.xml#L132
    - Will reach out to the Spark team to see if they can upgrade.

On Tue, Mar 13, 2018 at 10:01 AM, Julien Le Dem <julien.le...@gmail.com> wrote:
> https://meet.google.com/jpy-mump-ngc