Attendance/Agenda:
- Deepak (Vertica):
  - indexing discussion
- Wes (twosigma):
  - indexing discussion
  - parquet-cpp 1.1
- Marcel (Cloudera Impala):
  - Index proposal
  - sort order clarification went in
- Julien (Dremio):
  - indexing
  - protos
- Lukas (parquet-proto):
  - parquet-proto

Notes:
- parquet-proto:
  - 3 changes on the way:
    - issue with proto repeated fields, which often are not read by other integrations
    - add support for proto generic types (may break compatibility?)
    - schema evolution using ids in proto fields
  - Lukas to send JIRAs
  - would want to merge them soon and have a release
- Index proposal for improving point queries and range queries: https://docs.google.com/document/d/1sBACp8Lbutuj1Zxdowvsrlm8ku4BFxf8U_Do5K2wSO4/edit#
  - todo (Marcel): clarify mechanism to store OffsetIndex and ColumnIndex outside the footer (probably just before)
  - todo (Marcel): add other optional fields from statistics in ColumnIndex (min, max, null_count, distinct_count)
  - todo (everyone): iterate on the feedback
  - Impala prototype planned for June
- Logical types pull request: https://github.com/apache/parquet-format/pull/51/files
  - todo: give more feedback -- Julien
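
Illustration for the index proposal item: a minimal sketch of how per-page min/max values could let a reader skip pages for a range query. This is not the layout from the proposal document. `PageStats`, `pages_for_range`, and the offsets are hypothetical names and values chosen for the example; only the field names min, max, and null_count come from the notes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    """Hypothetical per-page index entry (not the actual ColumnIndex struct)."""
    min: int
    max: int
    null_count: int
    offset: int  # hypothetical byte offset of the page, the role an OffsetIndex would play


def pages_for_range(index: List[PageStats], lo: int, hi: int) -> List[int]:
    """Return positions of pages whose [min, max] interval overlaps [lo, hi]."""
    return [i for i, p in enumerate(index) if p.max >= lo and p.min <= hi]


# Example index for a column whose pages happen to cover disjoint value ranges.
index = [
    PageStats(min=0, max=99, null_count=0, offset=0),
    PageStats(min=100, max=199, null_count=3, offset=4096),
    PageStats(min=200, max=299, null_count=0, offset=8192),
]

# A range query on [150, 160] only needs to read the second page.
matching = pages_for_range(index, 150, 160)
```

A point query is the same check with lo == hi. Pages that are not returned can be skipped without being read or decoded.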