Notes:

Attendees and agenda building:
Ryan (Netflix):
 - new logical types representation
 - index proposal
Deepak (Vertica):
 - logical types for timestamps
Lars (Impala):
 - dummy ordering to test unknown ordering
 - implement new ordering in parquet-mr
Marcel (Impala):
 - index proposal
Uwe (Blue Yonder):
 - parquet cpp 1.1
Wes (twosigma):
 - parquet-cpp 1.1
 - indexing proposal
Zoltan (Cloudera - fileformats):
Julien (Dremio):
 - parquet-mr
 - indexing proposal: near footer of indexes.
 - new logical types

Discussion:
 - logical types: PARQUET-906 https://github.com/apache/parquet-format/pull/51
   - action: Marcel and Lars to give feedback
   - action: give feedback by next week
 - testing unknown ordering: https://github.com/apache/parquet-format/pull/53/files
   - discussed pros and cons of approaches. Lars will follow up on the JIRA/PR
 - parquet-cpp 1.1 release:
   - will include:
     - support for reading structs to arrow: (simple reader of one level structs)
     - support for windows
     - reading and writing of lists of lists: (handles empty lists)
     - move arrow dependency from 0.2 to 0.3
   - rc coming soon.
   - todo: make summary/release notes
 - index proposal: PARQUET-922
   - action Julien: open jira to implement footer reading optimization in parquet-mr
   - The new index metadata is before the footer to not impact regular scan read.
   - We will make pages stop on row boundaries when the index is present
   - add row_count to page v1
   - discussion: do we need compression?
     - to be addressed later. We should prototype something first
 - Deepak: open Jira for limiting stats size in parquet-cpp

On Wed, May 10, 2017 at 10:02 AM, Julien Le Dem <[email protected]> wrote:

> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
>
> --
> Julien
>

--
Julien
