Julien,
Can you please add me to the calendar invite for the sync-up meetings? Thanks.
On Thu, Oct 27, 2016 at 2:33 PM, Julien Le Dem <[email protected]> wrote:

> Attendees/Agenda
> Julien (Dremio):
>  - Parquet-format: arrow types parity.
>  - parquet-mr: Parquet-Arrow schema converter PR
> Ryan (Netflix):
>  - present New Parquet cli
>  - Parquet sort order proposal
> Gabor, Zoltan (Cloudera, file formats team):
>  - getting started
> Uwe (Blue Yonder):
>  - parquet-cpp getting close to release
>  - type changes with arrow discussion
>
> Parquet logical types:
>  - Julien proposed new logical types to bring parity with Arrow: Union,
>    Intervals types, Null, Half Precision floats
>    - TODO(Julien): add LogicalType doc for new types.
>  - Union:
>    - differentiate between null union and projecting another value using
>      the union's own optional fields.
>    - describe union type constraints.
>  - Null: type for things that are always null. For example data coming from
>    schema discovery on JSON with a field always null.
>  - Interval Type:
>    - uses actual SQL spec for interval units
>    - deprecate existing Interval logical type.
>  - Half precision float: punt on that for now.
>    - defined in Arrow metadata
>    - actually not implemented in arrow-cpp and arrow-java
>    - possibly add physical type for half precision types.
>    - add encodings? See Ryan’s PR for float encoding
>
>  - Uwe: TIMESTAMP_NANOS?
>    - used in Pandas
>    - used in Hive (through loosely defined Parquet’s int96)
>    - debate whether we should support it or not.
>    - Possibly have an int64 or fixed length byte array to store it.
>    - TODO(Uwe): open a JIRA, Ryan comment
>
> Parquet-cli:
>  - Ryan's new parquet-cli
>    - easier to try encodings.
>    - look at data.
>    - some code from the kite project in Apache 2.
>
> Parquet sort order:
>  - current proposal: to have 2 separate min and max in stats block
>  - Ryan: to create a Pull Request.
>  - how to formally specify sort order (comparator/collation)
>  - standard database collations? Look into Calcite?
>
> Parquet-cpp release?
>  - fix bugs.
>  - release JIRA.
>
> next sync up in two weeks.
>
> On Thu, Oct 27, 2016 at 9:59 AM, Julien Le Dem <[email protected]> wrote:
>
>> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
>>
>> --
>> Julien
>>
>
> --
> Julien

--
regards,
Deepak Majeti
