Thank you for the correction!

On Thursday, November 10, 2016, Ryan Blue <[email protected]> wrote:
> I have a slight correction for the Brotli encoding numbers. The 20% size
> decrease incurred a 2.5% increase in compression time (using brotli-5),
> while the 15% size decrease had a 12% encoding time *decrease* (using
> brotli-4). We've decided to use brotli-5 for tables that are read a lot,
> and brotli-4 for most other tables.
>
> On Thu, Nov 10, 2016 at 11:26 AM, Julien Le Dem <[email protected]> wrote:
>
> > Attendees/agenda:
> > Zoltan (Cloudera):
> > - Parquet tools questions
> > Piyush (Twitter):
> > - planning on encoding optimization
> > Uwe:
> > - release parquet-cpp
> > - license/notice questions
> > Wes (twosigma):
> > - working on arrow
> > - helping with the parquet-cpp release
> > Deepak (HP/Vertica):
> > - read/write parquet-cpp
> > - discuss. statistics PARQUET-686. timestamps/...
> > Ryan (Netflix):
> > - 1.9.0 release out.
> > - statistics
> > Julien (Dremio):
> > - Parquet-Arrow integration
> >
> > Notes:
> > Parquet-tools:
> > - when missing hadoop jars on the class path => bad error message
> > - 1.6 used to bundle hadoop
> > - 1.9 requires adding hadoop classpath
> > - Ryan has a new CLI tool
> >
> > Parquet cpp release:
> > - need to put mentions in NOTICE files
> > - merge script came from the Spark project (Apache 2 License)
> > - some code came from Impala (Apache 2 License)
> > - Need to track the files imported from Impala
> > - Wes to document.
> > - Zoltan to look into moving copyright to NOTICE
> >
> > Statistics:
> > - Revisit signed/unsigned stats approach
> > - instead add information on how the min/max was obtained. (Collation)
> > - collation should follow a standard. We’re going to implement only a
> > subset.
> > - JIRA PARQUET-686
> >
> > int96:
> > - deprecate write of int96 (Ryan to look into it)
> >
> > New Encodings/compression:
> > - brotli compression. => 20% decrease in size. 25% increase in encoding
> > time. other settings: 15%/12% (compared to gzip). Ryan to update the PR.
> > - need cpp integration as well. Uwe
> > - PARQUET-682: specify encoding per column. Piyush to update PR
> >
> > On Thu, Nov 10, 2016 at 10:00 AM, Julien Le Dem <[email protected]> wrote:
> >
> > > starting now
> > > https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
> > >
> > > On Thu, Nov 10, 2016 at 8:51 AM, Julien Le Dem <[email protected]> wrote:
> > >
> > >> Reminder that the Parquet Sync up will be in 1h at 10am PT on hangout:
> > >> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
> > >>
> > >> --
> > >> Julien
> > >
> > > --
> > > Julien
> >
> > --
> > Julien
>
> --
> Ryan Blue
> Software Engineer
> Netflix

--
Julien
