Thanks Julien -- is it possible to arrange for some advance notice of the date and time of the sync up (or a shared Google Calendar perhaps)?

On Thu, May 12, 2016 at 5:33 PM, Julien Le Dem <[email protected]> wrote:

> The next sync up will be around Strata London early June, where I'll happen
> to be. We will do in the morning Pacific time, evening Europe time.
>
> Notes from this sync:
>
> attendees:
> - Julien (Dremio)
> - Alex, Piyush (Twitter)
> - Ryan (Netflix)
>
>
> Parquet 2.0 encodings discussion:
>
> - Jira open to finalize encodings: PARQUET-588: 2.0 encodings finalization.
>
> - Ryan is doing experiments to measure efficiency on their data
>
> - Alex and Piyush are looking at encoding selection strategies: How to pick
> the best encoding for the data automatically
>
>
> 1.9 release:
>
> - last blocker: PARQUET-400 (readFully() behavior) needs update from
> Jason. Possibly Piyush could pick it up if Jason is busy
>
>
> Brotli integration.
>
> - Ryan has been working on Brotli compression algorithm integration
>
> - for similar compression cost as snappy, much better compression ratio
>
> - embeds native library similar to snappy integration
>
> - looking into possibly statically linking the native library
>
> - PR available on parquet-format and parquet-mr
>
>
> Vectorized read:
>
> - towards end of June we will organize a Parquet vectorized read hackathon
> for all parties interested (make yourself known if interested, we'll send
> more details later, possible remote participation through hangout)
>
>
> Lazy projections at runtime.
>
> - Alex has been looking into lazy thrift object for parquet-thrift to
> minimize assembly cost in scalding existing jobs that don't declare the
> columns they need.
>
>
> Next sync will be in the morning PT.
>
>
> On Thu, May 12, 2016 at 5:42 AM, Deepak Majeti <[email protected]>
> wrote:
>
>> I am sorry for missing this meeting as well.
>> My interest is also to improve parquet-cpp reader/writer performance.
>> I will work with Uwe and Wes on this.
>> My other interest is on supporting predicate pushdown. I will work on
>> this in parallel with performance.
>>
>> Thanks!
>>
>> On Thu, May 12, 2016 at 4:05 AM, Uwe Korn <[email protected]> wrote:
>> >
>> >> I'm sorry I wasn't able to join today again (traveling). We could
>> >> choose an early time Pacific time to make the meeting accessible to
>> >> both Asia and Europe -- I would suggest 8 or 9 AM Pacific
>> >>
>> > 8 or 9 am PT would work for me (CEST), 4pm PT is just not manageable.
>> > Also: Do we have a calendar where I can see in advance when sync ups are?
>> >
>> > Currently I'm working on the Parquet integration with Arrow and on building
>> > a Python interface for libarrow-parquet. Once we have a basic working
>> > version, I will look into implementing missing features in the writer and
>> > improving general read/write performance in parquet-cpp.
>> >
>> > Uwe
>> >
>> >>
>> >> http://timesched.pocoo.org/?date=2016-05-11&tz=pacific-standard-time!,de:berlin,cn:shanghai,us:new-york-city:ny
>> >>
>> >> I did not have much time for writing Parquet C++ development the last
>> >> 6 weeks, but plan to help Uwe complete the writer implementation and
>> >> work toward a more complete Apache Arrow integration (this is in
>> >> progress here:
>> >> https://github.com/apache/arrow/tree/master/cpp/src/arrow/parquet)
>> >>
>> >> Other items of immediate interest
>> >>
>> >> - C++ API to the file metadata (read + write)
>> >> - Conda packaging for built artifacts (to make parquet-cpp easier for
>> >> Python programmers to install portably when the time comes). I got
>> >> Thrift C++ into conda-forge this week so this should not be hard now
>> >> https://github.com/conda-forge/thrift-cpp-feedstock
>> >> - Expanding column scan benchmarks (thanks Uwe for kickstarting the
>> >> benchmarking effort!)
>> >> - Perf improvements for the RLE decoder
>> >>
>> >> Thanks
>> >> Wes
>> >>
>> >> On Wed, May 11, 2016 at 4:04 PM, Julien Le Dem <[email protected]> wrote:
>> >>>
>> >>> The actual hangout url is
>> >>> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
>> >>>
>> >>> On Wed, May 11, 2016 at 3:57 PM, Julien Le Dem <[email protected]> wrote:
>> >>>
>> >>>> starting in 5 mins:
>> >>>> https://plus.google.com/hangouts/_/event/parquet_sync_up
>> >>>>
>> >>>> On Wed, May 11, 2016 at 1:53 PM, Julien Le Dem <[email protected]>
>> >>>> wrote:
>> >>>>
>> >>>>> It is happening at 4pm PT on google hangout
>> >>>>> https://plus.google.com/hangouts/_/event/parquet_sync_up
>> >>>>>
>> >>>>> (we can do a different time next time, based on timezone preferences.
>> >>>>> Afternoon is better for Asia. Morning is better for Europe)
>> >>>>>
>> >>>>> --
>> >>>>> Julien
>> >>>>
>> >>>> --
>> >>>> Julien
>> >>>
>> >>> --
>> >>> Julien
>> >
>>
>> --
>> regards,
>> Deepak Majeti
>
> --
> Julien
