Thanks Julien -- is it possible to arrange for some advance notice of
the date and time of the sync up (or a shared google calendar
perhaps)?

On Thu, May 12, 2016 at 5:33 PM, Julien Le Dem <[email protected]> wrote:
> The next sync up will be around Strata London early June, where I'll happen
> to be. We will do it in the morning Pacific time, evening Europe time.
>
> Notes from this sync:
>
> attendees:
>  - Julien (Dremio)
>  - Alex, Piyush (Twitter)
>  - Ryan (Netflix)
>
>
>  Parquet 2.0 encodings discussion:
>
>  - Jira open to finalize encodings: PARQUET-588: 2.0 encodings finalization.
>
>  - Ryan is doing experiments to measure efficiency on their data
>
> - Alex and Piyush are looking at encoding selection strategies: How to pick
> the best encoding for the data automatically
>
>
> 1.9 release:
>
>  - last blocker: PARQUET-400 (readFully() behavior) needs update from
> Jason. Possibly Piyush could pick it up if Jason is busy
>
>
> Brotli integration.
>
> - Ryan has been working on Brotli compression algorithm integration
>
> - for similar compression cost as snappy, much better compression ratio
>
> - embeds native library similar to snappy integration
>
> - looking into possibly statically linking the native library
>
> - PR available on parquet-format and parquet-mr
>
>
> Vectorized read:
>
>  - towards end of June we will organize a Parquet vectorized read hackathon
> for all parties interested (make yourself known if interested, we'll send
> more details later, possible remote participation through hangout)
>
>
> Lazy projections at runtime.
>
>  - Alex has been looking into lazy thrift object for parquet-thrift to
> minimize assembly cost in scalding existing jobs that don't declare the
> columns they need.
>
>
> Next sync will be in the morning PT.
>
> On Thu, May 12, 2016 at 5:42 AM, Deepak Majeti <[email protected]>
> wrote:
>
>> I am sorry for missing this meeting as well.
>> My interest is also to improve parquet-cpp reader/writer performance.
>> I will work with Uwe and Wes on this.
>> My other interest is on supporting predicate pushdown.  I will work on
>> this in parallel with performance.
>>
>> Thanks!
>>
>> On Thu, May 12, 2016 at 4:05 AM, Uwe Korn <[email protected]> wrote:
>> >
>> >> I'm sorry I wasn't able to join today again (traveling). We could
>> >> choose an early time Pacific time to make the meeting accessible to
>> >> both Asia and Europe -- I would suggest 8 or 9 AM Pacific
>> >>
>> > 8 or 9 am PT would work for me (CEST); 4pm PT is just not manageable.
>> > Also: Do we have a calendar where I can see in advance when sync ups are?
>> >
>> > Currently I'm working on the Parquet integration with Arrow and on building
>> > a Python interface for libarrow-parquet. Once we have a basic working
>> > version, I will look into implementing missing features in the writer and
>> > improving general read/write performance in parquet-cpp.
>> >
>> > Uwe
>> >
>> >>
>> >> http://timesched.pocoo.org/?date=2016-05-11&tz=pacific-standard-time!,de:berlin,cn:shanghai,us:new-york-city:ny
>> >>
>> >> I did not have much time for Parquet C++ development over the last
>> >> 6 weeks, but plan to help Uwe complete the writer implementation and
>> >> work toward a more complete Apache Arrow integration (this is in
>> >> progress here:
>> >> https://github.com/apache/arrow/tree/master/cpp/src/arrow/parquet)
>> >>
>> >> Other items of immediate interest
>> >>
>> >> - C++ API to the file metadata (read + write)
>> >> - Conda packaging for built artifacts (to make parquet-cpp easier for
>> >> Python programmers to install portably when the time comes). I got
>> >> Thrift C++ into conda-forge this week so this should not be hard now
>> >> https://github.com/conda-forge/thrift-cpp-feedstock
>> >> - Expanding column scan benchmarks (thanks Uwe for kickstarting the
>> >> benchmarking effort!)
>> >> - Perf improvements for the RLE decoder
>> >>
>> >> Thanks
>> >> Wes
>> >>
>> >> On Wed, May 11, 2016 at 4:04 PM, Julien Le Dem <[email protected]> wrote:
>> >>>
>> >>> The actual hangout url is
>> >>> https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up
>> >>>
>> >>> On Wed, May 11, 2016 at 3:57 PM, Julien Le Dem <[email protected]> wrote:
>> >>>
>> >>>> starting in 5 mins:
>> >>>> https://plus.google.com/hangouts/_/event/parquet_sync_up
>> >>>>
>> >>>> On Wed, May 11, 2016 at 1:53 PM, Julien Le Dem <[email protected]>
>> >>>> wrote:
>> >>>>
>> >>>>> It is happening at 4pm PT on google hangout
>> >>>>> https://plus.google.com/hangouts/_/event/parquet_sync_up
>> >>>>>
>> >>>>> (we can do a different time next time, based on timezone preferences.
>> >>>>> Afternoon is better for Asia. Morning is better for Europe)
>> >>>>>
>> >>>>> --
>> >>>>> Julien
>> >>>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Julien
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> Julien
>> >
>> >
>>
>>
>>
>> --
>> regards,
>> Deepak Majeti
>>
>
>
>
> --
> Julien
