+1 Wednesday

On Wed, Jul 22, 2015 at 4:02 PM, Jason Altekruse <[email protected]>
wrote:

> +1 for wednesday
>
> On Wed, Jul 22, 2015 at 3:47 PM, Jacques Nadeau <[email protected]>
> wrote:
>
> > +1 for Wed.
> >
> > On Wed, Jul 22, 2015 at 3:45 PM, Alex Levenson <
> > [email protected]> wrote:
> >
> > > +1 for Wednesday
> > >
> > > > On Wed, Jul 22, 2015 at 3:44 PM, Julien Le Dem <[email protected]>
> > > > wrote:
> > >
> > > > Wednesday then?
> > > > no more conflicts?
> > > >
> > > > On Tue, Jul 21, 2015 at 7:26 PM, Alex Levenson <
> > > > [email protected]> wrote:
> > > >
> > > > > > Sorry to be difficult, but can I request any day other than Monday?
> > > > > > How about Wednesday?
> > > > >
> > > > > > On Tue, Jul 21, 2015 at 7:19 PM, Julien Le Dem <[email protected]>
> > > > > > wrote:
> > > > >
> > > > > > There's no particular reason for Tuesdays.
> > > > > > We could do the next one on a Monday.
> > > > > > > Any objections?
> > > > > >
> > > > > > Julien
> > > > > >
> > > > > > > > On Jul 21, 2015, at 17:37, Jacques Nadeau <[email protected]>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > Any chance we can have these on either a different day or time?
> > > > > > > > The Drill hangout is every Tuesday at 10am so I always have to
> > > > > > > > pick one or the other.
> > > > > > >
> > > > > > > On Tue, Jul 21, 2015 at 10:56 AM, Nezih Yigitbasi <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > >> An update to "actions", I will create a PR for the vectorized read
> > > > > > > >> instead of Zhenxiao.
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Nezih
> > > > > > >>
> > > > > > > >> On Tue, Jul 21, 2015 at 10:51 AM, Julien Le Dem <[email protected]>
> > > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> Agenda
> > > > > > >>> - Julien (Twitter):
> > > > > > >>>   - interested in ByteBuffer status
> > > > > > > >>> - Ryan (by email): interested in ByteBuffer status. Did some work on
> > > > > > > >>> bloom filters.
> > > > > > > >>> PARQUET-251 and PARQUET-246 make sure 2.0 encodings and other new
> > > > > > > >>> features are solid.
> > > > > > >>> - Daniel, Nezih, Zhengxiao (Netflix):
> > > > > > > >>>    - update on Vectorized read path for Presto (Dong Chen for Hive)
> > > > > > >>>    - Parquet-99: OOM on write
> > > > > > >>> - Ippokratis: Impala team.
> > > > > > >>> - Jason Altekruse: (Drill/MapR)
> > > > > > > >>>   - update on Java direct memory representation (hadoop 2.0 ByteBuffer)
> > > > > > >>>   - currently uses a fork of Parquet that uses the GSOC work.
> > > > > > >>> - Tianshuo: 1.8.1 release.
> > > > > > >>> - Sanjeev (Twitter):
> > > > > > >>>  - want to hear updates about vectorized in Presto
> > > > > > >>>
> > > > > > >>> actions:
> > > > > > >>>  - Zhengxiao: update vectorization PR
> > > > > > >>>  - Jason: update ByteBuffer PR
> > > > > > > >>>  - Jason: open JIRA for dictionary encoding fallback pointer
> > > > > > >>>  - Daniel: opened a PR for PARQUET-99: up for review
> > > > > > >>>
> > > > > > >>> Notes:
> > > > > > > >>> - Vectorized read path for Presto (Dong Chen for Hive) PARQUET-131
> > > > > > >>>       - batch read
> > > > > > >>>       - lazy materialization
> > > > > > > >>>       - Netflix integrated with Presto, Dong Chen integrated with Hive
> > > > > > >>>       - Nezih: micro/macro benchmark
> > > > > > >>>            - micro 2 read paths
> > > > > > > >>>                  - only primitives, no converters (3x faster with vectorized)
> > > > > > > >>>                  - complex with converters (no performance difference)
> > > > > > >>>            - macro Presto :
> > > > > > >>>                  - complex types not better
> > > > > > >>>                  - 2x better for primitive types
> > > > > > > >>>       - Daniel: projection + predicate well optimized with Presto (lazy
> > > > > > > >>> load, lazy materialization). Predicate push down and using the
> > > > > > > >>> dictionary in predicate evaluation.
> > > > > > > >>>       - Ippokratis: fan out? => 100 values per collection, list/map
> > > > > > > >>> materialization expensive
> > > > > > >>>
> > > > > > > >>> - Dictionary encoding: because of the fallback mechanism, we don't know
> > > > > > > >>> when the dictionary ends. => Jason to open a JIRA
> > > > > > >>>
> > > > > > >>> - Parquet-99: OOM on write
> > > > > > > >>>   - all big rows: (10MB per row) runs OOM before we first check
> > > > > > > >>>   - big variability in size: small initial rows throw off estimate and
> > > > > > > >>> following big rows blow memory
> > > > > > >>>   - add settings for checking at constant #rows.
> > > > > > >>>   - we should experiment with simpler strategies
> > > > > > >>>
> > > > > > >>> - ByteBuffer status:
> > > > > > > >>>   - Jason needs to rebase the PR
> > > > > > >>>   - Parquet-77
> > > > > > >>>
> > > > > > >>>
> > > > > > > >>> On Tue, Jul 21, 2015 at 10:05 AM, Julien Le Dem <[email protected]>
> > > > > > > >>> wrote:
> > > > > > >>>
> > > > > > >>>> It's happening now:
> > > > > > >>>>
> > > > > > > >>>> https://plus.google.com/hangouts/_/twitter.com/parquet-sync-up
> > > > > > >>>>
> > > > > > > >>>> On Tue, Jul 14, 2015 at 10:04 AM, Julien Le Dem <[email protected]>
> > > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > > >>>>> The next Parquet sync up will be held on Google Hangout on 7/21/2015
> > > > > > > >>>>> at 10 am PST
> > > > > > > >>>>>
> > > > > > > >>>>> https://plus.google.com/hangouts/_/twitter.com/parquet-sync-up
> > > > > > >>
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Alex Levenson
> > > > > @THISWILLWORK
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Alex Levenson
> > > @THISWILLWORK
> > >
> >
>
