Cool, thanks Uwe. Will try using it within the coming days.

Regards,
Keith.

http://keith-chapman.com

On Tue, Jan 17, 2017 at 11:44 PM, Uwe L. Korn <[email protected]> wrote:

> Hi Keith,
>
> just a small heads up: the pull request for the read path is merged. I'm
> currently looking into removing all those copies in the write path as well.
>
> Cheers
> Uwe
>
> On Fri, Jan 13, 2017, at 02:20 AM, Keith Chapman wrote:
> > Cool, thanks for the update Wes. I was wondering if there was some design
> > issue I was not aware of :). I will keep my eyes on the PR and look to make
> > more optimizations and upstream them.
> >
> > Regards,
> > Keith.
> >
> > http://keith-chapman.com
> >
> > On Thu, Jan 12, 2017 at 5:15 PM, Wes McKinney <[email protected]>
> > wrote:
> >
> > > hi Keith
> > >
> > > Uwe is working on this right now (avoiding the extra copy):
> > >
> > > https://github.com/apache/parquet-cpp/pull/218
> > >
> > > We would appreciate any efforts to further optimize these code paths.
> > >
> > > Thanks
> > > Wes
> > >
> > > On Thu, Jan 12, 2017 at 7:21 PM, Keith Chapman <[email protected]>
> > > wrote:
> > > > Hi,
> > > >
> > > > I'm using the parquet-cpp library to read in some parquet files. I saw
> > > > that the parquet-cpp library has support for arrow and hence I thought of
> > > > giving it a shot. When running experiments I did not see any significant
> > > > increase in performance, so I took a look at the code. It looks to me
> > > > like the arrow reader uses an intermediate buffer to store the data and
> > > > hence does an extra copy. Is this because of the mismatch in data types
> > > > between parquet and arrow? I'm specifically referring to the
> > > > FlatColumnReader::Impl::ReadNullableFlatBatch method in [1] (line 276).
> > > > Also, I would imagine that setting one bit at a time would be inefficient;
> > > > I'm not sure the compiler would be smart enough to set a word at a time
> > > > (I doubt it though). Just wondering if there was a reason behind having
> > > > the code as it is.
> > > >
> > > > [1]
> > > > https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/reader.cc
> > > >
> > > >
> > > > Regards,
> > > > Keith.
> > > >
> > > > http://keith-chapman.com
> > >
>
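
For reference, the bit-at-a-time concern raised in the original message can be sketched roughly as below. This is only an illustration and not the parquet-cpp code: the function names BitmapBitAtATime and BitmapByteAtATime are hypothetical, and it assumes a flat nullable column where a value is present when its definition level equals the maximum definition level, with validity bits stored least-significant-bit first. The first version does a read-modify-write on the output byte for every value; the second accumulates eight flags in a local variable and stores each output byte once.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sets one validity bit per iteration, touching the
// output byte in memory each time a value is present.
std::vector<uint8_t> BitmapBitAtATime(const std::vector<int16_t>& def_levels,
                                      int16_t max_def_level) {
  std::vector<uint8_t> bitmap((def_levels.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < def_levels.size(); ++i) {
    if (def_levels[i] == max_def_level) {
      bitmap[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return bitmap;
}

// Hypothetical helper: packs up to eight flags into a local accumulator and
// writes each output byte exactly once.
std::vector<uint8_t> BitmapByteAtATime(const std::vector<int16_t>& def_levels,
                                       int16_t max_def_level) {
  const std::size_t n = def_levels.size();
  std::vector<uint8_t> bitmap((n + 7) / 8, 0);
  std::size_t i = 0;
  for (std::size_t byte = 0; byte < bitmap.size(); ++byte) {
    uint8_t acc = 0;
    for (int bit = 0; bit < 8 && i < n; ++bit, ++i) {
      // The comparison yields 0 or 1, which is shifted into position.
      acc |= static_cast<uint8_t>((def_levels[i] == max_def_level) << bit);
    }
    bitmap[byte] = acc;
  }
  return bitmap;
}
```

Both functions produce the same bitmap; whether the second is measurably faster depends on the compiler and the data, so it would need benchmarking before drawing conclusions.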
