We've had a lot of discussion about this in the Iceberg community as well,
since Parquet to Arrow is going to be the easiest path to vectorized reads
for Spark. It would be great to have people working on it!

On Wed, Dec 12, 2018 at 7:38 AM Wes McKinney <[email protected]> wrote:

> hi Masayuki -- this is great to hear. Since this software was not
> developed in the Apache Parquet community we may need to be careful about
> IP lineage / transfer issues if you do open a pull request.
>
> - Wes
> On Wed, Dec 12, 2018 at 9:23 AM Masayuki Takahashi
> <[email protected]> wrote:
> >
> > Hi,
> >
> > I am developing a simple converter from Parquet to Arrow.
> >
> > https://github.com/masayuki038/parquet-to-arrow
> >
> > If no one has started on this yet, may I create a JIRA and pull request
> > for the converter from Parquet to Arrow?
> >
> > I would also like to develop a converter from Arrow to Parquet and some
> > features (like the Dremio implementation).
> >
> > thanks.
> >
> >
> > On Wed, Dec 12, 2018 at 23:49, Wes McKinney <[email protected]> wrote:
> > >
> > > hi Yurui,
> > >
> > > This has been discussed over the last 3 years, but I haven't seen anyone
> > > step up to begin to work on this yet. Having vectorized Arrow read and
> > > write in a reusable Java library would be very useful (it has proven
> > > popular in C++). We welcome your contributions.
> > >
> > > - Wes
> > > > On Tue, Dec 11, 2018 at 9:34 PM Yurui Zhou <[email protected]> wrote:
> > > >
> > > > Hello
> > > >
> > > > > I just learned that Arrow now provides a native reader/writer
> > > > > implementation in C++ that lets users read a Parquet file directly
> > > > > into an Arrow buffer and write a Parquet file from an Arrow buffer.
> > > >
> > > > > I am wondering whether there is any plan to provide the same
> > > > > support on the Java side.
> > > >
> > > > > I found an implementation in the Dremio codebase that provides the
> > > > > Arrow support mentioned above.
> > > > > https://github.com/dremio/dremio-oss/tree/master/sabot/kernel/src/main/java/com/dremio/exec/store/parquet
> > > >
> > > > > Does the Parquet community or Arrow community have any plan to
> > > > > integrate this into the Parquet codebase or implement a new version
> > > > > from scratch?
> > > >
> > > > Thanks
> > > > Yurui
> >
> >
> >
> > --
> > 高橋 真之
>


-- 
Ryan Blue
Software Engineer
Netflix
