Yup, that's great. I will update the PR when I'm back from vacation.
Regards
JB
On 20 Apr 2018 at 02:26, Eugene Kirpichov wrote:
Very cool! JB, time to update your PR?
On Thu, Apr 19, 2018 at 9:17 AM Alexey Romanenko
wrote:
FYI: Apache Parquet 1.10.0 was released recently.
It contains org.apache.parquet.io.OutputFile and an updated
org.apache.parquet.hadoop.ParquetFileWriter.
WBR,
Alexey
On 14 Feb 2018, at 20:10, Jean-Baptiste Onofré wrote:
Great!!
In the meantime, I started a PoC directly on top of parquet-common to see if I
can implement a BeamParquetReader and a BeamParquetWriter.
I might also propose some PRs.
I will continue on that tomorrow.
Thanks again!
Regards
JB
On 02/14/2018 08:04 PM, Ryan Blue wrote:
> Additions
Hi Ryan,
Thanks for the update.
Ideally for Beam, it would be great to have the AvroParquetReader and
AvroParquetWriter using the InputFile/OutputFile interfaces. It would allow me
to directly leverage Beam FileIO.
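
To make the request concrete, here is a minimal stand-alone sketch (not code
from the PR) of the write-side adapter this would enable. Parquet's
org.apache.parquet.io.OutputFile hands the writer a PositionOutputStream, i.e.
an OutputStream that can also report its current position through getPos(). As
far as I know a Beam FileIO sink only receives a WritableByteChannel, so the
adapter has to count bytes itself. The class name ChannelPositionOutputStream is
hypothetical, and it extends plain java.io.OutputStream so that it compiles
without Parquet on the classpath; a real version would presumably extend
PositionOutputStream and be returned from OutputFile.create(...).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

/**
 * Hypothetical sketch: an OutputStream over a WritableByteChannel that tracks
 * how many bytes have been written. With Parquet on the classpath this would
 * extend org.apache.parquet.io.PositionOutputStream instead of OutputStream.
 */
class ChannelPositionOutputStream extends OutputStream {
  private final WritableByteChannel channel;
  private long position = 0;

  ChannelPositionOutputStream(WritableByteChannel channel) {
    this.channel = channel;
  }

  /** Number of bytes written through this stream so far. */
  public long getPos() {
    return position;
  }

  @Override
  public void write(int b) throws IOException {
    write(new byte[] {(byte) b}, 0, 1);
  }

  @Override
  public void write(byte[] bytes, int off, int len) throws IOException {
    ByteBuffer buffer = ByteBuffer.wrap(bytes, off, len);
    // A channel may accept fewer bytes than offered, so loop until drained.
    while (buffer.hasRemaining()) {
      position += channel.write(buffer);
    }
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

The position is counted locally because a WritableByteChannel has no notion of
an offset; this only stays correct if nothing else writes to the same channel.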
Do you have a rough date for the Parquet release that will include that?
Thanks
Regards
JB
Jean-Baptiste,
We're planning a release that will include the new OutputFile class, which
I think you should be able to use. Is there anything you'd change to make
this work more easily with Beam?
rb
On Tue, Feb 13, 2018 at 12:31 PM, Jean-Baptiste Onofré
wrote:
> Hi guys,
>
Thanks for raising this, JB!
To clarify for people on the Parquet mailing list who are not familiar with
Beam:
Beam supports multiple filesystems (currently: local, HDFS, Google Cloud,
S3) via a pluggable interface (that among other things can give you a
Channel for reading/writing the given path),
Hi guys,
I'm working on the Apache Beam ParquetIO:
https://github.com/apache/beam/pull/1851
In Beam, thanks to FileIO, we support several filesystems (HDFS, S3, ...).
While I was able to implement the read part using AvroParquetReader on top of
Beam FileIO, I'm struggling with the write part.
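
For readers unfamiliar with the read side mentioned above, it generally comes
down to wrapping a seekable channel in a stream that Parquet can reposition.
Parquet's org.apache.parquet.io.InputFile exposes getLength() and newStream(),
the latter returning a SeekableInputStream, and Beam's FileIO.ReadableFile can
open a java.nio SeekableByteChannel. The following is a minimal stand-alone
sketch under those assumptions, not the code from the PR. The class name
ChannelSeekableInputStream is hypothetical, and it extends plain
java.io.InputStream so that it compiles without Parquet; a real version would
extend SeekableInputStream and implement its additional readFully methods.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;

/**
 * Hypothetical sketch: an InputStream over a SeekableByteChannel that exposes
 * getPos() and seek(), the two operations Parquet needs to jump to the footer
 * and to individual column chunks.
 */
class ChannelSeekableInputStream extends InputStream {
  private final SeekableByteChannel channel;

  ChannelSeekableInputStream(SeekableByteChannel channel) {
    this.channel = channel;
  }

  /** Current read offset, delegated to the channel. */
  public long getPos() throws IOException {
    return channel.position();
  }

  /** Moves the read offset to an absolute position. */
  public void seek(long newPos) throws IOException {
    channel.position(newPos);
  }

  @Override
  public int read() throws IOException {
    byte[] one = new byte[1];
    int n = read(one, 0, 1);
    return n == -1 ? -1 : (one[0] & 0xFF);
  }

  @Override
  public int read(byte[] bytes, int off, int len) throws IOException {
    if (len == 0) {
      return 0;
    }
    ByteBuffer buffer = ByteBuffer.wrap(bytes, off, len);
    int n;
    // Retry on a zero-byte read so callers only see data or end of stream (-1).
    do {
      n = channel.read(buffer);
    } while (n == 0);
    return n;
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

Delegating the position to the channel avoids keeping a second counter that
could drift after a seek.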