Hi Billy,
I will push my ParquetIO branch to my GitHub.
Yes, the Beam IO is independent from the runner.
Regards
JB
On 12/12/2016 05:29 PM, Newport, Billy wrote:
I don't mind writing one. Is there a fork with the ParquetIO work that's
already been done, or is it in trunk?
The ParquetIO is independent of the runner being used? Is that right?
Thanks
-----Original Message-----
From: Jean-Baptiste Onofré [mailto:[email protected]]
Sent: Monday, December 12, 2016 11:25 AM
To: [email protected]
Subject: Re: Avro Parquet/Flink/Beam
Hi,
Beam provides an AvroCoder/AvroIO that you can use, but not yet a
ParquetIO (I created a Jira about that and started to work on it).
You can use the Avro reader to populate the PCollection and then use a
custom DoFn to write the Parquet files (while waiting for the ParquetIO).
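
A rough sketch of the write side of that workaround, in plain Java with no Beam or Parquet dependencies so it can run on its own. It only illustrates the lifecycle such a custom DoFn would likely follow: open one output per bundle, write each element, close at the end of the bundle. Everything named here (RecordSink, SinkFactory, WriteFn, InMemorySink) is hypothetical and not part of Beam. In a real pipeline the three methods would presumably map to a DoFn's @StartBundle, @ProcessElement and @FinishBundle methods, and the sink would wrap a Parquet writer such as parquet-avro's AvroParquetWriter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BundleWriterSketch {

    /** Hypothetical stand-in for a Parquet file writer. */
    interface RecordSink<T> {
        void write(T record) throws Exception;
        void close() throws Exception;
    }

    /** Hypothetical factory that opens one sink (one output file) per bundle. */
    interface SinkFactory<T> {
        RecordSink<T> open(int bundleIndex) throws Exception;
    }

    /** Mirrors the start-bundle / process-element / finish-bundle lifecycle. */
    static class WriteFn<T> {
        private final SinkFactory<T> factory;
        private RecordSink<T> sink;
        private int bundlesStarted = 0;

        WriteFn(SinkFactory<T> factory) {
            this.factory = factory;
        }

        void startBundle() throws Exception {
            // One output per bundle avoids concurrent writers on the same file.
            sink = factory.open(bundlesStarted++);
        }

        void processElement(T record) throws Exception {
            sink.write(record);
        }

        void finishBundle() throws Exception {
            // Closing is what would flush the Parquet footer in a real writer.
            sink.close();
            sink = null;
        }
    }

    /** In-memory sink, used only to make the sketch runnable without Parquet. */
    static class InMemorySink implements RecordSink<String> {
        final List<String> rows = new ArrayList<>();
        boolean closed = false;

        @Override
        public void write(String record) {
            if (closed) {
                throw new IllegalStateException("sink already closed");
            }
            rows.add(record);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Drives the WriteFn the way a runner would, one bundle at a time. */
    static List<InMemorySink> run(List<List<String>> bundles) throws Exception {
        List<InMemorySink> files = new ArrayList<>();
        WriteFn<String> fn = new WriteFn<>(bundleIndex -> {
            InMemorySink s = new InMemorySink();
            files.add(s);
            return s;
        });
        for (List<String> bundle : bundles) {
            fn.startBundle();
            for (String record : bundle) {
                fn.processElement(record);
            }
            fn.finishBundle();
        }
        return files;
    }

    public static void main(String[] args) throws Exception {
        List<InMemorySink> files =
            run(Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c")));
        System.out.println(files.size() + " outputs written");
    }
}
```

Since this logic lives in a DoFn rather than in a runner-specific output format, the same code should work on the Flink runner or any other.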
Regards
JB
On 12/12/2016 05:19 PM, Newport, Billy wrote:
Are there any examples showing the use of Beam with Avro/Parquet and a
Flink runner? I see an Avro reader for Beam. Is it a matter of writing
another one for avro-parquet, or does this need to use the Flink
HadoopOutputFormat, for example?
Thanks
Billy
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com