Yup, that's great. I will update the PR when I'm back from vacation.

Regards
JB

On 20 Apr 2018 at 02:26, Eugene Kirpichov <kirpic...@google.com> wrote:
>Very cool! JB, time to update your PR?
>
>On Thu, Apr 19, 2018 at 9:17 AM Alexey Romanenko
><aromanenko....@gmail.com>
>wrote:
>
>> FYI: Apache Parquet 1.10.0 was released recently.
>> It contains *org.apache.parquet.io.OutputFile* and the updated
>> *org.apache.parquet.hadoop.ParquetFileWriter*.
>>
>> WBR,
>> Alexey
>>
>>
>> On 14 Feb 2018, at 20:10, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>
>> Great !!
>>
>> In the meantime, I started a PoC directly on top of parquet-common to see
>> if I can implement a BeamParquetReader and a BeamParquetWriter.
>>
>> I might also propose some PRs.
>>
>> I will continue tomorrow around that.
>>
>> Thanks again !
>> Regards
>> JB
>>
>> On 02/14/2018 08:04 PM, Ryan Blue wrote:
>>
>> Additions to the builders are easy enough that we can get that in. There's
>> a PR out there that needs to be fixed:
>> https://github.com/apache/parquet-mr/pull/446
>>
>> I've asked the author for just the builder changes. If we don't hear back,
>> we can add another PR but I'd like to give the author some time to update.
>>
>> rb
>>
>> On Tue, Feb 13, 2018 at 9:20 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>> Hi Ryan,
>>
>> Thanks for the update.
>>
>> Ideally for Beam, it would be great to have the AvroParquetReader and
>> AvroParquetWriter using the InputFile/OutputFile interfaces. It would
>> allow me to directly leverage Beam FileIO.
>>
>> Do you have a rough date for the Parquet release with that ?
>>
>> Thanks
>> Regards
>> JB
>>
>> On 02/14/2018 02:01 AM, Ryan Blue wrote:
>>
>> Jean-Baptiste,
>>
>> We're planning a release that will include the new OutputFile class, which I
>> think you should be able to use. Is there anything you'd change to make this
>> work more easily with Beam?
>>
>> rb
>>
>> On Tue, Feb 13, 2018 at 12:31 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>    Hi guys,
>>
>>    I'm working on the Apache Beam ParquetIO:
>>
>>    https://github.com/apache/beam/pull/1851
>>
>>    In Beam, thanks to FileIO, we support several filesystems (HDFS, S3, ...).
>>
>>    While I was able to implement the read part using AvroParquetReader and Beam
>>    FileIO, I'm struggling with the write part.
>>
>>    I have to create a ParquetSink implementing FileIO.Sink. In particular, I have to
>>    implement the open(WritableByteChannel channel) method.
>>
>>    It's not possible to use AvroParquetWriter here as it takes a Path as argument
>>    (and from the channel, I can only get an OutputStream).
>>
>>    As a workaround, I wanted to use org.apache.parquet.hadoop.ParquetFileWriter,
>>    providing my own implementation of org.apache.parquet.io.OutputFile.
>>
>>    Unfortunately, OutputFile (and the updated method in ParquetFileWriter) exists on the
>>    Parquet master branch, but it was different in Parquet 1.9.0.
>>
>>    So, I have two questions:
>>    - Do you plan a Parquet 1.9.1 release including org.apache.parquet.io.OutputFile
>>      and the updated org.apache.parquet.hadoop.ParquetFileWriter?
>>    - Using Parquet 1.9.0, do you have any advice on how to use
>>      AvroParquetWriter/ParquetFileWriter with an OutputStream (or any object that I
>>      can get from WritableByteChannel)?
>>
>>    Thanks !
>>
>>    Regards
>>    JB
>>    --
>>    Jean-Baptiste Onofré
>>    jbono...@apache.org
>>    http://blog.nanthrax.net
>>    Talend - http://www.talend.com
>>
>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>>
>>
>>
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>>
>>
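
The workaround discussed in the original message starts from the WritableByteChannel that FileIO.Sink's open(WritableByteChannel channel) receives. A custom OutputFile would presumably need to hand Parquet a stream that can report how many bytes have been written so far. The following is only a minimal sketch of that one piece, using the JDK alone. The class name ChannelPositionOutputStream and its getPos() method are hypothetical and merely mimic the shape of a position-aware stream; the exact methods that org.apache.parquet.io.OutputFile requires should be checked against the Parquet 1.10.0 Javadoc.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

/**
 * Hypothetical helper (not part of Beam or Parquet): an OutputStream over a
 * WritableByteChannel that keeps track of the number of bytes written.
 */
class ChannelPositionOutputStream extends OutputStream {
    private final OutputStream out;
    private long position = 0;

    ChannelPositionOutputStream(WritableByteChannel channel) {
        // Channels.newOutputStream adapts the channel to a plain OutputStream.
        this.out = Channels.newOutputStream(channel);
    }

    /** Returns the number of bytes written through this stream so far. */
    long getPos() {
        return position;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        position++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        position += len;
    }

    @Override
    public void flush() throws IOException {
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

An OutputFile implementation could then wrap such a stream in whatever position-aware stream type Parquet expects and return it from its create methods; that wiring is left out here because it depends on the Parquet version in use.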
