Re: Arrow Support in Parquet Writers

Wes McKinney Mon, 06 Jul 2020 14:55:20 -0700

Is there a path to having an Arrow<->Parquet implementation in Java
that does not have a hard dependency on Iceberg? This is a common ask
and it seems like it would be a clear community win that would net
more contributors than something Iceberg-specific.


On Mon, Jul 6, 2020 at 2:54 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
>
> Sure, if you need an Arrow writer and want to work on it, we would be happy 
> to include it in Iceberg.
>
> What is your use case? The main reason why we don't have one is that neither 
> Presto nor Spark uses Arrow for writing.
>
> On Mon, Jul 6, 2020 at 9:04 AM Chen Song <chen.song...@gmail.com> wrote:
>>
>> I looked at the Iceberg Data API and found that the write path is row based.
>> I want to use a columnar data file format like Parquet and efficiently sink
>> columnar in-memory data (like Arrow). I assume this is not currently
>> implemented, but would it be OK to enhance the Data API to support it?
>>
>> --
>> Chen Song
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
