Is there a path to having an Arrow<->Parquet implementation in Java that does not have a hard dependency on Iceberg? This is a common ask, and it seems like it would be a clear community win that would attract more contributors than something Iceberg-specific.
On Mon, Jul 6, 2020 at 2:54 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
>
> Sure, if you need an Arrow writer and want to work on it, we would be happy
> to include it in Iceberg.
>
> What is your use case? The main reason why we don't have one is that neither
> Presto nor Spark uses Arrow for writing.
>
> On Mon, Jul 6, 2020 at 9:04 AM Chen Song <chen.song...@gmail.com> wrote:
>>
>> I looked at the Iceberg Data API and found that the write is row based. If I
>> want to use a columnar data file format like Parquet and efficiently sink
>> columnar data in memory (like Arrow). I assume it is not currently
>> implemented but OK to enhance the data API to support this?
>>
>> --
>> Chen Song
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix