Hi, Micah - I see the Avro*Consumer classes in the javadocs <https://arrow.apache.org/docs/java/>, which would lead me to believe we have Arrow to Avro capability. What am I missing?
On Mon, Jun 29, 2020 at 9:33 PM Micah Kornfield <[email protected]> wrote:

> Just a clarification the functionality in Java is from Avro to Arrow (not
> Arrow to Avro).
>
> On Mon, Jun 29, 2020 at 2:25 PM Wes McKinney <[email protected]> wrote:
>
>> On Mon, Jun 29, 2020 at 4:15 PM Cindy McMullen <[email protected]> wrote:
>> >
>> > Hi, Wes -
>> >
>> > Yes, we're using Java/Scala, but also have a good Python code base for
>> > our data scientists. Our goal is to replace storage/representation of
>> > Thrift for ML features with some more OSS-friendly format, such as Parquet
>> > or Avro, and avoid writing multiple adapters.
>> >
>> > Ideally, we could stream data from Parquet disk in batches into
>> > Arrow-compatible consumers. Is this a reasonable fit for something like
>> > Arrow Flight?
>>
>> Yes, Flight is definitely designed for that -- fast / efficient
>> delivery of Arrow record batches over TCP.
>>
>> >
>> > On Mon, Jun 29, 2020 at 2:37 PM Wes McKinney <[email protected]> wrote:
>> >>
>> >> hi Cindy,
>> >>
>> >> Could you clarify which PL you are working in (though assuming Scala /
>> >> Java judging by your e-mail address)?
>> >>
>> >> In C++ we have reasonably mature Parquet->Arrow reading but not yet
>> >> conversion from Arrow to Avro. In Java, I am not sure what is the
>> >> state of the art for getting Parquet into Arrow but this code does not
>> >> live in Apache Arrow -- I know that Apache Iceberg has done some work
>> >> around this but I'm not sure how consumable it is as a library.
>> >> Java-Arrow does have some preliminary support for converting Arrow to
>> >> Avro, I believe. So there's some engineering here to do in any case.
>> >>
>> >> best,
>> >> Wes
>> >>
>> >> On Mon, Jun 29, 2020 at 2:45 PM Cindy McMullen <[email protected]> wrote:
>> >> >
>> >> > Can I use Arrow to stream data from a Parquet file source and
>> >> > consume it via Avro?
