Hi Cindy,

> Are you saying that the Avro -> Arrow converter is already available in
> release 0.17.1?

Yes, in Java [1]. It exists in a separate POM [2]. Note that this is still in
an experimental/contrib state (i.e. I'm not sure if anyone is using it in
production) and it might get some refactoring, but it should be a good place
to start experimenting, and feedback on it would be welcome.

> As for use cases: we're trying to move away from Thrift in parts of our ML
> stack. We need to support wide, row-based data with schema support, so
> probably need to convert Thrift to Avro. However, we'd love to use Arrow
> *between* components (Spark, TensorFlow, scikit-learn), but it's likely
> our data will originate in Avro and/or Thrift.

Thanks. Like I said, I hope to work a little bit on the C++/Python side of
Avro to Arrow, but I can't give an exact time frame for it. Thrift, I think,
is more complicated, since it seems like there are multiple protocols that
would likely need support. But contributions are welcome :)

Hope this helps.
Micah

[1] https://arrow.apache.org/docs/java/org/apache/arrow/AvroToArrow.html
[2] https://mvnrepository.com/artifact/org.apache.arrow/arrow-avro

On Wed, May 20, 2020 at 12:36 PM Cindy McMullen <[email protected]> wrote:

> Hi, Micah -
>
> I wasn't aware that the Avro converter already existed in Java, since I
> couldn't find any Arrow docs on it. I was going by the Arrow/JIRA release
> tag. Are you saying that the Avro -> Arrow converter is already available
> in release 0.17.1?
>
> As for use cases: we're trying to move away from Thrift in parts of our ML
> stack. We need to support wide, row-based data with schema support, so
> probably need to convert Thrift to Avro. However, we'd love to use Arrow
> *between* components (Spark, TensorFlow, scikit-learn), but it's likely
> our data will originate in Avro and/or Thrift.
>
> Thanks -
>
> -- Cindy
>
> On Wed, May 20, 2020 at 1:14 PM Micah Kornfield <[email protected]>
> wrote:
>
>> The avro to arrow converter in c++/python will not be done anytime soon
>> unless someone else takes it up (one exists in Java). It has been on my
>> low priority backlog for a while but I haven't had time to get to it. We
>> should remove a specific release tag from it.
>>
>> As far as I know there are no plans for thrift or other formats at this
>> point.
>>
>> May I ask what your use case is?
>>
>> Thanks,
>> Micah
>>
>> On Wednesday, May 20, 2020, Cindy McMullen <[email protected]> wrote:
>>
>>> I see that the Avro converter is planned for Arrow 1.0.0. Any ideas
>>> about when that release might be?
>>>
>>> Any plans for a Thrift -> Avro converter?
