JonasDev1 commented on issue #7690: URL: https://github.com/apache/datafusion/issues/7690#issuecomment-2370845677
I have the same use case. The main reason is that the Avro reader needs an Avro file input stream, while I only have a binary array of Avro messages (from Kafka). My current workaround is to deserialize each message to a `Value`, write them all with the Avro writer in memory, and then deserialize them again with the DataFusion Avro reader. To solve this, I would like to split the `AvroArrowArrayReader` into a reader and a converter (`Vec<Value>` to `RecordBatch`). You could then also add a `from_avro` function, similar to the one in [Spark](https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.avro.functions.from_avro.html).
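For reference, a minimal sketch of the workaround described above, assuming the `apache-avro` crate and raw (unframed) Avro datum payloads; the function and variable names here are illustrative, not DataFusion API:

```rust
use apache_avro::{from_avro_datum, Schema, Writer};
use std::io::Cursor;

/// Re-encode single-datum Avro messages (e.g. Kafka payloads) into an
/// in-memory Avro object container file that a file-oriented reader accepts.
fn reencode_messages(
    schema: &Schema,
    messages: &[Vec<u8>],
) -> Result<Vec<u8>, apache_avro::Error> {
    // Step 1: deserialize each message into an Avro `Value`.
    // (Confluent-framed messages would first need the 5-byte header stripped.)
    let values = messages
        .iter()
        .map(|msg| from_avro_datum(schema, &mut Cursor::new(msg), None))
        .collect::<Result<Vec<_>, _>>()?;

    // Step 2: write all values back out with the Avro writer, in memory.
    let mut writer = Writer::new(schema, Vec::new());
    for value in values {
        writer.append(value)?;
    }

    // Step 3: these bytes are then fed to the DataFusion Avro reader, which
    // deserializes them a second time -- the redundant round trip that a
    // `Vec<Value>`-to-`RecordBatch` converter would eliminate.
    writer.into_inner()
}
```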