alamb commented on issue #4886:
URL: https://github.com/apache/arrow-rs/issues/4886#issuecomment-1893802947

   > Would you mind explaining in a bit more detail what you mean by 
"vectorized support for reading and writing avro data", and point me to where 
that would plug in the code?
   
   
   I think this means "reading Avro records into `Array` directly".
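
To make "directly" concrete, here is a minimal sketch of the row-to-column idea, using only the standard library. The `Record` and `Batch` types and the `read_batch` function are hypothetical names for illustration, and plain `Vec`s stand in for Arrow arrays; this is not the arrow-rs or DataFusion API.

```rust
/// One decoded row, standing in for an Avro record (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub id: i64,
    pub name: Option<String>,
}

/// Column-oriented batch. Plain `Vec`s stand in for Arrow arrays here.
#[derive(Debug, Default, PartialEq)]
pub struct Batch {
    pub ids: Vec<i64>,
    pub names: Vec<Option<String>>,
}

/// Pull up to `batch_size` records from the iterator and append each field
/// to its own column. Returns `None` once no records were read.
pub fn read_batch<I>(records: &mut I, batch_size: usize) -> Option<Batch>
where
    I: Iterator<Item = Record>,
{
    let mut batch = Batch::default();
    for record in records.by_ref().take(batch_size) {
        batch.ids.push(record.id);
        batch.names.push(record.name);
    }
    if batch.ids.is_empty() {
        None
    } else {
        Some(batch)
    }
}
```

A real reader would append to Arrow builders rather than `Vec`s, and ideally would decode the Avro bytes into those builders without first materializing a per-row value.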
   
   Here is the way it is implemented in datafusion: 
   
https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/src/datasource/avro_to_arrow/reader.rs
   
   There are more sophisticated ways of implementing this feature, for example 
the tape-based methods of the JSON and CSV readers in this crate.
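
As a rough illustration of the tape idea (a sketch under my own assumptions, not the code of the JSON or CSV readers), the input is first scanned into one flat buffer plus field offsets, and only then is each column parsed in a tight loop. The `Tape` type and its methods are hypothetical, and the format is a simplified comma-separated one with no quoting or escaping.

```rust
/// Flat "tape": all field bytes in one buffer plus the offset where each
/// field ends, stored in row-major order (hypothetical type).
pub struct Tape {
    data: String,
    offsets: Vec<usize>, // field i spans offsets[i]..offsets[i + 1]
    num_columns: usize,
}

impl Tape {
    /// First pass: split the input into fields without interpreting them.
    pub fn parse(input: &str, num_columns: usize) -> Result<Tape, String> {
        if num_columns == 0 {
            return Err("num_columns must be non-zero".to_string());
        }
        let mut data = String::new();
        let mut offsets = vec![0];
        for (row, line) in input.lines().enumerate() {
            let mut fields = 0;
            for field in line.split(',') {
                data.push_str(field);
                offsets.push(data.len());
                fields += 1;
            }
            if fields != num_columns {
                return Err(format!(
                    "row {row}: expected {num_columns} fields, found {fields}"
                ));
            }
        }
        Ok(Tape { data, offsets, num_columns })
    }

    pub fn num_rows(&self) -> usize {
        (self.offsets.len() - 1) / self.num_columns
    }

    /// Borrow the raw text of one field from the shared buffer.
    fn field(&self, row: usize, col: usize) -> &str {
        let i = row * self.num_columns + col;
        &self.data[self.offsets[i]..self.offsets[i + 1]]
    }

    /// Second pass: decode a single column as integers, one type per loop.
    pub fn i64_column(&self, col: usize) -> Result<Vec<i64>, std::num::ParseIntError> {
        (0..self.num_rows()).map(|row| self.field(row, col).parse()).collect()
    }

    /// Second pass: copy a single column out as strings.
    pub fn string_column(&self, col: usize) -> Vec<String> {
        (0..self.num_rows()).map(|row| self.field(row, col).to_string()).collect()
    }
}
```

The point of the two passes is that the per-column loops handle one data type each and avoid allocating a value per row, which is where much of the speedup over row-at-a-time decoding would be expected to come from.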
   
   What I would personally recommend doing is:
   1. Make a PR with a relatively basic Avro reader/writer (it can be missing 
features) and use that to work out the desired interface (@tustvold may 
already have this sitting on a branch somewhere)
   2. Implement a basic reader/writer, perhaps using apache-avro or perhaps 
another implementation, including broad test coverage
   3. Work on optimizing the implementation (using the existing coverage)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
