ameyc opened a new issue, #11342:
URL: https://github.com/apache/datafusion/issues/11342

   ### Is your feature request related to a problem or challenge?
   
   We are currently working on a stream processing system built atop DataFusion, 
and Avro is a major format for us given its ubiquity in the Kafka 
world. We tried using the existing Avro reader in DataFusion, but 
found it lacking in some critical ways that make it not terribly useful for us in 
its present state.
   
   The reader currently does not support complex nested datatypes. For example:
   
   1. List arrays [only support primitive 
types](https://github.com/apache/datafusion/blob/main/datafusion/core/src/datasource/avro_to_arrow/arrow_array_reader.rs#L627).
   2. Dictionary arrays only support Utf8 as their value type.
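
To illustrate why nested types call for a recursive conversion rather than a fixed set of primitive cases, here is a minimal, std-only Rust sketch. `Value`, `Column`, and `build_column` are hypothetical stand-ins invented for this example; they are not the `apache-avro` or Arrow types, and this is not the existing reader's code. It assumes lists are stored Arrow-style, as offsets into a child column.

```rust
/// Hypothetical stand-in for a decoded Avro value (not the `apache-avro` type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Long(i64),
    Array(Vec<Value>),
}

/// Hypothetical stand-in for an Arrow-style column: a list is stored as
/// offsets into a child column, and the child may itself be a list.
#[derive(Debug, PartialEq)]
enum Column {
    Long(Vec<i64>),
    List { offsets: Vec<usize>, child: Box<Column> },
}

/// Build a column from one value per row.
/// `depth` is the number of list levels in the (assumed) schema:
/// 0 means plain longs, 1 means list<long>, 2 means list<list<long>>, ...
fn build_column(rows: &[Value], depth: usize) -> Result<Column, String> {
    if depth == 0 {
        // Leaf level: every value must be a primitive.
        let mut out = Vec::with_capacity(rows.len());
        for v in rows {
            match v {
                Value::Long(n) => out.push(*n),
                other => return Err(format!("expected long, found {:?}", other)),
            }
        }
        return Ok(Column::Long(out));
    }

    // List level: record where each row's items end, then recurse on the
    // flattened items so the child can be a list again.
    let mut offsets = vec![0usize];
    let mut flattened: Vec<Value> = Vec::new();
    for v in rows {
        match v {
            Value::Array(items) => {
                flattened.extend(items.iter().cloned());
                offsets.push(flattened.len());
            }
            other => return Err(format!("expected array, found {:?}", other)),
        }
    }
    let child = build_column(&flattened, depth - 1)?;
    Ok(Column::List { offsets, child: Box::new(child) })
}
```

For the two rows `[[1, 2], [3]]` and `[[4]]` with `depth = 2`, this yields outer offsets `[0, 2, 3]`, inner offsets `[0, 2, 3, 4]`, and leaf values `[1, 2, 3, 4]`. A reader that only handles the `depth == 0` case under a list cannot represent the inner level.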
   
   Lastly, the reader seems to rely on the `decode_internal` method of the 
`apache-avro` crate and seems to implement some of the Avro decoding "by hand". 
We ended up rolling our own reader to support these types, and were able to use 
`decode_from_avro` datum and pass the Avro decoding responsibility entirely 
to the avro package.
   
   Would love to work with @tustvold, who seems to have contributed the most 
here, to address the existing limitations.
   
   ### Describe the solution you'd like
   
   Support for parsing complex nested datatypes, such as lists of non-primitive 
types and dictionaries with non-Utf8 value types.
   
   ### Describe alternatives you've considered
   
   Convert Avro to JSON, then rely on the json_to_arrow conversion, but this 
leads to an inevitable loss of type information.
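
As a rough illustration of the type loss, the std-only Rust sketch below renders a few typed values as JSON text. `AvroValue` and `to_json` are hypothetical names made up for this example (not the `apache-avro` API), and the rendering is deliberately naive: it does no escaping and assumes bytes are written as a string of code points 0 to 255.

```rust
/// Hypothetical subset of Avro primitive values, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum AvroValue {
    Int(i32),
    Long(i64),
    String(String),
    Bytes(Vec<u8>),
}

/// Naive JSON rendering (assumption: no escaping, bytes become a string
/// whose characters are the byte values as code points).
fn to_json(v: &AvroValue) -> String {
    match v {
        // Both integer widths collapse into a bare JSON number.
        AvroValue::Int(n) => n.to_string(),
        AvroValue::Long(n) => n.to_string(),
        // Strings and bytes both collapse into a JSON string.
        AvroValue::String(s) => format!("\"{}\"", s),
        AvroValue::Bytes(b) => {
            let text: String = b.iter().map(|&x| x as char).collect();
            format!("\"{}\"", text)
        }
    }
}
```

With this sketch, `AvroValue::Int(7)` and `AvroValue::Long(7)` produce the same JSON text, as do `AvroValue::String("ab".into())` and `AvroValue::Bytes(b"ab".to_vec())`, so a JSON-to-Arrow step has to guess the original type unless the schema is carried alongside.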
   
   ### Additional context
   
   _No response_



