martin-traverse commented on PR #779:
URL: https://github.com/apache/arrow-java/pull/779#issuecomment-2953028800

   Hi @lidavidm - I have added the dictionary decoding producer, which turned 
out to be very simple. Now any dictionary-encoded fields that are not valid 
Avro enums will be automatically decoded and output as their concrete type. 
This does require running a regex over the dictionary entries, but that only 
has to happen once, when the producers are set up.
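
   As a rough illustration of that check (a sketch only, not the code in this 
PR): the class and method names below are hypothetical, and it assumes the 
Avro name rule `[A-Za-z_][A-Za-z0-9_]*` plus non-null, unique symbols as the 
criteria for a dictionary being usable as an enum.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of arrow-java. */
final class EnumSymbolCheck {

  // Avro names start with a letter or underscore, followed by letters,
  // digits or underscores. Compiled once and reused for every entry.
  private static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  private EnumSymbolCheck() {}

  /** Returns true if every dictionary entry could serve as an Avro enum symbol. */
  static boolean isValidAvroEnum(List<String> entries) {
    Set<String> seen = new HashSet<>();
    for (String entry : entries) {
      // Reject nulls, names that break the Avro name rule, and duplicates.
      if (entry == null || !AVRO_NAME.matcher(entry).matches() || !seen.add(entry)) {
        return false;
      }
    }
    return true;
  }
}
```

   Under these assumptions, a dictionary such as `["RED", "GREEN"]` would stay 
an enum, while one containing `"red apple"` or `"1st"` would fall through to 
the decoding producer.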
   
   I do think we will need to change the enum read so that dictionaries are 
populated during the schema phase rather than the data phase, in order to read 
whole files with multiple blocks. I'd like to hold that change back and do it 
as part of the next PR, which will cover read / write for whole files.
   
   Assuming you are happy with both of these points, then I think this PR is 
ready for review  :-)

