josephglanville opened a new pull request #10232:
URL: https://github.com/apache/druid/pull/10232


   Fixes #10229 
   
   ### Description
   
   This PR includes two sets of fixes that make Avro OCF files work seamlessly 
in the Web Console.
   
   The first set fixes format detection: the detection logic now runs against 
the "raw" input rows, and the Avro OCF check now only looks for the ASCII `Obj` 
prefix.
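
For illustration only, a prefix check of this kind could look roughly like the following Java sketch. It is not the code from this PR: the class and method names are hypothetical, and it only relies on the fact that an Avro object container file begins with the ASCII bytes `Obj` followed by a version byte.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Druid: guesses whether a byte sample
// is the start of an Avro object container file (OCF).
final class AvroOcfSniffer {
    // An OCF header starts with ASCII 'O', 'b', 'j' and then a version byte.
    // Only the three ASCII bytes are compared here; the version byte is ignored.
    private static final byte[] PREFIX = "Obj".getBytes(StandardCharsets.US_ASCII);

    static boolean looksLikeAvroOcf(byte[] head) {
        if (head == null || head.length < PREFIX.length) {
            return false;
        }
        for (int i = 0; i < PREFIX.length; i++) {
            if (head[i] != PREFIX[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Comparing only the ASCII portion avoids depending on how the non-printable version byte survives any text decoding that happens before the check.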
   
   The second set corrects some more fundamental issues that would not have 
been noticed without the web UI, because manually written ingestion specs that 
don't use the sampler API or rely on root field enumeration would not have run 
into them.
   The first of these is that Enum and Fixed types were not properly supported 
in the `AvroFlattenerMaker`: they were not listed as root fields and were not 
properly converted into primitive types.
   Additionally, the `AvroFlattenerMaker` didn't override the 
`ObjectFlatteners#finalizeConversionForMap` method, so raw Avro types leaked 
into the `SamplerResponse`. These raw Avro types aren't configured to be mapped 
correctly, so they caused errors when serialising the `SamplerResponse` from 
the Web UI. This is what led to the discovery that Enum and Fixed were not 
supported correctly.
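
The general idea of converting leaked raw values into plain, serialisable ones can be sketched as below. This is a dependency-free illustration, not the `AvroFlattenerMaker` implementation: `PlainValueConverter` and `toPlain` are hypothetical names, `CharSequence` stands in for Avro string types such as `Utf8`, `ByteBuffer` stands in for binary values, and the `toString()` fallback stands in for enum symbols. A real implementation would need Avro-specific handling, in particular for Fixed values.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Druid: recursively replaces
// format-specific values with plain Java types.
final class PlainValueConverter {
    static Object toPlain(Object value) {
        if (value == null || value instanceof Number || value instanceof Boolean
                || value instanceof String || value instanceof byte[]) {
            return value; // already a plain value
        }
        if (value instanceof CharSequence) {
            return value.toString(); // string-like wrappers become String
        }
        if (value instanceof ByteBuffer) {
            // Copy the remaining bytes without moving the caller's position.
            ByteBuffer buf = ((ByteBuffer) value).duplicate();
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        }
        if (value instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                out.put(String.valueOf(e.getKey()), toPlain(e.getValue()));
            }
            return out;
        }
        if (value instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(toPlain(item));
            }
            return out;
        }
        // Assumption: anything else (for example an enum symbol) is best
        // represented by its string form.
        return value.toString();
    }
}
```

Applying a conversion like this to the whole flattened row, rather than only to individual extracted fields, is what keeps format-specific objects out of a response that is later serialised.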
   
   A follow-up to this PR will likely include tests for `AvroOCFReader#sample`, 
similar to those that exist for `ParquetReader`.
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] been tested in a test Druid cluster.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `AvroFlattenerMaker`
   




