rsrkpatwari1234 opened a new pull request, #17708:
URL: https://github.com/apache/pinot/pull/17708

   ### Problem
   When Parquet/Avro stores a list as an array of structs with a single field 
"element", the record extractor left it as an array of single-key maps 
instead of unwrapping it to a plain array. For example:
   Expected: "metadata": {"tags": ["abc", "xyz"]}
   Actual: "metadata": {"tags": [{"element":"abc"}, {"element":"xyz"}]}
   This breaks schemas and queries that expect multi-value columns to be arrays 
of scalars.
   
   ### Solution
   BaseRecordExtractor: After building a multi-value array from a Collection or 
Object[], run it through a new helper unwrapElementMapsInArray(). If every 
element is a Map with exactly one key "element", replace the array with the 
values of that key; otherwise leave the array unchanged. Do not apply 
unwrapping to primitive arrays.
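
A minimal sketch of the unwrapping rule described above, not the code from this PR. It assumes the helper takes and returns an `Object[]`; the class name `ElementUnwrapSketch`, the constant `ELEMENT_KEY`, and the exact signature are hypothetical, and the real helper in BaseRecordExtractor may differ.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-alone class for illustration; not part of pinot-spi.
final class ElementUnwrapSketch {
  // Key name taken from the PR description.
  private static final String ELEMENT_KEY = "element";

  // Returns a new array of unwrapped values if every entry is a Map with
  // exactly one key "element"; otherwise returns the input array unchanged.
  static Object[] unwrapElementMapsInArray(Object[] values) {
    if (values.length == 0) {
      return values; // empty arrays are left as-is
    }
    Object[] unwrapped = new Object[values.length];
    for (int i = 0; i < values.length; i++) {
      if (!(values[i] instanceof Map)) {
        return values; // mixed content: do not unwrap
      }
      Map<?, ?> map = (Map<?, ?>) values[i];
      if (map.size() != 1 || !map.containsKey(ELEMENT_KEY)) {
        return values; // multi-key or different-key map: do not unwrap
      }
      unwrapped[i] = map.get(ELEMENT_KEY);
    }
    return unwrapped;
  }

  public static void main(String[] args) {
    Object[] wrapped = {
        Collections.singletonMap("element", "abc"),
        Collections.singletonMap("element", "xyz")
    };
    // prints [abc, xyz]
    System.out.println(Arrays.toString(unwrapElementMapsInArray(wrapped)));

    Object[] mixed = {Collections.singletonMap("element", "abc"), "xyz"};
    // the same array instance comes back, so this prints true
    System.out.println(unwrapElementMapsInArray(mixed) == mixed);
  }
}
```

The all-or-nothing check means a single non-matching entry leaves the whole array untouched, so arrays of genuine maps are not partially rewritten.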
   
   ### Tests
   Added BaseRecordExtractorTest with 8 cases covering unwrap when all elements 
are single-key "element" maps, no unwrap for 
mixed/multi-key/different-key/primitive/empty, and a full-path extract() test.
   
   ### Scope
   Change is in pinot-spi; any record reader that extends BaseRecordExtractor 
gets the fix.
   Primitive arrays and non–element-map arrays are unchanged.
   
   Fixes https://github.com/apache/pinot/issues/17420


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

