rsrkpatwari1234 opened a new pull request, #17708:
URL: https://github.com/apache/pinot/pull/17708
### Problem
When Parquet or Avro stores a list as an array of structs with a single field
"element", the record extractor left the values as an array of single-key maps
instead of unwrapping them to a plain array. For example:
Expected: "metadata": {"tags": ["abc", "xyz"]}
Actual: "metadata": {"tags": [{"element":"abc"}, {"element":"xyz"}]}
This breaks schemas and queries that expect multi-value columns to be arrays
of scalars.
### Solution
BaseRecordExtractor: After building a multi-value array from a Collection or
Object[], run it through a new helper unwrapElementMapsInArray(). If every
element is a Map with exactly one key "element", replace the array with the
values of that key; otherwise leave the array unchanged. Do not apply
unwrapping to primitive arrays.
### Tests
Added BaseRecordExtractorTest with 8 cases: unwrapping when every element is a
single-key "element" map; no unwrapping for mixed, multi-key, different-key,
primitive, and empty arrays; and a full-path extract() test.
### Scope
Change is in pinot-spi; any record reader that extends BaseRecordExtractor
gets the fix.
Primitive arrays and non–element-map arrays are unchanged.
Fixes https://github.com/apache/pinot/issues/17420