shounakmk219 opened a new pull request, #14738: URL: https://github.com/apache/pinot/pull/14738
The transform pipeline fails with an `ArrayIndexOutOfBoundsException` when it encounters a JSON column whose value is an empty JSON array, because the JSON value is not standardised (empty array -> null):
https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java#L97

Previously, an empty array for the JSON data type was extracted as a string. It is now extracted as `Object[]` due to the change at https://github.com/apache/pinot/pull/14547/files#diff-7ac5349f9d75e27a62a063dbf81db3ed30c8de052b4ffa7719187e4babaa60baR66, which causes `isMultiValue` to return true for an empty JSON array. `convertMultiValue` returns `Object[]`, while `convertSingleValue` returns a `String`:
https://github.com/apache/pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/BaseRecordExtractor.java#L39

```
public Object convert(Object value) {
  Object convertedValue;
  if (isMultiValue(value)) {
    convertedValue = convertMultiValue(value);
  } else if (isMap(value)) {
    convertedValue = convertMap(value);
  } else if (isRecord(value)) {
    convertedValue = convertRecord(value);
  } else {
    convertedValue = convertSingleValue(value);
  }
  return convertedValue;
}
```

Since a JSON column is never supposed to be multi-valued, this PR overrides `JSONRecordExtractor.isMultiValue()` to always return false.
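To illustrate the dispatch issue described above, here is a minimal, self-contained sketch. It is not Pinot's code: `MultiValueDispatchSketch`, `BaseExtractor`, and `JsonExtractor` are hypothetical stand-ins, the `isMultiValue` check is assumed to be a simple `Collection` test, and the `isMap`/`isRecord` branches are omitted. It only shows how overriding `isMultiValue()` to return false routes an empty list to the single-value path.

```java
import java.util.Collection;
import java.util.Collections;

public class MultiValueDispatchSketch {

  // Hypothetical stand-in for a base extractor; simplified, not Pinot's actual class.
  static class BaseExtractor {
    // Assumption: anything that is a Collection counts as multi-valued.
    protected boolean isMultiValue(Object value) {
      return value instanceof Collection;
    }

    public Object convert(Object value) {
      if (isMultiValue(value)) {
        // Multi-value path: yields Object[], which is empty for an empty list.
        return ((Collection<?>) value).toArray();
      }
      // Single-value path: yields the string form of the value.
      return value.toString();
    }
  }

  // Hypothetical stand-in for a JSON extractor that never reports multi-value.
  static class JsonExtractor extends BaseExtractor {
    @Override
    protected boolean isMultiValue(Object value) {
      return false;
    }
  }

  public static void main(String[] args) {
    Object emptyJsonArray = Collections.emptyList();

    Object fromBase = new BaseExtractor().convert(emptyJsonArray);
    Object fromJson = new JsonExtractor().convert(emptyJsonArray);

    // Print the runtime type each path produces for the same input.
    System.out.println(fromBase.getClass().getSimpleName());
    System.out.println(fromJson.getClass().getSimpleName());
  }
}
```

With the base behaviour the empty list comes back as a zero-length `Object[]`, so any later code that reads element 0 of it would throw; with the override it comes back as a `String`, which is what a single-valued JSON column expects.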
