clintropolis opened a new pull request, #13406:
URL: https://github.com/apache/druid/pull/13406
### Description
This PR fixes an issue when using `KafkaInputFormat` with Druid nested
columns: nested data could not be ingested with this format _unless_ the
nested columns were added explicitly to the `flattenSpec` of the underlying
format. The cause is that the Druid nested column indexer and nested data
transformation functions such as `json_value` rely on the `flattenSpec`
machinery to extract and convert data from various nested formats into plain
Java objects.
The `KafkaInputFormat` was eagerly copying the value payload `Map` (which
was a flattener) and blending it with the 'header' `Map` to make a composite
input row. However, nested columns are currently not advertised in the
flattener's `keySet`, so the copy left the nested data out, and the Druid
nested indexer and transforms always saw `null`-valued inputs.
This PR solves the issue by building a `Map` which delegates to the payload
map before falling back to the header map, so the payload's underlying
flattener continues to handle value lookups.
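To illustrate the idea, here is a minimal, self-contained sketch of the difference between copying a map and delegating to it. It is not the code in this patch: `LazyPayload` and `PayloadFirstMap` are hypothetical names, and `LazyPayload` is only a stand-in that assumes a flattener-like map whose `get` can resolve keys that its `entrySet` does not advertise.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a flattener-backed payload map: get() can resolve
// nested fields, but entrySet()/keySet() only advertise the flat ones.
final class LazyPayload extends AbstractMap<String, Object>
{
  private final Map<String, Object> flat;
  private final Map<String, Object> nested;

  LazyPayload(Map<String, Object> flat, Map<String, Object> nested)
  {
    this.flat = flat;
    this.nested = nested;
  }

  @Override
  public Object get(Object key)
  {
    Object value = flat.get(key);
    return value != null ? value : nested.get(key);
  }

  @Override
  public Set<Map.Entry<String, Object>> entrySet()
  {
    // Nested fields are deliberately not advertised here.
    return flat.entrySet();
  }
}

// Hypothetical delegating map: payload first, header map as the fallback.
final class PayloadFirstMap extends AbstractMap<String, Object>
{
  private final Map<String, Object> payload;
  private final Map<String, Object> headers;

  PayloadFirstMap(Map<String, Object> payload, Map<String, Object> headers)
  {
    this.payload = payload;
    this.headers = headers;
  }

  @Override
  public Object get(Object key)
  {
    // A null check (rather than containsKey) is used because the payload may
    // resolve keys that it does not advertise.
    Object value = payload.get(key);
    return value != null ? value : headers.get(key);
  }

  @Override
  public Set<Map.Entry<String, Object>> entrySet()
  {
    // Only the advertised keys of both maps; the payload wins on conflicts.
    Map<String, Object> merged = new HashMap<>(headers);
    merged.putAll(payload);
    return Collections.unmodifiableMap(merged).entrySet();
  }
}
```

With these classes, an eager copy such as `new HashMap<>(payload)` only sees the advertised entries, so a lookup of an unadvertised nested key on the copy returns `null`. The same lookup on a `PayloadFirstMap` still reaches the payload's own `get`, and keys the payload cannot resolve fall through to the header map.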
The added test cases all fail prior to the changes in this patch with errors
of the form:
```
java.lang.AssertionError:
Expected :{mg=1}
Actual :null
```
<hr>
This PR has:
- [x] been self-reviewed.
- [x] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [x] added comments explaining the "why" and the intent of the code
wherever it would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [x] been tested in a test Druid cluster.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]