jihoonson opened a new pull request, #12428:
URL: https://github.com/apache/druid/pull/12428
### Description
When `useFieldDiscovery` and `includeAllDimensions` are both set,
`IndexMerger` should store every column, even an empty one. Currently it
ignores the `includeAllDimensions` flag and only checks whether the given
column is explicitly listed in the `dimensionsSpec`. This PR fixes the bug by
also checking the flag.
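
To illustrate the intended rule, here is a minimal sketch of the decision described above. It is not the actual `IndexMergerV9` code: the class name `ColumnRetentionSketch`, both method names, and their parameters are hypothetical and exist only for illustration.

```java
import java.util.Set;

// Hypothetical sketch of the column-retention rule; not Druid's real implementation.
final class ColumnRetentionSketch {

    // Behavior described as the bug: an empty column survives only if it is
    // explicitly listed in the dimensionsSpec; the flag is ignored.
    static boolean shouldStoreBeforeFix(String column, boolean isEmpty, Set<String> explicitDimensions) {
        return !isEmpty || explicitDimensions.contains(column);
    }

    // Behavior described as the fix: an empty column also survives when
    // includeAllDimensions is set.
    static boolean shouldStoreAfterFix(
            String column,
            boolean isEmpty,
            boolean includeAllDimensions,
            Set<String> explicitDimensions) {
        if (!isEmpty) {
            return true; // non-empty columns are always stored
        }
        return includeAllDimensions || explicitDimensions.contains(column);
    }
}
```

Under these assumptions, an empty column that is not listed explicitly is dropped by the first method but kept by the second whenever `includeAllDimensions` is true.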
This PR also fixes a bug in `JsonInputFormat`, which implicitly filters out
null fields even when `useFieldDiscovery` is set. Instead of filtering them
out at an early stage of ingestion, `keepNullColumns` is now set automatically
when `useFieldDiscovery` is set, so that `MapInputRowParser` can check the
`dimensionsSpec` to decide whether to keep the null fields. This should not
cause any user-facing change because 1) `keepNullColumns` is undocumented
and used only in the sampler, and 2) for anyone who does use
`keepNullColumns`, the existing behavior remains the same. The behavior
changes only when both `includeAllDimensions` and `useFieldDiscovery` are set.
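
The null-field handling described above could look roughly like the sketch below. It rests on two assumptions that this description does not spell out: that an explicitly configured `keepNullColumns` takes precedence over the automatic default, and that a discovered null field is kept only when `includeAllDimensions` is set. The class `NullFieldHandlingSketch` and its methods are hypothetical and are not the actual `JsonInputFormat` or `MapInputRowParser` code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the null-field handling; not Druid's real implementation.
final class NullFieldHandlingSketch {

    // Assumption: an explicit setting wins; otherwise default to useFieldDiscovery.
    static boolean resolveKeepNullColumns(Boolean configured, boolean useFieldDiscovery) {
        return configured != null ? configured : useFieldDiscovery;
    }

    // Assumption: explicit dimensions are always kept; discovered fields with a
    // null value are kept only when includeAllDimensions is set.
    static Set<String> dimensionsToKeep(
            Map<String, Object> row,
            List<String> explicitDimensions,
            boolean includeAllDimensions) {
        Set<String> result = new LinkedHashSet<>(explicitDimensions);
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            if (entry.getValue() != null || includeAllDimensions) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

The point of deferring the decision is that the early filter cannot see the `dimensionsSpec`, whereas the later step can, so null fields have to reach it intact.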
<hr>
##### Key changed/added classes in this PR
* `IndexMergerV9`
* `JsonInputFormat`
<hr>
This PR has:
- [x] been self-reviewed.
- [x] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [x] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.