clintropolis opened a new pull request #10248: URL: https://github.com/apache/druid/pull/10248
### Description

This PR fixes an issue that occurs when using `ExpressionVirtualColumn` expressions on realtime string columns that are sparsely populated and have not yet encountered an explicit `null` value, which would ensure that `null` is encoded in the dictionary.

In the code there is an implicit assumption that if `ColumnCapabilities.isDictionaryEncoded` is true then `DimensionSelector.nameLookupPossibleInAdvance` is also true, and `isDictionaryEncoded` appears to be checked primarily in cases where this should also be true.

Prior to this fix, the added tests would fail with an error of the form:

```
Selector of class[org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector] does not have a dictionary, cannot use it.
```

because of [this check](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/virtual/SingleStringInputDimensionSelector.java#L52).

<hr>

This PR has:

- [ ] been self-reviewed.
   - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.

<hr>

##### Key changed/added classes in this PR

 * `StringDimensionIndexer`
 * `ExpressionSelectors`
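
To illustrate the implicit assumption described above, here is a minimal, self-contained sketch. It is not the code changed in this PR and does not use Druid's real interfaces: `SimpleSelector`, `GrowingDictionarySelector`, `Strategy`, and `chooseStrategy` are hypothetical names invented for this example. It models a realtime column whose dictionary can still grow, so a column-level "dictionary encoded" flag on its own does not guarantee that name lookup is possible in advance.

```java
import java.util.ArrayList;
import java.util.List;

public class DictionaryAssumptionSketch {

    // Hypothetical, simplified stand-in for a dimension selector (not Druid's interface).
    interface SimpleSelector {
        boolean nameLookupPossibleInAdvance();
        String lookupName(int id);
    }

    // Models a realtime column whose dictionary may still gain entries while it is read.
    static final class GrowingDictionarySelector implements SimpleSelector {
        private final List<String> dictionary = new ArrayList<>();

        // Returns the id of the value, adding it to the dictionary if it is new.
        int add(String value) {
            int id = dictionary.indexOf(value);
            if (id < 0) {
                dictionary.add(value);
                id = dictionary.size() - 1;
            }
            return id;
        }

        @Override
        public boolean nameLookupPossibleInAdvance() {
            // Assumption for this sketch: ids (for example the one for null) may appear later.
            return false;
        }

        @Override
        public String lookupName(int id) {
            return dictionary.get(id);
        }
    }

    enum Strategy { DICTIONARY_LOOKUP, PER_ROW_EVALUATION }

    // Only takes the dictionary path when both conditions hold, rather than
    // assuming that "dictionary encoded" implies "lookup possible in advance".
    static Strategy chooseStrategy(boolean dictionaryEncoded, SimpleSelector selector) {
        if (dictionaryEncoded && selector.nameLookupPossibleInAdvance()) {
            return Strategy.DICTIONARY_LOOKUP;
        }
        return Strategy.PER_ROW_EVALUATION;
    }

    public static void main(String[] args) {
        GrowingDictionarySelector realtime = new GrowingDictionarySelector();
        realtime.add("a");
        System.out.println(chooseStrategy(true, realtime));
    }
}
```

In this sketch, checking only the `dictionaryEncoded` flag would pick the dictionary path for the realtime selector, which corresponds to the kind of mismatch the error message above reports.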
