clintropolis opened a new pull request #10248:
URL: https://github.com/apache/druid/pull/10248


   ### Description
   This PR fixes an issue with `ExpressionVirtualColumn` expressions on 
realtime string columns that are sparsely populated and have not yet 
encountered an explicit `null` value, so `null` has not yet been encoded in 
the dictionary. 
   
   The code implicitly assumes that if 
`ColumnCapabilities.isDictionaryEncoded` is true then 
`DimensionSelector.nameLookupPossibleInAdvance` is also true, and 
`isDictionaryEncoded` appears to be checked primarily in places where 
`nameLookupPossibleInAdvance` should also be true.
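
To make the assumption described above concrete, here is a minimal sketch of a guard that checks both conditions explicitly instead of inferring one from the other. It is not the diff in this PR: `Capabilities`, `Selector`, and `canUseDictionaryPath` are hypothetical stand-ins that model only the two methods named above, not the real Druid interfaces.

```java
// Hypothetical stand-ins; only the two methods discussed in the description are modeled.
public class DictionaryCheckSketch {
    /** Stand-in for the capabilities check (assumed to return a plain boolean). */
    public interface Capabilities {
        boolean isDictionaryEncoded();
    }

    /** Stand-in for the selector check (assumed to return a plain boolean). */
    public interface Selector {
        boolean nameLookupPossibleInAdvance();
    }

    /**
     * Take the dictionary-based path only when both conditions hold,
     * rather than treating isDictionaryEncoded as implying the other.
     */
    public static boolean canUseDictionaryPath(Capabilities capabilities, Selector selector) {
        return capabilities.isDictionaryEncoded() && selector.nameLookupPossibleInAdvance();
    }

    public static void main(String[] args) {
        Capabilities dictionaryEncoded = () -> true;
        // Models a selector that cannot promise name lookup in advance,
        // such as the realtime case described above.
        Selector lookupNotPossibleYet = () -> false;
        Selector lookupPossible = () -> true;

        System.out.println(canUseDictionaryPath(dictionaryEncoded, lookupNotPossibleYet));
        System.out.println(canUseDictionaryPath(dictionaryEncoded, lookupPossible));
    }
}
```

With a guard like this, a caller could fall back to a non-dictionary code path when the second condition is false rather than failing later.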
   
   Prior to this fix, the added tests would fail with an error of the form:
   ```
   Selector of class[org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector] does not have a dictionary, cannot use it.
   ```
   because of [this check](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/virtual/SingleStringInputDimensionSelector.java#L52).
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [x] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `StringDimensionIndexer`
    * `ExpressionSelectors`
   

