clintropolis opened a new pull request, #13714:
URL: https://github.com/apache/druid/pull/13714

   ### Description
   Fixes an issue with nested columns that can occur when both an actual `null` 
and the string `"null"` are present in any nested path, which results in the 
`null` values incorrectly becoming associated with the `"null"` values.
   
   The bug was caused by a usage of `String.valueOf` in 
`StringFieldColumnWriter` that did not check for null values when writing 
out the column, so `null` was converted to the string `"null"` before the 
global id lookup. This still worked by dumb luck because the fastutil 2Int maps 
that back the global id lookup when writing out segments had a default 
value of 0, which happens to be the `null` global id, so even though `"null"` 
wasn't present in the global dictionary, `null` ended up with the correct id. 
However, if `"null"` was present, `null` would incorrectly be written out as 
the `"null"` global id and associated with that value instead.
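
   The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual `StringFieldColumnWriter` code: the class and method names are hypothetical, and a plain `HashMap` with `getOrDefault` stands in for the fastutil map and its default return value of 0.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the writer's global id lookup; not Druid's actual code.
class NullLookupSketch {
    // Assumption: emulates a map whose default return value is 0,
    // which is also the global id reserved for null.
    static final int DEFAULT_ID = 0;

    // Buggy variant: converts the value to a string without a null check.
    static int lookupBuggy(Map<String, Integer> globalIds, Object value) {
        // String.valueOf((Object) null) returns the four-character string "null".
        String key = String.valueOf(value);
        return globalIds.getOrDefault(key, DEFAULT_ID);
    }

    // Fixed variant: handles null before any string conversion.
    static int lookupFixed(Map<String, Integer> globalIds, Object value) {
        if (value == null) {
            return 0; // the null global id
        }
        return globalIds.getOrDefault(String.valueOf(value), DEFAULT_ID);
    }

    public static void main(String[] args) {
        Map<String, Integer> withoutNullString = new HashMap<>();
        withoutNullString.put("a", 1);

        Map<String, Integer> withNullString = new HashMap<>();
        withNullString.put("a", 1);
        withNullString.put("null", 2);

        // "null" is absent, so the default of 0 hides the bug.
        System.out.println(lookupBuggy(withoutNullString, null));
        // "null" is present, so null is mapped to the id of the string "null".
        System.out.println(lookupBuggy(withNullString, null));
        // With the null check, null always maps to id 0.
        System.out.println(lookupFixed(withNullString, null));
    }
}
```

   In this sketch the buggy lookup only returns the wrong id once the string `"null"` is itself in the dictionary, which matches why the problem went unnoticed.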
   
   As a safety measure, I've changed the 2Int maps to have a default value of 
-1, and added a check that the global id is in range before writing it to the 
column, to ensure mistakes like this don't happen in the future.
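
   The idea behind this safety measure can be sketched as follows. This is an illustration under assumptions rather than the code in this PR: the names are hypothetical, a `HashMap` with a default of -1 stands in for the fastutil map, and valid ids are assumed to run from 0 (the `null` id) up to the dictionary size minus one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a "missing" sentinel plus a range check; not Druid's actual code.
class GlobalIdCheckSketch {
    // Assumption: -1 can never be a valid global id, so it is safe as a sentinel.
    static final int MISSING = -1;

    static int checkedLookup(Map<String, Integer> globalIds, String value, int dictionarySize) {
        // null is handled explicitly and maps to the null global id, 0.
        int id = (value == null) ? 0 : globalIds.getOrDefault(value, MISSING);
        // Fail loudly instead of silently writing an id that points at the wrong value.
        if (id < 0 || id >= dictionarySize) {
            throw new IllegalStateException(
                "Value [" + value + "] has no valid global id, got " + id
            );
        }
        return id;
    }

    public static void main(String[] args) {
        Map<String, Integer> globalIds = new HashMap<>();
        globalIds.put("a", 1);
        int dictionarySize = 2; // id 0 for null, id 1 for "a"

        System.out.println(checkedLookup(globalIds, "a", dictionarySize));
        System.out.println(checkedLookup(globalIds, null, dictionarySize));
        try {
            checkedLookup(globalIds, "missing", dictionarySize);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

   With a sentinel that is outside the valid range, a value missing from the dictionary surfaces as an error at write time instead of being associated with whatever value happens to own id 0.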
   
   The added test data in `NestedDataColumnSupplierTest` would fail prior to 
this PR.
   
   Unrelated to the fix, this PR also moves some column selector tests that had 
nothing to do with scan queries out of `NestedDataScanQueryTest` into their own file.
   
   This PR has:
   
   - [x] been self-reviewed.
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

