saithal-confluent commented on code in PR #16517:
URL: https://github.com/apache/druid/pull/16517#discussion_r1630658620
##########
processing/src/main/java/org/apache/druid/segment/StringDimensionIndexer.java:
##########
@@ -132,12 +142,14 @@ public EncodedKeyComponent<int[]> processRowValsToUnsortedEncodedKeyComponent(@N
     } else if (dimValues instanceof byte[]) {
       encodedDimensionValues =
           new int[]{dimLookup.add(emptyToNullIfNeeded(StringUtils.encodeBase64String((byte[]) dimValues)))};
+      dictionaryChanged = true;
     } else {
       encodedDimensionValues = new int[]{dimLookup.add(emptyToNullIfNeeded(dimValues))};
+      dictionaryChanged = true;
     }
     // If dictionary size has changed, the sorted lookup is no longer valid.
-    if (oldDictSize != dimLookup.size()) {
+    if (dictionaryChanged) {
Review Comment:
You are right: this also invalidates the sorted lookup on cache hits, which
changes the behaviour. I've pushed a new commit that changes the return type of
the `add` function from `int` to `Pair<Integer, Boolean>`, where the boolean
indicates whether an element was actually added to the dictionary. The integer
is the same value that `add` returned before.
Please take a look @kgyrtkirk and @clintropolis
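
For illustration only, here is a minimal sketch of the shape of that change, assuming a simplified in-memory dictionary. `SketchDictionary` and `AddResult` are hypothetical names (the commit itself uses `Pair<Integer, Boolean>`), and this is not Druid's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the dimension dictionary; not Druid's real class. */
final class SketchDictionary
{
  /** Hypothetical result type standing in for Pair<Integer, Boolean>. */
  static final class AddResult
  {
    final int id;        // dictionary id of the value (same as the old int return)
    final boolean added; // true only if the value was newly inserted

    AddResult(int id, boolean added)
    {
      this.id = id;
      this.added = added;
    }
  }

  private final Map<String, Integer> valueToId = new HashMap<>();
  private final List<String> idToValue = new ArrayList<>();

  /** Returns the id of the value plus a flag saying whether the dictionary grew. */
  AddResult add(String value)
  {
    Integer existing = valueToId.get(value);
    if (existing != null) {
      // Cache hit: the dictionary is unchanged, so a sorted lookup stays valid.
      return new AddResult(existing, false);
    }
    int id = idToValue.size();
    valueToId.put(value, id);
    idToValue.add(value);
    return new AddResult(id, true);
  }

  int size()
  {
    return idToValue.size();
  }
}
```

With a result like this, a caller could set `dictionaryChanged` from the `added` flag rather than unconditionally, so that cache hits leave the sorted lookup intact.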
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]