clintropolis commented on PR #19004: URL: https://github.com/apache/druid/pull/19004#issuecomment-3899870911
Re front-coding, I think I see what is going on: [the `ExpressionPredicateIndexSupplier` that is computing the indexes](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/math/expr/ExpressionPredicateIndexSupplier.java#L262) scans the dictionary in order to find all of the values that match, but it uses random access to get each value. For front coding, that means seeking within the bucket to the position we actually need, which is basically where all of the cost of using it is (in exchange for the smaller sizes). I suspect that when it is not using the indexes, the matcher is able to rule out a match earlier, so fewer overall calls to [`FrontCodedIndexed.get`](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/data/FrontCodedIndexed.java#L208) happen. Since we are checking every dictionary value, I think it would be a lot more chill for front-coding if we used the dictionary iterator instead of calling `get`; it needs to be exposed on `DictionaryEncodedValueIndex` so that the expression predicate indexes can use it. Will look into this :+1:
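To illustrate the cost difference, here is a minimal, self-contained sketch of a toy front-coded dictionary. It is not Druid's `FrontCodedIndexed` or its on-disk format: the class `ToyFrontCodedDictionary`, the bucket size of 4, and the `decodeSteps` counter are all hypothetical and exist only for illustration. Each value stores the length of the prefix it shares with the previous value in its bucket, so `get` has to re-decode from the start of the bucket, while the iterator decodes each value exactly once.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Toy front-coded dictionary for illustration only; NOT Druid's format or API. */
class ToyFrontCodedDictionary implements Iterable<String> {
    private static final int BUCKET_SIZE = 4; // assumed value for this sketch

    // For entry i: length of the prefix shared with the previous value in the
    // same bucket (0 for the first value of a bucket), plus the remaining suffix.
    private final int[] prefixLengths;
    private final String[] suffixes;

    /** Counts how many single-value decodes were performed (hypothetical metric). */
    long decodeSteps = 0;

    ToyFrontCodedDictionary(List<String> sortedValues) {
        int n = sortedValues.size();
        prefixLengths = new int[n];
        suffixes = new String[n];
        for (int i = 0; i < n; i++) {
            String current = sortedValues.get(i);
            int shared = 0;
            if (i % BUCKET_SIZE != 0) {
                String previous = sortedValues.get(i - 1);
                int max = Math.min(previous.length(), current.length());
                while (shared < max && previous.charAt(shared) == current.charAt(shared)) {
                    shared++;
                }
            }
            prefixLengths[i] = shared;
            suffixes[i] = current.substring(shared);
        }
    }

    int size() {
        return suffixes.length;
    }

    // Rebuilds entry i from the previously decoded value in the same bucket.
    private String decodeNext(String previous, int i) {
        decodeSteps++;
        return previous.substring(0, prefixLengths[i]) + suffixes[i];
    }

    /** Random access: must walk from the start of the bucket up to the index. */
    String get(int index) {
        int bucketStart = index - (index % BUCKET_SIZE);
        String value = "";
        for (int i = bucketStart; i <= index; i++) {
            value = decodeNext(value, i);
        }
        return value;
    }

    /** Sequential access: carries the previous value along, one decode per entry. */
    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int position = 0;
            private String previous = "";

            @Override
            public boolean hasNext() {
                return position < size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                previous = decodeNext(previous, position++);
                return previous;
            }
        };
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
            "druid", "druidism", "drum", "drummer", "dry", "dryad", "dryer", "dusk");
        ToyFrontCodedDictionary dict = new ToyFrontCodedDictionary(values);

        // Full scan via random access, as a predicate index scan using get would do.
        for (int i = 0; i < dict.size(); i++) {
            dict.get(i);
        }
        System.out.println("decode steps via get: " + dict.decodeSteps);

        // Same scan via the iterator.
        dict.decodeSteps = 0;
        for (String ignored : dict) {
            // a predicate would be evaluated on each value here
        }
        System.out.println("decode steps via iterator: " + dict.decodeSteps);
    }
}
```

With 8 values in buckets of 4, the `get`-based scan in this toy performs 1 + 2 + 3 + 4 decodes per bucket (20 in total), while the iterator performs 8. The gap grows with the bucket size, which is the intuition behind scanning with the dictionary iterator when every value has to be checked anyway.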
