maytasm commented on a change in pull request #10053:
URL: https://github.com/apache/druid/pull/10053#discussion_r442605486
##########
File path:
processing/src/main/java/org/apache/druid/query/topn/TopNQueryEngine.java
##########
@@ -126,14 +126,20 @@ private TopNMapFn getMapFn(
// Once we have arbitrary dimension types following check should be
replaced by checking
// that the column is of type long and single-value.
dimension.equals(ColumnHolder.TIME_COLUMN_NAME)
- ) {
+ ) {
// A special TimeExtractionTopNAlgorithm is required, since
DimExtractionTopNAlgorithm
// currently relies on the dimension cardinality to support
lexicographic sorting
topNAlgorithm = new TimeExtractionTopNAlgorithm(adapter, query);
} else if (selector.isHasExtractionFn()) {
topNAlgorithm = new HeapBasedTopNAlgorithm(adapter, query);
- } else if (columnCapabilities == null || !(columnCapabilities.getType() ==
ValueType.STRING
- &&
columnCapabilities.isDictionaryEncoded())) {
+ } else if (
+ columnCapabilities == null ||
+ !(columnCapabilities.getType() == ValueType.STRING &&
+ columnCapabilities.isDictionaryEncoded() &&
+ columnCapabilities.areDictionaryValuesSorted().isTrue() &&
+ columnCapabilities.areDictionaryValuesUnique().isTrue()
+ )
+ ) {
// Use HeapBasedTopNAlgorithm for non-Strings and for
non-dictionary-encoded Strings, and for things we don't know
Review comment:
Can you update the comments and also mention why we can/want to use
HeapBasedTopNAlgorithm for each of those conditions (not sorted, not unique,
etc)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]