[GitHub] [druid] clintropolis commented on a change in pull request #10053: fix topn on string columns with non-sorted or non-unique dictionaries

GitBox Fri, 19 Jun 2020 17:35:17 -0700


clintropolis commented on a change in pull request #10053:
URL: https://github.com/apache/druid/pull/10053#discussion_r443084818




##########
File path: 
processing/src/main/java/org/apache/druid/query/topn/TopNQueryEngine.java
##########
@@ -158,6 +156,40 @@ private TopNMapFn getMapFn(
     return new TopNMapFn(query, topNAlgorithm);
   }
 
+  /**
+   * {@link PooledTopNAlgorithm} (and {@link 
AggregateTopNMetricFirstAlgorithm} which utilizes the pooled
+   * algorithm) are optimized off-heap algorithms for aggregating dictionary 
encoded string columns. These algorithms
+   * rely on dictionary ids being unique so to aggregate on the dictionary ids 
directly and defer
+   * {@link org.apache.druid.segment.DimensionSelector#lookupName(int)} until 
as late as possible in query processing.
+   *
+   * When these conditions are not true, we have an on-heap fall-back 
algorithm, the {@link HeapBasedTopNAlgorithm}
+   * (and {@link TimeExtractionTopNAlgorithm} for a specialized form for long 
columns) which aggregates on values of
+   * selectors.
+   */
+  private static boolean requiresHeapAlgorithm(

Review comment:
       Ha, that was the original name of this method and it was flipped, but I 
second guessed myself at the last minute and swapped when I noticed only 1 
clause could return true. The javadocs still talk about the pooled algorithm 
first.
   
   Still, I agree and think `canUsePooledAlgorithm` is better, I'll switch it 
back :+1:




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [druid] clintropolis commented on a change in pull request #10053: fix topn on string columns with non-sorted or non-unique dictionaries

Reply via email to