Re: [PR] rework cursor creation (druid)

via GitHub Wed, 14 Aug 2024 19:39:09 -0700


clintropolis commented on code in PR #16533:
URL: https://github.com/apache/druid/pull/16533#discussion_r1717807468



##########
processing/src/main/java/org/apache/druid/query/topn/TopNQueryEngine.java:
##########
@@ -193,13 +205,74 @@ private static boolean canUsePooledAlgorithm(
       // non-string output cannot use the pooled algorith, even if the 
underlying selector supports it
       return false;
     }
-    if (Types.is(capabilities, ValueType.STRING)) {
-      // string columns must use the on heap algorithm unless they have the 
following capabilites
-      return capabilities.isDictionaryEncoded().isTrue() && 
capabilities.areDictionaryValuesUnique().isTrue();
-    } else {
+    if (!Types.is(capabilities, ValueType.STRING)) {
       // non-strings are not eligible to use the pooled algorithm, and should 
use a heap algorithm
       return false;
     }
+
+    // string columns must use the on heap algorithm unless they have the 
following capabilites
+    if (!capabilities.isDictionaryEncoded().isTrue() || 
!capabilities.areDictionaryValuesUnique().isTrue()) {
+      return false;
+    }
+    if (Granularities.ALL.equals(query.getGranularity())) {
+      // all other requirements have been satisfied, ALL granularity can 
always use the pooled algorithms
+      return true;
+    }
+    // if not using ALL granularity, we can still potentially use the pooled 
algorithm if we are certain it doesn't
+    // need to make multiple passes (e.g. reset the cursor)
+    try (final ResourceHolder<ByteBuffer> resultsBufHolder = 
bufferPool.take()) {
+      final ByteBuffer resultsBuf = resultsBufHolder.get();
+      resultsBuf.clear();
+
+      final int numBytesToWorkWith = resultsBuf.remaining();
+      final int numValuesPerPass = numBytesPerRecord > 0 ? numBytesToWorkWith 
/ numBytesPerRecord : cardinality;
+
+      return numValuesPerPass <= cardinality;
+    }
+  }
+
+  private static boolean shouldUseAggregateMetricFirstAlgorithm(TopNQuery 
query, TopNAlgorithmSelector selector)
+  {
+    // must be using ALL granularity since it makes multiple passes and must 
reset the cursor
+    if (Granularities.ALL.equals(query.getGranularity())) {
+      return selector.isAggregateTopNMetricFirst() || 
query.context().getBoolean("doAggregateTopNMetricFirst", false);
+    }
+    return false;
+  }
+
+  public static CursorBuildSpec makeCursorBuildSpec(TopNQuery query, @Nullable 
QueryMetrics<?> queryMetrics)
+  {
+    // virtual column is currently only used as a decorator to pass to the 
cursor holder to allow specializing cursor
+    // and vector cursors if any pre-aggregated data at the matching 
granularity is available
+    // eventually this could probably be reworked to be used by the 
granularizer instead of the existing method
+    // of creating a selector on the time column
+    final VirtualColumn granularityVirtual = 
Granularities.toVirtualColumn(query);

Review Comment:
   This was originally the funny method on `CursorBuildSpec` that took grouping 
columns, virtual columns, and granularity, but that was a bit too strange. 
Maybe it should be a static method on `Granularities` that accepts a `Query` 
and a `CursorBuildSpecBuilder`, will play around with it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] rework cursor creation (druid)

Reply via email to