Re: [PR] [Groupby Query Metrics] Add merge buffer tracking (druid)

via GitHub Tue, 13 Jan 2026 11:48:10 -0800


abhishekrb19 commented on code in PR #18731:
URL: https://github.com/apache/druid/pull/18731#discussion_r2684118134



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/BufferHashGrouper.java:
##########
@@ -199,18 +209,15 @@ public int size()
       }
 
       // Sort offsets in-place.
-      Collections.sort(
-          wrappedOffsets,
-          (lhs, rhs) -> {
-            final ByteBuffer tableBuffer = hashTable.getTableBuffer();
-            return comparator.compare(
-                tableBuffer,
-                tableBuffer,
-                lhs + HASH_SIZE,
-                rhs + HASH_SIZE
-            );
-          }
-      );

Review Comment:
   Functionally there's no difference between the two sorting methods, right? 
If this is a stylistic preference, please revert this from the PR and we can do 
these changes separately if desired.



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ConcurrentGrouper.java:
##########
@@ -332,7 +332,7 @@ public void reset()
       throw new ISE("Grouper is closed");
     }
 
-    groupers.forEach(Grouper::reset);

Review Comment:
   Is this change required? Given that the `ConcurrentGrouper` operates on 
`SpillingGrouper` instance, I suppose this is technically correct. But calling 
`Grouper::reset` as it was earlier should already ensure that the specific 
reset/close methods from `SpillingGrouper` are invoked? If so, could we revert 
this change here and below?



##########
processing/src/main/java/org/apache/druid/query/groupby/GroupByStatsProvider.java:
##########
@@ -100,6 +134,21 @@ public long getMergeBufferAcquisitionTimeNs()
       return mergeBufferAcquisitionTimeNs;
     }
 
+    public long getMergeBufferTotalUsage()
+    {
+      return mergeBufferTotalUsage;
+    }

Review Comment:
   On the naming, consider:
   - `getMergeBufferTotalUsage() -> `getMergeBufferTotalUsedBytes()`
   - `getMaxMergeBufferUsage()` -> `getMaxMergeBufferUsedBytes()



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/AbstractBufferHashGrouper.java:
##########
@@ -173,6 +173,15 @@ public void close()
     aggregators.reset();
   }
 
+  /**
+   * This method is implemented to return the highest memory value claimed by 
the Grouper. This is only
+   * used for monitoring the size of the merge buffers used.
+   */
+  public long getMergeBufferUsage()
+  {
+    return hashTable.getMaxTableBufferUsage();
+  }

Review Comment:
   please move this to the `Grouper` interface and have a default 
implementation for it. Override it in all the concrete classes as needed:
   ```
     default long getMergeBufferUsage()
     {
       return 0l;
     }
   ```



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferHashTable.java:
##########
@@ -381,6 +387,16 @@ public int getGrowthCount()
     return growthCount;
   }
 
+  protected void updateMaxTableBufferUsage()
+  {
+    maxTableBufferUsage = Math.max(maxTableBufferUsage, 
tableBuffer.capacity());

Review Comment:
   Unless I'm missing something, the issue 
https://github.com/apache/druid/issues/17902 was created to actually get some 
visibility into actual merge buffer usage, but this metric would just tell us 
how much was actually configured instead? 
   
   `tableBuffer.capacity()` would just indicate the capacity of the buffers, so 
more or less what was configured via `druid.processing.buffer.sizeBytes`.



##########
processing/src/main/java/org/apache/druid/query/groupby/GroupByStatsProvider.java:
##########
@@ -100,6 +134,21 @@ public long getMergeBufferAcquisitionTimeNs()
       return mergeBufferAcquisitionTimeNs;
     }
 
+    public long getMergeBufferTotalUsage()
+    {
+      return mergeBufferTotalUsage;
+    }
+
+    public long getMaxMergeBufferAcquisitionTimeNs()
+    {
+      return maxMergeBufferAcquisitionTimeNs;
+    }

Review Comment:
   nit: consider moving this after `getMergeBufferAcquisitionTimeNs()` so the 
maxes are grouped together with their total counterparts



##########
server/src/test/java/org/apache/druid/server/metrics/GroupByStatsMonitorTest.java:
##########
@@ -185,6 +194,55 @@ public void testMonitoringMergeBuffer_pendingRequests()
     }
   }
 
+  @Test
+  public void testMonitoringWithoutMockingGroupByStatsProvider()

Review Comment:
   Perhaps something like this to avoid confusion - the existing 
`groupByStatsProvider` in the test setup is still a concrete type (not a mock), 
even though it overrides a method
   
   ```suggestion
     public void testMonitoringWithMultipleResources()
   ```



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/SpillingGrouper.java:
##########
@@ -68,7 +68,7 @@ public class SpillingGrouper<KeyType> implements 
Grouper<KeyType>
       "Not enough disk space to execute this query. Try raising 
druid.query.groupBy.maxOnDiskStorage."
   );
 
-  private final Grouper<KeyType> grouper;
+  private final AbstractBufferHashGrouper<KeyType> grouper;

Review Comment:
   Same comment as above. I think it would be better to define these methods in 
the `Grouper` interface and leave this change as it was earlier. Mixing 
`Grouper` with impls otherwise seems confusing



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferIntList.java:
##########
@@ -55,6 +55,7 @@ public void add(int val)
     }
     buffer.putInt(numElements * Integer.BYTES, val);
     numElements++;
+    maxMergeBufferUsageBytes = Math.max(maxMergeBufferUsageBytes, numElements 
* Integer.BYTES);

Review Comment:
   Actually I think this variable and state tracking isn't needed in `add()` 
since we're tracking `numElements` already. We can just do it inline 
`numElements * Integer.BYTES` from `getMaxMergeBufferUsageBytes()`



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferIntList.java:
##########
@@ -71,4 +72,9 @@ public void reset()
   {
     numElements = 0;

Review Comment:
   Shouldn't `reset()` also set `maxMergeBufferUsageBytes` to 0?



##########
server/src/test/java/org/apache/druid/server/metrics/GroupByStatsMonitorTest.java:
##########
@@ -128,11 +134,17 @@ public void testMonitorWithServiceDimensions()
     verifyMetricValue(emitter, "mergeBuffer/used", dimFilters, 0L);
     verifyMetricValue(emitter, "mergeBuffer/queries", dimFilters, 1L);
     verifyMetricValue(emitter, "mergeBuffer/acquisitionTimeNs", dimFilters, 
100L);
+    verifyMetricValue(emitter, "mergeBuffer/bytesUsed", dimFilters, 200L);
+    verifyMetricValue(emitter, "mergeBuffer/maxAcquisitionTimeNs", dimFilters, 
100L);
+    verifyMetricValue(emitter, "mergeBuffer/maxBytesUsed", dimFilters, 200L);
     verifyMetricValue(emitter, "groupBy/spilledQueries", dimFilters, 2L);
     verifyMetricValue(emitter, "groupBy/spilledBytes", dimFilters, 200L);
+    verifyMetricValue(emitter, "groupBy/maxSpilledBytes", dimFilters, 200L);
     verifyMetricValue(emitter, "groupBy/mergeDictionarySize", dimFilters, 
300L);
+    verifyMetricValue(emitter, "groupBy/maxMergeDictionarySize", dimFilters, 
300L);

Review Comment:
   It looks much cleaner, thanks for adding it!



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/AbstractBufferHashGrouper.java:
##########
@@ -173,6 +173,15 @@ public void close()
     aggregators.reset();
   }
 
+  /**
+   * This method is implemented to return the highest memory value claimed by 
the Grouper. This is only
+   * used for monitoring the size of the merge buffers used.
+   */
+  public long getMergeBufferUsage()
+  {
+    return hashTable.getMaxTableBufferUsage();
+  }

Review Comment:
   Also, `getMergeBufferUsage()` seems a bit vague. What do you think about 
`getMaxMergeBufferUsedBytes()`? Once decided on a name, please use the name 
across the board for consistency
   
   



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/BufferHashGrouper.java:
##########
@@ -154,6 +152,18 @@ public void reset()
     aggregators.reset();
   }
 
+  @Override
+  public long getMergeBufferUsage()
+  {
+    if (!initialized) {
+      return 0L;
+    }
+
+    long hashTableUsage = hashTable.getMaxTableBufferUsage();
+    long offSetListUsage = offsetList.getMaxMergeBufferUsageBytes();
+    return hashTableUsage + offSetListUsage;

Review Comment:
   Same comment, I think this would just more or less tell us configured size 
rather than actual buffer usage. (more or less because of offset list tracking)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [Groupby Query Metrics] Add merge buffer tracking (druid)

Reply via email to