Re: [PR] [Groupby Query Metrics] Add merge buffer tracking (druid)

via GitHub Mon, 19 Jan 2026 22:08:11 -0800


abhishekrb19 commented on code in PR #18731:
URL: https://github.com/apache/druid/pull/18731#discussion_r2706872449



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/LimitedBufferHashGrouper.java:
##########
@@ -571,7 +585,19 @@ public void adjustTableWhenFull()
 
       size = numCopied;
       tableBuffer = newTableBuffer;
+      updateMaxTableBufferUsedBytes();
       growthCount++;
     }
+
+    @Override
+    protected void updateMaxTableBufferUsedBytes()
+    {
+      long currentBufferUsedBytes = 0;
+      for (ByteBuffer buffer : subHashTableBuffers) {
+        currentBufferUsedBytes += buffer.capacity();
+      }

Review Comment:
   I think this effectively would just be `tableArenaSize` which would reflect 
the allocated configured size rather than actual used size?



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferHashTable.java:
##########
@@ -381,6 +387,16 @@ public int getGrowthCount()
     return growthCount;
   }
 
+  protected void updateMaxTableBufferUsage()
+  {
+    maxTableBufferUsage = Math.max(maxTableBufferUsage, 
tableBuffer.capacity());

Review Comment:
   Thanks for the update. Please see 
https://github.com/apache/druid/pull/18731/changes#r2706872449



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/LimitedBufferHashGrouper.java:
##########
@@ -571,7 +585,19 @@ public void adjustTableWhenFull()
 
       size = numCopied;
       tableBuffer = newTableBuffer;
+      updateMaxTableBufferUsedBytes();
       growthCount++;
     }
+
+    @Override
+    protected void updateMaxTableBufferUsedBytes()
+    {
+      long currentBufferUsedBytes = 0;
+      for (ByteBuffer buffer : subHashTableBuffers) {
+        currentBufferUsedBytes += buffer.capacity();
+      }

Review Comment:
   Please update `LimitedBufferHashGrouperTest`, `BufferHashGrouperTest`  and 
related grouper tests to validate the correctness of these implementations.
   
   I just pulled in the latest patch locally and ran some group by queries and 
noticed that the `bytesUsed` and `maxBytesUsed` were more or less what was 
configured `druid.processing.buffer.sizeBytes` 🤔 
   
   
   I’ll try to dig into this more, but in the meantime, I’d still recommend 
splitting the PR into two parts: 1. max metrics 2. the bytesUsed and 
maxBytesUsed metrics
   2 seems a bit more involved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [Groupby Query Metrics] Add merge buffer tracking (druid)

Reply via email to