jihoonson commented on a change in pull request #10685:
URL: https://github.com/apache/druid/pull/10685#discussion_r552969515



##########
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferMinMaxOffsetHeap.java
##########
@@ -59,6 +60,35 @@ public ByteBufferMinMaxOffsetHeap(
     this.heapIndexUpdater = heapIndexUpdater;
   }
 
+  public ByteBufferMinMaxOffsetHeap copy()
+  {
+    LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater updater =
+        Optional
+            .ofNullable(heapIndexUpdater)
+            .map(LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater::copy)
+            .orElse(null);
+
+    // deep copy buf
+    ByteBuffer buffer = ByteBuffer.allocateDirect(buf.capacity());
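
A note for readers skimming the hunk: allocating a new direct buffer is only the first half of a deep copy; the contents still have to be transferred. A minimal self-contained sketch of deep-copying a direct ByteBuffer (the class and method names here are illustrative, not Druid's code) might look like:

```java
import java.nio.ByteBuffer;

public class BufferCopyExample
{
  // Sketch: deep-copy a direct ByteBuffer without disturbing the
  // source buffer's position/limit. Illustrative only.
  static ByteBuffer deepCopy(ByteBuffer src)
  {
    ByteBuffer copy = ByteBuffer.allocateDirect(src.capacity());
    ByteBuffer view = src.asReadOnlyBuffer(); // independent position/limit
    view.clear();                             // position = 0, limit = capacity
    copy.put(view);                           // bulk-copy all bytes
    copy.clear();                             // reset copy's cursor
    return copy;
  }

  public static void main(String[] args)
  {
    ByteBuffer src = ByteBuffer.allocateDirect(8);
    src.putInt(0, 42);                        // absolute put, position unchanged
    ByteBuffer copy = deepCopy(src);
    System.out.println(copy.getInt(0));       // prints 42
  }
}
```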

Review comment:
       > Yeah, I agree. My biggest concern with this code is that it is not 
obvious to the caller that creating new iterators of the type being changed 
here will allocate new off-heap memory in an unbounded fashion. This is ok if 
we think that not "too many" copies will be done but I cannot affirm that.
   
   It seems pretty dangerous to me, as users are not expected to be aware of 
this behavior. The blocking merge buffer pool exists precisely to avoid 
unexpectedly using "too many" copies at the same time, as in this problematic 
scenario. 
   
   By the way, on second look, I'm wondering why we copy the buffer instead 
of fixing the iterator of `LimitedBufferHashGrouper` to be re-iterable. That 
seems like a better fix to me.
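
For concreteness, the "re-iterable" idea follows the standard Java pattern: expose an `Iterable` whose `iterator()` builds fresh cursor state over the shared backing data on every call, so iterating twice needs no extra off-heap allocation. A hypothetical sketch (names and the `int[]` stand-in for the buffer are illustrative, not Druid's actual API):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical: each iterator() call creates fresh per-iteration state
// over the same backing data, so re-iteration allocates nothing new.
class ReiterableOffsets implements Iterable<Integer>
{
  private final int[] offsets; // stands in for offsets in the shared buffer

  ReiterableOffsets(int[] offsets)
  {
    this.offsets = offsets;
  }

  @Override
  public Iterator<Integer> iterator()
  {
    return new Iterator<Integer>()
    {
      private int cursor = 0; // per-iterator state; backing data is shared

      @Override
      public boolean hasNext()
      {
        return cursor < offsets.length;
      }

      @Override
      public Integer next()
      {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return offsets[cursor++];
      }
    };
  }
}
```

This sketch assumes iteration does not mutate the backing data; if the grouper's iteration destroys heap ordering as a side effect, making it re-iterable would also require addressing that.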




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
