jihoonson commented on a change in pull request #10685:
URL: https://github.com/apache/druid/pull/10685#discussion_r552969515
##########
File path:
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/ByteBufferMinMaxOffsetHeap.java
##########
@@ -59,6 +60,35 @@ public ByteBufferMinMaxOffsetHeap(
this.heapIndexUpdater = heapIndexUpdater;
}
+ public ByteBufferMinMaxOffsetHeap copy()
+ {
+    LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater updater =
+        Optional
+            .ofNullable(heapIndexUpdater)
+            .map(LimitedBufferHashGrouper.BufferGrouperOffsetHeapIndexUpdater::copy)
+            .orElse(null);
+
+ // deep copy buf
+ ByteBuffer buffer = ByteBuffer.allocateDirect(buf.capacity());
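The hunk above allocates a fresh direct buffer of the same capacity and deep-copies the heap's contents into it. A minimal standalone sketch of that copy step (not the PR's code; `BufferCopySketch` and `deepCopy` are hypothetical names for illustration):

```java
import java.nio.ByteBuffer;

public class BufferCopySketch
{
  // Hypothetical helper mirroring what copy() in the diff does with the
  // buffer: every call allocates new off-heap (direct) memory equal to the
  // source's capacity, which is exactly the unbounded-allocation concern.
  public static ByteBuffer deepCopy(ByteBuffer src)
  {
    ByteBuffer dst = ByteBuffer.allocateDirect(src.capacity());
    ByteBuffer view = src.asReadOnlyBuffer(); // don't disturb src's position/limit
    view.clear();                             // cover the full capacity
    dst.put(view);
    dst.clear();
    return dst;
  }

  public static void main(String[] args)
  {
    ByteBuffer original = ByteBuffer.allocateDirect(16);
    original.putInt(0, 42);

    ByteBuffer copy = deepCopy(original);
    // The copy is independent: mutating it does not affect the original.
    copy.putInt(0, 7);
    System.out.println(original.getInt(0)); // 42
    System.out.println(copy.getInt(0));     // 7
  }
}
```

Note that each `deepCopy` call is a new `allocateDirect`, so N concurrent iterators mean N extra buffers of off-heap memory, outside any pool accounting.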
Review comment:
> Yeah, I agree. My biggest concern with this code is that it is not
obvious to the caller that creating new iterators of the type being changed
here will allocate new off-heap memory in an unbounded fashion. This is OK if
we think that not "too many" copies will be made, but I cannot affirm that.
It seems pretty dangerous to me since users are not expected to be aware of
this behavior. The blocking merge buffer pool exists precisely to avoid
unexpectedly holding "too many" copies at the same time, as in this
problematic scenario.

By the way, on a second look, I'm wondering why we copy the buffer instead
of fixing the iterator of `LimitedBufferHashGrouper` to be re-iterable. That
seems like a better fix to me.
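The re-iterable alternative suggested here could look roughly like the sketch below (hypothetical names, not Druid's actual API): an `Iterable` whose `iterator()` starts a fresh traversal over the same shared buffer, so re-iteration allocates no new off-heap memory.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: instead of deep-copying the buffer for each new
// iterator, expose an Iterable whose iterator() returns a fresh cursor over
// the same underlying buffer. Each cursor is just an int index.
public class OffsetHeapIterable implements Iterable<Integer>
{
  private final ByteBuffer buf; // shared, never copied
  private final int size;       // number of int offsets stored

  public OffsetHeapIterable(ByteBuffer buf, int size)
  {
    this.buf = buf;
    this.size = size;
  }

  @Override
  public Iterator<Integer> iterator()
  {
    return new Iterator<Integer>()
    {
      private int i = 0; // per-iterator cursor; re-iterating creates a new one

      @Override
      public boolean hasNext()
      {
        return i < size;
      }

      @Override
      public Integer next()
      {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        // Absolute read: the shared buffer's position/limit stay untouched.
        return buf.getInt(Integer.BYTES * i++);
      }
    };
  }
}
```

With this shape, a second `for (int offset : iterable)` loop simply walks the buffer again, so callers can re-iterate freely without any hidden `allocateDirect`.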
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]