Github user maryannxue commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/171#discussion_r65293136
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/iterate/DeferredByteBufferSegmentQueue.java ---
    @@ -0,0 +1,123 @@
    +package org.apache.phoenix.iterate;
    +
    +import org.apache.commons.io.output.DeferredFileOutputStream;
    +import org.apache.phoenix.memory.MemoryManager;
    +import org.apache.phoenix.memory.MemoryManager.MemoryChunk;
    +
    +import java.io.*;
    +import java.util.*;
    +
    +public abstract class DeferredByteBufferSegmentQueue<T> extends BufferSegmentQueue<T> {
    +
    +    final MemoryChunk chunk;
    +
    +    public DeferredByteBufferSegmentQueue(int index, int thresholdBytes,
    +                                          boolean hasMaxQueueSize, MemoryManager memoryManager) {
    +        super(index, thresholdBytes, hasMaxQueueSize);
    +        chunk = memoryManager.allocate(thresholdBytes);
    --- End diff --
    
    "thresholdBytes" might be confusing here. There are actually two sources of memory usage. The first is the in-memory priority queue used for sorting. Its size is an estimate (based on the priority queue data structure) rather than an actual value; once that estimate reaches the threshold, the priority queue content should be written to some kind of file OutputStream, which is now DeferredFileOutputStream. The second is the memory used by DeferredFileOutputStream itself, since its content stays in memory until its own threshold is reached.
    Therefore, we might need to allocate twice (it's not a real allocation anyway; it only tracks memory usage). But a better way to do this is to reuse the SpoolingResultIterator logic to handle the entire second part mentioned above. The logic should be exactly the same, except that SpoolingResultIterator writes and reads Tuples, whereas what you need here is something that writes and reads ResultEntry. So see if you can apply some abstraction here.
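
    To illustrate the kind of abstraction I mean, here is a rough, self-contained sketch. It is not the actual SpoolingResultIterator code, and every name in it (SpoolSketch, Codec, Spool, STRING_CODEC) is hypothetical. It also hand-rolls the "stay in memory until a threshold, then spill to a file" behaviour instead of using DeferredFileOutputStream, purely so that it runs with the JDK alone. The point is only that the spooling logic can be written once against a small codec interface, so that Tuple and ResultEntry would differ in nothing but how a single element is written and read.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

final class SpoolSketch {
    /** Hypothetical codec: the only piece that would differ per element type. */
    interface Codec<T> {
        void write(DataOutputStream out, T value) throws IOException;
        T read(DataInputStream in) throws IOException;
    }

    /** Keeps serialized elements in memory up to thresholdBytes, then spills to a temp file. */
    static final class Spool<T> implements Closeable {
        private final Codec<T> codec;
        private final int thresholdBytes;
        private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
        private Path file;              // null while everything is still in memory
        private DataOutputStream out;
        private int count;

        Spool(Codec<T> codec, int thresholdBytes) {
            this.codec = codec;
            this.thresholdBytes = thresholdBytes;
            this.out = new DataOutputStream(memory);
        }

        void add(T value) throws IOException {
            codec.write(out, value);
            count++;
            // Spill once, the first time the in-memory bytes exceed the threshold.
            if (file == null && memory.size() > thresholdBytes) {
                spill();
            }
        }

        private void spill() throws IOException {
            out.flush();
            file = Files.createTempFile("spool", ".bin");
            OutputStream fileOut = new BufferedOutputStream(Files.newOutputStream(file));
            memory.writeTo(fileOut);    // copy what was buffered so far
            memory.reset();             // release the in-memory copy
            out = new DataOutputStream(fileOut);
        }

        boolean isInMemory() {
            return file == null;
        }

        /** Reads back every element in insertion order, from memory or from the file. */
        List<T> readAll() throws IOException {
            out.flush();
            InputStream src = (file == null)
                    ? new ByteArrayInputStream(memory.toByteArray())
                    : new BufferedInputStream(Files.newInputStream(file));
            List<T> result = new ArrayList<>(count);
            try (DataInputStream in = new DataInputStream(src)) {
                for (int i = 0; i < count; i++) {
                    result.add(codec.read(in));
                }
            }
            return result;
        }

        @Override
        public void close() throws IOException {
            out.close();
            if (file != null) {
                Files.deleteIfExists(file);
            }
        }
    }

    /** Example codec for Strings; a ResultEntry or Tuple codec would follow the same shape. */
    static final Codec<String> STRING_CODEC = new Codec<String>() {
        @Override
        public void write(DataOutputStream out, String v) throws IOException {
            out.writeUTF(v);
        }

        @Override
        public String read(DataInputStream in) throws IOException {
            return in.readUTF();
        }
    };
}
```

    With something of this shape, the sort queue would only need to track its own estimated priority-queue size, and hand the whole second part (the in-memory buffer, its threshold, and the file) to the shared spooling code together with a ResultEntry codec.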


