chenglei created PHOENIX-2970:
---------------------------------

             Summary: SpoolingResultIterator uses memory too conservatively
                 Key: PHOENIX-2970
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2970
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.7.0
            Reporter: chenglei


     Even though SpoolingResultIterator will be deprecated, HBase versions older 
than 0.98.17 will continue to use it, and I think the 
DeferredByteBufferSegmentQueue class, which is similar to SpoolingResultIterator, 
may have the same problem in future versions.
      In SpoolingResultIterator's constructor, it tries to allocate 
"phoenix.query.spoolThresholdBytes" bytes from the MemoryManager and uses the 
allocated MemoryChunk's size as the DeferredFileOutputStream's threshold, as in 
the following code:
       {code:borderStyle=solid}
        final MemoryChunk chunk = mm.allocate(0, thresholdBytes);
        long waitTime = System.currentTimeMillis() - startTime;
        GLOBAL_MEMORY_WAIT_TIME.update(waitTime);
        memoryMetrics.getMemoryWaitTimeMetric().change(waitTime);
        DeferredFileOutputStream spoolTo = null;
        try {
            // Can't be bigger than int, since it's the max of the above allocation
            int size = (int)chunk.getSize();
            spoolTo = new DeferredFileOutputStream(size, "ResultSpooler", ".bin", new File(spoolDirectory)) {
                @Override
                protected void thresholdReached() throws IOException {
                    try {
                        super.thresholdReached();
                    } finally {
                        chunk.close();
                    }
                }
            };
       {code}
       
       SpoolingResultIterator assumes that the wrapped ResultIterator will 
always fetch "phoenix.query.spoolThresholdBytes" bytes, but most of the time it 
won't. For example, if we execute "select count(*) from table" on a big table 
with many regions, the ScanPlan will run too many SpoolingResultIterators in 
parallel to fetch the result, and each SpoolingResultIterator tries to allocate 
"phoenix.query.spoolThresholdBytes" bytes from the MemoryManager. If we don't 
have much memory, many of the SpoolingResultIterators will be allocated 0 bytes 
from the MemoryManager, and the DeferredFileOutputStream will unnecessarily use 
a temp file to hold the results, even if the result is just a single count 
value. This behavior slows down the query.
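       To make the effect concrete, here is a minimal, self-contained sketch 
(plain commons-io, outside Phoenix; the class name and spool directory are just 
for illustration) showing that a 0-byte threshold forces DeferredFileOutputStream 
to spill even a tiny result to a temp file:
       {code:borderStyle=solid}
import java.io.File;

import org.apache.commons.io.output.DeferredFileOutputStream;

public class ZeroThresholdSpoolDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical spool directory, standing in for "phoenix.query.spoolDirectory"
        File spoolDirectory = new File(System.getProperty("java.io.tmpdir"));

        // This is what the iterator ends up with when the MemoryChunk is 0 bytes:
        // an in-memory buffer whose threshold is 0.
        DeferredFileOutputStream spoolTo =
                new DeferredFileOutputStream(0, "ResultSpooler", ".bin", spoolDirectory);

        // Even a tiny result, e.g. a single count value, exceeds a 0-byte threshold,
        // so the very first write triggers thresholdReached() and spills to disk.
        spoolTo.write("42".getBytes());
        spoolTo.close();

        System.out.println("kept in memory: " + spoolTo.isInMemory()); // false
        System.out.println("spool file:     " + spoolTo.getFile());    // ResultSpooler*.bin
        spoolTo.getFile().delete();
    }
}
       {code}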
     Can we remove the MemoryManager, just as the MappedByteBufferSortedQueue 
does?
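     If so, the constructor could pass the configured threshold straight to 
DeferredFileOutputStream, roughly as in the sketch below (just an illustration 
of the idea, not a patch; the helper and its parameters are hypothetical):
       {code:borderStyle=solid}
import java.io.File;

import org.apache.commons.io.output.DeferredFileOutputStream;

public class SpoolWithoutMemoryManager {
    /**
     * Sketch only: skip the MemoryManager and use the configured
     * "phoenix.query.spoolThresholdBytes" value directly, the way
     * MappedByteBufferSortedQueue sizes its buffers without a MemoryChunk.
     */
    static DeferredFileOutputStream newSpoolStream(int thresholdBytes, String spoolDirectory) {
        return new DeferredFileOutputStream(
                thresholdBytes,               // in-memory buffer keeps the configured size
                "ResultSpooler", ".bin",
                new File(spoolDirectory));    // only spill to disk once the real threshold is hit
    }
}
       {code}
     With something like this, a single count result would stay in the in-memory 
buffer no matter how loaded the MemoryManager is.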
    





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
