[ 
https://issues.apache.org/jira/browse/PHOENIX-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated PHOENIX-2970:
------------------------------
    Description: 
Even though SpoolingResultIterator will be deprecated, HBase versions older 
than 0.98.17 will continue to use it, and I think the 
DeferredByteBufferSegmentQueue class, which is similar to 
SpoolingResultIterator, may have the same problem in future versions.

In its constructor, SpoolingResultIterator tries to allocate 
"phoenix.query.spoolThresholdBytes" bytes from the MemoryManager and uses the 
size of the allocated MemoryChunk as the DeferredFileOutputStream's threshold, 
as in the following code:

       {code:borderStyle=solid}
        final MemoryChunk chunk = mm.allocate(0, thresholdBytes);
        long waitTime = System.currentTimeMillis() - startTime;
        GLOBAL_MEMORY_WAIT_TIME.update(waitTime);
        memoryMetrics.getMemoryWaitTimeMetric().change(waitTime);
        DeferredFileOutputStream spoolTo = null;
        try {
            // Can't be bigger than int, since it's the max of the above allocation
            int size = (int)chunk.getSize();
            spoolTo = new DeferredFileOutputStream(size, "ResultSpooler",".bin", new File(spoolDirectory)) {
                @Override
                protected void thresholdReached() throws IOException {
                    try {
                        super.thresholdReached();
                    } finally {
                        chunk.close();
                    }
                }
            };
       {code} 
       
SpoolingResultIterator assumes that the wrapped ResultIterator will always 
fetch "phoenix.query.spoolThresholdBytes" bytes, but most of the time it won't. 
For example, if we execute "select count(1) from table" on a big table with 
many regions, the ScanPlan runs many SpoolingResultIterators in parallel to 
fetch the result, and each one tries to allocate 
"phoenix.query.spoolThresholdBytes" bytes from the MemoryManager. If there is 
not much memory available, many of the SpoolingResultIterators are allocated 
0 bytes, and their DeferredFileOutputStream unnecessarily spools the results to 
a temp file, even if the result is just a single count value. This behavior 
slows the query.
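
To make the starvation effect concrete, here is a minimal sketch. ToyMemoryPool 
and SpoolStarvationSketch are hypothetical names, not Phoenix classes, and the 
pool only imitates the allocate(min, max) behavior described above (the real 
MemoryManager may also wait for memory, which this toy does not model). The 
numbers in main are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a shared memory pool; NOT Phoenix's MemoryManager. */
final class ToyMemoryPool {
    private long available;

    ToyMemoryPool(long capacityBytes) {
        this.available = capacityBytes;
    }

    /** Grants between minBytes and maxBytes, limited by what is left in the pool. */
    synchronized long allocate(long minBytes, long maxBytes) {
        long granted = Math.min(maxBytes, available);
        if (granted < minBytes) {
            throw new IllegalStateException("pool exhausted");
        }
        available -= granted;
        return granted;
    }
}

final class SpoolStarvationSketch {
    /** Returns the in-memory threshold each of the parallel iterators ends up with. */
    static List<Long> thresholds(long poolBytes, long spoolThresholdBytes, int iterators) {
        ToyMemoryPool pool = new ToyMemoryPool(poolBytes);
        List<Long> granted = new ArrayList<>();
        for (int i = 0; i < iterators; i++) {
            // Mirrors mm.allocate(0, thresholdBytes): with a minimum of 0 the call
            // never fails, it just returns a smaller (possibly empty) chunk.
            granted.add(pool.allocate(0, spoolThresholdBytes));
        }
        return granted;
    }

    /** Counts iterators whose threshold is 0, i.e. that would spool on the first byte. */
    static long countStarved(List<Long> thresholds) {
        return thresholds.stream().filter(t -> t == 0L).count();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Hypothetical numbers: 100 MB pool, 20 MB threshold, 16 parallel scans.
        List<Long> granted = thresholds(100 * mb, 20 * mb, 16);
        System.out.println("starved iterators: " + countStarved(granted) + " of " + granted.size());
    }
}
```

With these made-up numbers the first five iterators reserve the whole pool, so 
the remaining eleven get a threshold of 0 even though each result is tiny.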
    
 Can we remove the MemoryManager from SpoolingResultIterator, just as 
MappedByteBufferSortedQueue does?
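
As a rough illustration of that idea, the sketch below uses a fixed in-memory 
threshold with no memory accounting at all. FixedThresholdSpooler is a 
hypothetical, stdlib-only stand-in for a memory-then-file stream; it is not the 
commons-io DeferredFileOutputStream and not a proposed patch. The trade-off, 
under this assumption, is that total memory use across parallel iterators is no 
longer bounded by a shared pool.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical stream that buffers in memory and spills to a temp file past a threshold. */
final class FixedThresholdSpooler extends OutputStream {
    private final int thresholdBytes;
    private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream current = memory;
    private File spoolFile; // stays null while everything fits in memory
    private long written;

    FixedThresholdSpooler(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    @Override
    public void write(int b) throws IOException {
        // Spill as soon as the next byte would exceed the threshold.
        if (spoolFile == null && written + 1 > thresholdBytes) {
            spillToDisk();
        }
        current.write(b);
        written++;
    }

    private void spillToDisk() throws IOException {
        spoolFile = File.createTempFile("ResultSpooler", ".bin");
        spoolFile.deleteOnExit();
        FileOutputStream out = new FileOutputStream(spoolFile);
        memory.writeTo(out); // move what was buffered so far into the file
        memory.reset();
        current = out;
    }

    boolean isInMemory() {
        return spoolFile == null;
    }

    @Override
    public void close() throws IOException {
        current.close();
    }
}

final class FixedThresholdDemo {
    /** Writes payloadBytes zero bytes and reports whether they stayed in memory. */
    static boolean staysInMemory(int thresholdBytes, int payloadBytes) {
        try (FixedThresholdSpooler spooler = new FixedThresholdSpooler(thresholdBytes)) {
            spooler.write(new byte[payloadBytes]);
            return spooler.isInMemory();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A count result is a handful of bytes; 8 is an arbitrary stand-in.
        System.out.println("threshold 0:    in memory = " + staysInMemory(0, 8));
        System.out.println("threshold 1024: in memory = " + staysInMemory(1024, 8));
    }
}
```

A threshold of 0, which is what a starved iterator gets today, creates a temp 
file for the first byte written, while any fixed threshold larger than the 
result keeps it in memory.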
    



  was:
Even though SpoolingResultIterator will be deprecated, HBase versions older 
than 0.98.17 will continue to use it, and I think the 
DeferredByteBufferSegmentQueue class, which is similar to 
SpoolingResultIterator, may have the same problem in future versions.

In its constructor, SpoolingResultIterator tries to allocate 
"phoenix.query.spoolThresholdBytes" bytes from the MemoryManager and uses the 
size of the allocated MemoryChunk as the DeferredFileOutputStream's threshold, 
as in the following code:

       {code:borderStyle=solid}
        final MemoryChunk chunk = mm.allocate(0, thresholdBytes);
        long waitTime = System.currentTimeMillis() - startTime;
        GLOBAL_MEMORY_WAIT_TIME.update(waitTime);
        memoryMetrics.getMemoryWaitTimeMetric().change(waitTime);
        DeferredFileOutputStream spoolTo = null;
        try {
            // Can't be bigger than int, since it's the max of the above allocation
            int size = (int)chunk.getSize();
            spoolTo = new DeferredFileOutputStream(size, "ResultSpooler",".bin", new File(spoolDirectory)) {
                @Override
                protected void thresholdReached() throws IOException {
                    try {
                        super.thresholdReached();
                    } finally {
                        chunk.close();
                    }
                }
            };
       {code} 
       
SpoolingResultIterator assumes that the wrapped ResultIterator will always 
fetch "phoenix.query.spoolThresholdBytes" bytes, but most of the time it won't. 
For example, if we execute "select count(1) from table" on a big table with 
many regions, the ScanPlan runs many SpoolingResultIterators in parallel to 
fetch the result, and each one tries to allocate 
"phoenix.query.spoolThresholdBytes" bytes from the MemoryManager. If there is 
not much memory available, many of the SpoolingResultIterators are allocated 
0 bytes, and their DeferredFileOutputStream unnecessarily spools the results to 
a temp file, even if the result is just a single count value. This behavior 
slows the query.
    
 Can we remove the MemoryManager, just as MappedByteBufferSortedQueue does?
    




> SpoolingResultIterator using memory too conservative 
> -----------------------------------------------------
>
>                 Key: PHOENIX-2970
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2970
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: chenglei
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
