[ https://issues.apache.org/jira/browse/PHOENIX-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated PHOENIX-2970:
------------------------------
    Summary: SpoolingResultIterator using memory too conservatively , which leads to use temp file unnecessaryly  (was: SpoolingResultIterator using memory too conservative , which leads to use temp file unnecessaryly)

> SpoolingResultIterator using memory too conservatively , which leads to use
> temp file unnecessaryly
> ---------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2970
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2970
>             Project: Phoenix
>          Issue Type: Bug
>   Affects Versions: 4.7.0
>            Reporter: chenglei
>
> Even though SpoolingResultIterator will be deprecated, HBase versions older
> than 0.98.17 will continue to use it, and I think the
> DeferredByteBufferSegmentQueue class, which is similar to
> SpoolingResultIterator, may have the same problem in future versions.
> In its constructor, SpoolingResultIterator tries to allocate up to
> "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager and uses the
> size of the allocated MemoryChunk as the DeferredFileOutputStream's threshold,
> as in the following code:
> {code:borderStyle=solid}
>     final MemoryChunk chunk = mm.allocate(0, thresholdBytes);
>     long waitTime = System.currentTimeMillis() - startTime;
>     GLOBAL_MEMORY_WAIT_TIME.update(waitTime);
>     memoryMetrics.getMemoryWaitTimeMetric().change(waitTime);
>     DeferredFileOutputStream spoolTo = null;
>     try {
>         // Can't be bigger than int, since it's the max of the above allocation
>         int size = (int)chunk.getSize();
>         spoolTo = new DeferredFileOutputStream(size, "ResultSpooler",".bin", new File(spoolDirectory)) {
>             @Override
>             protected void thresholdReached() throws IOException {
>                 try {
>                     super.thresholdReached();
>                 } finally {
>                     chunk.close();
>                 }
>             }
>         };
> {code}
>
> SpoolingResultIterator assumes that the wrapped ResultIterator will always
> fetch "phoenix.query.spoolThresholdBytes" bytes, but most of the time it won't.
> For example, if we execute "select count(1) from table" on a big table with
> many regions, the ScanPlan will run many SpoolingResultIterators in parallel
> to fetch the result, and each SpoolingResultIterator tries to allocate
> "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager. If there is
> not much memory available, many SpoolingResultIterators will be allocated 0
> bytes from the MemoryManager, and the DeferredFileOutputStream will
> unnecessarily use a temp file to hold the results, even if the result is just
> a single count value. This behavior slows the query.
>
> Can we remove the MemoryManager from SpoolingResultIterator, just as
> MappedByteBufferSortedQueue does?
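
The effect described above can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: Pool and SpillBuffer are hypothetical, simplified stand-ins for Phoenix's MemoryManager and for commons-io's DeferredFileOutputStream, not the real classes, and the "fixed threshold" branch is just one possible reading of the proposal to drop the MemoryManager. Chunks are never returned to the pool, which models all iterators being open at the same time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SpoolSketch {

    /** Hypothetical stand-in for a memory manager: a fixed pool of bytes. */
    static final class Pool {
        private long available;

        Pool(long capacity) { this.available = capacity; }

        /** Grants at most maxBytes, limited by what is left; may grant 0 when minBytes is 0. */
        synchronized long allocate(long minBytes, long maxBytes) {
            if (available < minBytes) throw new IllegalStateException("pool exhausted");
            long granted = Math.min(maxBytes, available);
            available -= granted;
            return granted;
        }
    }

    /** Simplified stand-in for a deferred file stream: memory up to threshold, then a temp file. */
    static final class SpillBuffer extends OutputStream {
        private final long threshold;
        private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
        private OutputStream file; // non-null once the buffer has spilled
        private Path path;
        private long written;

        SpillBuffer(long threshold) { this.threshold = threshold; }

        @Override
        public void write(int b) throws IOException {
            // Spill as soon as the next byte would exceed the threshold.
            if (file == null && written + 1 > threshold) {
                path = Files.createTempFile("ResultSpooler", ".bin");
                file = Files.newOutputStream(path);
                memory.writeTo(file);
                memory.reset();
            }
            (file != null ? file : memory).write(b);
            written++;
        }

        boolean spilled() { return file != null; }

        @Override
        public void close() throws IOException {
            if (file != null) {
                file.close();
                Files.deleteIfExists(path);
            }
        }
    }

    /** Writes one tiny result per iterator and counts how many buffers went to disk. */
    static int countSpills(int iterators, long poolBytes, long thresholdBytes, boolean usePool) {
        Pool pool = new Pool(poolBytes);
        byte[] tinyResult = new byte[8]; // e.g. a single count value
        int spills = 0;
        for (int i = 0; i < iterators; i++) {
            // Chunks are held for the whole run, as if all iterators were open at once.
            long limit = usePool ? pool.allocate(0, thresholdBytes) : thresholdBytes;
            try (SpillBuffer out = new SpillBuffer(limit)) {
                out.write(tinyResult);
                if (out.spilled()) spills++;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return spills;
    }

    public static void main(String[] args) {
        long threshold = 20L * 1024 * 1024; // illustrative value only
        System.out.println("pool-sized thresholds: " + countSpills(10, 4 * threshold, threshold, true));
        System.out.println("fixed thresholds:      " + countSpills(10, 4 * threshold, threshold, false));
    }
}
```

With a pool that covers only 4 full thresholds and 10 iterators, the first 4 buffers keep their 8-byte result in memory and the remaining 6 get a threshold of 0 and create a temp file on the first byte written. With a fixed threshold none of them touch the disk, at the cost of no longer bounding total memory through the pool.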