[
https://issues.apache.org/jira/browse/PHOENIX-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15318718#comment-15318718
]
James Taylor commented on PHOENIX-2970:
---------------------------------------
How about just increasing the amount of global memory available
(via {{phoenix.query.maxGlobalMemoryPercentage}}) so that you spool into memory
more often?
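
To get a rough feel for what raising the percentage buys, here is a back-of-the-envelope sketch. It is not Phoenix code: the class and method names are hypothetical, it assumes the global pool is simply a percentage of the JVM max heap, and the heap size, percentages, and threshold are illustrative values only.

```java
// Back-of-the-envelope sketch, not Phoenix code. All names are hypothetical.
final class GlobalMemoryBudget {
    private GlobalMemoryBudget() {}

    /** Bytes in the global pool, ASSUMING it is a plain percentage of the max heap. */
    static long poolBytes(long maxHeapBytes, int maxGlobalMemoryPercentage) {
        return maxHeapBytes * maxGlobalMemoryPercentage / 100;
    }

    /** How many iterators could each hold a full spool threshold at the same time. */
    static long fullySatisfiedIterators(long maxHeapBytes,
                                        int maxGlobalMemoryPercentage,
                                        long spoolThresholdBytes) {
        return poolBytes(maxHeapBytes, maxGlobalMemoryPercentage) / spoolThresholdBytes;
    }

    public static void main(String[] args) {
        long heap = 1L << 30;                 // illustrative 1 GiB heap
        long threshold = 20L * 1024 * 1024;   // illustrative 20 MiB spool threshold
        for (int pct : new int[] {15, 30, 50}) {
            System.out.println(pct + "% -> "
                    + fullySatisfiedIterators(heap, pct, threshold)
                    + " iterators with a full threshold");
        }
    }
}
```

The count grows linearly with the percentage, so a larger pool helps, but a scan that fans out across many more regions than that count can still leave some iterators short.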
> SpoolingResultIterator using memory too conservatively, which leads to using
> temp file unnecessarily
> -----------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2970
> URL: https://issues.apache.org/jira/browse/PHOENIX-2970
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0
> Reporter: chenglei
>
> Even though SpoolingResultIterator will be deprecated, HBase versions older
> than 0.98.17 will continue to use it, and I think the
> DeferredByteBufferSegmentQueue class, which is similar to
> SpoolingResultIterator, may have the same problem in future versions.
> In SpoolingResultIterator's constructor, it tries to allocate up to
> "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager, and uses the
> allocated MemoryChunk's size as the DeferredFileOutputStream's threshold, as in
> the following code:
> {code:borderStyle=solid}
> final MemoryChunk chunk = mm.allocate(0, thresholdBytes);
> long waitTime = System.currentTimeMillis() - startTime;
> GLOBAL_MEMORY_WAIT_TIME.update(waitTime);
> memoryMetrics.getMemoryWaitTimeMetric().change(waitTime);
> DeferredFileOutputStream spoolTo = null;
> try {
>     // Can't be bigger than int, since it's the max of the above allocation
>     int size = (int)chunk.getSize();
>     spoolTo = new DeferredFileOutputStream(size, "ResultSpooler", ".bin",
>             new File(spoolDirectory)) {
>         @Override
>         protected void thresholdReached() throws IOException {
>             try {
>                 super.thresholdReached();
>             } finally {
>                 chunk.close();
>             }
>         }
>     };
> {code}
>
> SpoolingResultIterator assumes that the wrapped ResultIterator will always
> fetch "phoenix.query.spoolThresholdBytes" bytes, but most of the time it won't.
> For example, if we execute "select count(1) from table" on a big table with
> many regions, the ScanPlan runs many SpoolingResultIterators in parallel to
> fetch the result, and each SpoolingResultIterator tries to allocate up to
> "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager. If there is
> not much memory available, many SpoolingResultIterators will be allocated 0
> bytes by the MemoryManager, and the corresponding DeferredFileOutputStream's
> threshold will be 0, so the DeferredFileOutputStream will unnecessarily use a
> temp file to hold the results, even if the result is just a single count
> value. This behavior slows down the query.
>
> Can we remove the MemoryManager from SpoolingResultIterator, just as
> MappedByteBufferSortedQueue does?
>
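
The starvation described in the quoted issue can be modeled roughly as follows. This is a simplified toy, not the Phoenix implementation: {{ToyMemoryManager}} and {{countZeroThresholds}} are hypothetical names, the toy never blocks or releases memory, and it only mirrors the {{allocate(0, thresholdBytes)}} call pattern shown in the quoted code, assuming a grant of whatever is still available up to the requested size.

```java
// Toy model of many iterators competing for one memory pool. Not Phoenix code.
final class SpoolStarvationSketch {
    private SpoolStarvationSketch() {}

    /** Hypothetical pool: grants min(requested, available) and never blocks. */
    static final class ToyMemoryManager {
        private long available;

        ToyMemoryManager(long poolBytes) {
            this.available = poolBytes;
        }

        /** Returns the number of bytes actually granted (may be 0 when minBytes is 0). */
        long allocate(long minBytes, long reqBytes) {
            if (available < minBytes) {
                throw new IllegalStateException("not enough memory for the minimum request");
            }
            long granted = Math.min(reqBytes, available);
            available -= granted;
            return granted;
        }
    }

    /** Counts iterators whose chunk size, and therefore spool threshold, ends up as 0. */
    static int countZeroThresholds(long poolBytes, long thresholdBytes, int iterators) {
        ToyMemoryManager mm = new ToyMemoryManager(poolBytes);
        int zero = 0;
        for (int i = 0; i < iterators; i++) {
            long chunkSize = mm.allocate(0, thresholdBytes); // same call shape as the quoted code
            if (chunkSize == 0) {
                zero++; // a threshold of 0 means the first byte written goes to a temp file
            }
        }
        return zero;
    }

    public static void main(String[] args) {
        long pool = 161_061_273L;            // illustrative pool size in bytes
        long threshold = 20L * 1024 * 1024;  // illustrative 20 MiB threshold
        System.out.println(countZeroThresholds(pool, threshold, 100)
                + " of 100 iterators got a 0-byte threshold");
    }
}
```

In this model the first few iterators reserve a full threshold each even if they only ever return a single count value, and every later iterator is left with a threshold of 0.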