[ https://issues.apache.org/jira/browse/PHOENIX-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054522#comment-14054522 ]
Gabriel Reid commented on PHOENIX-539:
--------------------------------------

Thanks for taking a look [~jamestaylor]. It would definitely be interesting if [~lhofhansl] could take a look and give more feedback on his original idea.

The one thing that I would possibly worry about with leaving the scanners open on the server is running into scanner lease timeouts. I actually assumed that this was the original reason why the SpoolingResultIterator was used in the first place.

I'll upload a patch with the points you noted (doc on why chunking is disabled when an order by is present or a join is being done) and a separate HashJoinInfo.isHashJoin(Scan) method.

For reference (without having to look through the new patch), the reason that the ChunkedResultIterator isn't used when a join or order by is done is that the scans used for those operations don't work properly with restarting at the last-encountered row. As far as I understand, it's probably better not to do chunking there anyhow, as the scan results for those operations are only useful in their entirety, so chunking would probably only slow things down even if it did work there.

> Implement parallel scanner that does not spool to disk
> ------------------------------------------------------
>
>                 Key: PHOENIX-539
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-539
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: larsh
>         Attachments: PHOENIX-539.1.patch, PHOENIX-539.patch
>
>
> In scenarios where a LIMIT is not present on a non aggregate query that will
> return a lot of results, Phoenix spools the results to disk. This is less
> than ideal in these situations. @larsh has created a very good and relatively
> simple implementation that is queue based to replace this.
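
For readers who have not looked at the patch, here is a rough sketch of the "restart at the last-encountered row" idea from the comment. It is not Phoenix's ChunkedResultIterator: the class and method names (ChunkedScanSketch, ChunkedIterator, scanAllKeys) are made up for illustration, and a TreeMap stands in for a table sorted by row key. Each chunk is served by a short-lived scan, and the next scan starts strictly after the last key returned. That only works when rows arrive in row-key order. Once results come back in some other order, as with an ORDER BY on another column, the last key seen no longer says how far the scan has progressed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;

/** Hypothetical illustration only; not the Phoenix implementation. */
class ChunkedScanSketch {

    /** Walks a sorted table in chunks, opening a fresh short-lived "scan" per chunk. */
    static final class ChunkedIterator implements Iterator<Map.Entry<String, String>> {
        private final TreeMap<String, String> table; // stand-in for a table sorted by row key
        private final int chunkSize;
        private Iterator<Map.Entry<String, String>> current; // the currently open scan
        private String lastKey;      // last row key handed to the caller
        private int servedInChunk;   // rows served from the current scan
        int scansOpened;             // how many short scans were started

        ChunkedIterator(TreeMap<String, String> table, int chunkSize) {
            if (chunkSize <= 0) {
                throw new IllegalArgumentException("chunkSize must be positive");
            }
            this.table = table;
            this.chunkSize = chunkSize;
            openScan();
        }

        /** Starts a new scan strictly after the last row seen (or at the start). */
        private void openScan() {
            NavigableMap<String, String> remaining =
                    (lastKey == null) ? table : table.tailMap(lastKey, false);
            current = remaining.entrySet().iterator();
            servedInChunk = 0;
            scansOpened++;
        }

        @Override
        public boolean hasNext() {
            if (servedInChunk == chunkSize) {
                openScan(); // current chunk exhausted: resume after lastKey
            }
            return current.hasNext();
        }

        @Override
        public Map.Entry<String, String> next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            Map.Entry<String, String> entry = current.next();
            lastKey = entry.getKey();
            servedInChunk++;
            return entry;
        }
    }

    /** Collects every row key by iterating the table chunk by chunk. */
    static List<String> scanAllKeys(TreeMap<String, String> table, int chunkSize) {
        List<String> keys = new ArrayList<>();
        ChunkedIterator it = new ChunkedIterator(table, chunkSize);
        while (it.hasNext()) {
            keys.add(it.next().getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 7; i++) {
            table.put("row" + i, "v" + i);
        }
        System.out.println(scanAllKeys(table, 3));
    }
}
```

Because no single scan in this sketch stays open for longer than one chunk, an approach along these lines would presumably sidestep the lease-timeout concern, at the cost of re-opening a scan per chunk.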