[jira] [Commented] (PHOENIX-539) Implement parallel scanner that does not spool to disk

Gabriel Reid (JIRA) Mon, 07 Jul 2014 22:19:27 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054522#comment-14054522
 ]


Gabriel Reid commented on PHOENIX-539:
--------------------------------------

Thanks for taking a look [~jamestaylor]. 

It would definitely be interesting if [~lhofhansl] could take a look and give 
more feedback on his original idea. My one concern with leaving the scanners 
open on the server is running into scanner lease timeouts; I had actually 
assumed that this was the reason the SpoolingResultIterator was used in the 
first place.

I'll upload a patch that addresses the points you noted: documentation on why 
chunking is disabled when an order by is present or a join is being done, and 
a separate HashJoinInfo.isHashJoin(Scan) method.

For reference (without having to look through the new patch), the 
ChunkedResultIterator isn't used when a join or order by is done because the 
scans used for those operations don't work properly when restarted at the 
last-encountered row. As far as I understand, it's probably better not to do 
chunking there anyhow: the scan results for those operations are only useful 
in their entirety, so chunking would probably just slow things down even if 
it did work there.
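
To illustrate what "restarting at the last-encountered row" means, here is a 
minimal, self-contained sketch. It is not the code from the patch: the class 
name ChunkedScanSketch, the scansOpened counter, and the TreeMap standing in 
for a region's sorted rows are all assumptions made for illustration. Each 
chunk opens a fresh, short-lived "scan" that begins just after the last row 
key handed to the caller, which only makes sense if rows come back in row-key 
order (an assumption of this sketch).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableMap;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not Phoenix's ChunkedResultIterator.
class ChunkedScanSketch implements Iterator<String> {
    // Stand-in for the sorted rows of a table (assumption: row-key order).
    private final NavigableMap<String, String> table;
    private final int chunkSize;
    // Last row key returned to the caller; null before the first row.
    private String lastRowKey = null;
    private Iterator<String> chunk = new ArrayList<String>().iterator();
    private boolean exhausted = false;
    // Counts how many separate "scans" were opened, for demonstration.
    int scansOpened = 0;

    ChunkedScanSketch(NavigableMap<String, String> table, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.table = table;
        this.chunkSize = chunkSize;
    }

    // Opens a new "scan" starting just after lastRowKey and buffers
    // at most chunkSize row keys from it.
    private void fetchNextChunk() {
        scansOpened++;
        NavigableMap<String, String> remaining =
                (lastRowKey == null) ? table : table.tailMap(lastRowKey, false);
        List<String> buffer = new ArrayList<>();
        for (String rowKey : remaining.keySet()) {
            if (buffer.size() == chunkSize) {
                break;
            }
            buffer.add(rowKey);
        }
        // A short (or empty) chunk means there is nothing left to scan.
        if (buffer.size() < chunkSize) {
            exhausted = true;
        }
        chunk = buffer.iterator();
    }

    @Override
    public boolean hasNext() {
        while (!chunk.hasNext() && !exhausted) {
            fetchNextChunk();
        }
        return chunk.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        lastRowKey = chunk.next();
        return lastRowKey;
    }
}
```

Because no single "scan" in the sketch outlives one chunk, nothing has to stay 
open between chunks. The trade-off is that resuming depends entirely on the 
last row key being a valid restart point, which is the property described 
above as not holding for the join and order by scans.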



> Implement parallel scanner that does not spool to disk
> ------------------------------------------------------
>
>                 Key: PHOENIX-539
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-539
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: larsh
>         Attachments: PHOENIX-539.1.patch, PHOENIX-539.patch
>
>
> In scenarios where a LIMIT is not present on a non aggregate query that will 
> return a lot of results, Phoenix spools the results to disk. This is less 
> than ideal in these situations. @larsh has created a very good and relatively 
> simple implementation that is queue based to replace this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
