[ 
https://issues.apache.org/jira/browse/HBASE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771103#action_12771103
 ] 

Dan Washusen commented on HBASE-1935:
-------------------------------------

re. out-of-order receipt of results

What do you see as the benefits in parallel scanning with results in order?

The 'RegionCallable' defined at line 3109 of the patch opens a scanner on a 
specific region server.  The same scanner is then used for all results returned 
from that region.  If you wanted to receive results in order, the time saved 
would be:
* The time taken to switch from one region to the next.  For example, while 
iterating over results from region 1 you could start fetching results from 
region 2.
* The time spent by the client iterating over the results returned in that 
batch before asking the server side scanner for the next batch.
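
To make the first point concrete, here is a rough sketch of what I mean by 
overlapping the region switch with client-side iteration.  It is not taken 
from the patch and does not use any HBase API: 'PrefetchingScan', 
'scanInOrder' and the 'fetchRegion' function are hypothetical names, and 
regions are just strings here.  While the client consumes the results from 
region N, the fetch for region N+1 is already running on a background thread, 
yet results are still handed back in region order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only; none of these names come from the patch.
final class PrefetchingScan {

    // Returns all results in region order, fetching region i+1 in the
    // background while the results of region i are being consumed.
    static <R> List<R> scanInOrder(List<String> regions,
                                   Function<String, List<R>> fetchRegion) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            List<R> out = new ArrayList<>();
            Future<List<R>> next = null;
            if (!regions.isEmpty()) {
                next = pool.submit(task(regions.get(0), fetchRegion));
            }
            for (int i = 0; i < regions.size(); i++) {
                // Blocks only if the prefetch has not finished yet.
                List<R> current = next.get();
                if (i + 1 < regions.size()) {
                    // Start on the next region before consuming this one.
                    next = pool.submit(task(regions.get(i + 1), fetchRegion));
                }
                // Stand-in for the client iterating over the batch.
                out.addAll(current);
            }
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("region fetch failed", e);
        } finally {
            pool.shutdown();
        }
    }

    private static <R> Callable<List<R>> task(String region,
                                              Function<String, List<R>> fetchRegion) {
        return () -> fetchRegion.apply(region);
    }
}
```

The single-threaded pool keeps at most one region fetch in flight ahead of 
the client, which is the minimum needed to hide the switch-over cost.  A real 
scanner would also have to deal with batching within a region, which this 
sketch ignores.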

re. startRow and endRow restrictions

The ParallelHTable in this patch (line 3608) falls back to a sequential scan if 
the scan has a startRow or endRow defined.  It should be possible to use the 
parallel scanner with out-of-order receipt of results if either of these values 
is specified.  The scanner could list all regions and, for each region, check 
whether the range between its startKey and endKey overlaps the range between 
the scan's startRow and endRow.  If it does, scan it.
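
The region selection could look roughly like the sketch below.  It is only an 
illustration and does not use the HBase client classes: 'RegionOverlap', 
'Range' and 'select' are hypothetical names.  I am assuming that a region's 
startKey is inclusive and its endKey exclusive, that the scan's startRow is 
inclusive and its endRow exclusive, and that an empty key means "unbounded" 
on that side.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these are not HBase classes.
final class RegionOverlap {

    // A region's key range: startKey inclusive, endKey exclusive.
    // An empty key means the range is unbounded on that side (assumption).
    static final class Range {
        final byte[] startKey;
        final byte[] endKey;

        Range(String startKey, String endKey) {
            this.startKey = startKey.getBytes(StandardCharsets.UTF_8);
            this.endKey = endKey.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Unsigned lexicographic comparison of two row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True if the region holds at least one row in [startRow, endRow).
    static boolean overlaps(Range region, byte[] startRow, byte[] endRow) {
        boolean regionEndsBeforeScan = region.endKey.length > 0
                && startRow.length > 0
                && compare(region.endKey, startRow) <= 0;
        boolean regionStartsAfterScan = endRow.length > 0
                && compare(region.startKey, endRow) >= 0;
        return !regionEndsBeforeScan && !regionStartsAfterScan;
    }

    // Returns the indexes of the regions a parallel scan would need to visit.
    static List<Integer> select(List<Range> regions, String startRow, String endRow) {
        byte[] start = startRow.getBytes(StandardCharsets.UTF_8);
        byte[] end = endRow.getBytes(StandardCharsets.UTF_8);
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < regions.size(); i++) {
            if (overlaps(regions.get(i), start, end)) {
                selected.add(i);
            }
        }
        return selected;
    }
}
```

Note that the test is for overlap rather than containment: a region that only 
partly covers the requested range still has rows to return, so the scanner on 
that region would itself need to be bounded by the scan's startRow and endRow.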

I'm probably stating the obvious with both those points but I'm new to HBase so 
you'll have to forgive me. :)

Cheers,
Dan


> Scan in parallel
> ----------------
>
>                 Key: HBASE-1935
>                 URL: https://issues.apache.org/jira/browse/HBASE-1935
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: stack
>         Attachments: pscanner.patch
>
>
> A scanner that, rather than scan in series, instead scanned multiple regions 
> in parallel would be more involved but could complete much faster, 
> particularly if results are sparse.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
