[ 
https://issues.apache.org/jira/browse/HBASE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13634673#comment-13634673
 ] 

Varun Sharma commented on HBASE-8362:
-------------------------------------

How about an algorithm like this:

1) Sort all the incoming gets and initialize a pointer i to 0. Let gets[i] be 
the i-th row in this sorted list of gets.
Loop:
2) RESEEK to gets[i], call next(), and let the obtained row be currentRow.
   a) If currentRow == gets[i], great, include the row in the results and do 
i++.
   b) If currentRow > gets[i], increment i until gets[i] >= currentRow. If 
gets[i] == currentRow, include the row in the results and do i++.
   Then continue the loop with the next RESEEK.

Run this loop until either the scanner is exhausted or i >= gets.length.
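
To make the loop concrete, here is a minimal sketch of it in plain Java. It is 
not HBase code: a TreeMap stands in for the region's sorted rows, 
ceilingEntry() stands in for RESEEK followed by next(), and the class and 
method names (SortedMultiGetSketch, multiGet) are made up for illustration. 
Rows are Strings here only to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical illustration only; a NavigableMap plays the role of the scanner.
class SortedMultiGetSketch {

    /** Returns the values of the requested rows that exist in the store, in row order. */
    static List<String> multiGet(NavigableMap<String, String> store, String[] requested) {
        String[] gets = requested.clone();
        Arrays.sort(gets);                        // step 1: sort the gets
        List<String> results = new ArrayList<>();
        int i = 0;
        while (i < gets.length) {
            // step 2: "reseek" to gets[i]; ceilingEntry returns the first row >= gets[i]
            Map.Entry<String, String> current = store.ceilingEntry(gets[i]);
            if (current == null) {
                break;                            // scanner exhausted
            }
            String currentRow = current.getKey();
            // step 2b: requested rows that sort before currentRow do not exist, skip them
            while (i < gets.length && gets[i].compareTo(currentRow) < 0) {
                i++;
            }
            // steps 2a/2b: on a match, include the row and advance to the next get
            if (i < gets.length && gets[i].equals(currentRow)) {
                results.add(current.getValue());
                i++;
            }
        }
        return results;
    }
}
```

Every iteration either advances i or ends the loop, so the scan only ever 
moves forward and performs at most one seek per requested row.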

Of course, this requires sorting the gets first, so that we only ever seek 
forward, which is also more efficient. This could be formulated as a filter, 
and we could replace the many gets in the multi() function with a single 
scan() operation bundled with that filter. The filter could be:

MultiRowFilter(byte[][] rows)  // List of rows to limit the scan to.

We would only pass the "rows" that fall within the region's boundaries and 
ignore all others. IMHO, that seems to be one way of implementing this 
solution...
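
As a rough sketch of that last step, the rows could be clipped to a region 
and sorted before being handed to the filter. This assumes the usual region 
semantics (start key inclusive, end key exclusive, an empty key meaning 
"unbounded") and unsigned lexicographic byte ordering. The class and method 
names (RegionRowClipSketch, rowsInRegion) are hypothetical, and 
Arrays.compareUnsigned needs Java 9 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of HBase.
class RegionRowClipSketch {

    /** Keeps only rows in [startKey, endKey) and returns them in sorted order. */
    static byte[][] rowsInRegion(byte[][] rows, byte[] startKey, byte[] endKey) {
        List<byte[]> kept = new ArrayList<>();
        for (byte[] row : rows) {
            // an empty start or end key is treated as "no bound on that side"
            boolean afterStart = startKey.length == 0
                    || Arrays.compareUnsigned(row, startKey) >= 0;
            boolean beforeEnd = endKey.length == 0
                    || Arrays.compareUnsigned(row, endKey) < 0;
            if (afterStart && beforeEnd) {
                kept.add(row);
            }
        }
        // sort so that the scan can seek in the forward direction only
        kept.sort((a, b) -> Arrays.compareUnsigned(a, b));
        return kept.toArray(new byte[0][]);
    }
}
```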


Thanks
Varun

 
                
> Possible MultiGet optimization
> ------------------------------
>
>                 Key: HBASE-8362
>                 URL: https://issues.apache.org/jira/browse/HBASE-8362
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> Currently MultiGets are executed on a RegionServer in a single thread in a 
> loop that handles each Get separately (opening a scanner, seeking, etc).
> It seems we could optimize this (per region at least) by opening a single 
> scanner and issue a reseek for each Get that was requested.
> I have not tested this yet and no patch, but I would like to solicit feedback 
> on this idea.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
