[
https://issues.apache.org/jira/browse/HBASE-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898138#comment-13898138
]
Liyin Tang commented on HBASE-10502:
------------------------------------
In addition, the API of HBASE-10502 seems more flexible (to me): if a single
scan request spans multiple region boundaries, the HBase client can always
split it into multiple region-local scan requests and then submit them to
HBASE-10502 for parallel execution.
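
To illustrate the splitting step, here is a minimal sketch. It is not code from
the patch: RegionSplitSketch, Range and split are hypothetical names, row keys
are modelled as Strings rather than byte[], and the sorted list of region start
keys is assumed to be supplied by the caller. As in HBase, an empty key means
"unbounded".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase or the HBASE-10502 patch.
final class RegionSplitSketch {

    /** Half-open key range [start, stop); stop == "" means "to the end of the table". */
    static final class Range {
        final String start;
        final String stop;

        Range(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + stop + ")";
        }
    }

    /**
     * Splits one scan range into region-local ranges.
     * regionStartKeys must be sorted and begin with "" (the first region).
     */
    static List<Range> split(Range scan, List<String> regionStartKeys) {
        List<Range> out = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            String regionStart = regionStartKeys.get(i);
            // The last region has no successor, so its end is unbounded.
            String regionEnd = (i + 1 < regionStartKeys.size()) ? regionStartKeys.get(i + 1) : "";
            // Skip regions that end at or before the scan start.
            if (!regionEnd.isEmpty() && regionEnd.compareTo(scan.start) <= 0) {
                continue;
            }
            // Stop once a region starts at or after the scan stop.
            if (!scan.stop.isEmpty() && regionStart.compareTo(scan.stop) >= 0) {
                break;
            }
            String lo = regionStart.compareTo(scan.start) >= 0 ? regionStart : scan.start;
            out.add(new Range(lo, minStop(regionEnd, scan.stop)));
        }
        return out;
    }

    /** Smaller of two stop keys, treating "" as unbounded. */
    private static String minStop(String a, String b) {
        if (a.isEmpty()) return b;
        if (b.isEmpty()) return a;
        return a.compareTo(b) <= 0 ? a : b;
    }
}
```

With region start keys "", "g", "n", "t", a scan over [d, p) splits into
[d, g), [g, n) and [n, p). Each range touches exactly one region, so each can
be turned into its own scan request.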
> [89-fb] ParallelScanner: a client utility to perform multiple scan requests
> in parallel.
> ----------------------------------------------------------------------------------------
>
> Key: HBASE-10502
> URL: https://issues.apache.org/jira/browse/HBASE-10502
> Project: HBase
> Issue Type: New Feature
> Reporter: Liyin Tang
> Fix For: 0.89-fb
>
>
> ParallelScanner is a utility class for the HBase client to perform multiple
> scan requests in parallel. For simplicity, it requires all the scan requests
> to have the same caching size.
>
> This class provides 3 basic functions:
> * The initialize function initializes all the ResultScanners by calling
> {@link HTable#getScanner(Scan)} in parallel for each scan request.
> * The next function calls {@link ResultScanner#next(int numRows)} on the
> scanner of each scan request in parallel, and then returns all the results
> together as a list. If the result list is empty, no data is left in any of
> the scanners and the user can call {@link #close()}.
> * The close function closes all the scanners and shuts down the thread pool.
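
The three operations in the quoted description can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the class
from the patch: ParallelScannerSketch, RowScanner and inMemory are hypothetical
names, RowScanner stands in for ResultScanner, rows are plain Strings, and each
Callable in "openers" plays the role of one HTable#getScanner(Scan) call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; it does not use the real HBase client classes.
final class ParallelScannerSketch {

    /** Stand-in for ResultScanner (assumption: rows are Strings). */
    interface RowScanner {
        List<String> next(int numRows);
        void close();
    }

    private final List<Callable<RowScanner>> openers;
    private final int caching; // one caching size shared by all scans
    private final ExecutorService pool;
    private final List<RowScanner> scanners = new ArrayList<>();

    ParallelScannerSketch(List<Callable<RowScanner>> openers, int caching, int threads) {
        this.openers = openers;
        this.caching = caching;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Opens every scanner in parallel. */
    void initialize() {
        scanners.addAll(runAll(openers));
    }

    /** Fetches up to 'caching' rows from every scanner in parallel; empty means all are exhausted. */
    List<String> next() {
        List<Callable<List<String>>> calls = new ArrayList<>();
        for (RowScanner s : scanners) {
            calls.add(() -> s.next(caching));
        }
        List<String> all = new ArrayList<>();
        for (List<String> batch : runAll(calls)) {
            all.addAll(batch);
        }
        return all;
    }

    /** Closes all scanners and shuts down the thread pool. */
    void close() {
        for (RowScanner s : scanners) {
            s.close();
        }
        pool.shutdown();
    }

    /** Runs all tasks on the pool and returns their results in task order. */
    private <T> List<T> runAll(List<Callable<T>> tasks) {
        List<T> results = new ArrayList<>();
        try {
            for (Future<T> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return results;
    }

    /** In-memory scanner over a fixed list of rows, for demonstration only. */
    static RowScanner inMemory(List<String> rows) {
        Iterator<String> it = rows.iterator();
        return new RowScanner() {
            @Override
            public List<String> next(int numRows) {
                List<String> batch = new ArrayList<>();
                while (batch.size() < numRows && it.hasNext()) {
                    batch.add(it.next());
                }
                return batch;
            }

            @Override
            public void close() {
            }
        };
    }
}
```

A caller would invoke initialize() once, call next() in a loop until it
returns an empty list, and then call close(). The sketch concatenates the
batches in scanner order. Whether the real class flattens the results this way
or keeps them grouped per scan is not stated in the description.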
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)