[
https://issues.apache.org/jira/browse/HBASE-16224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432449#comment-15432449
]
ChiaPing Tsai commented on HBASE-16224:
---------------------------------------
[~chenheng]
Thanks for your comment. Please let me explain the changes.
What is RowAccess?
AP needs to iterate over more rows to create the "big" request. For this purpose,
the original logic has to clone all rows and then restore the remaining rows,
which causes a large data copy. We need a way to access the inner buffer
directly to avoid that copy. One solution is a List implementation that
wraps the inner buffer, but the List interface is complicated. So I introduced
the RowAccess interface rather than using the List interface.
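To illustrate the idea, here is a minimal sketch of a narrow, iterable view over the inner buffer. It is not the code from the patch: the names RowAccessSketch, BufferView and take, and the use of an ArrayDeque as the buffer, are assumptions made only for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

public class RowAccessSketch {

  /** Hypothetical narrow view of the buffered rows: iterate and count, nothing else. */
  public interface RowAccess<T> extends Iterable<T> {
    boolean isEmpty();
    int size();
  }

  /** Wraps the inner buffer directly, so no rows are cloned. */
  public static final class BufferView<T> implements RowAccess<T> {
    private final Collection<T> buffer;

    public BufferView(Collection<T> buffer) {
      this.buffer = buffer;
    }

    @Override
    public boolean isEmpty() {
      return buffer.isEmpty();
    }

    @Override
    public int size() {
      return buffer.size();
    }

    @Override
    public Iterator<T> iterator() {
      // The buffer's own iterator; remove() deletes the row from the buffer itself.
      return buffer.iterator();
    }
  }

  /** Takes at most maxRows rows, removing each taken row from the underlying buffer. */
  public static <T> List<T> take(RowAccess<T> rows, int maxRows) {
    List<T> taken = new ArrayList<>();
    Iterator<T> it = rows.iterator();
    while (taken.size() < maxRows && it.hasNext()) {
      taken.add(it.next());
      it.remove(); // rows that are not taken simply stay in the buffer
    }
    return taken;
  }

  public static void main(String[] args) {
    Queue<String> buffer = new ArrayDeque<>(Arrays.asList("row1", "row2", "row3"));
    List<String> taken = take(new BufferView<>(buffer), 2);
    System.out.println(taken + " " + buffer); // prints "[row1, row2] [row3]"
  }
}
```

The caller removes only the rows it actually submits, so there is no clone-and-restore step, and the interface stays much smaller than List.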
What is TaskCountChecker?
As mentioned above, AP intends to create the "big" request. If the original
logic disallows every region/regionserver when checking the rows at the front,
AP will iterate over all rows looking for some allowed rows. The elapsed time of
that iteration is directly proportional to the number of rows. So I just honor
the decision for the allowed regions.
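One possible reading of "honor the decision" is sketched below: once a region has been allowed, later rows for the same region reuse that decision instead of re-running the limit check. This is only an illustration, not the code from the patch. The class name TaskCountCheckerSketch, the canTake method, the String region keys and the single per-region limit are all assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TaskCountCheckerSketch {
  private final int maxTasksPerRegion;                       // hypothetical limit
  private final Map<String, Integer> runningTasksPerRegion;  // hypothetical task counters
  private final Set<String> allowedRegions = new HashSet<>();
  private int limitChecks = 0;

  public TaskCountCheckerSketch(int maxTasksPerRegion,
                                Map<String, Integer> runningTasksPerRegion) {
    this.maxTasksPerRegion = maxTasksPerRegion;
    this.runningTasksPerRegion = runningTasksPerRegion;
  }

  /** Returns true if a row for this region may join the current request. */
  public boolean canTake(String region) {
    if (allowedRegions.contains(region)) {
      return true; // honor the earlier decision; no new limit check
    }
    limitChecks++;
    int running = runningTasksPerRegion.getOrDefault(region, 0);
    if (running >= maxTasksPerRegion) {
      return false; // too many tasks on this region already
    }
    allowedRegions.add(region);
    return true;
  }

  /** Number of times the task limit was actually evaluated. */
  public int limitChecks() {
    return limitChecks;
  }

  public static void main(String[] args) {
    Map<String, Integer> running = new HashMap<>();
    running.put("regionA", 0);
    running.put("regionB", 5);
    TaskCountCheckerSketch checker = new TaskCountCheckerSketch(2, running);
    for (String region : new String[] {"regionA", "regionA", "regionB", "regionA"}) {
      System.out.println(region + " -> " + checker.canTake(region));
    }
    System.out.println("limit checks: " + checker.limitChecks());
  }
}
```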
In short, there is an interplay between RowAccess and TaskCountChecker. They
should be in the same patch.
> Reduce the number of RPCs for the large PUTs
> --------------------------------------------
>
> Key: HBASE-16224
> URL: https://issues.apache.org/jira/browse/HBASE-16224
> Project: HBase
> Issue Type: Improvement
> Reporter: ChiaPing Tsai
> Assignee: ChiaPing Tsai
> Priority: Minor
> Attachments: HBASE-16224-v1.patch, HBASE-16224-v2.patch,
> HBASE-16224-v3.patch, HBASE-16224-v4.patch, HBASE-16224-v5.patch,
> HBASE-16224-v6.patch, HBASE-16224-v7.patch, HBASE-16224-v8.patch,
> HBASE-16224-v9.patch, experiment-v9.patch.xlsx, experiment.xlsx
>
>
> This patch is proposed to reduce the number of RPCs for large PUTs.
> The number and data size of write threads (SingleServerRequestRunnable) are
> the result of three main factors:
> 1) The flush size taken by BufferedMutatorImpl#backgroundFlushCommits
> 2) The limit on the number of tasks
> 3) ClientBackoffPolicy
> Many requests are created with few MUTATIONs for two reasons:
> 1) Many regions of the target table are on different servers.
> 2) The flush size in step one is summed over “all” servers rather than per
> “individual” server.
> This patch removes the flush size limit in step one and adds a maximum size
> to submit for each server in the AsyncProcess.