[
https://issues.apache.org/jira/browse/HBASE-16224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426204#comment-15426204
]
ChiaPing Tsai edited comment on HBASE-16224 at 8/18/16 10:04 AM:
-----------------------------------------------------------------
hi [~chenheng]
The while loop needs a break condition. Otherwise, the AP will spend a lot of
time grabbing rows.
isFull() can serve that purpose, but its cost is directly proportional to the
number of regions in the table, because we need to locate all the regions.
Therefore, the new SubmittedSizeChecker is added to limit the total heap size
of the request when the AP grabs the allowed rows. In fact, the behavior of
SubmittedSizeChecker is similar to how the previous BufferedMutatorImpl grabbed
all of the available mutations before calling AsyncProcess#submit.
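To illustrate the idea, here is a minimal sketch of a size-based break condition. It is not the code from the patch: SizedRow, SizeLimitChecker and RowGrabber are hypothetical names, and the rule "stop once the accumulated heap size has reached the limit" is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for a mutation; only its heap size matters here.
final class SizedRow {
    final String key;
    final long heapSize;

    SizedRow(String key, long heapSize) {
        this.key = key;
        this.heapSize = heapSize;
    }
}

// Hypothetical checker: accepts rows until the accumulated heap size
// has reached the configured limit (the last accepted row may overshoot).
final class SizeLimitChecker {
    private final long maxHeapSize;
    private long taken = 0;

    SizeLimitChecker(long maxHeapSize) {
        this.maxHeapSize = maxHeapSize;
    }

    // Returns true if the row was accepted, false once the limit is reached.
    boolean canTake(SizedRow row) {
        if (taken >= maxHeapSize) {
            return false;
        }
        taken += row.heapSize;
        return true;
    }

    long takenSize() {
        return taken;
    }
}

final class RowGrabber {
    // Moves rows from the buffer into a batch until the checker says stop.
    // No region lookup is needed, so the cost does not depend on the
    // number of regions in the table.
    static List<SizedRow> grab(LinkedList<SizedRow> buffer, SizeLimitChecker checker) {
        List<SizedRow> batch = new ArrayList<>();
        while (!buffer.isEmpty()) {
            if (!checker.canTake(buffer.peekFirst())) {
                break; // the break condition for the while loop
            }
            batch.add(buffer.pollFirst());
        }
        return batch;
    }
}
```

Rows that are not taken stay in the buffer and can be picked up by the next call.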
Thanks for your review.
> Reduce the number of RPCs for the large PUTs
> --------------------------------------------
>
> Key: HBASE-16224
> URL: https://issues.apache.org/jira/browse/HBASE-16224
> Project: HBase
> Issue Type: Improvement
> Reporter: ChiaPing Tsai
> Assignee: ChiaPing Tsai
> Priority: Minor
> Attachments: HBASE-16224-v1.patch, HBASE-16224-v2.patch,
> HBASE-16224-v3.patch, HBASE-16224-v4.patch, HBASE-16224-v5.patch,
> HBASE-16224-v6.patch, HBASE-16224-v7.patch, HBASE-16224-v8.patch,
> HBASE-16224-v9.patch, experiment-v9.patch.xlsx, experiment.xlsx
>
>
> This patch is proposed to reduce the number of RPCs for large PUTs.
> The number and data size of write threads (SingleServerRequestRunnable) are
> the result of three main factors:
> 1) The flush size taken by BufferedMutatorImpl#backgroundFlushCommits
> 2) The limit on the number of tasks
> 3) ClientBackoffPolicy
> Many requests are created with few mutations for two reasons:
> 1) Many regions of the target table are on different servers.
> 2) The flush size in factor 1) is summed over “all” servers rather than per
> “individual” server.
> This patch removes the flush size limit in factor 1) and adds a maximum
> submit size for each server in AsyncProcess.
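The per-server limit described in the issue can be sketched roughly as follows. This is only an illustration under assumptions, not the patch itself: ServerMutation and PerServerBatcher are hypothetical names, and each mutation is assumed to already know its target server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical mutation that already knows which server hosts its region.
final class ServerMutation {
    final String server;
    final long heapSize;

    ServerMutation(String server, long heapSize) {
        this.server = server;
        this.heapSize = heapSize;
    }
}

final class PerServerBatcher {
    // Builds one batch per server. A server stops accepting mutations once
    // its own accumulated size has reached maxSizePerServer; the size is
    // tracked per server, not summed over all servers.
    // Mutations that are taken are removed from 'pending' (must be mutable).
    static Map<String, List<ServerMutation>> take(List<ServerMutation> pending,
                                                  long maxSizePerServer) {
        Map<String, List<ServerMutation>> batches = new HashMap<>();
        Map<String, Long> sizes = new HashMap<>();
        Iterator<ServerMutation> it = pending.iterator();
        while (it.hasNext()) {
            ServerMutation m = it.next();
            long used = sizes.getOrDefault(m.server, 0L);
            if (used >= maxSizePerServer) {
                continue; // this server's request is full; keep it for later
            }
            sizes.put(m.server, used + m.heapSize);
            batches.computeIfAbsent(m.server, s -> new ArrayList<>()).add(m);
            it.remove();
        }
        return batches;
    }
}
```

With a single global limit, a table spread over many servers yields many small requests; tracking the size per server lets each request fill up independently.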
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)