[ 
https://issues.apache.org/jira/browse/HBASE-16224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432449#comment-15432449
 ] 

ChiaPing Tsai commented on HBASE-16224:
---------------------------------------

[~chenheng]

Thanks for your comment. Please let me explain the changes.

What is RowAccess?
AP needs to iterate over more rows to create the "big" request. For this 
purpose, the original logic has to clone all rows and then restore the 
remaining rows, which causes a large data copy. We need a way to access the 
inner buffer directly to avoid that copy. One solution is a List 
implementation that wraps the inner buffer, but the List interface is 
complicated, so I introduced the RowAccess interface instead of implementing 
List.
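
A minimal sketch of the idea, assuming a queue-backed write buffer. All names 
below (RowAccessSketch, QueueRowAccess, RowAccessDemo) are hypothetical and 
are not the code in the patch; they only illustrate how a narrow interface 
can expose the inner buffer without cloning it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Hypothetical narrow view over the buffered rows: far smaller than java.util.List.
interface RowAccessSketch<T> extends Iterable<T> {
    boolean isEmpty();
    int size();
}

// Wraps the inner buffer directly; no rows are cloned.
final class QueueRowAccess<T> implements RowAccessSketch<T> {
    private final Queue<T> buffer;

    QueueRowAccess(Queue<T> buffer) {
        this.buffer = buffer;
    }

    @Override
    public boolean isEmpty() {
        return buffer.isEmpty();
    }

    @Override
    public int size() {
        return buffer.size();
    }

    @Override
    public Iterator<T> iterator() {
        // Delegating means Iterator.remove() deletes the row from the buffer itself.
        return buffer.iterator();
    }
}

final class RowAccessDemo {
    // Takes up to maxRows rows for one request and removes them from the buffer
    // in place; rows that are not taken simply stay where they are.
    static <T> List<T> take(RowAccessSketch<T> rows, int maxRows) {
        List<T> taken = new ArrayList<>();
        Iterator<T> it = rows.iterator();
        while (taken.size() < maxRows && it.hasNext()) {
            taken.add(it.next());
            it.remove();
        }
        return taken;
    }
}
```

With this shape, the "restore the remaining rows" step disappears: whatever 
the caller does not remove through the iterator is still in the buffer.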


What is TaskCountChecker?
As mentioned above, AP intends to create the "big" request. If the original 
logic disallows every region/regionserver while checking the rows at the 
front, AP will still iterate over all rows looking for allowed rows. The 
elapsed time of that iteration is directly proportional to the number of 
rows. So I just honor the decision already made for the allowed regions.
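
One way to read this, as a rough sketch only: the checker remembers which 
regions it has already allowed and, once the limit has been hit, answers for 
every other region without evaluating the limits again. The class name, the 
single maxRegions limit, and the use of a String region name are assumptions 
made for illustration; the checker in the patch may work differently.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical checker: caches the "allowed" decision per region.
final class TaskCountCheckerSketch {
    private final int maxRegions;                  // assumed limit on regions taken per pass
    private final Set<String> allowedRegions = new HashSet<>();
    private boolean limitReached = false;

    TaskCountCheckerSketch(int maxRegions) {
        this.maxRegions = maxRegions;
    }

    // Returns true if a row for the given region may join the current request.
    boolean canTake(String region) {
        if (allowedRegions.contains(region)) {
            return true;                           // honor the earlier decision
        }
        if (limitReached) {
            return false;                          // reject without re-checking the limit
        }
        if (allowedRegions.size() < maxRegions) {
            allowedRegions.add(region);
            return true;
        }
        limitReached = true;
        return false;
    }
}
```

Each row then costs a set lookup and a flag test, and rows for regions that 
were already allowed can still be added to the same request.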

In short, RowAccess and TaskCountChecker depend on each other, so they 
should be in the same patch.

> Reduce the number of RPCs for the large PUTs
> --------------------------------------------
>
>                 Key: HBASE-16224
>                 URL: https://issues.apache.org/jira/browse/HBASE-16224
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: ChiaPing Tsai
>            Assignee: ChiaPing Tsai
>            Priority: Minor
>         Attachments: HBASE-16224-v1.patch, HBASE-16224-v2.patch, 
> HBASE-16224-v3.patch, HBASE-16224-v4.patch, HBASE-16224-v5.patch, 
> HBASE-16224-v6.patch, HBASE-16224-v7.patch, HBASE-16224-v8.patch, 
> HBASE-16224-v9.patch, experiment-v9.patch.xlsx, experiment.xlsx
>
>
> This patch is proposed to reduce the number of RPCs for large PUTs.
> The number and data size of write threads (SingleServerRequestRunnable) are 
> the result of three main factors:
> 1) The flush size taken by BufferedMutatorImpl#backgroundFlushCommits
> 2) The limit on the number of tasks
> 3) ClientBackoffPolicy
> Many requests are created with few MUTATIONs for two reasons: 
> 1) Many regions of the target table are on different servers.
> 2) The flush size in step one is summed over “all” servers rather than per 
> “individual” server.
> This patch removes the flush size limit in step one and adds a maximum size 
> to submit for each server in AsyncProcess.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
