[ 
https://issues.apache.org/jira/browse/HBASE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272941#comment-16272941
 ] 

Chance Li edited comment on HBASE-19389 at 12/1/17 7:07 PM:
------------------------------------------------------------

We ran some small tests of concurrent writes to the CSLM 
(ConcurrentSkipListMap); see the test numbers below. Response time (RT) grows 
very quickly, i.e. performance drops, as concurrency increases.
!CSLM-concurrent-write.png!
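
For readers who want to reproduce the trend, here is a minimal sketch of such a measurement. It is not the test behind the chart above: the class name CslmWriteBench, the thread counts, and the use of plain Long keys (instead of HBase cells with their more expensive comparator) are all assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical micro-benchmark: N threads insert disjoint key ranges into one CSLM.
final class CslmWriteBench {

    /** Inserts threads * keysPerThread distinct keys; returns elapsed wall-clock nanoseconds. */
    static long run(ConcurrentSkipListMap<Long, Long> map, int threads, int keysPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);      // releases all writers at once
        CountDownLatch done = new CountDownLatch(threads); // signals that every writer finished
        for (int t = 0; t < threads; t++) {
            final long base = (long) t * keysPerThread;
            pool.execute(() -> {
                try {
                    start.await();
                    for (int i = 0; i < keysPerThread; i++) {
                        map.put(base + i, base + i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        for (int threads : new int[] {1, 4, 16, 64}) {
            ConcurrentSkipListMap<Long, Long> map = new ConcurrentSkipListMap<>();
            long nanos = run(map, threads, 100_000);
            // Wall-clock time divided by total keys, not a per-call latency.
            System.out.printf("threads=%d keys=%d %.1f ns per inserted key%n",
                    threads, map.size(), (double) nanos / map.size());
        }
    }
}
```

The numbers depend heavily on the JVM, the hardware, and the comparator cost, so treat the output as a rough indication only.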

As for the solution, one option is to protect the CSLM from highly concurrent 
writes; another is to improve the CSLM itself. By the way, in our scenario we 
don't want to use QoS (request throttling).
We have chosen the more engineering-oriented solution: protecting the CSLM 
from highly concurrent writes of puts with many columns. This way, we avoid 
having all RS handlers tied up in slow calls. In other words, other calls 
still have a chance to be handled. 

About the patch:
1. Dynamic configuration, e.g. 
#parallelPutToStoreThreadLimitCheckMinColumnNum and 
#parallelPutToStoreThreadLimit.
2. It returns #RegionTooBusyException when the threshold is exceeded.
3. It is not a strong limit, since we won't use a lock, so handlers may still 
be busy for a short time.
4. It applies only to multi ops, not Append. 
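
To make the four points above concrete, here is a rough sketch of a lock-free limiter in that spirit. It is not the patch itself: StorePutLimiter, TooBusyException, and the method names are hypothetical, and TooBusyException is only an unchecked stand-in for HBase's RegionTooBusyException (which is an IOException) so that the example compiles on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, unchecked stand-in for HBase's RegionTooBusyException. */
final class TooBusyException extends RuntimeException {
    TooBusyException(String msg) {
        super(msg);
    }
}

/** Hypothetical per-store limiter for concurrent puts that carry many columns. */
final class StorePutLimiter {
    // Volatile so that a dynamic configuration change is visible to all handlers.
    private volatile int threadLimit;   // cf. parallelPutToStoreThreadLimit
    private volatile int minColumnNum;  // cf. parallelPutToStoreThreadLimitCheckMinColumnNum
    private final AtomicInteger inFlight = new AtomicInteger();

    StorePutLimiter(int threadLimit, int minColumnNum) {
        this.threadLimit = threadLimit;
        this.minColumnNum = minColumnNum;
    }

    /** Point 1: thresholds can be changed at runtime. */
    void reconfigure(int newThreadLimit, int newMinColumnNum) {
        this.threadLimit = newThreadLimit;
        this.minColumnNum = newMinColumnNum;
    }

    /** Returns true if this put was counted and must later be passed to finish(). */
    boolean start(int columnCount) {
        if (columnCount < minColumnNum) {
            return false; // puts with few columns are never limited
        }
        // Point 3: check-then-increment without a lock, so two threads can pass
        // the check together and the limit may be exceeded for a short time.
        if (inFlight.get() >= threadLimit) {
            // Point 2: reject instead of occupying a handler with a slow call.
            throw new TooBusyException("too many concurrent puts with >= "
                    + minColumnNum + " columns");
        }
        inFlight.incrementAndGet();
        return true;
    }

    void finish(boolean counted) {
        if (counted) {
            inFlight.decrementAndGet();
        }
    }

    int inFlight() {
        return inFlight.get();
    }
}
```

A caller would wrap the store write in try/finally, calling start(columnCount) before the write and finish(counted) afterwards, so that the counter is released even when the write fails.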

YCSB results with the patch:
!ycsb-result.png!

Metrics:
!metrics-1.png!
 
Any suggestions are welcome. I will upload the patch within 2 days, along 
with more test numbers.


> Limit concurrency of put with dense (hundreds) columns to prevent write 
> handler exhausted
> ----------------------------------------------------------------------------------------
>
>                 Key: HBASE-19389
>                 URL: https://issues.apache.org/jira/browse/HBASE-19389
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 2.0.0
>         Environment: 2000+ Region Servers
> PCI-E ssd
>            Reporter: Chance Li
>            Assignee: Chance Li
>             Fix For: 2.0.0, 3.0.0
>
>         Attachments: CSLM-concurrent-write.png, metrics-1.png, ycsb-result.png
>
>
> In a large cluster with a large number of clients, we found that the RS's 
> handlers are sometimes all busy. After investigation, we found that the root 
> cause lies in the CSLM, e.g. heavy load in the compare function. We reviewed 
> the related WALs and found that puts with many columns (more than 1000) 
> were being written at that time.



