dingwei2019 created HBASE-27959:
-----------------------------------

             Summary: change the random scope of random row for 
PerformanceEvaluation
                 Key: HBASE-27959
                 URL: https://issues.apache.org/jira/browse/HBASE-27959
             Project: HBase
          Issue Type: Improvement
          Components: Performance
    Affects Versions: 2.5.5, 2.5.0, 2.3.2
            Reporter: dingwei2019


*question description:*

When we use the PerformanceEvaluation tool to run the randomWrite test, we find 
that as soon as one region hits regiontoobusy, the request count shown in the 
UI drops sharply within a short time, from several million to 0. 

The regiontoobusy mechanism is a good way to maximize the throughput of an 
HBase cluster, and it should only affect the region that is too busy. But when 
I looked into the whole random write procedure (both client and server), I 
found that there may be a problem in the PerformanceEvaluation tool (the 
client) that causes this behavior.

 

*cause of the issue:*

Before explaining the issue, here are two preconditions we need to know 
first:

1. A single request generated by a TestClient thread can contain data for all 
regions. The client accumulates mutations until the buffer reaches 2 MB (the 
default client buffer size), so the request includes many mutate operations, 
and each mutate operation targets a row chosen at random from the whole table.

2. Until a 2 MB request has finished, the TestClient does not generate a new 
request (this is how PE works).

With that in mind, the cause of the issue is as follows:

1. When one region hits regiontoobusy, a 2 MB request that contains rows for 
that region cannot finish until the region is unblocked. The same applies to 
the 2 MB requests generated by the other TestClients.

2. Both the client (PerformanceEvaluation) and the server (regionserver) are 
then blocked, and the request count in the UI and the CPU utilization drop for 
several seconds until the region is unblocked.
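
To give a rough feel for why nearly every in-flight request is affected, here 
is a minimal sketch that estimates the chance that one batch of uniformly 
random rows contains at least one row for a single busy region. The class 
name, the region count (100), and the batch size (2000 mutations, i.e. roughly 
2 MB at about 1 KB per mutation) are assumptions for illustration only, and 
the regions are assumed to be equally sized.

```java
public class BusyRegionImpact {

    /**
     * Probability that a batch of uniformly random rows contains at least
     * one row belonging to one particular region, assuming all regions
     * hold an equal share of the key space.
     */
    static double probBatchTouchesRegion(int regions, int mutationsPerBatch) {
        double missOne = 1.0 - 1.0 / regions;            // one mutation misses the region
        return 1.0 - Math.pow(missOne, mutationsPerBatch); // at least one mutation hits it
    }

    public static void main(String[] args) {
        int regions = 100;            // assumed number of regions
        int mutationsPerBatch = 2000; // assumed: ~2 MB buffer / ~1 KB per mutation
        System.out.printf("P(batch touches the busy region) = %.6f%n",
            probBatchTouchesRegion(regions, mutationsPerBatch));
    }
}
```

Under these assumptions the probability is very close to 1, which matches the 
observation that every TestClient stalls when a single region is blocked.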

*probable solution:*

The issue is not caused by the regionserver (regiontoobusy is a good mechanism 
from the regionserver's side) but by the client. 

Changing the scope from which random rows are drawn offers a way to solve this 
problem.

The original scope is the whole table. If every TestClient generates random 
rows only within its own scope, a blocked region stalls only the TestClients 
whose scope overlaps it. Let's take an example:

Assume that we have 5 TestClients and each TestClient is responsible for 1000 
rows (requests).

TestClient1 will generate random rows from 0–999;

TestClient2 will generate random rows from 1000–1999;

TestClient3 will generate random rows from 2000–2999;

TestClient4 will generate random rows from 3000–3999;

TestClient5 will generate random rows from 4000–4999;
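
The example above can be sketched in Java as follows. This is only an 
illustration of the idea, not the actual PerformanceEvaluation code: the class 
and method names (ScopedRandomRow, wholeTableRow, scopedRow) and the 
parameters are hypothetical.

```java
import java.util.Random;

public class ScopedRandomRow {

    /** Current behavior as described above: any row of the whole table. */
    static int wholeTableRow(Random rnd, int totalRows) {
        return rnd.nextInt(totalRows);
    }

    /**
     * Proposed behavior: a random row inside the slice owned by one client.
     * clientIndex is zero-based, so client 0 owns rows 0..rowsPerClient-1.
     */
    static int scopedRow(Random rnd, int clientIndex, int rowsPerClient) {
        int start = clientIndex * rowsPerClient;
        return start + rnd.nextInt(rowsPerClient);
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int clients = 5;          // as in the example above
        int rowsPerClient = 1000; // as in the example above
        for (int c = 0; c < clients; c++) {
            int row = scopedRow(rnd, c, rowsPerClient);
            System.out.printf("TestClient%d -> row %d (scope %d-%d)%n",
                c + 1, row, c * rowsPerClient, (c + 1) * rowsPerClient - 1);
        }
    }
}
```

With scoped rows, a request from one TestClient only carries rows from its own 
contiguous key range, so it touches far fewer regions than a request drawn 
from the whole table.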

 

*additional notes:*

I am raising a problem I encountered in my work. I hope to discuss it further 
with experts from the community and from other fields in order to find a 
better solution. If my solution is acceptable, I will submit a patch to fix 
the issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
