[jira] [Commented] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ？

ramkrishna.s.vasudevan (Jira) Thu, 10 Dec 2020 01:48:05 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247133#comment-17247133
 ]


ramkrishna.s.vasudevan commented on HBASE-25346:
------------------------------------------------

This is the random write report with 2.0. 

What value do you see with HBase 1.x? The default WAL implementation is 
AsyncFSWAL in 2.x and the FileSystem-based WAL (FSHLog) in 1.x. BTW, how many 
nodes are you testing on? 

In HBASE-24850 we have seen issues with CellComparator performance when there 
are more columns per row: adding cells to the memstore takes longer because of 
the comparisons. Could you try the patch there and see whether it improves 
your write performance? I have raised a PR against branch-2.3. 
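
To illustrate why more columns per row means more comparison work on the write path, here is a toy model using only the JDK. It assumes each cell is inserted into a sorted concurrent skip list ordered by a comparator, and simply counts comparator calls. The class name, the key layout, and the use of String keys are hypothetical; it does not use HBase's CellComparator and says nothing about the actual cost per comparison in HBase.

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: String keys stand in for cells, and a counting comparator
// stands in for the cell comparator. Not HBase code.
final class MemstoreComparisonSketch {

    /** Inserts rows * columnsPerRow keys into a sorted map and returns the number of comparator calls. */
    static long comparatorCalls(int rows, int columnsPerRow) {
        AtomicLong calls = new AtomicLong();
        Comparator<String> counting = (a, b) -> {
            calls.incrementAndGet();
            return a.compareTo(b);
        };
        ConcurrentSkipListMap<String, Boolean> sorted = new ConcurrentSkipListMap<>(counting);
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < columnsPerRow; c++) {
                // Hypothetical key layout: row key followed by column qualifier.
                sorted.put(String.format("row%07d/q%03d", r, c), Boolean.TRUE);
            }
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("1 column per row:    " + comparatorCalls(1000, 1));
        System.out.println("110 columns per row: " + comparatorCalls(1000, 110));
    }
}
```

The exact counts vary between runs because the skip list is randomized, but the number of comparator calls grows with the number of cells inserted, so any per-comparison overhead is multiplied for wide rows.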

 

> hbase2.x the performance is lower than hbase 1.x  ？
> ---------------------------------------------------
>
>                 Key: HBASE-25346
>                 URL: https://issues.apache.org/jira/browse/HBASE-25346
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.2
>            Reporter: nilonealex
>            Priority: Critical
>         Attachments: hbase-pe-performace-test.log, hbase-site.xml, 
> test_for_randomWrite.log
>
>
> Recently we found that our newly built production HBase cluster is running 
> a bit slowly. It runs HBase 2.0.2 (HDP 3.1.1) and has 100 nodes. We then 
> compared load and query performance between HBase 2.0.2 (HDP 3.1.1) and 
> HBase 1.2.0 (CDH 5.13.3) in a test environment (4 nodes) and found that 
> putting data into HBase 2.0 is much slower than into HBase 1.x (the 
> throughput of the former is almost half that of the latter). I use 
> BufferedMutator with BufferedMutatorParams for batched puts to improve 
> efficiency. More confusingly, the production environment performs worse 
> than my test environment.
> Some of the code is as follows:
> -----------------------------------------------------------------------
> List<Mutation> mutator = new ArrayList<>();
> BufferedMutator table = null;
> BufferedMutatorParams params =
>     new BufferedMutatorParams(TableName.valueOf(fileHbRule.getHbaseTableName()));
> params.writeBufferSize(fileHbRule.getFlushBuffer().intValue()*1024*1024);
> table = connection.getBufferedMutator(params);
>
> mutator.add(p);
> if (totalCnts % 5000 == 0) {
>     table.mutate(mutator);
>     mutator.clear();
> }
> -----------------------------------------------------------------------
> The input is a comma-separated text file with 2 million rows; each row has 
> 110 columns, and the total size is about 1 GB. Apart from the main 
> parameters such as heap memory, I kept the default values for most of the 
> HBase service settings.
> The load program is single-threaded.
> The following is the progress information:
> ----------------------- HBase 1.2.0 (CDH 5.13.3) -----------------------
> 2020-12-01 16:48:18 inserted:  100000
> 2020-12-01 16:48:36 inserted:  200000
> 2020-12-01 16:48:52 inserted:  300000
> 2020-12-01 16:49:08 inserted:  400000
> 2020-12-01 16:49:23 inserted:  500000
> 2020-12-01 16:49:39 inserted:  600000
> 2020-12-01 16:49:56 inserted:  700000
> 2020-12-01 16:50:12 inserted:  800000
> 2020-12-01 16:50:29 inserted:  900000
> 2020-12-01 16:50:45 inserted:  1000000
> 2020-12-01 16:51:01 inserted:  1100000
> 2020-12-01 16:51:17 inserted:  1200000
> 2020-12-01 16:51:34 inserted:  1300000
> 2020-12-01 16:51:49 inserted:  1400000
> 2020-12-01 16:52:05 inserted:  1500000
> 2020-12-01 16:52:21 inserted:  1600000
> 2020-12-01 16:52:40 inserted:  1700000
> 2020-12-01 16:52:57 inserted:  1800000
> 2020-12-01 16:53:19 inserted:  1900000
> 2020-12-01 16:53:42 inserted:  2000000
> 2020-12-01 16:53:48 inserted:  2000000
> imp finished ok! 
> --job finished--
> ----------------------- HBase 2.0.2 (HDP 3.1.1) -----------------------
> 2020-12-01 17:25:24 inserted:  100000
> 2020-12-01 17:26:03 inserted:  200000
> 2020-12-01 17:26:39 inserted:  300000
> 2020-12-01 17:27:13 inserted:  400000
> 2020-12-01 17:27:47 inserted:  500000
> 2020-12-01 17:28:23 inserted:  600000
> 2020-12-01 17:29:03 inserted:  700000
> 2020-12-01 17:29:40 inserted:  800000
> 2020-12-01 17:30:15 inserted:  900000
> 2020-12-01 17:30:51 inserted:  1000000
> 2020-12-01 17:31:27 inserted:  1100000
> 2020-12-01 17:32:03 inserted:  1200000
> 2020-12-01 17:32:39 inserted:  1300000
> 2020-12-01 17:33:14 inserted:  1400000
> 2020-12-01 17:33:50 inserted:  1500000
> 2020-12-01 17:34:25 inserted:  1600000
> 2020-12-01 17:35:01 inserted:  1700000
> 2020-12-01 17:35:38 inserted:  1800000
> 2020-12-01 17:36:14 inserted:  1900000
> 2020-12-01 17:36:51 inserted:  2000000
> 2020-12-01 17:36:55 inserted:  2000000
> imp finished ok! 
> --job finished--
> returnCode=0
> In addition, we ran some benchmark tests on the production cluster. The 
> latency seems to be a bit high. The detailed report is in the attachment.
> Is there some key configuration that I have missed, or does this 
> version have performance defects?
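
The "almost half" figure in the quoted description can be checked against the progress logs above. A minimal sketch using only the JDK; the class and method names are hypothetical, and it assumes each progress line has the form "<timestamp> inserted:  <count>" as in the logs. It uses the 100000 and 2000000 lines of each run, so it measures the 1,900,000 rows inserted between them.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: derives rows per second from two progress lines of the load program.
final class InsertThroughput {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Rows inserted per second between two progress lines. */
    static double rowsPerSecond(String firstLine, String lastLine) {
        long seconds = Duration.between(timestamp(firstLine), timestamp(lastLine)).getSeconds();
        long rows = count(lastLine) - count(firstLine);
        return (double) rows / seconds;
    }

    /** Parses the leading 19-character timestamp of a progress line. */
    static LocalDateTime timestamp(String line) {
        return LocalDateTime.parse(line.substring(0, 19), FMT);
    }

    /** Parses the row count that follows "inserted:". */
    static long count(String line) {
        String marker = "inserted:";
        return Long.parseLong(line.substring(line.indexOf(marker) + marker.length()).trim());
    }

    public static void main(String[] args) {
        double v1 = rowsPerSecond("2020-12-01 16:48:18 inserted:  100000",
                                  "2020-12-01 16:53:42 inserted:  2000000");
        double v2 = rowsPerSecond("2020-12-01 17:25:24 inserted:  100000",
                                  "2020-12-01 17:36:51 inserted:  2000000");
        System.out.println("HBase 1.2.0 rows/s: " + Math.round(v1));
        System.out.println("HBase 2.0.2 rows/s: " + Math.round(v2));
        System.out.println("2.0.2 relative to 1.2.0: " + v2 / v1);
    }
}
```

The 1.2.0 run takes 324 s and the 2.0.2 run 687 s for the same 1,900,000 rows, which gives roughly 5,864 and 2,766 rows per second, a ratio of about 0.47.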



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
