[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245725#comment-17245725 ]
ramkrishna.s.vasudevan commented on HBASE-25346:
------------------------------------------------

[~nilone2]
From the report you have attached, I am not sure whether it is only the writes
that are slower or the reads as well. Is it possible to attach the TPS/latency
numbers for both writes and reads?

> hbase2.x the performance is lower than hbase 1.x ?
> ---------------------------------------------------
>
>                 Key: HBASE-25346
>                 URL: https://issues.apache.org/jira/browse/HBASE-25346
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.2
>            Reporter: nilonealex
>            Priority: Critical
>         Attachments: hbase-pe-performace-test.log, hbase-site.xml
>
>
> Recently we found that our newly built production HBase cluster is running
> a bit slow. It runs HBase 2.0.2 (HDP 3.1.1) and has 100 nodes. We then
> compared load and query performance between HBase 2.0.2 (HDP 3.1.1) and
> HBase 1.2.0 (CDH 5.13.3) in a test environment (4 nodes) and found that
> puts on HBase 2.0 are much slower than on HBase 1.x (the throughput of the
> former is almost half that of the latter). I use BufferedMutator and
> BufferedMutatorParams for batched puts to improve efficiency.
> More confusingly, the production environment performs worse than my test
> environment.
> Some of the code is as follows:
> -----------------------------------------------------------------------
> List<Mutation> mutator = new ArrayList<>();
> BufferedMutator table = null;
> BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf(fileHbRule.getHbaseTableName()));
> params.writeBufferSize(fileHbRule.getFlushBuffer().intValue()*1024*1024);
> table = connection.getBufferedMutator(params);
>
> mutator.add(p);
> if (totalCnts % 5000 == 0) {
>     table.mutate(mutator);
>     mutator.clear();
> }
> -----------------------------------------------------------------------
> The input is a comma-separated text file with 2 million rows and 110
> columns per row; its total size is about 1 GB. Apart from the main
> parameters such as heap memory, I kept the default values for most of the
> HBase services.
> The load program is single-threaded.
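
The quoted snippet batches rows by a counter (`totalCnts % 5000 == 0`) but does not show what happens to the rows left over after the last full batch, or whether the mutator is flushed and closed at the end. As a minimal sketch of that batching pattern only, the Java below uses a plain `Consumer<List<T>>` as a stand-in for `table.mutate(mutator)`; the class name `BatchedLoader` and the method `loadInBatches` are hypothetical and are not part of the HBase API or of the reporter's program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedLoader {

    /**
     * Sends rows to the sink in batches of batchSize and also sends the
     * final partial batch. The sink is a hypothetical stand-in for
     * BufferedMutator.mutate(List) so the sketch runs without a cluster.
     * Returns the number of batches handed to the sink.
     */
    static <T> int loadInBatches(Iterable<T> rows, int batchSize, Consumer<List<T>> sink) {
        List<T> batch = new ArrayList<>(batchSize);
        int batchesSent = 0;
        for (T row : rows) {
            batch.add(row);
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch)); // copy, so the sink may keep the list
                batch.clear();
                batchesSent++;
            }
        }
        if (!batch.isEmpty()) {
            // Tail batch: rows that did not fill a complete batch.
            sink.accept(new ArrayList<>(batch));
            batchesSent++;
        }
        return batchesSent;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 12_345; i++) {
            rows.add(i);
        }
        int[] total = {0};
        int batches = loadInBatches(rows, 5000, b -> total[0] += b.size());
        System.out.println(batches + " batches, " + total[0] + " rows"); // 3 batches, 12345 rows
    }
}
```

With a real `BufferedMutator`, the equivalent final step would presumably be a last `mutate` call followed by `flush()` and `close()`; whether the reporter's program already does this cannot be told from the excerpt.
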
> The following is the progress information:
> ----------------------- Hbase1.2.0 ( CDH5.13.3 ) --------------------------------------------
> 2020-12-01 16:48:18 inserted: 100000
> 2020-12-01 16:48:36 inserted: 200000
> 2020-12-01 16:48:52 inserted: 300000
> 2020-12-01 16:49:08 inserted: 400000
> 2020-12-01 16:49:23 inserted: 500000
> 2020-12-01 16:49:39 inserted: 600000
> 2020-12-01 16:49:56 inserted: 700000
> 2020-12-01 16:50:12 inserted: 800000
> 2020-12-01 16:50:29 inserted: 900000
> 2020-12-01 16:50:45 inserted: 1000000
> 2020-12-01 16:51:01 inserted: 1100000
> 2020-12-01 16:51:17 inserted: 1200000
> 2020-12-01 16:51:34 inserted: 1300000
> 2020-12-01 16:51:49 inserted: 1400000
> 2020-12-01 16:52:05 inserted: 1500000
> 2020-12-01 16:52:21 inserted: 1600000
> 2020-12-01 16:52:40 inserted: 1700000
> 2020-12-01 16:52:57 inserted: 1800000
> 2020-12-01 16:53:19 inserted: 1900000
> 2020-12-01 16:53:42 inserted: 2000000
> 2020-12-01 16:53:48 inserted: 2000000
> imp finished ok!
> --job finished--
> ----------------------- Hbase.2.0.2 ( HDP3.1.1) ---------------------------------------------
> 2020-12-01 17:25:24 inserted: 100000
> 2020-12-01 17:26:03 inserted: 200000
> 2020-12-01 17:26:39 inserted: 300000
> 2020-12-01 17:27:13 inserted: 400000
> 2020-12-01 17:27:47 inserted: 500000
> 2020-12-01 17:28:23 inserted: 600000
> 2020-12-01 17:29:03 inserted: 700000
> 2020-12-01 17:29:40 inserted: 800000
> 2020-12-01 17:30:15 inserted: 900000
> 2020-12-01 17:30:51 inserted: 1000000
> 2020-12-01 17:31:27 inserted: 1100000
> 2020-12-01 17:32:03 inserted: 1200000
> 2020-12-01 17:32:39 inserted: 1300000
> 2020-12-01 17:33:14 inserted: 1400000
> 2020-12-01 17:33:50 inserted: 1500000
> 2020-12-01 17:34:25 inserted: 1600000
> 2020-12-01 17:35:01 inserted: 1700000
> 2020-12-01 17:35:38 inserted: 1800000
> 2020-12-01 17:36:14 inserted: 1900000
> 2020-12-01 17:36:51 inserted: 2000000
> 2020-12-01 17:36:55 inserted: 2000000
> imp finished ok!
> --job finished--
> returnCode=0
> In addition, we also ran some benchmark tests on the production cluster.
> The latency seems to be a bit high. The detailed report is in the
> attachment.
> Is there any key configuration that I have missed, or does this version
> have performance defects?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
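
The progress timestamps quoted in the report can be turned into rough write-throughput figures, which is one way to start answering the request for TPS numbers. The Java sketch below is only an illustration: the class `LoadThroughput` and the method `rowsPerSecond` are hypothetical names, and the calculation assumes the log timestamps mark the moment each 100,000-row milestone was reached. It measures from the first milestone (100,000 rows) to the 2,000,000-row milestone, because the job start time does not appear in the log.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class LoadThroughput {

    // Timestamp layout used by the progress lines in the report.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Average rows per second between two progress lines. */
    static double rowsPerSecond(String startTs, long startRows, String endTs, long endRows) {
        long seconds = Duration.between(
                LocalDateTime.parse(startTs, FMT),
                LocalDateTime.parse(endTs, FMT)).getSeconds();
        return (double) (endRows - startRows) / seconds;
    }

    public static void main(String[] args) {
        // First and last distinct milestones taken from the quoted logs.
        double hbase1 = rowsPerSecond("2020-12-01 16:48:18", 100_000,
                                      "2020-12-01 16:53:42", 2_000_000);
        double hbase2 = rowsPerSecond("2020-12-01 17:25:24", 100_000,
                                      "2020-12-01 17:36:51", 2_000_000);
        System.out.println("HBase 1.2.0 rows/s: " + Math.round(hbase1));
        System.out.println("HBase 2.0.2 rows/s: " + Math.round(hbase2));
        System.out.println("ratio 2.0.2 / 1.2.0: " + (hbase2 / hbase1));
    }
}
```

On the quoted numbers this gives roughly 5,864 rows/s for HBase 1.2.0 (1.9 million rows in 324 s) and roughly 2,766 rows/s for HBase 2.0.2 (1.9 million rows in 687 s), a ratio of about 0.47, which is consistent with the "almost half" described in the report. These are single-threaded client-side averages and say nothing about read latency.
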