[jira] [Updated] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ?
[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nilonealex updated HBASE-25346:
---
    Attachment: error_pe_randomWrite.log

> hbase2.x the performance is lower than hbase 1.x ?
> ---
>
>                 Key: HBASE-25346
>                 URL: https://issues.apache.org/jira/browse/HBASE-25346
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.2
>            Reporter: nilonealex
>            Priority: Critical
>         Attachments: error_pe_randomWrite.log, error_pe_randomWrite.log, hbase-pe-performace-test.log, hbase-site.xml, test_for_randomWrite.log, test_for_randomWrite_hbase1.2.1.log
>
>
> Recently we found that our newly built production HBase cluster is running a bit slow. It runs HBase 2.0.2 (HDP 3.1.1) and has 100 nodes. We then compared load and query performance between HBase 2.0.2 (HDP 3.1.1) and the HBase 1.2.0 (CDH 5.13.3) test environment (4 nodes) and found that putting data into HBase 2.0 is much slower than into HBase 1.x (the former reaches almost half the speed of the latter). I use BufferedMutator and BufferedMutatorParams for batched puts to improve efficiency. More confusing, the performance of the production environment is worse than that of my test environment.
> Some of the code is as follows:
> ---
> {color:#4C9AFF}List mutator = new ArrayList<>();
> BufferedMutator table = null;
> BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf(fileHbRule.getHbaseTableName()));
> params.writeBufferSize(fileHbRule.getFlushBuffer().intValue()*1024*1024);
> table = connection.getBufferedMutator(params);
>
> mutator.add(p);
> if(totalCnts % 5000 == 0 ) {
>     table.mutate(mutator);
>     mutator.clear();
> }{color}
> ---
> The file to load is a comma-separated text file with 2 million rows of 110 columns each; its total size is about 1 GB. Apart from the main parameters such as heap memory, I kept the default values for most of the HBase services.
> The load program is single-threaded.
> The following is the progress information:
> --- HBase 1.2.0 (CDH 5.13.3) ---
>
> 2020-12-01 16:48:18 inserted: 10
> 2020-12-01 16:48:36 inserted: 20
> 2020-12-01 16:48:52 inserted: 30
> 2020-12-01 16:49:08 inserted: 40
> 2020-12-01 16:49:23 inserted: 50
> 2020-12-01 16:49:39 inserted: 60
> 2020-12-01 16:49:56 inserted: 70
> 2020-12-01 16:50:12 inserted: 80
> 2020-12-01 16:50:29 inserted: 90
> 2020-12-01 16:50:45 inserted: 100
> 2020-12-01 16:51:01 inserted: 110
> 2020-12-01 16:51:17 inserted: 120
> 2020-12-01 16:51:34 inserted: 130
> 2020-12-01 16:51:49 inserted: 140
> 2020-12-01 16:52:05 inserted: 150
> 2020-12-01 16:52:21 inserted: 160
> 2020-12-01 16:52:40 inserted: 170
> 2020-12-01 16:52:57 inserted: 180
> 2020-12-01 16:53:19 inserted: 190
> 2020-12-01 16:53:42 inserted: 200
> 2020-12-01 16:53:48 inserted: 200
> imp finished ok!
> --job finished--
> --- HBase 2.0.2 (HDP 3.1.1) ---
> 2020-12-01 17:25:24 inserted: 10
> 2020-12-01 17:26:03 inserted: 20
> 2020-12-01 17:26:39 inserted: 30
> 2020-12-01 17:27:13 inserted: 40
> 2020-12-01 17:27:47 inserted: 50
> 2020-12-01 17:28:23 inserted: 60
> 2020-12-01 17:29:03 inserted: 70
> 2020-12-01 17:29:40 inserted: 80
> 2020-12-01 17:30:15 inserted: 90
> 2020-12-01 17:30:51 inserted: 100
> 2020-12-01 17:31:27 inserted: 110
> 2020-12-01 17:32:03 inserted: 120
> 2020-12-01 17:32:39 inserted: 130
> 2020-12-01 17:33:14 inserted: 140
> 2020-12-01 17:33:50 inserted: 150
> 2020-12-01 17:34:25 inserted: 160
> 2020-12-01 17:35:01 inserted: 170
> 2020-12-01 17:35:38 inserted: 180
> 2020-12-01 17:36:14 inserted: 190
> 2020-12-01 17:36:51 inserted: 200
> 2020-12-01 17:36:55 inserted: 200
> imp finished ok!
> --job finished--
> returnCode=0
> In addition, we also ran some benchmark tests on the production cluster. The latency seems to be a bit high. The detailed report is in the attachment.
> Are there any key configuration points that I have missed? Or does this version have performance defects?

--
This message was sent by Atlassian Jira (v8.3.4#803005)
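The size of the slowdown can be read off the two progress logs in the message above. The following is a minimal sketch, not part of the reporter's load program: the class and method names (`ProgressLogStats`, `elapsedSeconds`) are hypothetical, and because the log does not state the unit of the `inserted:` counter, only the elapsed time between the first and last progress line of each run is compared.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for reading the progress logs; names are illustrative only.
final class ProgressLogStats {
    // Timestamp layout used by the progress lines, e.g. "2020-12-01 16:48:18".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Seconds elapsed between two progress-log timestamps.
    static long elapsedSeconds(String first, String last) {
        LocalDateTime start = LocalDateTime.parse(first, TS);
        LocalDateTime end = LocalDateTime.parse(last, TS);
        return Duration.between(start, end).getSeconds();
    }

    public static void main(String[] args) {
        // First and last progress lines of each run, copied from the logs.
        long hbase1 = elapsedSeconds("2020-12-01 16:48:18", "2020-12-01 16:53:48");
        long hbase2 = elapsedSeconds("2020-12-01 17:25:24", "2020-12-01 17:36:55");
        System.out.printf("HBase 1.2.0: %d s, HBase 2.0.2: %d s, ratio %.2f%n",
                hbase1, hbase2, (double) hbase2 / hbase1);
    }
}
```

For these timestamps the two runs take 330 s and 691 s, a ratio of about 2.1, which matches the "almost half" speed described in the report.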
[jira] [Updated] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ?
[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nilonealex updated HBASE-25346:
---
    Attachment: error_pe_randomWrite.log
[jira] [Updated] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ?
[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nilonealex updated HBASE-25346:
---
    Attachment: test_for_randomWrite_hbase1.2.1.log
[jira] [Updated] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ?
[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nilonealex updated HBASE-25346:
---
    Attachment: test_for_randomWrite.log
[jira] [Updated] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ?
[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nilonealex updated HBASE-25346:
---
    Attachment: hbase-pe-performace-test.log
[jira] [Updated] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ?
[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nilonealex updated HBASE-25346:
---
    Description:
Recently we found that our newly built production HBase cluster is running a bit slow. It runs HBase 2.0.2 (HDP 3.1.1) and has 100 nodes. We then compared load and query performance between HBase 2.0.2 (HDP 3.1.1) and the HBase 1.2.0 (CDH 5.13.3) test environment (4 nodes) and found that putting data into HBase 2.0 is much slower than into HBase 1.x (the former reaches almost half the speed of the latter). I use BufferedMutator and BufferedMutatorParams for batched puts to improve efficiency. More confusing, the performance of the production environment is worse than that of my test environment.
Some of the code is as follows:
---
{color:#4C9AFF}List mutator = new ArrayList<>();
BufferedMutator table = null;
BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf(fileHbRule.getHbaseTableName()));
params.writeBufferSize(fileHbRule.getFlushBuffer().intValue()*1024*1024);
table = connection.getBufferedMutator(params);

mutator.add(p);
if(totalCnts % 5000 == 0 ) {
    table.mutate(mutator);
    mutator.clear();
}{color}
---
The file to load is a comma-separated text file with 2 million rows of 110 columns each; its total size is about 1 GB. Apart from the main parameters such as heap memory, I kept the default values for most of the HBase services.
The load program is single-threaded.
The following is the progress information:
--- HBase 1.2.0 (CDH 5.13.3) ---

2020-12-01 16:48:18 inserted: 10
2020-12-01 16:48:36 inserted: 20
2020-12-01 16:48:52 inserted: 30
2020-12-01 16:49:08 inserted: 40
2020-12-01 16:49:23 inserted: 50
2020-12-01 16:49:39 inserted: 60
2020-12-01 16:49:56 inserted: 70
2020-12-01 16:50:12 inserted: 80
2020-12-01 16:50:29 inserted: 90
2020-12-01 16:50:45 inserted: 100
2020-12-01 16:51:01 inserted: 110
2020-12-01 16:51:17 inserted: 120
2020-12-01 16:51:34 inserted: 130
2020-12-01 16:51:49 inserted: 140
2020-12-01 16:52:05 inserted: 150
2020-12-01 16:52:21 inserted: 160
2020-12-01 16:52:40 inserted: 170
2020-12-01 16:52:57 inserted: 180
2020-12-01 16:53:19 inserted: 190
2020-12-01 16:53:42 inserted: 200
2020-12-01 16:53:48 inserted: 200
imp finished ok!
--job finished--
--- HBase 2.0.2 (HDP 3.1.1) ---
2020-12-01 17:25:24 inserted: 10
2020-12-01 17:26:03 inserted: 20
2020-12-01 17:26:39 inserted: 30
2020-12-01 17:27:13 inserted: 40
2020-12-01 17:27:47 inserted: 50
2020-12-01 17:28:23 inserted: 60
2020-12-01 17:29:03 inserted: 70
2020-12-01 17:29:40 inserted: 80
2020-12-01 17:30:15 inserted: 90
2020-12-01 17:30:51 inserted: 100
2020-12-01 17:31:27 inserted: 110
2020-12-01 17:32:03 inserted: 120
2020-12-01 17:32:39 inserted: 130
2020-12-01 17:33:14 inserted: 140
2020-12-01 17:33:50 inserted: 150
2020-12-01 17:34:25 inserted: 160
2020-12-01 17:35:01 inserted: 170
2020-12-01 17:35:38 inserted: 180
2020-12-01 17:36:14 inserted: 190
2020-12-01 17:36:51 inserted: 200
2020-12-01 17:36:55 inserted: 200
imp finished ok!
--job finished--
returnCode=0
In addition, we also ran some benchmark tests on the production cluster. The latency seems to be a bit high. The detailed report is in the attachment.
Are there any key configuration points that I have missed? Or does this version have performance defects?

  was:
Recently we found that the newly built production hbase cluster is running a bit slow , the hadoop version is Hbase2.0.2 ( HDP3.1.1) and it has 100 nodes.Then we begin to do load & query performance verification between Hbase2.0.2 ( HDP3.1.1) & Hbase1.2.0 ( CDH5.13.3 ) test environment (4nodes), found that : put data based on hbase2.0 is much slower than hbase1.x (the former is almost half of the latter), I use BufferedMutator and BufferedMutatorParams term for batch put to improve efficiency. Some of the codes are as follows:
---
{color:#4C9AFF}List mutator = new ArrayList<>();
BufferedMutator table = null;
BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf(fileHbRule.getHbaseTableName()));
params.writeBufferSize(fileHbRule.getFlushBuffer().intValue()*1024*1024);
table = connection.getBufferedMutator(params);

mutator.add(p);
if(totalCnts % 5000 == 0 ) {
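The batching excerpt in the description hands rows over only when `totalCnts % 5000 == 0`, so whatever remains after the last full batch still needs one final flush; the excerpt does not show that step. Below is a minimal, stdlib-only sketch of the pattern. It is not the reporter's program and does not use the HBase client: `BatchingWriter` and `sink` are hypothetical names, and a `Consumer<List<T>>` stands in for the `table.mutate(...)` call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the batching loop; `sink` plays the role of table.mutate(...).
final class BatchingWriter<T> implements AutoCloseable {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchingWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Buffer one item and hand over a full batch as soon as it is complete.
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Hand over whatever is buffered, including a partial batch.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // pass a copy so clearing is safe
        pending.clear();
    }

    // Closing flushes the remainder, so no rows are left behind at the end of the job.
    @Override
    public void close() {
        flush();
    }
}

final class BatchingWriterDemo {
    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        try (BatchingWriter<String> writer =
                     new BatchingWriter<>(5000, batch -> batchSizes.add(batch.size()))) {
            for (int i = 0; i < 12_000; i++) {
                writer.add("row-" + i);
            }
        }
        System.out.println(batchSizes); // two full batches plus the 2000-row remainder
    }
}
```

In the HBase client itself, `BufferedMutator` already buffers mutations up to the configured write buffer size and exposes `flush()` and `close()`, so an extra application-level batch list like the one in the excerpt may not be needed; that is a suggestion to check, not a diagnosis of the reported slowdown.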
[jira] [Updated] (HBASE-25346) hbase2.x the performance is lower than hbase 1.x ?
[ https://issues.apache.org/jira/browse/HBASE-25346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nilonealex updated HBASE-25346:
---
    Summary: hbase2.x the performance is lower than hbase 1.x ? (was: hbase2.x the speed of writing data is slower than hbase 1.x)