[
https://issues.apache.org/jira/browse/HBASE-23185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shinya Yoshida updated HBASE-23185:
-----------------------------------
Description:
When we analyzed the performance of our hbase application with many puts, we
found that Configuration methods use many CPU resources:
!Screenshot from 2019-10-18 12-38-14.png|width=460,height=205!
As you can see, getTable().put() is calling Configuration methods which cause
regex or synchronization by Hashtable.
This should not happen in 0.99.2 because
https://issues.apache.org/jira/browse/HBASE-12128 addressed such an issue.
However, it's reproducing nowadays by bugs or leakages after many code
evoluations between 0.9x and 1.x.
#
[https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java#L369-L374]
** finishSetup is called every new HTable() e.g. every con.getTable()
** So getInt is called everytime and it does regex
#
[https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/BufferedMutatorImpl.java#L115]
** BufferedMutatorImpl is created every first put for HTable e.g.
con.getTable().put()
** Create ConnectionConf every time in BufferedMutatorImpl constructor
** ConnectionConf gets config value in the constructor
#
[https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java#L326]
** AsyncProcess is created in BufferedMutatorImpl constructor, so new
AsyncProcess is created by con.getTable().put()
** AsyncProcess parse many configurations
So, con.getTable().put() is heavy operation for CPU because of getting config
value.
With in-house patch for this issue, we observed about 10% improvement on
max-throughput (e.g. CPU usage) at client-side:
!Screenshot from 2019-10-18 13-03-24.png|width=508,height=223!
Seems branch-2 is not affected because client implementation has been changed
dramatically.
was:
When we analyzed the performance of our hbase application with many puts, we
found that Configuration methods use many CPU resources:
!Screenshot from 2019-10-18 12-38-14.png|width=460,height=205!
As you can see, getTable().put() is calling Configuration methods which cause
regex or synchronization by Hashtable.
This should not happen in 0.99.2 because
https://issues.apache.org/jira/browse/HBASE-12128 addressed such an issue.
However, it's reproducing nowadays by bugs or leakages after many code
evoluations between 0.9x and 1.x.
#
[https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java#L369-L374]
** finishSetup is called every new HTable() e.g. every con.getTable()
** So getInt is called everytime and it does regex
#
[https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/BufferedMutatorImpl.java#L115]
** BufferedMutatorImpl is created every first put for HTable e.g.
con.getTable().put()
** Create ConnectionConf every time in BufferedMutatorImpl constructor
** ConnectionConf gets config value in the constructor
#
[https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java#L326]
** AsyncProcess is created in BufferedMutatorImpl constructor, so new
AsyncProcess is created by con.getTable().put()
** AsyncProcess parse many configurations
So, con.getTable().put() is heavy operation for CPU because of getting config
value.
With in-house patch for this issue, we observed about 10% improvement on
max-throughput (e.g. CPU usage) at client-side:
!Screenshot from 2019-10-18 13-03-24.png|width=508,height=223!
I confirmed branch-2 is not affected because client implementation has been
changed dramatically.
> High cpu usage because getTable()#put() gets config value every time
> --------------------------------------------------------------------
>
> Key: HBASE-23185
> URL: https://issues.apache.org/jira/browse/HBASE-23185
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 1.5.0, 1.4.10, 1.2.12, 1.3.5
> Reporter: Shinya Yoshida
> Assignee: Shinya Yoshida
> Priority: Major
> Labels: performance
> Attachments: Screenshot from 2019-10-18 12-38-14.png, Screenshot from
> 2019-10-18 13-03-24.png
>
>
> When we analyzed the performance of our hbase application with many puts, we
> found that Configuration methods use many CPU resources:
> !Screenshot from 2019-10-18 12-38-14.png|width=460,height=205!
> As you can see, getTable().put() is calling Configuration methods which cause
> regex or synchronization by Hashtable.
> This should not happen in 0.99.2 because
> https://issues.apache.org/jira/browse/HBASE-12128 addressed such an issue.
> However, it's reproducing nowadays by bugs or leakages after many code
> evoluations between 0.9x and 1.x.
> #
> [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java#L369-L374]
> ** finishSetup is called every new HTable() e.g. every con.getTable()
> ** So getInt is called everytime and it does regex
> #
> [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/BufferedMutatorImpl.java#L115]
> ** BufferedMutatorImpl is created every first put for HTable e.g.
> con.getTable().put()
> ** Create ConnectionConf every time in BufferedMutatorImpl constructor
> ** ConnectionConf gets config value in the constructor
> #
> [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java#L326]
> ** AsyncProcess is created in BufferedMutatorImpl constructor, so new
> AsyncProcess is created by con.getTable().put()
> ** AsyncProcess parse many configurations
> So, con.getTable().put() is heavy operation for CPU because of getting config
> value.
>
> With in-house patch for this issue, we observed about 10% improvement on
> max-throughput (e.g. CPU usage) at client-side:
> !Screenshot from 2019-10-18 13-03-24.png|width=508,height=223!
>
> Seems branch-2 is not affected because client implementation has been changed
> dramatically.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)