[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanrui updated HADOOP-17222:
----------------------------
    Description: 
Note:Not only the hdfs client can get the current benefit, all callers of 
NetUtils.createSocketAddr will get the benefit. Just use hdfs client as an 
example.

 

Hdfs client selects best DN for hdfs Block. method call stack:

DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
NetUtils.createSocketAddr

NetUtils.createSocketAddr creates the corresponding InetSocketAddress based on 
the host and port. There are some heavier operations in the 
NetUtils.createSocketAddr method, for example: URI.create(target), so 
NetUtils.createSocketAddr takes more time to execute.

The following is my performance report. The report is based on HBase calling 
hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
operations often access a small DataBlock (about 64k) instead of the entire 
HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
method is time-consuming.
h3. Test Environment:

 
{code:java}
HBase version: 2.1.0
JVM: -Xmx2g -Xms2g 
hadoop hdfs version: 2.7.4
disk:SSD
OS:CentOS Linux release 7.4.1708 (Core)
JMH Benchmark: @Fork(value = 1) 
@Warmup(iterations = 300) 
@Measurement(iterations = 300)
{code}
h4. Before Optimization FlameGraph:

In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
proportion.

!Before Optimization remark.png!
h3. Optimization ideas:

NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
Here we can add Cache to InetSocketAddress. The key of Cache is host and port, 
and the value is InetSocketAddress.
h4. After Optimization FlameGraph:

In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, and 
the ConcurrentHashMap.get() method gets data from the Cache. The CPU usage of 
DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% to 0.54%.

!After Optimization remark.png!
h3. Original FlameGraph link:

[Before 
Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]

[After Optimization 
FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]

  was:
Hdfs client selects best DN for hdfs Block. method call stack:

DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
NetUtils.createSocketAddr

NetUtils.createSocketAddr creates the corresponding InetSocketAddress based on 
the host and port. There are some heavier operations in the 
NetUtils.createSocketAddr method, for example: URI.create(target), so 
NetUtils.createSocketAddr takes more time to execute.

The following is my performance report. The report is based on HBase calling 
hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
operations often access a small DataBlock (about 64k) instead of the entire 
HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
method is time-consuming.
h3. Test Environment:

 
{code:java}
HBase version: 2.1.0
JVM: -Xmx2g -Xms2g 
hadoop hdfs version: 2.7.4
disk:SSD
OS:CentOS Linux release 7.4.1708 (Core)
JMH Benchmark: @Fork(value = 1) 
@Warmup(iterations = 300) 
@Measurement(iterations = 300)
{code}
h4. Before Optimization FlameGraph:

In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
proportion.

!Before Optimization remark.png!
h3. Optimization ideas:

NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
Here we can add Cache to InetSocketAddress. The key of Cache is host and port, 
and the value is InetSocketAddress.
h4. After Optimization FlameGraph:

In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, and 
the ConcurrentHashMap.get() method gets data from the Cache. The CPU usage of 
DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% to 0.54%.

!After Optimization remark.png!
h3. Original FlameGraph link:

[Before 
Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]

[After Optimization 
FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]


> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-17222
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17222
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, hdfs-client
>         Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>            Reporter: fanrui
>            Priority: Major
>         Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note:Not only the hdfs client can get the current benefit, all callers of 
> NetUtils.createSocketAddr will get the benefit. Just use hdfs client as an 
> example.
>  
> Hdfs client selects best DN for hdfs Block. method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to