[
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang updated HDFS-7966:
----------------------------
Attachment: TestHttp2Performance.svg
I ran this test case (with readPerThread changed to 200000):
https://github.com/Apache9/hadoop/blob/HDFS-7966-POC/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/dtp/TestHttp2Performance.java
This is a single-threaded test case that repeatedly seeks back and reads a
small amount of data. Note that I disabled transferTo, since we cannot use
transferTo with netty HTTP/2 right now.
The output is:
{noformat}
******* time based on http2 45145ms
******* time based on tcp 29871ms
{noformat}
HTTP/2 is about 50% slower ((45145 - 29871) / 29871 = 51.1%).
The attachment is the flame graph of this test run. I collected some numbers
from it:
{noformat}
ReadBlockHandler           : 24.42 = 9.60 + 8.82 + 4.14 + 1.86
DataXceiver                : 23.64
Http2BlockReader           : 12.81 = 5.67 + 1.26 + 5.13 + 0.35 + 0.40
RemoteBlockReader2         : 10.41 = 6.90 + 3.51
ThreadPoolExecutor.execute :  0.98
ThreadPoolExecutor.getTask :  1.69
Other netty overhead       : 10.46 = 12.19 - 0.35 - 0.40 - 0.98
{noformat}
According to the graph, HTTP/2 should be (50.36 - 34.05) / 34.05 = 47.9%
slower, where 50.36 = 24.42 + 12.81 + 0.98 + 1.69 + 10.46 is the HTTP/2 side
and 34.05 = 23.64 + 10.41 is the TCP side. This is basically the same as the
test case output.
The actual HTTP/2 overhead is only about 5% (see Http2ConnectionHandler.decode
in the graph, and subtract the ThreadPoolExecutor and Http2DataReceiver time
from it: 6.85 - 0.98 - 0.35 - 0.40 = 5.12). So HTTP/2 itself should make us
only about 15% slower (5.12 / 34.05 = 15.0%).
I think we first need to find out why we are 50% slower rather than 15%.
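For anyone who wants to re-check the arithmetic, here is a minimal sketch that recomputes the three ratios from the numbers quoted in this comment. The class name Http2OverheadEstimate and the grouping of rows into an "HTTP/2 side" and a "TCP side" are assumptions inferred from the totals 50.36 and 34.05; nothing here is taken from the actual test code.

```java
import java.util.Locale;

/** Hypothetical helper; only re-derives the percentages quoted in this comment. */
public class Http2OverheadEstimate {
    // Flame graph shares copied from the {noformat} block.
    // Assumed grouping: ReadBlockHandler + Http2BlockReader + execute + getTask + other netty.
    static final double HTTP2_SIDE = 24.42 + 12.81 + 0.98 + 1.69 + 10.46;
    // Assumed grouping: DataXceiver + RemoteBlockReader2.
    static final double TCP_SIDE = 23.64 + 10.41;
    // Http2ConnectionHandler.decode minus ThreadPoolExecutor and Http2DataReceiver time.
    static final double HTTP2_ONLY = 6.85 - 0.98 - 0.35 - 0.40;

    /** Relative slowdown of slow versus fast, in percent. */
    static double slowdownPercent(double slow, double fast) {
        return (slow - fast) / fast * 100.0;
    }

    /** Cost of an extra share on top of a baseline share, in percent of the baseline. */
    static double extraPercent(double extra, double baseline) {
        return extra / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Wall-clock times from the test output, in milliseconds.
        double http2Ms = 45145;
        double tcpMs = 29871;

        System.out.printf(Locale.ROOT, "measured slowdown:    %.1f%%%n",
                slowdownPercent(http2Ms, tcpMs));          // 51.1%
        System.out.printf(Locale.ROOT, "flame graph slowdown: %.1f%%%n",
                slowdownPercent(HTTP2_SIDE, TCP_SIDE));    // 47.9%
        System.out.printf(Locale.ROOT, "HTTP/2-only slowdown: %.1f%%%n",
                extraPercent(HTTP2_ONLY, TCP_SIDE));       // 15.0%
    }
}
```

The gap between the second and third figure (roughly 48% versus 15%) is the part that is not explained by HTTP/2 decoding itself.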
Thanks.
> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>
> Key: HDFS-7966
> URL: https://issues.apache.org/jira/browse/HDFS-7966
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Haohui Mai
> Assignee: Qianqian Shi
> Labels: gsoc, gsoc2015, mentor
> Attachments: GSoC2015_Proposal.pdf, TestHttp2Performance.svg
>
>
> The current Data Transfer Protocol (DTP) implements a rich set of features
> that span across multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by implementation. As a
> result, it requires a non-trivial amount of work to implement HDFS clients
> and servers.
> This jira explores delegating the responsibilities of the session and
> presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles
> connection multiplexing, QoS, authentication and encryption, reducing the
> scope of DTP to the application layer only. By leveraging the existing HTTP/2
> library, it should simplify the implementation of both HDFS clients and
> servers.