[
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736658#comment-14736658
]
Duo Zhang commented on HDFS-7966:
---------------------------------
Netty-4.1.0Beta6 is out, so I'm back. I have added a simple {{asyncRead}}
method (not fully asynchronous, since this is only a POC) to {{DFSInputStream}}
and written a performance test for it. Here are the test results (each test
was run twice):
{noformat}
// 100 here is the max concurrency, which is used to prevent OOM.
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest async /test 100 50000 4096
******* time based on http2 230946
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest async /test 100 50000 4096
******* time based on http2 231066
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 100 50000 4096 pread
******* time based on tcp 231410
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 100 50000 4096 pread
******* time based on tcp 231038
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 100 50000 4096 pread
******* time based on http2 236069
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 100 50000 4096 pread
******* time based on http2 231773
{noformat}
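For readers unfamiliar with the max-concurrency argument, here is a minimal sketch of how a test driver might cap the number of in-flight asynchronous reads so that buffers do not pile up and cause an OOM. It is not the code from the patch: the class {{BoundedAsyncReads}}, the stand-in {{asyncRead}} helper, and the pool size of 8 are all assumptions made for illustration, and the real {{DFSInputStream}} method may have a different signature.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedAsyncReads {
    // Hypothetical stand-in for an asynchronous positional read.
    // It only allocates a buffer; it is NOT the real DFSInputStream API.
    static CompletableFuture<byte[]> asyncRead(ExecutorService pool, long position, int length) {
        return CompletableFuture.supplyAsync(() -> new byte[length], pool);
    }

    // Issues totalReads reads of readSize bytes with at most maxConcurrency in flight.
    // Returns the highest number of simultaneously outstanding reads observed.
    static int run(int maxConcurrency, int totalReads, int readSize) {
        ExecutorService pool = Executors.newFixedThreadPool(8); // assumed pool size
        Semaphore permits = new Semaphore(maxConcurrency);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            for (int i = 0; i < totalReads; i++) {
                // Blocks the issuing thread once the cap is reached.
                permits.acquireUninterruptibly();
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                asyncRead(pool, (long) i * readSize, readSize).whenComplete((buf, err) -> {
                    // Decrement before releasing so inFlight never exceeds the cap.
                    inFlight.decrementAndGet();
                    permits.release();
                });
            }
            // Taking every permit means all outstanding reads have completed.
            permits.acquireUninterruptibly(maxConcurrency);
            permits.release(maxConcurrency);
            return peak.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Same shape as the commands above: concurrency 100, 50000 reads of 4096 bytes.
        System.out.println("peak in-flight reads: " + run(100, 50000, 4096));
    }
}
```

A semaphore is used here because it bounds memory by the number of outstanding buffers rather than by the number of threads; other back-pressure schemes would work as well.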
The performance difference is within about ±4%, and async is slightly faster
than tcp.
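As a quick cross-check of the figures above, the following sketch averages the two runs of each mode and prints the difference relative to tcp. The class and method names are made up for illustration, and the times are taken as-is from the output (the comment does not state their unit).

```java
final class RelativeDifference {
    // Arithmetic mean of the reported times for one mode.
    static double mean(long... samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("need at least one sample");
        }
        double sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Signed percentage by which value differs from baseline.
    // A negative result means a smaller (faster) time than the baseline.
    static double percentDiff(double value, double baseline) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double tcp = mean(231410, 231038);
        double async = mean(230946, 231066);
        double http2 = mean(236069, 231773);
        System.out.printf("async vs tcp: %+.2f%%%n", percentDiff(async, tcp));
        System.out.printf("http2 vs tcp: %+.2f%%%n", percentDiff(http2, tcp));
    }
}
```

With only two runs per mode, the averages give a rough indication rather than a statistically meaningful comparison.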
Thanks.
> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>
> Key: HDFS-7966
> URL: https://issues.apache.org/jira/browse/HDFS-7966
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Haohui Mai
> Assignee: Qianqian Shi
> Labels: gsoc, gsoc2015, mentor
> Attachments: GSoC2015_Proposal.pdf,
> TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg,
> TestHttp2ReadBlockInsideEventLoop.svg
>
>
> The current Data Transfer Protocol (DTP) implements a rich set of features
> that span across multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by implementation. As a
> result, it requires a non-trivial amount of work to implement HDFS clients
> and servers.
> This jira explores delegating the responsibilities of the session and
> presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles
> connection multiplexing, QoS, authentication and encryption, reducing the
> scope of DTP to the application layer only. Leveraging an existing HTTP/2
> library should simplify the implementation of both HDFS clients and
> servers.