[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627827#comment-14627827
 ] 

Duo Zhang commented on HDFS-7966:
---------------------------------

Small-read results using {{PerformanceTest}}. All times are in milliseconds.

{noformat}
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 
1(thread number) 1000000(read count per thread) 1024(bytes per read) pread(use 
pread)
{noformat}

{noformat}
./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 1 
1000000 1024 pread
******* time based on tcp 242730

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 1 
1000000 1024 pread
******* time based on http2 324491

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 10 
100000 1024 pread
******* time based on tcp 40688

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 10 
100000 1024 pread
******* time based on http2 82819

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 100 
10000 1024 pread
******* time based on tcp 21612

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 100 
10000 1024 pread
******* time based on http2 69658

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest tcp /test 500 
2000 1024 pread
******* time based on tcp 19931

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 500 
2000 1024 pread
******* time based on http2 151727

./bin/hadoop org.apache.hadoop.hdfs.web.http2.PerformanceTest http2 /test 1000 
1000 1024 pread
******* time based on http2 251735
{noformat}

For the single-threaded test, 324491/242730 = 1.34, so http2 is about 34% 
slower than tcp. I will try to find the source of the overhead later.

For the multi-threaded tests, http2 is much slower than tcp, although tcp 
failed the 1000-thread test.
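
The ratios behind these two observations can be recomputed from the timings above. The following is only a minimal sketch: the timings are copied from the test output, while the names {{TCP_MS}}, {{HTTP2_MS}}, {{slowdown}} and {{reads_per_second}} are hypothetical helpers introduced here and are not part of {{PerformanceTest}}.

```python
# Elapsed times in milliseconds, copied from the PerformanceTest output above.
# Keys are thread counts. tcp has no 1000-thread entry because that run failed.
TCP_MS = {1: 242730, 10: 40688, 100: 21612, 500: 19931}
HTTP2_MS = {1: 324491, 10: 82819, 100: 69658, 500: 151727, 1000: 251735}


def slowdown(threads):
    """Return the http2 elapsed time divided by the tcp elapsed time."""
    return HTTP2_MS[threads] / TCP_MS[threads]


def reads_per_second(elapsed_ms, total_reads=1_000_000):
    """Aggregate throughput. Every run above issues 1,000,000 reads in total
    (thread number multiplied by read count per thread)."""
    return total_reads / (elapsed_ms / 1000.0)


if __name__ == "__main__":
    # Only thread counts that have both a tcp and an http2 result.
    for threads in sorted(TCP_MS):
        print(f"{threads:>4} threads: http2/tcp = {slowdown(threads):.2f}")
```

With these numbers the http2/tcp ratio is roughly 1.34, 2.04, 3.22 and 7.61 for 1, 10, 100 and 500 threads, so the gap widens as the thread count grows.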

I think the problem is that I use only one connection for http2, so there is 
only one EventLoop (that is, only one thread) sending and receiving data. For 
tcp, the number of threads equals the number of connections. The {{%CPU}} of 
the datanode when using http2 is always around 100%, whether the thread count 
is 10, 100, or 1000. With tcp, the {{%CPU}} can exceed 1500% as the number of 
threads increases. Next I will write a new test that can use multiple http2 
connections.
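
One possible way to spread reader threads over several http2 connections is a simple round-robin assignment. This is a sketch only, not the planned test: it assumes each connection is served by its own EventLoop thread, and {{assign_connections}} and {{threads_per_connection}} are hypothetical names that do not exist in the patch.

```python
from collections import Counter


def assign_connections(num_threads, num_connections):
    """Map each reader thread index to a connection index, round-robin."""
    if num_connections < 1:
        raise ValueError("need at least one connection")
    return [t % num_connections for t in range(num_threads)]


def threads_per_connection(num_threads, num_connections):
    """Count how many reader threads share each connection. Under the
    assumption above, this is also the load on each EventLoop thread."""
    return Counter(assign_connections(num_threads, num_connections))


if __name__ == "__main__":
    # One connection: all 100 reader threads funnel through one EventLoop,
    # which matches the setup used in the tests above.
    print(threads_per_connection(100, 1))
    # Sixteen connections (an arbitrary illustrative value).
    print(threads_per_connection(100, 16))
```

With a single connection, every reader thread shares the same EventLoop, which is consistent with the datanode staying at about 100% {{%CPU}}. With more connections, the per-EventLoop load drops in proportion.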

Thanks.

> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>
>                 Key: HDFS-7966
>                 URL: https://issues.apache.org/jira/browse/HDFS-7966
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Haohui Mai
>            Assignee: Qianqian Shi
>              Labels: gsoc, gsoc2015, mentor
>         Attachments: GSoC2015_Proposal.pdf, 
> TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg
>
>
> The current Data Transfer Protocol (DTP) implements a rich set of features 
> that span across multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by implementation. As a 
> result, it requires a non-trivial amount of work to implement HDFS clients 
> and servers.
> This jira explores delegating the responsibilities of the session and 
> presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles 
> connection multiplexing, QoS, authentication and encryption, reducing the 
> scope of DTP to the application layer only. Leveraging an existing HTTP/2 
> library should simplify the implementation of both HDFS clients and 
> servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
