[ 
https://issues.apache.org/jira/browse/RATIS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293378#comment-17293378
 ] 

runzhiwang edited comment on RATIS-1312 at 3/2/21, 6:02 AM:
------------------------------------------------------------

[~szetszwo] Hi, I tested the performance again, and streaming is still slower 
than HDFS. With one client writing 400 files * 128MB, streaming took 60 
seconds and HDFS took 48 seconds, and the difference is unrelated to GC. 
The perf results are attached. I find that Ratis streaming spends about 15% 
of the profile allocating DirectByteBuffer (8.79%) and in 
decodeDataStreamRequestByteBuf (6.42%), neither of which is needed in HDFS. 
I think we need to fix these two problems.
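
To put the two timings in the same units, here is a small sketch that 
converts them to throughput. It assumes every file is exactly 128,000,000 
bytes (the --size value in the Ratis client command) for both systems, which 
may not hold for the HDFS client. The class and method names are made up for 
illustration.

```java
/** Illustrative arithmetic only; not part of Ratis or HDFS. */
class ThroughputSketch {
  /** Throughput in MB/s (1 MB = 1,000,000 bytes) for a run that writes
   *  `files` files of `bytesPerFile` bytes each in `seconds` seconds. */
  static double mbPerSecond(long files, long bytesPerFile, double seconds) {
    return files * bytesPerFile / 1e6 / seconds;
  }

  public static void main(String[] args) {
    final long files = 400;
    final long bytesPerFile = 128_000_000L; // assumed equal for both systems
    System.out.printf("streaming: %.1f MB/s%n", mbPerSecond(files, bytesPerFile, 60));
    System.out.printf("hdfs:      %.1f MB/s%n", mbPerSecond(files, bytesPerFile, 48));
  }
}
```

In wall-clock terms the streaming run takes 25% longer than the HDFS run 
(60 seconds versus 48 seconds).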

1. decodeDataStreamRequestByteBuf costs 6.42% because Netty splits each 4MB 
write into many small packets, even though we set the buffer size to 4MB, so 
decodeDataStreamRequestByteBuf has to run for each small packet. 

2. Allocating DirectByteBuffer costs 8.79%; I still do not know why.
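
The effect in item 1 can be illustrated with a minimal sketch. This is not 
the Ratis or Netty code: it is a hypothetical length-prefixed decoder, using 
only java.nio, that is invoked once per chunk delivered by the transport. The 
4-byte header, the 64KB chunk size and all names are assumptions for 
illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch: a frame decoder that runs on every network read. */
class FragmentedDecodeSketch {
  static final int HEADER_BYTES = 4; // assumed header: a 4-byte body length only

  private final ByteBuffer cumulation; // bytes received but not yet decoded
  private int decodeCalls = 0;
  private int framesDecoded = 0;

  FragmentedDecodeSketch(int capacity) {
    this.cumulation = ByteBuffer.allocate(capacity);
  }

  /** Called for every chunk the transport delivers. */
  void onRead(byte[] chunk) {
    cumulation.put(chunk);
    cumulation.flip();                 // switch to reading the accumulated bytes
    while (cumulation.hasRemaining() && tryDecode()) {
      // keep decoding while complete frames are available
    }
    cumulation.compact();              // keep the undecoded tail for the next read
  }

  /** Tries to decode one frame; returns false if the frame is still incomplete. */
  private boolean tryDecode() {
    decodeCalls++;
    if (cumulation.remaining() < HEADER_BYTES) {
      return false;
    }
    cumulation.mark();
    final int bodyLength = cumulation.getInt();
    if (cumulation.remaining() < bodyLength) {
      cumulation.reset();              // rewind; the header is parsed again next time
      return false;
    }
    cumulation.position(cumulation.position() + bodyLength); // consume the body
    framesDecoded++;
    return true;
  }

  /** Sends one frame in chunks of chunkSize; returns {decodeCalls, framesDecoded}. */
  static int[] simulate(int bodyLength, int chunkSize) {
    final byte[] frame =
        ByteBuffer.allocate(HEADER_BYTES + bodyLength).putInt(bodyLength).array();
    final FragmentedDecodeSketch decoder = new FragmentedDecodeSketch(frame.length);
    for (int off = 0; off < frame.length; off += chunkSize) {
      final int end = Math.min(off + chunkSize, frame.length);
      decoder.onRead(Arrays.copyOfRange(frame, off, end));
    }
    return new int[] {decoder.decodeCalls, decoder.framesDecoded};
  }

  public static void main(String[] args) {
    final int[] r = simulate(4_000_000, 64 * 1024);
    System.out.println("decode calls=" + r[0] + ", frames=" + r[1]);
  }
}
```

With these assumed sizes, one 4,000,000-byte frame arrives in 62 chunks and 
triggers 62 decode attempts, of which only the last one succeeds. If the same 
frame arrived in a single read, there would be one attempt.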


Test steps:
1. I used only one Ratis node, compared against one namenode and one datanode.

2. I did not write to disk in either streaming or HDFS, so that only network 
performance is tested.

3. Code for Ratis streaming
Client and server code, with the disk writes commented out: 
https://github.com/runzhiwang/incubator-ratis/commits/ratis-streaming
Client command: ${BIN}/client.sh filestore datastream --size 128000000 
--numFiles 400 --bufferSize 4000000 --syncSize -1 --type DirectByteBuffer 
--peers ${PEERS} --storage /data/ratis/n2 --storage /data1/ratis/n2 --storage 
/data2/ratis/n2 --storage /data3/ratis/n2 --storage /data4/ratis/n2 --storage 
/data5/ratis/n2 --storage /data6/ratis/n2 --storage /data7/ratis/n2 --storage 
/data8/ratis/n2 --storage /data9/ratis/n2 --storage /data10/ratis/n2 --storage 
/data11/ratis/n2

4. Code for HDFS
Server code, with the disk writes commented out: 
https://github.com/runzhiwang/hadoop/tree/release-3.2.1
Client code: https://github.com/runzhiwang/HdfsPerformance
Client command: hadoop-3.2.1/bin/hadoop jar hdfs-performance-1.0-SNAPSHOT.jar 
org.apache.HdfsPerformance 400




> Compare the performance between HDFS and DataStreamApi
> ------------------------------------------------------
>
>                 Key: RATIS-1312
>                 URL: https://issues.apache.org/jira/browse/RATIS-1312
>             Project: Ratis
>          Issue Type: Sub-task
>            Reporter: runzhiwang
>            Priority: Major
>         Attachments: hdfs.svg, streaming.svg
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
