[
https://issues.apache.org/jira/browse/RATIS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293378#comment-17293378
]
runzhiwang edited comment on RATIS-1312 at 3/2/21, 6:03 AM:
------------------------------------------------------------
[~szetszwo] Hi, I tested the performance again, and streaming is still slower
than HDFS. With one client writing 400 files * 128MB, streaming took 60
seconds and HDFS took 48 seconds, and the difference is not related to GC.
The perf results are attached. I found that Ratis streaming spends about 15% of
the profile on allocating DirectByteBuffer (8.79%) and on
decodeDataStreamRequestByteBuf (6.42%), neither of which is needed in HDFS.
I think we need to fix these two problems.
1. decodeDataStreamRequestByteBuf costs 6.42% because netty splits each 4MB
buffer into many small packets, even though we set the buffer size to 4MB, so
every small packet has to go through decodeDataStreamRequestByteBuf.
2. Allocating DirectByteBuffer costs 8.79%; I still do not know why.
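To get a feel for point 1, here is a rough sketch that counts how many times the decoder would run if it is invoked once per received chunk, as described above. The 64 KiB chunk size is only an assumed value for illustration, not something measured in this test, and the class and method names are hypothetical (they are not Ratis or netty APIs).

```java
public final class DecodeCallEstimate {
    private DecodeCallEstimate() {}

    /** Number of chunks needed to carry `bytes` when a chunk holds at most `chunkSize` bytes. */
    public static long chunks(long bytes, long chunkSize) {
        if (bytes < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("bytes must be >= 0 and chunkSize > 0");
        }
        return (bytes + chunkSize - 1) / chunkSize; // ceiling division
    }

    /** Decode invocations for one file, assuming one invocation per received chunk. */
    public static long decodeCallsPerFile(long fileSize, long bufferSize, long chunkSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be > 0");
        }
        long fullBuffers = fileSize / bufferSize; // buffers sent at full size
        long tailBytes = fileSize % bufferSize;   // last, partially filled buffer (may be 0)
        return fullBuffers * chunks(bufferSize, chunkSize) + chunks(tailBytes, chunkSize);
    }

    public static void main(String[] args) {
        long fileSize = 128_000_000L; // --size from the client command
        long bufferSize = 4_000_000L; // --bufferSize from the client command
        long chunkSize = 64L * 1024;  // ASSUMPTION: hypothetical chunk size, not measured
        long perFile = decodeCallsPerFile(fileSize, bufferSize, chunkSize);
        System.out.println("decode calls per buffer: " + chunks(bufferSize, chunkSize));
        System.out.println("decode calls per file:   " + perFile);
        System.out.println("decode calls, 400 files: " + 400 * perFile);
    }
}
```

Under that assumed chunk size, each 4,000,000-byte buffer would need 62 decode calls instead of 1, which is the kind of overhead the profile would attribute to decodeDataStreamRequestByteBuf.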
The test steps:
1. I used only one Ratis node, compared with one namenode and one datanode
for HDFS.
2. I did not write to disk in either streaming or HDFS, so that only network
performance is tested.
3. Code for Ratis streaming
Client and server code, with the disk write commented out:
https://github.com/runzhiwang/incubator-ratis/commits/ratis-streaming
Client command: ${BIN}/client.sh filestore datastream --size 128000000
--numFiles 400 --bufferSize 4000000 --syncSize -1 --type DirectByteBuffer
--peers ${PEERS} --storage /data/ratis/n2 --storage /data1/ratis/n2 --storage
/data2/ratis/n2 --storage /data3/ratis/n2 --storage /data4/ratis/n2 --storage
/data5/ratis/n2 --storage /data6/ratis/n2 --storage /data7/ratis/n2 --storage
/data8/ratis/n2 --storage /data9/ratis/n2 --storage /data10/ratis/n2 --storage
/data11/ratis/n2
4. Code for HDFS
Server code, with the disk write commented out:
https://github.com/runzhiwang/hadoop/tree/release-3.2.1
Client code: https://github.com/runzhiwang/HdfsPerformance
Client command: hadoop-3.2.1/bin/hadoop jar hdfs-performance-1.0-SNAPSHOT.jar
org.apache.HdfsPerformance 400
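For reference, the timings can be turned into throughput with a few lines of arithmetic. This is only a sketch: it assumes both runs wrote 400 files of 128,000,000 bytes each (the --size and --numFiles values from the Ratis client command), and the class and method names are made up for illustration.

```java
public final class ThroughputEstimate {
    private ThroughputEstimate() {}

    /** Throughput in MB/s (1 MB = 1,000,000 bytes) for numFiles files of bytesPerFile each. */
    public static double megabytesPerSecond(long numFiles, long bytesPerFile, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be > 0");
        }
        return (double) numFiles * bytesPerFile / seconds / 1_000_000.0;
    }

    public static void main(String[] args) {
        long numFiles = 400L;             // --numFiles
        long bytesPerFile = 128_000_000L; // --size; ASSUMPTION: same size in the HDFS run
        double streaming = megabytesPerSecond(numFiles, bytesPerFile, 60.0); // streaming: 60 s
        double hdfs = megabytesPerSecond(numFiles, bytesPerFile, 48.0);      // HDFS: 48 s
        System.out.println("streaming MB/s: " + streaming);
        System.out.println("hdfs MB/s:      " + hdfs);
        System.out.println("streaming time / hdfs time: " + (60.0 / 48.0));
    }
}
```

With those assumptions, streaming comes out at roughly 853 MB/s and HDFS at roughly 1067 MB/s, so streaming takes 25% longer for the same data.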
> Compare the performance between HDFS and DataStreamApi
> ------------------------------------------------------
>
> Key: RATIS-1312
> URL: https://issues.apache.org/jira/browse/RATIS-1312
> Project: Ratis
> Issue Type: Sub-task
> Reporter: runzhiwang
> Priority: Major
> Attachments: hdfs.svg, streaming.svg
>
>