[
https://issues.apache.org/jira/browse/RATIS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293378#comment-17293378
]
runzhiwang edited comment on RATIS-1312 at 3/2/21, 6:09 AM:
------------------------------------------------------------
[~szetszwo] Hi, I tested the performance again, and streaming is still slower
than HDFS. When one client writes 400 files * 128MB, streaming takes 60
seconds and HDFS takes 48 seconds, and the difference is unrelated to GC.
The perf results are attached. I find that Ratis streaming spends about 15% in
allocating DirectByteBuffer (8.79%) and decodeDataStreamRequestByteBuf (6.42%),
neither of which is needed in HDFS. I think we need to fix these two problems.
1. decodeDataStreamRequestByteBuf costs 6.42% because netty splits each 4MB
buffer into many small packets, even though we set the buffer size to 4MB, so
every small packet goes through decodeDataStreamRequestByteBuf. I still do not
know how to fix this.
2. Allocating DirectByteBuffer costs 8.79%; I still do not know why.
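To give a rough sense of scale for problem 1, the sketch below estimates how
many socket reads (and therefore decode attempts) one 4MB buffer could turn
into. It is only an illustration under stated assumptions: the 64 KiB read
size is an assumed value and was not measured in this test, reads are treated
as aligned to buffer boundaries, and the decoder is assumed to run once per
read. The class and method names are hypothetical and are not part of Ratis.

```java
final class DecodeCallEstimate {
    // Number of socket reads needed to deliver packetBytes when each read
    // returns at most bytesPerRead bytes (ceiling division).
    static long readsPerPacket(long packetBytes, int bytesPerRead) {
        if (packetBytes < 0 || bytesPerRead <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (packetBytes + bytesPerRead - 1) / bytesPerRead;
    }

    // Reads needed for a whole file sent as fixed-size packets plus a
    // possibly shorter tail packet.
    static long readsPerFile(long fileBytes, long packetBytes, int bytesPerRead) {
        long fullPackets = fileBytes / packetBytes;
        long tailBytes = fileBytes % packetBytes;
        return fullPackets * readsPerPacket(packetBytes, bytesPerRead)
            + readsPerPacket(tailBytes, bytesPerRead);
    }

    public static void main(String[] args) {
        long fileBytes = 128_000_000L;  // --size from the client command
        long packetBytes = 4_000_000L;  // --bufferSize from the client command
        int bytesPerRead = 64 * 1024;   // ASSUMPTION: 64 KiB per socket read
        int numFiles = 400;             // --numFiles from the client command

        long perPacket = readsPerPacket(packetBytes, bytesPerRead);
        long perFile = readsPerFile(fileBytes, packetBytes, bytesPerRead);
        System.out.println("reads per 4MB buffer: " + perPacket);       // 62
        System.out.println("reads per file:       " + perFile);         // 1984
        System.out.println("reads per test run:   " + perFile * numFiles); // 793600
    }
}
```

If the real read size is smaller than 64 KiB, the number of decode attempts
grows proportionally, which would be consistent with the decode cost seen in
the profile.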
The test steps:
1. I use only one Ratis node, compared with one namenode and one datanode for
HDFS.
2. I did not write to disk in either streaming or HDFS, so that only network
performance is tested.
3. Code of Ratis streaming
Client and server code with the disk write commented out:
https://github.com/runzhiwang/incubator-ratis/commits/ratis-streaming
Client command: ${BIN}/client.sh filestore datastream --size 128000000
--numFiles 400 --bufferSize 4000000 --syncSize -1 --type DirectByteBuffer
--peers ${PEERS} --storage /data/ratis/n2 --storage /data1/ratis/n2 --storage
/data2/ratis/n2 --storage /data3/ratis/n2 --storage /data4/ratis/n2 --storage
/data5/ratis/n2 --storage /data6/ratis/n2 --storage /data7/ratis/n2 --storage
/data8/ratis/n2 --storage /data9/ratis/n2 --storage /data10/ratis/n2 --storage
/data11/ratis/n2
4. Code of HDFS
Server code with the disk write commented out:
https://github.com/runzhiwang/hadoop/tree/release-3.2.1
Client code: https://github.com/runzhiwang/HdfsPerformance
Client command: hadoop-3.2.1/bin/hadoop jar hdfs-performance-1.0-SNAPSHOT.jar
org.apache.HdfsPerformance 400
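For reference, the reported timings can be converted to throughput. The
sketch below assumes both clients write exactly 400 files of 128,000,000
bytes (the --size value from the Ratis command; the HDFS client's exact file
size is an assumption here) and uses 1 MB = 1,000,000 bytes. The class and
method names are hypothetical.

```java
import java.util.Locale;

final class ThroughputComparison {
    // Throughput in MB/s, with 1 MB = 1,000,000 bytes.
    static double megabytesPerSecond(long fileSizeBytes, int numFiles, double seconds) {
        return (double) fileSizeBytes * numFiles / 1_000_000.0 / seconds;
    }

    public static void main(String[] args) {
        long fileSizeBytes = 128_000_000L; // --size from the Ratis client command
        int numFiles = 400;                // number of files in both tests

        double streaming = megabytesPerSecond(fileSizeBytes, numFiles, 60.0);
        double hdfs = megabytesPerSecond(fileSizeBytes, numFiles, 48.0);

        System.out.printf(Locale.ROOT, "streaming: %.1f MB/s%n", streaming); // 853.3
        System.out.printf(Locale.ROOT, "hdfs:      %.1f MB/s%n", hdfs);      // 1066.7
        System.out.printf(Locale.ROOT, "time ratio streaming/hdfs: %.2f%n", 60.0 / 48.0); // 1.25
    }
}
```

Under these assumptions streaming takes 25% longer than HDFS for the same
data, so the roughly 15% found in the profile would explain part of the gap
but not all of it.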
> Compare the performance between HDFS and DataStreamApi
> ------------------------------------------------------
>
> Key: RATIS-1312
> URL: https://issues.apache.org/jira/browse/RATIS-1312
> Project: Ratis
> Issue Type: Sub-task
> Reporter: runzhiwang
> Priority: Major
> Attachments: hdfs.svg, streaming.svg
>
>