Hi,
We've run tests of the ext3 and xfs filesystems using different
settings. The results might be useful to others.
The datanode cluster consists of 15 slave nodes, each equipped with
1 Gbit Ethernet, an X3220 @ 2.40GHz quad-core and 4x1TB disks. The disk
read speeds vary from about 90 to 130 MB/s (tested using hdparm -t).
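For reference, the per-disk figure was obtained with something along the
lines of the command below (the device name is just an example and will
differ per node and disk):
#read speed check for a single disk
hdparm -t /dev/sdb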
Hadoop: Cloudera CDH3u0 (4 concurrent mappers / node)
OS: Linux version 2.6.18-238.5.1.el5 (mockbu...@builder10.centos.org)
(gcc version 4.1.2 20080704 (Red Hat 4.1.2-50))
#our command
for i in `seq 1 10`; do
  ./hadoop jar ../hadoop-examples-0.20.2-cdh3u0.jar randomwriter \
    -Ddfs.replication=1 /rand$i &&
  ./hadoop fs -rmr /rand$i/_logs /rand$i/_SUCCESS &&
  ./hadoop distcp -Ddfs.replication=1 /rand$i /rand-copy$i
done
Our benchmark consists of a standard random-writer job followed by a
distcp of the same data, both using a replication factor of 1. This is
to make sure only the disks get hit. Each benchmark is run several times
for every configuration. Because of the occasional hiccup, I will list
both the average and the fastest times for each configuration. I read
the execution times off the jobtracker.
The configurations, with execution times in seconds:
Configuration       Avg-writer  Min-writer  Avg-distcp  Min-distcp
ext3-default           158         136         411         343
ext3-tuned             159         132         330         297
ra1024 ext3-tuned      159         132         292         264
ra1024 xfs-tuned       128         122         220         202
To explain: ext3-tuned means the disks are mounted with the options
[noatime,nodiratime,data=writeback,rw], and ra1024 means a read-ahead
buffer of 1024 blocks. The xfs disks are created with the mkfs options
[size=128m,lazy-count=1] and mounted with [noatime,nodiratime,logbufs=8].
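For reference, the tuning was applied with commands roughly like the ones
below. Device names and mount points are only examples; the xfs mkfs
options presumably go to the log section via -l, and blockdev counts
read-ahead in 512-byte sectors:
#ext3-tuned mount
mount -o noatime,nodiratime,data=writeback,rw /dev/sdb1 /data1
#xfs creation and mount
mkfs.xfs -l size=128m,lazy-count=1 /dev/sdb1
mount -o noatime,nodiratime,logbufs=8 /dev/sdb1 /data1
#read-ahead buffer of 1024 blocks for one disk
blockdev --setra 1024 /dev/sdb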
In conclusion, it seems that using tuned xfs filesystems combined with an
increased read-ahead buffer improved our basic HDFS performance by roughly
10% (random-writer) to 40% (distcp). For example, the minimum random-writer
time went from 136 to 122 seconds (about 10%) and the minimum distcp time
from 343 to 202 seconds (about 41%).
Hopefully this is useful to someone. Although I won't be performing more
tests soon, I'd be happy to provide more details.
Ferdy.