Chris, Daniel and Mathieu,
Thanks for your constructive feedback!
On Thu, Aug 05, 2010 at 04:05:33PM +0200, Freek Dijkstra wrote:
              ZFS             BtrFS
 1 SSD     256 MiByte/s    256 MiByte/s
 2 SSDs    505 MiByte/s    504 MiByte/s
 3 SSDs    736 MiByte/s    756 MiByte/s
 4 SSDs    952 MiByte/s    916 MiByte/s
 5 SSDs   1226 MiByte/s    986 MiByte/s
 6 SSDs   1450 MiByte/s    978 MiByte/s
 8 SSDs   1653 MiByte/s    932 MiByte/s
16 SSDs   2750 MiByte/s    919 MiByte/s
[...]
The above results were for Ubuntu 10.04.1 server, with BtrFS v0.19,
Which kernels are those?
For BtrFS: Linux 2.6.32-21-server #32-Ubuntu SMP x86_64 GNU/Linux
For ZFS: FreeBSD 8.1-RELEASE (GENERIC)
(Note that we currently can't easily upgrade due to binary drivers for
the SAS+SATA controllers :(. I'd be happy to push the vendor, though, if
you think it makes a difference.)
Daniel J Blueman wrote:
Perhaps create a new filesystem and mount with 'nodatasum'
I get an improvement: 919 MiByte/s just became 1580 MiByte/s. Not as
fast as it could be, but most certainly an improvement.
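(For completeness, creating and mounting it that way is something like
the following sketch; the raid0 profiles and the /mnt/btrfs mount point
are only placeholders:

mkfs.btrfs -m raid0 -d raid0 /dev/sdd /dev/sde [...] /dev/sds
mount -o nodatasum /dev/sdd /mnt/btrfs
)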
existing extents which were previously created will be checked, so
need to start fresh.
Indeed, and it works the other way around as well. I created two test
files, one while mounted with the -o nodatasum option and one without:
write w/o nodatasum; read w/o nodatasum:  919 ±  43 MiByte/s
write w/o nodatasum; read w/  nodatasum:  922 ±  72 MiByte/s
write w/  nodatasum; read w/o nodatasum: 1082 ±  46 MiByte/s
write w/  nodatasum; read w/  nodatasum: 1586 ± 126 MiByte/s
So even if I remount the disk in the normal way, and read a file created
without checksums, I still get a small improvement :)
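(In case it is useful, the four combinations were produced with a
pattern roughly like the sketch below; the mount point and file name are
placeholders, and dd merely stands in for the actual benchmark:

# write a test file while mounted with nodatasum
mount -o remount,nodatasum /mnt/btrfs
dd if=/dev/zero of=/mnt/btrfs/test-nosum bs=1M count=32768 conv=fsync

# remount with the default (checksumming) options and read the file back
# -- the "write w/ nodatasum; read w/o nodatasum" case above
mount -o remount /mnt/btrfs
dd if=/mnt/btrfs/test-nosum of=/dev/null bs=1M
)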
(PS: the above tests were repeated 4 times, the last one even 8 times. As
you can see from the standard deviations, the results are not always very
consistent. The cause is unknown; CPU load is low.)
Chris Mason wrote:
Basically we have two different things to tune. First the block layer
and then btrfs.
And then we need to set up a fio job file that hammers on all the ssds at
once. I'd have it use aio/dio and talk directly to the drives.
[global]
size=32g
direct=1
iodepth=8
bs=20m
rw=read
[f1]
filename=/dev/sdd
[f2]
filename=/dev/sde
[f3]
filename=/dev/sdf
[...]
[f16]
filename=/dev/sds
Thanks. First, one disk:
f1: (groupid=0, jobs=1): err= 0: pid=6273
  read : io=32780MB, bw=260964KB/s, iops=12, runt=128626msec
    clat (usec): min=74940, max=80721, avg=78449.61, stdev=923.24
    bw (KB/s) : min=240469, max=269981, per=100.10%, avg=261214.77, stdev=2765.91
  cpu : usr=0.01%, sys=2.69%, ctx=1747, majf=0, minf=5153
  IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=1639/0, short=0/0
     lat (msec): 100=100.00%

Run status group 0 (all jobs):
   READ: io=32780MB, aggrb=260963KB/s, minb=267226KB/s, maxb=267226KB/s, mint=128626msec, maxt=128626msec

Disk stats (read/write):
  sdd: ios=261901/0, merge=0/0, ticks=10135270/0, in_queue=10136460, util=99.30%
So 255 MiByte/s.
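(That is the bw=260964 KB/s figure; if fio's KB here means KiB, 260964 / 1024 ≈ 255 MiByte/s.)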
Out of curiosity, what is the distinction between the reported figures
of 260964 KB/s, 261214.77 KB/s, 267226 KB/s and 260963 KB/s?
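One aside on the job file: as far as I understand fio, its default
ioengine is synchronous, so iodepth=8 only takes effect with an
asynchronous engine; judging from the "IO depths: 1=100.0%" line above,
these runs were effectively at queue depth 1. If the intention is
AIO + O_DIRECT, I assume the [global] section would also want something
like:

ioengine=libaio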
Now 16 disks (abbreviated):
~/fio# ./fio ssd.fio
Starting 16 processes
f1: (groupid=0, jobs=1): err= 0: pid=4756
  read : io=32780MB, bw=212987KB/s, iops=10, runt=157600msec
    clat (msec): min=75, max=138, avg=96.15, stdev= 4.47
     lat (msec): min=75, max=138, avg=96.15, stdev= 4.47
    bw (KB/s) : min=153121, max=268968, per=6.31%, avg=213181.15, stdev=9052.26
  cpu : usr=0.00%, sys=1.71%, ctx=2737, majf=0, minf=5153
  IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=1639/0, short=0/0
     lat (msec): 100=97.99%, 250=2.01%
[..similar for f2 to f16..]
f1: read : io=32780MB, bw=212987KB/s, iops=10, runt=157600msec
    bw (KB/s) : min=153121, max=268968, per=6.31%, avg=213181.15, stdev=9052.26
f2: read : io=32780MB, bw=213873KB/s, iops=10, runt=156947msec
    bw (KB/s) : min=151143, max=251508, per=6.33%, avg=213987.34, stdev=8958.86
f3: read : io=32780MB, bw=214613KB/s, iops=10, runt=156406msec
    bw (KB/s) : min=149216, max=219037, per=6.35%, avg=214779.89, stdev=9332.99
f4: read : io=32780MB, bw=214388KB/s, iops=10, runt=156570msec
    bw (KB/s) : min=148675, max=226298, per=6.35%, avg=214576.51, stdev=8985.03
f5: read : io=32780MB, bw=213848KB/s, iops=10, runt=156965msec
    bw (KB/s) : min=144479, max=241414, per=6.33%, avg=213935.81, stdev=10023.68
f6: read : io=32780MB, bw=213514KB/s, iops=10,