On 2012-05-20 12:05, Christoph Bartoschek wrote:
Hi,
we have a two-node setup with DRBD below LVM and an ext4 filesystem that is
shared via NFS. The system shows low performance and lots of timeouts,
resulting in unnecessary failovers from Pacemaker.
The connection between both nodes is capable of 1 GByte/s as shown by iperf.
The network between the clients and the nodes is capable of 110 MByte/s. The
RAID can be written at 450 MByte/s.
Thus I would expect a write performance of about 100 MByte/s, but dd
gives me only 20 MByte/s.
dd if=/dev/zero of=bigfile.10G bs=8192 count=1310720
1310720+0 records in
1310720+0 records out
10737418240 bytes (11 GB) copied, 498.26 s, 21.5 MB/s
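One thing worth checking before comparing numbers: without a sync option, dd
can report the client page-cache rate rather than what actually reached the
server. A minimal re-test sketch (the small size and temp file are just for
illustration):

```shell
# conv=fdatasync forces a flush before dd prints the rate, so the
# number includes the NFS commit / disk write, not just the page cache.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=16 conv=fdatasync
rm -f "$f"
```

For a sustained 10 GB write the cache effect mostly averages out, but for
shorter runs the difference can be large.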
To give you some numbers for comparison:
I've got a small XFS file system which I'm currently testing.
Using a single thread and NFSv4 only:
my configuration:
nfsserver:
# exportfs -v
/data/export
192.168.100.0/24(rw,wdelay,no_root_squash,no_subtree_check,fsid=1000)
nfsclient mount:
192.168.100.200:/data/export on /mnt type nfs
(rw,nosuid,nodev,nodiratime,relatime,vers=4,addr=192.168.100.200,clientaddr=192.168.100.107)
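For anyone reproducing this, the options the client actually negotiated
(rsize/wsize, vers, proto) matter more than what was passed to mount; a small
wsize or a "sync" mount can cap write throughput on its own. A quick way to
check:

```shell
# Show negotiated NFS mount options straight from the kernel's view.
grep ' nfs' /proc/mounts || echo "no NFS mounts on this host"
```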
Via the network (1 Gbit connection shared by both DRBD sync and NFS):
# dd if=/dev/zero of=bigfile.10G bs=6192 count=1310720
1310720+0 records in
1310720+0 records out
8115978240 bytes (8.1 GB) copied, 140.279 s, 57.9 MB/s
On the same machine, so that the 1 Gbit link is used by DRBD only:
# dd if=/dev/zero of=bigfile.10G bs=6192 count=1310720
1310720+0 records in
1310720+0 records out
8115978240 bytes (8.1 GB) copied, 70.9297 s, 114 MB/s
Maybe these numbers and this configuration help?
Cheers,
Raoul
While the slow dd runs, there are timeouts on the server, resulting in
restarts of some resources. In the log file I also see:
[329014.592452] INFO: task nfsd:2252 blocked for more than 120 seconds.
[329014.592820] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[329014.593273] nfsd D 0000000000000007 0 2252 2 0x00000000
...
Does anyone have an idea what could cause such problems? I am out of ideas
for further analysis.
I haven't seen such an issue during my current tests.
Is ext4 unsuitable for such a setup? Or is the Linux NFSv3 implementation
broken? Are the buffers so large that one has to wait too long for a
flush?
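On the "buffers too large" question: the kernel's dirty-page writeback
thresholds can be inspected directly. A sketch (the lowered values in the
comments are illustrative assumptions, not recommendations):

```shell
# How much dirty data the kernel may buffer before forcing writeback;
# large percentages on a big-RAM box mean long, blocking flushes that
# can easily exceed the 120 s hung-task timeout seen above.
cat /proc/sys/vm/dirty_ratio /proc/sys/vm/dirty_background_ratio
# Hypothetical lower values to flush earlier and in smaller bursts
# (needs root; starting points only):
#   echo 10 > /proc/sys/vm/dirty_ratio
#   echo 2  > /proc/sys/vm/dirty_background_ratio
```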
Maybe I'll have the time to switch from XFS to ext4 and retest
during the next couple of days, but I cannot guarantee anything.
Maybe you could try switching to XFS instead?
Cheers,
Raoul
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bha...@ipax.at
Technischer Leiter
IPAX - Aloy Bhatia Hava OG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. off...@ipax.at
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org