On 2012-05-20 12:05, Christoph Bartoschek wrote:
Hi,

we have a two-node setup with DRBD below LVM and an ext4 filesystem that is
shared via NFS. The system shows low performance and lots of timeouts,
resulting in unnecessary failovers from Pacemaker.

The connection between the two nodes can sustain 1 GByte/s, as shown by iperf.
The network between the clients and the nodes is capable of 110 MByte/s. The
RAID can be written at 450 MByte/s.

Thus I would expect a write performance of about 100 MByte/s, but dd
gives me only 20 MByte/s.

dd if=/dev/zero of=bigfile.10G bs=8192  count=1310720
1310720+0 records in
1310720+0 records out
10737418240 bytes (11 GB) copied, 498.26 s, 21.5 MB/s

To give you some numbers to compare with:

I've got a small XFS filesystem which I'm currently testing with,
using a single thread and NFSv4 only.

My configuration:
NFS server:
# exportfs -v
/data/export 192.168.100.0/24(rw,wdelay,no_root_squash,no_subtree_check,fsid=1000)


NFS client mount:
192.168.100.200:/data/export on /mnt type nfs (rw,nosuid,nodev,nodiratime,relatime,vers=4,addr=192.168.100.200,clientaddr=192.168.100.107)
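For reference, a hypothetical /etc/exports entry that would roughly match the
exportfs -v output above (wdelay is a default, and sync/async is not shown
there, so the server default applies):

  /data/export  192.168.100.0/24(rw,no_root_squash,no_subtree_check,fsid=1000)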

Via the network (1 Gbit connection shared by both DRBD sync and NFS):
  # dd if=/dev/zero of=bigfile.10G bs=6192  count=1310720
  1310720+0 records in
  1310720+0 records out
  8115978240 bytes (8.1 GB) copied, 140.279 s, 57.9 MB/s

On the same machine, so that the 1 Gbit link is used by DRBD only:
  # dd if=/dev/zero of=bigfile.10G bs=6192  count=1310720
  1310720+0 records in
  1310720+0 records out
  8115978240 bytes (8.1 GB) copied, 70.9297 s, 114 MB/s
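One caveat with these numbers (mine included): without forcing a flush, dd
partly measures the page cache rather than the disk or the network. To compare
apples to apples, a variant along these lines (block size and count are only an
example) might be worth trying on both sides:

  # ask dd to fdatasync() the file before reporting the rate
  dd if=/dev/zero of=bigfile.10G bs=1M count=10240 conv=fdatasync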

Maybe these numbers and this configuration help?

Cheers,
Raoul

While the slow dd runs, there are timeouts on the server, resulting in a
restart of some resources. In the logfile I also see:

[329014.592452] INFO: task nfsd:2252 blocked for more than 120 seconds.
[329014.592820] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[329014.593273] nfsd            D 0000000000000007     0  2252      2 0x00000000
...
Does anyone have an idea what could cause such problems? I have no idea how to
analyse this further.

I haven't seen such an issue during my current tests.
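If it happens again, it might be worth looking at the knob the kernel message
mentions and at how much dirty data is queued for writeback while nfsd blocks.
Just a rough first step, nothing conclusive:

  # timeout referenced in the hung-task warning above
  cat /proc/sys/kernel/hung_task_timeout_secs
  # amount of dirty data waiting to be written back
  grep -E 'Dirty|Writeback' /proc/meminfo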

Is ext4 unsuitable for such a setup? Or is the Linux NFSv3 implementation
broken? Are the buffers too large, such that one has to wait too long for a
flush?
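Regarding the buffer question: you could check (and, as an experiment, lower)
the kernel's writeback thresholds. The values below are only illustrative, not
a recommendation:

  # current thresholds
  sysctl vm.dirty_background_ratio vm.dirty_ratio
  # example: start background writeback earlier and cap dirty data sooner
  sysctl -w vm.dirty_background_ratio=5
  sysctl -w vm.dirty_ratio=10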

Maybe I'll have the time to switch from XFS to ext4 and retest
during the next couple of days, but I cannot guarantee anything.

Maybe you could try switching to XFS instead?
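If you want to give that a try, something along these lines would do it. The
device and mount point names are made up, and mkfs destroys the existing data,
so only on a scratch volume:

  # hypothetical LVM volume sitting on top of DRBD
  mkfs.xfs /dev/vg_data/lv_export
  mount -o noatime /dev/vg_data/lv_export /srv/export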

Cheers,
Raoul
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
1190 Wien                           tel.               +43 1 3670030
FN 277995t HG Wien                  fax.            +43 1 3670030 15
____________________________________________________________________


