Hello,

Using OCFS2 on top of a cLVM-mirrored LV is an absolute no-go for SLES11 SP2:

First, while the LV ("only" 300GB) is being mirrored, access to any of the 
involved devices is very slow. The limit is not the I/O itself but something 
else; I suspect "communication". Here is what top shows:
  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+   P COMMAND
14883 root      RT   0  535m  11m 7808 R     42  0.0  34:35.24  0 corosync
16493 root      20   0 33804 2168 1752 R     14  0.0  10:32.29  0 cmirrord

Shouldn't a busy mirroring job be in "D" state (waiting for I/O) rather than in 
"R" state, burning CPU? It seems [kmirrord] is blocked by cmirrord. Otherwise 
the task that really does the mirroring (as opposed to talking about mirroring) 
should use the available bandwidth. Before you ask: Wait% was not >80%; it was 
near 30%.
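
For anyone who wants to check the "R" versus "D" observation over a longer 
period than a single top snapshot, here is a minimal sketch that samples the 
state field of /proc/<pid>/stat. It assumes a Linux /proc filesystem; the 
function names, sample count and interval are my own choices for illustration, 
and the PID in the comment is simply the cmirrord PID from the top output above.

```python
import time
from collections import Counter


def parse_state(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ")", so split at the LAST ")" and take the next field.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return rest[0]


def sample_states(pid, samples=50, interval=0.1):
    """Sample the state of `pid` repeatedly and count how often each occurs."""
    counts = Counter()
    for _ in range(samples):
        with open(f"/proc/{pid}/stat") as f:
            counts[parse_state(f.read())] += 1
        time.sleep(interval)
    return counts


# Example (not run here): sample_states(16493) for the cmirrord process.
# Mostly "R" samples would mean it is running or runnable, not waiting for I/O.
```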

Here's a comparison, made in a Xen VM that uses an OCFS2 image on cLVM, 
loop-mounted as /dev/xvdb, and an MD-RAID. Both use the same type of disks in 
the same storage systems:

v05:~ # hdparm -t /dev/xvdb

/dev/xvdb:
 Timing buffered disk reads:  372 MB in  3.00 seconds = 123.91 MB/sec
v05:~ # hdparm -t /dev/xvdb

/dev/xvdb:
 Timing buffered disk reads:  712 MB in  3.01 seconds = 236.68 MB/sec
v05:~ # hdparm -t /dev/md0

/dev/md0:
 Timing buffered disk reads:  1892 MB in  3.00 seconds = 630.32 MB/sec
v05:~ # hdparm -t /dev/md0

/dev/md0:
 Timing buffered disk reads:  1948 MB in  3.00 seconds = 649.19 MB/sec

(I ran each test twice because of buffering effects.)
Of course the tests were made after the mirroring had finished.
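
To put a number on the difference, here is a small sketch that parses the four 
"Timing buffered disk reads" lines quoted above and compares the average 
throughput per device. The regular expression and the function names are my 
own; only the result lines themselves come from the runs above.

```python
import re
from statistics import mean

# hdparm -t result lines, copied verbatim from the runs above.
RESULTS = {
    "/dev/xvdb": [
        "Timing buffered disk reads:  372 MB in  3.00 seconds = 123.91 MB/sec",
        "Timing buffered disk reads:  712 MB in  3.01 seconds = 236.68 MB/sec",
    ],
    "/dev/md0": [
        "Timing buffered disk reads:  1892 MB in  3.00 seconds = 630.32 MB/sec",
        "Timing buffered disk reads:  1948 MB in  3.00 seconds = 649.19 MB/sec",
    ],
}

RATE = re.compile(r"=\s*([0-9.]+)\s*MB/sec")


def throughput(line):
    """Extract the MB/sec figure from one hdparm result line."""
    match = RATE.search(line)
    if match is None:
        raise ValueError(f"no throughput found in: {line!r}")
    return float(match.group(1))


def summarize(results):
    """Return the mean throughput in MB/sec for each device."""
    return {dev: mean(throughput(l) for l in lines)
            for dev, lines in results.items()}


averages = summarize(RESULTS)
ratio = averages["/dev/md0"] / averages["/dev/xvdb"]
print(averages)
print(f"md0 is about {ratio:.1f} times as fast as xvdb")
```

Even if you only take the better of the two xvdb runs, md0 remains well over 
twice as fast.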

And before someone suspects the latter results are because of local buffering, 
here are the stats with local buffering:
rksapv05:~ # hdparm -T /dev/md0

/dev/md0:
 Timing cached reads:   18106 MB in  1.99 seconds = 9095.48 MB/sec
v05:~ # hdparm -T /dev/md0

/dev/md0:
 Timing cached reads:   18262 MB in  1.99 seconds = 9174.92 MB/sec

So besides the inefficient I/O with cLVM, there are more issues:
1) LVM should load-balance between the mirror legs
2) LVM should use a leg-internal bitmap to resynchronize the mirror in a 
non-stupid way
3) LVM should mirror the more recent mirror leg to the outdated mirror leg, not 
use a fixed direction.
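
To illustrate what I mean by points 2) and 3), here is a toy model of a 
two-legged mirror with a dirty-region bitmap. It is only a sketch of the 
general idea, not how LVM or MD actually implement it; the class, the region 
size and the block layout are all invented for illustration.

```python
REGION_SIZE = 4  # blocks per bitmap region; an arbitrary value for this model


class Mirror:
    """Toy two-legged mirror that remembers which regions went out of sync."""

    def __init__(self, blocks):
        self.legs = [[0] * blocks, [0] * blocks]
        self.dirty = set()  # region numbers where the legs may differ

    def write(self, block, value, failed_leg=None):
        """Write to all reachable legs; note the region if a leg was missed."""
        if failed_leg is not None:
            self.dirty.add(block // REGION_SIZE)
        for index, leg in enumerate(self.legs):
            if index != failed_leg:
                leg[block] = value

    def resync(self, source, target):
        """Copy only the dirty regions from the recent leg to the outdated one."""
        copied = 0
        for region in sorted(self.dirty):
            start = region * REGION_SIZE
            end = min(start + REGION_SIZE, len(self.legs[source]))
            self.legs[target][start:end] = self.legs[source][start:end]
            copied += end - start
        self.dirty.clear()
        return copied


mirror = Mirror(16)
mirror.write(5, 7, failed_leg=1)   # leg 1 misses this write
mirror.write(1, 3)                 # both legs receive this one
copied = mirror.resync(source=0, target=1)
print(f"resynchronized {copied} of {len(mirror.legs[0])} blocks")
```

The point is that only the region touched while one leg was away gets copied, 
and the caller chooses the direction from the up-to-date leg to the stale one.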

So my advice is: Don't use it (for SLES11 SP2). I'm somewhat displeased about 
the situation, because I had a support request asking exactly whether this 
setup is a supported configuration, and it was confirmed. Now the first 
proposal regarding the terrible performance was to _try_ SLES11 SP3 beta...

Oh my!

Ulrich


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
