Hi all,
I've recently upgraded to DRBD 8.4.3 (protocol C) on CentOS 6.4 (kernel
3.10.10) with Xen 4.3.0, on hardware RAID10 with a 20 Gbit/s InfiniBand
replication link.
For a few days now, we've been experiencing a very strange issue:
seemingly at random, the system becomes almost unresponsive. iowait goes
to 100% on some (but not all) domUs and on dom0, and even the domUs
whose load remains stable become incredibly sluggish. The problem occurs
even when the resources are in StandAlone mode.
Sometimes it self-corrects, but it's becoming more severe and is now
less likely to go away without a reboot. Earlier today, the primary
node was at a load of 0.02, while the secondary (which was doing
nothing other than receiving updates from the primary, with no domUs
running) climbed to a load of 13 and was pretty much dead.
I've tried a variety of tuning options, including enabling
disable_sendpage, but nothing is making it any better. Nothing is
printed to the logs.
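Since the logs show nothing, the iowait figure itself is the main
evidence. For anyone wanting to compare numbers, it can be sampled
roughly as below. This is only a minimal sketch: it assumes the standard
Linux /proc/stat field order (user, nice, system, idle, iowait, ...),
and the function names are made up for illustration.

```python
import time


def parse_cpu_line(line):
    """Return (iowait, total) in jiffies from the aggregate 'cpu' line."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in fields[1:]]
    # Assumed order: user nice system idle iowait irq softirq steal [guest ...]
    # Guest time is already counted in user/nice, so only sum the first eight.
    return values[4], sum(values[:8])


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two (iowait, total) samples."""
    delta_iowait = after[0] - before[0]
    delta_total = after[1] - before[1]
    if delta_total <= 0:
        return 0.0
    return 100.0 * delta_iowait / delta_total


def read_sample(path="/proc/stat"):
    """Read one (iowait, total) sample from the first line of /proc/stat."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def monitor(interval=5.0, count=12):
    """Print a timestamped iowait percentage every `interval` seconds."""
    previous = read_sample()
    for _ in range(count):
        time.sleep(interval)
        current = read_sample()
        stamp = time.strftime("%H:%M:%S")
        print("%s iowait=%.1f%%" % (stamp, iowait_percent(previous, current)))
        previous = current


if __name__ == "__main__":
    monitor()
```

Running this on both nodes at the same time should show whether the
stall starts on the primary or the secondary first.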
My next thought is to try downgrading to DRBD 8.3, but considering that
support for 8.3 ends in December, I'd much prefer to continue using 8.4.
I'm very much hoping that someone more experienced than myself will be
able to offer some words of wisdom. :)
Thanks
Regards,
Stephen Marsh