Hi! I found an obscure problem having to do with LVM, multipathing, and hot-plugged disks:
I have written some RAs that support hot-plugging of SAN disks via NPIV (N_Port ID Virtualization), including addition and removal of multipath maps. On top of that sit LVM and filesystems. So far, so good.

However, I discovered a problem when multiple resources are shut down in parallel: the LVM commands (like vgdisplay) access all disks that are around, not just the disks that matter. This can lead to a race condition where one resource group stops an LVM monitor, then shuts down the corresponding multipath map, and finally the NPIV device (SCSI unplug). Unfortunately, during that sequence another LVM command may access the disks that are cleared for removal. I don't know what exactly happened, but the result was that several vgdisplay commands hung (unkillable even with kill -9), multipath commands hung (device busy through LVM?), and the device could not be removed. It seems some rather global lock is involved that makes more and more commands hang. In the end the machine needed a hard reset to recover.

I know the HP-UX implementation of LVM, where the kernel knows which disks belong to which volume group once the VG is active. There, any command related to a specific VG only has to access the disks that are actually PVs of that VG (if at all). In the Linux implementation, LVM commands seem to scan every block device every time, causing odd effects.

In another test I did, the wall-clock time for "lvs" increased by a factor close to 1000 when there was significant load on a few disks. I'm talking about execution times of more than 60 seconds for one single "lvs". Naturally this causes a monitor failure for several resources, but the stop operation would also time out, causing a (needless) node fence. The real reason for that huge increase in execution time is still under investigation, but it is absolutely repeatable on a machine with many (>50) block devices, lots of RAM (>100G), huge filesystems (>1TB), and several huge files (>100GB) that are being copied.
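One way to keep LVM commands from scanning every block device is the filter setting in /etc/lvm/lvm.conf. A sketch of such a fragment; the accept pattern assumes all PVs sit on multipath maps and would have to match the local naming scheme:

```
# /etc/lvm/lvm.conf (fragment)
devices {
    # Accept only multipath maps, reject everything else.
    # The pattern "mpath.*" is an example and must be adapted.
    filter = [ "a|/dev/mapper/mpath.*|", "r|.*|" ]
}
```

This does not remove the race entirely, but it should at least stop LVM from touching unrelated (and possibly half-removed) devices during scans.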
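For the race during parallel shutdown, one workaround would be to serialize the teardown steps under a host-wide lock, so that no LVM command runs while a device is being unplugged. A minimal sketch using flock(1); the VG name, map name, and lock path are hypothetical, and the DRY_RUN switch exists only to demonstrate the ordering without real devices:

```shell
#!/bin/sh
# Sketch: serialize VG deactivation, multipath flush, and SCSI unplug
# under one host-wide lock so concurrent LVM scans cannot race the removal.

# run: execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$@"; else "$@"; fi
}

# teardown VG MPATH: deactivate the VG, flush the map, then unplug paths.
teardown() {
    (
        flock -x 9                  # wait for the exclusive lock
        run vgchange -a n "$1"      # 1. deactivate the VG
        run multipath -f "$2"       # 2. flush the multipath map
        # 3. delete each SCSI path backing the map, e.g.:
        #    echo 1 > /sys/block/sdX/device/delete
    ) 9>"${LOCKFILE:-/var/lock/san-teardown.lock}"
}

# Demo with DRY_RUN so the ordering is visible without real devices:
DRY_RUN=1
LOCKFILE=/tmp/san-teardown.lock
teardown vg_example mpath_example
# prints:
#   vgchange -a n vg_example
#   multipath -f mpath_example
```

If every RA took this lock around its monitor and stop actions, the "other LVM command accesses disks cleared for removal" window should close, at the price of serializing the shutdowns.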
Regards,
Ulrich
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
