On Tue, Sep 10, 2013 at 04:39:56PM +0000, Prantis, Kelsey wrote:
> We have a cluster of 7 KVM vms on a host. The host OS is Fedora 18, and the
> guest OS is Centos 6.4. Installed kvm/qemu/kernel packages are as follows:
>
> qemu-system-x86-1.2.2-11.fc18.x86_64
> qemu-common-1.2.2-11.fc18.x86_64
> qemu-img-1.2.2-11.fc18.x86_64
> libvirt-daemon-driver-qemu-0.10.2.5-1.fc18.x86_64
> qemu-kvm-1.2.2-11.fc18.x86_64
> ipxe-roms-qemu-20120328-2.gitaac9718.fc18.noarch
> kernel-3.9.4-200.fc18.x86_64
>
> To 4 of the vms we have attached the same 5 lvs to be used as shared storage,
> with definitions like the below (disk1-disk5):
>
> <disk type='block' device='disk'>
>   <driver name='qemu' type='raw' />
>   <source dev='/dev/vg_00/disk1'/>
>   <target dev='sda' bus='scsi'/>
>   <shareable/>
>   <serial>disk1</serial>
>   <alias name='scsi0-0-0'/>
>   <address type='drive' controller='0' bus='0' target='0' unit='0'/>
> </disk>
>
> Throughout the course of our automated test suite, our tests format the
> device with an ext4 file system and then immediately mount the file system to
> write a few files after the format completes. Most of the time this works
> great. However, some small percentage of the time it is failing on the mount
> command with "No such device".
>
> Unable to mount /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_disk1: No such device
>
>
> We know that the device does in fact exist and was operable, since the mkfs
> command just had completed successfully and without error, so I am not sure
> why suddenly it is returning "No such device" when trying to mount, and only
> a small percentage of the time. To prove that the device is in fact there,
> we've tried putting the mount into a retry-loop as a debug measure to show
> the device is eventually there, and without fail in one of the loop
> iterations the mount does complete successfully. It seems like there could
> possibly be some sort of race between closing the device after the mkfs and
> quickly opening it again for the mount?
>
> We've reproduced this both with directly attached devices, as above, as well
> as with iscsi devices.

This is weird because the symlinks in /dev/disk/by-*/ just point back to
../../sd*. The "No such device" error message implies the device node
exists on the file system but the kernel thinks a device for that
major/minor number is not present.

I wonder if the output of "udevadm monitor" during the mkfs and mount
steps shows devices appearing/disappearing? That might explain a race
condition.

Can you share your script that runs mkfs and mounts the file system? At
which point in the boot process does your script run?

Stefan
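
For reference, the retry-loop debugging idea from the report could be sketched roughly as below. This is only an illustration, not the reporter's actual script: the function names (resolve_device, retry_on_transient, probe_device) are hypothetical, and treating both ENODEV and ENXIO as transient is an assumption. The point of the sketch is to record which real node the by-id symlink resolves to and how many failures occur before the device becomes usable, which could then be compared against the "udevadm monitor" output.

```python
import errno
import os
import time

# Assumption: these are the errnos worth retrying. ENODEV is "No such device";
# ENXIO is what open(2) commonly reports for a node with no backing device.
TRANSIENT_ERRNOS = (errno.ENODEV, errno.ENXIO)


def resolve_device(path):
    """Return the real node (e.g. /dev/sda) behind a /dev/disk/by-*/ symlink."""
    return os.path.realpath(path)


def retry_on_transient(action, attempts=5, delay=0.1, sleep=time.sleep):
    """Call action() until it stops raising a transient "no device" error.

    Returns (result, failures), where failures counts the transient errors
    seen before success. Any other OSError is re-raised immediately, and the
    last transient error is re-raised once `attempts` failures have occurred.
    """
    failures = 0
    while True:
        try:
            return action(), failures
        except OSError as exc:
            if exc.errno not in TRANSIENT_ERRNOS:
                raise
            failures += 1
            if failures >= attempts:
                raise
            sleep(delay)


def probe_device(path):
    """Hypothetical probe: open the resolved device read-only, then close it."""
    real = resolve_device(path)
    fd = os.open(real, os.O_RDONLY)
    os.close(fd)
    return real
```

A caller might run something like retry_on_transient(lambda: probe_device("/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_disk1")) between the mkfs and mount steps and log the returned failure count. A non-zero count would be consistent with the device briefly disappearing, though it would not by itself identify the cause.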
