GitHub user bkrosnov added a comment to the discussion: Shareable RAW disk
Without something like SCSI reservations or storage fencing integrated with the in-VM cluster, data corruption will often occur when you disconnect/reconnect one of the hosts from the storage network. There is nothing the VM can do to prevent it, because the storage stack in the host has operations in flight, which it holds and then submits when it can. The write operations in flight might be waiting not only in the block layers of the host; they might even be stuck in the TCP buffers of the iSCSI session, for example, just waiting for an opportunity to be retransmitted tens of seconds later. Disabling caching doesn't fix it. You can't have block storage without a queue (operations which have been requested by the VM, but are not complete yet).

And as @krokodilerian said, there are a bunch of ways in which you can make useful clusters in VMs, but just giving them a shared disk which can delay some operations by tens of seconds is not a way to enable clusters in VMs. Even active-standby clusters are risky in such an environment.

I had a quick look at OpenStack and oVirt. Both completely gloss over this issue and leave it to the user to figure it out. This is bad. If we do the same in CloudStack, users wouldn't understand the limitations. They'd shoot themselves in the foot. They'd blame CloudStack or the storage. On the other hand, VMware supports SCSI persistent reservations for shared virtual disks, which is good.

To get similarly good and usable shared disks in CloudStack/KVM, we'd need much more than just permitting two simultaneous attachments. For example, we'd need SCSI PR emulation for the raw file on NFS scenario. For iSCSI and FC we'd need virtio-scsi and scsi-block (passthrough) devices, maybe qemu-pr-helper, plus additional checks that the host and multipath are configured correctly, etc.

If this is implemented, we need a way for a storage plugin to signal whether it supports it or not. The StorPool driver will opt out of the shared disk feature.
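To make the in-flight write problem concrete, here is a toy simulation in Python. It is only a sketch: `SharedDisk`, `Host` and `failover_scenario` are hypothetical names invented for this illustration, and the `reserve()` method is a heavily simplified stand-in for a SCSI persistent reservation, not a model of the real protocol. It shows a write that is stuck in a host-side queue landing on the disk after the cluster has already failed over, and how a check on the storage side rejects it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedDisk:
    """Toy block device attached to two hosts (hypothetical model)."""
    blocks: dict = field(default_factory=dict)
    # None means no fencing at all; otherwise only this host may write.
    reservation_holder: Optional[str] = None

    def reserve(self, host: str) -> None:
        # Simplified stand-in for a persistent reservation:
        # the new owner preempts the previous one.
        self.reservation_holder = host

    def write(self, host: str, lba: int, data: str) -> bool:
        if self.reservation_holder is not None and host != self.reservation_holder:
            return False  # rejected at the storage side
        self.blocks[lba] = data
        return True


@dataclass
class Host:
    """Hypervisor host with a queue of writes requested but not yet completed."""
    name: str
    disk: SharedDisk
    connected: bool = True
    queue: list = field(default_factory=list)

    def submit(self, lba: int, data: str) -> None:
        self.queue.append((lba, data))
        self.flush()

    def flush(self) -> list:
        if not self.connected:
            return []  # writes stay queued while the storage network is down
        results = [self.disk.write(self.name, lba, data) for lba, data in self.queue]
        self.queue.clear()
        return results

    def reconnect(self) -> list:
        self.connected = True
        return self.flush()  # held writes are submitted as soon as possible


def failover_scenario(use_reservations: bool) -> str:
    disk = SharedDisk()
    a, b = Host("host-a", disk), Host("host-b", disk)
    if use_reservations:
        disk.reserve("host-a")
    a.submit(0, "a-v1")
    a.connected = False        # host-a loses the storage network
    a.submit(0, "a-stale")     # this write is now stuck in host-a's queue
    if use_reservations:
        disk.reserve("host-b")  # the in-VM cluster fails over and fences host-a
    b.submit(0, "b-new")
    a.reconnect()              # the stale write is replayed much later
    return disk.blocks[0]


if __name__ == "__main__":
    print(failover_scenario(use_reservations=False))  # a-stale: new data overwritten
    print(failover_scenario(use_reservations=True))   # b-new: stale write rejected
```

The point of the sketch is that the rejection has to happen at the storage side (or in something that emulates it), because the VM on host-a has no way to recall a write that the host has already queued.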
We currently have a different way of supporting the cluster-in-VMs use case. It is not pretty, but it gives our users a workaround.

GitHub link: https://github.com/apache/cloudstack/discussions/9976#discussioncomment-11399306

----
This is an automatically sent email for users@cloudstack.apache.org.
To unsubscribe, please send an email to: users-unsubscr...@cloudstack.apache.org