GitHub user wido added a comment to the discussion: Shareable RAW disk

> Without something like SCSI reservations or storage fencing integrated with 
> the in-VM cluster, data corruption will often occur when you 
> disconnect/reconnect one of the hosts from the storage network. There is 
> nothing the VM can do to prevent it because the storage stack in the host has 
> operations in flight, which it holds and then submits when it can. The write 
> operations in flight might be waiting not only in the block layers of the 
> host, they might be stuck even in TCP buffers of the iSCSI session for 
> example, just waiting for an opportunity to be retransmitted, 10s of seconds 
> later.
> 
> Disabling caching doesn't fix it. You can't have block storage without a 
> queue (operations which have been requested by the VM, but are not complete 
> yet). And as @krokodilerian said, there are a bunch of ways in which you can 
> make useful clusters in VMs, but just giving them a shared disk which can 
> delay some operations by 10s of seconds is not a way to enable clusters in 
> VMs. Even active-standby clusters are risky in such an environment.
> 
> I had a quick look at OpenStack and oVirt. Both completely gloss over this 
> issue and leave it to the user to figure it out. This is bad. If we do the 
> same in CloudStack, users wouldn't understand the limitations. They'd shoot 
> themselves in the foot. They'd blame CloudStack or the storage.
> 
> On the other hand, VMware supports SCSI persistent reservations for shared 
> virtual disks, which is good. To get similarly good and usable shared disks 
> in CloudStack/KVM, we'd need much more than just permitting two simultaneous 
> attachments. For example, we'd need SCSI PR emulation for the raw file on NFS 
> scenario. For iSCSI and FC we'd need virtio-scsi and scsi-block (passthrough) 
> devices, maybe qemu-pr-helper, plus additional checks that the host 
> configuration, multipath, etc. are set up correctly.
> 
> If this is implemented, we need a way for a storage plugin to signal whether 
> it supports it. The StorPool driver will opt out of the shared disk feature. 
> Currently we have a different way of supporting the cluster-in-VMs use-case, 
> which is not pretty, but a workaround exists for our users.

Great feedback!

virtio-scsi seems to support reservations, so we might only support it if 
virtio-scsi is being used. But then it depends on whether RBD or StorPool could 
support these reservations.
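
Roughly, the check I have in mind could look like the sketch below. This is only an illustration: `StoragePool`, `supports_reservations` and `can_attach_shared` are hypothetical names, not existing CloudStack code, and the flag stands in for whatever a storage plugin would report about itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePool:
    name: str
    # Hypothetical flag: a storage plugin would set this to opt in or out.
    supports_reservations: bool


def can_attach_shared(disk_bus: str, pool: StoragePool) -> bool:
    """Allow a shared attachment only if both the bus and the pool can do reservations."""
    return disk_bus == "virtio-scsi" and pool.supports_reservations


pools = [StoragePool("pool-a", True), StoragePool("pool-b", False)]
for pool in pools:
    for bus in ("virtio-scsi", "virtio-blk"):
        print(pool.name, bus, can_attach_shared(bus, pool))
```

A plugin that opts out simply reports `False`, and the second attachment is refused regardless of the bus.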

It does seem like RBD supports reservations through RBD locking. Would it be a 
possibility that this is supported?
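
To make clear what I mean by "supported": whatever the mechanism, the storage side has to reject a late write from a host that the cluster has already fenced. A toy model of that behaviour, assuming a simplified register/preempt scheme (the `ReservedDisk` class is made up for illustration and is neither the RBD nor the SCSI PR API):

```python
class ReservedDisk:
    """Toy shared disk that only accepts writes from registered keys."""

    def __init__(self):
        self.registered = set()
        self.blocks = {}

    def register(self, key):
        self.registered.add(key)

    def preempt(self, by_key, victim_key):
        # Only a registered host may remove another host's registration.
        if by_key not in self.registered:
            raise PermissionError("preempting host is not registered")
        self.registered.discard(victim_key)

    def write(self, key, lba, data):
        if key not in self.registered:
            return False  # reservation conflict: the write is rejected
        self.blocks[lba] = data
        return True


disk = ReservedDisk()
disk.register("host-a")
disk.register("host-b")

delayed = ("host-a", 0, b"stale")   # write stuck in host-a's queue
disk.preempt("host-b", "host-a")    # the in-VM cluster fences host-a
disk.write("host-b", 0, b"fresh")   # the surviving node writes
accepted = disk.write(*delayed)     # the delayed write finally arrives

print(accepted, disk.blocks[0])
```

Without the registration check in `write`, the stale data would overwrite the fresh block, which is the corruption case described in the quoted comment.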

GitHub link: 
https://github.com/apache/cloudstack/discussions/9976#discussioncomment-11406496

----
This is an automatically sent email for users@cloudstack.apache.org.
To unsubscribe, please send an email to: users-unsubscr...@cloudstack.apache.org
