On 4/21/21 12:48 AM, Klaus Wenninger wrote:
> Just to better understand the issue ...
> Does the first resource implement storage that is being used
> by the resource that is being migrated/moved?
> Or is it just the combination of 2 parallel moves that is
> overcommitting storage or network?
> Is it assured that there are no load-scenarios inside these
> resources that create the same issues as if you migrate/move
> them?
>
> Klaus

Thanks for the help, Klaus. I'll spell it out more clearly.

I'm using a resource group that sets up a failover IP address, then mounts a ZFS dataset (which exports a configuration directory over NFS), then runs a custom resource called ZFSiSCSI that exports all of the virtual machine disks as iSCSI. A rough configuration sketch follows the status listing below.

Like this:

  * Resource Group: IP-ZFS-iSCSI:
    * fence-datastore    (stonith:fence_scsi):     Started node1
    * failover-ip    (ocf::heartbeat:IPaddr):     Started node1
    * zfs-datastore    (ocf::heartbeat:ZFS):     Started node1
    * ZFSiSCSI    (ocf::heartbeat:ZFSiSCSI):     Started node1
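
For completeness, a rough crmsh sketch of how a group like this could be defined. I'm omitting the fence_scsi device; the IP and pool values are placeholders, and ZFSiSCSI is our custom agent, so its real parameters aren't shown:

    # Rough sketch only -- values are placeholders, not our real config.
    primitive failover-ip ocf:heartbeat:IPaddr \
        params ip=192.0.2.10 \
        op monitor interval=10s
    primitive zfs-datastore ocf:heartbeat:ZFS \
        params pool=datastore \
        op monitor interval=30s
    primitive ZFSiSCSI ocf:heartbeat:ZFSiSCSI \
        op monitor interval=30s
    # The group starts/stops its members in order and moves them as one unit.
    group IP-ZFS-iSCSI failover-ip zfs-datastore ZFSiSCSI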

Then I create a virtual machine with

primitive vm-testvm VirtualDomain \
    params config="/nfs/vm/testvm/testvm.xml" \
    meta allow-migrate=true \
    op monitor timeout=30 interval=10

This works fine because the ZFS storage can be mounted and exported on either node1 or node2, each of which then presents an iSCSI target for every VM, bound to the shared IP address. I can move the storage to either node; there is a brief pause in storage access, but everything keeps working because the move completes faster than the iSCSI timeout. I can also live-migrate the VM to either node, because as soon as it starts on the target node it can reach its iSCSI storage, regardless of whether that storage is local to the node or not.
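
For reference, the iSCSI timeout I'm leaning on is the initiator-side replacement timeout; with open-iscsi it lives in iscsid.conf (the 120 seconds shown here is the upstream default, not necessarily what we run):

    # /etc/iscsi/iscsid.conf (open-iscsi) -- illustrative; 120s is the default.
    # How long queued I/O is held while the target is unreachable before it
    # is failed back up to the block layer.
    node.session.timeo.replacement_timeout = 120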

The problem is the monitor action of VirtualDomain. The /usr/lib/ocf/resource.d/heartbeat/VirtualDomain script checks whether /nfs/vm/testvm/testvm.xml is readable with these lines:

        if [ ! -r $OCF_RESKEY_config ]; then
                if ocf_is_probe; then
                        ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."

That test makes the shell stat the config file. If we are in the middle of an IP-ZFS-iSCSI move, the stat fails, VirtualDomain decides the VM is dead, and it hard-resets it.
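
To make the failure mode concrete, here is a stripped-down illustration of the race. This is not the RA code, just the shape of the problem:

    # Not VirtualDomain itself -- just the shape of the problem.
    config=/nfs/vm/testvm/testvm.xml
    # "test -r" stats the path; while the NFS export is moving between nodes
    # the stat fails immediately instead of blocking and retrying.
    if [ ! -r "$config" ]; then
        echo "config unreadable while storage is moving" >&2
        exit 1   # the RA turns this into a monitor failure, so the VM gets recovered
    fi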

If I set resource stickiness to 100 it becomes a race condition: most of the time the storage layer migrates without VirtualDomain noticing. If stickiness is not set, though, moving a resource causes the cluster to rebalance, and then the VM fails every time: validating the config file is one of the first things VirtualDomain does when migrating the VM, that validation happens at the same moment as the IP-ZFS-iSCSI move, and the config file is unreachable for about 5 seconds.
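
(For reference, by stickiness I mean the cluster-wide default; with crmsh that is roughly:)

    # crmsh; 100 is the value mentioned above
    crm configure rsc_defaults resource-stickiness=100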

I'm not sure how to fix this. The nodes don't have any local storage outside the ZFS pool; otherwise I'd just create a local config directory on each node and keep them in sync with GlusterFS.

I suppose the next step is to see whether NFS has some sort of retry mode, so that stat()ing the config file blocks until a timeout instead of failing right away. That would certainly fix my issue, since that's how the iSCSI side already behaves: retry until timeout. Another option is to rework VirtualDomain, because stat()ing a config file isn't really a good test of whether the domain is working. It makes more sense to ask libvirt via a virsh call whether the domain is running, and only care about the config file when actually starting the domain.
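
On the NFS idea: if the export is currently mounted soft, a hard mount might already give the blocking-and-retrying behaviour I'm after. Something like this in fstab (server, path and mountpoint are placeholders, and hard is normally the default anyway):

    # /etc/fstab sketch -- server path and mountpoint are placeholders.
    # A "hard" mount blocks and retries I/O while the server is unreachable
    # instead of returning an error to the caller.
    nfs-server:/datastore/nfs  /nfs  nfs  hard,timeo=600,retrans=2  0  0

And on the VirtualDomain idea, something like the rough sketch below is what I have in mind for the monitor path. This is not the existing RA logic; "testvm" stands in for the real libvirt domain name and the exit codes are just the standard OCF values:

    # Rough monitor sketch: trust libvirt for liveness, only require the
    # config file when we would actually need it to (re)define the domain.
    state=$(virsh domstate testvm 2>/dev/null)
    case "$state" in
        running|paused)
            exit 0        # OCF_SUCCESS: domain is alive, config not needed
            ;;
        "shut off")
            exit 7        # OCF_NOT_RUNNING
            ;;
        *)
            # Only here does the config file matter (e.g. to redefine/start);
            # a missing file at this point is a real error.
            [ -r /nfs/vm/testvm/testvm.xml ] || exit 5   # OCF_ERR_INSTALLED
            exit 1        # OCF_ERR_GENERIC
            ;;
    esac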

Ideas welcome!!!!

Matt
