[ovirt-users] Re: oVirt and NetApp NFS storage
> So for all lease-holding VMs, push the lease to a different storage domain
> (on different storage hardware), apply the upgrade and then push the leases
> back? And that can be done whilst the VMs are running? Leases must be
> frequently renewed, so I guess no particular reason why not.

If you have the storage leases on different hardware then you do not have the
same problem I have, unless you are talking about upgrading the hardware of
the storage domain that holds the leases. In general: you can change the lease
on a running VM, but (speaking for 4.2) you have to set it to "no lease" first.

> Does that work for the SPM and hosted engine as well?

I think the hosted engine has its own kind of storage lease, but I can't say
anything about that. The SPM does not have a storage lease; it is a role held
by one of the hypervisors, not by a VM.

> I'm guessing this doesn't help with an unmanaged controller failover.
> Anecdotally, that seems to happen a bit faster for me than a managed one,
> which is also odd.

Yeah, all my ideas are just for planned maintenance.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/JMYPVO45FWLO7OEPTSIJDUN7SG4VUKAZ/
[ovirt-users] Re: oVirt and NetApp NFS storage
So for all lease-holding VMs, push the lease to a different storage domain (on
different storage hardware), apply the upgrade and then push the leases back?
And that can be done whilst the VMs are running? Leases must be frequently
renewed, so I guess no particular reason why not.

Does that work for the SPM and hosted engine as well?

I'm guessing this doesn't help with an unmanaged controller failover.
Anecdotally, that seems to happen a bit faster for me than a managed one,
which is also odd.
[ovirt-users] Re: oVirt and NetApp NFS storage
My current idea for a workaround is to disable storage leases before NetApp
upgrades, and re-enable them after the upgrades are done. The Python SDK
should make this fairly easy
(https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/set_vm_lease_storage_domain.py
and https://gerrit.ovirt.org/c/99712/).
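A sketch of that workaround in Python, modelled on the linked set_vm_lease_storage_domain.py SDK example. The engine URL and credentials are placeholders, and the assumption that updating a VM with a StorageDomainLease whose storage domain is unset clears the lease (which gerrit change 99712 concerns) has not been verified against a live engine:

```
# Sketch: strip storage leases from all VMs before planned storage
# maintenance, using ovirt-engine-sdk-python (ovirtsdk4).
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',  # placeholder
    username='admin@internal',                          # placeholder
    password='secret',                                  # placeholder
    ca_file='ca.pem',
)
try:
    vms_service = connection.system_service().vms_service()
    for vm in vms_service.list():
        if vm.lease is not None:
            # Assumption: updating with a lease whose storage_domain is
            # None removes the lease; this can be done on a running VM.
            vms_service.vm_service(vm.id).update(
                types.Vm(lease=types.StorageDomainLease(storage_domain=None))
            )
finally:
    connection.close()
```

Re-enabling afterwards would be the reverse: update each VM with a StorageDomainLease pointing at the desired storage domain, as in the SDK example.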
[ovirt-users] Re: oVirt and NetApp NFS storage
I too am suffering from this. Whilst a proper oVirt configuration item would
be ideal, the RFE suggests that there is a manual workaround for setting
sanlock's io_timeout, albeit not recommended. Could anyone share what it is?

It seems clear from the sanlock code that io_timeout is the only configurable
item, and that the renewal fail timeout is 8 x io_timeout. In that case,
increasing io_timeout from the default of 10 to 15 should cover me.

https://pagure.io/sanlock/blob/master/f/src/timeouts.h
https://pagure.io/sanlock/blob/master/f/src/timeouts.c

Searching suggests two different ways. Firstly, putting, e.g.,
SANLOCKOPTS="-o 15" in /etc/sysconfig/sanlock, though the sanlock docs do not
confirm that this should work and the search hits are all very old. Secondly,
as oVirt is using libvirt, setting io_timeout=15 in
/etc/libvirt/qemu-sanlock.conf.

Does anyone know if either of those works? Or if there's another way?

Thanks, Jon
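The arithmetic here can be sanity-checked in a couple of lines; the factor of 8 is taken from the timeouts.h link above, and the resulting 80 s / 120 s figures follow from it:

```python
# sanlock's renewal fail timeout is 8 x io_timeout (per src/timeouts.h).
def renewal_fail_timeout(io_timeout):
    """Seconds sanlock waits for a lease renewal before killing lease holders."""
    return 8 * io_timeout

print(renewal_fail_timeout(10))  # 80 - the default io_timeout of 10
print(renewal_fail_timeout(15))  # 120 - just covers NetApp's stated 120 s worst case
```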
[ovirt-users] Re: oVirt and NetApp NFS storage
So after a little wild discussion with support it boiled down to this BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1705289
which suggests it may be a good idea to have the sanlock io_timeout
configurable :)
[ovirt-users] Re: oVirt and NetApp NFS storage
Hi,

NFS hard mounts should not help with sanlock; it would just kill the VMs
after 60 seconds, the same as with a soft mount.

Regarding the sanlock timeout: I asked Red Hat support about that in a case
for my RHV environment. They said it should not be changed.

The only thing that came to my mind: disable storage leases in case of
planned storage maintenance.

Greetings
Klaas

On Thu, Apr 18, 2019, 19:10 Strahil wrote:
> I know 2 approaches.
> 1. Use the NFS hard mount option - it will never give an error to sanlock
> and it will wait until NFS is recovered (never tried this one, but in
> theory it might work)
> 2. Change the default sanlock timeout (last time I tried that - it didn't
> work). You might need help from Sandro or Sahina for that option.
>
> Best Regards,
> Strahil Nikolov
>
> On Apr 18, 2019 11:45, klaasdem...@gmail.com wrote:
> >
> > Hi,
> >
> > I got a question regarding oVirt and the support of NetApp NFS storage.
> > We have a MetroCluster for our virtual machine disks but a HA failover
> > of that (the active IP gets assigned to another node) seems to produce
> > outages too long for sanlock to handle - that affects all VMs that have
> > storage leases. NetApp says the "worst case" takeover time is 120
> > seconds. That would mean sanlock has already killed all VMs. Is anyone
> > familiar with how we could set up oVirt to allow such storage outages?
> > Do I need to use another type of storage for my oVirt VMs because that
> > NFS implementation is unsuitable for oVirt?
> >
> > Greetings
> > Klaas
[ovirt-users] Re: oVirt and NetApp NFS storage
I know 2 approaches.

1. Use the NFS hard mount option - it will never give an error to sanlock
and it will wait until NFS is recovered (never tried this one, but in theory
it might work)
2. Change the default sanlock timeout (last time I tried that - it didn't
work). You might need help from Sandro or Sahina for that option.

Best Regards,
Strahil Nikolov

On Apr 18, 2019 11:45, klaasdem...@gmail.com wrote:
>
> Hi,
>
> I got a question regarding oVirt and the support of NetApp NFS storage.
> We have a MetroCluster for our virtual machine disks but a HA failover
> of that (the active IP gets assigned to another node) seems to produce
> outages too long for sanlock to handle - that affects all VMs that have
> storage leases. NetApp says the "worst case" takeover time is 120 seconds.
> That would mean sanlock has already killed all VMs. Is anyone familiar
> with how we could set up oVirt to allow such storage outages? Do I need
> to use another type of storage for my oVirt VMs because that NFS
> implementation is unsuitable for oVirt?
>
> Greetings
> Klaas
[ovirt-users] Re: oVirt and NetApp NFS storage
Hi,

nope, no storage leases and no fencing at all (because of VLAN separation
between mgmt and the RAC). We have our own HA fencing mechanism in place,
which triggers an action over the API when an alarm in monitoring is raised.

HTH

On 18.04.19 13:12, klaasdem...@gmail.com wrote:

Hi,

are you using oVirt storage leases? You'll need them if you want to handle a
hypervisor becoming completely unresponsive (including fencing actions) in a
HA setting. Those storage leases use sanlock. If you use sanlock, a VM gets
killed if the lease is not renewable within a very short timeframe (60
seconds). That is what is killing the VMs during takeover. Before storage
leases it seems to have worked because it would simply wait long enough for
NFS to finish.

Greetings
Klaas

On 18.04.19 12:47, Ladislav Humenik wrote:

Hi, we have NetApp NFS with oVirt in production and never experienced an
outage during takeover/giveback.

- the default oVirt mount options should also handle a short NFS timeout
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys)
- but to tune it up a little you should set the disk timeout inside your
guest VMs to at least 180, and then you are safe

example:

    cat << EOF >> /etc/rc.d/rc.local
    # Increase the disk timeout value
    for i in /sys/class/scsi_generic/*/device/timeout; do echo 180 > "\$i"; done
    EOF

KR

On 18.04.19 10:45, klaasdem...@gmail.com wrote:

Hi,

I got a question regarding oVirt and the support of NetApp NFS storage. We
have a MetroCluster for our virtual machine disks but a HA failover of that
(the active IP gets assigned to another node) seems to produce outages too
long for sanlock to handle - that affects all VMs that have storage leases.
NetApp says the "worst case" takeover time is 120 seconds. That would mean
sanlock has already killed all VMs. Is anyone familiar with how we could set
up oVirt to allow such storage outages? Do I need to use another type of
storage for my oVirt VMs because that NFS implementation is unsuitable for
oVirt?

Greetings
Klaas

--
Ladislav Humenik
System administrator / VI
[ovirt-users] Re: oVirt and NetApp NFS storage
Hi,

are you using oVirt storage leases? You'll need them if you want to handle a
hypervisor becoming completely unresponsive (including fencing actions) in a
HA setting. Those storage leases use sanlock. If you use sanlock, a VM gets
killed if the lease is not renewable within a very short timeframe (60
seconds). That is what is killing the VMs during takeover. Before storage
leases it seems to have worked because it would simply wait long enough for
NFS to finish.

Greetings
Klaas

On 18.04.19 12:47, Ladislav Humenik wrote:

Hi, we have NetApp NFS with oVirt in production and never experienced an
outage during takeover/giveback.

- the default oVirt mount options should also handle a short NFS timeout
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys)
- but to tune it up a little you should set the disk timeout inside your
guest VMs to at least 180, and then you are safe

example:

    cat << EOF >> /etc/rc.d/rc.local
    # Increase the disk timeout value
    for i in /sys/class/scsi_generic/*/device/timeout; do echo 180 > "\$i"; done
    EOF

KR

On 18.04.19 10:45, klaasdem...@gmail.com wrote:

Hi,

I got a question regarding oVirt and the support of NetApp NFS storage. We
have a MetroCluster for our virtual machine disks but a HA failover of that
(the active IP gets assigned to another node) seems to produce outages too
long for sanlock to handle - that affects all VMs that have storage leases.
NetApp says the "worst case" takeover time is 120 seconds. That would mean
sanlock has already killed all VMs. Is anyone familiar with how we could set
up oVirt to allow such storage outages? Do I need to use another type of
storage for my oVirt VMs because that NFS implementation is unsuitable for
oVirt?

Greetings
Klaas

--
Ladislav Humenik
System administrator / VI
[ovirt-users] Re: oVirt and NetApp NFS storage
Hi, we have NetApp NFS with oVirt in production and never experienced an
outage during takeover/giveback.

- the default oVirt mount options should also handle a short NFS timeout
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys)
- but to tune it up a little you should set the disk timeout inside your
guest VMs to at least 180, and then you are safe

example:

    cat << EOF >> /etc/rc.d/rc.local
    # Increase the disk timeout value
    for i in /sys/class/scsi_generic/*/device/timeout; do echo 180 > "\$i"; done
    EOF

KR

On 18.04.19 10:45, klaasdem...@gmail.com wrote:

Hi,

I got a question regarding oVirt and the support of NetApp NFS storage. We
have a MetroCluster for our virtual machine disks but a HA failover of that
(the active IP gets assigned to another node) seems to produce outages too
long for sanlock to handle - that affects all VMs that have storage leases.
NetApp says the "worst case" takeover time is 120 seconds. That would mean
sanlock has already killed all VMs. Is anyone familiar with how we could set
up oVirt to allow such storage outages? Do I need to use another type of
storage for my oVirt VMs because that NFS implementation is unsuitable for
oVirt?

Greetings
Klaas

--
Ladislav Humenik
System administrator / VI