[ovirt-users] Re: oVirt and NetApp NFS storage

2019-05-10 Thread klaasdemter
> So for all lease-holding VMs, push the lease to a different storage domain
> (on different storage hardware), apply the upgrade and then push the leases
> back? And that can be done whilst the VMs are running? Leases must be
> frequently renewed, so I guess no particular reason why not.

If you have the storage leases on different hardware, then you do not have the
same problem I have, unless you are talking about upgrading the hardware of the
storage domain that holds the leases.
In general: you can change the lease on a running VM, but (speaking for 4.2)
you have to set it to no lease first.

> 
> Does that work for the SPM and hosted engine as well?

I think the hosted engine has its own kind of storage lease, but I can't say
anything about that. The SPM does not have a storage lease; it is hardware (i.e.
one of the hypervisors), not a VM.

> 
> I'm guessing this doesn't help with an unmanaged controller failover.
> Anecdotally, that seems to happen a bit faster for me than a managed one,
> which is also odd.

Yeah, all my ideas are just for planned maintenance.


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-05-07 Thread jon
So for all lease-holding VMs, push the lease to a different storage domain (on
different storage hardware), apply the upgrade and then push the leases back? 
And that can be done whilst the VMs are running? Leases must be frequently 
renewed, so I guess no particular reason why not.

Does that work for the SPM and hosted engine as well?

I'm guessing this doesn't help with an unmanaged controller failover.
Anecdotally, that seems to happen a bit faster for me than a managed one, which 
is also odd.


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-05-03 Thread klaasdemter
My current idea for a workaround is to disable storage leases before NetApp
upgrades and re-enable them after the upgrades are done. The Python SDK should
make this fairly easy
(https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/set_vm_lease_storage_domain.py
 and https://gerrit.ovirt.org/c/99712/); see the sketch below.
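Roughly, the sketch I have in mind (untested; the engine URL, credentials, VM
name and storage domain name are placeholders, and clearing the lease with an
empty StorageDomainLease is my reading of the gerrit change above, not
something I have verified):

#!/usr/bin/env python
# Rough, untested sketch: drop a VM's storage lease before storage
# maintenance and put it back afterwards. URL, credentials, VM name and
# storage domain name below are placeholders.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    ca_file='ca.pem',
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=myvm')[0]
vm_service = vms_service.vm_service(vm.id)

# Before the NetApp upgrade: clear the lease (assumption: an empty
# StorageDomainLease removes it, per the gerrit change linked above).
vm_service.update(types.Vm(lease=types.StorageDomainLease()))

# ... storage maintenance / takeover happens here ...

# Afterwards: point the lease back at the original storage domain.
sds_service = connection.system_service().storage_domains_service()
sd = sds_service.list(search='name=mydata')[0]
vm_service.update(
    types.Vm(
        lease=types.StorageDomainLease(
            storage_domain=types.StorageDomain(id=sd.id),
        ),
    ),
)

connection.close()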


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-05-03 Thread jon
I too am suffering from this. Whilst a proper oVirt configuration item would be
ideal, the RFE suggests that there is a manual workaround for setting
sanlock's io_timeout, albeit not a recommended one. Could anyone share what it is?

It seems clear from the sanlock code that io_timeout is the only configurable
item, and that the renewal fail timeout is 8 x io_timeout. In that case,
increasing io_timeout from the default of 10 seconds to 15 should cover me:
8 x 10 gives an 80-second window, while 8 x 15 gives 120 seconds, matching
NetApp's stated worst-case takeover time.

https://pagure.io/sanlock/blob/master/f/src/timeouts.h
https://pagure.io/sanlock/blob/master/f/src/timeouts.c

Searching suggests two different ways, both sketched below. Firstly, putting
e.g. SANLOCKOPTS="-o 15" in /etc/sysconfig/sanlock, though the sanlock docs do
not confirm that this works and the search hits are all very old. Secondly,
since oVirt uses libvirt, setting io_timeout=15 in /etc/libvirt/qemu-sanlock.conf.
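For clarity, this is what those two untested candidates would look like
(whether either file is actually honoured on an oVirt host is exactly the open
question, and any change would presumably need a sanlock/libvirtd restart,
which is its own risk on a host with active leases):

# Candidate 1 (unverified): sanlock daemon options, /etc/sysconfig/sanlock
SANLOCKOPTS="-o 15"

# Candidate 2 (unverified): libvirt's sanlock lock manager config,
# /etc/libvirt/qemu-sanlock.conf
io_timeout = 15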

Does anyone know if either of those work? Or if there's another way?

Thanks,

Jon


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-05-02 Thread klaasdemter
So after a little wild discussion with support, it boiled down to this BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1705289, which suggests it may be a
good idea to have the sanlock io_timeout configurable :)


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-04-23 Thread Klaas Demter
Hi,
NFS hard mounts should not help with sanlock; it will still kill the VMs
after 60 seconds, the same as with a soft mount.
As for the sanlock timeout: I asked Red Hat support about that in a case for my
RHV environment. They said it should not be changed.

The only thing that came to mind: disable storage leases in the case of planned
storage maintenance.

Greetings
Klaas

On Thu, Apr 18, 2019, 19:10 Strahil  wrote:

> I know 2 approaches.
> 1. Use the NFS hard mount option - it will never return an error to sanlock
> and will keep waiting until NFS has recovered (never tried this one, but in
> theory it might work).
> 2. Change the default sanlock timeout (last time I tried that, it didn't
> work). You might need help from Sandro or Sahina for that option.
>
> Best Regards,
> Strahil Nikolov
>
> On Apr 18, 2019 11:45, klaasdem...@gmail.com wrote:
> >
> > Hi,
> >
> > I got a question regarding oVirt and its support for NetApp NFS storage.
> > We have a MetroCluster for our virtual machine disks, but an HA failover
> > of it (the active IP gets assigned to another node) seems to produce
> > outages too long for sanlock to handle - that affects all VMs that have
> > storage leases. NetApp says the "worst case" takeover time is 120 seconds.
> > That would mean sanlock has already killed all VMs. Is anyone familiar
> > with how we could set up oVirt to tolerate such storage outages? Do I need
> > to use another type of storage for my oVirt VMs because that NFS
> > implementation is unsuitable for oVirt?
> >
> >
> > Greetings
> >
> > Klaas


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-04-18 Thread Strahil
I know 2 approaches.
1. Use the NFS hard mount option - it will never return an error to sanlock and
will keep waiting until NFS has recovered (never tried this one, but in theory
it might work).
2. Change the default sanlock timeout (last time I tried that, it didn't work).
You might need help from Sandro or Sahina for that option.

Best Regards,
Strahil Nikolov

On Apr 18, 2019 11:45, klaasdem...@gmail.com wrote:
>
> Hi, 
>
> I got a question regarding oVirt and its support for NetApp NFS storage.
> We have a MetroCluster for our virtual machine disks, but an HA failover
> of it (the active IP gets assigned to another node) seems to produce
> outages too long for sanlock to handle - that affects all VMs that have
> storage leases. NetApp says the "worst case" takeover time is 120 seconds.
> That would mean sanlock has already killed all VMs. Is anyone familiar
> with how we could set up oVirt to tolerate such storage outages? Do I need
> to use another type of storage for my oVirt VMs because that NFS
> implementation is unsuitable for oVirt?
>
>
> Greetings 
>
> Klaas 


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-04-18 Thread Ladislav Humenik
Hi, nope, no storage leases and no fencing at all (because of VLAN
separation between mgmt and the RAC).


We have our own HA fencing mechanism in place, which triggers an action
over the API when an alarm is raised in monitoring.


HTH

On 18.04.19 13:12, klaasdem...@gmail.com wrote:

Hi,
are you using oVirt storage leases? You'll need them if you want to
handle a hypervisor that is completely unresponsive (including to fencing
actions) in an HA setting. Those storage leases use sanlock. If you use
sanlock, a VM gets killed if the lease cannot be renewed within a very
short timeframe (60 seconds). That is what is killing the VMs during
takeover. Before storage leases it seems to have worked, because it
would simply wait long enough for NFS to finish.


Greetings
Klaas

On 18.04.19 12:47, Ladislav Humenik wrote:
Hi, we have NetApp NFS with oVirt in production and never experienced
an outage during takeover/giveback.
- the default oVirt mount options should also handle a short NFS
timeout
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys)
- but to tune it up a little, you should set the disk timeout inside your
guest VMs to at least 180 seconds; then you are safe


example:
cat << EOF >> /etc/rc.d/rc.local
# Increasing the timeout value
for i in /sys/class/scsi_generic/*/device/timeout; do
    echo 180 > "\$i"
done
EOF



KR

On 18.04.19 10:45, klaasdem...@gmail.com wrote:

Hi,

I got a question regarding oVirt and its support for NetApp NFS
storage. We have a MetroCluster for our virtual machine disks, but an
HA failover of it (the active IP gets assigned to another node) seems
to produce outages too long for sanlock to handle - that affects all
VMs that have storage leases. NetApp says the "worst case" takeover
time is 120 seconds. That would mean sanlock has already killed all
VMs. Is anyone familiar with how we could set up oVirt to tolerate such
storage outages? Do I need to use another type of storage for my
oVirt VMs because that NFS implementation is unsuitable for oVirt?



Greetings

Klaas

--
Ladislav Humenik

System administrator / VI




--
Ladislav Humenik

System administrator / VI


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-04-18 Thread klaasdemter

Hi,
are you using oVirt storage leases? You'll need them if you want to
handle a hypervisor that is completely unresponsive (including to fencing
actions) in an HA setting. Those storage leases use sanlock. If you use
sanlock, a VM gets killed if the lease cannot be renewed within a very short
timeframe (60 seconds). That is what is killing the VMs during takeover.
Before storage leases it seems to have worked, because it would simply
wait long enough for NFS to finish.


Greetings
Klaas

On 18.04.19 12:47, Ladislav Humenik wrote:
Hi, we have NetApp NFS with oVirt in production and never experienced
an outage during takeover/giveback.
- the default oVirt mount options should also handle a short NFS
timeout
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys)
- but to tune it up a little, you should set the disk timeout inside your
guest VMs to at least 180 seconds; then you are safe


example:
cat << EOF >> /etc/rc.d/rc.local
# Increasing the timeout value
for i in /sys/class/scsi_generic/*/device/timeout; do
    echo 180 > "\$i"
done
EOF



KR

On 18.04.19 10:45, klaasdem...@gmail.com wrote:

Hi,

I got a question regarding oVirt and its support for NetApp NFS
storage. We have a MetroCluster for our virtual machine disks, but an
HA failover of it (the active IP gets assigned to another node) seems
to produce outages too long for sanlock to handle - that affects all
VMs that have storage leases. NetApp says the "worst case" takeover
time is 120 seconds. That would mean sanlock has already killed all
VMs. Is anyone familiar with how we could set up oVirt to tolerate such
storage outages? Do I need to use another type of storage for my
oVirt VMs because that NFS implementation is unsuitable for oVirt?



Greetings

Klaas

--
Ladislav Humenik

System administrator / VI


[ovirt-users] Re: oVirt and NetApp NFS storage

2019-04-18 Thread Ladislav Humenik
Hi, we have NetApp NFS with oVirt in production and never experienced an
outage during takeover/giveback.
- the default oVirt mount options should also handle a short NFS timeout
(rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys)
- but to tune it up a little, you should set the disk timeout inside your guest
VMs to at least 180 seconds; then you are safe


example:
cat << EOF >> /etc/rc.d/rc.local
# Increasing the timeout value
for i in /sys/class/scsi_generic/*/device/timeout; do
    echo 180 > "\$i"
done
EOF



KR

On 18.04.19 10:45, klaasdem...@gmail.com wrote:

Hi,

I got a question regarding oVirt and its support for NetApp NFS
storage. We have a MetroCluster for our virtual machine disks, but an
HA failover of it (the active IP gets assigned to another node) seems to
produce outages too long for sanlock to handle - that affects all VMs
that have storage leases. NetApp says the "worst case" takeover time is
120 seconds. That would mean sanlock has already killed all VMs. Is
anyone familiar with how we could set up oVirt to tolerate such storage
outages? Do I need to use another type of storage for my oVirt VMs
because that NFS implementation is unsuitable for oVirt?



Greetings

Klaas


--
Ladislav Humenik

System administrator / VI
