Sorry for the slow reply; I was out sick at the end of last week.

Thank you Nir! You have been very helpful in getting a grasp on this issue.

I have gone ahead and opened an RFE for resuming on a Direct LUN:

https://bugzilla.redhat.com/show_bug.cgi?id=1610459

Thanks again!

Regards,

Ryan

On Tue, Jul 24, 2018 at 12:30 PM, Nir Soffer <[email protected]> wrote:

> On Tue, Jul 24, 2018 at 8:30 PM Ryan Bullock <[email protected]> wrote:
> ...
>
>>> Vdsm does monitor multipath events for all LUNs, but they are used only
>>> for reporting purposes, see:
>>> https://ovirt.org/develop/release-management/features/storage/multipath-events/
>>>
>>> We could use the events for resuming VMs using the multipath devices that
>>> became available. This functionality will be even more important in the
>>> next version since we plan to move to a LUN-per-disk model.
>>>
>>>
>>
>> I will look at doing this. At the very least I feel that
>> differences/limitations between storage back-ends/methods should be
>> documented. Just so users don't run into any surprises.
>>
>
> You can file a bug for documenting this issue.
>
> ...
>
>>>> My other question is, how can I keep my VMs with Direct LUNs from pausing
>>>> during short outages? Can I put configurations in my multipath.conf for
>>>> just the wwids of my Direct LUNs to increase the ‘no_path_retry’ to prevent
>>>> the VMs from pausing in the first place? I know in general you don’t want
>>>> to increase the ‘no_path_retry’ because it can cause timeout issues with
>>>> VDSM and SPM operations (LVM changes, etc). But in the case of a Direct LUN
>>>> would it cause any problems?
>>>>
>>>
>>> You can add a drop-in multipath configuration that will change
>>> no_path_retry for a specific device or multipath.
>>>
>>> Increasing no_path_retry will cause larger delays when vdsm tries to
>>> access the LUNs via lvm commands, but the delay should be only on
>>> the first access when a LUN is not available.
>>>
>>>
>> Would that increased delay cause any sort of issues for oVirt (e.g.
>> thinking a node is offline/unresponsive) if set globally in multipath.conf?
>> Since a Direct LUN doesn't use LVM, would this even be a consideration if
>> the increased delay was limited to the Direct LUN only?
>>
>
> Vdsm scans all LUNs to discover oVirt volumes, so it will be affected even
> by multipath configuration applied only to direct LUNs.
>
> Increasing no_path_retry for any LUN will increase the chance of delaying some
> vdsm flows accessing LUNs (e.g. updating lvm cache, scsi rescan, listing
> devices). But the delay happens once, when the multipath device loses all
> paths. The benefit is a smaller chance that a VM will pause or restart because
> of a short outage.
>
> Nir
>
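
For reference, a drop-in along the lines Nir describes might look like the
minimal sketch below. The file name, the WWID, and the retry value are
placeholders chosen for illustration, not values from this thread, and the
sketch assumes multipathd reads drop-in files from /etc/multipath/conf.d
(the usual default for config_dir). Check multipath.conf(5) on your hosts
before relying on it.

```
# /etc/multipath/conf.d/direct-lun.conf   (hypothetical file name)
multipaths {
    multipath {
        # Placeholder WWID: replace with the WWID of the Direct LUN.
        wwid           "360000000000000000000000000000000"
        # Illustrative value only: keep queueing I/O for this many
        # retries after all paths are lost, then fail the I/O.
        no_path_retry  16
    }
}
```

After adding the file, multipathd has to reload its configuration (for
example with "systemctl reload multipathd") before the new setting takes
effect. As noted above, a larger no_path_retry also lengthens the one-time
delay vdsm sees when it touches a LUN that has lost all paths, so the value
is a trade-off rather than something to raise blindly.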