> On 25 Aug 2020, at 13:02, Vojtech Juranek <[email protected]> wrote:
> 
> On čtvrtek 20. srpna 2020 14:42:15 CEST Michal Skrivanek wrote:
>>> On 20 Aug 2020, at 14:28, Nir Soffer <[email protected]> wrote:
>>> 
>>> On Thu, Aug 20, 2020 at 12:19 PM Vojtech Juranek <[email protected]>
>>> wrote:
>> 
>>>> 
>>>> Hi,
>>>> as Fedor is on PTO, I was asked to take over.
>>>> 
>>>> I don't think new functionality is needed; a more detailed list of the
>>>> proposed flows and changes is below.
>>>> TL;DR: I'd suggest to
>>>> 
>>>> - change in engine: send PDIV instead of CD path from engine to vdsm
>> 
>> 
>> the biggest issue here is always with older versions not having this
>> functionality yet….
> 
> yes, I was wrong, this requires new functionality. Therefore Nir proposed
> creating a feature page for it. We put together flows which should cover
> all the possible issues (including VM recovery). Please review and comment
> directly under
> 
> https://github.com/oVirt/ovirt-site/pull/2320
> 
> (unless you think creating a feature page for it is the wrong way to go :-)

IMHO you're making it bigger than it needs to be.
We already have the ChangeCD API; I don't see how adding two different APIs
makes it significantly better.
There is definitely a time when old unmaintainable code needs to be rewritten
and improved… but it greatly increases the "time to fix".

Either way, I still don't get "add new drivespec to VM metadata…". We're not
adding a new drive. If you mean to extend the existing disk metadata with a
new attribute, please say "extend" or something else instead.

> 
>> 
>>>> - change in vdsm: implement a counter of ISOs being used by VMs, to know
>>>> when we can deactivate the volume
>>>> - change in vdsm: remove the old drivespec from the VM XML when
>>>> changing/removing a CD (and eventually deactivate the volume)
>>>> 
>>>> Your comments are welcome.
>>>> Thanks
>>>> Vojta
>>>> 
>>>> Flows
>>>> ===
>>>> 
>>>> VM without a CD
>>>> -------------------------
>>>> 
>>>> - It should not be possible to insert any CD; this option should not be
>>>> available/active in the UI.
>> 
>>> 
>>> I don't think we have such a configuration, all VMs have an empty cdrom
>>> by default:
>>> 
>>>   <disk type='file' device='cdrom'>
>>> 
>>>     <driver name='qemu' error_policy='report'/>
>>>     <source startupPolicy='optional'/>
>>>     <target dev='sdc' bus='sata'/>
>>>     <readonly/>
>>>     <alias name='ua-d95b29f2-e8a2-4119-9ac2-5b4267e6637d'/>
>>>     <address type='drive' controller='0' bus='0' target='0' unit='2'/>
>>> 
>>>   </disk>
>>> 
>>> 
>>> But of course if we have such a configuration where adding a CD is not
>> 
>> 
>> we don't. All VMs always have at least that one implicit CD (empty or
>> loaded). We have a case where there is an additional CD for
>> cloud-init/payload; that additional CD is not "changeable" nor exposed
>> anywhere in the UI.
>> 
>>> possible, the menu/button
>>> should be disabled in the UI.
>>> 
>>> 
>>>> VM without CD, changeCD inserts new ISO
>> 
>> 
>> please update to make it really clear, there's no "VM without CD". A VM
>> always has a CD, just not loaded (empty tray),
>> so this case doesn't exist…
>> 
>> 
>>>> -----------------------------------------------------------------
>>>> 
>>>> - add new drivespec to VM metadata
>>> 
>>> 
>>> How is failure handled?
>>> 
>>> 
>>>> - prepare new drivespec
>>> 
>>> 
>>> What if vdsm is restarted at this point?
>>> 
>>> How should VM recovery handle this drivespec referring to an inactive
>>> volume?
>>> 
>>> 
>>>> - attach new device to VM
>>>> - if attaching to VM fails, tear down drivespec and remove drivespec
>>>> from VM metadata
>> 
>>> 
>>> Tearing down and removing the drivespec cannot be done atomically; any
>>> operation may fail, and
>>> vdsm may be killed at any point.
>>> 
>>> The design should suggest how vdsm recovers from all errors.
>>> 
>>> 
>>>> VM with CD, changeCD removes current ISO
>>>> -------------------------------------------------------------------
>>>> 
>>>> - tear down previous drivespec
>>>> 
>>>>   - if volume with ISO is inactive (e.g. as a result of vdsm failing
>>>>   after inserting the drivespec into VM metadata, but before activating
>>>>   the volume), continue without error
>> 
>>> 
>>> Sounds good, and may be already handled in the lvm module. You can try to
>>> deactivate an LV twice to confirm this.
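>>> 
>>> A quick way to check, assuming a test VG "vg0" with an LV "iso-lv" (using
>>> plain lvchange rather than the vdsm lvm wrapper):
>>> 
>>>   import subprocess
>>> 
>>>   def deactivate(vg, lv):
>>>       # "lvchange -an" on an already-inactive LV is a no-op that exits 0,
>>>       # which is what lets teardown continue without error
>>>       subprocess.run(["lvchange", "-an", "%s/%s" % (vg, lv)], check=True)
>>> 
>>>   deactivate("vg0", "iso-lv")
>>>   deactivate("vg0", "iso-lv")  # second call should also succeed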
>>> 
>>> 
>>>>   - if drivespec is used by another VM, don’t deactivate volume
>>> 
>>> 
>>> This is the tricky part: vdsm does not have a mechanism for tracking
>>> usage of block devices,
>>> and adding such a mechanism is much bigger work than supporting CDs on
>>> block devices, so
>>> it cannot be part of this work.
>> 
>> 
>> We do have the list of active VMs and their xml/devices, so we do know if
>> there are other users; it's not an expensive check. Locking it properly so
>> there's no in-flight request for change CD could be a bit tricky, but it
>> doesn't sound that complex.
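>> 
>> Something like this, say (illustrative only, not existing vdsm API):
>> 
>>   def iso_in_use(vms, iso_path, changing_vm_id):
>>       # scan the devices of all other running VMs for the same ISO path
>>       for vm in vms:
>>           if vm.id == changing_vm_id:
>>               continue
>>           if any(d.path == iso_path for d in vm.getDiskDevices()):
>>               return True
>>       return False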
>>> 
>>> 
>>>> - remove drivespec from VM metadata
>>> 
>>> 
>>> What if this fails, or vdsm is restarted before we try to do this?
>>> 
>>> When vdsm starts it recovers running VMs; we need to handle this case
>>> (a drivespec
>>> referencing a non-existing volume) in this flow.
>> 
>> 
>> remove what exactly? The CD device stays there, the "tray is ejected", and
>> the device's path is updated to "".
> 
>> 
>>> 
>>> 
>>>> VM with CD, changeCD inserts new ISO and removes old
>>>> -------------------------------------------------------------------------
>>> 
>>>> - tear down previous drivespec
>>>> 
>>>>   - if volume with ISO is inactive (e.g. as a result of vdsm failing
>>>>   after inserting the drivespec into VM metadata, but before activating
>>>>   the volume), continue without error
>>>>   - if the drivespec is used by another VM, don't deactivate the volume
>>>> 
>>>> - remove previous drivespec from VM metadata
>>>> - add new drivespec to VM metadata
>>>> - prepare new drivespec
>>>> - attach new device to VM
>>>> - if attaching the new drivespec fails, tear down the new drivespec and
>>>> attach back the previous drivespec (a sketch of this flow follows)
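>>>> 
>>>> A sketch of this combined flow (all names below are placeholders for the
>>>> real vdsm calls, not existing API):
>>>> 
>>>>   def change_cd(vm, old_spec, new_spec):
>>>>       teardown(old_spec)           # no-op if volume is already inactive
>>>>       vm.metadata.remove(old_spec)
>>>>       vm.metadata.add(new_spec)    # persist before preparing, for recovery
>>>>       prepare(new_spec)
>>>>       try:
>>>>           vm.attach(new_spec)
>>>>       except Exception:
>>>>           teardown(new_spec)
>>>>           vm.metadata.remove(new_spec)
>>>>           vm.attach(old_spec)      # attach back the previous drivespec
>>>>           raise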
>> 
>>> 
>>> This is too complicated. When I discussed this with Fedor, our conclusion
>>> was that we don't
>>> want to support this complexity, and it will be much easier to
>>> implement 2 APIs, one for
>>> removing a CD, and one for inserting a CD.
>>> 
>>> With separate APIs, error handling is much easier, and is similar to
>>> existing error handling
>>> in hotplugDisk and hotunplugDisk. You have a good example of how to
>>> handle all errors in these
>>> APIs.
>> 
>> 
>> there's a big difference between hotplugs and change CD in terms of the
>> devices. It's supposedly an atomic change. Though having separate actions
>> for load/eject would probably be OK. But not sure it's worth it.
> 
>> 
>>> 
>>> For example, if we have API for removing a CD:
>>> 
>>> 1. engine sends VM.removeCD
>>> 2. vdsm marks the drivespec for removal
>>> 
>>> Possible implementation:
>>> 
>>> 
>>>   <ovirt-vm:device devtype="disk" name="sdc" removing="yes">
>>>   ...
>>>   </ovirt-vm:device>
>>> 
>>> 
>>> 3. vdsm detaches the device from the VM
>> 
>> 
>> please update the flows. no device detach/attach…
>> 
>> 
>>> 4. vdsm deactivates the volume
>>> 5. vdsm removes the drivespec from metadata
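>>> 
>>> A sketch of these steps (helper names are placeholders, not existing
>>> vdsm API):
>>> 
>>>   def remove_cd(vm, drivespec):
>>>       vm.metadata.mark_for_removal(drivespec)  # step 2, persisted first
>>>       vm.detach_device(drivespec)              # step 3
>>>       deactivate_volume(drivespec)             # step 4
>>>       vm.metadata.remove(drivespec)            # step 5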
>>> 
>>> Vdsm may be killed after any step. When recovering the VM, we can have
>>> these cases:
>>> 
>>> 1. vdsm was killed after step 2: the CD is marked for removal, but the
>>>    device is attached to the VM, and the volume is active:
>>>    - change the metadata, un-marking the drivespec for removal.
>>> 
>>> 2. vdsm was killed after step 3: the drivespec is marked for removal, and
>>>    the device is not attached to the VM, but the volume is active:
>>>    - deactivate the volume
>>>    - remove the drivespec from the metadata
>>>    Note that this should not happen, since vdsm deactivates unused logical
>>>    volumes when starting, so we should really have only case 3.
>>> 
>>> 3. vdsm was killed after step 4: the drivespec is marked for removal, the
>>>    device is not attached to the VM, and the volume is not active:
>>>    - remove the drivespec from the metadata
>>> 
>>> 4. vdsm was killed after step 5: everything is good on the vdsm side.
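>>> 
>>> The recovery logic then reduces to something like (placeholder names
>>> again):
>>> 
>>>   def recover_cd(drivespec, attached, volume_active):
>>>       if not drivespec.marked_for_removal:
>>>           return
>>>       if attached:               # killed after step 2
>>>           drivespec.marked_for_removal = False
>>>       elif volume_active:        # killed after step 3, should be rare
>>>           deactivate_volume(drivespec)
>>>           remove_from_metadata(drivespec)
>>>       else:                      # killed after step 4
>>>           remove_from_metadata(drivespec)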
>>> 
>>> Teardown should not fail in the storage layer if the volume is still used.
>>> In the current code we cannot
>>> avoid the lvm failures, but we don't have to pass the failure up to
>>> the caller, or we can pass a
>>> specific error like "volume is in use", which can be ignored silently by
>>> the caller, since this
>>> error means someone else is using this volume and is responsible for
>>> deactivating it.
>>> 
>>> In case of simple errors that vdsm can report, engine may fail the
>>> change CD, or maybe
>>> continue to insert the new one, if the old CD was detached. For
>>> example, if the old CD failed
>>> to deactivate, this is a storage issue that should not fail attaching of
>>> a new CD. The storage layer
>>> should make sure the volume is deactivated if nobody uses it, but the
>>> user should not care
>>> about that.
>>> 
>>> If vdsm was killed, engine will fail the operation, but it must wait
>>> until the VM is recovered.
>>> At this point engine gets the VM XML and can update the real state of
>>> the CD, either still
>>> attached, or detached. Virt code should not care about deactivation
>>> failure.
> 
>>> Storage layer should be concerned with deactivation of active volumes,
>>> but I don't think we
>>> handle such issues in engine yet. On the vdsm side, unused logical volumes
>>> are deactivated
>>> during startup, see:
>>> https://github.com/oVirt/vdsm/blob/f6de2ca915faa719381e22086feb8b65aa4de942/lib/vdsm/storage/lvm.py#L1023
> 
>>> We probably need to add a periodic job in storage to deactivate unused
>>> volumes so we don't wait
>>> until the next vdsm restart. This can also be handled in engine, but I
>>> don't think it is the right place
>>> to handle host state.
>> 
>> sounds good enough indeed
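>> 
>> Something like this maybe (a sketch; the helpers are hypothetical):
>> 
>>   import threading
>> 
>>   def deactivate_unused(interval=300.0):
>>       for vg, lv in iter_active_lvs():   # hypothetical helper
>>           if not lv_in_use(vg, lv):      # e.g. open count is 0 and no VM
>>               deactivate(vg, lv)         #     references the LV
>>       threading.Timer(interval, deactivate_unused).start()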
>>> 
>>> 
>>>> Proposed changes
>>>> ===========
>>>> 
>>>> Engine
>>>> ----------
>>>> 
>>>> - change ChangeDiskCommand to use PDIV (pool, domain, image, volume ID)
>>>> instead of the ISO path
>>>> - modify the UI not to provide an option to change the CD for VMs
>>>> without a CD
>>>> 
>>>> vdsm
>>>> --------
>>>> 
>>>> - add a volume/ISO counter which would count the number of VMs using the
>>>> given ISO. When a CDROM is inserted, the counter is increased; when
>>>> ejected, the counter is decreased. The ISO volume is deactivated only
>>>> when no VM uses it.
>> 
>>> 
>>> This is a hack for supporting ISOs, while the issue of shared block
>>> volumes is not limited to ISOs. We have this issue in many flows, with
>>> many partial and broken solutions, like avoiding activation in engine
>>> (image transfer), or avoiding activation in
>>> vdsm (snapshot).
>>> 
>>> This should be solved separately in a proper way in vdsm storage. This
>>> is a large and
>>> complex change that we may have in a future version, not something for
>>> the next zstream.
>>> Unfortunately this area in vdsm was neglected for years, and we have
>>> huge technical debt.
>>> 
>>> So for now we will have to live with a warning when an active ISO image is
>>> deactivated, or improve
>>> error handling so the caller can detect and properly handle the case
>>> of a volume in use. I think
>>> better error handling will be easy to do and good enough for now.
>>> 
>>> 
>>>>   We already have an sqlite DB for managed storage, so we can add a DB
>>>>   for ISOs quite easily (a rough sketch follows the list below).
>>> 
>>>> - modify Vm._changeBlockDev to reflect the flows suggested above
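>>>> 
>>>> A rough sketch of the counter on top of sqlite (the schema and path are
>>>> illustrative, not existing vdsm code):
>>>> 
>>>>   import sqlite3
>>>> 
>>>>   con = sqlite3.connect("/var/lib/vdsm/iso-refs.db")  # path is made up
>>>>   con.execute("CREATE TABLE IF NOT EXISTS iso_refs"
>>>>               " (vol_id TEXT PRIMARY KEY, refs INTEGER NOT NULL)")
>>>> 
>>>>   def acquire(vol_id):
>>>>       # called when a CD is inserted
>>>>       with con:
>>>>           con.execute("INSERT INTO iso_refs (vol_id, refs) VALUES (?, 1)"
>>>>                       " ON CONFLICT(vol_id) DO UPDATE SET refs = refs + 1",
>>>>                       (vol_id,))
>>>> 
>>>>   def release(vol_id):
>>>>       # called on eject; True means the last VM released the ISO and the
>>>>       # volume can be deactivated
>>>>       with con:
>>>>           con.execute("UPDATE iso_refs SET refs = refs - 1"
>>>>                       " WHERE vol_id = ?", (vol_id,))
>>>>           (refs,) = con.execute("SELECT refs FROM iso_refs"
>>>>                                 " WHERE vol_id = ?", (vol_id,)).fetchone()
>>>>           if refs <= 0:
>>>>               con.execute("DELETE FROM iso_refs WHERE vol_id = ?",
>>>>                           (vol_id,))
>>>>       return refs <= 0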
>>> 
>>> 
>>> The solution requires persistence, since vdsm can be killed, and we
>>> would most likely
>>> use a database for this, but this should not be part of the
>>> managedvolume database, since
>>> sqlite is not good with concurrent usage.
>>> 
>>> Nir
>>> 
>>> 
>>> 
>>>>> Hi Fedor,
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 27 Jun 2020, at 14:46, Fedor Gavrilov <[email protected]> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> So from what I was able to see in the hotplugDisk example, for CDs I
>>>>>> would have to do something similar:
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 1. parse disk params from stored xml metadata
>>>>>> 2. call prepareVolumePath with these params
>>>>>> (skip normalizeVdsmImg)
>>>>>> 3. call updateDriveIndex
>>>>> 
>>>>> 
>>>>> 
>>>>> Why?
>>>>> 
>>>>> 
>>>>> 
>>>>>> (skip validating whether volume has leases)
>>>>>> 4. drive.getXML() and then hooks.before_disk_hotplug with it
>>>>> 
>>>>> 
>>>>> 
>>>>> If only the storage stuff would ever finish moving to xml....
>>>>> But it’s not right to call this hook anyway, iiuc you’re not doing a
>>>>> hotplug
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> 
>>>>>> 5. call attachDevice with drive XML and handle possible errors
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> I am not sure, however, whether we need what "else" does in
>>>>>> hotplugDisk
>>>>>> (vm.py line 3468)...
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Looking forward to hearing your opinions,
>>>>>> Fedor
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ----- Original Message -----
>>>>>> From: "Fedor Gavrilov" <[email protected]>
>>>>>> To: "devel" <[email protected]>
>>>>>> Sent: Monday, June 22, 2020 4:37:59 PM
>>>>>> Subject: [ovirt-devel] implementing hotplugCd/hotunplugCd in vdsm
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Hey,
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> So in an attempt to fix change CD functionality we discovered a few
>>>>>> other
>>>>>> potential
>>>>> 
>>>>> 
>>>>> CD has a long history of issues caused by historical simplifications.
>>>>> What is the intended fix, stop using string paths?
>>>>> 
>>>>> 
>>>>> 
>>>>>> issues and what Nir suggested was to implement two [somewhat] new
>>>>>> functions in VDSM: hotplug and hotunplug for CDs similar to how it
>>>>>> works
>>>>>> for normal disks now.
>>>>> 
>>>>> 
>>>>> This is conceptually different: a CD is removable media, while
>>>>> hotplug/unplug is for the device itself. We never supported hotplug of
>>>>> the device, though.
>>>>> 
>>>>> 
>>>>> 
>>>>>> Existing changeCD function will be left as is for backwards
>>>>>> compatibility.
>>>>> 
>>>>> 
>>>>> 
>>>>> Who would use the new function?
>>>>> 
>>>>> 
>>>>> 
>>>>>> As I found out, engine already calculates iface and index before
>>>>>> invoking
>>>>>> VDSM functions, so we will just pass these along with PDIV to the
>>>>>> VDSM.
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Suggested flow is, let me quote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> So the complete change CD flow should be:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 1. get the previous drivespec from vm metadata
>>>>>>> 2. prepare new drivespec
>>>>>>> 3. add new drivespec to vm metadata
>>>>>>> 4. attach a new device to vm
>>>>> 
>>>>> 
>>>>> 
>>>>> Don't call it a new device when you just change the media
>>>>> 
>>>>> 
>>>>> 
>>>>>>> 5. teardown the previous drivespec
>>>>>>> 6. remove previous drivespec from vm metadata
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> When the vm is stopped, it must do:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 1. get drive spec from vm metadata
>>>>>>> 2. teardown drivespec
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> During attach, there are interesting races:
>>>>>>> - what happens if vdsm crashes after step 2? who will tear down the
>>>>>>> volume?
>>>>>>> maybe you need to add the new drivespec to the metadata first,
>>>>>>> before preparing it.
>>>>>>> - what happens if attach fails? who will remove the new drive from
>>>>>>> the metadata?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Now, what makes hotplugDisk/hotunplugDisk different? From what I
>>>>>> understand, the flow is the same there, so what difference is there
>>>>>> as far as VDSM is concerned? If none, this means that if I more or
>>>>>> less copy that code, changing minor details and data accordingly for
>>>>>> CDs, this should work, shouldn't it?
>>>>> 
>>>>> 
>>>>> Hotplug/unplug is similar, but I would like to see the proposed change
>>>>> in the context of engine as well, and I expect it to be a bit more
>>>>> complex there. Without that, this is not worth it.
>>>>> 
>>>>> Thanks,
>>>>> michal
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Thanks!
>>>>>> Fedor
_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/2T4ENYLRGGLOWD2UUDLMUB3MBP3YHNAT/
