On Dec 7, 2016 16:00, "Nir Soffer" <[email protected]> wrote:
>
> On Wed, Dec 7, 2016 at 10:17 AM, Oved Ourfali <[email protected]> wrote:
> >
> >
> > On Tue, Dec 6, 2016 at 11:12 PM, Adam Litke <[email protected]> wrote:
> >>
> >> On 06/12/16 22:06 +0200, Arik Hadas wrote:
> >>>
> >>> Adam,
> >>
> >>
> >> :) You seem upset. Sorry if I touched a nerve...
> >>
> >>> Just out of curiosity: when you write "v2v has promised" - what exactly
> >>> do you mean? The tool? Richard Jones (the maintainer of virt-v2v)?
> >>> Shahar and I, who implemented the integration with virt-v2v? I'm not
> >>> aware of such a promise by any of these options :)
> >>
> >>
> >> Some history...
> >>
> >> Earlier this year Nir, Francesco (added), Shahar, and I began
> >> discussing the similarities between what storage needed to do with
> >> external commands and what was designed specifically for v2v. I am
> >> not sure if you were involved in the project at that time. The plan
> >> was to create common infrastructure that could be extended to fit the
> >> unique needs of the verticals. The v2v code was going to be moved
> >> over to the new infrastructure (see [1]) and the only thing that
> >> stopped the initial patch was the lack of a VMware testing environment
> >> for verification.
> >>
> >> At that time storage refocused on developing verbs that used the new
> >> infrastructure and has been maintaining its suitability for general
> >> use. Converting v2v to Host Jobs is obviously a lower-priority item
> >> and much more difficult now due to the early missed opportunity.
> >>
> >>> Anyway, let's say that you were given such a promise by someone and
> >>> thus consider that mechanism to be deprecated - it doesn't really
> >>> matter.
> >>
> >>
> >> I may be biased but I think my opinion does matter.
> >>
> >>> The current implementation doesn't fit this flow well (it requires a
> >>> per-volume job, it creates leases that are not needed for a template's
> >>> disks, ...) and with the "next-gen API" with proper support for virt
> >>> flows not even being discussed with us (and iiuc also not with the
> >>> infra team) yet, I don't understand what you suggest except for some
> >>> strong, though irrelevant, statements.
> >>
> >>
> >> If you are willing to engage in a good-faith technical discussion I am
> >> sure I can help you to understand. These operations on storage demand
> >> some form of locking protection. If volume leases aren't appropriate
> >> then perhaps we should use the VM Leases / xleases that Nir is
> >> finishing off for 4.1 now.
> >>
> >>> I suggest, loud and clear, to reuse (not to add dependencies, not to
> >>> enhance, ...) an existing mechanism for a very similar flow of
> >>> virt-v2v that works well and is simple.
> >>
> >>
> >> I clearly remember discussions involving infra (hello Oved), virt
> >> (hola Michal), and storage where we decided that new APIs performing
> >> async operations involving external commands should use the HostJobs
> >> infrastructure instead of adding more information to Host Stats.
> >> These were the "famous" entity polling meetings.
>
> We discussed these issues behind closed doors, not on the public mailing
> list, so it is not surprising that people do not know about the
> agreements we had.
>
The core team was there. So it is surprising.

> >>
> >> Of course plans can change but I have never been looped into any such
> >> discussions.
> >>
> >
> > Well, I think that when someone builds a good infrastructure he first
> > needs to talk to all consumers and make sure it fits.
> > In this case it seems like most work was done to fit the storage use
> > case, and now you check whether it can fit others as well....
>
> The jobs framework is generic and can be used for any subsystem;
> there is nothing related to storage about it. But modifying disks *is*
> a storage operation, even if someone from the virt team worked on it.
>
> V2v is also a storage operation - if we compare it with copying disks:
>
> - we create a new volume that nobody is using yet
> - if the operation fails, the disk must be in an illegal state
> - if the operation fails we delete the disks
> - if the operation succeeds the volume must be legal
> - we need to limit the number of operations on a host
> - we need to detect the job state if the host becomes non-responsive
> - we may want to fence the job if the host becomes non-responsive;
>   in volume jobs, we can increment the volume generation and run
>   the same job on another host
> - we want to take a lease on storage to ensure that other hosts cannot
>   access the same entity, or that the job will fail if someone else is
>   using this entity
> - we want to take a lease on storage, ensuring that a job cannot get
>   stuck for a long time - sanlock kills the owner of a lease when
>   storage becomes inaccessible
> - we want to report progress
>
> sysprep is less risky because the operation is faster, but on storage
> even a fast operation can get stuck for minutes.
>
> We need to agree on a standard way to do such operations that is safe
> enough and can be managed on the engine side.
>
> > IMO it makes much more sense to use events where possible (and you've
> > promised to use those as well, but I don't see you doing that...).
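A rough sketch of the volume-job safety model listed above - a storage lease, a generation number, and legal/illegal volume state - might look like this. All names here are hypothetical; this is not the actual vdsm or sanlock API, and the lease is simulated in memory:

```python
# Illustrative only: the lease is an in-memory stand-in for a sanlock
# lease, and every name here is hypothetical.

class Volume:
    def __init__(self):
        self.legal = False      # a new volume stays illegal until a job succeeds
        self.generation = 0     # bumped when a job is fenced
        self.leased_by = None   # stand-in for the sanlock lease owner

class JobFencedError(Exception):
    pass

def run_volume_job(volume, host_id, operation):
    """Run operation(volume) under a (simulated) storage lease.

    On success the volume becomes legal; on failure it stays illegal,
    so engine can safely delete it.
    """
    if volume.leased_by is not None:
        raise JobFencedError("volume leased by %s" % volume.leased_by)
    volume.leased_by = host_id
    expected_gen = volume.generation
    try:
        operation(volume)
        if volume.generation != expected_gen:
            # The job was fenced while running; its result is stale.
            raise JobFencedError("stale generation")
        volume.legal = True
    except Exception:
        volume.legal = False
        raise
    finally:
        volume.leased_by = None

def fence_job(volume):
    """Fence the job of a non-responsive host by bumping the generation,
    so the same job can be restarted safely on another host."""
    volume.generation += 1
```

The point of the sketch is only the state machine: a failed or fenced job can never leave a volume marked legal.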
> > v2v should use events for sure, and they have promised to do that in
> > the past, instead of using the v2v jobs. The reason events weren't
> > used originally with the v2v feature was that it was too risky and
> > the events infrastructure was added too late in the game.
>
> Events do not replace the need for managing jobs on the vdsm side.
> Engine must have a way to query the current jobs before subscribing
> to events from these jobs, otherwise you will lose events and engine
> will never notice a completed job after network errors.
>
> The jobs framework supports events, see
> https://gerrit.ovirt.org/67118
>
> We are waiting for review from the infra team, maybe you can
> get someone to review this?

It would have been great to review the design for this before it reached
gerrit. Anyway, I get a permissions error when opening it. Any clue why?

>
> Nir
>
> >
> >
> >>>
> >>> Do you "promise" to implement your "next gen API" for 4.1 as an
> >>> alternative?
> >>
> >>
> >> I guess we need the design first.
> >>
> >>
> >>> On Tue, Dec 6, 2016 at 5:04 PM, Adam Litke <[email protected]> wrote:
> >>>
> >>> On 05/12/16 11:17 +0200, Arik Hadas wrote:
> >>>
> >>>
> >>> On Mon, Dec 5, 2016 at 10:05 AM, Nir Soffer <[email protected]>
> >>> wrote:
> >>>
> >>> On Sun, Dec 4, 2016 at 8:50 PM, Shmuel Melamud <[email protected]>
> >>> wrote:
> >>> >
> >>> > Hi!
> >>> >
> >>> > I'm currently working on integration of virt-sysprep into oVirt.
> >>> >
> >>> > Usually, if a user creates a template from a regular VM, and then
> >>> > creates new VMs from this template, these new VMs inherit all
> >>> > configuration of the original VM, including SSH keys, UDEV rules,
> >>> > MAC addresses, system ID, hostname etc. This is unfortunate,
> >>> > because you cannot have two network devices with the same MAC
> >>> > address in the same network, for example.
> >>> >
> >>> > To avoid this, the user must clean all machine-specific
> >>> > configuration from the original VM before creating a template
> >>> > from it. You can do this manually, but there is a virt-sysprep
> >>> > utility that does this automatically.
> >>> >
> >>> > Ideally, virt-sysprep should be seamlessly integrated into the
> >>> > template creation process. But the first step is to create a
> >>> > simple button: the user selects a VM, clicks the button and oVirt
> >>> > executes virt-sysprep on the VM.
> >>> >
> >>> > virt-sysprep works directly on the VM's filesystem. It accepts a
> >>> > list of all disks of the VM as parameters:
> >>> >
> >>> > virt-sysprep -a disk1.img -a disk2.img -a disk3.img
> >>> >
> >>> > The architecture is as follows: a command on the Engine side runs
> >>> > a job on the VDSM side and tracks its success/failure. The job on
> >>> > the VDSM side runs virt-sysprep.
> >>> >
> >>> > The question is how to implement the job correctly?
> >>> >
> >>> > I thought about using storage jobs, but they are designed to work
> >>> > only with a single volume, correct?
> >>>
> >>> New storage verbs are volume based. This makes it easy to manage
> >>> them on the engine side, and will allow parallelizing volume
> >>> operations on single or multiple hosts.
> >>>
> >>> A storage volume job uses a sanlock lease on the modified volume
> >>> and a volume generation number. If a host running pending jobs
> >>> becomes non-responsive and cannot be fenced, we can detect the
> >>> state of the job, fence the job, and start the job on another host.
> >>>
> >>> In the SPM task, if a host becomes non-responsive and cannot be
> >>> fenced, the whole setup is stuck; there is no way to perform any
> >>> storage operation.
> >>> > Is it possible to use them with an operation that is performed on
> >>> > multiple volumes?
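As a concrete sketch of the invocation Shmuel describes - engine asks the vdsm side to run virt-sysprep over every disk of the VM - the vdsm-side job might build the command like this (hypothetical helper names, not actual vdsm code):

```python
import subprocess

def sysprep_command(disk_paths):
    """Build the virt-sysprep argv for a VM.

    Every disk of the VM is passed via a repeated -a option, since
    machine identity (MAC addresses, SSH keys, machine-id) may live
    on any of the disks.
    """
    cmd = ["virt-sysprep"]
    for path in disk_paths:
        cmd.extend(["-a", path])
    return cmd

def run_sysprep(disk_paths):
    # The vdsm-side job would run this as an external process while
    # the VM is down, and report success/failure back to engine.
    subprocess.run(sysprep_command(disk_paths), check=True)
```

For example, `sysprep_command(["disk1.img", "disk2.img", "disk3.img"])` yields exactly the command line shown in the message above.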
> >>> > Or, alternatively, is it possible to use some kind of 'VM jobs' -
> >>> > that work on the VM as a whole?
> >>>
> >>> We can do:
> >>>
> >>> 1. Add jobs with multiple volume leases - can make error handling
> >>>    very complex. How do you tell a job's state if you have multiple
> >>>    leases? Which volume generation do you use?
> >>>
> >>> 2. Use a volume job using one of the volumes (the boot volume?).
> >>>    This does not protect the other volumes from modification but
> >>>    engine is responsible for this.
> >>>
> >>> 3. Use new "vm jobs", using a vm lease (should be available this
> >>>    week on master). This protects the vm from being started during
> >>>    sysprep. We still need a generation to detect the job state; I
> >>>    think we can use the sanlock lease generation for this.
> >>>
> >>> I like the last option since sysprep is much like running a vm.
> >>> > How does v2v solve this problem?
> >>>
> >>> It does not.
> >>>
> >>> v2v predates storage volume jobs. It does not use volume leases and
> >>> generations and does not have any way to recover if a host running
> >>> v2v becomes non-responsive and cannot be fenced.
> >>>
> >>> It also does not use the jobs framework and does not use a thread
> >>> pool for v2v jobs, so it has no limit on the number of storage
> >>> operations on a host.
> >>>
> >>>
> >>> Right, but let's be fair and present the benefits of v2v jobs as
> >>> well:
> >>> 1. it is the simplest "infrastructure" in terms of LOC
> >>>
> >>>
> >>> It is also deprecated. V2V has promised to adopt the richer Host
> >>> Jobs API in the future.
> >>>
> >>>
> >>> 2. it is the most efficient mechanism in terms of interactions
> >>> between the engine and VDSM (it doesn't require new verbs/calls,
> >>> the data is attached to VdsStats; probably the easiest mechanism to
> >>> convert to events)
> >>>
> >>>
> >>> Engine is already polling the host jobs API so I am not sure I
> >>> agree with you here.
> >>>
> >>>
> >>> 3. it is the most efficient implementation in terms of interaction
> >>> with the database (no data is persisted in the database, no polling
> >>> is done)
> >>>
> >>>
> >>> Again, we're already using the Host Jobs API. We'll gain efficiency
> >>> by migrating away from the old v2v API and having a single, unified
> >>> approach (Host Jobs).
> >>>
> >>>
> >>> Currently we have 3 mechanisms to report jobs:
> >>> 1. VM jobs - currently used for live merge. This requires the VM
> >>> entity to exist in VDSM, thus not suitable for virt-sysprep.
> >>>
> >>>
> >>> Correct, not appropriate for this application.
> >>>
> >>>
> >>> 2. storage jobs - complicated infrastructure, targeted at
> >>> recovering from failures to maintain storage consistency. Many of
> >>> the things this infrastructure knows how to handle are irrelevant
> >>> for the virt-sysprep flow, and the fact that virt-sysprep is
> >>> invoked on a VM rather than a particular disk makes it less
> >>> suitable.
> >>>
> >>>
> >>> These are more appropriately called Host Jobs and they have the
> >>> following semantics:
> >>> - They represent an external process running on a single host
> >>> - They are not persisted. If the host or vdsm restarts, the job is
> >>>   aborted
> >>> - They operate on entities. Currently storage is the first adopter
> >>>   of the infrastructure but virt was going to adopt these for the
> >>>   next-gen API. Entities can be volumes, storage domains, vms,
> >>>   network interfaces, etc.
> >>> - Job status and progress is reported by the Host Jobs API. If a
> >>>   job is not present, then the underlying entity (or entities) must
> >>>   be polled by engine to determine the actual state.
> >>>
> >>>
> >>> 3. V2V jobs - no mechanism is provided to resume failed jobs, no
> >>> leases, etc.
> >>>
> >>>
> >>> This is the old infra upon which Host Jobs are built. v2v has
> >>> promised to move to Host Jobs in the future so we should not add
> >>> new dependencies to this code.
> >>>
> >>>
> >>> I have some arguments for using V2V-like jobs [1]:
> >>> 1. creating a template from a vm is rarely done - if the host goes
> >>> unresponsive or any other failure is detected we can just remove
> >>> the template and report the error
> >>>
> >>>
> >>> We can choose this error handling with Host Jobs as well.
> >>>
> >>>
> >>> 2. the phase of virt-sysprep is, unlike a typical storage
> >>> operation, short - reducing the risk of failures during the process
> >>>
> >>>
> >>> Reduced risk of failures is never an excuse for lax error handling.
> >>> The storage-flavored host jobs provide tons of utilities for making
> >>> error handling standardized, easy to implement, and correct.
> >>>
> >>>
> >>> 3. during the operation the VM is down - by locking the VM/template
> >>> and its disks on the engine side, we render a leases-like mechanism
> >>> redundant
> >>>
> >>>
> >>> Eventually we want to protect all operations on storage with
> >>> sanlock leases. This is safer and allows for a more distributed
> >>> approach to management. Again, using leases correctly in host jobs
> >>> requires about 5 lines of code. The benefits of standardization far
> >>> outweigh any perceived simplification resulting from omitting it.
> >>>
> >>>
> >>> 4. in the worst case - the disk will not be corrupted (only some of
> >>> the data might be removed).
> >>>
> >>>
> >>> Again, the way engine chooses to handle job failures is independent
> >>> of the mechanism. Let's separate that from this discussion.
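The Host Jobs semantics described in this thread - an unpersisted, per-host job operating on an entity, whose status is served by a jobs API and whose absence means engine must poll the entity itself - could be sketched like this (hypothetical names only, not the real vdsm jobs framework):

```python
import threading
import uuid

_jobs = {}                      # in-memory only: a vdsm restart forgets all jobs
_jobs_lock = threading.Lock()

class HostJob:
    """One external operation on one entity, on this host only."""

    def __init__(self, entity_id, operation):
        self.id = str(uuid.uuid4())
        self.entity_id = entity_id
        self._operation = operation
        self.status = "pending"
        self.progress = 0

    def run(self):
        self.status = "running"
        try:
            self._operation()
            self.progress = 100
            self.status = "done"
        except Exception:
            self.status = "failed"

def start_job(entity_id, operation):
    job = HostJob(entity_id, operation)
    with _jobs_lock:
        _jobs[job.id] = job
    job.run()   # a real implementation would use a bounded thread pool
    return job.id

def job_status(job_id):
    """Serve the jobs API: once a job has been dropped from the
    registry, engine must poll the underlying entity to learn the
    actual state."""
    with _jobs_lock:
        job = _jobs.get(job_id)
    if job is None:
        return None
    return {"status": job.status, "progress": job.progress}
```

A bounded pool in `start_job` is where the "limit the number of operations on a host" requirement from earlier in the thread would plug in.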
> >>>
> >>>
> >>> So I think that the mechanism for storage jobs is overkill for this
> >>> case. We can keep it simple by generalising the V2V job for other
> >>> virt-tools jobs, like virt-sysprep.
> >>>
> >>>
> >>> I think we ought to standardize on the Host Jobs framework where we
> >>> can collaborate on unit tests, standardized locking and error
> >>> handling, abort logic, etc. When v2v moves to host jobs then we
> >>> will have a unified method of handling ephemeral jobs that are tied
> >>> to entities.
> >>>
> >>> --
> >>> Adam Litke
> >>>
> >>>
> >>
> >> --
> >> Adam Litke
> >
> >
_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel
