On Mon, Dec 17, 2012 at 03:12:34PM -0500, Saggi Mizrahi wrote:
> This is an addendum to my previous email.
>
> ----- Original Message -----
> > From: "Saggi Mizrahi" <[email protected]>
> > To: "Adam Litke" <[email protected]>
> > Cc: "Dan Kenigsberg" <[email protected]>, "Ayal Baron" <[email protected]>,
> > "Federico Simoncelli" <[email protected]>, [email protected],
> > [email protected]
> > Sent: Monday, December 17, 2012 2:52:06 PM
> > Subject: Re: Managing async tasks
> >
> > ----- Original Message -----
> > > From: "Adam Litke" <[email protected]>
> > > To: "Saggi Mizrahi" <[email protected]>
> > > Cc: "Dan Kenigsberg" <[email protected]>, "Ayal Baron" <[email protected]>,
> > > "Federico Simoncelli" <[email protected]>, [email protected],
> > > [email protected]
> > > Sent: Monday, December 17, 2012 2:16:25 PM
> > > Subject: Re: Managing async tasks
> > >
> > > On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
> > > >
> > > > ----- Original Message -----
> > > > > From: "Adam Litke" <[email protected]>
> > > > > To: [email protected]
> > > > > Cc: "Dan Kenigsberg" <[email protected]>, "Ayal Baron" <[email protected]>,
> > > > > "Saggi Mizrahi" <[email protected]>, "Federico Simoncelli"
> > > > > <[email protected]>, [email protected]
> > > > > Sent: Monday, December 17, 2012 12:00:49 PM
> > > > > Subject: Managing async tasks
> > > > >
> > > > > On today's vdsm call we had a lively discussion around how
> > > > > asynchronous operations should be handled in the future. In an
> > > > > effort to include more people in the discussion and to better
> > > > > capture the resulting conversation I would like to continue that
> > > > > discussion here on the mailing list.
> > > > >
> > > > > A lot of ideas were thrown around about how 'tasks' should be
> > > > > handled in the future. There are a lot of ways that it can be done.
> > > > > To determine how we should implement it, it's probably best if we
> > > > > start with a set of requirements. If we can first agree on these,
> > > > > it should be easy to find a solution that meets them. I'll take a
> > > > > stab at identifying a first set of POSSIBLE requirements:
> > > > >
> > > > > - Standardized method for determining the result of an operation
> > > > >
> > > > >   This is a big one for me because it directly affects the
> > > > >   consumability of the API. If each verb has different semantics
> > > > >   for discovering whether it has completed successfully, then the
> > > > >   API will be nearly impossible to use easily.
> > > >
> > > > Since there is no way to be sure whether some tasks completed
> > > > successfully or failed, especially around the murky waters of
> > > > storage, I say this requirement should be removed, or at least not
> > > > applied in the context of a task.
> > >
> > > I don't agree. Please feel free to convince me with some examples.
> > > If we cannot provide feedback to a user as to whether their request
> > > has been satisfied or not, then we have some bigger problems to solve.
> >
> > If VDSM sends a write command to a storage server and the connection
> > hangs up before the ACK has returned, the operation has been committed
> > but VDSM has no way of knowing that it happened. As far as VDSM is
> > concerned it got an ETIMEDOUT or EIO.
> > This is the same problem that the engine has with VDSM.
> > If VDSM creates an image\VM\network\repo but the connection hangs up
> > before the response can be sent back, then as far as the engine is
> > concerned the operation timed out.
> > This is an inherent issue with clustering.
> > This is why I want to move away from tasks being *the* trackable
> > objects.
> > Tasks should be short. As short as possible.
> > Run VM should just persist the VM information on the VDSM host and
> > return. The rest of the tracking should be done using the VM ID.
> > Create image should return once VDSM has persisted the information
> > about the request on the repository and created the metadata files.
> > Tracking should be done on the repo or the imageId.
>
> The thing is that I know how long a VM object (or an Image object)
> should live, so tracking it is straightforward. How long a task should
> live is very problematic and quite context specific.
> It depends on what the task is.
> I think it's quite confusing from an API standpoint to have every task
> have a different scope, ID requirement and life-cycle.
>
> VDSM has two types of APIs:
>
> CRUD objects - VM, Image, Repository, Bridge, Storage Connections....
> General transient methods - getBiosInfo(), getDeviceList()
>
> The latter are quite simple to manage. They don't need any special
> handling. If you lost a getBiosInfo() call you just send another one,
> no harm done. The same is even true for things that "change" the host,
> like getDeviceList().
>
> What we are really arguing about is fitting the CRUD objects to some
> generic task-oriented scheme.
> I'm saying it's a waste of time, as you can quite easily have flows to
> recover from each operation.
>
> Create - Check if the object exists
> Read - Read again
> Update - Either update again, or read and update if the update didn't
>          commit the first time
> Delete - Check if the object doesn't exist
>
> Each of the objects we CRUD has a different life-cycle and different
> ownership semantics.
>
> Danken raised the point that creation has a problem: if it fails there
> is no way to find out why it failed.
> This is why Create methods should be minimal. They shouldn't create the
> object, just the entry in the respective persistent storage.
> Even now storage connections are persisted to disk, then the operation
> returns and the user polls to see the state of the connection.
> The same should be done for everything. Do the minimum required to
> create the object entry and mark it as "not usable".
> For storage connections it's "connecting"
> For VMs it's "preparing for launch"
> For new images it's "broken" and in some regards "degraded"
>
> I hope this makes things clearer.
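
To make the pattern quoted above concrete, here is a minimal Python sketch
of a Create that only persists an entry marked as not usable, together with
the "check if the object exists" recovery for a lost reply. Everything in it
(the Repo class, the State names, the ensure_created helper) is hypothetical
and for illustration only; none of it is an existing VDSM API, and a real
repository would persist to disk rather than to a dict.

```python
from enum import Enum


class State(Enum):
    PREPARING = "preparing"  # entry persisted, object not usable yet
    READY = "ready"
    BROKEN = "broken"


class Repo:
    """Hypothetical in-memory stand-in for a persistent repository."""

    def __init__(self):
        self._entries = {}

    def create(self, obj_id):
        # Minimal create: only record the entry and mark it not usable.
        # Repeating the call with the same ID changes nothing.
        self._entries.setdefault(obj_id, State.PREPARING)

    def finish(self, obj_id, ok=True):
        # Stands in for the long-running work completing later.
        self._entries[obj_id] = State.READY if ok else State.BROKEN

    def status(self, obj_id):
        # Returns None when the object does not exist.
        return self._entries.get(obj_id)


def ensure_created(repo, obj_id, send_create):
    """Send a create request; on a lost reply, check whether it committed."""
    try:
        send_create(obj_id)
    except TimeoutError:
        # The reply was lost, so the request may or may not have committed.
        # Recover by checking for the object and retrying only if absent.
        if repo.status(obj_id) is None:
            send_create(obj_id)
    return repo.status(obj_id)
```

The caller tracks the object by its ID and polls status() until it leaves
the "preparing" state, so no separate task object has to be kept alive.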

So, I tried to avoid using the word 'Task' in my initial email because I
didn't want to cause any confusion. I actually like the ideas you mention
above. They work especially well assuming we have no rollback/cancel
operations.

I was hoping someone would volunteer some requirements that would be
pertinent to the 'setupNetwork' flow. I don't think this one fits very
well into the model above.

-- 
Adam Litke <[email protected]>
IBM Linux Technology Center

_______________________________________________
Engine-devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/engine-devel
