Re: [Engine-devel] Managing async tasks

Adam Litke Mon, 17 Dec 2012 14:41:18 -0800

On Mon, Dec 17, 2012 at 03:12:34PM -0500, Saggi Mizrahi wrote:
> This is an addendum to my previous email.
> 
> ----- Original Message -----
> > From: "Saggi Mizrahi" <[email protected]>
> > To: "Adam Litke" <[email protected]>
> > Cc: "Dan Kenigsberg" <[email protected]>, "Ayal Baron" <[email protected]>, 
> > "Federico Simoncelli"
> > <[email protected]>, [email protected], 
> > [email protected]
> > Sent: Monday, December 17, 2012 2:52:06 PM
> > Subject: Re: Managing async tasks
> > 
> > 
> > 
> > ----- Original Message -----
> > > From: "Adam Litke" <[email protected]>
> > > To: "Saggi Mizrahi" <[email protected]>
> > > Cc: "Dan Kenigsberg" <[email protected]>, "Ayal Baron"
> > > <[email protected]>, "Federico Simoncelli"
> > > <[email protected]>, [email protected],
> > > [email protected]
> > > Sent: Monday, December 17, 2012 2:16:25 PM
> > > Subject: Re: Managing async tasks
> > > 
> > > On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
> > > > 
> > > > 
> > > > ----- Original Message -----
> > > > > > From: "Adam Litke" <[email protected]>
> > > > > > To: [email protected]
> > > > > > Cc: "Dan Kenigsberg" <[email protected]>, "Ayal Baron"
> > > > > > <[email protected]>,
> > > > > > "Saggi Mizrahi" <[email protected]>, "Federico Simoncelli"
> > > > > > <[email protected]>, [email protected]
> > > > > > Sent: Monday, December 17, 2012 12:00:49 PM
> > > > > > Subject: Managing async tasks
> > > > > 
> > > > > > On today's vdsm call we had a lively discussion around how asynchronous
> > > > > > operations should be handled in the future.  In an effort to include more
> > > > > > people in the discussion and to better capture the resulting conversation I
> > > > > > would like to continue that discussion here on the mailing list.
> > > > > > 
> > > > > > A lot of ideas were thrown around about how 'tasks' should be handled in the
> > > > > > future.  There are a lot of ways that it can be done.  To determine how we
> > > > > > should implement it, it's probably best if we start with a set of
> > > > > > requirements.  If we can first agree on these, it should be easy to find a
> > > > > > solution that meets them.  I'll take a stab at identifying a first set of
> > > > > > POSSIBLE requirements:
> > > > > 
> > > > > > - Standardized method for determining the result of an operation
> > > > > > 
> > > > > >   This is a big one for me because it directly affects the consumability of
> > > > > >   the API.  If each verb has different semantics for discovering whether it
> > > > > >   has completed successfully, then the API will be nearly impossible to use
> > > > > >   easily.
> > > > Since there is no way to be sure whether some tasks completed
> > > > successfully or failed, especially around the murky waters of storage,
> > > > I say this requirement should be removed.  At least not in the context
> > > > of a task.
> > > 
> > > I don't agree.  Please feel free to convince me with some examples.
> > > If we cannot provide feedback to a user as to whether their request has
> > > been satisfied or not, then we have some bigger problems to solve.
> > Suppose VDSM sends a write command to a storage server and the connection
> > hangs up before the ACK has returned.
> > The operation has been committed, but VDSM has no way of knowing that;
> > as far as VDSM is concerned it got an ETIMEDOUT or EIO.
> > This is the same problem that the engine has with VDSM.
> > If VDSM creates an image\VM\network\repo but the connection hangs up
> > before the response can be sent back, then as far as the engine is
> > concerned the operation timed out.
> > This is an inherent issue with clustering.
> > This is why I want to move away from tasks being *the* trackable
> > objects.
> > Tasks should be short. As short as possible.
> > Run VM should just persist the VM information on the VDSM host and
> > return. The rest of the tracking should be done using the VM ID.
> > Create image should return once VDSM persisted the information about
> > the request on the repository and created the metadata files.
> > Tracking should be done on the repo or the imageId.
> 
> The thing is that I know how long a VM object (or an Image object) should
> live, so tracking it is straightforward. How long a task should live is very
> problematic and quite context specific.
> It depends on what the task is.
> I think it's quite confusing from an API standpoint to have every task have a 
> different scope, id requirement and life-cycle.
> 
> VDSM has two types of APIs:
> 
> CRUD objects - VM, Image, Repository, Bridge, Storage Connections....
> General transient methods - getBiosInfo(), getDeviceList()
> 
> The latter are quite simple to manage. They don't need any special handling. 
> If you lost a getBiosInfo() call you just send another one, no harm done.
> The same is even true of things that "change" the host, like getDeviceList().
> 
> What we are really arguing about is fitting the CRUD objects to some generic 
> task oriented scheme.
> I'm saying it's a waste of time as you can quite easily have flows to recover 
> from each operation.
> 
> Create - Check if the object exists
> Read - Read again
> Update - either update again or read and update if update didn't commit the 
> first time
> Delete - Check if object doesn't exist
> 
> Each of the objects we CRUD have different life-cycles and ownership 
> semantics.
> 
> Danken raised the point that creation has a problem: if it fails, there is
> no way to find out why it failed.
> This is why Create methods should be minimal. They shouldn't create the object,
> just the entry in the respective persistent storage.
> Even now storage connections are persisted to disk and then the operation 
> returns and the user polls to see the state of the connection.
> The same should be done for everything. Do the minimum required to create the 
> object entry and mark it as "not usable".
> For storage connections it's "connecting"
> For VMs it's "preparing for launch"
> For new images it's "broken" and in some regards "degraded"
> 
> I hope this makes things clearer


So, I tried to avoid using the word 'Task' in my initial email because I didn't
want to cause any confusion.  I actually like the ideas you mention above.  They
work especially well assuming we have no rollback/cancel operations.
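
To check that I am reading the model correctly, here is a rough Python sketch
of the "minimal create, then track by object ID" flow.  Everything in it is
hypothetical: FakeHost, State, ensure_created and the method names are made up
for illustration and are not the VDSM API.  It assumes create is idempotent
and that the failure reason is stored on the object entry, which is one
possible answer to Danken's point rather than something we have agreed on.

```python
import enum


class State(enum.Enum):
    PREPARING = "preparing for launch"  # entry persisted, not usable yet
    RUNNING = "running"
    FAILED = "failed"


class FakeHost:
    """Hypothetical in-memory stand-in for a host; not the real VDSM API."""

    def __init__(self):
        self._vms = {}  # vm_id -> {"state": State, "error": str or None}

    def create_vm(self, vm_id):
        # Minimal create: persist the entry, mark it unusable, return.
        # Repeating the call for an existing ID changes nothing.
        self._vms.setdefault(vm_id, {"state": State.PREPARING, "error": None})

    def get_vm(self, vm_id):
        # Read: safe to repeat.  Returns None if no entry was persisted.
        return self._vms.get(vm_id)

    def finish_launch(self, vm_id, error=None):
        # Stands in for the background work that runs after create returns.
        entry = self._vms[vm_id]
        entry["state"] = State.FAILED if error else State.RUNNING
        entry["error"] = error


def ensure_created(host, vm_id, create):
    """Call create(vm_id); on a lost response, recover by checking existence."""
    try:
        create(vm_id)
    except TimeoutError:
        # The request may or may not have been committed, so ask the host.
        if host.get_vm(vm_id) is None:
            host.create_vm(vm_id)
    return host.get_vm(vm_id)["state"]
```

The caller never tracks a task here.  It polls get_vm() with the VM ID until
the state leaves PREPARING, and on FAILED it reads the reason from the entry.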

I was hoping someone would volunteer some requirements that would be pertinent
to the 'setupNetwork' flow.  I don't think this one fits very well into the
model above.


-- 
Adam Litke <[email protected]>
IBM Linux Technology Center

_______________________________________________
Engine-devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/engine-devel
