Re: [Xen-devel] Cancelling asynchronous operations in libxl

2015-06-25 Thread Ian Campbell
On Wed, 2015-06-24 at 16:41 +0100, Ian Jackson wrote:
 Euan Harris writes (Re: Cancelling asynchronous operations in libxl):
  We've discussed the semantics of cancellation a bit more off-list and
  have come to two conclusions:
  
    1.  [...]
  
    We should rename the proposed libxl_ao_cancel() to libxl_ao_abort().
 
 Unless someone objects to this, I will do this in my next
 rebase/resend.
 
 (CCing a slightly wider set of people who may be interested in libxl
 API semantics.)

FWIW it seems fine to me...

 
    This function will be defined as a best-effort way to kill an
    asynchronous operation, and will give no guarantees about the
    state of the affected domain afterwards.   We may add a true
    libxl_ao_cancel() function later, with better guarantees about the
    state of the domain afterwards.   libxl_ao_abort(), as defined here,
    covers many of our requirements in Xapi.
 
 My plan for implementing (eventually) libxl_ao_cancel is that
 it works much like abort, except that operations can:
 
  * block and unblock cancellation during critical sections
 
  * declare an ao committed, causing cancellation requests to all fail
 
  * divert cancellation requests to a special handler (which could
start to try to undo the operation, for example)
 
 ...
    The semantics of libxl_ao_cancel/_abort() are defined as best effort,
    so it suffices to have just two return codes:
  
        0: The request to cancel/abort has been noted, and it may or may
           not happen.   To find out which, check the eventual return code
           of the async operation.
  
        ERROR_NOTFOUND: the operation to be cancelled has already completed.
 
 ERROR_NOTFOUND might also mean that the operation has not yet
 started.  For example, the call to libxl_domain_create_new might be on
 its way into libxl and be waiting for the libxl ctx lock.
 
 Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Cancelling asynchronous operations in libxl

2015-06-24 Thread Ian Jackson
Euan Harris writes (Re: Cancelling asynchronous operations in libxl):
 We've discussed the semantics of cancellation a bit more off-list and
 have come to two conclusions:
 
   1.  [...]
 
   We should rename the proposed libxl_ao_cancel() to libxl_ao_abort().

Unless someone objects to this, I will do this in my next
rebase/resend.

(CCing a slightly wider set of people who may be interested in libxl
API semantics.)

   This function will be defined as a best-effort way to kill an
   asynchronous operation, and will give no guarantees about the
   state of the affected domain afterwards.   We may add a true
   libxl_ao_cancel() function later, with better guarantees about the
   state of the domain afterwards.   libxl_ao_abort(), as defined here,
   covers many of our requirements in Xapi.

My plan for implementing (eventually) libxl_ao_cancel is that
it works much like abort, except that operations can:

 * block and unblock cancellation during critical sections

 * declare an ao committed, causing cancellation requests to all fail

 * divert cancellation requests to a special handler (which could
   start to try to undo the operation, for example)
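
A minimal sketch of how those three facilities might interact, written as
a standalone model rather than as libxl code.  Every name here
(sketch_ao, sketch_block, sketch_commit, sketch_cancel and so on) is
hypothetical, and two details are assumptions not stated above: a request
that arrives inside a critical section is remembered and delivered on
unblock, and committing discards any such remembered request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return values; not the real libxl error codes. */
enum { SKETCH_OK = 0, SKETCH_FAIL = -1 };

typedef struct sketch_ao {
    int block_depth;                     /* >0 while in a critical section */
    bool committed;                      /* once set, cancel requests fail */
    bool cancel_pending;                 /* request arrived while blocked */
    bool cancelled;                      /* default action was taken */
    void (*divert)(struct sketch_ao *);  /* optional special handler */
    bool undo_started;                   /* set by the example handler */
} sketch_ao;

/* Deliver a request: to the special handler if one is set, else stop. */
static void sketch_deliver(sketch_ao *ao)
{
    ao->cancel_pending = false;
    if (ao->divert != NULL)
        ao->divert(ao);
    else
        ao->cancelled = true;
}

static void sketch_block(sketch_ao *ao) { ao->block_depth++; }

static void sketch_unblock(sketch_ao *ao)
{
    assert(ao->block_depth > 0);
    /* Leaving the outermost critical section delivers a deferred request. */
    if (--ao->block_depth == 0 && ao->cancel_pending && !ao->committed)
        sketch_deliver(ao);
}

/* Assumption: committing also drops a request that was still deferred. */
static void sketch_commit(sketch_ao *ao)
{
    ao->committed = true;
    ao->cancel_pending = false;
}

static int sketch_cancel(sketch_ao *ao)
{
    if (ao->committed)
        return SKETCH_FAIL;              /* too late: the ao is committed */
    if (ao->block_depth > 0)
        ao->cancel_pending = true;       /* defer until unblock */
    else
        sketch_deliver(ao);
    return SKETCH_OK;
}

/* Example special handler: start undoing instead of simply stopping. */
static void sketch_begin_undo(sketch_ao *ao) { ao->undo_started = true; }
```

In this model a caller of sketch_cancel only learns whether the request
was accepted; whether the operation stops or undoes itself is decided by
the operation.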

...
   The semantics of libxl_ao_cancel/_abort() are defined as best effort,
   so it suffices to have just two return codes:
 
       0: The request to cancel/abort has been noted, and it may or may
          not happen.   To find out which, check the eventual return code
          of the async operation.
 
       ERROR_NOTFOUND: the operation to be cancelled has already completed.

ERROR_NOTFOUND might also mean that the operation has not yet
started.  For example, the call to libxl_domain_create_new might be on
its way into libxl and be waiting for the libxl ctx lock.
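
To illustrate the ambiguity, here is a small standalone model, not libxl
code.  The phase names and sketch_abort are hypothetical, and the model
assumes that only an operation which has actually started is visible to
an abort request.

```c
#include <assert.h>

/* Hypothetical return values; not the real libxl error codes. */
enum { SKETCH_OK = 0, SKETCH_NOTFOUND = -1 };

/* Phases an operation passes through, as seen from outside libxl. */
typedef enum {
    OP_WAITING_FOR_LOCK,   /* call made, still waiting for the ctx lock */
    OP_RUNNING,            /* started, so an abort request can find it */
    OP_DONE                /* already completed */
} sketch_phase;

/* Only a running operation can be found by an abort request here. */
static int sketch_abort(sketch_phase phase)
{
    assert(phase >= OP_WAITING_FOR_LOCK && phase <= OP_DONE);
    return phase == OP_RUNNING ? SKETCH_OK : SKETCH_NOTFOUND;
}

/* Returns nonzero: "not yet started" and "completed" look the same. */
static int sketch_notfound_is_ambiguous(void)
{
    return sketch_abort(OP_WAITING_FOR_LOCK) == sketch_abort(OP_DONE);
}
```

A caller that needs to tell the two cases apart would have to track for
itself whether the operation's completion has already been reported.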

Ian.



Re: [Xen-devel] Cancelling asynchronous operations in libxl

2015-06-24 Thread Euan Harris
Hi,

On Wed, Jan 28, 2015 at 04:57:19PM +, Ian Jackson wrote:
 Euan Harris writes (Re: Cancelling asynchronous operations in libxl):
  On Tue, Jan 20, 2015 at 04:38:24PM +, Ian Jackson wrote:
* Is an API along these lines going to meet your needs ?
  
  The API you propose for libxl_ao_cancel, as described in the comment in
  libxl.h, looks reasonable to us.   The comment for ERROR_NOTIMPLEMENTED
  is a bit confusing: under what circumstances might a task actually be
  cancelled although libxl_ao_cancel returned ERROR_NOTIMPLEMENTED?
 
 A single operation may go through phases during which cancellation is
 effective, and phases during which it is not very effective because it
 hasn't been properly hooked up.  If libxl_ao_cancel is called during
 the latter, it will return ERROR_NOTIMPLEMENTED but the operation will
 still be marked as wanting-cancellation, so if it enters a phase where
 cancellation is effective, it will stop at that point.
 
 To put it another way, what libxl_ao_cancel does is:
   - find the ao in question, hopefully
   - make a note in the ao that it ought to be cancelled
   - look for something internal that has registered a
  cancellation hook
   - if such a hook was found, call it and return success;
  otherwise return ERROR_NOTIMPLEMENTED.
 
 So ERROR_NOTIMPLEMENTED is more of a hint.
 
 If you prefer, it would be possible to make libxl_ao_cancel _not_ make
 a note that the operation ought to be cancelled, in the case where
 it's returning ERROR_NOTIMPLEMENTED.  Then the libxl_ao_cancel would
 be guaranteed to have no effect.
 
 But, if we do that, it won't be possible to mark a
 currently-running-and-not-promptly-cancellable but
 maybe-shortly-actually-cancellable operation as to be cancelled.
 
 Perhaps if this is confusing the better answer is simply to return a
 different error code instead of ERROR_NOTIMPLEMENTED,
   ERROR_CANCELLATION_DIFFICULT

We've discussed the semantics of cancellation a bit more off-list and
have come to two conclusions:

  1.  The behaviour of the current libxl_ao_cancel() proposal is more akin
  to 'abort' than 'cancel'.   This is because the proposed
  implementation can't guarantee the state of the domain after
  cancellation - it might be fine, it might be dead, or it might
  be in some unanticipated limbo state, depending on just when the
  cancellation call took effect.

  We should rename the proposed libxl_ao_cancel() to libxl_ao_abort().
  This function will be defined as a best-effort way to kill an
  asynchronous operation, and will give no guarantees about the
  state of the affected domain afterwards.   We may add a true
  libxl_ao_cancel() function later, with better guarantees about the
  state of the domain afterwards.   libxl_ao_abort(), as defined here,
  covers many of our requirements in Xapi.

  2.  We should remove the ERROR_NOTIMPLEMENTED error code.   It does
  not add much value, because cancellation is implemented in terms
  of underlying primitive operations, rather than API operations.
  Any async API operation may be cancellable in principle, and whether
  this error code is returned depends on exactly what primitive
  operation happens to be in progress when libxl_ao_cancel/_abort() is
  called.   Furthermore, even if the call to libxl_ao_cancel/_abort()
  returns NOTIMPLEMENTED, the operation may be cancelled anyway when
  it starts a cancellable primitive operation.

  The semantics of libxl_ao_cancel/_abort() are defined as best effort,
  so it suffices to have just two return codes:

      0: The request to cancel/abort has been noted, and it may or may
         not happen.   To find out which, check the eventual return code
         of the async operation.

      ERROR_NOTFOUND: the operation to be cancelled has already completed.
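
From the caller's side, the two-code scheme might be handled roughly as
in the sketch below.  This is an illustration only: the enum values and
function names are hypothetical, and SKETCH_ABORTED stands in for
whatever return code an aborted operation eventually reports, which is
not specified in this mail.

```c
#include <assert.h>

/* Hypothetical values; not the real libxl error codes. */
enum { SKETCH_OK = 0, SKETCH_NOTFOUND = -1, SKETCH_ABORTED = -2 };

typedef enum {
    OUTCOME_PENDING,     /* request noted; wait for the operation's own rc */
    OUTCOME_RAN_TO_END,  /* the operation finished without being aborted */
    OUTCOME_ABORTED      /* the abort request took effect */
} sketch_outcome;

/* What can be concluded from the return value of the abort call alone. */
static sketch_outcome sketch_from_abort_rc(int abort_rc)
{
    assert(abort_rc == SKETCH_OK || abort_rc == SKETCH_NOTFOUND);
    return abort_rc == SKETCH_NOTFOUND ? OUTCOME_RAN_TO_END
                                       : OUTCOME_PENDING;
}

/* What can be concluded once the operation itself reports its rc. */
static sketch_outcome sketch_from_completion_rc(int op_rc)
{
    return op_rc == SKETCH_ABORTED ? OUTCOME_ABORTED : OUTCOME_RAN_TO_END;
}
```

The point of the split is that a 0 from the abort call never says the
operation was stopped; only the operation's own completion can say that.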

Thanks,
Euan



Re: [Xen-devel] Cancelling asynchronous operations in libxl

2015-02-03 Thread Ian Jackson
Euan Harris writes (Re: Cancelling asynchronous operations in libxl):
 Sorry, I didn't think you were waiting for a reply.   Your explanation
 does answer my questions, thanks.

Oh, good, thanks.

 I think that the current proposed behaviour will suit us fine.   We will
 probably treat the OK and NOTIMPLEMENTED cases in the same way, by using
 more drastic means to stop the activity if cancellation is not confirmed
 within a reasonable timeout.

Right.  I will rebase my series and get it to compile and do some
smoke tests.

Thanks,
Ian.



Re: [Xen-devel] Cancelling asynchronous operations in libxl

2015-02-02 Thread Ian Jackson
Ian Jackson writes (Re: Cancelling asynchronous operations in libxl):
 Euan Harris writes (Re: Cancelling asynchronous operations in libxl):
  The API you propose for libxl_ao_cancel, as described in the comment in
  libxl.h, looks reasonable to us.   The comment for ERROR_NOTIMPLEMENTED
  is a bit confusing: under what circumstances might a task actually be
  cancelled although libxl_ao_cancel returned ERROR_NOTIMPLEMENTED?
 
 A single operation may go through phases during which cancellation is
 effective, and phases during which it is not very effective because it
 hasn't been properly hooked up.  [etc.]

Does that explanation answer your questions ?  What did you think of
my alternative suggestions ?

Ian.



Re: [Xen-devel] Cancelling asynchronous operations in libxl

2015-01-28 Thread Ian Jackson
Euan Harris writes (Re: Cancelling asynchronous operations in libxl):
 On Tue, Jan 20, 2015 at 04:38:24PM +, Ian Jackson wrote:
   * Is an API along these lines going to meet your needs ?
 
 The API you propose for libxl_ao_cancel, as described in the comment in
 libxl.h, looks reasonable to us.   The comment for ERROR_NOTIMPLEMENTED
 is a bit confusing: under what circumstances might a task actually be
 cancelled although libxl_ao_cancel returned ERROR_NOTIMPLEMENTED?

A single operation may go through phases during which cancellation is
effective, and phases during which it is not very effective because it
hasn't been properly hooked up.  If libxl_ao_cancel is called during
the latter, it will return ERROR_NOTIMPLEMENTED but the operation will
still be marked as wanting-cancellation, so if it enters a phase where
cancellation is effective, it will stop at that point.

To put it another way, what libxl_ao_cancel does is:
  - find the ao in question, hopefully
  - make a note in the ao that it ought to be cancelled
  - look for something internal that has registered a
 cancellation hook
  - if such a hook was found, call it and return success;
 otherwise return ERROR_NOTIMPLEMENTED.

So ERROR_NOTIMPLEMENTED is more of a hint.
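
The steps above can be sketched as a standalone model.  This is not the
code in the series: sketch_ao, its fields and the SKETCH_* values are
hypothetical, the array lookup stands in for however libxl really finds
the ao, and returning a not-found code when the lookup fails is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real libxl error values or types. */
enum { SKETCH_OK = 0, SKETCH_NOTFOUND = -1, SKETCH_NOTIMPLEMENTED = -2 };

typedef struct sketch_ao {
    int id;                                   /* identifies the operation */
    bool cancel_requested;                    /* the note left in the ao */
    void (*cancel_hook)(struct sketch_ao *);  /* NULL if nothing hooked up */
    bool stopped;                             /* set by the hook below */
} sketch_ao;

/* Example hook: a cancellable phase stops as soon as it is asked to. */
static void sketch_stop_now(sketch_ao *ao) { ao->stopped = true; }

static int sketch_ao_cancel(sketch_ao *aos, size_t n, int id)
{
    assert(aos != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        sketch_ao *ao = &aos[i];
        if (ao->id != id)
            continue;                         /* find the ao in question */
        ao->cancel_requested = true;          /* make a note in the ao */
        if (ao->cancel_hook == NULL)          /* look for a registered hook */
            return SKETCH_NOTIMPLEMENTED;     /* none: this is only a hint */
        ao->cancel_hook(ao);                  /* found: call it */
        return SKETCH_OK;
    }
    return SKETCH_NOTFOUND;
}

/* Called when the operation enters a phase where cancellation works. */
static bool sketch_enter_cancellable_phase(sketch_ao *ao)
{
    ao->cancel_hook = sketch_stop_now;
    if (ao->cancel_requested)
        ao->cancel_hook(ao);                  /* honour the earlier note */
    return ao->stopped;
}
```

The last function shows why the note matters: an operation that returned
the NOTIMPLEMENTED hint still stops once it reaches a cancellable phase.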

If you prefer, it would be possible to make libxl_ao_cancel _not_ make
a note that the operation ought to be cancelled, in the case where
it's returning ERROR_NOTIMPLEMENTED.  Then the libxl_ao_cancel would
be guaranteed to have no effect.

But, if we do that, it won't be possible to mark a
currently-running-and-not-promptly-cancellable but
maybe-shortly-actually-cancellable operation as to be cancelled.

Perhaps if this is confusing the better answer is simply to return a
different error code instead of ERROR_NOTIMPLEMENTED,
  ERROR_CANCELLATION_DIFFICULT

   * Can you help me test it ?  Trying to test this in xl is going to be
 awkward and involve a lot of extraneous and very complicated signal
 handling; and AFAIAA libvirt doesn't have any cancellation
 facility.
 
 Yes, of course.   However, wouldn't it also be useful for xl to gain
 the ability to cancel long-running operations by handling SIGINT?

As I say, making xl do something with signals is a substantial piece
of work in itself.

   * Any further comments (eg, re timescales etc).
 
 None that we can think of at the moment.

Thanks,
Ian.



Re: [Xen-devel] Cancelling asynchronous operations in libxl

2015-01-28 Thread Euan Harris
Hi,

On Tue, Jan 20, 2015 at 04:38:24PM +, Ian Jackson wrote:
  * Is an API along these lines going to meet your needs ?

The API you propose for libxl_ao_cancel, as described in the comment in
libxl.h, looks reasonable to us.   The comment for ERROR_NOTIMPLEMENTED
is a bit confusing: under what circumstances might a task actually be
cancelled although libxl_ao_cancel returned ERROR_NOTIMPLEMENTED?

  * Can you help me test it ?  Trying to test this in xl is going to be
awkward and involve a lot of extraneous and very complicated signal
handling; and AFAIAA libvirt doesn't have any cancellation
facility.

Yes, of course.   However, wouldn't it also be useful for xl to gain
the ability to cancel long-running operations by handling SIGINT?

  * Any further comments (eg, re timescales etc).

None that we can think of at the moment.

Thanks,
Euan
