Re: [Xen-devel] [Qemu-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-05-02 Thread Markus Armbruster
Eric Blake writes:

> On 04/28/2017 09:42 AM, Markus Armbruster wrote:
>> Eric Blake writes:
>> 
>>> We want to track why a guest was shut down; in particular, being able
>>> to tell the difference between a guest request (such as ACPI request)
>>> and host request (such as SIGINT) will prove useful to libvirt.
>>> Since all requests eventually end up changing shutdown_requested in
>>> vl.c, the logical change is to make that value track the reason,
>>> rather than its current 0/1 contents.
>>>
>>> Since command-line options control whether a reset request is turned
>>> into a shutdown request instead, the same treatment is given to
>>> reset_requested.
>>>
>>> This patch adds a QAPI enum ShutdownCause that describes reasons
>>> that a shutdown can be requested, and changes qemu_system_reset() to
>>> pass the reason through, although for now it is not reported.  The
>>> next patch will actually wire things up to modify events to report
>>> data based on the reason, and to pass the correct enum value in from
>>> various call-sites that can trigger a reset/shutdown.  Since QAPI
>>> generates enums starting at 0, it's easier if we use a different
>>> number as our sentinel that no request has happened yet.  Most of
>>> the changes are in vl.c, but xen was using things externally.
>>>
>
>>> -static int reset_requested;
>>> -static int shutdown_requested, shutdown_signal;
>>> +static int reset_requested = -1;
>>> +static int shutdown_requested = -1, shutdown_signal;
>> 
>> Peeking ahead, I see that shutdown_requested and reset_requested take
>> ShutdownCause values and -1.  The latter means "no shutdown requested".
>> What about adding 'none' to ShutdownCause, with value 0, and use that
>> instead of literal -1?  Would avoid the unusual "negative means false,
>> non-negative means true".
>
> Works nicely if the enum is internal-use only.  Gets a bit more awkward
> if the enum is exposed to the end-user.
>
> The fact that I let QAPI generate the enum in patch 3 is evidence that
> I'm leaning towards exposing it to the end user (patch 4); if we want to
> keep it internal-only, a better place for the enum might be in sysemu.h

Yes, unless you need the generated ShutdownCause_lookup[].
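
For illustration only, a hand-written stand-in for such a lookup table
might look like the sketch below.  It assumes a reduced two-member enum;
shutdown_cause_names[] and shutdown_cause_str() are hypothetical names,
not what the QAPI generator emits.

```c
#include <assert.h>

/* Reduced, hand-written enum; the real one would come from QAPI. */
typedef enum ShutdownCause {
    SHUTDOWN_CAUSE_HOST_QMP,
    SHUTDOWN_CAUSE_GUEST_SHUTDOWN,
    SHUTDOWN_CAUSE__MAX,            /* number of members, not a cause */
} ShutdownCause;

/* Hypothetical name table, indexed by enum value. */
static const char *const shutdown_cause_names[SHUTDOWN_CAUSE__MAX] = {
    [SHUTDOWN_CAUSE_HOST_QMP] = "host-qmp",
    [SHUTDOWN_CAUSE_GUEST_SHUTDOWN] = "guest-shutdown",
};

/* Map a cause to its wire name; out-of-range values are a caller bug. */
static const char *shutdown_cause_str(ShutdownCause cause)
{
    assert((int)cause >= 0 && (int)cause < SHUTDOWN_CAUSE__MAX);
    return shutdown_cause_names[cause];
}
```

Keeping the enum internal means maintaining a table like this by hand,
which is the cost being weighed here.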

> (where we also have the weird '#define VMRESET_SILENT false' '#define
> VMRESET_REPORT true' to name a boolean parameter).

Some people believe such defines make code more readable, others hate
them.  Regardless, they're unusual in QEMU.  Unusual is best avoided.

>> PATCH 4 exposes ShutdownCause in event SHUTDOWN, and 'none' must not
>> occur there.  However, if we ever add a query-shutdown to go with this
>> event, we will need 'none' there.
>
> So, query-shutdown would basically be: what is the last-reported
> shutdown event (normally none, when the guest is still running; but if,
> like libvirt, you start qemu -no-shutdown, it can then be the cause of
> why we are in a shutdown/stopped state while waiting for final cleanup)?

Sounds right.

> How important/likely is such an event?  (Hmm, from libvirt's
> perspective, events are usually reliable, but can be lost; if we can
> restart libvirtd and reconnect to a qemu process that is hanging on to
> life only because no one has cleaned it up yet, query-shutdown does seem
> like a useful thing for libvirt to have at the time it reconnects to
> that qemu process).

Rule of thumb: if we need an event, we probably need a query, too.
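
A rough sketch of how the two could share state, assuming the enum gains
a 'none' member with value 0.  All names here (last_shutdown_cause,
report_shutdown(), query_shutdown()) are hypothetical and not part of the
series:

```c
#include <assert.h>

typedef enum ShutdownCause {
    SHUTDOWN_CAUSE_NONE = 0,        /* assumed 'none' member */
    SHUTDOWN_CAUSE_HOST_QMP,
    SHUTDOWN_CAUSE_GUEST_SHUTDOWN,
} ShutdownCause;

/* Cause carried by the most recent SHUTDOWN event, kept for queries. */
static ShutdownCause last_shutdown_cause;

/* Called where the event would be emitted; remembers the cause. */
static void report_shutdown(ShutdownCause cause)
{
    assert(cause != SHUTDOWN_CAUSE_NONE);   /* events never carry 'none' */
    last_shutdown_cause = cause;
    /* ... the SHUTDOWN event itself would be emitted here ... */
}

/* What a query-shutdown handler could return; 'none' while running. */
static ShutdownCause query_shutdown(void)
{
    return last_shutdown_cause;
}
```

A client that lost the event can call the query after reconnecting and
still learn the cause.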

> We could always include 'none' in the QAPI enum, then document in
> 'SHUTDOWN' and 'RESET' events that the cause will never be 'none'.

Yes.

> Doc
> hacks like that feel a little unclean, but not so horrible as to be
> unforgivable.

I wouldn't call it an unclean hack.  For me, it's coping with an
insufficiently expressive type system: we can't define ShutdownCause + {
'none' } as a supertype of ShutdownCause.

Even if we could, I'm not sure it would be worth the bother.

>> I'd be tempted to reshuffle declarations here, because shutdown_signal's
>> int is a different one than reset_requested's and shutdown_requested's,
>> and the latter two's "negative means false, non-negative means true" is
>> unusual enough to justify a comment.
> ...
>> 
>> Hmm.  In case we stick to literal -1: consider splitting this patch into
>> a part that changes @shutdown_requested from zero/non-zero to
>> negative/non-negative, and a part that uses ShutdownCause for the
>> non-negative values.
>
>
> You're definitely right that if the enum doesn't have a nice none=0
> state, then reshuffling to the magic -1 as no request is awkward enough
> to be done alone.
>
> But part of the answer is also dependent on whether we want PATCH 4 or
> not (or, as you brought up, the possibility of a query-shutdown command
> with even more persistent storage of the last-reported event).

Yes.

>>> @@ -1650,11 +1650,11 @@ static void qemu_kill_report(void)
>>>  static int 

Re: [Xen-devel] [Qemu-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Eric Blake
On 04/28/2017 09:42 AM, Markus Armbruster wrote:
> Eric Blake writes:
> 
>> We want to track why a guest was shut down; in particular, being able
>> to tell the difference between a guest request (such as ACPI request)
>> and host request (such as SIGINT) will prove useful to libvirt.
>> Since all requests eventually end up changing shutdown_requested in
>> vl.c, the logical change is to make that value track the reason,
>> rather than its current 0/1 contents.
>>
>> Since command-line options control whether a reset request is turned
>> into a shutdown request instead, the same treatment is given to
>> reset_requested.
>>
>> This patch adds a QAPI enum ShutdownCause that describes reasons
>> that a shutdown can be requested, and changes qemu_system_reset() to
>> pass the reason through, although for now it is not reported.  The
>> next patch will actually wire things up to modify events to report
>> data based on the reason, and to pass the correct enum value in from
>> various call-sites that can trigger a reset/shutdown.  Since QAPI
>> generates enums starting at 0, it's easier if we use a different
>> number as our sentinel that no request has happened yet.  Most of
>> the changes are in vl.c, but xen was using things externally.
>>

>> -static int reset_requested;
>> -static int shutdown_requested, shutdown_signal;
>> +static int reset_requested = -1;
>> +static int shutdown_requested = -1, shutdown_signal;
> 
> Peeking ahead, I see that shutdown_requested and reset_requested take
> ShutdownCause values and -1.  The latter means "no shutdown requested".
> What about adding 'none' to ShutdownCause, with value 0, and use that
> instead of literal -1?  Would avoid the unusual "negative means false,
> non-negative means true".

Works nicely if the enum is internal-use only.  Gets a bit more awkward
if the enum is exposed to the end-user.

The fact that I let QAPI generate the enum in patch 3 is evidence that
I'm leaning towards exposing it to the end user (patch 4); if we want to
keep it internal-only, a better place for the enum might be in sysemu.h
(where we also have the weird '#define VMRESET_SILENT false' '#define
VMRESET_REPORT true' to name a boolean parameter).

> 
> PATCH 4 exposes ShutdownCause in event SHUTDOWN, and 'none' must not
> occur there.  However, if we ever add a query-shutdown to go with this
> event, we will need 'none' there.

So, query-shutdown would basically be: what is the last-reported
shutdown event (normally none, when the guest is still running; but if,
like libvirt, you start qemu -no-shutdown, it can then be the cause of
why we are in a shutdown/stopped state while waiting for final cleanup)?

How important/likely is such an event?  (Hmm, from libvirt's
perspective, events are usually reliable, but can be lost; if we can
restart libvirtd and reconnect to a qemu process that is hanging on to
life only because no one has cleaned it up yet, query-shutdown does seem
like a useful thing for libvirt to have at the time it reconnects to
that qemu process).

We could always include 'none' in the QAPI enum, then document in
'SHUTDOWN' and 'RESET' events that the cause will never be 'none'.  Doc
hacks like that feel a little unclean, but not so horrible as to be
unforgivable.

> 
> I'd be tempted to reshuffle declarations here, because shutdown_signal's
> int is a different one than reset_requested's and shutdown_requested's,
> and the latter two's "negative means false, non-negative means true" is
> unusual enough to justify a comment.
...
> 
> Hmm.  In case we stick to literal -1: consider splitting this patch into
> a part that changes @shutdown_requested from zero/non-zero to
> negative/non-negative, and a part that uses ShutdownCause for the
> non-negative values.


You're definitely right that if the enum doesn't have a nice none=0
state, then reshuffling to the magic -1 as no request is awkward enough
to be done alone.

But part of the answer is also dependent on whether we want PATCH 4 or
not (or, as you brought up, the possibility of a query-shutdown command
with even more persistent storage of the last-reported event).


>> @@ -1650,11 +1650,11 @@ static void qemu_kill_report(void)
>>  static int qemu_reset_requested(void)
>>  {
>>      int r = reset_requested;
>> -    if (r && replay_checkpoint(CHECKPOINT_RESET_REQUESTED)) {
>> -        reset_requested = 0;
>> +    if (r >= 0 && replay_checkpoint(CHECKPOINT_RESET_REQUESTED)) {
>> +        reset_requested = -1;
>>          return r;
>>      }
>> -    return false;
>> +    return -1;
> 
> "return false" in a function returning int smells, good riddance.
> 

In one of my earlier drafts of the patch, I even tried to change the
return type from int to bool, but decided that keeping it as int worked
best (if I have to use the -1/cause dichotomy).  But you're also right
that with a 'none' value in the enum, I could directly return ShutdownCause.
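
As a sketch of that last variant, assuming a hand-written enum with a
'none' member at 0: replay_checkpoint_stub() stands in for the real
replay_checkpoint() call so the fragment is self-contained, and the
setter is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ShutdownCause {
    SHUTDOWN_CAUSE_NONE = 0,        /* assumed 'none' member */
    SHUTDOWN_CAUSE_GUEST_RESET,
} ShutdownCause;

static ShutdownCause reset_requested;   /* starts out as 'none' */

/* Stand-in for replay_checkpoint(); always permits the checkpoint. */
static bool replay_checkpoint_stub(void)
{
    return true;
}

/* Hypothetical setter: record why a reset was requested. */
static void qemu_system_reset_request(ShutdownCause cause)
{
    assert(cause != SHUTDOWN_CAUSE_NONE);
    reset_requested = cause;
}

/* Return the pending cause and clear it, or 'none' if nothing pending. */
static ShutdownCause qemu_reset_requested(void)
{
    ShutdownCause r = reset_requested;

    if (r != SHUTDOWN_CAUSE_NONE && replay_checkpoint_stub()) {
        reset_requested = SHUTDOWN_CAUSE_NONE;
        return r;
    }
    return SHUTDOWN_CAUSE_NONE;
}
```

Callers can then keep writing "if (qemu_reset_requested())", since
'none' is the only false value.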

>>  }
>>
>>  static int qemu_suspend_requested(void)
>> @@ 

Re: [Xen-devel] [Qemu-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Markus Armbruster
"Dr. David Alan Gilbert"  writes:

> * Eric Blake (ebl...@redhat.com) wrote:
>> We want to track why a guest was shut down; in particular, being able
>> to tell the difference between a guest request (such as ACPI request)
>> and host request (such as SIGINT) will prove useful to libvirt.
>> Since all requests eventually end up changing shutdown_requested in
>> vl.c, the logical change is to make that value track the reason,
>> rather than its current 0/1 contents.
>> 
>> Since command-line options control whether a reset request is turned
>> into a shutdown request instead, the same treatment is given to
>> reset_requested.
>> 
>> This patch adds a QAPI enum ShutdownCause that describes reasons
>> that a shutdown can be requested, and changes qemu_system_reset() to
>> pass the reason through, although for now it is not reported.  The
>> next patch will actually wire things up to modify events to report
>> data based on the reason, and to pass the correct enum value in from
>> various call-sites that can trigger a reset/shutdown.  Since QAPI
>> generates enums starting at 0, it's easier if we use a different
>> number as our sentinel that no request has happened yet.  Most of
>> the changes are in vl.c, but xen was using things externally.
>> 
>> Signed-off-by: Eric Blake 
>> 
>> ---
>> v4: s/ShutdownType/ShutdownCause/, no thanks to mingw header pollution
>> v3: new patch
>> ---
>>  qapi-schema.json        | 23 +++
>>  include/sysemu/sysemu.h |  2 +-
>>  vl.c                    | 44
>>  hw/i386/xen/xen-hvm.c   |  9 ++---
>>  migration/colo.c        |  2 +-
>>  migration/savevm.c      |  2 +-
>>  6 files changed, 60 insertions(+), 22 deletions(-)
>> 
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 01b087f..a4ebdd1 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -2304,6 +2304,29 @@
>>  { 'command': 'system_powerdown' }
>> 
>>  ##
>> +# @ShutdownCause:
>> +#
>> +# Enumeration of various causes for shutdown.
>> +#
>> +# @host-qmp: Reaction to a QMP command, such as 'quit'
>> +# @host-signal: Reaction to a signal, such as SIGINT
>> +# @host-ui: Reaction to a UI event, such as closing the window
>> +# @host-replay: The host is replaying an earlier shutdown event
>> +# @host-error: Qemu encountered an error that prevents further use of the guest
>> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
>> +#  other hardware-specific action
>> +# @guest-reset: The guest requested a reset, and the command line
>> +#   response to a reset is to instead trigger a shutdown
>> +# @guest-panic: The guest panicked, and the command line response to
>> +#   a panic is to trigger a shutdown
>
> It's a little coarse-grained; is there any way to pass platform-specific
> information for debug?  I ask because I spent a while debugging a few bugs
> with unexpected resets and had to figure out which of x86's many reset
> causes triggered it.

Asking for more help with debugging is fair, but I think the need is
better served by tracepoints than by exposing even more detail in QMP,
where compatibility promises apply.



Re: [Xen-devel] [Qemu-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Markus Armbruster
Eric Blake writes:

> We want to track why a guest was shut down; in particular, being able
> to tell the difference between a guest request (such as ACPI request)
> and host request (such as SIGINT) will prove useful to libvirt.
> Since all requests eventually end up changing shutdown_requested in
> vl.c, the logical change is to make that value track the reason,
> rather than its current 0/1 contents.
>
> Since command-line options control whether a reset request is turned
> into a shutdown request instead, the same treatment is given to
> reset_requested.
>
> This patch adds a QAPI enum ShutdownCause that describes reasons
> that a shutdown can be requested, and changes qemu_system_reset() to
> pass the reason through, although for now it is not reported.  The
> next patch will actually wire things up to modify events to report
> data based on the reason, and to pass the correct enum value in from
> various call-sites that can trigger a reset/shutdown.  Since QAPI
> generates enums starting at 0, it's easier if we use a different
> number as our sentinel that no request has happened yet.  Most of
> the changes are in vl.c, but xen was using things externally.
>
> Signed-off-by: Eric Blake 
>
> ---
> v4: s/ShutdownType/ShutdownCause/, no thanks to mingw header pollution
> v3: new patch
> ---
>  qapi-schema.json        | 23 +++
>  include/sysemu/sysemu.h |  2 +-
>  vl.c                    | 44
>  hw/i386/xen/xen-hvm.c   |  9 ++---
>  migration/colo.c        |  2 +-
>  migration/savevm.c      |  2 +-
>  6 files changed, 60 insertions(+), 22 deletions(-)
>
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 01b087f..a4ebdd1 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2304,6 +2304,29 @@
>  { 'command': 'system_powerdown' }
>
>  ##
> +# @ShutdownCause:
> +#
> +# Enumeration of various causes for shutdown.
> +#
> +# @host-qmp: Reaction to a QMP command, such as 'quit'
> +# @host-signal: Reaction to a signal, such as SIGINT
> +# @host-ui: Reaction to a UI event, such as closing the window
> +# @host-replay: The host is replaying an earlier shutdown event
> +# @host-error: Qemu encountered an error that prevents further use of the guest
> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
> +#  other hardware-specific action
> +# @guest-reset: The guest requested a reset, and the command line
> +#   response to a reset is to instead trigger a shutdown
> +# @guest-panic: The guest panicked, and the command line response to
> +#   a panic is to trigger a shutdown
> +#
> +# Since: 2.10
> +##
> +{ 'enum': 'ShutdownCause',
> +  'data': [ 'host-qmp', 'host-signal', 'host-ui', 'host-replay', 'host-error',
> +            'guest-shutdown', 'guest-reset', 'guest-panic' ] }
> +
> +##
>  # @cpu:
>  #
>  # This command is a nop that is only provided for the purposes of compatibility.
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 16175f7..00a907f 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -65,7 +65,7 @@ bool qemu_vmstop_requested(RunState *r);
>  int qemu_shutdown_requested_get(void);
>  int qemu_reset_requested_get(void);
>  void qemu_system_killed(int signal, pid_t pid);
> -void qemu_system_reset(bool report);
> +void qemu_system_reset(bool report, int reason);
>  void qemu_system_guest_panicked(GuestPanicInformation *info);
>  size_t qemu_target_page_size(void);
>
> diff --git a/vl.c b/vl.c
> index 879786a..2b95b7f 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1597,8 +1597,8 @@ void vm_state_notify(int running, RunState state)
>  }
>  }
>
> -static int reset_requested;
> -static int shutdown_requested, shutdown_signal;
> +static int reset_requested = -1;
> +static int shutdown_requested = -1, shutdown_signal;

Peeking ahead, I see that shutdown_requested and reset_requested take
ShutdownCause values and -1.  The latter means "no shutdown requested".
What about adding 'none' to ShutdownCause, with value 0, and use that
instead of literal -1?  Would avoid the unusual "negative means false,
non-negative means true".

PATCH 4 exposes ShutdownCause in event SHUTDOWN, and 'none' must not
occur there.  However, if we ever add a query-shutdown to go with this
event, we will need 'none' there.
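
To make the suggestion concrete, a minimal sketch, assuming a
hand-written enum in place of the QAPI-generated one.  The reduced member
list and the request_shutdown()/take_shutdown_request() helpers are
hypothetical, and the plain assignment ignores the atomic exchange the
real code uses:

```c
#include <assert.h>

/* 'none' is value 0, so a zero-initialized static means "no request"
 * and ordinary truth tests keep working. */
typedef enum ShutdownCause {
    SHUTDOWN_CAUSE_NONE = 0,
    SHUTDOWN_CAUSE_HOST_QMP,
    SHUTDOWN_CAUSE_HOST_SIGNAL,
    SHUTDOWN_CAUSE_GUEST_SHUTDOWN,
} ShutdownCause;

static ShutdownCause shutdown_requested;    /* implicitly 'none' */

/* Record why a shutdown was requested; 'none' is not a valid request. */
static void request_shutdown(ShutdownCause cause)
{
    assert(cause != SHUTDOWN_CAUSE_NONE);
    shutdown_requested = cause;
}

/* Return the pending cause (or 'none') and clear it. */
static ShutdownCause take_shutdown_request(void)
{
    ShutdownCause cause = shutdown_requested;

    shutdown_requested = SHUTDOWN_CAUSE_NONE;
    return cause;
}
```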

I'd be tempted to reshuffle declarations here, because shutdown_signal's
int is a different one than reset_requested's and shutdown_requested's,
and the latter two's "negative means false, non-negative means true" is
unusual enough to justify a comment.
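
Roughly what I have in mind, if the literal -1 stays.  The comments are
the point; request_pending() and set_request() are hypothetical helpers
added only to show the convention in use:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Negative means "no request pending"; a non-negative value is the
 * cause of the pending request (a ShutdownCause value in the patch).
 */
static int reset_requested = -1;
static int shutdown_requested = -1;

/* A different kind of int: the signal number that caused a shutdown. */
static int shutdown_signal;

/* Hypothetical helper that hides the unusual sign convention. */
static bool request_pending(int requested)
{
    return requested >= 0;
}

/* Hypothetical setter; causes are never negative. */
static void set_request(int *slot, int cause)
{
    assert(cause >= 0);
    *slot = cause;
}
```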

>  static pid_t shutdown_pid;
>  static int powerdown_requested;
>  static int debug_requested;
> @@ -1624,7 +1624,7 @@ int qemu_reset_requested_get(void)
>
>  static int qemu_shutdown_requested(void)
>  {
> -    return atomic_xchg(&shutdown_requested, 0);
> +    return atomic_xchg(&shutdown_requested, -1);
>  }

Hmm.  In case we stick to literal -1: