On 17 May 2013 11:46, Michele Tartara <[email protected]> wrote:
> Ganeti is currently not able to detect a legit shutdown request performed by a
> user from inside a Xen domain.
>
> This patch provides a design document to implement a mechanism able to cope with
> such events.
>
> Signed-off-by: Michele Tartara <[email protected]>
> ---
> Makefile.am | 1 +
> doc/design-draft.rst | 1 +
> doc/design-internal-shutdown.rst | 72 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 74 insertions(+)
> create mode 100644 doc/design-internal-shutdown.rst
>
> diff --git a/Makefile.am b/Makefile.am
> index 037cf53..f66624e 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -410,6 +410,7 @@ docinput = \
>  doc/design-htools-2.3.rst \
>  doc/design-http-server.rst \
>  doc/design-impexp2.rst \
> + doc/design-internal-shutdown.rst \
>  doc/design-lu-generated-jobs.rst \
>  doc/design-linuxha.rst \
>  doc/design-multi-reloc.rst \
> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
> index ccb2f93..9a1d2b1 100644
> --- a/doc/design-draft.rst
> +++ b/doc/design-draft.rst
> @@ -19,6 +19,7 @@ Design document drafts
>  design-storagetypes.rst
>  design-reason-trail.rst
>  design-device-uuid-name.rst
> + design-internal-shutdown.rst
>
>  .. vim: set textwidth=72 :
>  .. Local Variables:
> diff --git a/doc/design-internal-shutdown.rst b/doc/design-internal-shutdown.rst
> new file mode 100644
> index 0000000..836d00c
> --- /dev/null
> +++ b/doc/design-internal-shutdown.rst
> @@ -0,0 +1,72 @@
> +============================================================
> +Detection of user-initiated shutdown from inside an instance
> +============================================================
> +
> +.. contents:: :depth: 2
> +
> +This is a design document detailing the implementation of a way for Ganeti to
> +detect whether a machine marked as up but not running was shutdown gracefully
> +by the user from inside the machine itself.
> +
> +Current state and shortcomings
> +==============================
> +
> +Ganeti keeps track of the desired status of instances in order to be able to
> +take proper actions (e.g.: reboot) on the ones that happen to crash.
> +Currently, the only way to properly shut down a machine is through Ganeti's own
> +commands, that will mark an instance as ``ADMIN_down``.
> +If a user shuts down an instance from inside, through the proper command of the
> +operating system it is running, the instance will be shutdown gracefully, but
> +Ganeti is not aware of that: the desired status of the instance will still be
> +marked as ``running``, so when the watcher realises that the instance is down,
> +it will restart it. This behaviour is usually not what the user expects.
> +
> +Proposed changes
> +================
> +
> +We propose to modify Ganeti in such a way that it will detect when an instance
> +was shutdown because of an explicit user request. When such a situation is
> +detected, the state of the instance will be set to ADMIN_down, as intended by
> +the user.
> +
> +This design document applies to the Xen backend of Ganeti, because it uses
> +features specific of such hypervisor.
> +
> +Implementation
> +==============
> +
> +Xen knows why a domain is being shut down (a crash or an explicit shutdown
> +or poweroff request), but such information is not usually readily available
> +externally, because all such cases lead to the virtual machine being destroyed
> +immediately after the event is detected.
> +
> +Still, Xen allows the instance configuration file to define what action to be
> +taken in all those cases through the ``on_poweroff``, ``on_shutdown`` and
> +``on_crash`` variables. By setting them to ``preserve``, Xen will avoid
> +destroying the domains automatically.
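
To make the config side concrete, here is a minimal sketch of how the extra
lines could be generated for an instance config file. The helper name
render_event_actions is hypothetical and not part of Ganeti, and the variable
names are copied verbatim from the paragraph above; they should be checked
against the Xen version in use before relying on them.

```python
# Variable names as listed in the design text; an assumption to verify
# against the documentation of the Xen toolstack actually deployed.
EVENT_VARIABLES = ("on_poweroff", "on_shutdown", "on_crash")


def render_event_actions(action="preserve", variables=EVENT_VARIABLES):
    """Return config lines that set every event variable to `action`.

    Hypothetical helper: it only builds text in the `name = 'value'`
    form used by Xen domain config files.
    """
    return "".join("%s = '%s'\n" % (var, action) for var in variables)


if __name__ == "__main__":
    # Print the lines that would be appended to the instance config.
    print(render_event_actions(), end="")
```

Keeping the variable list as a parameter means the set of events can be
adjusted per toolstack without touching the rendering logic.
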
> +
> +When the domain is not destroyed, it can be viewed by using ``xm list`` (or ``xl
> +list`` in newer Xen versions), and the ``State`` field of the output will
> +provide useful information.
> +
> +If the state is ``----c-`` it means the instance has crashed.
> +
> +If the state is ``---s--`` it means the instance was properly shutdown.
> +
> +If the instance was properly shutdown and it is still marked as ``running`` by
> +Ganeti, it means that it was shutdown from inside by the user, and the ganeti
> +status of the instance needs to be changed to ``ADMIN_down``.
> +
> +This will be done at regular intervals by the group watcher, just before
> +deciding which instances to reboot.
> +
> +On top of that, at the same times, the watcher will also need to issue ``xm
> +destroy`` commands for all the domains that are in crashed or shutdown state,
> +since this will not be done automatically by Xen anymore because of the
> +``preserve`` setting in their config files.
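
The check described above could be sketched roughly as follows. This is only
an illustration, not Ganeti code: the function names, the admin_state mapping
and its "running" value are hypothetical, and the parsing assumes the usual
``xm list`` column layout (Name, ID, Mem, VCPUs, State, Time(s)).

```python
def parse_states(xm_list_output):
    """Map domain name -> State flags from `xm list`-style text."""
    states = {}
    for line in xm_list_output.splitlines()[1:]:  # skip the header row
        fields = line.rsplit(None, 5)
        if len(fields) != 6:
            continue  # blank or malformed line
        name, _id, _mem, _vcpus, flags, _time = fields
        states[name] = flags
    return states


def classify(flags):
    """Classify using the two patterns named in the design text."""
    if flags == "----c-":
        return "crashed"
    if flags == "---s--":
        return "shutdown"
    return "other"


def plan(states, admin_state):
    """Return (instances to mark ADMIN_down, domains to destroy).

    admin_state is a hypothetical mapping: instance name -> desired
    status as recorded by Ganeti ("running" or "ADMIN_down").
    """
    to_mark_down = []
    to_destroy = []
    for name in sorted(states):
        kind = classify(states[name])
        if kind == "other":
            continue
        # Preserved domains are no longer cleaned up by Xen itself.
        to_destroy.append(name)
        if kind == "shutdown" and admin_state.get(name) == "running":
            # Clean shutdown while still wanted up: user did it from inside.
            to_mark_down.append(name)
    return to_mark_down, to_destroy
```

Crashed domains end up only in the destroy list, so the existing watcher
logic would still be free to restart them afterwards.
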

I think that should also be done by gnt-instance start and similar commands,
since they could be issued before the watcher runs.

Also, what happens to the output of gnt-instance list? Will it be correct?

Bernardo

> +
> +.. vim: set textwidth=72 :
> +.. Local Variables:
> +.. mode: rst
> +.. fill-column: 72
> +.. End:
> --
> 1.8.2.1
>
