Just wanted to ping on this again. I'm confident enough in it if widow
doesn't get a chance to respond.
On Mar 13, 2013 1:27 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:

> Ok, the fix for CLOUDSTACK-600 is the first patch mentioned plus two
> fixes to it. The first additional patch handles upgrades, and the
> second one is a minor fix for when we shut down a non-persistent VM
> that will avoid printing useless scary messages (we throw an error in
> the timeout loop that waits for the vm to shut down when it succeeds,
> because the domain can't be found).
>
> I've tested this patch set both in a fresh 4.1 install and a fresh
> master install, and both were able to start/stop VMs successfully.
> When upgrading an existing 4.1 install (without the patches), the VMs
> were undefined on shutdown as they should be.
>
> commit 5dfcd309f10e5bd6a918f7fdff3f44a3dff2374a
> Author: Wido den Hollander <w...@widodh.nl>
> Date:   Thu Feb 7 22:58:20 2013 +0100
>
>     agent: Do not define domains persistent in libvirt
>
>     We used to define domains persistent in libvirt, which caused XML
> definitionsto stay there after a reboot of the hypervisor.
>
>     We however don't do anything with those already defined domains,
> actually, we wipe all defined domains when starting the agent.
>
>     Some users however reported that libvirt started these domains
> after a reboot before the CloudStack agent was started.
>
>     By starting domains from the XML description and not defining them
> we prevent them from ever being stored in libvirt.
>
> commit 8d7d1cd5623a6744a954fb35eeff8408d6381083
> Author: Marcus Sorensen <mar...@betterservers.com>
> Date:   Wed Mar 13 11:10:56 2013 -0600
>
>     Summary: KVM - undefine persistent VMs on stop
>
>     Detail: A previous patch fixed an issue where we are defining VMs to
> persist
>     locally on KVM hosts, which can cause issues if the agent isn't
> running and
>     libvirt decides to start the VM unbeknownst to cloudstack. The
> previous patch
>     stopped defining VMs as persistent. This patch adds compatibility
> for existing
>     cloudstack environments, removing the persistent definition on
> stop if needed.
>
>     BUG-ID: CLOUDSTACK-600
>
> commit 97d2e3fe7772fa01941295397f9d59d35cf47671
> Author: Marcus Sorensen <mar...@betterservers.com>
> Date:   Wed Mar 13 12:57:46 2013 -0600
>
>     Summary: KVM - remove harmless message about domain not found on VM
> stop
>
>     Detail: When we stop a VM, it's definition is no longer valid.
> Therefore, we
>     need to catch the exception thrown from libvirt in trying to lookup a
>     non-existent domain by UUID while trying to check if it's shut down.
>
>     BUG-ID:CLOUDSTACK-600
>     Signed-off-by: Marcus Sorensen <mar...@betterservers.com> 1363201066
> -0600
>
>
> On Tue, Mar 12, 2013 at 11:49 PM, Marcus Sorensen <shadow...@gmail.com>
> wrote:
> > Looks like this patch might need to be adjusted slightly.  We can use
> > the domain.isPersistent() method to determine if we need to call
> > domain.undefine() in addition to dm.shutdown and dm.destroy. That will
> > allow upgrades to clean up existing definitions as VMs are shutdown.
> > If you don't have a chance to get to that, I'll try to tomorrow.
> >
> > On Tue, Mar 12, 2013 at 7:23 PM, Marcus Sorensen <shadow...@gmail.com>
> wrote:
> >> To clarify, this is a fairly serious bug, and it should probably be
> >> addressed, I'm just wanting clarification that the patch is good to go
> >> as a fix. On my way home I was thinking about how I'm not sure if it
> >> addresses upgrades, i.e. does it still allow existing definitions to
> >> be wiped, or will those be forever stuck. I haven't even looked yet.
> >>
> >> Essentially, the bug is that if a KVM host is uncleanly shut down, and
> >> then brought back up without starting the agent, the VMs that were
> >> running on the system will be started by libvirt/startup scripts,
> >> which can be disastrous for shared-storage VMs.  So if a host goes
> >> down without being put into maintenance mode, for whatever reason, and
> >> the agent is either slow to start, the agent fails to start for some
> >> reason, or auto-start is disabled for some reason, you're almost
> >> guaranteed to have two VMs running off of the same shared storage
> >> image.
> >>
> >> We don't want to make any VMs permanently known to a host, and libvirt
> >> provides a simple way to define a VM domain without making it
> >> permanent.
> >>
> >> On Tue, Mar 12, 2013 at 6:42 PM, Chip Childers
> >> <chip.child...@sungard.com> wrote:
> >>> On Tue, Mar 12, 2013 at 06:39:09PM -0600, Marcus Sorensen wrote:
> >>>> This addresses CLOUDSTACK-600.  Please get buyoff from Wido if
> >>>> possible, since it's his patch. I've reviewed and tested it, but I
> >>>> want to make sure I'm not wrong in assuming that this should be in 4.1
> >>>>
> >>>> commit 5dfcd309f10e5bd6a918f7fdff3f44a3dff2374a
> >>>> Author: Wido den Hollander <w...@widodh.nl>
> >>>> Date:   Thu Feb 7 22:58:20 2013 +0100
> >>>>
> >>>>     agent: Do not define domains persistent in libvirt
> >>>>
> >>>>     We used to define domains persistent in libvirt, which caused XML
> >>>> definitions
> >>>>     to stay there after a reboot of the hypervisor.
> >>>>
> >>>>     We however don't do anything with those already defined domains,
> >>>> actually, we wipe
> >>>>     all defined domains when starting the agent.
> >>>>
> >>>>     Some users however reported that libvirt started these domains
> >>>> after a reboot
> >>>>     before the CloudStack agent was started.
> >>>>
> >>>>     By starting domains from the XML description and not defining them
> >>>> we prevent
> >>>>     them from ever being stored in libvirt.
> >>>>
> >>>
> >>> Wido - I'll let you comment before we apply this.
>

Reply via email to