On 03/05/2013 07:28 PM, Edison Su wrote:


-----Original Message-----
From: Marcus Sorensen [mailto:shadow...@gmail.com]
Sent: Monday, March 04, 2013 10:26 PM
To: cloudstack-dev@incubator.apache.org
Subject: Re: [DISCUSS] getting rid of KVM patchdisk

I've been thinking about how to tackle this and have written a little concept
code; it seems fairly straightforward to include our own little python daemon
that speaks JSON over this local character device in the system vm. I'm
assuming we'd start it up at the beginning of cloud-early-config.

What I'm not certain of is how to get the 'cmdline' bits into the system before
cloud-early-config needs them. Do we block in cloud-early-config, waiting for
the cmdline file before continuing, and push it via StartCommand?
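
One way that push could look, as a rough sketch (the per-VM socket path and
the one-JSON-object-per-line framing are assumptions for illustration, not an
existing CloudStack API):

    # Hypothetical host-side push: while handling StartCommand, the KVM
    # agent writes the cmdline to the guest daemon over the VM's unix
    # socket. Socket path and message framing are assumptions.
    import json
    import socket

    def push_cmdline(vm_name, cmdline):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect('/var/lib/libvirt/qemu/%s.agent' % vm_name)
        try:
            # one JSON object per line so the guest can parse incrementally
            msg = json.dumps({'type': 'cmdline', 'data': cmdline}) + '\n'
            sock.sendall(msg.encode('utf-8'))
        finally:
            sock.close()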

We put a lot of logic into init scripts inside the system vm, which 
unnecessarily complicates system vm programming:
1. An init script is not portable: if people want to use another Linux 
distribution as the system vm OS, they have to write their own init scripts.
2. An init script is not easy to hack: it has its own dialect (how to log 
messages, how to declare dependencies, etc.).
3. An init script runs in a limited environment (some system-wide services 
are not available), which limits what you can do in it.

Maybe we need to start working on a new system vm programming model now? 
Better to just put a python daemon inside the system vm and expose a RESTful 
API on the link-local IP address (or the private IP if it's VMware); then the 
mgt server or hypervisor agent code can send commands to the python daemon 
over HTTP instead of ssh.
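
A minimal sketch of what such a daemon could look like (the port and the
/command endpoint are made-up placeholders, not an agreed-on API; the bind
address is the link-local eth0ip from the cmdline example further down):

    # Minimal sketch of an in-guest HTTP control daemon. Port and
    # /command endpoint are hypothetical; a real daemon would dispatch
    # to the scripts the system vm ships today.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class CommandHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != '/command':
                self.send_error(404)
                return
            length = int(self.headers.get('Content-Length', 0))
            cmd = json.loads(self.rfile.read(length))
            # a real daemon would dispatch on cmd['name']; just echo it back
            body = json.dumps({'result': 'ok',
                               'command': cmd.get('name')}).encode()
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == '__main__':
        # bind to the link-local address so only the host side can reach it
        HTTPServer(('169.254.1.46', 8001), CommandHandler).serve_forever()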

In your case, the python daemon needs to wait on a well-defined serial 
port (e.g. /dev/virtio-ports/org.apache.cloudstack.guest.agent), get the 
cmdline, then program the system vm itself and reboot.
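
Roughly, that guest-side loop could look like this (the one-message-per-line
framing and the /var/cache/cloud/cmdline target path are assumptions):

    # Sketch of the guest daemon: block on the virtio-serial port until
    # the cmdline arrives, persist it where cloud-early-config can read
    # it, then reboot. Framing and file path are assumptions.
    import json
    import os

    PORT = '/dev/virtio-ports/org.apache.cloudstack.guest.agent'
    CMDLINE = '/var/cache/cloud/cmdline'

    def wait_for_cmdline():
        with open(PORT, 'rb') as port:
            buf = b''
            while True:
                buf += port.read(1)  # blocks until the host writes
                if buf.endswith(b'\n'):
                    msg = json.loads(buf.decode('utf-8'))
                    if msg.get('type') == 'cmdline':
                        return msg['data']
                    buf = b''

    if __name__ == '__main__':
        with open(CMDLINE, 'w') as f:
            f.write(wait_for_cmdline())
        os.system('reboot')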


This seems like a very sensible thing to do. I created Jira tickets about this last November.

I haven't been able to look at it yet, but having a Python daemon running which does everything on a Read-Only (!!!) filesystem would be awesome.

The reason I mention the read-only filesystem is that it would make the system VMs much more resilient against SAN issues. Assume a read-only FS, make them stateless, and store everything on a tmpfs or in memory somewhere.

(Would be something cool to discuss at a Collab Conference ;))

Wido




On Mon, Mar 4, 2013 at 5:27 PM, Marcus Sorensen <shadow...@gmail.com>
wrote:
I tested this with Rohit's systemvm from master. It works fine,
provided you install the qemu-guest-agent software and modify the
libvirt xml definition of the system vm to include something like:

    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/v-2-VM.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>

Then on the host you can connect to the
/var/lib/libvirt/qemu/v-2-VM.agent unix socket and send QMP JSON to do
things like write files. We can't execute the various scripts through
it, but we also don't have to use qemu-ga; we could have our own thing
listening on the unix socket.
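
For reference, a file write through qemu-ga over that socket looks roughly
like this (guest-file-open/write/close are real qemu-ga commands; the payload
and target path here are just placeholders):

    # Write a file inside the guest via qemu-guest-agent, speaking
    # QMP-style JSON on the VM's unix socket from the host.
    import base64
    import json
    import socket

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect('/var/lib/libvirt/qemu/v-2-VM.agent')
    reader = sock.makefile('r')

    def ga(execute, arguments=None):
        cmd = {'execute': execute}
        if arguments is not None:
            cmd['arguments'] = arguments
        sock.sendall((json.dumps(cmd) + '\n').encode('utf-8'))
        return json.loads(reader.readline())['return']

    handle = ga('guest-file-open',
                {'path': '/var/cache/cloud/cmdline', 'mode': 'w'})
    ga('guest-file-write',
       {'handle': handle,
        'buf-b64': base64.b64encode(b'placeholder cmdline').decode()})
    ga('guest-file-close', {'handle': handle})
    sock.close()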



On Mon, Mar 4, 2013 at 3:24 PM, Marcus Sorensen
<shadow...@gmail.com> wrote:
I think this just requires an updated system vm (the virtio-serial
portion). I've played a bit with the old debian 2.6.32-5-686-bigmem
one and can't get the device nodes to show up, even though the
/boot/config shows that it has CONFIG_VIRTIO_CONSOLE=y. However, if I
try this with a CentOS 6.3 VM, on a CentOS 6.3 or Ubuntu 12.04 KVM
host it works. So I'm not sure what's being used for the ipv6 update,
but we can probably make one that works. We'll need to install
qemu-ga and start it within the systemvm as well.

On Mon, Mar 4, 2013 at 12:41 PM, Edison Su <edison...@citrix.com>
wrote:


-----Original Message-----
From: Marcus Sorensen [mailto:shadow...@gmail.com]
Sent: Sunday, March 03, 2013 12:13 PM
To: cloudstack-dev@incubator.apache.org
Subject: [DISCUSS] getting rid of KVM patchdisk

For those who don't know (this probably doesn't matter, but...),
when KVM brings up a system VM, it creates a 'patchdisk' on primary
storage. This patchdisk is used to pass along 1) the authorized_keys file
and 2) a 'cmdline'
file that describes to the systemvm startup services all of the
various properties of the system vm.

Example cmdline file:

  template=domP type=secstorage host=172.17.10.10 port=8250 name=s-1-VM
  zone=1 pod=1 guid=s-1-VM
  resource=com.cloud.storage.resource.NfsSecondaryStorageResource
  instance=SecStorage sslcopy=true role=templateProcessor mtu=1500
  eth2ip=192.168.100.170 eth2mask=255.255.255.0 gateway=192.168.100.1
  public.network.device=eth2 eth0ip=169.254.1.46 eth0mask=255.255.0.0
  eth1ip=172.17.10.150 eth1mask=255.255.255.0 mgmtcidr=172.17.10.0/24
  localgw=172.17.10.1 private.network.device=eth1 eth3ip=172.17.10.192
  eth3mask=255.255.255.0 storageip=172.17.10.192
  storagenetmask=255.255.255.0 storagegateway=172.17.10.1
  internaldns1=8.8.4.4 dns1=8.8.8.8
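
(Since it's just space-separated key=value pairs, turning it into a dict of
properties on the guest side is a one-liner; a sketch, with the cmdline path
assumed:)

    # Parse the cmdline format shown above into a dict of properties.
    def parse_cmdline(text):
        return dict(pair.split('=', 1) for pair in text.split())

    with open('/var/cache/cloud/cmdline') as f:
        props = parse_cmdline(f.read())
    # e.g. props['type'] == 'secstorage', props['eth2ip'] == '192.168.100.170'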

This patch disk has been bugging me for a while, as it creates a
volume that isn't really tracked anywhere or known about in
cloudstack's database. Up until recently these would just litter
the KVM primary storages, but there's been some triage done to
attempt to clean them up when the system vms go away. It's not
perfect. It can also be inefficient for certain primary storage
types, for example if you end up creating a bunch of 10MB LUNs on a
SAN for these.

So my question goes to those who have been working on the system
vm.
My first preference (aside from a full system vm redesign, perhaps
something that is controlled via an API) would be to copy these up
to the system vm via SCP or something. But the cloud services start
so early on that this isn't possible. Next would be to inject them
into the system vm's root disk before starting the server, but if
we're allowing people to make their own system vms, can we count on
the partitions being what we expect? Also, I don't think this will
work for RBD, which qemu connects to directly, so the host OS never
sees a disk at all.

Options?

Could you take a look at the status of these projects in KVM?
http://wiki.qemu.org/Features/QAPI/GuestAgent
https://fedoraproject.org/wiki/Features/VirtioSerial

Basically, we need a way to talk to the guest VM (sending parameters to a
KVM guest) after the VM has booted up. Both VMware and XenServer have their
own way to send parameters to a guest VM through their PV drivers, but until
a few years ago there was no such thing for KVM.
