Ewan, can you explain why you say dom0 CPU is a scarce resource? I agree that for a lot of reasons work like this should be done in a domU; I'm just curious. My thought would have been that it's not so scarce. I know there are things like the disk drivers running in the dom0 kernel doing disk I/O, but I'd think that wouldn't be much CPU usage; it'd be mostly I/O wait. And I wouldn't think network receive in dom0 vs. domU would make much of a difference overall. I thought the hypervisor scheduled dom0 and domUs similarly. Am I wrong?
The only thing I can think of is that when running HVM VMs, qemu can be using a lot of CPU.

- Chris

On Feb 16, 2011, at 7:12 AM, Ewan Mellor wrote:

> Just for summary, the advantages of having the streaming inside a domU are:
>
> 1. You move the network receive and the image decompression / decryption
>    (if you're using that) off dom0's CPU and onto the domU's. Dom0 CPU is a
>    scarce resource, even in the new release of XenServer with 4 CPUs in
>    domain 0. This avoids hurting customer workloads by contending inside
>    domain 0.
> 2. You can easily apply network and CPU QoS to the operations above. This
>    also avoids hurting customer workloads, by simply capping the maximum
>    amount of work that the OpenStack domU can do.
> 3. You can use Python 2.6 for OpenStack, even though XenServer dom0 is
>    stuck on CentOS 5.5 (Python 2.4).
> 4. You get a minor security improvement, because you get to keep a
>    network-facing service out of domain 0.
>
> So, this is all fine if you're streaming direct to disk, but as you say, if
> you want to stream VHD files you have a problem, because the VHD needs to
> go into a filesystem mounted in domain 0. It's not possible to write from a
> domU into a dom0-owned filesystem without some trickery. Here are the
> options as I see them:
>
> Option A: Stream in two stages: first from Glance to the domU, then from
> the domU to dom0. The stream from domU to dom0 could be a really simple
> network put, and would just fit on the end of the current pipeline. You
> lose a bit of dom0 CPU because of the incoming stream, and it's less
> efficient overall because of the two hops. Its primary advantage is that
> you can still do most of the work inside the domU, so if you intend to
> decompress and/or decrypt locally, this would likely be a win.
>
> Option B: Stream from Glance directly into dom0. This would be a xapi
> plugin acting as a Glance client. This is the simplest solution, but it
> loses all the benefits above. I think it's the one that you're suggesting
> below. It leaves you with performance problems similar to the ones you
> suffer today on your existing architecture. The advantage here is
> simplicity, and it's certainly worth considering.
>
> Option C: Run an NFS server in domain 0, and mount that inside the domU.
> You can then write directly to dom0's filesystem from the domU. This sounds
> plausible, but I don't think I recommend it. The load on dom0 of doing this
> is probably no better than with Options A or B, which would mean that the
> complexity wasn't worth it.
>
> Option D: Unpack the VHD file inside the domU, and write it through the PV
> path. This is probably the option that you haven't considered yet. The same
> VHD parsing code that we use in domain 0 is also available in an easily
> consumable form (called libvhdio). This can be used to take a VHD file from
> the network and parse it, so that you can write the allocated blocks
> directly to the VDI. This would have all the advantages above, but it adds
> yet another moving part to the pipeline. Also, this is only going to be
> simple if you're just using VHDs as a way to handle sparseness. If you
> expect to stream a whole tree of snapshots as multiple files, and then
> expect all the relationships between the files to get wired up correctly,
> then this is not the solution you're looking for. It's technically doable,
> but it's very fiddly.
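[As an aside, to make Option D concrete: below is a very rough sketch of what "parse the VHD in the domU and write the allocated blocks to the VDI" means for a plain dynamic VHD. The field layout comes from the published VHD spec; I don't know what libvhdio's Python bindings look like, so this just uses struct directly, skips checksum verification and differencing disks, and the paths are invented.]

# Rough sketch of the Option D idea: parse a (non-differencing) dynamic VHD
# in the domU and write only the allocated blocks through to the plugged VDI
# block device.  Field offsets are taken from the VHD specification; this is
# an illustration, not libvhdio's API.  Paths are hypothetical.

import struct

SECTOR = 512

def copy_allocated_blocks(vhd_path, vdi_dev):
    vhd = open(vhd_path, 'rb')
    out = open(vdi_dev, 'wb')

    # A dynamic VHD starts with a 512-byte copy of the footer.
    footer = vhd.read(SECTOR)
    (cookie, _features, _version, data_offset) = struct.unpack('>8sIIQ', footer[:24])
    assert cookie == 'conectix', 'not a VHD'

    # The 1024-byte dynamic disk header, normally at offset 512.
    vhd.seek(data_offset)
    dyn = vhd.read(1024)
    (dcookie, _doff, table_offset, _hver,
     max_entries, block_size) = struct.unpack('>8sQQIII', dyn[:36])
    assert dcookie == 'cxsparse', 'not a dynamic VHD'

    # Block Allocation Table: one big-endian uint32 per block, giving the
    # sector offset of that block's bitmap, or 0xFFFFFFFF if unallocated.
    vhd.seek(table_offset)
    bat = struct.unpack('>%dI' % max_entries, vhd.read(4 * max_entries))

    # Each data block is preceded by a sector bitmap, padded to a sector.
    bitmap_bytes = (block_size / SECTOR + 7) / 8
    bitmap_sectors = (bitmap_bytes + SECTOR - 1) / SECTOR

    for i, entry in enumerate(bat):
        if entry == 0xFFFFFFFF:
            continue                      # sparse block: nothing to write
        vhd.seek((entry + bitmap_sectors) * SECTOR)
        data = vhd.read(block_size)
        out.seek(i * block_size)          # virtual offset of block i
        out.write(data)

    out.close()
    vhd.close()

# e.g. copy_allocated_blocks('/tmp/image.vhd', '/dev/xvdb')  # hypothetical paths

[A real implementation would want to consume the stream incrementally and verify the checksums, but this shows the basic block-map walk that libvhdio would be doing for you.]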
> So, in summary:
>
> Option A: Two hops. Ideal if you're worried about the cost of decompressing
> / decrypting on the host.
> Option B: Direct to dom0. Ideal if you want the simplest solution.
> Option D: Parse the VHD. Probably the best performance. Fiddly development
> work required. Not a good idea if you want to work with trees of VHDs.
>
> Where do you think you stand? I can advise in more detail about the
> implementation, if you have a particular option that you prefer.
>
> Cheers.
>
> Ewan.
>
>
> From: openstack-xenapi-bounces+ewan.mellor=citrix....@lists.launchpad.net
> [mailto:openstack-xenapi-bounces+ewan.mellor=citrix....@lists.launchpad.net]
> On Behalf Of Rick Harris
> Sent: 11 February 2011 22:13
> To: openstack-xenapi@lists.launchpad.net
> Subject: [Openstack-xenapi] Glance Plugin/DomU access to SR?
>
> We recently moved to running the compute-worker within a domU instance.
>
> We could make this move because domU can access VBDs in dom0-space by
> performing a VBD.plug.
>
> The problem is that we'd like to deal with whole VHDs rather than with a
> kernel, a ramdisk, and partitioning (the impetus of the unified-images BP).
>
> So, for snapshots we stream the base-copy VHD held in the SR into Glance,
> and likewise, for restores, we stream the snapshot VHD from Glance into the
> SR, rescan, and then spin up the instance.
>
> The problem is: now that we're running the compute-worker in domU, how can
> we access the SR? Is there a way we can map it into domU space (a la
> VBD.plug)?
>
> The way we solved this for snapshots was by using the Glance plugin and
> performing these operations in dom0.
>
> So, my questions are:
>
> 1. Are SR operations something we need to use the Glance plugin for?
>
> 2. If we must use a dom0 plugin for this method of restore, does it make
>    sense to just do everything image-related in the plugin?
>
> -Rick
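[For anyone weighing the simpler route, here is roughly what the dom0 plugin approach (Ewan's Option B, and what the Glance plugin already does for snapshots) looks like. This is a sketch, not the actual Glance plugin: the plugin name, its arguments, and the SR path layout are made up, but the XenAPIPlugin dispatch pattern and host.call_plugin are the standard xapi plugin mechanism, and it sticks to Python 2.4 since it runs in dom0.]

# Stripped-down sketch of a dom0 xapi plugin, dropped into
# /etc/xapi.d/plugins/, that pulls a VHD from Glance straight into the SR's
# directory and leaves the rescan / VDI wiring to the caller.  Argument names
# and the SR path layout are hypothetical.

import os
import urllib2

import XenAPIPlugin

CHUNK = 8192

def download_vhd(session, args):
    """Stream a VHD from Glance into the SR's filesystem (runs in dom0)."""
    glance_url = args['glance_url']   # e.g. http://glance-host:9292/images/<id>
    sr_path = args['sr_path']         # e.g. /var/run/sr-mount/<sr-uuid>
    vdi_uuid = args['vdi_uuid']       # name the file so the SR picks it up

    vhd_path = os.path.join(sr_path, '%s.vhd' % vdi_uuid)
    src = urllib2.urlopen(glance_url)
    dst = open(vhd_path, 'wb')
    try:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(chunk)
    finally:
        dst.close()
        src.close()
    return vhd_path

if __name__ == '__main__':
    XenAPIPlugin.dispatch({'download_vhd': download_vhd})

[The compute-worker in the domU would then invoke it with session.xenapi.host.call_plugin(host_ref, 'glance_restore', 'download_vhd', args) -- plugin name again hypothetical -- and follow up with an SR.scan so xapi notices the new VDI. All of the streaming work lands on dom0's CPU, which is exactly the trade-off Ewan describes.]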