Ewan,

Can you explain why you say dom0 CPU is a scarce resource?  I agree for a lot 
of reasons work like this should be done in a domU, but I'm just curious.  My 
thoughts would have been that it's not so scarce.  I know there are things like 
the disk drivers running in the dom0 kernel doing disk I/O, but I'd think 
that'd not be much CPU usage.
It'd be mostly I/O wait.  And I wouldn't think network receive in dom0 vs domU 
would cause much of a difference overall.  I thought the hypervisor scheduled 
dom0 and domUs similarly.  Am I wrong?

The only thing I can think of is when running HVM VMs, qemu can be using a lot 
of CPU.

- Chris



On Feb 16, 2011, at 7:12 AM, Ewan Mellor wrote:

> Just for summary, the advantages of having the streaming inside a domU are:
>  
> 1.       You move the network receive and the image decompression / 
> decryption (if you’re using that) off dom0’s CPU and onto the domU’s.  Dom0 
> CPU is a scarce resource, even in the new release of XenServer with 4 CPUs in 
> domain 0.  This avoids hurting customer workloads by contending inside domain 
> 0.
> 2.       You can easily apply network and CPU QoS to the operations above.  
> This also avoids hurting customer workloads, by simply capping the maximum 
> amount of work that the OpenStack domU can do.
> 3.       You can use Python 2.6 for OpenStack, even though XenServer dom0 is 
> stuck on CentOS 5.5 (Python 2.4).
> 4.       You get a minor security improvement, because you get to keep a 
> network-facing service out of domain 0.
>  
> So, this is all fine if you’re streaming direct to disk, but as you say, if 
> you want to stream VHD files you have a problem, because the VHD needs to go 
> into a filesystem mounted in domain 0.  It’s not possible to write from a 
> domU into a dom0-owned filesystem, without some trickery.  Here are the 
> options as I see them:
>  
> Option A: Stream in two stages, one from Glance to domU, then from domU to 
> dom0.  The stream from domU to dom0 could just be a really simple network 
> put, and would just fit on the end of the current pipeline.  You lose a bit 
> of dom0 CPU, because of the incoming stream, and it’s less efficient overall, 
> because of the two hops. It’s primary advantage is that you can do most of 
> the work inside the domU still, so if you are intending to decompress and/or 
> decrypt locally, then this would likely be a win. 
>  
> Option B: Stream from Glance directly into dom0.  This would be a xapi plugin 
> acting as a Glance client.  This is the simplest solution, but loses all the 
> benefits above.   I think it’s the one that you’re suggesting below.  This 
> leaves you with similar performance problems to the ones that you suffer 
> today on your existing architecture.  The advantage here is simplicity, and 
> it’s certainly worth considering.
>  
> Option C: Run an NFS server in domain 0, and mount that inside the domU.  You 
> can then write direct to dom0’s filesystem from the domU.  This sounds 
> plausible, but I don’t think that I recommend it.  The load on dom0 of doing 
> this is probably no better than Options A or B, which would mean that the 
> complexity wasn’t worth it.
>  
> Option D: Unpack the VHD file inside the domU, and write it through the PV 
> path.  This is probably the option that you haven’t considered yet.  The same 
> VHD parsing code that we use in domain 0 is also available in an easily 
> consumable form (called libvhdio).  This can be used to take a VHD file from 
> the network and parse it, so that you can write the allocated blocks directly 
> to the VDI.  This would have all the advantages above, but it adds yet 
> another moving part to the pipeline.  Also, this is going to be pretty simple 
> if you’re just using VHDs as a way to handle sparseness.  If you’re expecting 
> to stream a whole tree of snapshots as multiple files, and then expect all 
> the relationships between the files to get wired up correctly, then this is 
> not the solution you’re looking for.  It’s technically doable, but it’s very 
> fiddly.
>  
> So, in summary:
>  
> Option A: Two hops.  Ideal if you’re worried about the cost of decompressing 
> / decrypting on the host.
> Option B: Direct to dom0.  Ideal if you want the simplest solution.
> Option D: Parse the VHD.  Probably best performance.  Fiddly development work 
> required.  Not a good idea if you want to work with trees of VHDs.
>  
> Where do you think you stand?  I can advise in more detail about the 
> implementation, if you have a particular option that you prefer.
>  
> Cheers.
>  
> Ewan.
>  
>  
> From: openstack-xenapi-bounces+ewan.mellor=citrix....@lists.launchpad.net 
> [mailto:openstack-xenapi-bounces+ewan.mellor=citrix....@lists.launchpad.net] 
> On Behalf Of Rick Harris
> Sent: 11 February 2011 22:13
> To: openstack-xenapi@lists.launchpad.net
> Subject: [Openstack-xenapi] Glance Plugin/DomU access to SR?
>  
> We recently moved to running the compute-worker within a domU instance. 
>  
> We could make this move because domU can access VBDs in dom0-space by
> performing a VBD.plug.
>  
> The problem is that we'd like to deal deal with whole VHDs rather than kernel,
> ramdisk, and partitioning (the impetus of the unified-images BP).
>  
> So, for snapshots we stream the base copy VHD held in the SR into Glance,
> and, likewise, for restores, we stream the snapshot VHD from Glance into the 
> SR, rescan, and
> then spin up the instance.
>  
> The problem is: now that we're running the compute-worker in domU, how can we
> access the SR?  Is there a way we can map it into domU space (a la VBD.plug)?
>  
> The way we solved this for snapshots was by using the Glance plugin and
> performing these operations in dom0.
>  
> So, my questions are:
>  
> 1. Are SR operations something we need to use the Glance plugin for?
>  
> 2. If we must use a dom0 plugin for this method of restore, does it make 
> sense to just do
> everything image related in the plugin?
>  
> -Rick
>  
> Confidentiality Notice: This e-mail message (including any attached or
> embedded documents) is intended for the exclusive and confidential use of the
> individual or entity to which this message is addressed, and unless otherwise
> expressly indicated, is confidential and privileged information of Rackspace. 
> Any dissemination, distribution or copying of the enclosed material is 
> prohibited.
> If you receive this transmission in error, please notify us immediately by 
> e-mail
> at ab...@rackspace.com, and delete the original message. 
> Your cooperation is appreciated.
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack-xenapi
> Post to     : openstack-xenapi@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack-xenapi
> More help   : https://help.launchpad.net/ListHelp


_______________________________________________
Mailing list: https://launchpad.net/~openstack-xenapi
Post to     : openstack-xenapi@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack-xenapi
More help   : https://help.launchpad.net/ListHelp

Reply via email to