Re: [openstack-dev] [nova] Core pinning

2013-11-27 Thread Daniel P. Berrange
On Wed, Nov 27, 2013 at 03:50:47PM +0200, Tuomas Paappanen wrote:
> >On Tue, 2013-11-19 at 12:52 +, Daniel P. Berrange wrote:
> >>I think there are several use cases mixed up in your descriptions
> >>here which should likely be considered independently
> >>
> >>  - pCPU/vCPU pinning
> >>
> >>I don't really think this is a good idea as a general purpose
> >>feature in its own right. It tends to lead to fairly inefficient
> >>use of CPU resources when you consider that a large % of guests
> >>will be mostly idle most of the time. It has a fairly high
> >>administrative burden to maintain explicit pinning too. This
> >>feels like a data center virt use case rather than cloud use
> >>case really.
> >>
> >>  - Dedicated CPU reservation
> >>
> >>The ability of an end user to request that their VM (or their
> >>group of VMs) gets assigned a dedicated host CPU set to run on.
> >>This is obviously something that would have to be controlled
> >>at a flavour level, and in a commercial deployment would carry
> >>a hefty pricing premium.
> >>
> >>I don't think you want to expose explicit pCPU/vCPU placement
> >>for this though. Just request the high level concept and allow
> >>the virt host to decide actual placement
> I think pcpu/vcpu pinning could be considered as an extension of the
> dedicated cpu reservation feature. And I agree that if we
> exclusively dedicate pcpus to VMs it is inefficient from a cloud
> point of view, but in some cases an end user may want to be sure (and
> be ready to pay) that their VMs have resources available e.g. for
> sudden load peaks.
> 
> So, here is my proposal for how dedicated cpu reservation would
> function at a high level:
> 
> When an end user wants a VM with nn vcpus running on a
> dedicated host cpu set, an admin could enable it by setting a new
> "dedicate_pcpu" parameter in a flavor (e.g. as an optional flavor
> parameter). By default, the number of pcpus and vcpus could be the same.
> And as an option, explicit vcpu/pcpu pinning could be done by defining
> vcpu/pcpu relations in the flavor's extra specs (vcpupin:0 0...).
> 
> In the virt driver there are two alternatives for how to do the pcpu
> sharing: 1. all dedicated pcpus are shared by all vcpus (the default
> case), or 2. each vcpu has a dedicated pcpu (vcpu 0 will be pinned to
> the first pcpu in the cpu set, vcpu 1 to the second pcpu and so on).
> The vcpu/pcpu pinning option could be used to extend the latter case.
> 
> In any case, before a VM with or without dedicated pcpus is launched,
> the virt driver must ensure that the dedicated pcpus are
> excluded from existing VMs and from new VMs and that there are
> enough free pcpus for placement. And I think the minimum number of
> pcpus for VMs without dedicated pcpus must be configurable somewhere.
> 
> Comments?

I still don't believe that vcpu:pcpu pinning is something we want
to do, even with dedicated CPUs. There are always threads in the
host doing work on behalf of the VM that are not related to vCPUs.
For example the main QEMU emulator thread, the QEMU I/O threads,
kernel threads. Other hypervisors have similar behaviour. It is
better to let the kernel / hypervisor scheduler decide how to
balance the competing workloads than forcing a fixed & suboptimally
performing vcpu:pcpu mapping. The only time I've seen fixed pinning
provide a consistent benefit is when NUMA is involved and you want to
prevent a VM spanning NUMA nodes. Even then you'd be best off pinning
to the set of CPUs in a node and then letting the vCPUs float amongst
the pCPUs in that node.
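
As a rough illustration of "pin to the node, float within it" (a sketch
only, with a hypothetical helper name, not code from any Nova patch):

# Build libvirt <cputune> XML that constrains every vCPU to one host NUMA
# node's pCPU set, so the host scheduler still balances load inside the node.
def cputune_for_numa_node(num_vcpus, node_pcpus):
    cpuset = ",".join(str(p) for p in sorted(node_pcpus))
    lines = ["<cputune>"]
    for vcpu in range(num_vcpus):
        # any pCPU of the node is allowed for this vCPU
        lines.append("  <vcpupin vcpu='%d' cpuset='%s'/>" % (vcpu, cpuset))
    lines.append("</cputune>")
    return "\n".join(lines)

# e.g. a 4-vCPU guest confined to NUMA node 0 (pCPUs 0-5)
print(cputune_for_numa_node(4, range(6)))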

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-27 Thread Tuomas Paappanen

On 19.11.2013 20:18, yunhong jiang wrote:

On Tue, 2013-11-19 at 12:52 +, Daniel P. Berrange wrote:

On Wed, Nov 13, 2013 at 02:46:06PM +0200, Tuomas Paappanen wrote:

Hi all,

I would like to hear your thoughts about core pinning in Openstack.
Currently nova (with qemu-kvm) supports defining a cpu set of PCPUs
that can be used by instances. I didn't find a blueprint, but I think
this feature is for isolating the cpus used by the host from the cpus
used by instances (VCPUs).

But, from a performance point of view it is better to exclusively
dedicate PCPUs to VCPUs and the emulator. In some cases you may want to
guarantee that only one instance (and its VCPUs) is using certain
PCPUs. By using core pinning you can optimize instance performance
based on e.g. cache sharing, NUMA topology, interrupt handling, or pci
pass-through (SR-IOV) in multi-socket hosts.

We have already implemented a feature like this (a PoC with limitations)
on the Nova Grizzly version and would like to hear your opinion about it.

The current implementation consists of three main parts:
- Definition of pcpu-vcpu maps for instances and instance spawning
- (optional) Compute resource and capability advertising including
free pcpus and NUMA topology.
- (optional) Scheduling based on free cpus and NUMA topology.

The implementation is quite simple:

(additional/optional parts)
Nova-computes advertise free pcpus and NUMA topology in the same
manner as host capabilities. Instances are scheduled based on this
information.

(core pinning)
An admin can set PCPUs for VCPUs and for the emulator process, or select
a NUMA cell for the instance vcpus, by adding key:value pairs to the
flavor's extra specs.

EXAMPLE:
instance has 4 vcpus
:
vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
emulator:5 --> emulator pinned to pcpu5
or
numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.

In nova-compute, core pinning information is read from the extra specs
and added to the domain xml in the same way as the cpu quota values
(cputune):

<cputune>
  <vcpupin vcpu='0' cpuset='1'/>
  <vcpupin vcpu='1' cpuset='2'/>
  <vcpupin vcpu='2' cpuset='3'/>
  <vcpupin vcpu='3' cpuset='4'/>
  <emulatorpin cpuset='5'/>
</cputune>

What do you think? Implementation alternatives? Is this worth a
blueprint? All related comments are welcome!

I think there are several use cases mixed up in your descriptions
here which should likely be considered independently

  - pCPU/vCPU pinning

I don't really think this is a good idea as a general purpose
feature in its own right. It tends to lead to fairly inefficient
use of CPU resources when you consider that a large % of guests
will be mostly idle most of the time. It has a fairly high
administrative burden to maintain explicit pinning too. This
feels like a data center virt use case rather than cloud use
case really.

  - Dedicated CPU reservation

The ability of an end user to request that their VM (or their
group of VMs) gets assigned a dedicated host CPU set to run on.
This is obviously something that would have to be controlled
at a flavour level, and in a commercial deployment would carry
a hefty pricing premium.

I don't think you want to expose explicit pCPU/vCPU placement
for this though. Just request the high level concept and allow
the virt host to decide actual placement
I think pcpu/vcpu pinning could be considered as an extension of the
dedicated cpu reservation feature. And I agree that if we exclusively
dedicate pcpus to VMs it is inefficient from a cloud point of view, but
in some cases an end user may want to be sure (and be ready to pay) that
their VMs have resources available e.g. for sudden load peaks.


So, here is my proposal for how dedicated cpu reservation would function
at a high level:


When an end user wants a VM with nn vcpus running on a dedicated
host cpu set, an admin could enable it by setting a new "dedicate_pcpu"
parameter in a flavor (e.g. as an optional flavor parameter). By default,
the number of pcpus and vcpus could be the same. And as an option, explicit
vcpu/pcpu pinning could be done by defining vcpu/pcpu relations in the
flavor's extra specs (vcpupin:0 0...).


In the virt driver there are two alternatives for how to do the pcpu sharing:
1. all dedicated pcpus are shared by all vcpus (the default case), or 2.
each vcpu has a dedicated pcpu (vcpu 0 will be pinned to the first pcpu in
the cpu set, vcpu 1 to the second pcpu and so on). The vcpu/pcpu pinning
option could be used to extend the latter case.


In any case, before a VM with or without dedicated pcpus is launched, the
virt driver must ensure that the dedicated pcpus are excluded from
existing VMs and from new VMs and that there are enough free pcpus for
placement. And I think the minimum number of pcpus for VMs without dedicated
pcpus must be configurable somewhere.
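
To make the two alternatives concrete, here is a rough sketch (hypothetical
helper and flavor key, building on the "dedicate_pcpu" idea above, not actual
driver code) of how a virt driver could map vcpus onto a reserved pcpu set:

# Sketch only: map guest vCPUs onto host pCPUs reserved for this instance,
# either shared (all vCPUs may use all dedicated pCPUs) or 1:1
# (vCPU N pinned to the Nth pCPU of the set).
def map_vcpus_to_dedicated_pcpus(num_vcpus, dedicated_pcpus, one_to_one=False):
    pcpus = sorted(dedicated_pcpus)
    if one_to_one:
        if num_vcpus > len(pcpus):
            raise ValueError("not enough dedicated pcpus for 1:1 pinning")
        # vcpu 0 -> first pcpu in the set, vcpu 1 -> second, and so on
        return {vcpu: [pcpus[vcpu]] for vcpu in range(num_vcpus)}
    # default case: every vcpu may run on any of the dedicated pcpus
    return {vcpu: list(pcpus) for vcpu in range(num_vcpus)}

# e.g. a flavor with 4 vcpus and pcpus 8-11 reserved for the instance
print(map_vcpus_to_dedicated_pcpus(4, [8, 9, 10, 11]))
print(map_vcpus_to_dedicated_pcpus(4, [8, 9, 10, 11], one_to_one=True))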


Comments?

Br, Tuomas



  - Host NUMA placement.

By not taking NUMA into account currently the libvirt driver
at least is badly wasting resources. Having too much cross-numa
node memory access by guests just kills scalability. The virt
driver should really automatically figure out cpu & memory pinning

Re: [openstack-dev] [nova] Core pinning

2013-11-26 Thread Roman Verchikov
Tuomas,

> I haven't but I will write a blueprint for the core pinning part.
Can’t wait to see it!

> Are you using extra specs for carrying cpuset attributes in your 
> implementation?
Yes, exactly. 

Although we're using a slightly different syntax to update the flavor, for example:
$ nova flavor-key <flavor> set vcpupin:0=1-5,12-17
Here '0' is the vCPU, and '1-5,12-17' are the pCPUs. Basically this command results
in the following libvirt xml:

   <vcpupin vcpu='0' cpuset='1-5,12-17'/>

We're also using the 'placement' attribute of <vcpu>, set to 'static':
$ nova flavor-key <flavor> set vcpu:placement=static
Which results in the following libvirt xml:
<vcpu placement='static'>…</vcpu>

Otherwise, the functionality and implementation seem to be identical.
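
As a rough illustration (hypothetical helper, not the actual patch), extra
specs in that vcpupin:<vcpu>=<cpuset> form could be expanded into pinning data
roughly like this:

# Sketch: parse flavor extra specs such as "vcpupin:0" = "1-5,12-17"
# into {vcpu: [pcpu, ...]}. Hypothetical code, for illustration only.
def parse_cpuset(spec):
    # expand a libvirt-style cpuset string like '1-5,12-17' into a list
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-"))
            cpus.extend(range(lo, hi + 1))
        else:
            cpus.append(int(part))
    return cpus

def vcpupin_from_extra_specs(extra_specs):
    pins = {}
    for key, value in extra_specs.items():
        if key.startswith("vcpupin:"):
            pins[int(key.split(":", 1)[1])] = parse_cpuset(value)
    return pins

print(vcpupin_from_extra_specs({"vcpupin:0": "1-5,12-17"}))
# {0: [1, 2, 3, 4, 5, 12, 13, 14, 15, 16, 17]}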

[offtopic] 
Apologies for delayed answer, for some reason I thought your email would arrive 
to my personal mailbox 
[/offtopic]

-Roman

On Nov 19, 2013, at 14:35, Tuomas Paappanen  wrote:

> Hi Roman,
> 
> I haven't, but I will write a blueprint for the core pinning part.
> I considered using the vcpu element as well, but in that case you cannot set e.g.
> vcpu-0 to run on pcpu-0. The vcpus and the emulator share all pcpus defined in
> cpuset, so I decided to use the cputune element.
> 
> Are you using extra specs for carrying cpuset attributes in your 
> implementation?
> 
> Br,Tuomas
> 
> On 18.11.2013 17:14, Roman Verchikov wrote:
>> Tuomas,
>> 
>> Have you published your code/blueprints anywhere? Looks like we’re working 
>> on the same stuff. I have implemented almost the same feature set (haven’t 
>> published anything yet because of this thread), except for the scheduler 
>> part. The main goal is to be able to pin VCPUs in NUMA environment.
>> 
>> Have you considered adding placement and cpuset attributes to the <vcpu>
>> element? For example:
>> 
>> 
>> Thanks,
>> Roman
>> 
>> On Nov 13, 2013, at 14:46, Tuomas Paappanen  
>> wrote:
>> 
>>> Hi all,
>>> 
>>> I would like to hear your thoughts about core pinning in Openstack.
>>> Currently nova (with qemu-kvm) supports defining a cpu set of PCPUs that
>>> can be used by instances. I didn't find a blueprint, but I think this
>>> feature is for isolating the cpus used by the host from the cpus used by
>>> instances (VCPUs).
>>> 
>>> But, from a performance point of view it is better to exclusively dedicate
>>> PCPUs to VCPUs and the emulator. In some cases you may want to guarantee
>>> that only one instance (and its VCPUs) is using certain PCPUs. By using
>>> core pinning you can optimize instance performance based on e.g. cache
>>> sharing, NUMA topology, interrupt handling, or pci pass-through (SR-IOV)
>>> in multi-socket hosts.
>>> 
>>> We have already implemented a feature like this (a PoC with limitations) on
>>> the Nova Grizzly version and would like to hear your opinion about it.
>>> 
>>> The current implementation consists of three main parts:
>>> - Definition of pcpu-vcpu maps for instances and instance spawning
>>> - (optional) Compute resource and capability advertising including free 
>>> pcpus and NUMA topology.
>>> - (optional) Scheduling based on free cpus and NUMA topology.
>>> 
>>> The implementation is quite simple:
>>> 
>>> (additional/optional parts)
>>> Nova-computes advertise free pcpus and NUMA topology in the same manner
>>> as host capabilities. Instances are scheduled based on this information.
>>> 
>>> (core pinning)
>>> An admin can set PCPUs for VCPUs and for the emulator process, or select a
>>> NUMA cell for the instance vcpus, by adding key:value pairs to the flavor's
>>> extra specs.
>>> 
>>> EXAMPLE:
>>> instance has 4 vcpus
>>> :
>>> vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
>>> emulator:5 --> emulator pinned to pcpu5
>>> or
>>> numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.
>>> 
>>> In nova-compute, core pinning information is read from the extra specs and
>>> added to the domain xml in the same way as the cpu quota values (cputune):
>>> 
>>> <cputune>
>>>   <vcpupin vcpu='0' cpuset='1'/>
>>>   <vcpupin vcpu='1' cpuset='2'/>
>>>   <vcpupin vcpu='2' cpuset='3'/>
>>>   <vcpupin vcpu='3' cpuset='4'/>
>>>   <emulatorpin cpuset='5'/>
>>> </cputune>
>>> 
>>> What do you think? Implementation alternatives? Is this worth a blueprint?
>>> All related comments are welcome!
>>> 
>>> Regards,
>>> Tuomas
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ___
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> 
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> 
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-19 Thread yunhong jiang
On Tue, 2013-11-19 at 12:52 +, Daniel P. Berrange wrote:
> On Wed, Nov 13, 2013 at 02:46:06PM +0200, Tuomas Paappanen wrote:
> > Hi all,
> > 
> > I would like to hear your thoughts about core pinning in Openstack.
> > Currently nova (with qemu-kvm) supports defining a cpu set of PCPUs
> > that can be used by instances. I didn't find a blueprint, but I think
> > this feature is for isolating the cpus used by the host from the cpus
> > used by instances (VCPUs).
> > 
> > But, from a performance point of view it is better to exclusively
> > dedicate PCPUs to VCPUs and the emulator. In some cases you may want to
> > guarantee that only one instance (and its VCPUs) is using certain
> > PCPUs. By using core pinning you can optimize instance performance
> > based on e.g. cache sharing, NUMA topology, interrupt handling, or pci
> > pass-through (SR-IOV) in multi-socket hosts.
> > 
> > We have already implemented a feature like this (a PoC with limitations)
> > on the Nova Grizzly version and would like to hear your opinion about it.
> > 
> > The current implementation consists of three main parts:
> > - Definition of pcpu-vcpu maps for instances and instance spawning
> > - (optional) Compute resource and capability advertising including
> > free pcpus and NUMA topology.
> > - (optional) Scheduling based on free cpus and NUMA topology.
> > 
> > The implementation is quite simple:
> > 
> > (additional/optional parts)
> > Nova-computes advertise free pcpus and NUMA topology in the same
> > manner as host capabilities. Instances are scheduled based on this
> > information.
> > 
> > (core pinning)
> > An admin can set PCPUs for VCPUs and for the emulator process, or select
> > a NUMA cell for the instance vcpus, by adding key:value pairs to the
> > flavor's extra specs.
> > 
> > EXAMPLE:
> > instance has 4 vcpus
> > :
> > vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
> > emulator:5 --> emulator pinned to pcpu5
> > or
> > numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.
> > 
> > In nova-compute, core pinning information is read from the extra specs
> > and added to the domain xml in the same way as the cpu quota values
> > (cputune):
> > 
> > <cputune>
> >   <vcpupin vcpu='0' cpuset='1'/>
> >   <vcpupin vcpu='1' cpuset='2'/>
> >   <vcpupin vcpu='2' cpuset='3'/>
> >   <vcpupin vcpu='3' cpuset='4'/>
> >   <emulatorpin cpuset='5'/>
> > </cputune>
> > 
> > What do you think? Implementation alternatives? Is this worth a
> > blueprint? All related comments are welcome!
> 
> I think there are several use cases mixed up in your descriptions
> here which should likely be considered independently
> 
>  - pCPU/vCPU pinning
> 
>I don't really think this is a good idea as a general purpose
>feature in its own right. It tends to lead to fairly inefficient
>use of CPU resources when you consider that a large % of guests
>will be mostly idle most of the time. It has a fairly high
>administrative burden to maintain explicit pinning too. This
>feels like a data center virt use case rather than cloud use
>case really.
> 
>  - Dedicated CPU reservation
> 
>The ability of an end user to request that their VM (or their
>group of VMs) gets assigned a dedicated host CPU set to run on.
>This is obviously something that would have to be controlled
>at a flavour level, and in a commercial deployment would carry
>a hefty pricing premium.
> 
>I don't think you want to expose explicit pCPU/vCPU placement
>for this though. Just request the high level concept and allow
>the virt host to decide actual placement
> 
>  - Host NUMA placement.
> 
>By not taking NUMA into account currently the libvirt driver
>at least is badly wasting resources. Having too much cross-numa
>node memory access by guests just kills scalability. The virt
>driver should really figure out cpu & memory pinning within the
>scope of a NUMA node automatically. No admin config
>should be required for this.
> 
>  - Guest NUMA topology
> 
>If the flavour memory size / cpu count exceeds the size of a
>single NUMA node, then the flavour should likely have a way to
>express that the guest should see multiple NUMA nodes. The
>virt host would then set guest NUMA topology to match the way
>it places vCPUs & memory on host NUMA nodes. Again you don't
>want explicit pcpu/vcpu mapping done by the admin for this.
> 
> 
> 
> Regards,
> Daniel

Quite a clear split, and +1 for the P/V pin option.

--jyh



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-19 Thread Daniel P. Berrange
On Wed, Nov 13, 2013 at 11:57:22AM -0600, Chris Friesen wrote:
> On 11/13/2013 11:40 AM, Jiang, Yunhong wrote:
> 
> >>But, from a performance point of view it is better to exclusively
> >>dedicate PCPUs to VCPUs and the emulator. In some cases you may want
> >>to guarantee that only one instance (and its VCPUs) is using certain
> >>PCPUs. By using core pinning you can optimize instance performance
> >>based on e.g. cache sharing, NUMA topology, interrupt handling, or pci
> >>pass-through (SR-IOV) in multi-socket hosts.
> >
> >My 2 cents. When you talk about "performance point of view", are
> >you talking about guest performance or overall performance? Pinning
> >PCPUs is sure to benefit guest performance, but possibly not overall
> >performance, especially if the vCPU does not consume 100% of the CPU
> >resources.
> 
> It can actually be both.  If a guest has several virtual cores that
> both access the same memory, it can be highly beneficial all around
> if all the memory/cpus for that guest come from a single NUMA node
> on the host.  That way you reduce the cross-NUMA-node memory
> traffic, increasing overall efficiency.  Alternately, if a guest has
> several cores that use lots of memory bandwidth but don't access the
> same data, you might want to ensure that the cores are on different
> NUMA nodes to equalize utilization of the different NUMA nodes.
> 
> Similarly, once you start talking about doing SR-IOV networking I/O
> passthrough into a guest (for SDN/NFV stuff) for optimum efficiency
> it is beneficial to be able to steer interrupts on the physical host
> to the specific cpus on which the guest will be running.  This
> implies some form of pinning.

I would say intelligent NUMA placement is something that virt drivers
should address automatically without any need for admin defined pinning.
The latter is just imposing too much admin burden, for something we can
figure out automatically to a good enough extent.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-19 Thread Daniel P. Berrange
On Wed, Nov 13, 2013 at 02:46:06PM +0200, Tuomas Paappanen wrote:
> Hi all,
> 
> I would like to hear your thoughts about core pinning in Openstack.
> Currently nova (with qemu-kvm) supports defining a cpu set of PCPUs
> that can be used by instances. I didn't find a blueprint, but I think
> this feature is for isolating the cpus used by the host from the cpus
> used by instances (VCPUs).
> 
> But, from a performance point of view it is better to exclusively
> dedicate PCPUs to VCPUs and the emulator. In some cases you may want to
> guarantee that only one instance (and its VCPUs) is using certain
> PCPUs. By using core pinning you can optimize instance performance
> based on e.g. cache sharing, NUMA topology, interrupt handling, or pci
> pass-through (SR-IOV) in multi-socket hosts.
> 
> We have already implemented a feature like this (a PoC with limitations)
> on the Nova Grizzly version and would like to hear your opinion about it.
> 
> The current implementation consists of three main parts:
> - Definition of pcpu-vcpu maps for instances and instance spawning
> - (optional) Compute resource and capability advertising including
> free pcpus and NUMA topology.
> - (optional) Scheduling based on free cpus and NUMA topology.
> 
> The implementation is quite simple:
> 
> (additional/optional parts)
> Nova-computes advertise free pcpus and NUMA topology in the same
> manner as host capabilities. Instances are scheduled based on this
> information.
> 
> (core pinning)
> An admin can set PCPUs for VCPUs and for the emulator process, or select
> a NUMA cell for the instance vcpus, by adding key:value pairs to the
> flavor's extra specs.
> 
> EXAMPLE:
> instance has 4 vcpus
> :
> vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
> emulator:5 --> emulator pinned to pcpu5
> or
> numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.
> 
> In nova-compute, core pinning information is read from the extra specs
> and added to the domain xml in the same way as the cpu quota values
> (cputune):
> 
> <cputune>
>   <vcpupin vcpu='0' cpuset='1'/>
>   <vcpupin vcpu='1' cpuset='2'/>
>   <vcpupin vcpu='2' cpuset='3'/>
>   <vcpupin vcpu='3' cpuset='4'/>
>   <emulatorpin cpuset='5'/>
> </cputune>
> 
> What do you think? Implementation alternatives? Is this worth a
> blueprint? All related comments are welcome!

I think there are several use cases mixed up in your descriptions
here which should likely be considered independently

 - pCPU/vCPU pinning

   I don't really think this is a good idea as a general purpose
   feature in its own right. It tends to lead to fairly inefficient
   use of CPU resources when you consider that a large % of guests
   will be mostly idle most of the time. It has a fairly high
   administrative burden to maintain explicit pinning too. This
   feels like a data center virt use case rather than cloud use
   case really.

 - Dedicated CPU reservation

   The ability of an end user to request that their VM (or their
   group of VMs) gets assigned a dedicated host CPU set to run on.
   This is obviously something that would have to be controlled
   at a flavour level, and in a commercial deployment would carry
   a hefty pricing premium.

   I don't think you want to expose explicit pCPU/vCPU placement
   for this though. Just request the high level concept and allow
   the virt host to decide actual placement

 - Host NUMA placement.

   By not taking NUMA into account currently the libvirt driver
   at least is badly wasting resources. Having too much cross-numa
   node memory access by guests just kills scalability. The virt
   driver should really figure out cpu & memory pinning within the
   scope of a NUMA node automatically. No admin config
   should be required for this.

 - Guest NUMA topology

   If the flavour memory size / cpu count exceeds the size of a
   single NUMA node, then the flavour should likely have a way to
   express that the guest should see multiple NUMA nodes. The
   virt host would then set guest NUMA topology to match the way
   it places vCPUs & memory on host NUMA nodes. Again you don't
   want explicit pcpu/vcpu mapping done by the admin for this.
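
As a hedged illustration of that last point (not something the driver does
today), a guest NUMA topology that mirrors the host placement could be
emitted into the domain XML roughly like this:

# Sketch: build a libvirt <cpu><numa> topology whose cells mirror the host
# NUMA nodes the guest's vCPUs and memory were placed on. Hypothetical helper.
def guest_numa_xml(cells):
    # cells: list of (vcpu_ids, memory_kib) tuples, one per guest NUMA cell
    out = ["<cpu>", "  <numa>"]
    for vcpus, mem_kib in cells:
        cpus = ",".join(str(v) for v in vcpus)
        out.append("    <cell cpus='%s' memory='%d'/>" % (cpus, mem_kib))
    out.extend(["  </numa>", "</cpu>"])
    return "\n".join(out)

# e.g. an 8-vCPU flavour split across two guest cells of 4 GiB each
print(guest_numa_xml([([0, 1, 2, 3], 4194304), ([4, 5, 6, 7], 4194304)]))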



Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-19 Thread Tuomas Paappanen

Hi Roman,

I haven't, but I will write a blueprint for the core pinning part.
I considered using the vcpu element as well, but in that case you cannot set
e.g. vcpu-0 to run on pcpu-0. The vcpus and the emulator share all pcpus
defined in cpuset, so I decided to use the cputune element.
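
For reference, a hedged sketch of the two forms being compared (illustrative
XML only, not taken from the actual patch):

# <vcpu> with a cpuset: all vCPUs and the emulator float over the same pCPUs,
# so a specific vCPU cannot be tied to a specific pCPU.
VCPU_ELEMENT_FORM = "<vcpu placement='static' cpuset='0-3'>4</vcpu>"

# <cputune>: each vCPU (and the emulator) can be pinned individually.
CPUTUNE_FORM = """<cputune>
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='3'/>
  <emulatorpin cpuset='4'/>
</cputune>"""

print(VCPU_ELEMENT_FORM)
print(CPUTUNE_FORM)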


Are you using extra specs for carrying cpuset attributes in your 
implementation?


Br,Tuomas

On 18.11.2013 17:14, Roman Verchikov wrote:

Tuomas,

Have you published your code/blueprints anywhere? Looks like we’re working on 
the same stuff. I have implemented almost the same feature set (haven’t 
published anything yet because of this thread), except for the scheduler part. 
The main goal is to be able to pin VCPUs in NUMA environment.

Have you considered adding placement and cpuset attributes to the <vcpu> element?
For example:


Thanks,
Roman

On Nov 13, 2013, at 14:46, Tuomas Paappanen  wrote:


Hi all,

I would like to hear your thoughts about core pinning in Openstack. Currently
nova (with qemu-kvm) supports defining a cpu set of PCPUs that can be used by
instances. I didn't find a blueprint, but I think this feature is for isolating
the cpus used by the host from the cpus used by instances (VCPUs).

But, from a performance point of view it is better to exclusively dedicate
PCPUs to VCPUs and the emulator. In some cases you may want to guarantee that
only one instance (and its VCPUs) is using certain PCPUs. By using core pinning
you can optimize instance performance based on e.g. cache sharing, NUMA
topology, interrupt handling, or pci pass-through (SR-IOV) in multi-socket hosts.

We have already implemented a feature like this (a PoC with limitations) on the
Nova Grizzly version and would like to hear your opinion about it.

The current implementation consists of three main parts:
- Definition of pcpu-vcpu maps for instances and instance spawning
- (optional) Compute resource and capability advertising including free pcpus 
and NUMA topology.
- (optional) Scheduling based on free cpus and NUMA topology.

The implementation is quite simple:

(additional/optional parts)
Nova-computes advertise free pcpus and NUMA topology in the same manner as
host capabilities. Instances are scheduled based on this information.

(core pinning)
An admin can set PCPUs for VCPUs and for the emulator process, or select a NUMA
cell for the instance vcpus, by adding key:value pairs to the flavor's extra specs.

EXAMPLE:
instance has 4 vcpus
:
vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
emulator:5 --> emulator pinned to pcpu5
or
numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.

In nova-compute, core pinning information is read from the extra specs and added
to the domain xml in the same way as the cpu quota values (cputune):

<cputune>
  <vcpupin vcpu='0' cpuset='1'/>
  <vcpupin vcpu='1' cpuset='2'/>
  <vcpupin vcpu='2' cpuset='3'/>
  <vcpupin vcpu='3' cpuset='4'/>
  <emulatorpin cpuset='5'/>
</cputune>

What do you think? Implementation alternatives? Is this worth a blueprint? All
related comments are welcome!

Regards,
Tuomas





___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-18 Thread Roman Verchikov
Tuomas,

Have you published your code/blueprints anywhere? Looks like we’re working on 
the same stuff. I have implemented almost the same feature set (haven’t 
published anything yet because of this thread), except for the scheduler part. 
The main goal is to be able to pin VCPUs in NUMA environment.

Have you considered adding placement and cpuset attributes to the <vcpu> element?
For example:


Thanks,
Roman

On Nov 13, 2013, at 14:46, Tuomas Paappanen  wrote:

> Hi all,
> 
> I would like to hear your thoughts about core pinning in Openstack. Currently
> nova (with qemu-kvm) supports defining a cpu set of PCPUs that can be used by
> instances. I didn't find a blueprint, but I think this feature is for isolating
> the cpus used by the host from the cpus used by instances (VCPUs).
> 
> But, from a performance point of view it is better to exclusively dedicate
> PCPUs to VCPUs and the emulator. In some cases you may want to guarantee that
> only one instance (and its VCPUs) is using certain PCPUs. By using core
> pinning you can optimize instance performance based on e.g. cache sharing,
> NUMA topology, interrupt handling, or pci pass-through (SR-IOV) in multi-socket
> hosts.
> 
> We have already implemented a feature like this (a PoC with limitations) on
> the Nova Grizzly version and would like to hear your opinion about it.
> 
> The current implementation consists of three main parts:
> - Definition of pcpu-vcpu maps for instances and instance spawning
> - (optional) Compute resource and capability advertising including free pcpus 
> and NUMA topology.
> - (optional) Scheduling based on free cpus and NUMA topology.
> 
> The implementation is quite simple:
> 
> (additional/optional parts)
> Nova-computes advertise free pcpus and NUMA topology in the same manner as
> host capabilities. Instances are scheduled based on this information.
> 
> (core pinning)
> An admin can set PCPUs for VCPUs and for the emulator process, or select a
> NUMA cell for the instance vcpus, by adding key:value pairs to the flavor's
> extra specs.
> 
> EXAMPLE:
> instance has 4 vcpus
> :
> vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
> emulator:5 --> emulator pinned to pcpu5
> or
> numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.
> 
> In nova-compute, core pinning information is read from the extra specs and
> added to the domain xml in the same way as the cpu quota values (cputune):
> 
> <cputune>
>   <vcpupin vcpu='0' cpuset='1'/>
>   <vcpupin vcpu='1' cpuset='2'/>
>   <vcpupin vcpu='2' cpuset='3'/>
>   <vcpupin vcpu='3' cpuset='4'/>
>   <emulatorpin cpuset='5'/>
> </cputune>
> 
> What do you think? Implementation alternatives? Is this worth a blueprint?
> All related comments are welcome!
> 
> Regards,
> Tuomas
> 
> 
> 
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-15 Thread Tapio Tallgren
Hi,

The use cases for CPU pinning are exactly as discussed above: (1)
lowering guest scheduling latencies and (2) improving networking latencies
by pinning the SR-IOV IRQs to specific cores. There is also a third use
case: (3) avoiding long latencies with spinlocks.

> On Wed, Nov 13, 2013 at 8:20 PM, Jiang, Yunhong wrote:

>
>> Similarly, once you start talking about doing SR-IOV networking I/O
>> passthrough into a guest (for SDN/NFV stuff) for optimum efficiency it
>> is beneficial to be able to steer interrupts on the physical host to the
>> specific cpus on which the guest will be running.  This implies some
>> form of pinning.

> Still, I think the hypervisor should achieve this, instead of openstack.

How would this work? As a solution, this would be much better since then
OpenStack would have much less low-level work to do.

-Tapio
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-14 Thread Tuomas Paappanen

On 13.11.2013 20:20, Jiang, Yunhong wrote:



-Original Message-
From: Chris Friesen [mailto:chris.frie...@windriver.com]
Sent: Wednesday, November 13, 2013 9:57 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [nova] Core pinning

On 11/13/2013 11:40 AM, Jiang, Yunhong wrote:


But, from a performance point of view it is better to exclusively
dedicate PCPUs to VCPUs and the emulator. In some cases you may want
to guarantee that only one instance (and its VCPUs) is using certain
PCPUs. By using core pinning you can optimize instance performance
based on e.g. cache sharing, NUMA topology, interrupt handling, or pci
pass-through (SR-IOV) in multi-socket hosts.

My 2 cents. When you talk about "performance point of view", are
you talking about guest performance or overall performance? Pinning
PCPUs is sure to benefit guest performance, but possibly not overall
performance, especially if the vCPU does not consume 100% of the CPU
resources.

It can actually be both.  If a guest has several virtual cores that both
access the same memory, it can be highly beneficial all around if all
the memory/cpus for that guest come from a single NUMA node on the
host.
   That way you reduce the cross-NUMA-node memory traffic, increasing
overall efficiency.  Alternately, if a guest has several cores that use
lots of memory bandwidth but don't access the same data, you might want
to ensure that the cores are on different NUMA nodes to equalize
utilization of the different NUMA nodes.

I think Tuomas is talking about "exclusively dedicating PCPUs to VCPUs"; in
that situation, that pCPU can't be shared by another vCPU anymore. If the vCPU
only uses, say, 50% of the pCPU, it's sure to be a waste of overall performance.

As to cross-NUMA-node access, I'd let the hypervisor, instead of the cloud OS,
reduce cross-NUMA access as much as possible.

I'm not against such usage; it's sure to be used in data center virtualization.
I just question whether it's for cloud.



Similarly, once you start talking about doing SR-IOV networking I/O
passthrough into a guest (for SDN/NFV stuff) for optimum efficiency it
is beneficial to be able to steer interrupts on the physical host to the
specific cpus on which the guest will be running.  This implies some
form of pinning.

Still, I think the hypervisor should achieve this, instead of openstack.



I think CPU pinning is common in data center virtualization, but I'm not sure
it's in scope for cloud, which provides computing power, not hardware
resources.

And I think part of your purpose can be achieved through
https://wiki.openstack.org/wiki/CPUEntitlement and
https://wiki.openstack.org/wiki/InstanceResourceQuota . In particular, I
hope a well-implemented hypervisor will avoid needless vcpu migration
if the vcpu is very busy and requires most of the pCPU's computing
capability (I know Xen used to have an issue in its scheduler that
caused frequent vCPU migration long ago).

I'm not sure the above stuff can be done with those.  It's not just
about quantity of resources, but also about which specific resources
will be used so that other things can be done based on that knowledge.

With the above stuff, it ensures the QoS and the compute capability for the
guest, I think.

--jyh
  

Chris

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Hi,

thank you for your comments. I am talking about guest performance. We
are using openstack for managing Telco cloud applications where guest
performance optimization is needed.
That example where pcpus are dedicated exclusively to vcpus is not a
problem. It can be implemented by using scheduling filters, and if you
need that feature you can take the filter into use. Without it, pcpus are
shared in the normal way.
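
As a rough sketch of the kind of scheduler filter meant here (hypothetical
attribute and key names, not the actual PoC filter; a real one would subclass
nova's BaseHostFilter and implement host_passes()):

# Sketch: pass only hosts that still have enough free dedicated pCPUs for the
# requested flavor. 'free_dedicated_pcpus' would be advertised by nova-compute.
class DedicatedPcpuFilter(object):
    def host_passes(self, host_state, filter_properties):
        flavor = filter_properties["instance_type"]
        extra = flavor.get("extra_specs", {})
        if extra.get("dedicate_pcpu") != "true":
            return True  # nothing special requested; any host will do
        wanted = int(flavor["vcpus"])
        free = getattr(host_state, "free_dedicated_pcpus", 0)
        return free >= wanted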


As Chris said, core pinning, e.g. depending on NUMA topology, is
beneficial, and I think it's beneficial with or without exclusive
dedication of pcpus.


Regards,
Tuomas

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-13 Thread Jiang, Yunhong


> -Original Message-
> From: Chris Friesen [mailto:chris.frie...@windriver.com]
> Sent: Wednesday, November 13, 2013 9:57 AM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [nova] Core pinning
> 
> On 11/13/2013 11:40 AM, Jiang, Yunhong wrote:
> 
> >> But, from a performance point of view it is better to exclusively
> >> dedicate PCPUs to VCPUs and the emulator. In some cases you may want
> >> to guarantee that only one instance (and its VCPUs) is using certain
> >> PCPUs. By using core pinning you can optimize instance performance
> >> based on e.g. cache sharing, NUMA topology, interrupt handling, or pci
> >> pass-through (SR-IOV) in multi-socket hosts.
> >
> > My 2 cents. When you talk about "performance point of view", are
> > you talking about guest performance or overall performance? Pinning
> > PCPUs is sure to benefit guest performance, but possibly not overall
> > performance, especially if the vCPU does not consume 100% of the CPU
> > resources.
> 
> It can actually be both.  If a guest has several virtual cores that both
> access the same memory, it can be highly beneficial all around if all
> the memory/cpus for that guest come from a single NUMA node on the
> host.
>   That way you reduce the cross-NUMA-node memory traffic, increasing
> overall efficiency.  Alternately, if a guest has several cores that use
> lots of memory bandwidth but don't access the same data, you might want
> to ensure that the cores are on different NUMA nodes to equalize
> utilization of the different NUMA nodes.

I think Tuomas is talking about "exclusively dedicating PCPUs to VCPUs"; in
that situation, that pCPU can't be shared by another vCPU anymore. If the vCPU
only uses, say, 50% of the pCPU, it's sure to be a waste of overall
performance.

As to cross-NUMA-node access, I'd let the hypervisor, instead of the cloud OS,
reduce cross-NUMA access as much as possible.

I'm not against such usage; it's sure to be used in data center virtualization.
I just question whether it's for cloud.


> 
> Similarly, once you start talking about doing SR-IOV networking I/O
> passthrough into a guest (for SDN/NFV stuff) for optimum efficiency it
> is beneficial to be able to steer interrupts on the physical host to the
> specific cpus on which the guest will be running.  This implies some
> form of pinning.

Still, I think the hypervisor should achieve this, instead of openstack.


> 
> > I think CPU pinning is common in data center virtualization, but I'm not
> > sure it's in scope for cloud, which provides computing power, not
> > hardware resources.
> >
> > And I think part of your purpose can be achieved through
> > https://wiki.openstack.org/wiki/CPUEntitlement and
> > https://wiki.openstack.org/wiki/InstanceResourceQuota . In particular, I
> > hope a well-implemented hypervisor will avoid needless vcpu migration
> > if the vcpu is very busy and requires most of the pCPU's computing
> > capability (I know Xen used to have an issue in its scheduler that
> > caused frequent vCPU migration long ago).
> 
> I'm not sure the above stuff can be done with those.  It's not just
> about quantity of resources, but also about which specific resources
> will be used so that other things can be done based on that knowledge.

With the above stuff, it ensures the QoS and the compute capability for the
guest, I think.

--jyh
 
> 
> Chris
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-13 Thread Chris Friesen

On 11/13/2013 11:40 AM, Jiang, Yunhong wrote:


But, from a performance point of view it is better to exclusively
dedicate PCPUs to VCPUs and the emulator. In some cases you may want
to guarantee that only one instance (and its VCPUs) is using certain
PCPUs. By using core pinning you can optimize instance performance
based on e.g. cache sharing, NUMA topology, interrupt handling, or pci
pass-through (SR-IOV) in multi-socket hosts.


My 2 cents. When you talk about "performance point of view", are
you talking about guest performance or overall performance? Pinning
PCPUs is sure to benefit guest performance, but possibly not overall
performance, especially if the vCPU does not consume 100% of the CPU
resources.


It can actually be both.  If a guest has several virtual cores that both 
access the same memory, it can be highly beneficial all around if all 
the memory/cpus for that guest come from a single NUMA node on the host. 
 That way you reduce the cross-NUMA-node memory traffic, increasing 
overall efficiency.  Alternately, if a guest has several cores that use 
lots of memory bandwidth but don't access the same data, you might want 
to ensure that the cores are on different NUMA nodes to equalize 
utilization of the different NUMA nodes.


Similarly, once you start talking about doing SR-IOV networking I/O 
passthrough into a guest (for SDN/NFV stuff) for optimum efficiency it 
is beneficial to be able to steer interrupts on the physical host to the 
specific cpus on which the guest will be running.  This implies some 
form of pinning.
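
As a simplified picture of what that steering looks like on a Linux host (an
illustrative sketch only; the irq number and cpu list are made-up examples),
an IRQ's affinity can be pointed at the guest's pinned cpus:

# Sketch: point a host IRQ (e.g. an SR-IOV VF's MSI-X vector) at the pCPUs a
# guest is pinned to, via /proc/irq/<irq>/smp_affinity_list. Needs root.
def steer_irq_to_cpus(irq, cpus):
    cpu_list = ",".join(str(c) for c in sorted(cpus))
    with open("/proc/irq/%d/smp_affinity_list" % irq, "w") as f:
        f.write(cpu_list)

# e.g. steer IRQ 98 to the pCPUs (8-11) that the guest's vCPUs are pinned to
# steer_irq_to_cpus(98, [8, 9, 10, 11])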



I think CPU pinning is common in data center virtualization, but I'm not sure
it's in scope for cloud, which provides computing power, not
hardware resources.

And I think part of your purpose can be achieved through
https://wiki.openstack.org/wiki/CPUEntitlement and
https://wiki.openstack.org/wiki/InstanceResourceQuota . In particular, I
hope a well-implemented hypervisor will avoid needless vcpu migration
if the vcpu is very busy and requires most of the pCPU's computing
capability (I know Xen used to have an issue in its scheduler that
caused frequent vCPU migration long ago).


I'm not sure the above stuff can be done with those.  It's not just 
about quantity of resources, but also about which specific resources 
will be used so that other things can be done based on that knowledge.


Chris

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Core pinning

2013-11-13 Thread Jiang, Yunhong


> -Original Message-
> From: Tuomas Paappanen [mailto:tuomas.paappa...@tieto.com]
> Sent: Wednesday, November 13, 2013 4:46 AM
> To: openstack-dev@lists.openstack.org
> Subject: [openstack-dev] [nova] Core pinning
> 
> Hi all,
> 
> I would like to hear your thoughts about core pinning in Openstack.
> Currently nova (with qemu-kvm) supports defining a cpu set of PCPUs that
> can be used by instances. I didn't find a blueprint, but I think this
> feature is for isolating the cpus used by the host from the cpus used by
> instances (VCPUs).
> 
> But, from a performance point of view it is better to exclusively dedicate
> PCPUs to VCPUs and the emulator. In some cases you may want to guarantee
> that only one instance (and its VCPUs) is using certain PCPUs. By using
> core pinning you can optimize instance performance based on e.g. cache
> sharing, NUMA topology, interrupt handling, or pci pass-through (SR-IOV) in
> multi-socket hosts.

My 2 cents.
When you talk about "performance point of view", are you talking about
guest performance or overall performance? Pinning PCPUs is sure to benefit
guest performance, but possibly not overall performance, especially if the
vCPU does not consume 100% of the CPU resources.

I think CPU pinning is common in data center virtualization, but I'm not sure
it's in scope for cloud, which provides computing power, not hardware resources.

And I think part of your purpose can be achieved through
https://wiki.openstack.org/wiki/CPUEntitlement and
https://wiki.openstack.org/wiki/InstanceResourceQuota . In particular, I hope a
well-implemented hypervisor will avoid needless vcpu migration if the vcpu is
very busy and requires most of the pCPU's computing capability (I know Xen used
to have an issue in its scheduler that caused frequent vCPU migration long
ago).
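
For reference, a hedged sketch of the quota-style control those wiki pages
describe: flavor extra specs of roughly this shape end up as libvirt cputune
limits rather than pinning (key names and values here are illustrative):

# Sketch: cpu-quota style flavor extra specs expressed as libvirt <cputune>
# limits. Treat the keys and values as examples, not a definitive interface.
QUOTA_EXTRA_SPECS = {
    "quota:cpu_shares": "2048",    # relative weight vs other guests
    "quota:cpu_period": "100000",  # enforcement window in microseconds
    "quota:cpu_quota": "50000",    # max runtime per period (50% of one pCPU)
}

def cputune_quota_xml(specs):
    tags = {"quota:cpu_shares": "shares",
            "quota:cpu_period": "period",
            "quota:cpu_quota": "quota"}
    body = "\n".join("  <%s>%s</%s>" % (tags[k], v, tags[k])
                     for k, v in specs.items() if k in tags)
    return "<cputune>\n%s\n</cputune>" % body

print(cputune_quota_xml(QUOTA_EXTRA_SPECS))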

--jyh


> 
> We have already implemented a feature like this (a PoC with limitations) on
> the Nova Grizzly version and would like to hear your opinion about it.
> 
> The current implementation consists of three main parts:
> - Definition of pcpu-vcpu maps for instances and instance spawning
> - (optional) Compute resource and capability advertising including free
> pcpus and NUMA topology.
> - (optional) Scheduling based on free cpus and NUMA topology.
> 
> The implementation is quite simple:
> 
> (additional/optional parts)
> Nova-computes advertise free pcpus and NUMA topology in the same
> manner as host capabilities. Instances are scheduled based on this
> information.
> 
> (core pinning)
> An admin can set PCPUs for VCPUs and for the emulator process, or select a
> NUMA cell for the instance vcpus, by adding key:value pairs to the flavor's
> extra specs.
> 
> EXAMPLE:
> instance has 4 vcpus
> :
> vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
> emulator:5 --> emulator pinned to pcpu5
> or
> numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.
> 
> In nova-compute, core pinning information is read from the extra specs and
> added to the domain xml in the same way as the cpu quota values (cputune):
> 
> <cputune>
>   <vcpupin vcpu='0' cpuset='1'/>
>   <vcpupin vcpu='1' cpuset='2'/>
>   <vcpupin vcpu='2' cpuset='3'/>
>   <vcpupin vcpu='3' cpuset='4'/>
>   <emulatorpin cpuset='5'/>
> </cputune>
> 
> What do you think? Implementation alternatives? Is this worth a
> blueprint? All related comments are welcome!
> 
> Regards,
> Tuomas
> 
> 
> 
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] Core pinning

2013-11-13 Thread Tuomas Paappanen

Hi all,

I would like to hear your thoughts about core pinning in Openstack.
Currently nova (with qemu-kvm) supports defining a cpu set of PCPUs that
can be used by instances. I didn't find a blueprint, but I think this
feature is for isolating the cpus used by the host from the cpus used by
instances (VCPUs).

But, from a performance point of view it is better to exclusively dedicate
PCPUs to VCPUs and the emulator. In some cases you may want to guarantee
that only one instance (and its VCPUs) is using certain PCPUs. By using
core pinning you can optimize instance performance based on e.g. cache
sharing, NUMA topology, interrupt handling, or pci pass-through (SR-IOV) in
multi-socket hosts.

We have already implemented a feature like this (a PoC with limitations) on
the Nova Grizzly version and would like to hear your opinion about it.


The current implementation consists of three main parts:
- Definition of pcpu-vcpu maps for instances and instance spawning
- (optional) Compute resource and capability advertising including free 
pcpus and NUMA topology.

- (optional) Scheduling based on free cpus and NUMA topology.

The implementation is quite simple:

(additional/optional parts)
Nova-computes advertise free pcpus and NUMA topology in the same
manner as host capabilities. Instances are scheduled based on this
information.

(core pinning)
An admin can set PCPUs for VCPUs and for the emulator process, or select a
NUMA cell for the instance vcpus, by adding key:value pairs to the flavor's
extra specs.


EXAMPLE:
instance has 4 vcpus
:
vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
emulator:5 --> emulator pinned to pcpu5
or
numacell:0 --> all vcpus are pinned to pcpus in numa cell 0.

In nova-compute, core pinning information is read from the extra specs and
added to the domain xml in the same way as the cpu quota values (cputune):

<cputune>
  <vcpupin vcpu='0' cpuset='1'/>
  <vcpupin vcpu='1' cpuset='2'/>
  <vcpupin vcpu='2' cpuset='3'/>
  <vcpupin vcpu='3' cpuset='4'/>
  <emulatorpin cpuset='5'/>
</cputune>

What do you think? Implementation alternatives? Is this worth a
blueprint? All related comments are welcome!
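
To make the extra-spec-to-XML mapping concrete, a rough sketch (hypothetical
helper, not the PoC code itself) of how the example specs above could be turned
into that cputune section:

# Sketch: turn the example extra specs ("vcpus" -> "1,2,3,4", "emulator" -> "5")
# into libvirt <cputune> pinning. Hypothetical helper, for illustration only.
def cputune_from_extra_specs(extra_specs):
    lines = ["<cputune>"]
    if "vcpus" in extra_specs:
        pcpus = [int(p) for p in extra_specs["vcpus"].split(",")]
        for vcpu, pcpu in enumerate(pcpus):
            # vcpu0 -> first listed pcpu, vcpu1 -> second, and so on
            lines.append("  <vcpupin vcpu='%d' cpuset='%d'/>" % (vcpu, pcpu))
    if "emulator" in extra_specs:
        lines.append("  <emulatorpin cpuset='%s'/>" % extra_specs["emulator"])
    lines.append("</cputune>")
    return "\n".join(lines)

print(cputune_from_extra_specs({"vcpus": "1,2,3,4", "emulator": "5"}))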


Regards,
Tuomas





___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev