[openstack-dev] [Nova] why force_config_drive is a per compute node config

2014-02-27 Thread yunhong jiang
Greetings,
I have some questions about the force_config_drive configuration option
and hope to get some hints.
a) Why do we want this option? Per my understanding, if users want a
config drive, they can request it in nova boot. Or is it because the
user may have no idea whether cloud-init is installed in the image?

b) Even if we want to force the config drive, why is it a per-compute-node
config instead of a cloud-wide config? A compute-node config will cause
some migration issues, per my understanding.

Did I miss anything important?
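
For reference, this is roughly what I mean by the two options (values from
memory, so treat the exact syntax as a sketch rather than gospel):

  # per-request: the user asks for a config drive at boot time
  nova boot --flavor m1.small --image image-uuid --config-drive=true vm1

  # per-compute-node: nova.conf on the host forces it for every instance
  [DEFAULT]
  force_config_drive=always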

Thanks
--jyh




Re: [openstack-dev] [nova] Question about USB passthrough

2014-02-25 Thread yunhong jiang
On Tue, 2014-02-25 at 03:05 +, Liuji (Jeremy) wrote:
 Now that USB devices are so widely used in private/hybrid clouds, for
 example as USB keys, and there are no technical issues in libvirt/qemu,
 I think it is a valuable feature for OpenStack.

A USB key is an interesting scenario. I assume the USB key is meant for
some specific VM; I am wondering how the admin/user knows which USB disk
goes to which VM?

--jyh




Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing

2014-02-25 Thread yunhong jiang
On Tue, 2014-02-25 at 10:45 +, John Garbutt wrote:
 
 As a heads up, the overheads of DB calls turned out to dwarf any
 algorithmic improvements I managed. There will clearly be some RPC
 overhead, but it didn't stand out as much as the DB issue.
 
 The move to conductor work should certainly stop the scheduler making
 those pesky DB calls to update the nova instance. And then,
 improvements like no-db-scheduler and improvements to scheduling
 algorithms should shine through much more.
 
Although DB access is surely the key to performance, do we really
want to pursue a conductor-based scheduler?

--jyh




Re: [openstack-dev] [nova] Question about USB passthrough

2014-02-24 Thread yunhong jiang
On Mon, 2014-02-24 at 04:10 +, Liuji (Jeremy) wrote:
 I have found a BP about USB device passthrough in
 https://blueprints.launchpad.net/nova/+spec/host-usb-passthrough. 
 I have also read the latest nova code and confirmed that it doesn't
 support USB passthrough for now.
 
 Is there any progress or plan for USB passthrough?

I don't know of anyone working on USB passthrough.

--jyh




Re: [openstack-dev] [nova] pci device hotplug

2014-02-21 Thread yunhong jiang
On Mon, 2014-02-17 at 06:43 +, Gouzongmei wrote:
 Hello,
 
  
 
 In the current PCI passthrough implementation, a PCI device is only
 allowed to be assigned to an instance while the instance is being
 created; it is not allowed to be assigned to or removed from the
 instance while the instance is running or stopped.
 
 Besides, I noticed that the basic ability to remove a PCI device from
 the instance (not by deleting the flavor) has never been implemented
 or proposed by anyone.
 
 The current implementation:
 
 https://wiki.openstack.org/wiki/Pci_passthrough
 
  
 
 I have tested NIC hotplug in my experimental environment; it's
 supported by the latest libvirt and qemu.
 
  
 
 My question is: why has PCI device hotplug not been proposed in OpenStack
 until now, and is anyone planning to work on it?

Agreed that PCI hotplug is an important feature. The reason there is no
support yet is bandwidth: the folks working on PCI have spent a lot of
time on the SR-IOV NIC discussion.
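
For anyone who wants to experiment outside of nova in the meantime, a
minimal sketch of the underlying libvirt hot-plug call using libvirt-python
(the domain name and PCI address below are made up):

  import libvirt

  # Device XML for the PCI function to attach; the address is a placeholder.
  hostdev_xml = """
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x05' slot='0x10' function='0x0'/>
    </source>
  </hostdev>
  """

  conn = libvirt.open('qemu:///system')
  dom = conn.lookupByName('instance-00000001')
  # Attach the device to the running guest (live hotplug).
  dom.attachDeviceFlags(hostdev_xml, libvirt.VIR_DOMAIN_AFFECT_LIVE)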

--jyh





Re: [openstack-dev] [Nova] bp proposal: filter based on the load averages of the host

2014-02-14 Thread yunhong jiang
On Fri, 2014-02-14 at 15:29 +, sahid wrote:
 Greetings,
 
 I would like to add a new filter based on the load averages.
 
 This filter will use the command uptime and will provide an option to choose
 a period between 1, 5, and 15 minutes and an option to choose the max load
 average (a float between 0 and 1).
 
 Why:
   During scheduling it could be useful to exclude a host that has too
 heavy a load, and the command uptime (available on all Linux systems)
 can return the load average of the system over different periods.
 
 About the implementation:
   Currently 'all' drivers (libvirt, xenapi, vmware) support a method
 get_host_uptime that returns the output of the command 'uptime'. We have to
 add a new method calculate_loadavg() in compute/stats.py that, based on the
 output of driver.get_host_uptime() from compute/resource_tracker.py, returns
 a well-formatted tuple of load averages for each period. We also need to
 update api/openstack/compute/contrib/hypervisors.py to take care of this
 new field.
 
   The implementation will be divided into several parts:
 * Add to host_manager the ability to get the load averages
 * Implement the filter based on this new property
 * Implement the filter with a per-aggregate configuration
 
 The blueprint: https://blueprints.launchpad.net/nova/+spec/filter-based-uptime
 
 I will be happy to get any comments about this filter; perhaps it is not
 implemented yet because of something I didn't see, or my thinking about the
 implementation is wrong.
 
 PS: I have checked metrics and cpu_resource but they do not provide an
 average of the system load, or perhaps I have not understood it all.
 
 Thanks a lot,
 s.
 

I think the load average covers more than CPU; you need to consider things
like I/O usage, or even other metrics. Maybe you can have a look at
https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling ?

Also, IMHO the policy of excluding a host that has too heavy a load is
not so clean; would it be better to use the load as a scheduler
weight?
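
To illustrate the weight idea, here is a minimal standalone sketch of turning
the parsed uptime output into a weight instead of a hard filter (not an
existing nova weigher; just the shape of the calculation):

  import re

  def parse_load_averages(uptime_output):
      """Extract the 1-, 5- and 15-minute load averages from `uptime` output."""
      match = re.search(r'load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)',
                        uptime_output)
      if not match:
          return None
      return tuple(float(v) for v in match.groups())

  def load_weight(uptime_output, period_index=0, num_cpus=1):
      """Return a weight in [0, 1]; lightly loaded hosts score higher."""
      loads = parse_load_averages(uptime_output)
      if loads is None:
          return 0.0
      normalized = loads[period_index] / float(num_cpus)
      return max(0.0, 1.0 - min(normalized, 1.0))

  # Example: a 4-CPU host with a 1-minute load of 1.00 gets weight 0.75.
  print(load_weight('10:01 up 3 days, load averages: 1.00 0.80 0.60',
                    period_index=0, num_cpus=4))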

Thanks
--jyh




Re: [openstack-dev] [nova][neutron] PCI pass-through SRIOV

2014-01-27 Thread yunhong jiang
On Mon, 2014-01-27 at 14:58 +, Robert Li (baoli) wrote:
 Hi Folks,
 
 
 In today's meeting, we discussed a scheduler issue for SRIOV. The
 basic requirement is for coexistence of the following compute nodes in
 a cloud:
   -- SRIOV only compute nodes
   -- non-SRIOV only compute nodes
   -- Compute nodes that can support both SRIOV and non-SRIOV
 ports. Lacking a proper name, let's call them compute nodes with
 hybrid NIC support, or simply hybrid compute nodes.
 
 
 I'm not sure it's practical to have hybrid compute nodes in a
 real cloud. But it may be useful in the lab to benchmark the
 performance differences between SRIOV, non-SRIOV, and coexistence of
 both.
 
 
 In a cloud that supports SRIOV in some of the compute nodes, a request
 such as:
 
 
  nova boot --flavor m1.large --image image-uuid --nic
 net-id=net-uuid vm
 
 
 doesn't require an SRIOV port. However, it's possible for the nova
 scheduler to place it on a compute node that supports SRIOV ports only.
 Since the neutron plugin runs on the controller, port-create would succeed
 unless neutron knows the host doesn't support non-SRIOV ports. But
 connectivity on the node would not be established, since no agent is
 running on that host to establish such connectivity.
 
 
 Irena brought up the idea of using a host aggregate. This requires
 creating a non-SRIOV host aggregate and using that in the above
 'nova boot' command. It should work.
 
 
 The patch I had introduced a new constraint in the existing PCI
 passthrough filter. 
 
 
 The consensus seems to be having a better solution in a later release.
 And for now, people can either use host aggregate or resort to their
 own means.
 
 
 Let's keep the discussion going on this. 
 
 
 Thanks,
 Robert
 
 
  
 
 
 
 
 
 
 On 1/24/14 4:50 PM, Robert Li (baoli) ba...@cisco.com wrote:
 
 
 Hi Folks,
 
 
 Based on Thursday's discussion and a chat with Irena, I took
 the liberty to add a summary and discussion points for SRIOV
 on Monday and onwards. Check it
 out https://wiki.openstack.org/wiki/Meetings/Passthrough.
 Please feel free to update it. Let's try to finalize it next
 week. The goal is to determine the BPs that need to get
 approved, and to start coding. 
 
 
 thanks,
 Robert
 
 
 
 
 On 1/22/14 8:03 AM, Robert Li (baoli) ba...@cisco.com
 wrote:
 
 
 Sounds great! Let's do it on Thursday.
 
 
 --Robert
 
 
 On 1/22/14 12:46 AM, Irena Berezovsky
 ire...@mellanox.com wrote:
 
 
 Hi Robert, all,
 
 I would suggest not to delay the SR-IOV
 discussion to the next week.
 
 Let’s try to cover the SRIOV side and
 especially the nova-neutron interaction points
 and interfaces this Thursday.
 
 Once we have the interaction points well
 defined, we can run parallel patches to cover
 the full story.
 
  
 
 Thanks a lot,
 
 Irena 
 
  
 
 From: Robert Li (baoli)
 [mailto:ba...@cisco.com] 
 Sent: Wednesday, January 22, 2014 12:02 AM
 To: OpenStack Development Mailing List (not
 for usage questions)
 Subject: [openstack-dev] [nova][neutron] PCI
 passthrough SRIOV
 
 
  
 
 Hi Folks,
 
 
  
 
 
 As the debate about PCI flavor versus host
 aggregate goes on, I'd like to move forward
 with the SRIOV side of things at the same
 time. I know that tomorrow's IRC will be
 focusing on the BP review, and it may well
 continue into Thursday. Therefore, let's start
 discussing the SRIOV side of things on Monday.
 
 
  
 
 
 Basically, we need to work out the details 

Re: [openstack-dev] [nova][neutron] PCI pass-through SRIOV

2014-01-27 Thread yunhong jiang
On Mon, 2014-01-27 at 21:14 +, Jani, Nrupal wrote:
 Hi,
 
  
 
 There are two possibilities for the hybrid compute nodes
 
 - In the first case, a compute node has two NICs: one SRIOV
 NIC and the other NIC for VirtIO.
 
 - In the 2nd case, the compute node has only one SRIOV NIC, where
 VFs are used for the VMs, either via macvtap or direct assignment, and
 the PF is used for the uplink to the Linux bridge or OVS!!
 
  
 
 My question to the team is whether we consider both of these
 deployments or not?
 
Nrupal, good question. I assume that if a NIC is to be used for the virtio
vNIC type, it will not be reported by the hypervisor as an assignable PCI
device, since the host owns it and OVS is set up on top of it.

Irena/Ian, please correct me. At least this is the assumption in the nova
PCI code, I think.

Thanks
--jyh 




Re: [openstack-dev] [Nova] why don't we deal with claims when live migrating an instance?

2014-01-17 Thread yunhong jiang
On Fri, 2014-01-17 at 09:39 -0800, Vishvananda Ishaya wrote:
 
 On Jan 16, 2014, at 9:41 PM, Jiang, Yunhong yunhong.ji...@intel.com
 wrote:
 
  I noticed the BP has been approved, but I really want to understand
  more on the reason, can anyone provide me some hints?
   
  In the BP, it states that “For resize, we need to confirm, as we
  want to give end user an opportunity to rollback”. But why do we
  want to give the user an opportunity to roll back a resize? And why does
  that reason not apply to cold migration and live migration?
 
 
 The confirm is so the user can verify that the instance is still
 functional in the new state. We leave the old instance around so they
 can abort and return to the old instance if something goes wrong. This
 could apply to cold migration as well since it uses the same code
 paths, but it definitely does not make sense in the case of
 live-migration, because there is no old vm to revert to.

Thanks for clarification.
 
 In the case of cold migration, the state is quite confusing as
 “RESIZE_VERIFY”, and the need to confirm is not immediately obvious so
 I think that is driving the change.
 
I didn't see a patch to change the state in that BP, so possibly it's
still on the way.

So basically the idea is: while we keep the implementation code path
shared for resize/cold migration as much as possible, we keep them
different from the user's point of view, with different configuration
options, different states, etc., right?
 
--jyh







Re: [openstack-dev] [OpenStack][Nova][compute] Why prune all compute node stats when sync up compute nodes

2014-01-17 Thread yunhong jiang
On Thu, 2014-01-16 at 00:22 +0800, Jay Lau wrote:
 Greeting,
 
 In compute/manager.py, there is a periodic task named
 update_available_resource(); it updates the resources for each compute
 node periodically.
 
     @periodic_task.periodic_task
     def update_available_resource(self, context):
         """See driver.get_available_resource()
 
         Periodic process that keeps that the compute host's understanding of
         resource availability and usage in sync with the underlying
         hypervisor.
 
         :param context: security context
         """
         new_resource_tracker_dict = {}
         nodenames = set(self.driver.get_available_nodes())
         for nodename in nodenames:
             rt = self._get_resource_tracker(nodename)
             rt.update_available_resource(context)  # <== update here
             new_resource_tracker_dict[nodename] = rt
 
 In resource_tracker.py,
 https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L384
 
 self._update(context, resources, prune_stats=True)
 
 It always sets prune_stats to True, and this caused some problems for me.
 I'm now putting some metrics into the compute_node_stats table; those
 metrics do not change frequently, so I do not update them frequently.
 But the periodic task always prunes the new metrics that I added.

 
IIUC, it's because the host resources may change dynamically, at least in
the original design?

 What about adding a configuration parameter in nova.conf to make
 prune_stats configurable?

Instead of making prune_stats configurable, would it make more sense to do
a lazy update, i.e. not update the DB if nothing has changed?
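
Something along these lines, a standalone sketch of the lazy-update idea
(not actual resource tracker code; the db_api parameter stands in for
whatever DB layer is used):

  def update_stats_lazily(db_api, context, compute_node_id,
                          new_stats, cached_stats):
      """Write stats to the DB only when they differ from the cached copy."""
      if new_stats == cached_stats:
          # Nothing changed since the last periodic run; skip the DB write.
          return cached_stats
      db_api.compute_node_update(context, compute_node_id,
                                 {'stats': new_stats})
      return dict(new_stats)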
 
 Thanks,
 
 
 Jay
 
 
 
 
 






Re: [openstack-dev] [nova] [neutron] PCI pass-through network support

2014-01-17 Thread yunhong jiang
On Fri, 2014-01-17 at 22:30 +, Robert Li (baoli) wrote:
 Yunhong,
 
 I'm hoping that these comments can be directly addressed:
   a practical deployment scenario that requires arbitrary
 attributes.

I'm strongly against supporting only one attribute (your PCI group) for
scheduling and management; that's really TOO limited.

A simple scenario: I have 3 types of encryption card:
Card 1 (vendor_id is V1, device_id=0xa)
Card 2 (vendor_id is V1, device_id=0xb)
Card 3 (vendor_id is V2, device_id=0xb)

I have two images. One image supports only Card 1 and another image
supports Card 1/3 (or any other combination of the 3 card types). I don't
think only one attribute will meet such a requirement.

As to arbitrary attributes versus a limited list of attributes, my opinion
is: since there are so many types of PCI devices and so many potential PCI
device usages, supporting arbitrary attributes will make our effort more
flexible, if we can push the implementation into the tree.
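
For context, this is roughly what the existing alias-style definition in
nova.conf looks like (syntax from memory, so treat it as a sketch); note
that a single vendor_id/product_id pair cannot express "Card 1 or Card 3",
which is exactly the kind of flexibility being discussed:

  # matches card 1 only
  pci_alias={"name":"enc-a","vendor_id":"V1","product_id":"000a"}
  # matches card 3 only
  pci_alias={"name":"enc-b","vendor_id":"V2","product_id":"000b"}
  # a flavor then requests devices by alias in its extra specs, e.g.
  #   pci_passthrough:alias="enc-a:1"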

   detailed design on the following (that also takes into account the
 introduction of predefined attributes):
 * PCI stats report, since the scheduler is stats based

I don't think there is much difference from the current implementation.

 * the scheduler in support of PCI flavors with arbitrary
 attributes and potential overlapping.

As Ian said, we need to make sure the pci_stats and the PCI flavor have the
same set of attributes, so I don't think there is much difference from the
current implementation.

   networking requirements to support multiple provider
 nets/physical
 nets

Can't the extra info resolve this issue? Can you elaborate on the issue?

Thanks
--jyh
 
 I guess that the above will become clear as the discussion goes on.
 And we also need to define the deliverables.
  
 Thanks,
 Robert 




Re: [openstack-dev] [nova] [neutron] PCI pass-through network support

2014-01-16 Thread yunhong jiang
On Thu, 2014-01-16 at 01:28 +0100, Ian Wells wrote:
 To clarify a couple of Robert's points, since we had a conversation
 earlier:
 On 15 January 2014 23:47, Robert Li (baoli) ba...@cisco.com wrote:
   ---  do we agree that BDF address (or device id, whatever
 you call it), and node id shouldn't be used as attributes in
 defining a PCI flavor?
 
 
 Note that the current spec doesn't actually exclude it as an option.
 It's just an unwise thing to do.  In theory, you could elect to define
 your flavors using the BDF attribute but determining 'the card in this
 slot is equivalent to all the other cards in the same slot in other
 machines' is probably not the best idea...  We could lock it out as an
 option or we could just assume that administrators wouldn't be daft
 enough to try.
 
 
 * the compute node needs to know the PCI flavor.
 [...] 
   - to support live migration, we need to use
 it to create network xml
 
 
 I didn't understand this at first and it took me a while to get what
 Robert meant here.
 
 This is based on Robert's current code for macvtap based live
 migration.  The issue is that if you wish to migrate a VM and it's
 tied to a physical interface, you can't guarantee that the same
 physical interface is going to be used on the target machine, but at
 the same time you can't change the libvirt.xml as it comes over with
 the migrating machine.  The answer is to define a network and refer
 out to it from libvirt.xml.  In Robert's current code he's using the
 group name of the PCI devices to create a network containing the list
 of equivalent devices (those in the group) that can be macvtapped.
 Thus when the host migrates it will find another, equivalent,
 interface.  This falls over in the use case under consideration where
 a device can be mapped using more than one flavor, so we have to
 discard the use case or rethink the implementation.
 
 There's a more complex solution - I think - where we create a
 temporary network for each macvtap interface a machine's going to use,
 with a name based on the instance UUID and port number, and containing
 the device to map.  Before starting the migration we would create a
 replacement network containing only the new device on the target host;
 migration would find the network from the name in the libvirt.xml, and
 the content of that network would behave identically.  We'd be
 creating libvirt networks on the fly and a lot more of them, and we'd
 need decent cleanup code too ('when freeing a PCI device, delete any
 network it's a member of'), so it all becomes a lot more hairy.

Ian/Robert, below is my understanding of the method Robert wants to use;
am I right?

a) Define a libvirt network as described in the "Using a macvtap direct
connection" section at http://libvirt.org/formatnetwork.html . For example,
something like the following:
  <network>
    <name>group_name1</name>
    <forward mode='bridge'>
      <interface dev='eth20'/>
      <interface dev='eth21'/>
      <interface dev='eth22'/>
      <interface dev='eth23'/>
      <interface dev='eth24'/>
    </forward>
  </network>


b) When assigning SRIOV NIC devices to an instance, as in the "Assignment
from a pool of SRIOV VFs in a libvirt network definition" section at
http://wiki.libvirt.org/page/Networking#PCI_Passthrough_of_host_network_devices
, use the libvirt network definition group_name1. For example, something
like the following:

  <interface type='network'>
    <source network='group_name1'/>
  </interface>


If my understanding is correct, then a few things are still unclear to me:
a) How will libvirt create the libvirt network (i.e. libvirt network
group_name1)? Will it be created when the compute node boots up, or will it
be created before instance creation? I suppose, per Robert's design, it's
created when the compute node is up, am I right?

b) If all the interfaces are used up by instances, what will happen?
Considering that 4 interfaces are allocated to the group_name1 libvirt
network, and the user tries to migrate 6 instances with the 'group_name1'
network, what will happen?

And below are my comments:

a) Yes, this is in fact different from the current nova PCI support
philosophy. Currently we assume Nova owns the devices and manages the
device assignment to each instance. In such a situation, the libvirt
network is in fact another layer of PCI device management (although a very
thin one)!

b) This also reminds me that other VMMs like XenAPI possibly have special
requirements, and we need input/confirmation from them as well.


As for how to resolve the issue, I think there are several solutions:

a) Create one libvirt network for each SRIOV NIC assigned to each
instance dynamically, i.e. the libvirt network always has only one
interface included; it may be statically or dynamically created. This
solution in fact removes the 

Re: [openstack-dev] [nova] Maintaining backwards compatibility for RPC calls

2013-11-27 Thread yunhong jiang
On Wed, 2013-11-27 at 12:38 +, Day, Phil wrote:
 Doesn’t this mean that you can’t deploy Icehouse (3.0) code into a
 Havana system but leave the RPC version pinned at Havana until all of
 the code has been updated ?  

I think it's because this change is for the compute manager, not for the
conductor or other services.

Thanks
--jyh




Re: [openstack-dev] [nova] Core pinning

2013-11-19 Thread yunhong jiang
On Tue, 2013-11-19 at 12:52 +, Daniel P. Berrange wrote:
 On Wed, Nov 13, 2013 at 02:46:06PM +0200, Tuomas Paappanen wrote:
  Hi all,
  
  I would like to hear your thoughts about core pinning in Openstack.
  Currently nova (with qemu-kvm) supports usage of a cpu set of PCPUs
  which can be used by instances. I didn't find a blueprint, but I think
  this feature is for isolating cpus used by the host from cpus used by
  instances (VCPUs).
  
  But, from a performance point of view, it is better to exclusively
  dedicate PCPUs to VCPUs and the emulator. In some cases you may want to
  guarantee that only one instance (and its VCPUs) is using certain
  PCPUs.  By using core pinning you can optimize instance performance
  based on e.g. cache sharing, NUMA topology, interrupt handling, PCI
  passthrough (SR-IOV) in multi-socket hosts, etc.
  
  We have already implemented a feature like this (a PoC with limitations)
  on the Nova Grizzly version and would like to hear your opinion about
  it.
  
  The current implementation consists of three main parts:
  - Definition of pcpu-vcpu maps for instances and instance spawning
  - (optional) Compute resource and capability advertising including
  free pcpus and NUMA topology.
  - (optional) Scheduling based on free cpus and NUMA topology.
  
  The implementation is quite simple:
  
  (additional/optional parts)
  Nova-computes advertise free pcpus and NUMA topology in the same
  manner as host capabilities. Instances are scheduled based on this
  information.
  
  (core pinning)
  admin can set PCPUs for VCPUs and for emulator process, or select
  NUMA cell for instance vcpus, by adding key:value pairs to flavor's
  extra specs.
  
  EXAMPLE:
  instance has 4 vcpus
  key:value
  vcpus:1,2,3,4 -- vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
  emulator:5 -- emulator pinned to pcpu5
  or
  numacell:0 -- all vcpus are pinned to pcpus in numa cell 0.
  
  In nova-compute, core pinning information is read from extra specs
  and added to domain xml same way as cpu quota values(cputune).
  
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='2'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='4'/>
    <emulatorpin cpuset='5'/>
  </cputune>
  
  What do you think? Implementation alternatives? Is this worth of
  blueprint? All related comments are welcome!
 
 I think there are several use cases mixed up in your descriptions
 here which should likely be considered independently
 
  - pCPU/vCPU pinning
 
I don't really think this is a good idea as a general purpose
feature in its own right. It tends to lead to fairly inefficient
use of CPU resources when you consider that a large % of guests
will be mostly idle most of the time. It has a fairly high
administrative burden to maintain explicit pinning too. This
feels like a data center virt use case rather than cloud use
case really.
 
  - Dedicated CPU reservation
 
The ability of an end user to request that their VM (or their
group of VMs) gets assigned a dedicated host CPU set to run on.
This is obviously something that would have to be controlled
at a flavour level, and in a commercial deployment would carry
a hefty pricing premium.
 
I don't think you want to expose explicit pCPU/vCPU placement
for this though. Just request the high level concept and allow
the virt host to decide actual placement
 
  - Host NUMA placement.
 
By not taking NUMA into account, the libvirt driver at least is
currently badly wasting resources. Having too much cross-NUMA-node
memory access by guests just kills scalability. The virt driver
should really figure out cpu & memory pinning within the scope of
a NUMA node automatically. No admin config should be required for
this.
 
  - Guest NUMA topology
 
If the flavour memory size / cpu count exceeds the size of a
single NUMA node, then the flavour should likely have a way to
express that the guest should see multiple NUMA nodes. The
virt host would then set guest NUMA topology to match the way
it places vCPUs  memory on host NUMA nodes. Again you don't
want explicit pcpu/vcpu mapping done by the admin for this.
 
 
 
 Regards,
 Daniel

Quite clear splitting, and +1 for the P/V pin option.
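
As a side note, translating the proposed extra specs into that cputune XML
is straightforward; a minimal sketch (hypothetical helper, key names taken
from the example above):

  def build_cputune_xml(extra_specs):
      """Build a <cputune> element from 'vcpus' / 'emulator' extra specs.

      'vcpus' lists the pcpu for each vcpu in order, e.g. '1,2,3,4'.
      """
      lines = ['<cputune>']
      pcpus = extra_specs.get('vcpus', '')
      for vcpu, pcpu in enumerate(p.strip() for p in pcpus.split(',') if p.strip()):
          lines.append("  <vcpupin vcpu='%d' cpuset='%s'/>" % (vcpu, pcpu))
      if 'emulator' in extra_specs:
          lines.append("  <emulatorpin cpuset='%s'/>" % extra_specs['emulator'])
      lines.append('</cputune>')
      return '\n'.join(lines)

  # Reproduces the cputune block from the example above.
  print(build_cputune_xml({'vcpus': '1,2,3,4', 'emulator': '5'}))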

--jyh





Re: [openstack-dev] [nova][api] Is this a potential issue

2013-11-18 Thread yunhong jiang
On Mon, 2013-11-18 at 10:18 -0500, Andrew Laski wrote:
 On 11/15/13 at 04:01pm, yunhong jiang wrote:
 On Fri, 2013-11-15 at 17:19 -0500, Andrew Laski wrote:

  If yes, would it be possible to create a special task_state such as IDLE,
  to distinguish it better? When no task is ongoing, the task_state would be
  IDLE instead of None.
 
 I'm starting on some work right now which will break task_state off into 
 its own model and API resource.  In my opinion we don't need to model 
 the idea of no task running, we can check if there are tasks for the 
 instance or not.  So I think that using None is fine here and we 
 shouldn't add an IDLE state.
 
Got it, thanks.

--jyh
 
 --jyh
 
 
 
 
 
 
 
 






Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-18 Thread yunhong jiang
On Mon, 2013-11-18 at 14:09 -0800, Joe Gordon wrote:
 
 Phil Day discussed this at the summit and I have finally gotten around
 to posting a POC of this. 
 
 https://review.openstack.org/#/c/57053/

Hi Joe, why do you think the DB is not the exact state, as in your commit
message below? I think the DB is kept up to date by the resource tracker,
am I right (the resource tracker gets the underlying resource information
periodically, but I think that information is mostly static)? And I think
the scheduler retry mainly comes from the race condition between multiple
scheduler instances.

We already have the concept that the DB isn't the exact state of the
world, right now it's updated every 10 seconds. And we use the scheduler
retry mechanism to handle cases where the scheduler was wrong. 




Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-18 Thread yunhong jiang
On Mon, 2013-11-18 at 15:32 -0800, Joe Gordon wrote:
 
 
 
 On Mon, Nov 18, 2013 at 4:08 PM, yunhong jiang
 yunhong.ji...@linux.intel.com wrote:
 On Mon, 2013-11-18 at 14:09 -0800, Joe Gordon wrote:
 
  Phil Day discussed this at the summit and I have finally
 gotten around
  to posting a POC of this.
 
  https://review.openstack.org/#/c/57053/
 
 
 Hi, Joe, why you think the DB is not exact state in your
 followed commit
 message? I think the DB is updated to date by resource
 tracker, am I
 right (the resource tracker get the underlying resource
 information
 periodically but I think that information is mostly static).
 And I think
 the scheduler retry mainly comes from the race condition of
 multiple
 scheduler instance.
 
 
 
 
 You answered the question yourself, the compute nodes (indirectly)
 update the DB periodically, so the further you are from the last
 periodic update the less up to date the DB is.
 
But the compute node will also update the DB if any claim changes
between the periodic runs, and considering that the resource tracker
currently calculates the instance usage (like RAM, cores, etc.) itself
instead of depending on the hypervisor report, I think the DB information
should be considered mostly up to date.

Of course, I'm not against the information cache.

--jyh
 
 It's there for both reasons.  But yes, it was originally put there
 because of the multi-scheduler race condition.
  
 
 We already have the concept that the DB isn't the exact state
 of the
 world, right now it's updated every 10 seconds. And we use the
 scheduler
 retry mechanism to handle cases where the scheduler was wrong.
 
 
 






Re: [openstack-dev] [nova][api] Is this a potential issue

2013-11-15 Thread yunhong jiang
On Fri, 2013-11-15 at 17:19 -0500, Andrew Laski wrote:
 On 11/15/13 at 07:30am, Dan Smith wrote:
  You're not missing anything.  But I think that's a bug, or at least an
  unexpected change in behaviour from how it used to work.  If you follow
  instance_update() in nova.db.sqlalchemy.api just the presence of
  expected_task_state triggers the check.  So we may need to find a way to
  pass that through with the save method.
 
 This came up recently. We decided that since we no longer have a kwargs
 dictionary to test for the presence or absence of that flag, that we
 would require setting it to a tuple, which is already supported for
 allowing multiple state possibilities. So, if you pass
 expected_task_state=(None,) then it will do the right thing.
 
 Make sense?
 
 Perfect.  I thought the old method was a bit counterintuitive and 
 started thinking this would be better after I sent the email earlier.
 

I checked, and it seems most usages of instance.save() with
expected_task_state=None assume an exception and need to change. Can I
assume this rule applies to all of them?

If yes, would it be possible to create a special task_state such as IDLE,
to distinguish it better? When no task is ongoing, the task_state would be
IDLE instead of None.
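
For the record, a tiny standalone sketch of the check semantics Dan
describes (this just mimics the behaviour, it is not the actual nova
save() code):

  def check_expected_task_state(current, expected):
      """'expected' is a tuple of allowed task states, or None for no check."""
      if expected is None:
          return True                      # no check requested at all
      if not isinstance(expected, tuple):
          expected = (expected,)
      return current in expected

  # expected_task_state=(None,) means "only proceed if no task is in flight":
  print(check_expected_task_state(None, (None,)))         # True
  print(check_expected_task_state('rebooting', (None,)))  # False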

--jyh







