[openstack-dev] [Nova] why force_config_drive is a per-compute-node config
Greetings, I have some questions about the force_config_drive configuration option and hope to get some hints.

a) Why do we want this option at all? Per my understanding, if users want a config drive they can ask for one in the nova boot request. Or is it there because the user may have no idea whether cloud-init is installed in the image?

b) Even if we do want to force the config drive, why is it a per-compute-node config instead of a cloud-wide config? A per-compute-node setting will cause some migration issues, per my understanding.

Did I miss anything important?

Thanks
--jyh
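For context, a hedged sketch of the two paths being compared; the CLI flag and the option value are from memory, so double-check them against the current tree before relying on them:

    # Per-instance: the user explicitly asks for a config drive at boot time
    nova boot --flavor m1.small --image <image-uuid> --config-drive true my-vm

    # Per-compute-node: nova.conf on that node forces a config drive for
    # every instance it hosts, regardless of what the user requested
    [DEFAULT]
    force_config_drive = always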
Re: [openstack-dev] [nova] Question about USB passthrough
On Tue, 2014-02-25 at 03:05 +0000, Liuji (Jeremy) wrote:
> Now that USB devices are used so widely in private/hybrid clouds, for example as USB keys, and there are no technical issues in libvirt/qemu, I think it would be a valuable feature in OpenStack.

The USB key is an interesting scenario. I assume the USB key is meant for some specific VM; I'm wondering how the admin/user would know which USB disk goes to which VM?

--jyh
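As a point of reference, the libvirt/qemu mechanism Jeremy refers to looks roughly like the following hostdev element (a sketch of libvirt's USB passthrough syntax; the vendor and product ids are made up for illustration):

    <hostdev mode='subsystem' type='usb'>
      <source>
        <vendor id='0x1234'/>
        <product id='0xbeef'/>
      </source>
    </hostdev>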
Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing
On Tue, 2014-02-25 at 10:45 +0000, John Garbutt wrote:
> As a heads up, the overheads of DB calls turned out to dwarf any algorithmic improvements I managed. There will clearly be some RPC overhead, but it didn't stand out as much as the DB issue. The move-to-conductor work should certainly stop the scheduler making those pesky DB calls to update the nova instance. And then improvements like no-db-scheduler, and improvements to the scheduling algorithms, should shine through much more.

DB access is surely the key for performance, but do we really want to pursue a conductor-based scheduler?

--jyh
Re: [openstack-dev] [nova] Question about USB passthrough
On Mon, 2014-02-24 at 04:10 +0000, Liuji (Jeremy) wrote:
> I have found a BP about USB device passthrough at https://blueprints.launchpad.net/nova/+spec/host-usb-passthrough. I have also read the latest nova code and confirmed that it doesn't support USB passthrough as of now. Is there any progress or plan for USB passthrough?

I don't know of anyone working on USB passthrough.

--jyh
Re: [openstack-dev] [nova] pci device hotplug
On Mon, 2014-02-17 at 06:43 +0000, Gouzongmei wrote:
> Hello,
>
> In the current PCI passthrough implementation, a PCI device is only allowed to be assigned to an instance while the instance is being created; it is not allowed to be assigned to, or removed from, the instance while the instance is running or stopped. Besides, I noticed that the basic ability to remove a PCI device from an instance (other than by deleting the flavor) has never been implemented or proposed by anyone.
>
> The current implementation: https://wiki.openstack.org/wiki/Pci_passthrough
>
> I have tested NIC hotplug in my experimental environment; it is supported by the latest libvirt and qemu. My question is, why has PCI device hotplug not been proposed in OpenStack until now, and is anyone planning to do PCI device hotplug?

Agreed that PCI hotplug is an important feature. The reason there is no support yet is bandwidth: the folks working on PCI have spent a lot of time on the SR-IOV NIC discussion.

--jyh
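For reference, the libvirt-level operation being described (outside of Nova) is roughly the following; the device XML file name is just a placeholder:

    # Hot-plug a PCI device (e.g. an SR-IOV VF) into a running guest, and
    # later hot-unplug it again. vf-interface.xml is a hypothetical file
    # containing the <interface>/<hostdev> definition for the device.
    virsh attach-device <domain> vf-interface.xml --live
    virsh detach-device <domain> vf-interface.xml --live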
Re: [openstack-dev] [Nova] bp proposal: filter based on the load averages of the host
On Fri, 2014-02-14 at 15:29 +0000, sahid wrote:
> Greetings,
>
> I would like to add a new filter based on load averages. This filter will use the command uptime and will provide an option to choose a period among 1, 5, and 15 minutes, and an option to choose the max load average (a float between 0 and 1).
>
> Why: during scheduling it could be useful to exclude a host that has too heavy a load, and the command uptime (available on all Linux systems) can return the load average of the system over different periods.
>
> About the implementation: currently 'all' drivers (libvirt, xenapi, vmware) support a method get_host_uptime that returns the output of the command 'uptime'. We have to add to compute/stats.py a new method calculate_loadavg() that returns, based on the output of driver.get_host_uptime() from compute/resource_tracker.py, a well formatted tuple of load averages for each period. We also need to update api/openstack/compute/contrib/hypervisors.py to take care of this new field.
>
> The implementation will be divided into several parts:
> * Add to host_manager the possibility to get the load averages
> * Implement the filter based on this new property
> * Implement the filter with a per-aggregate configuration
>
> The blueprint: https://blueprints.launchpad.net/nova/+spec/filter-based-uptime
>
> I will be happy to get any comments about this filter; perhaps it is not implemented yet because of something I didn't see, or my thinking about the implementation is wrong.
>
> PS: I have checked metrics and cpu_resource, but they do not give an average of the system load, or perhaps I have not understood it all.
>
> Thanks a lot,
> s.

I think load average reflects more than just CPU; you need to consider things like I/O usage, or even other metrics. Maybe you can have a look at https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling ?

Also, IMHO the policy of excluding a host that has too heavy a load is not so clean; would it be better to keep the usage as a scheduler weight?

Thanks
--jyh
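To make the proposal concrete, here is a minimal sketch of what such a filter might look like, assuming the host state has been extended with a load_averages tuple populated from driver.get_host_uptime(). That field and the option names below are hypothetical, not existing Nova code:

    from oslo.config import cfg

    from nova.scheduler import filters

    CONF = cfg.CONF
    CONF.register_opts([
        cfg.IntOpt('loadavg_period', default=1,
                   help='Load average period to use: 1, 5 or 15 minutes'),
        cfg.FloatOpt('max_loadavg', default=0.8,
                     help='Exclude hosts whose load exceeds this value'),
    ])


    class LoadAverageFilter(filters.BaseHostFilter):
        """Exclude hosts whose load average is above a configured threshold."""

        def host_passes(self, host_state, filter_properties):
            index = {1: 0, 5: 1, 15: 2}.get(CONF.loadavg_period, 0)
            # load_averages is assumed to be a (1min, 5min, 15min) tuple kept
            # up to date by the resource tracker; it does not exist today.
            load = getattr(host_state, 'load_averages', None)
            if load is None:
                # No data reported for this host: do not exclude it.
                return True
            return load[index] <= CONF.max_loadavg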
Re: [openstack-dev] [nova][neutron] PCI pass-through SRIOV
On Mon, 2014-01-27 at 14:58 +0000, Robert Li (baoli) wrote:

Hi Folks,

In today's meeting we discussed a scheduler issue for SRIOV. The basic requirement is coexistence of the following compute nodes in a cloud:
  -- SRIOV-only compute nodes
  -- non-SRIOV-only compute nodes
  -- compute nodes that can support both SRIOV and non-SRIOV ports. For lack of a proper name, let's call them compute nodes with hybrid NIC support, or simply hybrid compute nodes.

I'm not sure it's practical to have hybrid compute nodes in a real cloud, but it may be useful in the lab to benchmark the performance differences between SRIOV, non-SRIOV, and the coexistence of both.

In a cloud that supports SRIOV on some of the compute nodes, a request such as:

  nova boot --flavor m1.large --image image-uuid --nic net-id=net-uuid vm

doesn't require an SRIOV port. However, it's possible for the nova scheduler to place it on a compute node that supports SRIOV ports only. Since the neutron plugin runs on the controller, port-create would succeed unless neutron knows the host doesn't support non-SRIOV ports. But connectivity on the node would not be established, since no agent is running on that host to establish such connectivity.

Irena brought up the idea of using a host aggregate. This requires creation of a non-SRIOV host aggregate, and using that in the above 'nova boot' command. It should work. The patch I had posted introduced a new constraint in the existing PCI passthrough filter. The consensus seems to be to have a better solution in a later release; for now, people can either use host aggregates or resort to their own means. Let's keep the discussion going on this.

Thanks,
Robert

On 1/24/14 4:50 PM, Robert Li (baoli) ba...@cisco.com wrote:
Hi Folks,
Based on Thursday's discussion and a chat with Irena, I took the liberty to add a summary and discussion points for SRIOV on Monday and onwards. Check it out at https://wiki.openstack.org/wiki/Meetings/Passthrough. Please feel free to update it. Let's try to finalize it next week. The goal is to determine the BPs that need to get approved, and to start coding.
thanks,
Robert

On 1/22/14 8:03 AM, Robert Li (baoli) ba...@cisco.com wrote:
Sounds great! Let's do it on Thursday.
--Robert

On 1/22/14 12:46 AM, Irena Berezovsky ire...@mellanox.com wrote:
Hi Robert, all,
I would suggest not to delay the SR-IOV discussion to next week. Let's try to cover the SRIOV side, and especially the nova-neutron interaction points and interfaces, this Thursday. Once we have the interaction points well defined, we can run parallel patches to cover the full story.
Thanks a lot,
Irena

From: Robert Li (baoli) [mailto:ba...@cisco.com]
Sent: Wednesday, January 22, 2014 12:02 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [nova][neutron] PCI passthrough SRIOV

Hi Folks,
As the debate about PCI flavor versus host aggregate goes on, I'd like to move forward with the SRIOV side of things at the same time. I know that tomorrow's IRC will be focusing on the BP review, and it may well continue into Thursday. Therefore, let's start discussing the SRIOV side of things on Monday. Basically, we need to work out the details
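As a side note for readers, the host-aggregate workaround mentioned above would look roughly like the following (CLI and filter names from memory; treat this as a sketch, not a recipe):

    # Tag the non-SRIOV hosts and steer "plain" flavors to them via the
    # AggregateInstanceExtraSpecsFilter.
    nova aggregate-create non-sriov
    nova aggregate-add-host non-sriov compute-01
    nova aggregate-set-metadata non-sriov sriov=false
    nova flavor-key m1.large set aggregate_instance_extra_specs:sriov=false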
Re: [openstack-dev] [nova][neutron] PCI pass-through SRIOV
On Mon, 2014-01-27 at 21:14 +0000, Jani, Nrupal wrote:
> Hi,
> There are two possibilities for the hybrid compute nodes:
> - In the first case, a compute node has two NICs: one SRIOV NIC, and the other NIC for VirtIO.
> - In the second case, the compute node has only one SRIOV NIC, where VFs are used for the VMs, either via macvtap or direct assignment, and the PF is used for the uplink to the Linux bridge or OVS!
> My question to the team is whether we consider both of these deployments or not?

Nrupal, good question. I assume that if a NIC will be used for the vNIC type, it will not be reported by the hypervisor as an assignable PCI device, since the host owns it and OVS is set up on top of it. Irena/Ian, please correct me. At least this is the assumption in the nova PCI code, I think.

Thanks
--jyh
Re: [openstack-dev] [Nova] why don't we deal with claims when live migrating an instance?
On Fri, 2014-01-17 at 09:39 -0800, Vishvananda Ishaya wrote:
> On Jan 16, 2014, at 9:41 PM, Jiang, Yunhong yunhong.ji...@intel.com wrote:
> > I noticed the BP has been approved, but I really want to understand more of the reasoning; can anyone provide me some hints? In the BP, it states that "For resize, we need to confirm, as we want to give the end user an opportunity to rollback". But why do we want to give the user an opportunity to roll back a resize? And why does that reason not apply to cold migration and live migration?
>
> The confirm is so the user can verify that the instance is still functional in the new state. We leave the old instance around so they can abort and return to the old instance if something goes wrong. This could apply to cold migration as well, since it uses the same code paths, but it definitely does not make sense in the case of live migration, because there is no old VM to revert to.

Thanks for the clarification. In the case of cold migration, the state is quite confusing as "RESIZE_VERIFY", and the need to confirm is not immediately obvious, so I think that is driving the change. I didn't see a patch to change the state in that BP, so possibly it's still on the way. So basically the idea is: while we keep the implementation code paths for resize/cold migration combined as much as possible, we keep them different from the user's point of view, with different configuration options, different states, etc., right?

--jyh
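For readers following along, the confirm/revert window under discussion maps to the following nova CLI flow (a sketch from memory; check the client help for exact names):

    nova resize <server> <new-flavor>   # instance moves into a verify state
    nova resize-confirm <server>        # accept the resized instance
    nova resize-revert <server>         # or roll back to the old instance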
Re: [openstack-dev] [OpenStack][Nova][compute] Why prune all compute node stats when sync up compute nodes
On Thu, 2014-01-16 at 00:22 +0800, Jay Lau wrote:
> Greetings,
>
> In compute/manager.py there is a periodic task named update_available_resource(); it updates the resources for each compute node periodically:
>
>     @periodic_task.periodic_task
>     def update_available_resource(self, context):
>         """See driver.get_available_resource()
>
>         Periodic process that keeps that the compute host's understanding of
>         resource availability and usage in sync with the underlying hypervisor.
>
>         :param context: security context
>         """
>         new_resource_tracker_dict = {}
>         nodenames = set(self.driver.get_available_nodes())
>         for nodename in nodenames:
>             rt = self._get_resource_tracker(nodename)
>             rt.update_available_resource(context)  # <-- Update here
>             new_resource_tracker_dict[nodename] = rt
>
> In resource_tracker.py,
> https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L384
>
>     self._update(context, resources, prune_stats=True)
>
> always sets prune_stats to True, and this causes some problems for me. I am putting some metrics into the compute_node_stats table; those metrics do not change frequently, so I do not update them frequently. But the periodic task always prunes the new metrics that I added.

IIUC, it's because the host resources may change dynamically, at least in the original design?

> What about adding a configuration parameter in nova.conf to make prune_stats configurable?

Instead of making prune_stats a configuration option, would it make more sense to do a lazy update, i.e. not update the DB if nothing changed?

> Thanks,
> Jay
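To illustrate the lazy-update idea in the last paragraph, a rough sketch (the cached attribute and the write helper are hypothetical, not the actual resource tracker code):

    def _update(self, context, resources, prune_stats=True):
        # Skip the periodic DB write entirely when nothing changed since the
        # last sync; only fall through to the real update on a change.
        if resources == getattr(self, '_last_synced_resources', None):
            return
        self._last_synced_resources = dict(resources)
        # _write_compute_node stands in for the existing DB update call.
        self._write_compute_node(context, resources, prune_stats)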
Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
On Fri, 2014-01-17 at 22:30 +0000, Robert Li (baoli) wrote:
> Yunhong,
> I'm hoping that these comments can be directly addressed:
>   a practical deployment scenario that requires arbitrary attributes.

I'm just strongly against supporting only one attribute (your PCI group) for scheduling and management; that's really TOO limited. A simple scenario: I have three encryption cards:

  card 1 (vendor_id is V1, device_id = 0xa)
  card 2 (vendor_id is V1, device_id = 0xb)
  card 3 (vendor_id is V2, device_id = 0xb)

I have two images. One image supports only card 1, and another image supports cards 1 and 3 (or any other combination of the three card types). I don't think only one attribute will meet such a requirement.

As to arbitrary attributes versus a limited list of attributes, my opinion is that, as there are so many types of PCI devices and so many potential PCI device usages, supporting arbitrary attributes will make our effort more flexible, if we can push the implementation into the tree.

>   detailed design on the following (that also takes into account the introduction of predefined attributes):
>   * PCI stats report, since the scheduler is stats based

I don't think there is much difference from the current implementation.

>   * the scheduler in support of PCI flavors with arbitrary attributes and potential overlapping.

As Ian said, we need to make sure the pci_stats and the PCI flavor have the same set of attributes, so I don't think there is much difference from the current implementation.

>   networking requirements to support multiple provider nets/physical nets

Can't the extra info resolve this issue? Can you elaborate on the issue?

Thanks
--jyh

> I guess that the above will become clear as the discussion goes on. And we also need to define the deliveries.
>
> Thanks,
> Robert
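As background for readers, the existing alias mechanism that this debate builds on looks roughly like the following (Havana-era syntax from memory; the vendor/product ids are the placeholders from the example above, not real values). Note that expressing "card 1 or card 3" for the second image is exactly the kind of flexibility under discussion:

    # nova.conf: one alias per card type a flavor should be able to request
    pci_alias = {"vendor_id":"v1", "product_id":"000a", "name":"crypto_card1"}
    pci_alias = {"vendor_id":"v2", "product_id":"000b", "name":"crypto_card3"}

    # Flavor for the image that only works with card 1:
    nova flavor-key m1.crypto set "pci_passthrough:alias"="crypto_card1:1"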
Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
On Thu, 2014-01-16 at 01:28 +0100, Ian Wells wrote:
> To clarify a couple of Robert's points, since we had a conversation earlier:
>
> On 15 January 2014 23:47, Robert Li (baoli) ba...@cisco.com wrote:
> > --- do we agree that BDF address (or device id, whatever you call it), and node id shouldn't be used as attributes in defining a PCI flavor?
>
> Note that the current spec doesn't actually exclude it as an option. It's just an unwise thing to do. In theory, you could elect to define your flavors using the BDF attribute, but determining 'the card in this slot is equivalent to all the other cards in the same slot in other machines' is probably not the best idea... We could lock it out as an option, or we could just assume that administrators wouldn't be daft enough to try.
>
> > * the compute node needs to know the PCI flavor. [...]
> >   - to support live migration, we need to use it to create network xml
>
> I didn't understand this at first and it took me a while to get what Robert meant here. This is based on Robert's current code for macvtap-based live migration. The issue is that if you wish to migrate a VM and it's tied to a physical interface, you can't guarantee that the same physical interface is going to be used on the target machine, but at the same time you can't change the libvirt.xml as it comes over with the migrating machine. The answer is to define a network and refer out to it from libvirt.xml. In Robert's current code he's using the group name of the PCI devices to create a network containing the list of equivalent devices (those in the group) that can be macvtapped. Thus when the host migrates it will find another, equivalent, interface. This falls over in the use case under consideration where a device can be mapped using more than one flavor, so we have to discard the use case or rethink the implementation.
>
> There's a more complex solution - I think - where we create a temporary network for each macvtap interface a machine's going to use, with a name based on the instance UUID and port number, and containing the device to map. Before starting the migration we would create a replacement network containing only the new device on the target host; migration would find the network from the name in the libvirt.xml, and the content of that network would behave identically. We'd be creating libvirt networks on the fly and a lot more of them, and we'd need decent cleanup code too ('when freeing a PCI device, delete any network it's a member of'), so it all becomes a lot more hairy.

Ian/Robert, below is my understanding of the method Robert wants to use; am I right?

a) Define a libvirt network as in the "Using a macvtap direct connection" section at http://libvirt.org/formatnetwork.html . For example, something like:

    <network>
      <name>group_name1</name>
      <forward mode='bridge'>
        <interface dev='eth20'/>
        <interface dev='eth21'/>
        <interface dev='eth22'/>
        <interface dev='eth23'/>
        <interface dev='eth24'/>
      </forward>
    </network>

b) When assigning SRIOV NIC devices to an instance, as in the "Assignment from a pool of SRIOV VFs in a libvirt network definition" section at http://wiki.libvirt.org/page/Networking#PCI_Passthrough_of_host_network_devices , use the libvirt network definition group_name1. For example, something like:

    <interface type='network'>
      <source network='group_name1'/>
    </interface>

If my understanding is correct, then a few things are still unclear to me:

a) How will libvirt create the libvirt network (i.e. the libvirt network group_name1)? Will it be created when the compute node boots up, or will it be created before instance creation? I suppose, per Robert's design, it's created when the compute node is up; am I right?

b) If all the interfaces are used up by instances, what will happen? Considering that 4 interfaces are allocated to the group_name1 libvirt network, and the user tries to migrate 6 instances with the 'group_name1' network, what will happen?

And below are my comments:

a) Yes, this is in fact different from the current nova PCI support philosophy. Currently we assume Nova owns the devices and manages the device assignment to each instance, while in this situation the libvirt network is in fact another layer of PCI device management (although a very thin one)!

b) This also reminds me that possibly other VMMs like XenAPI have special requirements, and we need input/confirmation from them also.

As to how to resolve the issue, I think there are several solutions:

a) Create one libvirt network for each SRIOV NIC assigned to each instance dynamically, i.e. the libvirt network always includes only one interface; it may be statically or dynamically created. This solution in fact removes the
Re: [openstack-dev] [nova] Maintaining backwards compatibility for RPC calls
On Wed, 2013-11-27 at 12:38 +0000, Day, Phil wrote:
> Doesn't this mean that you can't deploy Icehouse (3.0) code into a Havana system but leave the RPC version pinned at Havana until all of the code has been updated?

I think it's because this change is for the compute manager, not for the conductor or other services.

Thanks
--jyh
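For readers unfamiliar with the pinning being referred to, it is roughly the following nova.conf setting on the upgraded services (section and option name from memory; verify against the release notes before relying on it):

    [upgrade_levels]
    compute = havana    # keep sending compute RPC at the Havana version
                        # until every nova-compute has been upgraded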
Re: [openstack-dev] [nova] Core pinning
On Tue, 2013-11-19 at 12:52 +0000, Daniel P. Berrange wrote:
> On Wed, Nov 13, 2013 at 02:46:06PM +0200, Tuomas Paappanen wrote:
> > Hi all,
> >
> > I would like to hear your thoughts about core pinning in OpenStack. Currently nova (with qemu-kvm) supports usage of a cpu set of PCPUs that can be used by instances. I didn't find a blueprint, but I think this feature is for isolating the cpus used by the host from the cpus used by instances (VCPUs).
> >
> > But, from a performance point of view it is better to exclusively dedicate PCPUs to VCPUs and the emulator. In some cases you may want to guarantee that only one instance (and its VCPUs) is using certain PCPUs. By using core pinning you can optimize instance performance based on e.g. cache sharing, NUMA topology, interrupt handling, PCI passthrough (SR-IOV) in multi-socket hosts, etc.
> >
> > We have already implemented a feature like this (a PoC with limitations) on the Nova Grizzly version and would like to hear your opinion about it.
> >
> > The current implementation consists of three main parts:
> > - Definition of pcpu-vcpu maps for instances and instance spawning
> > - (optional) Compute resource and capability advertising, including free pcpus and NUMA topology.
> > - (optional) Scheduling based on free cpus and NUMA topology.
> >
> > The implementation is quite simple:
> >
> > (additional/optional parts) Nova-computes advertise free pcpus and NUMA topology in the same manner as host capabilities. Instances are scheduled based on this information.
> >
> > (core pinning) The admin can set PCPUs for VCPUs and for the emulator process, or select a NUMA cell for instance vcpus, by adding key:value pairs to a flavor's extra specs.
> >
> > EXAMPLE: instance has 4 vcpus
> >   key:value
> >   vcpus:1,2,3,4  --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
> >   emulator:5     --> emulator pinned to pcpu5
> >   or
> >   numacell:0     --> all vcpus are pinned to pcpus in numa cell 0.
> >
> > In nova-compute, core pinning information is read from the extra specs and added to the domain xml in the same way as the cpu quota values (cputune):
> >
> >     <cputune>
> >       <vcpupin vcpu='0' cpuset='1'/>
> >       <vcpupin vcpu='1' cpuset='2'/>
> >       <vcpupin vcpu='2' cpuset='3'/>
> >       <vcpupin vcpu='3' cpuset='4'/>
> >       <emulatorpin cpuset='5'/>
> >     </cputune>
> >
> > What do you think? Implementation alternatives? Is this worth a blueprint? All related comments are welcome!
>
> I think there are several use cases mixed up in your description here which should likely be considered independently:
>
> - pCPU/vCPU pinning
>
>   I don't really think this is a good idea as a general purpose feature in its own right. It tends to lead to fairly inefficient use of CPU resources when you consider that a large % of guests will be mostly idle most of the time. It has a fairly high administrative burden to maintain explicit pinning too. This feels like a data center virt use case rather than a cloud use case really.
>
> - Dedicated CPU reservation
>
>   The ability of an end user to request that their VM (or their group of VMs) gets assigned a dedicated host CPU set to run on. This is obviously something that would have to be controlled at a flavour level, and in a commercial deployment would carry a hefty pricing premium. I don't think you want to expose explicit pCPU/vCPU placement for this though. Just request the high level concept and allow the virt host to decide actual placement.
>
> - Host NUMA placement
>
>   By not taking NUMA into account, the libvirt driver at least is currently badly wasting resources. Having too much cross-NUMA-node memory access by guests just kills scalability. The virt driver should really figure out cpu/memory pinning within the scope of a NUMA node automatically. No admin config should be required for this.
>
> - Guest NUMA topology
>
>   If the flavour memory size / cpu count exceeds the size of a single NUMA node, then the flavour should likely have a way to express that the guest should see multiple NUMA nodes. The virt host would then set the guest NUMA topology to match the way it places vCPUs and memory on host NUMA nodes. Again you don't want explicit pcpu/vcpu mapping done by the admin for this.
>
> Regards,
> Daniel

Quite clear splitting, and +1 for the P/V pin option.

--jyh
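To make the flavor-driven flow concrete, a rough sketch of how such extra specs could be turned into the <cputune> pinning above. It is purely illustrative; the extra spec keys come from the PoC described in the thread, not from upstream Nova:

    def build_cputune(extra_specs):
        """Map PoC-style extra specs to (vcpu, pcpu) pin pairs."""
        pins = []
        pcpus = extra_specs.get('vcpus')        # e.g. "1,2,3,4"
        if pcpus:
            for vcpu, pcpu in enumerate(pcpus.split(',')):
                pins.append((vcpu, int(pcpu)))  # vcpu0 -> pcpu1, ...
        emulator = extra_specs.get('emulator')  # e.g. "5"
        return pins, int(emulator) if emulator else None

    # Example: a flavor with vcpus:1,2,3,4 and emulator:5 yields
    # [(0, 1), (1, 2), (2, 3), (3, 4)] plus emulator pin 5, matching the
    # <cputune> element shown in the quoted mail.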
Re: [openstack-dev] [nova][api] Is this a potential issue
On Mon, 2013-11-18 at 10:18 -0500, Andrew Laski wrote:
> On 11/15/13 at 04:01pm, yunhong jiang wrote:
> > On Fri, 2013-11-15 at 17:19 -0500, Andrew Laski wrote:
> > If yes, would it be possible to create a special task_state of IDLE, to distinguish it better? When no task is on-going, the task_state would be IDLE instead of None.
>
> I'm starting on some work right now which will break task_state off into its own model and API resource. In my opinion we don't need to model the idea of no task running; we can check whether there are tasks for the instance or not. So I think that using None is fine here and we shouldn't add an IDLE state.

Got it, thanks.

--jyh
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On Mon, 2013-11-18 at 14:09 -0800, Joe Gordon wrote:
> Phil Day discussed this at the summit and I have finally gotten around to posting a POC of this: https://review.openstack.org/#/c/57053/

Hi Joe, why do you think the DB is not the exact state, as in your commit message quoted below? I think the DB is kept up to date by the resource tracker, am I right? (The resource tracker gets the underlying resource information periodically, but I think that information is mostly static.) And I think the scheduler retries mainly come from the race condition between multiple scheduler instances.

> We already have the concept that the DB isn't the exact state of the world; right now it's updated every 10 seconds. And we use the scheduler retry mechanism to handle cases where the scheduler was wrong.
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On Mon, 2013-11-18 at 15:32 -0800, Joe Gordon wrote:
> On Mon, Nov 18, 2013 at 4:08 PM, yunhong jiang yunhong.ji...@linux.intel.com wrote:
> > On Mon, 2013-11-18 at 14:09 -0800, Joe Gordon wrote:
> > > Phil Day discussed this at the summit and I have finally gotten around to posting a POC of this: https://review.openstack.org/#/c/57053/
> >
> > Hi Joe, why do you think the DB is not the exact state, as in your commit message below? I think the DB is kept up to date by the resource tracker, am I right? (The resource tracker gets the underlying resource information periodically, but I think that information is mostly static.) And I think the scheduler retries mainly come from the race condition between multiple scheduler instances.
>
> You answered the question yourself: the compute nodes (indirectly) update the DB periodically, so the further you are from the last periodic update, the less up to date the DB is.

But the compute node will also update the DB if any claim changes between the periodic updates. Also, considering that the resource tracker currently calculates the instance usage (RAM, cores, etc.) itself instead of depending on the hypervisor report, I think the DB information should be considered mostly up to date. Of course, I'm not against the information cache.

--jyh

> It's there for both reasons. But yes, it was originally put there because of the multi-scheduler race condition.
>
> > > We already have the concept that the DB isn't the exact state of the world; right now it's updated every 10 seconds. And we use the scheduler retry mechanism to handle cases where the scheduler was wrong.
Re: [openstack-dev] [nova][api] Is this a potential issue
On Fri, 2013-11-15 at 17:19 -0500, Andrew Laski wrote:
> On 11/15/13 at 07:30am, Dan Smith wrote:
> > > You're not missing anything. But I think that's a bug, or at least an unexpected change in behaviour from how it used to work. If you follow instance_update() in nova.db.sqlalchemy.api, just the presence of expected_task_state triggers the check. So we may need to find a way to pass that through with the save method.
> >
> > This came up recently. We decided that, since we no longer have a kwargs dictionary to test for the presence or absence of that flag, we would require setting it to a tuple, which is already supported for allowing multiple state possibilities. So, if you pass expected_task_state=(None,) then it will do the right thing. Make sense?
>
> Perfect. I thought the old method was a bit counterintuitive and started thinking this would be better after I sent the email earlier.

I checked, and it seems most usages of instance.save() with expected_state=None assume an exception and need to change. Can I assume this rule applies to all of them?

If yes, would it be possible to create a special task_state of IDLE, to distinguish it better? When no task is on-going, the task_state would be IDLE instead of None.

--jyh
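To illustrate the pattern being agreed on, a small hedged sketch (the task state chosen is just an example, and the exception name is what I believe the DB layer raises, so verify before relying on it):

    from nova.compute import task_states

    # Save only if no task is currently in progress on the instance;
    # passing a tuple makes the "expect None" case explicit.
    instance.task_state = task_states.REBOOTING
    instance.save(expected_task_state=(None,))
    # If some other task sneaked in, the save raises
    # UnexpectedTaskStateError instead of silently clobbering it.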