Re: [openstack-dev] [nova] Proposal: Move CPU and memory allocation ratio out of scheduler

2014-06-09 Thread Chris Friesen
On 06/09/2014 07:59 AM, Jay Pipes wrote: On 06/06/2014 08:07 AM, Murray, Paul (HP Cloud) wrote: Forcing an instance to a specific host is very useful for the operator - it fulfills a valid use case for monitoring and testing purposes. Pray tell, what is that valid use case? I find it useful

Re: [openstack-dev] [nova] Proposal: Move CPU and memory allocation ratio out of scheduler

2014-06-03 Thread Chris Friesen
On 06/03/2014 07:29 AM, Jay Pipes wrote: Hi Stackers, tl;dr = Move CPU and RAM allocation ratio definition out of the Nova scheduler and into the resource tracker. Remove the calculations for overcommit out of the core_filter and ram_filter scheduler pieces. Makes sense to me. Chris
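For readers unfamiliar with the overcommit calculation being discussed, here is a minimal sketch of what an allocation ratio does. The function name and the numbers are hypothetical and chosen only for illustration; this is not the code in core_filter or ram_filter.

```python
def has_capacity(total, used, requested, allocation_ratio):
    """Return True if `requested` more units fit under the overcommit limit.

    total            -- physical units on the host (e.g. cores or MB of RAM)
    used             -- units already claimed by existing instances
    requested        -- units the new instance asks for
    allocation_ratio -- overcommit multiplier (hypothetical value per call)
    """
    limit = total * allocation_ratio  # virtual capacity after overcommit
    return used + requested <= limit
```

With 8 physical cores and a ratio of 16.0 the virtual limit is 128, so a request for 8 more vCPUs fits when 120 are in use and does not fit when 121 are in use. The proposal is about where this arithmetic lives, not about the arithmetic itself.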

Re: [openstack-dev] [Nova][Heat] Custom Nova Flavor creation through Heat (pt.2)

2014-05-05 Thread Chris Friesen
On 05/05/2014 10:51 AM, Steve Gordon wrote: In addition extra specifications may denote the passthrough of additional devices, adding another dimension. This seems likely to be the case in the use case outlined in the original thread [1]. Thanks, Steve [1]

Re: [openstack-dev] [Nova] [Heat] Custom Nova Flavor creation through Heat (pt.2)

2014-05-05 Thread Chris Friesen
On 05/05/2014 11:40 AM, Solly Ross wrote: One thing that I was discussing with @jaypipes and @dansmith over on IRC was the possibility of breaking flavors down into separate components -- i.e. have a disk flavor, a CPU flavor, and a RAM flavor. This way, you still get the control of the size of

Re: [openstack-dev] [Nova] [Heat] Custom Nova Flavor creation through Heat (pt.2)

2014-05-05 Thread Chris Friesen
On 05/05/2014 12:18 PM, Chris Friesen wrote: As a simplifying view you could keep the existing flavors which group all of them, while still allowing instances to specify each one separately if desired. Also, if we're allowing the cpu/memory/disk to be specified independently at instance boot
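One way to picture the "simplifying view" above is a bundled flavor that supplies defaults, with optional per-resource overrides at boot time. This is only a sketch: the Flavor class, resolve_resources() and the sizes used are hypothetical and do not correspond to any existing Nova API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flavor:
    """Hypothetical bundled flavor grouping CPU, RAM and disk sizes."""
    name: str
    vcpus: int
    ram_mb: int
    disk_gb: int


def resolve_resources(flavor: Flavor,
                      vcpus: Optional[int] = None,
                      ram_mb: Optional[int] = None,
                      disk_gb: Optional[int] = None) -> dict:
    """Start from the bundled flavor, then apply any per-resource override."""
    return {
        "vcpus": flavor.vcpus if vcpus is None else vcpus,
        "ram_mb": flavor.ram_mb if ram_mb is None else ram_mb,
        "disk_gb": flavor.disk_gb if disk_gb is None else disk_gb,
    }
```

Callers that pass no overrides get exactly the existing flavor behaviour, which is the point of keeping the grouped flavors around.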

Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?

2014-04-30 Thread Chris Friesen
On 04/30/2014 03:41 PM, Mike Spreitzer wrote: Chris Friesen chris.frie...@windriver.com wrote on 04/28/2014 10:44:46 AM: Using a property of a heat resource to trigger the creation of a nova resource would not fit that model. For the sake of your argument, let's pretend that the new ASG

Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?

2014-04-28 Thread Chris Friesen
On 04/26/2014 09:41 PM, Jay Lau wrote: Just noticed this email, I have already filed a blueprint related to this topic https://blueprints.launchpad.net/heat/+spec/vm-instance-group-support My idea is that can we add a new field such as PlacemenetPolicy to AutoScalingGroup? If the value is

Re: [openstack-dev] [nova] Proposal: remove the server groups feature

2014-04-28 Thread Chris Friesen
On 04/25/2014 03:15 PM, Jay Pipes wrote: There are myriad problems with the above user experience and implementation. Let me explain them. 1. The user isn't creating a server group when they issue a nova server-group-create call. They are creating a policy and calling it a group. Cognitive

Re: [openstack-dev] [nova] Proposal: remove the server groups feature

2014-04-28 Thread Chris Friesen
On 04/28/2014 06:58 AM, Steve Gordon wrote: - Original Message - Create two new options to nova boot: --near-tag TAG and --not-near-tag TAG The first would tell the scheduler to place the new VM near other VMs having a particular tag. The latter would tell the scheduler to place the

Re: [openstack-dev] [nova] Proposal: remove the server groups feature

2014-04-28 Thread Chris Friesen
On 04/28/2014 11:22 AM, Dan Smith wrote: 2. There's no way to add an existing server to this group. In the original API there was a way to add existing servers to the group. This didn't make it into the code that was submitted. It is however supported by the instance group db API in nova.

[openstack-dev] [Heat] looking to add support for server groups to heat...any comments?

2014-04-25 Thread Chris Friesen
I'm looking to add support for server groups to heat. I've got working code, but I thought I'd post the overall design here in case people had objections. Basically, what I propose is to add a class NovaServerGroup resource. Currently it would only support a policy property to store the

Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?

2014-04-25 Thread Chris Friesen
On 04/25/2014 11:01 AM, Mike Spreitzer wrote: Zane Bitter zbit...@redhat.com wrote on 04/25/2014 12:36:00 PM: On 25/04/14 12:23, Chris Friesen wrote: More important is Zane's following question. The Server class would be extended with an optional server_group property. If it is set

Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?

2014-04-25 Thread Chris Friesen
On 04/25/2014 12:00 PM, Zane Bitter wrote: On 25/04/14 13:50, Chris Friesen wrote: In the nova boot command we pass the group uuid like this: --hint group=e4cf5dea-4831-49a1-867d-e263f2579dd0 If we were to make use of the scheduler hints, how would that look? Something like this? (I'm

Re: [openstack-dev] [nova] wrap_instance_event() swallows return codes....on purpose?

2014-04-22 Thread Chris Friesen
On 04/22/2014 06:34 AM, Russell Bryant wrote: On 04/21/2014 06:01 PM, Chris Friesen wrote: Hi all, In compute/manager.py the function wrap_instance_event() just calls function(). This means that if it's used to decorate a function that returns a value, then the caller will never see

[openstack-dev] [nova] wrap_instance_event() swallows return codes....on purpose?

2014-04-21 Thread Chris Friesen
Hi all, In compute/manager.py the function wrap_instance_event() just calls function(). This means that if it's used to decorate a function that returns a value, then the caller will never see the return code. Is this a bug, or is the expectation that we would only ever use this wrapper
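The behaviour described above is easy to reproduce in isolation. The two decorators below are hypothetical stand-ins written for illustration, not the actual wrap_instance_event() code; they only show the difference between calling the wrapped function and returning its result.

```python
import functools


def wrap_swallowing(function):
    """Calls the wrapped function but drops whatever it returns."""
    @functools.wraps(function)
    def decorated(*args, **kwargs):
        function(*args, **kwargs)  # result discarded, caller sees None
    return decorated


def wrap_returning(function):
    """Same wrapper, but the result is passed back to the caller."""
    @functools.wraps(function)
    def decorated(*args, **kwargs):
        return function(*args, **kwargs)
    return decorated


@wrap_swallowing
def do_work_swallowed():
    return "done"


@wrap_returning
def do_work_returned():
    return "done"
```

Here do_work_swallowed() evaluates to None while do_work_returned() evaluates to "done", so any caller that checks the return value only works with the second form.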

Re: [openstack-dev] [Openstack][nova][Neutron] Launch VM with multiple Ethernet interfaces with I.P. of single subnet.

2014-04-17 Thread Chris Friesen
On 04/17/2014 06:37 AM, CARVER, PAUL wrote: Aaron Rosen wrote: Sorry not really. It's still not clear to me why multiple nics would be required on the same L2 domain. I’m a fan of this old paper for nostalgic reasons

Re: [openstack-dev] deliver the vm-level HA to improve the business continuity with openstack

2014-04-16 Thread Chris Friesen
On 04/15/2014 08:33 PM, Jay Pipes wrote: On Tue, 2014-04-15 at 12:01 +0100, Duncan Thomas wrote: On 14 April 2014 19:51, James Penick pen...@yahoo-inc.com wrote: We drive the "VM=Cattle" message pretty hard. Part of onboarding a property to our cloud, and allowing them to serve traffic from

Re: [openstack-dev] [nova] Server Groups are not an optional element, bug or feature ?

2014-04-09 Thread Chris Friesen
On 04/09/2014 03:45 AM, Day, Phil wrote: -Original Message- From: Russell Bryant We were thinking that there may be a use for being able to query a full list of instances (including the deleted ones) for a group. The API just hasn't made it that far yet. Just hiding them for now

Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

2014-04-09 Thread Chris Friesen
On 04/09/2014 03:55 AM, Day, Phil wrote: I would guess that affinity is more likely to be a soft requirement than anti-affinity, in that I can see some services just not meeting their HA goals without anti-affinity but I'm struggling to think of a use case why affinity is a must for the

Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

2014-04-08 Thread Chris Friesen
On 04/08/2014 07:25 AM, Jay Pipes wrote: On Tue, 2014-04-08 at 10:49 +, Day, Phil wrote: On a large cloud you’re protected against this to some extent if the number of servers is number of instances in the quota. However it does feel that there are a couple of things missing to really

Re: [openstack-dev] [heat] [nova] How should a holistic scheduler relate to Heat?

2014-04-08 Thread Chris Friesen
On 04/04/2014 12:42 AM, Mike Spreitzer wrote: Now let us consider how to evolve the Nova API so that a server-group can be scheduled holistically. That is, we want to enable the scheduler to look at both the group's policies and its membership, all at once, and make a joint decision about how

Re: [openstack-dev] [heat] [nova] How should a holistic scheduler relate to Heat?

2014-04-08 Thread Chris Friesen
On 04/08/2014 11:27 AM, Mike Spreitzer wrote: There really should be one more step in that flow. Consider a create scenario. In general, as the client makes the calls to create individual resources: some will succeed, some will fail (some in ways that make it clear the capacity will not be

Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

2014-04-03 Thread Chris Friesen
On 04/03/2014 07:51 AM, Sylvain Bauza wrote: Hi, I'm currently trying to reproduce [1]. This bug requires having the same host in two different aggregates, each one having an AZ. IIRC, Nova API prevents hosts from being part of two distinct AZs [2], so IMHO this request should not be possible.

Re: [openstack-dev] [OpenStack-Dev][Nova][VMWare] Enable live migration with one nova compute

2014-04-03 Thread Chris Friesen
On 04/03/2014 05:48 PM, Jay Pipes wrote: On Mon, 2014-03-31 at 17:11 +0800, Jay Lau wrote: Hi, Currently with VMWare VCDriver, one nova compute can manage multiple clusters/RPs; this means a cluster admin cannot do live migration between clusters/RPs if those clusters/RPs are managed by one nova

[openstack-dev] [nova] hw_qemu_guest_agent is littering /var/lib/libvirt/qemu

2014-04-01 Thread Chris Friesen
When enabling hw_qemu_guest_agent on an image and booting it up, qemu creates a unix socket in /var/lib/libvirt/qemu. However, it looks like it never gets cleaned up when the instance is deleted. Ideally it seems like something that qemu or libvirt should handle, but would it maybe make

[openstack-dev] [nova] [bug] unit tests sqlite regexp() function doesn't behave like mysql

2014-03-31 Thread Chris Friesen
I mentioned this last week in another thread but I suspect it got lost. I recently came across a situation where the code failed when running it under devstack but passed the unit tests. It turns out that the unit tests regexp() behaves differently than the built-in one in mysql. Down in
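To illustrate how a test-only regexp() can diverge from the database's own, here is a sketch using the standard sqlite3 module. SQLite ships no regexp() implementation, so whatever function the application registers defines the behaviour; MySQL's REGEXP yields NULL when either operand is NULL. The two helper functions and the table are hypothetical and are not the helper used by the nova unit tests.

```python
import re
import sqlite3


def lenient_regexp(pattern, item):
    """Stringifies its input, so a NULL column value becomes the text 'None'."""
    return re.search(pattern, str(item)) is not None


def null_aware_regexp(pattern, item):
    """Propagates NULL the way MySQL's REGEXP does."""
    if pattern is None or item is None:
        return None
    return re.search(pattern, str(item)) is not None


def count_matches(regexp_impl):
    """Count rows with a NULL deleted_at that match the pattern 'None'."""
    conn = sqlite3.connect(":memory:")
    # SQLite evaluates "X REGEXP Y" by calling regexp(Y, X).
    conn.create_function("regexp", 2, regexp_impl)
    conn.execute("CREATE TABLE t (deleted_at TEXT)")
    conn.execute("INSERT INTO t VALUES (NULL)")
    row = conn.execute(
        "SELECT COUNT(*) FROM t WHERE deleted_at REGEXP 'None'").fetchone()
    return row[0]
```

With the lenient function the NULL row matches; with the NULL-aware one it does not, which is the kind of mismatch that lets a unit test pass while the same query fails against a real server.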

Re: [openstack-dev] [nova] [bug] unit tests sqlite regexp() function doesn't behave like mysql

2014-03-31 Thread Chris Friesen
On 03/31/2014 08:54 AM, Chris Friesen wrote: I mentioned this last week in another thread but I suspect it got lost. I forgot to mention...I've opened this as a bug (https://bugs.launchpad.net/nova/+bug/1298690) but I wanted to get some wider suggestions as to the proper way to deal

Re: [openstack-dev] [nova] [bug] unit tests sqlite regexp() function doesn't behave like mysql

2014-03-31 Thread Chris Friesen
On 03/31/2014 09:24 AM, Solly Ross wrote: IMHO, stringifying None and then expecting the *string* to match NULL is wrong. Could we check to see if `filters[filter_name]` is None and deal with that case separately (i.e. `if filters[filter_name] is None: filter = column_is_null_check
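The suggestion above, handling None as its own case instead of stringifying it, can be sketched with plain sqlite3. The table, the find() helper and the sample rows are hypothetical; the real code builds its query through sqlalchemy, which is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (uuid TEXT, deleted_at TEXT)")
conn.executemany(
    "INSERT INTO instances VALUES (?, ?)",
    [("a", None), ("b", "2014-03-27 15:57:00")],
)


def find(filters):
    """Build a WHERE clause, using IS NULL when a filter value is None.

    Column names are interpolated directly, so they must come from
    trusted code, never from user input.
    """
    clauses, params = [], []
    for column, value in filters.items():
        if value is None:
            clauses.append(f"{column} IS NULL")
        else:
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT uuid FROM instances"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return [row[0] for row in conn.execute(sql, params)]


# What goes wrong when None is turned into the string 'None' first.
stringified = conn.execute(
    "SELECT uuid FROM instances WHERE deleted_at = ?", (str(None),)
).fetchall()
```

find({"deleted_at": None}) returns the row that has not been deleted, while the stringified comparison returns nothing because no row holds the literal text 'None'.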

Re: [openstack-dev] [nova] [bug] nova server-group-list doesn't show any members

2014-03-31 Thread Chris Friesen
On 03/31/2014 03:56 PM, Vishvananda Ishaya wrote: On Mar 27, 2014, at 4:38 PM, Chris Friesen chris.frie...@windriver.com wrote: On 03/27/2014 04:47 PM, Chris Friesen wrote: Interestingly, unit test

Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

2014-03-28 Thread Chris Friesen
On 03/28/2014 05:01 AM, Jesse Pretorius wrote: On 27 March 2014 20:52, Chris Friesen chris.frie...@windriver.com wrote: It'd be nice to be able to do a heat template where you could specify things like put these three servers on separate hosts from

Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

2014-03-27 Thread Chris Friesen
On 03/27/2014 11:48 AM, Day, Phil wrote: Sorry if I'm coming late to this thread, but why would you define AZs to cover orthogonal zones? See Vish's first message. AZs are a very specific form of aggregate - they provide a particular isolation schematic between the hosts (i.e. physical hosts

Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

2014-03-27 Thread Chris Friesen
On 03/27/2014 12:28 PM, Day, Phil wrote: Personally I'm a bit worried about users having too fine a granularity over where they place a server - AZs are generally few and big so you can afford to allow this and not have capacity issues, but if I had to expose 40 different rack based zones it

Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

2014-03-27 Thread Chris Friesen
On 03/27/2014 12:49 PM, Day, Phil wrote: -Original Message- From: Chris Friesen [mailto:chris.frie...@windriver.com] On 03/27/2014 11:48 AM, Day, Phil wrote: nova boot --availability-zone az1 --scheduler-hint want-fast-cpu --scheduler-hint want-ssd ... Does this actually work

[openstack-dev] [nova] [bug] nova server-group-list doesn't show any members

2014-03-27 Thread Chris Friesen
I've filed this as a bug (https://bugs.launchpad.net/nova/+bug/1298494) but I thought I'd post it here as well to make sure it got visibility. If I create a server group, then boot a server as part of the group, then run nova server-group-list it doesn't show the server as being a member of

Re: [openstack-dev] [nova] [bug] nova server-group-list doesn't show any members

2014-03-27 Thread Chris Friesen
On 03/27/2014 03:57 PM, Chris Friesen wrote: If I change the filter to use 'deleted': False instead of 'deleted_at': None then it works as expected. This leads to a couple of questions: 1) There is a column deleted_at in the database table, why can't we filter on it? I wonder if maybe

Re: [openstack-dev] [nova] [bug] nova server-group-list doesn't show any members

2014-03-27 Thread Chris Friesen
On 03/27/2014 03:57 PM, Chris Friesen wrote: This leads to a couple of questions: 1) There is a column deleted_at in the database table, why can't we filter on it? 2) How did this get submitted when it doesn't work? I've updated to the current codebase in devstack and I'm still seeing

Re: [openstack-dev] [nova] [bug] nova server-group-list doesn't show any members

2014-03-27 Thread Chris Friesen
On 03/27/2014 04:47 PM, Chris Friesen wrote: Interestingly, unit test nova.tests.api.openstack.compute.contrib.test_server_groups.ServerGroupTest.test_display_members passes just fine, and it seems to be running the same sqlalchemy code. Is this a case where sqlite behaves differently from

Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

2014-03-26 Thread Chris Friesen
On 03/25/2014 02:50 PM, Sangeeta Singh wrote: What I am trying to achieve is have two AZ that the user can select during the boot but then have a default AZ which has the HV from both AZ1 AND AZ2 so that when the user does not specify any AZ in the boot command I scatter my VM on both the AZ

Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

2014-03-26 Thread Chris Friesen
On 03/26/2014 10:47 AM, Vishvananda Ishaya wrote: Personally I view this as a bug. There is no reason why we shouldn’t support arbitrary grouping of zones. I know there is at least one problem with zones that overlap regarding displaying them properly:

Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

2014-03-26 Thread Chris Friesen
On 03/26/2014 11:17 AM, Khanh-Toan Tran wrote: I don't know why you need a compute node that belongs to 2 different availability-zones. Maybe I'm wrong but for me it's logical that availability-zones do not share the same compute nodes. The availability-zones have the role of partition your

[openstack-dev] [nova] should there be an audit to clear the REBOOTING task_state?

2014-03-25 Thread Chris Friesen
I've reported a bug (https://bugs.launchpad.net/nova/+bug/1296967) where we got stuck with a task_state of REBOOTING due to what seem to be RPC issues. Regardless of how we got there, currently there is no audit that will clear the task_state if it gets stuck. Because of this, once we got
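The kind of audit being asked about could look roughly like the sketch below: a periodic pass that clears a REBOOTING task_state once it has been stuck longer than some timeout. The Instance class, the function name and the ten-minute default are all hypothetical; a real audit would also have to decide what to do with the instance afterwards.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Instance:
    """Hypothetical, stripped-down instance record."""
    uuid: str
    task_state: Optional[str]
    updated_at: datetime


def clear_stuck_reboots(instances: List[Instance],
                        now: datetime,
                        timeout: timedelta = timedelta(minutes=10)) -> List[str]:
    """Reset task_state on instances stuck in REBOOTING past the timeout.

    Returns the uuids that were cleared so the caller can log them.
    """
    cleared = []
    for inst in instances:
        if inst.task_state == "REBOOTING" and now - inst.updated_at > timeout:
            inst.task_state = None
            cleared.append(inst.uuid)
    return cleared
```

Instances that entered REBOOTING recently are left alone, so a reboot that is genuinely in progress is not disturbed.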

Re: [openstack-dev] [All][Keystone] Deprecation of the v2 API

2014-03-25 Thread Chris Friesen
On 03/25/2014 04:50 PM, Russell Bryant wrote: We discussed the deprecation of the v2 keystone API in the cross-project meeting today [1]. This thread is to recap and bring that discussion to some consensus. snip In summary, until we have completed v3 support within OpenStack itself, it's

Re: [openstack-dev] auto-delete in amqp reply_* queues in OpenStack

2014-03-24 Thread Chris Friesen
On 03/24/2014 02:59 AM, Dmitry Mescheryakov wrote: Chris, In oslo.messaging a single reply queue is used to gather results from all the calls. It is created lazily on the first call and is used until the process is killed. I did a quick look at oslo.rpc from oslo-incubator and it seems like it

[openstack-dev] [nova] nova-compute not re-establishing connectivity after controller switchover

2014-03-24 Thread Chris Friesen
We've been stress-testing our system doing controlled switchover of the controller. Normally this works okay, but we've run into a situation that seems to show a flaw in the reconnection logic. On the compute node, nova-compute has managed to get into a state where it shows as down in nova

Re: [openstack-dev] [nova] nova-compute not re-establishing connectivity after controller switchover

2014-03-24 Thread Chris Friesen
On 03/24/2014 10:41 AM, Chris Friesen wrote: We've been stress-testing our system doing controlled switchover of the controller. Normally this works okay, but we've run into a situation that seems to show a flaw in the reconnection logic. On the compute node, nova-compute has managed to get

Re: [openstack-dev] [nova] nova-compute not re-establishing connectivity after controller switchover

2014-03-24 Thread Chris Friesen
On 03/24/2014 10:59 AM, Dan Smith wrote: Any ideas on what might be going on would be appreciated. This looks like something that should be filed as a bug. I don't have any ideas off hand, but I will note that the reconnection logic works fine for us in the upstream upgrade tests. That

Re: [openstack-dev] [nova] nova-compute not re-establishing connectivity after controller switchover

2014-03-24 Thread Chris Friesen
On 03/24/2014 11:31 AM, Chris Friesen wrote: It looks like we're raising RecoverableConnectionError: connection already closed down in /usr/lib64/python2.7/site-packages/amqp/abstract_channel.py, but nothing handles it. It looks like the most likely place that should be handling
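The missing handling described above amounts to catching the connection error, reconnecting, and retrying the operation. A generic sketch of that pattern follows; ConnectionClosedError and call_with_reconnect() are hypothetical names used for illustration, not the amqp library's exception or the RPC code's actual API.

```python
class ConnectionClosedError(Exception):
    """Hypothetical stand-in for a recoverable connection error."""


def call_with_reconnect(operation, reconnect, max_attempts=3):
    """Run operation(); on a closed connection, reconnect and try again.

    The error is re-raised once max_attempts is exhausted so the caller
    still finds out about a connection that never recovers.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionClosedError:
            if attempt == max_attempts:
                raise
            reconnect()
```

Without a handler of this kind somewhere in the call path, the first closed-connection error simply propagates and the service never re-establishes its link.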

Re: [openstack-dev] auto-delete in amqp reply_* queues in OpenStack

2014-03-24 Thread Chris Friesen
On 03/24/2014 01:27 PM, Dmitry Mescheryakov wrote: I see two possible explanations for these 5 remaining queues: * They were indeed recreated by 'compute' services. I.e. controller service send some command over rpc and then it was shut down. Its reply queue was automatically deleted, since

Re: [openstack-dev] [nova] nova-compute not re-establishing connectivity after controller switchover

2014-03-24 Thread Chris Friesen
On 03/24/2014 07:45 PM, Chris Behrens wrote: Do you have some sort of network device like a firewall between your compute and rabbit or you failed from one rabbit over to another? There are two controllers (active/standby) and two computes all hooked up to the same switch. We definitely did

Re: [openstack-dev] [nova] nova-compute not re-establishing connectivity after controller switchover

2014-03-24 Thread Chris Friesen
On 03/24/2014 09:24 PM, Chris Friesen wrote: The problem is that the RPC code in Havana doesn't handle connection error exceptions. The oslo.messaging code used in Icehouse does. If we ported https://github.com/openstack/oslo.messaging/commit/0400cbf4f83cf8d58076c7e65e08a156ec3508a8

[openstack-dev] auto-delete in amqp reply_* queues in OpenStack

2014-03-23 Thread Chris Friesen
Hi, If I run rabbitmqadmin list queues on my controller node I see 28 queues with names of the form reply_uuid. From what I've been reading, these queues are supposed to be used for the replies to rpc calls, they're not durable, and they all have auto_delete set to True. Given the above,

Re: [openstack-dev] [nova] Backwards incompatible API changes

2014-03-21 Thread Chris Friesen
This is sort of off on a tangent, but one of the things that resulted in this being a problem was the fact that if someone creates a private flavor and then tries to add access second flavor access call will fail because the tenant already is on the access list. Something I was

Re: [openstack-dev] [nova] instances stuck with task_state of REBOOTING

2014-03-21 Thread Chris Friesen
On 03/21/2014 08:41 AM, Solly Ross wrote: Well, if messages are getting dropped on the floor due to communication issues, that's not a good thing. If you have time, could you determine why the messages are getting dropped on the floor? We shouldn't be doing things that require both the

[openstack-dev] [nova] instances stuck with task_state of REBOOTING

2014-03-20 Thread Chris Friesen
I'm running a havana install, and during some testing I've managed to get the system into a state where two instances are up and running but are reporting a task_state of REBOOTING. I can see the nova-api logs showing the soft-reboot request. I don't see a corresponding nova-compute log

Re: [openstack-dev] [nova] instances stuck with task_state of REBOOTING

2014-03-20 Thread Chris Friesen
On 03/20/2014 12:06 PM, Solly Ross wrote: Hi Chris, Are you in the position to determine whether or not this happens with the latest master code? Either way, it definitely looks like a bug. Unfortunately not right now, working towards a deadline. If you could give more specific reproduction

Re: [openstack-dev] [nova] instances stuck with task_state of REBOOTING

2014-03-20 Thread Chris Friesen
On 03/20/2014 12:29 PM, Chris Friesen wrote: The fact that there are no success or error logs in nova-compute.log makes me wonder if we somehow got stuck in self.driver.reboot(). Also, I'm kind of wondering what would happen if nova-compute was running reboot_instance() and we rebooted

Re: [openstack-dev] [Nova][Heat] How to reliably detect VM failures?

2014-03-19 Thread Chris Friesen
On 03/18/2014 11:18 AM, Zane Bitter wrote: On 18/03/14 12:42, Steven Dake wrote: You should be able to use the HARestarter resource and functionality to do healthchecking of a vm. HARestarter is actually pretty problematic, both in a causes major architectural headaches for Heat and will

Re: [openstack-dev] [Marconi] Why is marconi a queue implementation vs a provisioning API?

2014-03-19 Thread Chris Friesen
On 03/19/2014 02:24 PM, Fox, Kevin M wrote: Can someone please give more detail into why MongoDB being AGPL is a problem? The drivers that Marconi uses are Apache2 licensed, MongoDB is separated by the network stack and MongoDB is not exposed to the Marconi users so I don't think the 'A' part of

Re: [openstack-dev] [Nova][Heat] How to reliably detect VM failures?

2014-03-19 Thread Chris Friesen
On 03/19/2014 08:38 PM, Qiming Teng wrote: I don't think it's a good idea to rely on some external monitoring system to do VM failure detection. It means additional steps to set up, additional software to upgrade, additional chapter in the Operator's Guide, etc. We are evaluating whether

Re: [openstack-dev] [nova] need help with unit test framework, trying to fix bug 1292963

2014-03-18 Thread Chris Friesen
On 03/17/2014 04:28 PM, Chris Friesen wrote: The second one filters out all of the objects and returns nothing. (Pdb) query_prefix.filter(models.Instance.vm_state != vm_states.SOFT_DELETED).all() [] I think I've found another problem. (The rabbit hole continues...) It appears

Re: [openstack-dev] [nova] question about e41fb84 fix anti-affinity race condition on boot

2014-03-17 Thread Chris Friesen
On 03/17/2014 11:59 AM, John Garbutt wrote: On 17 March 2014 17:54, John Garbutt j...@johngarbutt.com wrote: Given the scheduler split, writing that value into the nova db from the scheduler would be a step backwards, and it probably breaks lots of code that assumes the host is not set until

Re: [openstack-dev] [nova] question about e41fb84 fix anti-affinity race condition on boot

2014-03-17 Thread Chris Friesen
On 03/17/2014 01:29 PM, Andrew Laski wrote: On 03/17/14 at 01:11pm, Chris Friesen wrote: On 03/17/2014 11:59 AM, John Garbutt wrote: On 17 March 2014 17:54, John Garbutt j...@johngarbutt.com wrote: Given the scheduler split, writing that value into the nova db from the scheduler would

[openstack-dev] [nova] need help with unit test framework, trying to fix bug 1292963

2014-03-17 Thread Chris Friesen
I've submitted code for review at https://review.openstack.org/80808 but it seems to break the unit tests. Where do the deleted and deleted_at fields for the instance get created for unit tests? Where is the database stored for unit tests, and is there a way to look at it directly? Here

Re: [openstack-dev] [nova] question about e41fb84 fix anti-affinity race condition on boot

2014-03-17 Thread Chris Friesen
On 03/17/2014 02:30 PM, Sylvain Bauza wrote: There is a global concern here about how an holistic scheduler can perform decisions, and from which key metrics. The current effort is leading to having the Gantt DB updated thanks to resource tracker for scheduling appropriately the hosts. If we

Re: [openstack-dev] [nova] need help with unit test framework, trying to fix bug 1292963

2014-03-17 Thread Chris Friesen
On 03/17/2014 04:04 PM, Joe Gordon wrote: On Mon, Mar 17, 2014 at 2:16 PM, Chris Friesen chris.frie...@windriver.com wrote: The original code looks like this: filters = {'uuid': filter_uuids, 'deleted_at': None} instances

Re: [openstack-dev] [nova] question about e41fb84 fix anti-affinity race condition on boot

2014-03-17 Thread Chris Friesen
On 03/17/2014 05:01 PM, Sylvain Bauza wrote: There are 2 distinct cases : 1. there are multiple schedulers involved in the decision 2. there is one single scheduler but there is a race condition on it About 1., I agree we need to see how the scheduler (and later on Gantt) could address

[openstack-dev] [nova] [bug?] possible postgres/mysql incompatibility in InstanceGroup.get_hosts()

2014-03-15 Thread Chris Friesen
Hi, I'm trying to run InstanceGroup.get_hosts() on a havana installation that uses postgres. When I run the code, I get the following error: RemoteError: Remote error: ProgrammingError (ProgrammingError) operator does not exist: timestamp without time zone ~ unknown 2014-03-14 09:58:57.193

[openstack-dev] [nova] question about e41fb84 fix anti-affinity race condition on boot

2014-03-15 Thread Chris Friesen
Hi, I'm curious why the specified git commit chose to fix the anti-affinity race condition by aborting the boot and triggering a reschedule. It seems to me that it would have been more elegant for the scheduler to do a database transaction that would atomically check that the chosen host
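As a rough picture of the transactional alternative raised above, the sketch below checks and records a host choice inside a single transaction. It assumes the check is "no other member of the group is already on this host", which is an interpretation of the anti-affinity policy rather than something stated in the message; the table and the claim_host() helper are hypothetical, and sqlite3 stands in for the real database.

```python
import sqlite3


def make_db():
    # isolation_level=None lets us issue BEGIN/COMMIT explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE group_members "
        "(group_id TEXT, instance_uuid TEXT, host TEXT)")
    return conn


def claim_host(conn, group_id, instance_uuid, host):
    """Record the placement only if the group has no member on `host`.

    BEGIN IMMEDIATE takes the write lock up front, so the check and the
    insert cannot interleave with another writer's check and insert.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        taken = conn.execute(
            "SELECT 1 FROM group_members WHERE group_id = ? AND host = ?",
            (group_id, host)).fetchone()
        if taken is None:
            conn.execute(
                "INSERT INTO group_members VALUES (?, ?, ?)",
                (group_id, instance_uuid, host))
        conn.execute("COMMIT")
        return taken is None
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

A False return tells the scheduler to pick a different host immediately, instead of discovering the conflict later on the compute node and rescheduling.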

Re: [openstack-dev] [nova] [bug?] possible postgres/mysql incompatibility in InstanceGroup.get_hosts()

2014-03-15 Thread Chris Friesen
On 03/15/2014 04:29 AM, Sean Dague wrote: On 03/15/2014 02:49 AM, Chris Friesen wrote: Hi, I'm trying to run InstanceGroup.get_hosts() on a havana installation that uses postgres. When I run the code, I get the following error: RemoteError: Remote error: ProgrammingError (ProgrammingError

Re: [openstack-dev] UTF-8 required charset/encoding for openstack database?

2014-03-12 Thread Chris Friesen
On 03/11/2014 05:50 PM, Clint Byrum wrote: But MySQL can't possibly know what you _meant_ when you were inserting data. So, if you _assumed_ that the database was UTF-8, and inserted UTF-8 with all of those things accidentally set for latin1, then you will have UTF-8 in your db, but MySQL will
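The mismatch described above, UTF-8 bytes being interpreted as latin1, can be reproduced in pure Python without a database. The sample string is arbitrary; this only illustrates the encoding effect and says nothing about how any particular MySQL setup is configured.

```python
original = "caf\u00e9"                      # 'café'

utf8_bytes = original.encode("utf-8")       # what the application sent
misread = utf8_bytes.decode("latin-1")      # what a latin1 reader sees

# The damage is reversible as long as the bytes were stored untouched.
repaired = misread.encode("latin-1").decode("utf-8")
```

The misread value is 'cafÃ©' (five characters instead of four), and round-tripping through latin-1 recovers the original text.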

[openstack-dev] any recommendations for live debugging of openstack services?

2014-03-12 Thread Chris Friesen
Are there any tools that people can recommend for live debugging of openstack services? I'm looking for a mechanism where I could take a running system that isn't behaving the way I expect and somehow poke around inside the program while it keeps running. (Sort of like tracepoints in gdb.)

[openstack-dev] UTF-8 required charset/encoding for openstack database?

2014-03-10 Thread Chris Friesen
Hi, I'm using havana and recently we ran into an issue with heat related to character sets. In heat/db/sqlalchemy/api.py in user_creds_get() we call _decrypt() on an encrypted password stored in the database and then try to convert the result to unicode. Today we hit a case where this

Re: [openstack-dev] [nova] [bug?] live migration fails with boot-from-volume

2014-03-10 Thread Chris Friesen
On 03/08/2014 02:23 AM, ChangBo Guo wrote: Are you using the libvirt driver? As I remember, the way to check whether compute nodes share storage is: create a temporary file from the source node, then check for the file from the dest node by accessing the file system at the operating system level. And
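The temporary-file check described above can be sketched as follows. In a real deployment the two halves run on different hosts; here both paths are checked from one process purely for illustration, and is_shared_storage() is a hypothetical helper, not the libvirt driver's code.

```python
import os
import tempfile


def is_shared_storage(source_dir, dest_dir):
    """Create a marker file under source_dir and look for it under dest_dir.

    If both directories are the same shared mount, the file created via
    the source path is visible via the destination path.
    """
    fd, path = tempfile.mkstemp(dir=source_dir)
    os.close(fd)
    try:
        marker = os.path.basename(path)
        return os.path.exists(os.path.join(dest_dir, marker))
    finally:
        os.unlink(path)  # always clean up the marker file
```

Note that this only tells you whether the instance directory is shared; it says nothing about a volume-backed root disk, which is the case the thread is about.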

Re: [openstack-dev] UTF-8 required charset/encoding for openstack database?

2014-03-10 Thread Chris Friesen
On 03/10/2014 02:02 PM, Ben Nemec wrote: We just had a discussion about this in #openstack-oslo too. See the discussion starting at 2014-03-10T16:32:26 http://eavesdrop.openstack.org/irclogs/%23openstack-oslo/%23openstack-oslo.2014-03-10.log In that discussion dhellmann said, I wonder if we

Re: [openstack-dev] [nova] a question about instance snapshot

2014-03-10 Thread Chris Friesen
On 03/10/2014 02:58 PM, Jay Pipes wrote: On Mon, 2014-03-10 at 16:30 -0400, Shawn Hartsock wrote: While I understand the general argument about pets versus cattle. The question is, would you be willing to poke a few holes in the strict cattle abstraction for the sake of pragmatism. Few shops

[openstack-dev] [nova] [bug?] live migration fails with boot-from-volume

2014-03-07 Thread Chris Friesen
Hi, I was just testing the current icehouse code and came across some behaviour that looked suspicious. I have two nodes, an all-in-one and a compute node. I was not using shared instance storage. I created a volume from an image and then booted an instance from the volume. Once the

Re: [openstack-dev] [nova] Future of the Nova API

2014-03-03 Thread Chris Friesen
On 03/03/2014 08:14 AM, Steve Gordon wrote: I would be interested in your opinion on the impact of a V2 version release which had backwards incompatibility in only one area - and that is input validation. So only apps/SDKs which are currently misusing the API (I think the most common problem

[openstack-dev] inconsistent naming? node vs host vs vs hypervisor_hostname vs OS-EXT-SRV-ATTR:host

2014-02-28 Thread Chris Friesen
Hi, I've been working with OpenStack for a while now but I'm still a bit fuzzy on the precise meaning of some of the terminology. It seems reasonably clear that a node is a computer running at least one component of an Openstack system. However, nova service-list talks about the host that

Re: [openstack-dev] inconsistent naming? node vs host vs vs hypervisor_hostname vs OS-EXT-SRV-ATTR:host

2014-02-28 Thread Chris Friesen
On 02/28/2014 11:38 AM, Jiang, Yunhong wrote: One reason of the confusion is, in some virt driver (maybe xenapi or vmwareapi), one compute service manages multiple node. Okay, so in the scenario above, is the nova-compute service running on a node or a host? (And if it's a host, then what is

Re: [openstack-dev] [nova] Future of the Nova API

2014-02-27 Thread Chris Friesen
On 02/27/2014 08:43 AM, Dan Smith wrote: So I think once we start returning different response codes, or completely different structures (such as the tasks change will be), it doesn't matter if we make the change in effect by invoking /v2 prefix or /v3 prefix or we look for a header. It's a major

Re: [openstack-dev] [nova] Future of the Nova API

2014-02-27 Thread Chris Friesen
On 02/27/2014 06:00 PM, Alex Xu wrote: Does mean our code looks like as below? if client_version 2: elif client_version 3 ... elif client_version 4: ... elif client_version 5: ... elif client_version 6: .. And we need test each version... That looks bad... I don't

Re: [openstack-dev] [nova] Future of the Nova API

2014-02-26 Thread Chris Friesen
On 02/26/2014 04:50 PM, Dan Smith wrote: So if we make backwards incompatible changes we really need a major version bump. Minor versions don't cut it, because the expectation is you have API stability within a major version. I disagree. If the client declares support for it, I think we can

Re: [openstack-dev] [nova] why doesn't _rollback_live_migration() always call rollback_live_migration_at_destination()?

2014-02-25 Thread Chris Friesen
On 02/25/2014 05:15 AM, John Garbutt wrote: On 24 February 2014 22:14, Chris Friesen chris.frie...@windriver.com wrote: What happens if we have a shared-storage instance that we try to migrate and fail and end up rolling back? Are we going to end up with messed-up networking

[openstack-dev] need advice on how to supply automated testing with bugfix patch

2014-02-25 Thread Chris Friesen
I'm in the process of putting together a bug report and a patch for properly handling resource tracking on live migration. The change involves code that will run on the destination compute node in order to properly account for the resources that the instance to be migrated will consume.

Re: [openstack-dev] [nova][libvirt] Is there anything blocking the libvirt driver from implementing the host_maintenance_mode API?

2014-02-24 Thread Chris Friesen
On 02/20/2014 11:38 AM, Matt Riedemann wrote: On 2/19/2014 4:05 PM, Matt Riedemann wrote: The os-hosts OS API extension [1] showed up before I was working on the project and I see that only the VMware and XenAPI drivers implement it, but was wondering why the libvirt driver doesn't - either

[openstack-dev] [nova] why doesn't _rollback_live_migration() always call rollback_live_migration_at_destination()?

2014-02-24 Thread Chris Friesen
I'm looking at the live migration rollback code and I'm a bit confused. When setting up a live migration we unconditionally run ComputeManager.pre_live_migration() on the destination host to do various things including setting up networks on the host. If something goes wrong with the live

Re: [openstack-dev] [nova] Future of the Nova API

2014-02-24 Thread Chris Friesen
On 02/24/2014 04:01 PM, Morgan Fainberg wrote: TL;DR, “don’t break the contract”. If we are seriously making incompatible changes (and we will be regardless of the direction) the only reasonable option is a new major version. Agreed. I don't think we can possibly consider making

Re: [openstack-dev] [nova] Future of the Nova API

2014-02-24 Thread Chris Friesen
On 02/24/2014 04:59 PM, Sean Dague wrote: So, that begs a new approach. Because I think at this point even if we did put out Nova v3, there can never be a v4. It's too much, too big, and doesn't fit in the incremental nature of the project. Does it necessarily need to be that way though?

Re: [openstack-dev] [nova] Future of the Nova API

2014-02-24 Thread Chris Friesen
On 02/24/2014 05:17 PM, Sean Dague wrote: On 02/24/2014 06:13 PM, Chris Friesen wrote: On 02/24/2014 04:59 PM, Sean Dague wrote: So, that begs a new approach. Because I think at this point even if we did put out Nova v3, there can never be a v4. It's too much, too big, and doesn't fit

Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler

2014-02-11 Thread Chris Friesen
On 02/11/2014 03:21 AM, Khanh-Toan Tran wrote: Second, there is nothing wrong with booting the instances (or instantiating other resources) as separate commands as long as we support some kind of reservation token. I'm not sure what reservation token would do, is it some kind of informing

[openstack-dev] transactions in openstack REST API?

2014-02-03 Thread Chris Friesen
Has anyone ever considered adding the concept of transaction IDs to the openstack REST API? I'm envisioning a way to handle long-running transactions more cleanly. For example: 1) A user sends a request to live-migrate an instance 2) Openstack acks the request and includes a transaction

Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler

2014-02-03 Thread Chris Friesen
On 02/03/2014 12:28 PM, Khanh-Toan Tran wrote: Another thought would be the need for Instance Group API [1]. Currently users can only request multiple instances of the same flavors. These requests do not need LP to solve, just placing instances one by one is sufficient. Therefore we need this

Re: [openstack-dev] transactions in openstack REST API?

2014-02-03 Thread Chris Friesen
On 02/03/2014 01:31 PM, Andrew Laski wrote: On 02/03/14 at 01:10pm, Chris Friesen wrote: Has anyone ever considered adding the concept of transaction IDs to the openstack REST API? I'm envisioning a way to handle long-running transactions more cleanly. For example: 1) A user sends a request

Re: [openstack-dev] [nova][neutron] PCI pass-through SRIOV

2014-01-28 Thread Chris Friesen
On 01/28/2014 10:55 AM, Jani, Nrupal wrote: While technically it is possible, we as a team can decide about the final recommendation :) Given that VFs are going to be used for the high-performance VMs, mixing VMs with virtio VFs may not be a good option. Initially we can use PF interface for the

Re: [openstack-dev] [nova]Why not allow to create a vm directly with two VIF in the same network

2014-01-24 Thread Chris Friesen
On 01/24/2014 08:33 AM, CARVER, PAUL wrote: I agree that I’d like to see a set of use cases for this. This is the second time in as many days that I’ve heard about a desire to have such a thing but I still don’t think I understand any use cases adequately. In the physical world it makes perfect

Re: [openstack-dev] [ironic] Disk Eraser

2014-01-17 Thread Chris Friesen
On 01/17/2014 04:20 PM, Devananda van der Veen wrote: tl;dr, We should not be recycling bare metal nodes between untrusted tenants at this time. There's a broader discussion about firmware security going on, which, I think, will take a while for the hardware vendors to really address. What

Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.

2014-01-16 Thread Chris Friesen
On 01/15/2014 11:25 PM, Clint Byrum wrote: Excerpts from Alan Kavanagh's message of 2014-01-15 19:11:03 -0800: Hi Paul I posted a query to Ironic which is related to this discussion. My thinking was I want to ensure the case you note here (1) a tenant can not read another tenants disk..

[openstack-dev] [nova] how is resource tracking supposed to work for live migration and evacuation?

2014-01-16 Thread Chris Friesen
Hi, I'm trying to figure out how resource tracking is intended to work for live migration and evacuation. For a while I thought that maybe we were relying on the call to ComputeManager._instance_update() in ComputeManager.post_live_migration_at_destination(). However, in

Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.

2014-01-16 Thread Chris Friesen
On 01/16/2014 04:22 PM, Clint Byrum wrote: Excerpts from Fox, Kevin M's message of 2014-01-16 09:29:14 -0800: Yeah, I think the evil firmware issue is separate and should be solved separately. Ideally, there should be a mode you can set the bare metal server into where firmware updates are

Re: [openstack-dev] Evil Firmware

2014-01-16 Thread Chris Friesen
On 01/16/2014 05:12 PM, CARVER, PAUL wrote: Jumping back to an earlier part of the discussion, it occurs to me that this has broader implications. There's some discussion going on under the heading of Neutron with regard to PCI passthrough. I imagine it's under Neutron because of a desire to
