Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno

2014-09-03 Thread Vladik Romanovsky
+1

I had several patches in the start-lxc-from-block-devices series. The blueprint 
had been waiting since Icehouse.
In Juno it was approved; however, besides Daniel Berrange, no one was looking at 
these patches.
Now it's being pushed to Kilo, regardless of the fact that everything is +2ed.

Normally, I don't actively chase people for approvals, as I got angry pushback 
from cores
at the beginning of my time with OpenStack.

I don't understand what the proper way to get work done is.

Vladik 

- Original Message -
 From: Solly Ross sr...@redhat.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Wednesday, September 3, 2014 11:57:29 AM
 Subject: Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
 
  I will follow up with a more detailed email about what I believe we are
  missing, once the FF settles and I have applied some soothing creme to
  my burnout wounds, but currently my sentiment is:
  
  Contributing features to Nova nowadays SUCKS!!1 (even as a core
  reviewer) We _have_ to change that!
 
 I think this is *very* important.
 
 <rant>
 For instance, I have/had two patch series
 up. One is of length 2 and is relatively small.  It's basically sitting there
 with one +2 on each patch.  I will now most likely have to apply for a FFE
 to get it merged, not because there are more changes to be made before it can
 get merged
 (there was one small nit posted yesterday) or because it's a huge patch that
 needs a lot
 of time to review, but because it just took a while to get reviewed by cores,
 and still only appears to have been looked at by one core.
 
 For the other patch series (which is admittedly much bigger), it was hard
 just to
 get reviews (and it was something where I actually *really* wanted several
 opinions,
 because the patch series touched a couple of things in a very significant
 way).
 
 Now, this is not my first contribution to OpenStack, or to Nova, for that
 matter.  I
 know things don't always get in.  It's frustrating, however, when it seems
 like the
 reason something didn't get in wasn't because it was fundamentally flawed,
 but instead
 because it didn't get reviews until it was too late to actually take that
 feedback into
 account, or because it just didn't get much attention review-wise at all.  If
 I were a
 new contributor to Nova who had successfully gotten a major blueprint
 approved and
 then implemented, only to see it get rejected like this, I might get turned
 off of Nova,
 and go to work on one of the other OpenStack projects that seemed to move a
 bit faster.
 </rant>
 
 So, it's silly to rant without actually providing any ideas on how to fix it.
 One suggestion would be, for each approved blueprint, to have one or two
 cores
 explicitly marked as being responsible for providing at least some feedback
 on
 that patch.  This proposal has issues, since we have a lot of blueprints and
 only
 twenty cores, who also have their own stuff to work on.  However, I think the
 general idea of having guaranteed reviewers is not unsound by itself.
 Perhaps
 we should have a loose tier of reviewers between core and everybody else.
 These reviewers would be known good reviewers who would follow the
 implementation
 of particular blueprints if a core did not have the time.  Then, when those
 reviewers
 gave the +1 to all the patches in a series, they could ping a core, who
 could feel
 more comfortable giving a +2 without doing a deep inspection of the code.
 
 That's just one suggestion, though.  Whatever the solution may be, this is a
 problem that we need to fix.  While I enjoyed going through the blueprint
 process
 this cycle (not sarcastic -- I actually enjoyed the whole structured
 feedback thing),
 the follow up to that was not the most pleasant.
 
 One final note: the specs referenced above didn't get approved until Spec
 Freeze, which
 seemed to leave me with less time to implement things.  In fact, it seemed
 that a lot
 of specs didn't get approved until spec freeze.  Perhaps if we had more
 staggered
 approval of specs, we'd have more staggered submission of patches, and thus
 less of a
 sudden influx of patches in the couple weeks before feature proposal freeze.
 
 Best Regards,
 Solly Ross
 
 - Original Message -
  From: Nikola Đipanov ndipa...@redhat.com
  To: openstack-dev@lists.openstack.org
  Sent: Wednesday, September 3, 2014 5:50:09 AM
  Subject: Re: [openstack-dev] [Nova] Feature Freeze Exception process for
  Juno
  
  On 09/02/2014 09:23 PM, Michael Still wrote:
   On Tue, Sep 2, 2014 at 1:40 PM, Nikola Đipanov ndipa...@redhat.com
   wrote:
   On 09/02/2014 08:16 PM, Michael Still wrote:
   Hi.
  
   We're soon to hit feature freeze, as discussed in Thierry's recent
   email. I'd like to outline the process for requesting a freeze
   exception:
  
   * your code must already be up for review
   * your blueprint must have an approved spec
   * you need three 

[openstack-dev] [nova] [feature freeze exception] FFE for libvirt-start-lxc-from-block-devices

2014-09-04 Thread Vladik Romanovsky
Hello,

I would like to ask for an extension for the libvirt-start-lxc-from-block-devices 
feature. It has previously been pushed from Icehouse to Juno.
The spec [1] has been approved. One of the patches is a bug fix. Another patch 
has already been approved but failed in the gate.
All patches have a +2 from Daniel Berrange.

The list of the remaining patches is in [2].


[1] https://review.openstack.org/#/c/88062
[2] 
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/libvirt-start-lxc-from-block-devices,n,z

Thank you,
Vladik

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

2014-09-04 Thread Vladik Romanovsky
+1 

I very much agree with Dan's proposal.

I am concerned about the difficulties we will face with merging
patches that spread across various areas: manager, conductor, scheduler, 
etc.
However, I think this is a small price to pay for having more focused teams.

IMO, we will still have to pay it the moment the scheduler is split out.

Regards,
Vladik



Re: [openstack-dev] Unified Guest Agent proposal

2013-12-10 Thread Vladik Romanovsky

Maybe it would be useful to use the oVirt guest agent as a base.

http://www.ovirt.org/Guest_Agent
https://github.com/oVirt/ovirt-guest-agent

It is already working well on Linux and Windows and has a lot of functionality.
However, it currently uses virtio-serial for communication; I think it can be 
extended to other bindings.

Vladik

- Original Message -
 From: Clint Byrum cl...@fewbar.com
 To: openstack-dev openstack-dev@lists.openstack.org
 Sent: Tuesday, 10 December, 2013 4:02:41 PM
 Subject: Re: [openstack-dev] Unified Guest Agent proposal
 
 Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37 -0800:
   What is the exact scenario you're trying to avoid?
  
  It is DDoS attack on either transport (AMQP / ZeroMQ provider) or server
  (Salt / Our own self-written server). Looking at the design, it doesn't
  look like the attack could be somehow contained within a tenant it is
  coming from.
  
 
 We can push a tenant-specific route for the metadata server, and a tenant
 specific endpoint for in-agent things. Still simpler than hypervisor-aware
 guests. I haven't seen anybody ask for this yet, though I'm sure if they
 run into these problems it will be the next logical step.
 
  In the current OpenStack design I see only one similarly vulnerable
  component - metadata server. Keeping that in mind, maybe I just
  overestimate the threat?
  
 
 Anything you expose to the users is vulnerable. By using the localized
 hypervisor scheme you're now making the compute node itself vulnerable.
 Only now you're asking that an already complicated thing (nova-compute)
 add another job, rate limiting.
 
 



Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Vladik Romanovsky
Dmitry,

I understand that :)
The only hypervisor dependency it has is how it communicates with the host; 
this can be extended and turned into a binding, so people could connect 
to it in multiple ways.
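To illustrate what "turned into a binding" could mean, here is a minimal sketch of a transport-agnostic agent. All names here are hypothetical, invented for this example; the real ovirt-guest-agent writes to virtio-serial directly:

```python
import json


class Transport:
    """Abstract channel between the guest agent and the host."""

    def send(self, message):
        raise NotImplementedError


class VirtioSerialTransport(Transport):
    # Stand-in for the existing virtio-serial channel; a real binding
    # would write to a /dev/virtio-ports/... device instead of a list.
    def __init__(self):
        self.wire = []

    def send(self, message):
        self.wire.append(json.dumps(message))


class Agent:
    """Agent logic stays transport-agnostic; swapping the binding
    (virtio-serial, AMQP, ZeroMQ, ...) means swapping the transport."""

    def __init__(self, transport):
        self.transport = transport

    def report(self, name, payload):
        self.transport.send({'name': name, 'payload': payload})


agent = Agent(VirtioSerialTransport())
agent.report('heartbeat', {'free-ram': 512})
print(agent.transport.wire[0])
```

The point of the sketch is only that the feature set stays in `Agent` while the hypervisor-specific part is confined to one replaceable class.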

The real value, as I see it, is the set of features this guest agent already 
implements and the fact that it is a mature code base.

Thanks,
Vladik 

- Original Message -
 From: Dmitry Mescheryakov dmescherya...@mirantis.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Thursday, 12 December, 2013 12:27:47 PM
 Subject: Re: [openstack-dev] Unified Guest Agent proposal
 
 Vladik,
 
 Thanks for the suggestion, but hypervisor-dependent solution is exactly what
 scares off people in the thread :-)
 
 Thanks,
 
 Dmitry
 
 
 2013/12/11 Vladik Romanovsky  vladik.romanov...@enovance.com 
 
 
 
 Maybe it will be useful to use Ovirt guest agent as a base.
 
 http://www.ovirt.org/Guest_Agent
 https://github.com/oVirt/ovirt-guest-agent
 
 It is already working well on linux and windows and has a lot of
 functionality.
 However, currently it is using virtio-serial for communication, but I think
 it can be extended for other bindings.
 
 Vladik
 
 - Original Message -
  From: Clint Byrum  cl...@fewbar.com 
  To: openstack-dev  openstack-dev@lists.openstack.org 
  Sent: Tuesday, 10 December, 2013 4:02:41 PM
  Subject: Re: [openstack-dev] Unified Guest Agent proposal
  
  Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37 -0800:
What is the exact scenario you're trying to avoid?
   
   It is DDoS attack on either transport (AMQP / ZeroMQ provider) or server
   (Salt / Our own self-written server). Looking at the design, it doesn't
   look like the attack could be somehow contained within a tenant it is
   coming from.
   
  
  We can push a tenant-specific route for the metadata server, and a tenant
  specific endpoint for in-agent things. Still simpler than hypervisor-aware
  guests. I haven't seen anybody ask for this yet, though I'm sure if they
  run into these problems it will be the next logical step.
  
   In the current OpenStack design I see only one similarly vulnerable
   component - metadata server. Keeping that in mind, maybe I just
   overestimate the threat?
   
  
  Anything you expose to the users is vulnerable. By using the localized
  hypervisor scheme you're now making the compute node itself vulnerable.
  Only now you're asking that an already complicated thing (nova-compute)
  add another job, rate limiting.
  
  
 
 
 
 



Re: [openstack-dev] [Nova][libvirt]when deleting instance which is in migrating state, instance files can be stay in destination node forever

2013-12-16 Thread Vladik Romanovsky
I would block it in the API or have the API cancel the migration first. 
I don't see a reason to start an operation that is meant to fail, which 
also has a complex chain of events following its failure.

Regardless of the above, I think that the suggested exception handling is 
needed in any case.
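To make the API-level block concrete, the guard could look roughly like this. This is a minimal sketch; the state constant, exception type and dict-based instance are illustrative stand-ins, not the actual nova code:

```python
class InstanceInvalidState(Exception):
    """Raised when an operation is not allowed in the current state."""


LIVE_MIGRATING = 'migrating'  # stand-in for nova's task-state constant


def check_delete_allowed(instance):
    # Refuse to delete while a live migration is in flight; a richer
    # version of this guard could cancel the migration first and then
    # proceed with the delete.
    if instance.get('task_state') == LIVE_MIGRATING:
        raise InstanceInvalidState(
            "Cannot delete instance %s while it is live-migrating"
            % instance['uuid'])


instance = {'uuid': 'abc-123', 'task_state': LIVE_MIGRATING}
try:
    check_delete_allowed(instance)
    blocked = False
except InstanceInvalidState:
    blocked = True
print(blocked)  # -> True
```

The same check placed in nova-api would turn the failure into an immediate, clean error instead of a mid-migration rollback.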


Vladik

- Original Message -
 From: Loganathan Parthipan parthi...@hp.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Monday, 16 December, 2013 8:25:09 AM
 Subject: Re: [openstack-dev] [Nova][libvirt]when deleting instance which is 
 in migrating state, instance files can be
 stay in destination node forever
 
 
 
 Isn’t just handling the exception instance_not_found enough? By this time
 source would’ve been cleaned up. Destination VM resources will get cleaned
 up by the periodic task since the VM is not associated with this host. Am I
 missing something here?
 
 
 
 
 
 
 From: 王宏 [mailto:w.wangho...@gmail.com]
 Sent: 16 December 2013 11:32
 To: openstack-dev@lists.openstack.org
 Subject: [openstack-dev] [Nova][libvirt]when deleting instance which is in
 migrating state, instance files can be stay in destination node forever
 
 
 
 
 
 Hi all.
 
 When I try to fix a bug: https://bugs.launchpad.net/nova/+bug/1242961 ,
 I run into trouble.
 
 Reproducing the bug is very easy. Live migrate a VM in block_migration mode,
 and then delete the VM immediately.
 
 The reason for this bug is as follows:
 1. Because live migration takes a long time, the VM will be deleted
 successfully before the live migration completes. We then get an exception
 while live migrating.
 2. After the live migration fails, we start to roll back. But in the rollback
 method we get or modify the info of the VM from the db. Because the VM has
 already been deleted, we get an instance_not_found exception and the rollback
 fails too.
 
 I have two ways to fix the bug:
 i) Add a check in nova-api. When trying to delete a VM, we return an error
 message if the vm_state is LIVE_MIGRATING. This way is very simple, but needs
 to be considered carefully. I have found a related discussion:
 http://lists.openstack.org/pipermail/openstack-dev/2013-October/017454.html ,
 but it reached no conclusion.
 ii) Before the live migration, gather all the data needed by the rollback
 method, and add a new rollback method. The new method will clean up resources
 at the destination based on the above data (the resources at the source have
 already been cleaned up by the delete).
 
 I have no idea which one I should choose. Or, any other ideas? :)
 
 Regards,
 wanghong
 
 



Re: [openstack-dev] [nova] VM diagnostics - V3 proposal

2013-12-19 Thread Vladik Romanovsky
I think it was:

ceilometer sample-list -m cpu_util -q 'resource_id=vm_uuid'

Vladik

- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: John Garbutt j...@johngarbutt.com
 Cc: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Thursday, 19 December, 2013 9:34:02 AM
 Subject: Re: [openstack-dev] [nova] VM diagnostics - V3 proposal
 
 On Thu, Dec 19, 2013 at 02:27:40PM +, John Garbutt wrote:
  On 16 December 2013 15:50, Daniel P. Berrange berra...@redhat.com wrote:
   On Mon, Dec 16, 2013 at 03:37:39PM +, John Garbutt wrote:
   On 16 December 2013 15:25, Daniel P. Berrange berra...@redhat.com
   wrote:
On Mon, Dec 16, 2013 at 06:58:24AM -0800, Gary Kotton wrote:
I'd like to propose the following for the V3 API (we will not touch
V2
in case operators have applications that are written against this –
this
may be the case for libvirt or xen. The VMware API support was added
in I1):
   
 1.  We formalize the data that is returned by the API [1]
   
Before we debate what standard data should be returned we need
detail of exactly what info the current 3 virt drivers return.
IMHO it would be better if we did this all in the existing wiki
page associated with the blueprint, rather than etherpad, so it
serves as a permanent historical record for the blueprint design.
  
   +1
  
While we're doing this I think we should also consider whether
the 'get_diagnostics' API is fit for purpose more generally.
eg currently it is restricted to administrators. Some, if
not all, of the data libvirt returns is relevant to the owner
of the VM but they can not get at it.
  
   Ceilometer covers that ground, we should ask them about this API.
  
   If we consider what is potentially in scope for ceilometer and
   subtract that from what the libvirt get_diagnostics impl currently
   returns, you pretty much end up with the empty set. This might cause
   us to question if 'get_diagnostics' should exist at all from the
   POV of the libvirt driver's impl. Perhaps vmware/xen return data
   that is out of scope for ceilometer ?
  
  Hmm, a good point.
 
 So perhaps I'm just being dumb, but I deployed ceilometer and could
 not figure out how to get it to print out the stats for a single
 VM from its CLI ? eg, can someone show me a command line invocation
 for ceilometer that displays CPU, memory, disk and network I/O stats
 in one go ?
 
 
 Daniel
 --
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
 




Re: [openstack-dev] [nova] VM diagnostics - V3 proposal

2013-12-19 Thread Vladik Romanovsky
Or

ceilometer meter-list -q resource_id='vm_uuid'

- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: John Garbutt j...@johngarbutt.com
 Cc: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Thursday, 19 December, 2013 9:34:02 AM
 Subject: Re: [openstack-dev] [nova] VM diagnostics - V3 proposal
 
 On Thu, Dec 19, 2013 at 02:27:40PM +, John Garbutt wrote:
  On 16 December 2013 15:50, Daniel P. Berrange berra...@redhat.com wrote:
   On Mon, Dec 16, 2013 at 03:37:39PM +, John Garbutt wrote:
   On 16 December 2013 15:25, Daniel P. Berrange berra...@redhat.com
   wrote:
On Mon, Dec 16, 2013 at 06:58:24AM -0800, Gary Kotton wrote:
I'd like to propose the following for the V3 API (we will not touch
V2
in case operators have applications that are written against this –
this
may be the case for libvirt or xen. The VMware API support was added
in I1):
   
 1.  We formalize the data that is returned by the API [1]
   
Before we debate what standard data should be returned we need
detail of exactly what info the current 3 virt drivers return.
IMHO it would be better if we did this all in the existing wiki
page associated with the blueprint, rather than etherpad, so it
serves as a permanent historical record for the blueprint design.
  
   +1
  
While we're doing this I think we should also consider whether
the 'get_diagnostics' API is fit for purpose more generally.
eg currently it is restricted to administrators. Some, if
not all, of the data libvirt returns is relevant to the owner
of the VM but they can not get at it.
  
   Ceilometer covers that ground, we should ask them about this API.
  
   If we consider what is potentially in scope for ceilometer and
   subtract that from what the libvirt get_diagnostics impl currently
   returns, you pretty much end up with the empty set. This might cause
   us to question if 'get_diagnostics' should exist at all from the
   POV of the libvirt driver's impl. Perhaps vmware/xen return data
   that is out of scope for ceilometer ?
  
  Hmm, a good point.
 
 So perhaps I'm just being dumb, but I deployed ceilometer and could
 not figure out how to get it to print out the stats for a single
 VM from its CLI ? eg, can someone show me a command line invocation
 for ceilometer that displays CPU, memory, disk and network I/O stats
 in one go ?
 
 
 Daniel
 --
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
 




Re: [openstack-dev] [nova] VM diagnostics - V3 proposal

2013-12-19 Thread Vladik Romanovsky
Ah, I think I've responded too fast, sorry.

meter-list provides a list of the various measurements that are being taken per 
resource.
sample-list provides a list of samples per meter: ceilometer sample-list 
--meter cpu_util -q resource_id=vm_uuid
These samples can be aggregated over a period of time, per meter and resource:
ceilometer statistics -m cpu_util -q 
'timestamp>START;timestamp<=END;resource_id=vm_uuid' --period 3600
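What the statistics call computes can be sketched in plain Python over a few made-up cpu_util samples. The timestamps and values below are purely illustrative, not real ceilometer output:

```python
from datetime import datetime, timedelta

# Hypothetical cpu_util samples for one resource: (timestamp, value).
start = datetime(2013, 12, 19, 9, 0)
samples = [(start + timedelta(minutes=10 * i), v)
           for i, v in enumerate([12.0, 30.0, 18.0, 44.0, 26.0, 20.0])]

period = timedelta(hours=1)


def statistics(samples, period):
    # Group samples into fixed-width periods and aggregate each group,
    # roughly what `ceilometer statistics --period 3600` reports.
    buckets = {}
    t0 = min(ts for ts, _ in samples)
    for ts, value in samples:
        idx = int((ts - t0) / period)
        buckets.setdefault(idx, []).append(value)
    return {idx: {'min': min(vs), 'max': max(vs),
                  'avg': sum(vs) / len(vs), 'count': len(vs)}
            for idx, vs in buckets.items()}


stats = statistics(samples, period)
print(stats[0]['avg'])  # all six samples fall in the first hour -> 25.0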

Vladik



- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: Vladik Romanovsky vladik.romanov...@enovance.com
 Cc: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org, John
 Garbutt j...@johngarbutt.com
 Sent: Thursday, 19 December, 2013 10:37:27 AM
 Subject: Re: [openstack-dev] [nova] VM diagnostics - V3 proposal
 
 On Thu, Dec 19, 2013 at 03:47:30PM +0100, Vladik Romanovsky wrote:
  I think it was:
  
  ceilometer sample-list -m cpu_util -q 'resource_id=vm_uuid'
 
 Hmm, a standard devstack deployment of ceilometer doesn't seem to
 record any performance stats at all - just shows me the static
 configuration parameters :-(
 
 ceilometer meter-list -q 'resource_id=296b22c6-2a4d-4a8d-a7cd-2d73339f9c70'
 +---------------------+-------+----------+--------------------------------------+----------------------------------+----------------------------------+
 | Name                | Type  | Unit     | Resource ID                          | User ID                          | Project ID                       |
 +---------------------+-------+----------+--------------------------------------+----------------------------------+----------------------------------+
 | disk.ephemeral.size | gauge | GB       | 296b22c6-2a4d-4a8d-a7cd-2d73339f9c70 | 96f9a624a325473daf4cd7875be46009 | ec26984024c1438e8e2f93dc6a8c5ad0 |
 | disk.root.size      | gauge | GB       | 296b22c6-2a4d-4a8d-a7cd-2d73339f9c70 | 96f9a624a325473daf4cd7875be46009 | ec26984024c1438e8e2f93dc6a8c5ad0 |
 | instance            | gauge | instance | 296b22c6-2a4d-4a8d-a7cd-2d73339f9c70 | 96f9a624a325473daf4cd7875be46009 | ec26984024c1438e8e2f93dc6a8c5ad0 |
 | instance:m1.small   | gauge | instance | 296b22c6-2a4d-4a8d-a7cd-2d73339f9c70 | 96f9a624a325473daf4cd7875be46009 | ec26984024c1438e8e2f93dc6a8c5ad0 |
 | memory              | gauge | MB       | 296b22c6-2a4d-4a8d-a7cd-2d73339f9c70 | 96f9a624a325473daf4cd7875be46009 | ec26984024c1438e8e2f93dc6a8c5ad0 |
 | vcpus               | gauge | vcpu     | 296b22c6-2a4d-4a8d-a7cd-2d73339f9c70 | 96f9a624a325473daf4cd7875be46009 | ec26984024c1438e8e2f93dc6a8c5ad0 |
 +---------------------+-------+----------+--------------------------------------+----------------------------------+----------------------------------+
 
 
 If the admin user can't rely on ceilometer guaranteeing availability of
 the performance stats at all, then I think having an API in nova to report
 them is in fact justifiable. In fact it is probably justifiable no matter
 what as a fallback way to check that VMs are doing in the fact of failure
 of ceilometer / part of the cloud infrastructure.
 
 Daniel
 --
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
 



[openstack-dev] [nova] unittests are failing after change I5e4cb4a8

2014-01-06 Thread Vladik Romanovsky
Hello everyone,

I'm just wondering, is there anyone else who is affected by this bug: 
https://bugs.launchpad.net/nova/+bug/1266534 ?
It looks to me that everyone should be affected by it (after change 
https://review.openstack.org/#/c/61310 has been merged),
but I also see many tests in Jenkins which are passing.

I'm not sure if I am missing anything.

Thanks,
Vladik



[openstack-dev] [nova][Libvirt] Disabling nova-compute when a connection to libvirt is broken.

2013-10-10 Thread Vladik Romanovsky
Hello everyone,

I have been recently working on a migration bug in nova (Bug #1233184). 

I noticed that the compute service remains available even if the connection to 
libvirt is broken.
I thought that it might be better to disable the service (using 
conductor.manager.update_service()) and resume it once it's connected again 
(maybe keep the host_stats periodic task running, or create a dedicated one; 
once it succeeds, the service becomes available again).
This way new VMs won't be scheduled or migrated to the disconnected host.
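The mechanism I have in mind looks roughly like this. It is a minimal sketch: the connection probe and the disable callback are stand-ins for the real libvirt connection test and conductor.manager.update_service() call:

```python
class ComputeServiceGuard:
    """Disable the compute service while libvirt is unreachable,
    and re-enable it once the connection comes back."""

    def __init__(self, is_connected, set_disabled):
        # is_connected: callable probing the libvirt connection.
        # set_disabled: callable standing in for the conductor's
        # service-update call (True = mark service disabled).
        self._is_connected = is_connected
        self._set_disabled = set_disabled
        self._disabled = False

    def periodic_check(self):
        # Intended to run as a periodic task alongside host_stats.
        up = self._is_connected()
        if not up and not self._disabled:
            self._set_disabled(True)   # stop scheduling VMs here
            self._disabled = True
        elif up and self._disabled:
            self._set_disabled(False)  # resume once reconnected
            self._disabled = False


events = []
state = {'up': False}
guard = ComputeServiceGuard(lambda: state['up'],
                            lambda d: events.append(d))
guard.periodic_check()   # connection down -> disable
guard.periodic_check()   # still down -> no duplicate update
state['up'] = True
guard.periodic_check()   # reconnected -> enable
print(events)  # -> [True, False]
```

Keeping the guard idempotent (no duplicate updates while the state is unchanged) avoids hammering the conductor on every periodic tick.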

Any thoughts on that?
Is anyone already working on that?

Thank you,
Vladik



Re: [openstack-dev] [nova][Libvirt] Disabling nova-compute when a connection to libvirt is broken.

2013-10-13 Thread Vladik Romanovsky
Thanks for the feedback.

I'm already working on this. I will send the patch for review very soon.

Thanks,
Vladik

- Original Message -
From: Lingxian Kong anlin.k...@gmail.com
To: OpenStack Development Mailing List openstack-dev@lists.openstack.org
Sent: Saturday, October 12, 2013 5:42:15 AM
Subject: Re: [openstack-dev] [nova][Libvirt] Disabling nova-compute when a 
connection to libvirt is broken.

+1 for me. And I am willing to be a volunteer. 


2013/10/12 Joe Gordon  joe.gord...@gmail.com  



On Thu, Oct 10, 2013 at 4:47 AM, Vladik Romanovsky  
vladik.romanov...@enovance.com  wrote: 


Hello everyone, 

I have been recently working on a migration bug in nova (Bug #1233184). 

I noticed that compute service remains available, even if a connection to 
libvirt is broken. 
I thought that it might be better to disable the service (using 
conductor.manager.update_service()) and resume it once it's connected again. 
(maybe keep the host_stats periodic task running or create a dedicated one, 
once it succeed, the service will become available again). 
This way new vms wont be scheduled nor migrated to the disconnected host. 

Any thoughts on that? 

Sounds reasonable to me. If we can't reach libvirt there isn't much that 
nova-compute can / should do. 


Is anyone already working on that? 

Thank you, 
Vladik 







-- 
 
Lingxian Kong 
Huawei Technologies Co.,LTD. 
IT Product Line CloudOS PDU 
China, Xi'an 
Mobile: +86-18602962792 
Email: konglingx...@huawei.com ; anlin.k...@gmail.com 




[openstack-dev] [nova][libvirt][lxc] Attach volumes to LXC is broken

2014-06-19 Thread Vladik Romanovsky
Hello Everyone,

I've been recently working on bug/1269990, re: attached volumes to LXC being
lost after reboot/power on. After fixing it, I've realized that the initial
attach-volume-to-LXC operation is not functional at all (bug/1330981).

I've described the problem in detail in the bug. In essence, it all converges
to the fact that /dev/nbdX or /dev/loopX is being set as the root_device_name
when LXC is started. Later, while attaching a new volume, these device
names are not properly parsed, nor can a disk_bus be chosen for the new
volume.

Saving the nbd device as root_device_name was introduced by patch
6277f8aa9 - Change-Id: I063fd3a9856bba089bcde5cdefd2576e2eb2b0e9,
to fix a problem where nbd and loop devices were not properly disconnected when 
the LXC instance was terminated.
These devices were leaking because, while starting LXC, we unmount the LXC 
rootfs and thus clean the LXC space in disk.clean_lxc_namespace() - 
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3570
This leaves the disk API unable to find the relevant nbd/loop device to 
disconnect when terminating an LXC instance:
https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L250

The possible solutions to these problems that I could come up with:
1. Stop saving nbd/loop devices as root_device_name, and stop calling
   disk.clean_lxc_namespace(), letting terminate_instance()
   unmount the LXC rootfs and disconnect the relevant devices
   (relying on an existing mechanism). 
   This will also allow attach_volume() to succeed.
   
2. Adjust the get_next_device_name() method to explicitly handle nbd and loop 
   devices in 
   https://github.com/openstack/nova/blob/master/nova/compute/utils.py#L129
   
3. Add an additional field to the instance model, other than root_device_name,
   to save the nbd/loop devices that should be disconnected on instance 
   termination.
   
Not sure which option is better. Also, it is not entirely clear to me why 
clean_lxc_namespace
was/is needed.
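For illustration, the explicit nbd/loop handling in option 2 might look like the sketch below. The helper name is hypothetical; the real get_next_device_name in nova/compute/utils.py does considerably more than this:

```python
import re

# Device names like /dev/nbd0 or /dev/loop3 do not follow the usual
# /dev/vdX pattern, so a parser expecting letter suffixes chokes on
# them. A sketch of recognising them explicitly:
_NBD_LOOP_RE = re.compile(r'^/dev/(nbd|loop)(\d+)$')


def parse_root_device(root_device_name):
    """Return (kind, index) for nbd/loop device names, or None for the
    regular /dev/vdX-style names handled by the existing code path."""
    m = _NBD_LOOP_RE.match(root_device_name or '')
    if m:
        return m.group(1), int(m.group(2))
    return None


print(parse_root_device('/dev/nbd0'))   # -> ('nbd', 0)
print(parse_root_device('/dev/loop3'))  # -> ('loop', 3)
print(parse_root_device('/dev/vda'))    # -> None
```

Returning None for ordinary names lets the caller fall through to the existing /dev/vdX logic untouched.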

I'd like to get your opinion and feedback if I'm missing anything or the 
explanation was too confusing :)

Thanks,
Vladik 



Re: [openstack-dev] [nova] nova host-update gives error 'Virt driver does not implement host disabled status'

2014-11-26 Thread Vladik Romanovsky


- Original Message -
 From: Vineet Menon mvineetme...@gmail.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Wednesday, 26 November, 2014 5:14:09 AM
 Subject: Re: [openstack-dev] [nova] nova host-update gives error 'Virt driver 
 does not implement host disabled
 status'
 
 Hi Kevin,
 
 Oh. Yes. That could be the problem.
 Thanks for pointing that out.
 
 
 Regards,
 
 Vineet Menon
 
 
 On 26 November 2014 at 02:02, Chen CH Ji  jiche...@cn.ibm.com  wrote:
 
 
 
 
 
 are you using libvirt ? it's not implemented
 ,guess your bug are talking about other hypervisors?
 
 the message was printed here:
 http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/contrib/hosts.py#n236
 
 Best Regards!
 
 Kevin (Chen) Ji 纪 晨
 
 Engineer, zVM Development, CSTL
 Notes: Chen CH Ji/China/IBM@IBMCN Internet: jiche...@cn.ibm.com
 Phone: +86-10-82454158
 Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District,
 Beijing 100193, PRC
 
 Vineet Menon ---11/26/2014 12:10:39 AM---Hi, I'm trying to reproduce the bug
 https://bugs.launchpad.net/nova/+bug/1259535 .
 
 From: Vineet Menon  mvineetme...@gmail.com 
 To: openstack-dev  openstack-dev@lists.openstack.org 
 Date: 11/26/2014 12:10 AM
 Subject: [openstack-dev] [nova] nova host-update gives error 'Virt driver
 does not implement host disabled status'
 
 
Hi Vineet, 

There are two methods in the API for changing the service/host status.
nova host-update and nova service-update.

Currently, in order to disable the service, one should use the nova 
service-update command,
which maps to the service_update method in the manager class.

nova host-update maps to the set_host_enabled() method in the virt drivers, 
which is not implemented
in the libvirt driver.
I am not sure what the purpose of this method is, but the libvirt driver does 
not implement it.

For a short period of time, this method was implemented, for the wrong 
reason, which was causing the bug in the title;
however, it was fixed with https://review.openstack.org/#/c/61016
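As a rough illustration of why the CLI surfaces HTTP 501 (the class and function names below are made up for the sketch, not the actual Nova code paths):

```python
# Illustrative sketch (made-up class/function names, not the actual
# Nova code) of why "nova host-update" returns HTTP 501: the base virt
# driver raises NotImplementedError for set_host_enabled(), and the API
# layer translates that into HTTPNotImplemented.

class ComputeDriver:
    def set_host_enabled(self, enabled):
        """Optional driver capability; not all drivers implement it."""
        raise NotImplementedError()


class LibvirtDriver(ComputeDriver):
    # Deliberately does not override set_host_enabled(), so callers
    # fall through to the base class and get NotImplementedError.
    pass


def host_update(driver, enabled):
    try:
        driver.set_host_enabled(enabled)
        return 200
    except NotImplementedError:
        return 501  # surfaced to the client as HTTPNotImplemented


print(host_update(LibvirtDriver(), False))  # -> 501
```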

Let me know if you have any questions.

Thanks,
Vladik



 
 
 Hi,
 
 I'm trying to reproduce the bug https://bugs.launchpad.net/nova/+bug/1259535
 .
 While trying to issue the command, nova host-update --status disable
 machine1, an error is thrown saying,
 
 
 ERROR (HTTPNotImplemented): Virt driver does not implement host disabled
 status. (HTTP 501) (Request-ID: req-1f58feda-93af-42e0-b7b6-bcdd095f7d8c)
 
 What is this error about?
 
 Regards,
 Vineet Menon




Re: [openstack-dev] multi-queue virtio-net interface

2015-01-23 Thread Vladik Romanovsky
Unfortunately, I didn't get a feature freeze exception for this blueprint.
I will resubmit the spec in the next cycle.

I think the best way for you to contribute is to review the spec,
when it's re-posted and +1 it, if you agree with the design.

Thanks,
Vladik 

- Original Message -
 From: Steve Gordon sgor...@redhat.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Cc: mfiuc...@akamai.com
 Sent: Wednesday, 21 January, 2015 4:43:37 PM
 Subject: Re: [openstack-dev] multi-queue virtio-net interface
 
 - Original Message -
  From: Rajagopalan Sivaramakrishnan r...@juniper.net
  To: openstack-dev@lists.openstack.org
  
  Hello,
  We are hitting a performance bottleneck in the Contrail network
  virtualization solution due to the virtio interface having a single
  queue in VMs spawned using Openstack. There seems to be a blueprint to
  address this by enabling multi-queue virtio-net at
  
  https://blueprints.launchpad.net/nova/+spec/libvirt-virtio-net-multiqueue
  
  It is not clear what the current status of this project is. We would be
  happy
  to contribute towards this effort if required. Could somebody please let us
  know what the next steps should be to get this into an upcoming release?
  
  Thanks,
  
  Raja
 
 The specification is up for review here:
 
 https://review.openstack.org/#/c/128825/
 
 There is an associated Feature Freeze Exception (FFE) email for this proposal
 here which would need to be approved for this to be included in Kilo:
 
 
 http://lists.openstack.org/pipermail/openstack-dev/2015-January/054263.html
 
 Thanks,
 
 Steve
 
 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 



Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Vladik Romanovsky


- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: Robert Collins robe...@robertcollins.net
 Cc: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org,
 openstack-operat...@lists.openstack.org
 Sent: Monday, 2 February, 2015 5:56:56 AM
 Subject: Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration 
 ends
 
 On Mon, Feb 02, 2015 at 08:24:20AM +1300, Robert Collins wrote:
  On 31 January 2015 at 05:47, Daniel P. Berrange berra...@redhat.com
  wrote:
   In working on a recent Nova migration bug
  
 https://bugs.launchpad.net/nova/+bug/1414065
  
   I had cause to refactor the way the nova libvirt driver monitors live
   migration completion/failure/progress. This refactor has opened the
   door for doing more intelligent active management of the live migration
   process.
  ...
   What kind of things would be the biggest win from Operators' or tenants'
   POV ?
  
  Awesome. Couple thoughts from my perspective. Firstly, there's a bunch
  of situation dependent tuning. One thing Crowbar does really nicely is
  that you specify the host layout in broad abstract terms - e.g. 'first
  10G network link' and so on : some of your settings above like whether
  to compress page are going to be heavily dependent on the bandwidth
  available (I doubt that compression is a win on a 100G link for
  instance, and would be suspect at 10G even). So it would be nice if
  there was a single dial or two to set and Nova would auto-calculate
  good defaults from that (with appropriate overrides being available).
 
 I wonder how such an idea would fit into Nova, since it doesn't really
 have that kind of knowledge about the network deployment characteristics.
 
  Operationally avoiding trouble is better than being able to fix it, so
  I quite like the idea of defaulting the auto-converge option on, or
  perhaps making it controllable via flavours, so that operators can
  offer (and identify!) those particularly performance sensitive
  workloads rather than having to guess which instances are special and
  which aren't.
 
 I'll investigate the auto-converge further to find out what the
 potential downsides of it are. If we can unconditionally enable
 it, it would be simpler than adding yet more tunables.
 
  Being able to cancel the migration would be good. Relatedly being able
  to restart nova-compute while a migration is going on would be good
  (or put differently, a migration happening shouldn't prevent a deploy
  of Nova code: interlocks like that make continuous deployment much
  harder).
  
  If we can't already, I'd like as a user to be able to see that the
  migration is happening (allows diagnosis of transient issues during
  the migration). Some ops folk may want to hide that of course.
  
  I'm not sure that automatically rolling back after N minutes makes
  sense : if the impact on the cluster is significant then 1 minute vs
  10 doesn't instrinsically matter: what matters more is preventing too
  many concurrent migrations, so that would be another feature that I
  don't think we have yet: don't allow more than some N inbound and M
  outbound live migrations to a compute host at any time, to prevent IO
  storms. We may want to log with NOTIFICATION migrations that are still
  progressing but appear to be having trouble completing. And of course
  an admin API to query all migrations in progress to allow API driven
  health checks by monitoring tools - which gives the power to manage
  things to admins without us having to write a probably-too-simple
  config interface.
 
 Interesting, the point about concurrent migrations hadn't occurred to
 me before, but it does of course make sense since migration is
 primarily network bandwidth limited, though disk bandwidth is relevant
 too if doing block migration.

Indeed, a lot of time was spent investigating this topic (in oVirt, again),
and eventually it was decided to expose a config option and allow 3 concurrent
migrations by default.

https://github.com/oVirt/vdsm/blob/master/lib/vdsm/config.py.in#L126
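A minimal sketch of that kind of cap, assuming a simple semaphore-based limiter (illustrative only; oVirt implements this via the config option linked above, not this code):

```python
# Illustrative semaphore-based cap on concurrent migrations; the cap
# value mirrors oVirt's default of 3.
import threading

MAX_CONCURRENT_MIGRATIONS = 3


class MigrationLimiter:
    def __init__(self, limit=MAX_CONCURRENT_MIGRATIONS):
        self._slots = threading.BoundedSemaphore(limit)

    def try_start(self):
        # Non-blocking acquire: an extra migration is refused (to be
        # queued or retried) instead of saturating the network link.
        return self._slots.acquire(blocking=False)

    def finished(self):
        self._slots.release()


limiter = MigrationLimiter()
print([limiter.try_start() for _ in range(4)])  # -> [True, True, True, False]
```

The same limiter would be applied separately for inbound and outbound migrations on each compute host, per Robert's N/M suggestion.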

 
 Regards,
 Daniel
 --
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
 

Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-01-30 Thread Vladik Romanovsky


- Original Message -
 From: Daniel P. Berrange berra...@redhat.com
 To: openstack-dev@lists.openstack.org, openstack-operat...@lists.openstack.org
 Sent: Friday, 30 January, 2015 11:47:16 AM
 Subject: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
 
 In working on a recent Nova migration bug
 
   https://bugs.launchpad.net/nova/+bug/1414065
 
 I had cause to refactor the way the nova libvirt driver monitors live
 migration completion/failure/progress. This refactor has opened the
 door for doing more intelligent active management of the live migration
 process.
 
 As it stands today, we launch live migration, with a possible bandwidth
 limit applied and just pray that it succeeds eventually. It might take
 until the end of the universe and we'll happily wait that long. This is
 pretty dumb really and I think we really ought to do better. The problem
 is that I'm not really sure what better should mean, except for ensuring
 it doesn't run forever.
 
 As a demo, I pushed a quick proof of concept showing how we could easily
 just abort live migration after say 10 minutes
 
   https://review.openstack.org/#/c/151665/
 
 There are a number of possible things to consider though...
 
 First how to detect when live migration isn't going to succeeed.
 
  - Could do a crude timeout, eg allow 10 minutes to succeeed or else.
 
  - Look at data transfer stats (memory transferred, memory remaining to
transfer, disk transferred, disk remaining to transfer) to determine
if it is making forward progress.

I think this is the better option. We could define a timeout for progress
and cancel the migration if no progress is made. IIRC there were similar
debates about it in oVirt; we could do something similar:
https://github.com/oVirt/vdsm/blob/master/vdsm/virt/migration.py#L430
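A sketch of what such a progress timeout could look like, assuming we periodically sample "bytes remaining" from the migration job stats (the sample window is a made-up tunable, not an existing nova.conf option):

```python
# Sketch of a progress-based abort policy (illustrative, not actual
# Nova or oVirt code). We abort only if the last `window` samples of
# "bytes remaining" never improved on the best value seen before them.
# PROGRESS_TIMEOUT_SAMPLES is an assumed tunable.

PROGRESS_TIMEOUT_SAMPLES = 15


def should_abort(remaining_history, window=PROGRESS_TIMEOUT_SAMPLES):
    """remaining_history: chronological 'bytes remaining' samples."""
    if len(remaining_history) <= window:
        return False  # not enough data yet
    best_before = min(remaining_history[:-window])
    # No recent sample dropped below the earlier best: the guest is
    # dirtying pages as fast as they are transferred, so give up.
    return min(remaining_history[-window:]) >= best_before


should_abort([100, 90, 80] + [80] * 15)  # stalled -> True
should_abort(list(range(100, 0, -5)))    # steady progress -> False
```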

 
  - Leave it upto the admin / user to decided if it has gone long enough
 
 The first is easy, while the second is harder but probably more reliable
 and useful for users.
 
 Second is a question of what todo when it looks to be failing
 
  - Cancel the migration - leave it running on source. Not good if the
admin is trying to evacuate a host.
 
  - Pause the VM - make it complete as non-live migration. Not good if
the guest workload doesn't like being paused
 
  - Increase the bandwidth permitted. There is a built-in rate limit in
QEMU overridable via nova.conf. Could argue that the admin should just
set their desired limit in nova.conf and be done with it, but perhaps
there's a case for increasing it in special circumstances. eg emergency
evacuate of host it is better to waste bandwidth  complete the job,
but for non-urgent scenarios better to limit bandwidth  accept failure ?
 
  - Increase the maximum downtime permitted. This is the small time window
when the guest switches from source to dest. To small and it'll never
switch, too large and it'll suffer unacceptable interuption.
 

In my opinion, it would be great if we could adjust bandwidth and downtime
before cancelling or pausing the migration.
However, this makes sense only if there is some kind of progress in the transfer
stats and not a complete stall; in that case we should just cancel it.
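A sketch of that escalation policy, with an assumed (made-up) downtime schedule:

```python
# Sketch of the "tune before cancelling" policy: step the permitted
# downtime up a few times while transfer stats still show progress,
# and cancel once the schedule is exhausted or progress stops. The
# downtime schedule is an assumed example, not a Nova default.

DOWNTIME_STEPS_MS = [50, 200, 500, 2000]


def next_action(step, making_progress):
    """Return the next (action, argument) for a struggling migration."""
    if not making_progress:
        return ('cancel', None)  # complete stall: no point in tuning
    if step < len(DOWNTIME_STEPS_MS):
        return ('set_max_downtime', DOWNTIME_STEPS_MS[step])
    return ('cancel', None)  # escalation exhausted


next_action(0, True)   # -> ('set_max_downtime', 50)
next_action(4, True)   # -> ('cancel', None)
next_action(1, False)  # -> ('cancel', None)
```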

 We could do some of these things automatically based on some policy
 or leave them upto the cloud admin/tenant user via new APIs
 
 Third there's question of other QEMU features we could make use of to
 stop problems in the first place
 
  - Auto-converge flag - if you set this QEMU throttles back the CPUs
so the guest cannot dirty ram pages as quickly. This is nicer than
pausing CPUs altogether, but could still be an issue for guests
which have strong performance requirements
 
  - Page compression flag - if you set this QEMU does compression of
pages to reduce data that has to be sent. This is basically trading
off network bandwidth vs CPU burn. Probably a win unless you are
already highly overcomit on CPU on the host
 
 Fourth there's a question of whether we should give the tenant user or
 cloud admin further APIs for influencing migration
 
  - Add an explicit API for cancelling migration ?
 
  - Add APIs for setting tunables like downtime, bandwidth on the fly ?
 
  - Or drive some of the tunables like downtime, bandwidth, or policies
like cancel vs paused from flavour or image metadata properties ?
 
  - Allow operations like evacuate to specify a live migration policy
eg switch non-live migrate after 5 minutes ?
 
IMHO, an explicit API for cancelling a migration is very much needed.
I remember cases when migrations took about 8 hours or more, leaving the
admins helpless :)

Also, I very much like the idea of having tunables and policy to set
in the flavours and image properties:
letting the administrators set these as a template in the flavour,
and also letting the users update/override or request these options,
as they (hopefully) know best what is running in their guests.
 


 The 

Re: [openstack-dev] [nova][NFV][qa] Testing NUMA, CPU pinning and large pages

2015-01-28 Thread Vladik Romanovsky


- Original Message -
 From: Steve Gordon sgor...@redhat.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Tuesday, 27 January, 2015 9:46:44 AM
 Subject: Re: [openstack-dev] [nova][NFV][qa] Testing NUMA, CPU pinning and 
 large pages
 
 - Original Message -
  From: Vladik Romanovsky vladik.romanov...@enovance.com
  To: openstack-dev@lists.openstack.org
  
  Hi everyone,
  
  Following Steve Gordon's email [1], regarding CI for NUMA, SR-IOV, and
  other
  features, I'd like to start a discussion about the NUMA testing in
  particular.
  
  Recently we have started a work to test some of these features.
  The current plan is to use the functional tests, in the Nova tree, to
  exercise
  the code paths for NFV use cases. In general, these will contain tests
  to cover various scenarios regarding NUMA, CPU pinning, large pages and
  validate a correct placement/scheduling.
 
 Hi Vladik,
 
 There was some discussion of the above at the Nova mid-cycle yesterday, are
 you able to give a quick update on any progress with regards to creation of
 the above functional tests?
 

I have made some progress; however, I currently have some challenges with 
validating
the scheduler filters' outcome. I'll try to post some of it in the coming days.

  In addition to the functional tests in Nova, we have also proposed two
  basic
  scenarios in Tempest [2][3]. One to make sure that an instance can boot
  with a
  minimal NUMA configuration (a topology that every host should have) and
  one that would request an impossible topology and fail with an expected
  exception.
 
 We also discussed the above tempest changes and they will likely receive some
 more review cycles as a result of this discussion but it looks like there is
 already some feedback from Nikola that needs to be addressed. More broadly
 for the list it looks like we need to determine whether adding a negative
 test in this case is a valid/desireable use of Tempest.

I have updated the tempest tests yesterday. The tests were waiting on a nova
patch to be merged: 
https://review.openstack.org/#/c/145312

However, unfortunately, I've discovered another bug in nova that prevents the
tests from passing, somehow I missed it in the previous attempt:
https://review.openstack.org/#/c/150694

Thanks,
Vladik

 
 Thanks,
 
 Steve
 


[openstack-dev] [nova][NFV][qa] Testing NUMA, CPU pinning and large pages

2015-01-11 Thread Vladik Romanovsky

Hi everyone,

Following Steve Gordon's email [1], regarding CI for NUMA, SR-IOV, and other
features, I'd like to start a discussion about the NUMA testing in 
particular.


Recently we have started work to test some of these features.
The current plan is to use the functional tests in the Nova tree to 
exercise the code paths for NFV use cases. In general, these will 
contain tests to cover various scenarios regarding NUMA, CPU pinning, 
and large pages, and validate correct placement/scheduling.

In addition to the functional tests in Nova, we have also proposed two basic
scenarios in Tempest [2][3]: one to make sure that an instance can boot 
with a minimal NUMA configuration (a topology that every host should have)
and one that would request an impossible topology and fail with an expected
exception.

This work doesn't eliminate the need for testing on real hardware; however,
these tests should provide coverage for the features that are currently 
being submitted upstream and hopefully be a good starting point for future 
testing.


Thoughts?

Vladik

[1] 
http://lists.openstack.org/pipermail/openstack-dev/2014-November/050306.html

[2] https://review.openstack.org/143540
[3] https://review.openstack.org/143541
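As a toy stand-in for what the two scenarios assert (the real tests live in the reviews above; this simplified placement check is only illustrative, and HOST_NUMA_NODES is an assumed test-host property):

```python
# Toy stand-in (not the Nova scheduler) for the two Tempest scenarios:
# a minimal one-node NUMA topology fits any host, while an impossible
# request must be rejected with an expected error.

HOST_NUMA_NODES = 2  # assumed test host


def fits_host(requested_numa_nodes, host_numa_nodes=HOST_NUMA_NODES):
    """Crude placement check: the request fits iff it needs no more
    NUMA nodes than the host exposes."""
    return requested_numa_nodes <= host_numa_nodes


assert fits_host(1)       # minimal topology: instance boots
assert not fits_host(64)  # impossible topology: expected failure
```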




[openstack-dev] [nova] request spec freeze exception for virtio-net multiqueue

2015-01-12 Thread Vladik Romanovsky
Hello,

I'd like to request an exception for the virtio-net multiqueue feature. [1]
This is an important feature that aims to increase the total network throughput
in guests and is not too hard to implement.

Thanks,
Vladik

[1] https://review.openstack.org/#/c/128825



Re: [openstack-dev] [nova] correct API for getting image metadata for an instance ?

2015-05-28 Thread Vladik Romanovsky
 As part of the work to object-ify the image metadata dicts, I'm looking
 at the current way the libvirt driver fetches image metadata for an
 instance, in cases where the compute manager hasn't already passed it
 into the virt driver API. I see 2 methods that libvirt uses to get the
 image metadata
 
  - nova.utils.get_image_from_system_metadata(instance.system_metadata)
 
  It takes the system metadata stored against the instance
  and turns it into image  metadata.
 
 
 - nova.compute.utils.get_image_metadata(context,
  image_api,
  instance.image_ref,
instance)
 
  This tries to get metadata from the image api and turns
  this into system metadata
 
  It then gets system metadata from the instance and merges
  it from the data from the image
 
  It then calls nova.utils.get_image_from_system_metadata()
 
  IIUC, any changes against the image will override what
  is stored against the instance
 
 
 
 IIUC, when an instance is booted, the image metadata should be
 saved against the instance. So I'm wondering why we need to have
 code in compute.utils that merges back in the image metadata each
 time ?
 
 Is this intentional so that we pull in latest changes from the
 image, to override what's previously saved on the instance ? If
 so, then it seems that we should have been consistent in using
 the compute_utils get_image_metadata() API everywhere.
 
 It seems wrong though to pull in the latest metadata from the
 image. The libvirt driver makes various decisions at boot time
 about how to configure the guest based on the metadata. When we
 later do changes to that guest (snapshot, hotplug, etc, etc)
 we *must* use exactly the same image metadata we had at boot
 time, otherwise decisions we make will be inconsistent with how
 the guest is currently configured.
 
 eg if you set  hw_disk_bus=virtio at boot time, and then later
 change the image to use hw_disk_bus=scsi, and then try to hotplug
 a new drive on the guest, we *must* operate wrt hw_disk_bus=virtio
 because the guest will not have any scsi bus present.

I agree, as well, that we should use the system_metadata instead of
getting the latest from Glance.

I just wish there were an easy way to edit it, in order to update
some keys such as the video driver, watchdog action, NIC driver, etc.,
so the changes would be picked up on a hard reboot, for example.
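For reference, a rough sketch of what reconstructing image metadata from system_metadata involves. The key names follow Nova's "image_" prefix convention, but this is a simplified stand-in, not the actual nova.utils code:

```python
# Simplified stand-in for nova.utils.get_image_from_system_metadata():
# image keys live in instance.system_metadata with an "image_" prefix,
# and rebuilding the image-metadata dict from them needs no Glance
# call, so the values match what the guest was booted with. The set of
# top-level keys here is abbreviated for the sketch.

def image_meta_from_system_metadata(sysmeta):
    image = {'properties': {}}
    top_level = {'min_ram', 'min_disk', 'disk_format', 'container_format'}
    for key, value in sysmeta.items():
        if not key.startswith('image_'):
            continue  # e.g. instance_type_* keys are not image data
        name = key[len('image_'):]
        if name in top_level:
            image[name] = value
        else:
            image['properties'][name] = value
    return image


meta = image_meta_from_system_metadata({'image_min_ram': '512',
                                        'image_hw_disk_bus': 'virtio',
                                        'instance_type_id': '42'})
print(meta['properties']['hw_disk_bus'])  # -> virtio
```

Editing keys like hw_watchdog_action would then amount to updating the prefixed entries in system_metadata, which is exactly the part that currently lacks a convenient API.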


 
 This says to me we should /never/ use the compute_utils
 get_image_metadata() API once the guest is running, and so we
 should convert libvirt to use nova.utils.get_image_from_system_metadata()
 exclusively.
 
 It also makes me wonder how nova/compute/manager.py is obtaining image
 meta in cases where it passes it into the API and whether that needs
 changing at all.
 
 
 Regards,
 Daniel
 --
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
 