[openstack-dev] [nova][neutron][SR-IOV] SR-IOV meeting is cancelled today

2016-09-27 Thread Moshe Levi
Hi all,

Sorry for the late mail, but I have to cancel the meeting today.

Thanks,
Moshe Levi





Re: [openstack-dev] [Nova] [Neutron] SR-IOV subteam

2015-11-12 Thread Nikola Đipanov
Top posting since I wanted to just add the [Neutron] tag to the subject
as I imagine there are a few folks in Neutron-land who will be
interested in this.

We had the first meeting this week [1] and there were some cross-project
topics mentioned (especially around scheduling), so feel free to review
and comment.

[1]
http://eavesdrop.openstack.org/meetings/sriov/2015/sriov.2015-11-10-13.09.log.html

On 11/10/2015 01:42 AM, Nikola Đipanov wrote:
> On 11/04/2015 07:56 AM, Moshe Levi wrote:
>> Maybe we can use the PCI passthrough meeting slot:
>> http://eavesdrop.openstack.org/#PCI_Passthrough_Meeting
>> It's been a long time since we had a meeting.
>>
> 
> I think that slot works well (at least for me). I'd maybe change the
> cadence to bi-weekly in the beginning and see if we need to increase it
> as the cycle progresses.
> 
> Here's the patch proposing the said changes:
> 
> https://review.openstack.org/243382
> 
> On 11/09/2015 06:33 PM, Beliveau, Ludovic wrote:
>> Is there a meeting planned for this week ?
>>
>> Thanks,
>> /ludovic
> 
> Why not - let's have it today at 13:00 UTC as the above patch suggests and
> chat more there.
> 
> Thanks,
> N.
> 




Re: [openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-15 Thread Daniel P. Berrange
On Mon, Sep 14, 2015 at 09:34:31PM -0400, Jay Pipes wrote:
> On 09/10/2015 05:23 PM, Brent Eagles wrote:
> >Hi,
> >
> >I was recently informed of a situation that came up when an engineer
> >added an SR-IOV nic to a compute node that was hosting some guests that
> >had VFs attached. Unfortunately, adding the card shuffled the PCI
> >addresses causing some degree of havoc. Basically, the PCI addresses
> >associated with the previously allocated VFs were no longer valid.
> >
> >I tend to consider this a non-issue. The expectation that hosts have
> >relatively static hardware configuration (and kernel/driver configs for
> >that matter) is the price you pay for having pets with direct hardware
> >access. That being said, this did come as a surprise to some of those
> >involved and I don't think we have any messaging around this or advice
> >on how to deal with situations like this.
> >
> >So what should we do? I can't quite see altering OpenStack to deal with
> >this situation (or even how that could work). Has anyone done any
> >research into this problem, even if it is how to recover or extricate
> >a guest that is no longer valid? It seems that at the very least we
> >could use some stern warnings in the docs.
> 
> Hi Brent,
> 
> Interesting issue. We have code in the PCI tracker that ostensibly handles
> this problem:
> 
> https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L145-L164
> 
> But the note from yjiang5 is telling:
> 
> # Pci properties may change while assigned because of
> # hotplug or config changes. Although normally this should
> # not happen.
> # As the devices have been assigned to a instance, we defer
> # the change till the instance is destroyed. We will
> # not sync the new properties with database before that.
> # TODO(yjiang5): Not sure if this is a right policy, but
> # at least it avoids some confusion and, if
> # we can add more action like killing the instance
> # by force in future.
> 
> Basically, if the PCI device tracker notices that an instance is assigned a
> PCI device with an address that no longer exists in the PCI device addresses
> returned from libvirt, it will (eventually, in the _free_instance() method)
> remove the PCI device assignment from the Instance object, but it will make
> no attempt to assign a new PCI device that meets the original PCI device
> specification in the launch request.
> 
> Should we handle this case and attempt a "hot re-assignment of a PCI
> device"? Perhaps. Is it high priority? Not really, IMHO.

Hotplugging new PCI devices to a running host should not have any impact
on existing PCI device addresses - it'll merely add new addresses for the
new devices; existing devices are unchanged. So everything should "just
work" in that case. IIUC, Brent's question was around powering off the host
and cold-plugging/unplugging hardware, which /is/ liable to arbitrarily
re-arrange existing PCI device addresses.

> If you'd like to file a bug against Nova, that would be cool, though.

I think it is explicitly out of scope for Nova to deal with this
scenario.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-15 Thread Daniel P. Berrange
On Thu, Sep 10, 2015 at 06:53:06PM -0230, Brent Eagles wrote:
> Hi,
> 
> I was recently informed of a situation that came up when an engineer
> added an SR-IOV nic to a compute node that was hosting some guests that
> had VFs attached. Unfortunately, adding the card shuffled the PCI
> addresses causing some degree of havoc. Basically, the PCI addresses
> associated with the previously allocated VFs were no longer valid.

This seems to be implying that they took the host offline to make
hardware changes, and then tried to re-start the originally running
guests directly, without letting the scheduler re-run.

If correct, then IMHO that is an unsupported approach. After making
any hardware changes you should essentially consider that to be a
new compute host. There is no expectation that previously running
guests on that host can be restarted. You must let the compute
host report its new hardware capabilities, and let the scheduler
place guests on it from scratch, using the new PCI address info.

> I tend to consider this a non-issue. The expectation that hosts have
> relatively static hardware configuration (and kernel/driver configs for
> that matter) is the price you pay for having pets with direct hardware
> access. That being said, this did come as a surprise to some of those
> involved and I don't think we have any messaging around this or advice
> on how to deal with situations like this.
> 
> So what should we do? I can't quite see altering OpenStack to deal with
> this situation (or even how that could work). Has anyone done any
> research into this problem, even if it is how to recover or extricate
> a guest that is no longer valid? It seems that at the very least we
> could use some stern warnings in the docs.

Taking a host offline for maintenance should be considered
equivalent to throwing away the existing host and deploying a new
host. There should be zero state carry-over from the OpenStack POV,
since both software and hardware changes can potentially
invalidate previous information used by the scheduler for deploying
on that host. The idea of recovering a previously running guest
should be explicitly unsupported.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-15 Thread Chris Friesen

On 09/15/2015 02:25 AM, Daniel P. Berrange wrote:


Taking a host offline for maintenance should be considered
equivalent to throwing away the existing host and deploying a new
host. There should be zero state carry-over from the OpenStack POV,
since both software and hardware changes can potentially
invalidate previous information used by the scheduler for deploying
on that host. The idea of recovering a previously running guest
should be explicitly unsupported.


This isn't the way the nova code is currently written though.

By default, any instances that were running on that compute node will still be
in the DB as belonging to that node, but in the "stopped" state. If you then do
a "nova start", they'll try to start up on that node again.


Heck, if you enable "resume_guests_state_on_host_boot", then nova will restart
them automatically for you on startup.
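For example, something like this in nova.conf on the compute node (a hedged
sketch - the option name is as mentioned above, and it defaults to False):

    [DEFAULT]
    # Re-start guests that were running on this host when the host reboots
    resume_guests_state_on_host_boot = True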


To robustly do what you're talking about would require someone (nova, the 
operator, etc.) to migrate all instances off a compute node before taking it 
down (which is currently impossible for suspended instances), and then force a 
"nova evacuate" (or maybe "nova delete") for every instance that was on a 
compute node that went down.
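Roughly along these lines, as a sketch (assuming the classic novaclient CLI of
that era; host names and UUIDs below are placeholders):

    # find instances the DB still thinks live on the affected host
    nova list --all-tenants --host <compute-node>

    # for each affected instance, either rebuild it elsewhere...
    nova evacuate <instance-uuid> <target-host>
    # ...or give up on it
    nova delete <instance-uuid>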


Chris



Re: [openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-14 Thread Sean M. Collins
Brent is our Neutron-Nova liaison - can someone from the SR-IOV team
please respond?

-- 
Sean M. Collins



Re: [openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-14 Thread Jay Pipes

On 09/10/2015 05:23 PM, Brent Eagles wrote:

Hi,

I was recently informed of a situation that came up when an engineer
added an SR-IOV nic to a compute node that was hosting some guests that
had VFs attached. Unfortunately, adding the card shuffled the PCI
addresses causing some degree of havoc. Basically, the PCI addresses
associated with the previously allocated VFs were no longer valid.

I tend to consider this a non-issue. The expectation that hosts have
relatively static hardware configuration (and kernel/driver configs for
that matter) is the price you pay for having pets with direct hardware
access. That being said, this did come as a surprise to some of those
involved and I don't think we have any messaging around this or advice
on how to deal with situations like this.

So what should we do? I can't quite see altering OpenStack to deal with
this situation (or even how that could work). Has anyone done any
research into this problem, even if it is how to recover or extricate
a guest that is no longer valid? It seems that at the very least we
could use some stern warnings in the docs.


Hi Brent,

Interesting issue. We have code in the PCI tracker that ostensibly 
handles this problem:


https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L145-L164

But the note from yjiang5 is telling:

# Pci properties may change while assigned because of
# hotplug or config changes. Although normally this should
# not happen.
# As the devices have been assigned to a instance, we defer
# the change till the instance is destroyed. We will
# not sync the new properties with database before that.
# TODO(yjiang5): Not sure if this is a right policy, but
# at least it avoids some confusion and, if
# we can add more action like killing the instance
# by force in future.

Basically, if the PCI device tracker notices that an instance is 
assigned a PCI device with an address that no longer exists in the PCI 
device addresses returned from libvirt, it will (eventually, in the 
_free_instance() method) remove the PCI device assignment from the 
Instance object, but it will make no attempt to assign a new PCI device 
that meets the original PCI device specification in the launch request.
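To make that behaviour concrete, here's a simplified sketch (illustrative
only - not the actual nova/pci/manager.py code, and the function/variable
names are invented):

    def find_stale_assignments(tracked_devs, libvirt_addresses):
        # Devices whose PCI address no longer appears in what libvirt
        # reports for the host. In nova these stay attached to the
        # instance and are only dropped later, when the instance is
        # destroyed; no replacement device is chosen to satisfy the
        # original PCI request.
        return [dev for dev in tracked_devs
                if dev['address'] not in libvirt_addresses]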


Should we handle this case and attempt a "hot re-assignment of a PCI 
device"? Perhaps. Is it high priority? Not really, IMHO.


If you'd like to file a bug against Nova, that would be cool, though.

Best,
-jay



[openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

2015-09-10 Thread Brent Eagles
Hi,

I was recently informed of a situation that came up when an engineer
added an SR-IOV nic to a compute node that was hosting some guests that
had VFs attached. Unfortunately, adding the card shuffled the PCI
addresses causing some degree of havoc. Basically, the PCI addresses
associated with the previously allocated VFs were no longer valid. 

I tend to consider this a non-issue. The expectation that hosts have
relatively static hardware configuration (and kernel/driver configs for
that matter) is the price you pay for having pets with direct hardware
access. That being said, this did come as a surprise to some of those
involved and I don't think we have any messaging around this or advice
on how to deal with situations like this.

So what should we do? I can't quite see altering OpenStack to deal with
this situation (or even how that could work). Has anyone done any
research into this problem, even if it is how to recover or extricate
a guest that is no longer valid? It seems that at the very least we
could use some stern warnings in the docs.

Cheers,

Brent




[openstack-dev] [nova][neutron][SR-IOV]

2015-05-14 Thread Moshe Levi
Hi all,

I prepared an etherpad with all the SR-IOV features [1] that were submitted to 
Neutron/Nova for Liberty.
Please feel free to add new features or existing features that I missed.

The etherpad also includes an "issues to discuss" section.
Please feel free to add your feedback/issues under it.

I will arrange a BoF session. Time and day are TBD.

[1] https://etherpad.openstack.org/p/liberty-sriov

Regards,
Moshe Levi






[openstack-dev] [nova][neutron] SR-IOV networking patches available

2014-02-26 Thread Robert Li (baoli)
Hi,

The following two Work In Progress patches are available for end-to-end SR-IOV 
networking:
nova client: https://review.openstack.org/#/c/67503/
nova: https://review.openstack.org/#/c/67500/

Please check the commit messages for how to use them.

Neutron changes required to support SR-IOV have already been merged. Many 
thanks to the developers working on them and for getting them merged in a very 
short time! They are:

https://blueprints.launchpad.net/neutron/+spec/vif-details
https://blueprints.launchpad.net/neutron/+spec/ml2-binding-profile
https://blueprints.launchpad.net/neutron/+spec/ml2-request-vnic-type

The above patches combined can be used to develop a neutron plugin that 
supports SR-IOV. Please note that although the nova patches are WIP patches, 
they can be used for your integration testing if you are developing an SR-IOV 
capable neutron plugin.

If you use devstack, you may need the following patch to define the PCI 
whitelist entries:

diff --git a/lib/nova b/lib/nova
index fefeda1..995873a 100644
--- a/lib/nova
+++ b/lib/nova
@@ -475,6 +475,10 @@ function create_nova_conf() {
         iniset $NOVA_CONF DEFAULT ${I/=/ }
     done
 
+    if [ -n "$PCI_LIST" ]; then
+        iniset_multiline $NOVA_CONF DEFAULT pci_passthrough_whitelist "${PCI_LIST[@]}"
+    fi
+
     # All nova-compute workers need to know the vnc configuration options
     # These settings don't hurt anything if n-xvnc and n-novnc are disabled
     if is_service_enabled n-cpu; then

And define something like the following in your localrc file:

PCI_LIST=('{vendor_id:1137,product_id:0071,address:*:0a:00.*,physical_network:physnet1}'
          '{vendor_id:1137,product_id:0071}')

Basically it's a bash array of strings, with each string being a JSON dict.
Check out https://review.openstack.org/#/c/67500 for the syntax.
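For reference, a rough sketch of what the resulting nova.conf entries would
look like (hedged - the exact formatting depends on iniset_multiline, and the
values below just mirror the example above):

[DEFAULT]
pci_passthrough_whitelist = {vendor_id:1137,product_id:0071,address:*:0a:00.*,physical_network:physnet1}
pci_passthrough_whitelist = {vendor_id:1137,product_id:0071}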

Thanks,
Robert
