Re: [openstack-dev] [all] Hide CI comments in Gerrit

2014-07-24 Thread Mike Kolesnik
Great script!

I made a fork and improved it a bit:
https://gist.github.com/mkolesni/92076378d45c7b5e692b

This fork supports:
1. The button/link is integrated nicely into the Gerrit UI (it appears in the
comments title, just like the other ones).
2. Auto-hide hides comments by default (can be turned off).
3. Regex-like bot detection, which requires a shorter list of unique
bot names and less maintenance of the script.
4. oVirt support (for those interested).

Regards,
Mike

- Original Message -
 Hi,
 
 I created a small userscript that allows you to hide CI comments in Gerrit.
 That way you can read only comments written by humans and hide everything
 else. I’ve been struggling for a long time to follow discussions on changes
 with many patch sets because of the CI noise. So I came up with this
 userscript:
 
 https://gist.github.com/rgerganov/35382752557cb975354a
 
 It adds a “Toggle CI” button at the bottom of the page that hides/shows CI
 comments. Right now it is configured for Nova CIs, as I contribute mostly
 there, but you can easily make it work for other projects as well. It
 supports both the “old” and “new” screens that we have.
 
 How to install on Chrome: open chrome://extensions and drag & drop the script
 there
 How to install on Firefox: install Greasemonkey first and then open the
 script
 
 Known issues:
  - you may need to reload the page to get the new button
  - I tried to add the button somewhere close to the collapse/expand links but
  it didn’t work for some reason
 
 Hope you will find it useful. Any feedback is welcome :)
 
 Thanks,
 Rado
 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][neutron] VIF event callbacks implementation

2014-04-28 Thread Mike Kolesnik
Hi, 

I came across the implementation of 
https://blueprints.launchpad.net/neutron/+spec/nova-event-callback
and have a question about the way it was implemented.

I notice that now Neutron has a dependency on Nova and needs to be configured
with Nova details (API endpoint, user, password, tenant, etc.).
Aside from creating a sort of cyclic dependency between the two, it is my
understanding that Neutron is meant to be a standalone service capable of
being consumed by other compute managers (e.g. oVirt).
This breaks that paradigm.

So my question is: Why use API and not RPC?

I saw that there is already a notification system in Neutron that emits a
notification on each port update (among other things), and these notifications
are currently consumed by Ceilometer.
Why not have Nova use those notifications to decide that a VIF got plugged
correctly, floating IPs changed, and so on?
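To make the RPC suggestion concrete, here is a rough sketch (not existing Nova
code) of what consuming those notifications could look like with oslo.messaging.
The 'notifications' topic and the port.update.end event type are the standard
Neutron ones, but the endpoint logic below is purely illustrative:

    from oslo_config import cfg
    import oslo_messaging as messaging


    class PortEventEndpoint(object):
        """Sketch: react to Neutron port notifications instead of API callbacks."""

        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            if event_type != 'port.update.end':
                return
            port = payload.get('port', {})
            if port.get('status') == 'ACTIVE':
                # e.g. unblock an instance that is waiting for its VIF plug
                print('VIF became active for port %s' % port.get('id'))


    transport = messaging.get_notification_transport(cfg.CONF)
    targets = [messaging.Target(topic='notifications')]
    listener = messaging.get_notification_listener(
        transport, targets, [PortEventEndpoint()])
    listener.start()
    listener.wait()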

I am willing to make the necessary changes to decouple Neutron from Nova, but
want to understand the rationale behind the original decision of using API
and not RPC notifications.

Regards,
Mike

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Introducing 'wrapt' to taskflow breaks Jenkins builds on stable branches

2014-11-20 Thread Mike Kolesnik
Hi, 

Currently stable branch Jenkins builds are failing due to the error: 
Syncing /opt/stack/new/taskflow/requirements-py3.txt 
'wrapt' is not a global requirement but it should be,something went wrong 

It's my understanding that this is a side effect from your change in taskflow: 
https://review.openstack.org/#/c/129507/ 

This is currently blocking (amongst other things) a backport of a security fix: 
https://review.openstack.org/#/c/135624/ 

Joshua - Would you be so kind as to investigate this? 

Kind Regards, 
Mike 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Neutron][L2Pop][HA Routers] Request for comments for a possible solution

2014-12-18 Thread Mike Kolesnik
Hi Neutron community members.

I wanted to query the community about a proposal of how to fix HA routers not 
working with L2Population (bug 1365476[1]).
This bug is important to fix especially if we want to have HA routers and DVR
routers working together.

[1] https://bugs.launchpad.net/neutron/+bug/1365476

What's happening now?
* HA routers use distributed ports, i.e. the port with the same IP & MAC
  details is applied on all nodes where an L3 agent is hosting this router.
* Currently, the port details have a binding pointing to an arbitrary node
  and this is not updated.
* L2pop takes this potentially stale information and uses it to create: 
  1. A tunnel to the node.
  2. An FDB entry that directs traffic for that port to that node.
  3. If ARP responder is on, ARP requests will not traverse the network.
* The problem is that the master router wouldn't necessarily be running on the
  reported agent.
  This means that traffic would not reach the master node, but rather some
  arbitrary node where the router might be running in another
  state (standby, failed).

What is proposed?
Basically the idea is not to do L2Pop for HA router ports that reside on the
tenant network.
Instead, we would create a tunnel to each node hosting the HA router so that
the normal learning switch functionality would take care of switching the
traffic to the master router.
This way no matter where the master router is currently running, the data
plane would know how to forward traffic to it.
This solution requires changes on the controller only.
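To make this concrete, here is a minimal sketch of the intended l2pop behaviour.
DEVICE_OWNER_ROUTER_HA_INTF is the real constant for HA router ports, but the
helper methods (_get_ha_hosting_ips, _ensure_tunnel_to, _add_fdb_entries) are
hypothetical names used only for illustration, not existing code:

    from neutron.common import constants as const


    def _update_port_up(self, context, port, agent_ip):
        if port['device_owner'] == const.DEVICE_OWNER_ROUTER_HA_INTF:
            # HA router port on a tenant network: skip the FDB/ARP-responder
            # entry and just make sure a tunnel exists to every node hosting
            # the router, so the learning switch finds the current master.
            for ip in self._get_ha_hosting_ips(context, port['device_id']):
                self._ensure_tunnel_to(ip)
            return
        # All other ports keep the regular l2pop flow.
        self._add_fdb_entries(context, port, agent_ip)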

What's to gain?
* Data plane only solution, independent of the control plane.
* Lowest failover time (same as HA routers today).
* High backport potential:
  * No APIs changed/added.
  * No configuration changes.
  * No DB changes.
  * Changes localized to a single file and limited in scope.

What's the alternative?
An alternative solution would be to have the controller update the port binding
on the single port so that the plain old L2Pop happens and notifies about the
location of the master router.
This basically negates all the benefits of the proposed solution, but is more general.
This solution depends on the report-ha-router-master spec which is currently in
the implementation phase.

It's important to note that these two solutions don't collide and could be done
independently. The one I'm proposing just makes more sense from an HA viewpoint
because of its benefits which fit the HA methodology of being fast & having as
little outside dependency as possible.
It could be done as an initial solution which solves the bug for mechanism
drivers that support normal learning switch (OVS), and later kept as an
optimization to the more general, controller based, solution which will solve
the issue for any mechanism driver working with L2Pop (Linux Bridge, possibly
others).

Would love to hear your thoughts on the subject.

Regards,
Mike

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron][L2Pop][HA Routers] Request for comments for a possible solution

2014-12-18 Thread Mike Kolesnik
Hi Mathieu,

Thanks for the quick reply, some comments inline..

Regards,
Mike

- Original Message -
 Hi mike,
 
 thanks for working on this bug :
 
 On Thu, Dec 18, 2014 at 1:47 PM, Gary Kotton gkot...@vmware.com wrote:
 
 
  On 12/18/14, 2:06 PM, Mike Kolesnik mkole...@redhat.com wrote:
 
 Hi Neutron community members.
 
 I wanted to query the community about a proposal of how to fix HA routers
 not
 working with L2Population (bug 1365476[1]).
 This bug is important to fix especially if we want to have HA routers and
 DVR
 routers working together.
 
 [1] https://bugs.launchpad.net/neutron/+bug/1365476
 
 What's happening now?
 * HA routers use distributed ports, i.e. the port with the same IP & MAC
   details is applied on all nodes where an L3 agent is hosting this
 router.
 * Currently, the port details have a binding pointing to an arbitrary node
   and this is not updated.
 * L2pop takes this potentially stale information and uses it to create:
   1. A tunnel to the node.
   2. An FDB entry that directs traffic for that port to that node.
   3. If ARP responder is on, ARP requests will not traverse the network.
 * Problem is, the master router wouldn't necessarily be running on the
   reported agent.
   This means that traffic would not reach the master node but some
 arbitrary
   node where the router master might be running, but might be in another
   state (standby, fail).
 
 What is proposed?
 Basically the idea is not to do L2Pop for HA router ports that reside on
 the
 tenant network.
 Instead, we would create a tunnel to each node hosting the HA router so
 that
 the normal learning switch functionality would take care of switching the
 traffic to the master router.
 
  In Neutron we just ensure that the MAC address is unique per network.
  Could a duplicate MAC address cause problems here?
 
 gary, AFAIU, from a Neutron POV, there is only one port, which is the
 router Port, which is plugged twice. One time per port.
 I think that the capacity to bind a port to several host is also a
 prerequisite for a clean solution here. This will be provided by
 patches to this bug :
 https://bugs.launchpad.net/neutron/+bug/1367391
 
 
 This way no matter where the master router is currently running, the data
 plane would know how to forward traffic to it.
 This solution requires changes on the controller only.
 
 What's to gain?
 * Data plane only solution, independent of the control plane.
 * Lowest failover time (same as HA routers today).
 * High backport potential:
   * No APIs changed/added.
   * No configuration changes.
   * No DB changes.
   * Changes localized to a single file and limited in scope.
 
 What's the alternative?
 An alternative solution would be to have the controller update the port
 binding
 on the single port so that the plain old L2Pop happens and notifies about
 the
 location of the master router.
 This basically negates all the benefits of the proposed solution, but is
 wider.
 This solution depends on the report-ha-router-master spec which is
 currently in
 the implementation phase.
 
 It's important to note that these two solutions don't collide and could
 be done
 independently. The one I'm proposing just makes more sense from an HA
 viewpoint
 because of its benefits which fit the HA methodology of being fast &
 having as
 little outside dependency as possible.
 It could be done as an initial solution which solves the bug for mechanism
 drivers that support normal learning switch (OVS), and later kept as an
 optimization to the more general, controller based, solution which will
 solve
 the issue for any mechanism driver working with L2Pop (Linux Bridge,
 possibly
 others).
 
 Would love to hear your thoughts on the subject.
 
 You will have to clearly update the doc to mention that deployment
 with Linuxbridge+l2pop are not compatible with HA.

Yes this should be added and this is already the situation right now.
However if anyone would like to work on a LB fix (the general one or some
specific one) I would gladly help with reviewing it.

 
 Moreover, this solution is downgrading the l2pop solution, by
 disabling the ARP-responder when VMs want to talk to a HA router.
 This means that ARP requests will be duplicated to every overlay
 tunnel to feed the OVS Mac learning table.
 This is something that we were trying to avoid with l2pop. But may be
 this is acceptable.

Yes, basically you're correct; however, this would be limited to only those
tunnels that connect to the nodes where the HA router is hosted, so we
would still limit the amount of traffic that is sent across the underlay.

Also bear in mind that ARP is actually useful here (at least in the OVS case),
since it helps locate on which tunnel the master is. Once the ARP response is
received, a flow is recorded that directs the traffic to the correct
tunnel, so we only get hit by the one ARP broadcast, but that's a sort of
necessary evil in order to locate the master..

 
 I know that ofagent is also using l2pop, I

Re: [openstack-dev] [Neutron][L2Pop][HA Routers] Request for comments for a possible solution

2014-12-20 Thread Mike Kolesnik
Hi Mathieu,

Comments inline

Regards,
Mike

- Original Message -
 Mike,
 
 I'm not even sure that your solution works without being able to bind
 a router HA port to several hosts.
 What's happening currently is that you:

 1. create the router on two l3 agents.
 2. those l3 agents trigger sync_routers() on the l3plugin.
 3. l3plugin.sync_routers() will trigger l2plugin.update_port(host=l3agent).
 4. ML2 will bind the port to the host mentioned in the last update_port().
 
 From a l2pop perspective, this will result in creating only one tunnel
 to the host lastly specified.
 I can't find any code that forces that only the master router binds
 its router port. So we don't even know if the host which binds the
 router port is hosting the master router or the slave one, and so if
 l2pop is creating the tunnel to the master or to the slave.
 
 Can you confirm that the above sequence is correct? or am I missing
 something?

Are you referring to the alternative solution?

In that case it seems that you're correct, so there would need to be
awareness of the master router at some level there as well.
I can't say for sure, as I've been thinking about the proposed solution with
no FDBs, so there may be some issues with the alternative that need to
be ironed out.

 
 Without the capacity to bind a port to several hosts, l2pop won't be
 able to create tunnel correctly, that's the reason why I was saying
 that a prerequisite for a smart solution would be to first fix the bug
 :
 https://bugs.launchpad.net/neutron/+bug/1367391
 
 DVR Had the same issue. Their workaround was to create a new
 port_binding tables, that manages the capacity for one DVR port to be
 bound to several host.
 As mentioned in bug 1367391, this adds technical debt in ML2,
 which has to be tackled as a priority from my POV.

I agree that this would simplify work but even without this bug fixed we
can achieve either solution.

We have already knowledge of the agents hosting a router so this is
completely doable without waiting for fix for bug 1367391.

Also from my understanding the bug 1367391 is targeted at DVR only, not
at HA router ports.
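For reference, a minimal sketch of what "we already have knowledge of the agents
hosting a router" means in code. get_l3_agents_hosting_routers() is an existing
L3 scheduler API, while _tunnel_ip_for_agent() is a hypothetical helper (e.g.
reading the agent's reported tunneling IP from its configurations):

    from neutron import manager
    from neutron.plugins.common import constants as service_constants


    def _hosting_agent_tunnel_ips(context, router_id):
        l3_plugin = manager.NeutronManager.get_service_plugins().get(
            service_constants.L3_ROUTER_NAT)
        agents = l3_plugin.get_l3_agents_hosting_routers(
            context, [router_id], admin_state_up=True)
        # Map each hosting agent to its tunnel endpoint (hypothetical helper).
        return [_tunnel_ip_for_agent(agent) for agent in agents]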

 
 
 On Thu, Dec 18, 2014 at 6:28 PM, Mike Kolesnik mkole...@redhat.com wrote:
  Hi Mathieu,
 
  Thanks for the quick reply, some comments inline..
 
  Regards,
  Mike
 
  - Original Message -
  Hi mike,
 
  thanks for working on this bug :
 
  On Thu, Dec 18, 2014 at 1:47 PM, Gary Kotton gkot...@vmware.com wrote:
  
  
   On 12/18/14, 2:06 PM, Mike Kolesnik mkole...@redhat.com wrote:
  
  Hi Neutron community members.
  
  I wanted to query the community about a proposal of how to fix HA
  routers
  not
  working with L2Population (bug 1365476[1]).
  This bug is important to fix especially if we want to have HA routers
  and
  DVR
  routers working together.
  
  [1] https://bugs.launchpad.net/neutron/+bug/1365476
  
  What's happening now?
  * HA routers use distributed ports, i.e. the port with the same IP & MAC
details is applied on all nodes where an L3 agent is hosting this
  router.
  * Currently, the port details have a binding pointing to an arbitrary
  node
and this is not updated.
  * L2pop takes this potentially stale information and uses it to
  create:
1. A tunnel to the node.
2. An FDB entry that directs traffic for that port to that node.
3. If ARP responder is on, ARP requests will not traverse the network.
  * Problem is, the master router wouldn't necessarily be running on the
reported agent.
This means that traffic would not reach the master node but some
  arbitrary
node where the router master might be running, but might be in another
state (standby, fail).
  
  What is proposed?
  Basically the idea is not to do L2Pop for HA router ports that reside on
  the
  tenant network.
  Instead, we would create a tunnel to each node hosting the HA router so
  that
  the normal learning switch functionality would take care of switching
  the
  traffic to the master router.
  
   In Neutron we just ensure that the MAC address is unique per network.
   Could a duplicate MAC address cause problems here?
 
  gary, AFAIU, from a Neutron POV, there is only one port, which is the
  router Port, which is plugged twice. One time per port.
  I think that the capacity to bind a port to several host is also a
  prerequisite for a clean solution here. This will be provided by
  patches to this bug :
  https://bugs.launchpad.net/neutron/+bug/1367391
 
 
  This way no matter where the master router is currently running, the
  data
  plane would know how to forward traffic to it.
  This solution requires changes on the controller only.
  
  What's to gain?
  * Data plane only solution, independent of the control plane.
  * Lowest failover time (same as HA routers today).
  * High backport potential:
* No APIs changed/added.
* No configuration changes.
* No DB changes.
* Changes localized to a single file and limited in scope

Re: [openstack-dev] Request for comments for a possible solution

2014-12-20 Thread Mike Kolesnik
Hi Vivek,

Replies inline.

Regards,
Mike

- Original Message -
 Hi Mike,
 
 Few clarifications inline [Vivek]
 
 -Original Message-
 From: Mike Kolesnik [mailto:mkole...@redhat.com]
 Sent: Thursday, December 18, 2014 10:58 PM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [Neutron][L2Pop][HA Routers] Request for
 comments for a possible solution
 
 Hi Mathieu,
 
 Thanks for the quick reply, some comments inline..
 
 Regards,
 Mike
 
 - Original Message -
  Hi mike,
 
  thanks for working on this bug :
 
  On Thu, Dec 18, 2014 at 1:47 PM, Gary Kotton gkot...@vmware.com wrote:
  
  
   On 12/18/14, 2:06 PM, Mike Kolesnik mkole...@redhat.com wrote:
  
  Hi Neutron community members.
  
  I wanted to query the community about a proposal of how to fix HA
  routers not working with L2Population (bug 1365476[1]).
  This bug is important to fix especially if we want to have HA
  routers and DVR routers working together.
  
  [1] https://bugs.launchpad.net/neutron/+bug/1365476
  
  What's happening now?
  * HA routers use distributed ports, i.e. the port with the same IP & MAC
details is applied on all nodes where an L3 agent is hosting this
  router.
  * Currently, the port details have a binding pointing to an
  arbitrary node
and this is not updated.
  * L2pop takes this potentially stale information and uses it to create:
1. A tunnel to the node.
2. An FDB entry that directs traffic for that port to that node.
3. If ARP responder is on, ARP requests will not traverse the network.
  * Problem is, the master router wouldn't necessarily be running on
  the
reported agent.
This means that traffic would not reach the master node but some
  arbitrary
node where the router master might be running, but might be in
  another
state (standby, fail).
  
  What is proposed?
  Basically the idea is not to do L2Pop for HA router ports that
  reside on the tenant network.
  Instead, we would create a tunnel to each node hosting the HA router
  so that the normal learning switch functionality would take care of
  switching the traffic to the master router.
  
   In Neutron we just ensure that the MAC address is unique per network.
   Could a duplicate MAC address cause problems here?
 
  gary, AFAIU, from a Neutron POV, there is only one port, which is the
  router Port, which is plugged twice. One time per port.
  I think that the capacity to bind a port to several host is also a
  prerequisite for a clean solution here. This will be provided by
  patches to this bug :
  https://bugs.launchpad.net/neutron/+bug/1367391
 
 
  This way no matter where the master router is currently running, the
  data plane would know how to forward traffic to it.
  This solution requires changes on the controller only.
  
  What's to gain?
  * Data plane only solution, independent of the control plane.
  * Lowest failover time (same as HA routers today).
  * High backport potential:
* No APIs changed/added.
* No configuration changes.
* No DB changes.
* Changes localized to a single file and limited in scope.
  
  What's the alternative?
  An alternative solution would be to have the controller update the
  port binding on the single port so that the plain old L2Pop happens
  and notifies about the location of the master router.
  This basically negates all the benefits of the proposed solution,
  but is wider.
  This solution depends on the report-ha-router-master spec which is
  currently in the implementation phase.
  
  It's important to note that these two solutions don't collide and
  could be done independently. The one I'm proposing just makes more
  sense from an HA viewpoint because of its benefits which fit the HA
  methodology of being fast & having as little outside dependency as
  possible.
  It could be done as an initial solution which solves the bug for
  mechanism drivers that support normal learning switch (OVS), and
  later kept as an optimization to the more general, controller based,
  solution which will solve the issue for any mechanism driver working
  with L2Pop (Linux Bridge, possibly others).
  
  Would love to hear your thoughts on the subject.
 
  You will have to clearly update the doc to mention that deployment
  with Linuxbridge+l2pop are not compatible with HA.
 
 Yes this should be added and this is already the situation right now.
 However if anyone would like to work on a LB fix (the general one or some
 specific one) I would gladly help with reviewing it.
 
 
  Moreover, this solution is downgrading the l2pop solution, by
  disabling the ARP-responder when VMs want to talk to a HA router.
  This means that ARP requests will be duplicated to every overlay
  tunnel to feed the OVS Mac learning table.
  This is something that we were trying to avoid with l2pop. But may be
  this is acceptable.
 
 Yes basically you're correct, however this would be only limited to those
 tunnels

Re: [openstack-dev] Request for comments for a possible solution

2015-01-14 Thread Mike Kolesnik
Hi Mathieu, 

Please see comments inline. 

Regards, 
Mike 

- Original Message -

 Hi Mike,

 after reviewing your latest patch [1], I think that a possible solution could
 be to add a new entry in fdb RPC message.
 This entry would specify whether the port is multi-bound or not.
 The new fdb message would look like this :
 {net_id:
     {port:
         {agent_ip:
             {mac, ip, multi-bound}
         },
      network_type: vxlan,
      segment_id: id
     }
 }

 When the multi-bound option would be set, the ARP responder would be
 provisioned but the underlying module (ovs or kernel vxlan) would be
 provisioned to flood the packet to every tunnel concerned by this overlay
 segment, and not only the tunnel to agent that is supposed to host the port.
 In the LB world, this means not adding an fdb entry for the MAC of the
 multi-bound port, whereas in the OVS world, it means not adding a flow that
 sends the traffic that matches the MAC of the multi-bound port to only one
 tunnel port, but instead to every tunnel port of this overlay segment.

So let me see if I understand what you suggest correctly.. 

You suggest that instead of not sending the FDB we do send it along with an 
optional third parameter? 

Mind you that FDBs are sent as a list so for example an l2pop message would 
look like: 
{
    '61a00edd-018e-4923-9524-df91b3f3083b': {
        'ports': {
            '30.0.0.2': [
                ['00:00:00:00:00:00', '0.0.0.0'],
                ['00:00:00:00:12:34', '10.0.0.1']
            ],
            '30.0.0.1': [
                ['00:00:00:00:00:00', '0.0.0.0'],
                ['00:00:00:00:56:78', '10.1.1.1']
            ]
        },
        'network_type': u'vxlan',
        'segment_id': 1
    }
}

So the parameter you suggest to add will be at index 2 of each FDB list? 

I'm not sure it will be optional then, otherwise it could be quite hard to 
decode these messages.. 
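Purely to illustrate the question, this is how a single agent's FDB list might
look if the proposed flag were appended at index 2 of each entry (the flag
itself is the proposal under discussion, not existing code):

    # True marks the multi-bound (HA router) port; regular ports stay as-is.
    fdb_entries = {
        '30.0.0.2': [
            ['00:00:00:00:12:34', '10.0.0.1', True],
            ['00:00:00:00:56:78', '10.1.1.1'],
        ],
    }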

Also, you suggest that each agent will know what to do according to this 
parameter? 

 This way, traffic to a multi-bound port will behave as unknown unicast
 traffic.
 The first packet will be flooded to every tunnel and the local bridge will
 learn the correct tunnel for the following packets based on which tunnel
 received the answer.
 Once learning occurs with the first ingress packet, the following packets
 will be sent to the correct tunnel and not flooded anymore.

IIUC we would still need to send all the nodes where the HA port is scheduled;
this just adds on top of that and moves the decision regarding the FDB out to
the agent level.

The FDB is then only needed for populating the ARP responder? 

 I've tested this with linuxbridge and it works fine. Based on code overview,
 this should work correctly with OVS too. I'll test it ASAP.

 I know that the DVR team already added such a flag in RPC messages, but they
 reverted it in later patches. I would be very interested in having their
 opinion on this proposal.
 It seems that DVR ports could also use this flag. This would result in having
 the ARP responder activated for DVR ports too.

 This shouldn't need a bump in RPC versioning since this flag would be
 optional. So there shouldn't be any issue with backward compatibility.

I'm not sure if it's backwards compatible since you're actually changing the 
construct of the RPC message so it's a bit unexpected how the old agents will 
react. 
It's not adding a new key-value, it's modifying each fdb's list.. 

 Regards,

 Mathieu

 [1] https://review.openstack.org/#/c/141114/2

 On Sun, Dec 21, 2014 at 12:14 PM, Narasimhan, Vivekanandan 
 vivekanandan.narasim...@hp.com  wrote:

  Hi Mike,
 

  Just one comment [Vivek]
 

  -Original Message-
 
  From: Mike Kolesnik [mailto: mkole...@redhat.com ]
 
  Sent: Sunday, December 21, 2014 11:17 AM
 
  To: OpenStack Development Mailing List (not for usage questions)
 
  Cc: Robert Kukura
 
  Subject: Re: [openstack-dev] [Neutron][L2Pop][HA Routers] Request for
  comments for a possible solution
 

  Hi Mathieu,
 

  Comments inline
 

  Regards,
 
  Mike
 

  - Original Message -
 
   Mike,
 
  
 
   I'm not even sure that your solution works without being able to bind
 
   a router HA port to several hosts.
 
   What's happening currently is that you :
 
  
 
   1.create the router on two l3agent.
 
   2. those l3agent trigger the sync_router() on the l3plugin.
 
   3. l3plugin.sync_routers() will trigger
   l2plugin.update_port(host=l3agent).
 
   4. ML2 will bind the port to the host mentioned in the last
   update_port().
 
  
 
   From a l2pop perspective, this will result in creating only one tunnel
 
   to the host lastly specified.
 
   I can't find any code that forces that only the master router binds
 
   its router port. So we don't even know if the host which binds the
 
   router port is hosting the master router or the slave one, and so if
 
   l2pop is creating the tunnel to the master or to the slave.
 
  
 
   Can you confirm that the above sequence is correct? or am I missing
 
   something?
 

  Are you referring to the alternative solution?
 

  In that case it seems that you're correct so that there would need to be
  awareness

Re: [openstack-dev] Why need br-int and br-tun in openstack neutron

2015-05-25 Thread Mike Kolesnik
- Original Message -
 Comments in-line.
 
 - Original Message -
  On 23 May 2015 at 04:43, Assaf Muller  amul...@redhat.com  wrote:
  
  
  
  There's no real reason as far as I'm aware, just an implementation
  decision.
  
  This is inaccurate. There is a reason(s), and this has been asked before:
  
  http://lists.openstack.org/pipermail/openstack/2014-March/005950.html
 
 This link is to a thread asking why we connect a Linux bridge between a tap
 device and br-int (for security groups).
 
  http://lists.openstack.org/pipermail/openstack/2014-April/006865.html
 
 This link is to this thread itself.

No, it's from another author but with almost exactly the same text;
see https://www.diffchecker.com/xl98zm9a

Either it's the same poster, some freak coincidence, or just some copy-paste..

Also Vivek gave the correct answer on that thread:
http://lists.openstack.org/pipermail/openstack/2014-April/006868.html

In a nutshell, decoupling the overlay layer from the VM connectivity.
VMs are always connected to the br-int the same way, but the overlay
(vxlan/gre or vlans) is connected differently.

 
  
  In a nutshell, the design decision that led to the existing architecture is
  due to the way OVS handles packets and interacts with netfilter.
 
 I think you're talking about the bridge between a tap device and br-int and
 not about br-tun.
 
  
  The fact that we keep asking the same question clearly shows lack of
  documentation, both developer and user facing.
  
  I'll get this fixed once and for all.
 
 Thank you.
 
  
  Thanks,
  Armando
  
  
  
  
  
  
   On 21 May 2015, at 01:48, Na Zhu  na...@cn.ibm.com  wrote:
  
  
  
  
  
  
  Dear,
  
  
  When OVS plugin is used with GRE option in Neutron, I see that each compute
  node has br-tun and br-int bridges created.
  
  I'm trying to understand why we need the additional br-tun bridge here.
  Can't we create tunneling ports in br-int bridge, and have br-int relay
  traffic between VM ports and tunneling ports directly? Why do we have to
  introduce another br-tun bridge?
  
  
  Regards,
  Juno Zhu
  Staff Software Engineer, System Networking
  China Systems and Technology Lab (CSTL), IBM Wuxi
  Email: na...@cn.ibm.com
  
  
  

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] (no subject)

2015-08-04 Thread Mike Kolesnik
On Tue, Aug 4, 2015 at 1:02 PM, Ihar Hrachyshka ihrac...@redhat.com wrote:


 Hi all,

 in feature/qos, we use ml2 extension drivers to handle additional
 qos_policy_id field that can be provided thru API:

 http://git.openstack.org/cgit/openstack/neutron/tree/neutron/plugins/ml2
 /extensions/qos.py?h=feature/qos

 What we do in qos extension is we create a database 'binding' object
 between the updated port and the QoS policy that corresponds to
 qos_policy_id. So we access the database. It means there may be some
 complications there, f.e. the policy object is not available for the
 tenant, or just does not exist. In that case, we raise an exception
 from the extension, assuming that ml2 will propagate it to the user in
 some form.


First of all, maybe we should be asking this on the u/s mailing list to get
a broader view?



 But it does not work. This is because _call_on_ext_drivers swallows
 exceptions:

 http://git.openstack.org/cgit/openstack/neutron/tree/neutron/plugins/ml2
 /managers.py#n766

 It makes me ask some questions:

 - - first, do we use extensions as was expected? Can we extend
 extensions to cover our use case?


I think we are; they mostly fit the case, but as with everything in Neutron it's
unripe.
However, from my experience this was the ripest option available to us..



 - - second, what would be the right way to go assuming we want to
 support the case? Should we just reraise? Or maybe postpone till all
 extension drivers are called, and then propagate an exception top into
 the stack? (Probably some extension manager specific exception?) Or
 maybe we want extensions to claim whether they may raise, and handle
 them accordingly?


I was thinking that, in order not to alter existing extension behaviours, we
can define a special exception type in the ML2 extension driver scope (a sort
of exception container), and if an exception of this type is raised then
we should re-raise it.
I'm not sure there's much value in aggregating the exceptions right off the
bat; this can be done later on.
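A minimal sketch of that idea follows. The exception class and the loop below
are illustrative only and merely mirror what _call_on_ext_drivers does today;
they are not existing Neutron code:

    from oslo_log import log as logging

    LOG = logging.getLogger(__name__)


    class ExtensionDriverError(Exception):
        """Raised by an extension driver when the failure must reach the user."""


    def _call_on_ext_drivers(method_name, ordered_ext_drivers, plugin_context,
                             data, result):
        for driver in ordered_ext_drivers:
            try:
                getattr(driver.obj, method_name)(plugin_context, data, result)
            except ExtensionDriverError:
                # Deliberate, user-facing error: let ML2 propagate it.
                raise
            except Exception:
                # Keep today's behaviour for everything else: log and continue.
                LOG.exception("Extension driver '%(name)s' failed in %(method)s",
                              {'name': driver.name, 'method': method_name})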




 - - alternatively, if we abuse the API and should stop doing it, which
 other options do we have to achieve similar behaviour without relying
 on ml2 extensions AND without polluting ml2 driver with qos specific code?

 Thanks for your answers,
 Ihar





-- 
Regards,
Mike
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron][qos][ml2] extensions swallow exceptions

2015-08-04 Thread Mike Kolesnik
Don't know why subject wasn't set automatically..

On Tue, Aug 4, 2015 at 3:30 PM, Mike Kolesnik mkole...@redhat.com wrote:



 On Tue, Aug 4, 2015 at 1:02 PM, Ihar Hrachyshka ihrac...@redhat.com
 wrote:


 Hi all,

 in feature/qos, we use ml2 extension drivers to handle additional
 qos_policy_id field that can be provided thru API:

  http://git.openstack.org/cgit/openstack/neutron/tree/neutron/plugins/ml2/extensions/qos.py?h=feature/qos

 What we do in qos extension is we create a database 'binding' object
 between the updated port and the QoS policy that corresponds to
 qos_policy_id. So we access the database. It means there may be some
 complications there, f.e. the policy object is not available for the
 tenant, or just does not exist. In that case, we raise an exception
 from the extension, assuming that ml2 will propagate it to the user in
 some form.


  First of all, maybe we should be asking this on the u/s mailing list to
  get a broader view?


Don't mind this, I must be drunk..



 But it does not work. This is because _call_on_ext_drivers swallows
 exceptions:

  http://git.openstack.org/cgit/openstack/neutron/tree/neutron/plugins/ml2/managers.py#n766

 It makes me ask some questions:

 - - first, do we use extensions as was expected? Can we extend
 extensions to cover our use case?


  I think we are; they mostly fit the case, but as with everything in Neutron
  it's unripe.
  However, from my experience this was the ripest option available to us..



 - - second, what would be the right way to go assuming we want to
 support the case? Should we just reraise? Or maybe postpone till all
 extension drivers are called, and then propagate an exception top into
 the stack? (Probably some extension manager specific exception?) Or
 maybe we want extensions to claim whether they may raise, and handle
 them accordingly?


  I was thinking that, in order not to alter existing extension behaviours,
  we can define a special exception type in the ML2 extension driver scope
  (a sort of exception container), and if an exception of this type is raised
  then we should re-raise it.
  I'm not sure there's much value in aggregating the exceptions right off
  the bat; this can be done later on.




 - - alternatively, if we abuse the API and should stop doing it, which
 other options do we have to achieve similar behaviour without relying
 on ml2 extensions AND without polluting ml2 driver with qos specific cod
 e?

 Thanks for your answers,
 Ihar





 --
 Regards,
 Mike




-- 
Regards,
Mike
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [neutron] Adding results to extension callbacks

2015-07-13 Thread Mike Kolesnik
Hi, 

I sent a simple patch to explore the possibility of adding results to callbacks:
https://review.openstack.org/#/c/201127/

This will allow us to decouple the callback logic from the ML2 plugin in the
QoS scenario, where we need to update the agents in case the profile_id on a
port/network changes.
It will also allow for a cleaner way to extend resource attributes, as
AFTER_READ callbacks can return a dict of fields to add to the original
resource instead of mutating it directly.
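A hedged sketch of the AFTER_READ case: the registry.subscribe() API below
exists, the AFTER_READ event and the "merge the returned dict" behaviour are
what the patch proposes, and _qos_policy_id_for_port() is a hypothetical
helper used only to illustrate the idea:

    from neutron.callbacks import events, registry, resources


    def _extend_port_dict(resource, event, trigger, **kwargs):
        port = kwargs['port']
        # Instead of mutating 'port' in place, return the extra attributes and
        # let the (proposed) callback mechanism merge them into the resource.
        return {'qos_policy_id': _qos_policy_id_for_port(port['id'])}


    registry.subscribe(_extend_port_dict, resources.PORT, events.AFTER_READ)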

Please let me know what you think of this idea. 

Regards, 
Mike 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] Should we document the using of device:owner of the PORT ?

2015-07-15 Thread Mike Kolesnik
- Original Message -

 Yes please.

 This would be a good starting point.
 I also think that the ability of editing it, as well as the value it could be
 set to, should be constrained.

FYI the oVirt project uses this field to identify ports it creates and manages. 
So if you're going to constrain it to something, it should probably be 
configurable so that managers other than Nova can continue to use Neutron. 

 As you have surely noticed, there are several code path which rely on an
 appropriate value being set in this attribute.
 This means a user can potentially trigger malfunctioning by sending PUT
 requests to edit this attribute.

 Summarizing, I think that document its usage is a good starting point, but I
 believe we should address the way this attribute is exposed at the API layer
 as well.

 Salvatore

 On 13 July 2015 at 11:52, Wang, Yalei  yalei.w...@intel.com  wrote:

  Hi all,
 
  The device:owner of the port is defined as a 255-byte string, and is widely
  used now, indicating the use of the port.
 
  It seems we can fill it freely, and the user can also update/set it from the
  command line (port-update $PORT_ID --device_owner), and I can't find a
  guideline for its use.
 
  What is its function? It indicates the use of the port, and it seems Horizon
  also uses it to show the topology.
 
  And Nova really needs it editable, so should we at least document all of the
  possible values in some guide to make this clear? If yes, I can do it.
 
  I got these usages from the code (maybe not complete, please point out any
  that are missing):
 
  From constants.py,
 
  DEVICE_OWNER_ROUTER_HA_INTF = network:router_ha_interface
 
  DEVICE_OWNER_ROUTER_INTF = network:router_interface
 
  DEVICE_OWNER_ROUTER_GW = network:router_gateway
 
  DEVICE_OWNER_FLOATINGIP = network:floatingip
 
  DEVICE_OWNER_DHCP = network:dhcp
 
  DEVICE_OWNER_DVR_INTERFACE = network:router_interface_distributed
 
  DEVICE_OWNER_AGENT_GW = network:floatingip_agent_gateway
 
  DEVICE_OWNER_ROUTER_SNAT = network:router_centralized_snat
 
  DEVICE_OWNER_LOADBALANCER = neutron:LOADBALANCER
 
  And from debug_agent.py
 
  DEVICE_OWNER_NETWORK_PROBE = 'network:probe'
 
  DEVICE_OWNER_COMPUTE_PROBE = 'compute:probe'
 
  And setting from nova/network/neutronv2/api.py,
 
  'compute:%s' % instance.availability_zone
 
  Thanks all!
 
  /Yalei
 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron-dev] [neutron] Generalized issues in the unit testing of ML2 mechanism drivers

2017-12-18 Thread Mike Kolesnik
On Wed, Dec 13, 2017 at 2:30 PM, Michel Peterson  wrote:

> Through my work in networking-odl I've found what I believe is an issue
> present in a majority of ML2 drivers. An issue I think needs awareness so
> each project can decide a course of action.
>
> The issue stems from the adopted practice of importing
> `neutron.tests.unit.plugins.ml2.test_plugin` and creating classes with
> noop operation to "inherit" tests for free [1]. The idea behind is nice,
> you inherit >600 tests that cover several scenarios.
>
> There are several issues of adopting this pattern, two of which are
> paramount:
>
> 1. If the mechanism driver is not loaded correctly [2], the tests then
> don't test the mechanism driver but still succeed and therefore there is no
> indication that there is something wrong with the code. In the case of
> networking-odl it wasn't discovered until last week, which means that for
> >1 year this was adding PASSed tests uselessly.
>
> 2. It gives a false sense of reassurance. If the code of those tests is
> analyzed, it's possible to see that the code itself is mostly centered
> around testing the REST endpoint of Neutron rather than actually testing
> that the mechanism driver succeeds on the operation it was supposed to test.
> As a result, there is marginal added value in having those tests. To be clear,
> the hooks for the respective operations are called on the mechanism driver,
> but the result of the operation is not asserted.
>
> I would love to hear more voices around this, so feel free to comment.
>

I talked to a few guys from networking-ovn, who are now processing this
info so they can chime in, but from what I've understood the issue wasn't
given much thought in networking-ovn (and I suspect other mechanism
drivers).
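For context, a minimal illustration of the inheritance pattern Michel describes
(the base test class is real; the 'foo' driver and test class names are
hypothetical):

    from neutron.tests.unit.plugins.ml2 import test_plugin


    class TestFooMechDriverPortsV2(test_plugin.TestMl2PortsV2):
        # Inherits hundreds of ML2 port tests "for free".
        _mechanism_drivers = ['foo']

        def setUp(self):
            # If the 'foo' driver silently fails to load, the inherited tests
            # still run against plain ML2 and PASS, giving a false sense of
            # coverage for the driver itself.
            super(TestFooMechDriverPortsV2, self).setUp()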

>
> Regarding networking-odl the solution I propose is the following:
>   **First**, discard completely the change mentioned in the footnote #2.
>   **Second**, create a patch that completely removes the tests that follow
> this pattern.
>   **Third**, incorporate the neutron tempest plugin into the CI and rely
> on that for assuring coverage of the different scenarios.
>

This sounds like a good plan to me.

>
> Also to mention that when discovered this issue in networking-odl we took
> a decision not to merge more patches until the PS of footnote #2 was
> addressed. I think we can now decide to overrule that decision and proceed
> as usual.
>

Agreed.

>
>
>
> [1]: http://codesearch.openstack.org/?q=class%20.*\(.*TestMl2
> 
> [2]: something that was happening in networking-odl and addressed by
> https://review.openstack.org/#/c/523934
>
> ___
> neutron-dev mailing list
> neutron-...@lists.opendaylight.org
> https://lists.opendaylight.org/mailman/listinfo/neutron-dev
>
>


-- 
Regards,
Mike
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev