Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-16 Thread Mohammad Banikazemi

It is perhaps worth mentioning that there is an effort to implement a
generic synchronization mechanism (between Neutron and backend
controllers/devices) in the ML2 plugin [1]. A possible framework for its
eventual implementation is in an early discussion/proof-of-concept WIP
state [2].

-Mohammad

[1] https://blueprints.launchpad.net/neutron/+spec/ml2-driver-sync
[2] https://review.openstack.org/#/c/154333/



From: Cory Benfield cory.benfi...@metaswitch.com
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Date: 03/16/2015 04:48 AM
Subject: Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB



On Sun, Mar 15, 2015 at 22:37:59, Sławek Kapłoński wrote:
 Maybe a good first step would be to implement a periodic task, called
 from the agent, that checks the DB config and compares it with the real
 state on the host?
 What do you think? Or maybe I'm completely wrong about this idea and it
 should be done in a different way?

This is almost exactly what we do in our Calico ML2 driver. Each of our
agents will periodically request its complete state from a neutron-server
node and will ensure that its local state matches that expected state. This
interval is configurable, to allow administrators to make a trade-off
between DB/network load and convergence time.

With reliable transport this is in principle almost never needed (messages
only really get lost on agent crash, and the agent will resynchronize when
it starts back up anyway), but it provides assurances that the fabric is
capable of bringing itself into consistency without administrator
intervention.

Having similar function in other neutron agents would be valuable for the
exact same reasons, but do bear in mind the potentially increased load this
kind of resynchronization can place on databases and servers.

Cory
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-16 Thread Sławek Kapłoński
Hello,

I read the blueprint you sent, but I don't see how it would solve problems
like the one that can occur in the l2pop mechanism. It sends a message as a
fanout cast and forgets about it. No exception is raised in the
port_update_postcommit operation, but the message may not be consumed by some
agents (or maybe I'm wrong and that can't happen?), and then those agents are
out of sync with the neutron DB.

--
Best regards
Sławek Kapłoński
sla...@kaplonski.pl
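
The concern raised in this message can be illustrated with a small simulation. This is only a sketch under stated assumptions: `fanout_cast` and `stale_agents` are hypothetical helpers, the in-memory lists stand in for message queues, and none of this is l2pop or oslo.messaging code. It shows that a cast returns nothing to the sender, so an agent that misses a message falls behind silently.

```python
def fanout_cast(queues, message, reachable):
    """Deliver message to every reachable agent queue.

    Like a cast, this returns nothing: the sender gets no error and no
    confirmation, whether or not every agent received the message.
    """
    for agent, queue in queues.items():
        if agent in reachable:
            queue.append(message)


def stale_agents(queues, expected_messages):
    """Return the agents whose received messages differ from what was sent."""
    return sorted(
        agent for agent, queue in queues.items() if queue != expected_messages
    )


# Hypothetical agents and message strings, for illustration only.
queues = {"agent-1": [], "agent-2": []}
fanout_cast(queues, "add tunnel to host-b", reachable={"agent-1", "agent-2"})
# agent-2 is temporarily unreachable; the sender still sees no exception.
fanout_cast(queues, "add tunnel to host-c", reachable={"agent-1"})
```

Only an out-of-band comparison such as `stale_agents` reveals the divergence, which is why the thread keeps returning to some form of resynchronization.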



Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-16 Thread Sławek Kapłoński
Hello,

Thanks, I hadn't found it before.
When we upgrade our infrastructure, we will see whether the problem is still
present. I hope it was caused by that bug and will be fixed then :)

--
Best regards
Sławek Kapłoński
sla...@kaplonski.pl

On Monday, 16 March 2015 00:14:57 Mathieu Rohon wrote:
 Hi Sławek,
 
 Maybe you're hitting this l2pop bug:
 https://bugs.launchpad.net/neutron/+bug/1372438
 


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-16 Thread Cory Benfield
On Sun, Mar 15, 2015 at 22:37:59, Sławek Kapłoński wrote:
 Maybe a good first step would be to implement a periodic task, called
 from the agent, that checks the DB config and compares it with the real
 state on the host?
 What do you think? Or maybe I'm completely wrong about this idea and it
 should be done in a different way?

This is almost exactly what we do in our Calico ML2 driver. Each of our agents 
will periodically request its complete state from a neutron-server node and 
will ensure that its local state matches that expected state. This interval is 
configurable, to allow administrators to make a trade-off between DB/network 
load and convergence time.

With reliable transport this is in principle almost never needed (messages only 
really get lost on agent crash, and the agent will resynchronize when it starts 
back up anyway), but it provides assurances that the fabric is capable of 
bringing itself into consistency without administrator intervention.

Having a similar function in other neutron agents would be valuable for the exact
same reasons, but do bear in mind the potentially increased load this kind of
resynchronization can place on databases and servers.

Cory
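
The periodic resynchronization described in this message can be sketched roughly as follows. This is a minimal illustration and not Calico's or Neutron's actual code: `ResyncingAgent`, `fetch_expected_state` and `resync_interval` are hypothetical names, and the "state" is just a dictionary keyed by port id.

```python
import threading


class ResyncingAgent:
    """Toy agent that periodically reconciles local state with the server's view."""

    def __init__(self, fetch_expected_state, resync_interval=30.0):
        # fetch_expected_state: hypothetical callable returning {port_id: config},
        # standing in for a request to a neutron-server node.
        self._fetch = fetch_expected_state
        # Shorter interval: faster convergence, more DB/network load.
        self.resync_interval = resync_interval
        self.local_state = {}
        self._stop = threading.Event()

    def resync_once(self):
        """Fetch the complete expected state and make local state match it.

        Returns True if local state had drifted and was corrected.
        """
        expected = self._fetch()
        changed = expected != self.local_state
        if changed:
            self.local_state = dict(expected)
        return changed

    def run(self):
        """Resync immediately (as on startup), then once per interval until stopped."""
        while not self._stop.is_set():
            self.resync_once()
            self._stop.wait(self.resync_interval)

    def stop(self):
        self._stop.set()
```

As a rough guide to the trade-off mentioned above: with N agents each resyncing every T seconds, the servers see on the order of N/T full-state requests per second, so doubling the interval halves that load while doubling the worst-case time an agent can remain out of sync.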


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-15 Thread Salvatore Orlando
On 14 March 2015 at 11:19, Sławek Kapłoński sla...@kaplonski.pl wrote:

 Hello,

 I'm using OVS agents with the L2 population mechanism in the ML2 plugin. I
 have noticed that sometimes agents don't receive the RPC needed to add new
 VXLAN tunnel OpenFlow rules, and then the VXLAN network between some compute
 nodes does not work.
 I'm still using the Havana release but want to upgrade to Juno. I checked
 the Juno code in the l2 population mech driver and the OVS plugin, and I
 didn't find anything like a periodic check that OpenFlow rules are properly
 set, or a resync.
 Maybe it would also be a good idea to add something like that to the OVS
 agent?


It would surely be a good idea to add some form of reliability into
communications between server and agents.
So far there are still several instances where the server sends a fire and
forget notification to the agent, and does not take any step to ensure the
state change associated with that notification has been actually applied to
the agent. This applies also to some messages from the agent side, such as
status change notifications.

This is something that can benefit any neutron implementation which
depends on one or more agents, not just those using the ovs/linux
bridge agents with the l2-population driver.

Salvatore
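
One way to picture what "ensuring the state change has been applied" could look like is a server-side record of which agents have confirmed each notification. The sketch below is purely hypothetical: `NotificationTracker`, its methods and the `transport` callable are illustrative names, and nothing in this thread says Neutron works this way.

```python
class NotificationTracker:
    """Server-side record of which agents have confirmed each notification."""

    def __init__(self):
        self._pending = {}  # notification id -> agents that have not confirmed
        self._next_id = 0

    def send(self, agents, payload, transport):
        """Send payload to every agent and remember who still owes a confirmation.

        transport is a hypothetical callable (agent, notification_id, payload)
        that performs the actual delivery.
        """
        self._next_id += 1
        notification_id = self._next_id
        self._pending[notification_id] = set(agents)
        for agent in agents:
            transport(agent, notification_id, payload)
        return notification_id

    def ack(self, agent, notification_id):
        """Record that an agent reports having applied the change."""
        waiting = self._pending.get(notification_id)
        if waiting is None:
            return
        waiting.discard(agent)
        if not waiting:
            del self._pending[notification_id]

    def unconfirmed(self):
        """Return {notification_id: agents that never confirmed}."""
        return {nid: set(agents) for nid, agents in self._pending.items()}
```

Anything left in `unconfirmed()` after a timeout would be a candidate for a resend or for a forced resync of that agent. The same idea would apply in the other direction for agent-to-server status notifications.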




Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-15 Thread Salvatore Orlando
The L2 agent, for instance, has logic to perform full synchronisations
with the server.
These happen in two cases:
1) upon agent restart, as some messages from the server side might have
been lost
2) whenever a failure is detected on the agent side (this is probably a bit
too conservative).

Salvatore
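
The two cases listed in this message can be sketched as a loop driven by a single flag. This is a simplified illustration and not the actual L2 agent code; `agent_loop`, `process_updates` and `full_sync` are hypothetical names, and the loop runs a fixed number of iterations only so that it terminates.

```python
def agent_loop(process_updates, full_sync, iterations):
    """Run a fixed number of loop iterations; return how many full syncs ran."""
    need_full_sync = True  # case 1: always resynchronise after a (re)start
    syncs = 0
    for _ in range(iterations):
        if need_full_sync:
            full_sync()
            syncs += 1
            need_full_sync = False
        try:
            process_updates()
        except Exception:
            # case 2: any failure while applying updates schedules a full sync
            # on the next iteration, whether or not state was really lost
            need_full_sync = True
    return syncs
```

Note that nothing in this loop triggers a full sync when a message is lost without any local failure, which is the gap the periodic task discussed elsewhere in the thread is meant to close.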

On 14 March 2015 at 10:51, Leo Y minh...@gmail.com wrote:

 Hello Rossella,

 I meant something different: less conventional changes. Right now,
 the network topology state is stored in the neutron DB, and each compute node
 learns about it through per-request neutron API calls. "The node knows" means
 that neutron agents keep this data in in-memory structures. If
 this synchronization is broken by a software bug or an (un)intentional
 change in the neutron DB, I'd like to understand whether re-synchronization is
 possible. As far as I know, the L3 agent (I'm not sure whether this works for
 all L3 agents) has a periodic task that refreshes NIC information from the
 neutron server. However, L2 agents don't have this mechanism. I don't know
 about agents that implement SDN.
 So I'm looking to learn how the current neutron implementation deals with
 this problem.



Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-15 Thread Sławek Kapłoński
Hello,

On Sunday, 15 March 2015 17:45:05 Salvatore Orlando wrote:
 
 It would surely be a good idea to add some form of reliability into
 communications between server and agents.
 So far there are still several instances where the server sends a fire and
 forget notification to the agent, and does not take any step to ensure the
 state change associated with that notification has been actually applied to
 the agent. This applies also to some messages from the agent side, such as
 status change notifications.

Maybe a good first step would be to implement a periodic task, called from
the agent, that checks the DB config and compares it with the real state on
the host?
What do you think? Or maybe I'm completely wrong about this idea and it should
be done in a different way?

 
 This is something that can benefit any neutron implementation which
 depends on one or more agents, not just those using the ovs/linux
 bridge agents with the l2-population driver.

Probably yes, but so far I have had this problem only with the l2-population
driver, which is why I wrote about it :)

--
Best regards
Sławek Kapłoński
sla...@kaplonski.pl
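
The comparison step of such a periodic task could look something like the sketch below. It is an assumption-laden illustration: `plan_repairs` is a hypothetical helper, and the rule strings are made-up placeholders rather than real OpenFlow rules or anything read from OVS.

```python
def plan_repairs(expected_rules, installed_rules):
    """Compare what the DB says should exist with what is on the host.

    Returns the rules to add (missing on the host) and to remove
    (present on the host but not expected).
    """
    expected = set(expected_rules)
    installed = set(installed_rules)
    return {
        "add": sorted(expected - installed),
        "remove": sorted(installed - expected),
    }


# Hypothetical example: one tunnel rule is missing and one is stale.
plan = plan_repairs(
    expected_rules=["tunnel:10.0.0.2", "tunnel:10.0.0.3"],
    installed_rules=["tunnel:10.0.0.2", "tunnel:10.0.0.9"],
)
```

Applying only the difference, rather than reinstalling everything, keeps the periodic check cheap on the host when nothing has drifted.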

 


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-15 Thread Mathieu Rohon
Hi Sławek,

Maybe you're hitting this l2pop bug:
https://bugs.launchpad.net/neutron/+bug/1372438



Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-14 Thread Leo Y
Hello Rossella,

I meant something different: less conventional changes. Right now,
the network topology state is stored in the neutron DB, and each compute node
learns about it through per-request neutron API calls. "The node knows" means
that neutron agents keep this data in in-memory structures. If
this synchronization is broken by a software bug or an (un)intentional
change in the neutron DB, I'd like to understand whether re-synchronization is
possible. As far as I know, the L3 agent (I'm not sure whether this works for
all L3 agents) has a periodic task that refreshes NIC information from the
neutron server. However, L2 agents don't have this mechanism. I don't know
about agents that implement SDN.
So I'm looking to learn how the current neutron implementation deals with
this problem.






-- 
Regards,
Leo

Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-14 Thread Sławek Kapłoński
Hello,

I'm using OVS agents with the L2 population mechanism in the ML2 plugin. I
have noticed that sometimes agents don't receive the RPC needed to add new
VXLAN tunnel OpenFlow rules, and then the VXLAN network between some compute
nodes does not work.
I'm still using the Havana release but want to upgrade to Juno. I checked the
Juno code in the l2 population mech driver and the OVS plugin, and I didn't
find anything like a periodic check that OpenFlow rules are properly set, or a
resync.
Maybe it would also be a good idea to add something like that to the OVS agent?

--
Best regards
Sławek Kapłoński
sla...@kaplonski.pl

On Friday, 13 March 2015 11:18:28 YAMAMOTO Takashi wrote:
  However, I briefly looked through the L2 agent code and didn't see a
  periodic task to resync the port information to protect from a neutron
  server that failed to send a notification because it crashed or lost its
  amqp connection. The L3 agent has a periodic sync routers task that helps in
  this regard. Maybe another neutron developer more familiar with the L2
  agent can chime in here if I'm missing anything.
 
 I don't think you are missing anything.
 Periodic sync would be a good improvement.
 
 YAMAMOTO Takashi
 


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-13 Thread Rossella Sblendido
 On 03/07/2015 01:10 PM, Leo Y wrote:
 What happens when the neutron DB is updated to change network settings (e.g.
 via Dashboard or manually) while there are communication sessions open
 on compute nodes? Does it influence those sessions? When is the update
 propagated to compute nodes?

Hi Leo,

When you say "change network settings", I think you mean a change in the
security group; is my assumption correct? In that case the Neutron
server will notify all the L2 agents (they reside on each compute node)
about the change. The Neutron server sends different kinds of messages
depending on the type of the update:
security_groups_rule_updated, security_groups_member_updated, or
security_groups_provider_updated. Each L2 agent will process the message
and apply the required modification on the host. In the default
implementation we use iptables to implement security groups, so the
update consists of some modifications to the iptables rules. Existing
connections on the compute nodes might not be affected
by the change, which is a problem already discussed in this mail thread
[1], and there's a patch in review to fix that [2].
Hope that answers your question.
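
As a rough illustration of how an agent might react to the three message types
named above, here is a small sketch. Only the three message names come from
this mail; the class, the payload shape (a list of security group IDs) and the
bookkeeping are assumptions, and the real agent code differs.

```python
class SecurityGroupNotificationHandler:
    """Hypothetical handler: message names are real, payloads are assumed."""

    _METHODS = (
        "security_groups_rule_updated",
        "security_groups_member_updated",
        "security_groups_provider_updated",
    )

    def __init__(self, ports_by_group):
        # Assumed bookkeeping: security group id -> set of local port ids.
        self.ports_by_group = ports_by_group
        self.ports_to_refresh = set()
        self.refresh_all = False

    def security_groups_rule_updated(self, security_groups):
        self._mark(security_groups)

    def security_groups_member_updated(self, security_groups):
        self._mark(security_groups)

    def security_groups_provider_updated(self):
        # Assumed here to affect every port on the host.
        self.refresh_all = True

    def _mark(self, security_groups):
        # Remember which local ports need their iptables rules rebuilt.
        for group in security_groups:
            self.ports_to_refresh |= self.ports_by_group.get(group, set())

    def dispatch(self, method, **kwargs):
        """Route an incoming notification to the matching handler method."""
        if method not in self._METHODS:
            raise ValueError("unknown notification: %s" % method)
        getattr(self, method)(**kwargs)
```

The handler only records what needs refreshing; rewriting the iptables rules
would happen later, in the agent's main loop.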

cheers,

Rossella

[1]
http://lists.openstack.org/pipermail/openstack-dev/2014-October/049055.html
[2] https://review.openstack.org/#/c/147713/

On 03/13/2015 04:10 AM, Kevin Benton wrote:
 Yeah, I was making a bad assumption about the L2 and L3 agents. Sorry about that.
 It sounds like we don't have any protection against servers failing to
 send notifications.
 
 On Mar 12, 2015 7:41 PM, Assaf Muller amul...@redhat.com wrote:
 
 
 
 - Original Message -
   However, I briefly looked through the L2 agent code and didn't see a
   periodic task to resync the port information to protect from a neutron
   server that failed to send a notification because it crashed or lost its
   amqp connection. The L3 agent has a periodic sync routers task that helps in
   this regard.
 
 The L3 agent periodic sync only runs if the full_sync flag was turned on, which
 happens as a result of an error.
 
   Maybe another neutron developer more familiar with the L2
   agent can chime in here if I'm missing anything.
 
  i don't think you are missing anything.
  periodic sync would be a good improvement.
 
  YAMAMOTO Takashi
 
 


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-12 Thread Leo Y
What does "if that notification is lost, the agent will eventually
resynchronize" mean? Is it proven/guaranteed? By what means?
Can you please describe the process in more detail, or point me to resources
that describe it?

Thank you


On Mon, Mar 9, 2015 at 2:11 AM, Kevin Benton blak...@gmail.com wrote:

 Port changes will result in an update message being sent on the AMQP
 message bus. When the agent receives it, it will affect the communications
 then. If that notification is lost, the agent will eventually resynchronize.

 So during normal operations, the change should take effect within a few
 seconds.

 On Sat, Mar 7, 2015 at 4:10 AM, Leo Y minh...@gmail.com wrote:

 Hello,

 What happens when the neutron DB is updated to change network settings (e.g.
 via Dashboard or manually) while there are communication sessions open on
 compute nodes? Does it influence those sessions? When is the update
 propagated to compute nodes?

 Thank you





 --
 Kevin Benton





-- 
Regards,
Leo


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-12 Thread Assaf Muller


- Original Message -
  However, I briefly looked through the L2 agent code and didn't see a
  periodic task to resync the port information to protect from a neutron
  server that failed to send a notification because it crashed or lost its
  amqp connection. The L3 agent has a periodic sync routers task that helps in
  this regard.

The L3 agent periodic sync only runs if the full_sync flag was turned on, which
happens as a result of an error.
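
The distinction can be sketched as follows. This is an illustration only, not
the L3 agent's code: the class, its methods and the `unconditional` option are
hypothetical; only the idea of a full_sync flag set on error comes from this
thread.

```python
class RouterSyncTask:
    """Hypothetical periodic task that is gated on a full_sync flag."""

    def __init__(self, fetch_routers, unconditional=False):
        # fetch_routers is an assumed callback returning the server's view.
        self.fetch_routers = fetch_routers
        # unconditional=True models the periodic resync proposed in this
        # thread; False models a sync that only runs after an error.
        self.unconditional = unconditional
        self.full_sync = True  # assumption: sync once on startup
        self.routers = {}

    def mark_error(self):
        """Called when something failed; request a sync on the next tick."""
        self.full_sync = True

    def periodic_tick(self):
        """Run on a timer; return True if a full sync was performed."""
        if not (self.full_sync or self.unconditional):
            return False
        self.routers = dict(self.fetch_routers())
        self.full_sync = False
        return True
```

With the flag-gated variant, an update that the server never sent goes
unnoticed, because nothing on the agent side sets the flag.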

  Maybe another neutron developer more familiar with the L2
  agent can chime in here if I'm missing anything.
 
 i don't think you are missing anything.
 periodic sync would be a good improvement.
 
 YAMAMOTO Takashi
 


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-12 Thread Kevin Benton
If the agent hits any error connecting to the message bus or retrieving
messages, an exception is thrown in the main rpc_loop. The exception is
caught and a sync flag is set to true, which triggers the sync on the
next loop iteration.
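
A minimal sketch of that pattern, under stated assumptions: the function name
mirrors rpc_loop, but the signature, the callbacks and the bounded iteration
count are made up for illustration and do not match the agent's actual code.

```python
import time


def rpc_loop(process_updates, full_resync, iterations, polling_interval=0.0):
    """Run a bounded loop in which any failure schedules a full resync.

    process_updates and full_resync are assumed callbacks. Returns the
    number of full resyncs that completed.
    """
    sync = True  # assumption: start with a full sync
    resyncs = 0
    for _ in range(iterations):
        try:
            if sync:
                full_resync()
                resyncs += 1
                sync = False
            process_updates()
        except Exception:
            # Broad catch on purpose: any failure, including a failed
            # resync, leaves the flag set so the next iteration resyncs.
            sync = True
        time.sleep(polling_interval)
    return resyncs
```

Note that the flag is only set by errors the agent itself observes, so this
does not cover a notification that the server failed to send.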

However, I briefly looked through the L2 agent code and didn't see a
periodic task to resync the port information to protect from a neutron
server that failed to send a notification because it crashed or lost its
amqp connection. The L3 agent has a periodic sync routers task that helps in
this regard. Maybe another neutron developer more familiar with the L2
agent can chime in here if I'm missing anything.

On Thu, Mar 12, 2015 at 6:19 AM, Leo Y minh...@gmail.com wrote:

 What does "if that notification is lost, the agent will eventually
 resynchronize" mean? Is it proven/guaranteed? By what means?
 Can you please describe the process in more detail, or point me to resources
 that describe it?

 Thank you


 On Mon, Mar 9, 2015 at 2:11 AM, Kevin Benton blak...@gmail.com wrote:

 Port changes will result in an update message being sent on the AMQP
 message bus. When the agent receives it, it will affect the communications
 then. If that notification is lost, the agent will eventually resynchronize.

 So during normal operations, the change should take effect within a few
 seconds.

 On Sat, Mar 7, 2015 at 4:10 AM, Leo Y minh...@gmail.com wrote:

 Hello,

 What happens when the neutron DB is updated to change network settings (e.g.
 via Dashboard or manually) while there are communication sessions open on
 compute nodes? Does it influence those sessions? When is the update
 propagated to compute nodes?

 Thank you






 --
 Kevin Benton





 --
 Regards,
 Leo




-- 
Kevin Benton


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-12 Thread Kevin Benton
Yeah, I was making a bad assumption about the L2 and L3 agents. Sorry about that. It
sounds like we don't have any protection against servers failing to send
notifications.
On Mar 12, 2015 7:41 PM, Assaf Muller amul...@redhat.com wrote:



 - Original Message -
   However, I briefly looked through the L2 agent code and didn't see a
   periodic task to resync the port information to protect from a neutron
   server that failed to send a notification because it crashed or lost its
   amqp connection. The L3 agent has a periodic sync routers task that helps in
   this regard.

 The L3 agent periodic sync only runs if the full_sync flag was turned on, which
 happens as a result of an error.

   Maybe another neutron developer more familiar with the L2
   agent can chime in here if I'm missing anything.
 
  i don't think you are missing anything.
  periodic sync would be a good improvement.
 
  YAMAMOTO Takashi
 
 


Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-12 Thread YAMAMOTO Takashi
 However, I briefly looked through the L2 agent code and didn't see a
 periodic task to resync the port information to protect from a neutron
 server that failed to send a notification because it crashed or lost its
 amqp connection. The L3 agent has a periodic sync routers task that helps in
 this regard. Maybe another neutron developer more familiar with the L2
 agent can chime in here if I'm missing anything.

i don't think you are missing anything.
periodic sync would be a good improvement.

YAMAMOTO Takashi



Re: [openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-08 Thread Kevin Benton
Port changes will result in an update message being sent on the AMQP
message bus. When the agent receives it, it will affect the communications
then. If that notification is lost, the agent will eventually resynchronize.

So during normal operations, the change should take effect within a few
seconds.

On Sat, Mar 7, 2015 at 4:10 AM, Leo Y minh...@gmail.com wrote:

 Hello,

 What happens when the neutron DB is updated to change network settings (e.g.
 via Dashboard or manually) while there are communication sessions open on
 compute nodes? Does it influence those sessions? When is the update
 propagated to compute nodes?

 Thank you





-- 
Kevin Benton


[openstack-dev] [neutron] Generic question about synchronizing neutron agent on compute node with DB

2015-03-07 Thread Leo Y
Hello,

What happens when the neutron DB is updated to change network settings (e.g.
via Dashboard or manually) while there are communication sessions open on
compute nodes? Does it influence those sessions? When is the update
propagated to compute nodes?

Thank you