Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-16 Thread Terry Wilson
 Right now I'm leaning toward "parent always does nothing" + PluginWorker.
 Everything is forked, no special case for workers==0, and explicit
 designation of the "only one" case. Of course, it's still early in the day
 and I haven't had any coffee.

I have updated the patch (https://review.openstack.org/#/c/189391/) to 
implement the above. I have it marked WIP because it doesn't have any tests and
because it adds a get_processes() call to ServicePluginBase, even though almost
no service plugins actually inherit from ServicePluginBase despite implementing
its interface. The get_processes() handling in general could be fleshed out a bit as
well. I just wanted to get something up for the purposes of discussion, so 
anyone interested in this particular problem should take a look and discuss. :)

Terry



Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-10 Thread Kyle Mestery
On Wed, Jun 10, 2015 at 2:25 PM, Neil Jerram neil.jer...@metaswitch.com
wrote:



 On 08/06/15 22:02, Kevin Benton wrote:

 This depends on what initialize is supposed to be doing. If it's just a
 one-time sync with a back-end, then I think calling it once in each
 child process might not be what we want.

 I left a comment on Terry's patch. I think we should just use the
 callback manager to have a pre-fork and post-fork event to let
 drivers/plugins do whatever is appropriate for them.


 Can you point me to more detail about the callback manager (or registry)?
 I haven't come across that yet.


http://docs.openstack.org/developer/neutron/devref/callbacks.html



 Thanks,
 Neil




Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-10 Thread Neil Jerram



On 08/06/15 22:02, Kevin Benton wrote:

This depends on what initialize is supposed to be doing. If it's just a
one-time sync with a back-end, then I think calling it once in each
child process might not be what we want.

I left a comment on Terry's patch. I think we should just use the
callback manager to have a pre-fork and post-fork event to let
drivers/plugins do whatever is appropriate for them.


Can you point me to more detail about the callback manager (or 
registry)?  I haven't come across that yet.


Thanks,
Neil



Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-10 Thread Terry Wilson
There are two classes of behavior that need to be handled:

1) There are things that can only be done after forking, like setting up 
connections or spawning threads.
2) Some things should only be done once regardless of the number of forks, like 
syncing.

Even when you just want something to happen once, there is a good chance you 
may need that to happen post-fork. For example, syncing between the OVSDB and 
Neutron databases requires a socket connection, and we don't want to have it 
going on 16 times.

Case 1 is a little complex due to how we launch api/rpc worker threads. The 
obvious place to notify that a fork is complete is in the 
RpcWorker/WorkerService start() methods, since they are the only code outside 
of openstack.common.service that is really called post-fork. The problem is the 
case where api_workers==rpc_workers==0. In this case, the parent process calls 
start() on both, so you end up with two calls to your post-fork initialization 
and only one process. It is easy enough to pass whether or not start() should 
call the initialization, or whether we hold off and let the main process do it 
before calling waitall()--it's just a bit ugly (see my patch: 
https://review.openstack.org/#/c/189391/).
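
To make that concrete, here is a rough sketch of the kind of guard I mean (all
names are made up for illustration; it is not literally what the patch does):

    # Hypothetical sketch: a start() that can skip the post-fork notification
    # so the single-process (workers==0) case only initializes once.

    def post_fork_initialize():
        # e.g. open backend sockets, spawn green threads
        print("post-fork initialization")

    class Worker(object):
        def __init__(self, notify_post_fork=True):
            self._notify_post_fork = notify_post_fork

        def start(self):
            # ... normal worker startup would happen here ...
            if self._notify_post_fork:
                post_fork_initialize()

    # Normal case: each forked child owns one worker and notifies for itself.
    Worker().start()

    # workers==0 case: the parent runs both workers itself, suppresses the
    # per-worker notification, and does the initialization exactly once
    # before calling waitall().
    api, rpc = Worker(notify_post_fork=False), Worker(notify_post_fork=False)
    api.start(); rpc.start()
    post_fork_initialize()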

Another option to handle case 1 would be to kill the case where you have a 
single process handling both workers. Always have the parent do nothing, and 
fork a process for each api/rpc worker treating workers=0 as workers=1. Then, 
start() can safely be used without hacking around the special case.

For case 2, the problem is deciding which process is *the one*. The fork() call
happens in the weird bastardized eventlet-threading hybrid openstack.common
ThreadGroup stuff, so who knows in what order things really happen. The easiest
thing to detect as unique is the parent process, via some plugin pre-fork call
that stores the parent's pid. The problem with using the parent process for the
'do it once' case is that we have to be able to guarantee that all the forking
is really done, and that all happens asynchronously via eventlet. Maybe an
accumulator that fires off an event once api_workers + rpc_workers fork()
events have been received? Anyway, it's messy.

Another option for case 2 would be to let the plugin specify that it needs its
own worker process. If so, spawn one and call PluginWorker.start(), which
initializes after the fork. Seems like it could be cleaner.
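
Something like this, say (just the shape of a hypothetical PluginWorker, not
the real service machinery; post_fork_initialize() is a made-up hook name):

    class PluginWorker(object):
        """A dedicated worker whose only job is the plugin's one-time init."""

        def __init__(self, plugin):
            self.plugin = plugin

        def start(self):
            # Runs in its own forked process, so any connections or green
            # threads created here are not shared with the API/RPC workers.
            self.plugin.post_fork_initialize()

        def wait(self):
            pass

        def stop(self):
            pass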

Right now I'm leaning toward "parent always does nothing" + PluginWorker.
Everything is forked, no special case for workers==0, and explicit designation
of the "only one" case. Of course, it's still early in the day and I haven't 
had any coffee.

Terry

- Original Message -
 This depends on what initialize is supposed to be doing. If it's just a
 one-time sync with a back-end, then I think calling it once in each child
 process might not be what we want.
 
 I left a comment on Terry's patch. I think we should just use the callback
 manager to have a pre-fork and post-fork event to let drivers/plugins do
 whatever is appropriate for them.
 
 On Mon, Jun 8, 2015 at 1:00 PM, Robert Kukura  kuk...@noironetworks.com 
 wrote:
 
 
 
 From a driver's perspective, it would be simpler, and I think sufficient, to
 change ML2 to call initialize() on drivers after the forking, rather than
 requiring drivers to know about forking.
 
 -Bob
 
 
 On 6/8/15 2:59 PM, Armando M. wrote:
 
 
 
 Interestingly, [1] was filed a few moments ago:
 
 [1] https://bugs.launchpad.net/neutron/+bug/1463129
 
 On 2 June 2015 at 22:48, Salvatore Orlando  sorla...@nicira.com  wrote:
 
 
 
 I'm not sure if you can test this behaviour on your own because it requires
 the VMware plugin and the eventlet handling of backend response.
 
 But the issue was manifesting and had to be fixed with this mega-hack [1].
 The issue was not about several workers executing the same code - the
 loopingcall was always started on a single thread. The issue I witnessed was
 that the other API workers just hang.
 
 There's probably something we need to understand about how eventlet can work
safely with os.fork() (I just think they're not really made to work
together!).
Regardless, I did not spend too much time on it, because I thought that the
 multiple workers code might have been rewritten anyway by the pecan switch
 activities you're doing.
 
 Salvatore
 
 
 [1] https://review.openstack.org/#/c/180145/
 
 On 3 June 2015 at 02:20, Kevin Benton  blak...@gmail.com  wrote:
 
 
 
 Sorry about the long delay.
 
  Even the LOG.error("KEVIN PID=%s network response: %s" % (os.getpid(),
  r.text)) line? Surely the server would have forked before that line was
  executed - so what could prevent it from executing once in each forked
  process, and hence generating multiple logs?
 
 Yes, just once. I wasn't able to reproduce the behavior you ran into. Maybe
 eventlet has some protection for this? Can you provide small sample code for
 the logging driver that does reproduce the issue?
 
 On Wed, May 13, 2015 at 5:19 AM, Neil Jerram  

Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-08 Thread Armando M.
Interestingly, [1] was filed a few moments ago:

[1] https://bugs.launchpad.net/neutron/+bug/1463129

On 2 June 2015 at 22:48, Salvatore Orlando sorla...@nicira.com wrote:

 I'm not sure if you can test this behaviour on your own because it
 requires the VMware plugin and the eventlet handling of backend response.

 But the issue was manifesting and had to be fixed with this mega-hack [1].
 The issue was not about several workers executing the same code - the
 loopingcall was always started on a single thread. The issue I witnessed
 was that the other API workers just hang.

 There's probably something we need to understand about how eventlet can
 work safely with os.fork() (I just think they're not really made to work
 together!).
 Regardless, I did not spend too much time on it, because I thought that
 the multiple workers code might have been rewritten anyway by the pecan
 switch activities you're doing.

 Salvatore


 [1] https://review.openstack.org/#/c/180145/

 On 3 June 2015 at 02:20, Kevin Benton blak...@gmail.com wrote:

 Sorry about the long delay.

 Even the LOG.error("KEVIN PID=%s network response: %s" % (os.getpid(),
 r.text)) line?  Surely the server would have forked before that line was
 executed - so what could prevent it from executing once in each forked
 process, and hence generating multiple logs?

 Yes, just once. I wasn't able to reproduce the behavior you ran into.
 Maybe eventlet has some protection for this? Can you provide small sample
 code for the logging driver that does reproduce the issue?

 On Wed, May 13, 2015 at 5:19 AM, Neil Jerram neil.jer...@metaswitch.com
 wrote:

 Hi Kevin,

 Thanks for your response...

 On 08/05/15 08:43, Kevin Benton wrote:

 I'm not sure I understand the behavior you are seeing. When your
 mechanism driver gets initialized and kicks off processing, all of that
 should be happening in the parent PID. I don't know why your child
 processes start executing code that wasn't invoked. Can you provide a
 pointer to the code or give a sample that reproduces the issue?


 https://github.com/Metaswitch/calico/tree/master/calico/openstack

 Basically, our driver's initialize method immediately kicks off a green
 thread to audit what is now in the Neutron DB, and to ensure that the other
 Calico components are consistent with that.

  I modified the linuxbridge mech driver to try to reproduce it:
 http://paste.openstack.org/show/216859/

 In the output, I never received any of the init code output I added more
 than once, including the function spawned using eventlet.


 Interesting.  Even the LOG.error("KEVIN PID=%s network response: %s" %
 (os.getpid(), r.text)) line?  Surely the server would have forked before
 that line was executed - so what could prevent it from executing once in
 each forked process, and hence generating multiple logs?

 Thanks,
 Neil

  The only time I ever saw anything executed by a child process was actual
 API requests (e.g. the create_port method).




  On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

 Is there a design for how ML2 mechanism drivers are supposed to cope
 with the Neutron server forking?

 What I'm currently seeing, with api_workers = 2, is:

 - my mechanism driver gets instantiated and initialized, and
 immediately kicks off some processing that involves communicating
 over the network

 - the Neutron server process then forks into multiple copies

 - multiple copies of my driver's network processing then continue,
 and interfere badly with each other :-)

 I think what I should do is:

 - wait until any forking has happened

 - then decide (somehow) which mechanism driver is going to kick off
 that processing, and do that.

 But how can a mechanism driver know when the Neutron server forking
 has happened?

 Thanks,
  Neil






 --
 Kevin Benton










 --
 Kevin Benton

 

Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-08 Thread Robert Kukura
From a driver's perspective, it would be simpler, and I think 
sufficient, to change ML2 to call initialize() on drivers after the 
forking, rather than requiring drivers to know about forking.


-Bob

On 6/8/15 2:59 PM, Armando M. wrote:

Interestingly, [1] was filed a few moments ago:

[1] https://bugs.launchpad.net/neutron/+bug/1463129

On 2 June 2015 at 22:48, Salvatore Orlando sorla...@nicira.com wrote:


I'm not sure if you can test this behaviour on your own because it
requires the VMware plugin and the eventlet handling of backend
response.

But the issue was manifesting and had to be fixed with this
mega-hack [1]. The issue was not about several workers executing
the same code - the loopingcall was always started on a single
thread. The issue I witnessed was that the other API workers just
hang.

There's probably something we need to understand about how
eventlet can work safely with os.fork() (I just think they're not
really made to work together!).
Regardless, I did not spend too much time on it, because I thought
that the multiple workers code might have been rewritten anyway by
the pecan switch activities you're doing.

Salvatore


[1] https://review.openstack.org/#/c/180145/

On 3 June 2015 at 02:20, Kevin Benton blak...@gmail.com wrote:

Sorry about the long delay.

Even the LOG.error("KEVIN PID=%s network response: %s" %
(os.getpid(), r.text)) line?  Surely the server would have
forked before that line was executed - so what could prevent
it from executing once in each forked process, and hence
generating multiple logs?

Yes, just once. I wasn't able to reproduce the behavior you
ran into. Maybe eventlet has some protection for this? Can you
provide small sample code for the logging driver that does
reproduce the issue?

On Wed, May 13, 2015 at 5:19 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

Hi Kevin,

Thanks for your response...

On 08/05/15 08:43, Kevin Benton wrote:

I'm not sure I understand the behavior you are seeing.
When your
mechanism driver gets initialized and kicks off
processing, all of that
should be happening in the parent PID. I don't know
why your child
processes start executing code that wasn't invoked.
Can you provide a
pointer to the code or give a sample that reproduces
the issue?


https://github.com/Metaswitch/calico/tree/master/calico/openstack

Basically, our driver's initialize method immediately
kicks off a green thread to audit what is now in the
Neutron DB, and to ensure that the other Calico components
are consistent with that.

I modified the linuxbridge mech driver to try to
reproduce it:
http://paste.openstack.org/show/216859/

In the output, I never received any of the init code
output I added more
than once, including the function spawned using eventlet.


Interesting.  Even the LOG.error("KEVIN PID=%s network
response: %s" % (os.getpid(), r.text)) line?  Surely the
server would have forked before that line was executed -
so what could prevent it from executing once in each
forked process, and hence generating multiple logs?

Thanks,
Neil

The only time I ever saw anything executed by a child
process was actual
API requests (e.g. the create_port method).




On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

Is there a design for how ML2 mechanism drivers
are supposed to cope
with the Neutron server forking?

What I'm currently seeing, with api_workers = 2, is:

- my mechanism driver gets instantiated and
initialized, and
immediately kicks off some processing that
involves communicating
over the network

- the Neutron server process then forks into
multiple copies

- multiple copies of my driver's network
processing then continue,
and interfere badly with each other :-)

I think what I should do is:

- wait 

Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-08 Thread Russell Bryant
Right, I think there are use cases for both.  I don't think it's a huge
burden to have to know about it.  I think it's actually quite important
to understand when the initialization happens.

-- 
Russell Bryant

On 06/08/2015 05:02 PM, Kevin Benton wrote:
 This depends on what initialize is supposed to be doing. If it's just a
 one-time sync with a back-end, then I think calling it once in each
 child process might not be what we want.
 
 I left a comment on Terry's patch. I think we should just use the
 callback manager to have a pre-fork and post-fork event to let
 drivers/plugins do whatever is appropriate for them.
 
 On Mon, Jun 8, 2015 at 1:00 PM, Robert Kukura kuk...@noironetworks.com wrote:
 
 From a driver's perspective, it would be simpler, and I think
 sufficient, to change ML2 to call initialize() on drivers after the
 forking, rather than requiring drivers to know about forking.
 
 -Bob
 
 
 On 6/8/15 2:59 PM, Armando M. wrote:
 Interestingly, [1] was filed a few moments ago:

 [1] https://bugs.launchpad.net/neutron/+bug/1463129

 On 2 June 2015 at 22:48, Salvatore Orlando sorla...@nicira.com wrote:

 I'm not sure if you can test this behaviour on your own
 because it requires the VMware plugin and the eventlet
 handling of backend response.

 But the issue was manifesting and had to be fixed with this
 mega-hack [1]. The issue was not about several workers
 executing the same code - the loopingcall was always started
 on a single thread. The issue I witnessed was that the other
 API workers just hang.

 There's probably something we need to understand about how
 eventlet can work safely with os.fork() (I just think they're
 not really made to work together!).
 Regardless, I did not spend too much time on it, because I
 thought that the multiple workers code might have been
 rewritten anyway by the pecan switch activities you're doing.

 Salvatore


 [1] https://review.openstack.org/#/c/180145/

 On 3 June 2015 at 02:20, Kevin Benton blak...@gmail.com wrote:

 Sorry about the long delay.

 Even the LOG.error("KEVIN PID=%s network response: %s" %
 (os.getpid(), r.text)) line?  Surely the server would have
 forked before that line was executed - so what could
 prevent it from executing once in each forked process, and
 hence generating multiple logs?

 Yes, just once. I wasn't able to reproduce the behavior
 you ran into. Maybe eventlet has some protection for this?
 Can you provide small sample code for the logging driver
 that does reproduce the issue? 

 On Wed, May 13, 2015 at 5:19 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

 Hi Kevin,

 Thanks for your response...

 On 08/05/15 08:43, Kevin Benton wrote:

 I'm not sure I understand the behavior you are
 seeing. When your
 mechanism driver gets initialized and kicks off
 processing, all of that
 should be happening in the parent PID. I don't
 know why your child
 processes start executing code that wasn't
 invoked. Can you provide a
 pointer to the code or give a sample that
 reproduces the issue?


 
 https://github.com/Metaswitch/calico/tree/master/calico/openstack

 Basically, our driver's initialize method immediately
 kicks off a green thread to audit what is now in the
 Neutron DB, and to ensure that the other Calico
 components are consistent with that.

 I modified the linuxbridge mech driver to try to
 reproduce it:
 http://paste.openstack.org/show/216859/

 In the output, I never received any of the init
 code output I added more
 than once, including the function spawned using
 eventlet.


 Interesting.  Even the LOG.error("KEVIN PID=%s network
 response: %s" % (os.getpid(), r.text)) line?  Surely
 the server would have forked before that line was
 executed - so what could prevent it from executing
 once in each forked process, and hence generating
 multiple logs?

 Thanks,
 Neil

 The only time I ever saw anything executed by a
 child process 

Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-08 Thread Kevin Benton
This depends on what initialize is supposed to be doing. If it's just a
one-time sync with a back-end, then I think calling it once in each child
process might not be what we want.

I left a comment on Terry's patch. I think we should just use the callback
manager to have a pre-fork and post-fork event to let drivers/plugins do
whatever is appropriate for them.
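
For illustration, roughly what I have in mind with the callbacks registry (the
PROCESS resource and the BEFORE_FORK/AFTER_FORK event names below are made up
for the example, not existing constants):

    from neutron.callbacks import registry

    # Hypothetical resource/event names, invented for this example.
    PROCESS = 'process'
    BEFORE_FORK = 'before_fork'
    AFTER_FORK = 'after_fork'

    def pre_fork_handler(resource, event, trigger, **kwargs):
        # e.g. tear down anything that must not be shared across a fork.
        pass

    def post_fork_handler(resource, event, trigger, **kwargs):
        # e.g. open backend connections or spawn green threads, once per
        # child process.
        pass

    # A driver/plugin subscribes during initialize():
    registry.subscribe(pre_fork_handler, PROCESS, BEFORE_FORK)
    registry.subscribe(post_fork_handler, PROCESS, AFTER_FORK)

    # ...and the service framework would notify around os.fork():
    registry.notify(PROCESS, BEFORE_FORK, None)
    registry.notify(PROCESS, AFTER_FORK, None)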

On Mon, Jun 8, 2015 at 1:00 PM, Robert Kukura kuk...@noironetworks.com
wrote:

  From a driver's perspective, it would be simpler, and I think sufficient,
 to change ML2 to call initialize() on drivers after the forking, rather
 than requiring drivers to know about forking.

 -Bob


 On 6/8/15 2:59 PM, Armando M. wrote:

 Interestingly, [1] was filed a few moments ago:

  [1] https://bugs.launchpad.net/neutron/+bug/1463129

 On 2 June 2015 at 22:48, Salvatore Orlando sorla...@nicira.com wrote:

 I'm not sure if you can test this behaviour on your own because it
 requires the VMware plugin and the eventlet handling of backend response.

  But the issue was manifesting and had to be fixed with this mega-hack
 [1]. The issue was not about several workers executing the same code - the
 loopingcall was always started on a single thread. The issue I witnessed
 was that the other API workers just hang.

  There's probably something we need to understand about how eventlet can
 work safely with os.fork() (I just think they're not really made to work
 together!).
 Regardless, I did not spend too much time on it, because I thought that
 the multiple workers code might have been rewritten anyway by the pecan
 switch activities you're doing.

  Salvatore


  [1] https://review.openstack.org/#/c/180145/

 On 3 June 2015 at 02:20, Kevin Benton blak...@gmail.com wrote:

 Sorry about the long delay.

  Even the LOG.error("KEVIN PID=%s network response: %s" %
 (os.getpid(), r.text)) line?  Surely the server would have forked before
 that line was executed - so what could prevent it from executing once in
 each forked process, and hence generating multiple logs?

  Yes, just once. I wasn't able to reproduce the behavior you ran into.
 Maybe eventlet has some protection for this? Can you provide small sample
 code for the logging driver that does reproduce the issue?

 On Wed, May 13, 2015 at 5:19 AM, Neil Jerram neil.jer...@metaswitch.com
  wrote:

 Hi Kevin,

 Thanks for your response...

 On 08/05/15 08:43, Kevin Benton wrote:

 I'm not sure I understand the behavior you are seeing. When your
 mechanism driver gets initialized and kicks off processing, all of that
 should be happening in the parent PID. I don't know why your child
 processes start executing code that wasn't invoked. Can you provide a
 pointer to the code or give a sample that reproduces the issue?


 https://github.com/Metaswitch/calico/tree/master/calico/openstack

 Basically, our driver's initialize method immediately kicks off a green
 thread to audit what is now in the Neutron DB, and to ensure that the other
 Calico components are consistent with that.

  I modified the linuxbridge mech driver to try to reproduce it:
 http://paste.openstack.org/show/216859/

 In the output, I never received any of the init code output I added
 more
 than once, including the function spawned using eventlet.


  Interesting.  Even the LOG.error("KEVIN PID=%s network response: %s" %
 (os.getpid(), r.text)) line?  Surely the server would have forked before
 that line was executed - so what could prevent it from executing once in
 each forked process, and hence generating multiple logs?

 Thanks,
 Neil

  The only time I ever saw anything executed by a child process was
 actual
 API requests (e.g. the create_port method).




  On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

 Is there a design for how ML2 mechanism drivers are supposed to
 cope
 with the Neutron server forking?

 What I'm currently seeing, with api_workers = 2, is:

 - my mechanism driver gets instantiated and initialized, and
 immediately kicks off some processing that involves communicating
 over the network

 - the Neutron server process then forks into multiple copies

 - multiple copies of my driver's network processing then continue,
 and interfere badly with each other :-)

 I think what I should do is:

 - wait until any forking has happened

 - then decide (somehow) which mechanism driver is going to kick off
 that processing, and do that.

 But how can a mechanism driver know when the Neutron server forking
 has happened?

 Thanks,
  Neil


 

Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-02 Thread Kevin Benton
Sorry about the long delay.

Even the LOG.error("KEVIN PID=%s network response: %s" % (os.getpid(),
r.text)) line?  Surely the server would have forked before that line was
executed - so what could prevent it from executing once in each forked
process, and hence generating multiple logs?

Yes, just once. I wasn't able to reproduce the behavior you ran into. Maybe
eventlet has some protection for this? Can you provide small sample code
for the logging driver that does reproduce the issue?

On Wed, May 13, 2015 at 5:19 AM, Neil Jerram neil.jer...@metaswitch.com
wrote:

 Hi Kevin,

 Thanks for your response...

 On 08/05/15 08:43, Kevin Benton wrote:

 I'm not sure I understand the behavior you are seeing. When your
 mechanism driver gets initialized and kicks off processing, all of that
 should be happening in the parent PID. I don't know why your child
 processes start executing code that wasn't invoked. Can you provide a
 pointer to the code or give a sample that reproduces the issue?


 https://github.com/Metaswitch/calico/tree/master/calico/openstack

 Basically, our driver's initialize method immediately kicks off a green
 thread to audit what is now in the Neutron DB, and to ensure that the other
 Calico components are consistent with that.

  I modified the linuxbridge mech driver to try to reproduce it:
 http://paste.openstack.org/show/216859/

 In the output, I never received any of the init code output I added more
 than once, including the function spawned using eventlet.


 Interesting.  Even the LOG.error("KEVIN PID=%s network response: %s" %
 (os.getpid(), r.text)) line?  Surely the server would have forked before
 that line was executed - so what could prevent it from executing once in
 each forked process, and hence generating multiple logs?

 Thanks,
 Neil

  The only time I ever saw anything executed by a child process was actual
 API requests (e.g. the create_port method).




  On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

 Is there a design for how ML2 mechanism drivers are supposed to cope
 with the Neutron server forking?

 What I'm currently seeing, with api_workers = 2, is:

 - my mechanism driver gets instantiated and initialized, and
 immediately kicks off some processing that involves communicating
 over the network

 - the Neutron server process then forks into multiple copies

 - multiple copies of my driver's network processing then continue,
 and interfere badly with each other :-)

 I think what I should do is:

 - wait until any forking has happened

 - then decide (somehow) which mechanism driver is going to kick off
 that processing, and do that.

 But how can a mechanism driver know when the Neutron server forking
 has happened?

 Thanks,
  Neil






 --
 Kevin Benton








-- 
Kevin Benton


Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-06-02 Thread Salvatore Orlando
I'm not sure if you can test this behaviour on your own because it requires
the VMware plugin and the eventlet handling of backend response.

But the issue was manifesting and had to be fixed with this mega-hack [1].
The issue was not about several workers executing the same code - the
loopingcall was always started on a single thread. The issue I witnessed
was that the other API workers just hang.

There's probably something we need to understand about how eventlet can
work safely with os.fork() (I just think they're not really made to work
together!).
Regardless, I did not spend too much time on it, because I thought that the
multiple workers code might have been rewritten anyway by the pecan switch
activities you're doing.

Salvatore


[1] https://review.openstack.org/#/c/180145/

On 3 June 2015 at 02:20, Kevin Benton blak...@gmail.com wrote:

 Sorry about the long delay.

 Even the LOG.error("KEVIN PID=%s network response: %s" % (os.getpid(),
 r.text)) line?  Surely the server would have forked before that line was
 executed - so what could prevent it from executing once in each forked
 process, and hence generating multiple logs?

 Yes, just once. I wasn't able to reproduce the behavior you ran into.
 Maybe eventlet has some protection for this? Can you provide small sample
 code for the logging driver that does reproduce the issue?

 On Wed, May 13, 2015 at 5:19 AM, Neil Jerram neil.jer...@metaswitch.com
 wrote:

 Hi Kevin,

 Thanks for your response...

 On 08/05/15 08:43, Kevin Benton wrote:

 I'm not sure I understand the behavior you are seeing. When your
 mechanism driver gets initialized and kicks off processing, all of that
 should be happening in the parent PID. I don't know why your child
 processes start executing code that wasn't invoked. Can you provide a
 pointer to the code or give a sample that reproduces the issue?


 https://github.com/Metaswitch/calico/tree/master/calico/openstack

 Basically, our driver's initialize method immediately kicks off a green
 thread to audit what is now in the Neutron DB, and to ensure that the other
 Calico components are consistent with that.

  I modified the linuxbridge mech driver to try to reproduce it:
 http://paste.openstack.org/show/216859/

 In the output, I never received any of the init code output I added more
 than once, including the function spawned using eventlet.


 Interesting.  Even the LOG.error("KEVIN PID=%s network response: %s" %
 (os.getpid(), r.text)) line?  Surely the server would have forked before
 that line was executed - so what could prevent it from executing once in
 each forked process, and hence generating multiple logs?

 Thanks,
 Neil

  The only time I ever saw anything executed by a child process was actual
 API requests (e.g. the create_port method).




  On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

 Is there a design for how ML2 mechanism drivers are supposed to cope
 with the Neutron server forking?

 What I'm currently seeing, with api_workers = 2, is:

 - my mechanism driver gets instantiated and initialized, and
 immediately kicks off some processing that involves communicating
 over the network

 - the Neutron server process then forks into multiple copies

 - multiple copies of my driver's network processing then continue,
 and interfere badly with each other :-)

 I think what I should do is:

 - wait until any forking has happened

 - then decide (somehow) which mechanism driver is going to kick off
 that processing, and do that.

 But how can a mechanism driver know when the Neutron server forking
 has happened?

 Thanks,
  Neil






 --
 Kevin Benton









 --
 Kevin Benton

 

Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-05-13 Thread Neil Jerram

Hi Kevin,

Thanks for your response...

On 08/05/15 08:43, Kevin Benton wrote:

I'm not sure I understand the behavior you are seeing. When your
mechanism driver gets initialized and kicks off processing, all of that
should be happening in the parent PID. I don't know why your child
processes start executing code that wasn't invoked. Can you provide a
pointer to the code or give a sample that reproduces the issue?


https://github.com/Metaswitch/calico/tree/master/calico/openstack

Basically, our driver's initialize method immediately kicks off a green 
thread to audit what is now in the Neutron DB, and to ensure that the 
other Calico components are consistent with that.
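
In outline, the pattern is simply this (a simplified stand-in, not the actual
Calico code):

    import eventlet

    class CalicoMechanismDriverSketch(object):
        def initialize(self):
            # Kick off the resync immediately; this is the part that ends up
            # running in whichever process(es) end up executing it.
            eventlet.spawn(self.audit_neutron_db)

        def audit_neutron_db(self):
            # Walk the ports/VMs currently in the Neutron DB and make sure
            # the other Calico components agree with them.
            pass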



I modified the linuxbridge mech driver to try to reproduce it:
http://paste.openstack.org/show/216859/

In the output, I never received any of the init code output I added more
than once, including the function spawned using eventlet.


Interesting.  Even the LOG.error("KEVIN PID=%s network response: %s" % 
(os.getpid(), r.text)) line?  Surely the server would have forked before 
that line was executed - so what could prevent it from executing once in 
each forked process, and hence generating multiple logs?


Thanks,
Neil


The only time I ever saw anything executed by a child process was actual
API requests (e.g. the create_port method).





On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

Is there a design for how ML2 mechanism drivers are supposed to cope
with the Neutron server forking?

What I'm currently seeing, with api_workers = 2, is:

- my mechanism driver gets instantiated and initialized, and
immediately kicks off some processing that involves communicating
over the network

- the Neutron server process then forks into multiple copies

- multiple copies of my driver's network processing then continue,
and interfere badly with each other :-)

I think what I should do is:

- wait until any forking has happened

- then decide (somehow) which mechanism driver is going to kick off
that processing, and do that.

But how can a mechanism driver know when the Neutron server forking
has happened?

Thanks,
 Neil





--
Kevin Benton







Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-05-13 Thread Neil Jerram

Hi Salvatore,

Thanks for your reply...

On 08/05/15 09:20, Salvatore Orlando wrote:

Just like the Neutron plugin manager, the ML2 driver manager ensures
drivers are loaded only once regardless of the number of workers.
What Kevin did proves that drivers are correctly loaded before forking
(I reckon).


Yes, up to a point.  It seems clear that we can rely on the following 
events being ordered:


1. Mechanism drivers are instantiated (__init__) and initialized 
(initialize).


2. The Neutron server forks (into a number of copies as dictated by 
api_workers and rpc_workers).


3. Mechanism driver entry points such as create_port_pre/postcommit are 
called.


However...


However, forking is something to be careful about especially when using
eventlet. For the plugin my team maintains we were creating a periodic
task during plugin initialisation.
This led to an interesting condition where API workers were hanging
[1]. This situation was fixed with a rather pedestrian fix - by adding a
delay.


Yes!  This is precisely the same situation that I have.  Currently I am 
also planning to 'fix' it by adding a delay of a few seconds.  However 
that is not an amazing fix, because if there is something that a 
mechanism driver needs to do on startup, it would probably rather do it 
as soon as possible; and on the other hand because it involves guessing 
how long steps (1) and (2) above will take.


Readers may be wondering why a mechanism driver needs to do something on 
startup.  In general, the answer is to recheck the Neutron DB - 
i.e. any VMs/ports that should already exist - and ensure that the 
driver's downstream components are all correctly in sync with that.  In 
Calico's case, that means auditing that the routing and iptables on each 
compute host match to the current VM and security configuration.


This need is implied by the existence of the _postcommit entry points. 
When a mechanism driver is implemented using those entry points, it is 
possible for the driver or downstream software to crash after the Neutron DB 
believes that a transaction has been committed, and leave dataplane 
state wrong.  Clearly, then, when the driver or downstream software is 
restarted, it needs to resync against the standing Neutron DB.



Generally speaking I would find it useful to have a way to identify an
API worker in order to designate a specific one for processing that
should not be made redundant.
On the other hand I self-object to the above statement by saying that
API workers are not supposed to do this kind of processing, which should
be deferred to some other helper process.


+1 on both points :-)

There could be a post_fork() mechanism driver entry point.  It wouldn't 
matter which worker or helper process called it; the requirement would 
be simply that it would only be called once, after all the forking has 
occurred.
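
For example, a mechanism driver might then look like this (post_fork() being 
the hypothetical new entry point; everything else is as today):

    class ExampleMechanismDriver(object):
        def initialize(self):
            # Called once, before any forking: keep this side-effect free
            # (no sockets, no green threads).
            pass

        def post_fork(self):
            # Hypothetical new entry point: called exactly once, in some
            # worker or helper process, after all forking has finished.
            # This is where the DB audit / downstream resync would start.
            pass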


Regards,
Neil



Salvatore

[1] https://bugs.launchpad.net/vmware-nsx/+bug/1420278

On 8 May 2015 at 09:43, Kevin Benton blak...@gmail.com wrote:

I'm not sure I understand the behavior you are seeing. When your
mechanism driver gets initialized and kicks off processing, all of
that should be happening in the parent PID. I don't know why your
child processes start executing code that wasn't invoked. Can you
provide a pointer to the code or give a sample that reproduces the
issue?

I modified the linuxbridge mech driver to try to reproduce it:
http://paste.openstack.org/show/216859/

In the output, I never received any of the init code output I added
more than once, including the function spawned using eventlet.

The only time I ever saw anything executed by a child process was
actual API requests (e.g. the create_port method).


On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com wrote:

Is there a design for how ML2 mechanism drivers are supposed to
cope with the Neutron server forking?

What I'm currently seeing, with api_workers = 2, is:

- my mechanism driver gets instantiated and initialized, and
immediately kicks off some processing that involves
communicating over the network

- the Neutron server process then forks into multiple copies

- multiple copies of my driver's network processing then
continue, and interfere badly with each other :-)

I think what I should do is:

- wait until any forking has happened

- then decide (somehow) which mechanism driver is going to kick
off that processing, and do that.

But how can a mechanism driver know when the Neutron server
forking has happened?

Thanks,
 Neil



Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-05-08 Thread Kevin Benton
I'm not sure I understand the behavior you are seeing. When your mechanism
driver gets initialized and kicks off processing, all of that should be
happening in the parent PID. I don't know why your child processes start
executing code that wasn't invoked. Can you provide a pointer to the code
or give a sample that reproduces the issue?

I modified the linuxbridge mech driver to try to reproduce it:
http://paste.openstack.org/show/216859/

In the output, I never received any of the init code output I added more
than once, including the function spawned using eventlet.

The only time I ever saw anything executed by a child process was actual
API requests (e.g. the create_port method).


On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com
wrote:

 Is there a design for how ML2 mechanism drivers are supposed to cope with
 the Neutron server forking?

 What I'm currently seeing, with api_workers = 2, is:

 - my mechanism driver gets instantiated and initialized, and immediately
 kicks off some processing that involves communicating over the network

 - the Neutron server process then forks into multiple copies

 - multiple copies of my driver's network processing then continue, and
 interfere badly with each other :-)

 I think what I should do is:

 - wait until any forking has happened

 - then decide (somehow) which mechanism driver is going to kick off that
 processing, and do that.

 But how can a mechanism driver know when the Neutron server forking has
 happened?

 Thanks,
 Neil





-- 
Kevin Benton


Re: [openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-05-08 Thread Salvatore Orlando
Just like the Neutron plugin manager, the ML2 driver manager ensures
drivers are loaded only once regardless of the number of workers.
What Kevin did proves that drivers are correctly loaded before forking (I
reckon).

However, forking is something to be careful about especially when using
eventlet. For the plugin my team maintains we were creating a periodic task
during plugin initialisation.
This led to an interesting condition where API workers were hanging [1].
This situation was fixed with a rather pedestrian fix - by adding a delay.
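
(For the record, the pedestrian fix was essentially of this shape -- delay the
start of the periodic task instead of spawning it directly during plugin
initialisation. A simplified sketch, not the actual patch:)

    import eventlet

    def start_periodic_task(task, initial_delay=10):
        # Delay kicking off the plugin's periodic task so that the API
        # workers have (hopefully) finished forking first; the delay value
        # is just a guess.
        eventlet.spawn_after(initial_delay, task)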

Generally speaking I would find it useful to have a way to identify an API
worker in order to designate a specific one for processing that should not
be made redundant.
On the other hand I self-object to the above statement by saying that API
workers are not supposed to do this kind of processing, which should be
deferred to some other helper process.

Salvatore

[1] https://bugs.launchpad.net/vmware-nsx/+bug/1420278

On 8 May 2015 at 09:43, Kevin Benton blak...@gmail.com wrote:

 I'm not sure I understand the behavior you are seeing. When your mechanism
 driver gets initialized and kicks off processing, all of that should be
 happening in the parent PID. I don't know why your child processes start
 executing code that wasn't invoked. Can you provide a pointer to the code
 or give a sample that reproduces the issue?

 I modified the linuxbridge mech driver to try to reproduce it:
 http://paste.openstack.org/show/216859/

 In the output, I never received any of the init code output I added more
 than once, including the function spawned using eventlet.

 The only time I ever saw anything executed by a child process was actual
 API requests (e.g. the create_port method).


 On Thu, May 7, 2015 at 6:08 AM, Neil Jerram neil.jer...@metaswitch.com
 wrote:

 Is there a design for how ML2 mechanism drivers are supposed to cope with
 the Neutron server forking?

 What I'm currently seeing, with api_workers = 2, is:

 - my mechanism driver gets instantiated and initialized, and immediately
 kicks off some processing that involves communicating over the network

 - the Neutron server process then forks into multiple copies

 - multiple copies of my driver's network processing then continue, and
 interfere badly with each other :-)

 I think what I should do is:

 - wait until any forking has happened

 - then decide (somehow) which mechanism driver is going to kick off that
 processing, and do that.

 But how can a mechanism driver know when the Neutron server forking has
 happened?

 Thanks,
 Neil





 --
 Kevin Benton





[openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

2015-05-07 Thread Neil Jerram
Is there a design for how ML2 mechanism drivers are supposed to cope 
with the Neutron server forking?


What I'm currently seeing, with api_workers = 2, is:

- my mechanism driver gets instantiated and initialized, and immediately 
kicks off some processing that involves communicating over the network


- the Neutron server process then forks into multiple copies

- multiple copies of my driver's network processing then continue, and 
interfere badly with each other :-)


I think what I should do is:

- wait until any forking has happened

- then decide (somehow) which mechanism driver is going to kick off that 
processing, and do that.


But how can a mechanism driver know when the Neutron server forking has 
happened?


Thanks,
Neil
