Hi Salvatore,
Thanks for your reply...
On 08/05/15 09:20, Salvatore Orlando wrote:
Just like the Neutron plugin manager, the ML2 driver manager ensures
that drivers are loaded only once, regardless of the number of workers.
What Kevin did proves that drivers are correctly loaded before forking
(I reckon).
Yes, up to a point. It seems clear that we can rely on the following
events being ordered:
1. Mechanism drivers are instantiated (__init__) and initialized
(initialize).
2. The Neutron server forks (into a number of copies as dictated by
api_workers and rpc_workers).
3. Mechanism driver entry points such as create_port_pre/postcommit are
called.
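This ordering can be made visible with a tiny PID-logging sketch. This is
illustrative only (the class and hook signatures are simplified stand-ins,
not the real ML2 MechanismDriver API): initialize() records the parent
PID before the fork, and the postcommit hook later reports the worker PID
that actually handled the request.

```python
import os

class PidLoggingDriver:
    """Illustrative stand-in for an ML2 mechanism driver."""

    def initialize(self):
        # Step 1: runs exactly once, in the parent, before any fork.
        self.parent_pid = os.getpid()
        print("initialize() ran in PID %d" % self.parent_pid)

    def create_port_postcommit(self, context):
        # Step 3: runs in whichever worker handled the API request;
        # with api_workers > 0 this PID differs from the parent's.
        worker_pid = os.getpid()
        if worker_pid != self.parent_pid:
            print("postcommit ran in forked worker PID %d" % worker_pid)
        else:
            print("postcommit ran in the parent PID %d" % worker_pid)
```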
However...
However, forking is something to be careful about especially when using
eventlet. For the plugin my team maintains we were creating a periodic
task during plugin initialisation.
This led to an interesting condition where API workers were hanging
[1]. The situation was resolved with a rather pedestrian fix - by adding
a delay.
Yes! This is precisely the same situation that I have. Currently I am
also planning to 'fix' it by adding a delay of a few seconds. However
that is not an amazing fix: for one thing, if there is something that a
mechanism driver needs to do on startup, it would probably rather do it
as soon as possible; for another, it involves guessing how long steps
(1) and (2) above will take.
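For concreteness, the 'pedestrian fix' amounts to something like the
following sketch. It is a guess-based workaround, not a recommendation:
STARTUP_DELAY is pure guesswork about fork timing, and in a real
eventlet-based driver the scheduling call would be eventlet.spawn_after()
rather than the stdlib threading.Timer used here for self-containment.

```python
import threading

STARTUP_DELAY = 10  # seconds; a guess at how long forking will take

def delayed_startup(resync_fn, delay=STARTUP_DELAY):
    """Schedule the driver's startup work instead of running it
    directly in initialize(), hoping the fork completes first.

    The weakness: if the fork happens after the timer is scheduled,
    the scheduled work may be inherited by (or race with) the
    workers - which is exactly the hazard under discussion."""
    timer = threading.Timer(delay, resync_fn)
    timer.daemon = True  # don't block process shutdown
    timer.start()
    return timer
```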
Readers may be wondering why a mechanism driver needs to do anything on
startup. In general, the answer is to recheck the Neutron DB -
i.e. any VMs/ports that should already exist - and ensure that the
driver's downstream components are all correctly in sync with it. In
Calico's case, that means auditing that the routing and iptables on each
compute host match the current VM and security configuration.
This need is implied by the existence of the _postcommit entry points.
When a mechanism driver is implemented using those entry points, it is
possible for driver or downstream software to crash after the Neutron DB
believes that a transaction has been committed, and leave dataplane
state wrong. Clearly, then, when the driver or downstream software is
restarted, it needs to resync against the standing Neutron DB.
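The resync that such a restart requires can be sketched as a simple
reconciliation loop. The function and the port representations are
hypothetical placeholders (real code would query the Neutron DB and the
dataplane); the point is the shape of the audit: program anything the DB
has that the dataplane lacks or has stale, and remove anything the DB no
longer knows about.

```python
def resync(db_ports, dataplane_ports, add_fn, remove_fn):
    """Drive the dataplane to match the DB.

    db_ports and dataplane_ports map port ID -> config dict;
    add_fn/remove_fn are callbacks that (re)program or tear down
    a port in the dataplane."""
    for port_id, config in db_ports.items():
        if dataplane_ports.get(port_id) != config:
            # Missing or stale downstream: (re)program it.
            add_fn(port_id, config)
    for port_id in dataplane_ports:
        if port_id not in db_ports:
            # Orphan: the DB no longer knows about this port.
            remove_fn(port_id)
```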
Generally speaking, I would find it useful to have a way to "identify"
an API worker, in order to designate a specific one for processing that
should not be made redundant.
On the other hand, I object to my own statement above by saying that
API workers are not supposed to do this kind of processing, which should
instead be deferred to some other helper process.
+1 on both points :-)
There could be a post_fork() mechanism driver entry point. It wouldn't
matter which worker or helper process called it; the requirement would
be simply that it would only be called once, after all the forking has
occurred.
Regards,
Neil
Salvatore
[1] https://bugs.launchpad.net/vmware-nsx/+bug/1420278
On 8 May 2015 at 09:43, Kevin Benton <blak...@gmail.com> wrote:
I'm not sure I understand the behavior you are seeing. When your
mechanism driver gets initialized and kicks off processing, all of
that should be happening in the parent PID. I don't know why your
child processes start executing code that wasn't invoked. Can you
provide a pointer to the code or give a sample that reproduces the
issue?
I modified the linuxbridge mech driver to try to reproduce it:
http://paste.openstack.org/show/216859/
In the output, I never saw any of the init code output I added more
than once, including output from the function spawned using eventlet.
The only time I ever saw anything executed by a child process was
actual API requests (e.g. the create_port method).
On Thu, May 7, 2015 at 6:08 AM, Neil Jerram
<neil.jer...@metaswitch.com> wrote:
Is there a design for how ML2 mechanism drivers are supposed to
cope with the Neutron server forking?
What I'm currently seeing, with api_workers = 2, is:
- my mechanism driver gets instantiated and initialized, and
immediately kicks off some processing that involves
communicating over the network
- the Neutron server process then forks into multiple copies
- multiple copies of my driver's network processing then
continue, and interfere badly with each other :-)
I think what I should do is:
- wait until any forking has happened
- then decide (somehow) which mechanism driver is going to kick
off that processing, and do that.
But how can a mechanism driver know when the Neutron server
forking has happened?
Thanks,
Neil
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
--
Kevin Benton