Isaku, do you have an implementation or a blueprint (BP) in mind? We could work on this together; every plugin would benefit from a better implementation.

Thanks,

Edgar

On 11/19/13 3:57 AM, "Isaku Yamahata" <isaku.yamah...@gmail.com> wrote:

>On Mon, Nov 18, 2013 at 03:55:49PM -0500,
>Robert Kukura <rkuk...@redhat.com> wrote:
>
>> On 11/18/2013 03:25 PM, Edgar Magana wrote:
>> > Developers,
>> >
>> > This topic has been discussed before but I do not remember if we
>> > have a good solution or not.
>>
>> The ML2 plugin addresses this by calling each MechanismDriver twice. The
>> create_network_precommit() method is called as part of the DB
>> transaction, and the create_network_postcommit() method is called after
>> the transaction has been committed. Interactions with devices or
>> controllers are done in the postcommit methods. If the postcommit method
>> raises an exception, the plugin deletes that partially-created resource
>> and returns the exception to the client. You might consider a similar
>> approach in your plugin.
>
>Splitting the work into two phases, pre/post, is a good approach.
>But a race window still remains.
>Once the transaction is committed, the result is visible to the outside,
>so concurrent requests to the same resource will be racy.
>There is a window after pre_xxx_yyy() and before post_xxx_yyy() where
>other requests can be handled.
>
>The state machine needs to be enhanced, I think. (Plugins need
>modification.)
>For example, adding more states like pending_{create, delete, update}.
>We would also like to consider serializing between port and subnet
>operations, or between subnet and network operations, depending on
>performance requirements.
>(Or carefully audit complex status changes, i.e.
>changing a port during subnet/network update/deletion.)
>
>I think it would be useful to establish a reference locking policy
>for the ML2 plugin for SDN controllers.
>Thoughts or comments? If this is considered useful and acceptable,
>I'm willing to help.
>
>thanks,
>Isaku Yamahata
>
>> -Bob
>>
>> > Basically, if concurrent API calls are sent to Neutron, all of them
>> > are sent to the plug-in level, where two actions have to be made:
>> >
>> > 1. DB transaction: not just for data persistence but also to collect
>> >    the information needed for the next action
>> > 2. Plug-in back-end implementation: in our case, a call to the Python
>> >    library that in turn calls the PLUMgrid REST GW (soon SAL)
>> >
>> > For instance:
>> >
>> > def create_port(self, context, port):
>> >     with context.session.begin(subtransactions=True):
>> >         # Plugin DB - Port Create and Return port
>> >         port_db = super(NeutronPluginPLUMgridV2,
>> >                         self).create_port(context, port)
>> >         device_id = port_db["device_id"]
>> >         if port_db["device_owner"] == "network:router_gateway":
>> >             router_db = self._get_router(context, device_id)
>> >         else:
>> >             router_db = None
>> >         try:
>> >             LOG.debug(_("PLUMgrid Library: create_port() called"))
>> >             # Back-end implementation
>> >             self._plumlib.create_port(port_db, router_db)
>> >         except Exception:
>> >
>> >
>> > The way we have implemented this at the plugin level in Havana (even
>> > in Grizzly) is that both actions are wrapped in the same
>> > "transaction", which automatically rolls back any operation to its
>> > original state, mostly protecting the DB from being left in an
>> > inconsistent state or with leftover data if the back-end part fails.
>> > The problem we are experiencing is that when concurrent calls to the
>> > same API are sent, the operations at the plug-in back-end take long
>> > enough to make the next concurrent API call get stuck at the DB
>> > transaction level, which creates a hung state for the Neutron Server
>> > to the point that all concurrent API calls fail.
>> >
>> > This can be fixed if we include some "locking" system, such as calling:
>> >
>> > from neutron.common import utils
>> >
>> >
>> > @utils.synchronized('any-name', external=True)
>> > def create_port(self, context, port):
>> >
>> >
>> > Obviously, this will serialize all concurrent calls, which will end
>> > up giving really bad performance. Does anyone have a better solution?
>> >
>> > Thanks,
>> >
>> > Edgar
>> >
>> >
>> > _______________________________________________
>> > OpenStack-dev mailing list
>> > OpenStack-dev@lists.openstack.org
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>
>--
>Isaku Yamahata <isaku.yamah...@gmail.com>
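
To make the pre/post split and the pending_{create, delete, update} idea from this thread a bit more concrete, here is a minimal sketch. It is not Neutron or ML2 code: PortStore, FakeBackend, BackendError and the status constants are hypothetical names, a plain dict stands in for the database, and a threading.Lock stands in for a short DB transaction. The point it tries to show is only that the slow back-end call runs outside the DB critical section, while a pending state keeps other requests away from the half-created port.

```python
import threading

# Hypothetical status values, modelled on the pending_* states
# suggested in the thread. They are not Neutron constants.
PENDING_CREATE = "PENDING_CREATE"
ACTIVE = "ACTIVE"


class BackendError(Exception):
    """Raised by the fake back-end to simulate a controller failure."""


class FakeBackend:
    """Stand-in for a controller client (assumption, not a real library)."""

    def __init__(self, fail=False):
        self.fail = fail
        self.ports = set()

    def create_port(self, port):
        if self.fail:
            raise BackendError("controller rejected port %s" % port["id"])
        self.ports.add(port["id"])


class PortStore:
    def __init__(self, backend):
        self._db = {}                     # in-memory stand-in for the DB
        self._db_lock = threading.Lock()  # models a short DB transaction
        self._backend = backend

    def create_port(self, port_id):
        # Phase 1 ("precommit"): short critical section, DB work only.
        with self._db_lock:
            if port_id in self._db:
                raise ValueError("port %s already exists" % port_id)
            port = {"id": port_id, "status": PENDING_CREATE}
            self._db[port_id] = port

        # Phase 2 ("postcommit"): slow back-end call, no DB lock held.
        try:
            self._backend.create_port(port)
        except BackendError:
            # Remove the partially created resource, then re-raise.
            with self._db_lock:
                self._db.pop(port_id, None)
            raise

        with self._db_lock:
            port["status"] = ACTIVE
            return dict(port)

    def update_port(self, port_id, **fields):
        with self._db_lock:
            port = self._db[port_id]
            # Refuse to touch a port whose create is still in flight.
            if port["status"] != ACTIVE:
                raise RuntimeError(
                    "port %s is busy (%s)" % (port_id, port["status"]))
            port.update(fields)
            return dict(port)

    def get_status(self, port_id):
        with self._db_lock:
            port = self._db.get(port_id)
            return port["status"] if port else None
```

With this shape, a second create_port() for a different port only waits for the short phase 1 section and not for the back-end round trip, and an update that arrives inside the race window is rejected instead of interleaving with the create. Whether rejecting, queueing or retrying is the right policy for a pending resource is a design choice that this sketch leaves open.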