Re: [openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
Thanks Kevin. Will take a look at the example.

Thanks,
Divya

On Tue, May 10, 2016 at 11:41 PM, Kevin Benton wrote:

> Unfortunately we didn't switch to the new sql driver until liberty so that
> probably wouldn't be a safe switch in Kilo.
>
> Adding a retry will help, but unfortunately that will still block your
> call for 60 seconds with that driver until the timeout exception is
> triggered.
> We worked around this in ML2 by identifying the calls that could yield
> while holding a DB lock and then acquiring a semaphore before doing each
> one.
> You can see an example here:
> https://github.com/openstack/neutron/blob/363eeb06104662ee38aeed04af043899379f6ab8/neutron/plugins/ml2/plugin.py#L1074

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
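The semaphore workaround described in the quoted message can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the ML2 code: `threading.Semaphore` stands in for whatever green-thread-aware lock a real eventlet-based service would use, and `_db_lock_sem`, `delete_port_serialized` and `fake_delete_port` are hypothetical names introduced here.

```python
import threading

# Hypothetical: one process-wide semaphore guarding the DB calls that were
# identified as able to yield while holding a DB lock.
_db_lock_sem = threading.Semaphore(1)

active = 0       # callers currently inside the guarded section
max_active = 0   # highest concurrency observed inside the guarded section


def delete_port_serialized(delete_port, port_id):
    """Run delete_port(port_id) with at most one caller inside at a time."""
    global active, max_active
    with _db_lock_sem:
        active += 1
        max_active = max(max_active, active)
        try:
            return delete_port(port_id)
        finally:
            active -= 1


def fake_delete_port(port_id):
    # Stand-in for the real plugin call; it only returns a marker string.
    return "deleted %s" % port_id


def worker(i, out):
    out.append(delete_port_serialized(fake_delete_port, "port-%d" % i))


results = []
threads = [threading.Thread(target=worker, args=(i, results)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The point of the design is that callers queue up on the semaphore in the application, where the wait is cheap and visible, instead of piling up on a row lock inside the database until the driver's timeout fires.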
Re: [openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
Unfortunately we didn't switch to the new SQL driver until Liberty, so that
probably wouldn't be a safe switch in Kilo.

Adding a retry will help, but unfortunately that will still block your call
for 60 seconds with that driver until the timeout exception is triggered.
We worked around this in ML2 by identifying the calls that could yield while
holding a DB lock and then acquiring a semaphore before doing each one.
You can see an example here:
https://github.com/openstack/neutron/blob/363eeb06104662ee38aeed04af043899379f6ab8/neutron/plugins/ml2/plugin.py#L1074

On Tue, May 10, 2016 at 11:27 PM, Divya wrote:

> Thanks Mike for the response. I am part of the Nuage OpenStack team. We are
> looking into the issue.
> An extra delete_port call in NuagePlugin's add_router_interface triggers
> the db lockout on the insert into routerport (this is in core neutron).
> Are you suggesting that NuagePlugin should retry in this case, or that
> core neutron's add_router_interface should retry?
> Will give it a try.
Re: [openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
Thanks Mike for the response. I am part of the Nuage OpenStack team. We are
looking into the issue.
An extra delete_port call in NuagePlugin's add_router_interface triggers
the db lockout on the insert into routerport (this is in core neutron).
Are you suggesting that NuagePlugin should retry in this case, or that
core neutron's add_router_interface should retry?
Will give it a try.

On Tue, May 10, 2016 at 4:54 PM, Mike Bayer wrote:

> I'm not on the Neutron team, but in general, OpenStack applications should
> be employing retry logic internally which anticipates database deadlocks
> like these and retries the operation. I'd report this stack trace
> (especially if it is reproducible) as a bug to this plugin's launchpad
> project.
Re: [openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
Thanks Kevin for the response.
Kevin, this is stable/kilo (the customer is still on stable/kilo). Is
pymysql supported in stable/kilo?

Thanks & Regards,
Divya

On Tue, May 10, 2016 at 10:36 PM, Kevin Benton wrote:

> In addition to what Mike said, "Lock wait timeout exceeded" sounds like an
> error from the C-based mysql driver that eventlet couldn't recognize
> yielding calls on. We have moved away from that upstream for quite some
> time now. Ensure your DB connection string starts with 'mysql+pymysql://'
> to use the pymysql one.
Re: [openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
In addition to what Mike said, "Lock wait timeout exceeded" sounds like an
error from the C-based mysql driver that eventlet couldn't recognize
yielding calls on. We moved away from that driver upstream quite some time
ago. Ensure your DB connection string starts with 'mysql+pymysql://' to use
the pymysql one.

On Tue, May 10, 2016 at 4:54 PM, Mike Bayer wrote:

> I'm not on the Neutron team, but in general, OpenStack applications should
> be employing retry logic internally which anticipates database deadlocks
> like these and retries the operation. I'd report this stack trace
> (especially if it is reproducible) as a bug to this plugin's launchpad
> project.
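As a quick way to check which driver a connection string selects, something like the following could be used. It is only a sketch: the helper name `uses_pymysql` is made up for illustration, and the host, credentials and database name in the example URLs are placeholders rather than values from this thread.

```python
from urllib.parse import urlsplit


def uses_pymysql(connection_url):
    """Return True if a SQLAlchemy-style URL selects the PyMySQL driver."""
    # The part before "://" is "<dialect>+<driver>"; a bare "mysql" leaves
    # the driver choice to SQLAlchemy's default for that dialect.
    return urlsplit(connection_url).scheme == "mysql+pymysql"


# Placeholder URLs, shown only to illustrate the two cases.
pymysql_url = "mysql+pymysql://neutron:secret@dbhost/neutron"
default_url = "mysql://neutron:secret@dbhost/neutron"

pymysql_selected = uses_pymysql(pymysql_url)
default_selected = uses_pymysql(default_url)
```

In a typical deployment the string being checked would be the `connection` option of the `[database]` section in neutron.conf.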
Re: [openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
On 05/10/2016 04:57 PM, Divya wrote:

> If I comment out delete_port in add_router_interface, I am not hitting
> the db lockout issue.
> Neither delete_port nor any of the other operations is within a
> transaction, so I am not sure why this leads to db lock timeouts on the
> insert into routerport.
>
> error trace
> http://paste.openstack.org/show/496626/
>
> Really appreciate any help on this.

I'm not on the Neutron team, but in general, OpenStack applications should
be employing retry logic internally which anticipates database deadlocks
like these and retries the operation. I'd report this stack trace
(especially if it is reproducible) as a bug to this plugin's launchpad
project.
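The kind of retry logic meant here can be sketched as a small decorator. OpenStack services normally get this from a library (oslo.db ships a `wrap_db_retry` decorator for the purpose); the version below is a standard-library-only stand-in, and `DBDeadlock`, `retry_on_deadlock` and the toy `add_router_interface` are hypothetical names used purely for illustration.

```python
import functools
import time


class DBDeadlock(Exception):
    """Hypothetical stand-in for a driver deadlock / lock wait timeout error."""


def retry_on_deadlock(max_retries=3, delay=0.0):
    """Re-run the wrapped operation when it raises DBDeadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise  # out of retries, surface the error
                    time.sleep(delay * (attempt + 1))  # linear backoff
        return wrapper
    return decorator


calls = []


@retry_on_deadlock(max_retries=3)
def add_router_interface(router_id):
    # Toy operation: fails twice with a simulated deadlock, then succeeds.
    calls.append(router_id)
    if len(calls) < 3:
        raise DBDeadlock("Lock wait timeout exceeded")
    return {"router_id": router_id, "attempts": len(calls)}


result = add_router_interface("r1")
```

For a retry to be safe, the wrapped operation has to start from a clean transaction on every attempt, so the decorator belongs on the outermost call that opens the transaction rather than on a helper deep inside it.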
Re: [openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
Are there any general guidelines to avoid these db lock timeout issues in
third-party neutron plugins?

Thanks,
Divya

On Tue, May 10, 2016 at 1:57 PM, Divya wrote:

> If I comment out delete_port in add_router_interface, I am not hitting
> the db lockout issue.
> Neither delete_port nor any of the other operations is within a
> transaction, so I am not sure why this leads to db lock timeouts on the
> insert into routerport.
>
> error trace
> http://paste.openstack.org/show/496626/
[openstack-dev] [neutron] Seeing db lockout issues in neutron add_router_interface
Hi,

I am trying to run this rally test on stable/kilo:
https://github.com/openstack/rally/blob/master/samples/tasks/scenarios/neutron/create_and_delete_routers.json

with concurrency 50 and iterations 2000.

This test basically creates routers and subnets and then calls
router-interface-add
router-interface-delete

I am running this against the 3rd party Nuage plugin.

In the NuagePlugin:

add_router_interface is something like this:

    super().add_router_interface
    try:
        some calls to external rest server
        super().delete_port
    except:

remove_router_interface:
---
    super().remove_router_interface
    some calls to external rest server
    super().create_port()
    some calls to external rest server

If I comment out delete_port in add_router_interface, I am not hitting
the db lockout issue.
Neither delete_port nor any of the other operations is within a
transaction, so I am not sure why this leads to db lock timeouts on the
insert into routerport.

error trace
http://paste.openstack.org/show/496626/

Really appreciate any help on this.

Thanks,
Divya