Hi Gary, Nice work tracking this down. I thought there would be an issue on the nova-compute side? The part I'm confused about is how multiple ports are getting created with the same device_id? I guess that even though an exception is raised in quantum-server it still successfully creates the port and then nova-compute some where retries and creates another port?
Thanks, Aaron On Tue, Jun 4, 2013 at 5:46 AM, Gary Kotton <[email protected]> wrote: > Hi, > Just to provide a little more information. The problem is a result of a > timeout with the Quantum client (used by Nova) to interface with the > Quantum service ([2]). There are 2 issues here: > 1. Keystone is a bottleneck > 2. At times there is an exception with the Quantum database (an example of > a trace is below in [1]) > Thanks > Gary > > 1. > Traceback (most recent call last): > File "/usr/lib/python2.6/site-**packages/quantum/api/v2/**resource.py", > line 82, in resource > result = method(request=request, **args) > File "/usr/lib/python2.6/site-**packages/quantum/api/v2/base.**py", > line 239, in index > return self._items(request, True, parent_id) > File "/usr/lib/python2.6/site-**packages/quantum/api/v2/base.**py", > line 192, in _items > obj_list = obj_getter(request.context, **kwargs) > File "/usr/lib/python2.6/site-**packages/quantum/db/**securitygroups_db.py", > line 136, in get_security_groups > page_reverse=page_reverse) > File "/usr/lib/python2.6/site-**packages/quantum/db/db_base_**plugin_v2.py", > line 240, in _get_collection > items = [dict_func(c, fields) for c in query.all()] > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2115, in all > return list(self) > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2227, in __iter__ > return self._execute_and_instances(**context) > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2240, in > _execute_and_instances > close_with_result=True) > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2231, in > _connection_from_session > **kw) > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**orm/session.py", line 730, in connection > close_with_result=close_with_**result) > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**orm/session.py", line 736, in > _connection_for_bind > return engine.contextual_connect(****kwargs) > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**engine/base.py", line 2490, in > contextual_connect > self.pool.connect(), > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**pool.py", line 224, in connect > return _ConnectionFairy(self).**checkout() > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**pool.py", line 387, in __init__ > rec = self._connection_record = pool._do_get() > File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.** > 6-linux-x86_64.egg/sqlalchemy/**pool.py", line 729, in _do_get > (self.size(), self.overflow(), self._timeout)) > TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection > timed out, timeout 30 > > 2. > [root@dhcp-4-126 ~(keystone_admin)]# cd /var/log/nova/ > [root@dhcp-4-126 nova(keystone_admin)]# grep -r > 6cd39e4d-a805-4f05-83e8-**3a9ecc2117b1 > * > api.log:2013-06-04 04:55:19.425 INFO nova.osapi_compute.wsgi.server > [req-51d09365-8f74-4759-a29e-**7d06d9c22e10 ba5e489e00d64ae4834b535646ee80 > **df cb2207818a394626937f878e0538ac**7b] 10.35.4.126 "GET /v2/** > cb2207818a394626937f878e0538ac**7b/servers/6cd39e4d-a805-4f05-**83e8-3a9ecc2117b1 > HTTP/1.1" status: 200 len: 1477 time: 28.6792312 > api.log:2013-06-04 04:57:36.743 INFO nova.osapi_compute.wsgi.server > [req-924fee73-7883-4160-8d7d-**6ea77c3d34f3 ba5e489e00d64ae4834b535646ee80 > **df cb2207818a394626937f878e0538ac**7b] 10.35.4.126 "GET /v2/** > cb2207818a394626937f878e0538ac**7b/servers/6cd39e4d-a805-4f05-**83e8-3a9ecc2117b1 > HTTP/1.1" status: 200 len: 1725 time: 3.2600341 > scheduler.log:2013-06-04 04:57:33.662 ERROR nova.scheduler.filter_**scheduler > [req-899adfbb-059e-473d-9508-**7419127def73 ba5e489e00d64ae4834b535646ee80 > **df cb2207818a394626937f878e0538ac**7b] [instance: > 6cd39e4d-a805-4f05-83e8-**3a9ecc2117b1] Error from last host: > dhcp-4-227.tlv.redhat.com (node dhcp-4-227.tlv.redhat.com): [u'Traceback > (most recent call last):\n', u' File "/usr/lib/python2.6/site-** > packages/nova/compute/manager.**py", line 831, in _run_instance\n > requested_networks, macs, security_groups)\n', u' File > "/usr/lib/python2.6/site-**packages/nova/compute/manager.**py", line > 1075, in _allocate_network\n instance=instance)\n', u' File > "/usr/lib64/python2.6/**contextlib.py", line 23, in __exit__\n > self.gen.next()\n', u' File "/usr/lib/python2.6/site-** > packages/nova/compute/manager.**py", line 1071, in _allocate_network\n > security_groups=security_**groups)\n', u' File "/usr/lib/python2.6/site- > **packages/nova/network/api.py", line 46, in wrapper\n res = f(self, > context, *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-** > packages/nova/network/**quantumv2/api.py", line 281, in > allocate_for_instance\n \'exception\': ex})\n', u' File > "/usr/lib64/python2.6/**contextlib.py", line 23, in __exit__\n > self.gen.next()\n', u' File "/usr/lib/python2.6/site-** > packages/nova/network/**quantumv2/api.py", line 264, in > allocate_for_instance\n > quantum.create_port(port_req_**body)[\'port\'][\'id\'])\n', > u' File "/usr/lib/python2.6/site-**packages/quantumclient/v2_0/**client.py", > line 107, in with_params\n ret = self.function(instance, *args, > **kwargs)\n', u' File "/usr/lib/python2.6/site-** > packages/quantumclient/v2_0/**client.py", line 269, in create_port\n > return self.post(self.ports_path, body=body)\n', u' File > "/usr/lib/python2.6/site-**packages/quantumclient/v2_0/**client.py", line > 987, in post\n headers=headers, params=params)\n', u' File > "/usr/lib/python2.6/site-**packages/quantumclient/v2_0/**client.py", line > 912, in do_request\n self._handle_fault_response(**status_code, > replybody)\n', u' File "/usr/lib/python2.6/site-** > packages/quantumclient/v2_0/**client.py", line 893, in > _handle_fault_response\n exception_handler_v20(status_**code, > des_error_body)\n', u' File "/usr/lib/python2.6/site-** > packages/quantumclient/v2_0/**client.py", line 87, in > exception_handler_v20\n message=message)\n', u'QuantumClientException: > Request Timeout\n'] > scheduler.log:2013-06-04 04:57:33.680 WARNING nova.scheduler.driver > [req-899adfbb-059e-473d-9508-**7419127def73 ba5e489e00d64ae4834b535646ee80 > **df cb2207818a394626937f878e0538ac**7b] [instance: > 6cd39e4d-a805-4f05-83e8-**3a9ecc2117b1] Setting instance to ERROR state. > > > On 06/04/2013 12:28 PM, Gary Kotton wrote: > >> Hi, >> Thanks to Aarons script I was able to reproduce and find the problem for >> https://bugs.launchpad.net/**quantum/+bug/1160442<https://bugs.launchpad.net/quantum/+bug/1160442>. >> I have found the problem and will push a patch soon. >> Thanks >> Gary >> >> >> > > -- > Mailing list: > https://launchpad.net/~**quantum-core<https://launchpad.net/~quantum-core> > Post to : > [email protected].**net<[email protected]> > Unsubscribe : > https://launchpad.net/~**quantum-core<https://launchpad.net/~quantum-core> > More help : > https://help.launchpad.net/**ListHelp<https://help.launchpad.net/ListHelp> >
-- Mailing list: https://launchpad.net/~quantum-core Post to : [email protected] Unsubscribe : https://launchpad.net/~quantum-core More help : https://help.launchpad.net/ListHelp

