Hi Gary,

Nice work tracking this down.  I thought there would be an issue on the
nova-compute side? The part I'm confused about is how multiple ports are
getting created with the same device_id? I guess that even though an
exception is raised in quantum-server it still successfully creates the
port and then nova-compute some where retries and creates another port?

Thanks,

Aaron


On Tue, Jun 4, 2013 at 5:46 AM, Gary Kotton <[email protected]> wrote:

> Hi,
> Just to provide a little more information. The problem is a result of a
> timeout with the Quantum client (used by Nova) to interface with the
> Quantum service ([2]). There are 2 issues here:
> 1. Keystone is a bottleneck
> 2. At times there is an exception with the Quantum database (an example of
> a trace is below in [1])
> Thanks
> Gary
>
> 1.
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-**packages/quantum/api/v2/**resource.py",
> line 82, in resource
>     result = method(request=request, **args)
>   File "/usr/lib/python2.6/site-**packages/quantum/api/v2/base.**py",
> line 239, in index
>     return self._items(request, True, parent_id)
>   File "/usr/lib/python2.6/site-**packages/quantum/api/v2/base.**py",
> line 192, in _items
>     obj_list = obj_getter(request.context, **kwargs)
>   File "/usr/lib/python2.6/site-**packages/quantum/db/**securitygroups_db.py",
> line 136, in get_security_groups
>     page_reverse=page_reverse)
>   File "/usr/lib/python2.6/site-**packages/quantum/db/db_base_**plugin_v2.py",
> line 240, in _get_collection
>     items = [dict_func(c, fields) for c in query.all()]
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2115, in all
>     return list(self)
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2227, in __iter__
>     return self._execute_and_instances(**context)
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2240, in
> _execute_and_instances
>     close_with_result=True)
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**orm/query.py", line 2231, in
> _connection_from_session
>     **kw)
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**orm/session.py", line 730, in connection
>     close_with_result=close_with_**result)
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**orm/session.py", line 736, in
> _connection_for_bind
>     return engine.contextual_connect(****kwargs)
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**engine/base.py", line 2490, in
> contextual_connect
>     self.pool.connect(),
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**pool.py", line 224, in connect
>     return _ConnectionFairy(self).**checkout()
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**pool.py", line 387, in __init__
>     rec = self._connection_record = pool._do_get()
>   File "/usr/lib64/python2.6/site-**packages/SQLAlchemy-0.7.8-py2.**
> 6-linux-x86_64.egg/sqlalchemy/**pool.py", line 729, in _do_get
>     (self.size(), self.overflow(), self._timeout))
> TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection
> timed out, timeout 30
>
> 2.
> [root@dhcp-4-126 ~(keystone_admin)]# cd /var/log/nova/
> [root@dhcp-4-126 nova(keystone_admin)]# grep -r 
> 6cd39e4d-a805-4f05-83e8-**3a9ecc2117b1
> *
> api.log:2013-06-04 04:55:19.425 INFO nova.osapi_compute.wsgi.server
> [req-51d09365-8f74-4759-a29e-**7d06d9c22e10 ba5e489e00d64ae4834b535646ee80
> **df cb2207818a394626937f878e0538ac**7b] 10.35.4.126 "GET /v2/**
> cb2207818a394626937f878e0538ac**7b/servers/6cd39e4d-a805-4f05-**83e8-3a9ecc2117b1
> HTTP/1.1" status: 200 len: 1477 time: 28.6792312
> api.log:2013-06-04 04:57:36.743 INFO nova.osapi_compute.wsgi.server
> [req-924fee73-7883-4160-8d7d-**6ea77c3d34f3 ba5e489e00d64ae4834b535646ee80
> **df cb2207818a394626937f878e0538ac**7b] 10.35.4.126 "GET /v2/**
> cb2207818a394626937f878e0538ac**7b/servers/6cd39e4d-a805-4f05-**83e8-3a9ecc2117b1
> HTTP/1.1" status: 200 len: 1725 time: 3.2600341
> scheduler.log:2013-06-04 04:57:33.662 ERROR nova.scheduler.filter_**scheduler
> [req-899adfbb-059e-473d-9508-**7419127def73 ba5e489e00d64ae4834b535646ee80
> **df cb2207818a394626937f878e0538ac**7b] [instance:
> 6cd39e4d-a805-4f05-83e8-**3a9ecc2117b1] Error from last host:
> dhcp-4-227.tlv.redhat.com (node dhcp-4-227.tlv.redhat.com): [u'Traceback
> (most recent call last):\n', u'  File "/usr/lib/python2.6/site-**
> packages/nova/compute/manager.**py", line 831, in _run_instance\n
>  requested_networks, macs, security_groups)\n', u'  File
> "/usr/lib/python2.6/site-**packages/nova/compute/manager.**py", line
> 1075, in _allocate_network\n    instance=instance)\n', u'  File
> "/usr/lib64/python2.6/**contextlib.py", line 23, in __exit__\n
>  self.gen.next()\n', u'  File "/usr/lib/python2.6/site-**
> packages/nova/compute/manager.**py", line 1071, in _allocate_network\n
>  security_groups=security_**groups)\n', u'  File "/usr/lib/python2.6/site-
> **packages/nova/network/api.py", line 46, in wrapper\n    res = f(self,
> context, *args, **kwargs)\n', u'  File "/usr/lib/python2.6/site-**
> packages/nova/network/**quantumv2/api.py", line 281, in
> allocate_for_instance\n    \'exception\': ex})\n', u'  File
> "/usr/lib64/python2.6/**contextlib.py", line 23, in __exit__\n
>  self.gen.next()\n', u'  File "/usr/lib/python2.6/site-**
> packages/nova/network/**quantumv2/api.py", line 264, in
> allocate_for_instance\n    
> quantum.create_port(port_req_**body)[\'port\'][\'id\'])\n',
> u'  File "/usr/lib/python2.6/site-**packages/quantumclient/v2_0/**client.py",
> line 107, in with_params\n    ret = self.function(instance, *args,
> **kwargs)\n', u'  File "/usr/lib/python2.6/site-**
> packages/quantumclient/v2_0/**client.py", line 269, in create_port\n
>  return self.post(self.ports_path, body=body)\n', u'  File
> "/usr/lib/python2.6/site-**packages/quantumclient/v2_0/**client.py", line
> 987, in post\n    headers=headers, params=params)\n', u'  File
> "/usr/lib/python2.6/site-**packages/quantumclient/v2_0/**client.py", line
> 912, in do_request\n    self._handle_fault_response(**status_code,
> replybody)\n', u'  File "/usr/lib/python2.6/site-**
> packages/quantumclient/v2_0/**client.py", line 893, in
> _handle_fault_response\n    exception_handler_v20(status_**code,
> des_error_body)\n', u'  File "/usr/lib/python2.6/site-**
> packages/quantumclient/v2_0/**client.py", line 87, in
> exception_handler_v20\n    message=message)\n', u'QuantumClientException:
> Request Timeout\n']
> scheduler.log:2013-06-04 04:57:33.680 WARNING nova.scheduler.driver
> [req-899adfbb-059e-473d-9508-**7419127def73 ba5e489e00d64ae4834b535646ee80
> **df cb2207818a394626937f878e0538ac**7b] [instance:
> 6cd39e4d-a805-4f05-83e8-**3a9ecc2117b1] Setting instance to ERROR state.
>
>
> On 06/04/2013 12:28 PM, Gary Kotton wrote:
>
>> Hi,
>> Thanks to Aarons script I was able to reproduce and find the problem for
>> https://bugs.launchpad.net/**quantum/+bug/1160442<https://bugs.launchpad.net/quantum/+bug/1160442>.
>> I have found the problem and will push a patch soon.
>> Thanks
>> Gary
>>
>>
>>
>
> --
> Mailing list: 
> https://launchpad.net/~**quantum-core<https://launchpad.net/~quantum-core>
> Post to     : 
> [email protected].**net<[email protected]>
> Unsubscribe : 
> https://launchpad.net/~**quantum-core<https://launchpad.net/~quantum-core>
> More help   : 
> https://help.launchpad.net/**ListHelp<https://help.launchpad.net/ListHelp>
>
-- 
Mailing list: https://launchpad.net/~quantum-core
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~quantum-core
More help   : https://help.launchpad.net/ListHelp

Reply via email to