Re: [Openstack] [grizzly]Problems of qpid as rpcbackend

2013-05-30 Thread minmin ren
Hi all,

I think this is a bug in the qpid RPC backend.

   Other services (nova-compute, cinder-scheduler, etc.) run the service in an
eventlet thread and stop it with the thread's kill() method. The final
rpc.cleanup() step effectively does nothing, because the consumer connection
lives in that thread and has already been killed with it. I think that step is
unnecessary anyway: all of the queues are auto-delete, so they are removed once
every receiver disappears.

However, cinder-volume runs the service in a separate process, so stopping the
service has to close the connection, and every receiver (consumer) on the
connection's session is closed when connection.close() is called. Closing a
receiver sends MessageCancel and QueueDelete messages to the broker (the qpid
server), so all of the cinder-volume queues get removed.
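
As a rough illustration with the plain python-qpid messaging API (the queue
name, broker URL, and address options below are made-up examples, only roughly
in the style impl_qpid.py uses):

from qpid.messaging import Connection

# Illustration only: an auto-delete queue disappears once its last receiver
# goes away, whether that happens through receiver.close() or through
# connection.close() closing the session's receivers.
conn = Connection("localhost:5672")   # example broker URL
conn.open()
session = conn.session()

# Create the queue on demand and mark it auto-delete.
addr = 'demo-queue; {create: always, node: {x-declare: {auto-delete: True}}}'
receiver = session.receiver(addr)

# While this receiver is open, "demo-queue" exists on qpidd.
receiver.close()   # the broker sees the consumer go away and drops the queue
conn.close()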

I think that is the reason for the problem that confused me.

But I don't know how to fix it.




Re: [Openstack] [grizzly]Problems of qpid as rpcbackend

2013-05-30 Thread Ray Pekowski
I am not familiar with impl_qpid.py, but I am familiar with amqp.py and have
had problems around rpc_amqp.cleanup() and the Pool.empty() method it calls.
It was a totally different problem, but I decided to take a look at yours.
I noticed that in impl_qpid.py the only other place a connection.close() is
done is surrounded by this code:

# Close the session if necessary
if self.connection.opened():
    try:
        self.connection.close()
    except qpid_exceptions.ConnectionError:
        pass

I suggest you wrap the close at line 386 of impl_qpid.py with the same code
and your problem will be fixed.  Here is the line identified from your call
stack:

  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 386, in close
    self.connection.close()
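
Concretely, the close() method that line 386 lives in would end up looking
something like the sketch below (only a sketch: keep whatever else the real
method does, and qpid_exceptions here stands for the qpid.messaging.exceptions
module that impl_qpid.py already has available):

import qpid.messaging.exceptions as qpid_exceptions

def close(self):
    # Sketch of the patched method: only the guarded connection shutdown is
    # shown here. Skip the close if the connection is already gone, and
    # swallow the ConnectionError that qpid may raise during shutdown.
    if self.connection.opened():
        try:
            self.connection.close()
        except qpid_exceptions.ConnectionError:
            pass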

If that works, open a bug report.

Good catch

Ray


Re: [Openstack] [grizzly]Problems of qpid as rpcbackend

2013-05-30 Thread minmin ren
Hi Ray,
Thanks for your reply.
Adding the try/except around line 386 only fixes the exception raised on stop
by cinder-scheduler (and by nova-compute and similar services, which stop the
same way).
However, all of the cinder-volume queues are still removed when one of several
cinder-volume services stops. That is a separate problem.
I used the pdb module to trace the stop path of the two services
(cinder-scheduler and cinder-volume). The two implementations stop the service
differently:
When cinder-scheduler catches the stop signal it calls _launcher.stop()
(cinder/service.py line 612). _launcher.stop() kills every service thread that
runs service.start() and service.wait(). After the threads are killed,
connection.session.receivers is [], which means all consumers have been
released; I'm not sure whether the connection itself was closed or not.
I also found that the kill() method of the Service class is never called.

cinder-volume launches two processes: the service runs in a child process
(service.py line 227) while the parent process watches the child's status.
When the parent catches the stop signal, it sends the stop signal to the child
process, and the child catches it and calls service.stop() (service.py line 239).
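
A rough sketch of that parent/child pattern (not the actual cinder code; the
details are simplified):

import os
import signal
import sys
import time

def child_main():
    def on_term(signum, frame):
        # stand-in for service.stop(): this is where cinder-volume ends up
        # closing the RPC connection, and with it the session's receivers
        sys.exit(0)
    signal.signal(signal.SIGTERM, on_term)
    while True:
        time.sleep(1)            # stand-in for service.wait()

pid = os.fork()
if pid == 0:                     # child: run the service
    child_main()
else:                            # parent: watch the child and relay signals
    signal.signal(signal.SIGTERM,
                  lambda signum, frame: os.kill(pid, signal.SIGTERM))
    os.waitpid(pid, 0)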

Then I used pdb to trace the stop steps and found that
connection.session.receivers is not [] but contains three receivers
(cinder-volume, cinder-volume.node1, cinder-volume_fanout).
When the session is closed, qpid removes its receivers and sends MessageCancel
and QueueDelete to qpidd.
I think the QueueDelete tells qpidd to delete all of the cinder-volume queues.
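
For reference, the check I did under pdb was essentially the following (conn
stands for the impl_qpid Connection wrapper, which keeps its qpid session in
conn.session; the attribute names are assumed from the grizzly code as I read
it):

import logging

LOG = logging.getLogger(__name__)

def dump_open_receivers(conn):
    # Log the receivers still attached to the wrapper's qpid session.
    # Each receiver corresponds to one broker-side subscription, e.g.
    # cinder-volume, cinder-volume.node1, cinder-volume_fanout.
    for receiver in conn.session.receivers:
        LOG.debug('open receiver: %s', receiver.source)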



[Openstack] [grizzly]Problems of qpid as rpcbackend

2013-05-28 Thread minmin ren
I think I found some problems with qpid as the RPC backend, but I'm not sure
about it. Could anyone try to reproduce it in your environment?

OpenStack Grizzly version.

The config file needs debug=True.

1. service openstack-cinder-scheduler stop (likewise for nova-compute,
nova-scheduler, etc.)
2. Open /var/log/cinder/scheduler.log; you will find output like the following.

I deployed two machines (node1 and dev202).

2013-05-27 06:02:46 CRITICAL [cinder] need more than 0 values to unpack
Traceback (most recent call last):
  File "/usr/bin/cinder-scheduler", line 50, in <module>
    service.wait()
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 613, in wait
    rpc.cleanup()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/__init__.py", line 240, in cleanup
    return _get_impl().cleanup()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 649, in cleanup
    return rpc_amqp.cleanup(Connection.pool)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/amqp.py", line 671, in cleanup
    connection_pool.empty()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/amqp.py", line 80, in empty
    self.get().close()
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/impl_qpid.py", line 386, in close
    self.connection.close()
  File "<string>", line 6, in close
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 316, in close
    ssn.close(timeout=timeout)
  File "<string>", line 6, in close
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 749, in close
    if not self._ewait(lambda: self.closed, timeout=timeout):
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 566, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 208, in _ewait
    result = self._wait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 193, in _wait
    return self._waiter.wait(predicate, timeout=timeout)
  File "/usr/lib/python2.6/site-packages/qpid/concurrency.py", line 57, in wait
    self.condition.wait(3)
  File "/usr/lib/python2.6/site-packages/qpid/concurrency.py", line 96, in wait
    sw.wait(timeout)
  File "/usr/lib/python2.6/site-packages/qpid/compat.py", line 53, in wait
    ready, _, _ = select([self], [], [], timeout)
ValueError: need more than 0 values to unpack


I posted the multi-cinder-volume problem on Launchpad:
https://answers.launchpad.net/cinder/+question/229456
I ran into that problem only with cinder-volume; the other services never show
it. Then I noticed that the other services print critical info like the above
in their logs, with the error at self.connection.close(). As an experiment I
deleted self.connection.close() (which should not really be removed), watched
the qpid queue information, and the multi-cinder-volume problem that confused
me disappeared.
As a result, I think the problem I found may be a bug.
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp