Re: [openstack-dev] [Ceilometer]Collector's performance

2014-03-11 Thread Ildikó Váncsa
Hi Nadya,

You mentioned multiple DB backends in your mail. Which one did you use to 
perform these tests, or did you get the same/similar performance results with 
both?

Best Regards,
Ildiko

From: Nadya Privalova [mailto:nprival...@mirantis.com]
Sent: Tuesday, March 11, 2014 6:05 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [Ceilometer]Collector's performance

Hi team!
Last week we were working on a notification problem in Ceilometer while creating 
Tempest tests. The notification tests passed successfully on Postgres but 
failed on MySQL. This made us start investigating, and this email contains some 
results.
As it turned out, Tempest as it is acts as a kind of performance test for 
Ceilometer. It contains 2057 tests. Almost every test creates and deletes 
OpenStack resources: images, instances, volumes. E.g. during instance creation 
Nova sends 9 notifications. And all the tests run in parallel for about 40 
minutes.
In the ceilometer-collector logs we can find a very useful message:

2014-03-10 09:42:41.356 22845 DEBUG ceilometer.dispatcher.database 
[req-16ea95c5-6454-407a-9c64-94d5ef900c9e - - - - -] metering data 
storage.objects.outgoing.bytes for b7a490322e65422cb1129b13b49020e6 @ 
2014-03-10T09:34:31.090107:
(http://logs.openstack.org/36/64136/20/check/check-tempest-dsvm-full/e361520/logs/screen-ceilometer-collector.txt.gz#_2014-03-10_09_42_41_356)
So the collector starts to process_metering_data in the dispatcher only at 
9:42, but Nova sent the notification at 9:34. To see the whole picture, please 
take a look at picture [1]; it illustrates the time difference based on this 
message in the logs.
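
For anyone who wants to reproduce picture [1], below is a rough sketch of how 
this lag can be extracted from the collector log. The file name and the regexp 
are only assumptions based on the excerpt above, so please treat it as an 
illustration rather than the exact script I used.

# Rough sketch: for every "metering data ... @ <timestamp>" line, compute the
# gap between the sample's own timestamp and the moment the collector logged it.
import re
from datetime import datetime

LINE_RE = re.compile(
    r'^(?P<logged>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+).*'
    r'metering data (?P<counter>\S+) for \S+ @ (?P<sample>\S+?):?$')

def collector_lags(path='screen-ceilometer-collector.txt'):
    """Yield (logged_at, counter, lag_in_seconds) for each metering-data line."""
    with open(path) as log:
        for line in log:
            m = LINE_RE.match(line.rstrip())
            if not m:
                continue
            logged = datetime.strptime(m.group('logged'), '%Y-%m-%d %H:%M:%S.%f')
            sample = datetime.strptime(m.group('sample'), '%Y-%m-%dT%H:%M:%S.%f')
            yield logged, m.group('counter'), (logged - sample).total_seconds()

if __name__ == '__main__':
    for logged, counter, lag in collector_lags():
        print('%s %s lag=%.1fs' % (logged.isoformat(), counter, lag))
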
Besides that, I decided to look at the difference between when the RPC 
publisher sends a message and when the collector receives it. To create this 
plot I parsed lines like the one below from the anotification log:


2014-03-10 09:25:49.333 22833 DEBUG ceilometer.openstack.common.rpc.amqp [-] 
UNIQUE_ID is 683dd3f130534b9fbb5606aef862b83d.
(http://logs.openstack.org/36/64136/20/check/check-tempest-dsvm-full/e361520/logs/screen-ceilometer-anotification.txt.gz#_2014-03-10_09_25_49_333)

After that I found the corresponding id in the collector log:

2014-03-10 09:25:49.352 22845 DEBUG ceilometer.openstack.common.rpc.amqp [-] received 
{u'_context_domain': None, u'_context_request_id': u'req-0a5fafe6-e097-4f90-a68a-a91da1cff22c',
 u'args': {u'data': [..., u'message_id': u'f7ad63fc-a835-11e3-8223-bc764e205385', u'counter_type': u'gauge'}]},
 u'_context_read_only': False, u'_unique_id': u'683dd3f130534b9fbb5606aef862b83d',
 u'_context_user_identity': u'- - - - -', u'_context_instance_uuid': None,
 u'_context_show_deleted': False, u'_context_tenant': None, u'_context_auth_token': 'SANITIZED',
} _safe_log /opt/stack/new/ceilometer/ceilometer/openstack/common/rpc/common.py:280
(http://logs.openstack.org/36/64136/20/check/check-tempest-dsvm-full/e361520/logs/screen-ceilometer-collector.txt.gz#_2014-03-10_09_25_49_352)
So in the example above the time difference is only about 20 milliseconds. But 
it grows very quickly :( To see this, please take a look at picture [2].
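
In case somebody wants to reproduce picture [2], here is a rough sketch of how 
the two logs can be correlated by UNIQUE_ID. Again, the file names and regexps 
are only assumptions based on the excerpts above, so it is an illustration, not 
a polished tool.

# Rough sketch: match each UNIQUE_ID logged by the notification agent with the
# corresponding "received" line in the collector log and report the delay.
import re
from datetime import datetime

TS_FMT = '%Y-%m-%d %H:%M:%S.%f'
SENT_RE = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+).*UNIQUE_ID is ([0-9a-f]{32})')
RECV_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+).*received .*_unique_id': u?'([0-9a-f]{32})'")

def first_timestamps(path, pattern):
    """Map unique_id -> timestamp of the first line matching the pattern."""
    result = {}
    with open(path) as log:
        for line in log:
            m = pattern.match(line)
            if m and m.group(2) not in result:
                result[m.group(2)] = datetime.strptime(m.group(1), TS_FMT)
    return result

def publish_to_receive_delays(publisher_log, collector_log):
    sent = first_timestamps(publisher_log, SENT_RE)
    received = first_timestamps(collector_log, RECV_RE)
    for uid, t_sent in sorted(sent.items(), key=lambda kv: kv[1]):
        if uid in received:
            yield t_sent, (received[uid] - t_sent).total_seconds()

if __name__ == '__main__':
    for t_sent, delay in publish_to_receive_delays(
            'screen-ceilometer-anotification.txt',
            'screen-ceilometer-collector.txt'):
        print('%s delay=%.3fs' % (t_sent.isoformat(), delay))
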
To summarize the pictures:
1. Picture 1: Y axis: number of seconds between Nova creating the notification 
and the collector retrieving the message. X axis: timestamp.
2. Picture 2: Y axis: number of seconds between the publisher publishing the 
message and the collector retrieving the message. X axis: timestamp.
These pictures are almost the same, which makes me think that the collector 
cannot keep up with a large number of messages. What do you think about it? Do 
you agree, or do you need more evidence, e.g. the number of messages in 
RabbitMQ or anything else?
Let's discuss this under the [Ceilometer] topic first; I will create a new 
thread about the testing strategy in Tempest later. In these circumstances we 
are forced to give up the newly created notification tests, and we cannot 
reduce the polling interval because that would make everything even worse.

[1]: http://postimg.org/image/r4501bdyb/
[2]: http://postimg.org/image/yy5a1ste1/

Thanks for your attention,
Nadya


Re: [openstack-dev] [Ceilometer]Collector's performance

2014-03-11 Thread Nadya Privalova
Ildiko,

Thanks for the question, I forgot to write about it. The results are for MySQL; 
the link to the logs is
http://logs.openstack.org/36/64136/20/check/check-tempest-dsvm-full/e361520/.
But I guess the Postgres results look the same, because it failed during the 
last test run (https://review.openstack.org/#/c/64136/). Will check tomorrow
anyway.

Nadya




Re: [openstack-dev] [Ceilometer]Collector's performance

2014-03-11 Thread Gordon Chung
i did notice the collector service was only ever writing one db connection 
at a time. i've opened a bug for that here: 
https://bugs.launchpad.net/ceilometer/+bug/1291054
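
to make the point concrete, below is a generic sketch (plain python, not 
ceilometer code; store_sample() and the connection objects are made up) of the 
difference between pushing every sample through a single connection and 
spreading the work over a small pool of connections:

# illustration only: serial writes vs. one worker per db connection
from concurrent.futures import ThreadPoolExecutor

def store_sample(connection, sample):
    # made-up stand-in for whatever call writes one sample to the metering db
    connection.execute('INSERT INTO sample VALUES (?)', (sample,))

def write_serially(connection, samples):
    # one connection, one sample at a time: throughput is bounded by the
    # round-trip latency of each individual insert
    for sample in samples:
        store_sample(connection, sample)

def write_with_pool(connections, samples):
    # one worker per connection, each draining its own share of the samples,
    # so inserts overlap instead of queueing behind one another
    def drain(conn, chunk):
        for sample in chunk:
            store_sample(conn, sample)
    with ThreadPoolExecutor(max_workers=len(connections)) as pool:
        for i, conn in enumerate(connections):
            pool.submit(drain, conn, samples[i::len(connections)])
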

i am curious as to why postgresql passes but not mysql? is postgres 
actually faster or are its default configurations set up better?

cheers,
gordon chung
openstack, ibm software standards