Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
On 09/25/14 17:55, Clint Byrum wrote:
> > Now I use Ceilometer's pipeline to forward events to elasticsearch via
> > udp + logstash and do not use Ceilometer's DB or API at all.
> Interesting, this almost sounds like what should be the default
> configuration honestly.

Ceilometer generates a lot of data in a real OpenStack deployment. The
problem is not only managing the I/O load, but also the long-term
implications of storing and searching through that amount of data. The
documentation does not cover this last point at all, last time I checked.
In fact it is very difficult to find anything about "what can I do with
Ceilometer data?".

I'm storing about one million messages per day generated by a 6-compute
cluster. Assuming each message contains one sample, MySQL is not the best
solution -- think about what happens if you need to store them for a year.
And as others have said, it is not a good relational database use case
anyway.

As for Elasticsearch, the setup is straightforward and works well, even
better since you can re-use the same infrastructure for system logs. It is
fast to deploy, reliable in the long run and easily scalable. And you can
start to have a good look at the data while you think of ways to use it.

You need to add some simple mappings, otherwise ES will try to be smart
about indexing fields (IDs get tokenized on '-' and will no longer match
exact searches). There are also the messages from neutron metering and
cinder that have a non-standard format for date fields that Elasticsearch
cannot parse without help (I opened bugs for those, and use logstash to
convert the fields to ISO format). Hopefully these will be fixed.

The application that uses Ceilometer's data through Elasticsearch has to
speak to OpenStack APIs anyway, to translate the IDs and to make sense of
Neutron metering labels.

Daniele

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
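[Editor's note: a minimal sketch of the kind of mapping Daniele describes,
in the Elasticsearch 1.x syntax current at the time. Index, type and field
names here are illustrative assumptions, not taken from his setup;
"not_analyzed" is what keeps ID fields from being tokenized on '-'.]

```json
{
  "mappings": {
    "ceilometer_sample": {
      "properties": {
        "resource_id": {"type": "string", "index": "not_analyzed"},
        "project_id":  {"type": "string", "index": "not_analyzed"},
        "user_id":     {"type": "string", "index": "not_analyzed"},
        "timestamp":   {"type": "date"}
      }
    }
  }
}
```

In practice this would go into a logstash index template so every daily
index picks it up automatically.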
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
On Thu, Sep 25, 2014 at 11:51:23AM -0400, gordon chung wrote:
> > mysql> select count(*) from metadata_text;
> > +----------+
> > | count(*) |
> > +----------+
> > | 25249913 |
> > +----------+
> > 1 row in set (3.83 sec)
> >
> > There were 25M records in one table. The deletion time is reaching an
> > unacceptable level (7 minutes for 400 records) and it was not increasing
> > in a linear way. Maybe DB experts can show me how to optimize this?
> we don't do any customisations in the default ceilometer package so i'm sure
> there's a way to optimise... not sure if any devops ppl read this list.
> > Another question: does the mongodb backend support events now?
> > (I asked this question in IRC, but, just as usual, no response from
> > anyone in that community, no matter whether it is a silly question or not...)
> regarding events, are you specifically asking about events
> (http://docs.openstack.org/developer/ceilometer/events.html) in ceilometer
> or using the events term in a generic sense? the table above has no relation
> to events in ceilometer, it's related to samples and the corresponding
> resource. we did do some remodelling of the sql backend this cycle which
> should shrink the size of the metadata tables.
> there's a euro-bias in ceilometer so you'll be more successful reaching
> people on irc during euro work hours... that said, you'll probably get the
> best response by posting to the list or pinging someone on the core team
> directly.
> cheers, gord

Thanks for the responses above. TBH, I am unaware of any performance
problems based on my previous experience using MongoDB as the backend. I
switched over to MySQL simply because only the SQLAlchemy backend supported
Ceilometer events. Sorry for the confusion -- the metadata table size
wasn't a direct result of using events, though it does seem like an
indirect result of switching to MySQL (not sure about this either).

I'll try Euro work hours in future. Thanks for the hints!
Cheers,
Qiming
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
Excerpts from Daniele Venzano's message of 2014-09-25 02:40:11 -0700:
> On 09/25/14 10:12, Qiming Teng wrote:
> > Yes, just about 3 VMs running on two hosts, for at most 3 weeks. This
> > is leading me to another question -- any best practices/tools to
> > retire the old data on a regular basis? Regards, Qiming
>
> There is a tool: ceilometer-expirer
>
> I tried to use it on a mysql database, since I had the same table size
> problem as you, and it made the machine hit swap. I think it tries to
> load the whole table in memory.
> Just to see if it would eventually finish, I let it run for 1 week
> before throwing away the whole database and moving on.
>
> Now I use Ceilometer's pipeline to forward events to elasticsearch via
> udp + logstash and do not use Ceilometer's DB or API at all.

Interesting, this almost sounds like what should be the default
configuration honestly.
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
> mysql> select count(*) from metadata_text;
> +----------+
> | count(*) |
> +----------+
> | 25249913 |
> +----------+
> 1 row in set (3.83 sec)
>
> There were 25M records in one table. The deletion time is reaching an
> unacceptable level (7 minutes for 400 records) and it was not increasing
> in a linear way. Maybe DB experts can show me how to optimize this?

we don't do any customisations in the default ceilometer package so i'm sure
there's a way to optimise... not sure if any devops ppl read this list.

> Another question: does the mongodb backend support events now?
> (I asked this question in IRC, but, just as usual, no response from
> anyone in that community, no matter whether it is a silly question or not...)

regarding events, are you specifically asking about events
(http://docs.openstack.org/developer/ceilometer/events.html) in ceilometer or
using the events term in a generic sense? the table above has no relation to
events in ceilometer, it's related to samples and the corresponding resource.
we did do some remodelling of the sql backend this cycle which should shrink
the size of the metadata tables.

there's a euro-bias in ceilometer so you'll be more successful reaching people
on irc during euro work hours... that said, you'll probably get the best
response by posting to the list or pinging someone on the core team directly.

cheers, gord
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
On Thu, Sep 25, 2014 at 11:40:11AM +0200, Daniele Venzano wrote:
> On 09/25/14 10:12, Qiming Teng wrote:
> > Yes, just about 3 VMs running on two hosts, for at most 3 weeks.
> > This is leading me to another question -- any best practices/tools
> > to retire the old data on a regular basis? Regards, Qiming
>
> There is a tool: ceilometer-expirer
>
> I tried to use it on a mysql database, since I had the same table
> size problem as you, and it made the machine hit swap. I think it
> tries to load the whole table in memory.
> Just to see if it would eventually finish, I let it run for 1 week
> before throwing away the whole database and moving on.
>
> Now I use Ceilometer's pipeline to forward events to elasticsearch
> via udp + logstash and do not use Ceilometer's DB or API at all.

Ah, that is something worth a try. Thanks.

Regards,
Qiming
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
On 09/25/14 10:12, Qiming Teng wrote:
> Yes, just about 3 VMs running on two hosts, for at most 3 weeks. This
> is leading me to another question -- any best practices/tools to
> retire the old data on a regular basis? Regards, Qiming

There is a tool: ceilometer-expirer

I tried to use it on a mysql database, since I had the same table size
problem as you, and it made the machine hit swap. I think it tries to
load the whole table in memory. Just to see if it would eventually
finish, I let it run for 1 week before throwing away the whole database
and moving on.

Now I use Ceilometer's pipeline to forward events to elasticsearch via
udp + logstash and do not use Ceilometer's DB or API at all.

Best,
Daniele
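[Editor's note: the pipeline Daniele describes can be sketched roughly as
below in Ceilometer's pipeline.yaml as it looked around the Juno cycle.
The hostname and port are placeholders; check the publisher syntax for
your release before relying on this.]

```yaml
sources:
    - name: meter_source
      interval: 600
      meters:
          - "*"
      sinks:
          - udp_sink
sinks:
    - name: udp_sink
      transformers:
      publishers:
          - udp://logstash.example.com:4952
```

On the other side, a logstash udp input listening on the same port would
parse the samples and ship them to Elasticsearch.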
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
On Wed, Sep 24, 2014 at 09:43:54PM -0700, Preston L. Bannister wrote:
> Sorry, I am jumping into this without enough context, but ...
>
> On Wed, Sep 24, 2014 at 8:37 PM, Qiming Teng wrote:
> > mysql> select count(*) from metadata_text;
> > +----------+
> > | count(*) |
> > +----------+
> > | 25249913 |
> > +----------+
> > 1 row in set (3.83 sec)
>
> There are problems where a simple sequential log file is superior to a
> database table. The above looks like a log ... a very large number of
> events, without an immediate customer. For sequential access, a simple
> file is *vastly* superior to a database table.
>
> If you are thinking about indexed access to the above as a table, think
> about the cost of adding items to the index, for that many items. The
> cost of building the index is not small. Running a map/reduce on
> sequential files might be faster.
>
> Again, I do not have enough context, but ... 25 million rows?

Yes, just about 3 VMs running on two hosts, for at most 3 weeks. This is
leading me to another question -- any best practices/tools to retire the
old data on a regular basis?

Regards,
Qiming
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
Qiming, yes - MongoDB, DB2, HBase and the SQL-based backends all support
the events feature now; this was merged, AFAIR, a month or two ago.

Cheers
Dina

On Thu, Sep 25, 2014 at 11:45 AM, Qiming Teng wrote:
> So MongoDB support for events is ready in tree?
>
> Regards,
> Qiming
>
> On Thu, Sep 25, 2014 at 10:26:08AM +0300, Igor Degtiarov wrote:
> > Hi, Qiming Teng.
> >
> > Now all backends support events. So you may use MongoDB instead of
> > MySQL, or if you like you may choose HBase.
> >
> > Cheers, Igor.

--
Best regards,
Dina Belova
Software Engineer
Mirantis Inc.
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
So MongoDB support for events is ready in tree?

Regards,
Qiming

On Thu, Sep 25, 2014 at 10:26:08AM +0300, Igor Degtiarov wrote:
> Hi, Qiming Teng.
>
> Now all backends support events. So you may use MongoDB instead of
> MySQL, or if you like you may choose HBase.
>
> Cheers, Igor.
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
Hi, Qiming Teng.

Now all backends support events. So you may use MongoDB instead of
MySQL, or if you like you may choose HBase.

Cheers, Igor.

On Thu, Sep 25, 2014 at 7:43 AM, Preston L. Bannister wrote:
> Sorry, I am jumping into this without enough context, but ...
>
> On Wed, Sep 24, 2014 at 8:37 PM, Qiming Teng wrote:
> > mysql> select count(*) from metadata_text;
> > +----------+
> > | count(*) |
> > +----------+
> > | 25249913 |
> > +----------+
> > 1 row in set (3.83 sec)
>
> There are problems where a simple sequential log file is superior to a
> database table. The above looks like a log ... a very large number of
> events, without an immediate customer. For sequential access, a simple
> file is *vastly* superior to a database table.
>
> If you are thinking about indexed access to the above as a table, think
> about the cost of adding items to the index, for that many items. The
> cost of building the index is not small. Running a map/reduce on
> sequential files might be faster.
>
> Again, I do not have enough context, but ... 25 million rows?
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
Sorry, I am jumping into this without enough context, but ...

On Wed, Sep 24, 2014 at 8:37 PM, Qiming Teng wrote:
> mysql> select count(*) from metadata_text;
> +----------+
> | count(*) |
> +----------+
> | 25249913 |
> +----------+
> 1 row in set (3.83 sec)

There are problems where a simple sequential log file is superior to a
database table. The above looks like a log ... a very large number of
events, without an immediate customer. For sequential access, a simple
file is *vastly* superior to a database table.

If you are thinking about indexed access to the above as a table, think
about the cost of adding items to the index, for that many items. The
cost of building the index is not small. Running a map/reduce on
sequential files might be faster.

Again, I do not have enough context, but ... 25 million rows?
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
Excerpts from Qiming Teng's message of 2014-09-24 20:37:39 -0700:
> Hi,
>
> Some weeks ago, I checked my then latest devstack install and I learned
> this: event support in Ceilometer is only available for the sqlalchemy
> backend; the mongodb backend was still under development. I have been
> using MySQL during the past weeks and now I think I'm trapped by a
> performance problem of MySQL.
>
> One or two Nova servers were launched and remained idle for about 10
> days. Now I'm seeing a lot of data accumulated in the db and I wanted
> to cleanse it manually. Here is what I got:
>
> mysql> select count(*) from metadata_text;
> +----------+
> | count(*) |
> +----------+
> | 25249913 |
> +----------+
> 1 row in set (3.83 sec)
>
> mysql> delete from metadata_text limit 1000;
> Query OK, 1000 rows affected (0.02 sec)
>
> mysql> delete from metadata_text limit 1;
> Query OK, 1 row affected (0.39 sec)
>
> mysql> delete from metadata_text limit 10;
> Query OK, 10 rows affected (2.31 sec)
>
> mysql> delete from metadata_text limit 100;
> Query OK, 100 rows affected (25.32 sec)
>
> mysql> delete from metadata_text limit 200;
> Query OK, 200 rows affected (1 min 16.17 sec)
>
> mysql> delete from metadata_text limit 400;
> Query OK, 400 rows affected (7 min 40.40 sec)
>
> There were 25M records in one table. The deletion time is reaching an
> unacceptable level (7 minutes for 400 records) and it was not increasing
> in a linear way. Maybe DB experts can show me how to optimize this?

Wow, you definitely do not want to be doing transactions like that on a
regular basis. It is just murder on performance and can be deadly for
things like replication. If you plan to purge the whole table, just use
TRUNCATE TABLE. This is not unique to Ceilometer, and is in fact as old
as databases really.
There is a fantastic tool for doing it as efficiently as possible though:

http://www.percona.com/doc/percona-toolkit/2.1/pt-archiver.html

It will try to order by the physical blocks in the table, and do small
chunks continuously to minimize the impact. You can also have it sleep
for an amount of time based on how long the last delete took, so that
you are responsive to server load and impact.

The approach pt-archiver uses should be built into any purge command,
but many of the purge commands I've encountered in OpenStack just throw
a massive delete at the DB and hope for the best. I have not looked at
Ceilometer's.

Please note this is mostly an operational question, and not a
development question, so I think this thread might want to move over to
the openstack@ mailing list.
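[Editor's note: the chunked-purge idea Clint describes can be sketched in
a few lines of Python. This is only an illustration of the technique --
small batches, a commit between each, an optional pause to stay
responsive to load -- not a substitute for pt-archiver, which also orders
deletes by physical block and adapts its sleep time. SQLite is used here
purely so the sketch is self-contained; against MySQL/InnoDB you would
issue `DELETE ... LIMIT n` directly.]

```python
import sqlite3
import time

def purge_in_chunks(conn, table, chunk=1000, pause=0.0):
    """Delete all rows from `table` in small batches, committing between
    batches so locks are held briefly instead of for one huge transaction."""
    total = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN "
            f"(SELECT rowid FROM {table} LIMIT ?)", (chunk,))
        conn.commit()
        if cur.rowcount == 0:   # nothing left to delete
            break
        total += cur.rowcount
        time.sleep(pause)       # throttle between batches if desired
    return total

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE metadata_text (id INTEGER, value TEXT)")
    conn.executemany("INSERT INTO metadata_text VALUES (?, ?)",
                     [(i, "x") for i in range(5000)])
    conn.commit()
    print(purge_in_chunks(conn, "metadata_text", chunk=1000))  # prints 5000
```

The key point is the commit inside the loop: each batch is its own short
transaction, which is exactly what the giant single DELETE in the quoted
session was not.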
Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question
On 25/09/14 15:37, Qiming Teng wrote:
> Hi,
>
> Some weeks ago, I checked my then latest devstack install and I learned
> this: event support in Ceilometer is only available for the sqlalchemy
> backend; the mongodb backend was still under development. I have been
> using MySQL during the past weeks and now I think I'm trapped by a
> performance problem of MySQL.
>
> One or two Nova servers were launched and remained idle for about 10
> days. Now I'm seeing a lot of data accumulated in the db and I wanted
> to cleanse it manually. Here is what I got:
>
> mysql> select count(*) from metadata_text;
> +----------+
> | count(*) |
> +----------+
> | 25249913 |
> +----------+
> 1 row in set (3.83 sec)
>
> mysql> delete from metadata_text limit 1000;
> Query OK, 1000 rows affected (0.02 sec)
>
> mysql> delete from metadata_text limit 1;
> Query OK, 1 row affected (0.39 sec)
>
> mysql> delete from metadata_text limit 10;
> Query OK, 10 rows affected (2.31 sec)
>
> mysql> delete from metadata_text limit 100;
> Query OK, 100 rows affected (25.32 sec)
>
> mysql> delete from metadata_text limit 200;
> Query OK, 200 rows affected (1 min 16.17 sec)
>
> mysql> delete from metadata_text limit 400;
> Query OK, 400 rows affected (7 min 40.40 sec)
>
> There were 25M records in one table. The deletion time is reaching an
> unacceptable level (7 minutes for 400 records) and it was not increasing
> in a linear way. Maybe DB experts can show me how to optimize this?

Writes over bigger datasets will take non-linear time once the (possibly
default?) configuration is outgrown. Assuming metadata_text is an InnoDB
table, take a look at:

- innodb_log_buffer_size
- innodb_log_file_size (warning: read the manual carefully before
  changing this)
- innodb_buffer_pool_size

Also, index maintenance can become a limiting factor. I'm not sure if
MySQL will use the sort buffer to help with this, but maybe try
increasing

- sort_buffer_size (just for the session doing the delete)

and see if it helps.

There are many (way too many) other parameters to tweak, but the above
ones are probably the best to start with.
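[Editor's note: to make the list above concrete, a my.cnf fragment might
look like the following. The values are illustrative placeholders, not
recommendations -- size them to the host's RAM, and note that on older
MySQL versions changing innodb_log_file_size requires a clean shutdown
and removal of the old log files first.]

```ini
[mysqld]
# Cache as much of the working set as RAM allows; often the bulk of a
# dedicated database host's memory.
innodb_buffer_pool_size = 4G

# Larger redo log buffer/files reduce flushing pressure during big deletes.
innodb_log_buffer_size  = 64M
innodb_log_file_size    = 512M
```

For the session doing the delete, something like
`SET SESSION sort_buffer_size = 8388608;` before the DELETE changes only
that connection, so the experiment is cheap to try and undo.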
Cheers,
Mark