Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-26 Thread Daniele Venzano

On 09/25/14 17:55, Clint Byrum wrote:
Now I use Ceilometer's pipeline to forward events to elasticsearch via 
udp + logstash and do not use Ceilometer's DB or API at all.

Interesting, this almost sounds like what should be the default
configuration honestly.

Ceilometer generates a lot of data in a real OpenStack deployment. The 
problem is not only managing the I/O load, but also the long-term 
implications of storing and searching through that amount of data. The 
documentation does not cover this last point at all, last time I 
checked. Actually, it is very difficult to find anything about "what can 
I do with Ceilometer data?".


I'm storing about one million messages per day generated by a six-node 
compute cluster. Assuming each message contains one sample... MySQL is not 
the best solution. Think about what happens if you need to store them for 
one year. And as others have said, it is not a good relational-database 
use case anyway.


As for Elasticsearch, the setup is straightforward and works well, even 
better since you can re-use the same infrastructure for system logs. It 
is fast to deploy, reliable in the long run and easily scalable. And you 
can start to have a good look at the data while you think of ways to use it.


You need to add some simple mappings, otherwise ES will try to be smart 
when indexing fields (IDs get tokenized on '-' and will no longer match 
exact queries).
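For concreteness, here is a sketch of such a mapping, built as a Python dict. The syntax assumes the Elasticsearch 1.x era (string fields with "index": "not_analyzed"), and the field and type names (resource_id, message_id, "ceilometer") are illustrative, not taken from the thread; adjust them to the fields your events actually carry.

```python
import json

# Mapping sketch: mark ID-like string fields as not_analyzed so they are
# stored verbatim instead of being tokenized on '-'.
mapping = {
    "mappings": {
        "ceilometer": {
            "properties": {
                "resource_id": {"type": "string", "index": "not_analyzed"},
                "message_id": {"type": "string", "index": "not_analyzed"},
            }
        }
    }
}

# Serialized form, ready to send to the ES HTTP API when creating the index.
body = json.dumps(mapping, indent=2)
print(body)
```

You would PUT this at index-creation time (e.g. with curl against the ES HTTP API), before logstash starts writing into the index.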


There are also the messages from neutron metering and cinder, which have a 
non-standard format for their date fields that Elasticsearch cannot parse 
without help (I opened bugs for those, and use logstash to convert the 
fields to ISO format). Hopefully these will be fixed.
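As an illustration, a logstash filter along these lines can normalize such a date field before it reaches Elasticsearch. The field name "timestamp" and the second match pattern are assumptions; match them to what the offending samples actually contain.

```
filter {
  date {
    # Try ISO8601 first, then a non-standard pattern like the ones
    # neutron metering / cinder emit (pattern here is illustrative).
    match  => [ "timestamp", "ISO8601", "yyyy-MM-dd HH:mm:ss.SSSSSS" ]
    target => "@timestamp"
  }
}
```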


The application that uses Ceilometer's data through Elasticsearch has to 
speak to OpenStack APIs anyway, to translate the IDs and to make sense 
of Neutron metering labels.


Daniele

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Igor Degtiarov
Hi, Qiming Teng.

Now all backends support events. So you may use MongoDB instead of
MySQL, or if you like you may choose HBase.

Cheers, Igor.
-- Igor


On Thu, Sep 25, 2014 at 7:43 AM, Preston L. Bannister
pres...@bannister.us wrote:
 Sorry, I am jumping into this without enough context, but ...


 On Wed, Sep 24, 2014 at 8:37 PM, Qiming Teng teng...@linux.vnet.ibm.com
 wrote:

 mysql> select count(*) from metadata_text;
 +----------+
 | count(*) |
 +----------+
 | 25249913 |
 +----------+
 1 row in set (3.83 sec)



 There are problems where a simple sequential log file is superior to a
 database table. The above looks like a log ... a very large number of
 events, without an immediate customer. For sequential access, a simple file
 is *vastly* superior to a database table.

 If you are thinking about indexed access to the above as a table, think
 about the cost of adding items to the index, for that many items. The cost
 of building the index is not small. Running a map/reduce on sequential files
 might be faster.

 Again, I do not have enough context, but ... 25 million rows?






Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Qiming Teng
So MongoDB support for events is ready in-tree?

Regards,
  Qiming

On Thu, Sep 25, 2014 at 10:26:08AM +0300, Igor Degtiarov wrote:
 Hi, Qiming Teng.
 
 Now all backends support events. So you may use MongoDB instead of
 MySQL, or if you like you may choose HBase.
 
 Cheers, Igor.
 -- Igor
 
 




Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Dina Belova
Qiming, yes - for MongoDB, DB2, HBase and the SQL-based backends, the
events feature is now supported; this was merged, AFAIR, a month or two
ago.

Cheers
Dina

On Thu, Sep 25, 2014 at 11:45 AM, Qiming Teng teng...@linux.vnet.ibm.com
wrote:

 So MongoDB support to events is ready in tree?

 Regards,
   Qiming

 On Thu, Sep 25, 2014 at 10:26:08AM +0300, Igor Degtiarov wrote:
  Hi, Qiming Teng.
 
  Now all backends support events. So you may use MongoDB instead of
  MySQL, or if you like you may choose HBase.
 
  Cheers, Igor.
  -- Igor
 







-- 

Best regards,

Dina Belova

Software Engineer

Mirantis Inc.


Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Qiming Teng
On Wed, Sep 24, 2014 at 09:43:54PM -0700, Preston L. Bannister wrote:
 Sorry, I am jumping into this without enough context, but ...
 
 
 On Wed, Sep 24, 2014 at 8:37 PM, Qiming Teng teng...@linux.vnet.ibm.com
 wrote:
 
  mysql> select count(*) from metadata_text;
  +----------+
  | count(*) |
  +----------+
  | 25249913 |
  +----------+
  1 row in set (3.83 sec)
 
 
 
 There are problems where a simple sequential log file is superior to a
 database table. The above looks like a log ... a very large number of
 events, without an immediate customer. For sequential access, a simple file
 is *vastly* superior to a database table.
 
 If you are thinking about indexed access to the above as a table, think
 about the cost of adding items to the index, for that many items. The cost
 of building the index is not small. Running a map/reduce on sequential
 files might be faster.
 
 Again, I do not have enough context, but ... 25 million rows?

Yes, just about 3 VMs running on two hosts, for at most 3 weeks.  This
is leading me to another question -- any best practices/tools to retire
the old data on a regular basis?

Regards,
  Qiming




Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Daniele Venzano

On 09/25/14 10:12, Qiming Teng wrote:
Yes, just about 3 VMs running on two hosts, for at most 3 weeks. This 
is leading me to another question -- any best practices/tools to 
retire the old data on a regular basis? Regards, Qiming


There is a tool: ceilometer-expirer

I tried to use it on a MySQL database, since I had the same table-size 
problem as you, and it made the machine hit swap. I think it tries to 
load the whole table into memory.
Just to see if it would eventually finish, I let it run for a week 
before throwing away the whole database and moving on.


Now I use Ceilometer's pipeline to forward events to elasticsearch via 
udp + logstash and do not use Ceilometer's DB or API at all.



Best,
Daniele



Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Qiming Teng
On Thu, Sep 25, 2014 at 11:40:11AM +0200, Daniele Venzano wrote:
 On 09/25/14 10:12, Qiming Teng wrote:
 Yes, just about 3 VMs running on two hosts, for at most 3 weeks.
 This is leading me to another question -- any best practices/tools
 to retire the old data on a regular basis? Regards, Qiming
 
 There is a tool: ceilometer-expirer
 
 I tried to use it on a mysql database, since I had the same table
 size problem as you and it made the machine hit swap. I think it
 tries to load the whole table in memory.
 Just to see if it would eventually finish, I let it run for 1 week
 before throwing away the whole database and move on.
 
 Now I use Ceilometer's pipeline to forward events to elasticsearch
 via udp + logstash and do not use Ceilometer's DB or API at all.

Ah, that is something worth a try.  Thanks.

Regards,
 Qiming
 
 Best,
 Daniele
 
 




Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread gordon chung
 mysql> select count(*) from metadata_text;
 +----------+
 | count(*) |
 +----------+
 | 25249913 |
 +----------+
 1 row in set (3.83 sec)
 There were 25M records in one table.  The deletion time is reaching an
 unacceptable level (7 minutes for 4M records) and it was not increasing
 in a linear way.  Maybe DB experts can show me how to optimize this?
we don't do any customisations in the default ceilometer package so i'm sure 
there's a way to optimise... not sure if any devops ppl read this list. 
 Another question: does the mongodb backend support events now?
 (I asked this question in IRC, but, just as usual, got no response from
 anyone in that community, silly question or not...)
regarding events, are you specifically asking about events 
(http://docs.openstack.org/developer/ceilometer/events.html) in ceilometer or 
using the term in a generic sense? the table above has no relation to events 
in ceilometer; it's related to samples and their corresponding resources.  we 
did do some remodelling of the sql backend this cycle which should shrink the 
size of the metadata tables.
there's a euro-bias in ceilometer so you'll be more successful reaching people 
on irc during euro work hours... that said, you'll probably get the best 
response by posting to the list or pinging someone on the core team directly.
cheers,
gord


Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Clint Byrum
Excerpts from Daniele Venzano's message of 2014-09-25 02:40:11 -0700:
 On 09/25/14 10:12, Qiming Teng wrote:
  Yes, just about 3 VMs running on two hosts, for at most 3 weeks. This 
  is leading me to another question -- any best practices/tools to 
  retire the old data on a regular basis? Regards, Qiming
 
 There is a tool: ceilometer-expirer
 
 I tried to use it on a mysql database, since I had the same table size 
 problem as you and it made the machine hit swap. I think it tries to 
 load the whole table in memory.
 Just to see if it would eventually finish, I let it run for 1 week 
 before throwing away the whole database and move on.
 
 Now I use Ceilometer's pipeline to forward events to elasticsearch via 
 udp + logstash and do not use Ceilometer's DB or API at all.
 

Interesting, this almost sounds like what should be the default
configuration honestly.



Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-25 Thread Qiming Teng
On Thu, Sep 25, 2014 at 11:51:23AM -0400, gordon chung wrote:
  mysql> select count(*) from metadata_text;
  +----------+
  | count(*) |
  +----------+
  | 25249913 |
  +----------+
  1 row in set (3.83 sec)
  There were 25M records in one table.  The deletion time is reaching an
  unacceptable level (7 minutes for 4M records) and it was not increasing
  in a linear way.  Maybe DB experts can show me how to optimize this?
 we don't do any customisations in default ceilometer package so i'm sure 
 there's way to optimise... not sure if any devops ppl read this list. 
  Another question: does the mongodb backend support events now?
  (I asked this question in IRC, but, just as usual, no response from
  anyone in that community, no matter a silly question or not is it...)
 regarding events, are you specifically asking about events 
 (http://docs.openstack.org/developer/ceilometer/events.html) in ceilometer or 
 using events term in generic sense? the table above has no relation to events 
 in ceilometer, it's related to samples and corresponding resource.  we did do 
 some remodelling of sql backend this cycle which should shrink the size of 
 the metadata tables.
 there's a euro-bias in ceilometer so you'll be more successful reaching 
 people on irc during euro work hours... that said, you'll probably get best 
 response by posting to list or pinging someone on core team directly.
 cheers,gord 

Thanks for the responses above.
TBH, I am unaware of any performance problems based on my previous
experience using MongoDB as the backend.  I switched over to MySQL
simply because only the SQLAlchemy backend supports Ceilometer events.
Sorry for the confusion -- the metadata table size wasn't a direct
result of using events, though it does seem to be an indirect result of
switching to MySQL (not sure about this either).

I'll try Euro work hours in the future.  Thanks for the hints!

Cheers,
Qiming





Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-24 Thread Mark Kirkwood

On 25/09/14 15:37, Qiming Teng wrote:

Hi,

Some weeks ago, I checked my then-latest devstack install and I learned
this: event support in Ceilometer is only available for the sqlalchemy
backend; the mongodb backend was still under development.  I have been using
MySQL during the past weeks and now I think I'm trapped by a MySQL
performance problem.

One or two Nova servers were launched and remained idle for about 10 days.
Now I'm seeing a lot of data accumulated in the db, and I wanted to clean
it out manually.  Here is what I got:

mysql> select count(*) from metadata_text;
+----------+
| count(*) |
+----------+
| 25249913 |
+----------+
1 row in set (3.83 sec)

mysql> delete from metadata_text limit 1000;
Query OK, 1000 rows affected (0.02 sec)

mysql> delete from metadata_text limit 10000;
Query OK, 10000 rows affected (0.39 sec)

mysql> delete from metadata_text limit 100000;
Query OK, 100000 rows affected (2.31 sec)

mysql> delete from metadata_text limit 1000000;
Query OK, 1000000 rows affected (25.32 sec)

mysql> delete from metadata_text limit 2000000;
Query OK, 2000000 rows affected (1 min 16.17 sec)

mysql> delete from metadata_text limit 4000000;
Query OK, 4000000 rows affected (7 min 40.40 sec)

There were 25M records in one table.  The deletion time is reaching an
unacceptable level (7 minutes for 4M records) and it was not increasing
in a linear way.  Maybe DB experts can show me how to optimize this?



Writes to bigger datasets will take non-linear time when the (possibly 
default?) configs are outgrown. For instance (assuming metadata_text is 
an innodb table), take a look at:


- innodb_log_buffer_size
- innodb_log_file_size (warning: read the manual carefully before 
changing this)

- innodb_buffer_pool_size

Also, index maintenance can get to be a limiting factor. I'm not sure if 
mysql will use the sort buffer to help with this, but maybe try increasing


- sort_buffer_size

(just for the session doing the delete) and see if it helps.

There are many (way too many) other parameters to tweak, but the above 
ones are probably the best to start with.
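For reference, the three innodb knobs live in my.cnf. The values below are placeholders only, not recommendations; size them to your RAM and workload:

```ini
[mysqld]
# Example values only -- tune to available RAM and workload.
innodb_buffer_pool_size = 4G
innodb_log_buffer_size  = 64M
# Changing innodb_log_file_size needs care (clean shutdown, and on older
# MySQL versions the old log files must be removed) -- read the manual first.
innodb_log_file_size    = 512M
```

sort_buffer_size, by contrast, can be raised for just the deleting session, with something like SET SESSION sort_buffer_size = 67108864; issued before the delete.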


Cheers

Mark



Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-24 Thread Clint Byrum
Excerpts from Qiming Teng's message of 2014-09-24 20:37:39 -0700:
 Hi,
 
 Some weeks ago, I checked my then latest devstack install and I learned
 this: event support in Ceilometer is only available for sqlalchemy
 backend; mongodb backend was still under development.  I have been using
 MySQL during the past weeks and now I think I'm trapped by a performance
 problem of MySQL.
 
 One or two Nova servers were launched and remain idle for about 10 days.
 Now I'm seeing a lot of data accumulated in db and I wanted to cleanse
 it manually.  Here is what I got:
 
 mysql> select count(*) from metadata_text;
 +----------+
 | count(*) |
 +----------+
 | 25249913 |
 +----------+
 1 row in set (3.83 sec)

 mysql> delete from metadata_text limit 1000;
 Query OK, 1000 rows affected (0.02 sec)

 mysql> delete from metadata_text limit 10000;
 Query OK, 10000 rows affected (0.39 sec)

 mysql> delete from metadata_text limit 100000;
 Query OK, 100000 rows affected (2.31 sec)

 mysql> delete from metadata_text limit 1000000;
 Query OK, 1000000 rows affected (25.32 sec)

 mysql> delete from metadata_text limit 2000000;
 Query OK, 2000000 rows affected (1 min 16.17 sec)

 mysql> delete from metadata_text limit 4000000;
 Query OK, 4000000 rows affected (7 min 40.40 sec)
 
 There were 25M records in one table.  The deletion time is reaching an
 unacceptable level (7 minutes for 4M records) and it was not increasing
 in a linear way.  Maybe DB experts can show me how to optimize this?

Wow, you definitely do not want to be doing transactions like that on a
regular basis. It is just murder on performance and can be deadly for
things like replication. If you plan to do the whole table, just use
truncate table.

This is not unique to Ceilometer, and is in fact as old as databases
really.

There is a fantastic tool for doing it as efficiently as possible
though:

http://www.percona.com/doc/percona-toolkit/2.1/pt-archiver.html

It will try to order by the physical blocks in the table, and do small
chunks continuously to minimize the impact. You can also have it sleep
for an amount of time based on how long the last delete took, so that
you are responsive to server load and impact.
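For illustration, the chunked-purge idea can be sketched in a few lines of Python. It uses sqlite3 purely so the example is self-contained; against a production MySQL, pt-archiver itself is the right tool.

```python
import sqlite3
import time

def chunked_delete(conn, table, chunk=1000, pause=0.0):
    """Purge a table in small batches instead of one huge transaction.

    Mimics the pt-archiver idea: each batch commits on its own, so locks
    are held briefly, and an optional pause between batches keeps the
    impact on a loaded server bounded.
    """
    deleted = 0
    while True:
        cur = conn.execute(
            "DELETE FROM %s WHERE rowid IN "
            "(SELECT rowid FROM %s ORDER BY rowid LIMIT ?)" % (table, table),
            (chunk,))
        conn.commit()
        if cur.rowcount == 0:
            return deleted
        deleted += cur.rowcount
        time.sleep(pause)  # throttle between batches

# Self-contained demo against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata_text (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO metadata_text (v) VALUES (?)", [("x",)] * 5000)
conn.commit()
print(chunked_delete(conn, "metadata_text"))  # -> 5000
```

Against a real deployment you would point pt-archiver at the table instead, with --purge and a small --limit; see the linked documentation for the exact flags.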

The approach pt-archiver uses should be built into any purge commands,
but many of the purge commands I've encountered in OpenStack just throw
a massive delete at the DB and hope for the best. I have not looked at
Ceilometer's.

Please note this is mostly an operational question, and not a development
question, so I think this thread might want to move over to the openstack@
mailing list.



Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-24 Thread Preston L. Bannister
Sorry, I am jumping into this without enough context, but ...


On Wed, Sep 24, 2014 at 8:37 PM, Qiming Teng teng...@linux.vnet.ibm.com
wrote:

 mysql> select count(*) from metadata_text;
 +----------+
 | count(*) |
 +----------+
 | 25249913 |
 +----------+
 1 row in set (3.83 sec)



There are problems where a simple sequential log file is superior to a
database table. The above looks like a log ... a very large number of
events, without an immediate customer. For sequential access, a simple file
is *vastly* superior to a database table.

If you are thinking about indexed access to the above as a table, think
about the cost of adding items to the index, for that many items. The cost
of building the index is not small. Running a map/reduce on sequential
files might be faster.
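A minimal sketch of the trade-off described above, with illustrative field names (not Ceilometer's schema): events are appended to a flat file, and a "query" is a sequential scan.

```python
import json
import os
import tempfile

# Append-only event log: writes are O(1) appends and there is no index to
# maintain.
events = [{"event_type": "compute.instance.exists", "resource_id": "vm-%d" % i}
          for i in range(1000)]

with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    for ev in events:
        f.write(json.dumps(ev) + "\n")  # append, never update in place
    path = f.name

# Querying is a sequential scan; for bulk analytics this streams at disk
# speed and splits naturally into the map step of a map/reduce job.
with open(path) as f:
    hits = sum(1 for line in f if json.loads(line)["resource_id"] == "vm-42")
os.unlink(path)
print(hits)  # -> 1
```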

Again, I do not have enough context, but ... 25 million rows?