On 12/28/2013 05:51 AM, 刘胜 wrote:
Hi all:
I have reported a bug about time consuming of “resource-list” in
ceilometer CLI:

In order to Identify the causes of this phenomenon, I have pdb the codes
in my invironment(configured  mysql as db driver):
the most import part of process of listing resource is implemented in
following codes:

code of get_resources() in /ceilometer/storage/impl_sqlalchemy.py:
  for meter, first_ts, last_ts in query.all():
             yield api_models.Resource(
for m in meter.resource.meters
The method  generate iterator of object of api_models.Resource for
ceilometer API to show.
1.The operation “query.all()” will query the DB table “meter” with the
expression generated forward,in my invironment the DB table “meter” have
more than 300000 items, so this operation may consume about 30 seconds;
2.The operation"for m in meter.resource.meters" will circulate the
meters of this resource . a resource of server may have more than 100000
meter iterms in my invironment.  So the time of whole process is too
long. I think the meter of Resource object can be reduced and I have
tested this modification, it is OK for listing resource,and reduce the
most time consumption

I have noticed that there are many methods of db operation may time

ps: I have configured the ceilometer pulling interval from 600s to 60s
in /etc/ceilometer/pipeline.yaml, but the invironment has just run 10 days!

I'm a beginner of ceilometer,and want to fix this bug,but I haven't
found a suitable way
may be someone can help me with this?

Yep. The performance of the SQL driver in Ceilometer out-of-the-box with that particular line is unusable in our experience. We have our Chef cookbook literally patch Ceilometer's source code and comment out that particular line because it makes performance of Ceilometer unusable.

I hate to say it, but the SQL driver in Ceilometer really needs an overhaul, both at the schema level and the code level:

On the schema level:

* The indexes, especially on sourceassoc, are wrong:
** The order of the columns in the multi-column indexes like idx_sr, idx_sm, idx_su, idx_sp is incorrect. Columns used in predicates should *precede* columns (like source_id) that are used in joins. The way the indexes are structured now makes them unusable by the query optimizer for 99% of queries on the sourceassoc table, which means any queries on sourceassoc trigger a full table scan of the hundreds of millions of records in the table. Things are made worse by the fact that INSERT operations are slowed for each index on a table, and the fact that none of these indexes are used just means we're wasting cycles on each INSERT for no reason. ** The indexes are across the entire VARCHAR(255) field width. This isn't necessary (and I would argue that the base field type should be smaller). Index width can be reduced (and performance increased) by limiting the indexable width to 32 (or smaller).

The solution to the main indexing issues is to do the following:

DROP INDEX idx_sr ON sourceassoc;
CREATE INDEX idx_sr ON sourceassoc (resource_id(32), source_id(32));
DROP INDEX idx_sp ON sourceassoc;
CREATE INDEX idx_sp ON sourceassoc (project_id(32), source_id(32));
DROP INDEX idx_su ON sourceassoc;
CREATE INDEX idx_su ON sourceassoc (user_id(32), source_id(32));
DROP INDEX idx_sm ON sourceassoc;
CREATE INDEX idx_sm ON sourceassoc (meter_id, source_id(32));

Keep in mind if you have (hundreds of) millions of records in the sourceassoc table, the above will take a long time to run. It will take hours, but you'll be happy you did it. You'll see the database performance increase dramatically.

* The columns that refer to IDs of various kinds should not be UTF8. Changing these columns to a latin1 or even binary charset would cut the space requirements for the data and index storage by 65%. This means you can fit around 3x as many records in the same data and index pages. The more records you fit into an index page, the faster seeks and scans will be.

* sourceassoc has no primary key.

* The meter table has the following:

  KEY ix_meter_id (id)

which is entirely redundant (id is the primary key) and does nothing but slow down insert operations for every record in the meter table.

* The meter table mixes frequently searched and aggregated fields (like timestamp, counter_type, project_id) with infrequently accessed fields (like resource_metadata, which is a VARCHAR(5000)). This leads to poorer performance of aggregate queries on the meter table that use the clustered index (primary key) in aggregation (for an example, see the particular line of code that we comment out of Ceilometer above). A better performing schema would consolidate slim, frequently accessed fields into the main meter table and move infrequently accessed or searched fields into a meter_extra table. This would mean many more records of the main meter table can fit into a single InnoDB data page (the clustered index), which means faster seeks and scans for 99% of queries on that table.

On the code level there are a variety of inefficient queries that are generated, and there are a number of places where using something like a memcache caching layer for common lookup queries could help reduce load on the DB server.

I'm hoping to push some patches in the early part of 2014 that address performance and scalability issues in the SQL driver for Ceilometer.


OpenStack-dev mailing list

Reply via email to