Our production database is seeing very heavy CPU utilization - does anyone have any ideas/input given the following?

CPU utilization gradually increases during the day until it approaches 90-100% at our peak time. When this happens our transactions/sec drops and the site becomes very slow. In this state I can see hundreds of queries in pg_stat_activity that are not waiting on locks but sit there for minutes. When the database is not in this state, those same queries complete in fractions of a second - faster than my script that watches pg_stat_activity can keep track of them.
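For concreteness, a query along these lines (a minimal sketch against 8.3's pg_stat_activity columns; the one-minute cutoff is arbitrary) shows the stuck-but-not-waiting backends I am describing:

    SELECT procpid,
           now() - query_start AS runtime,
           current_query
    FROM pg_stat_activity
    WHERE current_query NOT LIKE '<IDLE>%'          -- skip idle and idle-in-transaction backends
      AND NOT waiting                               -- not blocked on a lock
      AND now() - query_start > interval '1 minute' -- has been running "for minutes"
    ORDER BY runtime DESC;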

This server has dual quad-core Xeon 5310s, 32 GB RAM, and a few different disk arrays (all managed in hardware by either the PERC 5/i or PERC 5/e adapter). The PostgreSQL data is on a 14-disk 7.2k SATA RAID 10. This server runs nothing but PostgreSQL.

The database (according to pg_database_size) is 55 GB, and we are running PostgreSQL 8.3.5 on the 2.6.28.7-2 kernel under Arch Linux.

Right now (not under peak load) this server is running at 68% CPU utilization and its SATA RAID 10 is doing about 2 MB/s of writes and 11 MB/s of reads. When I run dd I can hit 200+ MB/s writes and 230+ MB/s reads, so we are barely using the available I/O. Furthermore, while dd runs, that process's CPU utilization only reaches 20-30% of one core.
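For reference, the dd numbers above come from simple sequential tests of roughly this shape (the path and sizes here are illustrative; the test file must exceed the 32 GB of RAM so the read is not served from the page cache):

    # sequential write, flushed to disk before dd reports its rate (~38 GB file)
    dd if=/dev/zero of=/data/ddtest bs=8k count=5000000 conv=fdatasync

    # sequential read of the same file back
    dd if=/data/ddtest of=/dev/null bs=8k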

Additionally, when I watch "top -c" I generally see a dozen or so "idle" postgres processes (they appear and drop away quickly, though) consuming very large chunks of CPU (as much as 60% of a core). At any given time we have hundreds of idle postgres processes due to the JDBC connection pooling, but most of them sit at 0% as I would expect. I also see SELECTs and INSERTs consuming very large percentages of CPU, but I find that less strange since they are actually doing work.
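For anyone who wants the same breakdown from the database side: on 8.3, idle backends show up in pg_stat_activity with current_query set to '<IDLE>', so a rough count by state looks like this:

    SELECT CASE
             WHEN current_query = '<IDLE>'                THEN 'idle'
             WHEN current_query = '<IDLE> in transaction' THEN 'idle in transaction'
             ELSE 'active'
           END AS state,
           count(*)
    FROM pg_stat_activity
    GROUP BY 1
    ORDER BY 2 DESC;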

Any ideas as to what is causing our CPUs to struggle? Could the fact that our RAM holds a significant portion of the database be causing the CPUs to thrash as they work against memory while the disk controllers sit idle? According to top we barely use any swap.

We currently have max_connections set to 1000 (roughly the sum of the JDBC pools on our application servers). Would decreasing this value help? We can decrease the JDBC pools or switch to pgbouncer for pooling if this is the case.
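For what it's worth, a simple way to compare the live backend count against that limit (nothing 8.3-specific here):

    SELECT count(*)                           AS backends,
           current_setting('max_connections') AS max_connections
    FROM pg_stat_activity;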

Really just looking for any input/ideas. Our workload is primarily OLTP in nature - essentially a social network. The transactions/sec figure I mentioned at the start is the xact_commit value from pg_stat_database; please let me know if that is not an appropriate way to estimate tps. Right now, at the ~68% CPU utilization and low I/O use described above, xact_commit is increasing at a rate of about 1070 per second.
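To be precise about the method: I sample the counter along these lines, N seconds apart, and divide the delta by N:

    -- run twice, N seconds apart; (second - first) / N = commits/sec
    SELECT xact_commit
    FROM pg_stat_database
    WHERE datname = current_database();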

I have an identical PITR slave; I can pause its PITR sync to run any tests against it. I will happily provide any additional information that would be helpful.

Any assistance is greatly appreciated.

Joe Uhl
