Dear all I am having a problem of high cpu loads in my postgres server during peak time. Following are the details of my setup (details as per the postgres wiki) .
** PostgreSQL version o Run "select pg_version();" in psql or PgAdmin III and provide the full, exact output.* clusternode2:~ # rpm -qa | grep postgres postgresql-devel-8.1.9-1.2 postgresql-8.1.9-1.2 postgresql-docs-8.1.9-1.2 postgresql-server-8.1.9-1.2 postgresql-libs-64bit-8.1.9-1.2 postgresql-libs-8.1.9-1.2 postgresql-jdbc-8.1-12.2 postgresql-contrib-8.1.9-1.2 * *A description of what you are trying to achieve and what results you expect.* To keep the CPU Load below 10 , Now during peak times the load is nearing to 40 At that time , it is not possible to access the data. ** The EXACT text of the query you ran, if any * The EXACT output of that query if it's short enough to be reasonable to post o If you think the output is wrong, what you think should've been produced instead * The EXACT error message you get, if there is one* As of now , i am unable to locate the exact query, the load shoots up abnormally during peak time is the main problem . * * What program you're using to connect to PostgreSQL* Jakarta Tomcat - Struts with JSP ** What version of the ODBC/JDBC driver you're using, if any* postgresql-jdbc-8.1-12.2 * *What you were doing when the error happened / how to cause the error. Describe in as much detail as possible, step by step, including command lines, SQL output, etc.* When certain tables with more than 3 lakh items are concurrently accessed by more than 300 users, the CPU load shoots up . * * Is there anything remotely unusual in the PostgreSQL server logs? o On Windows these are in your data directory. On a default PostgreSQL install that'll be in C:\Program Files\PostgreSQL\8.4\data\pg_log (assuming you're using 8.4) * The log file /var/log/postgresql has no data . * o On Linux this depends a bit on distro, but you'll usually find them in /var/log/postgresql/. * Operating system and version o Linux users: + Linux distro and version + Kernel details (run "uname -a" on the terminal) * SLES 10 SP3 clusternode2:~ # uname -a Linux clusternode2 2.6.16.46-0.12-ppc64 #1 SMP Thu May 17 14:00:09 UTC 2007 ppc64 ppc64 ppc64 GNU/Linux * * What kind of hardware you have. o CPU manufacturer and model, eg "AMD Athlon X2" or "Intel Core 2 Duo" o Amount and size of RAM installed, eg "2GB RAM" * High Availability Cluster with two IBM P Series Server and one DS4700 Storage IBM P series P52A with 2-core 2.1 Ghz POWER5+ Processor Card , 36 MB L3 Cache ,16 GB of RAM, 73.4 GB 10,000 RPM Ultra320 SCSI Drive for Operating System . * o Storage details (important for performance and corruption questions) + Do you use a RAID controller? If so, what type of controller? eg "3Ware Escalade 8500-8" # Does it have a battery backed cache module? # Is write-back caching enabled? + Do you use software RAID? If so, what software and what version? eg "Linux software RAID (md) 2.6.18-5-686 SMP mod_unload 686 REGPARM gcc-4.1". # In the case of Linux software RAID you can get the details from the "modinfo md_mod" command + Is your PostgreSQL database on a SAN? # Who made it, what kind, etc? Provide what details you can. + How many hard disks are connected to the system and what types are they? You need to say more than just "6 disks". At least give maker, rotational speed and interface type, eg "6 15,000rpm Seagate SAS disks". + How are your disks arranged for storage? Are you using RAID? If so, what RAID level(s)? What PostgreSQL data is on what disks / disk sets? What file system(s) are in use? # eg: "Two disks in RAID 1, with all PostgreSQL data and programs stored on one ext3 file system." # eg: "4 disks in RAID 5 holding the pg data directory on an ext3 file system. 2 disks in RAID 1 holding pg_clog, pg_xlog, the temporary tablespace, and the sort scratch space, also on ext3.". # eg: "Default Windows install of PostgreSQL" + In case of corruption data reports: # Have you had any unexpected power loss lately? # Have you run a file system check? (chkdsk / fsck) # Are there any error messages in the system logs? (unix/linux: "dmesg", "/var/log/syslog" ; Windows: Event Viewer in Control Panel -> Administrative Tools ) * IBM SAN DS4700 Storage with Fibre Channel HDD (73.4 GB * 10) Two Partitions - 73.4 GB * 3 RAID 5 - 134 GB storage partitions (One holding Jakarata tomcat application server and other holding Postgresql Database) . Four Hard disk RAID 5 with ext3 file systems hold the pgdata on SAN . Hard disk rotational speed is 73 GB 15K IBM 2 GB Fibre channel No power loss, filesystem check also fine, No errors on /var/log/syslog *Following is the output of TOP command during offpeak time.* top - 18:36:56 up 77 days, 20:33, 1 user, load average: 12.99, 9.22, 10.37 Tasks: 142 total, 12 running, 130 sleeping, 0 stopped, 0 zombie Cpu(s): 46.1%us, 1.9%sy, 0.0%ni, 6.1%id, 3.0%wa, 0.0%hi, 0.1%si, 42.9%st Mem: 16133676k total, 13657396k used, 2476280k free, 450908k buffers Swap: 14466492k total, 124k used, 14466368k free, 11590056k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 22458 postgres 19 0 2473m 477m 445m R 40 3.0 0:15.49 postmaster 22451 postgres 15 0 2442m 447m 437m S 33 2.8 0:30.44 postmaster 22464 postgres 17 0 2443m 397m 383m R 28 2.5 0:13.78 postmaster 22484 postgres 16 0 2448m 431m 412m S 20 2.7 0:02.73 postmaster 22465 postgres 17 0 2440m 461m 449m R 15 2.9 0:03.52 postmaster 22452 postgres 16 0 2450m 727m 706m R 13 4.6 0:23.46 postmaster 22476 postgres 16 0 2437m 413m 405m S 13 2.6 0:06.11 postmaster 22485 postgres 16 0 2439m 230m 222m R 7 1.5 0:05.72 postmaster 22481 postgres 15 0 2436m 175m 169m S 7 1.1 0:04.44 postmaster 22435 postgres 17 0 2438m 371m 361m R 6 2.4 1:17.92 postmaster 22440 postgres 17 0 2445m 497m 483m R 5 3.2 1:44.50 postmaster 22486 postgres 17 0 2432m 84m 81m R 4 0.5 0:00.76 postmaster 3 root 34 19 0 0 0 R 0 0.0 1:47.50 ksoftirqd/0 4726 root 15 0 29540 8776 3428 S 0 0.1 140:02.98 X 24950 root 15 0 0 0 0 S 0 0.0 0:30.96 pdflush 1 root 16 0 812 316 280 S 0 0.0 0:13.29 init 2 root RT 0 0 0 0 S 0 0.0 0:01.46 migration/0 4 root RT 0 0 0 0 S 0 0.0 0:00.78 migration/1 5 root 34 19 0 0 0 S 0 0.0 1:36.79 ksoftirqd/1 6 root RT 0 0 0 0 S 0 0.0 0:01.46 migration/2 7 root 34 19 0 0 0 R 0 0.0 1:49.83 ksoftirqd/2 8 root RT 0 0 0 0 S 0 0.0 0:00.79 migration/3 9 root 34 19 0 0 0 S 0 0.0 1:38.18 ksoftirqd/3 10 root 10 -5 0 0 0 S 0 0.0 1:02.11 events/0 11 root 10 -5 0 0 0 S 0 0.0 1:03.27 events/1 12 root 10 -5 0 0 0 S 0 0.0 1:01.76 events/2 13 root 10 -5 0 0 0 S 0 0.0 1:02.29 events/3 14 root 10 -5 0 0 0 S 0 0.0 0:00.01 khelper 1016 root 10 -5 0 0 0 S 0 0.0 0:00.00 kthread 1054 root 10 -5 0 0 0 S 0 0.0 0:03.08 kblockd/0 1055 root 10 -5 0 0 0 S 0 0.0 0:02.83 kblockd/1 1056 root 10 -5 0 0 0 S 0 0.0 0:03.19 kblockd/2 The CPU Load shoots upto 40 during peak time. * Following is my postgresql.conf (without comments) * hba_file = '/var/lib/pgsql/data/pg_hba.conf' listen_addresses = '*' port = 5432 max_connections = 1800 shared_buffers = 300000 max_fsm_relations = 1000 effective_cache_size = 200000 log_destination = 'stderr' redirect_stderr = on log_rotation_age = 0 log_rotation_size = 10240 silent_mode = onlog_line_prefix = '%t %d %u ' autovacuum = on datestyle = 'iso, dmy' lc_messages = 'en_US.UTF-8' lc_monetary = 'en_US.UTF-8' lc_numeric = 'en_US.UTF-8' lc_time = 'en_US.UTF-8' *User Access* Total Number of Users is 500 Maximum number of Concurrent users will be 500 during peak time Off Peak time the maximum number of concurrent user will be around 150 to 200. Please let me know your suggestions to improve the performance. Regards Shiva Raman