On Tue, Mar 11, 2014 at 12:04 PM, Anand Kumar, Karthik < karthik.anandku...@classmates.com> wrote:
> Hi all, > > We're running postgres 9.3.2, server configuration below. > > Seemingly randomly, we will see the number of active queries in postgres > go up until we hit max_connections. The DB will recover after a few minutes. > > We had the issue a couple of times in Feb 2014. We then upgraded the > postgres server from 9.1 to 9.3.2, and the occurrence has gone up > significantly - to several times a day. > > The user CPU goes up as well to a 100%, no increase in I/O or system CPU. > We have slow query logging, and there is no dramatic change in the slow > queries either. > There is a corresponding spike in shared locks, but that seems to be an > effect not a cause - it corresponds to an increase in the number of running > processes at the time. > > We had a similar issue in the past - that was solved by disabling > transparent_huge_pages - but the difference there was that we'd see queries > slow down dramatically. Currently, we don't. Also, transparent_huge_pages > is still disabled. > > I do realize the issue would be caused by a spurt in incoming > connections - we do not yet have conclusive evidence on whether that's > happening (active queries climbs up, however no conclusive proof on whether > thats because of slow down, or because of increase in traffic). Working on > getting the information, will update with that information as soon as we > have it. > > I thought I'd send a post out to the group before then, to see if anyone > has run into anything similar. > > Thanks, > Karthik > > site=# SELECT version(); > PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 > 20120305 (Red Hat 4.4.6-4), 64-bit > > site=# SELECT name, current_setting(name), source > site-# FROM pg_settings > site-# WHERE source NOT IN ('default', 'override'); > application_name|psql|client > archive_command|/usr/bin/archiver.sh %f %p|configuration file > archive_mode|on|configuration file > autovacuum_freeze_max_age|250000000|configuration file > autovacuum_max_workers|6|configuration file > bgwriter_lru_maxpages|1000|configuration file > bgwriter_lru_multiplier|4|configuration file > checkpoint_completion_target|0.8|configuration file > checkpoint_segments|250|configuration file > checkpoint_timeout|15min|configuration file > checkpoint_warning|6min|configuration file > client_encoding|UTF8|client > commit_siblings|25|configuration file > cpu_tuple_cost|0.03|configuration file > DateStyle|ISO, MDY|configuration file > default_statistics_target|300|configuration file > default_text_search_config|pg_catalog.english|configuration file > effective_cache_size|568GB|configuration file > fsync|on|configuration file > lc_messages|en_US.UTF-8|configuration file > lc_monetary|en_US.UTF-8|configuration file > lc_numeric|en_US.UTF-8|configuration file > lc_time|en_US.UTF-8|configuration file > listen_addresses|*|configuration file > log_autovacuum_min_duration|0|configuration file > log_checkpoints|on|configuration file > log_connections|on|configuration file > log_destination|syslog|configuration file > log_directory|pg_log|configuration file > log_filename|postgresql-%a.log|configuration file > log_line_prefix|user=%u,db=%d,ip=%h |configuration file > log_min_duration_statement|100ms|configuration file > log_min_messages|debug1|configuration file > log_rotation_age|1d|configuration file > log_rotation_size|0|configuration file > log_timezone|US/Pacific|configuration file > log_truncate_on_rotation|on|configuration file > logging_collector|off|configuration file > maintenance_work_mem|1GB|configuration file > max_connections|1500|configuration file > max_locks_per_transaction|1000|configuration file > max_stack_depth|2MB|environment variable > max_wal_senders|5|configuration file > port|5432|command line > random_page_cost|2|configuration file > shared_buffers|8GB|configuration file > synchronous_commit|off|configuration file > syslog_facility|local0|configuration file > syslog_ident|postgres|configuration file > TimeZone|US/Pacific|configuration file > vacuum_freeze_table_age|0|configuration file > wal_buffers|32MB|configuration file > wal_keep_segments|250|configuration file > wal_level|hot_standby|configuration file > wal_sync_method|fsync|configuration file > work_mem|130MB|configuration file > Please let us know your hardware configuration like RAM, CPU (cores) etc. Do you see any messages indicating any processes getting terminated/killed forcibly in the Postgresql logs ? Or do you see any shared memory related error messages ? cpu_tuple_cost=0.03 - which is not default, any reasons for increasing this. effective_cache_size = 568 GB - Please help us know if this is optimal for your system. Venkata Balaji N Sr. Database Administrator Fujitsu Australia