Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Richard Huxton

Kevin Kempter wrote:

Hi List;

I've just inherited multiple postgres database servers in multiple data 
centers across the US and Europe via a new contract I've just started.


Each night during the nightly batch processing several of the servers (2 in
particular) slow to a crawl - they are dedicated postgres database servers.
There is a lot of database activity going on, sometimes upwards of 200
concurrent queries, however I just don't think that the machines should be this
pegged. I am in the process of cleaning up dead space - their #1 fix for
performance issues in the past is to kill the current vacuum process.
Likewise I've just bumped shared_buffers to 15 and work_mem to 25.


Well, allowing vacuum to do its job can clearly only help matters. I'm 
not sure about setting work_mem so high though. That's the memory you're 
using per-sort, so you can use multiples of that in a single query. With 
200 concurrent queries I'd worry about running into swap. If you're 
doing it just for the batch processes that might make sense.
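
If it really is only the batch jobs that need it, they could raise it for
their own sessions and leave the global default alone. A rough sketch (the
value here is made up):

  -- per-session bump: the other ~200 connections keep the smaller default
  SET work_mem = 262144;        -- in KB on 8.0/8.1; 8.2+ also accepts '256MB'

  -- or scope it to one transaction so it can't leak into later queries
  BEGIN;
  SET LOCAL work_mem = 262144;
  -- ... the big sort / aggregate here ...
  COMMIT;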


You might well want to set maintenance_work_mem quite high though, for 
any overnight maintenance.
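
For example (only a sketch, with a made-up value):

  -- give manual VACUUM / CREATE INDEX more memory, for this session only
  SET maintenance_work_mem = 524288;   -- in KB, i.e. roughly 512MB
  VACUUM VERBOSE ANALYZE;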


A shared_buffers of 1.2GB isn't outrageous, but again with 200 backend 
processes you'll want to consider how much memory each process will 
consume. It could be that you're better off with a smaller 
shared_buffers and relying more on the OS doing its disk caching.
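
You can sanity-check the arithmetic straight from the server; on these
releases shared_buffers is reported as a number of 8kB pages, so 150000
pages works out to roughly the 1.2GB above:

  SELECT name, setting
    FROM pg_settings
   WHERE name IN ('shared_buffers', 'work_mem', 'max_connections');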


Even at that I still see slow processing/high system loads at night. I have
noticed that killing the current vacuum process (autovacuum is turned on)
speeds up the entire machine significantly.


If it's disk I/O that's the limiting factor, you might want to look at
the Cost-Based Vacuum Delay section in the configuration settings.
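
Something along these lines for the manual nightly vacuums; the numbers are
only a starting point and should be checked against iostat:

  -- make vacuum pause periodically instead of saturating the disks
  SET vacuum_cost_delay = 20;     -- ms to sleep each time the cost limit is reached
  SET vacuum_cost_limit = 200;    -- work allowed between sleeps (the default)
  VACUUM VERBOSE ANALYZE;

Autovacuum has its own knobs (autovacuum_vacuum_cost_delay and friends, on
8.1 and later), which are set in postgresql.conf.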


The servers are 4-CPU Intel boxes (not dual-core) with 4GB of memory, and are
attached to RAID-10 arrays.


Any thoughts on where to start?


Make sure you are gathering stats, at least the stats_block_level stuff.
Then have a cron job make copies of the stats tables, adding a timestamp
column.  That way you can run diffs against different time periods.
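
A rough sketch of that (the history table name is made up):

  -- one-time setup: an empty copy of the stats view plus a timestamp column
  CREATE TABLE stats_history AS
    SELECT now() AS snapshot_time, * FROM pg_stat_user_tables WHERE false;

  -- run this from cron every few minutes
  INSERT INTO stats_history SELECT now(), * FROM pg_stat_user_tables;

The same trick works against pg_statio_user_tables if you want the
block-level counters as well.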


Pair this up with top/vmstat/iostat activity.

Use log_min_duration_statement to catch any long-running queries so you 
can see if you're getting bad plans that push activity up.
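
For example (it needs superuser rights, or a postgresql.conf entry):

  SET log_min_duration_statement = 1000;   -- log anything over 1000 ms; -1 disables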


Try to make only one change at a time, otherwise it's difficult to tell
what's helping and what's hurting.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Kevin Grittner
 On Mon, Aug 27, 2007 at 11:13 PM, in message
[EMAIL PROTECTED], Kevin Kempter
[EMAIL PROTECTED] wrote: 
 Each night during the nightly batch processing several of the servers (2 in 
 particular) slow to a crawl - they are dedicated postgres database servers. 
 There is a lot of database activity going on sometimes upwards of 200 
 concurrent queries
 
 Any thoughts on where to start?
 
Is there any way to queue up these queries and limit how many are running at
a time?  I don't know what the experience of others is, but I've found that
when I have more than two to four queries running per CPU, throughput starts
to drop, and response time degrades even faster.
 
For purposes of illustration, let's forget for a moment that a query may
block waiting for I/O and that another query might be able to use the CPU in
the meantime.  Then think of it this way -- if you have one CPU and 100
queries to run, each of which will take one second, and you start them all
and they time slice, nobody gets anything for 100 seconds, so that is your
average response time.  If you run them one at a time, only the last query
takes that long, the rest finish sooner, and you've cut your average response
time in half.  On top of that, there is overhead to switching between
processes, and there can be contention for resources such as locks, both of
which tend to slow things down further.
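
To put a number on the illustration above: run one at a time, the queries
finish at 1, 2, ... 100 seconds, for an average of about 50 seconds instead
of 100.  You can even check it from psql:

  -- average completion time of 100 one-second queries run serially
  SELECT avg(i) AS avg_response_seconds FROM generate_series(1, 100) AS s(i);
  -- 50.5, versus 100 seconds when everything time-slices to the end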
 
In the real world, there are multiple resources which can hold up a
query, so you get benefit from running more than one query at a time,
because they will often be using different resources.
 
But unless that machine has 50 CPUs, you will probably get better throughput
and response time by queuing the requests.
 
-Kevin
 




Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Andrew Sullivan
On Tue, Aug 28, 2007 at 08:12:06AM -0500, Kevin Grittner wrote:
  
 Is there any way to queue up these queries and limit how many are running at
 a time?  

Sure: limit the number of connections to the database, and put a pool
in front.  It can indeed help.

If you have a lot of bloat due to large numbers of failed vacuums,
however, I suspect your problem is I/O.  Vacuum churns through the
disk very aggressively, and if you're close to your I/O limit, it can
push you over the top.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The year's penultimate month is not in truth a good way of saying
November.
--H.W. Fowler



Re: [PERFORM] server performance issues - suggestions for tuning

2007-08-28 Thread Scott Marlowe
On 8/27/07, Kevin Kempter [EMAIL PROTECTED] wrote:
 Hi List;

 I've just inherited multiple postgres database servers in multiple data
 centers across the US and Europe via a new contract I've just started.

What pg version are you working with, and on what OS / OS version?

 Each night during the nightly batch processing several of the servers (2 in
 particular) slow to a crawl - they are dedicated postgres database servers.
 There is a lot of database activity going on, sometimes upwards of 200
 concurrent queries, however I just don't think that the machines should be this
 pegged. I am in the process of cleaning up dead space - their #1 fix for
 performance issues in the past is to kill the current vacuum process.
 Likewise I've just bumped shared_buffers to 15 and work_mem to 25.

That's way too big for work_mem, as mentioned before.  Set it to something
reasonable, like 8MB or so.  Then, if you've got one query that really
needs lots of memory to run well, you can set it higher for that
connection / query only.  You can even set work_mem to a particular
value for a particular user with the ALTER USER command.
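
For instance (the role name is made up; 8.2 and later also accept values
with units like '128MB'):

  -- modest default for everyone via postgresql.conf: work_mem = 8192  (KB)
  -- then give just the batch role more room:
  ALTER USER batch_etl SET work_mem = 131072;   -- ~128MB, picked up at login
  -- or raise it for one heavy session only:
  SET work_mem = 262144;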

Oh, and 200 concurrent queries is a LOT.

 Even at that I still see slow processing/high system loads at night. I have
 noticed that killing the current vacuum process (autovacuum is turned on)
 speeds up the entire machine significantly.

 The servers are 4-CPU Intel boxes (not dual-core) with 4GB of memory, and are
 attached to RAID-10 arrays.

It sounds to me like your systems are I/O bound, at least when vacuum
is running.  If you want to get good performance and have vacuum run
in a reasonable amount of time, you might need to upgrade your RAID
subsystems.  Do you have battery backed caching controllers?  Which
exact model controller are you using?  How many drives in your RAID10
array?  What types of queries are typical (OLAP versus OLTP really)?

 Any thoughts on where to start?

- The vacuum cost settings, to reduce the impact vacuum has.

- Increasing the fsm settings as needed.

- VACUUM VERBOSE, to see if you've blown out your fsm settings and to see
what fsm settings you might need (there's a rough sketch below).

- Reindexing particularly bloated tables / indexes.

- Hardware upgrades if needed.
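
A rough sketch of the fsm check and the reindex step (the table name is
made up):

  -- a database-wide VACUUM VERBOSE prints, at the end, how many page slots
  -- and relations the free space map needs versus what is configured
  VACUUM VERBOSE;

  -- rebuild a badly bloated table's indexes (takes exclusive locks,
  -- so keep it inside the maintenance window)
  REINDEX TABLE big_bloated_table;

max_fsm_pages and max_fsm_relations themselves go in postgresql.conf and
need a server restart to change.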



[PERFORM] server performance issues - suggestions for tuning

2007-08-27 Thread Kevin Kempter
Hi List;

I've just inherited multiple postgres database servers in multiple data 
centers across the US and Europe via a new contract I've just started.

Each night during the nightly batch processing several of the servers (2 in
particular) slow to a crawl - they are dedicated postgres database servers.
There is a lot of database activity going on, sometimes upwards of 200
concurrent queries, however I just don't think that the machines should be this
pegged. I am in the process of cleaning up dead space - their #1 fix for
performance issues in the past is to kill the current vacuum process.
Likewise I've just bumped shared_buffers to 15 and work_mem to 25.

Even at that I still see slow processing/high system loads at night. I have
noticed that killing the current vacuum process (autovacuum is turned on)
speeds up the entire machine significantly.

The servers are 4-CPU Intel boxes (not dual-core) with 4GB of memory, and are
attached to RAID-10 arrays.

Any thoughts on where to start?

Below are the current/relevant/changed postgresql.conf settings.

Thanks in advance...

/Kevin




== postgresql.conf (partial listing)
#---
# CLIENT CONNECTION DEFAULTS
#---

# - Statement Behavior -

#search_path = '$user,public'   # schema names
#default_tablespace = ''                # a tablespace name, '' uses the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0  # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown                     # actually, defaults to TZ environment setting
#australian_timezones = off
#extra_float_digits = 0 # min -15, max 2
#client_encoding = sql_ascii            # actually, defaults to database encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'en_US.UTF-8'             # locale for system error message strings
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'  # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'


#---
# LOCK MANAGEMENT
#---

#deadlock_timeout = 1000                # in milliseconds
#max_locks_per_transaction = 64 # min 10
# note: each lock table slot uses ~220 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.


#---
# VERSION/PLATFORM COMPATIBILITY
#---

# - Previous Postgres Versions -

#add_missing_from = off
#backslash_quote = safe_encoding        # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = off
#regex_flavor = advanced                # advanced, extended, or basic
#sql_inheritance = on

# - Other Platforms & Clients -

#transform_null_equals = off


#---
# CUSTOMIZED OPTIONS
#---

#custom_variable_classes = ''   # list of custom variable class names
=
