Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Willo van der Merwe

Merlin Moncure wrote:

On 8/29/06, Willo van der Merwe [EMAIL PROTECTED] wrote:


 and it has 743321 rows and an explain analyze select count(*) from
property_values;



you have a number of options:

All good ideas and I'll be sure to implement them later.


I am curious why you need to query the count of records in the log
table to six digits of precision.

I'm not with you here.
I'm drawing statistics for my users on a per-user basis in real-time, 
so there are a couple of where clauses attached.


merlin


Hi Merlin,

This was just an example. All queries have slowed down. Could it be that 
I've reached some cut-off and now my disk is thrashing?


Currently the load looks like this:
Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] performance problems.

2006-08-30 Thread Matthew Sullivan

All,

Got a bit of a performance problem that I hope can be resolved.

All the files/info I believe you are going to ask for are here:

http://www.au.sorbs.net/~matthew/postgres/30.8.06/

The odd thing was that it originally was fast (1-2 seconds), which is all I 
need - the query is a permissions check, and I added a permissions 
caching engine to the client code.  However, while testing part of my 
new interface I added and expired some rows in the permissions and 
authorisation tables (taking the row count to ~15), and the performance 
dropped to 86 seconds (ish), which is unusable... :-(


Unfortunately I do not have a query plan from before the performance issue.

work_mem has been adjusted from 512 to 8192, 65536 and 1000000 with no 
apparent effect.

random_page_cost has been 4 and 2 - 2 results in 89 seconds for the query.

The hardware is a Compaq 6400r with 4G of EDO RAM, 4x500MHz Xeons and a 
Compaq RAID 3200 in RAID 5 configuration running across 3 spindles (34G 
total space).


The OS is FreeBSD 5.4-RELEASE-p14
The PG Version is 8.1.3

Solutions/tips greatly appreciated.

Regards,

Mat

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Luke Lonergan
 Currently the load looks like this:
 Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
 Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
 Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
 Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si

All four CPUs are hammered busy - check top and look for runaway
processes.

- Luke


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Willo van der Merwe

Rusty Conover wrote:


On Aug 29, 2006, at 7:52 AM, Willo van der Merwe wrote:


Hi,

We're running PostgreSQL 8.1.4 on CentOS 4 (Linux version 
2.6.9-34.0.1.ELsmp). Hardware specs:

2x AMD Dual-Core Opteron 270 Italy 1Ghz HT 2 x 1MB L2 Cache Socket 940
4 GB Registered ECC PC3200 DDR RAM
SuperMicro Server-Class 1U AS1020S series system
Dual-channel Ultra320 SCSI controller
1 x 73 GB 10,000rpm Ultra320 SCSI drive with 8MB cache
I use it to drive a web application. Everything was working fine when 
all of a sudden today, things went belly up. Load on the server 
started increasing and query speeds decreased rapidly. After dropping 
all the clients I did some quick tests and found the following:


I have a log table looking like this:
    Table "public.log"
  Column  |            Type             |           Modifiers
---------+-----------------------------+--------------------------------
 site| bigint  | not null
 stamp   | timestamp without time zone | default now()
 type| character(8)| not null default 'log'::bpchar
 user| text| not null default 'public'::text
 message | text|
Indexes:
fki_log_sites btree (site)
ix_log_stamp btree (stamp)
ix_log_type btree (type)
ix_log_user btree (user)
Foreign-key constraints:
log_sites FOREIGN KEY (site) REFERENCES sites(id) ON UPDATE 
CASCADE ON DELETE CASCADE


and it has 743321 rows and an explain analyze select count(*) from 
property_values;
                                                     QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=55121.95..55121.96 rows=1 width=0) (actual time=4557.797..4557.798 rows=1 loops=1)
   ->  Seq Scan on property_values  (cost=0.00..51848.56 rows=1309356 width=0) (actual time=0.026..2581.418 rows=1309498 loops=1)
 Total runtime: 4557.978 ms
(3 rows)

4 1/2 seconds for a count(*)? This seems a bit rough - is there 
anything else I can try to optimize my database? You can imagine that 
slightly more complex queries go through the roof.


Any help appreciated

Regards

Willo van der Merwe



Hi,

What about doing a little bit of normalization? 


With 700k rows you could probably gain some improvements by:

* normalizing the type and user columns to integer keys (dropping the 
8 byte overhead for storing the field lengths)
* maybe changing the type column to a smallint if there is just a 
small range of possible values (emulating an enum type in other 
databases) rather than joining to another table.
* maybe moving message out of this table into a separate table that is 
joined only when you need the column's content (if the majority of the 
rows are big and not null, but not big enough to be TOASTed, ergo 
causing only a small number of rows to fit onto an 8k page).


Doing these things would fit more rows onto each page, making the scan 
less intensive by not causing the drive to seek as much.  Of course 
all of these suggestions depend on your workload.
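
A rough sketch of the first two points (the log_slim and log_users names 
and the smallint codes are invented for illustration):

-- integer-keyed lookup replacing the inline text user names
CREATE TABLE log_users (
    id   serial PRIMARY KEY,
    name text NOT NULL UNIQUE
);

CREATE TABLE log_slim (
    site    bigint   NOT NULL REFERENCES sites(id)
                     ON UPDATE CASCADE ON DELETE CASCADE,
    stamp   timestamp without time zone DEFAULT now(),
    type    smallint NOT NULL DEFAULT 0,   -- 0 = 'log', etc. (emulated enum)
    user_id integer  NOT NULL REFERENCES log_users(id),
    message text
);

Each row narrows considerably, so more rows fit per 8k page and a 
sequential scan touches less disk.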


Cheers,

Rusty
--
Rusty Conover
InfoGears Inc.


Hi Rusty,

Good ideas and I've implemented some of them, and gained about 10%. I'm 
still sitting on a load avg of about 60.


Any ideas on optimizations on my postgresql.conf, that might have an effect?


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Willo van der Merwe

Luke Lonergan wrote:

Currently the load looks like this:
Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si



All four CPUs are hammered busy - check top and look for runaway
processes.

- Luke


  

Yes, the first 463 processes are all postgres. In the meanwhile I've done:
Dropped max_connections from 500 to 250 and
Upped shared_buffers = 50000

Without any apparent effect.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Luke Lonergan
Interesting - in this quick snapshot there is no I/O happening at all.
What happens when you track the activity for a longer period of time?

How about just capturing vmstat during a period when the queries are
slow?

Has the load average been this high forever or are you experiencing a
growth in workload?  463 processes all doing CPU work will take 100x as
long as one query on a 4 CPU box, have you worked through how long you
should expect the queries to take?

- Luke 

 -Original Message-
 From: Willo van der Merwe [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, August 30, 2006 4:35 AM
 To: Luke Lonergan
 Cc: Merlin Moncure; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] PostgreSQL performance issues
 
 Luke Lonergan wrote:
  Currently the load looks like this:
  Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
  Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
  Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
  Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
 
  All four CPUs are hammered busy - check top and look for runaway 
  processes.
 
  - Luke
 
 Yes, the first 463 processes are all postgres. In the meanwhile 
 I've done:
 Dropped max_connections from 500 to 250 and
 Upped shared_buffers = 50000
 
 Without any apparent effect.
 
 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Alex Hayward
On Wed, 30 Aug 2006, Willo van der Merwe wrote:

 Merlin Moncure wrote:
  On 8/29/06, Willo van der Merwe [EMAIL PROTECTED] wrote:
 
  and it has 743321 rows and an explain analyze select count(*) from
  property_values;
 
 
  you have a number of options:
 All good ideas and I'll be sure to implement them later.

  I am curious why you need to query the count of records in the log
  table to six digits of precision.
 I'm not with you here.
 I'm drawing statistics for my users on a per-user basis in real-time,
 so there are a couple of where clauses attached.

Most of the advice so far has been aimed at improving the performance of
the query you gave. If this query isn't representative of your load then
you'll get better advice if you post the queries you are actually making
along with EXPLAIN ANALYZE output.

 Hi Merlin,

 This was just an example. All queries have slowed down. Could it be that
 I've reached some cut-off and now my disk is thrashing?

 Currently the load looks like this:
 Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
 Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
 Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
 Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si

It seems to be a sort of standing assumption on this list that databases
are much larger than memory and that database servers are almost always IO
bound. This isn't always true, but as we don't know the size of your
database or working set we can't tell. You'd have to look at your OS's IO
statistics to be sure, but it doesn't look to me to be likely that you're
IO bound.

If there are significant writes going on then it may also be interesting
to know your context switch rate and whether dropping your foreign key
constraint makes any difference. IIRC your foreign key constraint will
result in the row in log_sites being locked FOR UPDATE and cause updates
and inserts into your log table for a particular site to be serialized (I
may be out of date on this, it's a while since I heavily used foreign
keys).
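
One way to test that (a sketch, using the constraint name from the table 
definition posted earlier; dropping the constraint briefly leaves the data 
unvalidated, so try this on a test box first):

ALTER TABLE log DROP CONSTRAINT log_sites;
-- run the write-heavy workload and watch the context switch rate
ALTER TABLE log ADD CONSTRAINT log_sites FOREIGN KEY (site)
    REFERENCES sites(id) ON UPDATE CASCADE ON DELETE CASCADE;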

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Willo van der Merwe

That's exactly what I'm experiencing.

Everything was fine until yesterday, when we noticed a considerable site 
slow-down. Graphs showed the server suddenly spiking to a load of 67. At 
first I thought somebody had executed a runaway query, so I restarted 
postgres, but after it came back up, it climbed back up to this load.


In the meanwhile I've applied some table-level optimizations and the 
postgresql.conf optimizations ... nothing.


Here's the vmstat output, since reboot last night

[EMAIL PROTECTED] ~]# vmstat -a
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   inact  active   si   so    bi    bo   in   cs us sy id wa
27  0      0 595312 248100 2962764    0    0     8    31  105    7 63  2 35  0

[EMAIL PROTECTED] ~]# vmstat -d
disk- ------------reads------------ ------------writes------------ ---IO---
       total merged sectors     ms    total  merged  sectors       ms cur sec
ram0       0      0       0      0        0       0        0        0   0   0
ram1       0      0       0      0        0       0        0        0   0   0
ram2       0      0       0      0        0       0        0        0   0   0
ram3       0      0       0      0        0       0        0        0   0   0
ram4       0      0       0      0        0       0        0        0   0   0
ram5       0      0       0      0        0       0        0        0   0   0
ram6       0      0       0      0        0       0        0        0   0   0
ram7       0      0       0      0        0       0        0        0   0   0
ram8       0      0       0      0        0       0        0        0   0   0
ram9       0      0       0      0        0       0        0        0   0   0
ram10      0      0       0      0        0       0        0        0   0   0
ram11      0      0       0      0        0       0        0        0   0   0
ram12      0      0       0      0        0       0        0        0   0   0
ram13      0      0       0      0        0       0        0        0   0   0
ram14      0      0       0      0        0       0        0        0   0   0
ram15      0      0       0      0        0       0        0        0   0   0
sda   197959  38959 4129737 952923   777438 1315162 16839981 39809324   0 2791
fd0        0      0       0      0        0       0        0        0   0   0
md0        0      0       0      0        0       0        0        0   0   0




Luke Lonergan wrote:

Interesting - in this quick snapshot there is no I/O happening at all.
What happens when you track the activity for a longer period of time?

How about just capturing vmstat during a period when the queries are
slow?

Has the load average been this high forever or are you experiencing a
growth in workload?  463 processes all doing CPU work will take 100x as
long as one query on a 4 CPU box, have you worked through how long you
should expect the queries to take?

- Luke 

  



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Willo van der Merwe

Alex Hayward wrote:

[earlier quoted text snipped]

It seems to be a sort of standing assumption on this list that databases
are much larger than memory and that database servers are almost always IO
bound. This isn't always true, but as we don't know the size of your
database or working set we can't tell. You'd have to look at your OS's IO
statistics to be sure, but it doesn't look to me to be likely that you're
IO bound.

[rest of quoted message snipped]

Hi Alex,

Yes, I haven't noticed any major I/O waits either. The crazy thing here 
is that all the queries were running within an acceptable time limit, but 
then suddenly it went haywire. I did not change any of the queries or 
fiddle with the server in any way. Previously we'd experienced 1 or 2 
spikes a day (where load would suddenly spike to 67 or so, but then 
quickly drop back below 4), but in this case it stayed up. So I 
restarted the service and started fiddling with options, with no 
apparent effect.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Dave Cramer


On 30-Aug-06, at 7:35 AM, Willo van der Merwe wrote:


Luke Lonergan wrote:

Currently the load looks like this:
Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si




All four CPUs are hammered busy - check top and look for runaway
processes.

- Luke



Yes, the first 463 processes are all postgres. In the meanwhile I've done:

Dropped max_connections from 500 to 250 and
Upped shared_buffers = 50000


With 4G of memory you can push shared buffers to double that.
effective_cache should be 3/4 of available memory.
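
In 8K-page units that works out roughly as follows (a sketch, taking 
"double" against the 50000 above and 3/4 of this box's 4G):

shared_buffers = 100000          # ~800MB
effective_cache_size = 393216    # 3/4 of 4G = 3G, in 8K pages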

Can you also check vmstat 1 for high context switches during this  
query, high being over 100k


Dave


Without any apparent effect.





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Willo van der Merwe




Dave Cramer wrote:

[earlier quoted text snipped]

With 4G of memory you can push shared buffers to double that.
effective_cache should be 3/4 of available memory.

Can you also check vmstat 1 for high context switches during this
query, high being over 100k

Dave

[rest of quoted message snipped]

Hi Dave,

Ok, I've upped shared_buffers = 150000
and effective_cache_size = 100000

and restarted the service
top now reads:

top - 15:08:28 up 20:12, 1 user, load average: 19.55, 22.48, 26.59
Tasks: 132 total, 24 running, 108 sleeping, 0 stopped, 0 zombie
Cpu0 : 97.0% us, 1.0% sy, 0.0% ni, 0.3% id, 0.0% wa, 0.3% hi, 1.3% si
Cpu1 : 98.3% us, 1.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu2 : 98.0% us, 1.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.3% si
Cpu3 : 96.7% us, 3.3% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 4060084k total, 2661772k used, 1398312k free, 108152k buffers
Swap: 4192956k total, 0k used, 4192956k free, 2340936k cached

 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11446 postgres 17 0 1280m 97m 95m R 28.9 2.5 0:03.63 postmaster
11435 postgres 16 0 1279m 120m 117m R 26.9 3.0 0:05.18 postmaster
11438 postgres 16 0 1279m 31m 30m R 24.6 0.8 0:04.43 postmaster
11163 postgres 16 0 1279m 120m 118m R 23.2 3.0 0:42.61 postmaster
11167 postgres 16 0 1279m 120m 118m R 23.2 3.0 0:41.04 postmaster
11415 postgres 15 0 1279m 299m 297m R 22.2 7.5 0:07.07 postmaster
11428 postgres 15 0 1279m 34m 32m R 21.9 0.9 0:05.53 postmaster
11225 postgres 16 0 1279m 31m 30m R 21.6 0.8 0:34.95 postmaster
11298 postgres 16 0 1279m 118m 117m R 21.6 3.0 0:23.82 postmaster
11401 postgres 15 0 1279m 31m 30m R 21.6 0.8 0:08.18 postmaster
11377 postgres 15 0 1279m 122m 120m R 20.9 3.1 0:09.54 postmaster
11357 postgres 17 0 1280m 126m 123m R 19.9 3.2 0:13.98 postmaster
11415 postgres 16 0 1279m 299m 297m R 17.1 7.5 0:06.40 postmaster
11461 postgres 17 0 1279m 81m 78m R 17.1 2.0 0:00.77 postmaster
11357 postgres 15 0 1279m 120m 118m S 16.8 3.0 0:13.38 postmaster
11458 postgres 16 0 1279m 31m 30m R 15.8 0.8 0:00.97 postmaster
11446 postgres 15 0 1279m 31m 30m S 15.5 0.8 0:02.76 postmaster
11428 postgres 15 0 1279m 34m 32m S 15.2 0.9 0:04.87 postmaster
11435 postgres 16 0 1279m 120m 117m R 14.2 3.0 0:04.37 postmaster
11466 postgres 16 0 1279m 33m 32m S 7.9 0.9 0:00.24 postmaster

load avg is climbing...

vmstat 1

I don't see any cs > 100k

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so    bi    bo    in    cs us sy id wa
33  0      0 1352128 108248 2352604    0    0     7    33   147    26 65  2 33  0
19  0      0 1348360 108264 2352656    0    0     0   348  3588  1408 98  2  0  0
26  0      0 1346024 108264 2352996    0    0     0    80  3461  1154 98  2  0  0
27  0      0 1349496 108264 2352996    0    0     0   100  3611  1199 98  2  0  0
31  0      0 1353872 108264 2353064    0    0     0   348  3329  1227 97  2  0  0
21  0      0 1352528 108264 2353064    0    0     0    80  3201  1437 97  2  0  0
28  0      0 1352096 108280 2353184    0    0     0    64  3579  1073 98  2  0  0
29  0      0 1352096 108284 2353180    0    0     0     0  3538  1293 98  2  0  0
28  0      0 1351776 108288 2353244    0    0     0    36  3339  1313 99  1  0  0
22  0      0 1366392 108288 2353244    0    0     0   588  3663  1303 99  1  0  0
27  0      0 1366392 108288 2353312    0    0     0    84  3276  1028 99  1  0  0
28  0      0 1365504 108296 2353372    0    0     0   140  3500  1164 98  2  0  0
26  0      0 1368272 108296 2353372    0    0     0    68  3268  1082 98  2  0  0
25  0      0 1372232 108296 2353508    0    0     0   260  3261  1278 97  3  0  0
26  0      0 1366056 108296 2353644    0    0     0     0  3268  1178 98  2  0  0
24  1      0 1368704 108296 2353780    0    0     0  1788  3548  1614 97  3  0  0
29  0      0 1367728 108296 2353304    0    0     0    60  3637  1105 99  1  0  0
21  0      0 1365224 108300 2353640    0    0     0    12  3257   918 99  1  0  0
27  0      0 1363944 108300 2354116    0    0     0    72  3052  1365 98  2  0  0
25  0      0 1366968 108300 2354184    0    0     0   212  3314  1696 99  1  0  0
30  0      0 1363552 108300 2354184    0    0     0    72  3147  1420 97  2  0  0
27  0      0 1367792 108300 2354184    0    0     0   184  3245  1310 97  2  0  0
21  0      0 1369088 108308 2354380    0    0     0   140  3306   987 98  2  0  0
11  1      0 1366056 108308 2354448    0    0     0    88  3210  1183 98  1  0  0
27  0      0 1361104 108308 2354516    0    0     0     0  3598  1015 98  2 

Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Dave Dutcher
That's an interesting situation. Your CPUs are pegged, and you're hardly 
doing any IO. I wonder if there is some inefficient query, or if it's 
just very high query volume. Maybe you could try setting 
log_min_duration_statement to try to track down the slowest of the 
queries. Then post the slow queries with an explain analyze to the list.

Here is some info on setting up logging:
http://www.postgresql.org/docs/8.1/interactive/runtime-config-logging.html

Are your queries standard SQL or do you call functions you wrote in 
PL/pgSQL or PL/Python or anything?
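
A minimal sketch of the relevant 8.1 postgresql.conf settings - the 250 ms 
threshold is just an illustrative value:

log_min_duration_statement = 250   # in ms; -1 disables, 0 logs everything
redirect_stderr = on               # capture stderr logging into files
log_directory = 'pg_log'           # relative to the data directory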




Re: [PERFORM] performance problems.

2006-08-30 Thread Vivek Khera


On Aug 30, 2006, at 5:29 AM, Matthew Sullivan wrote:

The hardware is a Compaq 6400r with 4G of EDO RAM, 4x500MHz Xeons  
and a Compaq RAID 3200 in RAID 5 configuration running across 3  
spindles (34G total space).


The OS is FreeBSD 5.4-RELEASE-p14
The PG Version is 8.1.3


What else does this box do?

I think you should try these settings, which I use on 4GB dual  
Opteron boxes running FreeBSD 6.x dedicated to Postgres only.  Your  
effective_cache_size seems overly optimistic for freebsd.  cranking  
up the shared buffers seems to be one of the best bangs for the buck  
under pg 8.1.  I recently doubled them and nearly tripled my  
performance on a massive write-mostly (insert/update) load.  Unless  
your disk system is *really* slow, random_page_cost should be reduced  
from the default 4.


As you can see, I change *very* little from the default config.


shared_buffers = 70000  # min 16 or max_connections*2, 8KB each

work_mem = 262144   # min 64, size in KB
maintenance_work_mem = 524288   # min 1024, size in KB

checkpoint_segments = 256
checkpoint_timeout = 900

effective_cache_size = 27462  # `sysctl -n vfs.hibufspace` / 8192 (BLKSZ)

random_page_cost = 2

if you're feeling adventurous try these to reduce the checkpoint  
impact on the system:


bgwriter_lru_percent = 2.0
bgwriter_lru_maxpages = 40
bgwriter_all_percent = 0.666
bgwriter_all_maxpages = 40


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.MailerMailer, LLC Rockville, MD
http://www.MailerMailer.com/ +1-301-869-4449 x806






Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Willo van der Merwe

Dave Dutcher wrote:
That's an interesting situation.  Your CPUs are pegged, and you're 
hardly doing any IO.  I wonder if there is some inefficient query, or 
if it's just very high query volume.  Maybe you could try setting 
log_min_duration_statement to try to track down the slowest of the 
queries.  Then post the slow queries with an explain analyze to the list.
 
Here is some info on setting up logging:

http://www.postgresql.org/docs/8.1/interactive/runtime-config-logging.html
 
Are your queries standard SQL or do you call functions you wrote in 
PL/pgSQl or PL/Python or anything?
 
 
It might be a combination of queries and load. My queries use almost 
exclusively functions, but an unloaded dev machine performs the 
queries in approx 10ms. When is it appropriate to start clustering 
database servers?


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Merlin Moncure

On 8/30/06, Willo van der Merwe [EMAIL PROTECTED] wrote:

This was just an example. All queries have slowed down. Could it be that
I've reached some cut-off and now my disk is thrashing?

Currently the load looks like this:
Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si


I don't think so, it looks like you are cpu bound.  Your server has a
(fairly high) budget of records per second it can crunch through.  You
have hit that limit and backpressure is building up and server load is
escalating.  This is almost certainly due to inefficient sql, which is
very easy to do, especially if you are using some type of middleware
which writes the sql for you.  The trick here would be to turn all sql
logging on and find out where your budget is getting spent.  Solving
the problem may be a simple matter of adding an index or crafting a
stored procedure.

merlin
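
If the per-user statistics boil down to counts, a trigger-maintained 
summary table is one commonly suggested way to avoid repeated scans. A 
rough sketch (the log_counts table and trigger names are invented for 
illustration):

CREATE TABLE log_counts (
    "user" text PRIMARY KEY,
    n      bigint NOT NULL DEFAULT 0
);

CREATE OR REPLACE FUNCTION log_count_trig() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE log_counts SET n = n + 1 WHERE "user" = NEW."user";
        IF NOT FOUND THEN
            INSERT INTO log_counts ("user", n) VALUES (NEW."user", 1);
        END IF;
    ELSE  -- DELETE
        UPDATE log_counts SET n = n - 1 WHERE "user" = OLD."user";
    END IF;
    RETURN NULL;  -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER log_count AFTER INSERT OR DELETE ON log
    FOR EACH ROW EXECUTE PROCEDURE log_count_trig();

The single counter row serializes writers for a given user, so this trades 
some write concurrency for a cheap read.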

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] performance problems.

2006-08-30 Thread Jim C. Nasby
On Wed, Aug 30, 2006 at 10:10:28AM -0400, Vivek Khera wrote:
 effective_cache_size = 27462# `sysctl -n  
 vfs.hibufspace` / 8192 (BLKSZ)
 random_page_cost = 2

You misunderstand how effective_cache_size is used. It's the *only*
memory factor that plays a role in cost estimator functions. This means
it should include the memory set aside for caching in shared_buffers.

Also, hibufspace is only talking about filesystem buffers in FreeBSD,
which AFAIK has nothing to do with total memory available for caching,
since VM pages are also used to cache data.

Basically, your best bet for setting effective_cache_size is to use the
total memory in the machine, and subtract some overhead for the OS and
other processes. I'll typically subtract 1G.
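
For the 4G machine in this thread that rule would come out roughly as 
follows (a sketch):

(4G - 1G) / 8K = 393216
effective_cache_size = 393216   # vs. the 27462 derived from hibufspace
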
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Alan Hodgson
On Wednesday 30 August 2006 03:48, Willo van der Merwe 
[EMAIL PROTECTED] wrote:
 Hi Rusty,

 Good ideas and I've implemented some of them, and gained about 10%. I'm
 still sitting on a load avg of about 60.

 Any ideas on optimizations on my postgresql.conf, that might have an
 effect?

If all of those sessions are truly doing a select count(*) from a .75 
million row table (plus half a million dead rows), then I'm not surprised 
it's bogged down.  Every query has to loop through the cache of the full 
table in memory every time it's run.

Your CPU is doing something.   I doubt that postgresql.conf settings are  
going to help.  What exactly are all those high CPU usage sessions doing?
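
When only a ballpark figure is needed, one commonly cited alternative is to 
read the planner's estimate instead of counting (a sketch):

SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE relname = 'property_values';

It's only as fresh as the last VACUUM or ANALYZE, though, and it can't 
honour the per-user WHERE clauses, so it doesn't replace the real query here.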

-- 
Government big enough to supply everything you need is big enough to take
everything you have ... the course of history shows that as a government
grows, liberty decreases. -- Thomas Jefferson

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] slow i/o

2006-08-30 Thread Jignesh K. Shah

The bgwriter parameters changed in 8.1

Try

bgwriter_lru_maxpages=0
bgwriter_lru_percent=0

to turn off bgwriter and see if there is any change.

-Jignesh


Junaili Lie wrote:

Hi Jignesh,
Thank you for my reply.
I have the setting just like what you described:

wal_sync_method = fsync
wal_buffers = 128
checkpoint_segments = 128
bgwriter_all_percent = 0
bgwriter_maxpages = 0


I ran the dtrace script and found the following:
During the i/o busy time, there are postgres processes that have a very 
high BYTES count. During the non-busy time, the same processes don't do 
a lot of i/o activity. I checked pg_stat_activity but couldn't find this 
process. Doing ps revealed that this process was started at the same 
time as postgres, which leads me to believe that it may be the background 
writer or some other internal process. This process is not autovacuum, 
because it doesn't disappear when I 
tried turning autovacuum off.
Except for the ones mentioned above, I didn't modify the other 
background setting:

MONSOON=# show bgwriter_delay ;
 bgwriter_delay
----------------
 200
(1 row)

MONSOON=# show bgwriter_lru_maxpages ;
 bgwriter_lru_maxpages
-----------------------
 5
(1 row)

MONSOON=# show bgwriter_lru_percent ;
 bgwriter_lru_percent
----------------------
 1
(1 row)

This i/o spike only happens at minute 1 and minute 6 (ie. 10.51, 10.56) 
. If I do select * from pg_stat_activity during this time, I will see a 
lot of write queries waiting to be processed. After a few seconds, 
everything seems to be gone. All writes that are not happening at the 
time of this i/o jump are being processed very fast, thus do not show on 
pg_stat_activity.


Thanks in advance for the reply,
Best,

J

On 8/29/06, Jignesh K. Shah [EMAIL PROTECTED] wrote:


Also to answer your real question:

DTrace On Solaris 10:

# dtrace -s /usr/demo/dtrace/whoio.d

It will tell you the pids doing the io activity and  on which devices.
There are more scripts in that directory like iosnoop.d, iotime.d
and others which also will give
other details like file accessed, time it took for the io etc.

Hope this helps.

Regards,
Jignesh


Junaili Lie wrote:
  Hi everyone,
  We have a postgresql 8.1 installed on Solaris 10. It is running fine.
  However, for the past couple days, we have seen the i/o reports
  indicating that the i/o is busy most of the time. Before this, we
only
  saw i/o being busy occasionally (very rare). So far, there has
been no
  performance complaints by customers, and the slow query reports
doesn't
  indicate anything out of the ordinary.
  There's no code changes on the applications layer and no database
  configuration changes.
  I am wondering if there's a tool out there on Solaris to tell which
  process is doing most of the i/o activity?
  Thank you in advance.
 
  J
 




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] performance problems.

2006-08-30 Thread Alex Hayward
On Wed, 30 Aug 2006, Jim C. Nasby wrote:

 On Wed, Aug 30, 2006 at 10:10:28AM -0400, Vivek Khera wrote:
  effective_cache_size = 27462# `sysctl -n
  vfs.hibufspace` / 8192 (BLKSZ)
  random_page_cost = 2

 You misunderstand how effective_cache_size is used. It's the *only*
 memory factor that plays a role in cost estimator functions. This means
 it should include the memory set aside for caching in shared_buffers.

 Also, hibufspace is only talking about filesystem buffers in FreeBSD,
 which AFAIK has nothing to do with total memory available for caching,
 since VM pages are also used to cache data.

I believe it's not talking about quantities of buffers at all, but about
kernel virtual address space. It's something like the amount of kernel
virtual address space available for mapping buffer-cache pages in to
kernel memory. It certainly won't tell you (or even approximate) how much
PostgreSQL data is being cached by the OS. Cached PostgreSQL data will
appear in the active, inactive and cached values - and (AFAIK) there isn't
any distinction between file-backed pages and swap-backed pages amongst
those.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] slow i/o

2006-08-30 Thread Junaili Lie
I have tried this to no avail.
I have also tried changing the bgwriter_delay parameter to 10. The spike 
in i/o still occurs, although not on a consistent basis, and it only 
happens for a few seconds.


On 8/30/06, Jignesh K. Shah [EMAIL PROTECTED] wrote:

The bgwriter parameters changed in 8.1

Try

bgwriter_lru_maxpages=0
bgwriter_lru_percent=0

to turn off bgwriter and see if there is any change.

-Jignesh

[rest of quoted thread snipped]


Re: [PERFORM] performance problems.

2006-08-30 Thread Matthew Sullivan

Vivek Khera wrote:



On Aug 30, 2006, at 5:29 AM, Matthew Sullivan wrote:

The hardware is a Compaq 6400r with 4G of EDO RAM, 4x500MHz Xeons  
and a Compaq RAID 3200 in RAID 5 configuration running across 3  
spindles (34G total space).


The OS is FreeBSD 5.4-RELEASE-p14
The PG Version is 8.1.3



What else does this box do?


Nothing - it's the development DB and is dedicated to the development 
website - which has a total number of users of '1' ;-)


I think you should try these settings, which I use on 4GB dual  
Opteron boxes running FreeBSD 6.x dedicated to Postgres only.  Your  
effective_cache_size seems overly optimistic for freebsd.  cranking  
up the shared buffers seems to be one of the best bangs for the buck  
under pg 8.1.  I recently doubled them and nearly tripled my  
performance on a massive write-mostly (insert/update) load.  Unless  
your disk system is *really* slow, random_page_cost should be reduced  
from the default 4.


I'll give this a try.



As you can see, I change *very* little from the default config.


shared_buffers = 70000  # min 16 or max_connections*2, 8KB each

work_mem = 262144   # min 64, size in KB
maintenance_work_mem = 524288   # min 1024, size in KB

checkpoint_segments = 256
checkpoint_timeout = 900

effective_cache_size = 27462# `sysctl -n  vfs.hibufspace` 
/ 8192 (BLKSZ)

random_page_cost = 2

if you're feeling adventurous try these to reduce the checkpoint  
impact on the system:


bgwriter_lru_percent = 2.0
bgwriter_lru_maxpages = 40
bgwriter_all_percent = 0.666
bgwriter_all_maxpages = 40

That might have some impact on the production server (which is also 
running PG - but the old DB and RT3); however, the new DB is only me in 
devel, so I think it will not have much of an effect (I'll still 
try it though).


Regards,

Mat

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] performance problems.

2006-08-30 Thread Mark Kirkwood

Matthew Sullivan wrote:



The OS is FreeBSD 5.4-RELEASE-p14
The PG Version is 8.1.3

Solutions/tips greatly appreciated.



This won't help this particular query, but 6.1-RELEASE will possibly be 
a better performer generally, in particular for your SMP system - e.g. 
the vfs layer is no longer under the Giant lock in the 6.x series, so 
parallel io should be much better!


Cheers

Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] performance problems.

2006-08-30 Thread Dave Cramer


On 30-Aug-06, at 10:10 AM, Vivek Khera wrote:



On Aug 30, 2006, at 5:29 AM, Matthew Sullivan wrote:

The hardware is a Compaq 6400r with 4G of EDO RAM, 4x500MHz Xeons  
and a Compaq RAID 3200 in RAID 5 configuration running across 3  
spindles (34G total space).


The OS is FreeBSD 5.4-RELEASE-p14
The PG Version is 8.1.3


What else does this box do?

I think you should try these settings, which I use on 4GB dual  
Opteron boxes running FreeBSD 6.x dedicated to Postgres only.  Your  
effective_cache_size seems overly optimistic for freebsd.  cranking  
up the shared buffers seems to be one of the best bangs for the  
buck under pg 8.1.  I recently doubled them and nearly tripled my  
performance on a massive write-mostly (insert/update) load.  Unless  
your disk system is *really* slow, random_page_cost should be  
reduced from the default 4.


Actually, unless you have a RAM disk you should probably leave 
random_page_cost at 4; shared_buffers should be 2x what you have 
here, and maintenance_work_mem is pretty high.

effective_cache_size should be much larger - 3/4 of 4G, or about 360000.

Setting work_mem this high should be done with caution. From the 
manual: "Note that for a complex query, several sort or hash 
operations might be running in parallel; each one will be allowed to 
use as much memory as this value specifies before it starts to put 
data into temporary files. Also, several running sessions could be 
doing such operations concurrently. So the total memory used could be 
many times the value of work_mem."
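
To put a rough worst-case number on that: work_mem = 262144 is 256MB per 
sort or hash step, so ten sessions each running two such operations could 
ask for 10 x 2 x 256MB = 5GB on a 4G machine.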

As you can see, I change *very* little from the default config.


shared_buffers = 70000  # min 16 or max_connections*2, 8KB each

work_mem = 262144   # min 64, size in KB
maintenance_work_mem = 524288   # min 1024, size in KB

checkpoint_segments = 256
checkpoint_timeout = 900

effective_cache_size = 27462# `sysctl -n  
vfs.hibufspace` / 8192 (BLKSZ)

random_page_cost = 2

if you're feeling adventurous try these to reduce the checkpoint  
impact on the system:


bgwriter_lru_percent = 2.0
bgwriter_lru_maxpages = 40
bgwriter_all_percent = 0.666
bgwriter_all_maxpages = 40


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.MailerMailer, LLC Rockville, MD
http://www.MailerMailer.com/ +1-301-869-4449 x806





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly