Re: [PERFORM] Looking for ideas on how to speed up warehouse loading

2004-04-23 Thread Aaron Werman
By definition, it is equivalent to:

SELECT t1.id, t2.url FROM referral_temp t2 LEFT /*OUTER*/ JOIN d_referral t1
ON t2.url = t1.referral_raw_url
union all
SELECT null, url FROM referral_temp WHERE url is null
ORDER BY 1;



/Aaron

- Original Message - 
From: Joe Conway [EMAIL PROTECTED]
To: Sean Shanny [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, April 23, 2004 12:38 AM
Subject: Re: [PERFORM] Looking for ideas on how to speed up warehouse
loading


 Sean Shanny wrote:
  explain analyze SELECT t1.id, t2.url FROM referral_temp t2 LEFT OUTER
  JOIN d_referral t1 ON t2.url = t1.referral_raw_url ORDER BY t1.id;

  What I would like to know is if there are better ways to do the join?  I
  need to get all the rows back from the referral_temp table as they are
  used for assigning FK's for the fact table later in processing.  When I
  iterate over the values that I get back those with t1.id = null I assign
  a new FK and push both into the d_referral table as new entries as well
  as a text file for later use.  The matching records are written to a
  text file for later use.

 Would something like this work any better (without disabling index scans):

 SELECT t1.id, t2.url
 FROM referral_temp t2, d_referral t1
 WHERE t1.referral_raw_url = t2.url;

 process rows with a match

 SELECT t1.id, t2.url
 FROM referral_temp t2
 WHERE NOT EXISTS
 (select 1 FROM d_referral t1 WHERE t1.referral_raw_url = t2.url);

 process rows without a match

 ?

 Joe

 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] Help with performance problems

2004-04-23 Thread Chris Hoover
I need some help.  I have 5 db servers running our database servers, and they 
all are having various degrees of performance problems.  The problems we are 
experiencing are:

1.  General slowness
2.  High loads

All of our db's are running on Dell Poweredge 2650 with 2 P4 Xeons (2.8 - 
3.06 GHz) with 8 to 12 GB of memory.  The databases are running on attached 
Dell Powervault 220s running raid5.

The databases were created and taken into production before I started working 
here and are very flat.  Most of the major tables have a combined primary key 
using an int field and a single char field.  There are some additional 
indexes on some tables.  Most queries I see in the  logs are running at less 
than .01 seconds with many significantly slower.

We are trying to narrow down the performance problem to either the db or the 
hardware.  As the dba, I need to  try and get these db's tuned to the best 
possible way considering the current db state.  We are in the beginning of a 
complete db redesign and application re-write, but the completion and 
deployment of the new db and app are quite a ways off.

Anyway, we are running the following:
PE 2650 w/ 2 cpus (2.8-3.06) - HT on
8-12 GB memory
OS on raid 0
DB's on Powervaults 220S using raid 5 (over 6 disks)
Each Postgresql cluster has 2 db up to almost 170db's (project to level out 
the num of db's/cluster is being started)
DB's are no bigger than a few GB in size (largest is about 11GB according to a 
du -h)
Running RH ES 2.1

Here is the postgresql.conf from the server with the 11GB db:

max_connections = 64
shared_buffers = 32768  # 256MB=32768(buffs)*8192(bytes/buff)
max_fsm_relations = 1000# min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 1   # min 1000, fsm is free space map, ~6 bytes
sort_mem = 4096 # 256MB=4096(bytes/proc)*64(procs or conns)
checkpoint_segments = 16# in logfile segments, min 1, 16MB each
checkpoint_timeout = 30 # range 30-3600, in seconds
effective_cache_size = 131072   # typically 8KB each
log_connections = true
log_pid = true
log_statement = true
log_duration = true
log_timestamp = true
stats_start_collector = true
stats_reset_on_server_start = true
stats_command_string = true
stats_row_level = true
stats_block_level = true
LC_MESSAGES = 'en_US'
LC_MONETARY = 'en_US'
LC_NUMERIC = 'en_US'
LC_TIME = 'en_US'

Here is top (server running pretty good right now)
  9:28am  up 25 days, 16:02,  2 users,  load average: 0.54, 0.33, 0.22
94 processes: 91 sleeping, 3 running, 0 zombie, 0 stopped
CPU0 states: 64.0% user,  0.1% system,  0.0% nice, 34.0% idle
CPU1 states: 29.0% user,  9.0% system,  0.0% nice, 60.0% idle
CPU2 states:  2.0% user,  0.1% system,  0.0% nice, 96.0% idle
CPU3 states:  0.0% user,  0.0% system,  0.0% nice, 100.0% idle
Mem:  7720072K av, 7711648K used,8424K free,  265980K shrd,  749888K buff
Swap: 2096440K av,   22288K used, 2074152K free 6379304K 
cached

Here is top from another server (with the most db's): 
 9:31am  up 25 days, 16:05,  5 users,  load average: 2.34, 3.39, 4.28
147 processes: 145 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states:  6.0% user,  1.0% system,  0.0% nice, 91.0% idle
CPU1 states:  9.0% user,  4.0% system,  0.0% nice, 85.0% idle
CPU2 states:  9.0% user,  3.0% system,  0.0% nice, 86.0% idle
CPU3 states:  9.0% user,  4.0% system,  0.0% nice, 85.0% idle
Mem:  7721096K av, 7708040K used,   13056K free,  266132K shrd, 3151336K buff
Swap: 2096440K av,   24208K used, 2072232K free 3746596K 
cached

Thanks for any help/advice,

Chris


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Aaron Werman
Your second server has queuing (load averages are highish), only 2 processes
running, and almost all cycles are idle. You need to track down your
bottleneck. Have you looked at iostat/vmstat? I think it would be useful to
post these, ideally both before and after full vacuum analyze.

/Aaron


- Original Message - 
From: Chris Hoover [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, April 23, 2004 9:31 AM
Subject: [PERFORM] Help with performance problems


I need some help.  I have 5 db servers running our database servers, and
they
all are having various degrees of performance problems.  The problems we are
experiencing are:

1.  General slowness
2.  High loads

All of our db's are running on Dell Poweredge 2650 with 2 P4 Xeons (2.8 -
3.06 GHz) with 8 to 12 GB of memory.  The databases are running on attached
Dell Powervault 220s running raid5.

The databases were created and taken into production before I started
working
here and are very flat.  Most of the major tables have a combined primary
key
using an int field and a single char field.  There are some additional
indexes on some tables.  Most queries I see in the  logs are running at less
than .01 seconds with many significantly slower.

We are trying to narrow down the performance problem to either the db or the
hardware.  As the dba, I need to  try and get these db's tuned to the best
possible way considering the current db state.  We are in the beginning of a
complete db redesign and application re-write, but the completion and
deployment of the new db and app are quite a ways off.

Anyway, we are running the following:
PE 2650 w/ 2 cpus (2.8-3.06) - HT on
8-12 GB memory
OS on raid 0
DB's on Powervaults 220S using raid 5 (over 6 disks)
Each Postgresql cluster has 2 db up to almost 170db's (project to level out
the num of db's/cluster is being started)
DB's are no bigger than a few GB in size (largest is about 11GB according to
a
du -h)
Running RH ES 2.1

Here is the postgresql.conf from the server with the 11GB db:

max_connections = 64
shared_buffers = 32768 # 256MB=32768(buffs)*8192(bytes/buff)
max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes
sort_mem = 4096 # 256MB=4096(bytes/proc)*64(procs or conns)
checkpoint_segments = 16 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 30 # range 30-3600, in seconds
effective_cache_size = 131072 # typically 8KB each
log_connections = true
log_pid = true
log_statement = true
log_duration = true
log_timestamp = true
stats_start_collector = true
stats_reset_on_server_start = true
stats_command_string = true
stats_row_level = true
stats_block_level = true
LC_MESSAGES = 'en_US'
LC_MONETARY = 'en_US'
LC_NUMERIC = 'en_US'
LC_TIME = 'en_US'

Here is top (server running pretty good right now)
  9:28am  up 25 days, 16:02,  2 users,  load average: 0.54, 0.33, 0.22
94 processes: 91 sleeping, 3 running, 0 zombie, 0 stopped
CPU0 states: 64.0% user,  0.1% system,  0.0% nice, 34.0% idle
CPU1 states: 29.0% user,  9.0% system,  0.0% nice, 60.0% idle
CPU2 states:  2.0% user,  0.1% system,  0.0% nice, 96.0% idle
CPU3 states:  0.0% user,  0.0% system,  0.0% nice, 100.0% idle
Mem:  7720072K av, 7711648K used,8424K free,  265980K shrd,  749888K
buff
Swap: 2096440K av,   22288K used, 2074152K free 6379304K
cached

Here is top from another server (with the most db's):
 9:31am  up 25 days, 16:05,  5 users,  load average: 2.34, 3.39, 4.28
147 processes: 145 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states:  6.0% user,  1.0% system,  0.0% nice, 91.0% idle
CPU1 states:  9.0% user,  4.0% system,  0.0% nice, 85.0% idle
CPU2 states:  9.0% user,  3.0% system,  0.0% nice, 86.0% idle
CPU3 states:  9.0% user,  4.0% system,  0.0% nice, 85.0% idle
Mem:  7721096K av, 7708040K used,   13056K free,  266132K shrd, 3151336K
buff
Swap: 2096440K av,   24208K used, 2072232K free 3746596K
cached

Thanks for any help/advice,

Chris


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Looking for ideas on how to speed up warehouse loading

2004-04-23 Thread scott.marlowe

On Thu, 22 Apr 2004, Sean Shanny wrote:

 I should have included this as well:
  fsync  | on
  shared_buffers | 4000
  sort_mem   | 64000

For purposes of loading only, you can try turning off fsync, assuming this 
is a virgin load and you can just re-initdb should bad things happen (OS, 
postgresql crash, power plug pulled, etc...)

Also increasing sort_mem and shared_buffers might help.  Especially 
sort_mem.  But turn it back down to something reasonable after the import.

And turn fsync back on after the import too.  Note you have to restart 
postgresql to make fsync = off take effect.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Josh Berkus
Chris,

 I need some help.  I have 5 db servers running our database servers, and
 they all are having various degrees of performance problems.  The problems
 we are experiencing are:

I'mm confused.  You're saying general slowness but say that most queries run 
in under .01 seconds.   And you say high loads but the TOP snapshots you 
provide show servers with 2 CPUs idle.   

Are you sure you actually *have* a performance issue?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Help with performance problems

2004-04-23 Thread scott.marlowe
On Fri, 23 Apr 2004, Chris Hoover wrote:

 DB's on Powervaults 220S using raid 5 (over 6 disks)

What controller is this, the adaptec?  We've found it to be slower than 
the LSI megaraid based controller, but YMMV.

 Running RH ES 2.1

Are you running the latest kernel for ES 2.1?  Early 2.4 kernels are 
pretty pokey and have some odd behaviour under load that later 2.4 
kernels seemed to fix.

 Here is the postgresql.conf from the server with the 11GB db:
 
 max_connections = 64
 shared_buffers = 32768# 256MB=32768(buffs)*8192(bytes/buff)
 max_fsm_relations = 1000  # min 10, fsm is free space map, ~40 bytes
 max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes

IF you're doing lots of updates and such, you might want these higher.
Have you vacuumed full the databases since taking over?

 sort_mem = 4096   # 256MB=4096(bytes/proc)*64(procs or conns)

Sorry, that's wrong.  sort_mem is measure in kbytes.  i.e. 8192 means 8 
megs sort_mem.  Try setting it a bit higher (you've got LOTS of ram in these 
boxes) to something like 16 or 32 meg.

 checkpoint_segments = 16  # in logfile segments, min 1, 16MB each
 checkpoint_timeout = 30   # range 30-3600, in seconds
 effective_cache_size = 131072 # typically 8KB each

This still looks low.  On one machine you're showing kernel cache of about 
.7 gig, on the other it's 6 gig.  6 gigs of kernel cache would be a 
setting of 80.  It's more of a nudge factor than an exact science, so 
don't worry too much.

If you've got fast I/O look at lowering random page cost to something 
between 1 and 2.  We use 1.3 to 1.4 on most of our machines with fast 
drives under them.

I'd use vmstat to see if you're I/O bound.  

also, look for index bloat.  Before 7.4 it was a serious problem.  With 
7.4 regular vacuuming should reclaim most lost space, but there are corner 
cases where you still might need to re-index.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Chris Hoover
I know the numbers look ok, but we are definetly suffering.  Also, if I try to 
run any sort of vacuum or other db activity during normal business hours, 
load goes through the roof.  I have seen loads of over 10 when trying to 
vacuum the larger cluster and would have to kill the vacuums due to 
complaints.  

I think this is probably related to the hardware configuration, but I want to 
make sure that there are no changes I could make configuration wise to the db 
that might lighten the problem.

I'm especially want to make sure that I have the memory parameters set to good 
numbers for my db's so that I can minimize thrashing between the postgres 
memory pools and the hard drive.  I am thinking that this may be a big issue 
here?

Thanks for any help,

Chris
On Friday 23 April 2004 12:42, Josh Berkus wrote:
 Chris,

  I need some help.  I have 5 db servers running our database servers, and
  they all are having various degrees of performance problems.  The
  problems we are experiencing are:

 I'mm confused.  You're saying general slowness but say that most queries
 run in under .01 seconds.   And you say high loads but the TOP snapshots
 you provide show servers with 2 CPUs idle.

 Are you sure you actually *have* a performance issue?


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Chris Hoover
Sorry for the confusion here.  I can't run any sort of vacuum durin the day 
due to performance hits.  However, I have run vacuums at night.  Several 
nights a week I run a vacuumdb -f -z  on all of the clusters.  I can take 
serveral hours to complete, but it does complete.

During the day, I have tried to run a vacuumdb -v and a vacuumdb -z -v during 
the day since I read it is supposed to help performance, but as I said, it 
causes to much of a stress on the system.

I did change the vacuumdb script to do set the vacuum_mem to 512 when 
vacuuming to try and help the situation (from the script: ${PATHNAME}psql 
$PSQLOPT $ECHOOPT -c SET vacuum_mem=524288;SET autocommit TO 'on';VACUUM 
$full $verbose $analyze $table -d $db ), and I reset it to 8192 at the end.

Anyway, thank you for the ideas so far, and any additional will be greatly 
appreciated.

Chris
On Friday 23 April 2004 13:44, Kevin Barnard wrote:
 Chris Hoover wrote:
 I know the numbers look ok, but we are definetly suffering.  Also, if I
  try to run any sort of vacuum or other db activity during normal business
  hours, load goes through the roof.  I have seen loads of over 10 when
  trying to vacuum the larger cluster and would have to kill the vacuums
  due to complaints.

 This is your problem then.  You have to regularly vacuum the DB.  You
 might want to dump and reload or schedule a vacuum full.  If you don't
 it doesn't matter what you do you will never get decent performance.
 Make sure you vacuum as a superuser this way you get system tables as well.

 Killing a vacuum is bad it tends to make the situation worse.  If you
 need to vaccuum one table at a time.

 I think this is probably related to the hardware configuration, but I want
  to make sure that there are no changes I could make configuration wise to
  the db that might lighten the problem.
 
 I'm especially want to make sure that I have the memory parameters set to
  good numbers for my db's so that I can minimize thrashing between the
  postgres memory pools and the hard drive.  I am thinking that this may be
  a big issue here?

 Get the vacuum done and don't worry about the hardware or the settings
 until afterwords.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Why will vacuum not end?

2004-04-23 Thread Shea,Dan [CIS]

PWFPM_DEV=# select * from pg_locks;
 relation | database | transaction |  pid  |   mode   |
granted
--+--+-+---+--+-

17472 |17347 | |  2618 | ShareUpdateExclusiveLock | t
  |  |10858533 | 28778 | ExclusiveLock| t
17472 |17347 | |  2618 | ShareUpdateExclusiveLock | t
  |  |10803814 |  2618 | ExclusiveLock| t
16759 |17347 | | 28778 | AccessShareLock  | t
(5 rows)

PWFPM_DEV=#

17347 is the database PWFPM_DEV iod, The pids are below

[EMAIL PROTECTED] root]# ps -ef |grep 28778|grep -v grep
postgres 28778   504  0 18:06 ?00:00:00 postgres: scores PWFPM_DEV
[local] idle
[EMAIL PROTECTED] root]# ps -ef |grep 2618|grep -v grep
postgres  2618   504  8 Apr22 ?02:31:00 postgres: postgres PWFPM_DEV
[local] VACUUM
[EMAIL PROTECTED] root]#
A vacuum is running now.  I restarted the database, set vacuum_mem =
'196608'; and started a new vacuum.  I also stopped inserting into the
database.
I hoping I will get some results.

PWFPM_DEV=# select now();vacuum verbose analyze forecastelement;select
now();
  now
---
 2004-04-22 13:38:02.083592+00
(1 row)

INFO:  vacuuming public.forecastelement
INFO:  index forecastelement_rwv_idx now contains 391385895 row versions
in 5051132 pages
DETAIL:  27962015 index row versions were removed.
771899 index pages have been deleted, 496872 are currently reusable.
CPU 4499.54s/385.76u sec elapsed 55780.91 sec.
INFO:  forecastelement: removed 33554117 row versions in 737471 pages
DETAIL:  CPU 135.61s/83.99u sec elapsed 1101.26 sec.
-Original Message-
From: Christopher Kings-Lynne [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 20, 2004 9:26 PM
To: Shea,Dan [CIS]
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Why will vacuum not end?


 No, but data is constantly being inserted by userid scores.  It is
postgres
 runnimg the vacuum.
 Dan.

Well, inserts create some locks - perhaps that's the problem...

Otherwise, check the pg_locks view to see if you can figure it out.

Chris


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Josh Berkus
Chris,

 Sorry for the confusion here.  I can't run any sort of vacuum durin the day
 due to performance hits.  However, I have run vacuums at night.  Several
 nights a week I run a vacuumdb -f -z  on all of the clusters.  I can take
 serveral hours to complete, but it does complete.

Well, here's your first problem:  since your FSM pages is low, and you're only 
vacuuming once a day, you've got to have some serious table and index bloat.   
SO you're going to need to do VACUUM FULL on all of your databases, and then 
REINDEX on all of your indexes.

After that, raise your max_fsm_pages to something useful, like 1,000,000.   Of 
course, data on your real rate of updates would help more.

If you're getting severe disk choke when you vacuum, you probably are I/O 
bound.   You may want to try something which allows you to vacuum one table 
at a time, either pg_autovacuum or a custom script.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Why will vacuum not end?

2004-04-23 Thread Josh Berkus
Guys,

 Well, inserts create some locks - perhaps that's the problem...

 Otherwise, check the pg_locks view to see if you can figure it out.

FWIW, I've had this happen a couple of times, too.   Unfortunately, it's 
happend in the middle of the day so that I had to cancel the processes and 
get the system back to normal in too much of a hurry to consider documenting 
it.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Ron St-Pierre
Josh Berkus wrote:

Chris,

 

Sorry for the confusion here.  I can't run any sort of vacuum durin the day
due to performance hits.  However, I have run vacuums at night.  Several
nights a week I run a vacuumdb -f -z  on all of the clusters.  I can take
serveral hours to complete, but it does complete.
   

Well, here's your first problem:  since your FSM pages is low, and you're only 
vacuuming once a day, you've got to have some serious table and index bloat.   
SO you're going to need to do VACUUM FULL on all of your databases, and then 
REINDEX on all of your indexes.

After that, raise your max_fsm_pages to something useful, like 1,000,000.   Of 
course, data on your real rate of updates would help more.

If you're getting severe disk choke when you vacuum, you probably are I/O 
bound.   You may want to try something which allows you to vacuum one table 
at a time, either pg_autovacuum or a custom script.

 

Tom and Josh recently gave me some help about setting the fsm settings 
which was quite useful. The full message is at
http://archives.postgresql.org/pgsql-performance/2004-04/msg00229.php
and the 'most interesting' posrtion was:

Actually, since he's running 7.4, there's an even better way.  Do a
VACUUM VERBOSE (full-database vacuum --- doesn't matter whether you
ANALYZE or not).  At the end of the very voluminous output, you'll see
something like
INFO:  free space map: 240 relations, 490 pages stored; 4080 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 2 pages = 178 kB shared memory.
Here, I would need max_fsm_relations = 240 and max_fsm_pages = 4080 to
exactly cover the present freespace needs of my system.  I concur with
the suggestion to bump that up a good deal, of course, but that gives
you a real number to start from.
The DETAIL part of the message shows my current settings (which are the
defaults) and what the FSM is costing me in shared memory space.
Good luck
Ron


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Chris Hoover
On Friday 23 April 2004 14:57, Ron St-Pierre wrote:
Does this apply to 7.3.4 also?
  Actually, since he's running 7.4, there's an even better way.  Do a
  VACUUM VERBOSE (full-database vacuum --- doesn't matter whether you
  ANALYZE or not).  At the end of the very voluminous output, you'll see
  something like


  INFO:  free space map: 240 relations, 490 pages stored; 4080 total pages
 needed DETAIL:  Allocated FSM size: 1000 relations + 2 pages = 178 kB
 shared memory.


  Here, I would need max_fsm_relations = 240 and max_fsm_pages = 4080 to
  exactly cover the present freespace needs of my system.  I concur with
  the suggestion to bump that up a good deal, of course, but that gives
  you a real number to start from.


  The DETAIL part of the message shows my current settings (which are the
  defaults) and what the FSM is costing me in shared memory space.

 Good luck
 Ron




 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Help with performance problems

2004-04-23 Thread scott.marlowe
On Fri, 23 Apr 2004, Chris Hoover wrote:

 On Friday 23 April 2004 13:21, scott.marlowe wrote:
  On Fri, 23 Apr 2004, Chris Hoover wrote:
   DB's on Powervaults 220S using raid 5 (over 6 disks)
 
  What controller is this, the adaptec?  We've found it to be slower than
  the LSI megaraid based controller, but YMMV.
 
 We are using the perc3/di controller.  Believe it is using the megaraid 
 driver.

No, that's the adaptec, the PERC3/DC is the lsi megaraid.  See if there 
are newer drivers for the RAID card.  In terms of performance, the adaptec 
and lsi drivers have improved considerably in later versions.  In terms of 
stability they've largely gotten better with a few in between releases on 
the megaraid getting poor grades.  The latest / greatest from Dell is 
pretty up to date.

   Running RH ES 2.1
 
  Are you running the latest kernel for ES 2.1?  Early 2.4 kernels are
  pretty pokey and have some odd behaviour under load that later 2.4
  kernels seemed to fix.
 
 I'm not sure we are at the latest and greatest for 2.1, but I am trying to get 
 there.  Management won't let me do the upgrade w/o first testing/proving it 
 will not cause any more issues.  Due to all of the current issues, and the 
 criticality of these systems to our bottom line, they are being very careful 
 with any change that may impact our users further.

Understood.  It's why my production box is still running a 2.4 kernel on 
rh 7.2 with pg 7.2.  They just work, but for us stability AND performance 
are both good with our load.

You can install a new kernel and set up the machine to still boot off of 
the old one, and test on the weekend to see how it behaves under 
simulated load.  Mining the logs for slow queries is a good way to build 
one.

while we don't upgrade our production server's applications to the latest 
and greatest all the time (i.e. php or postgresql or openldap) we always 
run the latest security patches, and I think the latest kernels had 
security fixes for ES 2.1, so NOT upgrading it dangerous.  Late model 
linux kernels (the 2.0.x and 2.2.x where x20) tend to be VERY stable and 
very conservatively backported and upgraded, so running a new one isn't 
usually a big risk.

   Here is the postgresql.conf from the server with the 11GB db:
  
   max_connections = 64
   shared_buffers = 32768# 256MB=32768(buffs)*8192(bytes/buff)
   max_fsm_relations = 1000  # min 10, fsm is free space map, ~40 bytes
   max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes
 
  IF you're doing lots of updates and such, you might want these higher.
  Have you vacuumed full the databases since taking over?
 
   sort_mem = 4096   # 256MB=4096(bytes/proc)*64(procs or conns)
 
  Sorry, that's wrong.  sort_mem is measure in kbytes.  i.e. 8192 means 8
  megs sort_mem.  Try setting it a bit higher (you've got LOTS of ram in
  these boxes) to something like 16 or 32 meg.
 
   checkpoint_segments = 16  # in logfile segments, min 1, 16MB each
   checkpoint_timeout = 30   # range 30-3600, in seconds
   effective_cache_size = 131072 # typically 8KB each
 
  This still looks low.  On one machine you're showing kernel cache of about
  .7 gig, on the other it's 6 gig.  6 gigs of kernel cache would be a
  setting of 80.  It's more of a nudge factor than an exact science, so
  don't worry too much.
 I believe changing this requires a restart of the cluster (correct?).  If so, 
 I'll try bumping up the effective_cache_size over the weekend.
 
 Also, will all of the memory available to these machines, should I be running 
 with larger shared_buffers?  It seems like 256M is a bit small.

No, you probably shouldn't.  PostgreSQL doesn't cache in the classical 
sense.   If all backends close, the stuff they had in their buffers 
disappears in a flash.  So, it's generally considered better to let the 
kernel do the bulk of the caching, and having the buffer area be large 
enough to hold a large portion, if not all, of your working set of data.  
But between the cache management which is dirt simple and works but seems 
to have performance issues with large numbers of buffers, and the fact 
that all the memory in it disappears when the last backend using it.

for instance, in doing the following seq scan select:

explain analyze select * from test;

where test is a ~10 megabyte table, the first time I ran it it took 5 
seconds to run.  The second time took it 2.5, the third 1.9, and it 
levelled out around there.  Starting up another backend and running the 
same query got a 1.9 second response also.  Shutting down both 
connections, and running the query again, with only the kernel for 
caching, I got 1.9.

That's on a 2.4.2[2-4] kernel.

  If you've got fast I/O look at lowering random page cost to something
  between 1 and 2.  We use 1.3 to 1.4 on most of our machines with fast
  drives under them.
 
  I'd use vmstat to see if you're I/O bound.
 
 If we end up being I/O bound, should the 

Re: [PERFORM] Help with performance problems

2004-04-23 Thread Ron St-Pierre
Chris Hoover wrote:

On Friday 23 April 2004 14:57, Ron St-Pierre wrote:
Does this apply to 7.3.4 also?
No it doesn't, I didn't look back through the thread far enough to see 
what you were running. I tried it on 7.3.4 and none of the summary info 
listed below was returned.  FWIW one of our DBs was slowing down 
considerably on an update (30+ minutes) and after I changed 
max_fsm_pages from the 7.4 default of 20,000 to 50,000, it completed in 
about eight minutes.

Ron

 

Actually, since he's running 7.4, there's an even better way.  Do a
VACUUM VERBOSE (full-database vacuum --- doesn't matter whether you
ANALYZE or not).  At the end of the very voluminous output, you'll see
something like
INFO:  free space map: 240 relations, 490 pages stored; 4080 total pages
needed DETAIL:  Allocated FSM size: 1000 relations + 2 pages = 178 kB
shared memory.
Here, I would need max_fsm_relations = 240 and max_fsm_pages = 4080 to
exactly cover the present freespace needs of my system.  I concur with
the suggestion to bump that up a good deal, of course, but that gives
you a real number to start from.
The DETAIL part of the message shows my current settings (which are the
defaults) and what the FSM is costing me in shared memory space.
Good luck
Ron


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
   



 



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Setting Shared Buffers , Effective Cache, Sort Mem Parameters

2004-04-23 Thread Manfred Koizar
On Fri, 23 Apr 2004 10:20:10 -0400, Pallav Kalva [EMAIL PROTECTED] wrote:
 the database sizes is around 2- 4 gig and 
there are 5 of them. this machine is
 mainly for the databases and nothing is running on them.

Did I understand correctly that you run (or plan to run) five
postmasters?  Is there a special reason that you cannot put all your
tables into one database?

setting shared buffers to 1 allocates (81Mb)  and effective 
cache to 40 would be around (3gig)
does this means that if all of the 81mb of the shared memory gets 
allocated it will use rest from the effective
cache of (3g-81mb) ?

Simply said, if Postgres wants to access a block, it first looks whether
this block is already in shared buffers which should be the case, if the
block is one of the last 1 blocks accessed.  Otherwise the block has
to be read in.  If the OS has the block in its cache, reading it is just
a (fast) memory operation, else it involves a (slow) physical disk read.

The number of database pages residing in the OS cache is totally out of
control of Postgres.  Effective_cache_size tells the query planner how
many database pages can be *expected* to be present in the OS cache.

increasing the shared buffers space to 2g

Setting shared_buffers to half your available memory is the worst thing
you can do.  You would end up caching exactly the same set of blocks in
the internal buffers and in the OS cache, thus effectively making one of
the caches useless.

Better keep shared_buffers low and let the OS do its job.

Servus
 Manfred

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] order by index, and inheritance

2004-04-23 Thread Rod Taylor
On Thu, 2004-04-22 at 07:02, Michiel Meeuwissen wrote:
 Rod Taylor [EMAIL PROTECTED] wrote:
  The scan is picking the best method for grabbing everything within the
  table, since it is not aware that we do not require everything.
 
 Hmm. That is a bit silly. Why does it use the index if select only from
 mm_mediasources?
 
  You can explicitly tell it what you want to do via:
  
  SELECT *
FROM (SELECT * FROM mm_mediasources ORDER BY number DESC LIMIT 20 
UNION SELECT * FROM subtable ORDER BY number DESC LIMIT 20) AS tab
   ORDER BY number DESC LIMIT 20
 
 I think you meant 'only mm_mediasources', and btw order by and limit are not
 accepted before union, so the above query does not compile.

Yes, I did mean only. Try putting another set of brackets around the
selects to get ORDER BY, etc. accepted. You can add another layer of
subselects in the from if that doesn't work.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] order by index, and inheritance

2004-04-23 Thread Rod Taylor

 This indeeds performs good (about 1 times faster then select number,url
 from mm_mediasources order by number desc limit 20) . But hardly beautiful,
 and quite useless too because of course I am now going to want to use an
 offset (limit 20 offset 20, you see..), which seems more or less impossible
 in this way, isn't it.

Yes, and the offset is a good reason why PostgreSQL will not be able to
do it by itself either.

Is number unique across the board?

If so, instead of the offset you could use WHERE number  $lastValue.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html