[PERFORM] When to vacuum a table?

2006-11-26 Thread Joost Kraaijeveld
Hi,

Are there guidelines (or any empirical data) available how to determine
how often a table should be vacuumed for optimum performance or is this
an experience / trial-and-error thing?

TIA 

-- 
Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Marcelo Costa

Hi,

From: http://www.postgresql.org/docs/7.4/interactive/sql-vacuum.html


VACUUM reclaims storage occupied by deleted tuples. In normal
PostgreSQLoperation, tuples that are deleted or obsoleted by an update
are not
physically removed from their table; they remain present until a VACUUM is
done. Therefore it's necessary to do VACUUM periodically, especially on
frequently-updated tables.

The vacuum analyze form additionally collects statistics on the
disbursion of columns in the database, which the optimizer uses when it
calculates just how to execute queries. The availability of this data can
make a tremendous difference in the execution speed of queries. This command
can also be run from cron, but it probably makes more sense to run this
command as part of your nightly backup procedure - if vacuum is going to
screw up the database, you'd prefer it to happen immediately after (not
before!) you've made a backup! The vacuum command is very reliable, but
conservatism is the key to good system management. So, if you're using the
export procedure described above, you don't need to do this extra step.

All its tables constantly manipulated (INSERT, UPDATE, DELETE) they need a
VACUUM, therefore the necessity to execute at least one time to the day
normally of dawn if its database will be very great .

[],s

Marcelo Costa
Secretaria Executiva de Educação do Pará
Amazonia - Pará - Brazil

2006/11/26, Joost Kraaijeveld [EMAIL PROTECTED]:


Hi,

Are there guidelines (or any empirical data) available how to determine
how often a table should be vacuumed for optimum performance or is this
an experience / trial-and-error thing?

TIA

--
Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly





--
Marcelo Costa


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Steinar H. Gunderson
On Sun, Nov 26, 2006 at 09:43:11AM -0300, Marcelo Costa wrote:
 All its tables constantly manipulated (INSERT, UPDATE, DELETE) they need a
 VACUUM

Just a minor clarification here: INSERT does not create dead rows, only
UPDATE and DELETE do. Thus, if you only insert rows, you do not need to
vacuum (although you probably need to analyze).

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Marcelo Costa

Sorry,

realy you are correct.

[],s

Marcelo Costa.

2006/11/26, Steinar H. Gunderson [EMAIL PROTECTED]:


On Sun, Nov 26, 2006 at 09:43:11AM -0300, Marcelo Costa wrote:
 All its tables constantly manipulated (INSERT, UPDATE, DELETE) they need
a
 VACUUM

Just a minor clarification here: INSERT does not create dead rows, only
UPDATE and DELETE do. Thus, if you only insert rows, you do not need to
vacuum (although you probably need to analyze).

/* Steinar */
--
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly





--
Marcelo Costa


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Rod Taylor


On 26-Nov-06, at 8:11 AM, Steinar H. Gunderson wrote:


On Sun, Nov 26, 2006 at 09:43:11AM -0300, Marcelo Costa wrote:
All its tables constantly manipulated (INSERT, UPDATE, DELETE)  
they need a

VACUUM


Just a minor clarification here: INSERT does not create dead rows,  
only
UPDATE and DELETE do. Thus, if you only insert rows, you do not  
need to

vacuum (although you probably need to analyze).


Not entirely true. An insert  rollback will create dead rows. If you  
attempt and fail a large number of insert transactions then you will  
still need to vacuum.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Andrew Sullivan
On Sun, Nov 26, 2006 at 09:24:29AM -0500, Rod Taylor wrote:
 attempt and fail a large number of insert transactions then you will  
 still need to vacuum.

And you still need to vacuum an insert-only table sometimes, because
of the system-wide vacuum requirement.

A


-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Craig A. James

Rod Taylor wrote:

Just a minor clarification here: INSERT does not create dead rows, only
UPDATE and DELETE do. Thus, if you only insert rows, you do not need to
vacuum (although you probably need to analyze).


Is there no real-time garbage collection at all in Postgres?  And if so, is 
this because nobody has had time to implement garbage collection, or for a more 
fundamental reason, or because VACUUM is seen as sufficient?

I'm just curious ... Vacuum has always seemed to me like an ugly wart on the pretty face of Postgres.  (I say this even though I implemented an identical solution on a non-relational chemistry database system a long time ago.  I didn't like it then, either.) 


Craig

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Tom Lane
Craig A. James [EMAIL PROTECTED] writes:
 Is there no real-time garbage collection at all in Postgres?

No.

 And if so, is this because nobody has had time to implement garbage
 collection, or for a more fundamental reason, or because VACUUM is
 seen as sufficient?

If you really want to know, read the mountains of (mostly) junk that
have been written about replacing VACUUM in pgsql-hackers.  The short
answer (with apologies to Winston Churchill) is that VACUUM is the worst
solution, except for all the others that have been suggested.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Joshua D. Drake



If you really want to know, read the mountains of (mostly) junk that
have been written about replacing VACUUM in pgsql-hackers.  The short
answer (with apologies to Winston Churchill) is that VACUUM is the worst
solution, except for all the others that have been suggested.


The lesser of 50 evils? ;)

Joshua D. Drake





regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly




---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] shared_buffers 284263 on OS X

2006-11-26 Thread Jim C. Nasby
On Sat, Nov 18, 2006 at 08:13:26PM -0700, Brian Wipf wrote:
 It certainly is unfortunate if Guido's right and this is an upper  
 limit for OS X. The performance benefit of having high shared_buffers  
 on our mostly read database is remarkable.

Got any data about that you can share? People have been wondering about
cases where drastically increasing shared_buffers makes a difference.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] availability of SATA vendors

2006-11-26 Thread Jim C. Nasby
On Wed, Nov 22, 2006 at 09:02:04AM -0800, Jeff Frost wrote:
 A valid question.  Does the caching raid controller negate the desire to 
 separate pg_xlog from PGDATA?

Theoretically, yes. But I don't think I've seen any hard numbers from
testing.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] availability of SATA vendors

2006-11-26 Thread Jim C. Nasby
On Wed, Nov 22, 2006 at 04:35:37PM -0500, Bucky Jordan wrote:
 While I'm at it, if I have time I'll run pgbench with pg_log on a
 separate RAID1, and one with it on a RAID10x6, but I don't know how
 useful those results will be.

Very, but only if the controller has write-caching enabled. For testing
purposes it won't batter if it's actually got a BBU so long as the write
cache works (of course you wouldn't run in production like that...)
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Postgres server crash

2006-11-26 Thread Jim C. Nasby
On Sat, Nov 18, 2006 at 05:28:46PM -0800, Richard Troy wrote:
 soapbox ...I read a large number of articles on this subject and am
 absolutely dumbfounded by the -ahem- idiots who think killing a random
 process is an appropriate action. I'm just taking their word for it that
 there's some kind of impossibility of the existing Linux kernel not
 getting itself into a potentially hung situation because it didn't save
 itself any memory. Frankly, if it takes a complete kernel rewrite to fix
 the problem that the damned operating system can't manage its own needs,
 then the kernel needs to be rewritten! /soapbox
 
 These kernel hackers could learn something from VAX/VMS.

What's interesting is that apparently FreeBSD also has overcommit (and
IIRC no way to disable it), yet I never hear people going off on OOM
kills in FreeBSD. My theory is that FreeBSD admins are smart enough to
dedicate a decent amount of swap space, so that by the time you got to
an OOM kill situation you'd be so far into swapping that the box would
be nearly unusable. Many linux 'admins' think it's ok to save a few GB
of disk space by allocating a small amount of swap (or none at all), and
*kaboom*.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Jim C. Nasby
On Sun, Nov 26, 2006 at 12:24:17PM +0100, Joost Kraaijeveld wrote:
 Hi,
 
 Are there guidelines (or any empirical data) available how to determine
 how often a table should be vacuumed for optimum performance or is this
 an experience / trial-and-error thing?

Most of the time I just turn autovac on, set the scale factors to
0.2/0.1 and the thresholds to 300/200 and turn on vacuum_cost_delay
(usually set to 20). That's a pretty decent setup for most applications.
It also doesn't hurt to run a periodic vacuumdb -av and look at the tail
end of it's output to make sure you have adequate FSM settings.

The exception to that rule is for tables that are very small and have a
lot of churn; I'll vacuum those by hand very frequently (every 60
seconds or better).
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] shared_buffers 284263 on OS X

2006-11-26 Thread Brendan Duddridge
I think the main issue is that we can't seem to get PostgreSQL  
compiled for 64 bit on OS X on an Xserve G5. Has anyone done that?


We have 8 GB of RAM on that server, but we can't seem to utilize it  
all. At least not for the shared_buffers setting.


Thanks,


Brendan Duddridge | CTO | 403-277-5591 x24 |  [EMAIL PROTECTED]

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Nov 26, 2006, at 4:25 PM, Jim C. Nasby wrote:


On Sat, Nov 18, 2006 at 08:13:26PM -0700, Brian Wipf wrote:

It certainly is unfortunate if Guido's right and this is an upper
limit for OS X. The performance benefit of having high shared_buffers
on our mostly read database is remarkable.


Got any data about that you can share? People have been wondering  
about

cases where drastically increasing shared_buffers makes a difference.
--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of  
broadcast)---

TIP 6: explain analyze is your friend





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] shared_buffers 284263 on OS X

2006-11-26 Thread Tom Lane
Brendan Duddridge [EMAIL PROTECTED] writes:
 I think the main issue is that we can't seem to get PostgreSQL  
 compiled for 64 bit on OS X on an Xserve G5. Has anyone done that?

There is no obvious reason why it would not work, and if anyone has
tried and failed, they've not bothered to provide details on any PG list
I read ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] shared_buffers 284263 on OS X

2006-11-26 Thread Brian Wipf

On 27-Nov-06, at 4:04 AM, Tom Lane wrote:

Brendan Duddridge [EMAIL PROTECTED] writes:

I think the main issue is that we can't seem to get PostgreSQL
compiled for 64 bit on OS X on an Xserve G5. Has anyone done that?


There is no obvious reason why it would not work, and if anyone has
tried and failed, they've not bothered to provide details on any PG  
list

I read ...


I'll post details of the problems I've had compiling for 64-bit on OS  
X Tiger to the pgsql-ports when I get a chance.


Brian Wipf


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] shared_buffers 284263 on OS X

2006-11-26 Thread Guido Neitzer

Am 27.11.2006 um 00:25 schrieb Jim C. Nasby:

Got any data about that you can share? People have been wondering  
about

cases where drastically increasing shared_buffers makes a difference.


I have tried to compile PostgreSQL as a 64Bit application on my G5  
but wasn't successful. But I must admit, that I'm not a C programmer  
at all. I know enough to work with source packages and configure /  
make but not enough to work with the errors I got from the compile.  
And as I'm extremely busy right now, I can't follow the trail and  
learn more about it.


Perhaps someone with more knowledge can take a look at it.

cug

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] shared_buffers 284263 on OS X

2006-11-26 Thread Brian Wipf

On 26-Nov-06, at 11:25 PM, Jim C. Nasby wrote:

On Sat, Nov 18, 2006 at 08:13:26PM -0700, Brian Wipf wrote:

It certainly is unfortunate if Guido's right and this is an upper
limit for OS X. The performance benefit of having high shared_buffers
on our mostly read database is remarkable.


Got any data about that you can share? People have been wondering  
about

cases where drastically increasing shared_buffers makes a difference.


Unfortunately, there are more differences than just the  
shared_buffers setting in production right now; it's a completely  
different set up, so the numbers I have to compare against aren't  
particularly useful.


When I get the chance, I will try to post data that shows the benefit  
of having a higher value of shared_buffers for our usage pattern  
(with all other settings being constant -- well, except maybe  
effective_cache_size). Basically, in our current configuration, we  
can cache all of the data we care about 99% of the time in about 3GB  
of shared_buffers. Having shared_buffers set to 512MB as it was  
originally, we were needlessly going to disk all of the time.


Brian Wipf


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] shared_buffers 284263 on OS X

2006-11-26 Thread Guido Neitzer

Am 27.11.2006 um 08:04 schrieb Guido Neitzer:

But, be aware of another thing here: As far as I have read about 64  
Bit applications on G5, these apps are definitely slower than their  
32 bit counterparts (I'm currently on the train so I can't be more  
precise here without Google ...). Was it something with not enough  
registers in the CPU? Something like that ... So it might be, that  
the 64 bit version is able to use more shared memory but is slower  
than the 32 bit version and you come out with the same performance.  
Nobody knows ...


Some information about that:

http://www.geekpatrol.ca/2006/09/32-bit-vs-64-bit-performance/

So, the impact doesn't seem to high. So it seems to depend on the  
usage pattern whether the 32 bit with less RAM and slightly higher  
performance might be faster than 64 bit with more shared memory and  
slightly lower performance.


cug


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate