Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-18 Thread Larry Rosenman
Jim C. Nasby wrote:
> On Sun, Oct 15, 2006 at 04:52:12PM +0200, Tobias Brox wrote:
>> Are there any logs that can help me, and eventually, are there any
>> ready-made scripts for checking when autovacuum is running, and
>> eventually for how long it keeps its transactions?  I'll probably
>> write up something myself if not.
> 
> 8.2 adds some stats on when autovac last ran, per-table. I don't
> remember if it reports how long it took to vacuum the table, but that
> would probably be useful info.  

It does NOT.  It's just the timestamp of the END of the vacuum / analyze.

(I'm the author of the patch).


-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-18 Thread Jim C. Nasby
On Sun, Oct 15, 2006 at 04:52:12PM +0200, Tobias Brox wrote:
> Are there any logs that can help me, and eventually, are there any
> ready-made scripts for checking when autovacuum is running, and
> eventually for how long it keeps its transactions?  I'll probably write
> up something myself if not.

8.2 adds some stats on when autovac last ran, per-table. I don't
remember if it reports how long it took to vacuum the table, but that
would probably be useful info.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-15 Thread Tobias Brox
[Matthew T. O'Connor - Sun at 10:42:34AM -0400]
> Yeah, I think if the delay settings are too high it can cause problems, 
> that's part of the reason we have yet to turn these on be default since 
> we won't have enough data to suggest good values.  Can you tell us what 
> settings you finally settled on?

I'm still not yet settled, and the project manager is critical to
autovacuum (adds complexity, no obvious benefits from it, we see from
the CPU graphs that it's causing iowait, iowait is bad).  We're going to
run autovacuum biweekly now to see what effect it has on the server
load.

I've been using the cost/delay-setting of 200/200 for a week now, and
I'm going to continue with 100/150 for a while.  

Are there any known disadvantages of lowering both values to the extreme
- say, 20/20 instead of 200/200?  That would efficiently mean "sleep as
often as possible, and sleep for 1 ms for each cost unit spent" if
I've understood the system right.

Are there any logs that can help me, and eventually, are there any
ready-made scripts for checking when autovacuum is running, and
eventually for how long it keeps its transactions?  I'll probably write
up something myself if not.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-15 Thread Matthew T. O'Connor

Tobias Brox wrote:

[Matthew T. O'Connor - Wed at 02:33:10PM -0400]
  
In addition autovacuum respects the work of manual or cron based 
vacuums, so if you issue a vacuum right after a daily batch insert / 
update, autovacuum won't repeat the work of that manual vacuum.



I was experimenting a bit with autovacuum now.  To make the least effect
possible, I started with a too high cost_delay/cost_limit-ratio.  The
effect of this was that autovacuum "never" finished the transactions it
started with, and this was actually causing the nightly vacuum to not do
it's job good enough.


Yeah, I think if the delay settings are too high it can cause problems, 
that's part of the reason we have yet to turn these on be default since 
we won't have enough data to suggest good values.  Can you tell us what 
settings you finally settled on?


BTW hopefully for 8.3 we are going to add the concept of maintenance 
windows to autovacuum, during these periods you can lower the thresholds 
and perhaps even change the delay settings to make autovacuum more 
aggressive during the maintenance window.  This hopefully will just 
about eliminate the need for nightly cron based vacuum runs.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-15 Thread Tobias Brox
[Matthew T. O'Connor - Wed at 02:33:10PM -0400]
> In addition autovacuum respects the work of manual or cron based 
> vacuums, so if you issue a vacuum right after a daily batch insert / 
> update, autovacuum won't repeat the work of that manual vacuum.

I was experimenting a bit with autovacuum now.  To make the least effect
possible, I started with a too high cost_delay/cost_limit-ratio.  The
effect of this was that autovacuum "never" finished the transactions it
started with, and this was actually causing the nightly vacuum to not do
it's job good enough.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-28 Thread Edoardo Ceccarelli






  
True,
but autovacuum could check load -before- and -during- it's execution and 
it could adjust himself automatically to perform more or less 
aggressively depending on the difference between those two values.
Maybe with a parameter like: maximum-autovacuum-load=0.2
that would mean: "never load the machine more than 20% for the autovacuum"

  
  
This is pretty non-trivial.  How do you define 20% load?  20% of the
CPU?  Does that mean that it's OK for autovac to use 3% cpu and 100% of
your IO?  Ok, so we need to calculate an average of IO and CPU -- which
disks?  If your WAL logs are on one disk, and you've used tablespaces
to spread the rest of your DB across different partitions, it can
be pretty difficult to determine which IO parameters you want to take
into consideration.

  

As I said before, it could be done, the main requirement is to find a
way for pg to check for a value of the system load; of course it has to
be an average value between disk and cpu, of course the daemon would
have to collect sample of this values continuously, and of course
everything would be better if the server it's only running Postgres
Still I think you are right, it wouldn't suit exactly every situations
but it could be an "emergency" feature:
What happened to me was very clear: 
server running pg8 under heavy load, cpu's were 90% idle as usual.
At some point the vacuum started, the server reached 50 of overall load
and cpu's were 1% idle
I think any test can detect such a situation, regardles if load it's
more I/O based or CPU based









Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Bill Moran
In response to Edoardo Ceccarelli <[EMAIL PROTECTED]>:

> Rod Taylor wrote:
> > On Wed, 2006-09-27 at 18:08 +0200, Edoardo Ceccarelli wrote:
> >   
> >> I have read that autovacuum cannot check to see pg load before
> >> launching 
> >> vacuum but is there any patch about it? that would sort out the
> >> problem 
> >> in a good and simple way. 
> >> 
> >
> > In some cases the solution to high load is to vacuum the tables being
> > hit the heaviest -- meaning that simply checking machine load isn't
> > enough to make that decision.
> >
> > In fact, that high load problem is exactly why autovacuum was created in
> > the first place.
> >   
> True,
> but autovacuum could check load -before- and -during- it's execution and 
> it could adjust himself automatically to perform more or less 
> aggressively depending on the difference between those two values.
> Maybe with a parameter like: maximum-autovacuum-load=0.2
> that would mean: "never load the machine more than 20% for the autovacuum"

This is pretty non-trivial.  How do you define 20% load?  20% of the
CPU?  Does that mean that it's OK for autovac to use 3% cpu and 100% of
your IO?  Ok, so we need to calculate an average of IO and CPU -- which
disks?  If your WAL logs are on one disk, and you've used tablespaces
to spread the rest of your DB across different partitions, it can
be pretty difficult to determine which IO parameters you want to take
into consideration.

-- 
Bill Moran
Collaborative Fusion Inc.


IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Matthew T. O'Connor

Csaba Nagy wrote:

On Wed, 2006-09-27 at 18:08, Edoardo Ceccarelli wrote:
  

How can I configure the vacuum to run after the daily batch insert/update?



Check out this:
http://www.postgresql.org/docs/8.1/static/catalog-pg-autovacuum.html

By inserting the right row you can disable autovacuum to vacuum your big
tables, and then you can schedule vacuum nightly for those just as
before. There's still a benefit in that you don't need to care about
vacuuming the rest of the tables, which will be done just in time.


In addition autovacuum respects the work of manual or cron based 
vacuums, so if you issue a vacuum right after a daily batch insert / 
update, autovacuum won't repeat the work of that manual vacuum.




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Rod Taylor
On Wed, 2006-09-27 at 18:08 +0200, Edoardo Ceccarelli wrote:
> 
> I have read that autovacuum cannot check to see pg load before
> launching 
> vacuum but is there any patch about it? that would sort out the
> problem 
> in a good and simple way. 

In some cases the solution to high load is to vacuum the tables being
hit the heaviest -- meaning that simply checking machine load isn't
enough to make that decision.

In fact, that high load problem is exactly why autovacuum was created in
the first place.
-- 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Tobias Brox
[Edoardo Ceccarelli - Wed at 06:49:23PM +0200]
> ...another thing is, how could autovacuum check for machine load, this 
> is something I cannot imagine right now...

One solution I made for our application, is to check the
pg_stats_activity view.  It requires some config to get the stats
available in that view, though.  When the application is to start a
low-priority transaction, it will first do:

  select count(*) from pg_stat_activity where current_query not like
  '

Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Edoardo Ceccarelli




Rod Taylor wrote:

  On Wed, 2006-09-27 at 18:08 +0200, Edoardo Ceccarelli wrote:
  
  
I have read that autovacuum cannot check to see pg load before
launching 
vacuum but is there any patch about it? that would sort out the
problem 
in a good and simple way. 

  
  
In some cases the solution to high load is to vacuum the tables being
hit the heaviest -- meaning that simply checking machine load isn't
enough to make that decision.

In fact, that high load problem is exactly why autovacuum was created in
the first place.
  

True, 
but autovacuum could check load -before- and -during- it's execution
and it could adjust himself automatically to perform more or less
aggressively depending on the difference between those two values.
Maybe with a parameter like: maximum-autovacuum-load=0.2
that would mean: "never load the machine more than 20% for the
autovacuum"

...another thing is, how could autovacuum check for machine load, this
is something I cannot imagine right now...





Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Edoardo Ceccarelli






Bill Moran wrote:

  In response to Edoardo Ceccarelli <[EMAIL PROTECTED]>:
  


I have read that autovacuum cannot check to see pg load before launching 
vacuum but is there any patch about it? that would sort out the problem 
in a good and simple way.
Otherwise, which kind of set of parameters I should put in autovacuum 
configuration? I am stuck because in our case the table gets mostly read 
and if I set up things as to vacuum the table after a specific amount of 
insert/updates, I cannot foresee whether this could happen during 
daytime when server is under high load.
How can I configure the vacuum to run after the daily batch insert/update?

  
  
It doesn't sound as if your setup is a good match for autovacuum.  You
might be better off going back to the cron vacuums.  That's the
beauty of Postgres -- it gives you the choice.

If you want to continue with autovac, you may want to experiment with
vacuum_cost_delay and associated parameters, which can lessen the
impact of vacuuming.

  

The db is constantly monitored during high peak so that we can switch
to a backup pg7.3 database that is being vacuumed every night.
This is giving me the opportunity to try it so I tried this:

vacuum_cost_delay = 200
vacuum_cost_page_hit = 5
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 100

I know these values affect the normal vacuum process but apparently
this means setting
#autovacuum_vacuum_cost_delay = -1  # default vacuum cost delay for
    # autovac, -1 means use
    # vacuum_cost_delay

and 
#autovacuum_vacuum_cost_limit = -1  # default vacuum cost limit for
    # autovac, -1 means use
    # vacuum_cost_limit


for the rest of them I am currently trying the deafults:

#autovacuum_naptime = 60    # time between autovacuum runs,
in secs
#autovacuum_vacuum_threshold = 1000 # min # of tuple updates before
vacuum
#autovacuum_analyze_threshold = 500 # min # of tuple updates before
analyze
#autovacuum_vacuum_scale_factor = 0.4   # fraction of rel size before
vacuum
#autovacuum_analyze_scale_factor = 0.2  # fraction of rel size before
analyze

Does anybody know which process is actually AUTO-vacuum-ing the db? 
So that I can check when is running...








Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Bill Moran
In response to Edoardo Ceccarelli <[EMAIL PROTECTED]>:

> Hello,
> 
> we are running a 7.3 postgres db with only a big table (avg 
> 500.000records) and 7 indexes for a search engine.
> we have 2 of this databases and we can switch from one to another.
> Last week we decided to give a try to 8.1 on one of them and everything 
> went fine, db is faster (about 2 or 3 times in our case) and the server 
> load is higher - which should mean that faster response time is achieved 
> by taking a better use of the server.
> 
> We also activated the autovacuum feature to give it a try and that's 
> were our problems started.
> I left the standard autovacuum configuration just to wait and see, pg 
> decided to start a vacuum on the table just midday when users were 
> launching search queries on the table and server load reached a very 
> high value so that in a couple of minutes the db was unusable
> 
> With pg7.3 we use to vacuum the db night time, mostly because the insert 
> and updates in this table is made in a batch way: a single task that 
> puts 100.000 records in the db in 10/20minutes, so the best time to 
> actually vacuum the db would be after this batch.
> 
> I have read that autovacuum cannot check to see pg load before launching 
> vacuum but is there any patch about it? that would sort out the problem 
> in a good and simple way.
> Otherwise, which kind of set of parameters I should put in autovacuum 
> configuration? I am stuck because in our case the table gets mostly read 
> and if I set up things as to vacuum the table after a specific amount of 
> insert/updates, I cannot foresee whether this could happen during 
> daytime when server is under high load.
> How can I configure the vacuum to run after the daily batch insert/update?

It doesn't sound as if your setup is a good match for autovacuum.  You
might be better off going back to the cron vacuums.  That's the
beauty of Postgres -- it gives you the choice.

If you want to continue with autovac, you may want to experiment with
vacuum_cost_delay and associated parameters, which can lessen the
impact of vacuuming.

-- 
Bill Moran
Collaborative Fusion Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Csaba Nagy
On Wed, 2006-09-27 at 18:08, Edoardo Ceccarelli wrote:
> How can I configure the vacuum to run after the daily batch insert/update?

Check out this:
http://www.postgresql.org/docs/8.1/static/catalog-pg-autovacuum.html

By inserting the right row you can disable autovacuum to vacuum your big
tables, and then you can schedule vacuum nightly for those just as
before. There's still a benefit in that you don't need to care about
vacuuming the rest of the tables, which will be done just in time.

Cheers,
Csaba.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Tobias Brox
[Edoardo Ceccarelli - Wed at 06:08:30PM +0200]
> We also activated the autovacuum feature to give it a try and that's 
> were our problems started.
(...)
> How can I configure the vacuum to run after the daily batch insert/update?

I think you shouldn't use autovacuum in your case.

We haven't dared testing out autovacuum yet even though we probably
should, so we're running vacuum at fixed times of the day.  We have a
very simple script to do this, the most important part of it reads:

echo "vacuum verbose analyze;" | psql $DB_NAME > $logdir/$filename 2>&1


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Alan Hodgson
On Wednesday 27 September 2006 09:08, Edoardo Ceccarelli <[EMAIL PROTECTED]> 
wrote:
>
> How can I configure the vacuum to run after the daily batch
> insert/update?
>

If you really only want it to run then, you should disable autovacuum and 
continue to run the vacuum manually.

You might also investigate the vacuum cost delay options, which will make 
vacuum take longer but will have less of an impact on your database while 
running.


-- 
"If a nation values anything more than freedom, it will lose its freedom;
and the irony of it is that if it is comfort or money that it values more,
it will lose that too." -- Somerset Maugham, Author


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] autovacuum on a -mostly- r/o table

2006-09-27 Thread Edoardo Ceccarelli

Hello,

we are running a 7.3 postgres db with only a big table (avg 
500.000records) and 7 indexes for a search engine.

we have 2 of this databases and we can switch from one to another.
Last week we decided to give a try to 8.1 on one of them and everything 
went fine, db is faster (about 2 or 3 times in our case) and the server 
load is higher - which should mean that faster response time is achieved 
by taking a better use of the server.


We also activated the autovacuum feature to give it a try and that's 
were our problems started.
I left the standard autovacuum configuration just to wait and see, pg 
decided to start a vacuum on the table just midday when users were 
launching search queries on the table and server load reached a very 
high value so that in a couple of minutes the db was unusable


With pg7.3 we use to vacuum the db night time, mostly because the insert 
and updates in this table is made in a batch way: a single task that 
puts 100.000 records in the db in 10/20minutes, so the best time to 
actually vacuum the db would be after this batch.


I have read that autovacuum cannot check to see pg load before launching 
vacuum but is there any patch about it? that would sort out the problem 
in a good and simple way.
Otherwise, which kind of set of parameters I should put in autovacuum 
configuration? I am stuck because in our case the table gets mostly read 
and if I set up things as to vacuum the table after a specific amount of 
insert/updates, I cannot foresee whether this could happen during 
daytime when server is under high load.

How can I configure the vacuum to run after the daily batch insert/update?

Any help appreciated
Thank you very much
Edoardo



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster