Re: [PATCHES] Autovacuum integration patch

2005-07-05 Thread Matthew T. O'Connor

Tom Lane wrote:


Matthew T. O'Connor matthew@zeut.net writes:
 


The current implementation of XID wraparound requires that the vacuum
command be run against the entire database, you can not run it on a per
table basis and have it work.  At least that is my understanding,
   



No, you're wrong.  VACUUMing of individual tables is perfectly good
enough as far as XID wrap protection goes, it's just that we chose to
track whether it had been done at the database level.  If we tracked it
in, say, a new pg_class column then in principle you could protect
against XID wrap with only table-at-a-time VACUUMs.  (I think you'd
still want the pg_database column, but you'd update it to be the minimum
of the per-table values at the completion of any VACUUM.)

At the time this didn't seem particularly worth the complication since
no one would be likely to try to do that manually --- but with
autovacuum handling the work, it starts to sound more realistic.
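
To illustrate the bookkeeping described in the quoted text above, here is a
minimal sketch in Python rather than backend C.  It assumes a hypothetical
per-table "last vacuum XID" value and takes the database-level value to be
the oldest of the per-table values.  The function names, the table names and
the max_age cutoff are invented for illustration and are not part of any
patch in this thread.

```python
from typing import Dict, List

XID_SPACE = 2 ** 32  # transaction IDs are 32-bit counters that wrap around


def xid_age(current_xid: int, xid: int) -> int:
    """Number of transactions between xid and current_xid, modulo wraparound."""
    return (current_xid - xid) % XID_SPACE


def database_vacuum_xid(current_xid: int, table_xids: Dict[str, int]) -> int:
    """Pick the oldest per-table value (the 'minimum' in wraparound-aware terms).

    Hypothetical: this is the value one would store at the database level
    at the completion of any single-table VACUUM.
    """
    return max(table_xids.values(), key=lambda xid: xid_age(current_xid, xid))


def tables_needing_vacuum(current_xid: int, table_xids: Dict[str, int],
                          max_age: int) -> List[str]:
    """Tables whose last-vacuum XID is older than max_age (an assumed cutoff)."""
    return sorted(name for name, xid in table_xids.items()
                  if xid_age(current_xid, xid) > max_age)
```

With per-table values, only the tables returned by tables_needing_vacuum()
would need attention, instead of the whole database.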



Good, I'm glad I'm wrong on this.  This will be another nice advantage 
of autovacuum then and should be fairly easy to do.  Any thoughts on 
this being a change we can get in for 8.1?


Matt


---(end of broadcast)---
TIP 6: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] Autovacuum integration patch

2005-07-05 Thread Tom Lane
Matthew T. O'Connor matthew@zeut.net writes:
 Tom Lane wrote:
 No, you're wrong.  VACUUMing of individual tables is perfectly good
 enough as far as XID wrap protection goes, it's just that we chose to
 track whether it had been done at the database level.  If we tracked it
 in, say, a new pg_class column then in principle you could protect
 against XID wrap with only table-at-a-time VACUUMs.

 Good, I'm glad I'm wrong on this.  This will be another nice advantage 
 of autovacuum then and should be fairly easy to do.  Any thoughts on 
 this being a change we can get in for 8.1?

I'd say this is probably a tad too late --- there's a fair amount of
code change that would be needed, none of which has been written, and
we are past the feature-freeze deadline for new code.

regards, tom lane



Re: [PATCHES] Autovacuum integration patch

2005-07-05 Thread Alvaro Herrera
On Tue, Jul 05, 2005 at 01:00:50PM -0400, Tom Lane wrote:
 Matthew T. O'Connor matthew@zeut.net writes:
  Tom Lane wrote:
  No, you're wrong.  VACUUMing of individual tables is perfectly good
  enough as far as XID wrap protection goes, it's just that we chose to
  track whether it had been done at the database level.  If we tracked it
  in, say, a new pg_class column then in principle you could protect
  against XID wrap with only table-at-a-time VACUUMs.
 
  Good, I'm glad I'm wrong on this.  This will be another nice advantage 
  of autovacuum then and should be fairly easy to do.  Any thoughts on 
  this being a change we can get in for 8.1?
 
 I'd say this is probably a tad too late --- there's a fair amount of
 code change that would be needed, none of which has been written, and
 we are past the feature-freeze deadline for new code.

Right.  I've written a small, non-intrusive patch that handles the Xid
wraparound just as pg_autovacuum used to, checking the Xid from
pg_database.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Hay quien adquiere la mala costumbre de ser infeliz (M. A. Evans)



Re: [PATCHES] Autovacuum integration patch

2005-07-05 Thread Bruce Momjian

TODO item?

---

Alvaro Herrera wrote:
 On Tue, Jul 05, 2005 at 01:00:50PM -0400, Tom Lane wrote:
  Matthew T. O'Connor matthew@zeut.net writes:
   Tom Lane wrote:
   No, you're wrong.  VACUUMing of individual tables is perfectly good
   enough as far as XID wrap protection goes, it's just that we chose to
   track whether it had been done at the database level.  If we tracked it
   in, say, a new pg_class column then in principle you could protect
   against XID wrap with only table-at-a-time VACUUMs.
  
   Good, I'm glad I'm wrong on this.  This will be another nice advantage 
   of autovacuum then and should be fairly easy to do.  Any thoughts on 
   this being a change we can get in for 8.1?
  
  I'd say this is probably a tad too late --- there's a fair amount of
  code change that would be needed, none of which has been written, and
  we are past the feature-freeze deadline for new code.
 
 Right.  I've written a small, non-intrusive patch that handles the Xid
 wraparound just as pg_autovacuum used to, checking the Xid from
 pg_database.
 
 -- 
 Alvaro Herrera (alvherre[a]alvh.no-ip.org)
 Hay quien adquiere la mala costumbre de ser infeliz (M. A. Evans)
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PATCHES] Autovacuum integration patch

2005-07-05 Thread Matthew T. O'Connor
I think so.  Something like: Improve autovacuum XID wraparound detection 
by moving to a per-table solution rather than per-database.


Matt


Bruce Momjian wrote:


TODO item?

 






Re: [PATCHES] Autovacuum integration patch

2005-07-04 Thread Matthew T. O'Connor




XID wraparound:  The patch as submitted doesn't handle XID wraparound
issues.  The old contrib autovacuum would do an XID wraparound check as
its first operation upon connecting to a database.  If XID wraparound
looked like it was going to be a problem soon, then the whole database
would be vacuumed, eliminating the need to check specific tables.
   



Hmm.  Yes, this patch doesn't handle Xid wraparound.  This should be
easy to add though.  Anyway, I was thinking that we could add a last
vacuum Xid to pg_autovacuum, and handle Xid wraparound for each table
separately -- this means you don't have to issue huge whole-database
VACUUMs, because it will be handled nicely for each table.  Storing the
last vacuum Xid in pg_database would have to be rethought.
 



The current implementation of XID wraparound requires that the vacuum
command be run against the entire database, you can not run it on a per
table basis and have it work.  At least that is my understanding, it would
require some reworking of the vacuum system and I have no idea what is
involved in that.  For now, we should just do it the simple way.  BTW, I
think this is a candidate for only being done during the maintenance
window.



Maybe what we could do is have a separate pg_vacuum table to hold
constantly-moving information about the tables: last vacuum Xid, count
of tuples at last vacuum/analyze, etc; so pg_autovacuum would only hold
the constants for autovacuum equations.  This pg_vacuum table would be
updated by VACUUM, not autovacuum, so it would be always correct and up-
to-date.
 



I'm not sure I see the value in a new pg_vacuum table.  reltuples already has
the tuple count from the last vacuum and I don't think last XID on a per
table basis is helpful.



Better logging of autovacuum activity:  I think we could use some
more detail in the debug elog statements, for example showing exactly
what autovacuum believes the threshold and current count are.
   



Ok.  I actually had lots more logging in the original patch, but I
removed it because it was too verbose.  Again, it's easy to add.
 



Well, I don't know what is best, but it would be nice to be able to get at
the information that tells you why autovacuum did or did not take action.
Perhaps put back what you had in, but move it up to a higher debug level.
FWIW, I think the debug info from the contrib version was sufficient.




How to deal with shared relations:  As an optimization, the contrib
version of autovacuum treated shared relations differently from
the rest when connected to any database that is not template1.
   



Ah, interesting.  Yes, I think that could be done too.  Very easy to do.
Anyway, the shared relations are not that big usually, so this shouldn't
be an issue.
 



Agreed this is not a big issue, it's a bit of a micro optimization.



Couple of other thoughts:
Do the vacuum commands respect the GUC vacuum delay settings?
   



Huh, I don't know.  I just issue a vacuum() call.  That function sets
the delay settings AFAICS, so I think it should be working.
 



Can someone confirm this?



Should we be able to set per table vacuum delay settings?
   


We could set that in the hypothetical pg_vacuum relation.
 



Again, I don't think this would be good for the pg_vacuum table, I think 
it should be in the autovacuum table, because what a user wants 
autovacuum to do might be different than what he wants a manually run 
vacuum to do.



This patch doesn't have the maintenance window that was discussed a
while ago.
   


True.  I have several questions about it.  Where would that information
be stored, in another system catalog?  Would it be per-database or
per-table?  What happens if I'm not able to do all work inside the
maintenance window, is it left for the next one?  If the maintenance
window ends and there is a vacuum running, is it terminated or is it
allowed to continue?
 



One could argue that it should be per database, but I think per cluster should 
be sufficient.  I think it could be handled as a few GUC settings, such as:
autovac_maint_begin = 1AM
autovac_maint_duration = 4 (measured in hours)
autovac_maint_factor = .5 (reduce the thresholds by half during the maintenance 
window; this option might be good to have on a per-table basis, and if so, it 
should be added to the pg_autovacuum table)

If there is still work to do after the maintenance window expires, then it's 
left for next time or for when the regular threshold is exceeded, whichever 
happens first.  I wouldn't terminate an in-progress vacuum.
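
As a rough illustration of how those three settings could interact, here is
a small Python sketch.  It is only a sketch under assumptions: the settings
are proposals from this message and not existing GUCs, hours are whole
numbers on a 24-hour clock, and the function names are invented.

```python
def in_maintenance_window(hour: int, begin_hour: int = 1,
                          duration_hours: int = 4) -> bool:
    """True if `hour` (0-23) falls inside the window.

    The modulo keeps the check correct for windows that cross midnight,
    e.g. begin_hour=22 with duration_hours=4 covers 22, 23, 0 and 1.
    """
    return (hour - begin_hour) % 24 < duration_hours


def effective_threshold(threshold: float, hour: int, begin_hour: int = 1,
                        duration_hours: int = 4, factor: float = 0.5) -> float:
    """Scale the regular threshold by `factor` while inside the window."""
    if in_maintenance_window(hour, begin_hour, duration_hours):
        return threshold * factor
    return threshold
```

A table with a regular threshold of 5,000 would be considered at 2,500
between 1AM and 5AM and at 5,000 the rest of the day.  Nothing here
interrupts a vacuum that is already running when the window closes.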



There is a very important issue I forgot to mention.  This autovacuum
process only handles databases that exist in the Stats hash table.
However, the stat hash table only has information about databases and
tables that have been used in the current postmaster run.  So if you
don't connect to a database regularly, that database won't get
autovacuumed after a postmaster restart.  I think (but IMBFOS) that
this is also true for individual tables, i.e. 

[PATCHES] Autovacuum integration patch

2005-06-29 Thread Alvaro Herrera
Hackers,

(Resend, like fifth time or so.  bzip2'ing the patch for luck.)

Here is a first cut at autovacuum integration.  Please have a look at
it.  Note that this patch automatically creates three new files:

src/backend/postmaster/autovacuum.c
src/include/catalog/pg_autovacuum.h
src/include/postmaster/autovacuum.h

Note that the daemon is not activated by default.

There are several things that are painfully evident with this thing on:

- TRUNCATE does not update stats.  It should send a stat message to
  which we can react.

- If you empty a whole table using DELETE just after an
  automatically-issued VACUUM takes place, the new threshold may not be
  enough to trigger a new VACUUM.  Thus you end up with a bloated table,
  and it won't get vacuumed until it grows again.  This may be a problem
  with the cost equations, but those are AFAICT identical to those of
  pg_autovacuum, so we may need to rethink the equations.

- The default value of on for reset stats on server start is going to be
  painful with autovacuum, because it reacts badly to losing the info.

- We should make VACUUM and ANALYZE update the pg_autovacuum relation,
  in order to make the autovacuum daemon behave sanely with manually
  issued VACUUM/ANALYZE.

- Having an autovacuum process running on a database can be surprising
  if you want to drop a database, or create a new one using it as a
  template.  This happened to me several times.

- The shutdown sequence is not debugged nor very well tested.  It may be
  all wrong.

- The startup sequence is a mixture from pgarch, normal backend and
  pgstat.  I find it relatively clean but I can't swear it's bug-free.

- There are no docs

- There are no ALTER TABLE commands to change the pg_autovacuum
  attributes for a table. (Enable/disable, set thresholds and scaling
  factor)

- I compiled with -DEXEC_BACKEND, but I didn't look to see if it
  actually worked on that case.

Apart from all these issues, it is completely functional :-)  It can
survive several make installcheck runs without problem, and the
regression database is vacuumed/analyzed as it runs.

Some of these issues are trivial to handle.  However I'd like to release
this right now, so I can go back to shared dependencies now that role
support is in.

Barring any objections I think this should be integrated, so these
issues can be tackled by interested parties.

-- 
Alvaro Herrera (alvherre[a]surnet.cl)
World domination is proceeding according to plan (Andrew Morton)


autovacuum-4.patch.bz2
Description: Binary data



Re: [PATCHES] Autovacuum integration patch

2005-06-29 Thread Matthew T. O'Connor

Alvaro Herrera wrote:


There are several things that are painfully evident with this thing on:

- TRUNCATE does not update stats.  It should send a stat message to
 which we can react.
 



How important is this really?  The stats from before the truncate might 
be ok, especially since they might represent how the table will look in 
the future.  Also, there isn't any free space in a table that was just 
truncated, so there is no need to run vacuum to update the FSM.



- If you empty a whole table using DELETE just after an
 automatically-issued VACUUM takes place, the new threshold may not be
 enough to trigger a new VACUUM.  Thus you end up with a bloated table,
 and it won't get vacuumed until it grows again.  This may be a problem
 with the cost equations, but those are AFAICT identical to those of
 pg_autovacuum, so we may need to rethink the equations.
 



I'm very open to a better equation if someone has one, but I'm not sure 
what the problem is.  If there are 10,000 rows in a table and an 
autovacuum takes place, you will have a threshold of 5,000 (assuming you 
are using the default threshold parameters: base = 1000, scaling factor = 
0.4).  So when all the rows are deleted, that will be enough activity 
to cross the threshold and cause another vacuum.  I guess the problem 
arises if the table is smaller, say 1,000 rows: after a vacuum the 
threshold will be 1,400, and deleting all the rows will not cause a 
vacuum.  But that is OK because a 1,000 row table is probably not very 
big.  The purpose of the base threshold value is to keep vacuum commands 
from being run continually on really small tables that are updated a lot; 
it's OK to have some slack space.  If the default is deemed too high, we 
can always lower it.
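
The arithmetic above can be checked with a short Python sketch.  It assumes
the threshold is computed as base + scaling factor * row count, which matches
the two figures given (5,000 and 1,400), and that a vacuum fires once the
number of deleted rows exceeds the threshold.  The function names are
invented for illustration.

```python
def vacuum_threshold(reltuples: int, base: int = 1000,
                     scaling_factor: float = 0.4) -> float:
    """Threshold after a vacuum that saw `reltuples` rows in the table."""
    return base + scaling_factor * reltuples


def delete_all_triggers_vacuum(reltuples: int, base: int = 1000,
                               scaling_factor: float = 0.4) -> bool:
    """Would deleting every row be enough activity to cross the threshold?

    Assumption: the vacuum fires when activity exceeds the threshold.
    """
    return reltuples > vacuum_threshold(reltuples, base, scaling_factor)


if __name__ == "__main__":
    for rows in (10_000, 1_000):
        print(rows, vacuum_threshold(rows), delete_all_triggers_vacuum(rows))
```

Under these assumptions, solving n > 1000 + 0.4 * n gives a break-even of
roughly 1,667 rows: emptying a table smaller than that right after a vacuum
will not trigger another one with the default parameters.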



- The default value of on for reset stats on server start is going to be
 painful with autovacuum, because it reacts badly to losing the info.
 



I agree, this is an issue.  Is there any reason not to change  
stats_reset_on_server_start to default to false?



- We should make VACUUM and ANALYZE update the pg_autovacuum relation,
 in order to make the autovacuum daemon behave sanely with manually
 issued VACUUM/ANALYZE.
 



Agree completely.  This way autovacuum can work in harmony with manually 
issued or cron-issued vacuum commands.



- Having an autovacuum process running on a database can be surprising
 if you want to drop a database, or create a new one using it as a
 template.  This happened to me several times.
 



Not sure what to do about this.   We could reduce the number of times 
autovacuum actually connects to a database by checking the stats flat 
file before we connect.  If there hasn't been any activity since the 
last time we connected, then don't connect again.  Better ideas anyone?



- The shutdown sequence is not debugged nor very well tested.  It may be
 all wrong.
 



Ok, I'm testing it now, I'll let you know if I see anything funny.


- The startup sequence is a mixture from pgarch, normal backend and
 pgstat.  I find it relatively clean but I can't swear it's bug-free.
 



Same as above.


- There are no docs
 



I can help here as long as I don't have to have the docs done before July 1.


- There are no ALTER TABLE commands to change the pg_autovacuum
 attributes for a table. (Enable/disable, set thresholds and scaling
 factor)
 



I don't think we need this, do we?  Mucking around in the autovacuum 
table shouldn't cause the system any serious problems; if you do mess up 
your values, it's easy to just reset them all to 0 and start back with 
the defaults.



- I compiled with -DEXEC_BACKEND, but I didn't look to see if it
 actually worked on that case.

Apart from all these issues, it is completely functional :-)  It can
survive several make installcheck runs without problem, and the
regression database is vacuumed/analyzed as it runs.
 



Cool.


Some of these issues are trivial to handle.  However I'd like to release
this right now, so I can go back to shared dependencies now that role
support is in.

Barring any objections I think this should be integrated, so these
issues can be tackled by interested parties.



Couple of other thoughts:
Do the vacuum commands respect the GUC vacuum delay settings?
Should we be able to set per table vacuum delay settings?
This patch doesn't have the maintenance window that was discussed a 
while ago.  Can that be added after July 1?


Thanks Alvaro for doing the integration work

Matthew O'Connor





Re: [PATCHES] Autovacuum integration patch

2005-06-29 Thread Matthew T. O'Connor

Alvaro Herrera wrote:


Hackers,

Here is a first cut at autovacuum integration.  Please have a look at
it.  Note that this patch automatically creates three new files:
 



Couple more things that I didn't think about while we were talking about 
this the other day.


XID wraparound:  The patch as submitted doesn't handle XID wraparound 
issues.  The old contrib autovacuum would do an XID wraparound check as 
its first operation upon connecting to a database.  If XID wraparound 
looked like it was going to be a problem soon, then the whole database 
would be vacuumed, eliminating the need to check specific tables.


Better logging of autovacuum activity:  I think we could use some 
more detail in the debug elog statements, for example showing exactly 
what autovacuum believes the threshold and current count are.


How to deal with shared relations:  As an optimization, the contrib 
version of autovacuum treated shared relations differently from 
the rest when connected to any database that is not template1.  That is, 
when connected to a DB other than template1, autovacuum would not issue 
vacuum commands for them; rather, it would only issue analyze commands.  
When autovacuum got around to connecting to template1, it would then issue 
the vacuum command.  The hope was that this would keep a shared 
relation from getting vacuumed n times (where n is the number of 
databases in a cluster) whenever it crossed over its threshold.  I'm 
not sure if this optimization is really important, or even exactly correct.
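
For reference, the shared-relation rule described above boils down to a
small decision function.  This Python sketch is a simplification under
assumptions: it collapses the separate vacuum and analyze thresholds into a
single over_threshold flag, and the function name is invented.  It is not
the contrib code itself.

```python
from typing import Optional


def command_for(dbname: str, is_shared: bool,
                over_threshold: bool) -> Optional[str]:
    """Pick the command to issue for one relation in one database."""
    if not over_threshold:
        return None  # nothing to do yet
    if is_shared and dbname != "template1":
        # Shared relation seen from an ordinary database: refresh the
        # planner statistics only, and leave the vacuum to template1.
        return "ANALYZE"
    # Ordinary relation, or shared relation seen from template1.
    return "VACUUM"
```

With n databases in a cluster, a shared relation that crosses its threshold
is then vacuumed once (from template1) and analyzed from the others.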




Re: [PATCHES] Autovacuum Integration Patch Take 5

2004-08-06 Thread Bruce Momjian

Matthew, your reply was exactly the type of reply I would have made in
your situation.  Your arguments are clear and indisputable.

Due to the many large patches that we had to process during this release,
we serialized their review.  However, I made promises to developers that
their patches would get the same consideration if they were reviewed
early or late.  Obviously this wasn't true of your patch.  We found more
issues than we thought and didn't give you time to address them. 
Frankly we are lucky autovacuum was the only item that didn't make it
because several features were in similar need of major work.  Of course
that is no consolation to you and people looking for autovacuum in 8.0.

Not sure what I can do about it at this point.  I am going to write up a
whole documentation section on 3rd party tools and interfaces and
pg_autovacuum would have a big mention there.

There is the issue of Win32 and the need for pg_autovacuum to start
easily.

---

Matthew T. O'Connor wrote:
 Tom Lane wrote:
 
 You're headed in the right direction, but I'm afraid we're running out
 of time.  The core committee has chewed this over and agreed that we
 can't postpone beta for the amount of time we think it will take to make
 this patch committable.  So we're going to hold it over for the 8.1
 release cycle.
 
 I have to make a personal apology to you for the fact that things worked
 out this way.  I really should have looked at your patch much earlier
 and given you some feedback that might have allowed you to resolve the
 issues in time.  I did not because (a) I felt that the other patches
 I was working on were more important features (a judgment I still stand
 by) and (b) I thought your patch was in good enough shape that we could
 apply it with little effort.  That judgment was badly off, and again I
 must apologize for it.  I hope you won't get discouraged, and will
 continue to work on an integrated autovacuum for 8.1.
   
 
 
 AGGGHH!
 This is very frustrating.  I saw this coming weeks and weeks ago and 
 tried to get people's attention so that this wouldn't happen.  Aside 
 from my personal frustration, I will say that autovacuum is a high 
 priority for lots of users of autovacuum and there are already lots of 
 users looking forward to it being in 8.0.  FWIW, I tried to clean up as 
 much stuff as I could the other night and submit an updated patch; I 
 would guess that it wouldn't take you very long to clean up the shutdown 
 issues.
 
 BTW, I chose to try to integrate it into the backend on the 
 recommendation of several people on the hackers list despite my warnings 
 that I would probably need help with the backend code issues.  I could 
 have instead put my time towards an improved version in contrib; now the 
 end-users will have to go another release cycle without any of the 
 feature improvements I had hoped for.
 
 FWIW, core has also agreed that we want to shoot for a much shorter
 release cycle for 8.1 than we have had in the past couple of releases.
 It seems likely that as the new 8.0 features are shaken out, 8.1 will
 be mostly a mop-up development cycle, and that we will want to push it
 out relatively soon (we're thinking of perhaps 3-4 months in
 development, with a total release cycle of 6-7 months).
 
   
 
 I think we have all heard this before
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PATCHES] Autovacuum Integration Patch Take 5

2004-08-05 Thread Matthew T. O'Connor
Tom Lane wrote:
You're headed in the right direction, but I'm afraid we're running out
of time.  The core committee has chewed this over and agreed that we
can't postpone beta for the amount of time we think it will take to make
this patch committable.  So we're going to hold it over for the 8.1
release cycle.
I have to make a personal apology to you for the fact that things worked
out this way.  I really should have looked at your patch much earlier
and given you some feedback that might have allowed you to resolve the
issues in time.  I did not because (a) I felt that the other patches
I was working on were more important features (a judgment I still stand
by) and (b) I thought your patch was in good enough shape that we could
apply it with little effort.  That judgment was badly off, and again I
must apologize for it.  I hope you won't get discouraged, and will
continue to work on an integrated autovacuum for 8.1.
 

AGGGHH!
This is very frustrating.  I saw this coming weeks and weeks ago and 
tried to get people's attention so that this wouldn't happen.  Aside 
from my personal frustration, I will say that autovacuum is a high 
priority for lots of users of autovacuum and there are already lots of 
users looking forward to it being in 8.0.  FWIW, I tried to clean up as 
much stuff as I could the other night and submit an updated patch; I 
would guess that it wouldn't take you very long to clean up the shutdown 
issues.

BTW, I chose to try to integrate it into the backend on the 
recommendation of several people on the hackers list despite my warnings 
that I would probably need help with the backend code issues.  I could 
have instead put my time towards an improved version in contrib; now the 
end-users will have to go another release cycle without any of the 
feature improvements I had hoped for.

FWIW, core has also agreed that we want to shoot for a much shorter
release cycle for 8.1 than we have had in the past couple of releases.
It seems likely that as the new 8.0 features are shaken out, 8.1 will
be mostly a mop-up development cycle, and that we will want to push it
out relatively soon (we're thinking of perhaps 3-4 months in
development, with a total release cycle of 6-7 months).
 

I think we have all heard this before


Re: [PATCHES] Autovacuum Integration Patch Take 5

2004-08-05 Thread Matthew T. O'Connor
Tom Lane wrote:
Matthew T. O'Connor [EMAIL PROTECTED] writes:
 

Well I didn't get out of the office as early as I had hoped, and I have
stayed up longer than I had planned, but I have a patch that addresses
many of the issues raised by Tom.  Please take a look and let me know if
I'm heading in the right direction.  
   

You're headed in the right direction, but I'm afraid we're running out
of time.  The core committee has chewed this over and agreed that we
can't postpone beta for the amount of time we think it will take to make
this patch committable.  So we're going to hold it over for the 8.1
release cycle.

BTW, I know people are eager for 8.0, but given that our release cycle 
is so long, and that by everyone's estimates we are at least 3 months 
away from a release, what is the hurry for beta?  A few more days to get 
this feature in wouldn't hurt; it was submitted before feature freeze, 
and I have been waiting weeks on end to get feedback. 



Re: [PATCHES] Autovacuum Integration Patch Take 5

2004-08-05 Thread Tom Lane
Matthew T. O'Connor [EMAIL PROTECTED] writes:
 BTW, I know people are eager for 8.0, but given that our release cycle 
 is so long, and that by everyone's estimates we are at least 3 months 
 away from a release, what is the hurry for beta?

If I thought we were just a day or two away from having a committable
patch, I'd lobby for more delay, but I don't really think that (and
now that I know you'll be gone over the next couple days, the odds of
that have clearly dropped to zero).  We have already slipped beta six
weeks from the original plan, and we cannot keep slipping it
indefinitely.

Again, I do have to apologize for not having found some time to look at
your patch earlier.  Hindsight is always 20-20 :-(

regards, tom lane
