Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-12 Thread Mark Wong
On Fri, 12 Aug 2005 17:49:41 -0400
Alvaro Herrera [EMAIL PROTECTED] wrote:

 On Fri, Aug 12, 2005 at 10:49:43AM -0700, Mark Wong wrote:
  I thought I'd run a couple of tests to see if it would be helpful
  against CVS from Aug 3, 2005.
  
  Here's a run with autovacuum turned off:
  http://www.testing.osdl.org/projects/dbt2dev/results/dev4-015/42/
  5186.55 notpm
  
  Autvacuum on with default settings:
  http://www.testing.osdl.org/projects/dbt2dev/results/dev4-015/38/
  5462.23 notpm
 
 Just noticed what seems to be a bug: in
 
 http://www.testing.osdl.org/projects/dbt2dev/results/dev4-015/42/db/index_info.input
 
 plot index_info.data using 1:2 title i_customer with lines, \
 index_info.data using 1:2 title i_orders with lines, \
 index_info.data using 1:3 title pk_customer with lines, \
 index_info.data using 1:4 title pk_district with lines, \
 index_info.data using 1:5 title pk_item with lines, \
 index_info.data using 1:6 title pk_new_order with lines, \
 index_info.data using 1:7 title pk_order_line with lines, \
 index_info.data using 1:8 title pk_orders with lines, \
 index_info.data using 1:9 title pk_stock with lines, \
 index_info.data using 1:11 title pk_warehouse with lines
 
 Notice how the subindexes are wrong ... I think it should be 1:3 for
 i_orders, no?  Apparently indexes_scan.data has the same problem.

Whoops!  I think I fixed it for real now and the charts should be
updated now.  It was broken slightly more previously.

 It called my attention that the pk_warehouse index seems to have a very
 different usage in both runs in index_info, but in indexes_scan they
 seem similar.

Thanks,
Mark

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-12 Thread Alvaro Herrera
On Fri, Aug 12, 2005 at 10:49:43AM -0700, Mark Wong wrote:
 I thought I'd run a couple of tests to see if it would be helpful
 against CVS from Aug 3, 2005.
 
 Here's a run with autovacuum turned off:
 http://www.testing.osdl.org/projects/dbt2dev/results/dev4-015/42/
 5186.55 notpm
 
 Autvacuum on with default settings:
 http://www.testing.osdl.org/projects/dbt2dev/results/dev4-015/38/
 5462.23 notpm

Just noticed what seems to be a bug: in

http://www.testing.osdl.org/projects/dbt2dev/results/dev4-015/42/db/index_info.input

plot index_info.data using 1:2 title i_customer with lines, \
index_info.data using 1:2 title i_orders with lines, \
index_info.data using 1:3 title pk_customer with lines, \
index_info.data using 1:4 title pk_district with lines, \
index_info.data using 1:5 title pk_item with lines, \
index_info.data using 1:6 title pk_new_order with lines, \
index_info.data using 1:7 title pk_order_line with lines, \
index_info.data using 1:8 title pk_orders with lines, \
index_info.data using 1:9 title pk_stock with lines, \
index_info.data using 1:11 title pk_warehouse with lines

Notice how the subindexes are wrong ... I think it should be 1:3 for
i_orders, no?  Apparently indexes_scan.data has the same problem.


It called my attention that the pk_warehouse index seems to have a very
different usage in both runs in index_info, but in indexes_scan they
seem similar.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Las mujeres son como hondas:  mientras más resistencia tienen,
 más lejos puedes llegar con ellas  (Jonas Nightingale, Leap of Faith)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-12 Thread Alvaro Herrera
On Fri, Aug 12, 2005 at 03:16:04PM -0700, Mark Wong wrote:
 On Fri, 12 Aug 2005 17:49:41 -0400
 Alvaro Herrera [EMAIL PROTECTED] wrote:
 
  Notice how the subindexes are wrong ... I think it should be 1:3 for
  i_orders, no?  Apparently indexes_scan.data has the same problem.
 
 Whoops!  I think I fixed it for real now and the charts should be
 updated now.  It was broken slightly more previously.

Hmm, did you fix the 42 case only?  The other one is broken too ...

Also, it seems the tran_lock.out file captured wrong input -- I think
you mean WHERE transactionid IS NULL in the query instead of WHERE
transaction IS NULL.

I wonder what the big down-spikes (?) at minutes ~45 and ~85 correspond
to.  Are those checkpoints?  The IO vmstat chart would indicate that, I
think.


Anyway, it's interesting to see the performance go up with autovacuum
on.  I certainly didn't expect that in this kind of test.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
No necesitamos banderas
 No reconocemos fronteras  (Jorge González)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-12 Thread Mark Wong
On Fri, 12 Aug 2005 18:42:09 -0400
Alvaro Herrera [EMAIL PROTECTED] wrote:

 On Fri, Aug 12, 2005 at 03:16:04PM -0700, Mark Wong wrote:
  On Fri, 12 Aug 2005 17:49:41 -0400
  Alvaro Herrera [EMAIL PROTECTED] wrote:
  
   Notice how the subindexes are wrong ... I think it should be 1:3 for
   i_orders, no?  Apparently indexes_scan.data has the same problem.
  
  Whoops!  I think I fixed it for real now and the charts should be
  updated now.  It was broken slightly more previously.
 
 Hmm, did you fix the 42 case only?  The other one is broken too ...

The other dev4-015 cases should be fixed too.
 
 Also, it seems the tran_lock.out file captured wrong input -- I think
 you mean WHERE transactionid IS NULL in the query instead of WHERE
 transaction IS NULL.

Hmm, ok I can try that in a future test run.  I'm not very familiar with
this table, what's the difference between transaction and transactionid?

 I wonder what the big down-spikes (?) at minutes ~45 and ~85 correspond
 to.  Are those checkpoints?  The IO vmstat chart would indicate that, I
 think.

That's correct, those should be checkpoints. 
 
 Anyway, it's interesting to see the performance go up with autovacuum
 on.  I certainly didn't expect that in this kind of test.

I think in Mary's case it was hurting, but she's running the workload
dramatically different.  I think she was planning to revisit that after
we sort out what's going on with the grouped WAL writes.

Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-12 Thread Alvaro Herrera
On Fri, Aug 12, 2005 at 03:49:57PM -0700, Mark Wong wrote:
 On Fri, 12 Aug 2005 18:42:09 -0400
 Alvaro Herrera [EMAIL PROTECTED] wrote:

  Also, it seems the tran_lock.out file captured wrong input -- I think
  you mean WHERE transactionid IS NULL in the query instead of WHERE
  transaction IS NULL.
 
 Hmm, ok I can try that in a future test run.  I'm not very familiar with
 this table, what's the difference between transaction and transactionid?

transaction is the Xid of the transaction holding or waiting for the
lock.  transactionid is not null in the case where the lock is for a
TransactionId.  I guess it depends on what do you want though -- now
that I think about it, capturing only transaction locks is very likely
not what you want.

http://developer.postgresql.org/docs/postgres/view-pg-locks.html

I wonder why do you have that condition though.  I don't think
transaction can ever be NULL in that view.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede (Mark Twain)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-11 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Updated this patch again:

 - vacuum_cost_delay and vacuum_cost_limit can be set per table, as well
 as globally with autovacuum_vacuum_cost_{limit,delay}
 
 - pgstat is reset if recovery is required
 
 - pgstat reset at postmaster start is disabled by default
 
 - Xid-wraparound VACUUM is now FREEZE without ANALYZE, iff the database
 has datallowconn=false or datistemplate=true

 - A database-wide vacuum is also issued if the vacuumxid is found to be
   very old.

Applied with minor tweaks --- mostly, fixing it so the custom cost
settings are applied for ANALYZE as well as VACUUM.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-11 Thread Alvaro Herrera
On Thu, Aug 11, 2005 at 05:13:15PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Updated this patch again:
 
 Applied with minor tweaks --- mostly, fixing it so the custom cost
 settings are applied for ANALYZE as well as VACUUM.

Ok, cool, thanks.  I think this completes the autovacuum work I wanted
to do for 8.1.  AFAIK the only thing that we are still badly missing is
the documentation update.  (Matthew, if you are too busy to write it,
please let me know so I know I have to tackle it.)

For 8.2 my first priority (autovac-related) is to eliminate, or at least
alleviate, the need for database-wide vacuums, by keeping track of Xid
wraparound issues on a per-relation basis.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Este mail se entrega garantizadamente 100% libre de sarcasmo.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-10 Thread Alvaro Herrera
Updated this patch again:

  - vacuum_cost_delay and vacuum_cost_limit can be set per table, as well
as globally with autovacuum_vacuum_cost_{limit,delay}
  
  - pgstat is reset if recovery is required
  
  - pgstat reset at postmaster start is disabled by default
 
 - Xid-wraparound VACUUM is now FREEZE without ANALYZE, iff the database
   has datallowconn=false or datistemplate=true

- A database-wide vacuum is also issued if the vacuumxid is found to be
  very old.

Note that I had to add datvacuumxid to the pg_database flat file.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Ciencias políticas es la ciencia de entender por qué
 los políticos actúan como lo hacen  (netfunny.com)
Index: doc/src/sgml/catalogs.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/catalogs.sgml,v
retrieving revision 2.110
diff -c -r2.110 catalogs.sgml
*** doc/src/sgml/catalogs.sgml  31 Jul 2005 17:19:16 -  2.110
--- doc/src/sgml/catalogs.sgml  7 Aug 2005 19:03:22 -
***
*** 1197,1202 
--- 1197,1216 
entryMultiplier for reltuples to add to
 structfieldanl_base_thresh//entry
   /row
+ 
+row
+ entrystructfieldvac_cost_delay/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_delay/ parameter/entry
+/row
+ 
+row
+ entrystructfieldvac_cost_limit/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_limit/ parameter/entry
+/row
  /tbody
 /tgroup
/table
***
*** 1217,1223 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.
/para
  
   /sect1
--- 1231,1241 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.  Observe that the
!structfieldvac_cost_delay/ variable inherits its default value from the
!varnameautovacuum_vacuum_cost_delay/ configuration parameter,
!or from varnamevacuum_cost_delay/ if the former is set to a negative
!value.  The same applies to structfieldvac_cost_limit/.
/para
  
   /sect1
Index: doc/src/sgml/runtime.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.341
diff -c -r1.341 runtime.sgml
*** doc/src/sgml/runtime.sgml   30 Jul 2005 17:15:35 -  1.341
--- doc/src/sgml/runtime.sgml   31 Jul 2005 23:22:11 -
***
*** 3399,3404 
--- 3399,3436 
/listitem
   /varlistentry
  
+  varlistentry id=guc-autovacuum-vacuum-cost-delay 
xreflabel=autovacuum_vacuum_cost_delay
+   termvarnameautovacuum_vacuum_cost_delay/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_delay/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_delay/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
+  varlistentry id=guc-autovacuum-cost-limit 
xreflabel=autovacuum_vacuum_cost_limit
+   termvarnameautovacuum_vacuum_cost_limit/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_limit/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a 
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_limit/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
  /variablelist
 /sect2
  
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.214
diff -c -r1.214 xlog.c
*** src/backend/access/transam/xlog.c   30 Jul 2005 14:15:44 -  1.214
--- src/backend/access/transam/xlog.c   31 Jul 2005 21:36:45 -
***
*** 33,38 
--- 33,39 
  #include catalog/catversion.h
  #include catalog/pg_control.h
  #include miscadmin.h
+ #include pgstat.h
  #include postmaster/bgwriter.h
  #include storage/bufpage.h
  #include storage/fd.h
***

Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-09 Thread Alvaro Herrera
On Fri, Jul 29, 2005 at 03:33:09PM -0400, Tom Lane wrote:

 It occurs to me that vacuuming to prevent XID wraparound is not the only
 reason to do DB-wide vacuums: we also need to keep
 pg_database.datvacuumxid from getting too old, else we will have
 problems with clog bloat.  We may need to rethink the test used.

Hmm.  I have a patch for this, but now that it's ready, I wonder if it's
really needed.  If I understand vacuum_set_xid_limits() correctly, it's
very difficult for the vacuumxid to be far behind the freeze limit.  And
in the case it's actually behind, then there's nothing we can do -- the
only way out is for the user to end the long-running transaction.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar
al alfarero y ver qué formas se pueden sacar del otro (C. Halloway en
La Feria de las Tinieblas, R. Bradbury)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-09 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Hmm.  I have a patch for this, but now that it's ready, I wonder if it's
 really needed.  If I understand vacuum_set_xid_limits() correctly, it's
 very difficult for the vacuumxid to be far behind the freeze limit.

Umm ... they can be close together, or a billion XIDs apart, depending
on whether the FREEZE option was used.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-09 Thread Alvaro Herrera
On Tue, Aug 09, 2005 at 11:24:40PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Hmm.  I have a patch for this, but now that it's ready, I wonder if it's
  really needed.  If I understand vacuum_set_xid_limits() correctly, it's
  very difficult for the vacuumxid to be far behind the freeze limit.
 
 Umm ... they can be close together, or a billion XIDs apart, depending
 on whether the FREEZE option was used.

Sorry, my point was that vacuumxid is generally going to be higher than
freeze-xid, and where it isn't, a simple vacuum can't fix it.

But now that I think about it, maybe the point is that if a long-running
transaction (a billon-transactions old transaction?) was running when
the last database-wide vacuum was run, then vacuumxid is going to be
older than freeze-xid, so we may need a database-wide vacuum to fix that
even though the freeze-xid is not old enough.

Is that right?

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
We take risks not to escape from life, but to prevent life escaping from us.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-08 Thread Alvaro Herrera
On Sun, Jul 31, 2005 at 07:36:36PM -0400, Alvaro Herrera wrote:

Updated this patch:

 - vacuum_cost_delay and vacuum_cost_limit can be set per table, as well
   as globally with autovacuum_vacuum_cost_{limit,delay}
 
 - pgstat is reset if recovery is required
 
 - pgstat reset at postmaster start is disabled by default

- Xid-wraparound VACUUM is now FREEZE without ANALYZE, iff the database
  has datallowconn=false or datistemplate=true

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
No hay cielo posible sin hundir nuestras raíces
 en la profundidad de la tierra(Malucha Pinto)
Index: doc/src/sgml/catalogs.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/catalogs.sgml,v
retrieving revision 2.110
diff -c -r2.110 catalogs.sgml
*** doc/src/sgml/catalogs.sgml  31 Jul 2005 17:19:16 -  2.110
--- doc/src/sgml/catalogs.sgml  7 Aug 2005 19:03:22 -
***
*** 1197,1202 
--- 1197,1216 
entryMultiplier for reltuples to add to
 structfieldanl_base_thresh//entry
   /row
+ 
+row
+ entrystructfieldvac_cost_delay/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_delay/ parameter/entry
+/row
+ 
+row
+ entrystructfieldvac_cost_limit/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_limit/ parameter/entry
+/row
  /tbody
 /tgroup
/table
***
*** 1217,1223 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.
/para
  
   /sect1
--- 1231,1241 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.  Observe that the
!structfieldvac_cost_delay/ variable inherits its default value from the
!varnameautovacuum_vacuum_cost_delay/ configuration parameter,
!or from varnamevacuum_cost_delay/ if the former is set to a negative
!value.  The same applies to structfieldvac_cost_limit/.
/para
  
   /sect1
Index: doc/src/sgml/runtime.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.341
diff -c -r1.341 runtime.sgml
*** doc/src/sgml/runtime.sgml   30 Jul 2005 17:15:35 -  1.341
--- doc/src/sgml/runtime.sgml   31 Jul 2005 23:22:11 -
***
*** 3399,3404 
--- 3399,3436 
/listitem
   /varlistentry
  
+  varlistentry id=guc-autovacuum-vacuum-cost-delay 
xreflabel=autovacuum_vacuum_cost_delay
+   termvarnameautovacuum_vacuum_cost_delay/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_delay/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_delay/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
+  varlistentry id=guc-autovacuum-cost-limit 
xreflabel=autovacuum_vacuum_cost_limit
+   termvarnameautovacuum_vacuum_cost_limit/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_limit/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a 
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_limit/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
  /variablelist
 /sect2
  
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.214
diff -c -r1.214 xlog.c
*** src/backend/access/transam/xlog.c   30 Jul 2005 14:15:44 -  1.214
--- src/backend/access/transam/xlog.c   31 Jul 2005 21:36:45 -
***
*** 33,38 
--- 33,39 
  #include catalog/catversion.h
  #include catalog/pg_control.h
  #include miscadmin.h
+ #include pgstat.h
  #include postmaster/bgwriter.h
  #include storage/bufpage.h
  #include storage/fd.h
***
*** 48,54 
  
  
  /*
!  *Becauase O_DIRECT bypasses the kernel buffers, and 

Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-01 Thread Michael Paesold

Alvaro Herrera wrote:


Here is another patch for autovacuum:

...

- Xid-wraparound VACUUM is now FULL without ANALYZE


Am I right in my assumption that this VACUUM FULL can happen for any 
database, not just a template database?


I think this is a bad idea. Vacuum full is not an option for our and many 
other production databases. I suggest that should be a plain VACUUM.


Otherwise I think you have done great job finally integrating auto vacuum 
into the backend.


Best Regards,
Michael Paesold 



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-01 Thread Alvaro Herrera
On Mon, Aug 01, 2005 at 09:55:11AM +0200, Michael Paesold wrote:
 Alvaro Herrera wrote:
 
 Here is another patch for autovacuum:
 ...
 - Xid-wraparound VACUUM is now FULL without ANALYZE
 
 Am I right in my assumption that this VACUUM FULL can happen for any 
 database, not just a template database?

Ah, right.  I think it would be OK if we made it FULL only for
datallowcon=false databases.  OTOH maybe it's not worth at all making
the distinction and we should continue using the previous policy of
issuing only non-FULL VACUUM.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Having your biases confirmed independently is how scientific progress is
made, and hence made our great society what it is today (Mary Gardiner)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-01 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 - Xid-wraparound VACUUM is now FULL without ANALYZE
 
 Am I right in my assumption that this VACUUM FULL can happen for any 
 database, not just a template database?

 Ah, right.  I think it would be OK if we made it FULL only for
 datallowcon=false databases.  OTOH maybe it's not worth at all making
 the distinction and we should continue using the previous policy of
 issuing only non-FULL VACUUM.

Strikes me as a waste of cycles: the one database *least* likely to be
in need of a VACUUM FULL is template0.

What we perhaps should consider is VACUUM FREEZE, not FULL, when hitting
a template database --- this would maximize the interval before needing
to do it again.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-31 Thread Alvaro Herrera
On Fri, Jul 29, 2005 at 03:33:09PM -0400, Tom Lane wrote:

 It occurs to me that vacuuming to prevent XID wraparound is not the only
 reason to do DB-wide vacuums: we also need to keep
 pg_database.datvacuumxid from getting too old, else we will have
 problems with clog bloat.  We may need to rethink the test used.

I was unable to come up with a reasonable test for this.  How would we
determine what is too old?  Of course, I could pick any number from
thin air, if that was what you were thinking.  Going forward (8.2) I
think this should also be handled on a table per table basis, just like
the freeze Xid limit.

OTOH I just saw this comment in createdb():

/*
 * Normally we mark the new database with the same datvacuumxid and
 * datfrozenxid as the source.  However, if the source is not allowing
 * connections then we assume it is fully frozen, and we can set the
 * current transaction ID as the xid limits.  This avoids immediately
 * starting to generate warnings after cloning template0.
 */

This means that if the user manages to unfreeze a database, disallow
connections, and later use it as a template, we could suffer Xid-
wraparound data loss in the new database.  Should we rethink this?
Sadly, the only interface for disallowing connections is to manually
update pg_database, so it's impossible to raise a warning about it, or
something; and it's quite likely that people will disallow connections
without reading the proper documentation.  (They do such things all the
time).

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
La experiencia nos dice que el hombre peló millones de veces las patatas,
pero era forzoso admitir la posibilidad de que en un caso entre millones,
las patatas pelarían al hombre (Ijon Tichy)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-31 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 On Fri, Jul 29, 2005 at 03:33:09PM -0400, Tom Lane wrote:
 It occurs to me that vacuuming to prevent XID wraparound is not the only
 reason to do DB-wide vacuums: we also need to keep
 pg_database.datvacuumxid from getting too old, else we will have
 problems with clog bloat.  We may need to rethink the test used.

 I was unable to come up with a reasonable test for this.  How would we
 determine what is too old?

Well, it depends what you think is too much space for pg_clog.  If we
just follow the standard anti-wrap policy, we'll vacuum at least once
every half billion transactions, so pg_clog could be expected to grow
to about 125Mb, which maybe isn't a problem these days.

 OTOH I just saw this comment in createdb():

 /*
  * Normally we mark the new database with the same datvacuumxid and
  * datfrozenxid as the source.  However, if the source is not allowing
  * connections then we assume it is fully frozen, and we can set the
  * current transaction ID as the xid limits.  This avoids immediately
  * starting to generate warnings after cloning template0.
  */

 This means that if the user manages to unfreeze a database, disallow
 connections, and later use it as a template, we could suffer Xid-
 wraparound data loss in the new database.  Should we rethink this?

I don't think so.  Fooling with a template database is risky in any
case, and the fact that autovacuum might save your bacon (if you are
running autovacuum) doesn't make it less so.

BTW, it strikes me that there is one serious error in the current
autovac logic: it does VACUUM ANALYZE rather than merely VACUUM
when doing XID-wrap protection.  This means that it actively introduces
unfrozen tuples into template databases, which is A Bad Move.  We
should just VACUUM, instead.

 Sadly, the only interface for disallowing connections is to manually
 update pg_database,

As of now, we have a documented way of disallowing connections that
doesn't involve messing with datallowconn, so this argument seems a
lot weaker than it might have awhile back.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-31 Thread Alvaro Herrera
Here is another patch for autovacuum:

- vacuum_cost_delay and vacuum_cost_limit can be set per table, as well
  as globally with autovacuum_vacuum_cost_{limit,delay}

- pgstat is reset if recovery is required

- pgstat reset at postmaster start is disabled by default

- Xid-wraparound VACUUM is now FULL without ANALYZE


Note that because of the cost parameters, I changed the vacuum call in a
per-table call instead of passing a list of Oids.  This could be changed
by having two separate lists, one which uses the default values and
other for the rest, but it hardly seems worth the trouble.

(This patch requires catversion bump.)

On Sun, Jul 31, 2005 at 03:15:35PM -0400, Tom Lane wrote:

 BTW, it strikes me that there is one serious error in the current
 autovac logic: it does VACUUM ANALYZE rather than merely VACUUM
 when doing XID-wrap protection.  This means that it actively introduces
 unfrozen tuples into template databases, which is A Bad Move.  We
 should just VACUUM, instead.

True.  Changed in the attached patch.


I think this completes our expectations for 8.1, doesn't it?  Now we
only need the documentation.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
I call it GNU/Linux. Except the GNU/ is silent. (Ben Reiter)
Index: doc/src/sgml/catalogs.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/catalogs.sgml,v
retrieving revision 2.109
diff -c -r2.109 catalogs.sgml
*** doc/src/sgml/catalogs.sgml  26 Jul 2005 16:38:25 -  2.109
--- doc/src/sgml/catalogs.sgml  31 Jul 2005 20:19:20 -
***
*** 1187,1192 
--- 1187,1206 
entryMultiplier for reltuples to add to
 structfieldanl_base_thresh//entry
   /row
+ 
+row
+ entrystructfieldvac_cost_delay/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_delay/ parameter/entry
+/row
+ 
+row
+ entrystructfieldvac_cost_limit/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_limit/ parameter/entry
+/row
  /tbody
 /tgroup
/table
***
*** 1207,1213 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.
/para
  
   /sect1
--- 1221,1231 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.  Observe that the
!structfieldvac_cost_delay/ variable inherits its default value from the
!varnameautovacuum_vacuum_cost_delay/ configuration parameter,
!or from varnamevacuum_cost_delay/ if the former is set to a negative
!value.  The same applies to structfieldvac_cost_limit/.
/para
  
   /sect1
Index: doc/src/sgml/runtime.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.341
diff -c -r1.341 runtime.sgml
*** doc/src/sgml/runtime.sgml   30 Jul 2005 17:15:35 -  1.341
--- doc/src/sgml/runtime.sgml   31 Jul 2005 23:22:11 -
***
*** 3399,3404 
--- 3399,3436 
/listitem
   /varlistentry
  
+  varlistentry id=guc-autovacuum-vacuum-cost-delay 
xreflabel=autovacuum_vacuum_cost_delay
+   termvarnameautovacuum_vacuum_cost_delay/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_delay/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_delay/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
+  varlistentry id=guc-autovacuum-cost-limit 
xreflabel=autovacuum_vacuum_cost_limit
+   termvarnameautovacuum_vacuum_cost_limit/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_limit/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a 
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_limit/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
  /variablelist
 /sect2
  
Index: src/backend/access/transam/xlog.c

Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-30 Thread Michael Paesold

Alvaro Herrera wrote:


I still haven't added custom cost-based delays, but I don't see that as
a showstopper for removing it.  I just went through the CVS log and I
don't see anything else that applies.


I think you should at least add an autovacuum specific value for 
vacuum_cost_delay because it turns cost-based vacuum delay on or off. I 
believe not many will have vacuum_cost_delay enabled in postgresql.conf, but 
will want to enable it for autovacuum.

At least I do.

Best Regards,
Michael Paesold 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-30 Thread Tom Lane
Michael Paesold [EMAIL PROTECTED] writes:
 Alvaro Herrera wrote:
 I still haven't added custom cost-based delays, but I don't see that as
 a showstopper for removing it.  I just went through the CVS log and I
 don't see anything else that applies.

 I think you should at least add an autovacuum specific value for 
 vacuum_cost_delay because it turns cost-based vacuum delay on or off.

It occurs to me that you could have that today, using the knowledge that
the autovac daemon runs as the bootstrap user: use ALTER USER SET to
attach user-specific vacuum delay settings to that role.  This is a
pretty bletcherous solution, because (a) it requires knowledge of an
undocumented implementation detail and (b) it would interfere with using
that role for normal manual maintenance.  So I agree that a few extra
GUC settings would be better.  But we could get away without 'em.

Along the same lines, it was suggested that we need a way to disable
stats gathering on a per-database basis.  We already have it: you can
use ALTER DATABASE SET to control stats_row_level and stats_block_level
that way.  Neither of the above two objections apply to this usage, so
I think we can mark off that wishlist item as done.  (Of course, the
soon-to-appear autovac documentation had better mention this trick.)

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-30 Thread Bruce Momjian
Tom Lane wrote:
 Michael Paesold [EMAIL PROTECTED] writes:
  Alvaro Herrera wrote:
  I still haven't added custom cost-based delays, but I don't see that as
  a showstopper for removing it.  I just went through the CVS log and I
  don't see anything else that applies.
 
  I think you should at least add an autovacuum specific value for 
  vacuum_cost_delay because it turns cost-based vacuum delay on or off.
 
 It occurs to me that you could have that today, using the knowledge that
 the autovac daemon runs as the bootstrap user: use ALTER USER SET to
 attach user-specific vacuum delay settings to that role.  This is a
 pretty bletcherous solution, because (a) it requires knowledge of an
 undocumented implementation detail and (b) it would interfere with using
 that role for normal manual maintenance.  So I agree that a few extra
 GUC settings would be better.  But we could get away without 'em.
 
 Along the same lines, it was suggested that we need a way to disable
 stats gathering on a per-database basis.  We already have it: you can
 use ALTER DATABASE SET to control stats_row_level and stats_block_level
 that way.  Neither of the above two objections apply to this usage, so
 I think we can mark off that wishlist item as done.  (Of course, the
 soon-to-appear autovac documentation had better mention this trick.)

I am thinking we should move ahead with what we have now, suggest the
work-arounds, and thensee what use-cases we have for it for later
releases.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-30 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  I think you should at least add an autovacuum specific value for 
  vacuum_cost_delay because it turns cost-based vacuum delay on or off.
 
  I am thinking we should move ahead with what we have now, suggest the
  work-arounds, and thensee what use-cases we have for it for later
  releases.
 
 I think it's absolutely unquestionable that there is a use-case for
 running autovac with different vacuum-delay settings than you would
 want to apply to manually issued vacuums.  We don't need to wait for
 field experience on that one; we already have it with the contrib
 version.

So do we need to add new GUC variables?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-30 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 I think you should at least add an autovacuum specific value for 
 vacuum_cost_delay because it turns cost-based vacuum delay on or off.

 I am thinking we should move ahead with what we have now, suggest the
 work-arounds, and thensee what use-cases we have for it for later
 releases.

I think it's absolutely unquestionable that there is a use-case for
running autovac with different vacuum-delay settings than you would
want to apply to manually issued vacuums.  We don't need to wait for
field experience on that one; we already have it with the contrib
version.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-30 Thread Alvaro Herrera
On Sat, Jul 30, 2005 at 10:57:15AM -0400, Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian pgman@candle.pha.pa.us writes:
   I think you should at least add an autovacuum specific value for 
   vacuum_cost_delay because it turns cost-based vacuum delay on or off.
  
   I am thinking we should move ahead with what we have now, suggest the
   work-arounds, and thensee what use-cases we have for it for later
   releases.
  
  I think it's absolutely unquestionable that there is a use-case for
  running autovac with different vacuum-delay settings than you would
  want to apply to manually issued vacuums.  We don't need to wait for
  field experience on that one; we already have it with the contrib
  version.
 
 So do we need to add new GUC variables?

I was thinking in a GUC var for global setting, and a column in
pg_autovacuum for individual, per table setting.  Just one, for the
vacuum_cost_limit parameter; I don't think we really need settable cost
parameters.

A case could be made for setting the vacuum_cost_delay parameter as
well.  Thoughts?

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Es filósofo el que disfruta con los enigmas (G. Coli)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-29 Thread Tom Lane
BTW, is there still any reason not to remove the contrib/pg_autovacuum
directory from CVS?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-29 Thread Alvaro Herrera
On Fri, Jul 29, 2005 at 11:19:34AM -0400, Tom Lane wrote:
 BTW, is there still any reason not to remove the contrib/pg_autovacuum
 directory from CVS?

I still haven't added custom cost-based delays, but I don't see that as
a showstopper for removing it.  I just went through the CVS log and I
don't see anything else that applies.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Voy a acabar con todos los humanos / con los humanos yo acabaré
voy a acabar con todos / con todos los humanos acabaré (Bender)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-29 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Updated this patch:
 - The stat collector is modified so as to keep shared relations separate
   from regular ones.  Autovacuum understands this.
   [etc]

Applied with some fixes --- you had broken the reporting of statistics
for shared tables, for one thing.  Also the patch seemed to be missing
diffs for header files?

It occurs to me that vacuuming to prevent XID wraparound is not the only
reason to do DB-wide vacuums: we also need to keep
pg_database.datvacuumxid from getting too old, else we will have
problems with clog bloat.  We may need to rethink the test used.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-29 Thread Alvaro Herrera
On Fri, Jul 29, 2005 at 03:33:09PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Updated this patch:
  - The stat collector is modified so as to keep shared relations separate
from regular ones.  Autovacuum understands this.
[etc]
 
 Applied with some fixes --- you had broken the reporting of statistics
 for shared tables, for one thing.

Oops :-(  Didn't notice that.

 Also the patch seemed to be missing diffs for header files?

Damn, I generated the diff from within src/backend instead of the root
:-(  Sorry for the inconvenience.

 It occurs to me that vacuuming to prevent XID wraparound is not the only
 reason to do DB-wide vacuums: we also need to keep
 pg_database.datvacuumxid from getting too old, else we will have
 problems with clog bloat.  We may need to rethink the test used.

Hmm.  Will see about it.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
La Primavera ha venido. Nadie sabe como ha sido (A. Machado)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-29 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Also the patch seemed to be missing diffs for header files?

 Damn, I generated the diff from within src/backend instead of the root
 :-(  Sorry for the inconvenience.

No problem --- reverse-engineering the changes to function declarations
was simple enough.  But did you have any other changes outside
src/backend?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-29 Thread Alvaro Herrera
On Fri, Jul 29, 2005 at 05:46:11PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Also the patch seemed to be missing diffs for header files?
 
  Damn, I generated the diff from within src/backend instead of the root
  :-(  Sorry for the inconvenience.
 
 No problem --- reverse-engineering the changes to function declarations
 was simple enough.  But did you have any other changes outside
 src/backend?

Nope, that was it.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Estoy de acuerdo contigo en que la verdad absoluta no existe...
El problema es que la mentira sí existe y tu estás mintiendo (G. Lama)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-27 Thread Alvaro Herrera
On Mon, Jul 25, 2005 at 09:31:15AM +0800, Christopher Kings-Lynne wrote:
  We have to consider what
  happens at stat reset -- AFAICS there's no problem, because as soon as
  the table sees some activity, it will be picked up by pgstat.
  However, it would be bad if stats are reset right after some heavy
  activity on a table.  Maybe the only thing we need is documentation.
 
 What's the use-case for having the stat reset feature at all?
 
 I believe I was the root cause of the pg_stat_reset() function.  The 
 idea at the time was that if you decide to do a round of index 
 optimisation, you want to be able to search for unused indexes and 
 heavily seq. scanned tables.
 
 If you reset the stats you have 'clean' data to work with.  For 
 instance, you can get 24 hours of clean stats data.

Ok, so there's a reason for having a manual stat-reset.  However what's
the rationale for cleaning stats at postmaster start?  In fact I think
it's actively bad because you lose any data you had before postmaster
stop/crash.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
I personally became interested in Linux while I was dating an English major
who wouldn't know an operating system if it walked up and bit him.
(Val Henson)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-27 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Ok, so there's a reason for having a manual stat-reset.  However what's
 the rationale for cleaning stats at postmaster start?  In fact I think
 it's actively bad because you lose any data you had before postmaster
 stop/crash.

We probably *should* drop the stats file if any WAL replay activity
occurs, because the stats file could be out of sync with reality
--- this is particularly important in a PITR recovery situation,
where the stats file is likely to be WAY out of sync.  (Maybe only
clobber it in PITR mode?)

I agree that the argument for doing it in a normal restart is pretty
weak.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-25 Thread Matthew T. O'Connor

Tom Lane wrote:


Matthew T. O'Connor matthew@zeut.net writes:
 

I don't know either, but this brings up another question.  Stats 
wraparound.  


We'll all be safely dead, for one thing ;-)

At one update per nanosecond, it'd take approximately 300 years to wrap
a 64-bit counter.  Somehow I don't have a problem with the idea that
Postgres would need to be rebooted that often.  We'd want to fix the
32-bit nature of XIDs long before 64-bit stats counters get to be a
real-world issue ...



*sigh*  Sorry, I should have done a little math before I asked that 
question.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-25 Thread Alvaro Herrera
On Thu, Jul 14, 2005 at 10:52:56AM -0400, Tom Lane wrote:
 I've applied Alvaro's latest integrated-autovacuum patch.  There are
 still a number of loose ends to be dealt with before beta, though:

Updated this patch:

- The stat collector is modified so as to keep shared relations separate
  from regular ones.  Autovacuum understands this.

- Temp tables are completely ignored.

- pg_statistic is ignored for analyze.  It's still candidate for vacuum.

- databases with no stat entry are still ignored, except that they are
  checked for Xid wraparound like any other.  The oldest one is chosen
  for vacuum in a particular autovacuum run.

- A database-wide vacuum forces a pg_database flat-file update, so that
  the wraparound check actually works.

- The postmaster's main loop sleeps Min(60, autovacuum_naptime), in
  order to be able to pick naptimes smaller than 60 seconds.  In order
  not to make the loop a busy-wait, I forced a minimum of 1 to that GUC
  var.  Yes, an argument could be made that the minimum could be higher.
  Not sure if we actually want to dictate policy on this.  The minimum
  is there only to prevent the postmaster from using 100% of a CPU the
  whole time.

- Tables with no stat entries are completely ignored.

- The stat collector ignores messages that relate to databases it
  doesn't know about.  This makes it inocuous to issue a database-wide
  vacuum on a template database.  A special case is made for database
  InvalidOid -- an entry for it is created regardless.


Two comments still apply:

- I haven't done anything yet w.r.t. the custom vacuum_delay nor sleep
  scale factor.

- There are still no docs.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Porque Kim no hacia nada, pero, eso sí,
con extraordinario éxito (Kim, Kipling)
Index: access/transam/xlog.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.210
diff -c -r1.210 xlog.c
*** access/transam/xlog.c   23 Jul 2005 15:31:16 -  1.210
--- access/transam/xlog.c   24 Jul 2005 17:55:51 -
***
*** 465,471 
 TimeLineID endTLI,
 uint32 endLogId, uint32 endLogSeg);
  static void WriteControlFile(void);
- static void ReadControlFile(void);
  static char *str_time(time_t tnow);
  static void issue_xlog_fsync(void);
  
--- 465,470 
***
*** 3383,3390 
 errmsg(could not close control file: %m)));
  }
  
! static void
! ReadControlFile(void)
  {
pg_crc32crc;
int fd;
--- 3382,3394 
 errmsg(could not close control file: %m)));
  }
  
! /*
!  * Read and verify the control file, filling the ControlFile struct.
!  *
!  * If nextXid is not NULL, the latest Checkpoint's nextXid is returned.
!  */
! void
! ReadControlFile(TransactionId *nextXid)
  {
pg_crc32crc;
int fd;
***
*** 3525,3530 
--- 3529,3537 
   ControlFile-lc_ctype),
 errhint(It looks like you need to initdb or install locale 
support.)));
  
+   if (PointerIsValid(nextXid))
+   *nextXid = ControlFile-checkPointCopy.nextXid;
+ 
/* Make the fixed locale settings visible as GUC variables, too */
SetConfigOption(lc_collate, ControlFile-lc_collate,
PGC_INTERNAL, PGC_S_OVERRIDE);
***
*** 3650,3656 
 * for the reasons why).
 */
if (!IsBootstrapProcessingMode())
!   ReadControlFile();
  }
  
  /*
--- 3657,3663 
 * for the reasons why).
 */
if (!IsBootstrapProcessingMode())
!   ReadControlFile(NULL);
  }
  
  /*
***
*** 4232,4238 
 * Note: in most control paths, *ControlFile is already valid and we 
need
 * not do ReadControlFile() here, but might as well do it to be sure.
 */
!   ReadControlFile();
  
if (ControlFile-logSeg == 0 ||
ControlFile-state  DB_SHUTDOWNED ||
--- 4239,4245 
 * Note: in most control paths, *ControlFile is already valid and we 
need
 * not do ReadControlFile() here, but might as well do it to be sure.
 */
!   ReadControlFile(NULL);
  
if (ControlFile-logSeg == 0 ||
ControlFile-state  DB_SHUTDOWNED ||
Index: commands/analyze.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/commands/analyze.c,v
retrieving revision 1.87
diff -c -r1.87 analyze.c
*** commands/analyze.c  14 Jul 2005 05:13:39 -  1.87
--- commands/analyze.c  26 Jul 2005 03:51:45 -
***
*** 317,323 
 * a zero-column table.
 */
   

Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-24 Thread Alvaro Herrera
On Thu, Jul 14, 2005 at 10:52:56AM -0400, Tom Lane wrote:
 I've applied Alvaro's latest integrated-autovacuum patch.  There are
 still a number of loose ends to be dealt with before beta, though:

Ok, here's a patch that deals with some of this:

- The stat collector is modified so as to keep shared relations separate
  from regular ones.  Also, backends sends messages separately.
  Autovacuum takes advantage of this, so it correctly identifies the
  appropiate time to operate on a shared relation, irrespective of the
  database where they were modified.  Note however that it uses each
  database's pg_autovacuum settings.  This means it could be vacuumed
  sooner in one database than another, but I don't think it's a problem.

- Temp tables are completely ignored.

- pg_statistic is completely ignored.

- databases with no stat entry are still ignored, except that they are
  checked for Xid wraparound like any other.  The oldest one is chosen
  for vacuum in a particular autovacuum run.

- A database-wide vacuum forces a pg_database flat-file update, so that
  the wraparound check actually works.

- The postmaster's main loop sleeps Min(60, autovacuum_naptime), in
  order to be able to pick naptimes smaller than 60 seconds.  In order
  not to make the loop a busy-wait, I forced a minimum of 1 to that GUC
  var.



Some comments:

- Now that we have a real Xid wraparound check, we could go back to
  having any table with no stat entry ignored, which was the original
  coding.  There's no danger of wraparound, and there'd be no work done
  to a table that doesn't have any activity.  We have to consider what
  happens at stat reset -- AFAICS there's no problem, because as soon as
  the table sees some activity, it will be picked up by pgstat.
  However, it would be bad if stats are reset right after some heavy
  activity on a table.  Maybe the only thing we need is documentation.

- datallowcon is still ignored.  Now it's safe to do so, because we have
  a real Xid wraparound check.  Changing it requires extending the
  pg_database flat-file (should be fairly easy).

- There are stat messages emitted for a database-wide vacuum, just like
  any other.  This means that all tables in the database would end up in
  pgstat; and also all databases, including those with datallowconn = false.
  This may not be good.  I'm not sure what exactly to do about it.  Do
  we want to disallow such stats?  Disable message sending (or
  collecting) in some circumstances?

- I haven't done anything yet w.r.t. the custom vacuum_delay nor sleep
  scale factor.

- There are still no docs.


-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Porque Kim no hacia nada, pero, eso sí,
con extraordinario éxito (Kim, Kipling)
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.210
diff -c -r1.210 xlog.c
*** src/backend/access/transam/xlog.c   23 Jul 2005 15:31:16 -  1.210
--- src/backend/access/transam/xlog.c   24 Jul 2005 17:55:51 -
***
*** 465,471 
 TimeLineID endTLI,
 uint32 endLogId, uint32 endLogSeg);
  static void WriteControlFile(void);
- static void ReadControlFile(void);
  static char *str_time(time_t tnow);
  static void issue_xlog_fsync(void);
  
--- 465,470 
***
*** 3383,3390 
 errmsg(could not close control file: %m)));
  }
  
! static void
! ReadControlFile(void)
  {
pg_crc32crc;
int fd;
--- 3382,3394 
 errmsg(could not close control file: %m)));
  }
  
! /*
!  * Read and verify the control file, filling the ControlFile struct.
!  *
!  * If nextXid is not NULL, the latest Checkpoint's nextXid is returned.
!  */
! void
! ReadControlFile(TransactionId *nextXid)
  {
pg_crc32crc;
int fd;
***
*** 3525,3530 
--- 3529,3537 
   ControlFile-lc_ctype),
 errhint(It looks like you need to initdb or install locale 
support.)));
  
+   if (PointerIsValid(nextXid))
+   *nextXid = ControlFile-checkPointCopy.nextXid;
+ 
/* Make the fixed locale settings visible as GUC variables, too */
SetConfigOption(lc_collate, ControlFile-lc_collate,
PGC_INTERNAL, PGC_S_OVERRIDE);
***
*** 3650,3656 
 * for the reasons why).
 */
if (!IsBootstrapProcessingMode())
!   ReadControlFile();
  }
  
  /*
--- 3657,3663 
 * for the reasons why).
 */
if (!IsBootstrapProcessingMode())
!   ReadControlFile(NULL);
  }
  
  /*
***
*** 4232,4238 
 * Note: in most control paths, *ControlFile is 

Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 - pg_statistic is completely ignored.

... pg_statistic still needs vacuuming, surely.  It's only ANALYZE
that you can/should skip for it.

 - The postmaster's main loop sleeps Min(60, autovacuum_naptime), in
   order to be able to pick naptimes smaller than 60 seconds.  In order
   not to make the loop a busy-wait, I forced a minimum of 1 to that GUC
   var.

Hmm, I wonder whether the minimum shouldn't be 10.  Or even 60.

 - Now that we have a real Xid wraparound check, we could go back to
   having any table with no stat entry ignored, which was the original
   coding.  There's no danger of wraparound, and there'd be no work done
   to a table that doesn't have any activity.

Agreed.

   We have to consider what
   happens at stat reset -- AFAICS there's no problem, because as soon as
   the table sees some activity, it will be picked up by pgstat.
   However, it would be bad if stats are reset right after some heavy
   activity on a table.  Maybe the only thing we need is documentation.

What's the use-case for having the stat reset feature at all?

 - datallowcon is still ignored.  Now it's safe to do so, because we have
   a real Xid wraparound check.  Changing it requires extending the
   pg_database flat-file (should be fairly easy).

I think this is all right, as long as a database that shows no stats
traffic is only connected to when it needs to be vacuumed for XID wrap
prevention purposes.

 - There are stat messages emitted for a database-wide vacuum, just like
   any other.  This means that all tables in the database would end up in
   pgstat; and also all databases, including those with datallowconn = false.
   This may not be good.  I'm not sure what exactly to do about it.  Do
   we want to disallow such stats?  Disable message sending (or
   collecting) in some circumstances?

Needs thought...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-24 Thread Alvaro Herrera
On Sun, Jul 24, 2005 at 02:33:38PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  - pg_statistic is completely ignored.
 
 ... pg_statistic still needs vacuuming, surely.  It's only ANALYZE
 that you can/should skip for it.

Sorry, yes, it's ignored only for analyze.

  - The postmaster's main loop sleeps Min(60, autovacuum_naptime), in
order to be able to pick naptimes smaller than 60 seconds.  In order
not to make the loop a busy-wait, I forced a minimum of 1 to that GUC
var.
 
 Hmm, I wonder whether the minimum shouldn't be 10.  Or even 60.

It's ok with me.  What do other people think?

We have to consider what
happens at stat reset -- AFAICS there's no problem, because as soon as
the table sees some activity, it will be picked up by pgstat.
However, it would be bad if stats are reset right after some heavy
activity on a table.  Maybe the only thing we need is documentation.
 
 What's the use-case for having the stat reset feature at all?

I don't know.  Maybe the people who added it can tell?


  - There are stat messages emitted for a database-wide vacuum, just like
any other.  This means that all tables in the database would end up in
pgstat; and also all databases, including those with datallowconn = false.
This may not be good.  I'm not sure what exactly to do about it.  Do
we want to disallow such stats?  Disable message sending (or
collecting) in some circumstances?
 
 Needs thought...

Ok.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
I call it GNU/Linux. Except the GNU/ is silent. (Ben Reiter)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-24 Thread Christopher Kings-Lynne

 We have to consider what
 happens at stat reset -- AFAICS there's no problem, because as soon as
 the table sees some activity, it will be picked up by pgstat.
 However, it would be bad if stats are reset right after some heavy
 activity on a table.  Maybe the only thing we need is documentation.



What's the use-case for having the stat reset feature at all?


I believe I was the root cause of the pg_stat_reset() function.  The 
idea at the time was that if you decide to do a round of index 
optimisation, you want to be able to search for unused indexes and 
heavily seq. scanned tables.


If you reset the stats you have 'clean' data to work with.  For 
instance, you can get 24 hours of clean stats data.


Chris


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-07-24 Thread Tom Lane
Matthew T. O'Connor matthew@zeut.net writes:
 I don't know either, but this brings up another question.  Stats 
 wraparound.  The n_tup_ins/upd/del columns in the stats system are 
 defined as bigint, what happens when the total number of upd for example 
 exceeds the capacity for bigint, or overflows to negative, anyone have 
 any idea?

We'll all be safely dead, for one thing ;-)

At one update per nanosecond, it'd take approximately 300 years to wrap
a 64-bit counter.  Somehow I don't have a problem with the idea that
Postgres would need to be rebooted that often.  We'd want to fix the
32-bit nature of XIDs long before 64-bit stats counters get to be a
real-world issue ...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match