Re: [PATCHES] [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

2007-02-04 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> BTW I've got serious reservations about whether this bit is safe:
>> 
>>> +   /* The table could've grown since vacuum started, and 
>>> there
>>> +* might already be dead tuples on the new pages. Catch 
>>> them
>>> +* as well. Also, we want to include any live tuples in 
>>> the
>>> +* new pages in the statistics.
>>> +*/
>>> +   nblocks = RelationGetNumberOfBlocks(onerel);
>> 
>> I seem to recall some assumptions somewhere in the system that a vacuum
>> won't visit newly-added pages.

> Hmm, I can't think of anything.

I think I was thinking of the second risk described here:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg00613.php
which is now fixed so maybe there's no longer any problem.  (If there
is, a change like this will convert it from a very-low-probability
problem into a significant-probability problem, so I guess we'll
find out...)

I still don't like the patch though; rechecking the relation length
every N blocks is uselessly inefficient and still doesn't create any
guarantees about having examined everything.  If we think this is
worth doing at all, we should arrange to recheck the length after
processing what we think is the last block, not at any other time.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

2007-02-03 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:

I have two runs of DBT-2, one with the patch and one without.



Patched:


autovac "public.stock" scans:1 pages:1285990(-0) 
tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec



Unpatched:


autovac "public.stock" scans:1 pages:1284504(-0) 
tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec


So that makes this patch a good idea why?  (Maybe what we need to see
is the impact on the total elapsed time for the DBT-2 test, rather
than just the VACUUM runtime.)


The patch is a good idea because the vacuum collected more dead tuples, 
not because it makes vacuum run faster (it doesn't).


I believe the increase in runtime is caused by some random variations in 
the test. As I said, there's something going on that server that I don't 
 fully understand. I'll setup another test set.


The impact on the whole DBT-2 test is hard to measure, it would require 
a long run so that you reach a steady state where tables are not growing 
because of dead tuples anymore. We'll need to do that anyway, so 
hopefully I'll get a chance to measure the impact of this patch as well. 
The effect I'm expecting the patch to have is to make the steady-state 
size of the stock table smaller, giving more cache hits and less I/O.



BTW I've got serious reservations about whether this bit is safe:


+   /* The table could've grown since vacuum started, and 
there
+* might already be dead tuples on the new pages. Catch 
them
+* as well. Also, we want to include any live tuples in 
the
+* new pages in the statistics.
+*/
+   nblocks = RelationGetNumberOfBlocks(onerel);


I seem to recall some assumptions somewhere in the system that a vacuum
won't visit newly-added pages.


Hmm, I can't think of anything.

Without the patch, there can't be any dead tuples on the newly-added 
pages, according to the snapshot taken at the beginning of the vacuum, 
so it would be a waste of time to visit them.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

2007-02-01 Thread Bruce Momjian
Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> > I have two runs of DBT-2, one with the patch and one without.
> 
> > Patched:
> 
> > autovac "public.stock" scans:1 pages:1285990(-0) 
> > tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec
> 
> > Unpatched:
> 
> > autovac "public.stock" scans:1 pages:1284504(-0) 
> > tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec
> 
> So that makes this patch a good idea why?  (Maybe what we need to see
> is the impact on the total elapsed time for the DBT-2 test, rather
> than just the VACUUM runtime.)

Yea, in summary, it looks like the patch made the performance worse.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

2007-02-01 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> I have two runs of DBT-2, one with the patch and one without.

> Patched:

> autovac "public.stock" scans:1 pages:1285990(-0) 
> tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec

> Unpatched:

> autovac "public.stock" scans:1 pages:1284504(-0) 
> tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec

So that makes this patch a good idea why?  (Maybe what we need to see
is the impact on the total elapsed time for the DBT-2 test, rather
than just the VACUUM runtime.)

BTW I've got serious reservations about whether this bit is safe:

> + /* The table could've grown since vacuum started, and 
> there
> +  * might already be dead tuples on the new pages. Catch 
> them
> +  * as well. Also, we want to include any live tuples in 
> the
> +  * new pages in the statistics.
> +  */
> + nblocks = RelationGetNumberOfBlocks(onerel);

I seem to recall some assumptions somewhere in the system that a vacuum
won't visit newly-added pages.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [pgsql-patches] Recalculating OldestXmin in a long-running vacuum

2007-02-01 Thread Heikki Linnakangas

Bruce Momjian wrote:

Have you gotten performance numbers on this yet?


I have two runs of DBT-2, one with the patch and one without.

Patched:

autovac "public.stock" scans:1 pages:1285990(-0) 
tuples:25303056(-2671265) CPU 95.22s/38.02u sec elapsed 10351.17 sec


Unpatched:

autovac "public.stock" scans:1 pages:1284504(-0) 
tuples:25001369(-1973760) CPU 86.55s/34.70u sec elapsed 9628.13 sec


Both autovacuums started roughly at the same time after test start. The 
numbers mean that without the patch, the vacuum found 1973760 dead 
tuples and with the patch 2671265 dead tuples. The runs were done with 
autovacuum_vacuum_scale_factor = 0.05, to trigger the autovacuum earlier 
than with the default.


Before these test runs, I realized that the patch had a little 
strangeness. Because we're taking new snapshot during the vacuum, some 
rows that are updated while the vacuum is running might not get counted 
as live. That can happen when the new updated version goes to page that 
has already been vacuumed, and the old version is on a page that hasn't 
yet been vacuumed. Also, because we're taking new snapshots, it makes 
sense to recalculate the relation size as well to vacuum any new blocks 
at the end. Attached is an updated patch that does that.


The reason I haven't posted the results earlier is that we're having 
some periodic drops in performance on that server that we can't explain. 
 (I don't think it's checkpoint nor autovacuum). I wanted to figure 
that out first, but I don't think that makes a difference for this patch.


Is this enough, or does someone want more evidence?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/vacuumlazy.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.82
diff -c -r1.82 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c	5 Jan 2007 22:19:27 -	1.82
--- src/backend/commands/vacuumlazy.c	22 Jan 2007 11:35:34 -
***
*** 66,71 
--- 66,73 
  #define REL_TRUNCATE_MINIMUM	1000
  #define REL_TRUNCATE_FRACTION	16
  
+ /* OldestXmin is recalculated every OLDEST_XMIN_REFRESH_INTERVAL pages */
+ #define OLDEST_XMIN_REFRESH_INTERVAL 100
  
  typedef struct LVRelStats
  {
***
*** 274,279 
--- 276,296 
  			vacrelstats->num_dead_tuples = 0;
  		}
  
+ 		/* Get a new OldestXmin every OLDEST_XMIN_REFRESH_INTERVAL pages
+ 		 * so that we get to reclaim a little bit more dead tuples in a 
+ 		 * long-running vacuum.
+ 		 */
+ 		if (blkno % OLDEST_XMIN_REFRESH_INTERVAL == (OLDEST_XMIN_REFRESH_INTERVAL - 1))
+ 		{
+ 			OldestXmin = GetOldestXmin(onerel->rd_rel->relisshared, true);
+ 			/* The table could've grown since vacuum started, and there
+ 			 * might already be dead tuples on the new pages. Catch them
+ 			 * as well. Also, we want to include any live tuples in the
+ 			 * new pages in the statistics.
+ 			 */
+ 			nblocks = RelationGetNumberOfBlocks(onerel);
+ 		}
+ 
  		buf = ReadBuffer(onerel, blkno);
  
  		/* Initially, we only need shared access to the buffer */
Index: src/backend/storage/ipc/procarray.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.20
diff -c -r1.20 procarray.c
*** src/backend/storage/ipc/procarray.c	5 Jan 2007 22:19:38 -	1.20
--- src/backend/storage/ipc/procarray.c	16 Jan 2007 16:57:59 -
***
*** 416,426 
  	/*
  	 * Normally we start the min() calculation with our own XID.  But if
  	 * called by checkpointer, we will not be inside a transaction, so use
! 	 * next XID as starting point for min() calculation.  (Note that if there
! 	 * are no xacts running at all, that will be the subtrans truncation
! 	 * point!)
  	 */
! 	if (IsTransactionState())
  		result = GetTopTransactionId();
  	else
  		result = ReadNewTransactionId();
--- 416,429 
  	/*
  	 * Normally we start the min() calculation with our own XID.  But if
  	 * called by checkpointer, we will not be inside a transaction, so use
! 	 * next XID as starting point for min() calculation.  We also don't
! 	 * include our own transaction if ignoreVacuum is true and we're a
! 	 * vacuum process ourselves.
! 	 *
! 	 * (Note that if there are no xacts running at all, that will be the
! 	 * subtrans truncation point!)
  	 */
! 	if (IsTransactionState() && !(ignoreVacuum && MyProc->inVacuum))
  		result = GetTopTransactionId();
  	else
  		result = ReadNewTransactionId();

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate