Re: [PATCHES] Skipping VACUUM of indexes when no work required

2006-02-12 Thread Simon Riggs
On Sat, 2006-02-11 at 16:36 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I believe this is safe.
> 
> I won't insult your intelligence by pointing out how I know that you
> didn't even test the patch against hash or gist.

I don't recall either way, though from what you say it seems I did not
test those cases. Thanks for catching my error.

> The major problem with the patch is that it's incapable of producing
> correct tuple-count stats for partial indexes, which is really not
> acceptable from a planning standpoint.  What I'm currently fooling with
> is skipping the bulkdelete scan only if the index isn't partial...

Thanks for spotting this case. I strive to learn.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2006-02-11 Thread Tom Lane
Here's the patch as-applied.  Note one major difference from your
original: the logic about whether an indexscan can be skipped is now
entirely local to the index AMs, rather than allowing VACUUM to make
assumptions that may not be warranted for particular AMs.  For the
same reason, the AM is still responsible for providing the tuple
count statistic.

regards, tom lane




binGCkhzbMrzu.bin
Description: bulkdelete.patch.gz

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2006-02-11 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> I believe this is safe.

I won't insult your intelligence by pointing out how I know that you
didn't even test the patch against hash or gist.

The major problem with the patch is that it's incapable of producing
correct tuple-count stats for partial indexes, which is really not
acceptable from a planning standpoint.  What I'm currently fooling with
is skipping the bulkdelete scan only if the index isn't partial...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2006-02-11 Thread Simon Riggs
On Sat, 2006-02-11 at 12:04 -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > Patch applied.  Thanks.
> 
> Please revert.  You cannot skip scanning indexes simply because there
> was no heap activity.  btree for instance does post-cleanup on the
> next vacuum.

The patch skips only the first scan, not the second phase which does the
post-cleanup. We discussed this before and I listened...

I believe this is safe.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2006-02-11 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> We discussed an optimization of VACUUM here
> http://archives.postgresql.org/pgsql-hackers/2005-09/msg00046.php
> that would allow VACUUM to complete faster by avoiding scanning the
> indexes when no rows were removed from the heap by the VACUUM.

After looking at this, I think it is salvageable, but the patch as
written complicates the vacuum-to-index-AM API more than necessary;
there's no reason why the AM has to expose the fact that it skipped
doing anything.

I'll clean it up and reapply.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2006-02-11 Thread Tom Lane
Bruce Momjian  writes:
> Patch applied.  Thanks.

Please revert.  You cannot skip scanning indexes simply because there
was no heap activity.  btree for instance does post-cleanup on the
next vacuum.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2006-02-11 Thread Bruce Momjian

Patch applied.  Thanks.

---


Simon Riggs wrote:
> On Wed, 2005-12-07 at 17:40 +, Simon Riggs wrote:
> > On Wed, 2005-12-07 at 09:55 -0500, Tom Lane wrote:
> > > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > > We discussed an optimization of VACUUM here
> > > > http://archives.postgresql.org/pgsql-hackers/2005-09/msg00046.php
> > > > that would allow VACUUM to complete faster by avoiding scanning the
> > > > indexes when no rows were removed from the heap by the VACUUM.
> > > 
> > > Unfortunately I can't read that message right now because archives
> > > isn't responding, but this seems like a pretty bad idea to me.
> > > You still have to do the vacuum cleanup pass (at least in the btree
> > > case, and the only reason gist doesn't need it is it's not yet up
> > > to speed) so there's no real savings.
> > 
> > There are real savings; this is not a theoretical patch.
> > 
> > One pass of an index is faster than two, always.
> 
> Test results on a 1.2GB table, 10^6 rows and 3 indexes:
> 
> w/o optimization  87s
> with optimization 56s
> 
> Timings taken with primed cache, to allow reasonable comparison without
> confusing the issue with hint bit updates etc.
> 
> Performance gain is dependant upon:
> 1. size of index
> 2. logical/physical ordering of index pages
> 
> These tests performed immediately after load, which is best case, but
> also the main case for which I seek to optimize.
> 
> postgres=# select pg_relation_size('vactest');
>  pg_relation_size
> --
>1204707328
> 
> vacuum verbose vactest;
> psql:vacnout1.sql:3: INFO:  vacuuming "public.vactest"
> psql:vacnout1.sql:3: INFO:  index "vactest_idx1" now contains 1000
> row versions in 21899 pages
> DETAIL:  0 index pages have been deleted, 0 are currently reusable.
> CPU 0.25s/0.03u sec elapsed 5.81 sec.
> psql:vacnout1.sql:3: INFO:  index "vactest_idx2" now contains 1000
> row versions in 21899 pages
> DETAIL:  0 index pages have been deleted, 0 are currently reusable.
> CPU 0.26s/0.04u sec elapsed 5.78 sec.
> psql:vacnout1.sql:3: INFO:  index "vactest_idx3" now contains 1000
> row versions in 21899 pages
> DETAIL:  0 index pages have been deleted, 0 are currently reusable.
> CPU 0.23s/0.05u sec elapsed 5.69 sec.
> psql:vacnout1.sql:3: INFO:  "vactest": found 0 removable, 1000
> nonremovable row versions in 147059 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> There were 2 unused item pointers.
> 0 pages are entirely empty.
> CPU 2.59s/0.58u sec elapsed 56.02 sec.
> psql:vacnout1.sql:3: INFO:  vacuuming "pg_toast.pg_toast_16415"
> psql:vacnout1.sql:3: INFO:  index "pg_toast_16415_index" now contains 0
> row versions in 1 pages
> DETAIL:  0 index pages have been deleted, 0 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> psql:vacnout1.sql:3: INFO:  "pg_toast_16415": found 0 removable, 0
> nonremovable row versions in 0 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> There were 0 unused item pointers.
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> VACUUM
> 
> 
> update vactest set col1a =1 where col1a = 1 and col1b = 1 and col1c = 1;
> UPDATE 1
> vacuum verbose vactest;
> psql:vacnout1.sql:6: INFO:  vacuuming "public.vactest"
> psql:vacnout1.sql:6: INFO:  index "vactest_idx1" now contains 1000
> row versions in 21899 pages
> DETAIL:  1 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.94s/0.45u sec elapsed 15.29 sec.
> psql:vacnout1.sql:6: INFO:  index "vactest_idx2" now contains 1000
> row versions in 21899 pages
> DETAIL:  1 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 1.03s/0.40u sec elapsed 16.80 sec.
> psql:vacnout1.sql:6: INFO:  index "vactest_idx3" now contains 1000
> row versions in 21899 pages
> DETAIL:  1 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.94s/0.49u sec elapsed 15.84 sec.
> psql:vacnout1.sql:6: INFO:  "vactest": removed 1 row versions in 1 pages
> DETAIL:  CPU 0.00s/0.00u sec elapsed 0.02 sec.
> psql:vacnout1.sql:6: INFO:  "vactest": found 1 removable, 1000
> nonremovable row versions in 147059 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> There were 1 unused item pointers.
> 0 pages are entirely empty.
> CPU 4.99s/1.85u sec elapsed 87.20 sec.
> psql:vacnout1.sql:6: INFO:  vacuuming "pg_toast.pg_toast_16415"
> psql:vacnout1.sql:6: INFO:  index "pg_toast_16415_index" now contains 0
> row versions in 1 pages
> DETAIL:  0 index pages have been deleted, 0 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> psql:vacnout1.sql:6: INFO:  "pg_toast_16415": found 0 removable, 0
> nonremovable row versions in 0 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> There were 0 unused item pointers.
> 0 pages are entire

Re: [PATCHES] Skipping VACUUM of indexes when no work required

2005-12-07 Thread Simon Riggs
On Wed, 2005-12-07 at 17:40 +, Simon Riggs wrote:
> On Wed, 2005-12-07 at 09:55 -0500, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > We discussed an optimization of VACUUM here
> > > http://archives.postgresql.org/pgsql-hackers/2005-09/msg00046.php
> > > that would allow VACUUM to complete faster by avoiding scanning the
> > > indexes when no rows were removed from the heap by the VACUUM.
> > 
> > Unfortunately I can't read that message right now because archives
> > isn't responding, but this seems like a pretty bad idea to me.
> > You still have to do the vacuum cleanup pass (at least in the btree
> > case, and the only reason gist doesn't need it is it's not yet up
> > to speed) so there's no real savings.
> 
> There are real savings; this is not a theoretical patch.
> 
> One pass of an index is faster than two, always.

Test results on a 1.2GB table, 10^6 rows and 3 indexes:

w/o optimization87s
with optimization   56s

Timings taken with primed cache, to allow reasonable comparison without
confusing the issue with hint bit updates etc.

Performance gain is dependant upon:
1. size of index
2. logical/physical ordering of index pages

These tests performed immediately after load, which is best case, but
also the main case for which I seek to optimize.

postgres=# select pg_relation_size('vactest');
 pg_relation_size
--
   1204707328

vacuum verbose vactest;
psql:vacnout1.sql:3: INFO:  vacuuming "public.vactest"
psql:vacnout1.sql:3: INFO:  index "vactest_idx1" now contains 1000
row versions in 21899 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.25s/0.03u sec elapsed 5.81 sec.
psql:vacnout1.sql:3: INFO:  index "vactest_idx2" now contains 1000
row versions in 21899 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.26s/0.04u sec elapsed 5.78 sec.
psql:vacnout1.sql:3: INFO:  index "vactest_idx3" now contains 1000
row versions in 21899 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.23s/0.05u sec elapsed 5.69 sec.
psql:vacnout1.sql:3: INFO:  "vactest": found 0 removable, 1000
nonremovable row versions in 147059 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 2 unused item pointers.
0 pages are entirely empty.
CPU 2.59s/0.58u sec elapsed 56.02 sec.
psql:vacnout1.sql:3: INFO:  vacuuming "pg_toast.pg_toast_16415"
psql:vacnout1.sql:3: INFO:  index "pg_toast_16415_index" now contains 0
row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
psql:vacnout1.sql:3: INFO:  "pg_toast_16415": found 0 removable, 0
nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM


update vactest set col1a =1 where col1a = 1 and col1b = 1 and col1c = 1;
UPDATE 1
vacuum verbose vactest;
psql:vacnout1.sql:6: INFO:  vacuuming "public.vactest"
psql:vacnout1.sql:6: INFO:  index "vactest_idx1" now contains 1000
row versions in 21899 pages
DETAIL:  1 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.94s/0.45u sec elapsed 15.29 sec.
psql:vacnout1.sql:6: INFO:  index "vactest_idx2" now contains 1000
row versions in 21899 pages
DETAIL:  1 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.03s/0.40u sec elapsed 16.80 sec.
psql:vacnout1.sql:6: INFO:  index "vactest_idx3" now contains 1000
row versions in 21899 pages
DETAIL:  1 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.94s/0.49u sec elapsed 15.84 sec.
psql:vacnout1.sql:6: INFO:  "vactest": removed 1 row versions in 1 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.02 sec.
psql:vacnout1.sql:6: INFO:  "vactest": found 1 removable, 1000
nonremovable row versions in 147059 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 1 unused item pointers.
0 pages are entirely empty.
CPU 4.99s/1.85u sec elapsed 87.20 sec.
psql:vacnout1.sql:6: INFO:  vacuuming "pg_toast.pg_toast_16415"
psql:vacnout1.sql:6: INFO:  index "pg_toast_16415_index" now contains 0
row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
psql:vacnout1.sql:6: INFO:  "pg_toast_16415": found 0 removable, 0
nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Other details available.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can 

Re: [PATCHES] Skipping VACUUM of indexes when no work required

2005-12-07 Thread Simon Riggs
On Wed, 2005-12-07 at 09:55 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > We discussed an optimization of VACUUM here
> > http://archives.postgresql.org/pgsql-hackers/2005-09/msg00046.php
> > that would allow VACUUM to complete faster by avoiding scanning the
> > indexes when no rows were removed from the heap by the VACUUM.
> 
> Unfortunately I can't read that message right now because archives
> isn't responding, but this seems like a pretty bad idea to me.
> You still have to do the vacuum cleanup pass (at least in the btree
> case, and the only reason gist doesn't need it is it's not yet up
> to speed) so there's no real savings.

There are real savings; this is not a theoretical patch.

One pass of an index is faster than two, always.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2005-12-07 Thread Joshua D. Drake
On Wed, 2005-12-07 at 09:55 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > We discussed an optimization of VACUUM here
> > http://archives.postgresql.org/pgsql-hackers/2005-09/msg00046.php
> > that would allow VACUUM to complete faster by avoiding scanning the
> > indexes when no rows were removed from the heap by the VACUUM.

resolved.

> 
> Unfortunately I can't read that message right now because archives
> isn't responding, but this seems like a pretty bad idea to me.
> You still have to do the vacuum cleanup pass (at least in the btree
> case, and the only reason gist doesn't need it is it's not yet up
> to speed) so there's no real savings.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend
-- 
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: PLphp, PLperl, ODBCng - http://www.commandprompt.com/



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] Skipping VACUUM of indexes when no work required

2005-12-07 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> We discussed an optimization of VACUUM here
> http://archives.postgresql.org/pgsql-hackers/2005-09/msg00046.php
> that would allow VACUUM to complete faster by avoiding scanning the
> indexes when no rows were removed from the heap by the VACUUM.

Unfortunately I can't read that message right now because archives
isn't responding, but this seems like a pretty bad idea to me.
You still have to do the vacuum cleanup pass (at least in the btree
case, and the only reason gist doesn't need it is it's not yet up
to speed) so there's no real savings.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PATCHES] Skipping VACUUM of indexes when no work required

2005-12-07 Thread Simon Riggs
We discussed an optimization of VACUUM here
http://archives.postgresql.org/pgsql-hackers/2005-09/msg00046.php
that would allow VACUUM to complete faster by avoiding scanning the
indexes when no rows were removed from the heap by the VACUUM.

Patch applies cleanly on cvstip; make check passes.

Tests shows clear performance gain when no rows removed by VACUUM.

Not as useful as may once have been, but certainly no loss either,
whatever happens in the future with VACUUM.

Best Regards, Simon Riggs
Index: src/backend/access/gist/gistvacuum.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/gist/gistvacuum.c,v
retrieving revision 1.11
diff -c -r1.11 gistvacuum.c
*** src/backend/access/gist/gistvacuum.c	22 Nov 2005 18:17:05 -	1.11
--- src/backend/access/gist/gistvacuum.c	7 Dec 2005 11:52:43 -
***
*** 125,131 
  	if (chldtuple.ituplen > 1)
  	{
  		/*
! 		 * child was splitted, so we need mark completion
  		 * insert(split)
  		 */
  		int			j;
--- 125,131 
  	if (chldtuple.ituplen > 1)
  	{
  		/*
! 		 * child was split, so we need mark completion
  		 * insert(split)
  		 */
  		int			j;
***
*** 329,337 
  }
  
  /*
!  * For usial vacuum just update FSM, for full vacuum
   * reforms parent tuples if some of childs was deleted or changed,
!  * update invalid tuples (they can exsist from last crash recovery only),
   * tries to get smaller index
   */
  
--- 329,337 
  }
  
  /*
!  * For usual vacuum just update FSM, for full vacuum
   * reforms parent tuples if some of childs was deleted or changed,
!  * update invalid tuples (they can exist from last crash recovery only),
   * tries to get smaller index
   */
  
***
*** 505,514 
  			   *ptr;
  	bool		needLock;
  
! 	stack = (GistBDItem *) palloc0(sizeof(GistBDItem));
! 
! 	stack->blkno = GIST_ROOT_BLKNO;
! 	needFullVacuum = false;
  
  	while (stack)
  	{
--- 505,519 
  			   *ptr;
  	bool		needLock;
  
! if (callback_state)
! {
! 	stack = (GistBDItem *) palloc0(sizeof(GistBDItem));
! 
! 	   stack->blkno = GIST_ROOT_BLKNO;
! 	   needFullVacuum = false;
! }
! else
! stack = NULL;
  
  	while (stack)
  	{
Index: src/backend/access/hash/hash.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/hash/hash.c,v
retrieving revision 1.82
diff -c -r1.82 hash.c
*** src/backend/access/hash/hash.c	6 Nov 2005 19:29:00 -	1.82
--- src/backend/access/hash/hash.c	7 Dec 2005 11:52:43 -
***
*** 504,509 
--- 504,520 
  	tuples_removed = 0;
  	num_index_tuples = 0;
  
+ 	/* return statistics */
+ 	num_pages = RelationGetNumberOfBlocks(rel);
+ 
+ 	result = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
+ 	result->num_pages = num_pages;
+ 
+ if (!callback_state)
+ {
+ 	PG_RETURN_POINTER(result);
+ }
+ 
  	/*
  	 * Read the metapage to fetch original bucket and tuple counts.  Also, we
  	 * keep a copy of the last-seen metapage so that we can use its
***
*** 652,662 
  
  	_hash_wrtbuf(rel, metabuf);
  
- 	/* return statistics */
- 	num_pages = RelationGetNumberOfBlocks(rel);
- 
- 	result = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
- 	result->num_pages = num_pages;
  	result->num_index_tuples = num_index_tuples;
  	result->tuples_removed = tuples_removed;
  
--- 663,668 
Index: src/backend/access/index/indexam.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/index/indexam.c,v
retrieving revision 1.87
diff -c -r1.87 indexam.c
*** src/backend/access/index/indexam.c	3 Dec 2005 05:51:00 -	1.87
--- src/backend/access/index/indexam.c	7 Dec 2005 11:52:44 -
***
*** 685,690 
--- 685,695 
   *		callback routine tells whether a given main-heap tuple is
   *		to be deleted
   *
+  *  passing NULL callback_state can be interpreted by the 
+  *  index access method as meaning that the index does not need
+  *  to be scanned in logical sequence to remove rows for this call
+  *  index_vacuum_cleanup is always required after this, however.
+  * 
   *		return value is an optional palloc'd struct of statistics
   * 
   */
Index: src/backend/access/nbtree/nbtree.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/nbtree/nbtree.c,v
retrieving revision 1.134
diff -c -r1.134 nbtree.c
*** src/backend/access/nbtree/nbtree.c	22 Nov 2005 18:17:06 -	1.134
--- src/backend/access/nbtree/nbtree.c	7 Dec 2005 11:52:45 -
***
*** 587,592 
--- 587,593 
  	int			ndeletable;
  	Buffer		buf;
  	BlockNumber num_pages;
+ boolscanindex = true;
  
  	tuples_removed = 0;
  	num_index_tup