Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-09-04 Thread Alvaro Herrera
Gregory Stark wrote:
 
 Bruce Momjian [EMAIL PROTECTED] writes:
 
  Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   Patch applied.  Thanks.
  
  Wait a minute.   This patch changes the behavior so that
  LockBufferForCleanup is applied to *every* heap page, not only the ones
  where there are removable tuples.  It's not hard to imagine scenarios
  where that results in severe system-wide performance degradation.
  Has there been any real-world testing of this idea?
 
  I see the no-index case now:
 
  +   if (nindexes)
  +   LockBuffer(buf, BUFFER_LOCK_SHARE);
  +   else
  +   LockBufferForCleanup(buf);
 
  Let's see what Greg says, or revert.
 
 Hm, that's a good point. I could return it to the original method where it
 released the share lock and did he LockBufferForCleanup only if necessary. I
 thought it was awkward to acquire a lock then release it to acquire a
 different lock on the same buffer but it's true that it doesn't always have to
 acquire the second lock.

This rush to apply patches just because no one seems to be capable of
keeping up with them not being reviewed, is starting to get a bit
worrisome.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-09-04 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Patch applied.  Thanks.

Wait a minute.   This patch changes the behavior so that
LockBufferForCleanup is applied to *every* heap page, not only the ones
where there are removable tuples.  It's not hard to imagine scenarios
where that results in severe system-wide performance degradation.
Has there been any real-world testing of this idea?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-09-04 Thread Gregory Stark

Bruce Momjian [EMAIL PROTECTED] writes:

 Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Patch applied.  Thanks.
 
 Wait a minute.   This patch changes the behavior so that
 LockBufferForCleanup is applied to *every* heap page, not only the ones
 where there are removable tuples.  It's not hard to imagine scenarios
 where that results in severe system-wide performance degradation.
 Has there been any real-world testing of this idea?

 I see the no-index case now:

 +   if (nindexes)
 +   LockBuffer(buf, BUFFER_LOCK_SHARE);
 +   else
 +   LockBufferForCleanup(buf);

 Let's see what Greg says, or revert.

Hm, that's a good point. I could return it to the original method where it
released the share lock and did he LockBufferForCleanup only if necessary. I
thought it was awkward to acquire a lock then release it to acquire a
different lock on the same buffer but it's true that it doesn't always have to
acquire the second lock.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-08-28 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 The reason the patch is so short is that it's a kluge.  If we really
 cared about supporting this case, more wide-ranging changes would be
 needed (eg, there's no need to eat maintenance_work_mem worth of RAM
 for the dead-TIDs array); and a decent respect to the opinions of
 mankind would require some attention to updating the header comments
 and function descriptions, too.

The only part that seems klugy to me is how it releases the lock and
reacquires it rather than wait in the first place until it can acquire the
lock. Fixed that and changed lazy_space_alloc to allocate only as much space
as is really necessary.

Gosh, I've never been accused of offending all mankind before.



--- vacuumlazy.c31 Jul 2006 21:09:00 +0100  1.76
+++ vacuumlazy.c28 Aug 2006 09:58:41 +0100  
@@ -16,6 +16,10 @@
  * perform a pass of index cleanup and page compaction, then resume the heap
  * scan with an empty TID array.
  *
+ * As a special exception if we're processing a table with no indexes we can
+ * vacuum each page as we go so we don't need to allocate more space than
+ * enough to hold as many heap tuples fit on one page.
+ *
  * We can limit the storage for page free space to MaxFSMPages entries,
  * since that's the most the free space map will be willing to remember
  * anyway. If the relation has fewer than that many pages with free space,
@@ -106,7 +110,7 @@
   TransactionId 
OldestXmin);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 LVRelStats *vacrelstats, 
TransactionId OldestXmin);
-static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks, 
unsigned nindexes);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
   ItemPointer itemptr);
 static void lazy_record_free_space(LVRelStats *vacrelstats,
@@ -206,7 +210,8 @@
  * This routine sets commit status bits, builds lists of dead 
tuples
  * and pages with free space, and calculates statistics on the 
number
  * of live tuples in the heap.  When done, or when we run low on 
space
- * for dead-tuple TIDs, invoke vacuuming of indexes and heap.
+ * for dead-tuple TIDs, or after every page if the table has no 
indexes 
+ * invoke vacuuming of indexes and heap.
  *
  * It also updates the minimum Xid found anywhere on the table in
  * vacrelstats-minxid, for later storing it in pg_class.relminxid.
@@ -247,7 +252,7 @@
vacrelstats-rel_pages = nblocks;
vacrelstats-nonempty_pages = 0;
 
-   lazy_space_alloc(vacrelstats, nblocks);
+   lazy_space_alloc(vacrelstats, nblocks, nindexes);
 
for (blkno = 0; blkno  nblocks; blkno++)
{
@@ -282,8 +287,14 @@
 
buf = ReadBuffer(onerel, blkno);
 
-   /* In this phase we only need shared access to the buffer */
-   LockBuffer(buf, BUFFER_LOCK_SHARE);
+   /* In this phase we only need shared access to the buffer 
unless we're
+* going to do the vacuuming now which we do if there are no 
indexes 
+*/
+
+   if (nindexes)
+   LockBuffer(buf, BUFFER_LOCK_SHARE);
+   else
+   LockBufferForCleanup(buf);
 
page = BufferGetPage(buf);
 
@@ -450,6 +461,12 @@
{
lazy_record_free_space(vacrelstats, blkno,
   
PageGetFreeSpace(page));
+   } else if (!nindexes) {
+   /* If there are no indexes we can vacuum the page right 
now instead
+* of doing a second scan */
+   lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats);
+   lazy_record_free_space(vacrelstats, blkno, 
PageGetFreeSpace(BufferGetPage(buf)));
+   vacrelstats-num_dead_tuples = 0;
}
 
/* Remember the location of the last page with nonremovable 
tuples */
@@ -891,16 +908,20 @@
  * See the comments at the head of this file for rationale.
  */
 static void
-lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
+lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks, unsigned 
nindexes)
 {
longmaxtuples;
int maxpages;
 
-   maxtuples = (maintenance_work_mem * 1024L) / sizeof(ItemPointerData);
-   maxtuples = Min(maxtuples, INT_MAX);
-   maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
-   /* stay sane if small maintenance_work_mem */
-   maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);
+   if (nindexes) {
+  

Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-08-28 Thread Alvaro Herrera
Gregory Stark wrote:
 
 Tom Lane [EMAIL PROTECTED] writes:
 
  The reason the patch is so short is that it's a kluge.  If we really
  cared about supporting this case, more wide-ranging changes would be
  needed (eg, there's no need to eat maintenance_work_mem worth of RAM
  for the dead-TIDs array); and a decent respect to the opinions of
  mankind would require some attention to updating the header comments
  and function descriptions, too.
 
 The only part that seems klugy to me is how it releases the lock and
 reacquires it rather than wait in the first place until it can acquire the
 lock. Fixed that and changed lazy_space_alloc to allocate only as much space
 as is really necessary.
 
 Gosh, I've never been accused of offending all mankind before.

Does that feel good or bad?

I won't comment on the spirit of the patch but I'll observe that you
should respect mankind a little more by observing brace position in
if/else ;-)



-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-08-27 Thread Tom Lane
stark [EMAIL PROTECTED] writes:
 There isn't really any need for the second pass in lazy vacuum if the table
 has no indexes.

How often does that case come up in the real world, for tables that are
large enough that you'd care about vacuum performance?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-08-27 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 stark [EMAIL PROTECTED] writes:
 There isn't really any need for the second pass in lazy vacuum if the table
 has no indexes.

 How often does that case come up in the real world, for tables that are
 large enough that you'd care about vacuum performance?

Admittedly it's not the most common scenario. But it does come up. ETL
applications for example that load data, then perform some manipulation of the
data before loading the data. If they have many updates to do they'll probably
have to do vacuums between some of them.

Arguably if you don't have any indexes on a large table it's quite likely to
be *because* you're planning on doing some big updates such that it'll be
faster to simply rebuild the indexes when you're done anyways.

I would have had the same objection if it resulted in substantially more
complex code but it was so simple that it doesn't seem like a concern.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Trivial patch to double vacuum speed on tables with no indexes

2006-08-27 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 How often does that case come up in the real world, for tables that are
 large enough that you'd care about vacuum performance?

 I would have had the same objection if it resulted in substantially more
 complex code but it was so simple that it doesn't seem like a concern.

The reason the patch is so short is that it's a kluge.  If we really
cared about supporting this case, more wide-ranging changes would be
needed (eg, there's no need to eat maintenance_work_mem worth of RAM
for the dead-TIDs array); and a decent respect to the opinions of
mankind would require some attention to updating the header comments
and function descriptions, too.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend