Re: [HACKERS] More vacuum.c refactoring

2004-08-06 Thread Manfred Koizar
[Sorry for the late reply.  I'm still struggling to catch up after
vacation ...]

On Fri, 9 Jul 2004 21:29:52 -0400 (EDT), Bruce Momjian
[EMAIL PROTECTED] wrote:

Where are we on this, 2x.  :-)

Here:
 Tom Lane wrote:
  Will study these comments later, but it's too late at night here...

Servus
 Manfred

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] More vacuum.c refactoring

2004-08-06 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes:
 On Fri, 9 Jul 2004 21:29:52 -0400 (EDT), Bruce Momjian
 [EMAIL PROTECTED] wrote:
 
 Where are we on this, 2x.  :-)

 Here:
 Tom Lane wrote:
 Will study these comments later, but it's too late at night here...

I haven't had time to review the original, already-applied refactoring,
let alone the follow-up.  Considering that there's been at least one bug
report against the original (unused variable IIRC), I think it does need
reviewing.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] More vacuum.c refactoring

2004-07-09 Thread Bruce Momjian

Where are we on this, 2x.  :-)

---

Bruce Momjian wrote:
 
 Where are we on this?
 
 ---
 
 Tom Lane wrote:
  Manfred Koizar [EMAIL PROTECTED] writes:
   I understand you, honestly.  Do I read between your lines that you
   didn't review my previous vacuum.c refactoring patch?  Please do.  It'd
   make *me* more comfortable.
  
  I did not yet, but I will get to it.  I encourage everyone else to
  take a look too.  I agree with Alvaro that fooling with this code
  merits extreme caution.
  
  BTW, I do not at all mean to suggest that vacuum.c contains no bugs
  at the moment ;-).  I suspect for example that it is a bit random
  about whether MOVED_OFF/MOVED_IN bits get cleared immediately, or
  only by the next transaction that chances to visit the tuple.  The
  next-transaction-fixup behavior has to be there in case the VACUUM
  transaction crashes, but that doesn't mean that VACUUM should
  deliberately leave work undone.
  
   I see three significant differences between the code in repair_frag()
   and vacuum_page().
  
  Will study these comments later, but it's too late at night here...
  again, the more eyeballs on this the better...
  
  regards, tom lane
  
  ---(end of broadcast)---
  TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
  
 
 -- 
   Bruce Momjian|  http://candle.pha.pa.us
   [EMAIL PROTECTED]   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
 
 ---(end of broadcast)---
 TIP 2: you can get off all lists at once with the unregister command
 (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] More vacuum.c refactoring

2004-06-18 Thread Bruce Momjian

Where are we on this?

---

Tom Lane wrote:
 Manfred Koizar [EMAIL PROTECTED] writes:
  I understand you, honestly.  Do I read between your lines that you
  didn't review my previous vacuum.c refactoring patch?  Please do.  It'd
  make *me* more comfortable.
 
 I did not yet, but I will get to it.  I encourage everyone else to
 take a look too.  I agree with Alvaro that fooling with this code
 merits extreme caution.
 
 BTW, I do not at all mean to suggest that vacuum.c contains no bugs
 at the moment ;-).  I suspect for example that it is a bit random
 about whether MOVED_OFF/MOVED_IN bits get cleared immediately, or
 only by the next transaction that chances to visit the tuple.  The
 next-transaction-fixup behavior has to be there in case the VACUUM
 transaction crashes, but that doesn't mean that VACUUM should
 deliberately leave work undone.
 
  I see three significant differences between the code in repair_frag()
  and vacuum_page().
 
 Will study these comments later, but it's too late at night here...
 again, the more eyeballs on this the better...
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


TESTING (was: RE: [HACKERS] More vacuum.c refactoring )

2004-06-11 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
 Sent: Thursday, June 10, 2004 2:19 PM
 To: Manfred Koizar
 Cc: [EMAIL PROTECTED]
 Subject: Re: [HACKERS] More vacuum.c refactoring 
 
 
 Manfred Koizar [EMAIL PROTECTED] writes:
  This code is very similar to vacuum_page().  The major 
 difference is 
  that vacuum_page() uses vacpage-offsets while the code in 
  repair_frag() looks for MOVED_OFF bits in tuple headers.  
 AFAICS the 
  tuples with the MOVED_OFF bit set are exactly those referenced by 
  vacpage-offsets.
 
 This does not make me comfortable.  You *think* that two 
 different bits of code are doing the same thing, so you want 
 to hack up vacuum.c?  This module is delicate code --- we've 
 had tons of bugs there in the past
 --- and no I have zero confidence that passing the regression 
 tests proves anything, because all those prior bugs passed 
 the regression tests.

Then why didn't those bugs get added to the regression?  That has been
standard procedure in every place that I have ever worked.
We have 7000+ tests in our CONNX regression suite that take one week to
run, on an array of dozens of computers from micros to mainframes.
Besides finding quickly if you have reintroduced a problem, it will also
ferret out lots of newly introduced problems.
It seems that MySQL has some sort of extensive test, from looking at
their site.  Maybe the Pgsql group could simply cannibalize it.

Perhaps your regression test is really a sanity check, rather than a
regression test. After all, the meaning of 'regression' itself demands
that you introduce new tests based upon old failures.

I seem to recall that someone was porting the NIST suite to PostgreSQL.
What ever happened to that effort?


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring )

2004-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 --- and no I have zero confidence that passing the regression 
 tests proves anything, because all those prior bugs passed 
 the regression tests.

 Then why didn't those bugs get added to the regression?

Because there wasn't any reasonable way to make them reproducible.

The set of things we can test in the regression tests is only a small
fraction of the interesting properties of Postgres.  This is
unfortunate but ranting about standard practice doesn't change it.

 I seem to recall that someone was porting the NIST suite to PostgreSQL.
 What ever happened to that effort?

It was done and we fixed a couple of bugs based on it (the one I can
think of offhand had to do with semantics of aggregate functions in
sub-selects).  I don't think there's anything more to be learned there.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring )

2004-06-11 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: Friday, June 11, 2004 2:35 PM
 To: Dann Corbit
 Cc: Manfred Koizar; [EMAIL PROTECTED]
 Subject: Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring ) 
 
 
 Dann Corbit [EMAIL PROTECTED] writes:
  --- and no I have zero confidence that passing the regression
  tests proves anything, because all those prior bugs passed 
  the regression tests.
 
  Then why didn't those bugs get added to the regression?
 
 Because there wasn't any reasonable way to make them reproducible.
 
 The set of things we can test in the regression tests is only 
 a small fraction of the interesting properties of Postgres.  
 This is unfortunate but ranting about standard practice 
 doesn't change it.
 
  I seem to recall that someone was porting the NIST suite to 
  PostgreSQL. What ever happened to that effort?
 
 It was done and we fixed a couple of bugs based on it (the 
 one I can think of offhand had to do with semantics of 
 aggregate functions in sub-selects).  I don't think there's 
 anything more to be learned there.

It is reassuring to know that it passed with flying colors.

Can I get the ported version?

I would love to play with it.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring )

2004-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 It was done and we fixed a couple of bugs based on it (the 
 one I can think of offhand had to do with semantics of 
 aggregate functions in sub-selects).  I don't think there's 
 anything more to be learned there.

 It is reassuring to know that it passed with flying colors.

Passed with flying colors might be overstating it --- but the other
things it found were stuff we already knew about, eg, unimplemented
features, identifier case folding, that kind of thing.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[HACKERS] More vacuum.c refactoring

2004-06-10 Thread Manfred Koizar
Near the end of repair_frag() in vacuum.c -- under the comment /* clean
moved tuples from last page in Nvacpagelist list */ -- there is code
that marks itemids as unused.  Itemids affected are those referring to
tuples that have been moved off the last page.

This code is very similar to vacuum_page().  The major difference is
that vacuum_page() uses vacpage-offsets while the code in repair_frag()
looks for MOVED_OFF bits in tuple headers.  AFAICS the tuples with the
MOVED_OFF bit set are exactly those referenced by vacpage-offsets.

The attached patch passes make check and make installcheck.  Please
apply unless I'm missing something.

Servus
 Manfred
diff -Ncr ../base/src/backend/commands/vacuum.c src/backend/commands/vacuum.c
*** ../base/src/backend/commands/vacuum.c   Wed Jun  2 21:46:59 2004
--- src/backend/commands/vacuum.c   Thu Jun 10 18:50:26 2004
***
*** 2288,2355 
vacpage-offsets_free  0)
{
Buffer  buf;
-   Pagepage;
-   OffsetNumberunused[BLCKSZ / sizeof(OffsetNumber)];
-   OffsetNumberoffnum,
-   maxoff;
-   int uncnt;
-   int num_tuples = 0;
  
buf = ReadBuffer(onerel, vacpage-blkno);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
!   page = BufferGetPage(buf);
!   maxoff = PageGetMaxOffsetNumber(page);
!   for (offnum = FirstOffsetNumber;
!offnum = maxoff;
!offnum = OffsetNumberNext(offnum))
!   {
!   ItemId  itemid = PageGetItemId(page, offnum);
!   HeapTupleHeader htup;
! 
!   if (!ItemIdIsUsed(itemid))
!   continue;
!   htup = (HeapTupleHeader) PageGetItem(page, itemid);
!   if (htup-t_infomask  HEAP_XMIN_COMMITTED)
!   continue;
! 
!   /*
!   ** See comments in the walk-along-page loop above, why 
we
!   ** have Asserts here instead of if (...) elog(ERROR).
!   */
!   Assert(!(htup-t_infomask  HEAP_MOVED_IN));
!   Assert(htup-t_infomask  HEAP_MOVED_OFF);
!   Assert(HeapTupleHeaderGetXvac(htup) == myXID);
! 
!   itemid-lp_flags = ~LP_USED;
!   num_tuples++;
! 
!   }
!   Assert(vacpage-offsets_free == num_tuples);
! 
!   START_CRIT_SECTION();
! 
!   uncnt = PageRepairFragmentation(page, unused);
! 
!   /* XLOG stuff */
!   if (!onerel-rd_istemp)
!   {
!   XLogRecPtr  recptr;
! 
!   recptr = log_heap_clean(onerel, buf, unused, uncnt);
!   PageSetLSN(page, recptr);
!   PageSetSUI(page, ThisStartUpID);
!   }
!   else
!   {
!   /*
!* No XLOG record, but still need to flag that XID 
exists
!* on disk
!*/
!   MyXactMadeTempRelUpdate = true;
!   }
! 
!   END_CRIT_SECTION();
! 
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
WriteBuffer(buf);
}
--- 2288,2297 
vacpage-offsets_free  0)
{
Buffer  buf;
  
buf = ReadBuffer(onerel, vacpage-blkno);
LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
!   vacuum_page(onerel, buf, vacpage);
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
WriteBuffer(buf);
}

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] More vacuum.c refactoring

2004-06-10 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes:
 This code is very similar to vacuum_page().  The major difference is
 that vacuum_page() uses vacpage-offsets while the code in repair_frag()
 looks for MOVED_OFF bits in tuple headers.  AFAICS the tuples with the
 MOVED_OFF bit set are exactly those referenced by vacpage-offsets.

This does not make me comfortable.  You *think* that two different bits
of code are doing the same thing, so you want to hack up vacuum.c?  This
module is delicate code --- we've had tons of bugs there in the past
--- and no I have zero confidence that passing the regression tests
proves anything, because all those prior bugs passed the regression
tests.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] More vacuum.c refactoring

2004-06-10 Thread Manfred Koizar
On Thu, 10 Jun 2004 17:19:22 -0400, Tom Lane [EMAIL PROTECTED] wrote:
This does not make me comfortable.

I understand you, honestly.  Do I read between your lines that you
didn't review my previous vacuum.c refactoring patch?  Please do.  It'd
make *me* more comfortable.

  You *think* that two different bits of code are doing the same thing,

I see three significant differences between the code in repair_frag()
and vacuum_page().

1) vacuum_page() has
Assert(vacpage-offsets_used == 0);

vacpage is the last VacPage that has been inserted into Nvacpagelist.
It is allocated in line 1566, offsets_used is immediately set to 0 and
never changed.  So this Assert(...) doesn't hurt.

2) In vacuum_page() the lp_flags are set inside a critical section.

This is no problem because the clear-used-flags loop does not
elog(ERROR, ...).  Please correct me if I'm wrong.

3) vacuum_page() uses vacpage-offsets to locate the itemids that are to
be updated.

If we can show that these are the same itemids that belong to the tuples
that are found by inspecting the tuple headers, then the two code
snippets are equivalent.  The first hint that this is the case is
Assert(vacpage-offsets_free == num_tuples);

So both spots expect to update the same number of itemids.  What about
the contents of the offsets[] array?  Offset numbers are entered into
this array by statements like

vacpage-offsets[vacpage-offsets_free++] = offnum;

in or immediately after the walk-along-page loop.  These assignments are
preceded either by code that sets the appropriate infomask bits or by
assertions that the bits are already set appropriately.

The rest (from PageRepairFragmentation to END_CRIT_SECTION) is
identical.

 so you want to hack up vacuum.c?  This
module is delicate code --- we've had tons of bugs there in the past

But why is it so delicate?  Not only because it handles difficult
problems, but also because it is written in a not very
maintenance-friendly way.  Before I started refactoring the code
repair_frag() had more than 1100 lines and (almost) all variables used
anywhere in the function were declared at function level.

We cannot declare a code freeze for a module just because it is so badly
written that every change is likely to break it.  Someday someone will
*have* to change it.

Better we break it today in an effort to make the code clearer.  

--- and no I have zero confidence that passing the regression tests
proves anything

Unfortunately you are right :-(

Servus
 Manfred

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] More vacuum.c refactoring

2004-06-10 Thread Alvaro Herrera
On Fri, Jun 11, 2004 at 02:00:07AM +0200, Manfred Koizar wrote:

If I may ...

  so you want to hack up vacuum.c?  This
 module is delicate code --- we've had tons of bugs there in the past
 
 But why is it so delicate?  Not only because it handles difficult
 problems, but also because it is written in a not very
 maintenance-friendly way.  Before I started refactoring the code
 repair_frag() had more than 1100 lines and (almost) all variables used
 anywhere in the function were declared at function level.

I agree.  This code is horrid.  I also agree with Tom in that this
should be done with extreme caution, but it is a needed task.

Maybe we could establish heavier testing for this kind of change so
potential patches can be tested extensively.  Concurrent vacuums with
all kinds of imaginable operations (insert, updates, deletes), in tight
loops, could be a start.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
No es bueno caminar con un hombre muerto


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] More vacuum.c refactoring

2004-06-10 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes:
 I understand you, honestly.  Do I read between your lines that you
 didn't review my previous vacuum.c refactoring patch?  Please do.  It'd
 make *me* more comfortable.

I did not yet, but I will get to it.  I encourage everyone else to
take a look too.  I agree with Alvaro that fooling with this code
merits extreme caution.

BTW, I do not at all mean to suggest that vacuum.c contains no bugs
at the moment ;-).  I suspect for example that it is a bit random
about whether MOVED_OFF/MOVED_IN bits get cleared immediately, or
only by the next transaction that chances to visit the tuple.  The
next-transaction-fixup behavior has to be there in case the VACUUM
transaction crashes, but that doesn't mean that VACUUM should
deliberately leave work undone.

 I see three significant differences between the code in repair_frag()
 and vacuum_page().

Will study these comments later, but it's too late at night here...
again, the more eyeballs on this the better...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] More vacuum.c refactoring

2004-06-10 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Maybe we could establish heavier testing for this kind of change so
 potential patches can be tested extensively.  Concurrent vacuums with
 all kinds of imaginable operations (insert, updates, deletes), in tight
 loops, could be a start.

VACUUM FULL takes an exclusive lock, so it should not have to worry
about concurrent operations on the table.  What we have to think about
is the initial states it can see.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org