Re: [HACKERS] More vacuum.c refactoring
[Sorry for the late reply. I'm still struggling to catch up after vacation ...] On Fri, 9 Jul 2004 21:29:52 -0400 (EDT), Bruce Momjian [EMAIL PROTECTED] wrote: Where are we on this, 2x. :-) Here: Tom Lane wrote: Will study these comments later, but it's too late at night here... Servus Manfred ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] More vacuum.c refactoring
Manfred Koizar [EMAIL PROTECTED] writes: On Fri, 9 Jul 2004 21:29:52 -0400 (EDT), Bruce Momjian [EMAIL PROTECTED] wrote: Where are we on this, 2x. :-) Here: Tom Lane wrote: Will study these comments later, but it's too late at night here... I haven't had time to review the original, already-applied refactoring, let alone the follow-up. Considering that there's been at least one bug report against the original (unused variable IIRC), I think it does need reviewing. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] More vacuum.c refactoring
Where are we on this, 2x. :-) --- Bruce Momjian wrote: Where are we on this? --- Tom Lane wrote: Manfred Koizar [EMAIL PROTECTED] writes: I understand you, honestly. Do I read between your lines that you didn't review my previous vacuum.c refactoring patch? Please do. It'd make *me* more comfortable. I did not yet, but I will get to it. I encourage everyone else to take a look too. I agree with Alvaro that fooling with this code merits extreme caution. BTW, I do not at all mean to suggest that vacuum.c contains no bugs at the moment ;-). I suspect for example that it is a bit random about whether MOVED_OFF/MOVED_IN bits get cleared immediately, or only by the next transaction that chances to visit the tuple. The next-transaction-fixup behavior has to be there in case the VACUUM transaction crashes, but that doesn't mean that VACUUM should deliberately leave work undone. I see three significant differences between the code in repair_frag() and vacuum_page(). Will study these comments later, but it's too late at night here... again, the more eyeballs on this the better... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] More vacuum.c refactoring
Where are we on this? --- Tom Lane wrote: Manfred Koizar [EMAIL PROTECTED] writes: I understand you, honestly. Do I read between your lines that you didn't review my previous vacuum.c refactoring patch? Please do. It'd make *me* more comfortable. I did not yet, but I will get to it. I encourage everyone else to take a look too. I agree with Alvaro that fooling with this code merits extreme caution. BTW, I do not at all mean to suggest that vacuum.c contains no bugs at the moment ;-). I suspect for example that it is a bit random about whether MOVED_OFF/MOVED_IN bits get cleared immediately, or only by the next transaction that chances to visit the tuple. The next-transaction-fixup behavior has to be there in case the VACUUM transaction crashes, but that doesn't mean that VACUUM should deliberately leave work undone. I see three significant differences between the code in repair_frag() and vacuum_page(). Will study these comments later, but it's too late at night here... again, the more eyeballs on this the better... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
TESTING (was: RE: [HACKERS] More vacuum.c refactoring )
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Thursday, June 10, 2004 2:19 PM To: Manfred Koizar Cc: [EMAIL PROTECTED] Subject: Re: [HACKERS] More vacuum.c refactoring Manfred Koizar [EMAIL PROTECTED] writes: This code is very similar to vacuum_page(). The major difference is that vacuum_page() uses vacpage-offsets while the code in repair_frag() looks for MOVED_OFF bits in tuple headers. AFAICS the tuples with the MOVED_OFF bit set are exactly those referenced by vacpage-offsets. This does not make me comfortable. You *think* that two different bits of code are doing the same thing, so you want to hack up vacuum.c? This module is delicate code --- we've had tons of bugs there in the past --- and no I have zero confidence that passing the regression tests proves anything, because all those prior bugs passed the regression tests. Then why didn't those bugs get added to the regression? That has been standard procedure in every place that I have ever worked. We have 7000+ tests in our CONNX regression suite that take one week to run, on an array of dozens of computers from micros to mainframes. Besides finding quickly if you have reintroduced a problem, it will also ferret out lots of newly introduced problems. It seems that MySQL has some sort of extensive test, from looking at their site. Maybe the Pgsql group could simply cannibalize it. Perhaps your regression test is really a sanity check, rather than a regression test. After all, the meaning of 'regression' itself demands that you introduce new tests based upon old failures. I seem to recall that someone was porting the NIST suite to PostgreSQL. What ever happened to that effort? ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring )
Dann Corbit [EMAIL PROTECTED] writes: --- and no I have zero confidence that passing the regression tests proves anything, because all those prior bugs passed the regression tests. Then why didn't those bugs get added to the regression? Because there wasn't any reasonable way to make them reproducible. The set of things we can test in the regression tests is only a small fraction of the interesting properties of Postgres. This is unfortunate but ranting about standard practice doesn't change it. I seem to recall that someone was porting the NIST suite to PostgreSQL. What ever happened to that effort? It was done and we fixed a couple of bugs based on it (the one I can think of offhand had to do with semantics of aggregate functions in sub-selects). I don't think there's anything more to be learned there. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring )
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Friday, June 11, 2004 2:35 PM To: Dann Corbit Cc: Manfred Koizar; [EMAIL PROTECTED] Subject: Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring ) Dann Corbit [EMAIL PROTECTED] writes: --- and no I have zero confidence that passing the regression tests proves anything, because all those prior bugs passed the regression tests. Then why didn't those bugs get added to the regression? Because there wasn't any reasonable way to make them reproducible. The set of things we can test in the regression tests is only a small fraction of the interesting properties of Postgres. This is unfortunate but ranting about standard practice doesn't change it. I seem to recall that someone was porting the NIST suite to PostgreSQL. What ever happened to that effort? It was done and we fixed a couple of bugs based on it (the one I can think of offhand had to do with semantics of aggregate functions in sub-selects). I don't think there's anything more to be learned there. It is reassuring to know that it passed with flying colors. Can I get the ported version? I would love to play with it. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: TESTING (was: RE: [HACKERS] More vacuum.c refactoring )
Dann Corbit [EMAIL PROTECTED] writes: It was done and we fixed a couple of bugs based on it (the one I can think of offhand had to do with semantics of aggregate functions in sub-selects). I don't think there's anything more to be learned there. It is reassuring to know that it passed with flying colors. Passed with flying colors might be overstating it --- but the other things it found were stuff we already knew about, eg, unimplemented features, identifier case folding, that kind of thing. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] More vacuum.c refactoring
Near the end of repair_frag() in vacuum.c -- under the comment /* clean moved tuples from last page in Nvacpagelist list */ -- there is code that marks itemids as unused. Itemids affected are those referring to tuples that have been moved off the last page. This code is very similar to vacuum_page(). The major difference is that vacuum_page() uses vacpage-offsets while the code in repair_frag() looks for MOVED_OFF bits in tuple headers. AFAICS the tuples with the MOVED_OFF bit set are exactly those referenced by vacpage-offsets. The attached patch passes make check and make installcheck. Please apply unless I'm missing something. Servus Manfred diff -Ncr ../base/src/backend/commands/vacuum.c src/backend/commands/vacuum.c *** ../base/src/backend/commands/vacuum.c Wed Jun 2 21:46:59 2004 --- src/backend/commands/vacuum.c Thu Jun 10 18:50:26 2004 *** *** 2288,2355 vacpage-offsets_free 0) { Buffer buf; - Pagepage; - OffsetNumberunused[BLCKSZ / sizeof(OffsetNumber)]; - OffsetNumberoffnum, - maxoff; - int uncnt; - int num_tuples = 0; buf = ReadBuffer(onerel, vacpage-blkno); LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE); ! page = BufferGetPage(buf); ! maxoff = PageGetMaxOffsetNumber(page); ! for (offnum = FirstOffsetNumber; !offnum = maxoff; !offnum = OffsetNumberNext(offnum)) ! { ! ItemId itemid = PageGetItemId(page, offnum); ! HeapTupleHeader htup; ! ! if (!ItemIdIsUsed(itemid)) ! continue; ! htup = (HeapTupleHeader) PageGetItem(page, itemid); ! if (htup-t_infomask HEAP_XMIN_COMMITTED) ! continue; ! ! /* ! ** See comments in the walk-along-page loop above, why we ! ** have Asserts here instead of if (...) elog(ERROR). ! */ ! Assert(!(htup-t_infomask HEAP_MOVED_IN)); ! Assert(htup-t_infomask HEAP_MOVED_OFF); ! Assert(HeapTupleHeaderGetXvac(htup) == myXID); ! ! itemid-lp_flags = ~LP_USED; ! num_tuples++; ! ! } ! Assert(vacpage-offsets_free == num_tuples); ! ! START_CRIT_SECTION(); ! ! uncnt = PageRepairFragmentation(page, unused); ! ! /* XLOG stuff */ ! if (!onerel-rd_istemp) ! { ! XLogRecPtr recptr; ! ! recptr = log_heap_clean(onerel, buf, unused, uncnt); ! PageSetLSN(page, recptr); ! PageSetSUI(page, ThisStartUpID); ! } ! else ! { ! /* !* No XLOG record, but still need to flag that XID exists !* on disk !*/ ! MyXactMadeTempRelUpdate = true; ! } ! ! END_CRIT_SECTION(); ! LockBuffer(buf, BUFFER_LOCK_UNLOCK); WriteBuffer(buf); } --- 2288,2297 vacpage-offsets_free 0) { Buffer buf; buf = ReadBuffer(onerel, vacpage-blkno); LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE); ! vacuum_page(onerel, buf, vacpage); LockBuffer(buf, BUFFER_LOCK_UNLOCK); WriteBuffer(buf); } ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] More vacuum.c refactoring
Manfred Koizar [EMAIL PROTECTED] writes: This code is very similar to vacuum_page(). The major difference is that vacuum_page() uses vacpage-offsets while the code in repair_frag() looks for MOVED_OFF bits in tuple headers. AFAICS the tuples with the MOVED_OFF bit set are exactly those referenced by vacpage-offsets. This does not make me comfortable. You *think* that two different bits of code are doing the same thing, so you want to hack up vacuum.c? This module is delicate code --- we've had tons of bugs there in the past --- and no I have zero confidence that passing the regression tests proves anything, because all those prior bugs passed the regression tests. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] More vacuum.c refactoring
On Thu, 10 Jun 2004 17:19:22 -0400, Tom Lane [EMAIL PROTECTED] wrote: This does not make me comfortable. I understand you, honestly. Do I read between your lines that you didn't review my previous vacuum.c refactoring patch? Please do. It'd make *me* more comfortable. You *think* that two different bits of code are doing the same thing, I see three significant differences between the code in repair_frag() and vacuum_page(). 1) vacuum_page() has Assert(vacpage-offsets_used == 0); vacpage is the last VacPage that has been inserted into Nvacpagelist. It is allocated in line 1566, offsets_used is immediately set to 0 and never changed. So this Assert(...) doesn't hurt. 2) In vacuum_page() the lp_flags are set inside a critical section. This is no problem because the clear-used-flags loop does not elog(ERROR, ...). Please correct me if I'm wrong. 3) vacuum_page() uses vacpage-offsets to locate the itemids that are to be updated. If we can show that these are the same itemids that belong to the tuples that are found by inspecting the tuple headers, then the two code snippets are equivalent. The first hint that this is the case is Assert(vacpage-offsets_free == num_tuples); So both spots expect to update the same number of itemids. What about the contents of the offsets[] array? Offset numbers are entered into this array by statements like vacpage-offsets[vacpage-offsets_free++] = offnum; in or immediately after the walk-along-page loop. These assignments are preceded either by code that sets the appropriate infomask bits or by assertions that the bits are already set appropriately. The rest (from PageRepairFragmentation to END_CRIT_SECTION) is identical. so you want to hack up vacuum.c? This module is delicate code --- we've had tons of bugs there in the past But why is it so delicate? Not only because it handles difficult problems, but also because it is written in a not very maintenance-friendly way. Before I started refactoring the code repair_frag() had more than 1100 lines and (almost) all variables used anywhere in the function were declared at function level. We cannot declare a code freeze for a module just because it is so badly written that every change is likely to break it. Someday someone will *have* to change it. Better we break it today in an effort to make the code clearer. --- and no I have zero confidence that passing the regression tests proves anything Unfortunately you are right :-( Servus Manfred ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] More vacuum.c refactoring
On Fri, Jun 11, 2004 at 02:00:07AM +0200, Manfred Koizar wrote: If I may ... so you want to hack up vacuum.c? This module is delicate code --- we've had tons of bugs there in the past But why is it so delicate? Not only because it handles difficult problems, but also because it is written in a not very maintenance-friendly way. Before I started refactoring the code repair_frag() had more than 1100 lines and (almost) all variables used anywhere in the function were declared at function level. I agree. This code is horrid. I also agree with Tom in that this should be done with extreme caution, but it is a needed task. Maybe we could establish heavier testing for this kind of change so potential patches can be tested extensively. Concurrent vacuums with all kinds of imaginable operations (insert, updates, deletes), in tight loops, could be a start. -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) No es bueno caminar con un hombre muerto ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] More vacuum.c refactoring
Manfred Koizar [EMAIL PROTECTED] writes: I understand you, honestly. Do I read between your lines that you didn't review my previous vacuum.c refactoring patch? Please do. It'd make *me* more comfortable. I did not yet, but I will get to it. I encourage everyone else to take a look too. I agree with Alvaro that fooling with this code merits extreme caution. BTW, I do not at all mean to suggest that vacuum.c contains no bugs at the moment ;-). I suspect for example that it is a bit random about whether MOVED_OFF/MOVED_IN bits get cleared immediately, or only by the next transaction that chances to visit the tuple. The next-transaction-fixup behavior has to be there in case the VACUUM transaction crashes, but that doesn't mean that VACUUM should deliberately leave work undone. I see three significant differences between the code in repair_frag() and vacuum_page(). Will study these comments later, but it's too late at night here... again, the more eyeballs on this the better... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] More vacuum.c refactoring
Alvaro Herrera [EMAIL PROTECTED] writes: Maybe we could establish heavier testing for this kind of change so potential patches can be tested extensively. Concurrent vacuums with all kinds of imaginable operations (insert, updates, deletes), in tight loops, could be a start. VACUUM FULL takes an exclusive lock, so it should not have to worry about concurrent operations on the table. What we have to think about is the initial states it can see. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org