Re: [HACKERS] Regression test failures
Stefan Kaltenbrunner <[EMAIL PROTECTED]> writes:
> iirc the error I got was something along the line of:
> ERROR: catalog is missing 1 attribute(s) for relid 17231

It's possible that that's the same problem but in a form triggered by
ALTER ADD COLUMN.

I was able to reproduce the problem I saw, and have now decided that
there are several interacting bugs involved. Basically, the sequence

    BEGIN;
    SAVEPOINT x;
    CREATE TABLE foo ...;
    ROLLBACK TO x;

ought to *always* fail (bug #1) but chances not to do so because of
bug #2 --- except that there's a race condition (bug #3) which allows
the failure to emerge if some other backend has done the right thing
during a narrow time window.

Bug #1 is that relcache.c isn't accounting for subtransactions in its
handling of rd_isnew relcache entries. In the above example, foo is
marked rd_isnew and so relcache.c tries to preserve the relcache entry
until transaction end. The ROLLBACK will hit it with cache invalidation
actions telling it that the pg_class and pg_attribute entries for the
table have changed. Normally that would cause the relcache entry to be
dropped, but since it's rd_isnew, relcache.c mulishly tries to rebuild
it instead. So it's reading catalog entries that are now considered
deleted, and so the "deleted while in use" error is exactly what you'd
expect.

So why don't you get that all the time? Well, bug #2 is that
TransactionIdIsCurrentTransactionId still considers the already-aborted
subtransaction ID to be current, so the validity tests in tqual.c will
think the catalog rows are still valid.

Except that there is a race condition. If, between the time that the
subxact is marked aborted in pg_clog and the time relcache.c tries to
re-read these rows, some other backend comes along and examines the
pg_class row in question, it will mark the row as XMIN_INVALID, in which
case tqual.c doesn't bother to check TransactionIdIsCurrentTransactionId
but just declares the row no good. So, with just the right sort of
concurrent activity, it's possible to observe the error.

I think your "catalog is missing 1 attribute" example might be the same
sort of thing, except the XMIN marking happened to a pg_attribute row
instead of a pg_class row. (I'm not totally convinced by that
explanation though --- the failure should be transient rather than
repeatable, if this was the mechanism.)

After further thought I'm thinking that bug #2 is not so much whether
TransactionIdIsCurrentTransactionId is behaving correctly, but the fact
that it is being invoked at all. We should never be doing catalog
accesses in transaction-aborted state. The relcache code therefore needs
to be changed so that it doesn't try to do rebuilds immediately, but
waits until we are back in a "good" state (either out of the failed
subtransaction, or starting a whole fresh transaction if invalidation
happened at the end of a main transaction).

I think this bug exists in existing releases too, for invalidation
events affecting nailed-in-cache system catalogs. It's not clear you'd
ever see an actual failure in the field for that case, but it's still
wrong.

			regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
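[Editorial illustration, not part of the original message.] The
interaction between bug #2 and the hint-bit race described above can be
mimicked with a small toy model. This is only a sketch under stated
assumptions: the Row and Backend classes, the flag value, and the
transaction IDs are all hypothetical, and the visibility rule is far
simpler than the real tests in tqual.c. It shows just the ordering
effect: the rebuilding backend accepts the row unless some other backend
has already marked it XMIN_INVALID.

```python
from dataclasses import dataclass

XMIN_INVALID = 0x1  # hypothetical flag value, chosen only for this toy model


@dataclass
class Row:
    xmin: int           # ID of the (sub)transaction that inserted the row
    hint_bits: int = 0  # hint bits are stored on the row, so all backends see them


class Backend:
    """Toy stand-in for one backend's view of transaction state."""

    def __init__(self, current_xids, aborted_xids):
        self.current_xids = set(current_xids)  # XIDs this backend treats as its own
        self.aborted_xids = aborted_xids        # shared set, playing the role of pg_clog

    def is_current(self, xid):
        # Models bug #2: an aborted subxact ID left in current_xids still counts
        return xid in self.current_xids

    def row_is_visible(self, row):
        # A row already hinted as invalid is rejected without further checks
        if row.hint_bits & XMIN_INVALID:
            return False
        if self.is_current(row.xmin):
            return True
        if row.xmin in self.aborted_xids:
            row.hint_bits |= XMIN_INVALID  # leave a hint for later readers
            return False
        return True  # simplification: anything else is treated as committed


def rebuild_sees_row(other_backend_looked_first):
    """Does the rebuilding backend still accept the rolled-back catalog row?"""
    clog_aborted = {101}  # subtransaction 101 has been marked aborted
    row = Row(xmin=101)   # catalog row created inside that subtransaction
    me = Backend(current_xids={100, 101}, aborted_xids=clog_aborted)
    other = Backend(current_xids={200}, aborted_xids=clog_aborted)
    if other_backend_looked_first:
        other.row_is_visible(row)  # sets XMIN_INVALID as a side effect
    return me.row_is_visible(row)


print(rebuild_sees_row(False))  # no concurrent reader: row accepted, failure masked
print(rebuild_sees_row(True))   # concurrent reader got there first: row rejected
```

In this model the first call corresponds to the common case where the
rebuild quietly succeeds, and the second to the narrow window in which
the error becomes visible.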
Re: [HACKERS] Regression test failures
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
>> I am still seeing random regression test failures on my SMP BSD/OS
>> machine. It basically happens when doing 'gmake check'.
>> I have tried running repeated tests and can't get it to reproduce, but
>> when checking patches it has happened perhaps once a week for the past
>> six weeks. It happens once and then doesn't happen again.
>> I will keep investigating. I reported this perhaps three weeks ago.
>
> Do these failures look anything like this?
>
> --- 78,86 ----
>   DROP TABLE foo;
>   CREATE TABLE bar (a int);
>   ROLLBACK TO SAVEPOINT one;
> ! WARNING: AbortSubTransaction while in ABORT state
> ! ERROR: relation 555088 deleted while still in use
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> I got this once this morning and have been unable to reproduce it.
> The OID referenced in the message seemed to correspond to the relation
> "bar", created just above the point of error.

Just for the record I had strange errors too on beta1 - when playing
with creating/deleting/altering tables and savepoints (not sure if that
is related anyhow).

I had it once two times in a row, but when I tried to build a testcase
to report this issue I couldn't reproduce it again :-(

iirc the error I got was something along the line of:

ERROR: catalog is missing 1 attribute(s) for relid 17231

Stefan
Re: [HACKERS] Regression test failures
Bruce Momjian <[EMAIL PROTECTED]> writes:
> I am still seeing random regression test failures on my SMP BSD/OS
> machine. It basically happens when doing 'gmake check'.
> I have tried running repeated tests and can't get it to reproduce, but
> when checking patches it has happened perhaps once a week for the past
> six weeks. It happens once and then doesn't happen again.
> I will keep investigating. I reported this perhaps three weeks ago.

Do these failures look anything like this?

--- 78,86 ----
  DROP TABLE foo;
  CREATE TABLE bar (a int);
  ROLLBACK TO SAVEPOINT one;
! WARNING: AbortSubTransaction while in ABORT state
! ERROR: relation 555088 deleted while still in use
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost

I got this once this morning and have been unable to reproduce it.
The OID referenced in the message seemed to correspond to the relation
"bar", created just above the point of error.

			regards, tom lane
[HACKERS] Regression test failures
I am still seeing random regression test failures on my SMP BSD/OS
machine. It basically happens when doing 'gmake check'.

I have tried running repeated tests and can't get it to reproduce, but
when checking patches it has happened perhaps once a week for the past
six weeks. It happens once and then doesn't happen again.

I will keep investigating. I reported this perhaps three weeks ago.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Re: [HACKERS] regression test failures in CVS HEAD
Looks like the expected files weren't updated. Probably my fault, but
the tests themselves are fine.

On Tue, 2002-08-20 at 01:37, Neil Conway wrote:
> The 'type_sanity' and 'domain' regression tests seem to fail with CVS
> HEAD. Here's the diff:
>
> *** ./expected/type_sanity.out	Sun Aug 4 15:48:11 2002
> --- ./results/type_sanity.out	Tue Aug 20 01:32:35 2002
> ***************
> *** 16,22 ****
>   SELECT p1.oid, p1.typname
>   FROM pg_type as p1
>   WHERE (p1.typlen <= 0 AND p1.typlen != -1) OR
> !     (p1.typtype != 'b' AND p1.typtype != 'c' AND p1.typtype != 'p') OR
>       NOT p1.typisdefined OR
>       (p1.typalign != 'c' AND p1.typalign != 's' AND
>        p1.typalign != 'i' AND p1.typalign != 'd') OR
> --- 16,22 ----
>   SELECT p1.oid, p1.typname
>   FROM pg_type as p1
>   WHERE (p1.typlen <= 0 AND p1.typlen != -1) OR
> !     p1.typtype not in('b', 'c', 'd', 'p') OR
>       NOT p1.typisdefined OR
>       (p1.typalign != 'c' AND p1.typalign != 's' AND
>        p1.typalign != 'i' AND p1.typalign != 'd') OR
> ***************
> *** 60,66 ****
>   -- NOTE: as of 7.3, this check finds SET, smgr, and unknown.
>   SELECT p1.oid, p1.typname
>   FROM pg_type as p1
> ! WHERE p1.typtype = 'b' AND p1.typname NOT LIKE '\\_%' AND NOT EXISTS
>       (SELECT 1 FROM pg_type as p2
>        WHERE p2.typname = ('_' || p1.typname)::name AND
>              p2.typelem = p1.oid);
> --- 60,66 ----
>   -- NOTE: as of 7.3, this check finds SET, smgr, and unknown.
>   SELECT p1.oid, p1.typname
>   FROM pg_type as p1
> ! WHERE p1.typtype in ('b','d') AND p1.typname NOT LIKE '\\_%' AND NOT EXISTS
>       (SELECT 1 FROM pg_type as p2
>        WHERE p2.typname = ('_' || p1.typname)::name AND
>              p2.typelem = p1.oid);
>
> ======================================================================
>
> *** ./expected/domain.out	Fri Jul 12 14:43:19 2002
> --- ./results/domain.out	Tue Aug 20 01:32:57 2002
> ***************
> *** 143,154 ****
>       ( col1 ddef1
>       , col2 ddef2
>       , col3 ddef3
> !     , col4 ddef4
>       , col5 ddef1 NOT NULL DEFAULT NULL
>       , col6 ddef2 DEFAULT '88'
>       , col7 ddef4 DEFAULT 8000
>       , col8 ddef5
>       );
>   insert into defaulttest default values;
>   insert into defaulttest default values;
>   insert into defaulttest default values;
> --- 143,155 ----
>       ( col1 ddef1
>       , col2 ddef2
>       , col3 ddef3
> !     , col4 ddef4 PRIMARY KEY
>       , col5 ddef1 NOT NULL DEFAULT NULL
>       , col6 ddef2 DEFAULT '88'
>       , col7 ddef4 DEFAULT 8000
>       , col8 ddef5
>       );
> + NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'defaulttest_pkey' for table 'defaulttest'
>   insert into defaulttest default values;
>   insert into defaulttest default values;
>   insert into defaulttest default values;
>
> ======================================================================
>
> Cheers,
>
> Neil
>
> --
> Neil Conway <[EMAIL PROTECTED]> || PGP Key ID: DB3C29FC
[HACKERS] regression test failures in CVS HEAD
The 'type_sanity' and 'domain' regression tests seem to fail with CVS
HEAD. Here's the diff:

*** ./expected/type_sanity.out	Sun Aug 4 15:48:11 2002
--- ./results/type_sanity.out	Tue Aug 20 01:32:35 2002
***************
*** 16,22 ****
  SELECT p1.oid, p1.typname
  FROM pg_type as p1
  WHERE (p1.typlen <= 0 AND p1.typlen != -1) OR
!     (p1.typtype != 'b' AND p1.typtype != 'c' AND p1.typtype != 'p') OR
      NOT p1.typisdefined OR
      (p1.typalign != 'c' AND p1.typalign != 's' AND
       p1.typalign != 'i' AND p1.typalign != 'd') OR
--- 16,22 ----
  SELECT p1.oid, p1.typname
  FROM pg_type as p1
  WHERE (p1.typlen <= 0 AND p1.typlen != -1) OR
!     p1.typtype not in('b', 'c', 'd', 'p') OR
      NOT p1.typisdefined OR
      (p1.typalign != 'c' AND p1.typalign != 's' AND
       p1.typalign != 'i' AND p1.typalign != 'd') OR
***************
*** 60,66 ****
  -- NOTE: as of 7.3, this check finds SET, smgr, and unknown.
  SELECT p1.oid, p1.typname
  FROM pg_type as p1
! WHERE p1.typtype = 'b' AND p1.typname NOT LIKE '\\_%' AND NOT EXISTS
      (SELECT 1 FROM pg_type as p2
       WHERE p2.typname = ('_' || p1.typname)::name AND
             p2.typelem = p1.oid);
--- 60,66 ----
  -- NOTE: as of 7.3, this check finds SET, smgr, and unknown.
  SELECT p1.oid, p1.typname
  FROM pg_type as p1
! WHERE p1.typtype in ('b','d') AND p1.typname NOT LIKE '\\_%' AND NOT EXISTS
      (SELECT 1 FROM pg_type as p2
       WHERE p2.typname = ('_' || p1.typname)::name AND
             p2.typelem = p1.oid);

======================================================================

*** ./expected/domain.out	Fri Jul 12 14:43:19 2002
--- ./results/domain.out	Tue Aug 20 01:32:57 2002
***************
*** 143,154 ****
      ( col1 ddef1
      , col2 ddef2
      , col3 ddef3
!     , col4 ddef4
      , col5 ddef1 NOT NULL DEFAULT NULL
      , col6 ddef2 DEFAULT '88'
      , col7 ddef4 DEFAULT 8000
      , col8 ddef5
      );
  insert into defaulttest default values;
  insert into defaulttest default values;
  insert into defaulttest default values;
--- 143,155 ----
      ( col1 ddef1
      , col2 ddef2
      , col3 ddef3
!     , col4 ddef4 PRIMARY KEY
      , col5 ddef1 NOT NULL DEFAULT NULL
      , col6 ddef2 DEFAULT '88'
      , col7 ddef4 DEFAULT 8000
      , col8 ddef5
      );
+ NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'defaulttest_pkey' for table 'defaulttest'
  insert into defaulttest default values;
  insert into defaulttest default values;
  insert into defaulttest default values;

======================================================================

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]> || PGP Key ID: DB3C29FC