On Tue, Dec 21, 2004 at 05:56:47PM -0500, Tom Lane wrote:
> Mark Wong <[EMAIL PROTECTED]> writes:
> > On Tue, Dec 21, 2004 at 02:23:41PM -0500, Tom Lane wrote:
> >> Mark Wong <[EMAIL PROTECTED]> writes:
> >>> [2004-12-20 15:48:18 PST] The error is [ERROR: failed to
> >>> re-find parent key in "pk_district"
> >>
> >> Yikes. Is this reproducible?
>
> > Yes, and I think there is one for each of the rollbacks that are
> > occuring in the workload. Except for the 1% that's supposed to happen
> > for the new-order transaction.
>
> Well, we need to find out what's causing that. There are two possible
> sources of that error (one elog in src/backend/access/nbtree/nbtinsert.c,
> and one in src/backend/access/nbtree/nbtpage.c) and neither of them
> should ever fire.
>
> If you want to track it yourself, please change those elog(ERROR)s to
> elog(PANIC) so that they'll generate core dumps, then build with
> --enable-debug if you didn't already (--enable-cassert would be good too)
> and get a debugger stack trace from the core dump.
>
> Otherwise, can you extract a test case that causes this without needing
> vast resources to run?
>
> regards, tom lane
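
For reference, the steps described above could be sketched roughly as follows. This is only an outline under assumptions: PGSRC, PGBIN and PGDATA are hypothetical locations, the run() wrapper is an illustrative helper that only prints the commands unless RUN=1 is set, and the core file's name and location vary by platform. The elog(ERROR) to elog(PANIC) change itself still has to be made by hand in the two files that grep points at.

```shell
#!/bin/sh
# Sketch only: the paths below are assumptions, not taken from the thread.
PGSRC="${PGSRC:-$HOME/postgresql}"       # hypothetical source tree
PGBIN="${PGBIN:-/usr/local/pgsql/bin}"   # default install prefix, adjust as needed
PGDATA="${PGDATA:-$HOME/pgdata}"         # hypothetical data directory

# Illustrative helper: print each command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

debug_steps() {
    # 1. Find the two elog(ERROR) calls that need changing to elog(PANIC).
    run grep -n "re-find parent key" \
        "$PGSRC/src/backend/access/nbtree/nbtinsert.c" \
        "$PGSRC/src/backend/access/nbtree/nbtpage.c"
    # 2. Rebuild with debug symbols and assertion checks.
    run cd "$PGSRC"
    run ./configure --enable-debug --enable-cassert
    run make
    run make install
    # 3. Allow core files in the shell that will start the postmaster.
    run ulimit -c unlimited
    # 4. After a PANIC, pull a stack trace out of the core file
    #    (assumed here to be "$PGDATA/core"; the real name may differ).
    run gdb -batch -ex bt "$PGBIN/postgres" "$PGDATA/core"
}

debug_steps
```

Run as-is, the script only lists the commands; RUN=1 would actually execute them, so the elog() edit needs to be done between steps 1 and 2.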

I was going to try Matthew's suggestion of turning up the debug level on
pg_autovacuum, unless you don't think that'll help find the cause. I'm
not sure I can reproduce the problem any more easily, but I can try.
I'll go ahead and make the elog() changes you recommended and do a run
overnight either way.

Mark