Re: [HACKERS] curious regression failures
Alvaro Herrera [EMAIL PROTECTED] writes:

> Tom Lane wrote:
>> Idle thought here: did anything get done with the idea of decoupling
>> main-table vacuum decisions from toast-table vacuum decisions? vacuum.c
>> comments
>>
>>      * Get a session-level lock too. This will protect our access to the
>>      * relation across multiple transactions, so that we can vacuum the
>>      * relation's TOAST table (if any) secure in the knowledge that no one is
>>      * deleting the parent relation.
>>
>> and it suddenly occurs to me that we'd need some other way to deal with
>> that scenario if autovac tries to vacuum toast tables independently.
>
> Hmm, right. We didn't change this in 8.3 but it looks like somebody will
> need to have a great idea before long. Of course, the easy answer is to
> grab a session-level lock for the main table while vacuuming the toast
> table, but it doesn't seem very friendly.

Just a normal lock would do, no? At least for normal (non-full) vacuum.

I'm not clear why this has to be dealt with at all though. What happens if we don't do anything? Doesn't it just mean a user trying to drop the table will block until the vacuum is done? Or does dropping not take a lock on the toast table?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] curious regression failures
Tom Lane wrote:

> Idle thought here: did anything get done with the idea of decoupling
> main-table vacuum decisions from toast-table vacuum decisions? vacuum.c
> comments
>
>      * Get a session-level lock too. This will protect our access to the
>      * relation across multiple transactions, so that we can vacuum the
>      * relation's TOAST table (if any) secure in the knowledge that no one is
>      * deleting the parent relation.
>
> and it suddenly occurs to me that we'd need some other way to deal with
> that scenario if autovac tries to vacuum toast tables independently.

Hmm, right. We didn't change this in 8.3 but it looks like somebody will need to have a great idea before long. Of course, the easy answer is to grab a session-level lock for the main table while vacuuming the toast table, but it doesn't seem very friendly.

> Also, did you see the thread complaining that autovacuums block CREATE
> INDEX? This seems true given the current locking definitions, and it's
> a bit annoying. Is it worth inventing a new table lock type just for
> vacuum?

Hmm. I think Jim is right in that what we need is to make some forms of ALTER TABLE take a lighter lock, one that doesn't conflict with analyze. Guillaume's complaint is about restore times, which can only be affected by analyze, not vacuum.

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

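To make the "autovacuum blocks CREATE INDEX" complaint concrete, here is a small sketch of the table-level lock conflict rules as the PostgreSQL documentation describes them. The lock-mode names and the conflict table follow the documented rules; the COMMAND_LOCK mapping and the blocks() helper are illustrative names made up for this sketch, and the mapping only covers plain (non-FULL) VACUUM, ANALYZE, CREATE INDEX and ALTER TABLE as they behaved around 8.3.

```python
# Conflict table for table-level lock modes, as listed in the PostgreSQL
# documentation: each mode maps to the set of modes it conflicts with.
CONFLICTS = {
    "AccessShare": {"AccessExclusive"},
    "RowShare": {"Exclusive", "AccessExclusive"},
    "RowExclusive": {"Share", "ShareRowExclusive", "Exclusive",
                     "AccessExclusive"},
    "ShareUpdateExclusive": {"ShareUpdateExclusive", "Share",
                             "ShareRowExclusive", "Exclusive",
                             "AccessExclusive"},
    "Share": {"RowExclusive", "ShareUpdateExclusive", "ShareRowExclusive",
              "Exclusive", "AccessExclusive"},
    "ShareRowExclusive": {"RowExclusive", "ShareUpdateExclusive", "Share",
                          "ShareRowExclusive", "Exclusive",
                          "AccessExclusive"},
    "Exclusive": {"RowShare", "RowExclusive", "ShareUpdateExclusive",
                  "Share", "ShareRowExclusive", "Exclusive",
                  "AccessExclusive"},
    "AccessExclusive": {"AccessShare", "RowShare", "RowExclusive",
                        "ShareUpdateExclusive", "Share",
                        "ShareRowExclusive", "Exclusive",
                        "AccessExclusive"},
}

# Illustrative mapping (an assumption of this sketch, simplified): the lock
# each command takes on the table it operates on.
COMMAND_LOCK = {
    "SELECT": "AccessShare",
    "VACUUM": "ShareUpdateExclusive",   # lazy (non-FULL) vacuum
    "ANALYZE": "ShareUpdateExclusive",
    "CREATE INDEX": "Share",
    "ALTER TABLE": "AccessExclusive",
}


def blocks(holder, waiter):
    """Return True if a command holding its lock makes the other one wait."""
    return COMMAND_LOCK[waiter] in CONFLICTS[COMMAND_LOCK[holder]]


if __name__ == "__main__":
    for holder in ("VACUUM", "ANALYZE"):
        for waiter in ("SELECT", "CREATE INDEX", "ALTER TABLE"):
            print(f"{holder} blocks {waiter}: {blocks(holder, waiter)}")
```

Under these rules a running vacuum or analyze makes CREATE INDEX and ALTER TABLE wait while plain reads proceed, which is why the choices discussed above are either a new, weaker lock for vacuum or lighter locks for the commands that currently conflict with it.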
Re: [HACKERS] curious regression failures
Tom Lane wrote:

> There might be another way to manage this, but we're not inventing a
> new invalidation mechanism for 8.3. This patch will have to be reverted
> for the time being :-(

Thanks. Seems it was a good judgement call to apply it only to HEAD, after all.

In any case, at that point we are mostly done with the expensive steps of vacuuming, so the transaction finishes not long after this. I don't think this issue is worth inventing a new invalidation mechanism.

--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
Victory belongs to whoever dares to be alone

Re: [HACKERS] curious regression failures
Alvaro Herrera [EMAIL PROTECTED] writes:

> In any case, at that point we are mostly done with the expensive steps
> of vacuuming, so the transaction finishes not long after this. I don't
> think this issue is worth inventing a new invalidation mechanism.

Yeah, I agree --- there are only a few catalog updates left to do after we truncate. If we held the main-table exclusive lock while vacuuming the TOAST table, we'd have a problem, but it looks to me like we don't.

Idle thought here: did anything get done with the idea of decoupling main-table vacuum decisions from toast-table vacuum decisions? vacuum.c comments

     * Get a session-level lock too. This will protect our access to the
     * relation across multiple transactions, so that we can vacuum the
     * relation's TOAST table (if any) secure in the knowledge that no one is
     * deleting the parent relation.

and it suddenly occurs to me that we'd need some other way to deal with that scenario if autovac tries to vacuum toast tables independently.

Also, did you see the thread complaining that autovacuums block CREATE INDEX? This seems true given the current locking definitions, and it's a bit annoying. Is it worth inventing a new table lock type just for vacuum?

regards, tom lane

Re: [HACKERS] curious regression failures (was Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded)
Andrew Dunstan [EMAIL PROTECTED] writes:

> pgbfprod=# select sysname, stage, snapshot from build_status where log ~ $$read only \d+ of \d+ bytes$$;
>   sysname   |    stage     |      snapshot
> ------------+--------------+---------------------
>  zebra      | InstallCheck | 2007-09-11 10:25:03
>  wildebeest | InstallCheck | 2007-09-11 22:00:11
>  baiji      | InstallCheck | 2007-09-12 22:39:24
>  luna_moth  | InstallCheck | 2007-09-19 13:10:01
> (4 rows)

Fascinating. So I would venture that (1) it's definitely our bug, not something we could blame on NFS or whatever, and (2) we introduced it fairly recently. That specific error message wording exists only in HEAD, but it's been there since 2007-01-03, so if there were a pre-existing problem you'd think there would be some more matches.

The patterns I notice here are (1) they're all InstallCheck not Check failures; (2) though not all at the same place in the tests, it's a fairly short range; (3) it's all references to system catalogs, though not all the same one.

My gut feeling is that we're seeing autovacuum truncate off an empty end block and then a backend tries to reference that block again. But there should be enough interlocks in place to prevent such references. Any ideas out there?

regards, tom lane

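For readers who want to run the same kind of filter outside the buildfarm database, the following is a rough sketch of the query's regular-expression match applied to in-memory rows. Only the pattern itself comes from the query above; the sample rows, their log text, and the failing_runs() helper are hypothetical and stand in for the real build_status table.

```python
import re

# Same pattern as in the buildfarm query; PostgreSQL's ~ operator is an
# unanchored match, which corresponds to re.search here.
SHORT_READ = re.compile(r"read only \d+ of \d+ bytes")


def failing_runs(rows):
    """Return (sysname, stage) for every row whose log matches the pattern."""
    return [(sysname, stage)
            for sysname, stage, log in rows
            if SHORT_READ.search(log)]


if __name__ == "__main__":
    # Hypothetical rows: the log text is invented for illustration only.
    sample = [
        ("animal_a", "InstallCheck", "ERROR: ... read only 0 of 8192 bytes"),
        ("animal_b", "Check", "all tests passed"),
    ]
    print(failing_runs(sample))
```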
Re: [HACKERS] curious regression failures
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:

> Andrew Dunstan wrote:
>> pgbfprod=# select sysname, stage, snapshot from build_status where log ~ $$read only \d+ of \d+ bytes$$;
>>   sysname   |    stage     |      snapshot
>> ------------+--------------+---------------------
>>  zebra      | InstallCheck | 2007-09-11 10:25:03
>>  wildebeest | InstallCheck | 2007-09-11 22:00:11
>>  baiji      | InstallCheck | 2007-09-12 22:39:24
>>  luna_moth  | InstallCheck | 2007-09-19 13:10:01
>
> hmm all of those seem to fail the foreign key checks in a very similar
> way, and those are vastly different platforms (windows, solaris, openbsd
> and linux).

Is this exhaustive? That is, are we sure this never happened before Sept 11th?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] curious regression failures
Gregory Stark wrote:

> Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
>> Andrew Dunstan wrote:
>>> pgbfprod=# select sysname, stage, snapshot from build_status where log ~ $$read only \d+ of \d+ bytes$$;
>>>   sysname   |    stage     |      snapshot
>>> ------------+--------------+---------------------
>>>  zebra      | InstallCheck | 2007-09-11 10:25:03
>>>  wildebeest | InstallCheck | 2007-09-11 22:00:11
>>>  baiji      | InstallCheck | 2007-09-12 22:39:24
>>>  luna_moth  | InstallCheck | 2007-09-19 13:10:01
>>
>> hmm all of those seem to fail the foreign key checks in a very similar
>> way, and those are vastly different platforms (windows, solaris, openbsd
>> and linux).
>
> Is this exhaustive? That is, are we sure this never happened before
> Sept 11th?

Yes, we have never thrown away any buildfarm history, and we have build logs going back several years now. Being able to run queries like this makes it all worthwhile :-) (Thanks Joshua for the disk space - I know it annoys you.)

cheers

andrew

Re: [HACKERS] curious regression failures
Looking back, by far the largest change in the period Sep 1 - Sep 11 was the lazy xid calculation and read-only transactions. That seems like the most likely culprit. But given Tom's comments this commit stands out too:

---BeginMessage---
Log Message:
-----------
Release the exclusive lock on the table early after truncating it in lazy
vacuum, instead of waiting till commit.

Modified Files:
--------------
    pgsql/src/backend/commands:
        vacuumlazy.c (r1.92 -> r1.93)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/commands/vacuumlazy.c?r1=1.92r2=1.93)
---End Message---

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] curious regression failures
Gregory Stark [EMAIL PROTECTED] writes:

> But given Tom's comments this commit stands out too:
>
> From: Alvaro Herrera [EMAIL PROTECTED]
> Log Message:
> -----------
> Release the exclusive lock on the table early after truncating it in
> lazy vacuum, instead of waiting till commit.

I had thought about that one and not seen a problem with it --- but sometimes when the light goes on, it's just blinding :-(. This change is undoubtedly what's breaking it.

The failures in question are coming from commands that try to insert new entries into various system tables. Now normally, the first place a backend will try to insert a brand-new tuple in a table is the rd_targblock block that is remembered in relcache as being where we last successfully inserted. The failures must be happening because autovacuum has just truncated away where rd_targblock points.

There is a mechanism to reset everyone's rd_targblock after a truncation: it's done by broadcasting a shared-invalidation relcache inval message for that relation. Which happens at commit, before releasing locks, which is the correct time for the typical application of this mechanism, namely to make sure people see system-catalog updates on time. Releasing the exclusive lock early allows backends to try to access the relation again before they've heard about the truncation.

There might be another way to manage this, but we're not inventing a new invalidation mechanism for 8.3. This patch will have to be reverted for the time being :-(

regards, tom lane

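The ordering problem described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not PostgreSQL code: the Relation and Backend classes, the ShortRead exception, and vacuum_then_insert() are invented for illustration, and lock waits are modelled purely by the order in which the steps run. The only name taken from the thread is rd_targblock.

```python
class ShortRead(Exception):
    """Raised when a block past the end of the relation is requested."""


class Relation:
    def __init__(self, nblocks):
        self.nblocks = nblocks

    def read_block(self, blkno):
        if blkno >= self.nblocks:
            raise ShortRead(
                f"block {blkno} is past the end ({self.nblocks} blocks)")
        return blkno


class Backend:
    def __init__(self, rel):
        self.rel = rel
        self.rd_targblock = None  # cached insertion target (toy relcache)

    def insert(self):
        # Try the remembered block first; otherwise use the last block.
        target = self.rd_targblock
        if target is None:
            target = self.rel.nblocks - 1
        self.rel.read_block(target)
        self.rd_targblock = target

    def accept_inval(self):
        # A relcache invalidation message resets the cached target.
        self.rd_targblock = None


def vacuum_then_insert(release_lock_early):
    rel = Relation(nblocks=10)
    backend = Backend(rel)
    backend.insert()              # backend now remembers block 9
    rel.nblocks = 8               # vacuum truncates the empty tail blocks
    try:
        if release_lock_early:
            # Lock is gone before commit: the insert runs before the
            # invalidation message has been sent.
            backend.insert()
            backend.accept_inval()
        else:
            # Lock held until commit: the invalidation is delivered first,
            # and only then can the waiting insert proceed.
            backend.accept_inval()
            backend.insert()
    except ShortRead:
        return "failed"
    return "ok"


if __name__ == "__main__":
    print("lock released early:", vacuum_then_insert(True))
    print("lock held to commit:", vacuum_then_insert(False))
```

In this model the insert fails only when the lock is released before the invalidation is delivered, which matches the explanation above of why the early-release patch and the commit-time inval broadcast do not fit together.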