Re: [HACKERS] curious regression failures

2007-09-24 Thread Gregory Stark
Alvaro Herrera [EMAIL PROTECTED] writes:

 Tom Lane wrote:

 Idle thought here: did anything get done with the idea of decoupling
 main-table vacuum decisions from toast-table vacuum decisions?  vacuum.c
 comments
 
  * Get a session-level lock too. This will protect our access to the
  * relation across multiple transactions, so that we can vacuum the
  * relation's TOAST table (if any) secure in the knowledge that no one is
  * deleting the parent relation.
 
 and it suddenly occurs to me that we'd need some other way to deal with
 that scenario if autovac tries to vacuum toast tables independently.

 Hmm, right.  We didn't change this in 8.3 but it looks like somebody
 will need to have a great idea before long.

 Of course, the easy answer is to grab a session-level lock for the main
 table while vacuuming the toast table, but it doesn't seem very
 friendly.

Just a normal lock would do, no? At least for normal (non-full) vacuum.

I'm not clear why this has to be dealt with at all though. What happens if we
don't do anything? Doesn't it just mean a user trying to drop the table will
block until the vacuum is done? Or does dropping not take a lock on the toast
table?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] curious regression failures

2007-09-23 Thread Alvaro Herrera
Tom Lane wrote:

 Idle thought here: did anything get done with the idea of decoupling
 main-table vacuum decisions from toast-table vacuum decisions?  vacuum.c
 comments
 
  * Get a session-level lock too. This will protect our access to the
  * relation across multiple transactions, so that we can vacuum the
  * relation's TOAST table (if any) secure in the knowledge that no one is
  * deleting the parent relation.
 
 and it suddenly occurs to me that we'd need some other way to deal with
 that scenario if autovac tries to vacuum toast tables independently.

Hmm, right.  We didn't change this in 8.3 but it looks like somebody
will need to have a great idea before long.

Of course, the easy answer is to grab a session-level lock for the main
table while vacuuming the toast table, but it doesn't seem very
friendly.

 Also, did you see the thread complaining that autovacuums block CREATE
 INDEX?  This seems true given the current locking definitions, and it's
 a bit annoying.  Is it worth inventing a new table lock type just for
 vacuum?

Hmm.  I think Jim is right in that what we need is to make some forms of
ALTER TABLE take a lighter lock, one that doesn't conflict with analyze.
Guillaume's complaints are about restore times, which can only be
affected by analyze, not vacuum.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] curious regression failures

2007-09-20 Thread Alvaro Herrera
Tom Lane wrote:

 There might be another way to manage this, but we're not inventing
 a new invalidation mechanism for 8.3.  This patch will have to be
 reverted for the time being :-(

Thanks.  Seems it was a good judgement call to apply it only to HEAD,
after all.

In any case, at that point we are mostly done with the expensive steps
of vacuuming, so the transaction finishes not long after this.  I don't
think this issue is worth inventing a new invalidation mechanism.

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
La victoria es para quien se atreve a estar solo



Re: [HACKERS] curious regression failures

2007-09-20 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 In any case, at that point we are mostly done with the expensive steps
 of vacuuming, so the transaction finishes not long after this.  I don't
 think this issue is worth inventing a new invalidation mechanism.

Yeah, I agree --- there are only a few catalog updates left to do after
we truncate.  If we held the main-table exclusive lock while vacuuming
the TOAST table, we'd have a problem, but it looks to me like we don't.

Idle thought here: did anything get done with the idea of decoupling
main-table vacuum decisions from toast-table vacuum decisions?  vacuum.c
comments

 * Get a session-level lock too. This will protect our access to the
 * relation across multiple transactions, so that we can vacuum the
 * relation's TOAST table (if any) secure in the knowledge that no one is
 * deleting the parent relation.

and it suddenly occurs to me that we'd need some other way to deal with
that scenario if autovac tries to vacuum toast tables independently.

Also, did you see the thread complaining that autovacuums block CREATE
INDEX?  This seems true given the current locking definitions, and it's
a bit annoying.  Is it worth inventing a new table lock type just for
vacuum?
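
The CREATE INDEX complaint can be checked against the table-level lock conflict rules. What follows is a minimal sketch, not code from the backend: the conflict sets are the ones given in the PostgreSQL documentation on explicit locking, and the command-to-lock mapping lists only a few common commands (lazy VACUUM and ANALYZE taking ShareUpdateExclusive, plain CREATE INDEX taking Share). The names CONFLICTS, LOCK_TAKEN and blocks are made up for this illustration, and the levels should be confirmed against the release in question.

```python
# Table-level lock modes, weakest to strongest.
MODES = [
    "AccessShare", "RowShare", "RowExclusive", "ShareUpdateExclusive",
    "Share", "ShareRowExclusive", "Exclusive", "AccessExclusive",
]

# For each mode, the set of modes it conflicts with.
CONFLICTS = {
    "AccessShare": {"AccessExclusive"},
    "RowShare": {"Exclusive", "AccessExclusive"},
    "RowExclusive": {"Share", "ShareRowExclusive", "Exclusive",
                     "AccessExclusive"},
    "ShareUpdateExclusive": {"ShareUpdateExclusive", "Share",
                             "ShareRowExclusive", "Exclusive",
                             "AccessExclusive"},
    "Share": {"RowExclusive", "ShareUpdateExclusive", "ShareRowExclusive",
              "Exclusive", "AccessExclusive"},
    "ShareRowExclusive": {"RowExclusive", "ShareUpdateExclusive", "Share",
                          "ShareRowExclusive", "Exclusive",
                          "AccessExclusive"},
    "Exclusive": {"RowShare", "RowExclusive", "ShareUpdateExclusive",
                  "Share", "ShareRowExclusive", "Exclusive",
                  "AccessExclusive"},
    "AccessExclusive": set(MODES),
}

# Illustrative subset of commands and the lock each takes on the table.
LOCK_TAKEN = {
    "SELECT": "AccessShare",
    "INSERT": "RowExclusive",
    "VACUUM": "ShareUpdateExclusive",   # lazy (non-FULL) vacuum
    "ANALYZE": "ShareUpdateExclusive",
    "CREATE INDEX": "Share",            # non-concurrent form
    "DROP TABLE": "AccessExclusive",
}


def blocks(holder, waiter):
    """True if `waiter` must wait while `holder` holds its table lock."""
    return LOCK_TAKEN[waiter] in CONFLICTS[LOCK_TAKEN[holder]]


for cmd in ("SELECT", "INSERT", "CREATE INDEX", "DROP TABLE"):
    print(f"VACUUM running, {cmd} waits: {blocks('VACUUM', cmd)}")
```

Under these rules a running vacuum or analyze leaves SELECT and INSERT alone but makes CREATE INDEX and DROP TABLE wait, which is the behaviour the complaint describes.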

regards, tom lane



Re: [HACKERS] curious regression failures (was Re: [PATCHES] PL/TCL Patch to prevent postgres from becoming multithreaded)

2007-09-19 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 pgbfprod=# select sysname, stage, snapshot from build_status where log ~ 
 $$read only \d+ of \d+ bytes$$;
   sysname   |    stage     |      snapshot
 ------------+--------------+---------------------
  zebra      | InstallCheck | 2007-09-11 10:25:03
  wildebeest | InstallCheck | 2007-09-11 22:00:11
  baiji      | InstallCheck | 2007-09-12 22:39:24
  luna_moth  | InstallCheck | 2007-09-19 13:10:01
 (4 rows)

Fascinating.  So I would venture that (1) it's definitely our bug,
not something we could blame on NFS or whatever, and (2) we introduced
it fairly recently.  That specific error message wording exists only
in HEAD, but it's been there since 2007-01-03, so if there were a
pre-existing problem you'd think there would be some more matches.
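
The same log search can be reproduced outside the database. This is a rough sketch that applies the pattern from the quoted query to log text with Python's re module. The member names and log lines are invented for illustration and are not real buildfarm data, and members_with_short_reads is a hypothetical helper.

```python
import re

# Same pattern the buildfarm query passes to the ~ operator.
SHORT_READ = re.compile(r"read only \d+ of \d+ bytes")

# Hypothetical log excerpts, made up for this example.
sample_logs = {
    "member_a": "ERROR:  could not read block 3: read only 0 of 8192 bytes",
    "member_b": "ERROR:  relation \"foo\" does not exist",
}


def members_with_short_reads(logs):
    """Return the sorted names whose log text contains a short-read error."""
    return sorted(name for name, text in logs.items()
                  if SHORT_READ.search(text))


matches = members_with_short_reads(sample_logs)
print(matches)
```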

The patterns I notice here are (1) they're all InstallCheck not Check
failures; (2) though not all at the same place in the tests, it's
a fairly short range; (3) it's all references to system catalogs,
though not all the same one.

My gut feeling is that we're seeing autovacuum truncate off an empty end
block and then a backend tries to reference that block again.  But there
should be enough interlocks in place to prevent such references.  Any
ideas out there?

regards, tom lane



Re: [HACKERS] curious regression failures

2007-09-19 Thread Gregory Stark
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:

 Andrew Dunstan wrote:

 pgbfprod=# select sysname, stage, snapshot from build_status where log ~
 $$read only \d+ of \d+ bytes$$;
   sysname   |    stage     |      snapshot
 ------------+--------------+---------------------
  zebra      | InstallCheck | 2007-09-11 10:25:03
  wildebeest | InstallCheck | 2007-09-11 22:00:11
  baiji      | InstallCheck | 2007-09-12 22:39:24
  luna_moth  | InstallCheck | 2007-09-19 13:10:01

 hmm all of those seem to fail the foreign key checks in a very similar
 way and those are vastly different platforms (windows, solaris, openbsd and
 linux).

Is this exhaustive? That is, are we sure this never happened before Sept 11th?


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] curious regression failures

2007-09-19 Thread Andrew Dunstan
Gregory Stark wrote:

 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:

 Andrew Dunstan wrote:

 pgbfprod=# select sysname, stage, snapshot from build_status where log ~
 $$read only \d+ of \d+ bytes$$;
   sysname   |    stage     |      snapshot
 ------------+--------------+---------------------
  zebra      | InstallCheck | 2007-09-11 10:25:03
  wildebeest | InstallCheck | 2007-09-11 22:00:11
  baiji      | InstallCheck | 2007-09-12 22:39:24
  luna_moth  | InstallCheck | 2007-09-19 13:10:01

 hmm all of those seem to fail the foreign key checks in a very similar
 way and those are vastly different platforms (windows, solaris, openbsd and
 linux).

 Is this exhaustive? That is, are we sure this never happened before Sept 11th?

Yes, we have never thrown away any buildfarm history, and we have build 
logs going back several years now. Being able to run queries like this 
makes it all worth while :-) (Thanks Joshua for the disk space - I know 
it annoys you.)


cheers

andrew



Re: [HACKERS] curious regression failures

2007-09-19 Thread Gregory Stark


Looking back, by far the largest change in the period Sep 1 - Sep 11 was the
lazy xid calculation and read-only transactions. That seems like the most
likely culprit.

But given Tom's comments this commit stands out too:


---BeginMessage---
Log Message:
---
Release the exclusive lock on the table early after truncating it in lazy
vacuum, instead of waiting till commit.

Modified Files:
--
pgsql/src/backend/commands:
vacuumlazy.c (r1.92 -> r1.93)

(http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/commands/vacuumlazy.c?r1=1.92&r2=1.93)

---End Message---


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] curious regression failures

2007-09-19 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 But given Tom's comments this commit stands out too:

 From: Alvaro Herrera [EMAIL PROTECTED]
 Log Message:
 ---
 Release the exclusive lock on the table early after truncating it in lazy
 vacuum, instead of waiting till commit.

I had thought about that one and not seen a problem with it --- but
sometimes when the light goes on, it's just blinding :-(.  This change
is undoubtedly what's breaking it.  The failures in question are coming
from commands that try to insert new entries into various system tables.
Now normally, the first place a backend will try to insert a brand-new
tuple in a table is the rd_targblock block that is remembered in
relcache as being where we last successfully inserted.  The failures
must be happening because autovacuum has just truncated away where
rd_targblock points.  There is a mechanism to reset everyone's
rd_targblock after a truncation: it's done by broadcasting a
shared-invalidation relcache inval message for that relation.  Which
happens at commit, before releasing locks, which is the correct time for
the typical application of this mechanism, namely to make sure people
see system-catalog updates on time.  Releasing the exclusive lock early
allows backends to try to access the relation again before they've heard
about the truncation.
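
The ordering problem described above can be illustrated with a toy model. This is a simplified sketch and not the backend code: Relation, Backend, try_insert and run_vacuum are hypothetical names, the targblock attribute merely stands in for rd_targblock, the error text is abbreviated, and the model ignores relation extension and real locking. It only shows why an insert attempted after the lock is released but before the invalidation arrives can aim at a block that no longer exists.

```python
class Relation:
    """Shared state: how many blocks the relation's file currently has."""

    def __init__(self, nblocks):
        self.nblocks = nblocks


class Backend:
    """One backend's cached view of the relation."""

    def __init__(self, rel):
        self.rel = rel
        self.targblock = None  # stands in for rd_targblock

    def insert(self):
        # First choice is the block remembered from the last insert;
        # with no remembered block, fall back to the current last block.
        if self.targblock is None:
            self.targblock = self.rel.nblocks - 1
        if self.targblock >= self.rel.nblocks:
            raise RuntimeError(
                f"could not read block {self.targblock}: short read")
        return self.targblock

    def invalidate(self):
        # Effect of receiving the relcache inval message.
        self.targblock = None


def try_insert(backend):
    """Attempt one insert and report the outcome as a string."""
    try:
        return f"ok: used block {backend.insert()}"
    except RuntimeError as exc:
        return f"error: {exc}"


def run_vacuum(rel, backend, new_nblocks, release_lock_early):
    """Truncate, then let the backend insert at the first moment it can."""
    rel.nblocks = new_nblocks  # truncation, done under the exclusive lock
    outcome = None
    if release_lock_early:
        # Lock already released, inval message not yet broadcast.
        outcome = try_insert(backend)
    backend.invalidate()       # inval broadcast happens at commit
    if outcome is None:
        outcome = try_insert(backend)  # lock released only after commit
    return outcome


def scenario(release_lock_early):
    rel = Relation(nblocks=10)
    backend = Backend(rel)
    backend.insert()           # backend now remembers block 9
    return run_vacuum(rel, backend, new_nblocks=5,
                      release_lock_early=release_lock_early)


print("lock held until commit:", scenario(False))
print("lock released early:   ", scenario(True))
```

With the lock held until commit, the backend has already processed the invalidation by the time it can touch the relation, so it picks a block that still exists. With the early release it reuses its stale target and fails.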

There might be another way to manage this, but we're not inventing
a new invalidation mechanism for 8.3.  This patch will have to be
reverted for the time being :-(

regards, tom lane
