Re: [HACKERS] race condition for drop schema cascade?

2005-01-03 Thread Bruce Momjian

Did this get resolved as an OS file system issue?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] race condition for drop schema cascade?

2004-12-29 Thread Kurt Roeckx
> pgbuildfarm=# select name, operating_system, stage, count from buildsystems
> b, (select sysname, stage, count(*) as count from build_status where log ~
> 'tablespace testspace is not empty' group by sysname, stage) as s where
> s.sysname=b.name;

Note that the expected log has that as the error message after a
"drop tablespace testspace;", while it should work with a
"drop tablespace testspace cascade;".

How many of those errors are because of some other error?  For
instance, dog ran out of disk space recently and had those in
the logs.  I know panda also once ran out of disk space, but the
logs for that aren't available on the site anymore.

When was the last time this error actually happened?  Because
looking at emu (which seems to have it the most) shows that its
last 30 builds are all successful.


PS: It might be nice to have an option to keep the last X days of
all logs around.


Kurt




Re: [HACKERS] race condition for drop schema cascade?

2004-12-29 Thread Andrew Dunstan

Kurt Roeckx wrote:
>> pgbuildfarm=# select name, operating_system, stage, count from buildsystems
>> b, (select sysname, stage, count(*) as count from build_status where log ~
>> 'tablespace testspace is not empty' group by sysname, stage) as s where
>> s.sysname=b.name;
>
> Note that the expected log has that as the error message after a
> "drop tablespace testspace;", while it should work with a
> "drop tablespace testspace cascade;".
>
> How many of those errors are because of some other error?  For
> instance, dog ran out of disk space recently and had those in
> the logs.  I know panda also once ran out of disk space, but the
> logs for that aren't available on the site anymore.
>
> When was the last time this error actually happened?  Because
> looking at emu (which seems to have it the most) shows that its
> last 30 builds are all successful.

You're right - my query was not sufficiently specific. There have in 
fact been 4 failures:

pgbuildfarm=# select sysname, snapshot, stage, branch from build_status 
where log ~ 'tablespace testspace is not empty.*tablespace testspace 
is not empty' and not log ~ 'No space left';
 sysname |      snapshot       |    stage     | branch
---------+---------------------+--------------+--------
 hare    | 2004-12-09 05:15:05 | Check        | HEAD
 otter   | 2004-12-11 15:50:09 | Check        | HEAD
 otter   | 2004-12-15 15:50:10 | Check        | HEAD
 gibbon  | 2004-12-28 23:55:05 | InstallCheck | HEAD

gibbon is a Cygwin box, otter and hare are both Debian 3.1 boxes, hare 
on Alpha and otter on MIPS.
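
The narrower filter in the query above can be sketched outside the database. The Python below is only an illustration under stated assumptions: the helper name and the sample log strings are invented, not buildfarm data, and re.DOTALL reflects the assumption that '.' in the SQL pattern also matches newlines (which is believed to be the PostgreSQL default).

```python
import re

# Same pattern as the query: the message has to appear at least twice,
# since (per Kurt's note) one occurrence is already in the expected output.
TWICE = re.compile(
    r"tablespace testspace is not empty.*tablespace testspace is not empty",
    re.DOTALL,  # assumption: '.' in the SQL pattern also matches newlines
)


def looks_like_real_failure(log: str) -> bool:
    """Hypothetical helper mirroring the WHERE clause of the query."""
    if "No space left" in log:  # mirrors: not log ~ 'No space left'
        return False
    return TWICE.search(log) is not None


# Invented log snippets, for illustration only.
expected_only = "ERROR:  tablespace testspace is not empty\n"
disk_full = expected_only * 2 + "No space left on device\n"
suspect = expected_only + "DROP SCHEMA\n" + expected_only
```

With these samples, only `suspect` passes the filter; the single expected message and the out-of-disk-space log are both excluded.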


> PS: It might be nice to have an option to keep the last X days of
> all logs around.
 

You mean on the client? I'd rather not - the logs kept there are mostly 
intended as debugging devices. The buildfarm db keeps the log from the 
stage where an error occurred indefinitely. I intend to provide a way of 
going back through that history - at the moment you can easily see the 
last 30.

cheers
andrew


Re: [HACKERS] race condition for drop schema cascade?

2004-12-29 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
> You're right - my query was not sufficiently specific. There have in
> fact been 4 failures:
>
> pgbuildfarm=# select sysname, snapshot, stage, branch from build_status
> where log ~ 'tablespace testspace is not empty.*tablespace testspace
> is not empty' and not log ~ 'No space left';
>  sysname |      snapshot       |    stage     | branch
> ---------+---------------------+--------------+--------
>  hare    | 2004-12-09 05:15:05 | Check        | HEAD
>  otter   | 2004-12-11 15:50:09 | Check        | HEAD
>  otter   | 2004-12-15 15:50:10 | Check        | HEAD
>  gibbon  | 2004-12-28 23:55:05 | InstallCheck | HEAD

Why does the last show as an install failure?

Anyway, given the small number of machines involved, I'm once again
wondering what filesystem they are using.  They wouldn't be running
the check over NFS, by any chance, for instance?

The theory that is in my mind is that the bgwriter could have written
out a page for the table in the test tablespace, and thereby be holding
an open file pointer for it.  On standard Unix filesystems this would
not disrupt the backend's ability to unlink the table at the DROP stage,
but I'm wondering about nonstandard filesystems ...

regards, tom lane
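
The "standard Unix" behaviour described above can be checked with a small Python sketch. This is not PostgreSQL code: the directory and file names are invented, and a second file descriptor merely stands in for the bgwriter's open file. On a local POSIX filesystem the unlink is expected to succeed and leave the directory empty even though the file is still open.

```python
import os
import tempfile


def unlink_while_open() -> bool:
    """Return True if a directory is empty (and removable) after unlinking
    a file that another descriptor still holds open."""
    tablespace_dir = tempfile.mkdtemp(prefix="testspace_")  # invented name
    path = os.path.join(tablespace_dir, "relfile")          # invented name

    # Stand-in for the bgwriter: write one page and keep the file open.
    writer_fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    os.write(writer_fd, b"\0" * 8192)
    try:
        # Stand-in for the backend at DROP time: remove the file, then
        # check that the directory is really empty and can be removed.
        os.unlink(path)
        is_empty = os.listdir(tablespace_dir) == []
        os.rmdir(tablespace_dir)
        return is_empty
    finally:
        os.close(writer_fd)
```

On an NFS client, removing a file that is still open is commonly handled by renaming it to a hidden .nfs* file, which would leave the directory non-empty; on Windows the unlink itself typically fails while the file is open. Either would fit the symptom being discussed, though nothing here shows that this is what happened on the buildfarm machines.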



Re: [HACKERS] race condition for drop schema cascade?

2004-12-29 Thread Andrew Dunstan

Tom Lane wrote:
> Andrew Dunstan [EMAIL PROTECTED] writes:
>> You're right - my query was not sufficiently specific. There have in
>> fact been 4 failures:
>>
>> pgbuildfarm=# select sysname, snapshot, stage, branch from build_status
>> where log ~ 'tablespace testspace is not empty.*tablespace testspace
>> is not empty' and not log ~ 'No space left';
>>  sysname |      snapshot       |    stage     | branch
>> ---------+---------------------+--------------+--------
>>  hare    | 2004-12-09 05:15:05 | Check        | HEAD
>>  otter   | 2004-12-11 15:50:09 | Check        | HEAD
>>  otter   | 2004-12-15 15:50:10 | Check        | HEAD
>>  gibbon  | 2004-12-28 23:55:05 | InstallCheck | HEAD
>
> Why does the last show as an install failure?

We run the standard regression suite twice - the failure on Gibbon 
occurred on the second of these. Clearly this is very transient.


> Anyway, given the small number of machines involved, I'm once again
> wondering what filesystem they are using.  They wouldn't be running
> the check over NFS, by any chance, for instance?
>
> The theory that is in my mind is that the bgwriter could have written
> out a page for the table in the test tablespace, and thereby be holding
> an open file pointer for it.  On standard Unix filesystems this would
> not disrupt the backend's ability to unlink the table at the DROP stage,
> but I'm wondering about nonstandard filesystems ...

Jim Buttafuoco reported on December 16th that he had rebuilt the 
filesystem on his MIPS box - I assume this means that he isn't using 
NFS. In any case, we have not seen the problem since then. His Alpha box 
has not been reporting buildfarm results since before then.

The Cygwin box is running on NTFS - and we know we've encountered plenty 
of problems with unlinking on Windows.

I know it's not much to go on.
cheers
andrew


Re: [HACKERS] race condition for drop schema cascade?

2004-12-29 Thread Jim Buttafuoco
Andrew/all

I have not seen any problems on my MIPS systems since rebuilding the ext3
filesystem (I ran badblocks during fs creation).  I should have the Alpha
running soon; the disk died and I am waiting for a replacement.  I do believe
there is a floating point problem with older Alphas out there.  They seem to
have a problem with INFINITY and NaNs.  I did some checking on the net and
the problem seems to be known (with no solution).  Maybe something can go
into the README or such.  If anyone is interested in looking at this for
pg8.0 I can give SSH access in a week or so.

Jim





Re: [HACKERS] race condition for drop schema cascade?

2004-12-29 Thread Jim Buttafuoco
Tom,

My systems are all ext3 (Debian 3.1); Andrew can tell you which ones they are.

Jim




Re: [HACKERS] race condition for drop schema cascade?

2004-12-29 Thread Andrew Dunstan

Jim Buttafuoco wrote:
> Andrew/all
>
> I have not seen any problems on my MIPS systems since rebuilding the ext3
> filesystem (I ran badblocks during fs creation).  I should have the Alpha
> running soon; the disk died and I am waiting for a replacement.  I do believe
> there is a floating point problem with older Alphas out there.  They seem to
> have a problem with INFINITY and NaNs.  I did some checking on the net and
> the problem seems to be known (with no solution).  Maybe something can go
> into the README or such.  If anyone is interested in looking at this for
> pg8.0 I can give SSH access in a week or so.

I doubt that either of the problems (FP on old Alpha or failing 'drop 
schema cascade + drop tablespace')  is a showstopper. Maybe the 
'platforms supported' notes should carry a mention.

cheers
andrew


Re: [HACKERS] race condition for drop schema cascade?

2004-12-28 Thread Andrew Dunstan

Tom Lane wrote:
> Andrew Dunstan [EMAIL PROTECTED] writes:
>> I have seen this failure several times, but not consistently, on the
>> buildfarm member otter (Debian/MIPS) and possibly on others, and am
>> wondering if it indicates a possible race condition on DROP SCHEMA CASCADE.
>
> Hard to see what, considering that there's only one backend touching
> that tablespace in the test.  I'd be inclined to wonder if there's
> a filesystem-level problem on that platform.  What filesystem are you
> running on anyway?

I have just seen this error again, this time on Cygwin. I did a trawl
through the buildfarm history looking for other occurrences and found it
happening on many platforms:

pgbuildfarm=# select name, operating_system, stage, count from buildsystems b, (select
sysname, stage, count(*) as count from build_status where log ~ 'tablespace
testspace is not empty' group by sysname, stage) as s where s.sysname=b.name;
   name    | operating_system |    stage     | count
-----------+------------------+--------------+-------
 spoonbill | OpenBSD          | Check        |     2
 lionfish  | Linux            | Check        |     9
 kudu      | Solaris          | InstallCheck |     1
 kudu      | Solaris          | Check        |     5
 emu       | OpenBSD          | Check        |   137
 loris     | Windows          | Check        |     2
 gibbon    | Cygwin           | InstallCheck |     1
 panda     | Linux Debian     | Check        |     3
 otter     | Debian Linux     | Check        |     2
 hare      | Debian Linux     | Check        |     3
 dog       | Fedora Core      | Check        |    17
 fantail   | Linux            | Check        |     3
 osprey    | NetBSD           | Check        |    15
(13 rows)

cheers
andrew


Re: [HACKERS] race condition for drop schema cascade?

2004-12-28 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
> I have just seen this error again, this time on Cygwin. I did a trawl through
> the buildfarm history looking for other occurrences and found it happening on
> many platforms:

[ yawning... ]  I've got to go to bed now, but so far tonight my Fedora
Core 3 machine has completed 314 iterations of make check on CVS tip
with no such error.  So whatever this is, there must be some
platform-specific issue involved...

regards, tom lane
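
A repeat-until-failure run like the one described above can be scripted. The Python sketch below is one hypothetical way to do it; the helper name and its parameters are assumptions for illustration, not a tool mentioned in this thread.

```python
import subprocess


def run_until_failure(cmd, max_iterations):
    """Run cmd repeatedly; return the 1-based iteration that first failed,
    or None if every iteration exited with status 0."""
    for i in range(1, max_iterations + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            return i
    return None


# Assumed usage, from the top of a configured PostgreSQL source tree:
#   run_until_failure(["make", "check"], 314)
```

Discarding the output keeps the sketch short; for a real hunt the output of the failing iteration would need to be kept.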



Re: [Fwd: Re: [HACKERS] race condition for drop schema cascade?]

2004-12-16 Thread Jim Buttafuoco
I have rebuilt the filesystem on my Indy (MIPS) that Andrew reported on.  The
first run completed 100%.  I would give it a couple more runs before we can
say it's the filesystem, not PostgreSQL, that was causing the drop to fail.
 


-- Original Message ---
From: Andrew Dunstan [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wed, 15 Dec 2004 16:42:59 -0500
Subject: [Fwd: Re: [HACKERS] race condition for drop schema cascade?]

 Jim, please advise?
 
 thanks
 
 andrew
 
  Original Message 
 Subject:  Re: [HACKERS] race condition for drop schema cascade?
 Date: Wed, 15 Dec 2004 16:29:01 -0500
 From: Tom Lane [EMAIL PROTECTED]
 To:   Andrew Dunstan [EMAIL PROTECTED]
 CC:   PostgreSQL-development [EMAIL PROTECTED]
 References:   [EMAIL PROTECTED]
 
 Andrew Dunstan [EMAIL PROTECTED] writes:
  I have seen this failure several times, but not consistently, on the
  buildfarm member otter (Debian/MIPS) and possibly on others, and am
  wondering if it indicates a possible race condition on DROP SCHEMA CASCADE.
 
 Hard to see what, considering that there's only one backend touching
 that tablespace in the test.  I'd be inclined to wonder if there's
 a filesystem-level problem on that platform.  What filesystem are you
 running on anyway?
 
   regards, tom lane
--- End of Original Message ---




Re: [HACKERS] race condition for drop schema cascade?

2004-12-15 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
> I have seen this failure several times, but not consistently, on the
> buildfarm member otter (Debian/MIPS) and possibly on others, and am
> wondering if it indicates a possible race condition on DROP SCHEMA CASCADE.

Hard to see what, considering that there's only one backend touching
that tablespace in the test.  I'd be inclined to wonder if there's
a filesystem-level problem on that platform.  What filesystem are you
running on anyway?

regards, tom lane
