Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Andres Freund
Hi, On 2013-01-16 01:28:09 -0500, Tom Lane wrote: It's a compiler bug. Gah. Not again. Not that I am surprised, but still. icc 11.1 apparently thinks that this loop in doPickSplit: (Why does it think it needs to prefetch an array it's only going to write into? Is IA64's cache hardware

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Sergey Koposov
Hi, On Wed, 16 Jan 2013, Andres Freund wrote: On 2013-01-16 01:28:09 -0500, Tom Lane wrote: It's a compiler bug. Thanks for investigating. But I'm not sure there is any other way for me than switching to gcc, because Intel stopped providing their IA64 version of compilers free of

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Andres Freund
On 2013-01-16 14:41:47 +, Sergey Koposov wrote: Hi, On Wed, 16 Jan 2013, Andres Freund wrote: On 2013-01-16 01:28:09 -0500, Tom Lane wrote: It's a compiler bug. Thanks for investigating. But I'm not sure there is any other way for me than switching to gcc, because Intel

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Andrew Dunstan
On 01/16/2013 09:41 AM, Sergey Koposov wrote: So unless somebody suggests otherwise, I'm going to switch to gcc on this buildfarm. If you switch compiler it should be a new buildfarm animal. (That just means re-registering so you get a new name/secret pair.) We have provision for upgrading

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Sergey Koposov
On Wed, 16 Jan 2013, Andres Freund wrote: So unless somebody suggests otherwise, I'm going to switch to gcc on this buildfarm. What about switching to -O1 of 11.1? I don't know. It is up to -hackers to decide. I think that icc on ia64 has shown bugginess time after time. But if you think

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Tom Lane
Sergey Koposov kopo...@ast.cam.ac.uk writes: On Wed, 16 Jan 2013, Andres Freund wrote: What about switching to -O1 of 11.1? I don't know. It is up to -hackers to decide. I think that icc on ia64 has shown bugginess time after time. But if you think that buildfarm with icc 11.1 -O1 carry

Re: [HACKERS] Curious buildfarm failures

2013-01-15 Thread Andres Freund
On 2013-01-14 22:56:47 +0100, Andres Freund wrote: On 2013-01-14 22:50:16 +0100, Andres Freund wrote: On 2013-01-14 16:35:48 -0500, Tom Lane wrote: Since commit 2065dd2834e832eb820f1fbcd16746d6af1f6037, there have been a few buildfarm failures along the lines of -- Commit table

Re: [HACKERS] Curious buildfarm failures

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2013-01-14 16:35:48 -0500, Tom Lane wrote: Another thing is that dugong has been reproducibly failing with drop cascades to table testschema.atable -- Should succeed DROP TABLESPACE testspace; + ERROR: tablespace testspace is not empty

Re: [HACKERS] Curious buildfarm failures

2013-01-15 Thread Andrew Dunstan
On 01/15/2013 11:04 AM, Andres Freund wrote: Could the buildfarm owner perhaps tell us which files are left in the tablespace 'testspace'? They will not be able to easily - the workspace is normally cleared out at the end of each run. cheers andrew -- Sent via pgsql-hackers mailing

Re: [HACKERS] Curious buildfarm failures

2013-01-15 Thread Andres Freund
On 2013-01-15 11:19:28 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2013-01-14 16:35:48 -0500, Tom Lane wrote: Another thing is that dugong has been reproducibly failing with drop cascades to table testschema.atable -- Should succeed DROP TABLESPACE

Re: [HACKERS] Curious buildfarm failures

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: Interestingly the compiler couldn't deduce that e.g. DateTimeParseError() didn't return with the old ereport definition but it can with the new one which seems strange. Oooh, I hadn't noticed that. I guess that must indicate that this version of

Re: [HACKERS] Curious buildfarm failures

2013-01-15 Thread Andrew Dunstan
On 01/15/2013 11:57 AM, Tom Lane wrote: Well, it could be quite reproducible, if for example what's happening is that the DROP is failing to wait for the checkpointer at all. Is there a way to enable log_checkpoints during a buildfarm run? It'd be good to get timestamps added to the postmaster

Re: [HACKERS] Curious buildfarm failures

2013-01-15 Thread Andrew Dunstan
On 01/15/2013 12:07 PM, Andrew Dunstan wrote: On 01/15/2013 11:57 AM, Tom Lane wrote: Well, it could be quite reproducible, if for example what's happening is that the DROP is failing to wait for the checkpointer at all. Is there a way to enable log_checkpoints during a buildfarm run? It'd

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Sergey Koposov
Hi, Date: Tue, 15 Jan 2013 11:57:07 -0500 From: Tom Lane t...@sss.pgh.pa.us To: Andres Freund and...@2ndquadrant.com Cc: m...@sai.msu.ru, pgsql-hackers@postgreSQL.org, Andrew Dunstan and...@dunslane.net Subject: Re: Curious buildfarm failures Well, it could be quite reproducible, if for

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-15 17:27:50 +, Sergey Koposov wrote: Hi, Date: Tue, 15 Jan 2013 11:57:07 -0500 From: Tom Lane t...@sss.pgh.pa.us To: Andres Freund and...@2ndquadrant.com Cc: m...@sai.msu.ru, pgsql-hackers@postgreSQL.org, Andrew Dunstan and...@dunslane.net Subject: Re: Curious buildfarm

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Sergey Koposov
On Tue, 15 Jan 2013, Andres Freund wrote: Any chance you could run make check again but with log_statement=all and log_min_messages=debug2? That might tell us a bit more, whether the dependency code doesn't work right or whether the checkpoint is doing strange things. Here it is : 2013-01-15

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Tom Lane
Sergey Koposov kopo...@ast.cam.ac.uk writes: And I do see the tblspc file left after the finish of make check: tmp_check/data/pg_tblspc/16385/PG_9.3_201212081/16384/16387 Interesting. If the tests are run immediately after initdb, 16387 is the relfilenode assigned to table foo in the

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Sergey Koposov
On Tue, 15 Jan 2013, Tom Lane wrote: BTW, I just finished trying to reproduce this on an IA64 machine belonging to Red Hat, without success. So that seems to eliminate any possibility of the machine architecture being the trigger issue. The compiler's still a likely cause though. Anybody have

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-15 14:40:11 -0500, Tom Lane wrote: Sergey Koposov kopo...@ast.cam.ac.uk writes: And I do see the tblspc file left after the finish of make check: tmp_check/data/pg_tblspc/16385/PG_9.3_201212081/16384/16387 Interesting. If the tests are run immediately after initdb, 16387

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: I played around a bit (thanks Sergey!) and it seems to be some rather strange optimization issue around the fsync request queue. Namely changing request->rnode = rnode; into request->rnode.spcNode = rnode.spcNode;
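For readers unfamiliar with the two forms being compared, the following is a minimal sketch of a whole-struct assignment next to a member-by-member copy. In standard C both should leave the destination with the same field values, which is why a difference in behavior points at the optimizer. The type names `DemoRelFileNode` and `DemoRequest`, the helper functions, and the `dbNode`/`relNode` members are assumptions made for illustration; only `rnode` and `spcNode` appear in the message above.

```c
#include <assert.h>
#include <string.h>

/* Stand-in types for illustration only; not the real PostgreSQL structs. */
typedef struct DemoRelFileNode
{
    unsigned spcNode;   /* tablespace (named in the message) */
    unsigned dbNode;    /* assumed member */
    unsigned relNode;   /* assumed member */
} DemoRelFileNode;

typedef struct DemoRequest
{
    DemoRelFileNode rnode;
    unsigned        segno;  /* assumed extra payload */
} DemoRequest;

/* Form 1: copy the whole struct in a single assignment. */
static void fill_by_struct_assignment(DemoRequest *request, DemoRelFileNode rnode)
{
    request->rnode = rnode;
}

/* Form 2: copy each member separately. */
static void fill_by_member(DemoRequest *request, DemoRelFileNode rnode)
{
    request->rnode.spcNode = rnode.spcNode;
    request->rnode.dbNode = rnode.dbNode;
    request->rnode.relNode = rnode.relNode;
}

/* Compare field by field, so padding bytes cannot affect the result. */
static int same_rnode(const DemoRelFileNode *a, const DemoRelFileNode *b)
{
    return a->spcNode == b->spcNode &&
           a->dbNode == b->dbNode &&
           a->relNode == b->relNode;
}
```

Comparing the results of both helpers with `same_rnode()` at different optimization levels is one possible way to check whether a compiler treats the two forms differently.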

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-15 17:56:40 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I played around a bit (thanks Sergey!) and it seems to be some rather strange optimization issue around the fsync request queue. Namely changing request->rnode = rnode; into

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-16 00:26:01 +0100, Andres Freund wrote: On 2013-01-15 17:56:40 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I played around a bit (thanks Sergey!) and it seems to be some rather strange optimization issue around the fsync request queue. Namely

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: FWIW it's also triggerable if two other function calls are placed inside the above if() (I tried fprintf(stderr, argh) and kill(0, 0)). [ confused... ] You mean replacing the abort() in the elog macro with one of these functions? Or something else?

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-15 19:56:52 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: FWIW it's also triggerable if two other function calls are placed inside the above if() (I tried fprintf(stderr, argh) and kill(0, 0)). [ confused... ] You mean replacing the abort() in the elog

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2013-01-15 19:56:52 -0500, Tom Lane wrote: At this point I'm more interested in his report in alpine.lrh.2.03.1301152012220@ast.cam.ac.uk about the Assert at spgdoinsert.c:1222 failing. That's pretty new code, so more likely to have a

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-16 02:13:26 +0100, Andres Freund wrote: On 2013-01-15 19:56:52 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: FWIW it's also triggerable if two other function calls are placed inside the above if() (I tried fprintf(stderr, argh) and kill(0, 0)). [

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-15 20:32:00 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2013-01-15 19:56:52 -0500, Tom Lane wrote: At this point I'm more interested in his report in alpine.lrh.2.03.1301152012220@ast.cam.ac.uk about the Assert at spgdoinsert.c:1222 failing.

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Andres Freund
On 2013-01-16 02:34:52 +0100, Andres Freund wrote: On 2013-01-16 02:13:26 +0100, Andres Freund wrote: On 2013-01-15 19:56:52 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: FWIW it's also triggerable if two other function calls are placed inside the above if() (I

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: -O0 passes Grumble... suspect we're chasing another compiler bug now, but ... You might try -O1; if that shows the bug it'll probably be a tad easier to debug in. regards, tom lane

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Tom Lane
It's a compiler bug. icc 11.1 apparently thinks that this loop in doPickSplit:

    /*
     * Update nodes[] array to point into the newly formed innerTuple, so that
     * we can adjust their downlinks below.
     */
    SGITITERATE(innerTuple, i, node)
    {
        nodes[i] = node;
    }

is
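To make the shape of that loop concrete, here is a small sketch of an iteration macro that walks variable-length records packed back to back and stores a pointer to each one into an array. It is only an analogue of the pattern quoted above: `DemoNode`, `DEMO_ITERATE`, and `collect_nodes` are hypothetical names, and the real `SGITITERATE` macro and SP-GiST tuple layout are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variable-length record: a header followed by payload bytes. */
typedef struct DemoNode
{
    unsigned size;      /* total size in bytes, header included */
} DemoNode;

/*
 * Walk `count` records starting at `base`, stepping by each record's own
 * size. `i` is the index variable and `node` the cursor. On the final step
 * the cursor ends up just past the last record and is never dereferenced.
 */
#define DEMO_ITERATE(base, count, i, node) \
    for ((i) = 0, (node) = (DemoNode *) (base); \
         (i) < (count); \
         (i)++, (node) = (DemoNode *) ((char *) (node) + (node)->size))

/* Fill nodes[] with pointers to each record; the array is only written. */
static void collect_nodes(void *base, int count, DemoNode **nodes)
{
    int       i;
    DemoNode *node;

    DEMO_ITERATE(base, count, i, node)
    {
        nodes[i] = node;
    }
}
```

The body of the loop performs only stores into `nodes[]`, while the loop header reads each record's size to find the next one, which is the same read/write split the quoted loop has.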

[HACKERS] Curious buildfarm failures

2013-01-14 Thread Tom Lane
Since commit 2065dd2834e832eb820f1fbcd16746d6af1f6037, there have been a few buildfarm failures along the lines of -- Commit table drop COMMIT PREPARED 'regress-two'; ! PANIC: failed to re-find shared proclock object ! PANIC: failed to re-find shared proclock object ! connection to server

Re: [HACKERS] Curious buildfarm failures

2013-01-14 Thread Andres Freund
On 2013-01-14 16:35:48 -0500, Tom Lane wrote: Since commit 2065dd2834e832eb820f1fbcd16746d6af1f6037, there have been a few buildfarm failures along the lines of -- Commit table drop COMMIT PREPARED 'regress-two'; ! PANIC: failed to re-find shared proclock object ! PANIC: failed to

Re: [HACKERS] Curious buildfarm failures

2013-01-14 Thread Andres Freund
On 2013-01-14 22:50:16 +0100, Andres Freund wrote: On 2013-01-14 16:35:48 -0500, Tom Lane wrote: Since commit 2065dd2834e832eb820f1fbcd16746d6af1f6037, there have been a few buildfarm failures along the lines of -- Commit table drop COMMIT PREPARED 'regress-two'; ! PANIC: failed

Re: [HACKERS] Curious buildfarm failures

2013-01-14 Thread Heikki Linnakangas
On 14.01.2013 23:35, Tom Lane wrote: Since commit 2065dd2834e832eb820f1fbcd16746d6af1f6037, there have been a few buildfarm failures along the lines of -- Commit table drop COMMIT PREPARED 'regress-two'; ! PANIC: failed to re-find shared proclock object ! PANIC: failed to re-find shared

Re: [HACKERS] Curious buildfarm failures

2013-01-14 Thread Heikki Linnakangas
On 15.01.2013 00:14, Heikki Linnakangas wrote: On 14.01.2013 23:35, Tom Lane wrote: Since commit 2065dd2834e832eb820f1fbcd16746d6af1f6037, there have been a few buildfarm failures along the lines of -- Commit table drop COMMIT PREPARED 'regress-two'; ! PANIC: failed to re-find shared proclock

Re: [HACKERS] Curious buildfarm failures

2013-01-14 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes: The problem seems to be when the old and the new key hash to the same bucket. In that case, hash_update_hash_key() tries to link the entry to itself. The attached patch fixes it for me. Doh! I had a feeling that that needed a special case,
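The same-bucket hazard described above can be illustrated with a toy chained hash table. This is a sketch under stated assumptions, not dynahash's code and not the attached patch: `Entry`, `buckets`, `bucket_of`, `insert`, and `rekey` are all hypothetical names, and the modulo hash is chosen only to keep the example short. The point is the early return when the old and new keys land in the same bucket, so the entry is never unlinked and relinked within its own chain.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 4

/* Hypothetical chained-hash entry; not dynahash's real layout. */
typedef struct Entry
{
    struct Entry *next;
    unsigned      key;
} Entry;

static Entry *buckets[NBUCKETS];

static unsigned bucket_of(unsigned key)
{
    return key % NBUCKETS;
}

/* Push an entry onto the front of the chain for its key. */
static void insert(Entry *e, unsigned key)
{
    unsigned b = bucket_of(key);

    e->key = key;
    e->next = buckets[b];
    buckets[b] = e;
}

/* Change the key of an entry that is already in the table. */
static void rekey(Entry *e, unsigned newkey)
{
    unsigned oldb = bucket_of(e->key);
    unsigned newb = bucket_of(newkey);
    Entry  **link;

    e->key = newkey;

    /* Special case: same chain, so there is nothing to relink. */
    if (oldb == newb)
        return;

    /* Find the pointer that refers to e in the old chain and unlink it. */
    for (link = &buckets[oldb]; *link != NULL && *link != e; link = &(*link)->next)
        ;
    if (*link == NULL)
        return;             /* entry was not in the table */
    *link = e->next;

    /* Push e onto the front of the new chain. */
    e->next = buckets[newb];
    buckets[newb] = e;
}
```

With `NBUCKETS` set to 4, changing a key from 1 to 5 stays in bucket 1 and takes the early return, while changing it to 2 moves the entry to a different chain.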