Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Andrew Dunstan
Joshua D. Drake wrote: Andrew Dunstan wrote: Tom Lane wrote: Martijn van Oosterhout writes: But I'm just sprouting ideas here, the proof is in the pudding. If the logs are easily available (or a subset of, say the last month) then people could play with that and see what happ

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Joshua D. Drake
Andrew Dunstan wrote: > Tom Lane wrote: >> Martijn van Oosterhout writes: >> >>> But I'm just sprouting ideas here, the proof is in the pudding. If the >>> logs are easily available (or a subset of, say the last month) then >>> people could play with that and see what happens... >>> >> >> A

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Andrew Dunstan
Tom Lane wrote: Martijn van Oosterhout writes: But I'm just sprouting ideas here, the proof is in the pudding. If the logs are easily available (or a subset of, say the last month) then people could play with that and see what happens... Anyone who wants to play around can replicate w

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Tom Lane
Martijn van Oosterhout writes: > But I'm just sprouting ideas here, the proof is in the pudding. If the > logs are easily available (or a subset of, say the last month) then > people could play with that and see what happens... Anyone who wants to play around can replicate what I did, which was t

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Martijn van Oosterhout
On Tue, Mar 20, 2007 at 11:36:09AM -0400, Andrew Dunstan wrote: > My biggest worry apart from maintenance (which doesn't matter that much > - if people don't enter the regexes they don't get the tags they want) > is that the regexes will not be specific enough, and so give false > positives on t

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Andrew Dunstan
Arturo Perez wrote: I don't know if this has come up yet but In terms of tagging errors we might be able to use some machine learning techniques. There are NLP/learning systems that interpret logs. They learn over time what is normal and what isn't and can flag things that are abnormal

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Arturo Perez
I don't know if this has come up yet but In terms of tagging errors we might be able to use some machine learning techniques. There are NLP/learning systems that interpret logs. They learn over time what is normal and what isn't and can flag things that are abnormal. For example,

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes: > The wrinkle is that applying the tags on the fly is probably not a great > idea - the status page query is already in desperate need of overhauling > because it's too slow. So we'd need a daemon to set up the tags in the > background. But that's an im

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Andrew Dunstan
Tom Lane wrote: The point I think you are missing is that having something like this will *eliminate* repetitive, boring work, namely recognizing multiple reports of the same problem. The buildfarm has gotten big enough that some way of dealing with that is desperately needed, else our ability t

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Martijn van Oosterhout wrote: >> Maybe a simple compromise would be being able to set up a set of regexes >> that search the output and set a flag if that string is found. If you >> find the string, it gets marked with a flag, which means that when you >>
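The regex-flagging idea quoted above could be sketched roughly as follows. This is only an illustration, not anything the buildfarm actually runs: the tag names, the patterns, the `log` field, and the functions `tag_log` and `untagged` are all hypothetical, chosen to mirror the kinds of noise mentioned in the thread (compiler internal errors, machines running out of disk space).

```python
import re

# Hypothetical tag rules: (tag name, compiled pattern).
# Both the names and the patterns are assumptions made for illustration.
TAG_RULES = [
    ("compiler-internal-error", re.compile(r"internal error", re.IGNORECASE)),
    ("out-of-disk-space", re.compile(r"No space left on device")),
]


def tag_log(log_text, rules=TAG_RULES):
    """Return the names of all rules whose pattern occurs in log_text."""
    return [name for name, pattern in rules if pattern.search(log_text)]


def untagged(reports, rules=TAG_RULES):
    """Return the reports whose log matches no rule (the unexplained ones).

    Each report is assumed to be a dict with a "log" key holding the text.
    """
    return [report for report in reports if not tag_log(report["log"], rules)]
```

A loose pattern such as `internal error` shows the false-positive risk raised elsewhere in the thread: it would tag any log containing that phrase, whatever produced it. Tighter patterns trade that risk for more maintenance.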

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Andrew Dunstan
Alvaro Herrera wrote: Andrew Dunstan wrote: The buildfarm works because it leverages our strength, namely automating things. But all the tagging suggestions I've seen will involve regular, repetitive and possibly boring work, precisely the thing we are not good at as a group. You ma

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Alvaro Herrera
Andrew Dunstan wrote: > The buildfarm works because it leverages our strength, namely automating > things. But all the tagging suggestions I've seen will involve regular, > repetitive and possibly boring work, precisely the thing we are not good > at as a group. You may be forgetting that Mart

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Andrew Dunstan
Stefan Kaltenbrunner wrote: however as a buildfarm admin I occasionally wished I had a way to invalidate reports generated from my boxes to prevent someone wasting time investigating them (like errors caused by system upgrades, configuration problems or other local issues). It would be extr

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Stefan Kaltenbrunner
Andrew Dunstan wrote: Martijn van Oosterhout wrote: On Tue, Mar 20, 2007 at 02:57:13AM -0400, Tom Lane wrote: Maybe we should think about filtering the noise. Like, say, discarding every report from mongoose that involves an icc core dump ... http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Andrew Dunstan
Martijn van Oosterhout wrote: On Tue, Mar 20, 2007 at 02:57:13AM -0400, Tom Lane wrote: Maybe we should think about filtering the noise. Like, say, discarding every report from mongoose that involves an icc core dump ... http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mongoose&dt=2007-03-2

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-20 Thread Martijn van Oosterhout
On Tue, Mar 20, 2007 at 02:57:13AM -0400, Tom Lane wrote: > Maybe we should think about filtering the noise. Like, say, discarding > every report from mongoose that involves an icc core dump ... > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mongoose&dt=2007-03-20%2006:30:01 Maybe a simple c

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> But we've already had a couple of cases of interesting failures going >> unnoticed because of the noise level. Between duplicate reports about >> busted patches and transient problems on particular build machines >> (out of disk space

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Andrew Dunstan
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Tom Lane wrote: Actually what I *really* want is something closer to "show me all the unexplained failures", but unless Andrew is willing to support some way of tagging failures in the master database, I suppose that won't happe

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Actually what I *really* want is something closer to "show me all the >> unexplained failures", but unless Andrew is willing to support some way >> of tagging failures in the master database, I suppose that won't happen. > Who would d

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Andrew Dunstan
I wrote: 2. I was annoyed repeatedly that some buildfarm members weren't reporting log_archive_filenames entries, which forced going the long way round in the process I was using. Seems like we need some more proactive means for getting buildfarm owners to keep their script versions up-to-date.

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Andrew Dunstan
Tom Lane wrote: I think what would be nice is some way to view all the failures for a given branch, extending back not-sure-how-far. Right now the only way to see past failures is to look at individual machines' histories, which is not real satisfactory when you want a broader view. Actually wh
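As a rough sketch of the "all failures for a given branch" view requested here, assuming failure records carry the `sysname`, `snapshot`, and `branch` fields that appear in the `mfailures` extract discussed in this thread: the function name `failures_for_branch` and the dict-based record shape are hypothetical, and a real implementation would presumably be a query against the buildfarm database rather than in-memory filtering.

```python
from datetime import datetime
from typing import Optional


def failures_for_branch(failures, branch, since: Optional[datetime] = None):
    """Return failures on one branch across all machines, newest first.

    failures: iterable of dicts with "sysname", "snapshot" (datetime)
              and "branch" keys (an assumed record shape).
    since:    optional lower bound on the snapshot time; None means
              "as far back as the data goes".
    """
    selected = [
        f
        for f in failures
        if f["branch"] == branch and (since is None or f["snapshot"] >= since)
    ]
    return sorted(selected, key=lambda f: f["snapshot"], reverse=True)
```

Leaving `since` optional reflects the "not-sure-how-far" part of the request: the caller decides how far back to look.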

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Tom Lane
"Joshua D. Drake" <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> The current buildfarm webpages make it easy to see when a branch tip >> is seriously broken, but it's not very easy to investigate transient >> failures, such as a regression test race condition that only >> materializes once in awh

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Gregory Stark
"Gregory Stark" <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: > >> row-ordering discrepancy in rowtypes test| >> 2007-02-10 03:00:02 | 3 > > Is this because the test is fixed or unfixable? If not shouldn't the test get > an ORDER BY clause

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: >> missing BYTE_ORDER definition for Solaris| >> 2007-01-10 14:18:23 | 1 > What is this BYTE_ORDER macro? Should I be using it instead of the > AC_C_BIGENDIAN test in configure for t

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Andrew Dunstan
Tom Lane wrote: BTW, before I forget, this little project turned up a couple of small improvements for the current buildfarm infrastructure: 1. There are half a dozen entries with obviously bogus timestamps: bfarm=# select sysname,snapshot,branch from mfailures where snapshot < '2004-01-01';

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Stefan Kaltenbrunner
Gregory Stark wrote: "Tom Lane" <[EMAIL PROTECTED]> writes: Also, for completeness, the causes I wrote off as not interesting (anymore, in some cases): missing BYTE_ORDER definition for Solaris| 2007-01-10 14:18:23 | 1 What is this BYTE_ORDER macro? Should I

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-19 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes: > Also, for completeness, the causes I wrote off as not interesting > (anymore, in some cases): > > missing BYTE_ORDER definition for Solaris| > 2007-01-10 14:18:23 | 1 What is this BYTE_ORDER macro? Should I be using it in

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-18 Thread Tom Lane
BTW, before I forget, this little project turned up a couple of small improvements for the current buildfarm infrastructure: 1. There are half a dozen entries with obviously bogus timestamps: bfarm=# select sysname,snapshot,branch from mfailures where snapshot < '2004-01-01'; sysname |
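The bogus-timestamp check quoted above can be tried out locally with a small sketch like the one below. It uses an in-memory SQLite database purely as a stand-in for the PostgreSQL database in the thread, and keeps only the three columns named in the query (`sysname`, `snapshot`, `branch`). The function name `find_bogus_snapshots` is hypothetical, and snapshots are assumed to be stored as `YYYY-MM-DD HH:MM:SS` strings so that text comparison orders them chronologically.

```python
import sqlite3


def find_bogus_snapshots(rows, cutoff="2004-01-01"):
    """Return (sysname, snapshot, branch) rows with a snapshot before cutoff.

    rows: iterable of (sysname, snapshot, branch) tuples, where snapshot
          is an ISO-style 'YYYY-MM-DD HH:MM:SS' string (an assumption).
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Reduced stand-in for the mfailures table; only the queried columns.
        conn.execute(
            "CREATE TABLE mfailures (sysname TEXT, snapshot TEXT, branch TEXT)"
        )
        conn.executemany("INSERT INTO mfailures VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT sysname, snapshot, branch FROM mfailures "
            "WHERE snapshot < ? ORDER BY snapshot",
            (cutoff,),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

The default cutoff mirrors the `'2004-01-01'` literal in the quoted query; any report dated earlier than that is treated as having an implausible clock.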

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-18 Thread Tom Lane
Jeremy Drake <[EMAIL PROTECTED]> writes: > These on mongoose are most likely a result of flaky hardware. Yeah, I saw a pretty fair number of irreproducible issues that are probably hardware flake-outs. Of course you can't tell which are those and which are low-probability software bugs for many m

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-18 Thread Tom Lane
"Joshua D. Drake" <[EMAIL PROTECTED]> writes: >> Some of these might possibly be interesting to other people ... > If you provide the various greps, etc... I will put it into the website > proper... Unfortunately I didn't keep notes on exactly what I searched for in each case. Some of them were

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-18 Thread Jeremy Drake
On Sun, 18 Mar 2007, Tom Lane wrote: > another icc crash| > 2007-02-03 10:50:01 | 1 > icc "internal error" | > 2007-03-16 16:30:01 |29 These on mongoose are most likely a result of flak

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-18 Thread Joshua D. Drake
| 2007-01-31 17:30:01 |16 use of // comment| 2007-02-16 09:23:02 | 1 xml code teething problems | 2007-02-16 16:01:05 |79 (54 rows) Some of these might

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-18 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes: > OK, for anyone that wants to play, I have created an extract that > contains a summary of every non-CVS-related failure we've had. It's a > single table looking like this: I did some analysis on this data. Attached is a text dump of a table declared

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-18 Thread Josh Berkus
Andrew, Lastly, note that some buildfarm enhancements are on the SOC project list. I have no idea if anyone will express any interest in that, of course. It's not very glamorous work. On the other hand, I think there are a lot more student perl hackers and web people than there are folks wit

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Andrew Dunstan
Jeremy Drake wrote: >> >> >> The dump is just under 1Mb and can be downloaded from >> http://www.pgbuildfarm.org/mfailures.dump > > Sure about that? > > HTTP request sent, awaiting response... 200 OK > Length: 9,184,142 (8.8M) [text/plain] > Damn these new specs. They made me skip a digit. cheer

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Jeremy Drake
On Fri, 16 Mar 2007, Andrew Dunstan wrote: > OK, for anyone that wants to play, I have created an extract that contains a > summary of every non-CVS-related failure we've had. It's a single table > looking like this: > > CREATE TABLE mfailures ( >sysname text, >snapshot timestamp without t

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Andrew Dunstan
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Well, the db is currently running around 13Gb, so that's not something to be exported lightly ;-) Yeah. I would assume though that the vast bulk of that is captured log files. For the purposes I'm imagining, it'd be sufficien

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Well, the db is currently running around 13Gb, so that's not something > to be exported lightly ;-) Yeah. I would assume though that the vast bulk of that is captured log files. For the purposes I'm imagining, it'd be sufficient to export only the re

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Joshua D. Drake
> Well, the db is currently running around 13Gb, so that's not something > to be exported lightly ;-) > > If we upgraded from Postgres 8.0.x to 8.2.x we could make use of some > features, like dynamic partitioning and copy from queries, that might > make life easier (CP people: that's a hint :-)

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Andrew Dunstan
Tom Lane wrote: The current buildfarm webpages make it easy to see when a branch tip is seriously broken, but it's not very easy to investigate transient failures, such as a regression test race condition that only materializes once in awhile. I would like to have a way of seeing just the failed

Re: [HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Joshua D. Drake
Tom Lane wrote: > The current buildfarm webpages make it easy to see when a branch tip > is seriously broken, but it's not very easy to investigate transient > failures, such as a regression test race condition that only > materializes once in awhile. I would like to have a way of seeing > just th

[HACKERS] Buildfarm feature request: some way to track/classify failures

2007-03-16 Thread Tom Lane
The current buildfarm webpages make it easy to see when a branch tip is seriously broken, but it's not very easy to investigate transient failures, such as a regression test race condition that only materializes once in awhile. I would like to have a way of seeing just the failed build attempts ac