Thomas Munro <thomas.mu...@gmail.com> writes:
> I've been doing that in a little database that pulls down the results
> and analyses them with primitive regexes. First I wanted to know the
> pass/fail history for each individual regression, isolation and TAP
> script, then I wanted to build something that could identify tests
> that are 'flapping', and work out when they started and stopped
> flapping etc. I soon realised it was all too noisy, but then I
> figured that I could fix that by detecting crashes. So I classify
> every top level build farm run as SUCCESS, FAILURE or CRASH. If the
> top level run was CRASH, then I can disregard the individual per
> script results, because they're all BS.

If you can pin the crash on a particular test script, it'd be useful
to track that as a kind of failure. In general, though, both crashes
and non-crash failures tend to cause collateral damage to later test
scripts --- if you can't filter that out then the later scripts will
have high false-positive rates.

			regards, tom lane
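
For illustration only, the classification and filtering discussed above might be sketched roughly as follows. This is not the actual tool: the names `RunStatus`, `CRASH_PATTERNS`, `classify_run` and `usable_results` are hypothetical, the log patterns are assumed examples of crash markers, and "keep results only up to the first failing script" is just one crude way to filter out collateral damage.

```python
import re
from enum import Enum


class RunStatus(Enum):
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    CRASH = "CRASH"


# Assumed crash markers; a real classifier would need patterns tuned
# against actual build farm logs.
CRASH_PATTERNS = [
    re.compile(r"terminated by signal"),
    re.compile(r"server closed the connection unexpectedly"),
]


def classify_run(log_text, script_results):
    """Classify one top-level run.

    script_results is a list of (script_name, passed) in execution order.
    """
    if any(p.search(log_text) for p in CRASH_PATTERNS):
        return RunStatus.CRASH
    if all(passed for _, passed in script_results):
        return RunStatus.SUCCESS
    return RunStatus.FAILURE


def usable_results(status, script_results):
    """Return the per-script results that seem safe to count.

    For a CRASH run nothing is kept. Otherwise results are kept up to and
    including the first failing script, since later failures may only be
    collateral damage from the earlier one.
    """
    if status is RunStatus.CRASH:
        return []
    kept = []
    for name, passed in script_results:
        kept.append((name, passed))
        if not passed:
            break
    return kept
```

With this sketch, a run whose second script fails would contribute only its first two results to the per-script history, and a run classified as CRASH would contribute none. Pinning a crash on a particular script, as suggested above, would need extra information (for example, which script was running when the crash marker appeared) that this sketch does not model.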