On Thu, Jun 11, 2020 at 9:43 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Thu, Jun 11, 2020 at 2:13 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > I have in the past scraped the latter results and tried to make sense of
> > them.  They are *mighty* noisy, even when considering just one animal
> > that I know to be running on a machine with little else to do.  Maybe
> > averaging across the whole buildfarm could reduce the noise level, but
> > I'm not very hopeful.  Per-test-script times would likely be even
> > noisier (ISTM anyway, maybe I'm wrong).
>
> I've been doing that in a little database that pulls down the results
> and analyses them with primitive regexes.  First I wanted to know the
> pass/fail history for each individual regression, isolation and TAP
> script, then I wanted to build something that could identify tests
> that are 'flapping', and work out when they started and stopped
> flapping etc.  I soon realised it was all too noisy, but then I
> figured that I could fix that by detecting crashes.  So I classify
> every top-level build farm run as SUCCESS, FAILURE or CRASH.  If the
> top-level run was CRASH, then I can disregard the individual
> per-script results, because they're all BS.
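For what it's worth, the classification step is roughly the following sketch. The crash signatures and function names here are made up for illustration (real buildfarm logs vary by platform and branch), but the shape of the idea is: scan the run's log for crash evidence first, and only trust per-script pass/fail data when the run didn't crash.

```python
import re

# Hypothetical crash signatures for illustration only; a real scraper
# would need a platform-aware list tuned against actual buildfarm logs.
CRASH_PATTERNS = [
    re.compile(r"PANIC:"),
    re.compile(r"server closed the connection unexpectedly"),
    re.compile(r"terminated by signal \d+"),
]

def classify_run(log_text, failed_scripts):
    """Classify a top-level buildfarm run as SUCCESS, FAILURE or CRASH."""
    if any(p.search(log_text) for p in CRASH_PATTERNS):
        return "CRASH"
    return "FAILURE" if failed_scripts else "SUCCESS"

def usable_script_results(run_status, per_script_results):
    """After a CRASH, per-script results are meaningless, so drop them."""
    return per_script_results if run_status != "CRASH" else {}
```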
With more coffee I realise that you were talking about noisy timings, not noisy pass/fail results.  But I still want to throw that idea out there, if we're considering analysing the logs.