> PASS -> ANY ; Test moves away from PASS

No, it is only a regression if the destination result is FAIL. (If it's UNRESOLVED, there might be a separate regression elsewhere: an execution test becoming UNRESOLVED should be accompanied by its compilation becoming FAIL.) If it's XFAIL, it might formally be a regression, but one already being tracked in another way (presumably Bugzilla), which should not turn the bot red. If it's XPASS, that simply means the XFAILing conditions are slightly wider than necessary in order to mark a failure in another configuration as expected. My suggestion is: PASS -> FAIL is an unambiguous regression. Anything else -> FAIL, as well as newly appearing FAILing tests, aren't regressions at the individual-test level, but may be treated as such at the whole-testsuite level.
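For illustration only, a minimal sketch of that rule in Python (the function names are made up, this is not any existing contrib script):

# Classify a single test's status transition under the rule
# "PASS -> FAIL is an unambiguous regression; other changes are only
# interesting at the whole-testsuite level".  Status strings are the
# usual DejaGnu ones (PASS, FAIL, XFAIL, XPASS, UNRESOLVED, UNSUPPORTED).

def is_individual_regression(old_status, new_status):
    return old_status == "PASS" and new_status == "FAIL"

def is_suite_level_concern(old_status, new_status):
    # Anything newly FAILing (including a test that did not exist before,
    # old_status None) is worth a look for the testsuite as a whole.
    return new_status == "FAIL" and not is_individual_regression(old_status, new_status)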
I don't have a strong opinion on the definition of a Regression in this context, but I would very much like to see status changes highlighted in the test results, to indicate that something that worked before no longer works as well, and to help us spot the kinds of problems I've run into and had trouble with. (Showing the SVN revision number along with each transition would be great.)

Here are a couple of examples. A recent change of mine caused a test in the target_supports.exp file to fail to detect attribute ifunc support. That in turn prevented the regression tests for the attribute from being compiled (it changed them from PASS to UNSUPPORTED), which ultimately masked a bug my change had introduced. The script that looks for regressions in my own test results would normally catch this before I commit such a change. Unfortunately, the script ignores results with the UNSUPPORTED status, so this bug slipped in unnoticed. Regardless of whether or not these types of errors are considered Regressions, highlighting them, perhaps in different colors, would be helpful.
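Something along these lines would do it; this is only a rough sketch in Python (made-up file handling and simple ANSI colors, not the script I actually use), comparing two .sum files and highlighting every status change, including PASS -> UNSUPPORTED:

# Compare two DejaGnu .sum files and print every test whose status
# changed, colour-coding the transitions so that e.g. PASS -> UNSUPPORTED
# stands out instead of being silently ignored.
import re
import sys

LINE = re.compile(r'^(PASS|FAIL|XPASS|XFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED): (.*)$')

def read_sum(path):
    # Note: if a test name appears more than once, this sketch keeps
    # only the last result.
    results = {}
    with open(path, errors="replace") as f:
        for line in f:
            m = LINE.match(line)
            if m:
                results[m.group(2)] = m.group(1)
    return results

def main(old_sum, new_sum):
    old, new = read_sum(old_sum), read_sum(new_sum)
    for test in sorted(set(old) | set(new)):
        before, after = old.get(test, "<absent>"), new.get(test, "<absent>")
        if before == after:
            continue
        # Red for new FAILs, yellow for every other change.
        colour = "\033[31m" if after == "FAIL" else "\033[33m"
        print(f"{colour}{before} -> {after}\033[0m: {test}")

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])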
Any transition where the destination result is not FAIL is not a regression. ERRORs in the .sum or .log files should be watched for as well, however: they sometimes indicate broken Tcl syntax in the testsuite, which can cause many tests not to be run at all.
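A quick way to surface those is something like the following (sketch only, with an assumed testsuite/ directory layout):

# Flag DejaGnu ERROR lines in .sum and .log files, since a Tcl syntax
# error can silently prevent whole groups of tests from running.
import glob

for path in glob.glob("testsuite/**/*.sum", recursive=True) + \
            glob.glob("testsuite/**/*.log", recursive=True):
    with open(path, errors="replace") as f:
        for lineno, line in enumerate(f, 1):
            if line.startswith("ERROR:"):
                print(f"{path}:{lineno}: {line.rstrip()}")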
Yes, please. I once had a problem with a test containing a bad DejaGnu directive. The test failed in a non-obvious way (I think it caused an ERROR in the log), which caused a small number of tests that ran after it to fail. Because of parallel make (I run tests with make -j96), the failing tests changed from one run of the test suite to the next, and the whole problem ended up being quite hard to debug. (The ultimate root cause was a stray backslash in a dg-warning directive, introduced by copying and pasting between an Emacs session in one terminal and a vi session in another. The backslash was in column 80 and so virtually impossible to see.)

Martin
