[gem5-dev] Test success criteria

Andreas Sandberg Fri, 03 Jun 2016 08:02:21 -0700

Hi Everyone,

As most of you have probably noticed, I have been poking around in gem5's
test infrastructure. There are some changes I would like to make to the
test success/fail criteria we currently enforce.


In the past, we diffed all test outputs (configs, stderr, stdout,
stats, terminal), but only failed if the stats differed. This means that in
practice, the only pass/fail criterion is a matching stat file. The reason
for not failing on output diffs was most likely that someone got sick of
updating diff rules to prevent false test failures.

In general, I find it very annoying having to update reference output/stat
files for trivial changes (e.g., adding some stat). It makes the revision
history ugly, the diffs are large, and it makes bisection annoying (people
usually update stats for batches of changes at a time). Additionally,
quite a few stats are effectively debug stats and tell us very little
about performance.

I would suggest that we do this:

 * Optional tests (e.g., EIO and Solaris boot) should be purely
functional. That is, no stat diffing and no output diffing. The tests are
successful if gem5 runs to completion without exploding.

 * Don't diff output files (config.ini, config.json, simout, simerr,
system.terminal) in the general case.

 * Don't diff output files or stat files for tests that check functional
behaviour. For example: learning-gem5*, switcheroo tests, checkpoint
tests, CPU checker tests.

 * Include /test/ output and test stdout/stderr for SE tests where the
output can be used to determine if the test executed correctly.

 * Consider running gem5 with --quiet to reduce simout/simerr noise.

 * Disable skipped stats at runtime, or automate their removal when
updating reference data.

 * Never fail if new stats are added.
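
As a rough sketch of how the last two points could work, the comparison
below only walks the stats present in the reference file, so stats that
are new in the output never cause a failure, and it ignores anything in a
skip list. It assumes a stat file with one "name value # description"
entry per line. The names parse_stats and compare_stats and the skip list
are hypothetical, not part of the existing test scripts, and treating a
stat that disappears from the output as a failure is my assumption.

```python
import math


def parse_stats(text):
    """Parse 'name value # description' lines into a {name: float} dict."""
    stats = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # banner lines and non-numeric entries
    return stats


def compare_stats(ref, new, skipped=(), rel_tol=0.0):
    """Return a list of failure messages; an empty list means a pass."""
    failures = []
    # Only reference stats are checked, so stats that exist only in
    # `new` (newly added stats) are never reported.
    for name, ref_value in ref.items():
        if name in skipped:
            continue
        if name not in new:
            # Assumption: a stat that disappeared counts as a failure.
            failures.append(f"{name}: missing from new output")
            continue
        new_value = new[name]
        if math.isnan(ref_value) and math.isnan(new_value):
            continue  # treat nan == nan as a match
        if not math.isclose(ref_value, new_value, rel_tol=rel_tol):
            failures.append(f"{name}: expected {ref_value}, got {new_value}")
    return failures
```

For example, compare_stats(ref, new, skipped={"host_seconds"}) would
ignore a host-dependent stat while still checking everything else in the
reference.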

To reduce test runtime, I would suggest hacking our images a bit to avoid
running the full boot process. In some cases (especially for modern disk
images), starting user space can be a substantial portion of a test's
runtime. We could just tell the kernel to run the test instead of init,
which would avoid starting udev and a lot of other (slow) parts of the boot
process.
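
A minimal sketch of the idea, assuming the test is available as an
executable inside the disk image: the Linux kernel's init= parameter
selects the program started as the first user-space process. The helper
name build_boot_args and the path /sbin/run-test.sh are hypothetical, and
how the resulting string reaches the kernel depends on the config script.

```python
def build_boot_args(base_args, test_binary=None):
    """Return a kernel command line that runs test_binary instead of init."""
    # Drop any existing init= setting so we never pass two of them.
    args = [a for a in base_args.split() if not a.startswith("init=")]
    if test_binary is not None:
        # init= is a standard Linux kernel command-line parameter.
        args.append(f"init={test_binary}")
    return " ".join(args)


boot_args = build_boot_args(
    "console=ttyAMA0 root=/dev/sda1 rw",
    test_binary="/sbin/run-test.sh",  # hypothetical path in the image
)
print(boot_args)
```

Since the kernel panics if the init process exits, the test would
presumably have to end the simulation itself rather than just return.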

Cheers,
Andreas
