I think stats-based verification would be worth it in small doses. If I just had to worry about the hello world tests (which I think exercise stats reasonably well), then that wouldn't be that bad.
Gabe

On Thu, Dec 4, 2014 at 11:09 AM, mike upton via gem5-dev
<[email protected]> wrote:

> I would love to contribute to this...
>
> Does anyone have gem5 hooked up to ant or other CI testing
> infrastructure?
>
> On Wed, Dec 3, 2014 at 9:58 PM, Steve Reinhardt via gem5-dev
> <[email protected]> wrote:
>
> > Hi Gabe,
> >
> > There's a long history here; I think everyone agrees the status quo
> > wrt testing is inadequate, but there are a lot of different needs as
> > well. I won't go into a lot of detail, but there is a wiki page left
> > over from our last attempt: http://gem5.org/NewRegressionFramework.
> > Actually I see now that you contributed to an early version of that.
> >
> > I'm not opposed to us having more unit tests and a framework to run
> > them. Having the ability to integrate unit tests into the regressions
> > would be a good goal for a new regression system.
> >
> > Having better unit tests might provide a nice middle ground between,
> > on the one hand, running a few tests targeting whatever you're doing
> > (the bug you're fixing or feature you're adding), plus a few quick
> > "hello world" tests (which gives you a feeling that your change is
> > "probably good", for some definition of probably); and on the other
> > hand, running the full regression suite. I'm not sure it would
> > replace either one of those though. Thus, to be honest, I think the
> > testing situation has bigger problems at this point; there's a lot on
> > that wiki page, and unit testing isn't even mentioned.
> >
> > As far as your points 2 & 3: The regression tests do print out
> > 'FAILED' vs. 'CHANGED' or something like that, so you can tell the
> > difference between functional failures and stats changes pretty
> > easily. You can look at the stats differences in the test output
> > directory to see exactly what the changes are. The job of figuring
> > out whether a particular set of stats changes is "reasonable" given
> > some actual modeling change seems inherently impossible to automate,
> > so I'm not sure what you're looking for there.
> >
> > Ali said he's been working on a new test framework; at this point I
> > expect that's our best bet for moving forward. I'll let him decide
> > whether he's ready to say more about it.
> >
> > Steve
> >
> > On Sun, Nov 23, 2014 at 6:51 AM, Gabe Black via gem5-dev
> > <[email protected]> wrote:
> >
> > > Hi everybody. I'd like to start a conversation about testing
> > > strategies and gem5. Please let me know if my understanding is out
> > > of date, but I think the primary mechanism we use for testing is
> > > running benchmarks, booting, etc., and making sure the stats
> > > haven't changed. There are a few things that make that not so
> > > great.
> > >
> > > 1. Benchmarks can take a LONG time to run. I'd like to know whether
> > >    my change is probably good in a couple seconds, not a couple
> > >    hours.
> > > 2. There isn't much of an indication of *what* went wrong, just
> > >    that something somewhere changed.
> > > 3. There isn't much of an indication *if* something went wrong. For
> > >    a certain class of changes, it's reasonable to expect the stats
> > >    to stay the same. For instance, a simulator performance
> > >    optimization shouldn't change the stats. If you add a new
> > >    device, change how execution works, fix some microcode, etc.,
> > >    you just have to guesstimate if the amount of change looks
> > >    reasonable and update the stats. Which, per 1, can take hours.
> > > 4. Merge conflicts. If two people make changes that affect the
> > >    stats, one will go in first, and the other person will have to
> > >    rebase on top of those changes and rerun the stats. Which, per
> > >    1, can take hours.
> > >
> > > I know writing new tests isn't what most people want to be doing
> > > with their time (including me), but as far as I can see this is a
> > > big shortcoming of the simulator as it stands. I think we would get
> > > a lot of benefit from more unit tests of both base functionality
> > > (we have a little of that), and of device models, execution bits,
> > > etc. (we have none?). We could either expand on the unit test code
> > > we have, or bring in an existing framework like this one:
> > >
> > > https://code.google.com/p/googletest/
> > >
> > > I've never used that or know much of anything about it.
> > >
> > > It *should* be easy for us to use our modularity and object
> > > oriented design to pull pieces of the simulator into test harnesses
> > > and make sure they do reasonable things in isolation. If it isn't,
> > > maybe that's something we should fix too.
> > >
> > > We should also think about how to make it easy/automatic to run
> > > unit tests, and how to get them to run automatically alongside the
> > > nightly regression runs.
> > >
> > > Gabe
> > > _______________________________________________
> > > gem5-dev mailing list
> > > [email protected]
> > > http://m5sim.org/mailman/listinfo/gem5-dev
