What I'd like is for us not to use stats as a pass/fail criterion. I'm not sure how that would work, but using the stats is pretty fragile and hard to maintain. It's tricky because you want to make sure the stats themselves are still correct, but there are lots of "correct" stats which are different. I agree that automatically deciding how much the stats should change is not feasible.

I haven't had a chance to read that wiki page, but one thing I remember, perhaps from the last time this came up, is that the regressions we run are benchmarks that do the same thing many times to get steady-state behavior. To verify that something is correct, we don't need to loop over the same block of code thousands or millions of times. We could probably make things a lot faster that way without losing coverage, although changing the regression binaries wouldn't necessarily be very straightforward.
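
As a rough illustration of what "stats are not a pass/fail criterion" could mean, here is a minimal sketch. It is not gem5's regression code: the function names, the 'PASSED' label, and the example stat name are all hypothetical. The 'FAILED' and 'CHANGED' labels are borrowed loosely from the regression output described in the quoted message below. The idea is that only a functional mismatch fails a run, while stats differences are collected and reported for a human to review.

```python
def classify_run(expected_output, actual_output, ref_stats, new_stats):
    """Classify one test run (hypothetical helper, not part of gem5).

    Returns (verdict, changed), where verdict is one of 'FAILED',
    'CHANGED' or 'PASSED', and changed maps each differing stat name
    to a (reference, new) pair.
    """
    # Functional check first: wrong program output is a real failure.
    if actual_output != expected_output:
        return "FAILED", {}

    # Stats check second: collect differences, but do not treat them
    # as failures, since many different stats can all be "correct".
    changed = {}
    for name in sorted(set(ref_stats) | set(new_stats)):
        old, new = ref_stats.get(name), new_stats.get(name)
        if old != new:
            changed[name] = (old, new)

    return ("CHANGED" if changed else "PASSED"), changed


def run_passes(verdict):
    """Only a functional failure fails the run (assumed policy)."""
    return verdict != "FAILED"


if __name__ == "__main__":
    # Stat name and values are made up for illustration.
    verdict, changed = classify_run(
        "Hello world!\n", "Hello world!\n",
        {"sim_ticks": 100}, {"sim_ticks": 120},
    )
    print(verdict, changed, run_passes(verdict))
```

With a policy like this, a short run of the test program would be enough to catch functional breakage, and the stats report becomes something a reviewer looks at rather than something that blocks a change.
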
While I think there are significant drawbacks to long-running tests, as detailed in my earlier email, there are also benefits to really quick tests. They could, for instance, be run automatically on every CL as part of a continuous integration system. It would also be practical to run all of them before sending a CL out. Right now I just take a best guess at which regressions are worth running, since running the long ones is a major time commitment, especially across all the ISAs.

Gabe

On Wed, Dec 3, 2014 at 9:58 PM, Steve Reinhardt via gem5-dev <[email protected]> wrote:

> Hi Gabe,
>
> There's a long history here; I think everyone agrees the status quo wrt
> testing is inadequate, but there are a lot of different needs as well. I
> won't go into a lot of detail, but there is a wiki page left over from our
> last attempt: http://gem5.org/NewRegressionFramework. Actually I see now
> that you contributed to an early version of that.
>
> I'm not opposed to us having more unit tests and a framework to run them.
> Having the ability to integrate unit tests into the regressions would be a
> good goal for a new regression system.
>
> Having better unit tests might provide a nice middle ground between, on the
> one hand, running a few tests targeting whatever you're doing (the bug
> you're fixing or feature you're adding), plus a few quick "hello world"
> tests (which gives you a feeling that your change is "probably good", for
> some definition of probably); and on the other hand, running the full
> regression suite. I'm not sure it would replace either one of those
> though. Thus, to be honest, I think the testing situation has bigger
> problems at this point; there's a lot on that wiki page, and unit testing
> isn't even mentioned.
>
> As far as your points 2 & 3: The regression tests do print out 'FAILED' vs.
> 'CHANGED' or something like that, so you can tell the difference between
> functional failures and stats changes pretty easily.
> You can look at the
> stats differences in the test output directory to see exactly what the
> changes are. The job of figuring out whether a particular set of stats
> changes is "reasonable" given some actual modeling change seems inherently
> impossible to automate, so I'm not sure what you're looking for there.
>
> Ali said he's been working on a new test framework; at this point I expect
> that's our best bet for moving forward. I'll let him decide whether he's
> ready to say more about it.
>
> Steve
>
> On Sun, Nov 23, 2014 at 6:51 AM, Gabe Black via gem5-dev <[email protected]> wrote:
>
> > Hi everybody. I'd like to start a conversation about testing strategies and
> > gem5. Please let me know if my understanding is out of date, but I think
> > the primary mechanism we use for testing is running benchmarks, booting,
> > etc., and making sure the stats haven't changed. There are a few things
> > that make that not so great.
> >
> > 1. Benchmarks can take a LONG time to run. I'd like to know whether my
> > change is probably good in a couple seconds, not a couple hours.
> > 2. There isn't much of an indication of *what* went wrong, just that
> > something somewhere changed.
> > 3. There isn't much of an indication *if* something went wrong. For a
> > certain class of changes, it's reasonable to expect the stats to stay the
> > same. For instance, a simulator performance optimization shouldn't change
> > the stats. If you add a new device, change how execution works, fix some
> > microcode, etc., you just have to guesstimate if the amount of change looks
> > reasonable and update the stats. Which, per 1, can take hours.
> > 4. Merge conflicts. If two people make changes that affect the stats, one
> > will go in first, and the other person will have to rebase on top of those
> > changes and rerun the stats. Which, per 1, can take hours.
> >
> > I know writing new tests isn't what most people want to be doing with their
> > time (including me), but as far as I can see this is a big shortcoming of
> > the simulator as it stands. I think we would get a lot of benefit from more
> > unit tests of both base functionality (we have a little of that), and of
> > device models, execution bits, etc. (we have none?). We could either expand
> > on the unit test code we have, or bring in an existing framework like this
> > one:
> >
> > https://code.google.com/p/googletest/
> >
> > I've never used that or know much of anything about it.
> >
> > It *should* be easy for us to use our modularity and object oriented design
> > to pull pieces of the simulator into test harnesses and make sure they do
> > reasonable things in isolation. If it isn't maybe that's something we
> > should fix too.
> >
> > We should also think about how to make it easy/automatic to run unit tests,
> > and how to get them to run automatically alongside the nightly regression
> > runs.
> >
> > Gabe
> > _______________________________________________
> > gem5-dev mailing list
> > [email protected]
> > http://m5sim.org/mailman/listinfo/gem5-dev
