That makes a lot of sense. I had thought before about an SE chroot, especially when I was thinking about getting dynamic linking to work. I think it is sort of overkill, but it probably wouldn't be -that- bad to implement. It would also help protect against a crazy defunct simulation destroying everything on your machine with your privileges. Trapping /proc/ also sounds like a good idea since there's probably nothing in there the test should really be messing with.
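The /proc trapping and chroot-style remapping discussed above could be sketched as a path filter that an SE-mode open() handler runs before touching the host filesystem. To be clear, this is a minimal illustration of the idea, not the actual M5 code: the function name, the sandbox layout, and the canned file path are all assumptions.

```cpp
#include <cstdio>
#include <string>

// Hypothetical sketch: filter every path the workload passes to open().
// It warns on any /proc access, redirects /proc/meminfo to a canned,
// checked-in file so results don't depend on the host, and re-roots all
// other absolute paths under a sandbox directory (the chroot-like scheme).
// None of these names come from the real M5 sources.
std::string filterOpenPath(const std::string &sandboxRoot,
                           const std::string &guestPath)
{
    if (guestPath.rfind("/proc/", 0) == 0) {
        std::fprintf(stderr, "warn: workload opened %s\n", guestPath.c_str());
        if (guestPath == "/proc/meminfo")
            return sandboxRoot + "/canned/meminfo"; // fixed contents
    }
    // chroot-style: force absolute paths to resolve inside the sandbox
    if (!guestPath.empty() && guestPath[0] == '/')
        return sandboxRoot + guestPath;
    // relative paths already resolve against the working dir
    return guestPath;
}
```

Populating the sandbox directory with symlinks, as suggested below, would then give full control over what the workload actually sees.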
I do think we should keep parser as a regression, though.

Gabe

Quoting Steve Reinhardt <[EMAIL PROTECTED]>:

> Took me a lot longer than it should have in retrospect, but here's the
> problem (from --trace-flags=SyscallVerbose):
>
> 594199893000: global: opening file /proc/meminfo
> 594199893000: system.cpu: syscall open returns 4
> 594200152000: system.cpu: syscall fstat called w/arguments
>     4,140737488339680,140737488339680,0
> 594200152000: system.cpu: syscall fstat returns 0
> [...]
> 594200272000: system.cpu: syscall read called w/arguments
>     4,46912559464448,8192,34
> 594200272000: system.cpu: syscall read returns 630
>
> I don't know *why* parser opens, fstats, and reads /proc/meminfo, but
> that's clearly where the system dependence is coming from. As far as
> fixing the problem, the easiest thing would be to hack parser to not
> do that, or just not use parser in the regressions.
>
> If we wanted to get really fancy we could recognize /proc/meminfo as
> special and redirect it to some canned input. It might be worth
> checking in open() and warning anytime anything under /proc gets
> opened. Or maybe we should implement something like chroot inside of
> SE mode, so you could get rid of all the path-based issues by forcing
> everything to be relative to the working dir, and then use symlinks to
> set up the structure you want... powerful, but overkill for our uses
> IMO.
>
> Steve
>
> On Mon, Nov 17, 2008 at 7:37 PM, <[EMAIL PROTECTED]> wrote:
> > Yes, I'm sure it's not a timing mode thing. The timing mode
> > regressions didn't exist for x86 until very recently, and parser has
> > been unstable for maybe as long as a year.
> >
> > Gabe
> >
> > Quoting Steve Reinhardt <[EMAIL PROTECTED]>:
> >
> >> Interestingly, I just ran on my desktop here and on zizzer and both
> >> failed, but when I looked more closely, I see that my desktop is
> >> failing because it's running 5 fewer instructions than the reference
> >> output, while zizzer is failing because it's running 5 extra
> >> instructions. (And yes, I double-checked and they both have the same
> >> reference instruction count.) Both of these seem pretty consistent.
> >>
> >> I also checked the poolfs regression outputs and they get yet a third
> >> value, and amazingly the simple-atomic runs fail there too. All of
> >> the instruction counts vary only in the last couple of digits, so I'll
> >> just use those to summarize:
> >>
> >>                 ref  zizzer  poolfs  home
> >> simple-atomic   702  702     786     692
> >> simple-timing   697  702     786     692
> >>
> >> So it doesn't appear to be a timing-mode thing; that's just a side
> >> effect of us having inconsistent reference outputs for the two runs.
> >>
> >> Steve
> >>
> >> On Mon, Nov 17, 2008 at 2:53 PM, <[EMAIL PROTECTED]> wrote:
> >> > Exactly. Or one machine will be in Ann Arbor and the other in
> >> > California. Maybe it has something to do with the test checking the
> >> > actual clock time/date on the host somehow? It could behave slightly
> >> > differently depending on some little part of that, like converting it
> >> > to seconds changing the path the microcode takes for the division
> >> > instruction or something.
> >> >
> >> > Speaking of which, I think it would be really handy to distinguish
> >> > between the number of actual instructions that commit vs. the number
> >> > of microops. If I have to change microcode for some reason I'd expect
> >> > the latter to change, but the former probably means I broke something.
> >> >
> >> > Gabe
> >> >
> >> > Quoting nathan binkert <[EMAIL PROTECTED]>:
> >> >
> >> >> The biggest problem is that I've never been able to find two machines
> >> >> that behave differently. When things change, I can't find something
> >> >> that did it the "old" way.
> >> >>
> >> >> Nate
> >> >>
> >> >> > If somebody can and wants to get a tracediff between two differently
> >> >> > behaving versions of parser, that would go a long way to figuring
> >> >> > out what the problem is.
> >> >> >
> >> >> > Gabe
> >> >> >
> >> >> > Quoting nathan binkert <[EMAIL PROTECTED]>:
> >> >> >
> >> >> >> I more meant that it seems like an infrequently used syscall that
> >> >> >> uses an uninitialized variable that affects the return value could
> >> >> >> easily be the result. The stats differences in both simulations
> >> >> >> are minimal and similar.
> >> >> >>
> >> >> >> Nate
> >> >> >>
> >> >> >> On Mon, Nov 17, 2008 at 12:07 PM, Steve Reinhardt <[EMAIL PROTECTED]> wrote:
> >> >> >> > I sort of doubt it... parser has always been a bit
> >> >> >> > nondeterministic, where this is just a subtle and unforeseen but
> >> >> >> > deterministic side effect of a bug fix.
> >> >> >> >
> >> >> >> > Steve
> >> >> >> >
> >> >> >> > On Mon, Nov 17, 2008 at 11:57 AM, nathan binkert <[EMAIL PROTECTED]> wrote:
> >> >> >> >> Ah, so that was you. That makes sense. I seriously wonder if
> >> >> >> >> this or something like it is the problem with 20.parser.
> >> >> >> >>
> >> >> >> >> Nate
>
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
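Gabe's wish quoted above, counting committed architectural instructions separately from committed microops, could look something like the sketch below: every microop bumps one counter, but the instruction count only advances when the last microop of a macroop (or a non-microcoded instruction) commits. The struct and field names are illustrative assumptions, not the real M5 stats.

```cpp
// Hypothetical sketch of split commit counters. On a microcoded ISA like
// x86, one architectural instruction expands into one or more microops;
// commit() is called once per committed microop, with isLastMicroop true
// on the final microop of each macroop.
struct CommitCounts {
    unsigned long numInsts = 0;    // architectural (macro) instructions
    unsigned long numMicroops = 0; // all committed microops

    void commit(bool isLastMicroop) {
        ++numMicroops;
        if (isLastMicroop)
            ++numInsts;
    }
};
```

For example, a divide that expands to three microops followed by a plain add would commit four microops but only two instructions, so a microcode change moves the first count while leaving the second alone.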
