Took me a lot longer than it should have in retrospect, but here's the
problem (from --trace-flags=SyscallVerbose):

594199893000: global: opening file /proc/meminfo
594199893000: system.cpu: syscall open returns 4
594200152000: system.cpu: syscall fstat called w/arguments 4,140737488339680,140737488339680,0
594200152000: system.cpu: syscall fstat returns 0
[...]
594200272000: system.cpu: syscall read called w/arguments 4,46912559464448,8192,34
594200272000: system.cpu: syscall read returns 630

I don't know *why* parser opens, fstats, and reads /proc/meminfo, but
that's clearly where the system dependence is coming from, since the
contents of that file differ from host to host.  As for fixing the
problem, the easiest thing would be to hack parser to not do that, or
just not use parser in the regressions.
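
To check whether other benchmarks have the same kind of host dependence, one option is to scan a SyscallVerbose trace for opens under /proc. Below is a rough Python sketch, assuming the trace lines look like the "opening file" line in the excerpt above. The function name, the regex, and the prefix list are made up for illustration and are not part of M5.

```python
import re

# Matches lines like "594199893000: global: opening file /proc/meminfo".
# The format is copied from the trace excerpt; other versions may print
# something different, so treat this pattern as an assumption.
OPEN_RE = re.compile(r"^(\d+): \S+: opening file (\S+)$")

def host_dependent_opens(trace_lines, prefixes=("/proc/",)):
    """Return (tick, path) for every opened file under one of `prefixes`."""
    hits = []
    for line in trace_lines:
        m = OPEN_RE.match(line.strip())
        if m and m.group(2).startswith(prefixes):
            hits.append((int(m.group(1)), m.group(2)))
    return hits

trace = [
    "594199893000: global: opening file /proc/meminfo",
    "594199893000: system.cpu: syscall open returns 4",
]
print(host_dependent_opens(trace))
```

On the two sample lines this reports the single /proc/meminfo open along with its tick.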

If we wanted to get really fancy we could recognize /proc/meminfo as
special and redirect it to some canned input.  It might be worth
checking in open() and warning anytime anything under /proc gets
opened.  Or maybe we should implement something like chroot inside of
SE mode, so you could get rid of all the path-based issues by forcing
everything to be relative to the working dir, and then use symlinks to
set up the structure you want... powerful, but overkill for our uses
IMO.
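
To make the three options concrete (canned redirect, warning on /proc, chroot-like re-rooting), here is a small Python sketch of the path decision only. It is not how M5's open() emulation is written; `resolve_guest_path`, the `CANNED` table, and the `canned/meminfo` location are all hypothetical names for illustration.

```python
import posixpath
import warnings

# Hypothetical table: guest paths that should read canned input instead.
CANNED = {"/proc/meminfo": "canned/meminfo"}

def resolve_guest_path(path, root=None, canned=CANNED):
    """Map a path the simulated program asked to open onto a host path."""
    norm = posixpath.normpath(path)
    if norm in canned:
        # Option 1: redirect known host-dependent files to canned input.
        return canned[norm]
    if norm == "/proc" or norm.startswith("/proc/"):
        # Option 2: warn about anything else under /proc.
        warnings.warn("simulated program opened host-dependent file " + norm)
    if root is not None and posixpath.isabs(norm):
        # Option 3: chroot-like, re-root absolute paths under `root`.
        return posixpath.join(root, norm.lstrip("/"))
    # Relative paths stay relative to the working directory.
    return norm

print(resolve_guest_path("/proc/meminfo"))
print(resolve_guest_path("/etc/hosts", root="/tmp/sandbox"))
```

A real chroot-style scheme would also have to deal with relative paths that climb out of the root via "..", and with symlinks, which this sketch ignores.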

Steve

On Mon, Nov 17, 2008 at 7:37 PM,  <[EMAIL PROTECTED]> wrote:
> Yes, I'm sure it's not a timing mode thing. The timing mode regressions didn't
> exist for x86 until very recently, and parser has been unstable for maybe as
> long as a year.
>
> Gabe
>
> Quoting Steve Reinhardt <[EMAIL PROTECTED]>:
>
>> Interestingly, I just ran on my desktop here and on zizzer and both
>> failed, but when I looked more closely, I see that my desktop is
>> failing because it's running 5 fewer instructions than the reference
>> output, while zizzer is failing because it's running 5 extra
>> instructions.  (And yes, I double-checked and they both have the same
>> reference instruction count.)  Both of these seem pretty consistent.
>>
>> I also checked the poolfs regression outputs and they get yet a third
>> value, and amazingly the simple-atomic runs fail there too.  All of
>> the instruction counts vary only in the last couple of digits, so I'll
>> just use those to summarize:
>>
>>                 ref   zizzer   poolfs   home
>> simple-atomic   702   702      786      692
>> simple-timing   697   702      786      692
>>
>> So it doesn't appear to be a timing-mode thing; that's just a side
>> effect of us having inconsistent reference outputs for the two runs.
>>
>> Steve
>>
>> On Mon, Nov 17, 2008 at 2:53 PM,  <[EMAIL PROTECTED]> wrote:
>> > Exactly. Or one machine will be in Ann Arbor and the other in California.
>> > Maybe it has something to do with the test checking the actual clock
>> > time/date on the host somehow? It could behave slightly differently
>> > depending on some little part of that, like converting it to seconds
>> > changing the path the microcode takes for the division instruction or
>> > something.
>> >
>> > Speaking of which, I think it would be really handy to distinguish between
>> > the number of actual instructions that commit vs. the number of microops.
>> > If I have to change microcode for some reason I'd expect the latter to
>> > change, but a change in the former probably means I broke something.
>> >
>> > Gabe
>> >
>> > Quoting nathan binkert <[EMAIL PROTECTED]>:
>> >
>> >> The biggest problem is that I've never been able to find two machines
>> >> that behave differently.  When things change, I can't find something
>> >> that did it the "old" way.
>> >>
>> >>   Nate
>> >>
>> >>
>> >> > If somebody can and wants to get a tracediff between two differently
>> >> > behaving versions of parser, that would go a long way to figuring out
>> >> > what the problem is.
>> >> >
>> >> > Gabe
>> >> >
>> >> > Quoting nathan binkert <[EMAIL PROTECTED]>:
>> >> >
>> >> >> I more meant that it seems like an infrequently used syscall that uses
>> >> >> an uninitialized variable that affects the return value could easily
>> >> >> be the result.  The stats differences in both simulations are minimal
>> >> >> and similar.
>> >> >>
>> >> >>   Nate
>> >> >>
>> >> >> On Mon, Nov 17, 2008 at 12:07 PM, Steve Reinhardt <[EMAIL PROTECTED]> wrote:
>> >> >> > I sort of doubt it... parser has always been a bit nondeterministic,
>> >> >> > where this is just a subtle and unforeseen but deterministic side
>> >> >> > effect of a bug fix.
>> >> >> >
>> >> >> > Steve
>> >> >> >
>> >> >> > On Mon, Nov 17, 2008 at 11:57 AM, nathan binkert <[EMAIL PROTECTED]> wrote:
>> >> >> >> Ah, so that was you.  That makes sense.  I seriously wonder if this or
>> >> >> >> something like it is the problem with 20.parser.
>> >> >> >>
>> >> >> >>  Nate
>> >> >> >>
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
