Re: [m5-dev] parser error (was Re: changeset in m5: Update stats for brk fix (cset f28f020f3006).)

Steve Reinhardt Tue, 18 Nov 2008 15:45:29 -0800

No rush... we've lived with this for quite a while, at least we know why now.


On Tue, Nov 18, 2008 at 3:10 PM, nathan binkert <[EMAIL PROTECTED]> wrote:
> I added support for this kind of file mapping stuff for the m5 command
> so I could load multiple files into the simulator in full system mode.
>  The python portion of this work could easily be ported to SE mode.
> Unless you want diffs now, I'll work on getting this stuff in the tree
> after ISCA.
>
> It basically allowed me to clean up all of the boot scripts and such.
>
>  Nate
>
> On Tue, Nov 18, 2008 at 2:25 PM, Steve Reinhardt <[EMAIL PROTECTED]> wrote:
>> Took me a lot longer than it should have in retrospect, but here's the
>> problem (from --trace-flags=SyscallVerbose):
>>
>> 594199893000: global: opening file /proc/meminfo
>> 594199893000: system.cpu: syscall open returns 4
>> 594200152000: system.cpu: syscall fstat called w/arguments 4,140737488339680,140737488339680,0
>> 594200152000: system.cpu: syscall fstat returns 0
>> [...]
>> 594200272000: system.cpu: syscall read called w/arguments 4,46912559464448,8192,34
>> 594200272000: system.cpu: syscall read returns 630
>>
>> I don't know *why* parser opens, fstats, and reads /proc/meminfo, but
>> that's clearly where the system dependence is coming from.  As far as
>> fixing the problem, the easiest thing would be to hack parser to not
>> do that, or just not use parser in the regressions.
>>
>> If we wanted to get really fancy we could recognize /proc/meminfo as
>> special and redirect it to some canned input.  It might be worth
>> checking in open() and warning anytime anything under /proc gets
>> opened.  Or maybe we should implement something like chroot inside of
>> SE mode, so you could get rid of all the path-based issues by forcing
>> everything to be relative to the working dir, and then use symlinks to
>> set up the structure you want... powerful, but overkill for our uses
>> IMO.
>>
>> Steve
>>
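The /proc handling suggested above (redirect known files to canned input, warn on anything else under /proc) could look roughly like the sketch below. This is only an illustration and not M5 code: the names CANNED_FILES and resolve_guest_path, and the host path canned/meminfo, are all hypothetical, and a real version would live in the simulator's open() emulation.

```python
import posixpath
import warnings

# Hypothetical mapping from guest-visible paths to canned host files.
# The host path "canned/meminfo" is an assumption for illustration only.
CANNED_FILES = {"/proc/meminfo": "canned/meminfo"}


def resolve_guest_path(guest_path, canned=CANNED_FILES):
    """Return the host path that should be opened for guest_path.

    Paths listed in `canned` are redirected to fixed input so results
    do not depend on the host. Any other path under /proc is passed
    through unchanged, but a warning is issued because its contents
    are host-dependent.
    """
    norm = posixpath.normpath(guest_path)
    if norm in canned:
        return canned[norm]
    if norm == "/proc" or norm.startswith("/proc/"):
        warnings.warn("guest opened %s; contents depend on the host" % norm)
    return norm
```

With this table, resolve_guest_path("/proc/meminfo") yields the canned file, while a path such as /proc/cpuinfo is returned unchanged with a warning.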
>> On Mon, Nov 17, 2008 at 7:37 PM,  <[EMAIL PROTECTED]> wrote:
>>> Yes, I'm sure it's not a timing mode thing. The timing mode regressions didn't
>>> exist for x86 until very recently, and parser has been unstable for maybe as
>>> long as a year.
>>>
>>> Gabe
>>>
>>> Quoting Steve Reinhardt <[EMAIL PROTECTED]>:
>>>
>>>> Interestingly, I just ran on my desktop here and on zizzer and both
>>>> failed, but when I looked more closely, I see that my desktop is
>>>> failing because it's running 5 fewer instructions than the reference
>>>> output, while zizzer is failing because it's running 5 extra
>>>> instructions.  (And yes, I double-checked and they both have the same
>>>> reference instruction count.)  Both of these seem pretty consistent.
>>>>
>>>> I also checked the poolfs regression outputs and they get yet a third
>>>> value, and amazingly the simple-atomic runs fail there too.  All of
>>>> the instruction counts vary only in the last couple of digits, so I'll
>>>> just use those to summarize:
>>>>
>>>>                  ref   zizzer   poolfs   home
>>>> simple-atomic    702   702      786      692
>>>> simple-timing    697   702      786      692
>>>>
>>>> So it doesn't appear to be a timing-mode thing; that's just a side
>>>> effect of us having inconsistent reference outputs for the two runs.
>>>>
>>>> Steve
>>>>
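To make the comparison in the table above easier to read, here is a small sketch that computes each host's offset from the reference count. The numbers are the last three digits from the table; the names REF, RUNS and deltas are made up for this illustration, and the subtraction is only meaningful on the assumption (stated in the message) that the higher digits are identical everywhere.

```python
# Last three digits of the instruction counts, copied from the table.
REF = {"simple-atomic": 702, "simple-timing": 697}
RUNS = {
    "zizzer": {"simple-atomic": 702, "simple-timing": 702},
    "poolfs": {"simple-atomic": 786, "simple-timing": 786},
    "home": {"simple-atomic": 692, "simple-timing": 692},
}


def deltas(ref, runs):
    """Return, per host and config, the observed count minus the reference."""
    return {
        host: {cfg: count - ref[cfg] for cfg, count in counts.items()}
        for host, counts in runs.items()
    }
```

Against the simple-timing reference this gives +5 for zizzer and -5 for home, matching the "5 extra" and "5 fewer" described above. Each host reports the same count for both configurations, so the two reference values are what differ.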
>>>> On Mon, Nov 17, 2008 at 2:53 PM,  <[EMAIL PROTECTED]> wrote:
>>>> > Exactly. Or one machine will be in Ann Arbor and the other in California. Maybe
>>>> > it has something to do with the test checking the actual clock time/date on the
>>>> > host somehow? It could behave slightly differently depending on some little part
>>>> > of that, like converting it to seconds changing the path the microcode takes for
>>>> > the division instruction or something.
>>>> >
>>>> > Speaking of which, I think it would be really handy to distinguish between the
>>>> > number of actual instructions that commit vs. the number of microops. If I have
>>>> > to change microcode for some reason I'd expect the latter to change, but a
>>>> > change in the former probably means I broke something.
>>>> >
>>>> > Gabe
>>>> >
>>>> > Quoting nathan binkert <[EMAIL PROTECTED]>:
>>>> >
>>>> >> The biggest problem is that I've never been able to find two machines
>>>> >> that behave differently.  When things change, I can't find something
>>>> >> that did it the "old" way.
>>>> >>
>>>> >>   Nate
>>>> >>
>>>> >>
>>>> >> > If somebody can and wants to get a tracediff between two differently behaving
>>>> >> > versions of parser, that would go a long way to figuring out what the problem
>>>> >> > is.
>>>> >> >
>>>> >> > Gabe
>>>> >> >
>>>> >> > Quoting nathan binkert <[EMAIL PROTECTED]>:
>>>> >> >
>>>> >> >> I more meant that it seems like an infrequently used syscall that uses
>>>> >> >> an uninitialized variable that affects the return value could easily
>>>> >> >> be the result.  The stats differences in both simulations are minimal
>>>> >> >> and similar.
>>>> >> >>
>>>> >> >>   Nate
>>>> >> >>
>>>> >> >> On Mon, Nov 17, 2008 at 12:07 PM, Steve Reinhardt <[EMAIL PROTECTED]> wrote:
>>>> >> >> > I sort of doubt it... parser has always been a bit nondeterministic,
>>>> >> >> > where this is just a subtle and unforeseen but deterministic side
>>>> >> >> > effect of a bug fix.
>>>> >> >> >
>>>> >> >> > Steve
>>>> >> >> >
>>>> >> >> > On Mon, Nov 17, 2008 at 11:57 AM, nathan binkert <[EMAIL PROTECTED]> wrote:
>>>> >> >> >> Ah, so that was you.  That makes sense.  I seriously wonder if this or
>>>> >> >> >> something like it is the problem with 20.parser.
>>>> >> >> >>
>>>> >> >> >>  Nate
>>>> >> >> >>
>> _______________________________________________
>> m5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/m5-dev
>>
>>