Hi all,

Merely a reflection on what Tony wrote in his mail: For SE regressions
runs the binaries are _not_ always in the same path. In fact we probably
have 3-4 different permutations in ARM alone. I do not know if the path
difference will impact the behaviour, but it is definitely a concern.

A completely random thought, and I am out of my depth here, but is there
any chance we could use something like Docker images as the SE filesystem?
We could use a base image just like we do in FS, and then have overlays
for any experiment-specific files. Again, I do not know what is
possible/desirable here...merely a thought.

Andreas

On 09/09/2015 19:47, "gem5-dev on behalf of Steve Reinhardt"
<[email protected] on behalf of [email protected]> wrote:

>Hi Joel,
>
>We haven't run into this specific issue internally; I think our bugs must
>be robust enough that they're reproducible even in the face of minor
>changes from system to system.  (And of course once you start debugging on
>a particular system, the behavior is deterministic.)
>
>This is really a symptom of a larger problem with SE mode, which is that
>too much of the underlying host filesystem leaks into the simulated
>environment.  We've run into this in a bigger way with dynamic linking
>(and
>yes, Mike Upton, those were our patches that were described at the
>workshop
>which we haven't posted yet), where you can end up linking in different
>versions of shared libraries when you run on different platforms and ld.so
>just grabs whatever happens to be in the local /lib directory.  Another
>thing David did this past summer is set up a path redirection scheme that
>lets you build a fake filesystem for your SE-mode binary.  This lets you
>do
>things like set up a directory tree with a particular set of shared libs
>in
>it, make that look like "/" to the simulated binary, then reuse that tree
>on multiple systems.  It also allows you to do some handy things like
>auto-generate a file that shows up as /proc/cpuinfo to the simulated
>system
>with information determined by your gem5 config.  This doesn't directly
>solve the readlink() issue, but I think you could use this feature to put
>your executable at a reproducible host-independent path.  Of course, this
>is yet another set of patches we have yet to post, as far as I can tell.
>Unfortunately we have some big deadlines this month, but I'm hoping we can
>get back to clearing up some of our patch backlog next month.  In the
>meantime, as Tony says, I'm not sure how else to resolve the tension
>between reproducibility and correctness here.
>
>Steve
>
>
>On Wed, Sep 9, 2015 at 11:14 AM Joel Hestness <[email protected]>
>wrote:
>
>> > If any codes use readlink() in this way, they need the correct path,
>>and
>> > not gem5’s path. This isn’t non-determinism, the runs will always be
>>the
>> > same given the binaries are in the same path. I’m not sure how this
>>can
>> be
>> > solved other than to say we won’t allow codes to use readlink() this
>>way.
>> >
>>
>> It's not a viable solution to just disallow this kind of use, though.
>>GLIBC
>> uses readlink, which gets compiled into some static binaries, so a gem5
>> user may not have control over the use of readlink. A simple example
>>where
>> this fails is if I run simulations on a cluster with a different
>>directory
>> structure than machines I can use to debug. The difference in directory
>> structure is enough so that the cluster runs cannot be replicated on
>>other
>> machines.
>>
>>
>>
>> > Using readlink() on /proc/self/exe would return the path to the gem5
>> > binary previously, as opposed to the binary running inside of gem5. We
>> > noticed this because some of our codes used readlink() this way to
>>figure
>> > out their paths.
>> >
>>
>> Sure. Can you explain how you are running simulations such that you
>>don't
>> need to worry about simulations being replicable in different settings?
>>Are
>> you always able to debug simulations in the exact same setting where you
>> discovered a bug?
>>
>>   Joel
>>
>>
>>
>> >
>> > *From:* Joel Hestness [mailto:[email protected]]
>> > *Sent:* Wednesday, September 09, 2015 10:32 AM
>> > *To:* gem5 Developer List; Hashe, David; Che, Shuai; Gutierrez,
>>Anthony
>> > *Subject:* Re: [gem5-dev] changeset in gem5: syscall: Add readlink to
>>x86
>> > with special cas...
>> >
>> >
>> >
>> > Hey David, Shuai, and Tony,
>> >
>> >
>> >
>> >   It appears that this patch can introduce simulator non-determinism
>>due
>> > to the use of Process::progName(). Basically, if benchmark paths are
>> > changed (e.g. running on a different system), then the string copy
>> > operations in the "/proc/self/exe" codepath can create strings of
>> different
>> > lengths. As a result, readlink can return different values depending
>>on
>> the
>> > progName() and cause the simulated benchmark to behave differently.
>>This
>> is
>> > problematic when trying to replicate sims for regressions or
>>debugging.
>> >
>> >
>> >
>> >   It looks like this patch aims to address changes in GLIBC, which now
>> > uses readlink to find the binary before handling environment variables
>> (in
>> > _dl_non_dynamic_init
>><http://osxr.org/glibc/source/elf/dl-support.c#0312
>> >).
>> > That said, I'm not clear how this patch is supposed to fix problems or
>> how
>> > we can avoid the nondeterminism it introduces. Can you guys elaborate
>>on
>> > how this change is intended to work (e.g. what is the use case you
>>were
>> > aiming for)?
>> >
>> >
>> >
>> >   Thank you,
>> >
>> >   Joel
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Sat, Aug 1, 2015 at 11:07 AM, David Hashe <[email protected]>
>> wrote:
>> >
>> > changeset 9abf6a7c14ab in /z/repo/gem5
>> > details: http://repo.gem5.org/gem5?cmd=changeset;node=9abf6a7c14ab
>> > description:
>> >         syscall: Add readlink to x86 with special case /proc/self/exe
>> >
>> >         This patch implements the correct behavior.
>> >
>> > diffstat:
>> >
>> >  src/arch/x86/linux/process.cc |   6 +++---
>> >  src/sim/syscall_emul.cc       |  31 ++++++++++++++++++++++++++-----
>> >  2 files changed, 29 insertions(+), 8 deletions(-)
>> >
>> > diffs (80 lines):
>> >
>> > diff -r 255ebb0b32b4 -r 9abf6a7c14ab src/arch/x86/linux/process.cc
>> > --- a/src/arch/x86/linux/process.cc     Mon Jul 20 09:15:18 2015 -0500
>> > +++ b/src/arch/x86/linux/process.cc     Mon Jul 20 09:15:18 2015 -0500
>> > @@ -485,7 +485,7 @@
>> >      /* 264 */ SyscallDesc("renameat", unimplementedFunc),
>> >      /* 265 */ SyscallDesc("linkat", unimplementedFunc),
>> >      /* 266 */ SyscallDesc("symlinkat", unimplementedFunc),
>> > -    /* 267 */ SyscallDesc("readlinkat", unimplementedFunc),
>> > +    /* 267 */ SyscallDesc("readlinkat", readlinkFunc),
>> >      /* 268 */ SyscallDesc("fchmodat", unimplementedFunc),
>> >      /* 269 */ SyscallDesc("faccessat", unimplementedFunc),
>> >      /* 270 */ SyscallDesc("pselect6", unimplementedFunc),
>> > @@ -626,7 +626,7 @@
>> >      /*  82 */ SyscallDesc("select", unimplementedFunc),
>> >      /*  83 */ SyscallDesc("symlink", unimplementedFunc),
>> >      /*  84 */ SyscallDesc("oldlstat", unimplementedFunc),
>> > -    /*  85 */ SyscallDesc("readlink", unimplementedFunc),
>> > +    /*  85 */ SyscallDesc("readlink", readlinkFunc),
>> >      /*  86 */ SyscallDesc("uselib", unimplementedFunc),
>> >      /*  87 */ SyscallDesc("swapon", unimplementedFunc),
>> >      /*  88 */ SyscallDesc("reboot", unimplementedFunc),
>> > @@ -846,7 +846,7 @@
>> >      /* 302 */ SyscallDesc("renameat", unimplementedFunc),
>> >      /* 303 */ SyscallDesc("linkat", unimplementedFunc),
>> >      /* 304 */ SyscallDesc("symlinkat", unimplementedFunc),
>> > -    /* 305 */ SyscallDesc("readlinkat", unimplementedFunc),
>> > +    /* 305 */ SyscallDesc("readlinkat", readlinkFunc),
>> >      /* 306 */ SyscallDesc("fchmodat", unimplementedFunc),
>> >      /* 307 */ SyscallDesc("faccessat", unimplementedFunc),
>> >      /* 308 */ SyscallDesc("pselect6", unimplementedFunc),
>> > diff -r 255ebb0b32b4 -r 9abf6a7c14ab src/sim/syscall_emul.cc
>> > --- a/src/sim/syscall_emul.cc   Mon Jul 20 09:15:18 2015 -0500
>> > +++ b/src/sim/syscall_emul.cc   Mon Jul 20 09:15:18 2015 -0500
>> > @@ -365,12 +365,10 @@
>> >          }
>> >          strncpy((char *)buf.bufferPtr(), cwd.c_str(), size);
>> >          result = cwd.length();
>> > -    }
>> > -    else {
>> > +    } else {
>> >          if (getcwd((char *)buf.bufferPtr(), size) != NULL) {
>> >              result = strlen((char *)buf.bufferPtr());
>> > -        }
>> > -        else {
>> > +        } else {
>> >              result = -1;
>> >          }
>> >      }
>> > @@ -405,7 +403,30 @@
>> >
>> >      BufferArg buf(bufPtr, bufsiz);
>> >
>> > -    int result = readlink(path.c_str(), (char *)buf.bufferPtr(),
>> bufsiz);
>> > +    int result = -1;
>> > +    if (path != "/proc/self/exe") {
>> > +        result = readlink(path.c_str(), (char *)buf.bufferPtr(),
>> bufsiz);
>> > +    } else {
>> > +        // readlink() will return the path of the binary given
>> > +        // with the -c option, however it is possible that this
>> > +        // will still result in incorrect behavior if one binary
>> > +        // runs another, e.g., -c time -o "my_binary" where
>> > +        // my_binary calls readlink(). this is a very unlikely case,
>> > +        // so we issue a warning.
>> > +        warn_once("readlink may yield unexpected results if multiple
>>"
>> > +                  "binaries are used\n");
>> > +        if (strlen(p->progName()) > bufsiz) {
>> > +            // readlink will truncate the contents of the
>> > +            // path to ensure it is no more than bufsiz
>> > +            strncpy((char*)buf.bufferPtr(), p->progName(), bufsiz);
>> > +            result = bufsiz;
>> > +        } else {
>> > +            // return the program's working path rather
>> > +            // than the one for the gem5 binary itself.
>> > +            strcpy((char*)buf.bufferPtr(), p->progName());
>> > +            result = strlen((char*)buf.bufferPtr());
>> > +        }
>> > +    }
>> >
>> >      buf.copyOut(tc->getMemProxy());
>> >
>> > _______________________________________________
>> > gem5-dev mailing list
>> > [email protected]
>> > http://m5sim.org/mailman/listinfo/gem5-dev
>> >
>> >
>> >
>> >
>> >
>> > --
>> >
>> >   Joel Hestness
>> >   PhD Candidate, Computer Architecture
>> >   Dept. of Computer Science, University of Wisconsin - Madison
>> >   http://pages.cs.wisc.edu/~hestness/
>> >
>>
>>
>>
>> --
>>   Joel Hestness
>>   PhD Candidate, Computer Architecture
>>   Dept. of Computer Science, University of Wisconsin - Madison
>>   http://pages.cs.wisc.edu/~hestness/
>> _______________________________________________
>> gem5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/gem5-dev
>>
>_______________________________________________
>gem5-dev mailing list
>[email protected]
>http://m5sim.org/mailman/listinfo/gem5-dev


-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered 
in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, 
Registered in England & Wales, Company No:  2548782
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to