> If any codes use readlink() in this way, they need the correct path, and > not gem5’s path. This isn’t non-determinism, the runs will always be the > same given the binaries are in the same path. I’m not sure how this can be > solved other than to say we won’t allow codes to use readlink() this way. >
It's not a viable solution to just disallow this kind of use, though. GLIBC uses readlink, which gets compiled into some static binaries, so a gem5 user may not have control over the use of readlink. A simple example where this fails is if I run simulations on a cluster with a different directory structure than machines I can use to debug. The difference in directory structure is enough so that the cluster runs cannot be replicated on other machines. > Using readlink() on /proc/self/exe would return the path to the gem5 > binary previously, as opposed to the binary running inside of gem5. We > noticed this because some of our codes used readlink() this way to figure > out their paths. > Sure. Can you explain how you are running simulations such that you don't need to worry about simulations being replicable in different settings? Are you always able to debug simulations in the exact same setting where you discovered a bug? Joel > > *From:* Joel Hestness [mailto:[email protected]] > *Sent:* Wednesday, September 09, 2015 10:32 AM > *To:* gem5 Developer List; Hashe, David; Che, Shuai; Gutierrez, Anthony > *Subject:* Re: [gem5-dev] changeset in gem5: syscall: Add readlink to x86 > with special cas... > > > > Hey David, Shuai, and Tony, > > > > It appears that this patch can introduce simulator non-determinism due > to the use of Process::progName(). Basically, if benchmark paths are > changed (e.g. running on a different system), then the string copy > operations in the "/proc/self/exe" codepath can create strings of different > lengths. As a result, readlink can return different values depending on the > progName() and cause the simulated benchmark to behave differently. This is > problematic when trying to replicate sims for regressions or debugging. > > > > It looks like this patch aims to address changes in GLIBC, which now > uses readlink to find the binary before handling environment variables (in > _dl_non_dynamic_init <http://osxr.org/glibc/source/elf/dl-support.c#0312>). > That said, I'm not clear how this patch is supposed to fix problems or how > we can avoid the nondeterminism it introduces. Can you guys elaborate on > how this change is intended to work (e.g. what is the use case you were > aiming for)? > > > > Thank you, > > Joel > > > > > > > > On Sat, Aug 1, 2015 at 11:07 AM, David Hashe <[email protected]> wrote: > > changeset 9abf6a7c14ab in /z/repo/gem5 > details: http://repo.gem5.org/gem5?cmd=changeset;node=9abf6a7c14ab > description: > syscall: Add readlink to x86 with special case /proc/self/exe > > This patch implements the correct behavior. > > diffstat: > > src/arch/x86/linux/process.cc | 6 +++--- > src/sim/syscall_emul.cc | 31 ++++++++++++++++++++++++++----- > 2 files changed, 29 insertions(+), 8 deletions(-) > > diffs (80 lines): > > diff -r 255ebb0b32b4 -r 9abf6a7c14ab src/arch/x86/linux/process.cc > --- a/src/arch/x86/linux/process.cc Mon Jul 20 09:15:18 2015 -0500 > +++ b/src/arch/x86/linux/process.cc Mon Jul 20 09:15:18 2015 -0500 > @@ -485,7 +485,7 @@ > /* 264 */ SyscallDesc("renameat", unimplementedFunc), > /* 265 */ SyscallDesc("linkat", unimplementedFunc), > /* 266 */ SyscallDesc("symlinkat", unimplementedFunc), > - /* 267 */ SyscallDesc("readlinkat", unimplementedFunc), > + /* 267 */ SyscallDesc("readlinkat", readlinkFunc), > /* 268 */ SyscallDesc("fchmodat", unimplementedFunc), > /* 269 */ SyscallDesc("faccessat", unimplementedFunc), > /* 270 */ SyscallDesc("pselect6", unimplementedFunc), > @@ -626,7 +626,7 @@ > /* 82 */ SyscallDesc("select", unimplementedFunc), > /* 83 */ SyscallDesc("symlink", unimplementedFunc), > /* 84 */ SyscallDesc("oldlstat", unimplementedFunc), > - /* 85 */ SyscallDesc("readlink", unimplementedFunc), > + /* 85 */ SyscallDesc("readlink", readlinkFunc), > /* 86 */ SyscallDesc("uselib", unimplementedFunc), > /* 87 */ SyscallDesc("swapon", unimplementedFunc), > /* 88 */ SyscallDesc("reboot", unimplementedFunc), > @@ -846,7 +846,7 @@ > /* 302 */ SyscallDesc("renameat", unimplementedFunc), > /* 303 */ SyscallDesc("linkat", unimplementedFunc), > /* 304 */ SyscallDesc("symlinkat", unimplementedFunc), > - /* 305 */ SyscallDesc("readlinkat", unimplementedFunc), > + /* 305 */ SyscallDesc("readlinkat", readlinkFunc), > /* 306 */ SyscallDesc("fchmodat", unimplementedFunc), > /* 307 */ SyscallDesc("faccessat", unimplementedFunc), > /* 308 */ SyscallDesc("pselect6", unimplementedFunc), > diff -r 255ebb0b32b4 -r 9abf6a7c14ab src/sim/syscall_emul.cc > --- a/src/sim/syscall_emul.cc Mon Jul 20 09:15:18 2015 -0500 > +++ b/src/sim/syscall_emul.cc Mon Jul 20 09:15:18 2015 -0500 > @@ -365,12 +365,10 @@ > } > strncpy((char *)buf.bufferPtr(), cwd.c_str(), size); > result = cwd.length(); > - } > - else { > + } else { > if (getcwd((char *)buf.bufferPtr(), size) != NULL) { > result = strlen((char *)buf.bufferPtr()); > - } > - else { > + } else { > result = -1; > } > } > @@ -405,7 +403,30 @@ > > BufferArg buf(bufPtr, bufsiz); > > - int result = readlink(path.c_str(), (char *)buf.bufferPtr(), bufsiz); > + int result = -1; > + if (path != "/proc/self/exe") { > + result = readlink(path.c_str(), (char *)buf.bufferPtr(), bufsiz); > + } else { > + // readlink() will return the path of the binary given > + // with the -c option, however it is possible that this > + // will still result in incorrect behavior if one binary > + // runs another, e.g., -c time -o "my_binary" where > + // my_binary calls readlink(). this is a very unlikely case, > + // so we issue a warning. > + warn_once("readlink may yield unexpected results if multiple " > + "binaries are used\n"); > + if (strlen(p->progName()) > bufsiz) { > + // readlink will truncate the contents of the > + // path to ensure it is no more than bufsiz > + strncpy((char*)buf.bufferPtr(), p->progName(), bufsiz); > + result = bufsiz; > + } else { > + // return the program's working path rather > + // than the one for the gem5 binary itself. > + strcpy((char*)buf.bufferPtr(), p->progName()); > + result = strlen((char*)buf.bufferPtr()); > + } > + } > > buf.copyOut(tc->getMemProxy()); > > _______________________________________________ > gem5-dev mailing list > [email protected] > http://m5sim.org/mailman/listinfo/gem5-dev > > > > > > -- > > Joel Hestness > PhD Candidate, Computer Architecture > Dept. of Computer Science, University of Wisconsin - Madison > http://pages.cs.wisc.edu/~hestness/ > -- Joel Hestness PhD Candidate, Computer Architecture Dept. of Computer Science, University of Wisconsin - Madison http://pages.cs.wisc.edu/~hestness/ _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
