There are a number of things about this patch I like (adding aux vectors to MIPS, cleaning up some Alpha-isms), but I really don't like how this gets plumbed explicitly through the CPU and thread contexts. Why can't this be shoved into a register like it likely is on a real CPU? How is its value used to provide TLS?
Gabe

Quoting Matt <[email protected]>:

> Below is a patch to get TLS working for MIPS.
>
> Also, there is another m5 bug that will prevent my very simple fp add
> program from working. The C library calls mmap to map an anonymous
> piece of memory. However, the mmap syscall implementation in m5 has
> this line (in mmapFunc() in src/sim/syscall_emul.hh):
>     int fd = p->sim_fd(p->getSyscallArg(tc, index));
> The problem is that when mmap is called to map an anonymous region
> instead of a file the file descriptor argument is not legitimate. But
> this line will try to look up the file descriptor mapping with
> sim_fd(). This will fail and result in m5 segfaulting. Simply
> changing the line to this should work for now:
>     int fd = p->getSyscallArg(tc, index);
>
> Also, once TLS is working and mmap is fixed you'll get an assertion
> failure in the emulation of the MIPS instruction Add_d when it calls
> fpNanOperands(). This function used to check both 32 and 64-bit
> floating-point values for NaN, however this was changed to only
> support 32-bit floating-point values (and an assert was added to make
> sure of this), even though this is called from double-precision
> floating-point arithmetic operations (which obviously results in the
> assertion failure).
>
> Once this is fixed, you'll get to the main error that I was trying to
> point out originally: that m5 will compute incorrect results for
> simple double-precision floating-point arithmetic (in this case it
> will compute that 5.0 + 0.1 = 0.1).
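The mmap problem described above can be illustrated with a small sketch. This is not the m5 implementation and not the one-line change proposed in the message; it is an alternative that skips the descriptor lookup only when the mapping is anonymous. The names resolveMmapFd and kTargetMapAnonymous are hypothetical, the lookup callback merely stands in for sim_fd(), and the flag value 0x800 is an assumption about the target ABI.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the target's MAP_ANONYMOUS bit. The real value
// depends on the target ABI; 0x800 is an assumption, not taken from m5.
constexpr int kTargetMapAnonymous = 0x800;

// Returns the simulator-side descriptor to use for an mmap call.
// For an anonymous mapping the guest-supplied fd is meaningless, so the
// lookup is skipped entirely and -1 is returned instead.
// `sim_fd` stands in for the process's target-fd to host-fd translation.
int resolveMmapFd(int flags, int target_fd,
                  const std::function<int(int)> &sim_fd)
{
    if (flags & kTargetMapAnonymous)
        return -1;              // no file behind this mapping
    return sim_fd(target_fd);   // file-backed: translate as before
}
```

A caller would pass the flags and fd arguments read from the syscall, plus whatever function performs the descriptor translation. File-backed mappings still go through the translation, so a bad descriptor there is still detected.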
>
> So anyway, now here's the patch to get TLS working in MIPS:
>
> diff --git a/src/arch/mips/isa/decoder.isa b/src/arch/mips/isa/decoder.isa
> index c531347..1de96ab 100644
> --- a/src/arch/mips/isa/decoder.isa
> +++ b/src/arch/mips/isa/decoder.isa
> @@ -2476,10 +2476,8 @@ decode OPCODE_HI default Unknown::unknown() {
>                  }
>              }
>          }
> -        0x3: decode OP_HI {
> -            0x2: decode OP_LO {
> -                0x3: FailUnimpl::rdhwr();
> -            }
> +        format IntOp {
> +            0x3: rdhwr({{ Rt.uw = xc->getThreadArea(); }});
>          }
>      }
>  }
> diff --git a/src/arch/mips/linux/process.cc b/src/arch/mips/linux/process.cc
> index c2a05b7..75ce514 100644
> --- a/src/arch/mips/linux/process.cc
> +++ b/src/arch/mips/linux/process.cc
> @@ -64,6 +64,16 @@ unameFunc(SyscallDesc *desc, int callnum, LiveProcess *process,
>      return 0;
>  }
>
> +/// Target set_thread_area() handler.
> +static SyscallReturn
> +set_thread_areaFunc(SyscallDesc *desc, int callnum, LiveProcess *process,
> +                    ThreadContext *tc)
> +{
> +    int index = 0;
> +    tc->setThreadArea(process->getSyscallArg(tc, index));
> +    return 0;
> +}
> +
>  /// Target sys_getsysyinfo() handler. Even though this call is
>  /// borrowed from Tru64, the subcases that get used appear to be
>  /// different in practice from those used by Tru64 processes.
> @@ -409,7 +419,8 @@ SyscallDesc MipsLinuxProcess::syscallDescs[] = {
>      /* 279 */ SyscallDesc("unknown #279", unimplementedFunc),
>      /* 280 */ SyscallDesc("add_key", unimplementedFunc),
>      /* 281 */ SyscallDesc("request_key", unimplementedFunc),
> -    /* 282 */ SyscallDesc("keyctl", unimplementedFunc)
> +    /* 282 */ SyscallDesc("keyctl", unimplementedFunc),
> +    /* 283 */ SyscallDesc("set_thread_area", set_thread_areaFunc)
>  };
>
>  MipsLinuxProcess::MipsLinuxProcess(LiveProcessParams * params,
> diff --git a/src/arch/mips/process.cc b/src/arch/mips/process.cc
> index d96b0c8..11f8da5 100644
> --- a/src/arch/mips/process.cc
> +++ b/src/arch/mips/process.cc
> @@ -34,6 +34,7 @@
>  #include "arch/mips/process.hh"
>
>  #include "base/loader/object_file.hh"
> +#include "base/loader/elf_object.hh"
>  #include "base/misc.hh"
>  #include "cpu/thread_context.hh"
>
> @@ -61,8 +62,8 @@ MipsLiveProcess::MipsLiveProcess(LiveProcessParams * params,
>      brk_point = objFile->dataBase() + objFile->dataSize() + objFile->bssSize();
>      brk_point = roundUp(brk_point, VMPageSize);
>
> -    // Set up region for mmaps. For now, start at bottom of kuseg space.
> -    mmap_start = mmap_end = 0x10000;
> +    // Set up region for mmaps. Start it 1GB above the top of the heap.
> +    mmap_start = mmap_end = brk_point + 0x40000000L;
>  }
>
>  void
> @@ -76,12 +77,44 @@ MipsLiveProcess::startup()
>  void
>  MipsLiveProcess::argsInit(int intSize, int pageSize)
>  {
> +    Process::startup();
> +
>      // load object file into target memory
>      objFile->loadSections(initVirtMem);
>
> -    // Calculate how much space we need for arg & env arrays.
> +    typedef AuxVector<uint32_t> auxv_t;
> +    std::vector<auxv_t> auxv;
> +
> +    ElfObject * elfObject = dynamic_cast<ElfObject *>(objFile);
> +    if (elfObject)
> +    {
> +        // Set the system page size
> +        auxv.push_back(auxv_t(M5_AT_PAGESZ, MipsISA::VMPageSize));
> +        // Set the frequency at which time() increments
> +        auxv.push_back(auxv_t(M5_AT_CLKTCK, 100));
> +        // For statically linked executables, this is the virtual
> +        // address of the program header tables if they appear in the
> +        // executable image.
> +        auxv.push_back(auxv_t(M5_AT_PHDR, elfObject->programHeaderTable()));
> +        DPRINTF(Loader, "auxv at PHDR %08p\n", elfObject->programHeaderTable());
> +        // This is the size of a program header entry from the elf file.
> +        auxv.push_back(auxv_t(M5_AT_PHENT, elfObject->programHeaderSize()));
> +        // This is the number of program headers from the original elf file.
> +        auxv.push_back(auxv_t(M5_AT_PHNUM, elfObject->programHeaderCount()));
> +        //The entry point to the program
> +        auxv.push_back(auxv_t(M5_AT_ENTRY, objFile->entryPoint()));
> +        //Different user and group IDs
> +        auxv.push_back(auxv_t(M5_AT_UID, uid()));
> +        auxv.push_back(auxv_t(M5_AT_EUID, euid()));
> +        auxv.push_back(auxv_t(M5_AT_GID, gid()));
> +        auxv.push_back(auxv_t(M5_AT_EGID, egid()));
> +    }
> +
> +    // Calculate how much space we need for arg & env & auxv arrays.
>      int argv_array_size = intSize * (argv.size() + 1);
>      int envp_array_size = intSize * (envp.size() + 1);
> +    int auxv_array_size = intSize * 2 * (auxv.size() + 1);
> +
>      int arg_data_size = 0;
>      for (vector<string>::size_type i = 0; i < argv.size(); ++i) {
>          arg_data_size += argv[i].size() + 1;
> @@ -92,7 +125,12 @@ MipsLiveProcess::argsInit(int intSize, int pageSize)
>      }
>
>      int space_needed =
> -        argv_array_size + envp_array_size + arg_data_size + env_data_size;
> +        argv_array_size +
> +        envp_array_size +
> +        auxv_array_size +
> +        arg_data_size +
> +        env_data_size;
> +
>      if (space_needed < 32*1024)
>          space_needed = 32*1024;
>
> @@ -113,7 +151,8 @@ MipsLiveProcess::argsInit(int intSize, int pageSize)
>      // ========
>      uint32_t argv_array_base = stack_min + intSize; // room for argc
>      uint32_t envp_array_base = argv_array_base + argv_array_size;
> -    uint32_t arg_data_base = envp_array_base + envp_array_size;
> +    uint32_t auxv_array_base = envp_array_base + envp_array_size;
> +    uint32_t arg_data_base = auxv_array_base + auxv_array_size;
>      uint32_t env_data_base = arg_data_base + arg_data_size;
>
>      // write contents to stack
> @@ -133,6 +172,19 @@ MipsLiveProcess::argsInit(int intSize, int pageSize)
>
>      copyStringArray(envp, envp_array_base, env_data_base, initVirtMem);
>
> +    // Copy the aux vector
> +    for (vector<auxv_t>::size_type x = 0; x < auxv.size(); x++) {
> +        initVirtMem->writeBlob(auxv_array_base + x * 2 * intSize,
> +                (uint8_t*)&(auxv[x].a_type), intSize);
> +        initVirtMem->writeBlob(auxv_array_base + (x * 2 + 1) * intSize,
> +                (uint8_t*)&(auxv[x].a_val), intSize);
> +    }
> +
> +    // Write out the terminating zeroed auxilliary vector
> +    const uint64_t zero = 0;
> +    initVirtMem->writeBlob(auxv_array_base + 2 * intSize * auxv.size(),
> +            (uint8_t*)&zero, 2 * intSize);
> +
>      ThreadContext *tc = system->getThreadContext(contextIds[0]);
>
>      setSyscallArg(tc, 0, argc);
>
> diff --git a/src/cpu/inorder/inorder_dyn_inst.cc b/src/cpu/inorder/inorder_dyn_inst.cc
> index 5ab8396..2ce9662 100644
> --- a/src/cpu/inorder/inorder_dyn_inst.cc
> +++ b/src/cpu/inorder/inorder_dyn_inst.cc
> @@ -330,6 +330,18 @@ InOrderDynInst::syscall(int64_t callnum)
>  }
>  #endif
>
> +Addr
> +InOrderDynInst::getThreadArea(void)
> +{
> +    return thread->getThreadArea();
> +}
> +
> +void
> +InOrderDynInst::setThreadArea(Addr addr)
> +{
> +    thread->setThreadArea(addr);
> +}
> +
>  void
>  InOrderDynInst::prefetch(Addr addr, unsigned flags)
>  {
> diff --git a/src/cpu/inorder/inorder_dyn_inst.hh b/src/cpu/inorder/inorder_dyn_inst.hh
> index 522b4e8..7ecfeb1 100644
> --- a/src/cpu/inorder/inorder_dyn_inst.hh
> +++ b/src/cpu/inorder/inorder_dyn_inst.hh
> @@ -503,6 +503,8 @@ class InOrderDynInst : public FastAlloc, public RefCounted
>      /** Calls a syscall. */
>      void syscall(int64_t callnum);
>  #endif
> +    Addr getThreadArea(void);
> +    void setThreadArea(Addr addr);
>      void prefetch(Addr addr, unsigned flags);
>      void writeHint(Addr addr, int size, unsigned flags);
>      Fault copySrcTranslate(Addr src);
> diff --git a/src/cpu/inorder/thread_context.hh b/src/cpu/inorder/thread_context.hh
> index 820f307..89af50f 100644
> --- a/src/cpu/inorder/thread_context.hh
> +++ b/src/cpu/inorder/thread_context.hh
> @@ -277,6 +277,14 @@ class InOrderThreadContext : public ThreadContext
>      { return cpu->syscall(callnum, thread->readTid()); }
>  #endif
>
> +    /** Gets the thread area (to support TLS). */
> +    virtual Addr getThreadArea(void)
> +    { return thread->getThreadArea(); }
> +
> +    /** Sets the thread area (to support TLS). */
> +    virtual void setThreadArea(Addr addr)
> +    { thread->setThreadArea(addr); }
> +
>      /** Reads the funcExeInst counter. */
>      virtual Counter readFuncExeInst() { return thread->funcExeInst; }
>
> diff --git a/src/cpu/o3/dyn_inst.hh b/src/cpu/o3/dyn_inst.hh
> index e1279f8..c10cfca 100644
> --- a/src/cpu/o3/dyn_inst.hh
> +++ b/src/cpu/o3/dyn_inst.hh
> @@ -177,8 +177,14 @@ class BaseO3DynInst : public BaseDynInst<Impl>
>  #else
>      /** Calls a syscall. */
>      void syscall(int64_t callnum);
> +
> +    /** Gets or sets the thread area (for TLS) */
> +    Addr getThreadArea(void);
> +    void setThreadArea(Addr addr);
>  #endif
>
> +
> +
>    public:
>
>      // The register accessor methods provide the index of the
> diff --git a/src/cpu/o3/dyn_inst_impl.hh b/src/cpu/o3/dyn_inst_impl.hh
> index 8d391ce..935b5b6 100644
> --- a/src/cpu/o3/dyn_inst_impl.hh
> +++ b/src/cpu/o3/dyn_inst_impl.hh
> @@ -182,5 +182,19 @@ BaseO3DynInst<Impl>::syscall(int64_t callnum)
>          this->setNextPC(new_next_pc);
>      }
>  }
> +
> +template <class Impl>
> +Addr
> +BaseO3DynInst<Impl>::getThreadArea(void)
> +{
> +    return this->thread->getThreadArea();
> +}
> +
> +template <class Impl>
> +void
> +BaseO3DynInst<Impl>::setThreadArea(Addr addr)
> +{
> +    this->thread->setThreadArea(addr);
> +}
>  #endif
>
> diff --git a/src/cpu/o3/thread_context.hh b/src/cpu/o3/thread_context.hh
> index 78b2660..c9a6dd9 100755
> --- a/src/cpu/o3/thread_context.hh
> +++ b/src/cpu/o3/thread_context.hh
> @@ -241,6 +241,14 @@ class O3ThreadContext : public ThreadContext
>      virtual void syscall(int64_t callnum)
>      { return cpu->syscall(callnum, thread->threadId()); }
>
> +    /** Gets the thread area (to support TLS). */
> +    virtual Addr getThreadArea(void)
> +    { return thread->getThreadArea(); }
> +
> +    /** Sets the thread area (to support TLS). */
> +    virtual void setThreadArea(Addr addr)
> +    { thread->setThreadArea(addr); }
> +
>      /** Reads the funcExeInst counter. */
>      virtual Counter readFuncExeInst() { return thread->funcExeInst; }
>  #else
> diff --git a/src/cpu/simple/base.hh b/src/cpu/simple/base.hh
> index 39961fb..0043892 100644
> --- a/src/cpu/simple/base.hh
> +++ b/src/cpu/simple/base.hh
> @@ -394,6 +394,8 @@ class BaseSimpleCPU : public BaseCPU
>      bool simPalCheck(int palFunc) { return thread->simPalCheck(palFunc); }
>  #else
>      void syscall(int64_t callnum) { thread->syscall(callnum); }
> +    Addr getThreadArea(void) { return thread->getThreadArea(); }
> +    void setThreadArea(Addr addr) { thread->setThreadArea(addr); }
>  #endif
>
>      bool misspeculating() { return thread->misspeculating(); }
> diff --git a/src/cpu/simple_thread.hh b/src/cpu/simple_thread.hh
> index 2d28607..4e93972 100644
> --- a/src/cpu/simple_thread.hh
> +++ b/src/cpu/simple_thread.hh
> @@ -128,6 +128,9 @@ class SimpleThread : public ThreadState
>       */
>      Addr nextNPC;
>
> +  protected:
> +    Addr thread_area; // Pointer to the thread area (for TLS)
> +
>    public:
>      // pointer to CPU associated with this SimpleThread
>      BaseCPU *cpu;
> @@ -406,6 +409,16 @@ class SimpleThread : public ThreadState
>      {
>          process->syscall(callnum, tc);
>      }
> +
> +    Addr getThreadArea(void)
> +    {
> +        return thread_area;
> +    }
> +
> +    void setThreadArea(Addr addr)
> +    {
> +        thread_area = addr;
> +    }
>  #endif
>  };
>
> diff --git a/src/cpu/thread_context.hh b/src/cpu/thread_context.hh
> index 78ecdac..1d5786a 100644
> --- a/src/cpu/thread_context.hh
> +++ b/src/cpu/thread_context.hh
> @@ -258,6 +258,10 @@ class ThreadContext
>      // 1 if the CPU has no more active threads (meaning it's OK to exit);
>      // Used in syscall-emulation mode when a thread calls the exit syscall.
>      virtual int exit() { return 1; };
> +
> +    // Get or set the thread area (for TLS)
> +    virtual Addr getThreadArea(void) = 0;
> +    virtual void setThreadArea(Addr addr) = 0;
>  #endif
>
>      /** function to compare two thread contexts (for debugging) */
> @@ -435,6 +439,12 @@ class ProxyThreadContext : public ThreadContext
>      void syscall(int64_t callnum)
>      { actualTC->syscall(callnum); }
>
> +    Addr getThreadArea(void)
> +    { return actualTC->getThreadArea(); }
> +
> +    void setThreadArea(Addr addr)
> +    { actualTC->setThreadArea(addr); }
> +
>      Counter readFuncExeInst() { return actualTC->readFuncExeInst(); }
>  #endif
>  };
> diff --git a/src/cpu/thread_state.hh b/src/cpu/thread_state.hh
> index cf637ae..0d55895 100644
> --- a/src/cpu/thread_state.hh
> +++ b/src/cpu/thread_state.hh
> @@ -233,6 +233,15 @@ struct ThreadState {
>      // Count failed store conditionals so we can warn of apparent
>      // application deadlock situations.
>      unsigned storeCondFailures;
> +
> +#if !FULL_SYSTEM
> +  protected:
> +    Addr thread_area;
> +
> +  public:
> +    Addr getThreadArea(void) { return thread_area; }
> +    void setThreadArea(Addr addr) { thread_area = addr; }
> +#endif
>  };
>
>  #endif // __CPU_THREAD_STATE_HH__
> diff --git a/src/sim/process.hh b/src/sim/process.hh
> index ab9d64c..d663e92 100644
> --- a/src/sim/process.hh
> +++ b/src/sim/process.hh
> @@ -269,6 +269,9 @@ class LiveProcess : public Process
>      uint64_t __pid;
>      uint64_t __ppid;
>
> +    // Thread pointer value (to support TLS)
> +    Addr tp_value;
> +
>    public:
>
>      enum AuxiliaryVectorType {
> @@ -323,6 +326,8 @@ class LiveProcess : public Process
>      }
>
>      std::string getcwd() const { return cwd; }
> +    Addr gettp_value () const { return tp_value; }
> +    void settp_value (Addr addr) { tp_value = addr; }
>
>      virtual void syscall(int64_t callnum, ThreadContext *tc);
>
>
>
> On Mon, Dec 14, 2009 at 8:51 AM, Matt <[email protected]> wrote:
>> I think the unsupported instruction is the rdhwr instruction.
>> It's used to get the thread pointer, which is used by many C library
>> functions to access global variables that are defined thread-local
>> these days (like errno, and the locale structure). If TLS is not
>> supported, even very basic C library calls will fail.
>>
>> In order to get basic TLS support working, I had to implement the
>> set_thread_area syscall and the rdhwr instruction. The argument to
>> set_thread_area is saved and then returned whenever rdhwr asks for it.
>> I'll try to provide a patch to get TLS working in MIPS. I've got it
>> working now, but it's hacked into m5-stable from several months ago,
>> so it will take a bit of work to generate a patch that will apply
>> cleanly to the head of the current development tree.
>>
>> On Mon, Dec 14, 2009 at 8:33 AM, Korey Sewell <[email protected]> wrote:
>>> which inst in there is unsupported?... Does it not recognize the "move"???
>>>
>>> On Sun, Dec 13, 2009 at 6:43 PM, Gabe Black <[email protected]> wrote:
>>>>
>>>> I actually just ran into and straightened out (I think) the syscall
>>>> interface issue. I basically added the missing calls and set
>>>> set_thread_area to ignoreFunc. Now I'm running into an unsupported
>>>> instruction in __current_locale_name which gdb doesn't seem to
>>>> understand either. Any idea how this should be handled?
>>>>
>>>> 0x00453720 <__current_locale_name+0>:  lui   gp,0x4b
>>>> 0x00453724 <__current_locale_name+4>:  addiu gp,gp,32544
>>>> 0x00453728 <__current_locale_name+8>:  0x7c03e83b
>>>> 0x0045372c <__current_locale_name+12>: move  v0,v1
>>>> 0x00453730 <__current_locale_name+16>: lw    v1,-30160(gp)
>>>> 0x00453734 <__current_locale_name+20>: addiu a0,a0,16
>>>> 0x00453738 <__current_locale_name+24>: addu  v0,v1,v0
>>>> 0x0045373c <__current_locale_name+28>: lw    v0,0(v0)
>>>> 0x00453740 <__current_locale_name+32>: sll   a0,a0,0x2
>>>> 0x00453744 <__current_locale_name+36>: addu  v0,v0,a0
>>>> 0x00453748 <__current_locale_name+40>: lw    v0,0(v0)
>>>> 0x0045374c <__current_locale_name+44>: jr    ra
>>>> 0x00453750 <__current_locale_name+48>: nop
>>>>
>>>> Gabe
>>>>
>>>> Matt wrote:
>>>> > Attached is the simple C source and the compiled MIPS binary that adds
>>>> > two doubles and prints the result.
>>>> > The binary was compiled with my cross compiler which is GCC 4.4.1
>>>> > linked with EGLIBC 2.10.1 with Linux 2.6.29.6 headers. Hopefully this
>>>> > binary will work for you; I've made some modifications to my copy of
>>>> > M5 to get it to handle some of the more recent changes to the Linux
>>>> > MIPS syscall interface. So hopefully, since this is a pretty simple
>>>> > little program, it won't do anything your version of M5 doesn't yet
>>>> > support.
>>>> >
>>>> > On Sat, Dec 12, 2009 at 4:54 PM, Gabe Black <[email protected]>
>>>> > wrote:
>>>> >
>>>> >> Matt wrote:
>>>> >>
>>>> >>> I'm having problems getting double-precision floating-point to work in
>>>> >>> m5 for the MIPS isa.
>>>> >>>
>>>> >>> The 32-bit MIPS isa has 32 32-bit floating-point registers.
>>>> >>> Double-precision floating-point numbers are stored in pairs of
>>>> >>> floating-point registers. At least that's how I understand it.
>>>> >>>
>>>> >>> Simple floating point math used to work in m5 until changeset
>>>> >>> 781969fbeca9. After the change, it seems that m5 does not read two
>>>> >>> 32-bit floating point registers to get a double-precision
>>>> >>> floating-point operand, but only one 32-bit floating-point register
>>>> >>> (when it's simulating an add_d instruction, for example). This
>>>> >>> results in incorrect floating point arithmetic.
>>>> >>>
>>>> >>> I have the following C program (compiled for MIPS) that exercises the
>>>> >>> problem:
>>>> >>>
>>>> >>> #include <stdio.h>
>>>> >>>
>>>> >>> int main (void)
>>>> >>> {
>>>> >>>     double x, y, z;
>>>> >>>
>>>> >>>     x = 5.0;
>>>> >>>     y = 0.1;
>>>> >>>
>>>> >>>     z = x + y;
>>>> >>>
>>>> >>>     printf ("z = %lf\n", z);
>>>> >>>
>>>> >>>     return 0;
>>>> >>> }
>>>> >>>
>>>> >>> It should print "z = 5.1", but it doesn't because the simulation of
>>>> >>> the floating-point addition is wrong.
>>>> >>>
>>>> >>> Can anyone tell me why this change was made that seems to break the
>>>> >>> simulation of double-precision floating-point arithmetic in m5?
>>>> >>>
>>>> >>> Thanks.
>>>> >>>
>>>> >> Could you send out a compiled version of your test program please? I'm
>>>> >> having some problems getting the cross compiler working.
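The register-pair behaviour described in the original report can be sketched as follows. This is a toy model rather than m5's register file: FpRegFile, writeDouble, readDouble and addD are hypothetical names, and the layout (even-numbered register holds the low 32 bits, the following odd register holds the high 32 bits) is an assumption about the paired-register convention, not something stated in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical bank of 32 single-width (32-bit) FP registers.
struct FpRegFile {
    uint32_t bits[32] = {};
};

// Store a double across the pair (even_idx, even_idx + 1).
// Assumed layout: low word in the even register, high word in the odd one.
void writeDouble(FpRegFile &f, unsigned even_idx, double value)
{
    uint64_t raw;
    std::memcpy(&raw, &value, sizeof raw);
    f.bits[even_idx] = static_cast<uint32_t>(raw);
    f.bits[even_idx + 1] = static_cast<uint32_t>(raw >> 32);
}

// Read a double back by combining BOTH registers of the pair.
// Reading only one of them would drop half of the operand's bits.
double readDouble(const FpRegFile &f, unsigned even_idx)
{
    uint64_t raw = (static_cast<uint64_t>(f.bits[even_idx + 1]) << 32)
                 | f.bits[even_idx];
    double value;
    std::memcpy(&value, &raw, sizeof value);
    return value;
}

// Toy add.d: fd <- fs + ft, all operands being register pairs.
void addD(FpRegFile &f, unsigned fd, unsigned fs, unsigned ft)
{
    writeDouble(f, fd, readDouble(f, fs) + readDouble(f, ft));
}
```

With 5.0 in one pair and 0.1 in another, addD leaves a value that reads back as approximately 5.1, which is what the test program in the report expects to print.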
>>>> >>
>>>> >> Gabe
>>>> >> _______________________________________________
>>>> >> m5-dev mailing list
>>>> >> [email protected]
>>>> >> http://m5sim.org/mailman/listinfo/m5-dev
>>>
>>> --
>>> - Korey
>>
>> --
>> Cheers,
>> Matt
>
> --
> Cheers,
> Matt
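For reference, the word gdb could not disassemble, 0x7c03e83b, has the field layout of rdhwr: major opcode 0x1f, function field 0x3b, rt = 3 (v1) and rd = 29. The sketch below decodes those fields and models the behaviour described in the thread, where the argument to set_thread_area is saved and handed back when rdhwr asks for it. It is a toy and not the m5 decoder: isRdhwr, decodeRdhwr and ToyThread are hypothetical names, and the check that rd equals 29 is an added assumption (the quoted patch returns the thread area without looking at rd).

```cpp
#include <cassert>
#include <cstdint>

struct RdhwrFields {
    unsigned rt;  // destination general-purpose register
    unsigned rd;  // hardware register number being read
};

// True if the word has major opcode 0x1f and function field 0x3b,
// the encoding used by rdhwr.
bool isRdhwr(uint32_t word)
{
    return (word >> 26) == 0x1f && (word & 0x3f) == 0x3b;
}

// Extract the rt (bits 20..16) and rd (bits 15..11) fields.
RdhwrFields decodeRdhwr(uint32_t word)
{
    return { (word >> 16) & 0x1f, (word >> 11) & 0x1f };
}

// Toy thread state: one saved thread-area pointer plus 32 integer registers.
struct ToyThread {
    uint32_t thread_area = 0;
    uint32_t gpr[32] = {};

    // Stand-in for the set_thread_area syscall handler.
    void setThreadArea(uint32_t addr) { thread_area = addr; }

    // Executes the word if it is "rdhwr rt, $29"; returns false otherwise.
    bool execute(uint32_t word)
    {
        if (!isRdhwr(word))
            return false;
        RdhwrFields f = decodeRdhwr(word);
        if (f.rd != 29)          // assumption: only register 29 is modelled
            return false;
        gpr[f.rt] = thread_area; // hand back what set_thread_area saved
        return true;
    }
};
```

Calling setThreadArea with some address and then executing 0x7c03e83b leaves that address in gpr[3], which is the value the following move and lw instructions in __current_locale_name go on to use.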
