Re: [m5-dev] MIPS double-precision floating point

Matt Mon, 14 Dec 2009 15:59:28 -0800

Below is a patch to get TLS working for MIPS.

There is also another m5 bug that prevents my very simple fp add
program from working.  The C library calls mmap to map an anonymous
region of memory, but the mmap syscall implementation in m5 has this
line (in mmapFunc() in src/sim/syscall_emul.hh):
    int fd = p->sim_fd(p->getSyscallArg(tc, index));
When mmap is called to map an anonymous region instead of a file, the
file descriptor argument is not a valid descriptor.  This line still
tries to look it up with sim_fd(); the lookup fails and m5 segfaults.
Simply changing the line to this should work for now:
    int fd = p->getSyscallArg(tc, index);
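
A cleaner long-term fix would probably be to translate the descriptor
only when the mapping is file-backed.  Here is a rough standalone
sketch of that idea; it is not m5 code.  FdTable, simFd() and
mmapHostFd() are names I made up for illustration, and the value of
kTargetMapAnonymous is my assumption for MAP_ANONYMOUS on MIPS Linux,
so check it against the target headers.

```cpp
#include <cassert>
#include <map>

// Assumed value of MAP_ANONYMOUS in the MIPS Linux ABI (verify against
// the target's headers before relying on it).
const int kTargetMapAnonymous = 0x0800;

// Hypothetical stand-in for a process' target-fd -> host-fd table.
struct FdTable {
    std::map<int, int> targetToHost;

    void add(int tgtFd, int hostFd) {
        assert(tgtFd >= 0 && hostFd >= 0);
        targetToHost[tgtFd] = hostFd;
    }

    // Returns the host fd, or -1 if the target fd is not open.
    int simFd(int tgtFd) const {
        std::map<int, int>::const_iterator it = targetToHost.find(tgtFd);
        return it == targetToHost.end() ? -1 : it->second;
    }
};

// Returns the host fd backing the mapping, or -1 when there is none.
// For anonymous mappings the fd argument is never looked up.
int mmapHostFd(const FdTable &fds, int flags, int tgtFd) {
    if (flags & kTargetMapAnonymous)
        return -1;
    return fds.simFd(tgtFd);
}
```

With this shape, an anonymous mmap never touches the descriptor table,
so a bogus fd argument cannot trigger a failed lookup.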


Once TLS is working and mmap is fixed, you'll get an assertion
failure in the emulation of the MIPS instruction Add_d when it calls
fpNanOperands().  This function used to check both 32-bit and 64-bit
floating-point values for NaN.  It was later changed to support only
32-bit values, and an assert was added to enforce this.  However, it
is still called from the double-precision floating-point arithmetic
operations, which trips the assertion.
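
To illustrate what a width-aware check could look like, here is a
small standalone sketch that tests IEEE 754 bit patterns for NaN at
either width.  It is not the actual fpNanOperands() code; isNan32(),
isNan64() and isNanBits() are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// NaN in IEEE 754: exponent field all ones and a non-zero fraction.
bool isNan32(uint32_t bits) {
    return (bits & 0x7f800000u) == 0x7f800000u &&
           (bits & 0x007fffffu) != 0;
}

bool isNan64(uint64_t bits) {
    return (bits & 0x7ff0000000000000ull) == 0x7ff0000000000000ull &&
           (bits & 0x000fffffffffffffull) != 0;
}

// Hypothetical width-aware wrapper: accepts 32 or 64 instead of
// asserting on anything other than 32.
bool isNanBits(uint64_t bits, int width) {
    assert(width == 32 || width == 64);
    if (width == 32)
        return isNan32(static_cast<uint32_t>(bits));
    return isNan64(bits);
}
```

The point is only that the 64-bit case needs its own masks; reusing
the 32-bit masks on a double's bit pattern would give wrong answers.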

Once this is fixed, you'll get to the main error that I was trying to
point out originally: m5 computes incorrect results for simple
double-precision floating-point arithmetic (in this case it computes
5.0 + 0.1 = 0.1).
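
For reference, this is roughly how I understand a double operand
should be assembled from a pair of 32-bit floating-point registers.
It is a standalone sketch, not m5 code.  pairToDouble() and
doubleToPair() are made-up names, and I am assuming the
even-numbered register holds the low word and the odd-numbered
register holds the high word, and that the host uses IEEE 754 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Build a double from two 32-bit register values.
// Assumption: lowWord comes from the even register of the pair and
// highWord from the odd register.
double pairToDouble(uint32_t lowWord, uint32_t highWord) {
    assert(sizeof(double) == sizeof(uint64_t));
    uint64_t bits = (static_cast<uint64_t>(highWord) << 32) | lowWord;
    double d;
    std::memcpy(&d, &bits, sizeof(d));
    return d;
}

// Split a double back into the two 32-bit register values.
void doubleToPair(double d, uint32_t &lowWord, uint32_t &highWord) {
    assert(sizeof(double) == sizeof(uint64_t));
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof(bits));
    lowWord = static_cast<uint32_t>(bits & 0xffffffffu);
    highWord = static_cast<uint32_t>(bits >> 32);
}
```

For example, pairToDouble(0x00000000, 0x40140000) yields 5.0.  If only
one register of the pair is read, half of the bit pattern is missing
and the arithmetic cannot come out right.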

So anyway, now here's the patch to get TLS working in MIPS:

diff --git a/src/arch/mips/isa/decoder.isa b/src/arch/mips/isa/decoder.isa
index c531347..1de96ab 100644
--- a/src/arch/mips/isa/decoder.isa
+++ b/src/arch/mips/isa/decoder.isa
@@ -2476,10 +2476,8 @@ decode OPCODE_HI default Unknown::unknown() {
                         }
                     }
                 }
-                0x3: decode OP_HI {
-                    0x2: decode OP_LO {
-                        0x3: FailUnimpl::rdhwr();
-                    }
+                format IntOp {
+                    0x3: rdhwr({{ Rt.uw = xc->getThreadArea(); }});
                 }
             }
         }
diff --git a/src/arch/mips/linux/process.cc b/src/arch/mips/linux/process.cc
index c2a05b7..75ce514 100644
--- a/src/arch/mips/linux/process.cc
+++ b/src/arch/mips/linux/process.cc
@@ -64,6 +64,16 @@ unameFunc(SyscallDesc *desc, int callnum, LiveProcess *process,
     return 0;
 }

+/// Target set_thread_area() handler.
+static SyscallReturn
+set_thread_areaFunc(SyscallDesc *desc, int callnum, LiveProcess *process,
+                    ThreadContext *tc)
+{
+    int index = 0;
+    tc->setThreadArea(process->getSyscallArg(tc, index));
+    return 0;
+}
+
 /// Target sys_getsysyinfo() handler.  Even though this call is
 /// borrowed from Tru64, the subcases that get used appear to be
 /// different in practice from those used by Tru64 processes.
@@ -409,7 +419,8 @@ SyscallDesc MipsLinuxProcess::syscallDescs[] = {
     /* 279 */ SyscallDesc("unknown #279", unimplementedFunc),
     /* 280 */ SyscallDesc("add_key", unimplementedFunc),
     /* 281 */ SyscallDesc("request_key", unimplementedFunc),
-    /* 282 */ SyscallDesc("keyctl", unimplementedFunc)
+    /* 282 */ SyscallDesc("keyctl", unimplementedFunc),
+    /* 283 */ SyscallDesc("set_thread_area", set_thread_areaFunc)
 };

 MipsLinuxProcess::MipsLinuxProcess(LiveProcessParams * params,
diff --git a/src/arch/mips/process.cc b/src/arch/mips/process.cc
index d96b0c8..11f8da5 100644
--- a/src/arch/mips/process.cc
+++ b/src/arch/mips/process.cc
@@ -34,6 +34,7 @@
 #include "arch/mips/process.hh"

 #include "base/loader/object_file.hh"
+#include "base/loader/elf_object.hh"
 #include "base/misc.hh"
 #include "cpu/thread_context.hh"

@@ -61,8 +62,8 @@ MipsLiveProcess::MipsLiveProcess(LiveProcessParams * params,
     brk_point = objFile->dataBase() + objFile->dataSize() + objFile->bssSize();
     brk_point = roundUp(brk_point, VMPageSize);

-    // Set up region for mmaps. For now, start at bottom of kuseg space.
-    mmap_start = mmap_end = 0x10000;
+    // Set up region for mmaps.  Start it 1GB above the top of the heap.
+    mmap_start = mmap_end = brk_point + 0x40000000L;
 }

 void
@@ -76,12 +77,44 @@ MipsLiveProcess::startup()
 void
 MipsLiveProcess::argsInit(int intSize, int pageSize)
 {
+    Process::startup();
+
     // load object file into target memory
     objFile->loadSections(initVirtMem);

-    // Calculate how much space we need for arg & env arrays.
+    typedef AuxVector<uint32_t> auxv_t;
+    std::vector<auxv_t> auxv;
+
+    ElfObject * elfObject = dynamic_cast<ElfObject *>(objFile);
+    if (elfObject)
+    {
+        // Set the system page size
+        auxv.push_back(auxv_t(M5_AT_PAGESZ, MipsISA::VMPageSize));
+        // Set the frequency at which time() increments
+        auxv.push_back(auxv_t(M5_AT_CLKTCK, 100));
+        // For statically linked executables, this is the virtual
+        // address of the program header tables if they appear in the
+        // executable image.
+        auxv.push_back(auxv_t(M5_AT_PHDR, elfObject->programHeaderTable()));
+        DPRINTF(Loader, "auxv at PHDR %08p\n", elfObject->programHeaderTable());
+        // This is the size of a program header entry from the elf file.
+        auxv.push_back(auxv_t(M5_AT_PHENT, elfObject->programHeaderSize()));
+        // This is the number of program headers from the original elf file.
+        auxv.push_back(auxv_t(M5_AT_PHNUM, elfObject->programHeaderCount()));
+        //The entry point to the program
+        auxv.push_back(auxv_t(M5_AT_ENTRY, objFile->entryPoint()));
+        //Different user and group IDs
+        auxv.push_back(auxv_t(M5_AT_UID, uid()));
+        auxv.push_back(auxv_t(M5_AT_EUID, euid()));
+        auxv.push_back(auxv_t(M5_AT_GID, gid()));
+        auxv.push_back(auxv_t(M5_AT_EGID, egid()));
+    }
+
+    // Calculate how much space we need for arg & env & auxv arrays.
     int argv_array_size = intSize * (argv.size() + 1);
     int envp_array_size = intSize * (envp.size() + 1);
+    int auxv_array_size = intSize * 2 * (auxv.size() + 1);
+
     int arg_data_size = 0;
     for (vector<string>::size_type i = 0; i < argv.size(); ++i) {
         arg_data_size += argv[i].size() + 1;
@@ -92,7 +125,12 @@ MipsLiveProcess::argsInit(int intSize, int pageSize)
     }

     int space_needed =
-         argv_array_size + envp_array_size + arg_data_size + env_data_size;
+        argv_array_size +
+        envp_array_size +
+        auxv_array_size +
+        arg_data_size +
+        env_data_size;
+
     if (space_needed < 32*1024)
         space_needed = 32*1024;

@@ -113,7 +151,8 @@ MipsLiveProcess::argsInit(int intSize, int pageSize)
     // ========
     uint32_t argv_array_base = stack_min + intSize; // room for argc
     uint32_t envp_array_base = argv_array_base + argv_array_size;
-    uint32_t arg_data_base = envp_array_base + envp_array_size;
+    uint32_t auxv_array_base = envp_array_base + envp_array_size;
+    uint32_t arg_data_base = auxv_array_base + auxv_array_size;
     uint32_t env_data_base = arg_data_base + arg_data_size;

     // write contents to stack
@@ -133,6 +172,19 @@ MipsLiveProcess::argsInit(int intSize, int pageSize)

     copyStringArray(envp, envp_array_base, env_data_base, initVirtMem);

+    // Copy the aux vector
+    for (vector<auxv_t>::size_type x = 0; x < auxv.size(); x++) {
+        initVirtMem->writeBlob(auxv_array_base + x * 2 * intSize,
+                (uint8_t*)&(auxv[x].a_type), intSize);
+        initVirtMem->writeBlob(auxv_array_base + (x * 2 + 1) * intSize,
+                (uint8_t*)&(auxv[x].a_val), intSize);
+    }
+
+    // Write out the terminating zeroed auxiliary vector
+    const uint64_t zero = 0;
+    initVirtMem->writeBlob(auxv_array_base + 2 * intSize * auxv.size(),
+            (uint8_t*)&zero, 2 * intSize);
+
     ThreadContext *tc = system->getThreadContext(contextIds[0]);

     setSyscallArg(tc, 0, argc);

diff --git a/src/cpu/inorder/inorder_dyn_inst.cc b/src/cpu/inorder/inorder_dyn_inst.cc
index 5ab8396..2ce9662 100644
--- a/src/cpu/inorder/inorder_dyn_inst.cc
+++ b/src/cpu/inorder/inorder_dyn_inst.cc
@@ -330,6 +330,18 @@ InOrderDynInst::syscall(int64_t callnum)
 }
 #endif

+Addr
+InOrderDynInst::getThreadArea(void)
+{
+    return thread->getThreadArea();
+}
+
+void
+InOrderDynInst::setThreadArea(Addr addr)
+{
+    thread->setThreadArea(addr);
+}
+
 void
 InOrderDynInst::prefetch(Addr addr, unsigned flags)
 {
diff --git a/src/cpu/inorder/inorder_dyn_inst.hh b/src/cpu/inorder/inorder_dyn_inst.hh
index 522b4e8..7ecfeb1 100644
--- a/src/cpu/inorder/inorder_dyn_inst.hh
+++ b/src/cpu/inorder/inorder_dyn_inst.hh
@@ -503,6 +503,8 @@ class InOrderDynInst : public FastAlloc, public RefCounted
     /** Calls a syscall. */
     void syscall(int64_t callnum);
 #endif
+    Addr getThreadArea(void);
+    void setThreadArea(Addr addr);
     void prefetch(Addr addr, unsigned flags);
     void writeHint(Addr addr, int size, unsigned flags);
     Fault copySrcTranslate(Addr src);
diff --git a/src/cpu/inorder/thread_context.hh b/src/cpu/inorder/thread_context.hh
index 820f307..89af50f 100644
--- a/src/cpu/inorder/thread_context.hh
+++ b/src/cpu/inorder/thread_context.hh
@@ -277,6 +277,14 @@ class InOrderThreadContext : public ThreadContext
     { return cpu->syscall(callnum, thread->readTid()); }
 #endif

+    /** Gets the thread area (to support TLS). */
+    virtual Addr getThreadArea(void)
+    { return thread->getThreadArea(); }
+
+    /** Sets the thread area (to support TLS). */
+    virtual void setThreadArea(Addr addr)
+    { thread->setThreadArea(addr); }
+
     /** Reads the funcExeInst counter. */
     virtual Counter readFuncExeInst() { return thread->funcExeInst; }

diff --git a/src/cpu/o3/dyn_inst.hh b/src/cpu/o3/dyn_inst.hh
index e1279f8..c10cfca 100644
--- a/src/cpu/o3/dyn_inst.hh
+++ b/src/cpu/o3/dyn_inst.hh
@@ -177,8 +177,14 @@ class BaseO3DynInst : public BaseDynInst<Impl>
 #else
     /** Calls a syscall. */
     void syscall(int64_t callnum);
+
+    /** Gets or sets the thread area (for TLS) */
+    Addr getThreadArea(void);
+    void setThreadArea(Addr addr);
 #endif

+
+
   public:

     // The register accessor methods provide the index of the
diff --git a/src/cpu/o3/dyn_inst_impl.hh b/src/cpu/o3/dyn_inst_impl.hh
index 8d391ce..935b5b6 100644
--- a/src/cpu/o3/dyn_inst_impl.hh
+++ b/src/cpu/o3/dyn_inst_impl.hh
@@ -182,5 +182,19 @@ BaseO3DynInst<Impl>::syscall(int64_t callnum)
         this->setNextPC(new_next_pc);
     }
 }
+
+template <class Impl>
+Addr
+BaseO3DynInst<Impl>::getThreadArea(void)
+{
+    return this->thread->getThreadArea();
+}
+
+template <class Impl>
+void
+BaseO3DynInst<Impl>::setThreadArea(Addr addr)
+{
+    this->thread->setThreadArea(addr);
+}
 #endif

diff --git a/src/cpu/o3/thread_context.hh b/src/cpu/o3/thread_context.hh
index 78b2660..c9a6dd9 100755
--- a/src/cpu/o3/thread_context.hh
+++ b/src/cpu/o3/thread_context.hh
@@ -241,6 +241,14 @@ class O3ThreadContext : public ThreadContext
     virtual void syscall(int64_t callnum)
     { return cpu->syscall(callnum, thread->threadId()); }

+    /** Gets the thread area (to support TLS). */
+    virtual Addr getThreadArea(void)
+    { return thread->getThreadArea(); }
+
+    /** Sets the thread area (to support TLS). */
+    virtual void setThreadArea(Addr addr)
+    { thread->setThreadArea(addr); }
+
     /** Reads the funcExeInst counter. */
     virtual Counter readFuncExeInst() { return thread->funcExeInst; }
 #else
diff --git a/src/cpu/simple/base.hh b/src/cpu/simple/base.hh
index 39961fb..0043892 100644
--- a/src/cpu/simple/base.hh
+++ b/src/cpu/simple/base.hh
@@ -394,6 +394,8 @@ class BaseSimpleCPU : public BaseCPU
     bool simPalCheck(int palFunc) { return thread->simPalCheck(palFunc); }
 #else
     void syscall(int64_t callnum) { thread->syscall(callnum); }
+    Addr getThreadArea(void) { return thread->getThreadArea(); }
+    void setThreadArea(Addr addr) { thread->setThreadArea(addr); }
 #endif

     bool misspeculating() { return thread->misspeculating(); }
diff --git a/src/cpu/simple_thread.hh b/src/cpu/simple_thread.hh
index 2d28607..4e93972 100644
--- a/src/cpu/simple_thread.hh
+++ b/src/cpu/simple_thread.hh
@@ -128,6 +128,9 @@ class SimpleThread : public ThreadState
      */
     Addr nextNPC;

+  protected:
+    Addr thread_area;   // Pointer to the thread area (for TLS)
+
   public:
     // pointer to CPU associated with this SimpleThread
     BaseCPU *cpu;
@@ -406,6 +409,16 @@ class SimpleThread : public ThreadState
     {
         process->syscall(callnum, tc);
     }
+
+    Addr getThreadArea(void)
+    {
+        return thread_area;
+    }
+
+    void setThreadArea(Addr addr)
+    {
+        thread_area = addr;
+    }
 #endif
 };

diff --git a/src/cpu/thread_context.hh b/src/cpu/thread_context.hh
index 78ecdac..1d5786a 100644
--- a/src/cpu/thread_context.hh
+++ b/src/cpu/thread_context.hh
@@ -258,6 +258,10 @@ class ThreadContext
     // 1 if the CPU has no more active threads (meaning it's OK to exit);
     // Used in syscall-emulation mode when a  thread calls the exit syscall.
     virtual int exit() { return 1; };
+
+    // Get or set the thread area (for TLS)
+    virtual Addr getThreadArea(void) = 0;
+    virtual void setThreadArea(Addr addr) = 0;
 #endif

     /** function to compare two thread contexts (for debugging) */
@@ -435,6 +439,12 @@ class ProxyThreadContext : public ThreadContext
     void syscall(int64_t callnum)
     { actualTC->syscall(callnum); }

+    Addr getThreadArea(void)
+    { return actualTC->getThreadArea(); }
+
+    void setThreadArea(Addr addr)
+    { actualTC->setThreadArea(addr); }
+
     Counter readFuncExeInst() { return actualTC->readFuncExeInst(); }
 #endif
 };
diff --git a/src/cpu/thread_state.hh b/src/cpu/thread_state.hh
index cf637ae..0d55895 100644
--- a/src/cpu/thread_state.hh
+++ b/src/cpu/thread_state.hh
@@ -233,6 +233,15 @@ struct ThreadState {
     // Count failed store conditionals so we can warn of apparent
     // application deadlock situations.
     unsigned storeCondFailures;
+
+#if !FULL_SYSTEM
+  protected:
+    Addr thread_area;
+
+  public:
+    Addr getThreadArea(void) { return thread_area; }
+    void setThreadArea(Addr addr) { thread_area = addr; }
+#endif
 };

 #endif // __CPU_THREAD_STATE_HH__
diff --git a/src/sim/process.hh b/src/sim/process.hh
index ab9d64c..d663e92 100644
--- a/src/sim/process.hh
+++ b/src/sim/process.hh
@@ -269,6 +269,9 @@ class LiveProcess : public Process
     uint64_t __pid;
     uint64_t __ppid;

+    // Thread pointer value (to support TLS)
+    Addr tp_value;
+
   public:

     enum AuxiliaryVectorType {
@@ -323,6 +326,8 @@ class LiveProcess : public Process
     }

     std::string getcwd() const { return cwd; }
+    Addr gettp_value () const { return tp_value; }
+    void settp_value (Addr addr) { tp_value = addr; }

     virtual void syscall(int64_t callnum, ThreadContext *tc);
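
To summarize what the patch does: set_thread_area saves the thread
pointer, and rdhwr hands it back.  The word 0x7c03e83b from the
disassembly quoted below decodes as a rdhwr of hardware register 29
into v1.  Here is a toy standalone model of that, not m5 code;
RdhwrFields, decodeFields(), isRdhwr() and ToyThread are names
invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// The instruction fields that matter for rdhwr.
struct RdhwrFields {
    unsigned opcode;    // bits 31..26
    unsigned rt;        // bits 20..16, destination GPR
    unsigned rd;        // bits 15..11, hardware register number
    unsigned function;  // bits 5..0
};

RdhwrFields decodeFields(uint32_t inst) {
    RdhwrFields f;
    f.opcode = (inst >> 26) & 0x3f;
    f.rt = (inst >> 16) & 0x1f;
    f.rd = (inst >> 11) & 0x1f;
    f.function = inst & 0x3f;
    return f;
}

// rdhwr is opcode 0x1f (SPECIAL3) with function code 0x3b.
bool isRdhwr(uint32_t inst) {
    RdhwrFields f = decodeFields(inst);
    return f.opcode == 0x1f && f.function == 0x3b;
}

// Toy thread state: the saved thread area plus 32 integer registers.
struct ToyThread {
    uint32_t threadArea;
    uint32_t gpr[32];

    ToyThread() : threadArea(0) {
        for (int i = 0; i < 32; ++i)
            gpr[i] = 0;
    }

    // What the set_thread_area syscall handler does: save the value.
    void setThreadArea(uint32_t addr) { threadArea = addr; }

    // Executes rdhwr only; returns false for any other instruction.
    // This sketch models only hardware register 29 (the TLS pointer).
    bool execute(uint32_t inst) {
        if (!isRdhwr(inst))
            return false;
        RdhwrFields f = decodeFields(inst);
        assert(f.rt < 32);
        if (f.rd != 29)
            return false;
        gpr[f.rt] = threadArea;
        return true;
    }
};
```

Note that the rdhwr in the patch above does not look at the rd field
at all; it always returns the thread area, which is enough for the
TLS case.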



On Mon, Dec 14, 2009 at 8:51 AM, Matt <[email protected]> wrote:
> I think the unsupported instruction is the rdhwr instruction.  It's
> used to get the thread pointer which is used by many C library
> functions to access global variables that are defined thread-local
> these days (like errno, and the locale structure).  If TLS is not
> supported, even very basic C library calls will fail.
>
> In order to get basic TLS support working, I had to implement the
> set_thread_area syscall and the rdhwr instruction.  The argument to
> set_thread_area is saved and then returned whenever rdhwr asks for it.
>  I'll try to provide a patch to get TLS working in MIPS.  I've got it
> working now, but it's hacked into m5-stable from several months ago,
> so it will take a bit of work to generate a patch that will apply
> cleanly to the head of the current development tree.
>
> On Mon, Dec 14, 2009 at 8:33 AM, Korey Sewell <[email protected]> wrote:
>> which inst in there is unsupported?... Does it not recognize the "move"???
>>
>> On Sun, Dec 13, 2009 at 6:43 PM, Gabe Black <[email protected]> wrote:
>>>
>>> I actually just ran into and straightened out (I think) the syscall
>>> interface issue. I basically added the missing calls and set
>>> set_thread_area to ignoreFunc. Now I'm running into an unsupported
>>> instruction in __current_locale_name which gdb doesn't seem to
>>> understand either. Any idea how this should be handled?
>>>
>>> 0x00453720 <__current_locale_name+0>:   lui     gp,0x4b
>>> 0x00453724 <__current_locale_name+4>:   addiu   gp,gp,32544
>>> 0x00453728 <__current_locale_name+8>:   0x7c03e83b
>>> 0x0045372c <__current_locale_name+12>:  move    v0,v1
>>> 0x00453730 <__current_locale_name+16>:  lw      v1,-30160(gp)
>>> 0x00453734 <__current_locale_name+20>:  addiu   a0,a0,16
>>> 0x00453738 <__current_locale_name+24>:  addu    v0,v1,v0
>>> 0x0045373c <__current_locale_name+28>:  lw      v0,0(v0)
>>> 0x00453740 <__current_locale_name+32>:  sll     a0,a0,0x2
>>> 0x00453744 <__current_locale_name+36>:  addu    v0,v0,a0
>>> 0x00453748 <__current_locale_name+40>:  lw      v0,0(v0)
>>> 0x0045374c <__current_locale_name+44>:  jr      ra
>>> 0x00453750 <__current_locale_name+48>:  nop
>>>
>>> Gabe
>>>
>>> Matt wrote:
>>> > Attached is the simple C source and the compiled MIPS binary that adds
>>> > two doubles and prints the result.
>>> > The binary was compiled with my cross compiler which is GCC 4.4.1
>>> > linked with EGLIBC 2.10.1 with Linux 2.6.29.6 headers.  Hopefully this
>>> > binary will work for you; I've made some modifications to my copy of
>>> > M5 to get it to handle some of the more recent changes to the Linux
>>> > MIPS syscall interface.  So hopefully, since this is a pretty simple
>>> > little program, it won't do anything your version of M5 doesn't yet
>>> > support.
>>> >
>>> > On Sat, Dec 12, 2009 at 4:54 PM, Gabe Black <[email protected]>
>>> > wrote:
>>> >
>>> >> Matt wrote:
>>> >>
>>> >>> I'm having problems getting double-precision floating-point to work in
>>> >>> m5 for the MIPS isa.
>>> >>>
>>> >>> The 32-bit MIPS isa has 32 32-bit floating-point registers.
>>> >>> Double-precision floating-point numbers are stored in pairs of
>>> >>> floating-point registers.  At least that's how I understand it.
>>> >>>
>>> >>> Simple floating point math used to work in m5 until changeset
>>> >>> 781969fbeca9.  After the change, it seems that m5 does not read two
>>> >>> 32-bit floating point registers to get a double-precision
>>> >>> floating-point operand, but only one 32-bit floating-point register
>>> >>> (when it's simulating an add_d instruction, for example).  This
>>> >>> results in incorrect floating point arithmetic.
>>> >>>
>>> >>> I have the following C program (compiled for MIPS) that exercises the
>>> >>> problem:
>>> >>>
>>> >>> #include<stdio.h>
>>> >>>
>>> >>> int main (void)
>>> >>> {
>>> >>>         double x, y, z;
>>> >>>
>>> >>>         x = 5.0;
>>> >>>         y = 0.1;
>>> >>>
>>> >>>         z = x + y;
>>> >>>
>>> >>>         printf ("z = %lf\n", z);
>>> >>>
>>> >>>         return 0;
>>> >>> }
>>> >>>
>>> >>> It should print "z = 5.100000", but it doesn't because the simulation of
>>> >>> the floating-point addition is wrong.
>>> >>>
>>> >>> Can anyone tell me why this change was made that seems to break the
>>> >>> simulation of double-precision floating-point arithmetic in m5?
>>> >>>
>>> >>> Thanks.
>>> >>>
>>> >>>
>>> >>>
>>> >> Could you send out a compiled version of your test program please? I'm
>>> >> having some problems getting the cross compiler working.
>>> >>
>>> >> Gabe
>>> >> _______________________________________________
>>> >> m5-dev mailing list
>>> >> [email protected]
>>> >> http://m5sim.org/mailman/listinfo/m5-dev
>>
>>
>>
>> --
>> - Korey
>>
>
>
>
> --
> Cheers,
> Matt
>



-- 
Cheers,
Matt
