Thanks Ali! Sounds like we are close to the target! I'll look into this.
crosstool's howto.html says NPTL is not supported, but I think that applies to
crosstool 0.42. I am using crosstool 0.43, which seems to ship a .dat file for it:
gcc.4.1.1-glibc-2.3.5-nptl.dat. Maybe I just have to use --enable-add-ons=nptl
when compiling my application. But if it requires rebuilding the toolchain,
it may take a while. I'll let you know then.
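
As a quick sanity check on the toolchain, glibc can report which threading library it was built with through confstr() and the GNU extension _CS_GNU_LIBPTHREAD_VERSION. The sketch below is only an illustration under assumptions: the helper name thread_lib_version is made up here, and the result does not prove which locking header the malloc code was compiled against.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of glibc or M5): copies the threading
 * library version string reported by the C library into buf.
 * Returns 0 on success, -1 if the query is unavailable or buf is too small. */
static int thread_lib_version(char *buf, size_t len)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* confstr() returns the size needed for the full string, including
     * the terminating NUL, or 0 if the name is not recognized. */
    size_t needed = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    if (needed == 0 || needed > len)
        return -1;
    return 0;
#else
    /* Not a GNU libc, or one too old to define this name. */
    (void)buf;
    (void)len;
    return -1;
#endif
}
```

Called with a buffer of 64 bytes or so, an NPTL build typically reports a string beginning with "NPTL", while a LinuxThreads build typically reports one beginning with "linuxthreads".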

Thanks a lot! I'm very grateful!

Jiayuan



----- Original Message ----- 
From: "Ali Saidi" <[EMAIL PROTECTED]>
To: "M5 users mailing list" <m5-users@m5sim.org>
Sent: June 17, 2007 11:32 AM
Subject: Re: [m5-users] synchronization primitives in SE mode


It seems that, for whatever reason, the malloc code is using this
header from libc:
./sysdeps/generic/malloc-machine.h
as opposed to:
./nptl/sysdeps/pthread/malloc-machine.h

The former just does what you have listed below; the latter
actually calls code that does real locking. I would check your
build environment before you go off trying to write your own locking
code.
Ali
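
To make the difference concrete, here is a minimal C sketch of the two styles of lock. It is not glibc's actual code: the function names are hypothetical, and the atomic variant uses C11 <stdatomic.h> as a stand-in for whatever the NPTL header really calls. The first function has the same shape as the ldl/bne/lda/stl sequence quoted below, a plain load followed by a plain store, so several CPUs can all read 0 before any of them writes 1. The second performs the test and the set as one atomic operation, which on Alpha a compiler would typically implement with a load-locked/store-conditional pair.

```c
#include <stdatomic.h>

/* Hypothetical names for illustration; these are not glibc definitions. */

/* Non-atomic check-then-set. Returns 0 if the lock was taken, 1 if busy.
 * Another CPU can run between the load and the store, so two callers
 * may both see 0 and both believe they own the lock. */
static int naive_trylock(volatile int *m)
{
    if (*m != 0)   /* plain load and branch (ldl, bne) */
        return 1;
    *m = 1;        /* plain store (lda, stl) */
    return 0;
}

static void naive_unlock(volatile int *m)
{
    *m = 0;
}

/* Atomic test-and-set. Returns 0 if the lock was taken, 1 if busy.
 * The read and the write happen as a single indivisible step. */
static int atomic_trylock(atomic_flag *m)
{
    return atomic_flag_test_and_set_explicit(m, memory_order_acquire) ? 1 : 0;
}

static void atomic_unlock(atomic_flag *m)
{
    atomic_flag_clear_explicit(m, memory_order_release);
}
```

Run from a single thread, both variants behave identically, which is why the problem only shows up once several CPUs reach the lock in the same cycle.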


On Jun 16, 2007, at 7:18 PM, Jiayuan Meng wrote:

> Thanks Steve!
>
>
>
>> As long as the CPU IDs are distinct then it shouldn't matter that the
>> thread IDs are all zero... the load locked/store conditional code
>> requires both to match before it considers the request as coming from
>> the same place.
>
> I see. I also tried assigning a unique ID to each thread. The  
> problem is still there though.
>
>>
>> Can you tell from the trace that there's a point inside mutex_lock()
>> where all of your cpus do LLs to the same address followed by SCs, and
>> all of the SCs succeed?  This would be definitive evidence that there's
>> a problem with LL/SC.  Other than that it's just a guess that this is
>> where the problem lies.
>
> Looking at the trace, it seems that libc_malloc is not using LL/SC.
> The critical part looks like this:
>
> ldl r1, 0(r0)    .......... load r1; initially it is 0
> bne r1, 0xxx     .......... if r1 is non-zero, jump to function libc__arena_get
> lda r1, 1(r31)   .......... set r1 to 1
> stl r1, 0(r0)    .......... store r1 back to the original address
>
> The mutex_lock and unlock functions are inlined, so all of this
> appears to happen inside libc_malloc.
>
> The race condition happens when the four threads run this piece of
> code simultaneously. If only I could change the ldl and stl to LL/SC!
> I'm guessing that I am not invoking the right mutex_lock functions,
> but I have no idea how to do that. Maybe I can write a
> mutex_lock function and link it into the program. Any suggestions?
>
>
>>
>> Also, what happens if you run with AtomicSimpleCPU, and with or without
>> a single level of caches?
>
> I'm currently using private L1s and shared L2s. I'll test with
> a single level of shared L1. But are you actually interested
> in how LL/SC in M5 behaves under these different configurations?
> I'll let you know when I capture these instructions :)
>
> Thanks for the insights!
>
> Jiayuan
>
>
>>
>> Steve
>>
>> Jiayuan Meng wrote:
>>> Hi Ali,
>>>
>>> Thanks for the quick response.
>>>
>>> I have a master thread that spawns child threads on multiple CPUs.
>>> Once a thread gets allocated to a CPU, it always resides there (so far).
>>>
>>> I am using AtomicTimingCPU. In my test case with racing mallocs,
>>> I have five CPUs (with IDs from 0 to 4). A master thread initially runs
>>> on cpu0. When it reaches a pseudo instruction, it tells the simulator
>>> to spawn four child threads on the other CPUs. Each CPU only uses one
>>> thread context (all have the ID 0 by default). Will this be a
>>> problem? I'll try assigning different thread IDs.
>>>
>>> To create threads, I learned from "stack_createFunc" and
>>> "init_thread_context" in kern/tru64/tru64.hh: basically, allocate a new
>>> stack and assign the pc and sp registers. A major difference might be
>>> that I am not using pthreads. Instead, I inserted a new pseudo
>>> instruction which "atomically" creates four threads on the other four
>>> CPUs; they start to execute at the same cycle.
>>>
>>> I actually extended SimpleCPUs to have multiple thread contexts, and the
>>> CPU can switch among them. They were tested with the splash2 FFT
>>> benchmark and things went fine. But to make the test clearer, I just
>>> set each CPU to have exactly one thread context. In the future, I may
>>> need to "migrate" a running thread context from one CPU to another.
>>>
>>> I'm in trouble now... I wonder how splash2 gets around this in SE mode?
>>>
>>> Thanks again!
>>>
>>> Jiayuan
>>>
>>>
>>>     ----- Original Message -----
>>>     *From:* Ali Saidi <mailto:[EMAIL PROTECTED]>
>>>     *To:* M5 users mailing list <mailto:m5-users@m5sim.org>
>>>     *Sent:* June 16, 2007 2:41 AM
>>>     *Subject:* Re: [m5-users] synchronization primitives in SE mode
>>>
>>>     The Alpha ISA has a load locked and a store conditional instruction
>>>     which we support. Again I don't know exactly what you're doing to
>>>     create your threads, but you need to make sure that their cpu/thread
>>>     ids are unique. Are you scheduling each thread on its own cpu or
>>>     are they moving around?
>>>
>>>     Ali
>>>
>>>
>>>
>>>     On Jun 15, 2007, at 1:30 PM, Jiayuan Meng wrote:
>>>
>>>>     Hey all,
>>>>
>>>>     By using the --trace-flags=Exec debug tool, I found that there is
>>>>     a race condition in the malloc function in my multithreaded
>>>>     program. However, malloc.c in glibc says it is a thread-safe
>>>>     version. I also noticed that malloc/arena.c uses mutex_lock(),
>>>>     which seems to be a spinlock. This may still be problematic if
>>>>     several threads are accessing the lock simultaneously.
>>>>
>>>>     So, what kind of synchronization support does M5 have in SE mode?
>>>>     Does it have store-conditional or test-and-set instructions, or
>>>>     will I have to add one myself?
>>>>
>>>>     Thanks!
>>>>
>>>>     Jiayuan
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>>     _______________________________________________
>>>     m5-users mailing list
>>>     m5-users@m5sim.org
>>>     http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>>>
>>>

