Thanks Ali! Sounds like we are close to the target! I'll look into this. In crosstool's howto.html, it said NPTL is not supported, but I think this is for crosstool 0.42. I am using crosstool 0.43, it seems there is a .dat for gcc.4.1.1-glibc-2.3.5-nptl.dat. Maybe I just have to use --enable-add-ons = nptl when compiling my application. But if it requries rebuilding the toolchain it may take a while. I'll let you know then.
Thanks a lot! I'm very grateful! Jiayuan ----- Original Message ----- From: "Ali Saidi" <[EMAIL PROTECTED]> To: "M5 users mailing list" <m5-users@m5sim.org> Sent: 2007年6月17日 11:32 AM Subject: Re: [m5-users] synchronization primitives in SE mode If seems like for whatever reason the malloc code is using this header from libc: ./sysdeps/generic/malloc-machine.h as opposed to: ./nptl/sysdeps/pthread/malloc-machine.h The prior just does what you have listed below, the latter version actually calls code that does a real locking. I would check your build environment before you go off trying to write your own locking code. Ali On Jun 16, 2007, at 7:18 PM, Jiayuan Meng wrote: > Thanks Steve! > > > >> As long as the CPU IDs are distinct then it shouldn't matter that the >> thread IDs are all zero... the load locked/store conditional code >> requires both to match before it considers the request as coming from >> the same place. > > I see. I also tried assigning a unique ID to each thread. The > problem is still there though. > >> >> Can you tell from the trace that there's a point inside mutex_lock() >> where all of your cpus do LLs to the same address followed by SCs, >> and >> all of the SCs succeed? This would be definitive evidence that >> there's >> a problem with LL/SC. Other than that it's just a guess that this is >> where the problem lies. > > As I am looking into the trace, it seems that the libc_malloc is > not using LL/SC. The critial part looks like this: > > ldl r1, 0(r0) .......... load r1, initially it is 0 > bne r1,0xxx .......... if r1 is non-zero, jump to function > libc__arena_get > lda r1, 1(r31).........assign r1 to be 1 > stl r1, 0(r0)................store r1 to its original address > > the mutex_lock and unlock functions are inlined, so it appears > these all happen in libc_malloc. > > the race condition happens when the four threads simultaneously run > this piece of code. If only I can change the ldl and stl to LL/SC! > I'm guessing that I am not invoking the right mutex_lock functions, > but I don't have idea about how to do that. Maybe I can write a > mutex_lock function and link it to the program. Any suggestions? > > >> >> Also, what happens if you run with AtomicSimpleCPU, and with or >> without >> a single level of caches? > > I'm currently using private L1s and shared L2s. I'll test about > using a single level of shared L1. but are you actually interested > about how LL/SC in M5 behave under these different configurations? > I'll let you know when I capture these instructions :) > > Thanks for the insights! > > Jiayuan > > >> >> Steve >> >> Jiayuan Meng wrote: >>> Hi Ali, >>> >>> Thanks for the quick responce. >>> >>> I am having a master thread spawning child threads on multiple cpus. >>> Once a thread gets allocated to a cpu, it always resides there >>> (so far). >>> >>> I am using AtomicTimingCPU. In my test case with racing mallocs, >>> I have five CPUs(with id from 0 to 4). A master threads initially >>> runs >>> on cpu0. When it comes to a pseudo instruction, it tells the >>> simulator >>> to spawn four child threads on the other CPUs. Each CPU only uses >>> one >>> thread context(all have the id 0 by default). Will this be a >>> problem? I'll try assigning different thread IDs. >>> >>> To create threads, I learned from "stack_createFunc" and >>> "init_thread_context" in kern/tru64/tru64.hh, basically allocate >>> a new >>> stack, and assigns the pc and sp register. A major difference >>> might be >>> that I am not using pthreads. instead, I inserted a new pseudo >>> instruction which "atomically" creates four threads on the other >>> four >>> CPUs, they start to execute at the same cycle. >>> >>> I actually extended SimpleCPUs to have multiple thread contexts >>> and the >>> CPU can switch among them. They are tested with the splash2 FFT >>> benchmark and things went fine. But to make the test more clear, >>> I just >>> set each CPU to have exactly one thread context. In the future, I >>> may >>> need to "migrate" a running thread context from one CPU to another. >>> >>> I'm in trouble now... I wonder how splash2 gets around with this >>> in SE mode? >>> >>> Thanks again! >>> >>> Jiayuan >>> >>> >>> ----- Original Message ----- >>> *From:* Ali Saidi <mailto:[EMAIL PROTECTED]> >>> *To:* M5 users mailing list <mailto:m5-users@m5sim.org> >>> *Sent:* 2007年6月16日 2:41 AM >>> *Subject:* Re: [m5-users] synchronization primitives in SE mode >>> >>> The Alpha ISA has a load locked and a store conditional >>> instruction >>> which we support. Again I don't know exactly what you're >>> doing to >>> create your threads, but you need to make sure that their cpu/ >>> thread >>> ids are unique. Are you scheduling each thread on it's own >>> cpu or >>> are they moving around? >>> >>> Ali >>> >>> >>> >>> On Jun 15, 2007, at 1:30 PM, Jiayuan Meng wrote: >>> >>>> Hey all, >>>> >>>> By using the --trace-flags=Exec debug tool, I found that >>>> there is >>>> a race condition in the malloc function in my multithreaded >>>> program. However, when looking into the malloc.c in the >>>> glibc, it >>>> said it is a thread-safe version. I also noticed that in >>>> malloc/arena.c, it uses mutex_lock(), which seems to be a >>>> spinlock. This may still be problematic if several threads are >>>> accessing the lock simultaneously. >>>> >>>> So, what kind of synchronization support does M5 have in SE >>>> mode? >>>> Does it have store-conditional or test-and-set instructions or >>>> I'll have to add one myself? >>>> >>>> Thanks! >>>> >>>> Jiayuan >>> >>> >>> -------------------------------------------------------------------- >>> ---- >>> >>> _______________________________________________ >>> m5-users mailing list >>> m5-users@m5sim.org >>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users >>> >>> >>> -------------------------------------------------------------------- >>> ---- >>> >>> _______________________________________________ >>> m5-users mailing list >>> m5-users@m5sim.org >>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users >> _______________________________________________ >> m5-users mailing list >> m5-users@m5sim.org >> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users > _______________________________________________ > m5-users mailing list > m5-users@m5sim.org > http://m5sim.org/cgi-bin/mailman/listinfo/m5-users _______________________________________________ m5-users mailing list m5-users@m5sim.org http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
_______________________________________________ m5-users mailing list m5-users@m5sim.org http://m5sim.org/cgi-bin/mailman/listinfo/m5-users