On Mon, 19 Oct 2009 14:33:45 -0700, nathan binkert <[email protected]>
wrote:
>> The fundamental law in PDES is that one process should not run ahead of
>> another if that other process can trigger events affecting it. If an
>> object schedules an event that should have been processed at time T1 but
>> the eventqueue is already at time T2, with T2 > T1, then you'll have
>> causality errors. So this is what I meant by blocking eventqueues. Other
>> than that, I agree that concurrent scheduling should be lockless. Why not
>> implement the eventqueues as a lockless queue?
> Because event queues require insertion in the middle, and as far as I
> know, there is no way to build a lockless queue where you can insert
> into the middle (unless we have a double compare and swap).  Anyway, I
> think that the atomic operation would actually slow things down.  The
> frequency of cross-queue events vs. regular events should hopefully be
> low, so it seems that we should try to optimize for the regular ones.
> 
> CMPXCHG16B exists on modern machines, but I think it would be pretty
> limiting to require that instruction to run a parallel simulation.
> 
>   Nate
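
The two quoted points (causality, and insertion into the middle of the queue) can be illustrated with a minimal sketch. This is not M5's EventQueue; SketchEventQueue, scheduleRemote() and serviceOne() are hypothetical names, and routing cross-queue events through a mutex-protected inbox is an assumption chosen to keep the common local path free of locks and atomics.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>
#include <stdexcept>

typedef std::uint64_t Tick;

struct Event
{
    Tick when;  // time at which the event must be processed
    int id;     // hypothetical payload, only used to tell events apart
};

// Hypothetical per-thread event queue (a sketch, not M5's implementation).
class SketchEventQueue
{
  public:
    SketchEventQueue() : curTick(0) {}

    // Local scheduling by the owning thread: no lock, no atomics.
    // Events are kept sorted by time, so this inserts into the middle.
    void schedule(const Event &ev)
    {
        // T1 (ev.when) behind T2 (curTick) is the causality error.
        if (ev.when < curTick)
            throw std::logic_error("causality error: event is in the past");
        std::list<Event>::iterator pos = pending.begin();
        while (pos != pending.end() && pos->when <= ev.when)
            ++pos;
        pending.insert(pos, ev);
    }

    // Cross-queue scheduling from another thread: assumed rare, so a
    // mutex-protected inbox is used instead of a lockless structure.
    void scheduleRemote(const Event &ev)
    {
        std::lock_guard<std::mutex> guard(inboxLock);
        inbox.push_back(ev);
    }

    // Owning thread: merge remote events, then process the earliest one.
    bool serviceOne(Event &out)
    {
        drainInbox();
        if (pending.empty())
            return false;
        out = pending.front();
        pending.pop_front();
        curTick = out.when;
        return true;
    }

    Tick now() const { return curTick; }

  private:
    void drainInbox()
    {
        std::list<Event> batch;
        {
            std::lock_guard<std::mutex> guard(inboxLock);
            batch.swap(inbox);
        }
        for (std::list<Event>::iterator it = batch.begin();
             it != batch.end(); ++it)
            schedule(*it);  // throws if the remote event arrived too late
    }

    Tick curTick;
    std::list<Event> pending;  // touched only by the owning thread
    std::mutex inboxLock;      // protects inbox only
    std::list<Event> inbox;
};
```

In this sketch a remote event whose time is already behind the receiving queue's curTick is only detected, not prevented; keeping one queue from running too far ahead of another would need a separate synchronization scheme.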
At some point we're going to have to discuss which systems we're willing to
support multi-threaded M5 on and which we're not. TLS is great in some cases
(e.g. curTick); however, __thread-style TLS isn't supported on Mac OS X. To
get similar functionality there, pthread_getspecific() is used, but it's
definitely uglier and probably slower on Linux than __thread. Is
multithreaded support on OS X a priority? For development it might be sort
of nice, but I don't see us having a farm of OS X machines available for
simulations in the near future, so I don't think so.
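
To make the comparison concrete, here is a minimal sketch of a per-thread curTick done both ways. The accessor names (getCurTickTls, getCurTickPosix and so on) are hypothetical and not taken from the M5 source; __thread is the GCC/Clang extension, and the pthread calls are standard POSIX.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

typedef std::uint64_t Tick;

// Variant 1: compiler-supported TLS. Each thread gets its own copy and
// access is an ordinary variable read or write.
static __thread Tick curTickTls = 0;

inline Tick getCurTickTls() { return curTickTls; }
inline void setCurTickTls(Tick t) { curTickTls = t; }

// Variant 2: POSIX thread-specific data. The value lives on the heap and
// is found through a process-wide key on every access.
static pthread_key_t curTickKey;
static pthread_once_t curTickKeyOnce = PTHREAD_ONCE_INIT;

// Runs when a thread that owns a slot exits.
static void destroyTick(void *p) { delete static_cast<Tick *>(p); }

static void makeCurTickKey()
{
    pthread_key_create(&curTickKey, destroyTick);
}

// Returns this thread's slot, allocating it on first use.
static Tick *curTickSlot()
{
    pthread_once(&curTickKeyOnce, makeCurTickKey);
    Tick *slot = static_cast<Tick *>(pthread_getspecific(curTickKey));
    if (!slot) {
        slot = new Tick(0);
        pthread_setspecific(curTickKey, slot);
    }
    return slot;
}

inline Tick getCurTickPosix() { return *curTickSlot(); }
inline void setCurTickPosix(Tick t) { *curTickSlot() = t; }
```

The second variant needs a key, a one-time initializer, a destructor and a function call per access, which is the extra ugliness (and likely the extra cost) being weighed here.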

Generally, I see the same thing for instructions like CMPXCHG16B. If it
makes things easier to do, or provides a large speedup, then I'm pretty much
for it. I would rather have the common case, the hardware that people are
actually going to run simulations on, be fast, at the expense of hardware
that we're not going to run on (at least not multithreaded), than give that
up just for the sake of having unused support.

On machines without CMPXCHG16B we could use a mutex, but if the machine has
it, and using it makes it significantly easier to be correct and fast, I'm
for it.
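
A rough sketch of that fallback, assuming GCC or Clang: the compiler defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 when a 16-byte compare-and-swap is available (on x86-64, typically when building with -mcx16), and otherwise the same interface is backed by a mutex. TaggedPtr and TaggedPtrCell are hypothetical names for illustration, not anything in M5.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical 16-byte value: pointer bits plus a version tag, the usual
// payload for a double-width compare-and-swap.
struct TaggedPtr
{
    std::uint64_t ptrBits;
    std::uint64_t tag;
};

class TaggedPtrCell
{
  public:
    // Atomically replaces the stored value with 'desired' if it currently
    // equals 'expected'. Returns true if the swap happened.
    bool compareAndSwap(TaggedPtr expected, TaggedPtr desired)
    {
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
        // Fast path: a single 16-byte CAS (CMPXCHG16B on x86-64).
        return __sync_bool_compare_and_swap(&bits, pack(expected),
                                            pack(desired));
#else
        // Fallback: same semantics, serialized by a mutex.
        std::lock_guard<std::mutex> guard(lock);
        if (value.ptrBits != expected.ptrBits || value.tag != expected.tag)
            return false;
        value = desired;
        return true;
#endif
    }

    // Returns a consistent snapshot of the stored value.
    TaggedPtr load()
    {
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
        // A CAS of 0 -> 0 never changes the value but returns it atomically.
        return unpack(__sync_val_compare_and_swap(&bits, Bits(0), Bits(0)));
#else
        std::lock_guard<std::mutex> guard(lock);
        return value;
#endif
    }

  private:
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
    typedef unsigned __int128 Bits;

    static Bits pack(TaggedPtr v)
    {
        return (static_cast<Bits>(v.tag) << 64) | v.ptrBits;
    }

    static TaggedPtr unpack(Bits b)
    {
        TaggedPtr v;
        v.ptrBits = static_cast<std::uint64_t>(b);
        v.tag = static_cast<std::uint64_t>(b >> 64);
        return v;
    }

    alignas(16) Bits bits = 0;  // CMPXCHG16B needs 16-byte alignment
#else
    std::mutex lock;
    TaggedPtr value = TaggedPtr();  // zero-initialized
#endif
};
```

Callers see the same compareAndSwap()/load() interface either way, so only the build flags decide which path is taken.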

Ali

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
