Hello again,

It's been a while since my last email. I've made some progress in getting parts of m5 to run in parallel, but I've reached a critical phase and have some questions that I'm sure some of the people reading can help me with.

My main objective has always been to have the CPU cores and their private caches simulated in parallel, along with the rest of the shared memory. The easiest way to do this is to place an element between port interfaces that handles concurrency: it forwards member calls made on the port on one side to the peer port on the other side by means of a remotely processed event, much like a remote procedure call. The problem is that, with the exception of functional calls, these procedures have a return value. The only generic way to maintain consistency is to block the call on one thread and wait for the return value from the other thread. However, if two calls from opposing ports are executed at the same time, you get a deadlock: each thread is blocked waiting for the other to process its call. This is clearly a big problem, since in general you can't interrupt one of the calls.

This makes parallel atomic calls between private caches pretty much impossible unless I rewrite the port interface and the implementing classes, or have some guarantee that the state of the objects remains consistent if, for example, one atomic call is executed while another is blocking.

Timing calls are interesting. They return a value as well, but it only signals point-to-point acceptance. So in theory, when a deadlock occurs I could simply return false and send a retry a few ticks later, after which the call would start over. However, the semantics of "false" and its effect on the simulation are unclear to me. Could this affect the accuracy, or even the functional correctness, of the simulation? It might even be possible to return true when a deadlock is detected and handle the retry separately in case the remote end returns false, which would be more efficient. If timing calls can't be parallelized this way either, I'm going to have to recode some large, complicated chunks of m5, so I'm hoping it won't come to that.

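To make the timing case concrete, here is a minimal Python sketch of the return-false-and-retry idea. It is not m5 code: `Endpoint`, `send_timing`, `recv_timing`, `service_inbox` and `simulate` are hypothetical names, and a shared lock stands in for deadlock detection (a side that finds the opposing call already in flight refuses instead of blocking). It says nothing about how a refusal would affect simulated timing.

```python
import queue
import threading
import time


class Endpoint:
    """One side of a hypothetical bridge, driven by a single thread."""

    def __init__(self, link):
        self.link = link            # shared lock: held while a call is in flight
        self.peer = None
        self.inbox = queue.Queue()  # remote events posted by the peer thread
        self.done = threading.Event()
        self.received = []

    def recv_timing(self, pkt):
        # Runs on the receiving side's own thread; True means "accepted".
        self.received.append(pkt)
        return True

    def send_timing(self, pkt):
        # If the opposing call is already in flight, blocking here would
        # deadlock, so refuse and let the caller retry later.
        if not self.link.acquire(blocking=False):
            return False
        try:
            reply = queue.Queue(maxsize=1)
            self.peer.inbox.put((pkt, reply))
            return reply.get()      # block until the peer thread answers
        finally:
            self.link.release()

    def service_inbox(self):
        # Execute every remote event the peer has posted so far.
        while True:
            try:
                pkt, reply = self.inbox.get_nowait()
            except queue.Empty:
                return
            reply.put(self.recv_timing(pkt))

    def run(self, packets):
        pending = list(packets)
        while pending or not self.peer.done.is_set():
            self.service_inbox()
            if pending and self.send_timing(pending[0]):
                pending.pop(0)
            else:
                time.sleep(0.0005)  # stand-in for "retry a few ticks later"
            if not pending:
                self.done.set()


def simulate(packets_a, packets_b, timeout=10.0):
    """Run both sides concurrently; return what each side received."""
    link = threading.Lock()
    a, b = Endpoint(link), Endpoint(link)
    a.peer, b.peer = b, a
    threads = [
        threading.Thread(target=a.run, args=(packets_a,), daemon=True),
        threading.Thread(target=b.run, args=(packets_b,), daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    if any(t.is_alive() for t in threads):
        raise RuntimeError("threads did not finish: possible deadlock")
    return a.received, b.received
```

In this sketch a refused packet stays at the head of the sender's list, so each side still receives the other's packets in order. The refusing side goes back to servicing its inbox, which is what lets the opposing call complete.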
Assuming the problem above can be solved, the question remains whether parts of the memory system can safely run concurrently at all, since I'm guessing pointers to data are shared between MemObjects. In theory the cache coherence protocol should prevent concurrent, incoherent reads and writes, but I don't know the code well enough to be sure.

Thanks in advance,
Stijn
