>> I already have an idea of how m5 could safely simulate disjoint
>> collections of SimObjects of a configuration on different eventqueues in
>> parallel. The guarantee m5 would need to have, to let two collections
>> simulate out of sync safely, is that objects from different collections
>> only interact through events that can only affect the targeted object.
>> We would then have disjoint collections of SimObjects where the only
>> references an object from one collection can have to objects from
>> another are those of a safe type which only allow event communication.
>> The python initialization code could then check the SimObjects and their
>> references and partition the SimObjects in such collections. It could
>> then automatically assign eventqueues or check if the existing
>> assignment of eventqueues is safe. If key SimObjects in m5 are then
>> modified to interact through the safe type, large sections of the
>> simulator could safely run concurrently. Memory ports would be good
>> candidates for this.
>
> I think this is very realistic. I think that (at least for the
> short/medium term), we should assume that the system is not parallel
> unless we're actually running a simulation. This will make it very
> easy for us to deal with configuration, statistics dumping,
> checkpointing, object switching, and other things that aren't
> performance critical.
>
> Assuming this, there are a few global objects in M5 that we will have
> to deal with, but initially, we can just use reader/writer locks on
> them. I know of at least the instruction cache (and the event queue
> itself of course). Ali, wasn't there a document somewhere that
> spelled some of this out? Perhaps we could put this on the wiki.

I don't think reader/writer locks will be appropriate for most things.
Either lock-free data structures or replication will probably provide
higher performance in most cases. The list of things that need some sort
of solution is:

FastAlloc
RefCount
StaticInst::decode()
EventQueue

FastAlloc is tricky since the pools need some protection. Replication
doesn't seem to immediately work, since an object can be allocated on one
core and passed via an event to another core, where it is deallocated.
Adding locks is probably more expensive than not using FastAlloc at all.

RefCount: atomic_inc/atomic_dec is probably the best bet, since a
reference count update could happen on any core.

StaticInst::decode(): Perhaps replication would be a good place to start.
I think the structure is just accessed too much to have any kind of
locking.

EventQueue: lock-free?

Ali
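
For the RefCount item, here is a minimal sketch of what atomic
increment/decrement could look like. It assumes a compiler with C++11
std::atomic (the atomic_inc/atomic_dec primitives mentioned above would
play the same role). The names AtomicRefCounted and Payload are
hypothetical and are not m5's actual RefCounted interface.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical base class for illustration; not m5's RefCounted.
class AtomicRefCounted
{
  public:
    AtomicRefCounted() : count(0) {}
    virtual ~AtomicRefCounted() {}

    // The caller already holds a reference, so a relaxed increment
    // is enough here.
    void incref() { count.fetch_add(1, std::memory_order_relaxed); }

    // Release/acquire ordering makes writes from other threads
    // visible before the last owner deletes the object.
    void decref()
    {
        int prev = count.fetch_sub(1, std::memory_order_acq_rel);
        assert(prev > 0);  // catch an unbalanced decref
        if (prev == 1)
            delete this;
    }

    int refs() const { return count.load(std::memory_order_acquire); }

  private:
    std::atomic<int> count;
};

// Hypothetical payload that records its own destruction.
struct Payload : public AtomicRefCounted
{
    explicit Payload(bool *flag) : destroyed(flag) {}
    ~Payload() override { *destroyed = true; }
    bool *destroyed;
};
```

With this shape, an object could be created on one core, handed to
another through an event, and released there, without a lock around
the count. Whether the atomic operations are cheap enough on the hot
paths would still need measuring.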
