On Fri, 12 Nov 2010, Steve Reinhardt wrote:


Right now I am profiling with MOESI_hammer as the coherence protocol. I am
thinking of profiling with a different protocol to make sure that the result
is not an artifact of the protocol in use.


That sounds like a good idea.

All in all, we would ideally like to both speed up individual calls and
reduce the number of calls. IIRC, gprof indicated that findTagInSet() was
called 4-5X more frequently than there were cache accesses, which makes no
sense to me; it seems like a typical cache hit should only require a single
tag lookup.
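
One way to sanity-check that ratio outside the simulator is a toy model in which every access performs exactly one tag lookup, so that any measured ratio above 1.0 has to come from extra callers. The sketch below is only an illustration under that assumption: ToyCache, access() and lookupsPerAccess() are hypothetical names, the miss handling always fills way 0, and none of this is the actual Ruby cache code. Only the name findTagInSet() is borrowed from the discussion above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy set-associative cache (NOT Ruby's implementation).
// It counts accesses and tag lookups separately so the two can be compared.
struct ToyCache {
    struct Entry {
        bool valid = false;
        std::uint64_t tag = 0;
    };

    std::size_t numSets;
    std::size_t assoc;
    std::vector<Entry> entries;    // numSets * assoc entries, grouped by set
    std::uint64_t accesses = 0;    // number of calls to access()
    std::uint64_t tagLookups = 0;  // number of calls to findTagInSet()

    ToyCache(std::size_t sets, std::size_t ways)
        : numSets(sets), assoc(ways), entries(sets * ways) {
        assert(sets > 0 && ways > 0);
    }

    // Returns the way that holds `tag` in `set`, or -1 if it is not present.
    int findTagInSet(std::size_t set, std::uint64_t tag) {
        ++tagLookups;
        for (std::size_t way = 0; way < assoc; ++way) {
            const Entry &e = entries[set * assoc + way];
            if (e.valid && e.tag == tag)
                return static_cast<int>(way);
        }
        return -1;
    }

    // Services one access with a single tag lookup. On a miss the line is
    // placed in way 0 (assumption: no real replacement policy is modelled).
    bool access(std::uint64_t lineAddr) {
        ++accesses;
        const std::size_t set = lineAddr % numSets;
        const std::uint64_t tag = lineAddr / numSets;
        if (findTagInSet(set, tag) >= 0)
            return true;
        Entry &victim = entries[set * assoc];
        victim.valid = true;
        victim.tag = tag;
        return false;
    }

    // Average number of tag lookups per access; 1.0 in this model.
    double lookupsPerAccess() const {
        return accesses ? static_cast<double>(tagLookups) / accesses : 0.0;
    }
};
```

In this model lookupsPerAccess() stays at 1.0 for hits and misses alike, which is the baseline the 4-5X figure would be compared against.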

Would this not be expected when a multiprocessor system is being simulated? I am not aware of the configuration that ruby_fs.py uses.



Another thing to keep in mind is that typical programs really have
very high cache hit rates, so another approach is to look at what happens in
the process of servicing an L1 cache hit and optimize that path as much as
possible.
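
As a rough illustration of why the hit path matters, the share of total cache-servicing work that goes to hits can be estimated from the hit rate and the relative cost of a hit and a miss. This is only a sketch: hitPathShare() is a hypothetical helper, and the costs are whatever unit of simulator work one chooses to plug in, not measured values.

```cpp
#include <cassert>

// Fraction of total cache-servicing work spent on hits, given a hit rate in
// [0, 1] and hypothetical per-hit and per-miss costs in the same unit.
double hitPathShare(double hitRate, double hitCost, double missCost) {
    assert(hitRate >= 0.0 && hitRate <= 1.0);
    const double hitWork = hitRate * hitCost;
    const double missWork = (1.0 - hitRate) * missCost;
    const double total = hitWork + missWork;
    return total > 0.0 ? hitWork / total : 0.0;
}
```

The higher the hit rate, the larger this share becomes for any fixed cost ratio, which is the argument for looking at the L1 hit path first.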

Steve


--
Nilay
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
