On Sat, February 9, 2013 3:54 pm, Steve Reinhardt wrote:
> It's important to remember that we already do have two levels of caching:
> the PC-based page cache and the backing hash table. I think making the
> PC-based page cache lock-free should be pretty straightforward. Since the
> hash-table-based decode cache is only accessed on a page cache miss, maybe
> just using a lock-based hash table like concurrent_unordered_map would be
> OK.
>
> I hadn't thought about the challenge of resizing a hash table; I was mostly
> thinking about how easy it should be to append new entries to the end of a
> hash chain in a lock-free manner. Note that we never delete anything from
> the hash table, so that's not an issue. The speculative unlocked lookup
> followed by a locked try-again-and-insert-if-needed makes sense; I think a
> readers-writer lock is probably overkill in that case. Another option (if
> we really do want a lock-free hash table) would be just to preallocate a
> reasonable number of buckets and not allow the table to grow.
>
> I'll repeat my initial statement: this whole discussion is probably
> premature, since we don't really have a general parallel simulator yet.
> When we do get to the point of optimizing the cache, if there is still
> disagreement about the right approach, we should start by instrumenting
> the code and getting a feel for the typical miss rates and sizes. I
> wouldn't be surprised if we find that, once we get past the initial
> startup transient, as long as the PC-based page cache hits are fast, the
> rest of it really doesn't matter.
>
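[For reference, the "speculative unlocked lookup followed by a locked try-again-and-insert" pattern Steve describes might look roughly like the sketch below. This is illustrative only, not actual gem5 code; `DecodeCacheSketch` and `lookupOrInsert` are made-up names. The unlocked `find()` is only safe under the assumptions stated in the thread: entries are never deleted, and readers never race with a rehash (e.g. buckets are preallocated so the table does not grow).]

```cpp
#include <mutex>
#include <unordered_map>

// Hypothetical sketch: fast-path unlocked lookup, slow-path locked
// re-check and insert. Assumes no deletions and no concurrent rehash.
template <typename Key, typename Value>
class DecodeCacheSketch
{
  public:
    // Return the cached value for key, calling decode(key) on a miss.
    template <typename DecodeFn>
    Value &lookupOrInsert(const Key &key, DecodeFn decode)
    {
        // Speculative, unlocked lookup: hits take the fast path.
        auto it = map.find(key);
        if (it != map.end())
            return it->second;

        // Miss: take the lock, re-check (another thread may have
        // inserted the entry meanwhile), and insert only if still absent.
        std::lock_guard<std::mutex> guard(mutex);
        auto locked_it = map.find(key);
        if (locked_it != map.end())
            return locked_it->second;
        return map.emplace(key, decode(key)).first->second;
    }

  private:
    std::unordered_map<Key, Value> map;
    std::mutex mutex;
};
```

Because nothing is ever removed, a reference returned from the fast path stays valid for the lifetime of the cache, which is what makes the unlocked read plausible in the first place.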
Steve,

First, do you want to tackle the decode cache after we have parallelized the message buffers and ports?

Second, we use the PC of the currently executing thread to get a page from the decode cache. Since the PC is a virtual address, why are we not required to carry out a translation first?

--
Nilay

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
