On Tue, Apr 24, 2012 at 1:29 AM, Gabe Black <[email protected]> wrote:

> On 04/23/12 14:36, Steve Reinhardt wrote:
> > Great, sounds like we're pretty much thinking along the same lines.
> >
> > On Mon, Apr 23, 2012 at 11:47 AM, Gabe Black <[email protected]> wrote:
> >
> >> We'll have to see how it performs, I guess. One nice thing is that the
> >> page cache indirectly finds the same instruction in different places.
> >
> > You mean the decoder cache, not the page cache, right?
>
> Yeah, I think so. Conceptually they're two different things, but their
> implementations are intertwined so I don't really think of them as
> separate.
>

It's easy for me to think of them as separate because the state-based
decoder cache has been there forever (and I may have written it myself),
while the page-based cache was grafted in front of it later and I never
really got familiar with that code.

In any case, we should think of them separately at the design level, since
odds are good we want to treat them differently.


> > Also, if a significant fraction of hits are absorbed by the page cache
> > (which I expect is true), then I don't see a benefit from having separate
> > state-based caches vs. just having a single cache that has the full
> > context as part of the key.  Ideally the hash function should do a good
> > job of dealing with the contextual state, and if lookups aren't extremely
> > frequent then the overhead of calculating the hash on the larger state
> > shouldn't be a big deal.
>
> There are two reasons not to make context part of the key. First, you
> have to keep copying it into your ExtMachInst over and over and over,
> even though it's always the same thing. Second, you have to have a more
> complex hash function, and/or one that does more work, to look up only
> keys that match the context, when you're excluding the same possible
> matches over and over and over. If you make the context implicit in
> which cache you pick, or in the fact that nothing is in the cache except
> things that are compatible, you can just ignore the context. That makes
> things easier on every lookup, and lookups happen a lot.
>

That's why I said "if a significant fraction of hits are absorbed by the
page cache"... if you were doing this on every instruction, the issues you
bring up would matter, but I'm not convinced they matter if you only do it
on a page cache miss.
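To make the tradeoff concrete, here's a rough sketch of the single fully-keyed cache being discussed (all names here are made up for illustration, not gem5's): the context only gets hashed when the lookup actually happens, so if the page cache absorbs most hits, the extra hashing cost is paid rarely.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>

// Stand-in for a decoded instruction.
struct StaticInst { uint64_t opcode; };

// Key carrying the full context alongside the raw instruction, so the
// cache never needs to be flushed on a context change.
struct Key {
    uint64_t context;   // operand size, operating mode, etc., packed
    uint64_t machInst;  // raw instruction bytes
    bool operator==(const Key &o) const {
        return context == o.context && machInst == o.machInst;
    }
};

struct KeyHash {
    size_t operator()(const Key &k) const {
        // Cheap mix of both fields; a real hash would do better.
        return std::hash<uint64_t>()(
            k.context * 0x9e3779b97f4a7c15ULL ^ k.machInst);
    }
};

using DecodeCache = std::unordered_map<Key, StaticInst, KeyHash>;
```

The same machine instruction under two different contexts lands in two distinct entries, so a context change never invalidates anything.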


> >>>> What I'm planning to do is to keep track of how many and what bytes
> >>>> were at a particular PC, with whatever contextualizing state like
> >>>> operand size, operating mode, etc. When an instruction is being fed
> >>>> into the predecoder, it will just check to see if the first n bytes
> >>>> are the same, and if so skip all the way to the static inst. If they
> >>>> aren't, or if the contextualizing state changed and the cache was
> >>>> thrown out, then it falls back to the existing mechanism.
> >>>>
> >>> This is just a minor extension to the current decode page cache, right?
> >> Conceptually minor, but I'm still working out a way to tease the decode
> >> cache apart enough that it can be adjusted like that without making a
> >> mess. I haven't spent a lot of time on it yet though, so it may just
> >> take a little more thought.
> >>
> > OK, sounds good.  Dealing with context in the page cache seems like a
> > more interesting problem.  For example, I could see having multiple
> > page caches indexed by context and swapping them in and out on context
> > changes to avoid having to check the context on every access.
>
> Yeah, this is the sort of thing I'm thinking of for the pre-predecoder
> cache too.
>

Hmm, we may be talking past each other... when I say "page cache", I mean
the pre-predecoder cache.

To summarize, right now what I am envisioning is:

page cache --> predecoder --> state-based cache --> decoder

Are you thinking of something different?
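In code, the lookup order I'm picturing would be roughly the following (a sketch only; none of these names or the stand-in predecode step are gem5's actual implementation):

```cpp
#include <cstdint>
#include <unordered_map>

struct StaticInst { uint64_t opcode; };

struct Decoder {
    // Fast path: keyed directly by PC, consulted before any predecoding.
    std::unordered_map<uint64_t, StaticInst> pageCache;
    // Slower path: keyed by the predecoded machine instruction.
    std::unordered_map<uint64_t, StaticInst> stateCache;

    // Stand-in for the real predecoder, which assembles instruction
    // bytes into an ExtMachInst.
    uint64_t predecode(uint64_t pc) { return pc ^ 0xff; }

    StaticInst decode(uint64_t pc) {
        auto pHit = pageCache.find(pc);
        if (pHit != pageCache.end())
            return pHit->second;              // 1. page cache hit
        uint64_t machInst = predecode(pc);    // 2. predecoder
        auto sHit = stateCache.find(machInst);
        StaticInst inst = (sHit != stateCache.end())
            ? sHit->second                    // 3. state-based cache hit
            : StaticInst{machInst};           // 4. full decode
        stateCache[machInst] = inst;
        pageCache[pc] = inst;                 // fill both caches
        return inst;
    }
};
```

The point is just the ordering: a page-cache hit skips the predecoder and the state-based cache entirely, which is why the cost of those later stages only matters on a page-cache miss.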


> > In general our memory overhead is pretty low, so my inclination would
> > be to just keep all the decoded instructions around.  I'm guessing that
> > whatever context you think is slowly or rarely changing, there's
> > probably some pathological case where it changes faster than you think
> > it should.  In addition, if we have a state-based cache that just uses
> > the full context+machine instruction as an index, as I proposed above,
> > there's never a need to flush it.  Depending on how the page cache is
> > handled, you may want to limit what you keep there, but even in that
> > case I'd be biased toward keeping everything just for simplicity's sake.
>
> My concern is a sparsely populated array, for instance. I haven't looked
> at the actual numbers, but if we have, say, 100,000 possible context
> sets, then we'd have maybe 99,995 unused caches/cache pointers/whatever.
> Maybe a context-indexed hash as a backing store for the caches would
> eliminate this problem. Actually I think that would probably work out
> pretty well.
>

Yeah, I agree; if the context is sparse, then another hash_map is called for.
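Something along these lines, say (a hypothetical sketch; the names are invented here, not taken from gem5): the outer hash map only ever holds contexts that were actually seen, and a context switch pays the outer lookup once, after which decodes go straight to the swapped-in cache.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>

struct StaticInst { uint64_t opcode; };

// Per-context instruction cache; context is implicit in which cache
// you're holding, so it never appears in the key.
using InstCache = std::unordered_map<uint64_t, StaticInst>;

struct ContextCaches {
    // Backing store: only contexts that have actually occurred get a
    // cache, so 100,000 possible contexts don't mean 100,000 slots.
    std::unordered_map<uint64_t, std::unique_ptr<InstCache>> byContext;
    InstCache *current = nullptr;

    void setContext(uint64_t ctx) {
        auto &slot = byContext[ctx];
        if (!slot)
            slot = std::make_unique<InstCache>();
        current = slot.get();  // swap in; later lookups skip the outer map
    }
};
```

Old contexts keep their caches, so flipping back to a previously seen context costs one outer-map lookup rather than a rebuild.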

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev