Hi folks, specifically ARM folks. We've been seeing a problem with O3 when
switching vector register renaming modes (full vectors vs. vector
elements). At the switch, the CPU checks its bookkeeping and finds that a
vector register is missing: with no instructions in flight, the free list
holds one fewer register than it should, i.e. one fewer than the total
number of physical vector registers minus the number taken up by
architectural state.
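
To make the invariant concrete, here is a minimal C++ sketch of the check
as described above. The struct, its field names, and the helper functions
are hypothetical stand-ins for illustration and are not gem5's actual
classes or members.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified view of the vector register bookkeeping.
// None of these names come from gem5; they only illustrate the invariant.
struct VecRegBookkeeping
{
    std::size_t numPhysVecRegs;  // total physical vector registers
    std::size_t numArchVecRegs;  // registers holding architectural state
    std::size_t numFreeVecRegs;  // current length of the free list
};

// With no instructions in flight, every physical register should either
// hold architectural state or sit on the free list.
inline bool
freeListIsConsistent(const VecRegBookkeeping &b)
{
    return b.numFreeVecRegs == b.numPhysVecRegs - b.numArchVecRegs;
}

// Number of registers unaccounted for (0 when the bookkeeping is right).
inline long long
missingRegs(const VecRegBookkeeping &b)
{
    return static_cast<long long>(b.numPhysVecRegs) -
           static_cast<long long>(b.numArchVecRegs) -
           static_cast<long long>(b.numFreeVecRegs);
}
```

With made-up sizes of 128 physical and 32 architectural registers, a free
list of 95 entries would report one missing register, which is the symptom
described above.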

This problem has been somewhat difficult to reproduce, although we can get
it to happen, and it happens often enough that it's been a real pain for
us. Since it's hard to trigger and therefore hard to observe, I've been
digging around in the code trying to understand what all the pieces do and
why the bookkeeping might be wrong.

The most promising thing I've found so far is that when squashing, the
rename stage looks at its history and rolls back renames for squashed
instructions. Some registers are fixed and not renamed, so rolling back
those would be pointless. Also those registers should not go on the free
list.

The way O3 detects those special registers is that they have the same index
before and after renaming. If that is the case, O3 ignores those entries,
and does not roll them back or mark their target as free.
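
As a rough illustration of that check, here is a toy C++ model of the
rollback. It is a sketch under my own assumptions: the types, the
youngest-first history ordering, and all names are hypothetical and are not
taken from the O3 source.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <vector>

// One rename history entry (hypothetical layout).
struct ToyHistEntry
{
    unsigned long seqNum;  // instruction sequence number
    int arch;              // architectural register
    int newPhys;           // physical register after renaming
    int prevPhys;          // physical register before renaming
};

struct ToyRename
{
    std::map<int, int> renameMap;      // arch -> phys
    std::vector<int> freeList;         // free physical registers
    std::deque<ToyHistEntry> history;  // youngest entry at the front

    // Undo every rename younger than 'youngestKept'.
    void
    squash(unsigned long youngestKept)
    {
        while (!history.empty() &&
               history.front().seqNum > youngestKept) {
            const ToyHistEntry &h = history.front();
            // Same index before and after: treated as a fixed register,
            // so it is neither rolled back nor put on the free list.
            if (h.newPhys != h.prevPhys) {
                renameMap[h.arch] = h.prevPhys;
                freeList.push_back(h.newPhys);
            }
            history.pop_front();
        }
    }
};

struct ToySquashResult { std::size_t freeCount; int physArch0; int physArch1; };

// Squash one ordinary rename (arch 0: 5 -> 11) and one fixed-register
// entry (arch 1: 6 -> 6), then report the resulting state.
inline ToySquashResult
demoSquashAll()
{
    ToyRename r;
    r.renameMap = {{0, 11}, {1, 6}};
    r.freeList = {10};
    r.history.push_front({1, 0, 11, 5});
    r.history.push_front({2, 1, 6, 6});
    r.squash(0);
    return {r.freeList.size(), r.renameMap.at(0), r.renameMap.at(1)};
}
```

In this toy run, only register 11 is returned to the free list; the
fixed-register entry leaves both the map and the free list untouched.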

This check is slightly out of date though, since with the recently added
pinned register writes, a register will be renamed to the same physical
register several times in a row. When these entries are checked, they will
not be rolled back (I think this part is still fine), but they will also
not be marked as free.

This isn't exactly a smoking gun though, since the more I think about it,
the more I think this may actually be OK. If one of the later writes is
squashed, the register isn't "free", since it still holds the (partially
written) architectural state. If everything gets squashed all the way back
to the first entry, which did change which register is used, then the
slightly outdated check won't skip that entry and things should be freed up
correctly (I think).
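
To sanity check that reasoning, here is a small C++ toy model of a pinned
write sequence going through the same "same index before and after" check.
It is only a sketch under assumptions: the model, the 'reusePinned' flag,
and every name in it are hypothetical and do not reflect how O3 actually
tracks pinned writes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <vector>

struct PinnedHistEntry
{
    unsigned long seqNum;
    int arch;
    int newPhys;
    int prevPhys;
};

struct PinnedRenameModel
{
    std::map<int, int> renameMap;         // arch -> phys
    std::vector<int> freeList;            // free physical registers
    std::deque<PinnedHistEntry> history;  // youngest entry at the front

    // Rename a destination. When 'reusePinned' is set (assumption: the
    // caller knows this is a later write in a pinned sequence), the
    // current mapping is kept instead of allocating a new register.
    int
    renameDest(unsigned long seqNum, int arch, bool reusePinned)
    {
        const int prev = renameMap.at(arch);
        int next = prev;
        if (!reusePinned) {
            next = freeList.back();
            freeList.pop_back();
        }
        renameMap[arch] = next;
        history.push_front({seqNum, arch, next, prev});
        return next;
    }

    // Same rollback rule as the existing check: skip entries whose
    // register index did not change.
    void
    squash(unsigned long youngestKept)
    {
        while (!history.empty() &&
               history.front().seqNum > youngestKept) {
            const PinnedHistEntry &h = history.front();
            if (h.newPhys != h.prevPhys) {
                renameMap[h.arch] = h.prevPhys;
                freeList.push_back(h.newPhys);
            }
            history.pop_front();
        }
    }
};

struct PinnedScenarioResult { std::size_t freeCount; int physForArch0; };

// Three writes to arch reg 0: the first allocates a register, the next
// two reuse it. Then squash everything younger than 'youngestKept'.
inline PinnedScenarioResult
runPinnedScenario(unsigned long youngestKept)
{
    PinnedRenameModel m;
    m.renameMap[0] = 5;
    m.freeList = {10, 11};
    m.renameDest(1, 0, false);
    m.renameDest(2, 0, true);
    m.renameDest(3, 0, true);
    m.squash(youngestKept);
    return {m.freeList.size(), m.renameMap.at(0)};
}
```

In this model, squashing only the later writes (youngestKept = 1) leaves
the pinned register mapped and off the free list, and squashing back past
the first write (youngestKept = 0) restores the old mapping and frees the
register. That matches the argument above, but the model deliberately does
not track how many pinned writes remain, so it says nothing about whether
that count stays correct after a squash.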

This code is mostly new to me though, so I'm not super confident making any
grand declarations about what's going on. All the pieces seem to be there
though, which makes me very suspicious.

Maybe something goes wrong if the right number of writes never happens
because later writers get squashed?

Gabe