On Sun, Apr 22, 2012 at 8:15 PM, Nilay Vaish <[email protected]> wrote:
> On Sun, 22 Apr 2012, Steve Reinhardt wrote:
>
>> On Sun, Apr 22, 2012 at 7:20 PM, Nilay Vaish <[email protected]> wrote:
>>
>>> The way we currently recognise the CC register needs to be read, is that
>>> it appears on the RHS of some assignment statement. It seems to me that
>>> this would remain true even after the register is split. The expression
>>> will just change to something like
>>>     zaps = genZaps(zaps, <something>);
>>>
>>> The isa parser would then mark zaps as both source and destination. And
>>> the two add instructions would still not execute in parallel, as the
>>> second one would be dependent on the first for the value of the zaps
>>> register.
>>>
>>> We can either --
>>> a. drop the default assumption that we need to make partial updates,
>>>    and handle partial update as a special case.
>>> b. keep the default assumption that we need to make partial updates,
>>>    and handle full update as a special case.
>>>
>>> Current patches are along the lines of the second option.
>>>
>> I'm jumping in partially informed, but can we just have two functions,
>> like:
>>
>>     zaps = setZaps(<something>);
>>
>> and
>>
>>     zaps = modifyZaps(zaps, <something>);
>>
>> and then let the isa parser do its stuff naturally?
>>
>> Steve
>>
> I don't think that is possible. This code will appear in the .isa file. In
> the .isa file, we cannot decide which version to use as the CC bits to be
> written vary with the context in which the microop is used. So, we need a
> run time condition that figures out tries to evaluate if the register needs
> to be read.

Sorry for being way behind on this, but I'm curious just how many microops
there are that have different impacts on the flags depending on their
context, and how many different contexts there are. I got the impression
before that there would be this huge explosion of microops if we actually
had a different ADD micro-op (for example) for each set of bits that it
could possibly write.
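
To make the full-update vs. partial-update distinction concrete, here is a
rough standalone C++ sketch. It is not gem5 code: setZaps and modifyZaps are
just the hypothetical names from my earlier mail, the writeMask parameter and
ZapsMask constant are assumptions of mine, and I am assuming "zaps" covers
ZF, AF, PF and SF. Only the bit positions are taken from the x86 RFLAGS
layout.

```cpp
#include <cassert>
#include <cstdint>

// x86 RFLAGS bit positions for the four flags assumed to make up "zaps".
constexpr uint64_t PFBit = 1ULL << 2;
constexpr uint64_t AFBit = 1ULL << 4;
constexpr uint64_t ZFBit = 1ULL << 6;
constexpr uint64_t SFBit = 1ULL << 7;
constexpr uint64_t ZapsMask = ZFBit | AFBit | PFBit | SFBit;

// Full update (hypothetical): the new value depends only on the freshly
// computed flags, so the old register value is never a source operand.
constexpr uint64_t
setZaps(uint64_t computed)
{
    return computed & ZapsMask;
}

// Partial update (hypothetical): only the bits selected by writeMask are
// replaced, so the old register value has to be read as a source.
constexpr uint64_t
modifyZaps(uint64_t old, uint64_t computed, uint64_t writeMask)
{
    return (old & ZapsMask & ~(writeMask & ZapsMask)) |
           (computed & writeMask & ZapsMask);
}
```

With something like setZaps the register is only a destination, so two
back-to-back ADDs would not serialize on it; with modifyZaps it is both a
source and a destination, which is the dependence Nilay describes. The
worry, as I understood it, is that covering every possible write mask with
dedicated variants of the first kind would multiply the number of microops.
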
However, looking at Appendix E of Vol 3 of the AMD ISA manual, it looks like
the set of bits written by each macro-instruction (at least) is pretty well
defined. I can believe that it's also valuable to have an ADD micro-op that
doesn't affect flags for use in microcode sequences. But is there a third
version of ADD we need that modifies some but not all of the flags that the
ADD macroinstruction does?

Basically, if (hypothetically exaggerating) 80% of the macro-ops can modify
five different combinations of flags depending on context, then this complex
mechanism makes sense. On the other hand, if there are a small number of
microops that need two versions (one that modifies a certain set of flags
and one that doesn't), and maybe an even smaller number that legitimately
decide at runtime which flags to look at, then this is starting to feel like
overkill.

I'm sure the truth is somewhere in the middle, but I just don't understand
the code well enough to know which extreme it's closer to... and if it's
close to the former, I'd like to understand why.

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
