I can't say I have a clue about the x86 condition codes; however, for ARM we successfully split the condition codes into groups that were sticky and groups that were not, and finally into subgroups of flags that are written together. In doing so we got the O3 CPU to insert dependencies only between instructions where there are real flag dependencies.
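
As a rough illustration of why the grouping matters, here is a toy Python model. It is not gem5 code: the register names `flags` and `status_flags` and the helpers `chain_depth` and `independent_adds` are made up for this sketch. It assumes renaming removes WAW and WAR hazards so only read-after-write links count, and it relies on the fact that an x86 ADD overwrites all six status flags (OF, SF, ZF, AF, PF, CF), so a register holding only those bits never needs to be read by ADD.

```python
def chain_depth(instrs):
    """Return the length of the longest read-after-write chain.

    Each instruction is a (sources, dests) pair of register-name sets.
    Only true (RAW) dependencies are tracked, on the assumption that
    register renaming has already removed WAW and WAR hazards.
    """
    producer_depth = {}  # register name -> chain depth of its latest writer
    longest = 0
    for srcs, dsts in instrs:
        # An instruction sits one step after its deepest producer.
        depth = 1 + max((producer_depth.get(r, 0) for r in srcs), default=0)
        for r in dsts:
            producer_depth[r] = depth
        longest = max(longest, depth)
    return longest


def independent_adds(n, flag_srcs, flag_dsts):
    """Build n ADDs on disjoint integer registers (a_i = a_i + b_i)."""
    return [
        ({f"a{i}", f"b{i}"} | set(flag_srcs), {f"a{i}"} | set(flag_dsts))
        for i in range(n)
    ]


# One monolithic flags register: a partial update has to read the old
# value to preserve the bits it does not change, so every ADD reads it.
unified = independent_adds(8, flag_srcs={"flags"}, flag_dsts={"flags"})

# Hypothetical split: the bits ADD writes live in their own register,
# which ADD overwrites completely and therefore never reads.
split = independent_adds(8, flag_srcs=set(), flag_dsts={"status_flags"})

print(chain_depth(unified))  # 8: the ADDs form one serial chain
print(chain_depth(split))    # 1: the ADDs are fully independent
```

With the single register the eight ADDs serialize through the flags, which matches the IPC ceiling of one reported below; with the split, the only remaining flag hazard is write-after-write, which renaming handles.
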

Thanks,
Ali

Sent from my ARM powered mobile device

On Apr 6, 2012, at 3:17 PM, Gabe Black <[email protected]> wrote:

> It's complicated. Looking at it again I reminded myself of all the ways
> it doesn't fit into the way the ISA parser does things, so it's going to
> be quite a bit of work to fix properly. I don't have any ideas for how to
> make it much simpler that would be at all practical.
>
> Gabe
>
> On 04/05/12 21:10, Watanabe, Yasuko wrote:
>> Hi Gabe,
>>
>> Do you already have an idea of how to fix this? If so, can you give me some
>> pointers?
>>
>> Yasuko
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf
>> Of Gabe Black
>> Sent: Thursday, April 05, 2012 6:12 PM
>> To: [email protected]
>> Subject: Re: [gem5-dev] Data dependency caused by flags
>>
>> Yes, you guys are right. This is a recognized problem, and I've made some
>> changes over time which should make it easier to do this like a real x86 CPU
>> would. I haven't yet, but it's on the horizon. I tend to be very busy,
>> although circumstances may mean I have a little more or less time than
>> normal for a little while so I don't know for sure when I'll get it fixed.
>> If you have an idea of how to get it to do what you want locally, feel free.
>> That will get you going, and when I get it fixed for real then you can start
>> using that.
>>
>> Gabe
>>
>> On 04/05/12 17:18, Watanabe, Yasuko wrote:
>>> Nilay,
>>>
>>> I agree with you. I think the dependencies of those flag bits should be
>>> evaluated at bit level.
>>>
>>> Gabe and others,
>>>
>>> This change seems invasive. Do you know the best way to handle this?
>>>
>>> Yasuko
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On
>>> Behalf Of Nilay Vaish
>>> Sent: Thursday, April 05, 2012 3:35 AM
>>> To: gem5 Developer List
>>> Subject: Re: [gem5-dev] Data dependency caused by flags
>>>
>>> The code for the function genFlags() in src/arch/x86/insts/microregop.cc
>>> suggests that the values of flag bits not updated by the ADD instruction
>>> need to be retained. This means that the previous values need to be read
>>> and written again, which means the second ADD can be dependent on a value
>>> written by the first ADD. If the dependencies were evaluated at bit level,
>>> then these instructions would not be dependent.
>>>
>>> --
>>> Nilay
>>>
>>> On Thu, 5 Apr 2012, Watanabe, Yasuko wrote:
>>>
>>>> I ran O3 CPU in FS mode in x86 with a simple microbenchmark and got a
>>>> much lower IPC than the theoretical IPC. The issue seems to be data
>>>> dependencies caused by (control) flags, not registers, and I am
>>>> wondering if anyone has come across the same issue.
>>>>
>>>> The microbenchmark has many data independent ADD instructions
>>>> (http://repo.gem5.org/gem5/file/570b44fe6e04/src/arch/x86/isa/insts/general_purpose/arithmetic/add_and_subtract.py#l41)
>>>> in a loop. On a 2-wide out-of-order machine with enough resources,
>>>> the IPC should be two at a steady state. However, the IPC only goes
>>>> up to one. What is happening is that even though the ADDs have two
>>>> source and one destination registers and a flag to set in x86, gem5
>>>> adds one extra flag source register to the ADDs. As a result, each
>>>> ADD becomes dependent on the earlier ADD's destination flag,
>>>> constraining the achievable IPC to one.
>>>>
>>>> Here is an example sequence with physical register mappings:
>>>> ADD: S1=98, S2=9, S3=2, D1=82, D2=105 (flag)
>>>> ADD: S1=92, S2=9, S3=105 (flag), D1=79, D2=90 ...
>>>>
>>>> Physical registers 98, 9, and 92 are ready when those two ADDs are
>>>> renamed; however, as you can see, the second ADD has to wait for the
>>>> first ADD because of the extra flag source register S3. When I
>>>> removed those flags in the macroop definition, the IPC jumped up from 1 to
>>>> 1.7.
>>>>
>>>> Does anyone know why the ADD has to read the flags, even though the
>>>> x86 manual does not say that? Those flags should just cause
>>>> write-after-write dependency, not read-after-write.
>>>>
>>>> Yasuko
>>>>
>>>> _______________________________________________
>>>> gem5-dev mailing list
>>>> [email protected]
>>>> http://m5sim.org/mailman/listinfo/gem5-dev
