Ali Saidi wrote: > On Thu, 19 Aug 2010 15:21:24 -0400, Gabriel Michael Black > <[email protected]> wrote: > >> Quoting Ali Saidi <[email protected]>: >> >> >>> On Tue, 17 Aug 2010 14:19:49 -0400, Gabriel Michael Black >>> <[email protected]> wrote: >>> >>>> Quoting Steve Reinhardt <[email protected]>: >>>> >>>> >>>>> On Sun, Aug 15, 2010 at 6:07 PM, Min Kyu Jeong <[email protected]> >>>>> >>> wrote: >>> >>>>>> I needed to spit out a code that reads from a register, and writes >>>>>> > to > >>> it >>> >>>>>> again. The thing is arch reg indices are renamed (reg renaming and >>>>>> shadow >>>>>> reg file), so many structures are needed to be looked up to find a >>>>>> previous/current physical register for the given arch reg name. >>>>>> > These > >>>>>> indirections are handled in ISA-specific code and encapsulated, and >>>>>> >>> the >>> >>>>>> operand section of the ISA description specifies which one to use. >>>>>> >>>>> I think we're thinking of these things on different levels... the >>>>> operand registers are available via StaticInst::srcRegIdx() and >>>>> destRegIdx(), and can be accessed via (for example) readIntReg() in >>>>> o3/thread_context_impl.hh, none of which are ISA dependent. I don't >>>>> see why you can't build on these calls to handle this entirely within >>>>> the O3 model in an ISA-independent fashion (i.e., a way that would >>>>> generalize to any ISA with universal predication, if we ever run into >>>>> another one). >>>>> >>>> I was actually thinking this could be handled at a level higher than >>>> the parser, namely in the ISA description. >>>> >>>> If the predicate is false, then the instruction still needs to forward >>>> the value from the physical source register to the physical >>>> destination register so future consumers get the right thing. For >>>> renaming to work, it's necessary for all those registers to be listed >>>> as both sources and destinations. By putting a fall through condition >>>> in the instruction as defined by the ISA description that just assigns >>>> registers to themselves (DestReg = DestReg; SrcReg1 = SrcReg1) both of >>>> those conditions are satisfied automatically without the CPUs having >>>> to get any smarter/complex/constrained and without retrofitting the >>>> ISA parser. >>>> >>>> This will mean having to change a lot of instruction templates so that >>>> they have the fall through case and so that the definitions know what >>>> registers to assign to themselves. That's going to be tedious, but >>>> hopefully fairly mechanical to do. >>>> >>>> If the ISA parser were a python framework instead of a parser, then >>>> it's conceivable this sort of thing could be handled automatically and >>>> cleanly by subclassing, replacing default function implementations, >>>> etc. etc., and kept local to ARM. That would give the benefits you >>>> might have been going for by changing the parser (the parser does the >>>> work, fewer bugs...) , but without impacting the complexity seen by >>>> all the other ISAs. That's not going to happen tomorrow or possibly >>>> ever so I hestitated to mention it and don't make any plans around it, >>>> but it's something I would really like to see happen. >>>> >>>> >>> I'm not sure I know why people are so opposed to changing a piece of >>> > code > >>> that is supposed to make creating an isa description easier, not more >>> difficult. With the changes all the regression tests still pass, and >>> > the > >>> parser is generating the exact same code for the other ISAs with and >>> without this change. >>> >>> The isa parser already generates 130,000 lines of code per cpu model >>> > and > >>> 75,000 lines of decode. It takes over 1GB of memory and 4 minutes to >>> compile a CPU model. To do this without changing the ISA parser and to >>> allow for out-of-order execution, conditional always and plain >>> conditional >>> instructions would have to be decoded separately, doubling the size of >>> the >>> decoder and cpu models. Additionally, not only would we have to go >>> through >>> the process of changing all the isa descriptions, but the possibility >>> > for > >>> introducing subtle bugs where predicted false instructions wouldn't be >>> guaranteed to properly update their renamed registers. It seems like >>> > the > >>> right answer in this case is to have the isa parser be able to figure >>> > out > >>> what registers need updating when an instruction is predicated false >>> > and > >>> do >>> the update. >>> >>> >> Between the CPU and the parser I think it should be in the parser. >> You're right that it's a mess trying to do it in the ISA description, >> but that's a localized mess. If we continue making the parser more and >> more complicated it turns into a global mess. Fixing problems in the >> parser can be very hard and annoyingly subtle. >> > > I understand that you don't want to make the parser more complex and I > understand the reasons why. My point is that some complexity in the parser > is much better than 10x the complexity somewhere else. If you look at what > is changed in the parser, for the case without reset_code substitution, it > changes very little. The bulk of the changes are the makeReset() blobs, > which are pretty-much boiler plate. In my mind, this is the least bug-laden > approach and any other ISAs with predication can take advantage of the same > mechanism to implement it. > > Thanks, > Ali > _______________________________________________ > m5-dev mailing list > [email protected] > http://m5sim.org/mailman/listinfo/m5-dev >
I'm not happy about it, but I think you may have a compelling enough argument. If Steve is ok with it I'm ok with it. Gabe _______________________________________________ m5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/m5-dev
