Re: [m5-dev] Review Request: ARM: make predicated-false instruction to move data from a old register.

Ali Saidi Thu, 19 Aug 2010 12:30:10 -0700

On Thu, 19 Aug 2010 15:21:24 -0400, Gabriel Michael Black
<[email protected]> wrote:
> Quoting Ali Saidi <[email protected]>:
> 
>>
>> On Tue, 17 Aug 2010 14:19:49 -0400, Gabriel Michael Black
>> <[email protected]> wrote:
>>> Quoting Steve Reinhardt <[email protected]>:
>>>
>>>> On Sun, Aug 15, 2010 at 6:07 PM, Min Kyu Jeong <[email protected]>
>> wrote:
>>>>>
>>>>>
>>>>> I needed to spit out a code that reads from a register, and writes
to
>> it
>>>>> again. The thing is arch reg indices are renamed (reg renaming and
>>>>> shadow
>>>>> reg file), so many structures are needed to be looked up to find a
>>>>> previous/current physical register for the given arch reg name.
These
>>>>> indirections are handled in ISA-specific code and encapsulated, and
>> the
>>>>> operand section of the ISA description specifies which one to use.
>>>>
>>>> I think we're thinking of these things on different levels... the
>>>> operand registers are available via StaticInst::srcRegIdx() and
>>>> destRegIdx(), and can be accessed via (for example) readIntReg() in
>>>> o3/thread_context_impl.hh, none of which are ISA dependent.  I don't
>>>> see why you can't build on these calls to handle this entirely within
>>>> the O3 model in an ISA-independent fashion (i.e., a way that would
>>>> generalize to any ISA with universal predication, if we ever run into
>>>> another one).
>>> I was actually thinking this could be handled at a level higher than
>>> the parser, namely in the ISA description.
>>>
>>> If the predicate is false, then the instruction still needs to forward
>>> the value from the physical source register to the physical
>>> destination register so future consumers get the right thing. For
>>> renaming to work, it's necessary for all those registers to be listed
>>> as both sources and destinations. By putting a fall through condition
>>> in the instruction as defined by the ISA description that just assigns
>>> registers to themselves (DestReg = DestReg; SrcReg1 = SrcReg1) both of
>>> those conditions are satisfied automatically without the CPUs having
>>> to get any smarter/complex/constrained and without retrofitting the
>>> ISA parser.
>>>
>>> This will mean having to change a lot of instruction templates so that
>>> they have the fall through case and so that the definitions know what
>>> registers to assign to themselves. That's going to be tedious, but
>>> hopefully fairly mechanical to do.
>>>
>>> If the ISA parser were a python framework instead of a parser, then
>>> it's conceivable this sort of thing could be handled automatically and
>>> cleanly by subclassing, replacing default function implementations,
>>> etc. etc., and kept local to ARM. That would give the benefits you
>>> might have been going for by changing the parser (the parser does the
>>> work, fewer bugs...) , but without impacting the complexity seen by
>>> all the other ISAs. That's not going to happen tomorrow or possibly
>>> ever so I hestitated to mention it and don't make any plans around it,
>>> but it's something I would really like to see happen.
>>>
>>
>> I'm not sure I know why people are so opposed to changing a piece of
code
>> that is supposed to make creating an isa description easier, not more
>> difficult. With the changes all the regression tests still pass, and
the
>> parser is generating the exact same code for the other ISAs with and
>> without this change.
>>
>> The isa parser already generates 130,000 lines of code per cpu model
and
>> 75,000 lines of decode. It takes over 1GB of memory and 4 minutes to
>> compile a CPU model. To do this without changing the ISA parser and to
>> allow for out-of-order execution, conditional always and plain
>> conditional
>> instructions would have to be decoded separately, doubling the size of
>> the
>> decoder and cpu models. Additionally, not only would we have to go
>> through
>> the process of changing all the isa descriptions, but the possibility
for
>> introducing subtle bugs where predicted false instructions wouldn't be
>> guaranteed to properly update their renamed registers. It seems like
the
>> right answer in this case is to have the isa parser be able to figure
out
>> what registers need updating when an instruction is predicated false
and
>> do
>> the update.
>>
> Between the CPU and the parser I think it should be in the parser.  
> You're right that it's a mess trying to do it in the ISA description,  
> but that's a localized mess. If we continue making the parser more and  
> more complicated it turns into a global mess. Fixing problems in the  
> parser can be very hard and annoyingly subtle.


I understand that you don't want to make the parser more complex and I
understand the reasons why. My point is that some complexity in the parser
is much better than 10x the complexity somewhere else. If you look at what
is changed in the parser, for the case without reset_code substitution, it
changes very little. The bulk of the changes are the makeReset() blobs,
which are pretty-much boiler plate. In my mind, this is the least bug-laden
approach and any other ISAs with predication can take advantage of the same
mechanism to implement it.

Thanks,
Ali
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Review Request: ARM: make predicated-false instruction to move data from a old register.

Reply via email to