Ali Saidi wrote:
> On Thu, 19 Aug 2010 15:21:24 -0400, Gabriel Michael Black
> <[email protected]> wrote:
>   
>> Quoting Ali Saidi <[email protected]>:
>>
>>     
>>> On Tue, 17 Aug 2010 14:19:49 -0400, Gabriel Michael Black
>>> <[email protected]> wrote:
>>>       
>>>> Quoting Steve Reinhardt <[email protected]>:
>>>>
>>>>         
>>>>> On Sun, Aug 15, 2010 at 6:07 PM, Min Kyu Jeong <[email protected]>
>>>>>           
>>> wrote:
>>>       
>>>>>> I needed to spit out a code that reads from a register, and writes
>>>>>>             
> to
>   
>>> it
>>>       
>>>>>> again. The thing is arch reg indices are renamed (reg renaming and
>>>>>> shadow
>>>>>> reg file), so many structures are needed to be looked up to find a
>>>>>> previous/current physical register for the given arch reg name.
>>>>>>             
> These
>   
>>>>>> indirections are handled in ISA-specific code and encapsulated, and
>>>>>>             
>>> the
>>>       
>>>>>> operand section of the ISA description specifies which one to use.
>>>>>>             
>>>>> I think we're thinking of these things on different levels... the
>>>>> operand registers are available via StaticInst::srcRegIdx() and
>>>>> destRegIdx(), and can be accessed via (for example) readIntReg() in
>>>>> o3/thread_context_impl.hh, none of which are ISA dependent.  I don't
>>>>> see why you can't build on these calls to handle this entirely within
>>>>> the O3 model in an ISA-independent fashion (i.e., a way that would
>>>>> generalize to any ISA with universal predication, if we ever run into
>>>>> another one).
>>>>>           
>>>> I was actually thinking this could be handled at a level higher than
>>>> the parser, namely in the ISA description.
>>>>
>>>> If the predicate is false, then the instruction still needs to forward
>>>> the value from the physical source register to the physical
>>>> destination register so future consumers get the right thing. For
>>>> renaming to work, it's necessary for all those registers to be listed
>>>> as both sources and destinations. By putting a fall through condition
>>>> in the instruction as defined by the ISA description that just assigns
>>>> registers to themselves (DestReg = DestReg; SrcReg1 = SrcReg1) both of
>>>> those conditions are satisfied automatically without the CPUs having
>>>> to get any smarter/complex/constrained and without retrofitting the
>>>> ISA parser.
>>>>
>>>> This will mean having to change a lot of instruction templates so that
>>>> they have the fall through case and so that the definitions know what
>>>> registers to assign to themselves. That's going to be tedious, but
>>>> hopefully fairly mechanical to do.
>>>>
>>>> If the ISA parser were a python framework instead of a parser, then
>>>> it's conceivable this sort of thing could be handled automatically and
>>>> cleanly by subclassing, replacing default function implementations,
>>>> etc. etc., and kept local to ARM. That would give the benefits you
>>>> might have been going for by changing the parser (the parser does the
>>>> work, fewer bugs...) , but without impacting the complexity seen by
>>>> all the other ISAs. That's not going to happen tomorrow or possibly
>>>> ever so I hestitated to mention it and don't make any plans around it,
>>>> but it's something I would really like to see happen.
>>>>
>>>>         
>>> I'm not sure I know why people are so opposed to changing a piece of
>>>       
> code
>   
>>> that is supposed to make creating an isa description easier, not more
>>> difficult. With the changes all the regression tests still pass, and
>>>       
> the
>   
>>> parser is generating the exact same code for the other ISAs with and
>>> without this change.
>>>
>>> The isa parser already generates 130,000 lines of code per cpu model
>>>       
> and
>   
>>> 75,000 lines of decode. It takes over 1GB of memory and 4 minutes to
>>> compile a CPU model. To do this without changing the ISA parser and to
>>> allow for out-of-order execution, conditional always and plain
>>> conditional
>>> instructions would have to be decoded separately, doubling the size of
>>> the
>>> decoder and cpu models. Additionally, not only would we have to go
>>> through
>>> the process of changing all the isa descriptions, but the possibility
>>>       
> for
>   
>>> introducing subtle bugs where predicted false instructions wouldn't be
>>> guaranteed to properly update their renamed registers. It seems like
>>>       
> the
>   
>>> right answer in this case is to have the isa parser be able to figure
>>>       
> out
>   
>>> what registers need updating when an instruction is predicated false
>>>       
> and
>   
>>> do
>>> the update.
>>>
>>>       
>> Between the CPU and the parser I think it should be in the parser.  
>> You're right that it's a mess trying to do it in the ISA description,  
>> but that's a localized mess. If we continue making the parser more and  
>> more complicated it turns into a global mess. Fixing problems in the  
>> parser can be very hard and annoyingly subtle.
>>     
>
> I understand that you don't want to make the parser more complex and I
> understand the reasons why. My point is that some complexity in the parser
> is much better than 10x the complexity somewhere else. If you look at what
> is changed in the parser, for the case without reset_code substitution, it
> changes very little. The bulk of the changes are the makeReset() blobs,
> which are pretty-much boiler plate. In my mind, this is the least bug-laden
> approach and any other ISAs with predication can take advantage of the same
> mechanism to implement it.
>
> Thanks,
> Ali
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
>   

I'm not happy about it, but I think you may have a compelling enough
argument. If Steve is ok with it I'm ok with it.

Gabe
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to