Jack Whitham wrote:
> On Sat, Jul 04, 2009 at 03:23:40AM -0700, Gabe Black wrote:
>   
>> In case anyone is still reading this thread, that approach worked quite 
>> well. I'm now moving forward with what I was trying to do in the first 
>> place, namely entirely eliminate the ISA defined register files as 
>> described in my pdf long ago.
>>     
>
> If you are changing this code, I should tell you about the changes I
> have been making in order to support ARM on O3, because those might
> influence what you do. 
>
> The really big change is that you have to support predicated operations
> that aren't executed. These are possible on ARM, and become a big issue
> for O3 support, because they are always partially executed to evaluate 
> the condition codes in Cpsr. If the condition is false, then the operation 
> *copies* the old value of each destination register into its newly
> renamed register:
>
>     def template PredOpExecute {{
>         Fault %(class_name)s::execute(%(CPU_exec_context)s *xc,
>                 Trace::InstRecord *traceData) const
>         {   
>             Fault fault = NoFault;
>             %(op_decl)s;
>             %(op_rd)s;
>
>             if (%(predicate_test)s)
>             {   
>                 %(code)s;
>                 if (fault == NoFault)
>                 {   
>                     %(op_wb)s;
>                 }
>             } else {
>                 %(op_copy)s;        // <--- predicate false: copy old dest values
>             }
>             return fault;
>         }
>     }};
>
> That's necessary because register renaming takes place *before* it is 
> known whether the operation will execute or not. Rd (as input) and Rd (as
> output) are different physical registers, so the destination registers
> must also be listed as source registers.
>
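The renaming argument above can be illustrated with a toy model. This is only a sketch: `RenameMap`, `execute_predicated` and the register numbering are invented for illustration and are not taken from the O3 model or from the patch.

```python
# Toy model of register renaming. All names are illustrative assumptions,
# not code from src/cpu/o3/ or from the patch.

class RenameMap:
    def __init__(self, num_arch_regs):
        # Initially architectural register i lives in physical register i.
        self.table = list(range(num_arch_regs))
        self.next_free = num_arch_regs

    def rename_dest(self, arch_reg):
        """Allocate a fresh physical register for a destination.

        This happens at rename time, before the predicate is known.
        Returns (old_phys, new_phys).
        """
        old_phys = self.table[arch_reg]
        new_phys = self.next_free
        self.next_free += 1
        self.table[arch_reg] = new_phys
        return old_phys, new_phys


def execute_predicated(phys_regs, predicate, old_phys, new_phys, result):
    """Execute a predicated op whose destination was already renamed."""
    if predicate:
        # Normal writeback (the %(op_wb)s path).
        phys_regs[new_phys] = result
    else:
        # Predicate false: carry the old value forward (the %(op_copy)s path).
        phys_regs[new_phys] = phys_regs[old_phys]


phys_regs = {i: 0 for i in range(4)}
phys_regs[1] = 42                      # architectural r1 currently holds 42
rmap = RenameMap(4)

old_p, new_p = rmap.rename_dest(1)     # r1 is renamed from p1 to p4
execute_predicated(phys_regs, False, old_p, new_p, result=99)

# Later readers of r1 are pointed at the new physical register.
r1_value = phys_regs[rmap.table[1]]
```

Without the copy in the `else` branch, `phys_regs[new_p]` would never be written and later readers of r1 would see no valid value, which is why the old destination value has to be available as a source.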
> There is also an important optimisation that you can make: you can 
> redirect the Cpsr and Rd inputs to the zero register if the operation 
> is unconditional. 
>
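A minimal sketch of that optimisation, assuming the extra source slots stay in place and are simply pointed at the zero register. The function name, the `ZERO_REG` label and the list-of-names representation are hypothetical and do not come from the patch.

```python
ZERO_REG = "ZeroReg"   # hypothetical label for a register that is always ready


def source_regs(explicit_srcs, dest_regs, is_cond):
    """Return the source list the renamer would see for one instruction.

    A conditional operation needs Cpsr and the old destination values as
    extra sources. An unconditional operation never reads them, so those
    slots are redirected to the zero register and create no dependencies.
    """
    extra = ["Cpsr"] + list(dest_regs)
    if not is_cond:
        extra = [ZERO_REG] * len(extra)
    return list(explicit_srcs) + extra


conditional = source_regs(["Rn", "Rm"], ["Rd"], is_cond=True)
unconditional = source_regs(["Rn", "Rm"], ["Rd"], is_cond=False)
```

Here `conditional` still depends on Cpsr and the old Rd, while `unconditional` depends only on Rn and Rm, so it does not have to wait for an earlier writer of Rd or Cpsr.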
> I've attached a patch to suggest how this could be done. Incidentally, I
> really like the idea of making isa_parser.py into a class or package,
> because my patch needs to access globals in isa_parser.py and it wasn't
> easy to arrange this (I passed globals() as a parameter in the end).
>
> What the patch does in isa_parser.py:
> 1.  Adds a "makeCopy" operation to every subclass of Operand. (You 
>     can't use makeRead then makeWrite because they sometimes do sign 
>     extension.) "makeCopy" is generated by the new "%(op_copy)s" 
>     template code. 
> 2.  Allows any ISA to specify post-processor methods for
>     (a) the operand list (fixup_operands_fn, e.g. ArmAddCpsrAndCopy)
>     (b) each operand (op_desc.post_process_fn, e.g. ArmPostProc)
>     (c) each InstObjParams (e.g. ArmAddToConstructor)
>
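The point in item 1 about sign extension can be sketched as follows. The class below is a hypothetical stand-in, not the real Operand class from isa_parser.py, and the emitted `readReg`/`writeReg` strings are placeholders rather than the actual generated C++.

```python
class RegOperandSketch:
    """Hypothetical stand-in for one register operand (not isa_parser.py code)."""

    def __init__(self, name, src_idx, dest_idx, ctype=None):
        self.name = name
        self.src_idx = src_idx
        self.dest_idx = dest_idx
        self.ctype = ctype   # e.g. "int16_t" if the operand is read as signed 16-bit

    def make_read(self):
        # The read may convert the raw value, e.g. by sign-extending it.
        expr = f"readReg({self.src_idx})"
        if self.ctype:
            expr = f"({self.ctype}){expr}"
        return f"{self.name} = {expr};"

    def make_write(self):
        return f"writeReg({self.dest_idx}, {self.name});"

    def make_copy(self):
        # Move the raw register value with no conversion in between.
        return f"writeReg({self.dest_idx}, readReg({self.src_idx}));"


rd = RegOperandSketch("Rd", src_idx=2, dest_idx=0, ctype="int16_t")
read_then_write = rd.make_read() + " " + rd.make_write()
copy_only = rd.make_copy()
```

In this sketch `read_then_write` passes the value through the `int16_t` conversion before writing it back, so the destination may not receive the original bits, whereas `copy_only` transfers the register unchanged.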
>
> What the patch does in the ARM ISA definition:
> 1.  Adds %(op_copy)s as appropriate.
> 2.  Calls ArmAddToConstructor on every InstObjParams; this adds some
>     flags to every constructor in decoder.cc. These include:
>     (a) flags[IsCond] (if operation is conditional),
>     (b) flags[IsControl] etc. (if operation writes to PC).
> 3.  As an optimisation, ArmAddToConstructor also sets the "conditional
>     sources" to the zero register if flags[IsCond] == false.
> 4.  Calls ArmAddCpsrAndCopy on every operand list generated inside
>     InstObjParams. This adds the Cpsr (if it's not already present) and
>     adds all destination registers as sources. These are only used if
>     the operation is conditional. A side effect of this step is that the
>     substitution "new_code = re.sub(r'^', 'Cpsr = Cpsr;', new_code)" is
>     not needed any more - a good thing, because most operations don't
>     change Cpsr.
> 5.  Calls ArmPostProc on every operand (just before finalize) to make
>     all %(op_copy)s code dependent on 
>     (1) flags[IsCond] == true
>     (2) destination register != PC
>     This is because you do not want to write to the PC unless you are
>     really changing it.
> 6.  There is some special processing for load operations which I haven't 
>     completely finished yet. Specifically, the generation of EA is the
>     only part of a load which is guaranteed to execute on O3, since the
>     load may not issue if flags[IsCond] == true, so the EA generator
>     must also do "%(op_copy)s".
> 7.  There are a few bug fixes for bugs that only show up if speculative
>     execution is used.
>
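Steps 4 and 5 of the list above can be sketched roughly like this. It is a simplified model under stated assumptions: the `Op` dataclass and both function names are invented for illustration, and the real ArmAddCpsrAndCopy and ArmPostProc in the patch may work quite differently.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Hypothetical operand record; not the isa_parser.py Operand class."""
    name: str
    is_src: bool = False
    is_dest: bool = False


def add_cpsr_and_dest_sources(operands):
    """Step 4: make sure Cpsr is a source and every destination is also a source."""
    ops = [Op(o.name, o.is_src, o.is_dest) for o in operands]   # work on copies
    if not any(o.name == "Cpsr" for o in ops):
        ops.append(Op("Cpsr", is_src=True))
    for o in ops:
        if o.is_dest or o.name == "Cpsr":
            o.is_src = True
    return ops


def needs_copy(op, is_cond):
    """Step 5: emit copy code only for conditional ops and never for the PC."""
    return is_cond and op.is_dest and op.name != "PC"


ops = add_cpsr_and_dest_sources([Op("Rd", is_dest=True), Op("Rn", is_src=True)])
names = [o.name for o in ops]
copied = [o.name for o in ops if needs_copy(o, is_cond=True)]
```

With these inputs, Rd becomes both a source and a destination, Cpsr is appended as a source, and only Rd is selected for copying. A PC destination would be skipped, since the PC should not be written unless it is really being changed.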
> I hope that this patch will be useful to you. You can probably find 
> better ways of doing each step than my "post processor" approach, 
> but the key thing is support for something like "op_copy" which 
> is used if a predicated operation is not executed. Without that, there 
> can be no O3 support for ARM. 
>   
> Incidentally, O3 + ARM is very nearly working now :). Some changes are
> necessary inside src/cpu/o3/ - I haven't included these in this patch to
> avoid confusion, but I will send them out when they are fully working,
> and when the isa_parser.py changes are more stable.
>
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev

I haven't looked at your patch in great detail, but I suspect that there
might be a way to get that to work without new mechanisms. The devil is
in the details so we'll see what can be done. I do have a number of
changes that affect isa_parser.py and/or ARM's isa description. If you
can give me some sort of test to run, I can try to make sure that, even
if things are done substantially differently, your support will work as
well as it does now.

Gabe
