Jack Whitham wrote:
> On Sat, Jul 04, 2009 at 03:23:40AM -0700, Gabe Black wrote:
>
>> In case anyone is still reading this thread, that approach worked quite
>> well. I'm now moving forward with what I was trying to do in the first
>> place, namely entirely eliminate the ISA defined register files as
>> described in my pdf long ago.
>>
>
> If you are changing this code, I should tell you about the changes I
> have been making in order to support ARM on O3, because those might
> influence what you do.
>
> The really big change is that you have to support predicated operations
> that aren't executed. These are possible on ARM, and become a big issue
> for O3 support, because they are always partially executed to evaluate
> the condition codes in Cpsr. If the condition is false, then the operation
> *copies* the old values of its destination registers to the new ones:
>
> def template PredOpExecute {{
>     Fault %(class_name)s::execute(%(CPU_exec_context)s *xc,
>             Trace::InstRecord *traceData) const
>     {
>         Fault fault = NoFault;
>         %(op_decl)s;
>         %(op_rd)s;
>
>         if (%(predicate_test)s)
>         {
>             %(code)s;
>             if (fault == NoFault)
>             {
>                 %(op_wb)s;
>             }
>         } else {
>             %(op_copy)s;    // <--- predicate false: copy old values
>         }
>         return fault;
>     }
> }};
>
> That's necessary because register renaming takes place *before* it is
> known whether the operation will execute or not. Rd (as input) and Rd (as
> output) are different physical registers, so the destination registers
> also need to be source registers.
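
The renaming argument above can be illustrated with a toy model in Python.
This is only a sketch under stated assumptions: RenameMap and
run_predicated_add are hypothetical names, not M5 code, and the physical
register file is reduced to a plain list in which None means "never written".

```python
# Toy model of register renaming for one predicated instruction.
# Every name here is hypothetical; none of this is M5 or O3 code.

class RenameMap:
    def __init__(self, num_arch_regs):
        # Initially architectural register i lives in physical register i.
        self.table = list(range(num_arch_regs))
        self.next_free = num_arch_regs

    def lookup(self, arch_reg):
        return self.table[arch_reg]

    def allocate(self, arch_reg):
        # Give the destination a fresh physical register.
        phys = self.next_free
        self.next_free += 1
        self.table[arch_reg] = phys
        return phys


def run_predicated_add(predicate, do_copy):
    """Model 'ADD<cond> r0, r1, #1' and return what a later reader of r0 sees."""
    rename_map = RenameMap(num_arch_regs=4)
    phys = [None] * 8                   # None marks a register never written
    phys[rename_map.lookup(0)] = 7      # r0 = 7
    phys[rename_map.lookup(1)] = 5      # r1 = 5

    # Renaming happens before the predicate is known.
    src_rn = rename_map.lookup(1)
    old_rd = rename_map.lookup(0)       # Rd as input
    new_rd = rename_map.allocate(0)     # Rd as output, a different register

    if predicate:
        phys[new_rd] = phys[src_rn] + 1
    elif do_copy:
        phys[new_rd] = phys[old_rd]     # the role played by %(op_copy)s

    # Later instructions read r0 through the updated rename map.
    return phys[rename_map.lookup(0)]
```

With the copy, a false predicate leaves r0 at its old value of 7; without
it, later readers of r0 are pointed at a physical register that was never
written.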
>
> There is also an important optimisation that you can make: you can
> redirect the Cpsr and Rd inputs to the zero register if the operation
> is unconditional.
>
> I've attached a patch to suggest how this could be done. Incidentally, I
> really like the idea of making isa_parser.py into a class or package,
> because my patch needs to access globals in isa_parser.py and it wasn't
> easy to arrange this (I passed globals() as a parameter in the end).
>
> What the patch does in isa_parser.py:
>  1. Adds a "makeCopy" operation to every subclass of Operand. (You
>     can't use makeRead then makeWrite because they sometimes do sign
>     extension.) "makeCopy" is generated by the new "%(op_copy)s"
>     template code.
>  2. Allows any ISA to specify post-processor methods for
>     (a) the operand list (fixup_operands_fn, e.g. ArmAddCpsrAndCopy)
>     (b) each operand (op_desc.post_process_fn, e.g. ArmPostProc)
>     (c) each InstObjParams (e.g. ArmAddToConstructor)
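
As an illustration of why a separate "makeCopy" is needed, here is a toy
operand class in Python. It is a sketch only: ToyIntRegOperand is a
hypothetical stand-in, not the real Operand class in isa_parser.py, and the
C++ strings it emits are illustrative rather than what the patch generates.

```python
# Hypothetical stand-in for an integer register operand. The real class in
# isa_parser.py is different; this only shows the idea behind makeCopy.

class ToyIntRegOperand:
    def __init__(self, name, src_idx, dest_idx, signed_bits=None):
        self.name = name                # operand name used in the ISA code
        self.src_idx = src_idx          # index of the operand as a source
        self.dest_idx = dest_idx        # index of the operand as a destination
        self.signed_bits = signed_bits  # width to sign extend from, if any

    def makeRead(self):
        expr = "xc->readIntRegOperand(this, %d)" % self.src_idx
        if self.signed_bits is not None:
            # Reading may transform the value, e.g. by sign extension.
            expr = "sext<%d>(%s)" % (self.signed_bits, expr)
        return "%s = %s;" % (self.name, expr)

    def makeWrite(self):
        return "xc->setIntRegOperand(this, %d, %s);" % (self.dest_idx,
                                                        self.name)

    def makeCopy(self):
        # Move the raw old value straight to the new destination, with no
        # conversion in between.
        return ("xc->setIntRegOperand(this, %d, "
                "xc->readIntRegOperand(this, %d));"
                % (self.dest_idx, self.src_idx))


op = ToyIntRegOperand("Rd", src_idx=2, dest_idx=0, signed_bits=16)
read_then_write = op.makeRead() + " " + op.makeWrite()
copy_only = op.makeCopy()
```

In this toy, read_then_write passes the value through sext<16>() and could
change it, while copy_only moves it untouched.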
>
>
> What the patch does in the ARM ISA definition:
>  1. Adds %(op_copy)s as appropriate.
>  2. Calls ArmAddToConstructor on every InstObjParams; this adds some
>     flags to every constructor in decoder.cc. These include:
>     (a) flags[IsCond] (if operation is conditional),
>     (b) flags[IsControl] etc. (if operation writes to PC).
>  3. As an optimisation, ArmAddToConstructor also sets the "conditional
>     sources" to the zero register if flags[IsCond] == false.
>  4. Calls ArmAddCpsrAndCopy on every operand list generated inside
>     InstObjParams. This adds the Cpsr (if it's not already present) and
>     adds all destination registers as sources. These are only used if
>     the operation is conditional. A side effect of this step is that the
>     substitution "new_code = re.sub(r'^', 'Cpsr = Cpsr;', new_code)" is
>     not needed any more - a good thing, because most operations don't
>     change Cpsr.
>  5. Calls ArmPostProc on every operand (just before finalize) to make
>     all %(op_copy)s code dependent on
>     (1) flags[IsCond] == true
>     (2) destination register != PC
>     This is because you do not want to write to the PC unless you are
>     really changing it.
>  6. There is some special processing for load operations which I haven't
>     completely finished yet. Specifically, the generation of EA is the
>     only part of a load which is guaranteed to execute on O3, since the
>     load may not issue if flags[IsCond] == true, so the EA generator
>     must also do "%(op_copy)s".
>  7. There are a few bug fixes for bugs that only show up if speculative
>     execution is used.
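
Steps 3 to 5 above can be sketched in Python as follows. This is a
simplified model, not the code in the patch: fixup_sources, needs_copy and
the ZERO_REG placeholder are hypothetical names, registers are plain
strings, and it assumes that only the copy, not the source list, treats the
PC specially.

```python
# Simplified model of the operand fix-up described in steps 3 to 5.
# All names are hypothetical and registers are represented as strings.

ZERO_REG = "ZeroReg"  # placeholder for the always-zero register


def fixup_sources(sources, dests, is_cond, cpsr="Cpsr"):
    """Return the source list that renaming would see after the fix-up."""
    result = list(sources)
    # Step 4: Cpsr and every destination become sources, unless they are
    # already genuine sources of the operation.
    for reg in [cpsr] + list(dests):
        if reg in sources:
            continue
        # Step 3: an unconditional operation never reads these extra
        # sources, so they are redirected to the zero register and create
        # no dependencies.
        result.append(reg if is_cond else ZERO_REG)
    return result


def needs_copy(dest, is_cond, pc="PC"):
    """Step 5: emit copy code only for conditional ops not writing the PC."""
    return is_cond and dest != pc


cond_sources = fixup_sources(["Rn"], ["Rd"], is_cond=True)
uncond_sources = fixup_sources(["Rn"], ["Rd"], is_cond=False)
```

Here cond_sources gains both Cpsr and Rd, while uncond_sources gains two
zero-register entries in the same positions, which keeps the operand
indices stable between the two cases.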
>
> I hope that this patch will be useful to you. You can probably find
> better ways of doing each step than my "post processor" approach,
> but the key thing is support for something like "op_copy" which
> is used if a predicated operation is not executed. Without that, there
> can be no O3 support for ARM.
>
> Incidentally, O3 + ARM is very nearly working now :). Some changes are
> necessary inside src/cpu/o3/ - I haven't included these in this patch to
> avoid confusion, but I will send them out when they are fully working,
> and when the isa_parser.py changes are more stable.
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev

I haven't looked at your patch in great detail, but I suspect that there
might be a way to get that to work without new mechanisms. The devil is
in the details, so we'll see what can be done. I do have a number of
changes that affect isa_parser.py and/or ARM's ISA description. If you
can give me some sort of test to run, I can try to make sure that, even
if things are done substantially differently, your support will work as
well as it does now.

Gabe