Hi everyone,

I just discovered this while trying to optimise some of the hash functions.  This might already be known, but in case it isn't, here's something useful to know.

The LEA instruction is useful because it can perform "x := y + z + const" in a single instruction, or just "x := y + z" or "x := y + const", even when the destination register doesn't match any of the source registers.  However, on Sandy Bridge and later (I'm not sure about AMD processors), the 3-operand version (base + index + constant) has a 3-cycle latency and can only be dispatched to one execution port (which reduces concurrency if there are nearby instructions that fetch addresses), whereas the 2-operand version (whether reg/reg or reg/const) has only a single-cycle latency and can be dispatched to at least two different ports.

Long story short, if you have something like:

LEA ECX, [ECX + EAX + $f57c0faf]
ROL ECX, 7

The ROL instruction has to wait for the result of the LEA, which arrives 2 cycles later than it would from a single-cycle instruction.  However, if you expand LEA into two ADD instructions:

ADD ECX, EAX
ADD ECX, $f57c0faf
ROL ECX, 7

Though slightly larger, this triplet executes one cycle faster overall because each ADD has only a single-cycle latency, so the ROL can start after 2 cycles instead of 3.
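
To sanity-check the two sequences above, here is a minimal Python sketch (not FPC code) that models both on 32-bit values and adds up the dependency-chain latency.  The function names are made up for illustration, and the latency constants are simply the figures quoted in this message rather than anything measured, so treat the cycle counts as an assumption.

```python
MASK32 = 0xFFFFFFFF
CONST = 0xF57C0FAF

# Latencies in cycles as quoted in this post (assumed, not measured here).
LAT_LEA3 = 3   # LEA with base + index + constant
LAT_ADD = 1
LAT_ROL = 1


def rol32(x, n):
    """Rotate a 32-bit value left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def seq_lea(ecx, eax):
    """Models: LEA ECX, [ECX + EAX + CONST]; ROL ECX, 7"""
    ecx = (ecx + eax + CONST) & MASK32
    return rol32(ecx, 7)


def seq_add(ecx, eax):
    """Models: ADD ECX, EAX; ADD ECX, CONST; ROL ECX, 7"""
    ecx = (ecx + eax) & MASK32
    ecx = (ecx + CONST) & MASK32
    return rol32(ecx, 7)


def chain_latency(latencies):
    """Latency of a chain where each instruction needs the previous result."""
    return sum(latencies)


if __name__ == "__main__":
    a, b = 0x12345678, 0x9ABCDEF0
    print("same result:", seq_lea(a, b) == seq_add(a, b))
    print("LEA chain:", chain_latency([LAT_LEA3, LAT_ROL]), "cycles")
    print("ADD chain:", chain_latency([LAT_ADD, LAT_ADD, LAT_ROL]), "cycles")
```

One thing the model doesn't capture: ADD modifies the flags and LEA doesn't, so the two sequences are only interchangeable if nothing afterwards reads the flags that were set before the LEA.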

The 3-operand LEA instruction is still useful in a few cases though:

    - If all the registers are different, since expanding it into arithmetic/logical instructions would require an additional MOV instruction, which doesn't offer any speed bonus and just increases code size.

    - In cases where the destination is the same as one of the source registers, as long as the destination isn't used for at least 3 cycles, then it is a saving (minimising concurrent uses of the AGU execution ports also helps).

    - And of course, if one of the registers has a scale factor (multiplier), then LEA is also faster than the equivalent arithmetic/logical instructions.
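
The cases above can be folded into a single yes/no test.  The following is only a rough Python sketch of that decision, not how FPC represents instructions: the Lea record, its field names and the "cycles_until_use" estimate are all hypothetical, and the 3-cycle threshold is the latency figure quoted earlier.  I've also added a flags check, because ADD changes the flags whereas LEA leaves them alone.

```python
from collections import namedtuple

# Hypothetical description of: LEA dest, [base + index*scale + disp]
Lea = namedtuple(
    "Lea",
    "dest base index scale disp flags_live cycles_until_use",
    defaults=(1, 0, False, 0),
)


def is_slow_lea(lea):
    """True for the 3-operand form: base + index + non-zero constant."""
    return lea.base is not None and lea.index is not None and lea.disp != 0


def should_expand(lea):
    """Decide whether to replace a LEA with two ADD instructions."""
    if not is_slow_lea(lea):
        return False   # 2-operand LEA is already single-cycle
    if lea.scale != 1:
        return False   # scaled index: LEA beats the arithmetic equivalent
    if lea.dest not in (lea.base, lea.index):
        return False   # all registers differ: would need an extra MOV
    if lea.flags_live:
        return False   # ADD clobbers the flags, LEA does not
    if lea.cycles_until_use >= 3:
        return False   # latency is hidden by other work, keep the LEA
    return True


if __name__ == "__main__":
    print(should_expand(Lea("ecx", "ecx", "eax", disp=0xF57C0FAF)))
    print(should_expand(Lea("edx", "ecx", "eax", disp=0xF57C0FAF)))
```

The order of the checks doesn't matter for the result; I put the cheapest ones first.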

With all this in mind I'll have a ponder about introducing a new peephole optimisation that expands potentially slow LEA instructions.
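
For what it's worth, the rewrite itself is small.  Here is a hedged Python sketch of the expansion step that works on plain register names and emits text; expand_lea is a hypothetical helper, not anything in the compiler, and it assumes the caller has already confirmed that the index is unscaled and that the flags aren't needed afterwards.

```python
def expand_lea(dest, base, index, disp):
    """Return ADD instructions equivalent to LEA dest, [base + index + disp].

    Only handles the case where dest is also one of the two source
    registers.  Returns None otherwise, so the caller keeps the LEA
    rather than adding a MOV.
    """
    if dest == base:
        other = index
    elif dest == index:
        other = base
    else:
        return None
    # Show the constant as an unsigned 32-bit value in Pascal-style hex.
    const = disp & 0xFFFFFFFF
    return [f"ADD {dest}, {other}", f"ADD {dest}, ${const:08x}"]


if __name__ == "__main__":
    for line in expand_lea("ECX", "ECX", "EAX", 0xF57C0FAF):
        print(line)
```

Because addition is commutative, it doesn't matter whether the destination appears as the base or the index; the other register is simply added to it first.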

Kit

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
