From: "Tony Harminc"
Sent: Tuesday, June 03, 2014 3:30 AM
On 2 June 2014 11:00, Rob van der Heij wrote:

The optimized gcc code was more like this (for 3 bytes)

*
         IC    R2,1(R5)
         LHI   R3,0
         AR    R2,R1
         AR    R0,R1
         IC    R3,2(R5)
         LHI   R1,0
         AR    R3,R2
         AR    R0,R2
         IC    R1,3(R5)
         LHI   R2,0
         AR    R1,R3
         AR    R0,R3
*

As I understand, the LHI is done earlier in the stream to allow overlap with the other instructions.

Is LHI Rn,0 faster than SR Rn,Rn? I'd expect them to be the same, but SR is half the size, and so lessens the amount of i-cache used.
XR Rn,Rn is faster than SR. But does it matter? Such an instruction should be executed only once, and once only. It shouldn't be in the loop.
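
For anyone who would rather read the quoted sequence in C, here is a rough sketch of what it appears to compute. This is only my reading of the listing, not the original source: I am assuming R1/R2/R3 rotate as a running byte sum and R0 accumulates the previous running sums. The names sums_t and sum_bytes are made up for illustration. IC replaces only the low-order byte of its target register, and in this unrolled code each target was last used as an accumulator, which would explain why a clear shows up before every IC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the quoted listing (names are illustrative only).
 * a: running byte sum, held in R1/R2/R3 in rotation
 * b: sum of the previous running sums, held in R0 */
typedef struct {
    uint32_t a;
    uint32_t b;
} sums_t;

static sums_t sum_bytes(const unsigned char *p, size_t n, sums_t s)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t t = 0;   /* LHI Rt,0   : clear the target register first */
        t |= p[i];        /* IC  Rt,i(R5): insert the byte into the low 8 bits */
        t += s.a;         /* AR  Rt,Rprev: new running sum = byte + previous sum */
        s.b += s.a;       /* AR  R0,Rprev: accumulate the previous running sum */
        s.a = t;          /* the register rotation makes Rt the new "previous" */
    }
    return s;
}
```

In the C form the clear is just part of loading a byte into a fresh temporary, so whether it costs anything is up to the compiler's instruction selection and scheduling.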
