> On 17.10.2021 at 13:25, J. Gareth Moreton via fpc-devel
> <fpc-devel@lists.freepascal.org> wrote:
>
> Hi everyone,
>
> While reading up on some algorithms, I came across a recommendation of using
> a shorter arithmetic function to change the value of a constant in a register
> rather than loading the new value directly. However, the algorithm assumes a
> RISC-like processor, so I'm not sure if it applies to an Intel x86-64
> processor. Consider the following:
>
>   movq $0xaaaaaaaaaaaaaaab,%rax
>   imulq %rax,%rcx
>   movq $0x5555555555555555,%rax
>   cmpq %rax,%rcx
>   setle %al
>
> This algorithm sets %al to 1 if %rcx is divisible by 3, and 0 if it's not,
> and was compiled from the following Pascal code (under -O3, but -O1 produces
> almost exactly the same):
>
>   function IsDivisible3(Numerator: QWord): Boolean;
>   begin
>     Result := (Numerator * $AAAAAAAAAAAAAAAB) <= $5555555555555555;
>   end;
>
> (One of my merge requests produces this code from "Result := (x mod 3) = 0")
>
> My question is this: can "movq $0x5555555555555555,%rax" be replaced with
> "shrq $0x1,%rax" without incurring an additional pipeline stall? The MOV
> instruction takes 10 bytes to store, while "SHR 1" takes only 3. Given that
> %rax is used beforehand and the CMP instruction has to wait until the IMUL
> instruction has finished executing, logic tells me that I can get away with
> it here, but I'm not sure if the metric to go by is the execution speed of
> IMUL (i.e. the IMUL instruction is the limiting factor before CMP can be
> executed), or the simple fact that the previous value of %rax was used and
> will be loaded with $AAAAAAAAAAAAAAAB by the time it comes to load it with a
> new value.
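For anyone who wants to check the arithmetic behind the quoted code, here is a minimal C sketch. It is C rather than Pascal only so that it is easy to compile anywhere, and the names is_divisible3, INV3 and LIMIT are made up for illustration; they are not taken from FPC. It assumes a C11 compiler and only shows that the two constants are related by a one-bit right shift and that the multiply-and-compare test agrees with "mod 3". It says nothing about pipeline behaviour.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the quoted assembly. */
#define INV3  UINT64_C(0xAAAAAAAAAAAAAAAB)  /* 3 * INV3 == 1 (mod 2^64) */
#define LIMIT UINT64_C(0x5555555555555555)  /* (2^64 - 1) / 3 */

/* The proposed "shrq $0x1,%rax" is only valid because the second
 * constant is exactly the first one shifted right by one bit. */
static_assert((INV3 >> 1) == LIMIT, "LIMIT must equal INV3 >> 1");

/* Hypothetical C counterpart of the quoted IsDivisible3.
 * The unsigned multiplication wraps modulo 2^64, and the
 * comparison is unsigned as well. */
static bool is_divisible3(uint64_t n)
{
    return n * INV3 <= LIMIT;
}
```

With these definitions, is_divisible3(9) returns true and is_divisible3(10) returns false.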
I'd expect the shr to be executed in parallel with the imul on most modern out-of-order architectures, so there is no real issue. OTOH, this is a very rare case, so it is questionable whether it is worth checking for this situation.