On 10/4/20 2:01 PM, J. Gareth Moreton via fpc-devel wrote:
Hi Nikolay,
I've got some good code to test, but I need to double-check with
someone to see if the licensing agreements allow it (the code is rather
complex, but showcases the effect of the TEST instructions quite nicely).
Is your platform a Windows or a Unix machine? I ask because I don't
want to send you functions that use the wrong calling convention!
I dual boot Linux and Windows, but prefer testing on Linux.
Best regards,
Nikolay
Gareth aka. Kit
On 02/10/2020 14:13, Nikolay Nikolov via fpc-devel wrote:
On 10/2/20 2:13 PM, J. Gareth Moreton via fpc-devel wrote:
Confirmed my suspicions. If I zero the upper bits of the register
(I used something akin to "AND RCX, $F"), there is no speed loss.
My hypothesis, at least on my Intel(R) Core(TM) i7-10750H, is that
using TEST on a sub-register causes a false dependency if the bits
outside of the sub-register are not zero, even though the register
isn't being modified.
If you send me a test program, I can run it on my Ryzen 5 2500U to
see how AMD behaves. We don't specifically optimize for AMD (yet),
but it's interesting to know.
Nikolay
Gareth aka. Kit
On 02/10/2020 11:57, J. Gareth Moreton via fpc-devel wrote:
So... I've done some tests, replacing TEST RCX, $4 with TEST CL, $4
and the like in a number-crunching function, and it seems to cause
a notable penalty, even though none of the instructions are in my
critical loop. So I think it's something that needs to be avoided
in most cases. I think the reason it worked in my Int and Frac
functions is that the processor knows the upper 48 bits of the
register are zero.
Long story short... best not to do it unless you have some
additional insight into what the registers contain.
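
For anyone following along, here is a minimal sketch in C (not the code
I tested, and not compiler source) of why the two forms are
interchangeable in the first place: when the mask fits in the low byte,
TEST RCX, mask and TEST CL, mask give the same zero-flag result no
matter what the upper bits hold. The function names and sample values
are made up for illustration, and only the zero flag is modelled (the
sign flag can differ when bit 7 of the mask is set). It says nothing
about timing, which is the whole problem described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Models the zero-flag outcome of TEST RCX, mask:
   returns 1 when any masked bit of the full 64-bit value is set. */
static int test_full(uint64_t reg, uint64_t mask)
{
    return (reg & mask) != 0;
}

/* Models the zero-flag outcome of TEST CL, mask:
   only the low 8 bits of the register take part. */
static int test_low_byte(uint64_t reg, uint8_t mask)
{
    return ((uint8_t)reg & mask) != 0;
}

/* Returns 1 if both forms agree for every sample value, 0 otherwise.
   The samples (hypothetical) include values with non-zero upper bits. */
static int forms_agree(uint8_t mask)
{
    static const uint64_t samples[] = {
        0x0000000000000000ull,
        0x0000000000000004ull,
        0x00000000000000FBull,
        0xFFFFFFFFFFFFFFFBull,
        0xFFFFFFFFFFFFFFFFull,
        0x123456789ABCDE04ull
    };
    size_t i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        if (test_full(samples[i], mask) != test_low_byte(samples[i], mask))
            return 0;
    }
    return 1;
}
```

Calling forms_agree($04 written as 0x04 in C) should report agreement,
which is why the substitution looks tempting as a size optimisation; the
measurements above suggest the cost shows up elsewhere.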
Gareth aka. Kit
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel