On 16.10.2017 at 22:33, Markus Beth wrote:
> Sorry for the late reply. I had a weekend off(line).
>
> The instructions were chosen on purpose, and Sergey already cited the part of
> the Intel documentation that explains why this is correct. You can find a
> similar part in the AMD "AMD64 Architecture Programmer’s Manual Volume 1:
> Application Programming":

Yes, Sergey is of course right; it was too late yesterday :)

>
>> 3.4.5 High 32 Bits
>> In 64-bit mode, the following rules apply to extension of results into
>> the high 32 bits when results smaller than 64 bits are written:
>>
>> * Zero-Extension of 32-Bit Results: 32-bit results are zero-extended
>>   into the high 32 bits of 64-bit GPR destination registers.
>
> I think other x86_64 CPU manufacturers also adhere to this rule, as I know
> gcc also relies on this.
>
> I generally prefer the instructions operating on 32-bit operands over those
> operating on 64-bit operands where appropriate, because they are typically
> encoded in fewer bytes as they do not need a REX prefix.
>
> I have updated the patch (attached) to include a code path for 'oldbinutils'
> as Gareth suggested. In addition, I switched the tails (.LCmpbyteZero and
> .LCmpbyteExitFast): when we leave the loop because the loop count reaches
> zero, we already know that the last bytes were the same and do not need to
> subq them.
>
> Markus
>
> P.S.: I am currently working on another version of CompareByte that might
> have a slightly higher latency for very small len but a higher throughput
> (2 cycles per iteration vs. 3 cycles on an Intel Arrandale CPU (Westmere
> microarchitecture)). But this would need some more testing and benchmarking.
> I can bring it up here again if this would be of any interest.

Small lengths in terms of the matching string or the overall length?

BTW: I would really like to see a PCMPSTR based implementation :)
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel