On 16.10.2017 at 22:33, Markus Beth wrote:
> Sorry for the late reply. I had a weekend off(line).
> 
> The instructions were chosen on purpose, and Sergey already cited the part of
> the Intel documentation that explains why this is correct. You can find a
> similar passage in AMD's "AMD64 Architecture Programmer’s Manual Volume 1:
> Application Programming":

Yes, Sergey is of course right; it was too late in the evening yesterday :)

> 
>> 3.4.5 High 32 Bits
>> In 64-bit mode, the following rules apply to extension of results into
>> the high 32 bits when results smaller than 64 bits are written:
>>
>> * Zero-Extension of 32-Bit Results: 32-bit results are zero-extended
>>   into the high 32 bits of 64-bit GPR destination registers.
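To make the quoted rule concrete, here is a minimal Python model of a 64-bit register. It is only a sketch: the helper names `write32` and `write16` are hypothetical, and the 16-bit case is included for contrast because 8-bit and 16-bit results leave the upper bits of the destination untouched.

```python
MASK64 = (1 << 64) - 1  # a GPR holds 64 bits in 64-bit mode


def write32(old_reg: int, value: int) -> int:
    """Model a 32-bit write (e.g. movl): the high 32 bits become zero."""
    # old_reg is deliberately ignored: nothing of the old value survives.
    return value & 0xFFFFFFFF


def write16(old_reg: int, value: int) -> int:
    """Model a 16-bit write (e.g. movw): the high 48 bits are preserved."""
    return (old_reg & ~0xFFFF & MASK64) | (value & 0xFFFF)


rax = 0xDEADBEEFCAFEBABE
print(hex(write32(rax, 0x12345678)))  # 0x12345678
print(hex(write16(rax, 0x1234)))      # 0xdeadbeefcafe1234
```

This is why a 32-bit instruction can stand in for its 64-bit form whenever the upper half of the result is known to be zero.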
> 
> I think other x86_64 CPU manufacturers also adhere to this rule, since as far
> as I know gcc relies on it as well.
> 
> I generally prefer the instructions operating on 32-bit operands over those
> operating on 64-bit operands where appropriate, because they are typically
> encoded in fewer bytes as they do not need a REX prefix.
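The size difference can be illustrated with a few hand-picked encodings. The table below is a sketch, not output from an assembler; the instructions were chosen for illustration and are not taken from the patch. Note that a 32-bit instruction still needs a REX prefix when it uses r8d to r15d, hence "typically".

```python
REX_W = 0x48  # REX prefix with only the W bit set (64-bit operand size)

# AT&T syntax -> machine code bytes for register-to-register forms
ENCODINGS = {
    "xorl %eax, %eax": bytes([0x31, 0xC0]),
    "xorq %rax, %rax": bytes([REX_W, 0x31, 0xC0]),
    "movl %ecx, %eax": bytes([0x89, 0xC8]),
    "movq %rcx, %rax": bytes([REX_W, 0x89, 0xC8]),
}


def bytes_saved(asm32: str, asm64: str) -> int:
    """Return how many bytes the 32-bit form saves over the 64-bit form."""
    return len(ENCODINGS[asm64]) - len(ENCODINGS[asm32])


for asm, code in ENCODINGS.items():
    hex_bytes = " ".join(f"{b:02x}" for b in code)
    print(f"{asm:16}  {hex_bytes:8}  {len(code)} bytes")
```

In each pair the two forms differ only by the leading `48` byte, so the 32-bit form saves one byte per instruction.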
> 
> I have updated the patch (attached) to include a code path for 'oldbinutils',
> as Gareth suggested. In addition, I switched the tails (.LCmpbyteZero and
> .LCmpbyteExitFast): when we leave the loop because the loop count has reached
> zero, we already know that the last bytes were equal, so there is no need to
> subq them.
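The idea behind the two tails can be sketched in Python. This shows only the control flow, under the assumption that the routine returns the difference of the first mismatching bytes; it is not the assembly from the patch, and the comments describe the roles of the two exits rather than mapping them to specific labels.

```python
def compare_byte(buf1: bytes, buf2: bytes, length: int) -> int:
    """Compare the first `length` bytes of two buffers.

    Returns 0 if they are all equal, otherwise the difference of the
    first pair of bytes that differ.
    """
    i = 0
    while i < length:
        if buf1[i] != buf2[i]:
            # Mismatch exit: the result has to be computed by subtracting.
            return buf1[i] - buf2[i]
        i += 1
    # Count-exhausted exit: every byte compared equal, so the result is
    # known to be zero and no subtraction is needed.
    return 0


print(compare_byte(b"abc", b"abd", 3))  # -1
print(compare_byte(b"abc", b"abc", 3))  # 0
```

Falling through to the zero exit keeps the subtraction off the path taken when the buffers are equal.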
> 
> Markus
> 
> P.S.: I am currently working on another version of CompareByte that might
> have a slightly higher latency for very small len but a higher throughput
> (2 cycles per iteration vs. 3 cycles on an Intel Arrandale CPU (Westmere
> microarchitecture)). But this would need some more testing and benchmarking.
> I can bring it up here again if there is any interest.

Small in terms of the length of the matching part, or the overall length?

BTW: I would really like to see a PCMPxSTRx-based (SSE4.2) implementation :)

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
