> Hi, > > This patch optimises the matching rules for floating-point comparison with > respects to eq/ne on x86-64 > > 1, When the inputs of a comparison is the same (i.e `isNaN` patterns), `ZF` > is always set, so we don't need `cmpOpUCF2` for the eq/ne cases, which > improves the sequence of `If (CmpF x x) (Bool ne)` from > > ucomiss xmm0, xmm0 > jp label > jne label > > into > > ucomiss xmm0, xmm0 > jp label > > 2, The move rules for `cmpOpUCF2` is missing, which makes patterns such as `x > == y ? 1 : 0` to fall back to `cmpOpU`, which have a really high cost of > fixing the flags, such as > > xorl ecx, ecx > ucomiss xmm0, xmm1 > jnp done > pushf > andq [rsp], 0xffffff2b > popf > done: > movl eax, 1 > cmovel eax, ecx > > The patch changes this sequence into > > xorl ecx, ecx > ucomiss xmm0, xmm1 > movl eax, 1 > cmovpl eax, ecx > cmovnel eax, ecx > > 3, The patch also changes the pattern of `isInfinite` to be more optimised by > using `Math.abs` to reduce 1 comparison and compares the result with > `MAX_VALUE` since `>` is more optimised than `==` for floating-point types. > > The benchmark results are as follow: > > Before After > Benchmark Mode Cnt Score Error Score > Error Unit Ratio > FPComparison.equalDouble avgt 5 2876.242 ± 58.875 594.636 ± > 8.922 ns/op 4.84 > FPComparison.equalFloat avgt 5 3062.430 ± 31.371 663.849 ± > 3.656 ns/op 4.61 > FPComparison.isFiniteDouble avgt 5 475.749 ± 19.027 518.309 ± > 107.352 ns/op 0.92 > FPComparison.isFiniteFloat avgt 5 506.525 ± 14.417 515.576 ± > 14.669 ns/op 0.98 > FPComparison.isInfiniteDouble avgt 5 1232.800 ± 31.677 621.185 ± > 11.935 ns/op 1.98 > FPComparison.isInfiniteFloat avgt 5 1234.708 ± 70.239 623.566 ± > 15.206 ns/op 1.98 > FPComparison.isNanDouble avgt 5 2255.847 ± 7.238 400.124 ± > 0.762 ns/op 5.64 > FPComparison.isNanFloat avgt 5 2567.044 ± 36.078 546.486 ± > 1.509 ns/op 4.70 > > Thank you very much.
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/8525/files - new: https://git.openjdk.java.net/jdk/pull/8525/files/ba93dcf2..7fcfe4a3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8525&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8525&range=01-02 Stats: 11 lines in 1 file changed: 10 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/8525.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/8525/head:pull/8525 PR: https://git.openjdk.java.net/jdk/pull/8525