Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > Fix the divmod crash due to lack of control node The issue is easy to reproduce, see [JDK-8284635](https://bugs.openjdk.java.net/browse/JDK-8284635). - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]
On Mon, 11 Apr 2022 02:01:17 GMT, David Holmes wrote: > This change appears to be causing crashes in tier4 - possibly Xcomp related: > > # assert(ctrl == kit.control()) failed: Control flow was added although the > intrinsic bailed out > I will file a new bug. Thank you for informing this issue! In order to reproduce the issue, I just kickstarted a tier4 run on a Xeon server. Will keep you updated. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > Fix the divmod crash due to lack of control node This change appears to be causing crashes in tier4 - possibly Xcomp related: # assert(ctrl == kit.control()) failed: Control flow was added although the intrinsic bailed out I will file a new bug. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
On Wed, 6 Apr 2022 06:23:47 GMT, Jatin Bhateja wrote: >>> Also need a jtreg test for this. >> >> Thanks Sandhya for the review. Made the suggested changes and added jtreg >> tests as well. > > Hi @vamsi-parasa , thanks for addressing my comments, looks good to me > otherwise apart from the outstanding comments. @jatin-bhateja Thank you Jatin! - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]
On Sat, 9 Apr 2022 18:25:54 GMT, Vladimir Kozlov wrote: > Testing passed. Thank you Vladimir! - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > Fix the divmod crash due to lack of control node Testing passed. - Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > Fix the divmod crash due to lack of control node I submitted new testing. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
On Fri, 8 Apr 2022 00:55:50 GMT, Srinivas Vamsi Parasa wrote: >> I have few comments. > > Hi Vladimir (@vnkozlov), > > Incorporated all the suggestions you made in the previous review and pushed a > new commit. > Please let me know if anything else is needed. > > Thanks, > Vamsi > @vamsi-parasa I got failures in new tests when run with `-XX:UseAVX=3 > -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting ` flags: > > ``` > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGFPE (0x8) at pc=0x7f2fa8c674ea, pid=3334, tid=3335 > # > # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, > tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # J 504% c2 compiler.intrinsics.TestLongUnsignedDivMod.testDivideUnsigned()V > (48 bytes) @ 0x7f2fa8c674ea [0x7f2fa8c672a0+0x024a] > # > ``` > > ``` > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGFPE (0x8) at pc=0x7fb8c0c4fb18, pid=3309, tid=3310 > # > # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, > tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # J 445 c2 compiler.intrinsics.TestIntegerUnsignedDivMod.divmod(III)V (23 > bytes) @ 0x7fb8c0c4fb18 [0x7fb8c0c4fae0+0x0038] > # > ``` Hi Vladimir (@vnkozlov), fixed it in the new commit, could you pls check? This is being caused by lack of control() node in udiv/umod related nodes. After adding the control() node, the tests are passing for me. Thanks for pointing this out! - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Fix the divmod crash due to lack of control node - Changes: - all: https://git.openjdk.java.net/jdk/pull/7572/files - new: https://git.openjdk.java.net/jdk/pull/7572/files/3e3fc977..a71ea238 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=11-12 Stats: 8 lines in 2 files changed: 0 ins; 4 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]
On Fri, 8 Apr 2022 18:32:06 GMT, Sandhya Viswanathan wrote: > My suggestion is to keep the -ve path assembly optimization in this patch. > When the optimization in IR is introduced, the assembly could then be > simplified as suggested by @merykitty. Okay. Lets do that as part of JDK-8282365. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > uncomment zero in integer div, mod test My suggestion is to keep the -ve path assembly optimization in this patch. When the optimization in IR is introduced, the assembly could then be simplified as suggested by @merykitty. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > uncomment zero in integer div, mod test Agree, this is reasonable suggestion. It could be done in these changes. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
On Fri, 8 Apr 2022 16:39:31 GMT, Vladimir Kozlov wrote: >> Hi Vladimir (@vnkozlov), >> >> Incorporated all the suggestions you made in the previous review and pushed >> a new commit. >> Please let me know if anything else is needed. >> >> Thanks, >> Vamsi > > @vamsi-parasa I got failures in new tests when run with `-XX:UseAVX=3 > -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting ` flags: > > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGFPE (0x8) at pc=0x7f2fa8c674ea, pid=3334, tid=3335 > # > # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, > tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # J 504% c2 compiler.intrinsics.TestLongUnsignedDivMod.testDivideUnsigned()V > (48 bytes) @ 0x7f2fa8c674ea [0x7f2fa8c672a0+0x024a] > # > > > > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGFPE (0x8) at pc=0x7fb8c0c4fb18, pid=3309, tid=3310 > # > # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug > 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, > tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # J 445 c2 compiler.intrinsics.TestIntegerUnsignedDivMod.divmod(III)V (23 > bytes) @ 0x7fb8c0c4fb18 [0x7fb8c0c4fae0+0x0038] > # @vnkozlov The `uDivI_rRegNode` currently emits machine code equivalent to the following Java pseudocode: if (div < 0) { // fast path, if div < 0, then (unsigned)div > MAX_UINT / 2U // I don't know why this is so complicated, basically this is rax u>= div ? 1 : 0 return (rax & ~(rax - div)) >>> (Integer.SIZE - 1); } else { // slow path, just do the division normally return rax u/ div; } What I am suggesting is to leave the negative-divisor fast part to be implemented in the IR and the macro assembler should only concern emitting the division instruction and not worry about optimisation here. Thanks. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
On Fri, 8 Apr 2022 00:55:50 GMT, Srinivas Vamsi Parasa wrote: >> I have few comments. > > Hi Vladimir (@vnkozlov), > > Incorporated all the suggestions you made in the previous review and pushed a > new commit. > Please let me know if anything else is needed. > > Thanks, > Vamsi @vamsi-parasa I got failures in new tests when run with `-XX:UseAVX=3 -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting ` flags: # A fatal error has been detected by the Java Runtime Environment: # # SIGFPE (0x8) at pc=0x7f2fa8c674ea, pid=3334, tid=3335 # # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) # Problematic frame: # J 504% c2 compiler.intrinsics.TestLongUnsignedDivMod.testDivideUnsigned()V (48 bytes) @ 0x7f2fa8c674ea [0x7f2fa8c672a0+0x024a] # # A fatal error has been detected by the Java Runtime Environment: # # SIGFPE (0x8) at pc=0x7fb8c0c4fb18, pid=3309, tid=3310 # # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) # Problematic frame: # J 445 c2 compiler.intrinsics.TestIntegerUnsignedDivMod.divmod(III)V (23 bytes) @ 0x7fb8c0c4fb18 [0x7fb8c0c4fae0+0x0038] # - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]
On Fri, 8 Apr 2022 01:59:10 GMT, Quan Anh Mai wrote: > Personally, I think the optimisation for `div < 0` should be handled by the > mid-end optimiser, which will not only give us the advantages of dead code > elimination, but also global code motion. I would suggest the backend only > doing `xorl rdx, rdx; divl $div$$Register` and the optimisation for `div < 0` > will be implemented as a part of JDK-8282365. What do you think? I agree that we can do more optimizations with constants as JDK-8282365 suggested. But I think we should proceed with current changes as they are after fixing remaining issues. I assume that you are talking about case when `divisor` is constant (or both). Because if it is not, IR optimization will not help - we don't profile arithmetic values so we can't generate uncommon trap path without some profiling information. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > uncomment zero in integer div, mod test Personally, I think the optimisation for `div < 0` should be handled by the mid-end optimiser, which will not only give us the advantages of dead code elimination, but also global code motion. I would suggest the backend only doing `xorl rdx, rdx; divl $div$$Register` and the optimisation for `div < 0` will be implemented as a part of JDK-8282365. What do you think? Thanks. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > uncomment zero in integer div, mod test Good. I forgot before to ask about how you handle devision by 0 and now you added check for it. Let me run testing before approval. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: uncomment zero in integer div, mod test - Changes: - all: https://git.openjdk.java.net/jdk/pull/7572/files - new: https://git.openjdk.java.net/jdk/pull/7572/files/bfb6c02e..3e3fc977 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=10-11 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
On Wed, 6 Apr 2022 00:45:37 GMT, Vladimir Kozlov wrote: >> Thanks for suggesting the enhancement. This enhancement will be implemented >> as a part of https://bugs.openjdk.java.net/browse/JDK-8282365 > > You do need `Ideal()` methods at least to check for dead code. Added the Ideal() methods for checking dead code. Pls see the new commit. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
On Wed, 6 Apr 2022 00:46:01 GMT, Vladimir Kozlov wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one >> additional commit since the last revision: >> >> add error msg for jtreg test > > I have few comments. Hi Vladimir (@vnkozlov), Incorporated all the suggestions you made in the previous review and pushed a new commit. Please let me know if anything else is needed. Thanks, Vamsi > src/hotspot/cpu/x86/assembler_x86.cpp line 12375: > >> 12373: } >> 12374: #endif >> 12375: > > Please, place it near `idivq()` so you would not need `#ifdef`. Made the change as per your suggestion. > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4568: > >> 4566: subl(rdx, divisor); >> 4567: if (VM_Version::supports_bmi1()) andnl(rax, rdx, rax); >> 4568: else { > > Please, follow our coding stile here and in following methods: > > if (VM_Version::supports_bmi1()) { > andnl(rax, rdx, rax); > } else { Pls see the new commit which fixed the coding style. > src/hotspot/cpu/x86/x86_64.ad line 8701: > >> 8699: %} >> 8700: >> 8701: instruct udivI_rReg(rax_RegI rax, no_rax_rdx_RegI div, rFlagsReg cr, >> rdx_RegI rdx) > > I suggest to follow the pattern in other `div/mod` instructions: `(rax_RegI > rax, rdx_RegI rdx, no_rax_rdx_RegI div, rFlagsReg cr)` > > Similar in following new instructions. Pls see the new commit which fixed the pattern. > test/hotspot/jtreg/compiler/intrinsics/TestIntegerDivMod.java line 55: > >> 53: dividends[i] = rng.nextInt(); >> 54: divisors[i] = rng.nextInt(); >> 55: } > > I don't trust RND to generate corner cases. > Please, add cases here and in TestLongDivMod.java for MAX, MIN, 0. You are right. Using an updated corner cases test revealed divide by zero crash which was fixed. Please see the updated jtreg tests inspired by unsigned divide/remainder tests in test/jdk/java/lang/Long/Unsigned.java and test/jdk/java/lang/Integer/Unsigned.java. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v11]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Idead Ideal for udiv, umod nodes and update jtreg tests to use corner cases - Changes: - all: https://git.openjdk.java.net/jdk/pull/7572/files - new: https://git.openjdk.java.net/jdk/pull/7572/files/9949047c..bfb6c02e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=09-10 Stats: 701 lines in 7 files changed: 423 ins; 274 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v10]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Srinivas Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - use appropriate style changes - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod - Merge branch 'openjdk:master' into udivmod - add error msg for jtreg test - update jtreg test to run on x86_64 - add bmi1 support check and jtreg tests - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod - fix 32bit build issues - Fix line at end of file - Move intrinsic code to macro assembly routines; remove unused transformations for div and mod nodes - ... and 5 more: https://git.openjdk.java.net/jdk/compare/4451257b...9949047c - Changes: https://git.openjdk.java.net/jdk/pull/7572/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=7572=09 Stats: 1011 lines in 20 files changed: 1009 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
On Wed, 6 Apr 2022 00:46:01 GMT, Vladimir Kozlov wrote: > I have few comments. Thank you Vladimir (@vnkozlov) for suggesting the changes! Will incorporate the suggestions and push an update in few hours. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
On Mon, 4 Apr 2022 07:24:12 GMT, Vamsi Parasa wrote: >> Also need a jtreg test for this. > >> Also need a jtreg test for this. > > Thanks Sandhya for the review. Made the suggested changes and added jtreg > tests as well. Hi @vamsi-parasa , thanks for addressing my comments, looks good to me otherwise apart from the outstanding comments. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v9]
On Wed, 6 Apr 2022 06:02:07 GMT, Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Vamsi Parasa has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 13 commits: > > - Merge branch 'openjdk:master' into udivmod > - add error msg for jtreg test > - update jtreg test to run on x86_64 > - add bmi1 support check and jtreg tests > - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod > - fix 32bit build issues > - Fix line at end of file > - Move intrinsic code to macro assembly routines; remove unused > transformations for div and mod nodes > - fix trailing white space errors > - fix whitespaces > - ... and 3 more: > https://git.openjdk.java.net/jdk/compare/741be461...acba7c19 Marked as reviewed by jbhateja (Committer). - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v9]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch 'openjdk:master' into udivmod - add error msg for jtreg test - update jtreg test to run on x86_64 - add bmi1 support check and jtreg tests - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod - fix 32bit build issues - Fix line at end of file - Move intrinsic code to macro assembly routines; remove unused transformations for div and mod nodes - fix trailing white space errors - fix whitespaces - ... and 3 more: https://git.openjdk.java.net/jdk/compare/741be461...acba7c19 - Changes: https://git.openjdk.java.net/jdk/pull/7572/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=7572=08 Stats: 1007 lines in 20 files changed: 1005 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
On Tue, 5 Apr 2022 20:26:18 GMT, Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Vamsi Parasa has updated the pull request incrementally with one additional > commit since the last revision: > > add error msg for jtreg test I have few comments. src/hotspot/cpu/x86/assembler_x86.cpp line 12375: > 12373: } > 12374: #endif > 12375: Please, place it near `idivq()` so you would not need `#ifdef`. src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4568: > 4566: subl(rdx, divisor); > 4567: if (VM_Version::supports_bmi1()) andnl(rax, rdx, rax); > 4568: else { Please, follow our coding stile here and in following methods: if (VM_Version::supports_bmi1()) { andnl(rax, rdx, rax); } else { src/hotspot/cpu/x86/x86_64.ad line 8701: > 8699: %} > 8700: > 8701: instruct udivI_rReg(rax_RegI rax, no_rax_rdx_RegI div, rFlagsReg cr, > rdx_RegI rdx) I suggest to follow the pattern in other `div/mod` instructions: `(rax_RegI rax, rdx_RegI rdx, no_rax_rdx_RegI div, rFlagsReg cr)` Similar in following new instructions. test/hotspot/jtreg/compiler/intrinsics/TestIntegerDivMod.java line 55: > 53: dividends[i] = rng.nextInt(); > 54: divisors[i] = rng.nextInt(); > 55: } I don't trust RND to generate corner cases. Please, add cases here and in TestLongDivMod.java for MAX, MIN, 0. - Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
On Thu, 24 Feb 2022 19:04:37 GMT, Vamsi Parasa wrote: >> src/hotspot/share/opto/divnode.cpp line 881: >> >>> 879: return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : >>> this; >>> 880: } >>> 881: >>> //--Value-- >> >> Ideal transform to replace unsigned divide by cheaper logical right shift >> instruction if divisor is POW will be useful. > > Thanks for suggesting the enhancement. This enhancement will be implemented > as a part of https://bugs.openjdk.java.net/browse/JDK-8282365 You do need `Ideal()` methods at least to check for dead code. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
On Tue, 5 Apr 2022 20:26:18 GMT, Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Vamsi Parasa has updated the pull request incrementally with one additional > commit since the last revision: > > add error msg for jtreg test Marked as reviewed by sviswanathan (Reviewer). Looks good to me. You need one more review. @vnkozlov Could you please help review this patch? - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v6]
On Tue, 5 Apr 2022 17:06:44 GMT, Sandhya Viswanathan wrote: >> Vamsi Parasa has updated the pull request incrementally with one additional >> commit since the last revision: >> >> add bmi1 support check and jtreg tests > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.hpp line 362: > >> 360: void vector_popcount_long(XMMRegister dst, XMMRegister src, >> XMMRegister xtmp1, >> 361: XMMRegister xtmp2, XMMRegister xtmp3, >> Register rtmp, >> 362: int vec_enc); > > This doesn't seem to be related to this patch. This is coming due to a merge with the latest upstream (jdk) > test/hotspot/jtreg/compiler/intrinsics/TestIntegerDivMod.java line 107: > >> 105: } >> 106: if (mismatch) { >> 107: throw new RuntimeException("Test failed"); > > It would be good to print dividend, divisor, operation, actual result and > expected result here. Please see the updated error message in the recent commit. > test/hotspot/jtreg/compiler/intrinsics/TestLongDivMod.java line 104: > >> 102: } >> 103: if (mismatch) { >> 104: throw new RuntimeException("Test failed"); > > It would be good to print dividend, divisor, operation, actual result and > expected result here. Please see the updated error message in the recent commit. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: add error msg for jtreg test - Changes: - all: https://git.openjdk.java.net/jdk/pull/7572/files - new: https://git.openjdk.java.net/jdk/pull/7572/files/e97c6fbc..8047767c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=06-07 Stats: 41 lines in 2 files changed: 37 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
On Thu, 24 Feb 2022 14:13:47 GMT, Jatin Bhateja wrote: >> Vamsi Parasa has updated the pull request incrementally with one additional >> commit since the last revision: >> >> fix 32bit build issues > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4408: > >> 4406: jmp(done); >> 4407: bind(neg_divisor_fastpath); >> 4408: // Fastpath for divisor < 0: > > How about checking if divisor is +ve or -ve constant and non-constant > dividend in identity routine and setting a flag in IR node, which can be used > to either emit fast / slow path in a new instruction selection pattern. It > will save emitting redundant instructions. Thanks for suggesting the enhancement. This enhancement will be implemented as a part of https://bugs.openjdk.java.net/browse/JDK-8282365 > src/hotspot/share/opto/divnode.cpp line 881: > >> 879: return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : >> this; >> 880: } >> 881: >> //--Value-- > > Ideal transform to replace unsigned divide by cheaper logical right shift > instruction if divisor is POW will be useful. Thanks for suggesting the enhancement. This enhancement will be implemented as a part of https://bugs.openjdk.java.net/browse/JDK-8282365 > src/hotspot/share/opto/divnode.cpp line 897: > >> 895: >> 896: // Either input is BOTTOM ==> the result is the local BOTTOM >> 897: const Type *bot = bottom_type(); > > Can we add constant folding handling when both dividend and divisor are > constants. Thanks for suggesting the enhancement. This enhancement will be implemented as a part of https://bugs.openjdk.java.net/browse/JDK-8282365 - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
On Thu, 24 Feb 2022 02:43:46 GMT, Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > Vamsi Parasa has updated the pull request incrementally with one additional > commit since the last revision: > > fix 32bit build issues src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4408: > 4406: jmp(done); > 4407: bind(neg_divisor_fastpath); > 4408: // Fastpath for divisor < 0: How about checking if divisor is +ve or -ve constant and non-constant dividend in identity routine and setting a flag in IR node, which can be used to either emit fast / slow path in a new instruction selection pattern. It will save emitting redundant instructions. src/hotspot/share/opto/divnode.cpp line 881: > 879: return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : > this; > 880: } > 881: > //--Value-- Ideal transform to replace unsigned divide by cheaper logical right shift instruction if divisor is POW will be useful. src/hotspot/share/opto/divnode.cpp line 897: > 895: > 896: // Either input is BOTTOM ==> the result is the local BOTTOM > 897: const Type *bot = bottom_type(); Can we add constant folding handling when both dividend and divisor are constants. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: fix 32bit build issues - Changes: - all: https://git.openjdk.java.net/jdk/pull/7572/files - new: https://git.openjdk.java.net/jdk/pull/7572/files/13549290..2915b2e7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=02-03 Stats: 91 lines in 2 files changed: 49 ins; 42 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v3]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Fix line at end of file - Changes: - all: https://git.openjdk.java.net/jdk/pull/7572/files - new: https://git.openjdk.java.net/jdk/pull/7572/files/7fc18af3..13549290 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v2]
> Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Move intrinsic code to macro assembly routines; remove unused transformations for div and mod nodes - Changes: - all: https://git.openjdk.java.net/jdk/pull/7572/files - new: https://git.openjdk.java.net/jdk/pull/7572/files/fa57175a..7fc18af3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=00-01 Stats: 326 lines in 7 files changed: 137 ins; 176 del; 13 mod Patch: https://git.openjdk.java.net/jdk/pull/7572.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572 PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long
On Wed, 23 Feb 2022 05:52:00 GMT, Jatin Bhateja wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 76: > >> 74: return quotients; >> 75: } >> 76: > > Return seems redundant here. Will remove it. > test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 83: > >> 81: } >> 82: return remainders; >> 83: } > > Return seems redundant here. Will remove it. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long
On Wed, 23 Feb 2022 05:46:45 GMT, Jatin Bhateja wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > src/hotspot/share/opto/divnode.cpp line 1350: > >> 1348: return NULL; >> 1349: } >> 1350: > > Please remove Value and Ideal routines if no explicit transforms are being > done. Will remove the unused transformations. > src/hotspot/share/opto/divnode.cpp line 1362: > >> 1360: } >> 1361: >> 1362: >> //= > > You can remove Ideal routine is not transformation is being done. Will remove the unused transformation. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long
On Wed, 23 Feb 2022 05:43:10 GMT, Jatin Bhateja wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and upto 25% improvement for >> Long. This change also implements the DivMod optimization which fuses >> division and modulus operations if needed. The DivMod optimization shows 3x >> improvement for Integer and ~65% improvement for Long. > > src/hotspot/cpu/x86/x86_64.ad line 8602: > >> 8600: __ jmp(done); >> 8601: __ bind(neg_divisor_fastpath); >> 8602: // Fastpath for divisor < 0: > > Move in macro assembly routine. Sure, will move it to a macro assembly routine > src/hotspot/cpu/x86/x86_64.ad line 8633: > >> 8631: __ jmp(done); >> 8632: __ bind(neg_divisor_fastpath); >> 8633: // Fastpath for divisor < 0: > > Move in macro assembly rountine. Sure, will move it to a macro assembly routine > src/hotspot/cpu/x86/x86_64.ad line 8902: > >> 8900: __ subl(tmp_rax, divisor); >> 8901: __ andnl(tmp_rax, tmp_rax, rdx); >> 8902: __ sarl(tmp_rax, 31); > > Please move this into a macro assembly routine. Sure, will move it to a macro assembly routine > src/hotspot/cpu/x86/x86_64.ad line 8932: > >> 8930: // Fastpath when divisor < 0: >> 8931: // remainder = dividend - (((dividend & ~(dividend - divisor)) >> >> (Long.SIZE - 1)) & divisor) >> 8932: // See Hacker's Delight (2nd ed), section 9.3 which is implemented >> in java.lang.Long.remainderUnsigned() > > Please move it into a macro assembly routine. Sure, will move it to a macro assembly routine > src/hotspot/share/opto/compile.cpp line 3499: > >> 3497: Node* d = n->find_similar(Op_UDivI); >> 3498: if (d) { >> 3499: // Replace them with a fused unsigned divmod if supported > > Can you explain a bit here, why can't this transformation be handled earlier ? This is following the existing approach being used for signed DivMod > test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 75: > >> 73: } >> 74: return quotients; >> 75: } > > Do we need to return quotients, since it's a field being explicitly modified. Will remove it. > test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 82: > >> 80: remainders[i] = Long.remainderUnsigned(dividends[i], >> divisors[i]); >> 81: } >> 82: return remainders; > > Same as above Will remove it. - PR: https://git.openjdk.java.net/jdk/pull/7572
Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long
On Tue, 22 Feb 2022 09:24:47 GMT, Vamsi Parasa wrote: > Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows 3x improvement for Integer methods and upto 25% improvement for > Long. This change also implements the DivMod optimization which fuses > division and modulus operations if needed. The DivMod optimization shows 3x > improvement for Integer and ~65% improvement for Long. src/hotspot/cpu/x86/x86_64.ad line 8602: > 8600: __ jmp(done); > 8601: __ bind(neg_divisor_fastpath); > 8602: // Fastpath for divisor < 0: Move in macro assembly routine. src/hotspot/cpu/x86/x86_64.ad line 8633: > 8631: __ jmp(done); > 8632: __ bind(neg_divisor_fastpath); > 8633: // Fastpath for divisor < 0: Move in macro assembly rountine. src/hotspot/cpu/x86/x86_64.ad line 8722: > 8720: __ shrl(rax, 31); // quotient > 8721: __ sarl(tmp, 31); > 8722: __ andl(tmp, divisor); Move in macro assembly routine. src/hotspot/cpu/x86/x86_64.ad line 8763: > 8761: __ andnq(rax, rax, rdx); > 8762: __ movq(tmp, rax); > 8763: __ shrq(rax, 63); // quotient Move in macro assembly routine. src/hotspot/cpu/x86/x86_64.ad line 8902: > 8900: __ subl(tmp_rax, divisor); > 8901: __ andnl(tmp_rax, tmp_rax, rdx); > 8902: __ sarl(tmp_rax, 31); Please move this into a macro assembly routine. src/hotspot/cpu/x86/x86_64.ad line 8932: > 8930: // Fastpath when divisor < 0: > 8931: // remainder = dividend - (((dividend & ~(dividend - divisor)) >> > (Long.SIZE - 1)) & divisor) > 8932: // See Hacker's Delight (2nd ed), section 9.3 which is implemented > in java.lang.Long.remainderUnsigned() Please move it into a macro assembly routine. src/hotspot/share/opto/compile.cpp line 3499: > 3497: Node* d = n->find_similar(Op_UDivI); > 3498: if (d) { > 3499: // Replace them with a fused unsigned divmod if supported Can you explain a bit here, why can't this transformation be handled earlier ? src/hotspot/share/opto/divnode.cpp line 1350: > 1348: return NULL; > 1349: } > 1350: Please remove Value and Ideal routines if no explicit transforms are being done. src/hotspot/share/opto/divnode.cpp line 1362: > 1360: } > 1361: > 1362: > //= You can remove Ideal routine is not transformation is being done. test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 76: > 74: return quotients; > 75: } > 76: Return seems redundant here. test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 83: > 81: } > 82: return remainders; > 83: } Return seems redundant here. test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 75: > 73: } > 74: return quotients; > 75: } Do we need to return quotients, since it's a field being explicitly modified. test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 82: > 80: remainders[i] = Long.remainderUnsigned(dividends[i], > divisors[i]); > 81: } > 82: return remainders; Same as above - PR: https://git.openjdk.java.net/jdk/pull/7572