Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-11 Thread Tobias Hartmann
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Fix the divmod crash due to lack of control node

The issue is easy to reproduce, see 
[JDK-8284635](https://bugs.openjdk.java.net/browse/JDK-8284635).

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-10 Thread Srinivas Vamsi Parasa
On Mon, 11 Apr 2022 02:01:17 GMT, David Holmes  wrote:

> This change appears to be causing crashes in tier4 - possibly Xcomp related:
> 
> # assert(ctrl == kit.control()) failed: Control flow was added although the 
> intrinsic bailed out
> I will file a new bug.

Thank you for informing this issue! In order to reproduce the issue, I just 
kickstarted a tier4 run on a Xeon server. Will  keep you updated.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-10 Thread David Holmes
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Fix the divmod crash due to lack of control node

This change appears to be causing crashes in tier4 - possibly Xcomp related:

#  assert(ctrl == kit.control()) failed: Control flow was added although the 
intrinsic bailed out

I will file a new bug.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-04-10 Thread Srinivas Vamsi Parasa
On Wed, 6 Apr 2022 06:23:47 GMT, Jatin Bhateja  wrote:

>>> Also need a jtreg test for this.
>> 
>> Thanks Sandhya for the review. Made the suggested changes and added jtreg 
>> tests as well.
>
> Hi @vamsi-parasa , thanks for addressing my comments, looks good to me 
> otherwise apart from the outstanding comments.

@jatin-bhateja Thank you Jatin!

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-09 Thread Srinivas Vamsi Parasa
On Sat, 9 Apr 2022 18:25:54 GMT, Vladimir Kozlov  wrote:

> Testing passed.

Thank you Vladimir!

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-09 Thread Vladimir Kozlov
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Fix the divmod crash due to lack of control node

Testing passed.

-

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-08 Thread Vladimir Kozlov
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Fix the divmod crash due to lack of control node

I submitted new testing.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-08 Thread Srinivas Vamsi Parasa
On Fri, 8 Apr 2022 00:55:50 GMT, Srinivas Vamsi Parasa  
wrote:

>> I have few comments.
>
> Hi Vladimir (@vnkozlov),
> 
> Incorporated all the suggestions you made in the previous review and pushed a 
> new commit.
> Please let me know if anything else is needed.
> 
> Thanks,
> Vamsi

> @vamsi-parasa I got failures in new tests when run with `-XX:UseAVX=3 
> -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting ` flags:
> 
> ```
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGFPE (0x8) at pc=0x7f2fa8c674ea, pid=3334, tid=3335
> #
> # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, 
> tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
> # Problematic frame:
> # J 504% c2 compiler.intrinsics.TestLongUnsignedDivMod.testDivideUnsigned()V 
> (48 bytes) @ 0x7f2fa8c674ea [0x7f2fa8c672a0+0x024a]
> #
> ```
> 
> ```
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGFPE (0x8) at pc=0x7fb8c0c4fb18, pid=3309, tid=3310
> #
> # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, 
> tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
> # Problematic frame:
> # J 445 c2 compiler.intrinsics.TestIntegerUnsignedDivMod.divmod(III)V (23 
> bytes) @ 0x7fb8c0c4fb18 [0x7fb8c0c4fae0+0x0038]
> #
> ```

Hi Vladimir (@vnkozlov), fixed it in the new commit, could you pls check? 
This is being caused by lack of control() node in udiv/umod related nodes. 
After adding the control() node, the tests are passing for me. Thanks for 
pointing this out!

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-08 Thread Srinivas Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  Fix the divmod crash due to lack of control node

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/3e3fc977..a71ea238

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=12
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=11-12

  Stats: 8 lines in 2 files changed: 0 ins; 4 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-08 Thread Vladimir Kozlov
On Fri, 8 Apr 2022 18:32:06 GMT, Sandhya Viswanathan  
wrote:

> My suggestion is to keep the -ve path assembly optimization in this patch.
> When the optimization in IR is introduced, the assembly could then be 
> simplified as suggested by @merykitty.

Okay. Lets do that as part of JDK-8282365.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-08 Thread Sandhya Viswanathan
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   uncomment zero in integer div, mod test

My suggestion is to keep the -ve path assembly optimization in this patch.
When the optimization in IR is introduced, the assembly could then be 
simplified as suggested by @merykitty.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-08 Thread Vladimir Kozlov
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   uncomment zero in integer div, mod test

Agree, this is reasonable suggestion. It could be done in these changes.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-08 Thread Quan Anh Mai
On Fri, 8 Apr 2022 16:39:31 GMT, Vladimir Kozlov  wrote:

>> Hi Vladimir (@vnkozlov),
>> 
>> Incorporated all the suggestions you made in the previous review and pushed 
>> a new commit.
>> Please let me know if anything else is needed.
>> 
>> Thanks,
>> Vamsi
>
> @vamsi-parasa I got failures in new tests when run with `-XX:UseAVX=3 
> -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting ` flags:
> 
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGFPE (0x8) at pc=0x7f2fa8c674ea, pid=3334, tid=3335
> #
> # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, 
> tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
> # Problematic frame:
> # J 504% c2 compiler.intrinsics.TestLongUnsignedDivMod.testDivideUnsigned()V 
> (48 bytes) @ 0x7f2fa8c674ea [0x7f2fa8c672a0+0x024a]
> #
> 
> 
> 
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGFPE (0x8) at pc=0x7fb8c0c4fb18, pid=3309, tid=3310
> #
> # JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
> 19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, 
> tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
> # Problematic frame:
> # J 445 c2 compiler.intrinsics.TestIntegerUnsignedDivMod.divmod(III)V (23 
> bytes) @ 0x7fb8c0c4fb18 [0x7fb8c0c4fae0+0x0038]
> #

@vnkozlov  The `uDivI_rRegNode` currently emits machine code equivalent to the 
following Java pseudocode:

if (div < 0) {
// fast path, if div < 0, then (unsigned)div > MAX_UINT / 2U
// I don't know why this is so complicated, basically this is rax u>= 
div ? 1 : 0
return (rax & ~(rax - div)) >>> (Integer.SIZE - 1); 
} else {
// slow path, just do the division normally
return rax u/ div;
}

What I am suggesting is to leave the negative-divisor fast part to be 
implemented in the IR and the macro assembler should only concern emitting the 
division instruction and not worry about optimisation here.

Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-08 Thread Vladimir Kozlov
On Fri, 8 Apr 2022 00:55:50 GMT, Srinivas Vamsi Parasa  
wrote:

>> I have few comments.
>
> Hi Vladimir (@vnkozlov),
> 
> Incorporated all the suggestions you made in the previous review and pushed a 
> new commit.
> Please let me know if anything else is needed.
> 
> Thanks,
> Vamsi

@vamsi-parasa I got failures in new tests when run with `-XX:UseAVX=3 
-XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting ` flags:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGFPE (0x8) at pc=0x7f2fa8c674ea, pid=3334, tid=3335
#
# JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 
19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, 
tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# J 504% c2 compiler.intrinsics.TestLongUnsignedDivMod.testDivideUnsigned()V 
(48 bytes) @ 0x7f2fa8c674ea [0x7f2fa8c672a0+0x024a]
#



# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGFPE (0x8) at pc=0x7fb8c0c4fb18, pid=3309, tid=3310
#
# JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 
19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
19-internal-2022-04-08-0157190.vladimir.kozlov.jdkgit, mixed mode, sharing, 
tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# J 445 c2 compiler.intrinsics.TestIntegerUnsignedDivMod.divmod(III)V (23 
bytes) @ 0x7fb8c0c4fb18 [0x7fb8c0c4fae0+0x0038]
#

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-08 Thread Vladimir Kozlov
On Fri, 8 Apr 2022 01:59:10 GMT, Quan Anh Mai  wrote:

> Personally, I think the optimisation for `div < 0` should be handled by the 
> mid-end optimiser, which will not only give us the advantages of dead code 
> elimination, but also global code motion. I would suggest the backend only 
> doing `xorl rdx, rdx; divl $div$$Register` and the optimisation for `div < 0` 
> will be implemented as a part of JDK-8282365. What do you think?

 I agree that we can do more optimizations with constants as JDK-8282365 
suggested.

But I think we should proceed with current changes as they are after fixing 
remaining issues.
I assume that you are talking about case when `divisor` is constant (or both). 
Because if it is not, IR optimization will not help - we don't profile 
arithmetic values so we can't generate uncommon trap path without some 
profiling information.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-07 Thread Quan Anh Mai
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   uncomment zero in integer div, mod test

Personally, I think the optimisation for `div < 0` should be handled by the 
mid-end optimiser, which will not only give us the advantages of dead code 
elimination, but also global code motion. I would suggest the backend only 
doing `xorl rdx, rdx; divl $div$$Register` and the optimisation for `div < 0` 
will be implemented as a part of JDK-8282365. What do you think?
Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-07 Thread Vladimir Kozlov
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   uncomment zero in integer div, mod test

Good. I forgot before to ask about how you handle devision by 0 and now you 
added check for it.

Let me run testing before approval.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-07 Thread Srinivas Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  uncomment zero in integer div, mod test

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/bfb6c02e..3e3fc977

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=11
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=10-11

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-04-07 Thread Srinivas Vamsi Parasa
On Wed, 6 Apr 2022 00:45:37 GMT, Vladimir Kozlov  wrote:

>> Thanks for suggesting the enhancement. This enhancement will be implemented 
>> as a part of https://bugs.openjdk.java.net/browse/JDK-8282365
>
> You do need `Ideal()` methods at least to check for dead code.

Added the Ideal() methods for checking dead code. Pls see the new commit.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-07 Thread Srinivas Vamsi Parasa
On Wed, 6 Apr 2022 00:46:01 GMT, Vladimir Kozlov  wrote:

>> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   add error msg for jtreg test
>
> I have few comments.

Hi Vladimir (@vnkozlov),

Incorporated all the suggestions you made in the previous review and pushed a 
new commit.
Please let me know if anything else is needed.

Thanks,
Vamsi

> src/hotspot/cpu/x86/assembler_x86.cpp line 12375:
> 
>> 12373: }
>> 12374: #endif
>> 12375: 
> 
> Please, place it near `idivq()` so you would not need `#ifdef`.

Made the change as per your suggestion.

> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4568:
> 
>> 4566:   subl(rdx, divisor);
>> 4567:   if (VM_Version::supports_bmi1()) andnl(rax, rdx, rax);
>> 4568:   else {
> 
> Please, follow our coding stile here and in following methods:
> 
> if (VM_Version::supports_bmi1()) {
>   andnl(rax, rdx, rax);
> } else {

Pls see the new commit which fixed the coding style.

> src/hotspot/cpu/x86/x86_64.ad line 8701:
> 
>> 8699: %}
>> 8700: 
>> 8701: instruct udivI_rReg(rax_RegI rax, no_rax_rdx_RegI div, rFlagsReg cr, 
>> rdx_RegI rdx)
> 
> I suggest to follow the pattern in other `div/mod` instructions: `(rax_RegI 
> rax, rdx_RegI rdx, no_rax_rdx_RegI div, rFlagsReg cr)`
> 
> Similar in following new instructions.

Pls see the new commit which fixed the pattern.

> test/hotspot/jtreg/compiler/intrinsics/TestIntegerDivMod.java line 55:
> 
>> 53: dividends[i] = rng.nextInt();
>> 54: divisors[i] = rng.nextInt();
>> 55: }
> 
> I don't trust RND to generate corner cases.
> Please, add cases here and in TestLongDivMod.java for MAX, MIN, 0.

You are right. Using an updated corner cases test revealed divide by zero crash 
which was fixed. Please see the updated jtreg tests inspired by unsigned 
divide/remainder tests in test/jdk/java/lang/Long/Unsigned.java and 
test/jdk/java/lang/Integer/Unsigned.java.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v11]

2022-04-07 Thread Srinivas Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  Idead Ideal for udiv, umod nodes and update jtreg tests to use corner cases

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/9949047c..bfb6c02e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=10
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=09-10

  Stats: 701 lines in 7 files changed: 423 ins; 274 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v10]

2022-04-06 Thread Srinivas Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Srinivas Vamsi Parasa has updated the pull request with a new target base due 
to a merge or a rebase. The pull request now contains 15 commits:

 - use appropriate style changes
 - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod
 - Merge branch 'openjdk:master' into udivmod
 - add error msg for jtreg test
 - update jtreg test to run on x86_64
 - add bmi1 support check and jtreg tests
 - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod
 - fix 32bit build issues
 - Fix line at end of file
 - Move intrinsic code to macro assembly routines; remove unused 
transformations for div and mod nodes
 - ... and 5 more: https://git.openjdk.java.net/jdk/compare/4451257b...9949047c

-

Changes: https://git.openjdk.java.net/jdk/pull/7572/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk=7572=09
  Stats: 1011 lines in 20 files changed: 1009 ins; 1 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-06 Thread Vamsi Parasa
On Wed, 6 Apr 2022 00:46:01 GMT, Vladimir Kozlov  wrote:

> I have few comments.

Thank you Vladimir (@vnkozlov) for suggesting the changes! Will incorporate the 
suggestions and push an update in few hours.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-04-06 Thread Jatin Bhateja
On Mon, 4 Apr 2022 07:24:12 GMT, Vamsi Parasa  wrote:

>> Also need a jtreg test for this.
>
>> Also need a jtreg test for this.
> 
> Thanks Sandhya for the review. Made the suggested changes and added jtreg 
> tests as well.

Hi @vamsi-parasa , thanks for addressing my comments, looks good to me 
otherwise apart from the outstanding comments.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v9]

2022-04-06 Thread Jatin Bhateja
On Wed, 6 Apr 2022 06:02:07 GMT, Vamsi Parasa  wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Vamsi Parasa has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 13 commits:
> 
>  - Merge branch 'openjdk:master' into udivmod
>  - add error msg for jtreg test
>  - update jtreg test to run on x86_64
>  - add bmi1 support check and jtreg tests
>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod
>  - fix 32bit build issues
>  - Fix line at end of file
>  - Move intrinsic code to macro assembly routines; remove unused 
> transformations for div and mod nodes
>  - fix trailing white space errors
>  - fix whitespaces
>  - ... and 3 more: 
> https://git.openjdk.java.net/jdk/compare/741be461...acba7c19

Marked as reviewed by jbhateja (Committer).

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v9]

2022-04-06 Thread Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request with a new target base due to a merge 
or a rebase. The pull request now contains 13 commits:

 - Merge branch 'openjdk:master' into udivmod
 - add error msg for jtreg test
 - update jtreg test to run on x86_64
 - add bmi1 support check and jtreg tests
 - Merge branch 'master' of https://git.openjdk.java.net/jdk into udivmod
 - fix 32bit build issues
 - Fix line at end of file
 - Move intrinsic code to macro assembly routines; remove unused 
transformations for div and mod nodes
 - fix trailing white space errors
 - fix whitespaces
 - ... and 3 more: https://git.openjdk.java.net/jdk/compare/741be461...acba7c19

-

Changes: https://git.openjdk.java.net/jdk/pull/7572/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk=7572=08
  Stats: 1007 lines in 20 files changed: 1005 ins; 1 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-05 Thread Vladimir Kozlov
On Tue, 5 Apr 2022 20:26:18 GMT, Vamsi Parasa  wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Vamsi Parasa has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   add error msg for jtreg test

I have few comments.

src/hotspot/cpu/x86/assembler_x86.cpp line 12375:

> 12373: }
> 12374: #endif
> 12375: 

Please, place it near `idivq()` so you would not need `#ifdef`.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4568:

> 4566:   subl(rdx, divisor);
> 4567:   if (VM_Version::supports_bmi1()) andnl(rax, rdx, rax);
> 4568:   else {

Please, follow our coding stile here and in following methods:

if (VM_Version::supports_bmi1()) {
  andnl(rax, rdx, rax);
} else {

src/hotspot/cpu/x86/x86_64.ad line 8701:

> 8699: %}
> 8700: 
> 8701: instruct udivI_rReg(rax_RegI rax, no_rax_rdx_RegI div, rFlagsReg cr, 
> rdx_RegI rdx)

I suggest to follow the pattern in other `div/mod` instructions: `(rax_RegI 
rax, rdx_RegI rdx, no_rax_rdx_RegI div, rFlagsReg cr)`

Similar in following new instructions.

test/hotspot/jtreg/compiler/intrinsics/TestIntegerDivMod.java line 55:

> 53: dividends[i] = rng.nextInt();
> 54: divisors[i] = rng.nextInt();
> 55: }

I don't trust RND to generate corner cases.
Please, add cases here and in TestLongDivMod.java for MAX, MIN, 0.

-

Changes requested by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-04-05 Thread Vladimir Kozlov
On Thu, 24 Feb 2022 19:04:37 GMT, Vamsi Parasa  wrote:

>> src/hotspot/share/opto/divnode.cpp line 881:
>> 
>>> 879:   return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : 
>>> this;
>>> 880: }
>>> 881: 
>>> //--Value--
>> 
>> Ideal transform to replace unsigned divide by cheaper logical right shift 
>> instruction if divisor is POW will be useful.
>
> Thanks for suggesting the enhancement. This enhancement will be implemented 
> as a part of https://bugs.openjdk.java.net/browse/JDK-8282365

You do need `Ideal()` methods at least to check for dead code.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-05 Thread Sandhya Viswanathan
On Tue, 5 Apr 2022 20:26:18 GMT, Vamsi Parasa  wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Vamsi Parasa has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   add error msg for jtreg test

Marked as reviewed by sviswanathan (Reviewer).

Looks good to me. You need one more review.

@vnkozlov Could you please help review this patch?

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v6]

2022-04-05 Thread Vamsi Parasa
On Tue, 5 Apr 2022 17:06:44 GMT, Sandhya Viswanathan  
wrote:

>> Vamsi Parasa has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   add bmi1 support check and jtreg tests
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.hpp line 362:
> 
>> 360:   void vector_popcount_long(XMMRegister dst, XMMRegister src, 
>> XMMRegister xtmp1,
>> 361: XMMRegister xtmp2, XMMRegister xtmp3, 
>> Register rtmp,
>> 362: int vec_enc);
> 
> This doesn't seem to be related to this patch.

This is coming due to a merge with the latest upstream (jdk)

> test/hotspot/jtreg/compiler/intrinsics/TestIntegerDivMod.java line 107:
> 
>> 105: }
>> 106: if (mismatch) {
>> 107: throw new RuntimeException("Test failed");
> 
> It would be good to print dividend, divisor, operation, actual result and 
> expected result here.

Please see the updated error message in the recent commit.

> test/hotspot/jtreg/compiler/intrinsics/TestLongDivMod.java line 104:
> 
>> 102: }
>> 103: if (mismatch) {
>> 104: throw new RuntimeException("Test failed");
> 
> It would be good to print dividend, divisor, operation, actual result and 
> expected result here.

Please see the updated error message in the recent commit.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-05 Thread Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request incrementally with one additional 
commit since the last revision:

  add error msg for jtreg test

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/e97c6fbc..8047767c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=07
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=06-07

  Stats: 41 lines in 2 files changed: 37 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-02-24 Thread Vamsi Parasa
On Thu, 24 Feb 2022 14:13:47 GMT, Jatin Bhateja  wrote:

>> Vamsi Parasa has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   fix 32bit build issues
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4408:
> 
>> 4406:   jmp(done);
>> 4407:   bind(neg_divisor_fastpath);
>> 4408:   // Fastpath for divisor < 0:
> 
> How about checking if divisor is +ve or -ve constant and non-constant 
> dividend in identity routine and setting a flag in IR node, which can be used 
> to either emit fast / slow path in a new instruction selection pattern. It 
> will save emitting redundant instructions.

Thanks for suggesting the enhancement. This enhancement will be implemented as 
a part of https://bugs.openjdk.java.net/browse/JDK-8282365

> src/hotspot/share/opto/divnode.cpp line 881:
> 
>> 879:   return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : 
>> this;
>> 880: }
>> 881: 
>> //--Value--
> 
> Ideal transform to replace unsigned divide by cheaper logical right shift 
> instruction if divisor is POW will be useful.

Thanks for suggesting the enhancement. This enhancement will be implemented as 
a part of https://bugs.openjdk.java.net/browse/JDK-8282365

> src/hotspot/share/opto/divnode.cpp line 897:
> 
>> 895: 
>> 896:   // Either input is BOTTOM ==> the result is the local BOTTOM
>> 897:   const Type *bot = bottom_type();
> 
> Can we add constant folding handling when both dividend and divisor are 
> constants.

Thanks for suggesting the enhancement. This enhancement will be implemented as 
a part of https://bugs.openjdk.java.net/browse/JDK-8282365

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-02-24 Thread Jatin Bhateja
On Thu, 24 Feb 2022 02:43:46 GMT, Vamsi Parasa  wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Vamsi Parasa has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   fix 32bit build issues

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4408:

> 4406:   jmp(done);
> 4407:   bind(neg_divisor_fastpath);
> 4408:   // Fastpath for divisor < 0:

How about checking if divisor is +ve or -ve constant and non-constant dividend 
in identity routine and setting a flag in IR node, which can be used to either 
emit fast / slow path in a new instruction selection pattern. It will save 
emitting redundant instructions.

src/hotspot/share/opto/divnode.cpp line 881:

> 879:   return (phase->type( in(2) )->higher_equal(TypeLong::ONE)) ? in(1) : 
> this;
> 880: }
> 881: 
> //--Value--

Ideal transform to replace unsigned divide by cheaper logical right shift 
instruction if divisor is POW will be useful.

src/hotspot/share/opto/divnode.cpp line 897:

> 895: 
> 896:   // Either input is BOTTOM ==> the result is the local BOTTOM
> 897:   const Type *bot = bottom_type();

Can we add constant folding handling when both dividend and divisor are 
constants.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-02-23 Thread Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request incrementally with one additional 
commit since the last revision:

  fix 32bit build issues

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/13549290..2915b2e7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=02-03

  Stats: 91 lines in 2 files changed: 49 ins; 42 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v3]

2022-02-23 Thread Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request incrementally with one additional 
commit since the last revision:

  Fix line at end of file

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/7fc18af3..13549290

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v2]

2022-02-23 Thread Vamsi Parasa
> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

Vamsi Parasa has updated the pull request incrementally with one additional 
commit since the last revision:

  Move intrinsic code to macro assembly routines; remove unused transformations 
for div and mod nodes

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7572/files
  - new: https://git.openjdk.java.net/jdk/pull/7572/files/fa57175a..7fc18af3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=7572=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=7572=00-01

  Stats: 326 lines in 7 files changed: 137 ins; 176 del; 13 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7572.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7572/head:pull/7572

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long

2022-02-23 Thread Vamsi Parasa
On Wed, 23 Feb 2022 05:52:00 GMT, Jatin Bhateja  wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 76:
> 
>> 74: return quotients;
>> 75: }
>> 76: 
> 
> Return seems redundant here.

Will remove it.

> test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 83:
> 
>> 81: }
>> 82: return remainders;
>> 83: }
> 
> Return seems redundant here.

Will remove it.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long

2022-02-23 Thread Vamsi Parasa
On Wed, 23 Feb 2022 05:46:45 GMT, Jatin Bhateja  wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> src/hotspot/share/opto/divnode.cpp line 1350:
> 
>> 1348: return NULL;
>> 1349:   }
>> 1350: 
> 
> Please remove Value and Ideal routines if no explicit transforms are being 
> done.

Will remove the unused transformations.

> src/hotspot/share/opto/divnode.cpp line 1362:
> 
>> 1360:   }
>> 1361: 
>> 1362: 
>> //=
> 
> You can remove Ideal routine is not transformation is being done.

Will remove the unused transformation.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long

2022-02-23 Thread Vamsi Parasa
On Wed, 23 Feb 2022 05:43:10 GMT, Jatin Bhateja  wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> src/hotspot/cpu/x86/x86_64.ad line 8602:
> 
>> 8600: __ jmp(done);
>> 8601: __ bind(neg_divisor_fastpath); 
>> 8602: // Fastpath for divisor < 0: 
> 
> Move in macro assembly routine.

Sure, will move it to a macro assembly routine

> src/hotspot/cpu/x86/x86_64.ad line 8633:
> 
>> 8631: __ jmp(done);
>> 8632: __ bind(neg_divisor_fastpath);
>> 8633: // Fastpath for divisor < 0: 
> 
> Move in macro assembly rountine.

Sure, will move it to a macro assembly routine

> src/hotspot/cpu/x86/x86_64.ad line 8902:
> 
>> 8900: __ subl(tmp_rax, divisor);
>> 8901: __ andnl(tmp_rax, tmp_rax, rdx);
>> 8902: __ sarl(tmp_rax, 31);
> 
> Please move this into a macro assembly routine.

Sure, will move it to a macro assembly routine

> src/hotspot/cpu/x86/x86_64.ad line 8932:
> 
>> 8930: // Fastpath when divisor < 0: 
>> 8931: // remainder = dividend - (((dividend & ~(dividend - divisor)) >> 
>> (Long.SIZE - 1)) & divisor)
>> 8932: // See Hacker's Delight (2nd ed), section 9.3 which is implemented 
>> in java.lang.Long.remainderUnsigned()
> 
> Please move it into a macro assembly routine.

Sure, will move it to a macro assembly routine

> src/hotspot/share/opto/compile.cpp line 3499:
> 
>> 3497:   Node* d = n->find_similar(Op_UDivI);
>> 3498:   if (d) {
>> 3499: // Replace them with a fused unsigned divmod if supported
> 
> Can you explain a bit here, why can't this transformation be handled earlier ?

This is following the existing approach being used for signed DivMod

> test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 75:
> 
>> 73: }
>> 74: return quotients;
>> 75: }
> 
> Do we need to return quotients, since it's a field  being explicitly modified.

Will remove it.

> test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 82:
> 
>> 80: remainders[i] = Long.remainderUnsigned(dividends[i], 
>> divisors[i]);
>> 81: }
>> 82: return remainders;
> 
> Same as above

Will remove it.

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long

2022-02-22 Thread Jatin Bhateja
On Tue, 22 Feb 2022 09:24:47 GMT, Vamsi Parasa  wrote:

> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
> change shows 3x improvement for Integer methods and upto 25% improvement for 
> Long. This change also implements the DivMod optimization which fuses 
> division and modulus operations if needed. The DivMod optimization shows 3x 
> improvement for Integer and ~65% improvement for Long.

src/hotspot/cpu/x86/x86_64.ad line 8602:

> 8600: __ jmp(done);
> 8601: __ bind(neg_divisor_fastpath); 
> 8602: // Fastpath for divisor < 0: 

Move in macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8633:

> 8631: __ jmp(done);
> 8632: __ bind(neg_divisor_fastpath);
> 8633: // Fastpath for divisor < 0: 

Move in macro assembly rountine.

src/hotspot/cpu/x86/x86_64.ad line 8722:

> 8720: __ shrl(rax, 31); // quotient
> 8721: __ sarl(tmp, 31);
> 8722: __ andl(tmp, divisor);

Move in macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8763:

> 8761: __ andnq(rax, rax, rdx);
> 8762: __ movq(tmp, rax);
> 8763: __ shrq(rax, 63); // quotient

Move in macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8902:

> 8900: __ subl(tmp_rax, divisor);
> 8901: __ andnl(tmp_rax, tmp_rax, rdx);
> 8902: __ sarl(tmp_rax, 31);

Please move this into a macro assembly routine.

src/hotspot/cpu/x86/x86_64.ad line 8932:

> 8930: // Fastpath when divisor < 0: 
> 8931: // remainder = dividend - (((dividend & ~(dividend - divisor)) >> 
> (Long.SIZE - 1)) & divisor)
> 8932: // See Hacker's Delight (2nd ed), section 9.3 which is implemented 
> in java.lang.Long.remainderUnsigned()

Please move it into a macro assembly routine.

src/hotspot/share/opto/compile.cpp line 3499:

> 3497:   Node* d = n->find_similar(Op_UDivI);
> 3498:   if (d) {
> 3499: // Replace them with a fused unsigned divmod if supported

Can you explain a bit here, why can't this transformation be handled earlier ?

src/hotspot/share/opto/divnode.cpp line 1350:

> 1348: return NULL;
> 1349:   }
> 1350: 

Please remove Value and Ideal routines if no explicit transforms are being done.

src/hotspot/share/opto/divnode.cpp line 1362:

> 1360:   }
> 1361: 
> 1362: 
> //=

You can remove Ideal routine is not transformation is being done.

test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 76:

> 74: return quotients;
> 75: }
> 76: 

Return seems redundant here.

test/micro/org/openjdk/bench/java/lang/IntegerDivMod.java line 83:

> 81: }
> 82: return remainders;
> 83: }

Return seems redundant here.

test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 75:

> 73: }
> 74: return quotients;
> 75: }

Do we need to return quotients, since it's a field  being explicitly modified.

test/micro/org/openjdk/bench/java/lang/LongDivMod.java line 82:

> 80: remainders[i] = Long.remainderUnsigned(dividends[i], 
> divisors[i]);
> 81: }
> 82: return remainders;

Same as above

-

PR: https://git.openjdk.java.net/jdk/pull/7572