Re: RFR: 8286182: C2: crash with SIGFPE when executing compiled code

2022-05-19 Thread Tobias Hartmann
On Wed, 18 May 2022 15:44:10 GMT, Quan Anh Mai  wrote:

> This patch backs out the changes made by 
> [JDK-8285390](https://bugs.openjdk.java.net/browse/JDK-8285390) and 
> [JDK-8284742](https://bugs.openjdk.java.net/browse/JDK-8284742) since there 
> are failures due to div nodes floating above their validity checks.
> 
> Thanks.

Looks good to me too but the backout should adhere to the process defined here: 
https://openjdk.java.net/guide/#backing-out-a-change

I would suggest "Alternative 2" of 3.

-

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8774


Re: RFR: 8281712: [REDO] AArch64: Implement string_compare intrinsic in SVE

2022-05-17 Thread Tobias Hartmann
On Mon, 16 May 2022 07:21:27 GMT, Ningsheng Jian  wrote:

> This is the REDO of JDK-8269559 and JDK-8275448. Those two backouts finally 
> turned to be some system zlib issue in AArch64 macOS, and is not related to 
> the patch itself. See [1][2] for details.
> 
> This patch is generally the same as JDK-8275448, which uses SVE to optimize 
> string_compare intrinsics for long string comparisons. I did a rebase with 
> small tweaks to get better performance on recent Neoverse hardware. Test data 
> on systems with different SVE vector sizes:
> 
> 
>casedelta size  128-bits  256-bits  512-bits
> compareToLL  2   24 0.17% 0.58% 0.00%
> compareToLL  2   36 0.00%2.25%  0.04%
> compareToLL  2   72 -4.40%   3.87%  -12.82%
> compareToLL  2   1284.55%58.31% 13.53%
> compareToLL  2   25619.39%   69.77% 82.03%
> compareToLL  2   5121.81%68.38% 170.93%
> compareToLU  2   24 25.57%   46.98% 54.61%
> compareToLU  2   36 36.03%   70.26% 94.33%
> compareToLU  2   72 35.86%   90.58% 146.04%
> compareToLU  2   12870.82%   119.19%266.22%
> compareToLU  2   25680.77%   146.33%420.01%
> compareToLU  2   51294.62%   171.72%530.87%
> compareToUL  2   24 20.82%   34.48% 62.14%
> compareToUL  2   36 39.77%   60.79% 69.77%
> compareToUL  2   72 35.46%   84.34% 121.90%
> compareToUL  2   12867.77%   110.97%220.53%
> compareToUL  2   25677.05%   160.29%331.30%
> compareToUL  2   51291.88%   184.57%524.21%
> compareToUU  2   24 -0.13%   0.40%  0.00%
> compareToUU  2   36 -9.18%   12.84% -13.93%
> compareToUU  2   72 1.67%60.61% 6.69%
> compareToUU  2   12813.51%   60.33% 55.27%
> compareToUU  2   2562.55%62.17% 153.26%
> compareToUU  2   5124.12%68.62% 201.68%
> 
> JTreg tests passed on SVE hardware.
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8275448
> [2] https://bugs.openjdk.java.net/browse/JDK-8282954

Marked as reviewed by thartmann (Reviewer).

Sure, I already submitted testing yesterday, it all passed.

-

PR: https://git.openjdk.java.net/jdk/pull/8723


Re: RFR: 8284932: [Vector API] Incorrect implementation of LSHR operator for negative byte/short elements

2022-05-16 Thread Tobias Hartmann
On Wed, 27 Apr 2022 09:13:34 GMT, Jie Fu  wrote:

>> Hi all,
>> 
>> According to the Vector API doc, the `LSHR` operator computes 
>> `a>>>(n&(ESIZE*8-1))`.
>> However, current implementation is incorrect for negative bytes/shorts.
>> 
>> The background is that one of our customers try to vectorize `urshift` with 
>> `urshiftVector` like the following.
>> 
>>  13 public static void urshift(byte[] src, byte[] dst) {
>>  14 for (int i = 0; i < src.length; i++) {
>>  15 dst[i] = (byte)(src[i] >>> 3);
>>  16 }
>>  17 }
>>  18 
>>  19 public static void urshiftVector(byte[] src, byte[] dst) {
>>  20 int i = 0;
>>  21 for (; i < spec.loopBound(src.length); i +=spec.length()) {
>>  22 var va = ByteVector.fromArray(spec, src, i);
>>  23 var vb = va.lanewise(VectorOperators.LSHR, 3);
>>  24 vb.intoArray(dst, i);
>>  25 }
>>  26 
>>  27 for (; i < src.length; i++) {
>>  28 dst[i] = (byte)(src[i] >>> 3);
>>  29 }
>>  30 }
>> 
>> 
>> Unfortunately and to our surprise, code@line28 computes different results 
>> with code@line23.
>> It took quite a long time to figure out this bug.
>> 
>> The root cause is that current implemenation of Vector API can't compute the 
>>  unsigned right shift results as what is done for scalar `>>>` for negative 
>> byte/short elements.
>> Actually, current implementation will do `(a & 0xFF) >>> (n & 7)` [1] for 
>> all bytes, which is unable to compute the vectorized `>>>` for negative 
>> bytes.
>> So this seems unreasonable and unfriendly to Java developers.
>> It would be better to fix it.
>> 
>> The key idea to support unsigned right shift of negative bytes/shorts is 
>> just to replace the unsigned right shift operation with the signed right 
>> shift operation.
>> This logic is:
>> - For byte elements, unsigned right shift is equal to signed right shift if 
>> the shift_cnt <= 24.
>> - For short elements, unsigned right shift is equal to signed right shift if 
>> the shift_cnt <= 16.
>> - For Vector API, the shift_cnt will be masked to shift_cnt <= 7 for bytes 
>> and shift_cnt <= 15 for shorts.
>> 
>> I just learned this idea from https://github.com/openjdk/jdk/pull/7979 .
>> And many thanks to @fg1417 .
>> 
>> 
>> Thanks.
>> Best regards,
>> Jie
>> 
>> [1] 
>> https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java#L935
>
>> > > According to the Vector API doc, the LSHR operator computes 
>> > > a>>>(n&(ESIZE*8-1))
>> 
>> Documentation is correct if viewed strictly in context of subword vector 
>> lane, JVM internally promotes/sign extends subword type scalar variables 
>> into int type, but vectors are loaded from continuous memory holding 
>> subwords, it will not be correct for developer to imagine that individual 
>> subword type lanes will be upcasted into int lanes before being operated 
>> upon.
>> 
>> Thus both java implementation and compiler handling looks correct.
> 
> Thanks @jatin-bhateja for taking a look at this.
> After the discussion, I think it's fine to keep the current implementation of 
> LSHR.
> So we're now fixing the misleading doc here: 
> https://github.com/openjdk/jdk/pull/8291 .
> 
> And I think it would be better to add one more operator for `>>>`.
> Thanks.

@DamonFool should this PR and the JBS issue be closed?

-

PR: https://git.openjdk.java.net/jdk/pull/8276


Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v13]

2022-04-11 Thread Tobias Hartmann
On Fri, 8 Apr 2022 22:17:23 GMT, Srinivas Vamsi Parasa  
wrote:

>> Optimizes the divideUnsigned() and remainderUnsigned() methods in 
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This 
>> change shows 3x improvement for Integer methods and upto 25% improvement for 
>> Long. This change also implements the DivMod optimization which fuses 
>> division and modulus operations if needed. The DivMod optimization shows 3x 
>> improvement for Integer and ~65% improvement for Long.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Fix the divmod crash due to lack of control node

The issue is easy to reproduce, see 
[JDK-8284635](https://bugs.openjdk.java.net/browse/JDK-8284635).

-

PR: https://git.openjdk.java.net/jdk/pull/7572


Re: RFR: 8265768 [aarch64] Use glibc libm impl for dlog,dlog10,dexp iff 2.29 or greater on AArch64.

2022-04-06 Thread Tobias Hartmann
On Fri, 1 Apr 2022 15:38:36 GMT, Andrew Haley  wrote:

>> Will this patch change `java.lang.Math`, `java.lang.StrictMath` or both? 
>> I've noticed differences in iterative machine learning algorithms using exp 
>> & log across different JVMs and architectures which we try to track in 
>> [Tribuo](https://github.com/oracle/tribuo) by recording the JVM & arch in 
>> our model provenance objects. If this patch is integrated will there be an 
>> easy way to get (e.g. from `System.getProperty`) what implementation of exp 
>> is in use by the current JVM? Otherwise I won't be able to notify users that 
>> the model may not reproduce if they rerun the same computation on different 
>> versions of Linux with the same JVM & architecture.
>
>> Will this patch change `java.lang.Math`, `java.lang.StrictMath` or both? 
>> I've noticed differences in iterative machine learning algorithms using exp 
>> & log across different JVMs and architectures which we try to track in 
>> [Tribuo](https://github.com/oracle/tribuo) by recording the JVM & arch in 
>> our model provenance objects.
> 
> Exactly so, and that is why this patch was never integrated. This was only 
> ever going to be about `java.lang.Math`, but we foundered on the rock of 
> monotonicity. Here's the spec:
> 
> "most methods with more than 0.5 ulp errors are required to be 
> semi-monotonic: whenever the mathematical function is non-decreasing, so is 
> the floating-point approximation, likewise, whenever the mathematical 
> function is non-increasing, so is the floating-point approximation. Not all 
> approximations that have 1 ulp accuracy will automatically meet the 
> monotonicity requirements."
> 
> We couldn't guarantee we'd meet the monotonicity requirements if we used 
> glibc libm, so this patch was, with some regret, abandoned.

@theRealAph Thanks for the summary. I closed the JBS issue as Won't Fix.

-

PR: https://git.openjdk.java.net/jdk/pull/3510


Re: RFR: 8265768 [aarch64] Use glibc libm impl for dlog,dlog10,dexp iff 2.29 or greater on AArch64.

2022-03-30 Thread Tobias Hartmann
On Tue, 25 May 2021 15:32:40 GMT, gregcawthorne  wrote:

>> Glibc 2.29 onwards provides optimised versions of log,log10,exp.
>> These functions have an accuracy of 0.9ulp or better in glibc
>> 2.29.
>> 
>> Therefore this patch adds code to parse, store and check
>> the runtime glibcs version in os_linux.cpp/hpp.
>> This is then used to select the glibcs implementation of
>> log, log10, exp at runtime for c1 and c2, iff we have
>> glibc 2.29 or greater.
>> 
>> This will ensure OpenJDK can benefit from future improvements
>> to glibc.
>> 
>> Glibc adheres to the ieee754 standard, unless stated otherwise
>> in its spec.
>> 
>> As there are no stated exceptions in the current glibc spec
>> for dlog, dlog10 and dexp, we can assume they currently follow
>> ieee754 (which testing confirms). As such, future version of
>> glibc are unlikely to lose this compliance with ieee754 in
>> future.
>> 
>> W.r.t performance this patch sees ~15-30% performance improvements for
>> log and log10, with ~50-80% performance improvements for exp for the
>> common input ranged (which output real numbers). However for the NaN
>> and inf output ranges we see a slow down of up to a factor of 2 for
>> some functions and architectures.
>> 
>> Due to this being the uncommon case we assert that this is a
>> worthwhile tradeoff.
>
> greg.cawtho...@arm.com
> 
> Should work

@gregcawthorne any plans to re-open and fix this?

-

PR: https://git.openjdk.java.net/jdk/pull/3510


Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-23 Thread Tobias Hartmann
On Fri, 18 Mar 2022 20:19:08 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain 
>> ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 
>> 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 
>> 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 
>> 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 
>> 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 22 commits:
> 
>  - 8279508: Using an explicit scratch register since rscratch1 is bound to 
> r10 and its usage is transparent to compiler.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Windows build failure fix.
>  - 8279508: Styling comments resolved.
>  - 8279508: Creating separate test for round double under feature check.
>  - 8279508: Reducing the invocation count and compile thresholds for 
> RoundTests.java.
>  - 8279508: Review comments resolution.
>  - 8279508: Preventing domain switch-over penalty for Math.round(float) and 
> constraining unrolling to prevent code bloating.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Removing +LogCompilation flag.
>  - ... and 12 more: 
> https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf

All tests passed.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-22 Thread Tobias Hartmann
On Fri, 18 Mar 2022 20:19:08 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain 
>> ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 
>> 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 
>> 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 
>> 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 
>> 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 22 commits:
> 
>  - 8279508: Using an explicit scratch register since rscratch1 is bound to 
> r10 and its usage is transparent to compiler.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Windows build failure fix.
>  - 8279508: Styling comments resolved.
>  - 8279508: Creating separate test for round double under feature check.
>  - 8279508: Reducing the invocation count and compile thresholds for 
> RoundTests.java.
>  - 8279508: Review comments resolution.
>  - 8279508: Preventing domain switch-over penalty for Math.round(float) and 
> constraining unrolling to prevent code bloating.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Removing +LogCompilation flag.
>  - ... and 12 more: 
> https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf

Sure, I'll re-run testing and report back.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v17]

2022-03-14 Thread Tobias Hartmann
On Sun, 13 Mar 2022 06:36:15 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch 
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain 
>> ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 
>> 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 
>> 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 
>> 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 
>> 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   8279508: Windows build failure fix.

`compiler/c2/cr6340864/TestFloatVect.java` and `TestDoubleVect.java` fail on 
Windows:


# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc005) at pc=0x01971b940123, pid=56524, 
tid=57368
#
# JRE version: Java(TM) SE Runtime Environment (19.0) (fastdebug build 
19-internal-2022-03-14-0834080.tobias.hartmann.jdk2)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
19-internal-2022-03-14-0834080.tobias.hartmann.jdk2, mixed mode, sharing, 
tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# Problematic frame:
# J 205 c2 compiler.c2.cr6340864.TestFloatVect.test_round([I[F)V (24 bytes) @ 
0x01971b940123 [0x01971b93ffe0+0x0143]

-

PR: https://git.openjdk.java.net/jdk/pull/7094


[jdk18] Integrated: 8281713: [BACKOUT] AArch64: Implement string_compare intrinsic in SVE

2022-02-14 Thread Tobias Hartmann
On Mon, 14 Feb 2022 10:30:09 GMT, Tobias Hartmann  wrote:

> We observed several failures on macosx aarch64 after integration of 
> [JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448). I could 
> reliably reproduce 
> [JDK-8281512](https://bugs.openjdk.java.net/browse/JDK-8281512) with JDK 18 
> b25-1665 (the first build with 
> [JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448) containing no 
> other changes) but **not** with JDK 18 b25-1664 (the build just before 
> integration). Also, I could reliably reproduce the issue with mainline but 
> **not** with the string compare intrinsic disabled. I think this is enough 
> evidence to prove that 
> [JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448) is the root 
> cause of the failures.
> 
> Given that we don't fully understand which part of this fix is causing the 
> issue and how (it might be a side effect that triggers a bug in the build 
> toolchain or adlc), I propose to backout the change. The backout applies 
> cleanly, approval is pending.
> 
> Thanks,
> Tobias

This pull request has now been integrated.

Changeset: 2be2a298
Author:Tobias Hartmann 
URL:   
https://git.openjdk.java.net/jdk18/commit/2be2a298f13c3a38d9518ccfea11dfd8a736d56c
Stats: 423 lines in 9 files changed: 0 ins; 412 del; 11 mod

8281713: [BACKOUT] AArch64: Implement string_compare intrinsic in SVE

Reviewed-by: kvn, dlong

-

PR: https://git.openjdk.java.net/jdk18/pull/116


Re: [jdk18] RFR: 8281713: [BACKOUT] AArch64: Implement string_compare intrinsic in SVE

2022-02-14 Thread Tobias Hartmann
On Mon, 14 Feb 2022 10:30:09 GMT, Tobias Hartmann  wrote:

> We observed several failures on macosx aarch64 after integration of 
> [JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448). I could 
> reliably reproduce 
> [JDK-8281512](https://bugs.openjdk.java.net/browse/JDK-8281512) with JDK 18 
> b25-1665 (the first build with 
> [JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448) containing no 
> other changes) but **not** with JDK 18 b25-1664 (the build just before 
> integration). Also, I could reliably reproduce the issue with mainline but 
> **not** with the string compare intrinsic disabled. I think this is enough 
> evidence to prove that 
> [JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448) is the root 
> cause of the failures.
> 
> Given that we don't fully understand which part of this fix is causing the 
> issue and how (it might be a side effect that triggers a bug in the build 
> toolchain or adlc), I propose to backout the change. The backout applies 
> cleanly, approval is pending.
> 
> Thanks,
> Tobias

Vladimir, Dean, thanks for the reviews.

-

PR: https://git.openjdk.java.net/jdk18/pull/116


[jdk18] RFR: 8281713: [BACKOUT] AArch64: Implement string_compare intrinsic in SVE

2022-02-14 Thread Tobias Hartmann
We observed several failures on macosx aarch64 after integration of 
[JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448). I could 
reliably reproduce 
[JDK-8281512](https://bugs.openjdk.java.net/browse/JDK-8281512) with JDK 18 
b25-1665 (the first build with 
[JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448) containing no 
other changes) but **not** with JDK 18 b25-1664 (the build just before 
integration). Also, I could reliably reproduce the issue with mainline but 
**not** with the string compare intrinsic disabled. I think this is enough 
evidence to prove that 
[JDK-8275448](https://bugs.openjdk.java.net/browse/JDK-8275448) is the root 
cause of the failures.

Given that we don't fully understand which part of this fix is causing the 
issue and how (it might be a side effect that triggers a bug in the build 
toolchain or adlc), I propose to backout the change. The backout applies 
cleanly, approval is pending.

Thanks,
Tobias

-

Commit messages:
 - Revert "8275448: [REDO] AArch64: Implement string_compare intrinsic in SVE"

Changes: https://git.openjdk.java.net/jdk18/pull/116/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk18=116=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281713
  Stats: 423 lines in 9 files changed: 0 ins; 412 del; 11 mod
  Patch: https://git.openjdk.java.net/jdk18/pull/116.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/116/head:pull/116

PR: https://git.openjdk.java.net/jdk18/pull/116


Re: RFR: 8252990: Intrinsify Unsafe.storeStoreFence [v2]

2021-10-28 Thread Tobias Hartmann
On Thu, 28 Oct 2021 08:58:48 GMT, Aleksey Shipilev  wrote:

>> `Unsafe.storeStoreFence` currently delegates to stronger 
>> `Unsafe.storeFence`. We can teach compilers to map this directly to already 
>> existing rules that handle `MemBarStoreStore`. Like explicit 
>> `LoadFence`/`StoreFence`, we introduce the special node to differentiate 
>> explicit fence and implicit store-store barriers. `storeStoreFence` is 
>> usually used to simulate safe `final`-field like constructions in special 
>> JDK classes, like `ConstantCallSite` and friends.
>> 
>> Motivational performance difference on benchmarks from JDK-8276054 on ARM32:
>> 
>> 
>> Benchmark  Mode  Cnt   ScoreError  Units
>> Multiple.plain avgt3   2.669 ±  0.004  ns/op
>> Multiple.release   avgt3  16.688 ±  0.057  ns/op
>> Multiple.storeStoreavgt3  14.021 ±  0.144  ns/op // Better
>> 
>> MultipleWithLoads.plainavgt3   4.672 ±  0.053  ns/op
>> MultipleWithLoads.release  avgt3  16.689 ±  0.044  ns/op
>> MultipleWithLoads.storeStore   avgt3  14.012 ±  0.010  ns/op // Better
>> 
>> MultipleWithStores.plain   avgt3  14.687 ±  0.009  ns/op
>> MultipleWithStores.release avgt3  45.393 ±  0.192  ns/op
>> MultipleWithStores.storeStore  avgt3  38.048 ±  0.033  ns/op // Better
>> 
>> Publishing.plain   avgt3  27.079 ±  0.201  ns/op
>> Publishing.release avgt3  27.088 ±  0.241  ns/op
>> Publishing.storeStore  avgt3  27.009 ±  0.259  ns/op // Within 
>> error, hidden by allocation
>> 
>> Single.plain   avgt3   2.670 ± 0.002  ns/op
>> Single.releaseFenceavgt3   6.675 ± 0.001  ns/op
>> Single.storeStoreFence avgt3   8.012 ± 0.027  ns/op  // Worse, 
>> seems to be ARM32 implementation artifact
>> 
>> 
>> As expected, this does not affect x86_64 at all, because both `release` and 
>> `storeStore` are effectively no-ops, only affecting compiler optimizations:
>> 
>> 
>> Benchmark  Mode  Cnt  Score   Error  Units
>> 
>> Multiple.plain avgt3  0.406 ± 0.002  ns/op
>> Multiple.release   avgt3  0.409 ± 0.018  ns/op
>> Multiple.storeStoreavgt3  0.406 ± 0.001  ns/op
>> 
>> MultipleWithLoads.plainavgt3  4.328 ± 0.006  ns/op
>> MultipleWithLoads.release  avgt3  4.600 ± 0.014  ns/op
>> MultipleWithLoads.storeStore   avgt3  4.602 ± 0.006  ns/op
>> 
>> MultipleWithStores.plain   avgt3  0.812 ± 0.001  ns/op
>> MultipleWithStores.release avgt3  0.812 ± 0.002  ns/op
>> MultipleWithStores.storeStore  avgt3  0.812 ± 0.002  ns/op
>> 
>> Publishing.plain   avgt3  6.370 ± 0.059  ns/op
>> Publishing.release avgt3  6.358 ± 0.436  ns/op
>> Publishing.storeStore  avgt3  6.367 ± 0.054  ns/op
>> 
>> Single.plain   avgt3  0.407 ± 0.039  ns/op
>> Single.releaseFenceavgt3  0.406 ± 0.001  ns/op
>> Single.storeStoreFence avgt3  0.406 ± 0.001  ns/op
>> 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug `tier1`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Fix the comment to match JDK-8276096

That looks good to me.

-

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6136


Re: RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86 [v6]

2021-09-29 Thread Tobias Hartmann
On Wed, 29 Sep 2021 12:36:23 GMT, Claes Redestad  wrote:

>> This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to 
>> work also for ASCII encoding, which makes for example the `UTF_8$Encoder` 
>> perform on par with (or outperform) similarly getting charset encoded bytes 
>> from a String. The former took a small performance hit in JDK 9, and the 
>> latter improved greatly in the same release.
>> 
>> Extending the `EncodeIsoArray` intrinsics on other platforms should be 
>> possible, but I'm unfamiliar with the macro assembler in general and unlike 
>> the x86 intrinsic they don't use a simple vectorized mask to implement the 
>> latin-1 check. For example aarch64 seem to filter out the low bytes and then 
>> check if there's any bits set in the high bytes. Clever, but very different 
>> to the 0xFF80 2-byte mask that an ASCII test wants.
>
> Claes Redestad has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Clean up and make TestEncodeIntrinsics fail on the particular scenario 
> where the ISO intrinsic was used in place of the ASCII-only intrinsic

The incremental change looks good and trivial to me.

-

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5621


Re: RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86 [v5]

2021-09-28 Thread Tobias Hartmann
On Tue, 28 Sep 2021 11:49:29 GMT, Claes Redestad  wrote:

>> This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to 
>> work also for ASCII encoding, which makes for example the `UTF_8$Encoder` 
>> perform on par with (or outperform) similarly getting charset encoded bytes 
>> from a String. The former took a small performance hit in JDK 9, and the 
>> latter improved greatly in the same release.
>> 
>> Extending the `EncodeIsoArray` intrinsics on other platforms should be 
>> possible, but I'm unfamiliar with the macro assembler in general and unlike 
>> the x86 intrinsic they don't use a simple vectorized mask to implement the 
>> latin-1 check. For example aarch64 seem to filter out the low bytes and then 
>> check if there's any bits set in the high bytes. Clever, but very different 
>> to the 0xFF80 2-byte mask that an ASCII test wants.
>
> Claes Redestad has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add fail-safe predicate to all encode_iso_array to only match non-ASCII 
> EncodeISOArrayNodes

The latest version looks good to me.

-

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5621


Re: RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86 [v3]

2021-09-27 Thread Tobias Hartmann
On Mon, 27 Sep 2021 11:59:54 GMT, Claes Redestad  wrote:

> > Should we remove the "iso" part from the method/class names?
> 
> I'm open to suggestions, but I've not been able to think of anything better. 
> `encodeISOOrASCII` doesn't seem helpful and since ASCII is a subset of the 
> ISO-8859-1 encoding referred to by the "iso" moniker then the ASCII-only 
> variant is technically encoding chars to valid ISO-8859-1.

Okay, that's fine with me.

-

PR: https://git.openjdk.java.net/jdk/pull/5621


Re: RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86

2021-09-27 Thread Tobias Hartmann
On Tue, 21 Sep 2021 21:58:48 GMT, Claes Redestad  wrote:

> This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to 
> work also for ASCII encoding, which makes for example the `UTF_8$Encoder` 
> perform on par with (or outperform) similarly getting charset encoded bytes 
> from a String. The former took a small performance hit in JDK 9, and the 
> latter improved greatly in the same release.
> 
> Extending the `EncodeIsoArray` intrinsics on other platforms should be 
> possible, but I'm unfamiliar with the macro assembler in general and unlike 
> the x86 intrinsic they don't use a simple vectorized mask to implement the 
> latin-1 check. For example aarch64 seem to filter out the low bytes and then 
> check if there's any bits set in the high bytes. Clever, but very different 
> to the 0xFF80 2-byte mask that an ASCII test wants.

Very nice. The changes look good to me, just added some minor comments.

Should we remove the "iso" part from the method/class names?

src/hotspot/cpu/x86/x86_32.ad line 12218:

> 12216: instruct encode_ascii_array(eSIRegP src, eDIRegP dst, eDXRegI len,
> 12217:   regD tmp1, regD tmp2, regD tmp3, regD tmp4,
> 12218:   eCXRegI tmp5, eAXRegI result, eFlagsReg cr) 
> %{

Indentation is wrong.

src/hotspot/cpu/x86/x86_32.ad line 12223:

> 12221:   effect(TEMP tmp1, TEMP tmp2, TEMP tmp3, TEMP tmp4, USE_KILL src, 
> USE_KILL dst, USE_KILL len, KILL tmp5, KILL cr);
> 1: 
> 12223:   format %{ "Encode array $src,$dst,$len -> $result// KILL ECX, 
> EDX, $tmp1, $tmp2, $tmp3, $tmp4, ESI, EDI " %}

You might want to change the opto assembly comment to "Encode ascii array" (and 
to "Encode iso array" above). Same on 64-bit.

src/hotspot/share/opto/intrinsicnode.hpp line 171:

> 169: 
> 170: 
> //--EncodeISOArray
> 171: // encode char[] to byte[] in ISO_8859_1

Comment should be adjusted to `... in ISO_8859_1 or ASCII`.

-

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5621


Re: RFR: 8267357: build breaks with -Werror option on micro benchmark added for JDK-8256973

2021-05-19 Thread Tobias Hartmann
On Wed, 19 May 2021 08:20:13 GMT, Jatin Bhateja  wrote:

> Relevant declarations modified and tested with -Werror, no longer see 
> unchecked conversion warnings.
> 
> Kindly review and approve.

Marked as reviewed by thartmann (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/4108


Re: [12] RFR(S) 8216151: [Graal] Module jdk.internal.vm.compiler.management has not been granted accessClassInPackage.org.graalvm.compiler.debug

2019-01-14 Thread Tobias Hartmann
Hi Vladimir,

looks good to me.

Best regards,
Tobias


On 13.01.19 03:46, Vladimir Kozlov wrote:
> http://cr.openjdk.java.net/~kvn/8216151/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8216151
> 
> Have to update default.policy after changes in 
> jdk.internal.vm.compiler.management files done by
> JDK-8199755: "Update Graal".
> 
> Ran CheckAccessClassInPackagePermissions.java test.
> 


Re: RFR 8214971 : Replace use of string.equals("") with isEmpty()

2018-12-10 Thread Tobias Hartmann
Hi Claes,

the difference is the "array encoding" (ae) argument passed to the intrinsic. 
See
MacroAssembler::string_compare(... int ae). If we compare an UTF16 String, we 
know that the number
of bytes is always even (two byte / char encoding), whereas a Latin1 string has 
a byte encoding. So
basically, the UTF16 intrinsics loads/compares chars whereas the Latin1 
intrinsic compares bytes
(both apply vectorization accordingly).

But you are right, we could always use the Latin1 intrinsic but in theory it 
should be slower.
At least on x86_64, might be different on ARM or other platforms.

Thanks,
Tobias

On 07.12.18 20:35, Claes Redestad wrote:
> This is an interesting point: can anyone explain why there are two distinct 
> methods for LATIN1 and
> UTF16 equality, with corresponding
> intrinsics? Aleksey? Tobias?
> 
> Testing with Arrays.equals then performance profile is
> quite different due vectorization (often better, sometimes worse),
> but this is performance-neutral for a variety of latin1 and utf16 inputs:
> 
> if (coder() == aString.coder())
>     return StringLatin1.equals(value, aString.value);
> 
> Is there some UTF16 input where StringLatin1.equals != StringUTF16.equals 
> that forbids the above?
> Performance-wise it seems neutral, and all tests seem to pass with the above 
> (obviously need to run
> more tests...).
> 
> Thanks!
> 
> /Claes
> 
> On 2018-12-07 04:53, James Laskey wrote:
>> Or simply;
>>
>> if (anObject instanceof String) {
>>     String aString = (String)anObject;
>>     if (coder() == aString.coder())
>>  return Arrays.equals(value, aString.value);
>>     }
>>     }
>>
>> Sent from my iPhone


Re: Review Request: JDK-8207146: Rename jdk.internal.misc.Unsafe::xxxObject to xxxReference

2018-10-22 Thread Tobias Hartmann
Hi Mandy,

the compiler related changes look good to me.

Please run hs-tier1-3 if you haven't done so yet.

Best regards,
Tobias


On 16.10.18 18:08, Mandy Chung wrote:
> Webrev:
> http://cr.openjdk.java.net/~mchung/jdk12/webrevs/8207146/webrev.00/
> 
> Unsafe::getObject returns a reference to an object. Similarly
> Unsafe::putObject sets a reference in the given base+offset.
> This patch proposes to rename Unsafe xxxObject to xxxReference
> that will make the xxxReference API very clear that these
> are getters/setters for a reference (instance of object class)
> and ready for adding xxxValue in the futurefor values.
> Note that this renaming only applies to jdk.internal.misc.Unsafe
> but not sun.misc.Unsafe.  So no impact to existing libraries
> that depends on sun.misc.Unsafe in jdk.unsupported module.
> 
> I ran jdk_core, core_tools, langtools and nashorn tier1-3,
> hotspot_runtime, hotspot_compiler test groups.
> 
> thanks
> Mandy


Re: (M) RFR: 8200167: Validate more special case invocations

2018-04-27 Thread Tobias Hartmann
Hi David,

On 27.04.2018 00:04, David Holmes wrote:

>> src/hotspot/share/c1/c1_Canonicalizer.cpp
>> ...
>>   void Canonicalizer::do_CheckCast  (CheckCast*   x) {
>> -  if (x->klass()->is_loaded()) {
>> +  if (x->klass()->is_loaded() && !x->is_invokespecial_receiver_check())
>>
>> It seems like it's not something specific to invokespecial, but a generic 
>> problem in how interface
>> casts are handled in C1: it's not correct to eliminate the cast if 
>> obj->declared_type() is an
>> interface. I assume that's what happens in your case. FTR I'm fine with 
>> handling it separately.
> 
> The above came from Tobias. If you think there is a more general issue here 
> then we should file a
> separate bug and formulate a test case.

To clarify, I've quickly debugged this problem before going on vacation and 
rather than a full fix,
the intention of the above change was to quickly verify that the problem is 
indeed an incorrectly
eliminated receiver cast.

I'm also fine with handling this in a separate bug or to push this as a quick 
fix and file a follow
up bug for further investigation if other changes are necessary.

Thanks,
Tobias


Re: RFR JDK-8193479: test/hotspot/jtreg/compiler/codegen/Test6896617.java fails after JDK-8184947

2017-12-15 Thread Tobias Hartmann
Hi Sherman,

On 15.12.2017 10:48, Tobias Hartmann wrote:
> I would say it's best to wait for the jdk10 repo to open.

I've checked with Jesper and we can/should quarantine this test right away. 
I'll take care of it [1].

Best regards,
Tobias

[1] 
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-December/027933.html


Re: RFR JDK-8193479: test/hotspot/jtreg/compiler/codegen/Test6896617.java fails after JDK-8184947

2017-12-15 Thread Tobias Hartmann
Hi Sherman,

On 14.12.2017 21:46, Xueming Shen wrote:
> I have a webrev out there. But I will probably have to wait for the
> new jdk10 repo open next week.
> 
> Or maybe the hs repo is still open for something like this? and
> you guys can help get it in directly there?

I would say it's best to wait for the jdk10 repo to open.

> It appears the consensus is to put this one into the ProblemList and let the
> hotspot folks to fix the test case.
> 
> issue: https://bugs.openjdk.java.net/browse/JDK-8193479
> webrev: http://cr.openjdk.java.net/~sherman/8193479/webrev
> 
> The corresponding hotspot/compiler issue filed: JDK-8193549.

The correct way is to create a subtask to quarantine the test. I've created one 
for you and closed (JDK-8193549) as
duplicate:
https://bugs.openjdk.java.net/browse/JDK-8193608

Please un-assign 8193479 if you don't plan to fix the test.

Thanks,
Tobias


Re: [9] RFR(L) 8158168: SIGSEGV: CollectedHeap::fill_with_objects(HeapWord*, unsigned long, bool)+0xa8

2017-03-31 Thread Tobias Hartmann

On 30.03.2017 20:24, dean.l...@oracle.com wrote:
> I would like to go with the webrev.2 version, with asserts removed.  All the 
> reviewers have said they are OK with that version, and it allows more 
> complete testing than the minimal version.

Okay, I'm fine with that!

Best regards,
Tobias

> On 3/23/17 12:03 PM, dean.l...@oracle.com wrote:
>>
>> On 3/23/17 11:25 AM, dean.l...@oracle.com wrote:
>>
>>> On 3/22/17 1:49 PM, Vladimir Ivanov wrote:
>>>
> Also, it looks like the changes I made to ASB.appendChars(char[] s, int
> off, int end) are not needed.

 Agree.

>> Vladimir, don't you need to replace checkIndex with checkOffset in
>> indexOf and lastIndexOf, so that we allow count == length?

 Yes, my bad. Good catch. Updated webrev in place.

 FTR I haven't done any extensive testing of the minimized fix.

 If we agree to proceed with it, the regression test should be updated as 
 well. I think the viable solution would be to construct broken SBs (using 
 reflection) and invoke affected methods on them.

>>>
>>> We can construct broken SBs using the Helper class that gets patched into 
>>> java.lang.  I'll work on that.
>>>
>>
>> Nevermind.  I forgot that some problems can only happen when the SB is 
>> changed half-way through the method.  For example,
>> in append(), we can't force an overflow unless we change the SB after 
>> ensureCapacityInternal() is called.  I could do something like:
>>
>>  760 public AbstractStringBuilder append(int i) {
>> 761 int count = this.count;
>>  762 int spaceNeeded = count + Integer.stringSize(i);
>>  763 ensureCapacityInternal(spaceNeeded);
>>  764 if (isLatin1()) {
>>  >>  Helper.fuzzValue(this);
>>  765 Integer.getChars(i, spaceNeeded, value);
>>  766 } else {
>>  767 byte[] val = this.value;
>>  >>  Helper.fuzzValue(this);
>>  768 checkBoundsBeginEnd(count, spaceNeeded, val.length >> 1);
>>  769 Integer.getCharsUTF16(i, spaceNeeded, val);
>>  770 }
>> 771 this.count = spaceNeeded;
>>  772 return this;
>>  773 }
>>
>> where the default Helper.fuzzValue() is an empty method, but the test would 
>> patch in its own version of Helper that changes the ASB field values.  I 
>> like this less than refactoring the checks to StringUTF16.
>>
>> dl
>>
>>> dl
>>>
 Best regards,
 Vladimir Ivanov

>> On 3/22/17 8:35 AM, Vladimir Ivanov wrote:
>> So are we convinced that the proposed changes will never lead to a
>> crash due to a missing or incorrect bounds check, due to a racy
>> use of
>> an unsynchronized ASB instance e.g. StringBuilder?
>
> If only we had a static analysis tool that could tell us if the
> code is
> safe.  Because we don't, in my initial changeset, we always take a
> snapshot of the ASB fields by passing those field values to
> StringUTF16
> before doing checks on them.  And I wrote a test to make sure that
> those
> StringUTF16 interfaces are catching all the underflows and overflows I
> could imagine, and I added verification code to detect when a check
> was
> missed.
>
> However, all the reviewers have requested to minimize the amount of
> changes.  In Vladimir's version, if there is a missing check
> somewhere,
> then yes it could lead to a crash.
>>>
>>> I'd like to point out that asserts and verification code are disabled
>>> by default. They are invaluable during problem diagnosis, but don't
>>> help at all from defence-in-depth perspective.
>>>
>>> But I agree that it's easier to reason about and test the initial
>>> version of the fix.
>>>
 I wonder if the reviewers have fully realized the potential impact
 here?
 This has exposed a flaw in the way intrinsics are used from core
 classes.
>>>
>>> FTR here are the checks I omitted in the minimized version (modulo
>>> separation of indexOf/lastIndexOf for trusted/non-trusted callers):
>>>
>>> http://cr.openjdk.java.net/~vlivanov/dlong/8158168/redundant_checks/
>>>
>>> Other than that, the difference is mainly about undoing refactorings
>>> and removing verification logic (asserts + checks in the JVM).
>>>
>>> There are still unsafe accesses which are considered safe in both
>>> versions (see StringUTF16.Trusted usages in the initial version [1]).
>>>
>>> We used to provide safe wrappers for unsafe intrinsics which makes it
>>> much easier to reason about code correctness. I'd like to see compact
>>> string code refactored that way and IMO the initial version by Dean
>>> is a big step in the right direction.
>>>
>>> I still prefer to see a point fix in 9 and major refactoring
>>> happening in 10, but I'll 

Re: [9] RFR(L) 8158168: SIGSEGV: CollectedHeap::fill_with_objects(HeapWord*, unsigned long, bool)+0xa8

2017-03-22 Thread Tobias Hartmann
Hi,

On 22.03.2017 06:09, David Holmes wrote:
> I wonder if the reviewers have fully realized the potential impact here? This 
> has exposed a flaw in the way intrinsics are used from core classes.

For the record, I'm fine with both the initial fix as well as the simplified 
version. If we don't add the debug runtime checks in the intrinsics, we should 
definitely add them with JDK 10 as additional safety net (we should consider 
other intrinsics and C1 as well).

Thanks,
Tobias


Re: RFR 8170155: StringBuffer and StringBuilder stream methods are not late-binding

2016-11-25 Thread Tobias Hartmann
Hi,

On 24.11.2016 00:56, Xueming Shen wrote:
> On 11/23/2016 02:39 PM, Paul Sandoz wrote:
>> Hi,
>>
>> Please review:
>>
>>
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8170155-string-buffer-builder-late-binding/webrev/
>>
>> This patch:
>>
>> 1) updates the StringBuilder/StringBuffer.chars()/codePoints() so they are 
>> late binding.
>>
>> 2) refactors the spliterator late binding and fail fast tests, separating 
>> them out, and to the former additional tests are added for CharSequence 
>> implementations and BitSet.
>>
>> The code in AbstractStringBuilder has the following bounds check call to 
>> checkOffset:
>>
>> 1558 return StreamSupport.intStream(
>> 1559 () ->  {
>> 1560 byte[] val = this.value; int count = this.count;
>> 1561 checkOffset(count, val.length>>  coder);
>> 1562 return coder == LATIN1
>> 1563? new StringLatin1.CharsSpliterator(val, 0, 
>> count, 0)
>> 1564: new StringUTF16.CharsSpliterator(val, 0, 
>> count, 0);
>> 1565 },
>> 1566 Spliterator.ORDERED | Spliterator.SIZED | 
>> Spliterator.SUBSIZED,
>> 1567 false);
>>
>> (Separately checkOffset could be replaced with the internal 
>> Preconditions.checkIndex.)

Yes, if possible, bound checks should be replaced by the intrinsified 
Preconditions.check() (see https://bugs.openjdk.java.net/browse/JDK-8042997).

>> On initial inspection that should be a no-op, but talking to Sherman we 
>> believe it is required for UTF-16 access, since the intrinsics do not 
>> perform bounds checks, so they need to be guarded (for conformance purposes 
>> if a no-op).

If an out of bounds access is possible when using 
StringUTF16.putChar/getChar(), the check in the Java code is required because 
the intrinsic does not implement any bound checks (see below).

> One of the purposes of having StringUTF16.putChar() (other than the charAt())
> is to avoid the expensive boundary check on each/every char access. I'm 
> forwarding
> the email to Tobias to make sure it's still the case on hotspot side.

Yes, this is still correct. For performance reasons, the intrinsic for 
StringUTF16.putChar/getChar() has no boundary checks (see 
LibraryCallKit::inline_string_char_access()). It should *only* be used without 
a bounds check in the Java code if the index is guaranteed to be always in 
bounds.

> I'm not sure if it is still desired to do the same boundary check in case of 
> LATIN1
> for the benefit of consistency.  Assume there might be concurrent 
> access/operation
> between val = this.value and count = this.count; (for StringBuilder) for 
> example,
> the this.value got doubled and the this.count got increased. One will end up 
> with
> StringIndexOutOfBoundsException() from checkOffset, but the other will be 
> ioobe
> from vm?

You mean when hitting a check in an intrinsic compared to when hitting a check 
in the Java wrapper?

Actually, bound checks should always be done in the Java wrapper method that 
calls the (unchecked) intrinsic. In general, the unchecked intrinsic should not 
be called directly. StringUTF16.putChar/getChar() is an exception because there 
are places where we we need to omit the check for performance reasons.

I'm planning to revisit this with JDK-8156534 in JDK 10.

Best regards,
Tobias

> 
> Sherman
> 
> 
>> If so i propose the following to make this clearer:
>>
>> return StreamSupport.intStream(
>>  () ->  {
>>  byte[] val = this.value; int count = this.count;
>>  if (coder == LATIN1) {
>>  return new StringLatin1.CharsSpliterator(val, 0, count, 0);
>>  } else {
>>  // Perform an explicit bounds check since HotSpot
>>  // intrinsics to access UTF-16 characters in the byte[]
>>  // array will not perform bounds checks
>>  checkOffset(count, val.length>>  coder);
>>  return new StringUTF16.CharsSpliterator(val, 0, count, 0);
>>  }
>>  },
>>  Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED,
>>  false);
>>
>> Paul.
> 


Re: RFR: 8159035: com/sun/crypto/provider/Cipher/CTS/CTSMode.java test crashed due to unhandled case of cipher length value as 0

2016-11-08 Thread Tobias Hartmann
Hi Rahul,

On 07.11.2016 12:21, Rahul Raghavan wrote:
> Hi,
> 
> Request review for closed bug fix - JDK-8159035.
> 
>  - http://cr.openjdk.java.net/~rraghavan/8159035/webrev.03/

Looks good to me!

> Notes:
> 
> 1.  - https://bugs.openjdk.java.net/browse/JDK-8159035 - 
> (com/sun/crypto/provider/Cipher/CTS/CTSMode.java test crashed due to 
> unhandled case of cipher length value as 0)
> Related issues - 
> https://bugs.openjdk.java.net/browse/JDK-8076112 - 'Add 
> @HotSpotIntrinsicCandidate annotation to indicate methods for which Java 
> Runtime has intrinsics'
> https://bugs.openjdk.java.net/browse/JDK-8167595 - 'AArch64: SEGV in stub 
> code cipherBlockChaining_decryptAESCrypt'
> 
> 2. Found root cause of the reported jvm crash for sparc -
> Crash happens at 'generate_cipherBlockChaining_decryptAESCrypt_Parallel()' 
> [stubGenerator_sparc.cpp]
> The implDecrypt can be called from CipherBlockChaining.decrypt, even with 
> cipherLen as 0.
> But the same condition is not handled in the stub code and results in crash.
> (the same applicable for implEncrypt)
> 
> 3. Though the reported case was for sparc, understood that same issue is 
> present for x86, aarch64 (JDK-8167595).
> So in above  fix proposed in Java wrapper method side 
> [CipherBlockChaining.java].
> 
> 4. The same issue in aarch64 (JDK-8167595) was fixed earlier in 
> stubGenerator_aarch64.
> So once above  is approved, I will initiate new hotspot webrev to 
> revert  this earlier 8167595 change.

Please file another bug for this and link it to this bug.

> 5. Checked for any other similar cases with HotSpotIntrinsicCandidate support 
> and found two cases as proposed in  
>  - implCrypt() / CounterMode.java 
>  - implEncodeISOArray() / ISO_8859_1.java 
> 
> Confirmed no Issues for  with  jprt testing (-testset hotspot, core)

Please also run our RBT testing before pushing.

Thanks,
Tobias

> Thanks,
> Rahul
> 


Re: RFR JDK-8143330: No appropriate CCC request for two new protected methods added in JDK9b93 by JDK-8141132

2015-11-23 Thread Tobias Hartmann
Hi,

looks good to me.

Best,
Tobias

On 20.11.2015 19:41, Xueming Shen wrote:
> 
> I missed the override version in StringBuffer.java, thanks
> Tobias for catching it.
> 
> http://cr.openjdk.java.net/~sherman/8143553
> https://bugs.openjdk.java.net/browse/JDK-8143553
> 
> I did re-generate the api doc, but still missed it :-(
> 
> Thanks,
> Sherman
> 
> 
> On 11/19/2015 12:50 PM, Xueming Shen wrote:
>> Thanks Joe, Alan! The synopsis has been updated accordingly.
>>
>> On 11/19/2015 12:43 PM, Alan Bateman wrote:
>>>
>>>
>>> On 19/11/2015 20:39, joe darcy wrote:
 Looks good Sherman; thanks,

>>> Looks okay to me too. We should fix the synopsis on the JIRA issue or at 
>>> least put a better message in the change-set comment.
>>>
>>> -Alan.
>>
> 


Re: RFR JDK-8143330: No appropriate CCC request for two new protected methods added in JDK9b93 by JDK-8141132

2015-11-20 Thread Tobias Hartmann
Hi Sherman,

On 19.11.2015 21:27, Xueming Shen wrote:
> Hi
> 
> Please help review the change for 8143330.
> 
> Issue: https://bugs.openjdk.java.net/browse/JDK-8143330
> webrev: http://cr.openjdk.java.net/~sherman/8143330/webrev

What about this

733 protected synchronized void getBytes(byte dst[], int dstBegin, byte 
coder) {
734 super.getBytes(dst, dstBegin, coder);
735 }

which was added to StringBuffer?

Best,
Tobias

> Cause: two implementation methods ABS.getBytes/initBytes were
> mistakenly declared as "protected" and exposed to the public. Both
> should be package private.
> 
> Thanks,
> Sherman


Re: [9] RFR(S): 8005873: JRuby test_respond_to.rb asserts with: MT-unsafe modification of inline cache

2014-06-02 Thread Tobias Hartmann


On 28.05.2014 12:48, Vladimir Ivanov wrote:

Looks good.

It should be safe to sync on MTF instance since it's not accessible 
outside (MTF and MT.form() are package-private).


Best regards,
Vladimir Ivanov


Thank you, Vladimir.

Could someone please push the patch?

Thanks,
Tobias



On 5/28/14 1:49 PM, Tobias Hartmann wrote:

Hi,

thanks everyone for the feedback!

@Remi: I agree with Paul. This is not a problem because if the normal
read sees an outdated null value, a new LambdaForm is created and
setCachedLambdaForm(...) is executed. This will guarantee that the
non-null value is seen and used. The unnecessary creation of a new
LamdaForm is not a problem either.

@John: I added the code that you suggested to simulate CAS. Please find
the new webrev at:

http://cr.openjdk.java.net/~anoll/8005873/webrev.02/

Sorry for the delay, I was on vacation.

Thanks,
Tobias

On 19.05.2014 20:31, John Rose wrote:

On May 16, 2014, at 4:56 AM, Tobias Hartmann
tobias.hartm...@oracle.com wrote:


Is it sufficient then to use synchronized (lambdaForms) { ... } in
setCachedLambdaForm(..) and a normal read in cachedLambdaForm(..)?

Yes, that is how I see it.  The fast path is a racy non-volatile read
of a safely-published structure.

(If safe publication via arrays were broken, java.lang.String would be
broken.  But the JMM is carefully designed to support safe publication
of array elements, and through array elements.)

— John






Re: [9] RFR(S): 8005873: JRuby test_respond_to.rb asserts with: MT-unsafe modification of inline cache

2014-06-02 Thread Tobias Hartmann

Hi Vladimir,

On 02.06.2014 19:38, Vladimir Ivanov wrote:

On 6/2/14 9:23 PM, Christian Thalinger wrote:


On Jun 2, 2014, at 6:29 AM, Vladimir Ivanov 
vladimir.x.iva...@oracle.com wrote:



Tobias, I'll take care of it.


Are you also taking care of the backports?
Yes. I'll backport into 8u myself and for 7 I'll ask help from 
Sustaining.


Thank you!

Best,
Tobias



Best regards,
Vladimir Ivanov




Best regards,
Vladimir Ivanov

On 6/2/14 10:04 AM, Tobias Hartmann wrote:


On 28.05.2014 12:48, Vladimir Ivanov wrote:

Looks good.

It should be safe to sync on MTF instance since it's not accessible
outside (MTF and MT.form() are package-private).

Best regards,
Vladimir Ivanov


Thank you, Vladimir.

Could someone please push the patch?

Thanks,
Tobias



On 5/28/14 1:49 PM, Tobias Hartmann wrote:

Hi,

thanks everyone for the feedback!

@Remi: I agree with Paul. This is not a problem because if the 
normal

read sees an outdated null value, a new LambdaForm is created and
setCachedLambdaForm(...) is executed. This will guarantee that the
non-null value is seen and used. The unnecessary creation of a new
LamdaForm is not a problem either.

@John: I added the code that you suggested to simulate CAS. 
Please find

the new webrev at:

http://cr.openjdk.java.net/~anoll/8005873/webrev.02/

Sorry for the delay, I was on vacation.

Thanks,
Tobias

On 19.05.2014 20:31, John Rose wrote:

On May 16, 2014, at 4:56 AM, Tobias Hartmann
tobias.hartm...@oracle.com wrote:


Is it sufficient then to use synchronized (lambdaForms) { ... } in
setCachedLambdaForm(..) and a normal read in cachedLambdaForm(..)?
Yes, that is how I see it.  The fast path is a racy non-volatile 
read

of a safely-published structure.

(If safe publication via arrays were broken, java.lang.String 
would be
broken.  But the JMM is carefully designed to support safe 
publication

of array elements, and through array elements.)

— John










Re: [9] RFR(S): 8005873: JRuby test_respond_to.rb asserts with: MT-unsafe modification of inline cache

2014-05-28 Thread Tobias Hartmann

Hi,

thanks everyone for the feedback!

@Remi: I agree with Paul. This is not a problem because if the normal 
read sees an outdated null value, a new LambdaForm is created and 
setCachedLambdaForm(...) is executed. This will guarantee that the 
non-null value is seen and used. The unnecessary creation of a new 
LamdaForm is not a problem either.


@John: I added the code that you suggested to simulate CAS. Please find 
the new webrev at:


http://cr.openjdk.java.net/~anoll/8005873/webrev.02/

Sorry for the delay, I was on vacation.

Thanks,
Tobias

On 19.05.2014 20:31, John Rose wrote:

On May 16, 2014, at 4:56 AM, Tobias Hartmann tobias.hartm...@oracle.com wrote:


Is it sufficient then to use synchronized (lambdaForms) { ... } in 
setCachedLambdaForm(..) and a normal read in cachedLambdaForm(..)?

Yes, that is how I see it.  The fast path is a racy non-volatile read of a 
safely-published structure.

(If safe publication via arrays were broken, java.lang.String would be broken.  
But the JMM is carefully designed to support safe publication of array 
elements, and through array elements.)

— John