Re: RFR: 8268231: Aarch64: Use ldp in intrinsics for String.compareTo [v6]

2021-09-20 Thread Andrew Haley
On Mon, 20 Sep 2021 09:52:45 GMT, Andrew Dinn  wrote:

>> Andrew, can you help us to approve this?
>
> I agree with Andrew Haley that this patch is not going to make an improvement 
> for anything but a very small number of applications. Processing of strings 
> over a few 10s of bytes is rare. On the other hand the doesn't seem to cause 
> any performance drop for the much more common case of processing short 
> strings. so it does no harm. Also, the new and old code are much the same in 
> terms of complexity so that is no reason to prefer one over the other. The 
> only real concern I have is that any change involves the risk of error and 
> the ratio of cases that might benefit to cases that might suffer from an 
> error is very low. I don't think that's a reason to avoid pushing this patch 
> upstream but it does suggest that we should not backport it.

OK, thanks. That seems like a sensible compromise.

-

PR: https://git.openjdk.java.net/jdk/pull/4722


Re: RFR: 8268231: Aarch64: Use ldp in intrinsics for String.compareTo [v6]

2021-09-20 Thread Andrew Dinn
On Fri, 17 Sep 2021 07:13:24 GMT, Wu Yan  wrote:

>> It's fine. I don't think it'll affect any real programs, so it's rather 
>> pointless. I don't know if that's any reason not to approve it.
>
> Andrew, can you help us to approve this?

I agree with Andrew Haley that this patch is not going to make an improvement 
for anything but a very small number of applications. Processing of strings 
over a few 10s of bytes is rare. On the other hand the doesn't seem to cause 
any performance drop for the much more common case of processing short strings. 
so it does no harm. Also, the new and old code are much the same in terms of 
complexity so that is no reason to prefer one over the other. The only real 
concern I have is that any change involves the risk of error and the ratio of 
cases that might benefit to cases that might suffer from an error is very low. 
I don't think that's a reason to avoid pushing this patch upstream but it does 
suggest that we should not backport it.

-

PR: https://git.openjdk.java.net/jdk/pull/4722


Re: RFR: 8268231: Aarch64: Use ldp in intrinsics for String.compareTo [v6]

2021-09-17 Thread Wu Yan
On Sun, 5 Sep 2021 13:23:21 GMT, Andrew Haley  wrote:

>> Thanks, I'll fix it.
>
> It's fine. I don't think it'll affect any real programs, so it's rather 
> pointless. I don't know if that's any reason not to approve it.

Andrew, can you help us to approve this?

-

PR: https://git.openjdk.java.net/jdk/pull/4722


Re: RFR: 8268231: Aarch64: Use ldp in intrinsics for String.compareTo [v6]

2021-09-05 Thread Andrew Haley
On Thu, 26 Aug 2021 09:26:24 GMT, Wu Yan  wrote:

>> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 4871:
>> 
>>> 4869: // exit from large loop when less than 64 bytes left to read or 
>>> we're about
>>> 4870: // to prefetch memory behind array border
>>> 4871: int largeLoopExitCondition = MAX(64, 
>>> SoftwarePrefetchHintDistance)/(isLL ? 1 : 2);
>> 
>> This breaks the Windows AArch64 build:
>> 
>> 
>> Creating support/modules_libs/java.base/server/jvm.dll from 1051 file(s)
>> d:\a\jdk\jdk\jdk\src\hotspot\cpu\aarch64\stubGenerator_aarch64.cpp(4871): 
>> error C3861: 'MAX': identifier not found
>> make[3]: *** [lib/CompileJvm.gmk:143: 
>> /cygdrive/d/a/jdk/jdk/jdk/build/windows-aarch64/hotspot/variant-server/libjvm
>> 
>> 
>> https://github.com/Wanghuang-Huawei/jdk/runs/3260986937
>> 
>> Should probably be left as `MAX2`.
>
> Thanks, I'll fix it.

It's fine. I don't think it'll affect any real programs, so it's rather 
pointless. I don't know if that's any reason not to approve it.

-

PR: https://git.openjdk.java.net/jdk/pull/4722


Re: RFR: 8268231: Aarch64: Use ldp in intrinsics for String.compareTo [v6]

2021-08-26 Thread Wu Yan
On Wed, 25 Aug 2021 07:40:56 GMT, Nick Gasson  wrote:

> I've run the benchmark on several different machines and didn't see any 
> performance regressions, and the speed-up for longer strings looks quite 
> good. I also ran jtreg tier1-3 with no new failures so I think this is ok.
> 
> If you fix the Windows build I'll approve it. But please wait for another 
> review, preferably from @theRealAph.

OK, Thank you very much!


> Note that JDK-8269559 (#5129) is also adding a JMH benchmark for this 
> intrinsic: it would be good if we could merge them, either now or later.

The JMH benchmark added by JDK-8269559 (#5129) can cover our test items 
(compareToLL and compareToUU), and can show the improvement of our patch, so we 
decided to delete our JMH benchmark in the next commit.
The test results using that JMH benchmark in JDK-8269559 are as follows:

Raspberry Pi 4B
base:
Benchmark   (delta)  (size)  Mode  Cnt   Score  
 Error  Units
StringCompareToDifferentLength.compareToLL2  24  avgt3   2.310 
? 0.050  ms/op
StringCompareToDifferentLength.compareToLL2  36  avgt3   2.818 
? 0.185  ms/op
StringCompareToDifferentLength.compareToLL2  72  avgt3   3.151 
? 0.215  ms/op
StringCompareToDifferentLength.compareToLL2 128  avgt3   4.171 
? 1.320  ms/op
StringCompareToDifferentLength.compareToLL2 256  avgt3   6.169 
? 0.653  ms/op
StringCompareToDifferentLength.compareToLL2 512  avgt3  10.911 
? 0.175  ms/op
StringCompareToDifferentLength.compareToLU2  24  avgt3   3.312 
? 0.102  ms/op
StringCompareToDifferentLength.compareToLU2  36  avgt3   4.162 
? 0.032  ms/op
StringCompareToDifferentLength.compareToLU2  72  avgt3   5.705 
? 0.152  ms/op
StringCompareToDifferentLength.compareToLU2 128  avgt3   9.301 
? 0.749  ms/op
StringCompareToDifferentLength.compareToLU2 256  avgt3  16.507 
? 1.353  ms/op
StringCompareToDifferentLength.compareToLU2 512  avgt3  30.160 
? 0.377  ms/op
StringCompareToDifferentLength.compareToUL2  24  avgt3   3.366 
? 0.280  ms/op
StringCompareToDifferentLength.compareToUL2  36  avgt3   4.308 
? 0.037  ms/op
StringCompareToDifferentLength.compareToUL2  72  avgt3   5.674 
? 0.210  ms/op
StringCompareToDifferentLength.compareToUL2 128  avgt3   9.358 
? 0.158  ms/op
StringCompareToDifferentLength.compareToUL2 256  avgt3  16.165 
? 0.158  ms/op
StringCompareToDifferentLength.compareToUL2 512  avgt3  29.857 
? 0.277  ms/op
StringCompareToDifferentLength.compareToUU2  24  avgt3   3.149 
? 0.209  ms/op
StringCompareToDifferentLength.compareToUU2  36  avgt3   3.157 
? 0.102  ms/op
StringCompareToDifferentLength.compareToUU2  72  avgt3   4.415 
? 0.073  ms/op
StringCompareToDifferentLength.compareToUU2 128  avgt3   6.244 
? 0.224  ms/op
StringCompareToDifferentLength.compareToUU2 256  avgt3  11.032 
? 0.080  ms/op
StringCompareToDifferentLength.compareToUU2 512  avgt3  20.942 
? 3.973  ms/op

opt:
Benchmark   (delta)  (size)  Mode  Cnt   Score  
 Error  Units
StringCompareToDifferentLength.compareToLL2  24  avgt3   2.319 
? 0.121  ms/op
StringCompareToDifferentLength.compareToLL2  36  avgt3   2.820 
? 0.096  ms/op
StringCompareToDifferentLength.compareToLL2  72  avgt3   2.511 
? 0.024  ms/op
StringCompareToDifferentLength.compareToLL2 128  avgt3   3.496 
? 0.382  ms/op
StringCompareToDifferentLength.compareToLL2 256  avgt3   5.215 
? 0.210  ms/op
StringCompareToDifferentLength.compareToLL2 512  avgt3   7.772 
? 0.448  ms/op
StringCompareToDifferentLength.compareToLU2  24  avgt3   3.432 
? 0.249  ms/op
StringCompareToDifferentLength.compareToLU2  36  avgt3   4.156 
? 0.052  ms/op
StringCompareToDifferentLength.compareToLU2  72  avgt3   5.735 
? 0.043  ms/op
StringCompareToDifferentLength.compareToLU2 128  avgt3   9.215 
? 0.394  ms/op
StringCompareToDifferentLength.compareToLU2 256  avgt3  16.373 
? 0.515  ms/op
StringCompareToDifferentLength.compareToLU2 512  avgt3  29.906 
? 0.245  ms/op
StringCompareToDifferentLength.compareToUL2  24  avgt3   3.361 
? 0.116  ms/op
StringCompareToDifferentLength.compareToUL2  36  avgt3   4.253 
? 0.061  ms/op
StringCompareToDifferentLength.compareToUL2  72  avgt3   5.744 
? 0.082  ms/op
StringCompareToDifferentLength.compareToUL2 128  avgt3   9.167 
? 0.343  ms/op
StringCompareToDifferentLength.compareToUL2 256  avgt3  16.591 
? 0.999  ms/op

Re: RFR: 8268231: Aarch64: Use ldp in intrinsics for String.compareTo [v6]

2021-08-25 Thread Nick Gasson
On Fri, 6 Aug 2021 09:50:54 GMT, Wang Huang  wrote:

>> Dear all, 
>> Can you do me a favor to review this patch. This patch use `ldp` to 
>> implement String.compareTo.
>>
>> * We add a JMH test case 
>> * Here is the result of this test case
>>  
>> Benchmark   |(size)| Mode| Cnt|Score | Error  |Units 
>> -|--|-||--||-
>> StringCompare.compareLL   |  64  | avgt| 5  |7.992 | ±   0.005|us/op
>> StringCompare.compareLL   |  72  | avgt| 5  |15.029| ±   0.006|us/op
>> StringCompare.compareLL   |  80  | avgt| 5  |14.655| ±   0.011|us/op
>> StringCompare.compareLL   |  91  | avgt| 5  |16.363| ±   0.12 |us/op
>> StringCompare.compareLL   |  101 | avgt| 5  |16.966| ±   0.007|us/op
>> StringCompare.compareLL   |  121 | avgt| 5  |19.276| ±   0.006|us/op
>> StringCompare.compareLL   |  181 | avgt| 5  |19.002| ±   0.417|us/op
>> StringCompare.compareLL   |  256 | avgt| 5  |24.707| ±   0.041|us/op
>> StringCompare.compareLLWithLdp|  64  | avgt| 5  |8.001   | ± 
>> 0.121|us/op
>> StringCompare.compareLLWithLdp|  72  | avgt| 5  |11.573| ±   0.003|us/op
>> StringCompare.compareLLWithLdp|  80  | avgt| 5  |6.861 | ±   0.004|us/op
>> StringCompare.compareLLWithLdp|  91  | avgt| 5  |12.774| ±   0.201|us/op
>> StringCompare.compareLLWithLdp|  101 | avgt| 5  |8.691 | ±   0.004|us/op
>> StringCompare.compareLLWithLdp|  121 | avgt| 5  |11.091| ±   1.342|us/op
>> StringCompare.compareLLWithLdp|  181 | avgt| 5  |14.64 | ±   0.581|us/op
>> StringCompare.compareLLWithLdp|  256 | avgt| 5  |25.879| ±   1.775|us/op
>> StringCompare.compareUU   |  64  | avgt| 5  |13.476| ±   0.01 |us/op
>> StringCompare.compareUU   |  72  | avgt| 5  |15.078| ±   0.006|us/op
>> StringCompare.compareUU   |  80  | avgt| 5  |23.512| ±   0.011|us/op
>> StringCompare.compareUU   |  91  | avgt| 5  |24.284| ±   0.008|us/op
>> StringCompare.compareUU   |  101 | avgt| 5  |20.707| ±   0.017|us/op
>> StringCompare.compareUU   |  121 | avgt| 5  |29.302| ±   0.011|us/op
>> StringCompare.compareUU   |  181 | avgt| 5  |39.31   | ± 
>> 0.016|us/op
>> StringCompare.compareUU   |  256 | avgt| 5  |54.592| ±   0.392|us/op
>> StringCompare.compareUUWithLdp|  64  | avgt| 5  |16.389| ±   0.008|us/op
>> StringCompare.compareUUWithLdp|  72  | avgt| 5  |10.71 | ±   0.158|us/op
>> StringCompare.compareUUWithLdp|  80  | avgt| 5  |11.488| ±   0.024|us/op
>> StringCompare.compareUUWithLdp|  91  | avgt| 5  |13.412| ±   0.006|us/op
>> StringCompare.compareUUWithLdp|  101 | avgt| 5  |16.245| ±   0.434|us/op
>> StringCompare.compareUUWithLdp|  121 | avgt| 5  |16.597| ±   0.016|us/op
>> StringCompare.compareUUWithLdp|  181 | avgt| 5  |27.373| ±   0.017|us/op
>> StringCompare.compareUUWithLdp|  256 | avgt| 5  |41.74 | ±   3.5  |us/op
>> 
>> From this table, we can see that in most cases, our patch is better than old 
>> one.
>> 
>> Thank you for your review. Any suggestions are welcome.
>
> Wang Huang has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   fix codestyle

I've run the benchmark on several different machines and didn't see any 
performance regressions, and the speed-up for longer strings looks quite good. 
I also ran jtreg tier1-3 with no new failures so I think this is ok.

If you fix the Windows build I'll approve it. But please wait for another 
review, preferably from @theRealAph.

Note that JDK-8269559 (#5129) is also adding a JMH benchmark for this 
intrinsic: it would be good if we could merge them, either now or later.

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 4871:

> 4869: // exit from large loop when less than 64 bytes left to read or 
> we're about
> 4870: // to prefetch memory behind array border
> 4871: int largeLoopExitCondition = MAX(64, 
> SoftwarePrefetchHintDistance)/(isLL ? 1 : 2);

This breaks the Windows AArch64 build:


Creating support/modules_libs/java.base/server/jvm.dll from 1051 file(s)
d:\a\jdk\jdk\jdk\src\hotspot\cpu\aarch64\stubGenerator_aarch64.cpp(4871): error 
C3861: 'MAX': identifier not found
make[3]: *** [lib/CompileJvm.gmk:143: 
/cygdrive/d/a/jdk/jdk/jdk/build/windows-aarch64/hotspot/variant-server/libjvm


https://github.com/Wanghuang-Huawei/jdk/runs/3260986937

Should probably be left as `MAX2`.

-

Changes requested by ngasson (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4722


Re: RFR: 8268231: Aarch64: Use ldp in intrinsics for String.compareTo [v6]

2021-08-06 Thread Wang Huang
> Dear all, 
> Can you do me a favor to review this patch. This patch use `ldp` to 
> implement String.compareTo.
>
> * We add a JMH test case 
> * Here is the result of this test case
>  
> Benchmark|(size)| Mode| Cnt|Score | Error  |Units 
> -|--|-||--||-
> StringCompare.compareLL   |  64  | avgt| 5  |7.992 | ±0.005|us/op
> StringCompare.compareLL   |  72  | avgt| 5  |15.029| ±0.006|us/op
> StringCompare.compareLL   |  80  | avgt| 5  |14.655| ±0.011|us/op
> StringCompare.compareLL   |  91  | avgt| 5  |16.363| ±0.12 |us/op
> StringCompare.compareLL   |  101 | avgt| 5  |16.966| ±0.007|us/op
> StringCompare.compareLL   |  121 | avgt| 5  |19.276| ±0.006|us/op
> StringCompare.compareLL   |  181 | avgt| 5  |19.002| ±0.417|us/op
> StringCompare.compareLL   |  256 | avgt| 5  |24.707| ±0.041|us/op
> StringCompare.compareLLWithLdp|  64  | avgt| 5  |8.001| ± 
> 0.121|us/op
> StringCompare.compareLLWithLdp|  72  | avgt| 5  |11.573| ±0.003|us/op
> StringCompare.compareLLWithLdp|  80  | avgt| 5  |6.861 | ±0.004|us/op
> StringCompare.compareLLWithLdp|  91  | avgt| 5  |12.774| ±0.201|us/op
> StringCompare.compareLLWithLdp|  101 | avgt| 5  |8.691 | ±0.004|us/op
> StringCompare.compareLLWithLdp|  121 | avgt| 5  |11.091| ±1.342|us/op
> StringCompare.compareLLWithLdp|  181 | avgt| 5  |14.64 | ±0.581|us/op
> StringCompare.compareLLWithLdp|  256 | avgt| 5  |25.879| ±1.775|us/op
> StringCompare.compareUU   |  64  | avgt| 5  |13.476| ±0.01 |us/op
> StringCompare.compareUU   |  72  | avgt| 5  |15.078| ±0.006|us/op
> StringCompare.compareUU   |  80  | avgt| 5  |23.512| ±0.011|us/op
> StringCompare.compareUU   |  91  | avgt| 5  |24.284| ±0.008|us/op
> StringCompare.compareUU   |  101 | avgt| 5  |20.707| ±0.017|us/op
> StringCompare.compareUU   |  121 | avgt| 5  |29.302| ±0.011|us/op
> StringCompare.compareUU   |  181 | avgt| 5  |39.31| ± 
> 0.016|us/op
> StringCompare.compareUU   |  256 | avgt| 5  |54.592| ±0.392|us/op
> StringCompare.compareUUWithLdp|  64  | avgt| 5  |16.389| ±0.008|us/op
> StringCompare.compareUUWithLdp|  72  | avgt| 5  |10.71 | ±0.158|us/op
> StringCompare.compareUUWithLdp|  80  | avgt| 5  |11.488| ±0.024|us/op
> StringCompare.compareUUWithLdp|  91  | avgt| 5  |13.412| ±0.006|us/op
> StringCompare.compareUUWithLdp|  101 | avgt| 5  |16.245| ±0.434|us/op
> StringCompare.compareUUWithLdp|  121 | avgt| 5  |16.597| ±0.016|us/op
> StringCompare.compareUUWithLdp|  181 | avgt| 5  |27.373| ±0.017|us/op
> StringCompare.compareUUWithLdp|  256 | avgt| 5  |41.74 | ±3.5  |us/op
> 
> From this table, we can see that in most cases, our patch is better than old 
> one.
> 
> Thank you for your review. Any suggestions are welcome.

Wang Huang has updated the pull request incrementally with one additional 
commit since the last revision:

  fix codestyle

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4722/files
  - new: https://git.openjdk.java.net/jdk/pull/4722/files/60dd0516..2f756261

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=4722=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=4722=04-05

  Stats: 9 lines in 1 file changed: 0 ins; 1 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4722.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4722/head:pull/4722

PR: https://git.openjdk.java.net/jdk/pull/4722