On Mon, 13 Jan 2025 17:12:31 GMT, Galder Zamarreño <gal...@openjdk.org> wrote:
>> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in >> order to help improve vectorization performance. >> >> Currently vectorization does not kick in for loops containing either of >> these calls because of the following error: >> >> >> VLoop::check_preconditions: failed: control flow in loop not allowed >> >> >> The control flow is due to the java implementation for these methods, e.g. >> >> >> public static long max(long a, long b) { >> return (a >= b) ? a : b; >> } >> >> >> This patch intrinsifies the calls to replace the CmpL + Bool nodes for >> MaxL/MinL nodes respectively. >> By doing this, vectorization no longer finds the control flow and so it can >> carry out the vectorization. >> E.g. >> >> >> SuperWord::transform_loop: >> Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt >> strip_mined >> 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] >> inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] >> !jvms: Test::test @ bci:14 (line 21) >> >> >> Applying the same changes to `ReductionPerf` as in >> https://github.com/openjdk/jdk/pull/13056, we can compare the results before >> and after. Before the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1155 >> long max 1173 >> >> >> After the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1042 >> long max 1042 >> >> >> This patch does not add an platform-specific backend implementations for the >> MaxL/MinL nodes. >> Therefore, it still relies on the macro expansion to transform those into >> CMoveL. >> >> I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these >> results: >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PA... > > Galder Zamarreño has updated the pull request incrementally with one > additional commit since the last revision: > > Make sure it runs with cpus with either avx512 or asimd Changes requested by epeter (Reviewer). test/hotspot/jtreg/compiler/intrinsics/math/TestMinMaxInlining.java line 2: > 1: /* > 2: * Copyright (c) 2024, Red Hat, Inc. All rights reserved. Year 2025? No idea what the Red Hat policy is though. test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 104: > 102: } > 103: > 104: public static void ReductionInit(long[] longs, int probability) { Suggestion: public static void reductionInit(long[] longs, int probability) { This is a method name, not a class - so I think it should start lower-case, right? test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 107: > 105: int aboveCount, abovePercent; > 106: > 107: // Iterate until you find a set that matches the requirement > probability Can you give a high-level definition / explanation what this does? Also: what is the expected number of rounds you iterate here? I'm asking because I would like to be sure that a timeout is basically impossible because the probability is too low. test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 121: > 119: } else { > 120: // Decrement by at least 1 > 121: long decrement = random.nextLong(10) + 1; Nit: I would call it `diffToMax`, because you are really just going to get a value below the `max`, and you are not decrementing the `max`. But up to you if you want to change it. ------------- PR Review: https://git.openjdk.org/jdk/pull/20098#pullrequestreview-2549073452 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914424784 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914426190 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914431043 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914434395