On Fri, 13 Mar 2026 10:38:32 GMT, Jatin Bhateja <[email protected]> wrote:
>> Hi all, >> >> This patch optimizes SIMD kernels making heavy use of broadcasted inputs >> through following reassociating ideal transformations. >> >> >> VectorOperation (VectorBroadcast INP1, VectorBroadcast INP2) => >> VectorBroadcast (ScalarOpration INP1, INP2) >> >> VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast >> INP2) INP3) => >> VectorOperation INP3 (VectorOperation >> (VectorBroadcast INP1) (VectorBroadcast INP2)) >> >> >> The idea is to push broadcasts across the vector operation and replace the >> vector with an equivalent, cheaper scalar variant. Currently, patch handles >> most common vector operations. >> >> Following are the performance number of benchmark included with this patch >> on latest generation x86 targets:- >> >> **AMD Turin (2.1GHz)** >> <img width="1122" height="355" alt="image" >> src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54" >> /> >> >> **Intel Granite Rapids (2.1GHz)** >> <img width="1105" height="325" alt="image" >> src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63" >> /> >> >> >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional > commit since the last revision: > > Reveiw comments resolutions src/hotspot/share/opto/vectornode.cpp line 1308: > 1306: if (!is_commutative_vector_operation(Opcode())) { > 1307: return nullptr; > 1308: } Seems we need also consider the predicated version of the node here? Can we check the predicated feature for a node inside `is_commutative_vector_operation()`? This is reasonable since a binary operations is not commutative if it has the mask input. src/hotspot/share/opto/vectornode.cpp line 1338: > 1336: BasicType bt = vect_type()->element_basic_type(); > 1337: if (can_push_through_broadcast(bt)) { > 1338: Suggestion: BasicType bt = vect_type()->element_basic_type(); if (!can_push_through_broadcast(bt)) { return nullptr; } // start transformation src/hotspot/share/opto/vectornode.cpp line 1354: > 1352: assert(req() < 4 || in(3)->Opcode() == Op_Replicate, ""); > 1353: sinp3 = in(3)->in(1); > 1354: } Do we need an assertion of the `req() >= 2 && req() <= 4` at the start? src/hotspot/share/opto/vectornode.cpp line 1358: > 1356: Node* sop = make_scalar(phase->C, Opcode(), bt, in(0), sinp1, > sinp2, sinp3); > 1357: if (sop != nullptr) { > 1358: sop = phase->transform(sop); Suggestion: Node* sop = make_scalar(phase->C, Opcode(), bt, in(0), sinp1, sinp2, sinp3); if (sop == nullptr) { return nullptr; } sop = phase->transform(sop); ... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2985272291 PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2985282538 PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2985279243 PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2985287079
