On Thu, 12 Feb 2026 05:18:36 GMT, Jatin Bhateja <[email protected]> wrote:

>> Hi all,
>>
>> This patch optimizes SIMD kernels making heavy use of broadcasted inputs
>> through following reassociating ideal transformations.
>>
>>
>>     VectorOperation (VectorBroadcast INP1, VectorBroadcast INP2) =>
>>         VectorBroadcast (ScalarOpration INP1, INP2)
>>
>>     VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) =>
>>         VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorOperation INP2))
>>
>>
>> The idea is to push broadcasts across the vector operation and replace the
>> vector with an equivalent, cheaper scalar variant. Currently, patch handles
>> most common vector operations.
>>
>> Following are the performance number of benchmark included with this patch
>> on latest generation x86 targets:-
>>
>> **AMD Turin (2.1GHz)**
>> <img width="1122" height="355" alt="image"
>> src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54"
>> />
>>
>> **Intel Granite Rapids (2.1GHz)**
>> <img width="1105" height="325" alt="image"
>> src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63"
>> />
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Review comments resolution

src/hotspot/share/opto/vectornode.cpp line 1193:

> 1191: }
> 1192:
> 1193: bool VectorNode::can_push_broadcasts_across_vector_operation(BasicType bt) {

Better to add a comment for this method?

src/hotspot/share/opto/vectornode.cpp line 1331:

> 1329:       return create_reassociated_node(this, in(1), in(2), in1_2, in1_1, phase);
> 1330:     }
> 1331:   }

These two parts are duplicated. How about merging the code like:
Suggestion:

    Node* in1 = in(1);
    Node* in2 = in(2);
    // Swap broadcast operation to left to make the following reassociation simpler
    if (in2->Opcode() == Op_Replicate) {
      swap(in1, in2);
    }
    if (in1->Opcode() == Op_Replicate && in2->Opcode() == Opcode()) {
      ...
    }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2839339938
PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2839336882
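
To illustrate the merge suggested for line 1331, here is a minimal standalone sketch of the "swap the broadcast to the left, then match once" pattern. It is a toy model only: `Op`, `Node`, and `matches_reassociation_shape` are hypothetical names invented for this example and are not the HotSpot C2 classes or the code in the patch. It assumes a commutative binary operation, so the operand order does not matter.

```cpp
#include <cassert>
#include <utility>

// Toy opcodes standing in for C2 opcodes (hypothetical, for illustration only).
enum class Op { Replicate, AddV, LoadV };

// Toy IR node with up to two inputs; not HotSpot's Node class.
struct Node {
  Op op;
  const Node* in1;
  const Node* in2;
  Node(Op o, const Node* a = nullptr, const Node* b = nullptr)
      : op(o), in1(a), in2(b) {}
};

// Returns true when the binary node `n` has a broadcast on one side and a node
// with the same opcode as `n` on the other side, in either operand order.
bool matches_reassociation_shape(const Node& n) {
  assert(n.in1 != nullptr && n.in2 != nullptr);  // n must be a binary node
  const Node* in1 = n.in1;
  const Node* in2 = n.in2;
  // Normalize: move the broadcast to the left so only one shape is checked.
  if (in2->op == Op::Replicate) {
    std::swap(in1, in2);
  }
  return in1->op == Op::Replicate && in2->op == n.op;
}
```

With the swap in place, `AddV(Replicate, AddV(...))` and `AddV(AddV(...), Replicate)` go through the same single check, which is what removes the duplicated branch.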
