On Tue, 10 Mar 2026 06:37:29 GMT, Jatin Bhateja <[email protected]> wrote:
>> Hi all,
>>
>> This patch optimizes SIMD kernels making heavy use of broadcasted inputs
>> through the following reassociating ideal transformations.
>>
>>
>> VectorOperation (VectorBroadcast INP1, VectorBroadcast INP2) =>
>> VectorBroadcast (ScalarOperation INP1, INP2)
>>
>> VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast
>> INP2) INP3) =>
>> VectorOperation INP3 (VectorOperation
>> (VectorBroadcast INP1) (VectorBroadcast INP2))
>>
>>
>> The idea is to push broadcasts across the vector operation and replace the
>> vector with an equivalent, cheaper scalar variant. Currently, the patch handles
>> most common vector operations.
>>
>> Following are the performance numbers of the benchmark included with this patch
>> on latest generation x86 targets:
>>
>> **AMD Turin (2.1GHz)**
>> <img width="1122" height="355" alt="image" src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54" />
>>
>> **Intel Granite Rapids (2.1GHz)**
>> <img width="1105" height="325" alt="image" src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63" />
>>
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Review comments resolution

src/hotspot/share/opto/vectornode.cpp line 1293:

> 1291: // scalar operation.
> 1292: //
> 1293: // VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) =>

The comment looks confusing: it mentions `VectorBroadcast` while the corresponding node is named `ReplicateNode`.

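
For illustration, the first rewrite quoted above can be modelled on a toy expression tree. This is only a sketch under assumptions: `Expr`, `Kind`, `make` and `push_broadcast` are hypothetical names, not HotSpot's `Node` classes or the code in the patch, and only an integer add is handled.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy IR for illustration only; these are NOT HotSpot node classes.
enum class Kind { Scalar, Replicate, ScalarAdd, VectorAdd };

struct Expr {
    Kind kind = Kind::Scalar;
    int value = 0;                    // only meaningful for Kind::Scalar
    std::shared_ptr<Expr> in1, in2;   // operands; unused ones stay null
};
using ExprPtr = std::shared_ptr<Expr>;

// Hypothetical helper that builds a node of the toy IR.
ExprPtr make(Kind kind, int value = 0, ExprPtr in1 = nullptr, ExprPtr in2 = nullptr) {
    auto e = std::make_shared<Expr>();
    e->kind = kind;
    e->value = value;
    e->in1 = std::move(in1);
    e->in2 = std::move(in2);
    return e;
}

// VectorAdd (Replicate a) (Replicate b)  =>  Replicate (ScalarAdd a b)
// Returns the input unchanged when the pattern does not match.
ExprPtr push_broadcast(const ExprPtr& n) {
    assert(n != nullptr);
    if (n->kind == Kind::VectorAdd &&
        n->in1 && n->in1->kind == Kind::Replicate &&
        n->in2 && n->in2->kind == Kind::Replicate) {
        // Do the work once on the scalars, then broadcast the single result.
        ExprPtr sum = make(Kind::ScalarAdd, 0, n->in1->in1, n->in2->in1);
        return make(Kind::Replicate, 0, sum);
    }
    return n;
}
```

Calling `push_broadcast` on `VectorAdd(Replicate(2), Replicate(3))` yields `Replicate(ScalarAdd(2, 3))` in this model. The second (reassociating) rule is not shown; it would additionally require the operation to be associative and commutative.
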
src/hotspot/share/opto/vectornode.hpp line 158:

> 156:   static int opcode(int sopc, BasicType bt);  // scalar_opc -> vector_opc
> 157:   static int scalar_opcode(int vopc, BasicType bt);  // vector_opc -> scalar_opc, 0 if not handled
> 158:   static Node* make_scalar(Compile* c, int sopc, Node* control, Node* in1, Node* in2, Node* in3);

It's a bit weird to see `VectorNode::make_scalar()`. It can be either moved to `Node` or accept vector opcode and do vector->scalar opcode conversion internally.

Also, it would be nice to ensure that `VectorNode::opcode()` and `VectorNode::scalar_opcode()` agree. And `VectorNode::make_scalar()` can be one place where it is checked (`assert(opcode(scalar_opcode(vopc)) == vopc, "%s", NodeClassNames[vopc])`).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2921957722
PR Review Comment: https://git.openjdk.org/jdk/pull/25617#discussion_r2921991340
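
To make the round-trip suggestion concrete, here is a minimal standalone sketch of what such a check could look like. The opcode values and the functions `to_vector_opcode`, `to_scalar_opcode` and `checked_scalar_opcode` are hypothetical stand-ins, the `BasicType` parameter is omitted for brevity, and plain `assert` replaces HotSpot's formatted assert.

```cpp
#include <cassert>

// Hypothetical opcodes for illustration; not HotSpot's real Op_* values.
enum Opcode { Op_None = 0, Op_AddI, Op_MulI, Op_AddVI, Op_MulVI };

// scalar_opc -> vector_opc, Op_None if not handled.
int to_vector_opcode(int sopc) {
    switch (sopc) {
        case Op_AddI: return Op_AddVI;
        case Op_MulI: return Op_MulVI;
        default:      return Op_None;
    }
}

// vector_opc -> scalar_opc, Op_None if not handled.
int to_scalar_opcode(int vopc) {
    switch (vopc) {
        case Op_AddVI: return Op_AddI;
        case Op_MulVI: return Op_MulI;
        default:       return Op_None;
    }
}

// Accepts a vector opcode, converts it internally, and verifies that the
// two mapping tables agree whenever the opcode is handled at all.
int checked_scalar_opcode(int vopc) {
    int sopc = to_scalar_opcode(vopc);
    assert((sopc == Op_None || to_vector_opcode(sopc) == vopc) &&
           "scalar and vector opcode tables disagree");
    return sopc;
}
```

With the check placed in the single conversion entry point, an entry added to only one of the two tables fails at its first use in a debug build instead of silently producing a mismatched node.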
