On Tue, 3 Jun 2025 14:54:16 GMT, Jatin Bhateja <[email protected]> wrote:

> Hi all,
> 
> This patch optimizes SIMD kernels that make heavy use of broadcasted inputs 
> through the following reassociating ideal transformations.
> 
> 
>  VectorOperation (VectorBroadcast INP1,  VectorBroadcast INP2) => 
>                             VectorBroadcast (ScalarOperation INP1, INP2)
> 
>  VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) =>
>                             VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorBroadcast INP2))
> 
> 
> The idea is to push broadcasts across the vector operation and replace the 
> vector operation with an equivalent, cheaper scalar variant.  Currently, the 
> patch handles most common vector operations.
> 
> The following are the performance numbers for the benchmark included with 
> this patch on latest-generation x86 targets:
> 
> **AMD Turin (2.1GHz)**
> <img width="1122" height="355" alt="image" 
> src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54"
>  />
> 
> **Intel Granite Rapids (2.1GHz)**
> <img width="1105" height="325" alt="image" 
> src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63"
>  />
> 
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> ---------
> - [x] I confirm that I make this contribution in accordance with the [OpenJDK 
> Interim AI Policy](https://openjdk.org/legal/ai).
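
The two rewrite rules in the quoted description can be modeled with a minimal sketch that treats vectors as plain `int[]` arrays. This is not the C2 implementation and does not use the Vector API: the class name, `LANES`, `broadcast`, and `lanewise` are hypothetical helpers made up for illustration, and integer multiplication stands in for `VectorOperation`. It only checks that both sides of each rule yield the same lanes.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

public class BroadcastReassociationSketch {
    // Hypothetical lane count; real vector widths depend on the target.
    static final int LANES = 8;

    // Models VectorBroadcast: every lane holds the same scalar.
    static int[] broadcast(int scalar) {
        int[] v = new int[LANES];
        Arrays.fill(v, scalar);
        return v;
    }

    // Models a lane-wise binary VectorOperation.
    static int[] lanewise(IntBinaryOperator op, int[] a, int[] b) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = op.applyAsInt(a[i], b[i]);
        }
        return r;
    }

    public static void main(String[] args) {
        // int multiplication stays associative and commutative under wraparound.
        IntBinaryOperator mul = (x, y) -> x * y;
        int inp1 = 3, inp2 = 5;
        int[] inp3 = {1, 2, 3, 4, 5, 6, 7, 8};

        // Rule 1: Op(Broadcast(a), Broadcast(b)) => Broadcast(ScalarOp(a, b))
        int[] before1 = lanewise(mul, broadcast(inp1), broadcast(inp2));
        int[] after1 = broadcast(mul.applyAsInt(inp1, inp2));
        System.out.println(Arrays.equals(before1, after1)); // prints true

        // Rule 2: Op(Broadcast(a), Op(Broadcast(b), v))
        //      => Op(v, Op(Broadcast(a), Broadcast(b)))
        int[] before2 = lanewise(mul, broadcast(inp1),
                                 lanewise(mul, broadcast(inp2), inp3));
        int[] after2 = lanewise(mul, inp3,
                                lanewise(mul, broadcast(inp1), broadcast(inp2)));
        System.out.println(Arrays.equals(before2, after2)); // prints true
    }
}
```

In this model, the inner operation produced by rule 2 has two broadcast inputs, so rule 1 can then collapse it into a single scalar operation followed by one broadcast.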

This pull request has now been integrated.

Changeset: 7ff7efd5
Author:    Jatin Bhateja <[email protected]>
URL:       https://git.openjdk.org/jdk/commit/7ff7efd59de98b29357c1f5bb424e90639b51be1
Stats:     2357 lines in 7 files changed: 2335 ins; 14 del; 8 mod

8358521: Optimize vector operations by reassociating broadcasted inputs

Reviewed-by: epeter, vlivanov, xgong

-------------

PR: https://git.openjdk.org/jdk/pull/25617
