https://github.com/Scottcjn created 
https://github.com/llvm/llvm-project/pull/187315

## Summary

Add the missing `_mm_loadu_si64` intrinsic to 
`clang/lib/Headers/ppc_wrappers/emmintrin.h`. It loads a 64-bit integer from 
unaligned memory into the lower 64 bits of an `__m128i` and zeroes the upper 
64 bits.

GCC provides this intrinsic in its PPC SSE2 compatibility header but Clang 
doesn't, causing build failures for SSE2 code compiled on PowerPC via the 
compatibility layer (e.g., Skia).

The implementation delegates to `_mm_set_epi64`, matching the existing 
`_mm_loadl_epi64` pattern already in the file.
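
As a rough illustration of the semantics described above (not the code in this patch), the following portable sketch models the load with plain C. The `model_m128i` type and `model_loadu_si64` function are hypothetical stand-ins invented for this example; the real intrinsic operates on the vector type `__m128i` and delegates to `_mm_set_epi64`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __m128i: two 64-bit lanes, lane[0] is the low half. */
typedef struct {
  uint64_t lane[2];
} model_m128i;

/* Models the described behaviour of _mm_loadu_si64: copy 8 bytes from a
   possibly unaligned address into the low lane and zero the high lane. */
static model_m128i model_loadu_si64(const void *p) {
  model_m128i r;
  memcpy(&r.lane[0], p, sizeof(uint64_t)); /* memcpy has no alignment requirement */
  r.lane[1] = 0;
  return r;
}
```

Calling `model_loadu_si64(buf + 1)` on a byte buffer reads the eight bytes starting at an odd address, which is the case the "unaligned" wording is meant to cover.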

Fixes https://github.com/llvm/llvm-project/issues/91247

## Test plan

- Added call to `_mm_loadu_si64` in `ppc-emmintrin.c` test
- Added FileCheck pattern verifying the expected `_mm_set_epi64` call is emitted
- Verified on IBM POWER8 S824 (ppc64le) that the intrinsic compiles and 
produces correct VSX instructions

From 8a3d02c852231f1a923e121b60aa911890dec5a4 Mon Sep 17 00:00:00 2001
From: Scott Boudreaux <[email protected]>
Date: Wed, 18 Mar 2026 11:02:09 -0500
Subject: [PATCH] [Clang][PowerPC] Add _mm_loadu_si64 to
 ppc_wrappers/emmintrin.h

Add the missing _mm_loadu_si64 intrinsic to the PowerPC SSE2
compatibility header. This function loads a 64-bit integer from an
unaligned memory location into the lower 64 bits of an __m128i,
zeroing the upper bits.

GCC already provides this intrinsic in its PPC SSE2 wrapper. Its
absence in Clang's ppc_wrappers/emmintrin.h causes build failures
for code using SSE2 intrinsics on PowerPC via the compatibility
layer (e.g., Skia).

The implementation delegates to _mm_set_epi64, matching the
existing _mm_loadl_epi64 pattern.

Fixes #91247

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
---
 clang/lib/Headers/ppc_wrappers/emmintrin.h | 6 ++++++
 clang/test/CodeGen/PowerPC/ppc-emmintrin.c | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/clang/lib/Headers/ppc_wrappers/emmintrin.h b/clang/lib/Headers/ppc_wrappers/emmintrin.h
index fc18ab9d43b15..fbe5294988efd 100644
--- a/clang/lib/Headers/ppc_wrappers/emmintrin.h
+++ b/clang/lib/Headers/ppc_wrappers/emmintrin.h
@@ -762,6 +762,12 @@ extern __inline __m128i
   return (__m128i)(vec_vsx_ld(0, (signed int const *)__P));
 }
 
+extern __inline __m128i
+    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+    _mm_loadu_si64(void const *__P) {
+  return _mm_set_epi64((__m64)0LL, *(__m64 *)__P);
+}
+
 extern __inline __m128i
     __attribute__((__gnu_inline__, __always_inline__, __artificial__))
     _mm_loadl_epi64(__m128i_u const *__P) {
diff --git a/clang/test/CodeGen/PowerPC/ppc-emmintrin.c b/clang/test/CodeGen/PowerPC/ppc-emmintrin.c
index 88bf84eb81574..01053b937adb1 100644
--- a/clang/test/CodeGen/PowerPC/ppc-emmintrin.c
+++ b/clang/test/CodeGen/PowerPC/ppc-emmintrin.c
@@ -607,6 +607,7 @@ test_load() {
   resd = _mm_loadl_pd(md1, dp);
   resd = _mm_loadr_pd(dp);
   resd = _mm_loadu_pd(dp);
+  resi = _mm_loadu_si64(mip);
   resi = _mm_loadu_si128(mip);
 }
 
@@ -654,6 +655,9 @@ test_load() {
 // CHECK: %[[ADDR:[0-9a-zA-Z_.]+]] = load ptr, ptr %{{[0-9a-zA-Z_.]+}}, align 8
 // CHECK: call <2 x double> @vec_vsx_ld(int, double const*)(i32 noundef signext 0, ptr noundef %[[ADDR]])
 
+// CHECK-LABEL: define available_externally <2 x i64> @_mm_loadu_si64
+// CHECK: call <2 x i64> @_mm_set_epi64(i64 noundef 0, i64 noundef %{{[0-9a-zA-Z_.]+}})
+
 // CHECK-LABEL: define available_externally <2 x i64> @_mm_loadu_si128
 // CHECK: load ptr, ptr %{{[0-9a-zA-Z_.]+}}, align 8
 // CHECK: call <4 x i32> @vec_vsx_ld(int, int const*)(i32 noundef signext 0, ptr noundef %{{[0-9a-zA-Z_.]+}})

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits