[PATCH] D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst

2019-11-07 Thread Nemanja Ivanovic via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe0407f549653: [PowerPC][Altivec] Fix offsets for vec_xl and 
vec_xst (authored by nemanjai).
Herald added subscribers: shchenz, wuzish.

Changed prior to commit:
  https://reviews.llvm.org/D63636?vs=206123&id=228355#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63636/new/

https://reviews.llvm.org/D63636

Files:
  clang/lib/Headers/altivec.h
  clang/test/CodeGen/builtins-ppc-xl-xst.c

Index: clang/test/CodeGen/builtins-ppc-xl-xst.c
===
--- /dev/null
+++ clang/test/CodeGen/builtins-ppc-xl-xst.c
@@ -0,0 +1,490 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: powerpc-registered-target
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx \
+// RUN:   -triple powerpc64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx \
+// RUN:   -target-feature +power8-vector -triple powerpc64le-unknown-unknown \
+// RUN:   -emit-llvm %s -o - | FileCheck %s -check-prefixes=CHECK,CHECK-P8
+#include <altivec.h>
+
+// CHECK-LABEL: @test1(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[__VEC_ADDR_I:%.*]] = alloca <8 x i16>, align 16
+// CHECK-NEXT:[[__OFFSET_ADDR_I1:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I2:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I3:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[__OFFSET_ADDR_I:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[C_ADDR:%.*]] = alloca <8 x i16>*, align 8
+// CHECK-NEXT:[[PTR_ADDR:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:store <8 x i16>* [[C:%.*]], <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store i16* [[PTR:%.*]], i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load i16*, i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:store i64 3, i64* [[__OFFSET_ADDR_I]], align 8
+// CHECK-NEXT:store i16* [[TMP0]], i16** [[__PTR_ADDR_I]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i16*, i16** [[__PTR_ADDR_I]], align 8
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast i16* [[TMP1]] to i8*
+// CHECK-NEXT:[[TMP3:%.*]] = load i64, i64* [[__OFFSET_ADDR_I]], align 8
+// CHECK-NEXT:[[ADD_PTR_I:%.*]] = getelementptr inbounds i8, i8* [[TMP2]], i64 [[TMP3]]
+// CHECK-NEXT:store i8* [[ADD_PTR_I]], i8** [[__ADDR_I]], align 8
+// CHECK-NEXT:[[TMP4:%.*]] = load i8*, i8** [[__ADDR_I]], align 8
+// CHECK-NEXT:[[TMP5:%.*]] = bitcast i8* [[TMP4]] to <8 x i16>*
+// CHECK-NEXT:[[TMP6:%.*]] = load <8 x i16>, <8 x i16>* [[TMP5]], align 1
+// CHECK-NEXT:[[TMP7:%.*]] = load <8 x i16>*, <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store <8 x i16> [[TMP6]], <8 x i16>* [[TMP7]], align 16
+// CHECK-NEXT:[[TMP8:%.*]] = load <8 x i16>*, <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:[[TMP9:%.*]] = load <8 x i16>, <8 x i16>* [[TMP8]], align 16
+// CHECK-NEXT:[[TMP10:%.*]] = load i16*, i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:store <8 x i16> [[TMP9]], <8 x i16>* [[__VEC_ADDR_I]], align 16
+// CHECK-NEXT:store i64 7, i64* [[__OFFSET_ADDR_I1]], align 8
+// CHECK-NEXT:store i16* [[TMP10]], i16** [[__PTR_ADDR_I2]], align 8
+// CHECK-NEXT:[[TMP11:%.*]] = load i16*, i16** [[__PTR_ADDR_I2]], align 8
+// CHECK-NEXT:[[TMP12:%.*]] = bitcast i16* [[TMP11]] to i8*
+// CHECK-NEXT:[[TMP13:%.*]] = load i64, i64* [[__OFFSET_ADDR_I1]], align 8
+// CHECK-NEXT:[[ADD_PTR_I4:%.*]] = getelementptr inbounds i8, i8* [[TMP12]], i64 [[TMP13]]
+// CHECK-NEXT:store i8* [[ADD_PTR_I4]], i8** [[__ADDR_I3]], align 8
+// CHECK-NEXT:[[TMP14:%.*]] = load <8 x i16>, <8 x i16>* [[__VEC_ADDR_I]], align 16
+// CHECK-NEXT:[[TMP15:%.*]] = load i8*, i8** [[__ADDR_I3]], align 8
+// CHECK-NEXT:[[TMP16:%.*]] = bitcast i8* [[TMP15]] to <8 x i16>*
+// CHECK-NEXT:store <8 x i16> [[TMP14]], <8 x i16>* [[TMP16]], align 1
+// CHECK-NEXT:ret void
+//
+void test1(vector signed short *c, signed short *ptr) {
+*c = vec_xl(3ll, ptr);
+vec_xst(*c, 7ll, ptr);
+}
+
+// CHECK-LABEL: @test2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[__VEC_ADDR_I:%.*]] = alloca <8 x i16>, align 16
+// CHECK-NEXT:[[__OFFSET_ADDR_I1:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I2:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I3:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[__OFFSET_ADDR_I:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[C_ADDR:%.*]] = alloca <8 x i16>*, align 8
+// CHECK-NEXT:[[PTR_ADDR:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:store <8 x i16>* [[C:%.*]], <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store i16* 

[PATCH] D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst

2019-06-22 Thread Jinsong Ji via Phabricator via cfe-commits
jsji accepted this revision.
jsji added a comment.
This revision is now accepted and ready to land.

LGTM. Thanks for fixing.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63636/new/

https://reviews.llvm.org/D63636



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst

2019-06-22 Thread Nemanja Ivanovic via Phabricator via cfe-commits
nemanjai updated this revision to Diff 206123.
nemanjai added a comment.

Remove the double cast. Simplify the test case. Rename the temp.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63636/new/

https://reviews.llvm.org/D63636

Files:
  lib/Headers/altivec.h
  test/CodeGen/builtins-ppc-xl-xst.c

Index: test/CodeGen/builtins-ppc-xl-xst.c
===
--- test/CodeGen/builtins-ppc-xl-xst.c
+++ test/CodeGen/builtins-ppc-xl-xst.c
@@ -0,0 +1,490 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: powerpc-registered-target
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx \
+// RUN:   -triple powerpc64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx \
+// RUN:   -target-feature +power8-vector -triple powerpc64le-unknown-unknown \
+// RUN:   -emit-llvm %s -o - | FileCheck %s -check-prefixes=CHECK,CHECK-P8
+#include <altivec.h>
+
+// CHECK-LABEL: @test1(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[__VEC_ADDR_I:%.*]] = alloca <8 x i16>, align 16
+// CHECK-NEXT:[[__OFFSET_ADDR_I1:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I2:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I3:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[__OFFSET_ADDR_I:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[C_ADDR:%.*]] = alloca <8 x i16>*, align 8
+// CHECK-NEXT:[[PTR_ADDR:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:store <8 x i16>* [[C:%.*]], <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store i16* [[PTR:%.*]], i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load i16*, i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:store i64 3, i64* [[__OFFSET_ADDR_I]], align 8
+// CHECK-NEXT:store i16* [[TMP0]], i16** [[__PTR_ADDR_I]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i16*, i16** [[__PTR_ADDR_I]], align 8
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast i16* [[TMP1]] to i8*
+// CHECK-NEXT:[[TMP3:%.*]] = load i64, i64* [[__OFFSET_ADDR_I]], align 8
+// CHECK-NEXT:[[ADD_PTR_I:%.*]] = getelementptr inbounds i8, i8* [[TMP2]], i64 [[TMP3]]
+// CHECK-NEXT:store i8* [[ADD_PTR_I]], i8** [[__ADDR_I]], align 8
+// CHECK-NEXT:[[TMP4:%.*]] = load i8*, i8** [[__ADDR_I]], align 8
+// CHECK-NEXT:[[TMP5:%.*]] = bitcast i8* [[TMP4]] to <8 x i16>*
+// CHECK-NEXT:[[TMP6:%.*]] = load <8 x i16>, <8 x i16>* [[TMP5]], align 1
+// CHECK-NEXT:[[TMP7:%.*]] = load <8 x i16>*, <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store <8 x i16> [[TMP6]], <8 x i16>* [[TMP7]], align 16
+// CHECK-NEXT:[[TMP8:%.*]] = load <8 x i16>*, <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:[[TMP9:%.*]] = load <8 x i16>, <8 x i16>* [[TMP8]], align 16
+// CHECK-NEXT:[[TMP10:%.*]] = load i16*, i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:store <8 x i16> [[TMP9]], <8 x i16>* [[__VEC_ADDR_I]], align 16
+// CHECK-NEXT:store i64 7, i64* [[__OFFSET_ADDR_I1]], align 8
+// CHECK-NEXT:store i16* [[TMP10]], i16** [[__PTR_ADDR_I2]], align 8
+// CHECK-NEXT:[[TMP11:%.*]] = load i16*, i16** [[__PTR_ADDR_I2]], align 8
+// CHECK-NEXT:[[TMP12:%.*]] = bitcast i16* [[TMP11]] to i8*
+// CHECK-NEXT:[[TMP13:%.*]] = load i64, i64* [[__OFFSET_ADDR_I1]], align 8
+// CHECK-NEXT:[[ADD_PTR_I4:%.*]] = getelementptr inbounds i8, i8* [[TMP12]], i64 [[TMP13]]
+// CHECK-NEXT:store i8* [[ADD_PTR_I4]], i8** [[__ADDR_I3]], align 8
+// CHECK-NEXT:[[TMP14:%.*]] = load <8 x i16>, <8 x i16>* [[__VEC_ADDR_I]], align 16
+// CHECK-NEXT:[[TMP15:%.*]] = load i8*, i8** [[__ADDR_I3]], align 8
+// CHECK-NEXT:[[TMP16:%.*]] = bitcast i8* [[TMP15]] to <8 x i16>*
+// CHECK-NEXT:store <8 x i16> [[TMP14]], <8 x i16>* [[TMP16]], align 1
+// CHECK-NEXT:ret void
+//
+void test1(vector signed short *c, signed short *ptr) {
+*c = vec_xl(3ll, ptr);
+vec_xst(*c, 7ll, ptr);
+}
+
+// CHECK-LABEL: @test2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[__VEC_ADDR_I:%.*]] = alloca <8 x i16>, align 16
+// CHECK-NEXT:[[__OFFSET_ADDR_I1:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I2:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I3:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[__OFFSET_ADDR_I:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[__ADDR_I:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[C_ADDR:%.*]] = alloca <8 x i16>*, align 8
+// CHECK-NEXT:[[PTR_ADDR:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:store <8 x i16>* [[C:%.*]], <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store i16* [[PTR:%.*]], i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load i16*, i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:store i64 3, i64* [[__OFFSET_ADDR_I]], align 8
+// 

[PATCH] D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst

2019-06-22 Thread Nemanja Ivanovic via Phabricator via cfe-commits
nemanjai marked 3 inline comments as done.
nemanjai added inline comments.



Comment at: lib/Headers/altivec.h:16364
   signed short *__ptr) {
-  return *(unaligned_vec_sshort *)(__ptr + __offset);
+  signed char *Adjusted = (signed char *)__ptr + __offset;
+  return *(unaligned_vec_sshort *)((signed short *)Adjusted);

jsji wrote:
> Why do we name it `Adjusted`? Why not just `__addr`?
Sure. I don't really have any preference with respect to the name at all.



Comment at: lib/Headers/altivec.h:16365
+  signed char *Adjusted = (signed char *)__ptr + __offset;
+  return *(unaligned_vec_sshort *)((signed short *)Adjusted);
 }

jsji wrote:
> Why do we want to cast it to `(signed short *)` again? It looks like an
> unnecessary cast to me.
Argh, yup the double cast is silly. I initially did something different for 
this and just missed cleaning these up. I'll update.



Comment at: test/CodeGen/builtins-ppc-xl-xst.c:4
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx -triple powerpc64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx -target-feature +power8-vector -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-P8
+#include <altivec.h>

jsji wrote:
> Is there any difference in the results without `power8-vector`, except for
> `test9` and `test10`?
> 
> Why not split `test9` and `test10` into another file for simplicity?
I like running all of them both with and without power8-vector. I can simplify 
this by using `check-prefixes=CHECK,CHECK-P8` so that we only have one sequence 
of checks for each function.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63636/new/

https://reviews.llvm.org/D63636





[PATCH] D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst

2019-06-21 Thread Jinsong Ji via Phabricator via cfe-commits
jsji added a comment.

A few questions.




Comment at: lib/Headers/altivec.h:16364
   signed short *__ptr) {
-  return *(unaligned_vec_sshort *)(__ptr + __offset);
+  signed char *Adjusted = (signed char *)__ptr + __offset;
+  return *(unaligned_vec_sshort *)((signed short *)Adjusted);

Why do we name it `Adjusted`? Why not just `__addr`?



Comment at: lib/Headers/altivec.h:16365
+  signed char *Adjusted = (signed char *)__ptr + __offset;
+  return *(unaligned_vec_sshort *)((signed short *)Adjusted);
 }

Why do we want to cast it to `(signed short *)` again? It looks like an
unnecessary cast to me.



Comment at: test/CodeGen/builtins-ppc-xl-xst.c:4
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx -triple powerpc64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx -target-feature +power8-vector -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-P8
+#include <altivec.h>

Is there any difference in the results without `power8-vector`, except for
`test9` and `test10`?

Why not split `test9` and `test10` into another file for simplicity?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63636/new/

https://reviews.llvm.org/D63636





[PATCH] D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst

2019-06-20 Thread Nemanja Ivanovic via Phabricator via cfe-commits
nemanjai created this revision.
nemanjai added reviewers: hfinkel, jsji, rzurob, saghir.
Herald added a subscriber: kbarton.
Herald added a project: clang.

As we currently have it implemented in altivec.h, the offsets for these two 
intrinsics are element offsets. The documentation in the ABI (as well as the 
implementation in both XL and GCC) states that these should be byte offsets.
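
To illustrate the difference between the two interpretations, here is a minimal portable sketch. It does not use altivec.h; `vec8s`, `load_element_offset`, `load_byte_offset` and `store_byte_offset` are hypothetical stand-ins made up for this example, and it assumes a 2-byte `short`. With an offset of 4, the element-offset form skips four shorts (8 bytes), while the byte-offset form skips only two shorts (4 bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte vector of eight signed shorts.
   This is NOT the real `vector signed short` type from altivec.h. */
typedef struct { short lane[8]; } vec8s;

/* Element-offset load (the behaviour being fixed): pointer arithmetic on
   a `short *` advances by offset * sizeof(short) bytes. */
static vec8s load_element_offset(long long offset, const short *ptr) {
  vec8s v;
  assert(ptr != NULL);
  memcpy(&v, ptr + offset, sizeof v);
  return v;
}

/* Byte-offset load (what the ABI describes): convert to a char pointer
   first so the offset is counted in bytes. memcpy keeps the access valid
   even when the resulting address is not aligned for `short`. */
static vec8s load_byte_offset(long long offset, const short *ptr) {
  vec8s v;
  assert(ptr != NULL);
  const char *addr = (const char *)ptr + offset;
  memcpy(&v, addr, sizeof v);
  return v;
}

/* Byte-offset store, mirroring the load above. */
static void store_byte_offset(vec8s v, long long offset, short *ptr) {
  assert(ptr != NULL);
  char *addr = (char *)ptr + offset;
  memcpy(addr, &v, sizeof v);
}
```

Under these assumptions, for a buffer holding 0, 1, 2, ... the element-offset load at offset 4 starts at the value 4, whereas the byte-offset load at offset 4 starts at the value 2.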


Repository:
  rC Clang

https://reviews.llvm.org/D63636

Files:
  lib/Headers/altivec.h
  test/CodeGen/builtins-ppc-xl-xst.c

Index: test/CodeGen/builtins-ppc-xl-xst.c
===
--- test/CodeGen/builtins-ppc-xl-xst.c
+++ test/CodeGen/builtins-ppc-xl-xst.c
@@ -0,0 +1,849 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: powerpc-registered-target
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx -triple powerpc64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -target-feature +altivec -target-feature +vsx -target-feature +power8-vector -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-P8
+#include 
+
+// CHECK-LABEL: @test1(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[__VEC_ADDR_I:%.*]] = alloca <8 x i16>, align 16
+// CHECK-NEXT:[[__OFFSET_ADDR_I1:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I2:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[ADJUSTED_I3:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[__OFFSET_ADDR_I:%.*]] = alloca i64, align 8
+// CHECK-NEXT:[[__PTR_ADDR_I:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:[[ADJUSTED_I:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:[[C_ADDR:%.*]] = alloca <8 x i16>*, align 8
+// CHECK-NEXT:[[PTR_ADDR:%.*]] = alloca i16*, align 8
+// CHECK-NEXT:store <8 x i16>* [[C:%.*]], <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store i16* [[PTR:%.*]], i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load i16*, i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:store i64 3, i64* [[__OFFSET_ADDR_I]], align 8
+// CHECK-NEXT:store i16* [[TMP0]], i16** [[__PTR_ADDR_I]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i16*, i16** [[__PTR_ADDR_I]], align 8
+// CHECK-NEXT:[[TMP2:%.*]] = bitcast i16* [[TMP1]] to i8*
+// CHECK-NEXT:[[TMP3:%.*]] = load i64, i64* [[__OFFSET_ADDR_I]], align 8
+// CHECK-NEXT:[[ADD_PTR_I:%.*]] = getelementptr inbounds i8, i8* [[TMP2]], i64 [[TMP3]]
+// CHECK-NEXT:store i8* [[ADD_PTR_I]], i8** [[ADJUSTED_I]], align 8
+// CHECK-NEXT:[[TMP4:%.*]] = load i8*, i8** [[ADJUSTED_I]], align 8
+// CHECK-NEXT:[[TMP5:%.*]] = bitcast i8* [[TMP4]] to i16*
+// CHECK-NEXT:[[TMP6:%.*]] = bitcast i16* [[TMP5]] to <8 x i16>*
+// CHECK-NEXT:[[TMP7:%.*]] = load <8 x i16>, <8 x i16>* [[TMP6]], align 1
+// CHECK-NEXT:[[TMP8:%.*]] = load <8 x i16>*, <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:store <8 x i16> [[TMP7]], <8 x i16>* [[TMP8]], align 16
+// CHECK-NEXT:[[TMP9:%.*]] = load <8 x i16>*, <8 x i16>** [[C_ADDR]], align 8
+// CHECK-NEXT:[[TMP10:%.*]] = load <8 x i16>, <8 x i16>* [[TMP9]], align 16
+// CHECK-NEXT:[[TMP11:%.*]] = load i16*, i16** [[PTR_ADDR]], align 8
+// CHECK-NEXT:store <8 x i16> [[TMP10]], <8 x i16>* [[__VEC_ADDR_I]], align 16
+// CHECK-NEXT:store i64 7, i64* [[__OFFSET_ADDR_I1]], align 8
+// CHECK-NEXT:store i16* [[TMP11]], i16** [[__PTR_ADDR_I2]], align 8
+// CHECK-NEXT:[[TMP12:%.*]] = load i16*, i16** [[__PTR_ADDR_I2]], align 8
+// CHECK-NEXT:[[TMP13:%.*]] = bitcast i16* [[TMP12]] to i8*
+// CHECK-NEXT:[[TMP14:%.*]] = load i64, i64* [[__OFFSET_ADDR_I1]], align 8
+// CHECK-NEXT:[[ADD_PTR_I4:%.*]] = getelementptr inbounds i8, i8* [[TMP13]], i64 [[TMP14]]
+// CHECK-NEXT:store i8* [[ADD_PTR_I4]], i8** [[ADJUSTED_I3]], align 8
+// CHECK-NEXT:[[TMP15:%.*]] = load <8 x i16>, <8 x i16>* [[__VEC_ADDR_I]], align 16
+// CHECK-NEXT:[[TMP16:%.*]] = load i8*, i8** [[ADJUSTED_I3]], align 8
+// CHECK-NEXT:[[TMP17:%.*]] = bitcast i8* [[TMP16]] to <8 x i16>*
+// CHECK-NEXT:store <8 x i16> [[TMP15]], <8 x i16>* [[TMP17]], align 1
+// CHECK-NEXT:ret void
+//
+// CHECK-P8-LABEL: @test1(
+// CHECK-P8-NEXT:  entry:
+// CHECK-P8-NEXT:[[__VEC_ADDR_I:%.*]] = alloca <8 x i16>, align 16
+// CHECK-P8-NEXT:[[__OFFSET_ADDR_I1:%.*]] = alloca i64, align 8
+// CHECK-P8-NEXT:[[__PTR_ADDR_I2:%.*]] = alloca i16*, align 8
+// CHECK-P8-NEXT:[[ADJUSTED_I3:%.*]] = alloca i8*, align 8
+// CHECK-P8-NEXT:[[__OFFSET_ADDR_I:%.*]] = alloca i64, align 8
+// CHECK-P8-NEXT:[[__PTR_ADDR_I:%.*]] = alloca i16*, align 8
+// CHECK-P8-NEXT:[[ADJUSTED_I:%.*]] = alloca i8*, align 8
+// CHECK-P8-NEXT:[[C_ADDR:%.*]] = alloca <8 x i16>*, align 8
+// CHECK-P8-NEXT:[[PTR_ADDR:%.*]] = alloca i16*, align 8
+// CHECK-P8-NEXT:store <8 x i16>* [[C:%.*]], <8 x i16>** [[C_ADDR]], align 8
+// CHECK-P8-NEXT:store i16* [[PTR:%.*]], i16** [[PTR_ADDR]], align 8
+//