llvmorg-github-actions[bot] wrote:

@llvm/pr-subscribers-offload

Author: Piotr Balcer (pbalcer)

<details>
<summary>Changes</summary>

This is a follow-up to #194333, which changed liboffload's kernel launch
interface to take an array of pointers to arguments instead of a contiguous
argument buffer, but left the old path intact for OpenMP, to be cleaned up
incrementally in later changes.
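
For readers unfamiliar with the two conventions, the following is a minimal sketch of the caller side, not liboffload's actual API. `SaxpyArgs`, `packContiguous`, and `packPointerArray` are hypothetical names, and the pointer-sized slot layout of the old buffer is an assumption based on the "flat array of promoted intptr_t values" comment that this patch removes.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical kernel arguments, used only for illustration.
struct SaxpyArgs {
  float *X;
  float *Y;
  float Alpha;
  std::int32_t N;
};

// Old convention (assumed layout): every argument value is copied into one
// contiguous buffer, each occupying a pointer-sized slot.
std::vector<std::uintptr_t> packContiguous(const SaxpyArgs &A) {
  std::vector<std::uintptr_t> Buf(4, 0);
  std::memcpy(&Buf[0], &A.X, sizeof(A.X));
  std::memcpy(&Buf[1], &A.Y, sizeof(A.Y));
  std::memcpy(&Buf[2], &A.Alpha, sizeof(A.Alpha));
  std::memcpy(&Buf[3], &A.N, sizeof(A.N));
  return Buf;
}

// New convention: the caller passes one pointer per argument. Nothing is
// copied; each entry points at the storage that holds the argument.
std::vector<void *> packPointerArray(SaxpyArgs &A) {
  return {&A.X, &A.Y, &A.Alpha, &A.N};
}
```

With the pointer array, the argument storage must stay alive until the kernel has read it, since the array holds addresses rather than copies.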

This patch modifies the offload host kernels and the OpenMP Clang codegen to
also use argument pointer arrays. The host plugin was the only plugin that did
not support the new submission path. With it converted, OpenMP can migrate to
the new path in all scenarios, after which everything related to the contiguous
argument buffer and its construction can be removed from the plugin interface
and the backends.

The change itself is fairly straightforward; the bulk of the diff is automated
test updates. In the runtime, the host kernel signature becomes an array of
pointers (void **), and the codegen is updated to match. Because each array
entry is now a pointer to the argument rather than the argument itself, the
kernel function performs one additional load to read each argument's value.
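
The extra load can be sketched on the kernel side as follows. This is an illustration under assumptions, not the code Clang emits: `readFromFlatSlots` and `readFromPointerArray` are hypothetical helpers that mirror the old and new IR patterns in the updated tests (a `getelementptr` into the context followed by one load, versus a `getelementptr`, a pointer load, and then the value load).

```cpp
#include <cstdint>
#include <cstring>

// Old pattern (assumed): the context is a flat array of pointer-sized slots
// and the argument value is read directly from the slot's address.
std::int16_t readFromFlatSlots(void *Context, unsigned Idx) {
  auto *Slots = static_cast<std::uintptr_t *>(Context);
  std::int16_t Value;
  std::memcpy(&Value, &Slots[Idx], sizeof(Value)); // single load from the slot
  return Value;
}

// New pattern: the context is an array of pointers (void **). Each entry is
// loaded first, then dereferenced to obtain the argument value.
std::int16_t readFromPointerArray(void **Args, unsigned Idx) {
  void *ArgPtr = Args[Idx];                   // load 1: per-argument pointer
  std::int16_t Value;
  std::memcpy(&Value, ArgPtr, sizeof(Value)); // load 2: the argument value
  return Value;
}
```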

Assisted-by: Claude

---

Patch is 1.30 MiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/205355.diff


31 Files Affected:

- (modified) clang/lib/CodeGen/CGStmtOpenMP.cpp (+21-14) 
- (modified) clang/test/OpenMP/distribute_codegen.cpp (+94-56) 
- (modified) clang/test/OpenMP/distribute_simd_codegen.cpp (+204-124) 
- (modified) clang/test/OpenMP/target_dyn_groupprivate_codegen.cpp (+156-122) 
- (modified) clang/test/OpenMP/target_firstprivate_codegen.cpp (+594-474) 
- (modified) clang/test/OpenMP/target_ompx_dyn_cgroup_mem_codegen.cpp (+156-122)
- (modified) clang/test/OpenMP/target_parallel_codegen.cpp (+311-249)
- (modified) clang/test/OpenMP/target_parallel_for_codegen.cpp (+356-288)
- (modified) clang/test/OpenMP/target_parallel_for_simd_codegen.cpp (+1070-932)
- (modified) clang/test/OpenMP/target_parallel_generic_loop_codegen-2.cpp (+44-36)
- (modified) clang/test/OpenMP/target_parallel_if_codegen.cpp (+148-118)
- (modified) clang/test/OpenMP/target_parallel_num_threads_codegen.cpp (+140-110)
- (modified) clang/test/OpenMP/target_parallel_num_threads_strict_codegen.cpp (+212-168)
- (modified) clang/test/OpenMP/target_private_codegen.cpp (+3-2)
- (modified) clang/test/OpenMP/target_task_affinity_codegen.cpp (+84-78)
- (modified) clang/test/OpenMP/target_teams_codegen.cpp (+479-381)
- (modified) clang/test/OpenMP/target_teams_distribute_codegen.cpp (+416-344)
- (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_codegen.cpp (+84-66)
- (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp (+121-95)
- (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp (+15-10)
- (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen.cpp (+110-86)
- (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+121-95)
- (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp (+15-10)
- (modified) clang/test/OpenMP/target_teams_distribute_simd_codegen.cpp (+1076-934)
- (modified) clang/test/OpenMP/target_teams_generic_loop_codegen-1.cpp (+84-66)
- (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp (+15-10)
- (modified) clang/test/OpenMP/target_teams_map_codegen.cpp (+170-126) 
- (modified) clang/test/OpenMP/target_teams_num_teams_codegen.cpp (+140-110) 
- (modified) clang/test/OpenMP/target_teams_thread_limit_codegen.cpp (+150-118) 
- (modified) clang/test/OpenMP/teams_codegen.cpp (+90-66) 
- (modified) offload/plugins-nextgen/host/src/rtl.cpp (+4-3) 


``````````diff
diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp
index 314f4e14dd1d2..95fd6694437fe 100644
--- a/clang/lib/CodeGen/CGStmtOpenMP.cpp
+++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp
@@ -751,8 +751,7 @@ static llvm::Function *emitOutlinedFunctionPrologueAggregate(
   Address ContextAddr = CGF.GetAddrOfLocalVar(CD->getContextParam());
   ContextV = CGF.Builder.CreateLoad(ContextAddr);
 
-  // The runtime passes arguments as a flat array of promoted intptr_t values.
-  llvm::Type *IntPtrTy = CGF.IntPtrTy;
+  // The runtime passes arguments as an array of pointers.
   llvm::Type *PtrTy = CGF.Builder.getPtrTy();
   llvm::Align PtrAlign = CGM.getDataLayout().getPointerABIAlignment(0);
   CharUnits SlotAlign = CharUnits::fromQuantity(PtrAlign.value());
@@ -760,11 +759,12 @@ static llvm::Function *emitOutlinedFunctionPrologueAggregate(
   for (auto [FD, C, FieldIdx] :
        llvm::zip(RD->fields(), CS.captures(),
                  llvm::seq<unsigned>(RD->getNumFields()))) {
-    llvm::Value *Slot =
-        CGF.Builder.CreateConstInBoundsGEP1_32(IntPtrTy, ContextV, FieldIdx);
+    llvm::Value *SlotPtr =
+        CGF.Builder.CreateConstInBoundsGEP1_32(PtrTy, ContextV, FieldIdx);
+    llvm::Value *Slot = CGF.Builder.CreateAlignedLoad(PtrTy, SlotPtr, PtrAlign);
 
-    // Generate the appropriate load from the GEP into the __context struct.
-    // This includes all of the user arguments as well as the implicit kernel
+    // Generate the appropriate load from the per-argument storage. This
+    // includes all of the user arguments as well as the implicit kernel
     // argument pointer.
     if (C.capturesVariableByCopy() && FD->getType()->isAnyPointerType()) {
       const VarDecl *CurVD = C.getCapturedVar();
@@ -973,11 +973,14 @@ llvm::Function *CodeGenFunction::GenerateOpenMPCapturedStmtFunctionAggregate(
       const ImplicitParamDecl *Param = CD->getParam(I);
       if (Param == CD->getContextParam())
         continue;
-      llvm::Value *ParamAddr = Builder.CreateConstInBoundsGEP1_32(
-          IntPtrTy, ContextV, FieldIdx, Twine(Param->getName()) + ".addr");
+      llvm::Align PtrAlign = CGM.getDataLayout().getPointerABIAlignment(0);
+      llvm::Value *SlotPtr = Builder.CreateConstInBoundsGEP1_32(
+          Builder.getPtrTy(), ContextV, FieldIdx,
+          Twine(Param->getName()) + ".addr");
+      llvm::Value *ParamAddr =
+          Builder.CreateAlignedLoad(Builder.getPtrTy(), SlotPtr, PtrAlign);
       llvm::Value *ParamVal = Builder.CreateAlignedLoad(
-          Builder.getPtrTy(), ParamAddr,
-          CGM.getDataLayout().getPointerABIAlignment(0), Param->getName());
+          Builder.getPtrTy(), ParamAddr, PtrAlign, Param->getName());
       Address ParamLocalAddr =
           CreateMemTemp(Param->getType(), Param->getName());
       Builder.CreateStore(ParamVal, ParamLocalAddr);
@@ -1016,8 +1019,10 @@ llvm::Function *CodeGenFunction::GenerateOpenMPCapturedStmtFunctionAggregate(
 
   for (auto [FD, InnerParam, SlotIdx] : llvm::zip(
            RD->fields(), F->args(), llvm::seq<unsigned>(RD->getNumFields()))) {
-    llvm::Value *Slot = WrapperCGF.Builder.CreateConstInBoundsGEP1_32(
-        WrapperCGF.IntPtrTy, WrapperContextV, SlotIdx);
+    llvm::Value *SlotPtr = WrapperCGF.Builder.CreateConstInBoundsGEP1_32(
+        WrapperCGF.Builder.getPtrTy(), WrapperContextV, SlotIdx);
+    llvm::Value *Slot = WrapperCGF.Builder.CreateAlignedLoad(
+        WrapperCGF.Builder.getPtrTy(), SlotPtr, PtrAlign);
     llvm::Value *Val = WrapperCGF.Builder.CreateAlignedLoad(
         InnerParam.getType(), Slot, PtrAlign, InnerParam.getName());
     CallArgs.push_back(Val);
@@ -1026,8 +1031,10 @@ llvm::Function *CodeGenFunction::GenerateOpenMPCapturedStmtFunctionAggregate(
   // Handle the load from the implicit dyn_ptr at the end of the __context.
   unsigned SlotIdx = RD->getNumFields();
   auto InnerParam = F->arg_begin() + SlotIdx;
-  llvm::Value *Slot = WrapperCGF.Builder.CreateConstInBoundsGEP1_32(
-      WrapperCGF.IntPtrTy, WrapperContextV, SlotIdx);
+  llvm::Value *SlotPtr = WrapperCGF.Builder.CreateConstInBoundsGEP1_32(
+      WrapperCGF.Builder.getPtrTy(), WrapperContextV, SlotIdx);
+  llvm::Value *Slot = WrapperCGF.Builder.CreateAlignedLoad(
+      WrapperCGF.Builder.getPtrTy(), SlotPtr, PtrAlign);
   llvm::Value *Val = WrapperCGF.Builder.CreateAlignedLoad(
       InnerParam->getType(), Slot, PtrAlign, InnerParam->getName());
   CallArgs.push_back(Val);
diff --git a/clang/test/OpenMP/distribute_codegen.cpp b/clang/test/OpenMP/distribute_codegen.cpp
index afd18e91911dd..f5b9a3a5d7c9e 100644
--- a/clang/test/OpenMP/distribute_codegen.cpp
+++ b/clang/test/OpenMP/distribute_codegen.cpp
@@ -2033,12 +2033,17 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK17-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 8
 // CHECK17-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 8
 // CHECK17-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 8
-// CHECK17-NEXT:    [[A:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 0
-// CHECK17-NEXT:    [[B:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 1
-// CHECK17-NEXT:    [[C:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 2
-// CHECK17-NEXT:    [[D:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 3
-// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 4
-// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK17-NEXT:    [[A:%.*]] = load ptr, ptr [[TMP1]], align 8
+// CHECK17-NEXT:    [[TMP2:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK17-NEXT:    [[B:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK17-NEXT:    [[TMP3:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 2
+// CHECK17-NEXT:    [[C:%.*]] = load ptr, ptr [[TMP3]], align 8
+// CHECK17-NEXT:    [[TMP4:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 3
+// CHECK17-NEXT:    [[D:%.*]] = load ptr, ptr [[TMP4]], align 8
+// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 4
+// CHECK17-NEXT:    [[TMP5:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP5]], align 8
 // CHECK17-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 8
 // CHECK17-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2:[0-9]+]], i32 4, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z23without_schedule_clausePfS_S_S__l56.omp_outlined, ptr [[A]], ptr [[B]], ptr [[C]], ptr [[D]])
 // CHECK17-NEXT:    ret void
@@ -2145,12 +2150,17 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK17-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 8
 // CHECK17-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 8
 // CHECK17-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 8
-// CHECK17-NEXT:    [[A:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 0
-// CHECK17-NEXT:    [[B:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 1
-// CHECK17-NEXT:    [[C:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 2
-// CHECK17-NEXT:    [[D:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 3
-// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 4
-// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK17-NEXT:    [[A:%.*]] = load ptr, ptr [[TMP1]], align 8
+// CHECK17-NEXT:    [[TMP2:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK17-NEXT:    [[B:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK17-NEXT:    [[TMP3:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 2
+// CHECK17-NEXT:    [[C:%.*]] = load ptr, ptr [[TMP3]], align 8
+// CHECK17-NEXT:    [[TMP4:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 3
+// CHECK17-NEXT:    [[D:%.*]] = load ptr, ptr [[TMP4]], align 8
+// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 4
+// CHECK17-NEXT:    [[TMP5:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP5]], align 8
 // CHECK17-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 8
 // CHECK17-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2]], i32 4, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z18static_not_chunkedPfS_S_S__l68.omp_outlined, ptr [[A]], ptr [[B]], ptr [[C]], ptr [[D]])
 // CHECK17-NEXT:    ret void
@@ -2257,12 +2267,17 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK17-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 8
 // CHECK17-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 8
 // CHECK17-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 8
-// CHECK17-NEXT:    [[A:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 0
-// CHECK17-NEXT:    [[B:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 1
-// CHECK17-NEXT:    [[C:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 2
-// CHECK17-NEXT:    [[D:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 3
-// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 4
-// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK17-NEXT:    [[A:%.*]] = load ptr, ptr [[TMP1]], align 8
+// CHECK17-NEXT:    [[TMP2:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK17-NEXT:    [[B:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK17-NEXT:    [[TMP3:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 2
+// CHECK17-NEXT:    [[C:%.*]] = load ptr, ptr [[TMP3]], align 8
+// CHECK17-NEXT:    [[TMP4:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 3
+// CHECK17-NEXT:    [[D:%.*]] = load ptr, ptr [[TMP4]], align 8
+// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 4
+// CHECK17-NEXT:    [[TMP5:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP5]], align 8
 // CHECK17-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 8
 // CHECK17-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2]], i32 4, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z14static_chunkedPfS_S_S__l80.omp_outlined, ptr [[A]], ptr [[B]], ptr [[C]], ptr [[D]])
 // CHECK17-NEXT:    ret void
@@ -2387,11 +2402,13 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK17-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 8
 // CHECK17-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 8
 // CHECK17-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 8
-// CHECK17-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 0
-// CHECK17-NEXT:    [[TMP2:%.*]] = load i8, ptr [[TMP1]], align 8
-// CHECK17-NEXT:    store i8 [[TMP2]], ptr [[A]], align 1
-// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 1
-// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK17-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8
+// CHECK17-NEXT:    [[TMP3:%.*]] = load i8, ptr [[TMP2]], align 8
+// CHECK17-NEXT:    store i8 [[TMP3]], ptr [[A]], align 1
+// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK17-NEXT:    [[TMP4:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP4]], align 8
 // CHECK17-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 8
 // CHECK17-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2]], i32 1, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z12test_precondv_l92.omp_outlined, ptr [[A]])
 // CHECK17-NEXT:    ret void
@@ -2498,11 +2515,13 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK17-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 8
 // CHECK17-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 8
 // CHECK17-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 8
-// CHECK17-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 0
-// CHECK17-NEXT:    [[TMP2:%.*]] = load i16, ptr [[TMP1]], align 8
-// CHECK17-NEXT:    store i16 [[TMP2]], ptr [[AA]], align 2
-// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i64, ptr [[TMP0]], i32 1
-// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK17-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 8
+// CHECK17-NEXT:    [[TMP3:%.*]] = load i16, ptr [[TMP2]], align 8
+// CHECK17-NEXT:    store i16 [[TMP3]], ptr [[AA]], align 2
+// CHECK17-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK17-NEXT:    [[TMP4:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 8
+// CHECK17-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP4]], align 8
 // CHECK17-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 8
 // CHECK17-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2]], i32 1, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z9ftemplateIiET_v_l108.omp_outlined, ptr [[AA]])
 // CHECK17-NEXT:    ret void
@@ -2597,12 +2616,17 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK19-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 4
 // CHECK19-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 4
 // CHECK19-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 4
-// CHECK19-NEXT:    [[A:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 0
-// CHECK19-NEXT:    [[B:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 1
-// CHECK19-NEXT:    [[C:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 2
-// CHECK19-NEXT:    [[D:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 3
-// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 4
-// CHECK19-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK19-NEXT:    [[A:%.*]] = load ptr, ptr [[TMP1]], align 4
+// CHECK19-NEXT:    [[TMP2:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK19-NEXT:    [[B:%.*]] = load ptr, ptr [[TMP2]], align 4
+// CHECK19-NEXT:    [[TMP3:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 2
+// CHECK19-NEXT:    [[C:%.*]] = load ptr, ptr [[TMP3]], align 4
+// CHECK19-NEXT:    [[TMP4:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 3
+// CHECK19-NEXT:    [[D:%.*]] = load ptr, ptr [[TMP4]], align 4
+// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 4
+// CHECK19-NEXT:    [[TMP5:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP5]], align 4
 // CHECK19-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 4
 // CHECK19-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2:[0-9]+]], i32 4, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z23without_schedule_clausePfS_S_S__l56.omp_outlined, ptr [[A]], ptr [[B]], ptr [[C]], ptr [[D]])
 // CHECK19-NEXT:    ret void
@@ -2705,12 +2729,17 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK19-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 4
 // CHECK19-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 4
 // CHECK19-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 4
-// CHECK19-NEXT:    [[A:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 0
-// CHECK19-NEXT:    [[B:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 1
-// CHECK19-NEXT:    [[C:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 2
-// CHECK19-NEXT:    [[D:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 3
-// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 4
-// CHECK19-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK19-NEXT:    [[A:%.*]] = load ptr, ptr [[TMP1]], align 4
+// CHECK19-NEXT:    [[TMP2:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK19-NEXT:    [[B:%.*]] = load ptr, ptr [[TMP2]], align 4
+// CHECK19-NEXT:    [[TMP3:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 2
+// CHECK19-NEXT:    [[C:%.*]] = load ptr, ptr [[TMP3]], align 4
+// CHECK19-NEXT:    [[TMP4:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 3
+// CHECK19-NEXT:    [[D:%.*]] = load ptr, ptr [[TMP4]], align 4
+// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 4
+// CHECK19-NEXT:    [[TMP5:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP5]], align 4
 // CHECK19-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 4
 // CHECK19-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2]], i32 4, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z18static_not_chunkedPfS_S_S__l68.omp_outlined, ptr [[A]], ptr [[B]], ptr [[C]], ptr [[D]])
 // CHECK19-NEXT:    ret void
@@ -2813,12 +2842,17 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK19-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 4
 // CHECK19-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 4
 // CHECK19-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 4
-// CHECK19-NEXT:    [[A:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 0
-// CHECK19-NEXT:    [[B:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 1
-// CHECK19-NEXT:    [[C:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 2
-// CHECK19-NEXT:    [[D:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 3
-// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 4
-// CHECK19-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK19-NEXT:    [[A:%.*]] = load ptr, ptr [[TMP1]], align 4
+// CHECK19-NEXT:    [[TMP2:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK19-NEXT:    [[B:%.*]] = load ptr, ptr [[TMP2]], align 4
+// CHECK19-NEXT:    [[TMP3:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 2
+// CHECK19-NEXT:    [[C:%.*]] = load ptr, ptr [[TMP3]], align 4
+// CHECK19-NEXT:    [[TMP4:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 3
+// CHECK19-NEXT:    [[D:%.*]] = load ptr, ptr [[TMP4]], align 4
+// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 4
+// CHECK19-NEXT:    [[TMP5:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[TMP5]], align 4
 // CHECK19-NEXT:    store ptr [[DYN_PTR]], ptr [[DYN_PTR1]], align 4
 // CHECK19-NEXT:    call void (ptr, i32, ptr, ...) @__kmpc_fork_teams(ptr @[[GLOB2]], i32 4, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}__Z14static_chunkedPfS_S_S__l80.omp_outlined, ptr [[A]], ptr [[B]], ptr [[C]], ptr [[D]])
 // CHECK19-NEXT:    ret void
@@ -2939,11 +2973,13 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK19-NEXT:    [[DYN_PTR1:%.*]] = alloca ptr, align 4
 // CHECK19-NEXT:    store ptr [[__CONTEXT]], ptr [[__CONTEXT_ADDR]], align 4
 // CHECK19-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[__CONTEXT_ADDR]], align 4
-// CHECK19-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 0
-// CHECK19-NEXT:    [[TMP2:%.*]] = load i8, ptr [[TMP1]], align 4
-// CHECK19-NEXT:    store i8 [[TMP2]], ptr [[A]], align 1
-// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i32 1
-// CHECK19-NEXT:    [[DYN_PTR:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[TMP1:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 0
+// CHECK19-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[TMP1]], align 4
+// CHECK19-NEXT:    [[TMP3:%.*]] = load i8, ptr [[TMP2]], align 4
+// CHECK19-NEXT:    store i8 [[TMP3]], ptr [[A]], align 1
+// CHECK19-NEXT:    [[DYN_PTR_ADDR:%.*]] = getelementptr inbounds ptr, ptr [[TMP0]], i32 1
+// CHECK19-NEXT:    [[TMP4:%.*]] = load ptr, ptr [[DYN_PTR_ADDR]], align 4
+// CHECK19-NEXT:    [[D...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/205355
