[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan abandoned this revision. ddpagan added a comment. Change no longer needed per review/comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. ABataev wrote: > ddpagan wrote: > > ABataev wrote: > > > ddpagan wrote: > > > > ABataev wrote: > > > > > ddpagan wrote: > > > > > > ABataev wrote: > > > > > > > It does not match the spec. > > > > > > > ``` > > > > > > > For a combined or composite construct, if no > > > > > > > directive-name-modifier is specified then the if clause applies > > > > > > > to all constituent constructs to which an if clause can apply. > > > > > > > ``` > > > > > > > So, if(val) should be applied to both target and parallel > > > > > > > regions, no? > > > > > > > It does not match the spec. > > > > > > > ``` > > > > > > > For a combined or composite construct, if no > > > > > > > directive-name-modifier is specified then the if clause applies > > > > > > > to all constituent constructs to which an if clause can apply. > > > > > > > ``` > > > > > > > So, if(val) should be applied to both target and parallel > > > > > > > regions, no? > > > > > > > > > > > > Hi Alexey - Question for you: does revising the comment above at > > > > > > lines 1570-1575 to the following text help explain in a better way > > > > > > what's being done, and why? > > > > > > > > > > > > If we are handling a 'target teams distribute parallel for' > > > > > > explicitly written > > > > > > in the source, and it has an 'if(val)' clause, the if condition > > > > > > is applied to > > > > > > both 'target' and 'parallel' regions according to > > > > > > OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]. > > > > > > > > > > > > However, if we are mapping an explicit 'target teams loop > > > > > > if(val)' onto a > > > > > > 'target teams distribute parallel for if(val)', to preserve the > > > > > > 'if' semantics > > > > > > as specified by the user with the 'target teams loop', we apply > > > > > > it just to > > > > > > the 'target' region. > > > > > It does not match the spec. Why we shall handle it this way? > > > > You're right, Alexey ... it doesn't match the spec, but here's why we > > > > thought this would be an appropriate way to implement 'target teams > > > > loop if(val)'. When a user specifies 'if(val)' with a 'target teams > > > > loop', their expectation is that its effect will only apply to the > > > > 'target' region. Since a 'loop' construct can be implemented in a > > > > number of different ways with the freedom granted by the specs > > > > description of 'loop' (part of which describes it as being able to be > > > > run concurrently), using a 'target teams distribute parallel for' > > > > construct is a simple and effective default choice, which is what > > > > happens today. > > > > target_teams_loop => target_teams_distribute_parallel_for > > > > Applying the if clause to the parallel region for this case can > > > > potentially limit it to one thread, which would hinder performance > > > > gains otherwise possible, and presumably wouldn't be what the user > > > > wanted/expected. > > > > > > > > The semantics of the spec (OpenMP 5.2 [3.4, if Clause, Semantics, > > > > 15-18]) is definitely what should be applied to an explicit instance of > > > > target_teams_distribute_parallel_for, but in this case (when mapping > > > > target_teams_loop => target_teams_distribute_parallel_for) it seems > > > > reasonable to make the choice described above. > > > I'm not sure this is true users expectations. > > You're right in the sense that there is an assumption being made there. So > > would you recommend just staying with the semantics as defined in the spec > > regardless? > Yew, better to follow spec defaults. Ok. Sounds good. Thanks for review and suggestions, Alexey. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ABataev added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. ddpagan wrote: > ABataev wrote: > > ddpagan wrote: > > > ABataev wrote: > > > > ddpagan wrote: > > > > > ABataev wrote: > > > > > > It does not match the spec. > > > > > > ``` > > > > > > For a combined or composite construct, if no > > > > > > directive-name-modifier is specified then the if clause applies to > > > > > > all constituent constructs to which an if clause can apply. > > > > > > ``` > > > > > > So, if(val) should be applied to both target and parallel regions, > > > > > > no? > > > > > > It does not match the spec. > > > > > > ``` > > > > > > For a combined or composite construct, if no > > > > > > directive-name-modifier is specified then the if clause applies to > > > > > > all constituent constructs to which an if clause can apply. > > > > > > ``` > > > > > > So, if(val) should be applied to both target and parallel regions, > > > > > > no? > > > > > > > > > > Hi Alexey - Question for you: does revising the comment above at > > > > > lines 1570-1575 to the following text help explain in a better way > > > > > what's being done, and why? > > > > > > > > > > If we are handling a 'target teams distribute parallel for' > > > > > explicitly written > > > > > in the source, and it has an 'if(val)' clause, the if condition is > > > > > applied to > > > > > both 'target' and 'parallel' regions according to > > > > > OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]. > > > > > > > > > > However, if we are mapping an explicit 'target teams loop if(val)' > > > > > onto a > > > > > 'target teams distribute parallel for if(val)', to preserve the > > > > > 'if' semantics > > > > > as specified by the user with the 'target teams loop', we apply it > > > > > just to > > > > > the 'target' region. > > > > It does not match the spec. Why we shall handle it this way? > > > You're right, Alexey ... it doesn't match the spec, but here's why we > > > thought this would be an appropriate way to implement 'target teams loop > > > if(val)'. When a user specifies 'if(val)' with a 'target teams loop', > > > their expectation is that its effect will only apply to the 'target' > > > region. Since a 'loop' construct can be implemented in a number of > > > different ways with the freedom granted by the specs description of > > > 'loop' (part of which describes it as being able to be run concurrently), > > > using a 'target teams distribute parallel for' construct is a simple and > > > effective default choice, which is what happens today. > > > target_teams_loop => target_teams_distribute_parallel_for > > > Applying the if clause to the parallel region for this case can > > > potentially limit it to one thread, which would hinder performance gains > > > otherwise possible, and presumably wouldn't be what the user > > > wanted/expected. > > > > > > The semantics of the spec (OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]) > > > is definitely what should be applied to an explicit instance of > > > target_teams_distribute_parallel_for, but in this case (when mapping > > > target_teams_loop => target_teams_distribute_parallel_for) it seems > > > reasonable to make the choice described above. > > I'm not sure this is true users expectations. > You're right in the sense that there is an assumption being made there. So > would you recommend just staying with the semantics as defined in the spec > regardless? Yew, better to follow spec defaults. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. ABataev wrote: > ddpagan wrote: > > ABataev wrote: > > > ddpagan wrote: > > > > ABataev wrote: > > > > > It does not match the spec. > > > > > ``` > > > > > For a combined or composite construct, if no directive-name-modifier > > > > > is specified then the if clause applies to all constituent constructs > > > > > to which an if clause can apply. > > > > > ``` > > > > > So, if(val) should be applied to both target and parallel regions, no? > > > > > It does not match the spec. > > > > > ``` > > > > > For a combined or composite construct, if no directive-name-modifier > > > > > is specified then the if clause applies to all constituent constructs > > > > > to which an if clause can apply. > > > > > ``` > > > > > So, if(val) should be applied to both target and parallel regions, no? > > > > > > > > Hi Alexey - Question for you: does revising the comment above at lines > > > > 1570-1575 to the following text help explain in a better way what's > > > > being done, and why? > > > > > > > > If we are handling a 'target teams distribute parallel for' > > > > explicitly written > > > > in the source, and it has an 'if(val)' clause, the if condition is > > > > applied to > > > > both 'target' and 'parallel' regions according to > > > > OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]. > > > > > > > > However, if we are mapping an explicit 'target teams loop if(val)' > > > > onto a > > > > 'target teams distribute parallel for if(val)', to preserve the 'if' > > > > semantics > > > > as specified by the user with the 'target teams loop', we apply it > > > > just to > > > > the 'target' region. > > > It does not match the spec. Why we shall handle it this way? > > You're right, Alexey ... it doesn't match the spec, but here's why we > > thought this would be an appropriate way to implement 'target teams loop > > if(val)'. When a user specifies 'if(val)' with a 'target teams loop', their > > expectation is that its effect will only apply to the 'target' region. > > Since a 'loop' construct can be implemented in a number of different ways > > with the freedom granted by the specs description of 'loop' (part of which > > describes it as being able to be run concurrently), using a 'target teams > > distribute parallel for' construct is a simple and effective default > > choice, which is what happens today. > > target_teams_loop => target_teams_distribute_parallel_for > > Applying the if clause to the parallel region for this case can potentially > > limit it to one thread, which would hinder performance gains otherwise > > possible, and presumably wouldn't be what the user wanted/expected. > > > > The semantics of the spec (OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]) > > is definitely what should be applied to an explicit instance of > > target_teams_distribute_parallel_for, but in this case (when mapping > > target_teams_loop => target_teams_distribute_parallel_for) it seems > > reasonable to make the choice described above. > I'm not sure this is true users expectations. You're right in the sense that there is an assumption being made there. So would you recommend just staying with the semantics as defined in the spec regardless? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ABataev added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. ddpagan wrote: > ABataev wrote: > > ddpagan wrote: > > > ABataev wrote: > > > > It does not match the spec. > > > > ``` > > > > For a combined or composite construct, if no directive-name-modifier is > > > > specified then the if clause applies to all constituent constructs to > > > > which an if clause can apply. > > > > ``` > > > > So, if(val) should be applied to both target and parallel regions, no? > > > > It does not match the spec. > > > > ``` > > > > For a combined or composite construct, if no directive-name-modifier is > > > > specified then the if clause applies to all constituent constructs to > > > > which an if clause can apply. > > > > ``` > > > > So, if(val) should be applied to both target and parallel regions, no? > > > > > > Hi Alexey - Question for you: does revising the comment above at lines > > > 1570-1575 to the following text help explain in a better way what's being > > > done, and why? > > > > > > If we are handling a 'target teams distribute parallel for' explicitly > > > written > > > in the source, and it has an 'if(val)' clause, the if condition is > > > applied to > > > both 'target' and 'parallel' regions according to > > > OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]. > > > > > > However, if we are mapping an explicit 'target teams loop if(val)' onto > > > a > > > 'target teams distribute parallel for if(val)', to preserve the 'if' > > > semantics > > > as specified by the user with the 'target teams loop', we apply it just > > > to > > > the 'target' region. > > It does not match the spec. Why we shall handle it this way? > You're right, Alexey ... it doesn't match the spec, but here's why we thought > this would be an appropriate way to implement 'target teams loop if(val)'. > When a user specifies 'if(val)' with a 'target teams loop', their expectation > is that its effect will only apply to the 'target' region. Since a 'loop' > construct can be implemented in a number of different ways with the freedom > granted by the specs description of 'loop' (part of which describes it as > being able to be run concurrently), using a 'target teams distribute parallel > for' construct is a simple and effective default choice, which is what > happens today. > target_teams_loop => target_teams_distribute_parallel_for > Applying the if clause to the parallel region for this case can potentially > limit it to one thread, which would hinder performance gains otherwise > possible, and presumably wouldn't be what the user wanted/expected. > > The semantics of the spec (OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]) is > definitely what should be applied to an explicit instance of > target_teams_distribute_parallel_for, but in this case (when mapping > target_teams_loop => target_teams_distribute_parallel_for) it seems > reasonable to make the choice described above. I'm not sure this is true users expectations. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. ABataev wrote: > ddpagan wrote: > > ABataev wrote: > > > It does not match the spec. > > > ``` > > > For a combined or composite construct, if no directive-name-modifier is > > > specified then the if clause applies to all constituent constructs to > > > which an if clause can apply. > > > ``` > > > So, if(val) should be applied to both target and parallel regions, no? > > > It does not match the spec. > > > ``` > > > For a combined or composite construct, if no directive-name-modifier is > > > specified then the if clause applies to all constituent constructs to > > > which an if clause can apply. > > > ``` > > > So, if(val) should be applied to both target and parallel regions, no? > > > > Hi Alexey - Question for you: does revising the comment above at lines > > 1570-1575 to the following text help explain in a better way what's being > > done, and why? > > > > If we are handling a 'target teams distribute parallel for' explicitly > > written > > in the source, and it has an 'if(val)' clause, the if condition is > > applied to > > both 'target' and 'parallel' regions according to > > OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]. > > > > However, if we are mapping an explicit 'target teams loop if(val)' onto a > > 'target teams distribute parallel for if(val)', to preserve the 'if' > > semantics > > as specified by the user with the 'target teams loop', we apply it just to > > the 'target' region. > It does not match the spec. Why we shall handle it this way? You're right, Alexey ... it doesn't match the spec, but here's why we thought this would be an appropriate way to implement 'target teams loop if(val)'. When a user specifies 'if(val)' with a 'target teams loop', their expectation is that its effect will only apply to the 'target' region. Since a 'loop' construct can be implemented in a number of different ways with the freedom granted by the specs description of 'loop' (part of which describes it as being able to be run concurrently), using a 'target teams distribute parallel for' construct is a simple and effective default choice, which is what happens today. target_teams_loop => target_teams_distribute_parallel_for Applying the if clause to the parallel region for this case can potentially limit it to one thread, which would hinder performance gains otherwise possible, and presumably wouldn't be what the user wanted/expected. The semantics of the spec (OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]) is definitely what should be applied to an explicit instance of target_teams_distribute_parallel_for, but in this case (when mapping target_teams_loop => target_teams_distribute_parallel_for) it seems reasonable to make the choice described above. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ABataev added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. ddpagan wrote: > ABataev wrote: > > It does not match the spec. > > ``` > > For a combined or composite construct, if no directive-name-modifier is > > specified then the if clause applies to all constituent constructs to which > > an if clause can apply. > > ``` > > So, if(val) should be applied to both target and parallel regions, no? > > It does not match the spec. > > ``` > > For a combined or composite construct, if no directive-name-modifier is > > specified then the if clause applies to all constituent constructs to which > > an if clause can apply. > > ``` > > So, if(val) should be applied to both target and parallel regions, no? > > Hi Alexey - Question for you: does revising the comment above at lines > 1570-1575 to the following text help explain in a better way what's being > done, and why? > > If we are handling a 'target teams distribute parallel for' explicitly > written > in the source, and it has an 'if(val)' clause, the if condition is applied > to > both 'target' and 'parallel' regions according to > OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]. > > However, if we are mapping an explicit 'target teams loop if(val)' onto a > 'target teams distribute parallel for if(val)', to preserve the 'if' > semantics > as specified by the user with the 'target teams loop', we apply it just to > the 'target' region. It does not match the spec. Why we shall handle it this way? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. ABataev wrote: > It does not match the spec. > ``` > For a combined or composite construct, if no directive-name-modifier is > specified then the if clause applies to all constituent constructs to which > an if clause can apply. > ``` > So, if(val) should be applied to both target and parallel regions, no? > It does not match the spec. > ``` > For a combined or composite construct, if no directive-name-modifier is > specified then the if clause applies to all constituent constructs to which > an if clause can apply. > ``` > So, if(val) should be applied to both target and parallel regions, no? Hi Alexey - Question for you: does revising the comment above at lines 1570-1575 to the following text help explain in a better way what's being done, and why? If we are handling a 'target teams distribute parallel for' explicitly written in the source, and it has an 'if(val)' clause, the if condition is applied to both 'target' and 'parallel' regions according to OpenMP 5.2 [3.4, if Clause, Semantics, 15-18]. However, if we are mapping an explicit 'target teams loop if(val)' onto a 'target teams distribute parallel for if(val)', to preserve the 'if' semantics as specified by the user with the 'target teams loop', we apply it just to the 'target' region. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ABataev added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1570-1575 + // If we are here with a 'target teams loop' then we are emitting the + // 'parallel' region of the 'target teams distribute parallel for' + // emitted in place of the 'target teams loop'. Based on the specification + // noted above, an if-clause associated with a 'target teams loop', be it + // 'if(val)' or an 'if(target:val)', will apply only to 'target' and not + // the 'parallel' of the 'target teams distribute parallel for'. It does not match the spec. ``` For a combined or composite construct, if no directive-name-modifier is specified then the if clause applies to all constituent constructs to which an if clause can apply. ``` So, if(val) should be applied to both target and parallel regions, no? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan updated this revision to Diff 547982. ddpagan added a comment. As requested, added reference to OpenMP 5.2 specification that discusses handling of if clauses with combined/composite directives. Also, improved comment relating to what was being done and why. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 Files: clang/lib/CodeGen/CGStmtOpenMP.cpp clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp Index: clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp === --- clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp +++ clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp @@ -695,7 +695,6 @@ // CHECK1-NEXT:[[DOTOMP_STRIDE:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[DOTOMP_IS_LAST:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT:[[DOTBOUND_ZERO_ADDR:%.*]] = alloca i32, align 4 // CHECK1-NEXT:store ptr [[DOTGLOBAL_TID_]], ptr [[DOTGLOBAL_TID__ADDR]], align 8 // CHECK1-NEXT:store ptr [[DOTBOUND_TID_]], ptr [[DOTBOUND_TID__ADDR]], align 8 // CHECK1-NEXT:store i32 0, ptr [[DOTOMP_COMB_LB]], align 4 @@ -729,16 +728,12 @@ // CHECK1-NEXT:[[TMP8:%.*]] = zext i32 [[TMP7]] to i64 // CHECK1-NEXT:[[TMP9:%.*]] = load i32, ptr [[DOTOMP_COMB_UB]], align 4 // CHECK1-NEXT:[[TMP10:%.*]] = zext i32 [[TMP9]] to i64 -// CHECK1-NEXT:call void @__kmpc_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:[[TMP11:%.*]] = load ptr, ptr [[DOTGLOBAL_TID__ADDR]], align 8 -// CHECK1-NEXT:store i32 0, ptr [[DOTBOUND_ZERO_ADDR]], align 4 -// CHECK1-NEXT:call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l83.omp_outlined.omp_outlined(ptr [[TMP11]], ptr [[DOTBOUND_ZERO_ADDR]], i64 [[TMP8]], i64 [[TMP10]]) #[[ATTR2]] -// CHECK1-NEXT:call void @__kmpc_end_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) +// CHECK1-NEXT:call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @[[GLOB3]], i32 2, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l83.omp_outlined.omp_outlined, i64 [[TMP8]], i64 [[TMP10]]) // CHECK1-NEXT:br label [[OMP_INNER_FOR_INC:%.*]] // CHECK1: omp.inner.for.inc: -// CHECK1-NEXT:[[TMP12:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 -// CHECK1-NEXT:[[TMP13:%.*]] = load i32, ptr [[DOTOMP_STRIDE]], align 4 -// CHECK1-NEXT:[[ADD:%.*]] = add nsw i32 [[TMP12]], [[TMP13]] +// CHECK1-NEXT:[[TMP11:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 +// CHECK1-NEXT:[[TMP12:%.*]] = load i32, ptr [[DOTOMP_STRIDE]], align 4 +// CHECK1-NEXT:[[ADD:%.*]] = add nsw i32 [[TMP11]], [[TMP12]] // CHECK1-NEXT:store i32 [[ADD]], ptr [[DOTOMP_IV]], align 4 // CHECK1-NEXT:br label [[OMP_INNER_FOR_COND]] // CHECK1: omp.inner.for.end: @@ -847,7 +842,6 @@ // CHECK1-NEXT:[[DOTOMP_STRIDE:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[DOTOMP_IS_LAST:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT:[[DOTBOUND_ZERO_ADDR:%.*]] = alloca i32, align 4 // CHECK1-NEXT:store ptr [[DOTGLOBAL_TID_]], ptr [[DOTGLOBAL_TID__ADDR]], align 8 // CHECK1-NEXT:store ptr [[DOTBOUND_TID_]], ptr [[DOTBOUND_TID__ADDR]], align 8 // CHECK1-NEXT:store i64 [[DOTCAPTURE_EXPR_]], ptr [[DOTCAPTURE_EXPR__ADDR]], align 8 @@ -882,25 +876,12 @@ // CHECK1-NEXT:[[TMP8:%.*]] = zext i32 [[TMP7]] to i64 // CHECK1-NEXT:[[TMP9:%.*]] = load i32, ptr [[DOTOMP_COMB_UB]], align 4 // CHECK1-NEXT:[[TMP10:%.*]] = zext i32 [[TMP9]] to i64 -// CHECK1-NEXT:[[TMP11:%.*]] = load i8, ptr [[DOTCAPTURE_EXPR__ADDR]], align 1 -// CHECK1-NEXT:[[TOBOOL:%.*]] = trunc i8 [[TMP11]] to i1 -// CHECK1-NEXT:br i1 [[TOBOOL]], label [[OMP_IF_THEN:%.*]], label [[OMP_IF_ELSE:%.*]] -// CHECK1: omp_if.then: // CHECK1-NEXT:call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @[[GLOB3]], i32 2, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l90.omp_outlined.omp_outlined, i64 [[TMP8]], i64 [[TMP10]]) -// CHECK1-NEXT:br label [[OMP_IF_END:%.*]] -// CHECK1: omp_if.else: -// CHECK1-NEXT:call void @__kmpc_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:[[TMP12:%.*]] = load ptr, ptr [[DOTGLOBAL_TID__ADDR]], align 8 -// CHECK1-NEXT:store i32 0, ptr [[DOTBOUND_ZERO_ADDR]], align 4 -// CHECK1-NEXT:call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l90.omp_outlined.omp_outlined(ptr [[TMP12]], ptr [[DOTBOUND_ZERO_ADDR]], i64 [[TMP8]], i64 [[TMP10]]) #[[ATTR2]] -// CHECK1-NEXT:call void @__kmpc_end_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:br label [[OMP_IF_END]] -// CHECK1: omp_if.end: // CHECK1-NEXT:br label [[OMP_INNER_FOR_INC:%.*]] // CHECK1: omp.inner.for.inc: -// CHECK1-NEXT:[[TMP13:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 -// CHECK1-NEXT:[[TMP14:%.*]] = load i32, ptr
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1564 + // Any if-clause associated with expansion for 'target teams loop' should + // apply to target region only. OpenMP 5.2 [3.4, if Clause, Semantics, 15-18] + if (S.getDirectiveKind() != OMPD_target_teams_loop) ABataev wrote: > I don't see this statement in the standard Sorry for that confusion on my part, Alexey. You're correct. I'll change this to the proper spec phrasing and resubmit. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ABataev added inline comments. Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:1564 + // Any if-clause associated with expansion for 'target teams loop' should + // apply to target region only. OpenMP 5.2 [3.4, if Clause, Semantics, 15-18] + if (S.getDirectiveKind() != OMPD_target_teams_loop) I don't see this statement in the standard CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan updated this revision to Diff 547515. ddpagan added a comment. Added reference to relevant portion of the OpenMP 5.2 specification. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 Files: clang/lib/CodeGen/CGStmtOpenMP.cpp clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp Index: clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp === --- clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp +++ clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp @@ -695,7 +695,6 @@ // CHECK1-NEXT:[[DOTOMP_STRIDE:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[DOTOMP_IS_LAST:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT:[[DOTBOUND_ZERO_ADDR:%.*]] = alloca i32, align 4 // CHECK1-NEXT:store ptr [[DOTGLOBAL_TID_]], ptr [[DOTGLOBAL_TID__ADDR]], align 8 // CHECK1-NEXT:store ptr [[DOTBOUND_TID_]], ptr [[DOTBOUND_TID__ADDR]], align 8 // CHECK1-NEXT:store i32 0, ptr [[DOTOMP_COMB_LB]], align 4 @@ -729,16 +728,12 @@ // CHECK1-NEXT:[[TMP8:%.*]] = zext i32 [[TMP7]] to i64 // CHECK1-NEXT:[[TMP9:%.*]] = load i32, ptr [[DOTOMP_COMB_UB]], align 4 // CHECK1-NEXT:[[TMP10:%.*]] = zext i32 [[TMP9]] to i64 -// CHECK1-NEXT:call void @__kmpc_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:[[TMP11:%.*]] = load ptr, ptr [[DOTGLOBAL_TID__ADDR]], align 8 -// CHECK1-NEXT:store i32 0, ptr [[DOTBOUND_ZERO_ADDR]], align 4 -// CHECK1-NEXT:call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l83.omp_outlined.omp_outlined(ptr [[TMP11]], ptr [[DOTBOUND_ZERO_ADDR]], i64 [[TMP8]], i64 [[TMP10]]) #[[ATTR2]] -// CHECK1-NEXT:call void @__kmpc_end_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) +// CHECK1-NEXT:call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @[[GLOB3]], i32 2, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l83.omp_outlined.omp_outlined, i64 [[TMP8]], i64 [[TMP10]]) // CHECK1-NEXT:br label [[OMP_INNER_FOR_INC:%.*]] // CHECK1: omp.inner.for.inc: -// CHECK1-NEXT:[[TMP12:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 -// CHECK1-NEXT:[[TMP13:%.*]] = load i32, ptr [[DOTOMP_STRIDE]], align 4 -// CHECK1-NEXT:[[ADD:%.*]] = add nsw i32 [[TMP12]], [[TMP13]] +// CHECK1-NEXT:[[TMP11:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 +// CHECK1-NEXT:[[TMP12:%.*]] = load i32, ptr [[DOTOMP_STRIDE]], align 4 +// CHECK1-NEXT:[[ADD:%.*]] = add nsw i32 [[TMP11]], [[TMP12]] // CHECK1-NEXT:store i32 [[ADD]], ptr [[DOTOMP_IV]], align 4 // CHECK1-NEXT:br label [[OMP_INNER_FOR_COND]] // CHECK1: omp.inner.for.end: @@ -847,7 +842,6 @@ // CHECK1-NEXT:[[DOTOMP_STRIDE:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[DOTOMP_IS_LAST:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT:[[DOTBOUND_ZERO_ADDR:%.*]] = alloca i32, align 4 // CHECK1-NEXT:store ptr [[DOTGLOBAL_TID_]], ptr [[DOTGLOBAL_TID__ADDR]], align 8 // CHECK1-NEXT:store ptr [[DOTBOUND_TID_]], ptr [[DOTBOUND_TID__ADDR]], align 8 // CHECK1-NEXT:store i64 [[DOTCAPTURE_EXPR_]], ptr [[DOTCAPTURE_EXPR__ADDR]], align 8 @@ -882,25 +876,12 @@ // CHECK1-NEXT:[[TMP8:%.*]] = zext i32 [[TMP7]] to i64 // CHECK1-NEXT:[[TMP9:%.*]] = load i32, ptr [[DOTOMP_COMB_UB]], align 4 // CHECK1-NEXT:[[TMP10:%.*]] = zext i32 [[TMP9]] to i64 -// CHECK1-NEXT:[[TMP11:%.*]] = load i8, ptr [[DOTCAPTURE_EXPR__ADDR]], align 1 -// CHECK1-NEXT:[[TOBOOL:%.*]] = trunc i8 [[TMP11]] to i1 -// CHECK1-NEXT:br i1 [[TOBOOL]], label [[OMP_IF_THEN:%.*]], label [[OMP_IF_ELSE:%.*]] -// CHECK1: omp_if.then: // CHECK1-NEXT:call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @[[GLOB3]], i32 2, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l90.omp_outlined.omp_outlined, i64 [[TMP8]], i64 [[TMP10]]) -// CHECK1-NEXT:br label [[OMP_IF_END:%.*]] -// CHECK1: omp_if.else: -// CHECK1-NEXT:call void @__kmpc_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:[[TMP12:%.*]] = load ptr, ptr [[DOTGLOBAL_TID__ADDR]], align 8 -// CHECK1-NEXT:store i32 0, ptr [[DOTBOUND_ZERO_ADDR]], align 4 -// CHECK1-NEXT:call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l90.omp_outlined.omp_outlined(ptr [[TMP12]], ptr [[DOTBOUND_ZERO_ADDR]], i64 [[TMP8]], i64 [[TMP10]]) #[[ATTR2]] -// CHECK1-NEXT:call void @__kmpc_end_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:br label [[OMP_IF_END]] -// CHECK1: omp_if.end: // CHECK1-NEXT:br label [[OMP_INNER_FOR_INC:%.*]] // CHECK1: omp.inner.for.inc: -// CHECK1-NEXT:[[TMP13:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 -// CHECK1-NEXT:[[TMP14:%.*]] = load i32, ptr [[DOTOMP_STRIDE]], align 4 -// CHECK1-NEXT:[[ADD:%.*]] = add nsw i32 [[TMP13]], [[TMP14]] +// CHECK1-NEXT:[[TMP11:%.*]] = load i32,
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ABataev added a comment. Add the reference to openmp spec Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157197/new/ https://reviews.llvm.org/D157197 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157197: [clang][CodeGen][OpenMP] Fix if-clause for 'target teams loop'
ddpagan created this revision. ddpagan added a reviewer: ABataev. ddpagan added projects: clang, OpenMP. Herald added subscribers: guansong, yaxunl. Herald added a project: All. ddpagan requested review of this revision. Herald added a reviewer: jdoerfert. Herald added subscribers: cfe-commits, jplehr, sstefan1. When emitting parallel region for 'target teams distribute parallel for' expansion of 'target teams loop', ignore if-clause as anything specified should apply to target region only. Updated test results for OpenMP/target_teams_generic_loop_if_codegen.cpp Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D157197 Files: clang/lib/CodeGen/CGStmtOpenMP.cpp clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp Index: clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp === --- clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp +++ clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp @@ -695,7 +695,6 @@ // CHECK1-NEXT:[[DOTOMP_STRIDE:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[DOTOMP_IS_LAST:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT:[[DOTBOUND_ZERO_ADDR:%.*]] = alloca i32, align 4 // CHECK1-NEXT:store ptr [[DOTGLOBAL_TID_]], ptr [[DOTGLOBAL_TID__ADDR]], align 8 // CHECK1-NEXT:store ptr [[DOTBOUND_TID_]], ptr [[DOTBOUND_TID__ADDR]], align 8 // CHECK1-NEXT:store i32 0, ptr [[DOTOMP_COMB_LB]], align 4 @@ -729,16 +728,12 @@ // CHECK1-NEXT:[[TMP8:%.*]] = zext i32 [[TMP7]] to i64 // CHECK1-NEXT:[[TMP9:%.*]] = load i32, ptr [[DOTOMP_COMB_UB]], align 4 // CHECK1-NEXT:[[TMP10:%.*]] = zext i32 [[TMP9]] to i64 -// CHECK1-NEXT:call void @__kmpc_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:[[TMP11:%.*]] = load ptr, ptr [[DOTGLOBAL_TID__ADDR]], align 8 -// CHECK1-NEXT:store i32 0, ptr [[DOTBOUND_ZERO_ADDR]], align 4 -// CHECK1-NEXT:call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l83.omp_outlined.omp_outlined(ptr [[TMP11]], ptr [[DOTBOUND_ZERO_ADDR]], i64 [[TMP8]], i64 [[TMP10]]) #[[ATTR2]] -// CHECK1-NEXT:call void @__kmpc_end_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) +// CHECK1-NEXT:call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @[[GLOB3]], i32 2, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l83.omp_outlined.omp_outlined, i64 [[TMP8]], i64 [[TMP10]]) // CHECK1-NEXT:br label [[OMP_INNER_FOR_INC:%.*]] // CHECK1: omp.inner.for.inc: -// CHECK1-NEXT:[[TMP12:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 -// CHECK1-NEXT:[[TMP13:%.*]] = load i32, ptr [[DOTOMP_STRIDE]], align 4 -// CHECK1-NEXT:[[ADD:%.*]] = add nsw i32 [[TMP12]], [[TMP13]] +// CHECK1-NEXT:[[TMP11:%.*]] = load i32, ptr [[DOTOMP_IV]], align 4 +// CHECK1-NEXT:[[TMP12:%.*]] = load i32, ptr [[DOTOMP_STRIDE]], align 4 +// CHECK1-NEXT:[[ADD:%.*]] = add nsw i32 [[TMP11]], [[TMP12]] // CHECK1-NEXT:store i32 [[ADD]], ptr [[DOTOMP_IV]], align 4 // CHECK1-NEXT:br label [[OMP_INNER_FOR_COND]] // CHECK1: omp.inner.for.end: @@ -847,7 +842,6 @@ // CHECK1-NEXT:[[DOTOMP_STRIDE:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[DOTOMP_IS_LAST:%.*]] = alloca i32, align 4 // CHECK1-NEXT:[[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT:[[DOTBOUND_ZERO_ADDR:%.*]] = alloca i32, align 4 // CHECK1-NEXT:store ptr [[DOTGLOBAL_TID_]], ptr [[DOTGLOBAL_TID__ADDR]], align 8 // CHECK1-NEXT:store ptr [[DOTBOUND_TID_]], ptr [[DOTBOUND_TID__ADDR]], align 8 // CHECK1-NEXT:store i64 [[DOTCAPTURE_EXPR_]], ptr [[DOTCAPTURE_EXPR__ADDR]], align 8 @@ -882,25 +876,12 @@ // CHECK1-NEXT:[[TMP8:%.*]] = zext i32 [[TMP7]] to i64 // CHECK1-NEXT:[[TMP9:%.*]] = load i32, ptr [[DOTOMP_COMB_UB]], align 4 // CHECK1-NEXT:[[TMP10:%.*]] = zext i32 [[TMP9]] to i64 -// CHECK1-NEXT:[[TMP11:%.*]] = load i8, ptr [[DOTCAPTURE_EXPR__ADDR]], align 1 -// CHECK1-NEXT:[[TOBOOL:%.*]] = trunc i8 [[TMP11]] to i1 -// CHECK1-NEXT:br i1 [[TOBOOL]], label [[OMP_IF_THEN:%.*]], label [[OMP_IF_ELSE:%.*]] -// CHECK1: omp_if.then: // CHECK1-NEXT:call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @[[GLOB3]], i32 2, ptr @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l90.omp_outlined.omp_outlined, i64 [[TMP8]], i64 [[TMP10]]) -// CHECK1-NEXT:br label [[OMP_IF_END:%.*]] -// CHECK1: omp_if.else: -// CHECK1-NEXT:call void @__kmpc_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:[[TMP12:%.*]] = load ptr, ptr [[DOTGLOBAL_TID__ADDR]], align 8 -// CHECK1-NEXT:store i32 0, ptr [[DOTBOUND_ZERO_ADDR]], align 4 -// CHECK1-NEXT:call void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l90.omp_outlined.omp_outlined(ptr [[TMP12]], ptr [[DOTBOUND_ZERO_ADDR]], i64 [[TMP8]], i64 [[TMP10]]) #[[ATTR2]] -// CHECK1-NEXT:call void @__kmpc_end_serialized_parallel(ptr @[[GLOB3]], i32 [[TMP1]]) -// CHECK1-NEXT:br