[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D115661#3193431 , @yaxunl wrote:

> What about situations of a derived pointer to the global variable? For example
>
>   const int a[100] ;
>   
>   foo([50]);
>
> If we put a in addr space 4, it is easy to deduce [50] is constant. If we 
> put a in addr space 1, how does backend know [50] is constant?

Everything using alias analysis looks back for the definition of the original 
object


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D115661#3193413 , @yaxunl wrote:

> In D115661#3193157 , @arsenm wrote:
>
>> In D115661#3193152 , @yaxunl wrote:
>>
>>> In D115661#3192983 , @estewart08 
>>> wrote:
>>>
 In D115661#3190477 , @yaxunl 
 wrote:

> This may cause perf regressions for HIP.

 Do you have a test that would show such a regression? Emitting a store to 
 address space (4) in a constructor seems the wrong thing to do.
>>>
>>> The two lit tests which changed from addr space 4 to 1 demonstrated that. 
>>> In alias analysis, if a variable is in addr space 4, the backend knows that 
>>> it is constant and can do optimizations on it. After changing to addr space 
>>> 1, those optimizations are gone.
>>
>> The backend also knows because the constant flag is set on the global 
>> variable. Addrspace(4) is a kludge which is largely redundant with other 
>> mechanisms for indicating constants
>
> If backend can only rely on constant flag then we do not need put global 
> variables in constant addr space.
>
> Let's leave this patch as it is now. And revisit it if there are any 
> regressions found.

What about situations of a derived pointer to the global variable? For example

  const int a[100] ;
  
  foo([50]);

If we put a in addr space 4, it is easy to deduce [50] is constant. If we put 
a in addr space 1, how does backend know [50] is constant?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D115661#3193157 , @arsenm wrote:

> In D115661#3193152 , @yaxunl wrote:
>
>> In D115661#3192983 , @estewart08 
>> wrote:
>>
>>> In D115661#3190477 , @yaxunl 
>>> wrote:
>>>
 This may cause perf regressions for HIP.
>>>
>>> Do you have a test that would show such a regression? Emitting a store to 
>>> address space (4) in a constructor seems the wrong thing to do.
>>
>> The two lit tests which changed from addr space 4 to 1 demonstrated that. In 
>> alias analysis, if a variable is in addr space 4, the backend knows that it 
>> is constant and can do optimizations on it. After changing to addr space 1, 
>> those optimizations are gone.
>
> The backend also knows because the constant flag is set on the global 
> variable. Addrspace(4) is a kludge which is largely redundant with other 
> mechanisms for indicating constants

If backend can only rely on constant flag then we do not need put global 
variables in constant addr space.

Let's leave this patch as it is now. And revisit it if there are any 
regressions found.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Ethan Stewart via Phabricator via cfe-commits
estewart08 added a comment.

In D115661#3193157 , @arsenm wrote:

> In D115661#3193152 , @yaxunl wrote:
>
>> In D115661#3192983 , @estewart08 
>> wrote:
>>
>>> In D115661#3190477 , @yaxunl 
>>> wrote:
>>>
 This may cause perf regressions for HIP.
>>>
>>> Do you have a test that would show such a regression? Emitting a store to 
>>> address space (4) in a constructor seems the wrong thing to do.
>>
>> The two lit tests which changed from addr space 4 to 1 demonstrated that. In 
>> alias analysis, if a variable is in addr space 4, the backend knows that it 
>> is constant and can do optimizations on it. After changing to addr space 1, 
>> those optimizations are gone.
>
> The backend also knows because the constant flag is set on the global 
> variable. Addrspace(4) is a kludge which is largely redundant with other 
> mechanisms for indicating constants

If I am understanding you correctly, putting things in address space (4) has 
little to no performance benefit. @yaxunl seems to think otherwise. I agree 
that we can further constrain the address space (1) criteria, but I am getting 
conflicting viewpoints here on performance.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

Changing the has-constant-initialiser check to a more precise/robust one sounds 
reasonable to me.

In D115661#3193157 , @arsenm wrote:

> The backend also knows because the constant flag is set on the global 
> variable. Addrspace(4) is a kludge which is largely redundant with other 
> mechanisms for indicating constants

There's an interesting edge case here - we might want variables that are 
assigned to from a global ctor and subsequently constant, in which case they 
are not 'constant' as far as IR is concerned, but they are amenable to scalar 
loads and will be invariant once loaded. Some tables of data that are computed 
once in a .ctor kernel launch and then read by other kernels. I'm not totally 
sure how to express that pattern in IR though.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D115661#3193152 , @yaxunl wrote:

> In D115661#3192983 , @estewart08 
> wrote:
>
>> In D115661#3190477 , @yaxunl wrote:
>>
>>> This may cause perf regressions for HIP.
>>
>> Do you have a test that would show such a regression? Emitting a store to 
>> address space (4) in a constructor seems the wrong thing to do.
>
> The two lit tests which changed from addr space 4 to 1 demonstrated that. In 
> alias analysis, if a variable is in addr space 4, the backend knows that it 
> is constant and can do optimizations on it. After changing to addr space 1, 
> those optimizations are gone.

The backend also knows because the constant flag is set on the global variable. 
Addrspace(4) is a kludge which is largely redundant with other mechanisms for 
indicating constants


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D115661#3192983 , @estewart08 
wrote:

> In D115661#3190477 , @yaxunl wrote:
>
>> This may cause perf regressions for HIP.
>
> Do you have a test that would show such a regression? Emitting a store to 
> address space (4) in a constructor seems the wrong thing to do.

The two lit tests which changed from addr space 4 to 1 demonstrated that. In 
alias analysis, if a variable is in addr space 4, the backend knows that it is 
constant and can do optimizations on it. After changing to addr space 1, those 
optimizations are gone.

I am not objecting to putting variables like `const float x=log2(y)` to addr 
space 1. However, the lit tests indicate the condition check is too restrictive.




Comment at: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp:10
 // X86: @_ZN1A3FooE ={{.*}} constant i32 123, align 4
-// AMD: @_ZN1A3FooE ={{.*}} addrspace(4) constant i32 123, align 4
+// AMD: @_ZN1A3FooE ={{.*}} addrspace(1) constant i32 123, align 4
 const int *p = ::Foo; // emit available_externally

estewart08 wrote:
> yaxunl wrote:
> > Do you know why this is not treated as constant initialization?
> > 
> There is no initialization here:
> ```
> const int A::Foo;
> ```
> 
> It seems the compiler ignores the 123 in the struct.
> 
> I see address space (4) when the test is written like this:
> ```
> struct A {
>   static const int Foo;
> };
> const int A::Foo = 123;
> ```
In this case, there are two Decl's for `A::Foo` and the initializer of `A::Foo` 
is on the first initializer whereas `hasConstantInitialization` only checks the 
final Decl, which does not have an initializer. Therefore 
hasConstantInitialization may conservatively return false for a variable with a 
constant initializer in a previous decl.

clang codegen calls VarDecl::getAnyInitializer to checks all Decls of a 
variable when emitting its initializer in LLVM IR

https://github.com/llvm/llvm-project/blob/5336befe8c3cde08cec020583700b4d2ba25ac16/clang/lib/AST/Decl.cpp#L2277

I would suggest calling getAnyInitializer to get the initializer, then check 
whether it is constant expr to determine whether the variable will have a 
constant initializer in IR. I think hasConstantInitialization does not do that 
because it has multiple usages, not just for clang codegen, therefore it does 
not check all decls.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Ethan Stewart via Phabricator via cfe-commits
estewart08 added a comment.

In D115661#3190477 , @yaxunl wrote:

> This may cause perf regressions for HIP.

Do you have a test that would show such a regression? Emitting a store to 
address space (4) in a constructor seems the wrong thing to do.




Comment at: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp:10
 // X86: @_ZN1A3FooE ={{.*}} constant i32 123, align 4
-// AMD: @_ZN1A3FooE ={{.*}} addrspace(4) constant i32 123, align 4
+// AMD: @_ZN1A3FooE ={{.*}} addrspace(1) constant i32 123, align 4
 const int *p = ::Foo; // emit available_externally

yaxunl wrote:
> Do you know why this is not treated as constant initialization?
> 
There is no initialization here:
```
const int A::Foo;
```

It seems the compiler ignores the 123 in the struct.

I see address space (4) when the test is written like this:
```
struct A {
  static const int Foo;
};
const int A::Foo = 123;
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Ethan Stewart via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs 
Revision".
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd1327f8a574a: [clang][amdgpu] - Choose when to promote 
VarDecl to address space 4. (authored by estewart08).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
  clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp


Index: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
===
--- clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
+++ clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
@@ -7,7 +7,7 @@
   static const int Foo = 123;
 };
 // X86: @_ZN1A3FooE ={{.*}} constant i32 123, align 4
-// AMD: @_ZN1A3FooE ={{.*}} addrspace(4) constant i32 123, align 4
+// AMD: @_ZN1A3FooE ={{.*}} addrspace(1) constant i32 123, align 4
 const int *p = ::Foo; // emit available_externally
 const int A::Foo;   // convert to full definition
 
@@ -37,7 +37,7 @@
   // CXX11X86: @_ZN3Foo21ConstexprStaticMemberE = available_externally 
constant i32 42,
   // CXX17X86: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr constant i32 42,
   // CXX11AMD: @_ZN3Foo21ConstexprStaticMemberE = available_externally 
addrspace(4) constant i32 42,
-  // CXX17AMD: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr addrspace(4) 
constant i32 42,
+  // CXX17AMD: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr addrspace(4) 
constant i32 42, comdat, align 4
   static constexpr int ConstexprStaticMember = 42;
   // X86: @_ZN3Foo17ConstStaticMemberE = available_externally constant i32 43,
   // AMD: @_ZN3Foo17ConstStaticMemberE = available_externally addrspace(4) 
constant i32 43,
Index: clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
===
--- clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -78,12 +78,12 @@
 // X86: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
global [2 x i32] zeroinitializer, align 4
 // X86: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
 // AMDGCN: @_ZN15partly_constant1kE ={{.*}} addrspace(1) global i32 0, align 4
-// AMDGCN: @_ZN15partly_constant2ilE ={{.*}} addrspace(4) global {{.*}} null, 
align 8
-// AMDGCN: @[[PARTLY_CONSTANT_OUTER:_ZGRN15partly_constant2ilE_]] = internal 
addrspace(4) global {{.*}} zeroinitializer, align 8
-// AMDGCN: @[[PARTLY_CONSTANT_INNER:_ZGRN15partly_constant2ilE0_]] = internal 
addrspace(4) global [3 x {{.*}}] zeroinitializer, align 8
-// AMDGCN: @[[PARTLY_CONSTANT_FIRST:_ZGRN15partly_constant2ilE1_]] = internal 
addrspace(4) constant [3 x i32] [i32 1, i32 2, i32 3], align 4
-// AMDGCN: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
addrspace(4) global [2 x i32] zeroinitializer, align 4
-// AMDGCN: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
addrspace(4) constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
+// AMDGCN: @_ZN15partly_constant2ilE ={{.*}} addrspace(1) global {{.*}} null, 
align 8
+// AMDGCN: @[[PARTLY_CONSTANT_OUTER:_ZGRN15partly_constant2ilE_]] = internal 
addrspace(1) global {{.*}} zeroinitializer, align 8
+// AMDGCN: @[[PARTLY_CONSTANT_INNER:_ZGRN15partly_constant2ilE0_]] = internal 
addrspace(1) global [3 x {{.*}}] zeroinitializer, align 8
+// AMDGCN: @[[PARTLY_CONSTANT_FIRST:_ZGRN15partly_constant2ilE1_]] = internal 
addrspace(1) constant [3 x i32] [i32 1, i32 2, i32 3], align 4
+// AMDGCN: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
addrspace(1) global [2 x i32] zeroinitializer, align 4
+// AMDGCN: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
addrspace(1) constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
 
 // X86: @[[REFTMP1:.*]] = private constant [2 x i32] [i32 42, i32 43], align 4
 // X86: @[[REFTMP2:.*]] = private constant [3 x %{{.*}}] [%{{.*}} { i32 1 }, 
%{{.*}} { i32 2 }, %{{.*}} { i32 3 }], align 4
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -9362,7 +9362,9 @@
   if (AddrSpace != LangAS::Default)
 return AddrSpace;
 
-  if (CGM.isTypeConstant(D->getType(), false)) {
+  // Only promote to address space 4 if VarDecl has constant initialization.
+  if (CGM.isTypeConstant(D->getType(), false) &&
+  D->hasConstantInitialization()) {
 if (auto ConstAS = CGM.getTarget().getConstantAddressSpace())
   return ConstAS.getValue();
   }


Index: 

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl requested changes to this revision.
yaxunl added a comment.
This revision now requires changes to proceed.

This may cause perf regressions for HIP.




Comment at: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp:10
 // X86: @_ZN1A3FooE ={{.*}} constant i32 123, align 4
-// AMD: @_ZN1A3FooE ={{.*}} addrspace(4) constant i32 123, align 4
+// AMD: @_ZN1A3FooE ={{.*}} addrspace(1) constant i32 123, align 4
 const int *p = ::Foo; // emit available_externally

Do you know why this is not treated as constant initialization?



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision.
JonChesterfield added a comment.
This revision is now accepted and ready to land.

That sounds good here. Infer addrspaces is pretty complicated, given marginal 
benefit for (4) this patch seems to hit the mark.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D115661#3190324 , @JonChesterfield 
wrote:

> Patch looks ok to me. This will fix the miscompile (we end up with a store to 
> addrspace(4) at present) without upsetting whatever hacks rely on 
> addrspace(4). @arsenm reasonable as a point fix?

Yes, I'm mostly saying to not waste your time trying to infer addrspace(4) in 
InferAddressSpaces or anything like that


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

Patch looks ok to me. This will fix the miscompile (we end up with a store to 
addrspace(4) at present) without upsetting whatever hacks rely on addrspace(4). 
@arsenm reasonable as a point fix?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D115661#3190041 , @arsenm wrote:

> There's no real advantage to using addrspace(4) at all

There are a few places where it's used as an optimization hint / as a crutch 
where we don't have a proper analysis. Fundamentally I would like to eliminate 
addrspace(4) eventually


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

There's no real advantage to using addrspace(4) at all


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Ethan Stewart via Phabricator via cfe-commits
estewart08 updated this revision to Diff 394002.
estewart08 added a comment.

Resubmit patch with lint.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115661/new/

https://reviews.llvm.org/D115661

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
  clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp


Index: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
===
--- clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
+++ clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
@@ -7,7 +7,7 @@
   static const int Foo = 123;
 };
 // X86: @_ZN1A3FooE ={{.*}} constant i32 123, align 4
-// AMD: @_ZN1A3FooE ={{.*}} addrspace(4) constant i32 123, align 4
+// AMD: @_ZN1A3FooE ={{.*}} addrspace(1) constant i32 123, align 4
 const int *p = ::Foo; // emit available_externally
 const int A::Foo;   // convert to full definition
 
@@ -37,7 +37,7 @@
   // CXX11X86: @_ZN3Foo21ConstexprStaticMemberE = available_externally 
constant i32 42,
   // CXX17X86: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr constant i32 42,
   // CXX11AMD: @_ZN3Foo21ConstexprStaticMemberE = available_externally 
addrspace(4) constant i32 42,
-  // CXX17AMD: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr addrspace(4) 
constant i32 42,
+  // CXX17AMD: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr addrspace(4) 
constant i32 42, comdat, align 4
   static constexpr int ConstexprStaticMember = 42;
   // X86: @_ZN3Foo17ConstStaticMemberE = available_externally constant i32 43,
   // AMD: @_ZN3Foo17ConstStaticMemberE = available_externally addrspace(4) 
constant i32 43,
Index: clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
===
--- clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -78,12 +78,12 @@
 // X86: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
global [2 x i32] zeroinitializer, align 4
 // X86: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
 // AMDGCN: @_ZN15partly_constant1kE ={{.*}} addrspace(1) global i32 0, align 4
-// AMDGCN: @_ZN15partly_constant2ilE ={{.*}} addrspace(4) global {{.*}} null, 
align 8
-// AMDGCN: @[[PARTLY_CONSTANT_OUTER:_ZGRN15partly_constant2ilE_]] = internal 
addrspace(4) global {{.*}} zeroinitializer, align 8
-// AMDGCN: @[[PARTLY_CONSTANT_INNER:_ZGRN15partly_constant2ilE0_]] = internal 
addrspace(4) global [3 x {{.*}}] zeroinitializer, align 8
-// AMDGCN: @[[PARTLY_CONSTANT_FIRST:_ZGRN15partly_constant2ilE1_]] = internal 
addrspace(4) constant [3 x i32] [i32 1, i32 2, i32 3], align 4
-// AMDGCN: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
addrspace(4) global [2 x i32] zeroinitializer, align 4
-// AMDGCN: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
addrspace(4) constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
+// AMDGCN: @_ZN15partly_constant2ilE ={{.*}} addrspace(1) global {{.*}} null, 
align 8
+// AMDGCN: @[[PARTLY_CONSTANT_OUTER:_ZGRN15partly_constant2ilE_]] = internal 
addrspace(1) global {{.*}} zeroinitializer, align 8
+// AMDGCN: @[[PARTLY_CONSTANT_INNER:_ZGRN15partly_constant2ilE0_]] = internal 
addrspace(1) global [3 x {{.*}}] zeroinitializer, align 8
+// AMDGCN: @[[PARTLY_CONSTANT_FIRST:_ZGRN15partly_constant2ilE1_]] = internal 
addrspace(1) constant [3 x i32] [i32 1, i32 2, i32 3], align 4
+// AMDGCN: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
addrspace(1) global [2 x i32] zeroinitializer, align 4
+// AMDGCN: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
addrspace(1) constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
 
 // X86: @[[REFTMP1:.*]] = private constant [2 x i32] [i32 42, i32 43], align 4
 // X86: @[[REFTMP2:.*]] = private constant [3 x %{{.*}}] [%{{.*}} { i32 1 }, 
%{{.*}} { i32 2 }, %{{.*}} { i32 3 }], align 4
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -9362,7 +9362,9 @@
   if (AddrSpace != LangAS::Default)
 return AddrSpace;
 
-  if (CGM.isTypeConstant(D->getType(), false)) {
+  // Only promote to address space 4 if VarDecl has constant initialization.
+  if (CGM.isTypeConstant(D->getType(), false) &&
+  D->hasConstantInitialization()) {
 if (auto ConstAS = CGM.getTarget().getConstantAddressSpace())
   return ConstAS.getValue();
   }


Index: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
===
--- clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
+++ clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
@@ -7,7 +7,7 @@
   static const int Foo = 123;
 };
 // 

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Ethan Stewart via Phabricator via cfe-commits
estewart08 created this revision.
estewart08 added a reviewer: JonChesterfield.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, kzhuravl.
estewart08 requested review of this revision.
Herald added subscribers: cfe-commits, wdng.
Herald added a project: clang.

There are instances where clang codegen creates stores to
address space 4 in ctors, which causes a crash in llc.
This store was being optimized out at opt levels > 0.

For example:

pragma omp declare target
static  const double log_smallx = log2(smallx);
pragma omp end declare target

This patch ensures that any global const that does not
have constant initialization stays in address space 1.

Note - a second patch is in the works where all global
constants are placed in address space 1 during
codegen and then the opt pass InferAdressSpaces
will promote to address space 4 where necessary.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D115661

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
  clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp


Index: clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
===
--- clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
+++ clang/test/CodeGenCXX/cxx11-extern-constexpr.cpp
@@ -7,7 +7,7 @@
   static const int Foo = 123;
 };
 // X86: @_ZN1A3FooE ={{.*}} constant i32 123, align 4
-// AMD: @_ZN1A3FooE ={{.*}} addrspace(4) constant i32 123, align 4
+// AMD: @_ZN1A3FooE ={{.*}} addrspace(1) constant i32 123, align 4
 const int *p = ::Foo; // emit available_externally
 const int A::Foo;   // convert to full definition
 
@@ -37,7 +37,7 @@
   // CXX11X86: @_ZN3Foo21ConstexprStaticMemberE = available_externally 
constant i32 42,
   // CXX17X86: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr constant i32 42,
   // CXX11AMD: @_ZN3Foo21ConstexprStaticMemberE = available_externally 
addrspace(4) constant i32 42,
-  // CXX17AMD: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr addrspace(4) 
constant i32 42,
+  // CXX17AMD: @_ZN3Foo21ConstexprStaticMemberE = linkonce_odr addrspace(4) 
constant i32 42, comdat, align 4
   static constexpr int ConstexprStaticMember = 42;
   // X86: @_ZN3Foo17ConstStaticMemberE = available_externally constant i32 43,
   // AMD: @_ZN3Foo17ConstStaticMemberE = available_externally addrspace(4) 
constant i32 43,
Index: clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
===
--- clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -78,12 +78,12 @@
 // X86: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
global [2 x i32] zeroinitializer, align 4
 // X86: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
 // AMDGCN: @_ZN15partly_constant1kE ={{.*}} addrspace(1) global i32 0, align 4
-// AMDGCN: @_ZN15partly_constant2ilE ={{.*}} addrspace(4) global {{.*}} null, 
align 8
-// AMDGCN: @[[PARTLY_CONSTANT_OUTER:_ZGRN15partly_constant2ilE_]] = internal 
addrspace(4) global {{.*}} zeroinitializer, align 8
-// AMDGCN: @[[PARTLY_CONSTANT_INNER:_ZGRN15partly_constant2ilE0_]] = internal 
addrspace(4) global [3 x {{.*}}] zeroinitializer, align 8
-// AMDGCN: @[[PARTLY_CONSTANT_FIRST:_ZGRN15partly_constant2ilE1_]] = internal 
addrspace(4) constant [3 x i32] [i32 1, i32 2, i32 3], align 4
-// AMDGCN: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
addrspace(4) global [2 x i32] zeroinitializer, align 4
-// AMDGCN: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
addrspace(4) constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
+// AMDGCN: @_ZN15partly_constant2ilE ={{.*}} addrspace(1) global {{.*}} null, 
align 8
+// AMDGCN: @[[PARTLY_CONSTANT_OUTER:_ZGRN15partly_constant2ilE_]] = internal 
addrspace(1) global {{.*}} zeroinitializer, align 8
+// AMDGCN: @[[PARTLY_CONSTANT_INNER:_ZGRN15partly_constant2ilE0_]] = internal 
addrspace(1) global [3 x {{.*}}] zeroinitializer, align 8
+// AMDGCN: @[[PARTLY_CONSTANT_FIRST:_ZGRN15partly_constant2ilE1_]] = internal 
addrspace(1) constant [3 x i32] [i32 1, i32 2, i32 3], align 4
+// AMDGCN: @[[PARTLY_CONSTANT_SECOND:_ZGRN15partly_constant2ilE2_]] = internal 
addrspace(1) global [2 x i32] zeroinitializer, align 4
+// AMDGCN: @[[PARTLY_CONSTANT_THIRD:_ZGRN15partly_constant2ilE3_]] = internal 
addrspace(1) constant [4 x i32] [i32 5, i32 6, i32 7, i32 8], align 4
 
 // X86: @[[REFTMP1:.*]] = private constant [2 x i32] [i32 42, i32 43], align 4
 // X86: @[[REFTMP2:.*]] = private constant [3 x %{{.*}}] [%{{.*}} { i32 1 }, 
%{{.*}} { i32 2 }, %{{.*}} { i32 3 }], align 4
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@