[clang] [Clang][TableGen] Support specifying address space in clang builtin prototypes (PR #108497)

2024-09-20 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> > > > Gentle ping @AaronBallman , @philnik777 , @fpetrogalli :)
> > > 
> > > 
> > > Ah, sorry -- because the PR is marked as a Draft, I figured it wasn't 
> > > ready for review yet.
> > > I think I'd rather this was expressed differently; we already don't put 
> > > attribute information in the prototype anyway (`noexcept` as an example), 
> > > so I'd prefer to continue down that road and put the address space 
> > > information into the `Attributes` field. e.g.,
> > > ```
> > > def BuiltinCPUIs : Builtin {
> > >   let Spellings = ["__builtin_cpu_is"];
> > >   let Attributes = [NoThrow, Const, AddressSpace<2>];
> > >   let Prototype = "bool(char const*)";
> > > }
> > > ```
> > > 
> > > I think that makes it more clean in terms of specifying the attribute, 
> > > and it also means we can name the address spaces in `BuiltinsBase.td` if 
> > > we would like, which is even easier for folks to understand when reading 
> > > `Builtins.td`
> > > WDYT?
> > 
> > 
> > Thanks for the reply @AaronBallman . The reason this is still a draft is 
> > that I wanted it to be an initial proposal to get some input and a 
> > consensus on the final design. As for making it part of the "Attributes" 
> > field, one major issue is that the address space information should be per 
> > argument, including the return type. The "Attributes" field currently 
> > expresses attributes of the function. If an attribute in the prototype is 
> > not desired, perhaps a new field that lets us specify per-argument 
> > attributes makes sense?
> 
> Oh! I hadn't realized this was needed on a per-parameter basis. Oof that 
> makes this more awkward. I'd still love to avoid writing this as part of the 
> signature; I think we could use the existing `IndexedAttribute` to specify 
> which argument the attribute applies to. e.g.,
> 
> ```
> class AddressSpace<int Idx, int AddrSpaceNum> : IndexedAttribute<"something", Idx> {
>   int SpaceNum = AddrSpaceNum;
> }
> ```

Makes sense, I will give this a try and update the PR
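
A minimal sketch, assuming the indexed attribute ends up as a list of (index, address space) pairs, of how such pairs could be folded into the encoded prototype string that the tests in this PR check for (for example "di*3"). The names `IndexedAddrSpace` and `encodePrototype` are hypothetical, and treating index 0 as the return type is an assumption; none of this is the actual clang-tblgen code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one AddressSpace<Idx, N> attribute.
struct IndexedAddrSpace {
  unsigned Idx;      // assumption: 0 is the return type, 1.. are parameters
  unsigned SpaceNum; // address space number to attach
};

// Types holds already-encoded types, return type first (e.g. {"d", "i*"}).
// Each attribute appends its address space number to the type it indexes.
std::string encodePrototype(std::vector<std::string> Types,
                            const std::vector<IndexedAddrSpace> &Attrs) {
  for (const IndexedAddrSpace &A : Attrs) {
    assert(A.Idx < Types.size() && "attribute index out of range");
    std::string &T = Types[A.Idx];
    // Address spaces only make sense on pointer or reference encodings.
    assert(!T.empty() && (T.back() == '*' || T.back() == '&'));
    T += std::to_string(A.SpaceNum);
  }
  std::string Out;
  for (const std::string &T : Types)
    Out += T;
  return Out;
}
```

With this model, the attribute list stays outside the prototype text, and the per-argument information is still available when the encoded string is emitted.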

https://github.com/llvm/llvm-project/pull/108497
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang][TableGen] Support specifying address space in clang builtin prototypes (PR #108497)

2024-09-18 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> > Gentle ping @AaronBallman , @philnik777 , @fpetrogalli :)
> 
> Ah, sorry -- because the PR is marked as a Draft, I figured it wasn't ready 
> for review yet.
> 
> I think I'd rather this was expressed differently; we already don't put 
> attribute information in the prototype anyway (`noexcept` as an example), so 
> I'd prefer to continue down that road and put the address space information 
> into the `Attributes` field. e.g.,
> 
> ```
> def BuiltinCPUIs : Builtin {
>   let Spellings = ["__builtin_cpu_is"];
>   let Attributes = [NoThrow, Const, AddressSpace<2>];
>   let Prototype = "bool(char const*)";
> }
> ```
> 
> I think that makes it more clean in terms of specifying the attribute, and it 
> also means we can name the address spaces in `BuiltinsBase.td` if we would 
> like, which is even easier for folks to understand when reading `Builtins.td`
> 
> WDYT?

Thanks for the reply @AaronBallman . The reason this is still a draft is that I 
wanted it to be an initial proposal to get some input and a consensus on the 
final design. As for making it part of the "Attributes" field, one major 
issue is that the address space information should be per argument, including 
the return type. The "Attributes" field currently expresses attributes of the 
function. If an attribute in the prototype is not desired, perhaps a new field 
that lets us specify per-argument attributes makes sense?

https://github.com/llvm/llvm-project/pull/108497


[clang] [Clang][TableGen] Support specifying address space in clang builtin prototypes (PR #108497)

2024-09-15 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Gentle ping @AaronBallman , @philnik777 , @fpetrogalli  :)

https://github.com/llvm/llvm-project/pull/108497


[clang] [Clang][TableGen] Support specifying address space in clang builtin prototypes (PR #108497)

2024-09-12 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH created 
https://github.com/llvm/llvm-project/pull/108497

This is a follow-up to the discussion in 
https://github.com/llvm/llvm-project/pull/86801 (apologies for the long 
delay...). 
This PR proposes a way to specify address spaces in builtin prototypes. The 
idea is to specify address space numbers using the C++11 attribute list syntax 
([[]]), with the following limitations:
1. The attribute [[addrspace[n]]] is strictly a "prefix" to the builtin type, 
i.e. something like the following is not accepted:
 int* const [[addrspace[3]]] ;
2. I really wanted the syntax to be [[addrspace(n)]], but the '(' token 
conflicts with the function signature, so the current approach is to use 
"[[addrspace[n]]]".
3. The attribute is only valid with pointer and reference types (as per the 
restriction imposed by .def files).

I would like some views on this approach and alternative suggestions, if any. 
Also, please let me know if there are any parallel efforts towards this that I 
might not be aware of.

>From 6afc2e91d8877cc330f6e317a404a74990d9c607 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 4 Sep 2024 10:34:54 +
Subject: [PATCH] [clang][TableGen] Support specifying address space in clang
 builtin prototypes

---
 .../target-builtins-prototype-parser.td   | 71 +++
 clang/utils/TableGen/ClangBuiltinsEmitter.cpp | 52 --
 2 files changed, 119 insertions(+), 4 deletions(-)

diff --git a/clang/test/TableGen/target-builtins-prototype-parser.td 
b/clang/test/TableGen/target-builtins-prototype-parser.td
index 555aebb3ccfb1f..dcff11046603ef 100644
--- a/clang/test/TableGen/target-builtins-prototype-parser.td
+++ b/clang/test/TableGen/target-builtins-prototype-parser.td
@@ -6,6 +6,12 @@
 // RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_B 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_B
 // RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_C 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_C
 // RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_D 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_D
+// RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_E 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_E
+// RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_F 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_F
+// RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_G 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_G
+// RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_H 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_H
+// RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_I 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_I
+// RUN: not clang-tblgen -I %p/../../include/ %s --gen-clang-builtins 
-DERROR_EXPECTED_J 2>&1 | FileCheck %s --check-prefix=ERROR_EXPECTED_J
 
 include "clang/Basic/BuiltinsBase.td"
 
@@ -113,3 +119,68 @@ def : Builtin {
 }
 #endif
 
+def : Builtin {
+// CHECK: BUILTIN(__builtin_test_addrspace_attribute_01, "di*3", "")
+  let Prototype = "double( [[addrspace[3]]] int*)";
+  let Spellings = ["__builtin_test_addrspace_attribute_01"];
+}
+
+def : Builtin {
+// CHECK: BUILTIN(__builtin_test_addrspace_attribute_02, "Ii*5i*d", "")
+  let Prototype = "_Constant [[addrspace[5]]] int* (int*, double)";
+  let Spellings = ["__builtin_test_addrspace_attribute_02"];
+}
+
+def : Builtin {
+// CHECK: BUILTIN(__builtin_test_addrspace_attribute_04, "Ii&4id*7", "")
+  let Prototype = "_Constant [[addrspace[4]]] int& (int , [[addrspace[7]]] 
double*)";
+  let Spellings = ["__builtin_test_addrspace_attribute_04"];
+}
+
+#ifdef ERROR_EXPECTED_E
+def : Builtin {
+// ERROR_EXPECTED_E: :[[# @LINE + 1]]:7: error: Expected opening bracket '[' 
after 'addrspace'
+  let Prototype = "_Constant [[addrspace]] int& (int , double*)";
+  let Spellings = ["__builtin_test_addrspace_attribute_04"];
+}
+#endif
+
+#ifdef ERROR_EXPECTED_F
+def : Builtin {
+// ERROR_EXPECTED_F: :[[# @LINE + 1]]:7: error: Address space attribute can 
only be specified with a pointer or reference type
+  let Prototype = "_Constant [[addrspace[4]]] int (int , double*)";
+  let Spellings = ["__builtin_test_addrspace_attribute_04"];
+}
+#endif
+
+#ifdef ERROR_EXPECTED_G
+def : Builtin {
+// ERROR_EXPECTED_G: :[[# @LINE + 1]]:7: error: Expecetd valid integer for 
'addrspace' attribute
+  let Prototype = "_Constant [[addrspace[k]]] int* (int , double*)";
+  let Spellings = ["__builtin_test_addrspace_attribute_04"];
+}
+#endif
+
+#ifdef ERROR_EXPECTED_H
+def : Builtin {
+// ERROR_EXPECTED_H: :[[# @LINE + 1]]:7: error: Expected closing bracket ']' 
after address space specification
+  let Prototype = "_Constant [[addrspace[6 int* (int , double*)";
+  let Spellings = ["__builtin_test_addrspace_attribute_04"];
+}
+#endif
+
+#ifdef ERROR_EXP

[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #72607)

2024-08-07 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Closing this, since it's handled via 
https://github.com/llvm/llvm-project/pull/101126

https://github.com/llvm/llvm-project/pull/72607


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #72607)

2024-08-07 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH closed 
https://github.com/llvm/llvm-project/pull/72607


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-08-07 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> @vikramRH Do you need someone else to merge this for you?

sorry for the delay, merged.

https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-08-07 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH closed 
https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-08-07 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

### Merge activity

* **Aug 7, 6:38 AM EDT**: @vikramRH started a stack merge that includes this 
pull request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/101126).


https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-08-04 Thread Vikram Hegde via cfe-commits


@@ -64,6 +64,9 @@ sections with improvements to Clang's support for those 
languages.
 
 C++ Language Changes
 
+- Allow single element access of vector object to be constant expression.

vikramRH wrote:

done

https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-08-04 Thread Vikram Hegde via cfe-commits


@@ -3,40 +3,40 @@
 
 typedef int __attribute__((vector_size(16))) VI4;
 constexpr VI4 A = {1,2,3,4};
-static_assert(A[0] == 1, ""); // ref-error {{not an integral constant 
expression}}
-static_assert(A[1] == 2, ""); // ref-error {{not an integral constant 
expression}}
-static_assert(A[2] == 3, ""); // ref-error {{not an integral constant 
expression}}
-static_assert(A[3] == 4, ""); // ref-error {{not an integral constant 
expression}}
+static_assert(A[0] == 1, "");
+static_assert(A[1] == 2, "");
+static_assert(A[2] == 3, "");
+static_assert(A[3] == 4, "");
 
 
 /// FIXME: It would be nice if the note said 'vector' instead of 'array'.
-static_assert(A[12] == 4, ""); // ref-error {{not an integral constant 
expression}} \
-   // expected-error {{not an integral constant 
expression}} \
-   // expected-note {{cannot refer to element 12 
of array of 4 elements in a constant expression}}
+static_assert(A[12] == 4, ""); // both-error {{not an integral constant 
expression}} \
+   // expected-note {{cannot refer to element 12 
of array of 4 elements in a constant expression}} \
+   // ref-note {{read of dereferenced 
one-past-the-end pointer is not allowed in a constant expression}}

vikramRH wrote:

done, let me know if the updated changes are okay

https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-08-02 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH updated 
https://github.com/llvm/llvm-project/pull/101126

>From 690901f2370381285afa7cf7c2f7401d89e568f6 Mon Sep 17 00:00:00 2001
From: Vikram 
Date: Mon, 29 Jul 2024 08:56:07 -0400
Subject: [PATCH 1/2] [clang][ExprConst] allow single element access of vector
 object to be constant expression

---
 clang/docs/ReleaseNotes.rst   |   3 +
 clang/lib/AST/ExprConstant.cpp| 102 +-
 clang/lib/AST/Interp/State.h  |   3 +-
 clang/test/AST/Interp/builtin-functions.cpp   |  26 ++---
 clang/test/AST/Interp/vectors.cpp |  50 -
 clang/test/CodeGenCXX/temporaries.cpp |  41 ---
 .../constexpr-vectors-access-elements.cpp |  29 +
 7 files changed, 190 insertions(+), 64 deletions(-)
 create mode 100644 clang/test/SemaCXX/constexpr-vectors-access-elements.cpp

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ddad083571eb1..2179aaea12387 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -64,6 +64,9 @@ sections with improvements to Clang's support for those 
languages.
 
 C++ Language Changes
 
+- Allow single element access of vector object to be constant expression.
+  Supports the `V.xyzw` syntax and other tidbits as seen in OpenCL.
+  Selecting multiple elements is left as a future work.
 
 C++17 Feature Support
 ^
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 558e20ed3e423..08f49ac896153 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -222,6 +222,11 @@ namespace {
 ArraySize = 2;
 MostDerivedLength = I + 1;
 IsArray = true;
+  } else if (const auto *VT = Type->getAs<VectorType>()) {
+Type = VT->getElementType();
+ArraySize = VT->getNumElements();
+MostDerivedLength = I + 1;
+IsArray = true;
   } else if (const FieldDecl *FD = getAsField(Path[I])) {
 Type = FD->getType();
 ArraySize = 0;
@@ -268,7 +273,6 @@ namespace {
 /// If the current array is an unsized array, the value of this is
 /// undefined.
 uint64_t MostDerivedArraySize;
-
 /// The type of the most derived object referred to by this address.
 QualType MostDerivedType;
 
@@ -442,6 +446,16 @@ namespace {
   MostDerivedArraySize = 2;
   MostDerivedPathLength = Entries.size();
 }
+
+void addVectorElementUnchecked(QualType EltTy, uint64_t Size,
+   uint64_t Idx) {
+  Entries.push_back(PathEntry::ArrayIndex(Idx));
+  MostDerivedType = EltTy;
+  MostDerivedPathLength = Entries.size();
+  MostDerivedArraySize = 0;
+  MostDerivedIsArrayElement = false;
+}
+
 void diagnoseUnsizedArrayPointerArithmetic(EvalInfo &Info, const Expr *E);
 void diagnosePointerArithmetic(EvalInfo &Info, const Expr *E,
const APSInt &N);
@@ -1737,6 +1751,11 @@ namespace {
   if (checkSubobject(Info, E, Imag ? CSK_Imag : CSK_Real))
 Designator.addComplexUnchecked(EltTy, Imag);
 }
+void addVectorElement(EvalInfo &Info, const Expr *E, QualType EltTy,
+  uint64_t Size, uint64_t Idx) {
+  if (checkSubobject(Info, E, CSK_VectorElement))
+Designator.addVectorElementUnchecked(EltTy, Size, Idx);
+}
 void clearIsNullPointer() {
   IsNullPtr = false;
 }
@@ -3310,6 +3329,19 @@ static bool HandleLValueComplexElement(EvalInfo &Info, 
const Expr *E,
   return true;
 }
 
+static bool HandleLValueVectorElement(EvalInfo &Info, const Expr *E,
+  LValue &LVal, QualType EltTy,
+  uint64_t Size, uint64_t Idx) {
+  if (Idx) {
+CharUnits SizeOfElement;
+if (!HandleSizeof(Info, E->getExprLoc(), EltTy, SizeOfElement))
+  return false;
+LVal.Offset += SizeOfElement * Idx;
+  }
+  LVal.addVectorElement(Info, E, EltTy, Size, Idx);
+  return true;
+}
+
 /// Try to evaluate the initializer for a variable declaration.
 ///
 /// \param Info   Information about the ongoing evaluation.
@@ -3855,6 +3887,19 @@ findSubobject(EvalInfo &Info, const Expr *E, const 
CompleteObject &Obj,
 return handler.found(Index ? O->getComplexFloatImag()
: O->getComplexFloatReal(), ObjType);
   }
+} else if (const auto *VT = ObjType->getAs<VectorType>()) {
+  uint64_t Index = Sub.Entries[I].getAsArrayIndex();
+  if (Index >= VT->getNumElements()) {
+if (Info.getLangOpts().CPlusPlus11)
+  Info.FFDiag(E, diag::note_constexpr_access_past_end)
+  << handler.AccessKind;
+else
+  Info.FFDiag(E);
+return handler.failed();
+  }
+  ObjType = VT->getElementType();
+  assert(I == N - 1 && "extracting subobject of scalar?");
+  return handler.found(O->getVectorElt(Index), ObjType);
 } else if

[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-07-30 Thread Vikram Hegde via cfe-commits


@@ -3,40 +3,40 @@
 
 typedef int __attribute__((vector_size(16))) VI4;
 constexpr VI4 A = {1,2,3,4};
-static_assert(A[0] == 1, ""); // ref-error {{not an integral constant 
expression}}
-static_assert(A[1] == 2, ""); // ref-error {{not an integral constant 
expression}}
-static_assert(A[2] == 3, ""); // ref-error {{not an integral constant 
expression}}
-static_assert(A[3] == 4, ""); // ref-error {{not an integral constant 
expression}}
+static_assert(A[0] == 1, "");
+static_assert(A[1] == 2, "");
+static_assert(A[2] == 3, "");
+static_assert(A[3] == 4, "");
 
 
 /// FIXME: It would be nice if the note said 'vector' instead of 'array'.
-static_assert(A[12] == 4, ""); // ref-error {{not an integral constant 
expression}} \
-   // expected-error {{not an integral constant 
expression}} \
-   // expected-note {{cannot refer to element 12 
of array of 4 elements in a constant expression}}
+static_assert(A[12] == 4, ""); // both-error {{not an integral constant 
expression}} \
+   // expected-note {{cannot refer to element 12 
of array of 4 elements in a constant expression}} \
+   // ref-note {{read of dereferenced 
one-past-the-end pointer is not allowed in a constant expression}}

vikramRH wrote:

I just kept the original version of the PR, but the message "cannot refer to 
element 12 of array of 4 elements" seems correct here. I shall update this
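
For readers following the diagnostic discussion, here is a simplified standalone model of the two pieces of the patch involved: the byte-offset computation in `HandleLValueVectorElement` and the bounds check added to `findSubobject`. It is a sketch under stated assumptions, not the evaluator itself: `VectorValue`, `elementByteOffset` and `readVectorElement` are hypothetical names, and an empty optional stands in for `handler.failed()`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Stand-in for an evaluated vector constant such as {1, 2, 3, 4}.
struct VectorValue {
  std::vector<int> Elts;
};

// Mirrors `LVal.Offset += SizeOfElement * Idx` from the patch.
constexpr std::uint64_t elementByteOffset(std::uint64_t SizeOfElement,
                                          std::uint64_t Idx) {
  return SizeOfElement * Idx;
}

// Mirrors the `Index >= VT->getNumElements()` check: an out-of-range index
// is a failed evaluation, not a read past the end of the vector.
std::optional<int> readVectorElement(const VectorValue &V,
                                     std::uint64_t Index) {
  if (Index >= V.Elts.size())
    return std::nullopt; // corresponds to emitting the note and failing
  return V.Elts[Index];
}
```

In this model `A[12]` on a four-element vector fails at the bounds check, which is the situation the "cannot refer to element 12 of array of 4 elements" note describes.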

https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-07-30 Thread Vikram Hegde via cfe-commits


@@ -442,6 +446,16 @@ namespace {
   MostDerivedArraySize = 2;
   MostDerivedPathLength = Entries.size();
 }
+
+void addVectorElementUnchecked(QualType EltTy, uint64_t Size,
+   uint64_t Idx) {
+  Entries.push_back(PathEntry::ArrayIndex(Idx));

vikramRH wrote:

Yes, I thought about having a new accessor along the lines of "vectorIndex", but 
all it seems to achieve is adding a new API that does the exact same thing as 
the array one (other than perhaps giving the PathEntry value a new meaning). I 
will update it if you feel this makes sense.

https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-07-29 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH ready_for_review 
https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-07-29 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-07-29 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

* **#101126** https://app.graphite.dev/github/pr/llvm/llvm-project/101126?utm_source=stack-comment-icon 👈
* `main`

This stack of pull requests is managed by Graphite. Learn more about 
stacking: https://stacking.dev/?utm_source=stack-comment

Join @vikramRH and the rest of your teammates on Graphite: 
https://graphite.dev?utm-source=stack-comment

https://github.com/llvm/llvm-project/pull/101126


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #101126)

2024-07-29 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH created 
https://github.com/llvm/llvm-project/pull/101126

None

>From 690901f2370381285afa7cf7c2f7401d89e568f6 Mon Sep 17 00:00:00 2001
From: Vikram 
Date: Mon, 29 Jul 2024 08:56:07 -0400
Subject: [PATCH] [clang][ExprConst] allow single element access of vector
 object to be constant expression

---
 clang/docs/ReleaseNotes.rst   |   3 +
 clang/lib/AST/ExprConstant.cpp| 102 +-
 clang/lib/AST/Interp/State.h  |   3 +-
 clang/test/AST/Interp/builtin-functions.cpp   |  26 ++---
 clang/test/AST/Interp/vectors.cpp |  50 -
 clang/test/CodeGenCXX/temporaries.cpp |  41 ---
 .../constexpr-vectors-access-elements.cpp |  29 +
 7 files changed, 190 insertions(+), 64 deletions(-)
 create mode 100644 clang/test/SemaCXX/constexpr-vectors-access-elements.cpp

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ddad083571eb1..2179aaea12387 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -64,6 +64,9 @@ sections with improvements to Clang's support for those 
languages.
 
 C++ Language Changes
 
+- Allow single element access of vector object to be constant expression.
+  Supports the `V.xyzw` syntax and other tidbits as seen in OpenCL.
+  Selecting multiple elements is left as a future work.
 
 C++17 Feature Support
 ^
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 558e20ed3e423..08f49ac896153 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -222,6 +222,11 @@ namespace {
 ArraySize = 2;
 MostDerivedLength = I + 1;
 IsArray = true;
+  } else if (const auto *VT = Type->getAs<VectorType>()) {
+Type = VT->getElementType();
+ArraySize = VT->getNumElements();
+MostDerivedLength = I + 1;
+IsArray = true;
   } else if (const FieldDecl *FD = getAsField(Path[I])) {
 Type = FD->getType();
 ArraySize = 0;
@@ -268,7 +273,6 @@ namespace {
 /// If the current array is an unsized array, the value of this is
 /// undefined.
 uint64_t MostDerivedArraySize;
-
 /// The type of the most derived object referred to by this address.
 QualType MostDerivedType;
 
@@ -442,6 +446,16 @@ namespace {
   MostDerivedArraySize = 2;
   MostDerivedPathLength = Entries.size();
 }
+
+void addVectorElementUnchecked(QualType EltTy, uint64_t Size,
+   uint64_t Idx) {
+  Entries.push_back(PathEntry::ArrayIndex(Idx));
+  MostDerivedType = EltTy;
+  MostDerivedPathLength = Entries.size();
+  MostDerivedArraySize = 0;
+  MostDerivedIsArrayElement = false;
+}
+
 void diagnoseUnsizedArrayPointerArithmetic(EvalInfo &Info, const Expr *E);
 void diagnosePointerArithmetic(EvalInfo &Info, const Expr *E,
const APSInt &N);
@@ -1737,6 +1751,11 @@ namespace {
   if (checkSubobject(Info, E, Imag ? CSK_Imag : CSK_Real))
 Designator.addComplexUnchecked(EltTy, Imag);
 }
+void addVectorElement(EvalInfo &Info, const Expr *E, QualType EltTy,
+  uint64_t Size, uint64_t Idx) {
+  if (checkSubobject(Info, E, CSK_VectorElement))
+Designator.addVectorElementUnchecked(EltTy, Size, Idx);
+}
 void clearIsNullPointer() {
   IsNullPtr = false;
 }
@@ -3310,6 +3329,19 @@ static bool HandleLValueComplexElement(EvalInfo &Info, 
const Expr *E,
   return true;
 }
 
+static bool HandleLValueVectorElement(EvalInfo &Info, const Expr *E,
+  LValue &LVal, QualType EltTy,
+  uint64_t Size, uint64_t Idx) {
+  if (Idx) {
+CharUnits SizeOfElement;
+if (!HandleSizeof(Info, E->getExprLoc(), EltTy, SizeOfElement))
+  return false;
+LVal.Offset += SizeOfElement * Idx;
+  }
+  LVal.addVectorElement(Info, E, EltTy, Size, Idx);
+  return true;
+}
+
 /// Try to evaluate the initializer for a variable declaration.
 ///
 /// \param Info   Information about the ongoing evaluation.
@@ -3855,6 +3887,19 @@ findSubobject(EvalInfo &Info, const Expr *E, const 
CompleteObject &Obj,
 return handler.found(Index ? O->getComplexFloatImag()
: O->getComplexFloatReal(), ObjType);
   }
+} else if (const auto *VT = ObjType->getAs<VectorType>()) {
+  uint64_t Index = Sub.Entries[I].getAsArrayIndex();
+  if (Index >= VT->getNumElements()) {
+if (Info.getLangOpts().CPlusPlus11)
+  Info.FFDiag(E, diag::note_constexpr_access_past_end)
+  << handler.AccessKind;
+else
+  Info.FFDiag(E);
+return handler.failed();
+  }
+  ObjType = VT->getElementType();
+  assert(I == N - 1 && "extracting subobject of scalar?");
+  return handler.found(O->getVectorElt(Index), ObjType);
 } else 

[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-27 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

closing this in favour of https://github.com/llvm/llvm-project/pull/96933 and 
https://github.com/llvm/llvm-project/pull/96934

https://github.com/llvm/llvm-project/pull/96473


[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-27 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH closed 
https://github.com/llvm/llvm-project/pull/96473


[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-26 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Apologies for the commit spam here; Graphite seems like a good option from now 
on. All dependent patches have landed now, and the diff here is up to date.

https://github.com/llvm/llvm-project/pull/96473


[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-26 Thread Vikram Hegde via cfe-commits


@@ -228,10 +228,11 @@ void 
AMDGPUAtomicOptimizerImpl::visitAtomicRMWInst(AtomicRMWInst &I) {
 
   // If the value operand is divergent, each lane is contributing a different
   // value to the atomic calculation. We can only optimize divergent values if
-  // we have DPP available on our subtarget, and the atomic operation is 32
-  // bits.
+  // we have DPP available on our subtarget, and the atomic operation is either
+  // 32 or 64 bits.
   if (ValDivergent &&
-  (!ST->hasDPP() || DL->getTypeSizeInBits(I.getType()) != 32)) {
+  (!ST->hasDPP() || (DL->getTypeSizeInBits(I.getType()) != 32 &&
+  DL->getTypeSizeInBits(I.getType()) != 64))) {

vikramRH wrote:

Done
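
The condition in the hunk above can be read as a small predicate. A standalone sketch follows, assuming only what the diff shows; `isSupportedDivergentWidth` and `mustSkipDivergentAtomic` are hypothetical names and not part of AMDGPUAtomicOptimizer.

```cpp
#include <cassert>
#include <cstdint>

// Widths the optimizer handles for divergent values after this change.
constexpr bool isSupportedDivergentWidth(std::uint64_t Bits) {
  return Bits == 32 || Bits == 64;
}

// True when a divergent atomic has to be left alone: either DPP is missing
// or the operation width is neither 32 nor 64 bits. Uniform values are not
// affected by this check.
constexpr bool mustSkipDivergentAtomic(bool ValDivergent, bool HasDPP,
                                       std::uint64_t Bits) {
  return ValDivergent && (!HasDPP || !isSupportedDivergentWidth(Bits));
}

// `!(Bits == 32 || Bits == 64)` is the same test the diff spells out as
// `Bits != 32 && Bits != 64`.
static_assert(!mustSkipDivergentAtomic(true, true, 64),
              "64-bit divergent values are optimizable when DPP is present");
static_assert(mustSkipDivergentAtomic(true, true, 16),
              "other widths are still skipped");
```

Factoring the width test into one helper avoids calling `getTypeSizeInBits` twice in the real code, but that is a style choice and not something the thread settles.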

https://github.com/llvm/llvm-project/pull/96473


[clang] [llvm] [AMDGPU] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-06-25 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH closed 
https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-25 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH closed 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-24 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/96473


[clang] [llvm] [AMDGPU] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-06-23 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH ready_for_review 
https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-06-23 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/92725


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #72607)

2024-06-18 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> Hello @vikramRH, please feel free to commandeer this.

Thanks @yuanfang-chen. Also, clang already rejects expressions like &V[0] 
(https://godbolt.org/z/eGcxzGo66), which is also true for constexprs and with 
this PR. What's the specific concern here?

https://github.com/llvm/llvm-project/pull/72607


[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-06-17 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Updated this PR to be in sync with #89217. However, the plan is still to land 
this only after the changes in #89217 are accepted.

https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-06-17 Thread Vikram Hegde via cfe-commits


@@ -18479,6 +18479,28 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
         CGM.getIntrinsic(Intrinsic::amdgcn_update_dpp, Args[0]->getType());
     return Builder.CreateCall(F, Args);
   }
+  case AMDGPU::BI__builtin_amdgcn_permlane16:
+  case AMDGPU::BI__builtin_amdgcn_permlanex16: {
+    llvm::Value *Src0 = EmitScalarExpr(E->getArg(0));

vikramRH wrote:

added a new helper

https://github.com/llvm/llvm-project/pull/92725


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #72607)

2024-06-16 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

@yuanfang-chen, @AaronBallman, @shafik, are we still actively looking into 
this? (I would be willing to commandeer this if it's not high on your priority 
list.)

https://github.com/llvm/llvm-project/pull/72607


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-16 Thread Vikram Hegde via cfe-commits


@@ -0,0 +1,65 @@
+; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100 -verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s
+
+; CHECK-LABEL: name:basic_readfirstlane_i64
+;   CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL_ANCHOR

vikramRH wrote:

Makes sense, updated.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Vikram Hegde via cfe-commits


@@ -0,0 +1,65 @@
+; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100 -verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s
+
+; CHECK-LABEL: name:basic_readfirstlane_i64
+;   CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL_ANCHOR

vikramRH wrote:

This is a preexisting error, and the failure is further down the pipeline 
(after sreg alloc now, I guess). Does it make sense to mark it as XFAIL now 
rather than stopping after isel?

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Vikram Hegde via cfe-commits


@@ -0,0 +1,65 @@
+; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100 -verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s
+
+; CHECK-LABEL: name:basic_readfirstlane_i64
+;   CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL_ANCHOR

vikramRH wrote:

I currently see a machine verifier failure that is not related to this patch. 
An i32 example with trunc is here: https://godbolt.org/z/he8asMe77. 
This is also seen with the wider type legalizations that we do now, so I cannot 
integrate these with the existing tests just yet. Am I missing something here?

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> That's another option. The only real plus to the intermediate is it's 
> slightly less annoying to write combines for. But there are limited combining 
> opportunities for these

We now legalize to intrinsics directly. The SDAG lowering uses a new helper to 
unroll vector cases while also handling convergence tokens.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-12 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> > > > @jayfoad's testcase fails and the same test should be repeated for all 
> > > > 3 intrinsics
> > > 
> > > 
> > > added MIR tests for 3 intrinsics. The issue is that Im not able to attach 
> > > the glue nodes to newly created laneop pieces since they fail at 
> > > selection. #87509 should enable this,
> > 
> > 
> > I am not really comfortable waiting for #87509 to fix convergence tokens in 
> > this expansion. Is it really true that this expansion cannot be fixed 
> > independent of future work on `CONVERGENCE_GLUE`? There is no way to 
> > manually handle the same glue operands??
> 
> I guess one way would be to have custom selection for each of the new node 
> type introduced, but would this be a proper way forward ? (this would be in 
> general for all convergent SDNodes i guess if selection is not made generic)

Or drop the new nodes altogether and legalize to intrinsics directly?

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-12 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> > > @jayfoad's testcase fails and the same test should be repeated for all 3 
> > > intrinsics
> > 
> > 
> > added MIR tests for 3 intrinsics. The issue is that Im not able to attach 
> > the glue nodes to newly created laneop pieces since they fail at selection. 
> > #87509 should enable this,
> 
> I am not really comfortable waiting for #87509 to fix convergence tokens in 
> this expansion. Is it really true that this expansion cannot be fixed 
> independent of future work on `CONVERGENCE_GLUE`? There is no way to manually 
> handle the same glue operands??

I guess one way would be to have custom selection for each of the new node types 
introduced, but would this be a proper way forward? (This would apply in general 
to all convergent SDNodes, I guess, if selection is not made generic.)

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-12 Thread Vikram Hegde via cfe-commits


@@ -0,0 +1,46 @@
+# RUN: not --crash llc -mtriple=amdgcn -run-pass=none -verify-machineinstrs -o /dev/null %s 2>&1 | FileCheck %s

vikramRH wrote:

Okay, I'll update with IR's

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-12 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> @jayfoad's testcase fails and the same test should be repeated for all 3 
> intrinsics

Added MIR tests for the 3 intrinsics. The issue is that I'm not able to attach 
the glue nodes to the newly created laneop pieces since they fail at selection. 
https://github.com/llvm/llvm-project/pull/87509 should enable this.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-03 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> You should add the mentioned convergence-tokens.ll test function

Added the test in a separate test file

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-31 Thread Vikram Hegde via cfe-commits


@@ -5496,6 +5496,9 @@ const char* AMDGPUTargetLowering::getTargetNodeName(unsigned Opcode) const {
   NODE_NAME_CASE(LDS)
   NODE_NAME_CASE(FPTRUNC_ROUND_UPWARD)
   NODE_NAME_CASE(FPTRUNC_ROUND_DOWNWARD)
+  NODE_NAME_CASE(READLANE)
+  NODE_NAME_CASE(READFIRSTLANE)

vikramRH wrote:

done

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Vikram Hegde via cfe-commits


@@ -5461,8 +5461,7 @@ bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
 
   SmallVector PartialRes;
   unsigned NumParts = Size / 32;
-  MachineInstrBuilder Src0Parts, Src2Parts;
-  Src0Parts = B.buildUnmerge(PartialResTy, Src0);
+  MachineInstrBuilder Src0Parts = B.buildUnmerge(PartialResTy, Src0), Src2Parts;

vikramRH wrote:

Done

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Vikram Hegde via cfe-commits


@@ -1170,6 +1170,23 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
 
   :ref:`llvm.set.fpenv`                      Sets the floating point environment to the specifies state.
 
+  llvm.amdgcn.readfirstlane                  Provides direct access to v_readfirstlane_b32. Returns the value in
+                                             the lowest active lane of the input operand. Currently implemented
+                                             for i16, i32, float, half, bf16, <2 x i16>, <2 x half>, <2 x bfloat>,

vikramRH wrote:

done

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Vikram Hegde via cfe-commits


@@ -6086,6 +6086,63 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering &TLI, SDNode *N,
       DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE));
 }
 
+static SDValue lowerLaneOp(const SITargetLowering &TLI, SDNode *N,
+                           SelectionDAG &DAG) {
+  EVT VT = N->getValueType(0);
+  unsigned ValSize = VT.getSizeInBits();
+  unsigned IntrinsicID = N->getConstantOperandVal(0);
+  SDValue Src0 = N->getOperand(1);
+  SDLoc SL(N);
+  MVT IntVT = MVT::getIntegerVT(ValSize);
+
+  auto createLaneOp = [&DAG, &SL](SDValue Src0, SDValue Src1, SDValue Src2,
+                                  MVT VT) -> SDValue {
+    return (Src2 ? DAG.getNode(AMDGPUISD::WRITELANE, SL, VT, {Src0, Src1, Src2})
+            : Src1 ? DAG.getNode(AMDGPUISD::READLANE, SL, VT, {Src0, Src1})
+                   : DAG.getNode(AMDGPUISD::READFIRSTLANE, SL, VT, {Src0}));
+  };
+
+  SDValue Src1, Src2;
+  if (IntrinsicID == Intrinsic::amdgcn_readlane ||
+      IntrinsicID == Intrinsic::amdgcn_writelane) {
+    Src1 = N->getOperand(2);
+    if (IntrinsicID == Intrinsic::amdgcn_writelane)
+      Src2 = N->getOperand(3);
+  }
+
+  if (ValSize == 32) {
+    // Already legal
+    return SDValue();
+  }
+
+  if (ValSize < 32) {
+    bool IsFloat = VT.isFloatingPoint();
+    Src0 = DAG.getAnyExtOrTrunc(IsFloat ? DAG.getBitcast(IntVT, Src0) : Src0,
+                                SL, MVT::i32);
+    if (Src2.getNode()) {
+      Src2 = DAG.getAnyExtOrTrunc(IsFloat ? DAG.getBitcast(IntVT, Src2) : Src2,
+                                  SL, MVT::i32);
+    }
+    SDValue LaneOp = createLaneOp(Src0, Src1, Src2, MVT::i32);
+    SDValue Trunc = DAG.getAnyExtOrTrunc(LaneOp, SL, IntVT);
+    return IsFloat ? DAG.getBitcast(VT, Trunc) : Trunc;
+  }
+
+  if ((ValSize % 32) == 0) {
+    MVT VecVT = MVT::getVectorVT(MVT::i32, ValSize / 32);

vikramRH wrote:

Updated

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Vikram Hegde via cfe-commits


@@ -1170,6 +1170,23 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
 
   :ref:`llvm.set.fpenv`                      Sets the floating point environment to the specifies state.
 
+  llvm.amdgcn.readfirstlane                  Provides direct access to v_readfirstlane_b32. Returns the value in
+                                             the lowest active lane of the input operand. Currently
+                                             implemented for i16, i32, float, half, bf16, v2i16, v2f16 and types
+                                             whose sizes are multiples of 32-bit.
+
+  llvm.amdgcn.readlane                       Provides direct access to v_readlane_b32. Returns the value in the
+                                             specified lane of the first input operand. The second operand
+                                             specifies the lane to read from. Currently implemented
+                                             for i16, i32, float, half, bf16, v2i16, v2f16 and types whose sizes

vikramRH wrote:

Updated

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Vikram Hegde via cfe-commits


@@ -5387,6 +5387,124 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+// TODO: Fix pointer type handling
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  auto createLaneOp = [&](Register Src0, Register Src1,
+                          Register Src2) -> Register {
+    auto LaneOp = B.buildIntrinsic(IID, {S32}).addUse(Src0);
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane:
+      return LaneOp.getReg(0);
+    case Intrinsic::amdgcn_readlane:
+      return LaneOp.addUse(Src1).getReg(0);
+    case Intrinsic::amdgcn_writelane:
+      return LaneOp.addUse(Src1).addUse(Src2).getReg(0);
+    default:
+      llvm_unreachable("unhandled lane op");
+    }
+  };
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    // Already legal
+    return true;
+  }
+
+  if (Size < 32) {
+    Src0 = B.buildAnyExt(S32, Src0).getReg(0);
+    if (Src2.isValid())
+      Src2 = B.buildAnyExt(LLT::scalar(32), Src2).getReg(0);
+
+    Register LaneOpDst = createLaneOp(Src0, Src1, Src2);
+    B.buildTrunc(DstReg, LaneOpDst);
+
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if ((Size % 32) == 0) {
+    SmallVector PartialRes;
+    unsigned NumParts = Size / 32;
+    LLT PartialResTy =
+        Ty.isVector() && Ty.getElementType() == S16 ? V2S16 : S32;

vikramRH wrote:

Done

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Vikram Hegde via cfe-commits


@@ -6086,6 +6086,63 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering &TLI, SDNode *N,
       DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE));
 }
 
+static SDValue lowerLaneOp(const SITargetLowering &TLI, SDNode *N,
+                           SelectionDAG &DAG) {
+  EVT VT = N->getValueType(0);
+  unsigned ValSize = VT.getSizeInBits();
+  unsigned IntrinsicID = N->getConstantOperandVal(0);
+  SDValue Src0 = N->getOperand(1);
+  SDLoc SL(N);
+  MVT IntVT = MVT::getIntegerVT(ValSize);
+
+  auto createLaneOp = [&DAG, &SL](SDValue Src0, SDValue Src1, SDValue Src2,
+                                  MVT VT) -> SDValue {
+    return (Src2 ? DAG.getNode(AMDGPUISD::WRITELANE, SL, VT, {Src0, Src1, Src2})
+            : Src1 ? DAG.getNode(AMDGPUISD::READLANE, SL, VT, {Src0, Src1})
+                   : DAG.getNode(AMDGPUISD::READFIRSTLANE, SL, VT, {Src0}));
+  };
+
+  SDValue Src1, Src2;
+  if (IntrinsicID == Intrinsic::amdgcn_readlane ||
+      IntrinsicID == Intrinsic::amdgcn_writelane) {
+    Src1 = N->getOperand(2);
+    if (IntrinsicID == Intrinsic::amdgcn_writelane)
+      Src2 = N->getOperand(3);
+  }
+
+  if (ValSize == 32) {
+    // Already legal
+    return SDValue();
+  }
+
+  if (ValSize < 32) {
+    bool IsFloat = VT.isFloatingPoint();
+    Src0 = DAG.getAnyExtOrTrunc(IsFloat ? DAG.getBitcast(IntVT, Src0) : Src0,
+                                SL, MVT::i32);
+    if (Src2.getNode()) {
+      Src2 = DAG.getAnyExtOrTrunc(IsFloat ? DAG.getBitcast(IntVT, Src2) : Src2,
+                                  SL, MVT::i32);
+    }
+    SDValue LaneOp = createLaneOp(Src0, Src1, Src2, MVT::i32);
+    SDValue Trunc = DAG.getAnyExtOrTrunc(LaneOp, SL, IntVT);
+    return IsFloat ? DAG.getBitcast(VT, Trunc) : Trunc;
+  }
+
+  if ((ValSize % 32) == 0) {
+    MVT VecVT = MVT::getVectorVT(MVT::i32, ValSize / 32);

vikramRH wrote:

Understood. Thanks !

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-05-29 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

1. Added/updated tests for permlanex16, permlane64
2. This needs https://github.com/llvm/llvm-project/pull/89217 to land first so 
that only incremental changes can be reviewed. 

https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-05-29 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-28 Thread Vikram Hegde via cfe-commits


@@ -6086,6 +6086,63 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering &TLI, SDNode *N,
       DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE));
 }
 
+static SDValue lowerLaneOp(const SITargetLowering &TLI, SDNode *N,
+                           SelectionDAG &DAG) {
+  EVT VT = N->getValueType(0);
+  unsigned ValSize = VT.getSizeInBits();
+  unsigned IntrinsicID = N->getConstantOperandVal(0);
+  SDValue Src0 = N->getOperand(1);
+  SDLoc SL(N);
+  MVT IntVT = MVT::getIntegerVT(ValSize);
+
+  auto createLaneOp = [&DAG, &SL](SDValue Src0, SDValue Src1, SDValue Src2,
+                                  MVT VT) -> SDValue {
+    return (Src2 ? DAG.getNode(AMDGPUISD::WRITELANE, SL, VT, {Src0, Src1, Src2})
+            : Src1 ? DAG.getNode(AMDGPUISD::READLANE, SL, VT, {Src0, Src1})
+                   : DAG.getNode(AMDGPUISD::READFIRSTLANE, SL, VT, {Src0}));
+  };
+
+  SDValue Src1, Src2;
+  if (IntrinsicID == Intrinsic::amdgcn_readlane ||
+      IntrinsicID == Intrinsic::amdgcn_writelane) {
+    Src1 = N->getOperand(2);
+    if (IntrinsicID == Intrinsic::amdgcn_writelane)
+      Src2 = N->getOperand(3);
+  }
+
+  if (ValSize == 32) {
+    // Already legal
+    return SDValue();
+  }
+
+  if (ValSize < 32) {
+    bool IsFloat = VT.isFloatingPoint();
+    Src0 = DAG.getAnyExtOrTrunc(IsFloat ? DAG.getBitcast(IntVT, Src0) : Src0,
+                                SL, MVT::i32);
+    if (Src2.getNode()) {
+      Src2 = DAG.getAnyExtOrTrunc(IsFloat ? DAG.getBitcast(IntVT, Src2) : Src2,
+                                  SL, MVT::i32);
+    }
+    SDValue LaneOp = createLaneOp(Src0, Src1, Src2, MVT::i32);
+    SDValue Trunc = DAG.getAnyExtOrTrunc(LaneOp, SL, IntVT);
+    return IsFloat ? DAG.getBitcast(VT, Trunc) : Trunc;
+  }
+
+  if ((ValSize % 32) == 0) {
+    MVT VecVT = MVT::getVectorVT(MVT::i32, ValSize / 32);
+    Src0 = DAG.getBitcast(VecVT, Src0);
+
+    if (Src2.getNode())
+      Src2 = DAG.getBitcast(VecVT, Src2);
+
+    SDValue LaneOp = createLaneOp(Src0, Src1, Src2, VecVT);
+    SDValue UnrolledLaneOp = DAG.UnrollVectorOp(LaneOp.getNode());
+    return DAG.getBitcast(VT, UnrolledLaneOp);

vikramRH wrote:

```suggestion
    MVT LaneOpT =
        VT.isVector() && VT.getVectorElementType().getSizeInBits() == 16
            ? MVT::v2i16
            : MVT::i32;
    SDValue Src0SubReg, Src2SubReg;
    SmallVector LaneOps;
    LaneOps.push_back(DAG.getTargetConstant(
        TLI.getRegClassFor(VT.getSimpleVT(), N->isDivergent())->getID(), SL,
        MVT::i32));
    for (unsigned i = 0; i < (ValSize / 32); i++) {
      unsigned SubRegIdx = SIRegisterInfo::getSubRegFromChannel(i);
      Src0SubReg = DAG.getTargetExtractSubreg(SubRegIdx, SL, LaneOpT, Src0);
      if (Src2)
        Src2SubReg = DAG.getTargetExtractSubreg(SubRegIdx, SL, LaneOpT, Src2);
      LaneOps.push_back(createLaneOp(Src0SubReg, Src1, Src2SubReg, LaneOpT));
      LaneOps.push_back(DAG.getTargetConstant(SubRegIdx, SL, MVT::i32));
    }
    return SDValue(
        DAG.getMachineNode(TargetOpcode::REG_SEQUENCE, SL, VT, LaneOps), 0);
```

@arsenm, @jayfoad, here is an alternate idea that is much closer in logic to the 
GISel implementation and doesn't rely on bitcasts. How does this look?

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-05-26 Thread Vikram Hegde via cfe-commits


@@ -5433,7 +5450,16 @@ bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
                             ? Src0
                             : B.buildBitcast(LLT::scalar(Size), Src0).getReg(0);
     Src0 = B.buildAnyExt(S32, Src0Cast).getReg(0);
-    if (Src2.isValid()) {
+
+    if (IsPermLane16) {
+      Register Src1Cast =
+          MRI.getType(Src1).isScalar()
+              ? Src1
+              : B.buildBitcast(LLT::scalar(Size), Src2).getReg(0);

vikramRH wrote:

Yes, I will take over the changes from 
https://github.com/llvm/llvm-project/pull/89217 once they are finalized.

https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-05-26 Thread Vikram Hegde via cfe-commits


@@ -18479,6 +18479,25 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
         CGM.getIntrinsic(Intrinsic::amdgcn_update_dpp, Args[0]->getType());
     return Builder.CreateCall(F, Args);
   }
+  case AMDGPU::BI__builtin_amdgcn_permlane16:
+  case AMDGPU::BI__builtin_amdgcn_permlanex16: {
+    Intrinsic::ID IID;
+    IID = BuiltinID == AMDGPU::BI__builtin_amdgcn_permlane16
+              ? Intrinsic::amdgcn_permlane16
+              : Intrinsic::amdgcn_permlanex16;
+
+    llvm::Value *Src0 = EmitScalarExpr(E->getArg(0));
+    llvm::Value *Src1 = EmitScalarExpr(E->getArg(1));
+    llvm::Value *Src2 = EmitScalarExpr(E->getArg(2));

vikramRH wrote:

yes

https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-26 Thread Vikram Hegde via cfe-commits


@@ -5456,43 +5444,32 @@ bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
   if ((Size % 32) == 0) {
     SmallVector PartialRes;
     unsigned NumParts = Size / 32;
-    auto IsS16Vec = Ty.isVector() && Ty.getElementType() == S16;
+    bool IsS16Vec = Ty.isVector() && Ty.getElementType() == S16;

vikramRH wrote:

done

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> > 1. What's the proper way to legalize f16 and bf16 for SDAG case without 
> > bitcasts ? (I would think  "fp_extend -> LaneOp -> Fptrunc" is wrong)
> 
> Bitcast to i16, anyext to i32, laneop, trunc to i16, bitcast to original type.
> 
> Why wouldn't you use bitcasts?

Just a doubt I had on previous comments, sorry for the noise !
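
For reference, the sequence quoted above (bitcast to i16, anyext to i32, lane op, trunc to i16, bitcast back) can be modeled in plain C++ without any LLVM dependency. This is only a sketch under stated assumptions: `Wave`, `readlane32` and `readlane16` are hypothetical names, the wave size of 32 is assumed, and the lane op is simulated on an array rather than executed on hardware. What it tries to show is that the 16-bit value travels as a raw bit pattern the whole way.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model: one register value per lane, 32 lanes per wave.
constexpr unsigned kWaveSize = 32;
template <typename T> using Wave = std::array<T, kWaveSize>;

// Stand-in for the natively supported 32-bit readlane (simulation only).
inline uint32_t readlane32(const Wave<uint32_t> &Src, unsigned Lane) {
  assert(Lane < kWaveSize && "lane index out of range");
  return Src[Lane];
}

// 16-bit readlane expressed via the 32-bit one. The inputs are raw 16-bit
// patterns, i.e. what a half/bf16 value looks like after a bitcast to i16.
inline uint16_t readlane16(const Wave<uint16_t> &Src, unsigned Lane) {
  Wave<uint32_t> Wide{};
  for (unsigned I = 0; I < kWaveSize; ++I)
    Wide[I] = Src[I]; // anyext: the upper 16 bits are don't-care (zero here)
  uint32_t Res = readlane32(Wide, Lane); // 32-bit lane op
  return static_cast<uint16_t>(Res);     // trunc back to 16 bits
}
```

In this model a lane holding the half-precision pattern 0x7D00 (a signaling NaN encoding) comes back as exactly 0x7D00, whereas a round trip through value conversions is not guaranteed to keep every bit pattern intact.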

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Updated the GISel legalizer. I still have a couple of questions for the SDAG 
case, though:
1. What's the proper way to legalize f16 and bf16 for the SDAG case without 
bitcasts? (I would think "fp_extend -> LaneOp -> Fptrunc" is wrong.)
2. For scalar cases such as i64, f64, i128, ... (i.e. 32-bit multiples), I guess 
a bitcast to vectors (v2i32, v2f32, v4i32) is unavoidable, since 
"UnrollVectorOp" wouldn't work otherwise. Any alternative suggestions here?
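
To make question 2 concrete, here is a dependency-free C++ sketch of the splitting idea for a 64-bit writelane. It is a simulation under stated assumptions, not the code in this PR: `Wave`, `writelane32` and `writelane64` are hypothetical names, the wave size is assumed to be 32, and writelane is modeled as "replace one lane of the old value, keep the others". Both the value being written and the old value are split into 32-bit halves, which mirrors why the lowering has to split Src2 as well as Src0.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model: one register value per lane, 32 lanes per wave.
constexpr unsigned kWaveSize = 32;
template <typename T> using Wave = std::array<T, kWaveSize>;

// Stand-in for the native 32-bit writelane: lane `Lane` receives `Val`,
// every other lane keeps its value from `Old`.
inline Wave<uint32_t> writelane32(uint32_t Val, unsigned Lane,
                                  const Wave<uint32_t> &Old) {
  assert(Lane < kWaveSize && "lane index out of range");
  Wave<uint32_t> Res = Old;
  Res[Lane] = Val;
  return Res;
}

// 64-bit writelane built from two 32-bit operations on the low/high halves.
inline Wave<uint64_t> writelane64(uint64_t Val, unsigned Lane,
                                  const Wave<uint64_t> &Old) {
  Wave<uint32_t> OldLo{}, OldHi{};
  for (unsigned I = 0; I < kWaveSize; ++I) {
    OldLo[I] = static_cast<uint32_t>(Old[I]);       // low 32 bits
    OldHi[I] = static_cast<uint32_t>(Old[I] >> 32); // high 32 bits
  }
  Wave<uint32_t> Lo = writelane32(static_cast<uint32_t>(Val), Lane, OldLo);
  Wave<uint32_t> Hi =
      writelane32(static_cast<uint32_t>(Val >> 32), Lane, OldHi);

  // Merge the two partial results back into 64-bit lanes.
  Wave<uint64_t> Res{};
  for (unsigned I = 0; I < kWaveSize; ++I)
    Res[I] = (static_cast<uint64_t>(Hi[I]) << 32) | Lo[I];
  return Res;
}
```

The same shape extends to any size that is a multiple of 32 bits: the number of pieces is simply the size divided by 32, and each piece goes through one 32-bit operation.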

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Vikram Hegde via cfe-commits


@@ -5387,6 +5387,192 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  auto createLaneOp = [&](Register Src0, Register Src1,
+                          Register Src2) -> Register {
+    auto LaneOp = B.buildIntrinsic(IID, {S32}).addUse(Src0);
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane:
+      return LaneOp.getReg(0);
+    case Intrinsic::amdgcn_readlane:
+      return LaneOp.addUse(Src1).getReg(0);
+    case Intrinsic::amdgcn_writelane:
+      return LaneOp.addUse(Src1).addUse(Src2).getReg(0);
+    default:
+      llvm_unreachable("unhandled lane op");
+    }
+  };
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    // Already legal
+    return true;
+  }
+
+  if (Size < 32) {
+    Register Src0Cast = MRI.getType(Src0).isScalar()
+                            ? Src0
+                            : B.buildBitcast(LLT::scalar(Size), Src0).getReg(0);
+    Src0 = B.buildAnyExt(S32, Src0Cast).getReg(0);
+    if (Src2.isValid()) {
+      Register Src2Cast =
+          MRI.getType(Src2).isScalar()
+              ? Src2
+              : B.buildBitcast(LLT::scalar(Size), Src2).getReg(0);
+      Src2 = B.buildAnyExt(LLT::scalar(32), Src2Cast).getReg(0);
+    }
+
+    Register LaneOpDst = createLaneOp(Src0, Src1, Src2);
+    if (Ty.isScalar())
+      B.buildTrunc(DstReg, LaneOpDst);
+    else {
+      auto Trunc = B.buildTrunc(LLT::scalar(Size), LaneOpDst);
+      B.buildBitcast(DstReg, Trunc);
+    }
+
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if ((Size % 32) == 0) {
+    SmallVector PartialRes;
+    unsigned NumParts = Size / 32;
+    auto IsS16Vec = Ty.isVector() && Ty.getElementType() == S16;
+    MachineInstrBuilder Src0Parts;
+
+    if (Ty.isPointer()) {
+      auto PtrToInt = B.buildPtrToInt(LLT::scalar(Size), Src0);
+      Src0Parts = B.buildUnmerge(S32, PtrToInt);
+    } else if (Ty.isPointerVector()) {
+      LLT IntVecTy = Ty.changeElementType(
+          LLT::scalar(Ty.getElementType().getSizeInBits()));
+      auto PtrToInt = B.buildPtrToInt(IntVecTy, Src0);
+      Src0Parts = B.buildUnmerge(S32, PtrToInt);
+    } else
+      Src0Parts =
+          IsS16Vec ? B.buildUnmerge(V2S16, Src0) : B.buildUnmerge(S32, Src0);
+
+    switch (IID) {
+    case Intrinsic::amdgcn_readlane: {
+      Register Src1 = MI.getOperand(3).getReg();
+      for (unsigned i = 0; i < NumParts; ++i) {
+        Src0 = IsS16Vec ? B.buildBitcast(S32, Src0Parts.getReg(i)).getReg(0)
+                        : Src0Parts.getReg(i);
+        PartialRes.push_back(
+            (B.buildIntrinsic(Intrinsic::amdgcn_readlane, {S32})
+                 .addUse(Src0)
+                 .addUse(Src1))
+                .getReg(0));
+      }
+      break;
+    }
+    case Intrinsic::amdgcn_readfirstlane: {
+      for (unsigned i = 0; i < NumParts; ++i) {
+        Src0 = IsS16Vec ? B.buildBitcast(S32, Src0Parts.getReg(i)).getReg(0)
+                        : Src0Parts.getReg(i);
+        PartialRes.push_back(
+            (B.buildIntrinsic(Intrinsic::amdgcn_readfirstlane, {S32})
+                 .addUse(Src0)
+                 .getReg(0)));
+      }
+
+      break;
+    }
+    case Intrinsic::amdgcn_writelane: {
+      Register Src1 = MI.getOperand(3).getReg();
+      Register Src2 = MI.getOperand(4).getReg();
+      MachineInstrBuilder Src2Parts;
+
+      if (Ty.isPointer()) {
+        auto PtrToInt = B.buildPtrToInt(S64, Src2);
+        Src2Parts = B.buildUnmerge(S32, PtrToInt);
+      } else if (Ty.isPointerVector()) {
+        LLT IntVecTy = Ty.changeElementType(
+            LLT::scalar(Ty.getElementType().getSizeInBits()));
+        auto PtrToInt = B.buildPtrToInt(IntVecTy, Src2);
+        Src2Parts = B.buildUnmerge(S32, PtrToInt);
+      } else
+        Src2Parts =
+            IsS16Vec ? B.buildUnmerge(V2S16, Src2) : B.buildUnmerge(S32, Src2);

vikramRH wrote:

done

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Vikram Hegde via cfe-commits


@@ -6086,6 +6086,62 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering &TLI, SDNode *N,
       DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE));
 }
 
+static SDValue lowerLaneOp(const SITargetLowering &TLI, SDNode *N,
+                           SelectionDAG &DAG) {
+  EVT VT = N->getValueType(0);
+  unsigned ValSize = VT.getSizeInBits();
+  unsigned IntrinsicID = N->getConstantOperandVal(0);
+  SDValue Src0 = N->getOperand(1);
+  SDLoc SL(N);
+  MVT IntVT = MVT::getIntegerVT(ValSize);
+
+  auto createLaneOp = [&DAG, &SL](SDValue Src0, SDValue Src1, SDValue Src2,
+                                  MVT VT) -> SDValue {
+    return (Src2 ? DAG.getNode(AMDGPUISD::WRITELANE, SL, VT, {Src0, Src1, Src2})
+            : Src1 ? DAG.getNode(AMDGPUISD::READLANE, SL, VT, {Src0, Src1})
+                   : DAG.getNode(AMDGPUISD::READFIRSTLANE, SL, VT, {Src0}));
+  };
+
+  SDValue Src1, Src2;
+  if (IntrinsicID == Intrinsic::amdgcn_readlane ||
+      IntrinsicID == Intrinsic::amdgcn_writelane) {
+    Src1 = N->getOperand(2);
+    if (IntrinsicID == Intrinsic::amdgcn_writelane)
+      Src2 = N->getOperand(3);
+  }
+
+  if (ValSize == 32) {
+    // Already legal
+    return SDValue();
+  }
+
+  if (ValSize < 32) {
+    SDValue InitBitCast = DAG.getBitcast(IntVT, Src0);
+    Src0 = DAG.getAnyExtOrTrunc(InitBitCast, SL, MVT::i32);
+    if (Src2.getNode()) {
+      SDValue Src2Cast = DAG.getBitcast(IntVT, Src2);

vikramRH wrote:

What would be the proper way to legalize f16 and bf16 for the SDAG case without
bitcasts? (I'm currently thinking "fp_extend -> LaneOp -> fptrunc", which seems
wrong.)
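
As a rough illustration of why a bit-level route is attractive here: a lane operation only moves bits between lanes, so widening the raw 16-bit pattern into a 32-bit container and truncating afterwards is lossless, whereas an fp extend/truncate pair is a value conversion and is not guaranteed to keep every bit pattern intact (signaling NaNs, for instance, may be quieted). The sketch below is plain C++ with no LLVM APIs; `laneOp32` and `laneOp16ViaBits` are hypothetical names that only model the data flow.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 32-bit lane operation: it only moves bits.
inline uint32_t laneOp32(uint32_t Bits) { return Bits; }

// Models "bitcast to i16 -> any-extend to i32 -> lane op -> truncate".
// The upper 16 bits are "don't care" after an any-extend, so they are filled
// with an arbitrary pattern to show that the result does not depend on them.
inline uint16_t laneOp16ViaBits(uint16_t HalfBits, uint16_t Junk = 0xABCD) {
  uint32_t Wide = (static_cast<uint32_t>(Junk) << 16) | HalfBits;
  return static_cast<uint16_t>(laneOp32(Wide) & 0xFFFFu);
}

// Returns true if every 16-bit pattern (NaN payloads included) survives.
inline bool allHalfPatternsRoundTrip() {
  for (uint32_t V = 0; V <= 0xFFFFu; ++V)
    if (laneOp16ViaBits(static_cast<uint16_t>(V)) != V)
      return false;
  return true;
}
```

Because the round trip never interprets the payload as a floating-point value, the same path would serve f16 and bf16 alike.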

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-05-20 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/92725


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-18 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-18 Thread Vikram Hegde via cfe-commits


@@ -243,11 +243,16 @@ def VOP_READFIRSTLANE : VOPProfile <[i32, i32, untyped, untyped]> {
 // FIXME: Specify SchedRW for READFIRSTLANE_B32
 // TODO: There is VOP3 encoding also
 def V_READFIRSTLANE_B32 : VOP1_Pseudo <"v_readfirstlane_b32", VOP_READFIRSTLANE,
-   getVOP1Pat.ret, 1> {
+   [], 1> {
   let isConvergent = 1;
 }
 
+foreach vt = Reg32Types.types in {
+  def : GCNPat<(vt (AMDGPUreadfirstlane (vt VRegOrLdsSrc_32:$src0))),
+    (V_READFIRSTLANE_B32 (vt VRegOrLdsSrc_32:$src0))

vikramRH wrote:

Done

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-16 Thread Vikram Hegde via cfe-commits


@@ -243,11 +243,16 @@ def VOP_READFIRSTLANE : VOPProfile <[i32, i32, untyped, untyped]> {
 // FIXME: Specify SchedRW for READFIRSTLANE_B32
 // TODO: There is VOP3 encoding also
 def V_READFIRSTLANE_B32 : VOP1_Pseudo <"v_readfirstlane_b32", VOP_READFIRSTLANE,
-   getVOP1Pat.ret, 1> {
+   [], 1> {
   let isConvergent = 1;
 }
 
+foreach vt = Reg32Types.types in {
+  def : GCNPat<(vt (AMDGPUreadfirstlane (vt VRegOrLdsSrc_32:$src0))),
+    (V_READFIRSTLANE_B32 (vt VRegOrLdsSrc_32:$src0))

vikramRH wrote:

Do you think these changes are okay until I figure out the root cause?

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-16 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-16 Thread Vikram Hegde via cfe-commits


@@ -342,6 +342,22 @@ def AMDGPUfdot2_impl : SDNode<"AMDGPUISD::FDOT2",
 
 def AMDGPUperm_impl : SDNode<"AMDGPUISD::PERM", AMDGPUDTIntTernaryOp, []>;
 
+def AMDGPUReadfirstlaneOp : SDTypeProfile<1, 1, [
+  SDTCisSameAs<0, 1>
+]>;
+
+def AMDGPUReadlaneOp : SDTypeProfile<1, 2, [
+  SDTCisSameAs<0, 1>, SDTCisInt<2>
+]>;
+
+def AMDGPUDWritelaneOp : SDTypeProfile<1, 3, [
+  SDTCisSameAs<1, 1>, SDTCisInt<2>, SDTCisSameAs<0, 3>,

vikramRH wrote:

Thanks for pointing this out; I missed updating this in the latest version. It
is updated now; however, the issue is not related to this.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-16 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-16 Thread Vikram Hegde via cfe-commits


@@ -243,11 +243,16 @@ def VOP_READFIRSTLANE : VOPProfile <[i32, i32, untyped, untyped]> {
 // FIXME: Specify SchedRW for READFIRSTLANE_B32
 // TODO: There is VOP3 encoding also
 def V_READFIRSTLANE_B32 : VOP1_Pseudo <"v_readfirstlane_b32", VOP_READFIRSTLANE,
-   getVOP1Pat.ret, 1> {
+   [], 1> {
   let isConvergent = 1;
 }
 
+foreach vt = Reg32Types.types in {
+  def : GCNPat<(vt (AMDGPUreadfirstlane (vt VRegOrLdsSrc_32:$src0))),
+    (V_READFIRSTLANE_B32 (vt VRegOrLdsSrc_32:$src0))

vikramRH wrote:

Attaching example match table snippets for v2i16 and p3 here; they should make
the scenario a bit clearer.

For v2i16:
```
GIM_Try, /*On fail goto*//*Label 3499*/ GIMT_Encode4(202699), // Rule ID 2117 //
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode2(Intrinsic::amdgcn_writelane),
GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_v2s16,
GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_v2s16,
GIM_RootCheckType, /*Op*/3, /*Type*/GILLT_s32,
GIM_RootCheckType, /*Op*/4, /*Type*/GILLT_v2s16,
GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(AMDGPU::VGPR_32RegClassID),
// (intrinsic_wo_chain:{ *:[v2i16] } 2863:{ *:[iPTR] }, v2i16:{ *:[v2i16] }:$src0, i32:{ *:[i32] }:$src1, v2i16:{ *:[v2i16] }:$src2)  =>  (V_WRITELANE_B32:{ *:[v2i16] } SCSrc_b32:{ *:[v2i16] }:$src0, SCSrc_b32:{ *:[i32] }:$src1, VGPR_32:{ *:[v2i16] }:$src2)
GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(AMDGPU::V_WRITELANE_B32),
```

And for p3:
```
GIM_Try, /*On fail goto*//*Label 3502*/ GIMT_Encode4(202816), // Rule ID 2129 //
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode2(Intrinsic::amdgcn_writelane),
GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32,
GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_p2s32,
GIM_RootCheckType, /*Op*/3, /*Type*/GILLT_s32,
GIM_RootCheckType, /*Op*/4, /*Type*/GILLT_p2s32,
GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(AMDGPU::VGPR_32RegClassID),
// (intrinsic_wo_chain:{ *:[i32] } 2863:{ *:[iPTR] }, p2:{ *:[i32] }:$src0, i32:{ *:[i32] }:$src1, p2:{ *:[i32] }:$src2)  =>  (V_WRITELANE_B32:{ *:[i32] } SCSrc_b32:{ *:[i32] }:$src0, SCSrc_b32:{ *:[i32] }:$src1, VGPR_32:{ *:[i32] }:$src2)
GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(AMDGPU::V_WRITELANE_B32),
```

The destination type check (Op 0) for the p3 case is still against "GILLT_s32",
whereas the source operands are checked against a pointer type.



https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-16 Thread Vikram Hegde via cfe-commits


@@ -243,11 +243,16 @@ def VOP_READFIRSTLANE : VOPProfile <[i32, i32, untyped, untyped]> {
 // FIXME: Specify SchedRW for READFIRSTLANE_B32
 // TODO: There is VOP3 encoding also
 def V_READFIRSTLANE_B32 : VOP1_Pseudo <"v_readfirstlane_b32", VOP_READFIRSTLANE,
-   getVOP1Pat.ret, 1> {
+   [], 1> {
   let isConvergent = 1;
 }
 
+foreach vt = Reg32Types.types in {
+  def : GCNPat<(vt (AMDGPUreadfirstlane (vt VRegOrLdsSrc_32:$src0))),
+    (V_READFIRSTLANE_B32 (vt VRegOrLdsSrc_32:$src0))

vikramRH wrote:

Unfortunately no. I had tried this and a couple of other variations. The issue
seems to be specific to GISel pointers.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-15 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-15 Thread Vikram Hegde via cfe-commits


@@ -5387,6 +5387,212 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  auto createLaneOp = [&](Register Src0, Register Src1,
+                          Register Src2) -> Register {
+    auto LaneOp = B.buildIntrinsic(IID, {S32}).addUse(Src0);
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane:
+      return LaneOp.getReg(0);
+    case Intrinsic::amdgcn_readlane:
+      return LaneOp.addUse(Src1).getReg(0);
+    case Intrinsic::amdgcn_writelane:
+      return LaneOp.addUse(Src1).addUse(Src2).getReg(0);
+    default:
+      llvm_unreachable("unhandled lane op");
+    }
+  };
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    if (Ty.isScalar())
+      // Already legal

vikramRH wrote:

Also, the issue occurs only for pointer types; float, v2i16, etc. work just fine.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-15 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-15 Thread Vikram Hegde via cfe-commits


@@ -5387,6 +5387,212 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  auto createLaneOp = [&](Register Src0, Register Src1,
+                          Register Src2) -> Register {
+    auto LaneOp = B.buildIntrinsic(IID, {S32}).addUse(Src0);
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane:
+      return LaneOp.getReg(0);
+    case Intrinsic::amdgcn_readlane:
+      return LaneOp.addUse(Src1).getReg(0);
+    case Intrinsic::amdgcn_writelane:
+      return LaneOp.addUse(Src1).addUse(Src2).getReg(0);
+    default:
+      llvm_unreachable("unhandled lane op");
+    }
+  };
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    if (Ty.isScalar())
+      // Already legal

vikramRH wrote:

Done, except for pointers. I currently see an issue where pattern type inference
somehow deduces the destination type to be a scalar (instead of, say, LLT_p3s32).
I'm not sure why yet; any ideas?

https://github.com/llvm/llvm-project/pull/89217


[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #72607)

2024-05-14 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

@yuanfang-chen, are there any plans to continue with this PR?

https://github.com/llvm/llvm-project/pull/72607


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-13 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Added new tests for 32-bit pointers and <8 x i16>.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-13 Thread Vikram Hegde via cfe-commits


@@ -5386,6 +5386,153 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    if (Ty.isScalar())
+      // Already legal
+      return true;
+
+    Register Src0Valid = B.buildBitcast(S32, Src0).getReg(0);
+    MachineInstrBuilder LaneOpDst;
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid);
+      break;
+    }
+    case Intrinsic::amdgcn_readlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid).addUse(Src1);
+      break;
+    }
+    case Intrinsic::amdgcn_writelane: {
+      Register Src2Valid = B.buildBitcast(S32, Src2).getReg(0);
+      LaneOpDst = B.buildIntrinsic(IID, {S32})
+                      .addUse(Src0Valid)
+                      .addUse(Src1)
+                      .addUse(Src2Valid);
+    }
+    }
+
+    Register LaneOpDstReg = LaneOpDst.getReg(0);
+    B.buildBitcast(DstReg, LaneOpDstReg);
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if (Size < 32) {
+    Register Src0Cast = MRI.getType(Src0).isScalar()
+                            ? Src0
+                            : B.buildBitcast(LLT::scalar(Size), Src0).getReg(0);
+    Register Src0Valid = B.buildAnyExt(S32, Src0Cast).getReg(0);
+
+    MachineInstrBuilder LaneOpDst;
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid);
+      break;
+    }
+    case Intrinsic::amdgcn_readlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid).addUse(Src1);
+      break;
+    }
+    case Intrinsic::amdgcn_writelane: {
+      Register Src2Cast =
+          MRI.getType(Src2).isScalar()
+              ? Src2
+              : B.buildBitcast(LLT::scalar(Size), Src2).getReg(0);
+      Register Src2Valid = B.buildAnyExt(LLT::scalar(32), Src2Cast).getReg(0);
+      LaneOpDst = B.buildIntrinsic(IID, {S32})
+                      .addUse(Src0Valid)
+                      .addUse(Src1)
+                      .addUse(Src2Valid);
+    }
+    }
+
+    Register LaneOpDstReg = LaneOpDst.getReg(0);
+    if (Ty.isScalar())
+      B.buildTrunc(DstReg, LaneOpDstReg);
+    else {
+      auto Trunc = B.buildTrunc(LLT::scalar(Size), LaneOpDstReg);
+      B.buildBitcast(DstReg, Trunc);
+    }
+
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if ((Size % 32) == 0) {
+    SmallVector PartialRes;
+    unsigned NumParts = Size / 32;
+    auto Src0Parts = B.buildUnmerge(S32, Src0);

vikramRH wrote:

Done, hopefully as expected; however, I don't understand the plus here.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-09 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH deleted 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-09 Thread Vikram Hegde via cfe-commits


@@ -5386,6 +5386,153 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    if (Ty.isScalar())
+      // Already legal
+      return true;
+
+    Register Src0Valid = B.buildBitcast(S32, Src0).getReg(0);
+    MachineInstrBuilder LaneOpDst;
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid);
+      break;
+    }
+    case Intrinsic::amdgcn_readlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid).addUse(Src1);
+      break;
+    }
+    case Intrinsic::amdgcn_writelane: {
+      Register Src2Valid = B.buildBitcast(S32, Src2).getReg(0);
+      LaneOpDst = B.buildIntrinsic(IID, {S32})
+                      .addUse(Src0Valid)
+                      .addUse(Src1)
+                      .addUse(Src2Valid);
+    }
+    }
+
+    Register LaneOpDstReg = LaneOpDst.getReg(0);
+    B.buildBitcast(DstReg, LaneOpDstReg);
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if (Size < 32) {
+    Register Src0Cast = MRI.getType(Src0).isScalar()
+                            ? Src0
+                            : B.buildBitcast(LLT::scalar(Size), Src0).getReg(0);
+    Register Src0Valid = B.buildAnyExt(S32, Src0Cast).getReg(0);
+
+    MachineInstrBuilder LaneOpDst;
+    switch (IID) {
+    case Intrinsic::amdgcn_readfirstlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid);
+      break;
+    }
+    case Intrinsic::amdgcn_readlane: {
+      LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0Valid).addUse(Src1);
+      break;
+    }
+    case Intrinsic::amdgcn_writelane: {
+      Register Src2Cast =
+          MRI.getType(Src2).isScalar()
+              ? Src2
+              : B.buildBitcast(LLT::scalar(Size), Src2).getReg(0);
+      Register Src2Valid = B.buildAnyExt(LLT::scalar(32), Src2Cast).getReg(0);
+      LaneOpDst = B.buildIntrinsic(IID, {S32})
+                      .addUse(Src0Valid)
+                      .addUse(Src1)
+                      .addUse(Src2Valid);
+    }
+    }
+
+    Register LaneOpDstReg = LaneOpDst.getReg(0);
+    if (Ty.isScalar())
+      B.buildTrunc(DstReg, LaneOpDstReg);
+    else {
+      auto Trunc = B.buildTrunc(LLT::scalar(Size), LaneOpDstReg);
+      B.buildBitcast(DstReg, Trunc);
+    }
+
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if ((Size % 32) == 0) {
+    SmallVector PartialRes;
+    unsigned NumParts = Size / 32;
+    auto Src0Parts = B.buildUnmerge(S32, Src0);

vikramRH wrote:

Do you mean extracting the s16 elements individually and handling them as in the
(Size < 32) case?
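
The alternative to per-element handling can be modelled as follows: adjacent s16 elements are packed into one 32-bit word, a single 32-bit lane read is done per word, and the result is unpacked. This is a minimal sketch in plain C++, not the LLVM implementation; `WaveSize`, `readlane32` and `readlaneV16` are assumed names, and the wave is shrunk to four lanes for readability.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t WaveSize = 4; // toy wave; real waves are wider

// Hypothetical 32-bit readlane: returns the value held by lane `Lane`.
inline uint32_t readlane32(const std::array<uint32_t, WaveSize> &PerLane,
                           std::size_t Lane) {
  return PerLane[Lane];
}

// Each lane holds a vector with an even number of 16-bit elements.
// Pairs of elements are packed into 32-bit words, read with one 32-bit
// lane operation per word, and unpacked again.
inline std::vector<uint16_t>
readlaneV16(const std::array<std::vector<uint16_t>, WaveSize> &PerLane,
            std::size_t Lane) {
  const std::size_t NumElts = PerLane[0].size();
  std::vector<uint16_t> Result(NumElts);
  for (std::size_t I = 0; I + 1 < NumElts; I += 2) {
    std::array<uint32_t, WaveSize> Words{};
    for (std::size_t L = 0; L < WaveSize; ++L)
      Words[L] = static_cast<uint32_t>(PerLane[L][I]) |
                 (static_cast<uint32_t>(PerLane[L][I + 1]) << 16);
    const uint32_t W = readlane32(Words, Lane);
    Result[I] = static_cast<uint16_t>(W & 0xFFFFu);
    Result[I + 1] = static_cast<uint16_t>(W >> 16);
  }
  return Result;
}
```

With this shape a <4 x s16> value needs two 32-bit reads, whereas extending each element separately would need four.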

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-09 Thread Vikram Hegde via cfe-commits


@@ -5386,6 +5386,153 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    if (Ty.isScalar())
+      // Already legal
+      return true;
+
+    Register Src0Valid = B.buildBitcast(S32, Src0).getReg(0);
+    MachineInstrBuilder LaneOpDst;
+    switch (IID) {

vikramRH wrote:

My bad, I will improve the helper.
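
One shape such a helper could take is to dispatch on the intrinsic ID, so that each lane operation receives exactly the operands it expects regardless of which registers happen to be set. The following is a self-contained model in plain C++; `LaneIntrinsic`, `LaneOp` and the integer operands are hypothetical stand-ins for the LLVM types, not the API used in the patch.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for the three intrinsic IDs discussed here.
enum class LaneIntrinsic { ReadFirstLane, ReadLane, WriteLane };

// Hypothetical model of a built instruction: its ID plus its use operands.
struct LaneOp {
  LaneIntrinsic ID;
  std::vector<uint32_t> Uses;
};

// Builds the operand list for the given ID. Src1 and Src2 are ignored by
// the operations that do not take them.
inline LaneOp createLaneOp(LaneIntrinsic ID, uint32_t Src0, uint32_t Src1,
                           uint32_t Src2) {
  switch (ID) {
  case LaneIntrinsic::ReadFirstLane:
    return {ID, {Src0}};
  case LaneIntrinsic::ReadLane:
    return {ID, {Src0, Src1}};
  case LaneIntrinsic::WriteLane:
    return {ID, {Src0, Src1, Src2}};
  }
  throw std::logic_error("unhandled lane op");
}
```

Keying on the ID keeps the operand count explicit at the call site, which may read more clearly than checking register validity.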

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-09 Thread Vikram Hegde via cfe-commits


@@ -5386,6 +5386,153 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  Register Src1, Src2;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    if (Ty.isScalar())
+      // Already legal
+      return true;
+
+    Register Src0Valid = B.buildBitcast(S32, Src0).getReg(0);
+    MachineInstrBuilder LaneOpDst;
+    switch (IID) {

vikramRH wrote:

I removed the helper in the recent commit following @arsenm's suggestion. The
only reason was readability.

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-09 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-09 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> > add f32 pattern to select read/writelane operations
> 
> Why would you need this? Don't you legalize f32 to i32?

Sorry about this. It's a leftover comment from the initial implementation, which
I should have removed.

https://github.com/llvm/llvm-project/pull/89217
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-09 Thread Vikram Hegde via cfe-commits


@@ -5386,6 +5386,130 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper &Helper,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper &Helper,
+                                         MachineInstr &MI,
+                                         Intrinsic::ID IID) const {
+
+  MachineIRBuilder &B = Helper.MIRBuilder;
+  MachineRegisterInfo &MRI = *B.getMRI();
+
+  Register DstReg = MI.getOperand(0).getReg();
+  Register Src0 = MI.getOperand(2).getReg();
+
+  auto createLaneOp = [&](Register &Src0, Register &Src1,
+                          Register &Src2) -> Register {
+    auto LaneOpDst = B.buildIntrinsic(IID, {S32}).addUse(Src0);
+    if (Src2.isValid())
+      return (LaneOpDst.addUse(Src1).addUse(Src2)).getReg(0);
+    if (Src1.isValid())
+      return (LaneOpDst.addUse(Src1)).getReg(0);
+    return LaneOpDst.getReg(0);
+  };
+
+  Register Src1, Src2, Src0Valid, Src2Valid;
+  if (IID == Intrinsic::amdgcn_readlane || IID == Intrinsic::amdgcn_writelane) {
+    Src1 = MI.getOperand(3).getReg();
+    if (IID == Intrinsic::amdgcn_writelane) {
+      Src2 = MI.getOperand(4).getReg();
+    }
+  }
+
+  LLT Ty = MRI.getType(DstReg);
+  unsigned Size = Ty.getSizeInBits();
+
+  if (Size == 32) {
+    if (Ty.isScalar())
+      // Already legal
+      return true;
+
+    Register Src0Valid = B.buildBitcast(S32, Src0).getReg(0);
+    if (Src2.isValid())
+      Src2Valid = B.buildBitcast(S32, Src2).getReg(0);
+    Register LaneOp = createLaneOp(Src0Valid, Src1, Src2Valid);
+    B.buildBitcast(DstReg, LaneOp);
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if (Size < 32) {
+    Register Src0Cast = MRI.getType(Src0).isScalar()
+                            ? Src0
+                            : B.buildBitcast(LLT::scalar(Size), Src0).getReg(0);
+    Src0Valid = B.buildAnyExt(S32, Src0Cast).getReg(0);
+
+    if (Src2.isValid()) {
+      Register Src2Cast =
+          MRI.getType(Src2).isScalar()
+              ? Src2
+              : B.buildBitcast(LLT::scalar(Size), Src2).getReg(0);
+      Src2Valid = B.buildAnyExt(LLT::scalar(32), Src2Cast).getReg(0);
+    }
+    Register LaneOp = createLaneOp(Src0Valid, Src1, Src2Valid);
+    if (Ty.isScalar())
+      B.buildTrunc(DstReg, LaneOp);
+    else {
+      auto Trunc = B.buildTrunc(LLT::scalar(Size), LaneOp);
+      B.buildBitcast(DstReg, Trunc);
+    }
+
+    MI.eraseFromParent();
+    return true;
+  }
+
+  if ((Size % 32) == 0) {
+    SmallVector PartialRes;
+    unsigned NumParts = Size / 32;
+    auto Src0Parts = B.buildUnmerge(S32, Src0);
+
+    switch (IID) {
+    case Intrinsic::amdgcn_readlane: {
+      Register Src1 = MI.getOperand(3).getReg();
+      for (unsigned i = 0; i < NumParts; ++i)
+        PartialRes.push_back(
+            (B.buildIntrinsic(Intrinsic::amdgcn_readlane, {S32})
+                 .addUse(Src0Parts.getReg(i))
+                 .addUse(Src1))
+                .getReg(0));

vikramRH wrote:

Should this be a separate change that addresses other such instances too?

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-05 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Add support for i64/f64 readlane, writelane and readfirstlane operations. (PR #89217)

2024-05-05 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

1. Addressed review comments.
2. Improved GISel lowering.
3. Added tests for half, bfloat, float2, ptr, vector of ptr and int.
4. Removed gfx700 checks from the writelane test since they caused issues with
f16 legalization. Are these checks required?
https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Add support for i64/f64 readlane, writelane and readfirstlane operations. (PR #89217)

2024-05-02 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

The new commit extends @jayfoad's implementation with GISel support. I have yet
to add tests for half, float and some vector types.
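
The idea behind handling values wider than 32 bits can be sketched as: split the value into 32-bit parts, apply the 32-bit lane operation to each part, and merge the results. The model below is plain C++ under simplifying assumptions (a four-lane wave, hypothetical `readlane32` and `readlane64` helpers); it illustrates the data flow only and is not the code from the PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t NumLanes = 4; // toy wave; real waves are wider

// Hypothetical 32-bit readlane: returns the value held by lane `Lane`.
inline uint32_t readlane32(const std::array<uint32_t, NumLanes> &PerLane,
                           std::size_t Lane) {
  return PerLane[Lane];
}

// 64-bit readlane built from two 32-bit readlanes (low and high halves).
inline uint64_t readlane64(const std::array<uint64_t, NumLanes> &PerLane,
                           std::size_t Lane) {
  std::array<uint32_t, NumLanes> Lo{}, Hi{};
  for (std::size_t L = 0; L < NumLanes; ++L) {
    Lo[L] = static_cast<uint32_t>(PerLane[L]);       // "unmerge": low part
    Hi[L] = static_cast<uint32_t>(PerLane[L] >> 32); // "unmerge": high part
  }
  const uint64_t LoRes = readlane32(Lo, Lane);
  const uint64_t HiRes = readlane32(Hi, Lane);
  return (HiRes << 32) | LoRes; // "merge" the partial results
}
```

An f64 value would follow the same path after its bits are reinterpreted as a 64-bit integer.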

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Add support for i64/f64 readlane, writelane and readfirstlane operations. (PR #89217)

2024-04-22 Thread Vikram Hegde via cfe-commits


@@ -4822,6 +4822,111 @@ static MachineBasicBlock *lowerWaveReduce(MachineInstr &MI,
   return RetBB;
 }
 
+static MachineBasicBlock *lowerPseudoLaneOp(MachineInstr &MI,

vikramRH wrote:

@arsenm, would "PreISelIntrinsicLowering" be the proper place for this?

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Add support for i64/f64 readlane, writelane and readfirstlane operations. (PR #89217)

2024-04-19 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Gentle ping :)

https://github.com/llvm/llvm-project/pull/89217


[clang] [llvm] [AMDGPU][WIP] Add support for i64/f64 readlane, writelane and readfirstlane operations. (PR #89217)

2024-04-19 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Added/updated tests for readfirstlane and writelane ops

https://github.com/llvm/llvm-project/pull/89217
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [RFC][Clang] Enable custom type checking for printf (PR #86801)

2024-03-30 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

> I looked at the OpenCL spec for C standard library support and was surprised 
> that 1) it's only talking about C99 so it's unclear what happens for C11 
> (clause 6 says "This document describes the modifications and restrictions to 
> C99 and C11 in OpenCL C" but 6.11 only talks about C99 headers and leaves 
> `iso646.h`, `math.h`, `stdbool.h`, `stddef.h`, (all in C99) as well as 
> `stdalign.h`, `stdatomic.h`, `stdnoreturn.h`, `threads.h`, and `uchar.h` 
> available?), and 2) OpenCL's `printf` is not really the same function as C's 
> `printf` 
> (https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#differences-between-opencl-c-and-c99-printf).
> 
> #1 is probably more of an oversight than anything, at least with the C11 
> headers. So maybe this isn't a super slippery slope, but maybe C23 will 
> change that (I can imagine `stdbit.h` being of use in OpenCL for bit-bashing 
> operations). However, the fact that the builtin isn't really `printf` but is 
> `printf`-like makes me think we should make it a separate builtin to avoid 
> surprises (we do diagnostics based on builtin IDs and we have special 
> checking logic that we perhaps should be exempting in some cases).

Understood. In that case I propose the following:
1. Builtin TableGen does not currently seem to support specifying language 
address spaces in function prototypes. This needs to be implemented first, if 
it is not already in development.
2. We could have two new macro variants, probably named "OCL_BUILTIN" and 
"OCL_LIB_BUILTIN", which would take IDs of the form 
"BI_OCL##". We would also need corresponding TableGen classes 
(probably named similarly to the macros) which can expose such overloaded 
prototypes when required.

How does this sound?
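
A minimal sketch of how item 2 might look on the C++ side, assuming an X-macro 
list in the style of the existing BUILTIN macros. Every name here 
(OCL_BUILTIN_LIST, OCLBuiltinID, OCLBuiltinInfo, lookupOCLBuiltin) is 
hypothetical, the pasted ID suffix is an assumption because the exact ID form is 
not spelled out above, and the prototype strings are plain text rather than 
Clang's real type encoding.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical list of OpenCL-specific builtins. Each entry is
// X(ID, NAME, PROTOTYPE). The prototype text is illustrative only.
#define OCL_BUILTIN_LIST(X)                                                    \
  X(printf, "printf", "int(__constant char const*, ...)")

// Builtin IDs in their own "BI_OCL" family, formed by token pasting so they
// cannot collide with the IDs of the ordinary C library builtins.
enum OCLBuiltinID {
  OCLNotBuiltin = 0,
#define OCL_BUILTIN(ID, NAME, PROTO) BI_OCL##ID,
  OCL_BUILTIN_LIST(OCL_BUILTIN)
#undef OCL_BUILTIN
  OCLFirstUnused
};

// Per-builtin record, indexed by OCLBuiltinID.
struct OCLBuiltinInfo {
  const char *Name;
  const char *Prototype;
};

static const OCLBuiltinInfo OCLBuiltinTable[] = {
    {"<not a builtin>", ""},
#define OCL_BUILTIN(ID, NAME, PROTO) {NAME, PROTO},
    OCL_BUILTIN_LIST(OCL_BUILTIN)
#undef OCL_BUILTIN
};

// Look up a builtin by its source-level name.
// Returns OCLNotBuiltin when the name is not in the list.
inline OCLBuiltinID lookupOCLBuiltin(const char *Name) {
  assert(Name != nullptr && "builtin name must not be null");
  for (int I = OCLNotBuiltin + 1; I < OCLFirstUnused; ++I)
    if (std::strcmp(OCLBuiltinTable[I].Name, Name) == 0)
      return static_cast<OCLBuiltinID>(I);
  return OCLNotBuiltin;
}
```

With this layout, the OpenCL flavour of printf would get the ID BI_OCLprintf, 
distinct from the ID of the C library printf, so both prototypes could coexist.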


https://github.com/llvm/llvm-project/pull/86801


[clang] [RFC][Clang] Enable custom type checking for printf (PR #86801)

2024-03-28 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Thanks for the comments @AaronBallman. The core issue here is that the current 
builtin handling design does not allow multiple overloads for the same 
identifier to coexist (ref. 
https://github.com/llvm/llvm-project/blob/eacda36c7dd842cb15c0c954eda74b67d0c73814/clang/include/clang/Basic/Builtins.h#L66),
unless the builtins are defined in target-specific namespaces, which is what I 
tried in my original patch. If we want to change this approach, I can currently 
think of a couple of ways at a high level:
1. As you said, we could have OCL-specific LibBuiltin and LangBuiltin TableGen 
classes (and corresponding macros in Builtins.inc). To make this work, they 
would need new builtin IDs of a different form (say "BI_OCL##"). 
This is very language-specific.
2. Change the current Builtin Info structure to allow a vector of 
possible signatures for an identifier. The builtin type decoder could choose 
the appropriate signature based on LangOpt. (This wording is vague and could be 
a separate discussion in itself.)

Either way, changes to the current design are required. printf is the only 
current use case I know of that can benefit from this (since OpenCL v1.2 
s6.9.f says other library functions defined in C standard headers are not 
available, so 🤷‍♂️). But I guess we could have more use cases in the future. Can 
this be a separate discussion? This patch would unblock my current work for now.
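
To make option 2 a little less vague, here is a rough sketch of a builtin 
record that carries several signatures and a decoder step that selects one from 
the language options. All types and functions below (LangOptsSketch, 
SignatureSketch, BuiltinInfoSketch, selectSignature, makePrintfInfo) are 
hypothetical stand-ins and not Clang's actual Builtin::Info or LangOptions, and 
the prototypes are written as plain text rather than in the real type encoding.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the language options the decoder would consult.
struct LangOptsSketch {
  bool OpenCL = false;
};

// One candidate signature, tagged with the language mode it applies to.
struct SignatureSketch {
  bool ForOpenCL;
  std::string Prototype; // plain text, not Clang's real type encoding
};

// A builtin record that holds several signatures for a single identifier.
struct BuiltinInfoSketch {
  std::string Name;
  std::vector<SignatureSketch> Signatures;
};

// Pick the first signature whose language tag matches the options.
// Returns nullptr when no signature applies to the current language mode.
inline const SignatureSketch *selectSignature(const BuiltinInfoSketch &Info,
                                              const LangOptsSketch &Opts) {
  assert(!Info.Signatures.empty() && "a builtin needs at least one signature");
  for (const SignatureSketch &Sig : Info.Signatures)
    if (Sig.ForOpenCL == Opts.OpenCL)
      return &Sig;
  return nullptr;
}

// Example record: printf with a C signature and an OpenCL signature whose
// format string lives in the constant address space.
inline BuiltinInfoSketch makePrintfInfo() {
  return {"printf",
          {{false, "int(char const*, ...)"},
           {true, "int(__constant char const*, ...)"}}};
}
```

The identifier stays a single entry, so lookups by name are unchanged and only 
the type decoding step becomes language-dependent.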
 


https://github.com/llvm/llvm-project/pull/86801


[clang] [RFC][Clang] Enable custom type checking for printf (PR #86801)

2024-03-27 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH ready_for_review 
https://github.com/llvm/llvm-project/pull/86801


[clang] [AMDGPU] Treat printf as builtin for OpenCL (PR #72554)

2024-03-27 Thread Vikram Hegde via cfe-commits

vikramRH wrote:

Closing this in favour of https://github.com/llvm/llvm-project/pull/86801

https://github.com/llvm/llvm-project/pull/72554


[clang] [AMDGPU] Treat printf as builtin for OpenCL (PR #72554)

2024-03-27 Thread Vikram Hegde via cfe-commits

https://github.com/vikramRH closed 
https://github.com/llvm/llvm-project/pull/72554

