from:"Paul Robinson via Phabricator via cfe\-commits"

[PATCH] D147655: Implement mangling rules for C++20 concepts and requires-expressions.

2023-09-26 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

> You need to do `returnit`.

Doh! Thanks. Those two instantiations could have different function bodies, but 
would have the same mangled name. Got it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147655/new/

https://reviews.llvm.org/D147655

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D147655: Implement mangling rules for C++20 concepts and requires-expressions.

2023-09-26 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Hi @rsmith,

> these two different templates would have the same mangling:

  template  T returnit() {return I;};
  template  T returnit() { return I; }

I tried compiling `long foo() { return returnit(); }` with these two 
templates, and got different manglings. 
`_Z8returnitIlLl4EET_v` and
`_Z8returnitIlLi4EET_v`

Am I misunderstanding something about the problem with the old mangling rules?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147655/new/

https://reviews.llvm.org/D147655

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159054: [Driver] Removal of C_INCLUDE_DIRS feature

2023-09-05 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

@MaskRay thanks for the info!

In D159054#4638322 , @brad wrote:

> I was going to hold this over for a bit longer. 2 weeks and if no one says 
> anything then go ahead?

The main thing to worry about, clearly, is what happens as the change 
percolates downstream and into distros. But you probably know more about that 
than I do anyhow. :)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159054/new/

https://reviews.llvm.org/D159054

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159054: [Driver] Removal of C_INCLUDE_DIRS feature

2023-08-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D159054#4626772 , @brad wrote:

> Just FYI I am not in a rush to commit this. I am posting this more so as a 
> means of prodding for discussion of the feature.

So far nobody has popped up to say they want it.

@MaskRay I poked around a bit on sourcegraph.com and didn't see any statement 
about what it actually searches, other than vague "all your repositories" and 
"all your code." The "your" bit makes me wonder how broad it is really. The 
website doesn't say.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159054/new/

https://reviews.llvm.org/D159054

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158614: [UBSan] Disable the function sanitizer on an execute-only target.

2023-08-23 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

The PS5 bits LGTM, but as I'm not familiar with the ARM aspects I won't give 
final approval.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158614/new/

https://reviews.llvm.org/D158614

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-18 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:423
+  if (!CSInfo) {
+SmallString<64> Checksum;
+std::optional CSKind =

In the final commit, `Checksum` is outside the `if` so that its lifetime 
persists to the end of the function (specifically, past the `createFile` call). 
`CSInfo` holds a StringRef to this string, not the string itself.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-18 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Thanks @DavidSpickett the patch is currently reverted. I have a revised patch 
coming soon and I will keep a close eye on the bots.
I believe it's a string-lifetime issue and so whether it manifests is 
unpredictable, but we have enough different bots in the farm that it did get 
caught.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-17 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

This patch is possibly a suspect in at least some bot failures 
 although I'm at a 
loss to understand why. Perhaps I can't just blithely call getChecksum() and 
copy what it sends back? The ways of metadata remain mysterious to me. I'll be 
poking at this more as time permits, unless someone can tell me my obvious 
mistake.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-17 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Detail added in the commit message, good idea!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-17 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGca1295c5a15f: [DebugInfo] Alternate (more efficient) MD5 fix 
(authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/lib/CodeGen/CGDebugInfo.h
  clang/test/CodeGenCXX/debug-info-function-context.cpp


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple 
x86_64-pc-linux-gnu %s \
-// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
+// RUN: -dwarf-version=5 -main-file-name debug-info-function-context.cpp  
-o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -31,8 +31,8 @@
 
 // The first DIFile is for the CU, the second is what everything else uses.
 // We're using DWARF v5 so both should have MD5 checksums.
-// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} 
checksumkind: CSK_MD5
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5, 
checksum: [[CKSUM:".*"]]
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} 
checksumkind: CSK_MD5, checksum: [[CKSUM]]
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: 
DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} 
DISPFlagDefinition
Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::StringMap DIFileCache;
+  llvm::DenseMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -391,12 +391,14 @@
   SourceManager  = CGM.getContext().getSourceManager();
   StringRef FileName;
   FileID FID;
+  std::optional> CSInfo;
 
   if (Loc.isInvalid()) {
 // The DIFile used by the CU is distinct from the main source file. Call
 // createFile() below for canonicalization if the source file was specified
 // with an absolute path.
 FileName = TheCU->getFile()->getFilename();
+CSInfo = TheCU->getFile()->getChecksum();
   } else {
 PresumedLoc PLoc = SM.getPresumedLoc(Loc);
 FileName = PLoc.getFilename();
@@ -417,13 +419,13 @@
   return cast(V);
   }
 
-  SmallString<64> Checksum;
-
-  std::optional CSKind =
+  if (!CSInfo) {
+SmallString<64> Checksum;
+std::optional CSKind =
   computeChecksum(FID, Checksum);
-  std::optional> CSInfo;
-  if (CSKind)
-CSInfo.emplace(*CSKind, Checksum);
+if (CSKind)
+  CSInfo.emplace(*CSKind, Checksum);
+  }
   return createFile(FileName, CSInfo, getSource(SM, SM.getFileID(Loc)));
 }
 


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s \
-// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
+// RUN: -dwarf-version=5 -main-file-name debug-info-function-context.cpp  -o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -31,8 +31,8 @@
 
 // The first DIFile is for the CU, the second is what everything else uses.
 // We're using DWARF v5 so both should have MD5 checksums.
-// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5, checksum: [[CKSUM:".*"]]
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5, checksum: [[CKSUM]]
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} DISPFlagDefinition
Index: clang/lib/CodeGen/CGDebugInfo.h

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-16 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Ping


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153276: [clang][Interp] Reject reinterpret_cast expressions

2023-08-08 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/AST/Interp/Interp.h:1761
+  S.FFDiag(Loc, diag::note_constexpr_invalid_cast)
+  << static_cast(Kind) << S.Current->getRange(OpPC);
+  return false;

aaron.ballman wrote:
> tbaeder wrote:
> > probinson wrote:
> > > Would you mind changing this cast from `uint8_t` to `unsigned`? We have 
> > > an internal bot using a pickier mode of MSVC and it complains about 
> > > ambiguous overloads, as the `operator<<` doesn't have anything smaller 
> > > than `int` and `unsigned`.
> > Feel free to push that change, I'm currently working on something else.
> Yeah, that's a perfectly reasonable NFC commit to make -- go for it!
https://github.com/llvm/llvm-project/commit/4e8cae4aec6590ca13ec65ed38d6da55c6031755



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153276/new/

https://reviews.llvm.org/D153276

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153276: [clang][Interp] Reject reinterpret_cast expressions

2023-08-01 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/AST/Interp/Interp.h:1761
+  S.FFDiag(Loc, diag::note_constexpr_invalid_cast)
+  << static_cast(Kind) << S.Current->getRange(OpPC);
+  return false;

Would you mind changing this cast from `uint8_t` to `unsigned`? We have an 
internal bot using a pickier mode of MSVC and it complains about ambiguous 
overloads, as the `operator<<` doesn't have anything smaller than `int` and 
`unsigned`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153276/new/

https://reviews.llvm.org/D153276

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-01 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 546144.
probinson added a comment.

Reuse the main file's checksum instead of recalculating it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/lib/CodeGen/CGDebugInfo.h
  clang/test/CodeGenCXX/debug-info-function-context.cpp


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple 
x86_64-pc-linux-gnu %s \
-// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
+// RUN: -dwarf-version=5 -main-file-name debug-info-function-context.cpp  
-o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -31,8 +31,8 @@
 
 // The first DIFile is for the CU, the second is what everything else uses.
 // We're using DWARF v5 so both should have MD5 checksums.
-// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} 
checksumkind: CSK_MD5
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5, 
checksum: [[CKSUM:".*"]]
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} 
checksumkind: CSK_MD5, checksum: [[CKSUM]]
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: 
DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} 
DISPFlagDefinition
Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::StringMap DIFileCache;
+  llvm::DenseMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -391,12 +391,14 @@
   SourceManager  = CGM.getContext().getSourceManager();
   StringRef FileName;
   FileID FID;
+  std::optional> CSInfo;
 
   if (Loc.isInvalid()) {
 // The DIFile used by the CU is distinct from the main source file. Call
 // createFile() below for canonicalization if the source file was specified
 // with an absolute path.
 FileName = TheCU->getFile()->getFilename();
+CSInfo = TheCU->getFile()->getChecksum();
   } else {
 PresumedLoc PLoc = SM.getPresumedLoc(Loc);
 FileName = PLoc.getFilename();
@@ -417,13 +419,13 @@
   return cast(V);
   }
 
-  SmallString<64> Checksum;
-
-  std::optional CSKind =
+  if (!CSInfo) {
+SmallString<64> Checksum;
+std::optional CSKind =
   computeChecksum(FID, Checksum);
-  std::optional> CSInfo;
-  if (CSKind)
-CSInfo.emplace(*CSKind, Checksum);
+if (CSKind)
+  CSInfo.emplace(*CSKind, Checksum);
+  }
   return createFile(FileName, CSInfo, getSource(SM, SM.getFileID(Loc)));
 }
 


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s \
-// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
+// RUN: -dwarf-version=5 -main-file-name debug-info-function-context.cpp  -o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -31,8 +31,8 @@
 
 // The first DIFile is for the CU, the second is what everything else uses.
 // We're using DWARF v5 so both should have MD5 checksums.
-// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5, checksum: [[CKSUM:".*"]]
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5, checksum: [[CKSUM]]
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} DISPFlagDefinition
Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-01 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Of course this means we're rerunning MD5 on the main source file; probably can 
capture that from TheCU and save that cost as well.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-08-01 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 546115.
probinson added a comment.

Use the main FileID instead of expensive string compares.

Figured this out after staring at CreateCompileUnit for long enough. Seeding 
the DIFileCache with the DIFile created there made another test unhappy 
(difile_entry.cpp) but this fix doesn't.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/lib/CodeGen/CGDebugInfo.h
  clang/test/CodeGenCXX/debug-info-function-context.cpp


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple 
x86_64-pc-linux-gnu %s \
-// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
+// RUN: -dwarf-version=5 -main-file-name debug-info-function-context.cpp  
-o - | FileCheck %s
 
 struct C {
   void member_function();
Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::StringMap DIFileCache;
+  llvm::DenseMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -397,6 +397,7 @@
 // createFile() below for canonicalization if the source file was specified
 // with an absolute path.
 FileName = TheCU->getFile()->getFilename();
+FID = SM.getMainFileID();
   } else {
 PresumedLoc PLoc = SM.getPresumedLoc(Loc);
 FileName = PLoc.getFilename();


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s \
-// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
+// RUN: -dwarf-version=5 -main-file-name debug-info-function-context.cpp  -o - | FileCheck %s
 
 struct C {
   void member_function();
Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::StringMap DIFileCache;
+  llvm::DenseMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -397,6 +397,7 @@
 // createFile() below for canonicalization if the source file was specified
 // with an absolute path.
 FileName = TheCU->getFile()->getFilename();
+FID = SM.getMainFileID();
   } else {
 PresumedLoc PLoc = SM.getPresumedLoc(Loc);
 FileName = PLoc.getFilename();
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-07-31 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

The CU's DIFile is conjured up in CGDebugInfo::CreateCompileUnit(), and the 
name is derived from `-main-file-name` rather than anything SourceManager 
provides. Although there is a comment wondering about that. CreateCompileUnit() 
computes a checksum for that DIFile, but apparently the DIFile never makes it 
into the DIFileCache.

DIFileCache is generated by CGDebugInfo::createFile(); it looks like there are 
some subtle differences in the path handling between what it does and what 
CreateCompileUnit() does, so making CreateCompileUnit() go through createFile() 
might not the the right thing. But it does look like CreateCompileUnit() might 
just be missing a one-liner to add the CU's DIFile to DIFileCache.

I'll try that instead.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156571/new/

https://reviews.llvm.org/D156571

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155991: [DWARF] Make sure file entry for artificial functions has an MD5 checksum

2023-07-28 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

See https://reviews.llvm.org/D156571


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155991/new/

https://reviews.llvm.org/D155991

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156571: [DebugInfo] Alternate MD5 fix, NFC

2023-07-28 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: dblaikie, nikic.
probinson added a project: debug-info.
Herald added a subscriber: StephenFan.
Herald added a project: All.
probinson requested review of this revision.

Should have lower memory and time cost than D155991 
.


https://reviews.llvm.org/D156571

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/lib/CodeGen/CGDebugInfo.h


Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::StringMap DIFileCache;
+  llvm::DenseMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -397,6 +397,17 @@
 // createFile() below for canonicalization if the source file was specified
 // with an absolute path.
 FileName = TheCU->getFile()->getFilename();
+for (auto It : DIFileCache) {
+  // StringRef operator== does a content comparison.
+  if (FileName == It.first) {
+// Got a string match, use the existing DIFile if possible.
+if (llvm::Metadata *V = It.second)
+  return cast(V);
+// Fall through to create the DIFile but use the original string.
+FileName = It.first;
+break;
+  }
+}
   } else {
 PresumedLoc PLoc = SM.getPresumedLoc(Loc);
 FileName = PLoc.getFilename();


Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::StringMap DIFileCache;
+  llvm::DenseMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -397,6 +397,17 @@
 // createFile() below for canonicalization if the source file was specified
 // with an absolute path.
 FileName = TheCU->getFile()->getFilename();
+for (auto It : DIFileCache) {
+  // StringRef operator== does a content comparison.
+  if (FileName == It.first) {
+// Got a string match, use the existing DIFile if possible.
+if (llvm::Metadata *V = It.second)
+  return cast(V);
+// Fall through to create the DIFile but use the original string.
+FileName = It.first;
+break;
+  }
+}
   } else {
 PresumedLoc PLoc = SM.getPresumedLoc(Loc);
 FileName = PLoc.getFilename();
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155991: [DWARF] Make sure file entry for artificial functions has an MD5 checksum

2023-07-28 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

> Any memory usage measurements to check this doesn't have a significant 
> adverse impact by copying all the strings?

Not actual measurements, no; but intuitively the size should not be much 
greater than the size of the filename entries in the .debug_line section for 
the CU, which should be KB not MB. (Plus StringMap entry overhead, which is 
constant per node.) Therefore I didn't take the time to measure. But see below 
for a possible alternate approach which would avoid even that much.

> Could/should we do the lookup on the CU filename before it goes into the DI 
> metadata, and store that FileID somewhere for later use?

There's a note in issue 63955 to the effect that I can't find an API to turn a 
filename into a FileID. If there is one that I didn't find, we could use 
FileIDs instead of pointers to name strings.

> FYI this is a 0.5% compile-time regression on `O0` builds 
> (https://llvm-compile-time-tracker.com/compare.php?from=69593aa5c054cec6be6b822c073ccdc63748a68d=7abb5fc618cec66841a8280d2a099a4c9c8cb91b=instructions:u).
>  Is that expected?

That's higher than I expected.

It might be feasible to do this a different way: When an invalid loc comes in, 
search the DIFileCache for a matching string, and use that instead of using the 
CU's copy of the filename. Then we can go back to using pointers as the keys. 
Might eliminate the duplicate DIFile as well. I'll look into that. Then the 
time cost would be incurred only when invalid locs come in, which is a minority 
of lookups (depends on the number of artificial functions generated).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155991/new/

https://reviews.llvm.org/D155991

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156248: [Headers][doc] Add description of _mm256_movemask_epi8

2023-07-25 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1fde372b3200: [Headers][doc] Add description of 
_mm256_movemask_epi8 (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156248/new/

https://reviews.llvm.org/D156248

Files:
  clang/lib/Headers/avx2intrin.h


Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -1307,6 +1307,23 @@
   return (__m256i)__builtin_elementwise_min((__v8su)__a, (__v8su)__b);
 }
 
+/// Creates a 32-bit integer mask from the most significant bit of each byte
+///in the 256-bit integer vector in \a __a and returns the result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[i] := __a[j+7]
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPMOVMSKB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing the source bytes.
+/// \returns The 32-bit integer mask.
 static __inline__ int __DEFAULT_FN_ATTRS256
 _mm256_movemask_epi8(__m256i __a)
 {


Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -1307,6 +1307,23 @@
   return (__m256i)__builtin_elementwise_min((__v8su)__a, (__v8su)__b);
 }
 
+/// Creates a 32-bit integer mask from the most significant bit of each byte
+///in the 256-bit integer vector in \a __a and returns the result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[i] := __a[j+7]
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPMOVMSKB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing the source bytes.
+/// \returns The 32-bit integer mask.
 static __inline__ int __DEFAULT_FN_ATTRS256
 _mm256_movemask_epi8(__m256i __a)
 {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156248: [Headers][doc] Add description of _mm256_movemask_epi8

2023-07-25 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

(Missed one from the list of functions I was asked to document. This really 
should be the last review!)


https://reviews.llvm.org/D156248

Files:
  clang/lib/Headers/avx2intrin.h


Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -1307,6 +1307,23 @@
   return (__m256i)__builtin_elementwise_min((__v8su)__a, (__v8su)__b);
 }
 
+/// Creates a 32-bit integer mask from the most significant bit of each byte
+///in the 256-bit integer vector in \a __a and returns the result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[i] := __a[j+7]
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPMOVMSKB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing the source bytes.
+/// \returns The 32-bit integer mask.
 static __inline__ int __DEFAULT_FN_ATTRS256
 _mm256_movemask_epi8(__m256i __a)
 {


Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -1307,6 +1307,23 @@
   return (__m256i)__builtin_elementwise_min((__v8su)__a, (__v8su)__b);
 }
 
+/// Creates a 32-bit integer mask from the most significant bit of each byte
+///in the 256-bit integer vector in \a __a and returns the result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[i] := __a[j+7]
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPMOVMSKB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing the source bytes.
+/// \returns The 32-bit integer mask.
 static __inline__ int __DEFAULT_FN_ATTRS256
 _mm256_movemask_epi8(__m256i __a)
 {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156127: Partially revert changes to test lang-std.cpp

2023-07-24 Thread Paul Robinson via Phabricator via cfe-commits

probinson accepted this revision.
probinson added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156127/new/

https://reviews.llvm.org/D156127

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156127: Partially revert changes to test lang-std.cpp

2023-07-24 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/test/Preprocessor/lang-std.cpp:4
 // RUN: %clang_cc1 -dM -E %s | grep __cplusplus >%T-cpp-std.txt
+// RUN: cat %T-cpp-std.txt | FileCheck --check-prefix=CXX17 %s
+

Use `--input-file` and there's one fewer process to create.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156127/new/

https://reviews.llvm.org/D156127

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155991: [DWARF] Make sure file entry for artificial functions has an MD5 checksum

2023-07-24 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG7abb5fc618ce: [DWARF] Make sure file entry for artificial 
functions has an MD5 checksum (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155991/new/

https://reviews.llvm.org/D155991

Files:
  clang/lib/CodeGen/CGDebugInfo.h
  clang/test/CodeGenCXX/debug-info-function-context.cpp


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple 
x86_64-pc-linux-gnu %s -o - | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple 
x86_64-pc-linux-gnu %s \
+// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -21,11 +22,17 @@
 int global_namespace_variable = 1;
 }
 
+// Generate the artificial global functions to initialize a global.
+int global_initialized_variable = C::static_member_function();
+
 // Check that the functions that belong to C have C as a context and the
 // functions that belong to the namespace have it as a context, and the global
-// function has the file as a context.
+// functions (user-defined and artificial) have the file as a context.
 
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",
+// The first DIFile is for the CU, the second is what everything else uses.
+// We're using DWARF v5 so both should have MD5 checksums.
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} 
checksumkind: CSK_MD5
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: 
DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} 
DISPFlagDefinition
@@ -35,3 +42,6 @@
 // CHECK: !DISubprogram(name: "global_function",{{.*}} scope: ![[FILE]],{{.*}} 
DISPFlagDefinition
 
 // CHECK: !DISubprogram(name: "global_namespace_function",{{.*}} scope: 
![[NS]],{{.*}} DISPFlagDefinition
+
+// CHECK: !DISubprogram(name: "__cxx_global_var_init",{{.*}} scope: 
![[FILE]],{{.*}} DISPFlagDefinition
+// CHECK: !DISubprogram(linkageName: "_GLOBAL__sub_I_{{.*}}",{{.*}} scope: 
![[FILE]],{{.*}} DISPFlagDefinition
Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::DenseMap DIFileCache;
+  llvm::StringMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s -o - | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s \
+// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -21,11 +22,17 @@
 int global_namespace_variable = 1;
 }
 
+// Generate the artificial global functions to initialize a global.
+int global_initialized_variable = C::static_member_function();
+
 // Check that the functions that belong to C have C as a context and the
 // functions that belong to the namespace have it as a context, and the global
-// function has the file as a context.
+// functions (user-defined and artificial) have the file as a context.
 
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",
+// The first DIFile is for the CU, the second is what everything else uses.
+// We're using DWARF v5 so both should have MD5 checksums.
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} DISPFlagDefinition
@@ -35,3 +42,6 @@
 // CHECK: !DISubprogram(name: "global_function",{{.*}} scope: ![[FILE]],{{.*}} DISPFlagDefinition
 
 // CHECK: !DISubprogram(name: "global_namespace_function",{{.*}} scope: ![[NS]],{{.*}}

[PATCH] D155859: [Headers][doc] Add misc non-AVX2 intrinsic descriptions

2023-07-24 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG69593aa5c054: [Headers][doc] Add misc non-AVX2 intrinsic 
descriptions (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155859/new/

https://reviews.llvm.org/D155859

Files:
  clang/lib/Headers/adxintrin.h
  clang/lib/Headers/bmi2intrin.h
  clang/lib/Headers/clflushoptintrin.h
  clang/lib/Headers/clzerointrin.h
  clang/lib/Headers/rdseedintrin.h
  clang/lib/Headers/xsavecintrin.h

Index: clang/lib/Headers/xsavecintrin.h
===
--- clang/lib/Headers/xsavecintrin.h
+++ clang/lib/Headers/xsavecintrin.h
@@ -17,12 +17,62 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__,  __target__("xsavec")))
 
+/// Performs a full or partial save of processor state to the memory at
+///\a __p. The exact state saved depends on the 64-bit mask \a __m and
+///processor control register \c XCR0.
+///
+/// \code{.operation}
+/// mask[62:0] := __m[62:0] AND XCR0[62:0]
+/// FOR i := 0 TO 62
+///   IF mask[i] == 1
+/// CASE (i) OF
+/// 0: save X87 FPU state
+/// 1: save SSE state
+/// DEFAULT: __p.Ext_Save_Area[i] := ProcessorState[i]
+///   FI
+/// ENDFOR
+/// __p.Header.XSTATE_BV[62:0] := INIT_FUNCTION(mask[62:0])
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c XSAVEC instruction.
+///
+/// \param __p
+///Pointer to the save area; must be 64-byte aligned.
+/// \param __m
+///A 64-bit mask indicating what state should be saved.
 static __inline__ void __DEFAULT_FN_ATTRS
 _xsavec(void *__p, unsigned long long __m) {
   __builtin_ia32_xsavec(__p, __m);
 }
 
 #ifdef __x86_64__
+/// Performs a full or partial save of processor state to the memory at
+///\a __p. The exact state saved depends on the 64-bit mask \a __m and
+///processor control register \c XCR0.
+///
+/// \code{.operation}
+/// mask[62:0] := __m[62:0] AND XCR0[62:0]
+/// FOR i := 0 TO 62
+///   IF mask[i] == 1
+/// CASE (i) OF
+/// 0: save X87 FPU state
+/// 1: save SSE state
+/// DEFAULT: __p.Ext_Save_Area[i] := ProcessorState[i]
+///   FI
+/// ENDFOR
+/// __p.Header.XSTATE_BV[62:0] := INIT_FUNCTION(mask[62:0])
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c XSAVEC64 instruction.
+///
+/// \param __p
+///Pointer to the save area; must be 64-byte aligned.
+/// \param __m
+///A 64-bit mask indicating what state should be saved.
 static __inline__ void __DEFAULT_FN_ATTRS
 _xsavec64(void *__p, unsigned long long __m) {
   __builtin_ia32_xsavec64(__p, __m);
Index: clang/lib/Headers/rdseedintrin.h
===
--- clang/lib/Headers/rdseedintrin.h
+++ clang/lib/Headers/rdseedintrin.h
@@ -17,12 +17,54 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("rdseed")))
 
+/// Stores a hardware-generated 16-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store16(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c RDSEED instruction.
+///
+/// \param __p
+///Pointer to memory for storing the 16-bit random number.
+/// \returns 1 if a random number was generated, 0 if not.
 static __inline__ int __DEFAULT_FN_ATTRS
 _rdseed16_step(unsigned short *__p)
 {
   return (int) __builtin_ia32_rdseed16_step(__p);
 }
 
+/// Stores a hardware-generated 32-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store32(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store32(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c RDSEED instruction.
+///
+/// \param __p
+///Pointer to memory for storing the 32-bit random number.
+/// \returns 1 if a random number was generated, 0 if not.
 static __inline__ int __DEFAULT_FN_ATTRS
 _rdseed32_step(unsigned int *__p)
 {
@@ -30,6 +72,27 @@
 }
 
 #ifdef __x86_64__
+/// Stores a hardware-generated 64-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store64(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store64(__p, 0)
+///

[PATCH] D156143: Add Adrian and David as owners for debug info

2023-07-24 Thread Paul Robinson via Phabricator via cfe-commits

probinson accepted this revision.
probinson added a comment.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156143/new/

https://reviews.llvm.org/D156143

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155539: [CUDA][HIP] Use the same default language std as C++

2023-07-24 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D155539#4524543 , @yaxunl wrote:

> In D155539#4524189 , @probinson 
> wrote:
>
>> This change to lang-std.cpp causes it not to verify _which_ language 
>> standard is the default. It only verifies that cuda and hip don't _change_ 
>> it.
>> If you run FileCheck on one of those output files, it would preserve that 
>> property.
>
> It is intentional since we just want to make sure CUDA/HIP uses the same 
> default language standard as C++. We do not want to update the test when C++ 
> changes its default language standard.

I agree you don't want to change the CUDA/HIP part. But we do want to preserve 
the check for what the actual default is. That was the purpose when the test 
was introduced in D131465 . As it stands, 
there is no test for the actual default; you have repurposed the test from 
"check language standard defaults" to "check that CUDA/HIP don't affect the 
default" and that's losing test coverage.

If you want to extract the CUDA/HIP part into a separate test, that's fine. But 
please don't lose the coverage that we had.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155539/new/

https://reviews.llvm.org/D155539

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155539: [CUDA][HIP] Use the same default language std as C++

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

This change to lang-std.cpp causes it not to verify _which_ language standard 
is the default. It only verifies that cuda and hip don't _change_ it.
If you run FileCheck on one of those output files, it would preserve that 
property.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155539/new/

https://reviews.llvm.org/D155539

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155998: Set default C++ level for PlayStation(r) to C++17.

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson accepted this revision.
probinson added a comment.
This revision is now accepted and ready to land.

Huh. It looks like someone else removed that stuff from lang-std.cpp recently. 
I'd have thought that would cause a failure on our bot, but whatever.
LGTM.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155998/new/

https://reviews.llvm.org/D155998

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155998: Set default C++ level for PlayStation(r) to C++17.

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

(Also remove the now-incorrect comment from lang-std.cpp)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155998/new/

https://reviews.llvm.org/D155998

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155998: Set default C++ level for Playstation(r) to C++17.

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

The value of a separate test for PS4/PS5 is questionable, now that it's the 
same as the general Clang default.
Probably better to delete `lang-std-sie.cpp` and remove the UNSUPPORTED from 
`lang-std.cpp` ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155998/new/

https://reviews.llvm.org/D155998

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155859: [Headers][doc] Add misc non-AVX2 intrinsic descriptions

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/rdseedintrin.h:56
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0

pengfei wrote:
> 32
Oops. Fixed.



Comment at: clang/lib/Headers/rdseedintrin.h:84
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0

pengfei wrote:
> 64
Oops. Fixed.



Comment at: clang/lib/Headers/xsavecintrin.h:34
+/// ENDFOR
+/// __p.Header.XSTATE_BV := mask
+/// \endcode

pengfei wrote:
> It's not `mask` but `mask AND XINUSE[62:0]`. The bit[1] also relies on `MXCSR 
> ≠ 1F80H`. I think we can simply use `__p.Header.XSTATE_BV[62:0] := 
> INIT_FUNCTION(mask[62:0])`
Okay. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155859/new/

https://reviews.llvm.org/D155859

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155859: [Headers][doc] Add misc non-AVX2 intrinsic descriptions

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 543075.
probinson marked 4 inline comments as done.
probinson added a comment.

Address review comments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155859/new/

https://reviews.llvm.org/D155859

Files:
  clang/lib/Headers/adxintrin.h
  clang/lib/Headers/bmi2intrin.h
  clang/lib/Headers/clflushoptintrin.h
  clang/lib/Headers/clzerointrin.h
  clang/lib/Headers/rdseedintrin.h
  clang/lib/Headers/xsavecintrin.h

Index: clang/lib/Headers/xsavecintrin.h
===
--- clang/lib/Headers/xsavecintrin.h
+++ clang/lib/Headers/xsavecintrin.h
@@ -17,12 +17,62 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__,  __target__("xsavec")))
 
+/// Performs a full or partial save of processor state to the memory at
+///\a __p. The exact state saved depends on the 64-bit mask \a __m and
+///processor control register \c XCR0.
+///
+/// \code{.operation}
+/// mask[62:0] := __m[62:0] AND XCR0[62:0]
+/// FOR i := 0 TO 62
+///   IF mask[i] == 1
+/// CASE (i) OF
+/// 0: save X87 FPU state
+/// 1: save SSE state
+/// DEFAULT: __p.Ext_Save_Area[i] := ProcessorState[i]
+///   FI
+/// ENDFOR
+/// __p.Header.XSTATE_BV[62:0] := INIT_FUNCTION(mask[62:0])
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c XSAVEC instruction.
+///
+/// \param __p
+///Pointer to the save area; must be 64-byte aligned.
+/// \param __m
+///A 64-bit mask indicating what state should be saved.
 static __inline__ void __DEFAULT_FN_ATTRS
 _xsavec(void *__p, unsigned long long __m) {
   __builtin_ia32_xsavec(__p, __m);
 }
 
 #ifdef __x86_64__
+/// Performs a full or partial save of processor state to the memory at
+///\a __p. The exact state saved depends on the 64-bit mask \a __m and
+///processor control register \c XCR0.
+///
+/// \code{.operation}
+/// mask[62:0] := __m[62:0] AND XCR0[62:0]
+/// FOR i := 0 TO 62
+///   IF mask[i] == 1
+/// CASE (i) OF
+/// 0: save X87 FPU state
+/// 1: save SSE state
+/// DEFAULT: __p.Ext_Save_Area[i] := ProcessorState[i]
+///   FI
+/// ENDFOR
+/// __p.Header.XSTATE_BV[62:0] := INIT_FUNCTION(mask[62:0])
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c XSAVEC64 instruction.
+///
+/// \param __p
+///Pointer to the save area; must be 64-byte aligned.
+/// \param __m
+///A 64-bit mask indicating what state should be saved.
 static __inline__ void __DEFAULT_FN_ATTRS
 _xsavec64(void *__p, unsigned long long __m) {
   __builtin_ia32_xsavec64(__p, __m);
Index: clang/lib/Headers/rdseedintrin.h
===
--- clang/lib/Headers/rdseedintrin.h
+++ clang/lib/Headers/rdseedintrin.h
@@ -17,12 +17,54 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("rdseed")))
 
+/// Stores a hardware-generated 16-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store16(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c RDSEED instruction.
+///
+/// \param __p
+///Pointer to memory for storing the 16-bit random number.
+/// \returns 1 if a random number was generated, 0 if not.
 static __inline__ int __DEFAULT_FN_ATTRS
 _rdseed16_step(unsigned short *__p)
 {
   return (int) __builtin_ia32_rdseed16_step(__p);
 }
 
+/// Stores a hardware-generated 32-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store32(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store32(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c RDSEED instruction.
+///
+/// \param __p
+///Pointer to memory for storing the 32-bit random number.
+/// \returns 1 if a random number was generated, 0 if not.
 static __inline__ int __DEFAULT_FN_ATTRS
 _rdseed32_step(unsigned int *__p)
 {
@@ -30,6 +72,27 @@
 }
 
 #ifdef __x86_64__
+/// Stores a hardware-generated 64-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store64(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store64(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c RDSEED instruction.
+///
+/// \param __p
+///Pointer to

[PATCH] D155861: [Headers][doc] Add SHA1/SHA256 intrinsic descriptions

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG775d6df6a5f2: [Headers][doc] Add SHA1/SHA256 intrinsic 
descriptions (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D155861?vs=542569=543066#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155861/new/

https://reviews.llvm.org/D155861

Files:
  clang/lib/Headers/shaintrin.h

Index: clang/lib/Headers/shaintrin.h
===
--- clang/lib/Headers/shaintrin.h
+++ clang/lib/Headers/shaintrin.h
@@ -17,39 +17,167 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("sha"), __min_vector_width__(128)))
 
+/// Performs four iterations of the inner loop of the SHA-1 message digest
+///algorithm using the starting SHA-1 state (A, B, C, D) from the 128-bit
+///vector of [4 x i32] in \a V1 and the next four 32-bit elements of the
+///message from the 128-bit vector of [4 x i32] in \a V2. Note that the
+///SHA-1 state variable E must have already been added to \a V2
+///(\c _mm_sha1nexte_epu32() can perform this step). Returns the updated
+///SHA-1 state (A, B, C, D) as a 128-bit vector of [4 x i32].
+///
+///The SHA-1 algorithm has an inner loop of 80 iterations, twenty each
+///with a different combining function and rounding constant. This
+///intrinsic performs four iterations using a combining function and
+///rounding constant selected by \a M[1:0].
+///
+/// \headerfile 
+///
+/// \code
+/// __m128i _mm_sha1rnds4_epu32(__m128i V1, __m128i V2, const int M);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c SHA1RNDS4 instruction.
+///
+/// \param V1
+///A 128-bit vector of [4 x i32] containing the initial SHA-1 state.
+/// \param V2
+///A 128-bit vector of [4 x i32] containing the next four elements of
+///the message, plus SHA-1 state variable E.
+/// \param M
+///An immediate value where bits [1:0] select among four possible
+///combining functions and rounding constants (not specified here).
+/// \returns A 128-bit vector of [4 x i32] containing the updated SHA-1 state.
 #define _mm_sha1rnds4_epu32(V1, V2, M) \
   __builtin_ia32_sha1rnds4((__v4si)(__m128i)(V1), (__v4si)(__m128i)(V2), (M))
 
+/// Calculates the SHA-1 state variable E from the SHA-1 state variables in
+///the 128-bit vector of [4 x i32] in \a __X, adds that to the next set of
+///four message elements in the 128-bit vector of [4 x i32] in \a __Y, and
+///returns the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c SHA1NEXTE instruction.
+///
+/// \param __X
+///A 128-bit vector of [4 x i32] containing the current SHA-1 state.
+/// \param __Y
+///A 128-bit vector of [4 x i32] containing the next four elements of the
+///message.
+/// \returns A 128-bit vector of [4 x i32] containing the updated SHA-1
+///values.
 static __inline__ __m128i __DEFAULT_FN_ATTRS
 _mm_sha1nexte_epu32(__m128i __X, __m128i __Y)
 {
   return (__m128i)__builtin_ia32_sha1nexte((__v4si)__X, (__v4si)__Y);
 }
 
+/// Performs an intermediate calculation for deriving the next four SHA-1
+///message elements using previous message elements from the 128-bit
+///vectors of [4 x i32] in \a __X and \a __Y, and returns the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c SHA1MSG1 instruction.
+///
+/// \param __X
+///A 128-bit vector of [4 x i32] containing previous message elements.
+/// \param __Y
+///A 128-bit vector of [4 x i32] containing previous message elements.
+/// \returns A 128-bit vector of [4 x i32] containing the derived SHA-1
+///elements.
 static __inline__ __m128i __DEFAULT_FN_ATTRS
 _mm_sha1msg1_epu32(__m128i __X, __m128i __Y)
 {
   return (__m128i)__builtin_ia32_sha1msg1((__v4si)__X, (__v4si)__Y);
 }
 
+/// Performs the final calculation for deriving the next four SHA-1 message
+///elements using previous message elements from the 128-bit vectors of
+///[4 x i32] in \a __X and \a __Y, and returns the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c SHA1MSG2 instruction.
+///
+/// \param __X
+///A 128-bit vector of [4 x i32] containing an intermediate result.
+/// \param __Y
+///A 128-bit vector of [4 x i32] containing previous message values.
+/// \returns A 128-bit vector of [4 x i32] containing the updated SHA-1
+///values.
 static __inline__ __m128i __DEFAULT_FN_ATTRS
 _mm_sha1msg2_epu32(__m128i __X, __m128i __Y)
 {
   return (__m128i)__builtin_ia32_sha1msg2((__v4si)__X, (__v4si)__Y);
 }
 
+/// Performs two rounds of SHA-256 operation using the following inputs: a
+///starting SHA-256 state (C, D, G, H) from the 128-bit

[PATCH] D155861: [Headers][doc] Add SHA1/SHA256 intrinsic descriptions

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Thanks!

The pre-merge check caught mistakes in the \param commands, which I fixed 
before pushing.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155861/new/

https://reviews.llvm.org/D155861

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155991: [DWARF] Make sure file entry for artificial functions has an MD5 checksum

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

The fix is straightforward but the test was surprisingly tricky; I had to add 
the -main-file-name option to keep one of the DIFile's from ending up as 
``.
I'm still befuddled about why there are two DIFile entries, when the 
DIFileCache clearly has only one. The metadata remains mysterious to me.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155991/new/

https://reviews.llvm.org/D155991

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155991: [DWARF] Make sure file entry for artificial functions has an MD5 checksum

2023-07-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: dblaikie, aprantl.
probinson added a project: debug-info.
Herald added a project: All.
probinson requested review of this revision.

The DIFile cache was keyed on a string pointer instead of string content,
which was causing misses and resulted in an entry without a checksum.
In DWARF v5 if any checksum is missing, we can't write any to the output
file, so this had consequences.

Fixes https://github.com/llvm/llvm-project/issues/63955


https://reviews.llvm.org/D155991

Files:
  clang/lib/CodeGen/CGDebugInfo.h
  clang/test/CodeGenCXX/debug-info-function-context.cpp


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple 
x86_64-pc-linux-gnu %s -o - | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple 
x86_64-pc-linux-gnu %s \
+// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -21,11 +22,17 @@
 int global_namespace_variable = 1;
 }
 
+// Generate the artificial global functions to initialize a global.
+int global_initialized_variable = C::static_member_function();
+
 // Check that the functions that belong to C have C as a context and the
 // functions that belong to the namespace have it as a context, and the global
-// function has the file as a context.
+// functions (user-defined and artificial) have the file as a context.
 
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",
+// The first DIFile is for the CU, the second is what everything else uses.
+// We're using DWARF v5 so both should have MD5 checksums.
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} 
checksumkind: CSK_MD5
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: 
DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} 
DISPFlagDefinition
@@ -35,3 +42,6 @@
 // CHECK: !DISubprogram(name: "global_function",{{.*}} scope: ![[FILE]],{{.*}} 
DISPFlagDefinition
 
 // CHECK: !DISubprogram(name: "global_namespace_function",{{.*}} scope: 
![[NS]],{{.*}} DISPFlagDefinition
+
+// CHECK: !DISubprogram(name: "__cxx_global_var_init",{{.*}} scope: 
![[FILE]],{{.*}} DISPFlagDefinition
+// CHECK: !DISubprogram(linkageName: "_GLOBAL__sub_I_{{.*}}",{{.*}} scope: 
![[FILE]],{{.*}} DISPFlagDefinition
Index: clang/lib/CodeGen/CGDebugInfo.h
===
--- clang/lib/CodeGen/CGDebugInfo.h
+++ clang/lib/CodeGen/CGDebugInfo.h
@@ -148,7 +148,7 @@
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;
 
-  llvm::DenseMap DIFileCache;
+  llvm::StringMap DIFileCache;
   llvm::DenseMap SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered


Index: clang/test/CodeGenCXX/debug-info-function-context.cpp
===
--- clang/test/CodeGenCXX/debug-info-function-context.cpp
+++ clang/test/CodeGenCXX/debug-info-function-context.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s -o - | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s \
+// RUN: -dwarf-version=5 -main-file-name %s  -o - | FileCheck %s
 
 struct C {
   void member_function();
@@ -21,11 +22,17 @@
 int global_namespace_variable = 1;
 }
 
+// Generate the artificial global functions to initialize a global.
+int global_initialized_variable = C::static_member_function();
+
 // Check that the functions that belong to C have C as a context and the
 // functions that belong to the namespace have it as a context, and the global
-// function has the file as a context.
+// functions (user-defined and artificial) have the file as a context.
 
-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",
+// The first DIFile is for the CU, the second is what everything else uses.
+// We're using DWARF v5 so both should have MD5 checksums.
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} DISPFlagDefinition
@@ -35,3 +42,6 @@
 // CHECK: !DISubprogram(name: "global_function",{{.*}} scope: ![[FILE]],{{.*}}

[PATCH] D155861: [Headers][doc] Add SHA1/SHA256 intrinsic descriptions

2023-07-20 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

I didn't include pseudo-code, because it would be long and complicated, 
probably not tell the whole story, and these are _extremely_ specialized 
instructions. I can have a go at it if reviewers insist.

FTR this should be the last of the intrinsic description reviews I plan to 
post. I still need to double-check my list, but I'm pretty sure I got them all.


https://reviews.llvm.org/D155861

Files:
  clang/lib/Headers/shaintrin.h

Index: clang/lib/Headers/shaintrin.h
===
--- clang/lib/Headers/shaintrin.h
+++ clang/lib/Headers/shaintrin.h
@@ -17,39 +17,167 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("sha"), __min_vector_width__(128)))
 
+/// Performs four iterations of the inner loop of the SHA-1 message digest
+///algorithm using the starting SHA-1 state (A, B, C, D) from the 128-bit
+///vector of [4 x i32] in \a V1 and the next four 32-bit elements of the
+///message from the 128-bit vector of [4 x i32] in \a V2. Note that the
+///SHA-1 state variable E must have already been added to \a V2
+///(\c _mm_sha1nexte_epu32() can perform this step). Returns the updated
+///SHA-1 state (A, B, C, D) as a 128-bit vector of [4 x i32].
+///
+///The SHA-1 algorithm has an inner loop of 80 iterations, twenty each
+///with a different combining function and rounding constant. This
+///intrinsic performs four iterations using a combining function and
+///rounding constant selected by \a M[1:0].
+///
+/// \headerfile 
+///
+/// \code
+/// __m128i _mm_sha1rnds4_epu32(__m128i V1, __m128i V2, const int M);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c SHA1RNDS4 instruction.
+///
+/// \param V1
+///A 128-bit vector of [4 x i32] containing the initial SHA-1 state.
+/// \param V2
+///A 128-bit vector of [4 x i32] containing the next four elements of
+///the message, plus SHA-1 state variable E.
+/// \param M
+///An immediate value where bits [1:0] select among four possible
+///combining functions and rounding constants (not specified here).
+/// \returns A 128-bit vector of [4 x i32] containing the updated SHA-1 state.
 #define _mm_sha1rnds4_epu32(V1, V2, M) \
   __builtin_ia32_sha1rnds4((__v4si)(__m128i)(V1), (__v4si)(__m128i)(V2), (M))
 
+/// Calculates the SHA-1 state variable E from the SHA-1 state variables in
+///the 128-bit vector of [4 x i32] in \a __X, adds that to the next set of
+///four message elements in the 128-bit vector of [4 x i32] in \a __Y, and
+///returns the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c SHA1NEXTE instruction.
+///
+/// \param V1
+///A 128-bit vector of [4 x i32] containing the current SHA-1 state.
+/// \param V2
+///A 128-bit vector of [4 x i32] containing the next four elements of the
+///message.
+/// \returns A 128-bit vector of [4 x i32] containing the updated SHA-1
+///values.
 static __inline__ __m128i __DEFAULT_FN_ATTRS
 _mm_sha1nexte_epu32(__m128i __X, __m128i __Y)
 {
   return (__m128i)__builtin_ia32_sha1nexte((__v4si)__X, (__v4si)__Y);
 }
 
+/// Performs an intermediate calculation for deriving the next four SHA-1
+///message elements using previous message elements from the 128-bit
+///vectors of [4 x i32] in \a __X and \a __Y, and returns the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c SHA1MSG1 instruction.
+///
+/// \param V1
+///A 128-bit vector of [4 x i32] containing previous message elements.
+/// \param V2
+///A 128-bit vector of [4 x i32] containing previous message elements.
+/// \returns A 128-bit vector of [4 x i32] containing the derived SHA-1
+///elements.
 static __inline__ __m128i __DEFAULT_FN_ATTRS
 _mm_sha1msg1_epu32(__m128i __X, __m128i __Y)
 {
   return (__m128i)__builtin_ia32_sha1msg1((__v4si)__X, (__v4si)__Y);
 }
 
+/// Performs the final calculation for deriving the next four SHA-1 message
+///elements using previous message elements from the 128-bit vectors of
+///[4 x i32] in \a __X and \a __Y, and returns the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c SHA1MSG2 instruction.
+///
+/// \param V1
+///A 128-bit vector of [4 x i32] containing an intermediate result.
+/// \param V2
+///A 128-bit vector of [4 x i32] containing previous message values.
+/// \returns A 128-bit vector of [4 x i32] containing the updated SHA-1
+///values.
 static __inline__ __m128i __DEFAULT_FN_ATTRS
 _mm_sha1msg2_epu32(__m128i __X, __m128i __Y)
 {
   return (__m128i)__builtin_ia32_sha1msg2((__v4si)__X, (__v4si)__Y);
 }
 
+/// Performs two rounds of SHA-256 operation using the

[PATCH] D155859: [Headers][doc] Add misc non-AVX2 intrinsic descriptions

2023-07-20 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

Adds descriptions for adxintrin.h, bmi2intrin.h, clflushoptintrin.h,
rdseedintrin.h, and xsavecintrin.h.
Revises clzerointrin.h, the description had some errors.


https://reviews.llvm.org/D155859

Files:
  clang/lib/Headers/adxintrin.h
  clang/lib/Headers/bmi2intrin.h
  clang/lib/Headers/clflushoptintrin.h
  clang/lib/Headers/clzerointrin.h
  clang/lib/Headers/rdseedintrin.h
  clang/lib/Headers/xsavecintrin.h

Index: clang/lib/Headers/xsavecintrin.h
===
--- clang/lib/Headers/xsavecintrin.h
+++ clang/lib/Headers/xsavecintrin.h
@@ -17,12 +17,62 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__,  __target__("xsavec")))
 
+/// Performs a full or partial save of processor state to the memory at
+///\a __p. The exact state saved depends on the 64-bit mask \a __m and
+///processor control register \c XCR0.
+///
+/// \code{.operation}
+/// mask[62:0] := __m[62:0] AND XCR0[62:0]
+/// FOR i := 0 TO 62
+///   IF mask[i] == 1
+/// CASE (i) OF
+/// 0: save X87 FPU state
+/// 1: save SSE state
+/// DEFAULT: __p.Ext_Save_Area[i] := ProcessorState[i]
+///   FI
+/// ENDFOR
+/// __p.Header.XSTATE_BV := mask
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c XSAVEC instruction.
+///
+/// \param __p
+///Pointer to the save area; must be 64-byte aligned.
+/// \param __m
+///A 64-bit mask indicating what state should be saved.
 static __inline__ void __DEFAULT_FN_ATTRS
 _xsavec(void *__p, unsigned long long __m) {
   __builtin_ia32_xsavec(__p, __m);
 }
 
 #ifdef __x86_64__
+/// Performs a full or partial save of processor state to the memory at
+///\a __p. The exact state saved depends on the 64-bit mask \a __m and
+///processor control register \c XCR0.
+///
+/// \code{.operation}
+/// mask[62:0] := __m[62:0] AND XCR0[62:0]
+/// FOR i := 0 TO 62
+///   IF mask[i] == 1
+/// CASE (i) OF
+/// 0: save X87 FPU state
+/// 1: save SSE state
+/// DEFAULT: __p.Ext_Save_Area[i] := ProcessorState[i]
+///   FI
+/// ENDFOR
+/// __p.Header.XSTATE_BV := mask
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c XSAVEC64 instruction.
+///
+/// \param __p
+///Pointer to the save area; must be 64-byte aligned.
+/// \param __m
+///A 64-bit mask indicating what state should be saved.
 static __inline__ void __DEFAULT_FN_ATTRS
 _xsavec64(void *__p, unsigned long long __m) {
   __builtin_ia32_xsavec64(__p, __m);
Index: clang/lib/Headers/rdseedintrin.h
===
--- clang/lib/Headers/rdseedintrin.h
+++ clang/lib/Headers/rdseedintrin.h
@@ -17,12 +17,54 @@
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("rdseed")))
 
+/// Stores a hardware-generated 16-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store16(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c RDSEED instruction.
+///
+/// \param __p
+///Pointer to memory for storing the 16-bit random number.
+/// \returns 1 if a random number was generated, 0 if not.
 static __inline__ int __DEFAULT_FN_ATTRS
 _rdseed16_step(unsigned short *__p)
 {
   return (int) __builtin_ia32_rdseed16_step(__p);
 }
 
+/// Stores a hardware-generated 32-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store32(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c RDSEED instruction.
+///
+/// \param __p
+///Pointer to memory for storing the 32-bit random number.
+/// \returns 1 if a random number was generated, 0 if not.
 static __inline__ int __DEFAULT_FN_ATTRS
 _rdseed32_step(unsigned int *__p)
 {
@@ -30,6 +72,27 @@
 }
 
 #ifdef __x86_64__
+/// Stores a hardware-generated 64-bit random value in the memory at \a __p.
+///
+///The random number generator complies with NIST SP800-90B and SP800-90C.
+///
+/// \code{.operation}
+/// IF HW_NRND_GEN.ready == 1
+///   Store64(__p, HW_NRND_GEN.data)
+///   result := 1
+/// ELSE
+///   Store16(__p, 0)
+///   result := 0
+/// END
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic

[PATCH] D155081: Specify the developer policy around links to external resources

2023-07-17 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: llvm/docs/DeveloperPolicy.rst:359
+  If the patch fixes a bug in GitHub Issues, we encourage adding
+  "Fixes https://github.com/llvm/llvm-project/issues/12345; to automate closing
+  the issue in GitHub. If the patch has been reviewed, we encourage adding a

mehdi_amini wrote:
> ldionne wrote:
> > smeenai wrote:
> > > aaron.ballman wrote:
> > > > arsenm wrote:
> > > > > I haven't quite figured out what the exact syntaxes which are 
> > > > > automatically recognized. It seems to recognize "Fixes #Nxyz"
> > > > Yup, it does support that form as well. I had heard more than once 
> > > > during code review that folks seem to prefer the full link because it's 
> > > > easier to click on that from the commit message than it is to navigate 
> > > > to the fix from the number alone. That seemed like a pretty good reason 
> > > > to recommend the full form, but I don't have strong opinions.
> > > +1 for encouraging the full link
> > Perhaps we could encourage using `https://llvm.org/PR12345` instead? Does 
> > anybody know whether `llvm.org/PRXXX` is something that we intend to keep 
> > around with the Github transition or not?
> @arsenm: It's documented 
> https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
> And for linking cross-repo: 
> https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/autolinked-references-and-urls#issues-and-pull-requests
> Perhaps we could encourage using `https://llvm.org/PR12345` instead? Does 
> anybody know whether `llvm.org/PRXXX` is something that we intend to keep 
> around with the Github transition or not?

Currently the PRxxx links are to the old bugzillas, not the Github issues. It 
might be sad to lose that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155081/new/

https://reviews.llvm.org/D155081

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153681: [X86] Move back _mulx_u32 to 32-bit only

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I'm okay with putting this back if the codegen (D153620 
) can't be resolved. Honestly I didn't think 
it was a problem.
But I think Craig or Simon should sign off.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153681/new/

https://reviews.llvm.org/D153681

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1461fabfb141: [Headers][doc] Add load/store/cmp/cvt 
intrinsic descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -600,30 +600,130 @@
   ((__m256i)__builtin_ia32_pblendw256((__v16hi)(__m256i)(V1), \
   (__v16hi)(__m256i)(V2), (int)(M)))
 
+/// Compares corresponding bytes in the 256-bit integer vectors in \a __a and
+///\a __b for equality and returns the outcomes in the corresponding
+///bytes of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[j+7:j] := (__a[j+7:j] == __b[j+7:j]) ? 0xFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the inputs.
+/// \param __b
+///A 256-bit integer vector containing one of the inputs.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qi)__a == (__v32qi)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [16 x i16] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 15
+///   j := i*16
+///   result[j+15:j] := (__a[j+15:j] == __b[j+15:j]) ? 0x : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the inputs.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hi)__a == (__v16hi)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [8 x i32] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*32
+///   result[j+31:j] := (__a[j+31:j] == __b[j+31:j]) ? 0x : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the inputs.
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8si)__a == (__v8si)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [4 x i64] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*64
+///   result[j+63:j] := (__a[j+63:j] == __b[j+63:j]) ? 0x : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the inputs.
+/// \returns A 256-bit vector of [4 x i64] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4di)__a == (__v4di)__b);
 }
 
+/// Compares corresponding signed bytes in the 256-bit integer vectors in
+///\a __a and \a __b for greater-than and returns the outcomes in the
+///corresponding bytes of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[j+7:j] := (__a[j+7:j] > __b[j+7:j]) ? 0xFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPGTB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the inputs.
+/// \param __b
+///A 256-bit integer vector containing one of the inputs.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpgt_epi8(__m256i __a, __m256i __b)
 {
@@ -632,18 +732,78 @@
   return

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

pengfei wrote:
> probinson wrote:
> > pengfei wrote:
> > > probinson wrote:
> > > > pengfei wrote:
> > > > > A more intrinsic guide format is `MEM[__X+j:j]`
> > > > LoadXX is the syntax in the gather intrinsics, e.g. 
> > > > _mm_mask_i32gather_pd. I'd prefer to be consistent.
> > > I think the problem here is the measurement is easily confusing.
> > > From C point of view, `__X` is a `int` pointer, so we should `+ i` rather 
> > > than `i * 4`
> > > From the other part of the code, we are measuring in bits, but here `i * 
> > > 4` is a byte offset.
> > Well, the pseudo-code is clearly not C. If you look at the gather code, it 
> > computes a byte address using an offset multiplied by an explicit scale 
> > factor. I am doing exactly the same here.
> > 
> > The syntax `MEM[__X+j:j]` is mixing a byte address with a bit offset, which 
> > I think is more confusing. To be fully consistent, using `[]` with bit 
> > offsets only, it should be
> > ```
> > k := __X*8 + i*32
> > result[j+31:j] := MEM[k+31:k]
> > ```
> > which I think obscures more than it explains.
> Yeah, it's not C code here. But we are easy to fall into C concepts, e.g., 
> why assuming __X is measuring in bytes?
> That's why I think it's clear to make both in bits.
> I made a mistake here, I wanted to propose `MEM[__X+j+31: __X+j]`. It matches 
> with [[ 
> https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=4057,4058,4059,4053,665,3890,5959,5910,3870,4280=_mm256_maskload_epi32
>  | Intrinsic Guide ]].
> 
We assume `__X` is in bytes because that's how addresses work on X86. Adding a 
bit offset to a byte address makes no sense. I see that is how existing Intel 
documentation works, which does not make it correct.

To "make both in bits" means multiplying `__X` by 8, as in the example in my 
previous comment. Or coming up with a different syntax that makes the 
difference clear.
`MEM(__X)[j+31:j]` or even `MEM[__X][j+31:j]` would be preferable.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

> we might emit member function declarations for call sites in DWARF, for 
> instance

So are you now leaning more toward wanting to emit the "used" declarations as 
well? I'm sure you don't want to implement it the way we did...

I *think* the template parameter thing has more to do with not supporting 
expressions in all their glory, although I'm fuzzy on that--last time I looked 
at that stuff was a number of years ago.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

pengfei wrote:
> probinson wrote:
> > pengfei wrote:
> > > A more intrinsic guide format is `MEM[__X+j:j]`
> > LoadXX is the syntax in the gather intrinsics, e.g. _mm_mask_i32gather_pd. 
> > I'd prefer to be consistent.
> I think the problem here is the measurement is easily confusing.
> From C point of view, `__X` is a `int` pointer, so we should `+ i` rather 
> than `i * 4`
> From the other part of the code, we are measuring in bits, but here `i * 4` 
> is a byte offset.
Well, the pseudo-code is clearly not C. If you look at the gather code, it 
computes a byte address using an offset multiplied by an explicit scale factor. 
I am doing exactly the same here.

The syntax `MEM[__X+j:j]` is mixing a byte address with a bit offset, which I 
think is more confusing. To be fully consistent, using `[]` with bit offsets 
only, it should be
```
k := __X*8 + i*32
result[j+31:j] := MEM[k+31:k]
```
which I think obscures more than it explains.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 535804.
probinson added a comment.

s/:7/:j/ correcting bit references.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -600,30 +600,130 @@
   ((__m256i)__builtin_ia32_pblendw256((__v16hi)(__m256i)(V1), \
   (__v16hi)(__m256i)(V2), (int)(M)))
 
+/// Compares corresponding bytes in the 256-bit integer vectors in \a __a and
+///\a __b for equality and returns the outcomes in the corresponding
+///bytes of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[j+7:j] := (__a[j+7:j] == __b[j+7:j]) ? 0xFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the inputs.
+/// \param __b
+///A 256-bit integer vector containing one of the inputs.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qi)__a == (__v32qi)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [16 x i16] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 15
+///   j := i*16
+///   result[j+15:j] := (__a[j+15:j] == __b[j+15:j]) ? 0x : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the inputs.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hi)__a == (__v16hi)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [8 x i32] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*32
+///   result[j+31:j] := (__a[j+31:j] == __b[j+31:j]) ? 0x : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the inputs.
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8si)__a == (__v8si)__b);
 }
 
+/// Compares corresponding elements in the 256-bit vectors of [4 x i64] in
+///\a __a and \a __b for equality and returns the outcomes in the
+///corresponding elements of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*64
+///   result[j+63:j] := (__a[j+63:j] == __b[j+63:j]) ? 0x : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPEQQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the inputs.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the inputs.
+/// \returns A 256-bit vector of [4 x i64] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpeq_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4di)__a == (__v4di)__b);
 }
 
+/// Compares corresponding signed bytes in the 256-bit integer vectors in
+///\a __a and \a __b for greater-than and returns the outcomes in the
+///corresponding bytes of the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 31
+///   j := i*8
+///   result[j+7:j] := (__a[j+7:j] > __b[j+7:j]) ? 0xFF : 0
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPCMPGTB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the inputs.
+/// \param __b
+///A 256-bit integer vector containing one of the inputs.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_cmpgt_epi8(__m256i __a, __m256i __b)
 {
@@ -632,18 +732,78 @@
   return (__m256i)((__v32qs)__a > (__v32qs)__b);
 }
 
+/// Compares corresponding signed elements in the 256-bit vectors of
+///[16 x i16] in \a __a and \a __b for greater-than and returns the
+///outcomes in the corresponding

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+/// result[j+31:j] := Load32(__X+(i*4))
+///   ELSE

pengfei wrote:
> A more intrinsic guide format is `MEM[__X+j:j]`
LoadXX is the syntax in the gather intrinsics, e.g. _mm_mask_i32gather_pd. I'd 
prefer to be consistent.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

2023-06-28 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a subscriber: cfe-commits.
probinson added a comment.

+ cfe-commits which didn't get added automatically.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153576: [Headers] Fix up some conditionals

2023-06-22 Thread Paul Robinson via Phabricator via cfe-commits

probinson closed this revision.
probinson added a comment.

Committed 
[[here|https://github.com/llvm/llvm-project/commit/3db8410487ce704f02ef8a175e87295d4e86c8df]]
 and closing manually because I forgot to put the review link in the commit 
message.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153576/new/

https://reviews.llvm.org/D153576

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153576: [Headers] Fix up some conditionals

2023-06-22 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

The Intel documentation 

 doesn't hint that `_mulx_32` isn't available in 64-bit mode. I'm guessing gcc 
messed up and we copied them.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153576/new/

https://reviews.llvm.org/D153576

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153576: [Headers] Fix up some conditionals

2023-06-22 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: craig.topper, RKSimon, pengfei, goldstein.w.n.
Herald added a project: All.
probinson requested review of this revision.

While looking at adding intrinsic function descriptions, I found some
oddities in the conditionals. I've fiddled with some of them.  This is
not a complete audit.

Two headers have been changed to require only immintrin.h and not
x86intrin.h; this conforms to which header actually includes them, and with
Intel documentation.

One function in bmi2intrin.h was defined only in 32-bit mode, for no
apparent reason; I fixed that.

clzerointrin.h is included from x86intrin.h, so I made it not check for 
immintrin.h.
This is an AMD only instruction so there's no Intel documentation to check 
against.


https://reviews.llvm.org/D153576

Files:
  clang/lib/Headers/bmi2intrin.h
  clang/lib/Headers/clzerointrin.h
  clang/lib/Headers/rdseedintrin.h


Index: clang/lib/Headers/rdseedintrin.h
===
--- clang/lib/Headers/rdseedintrin.h
+++ clang/lib/Headers/rdseedintrin.h
@@ -7,8 +7,8 @@
  *===---===
  */
 
-#if !defined __X86INTRIN_H && !defined __IMMINTRIN_H
-#error "Never use  directly; include  instead."
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
 #endif
 
 #ifndef __RDSEEDINTRIN_H
Index: clang/lib/Headers/clzerointrin.h
===
--- clang/lib/Headers/clzerointrin.h
+++ clang/lib/Headers/clzerointrin.h
@@ -6,7 +6,7 @@
  *
  *===---===
  */
-#if !defined __X86INTRIN_H && !defined __IMMINTRIN_H
+#ifndef __X86INTRIN_H
 #error "Never use  directly; include  instead."
 #endif
 
Index: clang/lib/Headers/bmi2intrin.h
===
--- clang/lib/Headers/bmi2intrin.h
+++ clang/lib/Headers/bmi2intrin.h
@@ -7,8 +7,8 @@
  *===---===
  */
 
-#if !defined __X86INTRIN_H && !defined __IMMINTRIN_H
-#error "Never use  directly; include  instead."
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
 #endif
 
 #ifndef __BMI2INTRIN_H
@@ -35,6 +35,14 @@
   return __builtin_ia32_pext_si(__X, __Y);
 }
 
+static __inline__ unsigned int __DEFAULT_FN_ATTRS
+_mulx_u32(unsigned int __X, unsigned int __Y, unsigned int *__P)
+{
+  unsigned long long __res = (unsigned long long) __X * __Y;
+  *__P = (unsigned int)(__res >> 32);
+  return (unsigned int)__res;
+}
+
 #ifdef  __x86_64__
 
 static __inline__ unsigned long long __DEFAULT_FN_ATTRS
@@ -64,17 +72,7 @@
   return (unsigned long long) __res;
 }
 
-#else /* !__x86_64__ */
-
-static __inline__ unsigned int __DEFAULT_FN_ATTRS
-_mulx_u32 (unsigned int __X, unsigned int __Y, unsigned int *__P)
-{
-  unsigned long long __res = (unsigned long long) __X * __Y;
-  *__P = (unsigned int) (__res >> 32);
-  return (unsigned int) __res;
-}
-
-#endif /* !__x86_64__  */
+#endif /* __x86_64__  */
 
 #undef __DEFAULT_FN_ATTRS
 


Index: clang/lib/Headers/rdseedintrin.h
===
--- clang/lib/Headers/rdseedintrin.h
+++ clang/lib/Headers/rdseedintrin.h
@@ -7,8 +7,8 @@
  *===---===
  */
 
-#if !defined __X86INTRIN_H && !defined __IMMINTRIN_H
-#error "Never use  directly; include  instead."
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
 #endif
 
 #ifndef __RDSEEDINTRIN_H
Index: clang/lib/Headers/clzerointrin.h
===
--- clang/lib/Headers/clzerointrin.h
+++ clang/lib/Headers/clzerointrin.h
@@ -6,7 +6,7 @@
  *
  *===---===
  */
-#if !defined __X86INTRIN_H && !defined __IMMINTRIN_H
+#ifndef __X86INTRIN_H
 #error "Never use  directly; include  instead."
 #endif
 
Index: clang/lib/Headers/bmi2intrin.h
===
--- clang/lib/Headers/bmi2intrin.h
+++ clang/lib/Headers/bmi2intrin.h
@@ -7,8 +7,8 @@
  *===---===
  */
 
-#if !defined __X86INTRIN_H && !defined __IMMINTRIN_H
-#error "Never use  directly; include  instead."
+#ifndef __IMMINTRIN_H
+#error "Never use  directly; include  instead."
 #endif
 
 #ifndef __BMI2INTRIN_H
@@ -35,6 +35,14 @@
   return __builtin_ia32_pext_si(__X, __Y);
 }
 
+static __inline__ unsigned int __DEFAULT_FN_ATTRS
+_mulx_u32(unsigned int __X, unsigned int __Y, unsigned int *__P)
+{
+  unsigned long long __res = (unsigned long long) __X * __Y;
+  *__P = (unsigned int)(__res >> 32);
+  return (unsigned int)__res;
+}
+
 #ifdef  __x86_64__
 
 static __inline__ unsigned long long __DEFAULT_FN_ATTRS
@@

[PATCH] D153462: [Headers][doc] Add various arith/logical intrinsic descriptions to avx2intrin.h

2023-06-22 Thread Paul Robinson via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGbfa467bdc0e9: [Headers][doc] Add various arith/logical 
intrinsic descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153462/new/

https://reviews.llvm.org/D153462

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -19,22 +19,112 @@
 #define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx2"), __min_vector_width__(128)))
 
 /* SSE4 Multiple Packed Sums of Absolute Difference.  */
+/// Computes sixteen sum of absolute difference (SAD) operations on sets of
+///four unsigned 8-bit integers from the 256-bit integer vectors \a X and
+///\a Y.
+///
+///Eight SAD results are computed using the lower half of the input
+///vectors, and another eight using the upper half. These 16-bit values
+///are returned in the lower and upper halves of the 256-bit result,
+///respectively.
+///
+///A single SAD operation selects four bytes from \a X and four bytes from
+///\a Y as input. It computes the differences between each \a X byte and
+///the corresponding \a Y byte, takes the absolute value of each
+///difference, and sums these four values to form one 16-bit result. The
+///intrinsic computes 16 of these results with different sets of input
+///bytes.
+///
+///For each set of eight results, the SAD operations use the same four
+///bytes from \a Y; the starting bit position for these four bytes is
+///specified by \a M[1:0] times 32. The eight operations use successive
+///sets of four bytes from \a X; the starting bit position for the first
+///set of four bytes is specified by \a M[2] times 32. These bit positions
+///are all relative to the 128-bit lane for each set of eight operations.
+///
+/// \code{.operation}
+/// r := 0
+/// FOR i := 0 TO 1
+///   j := i*3
+///   Ybase := M[j+1:j]*32 + i*128
+///   Xbase := M[j+2]*32 + i*128
+///   FOR k := 0 TO 3
+/// temp0 := ABS(X[Xbase+7:Xbase] - Y[Ybase+7:Ybase])
+/// temp1 := ABS(X[Xbase+15:Xbase+8] - Y[Ybase+15:Ybase+8])
+/// temp2 := ABS(X[Xbase+23:Xbase+16] - Y[Ybase+23:Ybase+16])
+/// temp3 := ABS(X[Xbase+31:Xbase+24] - Y[Ybase+31:Ybase+24])
+/// result[r+15:r] := temp0 + temp1 + temp2 + temp3
+/// Xbase := Xbase + 8
+/// r := r + 16
+///   ENDFOR
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m256i _mm256_mpsadbw_epu8(__m256i X, __m256i Y, const int M);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VMPSADBW instruction.
+///
+/// \param X
+///A 256-bit integer vector containing one of the inputs.
+/// \param Y
+///A 256-bit integer vector containing one of the inputs.
+/// \param M
+/// An unsigned immediate value specifying the starting positions of the
+/// bytes to operate on.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 #define _mm256_mpsadbw_epu8(X, Y, M) \
   ((__m256i)__builtin_ia32_mpsadbw256((__v32qi)(__m256i)(X), \
   (__v32qi)(__m256i)(Y), (int)(M)))
 
+/// Computes the absolute value of each signed byte in the 256-bit integer
+///vector \a __a and returns each value in the corresponding byte of
+///the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPABSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_abs_epi8(__m256i __a)
 {
 return (__m256i)__builtin_elementwise_abs((__v32qs)__a);
 }
 
+/// Computes the absolute value of each signed 16-bit element in the 256-bit
+///vector of [16 x i16] in \a __a and returns each value in the
+///corresponding element of the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPABSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_abs_epi16(__m256i __a)
 {
 return (__m256i)__builtin_elementwise_abs((__v16hi)__a);
 }
 
+/// Computes the absolute value of each signed 32-bit element in the 256-bit
+///vector of [8 x i32] in \a __a and returns each value in the
+///corresponding element of the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPABSD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32].
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_abs_epi32(__m256i __a)
 {
@@ -345,24 +435,88 @@

[PATCH] D153462: [Headers][doc] Add various arith/logical intrinsic descriptions to avx2intrin.h

2023-06-21 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D153462

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -19,22 +19,112 @@
 #define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("avx2"), __min_vector_width__(128)))
 
 /* SSE4 Multiple Packed Sums of Absolute Difference.  */
+/// Computes sixteen sum of absolute difference (SAD) operations on sets of
+///four unsigned 8-bit integers from the 256-bit integer vectors \a X and
+///\a Y.
+///
+///Eight SAD results are computed using the lower half of the input
+///vectors, and another eight using the upper half. These 16-bit values
+///are returned in the lower and upper halves of the 256-bit result,
+///respectively.
+///
+///A single SAD operation selects four bytes from \a X and four bytes from
+///\a Y as input. It computes the differences between each \a X byte and
+///the corresponding \a Y byte, takes the absolute value of each
+///difference, and sums these four values to form one 16-bit result. The
+///intrinsic computes 16 of these results with different sets of input
+///bytes.
+///
+///For each set of eight results, the SAD operations use the same four
+///bytes from \a Y; the starting bit position for these four bytes is
+///specified by \a M[1:0] times 32. The eight operations use successive
+///sets of four bytes from \a X; the starting bit position for the first
+///set of four bytes is specified by \a M[2] times 32. These bit positions
+///are all relative to the 128-bit lane for each set of eight operations.
+///
+/// \code{.operation}
+/// r := 0
+/// FOR i := 0 TO 1
+///   j := i*3
+///   Ybase := M[j+1:j]*32 + i*128
+///   Xbase := M[j+2]*32 + i*128
+///   FOR k := 0 TO 3
+/// temp0 := ABS(X[Xbase+7:Xbase] - Y[Ybase+7:Ybase])
+/// temp1 := ABS(X[Xbase+15:Xbase+8] - Y[Ybase+15:Ybase+8])
+/// temp2 := ABS(X[Xbase+23:Xbase+16] - Y[Ybase+23:Ybase+16])
+/// temp3 := ABS(X[Xbase+31:Xbase+24] - Y[Ybase+31:Ybase+24])
+/// result[r+15:r] := temp0 + temp1 + temp2 + temp3
+/// Xbase := Xbase + 8
+/// r := r + 16
+///   ENDFOR
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m256i _mm256_mpsadbw_epu8(__m256i X, __m256i Y, const int M);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VMPSADBW instruction.
+///
+/// \param X
+///A 256-bit integer vector containing one of the inputs.
+/// \param Y
+///A 256-bit integer vector containing one of the inputs.
+/// \param M
+/// An unsigned immediate value specifying the starting positions of the
+/// bytes to operate on.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 #define _mm256_mpsadbw_epu8(X, Y, M) \
   ((__m256i)__builtin_ia32_mpsadbw256((__v32qi)(__m256i)(X), \
   (__v32qi)(__m256i)(Y), (int)(M)))
 
+/// Computes the absolute value of each signed byte in the 256-bit integer
+///vector \a __a and returns each value in the corresponding byte of
+///the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPABSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector.
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_abs_epi8(__m256i __a)
 {
 return (__m256i)__builtin_elementwise_abs((__v32qs)__a);
 }
 
+/// Computes the absolute value of each signed 16-bit element in the 256-bit
+///vector of [16 x i16] in \a __a and returns each value in the
+///corresponding element of the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPABSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_abs_epi16(__m256i __a)
 {
 return (__m256i)__builtin_elementwise_abs((__v16hi)__a);
 }
 
+/// Computes the absolute value of each signed 32-bit element in the 256-bit
+///vector of [8 x i32] in \a __a and returns each value in the
+///corresponding element of the result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPABSD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32].
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_abs_epi32(__m256i __a)
 {
@@ -345,24 +435,88 @@
   ((__m256i)__builtin_ia32_palignr256((__v32qi)(__m256i)(a), \
   (__v32qi)(__m256i)(b), (n)))
 
+/// Computes the bitwise AND of the

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-07 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Could always go with `-gsuppress-undefined-methods` if you're not happy about 
default-on options. I don't have a strong opinion.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Oh, `-fstandalone-debug` should override this? In the Sony implementation, it 
does. (Sorry for not remembering that sooner.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I have to say I'm not super excited about "-gincomplete-types" given that 
"incomplete type" means something different to a C++ user.

"-gdefined-methods-only" ? reads awkwardly in the no- form.
"-gsuppress-undefined-methods" ? which is similar to what we called it.
"-gundefined-methods" ? you'd default-true in that case, the no form would 
suppress them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

My debugger guy says "this shouldn't be a problem."

Given that, my request is that `-gincomplete-types` should be default-true for 
`DebuggerTuning == SCE` if you want to commit this; otherwise I'll redo our 
downstream patch to match yours.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-06 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D152017#4397113 , @dblaikie wrote:

> What's the particular goal/value in including called-but-not-defined 
> functions? Are your users generally building only parts of their program with 
> debug info & you want it to be complete-ish in the parts that do have debug 
> info? Because if they were building the whole program with debug info, /some/ 
> translation unit would have the definition of the function, and the debug 
> info for it.

Actually the goal was to match the behavior of proprietary compilers for 
previous consoles, knowing that the debugger would be fine with that. I think 
it's worth taking this idea (only defined methods) back to them, and see what 
they think. Because your patch is seriously simpler than ours!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-05 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I experimented with this. Looks like it emits info only for //defined// 
methods, and not //used// methods. That is, if you change the test to say
`void t1::f1() { f2(); }`
you get DWARF for f1 but not f2. The way Sony does it, you get DWARF for f1 and 
f2.
(Neither case sees info for f3, which is declared but not defined or used.)

Is that what you intended?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152017: [DebugInfo] Add flag to only emit referenced member functions

2023-06-02 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I'm traveling but will look at this on Monday.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152017/new/

https://reviews.llvm.org/D152017

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151749: [Headers][doc] Add "shuffle-like" intrinsic descriptions to avx2intrin.h

2023-05-31 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe5399f1d7cab: [Headers][doc] Add shuffle-like intrinsic 
descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D151749?vs=526782=527015#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151749/new/

https://reviews.llvm.org/D151749

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -41,24 +41,126 @@
 return (__m256i)__builtin_elementwise_abs((__v8si)__a);
 }
 
+/// Converts the elements of two 256-bit vectors of [16 x i16] to 8-bit
+///integers using signed saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKSSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packsswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts the elements of two 256-bit vectors of [8 x i32] to 16-bit
+///integers using signed saturation, and returns the resulting 256-bit
+///vector of [16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16(__a[31+j:j])
+///   result[79+k:64+k] := SATURATE16(__b[31+j:j])
+///   result[143+k:128+k] := SATURATE16(__a[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16(__b[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKSSDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packssdw256((__v8si)__a, (__v8si)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [16 x i16] to 8-bit integers
+///using unsigned saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8U(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8U(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8U(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8U(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKUSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packuswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [8 x i32] to 16-bit integers
+///using unsigned saturation, and returns the resulting 256-bit vector of
+///[16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16U(__V1[31+j:j])
+///   result[79+k:64+k] := SATURATE16U(__V2[31+j:j])
+///   result[143+k:128+k] := SATURATE16U(__V1[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16U(__V2[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKUSDW instruction.
+///
+/// \param __V1
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __V2
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi32(__m256i __V1, __m256i __V2)
 {
@@ -215,6

[PATCH] D151749: [Headers][doc] Add "shuffle-like" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 526782.
probinson added a comment.

Update some SATURATEx to SATURATExU


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151749/new/

https://reviews.llvm.org/D151749

Files:
  clang/lib/Headers/avx2intrin.h


Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -111,10 +111,10 @@
 /// FOR i := 0 TO 7
 ///   j := i*16
 ///   k := i*8
-///   result[7+k:k] := SATURATE8(__a[15+j:j])
-///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
-///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
-///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+///   result[7+k:k] := SATURATE8U(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8U(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8U(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8U(__b[143+j:128+j])
 /// ENDFOR
 /// \endcode
 ///
@@ -143,10 +143,10 @@
 /// FOR i := 0 TO 3
 ///   j := i*32
 ///   k := i*16
-///   result[15+k:k] := SATURATE16(__V1[31+j:j])
-///   result[79+k:64+k] := SATURATE16(__V2[31+j:j])
-///   result[143+k:128+k] := SATURATE16(__V1[159+j:128+j])
-///   result[207+k:192+k] := SATURATE16(__V2[159+j:128+j])
+///   result[15+k:k] := SATURATE16U(__V1[31+j:j])
+///   result[79+k:64+k] := SATURATE16U(__V2[31+j:j])
+///   result[143+k:128+k] := SATURATE16U(__V1[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16U(__V2[159+j:128+j])
 /// ENDFOR
 /// \endcode
 ///


Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -111,10 +111,10 @@
 /// FOR i := 0 TO 7
 ///   j := i*16
 ///   k := i*8
-///   result[7+k:k] := SATURATE8(__a[15+j:j])
-///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
-///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
-///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+///   result[7+k:k] := SATURATE8U(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8U(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8U(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8U(__b[143+j:128+j])
 /// ENDFOR
 /// \endcode
 ///
@@ -143,10 +143,10 @@
 /// FOR i := 0 TO 3
 ///   j := i*32
 ///   k := i*16
-///   result[15+k:k] := SATURATE16(__V1[31+j:j])
-///   result[79+k:64+k] := SATURATE16(__V2[31+j:j])
-///   result[143+k:128+k] := SATURATE16(__V1[159+j:128+j])
-///   result[207+k:192+k] := SATURATE16(__V2[159+j:128+j])
+///   result[15+k:k] := SATURATE16U(__V1[31+j:j])
+///   result[79+k:64+k] := SATURATE16U(__V2[31+j:j])
+///   result[143+k:128+k] := SATURATE16U(__V1[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16U(__V2[159+j:128+j])
 /// ENDFOR
 /// \endcode
 ///
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D151749: [Headers][doc] Add "shuffle-like" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

(Time to look for the next round of embarrassing mistakes...)


https://reviews.llvm.org/D151749

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -41,24 +41,126 @@
 return (__m256i)__builtin_elementwise_abs((__v8si)__a);
 }
 
+/// Converts the elements of two 256-bit vectors of [16 x i16] to 8-bit
+///integers using signed saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKSSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packsswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts the elements of two 256-bit vectors of [8 x i32] to 16-bit
+///integers using signed saturation, and returns the resulting 256-bit
+///vector of [16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16(__a[31+j:j])
+///   result[79+k:64+k] := SATURATE16(__b[31+j:j])
+///   result[143+k:128+k] := SATURATE16(__a[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16(__b[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKSSDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packs_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packssdw256((__v8si)__a, (__v8si)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [16 x i16] to 8-bit integers
+///using unsigned saturation, and returns the 256-bit result.
+///
+/// \code{.operation}
+/// FOR i := 0 TO 7
+///   j := i*16
+///   k := i*8
+///   result[7+k:k] := SATURATE8(__a[15+j:j])
+///   result[71+k:64+k] := SATURATE8(__b[15+j:j])
+///   result[135+k:128+k] := SATURATE8(__a[143+j:128+j])
+///   result[199+k:192+k] := SATURATE8(__b[143+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKUSWB instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] used to generate result[63:0] and
+///result[191:128].
+/// \param __b
+///A 256-bit vector of [16 x i16] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit integer vector containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_ia32_packuswb256((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Converts elements from two 256-bit vectors of [8 x i32] to 16-bit integers
+///using unsigned saturation, and returns the resulting 256-bit vector of
+///[16 x i16].
+///
+/// \code{.operation}
+/// FOR i := 0 TO 3
+///   j := i*32
+///   k := i*16
+///   result[15+k:k] := SATURATE16(__V1[31+j:j])
+///   result[79+k:64+k] := SATURATE16(__V2[31+j:j])
+///   result[143+k:128+k] := SATURATE16(__V1[159+j:128+j])
+///   result[207+k:192+k] := SATURATE16(__V2[159+j:128+j])
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPACKUSDW instruction.
+///
+/// \param __V1
+///A 256-bit vector of [8 x i32] used to generate result[63:0] and
+///result[191:128].
+/// \param __V2
+///A 256-bit vector of [8 x i32] used to generate result[127:64] and
+///result[255:192].
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_packus_epi32(__m256i __V1, __m256i __V2)
 {
@@ -215,6 +317,30 @@
   return (__m256i)__builtin_elementwise_add_sat((__v16hu)__a, (__v16hu)__b);
 }
 
+/// Uses the lower half of the 256-bit vector \a a as the upper half of a
+///temporary 256-bit value, and the lower half of the 256-bit vector

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
probinson marked an inline comment as done.
Closed by commit rGd8291908ef49: [Headers][doc] Add add/sub/mul intrinsic 
descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D150114?vs=525632=526700#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-30 Thread Paul Robinson via Phabricator via cfe-commits

probinson marked an inline comment as done.
probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:412
+///vectors of [16 x i16] and returns the lower 16 bits of each difference
+///in an element of the [16 x i16] result (overflow is ignored).
+///Differences from \a __a are returned in the lower 64 bits of each

pengfei wrote:
> underflow?
I don't often see "underflow" applied to integer operations. Technically, any 
signed add or subtract could either overflow or underflow, depending on the 
sign and magnitude of the operands. I think just saying "overflow" is clear 
enough?



Comment at: clang/lib/Headers/avx2intrin.h:448
+///vectors of [8 x i32] and returns the lower 32 bits of each difference in
+///an element of the [8 x i31] result (overflow is ignored). Differences
+///from \a __a are returned in the lower 64 bits of each 128-bit half of

pengfei wrote:
> pengfei wrote:
> > typo or intended?
> underflow.
The `i31` is a typo, fixed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-25 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:456
+///   j := i*128
+///   result[j+31:j] := __a[j+63:j+32] - __a[j+31:j]
+///   result[j+63:j+32] := __a[j+127:j+96] - __a[j+95:j+64]

craig.topper wrote:
> Intel intrinsics guide says
> 
> ```
> dst[31:0] := a[31:0] - a[63:32]
> dst[63:32] := a[95:64] - a[127:96]
> dst[95:64] := b[31:0] - b[63:32]
> dst[127:96] := b[95:64] - b[127:96]
> dst[159:128] := a[159:128] - a[191:160]
> dst[191:160] := a[223:192] - a[255:224]
> dst[223:192] := b[159:128] - b[191:160]
> dst[255:224] := b[223:192] - b[255:224]
> dst[MAX:256] := 0
> ```
> 
> So I think the operands are in the wrong order here?
Words fail me. Also diagrams. I wanted the add and sub descriptions to look 
similar, and copy-pasted from add to sub without verifying the order.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-25 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 525632.
probinson added a comment.

Correct order of horizontal operands


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using unsigned saturation, and returns each sum in the
+///corresponding byte of the 256-bit

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-23 Thread Paul Robinson via Phabricator via cfe-commits

probinson updated this revision to Diff 524786.
probinson added a comment.

Address review comments


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit integer vector containing one of the source operands.
+/// \param __b
+///A 256-bit integer vector containing one of the source operands.
+/// \returns A 256-bit integer vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using unsigned saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-23 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:156
+///A 256-bit vector containing one of the source operands.
+/// \returns A 256-bit vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256

craig.topper wrote:
> Why do some return descriptions include the type like [4 x i64] but some 
> don't?
My policy has been to provide the type for element sizes other than byte. So, I 
haven't been saying [32 x i8] but I do say [4 x i64] or whatever.
Although, I have also tended to say "integer vector" when it's a byte vector, 
and I'll make that consistent here as well.



Comment at: clang/lib/Headers/avx2intrin.h:1043
+///corresponding byte of the 256-bit integer vector result (overflow is
+///ignored). For each byte, computes  result = __a - __b .
+///

pengfei wrote:
> It better to move it to `\code{.operation}` for consistency. Same for the 
> below.
Okay.



Comment at: clang/lib/Headers/avx2intrin.h:1050
+/// \param __a
+///A 256-bit vector containing the subtrahends.
+/// \param __b

craig.topper wrote:
> I think minuend and subtrahend are swapped here.
Thanks for catching that!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150114/new/

https://reviews.llvm.org/D150114

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D150278: [Headers][doc] Add "shift" intrinsic descriptions to avx2intrin.h

2023-05-10 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG642bd1123d05: [Headers][doc] Add shift intrinsic 
descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150278/new/

https://reviews.llvm.org/D150278

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -493,108 +493,404 @@
 return (__m256i)__builtin_ia32_psignd256((__v8si)__a, (__v8si)__b);
 }
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// \code
+/// __m256i _mm256_slli_si256(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+/// An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_slli_si256(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// \code
+/// __m256i _mm256_bslli_epi128(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+///An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_bslli_epi128(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi16(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_psllwi256((__v16hi)__a, __count);
 }
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by the number of bits specified by the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///A 128-bit vector of [2 x i64] whose lower element gives the unsigned
+///shift count (in bits). The upper element is ignored.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_sll_epi16(__m256i __a, __m128i __count)
 {
   return (__m256i)__builtin_ia32_psllw256((__v16hi)__a, (__v8hi)__count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 31, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi32(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_pslldi256((__v8si)__a, __count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by the number of bits given in the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 31, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+///

[PATCH] D150278: [Headers][doc] Add "shift" intrinsic descriptions to avx2intrin.h

2023-05-10 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D150278

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -493,108 +493,404 @@
 return (__m256i)__builtin_ia32_psignd256((__v8si)__a, (__v8si)__b);
 }
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// \code
+/// __m256i _mm256_slli_si256(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+/// An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_slli_si256(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 128-bit half of the 256-bit integer vector \a a left by
+///\a imm bytes, shifting in zero bytes, and returns the result. If \a imm
+///is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// \code
+/// __m256i _mm256_bslli_epi128(__m256i a, const int imm);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VPSLLDQ instruction.
+///
+/// \param a
+///A 256-bit integer vector to be shifted.
+/// \param imm
+///An unsigned immediate value specifying the shift count (in bytes).
+/// \returns A 256-bit integer vector containing the result.
 #define _mm256_bslli_epi128(a, imm) \
   ((__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm)))
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi16(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_psllwi256((__v16hi)__a, __count);
 }
 
+/// Shifts each 16-bit element of the 256-bit vector of [16 x i16] in \a __a
+///left by the number of bits specified by the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 15, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] to be shifted.
+/// \param __count
+///A 128-bit vector of [2 x i64] whose lower element gives the unsigned
+///shift count (in bits). The upper element is ignored.
+/// \returns A 256-bit vector of [16 x i16] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_sll_epi16(__m256i __a, __m128i __count)
 {
   return (__m256i)__builtin_ia32_psllw256((__v16hi)__a, (__v8hi)__count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by \a __count bits, shifting in zero bits, and returns the result.
+///If \a __count is greater than 31, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+/// \param __count
+///An unsigned integer value specifying the shift count (in bits).
+/// \returns A 256-bit vector of [8 x i32] containing the result.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_slli_epi32(__m256i __a, int __count)
 {
   return (__m256i)__builtin_ia32_pslldi256((__v8si)__a, __count);
 }
 
+/// Shifts each 32-bit element of the 256-bit vector of [8 x i32] in \a __a
+///left by the number of bits given in the lower 64 bits of \a __count,
+///shifting in zero bits, and returns the result. If \a __count is greater
+///than 31, the returned result is all zeroes.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPSLLD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] to be shifted.
+/// \param __count
+///A 128-bit vector of [2 x i64] whose lower element gives the unsigned
+///shift count (in bits). The upper element is ignored.
+/// \returns A 256-bit vector of [8 x i32] containing

[PATCH] D150114: [Headers][doc] Add "add/sub/mul" intrinsic descriptions to avx2intrin.h

2023-05-08 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D150114

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -65,48 +65,150 @@
   return (__m256i) __builtin_ia32_packusdw256((__v8si)__V1, (__v8si)__V2);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors and returns the lower 8 bits of each sum in the corresponding
+///byte of the 256-bit integer vector result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDB instruction.
+///
+/// \param __a
+///A 256-bit vector containing one of the source operands.
+/// \param __b
+///A 256-bit vector containing one of the source operands.
+/// \returns A 256-bit vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v32qu)__a + (__v32qu)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] and returns the lower 16 bits of each sum in the
+///corresponding element of the [16 x i16] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v16hu)__a + (__v16hu)__b);
 }
 
+/// Adds 32-bit integers from corresponding elements of two 256-bit vectors of
+///[8 x i32] and returns the lower 32 bits of each sum in the corresponding
+///element of the [8 x i32] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDD instruction.
+///
+/// \param __a
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [8 x i32] containing one of the source operands.
+/// \returns A 256-bit vector of [8 x i32] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi32(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v8su)__a + (__v8su)__b);
 }
 
+/// Adds 64-bit integers from corresponding elements of two 256-bit vectors of
+///[4 x i64] and returns the lower 64 bits of each sum in the corresponding
+///element of the [4 x i64] result (overflow is ignored).
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDQ instruction.
+///
+/// \param __a
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [4 x i64] containing one of the source operands.
+/// \returns A 256-bit vector of [4 x i64] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_add_epi64(__m256i __a, __m256i __b)
 {
   return (__m256i)((__v4du)__a + (__v4du)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using signed saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSB instruction.
+///
+/// \param __a
+///A 256-bit vector containing one of the source operands.
+/// \param __b
+///A 256-bit vector containing one of the source operands.
+/// \returns A 256-bit vector containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi8(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v32qs)__a, (__v32qs)__b);
 }
 
+/// Adds 16-bit integers from corresponding elements of two 256-bit vectors of
+///[16 x i16] using signed saturation, and returns the [16 x i16] result.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VPADDSW instruction.
+///
+/// \param __a
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \param __b
+///A 256-bit vector of [16 x i16] containing one of the source operands.
+/// \returns A 256-bit vector of [16 x i16] containing the sums.
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_adds_epi16(__m256i __a, __m256i __b)
 {
   return (__m256i)__builtin_elementwise_add_sat((__v16hi)__a, (__v16hi)__b);
 }
 
+/// Adds 8-bit integers from corresponding bytes of two 256-bit integer
+///vectors using unsigned saturation, and returns each sum in the
+///corresponding byte of the 256-bit integer vector result.
+///
+/// \headerfile

[PATCH] D149205: [Headers][doc] Add "gather" intrinsic descriptions to avx2intrin.h

2023-04-26 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
probinson marked an inline comment as done.
Closed by commit rG039ae62405b6: [Headers][doc] Add gather 
intrinsic descriptions to avx2intrin.h (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D149205?vs=516917=517182#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149205/new/

https://reviews.llvm.org/D149205

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -935,102 +935,810 @@
   return (__m128i)__builtin_ia32_psrlv2di((__v2di)__X, (__v2di)__Y);
 }
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 128-bit vector
+///of [2 x double] in \a mask determines the source for each element.
+///
+/// \code{.operation}
+/// FOR element := 0 to 1
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m128d _mm_mask_i32gather_pd(__m128d a, const double *m, __m128i i,
+///   __m128d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 128-bit vector of [2 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m. Only
+///the first two elements are used.
+/// \param mask
+///A 128-bit vector of [2 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 128-bit vector of [2 x double] containing the gathered values.
 #define _mm_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m128d)__builtin_ia32_gatherd_pd((__v2df)(__m128i)(a), \
   (double const *)(m), \
   (__v4si)(__m128i)(i), \
   (__v2df)(__m128d)(mask), (s)))
 
+/// Conditionally gathers four 64-bit floating-point values, either from the
+///256-bit vector of [4 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 256-bit vector
+///of [4 x double] in \a mask determines the source for each element.
+///
+/// \code{.operation}
+/// FOR element := 0 to 3
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m256d _mm256_mask_i32gather_pd(__m256d a, const double *m, __m128i i,
+///  __m256d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 256-bit vector of [4 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m.
+/// \param mask
+///A 256-bit vector of [4 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 256-bit vector of [4 x double] containing the gathered values.
 #define _mm256_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m256d)__builtin_ia32_gatherd_pd256((__v4df)(__m256d)(a), \
  (double const *)(m), \
  (__v4si)(__m128i)(i), \
  (__v4df)(__m256d)(mask), (s)))
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of

[PATCH] D149205: [Headers][doc] Add "gather" intrinsic descriptions to avx2intrin.h

2023-04-26 Thread Paul Robinson via Phabricator via cfe-commits

probinson marked 2 inline comments as done.
probinson added inline comments.



Comment at: clang/lib/Headers/avx2intrin.h:942
+///
+/// \code
+/// FOR element := 0 to 1

pengfei wrote:
> Use `\code{.operation}` please, the same below. Our internal tool will 
> recognize this pattern.
Ok. I'll modify our tooling to ignore it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149205/new/

https://reviews.llvm.org/D149205

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D149205: [Headers][doc] Add "gather" intrinsic descriptions to avx2intrin.h

2023-04-25 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: pengfei, RKSimon, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D149205

Files:
  clang/lib/Headers/avx2intrin.h

Index: clang/lib/Headers/avx2intrin.h
===
--- clang/lib/Headers/avx2intrin.h
+++ clang/lib/Headers/avx2intrin.h
@@ -786,7 +786,6 @@
   return (__m128i)__builtin_shufflevector((__v8hi)__X, (__v8hi)__X, 0, 0, 0, 0, 0, 0, 0, 0);
 }
 
-
 static __inline__ __m128i __DEFAULT_FN_ATTRS128
 _mm_broadcastd_epi32(__m128i __X)
 {
@@ -935,102 +934,810 @@
   return (__m128i)__builtin_ia32_psrlv2di((__v2di)__X, (__v2di)__Y);
 }
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 128-bit vector
+///of [2 x double] in \a mask determines the source for each element.
+///
+/// \code
+/// FOR element := 0 to 1
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m128d _mm_mask_i32gather_pd(__m128d a, const double *m, __m128i i,
+///   __m128d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 128-bit vector of [2 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m. Only
+///the first two elements are used.
+/// \param mask
+///A 128-bit vector of [2 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 128-bit vector of [2 x double] containing the gathered values.
 #define _mm_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m128d)__builtin_ia32_gatherd_pd((__v2df)(__m128i)(a), \
   (double const *)(m), \
   (__v4si)(__m128i)(i), \
   (__v2df)(__m128d)(mask), (s)))
 
+/// Conditionally gathers four 64-bit floating-point values, either from the
+///256-bit vector of [4 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [4 x i32] in \a i. The 256-bit vector
+///of [4 x double] in \a mask determines the source for each element.
+///
+/// \code
+/// FOR element := 0 to 3
+///   j := element*64
+///   k := element*32
+///   IF mask[j+63] == 0
+/// result[j+63:j] := a[j+63:j]
+///   ELSE
+/// result[j+63:j] := Load64(m + SignExtend(i[k+31:k])*s)
+///   FI
+/// ENDFOR
+/// \endcode
+///
+/// \headerfile 
+///
+/// \code
+/// __m256d _mm256_mask_i32gather_pd(__m256d a, const double *m, __m128i i,
+///  __m256d mask, const int s);
+/// \endcode
+///
+/// This intrinsic corresponds to the \c VGATHERDPD instruction.
+///
+/// \param a
+///A 256-bit vector of [4 x double] used as the source when a mask bit is
+///zero.
+/// \param m
+///A pointer to the memory used for loading values.
+/// \param i
+///A 128-bit vector of [4 x i32] containing signed indexes into \a m.
+/// \param mask
+///A 256-bit vector of [4 x double] containing the mask. The most
+///significant bit of each element in the mask vector represents the mask
+///bits. If a mask bit is zero, the corresponding value from vector \a a
+///is gathered; otherwise the value is loaded from memory.
+/// \param s
+///A literal constant scale factor for the indexes in \a i. Must be
+///1, 2, 4, or 8.
+/// \returns A 256-bit vector of [4 x double] containing the gathered values.
 #define _mm256_mask_i32gather_pd(a, m, i, mask, s) \
   ((__m256d)__builtin_ia32_gatherd_pd256((__v4df)(__m256d)(a), \
  (double const *)(m), \
  (__v4si)(__m128i)(i), \
  (__v4df)(__m256d)(mask), (s)))
 
+/// Conditionally gathers two 64-bit floating-point values, either from the
+///128-bit vector of [2 x double] in \a a, or from memory \a m using scaled
+///indexes from the 128-bit vector of [2 x i64] in \a i. The 128-bit vector
+///of [2 x double] in \a mask determines the source for each element.
+///
+/// \code
+/// FOR element := 0

[PATCH] D148653: [Header][doc] Add/revise MONITOR/MWAIT[X] descriptions

2023-04-19 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/pmmintrin.h:278
 ///the monitor event pending state. Data stored in the monitored address
 ///range causes the processor to exit the pending state.
 ///

goldstein.w.n wrote:
> interrupts too. Might as well add that if updating the comments.
Ah, sorry missed this comment before I pushed. Updated in rG12426441




Comment at: clang/lib/Headers/pmmintrin.h:291
 /// \param __hints
-///Optional hints for the monitoring state, which may vary by processor.
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS

goldstein.w.n wrote:
> out of curiosity, why "may" -> "can"?
That was on the advice of my tech writer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148653/new/

https://reviews.llvm.org/D148653

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D148653: [Header][doc] Add/revise MONITOR/MWAIT[X] descriptions

2023-04-19 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG5ddcef2ad3db: [Headers][doc] Add/revise MONITOR/MWAIT 
descriptions (authored by probinson).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148653/new/

https://reviews.llvm.org/D148653

Files:
  clang/lib/Headers/mwaitxintrin.h
  clang/lib/Headers/pmmintrin.h


Index: clang/lib/Headers/pmmintrin.h
===
--- clang/lib/Headers/pmmintrin.h
+++ clang/lib/Headers/pmmintrin.h
@@ -253,9 +253,12 @@
 ///the processor in the monitor event pending state. Data stored in the
 ///monitored address range causes the processor to exit the pending state.
 ///
+/// The \c MONITOR instruction can be used in kernel mode, and in other modes
+/// if MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MONITOR  instruction.
+/// This intrinsic corresponds to the \c MONITOR instruction.
 ///
 /// \param __p
 ///The memory range to be monitored. The size of the range is determined by
@@ -270,19 +273,22 @@
   __builtin_ia32_monitor(__p, __extensions, __hints);
 }
 
-/// Used with the MONITOR instruction to wait while the processor is in
+/// Used with the \c MONITOR instruction to wait while the processor is in
 ///the monitor event pending state. Data stored in the monitored address
 ///range causes the processor to exit the pending state.
 ///
+/// The \c MWAIT instruction can be used in kernel mode, and in other modes if
+/// MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MWAIT  instruction.
+/// This intrinsic corresponds to the \c MWAIT instruction.
 ///
 /// \param __extensions
-///Optional extensions for the monitoring state, which may vary by
+///Optional extensions for the monitoring state, which can vary by
 ///processor.
 /// \param __hints
-///Optional hints for the monitoring state, which may vary by processor.
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwait(unsigned __extensions, unsigned __hints)
 {
Index: clang/lib/Headers/mwaitxintrin.h
===
--- clang/lib/Headers/mwaitxintrin.h
+++ clang/lib/Headers/mwaitxintrin.h
@@ -16,12 +16,41 @@
 
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__,  
__target__("mwaitx")))
+
+/// Establishes a linear address memory range to be monitored and puts
+///the processor in the monitor event pending state. Data stored in the
+///monitored address range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MONITORX instruction.
+///
+/// \param __p
+///The memory range to be monitored. The size of the range is determined by
+///CPUID function _0005h.
+/// \param __extensions
+///Optional extensions for the monitoring state.
+/// \param __hints
+///Optional hints for the monitoring state.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_monitorx(void * __p, unsigned __extensions, unsigned __hints)
 {
   __builtin_ia32_monitorx(__p, __extensions, __hints);
 }
 
+/// Used with the \c MONITORX instruction to wait while the processor is in
+///the monitor event pending state. Data stored in the monitored address
+///range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MWAITX instruction.
+///
+/// \param __extensions
+///Optional extensions for the monitoring state, which can vary by
+///processor.
+/// \param __hints
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwaitx(unsigned __extensions, unsigned __hints, unsigned __clock)
 {


Index: clang/lib/Headers/pmmintrin.h
===
--- clang/lib/Headers/pmmintrin.h
+++ clang/lib/Headers/pmmintrin.h
@@ -253,9 +253,12 @@
 ///the processor in the monitor event pending state. Data stored in the
 ///monitored address range causes the processor to exit the pending state.
 ///
+/// The \c MONITOR instruction can be used in kernel mode, and in other modes
+/// if MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MONITOR  instruction.
+/// This intrinsic corresponds to the \c MONITOR instruction.
 ///
 /// \param __p
 ///The memory range to be monitored. The size of the range is determined by
@@ -270,19 +273,22 @@
   __builtin_ia32_monitor(__p, __extensions, __hints);
 }
 
-/// Used with the MONITOR

[PATCH] D148653: [Header][doc] Add/revise MONITOR/MWAIT[X] descriptions

2023-04-18 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei, goldstein.w.n, craig.topper.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D148653

Files:
  clang/lib/Headers/mwaitxintrin.h
  clang/lib/Headers/pmmintrin.h


Index: clang/lib/Headers/pmmintrin.h
===
--- clang/lib/Headers/pmmintrin.h
+++ clang/lib/Headers/pmmintrin.h
@@ -253,9 +253,12 @@
 ///the processor in the monitor event pending state. Data stored in the
 ///monitored address range causes the processor to exit the pending state.
 ///
+/// The \c MONITOR instruction can be used in kernel mode, and in other modes
+/// if MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MONITOR  instruction.
+/// This intrinsic corresponds to the \c MONITOR instruction.
 ///
 /// \param __p
 ///The memory range to be monitored. The size of the range is determined by
@@ -270,19 +273,22 @@
   __builtin_ia32_monitor(__p, __extensions, __hints);
 }
 
-/// Used with the MONITOR instruction to wait while the processor is in
+/// Used with the \c MONITOR instruction to wait while the processor is in
 ///the monitor event pending state. Data stored in the monitored address
 ///range causes the processor to exit the pending state.
 ///
+/// The \c MWAIT instruction can be used in kernel mode, and in other modes if
+/// MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MWAIT  instruction.
+/// This intrinsic corresponds to the \c MWAIT instruction.
 ///
 /// \param __extensions
-///Optional extensions for the monitoring state, which may vary by
+///Optional extensions for the monitoring state, which can vary by
 ///processor.
 /// \param __hints
-///Optional hints for the monitoring state, which may vary by processor.
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwait(unsigned __extensions, unsigned __hints)
 {
Index: clang/lib/Headers/mwaitxintrin.h
===
--- clang/lib/Headers/mwaitxintrin.h
+++ clang/lib/Headers/mwaitxintrin.h
@@ -16,12 +16,41 @@
 
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__,  
__target__("mwaitx")))
+
+/// Establishes a linear address memory range to be monitored and puts
+///the processor in the monitor event pending state. Data stored in the
+///monitored address range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MONITORX instruction.
+///
+/// \param __p
+///The memory range to be monitored. The size of the range is determined by
+///CPUID function _0005h.
+/// \param __extensions
+///Optional extensions for the monitoring state.
+/// \param __hints
+///Optional hints for the monitoring state.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_monitorx(void * __p, unsigned __extensions, unsigned __hints)
 {
   __builtin_ia32_monitorx(__p, __extensions, __hints);
 }
 
+/// Used with the \c MONITORX instruction to wait while the processor is in
+///the monitor event pending state. Data stored in the monitored address
+///range causes the processor to exit the pending state.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c MWAITX instruction.
+///
+/// \param __extensions
+///Optional extensions for the monitoring state, which can vary by
+///processor.
+/// \param __hints
+///Optional hints for the monitoring state, which can vary by processor.
 static __inline__ void __DEFAULT_FN_ATTRS
 _mm_mwaitx(unsigned __extensions, unsigned __hints, unsigned __clock)
 {


Index: clang/lib/Headers/pmmintrin.h
===
--- clang/lib/Headers/pmmintrin.h
+++ clang/lib/Headers/pmmintrin.h
@@ -253,9 +253,12 @@
 ///the processor in the monitor event pending state. Data stored in the
 ///monitored address range causes the processor to exit the pending state.
 ///
+/// The \c MONITOR instruction can be used in kernel mode, and in other modes
+/// if MSR  C001_0015h[MonMwaitUserEn]  is set.
+///
 /// \headerfile 
 ///
-/// This intrinsic corresponds to the  MONITOR  instruction.
+/// This intrinsic corresponds to the \c MONITOR instruction.
 ///
 /// \param __p
 ///The memory range to be monitored. The size of the range is determined by
@@ -270,19 +273,22 @@
   __builtin_ia32_monitor(__p, __extensions, __hints);
 }
 
-/// Used with the MONITOR instruction to wait while the processor is in
+/// Used with the \c MONITOR instruction to wait while the processor is in
 ///the monitor event pending state. Data stored in the monitored address

[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-18 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

I chose to leave the "for each element" cases as-is, but I will keep your 
comments in mind as I go through other intrinsics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148021/new/

https://reviews.llvm.org/D148021

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-18 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0905c567f0c7: [Headers][doc] Add FMA intrinsic descriptions 
(authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D148021?vs=512461=514663#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148021/new/

https://reviews.llvm.org/D148021

Files:
  clang/lib/Headers/fmaintrin.h

Index: clang/lib/Headers/fmaintrin.h
===
--- clang/lib/Headers/fmaintrin.h
+++ clang/lib/Headers/fmaintrin.h
@@ -18,192 +18,756 @@
 #define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(128)))
 #define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(256)))
 
+/// Computes a multiply-add of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213PS instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend.
+/// \returns A 128-bit vector of [4 x float] containing the result.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ps(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddps((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a multiply-add of 128-bit vectors of [2 x double].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213PD instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend.
+/// \returns A 128-bit [2 x double] vector containing the result.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_pd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddpd((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a scalar multiply-add of the single-precision values in the
+///low 32 bits of 128-bit vectors of [4 x float].
+/// \code
+/// result[31:0] = (__A[31:0] * __B[31:0]) + __C[31:0]
+/// result[127:32] = __A[127:32]
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213SS instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand in the low
+///32 bits.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier in the low
+///32 bits.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend in the low
+///32 bits.
+/// \returns A 128-bit vector of [4 x float] containing the result in the low
+///32 bits and a copy of \a __A[127:32] in the upper 96 bits.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ss(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddss3((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a scalar multiply-add of the double-precision values in the
+///low 64 bits of 128-bit vectors of [2 x double].
+/// \code
+/// result[63:0] = (__A[63:0] * __B[63:0]) + __C[63:0]
+/// result[127:64] = __A[127:64]
+/// \endcode
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMADD213SD instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand in the low
+///64 bits.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier in the low
+///64 bits.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend in the low
+///64 bits.
+/// \returns A 128-bit vector of [2 x double] containing the result in the low
+///64 bits and a copy of \a __A[127:64] in the upper 64 bits.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_sd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddsd3((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a multiply-subtract of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) - __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the \c VFMSUB213PS instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the subtrahend.
+/// \returns A 128-bit vector of [4 x float] containing the

[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-13 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/Headers/fmaintrin.h:22
+/// Computes a multiply-add of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) + __C .
+///

pengfei wrote:
> We are using a special format to describute the function in a pseudo code to 
> share it with the intrinsic guide, e.g.,
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/avx512fintrin.h#L9604-L9610
> https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm512_i32logather_pd_expand=4077
> 
> There's no strong requirement to follow it, but it would be better to adopt a 
> uniform format.
Is a FOR loop with computed bit offsets really clearer than "For each element" 
? Is it valuable to repeat information that can be found in the instruction 
reference? 
I can accept answers of "yes" and "yes" because I am not someone who ever deals 
with vector data, but I would be a little surprised by those answers.




Comment at: clang/lib/Headers/fmaintrin.h:26
+///
+/// This intrinsic corresponds to the  VFMADD213PS  instruction.
+///

pengfei wrote:
> It would be odd to user given this is just 1/3 instructions the intrinsic may 
> generate, but I don't have a good idea here.
I listed the 213 version because that's the one that multiplies the first two 
operands, and the intrinsic multiplies the first two operands. So it's the 
instruction that most closely corresponds to the intrinsic.
We don't guarantee that the "corresponding" instruction is what is actually 
generated, in general. I know this point has come up before regarding intrinsic 
descriptions. My thinking is that the "corresponding instruction" gives the 
reader a place to look in the instruction reference manual, so listing only one 
is again okay.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148021/new/

https://reviews.llvm.org/D148021

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D148021: [Headers][doc] Add FMA intrinsic descriptions

2023-04-11 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei, goldstein.w.n.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D148021

Files:
  clang/lib/Headers/fmaintrin.h

Index: clang/lib/Headers/fmaintrin.h
===
--- clang/lib/Headers/fmaintrin.h
+++ clang/lib/Headers/fmaintrin.h
@@ -18,192 +18,724 @@
 #define __DEFAULT_FN_ATTRS128 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(128)))
 #define __DEFAULT_FN_ATTRS256 __attribute__((__always_inline__, __nodebug__, __target__("fma"), __min_vector_width__(256)))
 
+/// Computes a multiply-add of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213PS  instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend.
+/// \returns A 128-bit vector of [4 x float] containing the result.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ps(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddps((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a multiply-add of 128-bit vectors of [2 x double].
+///For each element, computes  (__A * __B) + __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213PD  instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend.
+/// \returns A 128-bit [2 x double] vector containing the result.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_pd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddpd((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a scalar multiply-add of the single-precision values in the
+///low 32 bits of 128-bit vectors of [4 x float]. \n
+/// result[31:0] = (__A[31:0] * __B[31:0]) + __C[31:0]  \n
+/// result[127:32] = __A[127:32] 
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213SS  instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand in the low
+///32 bits.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier in the low
+///32 bits.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the addend in the low
+///32 bits.
+/// \returns A 128-bit vector of [4 x float] containing the result in the low
+///32 bits and a copy of \a __A[127:32] in the upper 96 bits.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmadd_ss(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddss3((__v4sf)__A, (__v4sf)__B, (__v4sf)__C);
 }
 
+/// Computes a scalar multiply-add of the double-precision values in the
+///low 64 bits of 128-bit vectors of [2 x double]. \n
+/// result[63:0] = (__A[63:0] * __B[63:0]) + __C[63:0]  \n
+/// result[127:64] = __A[127:64] 
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMADD213SD  instruction.
+///
+/// \param __A
+///A 128-bit vector of [2 x double] containing the multiplicand in the low
+///64 bits.
+/// \param __B
+///A 128-bit vector of [2 x double] containing the multiplier in the low
+///64 bits.
+/// \param __C
+///A 128-bit vector of [2 x double] containing the addend in the low
+///64 bits.
+/// \returns A 128-bit vector of [2 x double] containing the result in the low
+///64 bits and a copy of \a __A[127:64] in the upper 64 bits.
 static __inline__ __m128d __DEFAULT_FN_ATTRS128
 _mm_fmadd_sd(__m128d __A, __m128d __B, __m128d __C)
 {
   return (__m128d)__builtin_ia32_vfmaddsd3((__v2df)__A, (__v2df)__B, (__v2df)__C);
 }
 
+/// Computes a multiply-subtract of 128-bit vectors of [4 x float].
+///For each element, computes  (__A * __B) - __C .
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  VFMSUB213PS  instruction.
+///
+/// \param __A
+///A 128-bit vector of [4 x float] containing the multiplicand.
+/// \param __B
+///A 128-bit vector of [4 x float] containing the multiplier.
+/// \param __C
+///A 128-bit vector of [4 x float] containing the subtrahend.
+/// \returns A 128-bit vector of [4 x float] containing the result.
 static __inline__ __m128 __DEFAULT_FN_ATTRS128
 _mm_fmsub_ps(__m128 __A, __m128 __B, __m128 __C)
 {
   return (__m128)__builtin_ia32_vfmaddps((__v4sf)__A, (__v4sf)__B, -(__v4sf)__C);
 }
 
+/// Computes a multiply-subtract of 128-bit vectors of [2 x double].
+///For each element, computes  (__A * __B) -

[PATCH] D147256: [DebugInfo] Fix file path separator when targeting windows.

2023-04-05 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

An LLVM code change should be testable on its own; this has it tested by Clang.
I think you need a new command-line option to set 
TargetOptions::UseTargetPathSeparator e.g. via llvm-mc. Other TargetOptions are 
handled this way.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147256/new/

https://reviews.llvm.org/D147256

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D147461: [Headers] Add some intrinsic function descriptions to immintrin.h

2023-04-04 Thread Paul Robinson via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa82170fa41ca: [Headers] Add some intrinsic function 
descriptions to immintrin.h. (authored by probinson).
Herald added a project: clang.

Changed prior to commit:
  https://reviews.llvm.org/D147461?vs=510568=510778#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147461/new/

https://reviews.llvm.org/D147461

Files:
  clang/lib/Headers/immintrin.h

Index: clang/lib/Headers/immintrin.h
===
--- clang/lib/Headers/immintrin.h
+++ clang/lib/Headers/immintrin.h
@@ -284,18 +284,45 @@
 
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__RDRND__)
+/// Returns a 16-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///A pointer to a 16-bit memory location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand16_step(unsigned short *__p)
 {
   return (int)__builtin_ia32_rdrand16_step(__p);
 }
 
+/// Returns a 32-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///A pointer to a 32-bit memory location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand32_step(unsigned int *__p)
 {
   return (int)__builtin_ia32_rdrand32_step(__p);
 }
 
+/// Returns a 64-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///A pointer to a 64-bit memory location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 #ifdef __x86_64__
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand64_step(unsigned long long *__p)
@@ -325,48 +352,108 @@
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__FSGSBASE__)
 #ifdef __x86_64__
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The lower 32 bits of the FS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u32(void)
 {
   return __builtin_ia32_rdfsbase32();
 }
 
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The contents of the FS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u64(void)
 {
   return __builtin_ia32_rdfsbase64();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The lower 32 bits of the GS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u32(void)
 {
   return __builtin_ia32_rdgsbase32();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The contents of the GS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u64(void)
 {
   return __builtin_ia32_rdgsbase64();
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u32(unsigned int __V)
 {
   __builtin_ia32_wrfsbase32(__V);
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u64(unsigned long long __V)
 {
   __builtin_ia32_wrfsbase64(__V);
 }
 
+/// Modifies the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRGSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the GS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))

[PATCH] D147461: [Headers] Add some intrinsic function descriptions to immintrin.h

2023-04-03 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

FTR, I'll be working my way through a bunch of intrinsics over the next month 
or so, trying not to do too many at once. Mostly AVX2 but also some others.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147461/new/

https://reviews.llvm.org/D147461

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D147461: [Headers] Add some intrinsic function descriptions to immintrin.h

2023-04-03 Thread Paul Robinson via Phabricator via cfe-commits

probinson created this revision.
probinson added reviewers: RKSimon, pengfei.
Herald added a project: All.
probinson requested review of this revision.

https://reviews.llvm.org/D147461

Files:
  clang/lib/Headers/immintrin.h

Index: clang/lib/Headers/immintrin.h
===
--- clang/lib/Headers/immintrin.h
+++ clang/lib/Headers/immintrin.h
@@ -284,18 +284,45 @@
 
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__RDRND__)
+/// Returns a 16-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///Pointer to a 16-bit location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand16_step(unsigned short *__p)
 {
   return (int)__builtin_ia32_rdrand16_step(__p);
 }
 
+/// Returns a 32-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///Pointer to a 32-bit location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand32_step(unsigned int *__p)
 {
   return (int)__builtin_ia32_rdrand32_step(__p);
 }
 
+/// Returns a 64-bit hardware-generated random value.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDRAND  instruction.
+///
+/// \param __p
+///Pointer to a 64-bit location to place the random value.
+/// \returns 1 if the value was successfully generated, 0 otherwise.
 #ifdef __x86_64__
 static __inline__ int __attribute__((__always_inline__, __nodebug__, __target__("rdrnd")))
 _rdrand64_step(unsigned long long *__p)
@@ -325,48 +352,108 @@
 #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||  \
 defined(__FSGSBASE__)
 #ifdef __x86_64__
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The lower 32 bits of the FS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u32(void)
 {
   return __builtin_ia32_rdfsbase32();
 }
 
+/// Reads the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDFSBASE  instruction.
+///
+/// \returns The contents of the FS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readfsbase_u64(void)
 {
   return __builtin_ia32_rdfsbase64();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The lower 32 bits of the GS base register.
 static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u32(void)
 {
   return __builtin_ia32_rdgsbase32();
 }
 
+/// Reads the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  RDGSBASE  instruction.
+///
+/// \returns The contents of the GS base register.
 static __inline__ unsigned long long __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _readgsbase_u64(void)
 {
   return __builtin_ia32_rdgsbase64();
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u32(unsigned int __V)
 {
   __builtin_ia32_wrfsbase32(__V);
 }
 
+/// Modifies the FS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for the FS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writefsbase_u64(unsigned long long __V)
 {
   __builtin_ia32_wrfsbase64(__V);
 }
 
+/// Modifies the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRGSBASE  instruction.
+///
+/// \param __V
+///Value to use for the lower 32 bits of the GS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))
 _writegsbase_u32(unsigned int __V)
 {
   __builtin_ia32_wrgsbase32(__V);
 }
 
+/// Modifies the GS base register.
+///
+/// \headerfile 
+///
+/// This intrinsic corresponds to the  WRFSBASE  instruction.
+///
+/// \param __V
+///Value to use for GS base register.
 static __inline__ void __attribute__((__always_inline__, __nodebug__, __target__("fsgsbase")))

[PATCH] D147256: [DebugInfo] Fix file path separator when targeting windows.

2023-03-31 Thread Paul Robinson via Phabricator via cfe-commits

probinson added subscribers: debug-info, probinson.
probinson added a comment.

I think we cannot be 100% sure about source paths in a cross-compile situation. 
Cross-compiling on platform A targeting platform B does not mean your sources 
and debugger UI are on platform B. My users keep source and debugger UI on 
platform A, debugging target B remotely. We need to preserve the host 
pathnames. It is not clear to me that this patch does so.

Comment at: llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp:787

   StringRef PathRef(Asm->TM.Options.ObjectFilenameForDebug);
   llvm::SmallString<256> PathStore(PathRef);

zequanwu wrote:
> hans wrote:
> > This handles codeview. Does anything need to be done for dwarf on windows? 
> > mstorsjo might have input on that.
> It looks like `TM.Options.ObjectFilenameForDebug` is only used for codeview. 
> I guess dwarf doesn't store the object file path.
Right, DWARF only stores the source path.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147256/new/

https://reviews.llvm.org/D147256

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141824: [clang-repl] Add a command to load dynamic libraries

2023-03-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D141824#4231372 , @probinson wrote:

> In D141824#4229953 , @argentite 
> wrote:
>
>> Just to confirm, `UNSUPPORTED: target=x86_64-scei-ps4` should be enough, 
>> right?
>
> `UNSUPPORTED: target={{.*-(ps4|ps5)}}` please.

Oh wait, @dyung says this should work on PS5.  Yes, your original suggestion is 
good.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141824/new/

https://reviews.llvm.org/D141824

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141824: [clang-repl] Add a command to load dynamic libraries

2023-03-29 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

In D141824#4229953 , @argentite wrote:

> Just to confirm, `UNSUPPORTED: target=x86_64-scei-ps4` should be enough, 
> right?

`UNSUPPORTED: target={{.*-(ps4|ps5)}}` please.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141824/new/

https://reviews.llvm.org/D141824

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D144870: [Clang][DebugInfo] Emit zero size bitfields in the debug info to delimit bitfields in different allocation units.

2023-03-27 Thread Paul Robinson via Phabricator via cfe-commits

probinson accepted this revision.
probinson added a comment.
This revision is now accepted and ready to land.

One entirely optional suggestion on the test. LGTM.




Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:1563
+
+  assert(PreviousBitfield->isBitField());
+

jmmartinez wrote:
> probinson wrote:
> > Is this true for cases like
> > ```
> > struct nonadjacent {
> >   char a : 8;
> >   char : 0;
> >   int b;
> >   char d : 8;
> > };
> > ```
> > where the field `d` has a predecessor that is not a bitfield? (This might 
> > be my ignorance of how Decls are put together, but asserting that `advance` 
> > is guaranteed to get you a bitfield just seems a little odd.)
> In that case the assert is never reached.
> 
> When emiting the debug-info for `d`, when looking at the metadata generated 
> for the previous field the function should exit early on this condition:
> ```
>   if (!PreviousMDField || !PreviousMDField->isBitField() ||
>   PreviousMDField->getSizeInBits() == 0)
> return nullptr;
> ```
Ah, you are right. Thanks!



Comment at: clang/test/CodeGen/debug-info-bitfield-0-struct.c:40
+struct SecondDuplicate {
+  // BOTH-DAG: ![[SECONDD:[0-9]+]] = distinct !DICompositeType(tag: 
DW_TAG_structure_type, name: "SecondDuplicate", file: !{{[0-9]+}}, line: 
{{[0-9]+}}, size: 64, elements: ![[SECONDD_ELEMENTS:[0-9]+]])
+

Maybe use something like `SECONDDUP` instead of `SECONDD` which can look like a 
typo.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144870/new/

https://reviews.llvm.org/D144870

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D144870: [Clang][DebugInfo] Emit zero size bitfields in the debug info to delimit bitfields in different allocation units.

2023-03-24 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Still one question, and haven't dug into the test in detail yet.




Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:1552
+
+  auto *PreviousMDEntry =
+  PreviousFieldsDI.empty() ? nullptr : PreviousFieldsDI.back();

Maybe a comment here,
`// If we already emitted metadata for a 0-length bitfield, nothing to do here.`
(I was briefly confused by "if the previous field is a 0-length bitfield, do 
nothing" until I realized this is looking at the metadata not the decl.



Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:1563
+
+  assert(PreviousBitfield->isBitField());
+

Is this true for cases like
```
struct nonadjacent {
  char a : 8;
  char : 0;
  int b;
  char d : 8;
};
```
where the field `d` has a predecessor that is not a bitfield? (This might be my 
ignorance of how Decls are put together, but asserting that `advance` is 
guaranteed to get you a bitfield just seems a little odd.)



Comment at: clang/lib/CodeGen/CGDebugInfo.h:324
 
-  /// Create new bit field member.
-  llvm::DIType *createBitFieldType(const FieldDecl *BitFieldDecl,
-   llvm::DIScope *RecordTy,
-   const RecordDecl *RD);
+  /// Create new bit field member
+  llvm::DIDerivedType *createBitFieldType(const FieldDecl *BitFieldDecl,

Please keep that final `.`



Comment at: clang/lib/CodeGen/CGDebugInfo.h:330
+  /// Create an anonnymous zero-size separator for bit-field-decl if needed on
+  /// the target
+  llvm::DIDerivedType *createBitFieldSeparatorIfNeeded(




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144870/new/

https://reviews.llvm.org/D144870

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D145803: [clang][DebugInfo] Emit DW_AT_type of preferred name if available

2023-03-24 Thread Paul Robinson via Phabricator via cfe-commits

probinson added subscribers: wolfgangp, probinson.
probinson added a comment.

This is pretty different from the "always desugar to the canonical type" habit 
that has always been in place. Sony has done some downstream things to try to 
work around that in the past. @wolfgangp will remember it better than I do, but 
I think we make some effort to preserve the type-as-written. This goes in 
completely the other direction.




Comment at: clang/test/CodeGen/preferred_name.cpp:49
+
+Foo> varFooInt;
+

This doesn't become `Foo` ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145803/new/

https://reviews.llvm.org/D145803

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D146802: [Documentation] improved documentation of diagnostic messages by explaining thier syntax and test of clang by telling which subobject is uninitialized

2023-03-24 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

You are combining documentation of the syntax for defining diagnostics, and 
changes to the content of certain diagnostics. The LLVM project wants to see 
one topic per patch, so these things need to be done separately.

Also, the documentation probably does not want to live in one of the many .td 
files that define diagnostics. It wants to go somewhere generic, and I'm not 
sure what a good place would be.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146802/new/

https://reviews.llvm.org/D146802

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D144870: [Clang][DebugInfo] Emit zero size bitfields in the debug info to delimit bitfields in different allocation units.

2023-03-17 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

Is it possible you need to look only at the immediately preceding field, and 
not iterate? For example,

  struct zero_bitfield {
char a : 8;
char : 0;
char b : 8;
char c : 8;
  };

If processing `b` sees the zero-length bitfield and does the needful, then when 
processing `c` it's sufficient to see that `b` is a preceding bitfield and know 
that the needful has been done.

I'm now also curious whether this ABI aspect affects non-bitfields. For example:

  struct non_adjacent_bitfields {
char d;
char : 0;
char e;
char f : 8;
  };

(1) Is `e` affected by the presence of the zero-length bitfield? (2) is `f` 
affected? (3) If the answers are "no" and "yes" then you do need to iterate 
looking for the zero-length bitfield, but otherwise I think it's sufficient to 
look only at the preceding field.

(Of course it's possible that finding the preceding field can be done only by 
iterating, but I'd hope that isn't necessary.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144870/new/

https://reviews.llvm.org/D144870

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D144870: [Clang][DebugInfo] Emit zero size bitfields in the debug info to delimit bitfields in different allocation units.

2023-03-14 Thread Paul Robinson via Phabricator via cfe-commits

probinson added inline comments.



Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:1558
+  EmitSeparator = FieldIt->isBitField();
+  }
+

I might not be following this correctly, but it feels like EmitSeparator will 
end up true if the last field is a bitfield, even if there are no zero-length 
bitfields in front of it. The test does not cover this case (to show that the 
no-zero-bitfields case is handled properly).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144870/new/

https://reviews.llvm.org/D144870

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D145173: Make section attribute and -ffunction-sections play nicely

2023-03-07 Thread Paul Robinson via Phabricator via cfe-commits

probinson abandoned this revision.
probinson added a comment.

I think the GC behavior with explicit section names is currently a little 
peculiar. For functions without a section name, -ffunction-sections allows GC 
to happen at the individual function level. With a section name, GC would 
happen at the level of all-or-nothing per input file, regardless of 
-ffunction-sections. That just seems unexpected and inconsistent to me.
But it is the current behavior, and given the objections it's not worth 
pursuing this.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145173/new/

https://reviews.llvm.org/D145173

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D143745: Make section attribute and -ffunction-sections play nicely

2023-03-03 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a comment.

See D145173  for a different tactic to solve 
this.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143745/new/

https://reviews.llvm.org/D143745

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D145271: [MSVC compatibility][DLLEXPORT/DLLIMPORT] Allow dllexport/dllimport for local classes

2023-03-03 Thread Paul Robinson via Phabricator via cfe-commits

probinson added a subscriber: cfe-commits.
probinson added a project: clang.
probinson added a comment.

I've looked at this but I'd like someone more in tune with MSVC behavior to 
review as well.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145271/new/

https://reviews.llvm.org/D145271

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

1 2 3 4 5 6 >

1 - 100 of 558 matches

Mail list logo