llvmorg-github-actions[bot] wrote:

<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang-codegen

Author: PimpalkarNeha

<details>
<summary>Changes</summary>

With -fpointer-tbaa enabled, Clang emitted distinct pointer TBAA tags such as
"pN omnipotent char" for pointers whose scalar base is omnipotent char
(char, signed char, unsigned char) as siblings of tags like "pN int" under
the same parent "any pN pointer". In LLVM TBAA, sibling nodes imply NoAlias,
but in C these char types may alias any type. A store through char*** and a
load through int*** on the same memory could therefore be treated as
non-aliasing.

This could allow GVN and other alias-based optimizations to eliminate or fold
the load incorrectly (for example to undef), leading to miscompilation or a
runtime crash. The issue was observed when building SPEC CPU2000 175.vpr at
-O3 -flto with -fpointer-tbaa.

**Root cause:** char-based pointer TBAA created a separate leaf ("pN omnipotent
char") instead of the universal pointer node at that depth, incorrectly
placing char-derived and int-derived accesses in a sibling relationship.

**Fix:** when the builtin base resolves to omnipotent char (getChar()), return
getAnyPtr(PtrDepth), consistent with void pointers. getAnyPtr() is the
universal alias node at depth N, so char-derived pointer accesses may alias
other pointer accesses at that depth via an ancestor relationship rather
than as siblings.
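The sibling-versus-ancestor distinction described above can be illustrated with a small toy model. This is only a sketch, not Clang's or LLVM's implementation: `TbaaNode`, `mayAlias`, `aliasBeforeFix`, and `aliasAfterFix` are hypothetical names invented here. The node names are taken from the test in this PR, and the tree is cut off at "any p2 pointer" for brevity. The aliasing rule is simplified to "two scalar access types may alias only if one is the other or an ancestor of it".

```cpp
#include <cassert>
#include <string>

// Hypothetical toy model of a scalar TBAA type node: a name and a parent link.
struct TbaaNode {
  std::string name;
  const TbaaNode *parent; // nullptr marks the top of this truncated tree
};

// Returns true if `anc` is `n` itself or appears on n's parent chain.
bool isSelfOrAncestor(const TbaaNode *anc, const TbaaNode *n) {
  for (; n != nullptr; n = n->parent)
    if (n == anc)
      return true;
  return false;
}

// Simplified rule (an assumption of this sketch): two accesses may alias
// only if one access type is the other or an ancestor of the other.
bool mayAlias(const TbaaNode *a, const TbaaNode *b) {
  return isSelfOrAncestor(a, b) || isSelfOrAncestor(b, a);
}

// Before the fix: the char-derived access gets its own leaf, which is a
// sibling of "p2 int" under "any p2 pointer".
bool aliasBeforeFix() {
  TbaaNode anyP2{"any p2 pointer", nullptr};
  TbaaNode p2Int{"p2 int", &anyP2};
  TbaaNode p2Char{"p2 omnipotent char", &anyP2};
  return mayAlias(&p2Char, &p2Int);
}

// After the fix: the char-derived access is tagged with "any p2 pointer"
// itself, which is an ancestor of "p2 int".
bool aliasAfterFix() {
  TbaaNode anyP2{"any p2 pointer", nullptr};
  TbaaNode p2Int{"p2 int", &anyP2};
  return mayAlias(&anyP2, &p2Int);
}
```

Under this model, `aliasBeforeFix()` reports that the two accesses cannot alias, while `aliasAfterFix()` reports that they may. That is the behavioral change the fix is aiming for.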

Add clang/test/CodeGen/tbaa-char-derived-pointers.c to verify that char-derived
pointer accesses no longer emit "pN omnipotent char" TBAA nodes.

---
Full diff: https://github.com/llvm/llvm-project/pull/200392.diff


2 Files Affected:

- (modified) clang/lib/CodeGen/CodeGenTBAA.cpp (+3) 
- (added) clang/test/CodeGen/tbaa-char-derived-pointers.c (+23) 


``````````diff
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index 1854df7c7c0f1..fd596208c5d6e 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -286,6 +286,9 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
     SmallString<256> TyName;
     if (isa<BuiltinType>(Ty)) {
       llvm::MDNode *ScalarMD = getTypeInfoHelper(Ty);
+      // char-derived pointers alias everything; use the universal pointer node.
+      if (ScalarMD == getChar())
+        return getAnyPtr(PtrDepth);
       StringRef Name =
           cast<llvm::MDString>(
               ScalarMD->getOperand(CodeGenOpts.NewStructPathTBAA ? 2 : 0))
diff --git a/clang/test/CodeGen/tbaa-char-derived-pointers.c b/clang/test/CodeGen/tbaa-char-derived-pointers.c
new file mode 100644
index 0000000000000..6c18a8a74a1b2
--- /dev/null
+++ b/clang/test/CodeGen/tbaa-char-derived-pointers.c
@@ -0,0 +1,23 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,OLD
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -O2 -disable-llvm-passes -emit-llvm -new-struct-path-tbaa -o - %s | FileCheck %s --check-prefixes=CHECK,NEW
+//
+// char-derived pointers must use the generic "any pN pointer" TBAA node, not a
+// distinct "pN omnipotent char" sibling of "pN int" (which would imply NoAlias).
+
+void char_int_ptr_cast(int ***p) {
+  char ***c = (char ***)p;
+  *c = 0;
+  (void)*p;
+}
+
+// CHECK-LABEL: define dso_local void @char_int_ptr_cast(
+// CHECK: store ptr null, ptr {{.*}}, !tbaa !{{[0-9]+}}
+
+// OLD-DAG: !{{[0-9]+}} = !{!"p2 int", !{{[0-9]+}}, i64 0}
+// OLD-DAG: !{{[0-9]+}} = !{!"any p2 pointer", !{{[0-9]+}}, i64 0}
+
+// NEW-DAG: !{{[0-9]+}} = !{!{{[0-9]+}}, i64 8, !"p2 int"}
+// NEW-DAG: !{{[0-9]+}} = !{!{{[0-9]+}}, i64 8, !"any p2 pointer"}
+
+// CHECK-NOT: !{!"p2 omnipotent char"
+// CHECK-NOT: !{!"p3 omnipotent char"

``````````

</details>


https://github.com/llvm/llvm-project/pull/200392