https://github.com/PimpalkarNeha created 
https://github.com/llvm/llvm-project/pull/200392

With -fpointer-tbaa enabled, Clang emitted distinct pointer TBAA tags such as
"pN omnipotent char" for pointers whose scalar base is omnipotent char
(char, signed char, unsigned char) as siblings of tags like "pN int" under
the same parent "any pN pointer". In LLVM TBAA, sibling nodes imply NoAlias,
but in C these char types may alias any type. A store through char*** and a
load through int*** on the same memory could therefore be treated as
non-aliasing.
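
As a rough illustration of the sibling rule described above, here is a toy
model of a TBAA type tree. It is a sketch, not LLVM's implementation: the
TbaaNode struct and the mayAlias/isSelfOrAncestor helpers are hypothetical
names, and only the depth-2 fragment mentioned in this description is modeled.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a scalar TBAA type node: a name plus a parent link.
// Hypothetical; LLVM stores these as metadata nodes, not C++ structs.
struct TbaaNode {
  std::string name;
  const TbaaNode *parent;
};

// Returns true if `anc` is `n` itself or one of its ancestors.
bool isSelfOrAncestor(const TbaaNode *anc, const TbaaNode *n) {
  for (; n != nullptr; n = n->parent)
    if (n == anc)
      return true;
  return false;
}

// Two access types may alias only when one is reachable from the other by
// walking parent links; unrelated siblings are treated as NoAlias.
bool mayAlias(const TbaaNode *a, const TbaaNode *b) {
  return isSelfOrAncestor(a, b) || isSelfOrAncestor(b, a);
}

// Builds the pre-fix depth-2 fragment and checks the two relationships.
void demoPreFixTree() {
  TbaaNode anyP2{"any p2 pointer", nullptr};
  TbaaNode p2Int{"p2 int", &anyP2};
  TbaaNode p2Char{"p2 omnipotent char", &anyP2};
  assert(!mayAlias(&p2Char, &p2Int)); // siblings: reported as NoAlias
  assert(mayAlias(&anyP2, &p2Int));   // ancestor: may alias
}
```

In this model the char-derived tag and "p2 int" never alias because neither is
an ancestor of the other, while an access tagged with the shared parent does.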

This could allow GVN and other alias-based optimizations to eliminate or fold
the load incorrectly (for example to undef), leading to miscompilation or a
runtime crash. The issue was observed when building SPEC CPU2000 175.vpr at
-O3 -flto with -fpointer-tbaa.

**Root cause:** for char-based pointers, pointer TBAA created a separate leaf
("pN omnipotent char") instead of using the universal pointer node at that
depth, which incorrectly placed char-derived and int-derived accesses in a
sibling relationship.

**Fix:** when the builtin base resolves to omnipotent char (getChar()), return
getAnyPtr(PtrDepth), consistent with void pointers. getAnyPtr() is the
universal alias node at depth N, so char-derived pointer accesses may alias
other pointer accesses at that depth via an ancestor relationship rather
than as siblings.
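
The node selection after the fix can be sketched with a small string-based
model. This is an illustration only: pointerTagName and anyPtrName are
hypothetical helpers, not Clang functions, and the depth-1 name
"any pointer" is an assumption (the description and test only show the
"any pN pointer" form).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: name of the universal pointer node at a given depth.
// The depth-1 spelling "any pointer" is an assumption.
std::string anyPtrName(unsigned depth) {
  if (depth == 1)
    return "any pointer";
  return "any p" + std::to_string(depth) + " pointer";
}

// Toy model of the post-fix choice: a builtin base that resolves to
// omnipotent char maps to the universal node instead of a distinct leaf.
std::string pointerTagName(const std::string &base, unsigned depth) {
  if (base == "omnipotent char")
    return anyPtrName(depth);
  return "p" + std::to_string(depth) + " " + base;
}

// Checks the two cases discussed in this description.
void demoTagNames() {
  assert(pointerTagName("int", 2) == "p2 int");
  assert(pointerTagName("omnipotent char", 2) == "any p2 pointer");
}
```

With this mapping, a char-derived access at depth 2 carries the parent of
"p2 int" rather than a sibling of it, so the two are related by ancestry.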

Add clang/test/CodeGen/tbaa-char-derived-pointers.c to verify that char-derived
pointer accesses no longer emit "pN omnipotent char" TBAA nodes.

From 933de211eb597e65bc83fffeb84216e328deb413 Mon Sep 17 00:00:00 2001
From: Neha Pimpalkar <[email protected]>
Date: Fri, 29 May 2026 17:58:27 +0530
Subject: [PATCH] Fix pointer TBAA for char-derived pointer types

---
 clang/lib/CodeGen/CodeGenTBAA.cpp             |  3 +++
 .../test/CodeGen/tbaa-char-derived-pointers.c | 23 +++++++++++++++++++
 2 files changed, 26 insertions(+)
 create mode 100644 clang/test/CodeGen/tbaa-char-derived-pointers.c

diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index 1854df7c7c0f1..fd596208c5d6e 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -286,6 +286,9 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
     SmallString<256> TyName;
     if (isa<BuiltinType>(Ty)) {
       llvm::MDNode *ScalarMD = getTypeInfoHelper(Ty);
+      // char-derived pointers alias everything; use the universal pointer node.
+      if (ScalarMD == getChar())
+        return getAnyPtr(PtrDepth);
       StringRef Name =
           cast<llvm::MDString>(
               ScalarMD->getOperand(CodeGenOpts.NewStructPathTBAA ? 2 : 0))
diff --git a/clang/test/CodeGen/tbaa-char-derived-pointers.c b/clang/test/CodeGen/tbaa-char-derived-pointers.c
new file mode 100644
index 0000000000000..6c18a8a74a1b2
--- /dev/null
+++ b/clang/test/CodeGen/tbaa-char-derived-pointers.c
@@ -0,0 +1,23 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,OLD
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -O2 -disable-llvm-passes -emit-llvm -new-struct-path-tbaa -o - %s | FileCheck %s --check-prefixes=CHECK,NEW
+//
+// char-derived pointers must use the generic "any pN pointer" TBAA node, not a
+// distinct "pN omnipotent char" sibling of "pN int" (which would imply NoAlias).
+
+void char_int_ptr_cast(int ***p) {
+  char ***c = (char ***)p;
+  *c = 0;
+  (void)*p;
+}
+
+// CHECK-LABEL: define dso_local void @char_int_ptr_cast(
+// CHECK: store ptr null, ptr {{.*}}, !tbaa !{{[0-9]+}}
+
+// OLD-DAG: !{{[0-9]+}} = !{!"p2 int", !{{[0-9]+}}, i64 0}
+// OLD-DAG: !{{[0-9]+}} = !{!"any p2 pointer", !{{[0-9]+}}, i64 0}
+
+// NEW-DAG: !{{[0-9]+}} = !{!{{[0-9]+}}, i64 8, !"p2 int"}
+// NEW-DAG: !{{[0-9]+}} = !{!{{[0-9]+}}, i64 8, !"any p2 pointer"}
+
+// CHECK-NOT: "p2 omnipotent char"
+// CHECK-NOT: "p3 omnipotent char"
