[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-25 Thread Matthias Braun via Phabricator via cfe-commits
MatzeB added a comment.

In D154658#4533476 , @MatzeB wrote:

> This change results in some of our builds (distributed Thin-LTO in case that 
> matters) to fail with missing symbols. At a first glance this seems to emit 
> VTables in some files where it didn't do this before and then fails to 
> resolve some members of that vtable. I'm in the process of analyzing this 
> further and making a small reproducer.

I just noticed b6847edfc235829b37dd6d734ef5bbfa0a58b6fc 
 mentioned 
in the task above which isn't part of our builds yet. I see even symbols 
emitted with that change and suspect it may resolves the undefined symbols 
problems then. Will know for sure in a day or two (and if you hear nothing all 
is working well).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-25 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

In D154658#4533476 , @MatzeB wrote:

> This change results in some of our builds (distributed Thin-LTO in case that 
> matters) to fail with missing symbols. At a first glance this seems to emit 
> VTables in some files where it didn't do this before and then fails to 
> resolve some members of that vtable. I'm in the process of analyzing this 
> further and making a small reproducer.

Please try again after rGb6847edfc235829b37dd6d734ef5bbfa0a58b6fc 
 and see 
if you're still having issues.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-25 Thread Matthias Braun via Phabricator via cfe-commits
MatzeB added a comment.

This change results in some of our builds (distributed Thin-LTO in case that 
matters) to fail with missing symbols. At a first glance this seems to emit 
VTables in some files where it didn't do this before and then fails to resolve 
some members of that vtable. I'm in the process of analyzing this further and 
making a small reproducer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-24 Thread Nico Weber via Phabricator via cfe-commits
thakis added a comment.

This caused https://github.com/llvm/llvm-project/issues/64088


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-21 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9d525bf94b25: Optimize emission of `dynamic_cast` to final 
classes. (authored by rsmith).

Changed prior to commit:
  https://reviews.llvm.org/D154658?vs=542693=543133#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ExprCXX.cpp
  clang/lib/CodeGen/CGCXXABI.h
  clang/lib/CodeGen/CGExprCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCXX/dynamic-cast-always-null.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact.cpp
  clang/test/Driver/fassume-unique-vtables.cpp

Index: clang/test/Driver/fassume-unique-vtables.cpp
===
--- /dev/null
+++ clang/test/Driver/fassume-unique-vtables.cpp
@@ -0,0 +1,4 @@
+// RUN: %clang -### -fno-assume-unique-vtables %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT %s
+// RUN: %clang -### -fno-assume-unique-vtables -fassume-unique-vtables %s -S 2>&1 | FileCheck -check-prefix=CHECK-NOOPT %s
+// CHECK-OPT: "-fno-assume-unique-vtables"
+// CHECK-NOOPT-NOT: "-fno-assume-unique-vtables"
Index: clang/test/CodeGenCXX/dynamic-cast-exact.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact.cpp
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions -std=c++11 -o - | FileCheck %s --implicit-check-not='call {{.*}} @__dynamic_cast'
+struct Offset { virtual ~Offset(); };
+struct A { virtual ~A(); };
+struct B final : Offset, A { };
+
+struct C { virtual ~C(); int c; };
+struct D : A { int d; };
+struct E : A { int e; };
+struct F : virtual A { int f; };
+struct G : virtual A { int g; };
+struct H final : C, D, E, F, G { int h; };
+
+// CHECK-LABEL: @_Z7inexactP1A
+C *inexact(A *a) {
+  // CHECK: call {{.*}} @__dynamic_cast
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z12exact_singleP1A
+B *exact_single(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z9exact_refR1A
+B _ref(A ) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: call {{.*}} @__cxa_bad_cast
+  // CHECK: unreachable
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: ret ptr %[[RESULT]]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z11exact_multiP1A
+H *exact_multi(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[OFFSET_TO_TOP_SLOT:.*]] = getelementptr inbounds i64, ptr %[[VPTR]], i64 -2
+  // CHECK: %[[OFFSET_TO_TOP:.*]] = load i64, ptr %[[OFFSET_TO_TOP_SLOT]]
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 %[[OFFSET_TO_TOP]]
+  // CHECK: %[[DERIVED_VPTR:.*]] = load ptr, ptr %[[RESULT]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[DERIVED_VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 0, i32 3)
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return 

[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-21 Thread John McCall via Phabricator via cfe-commits
rjmccall accepted this revision.
rjmccall added a comment.
This revision is now accepted and ready to land.

Thanks, LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-20 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

In D154658#4516554 , @rjmccall wrote:

> LGTM, except, should we have a way to turn this optimization off specifically?

Sure. I suppose even for an internal linkage vtable there could be a reason to 
want to turn this off (eg, if someone constructs a custom vtable at runtime). 
I've added `-fno-assume-unique-vtables` to disable the optimization.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-20 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith updated this revision to Diff 542693.
rsmith added a comment.
Herald added a subscriber: MaskRay.

- Add option to disable vptr-based dynamic_cast optimization.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/AST/ExprCXX.cpp
  clang/lib/CodeGen/CGCXXABI.h
  clang/lib/CodeGen/CGExprCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCXX/dynamic-cast-always-null.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact.cpp
  clang/test/Driver/fassume-unique-vtables.cpp

Index: clang/test/Driver/fassume-unique-vtables.cpp
===
--- /dev/null
+++ clang/test/Driver/fassume-unique-vtables.cpp
@@ -0,0 +1,4 @@
+// RUN: %clang -### -fno-assume-unique-vtables %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT %s
+// RUN: %clang -### -fno-assume-unique-vtables -fassume-unique-vtables %s -S 2>&1 | FileCheck -check-prefix=CHECK-NOOPT %s
+// CHECK-OPT: "-fno-assume-unique-vtables"
+// CHECK-NOOPT-NOT: "-fno-assume-unique-vtables"
Index: clang/test/CodeGenCXX/dynamic-cast-exact.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact.cpp
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions -std=c++11 -o - | FileCheck %s --implicit-check-not='call {{.*}} @__dynamic_cast'
+struct Offset { virtual ~Offset(); };
+struct A { virtual ~A(); };
+struct B final : Offset, A { };
+
+struct C { virtual ~C(); int c; };
+struct D : A { int d; };
+struct E : A { int e; };
+struct F : virtual A { int f; };
+struct G : virtual A { int g; };
+struct H final : C, D, E, F, G { int h; };
+
+// CHECK-LABEL: @_Z7inexactP1A
+C *inexact(A *a) {
+  // CHECK: call {{.*}} @__dynamic_cast
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z12exact_singleP1A
+B *exact_single(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z9exact_refR1A
+B _ref(A ) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: call {{.*}} @__cxa_bad_cast
+  // CHECK: unreachable
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: ret ptr %[[RESULT]]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z11exact_multiP1A
+H *exact_multi(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[OFFSET_TO_TOP_SLOT:.*]] = getelementptr inbounds i64, ptr %[[VPTR]], i64 -2
+  // CHECK: %[[OFFSET_TO_TOP:.*]] = load i64, ptr %[[OFFSET_TO_TOP_SLOT]]
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 %[[OFFSET_TO_TOP]]
+  // CHECK: %[[DERIVED_VPTR:.*]] = load ptr, ptr %[[RESULT]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[DERIVED_VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 0, i32 3)
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
Index: clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp

[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-19 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

LGTM, except, should we have a way to turn this optimization off specifically?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-19 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

Ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-12 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

In D154658#4490468 , @rjmccall wrote:

> Oh, this does matter on platforms using pointer authentication, because each 
> vptr field is signed with a constant discriminator derived from the name of 
> the class that introduced it.

Oh, hm. I guess you have an out-of-tree patch for that? I've switched back to 
passing types into `GetVTablePtr`, and stopped using that from 
`emitExactDynamicCast` in favor of directly loading the vtpr after the 
`dynamic_cast`. You would need to disable the exact `dynamic_cast` 
optimization when pointer authentication is enabled for vptrs, but that should 
be a small patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-12 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith updated this revision to Diff 539776.
rsmith marked an inline comment as done.
rsmith added a comment.
Herald added a reviewer: jdoerfert.
Herald added subscribers: jplehr, sstefan1.

- Avoid emitting the type_info when detecting whether it would be null.
- Bring back class type in GetVTablePtr and instead don't use it here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

Files:
  clang/lib/AST/ExprCXX.cpp
  clang/lib/CodeGen/CGCXXABI.h
  clang/lib/CodeGen/CGExprCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/test/CodeGenCXX/dynamic-cast-always-null.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact.cpp

Index: clang/test/CodeGenCXX/dynamic-cast-exact.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact.cpp
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions -std=c++11 -o - | FileCheck %s --implicit-check-not='call {{.*}} @__dynamic_cast'
+struct Offset { virtual ~Offset(); };
+struct A { virtual ~A(); };
+struct B final : Offset, A { };
+
+struct C { virtual ~C(); int c; };
+struct D : A { int d; };
+struct E : A { int e; };
+struct F : virtual A { int f; };
+struct G : virtual A { int g; };
+struct H final : C, D, E, F, G { int h; };
+
+// CHECK-LABEL: @_Z7inexactP1A
+C *inexact(A *a) {
+  // CHECK: call {{.*}} @__dynamic_cast
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z12exact_singleP1A
+B *exact_single(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z9exact_refR1A
+B _ref(A ) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: call {{.*}} @__cxa_bad_cast
+  // CHECK: unreachable
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: ret ptr %[[RESULT]]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z11exact_multiP1A
+H *exact_multi(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[OFFSET_TO_TOP_SLOT:.*]] = getelementptr inbounds i64, ptr %[[VPTR]], i64 -2
+  // CHECK: %[[OFFSET_TO_TOP:.*]] = load i64, ptr %[[OFFSET_TO_TOP_SLOT]]
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 %[[OFFSET_TO_TOP]]
+  // CHECK: %[[DERIVED_VPTR:.*]] = load ptr, ptr %[[RESULT]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[DERIVED_VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 0, i32 3)
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
Index: clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
@@ -0,0 +1,13 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,EXACT
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -fvisibility=hidden -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,INEXACT
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -fapple-kext -emit-llvm -std=c++11 -o - | 

[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-11 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In D154658#4482202 , @rsmith wrote:

> In D154658#4481225 , @rjmccall 
> wrote:
>
>> In D154658#4479213 , @rsmith wrote:
>>
>>> I think (hope?) we should be able to apply this to a much larger set of 
>>> cases. Would it be correct to do this optimization unless the vtable might 
>>> be emitted with vague linkage and non-default visibility (that is, unless 
>>> we're in the odd case where people expect non-default visibility classes to 
>>> be the same type across DSOs)? Or are there cases where we might be using a 
>>> vtable that (eg) doesn't even have the right symbol?
>>
>> I don't know of any problems other than the total failure of vague linkage 
>> across DSO boundaries, so if we just treat that as an implicit exception to 
>> the vtable uniqueness guarantee in the ABI, I think you've got the condition 
>> exactly right: we could do this for any class where either the v-table 
>> doesn't have vague linkage or the class's formal visibility is not 
>> `default`.  Our experience at Apple with enforcing type visibility is that 
>> it's usually good for one or two bug reports a year, but it's pretty easy to 
>> explain to users what they did wrong, and Clang's visibility attributes are 
>> pretty simple to use.  (`type_visibility` helps a lot with managing 
>> tradeoffs.)
>
> I did some checking through the Clang implementation and found another two 
> cases:
>
> - under `-fapple-kext`, vague-linkage vtables get emitted in each translation 
> unit that references them
> - under `-fno-rtti`, identical vtables for distinct types could get merged 
> because we emit vtables as `unnamed_addr` (this doesn't affect 
> `dynamic_cast`, because `-fno-rtti` also disables `dynamic_cast` entirely)
>
> The good news seems to be that if you don't use any language extensions (type 
> visibility, `-fno-rtti`, `-fapple-kext`), then we do actually provide the 
> guarantee that the ABI describes. :)
>
> In D154658#4479170 , @rjmccall 
> wrote:
>
>> If there are multiple subobjects of the source type in the destination type, 
>> consider just casting to `void*` first instead of doing multiple comparisons.
>
> This turned out to be a little subtle: the vptr comparison we do after the 
> cast requires loading a vptr of an object of entirely-unknown type. This is 
> the first time we're doing that in IR generation, and `GetVTablePtr` expected 
> to be told which class it's loading the vtable for (presumably, so that it 
> can load from the right offset within the class). However, the implementation 
> of `GetVTablePtr` didn't use the class for anything; it's always at the start 
> for all of our supported ABIs. But this patch would make it harder to support 
> an ABI that didn't put the vptr at the start of the most-derived object.

Oh, this does matter on platforms using pointer authentication, because each 
vptr field is signed with a constant discriminator derived from the name of the 
class that introduced it.




Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:204
+// another type.
+if (!CGM.GetAddrOfRTTIDescriptor(RecordTy))
+  return false;

Is there no reasonable way of checking this without actually triggering the 
emission of RTTI?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-07 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith updated this revision to Diff 538283.
rsmith added a comment.

- Mark gep as inbounds.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

Files:
  clang/lib/AST/ExprCXX.cpp
  clang/lib/CodeGen/CGCXXABI.h
  clang/lib/CodeGen/CGClass.cpp
  clang/lib/CodeGen/CGExprCXX.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/test/CodeGenCXX/dynamic-cast-always-null.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact.cpp

Index: clang/test/CodeGenCXX/dynamic-cast-exact.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact.cpp
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions -std=c++11 -o - | FileCheck %s --implicit-check-not='call {{.*}} @__dynamic_cast'
+struct Offset { virtual ~Offset(); };
+struct A { virtual ~A(); };
+struct B final : Offset, A { };
+
+struct C { virtual ~C(); int c; };
+struct D : A { int d; };
+struct E : A { int e; };
+struct F : virtual A { int f; };
+struct G : virtual A { int g; };
+struct H final : C, D, E, F, G { int h; };
+
+// CHECK-LABEL: @_Z7inexactP1A
+C *inexact(A *a) {
+  // CHECK: call {{.*}} @__dynamic_cast
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z12exact_singleP1A
+B *exact_single(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z9exact_refR1A
+B _ref(A ) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: call {{.*}} @__cxa_bad_cast
+  // CHECK: unreachable
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: ret ptr %[[RESULT]]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z11exact_multiP1A
+H *exact_multi(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[OFFSET_TO_TOP_SLOT:.*]] = getelementptr inbounds i64, ptr %[[VPTR]], i64 -2
+  // CHECK: %[[OFFSET_TO_TOP:.*]] = load i64, ptr %[[OFFSET_TO_TOP_SLOT]]
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 %[[OFFSET_TO_TOP]]
+  // CHECK: %[[DERIVED_VPTR:.*]] = load ptr, ptr %[[RESULT]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[DERIVED_VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 0, i32 3)
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
Index: clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
@@ -0,0 +1,13 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,EXACT
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -fvisibility=hidden -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,INEXACT
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -fapple-kext -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,INEXACT
+
+struct A { virtual ~A(); };
+struct B final : A { };
+
+// CHECK-LABEL: @_Z5exactP1A
+B *exact(A *a) {
+  // INEXACT: call 

[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-07 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith marked an inline comment as done.
rsmith added a comment.

In D154658#4481225 , @rjmccall wrote:

> In D154658#4479213 , @rsmith wrote:
>
>> I think (hope?) we should be able to apply this to a much larger set of 
>> cases. Would it be correct to do this optimization unless the vtable might 
>> be emitted with vague linkage and non-default visibility (that is, unless 
>> we're in the odd case where people expect non-default visibility classes to 
>> be the same type across DSOs)? Or are there cases where we might be using a 
>> vtable that (eg) doesn't even have the right symbol?
>
> I don't know of any problems other than the total failure of vague linkage 
> across DSO boundaries, so if we just treat that as an implicit exception to 
> the vtable uniqueness guarantee in the ABI, I think you've got the condition 
> exactly right: we could do this for any class where either the v-table 
> doesn't have vague linkage or the class's formal visibility is not `default`. 
>  Our experience at Apple with enforcing type visibility is that it's usually 
> good for one or two bug reports a year, but it's pretty easy to explain to 
> users what they did wrong, and Clang's visibility attributes are pretty 
> simple to use.  (`type_visibility` helps a lot with managing tradeoffs.)

I did some checking through the Clang implementation and found another two 
cases:

- under `-fapple-kext`, vague-linkage vtables get emitted in each translation 
unit that references them
- under `-fno-rtti`, identical vtables for distinct types could get merged 
because we emit vtables as `unnamed_addr` (this doesn't affect `dynamic_cast`, 
because `-fno-rtti` also disables `dynamic_cast` entirely)

The good news seems to be that if you don't use any language extensions (type 
visibility, `-fno-rtti`, `-fapple-kext`), then we do actually provide the 
guarantee that the ABI describes. :)

In D154658#4479170 , @rjmccall wrote:

> If there are multiple subobjects of the source type in the destination type, 
> consider just casting to `void*` first instead of doing multiple comparisons.

This turned out to be a little subtle: the vptr comparison we do after the cast 
requires loading a vptr of an object of entirely-unknown type. This is the 
first time we're doing that in IR generation, and `GetVTablePtr` expected to be 
told which class it's loading the vtable for (presumably, so that it can load 
from the right offset within the class). However, the implementation of 
`GetVTablePtr` didn't use the class for anything; it's always at the start for 
all of our supported ABIs. But this patch would make it harder to support an 
ABI that didn't put the vptr at the start of the most-derived object.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-07 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith updated this revision to Diff 538281.
rsmith added a comment.

- Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

Files:
  clang/lib/AST/ExprCXX.cpp
  clang/lib/CodeGen/CGCXXABI.h
  clang/lib/CodeGen/CGClass.cpp
  clang/lib/CodeGen/CGExprCXX.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/test/CodeGenCXX/dynamic-cast-always-null.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact.cpp

Index: clang/test/CodeGenCXX/dynamic-cast-exact.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact.cpp
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions -std=c++11 -o - | FileCheck %s --implicit-check-not='call {{.*}} @__dynamic_cast'
+struct Offset { virtual ~Offset(); };
+struct A { virtual ~A(); };
+struct B final : Offset, A { };
+
+struct C { virtual ~C(); int c; };
+struct D : A { int d; };
+struct E : A { int e; };
+struct F : virtual A { int f; };
+struct G : virtual A { int g; };
+struct H final : C, D, E, F, G { int h; };
+
+// CHECK-LABEL: @_Z7inexactP1A
+C *inexact(A *a) {
+  // CHECK: call {{.*}} @__dynamic_cast
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z12exact_singleP1A
+B *exact_single(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z9exact_refR1A
+B _ref(A ) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 1, i32 2)
+  // CHECK: %[[RESULT:.*]] = getelementptr i8, ptr %[[PTR]], i64 -8
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: call {{.*}} @__cxa_bad_cast
+  // CHECK: unreachable
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: ret ptr %[[RESULT]]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z11exact_multiP1A
+H *exact_multi(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_FAILED:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[OFFSET_TO_TOP_SLOT:.*]] = getelementptr inbounds i64, ptr %[[VPTR]], i64 -2
+  // CHECK: %[[OFFSET_TO_TOP:.*]] = load i64, ptr %[[OFFSET_TO_TOP_SLOT]]
+  // CHECK: %[[RESULT:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 %[[OFFSET_TO_TOP]]
+  // CHECK: %[[DERIVED_VPTR:.*]] = load ptr, ptr %[[RESULT]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[DERIVED_VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 0, i32 3)
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_END:.*]], label %[[LABEL_FAILED]]
+
+  // CHECK: [[LABEL_FAILED]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[RESULT]], %[[LABEL_NOTNULL]] ], [ null, %[[LABEL_FAILED]] ]
+  return dynamic_cast(a);
+}
Index: clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact-disabled.cpp
@@ -0,0 +1,13 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,EXACT
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -fvisibility=hidden -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,INEXACT
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -fapple-kext -emit-llvm -std=c++11 -o - | FileCheck %s --check-prefixes=CHECK,INEXACT
+
+struct A { virtual ~A(); };
+struct B final : A { };
+
+// CHECK-LABEL: @_Z5exactP1A
+B *exact(A *a) {
+  // INEXACT: call {{.*}} 

[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-07 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

In D154658#4479213 , @rsmith wrote:

> In D154658#4479170 , @rjmccall 
> wrote:
>
>> I don't think it's an intended guarantee of the Itanium ABI that the v-table 
>> will be unique, and v-tables are frequently not unique in the presence of 
>> shared libraries.
>
> https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vtable-general explicitly 
> guarantees this: "[...] However, the virtual table pointers within all the 
> objects (instances) of a particular most-derived class point to the same set 
> of virtual tables."

Huh, I've never noticed that before.  I still don't think it's satisfied in 
practice, at least in the presence of shared libraries.  This is notably the 
exact same setup that eventually forced (or "forced") GCC to start using string 
comparison for RTTI instead of the ABI-prescribed algorithm that relies on 
`_ZTS` pointer equality.

>> They should be unique for classes with internal linkage, but of course 
>> that's a vastly reduced domain for the optimization.
>
> I think (hope?) we should be able to apply this to a much larger set of 
> cases. Would it be correct to do this optimization unless the vtable might be 
> emitted with vague linkage and non-default visibility (that is, unless we're 
> in the odd case where people expect non-default visibility classes to be the 
> same type across DSOs)? Or are there cases where we might be using a vtable 
> that (eg) doesn't even have the right symbol?

I don't know of any problems other than the total failure of vague linkage 
across DSO boundaries, so if we just treat that as an implicit exception to the 
vtable uniqueness guarantee in the ABI, I think you've got the condition 
exactly right: we could do this for any class where either the v-table doesn't 
have vague linkage or the class's formal visibility is not `default`.  Our 
experience at Apple with enforcing type visibility is that it's usually good 
for one or two bug reports a year, but it's pretty easy to explain to users 
what they did wrong, and Clang's visibility attributes are pretty simple to 
use.  (`type_visibility` helps a lot with managing tradeoffs.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-06 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

In D154658#4479170 , @rjmccall wrote:

> I don't think it's an intended guarantee of the Itanium ABI that the v-table 
> will be unique, and v-tables are frequently not unique in the presence of 
> shared libraries.

https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vtable-general explicitly 
guarantees this: "[...] However, the virtual table pointers within all the 
objects (instances) of a particular most-derived class point to the same set of 
virtual tables."

> They should be unique for classes with internal linkage, but of course that's 
> a vastly reduced domain for the optimization.

I think (hope?) we should be able to apply this to a much larger set of cases. 
Would it be correct to do this optimization unless the vtable might be emitted 
with vague linkage and non-default visibility (that is, unless we're in the odd 
case where people expect non-default visibility classes to be the same type 
across DSOs)? Or are there cases where we might be using a vtable that (eg) 
doesn't even have the right symbol?

> If there are multiple subobjects of the source type in the destination type, 
> consider just casting to `void*` first instead of doing multiple comparisons.

Ha, that seems obvious in retrospect :) Will do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-06 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

I don't think it's an intended guarantee of the Itanium ABI that the v-table 
will be unique, and v-tables are frequently not unique in the presence of 
shared libraries.  They should be unique for classes with internal linkage, but 
of course that's a vastly reduced domain for the optimization.  We could do an 
RTTI comparison in the general case, either inline or by synthesizing a helper 
function to compare two `std::type_info`s.  Or you could have a flag to tell 
you to make stronger uniqueness assumptions.

If there are multiple subobjects of the source type in the destination type, 
consider just casting to `void*` first instead of doing multiple comparisons.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-06 Thread Nuno Lopes via Phabricator via cfe-commits
nlopes added inline comments.



Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:1541
+CGF.Builder.CreateBr(CastFail);
+return llvm::UndefValue::get(CGF.VoidPtrTy);
+  }

Please use PoisonValue as a placeholder whenever possible. We are trying to get 
rid of undef.
Thank you!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154658/new/

https://reviews.llvm.org/D154658

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154658: Optimize emission of `dynamic_cast` to final classes.

2023-07-06 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith created this revision.
rsmith added a reviewer: rjmccall.
Herald added subscribers: nlopes, mgrang.
Herald added a project: All.
rsmith requested review of this revision.
Herald added subscribers: cfe-commits, wangpc.
Herald added a project: clang.

- When the destination is a final class type that does not derive from the 
source type, the cast always fails and is now emitted as a null pointer or call 
to __cxa_bad_cast.

- When the destination is a final class type that does derive from the source 
type, emit a direct comparison against the corresponding base class vptr 
value(s). There may be more than one such value in the case of multiple 
inheritance; check them all.

For now, this is supported only for the Itanium ABI. I expect the same thing is
possible for the MS ABI too, but I don't know what guarantees are made about
vfptr uniqueness.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D154658

Files:
  clang/lib/AST/ExprCXX.cpp
  clang/lib/CodeGen/CGCXXABI.h
  clang/lib/CodeGen/CGExprCXX.cpp
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/test/CodeGenCXX/dynamic-cast-always-null.cpp
  clang/test/CodeGenCXX/dynamic-cast-exact.cpp

Index: clang/test/CodeGenCXX/dynamic-cast-exact.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/dynamic-cast-exact.cpp
@@ -0,0 +1,90 @@
+// RUN: %clang_cc1 -I%S %s -triple x86_64-apple-darwin10 -emit-llvm -fcxx-exceptions -fexceptions -std=c++11 -o - | FileCheck %s --implicit-check-not='call {{.*}} @__dynamic_cast'
+struct A { virtual ~A(); };
+struct B final : A { };
+
+struct C { virtual ~C(); int c; };
+struct D : A { int d; };
+struct E : A { int e; };
+struct F : virtual A { int f; };
+struct G : virtual A { int g; };
+struct H final : C, D, E, F, G { int h; };
+
+// CHECK-LABEL: @_Z7inexactP1A
+C *inexact(A *a) {
+  // CHECK: call {{.*}} @__dynamic_cast
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z12exact_singleP1A
+B *exact_single(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_NULL:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 0, i32 2)
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_SUCCESS:.*]], label %[[LABEL_NULL]]
+
+  // CHECK: [[LABEL_SUCCESS]]:
+  // CHECK: br label %[[LABEL_END:.*]]
+
+  // CHECK: [[LABEL_NULL]]:
+  // CHECK: br label %[[LABEL_END]]
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: phi ptr [ %[[PTR]], %[[LABEL_SUCCESS]] ], [ null, %[[LABEL_NULL]] ]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z9exact_refR1A
+B _ref(A ) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_NULL:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [4 x ptr] }, ptr @_ZTV1B, i32 0, inrange i32 0, i32 2)
+  // CHECK: br i1 %[[MATCH]], label %[[LABEL_SUCCESS:.*]], label %[[LABEL_NULL]]
+
+  // CHECK: [[LABEL_SUCCESS]]:
+  // CHECK: br label %[[LABEL_END:.*]]
+
+  // CHECK: [[LABEL_NULL]]:
+  // CHECK: call {{.*}} @__cxa_bad_cast
+  // CHECK: unreachable
+
+  // CHECK: [[LABEL_END]]:
+  // CHECK: ret ptr %[[PTR]]
+  return dynamic_cast(a);
+}
+
+// CHECK-LABEL: @_Z11exact_multiP1A
+H *exact_multi(A *a) {
+  // CHECK: %[[PTR_NULL:.*]] = icmp eq ptr %[[PTR:.*]], null
+  // CHECK: br i1 %[[PTR_NULL]], label %[[LABEL_NULL:.*]], label %[[LABEL_NOTNULL:.*]]
+
+  // CHECK: [[LABEL_NOTNULL]]:
+  // CHECK: %[[VPTR:.*]] = load ptr, ptr %[[PTR]]
+  // CHECK: %[[MATCH_1:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 1, i32 2)
+  // CHECK: br i1 %[[MATCH_1]], label %[[LABEL_SUCCESS:.*]], label %[[LABEL_CONT_1:.*]]
+
+  // CHECK: [[LABEL_CONT_1]]:
+  // CHECK: %[[MATCH_2:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 2, i32 2)
+  // CHECK: br i1 %[[MATCH_2]], label %[[LABEL_SUCCESS:.*]], label %[[LABEL_CONT_2:.*]]
+
+  // CHECK: [[LABEL_CONT_2]]:
+  // CHECK: %[[MATCH_3:.*]] = icmp eq ptr %[[VPTR]], getelementptr inbounds ({ [5 x ptr], [4 x ptr], [4 x ptr], [6 x ptr], [6 x ptr] }, ptr @_ZTV1H, i32 0, inrange i32 3, i32 4)
+  // CHECK: br i1 %[[MATCH_3]], label %[[LABEL_SUCCESS:.*]], label %[[LABEL_NULL]]
+
+  // CHECK: [[LABEL_SUCCESS]]:
+  // CHECK: %[[SUCCESS_OFFSET:.*]] = phi i64 [ -16, %[[LABEL_NOTNULL]] ], [ -32, %[[LABEL_CONT_1]] ], [ -48, %[[LABEL_CONT_2]] ]
+  // CHECK: %[[SUCCESS_VALUE:.*]] = getelementptr inbounds i8, ptr %[[PTR]], i64 %[[SUCCESS_OFFSET]]
+  // CHECK: br label