[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff 55d4816393f897054a4721920502d45c645edf1d cb86c3c12959513309ccf97592d602ba4da36d1d -- clang/test/CodeGen/object-size-sub-object.c clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGExpr.cpp clang/test/CodeGen/attr-counted-by.c clang/test/CodeGen/catch-undef-behavior.c clang/test/CodeGen/object-size.c clang/test/CodeGen/object-size.cpp clang/test/CodeGen/pass-object-size.c clang/test/CodeGen/sanitize-recover.c llvm/include/llvm/Analysis/MemoryBuiltins.h llvm/lib/Analysis/MemoryBuiltins.cpp llvm/lib/IR/AutoUpgrade.cpp llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp `` View the diff from clang-format here. ``diff diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index d8d4aa5288..7461d0ee00 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -1244,8 +1244,8 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, unsigned Type, // it's set, a closest surrounding subobject is considered the object a // pointer points to. Value *WholeObj = Builder.getInt1((Type & 1) == 0); - return Builder.CreateCall(F, {Ptr, Min, NullIsUnknown, Dynamic, WholeObj, -SubobjectSize}); + return Builder.CreateCall( + F, {Ptr, Min, NullIsUnknown, Dynamic, WholeObj, SubobjectSize}); } namespace { diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index b0aa3579bd..d94a775cab 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -750,8 +750,8 @@ void CodeGenFunction::EmitTypeCheck(TypeCheckKind TCK, SourceLocation Loc, llvm::Value *WholeObj = Builder.getTrue(); llvm::Value *SubobjectSize = Builder.getInt64(0); llvm::Value *LargeEnough = Builder.CreateICmpUGE( - Builder.CreateCall(F, {Ptr, Min, NullIsUnknown, Dynamic, WholeObj, - SubobjectSize}), + Builder.CreateCall( + F, {Ptr, Min, NullIsUnknown, Dynamic, WholeObj, SubobjectSize}), Size); Checks.push_back(std::make_pair(LargeEnough, SanitizerKind::ObjectSize)); } `` https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. zygoloid wrote: > > in some simple test cases it does appear to work. > > Please don't be insulting. I didn't just run a couple of "simple" tests and > called it a day. I had many rather convoluted examples and they all worked. I apologize, I didn't mean to imply you hadn't been diligent here. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. bwendling wrote: > in some simple test cases it does appear to work. Please don't be insulting. I didn't just run a couple of "simple" tests and called it a day. I had many rather convoluted examples and they all worked. > Sorry, that was a typo in my example: the call in h() should be f(&b.a). Okay. This patch generates the correct answer (modified so it's calculated by the middle end): ``` $ clang -O2 ~/llvm/a.c && ./a.out &((char *)&p->n)[idx]: __builtin_dynamic_object_size(&((char *)&p->n)[idx], 1) = 3 &((char *)&p->n)[idx]: __builtin_dynamic_object_size(&((char *)&p->n)[idx], 1) = 3 ``` > I mean, pass a pointer to the start of the subobject. For example, for > p->arr[i], you'd pass in &p->arr. I don't think that's necessary. If we're given the original size and the middle end is calculating the final size, it may be sufficient to ignore the offset all together for sub-objects. I'll see if it works. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
https://github.com/zygoloid edited https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. zygoloid wrote: > First you tell me that I can't use LLVM's IR to determine the subobject, even > though I did and it worked just fine It definitely doesn't work fine, but sure, in some simple test cases it does appear to work. > now you're saying that I can't use the front-end's knowledge about the > structure. I'm not. I'm saying that you can't assume that the LLVM IR idea of the complete object lines up with some enclosing subobject you've found in the frontend. You're still trying to use the IR notion of complete object and subobjects, and that still doesn't work for the same reasons as before. You can absolutely compute where the subobject is in the frontend, and pass that information onto LLVM. But you'll need to pass it on in a way that is meaningful to LLVM. For example, if you pass a pointer and size to the intrinsic describing the complete range of addresses covering the subobject, that should be fine. But an offset is not useful if the frontend and middle-end can't agree on what it's an offset relative to. > In your example, you've explicitly lied to the compiler about the types being > passed in. Sorry, that was a typo in my example: the call in `h()` should be `f(&b.a)`. > I have no clue what you mean by pass a "pointer to `p->n` to the intrinsic" > as that's already the first argument in the intrinsic. I mean, pass a pointer to the start of the subobject. For example, for `p->arr[i]`, you'd pass in `&p->arr`. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. bwendling wrote: You really need to make up your mind. First you tell me that I can't use LLVM's IR to determine the subobject, even though I did and it worked just fine, and now you're saying that I can't use the front-end's knowledge about the structure. In your example, you've explicitly lied to the compiler about the types being passed in. Unlike in GCC, we don't do any inlining in the front-end, so we can't properly handle this. I have no clue what you mean by pass a "pointer to `p->n` to the intrinsic" as that's already the first argument in the intrinsic. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. zygoloid wrote: You're passing the difference between the address of some arbitrary class object and a field. The address of that class object is not the base address that the LLVM intrinsic will calculate. So the offset you're passing in is not meaningful. And there's no way it even *can* be meaningful. Consider something like this: ```c++ struct A { int n; }; struct B { int data[100]; A a; }; int f(A *p) { return __builtin_dynamic_object_size(&p->n, 1); } int g() { A a; return f(&a); } int h() { B b; return f(&b); } ``` After inlining, the BDOS intrinsic in `g` will see a base pointer of `&a`, and the intrinsic in `h` will see a base pointer of `&b`. The offset from the base pointer to the `n` subobject is different in each case. So there's no way you can compute a constant offset in Clang's lowering and pass it to the intrinsic. What you *can* do is pass a pointer to `p->n` to the intrinsic as the start of the subobject. That'll do the right thing regardless of what LLVM computes to be the complete enclosing object. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. bwendling wrote: I'm not passing a difference between two pointers; it's the offset from the start of the outermost `RecordDecl` to the object. The LLVM intrinsic recursively walks back through the instructions (that define the pointer) to try to calculate the value. It's rather convoluted and hard to read, because it involves two visitor classes called one from another. Eventually, it returns a Size / Offset pair (Offset from the start of the structure). At that point, I use the extra information I added to determine if those values are within the range of the sub-object. If it's outside of that range, I return 0. Otherwise, I calculate the remaining size after adjusting the offset (i.e. the offset is adjusted to be from the beginning of the sub-object rather than the start of the structure). https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. zygoloid wrote: As far as I can see, the "offset" value you're passing in is a difference between two pointers both of which are unknown to the middle-end -- it's the difference from the start of the two-levels-out enclosing class subobject to the one-level-out enclosing field subobject, I think. I could be misunderstanding something, but I don't see how you can use that information to get correct results from the builtin. Can you explain how the LLVM intrinsic uses this information? https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -1052,11 +1053,143 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type, return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned)); } +namespace { + +/// SubobjectFinder - A simple visitor to find the "sub-object" pointed to by a +/// __builtin_dynamic_object_size call. Information gathered from the +/// sub-object is used by the back-end to determine the correct size when the +/// 'TYPE' of the __bdos call has the least significant bit set (i.e. asking +/// for the sub-object size). +/// +/// The expectation is that we'll eventually hit one of three expression types: +/// +/// 1. DeclRefExpr - This is the expression for the base of the structure. +/// 2. MemberExpr - This is the field in the structure. +/// 3. CompoundLiteralExpr - This is for people who create something +/// heretical like (struct foo has a flexible array member): +/// +///(struct foo){ 1, 2 }.blah[idx]; +/// +/// All other expressions can be correctly handled with the current code. +struct SubobjectFinder +: public ConstStmtVisitor { + SubobjectFinder() = default; + + //======// + //Visitor Methods + //======// + + const Expr *VisitStmt(const Stmt *S) { return nullptr; } + + const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; } + const Expr *VisitMemberExpr(const MemberExpr *E) { return E; } + const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) { +return E; + } + + const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) { +return Visit(E->getBase()); + } + const Expr *VisitCastExpr(const CastExpr *E) { bwendling wrote: It seems to. See https://godbolt.org/z/4xaY4191o for an example (the `&((char *)&var.z.a)[argc]` example looks through them. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. bwendling wrote: A lot of your review seems to be based on this differing of opinion of what to do when indexing outside of the object currently being pointed to. Let's get this sorted out before I make changes... https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. bwendling wrote: > I think the information you're passing in here isn't quite what we'd want. If > I'm reading the code correctly, the offset you're passing in is the field > offset relative to the immediately-enclosing record type, which doesn't give > us any information about either where the pointer is within the subobject, or > where the subobject is within the complete object, so this doesn't seem like > it can be enough information to produce a correct result. That's the information which leads to the correct calculation. If you have a pointer like this: ```c struct S { int a; char c[234]; int b; }; void foo(struct S *ptr) { size_t x = __builtin_dynamic_object_size(ptr->a[22], 1); /* ... */ } ``` the value of `x` should be `0`. See https://godbolt.org/z/4xaY4191o for a list of examples that show this behavior (at least in GCC). Notice that this applies for the sub-object type only. If the __bdos value is `0`, then your behavior is the correct behavior. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -1052,11 +1053,143 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type, return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned)); } +namespace { + +/// SubobjectFinder - A simple visitor to find the "sub-object" pointed to by a +/// __builtin_dynamic_object_size call. Information gathered from the +/// sub-object is used by the back-end to determine the correct size when the +/// 'TYPE' of the __bdos call has the least significant bit set (i.e. asking +/// for the sub-object size). +/// +/// The expectation is that we'll eventually hit one of three expression types: +/// +/// 1. DeclRefExpr - This is the expression for the base of the structure. +/// 2. MemberExpr - This is the field in the structure. +/// 3. CompoundLiteralExpr - This is for people who create something +/// heretical like (struct foo has a flexible array member): +/// +///(struct foo){ 1, 2 }.blah[idx]; +/// +/// All other expressions can be correctly handled with the current code. +struct SubobjectFinder +: public ConstStmtVisitor { + SubobjectFinder() = default; + + //======// + //Visitor Methods + //======// + + const Expr *VisitStmt(const Stmt *S) { return nullptr; } + + const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; } + const Expr *VisitMemberExpr(const MemberExpr *E) { return E; } + const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) { +return E; + } + + const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) { +return Visit(E->getBase()); + } + const Expr *VisitCastExpr(const CastExpr *E) { zygoloid wrote: Does GCC look through explicit casts? I wonder if this should be restricted to `ImplicitCastExpr`s. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -1052,11 +1053,143 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type, return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned)); } +namespace { + +/// SubobjectFinder - A simple visitor to find the "sub-object" pointed to by a +/// __builtin_dynamic_object_size call. Information gathered from the +/// sub-object is used by the back-end to determine the correct size when the +/// 'TYPE' of the __bdos call has the least significant bit set (i.e. asking +/// for the sub-object size). +/// +/// The expectation is that we'll eventually hit one of three expression types: +/// +/// 1. DeclRefExpr - This is the expression for the base of the structure. +/// 2. MemberExpr - This is the field in the structure. +/// 3. CompoundLiteralExpr - This is for people who create something +/// heretical like (struct foo has a flexible array member): +/// +///(struct foo){ 1, 2 }.blah[idx]; +/// +/// All other expressions can be correctly handled with the current code. +struct SubobjectFinder +: public ConstStmtVisitor { + SubobjectFinder() = default; + + //======// + //Visitor Methods + //======// + + const Expr *VisitStmt(const Stmt *S) { return nullptr; } + + const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; } + const Expr *VisitMemberExpr(const MemberExpr *E) { return E; } + const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) { +return E; + } + + const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) { +return Visit(E->getBase()); + } + const Expr *VisitCastExpr(const CastExpr *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitParenExpr(const ParenExpr *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitUnaryAddrOf(const clang::UnaryOperator *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitUnaryDeref(const clang::UnaryOperator *E) { +return Visit(E->getSubExpr()); + } zygoloid wrote: I think you'll need to be more careful when walking through address-of / dereferences -- the set of things you should step over when traversing a pointer to an object is different from the set of things you should step over when traversing an object lvalue. For example, the bounds to use for `*p->member` will be computed as the bounds of `member`, which isn't correct. I think you could address this by either having separate traversals for pointers versus lvalues, or by avoiding (for example) stepping through lvalue-to-rvalue conversions when stepping over `CastExpr`s -- and in fact, the latter seems like a good idea in general, given that a `CastExpr` could do pretty much anything to the pointer / lvalue. In general, I think it's only really safe to step over casts that are a no-op for address purposes. Bitcasts seem OK, address space conversions seem OK, etc. but a lot of cast kinds are not going to be reasonable to step over. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -1052,11 +1053,143 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type, return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned)); } +namespace { + +/// SubobjectFinder - A simple visitor to find the "sub-object" pointed to by a +/// __builtin_dynamic_object_size call. Information gathered from the +/// sub-object is used by the back-end to determine the correct size when the +/// 'TYPE' of the __bdos call has the least significant bit set (i.e. asking +/// for the sub-object size). +/// +/// The expectation is that we'll eventually hit one of three expression types: +/// +/// 1. DeclRefExpr - This is the expression for the base of the structure. +/// 2. MemberExpr - This is the field in the structure. +/// 3. CompoundLiteralExpr - This is for people who create something +/// heretical like (struct foo has a flexible array member): +/// +///(struct foo){ 1, 2 }.blah[idx]; +/// +/// All other expressions can be correctly handled with the current code. +struct SubobjectFinder +: public ConstStmtVisitor { + SubobjectFinder() = default; + + //======// + //Visitor Methods + //======// + + const Expr *VisitStmt(const Stmt *S) { return nullptr; } + + const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; } + const Expr *VisitMemberExpr(const MemberExpr *E) { return E; } + const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) { +return E; + } + + const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) { +return Visit(E->getBase()); + } + const Expr *VisitCastExpr(const CastExpr *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitParenExpr(const ParenExpr *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitUnaryAddrOf(const clang::UnaryOperator *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitUnaryDeref(const clang::UnaryOperator *E) { +return Visit(E->getSubExpr()); + } +}; + +} // end anonymous namespace + +/// getFieldInfo - Gather the size and offset of the field \p VD in \p RD. +static std::pair getFieldInfo(CodeGenFunction &CGF, + const RecordDecl *RD, + const ValueDecl *VD, + uint64_t Offset = 0) { + if (!RD) +return std::make_pair(0, 0); + + ASTContext &Ctx = CGF.getContext(); + const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD); + unsigned FieldNo = 0; + + for (const Decl *D : RD->decls()) { +if (const auto *Record = dyn_cast(D)) { + std::pair Res = + getFieldInfo(CGF, Record->getDefinition(), VD, + Offset + Layout.getFieldOffset(FieldNo)); + if (Res.first != 0) +return Res; + continue; +} + +if (const auto *FD = dyn_cast(D); FD == VD) { + Offset += Layout.getFieldOffset(FieldNo); + return std::make_pair(Ctx.getTypeSizeInChars(FD->getType()).getQuantity(), +Ctx.toCharUnitsFromBits(Offset).getQuantity()); +} + +if (isa(D)) + ++FieldNo; + } zygoloid wrote: This work recursively looping through fields is not necessary: this function only succeeds when `VD` is a `FieldDecl`, so you can `dyn_cast` it to that type, then get the enclosing `DeclContext` to find the record, and use `FieldDecl::getFieldIndex` to find the field number. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -1052,11 +1053,143 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type, return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned)); } +namespace { + +/// SubobjectFinder - A simple visitor to find the "sub-object" pointed to by a +/// __builtin_dynamic_object_size call. Information gathered from the +/// sub-object is used by the back-end to determine the correct size when the +/// 'TYPE' of the __bdos call has the least significant bit set (i.e. asking +/// for the sub-object size). +/// +/// The expectation is that we'll eventually hit one of three expression types: +/// +/// 1. DeclRefExpr - This is the expression for the base of the structure. +/// 2. MemberExpr - This is the field in the structure. +/// 3. CompoundLiteralExpr - This is for people who create something +/// heretical like (struct foo has a flexible array member): +/// +///(struct foo){ 1, 2 }.blah[idx]; +/// +/// All other expressions can be correctly handled with the current code. +struct SubobjectFinder +: public ConstStmtVisitor { + SubobjectFinder() = default; + + //======// + //Visitor Methods + //======// + + const Expr *VisitStmt(const Stmt *S) { return nullptr; } + + const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; } + const Expr *VisitMemberExpr(const MemberExpr *E) { return E; } + const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) { +return E; + } + + const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) { +return Visit(E->getBase()); + } + const Expr *VisitCastExpr(const CastExpr *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitParenExpr(const ParenExpr *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitUnaryAddrOf(const clang::UnaryOperator *E) { +return Visit(E->getSubExpr()); + } + const Expr *VisitUnaryDeref(const clang::UnaryOperator *E) { +return Visit(E->getSubExpr()); + } +}; + +} // end anonymous namespace + +/// getFieldInfo - Gather the size and offset of the field \p VD in \p RD. +static std::pair getFieldInfo(CodeGenFunction &CGF, + const RecordDecl *RD, + const ValueDecl *VD, + uint64_t Offset = 0) { + if (!RD) +return std::make_pair(0, 0); + + ASTContext &Ctx = CGF.getContext(); + const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD); + unsigned FieldNo = 0; + + for (const Decl *D : RD->decls()) { +if (const auto *Record = dyn_cast(D)) { + std::pair Res = + getFieldInfo(CGF, Record->getDefinition(), VD, + Offset + Layout.getFieldOffset(FieldNo)); + if (Res.first != 0) +return Res; + continue; +} + +if (const auto *FD = dyn_cast(D); FD == VD) { + Offset += Layout.getFieldOffset(FieldNo); + return std::make_pair(Ctx.getTypeSizeInChars(FD->getType()).getQuantity(), +Ctx.toCharUnitsFromBits(Offset).getQuantity()); +} + +if (isa(D)) + ++FieldNo; + } + + return std::make_pair(0, 0); +} + +/// getSubobjectInfo - Find the sub-object that \p E points to. If it lives +/// inside a struct, return the "size" and "offset" of that sub-object. +static std::pair getSubobjectInfo(CodeGenFunction &CGF, + const Expr *E) { + const Expr *Subobject = SubobjectFinder().Visit(E); + if (!Subobject) +return std::make_pair(0, 0); + + const RecordDecl *OuterRD = nullptr; + const ValueDecl *VD = nullptr; + + if (const auto *DRE = dyn_cast(Subobject)) { +// We're pointing to the beginning of the struct. +VD = DRE->getDecl(); +QualType Ty = VD->getType(); +if (Ty->isPointerType()) + Ty = Ty->getPointeeType(); +OuterRD = Ty->getAsRecordDecl(); zygoloid wrote: If I'm reading this correctly, I think this case is redundant: `getFieldInfo` only succeeds when `VD` is a field, but we're not going to have an evaluated `DeclRefExpr` that names a field. Can we return `0, 0` in this case, like we do for compound literals? I think the only case when we have non-zero values to return is when we've found a `FieldDecl`. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -1052,11 +1053,143 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type, return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned)); } +namespace { + +/// SubobjectFinder - A simple visitor to find the "sub-object" pointed to by a +/// __builtin_dynamic_object_size call. Information gathered from the +/// sub-object is used by the back-end to determine the correct size when the +/// 'TYPE' of the __bdos call has the least significant bit set (i.e. asking +/// for the sub-object size). +/// +/// The expectation is that we'll eventually hit one of three expression types: +/// +/// 1. DeclRefExpr - This is the expression for the base of the structure. +/// 2. MemberExpr - This is the field in the structure. +/// 3. CompoundLiteralExpr - This is for people who create something +/// heretical like (struct foo has a flexible array member): +/// +///(struct foo){ 1, 2 }.blah[idx]; +/// +/// All other expressions can be correctly handled with the current code. +struct SubobjectFinder +: public ConstStmtVisitor { + SubobjectFinder() = default; + + //======// + //Visitor Methods + //======// + + const Expr *VisitStmt(const Stmt *S) { return nullptr; } + + const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; } + const Expr *VisitMemberExpr(const MemberExpr *E) { return E; } + const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) { +return E; + } + + const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) { +return Visit(E->getBase()); + } + const Expr *VisitCastExpr(const CastExpr *E) { +return Visit(E->getSubExpr()); + } zygoloid wrote: A derived-to-base cast is a traversal to a subobject in C++, so we should presumably terminate the traversal when we reach one and use the offset and size of the base class as the subobject. That'd be a pretty big delta from what you have here, but it'd be a good idea to add a FIXME here. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
@@ -26996,18 +26996,38 @@ class, structure, array, or other object. Arguments: "" -The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a -pointer to or into the ``object``. The second argument determines whether -``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is -unknown. The third argument controls how ``llvm.objectsize`` acts when ``null`` -in address space 0 is used as its pointer argument. If it's ``false``, -``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if -the ``null`` is in a non-zero address space or if ``true`` is given for the -third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth -argument to ``llvm.objectsize`` determines if the value should be evaluated at -runtime. +The ``llvm.objectsize`` intrinsic takes six arguments: + +- The first argument is a pointer to or into the ``object``. +- The second argument controls which value to return when the size is unknown: + + - If it's ``false``, ``llvm.objectsize`` returns ``-1``. + - If it's ``true``, ``llvm.objectsize`` returns ``0``. + +- The third argument controls how ``llvm.objectsize`` acts when ``null`` in + address space 0 is used as its pointer argument: + + - If it's ``false``, ``llvm.objectsize`` reports 0 bytes available when given +``null``. + - If it's ``true``, or the ``null`` pointer is in a non-zero address space, +the size is assumed to be unknown. + +- The fourth argument to ``llvm.objectsize`` determines if the value should be + evaluated at runtime. +- The fifth argument controls which size ``llvm.objectsize`` returns: + + - If it's ``false``, ``llvm.objectsize`` returns the size of the closest +surrounding subobject. + - If it's ``true``, ``llvm.objectsize`` returns the size of the whole object. + +- If non-zero, the sixth and seventh arguments encode the size and offset + information, respectively, of the original subobject's layout and is used + when the fifth argument is ``false``. +- The seventh argument encodes the offset information of the original + subobject's layout and is used when the fifth argument is ``false``. zygoloid wrote: I think the information you're passing in here isn't quite what we'd want. If I'm reading the code correctly, the offset you're passing in is the field offset relative to the immediately-enclosing record type, which doesn't give us any information about either where the pointer is within the subobject, or where the subobject is within the complete object, so this doesn't seem like it can be enough information to produce a correct result. Rather than passing in the offset of the subobject (relative to an unknown anchor point), I think it would be more useful to pass in a pointer to the start of the subobject. Or to pass in the offset from the start of the subobject to the pointer argument, but that would likely be harder for the frontend to calculate (eg, you'd need to work out the offset produced by array indexing). https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
bwendling wrote: Another ping. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
bwendling wrote: Friendly ping. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
https://github.com/bwendling edited https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
bwendling wrote: The first PR attempt was here: https://github.com/llvm/llvm-project/pull/78526 It was NACK'ed because it used the LLVM IR representation of the structure, which wasn't appropriate. To solve that issue, I chose to expand the `llvm.objectsize()` builtin to contain the size and offset of the sub-object, which is determined in the front-end. Note that there are many other things wrong with our `__builtin_{dynamic_}object_size` implementations. For one, they perform loads of the pointer which isn't necessary and contrary to the idea that the builtins don't allow for side-effects. I'll be addressing those issues in future patches. https://github.com/llvm/llvm-project/pull/83204 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #83204)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-clang Author: Bill Wendling (bwendling) Changes The second argument of __builtin_dynamic_object_size controls whether it returns the size of the whole object or the closest surrounding object. For this struct: struct s { int foo; char bar[2][40]; int baz; int qux; }; int main(int argc, char **argv) { struct s f; #define report(x) printf(#x ": %zu\n", x) argc = 1; report(__builtin_dynamic_object_size(f.bar[argc], 0)); report(__builtin_dynamic_object_size(f.bar[argc], 1)); return 0; } should return: __builtin_dynamic_object_size(f.bar[argc], 0): 48 __builtin_dynamic_object_size(f.bar[argc], 1): 40 determined by the least significant bit of the TYPE. The LLVM IR isn't sufficient to determine what could be considered a "sub-object". Instead determine the size / offset info in the front-end and pass that information along with the intrinsic. This expands the llvm.objectsize intrinsic to add these three new fields: - The fifth argument controls which object: - If false, return the size of the closest surrounding object. - If true, return the size of the whole object from the pointer. - If non-zero and the fifth argument is 'false', the size of the sub-object. - If non-zero and the fifth argument is 'false', the offset of the sub-object. --- Patch is 172.77 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/83204.diff 45 Files Affected: - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+163-10) - (modified) clang/lib/CodeGen/CGExpr.cpp (+8-3) - (modified) clang/test/CodeGen/catch-undef-behavior.c (+1-1) - (added) clang/test/CodeGen/object-size-sub-object.c (+311) - (modified) clang/test/CodeGen/object-size.c (+549-221) - (modified) clang/test/CodeGen/object-size.cpp (+7-7) - (modified) clang/test/CodeGen/pass-object-size.c (+3-3) - (modified) clang/test/CodeGen/sanitize-recover.c (+1-1) - (modified) llvm/docs/LangRef.rst (+34-14) - (modified) llvm/include/llvm/Analysis/MemoryBuiltins.h (+9) - (modified) llvm/include/llvm/IR/Intrinsics.td (+4-3) - (modified) llvm/lib/Analysis/MemoryBuiltins.cpp (+20-4) - (modified) llvm/lib/IR/AutoUpgrade.cpp (+63-6) - (modified) llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (+2-1) - (modified) llvm/test/Analysis/CostModel/X86/free-intrinsics.ll (+4-4) - (modified) llvm/test/Analysis/CostModel/free-intrinsics-datalayout.ll (+4-4) - (modified) llvm/test/Analysis/CostModel/free-intrinsics-no_info.ll (+4-4) - (modified) llvm/test/Assembler/auto_upgrade_intrinsics.ll (+4-4) - (modified) llvm/test/Bitcode/objectsize-upgrade-7.0.ll (+2-2) - (modified) llvm/test/CodeGen/AArch64/GlobalISel/memcpy_chk_no_tail.ll (+2-2) - (modified) llvm/test/CodeGen/AArch64/memsize-remarks.ll (+16-16) - (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-mem-intrinsics.ll (+4-4) - (modified) llvm/test/Other/cgscc-libcall-update.ll (+2-2) - (modified) llvm/test/Transforms/InferAddressSpaces/AMDGPU/debug-info.ll (+4-4) - (modified) llvm/test/Transforms/InferAddressSpaces/AMDGPU/intrinsics.ll (+6-6) - (modified) llvm/test/Transforms/InferAlignment/propagate-assume.ll (+2-2) - (modified) llvm/test/Transforms/Inline/call-intrinsic-objectsize.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/allocsize.ll (+5-5) - (modified) llvm/test/Transforms/InstCombine/assume_inevitable.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/builtin-dynamic-object-size.ll (+13-13) - (modified) llvm/test/Transforms/InstCombine/builtin-object-size-custom-dl.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/builtin-object-size-strdup-family.ll (+14-10) - (modified) llvm/test/Transforms/InstCombine/invoke.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/memset_chk-1.ll (+5-5) - (modified) llvm/test/Transforms/InstCombine/objsize.ll (+38-38) - (modified) llvm/test/Transforms/InstCombine/stpcpy_chk-1.ll (+4-4) - (modified) llvm/test/Transforms/InstCombine/strcpy_chk-1.ll (+5-5) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-load.ll (+2-2) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll (+5-5) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-posix-memalign.ll (+8-8) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll (+2-2) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll (+2-2) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll (+18-18) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/stale-worklist-phi.ll (+2-2) - (modified) llvm/test/Transforms/SCCP/issue59602-assume-like-call-users.ll (+5-5) ``diff diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 2d16e7cdc06053..9d8e5671d9d12d 100644 --- a/clang/lib/C