Re: [PATCH v3] c++: Don't prune constant capture proxies only used in array dimensions [PR114292]

Simon Martin Mon, 27 Jan 2025 11:35:18 -0800

Hi Jason,

On 27 Jan 2025, at 15:23, Jason Merrill wrote:


> On 1/27/25 8:14 AM, Simon Martin wrote:
>> Hi Jason,
>>
>> On 24 Jan 2025, at 16:51, Jason Merrill wrote:
>>
>>> On 1/24/25 6:19 AM, Simon Martin wrote:
>>>> Hi Jason,
>>>>
>>>> On 23 Jan 2025, at 22:56, Jason Merrill wrote:
>>>>
>>>>> On 1/23/25 12:06 PM, Simon Martin wrote:
>>>>>> Hi Marek,
>>>>>>
>>>>>> On 23 Jan 2025, at 16:45, Marek Polacek wrote:
>>>>>>
>>>>>>> On Thu, Jan 23, 2025 at 02:40:09PM +0000, Simon Martin wrote:
>>>>>>>> Hi Jason,
>>>>>>>>
>>>>>>>> On 20 Jan 2025, at 22:49, Jason Merrill wrote:
>>>>>>>>
>>>>>>>>> On 1/20/25 2:11 PM, Simon Martin wrote:
>>>>>>>>>> Hi Jason,
>>>>>>>>>>
>>>>>>>>>> On 15 Jan 2025, at 22:42, Jason Merrill wrote:
>>>>>>>>>>
>>>>>>>>>>> On 12/12/24 2:07 PM, Simon Martin wrote:
>>>>>>>>>>>> We currently ICE upon the following valid (under -Wno-vla)
>>
>>>>>>>>>>>> code
>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> === cut here ===
>>>>>>>>>>>> void f(int c) {
>>>>>>>>>>>>        constexpr int r = 4;
>>>>>>>>>>>>        [&](auto) { int t[r * c]; }(0);
>>>>>>>>>>>> }
>>>>>>>>>>>> === cut here ===
>>>>>>>>>>>>
>>>>>>>>>>>> The problem is that when parsing the lambda body, and more
>>>>>>>>>>>> specifically
>>>>>>>>>>>> the multiplication, we mark the lambda as
>>>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>>>>>>>>> even though the replacement of r by 4 is "undone" by the 
>>>>>>>>>>>> call
>>
>>>>>>>>>>>> to
>>>>>>>>>>>> build_min_non_dep in build_x_binary_op. This makes
>>>>>>>>>>>> prune_lambda_captures
>>>>>>>>>>>> remove the proxy declaration while it should not, and we 
>>>>>>>>>>>> trip
>>
>>>>>>>>>>>> on
>>>>>>>>>>>> an
>>>>>>>>
>>>>>>>>>>>> assert at instantiation time.
>>>>>>>>>>>
>>>>>>>>>>> Why does prune_lambda_captures remove the declaration if 
>>>>>>>>>>> it's
>>>>>>>>>>> still
>>>>>>>>>>> used in the function body?  Setting
>>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>>>>>>>> just
>>>>>>>>>>> means "we might have optimized away some captures", the tree
>>>>>>>>>>> walk
>>>>>>>>>>> should have found the use put back by build_x_binary_op.
>>>>>>>>>> I think that this is due to a bug in mark_const_cap_r, 
>>>>>>>>>> that’s
>>>>>>>>>> been
>>>>>>>>>> here since the beginning, i.e. r8-7213-g1577f10a637352: it
>>>>>>>>>> decides
>>>>>>>>
>>>>>>>>>> NOT
>>>>>>>>>> to walk sub-trees when encountering a DECL_EXPR for a 
>>>>>>>>>> constant
>>>>>>>>>> capture
>>>>>>>>>> proxy (at lambda.cc:1769). I don’t understand why we
>>
>>>>>>>>>> wouldn’t
>>>>>>>>>> want
>>>>>>>>>> to continue.
>>>>>>>>>
>>>>>>>>> Because that's the declaration of the capture proxy, not a use
>>>>>>>>> of
>>>>>>>>> it.
>>>>>>>> Indeed, thanks for clarifying.
>>>>>>>>
>>>>>>>>> Why aren't we finding the use in the declaration of t?
>>>>>>>> After further investigation, the problem is rather that neither
>>>>>>>> walk_tree_1 nor cp_walk_subtrees walk the dimensions of array
>>>>>>>> types,
>>>>>>>> so
>>>>>>>> we miss the uses.
>>>>>>>>
>>>>>>>>>> Removing that line fixes the PR, but breaks 3 existing tests
>>>>>>>>>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C,
>>>>>>>>>> that
>>>>>>>>>> all
>>>>>>>>>> assert that we optimise out the capture); that’s why I did
>>>>>>>>>> not
>>>>>>>>>> pursue
>>>>>>>>>> it in the first place.
>>>>>>>>>
>>>>>>>>> Right.
>>>>>>>>>
>>>>>>>>>> But taking another look, it might not be that big a deal that
>>>>>>>>>> we
>>>>>>>>>> don’t
>>>>>>>>>> optimise those out: as soon as we use -O1 or above, the
>>>>>>>>>> assignment
>>>>>>>>>> to
>>>>>>>>
>>>>>>>>>> the closure field actually disappears.
>>>>>>>>>
>>>>>>>>> Completely breaking this optimization is a big deal,
>>>>>>>>> particularly
>>>>>>>>> since it affects the layout of closure objects.  We can't 
>>>>>>>>> always
>>
>>>>>>>>> optimize everything away.
>>>>>>>> ACK.
>>>>>>>>
>>>>>>>> The attached updated patch makes cp_walk_subtrees walk array 
>>>>>>>> type
>>
>>>>>>>> dimensions, which fixes the initial PR without those 3
>>>>>>>> regressions.
>>>>
>>>>>>>>
>>>>>>>> Successfully tested on x86_64-pc-linux-gnu. Is it OK?
>>>>>>>>
>>>>>>>> Simon
>>>>>>>>
>>>>>>>> PS: In theory I think it’d make sense to do do this change in
>>>>>>>>
>>>>>>>> walk_tree_1 since C also supports VLAs, but doing so breaks 
>>>>>>>> some
>>>>
>>>>>>>> OMP
>>>>>>>> tests. OMP can do interesting things with array bounds (see
>>>>>>>> r9-5354-gda972c05f48637), and fixing those would require 
>>>>>>>> teaching
>>
>>>>>>>> walk_tree_1 about the “omp dummy var” array bounds, which I
>>>>>>>> think
>>>>>>>> would be a bit ugly. And I’m not aware of any C case that 
>>>>>>>> would
>>>>>>>> be
>>>>>>>> improved/fixed by doing this change, so we’re probably fine 
>>>>>>>> not
>>>>>>>> doing
>>>>>>>> it.
>>>>>>>
>>>>>>>>    From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17
>>>>>>>> 00:00:00
>>>>>>>> 2001
>>>>>>>> From: Simon Martin <si...@nasilyan.com>
>>>>>>>> Date: Wed, 22 Jan 2025 16:19:47 +0100
>>>>>>>> Subject: [PATCH] c++: Don't prune constant capture proxies only
>>>>>>>> used
>>>>>>>> in array
>>>>>>>>     dimensions [PR114292]
>>>>>>>>
>>>>>>>> We currently ICE upon the following valid (under -Wno-vla) code
>>>>>>>>
>>>>>>>> === cut here ===
>>>>>>>> void f(int c) {
>>>>>>>>      constexpr int r = 4;
>>>>>>>>      [&](auto) { int t[r * c]; }(0);
>>>>>>>> }
>>>>>>>> === cut here ===
>>>>>>>>
>>>>>>>> When parsing the lambda body, and more specifically the
>>>>>>>> multiplication,
>>>>>>>> we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which
>>>>>>>> indicates
>>>>>>>> to
>>>>>>>> prune_lambda_captures that it might be possible to optimize out
>>>>>>>> some
>>>>>>>> captures.
>>>>>>>>
>>>>>>>> The problem is that prune_lambda_captures then misses the use 
>>>>>>>> of
>>>>
>>>>>>>> the
>>>>>>>> r
>>>>>>>> capture (because neither walk_tree_1 nor cp_walk_subtrees walks
>>>>>>>> the
>>>>
>>>>>>>> dimensions of array types - here "r * c"), hence believes the
>>>>>>>> capture
>>>>>>>> can be pruned... and we trip on an assert when instantiating 
>>>>>>>> the
>>>>>>>> lambda.
>>>>>>>>
>>>>>>>> This patch changes cp_walk_subtrees so that when walking a
>>>>>>>> declaration
>>>>>>>> with an array type, it walks that array type's dimensions. 
>>>>>>>> Since
>>>>>>>> C
>>>>>>>> also
>>>>>>>> supports VLAs, I thought it'd make sense to do this in
>>>>>>>> walk_tree_1,
>>>>
>>>>>>>> but
>>>>>>>> this breaks some omp-low related test cases (because OMP can do
>>>>>>>> funky
>>>>>>>> things with array bounds when adjust_array_error_bounds is 
>>>>>>>> set),
>>>>
>>>>>>>> and
>>>>>>>> I
>>>>>>>> don't really want to add code to walk_tree_1 to skips arrays
>>>>>>>> whose
>>>>>>>> dimension is a temporary variable with the "omp dummy var"
>>>>>>>> attribute.
>>>>>>>>
>>>>>>>> Successfully tested on x86_64-pc-linux-gnu.
>>>>>>>>
>>>>>>>>        PR c++/114292
>>>>>>>>
>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>
>>>>>>>>        * tree.cc (cp_walk_subtrees): Walk array type dimensions.
>>>>>>>>
>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>
>>>>>>>>        * g++.dg/cpp1y/lambda-ice4.C: New test.
>>>>>>>>
>>>>>>>> ---
>>>>>>>>     gcc/cp/tree.cc                           | 11 ++++++
>>>>>>>>     gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44
>>>>>>>> ++++++++++++++++++++++++
>>>>>>>>     2 files changed, 55 insertions(+)
>>>>>>>>     create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
>>>>>>>>
>>>>>>>> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
>>>>>>>> index 36581865a17..fc9a2fbff32 100644
>>>>>>>> --- a/gcc/cp/tree.cc
>>>>>>>> +++ b/gcc/cp/tree.cc
>>>>>>>> @@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int
>>>>>>>> *walk_subtrees_p, walk_tree_fn func,
>>>>>>>>           break;
>>>>>>>>
>>>>>>>>         default:
>>>>>>>> +      /* If this is an array, walk its dimensions.  */
>>>>>>>> +      if (DECL_P (t) && TREE_TYPE (t)
>>>>>>>> +        && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
>>>>>>>> +      {
>>>>>>>> +        tree domain = TYPE_DOMAIN (TREE_TYPE (t));
>>>>>>>> +        if (domain)
>>>>>>>> +          {
>>>>>>>> +            WALK_SUBTREE (TYPE_MIN_VALUE (domain));
>>>>>>>> +            WALK_SUBTREE (TYPE_MAX_VALUE (domain));
>>>>>>>> +          }
>>>>>>>> +      }
>>>>>>>>           return NULL_TREE;
>>>>>>>>         }
>>>>>>>
>>>>>>> Is there a reason I'm missing not to put this into the TYPE_P
>>>>>>> block
>>>>>>> above?
>>>>>> Do you mean put it in that block as well, or instead of where I 
>>>>>> put
>>
>>>>>> it?
>>>>>> If the latter, we don’t see any TYPE_P node for “int[r*n]”,
>>>>>> so
>>>>>> we
>>>>>> don’t go into that block, and continue ICE’ing without the
>>>>>> change
>>>>>> in
>>>>>> the “default:”.
>>>>>>
>>>>>> Anyway it’s a very good call (thanks!), because it got me to
>>>>>> check
>>>>>> what we get if we change the lambda body to just do “typedef 
>>>>>> int
>>>>>> MyT[c*r];”, and we still ICE. And from a quick look, doing
>>>>
>>>>>> something
>>>>>> similar in the TYPE_P block does not fix it.
>>>>>>
>>>>>> I’ll work something out and report back.
>>>>>
>>>>> I would think this should go in the DECL_EXPR handling, so we 
>>>>> don't
>>>>> walk into the type every time we see a use of an array variable.
>>>> Indeed, we can do the check there, thanks a lot.
>>>>
>>>> And concerning the case of a single “typedef int MyT[r*c];”,
>>>> there
>>>> already is code in walk_tree_1 that covers such typedefs, but that
>>
>>>> does
>>>> not walk array type dimensions.
>>>>
>>>> So the result is the attached updated patch, successfully tested on
>>>> x86_64-pc-linux-gnu. OK for trunk? And if so, also for branches?
>>>
>>>>    * tree.cc (walk_tree_1): Walk array type dimensions, if any, for
>>>>    TYPE_DECLs.
>>>
>>> Hmm, why isn't this covered by the existing call to 
>>> walk_type_fields?
>>>
>>> Ah, I guess because that just walks into TYPE_DOMAIN, and nothing
>>> walks from there into the TYPE_MIN/MAX_VALUE.
>>>
>>> What if instead of touching walk_tree_1, mark_const_cap_r handled
>>> INTEGER_TYPE?
>> I don’t think we can look into TYPE_{MIN,MAX}_VALUE from
>> mark_const_cap_r since we might need to recurse (here we have a
>> MINUS_EXPR in TYPE_MAX_VALUE) and I don’t think we can from there?
>
> Several tree walk functions recurse, in which case they need to handle 
> the pointer_set themselves instead of using 
> cp_walk_tree_without_duplicates.  For instance, 
> find_parameter_packs_r. But no need for this patch.
>
>>> I know walk_tree used to walk into VLA bounds, which is why we have
>>> stabilize_vla_size to split out any relevant expressions into
>>> temporaries; I'm reluctant to re-add that without a lot of
>>> investigation into the history.
>> Yeah, makes sense. I learnt that touching walk_tree requires caution
>> through all that I broke with my few attempts at changing it :-)
>
> OK, I looked it up: this goes back to r119481
> (r0-77782-g8f6e6bf375959d), the bounds walking was removed because it 
> caused trouble for Ada.  That patch added bounds walking to 
> for_each_template_parm_r instead to avoid regressing C++.
Thanks for digging into this and providing extra context.

>>> In cp_walk_subtrees, I think we also want to go through the typedef
>>> handling at the top, so in the DECL_EXPR handling, let's just walk
>>> into the type of the decl. So we walk into the ARRAY_TYPE, check
>>> whether it's a typedef, if not walk_type_fields walks into the
>>> TYPE_DOMAIN, and the mark_const_cap_r adjustment I suggested above
>>> checks the TYPE_MAX_VALUE from there.
>> I think that the typedef handling is already done in walk_tree (at 
>> least
>> I was unable to craft a test case where doing something extra for
>> typedefs in cp_walk_subtrees was required).
>
> I agree that nothing extra is required; I was referring to the 
> typedef_variant_p code at the top of cp_walk_subtree.
>
>> And following up on your suggestion, things end up pretty simple: 
>> just
>> walk the type of declarations in DECL_EXPRs, and explicitly handle 
>> the
>> min/max value of INTEGER_TYPEs in cp_walk_subtree.
>>
>> At least that’s what the updated patch does, successfully tested on
>> x86_64-pc-linux-gnu. Ok for trunk?
>
>> @@ -5843,6 +5844,11 @@ cp_walk_subtrees (tree *tp, int 
>> *walk_subtrees_p, walk_tree_fn func,
>>        WALK_SUBTREE (STATIC_ASSERT_MESSAGE (t));
>>        break;
>> +    case INTEGER_TYPE:
>> +      WALK_SUBTREE (TYPE_MIN_VALUE (t));
>> +      WALK_SUBTREE (TYPE_MAX_VALUE (t));
>> +      break;
>
> Let's add a comment
>
> /* Removed from walk_type_fields in r119481.  */
>
> and remove the now-redundant INTEGER_TYPE handling from 
> for_each_template_parm_r.
>
> OK with those adjustments.
Thanks a lot. Merged to trunk as r15-7238-gceabea405ffdc8.

Even though it’s an “old” regression (from GCC 7), it’s a 
reject-valid and I’m tempted to backport; OK for active branches as 
well?

Simon

Re: [PATCH v3] c++: Don't prune constant capture proxies only used in array dimensions [PR114292]

Reply via email to