Re: [PATCH v3] c++: Don't prune constant capture proxies only used in array dimensions [PR114292]

Simon Martin Mon, 27 Jan 2025 12:17:45 -0800

Hi Jason,

On 27 Jan 2025, at 20:53, Jason Merrill wrote:


> On 1/27/25 2:34 PM, Simon Martin wrote:
>> Hi Jason,
>>
>> On 27 Jan 2025, at 15:23, Jason Merrill wrote:
>>
>>> On 1/27/25 8:14 AM, Simon Martin wrote:
>>>> Hi Jason,
>>>>
>>>> On 24 Jan 2025, at 16:51, Jason Merrill wrote:
>>>>
>>>>> On 1/24/25 6:19 AM, Simon Martin wrote:
>>>>>> Hi Jason,
>>>>>>
>>>>>> On 23 Jan 2025, at 22:56, Jason Merrill wrote:
>>>>>>
>>>>>>> On 1/23/25 12:06 PM, Simon Martin wrote:
>>>>>>>> Hi Marek,
>>>>>>>>
>>>>>>>> On 23 Jan 2025, at 16:45, Marek Polacek wrote:
>>>>>>>>
>>>>>>>>> On Thu, Jan 23, 2025 at 02:40:09PM +0000, Simon Martin wrote:
>>>>>>>>>> Hi Jason,
>>>>>>>>>>
>>>>>>>>>> On 20 Jan 2025, at 22:49, Jason Merrill wrote:
>>>>>>>>>>
>>>>>>>>>>> On 1/20/25 2:11 PM, Simon Martin wrote:
>>>>>>>>>>>> Hi Jason,
>>>>>>>>>>>>
>>>>>>>>>>>> On 15 Jan 2025, at 22:42, Jason Merrill wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/12/24 2:07 PM, Simon Martin wrote:
>>>>>>>>>>>>>> We currently ICE upon the following valid (under 
>>>>>>>>>>>>>> -Wno-vla)
>>>>
>>>>>>>>>>>>>> code
>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> === cut here ===
>>>>>>>>>>>>>> void f(int c) {
>>>>>>>>>>>>>>         constexpr int r = 4;
>>>>>>>>>>>>>>         [&](auto) { int t[r * c]; }(0);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> === cut here ===
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem is that when parsing the lambda body, and 
>>>>>>>>>>>>>> more
>>>>>>>>>>>>>> specifically
>>>>>>>>>>>>>> the multiplication, we mark the lambda as
>>>>>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>>>>>>>>>>> even though the replacement of r by 4 is "undone" by the
>>>>>>>>>>>>>> call
>>>>
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> build_min_non_dep in build_x_binary_op. This makes
>>>>>>>>>>>>>> prune_lambda_captures
>>>>>>>>>>>>>> remove the proxy declaration while it should not, and we
>>>>>>>>>>>>>> trip
>>>>
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>> an
>>>>>>>>>>
>>>>>>>>>>>>>> assert at instantiation time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Why does prune_lambda_captures remove the declaration if
>>>>>>>>>>>>> it's
>>>>>>>>>>>>> still
>>>>>>>>>>>>> used in the function body?  Setting
>>>>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>>>>>>>>>> just
>>>>>>>>>>>>> means "we might have optimized away some captures", the 
>>>>>>>>>>>>> tree
>>>>>>>>>>>>> walk
>>>>>>>>>>>>> should have found the use put back by build_x_binary_op.
>>>>>>>>>>>> I think that this is due to a bug in mark_const_cap_r,
>>>>>>>>>>>> that’s
>>>>>>>>>>>> been
>>>>>>>>>>>> here since the beginning, i.e. r8-7213-g1577f10a637352: it
>>>>>>>>>>>> decides
>>>>>>>>>>
>>>>>>>>>>>> NOT
>>>>>>>>>>>> to walk sub-trees when encountering a DECL_EXPR for a
>>>>>>>>>>>> constant
>>>>>>>>>>>> capture
>>>>>>>>>>>> proxy (at lambda.cc:1769). I don’t understand why we
>>>>
>>>>>>>>>>>> wouldn’t
>>>>>>>>>>>> want
>>>>>>>>>>>> to continue.
>>>>>>>>>>>
>>>>>>>>>>> Because that's the declaration of the capture proxy, not a 
>>>>>>>>>>> use
>>>>>>>>>>> of
>>>>>>>>>>> it.
>>>>>>>>>> Indeed, thanks for clarifying.
>>>>>>>>>>
>>>>>>>>>>> Why aren't we finding the use in the declaration of t?
>>>>>>>>>> After further investigation, the problem is rather that 
>>>>>>>>>> neither
>>>>>>>>>> walk_tree_1 nor cp_walk_subtrees walk the dimensions of array
>>>>>>>>>> types,
>>>>>>>>>> so
>>>>>>>>>> we miss the uses.
>>>>>>>>>>
>>>>>>>>>>>> Removing that line fixes the PR, but breaks 3 existing 
>>>>>>>>>>>> tests
>>>>>>>>>>>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C,
>>>>>>>>>>>> that
>>>>>>>>>>>> all
>>>>>>>>>>>> assert that we optimise out the capture); that’s why I 
>>>>>>>>>>>> did
>>>>>>>>>>>> not
>>>>>>>>>>>> pursue
>>>>>>>>>>>> it in the first place.
>>>>>>>>>>>
>>>>>>>>>>> Right.
>>>>>>>>>>>
>>>>>>>>>>>> But taking another look, it might not be that big a deal 
>>>>>>>>>>>> that
>>>>>>>>>>>> we
>>>>>>>>>>>> don’t
>>>>>>>>>>>> optimise those out: as soon as we use -O1 or above, the
>>>>>>>>>>>> assignment
>>>>>>>>>>>> to
>>>>>>>>>>
>>>>>>>>>>>> the closure field actually disappears.
>>>>>>>>>>>
>>>>>>>>>>> Completely breaking this optimization is a big deal,
>>>>>>>>>>> particularly
>>>>>>>>>>> since it affects the layout of closure objects.  We can't
>>>>>>>>>>> always
>>>>
>>>>>>>>>>> optimize everything away.
>>>>>>>>>> ACK.
>>>>>>>>>>
>>>>>>>>>> The attached updated patch makes cp_walk_subtrees walk array
>>>>>>>>>> type
>>>>
>>>>>>>>>> dimensions, which fixes the initial PR without those 3
>>>>>>>>>> regressions.
>>>>>>
>>>>>>>>>>
>>>>>>>>>> Successfully tested on x86_64-pc-linux-gnu. Is it OK?
>>>>>>>>>>
>>>>>>>>>> Simon
>>>>>>>>>>
>>>>>>>>>> PS: In theory I think it’d make sense to do do this change 
>>>>>>>>>> in
>>>>>>>>>>
>>>>>>>>>> walk_tree_1 since C also supports VLAs, but doing so breaks
>>>>>>>>>> some
>>>>>>
>>>>>>>>>> OMP
>>>>>>>>>> tests. OMP can do interesting things with array bounds (see
>>>>>>>>>> r9-5354-gda972c05f48637), and fixing those would require
>>>>>>>>>> teaching
>>>>
>>>>>>>>>> walk_tree_1 about the “omp dummy var” array bounds, which 
>>>>>>>>>> I
>>>>>>>>>> think
>>>>>>>>>> would be a bit ugly. And I’m not aware of any C case that
>>>>>>>>>> would
>>>>>>>>>> be
>>>>>>>>>> improved/fixed by doing this change, so we’re probably fine
>>>>>>>>>> not
>>>>>>>>>> doing
>>>>>>>>>> it.
>>>>>>>>>
>>>>>>>>>>     From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17
>>>>>>>>>> 00:00:00
>>>>>>>>>> 2001
>>>>>>>>>> From: Simon Martin <si...@nasilyan.com>
>>>>>>>>>> Date: Wed, 22 Jan 2025 16:19:47 +0100
>>>>>>>>>> Subject: [PATCH] c++: Don't prune constant capture proxies 
>>>>>>>>>> only
>>>>>>>>>> used
>>>>>>>>>> in array
>>>>>>>>>>      dimensions [PR114292]
>>>>>>>>>>
>>>>>>>>>> We currently ICE upon the following valid (under -Wno-vla) 
>>>>>>>>>> code
>>>>>>>>>>
>>>>>>>>>> === cut here ===
>>>>>>>>>> void f(int c) {
>>>>>>>>>>       constexpr int r = 4;
>>>>>>>>>>       [&](auto) { int t[r * c]; }(0);
>>>>>>>>>> }
>>>>>>>>>> === cut here ===
>>>>>>>>>>
>>>>>>>>>> When parsing the lambda body, and more specifically the
>>>>>>>>>> multiplication,
>>>>>>>>>> we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which
>>>>>>>>>> indicates
>>>>>>>>>> to
>>>>>>>>>> prune_lambda_captures that it might be possible to optimize 
>>>>>>>>>> out
>>>>>>>>>> some
>>>>>>>>>> captures.
>>>>>>>>>>
>>>>>>>>>> The problem is that prune_lambda_captures then misses the use
>>>>>>>>>> of
>>>>>>
>>>>>>>>>> the
>>>>>>>>>> r
>>>>>>>>>> capture (because neither walk_tree_1 nor cp_walk_subtrees 
>>>>>>>>>> walks
>>>>>>>>>> the
>>>>>>
>>>>>>>>>> dimensions of array types - here "r * c"), hence believes the
>>>>>>>>>> capture
>>>>>>>>>> can be pruned... and we trip on an assert when instantiating
>>>>>>>>>> the
>>>>>>>>>> lambda.
>>>>>>>>>>
>>>>>>>>>> This patch changes cp_walk_subtrees so that when walking a
>>>>>>>>>> declaration
>>>>>>>>>> with an array type, it walks that array type's dimensions.
>>>>>>>>>> Since
>>>>>>>>>> C
>>>>>>>>>> also
>>>>>>>>>> supports VLAs, I thought it'd make sense to do this in
>>>>>>>>>> walk_tree_1,
>>>>>>
>>>>>>>>>> but
>>>>>>>>>> this breaks some omp-low related test cases (because OMP can 
>>>>>>>>>> do
>>>>>>>>>> funky
>>>>>>>>>> things with array bounds when adjust_array_error_bounds is
>>>>>>>>>> set),
>>>>>>
>>>>>>>>>> and
>>>>>>>>>> I
>>>>>>>>>> don't really want to add code to walk_tree_1 to skips arrays
>>>>>>>>>> whose
>>>>>>>>>> dimension is a temporary variable with the "omp dummy var"
>>>>>>>>>> attribute.
>>>>>>>>>>
>>>>>>>>>> Successfully tested on x86_64-pc-linux-gnu.
>>>>>>>>>>
>>>>>>>>>>      PR c++/114292
>>>>>>>>>>
>>>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>>>
>>>>>>>>>>      * tree.cc (cp_walk_subtrees): Walk array type dimensions.
>>>>>>>>>>
>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>
>>>>>>>>>>      * g++.dg/cpp1y/lambda-ice4.C: New test.
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>      gcc/cp/tree.cc                           | 11 ++++++
>>>>>>>>>>      gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44
>>>>>>>>>> ++++++++++++++++++++++++
>>>>>>>>>>      2 files changed, 55 insertions(+)
>>>>>>>>>>      create mode 100644 
>>>>>>>>>> gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
>>>>>>>>>>
>>>>>>>>>> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
>>>>>>>>>> index 36581865a17..fc9a2fbff32 100644
>>>>>>>>>> --- a/gcc/cp/tree.cc
>>>>>>>>>> +++ b/gcc/cp/tree.cc
>>>>>>>>>> @@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int
>>>>>>>>>> *walk_subtrees_p, walk_tree_fn func,
>>>>>>>>>>            break;
>>>>>>>>>>
>>>>>>>>>>          default:
>>>>>>>>>> +      /* If this is an array, walk its dimensions.  */
>>>>>>>>>> +      if (DECL_P (t) && TREE_TYPE (t)
>>>>>>>>>> +      && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
>>>>>>>>>> +    {
>>>>>>>>>> +      tree domain = TYPE_DOMAIN (TREE_TYPE (t));
>>>>>>>>>> +      if (domain)
>>>>>>>>>> +        {
>>>>>>>>>> +          WALK_SUBTREE (TYPE_MIN_VALUE (domain));
>>>>>>>>>> +          WALK_SUBTREE (TYPE_MAX_VALUE (domain));
>>>>>>>>>> +        }
>>>>>>>>>> +    }
>>>>>>>>>>            return NULL_TREE;
>>>>>>>>>>          }
>>>>>>>>>
>>>>>>>>> Is there a reason I'm missing not to put this into the TYPE_P
>>>>>>>>> block
>>>>>>>>> above?
>>>>>>>> Do you mean put it in that block as well, or instead of where I
>>>>>>>> put
>>>>
>>>>>>>> it?
>>>>>>>> If the latter, we don’t see any TYPE_P node for 
>>>>>>>> “int[r*n]”,
>>>>>>>> so
>>>>>>>> we
>>>>>>>> don’t go into that block, and continue ICE’ing without the
>>>>>>>> change
>>>>>>>> in
>>>>>>>> the “default:”.
>>>>>>>>
>>>>>>>> Anyway it’s a very good call (thanks!), because it got me to
>>>>>>>> check
>>>>>>>> what we get if we change the lambda body to just do “typedef
>>>>>>>> int
>>>>>>>> MyT[c*r];”, and we still ICE. And from a quick look, doing
>>>>>>
>>>>>>>> something
>>>>>>>> similar in the TYPE_P block does not fix it.
>>>>>>>>
>>>>>>>> I’ll work something out and report back.
>>>>>>>
>>>>>>> I would think this should go in the DECL_EXPR handling, so we
>>>>>>> don't
>>>>>>> walk into the type every time we see a use of an array variable.
>>>>>> Indeed, we can do the check there, thanks a lot.
>>>>>>
>>>>>> And concerning the case of a single “typedef int MyT[r*c];”,
>>>>>> there
>>>>>> already is code in walk_tree_1 that covers such typedefs, but 
>>>>>> that
>>>>
>>>>>> does
>>>>>> not walk array type dimensions.
>>>>>>
>>>>>> So the result is the attached updated patch, successfully tested 
>>>>>> on
>>>>>> x86_64-pc-linux-gnu. OK for trunk? And if so, also for branches?
>>>>>
>>>>>>  * tree.cc (walk_tree_1): Walk array type dimensions, if any, for
>>>>>>  TYPE_DECLs.
>>>>>
>>>>> Hmm, why isn't this covered by the existing call to
>>>>> walk_type_fields?
>>>>>
>>>>> Ah, I guess because that just walks into TYPE_DOMAIN, and nothing
>>>>> walks from there into the TYPE_MIN/MAX_VALUE.
>>>>>
>>>>> What if instead of touching walk_tree_1, mark_const_cap_r handled
>>>>> INTEGER_TYPE?
>>>> I don’t think we can look into TYPE_{MIN,MAX}_VALUE from
>>>> mark_const_cap_r since we might need to recurse (here we have a
>>>> MINUS_EXPR in TYPE_MAX_VALUE) and I don’t think we can from 
>>>> there?
>>>
>>> Several tree walk functions recurse, in which case they need to 
>>> handle
>>> the pointer_set themselves instead of using
>>> cp_walk_tree_without_duplicates.  For instance,
>>> find_parameter_packs_r. But no need for this patch.
>>>
>>>>> I know walk_tree used to walk into VLA bounds, which is why we 
>>>>> have
>>>>> stabilize_vla_size to split out any relevant expressions into
>>>>> temporaries; I'm reluctant to re-add that without a lot of
>>>>> investigation into the history.
>>>> Yeah, makes sense. I learnt that touching walk_tree requires 
>>>> caution
>>>> through all that I broke with my few attempts at changing it :-)
>>>
>>> OK, I looked it up: this goes back to r119481
>>> (r0-77782-g8f6e6bf375959d), the bounds walking was removed because 
>>> it
>>> caused trouble for Ada.  That patch added bounds walking to
>>> for_each_template_parm_r instead to avoid regressing C++.
>> Thanks for digging into this and providing extra context.
>>
>>>>> In cp_walk_subtrees, I think we also want to go through the 
>>>>> typedef
>>>>> handling at the top, so in the DECL_EXPR handling, let's just walk
>>>>> into the type of the decl. So we walk into the ARRAY_TYPE, check
>>>>> whether it's a typedef, if not walk_type_fields walks into the
>>>>> TYPE_DOMAIN, and the mark_const_cap_r adjustment I suggested above
>>>>> checks the TYPE_MAX_VALUE from there.
>>>> I think that the typedef handling is already done in walk_tree (at
>>>> least
>>>> I was unable to craft a test case where doing something extra for
>>>> typedefs in cp_walk_subtrees was required).
>>>
>>> I agree that nothing extra is required; I was referring to the
>>> typedef_variant_p code at the top of cp_walk_subtree.
>>>
>>>> And following up on your suggestion, things end up pretty simple:
>>>> just
>>>> walk the type of declarations in DECL_EXPRs, and explicitly handle
>>>> the
>>>> min/max value of INTEGER_TYPEs in cp_walk_subtree.
>>>>
>>>> At least that’s what the updated patch does, successfully tested 
>>>> on
>>>> x86_64-pc-linux-gnu. Ok for trunk?
>>>
>>>> @@ -5843,6 +5844,11 @@ cp_walk_subtrees (tree *tp, int
>>>> *walk_subtrees_p, walk_tree_fn func,
>>>>         WALK_SUBTREE (STATIC_ASSERT_MESSAGE (t));
>>>>         break;
>>>> +    case INTEGER_TYPE:
>>>> +      WALK_SUBTREE (TYPE_MIN_VALUE (t));
>>>> +      WALK_SUBTREE (TYPE_MAX_VALUE (t));
>>>> +      break;
>>>
>>> Let's add a comment
>>>
>>> /* Removed from walk_type_fields in r119481.  */
>>>
>>> and remove the now-redundant INTEGER_TYPE handling from
>>> for_each_template_parm_r.
>>>
>>> OK with those adjustments.
>> Thanks a lot. Merged to trunk as r15-7238-gceabea405ffdc8.
>>
>> Even though it’s an “old” regression (from GCC 7), it’s a
>> reject-valid and I’m tempted to backport; OK for active branches as
>> well?
>
> For backports I'd be inclined to condition the INTEGER_TYPE handing on 
> if (processing_template_decl) to reduce its impact.  But first let's 
> wait and see if this breaks anything on trunk.
ACK. I’ll resume the discussion about backports via this thread in 2-3 
weeks.

Simon

Re: [PATCH v3] c++: Don't prune constant capture proxies only used in array dimensions [PR114292]

Reply via email to