Hi Jason, On 27 Jan 2025, at 20:53, Jason Merrill wrote:
> On 1/27/25 2:34 PM, Simon Martin wrote: >> Hi Jason, >> >> On 27 Jan 2025, at 15:23, Jason Merrill wrote: >> >>> On 1/27/25 8:14 AM, Simon Martin wrote: >>>> Hi Jason, >>>> >>>> On 24 Jan 2025, at 16:51, Jason Merrill wrote: >>>> >>>>> On 1/24/25 6:19 AM, Simon Martin wrote: >>>>>> Hi Jason, >>>>>> >>>>>> On 23 Jan 2025, at 22:56, Jason Merrill wrote: >>>>>> >>>>>>> On 1/23/25 12:06 PM, Simon Martin wrote: >>>>>>>> Hi Marek, >>>>>>>> >>>>>>>> On 23 Jan 2025, at 16:45, Marek Polacek wrote: >>>>>>>> >>>>>>>>> On Thu, Jan 23, 2025 at 02:40:09PM +0000, Simon Martin wrote: >>>>>>>>>> Hi Jason, >>>>>>>>>> >>>>>>>>>> On 20 Jan 2025, at 22:49, Jason Merrill wrote: >>>>>>>>>> >>>>>>>>>>> On 1/20/25 2:11 PM, Simon Martin wrote: >>>>>>>>>>>> Hi Jason, >>>>>>>>>>>> >>>>>>>>>>>> On 15 Jan 2025, at 22:42, Jason Merrill wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On 12/12/24 2:07 PM, Simon Martin wrote: >>>>>>>>>>>>>> We currently ICE upon the following valid (under >>>>>>>>>>>>>> -Wno-vla) >>>> >>>>>>>>>>>>>> code >>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> === cut here === >>>>>>>>>>>>>> void f(int c) { >>>>>>>>>>>>>> constexpr int r = 4; >>>>>>>>>>>>>> [&](auto) { int t[r * c]; }(0); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> === cut here === >>>>>>>>>>>>>> >>>>>>>>>>>>>> The problem is that when parsing the lambda body, and >>>>>>>>>>>>>> more >>>>>>>>>>>>>> specifically >>>>>>>>>>>>>> the multiplication, we mark the lambda as >>>>>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED >>>>>>>>>>>>>> even though the replacement of r by 4 is "undone" by the >>>>>>>>>>>>>> call >>>> >>>>>>>>>>>>>> to >>>>>>>>>>>>>> build_min_non_dep in build_x_binary_op. This makes >>>>>>>>>>>>>> prune_lambda_captures >>>>>>>>>>>>>> remove the proxy declaration while it should not, and we >>>>>>>>>>>>>> trip >>>> >>>>>>>>>>>>>> on >>>>>>>>>>>>>> an >>>>>>>>>> >>>>>>>>>>>>>> assert at instantiation time. >>>>>>>>>>>>> >>>>>>>>>>>>> Why does prune_lambda_captures remove the declaration if >>>>>>>>>>>>> it's >>>>>>>>>>>>> still >>>>>>>>>>>>> used in the function body? Setting >>>>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED >>>>>>>>>>>>> just >>>>>>>>>>>>> means "we might have optimized away some captures", the >>>>>>>>>>>>> tree >>>>>>>>>>>>> walk >>>>>>>>>>>>> should have found the use put back by build_x_binary_op. >>>>>>>>>>>> I think that this is due to a bug in mark_const_cap_r, >>>>>>>>>>>> that’s >>>>>>>>>>>> been >>>>>>>>>>>> here since the beginning, i.e. r8-7213-g1577f10a637352: it >>>>>>>>>>>> decides >>>>>>>>>> >>>>>>>>>>>> NOT >>>>>>>>>>>> to walk sub-trees when encountering a DECL_EXPR for a >>>>>>>>>>>> constant >>>>>>>>>>>> capture >>>>>>>>>>>> proxy (at lambda.cc:1769). I don’t understand why we >>>> >>>>>>>>>>>> wouldn’t >>>>>>>>>>>> want >>>>>>>>>>>> to continue. >>>>>>>>>>> >>>>>>>>>>> Because that's the declaration of the capture proxy, not a >>>>>>>>>>> use >>>>>>>>>>> of >>>>>>>>>>> it. >>>>>>>>>> Indeed, thanks for clarifying. >>>>>>>>>> >>>>>>>>>>> Why aren't we finding the use in the declaration of t? >>>>>>>>>> After further investigation, the problem is rather that >>>>>>>>>> neither >>>>>>>>>> walk_tree_1 nor cp_walk_subtrees walk the dimensions of array >>>>>>>>>> types, >>>>>>>>>> so >>>>>>>>>> we miss the uses. >>>>>>>>>> >>>>>>>>>>>> Removing that line fixes the PR, but breaks 3 existing >>>>>>>>>>>> tests >>>>>>>>>>>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C, >>>>>>>>>>>> that >>>>>>>>>>>> all >>>>>>>>>>>> assert that we optimise out the capture); that’s why I >>>>>>>>>>>> did >>>>>>>>>>>> not >>>>>>>>>>>> pursue >>>>>>>>>>>> it in the first place. >>>>>>>>>>> >>>>>>>>>>> Right. >>>>>>>>>>> >>>>>>>>>>>> But taking another look, it might not be that big a deal >>>>>>>>>>>> that >>>>>>>>>>>> we >>>>>>>>>>>> don’t >>>>>>>>>>>> optimise those out: as soon as we use -O1 or above, the >>>>>>>>>>>> assignment >>>>>>>>>>>> to >>>>>>>>>> >>>>>>>>>>>> the closure field actually disappears. >>>>>>>>>>> >>>>>>>>>>> Completely breaking this optimization is a big deal, >>>>>>>>>>> particularly >>>>>>>>>>> since it affects the layout of closure objects. We can't >>>>>>>>>>> always >>>> >>>>>>>>>>> optimize everything away. >>>>>>>>>> ACK. >>>>>>>>>> >>>>>>>>>> The attached updated patch makes cp_walk_subtrees walk array >>>>>>>>>> type >>>> >>>>>>>>>> dimensions, which fixes the initial PR without those 3 >>>>>>>>>> regressions. >>>>>> >>>>>>>>>> >>>>>>>>>> Successfully tested on x86_64-pc-linux-gnu. Is it OK? >>>>>>>>>> >>>>>>>>>> Simon >>>>>>>>>> >>>>>>>>>> PS: In theory I think it’d make sense to do do this change >>>>>>>>>> in >>>>>>>>>> >>>>>>>>>> walk_tree_1 since C also supports VLAs, but doing so breaks >>>>>>>>>> some >>>>>> >>>>>>>>>> OMP >>>>>>>>>> tests. OMP can do interesting things with array bounds (see >>>>>>>>>> r9-5354-gda972c05f48637), and fixing those would require >>>>>>>>>> teaching >>>> >>>>>>>>>> walk_tree_1 about the “omp dummy var” array bounds, which >>>>>>>>>> I >>>>>>>>>> think >>>>>>>>>> would be a bit ugly. And I’m not aware of any C case that >>>>>>>>>> would >>>>>>>>>> be >>>>>>>>>> improved/fixed by doing this change, so we’re probably fine >>>>>>>>>> not >>>>>>>>>> doing >>>>>>>>>> it. >>>>>>>>> >>>>>>>>>> From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17 >>>>>>>>>> 00:00:00 >>>>>>>>>> 2001 >>>>>>>>>> From: Simon Martin <si...@nasilyan.com> >>>>>>>>>> Date: Wed, 22 Jan 2025 16:19:47 +0100 >>>>>>>>>> Subject: [PATCH] c++: Don't prune constant capture proxies >>>>>>>>>> only >>>>>>>>>> used >>>>>>>>>> in array >>>>>>>>>> dimensions [PR114292] >>>>>>>>>> >>>>>>>>>> We currently ICE upon the following valid (under -Wno-vla) >>>>>>>>>> code >>>>>>>>>> >>>>>>>>>> === cut here === >>>>>>>>>> void f(int c) { >>>>>>>>>> constexpr int r = 4; >>>>>>>>>> [&](auto) { int t[r * c]; }(0); >>>>>>>>>> } >>>>>>>>>> === cut here === >>>>>>>>>> >>>>>>>>>> When parsing the lambda body, and more specifically the >>>>>>>>>> multiplication, >>>>>>>>>> we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which >>>>>>>>>> indicates >>>>>>>>>> to >>>>>>>>>> prune_lambda_captures that it might be possible to optimize >>>>>>>>>> out >>>>>>>>>> some >>>>>>>>>> captures. >>>>>>>>>> >>>>>>>>>> The problem is that prune_lambda_captures then misses the use >>>>>>>>>> of >>>>>> >>>>>>>>>> the >>>>>>>>>> r >>>>>>>>>> capture (because neither walk_tree_1 nor cp_walk_subtrees >>>>>>>>>> walks >>>>>>>>>> the >>>>>> >>>>>>>>>> dimensions of array types - here "r * c"), hence believes the >>>>>>>>>> capture >>>>>>>>>> can be pruned... and we trip on an assert when instantiating >>>>>>>>>> the >>>>>>>>>> lambda. >>>>>>>>>> >>>>>>>>>> This patch changes cp_walk_subtrees so that when walking a >>>>>>>>>> declaration >>>>>>>>>> with an array type, it walks that array type's dimensions. >>>>>>>>>> Since >>>>>>>>>> C >>>>>>>>>> also >>>>>>>>>> supports VLAs, I thought it'd make sense to do this in >>>>>>>>>> walk_tree_1, >>>>>> >>>>>>>>>> but >>>>>>>>>> this breaks some omp-low related test cases (because OMP can >>>>>>>>>> do >>>>>>>>>> funky >>>>>>>>>> things with array bounds when adjust_array_error_bounds is >>>>>>>>>> set), >>>>>> >>>>>>>>>> and >>>>>>>>>> I >>>>>>>>>> don't really want to add code to walk_tree_1 to skips arrays >>>>>>>>>> whose >>>>>>>>>> dimension is a temporary variable with the "omp dummy var" >>>>>>>>>> attribute. >>>>>>>>>> >>>>>>>>>> Successfully tested on x86_64-pc-linux-gnu. >>>>>>>>>> >>>>>>>>>> PR c++/114292 >>>>>>>>>> >>>>>>>>>> gcc/cp/ChangeLog: >>>>>>>>>> >>>>>>>>>> * tree.cc (cp_walk_subtrees): Walk array type dimensions. >>>>>>>>>> >>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>> >>>>>>>>>> * g++.dg/cpp1y/lambda-ice4.C: New test. >>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> gcc/cp/tree.cc | 11 ++++++ >>>>>>>>>> gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44 >>>>>>>>>> ++++++++++++++++++++++++ >>>>>>>>>> 2 files changed, 55 insertions(+) >>>>>>>>>> create mode 100644 >>>>>>>>>> gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C >>>>>>>>>> >>>>>>>>>> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc >>>>>>>>>> index 36581865a17..fc9a2fbff32 100644 >>>>>>>>>> --- a/gcc/cp/tree.cc >>>>>>>>>> +++ b/gcc/cp/tree.cc >>>>>>>>>> @@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int >>>>>>>>>> *walk_subtrees_p, walk_tree_fn func, >>>>>>>>>> break; >>>>>>>>>> >>>>>>>>>> default: >>>>>>>>>> + /* If this is an array, walk its dimensions. */ >>>>>>>>>> + if (DECL_P (t) && TREE_TYPE (t) >>>>>>>>>> + && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE) >>>>>>>>>> + { >>>>>>>>>> + tree domain = TYPE_DOMAIN (TREE_TYPE (t)); >>>>>>>>>> + if (domain) >>>>>>>>>> + { >>>>>>>>>> + WALK_SUBTREE (TYPE_MIN_VALUE (domain)); >>>>>>>>>> + WALK_SUBTREE (TYPE_MAX_VALUE (domain)); >>>>>>>>>> + } >>>>>>>>>> + } >>>>>>>>>> return NULL_TREE; >>>>>>>>>> } >>>>>>>>> >>>>>>>>> Is there a reason I'm missing not to put this into the TYPE_P >>>>>>>>> block >>>>>>>>> above? >>>>>>>> Do you mean put it in that block as well, or instead of where I >>>>>>>> put >>>> >>>>>>>> it? >>>>>>>> If the latter, we don’t see any TYPE_P node for >>>>>>>> “int[r*n]”, >>>>>>>> so >>>>>>>> we >>>>>>>> don’t go into that block, and continue ICE’ing without the >>>>>>>> change >>>>>>>> in >>>>>>>> the “default:”. >>>>>>>> >>>>>>>> Anyway it’s a very good call (thanks!), because it got me to >>>>>>>> check >>>>>>>> what we get if we change the lambda body to just do “typedef >>>>>>>> int >>>>>>>> MyT[c*r];”, and we still ICE. And from a quick look, doing >>>>>> >>>>>>>> something >>>>>>>> similar in the TYPE_P block does not fix it. >>>>>>>> >>>>>>>> I’ll work something out and report back. >>>>>>> >>>>>>> I would think this should go in the DECL_EXPR handling, so we >>>>>>> don't >>>>>>> walk into the type every time we see a use of an array variable. >>>>>> Indeed, we can do the check there, thanks a lot. >>>>>> >>>>>> And concerning the case of a single “typedef int MyT[r*c];”, >>>>>> there >>>>>> already is code in walk_tree_1 that covers such typedefs, but >>>>>> that >>>> >>>>>> does >>>>>> not walk array type dimensions. >>>>>> >>>>>> So the result is the attached updated patch, successfully tested >>>>>> on >>>>>> x86_64-pc-linux-gnu. OK for trunk? And if so, also for branches? >>>>> >>>>>> * tree.cc (walk_tree_1): Walk array type dimensions, if any, for >>>>>> TYPE_DECLs. >>>>> >>>>> Hmm, why isn't this covered by the existing call to >>>>> walk_type_fields? >>>>> >>>>> Ah, I guess because that just walks into TYPE_DOMAIN, and nothing >>>>> walks from there into the TYPE_MIN/MAX_VALUE. >>>>> >>>>> What if instead of touching walk_tree_1, mark_const_cap_r handled >>>>> INTEGER_TYPE? >>>> I don’t think we can look into TYPE_{MIN,MAX}_VALUE from >>>> mark_const_cap_r since we might need to recurse (here we have a >>>> MINUS_EXPR in TYPE_MAX_VALUE) and I don’t think we can from >>>> there? >>> >>> Several tree walk functions recurse, in which case they need to >>> handle >>> the pointer_set themselves instead of using >>> cp_walk_tree_without_duplicates. For instance, >>> find_parameter_packs_r. But no need for this patch. >>> >>>>> I know walk_tree used to walk into VLA bounds, which is why we >>>>> have >>>>> stabilize_vla_size to split out any relevant expressions into >>>>> temporaries; I'm reluctant to re-add that without a lot of >>>>> investigation into the history. >>>> Yeah, makes sense. I learnt that touching walk_tree requires >>>> caution >>>> through all that I broke with my few attempts at changing it :-) >>> >>> OK, I looked it up: this goes back to r119481 >>> (r0-77782-g8f6e6bf375959d), the bounds walking was removed because >>> it >>> caused trouble for Ada. That patch added bounds walking to >>> for_each_template_parm_r instead to avoid regressing C++. >> Thanks for digging into this and providing extra context. >> >>>>> In cp_walk_subtrees, I think we also want to go through the >>>>> typedef >>>>> handling at the top, so in the DECL_EXPR handling, let's just walk >>>>> into the type of the decl. So we walk into the ARRAY_TYPE, check >>>>> whether it's a typedef, if not walk_type_fields walks into the >>>>> TYPE_DOMAIN, and the mark_const_cap_r adjustment I suggested above >>>>> checks the TYPE_MAX_VALUE from there. >>>> I think that the typedef handling is already done in walk_tree (at >>>> least >>>> I was unable to craft a test case where doing something extra for >>>> typedefs in cp_walk_subtrees was required). >>> >>> I agree that nothing extra is required; I was referring to the >>> typedef_variant_p code at the top of cp_walk_subtree. >>> >>>> And following up on your suggestion, things end up pretty simple: >>>> just >>>> walk the type of declarations in DECL_EXPRs, and explicitly handle >>>> the >>>> min/max value of INTEGER_TYPEs in cp_walk_subtree. >>>> >>>> At least that’s what the updated patch does, successfully tested >>>> on >>>> x86_64-pc-linux-gnu. Ok for trunk? >>> >>>> @@ -5843,6 +5844,11 @@ cp_walk_subtrees (tree *tp, int >>>> *walk_subtrees_p, walk_tree_fn func, >>>> WALK_SUBTREE (STATIC_ASSERT_MESSAGE (t)); >>>> break; >>>> + case INTEGER_TYPE: >>>> + WALK_SUBTREE (TYPE_MIN_VALUE (t)); >>>> + WALK_SUBTREE (TYPE_MAX_VALUE (t)); >>>> + break; >>> >>> Let's add a comment >>> >>> /* Removed from walk_type_fields in r119481. */ >>> >>> and remove the now-redundant INTEGER_TYPE handling from >>> for_each_template_parm_r. >>> >>> OK with those adjustments. >> Thanks a lot. Merged to trunk as r15-7238-gceabea405ffdc8. >> >> Even though it’s an “old” regression (from GCC 7), it’s a >> reject-valid and I’m tempted to backport; OK for active branches as >> well? > > For backports I'd be inclined to condition the INTEGER_TYPE handing on > if (processing_template_decl) to reduce its impact. But first let's > wait and see if this breaks anything on trunk. ACK. I’ll resume the discussion about backports via this thread in 2-3 weeks. Simon