Hi Jason, On 27 Jan 2025, at 15:23, Jason Merrill wrote:
> On 1/27/25 8:14 AM, Simon Martin wrote: >> Hi Jason, >> >> On 24 Jan 2025, at 16:51, Jason Merrill wrote: >> >>> On 1/24/25 6:19 AM, Simon Martin wrote: >>>> Hi Jason, >>>> >>>> On 23 Jan 2025, at 22:56, Jason Merrill wrote: >>>> >>>>> On 1/23/25 12:06 PM, Simon Martin wrote: >>>>>> Hi Marek, >>>>>> >>>>>> On 23 Jan 2025, at 16:45, Marek Polacek wrote: >>>>>> >>>>>>> On Thu, Jan 23, 2025 at 02:40:09PM +0000, Simon Martin wrote: >>>>>>>> Hi Jason, >>>>>>>> >>>>>>>> On 20 Jan 2025, at 22:49, Jason Merrill wrote: >>>>>>>> >>>>>>>>> On 1/20/25 2:11 PM, Simon Martin wrote: >>>>>>>>>> Hi Jason, >>>>>>>>>> >>>>>>>>>> On 15 Jan 2025, at 22:42, Jason Merrill wrote: >>>>>>>>>> >>>>>>>>>>> On 12/12/24 2:07 PM, Simon Martin wrote: >>>>>>>>>>>> We currently ICE upon the following valid (under -Wno-vla) >> >>>>>>>>>>>> code >>>> >>>>>>>>>>>> >>>>>>>>>>>> === cut here === >>>>>>>>>>>> void f(int c) { >>>>>>>>>>>> constexpr int r = 4; >>>>>>>>>>>> [&](auto) { int t[r * c]; }(0); >>>>>>>>>>>> } >>>>>>>>>>>> === cut here === >>>>>>>>>>>> >>>>>>>>>>>> The problem is that when parsing the lambda body, and more >>>>>>>>>>>> specifically >>>>>>>>>>>> the multiplication, we mark the lambda as >>>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED >>>>>>>>>>>> even though the replacement of r by 4 is "undone" by the >>>>>>>>>>>> call >> >>>>>>>>>>>> to >>>>>>>>>>>> build_min_non_dep in build_x_binary_op. This makes >>>>>>>>>>>> prune_lambda_captures >>>>>>>>>>>> remove the proxy declaration while it should not, and we >>>>>>>>>>>> trip >> >>>>>>>>>>>> on >>>>>>>>>>>> an >>>>>>>> >>>>>>>>>>>> assert at instantiation time. >>>>>>>>>>> >>>>>>>>>>> Why does prune_lambda_captures remove the declaration if >>>>>>>>>>> it's >>>>>>>>>>> still >>>>>>>>>>> used in the function body? Setting >>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED >>>>>>>>>>> just >>>>>>>>>>> means "we might have optimized away some captures", the tree >>>>>>>>>>> walk >>>>>>>>>>> should have found the use put back by build_x_binary_op. >>>>>>>>>> I think that this is due to a bug in mark_const_cap_r, >>>>>>>>>> that’s >>>>>>>>>> been >>>>>>>>>> here since the beginning, i.e. r8-7213-g1577f10a637352: it >>>>>>>>>> decides >>>>>>>> >>>>>>>>>> NOT >>>>>>>>>> to walk sub-trees when encountering a DECL_EXPR for a >>>>>>>>>> constant >>>>>>>>>> capture >>>>>>>>>> proxy (at lambda.cc:1769). I don’t understand why we >> >>>>>>>>>> wouldn’t >>>>>>>>>> want >>>>>>>>>> to continue. >>>>>>>>> >>>>>>>>> Because that's the declaration of the capture proxy, not a use >>>>>>>>> of >>>>>>>>> it. >>>>>>>> Indeed, thanks for clarifying. >>>>>>>> >>>>>>>>> Why aren't we finding the use in the declaration of t? >>>>>>>> After further investigation, the problem is rather that neither >>>>>>>> walk_tree_1 nor cp_walk_subtrees walk the dimensions of array >>>>>>>> types, >>>>>>>> so >>>>>>>> we miss the uses. >>>>>>>> >>>>>>>>>> Removing that line fixes the PR, but breaks 3 existing tests >>>>>>>>>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C, >>>>>>>>>> that >>>>>>>>>> all >>>>>>>>>> assert that we optimise out the capture); that’s why I did >>>>>>>>>> not >>>>>>>>>> pursue >>>>>>>>>> it in the first place. >>>>>>>>> >>>>>>>>> Right. >>>>>>>>> >>>>>>>>>> But taking another look, it might not be that big a deal that >>>>>>>>>> we >>>>>>>>>> don’t >>>>>>>>>> optimise those out: as soon as we use -O1 or above, the >>>>>>>>>> assignment >>>>>>>>>> to >>>>>>>> >>>>>>>>>> the closure field actually disappears. >>>>>>>>> >>>>>>>>> Completely breaking this optimization is a big deal, >>>>>>>>> particularly >>>>>>>>> since it affects the layout of closure objects. We can't >>>>>>>>> always >> >>>>>>>>> optimize everything away. >>>>>>>> ACK. >>>>>>>> >>>>>>>> The attached updated patch makes cp_walk_subtrees walk array >>>>>>>> type >> >>>>>>>> dimensions, which fixes the initial PR without those 3 >>>>>>>> regressions. >>>> >>>>>>>> >>>>>>>> Successfully tested on x86_64-pc-linux-gnu. Is it OK? >>>>>>>> >>>>>>>> Simon >>>>>>>> >>>>>>>> PS: In theory I think it’d make sense to do do this change in >>>>>>>> >>>>>>>> walk_tree_1 since C also supports VLAs, but doing so breaks >>>>>>>> some >>>> >>>>>>>> OMP >>>>>>>> tests. OMP can do interesting things with array bounds (see >>>>>>>> r9-5354-gda972c05f48637), and fixing those would require >>>>>>>> teaching >> >>>>>>>> walk_tree_1 about the “omp dummy var” array bounds, which I >>>>>>>> think >>>>>>>> would be a bit ugly. And I’m not aware of any C case that >>>>>>>> would >>>>>>>> be >>>>>>>> improved/fixed by doing this change, so we’re probably fine >>>>>>>> not >>>>>>>> doing >>>>>>>> it. >>>>>>> >>>>>>>> From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17 >>>>>>>> 00:00:00 >>>>>>>> 2001 >>>>>>>> From: Simon Martin <si...@nasilyan.com> >>>>>>>> Date: Wed, 22 Jan 2025 16:19:47 +0100 >>>>>>>> Subject: [PATCH] c++: Don't prune constant capture proxies only >>>>>>>> used >>>>>>>> in array >>>>>>>> dimensions [PR114292] >>>>>>>> >>>>>>>> We currently ICE upon the following valid (under -Wno-vla) code >>>>>>>> >>>>>>>> === cut here === >>>>>>>> void f(int c) { >>>>>>>> constexpr int r = 4; >>>>>>>> [&](auto) { int t[r * c]; }(0); >>>>>>>> } >>>>>>>> === cut here === >>>>>>>> >>>>>>>> When parsing the lambda body, and more specifically the >>>>>>>> multiplication, >>>>>>>> we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which >>>>>>>> indicates >>>>>>>> to >>>>>>>> prune_lambda_captures that it might be possible to optimize out >>>>>>>> some >>>>>>>> captures. >>>>>>>> >>>>>>>> The problem is that prune_lambda_captures then misses the use >>>>>>>> of >>>> >>>>>>>> the >>>>>>>> r >>>>>>>> capture (because neither walk_tree_1 nor cp_walk_subtrees walks >>>>>>>> the >>>> >>>>>>>> dimensions of array types - here "r * c"), hence believes the >>>>>>>> capture >>>>>>>> can be pruned... and we trip on an assert when instantiating >>>>>>>> the >>>>>>>> lambda. >>>>>>>> >>>>>>>> This patch changes cp_walk_subtrees so that when walking a >>>>>>>> declaration >>>>>>>> with an array type, it walks that array type's dimensions. >>>>>>>> Since >>>>>>>> C >>>>>>>> also >>>>>>>> supports VLAs, I thought it'd make sense to do this in >>>>>>>> walk_tree_1, >>>> >>>>>>>> but >>>>>>>> this breaks some omp-low related test cases (because OMP can do >>>>>>>> funky >>>>>>>> things with array bounds when adjust_array_error_bounds is >>>>>>>> set), >>>> >>>>>>>> and >>>>>>>> I >>>>>>>> don't really want to add code to walk_tree_1 to skips arrays >>>>>>>> whose >>>>>>>> dimension is a temporary variable with the "omp dummy var" >>>>>>>> attribute. >>>>>>>> >>>>>>>> Successfully tested on x86_64-pc-linux-gnu. >>>>>>>> >>>>>>>> PR c++/114292 >>>>>>>> >>>>>>>> gcc/cp/ChangeLog: >>>>>>>> >>>>>>>> * tree.cc (cp_walk_subtrees): Walk array type dimensions. >>>>>>>> >>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>> >>>>>>>> * g++.dg/cpp1y/lambda-ice4.C: New test. >>>>>>>> >>>>>>>> --- >>>>>>>> gcc/cp/tree.cc | 11 ++++++ >>>>>>>> gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44 >>>>>>>> ++++++++++++++++++++++++ >>>>>>>> 2 files changed, 55 insertions(+) >>>>>>>> create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C >>>>>>>> >>>>>>>> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc >>>>>>>> index 36581865a17..fc9a2fbff32 100644 >>>>>>>> --- a/gcc/cp/tree.cc >>>>>>>> +++ b/gcc/cp/tree.cc >>>>>>>> @@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int >>>>>>>> *walk_subtrees_p, walk_tree_fn func, >>>>>>>> break; >>>>>>>> >>>>>>>> default: >>>>>>>> + /* If this is an array, walk its dimensions. */ >>>>>>>> + if (DECL_P (t) && TREE_TYPE (t) >>>>>>>> + && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE) >>>>>>>> + { >>>>>>>> + tree domain = TYPE_DOMAIN (TREE_TYPE (t)); >>>>>>>> + if (domain) >>>>>>>> + { >>>>>>>> + WALK_SUBTREE (TYPE_MIN_VALUE (domain)); >>>>>>>> + WALK_SUBTREE (TYPE_MAX_VALUE (domain)); >>>>>>>> + } >>>>>>>> + } >>>>>>>> return NULL_TREE; >>>>>>>> } >>>>>>> >>>>>>> Is there a reason I'm missing not to put this into the TYPE_P >>>>>>> block >>>>>>> above? >>>>>> Do you mean put it in that block as well, or instead of where I >>>>>> put >> >>>>>> it? >>>>>> If the latter, we don’t see any TYPE_P node for “int[r*n]”, >>>>>> so >>>>>> we >>>>>> don’t go into that block, and continue ICE’ing without the >>>>>> change >>>>>> in >>>>>> the “default:”. >>>>>> >>>>>> Anyway it’s a very good call (thanks!), because it got me to >>>>>> check >>>>>> what we get if we change the lambda body to just do “typedef >>>>>> int >>>>>> MyT[c*r];”, and we still ICE. And from a quick look, doing >>>> >>>>>> something >>>>>> similar in the TYPE_P block does not fix it. >>>>>> >>>>>> I’ll work something out and report back. >>>>> >>>>> I would think this should go in the DECL_EXPR handling, so we >>>>> don't >>>>> walk into the type every time we see a use of an array variable. >>>> Indeed, we can do the check there, thanks a lot. >>>> >>>> And concerning the case of a single “typedef int MyT[r*c];”, >>>> there >>>> already is code in walk_tree_1 that covers such typedefs, but that >> >>>> does >>>> not walk array type dimensions. >>>> >>>> So the result is the attached updated patch, successfully tested on >>>> x86_64-pc-linux-gnu. OK for trunk? And if so, also for branches? >>> >>>> * tree.cc (walk_tree_1): Walk array type dimensions, if any, for >>>> TYPE_DECLs. >>> >>> Hmm, why isn't this covered by the existing call to >>> walk_type_fields? >>> >>> Ah, I guess because that just walks into TYPE_DOMAIN, and nothing >>> walks from there into the TYPE_MIN/MAX_VALUE. >>> >>> What if instead of touching walk_tree_1, mark_const_cap_r handled >>> INTEGER_TYPE? >> I don’t think we can look into TYPE_{MIN,MAX}_VALUE from >> mark_const_cap_r since we might need to recurse (here we have a >> MINUS_EXPR in TYPE_MAX_VALUE) and I don’t think we can from there? > > Several tree walk functions recurse, in which case they need to handle > the pointer_set themselves instead of using > cp_walk_tree_without_duplicates. For instance, > find_parameter_packs_r. But no need for this patch. > >>> I know walk_tree used to walk into VLA bounds, which is why we have >>> stabilize_vla_size to split out any relevant expressions into >>> temporaries; I'm reluctant to re-add that without a lot of >>> investigation into the history. >> Yeah, makes sense. I learnt that touching walk_tree requires caution >> through all that I broke with my few attempts at changing it :-) > > OK, I looked it up: this goes back to r119481 > (r0-77782-g8f6e6bf375959d), the bounds walking was removed because it > caused trouble for Ada. That patch added bounds walking to > for_each_template_parm_r instead to avoid regressing C++. Thanks for digging into this and providing extra context. >>> In cp_walk_subtrees, I think we also want to go through the typedef >>> handling at the top, so in the DECL_EXPR handling, let's just walk >>> into the type of the decl. So we walk into the ARRAY_TYPE, check >>> whether it's a typedef, if not walk_type_fields walks into the >>> TYPE_DOMAIN, and the mark_const_cap_r adjustment I suggested above >>> checks the TYPE_MAX_VALUE from there. >> I think that the typedef handling is already done in walk_tree (at >> least >> I was unable to craft a test case where doing something extra for >> typedefs in cp_walk_subtrees was required). > > I agree that nothing extra is required; I was referring to the > typedef_variant_p code at the top of cp_walk_subtree. > >> And following up on your suggestion, things end up pretty simple: >> just >> walk the type of declarations in DECL_EXPRs, and explicitly handle >> the >> min/max value of INTEGER_TYPEs in cp_walk_subtree. >> >> At least that’s what the updated patch does, successfully tested on >> x86_64-pc-linux-gnu. Ok for trunk? > >> @@ -5843,6 +5844,11 @@ cp_walk_subtrees (tree *tp, int >> *walk_subtrees_p, walk_tree_fn func, >> WALK_SUBTREE (STATIC_ASSERT_MESSAGE (t)); >> break; >> + case INTEGER_TYPE: >> + WALK_SUBTREE (TYPE_MIN_VALUE (t)); >> + WALK_SUBTREE (TYPE_MAX_VALUE (t)); >> + break; > > Let's add a comment > > /* Removed from walk_type_fields in r119481. */ > > and remove the now-redundant INTEGER_TYPE handling from > for_each_template_parm_r. > > OK with those adjustments. Thanks a lot. Merged to trunk as r15-7238-gceabea405ffdc8. Even though it’s an “old” regression (from GCC 7), it’s a reject-valid and I’m tempted to backport; OK for active branches as well? Simon