Hi Jason,
On 20 Jan 2025, at 22:49, Jason Merrill wrote:
> On 1/20/25 2:11 PM, Simon Martin wrote:
>> Hi Jason,
>>
>> On 15 Jan 2025, at 22:42, Jason Merrill wrote:
>>
>>> On 12/12/24 2:07 PM, Simon Martin wrote:
>>>> We currently ICE upon the following valid (under -Wno-vla) code
>>>>
>>>> === cut here ===
>>>> void f(int c) {
>>>> constexpr int r = 4;
>>>> [&](auto) { int t[r * c]; }(0);
>>>> }
>>>> === cut here ===
>>>>
>>>> The problem is that when parsing the lambda body, and more
>>>> specifically
>>>> the multiplication, we mark the lambda as
>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>> even though the replacement of r by 4 is "undone" by the call to
>>>> build_min_non_dep in build_x_binary_op. This makes
>>>> prune_lambda_captures
>>>> remove the proxy declaration while it should not, and we trip on an
>>>> assert at instantiation time.
>>>
>>> Why does prune_lambda_captures remove the declaration if it's still
>>> used in the function body? Setting LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>> just
>>> means "we might have optimized away some captures", the tree walk
>>> should have found the use put back by build_x_binary_op.
>> I think that this is due to a bug in mark_const_cap_r, that’s been
>> here since the beginning, i.e. r8-7213-g1577f10a637352: it decides
>> NOT
>> to walk sub-trees when encountering a DECL_EXPR for a constant
>> capture
>> proxy (at lambda.cc:1769). I don’t understand why we wouldn’t
>> want
>> to continue.
>
> Because that's the declaration of the capture proxy, not a use of it.
Indeed, thanks for clarifying.
> Why aren't we finding the use in the declaration of t?
After further investigation, the problem is rather that neither
walk_tree_1 nor cp_walk_subtrees walk the dimensions of array types, so
we miss the uses.
>> Removing that line fixes the PR, but breaks 3 existing tests
>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C, that all
>> assert that we optimise out the capture); that’s why I did not
>> pursue
>> it in the first place.
>
> Right.
>
>> But taking another look, it might not be that big a deal that we
>> don’t
>> optimise those out: as soon as we use -O1 or above, the assignment to
>> the closure field actually disappears.
>
> Completely breaking this optimization is a big deal, particularly
> since it affects the layout of closure objects. We can't always
> optimize everything away.
ACK.
The attached updated patch makes cp_walk_subtrees walk array type
dimensions, which fixes the initial PR without those 3 regressions.
Successfully tested on x86_64-pc-linux-gnu. Is it OK?
Simon
PS: In theory I think it’d make sense to do do this change in
walk_tree_1 since C also supports VLAs, but doing so breaks some OMP
tests. OMP can do interesting things with array bounds (see
r9-5354-gda972c05f48637), and fixing those would require teaching
walk_tree_1 about the “omp dummy var” array bounds, which I think
would be a bit ugly. And I’m not aware of any C case that would be
improved/fixed by doing this change, so we’re probably fine not doing
it.
From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17 00:00:00 2001
From: Simon Martin <si...@nasilyan.com>
Date: Wed, 22 Jan 2025 16:19:47 +0100
Subject: [PATCH] c++: Don't prune constant capture proxies only used in array
dimensions [PR114292]
We currently ICE upon the following valid (under -Wno-vla) code
=== cut here ===
void f(int c) {
constexpr int r = 4;
[&](auto) { int t[r * c]; }(0);
}
=== cut here ===
When parsing the lambda body, and more specifically the multiplication,
we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which indicates to
prune_lambda_captures that it might be possible to optimize out some
captures.
The problem is that prune_lambda_captures then misses the use of the r
capture (because neither walk_tree_1 nor cp_walk_subtrees walks the
dimensions of array types - here "r * c"), hence believes the capture
can be pruned... and we trip on an assert when instantiating the lambda.
This patch changes cp_walk_subtrees so that when walking a declaration
with an array type, it walks that array type's dimensions. Since C also
supports VLAs, I thought it'd make sense to do this in walk_tree_1, but
this breaks some omp-low related test cases (because OMP can do funky
things with array bounds when adjust_array_error_bounds is set), and I
don't really want to add code to walk_tree_1 to skips arrays whose
dimension is a temporary variable with the "omp dummy var" attribute.
Successfully tested on x86_64-pc-linux-gnu.
PR c++/114292
gcc/cp/ChangeLog:
* tree.cc (cp_walk_subtrees): Walk array type dimensions.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/lambda-ice4.C: New test.
---
gcc/cp/tree.cc | 11 ++++++
gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44 ++++++++++++++++++++++++
2 files changed, 55 insertions(+)
create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 36581865a17..fc9a2fbff32 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int *walk_subtrees_p,
walk_tree_fn func,
break;
default:
+ /* If this is an array, walk its dimensions. */
+ if (DECL_P (t) && TREE_TYPE (t)
+ && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+ {
+ tree domain = TYPE_DOMAIN (TREE_TYPE (t));
+ if (domain)
+ {
+ WALK_SUBTREE (TYPE_MIN_VALUE (domain));
+ WALK_SUBTREE (TYPE_MAX_VALUE (domain));
+ }
+ }
return NULL_TREE;
}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
b/gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
new file mode 100644
index 00000000000..2f816153069
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
@@ -0,0 +1,44 @@
+// PR c++/114292
+// { dg-do "compile" { target c++14 } }
+// { dg-additional-options "-Wno-vla" }
+
+#define ASSERT_CAPTURE_NUMBER(Lambda, NumCaptures) \
+ { \
+ auto oneCapture = [&](auto) { int t[c]; }; \
+ const auto sizeOneCapture = sizeof (oneCapture); \
+ const auto expected = NumCaptures ? NumCaptures * sizeOneCapture : 1; \
+ static_assert (sizeof (Lambda) == expected, ""); \
+ }
+
+void foo (int c)
+{
+ constexpr int r = 4;
+
+ // This used to ICE.
+ [&](auto) { int t[c * r]; }(0);
+
+ // All those worked already, but were not covered by any test - do it here.
+ auto ok_1 = [&](auto) { int t[c]; };
+ ok_1 (0);
+ ASSERT_CAPTURE_NUMBER (ok_1, 1);
+
+ auto ok_2 = [&](auto) { int n = r * c; int t[n]; };
+ ok_2 (0);
+ ASSERT_CAPTURE_NUMBER (ok_2, 2);
+
+ auto ok_3 = [&](auto) { int t[r]; };
+ ok_3 (0);
+ ASSERT_CAPTURE_NUMBER (ok_3, 0);
+
+ auto ok_4 = [&](auto) { int t[c * 4]; };
+ ok_4 (0);
+ ASSERT_CAPTURE_NUMBER (ok_4, 1);
+
+ auto ok_5 = [&](auto) { int t[1]; };
+ ok_5 (0);
+ ASSERT_CAPTURE_NUMBER (ok_5, 0);
+
+ auto ok_6 = [&](auto) { int t[c * r]; };
+ ok_6 (0);
+ ASSERT_CAPTURE_NUMBER (ok_6, 2);
+}
--
2.44.0