Re: C++ delayed folding branch review

Jason Merrill Thu, 30 Jul 2015 16:02:52 -0700

On 07/30/2015 05:00 PM, Kai Tietz wrote:

2015-07-30 18:52 GMT+02:00 Jason Merrill <ja...@redhat.com>:

On 07/29/2015 06:56 PM, Kai Tietz wrote:


@@ -1430,6 +1438,8 @@ cxx_eval_call_expression (const
constexpr_ctx
*ctx,
tree t,
     bool
     reduced_constant_expression_p (tree t)
     {
+  /* Make sure we remove useless initial NOP_EXPRs.  */
+  STRIP_NOPS (t);



Checked, and removing those STRIP_NOPS cause regressions about
vector-casts.  At least the STRIP_NOPS in
reduced_constant_expression_p seems to be required.  See as example
g++.dg/ext/vector20.C as testcase.
It sees that '(vec)(const __vector(2) long int){3l, 4l}' is not a
constant expression.



But when was that NOP_EXPR added?  It should have been folded away before we
get here.


See below for this.  This might be related to the store_init_value issue.

@@ -8496,16 +8467,18 @@ compute_array_index_type (tree name, tree
size,
tsubst_flags_t complain)
           SET_TYPE_STRUCTURAL_EQUALITY (itype);
           return itype;
         }
-
+
+  /* We need to do fully folding to determine if we have VLA, or
not.  */
+  tree size_constant = cp_try_fold_to_constant (size);


Again, we already called maybe_constant_value.


Sure, but maybe_constant_value still produces nops ...


If someone tries to create an array with a size that involves
arithmetic
overflow, that's undefined behavior and we should probably give an
error rather than fold it away.


If we need to do some reduction to constant value here, as expr might
be actually a constant, which isn't folded here.  Eg something like:
struct {
    char abc[sizeof (int) * 8];
};
Due delayed folding array index isn't necessarily reduced here.  So we
need to perform at least constant value folding for diagnostics, as we
do right now.


Yes, we need to do some folding, that's why we call maybe_constant_value!


...so we shouldn't need cp_fully_fold.

@@ -13078,6 +13042,8 @@ build_enumerator (tree name, tree value,
tree
enumtype, tree attributes,
       if (value)
         STRIP_TYPE_NOPS (value);

+  if (value)
+    value = cp_try_fold_to_constant (value);



Again, this is unnecessary because we call cxx_constant_value
below.



See nops, and other unary-operations we want to reduce here to real
constant value ...



The cxx_constant_value call below will deal with them.



Likewise for grokbitfield.



Hmm, AFAIR we don't call cxx_constant_value in all code-paths.  But I
will look into it, and come back to you on it.



I am still on it ...  first did the other points



Looks like this hasn't changed.



Yes, for grokbitfield current version uses fold_simple for witdth.  So
just expressions based on constants getting reduced to short form.  In
grokbitfield I don't see invocation of cxx_constant_value.  So how can
we be sure that width is reduced to integer-cst?



We call cxx_constant_value on bit-field widths in check_bitfield_decl.


Hmm, ok.  But I don't see that this function gets called in context of
grokbitfield, after we set DECL_INITIAL.


Nope, it's called later on as part of finish_struct.

By removing this folding here, we get new failures in
g++.dg/warn/overflow-warn-1.C testcase:
New errors are at lin 32 that 'bif-foeld 's::<anonymous>' width not an
integer constant'
and at same line ''(1 / 0) is not a constant expression'.  Those
message don't look wrong.

The testcase next to this 'overflow-warn-3.C and overflow-warn-4.C'
failing in the same manner for (1 / 0) case.  But there are no other
regressions in g++.dg & libstdc++

Shall I extend the testcases for this message?


Please.

@@ -6575,6 +6578,13 @@ cp_parser_postfix_open_square_expression
(cp_parser
*parser,
            index = cp_parser_expression (parser);
         }

+  /* For offsetof and declaration of types we need
+     constant integeral values.
+     Also we meed to fold for negative constants so that
diagnostic
in
+     c-family/c-common.c doesn't fail for array-bounds.  */
+  if (for_offsetof || decltype_p
+      || (TREE_CODE (index) == NEGATE_EXPR && TREE_CODE
(TREE_OPERAND
(index, 0)) == INTEGER_CST))
+    index = cp_try_fold_to_constant (index);



Similarly, for offsetof the folding should happen closer to where
it
is needed.

Why is it needed for decltype, which is querying the type of an
expression?

For NEGATE_EXPR, we had talked about always folding a NEGATE of a
constant; this isn't the right place to do it.



Same as above, we need in those cases (and for -1 too) the constant
values early anyway.  So I saw it as more logical to have done this
conversion as soon as possible after initialization.



I don't think this is as soon as possible; we can fold the
NEGATE_EXPR
immediately when we build it, at the end of cp_build_unary_op.

I still wonder why any folding is necessary for decltype.  When I ask
why, I want to know *why*, not just have you tell me again that it's
needed.  I don't think it is.

For offsetof, I wonder if it makes sense to extend fold_offsetof_1 to
handle whatever additional folding is needed here.  If not, then fold
in finish_offsetof, before calling fold_offsetof.



I see that this is now an unconditional fold_simple, but I still don't
understand why it needs to be folded here, in the parser.



The point to fold the 'value' here is for cases
'processing_template_decl' isn't false. We could move it to the
else-case of the 'if (! processing_template_decl)' line for being more
explicit?



Well, on looking here in more detail, we might don't that that initial
folding here.  As for processing_template_decl fold_simple (and
cp_fully_fold) doesn't do much.



Looks like the fold is still there.



Yes, but a fold_simple one just working on constant values.  It
doesn't fold expressions like 'a == a' to a constant.  I extended
comment in current version on branch.  Additionally it invokes now the
fold_simple always.

We want to reduce index, if possible, for
diagnostics in code in c-family/c-common.c



Why not closer to the diagnostics?


It seemed to me like the most efficient way to do this reduction.  Do
you have a different place in mind?


Wherever the diagnostic is.

for array-bounds,


We already fold array bounds.


Bounds we fold, but the we fold here the index to be constant value to
be able to compare against type's bounds.


Compare where?

for types (they need to be fully folded)


WHY?  How many times do I need to ask you for SOME reason?  You keep just
saying it's necessary without any evidence.


AFAIR we have talked about that.  But for other readers, types aren't
delayed in folding, as we share it with ME, and we aren't be able to
use them in a delayed-folded state in FE. Type-sizes, and alignments,
etc need to be reduced here.  It wouldn't make much sense to delay
folding for them too, as we need those types already while parsing.

Well, here we're dealing with expressions, not types. Definitely typesizes and alignments need to be reduced, but that isn't what we'redealing with here in the parser. Here we're parsing an expression. Idon't see the connection.

and to be sure we simplify basic operations on constant-values.


Why here, rather than closer to where we care about such simplification?

Again:

I want to delay it to:

1) the places where we actually care about constant values, all of
which
already call maybe_constant_value or cxx_constant_value, so they
shouldn't need much change; and
2) the places where we want a simplified expression for warnings,
where
we should call fold_simple.

Folding in the parser is wrong, most of all because template
substitution doesn't go through the parser.

In 'cp_parser_omp_var_list_no_open' we need to fold 'length' can
'low_bound' as those values getting checked some lines below (see
lines 27936, 27944).



OK, but this seems like an typical case of needing to fold for
diagnostics;
usually in those cases you use the folded value for the diagnostics and
then
keep using the unfolded expression elsewhere.



Right.



So are you going to make that change here?


I intend so.  I need first to complete regression-runs for the changes
below.  This seems to me more like a smaller issue.

In 'cp_parser_cilk_grainsize' we fold 2nd argument of
'cp_paser_cild_for' by 'fold_simple'.  Not sure if it is worth to move
operand-folding into cp_parser_cilk_for itself, as we have here just
two users of 'cp_parser_cilk_for'.
One time we pass 'integer_zero_node' as this argument, and the other
time a binary-expression, which might be constant value.
But sure we can move it into 'cp_parser_cilk_grainsize'.if you prefer?



Why does the fold need to be in the parser?



Well, if we hit it during our tree-walk in cp_fold_r, then we don't
need to fold it here.  I will check, if this is really necessary.


See tree_walk_1 ... and we walk into it, so this folding of gain
should be removable.  I will commit it after regression-testing.

@@ -7249,7 +7249,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq
*pre_p)
           /* Handle OMP_FOR_COND.  */
           t = TREE_VEC_ELT (OMP_FOR_COND (for_stmt), i);
           gcc_assert (COMPARISON_CLASS_P (t));
-      gcc_assert (TREE_OPERAND (t, 0) == decl);
+      gcc_assert (TREE_OPERAND (t, 0) == decl || TREE_OPERAND (t,
1) ==
decl);



Why didn't delayed folding canonicalize this so that the decl is in
op0?



Delay folding doesn't canonicalize this.



Why not?  Doesn't it fold all expressions?



It fold them lately.  I will recheck this code-change.  It might be no
longer required due recent changes to omp-folding.  It could be that
original pattern didn't applied here anymore, and therefore statement
didn't been transformed into its canonical form.  Bit I assume this
could be resolved.



This hunk is necessary as we don't use canonical-form produced by
shorten_compare anymore.  Therefore special operand can occur on
right-hand side too.



That seems like a problem, if the middle end is expecting the canonical
form.  What is your plan for dealing with shorten_compare issues, again?


Actually ME deals with none-cannonical form too.  It just asserts on
it at this place.  After delayed-folding work I will continue work
(Jeff pushed first parts of this work already to ML) on eliminating
use of shorten_compare completely, and move its folding-patterns to
match.pd.

It looks like c_finish_omp_for should have done this canonicalizationfor the condition. And various places assume that the OMP for conditionhas had this canonicalization done, this is just the only place there'san assert. We need to make sure that it's done, somehow.

@@ -1947,6 +1947,8 @@ build_complex (tree type, tree real, tree
imag)
     {
       tree t = make_node (COMPLEX_CST);

+  real = fold (real);
+  imag = fold (imag);



I still think this is wrong.  The arguments should be sufficiently
folded.



As we don't fold unary-operators on constants, we need to fold it at
some place.  AFAICS is the C++ FE not calling directly
build_complex.
So this place was the easiest way to avoid issues with things like
'-'
'1' etc.



Is this because of the



         value = build_complex (NULL_TREE, convert (const_type,

integer_zero_node),
value);



Might be.  This should be indeed a 'fold_convert', isn't it?



Yes.



Applied modification to it.



So can we remove the fold in build_complex now?

in interpret_float?  I think "convert" definitely needs to do some
folding, since it's called from middle-end code that expects that.



I remember talking about "convert" doing some folding (and cp_convert
not) in our 1:1 last week.



Can't remember that.  I know that we were talking about the difference
of convert and fold_convert.  convert can be used on C++ specifics,
but fold_convert is something shared with ME.



convert is called from the ME, which sometimes expects folding.

So first 'fold_convert'
isn't the same as 'fold (convert ())'.
I don't find places we invoke convert () in ME.  We have some calls in
convert.c (see convert_to_integer, convert_to_integer_nofold, and
convert_to_real), which all used in AST only AFAICS.



I was thinking of convert.c and fold-const.c to be part of the ME, since
they are language-independent.  But I guess other people think of the ME
starting with gimple.

And it looks like the only language-independent uses of convert are in
c-family; I guess many of them should change to fold_convert.



Hmm, in context of this work? Or is this more a general point about future
work?



In the context of this work, if they are introducing problematic NOPs.


Ok, I will take a closer look to convert () usage in c-family/.  By
quick looking this seems to be the only place for now we needed to
change.

@@ -5080,6 +5081,7 @@ output_constructor_bitfield (oc_local_state
*local,
unsigned int bit_offset)
       while (TREE_CODE (local->val) == VIEW_CONVERT_EXPR
             || TREE_CODE (local->val) == NON_LVALUE_EXPR)
         local->val = TREE_OPERAND (local->val, 0);
+  local->val = fold (local->val);



Likewise.



As soon as we can be sure that values getting fully_folded, or at
least folded for constants, we should be able to remove this.



Yep, they need to be folded before we get here.



I didn't come to remove this line for testing.  As we fold now for
initializers more early, and cp_fold supports constructors, it could
be that we don't need this anymore.  It is on my pile.

That fold is still required.  By removing it, I saw boostrap issue due
'invalid initializer'.



That indicates a folding problem earlier on, that will cause some
initialization that should be performed at compile time to happen at run
time instead.

Please investigate the bootstrap issue further.


Yes, I do. I assume it is related to 'store_init_value'.  For cases
decl_maybe_constant_var_p() is true, or decl is a static, we are
calling maybe_constant_init on the value, but for other cases we don't
simplify value. (Btw this might be related to the
STRIP_NOPS-requirement in 'reduced_constant_expression_p').  So by
adding the following hunk it seems to work (still need to verify)

Index: typeck2.c
===================================================================
--- typeck2.c   (Revision 226401)
+++ typeck2.c   (Arbeitskopie)
@@ -833,6 +833,8 @@ store_init_value (tree decl, tree init, vec<tree,
        DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = const_init;
        TREE_CONSTANT (decl) = const_init && decl_maybe_constant_var_p (decl);
      }
+  else
+    value = fold_simple (value);

    if (cxx_dialect >= cxx14 && CLASS_TYPE_P (strip_array_types (type)))
      /* Handle aggregate NSDMI in non-constant initializers, too.  */

I guess we want to extend the code for handling statics and constants toalso handle the case where the initializer is a CONSTRUCTOR. And alsofold individual elements in split_nonconstant_init.


Jason

Re: C++ delayed folding branch review

Reply via email to