On Thu, Jun 18, 2020 at 10:10 AM Martin Liška <mli...@suse.cz> wrote:
>
> On 6/17/20 3:15 PM, Richard Biener wrote:
> > On Wed, Jun 17, 2020 at 10:50 AM Richard Biener
> > <richard.guent...@gmail.com> wrote:
> >>
> >> On Mon, Jun 15, 2020 at 2:20 PM Martin Liška <mli...@suse.cz> wrote:
> >>>
> >>> On 6/15/20 1:59 PM, Richard Biener wrote:
> >>>> On Mon, Jun 15, 2020 at 1:19 PM Martin Liška <mli...@suse.cz> wrote:
> >>>>>
> >>>>> On 6/15/20 9:14 AM, Richard Biener wrote:
> >>>>>> On Fri, Jun 12, 2020 at 3:24 PM Martin Liška <mli...@suse.cz> wrote:
> >>>>>>>
> >>>>>>> On 6/12/20 11:43 AM, Richard Biener wrote:
> >>>>>>>> So ... how far are you with enforcing a split VEC_COND_EXPR?
> >>>>>>>> Thus can we avoid the above completely (even as intermediate
> >>>>>>>> state)?
> >>>>>>>
> >>>>>>> Apparently, I'm quite close. Using the attached patch I see only 2 
> >>>>>>> testsuite
> >>>>>>> failures:
> >>>>>>>
> >>>>>>> FAIL: gcc.dg/tree-ssa/pr68714.c scan-tree-dump-times reassoc1 " <= " 1
> >>>>>>> FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3
> >>>>>>>
> >>>>>>> The first one is about teaching reassoc about the SSA_NAMEs in 
> >>>>>>> VEC_COND_EXPR. I haven't
> >>>>>>> analyze the second failure.
> >>>>>>>
> >>>>>>> I'm also not sure about the gimlification change, I see a superfluous 
> >>>>>>> assignments:
> >>>>>>>       vec_cond_cmp.5 = _1 == _2;
> >>>>>>>       vec_cond_cmp.6 = vec_cond_cmp.5;
> >>>>>>>       vec_cond_cmp.7 = vec_cond_cmp.6;
> >>>>>>>       _3 = VEC_COND_EXPR <vec_cond_cmp.7, { -1, -1, -1, -1, -1, -1, 
> >>>>>>> -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
> >>>>>>> ?
> >>>>>>>
> >>>>>>> So with the suggested patch, the EH should be gone as you suggested. 
> >>>>>>> Right?
> >>>>>>
> >>>>>> Right, it should be on the comparison already from the start.
> >>>>>>
> >>>>>> @@ -14221,9 +14221,13 @@ gimplify_expr (tree *expr_p, gimple_seq
> >>>>>> *pre_p, gimple_seq *post_p,
> >>>>>>            case VEC_COND_EXPR:
> >>>>>>              {
> >>>>>>                enum gimplify_status r0, r1, r2;
> >>>>>> -
> >>>>>>                r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p,
> >>>>>>                                    post_p, is_gimple_condexpr, 
> >>>>>> fb_rvalue);
> >>>>>> +           tree xop0 = TREE_OPERAND (*expr_p, 0);
> >>>>>> +           tmp = create_tmp_var_raw (TREE_TYPE (xop0), 
> >>>>>> "vec_cond_cmp");
> >>>>>> +           gimple_add_tmp_var (tmp);
> >>>>>> +           gimplify_assign (tmp, xop0, pre_p);
> >>>>>> +           TREE_OPERAND (*expr_p, 0) = tmp;
> >>>>>>                r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p,
> >>>>>>                                    post_p, is_gimple_val, fb_rvalue);
> >>>>>>
> >>>>>> all of VEC_COND_EXPR can now be a simple goto expr_3;
> >>>>>
> >>>>> Works for me, thanks!
> >>>>>
> >>>>>>
> >>>>>> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
> >>>>>> index 494c9e9c20b..090fb52a2f1 100644
> >>>>>> --- a/gcc/tree-ssa-forwprop.c
> >>>>>> +++ b/gcc/tree-ssa-forwprop.c
> >>>>>> @@ -3136,6 +3136,10 @@ pass_forwprop::execute (function *fun)
> >>>>>>                        if (code == COND_EXPR
> >>>>>>                            || code == VEC_COND_EXPR)
> >>>>>>                          {
> >>>>>> +                       /* Do not propagate into VEC_COND_EXPRs.  */
> >>>>>> +                       if (code == VEC_COND_EXPR)
> >>>>>> +                         break;
> >>>>>> +
> >>>>>>
> >>>>>> err - remove the || code == VEC_COND_EXPR instead?
> >>>>>
> >>>>> Yep.
> >>>>>
> >>>>>>
> >>>>>> @@ -2221,24 +2226,12 @@ expand_vector_operations (void)
> >>>>>>     {
> >>>>>>       gimple_stmt_iterator gsi;
> >>>>>>       basic_block bb;
> >>>>>> -  bool cfg_changed = false;
> >>>>>>
> >>>>>>       FOR_EACH_BB_FN (bb, cfun)
> >>>>>> -    {
> >>>>>> -      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> >>>>>> -       {
> >>>>>> -         expand_vector_operations_1 (&gsi);
> >>>>>> -         /* ???  If we do not cleanup EH then we will ICE in
> >>>>>> -            verification.  But in reality we have created wrong-code
> >>>>>> -            as we did not properly transition EH info and edges to
> >>>>>> -            the piecewise computations.  */
> >>>>>> -         if (maybe_clean_eh_stmt (gsi_stmt (gsi))
> >>>>>> -             && gimple_purge_dead_eh_edges (bb))
> >>>>>> -           cfg_changed = true;
> >>>>>> -       }
> >>>>>> -    }
> >>>>>>
> >>>>>> I'm not sure about this.  Consider the C++ testcase where
> >>>>>> the ?: is replaced by a division.  If veclower needs to replace
> >>>>>> that with four scalrar division statements then the above
> >>>>>> still applies - veclower does not correctly duplicate EH info
> >>>>>> and EH edges to the individual divisions (and we do not know
> >>>>>> which component might trap).
> >>>>>>
> >>>>>> So please leave the above in.  You can try if using integer
> >>>>>> division makes it break and add such a testcase if there's
> >>>>>> no coverage for this in the testsuite.
> >>>>>
> >>>>> I'm leaving that above. Can you please explain how can a division 
> >>>>> test-case
> >>>>> be created?
> >>>>
> >>>> typedef long v2di __attribute__((vector_size(16)));
> >>>>
> >>>> v2di foo (v2di a, v2di b)
> >>>> {
> >>>>     try
> >>>>     {
> >>>>       v2di res = a / b;
> >>>>       return res;
> >>>>       }
> >>>>       catch (...)
> >>>>       {
> >>>>       return (v2di){};
> >>>>       }
> >>>> }
> >>>>
> >>>> with -fnon-call-exceptions I see in t.ii.090t.ehdisp (correctly):
> >>>>
> >>>> ;;   basic block 2, loop depth 0
> >>>> ;;    pred:       ENTRY
> >>>>     [LP 1] _6 = a_4(D) / b_5(D);
> >>>> ;;    succ:       5
> >>>> ;;                3
> >>>>
> >>>> while after t.ii.226t.veclower we have
> >>>>
> >>>> ;;   basic block 2, loop depth 0
> >>>> ;;    pred:       ENTRY
> >>>>     _13 = BIT_FIELD_REF <a_4(D), 64, 0>;
> >>>>     _14 = BIT_FIELD_REF <b_5(D), 64, 0>;
> >>>>     _15 = _13 / _14;
> >>>>     _16 = BIT_FIELD_REF <a_4(D), 64, 64>;
> >>>>     _17 = BIT_FIELD_REF <b_5(D), 64, 64>;
> >>>>     _18 = _16 / _17;
> >>>>     _6 = {_15, _18};
> >>>>     res_7 = _6;
> >>>>     _8 = res_7;
> >>>> ;;    succ:       3
> >>>>
> >>>> and all EH is gone and we'd ICE if you remove the above hunk.  Hopefully.
> >>>
> >>> Yes, it ICEs then:
> >>>
> >>>
> >>> ./xg++ -B. ~/Programming/testcases/ice.c -c -fnon-call-exceptions -O3
> >>> /home/marxin/Programming/testcases/ice.c: In function ‘v2di foo(v2di, 
> >>> v2di)’:
> >>> /home/marxin/Programming/testcases/ice.c:3:6: error: statement marked for 
> >>> throw, but doesn’t
> >>>       3 | v2di foo (v2di a, v2di b)
> >>>         |      ^~~
> >>> _6 = {_12, _15};
> >>> during GIMPLE pass: veclower2
> >>> /home/marxin/Programming/testcases/ice.c:3:6: internal compiler error: 
> >>> verify_gimple failed
> >>> 0x10e308a verify_gimple_in_cfg(function*, bool)
> >>>          /home/marxin/Programming/gcc/gcc/tree-cfg.c:5461
> >>> 0xfc9caf execute_function_todo
> >>>          /home/marxin/Programming/gcc/gcc/passes.c:1985
> >>> 0xfcaafc do_per_function
> >>>          /home/marxin/Programming/gcc/gcc/passes.c:1640
> >>> 0xfcaafc execute_todo
> >>>          /home/marxin/Programming/gcc/gcc/passes.c:2039
> >>> Please submit a full bug report,
> >>> with preprocessed source if appropriate.
> >>> Please include the complete backtrace with any bug report.
> >>> See <https://gcc.gnu.org/bugs/> for instructions.
> >>>
> >>>>
> >>>> We still generate wrong-code obviously as we'd need to duplicate the
> >>>> EH info on each component division (and split blocks and generate
> >>>> extra EH edges).  That's a pre-existing bug of course.  I just wanted
> >>>> to avoid to create a new instance just because of the early instruction
> >>>> selection for VEC_COND_EXPR.
> >>>
> >>> Fine!
> >>>
> >>>>
> >>>>>>
> >>>>>> What's missing from the patch is adjusting
> >>>>>> verify_gimple_assign_ternary from
> >>>>>>
> >>>>>>      if (((rhs_code == VEC_COND_EXPR || rhs_code == COND_EXPR)
> >>>>>>           ? !is_gimple_condexpr (rhs1) : !is_gimple_val (rhs1))
> >>>>>>          || !is_gimple_val (rhs2)
> >>>>>>          || !is_gimple_val (rhs3))
> >>>>>>        {
> >>>>>>          error ("invalid operands in ternary operation");
> >>>>>>          return true;
> >>>>>>
> >>>>>> to the same with the rhs_code == VEC_COND_EXPR case removed.
> >>>>>
> >>>>> Hmm. I'm not sure I've got this comment. Why do we want to change it
> >>>>> and is it done wright in the patch?
> >>>>
> >>>> Ah, I missed the hunk you added.
> >>>
> >>> That explains the confusion I got.
> >>>
> >>>>   But the check should be an inclusive
> >>>> one, not an exclusive one and earlier accepting a is_gimple_condexpr
> >>>> is superfluous when you later reject the tcc_comparison part.  Just
> >>>> testing is_gimple_val is better.  So yes, remove your tree-cfg.c hunk
> >>>> and just adjust the above test.
> >>>
> >>> I simplified that.
> >>>
> >>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Please double-check the changelog
> >>
> >>          (do_store_flag):
> >>
> >> +       tree-vect-isel.o \
> >>
> >> IMHO we want to move more of the pattern matching magic of RTL
> >> expansion here to obsolete TER.  So please name it gimple-isel.cc
> >> (.cc!, not .c)
> >>
> >> +  gassign *assign = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (cond));
> >> +  if (stmt != NULL
> >> +      && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) != 
> >> tcc_comparison)
> >> +    return ERROR_MARK;
> >>
> >> you want stmt == NULL || TREE_CODE_CLASS (...)
> >>
> >> in case the def stmt is a call.
> >>
> >> +         gimple_seq seq;
> >> +         tree exp = force_gimple_operand (comb, &seq, true, NULL_TREE);
> >> +         if (seq)
> >> +           {
> >> +             gimple_stmt_iterator gsi = gsi_for_stmt (vcond0);
> >> +             gsi_insert_before (&gsi, seq, GSI_SAME_STMT);
> >> +           }
> >>
> >> use force_gimple_operand_gsi that makes the above simpler.
> >>
> >>            if (invert)
> >> -           std::swap (*gimple_assign_rhs2_ptr (stmt0),
> >> -                      *gimple_assign_rhs3_ptr (stmt0));
> >> -         update_stmt (stmt0);
> >> +           std::swap (*gimple_assign_rhs2_ptr (vcond0),
> >> +                      *gimple_assign_rhs3_ptr (vcond0));
> >>
> >> use swap_ssa_operands.
> >>
> >> +         gimple_assign_set_rhs1 (vcond0, exp);
> >> +         update_stmt (vcond0);
> >>
> >> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> >> index cf2d979fea1..710b17a7c5c 100644
> >> --- a/gcc/tree-vect-stmts.c
> >> +++ b/gcc/tree-vect-stmts.c
> >> @@ -9937,8 +9937,8 @@ vectorizable_condition (vec_info *vinfo,
> >>          {
> >>            vec_cond_rhs = vec_oprnds1[i];
> >>            if (bitop1 == NOP_EXPR)
> >> -           vec_compare = build2 (cond_code, vec_cmp_type,
> >> -                                 vec_cond_lhs, vec_cond_rhs);
> >> +           vec_compare = gimplify_build2 (gsi, cond_code, vec_cmp_type,
> >> +                                          vec_cond_lhs, vec_cond_rhs);
> >>            else
> >>              {
> >>
> >> please don't introduce more uses of gimplify_buildN - I'd like to
> >> get rid of those.  You can use
> >>
> >>       gimple_seq stmts = NULL;
> >>       vec_compare = gimple_build (&stmts, cond_code, ...);
> >>       gsi_insert_seq_before/after (...);
> >>
> >> OK with those changes.
> >
> > Applying the patch caused
> >
> > Running target unix//-m32
> > FAIL: gcc.c-torture/execute/ieee/pr50310.c execution,  -O3
> > -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
> > -finline-functions
> > FAIL: gcc.c-torture/execute/ieee/pr50310.c execution,  -O3 -g
>
> I can't reproduce that with current master. Can you?

Yes.

> make check-gcc RUNTESTFLAGS="--target_board=unix/-m32 ieee.exp=pr50310.c"
...
FAIL: gcc.c-torture/execute/ieee/pr50310.c execution,  -O3
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
-finline-functions
FAIL: gcc.c-torture/execute/ieee/pr50310.c execution,  -O3 -g

mind the -m32

> >
> > and
> >
> > FAIL: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc
> > (test for excess errors)
> > UNRESOLVED: 
> > ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc
> > compilation failed to produce executable
>
> I've just fixed this one.
>
> Martin
>
> >
> > Richard.
> >
> >> Thanks,
> >> Richard.
> >>
> >>
> >>> Thanks,
> >>> Martin
> >>>
> >>>>
> >>>>>>
> >>>>>> You'll likely figure the vectorizer still creates some VEC_COND_EXPRs
> >>>>>> with embedded comparisons.
> >>>>>
> >>>>> I've fixed 2 failing test-cases I mentioned in the previous email.
> >>>>>
> >>>>> Martin
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Richard.
> >>>>>>
> >>>>>>
> >>>>>>> Martin
> >>>>>
> >>>
>

Reply via email to