On 24 March 2011 14:40, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Thu, Mar 24, 2011 at 1:39 PM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Thu, Mar 24, 2011 at 11:03 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>> On Tue, Mar 22, 2011 at 6:28 AM, Ira Rosen <ira.ro...@linaro.org> wrote:
>>>> On 17 March 2011 16:48, Richard Guenther <richard.guent...@gmail.com> 
>>>> wrote:
>>>>
>>>>>> +  then_datarefs = VEC_alloc (data_reference_p, heap, 1);
>>>>>> +  else_datarefs = VEC_alloc (data_reference_p, heap, 1);
>>>>>> +  then_ddrs = VEC_alloc (ddr_p, heap, 1);
>>>>>> +  else_ddrs = VEC_alloc (ddr_p, heap, 1);
>>>>>> +  if (!compute_data_dependences_for_bb (then_bb, false, &then_datarefs,
>>>>>> +                                        &then_ddrs)
>>>>>
>>>>> Can we avoid computing dependencies if the other BB would have no
>>>>> data-refs?  Thus, split collecting datarefs and computing dependences?
>>>>
>>>> Done.
>>>>
>>>>>
>>>>>> +      || !compute_data_dependences_for_bb (else_bb, false, 
>>>>>> &else_datarefs,
>>>>>> +                                           &else_ddrs)
>>>>>> +      || !VEC_length (data_reference_p, then_datarefs)
>>>>>> +      || !VEC_length (data_reference_p, else_datarefs))
>>>>>> +    {
>>>>>> +      free_data_refs (then_datarefs);
>>>>>> +      free_data_refs (else_datarefs);
>>>>>> +      free_dependence_relations (then_ddrs);
>>>>>> +      free_dependence_relations (else_ddrs);
>>>>>> +      return false;
>>>>>> +    }
>>>>>> +
>>>>>> +  /* Check that there are no read-after-write or write-after-write 
>>>>>> dependencies
>>>>>> +     in THEN_BB.  */
>>>>>> +  FOR_EACH_VEC_ELT (ddr_p, then_ddrs, i, ddr)
>>>>>> +    {
>>>>>> +      struct data_reference *dra = DDR_A (ddr);
>>>>>> +      struct data_reference *drb = DDR_B (ddr);
>>>>>> +
>>>>>> +      if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>>>>>> +          && ((DR_IS_READ (dra) && DR_IS_WRITE (drb)
>>>>>> +               && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT 
>>>>>> (drb)))
>>>>>> +              || (DR_IS_READ (drb) && DR_IS_WRITE (dra)
>>>>>> +                  && gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT 
>>>>>> (dra)))
>>>>>> +              || (DR_IS_WRITE (dra) && DR_IS_WRITE (drb))))
>>>>>
>>>>> The gimple_uids are not initialized here, you need to make sure to
>>>>> call renumber_gimple_stmt_uids () before starting.  Note that phiopt
>>>>> incrementally changes the IL, so I'm not sure those uids will stay
>>>>> valid as stmts are moved across blocks.
>>>>
>>>> I added a call to renumber_gimple_stmt_uids_in_blocks() before data
>>>> dependence checks, and there are no code changes between that and the
>>>> checks, so, I think, it should be OK.
>>>>
>>>>>
>>>>>> +        {
>>>>>> +          free_data_refs (then_datarefs);
>>>>>> +          free_data_refs (else_datarefs);
>>>>>> +          free_dependence_relations (then_ddrs);
>>>>>> +          free_dependence_relations (else_ddrs);
>>>>>> +          return false;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +  /* Check that there are no read-after-write or write-after-write 
>>>>>> dependencies
>>>>>> +     in ELSE_BB.  */
>>>>>> +  FOR_EACH_VEC_ELT (ddr_p, else_ddrs, i, ddr)
>>>>>> +    {
>>>>>> +      struct data_reference *dra = DDR_A (ddr);
>>>>>> +      struct data_reference *drb = DDR_B (ddr);
>>>>>> +
>>>>>> +      if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>>>>>> +          && ((DR_IS_READ (dra) && DR_IS_WRITE (drb)
>>>>>> +               && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT 
>>>>>> (drb)))
>>>>>> +              || (DR_IS_READ (drb) && DR_IS_WRITE (dra)
>>>>>> +                  && gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT 
>>>>>> (dra)))
>>>>>> +              || (DR_IS_WRITE (dra) && DR_IS_WRITE (drb))))
>>>>>> +        {
>>>>>> +          free_data_refs (then_datarefs);
>>>>>> +          free_data_refs (else_datarefs);
>>>>>> +          free_dependence_relations (then_ddrs);
>>>>>> +          free_dependence_relations (else_ddrs);
>>>>>> +          return false;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +  /* Found pairs of stores with equal LHS.  */
>>>>>> +  FOR_EACH_VEC_ELT (data_reference_p, then_datarefs, i, then_dr)
>>>>>> +    {
>>>>>> +      if (DR_IS_READ (then_dr))
>>>>>> +        continue;
>>>>>> +
>>>>>> +      then_store = DR_STMT (then_dr);
>>>>>> +      then_lhs = gimple_assign_lhs (then_store);
>>>>>> +      found = false;
>>>>>> +
>>>>>> +      FOR_EACH_VEC_ELT (data_reference_p, else_datarefs, j, else_dr)
>>>>>> +        {
>>>>>> +          if (DR_IS_READ (else_dr))
>>>>>> +            continue;
>>>>>> +
>>>>>> +          else_store = DR_STMT (else_dr);
>>>>>> +          else_lhs = gimple_assign_lhs (else_store);
>>>>>> +
>>>>>> +          if (operand_equal_p (then_lhs, else_lhs, 0))
>>>>>> +            {
>>>>>> +              found = true;
>>>>>> +              break;
>>>>>> +            }
>>>>>> +        }
>>>>>> +
>>>>>> +      if (!found)
>>>>>> +        continue;
>>>>>> +
>>>>>> +      res = cond_if_else_store_replacement_1 (then_bb, else_bb, join_bb,
>>>>>> +                                              then_store, else_store);
>>>>>
>>>>> So you are executing if-else store replacement for common data reference
>>>>> pairs only.  I think it's cheaper to collect those pairs before computing
>>>>> dependences and only if there is at least one pair perform the 
>>>>> optimization.
>>>>
>>>> OK, I changed the order.
>>>>
>>>>>
>>>>> You basically perform store sinking, creating a PHI node for each store
>>>>> you sink (that's then probably if-converted by if-conversion later, 
>>>>> eventually
>>>>> redundant with -ftree-loop-if-convert-stores?)
>>>>>
>>>>> I am concerned that having no bound on the number of stores sunk will
>>>>> increase register pressure and does not allow scheduling of the stores
>>>>> in an optimal way.  Consider two BBs similar to
>>>>>
>>>>>  t = a + b;
>>>>>  *p = t;
>>>>>  t = c + d;
>>>>>  *q = t;
>>>>>
>>>>> where the transformation undoes a good schedule and makes fixing it
>>>>> impossible if the remaining statements are not if-convertible.
>>>>>
>>>>> Thus, I'd rather make this transformation only if in the end the 
>>>>> conditional
>>>>> can be completely if-converted.
>>>>>
>>>>> I realize that we already do unbound and very aggressive if-conversion
>>>>> in tree-ifcvt.c regardless of whether the loop will be vectorized or not
>>>>> (including leaking the if-converted loops to the various loop versions
>>>>> we create during vectorization, causing only code-size bloat).  But it's
>>>>> not a good reason to continue down this road ;)
>>>>>
>>>>> I suppose a simple maximum on the number of stores to sink
>>>>> controllable by a param should do, eventually disabling this
>>>>> extended transformation when vectorization is disabled?
>>>>
>>>> I added a param, and it's set to 0 if either vectorization or if-conversion
>>>> is disabled.
>>>>
>>>>>
>>>>> Otherwise the implementation looks good.
>>>>
>>>> Bootstrapped and tested on powerpc64-suse-linux.
>>>> OK to apply?
>>>>
>>>> Thanks,
>>>> Ira
>>>>
>>>> ChangeLog:
>>>>
>>>>     * doc/invoke.texi (max-stores-to-sink): Document.
>>>>     * params.h (MAX_STORES_TO_SINK): Define.
>>>>     * opts.c (finish_options): Set MAX_STORES_TO_SINK to 0
>>>>     if either vectorization or if-conversion is disabled.
>>>>     * tree-data-ref.c (dr_equal_offsets_p1): Moved and renamed from
>>>>     tree-vect-data-refs.c vect_equal_offsets.
>>>>     (dr_equal_offsets_p): New function.
>>>>     (find_data_references_in_bb): Remove static.
>>>>     * tree-data-ref.h (find_data_references_in_bb): Declare.
>>>>     (dr_equal_offsets_p): Likewise.
>>>>     * tree-vect-data-refs.c (vect_equal_offsets): Move to tree-data-ref.c.
>>>>     (vect_drs_dependent_in_basic_block): Update calls to 
>>>> vect_equal_offsets.
>>>>     (vect_check_interleaving): Likewise.
>>>>     * tree-ssa-phiopt.c: Include cfgloop.h and tree-data-ref.h.
>>>>     (cond_if_else_store_replacement): Rename to...
>>>>     (cond_if_else_store_replacement_1): ... this. Change arguments and
>>>>     documentation.
>>>>     (cond_if_else_store_replacement): New function.
>>>>     * Makefile.in (tree-ssa-phiopt.o): Adjust dependencies.
>>>>     * params.def (PARAM_MAX_STORES_TO_SINK): Define.
>>>>
>>>> testsuite/ChangeLog:
>>>>
>>>>     * gcc.dg/vect/vect-cselim-1.c: New test.
>>>>     * gcc.dg/vect/vect-cselim-2.c: New test.
>>>>
>>>
>>> This caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48270
>>
>> Yep, I see
>>
>> FAIL: gcc.c-torture/execute/builtins/strlen-2.c compilation,  -O3 
>> -fomit-frame-p
>> ointer  (internal compiler error)
>>
>> on x86_64-linux with
>>
>> #0  0x0000000001f53253 in gimple_uid (g=0x0)
>>    at /space/rguenther/src/svn/trunk/gcc/gimple.h:1297
>> 1297      return g->gsbase.uid;
>> (gdb)
>> Bottom (innermost) frame selected; you cannot go down.
>> (gdb) up
>> #1  0x0000000001f7753d in cond_if_else_store_replacement (
>>    then_bb=0x7ffff533abc8, else_bb=0x7ffff533ac30, join_bb=0x7ffff533ac98)
>>    at /space/rguenther/src/svn/trunk/gcc/tree-ssa-phiopt.c:1511
>> 1511          if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>>
>> DR_STMT of dra is NULL.  No idea how that can happen.
>
> Maybe because
>
> 1495      compute_all_dependences (then_datarefs, &then_ddrs, NULL, false);
> 1496      compute_all_dependences (else_datarefs, &else_ddrs, NULL, false);
> 1497      free_data_refs (then_datarefs);
> 1498      free_data_refs (else_datarefs);
>
> frees the data refs too early?

This was my guess too, I posted a patch in the PR.

Thanks,
Ira

>
> Richard.
>

Reply via email to