On Thu, Jul 5, 2012 at 2:49 PM, Tom de Vries <tom_devr...@mentor.com> wrote:
> On 04/07/12 19:02, Ulrich Weigand wrote:
>> Any suggestions how to fix this?  Should tail merging detect
>> __builtin_unreachable and not merge such block?  Or else, should
>> the CFG optimizer be extended (how?) to handle unreachable blocks
>> with multiple predecessors better?
>
> Ulrich,
>
> I extended the example as such:
> ...
> int
> foo(int a)
> {
>     if (a <= 0)
>       {
> #ifdef MERGED
>       L1:
> #endif
>         __builtin_unreachable();
>       }
>     if (a > 2)
> #ifdef MERGED
>       goto L1;
> #else
>       __builtin_unreachable();
> #endif
>     return a > 0;
> }
> ...
>
> Indeed I can reproduce the problem:
> ...
> $ gcc unreachable.c -O2 -S -o- -ftree-tail-merge
> foo:
>         .cfi_startproc
>         testl   %edi, %edi
>         jle     .L3
>         cmpl    $2, %edi
>         jg      .L3
>         movl    $1, %eax
>         ret
> .L3:
>         .cfi_endproc
> ...
>
> And I can make the problem go away with -fno-tree-tail-merge:
> ...
> $ gcc unreachable.c -O2 -S -o- -fno-tree-tail-merge
> foo:
>         .cfi_startproc
>         movl    $1, %eax
>         ret
>         .cfi_endproc
> ...
>
> But I can also reproduce the problem with -fno-tree-tail-merge by using 
> -DMERGED:
> ,,,
> $ gcc unreachable.c -O2 -S -o- -fno-tree-tail-merge -DMERGED
> foo:
>         .cfi_startproc
>         testl   %edi, %edi
>         jle     .L3
>         cmpl    $2, %edi
>         jg      .L3
>         movl    $1, %eax
>         ret
> .L3:
>         .cfi_endproc
> ,,,
>
> So it seems that this is a problem of a missed optimization, triggered by, but
> not exclusively triggered by -ftree-tail-merge.
>
> How to fix the missed optimization, I'm not sure where it should be done.
>
> I think the optimization could be done in tree-vrp. Currently the assert
> expressions look like this:
> ...
> .foo (intD.6 aD.1711)
> {
>   _BoolD.1693 D.1720;
>   intD.6 D.1719;
>
>   # BLOCK 2 freq:10000
>   # PRED: ENTRY [100.0%]  (fallthru,exec)
>   if (aD.1711_1(D) <= 0)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>   # SUCC: 3 [0.0%]  (true,exec) 4 [100.0%]  (false,exec)
>
>   # BLOCK 3 freq:4
>   # PRED: 2 [0.0%]  (true,exec)
>   # VUSE <.MEMD.1722_4(D)>
>   # USE = nonlocal
>   # CLB = nonlocal
>   __builtin_unreachableD.997 ();
>   # SUCC:
>
>   # BLOCK 4 freq:9996
>   # PRED: 2 [100.0%]  (false,exec)
>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>;
>   if (aD.1711_6 > 2)
>     goto <bb 5>;
>   else
>     goto <bb 6>;
>   # SUCC: 5 [0.0%]  (true,exec) 6 [100.0%]  (false,exec)
>
>   # BLOCK 5 freq:4
>   # PRED: 4 [0.0%]  (true,exec)
>   # VUSE <.MEMD.1722_4(D)>
>   # USE = nonlocal
>   # CLB = nonlocal
>   __builtin_unreachableD.997 ();
>   # SUCC:
>
>   # BLOCK 6 freq:9992
>   # PRED: 4 [100.0%]  (false,exec)
>   aD.1711_5 = ASSERT_EXPR <aD.1711_6, aD.1711_6 <= 2>;
>   D.1720_2 = aD.1711_5 > 0;
>   D.1719_3 = (intD.6) D.1720_2;
>   # VUSE <.MEMD.1722_4(D)>
>   return D.1719_3;
>   # SUCC: EXIT [100.0%]
>
> }
> ..
>
> The asserts allow the return result to be optimized, but not the cfg 
> conditions.
>
> AFAIU, we can insert the asserts earlier. F.i., we can insert
>   aD.1711_6 = ASSERT_EXPR <aD.1711_1(D), aD.1711_1(D) > 0>
> before the GIMPLE_COND in bb2.
>
> Attached proof-of-concept patch implements this, and works for this example 
> both
> with and without -DMERGED, independent of -ftree-tail-merge.

You don't need to do it this way - you can simply remove the CFG path
feeding __builtin_unreachable.  Putting the assert before the if looks
a bit backward, so I'd prefer to not do that.

> The only problem I can see with this approach is that doing this early (vrp1)
> basically removes range annotations which might have had an effect later on. 
> It
> could be and idea to only do this in vrp2, which is not much earlier than
> expand, were the rtl cfg optimization kicks in for the first time.

Indeed.

Richard.

> Thanks,
> - Tom

Reply via email to