On Fri, Nov 20, 2015 at 11:27:48AM +0100, Richard Henderson wrote:
> Toward fixing PR68385.  I'm just starting a full round of testing, but
> 
> extern void underflow(void) __attribute__((noreturn));
> unsigned sub1(unsigned a, unsigned b)
> {
>     unsigned r = a - b;
>     if (r > a) underflow();
>     return r;
> }
> 
> unsigned sub2(unsigned a, unsigned b)
> {
>     unsigned r;
>     if (__builtin_sub_overflow(a, b, &r)) underflow();
>     return r;
> }
> 
> 
> sub1:
>       movl    %edi, %eax
>       subl    %esi, %eax
>       cmpl    %eax, %edi
>       jb      .L7
>       rep ret
> ...
> sub2:
>       movl    %edi, %eax
>       subl    %esi, %eax
>       jb      .L16
>       rep ret
> ...

That looks good.

> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -6156,6 +6156,22 @@
>                 (const_string "4")]
>             (const_string "<MODE_SIZE>")))])
>  
> +(define_expand "uaddv<mode>4"
> +  [(parallel [(set (reg:CCC FLAGS_REG)
> +                (compare:CCC
> +                  (plus:SWI (match_dup 1) (match_dup 2))
> +                  (match_dup 1)))
> +           (set (match_dup 0)
> +                (plus:SWI (match_dup 1) (match_dup 2)))])
> +   (set (pc) (if_then_else
> +            (ne (reg:CCC FLAGS_REG) (const_int 0))
> +            (label_ref (match_operand 3))
> +            (pc)))]
> +  ""
> +{
> +  ix86_fixup_binary_operands_no_copy (PLUS, <MODE>mode, operands);
> +})

Do we need this one on i?86?  I'm not against adding it to optabs, so that
other targets have a way to improve this, but doesn't combine already
handle this case well on i?86?
I've been thinking of transforming the above sub1 code (in forwprop, as
richi suggested) into the sub2 internal call + REALPART/IMAGPART extraction
only if the corresponding optab exists.
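
To make the source-level side of that concrete, here is a minimal sketch of
the open-coded idioms next to the builtin forms they would be rewritten to.
All the names below (struct uresult, sub_idiom, add_idiom, check_idioms, ...)
are made up for illustration and are not part of the patch; the struct only
mimics the (result, overflow flag) pair that the REALPART/IMAGPART
extraction would pull apart.  It assumes a compiler that provides
__builtin_add_overflow and __builtin_sub_overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the (result, overflow flag) pair.  */
struct uresult { unsigned value; bool overflow; };

/* Open-coded idiom as in sub1: the wrapped difference exceeds the minuend.  */
struct uresult sub_idiom(unsigned a, unsigned b)
{
    unsigned r = a - b;
    return (struct uresult){ r, r > a };
}

/* Builtin form as in sub2.  */
struct uresult sub_builtin(unsigned a, unsigned b)
{
    struct uresult res;
    res.overflow = __builtin_sub_overflow(a, b, &res.value);
    return res;
}

/* Matching idiom for addition: the wrapped sum is below an addend.  */
struct uresult add_idiom(unsigned a, unsigned b)
{
    unsigned r = a + b;
    return (struct uresult){ r, r < a };
}

struct uresult add_builtin(unsigned a, unsigned b)
{
    struct uresult res;
    res.overflow = __builtin_add_overflow(a, b, &res.value);
    return res;
}

bool same_result(struct uresult x, struct uresult y)
{
    return x.value == y.value && x.overflow == y.overflow;
}

/* Spot-check a few boundary values; this is a sanity check, not a proof.  */
void check_idioms(void)
{
    const unsigned samples[] = { 0u, 1u, 2u, UINT_MAX / 2u,
                                 UINT_MAX - 1u, UINT_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            unsigned a = samples[i], b = samples[j];
            assert(same_result(sub_idiom(a, b), sub_builtin(a, b)));
            assert(same_result(add_idiom(a, b), add_builtin(a, b)));
        }
}
```

Calling check_idioms() from a test driver should pass silently if the two
forms really agree on those inputs.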

> +
>  ;; The lea patterns for modes less than 32 bits need to be matched by
>  ;; several insns converted to real lea by splitters.
>  
> @@ -6461,6 +6477,20 @@
>                 (const_string "4")]
>             (const_string "<MODE_SIZE>")))])
>  
> +(define_expand "usubv<mode>4"
> +  [(parallel [(set (reg:CC FLAGS_REG)
> +                (compare:CC (match_dup 1) (match_dup 2)))
> +           (set (match_dup 0)
> +                (minus:SWI (match_dup 1) (match_dup 2)))])
> +   (set (pc) (if_then_else
> +            (ltu (reg:CC FLAGS_REG) (const_int 0))
> +            (label_ref (match_operand 3))
> +            (pc)))]

If this works, it will be nice; I thought we'd need a new CC*mode.
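
As I read the pattern, the reason a plain compare plus ltu could be enough
is that an unsigned a - b borrows exactly when a < b, so the flag tested
after the compare is the same one the subtraction would produce.  A small
sketch of that arithmetic fact at the C level (the function names are
hypothetical, and this says nothing about how the RTL is actually matched):

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Overflow flag reported by the builtin for unsigned a - b.  */
bool borrow_from_sub(unsigned a, unsigned b)
{
    unsigned r;
    return __builtin_sub_overflow(a, b, &r);
}

/* The condition an unsigned "below" test on compare (a, b) expresses.  */
bool borrow_from_compare(unsigned a, unsigned b)
{
    return a < b;
}

/* Spot-check boundary values; a sanity check only, not a proof.  */
void check_borrow(void)
{
    const unsigned samples[] = { 0u, 1u, UINT_MAX / 2u, UINT_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(borrow_from_sub(samples[i], samples[j])
                   == borrow_from_compare(samples[i], samples[j]));
}
```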

> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -4912,6 +4912,25 @@ address calculations.  @code{add@var{m}3} is used if
>  @itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3}
>  Similar, for other arithmetic operations.
>  
> +@cindex @code{addv@var{m}4} instruction pattern
> +@item @samp{addv@var{m}4}
> +Add operand 2 and operand 1, storing the result in operand 0.  If signed
> +overflow occurs during the addition, jump to the label in operand 3.
> +
> +@cindex @code{subv@var{m}4} instruction pattern
> +@cindex @code{mulv@var{m}4} instruction pattern
> +@item @samp{subv@var{m}4}, @samp{mulv@var{m}4}
> +Similar, for other signed arithmetic operations.
> +
> +@cindex @code{uaddv@var{m}4} instruction pattern
> +@item @samp{uaddv@var{m}4}
> +Like @code{addv@var{m}4}, except jump on unsigned overflow.
> +
> +@cindex @code{usubv@var{m}4} instruction pattern
> +@cindex @code{umulv@var{m}4} instruction pattern
> +@item @samp{usubv@var{m}4}, @samp{umulv@var{m}4}
> +Similar, for other unsigned arithmetic operations.

Eric has just submitted a documentation patch that documents the
{add,sub,mul,umul}v<mode>4 and negv<mode>3 patterns, so this should be
applied on top of that.

        Jakub
