On Fri, Dec 18, 2015 at 06:15:11PM +0100, Marek Polacek wrote:
> As outlined in the PR, build_atomic_assign can do a better job.  It seems
> profitable to use __atomic_fetch_* built-ins when possible rather than the
> generic CAS loop.  This patch attempts to implement that.
> 
> There are two ???s in the patch.  For the first one, I think let's just
> ignore enums, the potential trouble doesn't seem to be worth it.  And wrt
> the second ???, well, here I was unsure, so I decided to go the generic way
> for restrict-qualified pointers.
> 
> The two new tests should insure that things work sane even when this
> optimization kicks in.
> 
> Bootstrapped/regtested on x86_64-linux and powerpc64le-linux, ok for trunk?
> 
> 2015-12-18  Marek Polacek  <pola...@redhat.com>
> 
>       PR c/68908
>       * c-typeck.c (build_atomic_assign): Improve commentary.  Add
>       optimization to use __atomic_fetch_* built-in if possible.
> 
>       * gcc.dg/atomic/c11-atomic-exec-6.c: New test.
>       * gcc.dg/atomic/stdatomic-op-5.c: New test.
> 
> diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
> index b605f81..9b6febb 100644
> --- gcc/c/c-typeck.c
> +++ gcc/c/c-typeck.c
> @@ -3685,9 +3685,9 @@ pointer_diff (location_t loc, tree op0, tree op1)
>    return convert (restype, result);
>  }
>  
> -/* Expand atomic compound assignments into an approriate sequence as
> -   specified by the C11 standard section 6.5.16.2.   
> -    given 
> +/* Expand atomic compound assignments into an appropriate sequence as
> +   specified by the C11 standard section 6.5.16.2.
> +
>         _Atomic T1 E1
>         T2 E2
>         E1 op= E2
> @@ -3719,26 +3719,25 @@ loop:
>  done:
>    feupdateenv (&fenv);
>  
> -  Also note that the compiler is simply issuing the generic form of
> -  the atomic operations.  This requires temp(s) and has their address
> -  taken.  The atomic processing is smart enough to figure out when the
> -  size of an object can utilize a lock-free version, and convert the
> -  built-in call to the appropriate lock-free routine.  The optimizers
> -  will then dispose of any temps that are no longer required, and
> -  lock-free implementations are utilized as long as there is target
> -  support for the required size.
> +  The compiler will issue the __atomic_fetch_* built-in when possible,
> +  otherwise it will generate the generic form of the atomic operations.
> +  This requires temp(s) and has their address taken.  The atomic processing
> +  is smart enough to figure out when the size of an object can utilize
> +  a lock-free version, and convert the built-in call to the appropriate
> +  lock-free routine.  The optimizers will then dispose of any temps that
> +  are no longer required, and lock-free implementations are utilized as
> +  long as there is target support for the required size.
>  
>    If the operator is NOP_EXPR, then this is a simple assignment, and
>    an __atomic_store is issued to perform the assignment rather than
> -  the above loop.
> -
> -*/
> +  the above loop.  */
>  
>  /* Build an atomic assignment at LOC, expanding into the proper
>     sequence to store LHS MODIFYCODE= RHS.  Return a value representing
> -   the result of the operation, unless RETURN_OLD_P in which case
> +   the result of the operation, unless RETURN_OLD_P, in which case
>     return the old value of LHS (this is only for postincrement and
>     postdecrement).  */
> +
>  static tree
>  build_atomic_assign (location_t loc, tree lhs, enum tree_code modifycode,
>                    tree rhs, bool return_old_p)
> @@ -3805,6 +3804,86 @@ build_atomic_assign (location_t loc, tree lhs, enum 
> tree_code modifycode,
>        return build2 (COMPOUND_EXPR, nonatomic_lhs_type, compound_stmt, val);
>      }
>  
> +  /* Attempt to implement the atomic operation as an __atomic_fetch_* or
> +     __atomic_*_fetch built-in rather than a CAS loop.  atomic_bool type
> +     isn't applicable for such builtins.  ??? Do we want to handle enums?  */
> +  if ((TREE_CODE (lhs_type) == INTEGER_TYPE || POINTER_TYPE_P (lhs_type))
> +      && TREE_CODE (rhs_type) == INTEGER_TYPE)
> +    {
> +      built_in_function fncode;
> +      switch (modifycode)
> +     {
> +     case PLUS_EXPR:
> +     case POINTER_PLUS_EXPR:
> +       fncode = (return_old_p
> +                 ? BUILT_IN_ATOMIC_FETCH_ADD_N
> +                 : BUILT_IN_ATOMIC_ADD_FETCH_N);
> +       break;
> +     case MINUS_EXPR:
> +       fncode = (return_old_p
> +                 ? BUILT_IN_ATOMIC_FETCH_SUB_N
> +                 : BUILT_IN_ATOMIC_SUB_FETCH_N);
> +       break;
> +     case BIT_AND_EXPR:
> +       fncode = (return_old_p
> +                 ? BUILT_IN_ATOMIC_FETCH_AND_N
> +                 : BUILT_IN_ATOMIC_AND_FETCH_N);
> +       break;
> +     case BIT_IOR_EXPR:
> +       fncode = (return_old_p
> +                 ? BUILT_IN_ATOMIC_FETCH_OR_N
> +                 : BUILT_IN_ATOMIC_OR_FETCH_N);
> +       break;
> +     case BIT_XOR_EXPR:
> +       fncode = (return_old_p
> +                 ? BUILT_IN_ATOMIC_FETCH_XOR_N
> +                 : BUILT_IN_ATOMIC_XOR_FETCH_N);
> +       break;
> +     default:
> +       goto cas_loop;
> +     }
> +
> +      /* If this is a pointer type, we need to multiplicate by the size of

This should read "multiply".  Consider it fixed.

        Marek

Reply via email to