On Fri, Dec 18, 2015 at 06:15:11PM +0100, Marek Polacek wrote:
> As outlined in the PR, build_atomic_assign can do a better job.  It seems
> profitable to use __atomic_fetch_* built-ins when possible rather than the
> generic CAS loop.  This patch attempts to implement that.
> 
> There are two ???s in the patch.  For the first one, I think let's just
> ignore enums, the potential trouble doesn't seem to be worth it.  And wrt
> the second ???, well, here I was unsure, so I decided to go the generic way
> for restrict-qualified pointers.
> 
> The two new tests should insure that things work sane even when this
> optimization kicks in.
> 
> Bootstrapped/regtested on x86_64-linux and powerpc64le-linux, ok for trunk?
> 
> 2015-12-18  Marek Polacek  <pola...@redhat.com>
> 
> 	PR c/68908
> 	* c-typeck.c (build_atomic_assign): Improve commentary.  Add
> 	optimization to use __atomic_fetch_* built-in if possible.
> 
> 	* gcc.dg/atomic/c11-atomic-exec-6.c: New test.
> 	* gcc.dg/atomic/stdatomic-op-5.c: New test.
> 
> diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
> index b605f81..9b6febb 100644
> --- gcc/c/c-typeck.c
> +++ gcc/c/c-typeck.c
> @@ -3685,9 +3685,9 @@ pointer_diff (location_t loc, tree op0, tree op1)
>    return convert (restype, result);
>  }
>  
> -/* Expand atomic compound assignments into an approriate sequence as
> -   specified by the C11 standard section 6.5.16.2.
> -   given
> +/* Expand atomic compound assignments into an appropriate sequence as
> +   specified by the C11 standard section 6.5.16.2.
> +
>     _Atomic T1 E1
>     T2 E2
>     E1 op= E2
> @@ -3719,26 +3719,25 @@ loop:
>   done:
>     feupdateenv (&fenv);
>  
> -   Also note that the compiler is simply issuing the generic form of
> -   the atomic operations.  This requires temp(s) and has their address
> -   taken.  The atomic processing is smart enough to figure out when the
> -   size of an object can utilize a lock-free version, and convert the
> -   built-in call to the appropriate lock-free routine.  The optimizers
> -   will then dispose of any temps that are no longer required, and
> -   lock-free implementations are utilized as long as there is target
> -   support for the required size.
> +   The compiler will issue the __atomic_fetch_* built-in when possible,
> +   otherwise it will generate the generic form of the atomic operations.
> +   This requires temp(s) and has their address taken.  The atomic processing
> +   is smart enough to figure out when the size of an object can utilize
> +   a lock-free version, and convert the built-in call to the appropriate
> +   lock-free routine.  The optimizers will then dispose of any temps that
> +   are no longer required, and lock-free implementations are utilized as
> +   long as there is target support for the required size.
>  
>     If the operator is NOP_EXPR, then this is a simple assignment, and
>     an __atomic_store is issued to perform the assignment rather than
> -   the above loop.
> -
> -*/
> +   the above loop.  */
>  
>  /* Build an atomic assignment at LOC, expanding into the proper
>     sequence to store LHS MODIFYCODE= RHS.  Return a value representing
> -   the result of the operation, unless RETURN_OLD_P in which case
> +   the result of the operation, unless RETURN_OLD_P, in which case
>     return the old value of LHS (this is only for postincrement and
>     postdecrement).  */
> +
>  static tree
>  build_atomic_assign (location_t loc, tree lhs, enum tree_code modifycode,
>  		     tree rhs, bool return_old_p)
> @@ -3805,6 +3804,86 @@ build_atomic_assign (location_t loc, tree lhs, enum tree_code modifycode,
>    return build2 (COMPOUND_EXPR, nonatomic_lhs_type, compound_stmt, val);
>  }
>  
> +  /* Attempt to implement the atomic operation as an __atomic_fetch_* or
> +     __atomic_*_fetch built-in rather than a CAS loop.  atomic_bool type
> +     isn't applicable for such builtins.  ??? Do we want to handle enums?  */
> +  if ((TREE_CODE (lhs_type) == INTEGER_TYPE || POINTER_TYPE_P (lhs_type))
> +      && TREE_CODE (rhs_type) == INTEGER_TYPE)
> +    {
> +      built_in_function fncode;
> +      switch (modifycode)
> +	{
> +	case PLUS_EXPR:
> +	case POINTER_PLUS_EXPR:
> +	  fncode = (return_old_p
> +		    ? BUILT_IN_ATOMIC_FETCH_ADD_N
> +		    : BUILT_IN_ATOMIC_ADD_FETCH_N);
> +	  break;
> +	case MINUS_EXPR:
> +	  fncode = (return_old_p
> +		    ? BUILT_IN_ATOMIC_FETCH_SUB_N
> +		    : BUILT_IN_ATOMIC_SUB_FETCH_N);
> +	  break;
> +	case BIT_AND_EXPR:
> +	  fncode = (return_old_p
> +		    ? BUILT_IN_ATOMIC_FETCH_AND_N
> +		    : BUILT_IN_ATOMIC_AND_FETCH_N);
> +	  break;
> +	case BIT_IOR_EXPR:
> +	  fncode = (return_old_p
> +		    ? BUILT_IN_ATOMIC_FETCH_OR_N
> +		    : BUILT_IN_ATOMIC_OR_FETCH_N);
> +	  break;
> +	case BIT_XOR_EXPR:
> +	  fncode = (return_old_p
> +		    ? BUILT_IN_ATOMIC_FETCH_XOR_N
> +		    : BUILT_IN_ATOMIC_XOR_FETCH_N);
> +	  break;
> +	default:
> +	  goto cas_loop;
> +	}
> +
> +      /* If this is a pointer type, we need to multiplicate by the size of
This should read "multiply".  Consider it fixed.

	Marek
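
For readers following the thread, here is a rough sketch of the kind of C11
code the new path applies to and the built-ins it could map to.  This is my
own illustration, not taken from the patch or its new tests; the function and
variable names (bump, counter) are made up, and the exact lowering is of
course up to build_atomic_assign.

  /* Hypothetical example.  Compile as C11; _Atomic accesses default to
     sequentially consistent ordering.  */
  #include <stdatomic.h>

  _Atomic int counter;

  int
  bump (void)
  {
    /* A compound assignment yields the new value, so with the patch this
       can become the __atomic_add_fetch form instead of the generic
       compare-and-swap loop described in the quoted comment.  */
    counter += 5;

    /* Postincrement must yield the old value, hence the __atomic_fetch_add
       form (the return_old_p case in the patch).  */
    return counter++;
  }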
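
The last quoted line is cut off at the pointer discussion, so as a second
sketch (again mine, not the patch's comment or code, with made-up names p,
advance, n): C pointer arithmetic counts elements, while the __atomic_*_fetch
built-ins add a raw byte offset to the pointer value, which is why the C
front end has to scale the operand by the size of the pointed-to type before
issuing the built-in.

  /* Hypothetical example.  */
  #include <stddef.h>

  _Atomic (long *) p;

  void
  advance (ptrdiff_t n)
  {
    /* Must advance by n elements; the value handed to the built-in
       therefore has to be n * sizeof (long) bytes, not n.  */
    p += n;
  }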