On 11/28/2011 06:02 AM, Alan Modra wrote:
> -  unsigned int ret = bar->generation & ~3;
> -  /* Do we need any barrier here or is __sync_add_and_fetch acting
> -     as the needed LoadLoad barrier already?  */
> -  ret += __sync_add_and_fetch (&bar->awaited, -1) == 0;
> +  unsigned int ret = __atomic_load_4 (&bar->generation, MEMMODEL_ACQUIRE) & ~3;
> +  ret += __atomic_add_fetch (&bar->awaited, -1, MEMMODEL_ACQ_REL) == 0;

Given that the read from bar->generation is ACQ, we don't need a duplicate
barrier from the REL on the atomic add.

I believe both can be MEMMODEL_ACQUIRE, both to force the ordering of these
two memops and to force them to happen before anything subsequent.

The s/_4/_n/ change needs doing, i.e. __atomic_load_4 should become
__atomic_load_n.

Otherwise ok.


r~
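
For reference, a rough and untested sketch of how the two lines might look with both suggestions applied. This is not the committed code. The struct and function names are hypothetical stand-ins, and the GCC builtin constant __ATOMIC_ACQUIRE is used on the assumption that it matches what MEMMODEL_ACQUIRE means inside libgomp.

```c
/* Hypothetical stand-in for the barrier type; only the two fields
   touched by the quoted hunk are modelled here.  */
struct sketch_barrier
{
  unsigned int generation;
  unsigned int awaited;
};

/* Returns the generation with its low two bits cleared, plus 1 if the
   caller was the last thread to arrive (awaited dropped to zero).  */
static unsigned int
sketch_wait_start (struct sketch_barrier *bar)
{
  /* Acquire load via the generic _n builtin (the s/_4/_n/ change).  */
  unsigned int ret
    = __atomic_load_n (&bar->generation, __ATOMIC_ACQUIRE) & ~3;

  /* Assumption: acquire alone on the decrement, with no release half,
     as suggested above.  */
  ret += __atomic_add_fetch (&bar->awaited, -1, __ATOMIC_ACQUIRE) == 0;
  return ret;
}
```

With generation 8 and awaited 2, the first call returns 8 and the second returns 9, since only the second caller brings awaited down to zero.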