On Mon, 31 Jul 2017, Jeff Law wrote:
> >> In the middle end patch, do we need a barrier before the fence as well?
> >> The post-fence barrier prevents reordering the fence with anything which
> >> follows the fence.  But do we have to also prevent reordering the fence
> >> with prior instructions with any of the memory models?  This isn't my
> >> area of expertise, so if it's dumb question, don't hesitate to let me
> >> know :-)
> >
> > That depends on how pessimistic we want to be with respect to backend
> > getting it wrong.  My expectation here is that if a backend emits non-empty
> > RTL, the produced sequence for the fence itself acts as a compiler memory
> > barrier.
> Perhaps.  But do we really want to rely on that?  Emitting a scheduling
> barrier prior to these atomics is virtually free.

Please consider that expand_mem_thread_fence is used to place fences around
seq-cst atomic loads and stores when the backend doesn't provide a direct
pattern.  With compiler barriers on both sides of the machine barrier, the
generated sequence for a seq-cst atomic load will be 7 insns:

  asm volatile ("":::"memory");
  machine_seq_cst_fence ();
  asm volatile ("":::"memory");
  dst = mem[src];
  asm volatile ("":::"memory");
  machine_seq_cst_fence ();
  asm volatile ("":::"memory");

I can easily imagine people looking at RTL dumps with this overkill fencing
being unhappy about this.  I'd be happier with detecting empty expansion
via get_last_insn (), so that the compiler barrier is emitted only when the
backend's fence pattern expands to nothing.

Thanks.
Alexander
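
P.S. For anyone who wants to compare the two shapes at the source level, here
is a minimal C sketch of the 7-insn sequence next to a post-fence-only variant.
It is an illustration under assumptions, not what the middle end emits:
COMPILER_BARRIER, load_both_sides and load_post_only are hypothetical names,
and machine_seq_cst_fence is modelled with __atomic_thread_fence, which GCC
already treats as a compiler barrier, so the extra barriers only show where
the additional insns would sit.

```c
/* Compiler-only barrier: emits no machine instruction, but keeps GCC from
   moving memory accesses across this point.  Hypothetical helper name.  */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" ::: "memory")

/* Stand-in for the backend's machine fence (an assumption for this sketch;
   the real expansion comes from the target's fence pattern).  */
static void
machine_seq_cst_fence (void)
{
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
}

/* Load with compiler barriers on both sides of each machine fence:
   seven steps, mirroring the sequence in the mail.  */
static int
load_both_sides (const int *src)
{
  int dst;
  COMPILER_BARRIER ();
  machine_seq_cst_fence ();
  COMPILER_BARRIER ();
  dst = *(const volatile int *) src;
  COMPILER_BARRIER ();
  machine_seq_cst_fence ();
  COMPILER_BARRIER ();
  return dst;
}

/* Same load with only the post-fence compiler barrier: five steps.  */
static int
load_post_only (const int *src)
{
  int dst;
  machine_seq_cst_fence ();
  COMPILER_BARRIER ();
  dst = *(const volatile int *) src;
  machine_seq_cst_fence ();
  COMPILER_BARRIER ();
  return dst;
}
```

Both functions return the same value; the difference is only in how many
barrier insns would show up in an RTL dump.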