Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

Alexander Monakov Mon, 31 Jul 2017 11:01:55 -0700

On Mon, 31 Jul 2017, Jeff Law wrote:
> > Please consider that expand_mem_thread_fence is used to place fences around
> > seq-cst atomic loads&stores when the backend doesn't provide a direct 
> > pattern.
> > With compiler barriers on both sides of the machine barrier, the generated
> > sequence for a seq-cst atomic load will be 7 insns:
> > 
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> >   dst = mem[src];
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> > 
> > I can easily imagine people looking at RTL dumps with this overkill fencing
> > being unhappy about this.
> But the extra fences aren't actually going to impact anything except
> perhaps an unmeasurable compile-time hit.  ie, it may look bad, but I'd
> have a hard time believing it matters in practice.


I agree it doesn't matter that much from compile-time/memory standpoint.
I meant it matters from the standpoint of a person working with the RTL dumps,
i.e. having to work through all that, especially if they need to work with
non-slim dumps.

> > I'd be more happy with detecting empty expansion via get_last_insn ().
> That seems like an unnecessary complication to me.

I think it's quite simple by GCC standards:

  {
    rtx_insn last = get_last_insn ();
    emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
    if (last == get_last_insn () && !is_mm_relaxed (model))
      expand_asm_memory_barrier ();
  }

> I'd prefer instead to just emit the necessary fencing in the generic code
> and update the MD docs so that maintainers know they don't have to emit
> the RTL fences themselves.

I agree the docs need an update, but hopefully not in this way.  The legacy
__sync_synchronize barrier has always been required to be a compiler barrier
when expanded to RTL, and it's quite natural to use the same RTL structure
for new atomic fences as the old memory_barrier expansion.  The only problem
in current practice seen so far is with empty expansion for new patterns.

Thanks.
Alexander

Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

Reply via email to