Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory barrier

Sergey Fedorov Mon, 20 Jun 2016 14:33:07 -0700

On 18/06/16 07:03, Pranith Kumar wrote:
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index db6a062..36feca9 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -408,6 +408,20 @@ static inline intptr_t QEMU_ARTIFICIAL 
> GET_TCGV_PTR(TCGv_ptr t)
>  #define TCG_CALL_DUMMY_TCGV     MAKE_TCGV_I32(-1)
>  #define TCG_CALL_DUMMY_ARG      ((TCGArg)(-1))
>  
> +typedef enum {
> +    TCG_MO_LD_LD    = 1,
> +    TCG_MO_ST_LD    = 2,
> +    TCG_MO_LD_ST    = 4,
> +    TCG_MO_ST_ST    = 8,
> +    TCG_MO_ALL      = 0xF, // OR of all above


So TCG_MO_ALL specifies a so called "full" memory barrier?

> +} TCGOrder;
> +
> +typedef enum {
> +    TCG_BAR_ACQ     = 32,
> +    TCG_BAR_REL     = 64,

I'm convinced that the only practical way to represent a standalone
acquire memory barrier is to order all previous loads with all
subsequent loads and stores. Similarly, a standalone release memory
barrier would order all previous loads and stores with all subsequent
stores. [1]

On the other hand, acquire or release semantic associated with a memory
operation itself can be directly mapped into e.g. AArch64's Load-Acquire
(LDAR) and Store-Release (STLR) instructions. A standalone barrier
adjacent to a memory operation shouldn't be mapped this way because it
should provide more strict guarantees than e.g. AArch64 instructions
mentioned above.

Therefore, I advocate for clear distinction between standalone memory
barriers and implicit memory ordering semantics associated with memory
operations themselves.

[1] http://preshing.com/20130922/acquire-and-release-fences/

> +    TCG_BAR_SC      = 128,

How's that different from TCG_MO_ALL?

> +} TCGBar;
> +
>  /* Conditions.  Note that these are laid out for easy manipulation by
>     the functions below:
>       bit 0 is used for inverting;

Kind regards,
Sergey

Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory barrier

Reply via email to