Re: [dpdk-dev] [PATCH v3 1/3] rwlock: reimplement with atomic builtins

2019-03-19 Thread Gavin Hu (Arm Technology China)
Hi Konstantin, 

> -Original Message-
> From: Ananyev, Konstantin 
> Sent: Friday, March 15, 2019 7:41 PM
> To: Joyce Kong (Arm Technology China) ;
> dev@dpdk.org
> Cc: nd ; Gavin Hu (Arm Technology China)
> ; jer...@marvell.com; chao...@linux.vnet.ibm.com;
> Richardson, Bruce ; tho...@monjalon.net;
> hemant.agra...@nxp.com; Honnappa Nagarahalli
> 
> Subject: RE: [PATCH v3 1/3] rwlock: reimplement with atomic builtins
> 
> Hi,
> 
> 
> > The __sync builtin based implementation generates full memory
> > barriers ('dmb ish') on Arm platforms. Using C11 atomic builtins
> > to generate one way barriers.
> >
> > Here is the assembly code of __sync_compare_and_swap builtin.
> > __sync_bool_compare_and_swap(dst, exp, src);
> >0x0090f1b0 <+16>:e0 07 40 f9 ldr x0, [sp, #8]
> >0x0090f1b4 <+20>:e1 0f 40 79 ldrhw1, [sp, #6]
> >0x0090f1b8 <+24>:e2 0b 40 79 ldrhw2, [sp, #4]
> >0x0090f1bc <+28>:21 3c 00 12 and w1, w1, #0x
> >0x0090f1c0 <+32>:03 7c 5f 48 ldxrh   w3, [x0]
> >0x0090f1c4 <+36>:7f 00 01 6b cmp w3, w1
> >0x0090f1c8 <+40>:61 00 00 54 b.ne0x90f1d4
> >   // b.any
> >0x0090f1cc <+44>:02 fc 04 48 stlxrh  w4, w2, [x0]
> >0x0090f1d0 <+48>:84 ff ff 35 cbnzw4, 0x90f1c0
> > 
> >0x0090f1d4 <+52>:bf 3b 03 d5 dmb ish
> >0x0090f1d8 <+56>:e0 17 9f 1a csetw0, eq  // eq = none
> >
> > Signed-off-by: Gavin Hu 
> > Tested-by: Joyce Kong 
> > Acked-by: Jerin Jacob 
> > ---
> 
> Wouldn't it be plausible to change _try_ functions to use __atomic too (for
> consistency)?
> Apart from that looks good to me.
> Konstantin

Sure, we will do it in next version. 



Re: [dpdk-dev] [PATCH v3 1/3] rwlock: reimplement with atomic builtins

2019-03-15 Thread Ananyev, Konstantin
Hi,


> The __sync builtin based implementation generates full memory
> barriers ('dmb ish') on Arm platforms. Using C11 atomic builtins
> to generate one way barriers.
> 
> Here is the assembly code of __sync_compare_and_swap builtin.
> __sync_bool_compare_and_swap(dst, exp, src);
>0x0090f1b0 <+16>:e0 07 40 f9 ldr x0, [sp, #8]
>0x0090f1b4 <+20>:e1 0f 40 79 ldrhw1, [sp, #6]
>0x0090f1b8 <+24>:e2 0b 40 79 ldrhw2, [sp, #4]
>0x0090f1bc <+28>:21 3c 00 12 and w1, w1, #0x
>0x0090f1c0 <+32>:03 7c 5f 48 ldxrh   w3, [x0]
>0x0090f1c4 <+36>:7f 00 01 6b cmp w3, w1
>0x0090f1c8 <+40>:61 00 00 54 b.ne0x90f1d4
>   // b.any
>0x0090f1cc <+44>:02 fc 04 48 stlxrh  w4, w2, [x0]
>0x0090f1d0 <+48>:84 ff ff 35 cbnzw4, 0x90f1c0
> 
>0x0090f1d4 <+52>:bf 3b 03 d5 dmb ish
>0x0090f1d8 <+56>:e0 17 9f 1a csetw0, eq  // eq = none
> 
> Signed-off-by: Gavin Hu 
> Tested-by: Joyce Kong 
> Acked-by: Jerin Jacob 
> ---

Wouldn't it be plausible to change _try_ functions to use __atomic too (for 
consistency)?
Apart from that looks good to me.
Konstantin



Re: [dpdk-dev] [PATCH v3 1/3] rwlock: reimplement with atomic builtins

2019-03-14 Thread Gavin Hu (Arm Technology China)
Hi Stephen,

> -Original Message-
> From: Stephen Hemminger 
> Sent: Thursday, March 14, 2019 11:54 PM
> To: Joyce Kong (Arm Technology China) 
> Cc: dev@dpdk.org; nd ; Gavin Hu (Arm Technology China)
> ; jer...@marvell.com; konstantin.anan...@intel.com;
> chao...@linux.vnet.ibm.com; bruce.richard...@intel.com;
> tho...@monjalon.net; hemant.agra...@nxp.com; Honnappa Nagarahalli
> 
> Subject: Re: [dpdk-dev] [PATCH v3 1/3] rwlock: reimplement with atomic
> builtins
> 
> On Thu, 14 Mar 2019 21:15:02 +0800
> Joyce Kong  wrote:
> 
> > -   success = rte_atomic32_cmpset((volatile uint32_t *)&rwl-
> >cnt,
> > - (uint32_t)x, (uint32_t)(x + 1));
> > +   success = __atomic_compare_exchange_n(&rwl->cnt, &x,
> x+1, 1,
> > +   __ATOMIC_ACQUIRE,
> __ATOMIC_RELAXED);
> 
> Would it be possible to have rte_atomic32_cmpset be an inline function
> that became __atomic_comppare_exchange? Then all usages would be the
> same.

There is already a patch for this and Honnappa commented on this: 
https://mails.dpdk.org/archives/dev/2019-January/124297.html




Re: [dpdk-dev] [PATCH v3 1/3] rwlock: reimplement with atomic builtins

2019-03-14 Thread Stephen Hemminger
On Thu, 14 Mar 2019 21:15:02 +0800
Joyce Kong  wrote:

> - success = rte_atomic32_cmpset((volatile uint32_t *)&rwl->cnt,
> -   (uint32_t)x, (uint32_t)(x + 1));
> + success = __atomic_compare_exchange_n(&rwl->cnt, &x, x+1, 1,
> + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);

Would it be possible to have rte_atomic32_cmpset be an inline function
that became __atomic_comppare_exchange? Then all usages would be the same.