> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Sent: Monday, March 23, 2020 7:08 PM
> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; Phil Yang
> <phil.y...@arm.com>; tho...@monjalon.net; Van Haaren, Harry
> <harry.van.haa...@intel.com>; step...@networkplumber.org;
> maxime.coque...@redhat.com; dev@dpdk.org; Richardson, Bruce
> <bruce.richard...@intel.com>
> Cc: david.march...@redhat.com; jer...@marvell.com; hemant.agra...@nxp.com;
> Gavin Hu <gavin...@arm.com>; Ruifeng Wang
> <ruifeng.w...@arm.com>; Joyce Kong <joyce.k...@arm.com>; nd <n...@arm.com>;
> Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; nd <n...@arm.com>
> Subject: RE: [PATCH v3 06/12] ipsec: optimize with c11 atomic for sa outbound
> sqn update
>
> <snip>
>
> > Subject: RE: [PATCH v3 06/12] ipsec: optimize with c11 atomic for sa
> > outbound
> > sqn update
> >
> > Hi Phil,
> >
> > >
> > > For SA outbound packets, rte_atomic64_add_return is used to generate
> > > the SQN atomically. This introduces an unnecessary full barrier on
> > > aarch64, because the rte_atomic_XX APIs are implemented with the
> > > '__sync' builtins. This patch replaces the call with a C11 atomic
> > > and eliminates the expensive barrier on aarch64.
> > >
> > > Signed-off-by: Phil Yang <phil.y...@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > > Reviewed-by: Gavin Hu <gavin...@arm.com>
> > > ---
> > >  lib/librte_ipsec/ipsec_sqn.h | 3 ++-
> > >  lib/librte_ipsec/sa.h        | 2 +-
> > >  2 files changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> > > index 0c2f76a..e884af7 100644
> > > --- a/lib/librte_ipsec/ipsec_sqn.h
> > > +++ b/lib/librte_ipsec/ipsec_sqn.h
> > > @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa,
> > > uint32_t *num)
> > >
> > > n = *num;
> > > if (SQN_ATOMIC(sa))
> > > - sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> > > + sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> > > + __ATOMIC_RELAXED);
> >
> > One generic thing to note:
> > clang for i686 will in some cases generate an actual function call for 64-bit
> > __atomic builtins (gcc seems to always generate cmpxchg8b for such cases).
> > Does anyone consider this a potential problem?
> > It's probably not a big deal, but I would like to hear a broader opinion.
> I had looked at this some time back for GCC. The function call is generated
> only if the underlying platform does not support the atomic
> instructions for the operand size. Otherwise, gcc generates the instructions
> directly.
> I would think the behavior would be the same for clang.
From what I see, not really.
As an example:
$ cat tatm11.c
#include <stdint.h>

struct x {
	uint64_t v __attribute__((aligned(8)));
};

uint64_t
ffxadd1(struct x *x, uint32_t n, uint32_t m)
{
	return __atomic_add_fetch(&x->v, n, __ATOMIC_RELAXED);
}

uint64_t
ffxadd11(uint64_t *v, uint32_t n, uint32_t m)
{
	return __atomic_add_fetch(v, n, __ATOMIC_RELAXED);
}
gcc for i686 will generate code with cmpxchg8b for both cases.
clang will generate cmpxchg8b for ffxadd1(), where the data is explicitly
8B aligned, but will emit a function call for ffxadd11().
>
> >
> > > else {
> > > sqn = sa->sqn.outb.raw + n;
> > > sa->sqn.outb.raw = sqn;
> > > diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h index
> > > d22451b..cab9a2e 100644
> > > --- a/lib/librte_ipsec/sa.h
> > > +++ b/lib/librte_ipsec/sa.h
> > > @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
> > > */
> > > union {
> > > union {
> > > - rte_atomic64_t atom;
> > > + uint64_t atom;
> > > uint64_t raw;
> > > } outb;
> >
> > If we don't need rte_atomic64 here anymore, then I think we can collapse the
> > union to just:
> > uint64_t outb;
> >
> > > struct {
> > > --
> > > 2.7.4