<snip>
> > > Subject: RE: [PATCH v3 06/12] ipsec: optimize with c11 atomic for sa
> > > outbound sqn update
> > >
> > > Hi Phil,
> > >
> > > >
> > > > For SA outbound packets, rte_atomic64_add_return is used to
> > > > generate the SQN atomically. This introduced an unnecessary full
> > > > barrier by calling the '__sync'-builtin-based rte_atomic_XX
> > > > API on aarch64. This patch optimizes it with a C11 atomic and
> > > > eliminates the expensive barrier on aarch64.
> > > >
> > > > Signed-off-by: Phil Yang <phil.y...@arm.com>
> > > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > > > Reviewed-by: Gavin Hu <gavin...@arm.com>
> > > > ---
> > > >  lib/librte_ipsec/ipsec_sqn.h | 3 ++-
> > > >  lib/librte_ipsec/sa.h        | 2 +-
> > > >  2 files changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> > > > index 0c2f76a..e884af7 100644
> > > > --- a/lib/librte_ipsec/ipsec_sqn.h
> > > > +++ b/lib/librte_ipsec/ipsec_sqn.h
> > > > @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
> > > >
> > > >  	n = *num;
> > > >  	if (SQN_ATOMIC(sa))
> > > > -		sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> > > > +		sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> > > > +			__ATOMIC_RELAXED);
> > >
> > > One generic thing to note:
> > > clang for i686 will in some cases generate a proper function call
> > > for 64-bit __atomic builtins (gcc seems to always generate cmpxchg8b
> > > for such cases).
> > > Does anyone consider it a potential problem?
> > > It is probably not a big deal, but I would like to hear a broader opinion.
> >
> > I had looked at this some time back for GCC. The function call is
> > generated only if the underlying platform does not support the atomic
> > instructions for the operand size. Otherwise, gcc generates the
> > instructions directly. I would think the behavior would be the same
> > for clang.
>
> From what I see, not really.
> As an example:
>
> $ cat tatm11.c
> #include <stdint.h>
>
> struct x {
> 	uint64_t v __attribute__((aligned(8)));
> };
>
> uint64_t
> ffxadd1(struct x *x, uint32_t n, uint32_t m)
> {
> 	return __atomic_add_fetch(&x->v, n, __ATOMIC_RELAXED);
> }
>
> uint64_t
> ffxadd11(uint64_t *v, uint32_t n, uint32_t m)
> {
> 	return __atomic_add_fetch(v, n, __ATOMIC_RELAXED);
> }
>
> gcc for i686 will generate code with cmpxchg8b for both cases.
> clang will generate cmpxchg8b for ffxadd1() - when the data is explicitly
> 8B aligned - but will emit a function call for ffxadd11().

Does it require libatomic to be linked in this case? The Clang documentation
calls out the unaligned case as one where it would generate the function
call [1]. On aarch64, the atomic instructions need the address to be aligned.
[1] https://clang.llvm.org/docs/Toolchain.html#atomics-library

> > > >  	else {
> > > >  		sqn = sa->sqn.outb.raw + n;
> > > >  		sa->sqn.outb.raw = sqn;
> > > >
> > > > diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> > > > index d22451b..cab9a2e 100644
> > > > --- a/lib/librte_ipsec/sa.h
> > > > +++ b/lib/librte_ipsec/sa.h
> > > > @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
> > > >  	 */
> > > >  	union {
> > > >  		union {
> > > > -			rte_atomic64_t atom;
> > > > +			uint64_t atom;
> > > >  			uint64_t raw;
> > > >  		} outb;
> > >
> > > If we don't need rte_atomic64 here anymore, then I think we can
> > > collapse the union to just:
> > > 	uint64_t outb;
> >
> > > >  		struct {
> > > > --
> > > > 2.7.4