> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.anan...@intel.com>
> Sent: Tuesday, March 24, 2020 7:04 PM
> To: Phil Yang <phil.y...@arm.com>; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>;
> tho...@monjalon.net; Van Haaren, Harry <harry.van.haa...@intel.com>;
> step...@networkplumber.org; maxime.coque...@redhat.com; dev@dpdk.org;
> Richardson, Bruce <bruce.richard...@intel.com>
> Cc: david.march...@redhat.com; jer...@marvell.com; hemant.agra...@nxp.com;
> Gavin Hu <gavin...@arm.com>; Ruifeng Wang <ruifeng.w...@arm.com>;
> Joyce Kong <joyce.k...@arm.com>; nd <n...@arm.com>; nd <n...@arm.com>; nd <n...@arm.com>
> Subject: RE: [PATCH v3 06/12] ipsec: optimize with c11 atomic for sa outbound sqn update
>
> Hi Phil,
>
> <snip>
>
> > > > > Subject: RE: [PATCH v3 06/12] ipsec: optimize with c11 atomic for sa outbound
> > > > > sqn update
> > > > >
> > > > > Hi Phil,
> > > > >
> > > > > > For SA outbound packets, rte_atomic64_add_return is used to generate
> > > > > > SQN atomically. This introduced an unnecessary full barrier by calling
> > > > > > the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> > > > > > patch optimized it with c11 atomic and eliminated the expensive
> > > > > > barrier for aarch64.
> > > > > >
> > > > > > Signed-off-by: Phil Yang <phil.y...@arm.com>
> > > > > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > > > > > Reviewed-by: Gavin Hu <gavin...@arm.com>
> > > > > > ---
> > > > > >  lib/librte_ipsec/ipsec_sqn.h | 3 ++-
> > > > > >  lib/librte_ipsec/sa.h        | 2 +-
> > > > > >  2 files changed, 3 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> > > > > > index 0c2f76a..e884af7 100644
> > > > > > --- a/lib/librte_ipsec/ipsec_sqn.h
> > > > > > +++ b/lib/librte_ipsec/ipsec_sqn.h
> > > > > > @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
> > > > > >
> > > > > >  	n = *num;
> > > > > >  	if (SQN_ATOMIC(sa))
> > > > > > -		sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> > > > > > +		sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> > > > > > +			__ATOMIC_RELAXED);
> > > > >
> > > > > One generic thing to note:
> > > > > clang for i686 will in some cases generate a proper function call for
> > > > > 64-bit __atomic builtins (gcc seems to always generate cmpxchg8b for
> > > > > such cases).
> > > > > Does anyone consider it a potential problem?
> > > > > It is probably not a big deal, but I would like to know the broader opinion.
> > > >
> > > > I had looked at this some time back for GCC. The function call is generated
> > > > only if the underlying platform does not support the atomic instructions
> > > > for the operand size. Otherwise, gcc generates the instructions directly.
> > > > I would think the behavior would be the same for clang.
> > >
> > > From what I see, not really.
> > > As an example:
> > >
> > > $ cat tatm11.c
> > > #include <stdint.h>
> > >
> > > struct x {
> > > 	uint64_t v __attribute__((aligned(8)));
> > > };
> > >
> > > uint64_t
> > > ffxadd1(struct x *x, uint32_t n, uint32_t m)
> > > {
> > > 	return __atomic_add_fetch(&x->v, n, __ATOMIC_RELAXED);
> > > }
> > >
> > > uint64_t
> > > ffxadd11(uint64_t *v, uint32_t n, uint32_t m)
> > > {
> > > 	return __atomic_add_fetch(v, n, __ATOMIC_RELAXED);
> > > }
> > >
> > > gcc for i686 will generate code with cmpxchg8b for both cases.
> > > clang will generate cmpxchg8b for ffxadd1() - when the data is explicitly
> > > 8B aligned - but will emit a function call for ffxadd11().
> >
> > I guess your testbed is an i386 platform. However, what I see here is different.
> >
> > Testbed i686: Ubuntu 18.04.4 LTS / GCC 8.3 / Clang 9.0.0-2
> > Both Clang and GCC for i686 generate code with xadd for these two cases.
>
> I suppose you meant x86_64 here (-m64), right?
Yes. It is x86_64 here.

> > Testbed i386: Ubuntu 16.04 LTS (libatomic installed) / GCC 5.4.0 / Clang 4.0.0
> > GCC will generate code with cmpxchg8b for both cases.
> > Clang-generated code emits a function call for both cases.
>
> That's exactly what I am talking about above.
> x86_64 (64-bit binary) - no function calls with either gcc or clang.
> i686 (32-bit binary) - no function calls with gcc, function calls with clang
> when explicit alignment is not specified.
>
> As I said in my initial email, that's probably not a big deal -
> from what I was told so far we don't officially support clang for IA-32,
> and I don't know whether anyone uses it at all right now.
> Though if someone thinks it is a potential problem here,
> it is better to flag it at an early stage.
> So once again, my questions to the community:
> 1/ Does anyone build/use DPDK with i686-clang?
> 2/ If so, can those people try to evaluate
>    how big a perf drop it would cause for them?
> 3/ Is there an option to switch to i686-gcc (the supported one)?
>
> Konstantin
>
> > > > > >  	else {
> > > > > >  		sqn = sa->sqn.outb.raw + n;
> > > > > >  		sa->sqn.outb.raw = sqn;
> > > > > >
> > > > > > diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> > > > > > index d22451b..cab9a2e 100644
> > > > > > --- a/lib/librte_ipsec/sa.h
> > > > > > +++ b/lib/librte_ipsec/sa.h
> > > > > > @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
> > > > > >  	 */
> > > > > >  	union {
> > > > > >  		union {
> > > > > > -			rte_atomic64_t atom;
> > > > > > +			uint64_t atom;
> > > > > >  			uint64_t raw;
> > > > > >  		} outb;
> > > > >
> > > > > If we don't need rte_atomic64 here anymore, then I think we can collapse
> > > > > the union to just:
> > > > > 	uint64_t outb;
> > > > >
> > > > > >  	struct {
> > > > > > --
> > > > > > 2.7.4