Re: [ovs-dev] [v12 15/16] dpif-netdev: Optimize dp output action.

Ferriter, Cian Wed, 16 Jun 2021 09:23:28 -0700

Hi Ian,

Further responses are inline.


> -----Original Message-----
> From: Stokes, Ian <[email protected]>
> Sent: Wednesday 16 June 2021 13:47
> To: Ferriter, Cian <[email protected]>; [email protected]; Van 
> Haaren, Harry <[email protected]>;
> [email protected]
> Subject: RE: [ovs-dev] [v12 15/16] dpif-netdev: Optimize dp output action.
> 
> > Hi Ian,
> >
> > Thanks for the review. My responses are inline.
> >
> > > -----Original Message-----
> > > From: Stokes, Ian <[email protected]>
> > > Sent: Tuesday 15 June 2021 18:04
> > > To: Ferriter, Cian <[email protected]>; [email protected];
> > Ferriter, Cian <[email protected]>; Van Haaren, Harry
> > > <[email protected]>
> > > Cc: [email protected]
> > > Subject: RE: [ovs-dev] [v12 15/16] dpif-netdev: Optimize dp output action.
> > >
> > > > From: Harry van Haaren <[email protected]>
> > > >
> > > > This commit optimizes the output action, by enabling the compiler to
> > > > optimize the code better through reducing code complexity.
> > > >
> > > > The core concept of this optimization is that the array-length checks
> > > > have already been performed above the copying code, so can be removed.
> > > > Removing of the per-packet length checks allows the compiler to auto-
> > vectorize
> > > > the stores using SIMD registers.
> > > >
> > > > Signed-off-by: Harry van Haaren <[email protected]>
> > > >
> > >
> > > Thanks for the patch Cian/Harry.
> > >
> > > Comments inline.
> > > > ---
> > > >
> > > > v8: Add NEWS entry.
> > > > ---
> > > >  NEWS              |  1 +
> > > >  lib/dpif-netdev.c | 23 ++++++++++++++++++-----
> > > >  2 files changed, 19 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/NEWS b/NEWS
> > > > index d04dac746..a7ec34af1 100644
> > > > --- a/NEWS
> > > > +++ b/NEWS
> > > > @@ -17,6 +17,7 @@ Post-v2.15.0
> > > >       * Enable the AVX512 DPCLS implementation to use VPOPCNT 
> > > > instruction
> > if
> > > > the
> > > >         CPU supports it. This enhances performance by using the native
> > vpopcount
> > > >         instructions, instead of the emulated version of vpopcount.
> > > > +     * Optimize dp_netdev_output by enhancing compiler optimization
> > potential.
> > > >     - ovs-ctl:
> > > >       * New option '--no-record-hostname' to disable hostname 
> > > > configuration
> > > >         in ovsdb on startup.
> > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > > index 04a4f71cb..f46269ae3 100644
> > > > --- a/lib/dpif-netdev.c
> > > > +++ b/lib/dpif-netdev.c
> > > > @@ -7290,12 +7290,25 @@ dp_execute_output_action(struct
> > > > dp_netdev_pmd_thread *pmd,
> > > >          pmd->n_output_batches++;
> > > >      }
> > > >
> > > > -    struct dp_packet *packet;
> > > > -    DP_PACKET_BATCH_FOR_EACH (i, packet, packets_) {
> > > > -        p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] =
> > > > -            pmd->ctx.last_rxq;
> > > > -        dp_packet_batch_add(&p->output_pkts, packet);
> > > > +    /* The above checks ensure that there is enough space in the output
> > batch.
> > > > +     * Using dp_packet_batch_add() has a branch to check if the batch 
> > > > is full.
> > > > +     * This branch reduces the compiler's ability to optimize 
> > > > efficiently. The
> > > > +     * below code implements packet movement between batches without
> > > > checks,
> > > > +     * with the required semantics of output batch perhaps containing
> > packets.
> > > > +     */
> > >
> > > What is the performance gain recorded with this approach with compiler
> > optimizations?
> > >
> >
> > The test was run using the scalar DPIF, 1 flow and EMC on, where the
> > before/after cycle cost of the patch is 243 cycles before to 230 cycles per 
> > packet
> > after. Hence this optimization saves 13 cycles per packet in my 
> > measurements.
> 
> So I think that info is important and should be added to the commit message 
> and will help inform whether to make the change.
> 

I added this information to the commit message.

> Ilya has worked in this area extensively so would be interested to hear his 
> opinion?
> 
> I don't think this is a blocker to the DPIF work so could possibly be 
> submitted separately as a patch for master?
> 

Correct, it doesn't block the DPIF work. I'll submit as a separate patch.

> Regards
> Ian
> >
> > > > +    int batch_size = dp_packet_batch_size(packets_);
> > > > +    int out_batch_idx = dp_packet_batch_size(&p->output_pkts);
> > > > +    struct dp_netdev_rxq *rxq = pmd->ctx.last_rxq;
> > > > +    struct dp_packet_batch *output_batch = &p->output_pkts;
> > > > +
> > > > +    for (int i = 0; i < batch_size; i++) {
> > > > +        struct dp_packet *packet = packets_->packets[i];
> > > > +        p->output_pkts_rxqs[out_batch_idx] = rxq;
> > > > +        output_batch->packets[out_batch_idx] = packet;
> > > > +        out_batch_idx++;
> > > >      }
> > > > +    output_batch->count += batch_size;
> > >
> > > So I guess the counter argument here is that the previous approach uses 
> > > the
> > dp_packet standard functions.
> > >
> > > There has been work over time to ensure that dp_packets are handled in a
> > similar way using the dp_packet api.
> > >
> > > This I think reverses the position. That's why I'd like to get a sense of 
> > > the
> > performance gain as I think it's important to justify breaking
> > > from code conformance.
> > >
> > > Has there been thought to modifying the dp_packet_batch_add ?
> > >
> >
> > We thought about adding an "unsafe" version of dp_packet_batch_add() which
> > doesn't check whether the packet batch is full. Due to this optimization 
> > being
> > dependent on other checks in the surrounding code, it isn't appropriate to 
> > have
> > a generic function containing this optimization.
> >
> > > Regards
> > > Ian
> > > > +
> > > >      return true;
> > > >  }
> > > >
> > > > --
> > > > 2.31.1
> > > >
> > > > _______________________________________________
> > > > dev mailing list
> > > > [email protected]
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [v12 15/16] dpif-netdev: Optimize dp output action.

Reply via email to