Ilya,

Sure, I'll rebase my patch and also ensure netdev_is_flow_api_enabled() is
called once per batch. I don't expect any issue with that.
Regards,
Balazs

On Fri, Jun 25, 2021 at 2:00 PM Ilya Maximets <[email protected]> wrote:

> On 6/25/21 1:04 PM, Ferriter, Cian wrote:
> > Hi Eli,
> >
> > I have some concerns about how we plug the packet state recover logic
> > into dfc_processing() here. My concerns are inline below.
> >
> > I'm concerned that we are hurting performance in the partial HWOL case,
> > as this patchset introduces new direct (non-inline) function calls per
> > packet to the software datapath. We can reduce the performance impact
> > in this area; see the specific suggestions below.
> >
> > Thanks,
> > Cian
> >
> >> -----Original Message-----
> >> From: Eli Britstein <[email protected]>
> >> Sent: Wednesday 23 June 2021 16:53
> >> To: [email protected]; Ilya Maximets <[email protected]>
> >> Cc: Gaetan Rivet <[email protected]>; Majd Dibbiny <[email protected]>;
> >> Sriharsha Basavapatna <[email protected]>; Hemal Shah
> >> <[email protected]>; Ivan Malov <[email protected]>; Ferriter, Cian
> >> <[email protected]>; Eli Britstein <[email protected]>; Finn, Emma
> >> <[email protected]>; Kovacevic, Marko <[email protected]>
> >> Subject: [PATCH V7 06/13] dpif-netdev: Add HW miss packet state recover logic.
> >>
> >> Recover the packet if it was partially processed by the HW. Fall back
> >> to flow lookup by mark association.
> >>
> >> Signed-off-by: Eli Britstein <[email protected]>
> >> Reviewed-by: Gaetan Rivet <[email protected]>
> >> Acked-by: Sriharsha Basavapatna <[email protected]>
> >> Tested-by: Emma Finn <[email protected]>
> >> Tested-by: Marko Kovacevic <[email protected]>
> >> Signed-off-by: Ilya Maximets <[email protected]>
> >> ---
> >>  lib/dpif-netdev.c | 45 +++++++++++++++++++++++++++++++++++++++++----
> >>  1 file changed, 41 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> >> index 8fa7eb6d4..d5b7d5b73 100644
> >> --- a/lib/dpif-netdev.c
> >> +++ b/lib/dpif-netdev.c
> >> @@ -114,6 +114,7 @@ COVERAGE_DEFINE(datapath_drop_invalid_port);
> >>  COVERAGE_DEFINE(datapath_drop_invalid_bond);
> >>  COVERAGE_DEFINE(datapath_drop_invalid_tnl_port);
> >>  COVERAGE_DEFINE(datapath_drop_rx_invalid_packet);
> >> +COVERAGE_DEFINE(datapath_drop_hw_miss_recover);
> >>
> >>  /* Protects against changes to 'dp_netdevs'. */
> >>  static struct ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER;
> >> @@ -7062,6 +7063,39 @@ smc_lookup_batch(struct dp_netdev_pmd_thread *pmd,
> >>      pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SMC_HIT, n_smc_hit);
> >>  }
> >>
> >> +static struct tx_port * pmd_send_port_cache_lookup(
> >> +    const struct dp_netdev_pmd_thread *pmd, odp_port_t port_no);
> >> +
> >> +static inline int
> >> +dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
> >> +                  odp_port_t port_no,
> >> +                  struct dp_packet *packet,
> >> +                  struct dp_netdev_flow **flow)
> >> +{
> >> +    struct tx_port *p;
> >> +    uint32_t mark;
> >> +
> >
> > Start of full HWOL recovery block
> >
> >> +    /* Restore the packet if HW processing was terminated before
> >> +     * completion. */
> >> +    p = pmd_send_port_cache_lookup(pmd, port_no);
> >> +    if (OVS_LIKELY(p)) {
> >> +        int err = netdev_hw_miss_packet_recover(p->port->netdev, packet);
> >> +
> >> +        if (err && err != EOPNOTSUPP) {
> >> +            COVERAGE_INC(datapath_drop_hw_miss_recover);
> >> +            return -1;
> >> +        }
> >> +    }
> >
> > End of full HWOL recovery block
> >
> > IIUC, the above is only relevant to full HWOL, where the packet is
> > partially processed by the HW.
> > In a partial HWOL testcase, we see a performance drop as a result of
> > the above code. The performance after the patchset is applied is 0.94x
> > what it was before.
>
> While reviewing the patch set I noticed this part too, but this code was
> tested twice by Intel engineers, so I figured that it doesn't hurt the
> performance of partial offloading.
>
> In general, it should be easy to re-order the partial and full offloading
> checks like this (didn't test):
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index c5ab35d2a..36a5976f2 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -7113,6 +7113,14 @@ dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
>      struct tx_port *p;
>      uint32_t mark;
>
> +    /* If no mark, no flow to find. */
> +    if (dp_packet_has_flow_mark(packet, &mark)) {
> +        *flow = mark_to_flow_find(pmd, mark);
> +        return 0;
> +    }
> +
> +    *flow = NULL;
> +
>      /* Restore the packet if HW processing was terminated before
>       * completion. */
>      p = pmd_send_port_cache_lookup(pmd, port_no);
>      if (OVS_LIKELY(p)) {
> @@ -7123,14 +7131,6 @@ dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
>              return -1;
>          }
>      }
> -
> -    /* If no mark, no flow to find. */
> -    if (!dp_packet_has_flow_mark(packet, &mark)) {
> -        *flow = NULL;
> -        return 0;
> -    }
> -
> -    *flow = mark_to_flow_find(pmd, mark);
>      return 0;
>  }
> ---
>
> If everyone agrees, I can do this change.
>
> >
> > We should branch over this code in the partial HWOL scenario, where we
> > don't need the calls to pmd_send_port_cache_lookup() and
> > netdev_hw_miss_packet_recover(). We want this branch to be transparent
> > to the user. Since both full and partial HWOL fall under the
> > other_config:hw-offload=true switch, we might need a configure-time
> > check of NIC capabilities, or something similar, to branch over the
> > full HWOL packet recovery code. Does this make sense?
>
> A compile-time check of capabilities doesn't make sense, as in the vast
> majority of cases it's unknown which HW the package will run on. The
> code should be as generic as possible.
>
> >
> > Here is 'perf top' output showing cycles spent per function in my
> > partial HWOL scenario. We can see netdev_hw_miss_packet_recover(),
> > netdev_offload_dpdk_hw_miss_packet_recover() and
> > netdev_is_flow_api_enabled() taking cycles:
> >
> >   28.79%  pmd-c01/id:8  ovs-vswitchd  [.] dp_netdev_input__
> >   13.72%  pmd-c01/id:8  ovs-vswitchd  [.] parse_tcp_flags
> >   11.07%  pmd-c01/id:8  ovs-vswitchd  [.] i40e_recv_pkts_vec_avx2
> >   10.94%  pmd-c01/id:8  ovs-vswitchd  [.] i40e_xmit_fixed_burst_vec_avx2
> >    7.48%  pmd-c01/id:8  ovs-vswitchd  [.] cmap_find
> >    4.94%  pmd-c01/id:8  ovs-vswitchd  [.] netdev_hw_miss_packet_recover
> >    4.77%  pmd-c01/id:8  ovs-vswitchd  [.] dp_execute_output_action
> >    3.81%  pmd-c01/id:8  ovs-vswitchd  [.] dp_netdev_pmd_flush_output_on_port
> >    3.40%  pmd-c01/id:8  ovs-vswitchd  [.] netdev_send
> >    2.49%  pmd-c01/id:8  ovs-vswitchd  [.] netdev_dpdk_eth_send
> >    1.16%  pmd-c01/id:8  ovs-vswitchd  [.] netdev_dpdk_rxq_recv
> >    0.90%  pmd-c01/id:8  ovs-vswitchd  [.] pmd_perf_end_iteration
> >    0.75%  pmd-c01/id:8  ovs-vswitchd  [.] dp_netdev_process_rxq_port
> >    0.68%  pmd-c01/id:8  ovs-vswitchd  [.] netdev_is_flow_api_enabled
> >    0.55%  pmd-c01/id:8  ovs-vswitchd  [.] netdev_offload_dpdk_hw_miss_packet_recover
> >
> >> +
> >> +    /* If no mark, no flow to find. */
> >> +    if (!dp_packet_has_flow_mark(packet, &mark)) {
> >> +        *flow = NULL;
> >> +        return 0;
> >> +    }
> >> +
> >> +    *flow = mark_to_flow_find(pmd, mark);
> >> +    return 0;
> >> +}
> >> +
> >>  /* Try to process all ('cnt') the 'packets' using only the datapath flow cache
> >>  * 'pmd->flow_cache'. If a flow is not found for a packet 'packets[i]', the
> >>  * miniflow is copied into 'keys' and the packet pointer is moved at the
> >> @@ -7106,7 +7140,6 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
> >>
> >>      DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, packets_) {
> >>          struct dp_netdev_flow *flow;
> >> -        uint32_t mark;
> >>
> >>          if (OVS_UNLIKELY(dp_packet_size(packet) < ETH_HEADER_LEN)) {
> >>              dp_packet_delete(packet);
> >> @@ -7125,9 +7158,13 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
> >>              pkt_metadata_init(&packet->md, port_no);
> >>          }
> >>
> >> -        if ((*recirc_depth_get() == 0) &&
> >> -            dp_packet_has_flow_mark(packet, &mark)) {
> >> -            flow = mark_to_flow_find(pmd, mark);
> >> +        if (netdev_is_flow_api_enabled() && *recirc_depth_get() == 0) {
> >
> > Here we have a per-packet call to netdev_is_flow_api_enabled(). I think
> > that netdev_is_flow_api_enabled() should be inlined if it's going to be
> > called per packet. We can see from the 'perf top' output above that it
> > isn't inlined, since it shows up as a separate function.
>
> I'd consider "inlining" and moving a lot of stuff to headers harmful for
> the code base, as it makes the code less readable, and it's really hard
> to preserve this kind of thing during code modification. It's much
> better to fix the logic instead of hammering the code with blind
> low-level optimizations.
>
> For this particular issue we already have a solution here:
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/c0b99b4b9be6c290a6aa3cb00049fac4ebfac5d8.1618390390.git.bnem...@redhat.com/
>
> In short, we only need to check once per batch. I think Balazs will be
> able to re-base his patch on top of this series, including the check for
> netdev_is_flow_api_enabled().
>
> Balazs, will you?
>
> Best regards, Ilya Maximets.
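For reference, a minimal, untested sketch of the once-per-batch check
discussed above, written against the dfc_processing() hunk quoted from the
patch. The 'hw_miss_enabled' variable name and the hunk placement are
assumptions for illustration, not taken from any posted patch:

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ ... @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
+    /* Untested sketch: both conditions are invariant for the whole batch,
+     * so evaluate them once here instead of once per packet below. */
+    bool hw_miss_enabled = netdev_is_flow_api_enabled()
+                           && *recirc_depth_get() == 0;
+
     DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, packets_) {
         struct dp_netdev_flow *flow;
@@ ... @@
-        if (netdev_is_flow_api_enabled() && *recirc_depth_get() == 0) {
+        if (hw_miss_enabled) {

This removes both the non-inlined netdev_is_flow_api_enabled() call and the
recirculation-depth check from the per-packet path without moving anything
into a header.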
