> On Wed, Sep 22, 2021 at 8:38 PM Ananyev, Konstantin
> <konstantin.anan...@intel.com> wrote:
> >
> >
> > > > Hi Jerin,
> > > >
> > > > > > NOTE: This is just an RFC to start further discussion and collect 
> > > > > > the feedback.
> > > > > > Due to the significant amount of work, the required changes have
> > > > > > been applied to only two PMDs so far: net/i40e and net/ice.
> > > > > > So to build it you'll need to add:
> > > > > > -Denable_drivers='common/*,mempool/*,net/ice,net/i40e'
> > > > > > to your config options.
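> > > > > > E.g. with meson, that would be something like:
> > > > > > meson -Denable_drivers='common/*,mempool/*,net/ice,net/i40e' build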
> > > > >
> > > > > >
> > > > > > That approach was selected to avoid(/minimize) possible performance 
> > > > > > losses.
> > > > > >
> > > > > > So far I have done only a limited amount of functional and
> > > > > > performance testing.
> > > > > > I didn't spot any functional problems, and the performance numbers
> > > > > > remain the same before and after the patch on my box (testpmd,
> > > > > > macswap fwd).
> > > > >
> > > > >
> > > > > Based on testing on octeontx2, we see some regression in testpmd,
> > > > > and a bit on l3fwd too.
> > > > >
> > > > > Without the patch: 73.5 mpps/core in testpmd iofwd
> > > > > With the patch: 72.5 mpps/core in testpmd iofwd
> > > > >
> > > > > Based on my understanding, it is due to an additional indirection.
> > > >
> > > > From your patch below, it looks like it is not actually an additional
> > > > indirection, but an extra memory dereference - the func and dev
> > > > pointers are now stored in different places.
> > >
> > > Yup. I meant the same. We are on the same page.
> > >
> > > > Plus the fact that we now dereference rte_eth_devices[] data inside
> > > > the PMD function, which probably prevents the compiler and CPU from
> > > > loading rte_eth_devices[port_id].data and
> > > > rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id]
> > > > in advance, before calling the actual RX/TX function.
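> > > >
> > > > I.e. with the current (pre-patch) layout, the inline wrapper can issue
> > > > both loads up front, roughly (simplified):
> > > >
> > > > struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > > > /* both dereferences are visible to the compiler here,
> > > >  * before the indirect call into the PMD: */
> > > > nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
> > > >                 rx_pkts, nb_pkts);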
> > >
> > > Yes.
> > >
> > > > About your approach: I don't mind adding an extra opaque 'void *data'
> > > > pointer, but I would prefer not to expose the callback invocation code
> > > > in an inline function.
> > > > The main reason: I think it still needs to be reworked to allow
> > > > adding/removing callbacks without stopping the device, similar to what
> > > > was done for cryptodev callbacks. To be able to do that in the future
> > > > without another ABI breakage, the callback-related part needs to be
> > > > kept internal.
> > > > Though what we probably can do: add two dynamic arrays of opaque
> > > > pointers to rte_eth_burst_api - one for rx/tx queue data pointers,
> > > > a second for rx/tx callback pointers.
> > > > To be more specific, something like:
> > > >
> > > > typedef uint16_t (*rte_eth_rx_burst_t)(void *rxq,
> > > >         struct rte_mbuf **rx_pkts, uint16_t nb_pkts, void *cbs);
> > > > typedef uint16_t (*rte_eth_tx_burst_t)(void *txq,
> > > >         struct rte_mbuf **tx_pkts, uint16_t nb_pkts, void *cbs);
> > > > ....
> > > >
> > > > struct rte_eth_burst_api {
> > > >         rte_eth_rx_burst_t rx_pkt_burst;
> > > >         /**< PMD receive function. */
> > > >         rte_eth_tx_burst_t tx_pkt_burst;
> > > >         /**< PMD transmit function. */
> > > >         rte_eth_tx_prep_t tx_pkt_prepare;
> > > >         /**< PMD transmit prepare function. */
> > > >         rte_eth_rx_queue_count_t rx_queue_count;
> > > >         /**< Get the number of used RX descriptors. */
> > > >         rte_eth_rx_descriptor_status_t rx_descriptor_status;
> > > >         /**< Check the status of a Rx descriptor. */
> > > >         rte_eth_tx_descriptor_status_t tx_descriptor_status;
> > > >         /**< Check the status of a Tx descriptor. */
> > > >         struct {
> > > >                 void **queue_data; /* points to rte_eth_devices[port_id].data->rx_queues */
> > > >                 void **cbs;        /* points to rte_eth_devices[port_id].post_rx_burst_cbs */
> > > >         } rx_data, tx_data;
> > > > } __rte_cache_aligned;
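> > > >
> > > > On the PMD side, the opaque 'cbs' argument would then simply be
> > > > forwarded to the callback machinery. A rough, hypothetical sketch
> > > > (the xyz_* names are illustrative only, not an actual driver or
> > > > helper):
> > > >
> > > > static uint16_t
> > > > xyz_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
> > > >                 void *cbs)
> > > > {
> > > >         /* normal HW RX path of the driver */
> > > >         uint16_t nb_rx = xyz_recv_pkts_raw(rxq, rx_pkts, nb_pkts);
> > > >
> > > >         /* run post-RX callbacks, if any are installed for this queue */
> > > >         if (cbs != NULL)
> > > >                 nb_rx = xyz_call_rx_callbacks(cbs, rx_pkts, nb_rx,
> > > >                                 nb_pkts);
> > > >         return nb_rx;
> > > > }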
> > > >
> > > > static inline uint16_t
> > > > rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
> > > >                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> > > > {
> > > >         struct rte_eth_burst_api *p;
> > > >
> > > >         if (port_id >= RTE_MAX_ETHPORTS ||
> > > >                         queue_id >= RTE_MAX_QUEUES_PER_PORT)
> > > >                 return 0;
> > > >
> > > >         p = &rte_eth_burst_api[port_id];
> > > >         return p->rx_pkt_burst(p->rx_data.queue_data[queue_id],
> > > >                         rx_pkts, nb_pkts, p->rx_data.cbs[queue_id]);
> > > > }
> > >
> > > That works.
> > >
> > > >
> > > > Same for TX.
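> > > >
> > > > As a sketch, the TX side would be symmetric:
> > > >
> > > > static inline uint16_t
> > > > rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
> > > >                  struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> > > > {
> > > >         struct rte_eth_burst_api *p;
> > > >
> > > >         if (port_id >= RTE_MAX_ETHPORTS ||
> > > >                         queue_id >= RTE_MAX_QUEUES_PER_PORT)
> > > >                 return 0;
> > > >
> > > >         p = &rte_eth_burst_api[port_id];
> > > >         return p->tx_pkt_burst(p->tx_data.queue_data[queue_id],
> > > >                         tx_pkts, nb_pkts, p->tx_data.cbs[queue_id]);
> > > > }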
> > > >
> > > > If that looks ok to everyone, I'll try to prepare the next version
> > > > based on that.
> > >
> > >
> > > Looks good to me.
> > >
> > > > In theory that should avoid the extra dereference problem and even
> > > > reduce indirection.
> > > > As a drawback, data->rxq/txq would always have to be allocated for
> > > > RTE_MAX_QUEUES_PER_PORT entries, but I presume that's not a big deal.
> > > >
> > > > As a side question - is there any reason why
> > > > rte_ethdev_trace_rx_burst() is invoked at the very last point, while
> > > > rte_ethdev_trace_tx_burst() is invoked after the CBs but before the
> > > > actual tx_pkt_burst()?
> > > > It would make things simpler if tracing were always done either on
> > > > entry to or exit from rx/tx_burst.
> > >
> > > exit is fine.
> > >
> > > >
> > > > >
> > > > > My suggestion to fix the problem: remove the additional `data`
> > > > > redirection, pull the callback function pointers back, and keep the
> > > > > rest opaque, as done in the existing patch [1].
> > > > >
> > > > > I don't believe this has any real implication for future ABI
> > > > > stability, as we will not be adding any new items to rte_eth_fp;
> > > > > new features can be added in the slow-path rte_eth_dev, as
> > > > > mentioned in the patch.
> > >
> > > Ack
> > >
> > > I will be happy to test again after the rework and report any
> > > performance issues.
> >
> > Thanks Jerin, v2 is out:
> > https://patches.dpdk.org/project/dpdk/list/?series=19084
> > Please have a look when you get a chance.
> 
> Tested the series. Looks good, no performance issues.

That's great news, thanks for testing it.
I plan to proceed with a proper v3 in the next few days.
 
