Hi Bing,

> -----Original Message-----
> From: Bing Zhao <bi...@nvidia.com>
> Sent: Wednesday, October 7, 2020 2:22 PM
> Subject: RE: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> 
> Hi Ori,
> 
> > -----Original Message-----
> > From: Ori Kam <or...@nvidia.com>
> > Sent: Sunday, October 4, 2020 5:20 PM
> > Subject: RE: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> >
> > Hi Bing,
> >
> > PSB,
> >
> > Thanks,
> > Ori
> > > -----Original Message-----
> > > From: Bing Zhao <bi...@nvidia.com>
> > > Sent: Thursday, October 1, 2020 3:26 AM
> > > Cc: dev@dpdk.org
> > > Subject: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> > >
> > > In single port hairpin mode, all the hairpin TX and RX queues
> > belong
> > > to the same device. After the queues are set up properly, there is
> > no
> > > other dependency between the TX queue and its RX peer queue. The
> > > binding process that connected the TX and RX queues together from
> > > hardware level will be done automatically during the device start
> > > procedure. Everything required is configured and initialized
> > already
> > > for the binding process.
> > >
> > > But in two ports hairpin mode, there will be some cross-
> > dependences
> > > between two different ports. Usually, the ports will be
> > initialized
> > > serially by the main thread but not in parallel. The earlier port
> > will
> > > not be able to enable the bind if the following peer port is not
> > yet
> > > configured with HW resources. What's more, if one port is detached
> > /
> > > attached dynamically, it would introduce more trouble for the
> > hairpin
> > > binding.
> > >
> > > To overcome these, new APIs for binding and unbinding are added.
> > > During startup, only the hairpin TX and RX peer queues will be set
> > up.
> > > Nothing will be done when starting the device if the queues are
> > > without auto-bind attribute. Only after the required ports pair
> > > started, the `rte_eth_hairpin_bind()` API can be called to bind
> > the
> > > all TX queues of the egress port to the RX queues of the peer port.
> > > Then the connection between the egress and ingress ports pair will
> > be
> > > established.
> > >
> > > The `rte_eth_hairpin_unbind()` API could be used to disconnect the
> > > egress and the peer ingress ports. This should only be called
> > before
> > > the device is closed if needed. When doing the clean up, all the
> > > egress and ingress pairs related to a single port should be taken
> > into
> > > consideration.
> > >
> > > Signed-off-by: Bing Zhao <bi...@nvidia.com>
> > > ---
> > >  lib/librte_ethdev/rte_ethdev.c           | 107
> > > +++++++++++++++++++++++++++++++
> > >  lib/librte_ethdev/rte_ethdev.h           |  51 +++++++++++++++
> > >  lib/librte_ethdev/rte_ethdev_driver.h    |  52 +++++++++++++++
> > >  lib/librte_ethdev/rte_ethdev_version.map |   2 +
> > >  4 files changed, 212 insertions(+)
> > >
> > > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > > b/lib/librte_ethdev/rte_ethdev.c index dfe5c1b..72f567b 100644
> > > --- a/lib/librte_ethdev/rte_ethdev.c
> > > +++ b/lib/librte_ethdev/rte_ethdev.c
> > > @@ -2175,6 +2175,113 @@ rte_eth_tx_hairpin_queue_setup(uint16_t
> > > port_id, uint16_t tx_queue_id,
> > >   return eth_err(port_id, ret);
> > >  }
> > >
> > > +int
> > > +rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port) {
> > > + struct rte_eth_dev *dev;
> > > + struct rte_eth_dev *rdev;
> > > + uint16_t p;
> > > + uint16_t rp;
> > > + int ret = 0;
> > > +
> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(tx_port, -EINVAL);
> > > + dev = &rte_eth_devices[tx_port];
> > > + if (!dev->data->dev_started) {
> > > +         RTE_ETHDEV_LOG(ERR, "TX port %d is not started",
> > tx_port);
> > > +         return -EBUSY;
> > > + }
> > > +
> > > + /*
> > > +  * If the all the ports probed belong to two or more separate
> > NICs, it
> > > +  * is recommended that each pair is bound independently but
> > not in the
> > > +  * loop to bind all ports.
> > > +  */
> >
> > I don't understand your comment.
> 
> I mean, in the following loops, if the application wants to bind one TX port 
> to all
> the other RX ports, it would not be supported between different PMDs or
> different NICs. Then the roll-back actions will be done and no hairpin port 
> peers
> will work successfully.
> 

O.K.

> >
> > > + if (rx_port == RTE_MAX_ETHPORTS) {
> >
> > I think maybe this should be done in the tx queue. Since if the bind
> > don't need some port why do we care if it is started?
> 
> Do you mean the checking in the RX ports loop below? At first, I intended to
> make sure this API would be called only after all the ports are started.
> But yes, there is no need if some port is not being used.
> 
I mean move the check to the tx pmd, if for some reason (peer device not 
started)
then it should fail.

> > So either add a new function to get all peer ports from the tx port,
> > or move this logic to the Target PMD.
> 
> There would be one more API to be introduced. To my understanding, it would
> be more appropriate if we add a new API here.
> But it would keep RTE simpler if we move such logic into the PMD.
> 
I'm fine with moving the logic to the PMD as long as it can check the status.
which he can using the internal function right?

> >
> > > +         RTE_ETH_FOREACH_DEV(p) {
> > > +                 rdev = &rte_eth_devices[p];
> > > +                 if (!rdev->data->dev_started) {
> > > +                         RTE_ETHDEV_LOG(ERR,
> > > +                                        "RX port %d is not started", p);
> > > +                         ret = -EBUSY;
> > > +                         goto unbind;
> > > +                 }
> > > +                 ret = (*dev->dev_ops->hairpin_bind)(dev, p);
> > > +                 if (ret) {
> > > +                         RTE_ETHDEV_LOG(ERR, "Failed to bind hairpin
> > > TX "
> > > +                                        "%d to RX %d", tx_port, p);
> > > +                         goto unbind;
> > > +                 }
> > > +         }
> > > + } else {
> > > +         RTE_ETH_VALID_PORTID_OR_ERR_RET(rx_port, -EINVAL);
> > > +         rdev = &rte_eth_devices[rx_port];
> > > +         if (!rdev->data->dev_started) {
> > > +                 RTE_ETHDEV_LOG(ERR,
> > > +                                "RX port %d is not started", rx_port);
> > > +                 return -EBUSY;
> > > +         }
> > > +         ret = (*dev->dev_ops->hairpin_bind)(dev, rx_port);
> > > +         if (ret)
> > > +                 RTE_ETHDEV_LOG(ERR, "Failed to bind hairpin TX %d
> > "
> > > +                                "to RX %d", tx_port, rx_port);
> > > + }
> > > +
> > > + return ret;
> > > +
> > > +unbind:
> > > + /* Roll back the previous binding process. */
> > > + RTE_ETH_FOREACH_DEV(rp) {
> > > +         if (rp < p)
> > > +                 (*dev->dev_ops->hairpin_unbind)(dev, rp);
> > > +         else
> > > +                 break;
> > > + }
> > > + return ret;
> > > +}
> > > +
> > > +int
> > > +rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port) {
> > > + struct rte_eth_dev *dev;
> > > + struct rte_eth_dev *rdev;
> > > + uint16_t p;
> > > + int ret = 0;
> > > +
> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(tx_port, -EINVAL);
> > > + dev = &rte_eth_devices[tx_port];
> > > + if (!dev->data->dev_started) {
> > > +         RTE_ETHDEV_LOG(ERR, "TX port %d is stopped", tx_port);
> > > +         return -EBUSY;
> > > + }
> > > +
> > > + if (rx_port == RTE_MAX_ETHPORTS) {
> > > +         RTE_ETH_FOREACH_DEV(p) {
> > > +                 rdev = &rte_eth_devices[p];
> > > +                 if (!rdev->data->dev_started) {
> >
> > This condition should never be true.
> > First see my comment above about the list of devices, second port
> > should fail to stop if it is bounded.
> 
> Thank, I will remove this check and make the logic the same as "bind" API.
> When stopping the second port, PMD will check if any peer port is bound and
> return a failure then.
> 

Great we should also consider what happens in case of hot plug,
In this case the hot plug port should be able to unbind himself from
all ports.

> >
> > > +                         RTE_ETHDEV_LOG(ERR, "RX port %d is
> > > stopped", p);
> > > +                         ret = -EBUSY;
> > > +                         break;
> > > +                 }
> > > +                 ret = (*dev->dev_ops->hairpin_unbind)(dev, p);
> > > +                 if (ret) {
> > > +                         RTE_ETHDEV_LOG(ERR, "Failed to unbind
> > > hairpin "
> > > +                                        "TX %d from RX %d", tx_port, p);
> > > +                         break;
> > > +                 }
> > > +         }
> > > + } else {
> > > +         RTE_ETH_VALID_PORTID_OR_ERR_RET(rx_port, -EINVAL);
> > > +         rdev = &rte_eth_devices[rx_port];
> > > +         if (!rdev->data->dev_started) {
> > > +                 RTE_ETHDEV_LOG(ERR, "RX port %d is stopped",
> > > rx_port);
> > > +                 return -EBUSY;
> > > +         }
> > > +         ret = (*dev->dev_ops->hairpin_unbind)(dev, rx_port);
> > > + }
> > > +
> > > + return ret;
> > > +}
> > > +
> > >  void
> > >  rte_eth_tx_buffer_drop_callback(struct rte_mbuf **pkts, uint16_t
> > unsent,
> > >           void *userdata __rte_unused)
> > > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > > b/lib/librte_ethdev/rte_ethdev.h index 645a186..c3fb684 100644
> > > --- a/lib/librte_ethdev/rte_ethdev.h
> > > +++ b/lib/librte_ethdev/rte_ethdev.h
> > > @@ -2133,6 +2133,57 @@ int rte_eth_tx_hairpin_queue_setup
> > >    const struct rte_eth_hairpin_conf *conf);
> > >
> > >  /**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change, or be removed, without
> > prior
> > > notice
> > > + *
> > > + * Bind all hairpin TX queues of one port to the RX queues of the
> > peer port.
> > > + * It is only allowed to call this API after all hairpin queues
> > are
> > > +configured
> > > + * properly and the devices of TX and peer RX are in started
> > state.
> > > + *
> > > + * @param tx_port
> > > + *   The TX port identifier of the Ethernet device.
> > > + * @param rx_port
> > > + *   The peer RX port identifier of the Ethernet device.
> > > + *   RTE_MAX_ETHPORTS is allowed for the traversal of all devices.
> > > + *   RX port ID could have the same value with TX port ID.
> > > + *
> > > + * @return
> > > + *   - (0) if successful.
> > > + *   - (-EINVAL) if bad parameter.
> > > + *   - (-EBUSY) if device is not in started state.
> > > + *   - (-ENOTSUP) if hardware doesn't support.
> > > + *   - Others detailed errors from PMD drivers.
> > > + */
> > > +__rte_experimental
> > > +int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port);
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change, or be removed, without
> > prior
> > > notice
> > > + *
> > > + * Unbind all hairpin TX queues of one port from the RX queues of
> > the
> > > + peer
> > > port.
> > > + * This should be called before closing the TX or RX devices
> > > +(optional). After
> > > + * unbind the hairpin ports pair, it is allowed to bind them
> > again.
> > > + * Changing queues configuration should be after stopping the
> > device.
> > > + *
> > > + * @param tx_port
> > > + *   The TX port identifier of the Ethernet device.
> > > + * @param rx_port
> > > + *   The peer RX port identifier of the Ethernet device.
> > > + *   RTE_MAX_ETHPORTS is allowed for traversal of all devices.
> > > + *   RX port ID could have the same value with TX port ID.
> > > + *
> > > + * @return
> > > + *   - (0) if successful.
> > > + *   - (-EINVAL) if bad parameter.
> > > + *   - (-EBUSY) if device is in stopped state.
> > > + *   - (-ENOTSUP) if hardware doesn't support.
> > > + *   - Others detailed errors from PMD drivers.
> > > + */
> > > +__rte_experimental
> > > +int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
> > > +
> > > +/**
> > >   * Return the NUMA socket to which an Ethernet device is
> > connected
> > >   *
> > >   * @param port_id
> > > diff --git a/lib/librte_ethdev/rte_ethdev_driver.h
> > > b/lib/librte_ethdev/rte_ethdev_driver.h
> > > index 04ac8e9..910433f 100644
> > > --- a/lib/librte_ethdev/rte_ethdev_driver.h
> > > +++ b/lib/librte_ethdev/rte_ethdev_driver.h
> > > @@ -575,6 +575,54 @@ typedef int (*eth_tx_hairpin_queue_setup_t)
> > >    const struct rte_eth_hairpin_conf *hairpin_conf);
> > >
> > >  /**
> > > + * @internal
> > > + * Bind all hairpin TX queues of one port to the RX queues of the
> > peer port.
> > > + *
> > > + * @param dev
> > > + *   ethdev handle of port.
> > > + * @param rx_port
> > > + *   the peer RX port.
> > > + *
> > > + * @return
> > > + *   Negative errno value on error, 0 on success.
> > > + *
> > > + * @retval 0
> > > + *   Success, bind successfully.
> > > + * @retval -ENOTSUP
> > > + *   Bind API is not supported.
> > > + * @retval -EINVAL
> > > + *   One of the parameters is invalid.
> > > + * @retval -EBUSY
> > > + *   Device is not started.
> > > + */
> > > +typedef int (*eth_hairpin_bind_t)(struct rte_eth_dev *dev,
> > > +                         uint16_t rx_port);
> > > +
> > > +/**
> > > + * @internal
> > > + * Unbind all hairpin TX queues of one port from the RX queues of
> > the
> > > +peer
> > > port.
> > > + *
> > > + * @param dev
> > > + *   ethdev handle of port.
> > > + * @param rx_port
> > > + *   the peer RX port.
> > > + *
> > > + * @return
> > > + *   Negative errno value on error, 0 on success.
> > > + *
> > > + * @retval 0
> > > + *   Success, bind successfully.
> > > + * @retval -ENOTSUP
> > > + *   Bind API is not supported.
> > > + * @retval -EINVAL
> > > + *   One of the parameters is invalid.
> > > + * @retval -EBUSY
> > > + *   Device is already stopped.
> > > + */
> > > +typedef int (*eth_hairpin_unbind_t)(struct rte_eth_dev *dev,
> > > +                           uint16_t rx_port);
> > > +
> > > +/**
> > >   * @internal A structure containing the functions exported by an
> > > Ethernet driver.
> > >   */
> > >  struct eth_dev_ops {
> > > @@ -713,6 +761,10 @@ struct eth_dev_ops {
> > >   /**< Set up device RX hairpin queue. */
> > >   eth_tx_hairpin_queue_setup_t tx_hairpin_queue_setup;
> > >   /**< Set up device TX hairpin queue. */
> > > + eth_hairpin_bind_t hairpin_bind;
> > > + /**< Bind all hairpin TX queues of device to the peer port RX
> > queues. */
> > > + eth_hairpin_unbind_t hairpin_unbind;
> > > + /**< Unbind all hairpin TX queues from the peer port RX queues.
> > */
> > >  };
> > >
> > >  /**
> > > diff --git a/lib/librte_ethdev/rte_ethdev_version.map
> > > b/lib/librte_ethdev/rte_ethdev_version.map
> > > index c95ef51..18efe4e 100644
> > > --- a/lib/librte_ethdev/rte_ethdev_version.map
> > > +++ b/lib/librte_ethdev/rte_ethdev_version.map
> > > @@ -227,6 +227,8 @@ EXPERIMENTAL {
> > >   rte_tm_wred_profile_delete;
> > >
> > >   # added in 20.11
> > > + rte_eth_hairpin_bind;
> > > + rte_eth_hairpin_unbind;
> > >   rte_eth_link_speed_to_str;
> > >   rte_eth_link_to_str;
> > >  };
> > > --
> > > 2.5.5
> 
> Thanks

Reply via email to