> 
> Few comments inline.

Thanks for the feedback Ilya. 

> 
> > The 'options:n_rxq_desc' and 'n_txq_desc' fields allow the number of rx
> > and tx descriptors for dpdk ports to be modified. By default the values
> > are set to 2048, but can be modified to an integer between 1 and 4096
> > that is a power of two. The values can be modified at runtime, however
> > require the NIC to restart when changed.
> >
> > Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
> >
> > ---
> > v3:
> > * Make queue sizes per-port rather than global
> > * Check if queue size is power of 2 - fail if so.
> >
> > v2:
> > * Rebase
> >
> >  INSTALL.DPDK-ADVANCED.md | 16 ++++++++++++++--
> >  NEWS                     |  2 ++
> >  lib/netdev-dpdk.c        | 48
> +++++++++++++++++++++++++++++++++++++++++++-----
> >  vswitchd/vswitch.xml     | 22 ++++++++++++++++++++++
> >  4 files changed, 81 insertions(+), 7 deletions(-)
> >
> > diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md
> > index d7b9873..488e84f 100644
> > --- a/INSTALL.DPDK-ADVANCED.md
> > +++ b/INSTALL.DPDK-ADVANCED.md
> > @@ -257,7 +257,19 @@ needs to be affinitized accordingly.
> >    The rx queues are assigned to pmd threads on the same NUMA node in a
> >    round-robin fashion.
> >
> > -### 4.4 Exact Match Cache
> > +### 4.4 DPDK Physical Port Queue Sizes
> > +  `ovs-vsctl set Interface dpdk0 options:n_rxq_desc=<integer>`
> > +  `ovs-vsctl set Interface dpdk0 options:n_txq_desc=<integer>`
> > +
> > +  The command above sets the number of rx/tx descriptors that the NIC
> > +  associated with dpdk0 will be initialised with.
> > +
> > +  Different 'n_rxq_desc' and 'n_txq_desc' configurations yield different
> > +  benefits in terms of throughput and latency for different scenarios.
> > +  Generally, smaller queue sizes can have a positive impact for latency at
> the
> > +  expense of throughput. The opposite is often true for larger queue sizes.
> 
> Here we can mention that increasing the number of rx descriptors may lead
> to performance degradation because of using non-vectorized rx functions.
> At least this is true for i40e and maybe ixgbe dpdk drivers. Setting
> 'n_rxq_desc=4096' for them will lead to disabling of vectorized rx.

It seems the same applies for ixgbe (IXGBE_MAX_RING_DESC=4096) 
http://dpdk.org/doc/guides/nics/ixgbe.html#rx-constraints
I will include this info in the next version.

> 
> > +
> > +### 4.5 Exact Match Cache
> >
> >    Each pmd thread contains one EMC. After initial flow setup in the
> >    datapath, the EMC contains a single table and provides the lowest level
> > @@ -274,7 +286,7 @@ needs to be affinitized accordingly.
> >    avoiding datapath classifier lookups is to have multiple pmd threads
> >    running. This can be done as described in section 4.2.
> >
> > -### 4.5 Rx Mergeable buffers
> > +### 4.6 Rx Mergeable buffers
> >
> >    Rx Mergeable buffers is a virtio feature that allows chaining of multiple
> >    virtio descriptors to handle large packet sizes. As such, large packets
> > diff --git a/NEWS b/NEWS
> > index 21ab538..901886d 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -125,6 +125,8 @@ v2.6.0 - xx xxx xxxx
> >       * Remove dpdkvhostcuse port type.
> >       * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
> >       * 'dpdkvhostuserclient' port type.
> > +     * New option 'n_rxq_desc' and 'n_txq_desc' fields for DPDK interfaces
> > +       which set the number of rx and tx descriptors to use for the given
> port.
> >     - Increase number of registers to 16.
> >     - ovs-benchmark: This utility has been removed due to lack of use and
> >       bitrot.
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 89bdc4d..228993f 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -132,8 +132,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF /
> ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
> >
> >  #define SOCKET0              0
> >
> > -#define NIC_PORT_RX_Q_SIZE 2048  /* Size of Physical NIC RX Queue,
> Max (n+32<=4096)*/
> > -#define NIC_PORT_TX_Q_SIZE 2048  /* Size of Physical NIC TX Queue,
> Max (n+32<=4096)*/
> > +#define NIC_PORT_DEFAULT_RXQ_SIZE 2048 /* Default size of Physical
> NIC RXQ */
> > +#define NIC_PORT_DEFAULT_TXQ_SIZE 2048 /* Default size of Physical
> NIC TXQ */
> > +#define NIC_PORT_MAX_Q_SIZE 4096       /* Maximum size of Physical
> NIC Queue */
> >
> >  #define OVS_VHOST_MAX_QUEUE_NUM 1024  /* Maximum number of
> vHost TX queues. */
> >  #define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not
> initialized. */
> > @@ -372,6 +373,12 @@ struct netdev_dpdk {
> >      int requested_mtu;
> >      int requested_n_txq;
> >      int requested_n_rxq;
> > +    int requested_rxq_size;
> > +    int requested_txq_size;
> > +
> > +    /* Number of rx/tx descriptors for physical devices */
> > +    int rxq_size;
> > +    int txq_size;
> >
> >      /* Socket ID detected when vHost device is brought up */
> >      int requested_socket_id;
> > @@ -646,7 +653,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk
> *dev, int n_rxq, int n_txq)
> >          }
> >
> >          for (i = 0; i < n_txq; i++) {
> > -            diag = rte_eth_tx_queue_setup(dev->port_id, i,
> NIC_PORT_TX_Q_SIZE,
> > +            diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size,
> >                                            dev->socket_id, NULL);
> >              if (diag) {
> >                  VLOG_INFO("Interface %s txq(%d) setup error: %s",
> > @@ -662,7 +669,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk
> *dev, int n_rxq, int n_txq)
> >          }
> >
> >          for (i = 0; i < n_rxq; i++) {
> > -            diag = rte_eth_rx_queue_setup(dev->port_id, i,
> NIC_PORT_RX_Q_SIZE,
> > +            diag = rte_eth_rx_queue_setup(dev->port_id, i, dev->rxq_size,
> >                                            dev->socket_id, NULL,
> >                                            dev->dpdk_mp->mp);
> >              if (diag) {
> > @@ -837,6 +844,10 @@ netdev_dpdk_init(struct netdev *netdev,
> unsigned int port_no,
> >      netdev->n_txq = NR_QUEUE;
> >      dev->requested_n_rxq = netdev->n_rxq;
> >      dev->requested_n_txq = netdev->n_txq;
> > +    dev->rxq_size = NIC_PORT_DEFAULT_RXQ_SIZE;
> > +    dev->txq_size = NIC_PORT_DEFAULT_TXQ_SIZE;
> > +    dev->requested_rxq_size = dev->rxq_size;
> > +    dev->requested_txq_size = dev->txq_size;
> >
> >      /* Initialize the flow control to NULL */
> >      memset(&dev->fc_conf, 0, sizeof dev->fc_conf);
> > @@ -1051,6 +1062,8 @@ netdev_dpdk_get_config(const struct netdev
> *netdev, struct smap *args)
> >      smap_add_format(args, "configured_rx_queues", "%d", netdev-
> >n_rxq);
> >      smap_add_format(args, "requested_tx_queues", "%d", dev-
> >requested_n_txq);
> >      smap_add_format(args, "configured_tx_queues", "%d", netdev-
> >n_txq);
> > +    smap_add_format(args, "rxq_descriptors", "%d", dev->rxq_size);
> > +    smap_add_format(args, "txq_descriptors", "%d", dev->txq_size);
> >      smap_add_format(args, "mtu", "%d", dev->mtu);
> >      ovs_mutex_unlock(&dev->mutex);
> >
> > @@ -1069,6 +1082,21 @@ dpdk_set_rxq_config(struct netdev_dpdk *dev,
> const struct smap *args)
> >      }
> >  }
> >
> > +static void
> > +dpdk_process_queue_size(struct netdev *netdev, const struct smap
> *args,
> > +                        char *flag, int *new_size)
> > +{
> > +    int queue_size;
> > +
> > +    queue_size = smap_get_int(args, flag, 0);
> > +    if (queue_size > 0 && queue_size <= NIC_PORT_MAX_Q_SIZE
> > +            && rte_is_power_of_2(queue_size)
> 
> I'm suggesting to use OVS internal functions to do the check: 'IS_POW2()'
> macro or function 'is_pow2()'.
> 
> > +            && queue_size != *new_size) {
> > +        *new_size = queue_size;
> > +        netdev_request_reconfigure(netdev);
> > +    }
> 
> Also, some sane error message would be nice in case of wrong value.

This is generally avoided for set_config since it is called so often. An error 
isn't printed in when setting n_rxq for example.
If the user sets a bogus value you can end up with many prints of the same log. 
And these logs again later when you set another value eg. n_rxq.
For this reason I don't plan to make this change.

Thanks,
Ciara

> 
> > +}
> > +
> >  static int
> >  netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args)
> >  {
> > @@ -1078,6 +1106,11 @@ netdev_dpdk_set_config(struct netdev
> *netdev, const struct smap *args)
> >
> >      dpdk_set_rxq_config(dev, args);
> >
> > +    dpdk_process_queue_size(netdev, args, "n_rxq_desc",
> > +                            &dev->requested_rxq_size);
> > +    dpdk_process_queue_size(netdev, args, "n_txq_desc",
> > +                            &dev->requested_txq_size);
> > +
> >      /* Flow control support is only available for DPDK Ethernet ports. */
> >      bool rx_fc_en = false;
> >      bool tx_fc_en = false;
> > @@ -2923,7 +2956,9 @@ netdev_dpdk_reconfigure(struct netdev
> *netdev)
> >
> >      if (netdev->n_txq == dev->requested_n_txq
> >          && netdev->n_rxq == dev->requested_n_rxq
> > -        && dev->mtu == dev->requested_mtu) {
> > +        && dev->mtu == dev->requested_mtu
> > +        && dev->rxq_size == dev->requested_rxq_size
> > +        && dev->txq_size == dev->requested_txq_size) {
> >          /* Reconfiguration is unnecessary */
> >
> >          goto out;
> > @@ -2938,6 +2973,9 @@ netdev_dpdk_reconfigure(struct netdev
> *netdev)
> >      netdev->n_txq = dev->requested_n_txq;
> >      netdev->n_rxq = dev->requested_n_rxq;
> >
> > +    dev->rxq_size = dev->requested_rxq_size;
> > +    dev->txq_size = dev->requested_txq_size;
> > +
> >      rte_free(dev->tx_q);
> >      err = dpdk_eth_dev_init(dev);
> >      netdev_dpdk_alloc_txq(dev, netdev->n_txq);
> > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> > index e73023d..1e96ada 100644
> > --- a/vswitchd/vswitch.xml
> > +++ b/vswitchd/vswitch.xml
> > @@ -2375,6 +2375,28 @@
> >            Only supported by dpdkvhostuserclient interfaces.
> >          </p>
> >        </column>
> > +
> > +      <column name="options" key="n_rxq_desc"
> > +              type='{"type": "integer", "minInteger": 1, "maxInteger": 
> > 4096}'>
> > +        <p>
> > +          Specifies the rx queue size (number rx descriptors) for dpdk 
> > ports.
> > +          The value must be a multiple of 2, less than 4096 and supported
> 
> s/multiple/power/
> 
> > +          by the hardware of the device being configured.
> > +          If not specified or an incorrect value is specified, 2048 rx
> > +          descriptors will be used by default.
> > +        </p>
> > +      </column>
> > +
> > +      <column name="options" key="n_txq_desc"
> > +              type='{"type": "integer", "minInteger": 1, "maxInteger": 
> > 4096}'>
> > +        <p>
> > +          Specifies the tx queue size (number tx descriptors) for dpdk 
> > ports.
> > +          The value must be a multiple of 2, less than 4096 and supported
> 
> s/multiple/power/
> 
> > +          by the hardware of the device being configured.
> > +          If not specified or an incorrect value is specified, 2048 tx
> > +          descriptors will be used by default.
> > +        </p>
> > +      </column>
> >      </group>
> >
> >      <group title="MTU">
> > --
> > 2.4.3

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to