On 7 Jul 2019, at 18:13, David Marchand wrote:

Add a statistic to help diagnose contention on the vhost txqs.
This is usually seen as dropped packets on the physical ports for rates
that are usually handled fine by OVS.

Signed-off-by: David Marchand <david.march...@redhat.com>
---
This patch applies on top of Kevin "vhost-retry" series [1].

1: https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=70482
---
 Documentation/topics/dpdk/vhost-user.rst | 25 +++++++++++++++++++++++++
 lib/netdev-dpdk.c                        | 20 ++++++++++++++++++--
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
index 724aa62..e276b21 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -553,6 +553,31 @@ shown with::

   $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries

+vhost tx contention
+-------------------
+
+Unlike physical ports, where the number of Transmit queues is decided when
+the port is configured and does not change once set up, the number of
+Transmit queues for a vhost-user or vhost-user-client interface is negotiated
+with the guest when its driver enables a virtio interface, and OVS has no
+control over it.
+
+Each PMD thread needs a Transmit queue to send packets on such a port.
+To accommodate the cases where the number of PMD threads differs from the

Maybe make it more explicit, i.e. where the number of PMD threads is higher than the number of tx queues.

+number of Transmit queues enabled by a Virtual Machine, the Transmit queues are
+distributed among the PMD threads and their accesses are protected by a
+spinlock.
+
+When contention occurs (because two PMD threads are mapped to the same Transmit
+queue), this phenomenon can be hard to diagnose.
+It usually manifests as OVS taking more cycles to handle the packets, which can
+show up as packets getting dropped on the receive side (rx_dropped statistic
+on the physical ports).
+
+A statistic for the port is available to catch those cases::
+
+  $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_contentions
+
 vhost-user Dequeue Zero Copy (experimental)
 -------------------------------------------

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 0e70e44..844fb3c 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -139,10 +139,11 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF / MIN_NB_MBUF))
 #define XSTAT_RX_JABBER_ERRORS           "rx_jabber_errors"

 /* Size of vHost custom stats. */
-#define VHOST_CUSTOM_STATS_SIZE          1
+#define VHOST_CUSTOM_STATS_SIZE          2

 /* Names of vHost custom stats. */
 #define VHOST_STAT_TX_RETRIES            "tx_retries"
+#define VHOST_STAT_TX_CONTENTIONS        "tx_contentions"

 #define SOCKET0              0

@@ -340,6 +341,8 @@ struct dpdk_tx_queue {
                                     * pmd threads (see 'concurrent_txq'). */
     int map;                        /* Mapping of configured vhost-user queues
                                      * to enabled by guest. */
+    uint64_t tx_contentions;        /* Number of times a pmd has been waiting
+                                     * for another to xmit on this queue. */
 };

 /* dpdk has no way to remove dpdk ring ethernet devices
@@ -2387,7 +2390,10 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
         goto out;
     }

-    rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
+    if (unlikely(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
+        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
+        atomic_count_inc64(&dev->tx_q[qid].tx_contentions);

I would swap the two lines above, as the atomic will probably finish before the lock is freed. See the sketch below.

+    }
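
Roughly what I mean (untested, same calls as in the patch, only reordered so the increment is issued before we block on the lock):

    if (unlikely(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
        /* Record the contention first; the atomic increment can then
         * complete while the other pmd still holds the lock. */
        atomic_count_inc64(&dev->tx_q[qid].tx_contentions);
        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
    }
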

     cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
     /* Check has QoS has been configured for the netdev */
@@ -2878,6 +2884,16 @@ netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
     custom_stats->counters[0].value = dev->tx_retries;
     rte_spinlock_unlock(&dev->stats_lock);

+    ovs_strlcpy(custom_stats->counters[1].name, VHOST_STAT_TX_CONTENTIONS,

Can we maybe do something smart with the index, [1], used below in each case (and above [0]), as it seems prone to error. See the sketch after this hunk for one option.

+                NETDEV_CUSTOM_STATS_NAME_SIZE);
+    custom_stats->counters[1].value = 0;
+    for (unsigned int i = 0; i < netdev->n_txq; i++) {
+        uint64_t tx_contentions;
+
+        atomic_read_relaxed(&dev->tx_q[i].tx_contentions, &tx_contentions);
+        custom_stats->counters[1].value += tx_contentions;
+    }
+
     ovs_mutex_unlock(&dev->mutex);

     return 0;
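
On the index point above: a rough, untested sketch of what I had in mind, reusing the names from the patch. A running local index avoids the hard-coded [0]/[1], and an ovs_assert against VHOST_CUSTOM_STATS_SIZE would catch a mismatch if more counters get added later:

    size_t idx = 0;                 /* Next free slot in counters[]. */

    ovs_strlcpy(custom_stats->counters[idx].name, VHOST_STAT_TX_RETRIES,
                NETDEV_CUSTOM_STATS_NAME_SIZE);
    rte_spinlock_lock(&dev->stats_lock);
    custom_stats->counters[idx].value = dev->tx_retries;
    rte_spinlock_unlock(&dev->stats_lock);
    idx++;

    ovs_strlcpy(custom_stats->counters[idx].name, VHOST_STAT_TX_CONTENTIONS,
                NETDEV_CUSTOM_STATS_NAME_SIZE);
    custom_stats->counters[idx].value = 0;
    for (unsigned int i = 0; i < netdev->n_txq; i++) {
        uint64_t tx_contentions;

        /* Sum the per-txq contention counters into a single statistic. */
        atomic_read_relaxed(&dev->tx_q[i].tx_contentions, &tx_contentions);
        custom_stats->counters[idx].value += tx_contentions;
    }
    idx++;

    ovs_assert(idx == VHOST_CUSTOM_STATS_SIZE);

Not a blocker, just an idea; the current version works too.
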
--
1.8.3.1