On 08/01/2017 08:58 AM, Kevin Traynor wrote:
Previously rxqs were assigned to pmds by round robin in
port/queue order.

Now that we have the processing cycles used for existing rxqs,
use that information to try and produce a better-balanced
distribution of rxqs across pmds, i.e. given multiple pmds, the
rxqs which have consumed the largest amount of processing cycles
will be placed on different pmds.

The rxqs are sorted by their processing cycles and assigned (in
sorted order) round robin across pmds.
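
As a standalone illustration of the idea (hypothetical rxq/pmd data, not the
dpif-netdev code itself): sort the rxqs by measured cycles, busiest first,
then deal them out round robin across the available pmds, e.g.:

    /* Illustrative sketch only: made-up types and values, showing
     * cycle-sorted round-robin assignment of rxqs to pmd cores. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct toy_rxq {
        int qid;            /* queue id */
        uint64_t cycles;    /* cycles measured in the last interval */
        int pmd;            /* assigned pmd core, filled in below */
    };

    /* Order rxqs by cycles, descending (busiest first). */
    static int
    toy_rxq_cmp(const void *a_, const void *b_)
    {
        const struct toy_rxq *a = a_;
        const struct toy_rxq *b = b_;

        return (a->cycles < b->cycles) - (a->cycles > b->cycles);
    }

    int
    main(void)
    {
        struct toy_rxq rxqs[] = {
            { 0, 500, -1 }, { 1, 9000, -1 }, { 2, 100, -1 },
            { 3, 7000, -1 }, { 4, 3000, -1 },
        };
        int pmds[] = { 6, 7 };          /* available pmd cores */
        size_t n_rxqs = sizeof rxqs / sizeof rxqs[0];
        size_t n_pmds = sizeof pmds / sizeof pmds[0];

        qsort(rxqs, n_rxqs, sizeof rxqs[0], toy_rxq_cmp);

        /* Round robin over the pmds in sorted order, so the heaviest
         * queues end up on different cores. */
        for (size_t i = 0; i < n_rxqs; i++) {
            rxqs[i].pmd = pmds[i % n_pmds];
            printf("rxq %d (cycles %6llu) -> pmd core %d\n",
                   rxqs[i].qid, (unsigned long long) rxqs[i].cycles,
                   rxqs[i].pmd);
        }
        return 0;
    }

With two pmds this puts the two busiest queues (9000 and 7000 cycles) on
different cores, which is the balancing effect the patch is after.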

Signed-off-by: Kevin Traynor <ktray...@redhat.com>
---
  Documentation/howto/dpdk.rst |  7 +++++
  lib/dpif-netdev.c            | 73 +++++++++++++++++++++++++++++++++++---------
  2 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index af01d3e..a969285 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -119,4 +119,11 @@ After that PMD threads on cores where RX queues was pinned will become
    thread.

+If pmd-rxq-affinity is not set for rxqs, they will be assigned to pmds (cores)
+automatically. Where known, the processing cycles previously required by each
+rxq will be used to assign the rxqs with the highest consumption of
+processing cycles to different pmds.
+
+Rxq to pmd assignment takes place whenever there are configuration changes.
+
  QoS
  ---
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 25a521a..a05e586 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3295,8 +3295,29 @@ rr_numa_list_destroy(struct rr_numa_list *rr)
  }

+/* Sort Rx Queues by the processing cycles they are consuming. */
+static int
+rxq_cycle_sort(const void *a, const void *b)
+{
+    struct dp_netdev_rxq * qa;
+    struct dp_netdev_rxq * qb;
+
+    qa = *(struct dp_netdev_rxq **) a;
+    qb = *(struct dp_netdev_rxq **) b;
+
+    if (dp_netdev_rxq_get_cycles(qa, RXQ_CYCLES_PROC_LAST) >=
+            dp_netdev_rxq_get_cycles(qb, RXQ_CYCLES_PROC_LAST)) {
+        return -1;
+    }
+
+    return 1;
+}
+
  /* Assign pmds to queues.  If 'pinned' is true, assign pmds to pinned
   * queues and marks the pmds as isolated.  Otherwise, assign non isolated
   * pmds to unpinned queues.
   *
+ * If 'pinned' is false, queues will be sorted by the processing cycles they
+ * are consuming and then assigned to pmds in round robin order.
+ *
   * The function doesn't touch the pmd threads, it just stores the assignment
   * in the 'pmd' member of each rxq. */
@@ -3306,18 +3327,14 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex)
      struct dp_netdev_port *port;
      struct rr_numa_list rr;
-
-    rr_numa_list_populate(dp, &rr);
+    struct dp_netdev_rxq ** rxqs = NULL;
+    int i, n_rxqs = 0;
+    struct rr_numa *numa = NULL;
+    int numa_id;

      HMAP_FOR_EACH (port, node, &dp->ports) {
-        struct rr_numa *numa;
-        int numa_id;
-
          if (!netdev_is_pmd(port->netdev)) {
              continue;
          }

-        numa_id = netdev_get_numa_id(port->netdev);
-        numa = rr_numa_list_lookup(&rr, numa_id);
-
          for (int qid = 0; qid < port->n_rxq; qid++) {
              struct dp_netdev_rxq *q = &port->rxqs[qid];
@@ -3337,17 +3354,45 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex)
                  }
              } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) {
-                if (!numa) {
-                    VLOG_WARN("There's no available (non isolated) pmd thread "
-                              "on numa node %d. Queue %d on port \'%s\' will "
-                              "not be polled.",
-                              numa_id, qid, netdev_get_name(port->netdev));
+                if (n_rxqs == 0) {
+                    rxqs = xmalloc(sizeof *rxqs);
                  } else {
-                    q->pmd = rr_numa_get_pmd(numa);
+                    rxqs = xrealloc(rxqs, sizeof *rxqs * (n_rxqs + 1));
                  }
+                /* Store the queue. */
+                rxqs[n_rxqs++] = q;
              }
          }
      }

+    if (n_rxqs > 1) {
+        /* Sort the queues in order of the processing cycles
+         * they consumed during their last pmd interval. */
+        qsort(rxqs, n_rxqs, sizeof *rxqs, rxq_cycle_sort);
+    }
+
+    rr_numa_list_populate(dp, &rr);
+    /* Assign the sorted queues to pmds in round robin. */
+    for (i = 0; i < n_rxqs; i++) {
+        numa_id = netdev_get_numa_id(rxqs[i]->port->netdev);
+        numa = rr_numa_list_lookup(&rr, numa_id);
+        if (!numa) {
+            VLOG_WARN("There's no available (non isolated) pmd thread "
+                      "on numa node %d. Queue %d on port \'%s\' will "
+                      "not be polled.",
+                      numa_id, netdev_rxq_get_queue_id(rxqs[i]->rx),
+                      netdev_get_name(rxqs[i]->port->netdev));
+            continue;
+        }
+        rxqs[i]->pmd = rr_numa_get_pmd(numa);
+        VLOG_INFO("Core %d on numa node %d assigned port \'%s\' "
+                  "rx queue %d (measured processing cycles %"PRIu64").",
+                  rxqs[i]->pmd->core_id, numa_id,
+                  netdev_rxq_get_name(rxqs[i]->rx),
+                  netdev_rxq_get_queue_id(rxqs[i]->rx),
+                  dp_netdev_rxq_get_cycles(rxqs[i], RXQ_CYCLES_PROC_LAST));

Kevin,

I've been reviewing and testing this code and found something odd.  The
measured processing cycles are always zero in my setup.

sample log output:

2017-08-08T12:48:25.871Z|00417|dpif_netdev|INFO|Core 6 on numa node 0 assigned port 'port-em2' rx queue 5 (measured processing cycles 10011304791).
2017-08-08T12:48:25.871Z|00418|dpif_netdev|INFO|Core 6 on numa node 0 assigned port 'port-em2' rx queue 4 (measured processing cycles 0).

Initially I configure my setup with 16 rxqs and a PMD CPU mask of 0x1FFFE.
Then I've been testing by running iperf traffic with multiple parallel
streams, 8 or 16 (the -P option), to allow the 'processing cycles' to count
up.  Or at least I think that's what should be happening.  But when I
reconfigure the rxqs and cpu mask, the processing cycles are always zero.

How are you testing this?  Perhaps it's just my test harness or something else.
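
Is there a recommended way to confirm that the cycle counters are actually
accumulating before the reconfiguration?  I'm assuming something along the
lines of:

ovs-appctl dpif-netdev/pmd-stats-show

should show non-zero processing cycles per pmd while the iperf traffic is
running, but correct me if that's not the right counter to look at.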

Initial setup:

ovs-vsctl set Interface port-em2 options:n_rxq=16
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1FFFE

(Note that I do not set affinity - from reading your patch I infer that this
is for cases without affinitization.)

After getting traffic I then run this setup:

ovs-vsctl set Interface port-em2 options:n_rxq=4
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1E
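
After the reconfiguration I would expect something like:

ovs-appctl dpif-netdev/pmd-rxq-show

to show the new cycle-based assignment, assuming that is the right command to
inspect it with.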

Any advice or comment?

Thanks,

- Greg

+    }
+
      rr_numa_list_destroy(&rr);
+    free(rxqs);
  }


