Hi Ian,

I'll send a new patch shortly.
One comment below.

Thanks for reviewing,
/Billy.

> -----Original Message-----
> From: Stokes, Ian
> Sent: Wednesday, May 10, 2017 1:19 PM
> To: O Mahony, Billy <[email protected]>; [email protected]
> Subject: RE: [ovs-dev] [PATCH v4] dpif-netdev: Assign ports to pmds on non-local numa node.
>
> > From: billyom <[email protected]>
> >
> > Previously, if there is no available (non-isolated) pmd on the numa node
> > for a port, then the port is not polled at all. This can result in a
> > non-operational system until such time as nics are physically
> > repositioned. It is preferable to operate with a pmd on the 'wrong' numa
> > node, albeit with lower performance. Local pmds are still chosen when
> > available.
>
> Thanks Billy, a few minor comments below.
>
> > Signed-off-by: Billy O'Mahony <[email protected]>
> > ---
> > v4: Fix a checkpatch error
> > v3: Fix warning messages not appearing when using multiqueue
> > v2: Add details of warning messages into docs
> >
> >  Documentation/intro/install/dpdk.rst | 10 +++++++++
> >  lib/dpif-netdev.c                    | 40 ++++++++++++++++++++++++++++++++----
> >  2 files changed, 46 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
> > index d1c0e65..ca73184 100644
> > --- a/Documentation/intro/install/dpdk.rst
> > +++ b/Documentation/intro/install/dpdk.rst
> > @@ -460,6 +460,16 @@ affinitized accordingly.
> >      pmd thread on a NUMA node is only created if there is at least one DPDK
> >      interface from that NUMA node added to OVS.
> >
> > +  .. note::
> > +   On NUMA systems PCI devices are also local to a NUMA node.  Rx queues for
> > +   a PCI device will be assigned to a pmd on its local NUMA node if
> > +   pmd-cpu-mask has created a pmd thread on that NUMA node.  If not, the
> > +   queue will be assigned to a pmd on a remote NUMA node.  This will result
> > +   in reduced maximum throughput on that device.  In the case such a queue
> > +   assignment is made, a warning message will be logged: "There's no
> > +   available (non isolated) pmd thread on numa node N. Queue Q on port P
> > +   will be assigned to a pmd on numa node M. Expect reduced performance."
>
> The warning message above is different to what is output now by the code
> (no mention of the core etc.).
> Can you update this?
>
> Also, a minor nit on formatting: there's a mix of single and double spaces
> at the start of new sentences throughout the paragraph, can you fix these?

[[BO'M]] The single spaces are used only within the description/quote of the
error message. Double spaces are used in the help text proper. Or at least
that is the intention. I'll double check.
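
[[BO'M]] For context, the selection logic the note describes boils down to
the following. This is only a standalone sketch with made-up names ("dpdk0",
pmds_per_numa, choose_numa_for_queue), not the actual OVS functions or data
structures:

    #include <stdio.h>

    /* Sketch only: number of non-isolated pmd threads per NUMA node. */
    #define N_NUMA_NODES 2
    static int pmds_per_numa[N_NUMA_NODES] = { 0, 2 };

    /* Pick a NUMA node whose pmds should poll queue 'qid' of a port that
     * is local to 'port_numa'. */
    static int
    choose_numa_for_queue(int port_numa, int qid, const char *port_name)
    {
        if (pmds_per_numa[port_numa] > 0) {
            return port_numa;            /* Local pmd available: prefer it. */
        }
        for (int n = 0; n < N_NUMA_NODES; n++) {
            if (pmds_per_numa[n] > 0) {
                printf("WARN: no non-isolated pmd on numa %d; queue %d on "
                       "port '%s' goes to a pmd on numa %d, expect reduced "
                       "performance\n", port_numa, qid, port_name, n);
                return n;                /* Fall back to a remote node. */
            }
        }
        printf("ERR: no non-isolated pmd anywhere; port '%s' not polled\n",
               port_name);
        return -1;
    }

    int
    main(void)
    {
        choose_numa_for_queue(0, 0, "dpdk0");    /* port local to numa 0 */
        return 0;
    }
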
>
> > +
> >  - QEMU vCPU thread Affinity
> >
> >    A VM performing simple packet forwarding or running complex packet pipelines
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index b3a0806..bcbd325 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -3149,10 +3149,13 @@ rr_numa_list_lookup(struct rr_numa_list *rr, int numa_id)
> >  }
> >
> >  static void
> > -rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr)
> > +rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr,
> > +                      int *all_numa_ids, unsigned all_numa_ids_sz,
> > +                      int *num_ids_written)
> >  {
> >      struct dp_netdev_pmd_thread *pmd;
> >      struct rr_numa *numa;
> > +    unsigned idx = 0;
> >
> >      hmap_init(&rr->numas);
> >
> > @@ -3170,7 +3173,11 @@ rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr)
> >          numa->n_pmds++;
> >          numa->pmds = xrealloc(numa->pmds, numa->n_pmds * sizeof *numa->pmds);
> >          numa->pmds[numa->n_pmds - 1] = pmd;
> > +
> > +        all_numa_ids[idx % all_numa_ids_sz] = pmd->numa_id;
> > +        idx++;
> >      }
> > +    *num_ids_written = idx;
> >  }
> >
> >  static struct dp_netdev_pmd_thread *
> > @@ -3202,8 +3209,15 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex)
> >  {
> >      struct dp_netdev_port *port;
> >      struct rr_numa_list rr;
> > +    int all_numa_ids [64];
> > +    int all_numa_ids_sz = sizeof all_numa_ids / sizeof all_numa_ids[0];
> > +    unsigned all_numa_ids_idx = 0;
> > +    int all_numa_ids_max_idx = 0;
> > +    int num_numa_ids = 0;
> >
> > -    rr_numa_list_populate(dp, &rr);
> > +    rr_numa_list_populate(dp, &rr, all_numa_ids, all_numa_ids_sz,
> > +                          &num_numa_ids);
> > +    all_numa_ids_max_idx = MIN(num_numa_ids - 1, all_numa_ids_sz - 1);
> >
> >      HMAP_FOR_EACH (port, node, &dp->ports) {
> >          struct rr_numa *numa;
> > @@ -3234,10 +3248,28 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex)
> >              }
> >          } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) {
> >              if (!numa) {
> > +                if (all_numa_ids_max_idx < 0) {
> > +                    VLOG_ERR("There is no available (non-isolated) pmd "
> > +                             "thread for port \'%s\'. This port will "
> > +                             "not be polled. Is pmd-cpu-mask set to "
> > +                             "zero? Or are all PMDs isolated to other "
> > +                             "queues?", netdev_get_name(port->netdev));
>
> Could you identify the queue in the ERR above? As it is now the ERR will
> repeat for each queue that cannot be assigned a pmd on the port without
> any discernible differences between the ERR messages.
>
> > +                    continue;
> > +                }
> > +                int alt_numa_id = all_numa_ids[all_numa_ids_idx];
> > +                struct rr_numa *alt_numa;
> > +                alt_numa = rr_numa_list_lookup(&rr, alt_numa_id);
> > +                q->pmd = rr_numa_get_pmd(alt_numa);
> >                  VLOG_WARN("There's no available (non isolated) pmd thread "
> >                            "on numa node %d. Queue %d on port \'%s\' will "
> > -                          "not be polled.",
> > -                          numa_id, qid, netdev_get_name(port->netdev));
> > +                          "be assigned to the pmd on core %d "
> > +                          "(numa node %d). Expect reduced performance.",
> > +                          numa_id, qid, netdev_get_name(port->netdev),
> > +                          q->pmd->core_id, q->pmd->numa_id);
> > +                all_numa_ids_idx++;
> > +                if (all_numa_ids_idx > all_numa_ids_max_idx) {
> > +                    all_numa_ids_idx = 0;
> > +                }
> >              } else {
> >                  q->pmd = rr_numa_get_pmd(numa);
> >              }
> > --
> > 2.7.4
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
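
P.S. In case the round-robin in the last hunk is hard to follow in the
wrapped diff: rr_numa_list_populate() now also records one NUMA id per pmd
via its extra out-parameters, and rxq_scheduling() cycles an index over that
list whenever a port's own NUMA node has no non-isolated pmd. A standalone
sketch of just that wrap-around (illustrative names only, not the OVS code):

    #include <stdio.h>

    int
    main(void)
    {
        /* NUMA ids recorded while walking the pmd list, one per pmd
         * (here: one pmd on node 1 and one on node 2). */
        int all_numa_ids[] = { 1, 2 };
        int num_ids = sizeof all_numa_ids / sizeof all_numa_ids[0];
        int max_idx = num_ids - 1;
        int idx = 0;

        /* Three queues on a port whose local NUMA node (0) has no pmd:
         * they are spread over the remote nodes in round-robin order. */
        for (int qid = 0; qid < 3; qid++) {
            printf("queue %d -> a pmd on numa node %d\n", qid, all_numa_ids[idx]);
            idx++;
            if (idx > max_idx) {
                idx = 0;    /* wrap, as the patch does with all_numa_ids_idx */
            }
        }
        return 0;
    }
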
