On 05/29/2017 01:22 PM, Ilya Maximets wrote:
> On 26.05.2017 20:14, Kevin Traynor wrote:
>> On 05/26/2017 03:55 PM, Ilya Maximets wrote:
>>> On 10.03.2017 07:27, Daniele Di Proietto wrote:
>>>> 2017-02-21 6:49 GMT-08:00 Ilya Maximets <i.maxim...@samsung.com>:
>>>>> Reconfiguration of HW NICs may lead to packet drops.
>>>>> In the current model all physical ports are reconfigured each
>>>>> time the number of PMD threads changes. Since we are not stopping
>>>>> threads on pmd-cpu-mask changes, this patch helps to further
>>>>> decrease port downtime by setting the maximum possible number
>>>>> of wanted tx queues to avoid unnecessary reconfigurations.
>>>>>
>>>>> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
>>>>
>>>> I haven't thought this through a lot, but the last big series we pushed
>>>> on master went exactly in the opposite direction, i.e. it created one
>>>> txq for each thread in the datapath.
>>>>
>>>> I thought this was a good idea because:
>>>>
>>>> * On some systems with hyperthreading we can have a lot of cpus (we
>>>>   received reports of systems with 72 cores). If you want to use only
>>>>   a couple of cores, you're still forced to have a lot of unused txqs,
>>>>   which prevent you from having lockless tx.
>>>> * We thought that reconfiguring the number of pmds would not be a
>>>>   frequent operation.
>>>>
>>>> Why do you want to reconfigure the threads that often? Is it to be
>>>> able to support more throughput quickly?
>>>
>>> Yes.
>>>
>>>> In this case I think we shouldn't use the number of cpus, but something
>>>> else coming from the user, so that the administrator can balance how
>>>> quickly pmd threads can be reconfigured vs how many txqs should be on
>>>> the system. I'm not sure how to make this user friendly though.
>>>>
>>>> What do you think?
>>>
>>> Right now I'm thinking about a combined solution which addresses both
>>> issues (too big a number of TXQs and frequent device reconfiguration).
>>> I think we can implement an additional function to get the port's
>>> limitations. For now we can request the maximum number of TX queues
>>> from the netdev and use it while the number of cores is less than the
>>> number of queues. Something like this:
>>>
>>> lib/netdev-dpdk.c:
>>> uint32_t netdev_dpdk_get_max_txqs(struct netdev *netdev)
>>> {
>>>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>     struct rte_eth_dev_info dev_info;
>>>
>>>     ovs_mutex_lock(&dev->mutex);
>>>     rte_eth_dev_info_get(dev->port_id, &dev_info);
>>>     ovs_mutex_unlock(&dev->mutex);
>>>
>>>     return dev_info.max_tx_queues;
>>> }
>>>
>>> lib/dpif-netdev.c:reconfigure_datapath():
>>>
>>> <----->
>>>     max_tx_queues = netdev_get_max_txqs(port->netdev);
>>>     number_of_threads = cmap_count(&dp->poll_threads);
>>>     wanted_txqs = MAX(max_tx_queues, number_of_threads);
>>> <----->
>>>
>>> In this implementation there will be no additional locking if the number
>>> of threads is less than or equal to the maximum possible number of tx
>>> queues in HW.
>>>
>>> What do you think? Ian? Darrell?
>>>
>>
>> I'm not sure about using rte_eth_dev_info_get() as IIRC there were
>> problems previously with it reporting a number, but then when you went
>> to configure the queues they were not all available depending on mode.
>> That was why the trial and error queue configure was put in.
>>
>> What about replacing 'max_tx_queues' above with a number like 16 that is
>> likely to be supported by the ports and unlikely to be exceeded by
>> number_of_threads?
>>
>> Kevin.
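[The "trial and error queue configure" Kevin refers to is the fallback where the advertised maximum is treated only as a starting point and configuration is retried with fewer queues until the PMD accepts it. Below is a minimal sketch of that idea; 'try_configure_txqs' is a hypothetical helper for illustration, not the actual netdev-dpdk code, and only rte_eth_dev_configure() is a real DPDK call.]

```c
#include <rte_ethdev.h>

/* Sketch: some PMDs advertise more queues in dev_info.max_tx_queues than
 * they can actually set up in the current mode, so treat a configure
 * failure as non-fatal and simply retry with fewer Tx queues. */
static int
try_configure_txqs(uint16_t port_id, uint16_t n_rxq, uint16_t wanted_txqs,
                   const struct rte_eth_conf *conf)
{
    uint16_t n_txq;

    for (n_txq = wanted_txqs; n_txq > 0; n_txq--) {
        if (rte_eth_dev_configure(port_id, n_rxq, n_txq, conf) == 0) {
            return n_txq;   /* This many Tx queues are really usable. */
        }
    }
    return -1;              /* The device could not be configured at all. */
}
```

[This is why a value reported by rte_eth_dev_info_get() can only serve as an upper bound for 'wanted_txqs', not as a guarantee.]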
> Hi Kevin. Thank you for reminding me of this issue.
>
> But I think that magic numbers are not a good solution.
>
> One issue in my first implementation is that the desired number of queues
> is not actually the same as the required number.
>
> How about something like this:
> <----------------------------------------------------------------->
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 97f72d3..1a18e4f 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -3448,7 +3448,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>  {
>      struct dp_netdev_pmd_thread *pmd;
>      struct dp_netdev_port *port;
> -    int wanted_txqs;
> +    int needed_txqs, wanted_txqs;
>
>      dp->last_reconfigure_seq = seq_read(dp->reconfigure_seq);
>
> @@ -3456,7 +3456,15 @@
>       * on the system and the user configuration. */
>      reconfigure_pmd_threads(dp);
>
> -    wanted_txqs = cmap_count(&dp->poll_threads);
> +    /* We need 1 Tx queue for each thread to avoid locking, but we will try
> +     * to allocate the maximum possible value to minimize the number of port
> +     * reconfigurations. */
> +    needed_txqs = cmap_count(&dp->poll_threads);
> +    /* (n_cores + 1) is the maximum that we might need to have.
> +     * Additional queue is for non-PMD threads. */
> +    wanted_txqs = ovs_numa_get_n_cores();
> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
> +    wanted_txqs++;
>
>      /* The number of pmd threads might have changed, or a port can be new:
>       * adjust the txqs. */
> @@ -3469,9 +3477,13 @@
>
>      /* Check for all the ports that need reconfiguration.  We cache this in
>       * 'port->reconfigure', because netdev_is_reconf_required() can change at
> -     * any time. */
> +     * any time.  Also mark for reconfiguration all ports which will likely
> +     * change their 'dynamic_txqs' parameter.  It's required to stop using
> +     * them before changing this setting. */
>      HMAP_FOR_EACH (port, node, &dp->ports) {
> -        if (netdev_is_reconf_required(port->netdev)) {
> +        if (netdev_is_reconf_required(port->netdev)
> +            || (port->dynamic_txqs
> +                != netdev_n_txq(port->netdev) < needed_txqs)) {
I'm not quite sure why there needs to be a reconfigure here. It will again
try with the same wanted_txqs. Is it not enough to just update the
dynamic_txqs flag as is done later?

>              port->need_reconfigure = true;
>          }
>      }
> @@ -3505,7 +3517,7 @@
>              seq_change(dp->port_seq);
>              port_destroy(port);
>          } else {
> -            port->dynamic_txqs = netdev_n_txq(port->netdev) < wanted_txqs;
> +            port->dynamic_txqs = netdev_n_txq(port->netdev) < needed_txqs;
>          }
>      }
>
> <----------------------------------------------------------------->
>
> Best regards, Ilya Maximets.
>
>>>
>>>> Thanks,
>>>>
>>>> Daniele
>>>>
>>>>> ---
>>>>>  lib/dpif-netdev.c | 6 +++++-
>>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>>> index 6e575ab..e2b4f39 100644
>>>>> --- a/lib/dpif-netdev.c
>>>>> +++ b/lib/dpif-netdev.c
>>>>> @@ -3324,7 +3324,11 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>>       * on the system and the user configuration. */
>>>>>      reconfigure_pmd_threads(dp);
>>>>>
>>>>> -    wanted_txqs = cmap_count(&dp->poll_threads);
>>>>> +    /* We need 1 Tx queue for each possible cpu core. */
>>>>> +    wanted_txqs = ovs_numa_get_n_cores();
>>>>> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
>>>>> +    /* And 1 Tx queue for non-PMD threads. */
>>>>> +    wanted_txqs++;
>>>>>
>>>>>      /* The number of pmd threads might have changed, or a port can be new:
>>>>>       * adjust the txqs. */
>>>>> --
>>>>> 2.7.4
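[On Kevin's inline question above: 'dynamic_txqs' is read on the transmit path to choose between a per-thread static queue id (lockless) and a dynamically shared queue id that must be locked, which is why Ilya's diff stops using a port, via reconfiguration, before the setting changes. A simplified sketch of that send-side choice follows; names and parameters are abbreviated from the dpif-netdev code of this era, so treat exact signatures as approximate.]

```c
/* Simplified sketch, not verbatim dpif-netdev.c: 'dynamic_txqs' selects the
 * Tx queue id and tells netdev_send() whether the queue may be accessed
 * concurrently and therefore has to be locked. */
static void
pmd_output_batch(struct dp_netdev_pmd_thread *pmd, struct tx_port *p,
                 struct dp_packet_batch *batch)
{
    bool concurrent_txq = p->port->dynamic_txqs;
    int tx_qid;

    if (concurrent_txq) {
        /* Fewer Tx queues than threads: pick a shared queue id (XPS). */
        tx_qid = dpif_netdev_xps_get_tx_qid(pmd, p);
    } else {
        /* One Tx queue per thread: lockless transmit. */
        tx_qid = pmd->static_tx_qid;
    }

    netdev_send(p->port->netdev, tx_qid, batch, true /* may_steal */,
                concurrent_txq);
}
```

[Because the locking mode is baked into every in-flight send, flipping the flag under running PMD threads would be unsafe; that matches the comment added in the diff: "It's required to stop using them before changing this setting."]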