On 29.05.2017 18:00, Kevin Traynor wrote: > On 05/29/2017 02:31 PM, Ilya Maximets wrote: >> On 29.05.2017 16:26, Kevin Traynor wrote: >>> On 05/29/2017 01:22 PM, Ilya Maximets wrote: >>>> On 26.05.2017 20:14, Kevin Traynor wrote: >>>>> On 05/26/2017 03:55 PM, Ilya Maximets wrote: >>>>>> On 10.03.2017 07:27, Daniele Di Proietto wrote: >>>>>>> 2017-02-21 6:49 GMT-08:00 Ilya Maximets <i.maxim...@samsung.com>: >>>>>>>> Reconfiguration of HW NICs may lead to packet drops. >>>>>>>> In current model all physical ports will be reconfigured each >>>>>>>> time number of PMD threads changed. Since we not stopping >>>>>>>> threads on pmd-cpu-mask changes, this patch will help to further >>>>>>>> decrease port's downtime by setting the maximum possible number >>>>>>>> of wanted tx queues to avoid unnecessary reconfigurations. >>>>>>>> >>>>>>>> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com> >>>>>>> >>>>>>> I haven't thought this through a lot, but the last big series we pushed >>>>>>> on master went exactly in the opposite direction, i.e. created one txq >>>>>>> for each thread in the datapath. >>>>>>> >>>>>>> I thought this was a good idea because: >>>>>>> >>>>>>> * On some systems with hyperthreading we can have a lot of cpus (we >>>>>>> received >>>>>>> reports of systems with 72 cores). If you want to use only a couple >>>>>>> of cores >>>>>>> you're still forced to have a lot of unused txqs, which prevent you >>>>>>> from having >>>>>>> lockless tx. >>>>>>> * We thought that reconfiguring the number of pmds would not be a >>>>>>> frequent >>>>>>> operation. >>>>>>> >>>>>>> Why do you want to reconfigure the threads that often? Is it to be >>>>>>> able to support more throughput quickly? >>>>>> >>>>>> Yes. >>>>>> >>>>>>> In this case I think we shouldn't use the number of cpus, >>>>>>> but something else coming from the user, so that the administrator can >>>>>>> balance how >>>>>>> quickly pmd threads can be reconfigured vs how many txqs should be on >>>>>>> the system. >>>>>>> I'm not sure how to make this user friendly though. >>>>>>> >>>>>>> What do you think? >>>>>> >>>>>> Right now I'm thinking about combined solution which will respect >>>>>> both issues (too big number of TXQs and frequent device reconfiguration). >>>>>> I think, we can implement additional function to get port's limitations. >>>>>> For now we can request the maximum number of TX queues from netdev and >>>>>> use it while number of cores less then number of queues. >>>>>> Something like this: >>>>>> >>>>>> lib/netdev-dpdk.c: >>>>>> uint32_t netdev_dpdk_get_max_txqs(struct netdev *netdev) >>>>>> { >>>>>> struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); >>>>>> struct rte_eth_dev_info dev_info; >>>>>> >>>>>> ovs_mutex_lock(&dev->mutex); >>>>>> rte_eth_dev_info_get(dev->port_id, &dev_info); >>>>>> ovs_mutex_unlock(&dev->mutex); >>>>>> >>>>>> return dev_info.max_tx_queues; >>>>>> } >>>>>> >>>>>> lib/dpif-netdev.c:reconfigure_datapath(): >>>>>> >>>>>> <-----> >>>>>> max_tx_queues = netdev_get_max_txqs(port->netdev); >>>>>> number_of_threads = cmap_count(&dp->poll_threads); >>>>>> wanted_txqs = MAX(max_tx_queues, number_of_threads); >>>>>> <-----> >>>>>> >>>>>> In this implementation there will be no additional locking if number of >>>>>> threads less or equal to maximum possible number of tx queues in HW. >>>>>> >>>>>> What do you think? Ian? Darrell? >>>>>> >>>>> >>>>> I'm not sure about using rte_eth_dev_info_get() as IIRC there was >>>>> problems previously with it reporting a number, but then when you went >>>>> to configure them they were not all available depending on mode. That >>>>> was why the trial and error queue configure was put in. >>>>> >>>>> What about replacing 'max_tx_queues' above with a number like 16 that is >>>>> likely to be supported by the ports and unlikely be exceeded by >>>>> number_of_threads? >>>>> >>>>> Kevin. >>>> >>>> Hi Kevin. Thank you for reminding me of this issue. >>>> >>>> But I think that magic numbers is not a good solution. >>>> >>>> One issue in my first implementation is that desired number of queues is >>>> not actually the same as required number. >>>> >>>> How about something like this: >>>> <-----------------------------------------------------------------> >>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >>>> index 97f72d3..1a18e4f 100644 >>>> --- a/lib/dpif-netdev.c >>>> +++ b/lib/dpif-netdev.c >>>> @@ -3448,7 +3448,7 @@ reconfigure_datapath(struct dp_netdev *dp) >>>> { >>>> struct dp_netdev_pmd_thread *pmd; >>>> struct dp_netdev_port *port; >>>> - int wanted_txqs; >>>> + int needed_txqs, wanted_txqs; >>>> >>>> dp->last_reconfigure_seq = seq_read(dp->reconfigure_seq); >>>> >>>> @@ -3456,7 +3456,15 @@ reconfigure_datapath(struct dp_netdev *dp) >>>> * on the system and the user configuration. */ >>>> reconfigure_pmd_threads(dp); >>>> >>>> - wanted_txqs = cmap_count(&dp->poll_threads); >>>> + /* We need 1 Tx queue for each thread to avoid locking, but we will >>>> try >>>> + * to allocate the maximum possible value to minimize the number of >>>> port >>>> + * reconfigurations. */ >>>> + needed_txqs = cmap_count(&dp->poll_threads); >>>> + /* (n_cores + 1) is the maximum that we might need to have. >>>> + * Additional queue is for non-PMD threads. */ >>>> + wanted_txqs = ovs_numa_get_n_cores(); >>>> + ovs_assert(wanted_txqs != OVS_CORE_UNSPEC); >>>> + wanted_txqs++; >>>> >>>> /* The number of pmd threads might have changed, or a port can be new: >>>> * adjust the txqs. */ >>>> @@ -3469,9 +3477,13 @@ reconfigure_datapath(struct dp_netdev *dp) >>>> >>>> /* Check for all the ports that need reconfiguration. We cache this >>>> in >>>> * 'port->reconfigure', because netdev_is_reconf_required() can >>>> change at >>>> - * any time. */ >>>> + * any time. Also mark for reconfiguration all ports which will >>>> likely >>>> + * change their 'dynamic_txqs' parameter. It's required to stop using >>>> + * them before changing this setting. */ >>>> HMAP_FOR_EACH (port, node, &dp->ports) { >>>> - if (netdev_is_reconf_required(port->netdev)) { >>>> + if (netdev_is_reconf_required(port->netdev) >>>> + || (port->dynamic_txqs >>>> + != netdev_n_txq(port->netdev) < needed_txqs)) { >>> >>> I'm not quite sure why there needs to be a reconfigure here. It will >>> again try with the same wanted_txqs. Is it not enough to just update the >>> dynamic_txqs flag as is done later ? >> >> It's not enough because reconfiguration will not be requested if we're >> trying to set up same number of queues second time. >> > > My question is - why request a reconfigure at all based on needed_txqs? > > We already configured requesting wanted_txqs (=ovs_numa_get_n_cores()) > and now we would configure again requesting the same num wanted_txqs, > regardless of the new needed_txqs value. > > Could we just update the dynamic_txqs flag (as you have below) to > reflect any new difference between configured txqs and needed_txqs. > Maybe I missed some part? > > Kevin.
Ok. I got it. The reason is in fact that we need to remove these ports from PMD threads. It's simpler to mark ports here and allow 'pmd_remove_stale_ports' to remove them from threads. After that where will be no actual reconfiguration because 'port_reconfigure' will just check that nothing changed (branch /* Reconfiguration is unnecessary */). If we want to avoid calling 'port_reconfigure' we'll have to remove ports from threads using additional loop and handle them separately in reconfiguration loop. I think, my solution just simpler and not really slower because there will be no actual reconfiguration. What do you think? > >> See lib/netdev-dpdk.c:netdev_dpdk_set_tx_multiq() for details: >> >> if (dev->requested_n_txq == n_txq) { >> >> goto out; >> >> } >> >> >> >>>> port->need_reconfigure = true; >>>> } >>>> } >>>> @@ -3505,7 +3517,7 @@ reconfigure_datapath(struct dp_netdev *dp) >>>> seq_change(dp->port_seq); >>>> port_destroy(port); >>>> } else { >>>> - port->dynamic_txqs = netdev_n_txq(port->netdev) < wanted_txqs; >>>> + port->dynamic_txqs = netdev_n_txq(port->netdev) < needed_txqs; >>>> } >>>> } >>>> >>>> <-----------------------------------------------------------------> >>>> >>>> Best regards, Ilya Maximets. >>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Daniele >>>>>>> >>>>>>>> --- >>>>>>>> lib/dpif-netdev.c | 6 +++++- >>>>>>>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>>>>>> >>>>>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >>>>>>>> index 6e575ab..e2b4f39 100644 >>>>>>>> --- a/lib/dpif-netdev.c >>>>>>>> +++ b/lib/dpif-netdev.c >>>>>>>> @@ -3324,7 +3324,11 @@ reconfigure_datapath(struct dp_netdev *dp) >>>>>>>> * on the system and the user configuration. */ >>>>>>>> reconfigure_pmd_threads(dp); >>>>>>>> >>>>>>>> - wanted_txqs = cmap_count(&dp->poll_threads); >>>>>>>> + /* We need 1 Tx queue for each possible cpu core. */ >>>>>>>> + wanted_txqs = ovs_numa_get_n_cores(); >>>>>>>> + ovs_assert(wanted_txqs != OVS_CORE_UNSPEC); >>>>>>>> + /* And 1 Tx queue for non-PMD threads. */ >>>>>>>> + wanted_txqs++; >>>>>>>> >>>>>>>> /* The number of pmd threads might have changed, or a port can be >>>>>>>> new: >>>>>>>> * adjust the txqs. */ >>>>>>>> -- >>>>>>>> 2.7.4 >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> dev mailing list >>>>>> d...@openvswitch.org >>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>> >>>>> >>>>> >>>>> >>>>> >>> >>> >>> >>> > > > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev