On 29.05.2017 16:26, Kevin Traynor wrote:
> On 05/29/2017 01:22 PM, Ilya Maximets wrote:
>> On 26.05.2017 20:14, Kevin Traynor wrote:
>>> On 05/26/2017 03:55 PM, Ilya Maximets wrote:
>>>> On 10.03.2017 07:27, Daniele Di Proietto wrote:
>>>>> 2017-02-21 6:49 GMT-08:00 Ilya Maximets <[email protected]>:
>>>>>> Reconfiguration of HW NICs may lead to packet drops.
>>>>>> In the current model, all physical ports are reconfigured each
>>>>>> time the number of PMD threads changes. Since we are not stopping
>>>>>> threads on pmd-cpu-mask changes, this patch helps to further
>>>>>> decrease a port's downtime by setting the maximum possible number
>>>>>> of wanted tx queues to avoid unnecessary reconfigurations.
>>>>>>
>>>>>> Signed-off-by: Ilya Maximets <[email protected]>
>>>>>
>>>>> I haven't thought this through a lot, but the last big series we pushed
>>>>> on master went exactly in the opposite direction, i.e. created one txq
>>>>> for each thread in the datapath.
>>>>>
>>>>> I thought this was a good idea because:
>>>>>
>>>>> * On some systems with hyperthreading we can have a lot of cpus (we
>>>>>    received reports of systems with 72 cores). If you want to use only a
>>>>>    couple of cores you're still forced to have a lot of unused txqs,
>>>>>    which prevent you from having lockless tx.
>>>>> * We thought that reconfiguring the number of pmds would not be a frequent
>>>>>    operation.
>>>>>
>>>>> Why do you want to reconfigure the threads that often?  Is it to be
>>>>> able to support more throughput quickly?
>>>>
>>>> Yes.
>>>>
>>>>> In this case I think we shouldn't use the number of cpus, but something
>>>>> else coming from the user, so that the administrator can balance how
>>>>> quickly pmd threads can be reconfigured vs how many txqs should be on
>>>>> the system.  I'm not sure how to make this user-friendly though.
>>>>>
>>>>> What do you think?
>>>>
>>>> Right now I'm thinking about a combined solution which addresses both
>>>> issues (too big a number of TXQs and frequent device reconfigurations).
>>>> I think we can implement an additional function to get a port's
>>>> limitations. For now we can request the maximum number of TX queues from
>>>> the netdev and use it while the number of cores is less than the number
>>>> of queues. Something like this:
>>>>    
>>>> lib/netdev-dpdk.c:
>>>> /* Returns the maximum number of Tx queues supported by the device. */
>>>> uint32_t netdev_dpdk_get_max_txqs(struct netdev *netdev)
>>>> {
>>>>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>     struct rte_eth_dev_info dev_info;
>>>>
>>>>     ovs_mutex_lock(&dev->mutex);
>>>>     rte_eth_dev_info_get(dev->port_id, &dev_info);
>>>>     ovs_mutex_unlock(&dev->mutex);
>>>>
>>>>     return dev_info.max_tx_queues;
>>>> }
>>>>
>>>> lib/dpif-netdev.c:reconfigure_datapath():
>>>>
>>>>     <----->
>>>>     max_tx_queues = netdev_get_max_txqs(port->netdev);
>>>>     number_of_threads = cmap_count(&dp->poll_threads);
>>>>     wanted_txqs = MAX(max_tx_queues, number_of_threads);
>>>>     <----->
>>>>
>>>> In this implementation there will be no additional locking if the number
>>>> of threads is less than or equal to the maximum possible number of tx
>>>> queues in HW.
>>>>
>>>> What do you think? Ian? Darrell?
>>>>
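
For completeness, the sketch above also needs a generic wrapper in
lib/netdev.c to dispatch to the provider.  A rough sketch, assuming a new
optional 'get_max_txqs' callback is added to struct netdev_class (the
callback is hypothetical, it does not exist in the tree today):

uint32_t
netdev_get_max_txqs(struct netdev *netdev)
{
    /* Fall back to the currently configured number of Tx queues for
     * netdev classes that do not implement the new callback. */
    return (netdev->netdev_class->get_max_txqs
            ? netdev->netdev_class->get_max_txqs(netdev)
            : netdev->n_txq);
}
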
>>>
>>> I'm not sure about using rte_eth_dev_info_get() as IIRC there were
>>> problems previously with it reporting a number, but then when you went
>>> to configure that many queues they were not all available depending on
>>> mode. That was why the trial-and-error queue configuration was put in.
>>>
>>> What about replacing 'max_tx_queues' above with a number like 16 that is
>>> likely to be supported by the ports and unlikely to be exceeded by
>>> number_of_threads?
>>>
>>> Kevin.
>>
>> Hi Kevin. Thank you for reminding me of this issue.
>>
>> But I think that magic numbers are not a good solution.
>>
>> One issue in my first implementation is that the desired number of queues
>> is not actually the same as the required number.
>>
>> How about something like this:
>> <----------------------------------------------------------------->
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 97f72d3..1a18e4f 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -3448,7 +3448,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>>  {
>>      struct dp_netdev_pmd_thread *pmd;
>>      struct dp_netdev_port *port;
>> -    int wanted_txqs;
>> +    int needed_txqs, wanted_txqs;
>>  
>>      dp->last_reconfigure_seq = seq_read(dp->reconfigure_seq);
>>  
>> @@ -3456,7 +3456,15 @@ reconfigure_datapath(struct dp_netdev *dp)
>>       * on the system and the user configuration. */
>>      reconfigure_pmd_threads(dp);
>>  
>> -    wanted_txqs = cmap_count(&dp->poll_threads);
>> +    /* We need 1 Tx queue for each thread to avoid locking, but we will try
>> +     * to allocate the maximum possible value to minimize the number of port
>> +     * reconfigurations. */
>> +    needed_txqs = cmap_count(&dp->poll_threads);
>> +    /* (n_cores + 1) is the maximum that we might need to have.
>> +     * The additional queue is for non-PMD threads. */
>> +    wanted_txqs = ovs_numa_get_n_cores();
>> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
>> +    wanted_txqs++;
>>  
>>      /* The number of pmd threads might have changed, or a port can be new:
>>       * adjust the txqs. */
>> @@ -3469,9 +3477,13 @@ reconfigure_datapath(struct dp_netdev *dp)
>>  
>>      /* Check for all the ports that need reconfiguration.  We cache this in
>>       * 'port->reconfigure', because netdev_is_reconf_required() can change at
>> -     * any time. */
>> +     * any time.  Also mark for reconfiguration all ports which will likely
>> +     * change their 'dynamic_txqs' parameter. It's required to stop using
>> +     * them before changing this setting. */
>>      HMAP_FOR_EACH (port, node, &dp->ports) {
>> -        if (netdev_is_reconf_required(port->netdev)) {
>> +        if (netdev_is_reconf_required(port->netdev)
>> +            || (port->dynamic_txqs
>> +                !=  netdev_n_txq(port->netdev) < needed_txqs)) {
> 
> I'm not quite sure why there needs to be a reconfigure here. It will
> again try with the same wanted_txqs. Is it not enough to just update the
> dynamic_txqs flag as is done later?

It's not enough because reconfiguration will not be requested if we're
trying to set up the same number of queues a second time.

See lib/netdev-dpdk.c:netdev_dpdk_set_tx_multiq() for details:

     if (dev->requested_n_txq == n_txq) {
         goto out;
     }
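
For more context, the whole function looks roughly like this:

static int
netdev_dpdk_set_tx_multiq(struct netdev *netdev, int n_txq)
{
    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);

    ovs_mutex_lock(&dev->mutex);

    /* Early return: the requested Tx queue count is already what the
     * caller asks for, so no reconfiguration is requested. */
    if (dev->requested_n_txq == n_txq) {
        goto out;
    }

    dev->requested_n_txq = n_txq;
    netdev_request_reconfigure(netdev);

out:
    ovs_mutex_unlock(&dev->mutex);
    return 0;
}

Because of that early return, netdev_request_reconfigure() is never called
when the queue count is unchanged, which is why reconfigure_datapath() has
to mark such ports for reconfiguration explicitly.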


 
>>              port->need_reconfigure = true;
>>          }
>>      }
>> @@ -3505,7 +3517,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>>              seq_change(dp->port_seq);
>>              port_destroy(port);
>>          } else {
>> -            port->dynamic_txqs = netdev_n_txq(port->netdev) < wanted_txqs;
>> +            port->dynamic_txqs = netdev_n_txq(port->netdev) < needed_txqs;
>>          }
>>      }
>>  
>> <----------------------------------------------------------------->
>>
>> Best regards, Ilya Maximets.
>>
>>>>
>>>>> Thanks,
>>>>>
>>>>> Daniele
>>>>>
>>>>>> ---
>>>>>>  lib/dpif-netdev.c | 6 +++++-
>>>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>>>> index 6e575ab..e2b4f39 100644
>>>>>> --- a/lib/dpif-netdev.c
>>>>>> +++ b/lib/dpif-netdev.c
>>>>>> @@ -3324,7 +3324,11 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>>>       * on the system and the user configuration. */
>>>>>>      reconfigure_pmd_threads(dp);
>>>>>>
>>>>>> -    wanted_txqs = cmap_count(&dp->poll_threads);
>>>>>> +    /* We need 1 Tx queue for each possible cpu core. */
>>>>>> +    wanted_txqs = ovs_numa_get_n_cores();
>>>>>> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
>>>>>> +    /* And 1 Tx queue for non-PMD threads. */
>>>>>> +    wanted_txqs++;
>>>>>>
>>>>>>      /* The number of pmd threads might have changed, or a port can be new:
>>>>>>       * adjust the txqs. */
>>>>>> --
>>>>>> 2.7.4
>>>>>>
>>>>>
>>>>>
>>>>>