On 15.01.2019 11:39, Ophir Munk wrote:
> Hi Ilya,
> Please find comments inline
> 
>> -----Original Message-----
>> From: Ilya Maximets <[email protected]>
>> Sent: Monday, January 14, 2019 12:11 PM
>> To: Ophir Munk <[email protected]>; [email protected]
>> Cc: Asaf Penso <[email protected]>; Ian Stokes <[email protected]>;
>> Shahaf Shuler <[email protected]>; Thomas Monjalon
>> <[email protected]>; Olga Shern <[email protected]>; Kevin
>> Traynor <[email protected]>
>> Subject: Re: [PATCH v4] netdev-dpdk: support port representors
>>
>> On 13.01.2019 18:13, Ophir Munk wrote:
>>> Hi Ilya,
>>> Thank you for your review.
>>> Please find comments inline.
>>>
>>>> -----Original Message-----
>>>> From: Ilya Maximets <[email protected]>
>>>> Sent: Thursday, January 10, 2019 1:32 PM
>>>> To: Ophir Munk <[email protected]>; [email protected]
>>>> Cc: Asaf Penso <[email protected]>; Ian Stokes
>>>> <[email protected]>; Shahaf Shuler <[email protected]>;
>> Thomas
>>>> Monjalon <[email protected]>; Olga Shern <[email protected]>;
>>>> Kevin Traynor <[email protected]>
>>>> Subject: Re: [PATCH v4] netdev-dpdk: support port representors
>>>>
>>>> On 08.01.2019 12:31, Ophir Munk wrote:
>>>>> Dpdk port representors were introduced in dpdk versions 18.xx.
>>>>> Prior to port representors there was a one-to-one relationship
>>>>> between an rte device (e.g. PCI bus) and an eth device (referenced
>>>>> as dpdk port id in OVS). With port representors the relationship
>>>>> becomes one-to-many rte device to eth devices.
>>>>> For example in [3] there are two devices (representors) using the
>>>>> same PCI physical address 0000:08:00.0:
>>>>> "0000:08:00.0,representor=[3]" and "0000:08:00.0,representor=[5]".
>>>>> This commit handles the new one-to-many relationship. For example,
>>>>> when one of the device port representors in [3] is closed - the PCI
>>>>> bus cannot be detached until the other device port representor is
>>>>> closed as well. OVS remains backward compatible by supporting dpdk
>>>>> legacy PCI ports which do not include port representors.
>>>>> Dpdk port representors related commits are listed in [1]. Dpdk port
>>>>> representors documentation appears in [2]. A sample configuration
>>>>> which uses two representors ports (the output of "ovs-vsctl show"
>>>>> command) is shown in [3].
>>>>> +
>>>>>  static int
>>>>>  vhost_common_construct(struct netdev *netdev)
>>>>>      OVS_REQUIRES(dpdk_mutex)
>>>>> @@ -1353,20 +1373,25 @@ static void
>>>>>  netdev_dpdk_destruct(struct netdev *netdev)  {
>>>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>> -    struct rte_eth_dev_info dev_info;
>>>>> +    struct rte_device *rte_dev;
>>>>>
>>>>>      ovs_mutex_lock(&dpdk_mutex);
>>>>>
>>>>>      rte_eth_dev_stop(dev->port_id);
>>>>>      dev->started = false;
>>>>> -
>>>>
>>>> Please, keep the empty line.
>>>>
>>>
>>> I will update in v5.
>>>
>>>>>      if (dev->attached) {
>>>>> +        /* Remove the port eth device. */
>>>>>          rte_eth_dev_close(dev->port_id);
>>>>> -        rte_eth_dev_info_get(dev->port_id, &dev_info);
>>>>> -        if (dev_info.device && !rte_dev_remove(dev_info.device)) {
>>>>> -            VLOG_INFO("Device '%s' has been detached", dev->devargs);
>>>>
>>>> Please keep this log message for the successful case.
>>>
>>> Please note the line below:
>>>    VLOG_INFO("Device '%s' has been removed", dev->devargs);
>>>
>>> This is the log message for the successful case. With the introduction of
>> representors several eth devices can use one PCI device.
>>> If you remove the first representor - its eth device is closed - but you do
>> not detach the PCI device till the second representor eth device is closed as
>> well.
>>> Would you suggest replacing "removed" with "closed" or "detached"?
>>
>> I understand what you're trying to do here. What I'm trying to say is that in
>> current version, message "has been removed" states that device was closed,
>> but we do not know if we tried to detach it.
>> We now have two levels of getting port out of OVS. First is "removing"
>> (rte_eth_dev_close) and the second is "detaching" (rte_dev_remove, kind
>> of confusing).
>> So, in current code we do not know if we detached the port or we skipped
>> detaching because of existence of other ports sharing the same device.
>>
> 
> In v5 I will inform if a device was removed (e.g. closed) or detached.
> 
>>>>
>>>> I'm not sure if 'netdev_dpdk_get_port_by_devargs' is the correct name
>>>> because it not only finds (gets) the port, but probes (attaches) it.
>>>>
>>>
>>> I think that the function main purpose is to find the port. Probing is 
>>> optional
>> for that purpose.
>>> I am good with the current name but I am open to any new name you
>> would like to suggest.
>>> Can you please suggest?
>>
>> IMHO, the main purpose of this function is to probe the port because
>> searching will succeed only if we're trying to add duplicated port.
>> 'netdev_dpdk_probe_port_by_devargs' should be better, IMHO.
>>
> 
> In v5 I will update the function name per your suggestion.
> 
>>>
>>>>> +{
>>>>> +    struct rte_dev_iterator iterator;
>>>>> +    dpdk_port_t port_id;
>>>>> +
>>>>> +    RTE_ETH_FOREACH_MATCHING_DEV(port_id, devargs, &iterator) {
>>>>
>>>
>>>> Invalid spacing around the FOREACH loop.
>>>> Same in all the other places.
>>>
>>> I do not understand this comment. Can you please explain or demo the fix?
>>
>> You just need an extra space before the '(' like this:
>>      'RTE_ETH_FOREACH_MATCHING_DEV ('
>>
> 
> I will update in v5
> 
>> I'll send a patch for checkpatch script to catch this. Right now it supports 
>> only
>> 'FOR_EACH' loops, not the 'FOREACH'.
>>
>>>
>>>>>      dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>>>>
>>>>>      if (strncmp(devargs, "class=eth,mac=", 14) == 0) {
>>>>>          new_port_id = netdev_dpdk_get_port_by_mac(&devargs[14]);
>>>>>      } else {
>>>>> -        name = xmemdup0(devargs, strcspn(devargs, ","));
>>>>> -        if (rte_eth_dev_get_port_by_name(name, &new_port_id)
>>>>> -                || !rte_eth_dev_is_valid_port(new_port_id)) {
>>>>> -            /* Device not found in DPDK, attempt to attach it */
>>>>> -            if (!rte_dev_probe(devargs)
>>>>> -                && !rte_eth_dev_get_port_by_name(name, &new_port_id)) {
>>>>> -                /* Attach successful */
>>>>> -                dev->attached = true;
>>>>> -                VLOG_INFO("Device '%s' attached to DPDK", devargs);
>>>>> -            } else {
>>>>> -                /* Attach unsuccessful */
>>>>> +        new_port_id = netdev_dpdk_get_port_by_devargs(devargs);
>>>>> +        if (!rte_eth_dev_is_valid_port(new_port_id)) {
>>>>> +            new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>>>> +        } else {
>>>>> +            struct netdev_dpdk *dup_dev;
>>>>> +
>>>>> +            dup_dev = netdev_dpdk_lookup_by_port_id(new_port_id);
>>>>> +            if (dup_dev) {
>>>>> +                VLOG_WARN_BUF(errp, "'%s' is trying to use device '%s' "
>>>>> +                               "which is already in use by '%s'",
>>>>> +                          netdev_get_name(&dev->up), devargs,
>>>>> +                          netdev_get_name(&dup_dev->up));
>>>>
>>>> Why you duplicating this part of code ?
>>>> Can we just replace 'rte_eth_dev_get_port_by_name' with
>>>> 'netdev_dpdk_get_port_by_devargs' and keep all the other logic ?
>>>>
>>>> netdev_dpdk_get_port_by_devargs should look like this in this case:
>>>>
>>>> /* Return the first DPDK port_id matching the devargs pattern. */
>>>> static bool netdev_dpdk_get_port_by_devargs(const char *devargs,
>>>> dpdk_port_t
>>>> *port_id)
>>>>     OVS_REQUIRES(dpdk_mutex)
>>>> {
>>>>     struct rte_dev_iterator iterator;
>>>>
>>>>     RTE_ETH_FOREACH_MATCHING_DEV (*port_id, devargs, &iterator) {
>>>>         /* If a break is done - must call rte_eth_iterator_cleanup. */
>>>>         rte_eth_iterator_cleanup(&iterator);
>>>>         break;
>>>>     }
>>>>     return *port_id != DPDK_ETH_PORT_ID_INVALID; }
>>>>
>>>> netdev_dpdk_process_devargs:
>>>>
>>>>         if (netdev_dpdk_get_port_by_devargs(devagrs, &new_port_id)
>>>>                 || !rte_eth_dev_is_valid_port(new_port_id)) {
>>>>             /* Device not found in DPDK, attempt to attach it */
>>>>             if (!rte_dev_probe(devargs)
>>>>                 && !netdev_dpdk_get_port_by_devargs(devargs,
>> &new_port_id)) {
>>>>                 /* Attach successful */
>>>>                 dev->attached = true;
>>>>                 VLOG_INFO("Device '%s' attached to DPDK", devargs);
>>>>             } else {
>>>>                 /* Attach unsuccessful */
>>>>                 new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>>>             }
>>>>         }
>>>>
>>>> If it works, I'd prefer this variant.
>>>>
>>>
>>> I don't think it works in case you add two different ports with the same
>> representor devargs (or the same PCI address with a PMD supporting
>> representors).
>>> The extra code is required to avoid this misconfiguration (using the same
>> device for two different ports).
>>> For example, the following configuration must fail:
>>>
>>> ovs-vsctl --no-wait add-port br1 port-one -- set Interface port-one
>>> type=dpdk options:dpdk-devargs=0000:08:00.0,representor=[1]
>>> ovs-vsctl --no-wait add-port br1 port-two -- set Interface port-two
>>> type=dpdk options:dpdk-devargs=0000:08:00.0,representor=[1]
>>>
>>> In such misconfiguration please note the waring you get:
>>>                 VLOG_WARN_BUF(errp, "'%s' is trying to use device '%s' "
>>>                                "which is already in use by '%s'",
>>>                           netdev_get_name(&dev->up), devargs,
>>>                           netdev_get_name(&dup_dev->up));
>>>
>>> An interesting question is why didn't we need this extra code in previous
>> OVS versions.
>>> The reason is that if we called rte_dev_probe() a second time for the same
>> PCI device - we got an error. So we were immune against this
>> misconfiguration.
>>> This was the behavior before the introduction of representors. With
>> representor - calling rte_dev_probe() twice for the same devargs will return
>> successfully, hence the need for an additional check (extra code).
>>
>> But we have the same check on the upper level. If you will not check here,
>> the new_port_id will go to netdev_dpdk_set_config(), where we have equal
>> check for duplicated ports. Why this doesn't work for you?
>>
> 
> I will explain why this does not work. Please note the following code snippet 
> from 
> netdev_dpdk_set_config():
> 
> 1    dpdk_port_t new_port_id = netdev_dpdk_process_devargs(dev, new_devargs, 
> errp)
> 2   if (!rte_eth_dev_is_valid_port(new_port_id)) {
> 3       err = EINVAL;
> 4   } else if (new_port_id == dev->port_id) {
> 5       /* Already configured, do not reconfigure again */
> 6       err = 0;
> 7   } else {
> 8       err = netdev_dpdk_check_dup_dev(new_port_id, netdev,
> 9                 new_devargs, errp);
> 10       if (!err) {
> 11          int sid = rte_eth_dev_socket_id(new_port_id);
> 12
> 13          dev->requested_socket_id = sid < 0 ? SOCKET0 : sid;
> 14          dev->devargs = xstrdup(new_devargs);
> 15          dev->port_id = new_port_id;
> 16          netdev_request_reconfigure(&dev->up);
> 17          netdev_dpdk_clear_xstats(dev);
> 18      }
> 19 }
> 
> Line 1: Before representors were introduced the check for a duplicated device 
> was implicitly done inside netdev_dpdk_process_devarags which returned an 
> error (reason: rte_dev_probe() was called twice on the same devargs and 
> failed).
> In such a case err is set to EINVAL in lines 2-3 and we do not reach lines 
> 4-19. With the introduction of representors netdev_dpdk_process_devarags() 
> may return the same 'new_port_id' if called again with the same devargs, or 
> may fail as before for PMDs which do not support representors.
> In case netdev_dpdk_process_devarags() is successful for the same devargs we 
> end with no errors (err = 0) in lines 4-6 .... hence we ignore the dup dev 
> case which results in accepting the same device for 2 different ports .... 
> which ends in seg fault.

But 'dev->port_id' should not be initialized yet at this point.
And the (new_port_id == dev->port_id) should not be true and we
should execute code in 'else' branch (8-18).

> In order to prevent a change in flow logic (dup dev should fail in 
> netdev_dpdk_process_devarags() as always has been) I moved the check of 
> dup_dev inside the call of netdev_dpdk_process_devarags(). 
> The check for dup dev in line 8 is redundant (probably has always been so).
> 
>>>
>>>>
>>>>> +        response = xasprintf("Device '%s' is being shared with other "
>>>>> +                             "interfaces. Remove them before detaching.",
>>>>> +                             argv[1]);
>>>>> +        goto error;
>>>>> +    }
>>>>>
>>>>> -    rte_eth_dev_info_get(port_id, &dev_info);
>>>>> -    if (!dev_info.device || rte_dev_remove(dev_info.device)) {
>>>>> -        response = xasprintf("Device '%s' can not be detached", argv[1]);
>>>>> +    rte_eth_dev_close(port_id);
>>>>> +    if (rte_dev_remove(rte_dev) < 0) {
>>>>> +        response = xasprintf("Device '%s' can not be removed",
>>>>> + argv[1]);
>>>>
>>>> /removed/detached/
>>>> You're using different words for equal situations. This is misleading.
>>>>
>>>
>>> I can change the wording to "detached'. However I think it is misleading in 
>>> a
>> different way:
>>> Assuming there are two representors using the same PCI bus - then
>>> "detaching" one representor will only close the eth device, but will not
>> detach the PCI device until the second representor is closed as well.
>>> When all eth devices are closed - please note the message below:
>>>    response = xasprintf("All devices shared with device '%s' "
>>>                          "have been detached", argv[1]);
>>>
>>> Having said that - do you still suggest replacing "removed" with
>> "detached"?
>>
>> I'm suggesting to separate two cases:
>>
>>  1. rte_eth_dev_close() executed.
>>  2. rte_dev_remove() executed.
>>
>> The first one we could call "Port '%s' closed", the second "All devices 
>> shared
>> with device '%s' detached". And as we have two places where this could
>> happen, in both places we should have equal messages for the equal events.
>> You may vary the wording, but, IMHO, it should be clear from the log which
>> of above events occurred. And we should log every our action (1 and 2) for
>> this purpose.
>> Otherwise it'll be hard to tell what devices are still attached to OVS.
>>
>>>
> 
> In function netdev_dpdk_detach() you have one of 2 cases:
> 
> 1. If you try to detach a device whose PCI is shared with other devides you 
> return an error
> response = xasprintf("Device '%s' is being shared with other "
>                              "interfaces. Remove them before detaching.",
>                              argv[1]);
> 
> 2. If you detach the last device using the PCI bus - you can really detach. 
> In this case you execute both 
> rte_eth_dev_close() and rte_dev_remove() at once and there is no separation 
> between the two.
> 
> In this case you  inform.
>     response = xasprintf("All devices shared with device '%s' "
>                           "have been detached", argv[1]);
> 
> It seems IMHO the correct way of informing.
> 
> 
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to