On 05/04/18 15:17, Stokes, Ian wrote:
On 03/27/2018 04:07 PM, Stokes, Ian wrote:
On 27.03.2018 13:19, Stokes, Ian wrote:
It is possible to change LSC detection mode to polling or interrupt
mode for DPDK interfaces. The default is polling mode. To set
interrupt mode, option dpdk-lsc-interrupt has to be set to true.

In polling mode more processor time is needed, since OVS repeatedly
reads the link state at a short interval. This can lead to packet
loss on certain systems.

In interrupt mode the hardware itself triggers an interrupt when the
link state changes, so OVS needs less processing time.
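
To make the difference between the two modes concrete, here is a minimal,
self-contained sketch. It uses no DPDK or OVS code: struct mock_port and
the mock_* functions are hypothetical names invented for this
illustration. It assumes that with LSC interrupts enabled the periodic
check only reads a cached status, while in polling mode every check
queries the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a port; not a DPDK or OVS structure. */
struct mock_port {
    bool lsc_interrupt;   /* mirrors the dpdk-lsc-interrupt option */
    bool hw_link_up;      /* link state as the hardware knows it */
    bool cached_link_up;  /* last state recorded by the interrupt handler */
    unsigned hw_queries;  /* how often the device itself was queried */
};

/* The link changes; in interrupt mode the change is recorded once. */
static void mock_lsc_event(struct mock_port *p, bool link_up)
{
    p->hw_link_up = link_up;
    if (p->lsc_interrupt) {
        p->cached_link_up = link_up;
    }
}

/* Periodic link check, run by the switch in both modes. */
static bool mock_link_check(struct mock_port *p)
{
    if (p->lsc_interrupt) {
        return p->cached_link_up;   /* cheap: no device access */
    }
    p->hw_queries++;                /* polling: ask the device every time */
    return p->hw_link_up;
}

/* Run 'periods' checks with one link-down event halfway through and
 * return the state seen by the last check. */
static bool mock_run(struct mock_port *p, unsigned periods)
{
    bool up = false;

    assert(periods > 0);
    for (unsigned i = 0; i < periods; i++) {
        if (i == periods / 2) {
            mock_lsc_event(p, false);
        }
        up = mock_link_check(p);
    }
    return up;
}
```

Both modes end up seeing the link go down; the only difference in this
model is how many times the device itself had to be queried.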

For detailed description and usage see the dpdk install
Could you, please, better describe why we need this change?
Because we're not removing the polling thread. OVS will still poll
the link states periodically. This config option has no effect on that.
Also, link state polling in OVS uses the 'rte_eth_link_get_nowait()'
function, which will be called in both cases and should not wait for
a hardware reply in any implementation.
rte_eth_link_get_nowait() on Intel XL710 could take an excessive time to
respond. The following patch,
https://dpdk.org/ml/archives/dev/2018-March/092156.html
is taking care of it from the DPDK side.

There might be other drivers that also take a long time, hence this patch
might still be useful in the future.

I believe it was related to a case where bonded mode in active-backup
was causing packet drops due to the frequency at which LSC was being
polled. Using an interrupt-based approach alleviated the issue. (I'm open
to correction on this :))

@Robert/Eelco You may be able to shed some more light here and say
whether the patches below in DPDK resolve the issue?
This long delay can be an issue in bonding mode, as the link checks for
bonding interfaces hold the RW lock in bond_run(). This same lock is
taken in the PMD thread when calling bond_check_admissibility() for
upcall traffic.
There was a recent bug fix for Intel NICs that fixes waiting on an
admin queue on link state requests despite the 'no_wait' flag:
Will this fix your target case?

So, the difference in execution time of 'rte_eth_link_get_nowait()'
with enabled and disabled interrupts should not be so significant.
Do you have performance measurements? Measurement with above fix
I do not have delay numbers but I know that we were no longer seeing
dropped traffic compared to other NICs under the same load with upcall
traffic present.
Thanks for working on this Robert.

I've completed some testing including the case where LSC is not
supported, in which case the port will remain in a down state and
fail rx/tx traffic. This behavior conforms to the netdev_reconfigure
expectations in the fail case so that's ok.

I'm not sure if this is acceptable. For example, we're not failing
reconfiguration in case of issues with the number of queues. We're trying
different numbers until we have a working configuration.
Maybe we need the same fall-back mechanism in case LSC interrupts are
not supported? (MTU setup errors are really uncommon, unlike missing LSC
support in PMDs).
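
A fall-back of this kind could look roughly like the sketch below. It
is self-contained and does not call DPDK: struct mock_dev,
mock_dev_configure() and mock_configure_with_fallback() are hypothetical
names, and the mock configure call simply returns -EINVAL for every
rejected request, as assumed for the real rte_eth_dev_configure(). The
idea is to retry once in polling mode and report which mode was used.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; the real call is rte_eth_dev_configure(). */
struct mock_dev {
    bool supports_lsc;   /* would correspond to RTE_ETH_DEV_INTR_LSC */
    int max_queues;
};

/* Every rejected request yields -EINVAL, so the caller cannot tell
 * from the return value alone which setting was refused. */
static int mock_dev_configure(const struct mock_dev *dev, int n_queues,
                              bool want_lsc)
{
    if (n_queues < 1 || n_queues > dev->max_queues) {
        return -EINVAL;
    }
    if (want_lsc && !dev->supports_lsc) {
        return -EINVAL;
    }
    return 0;
}

/* Try the requested mode first; on -EINVAL retry once in polling mode.
 * '*lsc_used' reports whether interrupt mode ended up being configured. */
static int mock_configure_with_fallback(const struct mock_dev *dev,
                                        int n_queues, bool want_lsc,
                                        bool *lsc_used)
{
    int err;

    assert(lsc_used != NULL);
    err = mock_dev_configure(dev, n_queues, want_lsc);
    if (err == -EINVAL && want_lsc) {
        want_lsc = false;
        err = mock_dev_configure(dev, n_queues, want_lsc);
    }
    *lsc_used = (err == 0) && want_lsc;
    return err;
}
```

If the retry without LSC also fails, the error is returned unchanged, so
other configuration problems are still reported to the caller.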
Thanks for raising this Ilya.

I thought of this as well. I'd like to see a fall back to the PMD but
didn’t see how it could be done in a clean way.
Unfortunately rte_eth_dev_configure() returns -EINVAL when lsc mode is
requested but not supported.
It doesn't give us a clue if the error is related to lsc mode as it
could also relate to a number of other configure issues such as
nb_rxq/nb_txq/portid etc.
It would be better if we could query the device via ethdev api to see if
it supports lsc interrupt mode but that’s not available currently.
Maybe a DPDK patch before we continue?
It's hard to say. As part of the call to rte_eth_dev_configure() there is a
check specifically to see if lsc interrupt is supported, with the following:

        /* Check that the device supports requested interrupts */
        if ((dev_conf->intr_conf.lsc == 1) &&
                (!(dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC))) {
                        RTE_PMD_DEBUG_TRACE("driver %s does not support lsc\n",
                                        dev->device->driver->name);
                        return -EINVAL;
        }

Even if a patch was submitted to extend the API in DPDK to allow this check 
specifically, I feel it will be the same code as above, just in a new function 
name. The check would remain in rte_eth_dev_configure() anyway.
I was more hinting at the fact that the only way to get this device
information is through some global data set,
rte_eth_devices[port_id].data->dev_flags; it would be nice if a clean API
existed. But if we are ok with it, we can access it directly from OVS.

We could do something similar in OVS for now to allow PMD fallback where it's 
not supported.
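
As a rough illustration of that check, the sketch below tests a
capability bit before asking for LSC interrupts. It is self-contained
and not OVS or DPDK code: MOCK_DEV_INTR_LSC stands in for
RTE_ETH_DEV_INTR_LSC (the value used here is only for the sketch), and
struct mock_dev_data, mock_lsc_supported() and mock_pick_lsc_mode() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for RTE_ETH_DEV_INTR_LSC; value chosen for this sketch only. */
#define MOCK_DEV_INTR_LSC 0x0002u

/* Stand-in for the per-device data that carries dev_flags. */
struct mock_dev_data {
    uint32_t dev_flags;
};

/* True if the device advertises link state change interrupts. */
static bool mock_lsc_supported(const struct mock_dev_data *data)
{
    assert(data != NULL);
    return (data->dev_flags & MOCK_DEV_INTR_LSC) != 0;
}

/* Value to request for intr_conf.lsc: 1 only if the user asked for
 * interrupt mode and the device supports it, otherwise 0 (polling). */
static int mock_pick_lsc_mode(const struct mock_dev_data *data,
                              bool requested)
{
    return (requested && mock_lsc_supported(data)) ? 1 : 0;
}
```

With a check like this done before configuration, a device without LSC
support would be configured in polling mode instead of failing with
-EINVAL.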

