Not a full review. Comments inline. Best regards, Ilya Maximets.
On 24.01.2018 17:35, Róbert Mulik wrote: > It is possible to change LSC detection mode to polling or interrupt mode > for DPDK interfaces. The default is polling mode. To set interrupt mode, > option dpdk-lsc-interrupt has to be set to true. > > In polling mode more processor time is needed, since the OVS repeatedly reads > the link state with a short period. It can lead to packet loss for certain > systems. > > In interrupt mode the hardware itself triggers an interrupt when link state > change happens, so less processing time needs for the OVS. It is not possible > to enable the interrupt mode on all hardware. > > For detailed description and usage see the dpdk install documentation. > > Signed-off-by: Robert Mulik <robert.mu...@ericsson.com> > --- > Documentation/intro/install/dpdk.rst | 48 +++++++++++++++++++++++++++++++++++ > lib/netdev-dpdk.c | 49 > +++++++++++++++++++++++++++++++++--- > lib/netdev-dpdk.h | 2 ++ > vswitchd/bridge.c | 8 ++++++ > vswitchd/vswitch.xml | 45 +++++++++++++++++++++++++++++++++ > 5 files changed, 149 insertions(+), 3 deletions(-) > > -- > 1.9.1 > > diff --git a/Documentation/intro/install/dpdk.rst > b/Documentation/intro/install/dpdk.rst > index 040e62e..806a7af 100644 > --- a/Documentation/intro/install/dpdk.rst > +++ b/Documentation/intro/install/dpdk.rst > @@ -626,6 +626,54 @@ The average number of packets per output batch can be > checked in PMD stats:: > > $ ovs-appctl dpif-netdev/pmd-stats-show > > +Link State Change (LSC) detection configuration > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +There are two methods to get the information when Link State Change (LSC) > +happens on a network interface: by polling or interrupt. > + > +With polling method, a process is running in the background and repeatedly > +reads the link state with a short period. It continuously needs processor > time > +and between 2 reading periods it can`t see the link state change, therefore > +the reaction time depends on the polling period. With higher rate, more > +processor time is needed. Another problem with the poll mode is that on some > +hardware a polling cycle takes too much time, which (in the end) leads to > +packet loss for certain systems. > + > +If interrupts are used to get LSC information, the hardware itself triggers > +an interrupt when link state change happens, the thread wakes up from sleep, > +updates the information, and goes back to sleep mode. When no link state > +change happens (most of the time), the thread remains in sleep mode and > +doesn`t use processor time at all. The disadvantage of this method is that > +when interrupt happens, the processor has to handle it immediately, so it > +puts the currently running process to background, handles the interrupt, and > +takes the background process back. Another disadvantage is that some hardware > +can`t be configured to generate LSC interrupts. > + > +The default configuration is polling mode. To set interrupt mode, option > +dpdk-lsc-interrupt has to be set to true. > + > +Global settings > + > +Command to set interrupt mode for all interfaces: > +ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=true > + > +Command to set polling mode for all interfaces: > +ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=false > +or: > +ovs-vsctl remove Open_vSwitch . other_config dpdk-lsc-interrupt > + > +Interface specific settings (override global settings) > + > +Command to set interrupt mode for a specific interface: > +ovs-vsctl set interface <interface_name> options:dpdk-lsc-interrupt=true > + > +Command to set polling mode for a specific interface: > +ovs-vsctl set interface <interface_name> options:dpdk-lsc-interrupt=false > + > +Command to reset to globally defined mode for a specific interface: > +ovs-vsctl remove interface <interface_name> options dpdk-lsc-interrupt > + > Limitations > ------------ > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index ac2e38e..6643ea9 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -148,7 +148,7 @@ typedef uint16_t dpdk_port_t; > #define VHOST_ENQ_RETRY_NUM 8 > #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) > > -static const struct rte_eth_conf port_conf = { > +static struct rte_eth_conf port_conf = { > .rxmode = { > .mq_mode = ETH_MQ_RX_RSS, > .split_hdr_size = 0, > @@ -167,6 +167,10 @@ static const struct rte_eth_conf port_conf = { > .txmode = { > .mq_mode = ETH_MQ_TX_NONE, > }, > + .intr_conf = { > + /* LSC interrupt mode disabled, polling mode used. */ > + .lsc = 0, > + }, > }; > > /* > @@ -433,6 +437,12 @@ struct netdev_dpdk { > /* DPDK-ETH hardware offload features, > * from the enum set 'dpdk_hw_ol_features' */ > uint32_t hw_ol_features; > + > + /* Properties for link state change detection mode. > + * If lsc_interrupt_mode is set to false, poll mode is used, > + * otherwise interrupt mode is used. */ > + bool requested_lsc_interrupt_mode; > + bool lsc_interrupt_mode; > ); > > PADDED_MEMBERS(CACHE_LINE_SIZE, > @@ -457,6 +467,18 @@ int netdev_dpdk_get_vid(const struct netdev_dpdk *dev); > struct ingress_policer * > netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev); > > +void > +netdev_dpdk_set_default_lsc_detect_mode(bool interrupt_mode) > +{ > + port_conf.intr_conf.lsc = interrupt_mode; > +} > + > +bool > +netdev_dpdk_get_default_lsc_detect_mode(void)> +{ > + return port_conf.intr_conf.lsc; > +} Names of these functions are incorrect. IMHO, netdev_dpdk_get_default_lsc_detect_mode() should return "interrupt" or "polling", but netdev_dpdk_get_default_lsc_interrupt_mode() or netdev_dpdk_get_default_lsc_interrupt_mode_enabled() should return "true" or "false". All the variants are too long. Can we make them shorter without degrading readability? > + > static bool > is_dpdk_class(const struct netdev_class *class) > { > @@ -684,12 +706,14 @@ dpdk_watchdog(void *dummy OVS_UNUSED) > } > > static int > -dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq) > +dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq) > { > int diag = 0; > int i; > struct rte_eth_conf conf = port_conf; > > + conf.intr_conf.lsc = dev->lsc_interrupt_mode; > + > /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly > * enabled. */ > if (dev->mtu > ETHER_MTU) { > @@ -799,7 +823,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq); > n_txq = MIN(info.max_tx_queues, dev->up.n_txq); > > - diag = dpdk_eth_dev_queue_setup(dev, n_rxq, n_txq); > + diag = dpdk_eth_dev_port_config(dev, n_rxq, n_txq); > if (diag) { > VLOG_ERR("Interface %s(rxq:%d txq:%d) configure error: %s", > dev->up.name, n_rxq, n_txq, rte_strerror(-diag)); > @@ -895,6 +919,8 @@ common_construct(struct netdev *netdev, dpdk_port_t > port_no, > dev->flags = 0; > dev->requested_mtu = ETHER_MTU; > dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu); > + dev->requested_lsc_interrupt_mode = > + netdev_dpdk_get_default_lsc_detect_mode(); > ovsrcu_index_init(&dev->vid, -1); > dev->vhost_reconfigured = false; > dev->attached = false; > @@ -1469,6 +1495,20 @@ netdev_dpdk_set_config(struct netdev *netdev, const > struct smap *args, > "The old 'dpdk<port_id>' names are not > supported", > netdev_get_name(netdev)); > err = EINVAL; > + goto out; > + } > + > + struct smap_node *node = smap_get_node(args, "dpdk-lsc-interrupt"); > + bool lsc_interrupt_mode; > + > + if (node) { > + lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", > false); > + } else { > + lsc_interrupt_mode = netdev_dpdk_get_default_lsc_detect_mode(); > + } Above code could be replaced by just: lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", dpdk_get_default_lsc_detect_mode()); Second thought here: Why we're checking this value before checking the 'err'? > + if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) { > + dev->requested_lsc_interrupt_mode = lsc_interrupt_mode; > + netdev_request_reconfigure(netdev); > } > > if (err) { > @@ -3482,6 +3522,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > if (netdev->n_txq == dev->requested_n_txq > && netdev->n_rxq == dev->requested_n_rxq > && dev->mtu == dev->requested_mtu > + && dev->lsc_interrupt_mode == dev->requested_lsc_interrupt_mode > && dev->rxq_size == dev->requested_rxq_size > && dev->txq_size == dev->requested_txq_size > && dev->socket_id == dev->requested_socket_id) { > @@ -3497,6 +3538,8 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > goto out; > } > > + dev->lsc_interrupt_mode = dev->requested_lsc_interrupt_mode; > + > netdev->n_txq = dev->requested_n_txq; > netdev->n_rxq = dev->requested_n_rxq; > > diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h > index b7d02a7..5440743 100644 > --- a/lib/netdev-dpdk.h > +++ b/lib/netdev-dpdk.h > @@ -27,6 +27,8 @@ struct dp_packet; > > void netdev_dpdk_register(void); > void free_dpdk_buf(struct dp_packet *); > +void netdev_dpdk_set_default_lsc_detect_mode(bool interrupt_mode); > +bool netdev_dpdk_get_default_lsc_detect_mode(void); > > #else > > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c > index f44f950..86c88e7 100644 > --- a/vswitchd/bridge.c > +++ b/vswitchd/bridge.c > @@ -68,6 +68,9 @@ > #include "lib/vswitch-idl.h" > #include "xenserver.h" > #include "vlan-bitmap.h" > +#ifdef DPDK_NETDEV > +#include "./lib/netdev-provider.h" > +#endif > > VLOG_DEFINE_THIS_MODULE(bridge); > > @@ -601,6 +604,11 @@ bridge_reconfigure(const struct ovsrec_open_vswitch > *ovs_cfg) > ofproto_set_vlan_limit(smap_get_int(&ovs_cfg->other_config, "vlan-limit", > LEGACY_MAX_VLAN_HEADERS)); > > +#ifdef DPDK_NETDEV > + netdev_dpdk_set_default_lsc_detect_mode( > + smap_get_bool(&ovs_cfg->other_config, "dpdk-lsc-interrupt", false)); > +#endif Is there a reason why we should be able to configure default value in runtime? I think that we could make this boot-time option and move all the related code to lib/dpdk.{c,h} just like for vhost iommu support. IMHO, user should know if most of his HW NICs supports LSC interrupt mode or not before starting the OVS. > + > ofproto_set_threads( > smap_get_int(&ovs_cfg->other_config, "n-handler-threads", 0), > smap_get_int(&ovs_cfg->other_config, "n-revalidator-threads", 0)); > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 7e89325..95071eb 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -320,6 +320,29 @@ > </p> > </column> > > + <column name="other_config" key="dpdk-lsc-interrupt" > + type='{"type": "boolean"}'> > + <p> > + Set this value to <code>true</code> to configure interrupt mode for > + Link State Change (LSC) detection instead of poll mode for DPDK > + interfaces. > + </p> > + <p> > + The default value is <code>false</code>. > + </p> > + <p> > + If this value is <code>false</code> at startup, poll mode is used > for > + all netdev dpdk interfaces. > + </p> > + <p> > + This value can be overridden for a specific interface in the > options > + section of that interface. > + </p> > + <p> > + This parameter has an effect only on netdev dpdk interfaces. > + </p> > + </column> > + > <column name="other_config" key="dpdk-extra" > type='{"type": "string"}'> > <p> > @@ -3620,6 +3643,28 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch options:peer=p1 \ > </column> > </group> > > + <group title="Link State Change detection mode"> > + <column name="options" key="dpdk-lsc-interrupt" > + type='{"type": "boolean"}'> > + <p> > + Set this value to <code>true</code> to configure interrupt mode for > + Link State Change (LSC) detection instead of poll mode for the DPDK > + interface. > + </p> > + <p> > + If this value is not set, the value is taken from the global > + settings. > + </p> > + <p> > + If this value is set, the global LSC interrupt settings are > + overridden. > + </p> > + <p> > + This parameter has an effect only on netdev dpdk interfaces. > + </p> > + </column> > + </group> > + > <group title="Common Columns"> > The overall purpose of these columns is described under <code>Common > Columns</code> at the beginning of this document. > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev