[ovs-dev] [PATCH 1/1] ovsdb-idl: fix memory leak for the memory allocated in ovsdb_idl_column.parse() functions.
"make check-valgrind" points to memory leak issue in case memory is allocated within one of ovsdb_idl_column.parse() functions. More details can be found in pull request https://github.com/openvswitch/ovs/pull/290. Each issue can be solved in many different ways. This one was solved by modifying struct ovsdb_idl_row what I considered to be the optimal way. However modifying data structures can bring some potential risks currently not envisaged to me (I'm a novice in OVS code). So the main purpose of this patch is to find out what is the level of freedom for modifying existing data structures. Signed-off-by: Damijan Skvarc Submitted-at: https://github.com/openvswitch/ovs/pull/290. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 1/3] doc: Move vhost tx retry info to separate section.
Bleep bloop. Greetings Kevin Traynor, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line is 81 characters long (recommended limit is 79) #94 FILE: Documentation/topics/dpdk/vhost-user.rst:512: Lines checked: 110, Warnings: 1, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v4 3/3] netdev-dpdk: Enable vhost-tx-retries config.
vhost tx retries can provide some mitigation against dropped packets due to a temporarily slow guest/limited queue size for an interface, but on the other hand when a system is fully loaded those extra cycles retrying could mean packets are dropped elsewhere. Up to now max vhost tx retries have been hardcoded, which meant no tuning and no way to disable for debugging to see if extra cycles spent retrying resulted in rx drops on some other interface. Add an option to change the max retries, with a value of 0 effectively disabling vhost tx retries. Signed-off-by: Kevin Traynor Acked-by: Eelco Chaudron Acked-by: Flavio Leitner --- Documentation/topics/dpdk/vhost-user.rst | 28 + lib/netdev-dpdk.c| 39 +--- vswitchd/vswitch.xml | 12 3 files changed, 75 insertions(+), 4 deletions(-) diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst index 368f44520..724aa62f6 100644 --- a/Documentation/topics/dpdk/vhost-user.rst +++ b/Documentation/topics/dpdk/vhost-user.rst @@ -351,4 +351,29 @@ The default value is ``false``. .. _dpdk-testpmd: +vhost-user-client tx retries config +~~~ + +For vhost-user-client interfaces, the max amount of retries can be changed from +the default 8 by setting ``tx-retries-max``. + +The minimum is 0 which means there will be no retries and if any packets in +each batch cannot be sent immediately they will be dropped. The maximum is 32, +which would mean that after the first packet(s) in the batch was sent there +could be a maximum of 32 more retries. + +Retries can help with avoiding packet loss when temporarily unable to send to a +vhost interface because the virtqueue is full. However, spending more time +retrying to send to one interface, will reduce the time available for rx/tx and +processing packets on other interfaces, so some tuning may be required for best +performance. + +Tx retries max can be set for vhost-user-client ports:: + +$ ovs-vsctl set Interface vhost-client-1 options:tx-retries-max=0 + +.. note:: + + Configurable vhost tx retries are not supported with vhost-user ports. + DPDK in the Guest - @@ -496,4 +521,7 @@ packets can be sent, it may mean the guest is not accepting packets, so there are no (more) retries. +For information about configuring the maximum amount of tx retries for +vhost-user-client interfaces see `vhost-user-client tx retries config`_. + .. note:: diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index d3e02d389..b8592962f 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -165,5 +165,10 @@ typedef uint16_t dpdk_port_t; #define DPDK_PORT_ID_FMT "%"PRIu16 -#define VHOST_ENQ_RETRY_NUM 8 +/* Minimum amount of vhost tx retries, effectively a disable. */ +#define VHOST_ENQ_RETRY_MIN 0 +/* Maximum amount of vhost tx retries. */ +#define VHOST_ENQ_RETRY_MAX 32 +/* Legacy default value for vhost tx retries. */ +#define VHOST_ENQ_RETRY_DEF 8 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) @@ -418,5 +423,7 @@ struct netdev_dpdk { /* True if vHost device is 'up' and has been reconfigured at least once */ bool vhost_reconfigured; -/* 3 pad bytes here. */ + +atomic_uint8_t vhost_tx_retries_max; +/* 2 pad bytes here. */ ); @@ -1262,4 +1269,6 @@ vhost_common_construct(struct netdev *netdev) } +atomic_store_relaxed(&dev->vhost_tx_retries_max, VHOST_ENQ_RETRY_DEF); + return common_construct(netdev, DPDK_ETH_PORT_ID_INVALID, DPDK_DEV_VHOST, socket_id); @@ -1922,4 +1931,5 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev, struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); const char *path; +int max_tx_retries, cur_max_tx_retries; ovs_mutex_lock(&dev->mutex); @@ -1938,4 +1948,17 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev, } } + +max_tx_retries = smap_get_int(args, "tx-retries-max", + VHOST_ENQ_RETRY_DEF); +if (max_tx_retries < VHOST_ENQ_RETRY_MIN +|| max_tx_retries > VHOST_ENQ_RETRY_MAX) { +max_tx_retries = VHOST_ENQ_RETRY_DEF; +} +atomic_read_relaxed(&dev->vhost_tx_retries_max, &cur_max_tx_retries); +if (max_tx_retries != cur_max_tx_retries) { +atomic_store_relaxed(&dev->vhost_tx_retries_max, max_tx_retries); +VLOG_INFO("Max Tx retries for vhost device '%s' set to %d", + dev->up.name, max_tx_retries); +} ovs_mutex_unlock(&dev->mutex); @@ -2351,4 +2374,5 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, unsigned int dropped = 0; int i, retries = 0; +int max_retries = VHOST_ENQ_RETRY_MIN; int vid = netdev_dpdk_get_vid(dev); @@ -2380,9 +2404,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, /* Prepare for possible retry
[ovs-dev] [PATCH v4 1/3] doc: Move vhost tx retry info to separate section.
vhost tx retry is applicable to vhost-user and vhost-user-client, but was in the section that compares them. Also, moved further down the doc as prefer to have more fundamental info about vhost nearer the top. Fixes: 6d6513bfc657 ("doc: Add info on vhost tx retries.") Reported-by: David Marchand Signed-off-by: Kevin Traynor --- Documentation/topics/dpdk/vhost-user.rst | 72 1 file changed, 36 insertions(+), 36 deletions(-) diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst index a6cf9d1cc..5f393aced 100644 --- a/Documentation/topics/dpdk/vhost-user.rst +++ b/Documentation/topics/dpdk/vhost-user.rst @@ -76,40 +76,4 @@ mode ports require QEMU version 2.7. Ports of type vhost-user are currently deprecated and will be removed in a future release. -vhost tx retries - - -When sending a batch of packets to a vhost-user or vhost-user-client interface, -it may happen that some but not all of the packets in the batch are able to be -sent to the guest. This is often because there is not enough free descriptors -in the virtqueue for all the packets in the batch to be sent. In this case -there will be a retry, with a default maximum of 8 occurring. If at any time no -packets can be sent, it may mean the guest is not accepting packets, so there -are no (more) retries. - -.. note:: - - Maximum vhost tx batch size is defined by NETDEV_MAX_BURST, and is currently - as 32. - -Tx Retries may be reduced or even avoided by some external configuration, such -as increasing the virtqueue size through the ``rx_queue_size`` parameter -introduced in QEMU 2.7.0 / libvirt 2.3.0:: - - - - - - - - - -The guest application will also need need to provide enough descriptors. For -example with ``testpmd`` the command line argument can be used:: - - --rxd=1024 --txd=1024 - -The guest should also have sufficient cores dedicated for consuming and -processing packets at the required rate. - .. _dpdk-vhost-user: @@ -521,4 +485,40 @@ DPDK vHost User ports can be configured to use Jumbo Frames. For more information, refer to :doc:`jumbo-frames`. +vhost tx retries + + +When sending a batch of packets to a vhost-user or vhost-user-client interface, +it may happen that some but not all of the packets in the batch are able to be +sent to the guest. This is often because there is not enough free descriptors +in the virtqueue for all the packets in the batch to be sent. In this case +there will be a retry, with a default maximum of 8 occurring. If at any time no +packets can be sent, it may mean the guest is not accepting packets, so there +are no (more) retries. + +.. note:: + + Maximum vhost tx batch size is defined by NETDEV_MAX_BURST, and is currently + as 32. + +Tx Retries may be reduced or even avoided by some external configuration, such +as increasing the virtqueue size through the ``rx_queue_size`` parameter +introduced in QEMU 2.7.0 / libvirt 2.3.0:: + + + + + + + + + +The guest application will also need need to provide enough descriptors. For +example with ``testpmd`` the command line argument can be used:: + + --rxd=1024 --txd=1024 + +The guest should also have sufficient cores dedicated for consuming and +processing packets at the required rate. + vhost-user Dequeue Zero Copy (experimental) --- -- 2.20.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v4 2/3] netdev-dpdk: Add custom stat for vhost tx retries.
vhost tx retries may occur, and it can be a sign that the guest is not optimally configured. Add a custom stat so a user will know if vhost tx retries are occurring and hence give a hint that guest config should be examined. Signed-off-by: Kevin Traynor --- Documentation/topics/dpdk/vhost-user.rst | 5 lib/netdev-dpdk.c| 38 +++- 2 files changed, 42 insertions(+), 1 deletion(-) diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst index 5f393aced..368f44520 100644 --- a/Documentation/topics/dpdk/vhost-user.rst +++ b/Documentation/topics/dpdk/vhost-user.rst @@ -521,4 +521,9 @@ The guest should also have sufficient cores dedicated for consuming and processing packets at the required rate. +The amount of Tx retries on a vhost-user or vhost-user-client interface can be +shown with:: + + $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries + vhost-user Dequeue Zero Copy (experimental) --- diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index a380b6ffe..d3e02d389 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -139,4 +139,10 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF / MIN_NB_MBUF)) #define XSTAT_RX_JABBER_ERRORS "rx_jabber_errors" +/* Size of vHost custom stats. */ +#define VHOST_STAT_TX_RETRIES_SIZE1 + +/* Name of vHost custom stats. */ +#define VHOST_STAT_TX_RETRIES"tx_retries" + #define SOCKET0 0 @@ -434,7 +440,9 @@ struct netdev_dpdk { PADDED_MEMBERS(CACHE_LINE_SIZE, struct netdev_stats stats; +/* Custom stat for retries when unable to transmit. */ +uint64_t tx_retries; /* Protects stats */ rte_spinlock_t stats_lock; -/* 44 pad bytes here. */ +/* 4 pad bytes here. */ ); @@ -1190,4 +1198,6 @@ common_construct(struct netdev *netdev, dpdk_port_t port_no, dev->rte_xstats_ids_size = 0; +dev->tx_retries = 0; + return 0; } @@ -2381,4 +2391,5 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts, cnt + dropped); +dev->tx_retries += MIN(retries, VHOST_ENQ_RETRY_NUM); rte_spinlock_unlock(&dev->stats_lock); @@ -2818,4 +2829,27 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev, } +static int +netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev, + struct netdev_custom_stats *custom_stats) +{ +struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); + +ovs_mutex_lock(&dev->mutex); + +custom_stats->size = VHOST_STAT_TX_RETRIES_SIZE; +custom_stats->counters = (struct netdev_custom_counter *) + xzalloc(sizeof(struct netdev_custom_counter)); +ovs_strlcpy(custom_stats->counters->name, VHOST_STAT_TX_RETRIES, +NETDEV_CUSTOM_STATS_NAME_SIZE); + +rte_spinlock_lock(&dev->stats_lock); +custom_stats->counters->value = dev->tx_retries; +rte_spinlock_unlock(&dev->stats_lock); + +ovs_mutex_unlock(&dev->mutex); + +return 0; +} + static int netdev_dpdk_get_features(const struct netdev *netdev, @@ -4398,4 +4432,5 @@ static const struct netdev_class dpdk_vhost_class = { .get_carrier = netdev_dpdk_vhost_get_carrier, .get_stats = netdev_dpdk_vhost_get_stats, +.get_custom_stats = netdev_dpdk_vhost_get_custom_stats, .get_status = netdev_dpdk_vhost_user_get_status, .reconfigure = netdev_dpdk_vhost_reconfigure, @@ -4413,4 +4448,5 @@ static const struct netdev_class dpdk_vhost_client_class = { .get_carrier = netdev_dpdk_vhost_get_carrier, .get_stats = netdev_dpdk_vhost_get_stats, +.get_custom_stats = netdev_dpdk_vhost_get_custom_stats, .get_status = netdev_dpdk_vhost_user_get_status, .reconfigure = netdev_dpdk_vhost_client_reconfigure, -- 2.20.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v4 0/3] vhost tx retry updates
v4: - 1/2 New patch: Move vhost tx retries doc to a seperate section (David) - 2/3 -- Changed tx_retries to be a custom stat for vhost (Ilya) -- Added back in MIN() that was dropped in v2, as in retesting I saw it is needed when the retry limit is reached to prevent an accounting error -- note: this is not a typo, it was just out of date -/* 44 pad bytes here. */ +/* 4 pad bytes here. */ -- Removed acks due to the amount of changes made - 3/3 -- Added link between doc sections on vhost tx retries and and config (Ian) -- Fixed spacing, fixed typo and added default in docs (Ilya) -- Some minor renaming due to 2/3 changes -- Changed config option name from "vhost-tx-retries" to "tx-retries-max" in order to have a generic user facing name in case it is ever extended to other interfaces -- Other minor edits -- Kept acks as the operation didn't really change v3: - 2/4 really updated text to NETDEV_MAX_BURST this time - 4/4 comments from Flavio. Made max retries limit 32. v2: - 1/4 - split out minor fix from docs patch - 2/4 - updated text, NETDEV_MAX_BURST, QEMU/libvirt versions - 3/4 - removed unneeded MIN() - 4/4 - explicitly range check, removed unneeded tx_counters fn changes Kevin Traynor (3): doc: Move vhost tx retry info to separate section. netdev-dpdk: Add custom stat for vhost tx retries. netdev-dpdk: Enable vhost-tx-retries config. Documentation/topics/dpdk/vhost-user.rst | 105 +++ lib/netdev-dpdk.c| 75 +++- vswitchd/vswitch.xml | 12 +++ 3 files changed, 152 insertions(+), 40 deletions(-) -- 2.20.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] AUTHORS entry
Hi Damijan, I noticed that we have some inconsistent spelling of your first name in the Git history. Specifically, I see both "Damjan" and "Damijan" in different places in history. While it's not possible to fix previous entries in the Git history, I want to make sure that we're spelling your name correctly in AUTHORS. Currently, it says Damijan. What is the correct (or preferred) spelling? Thanks, Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv5] tunnel: Add layer 2 IPv6 GRE encapsulation support.
On 7/1/2019 3:29 PM, William Tu wrote: On Mon, Jul 1, 2019 at 3:10 PM Gregory Rose wrote: On 7/1/2019 12:45 PM, William Tu wrote: The patch adds ip6gre support. Tunnel type 'ip6gre' with packet_type= legacy_l2 is a layer 2 GRE tunnel over IPv6, carrying inner ethernet packets and encap with GRE header with outer IPv6 header. Encapsulation of layer 3 packet over IPv6 GRE, ip6gre, is not supported yet. I tested it by running: # make check-kernel TESTSUITEFLAGS='-k ip6gre' under kernel 5.2 and for userspace: Hi William, I tried this on a system with the 5.2-rc5+ kernel and got the following results: ## --- ## ## openvswitch 2.11.90 test suite. ## ## --- ## 12: datapath - ping over ip6gre L2 tunnel skipped (system-traffic.at:344) ## - ## ## Test results. ## ## - ## 0 tests were successful. 1 test was skipped. This is on a Ubuntu 16.04 distro with the upstream kernel installed. What system/distro base did you use? Thanks, - Greg I tested it on ubuntu using kernel 5.2.0-rc5+ I run #make check-kernel TESTSUITEFLAGS='-k ip6gre' and it works OK, not being skipped by macro OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) William I see what happened, I recently ran an update on the system and it overwrote my iproute2/ip I built from upstream so the kernel check was failing because of the wrong ip program. Works fine now. "$@" -k ip6gre -j1 || (test X'' = Xyes && "$@" --recheck) ## --- ## ## openvswitch 2.11.90 test suite. ## ## --- ## 12: datapath - ping over ip6gre L2 tunnel ok ## - ## ## Test results. ## ## - ## 1 test was successful. Sorry for the confusion... I'm running a few other tests and review but should be done by tomorrow. Thanks! - Greg ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv5] tunnel: Add layer 2 IPv6 GRE encapsulation support.
On Mon, Jul 1, 2019 at 3:10 PM Gregory Rose wrote: > > > > On 7/1/2019 12:45 PM, William Tu wrote: > > The patch adds ip6gre support. Tunnel type 'ip6gre' with packet_type= > > legacy_l2 is a layer 2 GRE tunnel over IPv6, carrying inner ethernet packets > > and encap with GRE header with outer IPv6 header. Encapsulation of layer 3 > > packet over IPv6 GRE, ip6gre, is not supported yet. I tested it by running: > ># make check-kernel TESTSUITEFLAGS='-k ip6gre' > > under kernel 5.2 and for userspace: > > Hi William, > > I tried this on a system with the 5.2-rc5+ kernel and got the following > results: > > ## --- ## > ## openvswitch 2.11.90 test suite. ## > ## --- ## > 12: datapath - ping over ip6gre L2 tunnel skipped > (system-traffic.at:344) > > ## - ## > ## Test results. ## > ## - ## > > 0 tests were successful. > 1 test was skipped. > > This is on a Ubuntu 16.04 distro with the upstream kernel installed. > What system/distro base did you use? > > Thanks, > > - Greg > I tested it on ubuntu using kernel 5.2.0-rc5+ I run #make check-kernel TESTSUITEFLAGS='-k ip6gre' and it works OK, not being skipped by macro OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) William ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCHv5] tunnel: Add layer 2 IPv6 GRE encapsulation support.
On 7/1/2019 12:45 PM, William Tu wrote: The patch adds ip6gre support. Tunnel type 'ip6gre' with packet_type= legacy_l2 is a layer 2 GRE tunnel over IPv6, carrying inner ethernet packets and encap with GRE header with outer IPv6 header. Encapsulation of layer 3 packet over IPv6 GRE, ip6gre, is not supported yet. I tested it by running: # make check-kernel TESTSUITEFLAGS='-k ip6gre' under kernel 5.2 and for userspace: Hi William, I tried this on a system with the 5.2-rc5+ kernel and got the following results: ## --- ## ## openvswitch 2.11.90 test suite. ## ## --- ## 12: datapath - ping over ip6gre L2 tunnel skipped (system-traffic.at:344) ## - ## ## Test results. ## ## - ## 0 tests were successful. 1 test was skipped. This is on a Ubuntu 16.04 distro with the upstream kernel installed. What system/distro base did you use? Thanks, - Greg # make check TESTSUITEFLAGS='-k ip6gre' Signed-off-by: William Tu --- v1-v2 - rebase to master v2-v3 - update documentation suggested by Eli v3-v4 - squash Eli's documentation v4-v5 - remove using 'ip6gretap', use only 'ip6gre' with options:packet_type=legacy_l2 --- Documentation/faq/configuration.rst | 13 +++ NEWS| 1 + datapath/linux/compat/ip6_gre.c | 2 +- lib/dpif-netlink-rtnl.c | 8 - tests/system-traffic.at | 40 + tests/tunnel-push-pop-ipv6.at | 69 + vswitchd/vswitch.xml| 32 ++--- 7 files changed, 151 insertions(+), 14 deletions(-) diff --git a/Documentation/faq/configuration.rst b/Documentation/faq/configuration.rst index cb2c6b4eca98..ff3b71a5d4ef 100644 --- a/Documentation/faq/configuration.rst +++ b/Documentation/faq/configuration.rst @@ -212,6 +212,19 @@ Q: Does Open vSwitch support ERSPAN? options:erspan_ver=2 options:erspan_dir=1 \ options:erspan_hwid=4 +Q: Does Open vSwitch support IPv6 GRE? + +A: Yes. L2 tunnel interface GRE over IPv6 is supported. +L3 GRE tunnel over IPv6 is not supported. + +:: + +$ ovs-vsctl add-br br0 +$ ovs-vsctl add-port br0 at_gre0 -- \ +set int at_gre0 type=ip6gre \ +options:remote_ip=fc00:100::1 \ +options:packet_type=legacy_l2 + Q: How do I connect two bridges? A: First, why do you want to do this? Two connected bridges are not much diff --git a/NEWS b/NEWS index a38ab258fc6c..c7e84ed7931d 100644 --- a/NEWS +++ b/NEWS @@ -47,6 +47,7 @@ Post-v2.11.0 - Linux datapath: * Support for the kernel versions 4.19.x and 4.20.x. * Support for the kernel version 5.0.x. + - Add L2 GRE tunnel over IPv6 support. v2.11.0 - 19 Feb 2019 diff --git a/datapath/linux/compat/ip6_gre.c b/datapath/linux/compat/ip6_gre.c index ca4e66133570..ab50c72d0753 100644 --- a/datapath/linux/compat/ip6_gre.c +++ b/datapath/linux/compat/ip6_gre.c @@ -2550,7 +2550,7 @@ static struct rtnl_link_ops ip6gre_link_ops __read_mostly = { }; static struct rtnl_link_ops ip6gre_tap_ops __read_mostly = { - .kind = "ip6gre", + .kind = "ip6gretap", .maxtype= RPL_IFLA_GRE_MAX, .policy = ip6gre_policy, .priv_size = sizeof(struct ip6_tnl), diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c index 2e23a8c14fcf..582274c46774 100644 --- a/lib/dpif-netlink-rtnl.c +++ b/lib/dpif-netlink-rtnl.c @@ -104,7 +104,13 @@ vport_type_to_kind(enum ovs_vport_type type, case OVS_VPORT_TYPE_IP6ERSPAN: return "ip6erspan"; case OVS_VPORT_TYPE_IP6GRE: -return "ip6gre"; +if (tnl_cfg->pt_mode == NETDEV_PT_LEGACY_L2) { +return "ip6gretap"; +} else if (tnl_cfg->pt_mode == NETDEV_PT_LEGACY_L3) { +return NULL; +} else { +return NULL; +} case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: diff --git a/tests/system-traffic.at b/tests/system-traffic.at index d23ee897b0b2..8ea450887076 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -340,6 +340,46 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PI OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP +AT_SETUP([datapath - ping over ip6gre L2 tunnel]) +OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) +OVS_CHECK_GRE() +OVS_CHECK_ERSPAN() + +OVS_TRAFFIC_VSWITCHD_START() +ADD_BR([br-underlay]) + +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) + +ADD_NAMESPACES(at_ns0) + +dnl Set up underlay link from host into the namespace using veth pair. +ADD_VETH(p0, at_ns0, br-underlay, "fc00:100::1/96", [], [], nodad) +AT_CHECK([ip addr add dev br-underlay "fc00:100::100/96" no
Re: [ovs-dev] [PATCH 1/1] ovsdb-idl: memory leak while destroying database
On Mon, Jul 01, 2019 at 12:24:38PM +0200, Damjan Skvarc wrote: > While checking unit tests with valgrind option (make check-valgrind) I have > noticed > several memory leaks of the following format: Thanks. I applied this to master. I noticed that the call to ovsdb_idl_db_clear() could be moved into ovsdb_idl_db_destroy() instead of being in two places, so I did that, so what I pushed was the following. --8<--cut here-->8-- From: Damjan Skvarc Date: Mon, 1 Jul 2019 12:24:38 +0200 Subject: [PATCH] ovsdb-idl: memory leak while destroying database While checking unit tests with valgrind option (make check-valgrind) I have noticed several memory leaks of the following format: . ==20019== 13,883 (296 direct, 13,587 indirect) bytes in 1 blocks are definitely lost in loss record 346 of 346 ==20019==at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==20019==by 0x530F52: xcalloc (util.c:121) ==20019==by 0x5037A1: ovsdb_idl_row_create__ (ovsdb-idl.c:3120) ==20019==by 0x5045A3: ovsdb_idl_row_create (ovsdb-idl.c:3133) ==20019==by 0x507240: ovsdb_idl_process_update2 (ovsdb-idl.c:2478) ==20019==by 0x507240: ovsdb_idl_db_parse_update__ (ovsdb-idl.c:2328) ==20019==by 0x507240: ovsdb_idl_db_parse_update (ovsdb-idl.c:2380) ==20019==by 0x508128: ovsdb_idl_process_response (ovsdb-idl.c:742) ==20019==by 0x508128: ovsdb_idl_process_msg (ovsdb-idl.c:831) ==20019==by 0x508128: ovsdb_idl_run (ovsdb-idl.c:915) ==20019==by 0x4106D9: bridge_run (bridge.c:2977) ==20019==by 0x40719C: main (ovs-vswitchd.c:127) ==20019== ==20019== LEAK SUMMARY: ==20019==definitely lost: 296 bytes in 1 blocks ==20019==indirectly lost: 13,587 bytes in 10 blocks ==20019== possibly lost: 0 bytes in 0 blocks ==20019==still reachable: 43,563 bytes in 440 blocks ==20019== suppressed: 288 bytes in 1 blocks The problem is that table records maintained by database which is going to be destroyed with ovsdb_idl_db_destroy() function are not destroyed. Signed-off-by: Damijan Skvarc Signed-off-by: Ben Pfaff --- lib/ovsdb-idl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index d27e205d2ff6..5c6109603113 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -239,6 +239,7 @@ static bool ovsdb_idl_check_server_db(struct ovsdb_idl *); static void ovsdb_idl_send_monitor_request(struct ovsdb_idl *, struct ovsdb_idl_db *, enum ovsdb_idl_monitor_method); +static void ovsdb_idl_db_clear(struct ovsdb_idl_db *db); struct ovsdb_idl { struct ovsdb_idl_db server; @@ -557,6 +558,7 @@ ovsdb_idl_db_destroy(struct ovsdb_idl_db *db) { ovs_assert(!db->txn); ovsdb_idl_db_txn_abort_all(db); +ovsdb_idl_db_clear(db); for (size_t i = 0; i < db->class_->n_tables; i++) { struct ovsdb_idl_table *table = &db->tables[i]; ovsdb_idl_condition_destroy(&table->condition); @@ -581,7 +583,6 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl) if (idl) { ovsdb_idl_clear(idl); jsonrpc_session_close(idl->session); - ovsdb_idl_db_destroy(&idl->server); ovsdb_idl_db_destroy(&idl->data); json_destroy(idl->request_id); -- 2.20.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCHv5] tunnel: Add layer 2 IPv6 GRE encapsulation support.
The patch adds ip6gre support. Tunnel type 'ip6gre' with packet_type= legacy_l2 is a layer 2 GRE tunnel over IPv6, carrying inner ethernet packets and encap with GRE header with outer IPv6 header. Encapsulation of layer 3 packet over IPv6 GRE, ip6gre, is not supported yet. I tested it by running: # make check-kernel TESTSUITEFLAGS='-k ip6gre' under kernel 5.2 and for userspace: # make check TESTSUITEFLAGS='-k ip6gre' Signed-off-by: William Tu --- v1-v2 - rebase to master v2-v3 - update documentation suggested by Eli v3-v4 - squash Eli's documentation v4-v5 - remove using 'ip6gretap', use only 'ip6gre' with options:packet_type=legacy_l2 --- Documentation/faq/configuration.rst | 13 +++ NEWS| 1 + datapath/linux/compat/ip6_gre.c | 2 +- lib/dpif-netlink-rtnl.c | 8 - tests/system-traffic.at | 40 + tests/tunnel-push-pop-ipv6.at | 69 + vswitchd/vswitch.xml| 32 ++--- 7 files changed, 151 insertions(+), 14 deletions(-) diff --git a/Documentation/faq/configuration.rst b/Documentation/faq/configuration.rst index cb2c6b4eca98..ff3b71a5d4ef 100644 --- a/Documentation/faq/configuration.rst +++ b/Documentation/faq/configuration.rst @@ -212,6 +212,19 @@ Q: Does Open vSwitch support ERSPAN? options:erspan_ver=2 options:erspan_dir=1 \ options:erspan_hwid=4 +Q: Does Open vSwitch support IPv6 GRE? + +A: Yes. L2 tunnel interface GRE over IPv6 is supported. +L3 GRE tunnel over IPv6 is not supported. + +:: + +$ ovs-vsctl add-br br0 +$ ovs-vsctl add-port br0 at_gre0 -- \ +set int at_gre0 type=ip6gre \ +options:remote_ip=fc00:100::1 \ +options:packet_type=legacy_l2 + Q: How do I connect two bridges? A: First, why do you want to do this? Two connected bridges are not much diff --git a/NEWS b/NEWS index a38ab258fc6c..c7e84ed7931d 100644 --- a/NEWS +++ b/NEWS @@ -47,6 +47,7 @@ Post-v2.11.0 - Linux datapath: * Support for the kernel versions 4.19.x and 4.20.x. * Support for the kernel version 5.0.x. + - Add L2 GRE tunnel over IPv6 support. v2.11.0 - 19 Feb 2019 diff --git a/datapath/linux/compat/ip6_gre.c b/datapath/linux/compat/ip6_gre.c index ca4e66133570..ab50c72d0753 100644 --- a/datapath/linux/compat/ip6_gre.c +++ b/datapath/linux/compat/ip6_gre.c @@ -2550,7 +2550,7 @@ static struct rtnl_link_ops ip6gre_link_ops __read_mostly = { }; static struct rtnl_link_ops ip6gre_tap_ops __read_mostly = { - .kind = "ip6gre", + .kind = "ip6gretap", .maxtype= RPL_IFLA_GRE_MAX, .policy = ip6gre_policy, .priv_size = sizeof(struct ip6_tnl), diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c index 2e23a8c14fcf..582274c46774 100644 --- a/lib/dpif-netlink-rtnl.c +++ b/lib/dpif-netlink-rtnl.c @@ -104,7 +104,13 @@ vport_type_to_kind(enum ovs_vport_type type, case OVS_VPORT_TYPE_IP6ERSPAN: return "ip6erspan"; case OVS_VPORT_TYPE_IP6GRE: -return "ip6gre"; +if (tnl_cfg->pt_mode == NETDEV_PT_LEGACY_L2) { +return "ip6gretap"; +} else if (tnl_cfg->pt_mode == NETDEV_PT_LEGACY_L3) { +return NULL; +} else { +return NULL; +} case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: diff --git a/tests/system-traffic.at b/tests/system-traffic.at index d23ee897b0b2..8ea450887076 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -340,6 +340,46 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PI OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP +AT_SETUP([datapath - ping over ip6gre L2 tunnel]) +OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) +OVS_CHECK_GRE() +OVS_CHECK_ERSPAN() + +OVS_TRAFFIC_VSWITCHD_START() +ADD_BR([br-underlay]) + +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) + +ADD_NAMESPACES(at_ns0) + +dnl Set up underlay link from host into the namespace using veth pair. +ADD_VETH(p0, at_ns0, br-underlay, "fc00:100::1/96", [], [], nodad) +AT_CHECK([ip addr add dev br-underlay "fc00:100::100/96" nodad]) +AT_CHECK([ip link set dev br-underlay up]) + +dnl Set up tunnel endpoints on OVS outside the namespace and with a native +dnl linux device inside the namespace. +ADD_OVS_TUNNEL6([ip6gre], [br0], [at_gre0], [fc00:100::1], [10.1.1.100/24], +[options:packet_type=legacy_l2]) +ADD_NATIVE_TUNNEL6([ip6gretap], [ns_gretap0], [at_ns0], [fc00:100::100], + [10.1.1.1/24], [local fc00:100::1]) + +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 2 fc00:100::100]) + +dnl First, check the underlay +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00:100::100 | FORMAT_PING], [0], [dnl +3 pack
Re: [ovs-dev] [PATCH] netdev-offload-tc: Fix requesting match on wildcarded vlan tpid.
On 7/1/2019 2:21 PM, Ilya Maximets wrote: > Hi, Eli. > Did you have a chance to test this? Yes, sorry for the delay. It works fine (though didn't test QinQ. only native/single-tagged). Reviewed-by: Eli Britstein > Best regards, Ilya Maximets. > > On 19.06.2019 12:16, Ilya Maximets wrote: >> 'mask' must be checked first before configuring key in flower. >> >> CC: Eli Britstein >> Fixes: 0b0a84783cd6 ("netdev-tc-offloads: Support match on priority tags") >> Signed-off-by: Ilya Maximets >> --- >> lib/netdev-offload-tc.c | 8 ++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) >> >> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c >> index 902ec9543..4cc044b4b 100644 >> --- a/lib/netdev-offload-tc.c >> +++ b/lib/netdev-offload-tc.c >> @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct >> match *match, >> } >> mask->mpls_lse[0] = 0; >> >> -if (eth_type_vlan(key->vlans[0].tpid)) { >> +if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) { >> flower.key.encap_eth_type[0] = flower.key.eth_type; >> +flower.mask.encap_eth_type[0] = flower.mask.eth_type; >> flower.key.eth_type = key->vlans[0].tpid; >> +flower.mask.eth_type = mask->vlans[0].tpid; >> } >> if (mask->vlans[0].tci) { >> ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK); >> @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct >> match *match, >> } >> } >> >> -if (eth_type_vlan(key->vlans[1].tpid)) { >> +if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) { >> flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0]; >> +flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0]; >> flower.key.encap_eth_type[0] = key->vlans[1].tpid; >> +flower.mask.encap_eth_type[0] = mask->vlans[1].tpid; >> } >> if (mask->vlans[1].tci) { >> ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK); >> ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3 2/3] ovn: Send GARP for the router ports with reside-on-redirect-chassis options set
On Mon, Jul 1, 2019 at 9:45 AM wrote: > > From: Numan Siddique > > With the commit [1], the routing for the provider logical switches > connected to a router is centralized on the master gateway chassis > (if the option - reside-on-redirect-chassis) is set. When the > failover happens and a standby gateway chassis becomes master, > it should send GARPs for the router port macs. Without this, the > physical switch doesn't learn the new location of the router port macs > immediately and this could result in traffic disruption. > > This patch addresses this issue so that the ovn-controller which claims the > distributed gatweway router port sends out the GARPs. > > ovn-controller sends the GARPs if the Port_Binding.nat_addresses column > is set. This patch makes use of this column, instead of adding a new column > even though the name - nat_addresses seems a bit misnomer. The documentation > is > updated to highlight the usage of this column. > > This patch doesn't handle sending the GARPs for the gateway router port IPs. > This will be handled in a separate patch. > > [1] - 85706c34d53d ("ovn: Avoid tunneling for VLAN packets redirected to a > gateway chassis") > > Signed-off-by: Numan Siddique Looks good to me, Acked-by: Dumitru Ceara > --- > ovn/northd/ovn-northd.c | 31 ++ > tests/ovn.at| 58 +++-- > 2 files changed, 87 insertions(+), 2 deletions(-) > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c > index c43adb51c..e0af234f8 100644 > --- a/ovn/northd/ovn-northd.c > +++ b/ovn/northd/ovn-northd.c > @@ -2522,7 +2522,38 @@ ovn_port_update_sbrec(struct northd_context *ctx, > free(nats[i]); > } > free(nats); > + > +/* Add the router mac and IPv4 addresses to > + * Port_Binding.nat_addresses so that GARP is sent for these > + * IPs by the ovn-controller on which the distributed gateway > + * router port resides if: > + * > + * - op->peer has 'reside-on-gateway-chassis' set and the > + *the logical router datapath has distributed router port. > + * > + * Note: Port_Binding.nat_addresses column is also used for > + * sending the GARPs for the router port IPs. > + * */ > +if (op->peer && op->peer->nbrp && op->peer->od->l3dgw_port && > +op->peer->od->l3redirect_port && > +smap_get_bool(&op->peer->nbrp->options, > + "reside-on-redirect-chassis", false)) { > +struct ds garp_info = DS_EMPTY_INITIALIZER; > +ds_put_format(&garp_info, "%s", op->peer->lrp_networks.ea_s); > +for (size_t i = 0; i < op->peer->lrp_networks.n_ipv4_addrs; > + i++) { > +ds_put_format(&garp_info, " %s", > + > op->peer->lrp_networks.ipv4_addrs[i].addr_s); > +} > +ds_put_format(&garp_info, " is_chassis_resident(%s)", > + op->peer->od->l3redirect_port->json_key); > + > +sbrec_port_binding_update_nat_addresses_addvalue( > +op->sb, ds_cstr(&garp_info)); > +ds_destroy(&garp_info); > +} > } > + > sbrec_port_binding_set_parent_port(op->sb, op->nbsp->parent_name); > sbrec_port_binding_set_tag(op->sb, op->nbsp->tag, op->nbsp->n_tag); > sbrec_port_binding_set_mac(op->sb, (const char **) > op->nbsp->addresses, > diff --git a/tests/ovn.at b/tests/ovn.at > index daf85a55d..ea627e128 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -9317,9 +9317,10 @@ ovn_start > # # (i.e 8.8.8.8 as destination ip). > > # Physical network: > -# # Three hypervisors hv[123]. > +# # Four hypervisors hv[1234]. > # # hv1 hosts vif foo1. > # # hv2 is the "redirect-chassis" that hosts the distributed router gateway > port. > +# # Later to test GARPs for the router port - foo, hv2 and hv4 are added to > the ha_chassis_group > # # hv3 hosts nexthop port vif outside1. > # # All other tests connect hypervisors to network n1 through br-phys for > tunneling. > # # But in this test, hv1 won't connect to n1(and no br-phys in hv1), and > @@ -9344,6 +9345,8 @@ ovs-vsctl \ > start_daemon ovn-controller > ovs-vsctl -- add-port br-int hv1-vif1 -- \ > set interface hv1-vif1 external-ids:iface-id=foo1 \ > +options:tx_pcap=hv1/vif1-tx.pcap \ > +options:rxq_pcap=hv1/vif1-rx.pcap \ > ofport-request=1 > > sim_add hv2 > @@ -9363,6 +9366,12 @@ ovs-vsctl -- add-port br-int hv3-vif1 -- \ > ofport-request=1 > ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings="phys:br-phys" > > +sim_add hv4 > +as hv4 > +ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.4 > +ovs-vsctl set Open_vSwitch . > external-ids:ovn-bridge-mappings="publ
Re: [ovs-dev] [PATCH v3 1/3] ovn-northd: Refactor the code which sets nat_addresses
On Mon, Jul 1, 2019 at 9:44 AM wrote: > > From: Numan Siddique > > The present code which sets the Port_Binding.nat_addresses > can be simplied. This patch does this. This would help in > upcoming commits to set the nat_addresses column with the > mac and IPs of distributed logical router ports and logical > router ports with 'reside-on-redirect-chassis' set. > > Signed-off-by: Numan Siddique Acked-by: Dumitru Ceara > --- > ovn/northd/ovn-northd.c | 32 +--- > 1 file changed, 13 insertions(+), 19 deletions(-) > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c > index 0b0a96a3a..c43adb51c 100644 > --- a/ovn/northd/ovn-northd.c > +++ b/ovn/northd/ovn-northd.c > @@ -2493,23 +2493,12 @@ ovn_port_update_sbrec(struct northd_context *ctx, > > const char *nat_addresses = smap_get(&op->nbsp->options, > "nat-addresses"); > +size_t n_nats = 0; > +char **nats = NULL; > if (nat_addresses && !strcmp(nat_addresses, "router")) { > if (op->peer && op->peer->od > && (chassis || op->peer->od->l3redirect_port)) { > -size_t n_nats; > -char **nats = get_nat_addresses(op->peer, &n_nats); > -if (n_nats) { > -sbrec_port_binding_set_nat_addresses(op->sb, > -(const char **) nats, n_nats); > -for (size_t i = 0; i < n_nats; i++) { > -free(nats[i]); > -} > -free(nats); > -} else { > -sbrec_port_binding_set_nat_addresses(op->sb, NULL, > 0); > -} > -} else { > -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); > +nats = get_nat_addresses(op->peer, &n_nats); > } > /* Only accept manual specification of ethernet address > * followed by IPv4 addresses on type "l3gateway" ports. */ > @@ -2519,15 +2508,20 @@ ovn_port_update_sbrec(struct northd_context *ctx, > static struct vlog_rate_limit rl = > VLOG_RATE_LIMIT_INIT(1, 1); > VLOG_WARN_RL(&rl, "Error extracting nat-addresses."); > -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); > } else { > -sbrec_port_binding_set_nat_addresses(op->sb, > - &nat_addresses, 1); > destroy_lport_addresses(&laddrs); > +n_nats = 1; > +nats = xcalloc(1, sizeof *nats); > +nats[0] = xstrdup(nat_addresses); > } > -} else { > -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); > } > + > +sbrec_port_binding_set_nat_addresses(op->sb, > + (const char **) nats, > n_nats); > +for (size_t i = 0; i < n_nats; i++) { > +free(nats[i]); > +} > +free(nats); > } > sbrec_port_binding_set_parent_port(op->sb, op->nbsp->parent_name); > sbrec_port_binding_set_tag(op->sb, op->nbsp->tag, op->nbsp->n_tag); > -- > 2.21.0 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] vswitchd: Always cleanup userspace datapath.
On Thu, Jun 27, 2019 at 08:24:46PM +0300, Ilya Maximets wrote: > On 26.06.2019 21:27, Ben Pfaff wrote: > > On Tue, Jun 25, 2019 at 01:12:11PM +0300, Ilya Maximets wrote: > >> 'netdev' datapath is implemented within ovs-vswitchd process and can > >> not exist without it, so it should be gracefully terminated with a > >> full cleanup of resources upon ovs-vswitchd exit. > >> > >> This change forces dpif cleanup for 'netdev' datapath regardless of > >> passing '--cleanup' to 'ovs-appctl exit'. Such solution allowes to > >> not pass this additional option everytime for userspace datapath > >> installations and also allowes to not terminate system datapath in > >> setups where both datapaths runs at the same time. > >> > >> Exception made for 'internal' ports that could have user ip/route > >> configuration. These ports will not be removed without '--cleanup'. > >> > >> This change fixes OVS disappearing from the DPDK point of view > >> (keeping HW NICs improperly configured, sudden closing of vhost-user > >> connections) and will help with linux devices clearing with upcoming > >> AF_XDP netdev support. > >> > >> Signed-off-by: Ilya Maximets > > > > I'm having trouble figuring out what the critical step is in the > > destruction process that this enables or disables. It controls whether > > dpif_port_del() gets called. There's a lot of stuff under > > dpif_port_del(), most of it indirect. I'm not sure which bit is the > > important one. Would you mind explaining what it is as part of the > > commit message? > > """ > The main part is that dpif_port_del() will lead to netdev_close() > and subsequent netdev_class->destroy(dev) which will stop HW NICs > and free their resources. For vhost-user interfaces it will invoke > vhost driver unregistering with a properly closed vhost-user > connection. For upcoming AF_XDP netdev this will allow to gracefully > destroy xdp sockets and unload xdp programs from linux interfaces. > Another important thing is that port deletion will also trigger > flushing of flows offloaded to HW NICs. > """ > > Does above shed some light on the main goals of this patch? > I could add this information to commit message while applying the > patch. Thanks. It helps a lot. Please add it to the commit message. Acked-by: Ben Pfaff ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCHv14 2/2] netdev-afxdp: add new netdev type for AF_XDP.
The patch introduces experimental AF_XDP support for OVS netdev. AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket type built upon the eBPF and XDP technology. It is aims to have comparable performance to DPDK but cooperate better with existing kernel's networking stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program attached to the netdev, by-passing a couple of Linux kernel's subsystems As a result, AF_XDP socket shows much better performance than AF_PACKET For more details about AF_XDP, please see linux kernel's Documentation/networking/af_xdp.rst. Note that by default, this feature is not compiled in. Signed-off-by: William Tu --- v8-v9: - rebase to master 180bbbed3a3867d52 - Address review feedback from Ben, Ilya and Eelco, at: https://patchwork.ozlabs.org/patch/1097740/ - == From Ilya == - Optimize the reconfiguration logic - Implement .rxq_recv and .send for afxdp - Remove system-afxdp-traffic.at, reuse existing code - Use Ilya's rdtsc code - remove --disable-system - == From Eelco == - Fix bug when remove br0, util(revalidator49)|EMER|lib/poll-loop.c:111: assertion !fd != !wevent failed - Fix bug and use default value from libbpf, ex: XSK_RING_PROD__DEFAULT... - Clear xdp program when receive signal, ctrl+c - Add options to vswitch.xml, set xdpmode default to skb-mode - No support for ARM and PPC, now x86_64 only - remove redundant header includes and function/macro definitions - remove some ifdef HAVE_AF_XDP - == From others/both about afxdp rx and tx == - Several umem push/pop error handling improvement/fixes - add lock to address concurrent_txq case - improve error handling - add stats - Things that are not done yet - MTU limitation - n_txq_desc/n_rxq_desc option. v9-v10 - remove x86_64 limitation, suggested by Ben and Eelco - add xmalloc_pagealign, free_pagealign - minor refector v10-v11 - address feedback from Ilya at https://patchwork.ozlabs.org/patch/1106495/ - fix typos, and some refactoring - refactor existing code and introduce xmalloc pagealign - fix a couple of error handling case - allocate per-txq lock - dynamic allocate xsk array - fix cycle_counter_update() for non-x86/non-linux case v11-v12 - mainly address a couple of crashes reported by Eelco https://patchwork.ozlabs.org/patch/1110729/ - fix cleanup xdp program problem when ovs-vswtichd restarts - following cases should remove xdp program - kill `pidof ovs-vswitchd` - ovs-appctl -t ovs-vswtichd exit --cleanup - note: ovs-ctl restart does not have "--cleanup" so still an issue - work around issues of xsk_ring_cons__peek at libbpf, reported at https://marc.info/?l=xdp-newbies&m=156055471727857&w=2 - variable name refactoring - there are some performance degradation, but let's make sure everything works first v12-v13 - rebase to master - add coverage counter afxdp_cq_emtpy, afxdp_fq_full - minor refactoring v13-v14 - Mainly address issue reported by Ilya https://patchwork.ozlabs.org/patch/1118972/ when doing 'make check-afxdp' - Fix xdp frame headroom issue - Fix vlan test cases by disabling txvlan offload - Skip cvlan - Document TCP limitation (currently all tcp tests fail due to kernel veth driver) - Fix tunnel test cases due to --disable-system (another patch) - Switch to use pthread_spin_lock, suggested by Ben - Add coverage counter for debugging - Fix buffer starvation issue at batch_send reported by Eelco when using tap device with type=afxdp - TODO: 'make check-afxdp' still not all pass IP fragmentation expiry test not fix yet, need to implement deferral memory free, s.t like dpdk_mp_sweep. Currently hit some missing umem descs when reclaiming. --- Documentation/automake.mk | 1 + Documentation/index.rst | 1 + Documentation/intro/install/afxdp.rst | 430 Documentation/intro/install/index.rst | 1 + acinclude.m4 | 35 ++ configure.ac | 1 + lib/automake.mk | 13 + lib/dp-packet.c | 29 ++ lib/dp-packet.h | 18 +- lib/dpif-netdev-perf.h| 26 + lib/netdev-afxdp.c| 927 ++ lib/netdev-afxdp.h| 74 +++ lib/netdev-linux-private.h| 138 + lib/netdev-linux.c| 121 ++--- lib/netdev-provider.h | 3 + lib/netdev.c | 11 + lib/util.c| 92 +++- lib/util.h| 5 + lib/xdpsock.c | 170 +++ lib/xdpsock.h | 101 tests/automake.mk | 16 + tests/system-afxdp-macros.at | 32 ++ tests/system-afxdp-testsuite.at | 26 + tests/system-traffic.at | 2 + vswitchd/vswitch.xml | 15 + 25 files changed, 2180 insertions(+), 108 deletions(-) create
[ovs-dev] [PATCHv14 1/2] ovs-thread: Add pthread spin lock support.
The patch adds the basic spin lock functions: ovs_spin_{lock, try_lock, unlock, init, destroy}. OSX does not support pthread spin lock, so make it linux only. Signed-off-by: William Tu --- include/openvswitch/thread.h | 22 ++ lib/ovs-thread.c | 31 +++ 2 files changed, 53 insertions(+) diff --git a/include/openvswitch/thread.h b/include/openvswitch/thread.h index 2987db37c9dc..14cc9ad73900 100644 --- a/include/openvswitch/thread.h +++ b/include/openvswitch/thread.h @@ -33,6 +33,13 @@ struct OVS_LOCKABLE ovs_mutex { const char *where; /* NULL if and only if uninitialized. */ }; +#ifdef __linux__ +struct OVS_LOCKABLE ovs_spin { +pthread_spinlock_t lock; +const char *where; /* NULL if and only if uninitialized. */ +}; +#endif + /* "struct ovs_mutex" initializer. */ #ifdef PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP #define OVS_MUTEX_INITIALIZER { PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP, \ @@ -70,6 +77,21 @@ int ovs_mutex_trylock_at(const struct ovs_mutex *mutex, const char *where) void ovs_mutex_cond_wait(pthread_cond_t *, const struct ovs_mutex *mutex) OVS_REQUIRES(mutex); + +#ifdef __linux__ +void ovs_spin_init(const struct ovs_spin *); +void ovs_spin_destroy(const struct ovs_spin *); +void ovs_spin_unlock(const struct ovs_spin *spin) OVS_RELEASES(spin); +void ovs_spin_lock_at(const struct ovs_spin *spin, const char *where) +OVS_ACQUIRES(spin); +#define ovs_spin_lock(spin) \ +ovs_spin_lock_at(spin, OVS_SOURCE_LOCATOR) + +int ovs_spin_trylock_at(const struct ovs_spin *spin, const char *where) +OVS_TRY_LOCK(0, spin); +#define ovs_spin_trylock(spin) \ +ovs_spin_trylock_at(spin, OVS_SOURCE_LOCATOR) +#endif /* Convenient once-only execution. * diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c index 159d87e5b0ca..29a0b9e57acd 100644 --- a/lib/ovs-thread.c +++ b/lib/ovs-thread.c @@ -75,6 +75,9 @@ static bool multithreaded; LOCK_FUNCTION(mutex, lock); LOCK_FUNCTION(rwlock, rdlock); LOCK_FUNCTION(rwlock, wrlock); +#ifdef __linux__ +LOCK_FUNCTION(spin, lock); +#endif #define TRY_LOCK_FUNCTION(TYPE, FUN) \ int \ @@ -103,6 +106,9 @@ LOCK_FUNCTION(rwlock, wrlock); TRY_LOCK_FUNCTION(mutex, trylock); TRY_LOCK_FUNCTION(rwlock, tryrdlock); TRY_LOCK_FUNCTION(rwlock, trywrlock); +#ifdef __linux__ +TRY_LOCK_FUNCTION(spin, trylock); +#endif #define UNLOCK_FUNCTION(TYPE, FUN, WHERE) \ void \ @@ -125,6 +131,10 @@ UNLOCK_FUNCTION(mutex, unlock, ""); UNLOCK_FUNCTION(mutex, destroy, NULL); UNLOCK_FUNCTION(rwlock, unlock, ""); UNLOCK_FUNCTION(rwlock, destroy, NULL); +#ifdef __linux__ +UNLOCK_FUNCTION(spin, unlock, ""); +UNLOCK_FUNCTION(spin, destroy, NULL); +#endif #define XPTHREAD_FUNC1(FUNCTION, PARAM1)\ void\ @@ -268,6 +278,27 @@ ovs_mutex_cond_wait(pthread_cond_t *cond, const struct ovs_mutex *mutex_) } } +#ifdef __linux__ +static void +ovs_spin_init__(const struct ovs_spin *l_, int pshared) +{ +struct ovs_spin *l = CONST_CAST(struct ovs_spin *, l_); +int error; + +l->where = ""; +error = pthread_spin_init(&l->lock, pshared); +if (OVS_UNLIKELY(error)) { +ovs_abort(error, "pthread_spin_failed"); +} +} + +void +ovs_spin_init(const struct ovs_spin *spin) +{ +ovs_spin_init__(spin, PTHREAD_PROCESS_PRIVATE); +} +#endif + /* Initializes the 'barrier'. 'size' is the number of threads * expected to hit the barrier. */ void -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Fwd: [PATCH 1/1] ovsdb-idl: memory leak while destroying database
Damjan Skvarc writes: > Hm, telling the true, I don't know how to react on this report. > - I have made a slight change on MY LOCAL OVS FORK > - created a patch file (git format-patch -1 -s -n) > - prepare a mail (according to documentation) > - and sent it to dev list. > probably a problem occurred, because on my local ovs fork I have some other > commits? Not sure. I'll say this, there's something wrong in the following hunk. index 7545e43..d64d8c0 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -239,6 +239,7 @@ static bool ovsdb_idl_check_server_db(struct ovsdb_idl *); static void ovsdb_idl_send_monitor_request(struct ovsdb_idl *, struct ovsdb_idl_db *, enum ovsdb_idl_monitor_method); +static void ovsdb_idl_db_clear(struct ovsdb_idl_db *db); Note that it looks like it was line-wrapped, possibly by your email client. Can you double check your mail client settings? If you look in Patchwork at your patch: https://patchwork.ozlabs.org/patch/1125186/ you'll see that even patchwork didn't detect it correctly. And if I manually take your email from the mailing list, I see the same error. I would suggest trying to send again, and see if you can clear any formatting options. > Now I have applied the same changes on CLONED OVS repository and I am > attaching unmodified file (git > format-patch -1 -s -n) if you > are willing to help me. Or should I do something other? > thanks, damijan > > -- Forwarded message - > Od: 0-day Robot > Date: V pon., 1. jul. 2019 ob 12:58 > Subject: Re: [ovs-dev] [PATCH 1/1] ovsdb-idl: memory leak while destroying > database > To: Damjan Skvarc > Cc: > > Bleep bloop. Greetings Damjan Skvarc, I am a robot and I have tried out your > patch. > Thanks for your contribution. > > I encountered some error that I wasn't expecting. See the details below. > > git-am: > fatal: patch fragment without header at line 6: @@ -581,8 +582,9 @@ > ovsdb_idl_destroy(struct ovsdb_idl *idl) > Repository lacks necessary blobs to fall back on 3-way merge. > Cannot fall back to three-way merge. > Patch failed at 0001 ovsdb-idl: memory leak while destroying database > The copy of the patch that failed is found in: > > /var/lib/jenkins/jobs/upstream_build_from_pw/workspace/.git/rebase-apply/patch > When you have resolved this problem, run "git am --resolved". > If you prefer to skip this patch, run "git am --skip" instead. > To restore the original branch and stop patching, run "git am --abort". > > Please check this out. If you feel there has been an error, please email > acon...@bytheb.org > > Thanks, > 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] NEWS: Update regarding dumping HW offloaded flows.
On 2019-07-01 1:52 PM, Ilya Maximets wrote: > NEWS update was missed while updating docs for dynamic Flow API. > Since this is a user visible change, it should be mentioned here. > > Fixes: d74ca2269e36 ("dpctl: Update docs about dump-flows and HW offloading.") > Signed-off-by: Ilya Maximets > --- > NEWS | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/NEWS b/NEWS > index 2f8171f22..8ad2f6a24 100644 > --- a/NEWS > +++ b/NEWS > @@ -53,6 +53,8 @@ Post-v2.11.0 > - Linux datapath: > * Support for the kernel versions 4.19.x and 4.20.x. > * Support for the kernel version 5.0.x. > + - 'ovs-dpctl dump-flows' is no longer suitable for dumping offloaded > flows. > + 'ovs-appctl dpctl/dump-flows' should be used instead. > > > v2.11.0 - 19 Feb 2019 > Acked-by: Roi Dayan ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netdev-offload-tc: Fix requesting match on wildcarded vlan tpid.
Hi, Eli. Did you have a chance to test this? Best regards, Ilya Maximets. On 19.06.2019 12:16, Ilya Maximets wrote: > 'mask' must be checked first before configuring key in flower. > > CC: Eli Britstein > Fixes: 0b0a84783cd6 ("netdev-tc-offloads: Support match on priority tags") > Signed-off-by: Ilya Maximets > --- > lib/netdev-offload-tc.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c > index 902ec9543..4cc044b4b 100644 > --- a/lib/netdev-offload-tc.c > +++ b/lib/netdev-offload-tc.c > @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match > *match, > } > mask->mpls_lse[0] = 0; > > -if (eth_type_vlan(key->vlans[0].tpid)) { > +if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) { > flower.key.encap_eth_type[0] = flower.key.eth_type; > +flower.mask.encap_eth_type[0] = flower.mask.eth_type; > flower.key.eth_type = key->vlans[0].tpid; > +flower.mask.eth_type = mask->vlans[0].tpid; > } > if (mask->vlans[0].tci) { > ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK); > @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match > *match, > } > } > > -if (eth_type_vlan(key->vlans[1].tpid)) { > +if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) { > flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0]; > +flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0]; > flower.key.encap_eth_type[0] = key->vlans[1].tpid; > +flower.mask.encap_eth_type[0] = mask->vlans[1].tpid; > } > if (mask->vlans[1].tci) { > ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK); > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/1] ovsdb-idl: memory leak while destroying database
Bleep bloop. Greetings Damjan Skvarc, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. git-am: fatal: patch fragment without header at line 6: @@ -581,8 +582,9 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl) Repository lacks necessary blobs to fall back on 3-way merge. Cannot fall back to three-way merge. Patch failed at 0001 ovsdb-idl: memory leak while destroying database The copy of the patch that failed is found in: /var/lib/jenkins/jobs/upstream_build_from_pw/workspace/.git/rebase-apply/patch When you have resolved this problem, run "git am --resolved". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] NEWS: Update regarding dumping HW offloaded flows.
Ack On 7/1/2019 1:52 PM, Ilya Maximets wrote: > NEWS update was missed while updating docs for dynamic Flow API. > Since this is a user visible change, it should be mentioned here. > > Fixes: d74ca2269e36 ("dpctl: Update docs about dump-flows and HW offloading.") > Signed-off-by: Ilya Maximets > --- > NEWS | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/NEWS b/NEWS > index 2f8171f22..8ad2f6a24 100644 > --- a/NEWS > +++ b/NEWS > @@ -53,6 +53,8 @@ Post-v2.11.0 > - Linux datapath: >* Support for the kernel versions 4.19.x and 4.20.x. >* Support for the kernel version 5.0.x. > + - 'ovs-dpctl dump-flows' is no longer suitable for dumping offloaded > flows. > + 'ovs-appctl dpctl/dump-flows' should be used instead. > > > v2.11.0 - 19 Feb 2019 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/3] netdev: Dynamic per-port Flow API.
On 7/1/2019 1:46 PM, Ilya Maximets wrote: > On 01.07.2019 13:24, Eli Britstein wrote: >> On 7/1/2019 1:13 PM, Ilya Maximets wrote: >>> On 30.06.2019 7:47, Eli Britstein wrote: This patch breaks ovs-dpctl dump-flows, when using TC offloads (kernel). I added a print in netdev_flow_dump_create, and flow_api is NULL when invoking ovs-dpctl dump-flows. I think new netdev objects are created to the ports (netdev_open), but not properly initialized. Could you please have a look? >>> Hi. This is a known thing that ovs-dpctl is no longer suitable for dumping >>> of HW offloaded flows. Please, use 'ovs-appctl dpctl/dump-flows' instead. >>> There was a patch for documentation in v4 of patch-set: >>> >>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-May%2F359013.html&data=02%7C01%7Celibr%40mellanox.com%7C5df4d8354a324858271e08d6fe116097%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636975747916106491&sdata=Js1hN0Rh%2FFQ3aGQUKjgHQIl29Fk2KYd%2FQ91wE%2FYEXF0%3D&reserved=0 >>> Here is the relevant thread: >>> >>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-May%2F358985.html&data=02%7C01%7Celibr%40mellanox.com%7C5df4d8354a324858271e08d6fe116097%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636975747916116486&sdata=HG27RVERjwSzpD8clSsuqACWy6GgMhzixma%2BvQswIoI%3D&reserved=0 >>> >>> However, I missed updating the NEWS about that change. Will send the patch >>> for this soon. >>> >>> Best regards, Ilya Maximets. >> If that so, don't you think this should be either fixed or deprecated? > Fixing seems not possible, because ovs-dpctl is a standalone application > that works only with default configuration and HW offloading is disabled > by default. > > Regarding deprecation, I'd like to remove all the functionality from ovs-dpctl > utility keeping only ability to remove system datapaths, since this, IMHO, > is the only useful function. > >> Imagine some flows are in TC and some in OVS. I guess dpctl will only >> show the OVS ones. > Yes, and this is documented. I missed the documentation but bisected to find the offending commit. I think it's better to deprecate as you suggested above, maybe with some message to refer users to the documentation. > >> I know about that appctl, but at least currently it misses some >> information in the dump like "dp" and "offloaded". > appctl and dpctl works through the same code in lib/dpctl.c. So, everything > is supported. If you don't see those fields, increase the verbosity with > '--more' argument. Right. Sorry. > > Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] NEWS: Update regarding dumping HW offloaded flows.
NEWS update was missed while updating docs for dynamic Flow API. Since this is a user visible change, it should be mentioned here. Fixes: d74ca2269e36 ("dpctl: Update docs about dump-flows and HW offloading.") Signed-off-by: Ilya Maximets --- NEWS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/NEWS b/NEWS index 2f8171f22..8ad2f6a24 100644 --- a/NEWS +++ b/NEWS @@ -53,6 +53,8 @@ Post-v2.11.0 - Linux datapath: * Support for the kernel versions 4.19.x and 4.20.x. * Support for the kernel version 5.0.x. + - 'ovs-dpctl dump-flows' is no longer suitable for dumping offloaded flows. + 'ovs-appctl dpctl/dump-flows' should be used instead. v2.11.0 - 19 Feb 2019 -- 2.17.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/3] netdev: Dynamic per-port Flow API.
On 01.07.2019 13:24, Eli Britstein wrote: > > On 7/1/2019 1:13 PM, Ilya Maximets wrote: >> On 30.06.2019 7:47, Eli Britstein wrote: >>> This patch breaks ovs-dpctl dump-flows, when using TC offloads (kernel). >>> >>> I added a print in netdev_flow_dump_create, and flow_api is NULL when >>> invoking ovs-dpctl dump-flows. >>> >>> I think new netdev objects are created to the ports (netdev_open), but >>> not properly initialized. >>> >>> Could you please have a look? >>> >> Hi. This is a known thing that ovs-dpctl is no longer suitable for dumping >> of HW offloaded flows. Please, use 'ovs-appctl dpctl/dump-flows' instead. >> There was a patch for documentation in v4 of patch-set: >> >> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-May%2F359013.html&data=02%7C01%7Celibr%40mellanox.com%7Ceef52fe146404284578308d6fe0cc022%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636975728048962387&sdata=ktlkJhn7VpBfc0HXWFT0m4FACthk%2BwMAlsBFeYB3ZyA%3D&reserved=0 >> Here is the relevant thread: >> >> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-May%2F358985.html&data=02%7C01%7Celibr%40mellanox.com%7Ceef52fe146404284578308d6fe0cc022%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636975728048962387&sdata=gfdSQ48i4CLjF6f1MVQCVyR%2FGcLqq72tmJ6ErAWLGig%3D&reserved=0 >> >> However, I missed updating the NEWS about that change. Will send the patch >> for this soon. >> >> Best regards, Ilya Maximets. > > If that so, don't you think this should be either fixed or deprecated? Fixing seems not possible, because ovs-dpctl is a standalone application that works only with default configuration and HW offloading is disabled by default. Regarding deprecation, I'd like to remove all the functionality from ovs-dpctl utility keeping only ability to remove system datapaths, since this, IMHO, is the only useful function. > Imagine some flows are in TC and some in OVS. I guess dpctl will only > show the OVS ones. Yes, and this is documented. > > I know about that appctl, but at least currently it misses some > information in the dump like "dp" and "offloaded". appctl and dpctl works through the same code in lib/dpctl.c. So, everything is supported. If you don't see those fields, increase the verbosity with '--more' argument. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/3] netdev: Dynamic per-port Flow API.
On 7/1/2019 1:13 PM, Ilya Maximets wrote: > On 30.06.2019 7:47, Eli Britstein wrote: >> This patch breaks ovs-dpctl dump-flows, when using TC offloads (kernel). >> >> I added a print in netdev_flow_dump_create, and flow_api is NULL when >> invoking ovs-dpctl dump-flows. >> >> I think new netdev objects are created to the ports (netdev_open), but >> not properly initialized. >> >> Could you please have a look? >> > Hi. This is a known thing that ovs-dpctl is no longer suitable for dumping > of HW offloaded flows. Please, use 'ovs-appctl dpctl/dump-flows' instead. > There was a patch for documentation in v4 of patch-set: > > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-May%2F359013.html&data=02%7C01%7Celibr%40mellanox.com%7Ceef52fe146404284578308d6fe0cc022%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636975728048962387&sdata=ktlkJhn7VpBfc0HXWFT0m4FACthk%2BwMAlsBFeYB3ZyA%3D&reserved=0 > Here is the relevant thread: > > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2019-May%2F358985.html&data=02%7C01%7Celibr%40mellanox.com%7Ceef52fe146404284578308d6fe0cc022%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636975728048962387&sdata=gfdSQ48i4CLjF6f1MVQCVyR%2FGcLqq72tmJ6ErAWLGig%3D&reserved=0 > > However, I missed updating the NEWS about that change. Will send the patch > for this soon. > > Best regards, Ilya Maximets. If that so, don't you think this should be either fixed or deprecated? Imagine some flows are in TC and some in OVS. I guess dpctl will only show the OVS ones. I know about that appctl, but at least currently it misses some information in the dump like "dp" and "offloaded". ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/1] ovsdb-idl: memory leak while destroying database
While checking unit tests with valgrind option (make check-valgrind) I have noticed several memory leaks of the following format: . ==20019== 13,883 (296 direct, 13,587 indirect) bytes in 1 blocks are definitely lost in loss record 346 of 346 ==20019==at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==20019==by 0x530F52: xcalloc (util.c:121) ==20019==by 0x5037A1: ovsdb_idl_row_create__ (ovsdb-idl.c:3120) ==20019==by 0x5045A3: ovsdb_idl_row_create (ovsdb-idl.c:3133) ==20019==by 0x507240: ovsdb_idl_process_update2 (ovsdb-idl.c:2478) ==20019==by 0x507240: ovsdb_idl_db_parse_update__ (ovsdb-idl.c:2328) ==20019==by 0x507240: ovsdb_idl_db_parse_update (ovsdb-idl.c:2380) ==20019==by 0x508128: ovsdb_idl_process_response (ovsdb-idl.c:742) ==20019==by 0x508128: ovsdb_idl_process_msg (ovsdb-idl.c:831) ==20019==by 0x508128: ovsdb_idl_run (ovsdb-idl.c:915) ==20019==by 0x4106D9: bridge_run (bridge.c:2977) ==20019==by 0x40719C: main (ovs-vswitchd.c:127) ==20019== ==20019== LEAK SUMMARY: ==20019==definitely lost: 296 bytes in 1 blocks ==20019==indirectly lost: 13,587 bytes in 10 blocks ==20019== possibly lost: 0 bytes in 0 blocks ==20019==still reachable: 43,563 bytes in 440 blocks ==20019== suppressed: 288 bytes in 1 blocks The problem is that table records maintained by database which is going to be destroyed with ovsdb_idl_db_destroy() function are not destroyed. Signed-off-by: Damijan Skvarc --- lib/ovsdb-idl.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index 7545e43..d64d8c0 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -239,6 +239,7 @@ static bool ovsdb_idl_check_server_db(struct ovsdb_idl *); static void ovsdb_idl_send_monitor_request(struct ovsdb_idl *, struct ovsdb_idl_db *, enum ovsdb_idl_monitor_method); +static void ovsdb_idl_db_clear(struct ovsdb_idl_db *db); struct ovsdb_idl { struct ovsdb_idl_db server; @@ -581,8 +582,9 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl) if (idl) { ovsdb_idl_clear(idl); jsonrpc_session_close(idl->session); - +ovsdb_idl_db_clear(&idl->server); ovsdb_idl_db_destroy(&idl->server); +ovsdb_idl_db_clear(&idl->data); ovsdb_idl_db_destroy(&idl->data); json_destroy(idl->request_id); free(idl->remote); -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/3] netdev: Dynamic per-port Flow API.
On 30.06.2019 7:47, Eli Britstein wrote: > This patch breaks ovs-dpctl dump-flows, when using TC offloads (kernel). > > I added a print in netdev_flow_dump_create, and flow_api is NULL when > invoking ovs-dpctl dump-flows. > > I think new netdev objects are created to the ports (netdev_open), but > not properly initialized. > > Could you please have a look? > Hi. This is a known thing that ovs-dpctl is no longer suitable for dumping of HW offloaded flows. Please, use 'ovs-appctl dpctl/dump-flows' instead. There was a patch for documentation in v4 of patch-set: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359013.html Here is the relevant thread: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/358985.html However, I missed updating the NEWS about that change. Will send the patch for this soon. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 2/3] ovn: Send GARP for the router ports with reside-on-redirect-chassis options set
On Fri, Jun 28, 2019 at 2:18 PM Dumitru Ceara wrote: > On Fri, Jun 14, 2019 at 2:38 PM wrote: > > > > From: Numan Siddique > > > > With the commit [1], the routing for the provider logical switches > > connected to a router is centralized on the master gateway chassis > > (if the option - reside-on-redirect-chassis) is set. When the > > failover happens and a standby gateway chassis becomes master, > > it should send GARPs for the router port macs. Without this, the > > physical switch doesn't learn the new location of the router port macs > > immediately and this could result in traffic disruption. > > > > This patch addresses this issue so that the ovn-controller which claims > the > > distributed gatweway router port sends out the GARPs. > > > > ovn-controller sends the GARPs if the Port_Binding.nat_addresses column > > is set. This patch makes use of this column, instead of adding a new > column > > even though the name - nat_addresses seems a bit misnomer. The > documentation is > > updated to highlight the usage of this column. > > > > This patch doesn't handle sending the GARPs for the gateway router port > IPs. > > This will be handled in a separate patch. > > > > [1] - 85706c34d53d ("ovn: Avoid tunneling for VLAN packets redirected to > a gateway chassis") > > > > Signed-off-by: Numan Siddique > > Hi Numan, > > A few comments inline. > > Thanks, > Dumitru > > > --- > > ovn/northd/ovn-northd.c | 30 + > > tests/ovn.at| 58 +++-- > > 2 files changed, 86 insertions(+), 2 deletions(-) > > > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c > > index d0a77293a..26ad3583a 100644 > > --- a/ovn/northd/ovn-northd.c > > +++ b/ovn/northd/ovn-northd.c > > @@ -2518,6 +2518,36 @@ ovn_port_update_sbrec(struct northd_context *ctx, > > } > > } > > > > +/* Add the router mac and IPv4 addresses to > > + * Port_Binding.nat_addresses so that GARP is sent for these > > + * IPs by the ovn-controller on which the distributed > gateeway > > typo: s/gateeway/gateway > > > + * router port resides if: > > + * > > + * - op->peer has 'reside-on-gateway-chassis' set and the > > + *the logical router datapath has distributed router > port. > > + * > > + * Note: Port_Binding.nat_addresses column is also used for > > + * sending the GARPs for the router port IPs. > > + * */ > > +if (op->peer && op->peer->nbrp && op->peer->od->l3dgw_port > && > > +op->peer->od->l3redirect_port && > > +smap_get_bool(&op->peer->nbrp->options, > > + "reside-on-redirect-chassis", false)) { > > +struct ds garp_info = DS_EMPTY_INITIALIZER; > > +ds_put_format(&garp_info, "%s", > op->peer->lrp_networks.ea_s); > > +for (size_t i = 0; i < > op->peer->lrp_networks.n_ipv4_addrs; > > + i++) { > > +ds_put_format(&garp_info, " %s", > > + > op->peer->lrp_networks.ipv4_addrs[i].addr_s); > > +} > > +ds_put_format(&garp_info, " is_chassis_resident(%s)", > > + op->peer->od->l3redirect_port->json_key); > > + > > +n_nats++; > > +nats = xrealloc(nats, (n_nats * sizeof *nats)); > > +nats[n_nats - 1] = ds_steal_cstr(&garp_info); > > +} > > + > > sbrec_port_binding_set_nat_addresses(op->sb, > > (const char **) nats, > n_nats); > > Can't we avoid xrealloc above by moving the GARP for router port IP > code after sbrec_port_binding_set_nat_addresses and call > sbrec_port_binding_update_nat_addresses_addvalue(op->sb, > ds_cstr(&garp_info)) if we need GARP for the router port? Then we'd > also need to ds_destroy(&garp_info). > > I never knew there were *add_value/ delvalue variants. I have used this function in v3. Thanks Numan > > for (size_t i = 0; i < n_nats; i++) { > > diff --git a/tests/ovn.at b/tests/ovn.at > > index daf85a55d..ea627e128 100644 > > --- a/tests/ovn.at > > +++ b/tests/ovn.at > > @@ -9317,9 +9317,10 @@ ovn_start > > # # (i.e 8.8.8.8 as destination ip). > > > > # Physical network: > > -# # Three hypervisors hv[123]. > > +# # Four hypervisors hv[1234]. > > # # hv1 hosts vif foo1. > > # # hv2 is the "redirect-chassis" that hosts the distributed router > gateway port. > > +# # Later to test GARPs for the router port - foo, hv2 and hv4 are > added to the ha_chassis_group > > # # hv3 hosts nexthop port vif outside1. > > # # All other tests connect hypervisors to network n1 through br-phys > for tunneling. > > # # But in this test, hv1 won't connect to n1(and no br-phys in hv1), > and > > @@ -9344,6 +9345,8 @@ ovs-vsctl \ > > start_daemon ovn-controlle
Re: [ovs-dev] [PATCH v3 2/4] doc: Add info on vhost tx retries.
On Thu, Jun 27, 2019 at 1:13 PM Kevin Traynor wrote: > Add documentation about vhost tx retries and external > configuration that can help reduce/avoid them. > > Signed-off-by: Kevin Traynor > Acked-by: Eelco Chaudron > Acked-by: Flavio Leitner > --- > Documentation/topics/dpdk/vhost-user.rst | 36 > 1 file changed, 36 insertions(+) > > diff --git a/Documentation/topics/dpdk/vhost-user.rst > b/Documentation/topics/dpdk/vhost-user.rst > index f7b4b338e..1dd02b8b6 100644 > --- a/Documentation/topics/dpdk/vhost-user.rst > +++ b/Documentation/topics/dpdk/vhost-user.rst > @@ -76,4 +76,40 @@ mode ports require QEMU version 2.7. Ports of type > vhost-user are currently > deprecated and will be removed in a future release. > > +vhost tx retries > + > Coming a bit late.. Is there a reason why this is a subtitle of "vhost-user vs. vhost-user-client" ? I am about to send a patch in this same section, I can send a fix. -- David Marchand ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/3] ovn-northd: Refactor the code which sets nat_addresses
On Fri, Jun 28, 2019 at 1:55 PM Dumitru Ceara wrote: > On Fri, Jun 14, 2019 at 2:37 PM wrote: > > > > From: Numan Siddique > > > > The present code which sets the Port_Binding.nat_addresses > > can be simplied. This patch does this. This would help in > > upcoming commits to set the nat_addresses column with the > > mac and IPs of distributed logical router ports and logical > > router ports with 'reside-on-redirect-chassis' set. > > > > Signed-off-by: Numan Siddique > > Hi Numan, > > I have a couple of minor comments inline. > > > --- > > ovn/northd/ovn-northd.c | 30 +- > > 1 file changed, 13 insertions(+), 17 deletions(-) > > > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c > > index 0b0a96a3a..d0a77293a 100644 > > --- a/ovn/northd/ovn-northd.c > > +++ b/ovn/northd/ovn-northd.c > > @@ -2493,23 +2493,12 @@ ovn_port_update_sbrec(struct northd_context *ctx, > > > > const char *nat_addresses = smap_get(&op->nbsp->options, > > "nat-addresses"); > > +size_t n_nats = 0; > > +char **nats = NULL; > > if (nat_addresses && !strcmp(nat_addresses, "router")) { > > if (op->peer && op->peer->od > > && (chassis || op->peer->od->l3redirect_port)) { > > -size_t n_nats; > > -char **nats = get_nat_addresses(op->peer, &n_nats); > > -if (n_nats) { > > -sbrec_port_binding_set_nat_addresses(op->sb, > > -(const char **) nats, n_nats); > > -for (size_t i = 0; i < n_nats; i++) { > > -free(nats[i]); > > -} > > -free(nats); > > -} else { > > -sbrec_port_binding_set_nat_addresses(op->sb, > NULL, 0); > > -} > > -} else { > > -sbrec_port_binding_set_nat_addresses(op->sb, NULL, > 0); > > +nats = get_nat_addresses(op->peer, &n_nats); > > } > > /* Only accept manual specification of ethernet address > > * followed by IPv4 addresses on type "l3gateway" ports. */ > > @@ -2519,15 +2508,22 @@ ovn_port_update_sbrec(struct northd_context *ctx, > > static struct vlog_rate_limit rl = > > VLOG_RATE_LIMIT_INIT(1, 1); > > VLOG_WARN_RL(&rl, "Error extracting > nat-addresses."); > > -sbrec_port_binding_set_nat_addresses(op->sb, NULL, > 0); > > } else { > > sbrec_port_binding_set_nat_addresses(op->sb, > > > &nat_addresses, 1); > > Shouldn't this be removed? Now we call > sbrec_port_binding_set_nat_addresses at the end of the function for > all cases. > > Thanks for pointing this out. I missed it. I have addressed it in v2. > > destroy_lport_addresses(&laddrs); > > +n_nats = 1; > > +nats = xcalloc(1, sizeof *nats); > > +nats[0] = xstrdup(nat_addresses); > > I guess xmalloc would be enough as we immediately initialize nats[0]. > I personally would prefer xcalloc because of 'nats' is a double pointer. So I have not changed in v2. Thanks Numan > > Thanks, > Dumitru > > > } > > -} else { > > -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); > > } > > + > > +sbrec_port_binding_set_nat_addresses(op->sb, > > + (const char **) nats, > n_nats); > > +for (size_t i = 0; i < n_nats; i++) { > > +free(nats[i]); > > +} > > +free(nats); > > } > > sbrec_port_binding_set_parent_port(op->sb, > op->nbsp->parent_name); > > sbrec_port_binding_set_tag(op->sb, op->nbsp->tag, > op->nbsp->n_tag); > > -- > > 2.21.0 > > > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v3 3/3] ovn: Send GARP for router port IPs of a router port connected to bridged logical switch
From: Numan Siddique This patch handles sending GARPs for - router port IPs of a distributed router port - router port IPs of a router port which belongs to gateway router (with the option - redirect-chassis set in Logical_Router.options) Signed-off-by: Numan Siddique --- ovn/northd/ovn-northd.c | 44 tests/ovn.at| 89 +++-- 2 files changed, 105 insertions(+), 28 deletions(-) diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index e0af234f8..e8cbc3534 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -1983,9 +1983,23 @@ get_nat_addresses(const struct ovn_port *op, size_t *n) } } else { /* Centralized NAT rule, either on gateway router or distributed - * router. */ -ds_put_format(&c_addresses, " %s", nat->external_ip); -central_ip_address = true; + * router. + * Check if external_ip is same as router ip. If so, then there + * is no need to add this to the nat_addresses. The router IPs + * will be added separately. */ +bool is_router_ip = false; +for (size_t j = 0; j < op->lrp_networks.n_ipv4_addrs; j++) { +if (!strcmp(nat->external_ip, +op->lrp_networks.ipv4_addrs[j].addr_s)) { +is_router_ip = true; +break; +} +} + +if (!is_router_ip) { +ds_put_format(&c_addresses, " %s", nat->external_ip); +central_ip_address = true; +} } } @@ -2531,13 +2545,26 @@ ovn_port_update_sbrec(struct northd_context *ctx, * - op->peer has 'reside-on-gateway-chassis' set and the *the logical router datapath has distributed router port. * + * - op->peer is distributed gateway router port. + * + * - op->peer's router is a gateway router and op has a localnet + *port. + * * Note: Port_Binding.nat_addresses column is also used for * sending the GARPs for the router port IPs. * */ +bool add_router_port_garp = false; if (op->peer && op->peer->nbrp && op->peer->od->l3dgw_port && op->peer->od->l3redirect_port && -smap_get_bool(&op->peer->nbrp->options, - "reside-on-redirect-chassis", false)) { +(smap_get_bool(&op->peer->nbrp->options, + "reside-on-redirect-chassis", false) || +op->peer == op->peer->od->l3dgw_port)) { +add_router_port_garp = true; +} else if (chassis && op->od->localnet_port) { +add_router_port_garp = true; +} + +if (add_router_port_garp) { struct ds garp_info = DS_EMPTY_INITIALIZER; ds_put_format(&garp_info, "%s", op->peer->lrp_networks.ea_s); for (size_t i = 0; i < op->peer->lrp_networks.n_ipv4_addrs; @@ -2545,8 +2572,11 @@ ovn_port_update_sbrec(struct northd_context *ctx, ds_put_format(&garp_info, " %s", op->peer->lrp_networks.ipv4_addrs[i].addr_s); } -ds_put_format(&garp_info, " is_chassis_resident(%s)", - op->peer->od->l3redirect_port->json_key); + +if (op->peer->od->l3redirect_port) { +ds_put_format(&garp_info, " is_chassis_resident(%s)", + op->peer->od->l3redirect_port->json_key); +} sbrec_port_binding_update_nat_addresses_addvalue( op->sb, ds_cstr(&garp_info)); diff --git a/tests/ovn.at b/tests/ovn.at index ea627e128..2e266d94a 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -6730,6 +6730,9 @@ AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown]) AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet]) AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=physnet1]) +# Wait until the patch ports are created in hv1 to connect br-int to br-eth0 +OVS_WAIT_UNTIL([test 1 = `as hv1 ovs-vsctl show | \ +grep "Port patch-br-int-to-ln_port" | wc -l`]) # Wait for packet to be received. OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 50]) @@ -6737,10 +6740,11 @@ trim_zeros() { sed 's/\(00\)\{1,\}$//' } $PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/snoopvif-tx.pcap | trim_zeros > packets -expected="f00108060001080006040001f001c0a80002c0a80002" +expected="f00108060001080006040001f001c0a80001c0a80001" echo $expected > expout +expected="f00108060001080006040001f001c0a80002c0a80002" +echo $expected >>
[ovs-dev] [PATCH v3 2/3] ovn: Send GARP for the router ports with reside-on-redirect-chassis options set
From: Numan Siddique With the commit [1], the routing for the provider logical switches connected to a router is centralized on the master gateway chassis (if the option - reside-on-redirect-chassis) is set. When the failover happens and a standby gateway chassis becomes master, it should send GARPs for the router port macs. Without this, the physical switch doesn't learn the new location of the router port macs immediately and this could result in traffic disruption. This patch addresses this issue so that the ovn-controller which claims the distributed gatweway router port sends out the GARPs. ovn-controller sends the GARPs if the Port_Binding.nat_addresses column is set. This patch makes use of this column, instead of adding a new column even though the name - nat_addresses seems a bit misnomer. The documentation is updated to highlight the usage of this column. This patch doesn't handle sending the GARPs for the gateway router port IPs. This will be handled in a separate patch. [1] - 85706c34d53d ("ovn: Avoid tunneling for VLAN packets redirected to a gateway chassis") Signed-off-by: Numan Siddique --- ovn/northd/ovn-northd.c | 31 ++ tests/ovn.at| 58 +++-- 2 files changed, 87 insertions(+), 2 deletions(-) diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index c43adb51c..e0af234f8 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -2522,7 +2522,38 @@ ovn_port_update_sbrec(struct northd_context *ctx, free(nats[i]); } free(nats); + +/* Add the router mac and IPv4 addresses to + * Port_Binding.nat_addresses so that GARP is sent for these + * IPs by the ovn-controller on which the distributed gateway + * router port resides if: + * + * - op->peer has 'reside-on-gateway-chassis' set and the + *the logical router datapath has distributed router port. + * + * Note: Port_Binding.nat_addresses column is also used for + * sending the GARPs for the router port IPs. + * */ +if (op->peer && op->peer->nbrp && op->peer->od->l3dgw_port && +op->peer->od->l3redirect_port && +smap_get_bool(&op->peer->nbrp->options, + "reside-on-redirect-chassis", false)) { +struct ds garp_info = DS_EMPTY_INITIALIZER; +ds_put_format(&garp_info, "%s", op->peer->lrp_networks.ea_s); +for (size_t i = 0; i < op->peer->lrp_networks.n_ipv4_addrs; + i++) { +ds_put_format(&garp_info, " %s", + op->peer->lrp_networks.ipv4_addrs[i].addr_s); +} +ds_put_format(&garp_info, " is_chassis_resident(%s)", + op->peer->od->l3redirect_port->json_key); + +sbrec_port_binding_update_nat_addresses_addvalue( +op->sb, ds_cstr(&garp_info)); +ds_destroy(&garp_info); +} } + sbrec_port_binding_set_parent_port(op->sb, op->nbsp->parent_name); sbrec_port_binding_set_tag(op->sb, op->nbsp->tag, op->nbsp->n_tag); sbrec_port_binding_set_mac(op->sb, (const char **) op->nbsp->addresses, diff --git a/tests/ovn.at b/tests/ovn.at index daf85a55d..ea627e128 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -9317,9 +9317,10 @@ ovn_start # # (i.e 8.8.8.8 as destination ip). # Physical network: -# # Three hypervisors hv[123]. +# # Four hypervisors hv[1234]. # # hv1 hosts vif foo1. # # hv2 is the "redirect-chassis" that hosts the distributed router gateway port. +# # Later to test GARPs for the router port - foo, hv2 and hv4 are added to the ha_chassis_group # # hv3 hosts nexthop port vif outside1. # # All other tests connect hypervisors to network n1 through br-phys for tunneling. # # But in this test, hv1 won't connect to n1(and no br-phys in hv1), and @@ -9344,6 +9345,8 @@ ovs-vsctl \ start_daemon ovn-controller ovs-vsctl -- add-port br-int hv1-vif1 -- \ set interface hv1-vif1 external-ids:iface-id=foo1 \ +options:tx_pcap=hv1/vif1-tx.pcap \ +options:rxq_pcap=hv1/vif1-rx.pcap \ ofport-request=1 sim_add hv2 @@ -9363,6 +9366,12 @@ ovs-vsctl -- add-port br-int hv3-vif1 -- \ ofport-request=1 ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings="phys:br-phys" +sim_add hv4 +as hv4 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.4 +ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings="public:br-ex,phys:br-phys" + # Create network n2 for vlan connectivity between hv1 and hv2 net_add n2 @@ -9374,6 +9383,10 @@ as hv2 ovs-vsctl add-br br-ex net_attach n2 br-ex +as hv4 +ovs-vsctl add-br br-ex +net_attach n2 br-ex + OVN_POPULATE_ARP ovn-nbctl create Logical_Router name=R1 @@ -9556,
[ovs-dev] [PATCH v3 1/3] ovn-northd: Refactor the code which sets nat_addresses
From: Numan Siddique The present code which sets the Port_Binding.nat_addresses can be simplied. This patch does this. This would help in upcoming commits to set the nat_addresses column with the mac and IPs of distributed logical router ports and logical router ports with 'reside-on-redirect-chassis' set. Signed-off-by: Numan Siddique --- ovn/northd/ovn-northd.c | 32 +--- 1 file changed, 13 insertions(+), 19 deletions(-) diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 0b0a96a3a..c43adb51c 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -2493,23 +2493,12 @@ ovn_port_update_sbrec(struct northd_context *ctx, const char *nat_addresses = smap_get(&op->nbsp->options, "nat-addresses"); +size_t n_nats = 0; +char **nats = NULL; if (nat_addresses && !strcmp(nat_addresses, "router")) { if (op->peer && op->peer->od && (chassis || op->peer->od->l3redirect_port)) { -size_t n_nats; -char **nats = get_nat_addresses(op->peer, &n_nats); -if (n_nats) { -sbrec_port_binding_set_nat_addresses(op->sb, -(const char **) nats, n_nats); -for (size_t i = 0; i < n_nats; i++) { -free(nats[i]); -} -free(nats); -} else { -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); -} -} else { -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); +nats = get_nat_addresses(op->peer, &n_nats); } /* Only accept manual specification of ethernet address * followed by IPv4 addresses on type "l3gateway" ports. */ @@ -2519,15 +2508,20 @@ ovn_port_update_sbrec(struct northd_context *ctx, static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); VLOG_WARN_RL(&rl, "Error extracting nat-addresses."); -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); } else { -sbrec_port_binding_set_nat_addresses(op->sb, - &nat_addresses, 1); destroy_lport_addresses(&laddrs); +n_nats = 1; +nats = xcalloc(1, sizeof *nats); +nats[0] = xstrdup(nat_addresses); } -} else { -sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); } + +sbrec_port_binding_set_nat_addresses(op->sb, + (const char **) nats, n_nats); +for (size_t i = 0; i < n_nats; i++) { +free(nats[i]); +} +free(nats); } sbrec_port_binding_set_parent_port(op->sb, op->nbsp->parent_name); sbrec_port_binding_set_tag(op->sb, op->nbsp->tag, op->nbsp->n_tag); -- 2.21.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] ovn-controller: Provide the option to configure inactivity probe interval for OpenFlow conn
From: Numan Siddique If the ovn-controller main loop takes more than 5 seconds (if there are lots of logical flows) before it calls poll_block(), it causes the poll_block to wake up immediately, since rconn module has to send echo request. With the incremental processing, this is not an issue as ovn-controller will not recompute again. But for older versions, this is an issue as it causes flow recomputations and this would result in 100% cpu all the time. With this patch, CMS can configure a higher value depending the workload. The main intention of this patch is to fix this recompuation issue for older versions (there by requesting backport), it still would be beneficial with the incremental processing engine. Signed-off-by: Numan Siddique Tested-by: Dumitru Ceara --- v1 -> v2 - * Defined the macro OFCTRL_DEFAULT_PROBE_INTERVAL_SEC instead of using hard coded value of 5 in get_ofctrl_probe_interval() as suggested by Dumitru Ceara ovn/controller/ofctrl.c | 14 -- ovn/controller/ofctrl.h | 4 +++- ovn/controller/ovn-controller.8.xml | 14 ++ ovn/controller/ovn-controller.c | 14 +- 4 files changed, 42 insertions(+), 4 deletions(-) diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c index 47a915aea..043abd69d 100644 --- a/ovn/controller/ofctrl.c +++ b/ovn/controller/ofctrl.c @@ -169,9 +169,11 @@ static void ofctrl_recv(const struct ofp_header *, enum ofptype); void ofctrl_init(struct ovn_extend_table *group_table, -struct ovn_extend_table *meter_table) +struct ovn_extend_table *meter_table, +int inactivity_probe_interval) { -swconn = rconn_create(5, 0, DSCP_DEFAULT, 1 << OFP13_VERSION); +swconn = rconn_create(inactivity_probe_interval, 0, + DSCP_DEFAULT, 1 << OFP13_VERSION); tx_counter = rconn_packet_counter_create(); hmap_init(&installed_flows); ovs_list_init(&flow_updates); @@ -1381,3 +1383,11 @@ ofctrl_is_connected(void) { return rconn_is_connected(swconn); } + +void +ofctrl_set_probe_interval(int probe_interval) +{ +if (swconn) { +rconn_set_probe_interval(swconn, probe_interval); +} +} diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h index b39cdf88b..ed8918aae 100644 --- a/ovn/controller/ofctrl.h +++ b/ovn/controller/ofctrl.h @@ -41,7 +41,8 @@ struct ovn_desired_flow_table { /* Interface for OVN main loop. */ void ofctrl_init(struct ovn_extend_table *group_table, - struct ovn_extend_table *meter_table); + struct ovn_extend_table *meter_table, + int inactivity_probe_interval); void ofctrl_run(const struct ovsrec_bridge *br_int, struct shash *pending_ct_zones); enum mf_field_id ofctrl_get_mf_field_id(void); @@ -81,5 +82,6 @@ void ofctrl_check_and_add_flow(struct ovn_desired_flow_table *, const struct uuid *, bool log_duplicate_flow); bool ofctrl_is_connected(void); +void ofctrl_set_probe_interval(int probe_interval); #endif /* ovn/ofctrl.h */ diff --git a/ovn/controller/ovn-controller.8.xml b/ovn/controller/ovn-controller.8.xml index 9721d9a5b..8f9c64838 100644 --- a/ovn/controller/ovn-controller.8.xml +++ b/ovn/controller/ovn-controller.8.xml @@ -112,6 +112,20 @@ + external_ids:ovn-openflow-probe-interval + + + The inactivity probe interval of the OpenFlow connection to the + OpenvSwitch integration bridge, in seconds. + If the value is zero, it disables the connection keepalive feature. + + + + If the value is nonzero, then it will be forced to a value of + at least 5s. + + + external_ids:ovn-encap-type diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index 60190161f..c2c68e467 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -73,6 +73,7 @@ static unixctl_cb_func ovn_controller_conn_show; #define DEFAULT_BRIDGE_NAME "br-int" #define DEFAULT_PROBE_INTERVAL_MSEC 5000 +#define OFCTRL_DEFAULT_PROBE_INTERVAL_SEC 5 #define CONTROLLER_LOOP_STOPWATCH_NAME "ovn-controller-flow-generation" @@ -390,6 +391,15 @@ update_ssl_config(const struct ovsrec_ssl_table *ssl_table) } } +static int +get_ofctrl_probe_interval(struct ovsdb_idl *ovs_idl) +{ +const struct ovsrec_open_vswitch *cfg = ovsrec_open_vswitch_first(ovs_idl); +return smap_get_int(&cfg->external_ids, +"ovn-openflow-probe-interval", +OFCTRL_DEFAULT_PROBE_INTERVAL_SEC); +} + /* Retrieves the pointer to the OVN Southbound database from 'ovs_idl' and * updates 'sbdb_idl' with that pointer. */ static void @@ -1817,7 +1827,8 @@ main(int argc, char *argv[]) engine_init(&en_flow_output); ofctrl_init(&ed_flow_output.group_table, -
[ovs-dev] [PATCH v3 0/3] Handle GARPs for logical router port IPs
From: Numan Siddique The v1 of the patch series had just one patch which handled sending GARPs for the logical router ports with the option - reside-on-redirect-chassis set. The v2+ has totall 3 patches. Patch 1 is a simple refactor in ovn-northd code which sets the Port_Binding.nat_addresses column in Southbound db. Patch 2 takes care of sending GARPs for the logical router ports which has the option - reside-on-redirect-chassis set. This option is used when provider (or bridged) logical switches are connected a logical router with a distributed gateway router port. Patch 3 takes care of sending GARPs for the IPs of the distributed gateway router port. It also handles sendig the GARPs for the IPs of the router ports which belong to a gateway router and whoe peer is connected to a provider (or bridged) logical switch. --- v2 -> v3 --- * Addressed review comments from Dumitru for p1 and p2. Numan Siddique (3): ovn-northd: Refactor the code which sets nat_addresses ovn: Send GARP for the router ports with reside-on-redirect-chassis options set ovn: Send GARP for router port IPs of a router port connected to bridged logical switch ovn/northd/ovn-northd.c | 99 +-- tests/ovn.at| 147 +--- 2 files changed, 201 insertions(+), 45 deletions(-) -- 2.21.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev