Re: [ovs-dev] [PATCH 1/6] Generic radix trees
On Sat, May 26, 2018 at 11:16:42AM +0800, Liu Bo wrote:
> > +/*
> > + * Returns pointer to the specified byte @offset within @radix, allocating it if
> > + * necessary - newly allocated slots are always zeroed out:
> > + */
> > +void *__genradix_ptr_alloc(struct __genradix *radix, size_t offset,
> > +                           gfp_t gfp_mask)
> > +{
> > +       struct genradix_node **n;
>
> Any reason that "struct genradix_node **" is used here instead of
> "struct genradix_node *"?
>
> Looks like this function only manipulates *n, am I missing something?

It stores to *n when it has to allocate a node (including the root).

___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
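To illustrate the answer above, here is a minimal sketch of why a walk that may allocate keeps a pointer to the slot rather than a pointer to the node. It is not the genradix code: `struct node`, `FANOUT`, and `node_get_or_alloc()` are hypothetical names, and `calloc()` stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

#define FANOUT 4   /* hypothetical fan-out, not the genradix value */

struct node {
    struct node *children[FANOUT];
    int value;
};

/*
 * Walk `depth` levels below *root, following path[0..depth-1], and allocate
 * any node that is missing on the way.  `n` always points at the slot that
 * holds the current node, so a fresh allocation can be stored through *n,
 * including into the root pointer itself.
 */
static struct node *node_get_or_alloc(struct node **root,
                                      const unsigned *path, size_t depth)
{
    struct node **n = root;
    size_t level = 0;

    for (;;) {
        if (!*n) {
            *n = calloc(1, sizeof(**n));   /* new nodes start zeroed */
            if (!*n)
                return NULL;
        }
        if (level == depth)
            return *n;
        assert(path[level] < FANOUT);
        n = &(*n)->children[path[level]];  /* step to the child's slot */
        level++;
    }
}
```

With a plain `struct node *` the function could read the tree, but it would have no place to record a newly allocated node in its parent or in the root pointer.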
Re: [ovs-dev] [PATCH 1/6] Generic radix trees
Hi Kent,

(Add all ML to cc this time.)

On Wed, May 23, 2018 at 9:18 AM, Kent Overstreet wrote:
> Very simple radix tree implementation that supports storing arbitrary
> size entries, up to PAGE_SIZE - upcoming patches will convert existing
> flex_array users to genradixes. The new genradix code has a much simpler
> API and implementation, and doesn't have a hard limit on the number of
> elements like flex_array does.
>
> Signed-off-by: Kent Overstreet
> ---
>  include/linux/generic-radix-tree.h | 222 +
>  lib/Makefile                       |   3 +-
>  lib/generic-radix-tree.c           | 180 +++
>  3 files changed, 404 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/generic-radix-tree.h
>  create mode 100644 lib/generic-radix-tree.c
>
> diff --git a/include/linux/generic-radix-tree.h b/include/linux/generic-radix-tree.h
> new file mode 100644
> index 00..3328813322
> --- /dev/null
> +++ b/include/linux/generic-radix-tree.h
> @@ -0,0 +1,222 @@
> +#ifndef _LINUX_GENERIC_RADIX_TREE_H
> +#define _LINUX_GENERIC_RADIX_TREE_H
> +
> +/*
> + * Generic radix trees/sparse arrays:
> + *
> + * Very simple and minimalistic, supporting arbitrary size entries up to
> + * PAGE_SIZE.
> + *
> + * A genradix is defined with the type it will store, like so:
> + * static GENRADIX(struct foo) foo_genradix;
> + *
> + * The main operations are:
> + * - genradix_init(radix) - initialize an empty genradix
> + *
> + * - genradix_free(radix) - free all memory owned by the genradix and
> + *   reinitialize it
> + *
> + * - genradix_ptr(radix, idx) - gets a pointer to the entry at idx, returning
> + *   NULL if that entry does not exist
> + *
> + * - genradix_ptr_alloc(radix, idx, gfp) - gets a pointer to an entry,
> + *   allocating it if necessary
> + *
> + * - genradix_for_each(radix, iter, p) - iterate over each entry in a genradix
> + *
> + * The radix tree allocates one page of entries at a time, so entries may exist
> + * that were never explicitly allocated - they will be initialized to all
> + * zeroes.
> + *
> + * Internally, a genradix is just a radix tree of pages, and indexing works in
> + * terms of byte offsets. The wrappers in this header file use sizeof on the
> + * type the radix contains to calculate a byte offset from the index - see
> + * __idx_to_offset.
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +
> +struct genradix_node;
> +
> +struct __genradix {
> +       struct genradix_node    *root;
> +       size_t                  depth;
> +};
> +
> +#define __GENRADIX_INITIALIZER                 \
> +       {                                       \
> +               .tree = {                       \
> +                       .root = NULL,           \
> +                       .depth = 0,             \
> +               }                               \
> +       }
> +
> +/*
> + * We use a 0 size array to stash the type we're storing without taking any
> + * space at runtime - then the various accessor macros can use typeof() to get
> + * to it for casts/sizeof - we also force the alignment so that storing a type
> + * with a ridiculous alignment doesn't blow up the alignment or size of the
> + * genradix.
> + */
> +
> +#define GENRADIX(_type)                        \
> +struct {                                       \
> +       struct __genradix       tree;           \
> +       _type                   type[0] __aligned(1);   \
> +}
> +
> +#define DEFINE_GENRADIX(_name, _type)          \
> +       GENRADIX(_type) _name = __GENRADIX_INITIALIZER
> +
> +/**
> + * genradix_init - initialize a genradix
> + * @_radix: genradix to initialize
> + *
> + * Does not fail
> + */
> +#define genradix_init(_radix)                                  \
> +do {                                                           \
> +       *(_radix) = (typeof(*_radix)) __GENRADIX_INITIALIZER;   \
> +} while (0)
> +
> +void __genradix_free(struct __genradix *);
> +
> +/**
> + * genradix_free: free all memory owned by a genradix
> + *
> + * After freeing, @_radix will be reinitialized and empty
> + */
> +#define genradix_free(_radix)  __genradix_free(&(_radix)->tree)
> +
> +static inline size_t __idx_to_offset(size_t idx, size_t obj_size)
> +{
> +       if (__builtin_constant_p(obj_size))
> +               BUILD_BUG_ON(obj_size > PAGE_SIZE);
> +       else
> +               BUG_ON(obj_size > PAGE_SIZE);
> +
> +       if (!is_power_of_2(obj_size)) {
> +               size_t objs_per_page = PAGE_SIZE / obj_size;
> +
> +               return (idx / objs_per_page) * PAGE_SIZE +
> +                       (idx % objs_per_page) * obj_size;
> +       } else {
> +               return idx * obj_size;
>
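The index-to-offset mapping quoted above can be tried out in user space. The following is a sketch under assumptions, not the kernel header: `SKETCH_PAGE_SIZE` is assumed to be 4096, `is_pow2()` stands in for `is_power_of_2()`, and `assert()` replaces the `BUG_ON()` checks.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Stand-in for the kernel's is_power_of_2(). */
static int is_pow2(size_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Map an element index to a byte offset.  Objects never straddle a page:
 * when obj_size does not divide the page evenly, the tail of each page is
 * left unused and the next object starts on the following page.
 */
static size_t idx_to_offset(size_t idx, size_t obj_size)
{
    assert(obj_size > 0 && obj_size <= SKETCH_PAGE_SIZE);

    if (!is_pow2(obj_size)) {
        size_t objs_per_page = SKETCH_PAGE_SIZE / obj_size;

        return (idx / objs_per_page) * SKETCH_PAGE_SIZE +
               (idx % objs_per_page) * obj_size;
    }
    /* Power-of-two sizes pack pages exactly, so plain scaling works. */
    return idx * obj_size;
}
```

For example, with 24-byte objects 170 of them fit in one assumed 4096-byte page, so index 169 sits at offset 4056 and index 170 starts the next page at offset 4096.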
Re: [ovs-dev] (no subject)
Your winnings have been processed and are ready to be transferred to you, so contact me as soon as possible to receive them.
Re: [ovs-dev] [branch-2.9 1/2] Set release date for 2.9.2.
All of these releases seem fine to me.

Acked-by: Ben Pfaff
[ovs-dev] [branch-2.9 1/2] Set release date for 2.9.2.
Signed-off-by: Justin Pettit--- NEWS | 3 ++- debian/changelog | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/NEWS b/NEWS index 06f4a6475f81..452d0dd0bb02 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,6 @@ -v2.9.2 - xx xxx +v2.9.2 - 25 May 2018 + - Bug fixes v2.9.1 - 18 May 2018 diff --git a/debian/changelog b/debian/changelog index 7ed7b62b2ea7..2f216a6bb53e 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,8 +1,8 @@ openvswitch (2.9.2-1) unstable; urgency=low - + [ Open vSwitch team ] * New upstream version - -- Open vSwitch team Fri, 18 May 2018 13:58:21 -0700 + -- Open vSwitch team Fri, 25 May 2018 16:37:05 -0700 openvswitch (2.9.1-1) unstable; urgency=low -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch-2.9 2/2] Prepare for 2.9.3.
Signed-off-by: Justin Pettit--- NEWS | 3 +++ configure.ac | 2 +- debian/changelog | 6 ++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index 452d0dd0bb02..622c230ae5b0 100644 --- a/NEWS +++ b/NEWS @@ -1,3 +1,6 @@ +v2.9.3 - xx xxx + + v2.9.2 - 25 May 2018 - Bug fixes diff --git a/configure.ac b/configure.ac index e0f1bb931168..8c84a781f2c0 100644 --- a/configure.ac +++ b/configure.ac @@ -13,7 +13,7 @@ # limitations under the License. AC_PREREQ(2.63) -AC_INIT(openvswitch, 2.9.2, b...@openvswitch.org) +AC_INIT(openvswitch, 2.9.3, b...@openvswitch.org) AC_CONFIG_SRCDIR([datapath/datapath.c]) AC_CONFIG_MACRO_DIR([m4]) AC_CONFIG_AUX_DIR([build-aux]) diff --git a/debian/changelog b/debian/changelog index 2f216a6bb53e..b52e98ddfbad 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +openvswitch (2.9.3-1) unstable; urgency=low + [ Open vSwitch team ] + * New upstream version + + -- Open vSwitch team Fri, 25 May 2018 16:37:38 -0700 + openvswitch (2.9.2-1) unstable; urgency=low [ Open vSwitch team ] * New upstream version -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch-2.8 2/2] Prepare for 2.8.5.
Signed-off-by: Justin Pettit--- NEWS | 3 +++ configure.ac | 2 +- debian/changelog | 6 ++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index 047f8751bfab..101ded1ddca0 100644 --- a/NEWS +++ b/NEWS @@ -1,3 +1,6 @@ +v2.8.5 - xx xxx + + v2.8.4 - 25 May 2018 - Bug fixes diff --git a/configure.ac b/configure.ac index f92c870df505..d07441927701 100644 --- a/configure.ac +++ b/configure.ac @@ -13,7 +13,7 @@ # limitations under the License. AC_PREREQ(2.63) -AC_INIT(openvswitch, 2.8.4, b...@openvswitch.org) +AC_INIT(openvswitch, 2.8.5, b...@openvswitch.org) AC_CONFIG_SRCDIR([datapath/datapath.c]) AC_CONFIG_MACRO_DIR([m4]) AC_CONFIG_AUX_DIR([build-aux]) diff --git a/debian/changelog b/debian/changelog index 8450720bebe3..a48f34b195df 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +openvswitch (2.8.5-1) unstable; urgency=low + [ Open vSwitch team ] + * New upstream version + + -- Open vSwitch team Fri, 25 May 2018 16:32:19 -0700 + openvswitch (2.8.4-1) unstable; urgency=low [ Open vSwitch team ] * New upstream version -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch-2.8 1/2] Set release date for 2.8.4.
Signed-off-by: Justin Pettit--- NEWS | 6 +++--- debian/changelog | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/NEWS b/NEWS index c8b5659e363c..047f8751bfab 100644 --- a/NEWS +++ b/NEWS @@ -1,6 +1,6 @@ -v2.8.4 - xx xxx -- - +v2.8.4 - 25 May 2018 + + - Bug fixes v2.8.3 - 18 May 2018 - diff --git a/debian/changelog b/debian/changelog index 9149fc53b573..8450720bebe3 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,9 +1,9 @@ -openvswitch (2.8.4-1) unstable; urgency=medium - +openvswitch (2.8.4-1) unstable; urgency=low [ Open vSwitch team ] * New upstream version - -- Open vSwitch team Fri, 18 May 2018 14:05:13 -0700 + -- Open vSwitch team Fri, 25 May 2018 16:31:49 -0700 + openvswitch (2.8.3-1) unstable; urgency=medium -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch-2.7 1/2] Set release date for 2.7.6.
Signed-off-by: Justin Pettit--- NEWS | 4 ++-- debian/changelog | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/NEWS b/NEWS index f4046baa5a76..2a469b7df446 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,5 @@ -v2.7.6 - xx xxx -- +v2.7.6 - 25 May 2018 + - Bug fixes v2.7.5 - 18 May 2018 diff --git a/debian/changelog b/debian/changelog index dbbbc644a492..2e570a3971d1 100644 --- a/debian/changelog +++ b/debian/changelog @@ -2,7 +2,7 @@ openvswitch (2.7.6-1) unstable; urgency=low [ Open vSwitch team ] * New upstream version - -- Open vSwitch team Fri, 18 May 2018 14:10:17 -0700 + -- Open vSwitch team Fri, 25 May 2018 16:28:06 -0700 openvswitch (2.7.5-1) unstable; urgency=low [ Open vSwitch team ] -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch-2.7 2/2] Prepare for 2.7.7.
Signed-off-by: Justin Pettit--- NEWS | 3 +++ configure.ac | 2 +- debian/changelog | 6 ++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index 2a469b7df446..aa1e476f3557 100644 --- a/NEWS +++ b/NEWS @@ -1,3 +1,6 @@ +v2.7.7 - xx xxx + + v2.7.6 - 25 May 2018 - Bug fixes diff --git a/configure.ac b/configure.ac index 2fb4cfc82dbd..9e8c9a586f30 100644 --- a/configure.ac +++ b/configure.ac @@ -13,7 +13,7 @@ # limitations under the License. AC_PREREQ(2.63) -AC_INIT(openvswitch, 2.7.6, b...@openvswitch.org) +AC_INIT(openvswitch, 2.7.7, b...@openvswitch.org) AC_CONFIG_SRCDIR([datapath/datapath.c]) AC_CONFIG_MACRO_DIR([m4]) AC_CONFIG_AUX_DIR([build-aux]) diff --git a/debian/changelog b/debian/changelog index 2e570a3971d1..744c7a4e6a70 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +openvswitch (2.7.7-1) unstable; urgency=low + [ Open vSwitch team ] + * New upstream version + + -- Open vSwitch team Fri, 25 May 2018 16:28:32 -0700 + openvswitch (2.7.6-1) unstable; urgency=low [ Open vSwitch team ] * New upstream version -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch-2.6 2/2] Prepare for 2.6.4.
Signed-off-by: Justin Pettit--- NEWS | 3 +++ configure.ac | 2 +- debian/changelog | 6 ++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index 7fec7dc3db51..dcef1e73a199 100644 --- a/NEWS +++ b/NEWS @@ -1,3 +1,6 @@ +v2.6.4 - xx xxx + + v2.6.3 - 25 May 2018 - Bug fixes diff --git a/configure.ac b/configure.ac index 06a4869f0907..893b3a5a8107 100644 --- a/configure.ac +++ b/configure.ac @@ -13,7 +13,7 @@ # limitations under the License. AC_PREREQ(2.63) -AC_INIT(openvswitch, 2.6.3, b...@openvswitch.org) +AC_INIT(openvswitch, 2.6.4, b...@openvswitch.org) AC_CONFIG_SRCDIR([datapath/datapath.c]) AC_CONFIG_MACRO_DIR([m4]) AC_CONFIG_AUX_DIR([build-aux]) diff --git a/debian/changelog b/debian/changelog index 3be10642380f..47ae614fe721 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +openvswitch (2.6.4-1) unstable; urgency=low + [ Open vSwitch team ] + * New upstream version + + -- Open vSwitch team Fri, 25 May 2018 16:26:35 -0700 + openvswitch (2.6.3-1) unstable; urgency=low [ Open vSwitch team ] * New upstream version -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch-2.6 1/2] Set release date for 2.6.3.
Signed-off-by: Justin Pettit--- NEWS | 5 +++-- debian/changelog | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/NEWS b/NEWS index efbae2aced3b..7fec7dc3db51 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,6 @@ -v2.6.3 - xx xxx -- +v2.6.3 - 25 May 2018 + + - Bug fixes - DPDK: * Add LSC interrupt support for DPDK physical devices. diff --git a/debian/changelog b/debian/changelog index b121f8fc654a..3be10642380f 100644 --- a/debian/changelog +++ b/debian/changelog @@ -2,7 +2,7 @@ openvswitch (2.6.3-1) unstable; urgency=low [ Open vSwitch team ] * New upstream version - -- Open vSwitch team Fri, 18 May 2018 14:18:43 -0700 + -- Open vSwitch team Fri, 25 May 2018 16:26:08 -0700 openvswitch (2.6.2-1) unstable; urgency=low [ Open vSwitch team ] -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ovsdb-idl: Correct singleton insert logic
On Thu, May 17, 2018 at 01:16:55PM -0400, Mark Michelson wrote:
> When inserting data into a "singleton" table (one that has maxRows ==
> 1), there is a check that ensures that the table is currently empty
> before inserting the row. The intention is to prevent races where
> multiple clients might attempt to insert rows at the same time.
>
> The problem is that this singleton check can cause legitimate
> transactions to fail. Specifically, a transaction that attempts to
> delete the current content of the table and insert new data will cause
> the singleton check to fail since the table currently has data.
>
> This patch corrects the issue by keeping a count of the rows being
> deleted and added to singleton tables. If the total is larger than zero,
> then the net operation is attempting to insert rows. If the total is
> less than zero, then the net operation is attempting to remove rows. If
> the total is zero, then the operation is inserting and deleting an equal
> number of rows (or is just updating rows). We only add the singleton
> check if the total is larger than zero.
>
> This patch also includes a new test for singleton tables that ensures
> that the maxRows constraint works as expected.
>
> Signed-off-by: Mark Michelson

Good catch.  (It's kind of weird to delete and repopulate a singleton
table, but we should support it.)

I think that the following patch achieves the same end with less
bookkeeping overhead.  What do you think?  It doesn't break anything in
the testsuite, but I didn't check that it actually achieves the purpose.

diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index c6ff78e25a04..1453025acbd6 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -3929,6 +3929,39 @@ ovsdb_idl_txn_commit(struct ovsdb_idl_txn *txn)

     /* Add updates. */
     any_updates = false;
+
+    /* For tables constrained to have only a single row (a fairly common OVSDB
+     * pattern for storing global data), identify whether we're inserting a
+     * row.  If so, then verify that the table is empty before inserting the
+     * row.  This gives us a clear verification-related failure if there was an
+     * insertion race with another client. */
+    for (size_t i = 0; i < txn->db->class_->n_tables; i++) {
+        struct ovsdb_idl_table *table = &txn->db->tables[i];
+        if (table->class_->is_singleton) {
+            /* Count the number of rows in the table before and after our
+             * transaction commits.  This is O(n) in the number of rows in the
+             * table, but that's OK since we know that the table should only
+             * have one row. */
+            size_t initial_rows = 0;
+            size_t final_rows = 0;
+            HMAP_FOR_EACH (row, hmap_node, &table->rows) {
+                initial_rows += row->old_datum != NULL;
+                final_rows += row->new_datum != NULL;
+            }
+
+            if (initial_rows == 0 && final_rows == 1) {
+                struct json *op = json_object_create();
+                json_array_add(operations, op);
+                json_object_put_string(op, "op", "wait");
+                json_object_put_string(op, "table", table->class_->name);
+                json_object_put(op, "where", json_array_create_empty());
+                json_object_put(op, "timeout", json_integer_create(0));
+                json_object_put_string(op, "until", "==");
+                json_object_put(op, "rows", json_array_create_empty());
+            }
+        }
+    }
+
     HMAP_FOR_EACH (row, txn_node, &txn->txn_rows) {
         const struct ovsdb_idl_table_class *class = row->table->class_;

@@ -3947,23 +3980,6 @@ ovsdb_idl_txn_commit(struct ovsdb_idl_txn *txn)
             struct json *row_json;
             size_t idx;

-            if (!row->old_datum && class->is_singleton) {
-                /* We're inserting a row into a table that allows only a
-                 * single row.  (This is a fairly common OVSDB pattern for
-                 * storing global data.)  Verify that the table is empty
-                 * before inserting the row, so that we get a clear
-                 * verification-related failure if there was an insertion
-                 * race with another client. */
-                struct json *op = json_object_create();
-                json_array_add(operations, op);
-                json_object_put_string(op, "op", "wait");
-                json_object_put_string(op, "table", class->name);
-                json_object_put(op, "where", json_array_create_empty());
-                json_object_put(op, "timeout", json_integer_create(0));
-                json_object_put_string(op, "until", "==");
-                json_object_put(op, "rows", json_array_create_empty());
-            }
-
             struct json *op = json_object_create();
             json_object_put_string(op, "op",
                                    row->old_datum ? "update" : "insert");
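The decision at the heart of the proposed patch can be isolated from the IDL and JSON plumbing. Below is a minimal sketch, assuming a hypothetical `struct sketch_row` that only records whether a row exists before and after the transaction; the real code inspects `old_datum` and `new_datum` on IDL rows instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IDL row: only records whether the row
 * exists before the transaction (has_old) and after it (has_new). */
struct sketch_row {
    bool has_old;
    bool has_new;
};

/*
 * Return true when the transaction takes a singleton table from empty to
 * exactly one row, which is the only case where the "table is empty"
 * wait operation would be added.
 */
static bool singleton_needs_wait(const struct sketch_row *rows, size_t n)
{
    size_t initial_rows = 0;
    size_t final_rows = 0;

    for (size_t i = 0; i < n; i++) {
        initial_rows += rows[i].has_old;
        final_rows += rows[i].has_new;
    }
    return initial_rows == 0 && final_rows == 1;
}
```

A transaction that deletes the existing row and inserts a replacement has one row before and one row after, so this sketch returns false and no emptiness check would be requested; a plain insert into an empty table returns true.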
Re: [ovs-dev] [PATCH v4 0/3] Use improved dp_hash select group by default
On Thu, May 24, 2018 at 05:27:58PM +0200, Jan Scheurich wrote:
> The current default OpenFlow select group implementation sends every new L4 flow
> to the slow path for the balancing decision and installs a 5-tuple "miniflow"
> in the datapath to forward subsequent packets of the connection accordingly.
> Clearly this has major scalability issues with many parallel L4 flows and high
> connection setup rates.
>
> The dp_hash selection method for the OpenFlow select group was added to OVS
> as an alternative. It avoids the scalability issues for the price of an
> additional recirculation in the datapath. The dp_hash method is only available
> to OF1.5 SDN controllers speaking the Netronome Group Mod extension to
> configure the selection mechanism. This severely limited the applicability of
> the dp_hash select group in the past.
>
> Furthermore, testing revealed that the implemented dp_hash selection often
> generated a very uneven distribution of flows over group buckets and didn't
> consider bucket weights at all.
>
> The present patch set in a first step improves the dp_hash selection method to
> much more accurately distribute flows over weighted group buckets and to
> apply a symmetric dp_hash function to maintain the symmetry property of the
> legacy hash function. In a second step it makes the improved dp_hash method
> the default in OVS for select groups that can be accurately handled by dp_hash.
> That should be the vast majority of cases. Otherwise we fall back to the
> legacy slow-path selection method.
>
> The Netronome extension can still be used to override the default decision and
> require the legacy slow-path or the dp_hash selection method.

Thanks a lot.  I applied this series to master.
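As a rough illustration of what distributing hash values over weighted buckets can look like, here is a generic sketch. It is not taken from the patch series, which does not spell out its algorithm in this message; `SKETCH_N_SLOTS`, `fill_slots()`, and `select_bucket()` are hypothetical names, and the simple cumulative split below is only one possible way to apportion slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_N_SLOTS 16u   /* hypothetical table size; must be a power of two */

/*
 * Fill table[] so that bucket b owns roughly weights[b] / total of the
 * slots.  The cumulative split guarantees that every slot is assigned.
 */
static void fill_slots(const unsigned *weights, size_t n_buckets,
                       uint8_t table[SKETCH_N_SLOTS])
{
    unsigned total = 0;
    unsigned cum = 0;
    size_t slot = 0;

    assert(n_buckets > 0 && n_buckets <= 256);
    for (size_t b = 0; b < n_buckets; b++)
        total += weights[b];
    assert(total > 0);

    for (size_t b = 0; b < n_buckets; b++) {
        cum += weights[b];
        size_t end = (size_t)cum * SKETCH_N_SLOTS / total;
        while (slot < end)
            table[slot++] = (uint8_t)b;
    }
}

/* Pick a bucket from a precomputed hash without consulting the slow path. */
static unsigned select_bucket(const uint8_t table[SKETCH_N_SLOTS],
                              uint32_t hash)
{
    return table[hash & (SKETCH_N_SLOTS - 1)];
}
```

With weights 1 and 3 the first bucket receives 4 of the 16 slots and the second receives 12. A bucket whose weight is very small relative to the total can end up with no slot at all in this naive split, which is one reason a real implementation would need a more careful apportionment or a fallback.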
Re: [ovs-dev] [PATCH 7/7] ovs-sim: Support backup and clustered databases for ovn.
On Thu, May 24, 2018 at 10:20:26PM -0700, Justin Pettit wrote:
>
> > On May 18, 2018, at 2:37 PM, Ben Pfaff wrote:
> >
> > Signed-off-by: Ben Pfaff
>
> Acked-by: Justin Pettit

Thanks for all the reviews.  I applied your suggestions and then applied
the series to master.
Re: [ovs-dev] Re: Re: Question: why ovs flush all datapath flows when destroying br/ofproto?
OK.

On Fri, May 25, 2018 at 03:46:31AM +, Lilijun (Jerry, Cloud Networking) wrote:
> Hi Ben,
>
> Thanks very much for your reply.
>
> Currently, if we delete a port, ovs only flushes the related datapath flows.
>
> So I have a suggestion:
> when deleting the ovs bridge br-plyXXX, we can flush only the datapath flows
> related to br-plyXXX's ports instead of flushing all datapath flows.
>
> Is that OK?
>
> Thanks.
>
> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: May 25, 2018 1:13
> To: Lilijun (Jerry, Cloud Networking)
> Cc: d...@openvswitch.org; ovs-disc...@openvswitch.org; liucheng (J); lixiao (H)
> Subject: Re: Re: Question: why ovs flush all datapath flows when destroying br/ofproto?
>
> It's more work to selectively flush and no one has implemented that.
>
> On Thu, May 24, 2018 at 02:55:00AM +, Lilijun (Jerry, Cloud Networking) wrote:
> > Does anybody have any ideas about this question?
> > Thanks.
> >
> > -Original Message-
> > From: Lilijun (Jerry, Cloud Networking)
> > Sent: May 18, 2018 11:02
> > To: d...@openvswitch.org; 'Ben Pfaff'; ovs-disc...@openvswitch.org
> > Cc: liucheng (J); lixiao (H)
> > Subject: Question: why ovs flush all datapath flows when destroying br/ofproto?
> >
> > Hi all,
> >
> > In my test, there are many VMs sending and receiving random flows using
> > ovs. Each VM has one virtio-net NIC and connects to an ovs bridge named
> > something like br-plyXXX. These br-plyXXX bridges are all connected to
> > br-int using patch ports.
> > All the VMs' traffic can generate about 200 thousand datapath flows. I found
> > that all datapath flows were flushed when deleting one VM and its own
> > br-plyXXX.
> > The implementation is in the function ofproto_destroy(), which calls
> > ofproto_flush__(). The flush deletes all datapath flows and recreates
> > the upcall and revalidator threads.
> > As a result, all traffic of the other VMs misses the datapath flows and
> > is sent as upcalls to be matched in the slow path by the upcall threads.
> > Their packets will be dropped or suffer worse latency.
> >
> > So, why does ovs flush all datapath flows when destroying a br/ofproto? Can we
> > flush only the datapath flows related to the ofproto being destroyed?
> >
> >
> > B.R.
> > Jerry
Re: [ovs-dev] [PATCH] vconn: Remove obsolete comment.
On Thu, May 24, 2018 at 04:46:01PM -0700, Justin Pettit wrote:
>
> > On May 23, 2018, at 4:39 PM, Ben Pfaff wrote:
> >
> > Signed-off-by: Ben Pfaff
>
> Acked-by: Justin Pettit

Thanks, applied to master.
Re: [ovs-dev] [PATCH 3/7] ovs-sim: Convert documentation to RST format.
On Thu, May 24, 2018 at 05:23:03PM -0700, Justin Pettit wrote:
>
> > On May 18, 2018, at 2:37 PM, Ben Pfaff wrote:
> >
> > diff --git a/Documentation/conf.py b/Documentation/conf.py
> > index babda21defff..d81fcaeb440d 100644
> > --- a/Documentation/conf.py
> > +++ b/Documentation/conf.py
> > ...
> > @@ -123,7 +125,7 @@ _man_pages = [
> >      ('ovsdb.5',
> >       u'Open vSwitch Database (File Formats)'),
> >      ('ovsdb.7',
> > -     u'Open vSwitch Database (Overview)'),
> > +     u'Open vSwitch Database (Overview)',),
>
> Is this added comma intentional?

No.  Removed.

> > diff --git a/Documentation/ref/ovs-sim.1.rst b/Documentation/ref/ovs-sim.1.rst
> > new file mode 100644
> > index ..faeee1791050
> > --- /dev/null
> > +++ b/Documentation/ref/ovs-sim.1.rst
> > ...
> > +   See `Commands`_, below, for an explanation.
> > +
> > +4. Runs any scripts specified on the command line (see Options
>
> Do you want to specify a link by referring to it as "`Options`_,"?

Yes.  Fixed, thanks.

> > diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
> > index 9505c6c3d98b..aa60b9bc3157 100755
> > --- a/tutorial/ovs-sandbox
> > +++ b/tutorial/ovs-sandbox
> > ...
> > +for i in $(seq $n_controllers); do
> > +    rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
> > +        $OVN_CTRLR_PKI --detach --no-chdir --pidfile -vconsole:off --log-file
> > +done
>
> Do you want to distinguish the separate ovn-controller instance pidfiles and
> logfiles by integrating a name with "$i" in it?

Yes.  I don't know how this ended up in a patch that's just supposed to
be documentation.  I split it into a separate patch.

> Acked-by: Justin Pettit
>
> --Justin
Re: [ovs-dev] [PATCH 1/7] ovsdb: Improve torture test for clusters.
On Thu, May 24, 2018 at 05:01:48PM -0700, Justin Pettit wrote:
>
> > On May 18, 2018, at 2:37 PM, Ben Pfaff wrote:
> >
> > This test is supposed to be parameterized, but one of the loops didn't
> > honor the parameterization and just had hardcoded values.  Also, the
> > output comparison didn't work properly for more than 100 client sets
> > (n1 > 100), so this adds some explicit sorting to the mix.
> >
> > Signed-off-by: Ben Pfaff
> > ---
> >  tests/ovsdb-cluster.at | 8
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
> > index e161d1abfc2a..cc66228c6265 100644
> > --- a/tests/ovsdb-cluster.at
> > +++ b/tests/ovsdb-cluster.at
> > @@ -196,12 +196,12 @@ ovsdb|WARN|schema: changed 2 columns in 'OVN_Southbound' database from ephemeral
> >      echo "...done"
> >      AT_CHECK([if test $phase != 2; then exit 77; fi])
> >
> > -    for i in `seq 0 9`; do
> > -        for j in `seq 5`; do
> > +    for i in $(seq 0 $(expr $n1 - 1) ); do
> > +        for j in `seq $n2`; do
> >              echo "$i-$j=$i-$j"
> >          done
> > -    done > expout
> > -    AT_CHECK([ovn-sbctl --timeout=30 --log-file=finalize.log -vtimeval:off -vfile -vsyslog:off --bare get SB_Global . external-ids | sed 's/, /\n/g; s/[[{}""]]//g;'], [0], [expout])
> > +    done | sort > expout
> > +    AT_CHECK([ovn-sbctl --timeout=30 --log-file=finalize.log -vtimeval:off -vfile -vsyslog:off --bare get SB_Global . external-ids | sed 's/, /\n/g; s/[[{}""]]//g;' | sort], [0], [expout])
>
> I believe the original code, as well as this new version, mixes spaces and
> tabs.

Thanks, I fixed this up.
Re: [ovs-dev] [PATCH] ovn: Update TODO.
On Thu, May 24, 2018 at 04:41:38PM -0700, Justin Pettit wrote:
>
> > On May 24, 2018, at 11:00 AM, Ben Pfaff wrote:
> >
> > We've actually made a lot of improvements.
> >
> > Signed-off-by: Ben Pfaff
>
> Acked-by: Justin Pettit

Thanks, applied to master.
Re: [ovs-dev] [PATCH 2/2] compati/ip_gre: remove duplicate vport definition.
On Fri, May 25, 2018 at 06:28:49AM -0700, William Tu wrote:
> Clean up the duplicate definition of OVS_VPORT_TYPE_ERSPAN
> since it is defined in openvswitch.h.
>
> Cc: Greg Rose
> Signed-off-by: William Tu

Applied to master, thanks!

(Maybe I should have waited for Greg's feedback.)
Re: [ovs-dev] [PATCH 1/2] odp-util: refector erspan option parsing.
On Fri, May 25, 2018 at 06:28:48AM -0700, William Tu wrote:
> Instead of memcpy to a local stack, parse the erspan
> metadata in memory.
>
> Suggested-by: Ben Pfaff
> Signed-off-by: William Tu

Thanks, applied to master.
Re: [ovs-dev] [PATCH] erspan: update NEWS and FAQ.
On Fri, May 25, 2018 at 09:47:20AM -0700, William Tu wrote:
> Update Documentation/faq/configuration.rst about ERSPAN
> and update NEWS.
>
> Cc: Greg Rose
> Signed-off-by: William Tu

Thanks!  Applied to master.
Re: [ovs-dev] [PATCH] Embrace anonymous unions.
On Fri, May 25, 2018 at 11:15:10PM +0300, aserd...@ovn.org wrote:
> > > On May 24, 2018, at 10:32 AM, Ben Pfaff wrote:
> > >
> > > Several OVS structs contain embedded named unions, like this:
> > >
> > > struct {
> > >     ...
> > >     union {
> > >         ...
> > >     } u;
> > > };
> > >
> > > C11 standardized a feature that many compilers already implemented
> > > anyway, where an embedded union may be unnamed, like this:
> > >
> > > struct {
> > >     ...
> > >     union {
> > >         ...
> > >     };
> > > };
> > >
> > > This is more convenient because it allows the programmer to omit "u."
> > > in many places.  OVS already used this feature in several places.
> > > This commit embraces it in several others.
> > >
> > > Signed-off-by: Ben Pfaff
> >
> > I didn't read through it carefully, but I support the change, and it's the sort of
> > thing that normally breaks in obvious ways.
> >
> > Acked-by: Justin Pettit
> >
> > --Justin
>
> FYI I compiled on MSVC and ran the unit test and it was fine.
>
> Tested-by: Alin Gabriel Serdean
> Acked-by: Alin Gabriel Serdean

Thanks a lot Justin and Alin.  I applied this to master.
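For readers who want to see the difference discussed in this thread in compilable form, here is a small sketch. The struct and function names (`named_val`, `anon_val`, and the two getters) are made up for illustration and do not come from the OVS tree.

```c
#include <assert.h>
#include <stdint.h>

enum val_kind { VAL_INT, VAL_U8 };

/* Named union: members are reached through the extra "u." step. */
struct named_val {
    enum val_kind kind;
    union {
        int32_t i;
        uint8_t b;
    } u;
};

/* C11 anonymous union: members appear directly in the enclosing struct. */
struct anon_val {
    enum val_kind kind;
    union {
        int32_t i;
        uint8_t b;
    };
};

static int32_t named_get_int(const struct named_val *v)
{
    assert(v->kind == VAL_INT);
    return v->u.i;      /* needs "u." */
}

static int32_t anon_get_int(const struct anon_val *v)
{
    assert(v->kind == VAL_INT);
    return v->i;        /* no "u." needed */
}
```

The two structs have the same members and layout; only the spelling of member access changes, which is why such a conversion tends to fail at compile time rather than at run time if a use site is missed.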
Re: [ovs-dev] [PATCH 2/2] ovs-fields: Improve formatting of NSH section.
On Thu, May 24, 2018 at 10:32:50PM -0700, Justin Pettit wrote:
>
> > On May 18, 2018, at 10:16 AM, Ben Pfaff wrote:
> >
> > Signed-off-by: Ben Pfaff
>
> Acked-by: Justin Pettit

Thanks for the reviews.  I applied these patches to master.
Re: [ovs-dev] [PATCH] erspan: add NXOXM_ET_ERSPAN_ field tests.
On Fri, May 25, 2018 at 08:03:18AM -0700, William Tu wrote:
> ERSPAN is the first real-world use case of Experimenter OXM,
> which introduces 4 new NXOXM_ET_ fields (ver, idx, dir, hwid).
> The patch adds test cases for these fields.
>
> Cc: Greg Rose
> Signed-off-by: William Tu

Thanks, applied to master.
Re: [ovs-dev] [PATCH] Embrace anonymous unions.
> > On May 24, 2018, at 10:32 AM, Ben Pfaff wrote:
> >
> > Several OVS structs contain embedded named unions, like this:
> >
> > struct {
> >     ...
> >     union {
> >         ...
> >     } u;
> > };
> >
> > C11 standardized a feature that many compilers already implemented
> > anyway, where an embedded union may be unnamed, like this:
> >
> > struct {
> >     ...
> >     union {
> >         ...
> >     };
> > };
> >
> > This is more convenient because it allows the programmer to omit "u."
> > in many places.  OVS already used this feature in several places.
> > This commit embraces it in several others.
> >
> > Signed-off-by: Ben Pfaff
>
> I didn't read through it carefully, but I support the change, and it's the sort of
> thing that normally breaks in obvious ways.
>
> Acked-by: Justin Pettit
>
> --Justin

FYI I compiled on MSVC and ran the unit test and it was fine.

Tested-by: Alin Gabriel Serdean
Acked-by: Alin Gabriel Serdean
Re: [ovs-dev] New Release for branch 2.9, 2.8, 2.7, 2.6
Sounds good.  I'll take it from here.

Thanks.

--Justin

> On May 25, 2018, at 12:05 PM, Stokes, Ian wrote:
>
>> Sure. No problem.
>
> Thanks Justin,
>
> The bug fixes have now been merged to the release branches and are ready to
> go. Thanks again for helping with this. Is there anything else you need from
> my side?
>
> Regards
> Ian
>>
>> --Justin
>>
>>
>>> On May 23, 2018, at 1:59 AM, Stokes, Ian wrote:
>>>
>>> Hi Justin,
>>>
>>> A bug has been discovered that is present from branches 2.9 -> 2.6 that
>>> can break DPDK compilation for users.
>>>
>>> https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/347435.html
>>>
>>> There is ongoing work to address this. Is it possible, once we have the
>>> fix, to create a new release for the branches above?
>>>
>>> Apologies for requesting this so soon after you had already completed
>>> the new branch releases earlier this week, but this is an important bug
>>> fix for OVS with DPDK.
>>>
>>> Thanks
>>> Ian
>>>
>
Re: [ovs-dev] New Release for branch 2.9, 2.8, 2.7, 2.6
> Sure.  No problem.

Thanks Justin,

The bug fixes have now been merged to the release branches and are ready to
go. Thanks again for helping with this. Is there anything else you need from
my side?

Regards
Ian
>
> --Justin
>
>
> > On May 23, 2018, at 1:59 AM, Stokes, Ian wrote:
> >
> > Hi Justin,
> >
> > A bug has been discovered that is present from branches 2.9 -> 2.6 that
> can break DPDK compilation for users.
> >
> > https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/347435.html
> >
> > There is ongoing work to address this. Is it possible, once we have the
> fix, to create a new release for the branches above?
> >
> > Apologies for requesting this so soon after you had already completed
> the new branch releases earlier this week, but this is an important bug fix
> for OVS with DPDK.
> >
> > Thanks
> > Ian
> >
Re: [ovs-dev] OVS DPDK: dpdk_merge pull request for branch-2.6
> Thanks for all the release branch updates.  I merged all of them into the
> respective branches.

Thanks for the quick merge Ben, much appreciated.

Ian
Re: [ovs-dev] [PATCH] ovn pacemaker: Fix the promotion issue in other cluster nodes when the master node is reset
On Thu, May 17, 2018 at 6:04 AM, wrote:
> From: Numan Siddique
>
> When a node 'A' in the pacemaker cluster running OVN db servers in master is
> brought down ungracefully ('echo b > /proc/sysrq-trigger' for example),
> pacemaker is not able to promote any other node to master in the cluster.
> When pacemaker selects a node 'B', for instance, to promote, it moves the
> IPAddr2 resource (i.e. the master IP) to node 'B'. As soon as the node is
> configured with the IP address, when the issue is seen, the OVN db servers
> which were running as standby earlier transition to active. Ideally this
> should not have happened. The ovsdb-servers are expected to remain in
> standby until they are promoted. (This needs separate investigation.) When
> pacemaker calls the OVN OCF script's promote action, the
> ovsdb_server_promote function returns almost immediately without recording
> the present master. Later, in the notify action, it demotes the OVN db
> servers again since the last known master doesn't match node 'B's hostname.
> This results in pacemaker promoting/demoting in a loop.
>
> This patch fixes the issue by not returning immediately when the promote
> action is called if the OVN db servers are running as active. Now it
> continues with the ovsdb_server_promote function and records the new master
> by setting the proper master score
> ($CRM_MASTER -N $host_name -v ${master_score}).
>
> This issue is not seen when a node is brought down gracefully, as pacemaker,
> before promoting a node, calls the stop, start and then promote actions. Not
> sure why pacemaker doesn't call the stop, start and promote actions when a
> node is reset ungracefully.
>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1579025
> Signed-off-by: Numan Siddique

Thanks, Numan.  I tweaked commit message formatting and applied this to
master and branch-2.9.
Re: [ovs-dev] [patch v4 2/2] packets: erspan_metadata header fixups.
On Fri, May 25, 2018 at 05:56:51AM -0700, William Tu wrote:
> >
> > OK, now I understand what's going on.  This is only used to define the
> > format of the OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attribute.  Since it's not
> > a wire format at all, we don't need to use ovs_16aligned_be32 either,
> > and an "int" is fine too.
> >
> > Since erspan_metadata is part of the kernel ABI, which is exposed via
> > Netlink, it should normally be defined by including a kernel header
> > rather than in packets.h, which normally just defines wire formats for
> > things.  Can we arrange for that to happen?
> >
>
> But the kernel UAPI erspan.h defines both 'struct erspan_md2' and
> 'struct erspan_metadata'. If we choose to include from kernel, then the
> wire format 'struct erspan_md2' isn't defined using ovs_16aligned_be32.

In some cases we have an alternative data structure for use in
situations where we need it, e.g. struct ovs_16aligned_ip6_hdr.
Re: [ovs-dev] OVS DPDK: dpdk_merge pull request for branch-2.6
Thanks for all the release branch updates.  I merged all of them into the
respective branches.
Re: [ovs-dev] OVS DPDK: dpdk_merge pull request for master
On Fri, May 25, 2018 at 02:17:40PM +, Stokes, Ian wrote:
> Hi Ben,
>
> The following changes since commit 55b259471719ceca1f0083bdd6a5f8c3e7690bae:
>
>   Extend tests for conjunctive match support in OVN (2018-05-24 11:23:58 -0700)
>
> are available in the git repository at:
>
>   https://github.com/istokes/ovs dpdk_merge
>
> for you to fetch changes up to 3e52fa5644cde2eba56c48355d5664e992992bf2:
>
>   dpdk: reflect status and version in the database (2018-05-25 09:09:50 +0100)

Thanks.  I merged this into master.
Re: [ovs-dev] [RFC PATCH v1] dpdk: Support both shared and per port mempools.
On 05/25/2018 05:53 PM, Lam, Tiago wrote:
>>          VLOG_ERR("Failed mempool \"%s\" create request of %u mbufs",
>>                   mp_name, n_mbufs);
>>      }
> Just a thought, but now with shared memory where n_mbufs are initially
> set to 4096*64, one can see this error log printed a few times before
> the memory gets allocated. We could potentially demote this to a WARN
> and write a more friendly message and only print the error below, before
> returning to the caller (at that point we surely couldn't allocate the
> mempool).
>

+1. I'd go further and have it as debug.
Re: [ovs-dev] [RFC PATCH v1] dpdk: Support both shared and per port mempools.
Hi Tiago, On 05/25/2018 05:53 PM, Lam, Tiago wrote: > Hi Ian, > > Thanks for bringing this forward. > > I've tested this on current master by re-configuring MTUs of existing > ports, adding new ports, deleting existing ones, etc. It seems to be > behaving as expected: > - On the shared model, it allocates a new mempool only if the MTU > changes, otherwise reuses the already existing one. And it frees a > mempool only if it isn't being used by any port; > - On the per-port model, it allocates a new mempool for every port. And > it frees the mempools when the ports are destroyed / MTU is changed. > > There's a catch on how mempools are freed, though. I have a comment on > that in-line below*. > > (I see Kevin has sent a review as well, so I'll refrain to touch on the > same points, unless to re-iterate.) > > On 18/05/2018 15:30, Ian Stokes wrote: >> This commit re-introduces the concept of shared mempools as the default >> memory model for DPDK devices. Per port mempools are still available but >> must be enabled explicitly by a user. >> >> OVS previously used a shared mempool model for ports with the same MTU >> and socket configuration. This was replaced by a per port mempool model >> to address issues flagged by users such as: >> >> https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/042560.html >> >> However the per port model potentially requires an increase in memory >> resource requirements to support the same number of ports and configuration >> as the shared port model. >> >> This is considered a blocking factor for current deployments of OVS >> when upgrading to future OVS releases as a user may have to redimension >> memory for the same deployment configuration. This may not be possible for >> users. >> >> This commit resolves the issue by re-introducing shared mempools as >> the default memory behaviour in OVS DPDK but also refactors the memory >> configuration code to allow for per port mempools. 
>> >> This patch adds a new global config option, per-port-mp-enabled, that >> controls the enablement of per port mempools for DPDK devices. >> >> ovs-vsctl set Open_vSwitch . other_config:per-port-mp-enabled=true >> >> This value defaults to false; to enable per port mempool support, >> this field should be set to true when setting other global parameters >> on init (such as "dpdk-socket-mem", for example). Changing the value at >> runtime is not supported, and requires restarting the vswitch >> daemon. >> >> The mempool sweep functionality is also replaced with the >> sweep functionality from OVS 2.9 found in commits >> >> c77f692 (netdev-dpdk: Free mempool only when no in-use mbufs.) >> a7fb0a4 (netdev-dpdk: Add mempool reuse/free debug.) >> >> As this patch is RFC there are a number of TO-DOs including adding a >> memory calculation section to the documentation for both models. This is >> expected to be completed in the v1 after RFC. >> >> Signed-off-by: Ian Stokes>> --- >> Documentation/automake.mk| 1 + >> Documentation/topics/dpdk/index.rst | 1 + >> Documentation/topics/dpdk/memory.rst | 67 +++ >> NEWS | 1 + >> lib/dpdk-stub.c | 6 + >> lib/dpdk.c | 13 ++ >> lib/dpdk.h | 1 + >> lib/netdev-dpdk.c| 326 >> +-- >> vswitchd/vswitch.xml | 16 ++ >> 9 files changed, 305 insertions(+), 127 deletions(-) >> create mode 100644 Documentation/topics/dpdk/memory.rst >> >> diff --git a/Documentation/automake.mk b/Documentation/automake.mk >> index 683ca14..14c2189 100644 >> --- a/Documentation/automake.mk >> +++ b/Documentation/automake.mk >> @@ -36,6 +36,7 @@ DOC_SOURCE = \ >> Documentation/topics/dpdk/index.rst \ >> Documentation/topics/dpdk/bridge.rst \ >> Documentation/topics/dpdk/jumbo-frames.rst \ >> +Documentation/topics/dpdk/memory.rst \ >> Documentation/topics/dpdk/pdump.rst \ >> Documentation/topics/dpdk/phy.rst \ >> Documentation/topics/dpdk/pmd.rst \ >> diff --git a/Documentation/topics/dpdk/index.rst >> b/Documentation/topics/dpdk/index.rst >> index 
181f61a..cf24a7b 100644 >> --- a/Documentation/topics/dpdk/index.rst >> +++ b/Documentation/topics/dpdk/index.rst >> @@ -40,3 +40,4 @@ The DPDK Datapath >> /topics/dpdk/qos >> /topics/dpdk/pdump >> /topics/dpdk/jumbo-frames >> + /topics/dpdk/memory >> diff --git a/Documentation/topics/dpdk/memory.rst >> b/Documentation/topics/dpdk/memory.rst >> new file mode 100644 >> index 000..1198067 >> --- /dev/null >> +++ b/Documentation/topics/dpdk/memory.rst >> @@ -0,0 +1,67 @@ >> +.. >> +Copyright 2018, Intel, Inc. >> + >> + Licensed under the Apache License, Version 2.0 (the "License"); you >> may >> + not use this file except in compliance with the License. You may >> obtain >> + a copy of the License at >> + >> +
Re: [ovs-dev] [RFC PATCH v1] dpdk: Support both shared and per port mempools.
Hi Ian,

Thanks for bringing this forward.

I've tested this on current master by re-configuring MTUs of existing
ports, adding new ports, deleting existing ones, etc. It seems to be
behaving as expected:
- On the shared model, it allocates a new mempool only if the MTU
  changes, otherwise reuses the already existing one. And it frees a
  mempool only if it isn't being used by any port;
- On the per-port model, it allocates a new mempool for every port. And
  it frees the mempools when the ports are destroyed / MTU is changed.

There's a catch on how mempools are freed, though. I have a comment on
that in-line below*.

(I see Kevin has sent a review as well, so I'll refrain from touching on
the same points, except to reiterate.)

On 18/05/2018 15:30, Ian Stokes wrote:
> This commit re-introduces the concept of shared mempools as the default
> memory model for DPDK devices. Per port mempools are still available but
> must be enabled explicitly by a user.
>
> OVS previously used a shared mempool model for ports with the same MTU
> and socket configuration. This was replaced by a per port mempool model
> to address issues flagged by users such as:
>
> https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/042560.html
>
> However the per port model potentially requires an increase in memory
> resource requirements to support the same number of ports and configuration
> as the shared port model.
>
> This is considered a blocking factor for current deployments of OVS
> when upgrading to future OVS releases as a user may have to redimension
> memory for the same deployment configuration. This may not be possible for
> users.
>
> This commit resolves the issue by re-introducing shared mempools as
> the default memory behaviour in OVS DPDK but also refactors the memory
> configuration code to allow for per port mempools.
>
> This patch adds a new global config option, per-port-mp-enabled, that
> controls the enablement of per port mempools for DPDK devices.
>
>     ovs-vsctl set Open_vSwitch . other_config:per-port-mp-enabled=true
>
> This value defaults to false; to enable per port mempool support,
> this field should be set to true when setting other global parameters
> on init (such as "dpdk-socket-mem", for example). Changing the value at
> runtime is not supported, and requires restarting the vswitch
> daemon.
>
> The mempool sweep functionality is also replaced with the
> sweep functionality from OVS 2.9 found in commits
>
> c77f692 (netdev-dpdk: Free mempool only when no in-use mbufs.)
> a7fb0a4 (netdev-dpdk: Add mempool reuse/free debug.)
>
> As this patch is RFC there are a number of TO-DOs including adding a
> memory calculation section to the documentation for both models. This is
> expected to be completed in the v1 after RFC.
>
> Signed-off-by: Ian Stokes
> ---
>  Documentation/automake.mk            |   1 +
>  Documentation/topics/dpdk/index.rst  |   1 +
>  Documentation/topics/dpdk/memory.rst |  67 +++
>  NEWS                                 |   1 +
>  lib/dpdk-stub.c                      |   6 +
>  lib/dpdk.c                           |  13 ++
>  lib/dpdk.h                           |   1 +
>  lib/netdev-dpdk.c                    | 326 +--
>  vswitchd/vswitch.xml                 |  16 ++
>  9 files changed, 305 insertions(+), 127 deletions(-)
>  create mode 100644 Documentation/topics/dpdk/memory.rst
>
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index 683ca14..14c2189 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -36,6 +36,7 @@ DOC_SOURCE = \
>       Documentation/topics/dpdk/index.rst \
>       Documentation/topics/dpdk/bridge.rst \
>       Documentation/topics/dpdk/jumbo-frames.rst \
> +     Documentation/topics/dpdk/memory.rst \
>       Documentation/topics/dpdk/pdump.rst \
>       Documentation/topics/dpdk/phy.rst \
>       Documentation/topics/dpdk/pmd.rst \
> diff --git a/Documentation/topics/dpdk/index.rst
> b/Documentation/topics/dpdk/index.rst
> index 181f61a..cf24a7b 100644
> --- a/Documentation/topics/dpdk/index.rst
> +++ b/Documentation/topics/dpdk/index.rst
> @@ -40,3 +40,4 @@ The DPDK Datapath
>     /topics/dpdk/qos
>     /topics/dpdk/pdump
>     /topics/dpdk/jumbo-frames
> +   /topics/dpdk/memory
> diff --git a/Documentation/topics/dpdk/memory.rst
> b/Documentation/topics/dpdk/memory.rst
> new file mode 100644
> index 000..1198067
> --- /dev/null
> +++ b/Documentation/topics/dpdk/memory.rst
> @@ -0,0 +1,67 @@
> +..
> +        Copyright 2018, Intel, Inc.
> +
> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
> +      not use this file except in compliance with the License. You may obtain
> +      a copy of the License at
> +
> +          http://www.apache.org/licenses/LICENSE-2.0
> +
> +      Unless required by applicable law or agreed to in writing, software
> +      distributed under the License is distributed
[ovs-dev] [PATCH] erspan: update NEWS and FAQ.
Update Documentation/faq/configuration.rst about ERSPAN and
Update NEWS.

Cc: Greg Rose
Signed-off-by: William Tu
---
 Documentation/faq/configuration.rst | 20 ++--
 NEWS                                |  5 +
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/Documentation/faq/configuration.rst b/Documentation/faq/configuration.rst
index 1c93a55cc720..2b9d9b0f57d9 100644
--- a/Documentation/faq/configuration.rst
+++ b/Documentation/faq/configuration.rst
@@ -191,8 +191,24 @@ Q: How do I configure mirroring of all traffic to a GRE tunnel?
 
 Q: Does Open vSwitch support ERSPAN?
 
-    A: No.  As an alternative, Open vSwitch supports mirroring to a GRE tunnel
-    (see above).
+    A: Yes. ERSPAN version I and version II over IPv4 GRE and
+    IPv6 GRE tunnel are supported.  See ovs-fields(7) for matching
+    and setting ERSPAN fields.
+
+    ::
+
+        $ ovs-vsctl add-br br0
+        $ #For ERSPAN type 2 (version I)
+        $ ovs-vsctl add-port br0 at_erspan0 -- \
+                set int at_erspan0 type=erspan options:key=1 \
+                options:remote_ip=172.31.1.1 \
+                options:erspan_ver=1 options:erspan_idx=1
+        $ #For ERSPAN type 3 (version II)
+        $ ovs-vsctl add-port br0 at_erspan0 -- \
+                set int at_erspan0 type=erspan options:key=1 \
+                options:remote_ip=172.31.1.1 \
+                options:erspan_ver=2 options:erspan_dir=1 \
+                options:erspan_hwid=4
 
 Q: How do I connect two bridges?
 
diff --git a/NEWS b/NEWS
index ec548b02a8b0..a7f258328a72 100644
--- a/NEWS
+++ b/NEWS
@@ -36,6 +36,11 @@ Post-v2.9.0
          ovs-appctl dpif-netdev/pmd-perf-show
      * Supervision of PMD performance metrics and logging of suspicious
        iterations
+   - ERSPAN:
+     * Implemented ERSPAN protocol (draft-foschiano-erspan-00.txt) for
+       both kernel datapath and userspace datapath.
+     * Added port-based and flow-based ERSPAN tunnel port support, added
+       OpenFlow rules matching ERSPAN fields.  See ovs-fields(7).
 
 v2.9.0 - 19 Feb 2018
 --------------------
-- 
2.7.4
Re: [ovs-dev] [PATCH v2 1/1] netdev-dpdk: Handle ENOTSUP for rte_eth_dev_set_mtu.
Hi Ian,

I did all my testing with DPDK 17.11.2 and the following OVS commit (HEAD of
master at the time of testing):

cferrite@silpixa00393943:~/ovs$ git show --summary
commit 55b259471719ceca1f0083bdd6a5f8c3e7690bae
Author: Numan Siddique
Date:   Thu May 24 17:45:53 2018 +0200

    Extend tests for conjunctive match support in OVN

Before the patch was applied, vswitchd crashed with the following being seen
in the vswitchd log when adding a PCAP vdev:

2018-05-25T14:35:00Z|00091|netdev_dpdk|WARN|Rx checksum offload is not supported on port 2
2018-05-25T14:35:00Z|00092|netdev_dpdk|ERR|Interface pcap-dev0 MTU (1500) setup error: Operation not supported
2018-05-25T14:35:00Z|00093|netdev_dpdk|ERR|Interface pcap-dev0(rxq:1 txq:1 lsc interrupt mode:false) configure error: Operation not supported
2018-05-25T14:35:00Z|00094|dpif_netdev|INFO|Core 3 on numa node 0 assigned port 'pcap-dev0' rx queue 0 (measured processing cycles 0).

After applying the patch (applied cleanly to master), the following was seen
in the vswitchd log when adding a PCAP vdev:

2018-05-25T13:38:16Z|00091|netdev_dpdk|WARN|Rx checksum offload is not supported on port 2
2018-05-25T13:38:16Z|00092|netdev_dpdk|WARN|Interface pcap-dev0 does not support MTU configuration, max packet size supported is 1500.

Before the patch was applied, vswitchd crashed with the following being seen
in the vswitchd log when adding a null PMD vdev:

2018-05-25T14:55:00Z|00092|netdev_dpdk|WARN|Rx checksum offload is not supported on port 2
2018-05-25T14:55:00Z|00093|netdev_dpdk|ERR|Interface null0 MTU (1500) setup error: Operation not supported
2018-05-25T14:55:00Z|00094|netdev_dpdk|ERR|Interface null0(rxq:1 txq:2 lsc interrupt mode:false) configure error: Operation not supported
2018-05-25T14:55:00Z|00095|dpif_netdev|INFO|Core 3 on numa node 0 assigned port 'null0' rx queue 0 (measured processing cycles 0).
2018-05-25T14:55:00Z|00096|bridge|INFO|bridge br0: added interface null0 on port 1

After applying the patch (applied cleanly to master), the following was seen
in the vswitchd log when adding a null PMD vdev:

2018-05-25T15:02:59Z|00091|netdev_dpdk|WARN|Rx checksum offload is not supported on port 2
2018-05-25T15:02:59Z|00092|netdev_dpdk|WARN|Interface null0 does not support MTU configuration, max packet size supported is 1500.
2018-05-25T15:02:59Z|00093|netdev_dpdk|INFO|Port 2: 00:00:00:00:00:00
2018-05-25T15:02:59Z|00094|dpif_netdev|INFO|Core 3 on numa node 0 assigned port 'null0' rx queue 0 (measured processing cycles 0).
2018-05-25T15:02:59Z|00095|bridge|INFO|bridge br0: added interface null0 on port 1

In both cases, the MTU errors are fixed, with just warnings being output as
is expected.

Acked-by: Cian Ferriter
Tested-by: Cian Ferriter

Thanks,
Cian

> -----Original Message-----
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Ian Stokes
> Sent: 22 May 2018 15:54
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH v2 1/1] netdev-dpdk: Handle ENOTSUP for
> rte_eth_dev_set_mtu.
>
> The function rte_eth_dev_set_mtu is not supported for all DPDK drivers.
> Currently if it is not supported we return an error in
> dpdk_eth_dev_queue_setup. There are two issues with this.
>
> (i) A device can still function even if rte_eth_dev_set_mtu is not supported
> albeit with the default max rx packet length.
>
> (ii) When ENOTSUP is returned it will not be caught in port_reconfigure() at
> the dpif-netdev layer. Port_reconfigure() checks if a netdev_reconfigure()
> function is supported for a given netdev and ignores EOPNOTSUPP errors as
> it assumes errors of this value mean there is no reconfiguration function.
> In this case the reconfiguration function is supported for netdev dpdk but a
> function called as part of the reconfigure (rte_eth_dev_set_mtu) may not be
> supported.
> > As this is a corner case, this commit warns a user when > rte_eth_dev_set_mtu is not supported and informs them of the default max > rx packet length that will be used instead. > > Signed-off-by: Ian Stokes > Co-author: Michal Weglicki > Tested-By: Ciara Loftus > --- > v1 -> v2 > * Rebase to head of master. > * Use %"PRIu16 instead of %d for MTU value in VLOG. > * Fix typo in comments. > --- > lib/netdev-dpdk.c | 17 ++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 87152a7..390a9ba > 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -776,6 +776,7 @@ dpdk_eth_dev_port_config(struct netdev_dpdk > *dev, int n_rxq, int n_txq) > int i; > struct rte_eth_conf conf = port_conf; > struct rte_eth_dev_info info; > +uint16_t conf_mtu; > > /* As of DPDK 17.11.1 a few PMDs require to explicitly enable > * scatter to support jumbo RX. Checking the offload
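The behaviour the quoted commit message describes (warn and carry on with the default max packet length when the driver reports ENOTSUP, but still fail on any other error) can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: `set_mtu_fn`, `configure_mtu`, the three stub setters and `ASSUMED_DEFAULT_MAX_PKT_LEN` are hypothetical names, and the function pointer merely stands in for a call such as rte_eth_dev_set_mtu().

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed default max packet size, used only for this illustration. */
#define ASSUMED_DEFAULT_MAX_PKT_LEN 1500

/* Hypothetical stand-in for a driver MTU setter.  Returns 0 on success
 * or a negative errno value on failure. */
typedef int (*set_mtu_fn)(uint16_t port_id, uint16_t mtu);

/* Tries to set 'mtu' on 'port_id'.  A driver that reports -ENOTSUP is
 * not treated as fatal: a warning is printed and the default max packet
 * length is kept.  Any other error is passed back to the caller. */
static int
configure_mtu(set_mtu_fn set_mtu, uint16_t port_id, uint16_t mtu,
              uint16_t *effective_mtu)
{
    int err = set_mtu(port_id, mtu);

    if (err == -ENOTSUP) {
        fprintf(stderr, "warning: port %" PRIu16 " does not support MTU "
                "configuration, max packet size supported is %d\n",
                port_id, ASSUMED_DEFAULT_MAX_PKT_LEN);
        *effective_mtu = ASSUMED_DEFAULT_MAX_PKT_LEN;
        return 0;
    }
    if (err) {
        fprintf(stderr, "error: port %" PRIu16 " MTU (%" PRIu16 ") setup "
                "failed: %d\n", port_id, mtu, err);
        return err;
    }
    *effective_mtu = mtu;
    return 0;
}

/* Stub drivers used to exercise the three outcomes. */
static int
stub_mtu_unsupported(uint16_t port_id, uint16_t mtu)
{
    (void) port_id;
    (void) mtu;
    return -ENOTSUP;
}

static int
stub_mtu_ok(uint16_t port_id, uint16_t mtu)
{
    (void) port_id;
    (void) mtu;
    return 0;
}

static int
stub_mtu_invalid(uint16_t port_id, uint16_t mtu)
{
    (void) port_id;
    (void) mtu;
    return -EINVAL;
}
```

Swallowing the error at this level is what keeps it from reaching a caller that would interpret "not supported" as "no reconfigure function exists", which is the second issue the commit message raises.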
Re: [ovs-dev] [PATCH 1/1] ovs-thread: Fix thread id for threads not started with ovs_thread_create()
On 25.05.2018 17:39, Eelco Chaudron wrote:
> On 25/05/18 15:30, Ilya Maximets wrote:
>>> When ping-ponging a live VM migration between two machines running
>>> OVS-DPDK every now and then the ping misses would increase
>>> dramatically. For example:
>>>
>>> ===========Stream Rate: 3Mpps===========
>>> No  Stream_Rate  Downtime  Totaltime  Ping_Loss  Moongen_Loss
>>>  0  3Mpps             128      13974        115       7168374
>>>  1  3Mpps             145      13620         17       1169770
>>>  2  3Mpps             140      14499        116       7141175
>>>  3  3Mpps             142      13358         16       1150606
>>>  4  3Mpps             136      14004         16       1124020
>>>  5  3Mpps             139      15494        214      13170452
>>>  6  3Mpps             136      15610        217      13282413
>>>  7  3Mpps             146      13194         17       1167512
>>>  8  3Mpps             148      12871         16       1162655
>>>  9  3Mpps             137      15615        214      13170656
>>>
>>> I identified this issue being introduced in OVS commit
>>> https://github.com/openvswitch/ovs/commit/f3e7ec254738364101eed8f04b1d954cb510615c
>>> and more specifically due to DPDK commit
>>> http://dpdk.org/browse/dpdk/commit/?id=af14759181240120f76c82f894982e8f33f0ba2a
>>>
>>> The combined changes no longer have OVS start the vhost socket polling
>>> thread at startup, but DPDK will do it on its own when the first vhost
>>> client is started.
>>>
>>> Figuring out the reason why this happens kept me puzzled for quite some
>>> time...
>>> What happens is that the callbacks called from the vhost thread are
>>> calling ovsrcu_synchronize() as part of destroy_device(). This will
>>> end up calling seq_wait__(), and the problem is with this part:
>>>
>>>   static void
>>>   seq_wait__(struct seq *seq, uint64_t value, const char *where)
>>>       OVS_REQUIRES(seq_mutex)
>>>   {
>>> >     unsigned int id = ovsthread_id_self();
>>>       uint32_t hash = hash_int(id, 0);
>>>       struct seq_waiter *waiter;
>>>
>>>       HMAP_FOR_EACH_IN_BUCKET (waiter, hmap_node, hash, &seq->waiters) {
>>> >         if (waiter->ovsthread_id == id) {
>>>               if (waiter->value != value) {
>>>                   /* The current value is different from the value we've already
>>>                    * waited for, */
>>>                   poll_immediate_wake_at(where);
>>>               } else {
>>>                   /* Already waiting on 'value', nothing more to do. */
>>>               }
>>> >             return;
>>>           }
>>>       }
>>>
>>> By default, all created threads outside of OVS will get thread id 0,
>>> which is equal to the main ovs thread. So for example in the function
>>> above if the main thread is waiting already we won't add ourselves as
>>> a waiter.
>> Good catch. Thanks for working on this. I guess this issue could even cause
>> a crash, because the vhost device could be freed before other threads stop
>> working with it.
>>
>>> The fix below assigns UINT_MAX to non-OVS created threads, which will
>>> fix this specific issue. However if more non-OVS threads get added
>>> the issue might arise again.
>>>
>>> Currently, I do not see another solution that will work unless DPDK is
>>> adding some framework/callback support when new threads get created.
>> What do you think about allocating ids on demand inside
>> 'ovsthread_id_self()'?
> I was thinking about this also, but was not sure where to put it... Not sure
> why ovsthread_id_self() did not come to mind :)
>> This will work for any number of threads. Something like this:
>>
>> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
>> index f8bc06d..ff3a1df 100644
>> --- a/lib/ovs-thread.c
>> +++ b/lib/ovs-thread.c
>> @@ -315,7 +315,7 @@ ovs_barrier_block(struct ovs_barrier *barrier)
>>      }
>>  }
>>
>> -DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, 0);
>> +DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, UINT_MAX);
> Guess I'll add a #define for clarity on the UINT_MAX usage.

Yes, this should be useful.

>>  struct ovsthread_aux {
>>      void *(*start)(void *);
>> @@ -323,24 +323,28 @@ struct ovsthread_aux {
>>      char name[16];
>>  };
>>
>> +void
>> +ovsthread_id_init(void)
>> +{
>> +    static atomic_count next_id = ATOMIC_COUNT_INIT(0);
>> +
>> +    unsigned int id = atomic_count_inc(&next_id);
>> +
>> +    *ovsthread_id_get() = id;
>> +}
>> +
>>  static void *
>>  ovsthread_wrapper(void *aux_)
>>  {
>> -    static atomic_count next_id = ATOMIC_COUNT_INIT(1);
>> -
>>      struct ovsthread_aux *auxp = aux_;
>>      struct ovsthread_aux aux;
>> -    unsigned int id;
>> -
>> -    id = atomic_count_inc(&next_id);
>> -    *ovsthread_id_get() = id;
>>
> I would still call ovsthread_id_init() here explicitly, as it's clearer.

ok.

>>      aux = *auxp;
>>      free(auxp);
>>      /* The
[ovs-dev] [PATCH] erspan: add NXOXM_ET_ERSPAN_ field tests.
ERSPAN is the first real-world use case of Experimenter OXM, which
introduces 4 new NXOXM_ET_ fields (ver, idx, dir, hwid).  The patch adds
test cases for these fields.

Cc: Greg Rose
Signed-off-by: William Tu
---
 tests/ovs-ofctl.at | 61 ++
 1 file changed, 57 insertions(+), 4 deletions(-)

diff --git a/tests/ovs-ofctl.at b/tests/ovs-ofctl.at
index e1fc8299a2df..187433397a81 100644
--- a/tests/ovs-ofctl.at
+++ b/tests/ovs-ofctl.at
@@ -942,6 +942,12 @@ NXM_OF_ETH_TYPE(0800) NXM_NX_CT_LABEL_W(10203040506070809000a0b0c0d0e0f0/f1f2f3f
 NXM_NX_DP_HASH(01234567)
 NXOXM_ET_DP_HASH(01234567)
 
+# ERSPAN (testing experimenter OXM).
+NXOXM_ET_ERSPAN_VER(01)
+NXOXM_ET_ERSPAN_IDX(01020304)
+NXOXM_ET_ERSPAN_DIR(01)
+NXOXM_ET_ERSPAN_HWID(12)
+
 # Invalid field number.
 01020304(/)
 
@@ -1238,6 +1244,12 @@ NXM_OF_ETH_TYPE(0800), NXM_NX_CT_LABEL_W(10203040506070809000a0b0c0d0e0f0/f1f2f3
 NXM_NX_DP_HASH(01234567)
 NXM_NX_DP_HASH(01234567)
 
+# ERSPAN (testing experimenter OXM).
+NXOXM_ET_ERSPAN_VER_W(01/01)
+NXOXM_ET_ERSPAN_IDX(01020304)
+NXOXM_ET_ERSPAN_DIR_W(01/01)
+NXOXM_ET_ERSPAN_HWID_W(12/12)
+
 # Invalid field number.
 nx_pull_match() returned error OFPBMC_BAD_FIELD
 
@@ -2430,17 +2442,26 @@ NXOXM_ET_DP_HASH(01234567)
 NXM_NX_DP_HASH_W(01234567/0fff)
 NXOXM_ET_DP_HASH_W(01234567/0fff)
+
+NXOXM_ET_ERSPAN_VER(01)
+
+NXOXM_ET_ERSPAN_IDX(01020304)
+NXOXM_ET_ERSPAN_IDX_W(01020304/0fff)
+
+NXOXM_ET_ERSPAN_DIR(01)
+
+NXOXM_ET_ERSPAN_HWID(12)
 ])
 
-# To allow for testing experimenter OXM, which doesn't really have many
-# examples in the wild, we've defined a variant of NXM_NX_DP_HASH that uses
-# the experimenter OXM mechanism, called NXOXM_ET_DP_HASH.  We've defined
+# To allow for testing experimenter OXM, we've defined a variant of NXM_NX_DP_HASH
+# that uses the experimenter OXM mechanism, called NXOXM_ET_DP_HASH.
We've defined # it as if it were introduced with OpenFlow 1.5, which gives us the # opportunity to see that both forms are accepted in all OpenFlow versions # but the experimenter form is used for encoding in OF1.5+. # # First verify that both forms are accepted and NXOXM_ET_DP_HASH is encoded -# in OF1.5. +# in OF1.5. Then verify the ERSPAN options, which exclusively use the +# Experimenter OXM. AT_CHECK([ovs-ofctl -m --strict parse-oxm OpenFlow15 < oxm.txt], [0], [dnl NXOXM_ET_DP_HASH(01234567) 00 01 00 10 ff ff 00 08-00 00 23 20 01 23 45 67 @@ -2453,10 +2474,27 @@ NXOXM_ET_DP_HASH_W(01234567/0fff) NXOXM_ET_DP_HASH_W(01234567/0fff) 00 01 00 14 ff ff 01 0c-00 00 23 20 01 23 45 67 0010 0f ff ff ff 00 00 00 00 + +NXOXM_ET_ERSPAN_VER_W(01/01) + 00 01 00 0e ff ff 19 06-00 00 23 20 01 01 00 00 + +NXOXM_ET_ERSPAN_IDX(01020304) + 00 01 00 10 ff ff 16 08-00 00 23 20 01 02 03 04 +NXOXM_ET_ERSPAN_IDX_W(01020304/0fff) + 00 01 00 14 ff ff 17 0c-00 00 23 20 01 02 03 04 +0010 0f ff ff ff 00 00 00 00 + +NXOXM_ET_ERSPAN_DIR_W(01/01) + 00 01 00 0e ff ff 1b 06-00 00 23 20 01 01 00 00 + +NXOXM_ET_ERSPAN_HWID_W(12/12) + 00 01 00 0e ff ff 1d 06-00 00 23 20 12 12 00 00 ]) # Then verify that both forms are accepted and NXM_NX_DP_HASH is encoded # in OF1.2. +# The NXOXM_ET_ERSPAN does not support in OF1.2, so it is +# encoded in OF1.5. 
AT_CHECK([ovs-ofctl -m --strict parse-oxm OpenFlow12 < oxm.txt], [0], [dnl NXM_NX_DP_HASH(01234567) 00 01 00 0c 00 01 46 04-01 23 45 67 00 00 00 00 @@ -2467,6 +2505,21 @@ NXM_NX_DP_HASH_W(01234567/0fff) 00 01 00 10 00 01 47 08-01 23 45 67 0f ff ff ff NXM_NX_DP_HASH_W(01234567/0fff) 00 01 00 10 00 01 47 08-01 23 45 67 0f ff ff ff + +NXOXM_ET_ERSPAN_VER_W(01/01) + 00 01 00 0e ff ff 19 06-00 00 23 20 01 01 00 00 + +NXOXM_ET_ERSPAN_IDX(01020304) + 00 01 00 10 ff ff 16 08-00 00 23 20 01 02 03 04 +NXOXM_ET_ERSPAN_IDX_W(01020304/0fff) + 00 01 00 14 ff ff 17 0c-00 00 23 20 01 02 03 04 +0010 0f ff ff ff 00 00 00 00 + +NXOXM_ET_ERSPAN_DIR_W(01/01) + 00 01 00 0e ff ff 1b 06-00 00 23 20 01 01 00 00 + +NXOXM_ET_ERSPAN_HWID_W(12/12) + 00 01 00 0e ff ff 1d 06-00 00 23 20 12 12 00 00 ]) AT_CLEANUP -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/1] ovs-thread: Fix thread id for threads not started with ovs_thread_create()
On 25/05/18 15:30, Ilya Maximets wrote:
>> When ping-ponging a live VM migration between two machines running
>> OVS-DPDK every now and then the ping misses would increase
>> dramatically. For example:
>>
>> ===========Stream Rate: 3Mpps===========
>> No  Stream_Rate  Downtime  Totaltime  Ping_Loss  Moongen_Loss
>>  0  3Mpps             128      13974        115       7168374
>>  1  3Mpps             145      13620         17       1169770
>>  2  3Mpps             140      14499        116       7141175
>>  3  3Mpps             142      13358         16       1150606
>>  4  3Mpps             136      14004         16       1124020
>>  5  3Mpps             139      15494        214      13170452
>>  6  3Mpps             136      15610        217      13282413
>>  7  3Mpps             146      13194         17       1167512
>>  8  3Mpps             148      12871         16       1162655
>>  9  3Mpps             137      15615        214      13170656
>>
>> I identified this issue being introduced in OVS commit
>> https://github.com/openvswitch/ovs/commit/f3e7ec254738364101eed8f04b1d954cb510615c
>> and more specifically due to DPDK commit
>> http://dpdk.org/browse/dpdk/commit/?id=af14759181240120f76c82f894982e8f33f0ba2a
>>
>> The combined changes no longer have OVS start the vhost socket polling
>> thread at startup, but DPDK will do it on its own when the first vhost
>> client is started.
>>
>> Figuring out the reason why this happens kept me puzzled for quite some
>> time...
>> What happens is that the callbacks called from the vhost thread are
>> calling ovsrcu_synchronize() as part of destroy_device(). This will
>> end up calling seq_wait__(), and the problem is with this part:
>>
>>   static void
>>   seq_wait__(struct seq *seq, uint64_t value, const char *where)
>>       OVS_REQUIRES(seq_mutex)
>>   {
>> >     unsigned int id = ovsthread_id_self();
>>       uint32_t hash = hash_int(id, 0);
>>       struct seq_waiter *waiter;
>>
>>       HMAP_FOR_EACH_IN_BUCKET (waiter, hmap_node, hash, &seq->waiters) {
>> >         if (waiter->ovsthread_id == id) {
>>               if (waiter->value != value) {
>>                   /* The current value is different from the value we've already
>>                    * waited for, */
>>                   poll_immediate_wake_at(where);
>>               } else {
>>                   /* Already waiting on 'value', nothing more to do. */
>>               }
>> >             return;
>>           }
>>       }
>>
>> By default, all created threads outside of OVS will get thread id 0,
>> which is equal to the main ovs thread. So for example in the function
>> above if the main thread is waiting already we won't add ourselves as
>> a waiter.
> Good catch. Thanks for working on this. I guess this issue could even cause
> a crash, because the vhost device could be freed before other threads stop
> working with it.
>
>> The fix below assigns UINT_MAX to non-OVS created threads, which will
>> fix this specific issue. However if more non-OVS threads get added
>> the issue might arise again.
>>
>> Currently, I do not see another solution that will work unless DPDK is
>> adding some framework/callback support when new threads get created.
> What do you think about allocating ids on demand inside
> 'ovsthread_id_self()'?

I was thinking about this also, but was not sure where to put it... Not sure
why ovsthread_id_self() did not come to mind :)

> This will work for any number of threads. Something like this:
>
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index f8bc06d..ff3a1df 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -315,7 +315,7 @@ ovs_barrier_block(struct ovs_barrier *barrier)
>      }
>  }
>
> -DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, 0);
> +DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, UINT_MAX);

Guess I'll add a #define for clarity on the UINT_MAX usage.

>  struct ovsthread_aux {
>      void *(*start)(void *);
> @@ -323,24 +323,28 @@ struct ovsthread_aux {
>      char name[16];
>  };
>
> +void
> +ovsthread_id_init(void)
> +{
> +    static atomic_count next_id = ATOMIC_COUNT_INIT(0);
> +
> +    unsigned int id = atomic_count_inc(&next_id);
> +
> +    *ovsthread_id_get() = id;
> +}
> +
>  static void *
>  ovsthread_wrapper(void *aux_)
>  {
> -    static atomic_count next_id = ATOMIC_COUNT_INIT(1);
> -
>      struct ovsthread_aux *auxp = aux_;
>      struct ovsthread_aux aux;
> -    unsigned int id;
> -
> -    id = atomic_count_inc(&next_id);
> -    *ovsthread_id_get() = id;
>

I would still call ovsthread_id_init() here explicitly, as it's clearer.

>      aux = *auxp;
>      free(auxp);
>
>      /* The order of the following calls is important, because
>       * ovsrcu_quiesce_end() saves a copy of the thread name. */
> -    char *subprogram_name = xasprintf("%s%u", aux.name, id);
> +    char *subprogram_name = xasprintf("%s%u", aux.name, ovsthread_id_self());
>      set_subprogram_name(subprogram_name);
>      free(subprogram_name);
>      ovsrcu_quiesce_end();
> diff --git a/lib/ovs-thread.h b/lib/ovs-thread.h
> index 03fd804..ada09d1 100644
> --- a/lib/ovs-thread.h
> +++
[ovs-dev] OVS DPDK: dpdk_merge pull request for branch-2.6
Hi Ben,

The following changes since commit acc46dc06e02316c27b9a5d66ac85726c2091a6a:

  Prepare for 2.6.3. (2018-05-18 15:53:55 -0700)

are available in the git repository at:

  https://github.com/istokes/ovs dpdk_merge_2_6

for you to fetch changes up to d620091dd371386b11e6f6b3a6b7b250860e267d:

  netdev-dpdk: Remove use of rte_mempool_ops_get_count. (2018-05-23 16:37:53 +0100)

Ian Stokes (1):
      Configurable Link State Change (LSC) detection mode

Kevin Traynor (1):
      netdev-dpdk: Remove use of rte_mempool_ops_get_count.

 INSTALL.DPDK-ADVANCED.md | 25
 NEWS                     |  3 ++-
 lib/netdev-dpdk.c        | 60
 vswitchd/vswitch.xml     | 17
 4 files changed, 87 insertions(+), 18 deletions(-)

Thanks
Ian
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] OVS DPDK: dpdk_merge pull request for branch-2.7
Hi Ben,

The following changes since commit 3c21d3f679aa980aeed79bd5582d891b9f109fe6:

  Prepare for 2.7.6. (2018-05-18 15:54:21 -0700)

are available in the git repository at:

  https://github.com/istokes/ovs dpdk_merge_2_7

for you to fetch changes up to 177a84282867d40d2a275f4ff2eac4c1a29976e0:

  NEWS: Add LSC support for DPDK. (2018-05-24 20:09:47 +0100)

Ian Stokes (1):
      NEWS: Add LSC support for DPDK.

Kevin Traynor (1):
      netdev-dpdk: Remove use of rte_mempool_ops_get_count.

 NEWS              |  2 ++
 lib/netdev-dpdk.c | 25
 2 files changed, 15 insertions(+), 12 deletions(-)

Thanks
Ian
[ovs-dev] OVS DPDK: dpdk_merge pull request for branch-2.8
Hi Ben,

The following changes since commit 8b10a2632b84cfe5153203db9d042d51864af328:

  datapath: compat: Fix ndo_size in RHEL 7.5 backport (2018-05-23 15:36:00 -0700)

are available in the git repository at:

  https://github.com/istokes/ovs dpdk_merge_2_8

for you to fetch changes up to 983938bdea3b5692a3038aef082dd87ede3bd9a8:

  NEWS: Add LSC support for DPDK. (2018-05-24 20:09:12 +0100)

Eelco Chaudron (1):
      netdev-dpdk: Don't use PMD driver if not configured successfully

Ian Stokes (1):
      NEWS: Add LSC support for DPDK.

Ilya Maximets (1):
      dpif-netdev: Free packets on TUNNEL_PUSH if may_steal.

Kevin Traynor (1):
      netdev-dpdk: Remove use of rte_mempool_ops_get_count.

Timothy Redaelli (1):
      netdev-dpdk: fix check for "net_nfp" driver

 NEWS              |  2 ++
 lib/dpif-netdev.c | 28
 lib/netdev-dpdk.c | 36
 3 files changed, 44 insertions(+), 22 deletions(-)

Thanks
Ian
[ovs-dev] OVS DPDK: dpdk_merge pull request for branch-2.9
Hi Ben,

The following changes since commit 2a3ee45583ef3880da99fd415410b71054f7711a:

  datapath: compat: Fix ndo_size in RHEL 7.5 backport (2018-05-23 15:35:36 -0700)

are available in the git repository at:

  https://github.com/istokes/ovs dpdk_merge_2_9

for you to fetch changes up to 418be3dab421f1877411efc2e4275d00f28fcb4a:

  dpif-netdev: Free packets on TUNNEL_PUSH if may_steal. (2018-05-24 20:03:14 +0100)

Eelco Chaudron (1):
      netdev-dpdk: Don't use PMD driver if not configured successfully

Ilya Maximets (1):
      dpif-netdev: Free packets on TUNNEL_PUSH if may_steal.

Kevin Traynor (1):
      netdev-dpdk: Remove use of rte_mempool_ops_get_count.

Timothy Redaelli (1):
      netdev-dpdk: fix check for "net_nfp" driver

 lib/dpif-netdev.c | 28
 lib/netdev-dpdk.c | 35
 2 files changed, 41 insertions(+), 22 deletions(-)

Thanks
Ian
[ovs-dev] OVS DPDK: dpdk_merge pull request for master
Hi Ben,

The following changes since commit 55b259471719ceca1f0083bdd6a5f8c3e7690bae:

  Extend tests for conjunctive match support in OVN (2018-05-24 11:23:58 -0700)

are available in the git repository at:

  https://github.com/istokes/ovs dpdk_merge

for you to fetch changes up to 3e52fa5644cde2eba56c48355d5664e992992bf2:

  dpdk: reflect status and version in the database (2018-05-25 09:09:50 +0100)

Aaron Conole (2):
      dpdk: allow init to fail
      dpdk: reflect status and version in the database

Eelco Chaudron (1):
      netdev-dpdk: Don't use PMD driver if not configured successfully

Ilya Maximets (1):
      dpif-netdev: Free packets on TUNNEL_PUSH if should_steal.

Kevin Traynor (1):
      netdev-dpdk: Remove use of rte_mempool_ops_get_count.

Timothy Redaelli (1):
      netdev-dpdk: fix check for "net_nfp" driver

 Documentation/faq/configuration.rst  |  8
 Documentation/intro/install/dpdk.rst | 27
 NEWS                                 |  1 +
 lib/dpdk-stub.c                      | 10
 lib/dpdk.c                           | 42
 lib/dpdk.h                           |  3 ++-
 lib/dpif-netdev.c                    | 21
 lib/netdev-dpdk.c                    | 35
 vswitchd/bridge.c                    |  5
 vswitchd/vswitch.ovsschema           | 11
 vswitchd/vswitch.xml                 | 11
 11 files changed, 134 insertions(+), 40 deletions(-)

Thanks
Ian
Re: [ovs-dev] OVS in UserSpace !
On Wed, May 23, 2018 at 11:50 PM, rakesh kumar wrote:
> Dear team,
>
> Please share your experience of OVS in userspace, I am trying to configure
> OVS in userspace and want to skip the Kernel module so that the packet
> processing and switching can only be handled in Kernel space.
>

The OVS userspace datapath might be very slow unless you use OVS-DPDK.
See these instructions:
http://docs.openvswitch.org/en/latest/intro/install/userspace/

--William
Re: [ovs-dev] Ovs in userspace
On Fri, May 25, 2018 at 12:58 AM, rakesh kumar wrote:
> Hello Team
>
> I am planning to write few algorithms to modify the switching behavior of
> packets by ovs switch but I don’t want to do in kernel mode , I want my
> entire process to be handled in userspace .
>
> Team waiting for your suggestions!!
>

Hi,

You can use OVS-DPDK, which runs in userspace.
Or, if you don't want to use DPDK, you can use:
http://docs.openvswitch.org/en/latest/intro/install/userspace/

--William
Re: [ovs-dev] [PATCH 1/1] ovs-thread: Fix thread id for threads not started with ovs_thread_create()
> When ping-pong'in a live VM migration between two machines running > OVS-DPDK every now and then the ping misses would increase > dramatically. For example: > > ===Stream Rate: 3Mpps=== > No Stream_Rate Downtime Totaltime Ping_Loss Moongen_Loss > 0 3Mpps 128 13974 115 7168374 > 1 3Mpps 145 1362017 1169770 > 2 3Mpps 140 14499 116 7141175 > 3 3Mpps 142 1335816 1150606 > 4 3Mpps 136 1400416 1124020 > 5 3Mpps 139 15494 214 13170452 > 6 3Mpps 136 15610 217 13282413 > 7 3Mpps 146 1319417 1167512 > 8 3Mpps 148 1287116 1162655 > 9 3Mpps 137 15615 214 13170656 > > I identified this issue being introduced in OVS commit > https://github.com/openvswitch/ovs/commit/f3e7ec254738364101eed8f04b1d954cb510615c > and more specific due to DPDK commit > http://dpdk.org/browse/dpdk/commit/?id=af14759181240120f76c82f894982e8f33f0ba2a > > The combined changes no longer have OVS start the vhost socket polling > thread at startup, but DPDK will do it on its own when the first vhost > client is started. > > Figuring out the reason why this happens kept me puzzled for quite some > time... > What happens is that the callbacks called from the vhost thread are > calling ovsrcu_synchronize() as part of destroy_device(). This will > end-up calling seq_wait__(), and the problem is with this part: > >176static void >177seq_wait__(struct seq *seq, uint64_t value, const char *where) >178OVS_REQUIRES(seq_mutex) >179{ >>> 180unsigned int id = ovsthread_id_self(); >181uint32_t hash = hash_int(id, 0); >182struct seq_waiter *waiter; >183 >184HMAP_FOR_EACH_IN_BUCKET (waiter, hmap_node, hash, >waiters) { >>> 185if (waiter->ovsthread_id == id) { >186if (waiter->value != value) { >187/* The current value is different from the value we've > already >188 * waited for, */ >189poll_immediate_wake_at(where); >190} else { >191/* Already waiting on 'value', nothing more to do. 
*/ >192} >>> 193return; >194} >195} >196 > > By default, all created threads outside of OVS will get thread id 0, > which is equal to the main ovs thread. So for example in the function > above if the main thread is waiting already we won't add ourselves as > a waiter. Good catch. Thanks for working on this. I guess, this issue could even cause a crash, because vhost device could be freed before other threads stop working with it. > > The fix below assigns UINT_MAX to none OVS created threads, which will > fix this specific issue. However if more none OVS threads gets added > the issue might arise again. > > Currently, I do not see another solution that will work unless DPDK is > adding some framework/callback support when new threads get created. What do you think about allocating ids on demand inside 'ovsthread_id_self()'? This will work for any number of threads. Something like this: diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c index f8bc06d..ff3a1df 100644 --- a/lib/ovs-thread.c +++ b/lib/ovs-thread.c @@ -315,7 +315,7 @@ ovs_barrier_block(struct ovs_barrier *barrier) } } -DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, 0); +DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, UINT_MAX); struct ovsthread_aux { void *(*start)(void *); @@ -323,24 +323,28 @@ struct ovsthread_aux { char name[16]; }; +void +ovsthread_id_init(void) +{ +static atomic_count next_id = ATOMIC_COUNT_INIT(0); + +unsigned int id = atomic_count_inc(_id); + +*ovsthread_id_get() = id; +} + static void * ovsthread_wrapper(void *aux_) { -static atomic_count next_id = ATOMIC_COUNT_INIT(1); - struct ovsthread_aux *auxp = aux_; struct ovsthread_aux aux; -unsigned int id; - -id = atomic_count_inc(_id); -*ovsthread_id_get() = id; aux = *auxp; free(auxp); /* The order of the following calls is important, because * ovsrcu_quiesce_end() saves a copy of the thread name. 
*/ -char *subprogram_name = xasprintf("%s%u", aux.name, id); +char *subprogram_name = xasprintf("%s%u", aux.name, ovsthread_id_self()); set_subprogram_name(subprogram_name); free(subprogram_name); ovsrcu_quiesce_end(); diff --git a/lib/ovs-thread.h b/lib/ovs-thread.h index 03fd804..ada09d1 100644 --- a/lib/ovs-thread.h +++ b/lib/ovs-thread.h @@ -467,12 +467,21 @@ void *ovsthread_getspecific(ovsthread_key_t); DECLARE_EXTERN_PER_THREAD_DATA(unsigned int, ovsthread_id); +void ovsthread_id_init(void); + /* Returns a per-thread identifier unique within the
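
For readers following the thread, the on-demand allocation idea suggested above can be sketched outside of OVS roughly as follows. This is only a minimal sketch under stated assumptions, not the patch itself: it uses C11 `_Thread_local` and `<stdatomic.h>` in place of the OVS per-thread data macros and `atomic_count`, and every `demo_` name is hypothetical. The point it illustrates is that a thread created by a foreign library (for example, a plain pthread started by DPDK) still gets a unique id the first time it asks for one.

```c
#include <limits.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical sentinel meaning "no id assigned yet"; it mirrors the
 * UINT_MAX default used in the diff above. */
#define DEMO_ID_UNSET UINT_MAX

/* Stand-in (assumption) for the OVS per-thread 'ovsthread_id' data. */
static _Thread_local unsigned int demo_thread_id = DEMO_ID_UNSET;

/* Process-wide counter; static storage makes it start at 0. */
static atomic_uint demo_next_id;

/* Returns the calling thread's id, allocating one on first use.  The
 * first thread that calls this gets id 0. */
static unsigned int
demo_thread_id_self(void)
{
    if (demo_thread_id == DEMO_ID_UNSET) {
        demo_thread_id = atomic_fetch_add(&demo_next_id, 1);
    }
    return demo_thread_id;
}

/* Thread body: stores the id that the new thread observed. */
static void *
demo_worker(void *out_)
{
    unsigned int *out = out_;

    *out = demo_thread_id_self();
    return NULL;
}

/* Starts one plain pthread (not created through any wrapper) and returns
 * 0 if it received an id different from the caller's, nonzero otherwise. */
static int
demo_ids_are_distinct(void)
{
    unsigned int self = demo_thread_id_self();
    unsigned int other = DEMO_ID_UNSET;
    pthread_t thread;

    if (pthread_create(&thread, NULL, demo_worker, &other)) {
        return -1;
    }
    pthread_join(thread, NULL);
    return (other != self && other != DEMO_ID_UNSET) ? 0 : 1;
}
```

Calling demo_ids_are_distinct() from main() (linking with -pthread) should return 0. One caveat of this sketch: it does not handle the counter wrapping around to the sentinel value, which a real implementation would presumably have to consider.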
[ovs-dev] [PATCH 2/2] compat/ip_gre: remove duplicate vport definition.
Clean up the duplicate definition of OVS_VPORT_TYPE_ERSPAN since it is
already defined in openvswitch.h.

Cc: Greg Rose
Signed-off-by: William Tu
---
 datapath/linux/compat/ip_gre.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/datapath/linux/compat/ip_gre.c b/datapath/linux/compat/ip_gre.c
index 4e5565ee362a..5911c6cc10a2 100644
--- a/datapath/linux/compat/ip_gre.c
+++ b/datapath/linux/compat/ip_gre.c
@@ -1568,10 +1568,6 @@ static struct vport *erspan_create(const struct vport_parms *parms)
 	return ovs_netdev_link(vport, parms->name);
 }
 
-#ifndef OVS_VPORT_TYPE_ERSPAN
-/* Until integration is done... */
-#define OVS_VPORT_TYPE_ERSPAN 107 /* ERSPAN tunnel. */
-#endif
 static struct vport_ops ovs_erspan_vport_ops = {
 	.type = OVS_VPORT_TYPE_ERSPAN,
 	.create = erspan_create,
-- 
2.7.4
[ovs-dev] [PATCH 1/2] odp-util: refactor erspan option parsing.
Instead of copying the attribute into a local stack variable with memcpy,
parse the erspan metadata in place.

Suggested-by: Ben Pfaff
Signed-off-by: William Tu
---
 lib/odp-util.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/lib/odp-util.c b/lib/odp-util.c
index 105ac809073e..5e858f0f9797 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -2781,17 +2781,14 @@ odp_tun_key_from_attr__(const struct nlattr *attr, bool is_mask,
             tun_metadata_from_geneve_nlattr(a, is_mask, tun);
             break;
         case OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS: {
-            int attr_len = nl_attr_get_size(a);
-            struct erspan_metadata opts;
+            const struct erspan_metadata *opts = nl_attr_get(a);
 
-            memcpy(&opts, nl_attr_get(a), attr_len);
-
-            tun->erspan_ver = opts.version;
+            tun->erspan_ver = opts->version;
             if (tun->erspan_ver == 1) {
-                tun->erspan_idx = ntohl(opts.u.index);
+                tun->erspan_idx = ntohl(opts->u.index);
             } else if (tun->erspan_ver == 2) {
-                tun->erspan_dir = opts.u.md2.dir;
-                tun->erspan_hwid = get_hwid(&opts.u.md2);
+                tun->erspan_dir = opts->u.md2.dir;
+                tun->erspan_hwid = get_hwid(&opts->u.md2);
             } else {
                 VLOG_WARN("%s invalid erspan version\n", __func__);
             }
-- 
2.7.4
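
As a side note for readers, the difference between the two parsing styles in this patch can be sketched in isolation. This is a hedged illustration only: `struct demo_md` and both helper functions are hypothetical and far simpler than the real `struct erspan_metadata`, and byte-order conversion is left out. The in-place variant assumes the payload is suitably aligned and at least as large as the struct, which is an assumption of this sketch rather than something stated in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an attribute payload layout. */
struct demo_md {
    int32_t version;
    uint32_t index;
};

/* Copy style: copies at most sizeof(struct demo_md) bytes of the payload
 * into a zeroed local variable and reads the field from the copy. */
static uint32_t
demo_index_via_copy(const void *payload, size_t len)
{
    struct demo_md md;

    memset(&md, 0, sizeof md);
    memcpy(&md, payload, len < sizeof md ? len : sizeof md);
    return md.index;
}

/* In-place style: reads the field through a pointer to the payload.
 * Assumes 'payload' is aligned for struct demo_md and large enough. */
static uint32_t
demo_index_in_place(const void *payload)
{
    const struct demo_md *md = payload;

    return md->index;
}
```

For a well-formed, aligned payload both helpers return the same value; the in-place version simply avoids the extra stack copy.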
Re: [ovs-dev] [patch v4 2/2] packets: erspan_metadata header fixups.
>
> OK, now I understand what's going on. This is only used to define the
> format of the OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attribute. Since it's not
> a wire format at all, we don't need to use ovs_16aligned_be32 either,
> and an "int" is fine too.
>
> Since erspan_metadata is part of the kernel ABI, which is exposed via
> Netlink, it should normally be defined by including a kernel header
> rather than in packets.h, which normally just defines wire formats for
> things. Can we arrange for that to happen?
>

But the kernel UAPI erspan.h defines both 'struct erspan_md2' and
'struct erspan_metadata'. If we choose to include it from the kernel, then
the wire format 'struct erspan_md2' isn't defined using ovs_16aligned_be32.

Or should we generate an erspan.h from the kernel UAPI erspan.h?
Something like:

  sed -f ./build-aux/extract-odp-netlink-h < datapath/linux/compat/include/linux/openvswitch.h > include/odp-netlink.h

but generating include/erspan.h instead.

Thanks
William
Re: [ovs-dev] [PATCH v1] netdev-dpdk: Enable HW_CRC_STRIP for virtual functions.
Hi, I did all my testing with DPDK 17.11.2 and the following OVS commit (patch doesn't apply cleanly to master): cferrite@silpixa00393943:~/ovs$ git show --summary commit 6b71df2fc06921758798e41833d67d9daf647d19 Author: Kevin TraynorDate: Wed Apr 25 12:20:53 2018 +0100 faq: Document DPDK version maintenance. Before the patch was applied, the following was seen when a VF port is added: cferrite@silpixa00393943:~$ sudo $OVS_DIR/utilities/ovs-vsctl --timeout 10 add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=:07:02.0 ovs-vsctl: Error detected while setting up 'dpdk0': could not add network device dpdk0 to ofproto (No such device). See ovs-vswitchd log for details. ovs-vsctl: The default log directory is "/usr/local/var/log/openvswitch". The vswitchd log after the VF port is added: 2018-05-25T09:31:16Z|00080|dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 3 created. 2018-05-25T09:31:16Z|00081|dpif_netdev|INFO|There are 1 pmd threads on numa node 0 2018-05-25T09:31:16Z|00082|dpdk|ERR|i40evf_dev_configure(): VF can't disable HW CRC Strip 2018-05-25T09:31:16Z|00083|netdev_dpdk|WARN|Interface dpdk0 eth_dev setup error Invalid argument 2018-05-25T09:31:16Z|00084|netdev_dpdk|ERR|Interface dpdk0(rxq:1 txq:2 lsc interrupt mode:false) configure error: Invalid argument 2018-05-25T09:31:16Z|00085|dpif_netdev|ERR|Failed to set interface dpdk0 new configuration 2018-05-25T09:31:16Z|00086|bridge|WARN|could not add network device dpdk0 to ofproto (No such device) After the patch is applied there is no error message when the VF port is added: cferrite@silpixa00393943:~$ sudo $OVS_DIR/utilities/ovs-vsctl --timeout 10 add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=:07:02.0 The vswitchd log after the VF port is added: 2018-05-25T10:00:31Z|00081|netdev_dpdk|INFO|Virtual function detected, HW_CRC_STRIP will be enabled 2018-05-25T10:00:33Z|00082|dpdk|WARN|i40evf_execute_vf_cmd(): No response for 28 
2018-05-25T10:00:33Z|00083|dpdk|ERR|i40evf_disable_vlan_strip(): Failed to execute command of VIRTCHNL_OP_DISABLE_VLAN_STRIPPING 2018-05-25T10:00:33Z|00084|dpdk|ERR|i40evf_config_promisc(): fail to execute command CONFIG_PROMISCUOUS_MODE 2018-05-25T10:00:33Z|00085|dpdk|ERR|i40evf_config_promisc(): fail to execute command CONFIG_PROMISCUOUS_MODE There are errors related to promiscuous mode, but there is a message stating that CRC Stripping will be enabled so the patch is working for me. Appctl query on the port: cferrite@silpixa00393943:~$ sudo $OVS_DIR/utilities/ovs-appctl dpif/show netdev@ovs-netdev: hit:0 missed:0 br0: br0 65534/1: (tap) dpdk0 1/2: (dpdk: configured_rx_queues=1, configured_rxq_descriptors=2048, configured_tx_queues=2, configured_txq_descriptors=2048, mtu=1500, requested_rx_queues=1, requested_rxq_descriptors=2048, requested_tx_queues=2, requested_txq_descriptors=2048, rx_csum_offload=true) I have also successfully passed traffic into this VF port and out a regular PF port as summarised by the below CMDs: sudo $OVS_DIR/utilities/ovs-vsctl --timeout 10 add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=:07:02.0 sudo $OVS_DIR/utilities/ovs-vsctl --timeout 10 add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk options:dpdk-devargs=:07:00.1 sudo $OVS_DIR/utilities/ovs-ofctl del-flows br0 sudo $OVS_DIR/utilities/ovs-ofctl add-flow br0 idle_timeout=0,in_port=1,action=output:2 sudo $OVS_DIR/utilities/ovs-ofctl add-flow br0 idle_timeout=0,in_port=2,action=output:1 The two findings from my testing: * There are error messages coming from DPDK about promiscuous mode on the port. Is this something that should be disabled or otherwise considered in the next revision of the patch? * The patch doesn't apply cleanly to master. 
It looks like this is caused by the following patch: cferrite@silpixa00393943:~/ovs$ git show 65a87968f4 commit 65a87968f4cfd9cf7a433a3156d98118078f9e4e Author: Pablo Cascón Date: Fri Apr 27 17:40:49 2018 +0100 netdev-dpdk: don't enable scatter for jumbo RX support for nfp I have one comment about the code below, if this is addressed/resolved I'm happy to give: Acked-by: Cian Ferriter Tested-by: Cian Ferriter Thanks, Cian > -Original Message- > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- > boun...@openvswitch.org] On Behalf Of Ian Stokes > Sent: 03 May 2018 14:25 > To: d...@openvswitch.org > Subject: [ovs-dev] [PATCH v1] netdev-dpdk: Enable HW_CRC_STRIP for > virtual functions. > > From: Michal Weglicki > > Virtual functions such as igb_vf and i40e_vf require HW_CRC_STRIP to be > explicitly enabled before configuration, otherwise device configuration will > fail. > > This commit achieves this by adding NETDEV_RX_HW_CRC_STRIP to > dpdk_hw_ol_features. When a dpdk device
[ovs-dev] [PATCH v3] Avoid tunneling for VLAN packets redirected to a gateway chassis
From: venkata anilWhen a vm on a vlan tenant network sends traffic to an external network, it is tunneled from host chassis to gateway chassis. In the earlier discussion [1], Russel (also in his doc [2]) suggested if we can figure out a way for OVN to do this redirect to the gateway host over a VLAN network. This patch implements his suggestion i.e will redirect to gateway chassis using incoming tenant vlan network. Gateway chassis are expected to be configured with tenant vlan networks. In this approach, new logical and physical flows introduced for packet processing in both host and gateway chassis. Packet processing in the host chassis: 1) A new ovs flow added in physical table 65, which sets MLF_RCV_FROM_VLAN flag for packets from vlan network entering into router pipeline 2) A new flow added in lr_in_ip_routing, for packets output through distributed gateway port and matching MLF_RCV_FROM_VLAN flag, set REGBIT_NAT_REDIRECT i.e table=7 (lr_in_ip_routing ), priority=2, match=( ip4.dst == 0.0.0.0/0 && flags.rcv_from_vlan == 1 && !is_chassis_resident("cr-alice")), action=(reg9[0] = 1; next;) This flow will be set only on chassis not hosting chassisredirect port i.e compute node. When REGBIT_NAT_REDIRECT set, a) lr_in_arp_resolve, will set packet eth.dst to distibuted gateway port MAC b) lr_in_gw_redirect, will set chassisredirect port as outport 3) A new ovs flow added in physical table 32 will use source vlan tenant network tag as vlan ID for sending the packet to gateway chassis. As this vlan packet destination MAC is distibuted gateway port MAC, packet will only reach the gateway chassis. table=32,priority=150,reg14=0x3,reg15=0x6,metadata=0x4 actions=mod_vlan_vid:2010,output:25,strip_vlan This flow will be set only on chassis not hosting chassisredirect port i.e compute node. 
Packet processing in the gateway chassis: 1) A new ovs flow added in physical table 0 to pass vlan traffic coming from localnet port to the connected router pipeline(i.e router attached to vlan tenant network). This flow will set router metadata, reg14 to router's patch port(lrp) (i.e patch port connecting router and vlan tenant network) and a new MLF_RCV_FROM_VLAN flag. table=0,priority=150,in_port=67,dl_vlan=2010 actions=strip_vlan, load:0x4->OXM_OF_METADATA[],load:0x3->NXM_NX_REG14[], load:0x1->NXM_NX_REG10[5],resubmit(,8) This flow will be set only on chassis hosting chassisredirect port i.e gateway node. 2) A new flow added in lr_in_admission which checks MLF_RCV_FROM_VLAN and allows the packet. This flow will be set only on chassis hosting chassisredirect port i.e gateway node. table=0 (lr_in_admission), priority=100 , match=( flags.rcv_from_vlan == 1 && inport == "lrp-44383893-613a-4bfe-b483- e7d0dc3055cd" && is_chassis_resident("cr-lrp-a6e3d2ab-313a-4ea3- 8ec4-c3c774a11f49")), action=(next;) Then packet will pass through router ingress and egress pipelines and then to external switch pipeline. In a scenario where the traffic between two vms in the same tenant vlan network across different chassis i.e if "vm1" on tenant vlan network "net1" is on host chassis "ch1" and "vm2" on same tenant vlan network "net1" is on gateway chassis "gw1". When the packet arrived on "gw1" chassis from localnet port, we still send it to router pipeline and router pipeline will send it to destination switch ("net1") pipeline. But in this case when packet arrives at "vm2", it will have router MAC as source MAC as the packet is routed in gateway chassis. This bevaviour can be seen only for destination vms hosted on gateway node. 
[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2018-April/046557.html [2] Point 3 in section 3.3.1 - Future Enhancements https://docs.google.com/document/d/1JecGIXPH0RAqfGvD0nmtBdEU1zflHACp8WSRnKCFSgg/edit# Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-April/046543.html Signed-off-by: Venkata Anil --- ovn/controller/bfd.c| 3 +- ovn/controller/binding.c| 10 +- ovn/controller/ovn-controller.c | 3 + ovn/controller/ovn-controller.h | 16 ++- ovn/controller/physical.c | 94 - ovn/lib/logical-fields.c| 4 + ovn/lib/logical-fields.h| 2 + ovn/northd/ovn-northd.c | 35 +++ tests/ovn.at| 227 9 files changed, 390 insertions(+), 4 deletions(-) diff --git a/ovn/controller/bfd.c b/ovn/controller/bfd.c index 8f020d5..cbbd3ba 100644 --- a/ovn/controller/bfd.c +++ b/ovn/controller/bfd.c @@ -139,8 +139,9 @@ bfd_travel_gw_related_chassis(struct local_datapath *dp, struct local_datapath_node, node); dp = dp_binding->dp; free(dp_binding); +const struct sbrec_datapath_binding *pdp; for (size_t i = 0; i < dp->n_peer_dps;
Re: [ovs-dev] [RFC PATCH v1] dpdk: Support both shared and per port mempools.
On 05/18/2018 03:30 PM, Ian Stokes wrote: > This commit re-introduces the concept of shared mempools as the default > memory model for DPDK devices. Per port mempools are still available but > must be enabled explicitly by a user. > > OVS previously used a shared mempool model for ports with the same MTU > and socket configuration. This was replaced by a per port mempool model > to address issues flagged by users such as: > > https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/042560.html > > However the per port model potentially requires an increase in memory > resource requirements to support the same number of ports and configuration > as the shared port model. > > This is considered a blocking factor for current deployments of OVS > when upgrading to future OVS releases as a user may have to redimension > memory for the same deployment configuration. This may not be possible for > users. > Hi Ian, thanks for addressing this. Few comments below... > This commit resolves the issue by re-introducing shared mempools as > the default memory behaviour in OVS DPDK but also refactors the memory > configuration code to allow for per port mempools. > This patch adds a new global config option, per-port-mp-enabled, that > controls the enablement of per port mempools for DPDK devices. > > ovs-vsctl set Open_vSwitch . other_config:per-port-mp-enabled=true > It doesn't need enabled in the name. If say 'per-port-mempool=true/false', then it's already obvious it's enabled. Actually, I wonder if 'per-port-memory' would be better as it doesn't require a user to know what a DPDK mempool is. > This value defaults to false; to enable per port mempool support, > this field should be set to true when setting other global parameters > on init (such as "dpdk-socket-mem", for example). Changing the value at > runtime is not supported, and requires restarting the vswitch > daemon. 
> > The mempool sweep functionality is also replaced with the > sweep functionality from OVS 2.9 found in commits > > c77f692 (netdev-dpdk: Free mempool only when no in-use mbufs.) > a7fb0a4 (netdev-dpdk: Add mempool reuse/free debug.) > > As this patch is RFC there are a number of TO-DOs including adding a > memory calculation section to the documentation for both models. This is > expected to be completed in the v1 after RFC. > > Signed-off-by: Ian Stokes> --- > Documentation/automake.mk| 1 + > Documentation/topics/dpdk/index.rst | 1 + > Documentation/topics/dpdk/memory.rst | 67 +++ > NEWS | 1 + > lib/dpdk-stub.c | 6 + > lib/dpdk.c | 13 ++ > lib/dpdk.h | 1 + > lib/netdev-dpdk.c| 326 > +-- > vswitchd/vswitch.xml | 16 ++ > 9 files changed, 305 insertions(+), 127 deletions(-) > create mode 100644 Documentation/topics/dpdk/memory.rst > > diff --git a/Documentation/automake.mk b/Documentation/automake.mk > index 683ca14..14c2189 100644 > --- a/Documentation/automake.mk > +++ b/Documentation/automake.mk > @@ -36,6 +36,7 @@ DOC_SOURCE = \ > Documentation/topics/dpdk/index.rst \ > Documentation/topics/dpdk/bridge.rst \ > Documentation/topics/dpdk/jumbo-frames.rst \ > + Documentation/topics/dpdk/memory.rst \ > Documentation/topics/dpdk/pdump.rst \ > Documentation/topics/dpdk/phy.rst \ > Documentation/topics/dpdk/pmd.rst \ > diff --git a/Documentation/topics/dpdk/index.rst > b/Documentation/topics/dpdk/index.rst > index 181f61a..cf24a7b 100644 > --- a/Documentation/topics/dpdk/index.rst > +++ b/Documentation/topics/dpdk/index.rst > @@ -40,3 +40,4 @@ The DPDK Datapath > /topics/dpdk/qos > /topics/dpdk/pdump > /topics/dpdk/jumbo-frames > + /topics/dpdk/memory > diff --git a/Documentation/topics/dpdk/memory.rst > b/Documentation/topics/dpdk/memory.rst > new file mode 100644 > index 000..1198067 > --- /dev/null > +++ b/Documentation/topics/dpdk/memory.rst > @@ -0,0 +1,67 @@ > +.. > +Copyright 2018, Intel, Inc. 
> + > + Licensed under the Apache License, Version 2.0 (the "License"); you may > + not use this file except in compliance with the License. You may obtain > + a copy of the License at > + > + http://www.apache.org/licenses/LICENSE-2.0 > + > + Unless required by applicable law or agreed to in writing, software > + distributed under the License is distributed on an "AS IS" BASIS, > WITHOUT > + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See > the > + License for the specific language governing permissions and limitations > + under the License. > + > + Convention for heading levels in Open vSwitch documentation: > + > + === Heading 0 (reserved for the title in a document) > + --- Heading 1 > + ~~~ Heading 2 > +
[ovs-dev] [PATCH 1/1] ovs-thread: Fix thread id for threads not started with ovs_thread_create()
When ping-ponging a live VM migration between two machines running
OVS-DPDK, every now and then the ping misses would increase
dramatically. For example:

===Stream Rate: 3Mpps===
No  Stream_Rate  Downtime  Totaltime  Ping_Loss  Moongen_Loss
0   3Mpps        128       13974      115        7168374
1   3Mpps        145       13620      17         1169770
2   3Mpps        140       14499      116        7141175
3   3Mpps        142       13358      16         1150606
4   3Mpps        136       14004      16         1124020
5   3Mpps        139       15494      214        13170452
6   3Mpps        136       15610      217        13282413
7   3Mpps        146       13194      17         1167512
8   3Mpps        148       12871      16         1162655
9   3Mpps        137       15615      214        13170656

I traced this issue to OVS commit
https://github.com/openvswitch/ovs/commit/f3e7ec254738364101eed8f04b1d954cb510615c
and, more specifically, to DPDK commit
http://dpdk.org/browse/dpdk/commit/?id=af14759181240120f76c82f894982e8f33f0ba2a

With the combined changes, OVS no longer starts the vhost socket polling
thread at startup; instead, DPDK starts it on its own when the first
vhost client is started.

Figuring out why this happens kept me puzzled for quite some time...
What happens is that the callbacks called from the vhost thread call
ovsrcu_synchronize() as part of destroy_device(). This ends up calling
seq_wait__(), and the problem is with the lines marked ">>" below:

   static void
   seq_wait__(struct seq *seq, uint64_t value, const char *where)
       OVS_REQUIRES(seq_mutex)
   {
>>     unsigned int id = ovsthread_id_self();
       uint32_t hash = hash_int(id, 0);
       struct seq_waiter *waiter;

       HMAP_FOR_EACH_IN_BUCKET (waiter, hmap_node, hash, &seq->waiters) {
>>         if (waiter->ovsthread_id == id) {
               if (waiter->value != value) {
                   /* The current value is different from the value we've already
                    * waited for, */
                   poll_immediate_wake_at(where);
               } else {
                   /* Already waiting on 'value', nothing more to do. */
               }
>>             return;
           }
       }

By default, all threads created outside of OVS get thread id 0, which is
the same id as the main OVS thread. So, for example, in the function
above, if the main thread is already waiting, we won't add ourselves as
a waiter.

The fix below assigns UINT_MAX to threads not created by OVS, which
fixes this specific issue. However, if more non-OVS threads get added,
the issue might arise again, since they would all share the same id.

Currently, I do not see another solution that will work unless DPDK
adds some framework/callback support for when new threads get created.

Signed-off-by: Eelco Chaudron
---
 lib/ovs-thread.c        | 2 +-
 lib/ovs-thread.h        | 7 +++++++
 vswitchd/ovs-vswitchd.c | 2 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
index f8bc06d38..e663794ea 100644
--- a/lib/ovs-thread.c
+++ b/lib/ovs-thread.c
@@ -315,7 +315,7 @@ ovs_barrier_block(struct ovs_barrier *barrier)
     }
 }

-DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, 0);
+DEFINE_EXTERN_PER_THREAD_DATA(ovsthread_id, UINT_MAX);

 struct ovsthread_aux {
     void *(*start)(void *);
diff --git a/lib/ovs-thread.h b/lib/ovs-thread.h
index 03fd80439..4f543eb57 100644
--- a/lib/ovs-thread.h
+++ b/lib/ovs-thread.h
@@ -525,4 +525,11 @@ bool may_fork(void);
 int count_cpu_cores(void);
 bool thread_is_pmd(void);

+static inline void
+ovsthread_set_id(unsigned int id)
+{
+    assert_single_threaded();
+    *ovsthread_id_get() = id;
+}
+
 #endif /* ovs-thread.h */
diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
index 414b54780..ed456e2d2 100644
--- a/vswitchd/ovs-vswitchd.c
+++ b/vswitchd/ovs-vswitchd.c
@@ -39,6 +39,7 @@
 #include "ovsdb-idl.h"
 #include "ovs-rcu.h"
 #include "ovs-router.h"
+#include "ovs-thread.h"
 #include "openvswitch/poll-loop.h"
 #include "simap.h"
 #include "stream-ssl.h"
@@ -78,6 +79,7 @@ main(int argc, char *argv[])
     int retval;

     set_program_name(argv[0]);
+    ovsthread_set_id(0);
     ovs_cmdl_proctitle_init(argc, argv);
     service_start(&argc, &argv);
--
2.17.0
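The id collision described in the patch above can be illustrated with a minimal sketch. It is not OVS code: `waiter_set`, `waiter_add()` and `foreign_thread_gets_waiter()` are hypothetical names, thread ids are modelled as plain integers instead of real per-thread data, and the hash map is replaced by a small array. Only the early return on a matching thread id is taken from the quoted seq_wait__() fragment.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 4

/* Hypothetical, simplified stand-in for the per-seq waiter table. */
struct waiter_set {
    unsigned int ids[MAX_WAITERS];
    size_t n;
};

/* Mirrors the early return in seq_wait__(): a thread id that is already
 * registered is not added again.  Returns true if a waiter was added. */
static bool
waiter_add(struct waiter_set *set, unsigned int thread_id)
{
    for (size_t i = 0; i < set->n; i++) {
        if (set->ids[i] == thread_id) {
            return false;           /* Treated as "already waiting". */
        }
    }
    assert(set->n < MAX_WAITERS);
    set->ids[set->n++] = thread_id;
    return true;
}

/* Registers the main thread (id 0) and then one thread that was not
 * created by OVS, whose id is whatever the per-thread default is.
 * Returns true if that second thread ended up with its own waiter. */
static bool
foreign_thread_gets_waiter(unsigned int default_thread_id)
{
    struct waiter_set set = { .n = 0 };
    const unsigned int main_thread_id = 0;

    waiter_add(&set, main_thread_id);
    return waiter_add(&set, default_thread_id);
}
```

With a default of 0 the second thread is mistaken for the main thread and gets no waiter; with a default of UINT_MAX it gets its own entry. Two non-OVS threads would still collide with each other under this model, which matches the caveat in the patch description.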
Re: [ovs-dev] [patch v4 2/2] packets: erspan_metadata header fixups.
On Thu, May 24, 2018 at 2:16 PM, Ben Pfaff wrote:
> On Thu, May 24, 2018 at 12:08:31PM -0700, William Tu wrote:
> > On Thu, May 24, 2018 at 10:20 AM, Ben Pfaff wrote:
> > > On Thu, May 24, 2018 at 05:36:10AM -0700, William Tu wrote:
> > >> On Wed, May 23, 2018 at 7:13 PM, Darrell Ball wrote:
> > >> > The struct erspan_metadata is updated to replace the 'version'
> > >> > placeholder with the erspan base hdr.  Also, the erspan
> > >> > index is defined explicitly as ovs_16aligned_be32 to mirror
> > >> > its encoding.
> > >> > Changes to odp_util result from updating the erspan index
> > >> > type.
> > >> >
> > >> > CC: William Tu
> > >> > Fixes: 068794b43f0e ("erspan: Add flow-based erspan options")
> > >> > Signed-off-by: Darrell Ball
> > >> > ---
> > >> >  lib/odp-util.c | 36
> > >> >  lib/packets.h  |  6 +++---
> > >> >  2 files changed, 23 insertions(+), 19 deletions(-)
> > >> >
> > >> > diff --git a/lib/odp-util.c b/lib/odp-util.c
> > >> > index 105ac80..767281f 100644
> > >> > --- a/lib/odp-util.c
> > >> > +++ b/lib/odp-util.c
> > >> > @@ -2786,9 +2786,9 @@ odp_tun_key_from_attr__(const struct nlattr *attr, bool is_mask,
> > >> >
> > >> >              memcpy(&opts, nl_attr_get(a), attr_len);
> > >> >
> > >> > -            tun->erspan_ver = opts.version;
> > >> > +            tun->erspan_ver = opts.bh.ver;
> > >> >              if (tun->erspan_ver == 1) {
> > >> > -                tun->erspan_idx = ntohl(opts.u.index);
> > >> > +                tun->erspan_idx = ntohl(get_16aligned_be32(&opts.u.index));
> > >> >              } else if (tun->erspan_ver == 2) {
> > >> >                  tun->erspan_dir = opts.u.md2.dir;
> > >> >                  tun->erspan_hwid = get_hwid(&opts.u.md2);
> > >> > @@ -2890,10 +2890,11 @@ tun_key_to_attr(struct ofpbuf *a, const struct flow_tnl *tun_key,
> > >> >           !strcmp(tnl_type, "ip6erspan")) &&
> > >> >          (tun_key->erspan_ver == 1 || tun_key->erspan_ver == 2)) {
> > >> >          struct erspan_metadata opts;
> > >> > +        memset(&opts, 0, sizeof opts);
> > >> >
> > >> > -        opts.version = tun_key->erspan_ver;
> > >> > -        if (opts.version == 1) {
> > >> > -            opts.u.index = htonl(tun_key->erspan_idx);
> > >> > +        opts.bh.ver = tun_key->erspan_ver;
> > >> > +        if (opts.bh.ver == 1) {
> > >> > +            put_16aligned_be32(&opts.u.index, htonl(tun_key->erspan_idx));
> > >> >          } else {
> > >> >              opts.u.md2.dir = tun_key->erspan_dir;
> > >> >              set_hwid(&opts.u.md2, tun_key->erspan_hwid);
> > >> > @@ -3368,22 +3369,23 @@ format_odp_tun_erspan_opt(const struct nlattr *attr,
> > >> >      opts = nl_attr_get(attr);
> > >> >      mask = mask_attr ? nl_attr_get(mask_attr) : NULL;
> > >> >
> > >> > -    ver = (uint8_t)opts->version;
> > >> > +    ver = opts->bh.ver;
> > >> >      if (mask) {
> > >> > -        ver_ma = (uint8_t)mask->version;
> > >> > +        ver_ma = mask->bh.ver;
> > >> >      }
> > >> >
> > >> >      format_u8u(ds, "ver", ver, mask ? &ver_ma : NULL, verbose);
> > >> >
> > >> > -    if (opts->version == 1) {
> > >> > +    if (opts->bh.ver == 1) {
> > >> >          if (mask) {
> > >> >              ds_put_format(ds, "idx=%#"PRIx32"/%#"PRIx32",",
> > >> > -                          ntohl(opts->u.index),
> > >> > -                          ntohl(mask->u.index));
> > >> > +                          ntohl(get_16aligned_be32(&opts->u.index)),
> > >> > +                          ntohl(get_16aligned_be32(&mask->u.index)));
> > >> >          } else {
> > >> > -            ds_put_format(ds, "idx=%#"PRIx32",", ntohl(opts->u.index));
> > >> > +            ds_put_format(ds, "idx=%#"PRIx32",",
> > >> > +                          ntohl(get_16aligned_be32(&opts->u.index)));
> > >> >          }
> > >> > -    } else if (opts->version == 2) {
> > >> > +    } else if (opts->bh.ver == 2) {
> > >> >          dir = opts->u.md2.dir;
> > >> >          hwid = opts->u.md2.hwid;
> > >> >          if (mask) {
> > >> > @@ -4859,10 +4861,11 @@ scan_erspan_metadata(const char *s,
> > >> >
> > >> >          if (!strncmp(s, ")", 1)) {
> > >> >              s += 1;
> > >> > -            key->version = ver;
> > >> > -            key->u.index = htonl(idx);
> > >> > +            memset(&key->bh, 0, sizeof key->bh);
> > >> > +            key->bh.ver = ver;
> > >> > +            put_16aligned_be32(&key->u.index, htonl(idx));
> > >> >              if (mask) {
> > >> > -                mask->u.index = htonl(idx_mask);
> > >> > +                put_16aligned_be32(&mask->u.index, htonl(idx_mask));
> > >> >              }
> > >> >          }
> > >> >          return s - s_base;
> > >> > @@ -4882,7 +4885,8 @@ scan_erspan_metadata(const char *s,
> > >> >
> > >> >          if (!strncmp(s, ")", 1)) {
> > >> >              s += 1;
> > >> > -            key->version = ver;
> > >> > +            memset(&key->bh, 0, sizeof key->bh);
> > >> > +
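The quoted patch stores the ERSPAN index as a 32-bit big-endian value that is only guaranteed 16-bit alignment. A minimal sketch of that idea follows. The names `aligned16_be32`, `aligned16_be32_put()` and `aligned16_be32_get()` are hypothetical and are not the OVS definitions; unlike the get_16aligned_be32()/put_16aligned_be32() calls in the diff, which are combined with ntohl()/htonl() by the caller, these helpers take and return host-order values directly.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical analogue of a 16-bit-aligned, big-endian 32-bit field.
 * Splitting the value into two 16-bit halves means the struct needs
 * only 2-byte alignment, so it can sit at any even offset in a header. */
struct aligned16_be32 {
    uint16_t hi;    /* Most significant 16 bits, network byte order. */
    uint16_t lo;    /* Least significant 16 bits, network byte order. */
};

/* Stores a host-order value into the field in network byte order. */
static inline void
aligned16_be32_put(struct aligned16_be32 *field, uint32_t host_value)
{
    field->hi = htons((uint16_t) (host_value >> 16));
    field->lo = htons((uint16_t) (host_value & 0xffff));
}

/* Reads the field back as a host-order value. */
static inline uint32_t
aligned16_be32_get(const struct aligned16_be32 *field)
{
    return ((uint32_t) ntohs(field->hi) << 16) | ntohs(field->lo);
}
```

Accessing the two halves separately avoids a 32-bit load or store on an address that may not be 4-byte aligned, which is presumably why the patch replaces the direct `opts.u.index` accesses with accessor calls.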
[ovs-dev] OVS with core emulator
Team,

Has anyone tried to use OVS as an SDN service in the Core network
emulator? Please share your findings!

Regards,
Rakesh Kumar
[ovs-dev] Ovs in userspace
Hello Team,

I am planning to write a few algorithms to modify how the OVS switch
forwards packets, but I don't want to do this in kernel mode; I want the
entire process to be handled in userspace.

I am waiting for your suggestions!

Regards,
Rakesh Kumar
Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in OVS.
Hello everyone,

May I ask you to provide some (even brief) documentation about each
specific drop counter? That would really help in understanding why a
specific drop is happening.

Thanks,
Federico

On 25 May 2018 at 07:44, Rohith Basavaraja wrote:
> Hi Greg,
>
> Thanks a lot for sharing the info.
> Ben clarified that it should be ok if this new stuff is used only in the
> DPDK datapath and it's not used in the kernel datapath.
>
> Thanks
> Rohith
>
> On 24/05/18, 9:05 PM, "Gregory Rose" wrote:
>
>     On 5/23/2018 7:19 PM, Rohith Basavaraja wrote:
>     > Hi Ben,
>     >
>     > Only changes in
>     > datapath/linux/compat/include/linux/openvswitch.h
>     > are related to OvS Kernel module.
>     >
>     > The changes in openvswitch.h are not upstreamed yet.
>     > Can you please let me know the procedure for submitting the
>     > Changes for OvS Kernel Module?
>     >
>     > In the meanwhile can I get code review feedback for the changes, while I
>     > Figure out how to commit changes in
>     > datapath/linux/compat/include/linux/openvswitch.h.
>     >
>     > Thanks
>     > Rohith
>
>     You need to submit changes to openvswitch.h to the Linux network mail
>     list.
>
>     https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
>
>     After acceptance upstream we can then pull in the change.
>
>     And this might help as well.
>
>     https://www.kernel.org/doc/html/v4.16/process/submitting-patches.html
>
>     Thanks,
>
>     - Greg
>
>     > On 24/05/18, 2:03 AM, "Ben Pfaff" wrote:
>     >
>     >     I see that this adds code to the OVS kernel module.  Are these changes
>     >     upstream already in Linux?  Usually we use an "upstream first"
>     >     philosophy regarding the kernel module.