[ovs-dev] [PATCH v2] ovs-lib: pass optional --election-timer arg to ovsdb-tool

2021-06-07 Thread Dan Williams
Signed-off-by: Dan Williams 
---
v2: put --election-timer arg before create-cluster per Ilya

 utilities/ovs-lib.in | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index ab38ece458b7b..61a062fa992da 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -495,15 +495,21 @@ create_cluster () {
 DB_FILE="$1"
 DB_SCHEMA="$2"
 LOCAL_ADDR="$3"
+    ELECTION_TIMER_MS="$4"
+
+    election_timer_arg=
+    if [ -n "$ELECTION_TIMER_MS" ]; then
+  election_timer_arg="--election-timer=$ELECTION_TIMER_MS"
+    fi
 
 if test ! -e "$DB_FILE"; then
-    action "Creating cluster database $DB_FILE" ovsdb_tool create-
cluster "$DB_FILE" "$DB_SCHEMA" "$LOCAL_ADDR"
+    action "Creating cluster database $DB_FILE" ovsdb_tool
"$election_timer_arg" create-cluster "$DB_FILE" "$DB_SCHEMA"
"$LOCAL_ADDR"
 elif ovsdb_tool db-is-standalone "$DB_FILE"; then
 # Convert standalone database to clustered.
 backup_db || return 1
 rm -f "$DB_FILE"
 action "Creating cluster database $DB_FILE from existing one"
\
-   ovsdb_tool create-cluster "$DB_FILE" "$backup"
"$LOCAL_ADDR"
+   ovsdb_tool "$election_timer_arg" create-cluster
"$DB_FILE" "$backup" "$LOCAL_ADDR"
 fi
 }
 
-- 
2.31.1



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 3/3] Honor ACL direction when omitting ct for stateless

2021-06-07 Thread Ihar Hrachyshka
On Wed, Jun 2, 2021 at 12:22 AM Han Zhou  wrote:
>
>
>
> On Tue, Jun 1, 2021 at 12:28 PM Ihar Hrachyshka  wrote:
> >
> > On Thu, May 20, 2021 at 9:55 PM Han Zhou  wrote:
> > >
> > >
> > >
> > > On Thu, May 20, 2021 at 3:22 PM Han Zhou  wrote:
> > > >
> > > >
> > > >
> > > > On Mon, May 17, 2021 at 2:47 PM Ihar Hrachyshka  
> > > > wrote:
> > > > >
> > > > > While we *should not* circumvent conntrack when a stateful ACL of 
> > > > > higher
> > > > > priority is present on the switch, we should do so only when
> > > > > allow-stateless and allow-stateful directions are the same, otherwise 
> > > > > we
> > > > > should still skip conntrack for allow-stateless ACLs.
> > > > >
> > > > > Fixes: 3187b9fef1 ("ovn-northd: introduce new allow-stateless ACL 
> > > > > verb")
> > > >
> > > > Is this patch a "fix" or an "optimization"?
> > > >
> > > > >
> > > > > Signed-off-by: Ihar Hrachyshka 
> > > > > ---
> > > > >  northd/lswitch.dl| 88 ---
> > > > >  northd/ovn-northd.c  | 89 
> > > > > ++--
> > > > >  northd/ovn_northd.dl | 32 
> > > > >  tests/ovn-northd.at  | 72 +++
> > > > >  4 files changed, 213 insertions(+), 68 deletions(-)
> > > > >
> > > > > diff --git a/northd/lswitch.dl b/northd/lswitch.dl
> > > > > index b73cfd047..8a4c9154c 100644
> > > > > --- a/northd/lswitch.dl
> > > > > +++ b/northd/lswitch.dl
> > > > > @@ -135,16 +135,39 @@ LogicalSwitchStatelessACL(ls, acl) :-
> > > > >  LogicalSwitchACL(ls, acl),
> > > > >  nb::ACL(._uuid = acl, .action = "allow-stateless").
> > > > >
> > > > > +relation LogicalSwitchStatelessFromACL(ls: uuid, acl: uuid)
> > > > > +
> > > > > +LogicalSwitchStatelessFromACL(ls, acl) :-
> > > > > +LogicalSwitchStatelessACL(ls, acl),
> > > > > +nb::ACL(._uuid = acl, .direction = "from-lport").
> > > > > +
> > > > > +// "Pitfalls of projections" in ddlog-new-feature.rst explains why 
> > > > > this
> > > > > +// is an output relation:
> > > > > +output relation LogicalSwitchHasStatelessFromACL(ls: uuid, 
> > > > > has_stateless_from_acl: bool)
> > > > > +
> > > > > +LogicalSwitchHasStatelessFromACL(ls, true) :-
> > > > > +LogicalSwitchStatelessFromACL(ls, _).
> > > > > +
> > > > > +LogicalSwitchHasStatelessFromACL(ls, false) :-
> > > > > +nb::Logical_Switch(._uuid = ls),
> > > > > +not LogicalSwitchStatelessFromACL(ls, _).
> > > > > +
> > > > > +relation LogicalSwitchStatelessToACL(ls: uuid, acl: uuid)
> > > > > +
> > > > > +LogicalSwitchStatelessToACL(ls, acl) :-
> > > > > +LogicalSwitchStatelessACL(ls, acl),
> > > > > +nb::ACL(._uuid = acl, .direction = "to-lport").
> > > > > +
> > > > >  // "Pitfalls of projections" in ddlog-new-feature.rst explains why 
> > > > > this
> > > > >  // is an output relation:
> > > > > -output relation LogicalSwitchHasStatelessACL(ls: uuid, 
> > > > > has_stateless_acl: bool)
> > > > > +output relation LogicalSwitchHasStatelessToACL(ls: uuid, 
> > > > > has_stateless_to_acl: bool)
> > > > >
> > > > > -LogicalSwitchHasStatelessACL(ls, true) :-
> > > > > -LogicalSwitchStatelessACL(ls, _).
> > > > > +LogicalSwitchHasStatelessToACL(ls, true) :-
> > > > > +LogicalSwitchStatelessToACL(ls, _).
> > > > >
> > > > > -LogicalSwitchHasStatelessACL(ls, false) :-
> > > > > +LogicalSwitchHasStatelessToACL(ls, false) :-
> > > > >  nb::Logical_Switch(._uuid = ls),
> > > > > -not LogicalSwitchStatelessACL(ls, _).
> > > > > +not LogicalSwitchStatelessToACL(ls, _).
> > > > >
> > > > >  // "Pitfalls of projections" in ddlog-new-feature.rst explains why 
> > > > > this
> > > > >  // is an output relation:
> > > > > @@ -223,18 +246,19 @@ LogicalSwitchHasNonRouterPort(ls, false) :-
> > > > >  /* Switch relation collects all attributes of a logical switch */
> > > > >
> > > > >  relation (
> > > > > -ls:nb::Logical_Switch,
> > > > > -has_stateful_acl:  bool,
> > > > > -has_stateless_acl: bool,
> > > > > -has_acls:  bool,
> > > > > -has_lb_vip:bool,
> > > > > -has_dns_records:   bool,
> > > > > -has_unknown_ports: bool,
> > > > > -localnet_ports:Vec<(uuid, string)>,  // UUID and name of 
> > > > > each localnet port.
> > > > > -subnet:Option<(in_addr/*subnet*/, in_addr/*mask*/, 
> > > > > bit<32>/*start_ipv4*/, bit<32>/*total_ipv4s*/)>,
> > > > > -ipv6_prefix:   Option,
> > > > > -mcast_cfg: Ref,
> > > > > -is_vlan_transparent: bool,
> > > > > +ls: nb::Logical_Switch,
> > > > > +has_stateful_acl:   bool,
> > > > > +has_stateless_from_acl: bool,
> > > > > +has_stateless_to_acl:   bool,
> > > > > +has_acls:   bool,
> > > > > +has_lb_vip: bool,
> > > > > +has_dns_records:bool,
> > > > > +has_unknown_ports:  bool,
> > > > > +localnet_ports: Vec<(uuid, string)>,  // UUID and name 
> > 

[ovs-dev] [PATCH] Remove Python 2 leftovers.

2021-06-07 Thread Rosemarie O'Riorden
Fixes: 1ca0323e7c29 ("Require Python 3 and remove support for Python 2.")
Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949875
Signed-off-by: Rosemarie O'Riorden 
---
 python/ovstest/tests.py  |  4 +---
 python/ovstest/util.py   |  2 +-
 python/setup.py  |  2 --
 utilities/checkpatch.py  |  1 -
 utilities/ovs-l3ping.in  | 22 --
 utilities/ovs-parse-backtrace.in | 12 ++--
 utilities/ovs-pcap.in|  4 +---
 utilities/ovs-vlan-test.in   | 28 ++--
 8 files changed, 35 insertions(+), 40 deletions(-)

diff --git a/python/ovstest/tests.py b/python/ovstest/tests.py
index 6de3cc3af..a237ce16d 100644
--- a/python/ovstest/tests.py
+++ b/python/ovstest/tests.py
@@ -10,12 +10,10 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from __future__ import print_function
-
 import math
 import time
 
-import ovstest.util as util
+import util
 
 DEFAULT_TEST_BRIDGE = "ovstestbr0"
 DEFAULT_TEST_PORT = "ovstestport0"
diff --git a/python/ovstest/util.py b/python/ovstest/util.py
index 72457158f..4caf6c352 100644
--- a/python/ovstest/util.py
+++ b/python/ovstest/util.py
@@ -26,7 +26,7 @@ import socket
 import struct
 import subprocess
 
-import exceptions
+import builtins as exceptions
 
 import xmlrpc.client
 
diff --git a/python/setup.py b/python/setup.py
index d385d8372..a0f4ffded 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -10,8 +10,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from __future__ import print_function
-
 import sys
 
 from distutils.command.build_ext import build_ext
diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index bc6bfae15..ac14da29b 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -13,7 +13,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from __future__ import print_function
 
 import email
 import getopt
diff --git a/utilities/ovs-l3ping.in b/utilities/ovs-l3ping.in
index 92d32acb3..89a57d529 100644
--- a/utilities/ovs-l3ping.in
+++ b/utilities/ovs-l3ping.in
@@ -19,11 +19,13 @@ achieved by tunneling the control connection inside the 
tunnel itself.
 """
 
 import socket
-import xmlrpclib
+import xmlrpc.client
 
-import ovstest.args as args
-import ovstest.tests as tests
-import ovstest.util as util
+import sys
+sys.path.insert(1, '../python/ovstest')
+import args
+import tests
+import util
 
 
 def get_packet_sizes(me, he, remote_ip):
@@ -64,13 +66,13 @@ if __name__ == '__main__':
 ps = get_packet_sizes(me, he, args.client[0])
 tests.do_direct_tests(me, he, bandwidth, interval, ps)
 except KeyboardInterrupt:
-print "Terminating"
-except xmlrpclib.Fault:
-print "Couldn't contact peer"
+print("Terminating")
+except xmlrpc.client.Fault:
+print("Couldn't contact peer")
 except socket.error:
-print "Couldn't contact peer"
-except xmlrpclib.ProtocolError:
-print "XMLRPC control channel was abruptly terminated"
+print("Couldn't contact peer")
+except xmlrpc.client.ProtocolError:
+print("XMLRPC control channel was abruptly terminated")
 finally:
 if local_server is not None:
 local_server.terminate()
diff --git a/utilities/ovs-parse-backtrace.in b/utilities/ovs-parse-backtrace.in
index d5506769a..f44f05cd1 100755
--- a/utilities/ovs-parse-backtrace.in
+++ b/utilities/ovs-parse-backtrace.in
@@ -70,7 +70,7 @@ result.  Expected usage is for ovs-appctl backtrace to be 
piped in.""")
 if os.path.exists(debug):
 binary = debug
 
-print "Binary: %s\n" % binary
+print("Binary: %s\n" % binary)
 
 stdin = sys.stdin.read()
 
@@ -88,15 +88,15 @@ result.  Expected usage is for ovs-appctl backtrace to be 
piped in.""")
 for lines, count in traces:
 longest = max(len(l) for l in lines)
 
-print "Backtrace Count: %d" % count
+print("Backtrace Count: %d" % count)
 for line in lines:
 match = re.search(r'\[(0x.*)]', line)
 if match:
-print "%s %s" % (line.ljust(longest),
- addr2line(binary, match.group(1)))
+print("%s %s" % (line.ljust(longest),
+ addr2line(binary, match.group(1
 else:
-print line
-print
+print(line)
+print()
 
 
 if __name__ == "__main__":
diff --git a/utilities/ovs-pcap.in b/utilities/ovs-pcap.in
index dddbee4df..6b5f63399 100755
--- a/utilities/ovs-pcap.in
+++ b/utilities/ovs-pcap.in
@@ -14,8 +14,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the 

Re: [ovs-dev] [PATCH v3 0/9] Stream Record/Replay.

2021-06-07 Thread Ilya Maximets
On 6/4/21 4:40 PM, Dumitru Ceara wrote:
> On 5/27/21 3:28 PM, Ilya Maximets wrote:
>> This patch set adds new stream provider and other functionality in
>> order to record all the incoming data on all the steams (ssl, tcp,
>> unixctl) of openvswitch library based applications and replay these
>> streams later for debugging purposes or performance tests.
>>
>> For example, these changes allowed me to record the lifecycle of
>> a standalone ovsdb-server in a fake-multinode cluster from ovn
>> scale testing setup with 100 nodes.  While having only replay files,
>> user could reproduce transactions and connections in a big cluster on
>> a local PC wile being able to tweak various log levels, run everything
>> under debugger or tracer, measure performance difference with perf.
>>
>> It also proven to be useful for debugging issues in ovn components.
>> IIRC, following issue was not easily reproducible, but it was recorded
>> and debugged by using v1 of this patch series integrated into
>> ovn-controller:
>>
>>   91a6a4580267 ("ovsdb-idl: Fix use-after-free when deleting orphaned rows.")
>>
>> Current implementation doesn't work with clustered databases since
>> raft heavily depends on time events.  This is a point of further
>> improvement.
>>
>> More details and usage examples in the documentation in the last
>> patch of this set.
>>
> 
> Hi Ilya,
> 
> Thanks for the really nice feature!  Looking forward to the OVN bits. :)
> 
> I acked patch 2/9 (the others I had already acked in v2).

Thanks!  I folded in your suggestions for patch #2 and applied the series.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovs-lib: pass optional --election-timer arg to ovsdb-tool

2021-06-07 Thread Dan Williams
On Mon, 2021-06-07 at 20:56 +0200, Ilya Maximets wrote:
> On 6/7/21 6:04 PM, Dan Williams wrote:
> > 
> > Signed-off-by: Dan Williams 
> > ---
> >  utilities/ovs-lib.in | 10 --
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> > index ab38ece458b7b..d5aef8c85b3e0 100644
> > --- a/utilities/ovs-lib.in
> > +++ b/utilities/ovs-lib.in
> > @@ -495,15 +495,21 @@ create_cluster () {
> >  DB_FILE="$1"
> >  DB_SCHEMA="$2"
> >  LOCAL_ADDR="$3"
> > +    ELECTION_TIMER_MS="$4"
> > +
> > +    election_timer_arg=
> > +    if [ -n "$ELECTION_TIMER_MS" ]; then
> > +  election_timer_arg="--election-timer=$ELECTION_TIMER_MS"
> > +    fi
> >  
> >  if test ! -e "$DB_FILE"; then
> > -    action "Creating cluster database $DB_FILE" ovsdb_tool
> > create-cluster "$DB_FILE" "$DB_SCHEMA" "$LOCAL_ADDR"
> > +    action "Creating cluster database $DB_FILE" ovsdb_tool
> > create-cluster "$election_timer_arg" "$DB_FILE" "$DB_SCHEMA"
> > "$LOCAL_ADDR"
> >  elif ovsdb_tool db-is-standalone "$DB_FILE"; then
> >  # Convert standalone database to clustered.
> >  backup_db || return 1
> >  rm -f "$DB_FILE"
> >  action "Creating cluster database $DB_FILE from existing
> > one" \
> > -   ovsdb_tool create-cluster "$DB_FILE" "$backup"
> > "$LOCAL_ADDR"
> > +   ovsdb_tool create-cluster "$election_timer_arg"
> > "$DB_FILE" "$backup" "$LOCAL_ADDR"
> 
> Shouldn't the option go before the command name?

ovsdb-tool doesn't seem to care, but I'll make the change in v2.

Dan

> 
> >  fi
> >  }
> >  
> > 
> 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovs-lib: pass optional --election-timer arg to ovsdb-tool

2021-06-07 Thread Ilya Maximets
On 6/7/21 6:04 PM, Dan Williams wrote:
> 
> Signed-off-by: Dan Williams 
> ---
>  utilities/ovs-lib.in | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> index ab38ece458b7b..d5aef8c85b3e0 100644
> --- a/utilities/ovs-lib.in
> +++ b/utilities/ovs-lib.in
> @@ -495,15 +495,21 @@ create_cluster () {
>  DB_FILE="$1"
>  DB_SCHEMA="$2"
>  LOCAL_ADDR="$3"
> +ELECTION_TIMER_MS="$4"
> +
> +election_timer_arg=
> +if [ -n "$ELECTION_TIMER_MS" ]; then
> +  election_timer_arg="--election-timer=$ELECTION_TIMER_MS"
> +fi
>  
>  if test ! -e "$DB_FILE"; then
> -action "Creating cluster database $DB_FILE" ovsdb_tool 
> create-cluster "$DB_FILE" "$DB_SCHEMA" "$LOCAL_ADDR"
> +action "Creating cluster database $DB_FILE" ovsdb_tool 
> create-cluster "$election_timer_arg" "$DB_FILE" "$DB_SCHEMA" "$LOCAL_ADDR"
>  elif ovsdb_tool db-is-standalone "$DB_FILE"; then
>  # Convert standalone database to clustered.
>  backup_db || return 1
>  rm -f "$DB_FILE"
>  action "Creating cluster database $DB_FILE from existing one" \
> -   ovsdb_tool create-cluster "$DB_FILE" "$backup" "$LOCAL_ADDR"
> +   ovsdb_tool create-cluster "$election_timer_arg" "$DB_FILE" 
> "$backup" "$LOCAL_ADDR"

Shouldn't the option go before the command name?

>  fi
>  }
>  
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4] netlink: removed incorrect optimization

2021-06-07 Thread Toms Atteka
This optimization caused FLOW_TNL_F_UDPIF flag not to be used in
hash calculation for geneve tunnel when revalidating flows which
resulted in different cache hash values and incorrect behaviour.

Added test to prevent regression.

CC: Jesse Gross 
Fixes: 6728d578f64e ("dpif-netdev: Translate Geneve options per-flow, not 
per-packet.")
Reported-at: https://github.com/vmware-tanzu/antrea/issues/897
Signed-off-by: Toms Atteka 
---
 lib/tun-metadata.c  |  2 +-
 tests/system-traffic.at | 54 +
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c
index c0b0ae044..af0bcbde8 100644
--- a/lib/tun-metadata.c
+++ b/lib/tun-metadata.c
@@ -828,7 +828,7 @@ tun_metadata_to_geneve_nlattr(const struct flow_tnl *tun,
 } else {
 tun_metadata_to_geneve_nlattr_mask(key, tun, flow, b);
 }
-} else if (flow->metadata.present.len || is_mask) {
+} else {
 nl_msg_put_unspec(b, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS,
   tun->metadata.opts.gnv,
   flow->metadata.present.len);
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index fb5b9a36d..483cc7e83 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -574,6 +574,60 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 
10.1.1.100 | FORMAT_PI
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([datapath - ping over geneve tunnel, delete flow regression])
+OVS_CHECK_GENEVE()
+
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-underlay])
+
+AT_DATA([flows.txt], [dnl
+priority=100,icmp actions=resubmit(,10)
+priority=0 actions=NORMAL
+table=10, priority=100, ip, actions=ct(table=20,zone=65520)
+table=20, priority=200, ip, ct_state=-new+trk, actions=resubmit(,30)
+table=20, priority=100, ip, ct_state=+new, actions=resubmit(,30)
+table=20, priority=50, ip, actions=DROP
+table=30, priority=100, ip, actions=ct(commit,table=40,zone=65520)
+table=40, actions=normal
+])
+
+AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
+AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
+
+ADD_NAMESPACES(at_ns0)
+
+dnl Set up underlay link from host into the namespace using veth pair.
+ADD_VETH(p0, at_ns0, br-underlay, "172.31.1.1/24")
+AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"])
+AT_CHECK([ip link set dev br-underlay up])
+
+dnl Set up tunnel endpoints on OVS outside the namespace and with a native
+dnl linux device inside the namespace.
+ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.31.1.1], [10.1.1.100/24])
+ADD_NATIVE_TUNNEL([geneve], [ns_gnv0], [at_ns0], [172.31.1.100], [10.1.1.1/24],
+  [vni 0])
+
+dnl First, check the underlay
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | FORMAT_PING], 
[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+dnl ping over tunnel should work
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PING], 
[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+AT_CHECK([ovs-ofctl del-flows br0 "ct_state=+new"])
+
+dnl ping should not go through after removal of the flow
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PING], 
[0], [dnl
+7 packets transmitted, 0 received, 100% packet loss, time 0ms
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP(["/|ERR|/d
+/|WARN|/d"])
+AT_CLEANUP
+
 AT_SETUP([datapath - flow resume with geneve tun_metadata])
 OVS_CHECK_GENEVE()
 
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] Fix redundant datapath set ethernet action with NSH Decap

2021-06-07 Thread Jan Scheurich via dev
> -Original Message-
> From: Martin Varghese 
> Sent: Monday, 7 June, 2021 16:47
> To: Ilya Maximets 
> Cc: d...@openvswitch.org; echau...@redhat.com; Jan Scheurich
> ; Martin Varghese
> 
> Subject: Re: [ovs-dev] [PATCH v2] Fix redundant datapath set ethernet action
> with NSH Decap
> 
> On Wed, May 19, 2021 at 12:26:40PM +0200, Ilya Maximets wrote:
> > On 5/19/21 5:26 AM, Martin Varghese wrote:
> > > On Tue, May 18, 2021 at 10:03:39PM +0200, Ilya Maximets wrote:
> > >> On 5/17/21 3:45 PM, Martin Varghese wrote:
> > >>> From: Martin Varghese 
> > >>>
> > >>> When a decap action is applied on NSH header encapsulatiing a
> > >>> ethernet packet a redundant set mac address action is programmed
> > >>> to the datapath.
> > >>>
> > >>> Fixes: f839892a206a ("OF support and translation of generic encap
> > >>> and decap")
> > >>> Signed-off-by: Martin Varghese 
> > >>> Acked-by: Jan Scheurich 
> > >>> Acked-by: Eelco Chaudron 
> > >>> ---
> > >>> Changes in v2:
> > >>>   - Fixed code styling
> > >>>   - Added Ack from jan.scheur...@ericsson.com
> > >>>   - Added Ack from echau...@redhat.com
> > >>>
> > >>
> > >> Hi, Martin.
> > >> For some reason this patch triggers frequent failures of the
> > >> following unit test:
> > >>
> > >> 2314. packet-type-aware.at:619: testing ptap - L3 over patch port
> > >> ...
> 
> The test is failing as, during revalidation, NORMAL action is dropping 
> packets.
> With these changes, the mac address in flow structures get cleared with decap
> action. Hence the NORMAL action drops the packet assuming a loop (SRC and
> DST mac address are zero). I assume NORMAL action handling in
> xlate_push_stats_entry is not adapted for l3 packet. The timing at which
> revalidator gets triggered explains the sporadicity of the issue. The issue is
> never seen as the MAC addresses in flow structure were not cleared with decap
> before.
> 
> So can we use NORMAL action with a L3 packet ?  Does OVS handle all the L3
> use cases with Normal action ? If not, shouldn't we not use NORMAL action in
> this test case
> 
> Comments?
> 

Good catch! Normal flow L2 bridging is of course nonsense for the use case of 
forwarding an L3 packet. I am surprised that the packet was forwarded at all in 
the first place. That in itself can be considered a bug. Correctly, a Normal 
flow should drop non-Ethernet packets, I would say.

To fix the test case I suggest to replace the Normal action in br1 with 
"output:gre1" in line 700.

> 
> > >> stdout:
> > >> warped
> > >> ./packet-type-aware.at:726:
> > >> ovs-appctl dpctl/dump-flows --names dummy@ovs-dummy |
> > >> strip_used | grep -v ipv6 | sort
> > >>
> > >> --- -   2021-05-18 21:57:56.810513366 +0200
> > >> +++ /home/i.maximets/work/git/ovs/tests/testsuite.dir/at-
> groups/2314/stdout 2021-05-18 21:57:56.806609814 +0200
> > >> @@ -1,3 +1,3 @@
> > >>  flow-dump from the main thread:
> > >> -recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(src=3a:6d:d2:0
> > >> 9:9c:ab,dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,frag
> > >> =no), packets:1, bytes:98, used:0.0s,
> > >> actions:pop_eth,clone(tnl_push(tnl_port(gre_sys),header(size=38,typ
> > >> e=3,eth(dst=de:af:be:ef:ba:be,src=aa:55:00:00:00:02,dl_type=0x0800)
> > >> ,ipv4(src=10.0.0.1,dst=10.0.0.2,proto=47,tos=0,ttl=64,frag=0x4000),
> > >> gre((flags=0x0,proto=0x800))),out_port(br2)),n2)
> > >> +recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(src=3a:6d:d2:0
> > >> +9:9c:ab,dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,fra
> > >> +g=no), packets:1, bytes:98, used:0.0s, actions:drop
> > >>
> > >>
> > >> It fails very frequently in GitHub Actions, but it's harder to make
> > >> it fail on my local machine.  Following change to the test allows
> > >> to reproduce the failure almost always on my local machine:
> > >>
> > >> diff --git a/tests/packet-type-aware.at
> > >> b/tests/packet-type-aware.at index 540cf98f3..01dbc8030 100644
> > >> --- a/tests/packet-type-aware.at
> > >> +++ b/tests/packet-type-aware.at
> > >> @@ -721,7 +721,7 @@ AT_CHECK([
> > >>  ovs-appctl netdev-dummy/receive n0
> > >>
> 1e2ce92a669e3a6dd2099cab080045548a83400040011aadc0a80a0ac0a80
> a1
> > >>
> e0800b7170a4d0002fd509a58de1c02001011121314151617
> 18
> > >>
> 191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637
> > >>  ], [0], [ignore])
> > >>
> > >> -ovs-appctl time/warp 1000
> > >> +ovs-appctl time/warp 1000 100
> > >>
> > >>  AT_CHECK([
> > >>  ovs-appctl dpctl/dump-flows --names dummy@ovs-dummy |
> > >> strip_used | grep -v ipv6 | sort
> > >> --
> > >>
> > >> Without your patch I can not make it fail locally even with above
> > >> wrapping change applied.
> > >>
> > >> Could you, please, take a look?
> > >>
> > >
> > > Hi Ilya
> > >
> > > Travis CI was good. i will rebase & try again
> > > https://protect2.fireeye.com/v1/url?k=8ec813e4-d1532ae4-8ec8537f-86f
> > > c6812c361-3273f254661f9c75=1=c83c3bb8-d3c9-439a-8031-
> 72634d0fecc
> > > 2=https%3A%2F%2Ftravis-
> 

[ovs-dev] [PATCH] ovs-lib: pass optional --election-timer arg to ovsdb-tool

2021-06-07 Thread Dan Williams


Signed-off-by: Dan Williams 
---
 utilities/ovs-lib.in | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index ab38ece458b7b..d5aef8c85b3e0 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -495,15 +495,21 @@ create_cluster () {
 DB_FILE="$1"
 DB_SCHEMA="$2"
 LOCAL_ADDR="$3"
+ELECTION_TIMER_MS="$4"
+
+election_timer_arg=
+if [ -n "$ELECTION_TIMER_MS" ]; then
+  election_timer_arg="--election-timer=$ELECTION_TIMER_MS"
+fi
 
 if test ! -e "$DB_FILE"; then
-action "Creating cluster database $DB_FILE" ovsdb_tool create-cluster 
"$DB_FILE" "$DB_SCHEMA" "$LOCAL_ADDR"
+action "Creating cluster database $DB_FILE" ovsdb_tool create-cluster 
"$election_timer_arg" "$DB_FILE" "$DB_SCHEMA" "$LOCAL_ADDR"
 elif ovsdb_tool db-is-standalone "$DB_FILE"; then
 # Convert standalone database to clustered.
 backup_db || return 1
 rm -f "$DB_FILE"
 action "Creating cluster database $DB_FILE from existing one" \
-   ovsdb_tool create-cluster "$DB_FILE" "$backup" "$LOCAL_ADDR"
+   ovsdb_tool create-cluster "$election_timer_arg" "$DB_FILE" 
"$backup" "$LOCAL_ADDR"
 fi
 }
 
-- 
2.31.1


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovs-discuss] Patch for PACKET_OUT getting deleted twice crash

2021-06-07 Thread Ilya Maximets
On 6/7/21 3:59 PM, Ilya Maximets wrote:
> On 6/7/21 3:09 PM, Aaron Conole wrote:
>> Ilya Maximets  writes:
>>
 Here is a patch with both a test and a fix.
>>
>> Thanks so much!  It's nice to get fixes, but I think it's really great
>> when test cases come along with them.
>>
>>> Hi.  Thanks for working n this!
>>>
>>> CC: ovs-dev
>>>
 Not submitting as a formal
 patch because I would like some feedback on whether 1) maintainers feel
 this is worth fixing and
>>>
>>> I can reproduce the crash with your test.  Basically, actions in userspace
>>> datapath may drop packets if something goes wrong.  'meter' action just
>>> seems to be the most explicit variant.  So, I think, this is definitely
>>> worth fixing as some other condition might trigger this crash on packet-out
>>> as well.
>>>
>>> ==2568112==ERROR: AddressSanitizer: heap-use-after-free
>>> on address 0x6160699c at pc 0x00573860 bp 0x7ffebc6cc880 sp 
>>> 0x7ffebc6cc878
>>> READ of size 1 at 0x6160699c thread T0
>>> #0 0x57385f in dp_packet_delete lib/dp-packet.h:242:16
>>> #1 0x57372c in ofproto_packet_out_uninit ofproto/ofproto.c:3562:5
>>> #2 0x585e77 in handle_packet_out ofproto/ofproto.c:3722:5
>>> #3 0x583801 in handle_single_part_openflow ofproto/ofproto.c:8499:16
>>> #4 0x570c9c in handle_openflow ofproto/ofproto.c:8686:21
>>> #5 0x611781 in ofconn_run ofproto/connmgr.c:1329:13
>>> #6 0x6112ed in connmgr_run ofproto/connmgr.c:356:9
>>> #7 0x56fdf4 in ofproto_run ofproto/ofproto.c:1891:5
>>> #8 0x545ec0 in bridge_run__ vswitchd/bridge.c:3251:9
>>> #9 0x5456a5 in bridge_run vswitchd/bridge.c:3310:5
>>> #10 0x55f5b1 in main vswitchd/ovs-vswitchd.c:127:9
>>> #11 0x7f85bfe09081 in __libc_start_main (/lib64/libc.so.6+0x27081)
>>> #12 0x46d00d in _start (vswitchd/ovs-vswitchd+0x46d00d)
>>>
 2) whether this is the way to fix it.
>>>
>>> See inline.
>>>

 I have tried to make the most minimal change possible, but this means that
 there might be paths through the code that give unexpected behaviour (which
 in the worst case would be a memory leak I suppose).

 Tony

 diff --git a/lib/dp-packet.h b/lib/dp-packet.h
 index 246be14d0..5e0dabe67 100644
 --- a/lib/dp-packet.h
 +++ b/lib/dp-packet.h
 @@ -739,6 +739,7 @@ struct dp_packet_batch {
 size_t count;
 bool trunc; /* true if the batch needs truncate. */
 bool do_not_steal; /* Indicate that the packets should not be stolen.
 */
 +bool packet_out; /* Indicate single packet is PACKET_OUT */
 struct dp_packet *packets[NETDEV_MAX_BURST];
 };

 @@ -748,6 +749,7 @@ dp_packet_batch_init(struct dp_packet_batch *batch)
 batch->count = 0;
 batch->trunc = false;
 batch->do_not_steal = false;
 +batch->packet_out = false;
 }

 static inline void
 diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
 index 650e67ab3..deba4a94a 100644
 --- a/lib/dpif-netdev.c
 +++ b/lib/dpif-netdev.c
 @@ -4170,6 +4170,7 @@ dpif_netdev_execute(struct dpif *dpif, struct
 dpif_execute *execute)

 dp_packet_batch_init_packet(, execute->packet);
 pp.do_not_steal = true;
 +pp.packet_out = execute->packet_out;
 dp_netdev_execute_actions(pmd, , false, execute->flow,
   execute->actions, execute->actions_len);
>>>
>>> There is already a dirty hack named "do_not_steal" that was introduced,
>>> I guess, exactly to avoid crash in the conntrack code that could drop/steal
>>> the packet just like meter action.  And it seems that here in
>>> dpif_netdev_execute() is the only problematic entry point as all other
>>> normal paths expects that packet might be destroyed.
>>>
>>> The problem was, I suppose, introduced when we tried to unify semantics
>>> of "may_steal" flag by turning it into "should_steal".  But it seems that
>>> in this function we really need to prohibit stealing of the packet since
>>> ofproto layer still owns it regardless of the result of execution.
>>>
>>> I don't think that we need one more flag here, but we have several options
>>> how to fix the crash:
>>>
>>> 1. Start honoring batch->do_not_steal flag in all actions that may result
>>>in packet drops.  As the original idea of having 'do_not_steal' flag for
>>>a batch is very hacky, I'd like to not do that.
>>
>> +1
>>
>>> 2. Try to propagate information that packet was deleted up to ofproto layer,
>>>i.e. make handle_packet_out() aware of that.  Will, probably, be not that
>>>easy to do.
>>
>> I had a look at doing this, but as you note it is quite intrusive, and
>> we need to make changes all over.
>>
>>> 3. This function (dpif_netdev_execute) is not on a hot path in userspace
>>>datapath, IIUC.  It might be that it's just easier to remove the
>>>'do_not_steal' flag entirely, clone the packet here and call the
>>>

Re: [ovs-dev] [PATCH v2] Fix redundant datapath set ethernet action with NSH Decap

2021-06-07 Thread Martin Varghese
On Wed, May 19, 2021 at 12:26:40PM +0200, Ilya Maximets wrote:
> On 5/19/21 5:26 AM, Martin Varghese wrote:
> > On Tue, May 18, 2021 at 10:03:39PM +0200, Ilya Maximets wrote:
> >> On 5/17/21 3:45 PM, Martin Varghese wrote:
> >>> From: Martin Varghese 
> >>>
> >>> When a decap action is applied on NSH header encapsulatiing a
> >>> ethernet packet a redundant set mac address action is programmed
> >>> to the datapath.
> >>>
> >>> Fixes: f839892a206a ("OF support and translation of generic encap and 
> >>> decap")
> >>> Signed-off-by: Martin Varghese 
> >>> Acked-by: Jan Scheurich 
> >>> Acked-by: Eelco Chaudron 
> >>> ---
> >>> Changes in v2:
> >>>   - Fixed code styling
> >>>   - Added Ack from jan.scheur...@ericsson.com
> >>>   - Added Ack from echau...@redhat.com
> >>>
> >>
> >> Hi, Martin.
> >> For some reason this patch triggers frequent failures of the following
> >> unit test:
> >>   
> >> 2314. packet-type-aware.at:619: testing ptap - L3 over patch port
> >> ...

The test is failing as, during revalidation, NORMAL action is dropping packets.
With these changes, the mac address in flow structures get cleared with decap
action. Hence the NORMAL action drops the packet assuming a loop (SRC and DST 
mac address are zero). I assume NORMAL action handling in 
xlate_push_stats_entry is not adapted for l3 packet. The timing at which 
revalidator gets triggered explains the sporadicity of the issue. The issue is 
never seen as the MAC addresses in flow structure were not cleared with decap 
before.

So can we use NORMAL action with a L3 packet ?  Does OVS handle all the L3
use cases with Normal action ? If not, shouldn't we not use NORMAL action in 
this test case
 
Comments? 


> >> stdout:
> >> warped
> >> ./packet-type-aware.at:726:
> >> ovs-appctl dpctl/dump-flows --names dummy@ovs-dummy | strip_used | 
> >> grep -v ipv6 | sort
> >>
> >> --- -   2021-05-18 21:57:56.810513366 +0200
> >> +++ 
> >> /home/i.maximets/work/git/ovs/tests/testsuite.dir/at-groups/2314/stdout
> >>  2021-05-18 21:57:56.806609814 +0200
> >> @@ -1,3 +1,3 @@
> >>  flow-dump from the main thread:
> >> -recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,frag=no),
> >>  packets:1, bytes:98, used:0.0s, 
> >> actions:pop_eth,clone(tnl_push(tnl_port(gre_sys),header(size=38,type=3,eth(dst=de:af:be:ef:ba:be,src=aa:55:00:00:00:02,dl_type=0x0800),ipv4(src=10.0.0.1,dst=10.0.0.2,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x0,proto=0x800))),out_port(br2)),n2)
> >> +recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,frag=no),
> >>  packets:1, bytes:98, used:0.0s, actions:drop
> >>
> >>
> >> It fails very frequently in GitHub Actions, but it's harder to make it fail
> >> on my local machine.  Following change to the test allows to reproduce the
> >> failure almost always on my local machine:
> >>
> >> diff --git a/tests/packet-type-aware.at b/tests/packet-type-aware.at
> >> index 540cf98f3..01dbc8030 100644
> >> --- a/tests/packet-type-aware.at
> >> +++ b/tests/packet-type-aware.at
> >> @@ -721,7 +721,7 @@ AT_CHECK([
> >>  ovs-appctl netdev-dummy/receive n0 
> >> 1e2ce92a669e3a6dd2099cab080045548a83400040011aadc0a80a0ac0a80a1e0800b7170a4d0002fd509a58de1c0200101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637
> >>  ], [0], [ignore])
> >>  
> >> -ovs-appctl time/warp 1000
> >> +ovs-appctl time/warp 1000 100
> >>  
> >>  AT_CHECK([
> >>  ovs-appctl dpctl/dump-flows --names dummy@ovs-dummy | strip_used | 
> >> grep -v ipv6 | sort
> >> --
> >>
> >> Without your patch I can not make it fail locally even with above wrapping
> >> change applied.
> >>
> >> Could you, please, take a look?
> >>
> > 
> > Hi Ilya
> > 
> > Travis CI was good. i will rebase & try again
> > https://travis-ci.org/github/martinpattara/ovs/builds/770919003
> 
> Travis has only one job with tests enabled and it tests on arm.
> GitHub Actions (which is our main CI now) wasn't good:
>   https://github.com/martinpattara/ovs/runs/2567454405
> 
> Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovs-discuss] Patch for PACKET_OUT getting deleted twice crash

2021-06-07 Thread Ilya Maximets
On 6/7/21 3:09 PM, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>>> Here is a patch with both a test and a fix.
> 
> Thanks so much!  It's nice to get fixes, but I think it's really great
> when test cases come along with them.
> 
>> Hi.  Thanks for working n this!
>>
>> CC: ovs-dev
>>
>>> Not submitting as a formal
>>> patch because I would like some feedback on whether 1) maintainers feel
>>> this is worth fixing and
>>
>> I can reproduce the crash with your test.  Basically, actions in userspace
>> datapath may drop packets if something goes wrong.  'meter' action just
>> seems to be the most explicit variant.  So, I think, this is definitely
>> worth fixing as some other condition might trigger this crash on packet-out
>> as well.
>>
>> ==2568112==ERROR: AddressSanitizer: heap-use-after-free
>> on address 0x6160699c at pc 0x00573860 bp 0x7ffebc6cc880 sp 
>> 0x7ffebc6cc878
>> READ of size 1 at 0x6160699c thread T0
>> #0 0x57385f in dp_packet_delete lib/dp-packet.h:242:16
>> #1 0x57372c in ofproto_packet_out_uninit ofproto/ofproto.c:3562:5
>> #2 0x585e77 in handle_packet_out ofproto/ofproto.c:3722:5
>> #3 0x583801 in handle_single_part_openflow ofproto/ofproto.c:8499:16
>> #4 0x570c9c in handle_openflow ofproto/ofproto.c:8686:21
>> #5 0x611781 in ofconn_run ofproto/connmgr.c:1329:13
>> #6 0x6112ed in connmgr_run ofproto/connmgr.c:356:9
>> #7 0x56fdf4 in ofproto_run ofproto/ofproto.c:1891:5
>> #8 0x545ec0 in bridge_run__ vswitchd/bridge.c:3251:9
>> #9 0x5456a5 in bridge_run vswitchd/bridge.c:3310:5
>> #10 0x55f5b1 in main vswitchd/ovs-vswitchd.c:127:9
>> #11 0x7f85bfe09081 in __libc_start_main (/lib64/libc.so.6+0x27081)
>> #12 0x46d00d in _start (vswitchd/ovs-vswitchd+0x46d00d)
>>
>>> 2) whether this is the way to fix it.
>>
>> See inline.
>>
>>>
>>> I have tried to make the most minimal change possible, but this means that
>>> there might be paths through the code that give unexpected behaviour (which
>>> in the worst case would be a memory leak I suppose).
>>>
>>> Tony
>>>
>>> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
>>> index 246be14d0..5e0dabe67 100644
>>> --- a/lib/dp-packet.h
>>> +++ b/lib/dp-packet.h
>>> @@ -739,6 +739,7 @@ struct dp_packet_batch {
>>> size_t count;
>>> bool trunc; /* true if the batch needs truncate. */
>>> bool do_not_steal; /* Indicate that the packets should not be stolen.
>>> */
>>> +bool packet_out; /* Indicate single packet is PACKET_OUT */
>>> struct dp_packet *packets[NETDEV_MAX_BURST];
>>> };
>>>
>>> @@ -748,6 +749,7 @@ dp_packet_batch_init(struct dp_packet_batch *batch)
>>> batch->count = 0;
>>> batch->trunc = false;
>>> batch->do_not_steal = false;
>>> +batch->packet_out = false;
>>> }
>>>
>>> static inline void
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 650e67ab3..deba4a94a 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -4170,6 +4170,7 @@ dpif_netdev_execute(struct dpif *dpif, struct
>>> dpif_execute *execute)
>>>
>>> dp_packet_batch_init_packet(, execute->packet);
>>> pp.do_not_steal = true;
>>> +pp.packet_out = execute->packet_out;
>>> dp_netdev_execute_actions(pmd, , false, execute->flow,
>>>   execute->actions, execute->actions_len);
>>
>> There is already a dirty hack named "do_not_steal" that was introduced,
>> I guess, exactly to avoid crash in the conntrack code that could drop/steal
>> the packet just like meter action.  And it seems that here in
>> dpif_netdev_execute() is the only problematic entry point as all other
>> normal paths expects that packet might be destroyed.
>>
>> The problem was, I suppose, introduced when we tried to unify semantics
>> of "may_steal" flag by turning it into "should_steal".  But it seems that
>> in this function we really need to prohibit stealing of the packet since
>> ofproto layer still owns it regardless of the result of execution.
>>
>> I don't think that we need one more flag here, but we have several options
>> how to fix the crash:
>>
>> 1. Start honoring batch->do_not_steal flag in all actions that may result
>>in packet drops.  As the original idea of having 'do_not_steal' flag for
>>a batch is very hacky, I'd like to not do that.
> 
> +1
> 
>> 2. Try to propagate information that packet was deleted up to ofproto layer,
>>i.e. make handle_packet_out() aware of that.  Will, probably, be not that
>>easy to do.
> 
> I had a look at doing this, but as you note it is quite intrusive, and
> we need to make changes all over.
> 
>> 3. This function (dpif_netdev_execute) is not on a hot path in userspace
>>datapath, IIUC.  It might be that it's just easier to remove the
>>'do_not_steal' flag entirely, clone the packet here and call the
>>dp_netdev_execute_actions() with should_steal=true.
>>This sounds like the best solution for me, unless I overlooked some
>>scenario, where this code is on a 

Re: [ovs-dev] [ovs-discuss] Patch for PACKET_OUT getting deleted twice crash

2021-06-07 Thread Aaron Conole
Ilya Maximets  writes:

>> Here is a patch with both a test and a fix.

Thanks so much!  It's nice to get fixes, but I think it's really great
when test cases come along with them.

> Hi.  Thanks for working n this!
>
> CC: ovs-dev
>
>> Not submitting as a formal
>> patch because I would like some feedback on whether 1) maintainers feel
>> this is worth fixing and
>
> I can reproduce the crash with your test.  Basically, actions in userspace
> datapath may drop packets if something goes wrong.  'meter' action just
> seems to be the most explicit variant.  So, I think, this is definitely
> worth fixing as some other condition might trigger this crash on packet-out
> as well.
>
> ==2568112==ERROR: AddressSanitizer: heap-use-after-free
> on address 0x6160699c at pc 0x00573860 bp 0x7ffebc6cc880 sp 
> 0x7ffebc6cc878
> READ of size 1 at 0x6160699c thread T0
> #0 0x57385f in dp_packet_delete lib/dp-packet.h:242:16
> #1 0x57372c in ofproto_packet_out_uninit ofproto/ofproto.c:3562:5
> #2 0x585e77 in handle_packet_out ofproto/ofproto.c:3722:5
> #3 0x583801 in handle_single_part_openflow ofproto/ofproto.c:8499:16
> #4 0x570c9c in handle_openflow ofproto/ofproto.c:8686:21
> #5 0x611781 in ofconn_run ofproto/connmgr.c:1329:13
> #6 0x6112ed in connmgr_run ofproto/connmgr.c:356:9
> #7 0x56fdf4 in ofproto_run ofproto/ofproto.c:1891:5
> #8 0x545ec0 in bridge_run__ vswitchd/bridge.c:3251:9
> #9 0x5456a5 in bridge_run vswitchd/bridge.c:3310:5
> #10 0x55f5b1 in main vswitchd/ovs-vswitchd.c:127:9
> #11 0x7f85bfe09081 in __libc_start_main (/lib64/libc.so.6+0x27081)
> #12 0x46d00d in _start (vswitchd/ovs-vswitchd+0x46d00d)
>
>> 2) whether this is the way to fix it.
>
> See inline.
>
>> 
>> I have tried to make the most minimal change possible, but this means that
>> there might be paths through the code that give unexpected behaviour (which
>> in the worst case would be a memory leak I suppose).
>> 
>> Tony
>> 
>> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
>> index 246be14d0..5e0dabe67 100644
>> --- a/lib/dp-packet.h
>> +++ b/lib/dp-packet.h
>> @@ -739,6 +739,7 @@ struct dp_packet_batch {
>> size_t count;
>> bool trunc; /* true if the batch needs truncate. */
>> bool do_not_steal; /* Indicate that the packets should not be stolen.
>> */
>> +bool packet_out; /* Indicate single packet is PACKET_OUT */
>> struct dp_packet *packets[NETDEV_MAX_BURST];
>> };
>> 
>> @@ -748,6 +749,7 @@ dp_packet_batch_init(struct dp_packet_batch *batch)
>> batch->count = 0;
>> batch->trunc = false;
>> batch->do_not_steal = false;
>> +batch->packet_out = false;
>> }
>> 
>> static inline void
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 650e67ab3..deba4a94a 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -4170,6 +4170,7 @@ dpif_netdev_execute(struct dpif *dpif, struct
>> dpif_execute *execute)
>> 
>> dp_packet_batch_init_packet(, execute->packet);
>> pp.do_not_steal = true;
>> +pp.packet_out = execute->packet_out;
>> dp_netdev_execute_actions(pmd, , false, execute->flow,
>>   execute->actions, execute->actions_len);
>
> There is already a dirty hack named "do_not_steal" that was introduced,
> I guess, exactly to avoid crash in the conntrack code that could drop/steal
> the packet just like meter action.  And it seems that here in
> dpif_netdev_execute() is the only problematic entry point as all other
> normal paths expects that packet might be destroyed.
>
> The problem was, I suppose, introduced when we tried to unify semantics
> of "may_steal" flag by turning it into "should_steal".  But it seems that
> in this function we really need to prohibit stealing of the packet since
> ofproto layer still owns it regardless of the result of execution.
>
> I don't think that we need one more flag here, but we have several options
> how to fix the crash:
>
> 1. Start honoring batch->do_not_steal flag in all actions that may result
>in packet drops.  As the original idea of having 'do_not_steal' flag for
>a batch is very hacky, I'd like to not do that.

+1

> 2. Try to propagate information that packet was deleted up to ofproto layer,
>i.e. make handle_packet_out() aware of that.  Will, probably, be not that
>easy to do.

I had a look at doing this, but as you note it is quite intrusive, and
we need to make changes all over.

> 3. This function (dpif_netdev_execute) is not on a hot path in userspace
>datapath, IIUC.  It might be that it's just easier to remove the
>'do_not_steal' flag entirely, clone the packet here and call the
>dp_netdev_execute_actions() with should_steal=true.
>This sounds like the best solution for me, unless I overlooked some
>scenario, where this code is on a hot path.

I like this approach.

> Thoughts?
>
> Aaron, you have a patch[1] to remove 'do_not_steal' flag, so fix for this 
> issue
> will, likely, touch the same 

[ovs-dev] [PATCH ovn 1/2] tests: Add debugging 4 HV, 3LS, 2 LR test.

2021-06-07 Thread Mark Michelson
Add some checking to "4 HV, 3 LS, 2 LR, packet test with HA distributed
router gateway port".

Signed-off-by: Mark Michelson 
---
 tests/ovn.at | 90 
 1 file changed, 48 insertions(+), 42 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index f26894ce4..6cda89781 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -10442,9 +10442,9 @@ net_add n1
 
 sim_add hv1
 as hv1
-ovs-vsctl add-br br-phys
+check ovs-vsctl add-br br-phys
 ovn_attach n1 br-phys 192.168.0.1
-ovs-vsctl -- add-port br-int hv1-vif1 -- \
+check ovs-vsctl -- add-port br-int hv1-vif1 -- \
 set interface hv1-vif1 external-ids:iface-id=foo1 \
 options:tx_pcap=hv1/vif1-tx.pcap \
 options:rxq_pcap=hv1/vif1-rx.pcap \
@@ -10452,19 +10452,19 @@ ovs-vsctl -- add-port br-int hv1-vif1 -- \
 
 sim_add gw1
 as gw1
-ovs-vsctl add-br br-phys
+check ovs-vsctl add-br br-phys
 ovn_attach n1 br-phys 192.168.0.2
 
 sim_add gw2
 as gw2
-ovs-vsctl add-br br-phys
+check ovs-vsctl add-br br-phys
 ovn_attach n1 br-phys 192.168.0.4
 
 sim_add ext1
 as ext1
-ovs-vsctl add-br br-phys
+check ovs-vsctl add-br br-phys
 ovn_attach n1 br-phys 192.168.0.3
-ovs-vsctl -- add-port br-int ext1-vif1 -- \
+check ovs-vsctl -- add-port br-int ext1-vif1 -- \
 set interface ext1-vif1 external-ids:iface-id=outside1 \
 options:tx_pcap=ext1/vif1-tx.pcap \
 options:rxq_pcap=ext1/vif1-rx.pcap \
@@ -10475,77 +10475,81 @@ ovs-vsctl -- add-port br-int ext1-vif1 -- \
 # for ARP resolution).
 OVN_POPULATE_ARP
 
-ovn-nbctl create Logical_Router name=R0
-ovn-nbctl create Logical_Router name=R1
+AT_CHECK([ovn-nbctl create Logical_Router name=R0 -- \
+  create Logical_Router name=R1 | uuidfilt], [0], [<0>
+<1>
+])
 
-ovn-nbctl ls-add foo
-ovn-nbctl ls-add join
-ovn-nbctl ls-add alice
-ovn-nbctl ls-add outside
+check ovn-nbctl ls-add foo
+check ovn-nbctl ls-add join
+check ovn-nbctl ls-add alice
+check ovn-nbctl ls-add outside
 
 #Connect foo to R0
-ovn-nbctl lrp-add R0 R0-foo 00:00:01:01:02:03 192.168.1.1/24
-ovn-nbctl lsp-add foo foo-R0 -- set Logical_Switch_Port foo-R0 \
+check ovn-nbctl lrp-add R0 R0-foo 00:00:01:01:02:03 192.168.1.1/24
+check ovn-nbctl lsp-add foo foo-R0 -- set Logical_Switch_Port foo-R0 \
 type=router options:router-port=R0-foo \
 -- lsp-set-addresses foo-R0 router
 
 #Connect R0 to join
-ovn-nbctl lrp-add R0 R0-join 00:00:0d:01:02:03 100.60.1.1/24
-ovn-nbctl lsp-add join join-R0 -- set Logical_Switch_Port join-R0 \
+check ovn-nbctl lrp-add R0 R0-join 00:00:0d:01:02:03 100.60.1.1/24
+check ovn-nbctl lsp-add join join-R0 -- set Logical_Switch_Port join-R0 \
 type=router options:router-port=R0-join \
 -- lsp-set-addresses join-R0 router
 
 #Connect join to R1
-ovn-nbctl lrp-add R1 R1-join 00:00:0e:01:02:03 100.60.1.2/24
-ovn-nbctl lsp-add join join-R1 -- set Logical_Switch_Port join-R1 \
+check ovn-nbctl lrp-add R1 R1-join 00:00:0e:01:02:03 100.60.1.2/24
+check ovn-nbctl lsp-add join join-R1 -- set Logical_Switch_Port join-R1 \
 type=router options:router-port=R1-join \
 -- lsp-set-addresses join-R1 router
 
 #add route rules
-ovn-nbctl lr-route-add R0 0.0.0.0/0 100.60.1.2
-ovn-nbctl lr-route-add R1 192.168.0.0/16 100.60.1.1
+check ovn-nbctl lr-route-add R0 0.0.0.0/0 100.60.1.2
+check ovn-nbctl lr-route-add R1 192.168.0.0/16 100.60.1.1
 
 # Connect alice to R1 as distributed router gateway port on gw1
-ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24
+check ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24
 
-ovn-nbctl \
+AT_CHECK([ovn-nbctl \
 --id=@gc0 create Gateway_Chassis name=alice_gw1 \
  chassis_name=gw1 \
  priority=20 -- \
 --id=@gc1 create Gateway_Chassis name=alice_gw2 \
  chassis_name=gw2 \
  priority=10 -- \
-set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1]'
+set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1]' | uuidfilt], 
[0], [<0>
+<1>
+])
 
-ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \
+check ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \
 type=router options:router-port=alice \
 -- lsp-set-addresses rp-alice router
 
 # Create logical port foo1 in foo
-ovn-nbctl lsp-add foo foo1 \
+check ovn-nbctl lsp-add foo foo1 \
 -- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2"
 
 # Create logical port outside1 in outside
-ovn-nbctl lsp-add outside outside1 \
+check ovn-nbctl lsp-add outside outside1 \
 -- lsp-set-addresses outside1 "f0:00:00:01:02:04 172.16.1.3"
 
 # Create localnet port in alice
-ovn-nbctl lsp-add alice ln-alice
-ovn-nbctl lsp-set-addresses ln-alice unknown
-ovn-nbctl lsp-set-type ln-alice localnet
-ovn-nbctl lsp-set-options ln-alice network_name=phys
+check ovn-nbctl lsp-add alice ln-alice
+check ovn-nbctl lsp-set-addresses ln-alice unknown
+check ovn-nbctl lsp-set-type ln-alice 

[ovs-dev] [PATCH ovn 2/2] tests: Fix consistency of 4 HV, 3LS, 2 LR test runs.

2021-06-07 Thread Mark Michelson
Much like the 4 HV, 1 LS, 1 LR test, this one was also having issues
where the pcap file would not be reset before attempting to capture
packets. Therefore, the pcap file would be reset after having captured
packets we were expecting to see.

Therefore, this patch does similarly to 879ebc8c6, in that it adds the
following:
* Ensure that pcap file is reset before continuing
* Ensure that HA flows are installed before testing packets.

Signed-off-by: Mark Michelson 
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1941061
---
 tests/ovn.at | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index 6cda89781..4e84639b5 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -10566,15 +10566,12 @@ wait_column "$hv1_ch_uuid" HA_Chassis_Group 
ref_chassis
 # Allow some time for ovn-northd and ovn-controller to catch up.
 check ovn-nbctl --wait=hv sync
 
-reset_pcap_file() {
-local iface=$1
-local pcap_file=$2
-ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
-options:rxq_pcap=dummy-rx.pcap
-rm -f ${pcap_file}*.pcap
-ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
-options:rxq_pcap=${pcap_file}-rx.pcap
-}
+hv1_gw1_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find Interface 
name=ovn-gw1-0)
+hv1_gw2_ofport=$(as hv1 ovs-vsctl --bare --columns ofport find Interface 
name=ovn-gw2-0)
+
+OVS_WAIT_UNTIL([
+test 1 = $(as hv1 ovs-ofctl dump-flows br-int table=37 | grep -c 
"active_backup,ofport,members:$hv1_gw1_ofport,$hv1_gw2_ofport")
+])
 
 test_ip_packet()
 {
@@ -10619,9 +10616,9 @@ grep actions=mod_dl_dst:f0:00:00:01:02:04 | wc -l` -eq 1
 
exp_gw_ip_garp=020102030806000108000604000102010203ac100101ac100101
 echo $exp_gw_ip_garp >> ext1-vif1.expected
 
-as $active_gw reset_pcap_file br-phys_n1 $active_gw/br-phys_n1
-as $backup_gw reset_pcap_file br-phys_n1 $backup_gw/br-phys_n1
-as ext1 reset_pcap_file ext1-vif1 ext1/vif1
+as $active_gw reset_iface_pcap_file br-phys_n1 $active_gw/br-phys_n1
+as $backup_gw reset_iface_pcap_file br-phys_n1 $backup_gw/br-phys_n1
+as ext1 reset_iface_pcap_file ext1-vif1 ext1/vif1
 
 # Resend packet from foo1 to outside1
 check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
@@ -10649,6 +10646,10 @@ AT_CHECK([ovn-nbctl --wait=hv \
 <1>
 ])
 
+OVS_WAIT_UNTIL([
+test 1 = $(as hv1 ovs-ofctl dump-flows br-int table=37 | grep -c 
"active_backup,ofport,members:$hv1_gw2_ofport,$hv1_gw1_ofport")
+])
+
 test_ip_packet gw2 gw1
 
 OVN_CLEANUP([hv1],[gw1],[gw2],[ext1])
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] dpif-netdev: Report overhead busy cycles per pmd.

2021-06-07 Thread David Marchand
When setting PMD auto load balance parameters and observing the
feature (and the per rxq statistics) in action, you can easily get
rebalances while the sum of per rxq pmd usage is below the pmd load
threshold you had set.

This is because the dpif-netdev/pmd-rxq-show command only reports "pure"
rxq cycles while some cycles are used in the pmd mainloop and adds up to
the total pmd load.

dpif-netdev/pmd-stats-show does report per pmd load usage.
This load is measured since the last dpif-netdev/pmd-stats-clear call.
On the other hand, the auto load balance feature uses the pmd load on a 10s
sliding window which makes it non trivial to correlate.

Gather per pmd busy cycles with the same periodicity and report the
difference as overhead in dpif-netdev/pmd-rxq-show so that we have all
info in a single command.

Example:
$ ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 4:
  isolated : false
  port: dpdk0 queue-id:  0 (enabled)   pmd usage: 37 %
  port: dpdk1 queue-id:  0 (enabled)   pmd usage: 36 %
  port: vhost3queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost6queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost7queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  4 %
pmd thread numa_id 0 core_id 18:
  isolated : false
  port: vhost0queue-id:  0 (enabled)   pmd usage: 37 %
  port: vhost1queue-id:  0 (enabled)   pmd usage: 39 %
  port: vhost2queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost4queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost5queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  5 %

Signed-off-by: David Marchand 
---
 Documentation/topics/dpdk/pmd.rst |   6 ++
 lib/dpif-netdev.c | 100 --
 tests/pmd.at  |   8 ++-
 3 files changed, 80 insertions(+), 34 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst 
b/Documentation/topics/dpdk/pmd.rst
index e481e79414..2df10dcc5d 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -213,6 +213,12 @@ load for each of its associated queues every 10 seconds. 
If the aggregated PMD
 load reaches the load threshold for 6 consecutive intervals then PMD considers
 itself to be overloaded.
 
+.. versionchanged:: 2.16.0
+
+   The per PMD load is shown in the same fashion than Rx queue's in the output
+   of ``pmd-rxq-show``. It accounts for all Rx queue's processing cycles and
+   internal PMD core main loop cost.
+
 For example, to set the load threshold to 70%::
 
 $ ovs-vsctl set open_vswitch .\
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 650e67ab30..bcebcebe0e 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -235,11 +235,11 @@ struct dfc_cache {
 
 /* Time in microseconds of the interval in which rxq processing cycles used
  * in rxq to pmd assignments is measured and stored. */
-#define PMD_RXQ_INTERVAL_LEN 1000LL
+#define PMD_INTERVAL_LEN 1000LL
 
 /* Number of intervals for which cycles are stored
  * and used during rxq to pmd assignment. */
-#define PMD_RXQ_INTERVAL_MAX 6
+#define PMD_INTERVAL_MAX 6
 
 /* Time in microseconds to try RCU quiescing. */
 #define PMD_RCU_QUIESCE_INTERVAL 1LL
@@ -465,9 +465,9 @@ struct dp_netdev_rxq {
 
 /* Counters of cycles spent successfully polling and processing pkts. */
 atomic_ullong cycles[RXQ_N_CYCLES];
-/* We store PMD_RXQ_INTERVAL_MAX intervals of data for an rxq and then
+/* We store PMD_INTERVAL_MAX intervals of data for an rxq and then
sum them to yield the cycles used for an rxq. */
-atomic_ullong cycles_intrvl[PMD_RXQ_INTERVAL_MAX];
+atomic_ullong cycles_intrvl[PMD_INTERVAL_MAX];
 };
 
 /* A port in a netdev-based datapath. */
@@ -703,13 +703,18 @@ struct dp_netdev_pmd_thread {
 long long int next_optimization;
 /* End of the next time interval for which processing cycles
are stored for each polled rxq. */
-long long int rxq_next_cycle_store;
+long long int next_cycle_store;
 
 /* Last interval timestamp. */
 uint64_t intrvl_tsc_prev;
 /* Last interval cycles. */
 atomic_ullong intrvl_cycles;
 
+/* Write index for 'busy_cycles_intrvl'. */
+unsigned int intrvl_idx;
+/* Busy cycles in last PMD_INTERVAL_MAX intervals. */
+atomic_ullong busy_cycles_intrvl[PMD_INTERVAL_MAX];
+
 /* Current context of the PMD thread. */
 struct dp_netdev_pmd_thread_ctx ctx;
 
@@ -1229,6 +1234,8 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
 struct rxq_poll *list;
 size_t n_rxq;
 uint64_t total_cycles = 0;
+uint64_t busy_cycles = 0;
+uint64_t polling_cycles = 0;
 
 ds_put_format(reply,
   "pmd thread numa_id %d core_id %u:\n  isolated : %s\n",
@@ -1241,16 +1248,27 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
 /* Get 

Re: [ovs-dev] [PATCH v2] ofproto-dpif-upcall: Improve concurrency by adjust flow-limit

2021-06-07 Thread taoyunupt
Hi ,
  Please help to review this patch. It  has passed couple of days. Hope to 
know your idea. Thanks.





Thanks,

YUN







At 2021-05-26 10:44:16, "Tao YunXiang"  wrote:
>OVS uses a set of gentle mechanisms to control the number of flows in 
>datapath. However, it is difficult for the number of datapath flows to 
>reach a high level in a short time.
>
>Through some tests with Apache benchmark, I found the point is that
>the value of flow-limit increases too slowly, but decreases very quickly. 
>
>This change has two improvements. One is the flow-limit will adjust more 
>linearly. The other one is, when the duration is small, the flow-limit will
>increase faster.
>
>Through this change, the test result of Apache benchmark can reach 12k from 5k,
>the test command is as follows:
>
>ab -n 20 -c 500 -r http://1.1.1.2/
>
>v1->v2: Make flow-limit changes linearly. Add a limitation to make flow_limit
>won't overflow UINT_MAX.
>
>
>Signed-off-by: Tao YunXiang 
>Cc: Ben Pfaff 
>
>---
> ofproto/ofproto-dpif-upcall.c | 20 +---
> 1 file changed, 13 insertions(+), 7 deletions(-)
>
>diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>index ccf97266c..805409165 100644
>--- a/ofproto/ofproto-dpif-upcall.c
>+++ b/ofproto/ofproto-dpif-upcall.c
>@@ -963,14 +963,20 @@ udpif_revalidator(void *arg)
> 
> duration = MAX(time_msec() - start_time, 1);
> udpif->dump_duration = duration;
>-if (duration > 2000) {
>-flow_limit /= duration / 1000;
>-} else if (duration > 1300) {
>-flow_limit = flow_limit * 3 / 4;
>-} else if (duration < 1000 &&
>-   flow_limit < n_flows * 1000 / duration) {
>-flow_limit += 1000;
>+
>+/* Use duration as a reference, adjust the value of flow_limit,
>+ * when the duration is short, increase the flow_limit, and vice
>+ * versa. The purpose of using multiple conversions between int
>+ * and float is to make the flow_limit change more linear. */ 
>+
>+if (duration >= 1000) {
>+flow_limit = (int) ((float) (flow_limit) / (duration / 
>1000.0));
>+} else if (flow_limit * (1000.0 / duration) >= UINT_MAX) {
>+flow_limit = ofproto_flow_limit;
>+} else {
>+flow_limit = (int) (flow_limit * (1000.0 / duration));
> }
>+
> flow_limit = MIN(ofproto_flow_limit, MAX(flow_limit, 1000));
> atomic_store_relaxed(>flow_limit, flow_limit);
> 
>-- 
>2.17.1
>
>
>
>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] system-traffic.at:add missing comma

2021-06-07 Thread taoyunupt
Hi ,
  Please help to review this patch. It  has passed couple of days. Thanks.





Thanks,

YUN


At 2021-05-26 10:53:45, "Tao YunXiang"  wrote:
>Add missing comma.
>
>Signed-off-by: Tao YunXiang 
>Cc: Joe Stringer 
>
>---
> tests/system-traffic.at | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>index fb5b9a36d..ed925cea5 100644
>--- a/tests/system-traffic.at
>+++ b/tests/system-traffic.at
>@@ -2561,7 +2561,7 @@ priority=10,arp,action=normal
> 
> dnl Only allow non-fragmented messages and 1st fragments of each message
> priority=100,in_port=1,icmp,ip_frag=no,action=ct(commit,zone=9),2
>-priority=100,in_port=1,icmp,ip_frag=firstaction=ct(commit,zone=9),2
>+priority=100,in_port=1,icmp,ip_frag=first,action=ct(commit,zone=9),2
> priority=100,in_port=2,ct_state=-trk,icmp,action=ct(table=0,zone=9)
> priority=100,in_port=2,ct_state=+trk+est-new,icmp,action=1
> ])
>-- 
>2.17.1
>
>
>
>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] bond: Fix broken rebalancing after link state changes.

2021-06-07 Thread Ilya Maximets
On 8/2/17 11:09 PM, Andy Zhou wrote:
> On Thu, Jul 20, 2017 at 10:21 AM, Ilya Maximets  
> wrote:
>> There are 3 constraints for moving hashes from one slave to another:
>>
>> 1. The load difference is larger than ~3% of one slave's load.
>> 2. The load difference between slaves exceeds 10 bytes.
>> 3. Moving of the hash makes the load difference lower by > 10%.
>>
>> In current implementation if one of the slaves goes DOWN state, all
>> the hashes assigned to it will be moved to other slaves. After that,
>> if slave will go UP it will wait for rebalancing to get some hashes.
>> But in case where we have more than 10 equally loaded hashes it
>> will never fit constraint #3, because each hash will handle less than
>> 10% of the load. Situation become worse when number of flows grows
>> higher and it's almost impossible to migrate any hash when all the
>> 256 hash entries are used which is very likely when we have few
>> hundreds/thousands of flows.
>>
>> As a result, if one of the slaves goes down and up while traffic
>> flows, it will never be used again for packet transmission.
>> Situation will not be fixed even if we'll stop traffic completely
>> and start it again because first two constraints will block
>> rebalancing on the earlier stages while we have low amount of traffic.
>>
>> Moving of one hash if destination has no hashes as it was before
>> commit c460a6a7bc75 ("ofproto/bond: simplify rebalancing logic")
>> will not help because having one hash isn't enough to make load
>> difference less than 10% of total load and this slave will
>> handle only that one hash forever.
>>
>> To fix this lets try to move few hashes simultaniously to fit
>> constraint #3.
> 
> Thanks for working on this.

Sorry for not replying for almost 4 years. :)

And sorry for resurrecting the thread, but the issue still exists and
I think that we still need to fix it.  The first patch needs a
minor rebase, but it still works fine.  The test in the second patch
is still valid.

> 
> Current interface tries to only rebalance one hash bucket at a time.
> There are two reasons I can think of:
> 
> In general, It is good idea to keep flow binding stable. Moving flow
> introduces side effects, such as packet re-ordering, that impacts
> application performances significantly.

Yes.  I agree with that.  We should try to minimize moving of flows.
Current algorithm moves hashes until difference between heaviest
member and the lightest one is more than ~3% or it's more than ~1Mbps
and there is at least one hash that could be moved to make the load
difference better by 10%.  That is, in general, a lot of hash migrations.
New algorithm keeps all the previous restrictions, but allows to move
more than 1 hash at a time to fulfill the last 10% restriction.
It's hard to tell if it will result in more movements in general case.
But, I think, we can tune it better to keep as close to the current one
as possible. See below. 

> 
> The second reason is the bandwidth each flow consumes tends to
> be more dynamic in practice.  It is usually pointless to use a snapshot
> of bandwidth consumption to come up a "perfect" bond port binding.
> For example, TCP flows rate adapt to available bandwidth.  It is
> probably better to make smaller changes in each rebalancing step.

Yes, I agree with this statement too.  But new algorithm doesn't try
to create a perfect distribution.  It stops collecting hashes for
migration when distribution is close enough.  In this patch it's 10%
difference with the "ideal" value.  I agree that this might be too
much.  We can, probably, tune this value.  For example, 90% difference
with "ideal" for a single movement will be somewhat close to the
current algorithm as we will move hashes that reduces the difference
by 10%.   I'm afraid, though, that this will result in much more
movements, so it might make sense to settle somewhere in the middle,
e.g. 40-50% of improvement for a single migration.  This should be
fair enough.

Thoughts?

> 
> Is it still possible to address the issue mentioned while maintaining
> current interface? Can we find some way to break the tie in case current
> algorithm fail to nominate any bucket to move?

It's hard to tell.  I've been thinking about this for a long time
before sending this patch and didn't came up with anything elegant.
It would be easy to identify this kind of case where all hashes are
smaller than 10% of the difference, but the difference itself is high,
and use the new algorithm only for this case, but I suspect that in
this case we will always move hashes one by one several times with
the old algorithm and use the new one on the last iteration to move
a bunch of smaller hashes.  The end result might be not any different
than just running new algorithm from the beginning.  Also this schema
will take much more processing time than just running a new algorithm
from the start.

For now I'm going to rebase this patch-set and, probably, change the
moving 

Re: [ovs-dev] [PATCH ovn] ovn-trace: correctly handle ct_dnat(IP) action

2021-06-07 Thread Dumitru Ceara
On 6/4/21 7:06 PM, Mark Gray wrote:
> ovn-trace does not set translated ip address for ct_dnat()
> actions when tracing. This causes the trace to end prematurely.

Hi Mark,

Thanks for the patch!

> 
> This can be tested with the following or an equivalent for IPv6:
> 
> ovn-nbctl ls-add sw0
> ovn-nbctl lsp-add sw0 sw0-port1
> ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:01 192.168.0.2"
> 
> ovn-nbctl ls-add sw1
> ovn-nbctl lsp-add sw1 sw1-port1
> ovn-nbctl lsp-set-addresses sw1-port1 "50:54:00:00:00:03 11.0.0.2"
> 
> ovn-nbctl lr-add lr0
> ovn-nbctl lrp-add lr0 lrp0 00:00:00:00:ff:01 192.168.0.1/24
> ovn-nbctl lsp-add sw0 lrp0-attachment
> ovn-nbctl lsp-set-type lrp0-attachment router
> ovn-nbctl lsp-set-addresses lrp0-attachment 00:00:00:00:ff:01
> ovn-nbctl lsp-set-options lrp0-attachment router-port=lrp0
> ovn-nbctl lrp-add lr0 lrp1 00:00:00:00:ff:02 11.0.0.1/24 -- 
> lrp-set-gateway-chassis lrp1 chassis-1
> ovn-nbctl lsp-add sw1 lrp1-attachment
> ovn-nbctl lsp-set-type lrp1-attachment router
> ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
> ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
> 
> ovn-nbctl lr-nat-add lr0 dnat 42.42.42.42 192.168.0.2
> 
> ovs-vsctl add-port br-int p1 -- \
> set Interface p1 external_ids:iface-id=sw0-port1
> ovs-vsctl add-port br-int p2 -- \
> set Interface p2 external_ids:iface-id=sw1-port1
> 
> ovn-trace  'inport == "sw1-port1" && eth.src == 50:54:00:00:00:03 && eth.dst 
> == 00:00:00:00:ff:02 && ip4.dst == 42.42.42.42 && ip4.src == 11.0.0.2 && 
> ip.ttl == 64'

We have some ovn-trace self tests already, in ovn-northd.at (for
configurations that don't need ovn-controller) and ovn.at (for more
complex scenario), so I think this could be added as a new test there.

> 
> Signed-off-by: Mark Gray 
> ---
>  utilities/ovn-trace.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
> index 3b26b5af1d69..15b737b23496 100644
> --- a/utilities/ovn-trace.c
> +++ b/utilities/ovn-trace.c
> @@ -2297,10 +2297,20 @@ execute_ct_nat(const struct ovnact_ct_nat *ct_nat,
>  if (ct_nat->family == AF_INET) {
>  ds_put_format(, "(ip4.%s="IP_FMT")", direction,
>IP_ARGS(ct_nat->ipv4));
> +if (is_dst) {
> +ct_flow.nw_dst=ct_nat->ipv4;
> +} else {
> +ct_flow.nw_src=ct_nat->ipv4;
> +}

Nit: needs spaces around '='

>  } else {
>  ds_put_format(, "(ip6.%s=", direction);
>  ipv6_format_addr(_nat->ipv6, );
>  ds_put_char(, ')');
> +if (is_dst) {
> +ct_flow.ipv6_dst=ct_nat->ipv6;
> +} else {
> +ct_flow.ipv6_src=ct_nat->ipv6;
> +}

Nit: same here

>  }
>  
>  uint8_t state = is_dst ? CS_DST_NAT : CS_SRC_NAT;
> 

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 0/4] Fix ovn-controller I-P for multicast groups and lport changes

2021-06-07 Thread Dumitru Ceara
On 6/7/21 7:33 AM, Han Zhou wrote:
> On Fri, Jun 4, 2021 at 6:56 AM Dumitru Ceara  wrote:
>>
>> On 5/28/21 9:23 PM, Han Zhou wrote:
>>> The series fixes incremental processing for missing dependency handling
> for
>>> multicast group and logical port binding changes when computing logical
> flows.
>>> It also removes the workaround in northd that was required due to the
> missing
>>> dependency handling. In addition, the fix also allows us to monitor all
> DPGs as
>>> an optimization, so it is also included in the series.
>>>
>>
>> Hi Han,
>>
>> Thanks for working on fixing this!  I wonder however if there will be
>> any performance impact due to the added resource references?  I didn't
>> do any scale testing, did you have a chance to?
>>
> Hi Dumitru,
> 

Hi Han,

> Thanks for the review. There is no performance impact noticed because the
> port-binding reference is already tracked in the existing implementation
> for lports used at outport/inport and is_chassis_resident. This fix mainly
> changes the way how it is tracked (by name instead of DP keys). I admit
> that there are some extra references added compared with existing
> implementation:
> 1. References for the lports that are not found
> 2. For lports that are used other than for
> inport/outport/is_chassis_resident

I'm not sure there is any other lport reference except the three above.

> 3. For MC groups (found or not found)
> 
> 1) is the very few cases that do not matter for performance but matter for
> correctness. 2) seems also rare, if exists at all. 3) shouldn't impact
> performance either because the number of MC groups should be much smaller
> than that of lports.
> So I think the performance impact is negligible and if there is any cost it
> is necessary for the correctness. I ran a test with 1k lswitches, 10k
> lports and an ACL using half of the ports in a logical group (and address
> set) and triggers full recompute - there is no performance difference
> noticed.
> 

Great, thanks for the confirmation.

>> As an alternative, maybe longer term though, would it make sense to use
>> explicit references instead in the Southbound schema?  That would
>> simplify the ovn-controller code and would rely on the IDL to propagate
>> the change tracking to the flows that refer to multicast groups/port
>> bindings/port groups.
> 
> I wonder if this is possible because the end user specifies the "match"
> directly in an ACL, which can use port names. We will have to change the
> ACL syntax to support it.

An alternative would be to perform some basic ACL match parsing in
northd, parsing tokens like
inport/outport/is_chassis_resident/$/@.

> On the other hand, if we can generate lflows using references to SB
> port-bindings or MC groups, we would be able to put the DP key in the lflow
> directly, without the need for ovn-controller to parse the lport/mc-group
> at all, so there is no reference needed, right?

Exactly, that was what I was thinking too.  I wonder how we'd maintain
backwards compatibility though.  Would we need a logical_flow_v2 table
in the Southbound?

> 
> Thanks,
> Han
> 

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev