Hi Ihar,

It also seems that the test added in the mentioned commit is flaky: on my dev VM, 2 or 3 runs out of 10 consistently fail. With verbose logging, a failure looks like this:

103. ovn.at:3708: testing VXLAN check port/datapath key space limits -- ovn-northd -- dp-groups=yes ...
creating ovn-sb database
ovsdb-server -vjsonrpc --remote=punix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-sb/ovn-sb.sock --remote=db:OVN_Southbound,SB_Global,connections --private-key=/home/ec2-user/ovn/tests/testpki-test-privkey.pem --certificate=/home/ec2-user/ovn/tests/testpki-test-cert.pem --ca-cert=/home/ec2-user/ovn/tests/testpki-cacert.pem /home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-sb/ovn-sb.db -vconsole:off --detach --no-chdir --pidfile --log-file
creating ovn-nb database
ovsdb-server -vjsonrpc --remote=punix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-nb/ovn-nb.sock /home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-nb/ovn-nb.db -vconsole:off --detach --no-chdir --pidfile --log-file
starting northd
ovn-northd -vjsonrpc --ovnnb-db=unix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-nb/ovn-nb.sock --ovnsb-db=unix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-sb/ovn-sb.sock -vconsole:off --detach --no-chdir --pidfile --log-file
starting northd-backup
ovn-northd -vjsonrpc --ovnnb-db=unix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-nb/ovn-nb.sock --ovnsb-db=unix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-sb/ovn-sb.sock -vconsole:off --detach --no-chdir --pidfile --log-file
4c3ef530-9e97-4c5a-a4cc-0aed1abe88fb
ovn-macros.at:234: waiting until TCP_PORT=`sed -n 's/.*0:.*: listening on port \([0-9]*\)$/\1/p' "$d/ovn-sb/ovsdb-server.log"` && test X != X"$TCP_PORT"...
ovn-macros.at:234: wait succeeded immediately
adding simulator 'main'
ovsdb-server --remote=punix:/home/ec2-user/ovn/tests/testsuite.dir/0103/main/db.sock -vconsole:off --detach --no-chdir --pidfile --log-file
ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -vunixctl -vconsole:off --detach --no-chdir --pidfile --log-file
ovs-vsctl add-br br-phys
./ovn-macros.at:385: "$@"
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
./ovn-macros.at:385: "$@"
ovn-controller --enable-dummy-vif-plug -vconsole:off --detach --no-chdir --pidfile --log-file
ovn-nbctl --wait=sb sync
./ovn-macros.at:385: "$@"
ovn-nbctl ls-add ls-bad -- set Logical_Switch ls-bad other_config:requested-tnl-key=5000
./ovn-macros.at:385: "$@"
ovn-nbctl lsp-add ls-bad lsp-bad -- set logical_switch_port lsp-bad options:requested-tnl-key=5000
./ovn-macros.at:385: "$@"
ovn-nbctl --wait=sb sync
./ovn-macros.at:385: "$@"
ovs-vsctl add-port br-int vif-bad -- set Interface vif-bad external-ids:iface-id=lsp-bad
./ovn-macros.at:385: "$@"
ovn.at:3708: waiting until test x`ovn-nbctl lsp-get-up lsp-bad` = xup...
ovn.at:3708: wait succeeded immediately
./ovn.at:3708: ovn-sbctl get Datapath_Binding ls-bad tunnel_key
--- - 2021-11-10 21:35:42.458340681 +0300
+++ /home/ec2-user/ovn/tests/testsuite.dir/at-groups/103/stdout 2021-11-10 21:35:42.456326372 +0300
@@ -1,2 +1,2 @@
-1
+5000
103. ovn.at:3708: FAILED (ovn.at:3708)

I tried adding a wait for the '1' output at this point, but it didn’t help. Here is what I see in the ovn-northd logs.

The logical switch is added with tnl_key 5000:

2021-11-10T18:37:15.824Z|00049|jsonrpc|DBG|unix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-nb/ovn-nb.sock: received notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"Logical_Switch":{"29213084-77d0-47be-bd38-1ce8a4d33117":{"insert":{"name":"ls-bad","other_config":["map",[["requested-tnl-key","5000"]]]}}}}]

Then ovn-northd inserts many records into the SB DB; one of them is:

"uuid-name":"rowab213870_6420_4c6a_b005_31d0c468f6fe","row":{"tunnel_key":5000,"external_ids":["map",[["logical-switch","29213084-77d0-47be-bd38-1ce8a4d33117"],["name","ls-bad"]]]},"op":"insert","table":"Datapath_Binding"},

Then lsp-bad is created:

2021-11-10T18:37:15.830Z|00051|jsonrpc|DBG|unix:/home/ec2-user/ovn/tests/testsuite.dir/0103/ovn-nb/ovn-nb.sock: received notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"Logical_Switch":{"29213084-77d0-47be-bd38-1ce8a4d33117":{"modify":{"ports":["uuid","893ffa21-53bd-4b91-9747-05aaf2ddc322"]}}},"Logical_Switch_Port":{"893ffa21-53bd-4b91-9747-05aaf2ddc322":{"insert":{"name":"lsp-bad","options":["map",[["requested-tnl-key","5000"]]]}}}}]

And here is part of the lsp-related transaction to the SB DB:

{"uuid-name":"row06ada4fc_48b6_43af_a35b_d0e3a2435e17","row":{"up":false,"tunnel_key":5000,"datapath":["uuid","6ea0064f-870c-47ed-96ef-90bd8ce847a5"],"logical_port":"lsp-bad","options":["map",[["requested-tnl-key","5000"]]]},"op":"insert","table":"Port_Binding"}

It is followed by these warnings:

2021-11-10T18:37:15.834Z|00061|northd|WARN|Tunnel key 5000 for datapath ls-bad is incompatible with VXLAN
2021-11-10T18:37:15.834Z|00062|northd|WARN|Tunnel key 5000 for port lsp-bad is incompatible with VXLAN

I don’t know how ovn-northd is supposed to handle this case, but I hope these logs help you understand what’s going on.

Thanks.
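
For reference, here is a minimal sketch of the key-range arithmetic behind these warnings and behind the transit switch case in the quoted message below. The constant values are assumptions and should be checked against lib/ovn-util.h in your tree; the helper names is_global_dp_key and fits_vxlan_dp_key are hypothetical and do not reproduce the actual ovn-northd validation code.

```python
# Assumed values; verify against lib/ovn-util.h in your OVN tree.
OVN_MAX_DP_KEY = (1 << 24) - 1         # full 24-bit datapath key space
OVN_MAX_DP_GLOBAL_NUM = (1 << 16) - 1  # keys reserved for global (IC) datapaths
OVN_MAX_DP_KEY_LOCAL = OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM
OVN_MAX_DP_VXLAN_KEY = (1 << 12) - 1   # assumed 12-bit datapath key in VXLAN mode


def is_global_dp_key(key: int) -> bool:
    """Hypothetical helper: True if key lies above the local range."""
    return OVN_MAX_DP_KEY_LOCAL < key <= OVN_MAX_DP_KEY


def fits_vxlan_dp_key(key: int) -> bool:
    """Hypothetical helper: True if key fits the assumed VXLAN datapath limit."""
    return 1 <= key <= OVN_MAX_DP_VXLAN_KEY


if __name__ == "__main__":
    # 5000 is the key requested in the test; 16711897 is from the IC warning.
    for key in (5000, 16711897):
        print(key,
              "global" if is_global_dp_key(key) else "local",
              "fits VXLAN" if fits_vxlan_dp_key(key) else "exceeds VXLAN limit")
```

Under these assumptions, 5000 is a local key that exceeds the VXLAN datapath limit, and 16711897 falls in the global range used by transit switches, so both would trip a VXLAN range check.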

Regards,
Vladislav Odintsov

On 9 Nov 2021, at 19:37, Ihar Hrachyshka <[email protected]> wrote:

It was not the intent to change the range of tunnel keys for vtep setups. If that happens, we should fix it. Perhaps the range is not touched but the range validation is activated in vtep case. If so, the validation should be disabled.

Ihar

On 11/8/21 4:57 PM, Vladislav Odintsov wrote:

Hi,

It seems I have found a regression in the latest ovn main branch code when using ovn-ic. My setup utilises ramp switches (ovn-controller-vtep), so some chassis have vxlan encap. But I guess the problem affects non-vtep switch VXLAN installations too.

OVN IC transit switches use tunnel key ids > OVN_MAX_DP_KEY_LOCAL (OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM), so if a user has at least one VXLAN-enabled chassis and LRs interconnected with a transit switch, he/she will get lots of messages like these in the northd logs:

2021-11-08T21:03:32.606Z|00132|northd|WARN|Tunnel key 16711897 for datapath vpc-6A0E5C34-global is incompatible with VXLAN
2021-11-08T21:04:32.607Z|00133|northd|WARN|Dropped 27 log messages in last 60 seconds (most recently, 31 seconds ago) due to excessive rate

This logic was introduced in commit [1]. I can’t tell from the code or from the commit itself whether the previous behaviour can still be supported, but maybe I’m missing something. Can somebody help with this, please?

Thanks.

1: https://github.com/ovn-org/ovn/commit/fd44d75959cedcedf1f103173be1d9fa1abd9cb8

Regards,
Vladislav Odintsov

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
