Re: [ovs-dev] [PATCH] netdev-afxdp: add afxdp specific maximum MTU check
On 8 Nov 2019, at 2:34, William Tu wrote: On Thu, Nov 07, 2019 at 03:01:18PM +0100, Eelco Chaudron wrote: Any feedback on this? On 1 Oct 2019, at 11:55, Eelco Chaudron wrote: Drivers natively supporting AF_XDP will check that a configured MTU size will not exceed the allowed size for AF_XDP. However, when the skb compatibility mode is used there is no check and any value is accepted. This, for example, is the case when using the TAP interface. This fix adds a check to make sure only AF_XDP valid values are excepted. Signed-off-by: Eelco Chaudron --- lib/netdev-afxdp.c | 17 + lib/netdev-afxdp.h |1 + lib/netdev-linux.c |9 + 3 files changed, 27 insertions(+) diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c index 6e0180327..140150f29 100644 --- a/lib/netdev-afxdp.c +++ b/lib/netdev-afxdp.c @@ -1001,6 +1001,23 @@ netdev_afxdp_destruct(struct netdev *netdev) ovs_mutex_destroy(&dev->mutex); } +int +netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu) +{ +/* + * If a device is used in xdpmode skb, no driver-specific MTU size is + * checked and any value is allowed resulting in packet drops. + * This check will verify the maximum supported value based on the + * buffer size allocated and the additional headroom required. + */ +if (netdev == NULL || mtu <= 0 +|| mtu > (FRAME_SIZE - OVS_XDP_HEADROOM - XDP_PACKET_HEADROOM)) { I remember XDP max MTU = 3520 bytes, and it's (page_size(4096) - headroom(256) - shinfo(320)) so here we should also subtract shinfo? That is the theoretical maximum, however you’re code allocated chunks of FRAME_SIZE, so that dictated the maximum value. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Track vhost tx contention.
On Tue, Nov 5, 2019 at 4:37 PM Ilya Maximets wrote: > >> That's an interesting debug method, but it looks not very suitable > >> for an end-user documentation. One thing that bothers me the most > >> is referencing C code snippets in this kind of documentation. > > > > Ok, can we conclude on the coverage counter wrt the v1 then? > > https://patchwork.ozlabs.org/patch/1153238/ > > > > Should I submit a v3 with the doc update (but removing the parts about > > perf) ? > > 'perf' part should not be there. > > I'm in doubt about the rest of the docs. This part might be useful, but > it doesn't provide any solution for the problem and I really don't know > if there is something we can suggest in that case. "Rework your network > topology" doesn't sound like a friendly solution or exact steps to do. > > One more thing is that documenting coverage counters doesn't look like a > good idea to me and I'd like to not create a precedent. > > One day we'll rework this to be some "PMD/netdev performance statistics" > and it'll be OK to document it. But there is nothing more permanent > than a temporary solution. > > Right now the easiest way for me is to just apply v1. Ok for me. -- David Marchand ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netdev-afxdp: add afxdp specific maximum MTU check
On 7 Nov 2019, at 17:43, Ilya Maximets wrote: On 07.11.2019 15:01, Eelco Chaudron wrote: Any feedback on this? On 1 Oct 2019, at 11:55, Eelco Chaudron wrote: Drivers natively supporting AF_XDP will check that a configured MTU size will not exceed the allowed size for AF_XDP. However, when the skb compatibility mode is used there is no check and any value is accepted. This sounds like a kernel issue. So, maybe it's better to fix it there? Cant remember the details but I did see an easy fix so decided to fix it here first :) This, for example, is the case when using the TAP interface. This fix adds a check to make sure only AF_XDP valid values are excepted. Signed-off-by: Eelco Chaudron --- lib/netdev-afxdp.c | 17 + lib/netdev-afxdp.h | 1 + lib/netdev-linux.c | 9 + 3 files changed, 27 insertions(+) diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c index 6e0180327..140150f29 100644 --- a/lib/netdev-afxdp.c +++ b/lib/netdev-afxdp.c @@ -1001,6 +1001,23 @@ netdev_afxdp_destruct(struct netdev *netdev) ovs_mutex_destroy(&dev->mutex); } +int +netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu) +{ + /* + * If a device is used in xdpmode skb, no driver-specific MTU size is + * checked and any value is allowed resulting in packet drops. + * This check will verify the maximum supported value based on the + * buffer size allocated and the additional headroom required. + */ + if (netdev == NULL || mtu <= 0 I don't really think that we need to check above things. netdev can't be NULL here. You may put an assert here if you worried. mtu shuld be validated by db schema, and it can not be < 1. Ok will remove it… + || mtu > (FRAME_SIZE - OVS_XDP_HEADROOM - XDP_PACKET_HEADROOM)) { mtu doesn't usually include ethernet header, vlans. So, these should be excluded too. Not sure about FCS, if it passed to XDP program in generic mode or it's stripped by the driver. Thought I did some tests and the MTU given was from the ethernet header till the FCS (not included). But I’ll re-test it on the new revision, and update the code here if needed… + return EINVAL; + } + + return 0; +} + int netdev_afxdp_get_custom_stats(const struct netdev *netdev, struct netdev_custom_stats *custom_stats) diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h index e2f400b72..ee6939fce 100644 --- a/lib/netdev-afxdp.h +++ b/lib/netdev-afxdp.h @@ -39,6 +39,7 @@ int netdev_afxdp_rxq_construct(struct netdev_rxq *rxq_); void netdev_afxdp_rxq_destruct(struct netdev_rxq *rxq_); int netdev_afxdp_construct(struct netdev *netdev_); void netdev_afxdp_destruct(struct netdev *netdev_); +int netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu); int netdev_afxdp_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch, diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index f48192373..89d0d9a9d 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -1593,6 +1593,15 @@ netdev_linux_set_mtu(struct netdev *netdev_, int mtu) goto exit; } +#ifdef HAVE_AF_XDP + if (!strcmp(netdev_get_type(netdev_), "afxdp")) { It's better to compare netdev class with &netdev_afxdp_calss as it done for tap. Will do this in next version! + error = netdev_afxdp_verify_mtu_size(netdev_, mtu); + if (error) { + goto exit; + } + } +#endif + if (netdev->cache_valid & VALID_MTU) { error = netdev->netdev_mtu_error; if (error || netdev->mtu == mtu) { ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] netdev-dpdk: Fix dev attached flag.
If the dev was already probed correctly, the dev attached flag should be set to true, or resource would leak during destruct. Signed-off-by: Zhike Wang --- lib/netdev-dpdk.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index a65540c..10e2fe9 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -1728,6 +1728,9 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev, new_port_id = DPDK_ETH_PORT_ID_INVALID; } } +} else { +dev->attached = true; +VLOG_INFO("Device '%s' attached to DPDK", devargs); } } -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] flow: Fix IPv6 header parser with partial offloading.
Set new_proto before it is used in parse_ipv6_ext_hdrs__(). Signed-off-by: Zhike Wang --- lib/flow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/flow.c b/lib/flow.c index a18a1e6..45bb96b 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -1136,11 +1136,11 @@ parse_tcp_flags(struct dp_packet *packet) dp_packet_set_l2_pad_size(packet, size - plen); size = plen; const struct ovs_16aligned_ip6_frag *frag_hdr; +nw_proto = nh->ip6_nxt; if (!parse_ipv6_ext_hdrs__(&data, &size, &nw_proto, &nw_frag, &frag_hdr)) { return 0; } -nw_proto = nh->ip6_nxt; } else { return 0; } -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] conntrack: Fix tcp payload length in case multi-segments.
Signed-off-by: Zhike Wang --- lib/conntrack-private.h | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h index 590f139..1d21f6e 100644 --- a/lib/conntrack-private.h +++ b/lib/conntrack-private.h @@ -233,13 +233,17 @@ conn_update_expiration(struct conntrack *ct, struct conn *conn, static inline uint32_t tcp_payload_length(struct dp_packet *pkt) { -const char *tcp_payload = dp_packet_get_tcp_payload(pkt); -if (tcp_payload) { -return ((char *) dp_packet_tail(pkt) - dp_packet_l2_pad_size(pkt) -- tcp_payload); -} else { -return 0; +size_t l4_size = dp_packet_l4_size(pkt); + +if (OVS_LIKELY(l4_size >= TCP_HEADER_LEN)) { +struct tcp_header *tcp = dp_packet_l4(pkt); +int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4; + +if (OVS_LIKELY(tcp_len >= TCP_HEADER_LEN && tcp_len <= l4_size)) { +return (l4_size - tcp_len); +} } +return 0; } #endif /* conntrack-private.h */ -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] BUG: MAX_LOCKDEP_ENTRIES too low!
I guess this is more for Peter or Ingo... On Thu 07-11-19 19:54:08, syzbot wrote: > syzbot has found a reproducer for the following crash on: > > HEAD commit:99a8efbb NFC: st21nfca: fix double free > git tree: net > console output: https://syzkaller.appspot.com/x/log.txt?x=15ed70d8e0 > kernel config: https://syzkaller.appspot.com/x/.config?x=cbbed3e8d4eb64bf > dashboard link: https://syzkaller.appspot.com/bug?extid=cd0ec5211ac07c18c049 > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13cf5594e0 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1036c762e0 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+cd0ec5211ac07c18c...@syzkaller.appspotmail.com > > device 5580n entered promiscuous mode > BUG: MAX_LOCKDEP_ENTRIES too low! > turning off the locking correctness validator. > CPU: 0 PID: 14197 Comm: syz-executor527 Not tainted 5.4.0-rc5+ #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x172/0x1f0 lib/dump_stack.c:113 > alloc_list_entry.cold+0x11/0x18 kernel/locking/lockdep.c:1292 > add_lock_to_list kernel/locking/lockdep.c:1313 [inline] > check_prev_add kernel/locking/lockdep.c:2528 [inline] > check_prevs_add kernel/locking/lockdep.c:2581 [inline] > validate_chain kernel/locking/lockdep.c:2971 [inline] > __lock_acquire+0x2a15/0x4a00 kernel/locking/lockdep.c:3955 > lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 > __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline] > _raw_spin_lock_bh+0x33/0x50 kernel/locking/spinlock.c:175 > spin_lock_bh include/linux/spinlock.h:343 [inline] > netif_addr_lock_bh include/linux/netdevice.h:4055 [inline] > dev_set_rx_mode+0x20/0x40 net/core/dev.c:7808 > dev_set_promiscuity+0xbf/0xe0 net/core/dev.c:7716 > internal_dev_create+0x387/0x550 net/openvswitch/vport-internal_dev.c:196 > ovs_vport_add+0x150/0x500 net/openvswitch/vport.c:199 > new_vport+0x1b/0x1d0 net/openvswitch/datapath.c:194 > ovs_dp_cmd_new+0x5e5/0xe30 net/openvswitch/datapath.c:1644 > genl_family_rcv_msg+0x74b/0xf90 net/netlink/genetlink.c:629 > genl_rcv_msg+0xca/0x170 net/netlink/genetlink.c:654 > netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 > genl_rcv+0x29/0x40 net/netlink/genetlink.c:665 > netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] > netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328 > netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917 > sock_sendmsg_nosec net/socket.c:637 [inline] > sock_sendmsg+0xd7/0x130 net/socket.c:657 > ___sys_sendmsg+0x803/0x920 net/socket.c:2311 > __sys_sendmsg+0x105/0x1d0 net/socket.c:2356 > __do_sys_sendmsg net/socket.c:2365 [inline] > __se_sys_sendmsg net/socket.c:2363 [inline] > __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363 > do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x441779 > Code: e8 9c ad 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff > 0f 83 1b 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00 > RSP: 002b:7ffea7e5fcc8 EFLAGS: 0246 ORIG_RAX: 002e > RAX: ffda RBX: 0003 RCX: 00441779 > RDX: RSI: 2240 RDI: 0003 > RBP: 00058f66 R08: 7ffe0025 R09: 7ffe0025 > R10: 0004 R11: 0246 R12: 006cdbc0 > R13: 0013 R14: R15: > -- Jan Kara SUSE Labs, CR ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds
Hi David > -Original Message- > From: David Wilder > Sent: Thursday, November 7, 2019 3:21 AM > To: ovs-dev@openvswitch.org > Cc: i.maxim...@ovn.org; b...@ovn.org; Yanqin Wei (Arm Technology China) > ; wil...@us.ibm.com > Subject: [PATCH v3] travis: support ppc64le builds > > Add support for travis-ci ppc64le builds. > > - Updated matrix in .travis.yml to include an arch: ppc64le build. > - Move package install needed for 32bit builds to .travis/linux-prepare.sh. > > To keep the total build time at an acceptable level only a single build job is > included in the matrix for ppc64le. > > A build report example can be found here [1] [0] http://travis-ci.org/ [1] > https://travis-ci.org/djlwilder/ovs/builds/607851729 > > Signed-off-by: David Wilder > --- > Addressed review comments: > - Cleaned up linux-prepare.sh (v2) > - Switch from os: linux-ppc64le to arch: ppc64le (v3) > > .travis.yml | 5 +++-- > .travis/linux-prepare.sh | 5 - > 2 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/.travis.yml b/.travis.yml > index 482efd2d1..308c09635 100644 > --- a/.travis.yml > +++ b/.travis.yml > @@ -14,7 +14,6 @@ addons: >apt: > packages: >- bc > - - gcc-multilib >- libssl-dev >- llvm-dev >- libjemalloc1 > @@ -26,7 +25,6 @@ addons: >- libelf-dev >- selinux-policy-dev >- libunbound-dev > - - libunbound-dev:i386 >- libunwind-dev > > before_install: ./.travis/${TRAVIS_OS_NAME}-prepare.sh > @@ -52,6 +50,9 @@ matrix: > - os: osx >compiler: clang >env: OPTS="--disable-ssl" > +- arch: ppc64le > + compiler: gcc > + env: OPTS="--disable-ssl" > > script: ./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS > > diff --git a/.travis/linux-prepare.sh b/.travis/linux-prepare.sh index > 9e3ac0df7..d66f480c6 100755 > --- a/.travis/linux-prepare.sh > +++ b/.travis/linux-prepare.sh > @@ -18,7 +18,10 @@ pip install --user --upgrade docutils if [ "$M32" ]; then > # 32-bit and 64-bit libunwind can not be installed at the same time. > # This will remove the 64-bit libunwind and install 32-bit version. > -sudo apt-get install -y libunwind-dev:i386 > +sudo apt-get install -y \ > + gcc-multilib \ > + libunwind-dev:i386 \ > + libunbound-dev:i386 [Yanqin] They are x86 specific dependency. It is better to use "$TRAVIS_ARCH" == "amd64" condition. [Yanqin] Is gcc-multilib only required for 32bits build? > fi > > # IPv6 is supported by kernel but disabled in TravisCI images: > -- > 2.23.0.162.gf1d4a28 IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds
From: Numan Siddique The below failure is seen checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3 checking where Python six library is available... configure: error: Missing Python six library. The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1. This patch fixes it. Signed-off-by: Numan Siddique --- .travis/osx-prepare.sh | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh index 4725fd829..7f639a62e 100755 --- a/.travis/osx-prepare.sh +++ b/.travis/osx-prepare.sh @@ -1,7 +1,5 @@ #!/bin/bash set -ev -pip2 install --user six -pip2 install --user --upgrade docutils +pip3 install --user six +pip3 install --user --upgrade docutils -brew update || true -brew uninstall libtool && brew install libtool || true -- 2.23.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection
From: Frode Nordahl Signed-off-by: Frode Nordahl Acked-by: Aliasgar Ginwala --- .../topics/role-based-access-control.rst | 7 ++ Documentation/tutorials/ovn-rbac.rst | 25 +++ 2 files changed, 32 insertions(+) diff --git a/Documentation/topics/role-based-access-control.rst b/Documentation/topics/role-based-access-control.rst index 2acd1e88b..e13e2d5dc 100644 --- a/Documentation/topics/role-based-access-control.rst +++ b/Documentation/topics/role-based-access-control.rst @@ -82,6 +82,13 @@ command: $ ovn-sbctl set-connection role=ovn-controller ssl:192.168.0.1:6642 +.. note:: + + There is currently no pre-defined role for ovn-northd. You must configure + a separate listener on the OVN southbound database that ovn-northd can + connect to if your deployment topology require ovn-northd to connect to a + OVN southbound database instance on a remote machine. + Pre-defined Roles - This section describes roles that have been defined internally by OVS/OVN. diff --git a/Documentation/tutorials/ovn-rbac.rst b/Documentation/tutorials/ovn-rbac.rst index 22b169d6d..fc2de5d5d 100644 --- a/Documentation/tutorials/ovn-rbac.rst +++ b/Documentation/tutorials/ovn-rbac.rst @@ -132,3 +132,28 @@ Configuring RBAC /path/to/chassis_2-cert.pem /path/to/cacert.pem $ ovs-vsctl set open_vswitch . \ external_ids:ovn-remote=ssl:machine_3-ip:6642 + +The OVN central control daemon and RBAC +~~~ + +The OVN central control daemon (`ovn-northd`) needs full write access to +the southbound database. When you have one machine hosting the central +components, `ovn-northd` can talk to the databases through a local unix +socket, bypassing the `ovn-controller` RBAC configured for the listener +at port '6642'. However, if you want to deploy multiple machines for +hosting the central components, `ovn-northd` will require a remote +connection to all of them. + +1. Configure the southbound database with a second SSL listener on a + separate port without RBAC enabled for use by `ovn-northd`. + + In `machine_3`:: + + $ ovn-sbctl -- --id=@conn_uuid create Connection \ + target="pssl\:16642" \ + -- add SB_Global . connections=@conn_uuid + + .. note:: + + Care should be taken to restrict access to the above mentioned port + so that only trusted machines can connect to it. -- 2.23.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection
On Fri, Nov 8, 2019 at 11:22 AM Frode Nordahl wrote: > > Signed-off-by: Frode Nordahl > Acked-by: Aliasgar Ginwala > Submitted-at: https://github.com/ovn-org/ovn/pull/25 I applied this patch to master. Sorry I didn't notice that you already had sent the patch to the ML and I resubmitted here - https://patchwork.ozlabs.org/patch/1191808/. Thanks Numan > --- > .../topics/role-based-access-control.rst | 7 ++ > Documentation/tutorials/ovn-rbac.rst | 25 +++ > 2 files changed, 32 insertions(+) > > diff --git a/Documentation/topics/role-based-access-control.rst > b/Documentation/topics/role-based-access-control.rst > index 2acd1e88b..e13e2d5dc 100644 > --- a/Documentation/topics/role-based-access-control.rst > +++ b/Documentation/topics/role-based-access-control.rst > @@ -82,6 +82,13 @@ command: > > $ ovn-sbctl set-connection role=ovn-controller ssl:192.168.0.1:6642 > > +.. note:: > + > + There is currently no pre-defined role for ovn-northd. You must configure > + a separate listener on the OVN southbound database that ovn-northd can > + connect to if your deployment topology require ovn-northd to connect to a > + OVN southbound database instance on a remote machine. > + > Pre-defined Roles > - > This section describes roles that have been defined internally by OVS/OVN. > diff --git a/Documentation/tutorials/ovn-rbac.rst > b/Documentation/tutorials/ovn-rbac.rst > index 22b169d6d..fc2de5d5d 100644 > --- a/Documentation/tutorials/ovn-rbac.rst > +++ b/Documentation/tutorials/ovn-rbac.rst > @@ -132,3 +132,28 @@ Configuring RBAC > /path/to/chassis_2-cert.pem /path/to/cacert.pem >$ ovs-vsctl set open_vswitch . \ > external_ids:ovn-remote=ssl:machine_3-ip:6642 > + > +The OVN central control daemon and RBAC > +~~~ > + > +The OVN central control daemon (`ovn-northd`) needs full write access to > +the southbound database. When you have one machine hosting the central > +components, `ovn-northd` can talk to the databases through a local unix > +socket, bypassing the `ovn-controller` RBAC configured for the listener > +at port '6642'. However, if you want to deploy multiple machines for > +hosting the central components, `ovn-northd` will require a remote > +connection to all of them. > + > +1. Configure the southbound database with a second SSL listener on a > + separate port without RBAC enabled for use by `ovn-northd`. > + > + In `machine_3`:: > + > + $ ovn-sbctl -- --id=@conn_uuid create Connection \ > + target="pssl\:16642" \ > + -- add SB_Global . connections=@conn_uuid > + > + .. note:: > + > + Care should be taken to restrict access to the above mentioned port > + so that only trusted machines can connect to it. > -- > 2.20.1 > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection
On Fri, Nov 8, 2019 at 11:56 AM Numan Siddique wrote: > > On Fri, Nov 8, 2019 at 11:22 AM Frode Nordahl > wrote: > > > > Signed-off-by: Frode Nordahl > > Acked-by: Aliasgar Ginwala > > Submitted-at: https://github.com/ovn-org/ovn/pull/25 > > I applied this patch to master. > Sorry I didn't notice that you already had sent the patch to the ML > and I resubmitted here - https://patchwork.ozlabs.org/patch/1191808/. No worries, and thank you for the merge! I'll stick to sending patches through ml/patchworks in the future. -- Frode Nordahl > Thanks > Numan > > > --- > > .../topics/role-based-access-control.rst | 7 ++ > > Documentation/tutorials/ovn-rbac.rst | 25 +++ > > 2 files changed, 32 insertions(+) > > > > diff --git a/Documentation/topics/role-based-access-control.rst > > b/Documentation/topics/role-based-access-control.rst > > index 2acd1e88b..e13e2d5dc 100644 > > --- a/Documentation/topics/role-based-access-control.rst > > +++ b/Documentation/topics/role-based-access-control.rst > > @@ -82,6 +82,13 @@ command: > > > > $ ovn-sbctl set-connection role=ovn-controller ssl:192.168.0.1:6642 > > > > +.. note:: > > + > > + There is currently no pre-defined role for ovn-northd. You must > > configure > > + a separate listener on the OVN southbound database that ovn-northd can > > + connect to if your deployment topology require ovn-northd to connect to > > a > > + OVN southbound database instance on a remote machine. > > + > > Pre-defined Roles > > - > > This section describes roles that have been defined internally by OVS/OVN. > > diff --git a/Documentation/tutorials/ovn-rbac.rst > > b/Documentation/tutorials/ovn-rbac.rst > > index 22b169d6d..fc2de5d5d 100644 > > --- a/Documentation/tutorials/ovn-rbac.rst > > +++ b/Documentation/tutorials/ovn-rbac.rst > > @@ -132,3 +132,28 @@ Configuring RBAC > > /path/to/chassis_2-cert.pem /path/to/cacert.pem > >$ ovs-vsctl set open_vswitch . \ > > external_ids:ovn-remote=ssl:machine_3-ip:6642 > > + > > +The OVN central control daemon and RBAC > > +~~~ > > + > > +The OVN central control daemon (`ovn-northd`) needs full write access to > > +the southbound database. When you have one machine hosting the central > > +components, `ovn-northd` can talk to the databases through a local unix > > +socket, bypassing the `ovn-controller` RBAC configured for the listener > > +at port '6642'. However, if you want to deploy multiple machines for > > +hosting the central components, `ovn-northd` will require a remote > > +connection to all of them. > > + > > +1. Configure the southbound database with a second SSL listener on a > > + separate port without RBAC enabled for use by `ovn-northd`. > > + > > + In `machine_3`:: > > + > > + $ ovn-sbctl -- --id=@conn_uuid create Connection \ > > + target="pssl\:16642" \ > > + -- add SB_Global . connections=@conn_uuid > > + > > + .. note:: > > + > > + Care should be taken to restrict access to the above mentioned port > > + so that only trusted machines can connect to it. > > -- > > 2.20.1 > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection
Bleep bloop. Greetings Numan Siddique, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: Committer Numan Siddique needs to sign off. Lines checked: 66, Warnings: 0, Errors: 1 Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] L2 acl on the logical switch
Your email is really cluttered. Could you please send again with plain text ? Thanks Numan On Fri, Nov 8, 2019 at 6:22 AM venugopal iyer via dev wrote: > > [Pardon the length of the mail; have left out the ofctl flows, but if it > helps i can send them too] > Assuming :logical Switch (ls1) with > ls1_vm1 : 02:ac:10:ff:00:11/172.16.255.11 > ls1_vm2 : 02:ac:10:ff:00:22/172.16.255.22 > ls1_vm3 : 02:ac:10:ff:00:33/172.16.255.33 > Looking to :have MAC ACLs between vm1 and vm2 so they can't talk to > each other. > Using:# ovn-nbctl create Address_Set name=macs > addresses=\"02:ac:10:ff:00:11\",\"02:ac:10:ff:00:22\"# ovn-nbctl > create Port_Group name=l2pg# ovn-nbctl add Port_Group l2pg ports > # ovn-nbctl add Port_Group l2pg ports # > ovn-nbctl acl-add ls1 to-lport 32767 "outport == @l2pg && eth.src == \$macs" > drop > What we are seeing: The above, by itself, will limit L3 and L2 between > vm1 and vm2, but not vm3 (as expected) > lflow: table=4 (ls_out_acl ), > priority=33767, match=(outport == @l2pg && eth.src == $macs), action=(/* drop > */) > dpflow (sending a arbit packet of ether type 0x7a05):-- > > (0),in_port(vm1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x7a05), > packets:0, bytes:0, used:never, actions:drop > However, if there is a stateful rule, e.g.:# ovn-nbctl acl-add ls1 > from-lport 2000 "inport == \"ls1-vm3\" && ip" allow-related > Then, the behavior differs > lflow:--- table=6 (ls_in_acl ), > priority=3000 , match=(!ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 > && (inport == "ls1-vm3" && ip)), action=(next;) table=6 (ls_in_acl >), priority=3000 , match=(((ct.new && !ct.est) || (!ct.new && ct.est > && !ct.rpl && ct_label.blocked == 1)) && (inport == "ls1-vm3" && ip)), > action=(reg0[1] = 1; next;) table=4 (ls_out_acl ), > priority=33767, match=((!ct.est || (ct.est && ct_label.blocked == 1)) && > (outport == @l2pg && eth.src == $macs)), action=(/* drop */) table=4 > (ls_out_acl ), priority=33767, match=(ct.est && ct_label.blocked == 0 > && (outport == @l2pg && eth.src == $macs)), action=(ct_commit(ct_label=1/1); > /* drop */) > *Note: the condition !ct.new && ct.est && !ct.rpl && ct_label.blocked == > 0** > This does block L3 traffic: > dpctl flow for ICMP- > recirc_id(0),in_port(vm1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8), > packets:6, bytes:588, used:0.864s, actions:ct(zone=1),recirc(0x1) > recirc_id(0x1),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(dst=02:ac:10:ff:00:22),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8), > packets:6, bytes:588, used:0.864s, > actions:ct(commit,zone=1,label=0/0x1),ct(zone=4),recirc(0x2) > recirc_id(0x2),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(frag=no), > packets:6, bytes:588, used:0.864s, actions:drop > recirc_id(0),in_port(vm1),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x0806),arp(sha=02:ac:10:ff:00:11), > packets:0, bytes:0, used:never, actions:vm2 > recirc_id(0),in_port(vm2),ct_state(-new-est-rel-rpl-inv -trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:22,dst=02:ac:10:ff:00:11),eth_type(0x0806),arp(sha=02:ac:10:ff:00:22), packets:0, bytes:0, used:never, actions:vm1 > > But, it doesn't block L2, see the ARP going through, without the stateful > rule the ARP would have been dropped too. > Using the arbit packet: > > recirc_id(0),in_port(vm1),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x7a05), > packets:0, bytes:0, used:never, actions:vm2 > > > From what i understand it is because of the condition "!ct.new && ct.est && > !ct.rpl && ct_label.blocked == 0", which implies trk, I suppose, which makes > sense for IP packets, but probably not for non-IP packets, i.e. the rules > above have "-trk". Hence, I believe it gets through > > Proposed changes: > If my understanding is right (and objective to remove the inconsistency > between when there are stateful rules and not, w.r.t. L2 packets), we > couldhave the condition as: > # git diffdiff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.cindex > 6c6de2afd..b0f524531 100644--- a/ovn/northd/ovn-northd.c+++ > b/ovn/northd/ovn-northd.c@@ -4191,10 +4191,13 @@ consider_acl(struct hmap > *lflows, struct ovn_datapath *od, * depending on whether the > connection was previously committed * to the connection tracker with >
[ovs-dev] [PATCH ovn 0/3] Improve ovn-detrace support for parsing OpenFlow cookies.
Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow translations.") added support for more types of cookies (partial SB uuids) that are stored in OpenFlow entries created by ovn-controller. The last patch in this series implements support for parsing all these different cookies in ovn-detrace too. The first two patches are bug fixes required as ovn-detrace was broken after moving OVN to its own separte rundir. CC: Han Zhou Signed-off-by: Dumitru Ceara Dumitru Ceara (3): ovn-detrace: Fix rundir. ovn-detrace: Fix line parsing. ovn-detrace: Add support for other types of SB cookies. utilities/ovn-detrace.in | 186 -- 1 file changed, 129 insertions(+), 57 deletions(-) ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn 2/3] ovn-detrace: Fix line parsing.
The script was never parsing the first cookie in the input. Also, add a check to make sure that the cookie refers to a Logical_Flow before trying to print the record. Signed-off-by: Dumitru Ceara --- utilities/ovn-detrace.in | 24 ++-- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in index 9471e37..34b6b0e 100755 --- a/utilities/ovn-detrace.in +++ b/utilities/ovn-detrace.in @@ -188,22 +188,26 @@ def main(): cookie = None while True: line = sys.stdin.readline() + +if line == '': +break + +line = line.strip() + if cookie: # print lflow info when the current flow block ends -if regex_table_id.match(line) or line.strip() == '': +if regex_table_id.match(line): lflow = get_lflow_from_cookie(ovsdb_ovnsb, cookie) -print_lflow(lflow, " * ") -print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb) -cookie = None +if lflow: +print_lflow(lflow, " * ") +print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb) +cookie = None -print line.strip() -if line == "": -break +print line m = regex_cookie.match(line) -if not m: -continue -cookie = m.group(1) +if m: +cookie = m.group(1) if __name__ == "__main__": ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn 1/3] ovn-detrace: Fix rundir.
After the separation of OVS and OVN rundirs, the ovn-detrace script hasn't been updated to use the OVN rundir instead of the OVS one. Signed-off-by: Dumitru Ceara --- utilities/ovn-detrace.in |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in index c842adc..9471e37 100755 --- a/utilities/ovn-detrace.in +++ b/utilities/ovn-detrace.in @@ -169,16 +169,16 @@ def main(): "(use --help for help)\n" % argv0) sys.exit(1) -ovs_rundir = os.getenv('OVS_RUNDIR', '@RUNDIR@') +ovn_rundir = os.getenv('OVN_RUNDIR', '@OVN_RUNDIR@') if not ovnsb_db: ovnsb_db = os.getenv('OVN_SB_DB') if not ovnsb_db: -ovnsb_db = 'unix:%s/ovnsb_db.sock' % ovs_rundir +ovnsb_db = 'unix:%s/ovnsb_db.sock' % ovn_rundir if not ovnnb_db: ovnnb_db = os.getenv('OVN_NB_DB') if not ovnnb_db: -ovnnb_db = 'unix:%s/ovnnb_db.sock' % ovs_rundir +ovnnb_db = 'unix:%s/ovnnb_db.sock' % ovn_rundir ovsdb_ovnsb = OVSDB(ovnsb_db, 'OVN_Southbound') ovsdb_ovnnb = OVSDB(ovnnb_db, 'OVN_Northbound') ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn 3/3] ovn-detrace: Add support for other types of SB cookies.
Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow translations.") added cookies for Port_Binding, Mac_Binding, Multicast_Group, Chassis records to the OpenfFlow entries they generate. Add support for parsing such cookies in ovn-detrace too. Also, refactor ovn-detrace to allow a more generic way of defining cookie handlers. Signed-off-by: Dumitru Ceara --- utilities/ovn-detrace.in | 166 -- 1 file changed, 117 insertions(+), 49 deletions(-) diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in index 34b6b0e..79c6d3b 100755 --- a/utilities/ovn-detrace.in +++ b/utilities/ovn-detrace.in @@ -98,48 +98,112 @@ class OVSDB(object): def find_row_by_partial_uuid(self, table_name, value): return self._find_row(table_name, lambda row: value in str(row.uuid)) - -def get_lflow_from_cookie(ovnsb_db, cookie): -return ovnsb_db.find_row_by_partial_uuid('Logical_Flow', cookie) - - -def print_lflow(lflow, prefix): -ldp_uuid = lflow.logical_datapath.uuid -ldp_name = str(lflow.logical_datapath.external_ids.get('name')) - -print '%sLogical datapath: "%s" (%s) [%s]' % (prefix, - ldp_name, - ldp_uuid, - lflow.pipeline) -print "%sLogical flow: table=%s (%s), priority=%s, " \ - "match=(%s), actions=(%s)" % (prefix, -lflow.table_id, -lflow.external_ids.get('stage-name'), -lflow.priority, -str(lflow.match).strip('"'), -str(lflow.actions).strip('"')) - - -def print_lflow_nb_hint(lflow, prefix, ovnnb_db): -external_ids = lflow.external_ids -if external_ids.get('stage-name') in ['ls_in_acl', - 'ls_out_acl']: -acl_hint = external_ids.get('stage-hint') -if not acl_hint: -return -acl = ovnnb_db.find_row_by_partial_uuid('ACL', acl_hint) -if not acl: +class CookieHandler(object): +def __init__(self, db, table): +self._db = db +self._table = table + +def get_record(self, cookie): +return self._db.find_row_by_partial_uuid(self._table, cookie) + +def print_record(self, record, prefix): +pass + +def print_hint(self, record, prefix, db): +pass + +def datapath_str(datapath): +return '"%s" (%s)' % (str(datapath.external_ids.get('name')), + datapath.uuid) + +def chassis_str(chassis): +if len(chassis) == 0: +return '' +ch = chassis[0] +return 'chassis-name "%s", chassis-str "%s"' % (ch.name, ch.hostname) + +class LogicalFlowHandler(CookieHandler): +def __init__(self, ovnsb_db): +super(LogicalFlowHandler, self).__init__(ovnsb_db, 'Logical_Flow') + +def print_record(self, lflow, prefix): +print('%sLogical datapath: %s [%s]' % +(prefix, datapath_str(lflow.logical_datapath), lflow.pipeline)) +print('%sLogical flow: table=%s (%s), priority=%s, ' + 'match=(%s), actions=(%s)' % +(prefix, lflow.table_id, lflow.external_ids.get('stage-name'), + lflow.priority, + str(lflow.match).strip('"'), + str(lflow.actions).strip('"'))) + +def print_hint(self, lflow, prefix, ovnnb_db): +external_ids = lflow.external_ids +if external_ids.get('stage-name') in ['ls_in_acl', + 'ls_out_acl']: +acl_hint = external_ids.get('stage-hint') +if not acl_hint: +return +acl = ovnnb_db.find_row_by_partial_uuid('ACL', acl_hint) +if not acl: +return +output = '%sACL: %s, priority=%s, ' \ + 'match=(%s), %s' % (prefix, + acl.direction, + acl.priority, + acl.match.strip('"'), + acl.action) +if acl.log: +output += ' (log)' +print(output) + +class PortBindingHandler(CookieHandler): +def __init__(self, ovnsb_db): +super(PortBindingHandler, self).__init__(ovnsb_db, 'Port_Binding') + +def print_record(self, pb, prefix): +ldp_uuid = pb.datapath.uuid +ldp_name = str(pb.datapath.external_ids.get('name')) + +print('%sLogical datapath: %s' % (prefix, datapath_str(pb.datapath))) +print('%sPort Binding: logical_port "%s", tunnel_key %ld, %s' % +(prefix, pb.logical_port, pb.tunnel_key, + chassis_str(pb.chassis))) + +class MacBindingHandler(CookieHandler): +def __init__(self, ovnsb_db
Re: [ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds
Hi Ilya, If you get some time, could you please take a look at this patch ? Thanks Numan On Fri, Nov 8, 2019 at 4:13 PM wrote: > > From: Numan Siddique > > The below failure is seen > > > checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3 > checking where Python six library is available... configure: error: Missing > Python six library. > The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1. > > > This patch fixes it. > > Signed-off-by: Numan Siddique > --- > .travis/osx-prepare.sh | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh > index 4725fd829..7f639a62e 100755 > --- a/.travis/osx-prepare.sh > +++ b/.travis/osx-prepare.sh > @@ -1,7 +1,5 @@ > #!/bin/bash > set -ev > -pip2 install --user six > -pip2 install --user --upgrade docutils > +pip3 install --user six > +pip3 install --user --upgrade docutils > > -brew update || true > -brew uninstall libtool && brew install libtool || true > -- > 2.23.0 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds
On 08.11.2019 13:12, Numan Siddique wrote: Hi Ilya, Hi Numan, Comments inline. Best regards, Ilya Maximets. If you get some time, could you please take a look at this patch ? Thanks Numan On Fri, Nov 8, 2019 at 4:13 PM wrote: From: Numan Siddique The below failure is seen checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3 checking where Python six library is available... configure: error: Missing Python six library. The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1. This patch fixes it. Signed-off-by: Numan Siddique --- .travis/osx-prepare.sh | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh index 4725fd829..7f639a62e 100755 --- a/.travis/osx-prepare.sh +++ b/.travis/osx-prepare.sh @@ -1,7 +1,5 @@ #!/bin/bash set -ev -pip2 install --user six -pip2 install --user --upgrade docutils +pip3 install --user six +pip3 install --user --upgrade docutils This is required because OVS requires python3 now. This might make sense to point to that fact in commit-message. This empty line should be removed too as it is the last in a file. -brew update || true -brew uninstall libtool && brew install libtool || true This part is just an optimization. Doesn't really related to this fix. Some details in a corresponding OVS commit: 3bfc9c1c30d5 ("travis: Drop OSX workarounds.") I'm OK with the change in general. It might be better to mention in commit message why libtool update was also removed. Otherwise, Acked-by: Ilya Maximets ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.
On Thu, Nov 07, 2019 at 02:28:18PM +0100, Eelco Chaudron wrote: > > > On 7 Nov 2019, at 14:21, Ilya Maximets wrote: > > >On 07.11.2019 13:39, Eelco Chaudron wrote: > >> > >> > >>On 7 Nov 2019, at 12:36, Ilya Maximets wrote: > >> > >>>Until now there was only two options for XDP mode in OVS: SKB or DRV. > >>>i.e. 'generic XDP' or 'native XDP with zero-copy enabled'. > >>> > >>>Devices like 'veth' interfaces in Linux supports native XDP, but > >>>doesn't support zero-copy mode. This case can not be covered by > >>>existing API and we have to use slower generic XDP for such devices. > >>>There are few more issues, e.g. TCP is not supported in generic XDP > >>>mode for veth interfaces due to kernel limitations, however it is > >>>supported in native mode. > >>> > >>>This change introduces ability to use native XDP without zero-copy > >>>along with best-effort configuration option that enabled by default. > >>>In best-effort case OVS will sequentially try different modes starting > >>>from the fastest one and will choose the first acceptable for current > >>>interface. This will guarantee the best possible performance. > >>> > >>>If user will want to choose specific mode, it's still possible by > >>>setting the 'options:xdp-mode'. > >>> > >>>This change additionally changes the API by renaming the configuration > >>>knob from 'xdpmode' to 'xdp-mode' and also renaming the modes > >>>themselves to be more user-friendly. > >>> > >>>The full list of currently supported modes: > >>> * native-with-zerocopy - former DRV > >>> * native - new one, DRV without zero-copy > >>> * generic - former SKB > >>> * best-effort - new one, chooses the best available from > >>> 3 above modes > >>> > >>>Since 'best-effort' is a default mode, users will not need to > >>>explicitely set 'xdp-mode' in most cases. > >>> > >>>TCP related tests enabled back in system afxdp testsuite, because > >>>'best-effort' will choose 'native' mode for veth interfaces > >>>and this mode has no issues with TCP. > >>> > >>>Signed-off-by: Ilya Maximets > >> > >>No review, I was running some performance tests for the OVS conference > >>presentation so quickly tried this patch (assuming performance would > >>increase :)… > >> > >>So with the native option (default for tap) I see a decrease, :(, in > >>performance over the skb mode (this is with my usual ovs_perf PVP test > >>setup). > >> > >>With the patch applied, and default configuration > >>(xdp-mode-in-use=native-with-zerocopy for my tap): > > > >Hmm. tap supports zero-copy? Interesting. No, tap doesn't support zero-copy mode. > >Have you tried forcing 'native' mode without zero-copy? > >Maybe it will make sense to enable/disable need-wakeup feature. > > Oops, wrong cut/paste: xdp-mode-in-use=native > > All my the tests where with use-need-wakeup=true > > port 2: tapVM (afxdp: n_rxq=1, use-need-wakeup=true, xdp-mode=best-effort, > xdp-mode-in-use=native) > > > > >> > >>"Physical to Virtual to Physical test, L3 flows[port redirect]" > >>, Packet size > >>Number of flows, 64 > >> 1, 711581 > >> 100, 673152 > >>1000, 617733 > >> > >>Here you will even see a performance drop with multiple IP flows (this > >>is a single queue config). > > > >This is strange.. One explanation might be that For using tap with native mode, OVS is sending XDP frame into the kernel, but since tap device eventually has to create an skb and forward to vhost_net then QEMU, so anyway we are having skb overhead. On the other hand For using tap with skb mode, there might be more optimization exist in kernel, some batch allocation and batch rx/tx (See driver/net/tun.c) --William ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.
On Thu, Nov 07, 2019 at 12:36:33PM +0100, Ilya Maximets wrote: > Until now there was only two options for XDP mode in OVS: SKB or DRV. > i.e. 'generic XDP' or 'native XDP with zero-copy enabled'. > > Devices like 'veth' interfaces in Linux supports native XDP, but > doesn't support zero-copy mode. This case can not be covered by > existing API and we have to use slower generic XDP for such devices. > There are few more issues, e.g. TCP is not supported in generic XDP > mode for veth interfaces due to kernel limitations, however it is > supported in native mode. > > This change introduces ability to use native XDP without zero-copy > along with best-effort configuration option that enabled by default. > In best-effort case OVS will sequentially try different modes starting > from the fastest one and will choose the first acceptable for current > interface. This will guarantee the best possible performance. > > If user will want to choose specific mode, it's still possible by > setting the 'options:xdp-mode'. > > This change additionally changes the API by renaming the configuration > knob from 'xdpmode' to 'xdp-mode' and also renaming the modes > themselves to be more user-friendly. > > The full list of currently supported modes: > * native-with-zerocopy - former DRV > * native - new one, DRV without zero-copy > * generic - former SKB > * best-effort - new one, chooses the best available from >3 above modes Since we are renaming the mode, in doc, should we tell user the mapping of these mode to kernel AF_XDP's mode? So let user know generic mode in OVS = generic SKB in kernel, native mode in OVS = native mode without zc... > > Since 'best-effort' is a default mode, users will not need to > explicitely set 'xdp-mode' in most cases. > > TCP related tests enabled back in system afxdp testsuite, because > 'best-effort' will choose 'native' mode for veth interfaces > and this mode has no issues with TCP. > > Signed-off-by: Ilya Maximets > --- LGTM. Acked-by: William Tu I look through the patch and everything looks good. Thanks for fixing the documentation. 'make check-afxdp' also pass the previous failed cases due to TCP. It would also be good if this 'best-effort' mode is supported in libbpf. I think other users of af_xdp also have the same configuration headache. Regards, William > > With this patch I modified the user-visible API, but I think it's OK > since it's still an experimental netdev. Comments are welcome. > > Version 2: > * Rebased on current master. > > Documentation/intro/install/afxdp.rst | 54 --- > NEWS | 12 +- > lib/netdev-afxdp.c| 223 -- > lib/netdev-afxdp.h| 9 ++ > lib/netdev-linux-private.h| 8 +- > tests/system-afxdp-macros.at | 7 - > vswitchd/vswitch.xml | 38 +++-- > 7 files changed, 227 insertions(+), 124 deletions(-) > > diff --git a/Documentation/intro/install/afxdp.rst > b/Documentation/intro/install/afxdp.rst > index a136db0c9..937770ad0 100644 > --- a/Documentation/intro/install/afxdp.rst > +++ b/Documentation/intro/install/afxdp.rst > @@ -153,9 +153,8 @@ To kick start end-to-end autotesting:: >make check-afxdp TESTSUITEFLAGS='1' > > .. note:: > - Not all test cases pass at this time. Currenly all TCP related > - tests, ex: using wget or http, are skipped due to XDP limitations > - on veth. cvlan test is also skipped. > + Not all test cases pass at this time. Currenly all cvlan tests are skipped > + due to kernel issues. > > If a test case fails, check the log at:: > > @@ -177,33 +176,35 @@ in :doc:`general`:: >ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev > > Make sure your device driver support AF_XDP, netdev-afxdp supports > -the following additional options (see man ovs-vswitchd.conf.db for > +the following additional options (see ``man ovs-vswitchd.conf.db`` for > more details): > > - * **xdpmode**: use "drv" for driver mode, or "skb" for skb mode. > + * ``xdp-mode``: ``best-effort``, ``native-with-zerocopy``, > + ``native`` or ``generic``. Defaults to ``best-effort``, i.e. best of > + supported modes, so in most cases you don't need to change it. > > - * **use-need-wakeup**: default "true" if libbpf supports it, otherwise > false. > + * ``use-need-wakeup``: default ``true`` if libbpf supports it, > + otherwise ``false``. > > For example, to use 1 PMD (on core 4) on 1 queue (queue 0) device, > -configure these options: **pmd-cpu-mask, pmd-rxq-affinity, and n_rxq**. > -The **xdpmode** can be "drv" or "skb":: > +configure these options: ``pmd-cpu-mask``, ``pmd-rxq-affinity``, and > +``n_rxq``:: > >ethtool -L enp2s0 combined 1 >ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10 >ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \ >
[ovs-dev] [PATCHv4] netdev-afxdp: Enable loading XDP program.
Now netdev-afxdp always forwards all packets to userspace because it is using libbpf's default XDP program, see 'xsk_load_xdp_prog'. There are some cases when users want to keep packets in kernel instead of sending to userspace, for example, management traffic such as SSH should be processed in kernel. The patch enables loading the user-provide XDP program by doing $ovs-vsctl -- set int afxdp-p0 options:xdp-obj= So users can implement their filtering logic or traffic steering idea in their XDP program, and rest of the traffic passes to AF_XDP socket handled by OVS. Signed-off-by: William Tu --- v4: Feedbacks from Eelco. - First load the program, then configure xsk. Let API take care of xdp prog and map loading, don't set XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD. - When loading custom xdp, need to close(prog_fd) and close(map_fd) to release the resources - make sure prog and map is unloaded by bpftool. - update doc, afxdp.rst - Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/608986781 v3: Feedbacks from Eelco. - keep using xdpobj not xdp-obj (because we alread use xdpmode) or we change both to xdp-obj and xdp-mode? - log a info message when using external program for better debugging - combine some failure messages - update doc NEW: - add options:xdpobj=__default__, to set back to libbpf default prog - Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/606153231 v2: A couple fixes and remove RFC --- Documentation/intro/install/afxdp.rst | 59 + lib/netdev-afxdp.c| 121 +++--- lib/netdev-linux-private.h| 4 ++ 3 files changed, 176 insertions(+), 8 deletions(-) diff --git a/Documentation/intro/install/afxdp.rst b/Documentation/intro/install/afxdp.rst index a136db0c950a..d95a85f39035 100644 --- a/Documentation/intro/install/afxdp.rst +++ b/Documentation/intro/install/afxdp.rst @@ -273,6 +273,65 @@ Or, use OVS pmd tool:: ovs-appctl dpif-netdev/pmd-stats-show +Loading Custom XDP Program +-- +By defailt, netdev-afxdp always forwards all packets to userspace because +it is using libbpf's default XDP program. There are some cases when users +want to keep packets in kernel instead of sending to userspace, for example, +management traffic such as SSH should be processed in kernel. This can be +done by loading the user-provided XDP program:: + + ovs-vsctl -- set int afxdp-p0 options:xdpobj= + +So users can implement their filtering logic or traffic steering idea +in their XDP program, and rest of the traffic passes to AF_XDP socket +handled by OVS. To set it back to default, use:: + + ovs-vsctl -- set int afxdp-p0 options:xdpobj=__default__ + +Below is a sample C program compiled under kernel's samples/bpf/. + +.. code-block:: c + + #include + #include "bpf_helpers.h" + + #if LINUX_VERSION_CODE < KERNEL_VERSION(5,3,0) + /* Kernel version before 5.3 needed an additional map */ + struct bpf_map_def SEC("maps") qidconf_map = { + .type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(int), + .value_size = sizeof(int), + .max_entries = 64, + }; + #endif + + /* OVS will associate map 'xsks_map' to xsk socket. */ + struct bpf_map_def SEC("maps") xsks_map = { + .type = BPF_MAP_TYPE_XSKMAP, + .key_size = sizeof(int), + .value_size = sizeof(int), + .max_entries = 32, + }; + + SEC("xdp_sock") + int xdp_sock_prog(struct xdp_md *ctx) + { + int index = ctx->rx_queue_index; + + /* Customized by user. + * For example + * 1) filter out all SSH traffic and return XDP_PASS + *for kernel to process. + * 2) Drop unwanted packet by returning XDP_DROP. + */ + + /* Rest of packets goes to AF_XDP. */ + return bpf_redirect_map(&xsks_map, index, 0); + } + char _license[] SEC("license") = "GPL"; + + Example Script -- diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c index af654d498a88..853eeb8a8dbe 100644 --- a/lib/netdev-afxdp.c +++ b/lib/netdev-afxdp.c @@ -21,6 +21,7 @@ #include "netdev-afxdp.h" #include "netdev-afxdp-pool.h" +#include #include #include #include @@ -30,6 +31,7 @@ #include #include #include +#include #include #include @@ -88,9 +90,12 @@ BUILD_ASSERT_DECL(PROD_NUM_DESCS == CONS_NUM_DESCS); #define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base)) +#define LIBBPF_XDP_PROGRAM "__default__" + static struct xsk_socket_info *xsk_configure(int ifindex, int xdp_queue_id, int mode, bool use_need_wakeup); -static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode); +static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode, + int prog_fd, int map_fd); static void xsk_destroy(struct xsk_socket_info *xsk); static int xsk_configure_all(struct netdev *netdev); s
[ovs-dev] [PATCH ovn v6] ovn-northd: Limit ARP/ND broadcast domain whenever possible.
ARP request and ND NS packets for router owned IPs were being flooded in the complete L2 domain (using the MC_FLOOD multicast group). However this creates a scaling issue in scenarios where aggregation logical switches are connected to more logical routers (~350). The logical pipelines of all routers would have to be executed before the packet is finally replied to by a single router, the owner of the IP address. This commit limits the broadcast domain by bypassing the L2 Lookup stage for ARP requests that will be replied by a single router. The packets are forwarded only to the router port that owns the target IP address. IPs that are owned by the routers and for which this fix applies are: - IP addresses configured on the router ports. - VIPs. - NAT IPs. This commit also fixes function get_router_load_balancer_ips() which was incorrectly returning a single address_family even though the IP set could contain a mix of IPv4 and IPv6 addresses. Reported-at: https://bugzilla.redhat.com/1756945 Reported-by: Anil Venkata Signed-off-by: Dumitru Ceara --- v6: - Address Han's comments: - remove flooding of ARPs targeting OVN owned IP addresses. - update ovn-architecture documentation. - rename ARP handling functions. - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to take into account the new way of forwarding ARPs. - Also, properly deal with ARP packets on VLAN-backed networks. v5: Address Numan's comments: update comments & make autotest more robust. v4: Rebase. v3: Properly deal with VXLAN traffic. Address review comments from Numan (add autotests). Fix function get_router_load_balancer_ips. Rebase -> deal with IPv6 NAT too. v2: Move ARP broadcast domain limiting to table S_SWITCH_IN_L2_LKUP to address localnet ports too. --- controller/physical.c| 2 + include/ovn/logical-fields.h | 4 + northd/ovn-northd.8.xml | 16 ++ northd/ovn-northd.c | 337 --- ovn-architecture.7.xml | 19 +++ tests/ovn.at | 307 +-- 6 files changed, 591 insertions(+), 94 deletions(-) diff --git a/controller/physical.c b/controller/physical.c index 500d419..751dbbf 100644 --- a/controller/physical.c +++ b/controller/physical.c @@ -567,6 +567,7 @@ put_replace_chassis_mac_flows(const struct simap *ct_zones, if (tag) { ofpact_put_STRIP_VLAN(ofpacts_p); +put_load(1, MFF_LOG_FLAGS, MLF_RCV_FROM_VLAN_BIT, 1, ofpacts_p); } load_logical_ingress_metadata(localnet_port, &zone_ids, ofpacts_p); replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p); @@ -1124,6 +1125,7 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name, ofpacts_p); } ofpact_put_STRIP_VLAN(ofpacts_p); +put_load(1, MFF_LOG_FLAGS, MLF_RCV_FROM_VLAN_BIT, 1, ofpacts_p); } /* Remember the size with just strip vlan added so far, diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h index 9b7c34f..15e0d1e 100644 --- a/include/ovn/logical-fields.h +++ b/include/ovn/logical-fields.h @@ -57,6 +57,7 @@ enum mff_log_flags_bits { MLF_LOCAL_ONLY_BIT = 4, MLF_NESTED_CONTAINER_BIT = 5, MLF_LOOKUP_MAC_BIT = 6, +MLF_RCV_FROM_VLAN_BIT = 7, }; /* MFF_LOG_FLAGS_REG flag assignments */ @@ -88,6 +89,9 @@ enum mff_log_flags { /* Indicate that the lookup in the mac binding table was successful. */ MLF_LOOKUP_MAC = (1 << MLF_LOOKUP_MAC_BIT), + +/* Indicate that a packet was received on a VLAN backed network. */ +MLF_RCV_FROM_VLAN = (1 << MLF_RCV_FROM_VLAN_BIT), }; /* OVN logical fields diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 0a33dcd..6fbb3ab 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -1005,6 +1005,22 @@ output; +Priority-80 flows for each port connected to a logical router +matching self originated GARP/ARP request/ND packets. These packets +are flooded to the MC_FLOOD which contains all logical +ports. + + + +Priority-75 flows for each IP address/VIP/NAT address owned by a +router port connected to the switch. These flows match ARP requests +and ND packets for the specific IP addresses. Matched packets are +forwarded in the L3 domain only to the router that owns the IP +address and flooded in the L2 domain on all ports except patch +ports connected to logical routers. + + + A priority-70 flow that outputs all packets with an Ethernet broadcast or multicast eth.dst to the MC_FLOOD multicast group. diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 2f0f501..9de3d20 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -210,6 +210,9 @@ enum ovn_stage { #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[4]"
Re: [ovs-dev] [PATCH ovn v5] ovn-northd: Limit ARP/ND broadcast domain whenever possible.
On Thu, Nov 7, 2019 at 7:02 PM Dumitru Ceara wrote: > > On Thu, Nov 7, 2019 at 6:02 PM Han Zhou wrote: > > > > > > > > On Thu, Nov 7, 2019 at 7:43 AM Dumitru Ceara wrote: > > > > > > On Thu, Nov 7, 2019 at 3:51 AM Han Zhou wrote: > > > > > > > > > > > > > > > > On Mon, Nov 4, 2019 at 11:32 AM Numan Siddique wrote: > > > > > > > > > > On Mon, Nov 4, 2019 at 8:37 PM Dumitru Ceara > > > > > wrote: > > > > > > > > > > > > ARP request and ND NS packets for router owned IPs were being > > > > > > flooded in the complete L2 domain (using the MC_FLOOD multicast > > > > > > group). > > > > > > However this creates a scaling issue in scenarios where aggregation > > > > > > logical switches are connected to more logical routers (~350). The > > > > > > logical pipelines of all routers would have to be executed before > > > > > > the > > > > > > packet is finally replied to by a single router, the owner of the IP > > > > > > address. > > > > > > > > > > > > This commit limits the broadcast domain by bypassing the L2 Lookup > > > > > > stage > > > > > > for ARP requests that will be replied by a single router. The > > > > > > packets > > > > > > are still flooded in the L2 domain but not on any of the other patch > > > > > > ports towards other routers connected to the switch. This restricted > > > > > > flooding is done by using a new multicast group (MC_ARP_ND_FLOOD). > > > > > > > > > > > > IPs that are owned by the routers and for which this fix applies > > > > > > are: > > > > > > - IP addresses configured on the router ports. > > > > > > - VIPs. > > > > > > - NAT IPs. > > > > > > > > > > > > This commit also fixes function get_router_load_balancer_ips() which > > > > > > was incorrectly returning a single address_family even though the > > > > > > IP set could contain a mix of IPv4 and IPv6 addresses. > > > > > > > > > > > > Reported-at: https://bugzilla.redhat.com/1756945 > > > > > > Reported-by: Anil Venkata > > > > > > Signed-off-by: Dumitru Ceara > > > > > > > > > > Thanks Dumitru, for addressing the review comments. > > > > > > > > > > Acked-by: Numan Siddique > > > > > > > > > > Han, if you can take a look in this patch and provide your comments, > > > > > that would be great. > > > > > > > > > > > Hi Han, > > > > > > > > > > > Sorry for delayed response :( > > > > > > No worries and thanks for reviewing this change. > > > > > > > > > > > The patch looks good to me except: > > > > > > > > 1. It changes the behavior that when an ARP request for a LRP's IP is > > > > coming to the logical switch, other routers will no longer learn the > > > > MAC-IP binding of the ARP sender. This has been discussed and I tend to > > > > agree with Dumitru that it should not cause real issue. I think it just > > > > worth to be documented clearly in the ovn-architecture, probably in the > > > > section: Logical Routers and Logical Patch Ports, because this is > > > > something different from a traditional switch's behavior, and network > > > > administrators may get suprised. > > > > > > Ack, I'll add this in v6. > > > > > > > > > > > 2. Since we are anyway changing the behavior of ARP request handling > > > > and bypassing logical router ports, and we think it should not cause > > > > real problem, then I wonder why adding the MC_ARP_ND group to still > > > > flood to the non-router ports. Is it really useful? Maybe it is not a > > > > big deal, but I think the extra MC group would add some performance > > > > cost and it is almost redundant with the regular MC_FLOOD except that > > > > there is no router ports in it - it is a lot of redundancy considering > > > > that most LSPs are regular VIFs in typical large scale environments. I > > > > would suggest to simplify this unless there is clear concerns of not > > > > flooding to other ports. > > > > > > If I skip flooding to non-router ports the following test fails > > > because one of the things checked by the test is that we flood > > > broadcasts (including ARPs) in the broadcast domain: > > > 36: ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR FAILED > > > (ovs-macros.at:220) > > > > > > So my concern was that people might expect the ARP requests to be > > > flooded within the L2 broadcast domain. > > > However, we could add configuration knob to change the behavior and > > > only flood them on localnet ports. Like this we could maintain the > > > current L2 flooding but in deployments where this is not necessary the > > > users could disable the flooding to VIF ports and this would remove > > > the MC_ARP_ND group unless there's a localnet port in the datapath. > > > > > > What do you think? > > > > > > > Adding configuration knob is a valid option, but I hoped we could simplify > > the change instead of making it even more complex :) > > > > My point is that since we changed the behavior, it might be better to > > change it with a more straightforward philosophy: for ARP request for an IP > > that is owned by a LRP, it is not flooded
[ovs-dev] [PATCH RFC] dpif-netdev: Add ovs-appctl dpif-netdev/subtable-show.
Add an ovs-appctl command to iterate through the dpcls and for each subtable output the miniflow bits for any existing table. $ ovs-appctl dpif-netdev/subatable-show pmd thread numa_id 0 dpcls port 2: subtable: unit_0: 4 (0x4) unit_1: 2 (0x2) pmd thread numa_id 1 dpcls port 3: subtable: unit_0: 4 (0x3) unit_1: 2 (0x5) Signed-off-by: Emma Finn --- NEWS| 2 ++ lib/dpif-netdev-unixctl.man | 4 lib/dpif-netdev.c | 54 - 3 files changed, 59 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index 88b8189..c01c100 100644 --- a/NEWS +++ b/NEWS @@ -10,6 +10,8 @@ Post-v2.12.0 if supported by libbpf. * Add option to enable, disable and query TCP sequence checking in conntrack. + * New "ovs-appctl dpif-netdev/subtable-show" command for userspace + datapath to show subtable miniflow bits. v2.12.0 - 03 Sep 2019 - diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man index 6c54f6f..c443465 100644 --- a/lib/dpif-netdev-unixctl.man +++ b/lib/dpif-netdev-unixctl.man @@ -217,3 +217,7 @@ with port names, which this thread polls. . .IP "\fBdpif-netdev/pmd-rxq-rebalance\fR [\fIdp\fR]" Reassigns rxqs to pmds in the datapath \fIdp\fR based on their current usage. +. +.IP "\fBdpif-netdev/subtable-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" +For one or all pmd threads of the datapath \fIdp\fR show the list of miniflow +bits for each subtable in the datapath classifier. \ No newline at end of file diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 4720ba1..7ae422e 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -857,6 +857,8 @@ static inline bool pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd); static void queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd, struct dp_netdev_flow *flow); +static void pmd_info_show_subtable(struct ds *reply, + struct dp_netdev_pmd_thread *pmd); static void emc_cache_init(struct emc_cache *flow_cache) @@ -979,6 +981,7 @@ enum pmd_info_type { PMD_INFO_CLEAR_STATS, /* Set the cycles count to 0. */ PMD_INFO_SHOW_RXQ,/* Show poll lists of pmd threads. */ PMD_INFO_PERF_SHOW, /* Show pmd performance details. */ +PMD_INFO_SHOW_SUBTABLE, /* Show subtable miniflow bits. */ }; static void @@ -1334,6 +1337,8 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], pmd_info_show_stats(&reply, pmd); } else if (type == PMD_INFO_PERF_SHOW) { pmd_info_show_perf(&reply, pmd, (struct pmd_perf_params *)aux); +} else if (type == PMD_INFO_SHOW_SUBTABLE) { +pmd_info_show_subtable(&reply, pmd); } } free(pmd_list); @@ -1391,7 +1396,8 @@ dpif_netdev_init(void) { static enum pmd_info_type show_aux = PMD_INFO_SHOW_STATS, clear_aux = PMD_INFO_CLEAR_STATS, - poll_aux = PMD_INFO_SHOW_RXQ; + poll_aux = PMD_INFO_SHOW_RXQ, + subtable_aux = PMD_INFO_SHOW_SUBTABLE; unixctl_command_register("dpif-netdev/pmd-stats-show", "[-pmd core] [dp]", 0, 3, dpif_netdev_pmd_info, @@ -1416,6 +1422,9 @@ dpif_netdev_init(void) "[-us usec] [-q qlen]", 0, 10, pmd_perf_log_set_cmd, NULL); +unixctl_command_register("dpif-netdev/subtable-show", "[-pmd core] [dp]", + 0, 3, dpif_netdev_pmd_info, + (void *)&subtable_aux); return 0; } @@ -8036,3 +8045,46 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], } return false; } + +/* Iterate through all dpcls instances and dump out all subtable + * miniflow bits. */ +static void +pmd_info_show_subtable(struct ds *reply, struct dp_netdev_pmd_thread *pmd) +{ +if (pmd->core_id != NON_PMD_CORE_ID) { +struct rxq_poll *list; +size_t n_rxq; +struct dpcls *cls; +struct dpcls_subtable *subtable; + +ovs_mutex_lock(&pmd->port_mutex); +sorted_poll_list(pmd, &list, &n_rxq); +for (int i = 0; i < n_rxq; i++) { +struct dp_netdev_rxq *rxq = list[i].rxq; +odp_port_t in_port = rxq->port->port_no; +cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port); +if (!cls) { +continue; +} else { +struct pvector *pvec = &cls->subtables; + +PVECTOR_FOR_EACH (subtable, pvec) { +ds_put_format(reply, "pmd thread numa_id %d " + "core_id %u: \n", + pmd->numa_id, pmd->core_id); +ds_put_format(reply, " dpcls p
Re: [ovs-dev] [PATCH] netdev: Dynamic per-port Flow API.
On 06.11.2019 18:11, Ophir Munk wrote: Hi Ilya, A few months ago we discussed the missing functionality of vports offloading under user space bridges. Commit [1] was added to explicitly avoid installing userspace vport flows (to avoid confusion with the vport kernel). When reverting commit [1] - we are left with this missing functionality. Applying your suggested patch netdev_vport_has_system_port() API) does not seem to work. It always fails when it tries to look for the interface name (say "vxlan10") in the system list where vxlan interfaces are renamed (say "vxlan_sys_4789"). Hi Ophir, No-one ever tried to run this code, so I'm not surprised. I could take a look at this issue on next week. Best regards, Ilya Maximets. You wrote: On master, after applying dynamic flow API patch-set, we'll be able to use patch that I suggested in previous mail to properly handle this situation and enable vport offloading. It would be appreciated if we can resume the efforts in fixing this missing functionality. Please advise on the next steps. [1] Commit 0da667e345 ("dpif-netdev: Forbid vport offloading attempts.") -Original Message- From: Ilya Maximets Sent: Monday, May 13, 2019 3:33 PM To: Ophir Munk ; ovs-dev@openvswitch.org Cc: Ian Stokes ; Flavio Leitner ; Kevin Traynor ; Roni Bar Yanai ; Finn Christensen ; Ben Pfaff ; Simon Horman ; Olga Shern ; Asaf Penso ; Oz Shlomo ; Majd Dibbiny Subject: Re: [PATCH] netdev: Dynamic per-port Flow API. On 13.05.2019 14:41, Ophir Munk wrote: Hi Ilya, -Original Message- From: Ilya Maximets Sent: Monday, May 13, 2019 12:22 PM To: Ophir Munk ; ovs-dev@openvswitch.org Cc: Ian Stokes ; Flavio Leitner ; Kevin Traynor ; Roni Bar Yanai ; Finn Christensen ; Ben Pfaff ; Simon Horman ; Olga Shern ; Asaf Penso ; Oz Shlomo Subject: Re: [PATCH] netdev: Dynamic per-port Flow API. On 09.05.2019 1:59, Ophir Munk wrote: Hi Ilya, ... I am recreating this scenario in my setup. I see. Yes, you're right. And I think that this case could be reproduced on current master without any patches. So, it's a bug that we need to fix. Otherwise userspace datapath will try to offload its flows to the unrelated system interfaces. For now we could just forbid offloading to vports in dpif- netdev. I'll prepare the patch. This fix also should be backported. We need a way to enable offloading of vports in dpif-netdev. So if you can address It with your new patch - that would be appreciated. I'm suggesting disabling because it'll be easy to backport to older branches. On master, after applying dynamic flow API patch-set, we'll be able to use patch that I suggested in previous mail to properly handle this situation and enable vport offloading. This patch series is important but one of its main goals is to enable different offloads for different vports under Kernel/userspace. Can you please advise how this goal can be achieved? It looks like there is no way to distinguish system and userspace vports by looking only on netdev. We should check with dpif type. Agreed. But then looking at your sample patch below you are basing your decision on the existence of system port (and not on dpif type). Right now 'ifindex' is used for checking, so this is equally racy. I think it is risky because you are dependent on the order of operations: Creation of the system port versus checking for system port existence. Which was first (under different scenarios)? Actually, port creation and checking will always happen in the same thread context, so creation and checking will be serialized. Kernel should guarantee the port existence. No races here. What do you say about the following patch: diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c int netdev_ports_insert(struct netdev *netdev, const struct dpif_class *dpif_class, struct dpif_port *dpif_port) @@ -547,8 +555,9 @@ netdev_ports_insert(struct netdev *netdev, const struct dpif_class *dpif_class, netdev_ports_hash(dpif_port->port_no, dpif_class)); hmap_insert(&ifindex_to_port, &data->ifindex_node, ifindex); ovs_mutex_unlock(&netdev_hmap_mutex); -netdev_init_flow_api(netdev); +netdev_init_flow_api(netdev, dpif_class->type); /* Pass class + type "netdev" or "syste" */ return 0; } It's not a full solution. It is just a hint how to pass the dpif type ("netdev" or "system") when initializing the flow api. We can use the dpif type when initializing vport offload. This is, actually, the first thing I tried. However, this requires exposing the internals of 'dpif-provider', which is bad from the point of code architecture. For the below patch code, I missed actual port requesting from the datapath. Following incremental needed: --- diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index 82943c102..1c88a91ed 100644 --- a/lib/netdev-vport.c +++ b/lib/netdev-vport.c @@ -124,13 +124,28 @@ netdev_vport_class_get_dpif_port(const
[ovs-dev] Extending ovs_action_attr to add a new action
Hi, I need to add a field to enum ovs_action_attr, but I see that the definition between the upstream header[1] and the one in compat[2] differs. Upstream enum stops at OVS_ACTION_ATTR_CHECK_PKT_LEN, with an extra "hidden" element after __OVS_ACTION_ATTR_MAX (22) Our compat version instead, has OVS_ACTION_ATTR_TUNNEL_{PUSH,POP} defined only #ifndef __KERNEL__, with __OVS_ACTION_ATTR_MAX being 22 for the kernel and 24 for userspace. If I add a field OVS_ACTION_ATTR_WHATEVER just before __OVS_ACTION_ATTR_MAX in the kernel, older userspace will incorrectly see the new action as OVS_ACTION_ATTR_TUNNEL_PUSH. How can we extend this enum without breaking compatibility? If OVS_ACTION_ATTR_TUNNEL_* really is unused in the kernel datapath, what if we pad the kernel header with two padding fields? -%<- --- a/include/uapi/linux/openvswitch.h +++ b/include/uapi/linux/openvswitch.h @@ -925,6 +926,8 @@ enum ovs_action_attr { OVS_ACTION_ATTR_METER,/* u32 meter ID. */ OVS_ACTION_ATTR_CLONE,/* Nested OVS_CLONE_ATTR_*. */ OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */ + _OVS_ACTION_ATTR_TUNNEL_PUSH, /* unused in kernel datapath. */ + _OVS_ACTION_ATTR_TUNNEL_POP, /* unused in kernel datapath. */ __OVS_ACTION_ATTR_MAX, /* Nothing past this will be accepted ->%- [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/openvswitch.h?h=v5.3#n899 [2] https://github.com/openvswitch/ovs/blob/v2.12.0/datapath/linux/compat/include/linux/openvswitch.h#L962 Regards, -- Matteo Croce per aspera ad upstream ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netdev-dpdk: Fix dev attached flag.
On 08.11.2019 9:55, Zhike Wang wrote: If the dev was already probed correctly, the dev attached flag should be set to true, or resource would leak during destruct. We're not detaching devices that wasn't attached by us. If device is already probed in DPDK than it means that most likely it was attached on dpdk_eal_init(), because user explicitly added this device in EAL arguments with 'dpdk-extra' knob. Previously there was non-detachable devices and we protected ourselves from trying to detach them. I beleive (hope) that in modern DPDK all the devices are detachable, but still. If user added devices to EAL arguments than he/she wanted them to exist whole time while OVS is running and we're not eligible to detach these devices. If you really want to detach them, use special command 'ovs-appctl netdev-dpdk/detach' that was designed for this exact case. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] L2 acl on the logical switch
Sorry about that, hopefully this is better (else, I'll just attach it the next time) Assuming :Logical Switch (ls1) with ls1_vm1 : 02:ac:10:ff:00:11/172.16.255.11 ls1_vm2 : 02:ac:10:ff:00:22/172.16.255.22 ls1_vm3 : 02:ac:10:ff:00:33/172.16.255.33 Looking to : To have MAC ACLs between vm1 and vm2 so they can't talk to each other. Using: # ovn-nbctl create Address_Set name=macs addresses=\"02:ac:10:ff:00:11\",\"02:ac:10:ff:00:22\" # ovn-nbctl create Port_Group name=l2pg # ovn-nbctl add Port_Group l2pg ports # ovn-nbctl add Port_Group l2pg ports # ovn-nbctl acl-add ls1 to-lport 32767 "outport == @l2pg && eth.src == \$macs" drop What we are seeing: The above, by itself, will limit L3 and L2 between vm1 and vm2, but not vm3 (as expected) lflow: table=4 (ls_out_acl ), priority=33767, match=(outport == @l2pg && eth.src == $macs), action=(/* drop */) dpflow (sending a arbit packet of ether type 0x7a05): (0),in_port(vm1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x7a05), packets:0, bytes:0, used:never, actions:drop However, if there is a stateful rule, e.g.: # ovn-nbctl acl-add ls1 from-lport 2000 "inport == \"ls1-vm3\" && ip" allow-related Then, the behavior differs lflow: table=6 (ls_in_acl ), priority=3000 , match=(!ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (inport == "ls1-vm3" && ip)), action=(next;) table=6 (ls_in_acl ), priority=3000 , match=(((ct.new && !ct.est) || (!ct.new && ct.est && !ct.rpl && ct_label.blocked == 1)) && (inport == "ls1-vm3" && ip)), action=(reg0[1] = 1; next;) table=4 (ls_out_acl ), priority=33767, match=((!ct.est || (ct.est && ct_label.blocked == 1)) && (outport == @l2pg && eth.src == $macs)), action=(/* drop */) table=4 (ls_out_acl ), priority=33767, match=(ct.est && ct_label.blocked == 0 && (outport == @l2pg && eth.src == $macs)), action=(ct_commit(ct_label=1/1); /* drop */) *Note: the condition !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0** This does block L3 traffic: dpctl flow for ICMP recirc_id(0),in_port(vm1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8), packets:6, bytes:588, used:0.864s, actions:ct(zone=1),recirc(0x1) recirc_id(0x1),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(dst=02:ac:10:ff:00:22),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8), packets:6, bytes:588, used:0.864s, actions:ct(commit,zone=1,label=0/0x1),ct(zone=4),recirc(0x2) recirc_id(0x2),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(frag=no), packets:6, bytes:588, used:0.864s, actions:drop recirc_id(0),in_port(vm1),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x0806),arp(sha=02:ac:10:ff:00:11), packets:0, bytes:0, used:never, actions:vm2 recirc_id(0),in_port(vm2),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:22,dst=02:ac:10:ff:00:11),eth_type(0x0806),arp(sha=02:ac:10:ff:00:22), packets:0, bytes:0, used:never, actions:vm1 But, it doesn't block L2, see the ARP going through, without the stateful rule the ARP would have been dropped too. From what i understand it is because of the condition "the condition !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0", which specifically looks fo "+trk" which makes sense for IP packets, but probably not for non-IP packets, i.e. the ARP rule above has "-trk". Hence, I believe it gets through. Proposed changes: If my understanding is right (and objective to remove the inconsistency between when there are stateful rules and not, w.r.t. L2 packets), we should have the condition as: # git diffdiff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.cindex 6c6de2afd..b0f524531 100644--- a/ovn/northd/ovn-northd.c+++ b/ovn/northd/ovn-northd.c@@ -4191,10 +4191,13 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, * depending on whether the connection was previously committed * to the connection tracker with ct_commit. */ if (has_stateful) {- /* If the packet is not part of an established connection, then- * we can simply reject/drop it. */++ /* If the packet is not tracked or part of an established connection, then+ * we can simply reject/drop it.+ */ ds_put_cstr(&match,- "(!ct.est || (ct.est && ct_label.blocked == 1))");+ "(!ct.trk || !ct.est || (ct.est && ct_label.blocked == 1))"); if (!strcmp(acl->action, "reject")) { build_reject_acl_rules(od, lflows, stage, acl, &match, &actions); with this the rules above look like: lflow: table=
Re: [ovs-dev] [PATCH v2] dpif-netdev: log rxq assignment in isolated pmd
On 22.10.2019 11:34, Gowrishankar Muthukrishnan wrote: There is no log about isolated rxq assignment in a pmd today, which sometimes could be useful to trace rxq/pmd pinning, when debugging with log. Ovs-appctl dpif-netdev/pmd-rxq-show reports about it already, but logging is helpful to trace pinning in time. Changes: v2: init numa_id for the info. Reported-at: https://bugzilla.redhat.com/1728616 Signed-off-by: Gowrishankar Muthukrishnan --- Hi Gowrishankar, Thanks for the patch. I like the idea of this additional log. Few comments for implementation inline. Best regards, Ilya Maximets. lib/dpif-netdev.c | 5 + 1 file changed, 5 insertions(+) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 4546b55e8..4d7c36787 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4573,6 +4573,11 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex) q->pmd = pmd; pmd->isolated = true; dp_netdev_pmd_unref(pmd); Print should go before dropping the pmd reference. +numa_id = netdev_get_numa_id(q->port->netdev); +VLOG_INFO("Core %d on numa node %d assigned port \'%s\' " + "rx queue %d.", pmd->core_id, numa_id, 'numa_id' is not correct. You need to print the numa_id of the core/thread, not the numa_id of netdev. So, there should be 'pmd->numa_id' instead. And, please, align the arguments to the next symbol after '(', i.e. 4 spaces to the right. + netdev_rxq_get_name(q->rx), + netdev_rxq_get_queue_id(q->rx)); } } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) { uint64_t cycle_hist = 0; ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Extending ovs_action_attr to add a new action
Bleep bloop. Greetings Matteo Croce, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. git-am: fatal: corrupt patch at line 13 Repository lacks necessary blobs to fall back on 3-way merge. Cannot fall back to three-way merge. Patch failed at 0001 Extending ovs_action_attr to add a new action The copy of the patch that failed is found in: /var/lib/jenkins/jobs/upstream_build_from_pw/workspace/.git/rebase-apply/patch When you have resolved this problem, run "git am --resolved". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] ¿Cómo crear un equipo triunfador?
26 de Noviembre | Horario de 10:00 a 17:00 hrs. | (hora del centro de México) - Coaching y liderazgo efectivo. - El líder coach es un estilo muy completo para utilizar cuando queremos capacitar a otros y conseguir resultados sólidos a mediano y largo plazo. Este webinar tiene como finalidad llevar a los participantes a ser capaces de intervenir positivamente en otros individuos en funciones de liderazgo, comprendiendo diversos estilos y aplicando en la práctica acciones que permitan mejorar la toma de decisiones para alcanzar los objetivos de su organización. Solicita información respondiendo a este correo con la palabra Coaching, junto con los siguientes datos: Nombre: Correo electrónico: Número telefónico: Números de Atención: (045) 55 15 54 66 30 - (045) 55 85567293 - (045) 5530167085 En caso de que haya recibido este correo sin haberlo solicitado o si desea dejar de recibir nuestra promoción favor de responder con la palabra baja o enviar un correo a bajas@ innovalearn.net ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds
Hi Wei On 2019-11-08 02:02, Yanqin Wei (Arm Technology China) wrote: Hi David -Original Message- From: David Wilder Sent: Thursday, November 7, 2019 3:21 AM To: ovs-dev@openvswitch.org Cc: i.maxim...@ovn.org; b...@ovn.org; Yanqin Wei (Arm Technology China) ; wil...@us.ibm.com Subject: [PATCH v3] travis: support ppc64le builds Add support for travis-ci ppc64le builds. - Updated matrix in .travis.yml to include an arch: ppc64le build. - Move package install needed for 32bit builds to .travis/linux-prepare.sh. To keep the total build time at an acceptable level only a single build job is included in the matrix for ppc64le. A build report example can be found here [1] [0] https://urldefense.proofpoint.com/v2/url?u=http-3A__travis-2Dci.org_&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=6QTaKFSaP0A5SgshVJPuewRjtEH32pQD2AUjoEnZyTo&s=IILKkh0Orqnp1nKjHWWCkXSOMpBYjG9WKs8l34TyPts&e= [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_djlwilder_ovs_builds_607851729&d=DwIFAg&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=6QTaKFSaP0A5SgshVJPuewRjtEH32pQD2AUjoEnZyTo&s=mVCcGr-NWlZzQn2xSnackkm2Aawg7iyvYEyh4lMpjFw&e= Signed-off-by: David Wilder --- Addressed review comments: - Cleaned up linux-prepare.sh (v2) - Switch from os: linux-ppc64le to arch: ppc64le (v3) .travis.yml | 5 +++-- .travis/linux-prepare.sh | 5 - 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/.travis.yml b/.travis.yml index 482efd2d1..308c09635 100644 --- a/.travis.yml +++ b/.travis.yml @@ -14,7 +14,6 @@ addons: apt: packages: - bc - - gcc-multilib - libssl-dev - llvm-dev - libjemalloc1 @@ -26,7 +25,6 @@ addons: - libelf-dev - selinux-policy-dev - libunbound-dev - - libunbound-dev:i386 - libunwind-dev before_install: ./.travis/${TRAVIS_OS_NAME}-prepare.sh @@ -52,6 +50,9 @@ matrix: - os: osx compiler: clang env: OPTS="--disable-ssl" +- arch: ppc64le + compiler: gcc + env: OPTS="--disable-ssl" script: ./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS diff --git a/.travis/linux-prepare.sh b/.travis/linux-prepare.sh index 9e3ac0df7..d66f480c6 100755 --- a/.travis/linux-prepare.sh +++ b/.travis/linux-prepare.sh @@ -18,7 +18,10 @@ pip install --user --upgrade docutils if [ "$M32" ]; then # 32-bit and 64-bit libunwind can not be installed at the same time. # This will remove the 64-bit libunwind and install 32-bit version. -sudo apt-get install -y libunwind-dev:i386 +sudo apt-get install -y \ + gcc-multilib \ + libunwind-dev:i386 \ + libunbound-dev:i386 [Yanqin] They are x86 specific dependency. It is better to use "$TRAVIS_ARCH" == "amd64" condition. [ wilder ] In this case it is not needed as ppc64le is not supporting 32bit builds (not in the matrix). If arm64 will support a 32bit build, then you will need to modify this section anyway, to install arm packages. [Yanqin] Is gcc-multilib only required for 32bits build? [ wilder ] Yes. fi # IPv6 is supported by kernel but disabled in TravisCI images: -- 2.23.0.162.gf1d4a28 IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.
1. storage data and the void *arg of init() breaks the engine node data encapsulation. 2. engine_node_valid(&en_flow_output, engine_run_id) is not needed? Should use storage to access instead? 3. refactor of engine is good but better to be a separate commit 4. we can have a new interface: engine_get_data(), which returns data if it is valid. we should never expose the data directly. We should init the engine node with dynamically allocated engine data structure (and remember to free during destroy) Hi Dumitru, Sorry for late response. On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara wrote: > > The incremental processing engine might stop a run before the > en_runtime_data node is processed. In such cases the ed_runtime_data > fields might contain pointers to already deleted SB records. For > example, if a port binding corresponding to a patch port is removed from > the SB database and the incremental processing engine aborts before the > en_runtime_data node is processed then the corresponding local_datapath > hashtable entry in ed_runtime_data is stale and will store a pointer to > the already freed sbrec_port_binding record. > > This will cause invalid memory accesses in various places (e.g., > pinctrl_run() -> prepare_ipv6_ras()). > > To fix the issue we need a way to track how each node was processed > during an engine run. This commit transforms the 'changed' field in > struct engine_node in a 'state' field. Possible node states are: > - "New": the node is not yet initialized. > - "Stale": data in the node is not up to date with the DB. > - "Updated": data in the node is valid but was updated during > the last run of the engine. > - "Valid": data in the node is valid and didn't change during > the last run of the engine. > - "Aborted": during the last run, processing was aborted for > this node. > - "Destroyed": the node was already cleaned up. > > We also add a separation between engine node data that can be accessed > at any time (regardless if the last engine run was successful or not) > and data that may be accessed only if the nodes are up to date. This > helps avoiding custom "engine_node_valid" handlers for different > nodes. > > The commit also simplifies the logic of calling engine_run and > engine_need_run in order to reduce the number of external variables > required to track the result of the last engine execution. > > Functions that need to be called from the main loop and depend on > various data contents of the engine's nodes are now called only if > the data is up to date. > > CC: Han Zhou > Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine - quiet mode.") > Signed-off-by: Dumitru Ceara > > --- > v2: Address Han's comments: > - call engine_node_valid() in all the places where node local data is > used. > - move out "global" data outside the engine nodes. Make a clear > separation between data that can be safely used at any time and data > that can be used only when the engine run was successful. I am concerned with this kind of separation of *global* data, which breaks the data encapsulation of engine node, and can easily result in hidden dependency. As you know the main purpose of the I-P engine is to force explicit dependency exposed between different engine nodes thus ensure the correctness (at least it helps to ensure) of incremental processing. Here is my proposal to address the problem with better encapsulation. Firstly, let's avoid direct engine data access outside of engine module. At engine node construction, instead of using reference of stack variable (such as struct ed_type_runtime_data ed_runtime_data), we can allocate the memory in the engine node's init() interface, and free in the cleanup() interface. This way, there will be no way to directly access engine data like &ed_runtime_data.local_datapaths. Secondly, let's add a engine module interface engine_get_data() to retrieve *and validate* data for an engine node: void * engine_get_data(struct engine_node *node, uint64_t run_id) { if (engine_node_valid(node, run_id)) { return node->data; } return NULL; } This should be used whenever an engine node data need to be accessed. (It may be even better to use node name as input instead of node structure, but it seems ok to me either way) Now let's address the always-valid node problem. I was proposing an is_valid() interface for engine node. It can be NULL by default, but if a node xxx's data is always valid, it can be implemented like: static bool en_xxx_is_valid(struct engine_node *node) { /* This node's data will always be valid */ return true; } For the engine_node_valid() function, it can be changed slightly: bool engine_node_valid(struct engine_node *node, uint64_t run_id) { if (node->is_valid) { return node->is_valid(); } return node->run_id == run_id && (node->state == EN_UPDATED || node->state == EN_VALID); } So if is_valid is not implemented, it will be the default ch
Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.
On Fri, Nov 8, 2019 at 11:22 AM Han Zhou wrote: > > 1. storage data and the void *arg of init() breaks the engine node data encapsulation. > 2. engine_node_valid(&en_flow_output, engine_run_id) is not needed? Should use storage to access instead? > 3. refactor of engine is good but better to be a separate commit > 4. we can have a new interface: engine_get_data(), which returns data if it is valid. we should never expose the data directly. We should init the engine node with dynamically allocated engine data structure (and remember to free during destroy) Oops! please ignore the above part since it was draft and I forgot to delete after editing the formal response, mostly redundant :-) Real response started here => > > Hi Dumitru, > > Sorry for late response. > On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara wrote: > > > > The incremental processing engine might stop a run before the > > en_runtime_data node is processed. In such cases the ed_runtime_data > > fields might contain pointers to already deleted SB records. For > > example, if a port binding corresponding to a patch port is removed from > > the SB database and the incremental processing engine aborts before the > > en_runtime_data node is processed then the corresponding local_datapath > > hashtable entry in ed_runtime_data is stale and will store a pointer to > > the already freed sbrec_port_binding record. > > > > This will cause invalid memory accesses in various places (e.g., > > pinctrl_run() -> prepare_ipv6_ras()). > > > > To fix the issue we need a way to track how each node was processed > > during an engine run. This commit transforms the 'changed' field in > > struct engine_node in a 'state' field. Possible node states are: > > - "New": the node is not yet initialized. > > - "Stale": data in the node is not up to date with the DB. > > - "Updated": data in the node is valid but was updated during > > the last run of the engine. > > - "Valid": data in the node is valid and didn't change during > > the last run of the engine. > > - "Aborted": during the last run, processing was aborted for > > this node. > > - "Destroyed": the node was already cleaned up. > > > > We also add a separation between engine node data that can be accessed > > at any time (regardless if the last engine run was successful or not) > > and data that may be accessed only if the nodes are up to date. This > > helps avoiding custom "engine_node_valid" handlers for different > > nodes. > > > > The commit also simplifies the logic of calling engine_run and > > engine_need_run in order to reduce the number of external variables > > required to track the result of the last engine execution. > > > > Functions that need to be called from the main loop and depend on > > various data contents of the engine's nodes are now called only if > > the data is up to date. > > > > CC: Han Zhou > > Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine - quiet mode.") > > Signed-off-by: Dumitru Ceara > > > > --- > > v2: Address Han's comments: > > - call engine_node_valid() in all the places where node local data is > > used. > > - move out "global" data outside the engine nodes. Make a clear > > separation between data that can be safely used at any time and data > > that can be used only when the engine run was successful. > > I am concerned with this kind of separation of *global* data, which breaks the data encapsulation of engine node, and can easily result in hidden dependency. As you know the main purpose of the I-P engine is to force explicit dependency exposed between different engine nodes thus ensure the correctness (at least it helps to ensure) of incremental processing. > > Here is my proposal to address the problem with better encapsulation. > > Firstly, let's avoid direct engine data access outside of engine module. At engine node construction, instead of using reference of stack variable (such as struct ed_type_runtime_data ed_runtime_data), we can allocate the memory in the engine node's init() interface, and free in the cleanup() interface. This way, there will be no way to directly access engine data like &ed_runtime_data.local_datapaths. > > Secondly, let's add a engine module interface engine_get_data() to retrieve *and validate* data for an engine node: > void * > engine_get_data(struct engine_node *node, uint64_t run_id) > { > if (engine_node_valid(node, run_id)) { > return node->data; > } > return NULL; > } > > This should be used whenever an engine node data need to be accessed. (It may be even better to use node name as input instead of node structure, but it seems ok to me either way) > > Now let's address the always-valid node problem. I was proposing an is_valid() interface for engine node. It can be NULL by default, but if a node xxx's data is always valid, it can be implemented like: > > static bool > en_xxx_is_valid(struct engine_node *node) > { > /* This node's data will always be valid */ > re
[ovs-dev] [PATCH net 1/2] openvswitch: support asymmetric conntrack
The openvswitch module shares a common conntrack and NAT infrastructure exposed via netfilter. It's possible that a packet needs both SNAT and DNAT manipulation, due to e.g. tuple collision. Netfilter can support this because it runs through the NAT table twice - once on ingress and again after egress. The openvswitch module doesn't have such capability. Like netfilter hook infrastructure, we should run through NAT twice to keep the symmetry. Fixes: 05752523e565 ("openvswitch: Interface with NAT.") Signed-off-by: Aaron Conole --- net/openvswitch/conntrack.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index 05249eb45082..283e8f9a5fd2 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -903,6 +903,17 @@ static int ovs_ct_nat(struct net *net, struct sw_flow_key *key, } err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range, maniptype); + if (err == NF_ACCEPT && + ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) { + if (maniptype == NF_NAT_MANIP_SRC) + maniptype = NF_NAT_MANIP_DST; + else + maniptype = NF_NAT_MANIP_SRC; + + err = ovs_ct_nat_execute(skb, ct, ctinfo, &info->range, +maniptype); + } + /* Mark NAT done if successful and update the flow key. */ if (err == NF_ACCEPT) ovs_nat_update_key(key, skb, maniptype); -- 2.21.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack
The act_ct TC module shares a common conntrack and NAT infrastructure exposed via netfilter. It's possible that a packet needs both SNAT and DNAT manipulation, due to e.g. tuple collision. Netfilter can support this because it runs through the NAT table twice - once on ingress and again after egress. The act_ct action doesn't have such capability. Like netfilter hook infrastructure, we should run through NAT twice to keep the symmetry. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: Aaron Conole --- net/sched/act_ct.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index fcc46025e790..f3232a00970f 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb, bool commit) { #if IS_ENABLED(CONFIG_NF_NAT) + int err; enum nf_nat_manip_type maniptype; if (!(ct_action & TCA_CT_ACT_NAT)) @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb, return NF_ACCEPT; } - return ct_nat_execute(skb, ct, ctinfo, range, maniptype); + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); + if (err == NF_ACCEPT && + ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) { + if (maniptype == NF_NAT_MANIP_SRC) + maniptype = NF_NAT_MANIP_DST; + else + maniptype = NF_NAT_MANIP_SRC; + + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); + } + return err; #else return NF_ACCEPT; #endif -- 2.21.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] Extending ovs_action_attr to add a new action
On Fri, Nov 08, 2019 at 05:12:55PM +0100, Matteo Croce wrote: > Hi, > > I need to add a field to enum ovs_action_attr, but I see that the > definition between the upstream header[1] and the one in compat[2] > differs. > Upstream enum stops at OVS_ACTION_ATTR_CHECK_PKT_LEN, with an extra > "hidden" element after __OVS_ACTION_ATTR_MAX (22) > Our compat version instead, has OVS_ACTION_ATTR_TUNNEL_{PUSH,POP} > defined only #ifndef __KERNEL__, with __OVS_ACTION_ATTR_MAX being 22 > for the kernel and 24 for userspace. > > If I add a field OVS_ACTION_ATTR_WHATEVER just before > __OVS_ACTION_ATTR_MAX in the kernel, older userspace will incorrectly > see the new action as OVS_ACTION_ATTR_TUNNEL_PUSH. "older userspace" means you're using userspace datapath (dpif-netdev)? If true, then it's not using kernel module. if "older userspace" means just ovs-vswitchd using kernel module, and you want to upgrade ovs kernel module with your new action and without upgrade ovs-vswitchd? Usually we also upgrade ovs-vswitchd, so I don't know how this can be done. Regards, William > > How can we extend this enum without breaking compatibility? > If OVS_ACTION_ATTR_TUNNEL_* really is unused in the kernel datapath, > what if we pad the kernel header with two padding fields? > > -%<- > --- a/include/uapi/linux/openvswitch.h > +++ b/include/uapi/linux/openvswitch.h > @@ -925,6 +926,8 @@ enum ovs_action_attr { > OVS_ACTION_ATTR_METER,/* u32 meter ID. */ > OVS_ACTION_ATTR_CLONE,/* Nested OVS_CLONE_ATTR_*. */ > OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */ > + _OVS_ACTION_ATTR_TUNNEL_PUSH, /* unused in kernel datapath. */ > + _OVS_ACTION_ATTR_TUNNEL_POP, /* unused in kernel datapath. */ > > __OVS_ACTION_ATTR_MAX, /* Nothing past this will be accepted > ->%- > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/openvswitch.h?h=v5.3#n899 > [2] > https://github.com/openvswitch/ovs/blob/v2.12.0/datapath/linux/compat/include/linux/openvswitch.h#L962 > > Regards, > -- > Matteo Croce > per aspera ad upstream > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds
On Wed, Nov 06, 2019 at 11:20:48AM -0800, David Wilder wrote: > Add support for travis-ci ppc64le builds. > > - Updated matrix in .travis.yml to include an arch: ppc64le build. > - Move package install needed for 32bit builds to .travis/linux-prepare.sh. > > To keep the total build time at an acceptable level only a single build > job is included in the matrix for ppc64le. > > A build report example can be found here [1] > [0] http://travis-ci.org/ > [1] https://travis-ci.org/djlwilder/ovs/builds/607851729 > > Signed-off-by: David Wilder > --- LGTM, Acked-by: William Tu ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds
On 06.11.2019 20:20, David Wilder wrote: Add support for travis-ci ppc64le builds. - Updated matrix in .travis.yml to include an arch: ppc64le build. - Move package install needed for 32bit builds to .travis/linux-prepare.sh. To keep the total build time at an acceptable level only a single build job is included in the matrix for ppc64le. A build report example can be found here [1] [0] http://travis-ci.org/ [1] https://travis-ci.org/djlwilder/ovs/builds/607851729 Signed-off-by: David Wilder --- Addressed review comments: - Cleaned up linux-prepare.sh (v2) - Switch from os: linux-ppc64le to arch: ppc64le (v3) What a wonderful world of undocumented features. :) Anyway, I just tried this patch and it fails for me because of missing pip: https://travis-ci.org/igsilya/ovs/jobs/609402867 pip install --disable-pip-version-check --user six flake8 hacking ./.travis/linux-prepare.sh: line 15: pip: command not found Restarting the job doesn't help. I'm wondering what is the base image and who controls preinstalled software? Maybe it makes sense to hold on until travis will start supporting ppc64le officially? Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Track vhost tx contention.
On 08.11.2019 9:30, David Marchand wrote: On Tue, Nov 5, 2019 at 4:37 PM Ilya Maximets wrote: That's an interesting debug method, but it looks not very suitable for an end-user documentation. One thing that bothers me the most is referencing C code snippets in this kind of documentation. Ok, can we conclude on the coverage counter wrt the v1 then? https://patchwork.ozlabs.org/patch/1153238/ Should I submit a v3 with the doc update (but removing the parts about perf) ? 'perf' part should not be there. I'm in doubt about the rest of the docs. This part might be useful, but it doesn't provide any solution for the problem and I really don't know if there is something we can suggest in that case. "Rework your network topology" doesn't sound like a friendly solution or exact steps to do. One more thing is that documenting coverage counters doesn't look like a good idea to me and I'd like to not create a precedent. One day we'll rework this to be some "PMD/netdev performance statistics" and it'll be OK to document it. But there is nothing more permanent than a temporary solution. Right now the easiest way for me is to just apply v1. Ok for me. OK. I'll apply v1 then. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netdev-dpdk: Track vhost tx contention.
On 26.08.2019 16:33, David Marchand wrote: Add a coverage counter to help diagnose contention on the vhost txqs. This is seen as dropped packets on the physical ports for rates that are usually handled fine by OVS. Signed-off-by: David Marchand --- Thanks! I changed 'unlikely' to 'OVS_UNLIKELY' and applied to master. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds
On 2019-11-08 14:52, Ilya Maximets wrote: On 06.11.2019 20:20, David Wilder wrote: Add support for travis-ci ppc64le builds. - Updated matrix in .travis.yml to include an arch: ppc64le build. - Move package install needed for 32bit builds to .travis/linux-prepare.sh. To keep the total build time at an acceptable level only a single build job is included in the matrix for ppc64le. A build report example can be found here [1] [0] https://urldefense.proofpoint.com/v2/url?u=http-3A__travis-2Dci.org_&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=_UtshYcJsj4Pt3b9hfgEiyaEIT3j9gPEIgmBatCEqCo&s=r0fBOs-21CKcV4kyZGnzh3fcjrpR8caYSl8K2i1St54&e= [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_djlwilder_ovs_builds_607851729&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=_UtshYcJsj4Pt3b9hfgEiyaEIT3j9gPEIgmBatCEqCo&s=7t2rzVasH7Xq_R7jWkWZO9rkgm4KHMH-WavBzCRbF74&e= Signed-off-by: David Wilder --- Addressed review comments: - Cleaned up linux-prepare.sh (v2) - Switch from os: linux-ppc64le to arch: ppc64le (v3) What a wonderful world of undocumented features. :) Anyway, I just tried this patch and it fails for me because of missing pip: https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_igsilya_ovs_jobs_609402867&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=_UtshYcJsj4Pt3b9hfgEiyaEIT3j9gPEIgmBatCEqCo&s=PF1oO_KkZFd_RRKToj6UBN2t2YhvTVE5XnVD1GF9u60&e= pip install --disable-pip-version-check --user six flake8 hacking ./.travis/linux-prepare.sh: line 15: pip: command not found Restarting the job doesn't help. I'm wondering what is the base image and who controls preinstalled software? Maybe it makes sense to hold on until travis will start supporting ppc64le officially? That is a bummer, I am seeing that error now as well, it worked yesterday :( I agree we should hold off. I will let you know if I figure it out. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] conntrack: Fix tcp payload length in case multi-segments.
Thanks for the patch Would you mind describing the use case that this patch is aiming to support ? On Fri, Nov 8, 2019 at 1:23 AM Zhike Wang wrote: > Signed-off-by: Zhike Wang > --- > lib/conntrack-private.h | 16 ++-- > 1 file changed, 10 insertions(+), 6 deletions(-) > > diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h > index 590f139..1d21f6e 100644 > --- a/lib/conntrack-private.h > +++ b/lib/conntrack-private.h > @@ -233,13 +233,17 @@ conn_update_expiration(struct conntrack *ct, struct > conn *conn, > static inline uint32_t > tcp_payload_length(struct dp_packet *pkt) > { > -const char *tcp_payload = dp_packet_get_tcp_payload(pkt); > -if (tcp_payload) { > -return ((char *) dp_packet_tail(pkt) - dp_packet_l2_pad_size(pkt) > -- tcp_payload); > -} else { > -return 0; > +size_t l4_size = dp_packet_l4_size(pkt); > + > +if (OVS_LIKELY(l4_size >= TCP_HEADER_LEN)) { > +struct tcp_header *tcp = dp_packet_l4(pkt); > +int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4; > + > +if (OVS_LIKELY(tcp_len >= TCP_HEADER_LEN && tcp_len <= l4_size)) { > +return (l4_size - tcp_len); > +} > Maybe I missed something, but it looks like the same calculation is arrived at. > } > +return 0; > } > > #endif /* conntrack-private.h */ > -- > 1.8.3.1 > > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] same tcp session encapsulated with different udp src port in kernel mode if packet has do ip_forward
I have tested this patch in linux with ovs version 2.8.2, and it seems worked. >From fa24d308d40f37f890fead0b79ac1f0f7baa28ba Mon Sep 17 00:00:00 2001 From: hzchenyuefang Date: Sat, 9 Nov 2019 10:14:23 +0800 Subject: [PATCH 1/1] fix skb_hash problem when sending from internal port first packet need come to userspace first and when execute in datapath, skb->hash value is changed to 0 hence the packet's skb->hash value need to recalculate when used. This may happend in such conditions: send packet from internal port, and then do vxlan encapsulting, outer udp header source port is calculated by skb->hash(see function udp_flow_src_port), so same tcp session packet may have different outer udp source port, this may affect load blance. --- datapath-windows/ovsext/BufferMgmt.h | 1 + datapath-windows/ovsext/DpInternal.h | 1 + datapath-windows/ovsext/User.c| 11 ++- datapath/datapath.c | 17 - datapath/datapath.h | 0 datapath/linux/compat/include/linux/openvswitch.h | 1 + lib/dpif-netlink.c| 7 ++- lib/dpif.h| 2 ++ ofproto/ofproto-dpif-upcall.c | 11 ++- 9 files changed, 47 insertions(+), 4 deletions(-) mode change 100644 => 100755 datapath-windows/ovsext/BufferMgmt.h mode change 100644 => 100755 datapath-windows/ovsext/DpInternal.h mode change 100644 => 100755 datapath-windows/ovsext/User.c mode change 100644 => 100755 datapath/datapath.c mode change 100644 => 100755 datapath/datapath.h mode change 100644 => 100755 lib/dpif-netlink.c mode change 100644 => 100755 lib/dpif.h mode change 100644 => 100755 ofproto/ofproto-dpif-upcall.c diff --git a/datapath-windows/ovsext/BufferMgmt.h b/datapath-windows/ovsext/BufferMgmt.h old mode 100644 new mode 100755 index dcf310a..34f50a1 --- a/datapath-windows/ovsext/BufferMgmt.h +++ b/datapath-windows/ovsext/BufferMgmt.h @@ -59,6 +59,7 @@ typedef union _OVS_BUFFER_CONTEXT { UINT32 dataOffsetDelta; }; UINT16 mru; +UINT32 skb_hash; }; UINT64 value[MEM_ALIGN_SIZE(32) >> 3]; diff --git a/datapath-windows/ovsext/DpInternal.h b/datapath-windows/ovsext/DpInternal.h old mode 100644 new mode 100755 index 3e351b7..8bb7110 --- a/datapath-windows/ovsext/DpInternal.h +++ b/datapath-windows/ovsext/DpInternal.h @@ -300,6 +300,7 @@ typedef struct OvsPacketExecute { uint32_t dpNo; uint32_t inPort; uint16 mru; + uint32 skb_hash; uint32_t packetLen; uint32_t actionsLen; PNL_MSG_HDR nlMsgHdr; diff --git a/datapath-windows/ovsext/User.c b/datapath-windows/ovsext/User.c old mode 100644 new mode 100755 index 4693a8b..4439379 --- a/datapath-windows/ovsext/User.c +++ b/datapath-windows/ovsext/User.c @@ -292,7 +292,8 @@ OvsNlExecuteCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, [OVS_PACKET_ATTR_USERDATA] = {.type = NL_A_UNSPEC, .optional = TRUE}, [OVS_PACKET_ATTR_EGRESS_TUN_KEY] = {.type = NL_A_UNSPEC, .optional = TRUE}, -[OVS_PACKET_ATTR_MRU] = { .type = NL_A_U16, .optional = TRUE } +[OVS_PACKET_ATTR_MRU] = { .type = NL_A_U16, .optional = TRUE }, +[OVS_PACKET_ATTR_SKB_HASH] = { .type = NL_A_U16, .optional = TRUE } }; RtlZeroMemory(&execute, sizeof(OvsPacketExecute)); @@ -394,6 +395,9 @@ _MapNlAttrToOvsPktExec(PNL_MSG_HDR nlMsgHdr, PNL_ATTR *nlAttrs, if (nlAttrs[OVS_PACKET_ATTR_MRU]) { execute->mru = NlAttrGetU16(nlAttrs[OVS_PACKET_ATTR_MRU]); } +if (nlAttrs[OVS_PACKET_ATTR_SKB_HASH]){ +execute->skb_hash = NlAttrGetU32(nlAttrs[OVS_PACKET_ATTR_SKB_HASH]); +} } NTSTATUS @@ -1101,6 +1105,11 @@ OvsCreateQueueNlPacket(PVOID userData, goto fail; } } +if (ctx->skb_hash) { +if (!NlMsgPutTailU32(&nlBuf, OVS_PACKET_ATTR_SKB_HASH, (UINT32)ctx->skb_hash)){ +goto fail; +} +} /* XXX must send OVS_PACKET_ATTR_EGRESS_TUN_KEY if set by vswtchd */ if (userData){ diff --git a/datapath/datapath.c b/datapath/datapath.c old mode 100644 new mode 100755 index b565fc5..9559b4d --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -287,7 +287,6 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct sw_flow_key *key) stats_counter = &stats->n_missed; goto out; } - ovs_flow_stats_update(flow, key->tp.flags, skb); sf_acts = rcu_dereference(flow->sf_acts); ovs_execute_actions(dp, skb, sf_acts, key); @@ -521,6 +520,14 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, pad_packet(dp, user_skb); } +if (skb->hash) { +if (nla_put_u32(user_skb, OVS_PACKET_ATTR_SKB_HASH, skb->hash)) { +err = -ENOBUFS; +goto out; +} +
[ovs-dev] [PATCH v4 3/3 ovn] OVN ACL: Allow a user to input ct.label value for an acl
This patch allows user to associate a value with acl, which will be assigned to ct.label of the corresponding connection tracking entry. This value can be used to map a ct entry with corresponding OVN ACL or higher level constructs like security group. Signed-off-by: Ankur Sharma --- northd/ovn-northd.c | 37 ++--- ovn-nb.ovsschema | 5 +++-- ovn-nb.xml| 12 +++ tests/ovn-nbctl.at| 12 +-- tests/ovn.at | 57 +++ utilities/ovn-nbctl.c | 24 +- 6 files changed, 135 insertions(+), 12 deletions(-) diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index dcd975d..226882e 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -190,12 +190,13 @@ enum ovn_stage { #define OVN_ACL_PRI_OFFSET 1000 /* Register definitions specific to switches. */ -#define REGBIT_CONNTRACK_DEFRAG "reg0[0]" -#define REGBIT_CONNTRACK_COMMIT "reg0[1]" -#define REGBIT_CONNTRACK_NAT "reg0[2]" -#define REGBIT_DHCP_OPTS_RESULT "reg0[3]" -#define REGBIT_DNS_LOOKUP_RESULT "reg0[4]" -#define REGBIT_ND_RA_OPTS_RESULT "reg0[5]" +#define REGBIT_CONNTRACK_DEFRAG "reg0[0]" +#define REGBIT_CONNTRACK_COMMIT "reg0[1]" +#define REGBIT_CONNTRACK_NAT"reg0[2]" +#define REGBIT_CONNTRACK_SET_LABEL "reg0[3]" +#define REGBIT_DHCP_OPTS_RESULT "reg0[4]" +#define REGBIT_DNS_LOOKUP_RESULT"reg0[5]" +#define REGBIT_ND_RA_OPTS_RESULT"reg0[6]" /* Register definitions for switches and routers. */ #define REGBIT_NAT_REDIRECT "reg9[0]" @@ -4550,7 +4551,14 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, " || (!ct.new && ct.est && !ct.rpl " "&& ct.blocked == 1)) " "&& (%s)", acl->match); -ds_put_cstr(&actions, REGBIT_CONNTRACK_COMMIT" = 1; "); +if (acl->label) { + ds_put_format(&actions, REGBIT_CONNTRACK_COMMIT" = 1; " + ""REGBIT_CONNTRACK_SET_LABEL" = 1; " + "xxreg1 = %s; ", acl->label); +} else { + ds_put_cstr(&actions, REGBIT_CONNTRACK_COMMIT" = 1; "); +} + build_acl_log(&actions, acl); ds_put_cstr(&actions, "next;"); ovn_lflow_add_with_hint(lflows, od, stage, @@ -4988,6 +4996,21 @@ build_stateful(struct ovn_datapath *od, struct hmap *lflows) ovn_lflow_add(lflows, od, S_SWITCH_IN_STATEFUL, 0, "1", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_STATEFUL, 0, "1", "next;"); +/* If REGBIT_CONNTRACK_COMMIT is set as 1 and + * REGBIT_CONNTRACK_SET_LABEL is set to 1, then the packets should be + * committed to conntrack. + * We always set ct_mark.blocked to 0 here as + * any packet that makes it this far is part of a connection we + * want to allow to continue. */ +ovn_lflow_add(lflows, od, S_SWITCH_IN_STATEFUL, 150, + REGBIT_CONNTRACK_COMMIT" == 1 && " + ""REGBIT_CONNTRACK_SET_LABEL" == 1", + "ct_commit(ct_mark=0/1, ct_label=xxreg1); next;"); +ovn_lflow_add(lflows, od, S_SWITCH_OUT_STATEFUL, 150, + REGBIT_CONNTRACK_COMMIT" == 1 && " + ""REGBIT_CONNTRACK_SET_LABEL" == 1", + "ct_commit(ct_mark=0/1, ct_label=xxreg1); next;"); + /* If REGBIT_CONNTRACK_COMMIT is set as 1, then the packets should be * committed to conntrack. We always set ct.blocked to 0 here as * any packet that makes it this far is part of a connection we diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema index 084305b..4e8a279 100644 --- a/ovn-nb.ovsschema +++ b/ovn-nb.ovsschema @@ -1,7 +1,7 @@ { "name": "OVN_Northbound", -"version": "5.17.0", -"cksum": "1128988054 23237", +"version": "5.18.0", +"cksum": "1553853266 23311", "tables": { "NB_Global": { "columns": { @@ -177,6 +177,7 @@ "debug"]]}, "min": 0, "max": 1}}, "meter": {"type": {"key": "string", "min": 0, "max": 1}}, +"label": {"type": {"key": "string", "min": 0, "max": 1}}, "external_ids": { "type": {"key": "string", "value": "string", "min": 0, "max": "unlimited"}}}, diff --git a/ovn-nb.xml b/ovn-nb.xml index d8f3237..57faee4 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -1438,6 +1438,18 @@ default, log messages are not rate-limited. + + + +Associates an identifier with the ACL. +Same value will be written to corresponding connection +tracker entry. Value should be in hex, for example: 0x1234. +This value can help in debugging from connection tracker side, +
[ovs-dev] [PATCH v4 0/3] Associate identifier with OVN ACL connection tracking entry
I submitted this patch long time back and somehow lost track it. Resubmitting the series, calling it as V4, as it addresses the review comments given till v3. https://mail.openvswitch.org/pipermail/ovs-dev/2019-April/358280.html What: a. Goal is to be able to associate some identifier with a connection tracking entry. b. This identifier can be used to map OVN ACL which added this entry or higher level constructs like openstack security group etc. c. There are 2 connection tracking fields which can be used for it. ct.mark (32 bits) and ct.label (128 bits). d. Patch intends to use ct.label, as this is a longer field and hence would be put to a better use, if it stores the identifier. Why: a. Adding an identifier would help in debugging. b. Now, we can map a connection tracking entry to corresponding acl, security group etc. c. One of the use cases for this mapping would be to identify ACLs which added corresponding connection tracker entry, which is causing unexpected drops/leaks. How: Following is the sequence of changes: Patch 1: i. Current implementation uses a bit ct.label to handle policy update cases, where we use a bit in ct.label to indicate that reply traffic should be dropped now. ii. Swap the usage of ct.label in current implementation with ct.mark. Patch 2: i. Add support in parser to allow ct.label and mark to be set from registers as well (as of now only integer/masked integer is allowed). Patch 3: i. Add a new column (named 'label') to Table ACL in northbound schema. ii. ovn-northd changes to enhance logical flows to set ct.label to acl->label. For example: table=4 (ls_out_acl ), action=(reg0[1] = 1; reg0[3] = 1; xxreg1 = 0x1234; next;) . . . table=7 (ls_out_stateful), ... match=(reg0[1] == 1 && reg0[3] == 1), Ankur Sharma (3): OVN ACL: Replace the usage of ct_label with ct_mark OVN ACL: Allow ct_mark and ct_label values to be set from register as well OVN ACL: Allow a user to input ct.label value for an acl Documentation/tutorials/ovn-openstack.rst | 14 ++--- include/ovn/actions.h | 3 + lib/actions.c | 72 ++ lib/logical-fields.c | 3 + northd/ovn-northd.8.xml | 14 ++--- northd/ovn-northd.c | 87 +-- ovn-nb.ovsschema | 5 +- ovn-nb.xml| 12 ovn-sb.xml| 41 + tests/ovn-nbctl.at| 12 +++- tests/ovn.at | 99 +-- utilities/ovn-nbctl.c | 24 +++- 12 files changed, 310 insertions(+), 76 deletions(-) -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v4 1/3 ovn] OVN ACL: Replace the usage of ct_label with ct_mark
OVN ACL implementation used ct_label to indicate if a previosuly allowed connection should not be allowed anymore and vice versa. However, ct_label is a 128 bit value and we should rather leverage on ct_mark which is a 32 bit value. Using ct_mark for this purpose, allows us to use ct_label for storing other values like, identifier for corresponidng OVN ACL/Security group etc. Signed-off-by: Ankur Sharma --- Documentation/tutorials/ovn-openstack.rst | 14 - lib/logical-fields.c | 3 ++ northd/ovn-northd.8.xml | 14 - northd/ovn-northd.c | 50 --- tests/ovn.at | 11 +++ 5 files changed, 49 insertions(+), 43 deletions(-) diff --git a/Documentation/tutorials/ovn-openstack.rst b/Documentation/tutorials/ovn-openstack.rst index 3ef0523..5134406 100644 --- a/Documentation/tutorials/ovn-openstack.rst +++ b/Documentation/tutorials/ovn-openstack.rst @@ -60,7 +60,7 @@ packaging for developers, in a way that allows you to follow along with the tutorial in full. Unless you have a spare computer laying about, it's easiest to install -DevStacck in a virtual machine. This tutorial was built using a VM +DevStack in a virtual machine. This tutorial was built using a VM implemented by KVM and managed by virt-manager. I recommend configuring the VM configured for the x86-64 architecture, 6 GB RAM, 2 VCPUs, and a 20 GB virtual disk. @@ -1228,7 +1228,7 @@ as the output port:: ct_next(ct_state=est|trk /* default (use --ct to customize) */) --- - 6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (inport == "ap" && ip4), priority 2002, uuid a12b39f0 + 6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (inport == "ap" && ip4), priority 2002, uuid a12b39f0 next; 13. ls_in_l2_lkup (ovn-northd.c:3529): eth.dst == fa:16:3e:f6:e2:8f, priority 50, uuid c43ead31 outport = "17d870"; @@ -1297,7 +1297,7 @@ Finally the logical switch for ``n2`` runs through the same logic as ct_next(ct_state=est|trk /* default (use --ct to customize) */) --- - 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (outport == "cp" && ip4 && ip4.src == $as_ip4_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid a746fa0d + 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (outport == "cp" && ip4 && ip4.src == $as_ip4_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid a746fa0d next; 7. ls_out_port_sec_ip (ovn-northd.c:2364): outport == "cp" && eth.dst == fa:16:3e:89:f2:36 && ip4.dst == {255.255.255.255, 224.0.0.0/4, 10.1.2.7}, priority 90, uuid 4d9862b5 next; @@ -1537,7 +1537,7 @@ firewall and is output to ``d``:: ct_next(ct_state=est|trk /* default (use --ct to customize) */) --- - 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (outport == "dp" && ip4 && ip4.src == 0.0.0.0/0 && icmp4), priority 2002, uuid b860fc9f + 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (outport == "dp" && ip4 && ip4.src == 0.0.0.0/0 && icmp4), priority 2002, uuid b860fc9f next; 7. ls_out_port_sec_ip (ovn-northd.c:2364): outport == "dp" && eth.dst == fa:16:3e:c1:f5:a2 && ip4.dst == {255.255.255.255, 224.0.0.0/4, 10.0.0.6}, priority 90, uuid 15655a98 next; @@ -1649,7 +1649,7 @@ closely to those for IPv4 which we already discussed back under ct_next(ct_state=est|trk /* default (use --ct to customize) */) --- - 6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (inport == "ap" && ip6), priority 2002, uuid 7fdd607e + 6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (inport == "ap" && ip6), priority 2002, uuid 7fdd607e next; 13. ls_in_l2_lkup (ovn-northd.c:3529): eth.dst == fa:16:3e:ef:2f:8b, priority 50, uuid e1d87fc5 outport = "ad952e"; @@ -1707,7 +1707,7 @@ closely to those for IPv4 which we already discussed back under ct_next(ct_state=est|trk /* default (use --ct to customize) */) --- - 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (outport == "cp" && ip6 && ip6.src == $as_ip6_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid 12fc96f9 + 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (outport == "cp" && ip6 && ip6.src == $as_ip6_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 20
[ovs-dev] [PATCH v4 2/3 ovn] OVN ACL: Allow ct_mark and ct_label values to be set from register as well
OVN allows only an integer (or masked integer) to be assigned to ct_mark and ct_label. This patch, enhances the parser code to allow ct_mark and ct_label to be assigned from registers as well. Signed-off-by: Ankur Sharma --- include/ovn/actions.h | 3 +++ lib/actions.c | 72 --- ovn-sb.xml| 41 + tests/ovn.at | 31 ++ 4 files changed, 126 insertions(+), 21 deletions(-) diff --git a/include/ovn/actions.h b/include/ovn/actions.h index f4997e9..dda9a66 100644 --- a/include/ovn/actions.h +++ b/include/ovn/actions.h @@ -218,8 +218,11 @@ struct ovnact_ct_next { /* OVNACT_CT_COMMIT. */ struct ovnact_ct_commit { struct ovnact ovnact; +bool is_ct_mark_reg, is_ct_label_reg; /* If the value is from a register */ uint32_t ct_mark, ct_mark_mask; ovs_be128 ct_label, ct_label_mask; +enum mf_field_id ct_mark_reg, ct_label_reg; +uint16_t ct_mark_reg_bits, ct_label_reg_bits; }; /* OVNACT_CT_DNAT, OVNACT_CT_SNAT. */ diff --git a/lib/actions.c b/lib/actions.c index a999a4f..2b00469 100644 --- a/lib/actions.c +++ b/lib/actions.c @@ -634,8 +634,21 @@ parse_ct_commit_arg(struct action_context *ctx, } else if (ctx->lexer->token.type == LEX_T_MASKED_INTEGER) { cc->ct_mark = ntohll(ctx->lexer->token.value.integer); cc->ct_mark_mask = ntohll(ctx->lexer->token.mask.integer); +} else if (ctx->lexer->token.type == LEX_T_ID) { + +cc->ct_mark_mask = UINT32_MAX; + +const struct mf_field *mf = mf_from_name(ctx->lexer->token.s); +if (mf && mf_is_register(mf->id)) { +cc->is_ct_mark_reg = true; +cc->ct_mark_reg = mf->id; + } else { + lexer_syntax_error(ctx->lexer, "invalid field name: %s", + ctx->lexer->token.s); + return; + } } else { -lexer_syntax_error(ctx->lexer, "expecting integer"); +lexer_syntax_error(ctx->lexer, "invalid token type"); return; } lexer_get(ctx->lexer); @@ -649,9 +662,21 @@ parse_ct_commit_arg(struct action_context *ctx, } else if (ctx->lexer->token.type == LEX_T_MASKED_INTEGER) { cc->ct_label = ctx->lexer->token.value.be128_int; cc->ct_label_mask = ctx->lexer->token.mask.be128_int; +} else if (ctx->lexer->token.type == LEX_T_ID) { + + const struct mf_field *mf = mf_from_name(ctx->lexer->token.s); + if (mf && mf_is_register(mf->id)) { + cc->is_ct_label_reg = true; + cc->ct_label_reg = mf->id; + } else { + lexer_syntax_error(ctx->lexer, "invalid field name: %s", + ctx->lexer->token.s); + return; + } + } else { -lexer_syntax_error(ctx->lexer, "expecting integer"); -return; + lexer_syntax_error(ctx->lexer, "invalid token type"); + return; } lexer_get(ctx->lexer); } else { @@ -719,15 +744,42 @@ encode_CT_COMMIT(const struct ovnact_ct_commit *cc, size_t set_field_offset = ofpacts->size; ofpbuf_pull(ofpacts, set_field_offset); -if (cc->ct_mark_mask) { +if (cc->is_ct_mark_reg) { +struct ofpact_reg_move *move = ofpact_put_REG_MOVE(ofpacts); +const struct mf_field *src_reg = mf_from_id(cc->ct_mark_reg); +const struct mf_field *ct_mark = mf_from_id(MFF_CT_MARK); + +move->src.field = src_reg; +move->src.ofs = 0; +move->src.n_bits = src_reg->n_bits < ct_mark->n_bits ? + src_reg->n_bits : ct_mark->n_bits; +move->dst.field = mf_from_id(MFF_CT_MARK); +move->dst.ofs = 0; +move->dst.n_bits = src_reg->n_bits < ct_mark->n_bits ? + src_reg->n_bits : ct_mark->n_bits; +} else if (cc->ct_mark_mask) { const ovs_be32 value = htonl(cc->ct_mark); const ovs_be32 mask = htonl(cc->ct_mark_mask); -ofpact_put_set_field(ofpacts, mf_from_id(MFF_CT_MARK), &value, &mask); -} - -if (!ovs_be128_is_zero(cc->ct_label_mask)) { -ofpact_put_set_field(ofpacts, mf_from_id(MFF_CT_LABEL), &cc->ct_label, - &cc->ct_label_mask); +ofpact_put_set_field(ofpacts, mf_from_id(MFF_CT_MARK), &value, + &mask); +} + +if (cc->is_ct_label_reg) { +struct ofpact_reg_move *move = ofpact_put_REG_MOVE(ofpacts); +const struct mf_field *src_reg = mf_from_id(cc->ct_label_reg); +const struct mf_field *ct_label = mf_from_id(MFF_CT_LABEL); + +move->src.field = src_reg; +move->src.ofs = 0; +move->src.n_bits = src_reg->n_bits < ct_label->n_bits ? + src_reg->n_bits : ct_label->n_bits; +move->dst.field = ct_
Re: [ovs-dev] [PATCH v4 1/3 ovn] OVN ACL: Replace the usage of ct_label with ct_mark
Bleep bloop. Greetings Ankur Sharma, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line is 141 characters long (recommended limit is 79) #42 FILE: Documentation/tutorials/ovn-openstack.rst:1231: 6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (inport == "ap" && ip4), priority 2002, uuid a12b39f0 WARNING: Line is 202 characters long (recommended limit is 79) #51 FILE: Documentation/tutorials/ovn-openstack.rst:1300: 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (outport == "cp" && ip4 && ip4.src == $as_ip4_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid a746fa0d WARNING: Line is 176 characters long (recommended limit is 79) #60 FILE: Documentation/tutorials/ovn-openstack.rst:1540: 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (outport == "dp" && ip4 && ip4.src == 0.0.0.0/0 && icmp4), priority 2002, uuid b860fc9f WARNING: Line is 141 characters long (recommended limit is 79) #69 FILE: Documentation/tutorials/ovn-openstack.rst:1652: 6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (inport == "ap" && ip6), priority 2002, uuid 7fdd607e WARNING: Line is 202 characters long (recommended limit is 79) #78 FILE: Documentation/tutorials/ovn-openstack.rst:1710: 4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (outport == "cp" && ip6 && ip6.src == $as_ip6_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid 12fc96f9 WARNING: Line is 227 characters long (recommended limit is 79) #87 FILE: Documentation/tutorials/ovn-openstack.rst:1916: 6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked == 0 && (inport == "ap" && ip4 && ip4.dst == {255.255.255.255, 10.1.1.0/24} && udp && udp.src == 68 && udp.dst == 67), priority 2002, uuid 9c90245d Lines checked: 365, Warnings: 6, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 2/3 ovn] OVN ACL: Allow ct_mark and ct_label values to be set from register as well
Bleep bloop. Greetings Ankur Sharma, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line is 117 characters long (recommended limit is 79) #150 FILE: ovn-sb.xml:1248: ct_commit(ct_mark=(value[/mask] OR regX OR xregX OR xxregX)); WARNING: Line is 118 characters long (recommended limit is 79) #155 FILE: ovn-sb.xml:1253: ct_commit(ct_label=(value[/mask] OR regX OR xregX OR xxregX)); WARNING: Line is 116 characters long (recommended limit is 79) #160 FILE: ovn-sb.xml:1258: ct_commit(ct_mark=(value[/mask] OR regX OR xregX OR xxregX), WARNING: Line is 109 characters long (recommended limit is 79) #161 FILE: ovn-sb.xml:1259: ct_label=(value[/mask] OR regX OR xregX OR xxregX )); Lines checked: 236, Warnings: 4, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v3] dpif-netdev: log rxq assignment in isolated pmd
There is no log about isolated rxq assignment in a pmd today, which sometimes could be useful to trace rxq/pmd pinning, when debugging with log. Ovs-appctl dpif-netdev/pmd-rxq-show reports about it already, but logging is helpful to trace pinning in time. Changes: v3: correction on pmd unref and numa_id. Reported-at: https://bugzilla.redhat.com/1728616 Signed-off-by: Gowrishankar Muthukrishnan --- lib/dpif-netdev.c | 4 1 file changed, 4 insertions(+) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 4546b55e8..aeae66aea 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4572,6 +4572,10 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) OVS_REQUIRES(dp->port_mutex) } else { q->pmd = pmd; pmd->isolated = true; +VLOG_INFO("Core %d on numa node %d assigned port \'%s\' " + "rx queue %d.", pmd->core_id, pmd->numa_id, + netdev_rxq_get_name(q->rx), + netdev_rxq_get_queue_id(q->rx)); dp_netdev_pmd_unref(pmd); } } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) { -- 2.17.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn v6] ovn-northd: Limit ARP/ND broadcast domain whenever possible.
On Fri, Nov 8, 2019 at 6:38 AM Dumitru Ceara wrote: > > ARP request and ND NS packets for router owned IPs were being > flooded in the complete L2 domain (using the MC_FLOOD multicast group). > However this creates a scaling issue in scenarios where aggregation > logical switches are connected to more logical routers (~350). The > logical pipelines of all routers would have to be executed before the > packet is finally replied to by a single router, the owner of the IP > address. > > This commit limits the broadcast domain by bypassing the L2 Lookup stage > for ARP requests that will be replied by a single router. The packets > are forwarded only to the router port that owns the target IP address. > > IPs that are owned by the routers and for which this fix applies are: > - IP addresses configured on the router ports. > - VIPs. > - NAT IPs. > > This commit also fixes function get_router_load_balancer_ips() which > was incorrectly returning a single address_family even though the > IP set could contain a mix of IPv4 and IPv6 addresses. > > Reported-at: https://bugzilla.redhat.com/1756945 > Reported-by: Anil Venkata > Signed-off-by: Dumitru Ceara > > --- > v6: > - Address Han's comments: > - remove flooding of ARPs targeting OVN owned IP addresses. > - update ovn-architecture documentation. > - rename ARP handling functions. > - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to take into > account the new way of forwarding ARPs. > - Also, properly deal with ARP packets on VLAN-backed networks. I am confused by this additional VLAN related change. VLAN is just handled as bridged logical switch where a localnet port is used as *inport*. It seems to me no difference from regular localnet port no matter with or without VLAN tag. When an ARP request is going through the ingress pipeline of the bridged logical switch, the logic of bypassing the flooding added by this patch should just apply, right? It is also very common scenario that the *aggregation switch* for the routers is an external physical network backed by VLAN. I think the purpose of this patch is to make sure scenario scale. Did I misunderstand anything here? In addition, the below macro definition may be renamed to FLAGBIT_..., because it is for the bits of MFF_LOG_FLAGS, which is one of the pre-defined logical fields, instead of for the MFF_LOG_REG0-9 registers. > > +#define REGBIT_NOT_VXLAN "flags[1] == 0" > +#define REGBIT_NOT_VLAN "flags[7] == 0" > + The other part looks good to me. Thanks for simply the patch. Han ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds
On Fri, Nov 8, 2019 at 6:09 PM Ilya Maximets wrote: > > On 08.11.2019 13:12, Numan Siddique wrote: > > Hi Ilya, > > Hi Numan, > > Comments inline. > > Best regards, Ilya Maximets. > > > > > If you get some time, could you please take a look at this patch ? > > > > Thanks > > Numan > > > > > > On Fri, Nov 8, 2019 at 4:13 PM wrote: > >> > >> From: Numan Siddique > >> > >> The below failure is seen > >> > >> > >> checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3 > >> checking where Python six library is available... configure: error: > >> Missing Python six library. > >> The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1. > >> > >> > >> This patch fixes it. > >> > >> Signed-off-by: Numan Siddique > >> --- > >> .travis/osx-prepare.sh | 6 ++ > >> 1 file changed, 2 insertions(+), 4 deletions(-) > >> > >> diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh > >> index 4725fd829..7f639a62e 100755 > >> --- a/.travis/osx-prepare.sh > >> +++ b/.travis/osx-prepare.sh > >> @@ -1,7 +1,5 @@ > >> #!/bin/bash > >> set -ev > >> -pip2 install --user six > >> -pip2 install --user --upgrade docutils > >> +pip3 install --user six > >> +pip3 install --user --upgrade docutils > > This is required because OVS requires python3 now. > This might make sense to point to that fact in commit-message. > > >> > > This empty line should be removed too as it is the last in a file. > > >> -brew update || true > >> -brew uninstall libtool && brew install libtool || true > > This part is just an optimization. Doesn't really related to this fix. > Some details in a corresponding OVS commit: > 3bfc9c1c30d5 ("travis: Drop OSX workarounds.") > > I'm OK with the change in general. It might be better to mention > in commit message why libtool update was also removed. > > Otherwise, > Acked-by: Ilya Maximets Hi Ilya, Thanks for the review. I applied the below modified as per your suggestions >From 9ec3bed0d4ec62f08835ac1b04b1c65f332934bf Mon Sep 17 00:00:00 2001 From: Numan Siddique Date: Fri, 8 Nov 2019 00:46:18 +0530 Subject: [PATCH] travis: Fix CI failure for osx builds The below failure is seen checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3 checking where Python six library is available... configure: error: Missing Python six library. The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1. This is because, OVS requires python3 now. This patch fixes it by installing six library using pip3. This patch also optimizes the build by removing the libtool update. As per this commit [1] in OVS, it is no longer required and it would save 4-6 minutes of build time. [1] - 3bfc9c1c30d5 ("travis: Drop OSX workarounds.") Acked-by: Ilya Maximets Signed-off-by: Numan Siddique --- .travis/osx-prepare.sh | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh index 4725fd829..78d5bb579 100755 --- a/.travis/osx-prepare.sh +++ b/.travis/osx-prepare.sh @@ -1,7 +1,4 @@ #!/bin/bash set -ev -pip2 install --user six -pip2 install --user --upgrade docutils - -brew update || true -brew uninstall libtool && brew install libtool || true +pip3 install --user six +pip3 install --user --upgrade docutils -- 2.23.0 * > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev