Re: [ovs-dev] [PATCH v2 2/2] ovn: Support address sets generated from port groups
On Fri, Apr 13, 2018 at 2:40 PM, Han Zhouwrote: > > > > On Fri, Apr 13, 2018 at 1:42 PM, Ben Pfaff wrote: > > > > On Fri, Apr 13, 2018 at 01:33:26PM -0700, Han Zhou wrote: > > > On Fri, Apr 13, 2018 at 12:54 PM, Ben Pfaff wrote: > > > > I think that sync_address_sets() is O(n**2) in n_ipv4_addrs and > > > > n_ipv6_addrs, because of the allocation strategy. If port groups get > > > > big (and they will eventually even if they do not today, because scale) > > > > I think that it would be better to expand the allocations exponentially. > > > > > > Sorry that I didn't find it a problem. For each group, there are i ports, > > > and for each port there are j address groups (e.g. mac ip1 ip2 ip3 ...), > > > and for each address group there are k addresses (some of them can be ipv4 > > > and some can be ipv6). > > > Although there are nested loops, this implementation iterates each ipv4 and > > > ipv6 address only once, so if the total number of ipv4 and ipv6 addresses > > > are n, then it should be O(n). Please correct me if I missed something here. > > > > I agree that the function visits and inserts each ipv4 and ipv6 address > > once, which is O(n). The issue is that xrealloc() has to copy all of > > the addresses each time. Suppose, for example, that there are 50 > > addresses in an address set, 1 per port. This yields a sequence of 50 > > xrealloc() calls that copy 0, 1, 2, ..., 49 addresses, respectively, > > which is O(n**2) copies. The usual solution is to expand the array by > > (roughly) doubling, so that there are only 0 + 1 + 2 + 4 + 8 + 16 + 32 > > copies total, which is O(n) copies. > > Now I got your point! I will send V3 with this addressed. Thanks Ben! Hi Ben, V3 is here: https://patchwork.ozlabs.org/patch/898123/ All comments are addressed except that I kept the xcalloc() there as initial allocation for the exponential allocations. Please take a look. Thanks, Han ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v3] ovn: Support address sets generated from port groups
Address sets are automatically generated from corresponding port groups, and can be used directly in ACL match conditions. There are two address sets generated for each port group: _ip4 _ip6 For example, if port_group1 is created, we can directly use below match condition in ACL: "outport == @port_group1 && ip4.src == $port_group1_ip4" This will simplify OVN client implementation, and avoid some tricky problems such as race conditions when maintaining address set memberships as discussed in the link below. Reported-by: Lucas Alvares GomesReported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-February/046174.html Reviewed-by: Mark Michelson Reviewed-by: Daniel Alvarez Signed-off-by: Han Zhou --- Notes: v2->v3: - fix memory leak - reduce amount of xreallocs - xcalloc(1, ...) was kept there as initial allocation for the exponential allocation strategy NEWS| 3 +- ovn/northd/ovn-northd.c | 99 ++--- ovn/ovn-nb.xml | 18 ovn/ovn-sb.xml | 23 - tests/ovn.at| 226 5 files changed, 352 insertions(+), 17 deletions(-) diff --git a/NEWS b/NEWS index 58a7b58..77c2b43 100644 --- a/NEWS +++ b/NEWS @@ -20,7 +20,8 @@ Post-v2.9.0 * implemented icmp4/icmp6/tcp_reset actions in order to drop the packet and reply with a RST for TCP or ICMPv4/ICMPv6 unreachable message for other IPv4/IPv6-based protocols whenever a reject ACL rule is hit. - * ACL match conditions can now match on Port_Groups. + * ACL match conditions can now match on Port_Groups as well as address + sets that are automatically generated by Port_Groups. v2.9.0 - 19 Feb 2018 diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 607fc75..b55f5f8 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -6179,9 +6179,32 @@ build_lflows(struct northd_context *ctx, struct hmap *datapaths, hmap_destroy(); } -/* OVN_Northbound and OVN_Southbound have an identical Address_Set table. - * We always update OVN_Southbound to match the current data in - * OVN_Northbound, so that the address sets used in Logical_Flows in +static void +sync_address_set(struct northd_context *ctx, const char *name, + const char **addrs, size_t n_addrs, + struct shash *sb_address_sets) +{ +const struct sbrec_address_set *sb_address_set; +sb_address_set = shash_find_and_delete(sb_address_sets, + name); +if (!sb_address_set) { +sb_address_set = sbrec_address_set_insert(ctx->ovnsb_txn); +sbrec_address_set_set_name(sb_address_set, name); +} + +sbrec_address_set_set_addresses(sb_address_set, +addrs, n_addrs); +} + +/* OVN_Southbound Address_Set table contains same records as in north + * bound, plus the records generated from Port_Group table in north bound. + * + * There are 2 records generated from each port group, one for IPv4, and + * one for IPv6, named in the format: _ip4 and + * _ip6 respectively. MAC addresses are ignored. + * + * We always update OVN_Southbound to match the Address_Set and Port_Group + * in OVN_Northbound, so that the address sets used in Logical_Flows in * OVN_Southbound is checked against the proper set.*/ static void sync_address_sets(struct northd_context *ctx) @@ -6193,19 +6216,67 @@ sync_address_sets(struct northd_context *ctx) shash_add(_address_sets, sb_address_set->name, sb_address_set); } -const struct nbrec_address_set *nb_address_set; -NBREC_ADDRESS_SET_FOR_EACH (nb_address_set, ctx->ovnnb_idl) { -sb_address_set = shash_find_and_delete(_address_sets, - nb_address_set->name); -if (!sb_address_set) { -sb_address_set = sbrec_address_set_insert(ctx->ovnsb_txn); -sbrec_address_set_set_name(sb_address_set, nb_address_set->name); +/* sync port group generated address sets first */ +const struct nbrec_port_group *nb_port_group; +NBREC_PORT_GROUP_FOR_EACH (nb_port_group, ctx->ovnnb_idl) { +char **ipv4_addrs = xcalloc(1, sizeof *ipv4_addrs); +size_t n_ipv4_addrs = 0; +size_t n_ipv4_addrs_buf = 1; +char **ipv6_addrs = xcalloc(1, sizeof *ipv6_addrs); +size_t n_ipv6_addrs = 0; +size_t n_ipv6_addrs_buf = 1; +for (size_t i = 0; i < nb_port_group->n_ports; i++) { +for (size_t j = 0; j < nb_port_group->ports[i]->n_addresses; j++) { +struct lport_addresses laddrs; +extract_lsp_addresses(nb_port_group->ports[i]->addresses[j], + ); +while (n_ipv4_addrs_buf < n_ipv4_addrs + laddrs.n_ipv4_addrs) { +
Re: [ovs-dev] [PATCH v2 2/2] ovn: Support address sets generated from port groups
On Fri, Apr 13, 2018 at 1:42 PM, Ben Pfaffwrote: > > On Fri, Apr 13, 2018 at 01:33:26PM -0700, Han Zhou wrote: > > On Fri, Apr 13, 2018 at 12:54 PM, Ben Pfaff wrote: > > > I think that sync_address_sets() is O(n**2) in n_ipv4_addrs and > > > n_ipv6_addrs, because of the allocation strategy. If port groups get > > > big (and they will eventually even if they do not today, because scale) > > > I think that it would be better to expand the allocations exponentially. > > > > Sorry that I didn't find it a problem. For each group, there are i ports, > > and for each port there are j address groups (e.g. mac ip1 ip2 ip3 ...), > > and for each address group there are k addresses (some of them can be ipv4 > > and some can be ipv6). > > Although there are nested loops, this implementation iterates each ipv4 and > > ipv6 address only once, so if the total number of ipv4 and ipv6 addresses > > are n, then it should be O(n). Please correct me if I missed something here. > > I agree that the function visits and inserts each ipv4 and ipv6 address > once, which is O(n). The issue is that xrealloc() has to copy all of > the addresses each time. Suppose, for example, that there are 50 > addresses in an address set, 1 per port. This yields a sequence of 50 > xrealloc() calls that copy 0, 1, 2, ..., 49 addresses, respectively, > which is O(n**2) copies. The usual solution is to expand the array by > (roughly) doubling, so that there are only 0 + 1 + 2 + 4 + 8 + 16 + 32 > copies total, which is O(n) copies. Now I got your point! I will send V3 with this addressed. Thanks Ben! > > Thanks, > > Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] rhel/systemd: Prevent deletion of runtime directory.
On 4 April 2018 at 08:13, Aaron Conolewrote: > Gurucharan Shetty writes: > > > Currently, when we do a 'service openvswitch stop', > > '/var/run/openvswitch' gets deleted. This is a problem > > if you have other users (like OVN) using the same > > runtime directory since we delete all the files > > related to ovsdb-server backing OVN's databases. > > > > This commit fixes it by removing the runtime directory > > information from the systemd unit file. > > > > CC: acon...@redhat.com > > Signed-off-by: Gurucharan Shetty > > --- > > I don't know whether there are other drawbacks of removing > > 'RuntimeDirectory' > > Hi Guru, > > I noticed that if I use 'systemctl restart openvswitch' with the > ovn-northd service running, I get the following output: > > 11:06:40 aconole {master} ~/git/ovs$ ls -lah /var/run/openvswitch/ > total 24K > drwxr-xr-x. 2 openvswitch openvswitch 360 Apr 4 11:06 . > drwxr-xr-x. 44 rootroot1.3K Apr 4 11:06 .. > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:06 br0.mgmt > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:06 br0.snoop > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:06 db.sock > srwxr-x---. 1 rootroot 0 Apr 4 11:06 ovnnb_db.ctl > -rw-r--r--. 1 rootroot 6 Apr 4 11:06 ovnnb_db.pid > srwxr-x---. 1 rootroot 0 Apr 4 11:06 ovnnb_db.sock > srwxr-x---. 1 rootroot 0 Apr 4 11:06 > ovn-northd.30673.ctl > -rw-r--r--. 1 rootroot 6 Apr 4 11:06 ovn-northd.pid > srwxr-x---. 1 rootroot 0 Apr 4 11:06 ovnsb_db.ctl > -rw-r--r--. 1 rootroot 6 Apr 4 11:06 ovnsb_db.pid > srwxr-x---. 1 rootroot 0 Apr 4 11:06 ovnsb_db.sock > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:06 > ovsdb-server.30569.ctl > -rw-r--r--. 1 openvswitch openvswitch6 Apr 4 11:06 ovsdb-server.pid > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:06 > ovs-vswitchd.30612.ctl > -rw-r--r--. 1 openvswitch openvswitch6 Apr 4 11:06 ovs-vswitchd.pid > -rw-r--r--. 1 rootroot 43 Apr 4 11:06 useropts > 11:06:47 aconole {master} ~/git/ovs$ ps aux | grep ovn-northd\ > root 30673 0.0 0.0 27544 2952 ?S< 11:06 0:00 > ovn-northd -vconsole:emer -vsyslog:err -vfile:info > --ovnnb-db=unix:/run/openvswitch/ovnnb_db.sock > --ovnsb-db=unix:/run/openvswitch/ovnsb_db.sock --no-chdir > --log-file=/var/log/openvswitch/ovn-northd.log > --pidfile=/run/openvswitch/ovn-northd.pid > --detach --monitor > aconole 30730 0.0 0.0 119532 1048 pts/0S+ 11:07 0:00 grep > --color=auto northd > 11:09:22 aconole {master} ~/git/ovs$ sudo systemctl restart > openvswitch.service 11:09:24 aconole {master} ~/git/ovs$ ls -lah > /var/run/openvswitch/ > total 24K > drwxr-xr-x. 2 openvswitch openvswitch 360 Apr 4 11:09 . > drwxr-xr-x. 44 rootroot1.3K Apr 4 11:09 .. > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:09 br0.mgmt > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:09 br0.snoop > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:09 db.sock > srwxr-x---. 1 rootroot 0 Apr 4 11:09 ovnnb_db.ctl > -rw-r--r--. 1 rootroot 6 Apr 4 11:09 ovnnb_db.pid > srwxr-x---. 1 rootroot 0 Apr 4 11:09 ovnnb_db.sock > srwxr-x---. 1 rootroot 0 Apr 4 11:09 > ovn-northd.31264.ctl > -rw-r--r--. 1 rootroot 6 Apr 4 11:09 ovn-northd.pid > srwxr-x---. 1 rootroot 0 Apr 4 11:09 ovnsb_db.ctl > -rw-r--r--. 1 rootroot 6 Apr 4 11:09 ovnsb_db.pid > srwxr-x---. 1 rootroot 0 Apr 4 11:09 ovnsb_db.sock > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:09 > ovsdb-server.31160.ctl > -rw-r--r--. 1 openvswitch openvswitch6 Apr 4 11:09 ovsdb-server.pid > srwxr-x---. 1 openvswitch openvswitch0 Apr 4 11:09 > ovs-vswitchd.31203.ctl > -rw-r--r--. 1 openvswitch openvswitch6 Apr 4 11:09 ovs-vswitchd.pid > -rw-r--r--. 1 rootroot 43 Apr 4 11:09 useropts > > This looks like both the ovn-northd and openvswitch services are > restarting in concert (which is what I'd expect to happen due to the > "Requires=" and "After=" stanzas). > > Is there a set of reproduction steps that helps demonstrates the issue? > > NOTE: I used 'sudo systemctl start ovn-northd' to start the northd > service. Also note, this behavior seems to happen regardless of > "RuntimeDirectory" setting. > In my case, I was using debian packages built by a university in newzealand - https://packages.wand.net.nz/ They seem to have copied over the rhel systemd file over - but only for OVS (not for OVN) and hence the difference. Nevertheless, it looks like a big hammer to restart ovn-northd too when you restart OVS.
Re: [ovs-dev] [PATCH] ovn: Recirculate packets after a unSNAT.
On 4 April 2018 at 14:47, Ben Pfaffwrote: > On Mon, Mar 26, 2018 at 02:18:28PM -0700, Gurucharan Shetty wrote: > > commit f6fabcc6245 (ofproto-dpif: Mark packets as "untracked" > > after call to ct().) changed the behavior after a call to ct(). > > The +trk bit would automatically be unset if packet is sent to > > ct() and not forked. This caused a bug in the OVN gateway > > pipeline when there is SNAT rule as well as load-balancing rule. > > > > In the OVN gateway pipeline for the gateway router, we had an > > optimization where the packets sent to unSNAT need not go through > > a recirculation. But since doing this now means that the +trk bit > > gets unset, the DNAT rules for load-balancing a new packet in the next > > table won't get hit. > > > > This commit removes the optimization for unSNAT packets so that > > there is always a recirculation. > > > > Signed-off-by: Gurucharan Shetty > > Acked-by: Ben Pfaff > Thanks. I applied this to master and 2.9 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: native tunnel using valid tun_src specified by flow or port options
OK. I think I understand the patch now. It makes OVS native tunneling honor tunnel-specified source addresses, in the same way that Linux kernel tunneling honors them. I see one potential problem: it makes ovs-router.c #include . This is a problem because ovs-router.c is used on Windows and BSD as well as on Linux, and those systems do not have rtnetlink.h or RTN_LOCAL. Would you please change the patch to avoid this problem? Thank you for the contribution. It will make OVS native tunneling better. On Sat, Feb 03, 2018 at 08:47:41AM +0800, wenxu wrote: > Hi ben, > > > This patch can be a bugfix. > The tunnel_src of packet maybe not the IP address set by gre port > options:local_ip > dpdk-br has two address 10.1.1.7/24 and 10.1.1.254/32 > Interface gre options:local_ip="10.1.1.254" > But the tunnel_src of tunnel packet is always 10.1.1.7 > > > Also It can be a new feature. > In the same case the kernel space Open vSwitch can send packet with > tunnel_src 10.1.1.254. > > > Why we need this? > In the High-availability cluster > server1 with IP 10.1.1.7/24 and a virtual IP 10.1.1.254 > server2 with IP 10.1.1.8/24 and a virtual IP 10.1.1.254 > > > 10.1.1.7 and 10.1.1.8 running for the bgp and 10.1.1.254 provide the > dataplane forward > > > > > At 2018-02-03 06:30:51, "Ben Pfaff"wrote: > >Thank you for submitting a patch to improve Open vSwitch. I do not yet > >understand the purpose of this patch. Is it a bug fix or a new feature? > > > >Thanks, > > > >Ben. > > > >On Fri, Feb 02, 2018 at 02:08:52PM +0800, we...@ucloud.cn wrote: > >> From: wenxu > >> > >> native tunnel build tunnel with tun_src only from the route src and > >> not care about the options:local_ip. > >> Sometimes an virtual IP using for tun_src > >> dpdk-br: > >> inet 10.1.1.7/24 brd 10.1.1.255 scope global dpdk-br > >> inet 10.1.1.254/32 scope global dpdk-br > >> > >> Interface: gre options: {key=flow, local_ip="10.1.1.254", remote_ip=flow} > >> > >> the native tunnel always using 10.1.1.7 as the tunnel_src but not > >> 10.1.1.254. > >> > >> This patch made valid tun_src specified by flow-action or gre port options > >> can be used for tunnel_src of packet. It stores the rtm_type for each route > >> and improve the priority RTN_LOCAL type(higher then userdef route). > >> Like the kernel space when lookup the route, if there are tun_src specified > >> by flow-action or port options. Check the tun_src wheather is a local > >> address, then lookup the route. > >> > >> Signed-off-by: wenxu > >> Signed-off-by: frank.zeng > >> --- > >> lib/ovs-router.c | 38 +++--- > >> lib/ovs-router.h |2 +- > >> lib/route-table.c| 10 -- > >> ofproto/ofproto-dpif-sflow.c |2 +- > >> ofproto/ofproto-dpif-xlate.c |4 > >> 5 files changed, 45 insertions(+), 11 deletions(-) > >> > >> diff --git a/lib/ovs-router.c b/lib/ovs-router.c > >> index 0f1103b..e1375a3 100644 > >> --- a/lib/ovs-router.c > >> +++ b/lib/ovs-router.c > >> @@ -29,6 +29,7 @@ > >> #include > >> #include > >> #include > >> +#include > >> > >> #include "classifier.h" > >> #include "command-line.h" > >> @@ -61,6 +62,7 @@ struct ovs_router_entry { > >> struct in6_addr nw_addr; > >> struct in6_addr src_addr; > >> uint8_t plen; > >> +uint8_t rtm_type; > >> uint8_t priority; > >> uint32_t mark; > >> }; > >> @@ -97,13 +99,28 @@ ovs_router_lookup(uint32_t mark, const struct in6_addr > >> *ip6_dst, > >> const struct cls_rule *cr; > >> struct flow flow = {.ipv6_dst = *ip6_dst, .pkt_mark = mark}; > >> > >> +if (src && ipv6_addr_is_set(src)) { > >> +const struct cls_rule *cr_src; > >> +struct flow flow_src = {.ipv6_dst = *src, .pkt_mark = mark}; > >> + > >> +cr_src = classifier_lookup(, OVS_VERSION_MAX, _src, > >> NULL); > >> +if (cr_src) { > >> +struct ovs_router_entry *p_src = > >> ovs_router_entry_cast(cr_src); > >> +if (p_src->rtm_type != RTN_LOCAL) { > >> +return false; > >> +} > >> +} else { > >> +return false; > >> +} > >> +} > >> + > >> cr = classifier_lookup(, OVS_VERSION_MAX, , NULL); > >> if (cr) { > >> struct ovs_router_entry *p = ovs_router_entry_cast(cr); > >> > >> ovs_strlcpy(output_bridge, p->output_bridge, IFNAMSIZ); > >> *gw = p->gw; > >> -if (src) { > >> +if (src && !ipv6_addr_is_set(src)) { > >> *src = p->src_addr; > >> } > >> return true; > >> @@ -184,7 +201,7 @@ out: > >> } > >> > >> static int > >> -ovs_router_insert__(uint32_t mark, uint8_t priority, > >> +ovs_router_insert__(uint32_t mark, uint8_t priority, uint8_t rtm_type, > >> const struct in6_addr *ip6_dst, > >> uint8_t
Re: [ovs-dev] [PATCH v2 2/2] ovn: Support address sets generated from port groups
On Fri, Apr 13, 2018 at 01:33:26PM -0700, Han Zhou wrote: > On Fri, Apr 13, 2018 at 12:54 PM, Ben Pfaffwrote: > > I think that sync_address_sets() is O(n**2) in n_ipv4_addrs and > > n_ipv6_addrs, because of the allocation strategy. If port groups get > > big (and they will eventually even if they do not today, because scale) > > I think that it would be better to expand the allocations exponentially. > > Sorry that I didn't find it a problem. For each group, there are i ports, > and for each port there are j address groups (e.g. mac ip1 ip2 ip3 ...), > and for each address group there are k addresses (some of them can be ipv4 > and some can be ipv6). > Although there are nested loops, this implementation iterates each ipv4 and > ipv6 address only once, so if the total number of ipv4 and ipv6 addresses > are n, then it should be O(n). Please correct me if I missed something here. I agree that the function visits and inserts each ipv4 and ipv6 address once, which is O(n). The issue is that xrealloc() has to copy all of the addresses each time. Suppose, for example, that there are 50 addresses in an address set, 1 per port. This yields a sequence of 50 xrealloc() calls that copy 0, 1, 2, ..., 49 addresses, respectively, which is O(n**2) copies. The usual solution is to expand the array by (roughly) doubling, so that there are only 0 + 1 + 2 + 4 + 8 + 16 + 32 copies total, which is O(n) copies. Thanks, Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 2/2] ovn: Support address sets generated from port groups
On Fri, Apr 13, 2018 at 12:54 PM, Ben Pfaffwrote: > > On Wed, Apr 04, 2018 at 05:51:48PM -0700, Han Zhou wrote: > > Address sets are automatically generated from corresponding port > > groups, and can be used directly in ACL match conditions. > > Thanks! > > I think that sync_address_sets() has a memory leak, because I don't see > any free() calls that match up with the xstrdup() calls. (I'm not sure > that the xstrdup() calls are actually needed.) Oops! Yes, before freeing the ipv4_addrs and ipv6_addrs, I forgot to free the strings in them. I will fix with V3. The xstrdup() calls are still needed, since the strings there are from char arrays (rather than dynamic char *) in the local variable laddrs, which is destroyed in every iteration. > > I don't think the xcalloc() calls are needed. I think you can just > initialize ipv4_addrs and ipv6_addrs to NULL. Sure! > > I think that sync_address_sets() is O(n**2) in n_ipv4_addrs and > n_ipv6_addrs, because of the allocation strategy. If port groups get > big (and they will eventually even if they do not today, because scale) > I think that it would be better to expand the allocations exponentially. Sorry that I didn't find it a problem. For each group, there are i ports, and for each port there are j address groups (e.g. mac ip1 ip2 ip3 ...), and for each address group there are k addresses (some of them can be ipv4 and some can be ipv6). Although there are nested loops, this implementation iterates each ipv4 and ipv6 address only once, so if the total number of ipv4 and ipv6 addresses are n, then it should be O(n). Please correct me if I missed something here. Thanks for the review! Han ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 2/2] ovn: Support address sets generated from port groups
On Wed, Apr 04, 2018 at 05:51:48PM -0700, Han Zhou wrote: > Address sets are automatically generated from corresponding port > groups, and can be used directly in ACL match conditions. Thanks! I think that sync_address_sets() has a memory leak, because I don't see any free() calls that match up with the xstrdup() calls. (I'm not sure that the xstrdup() calls are actually needed.) I don't think the xcalloc() calls are needed. I think you can just initialize ipv4_addrs and ipv6_addrs to NULL. I think that sync_address_sets() is O(n**2) in n_ipv4_addrs and n_ipv6_addrs, because of the allocation strategy. If port groups get big (and they will eventually even if they do not today, because scale) I think that it would be better to expand the allocations exponentially. Thanks, Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/2] ovn: Support port groups in ACLs
On Wed, Apr 04, 2018 at 05:51:47PM -0700, Han Zhou wrote: > This patch enables using port group names in ACL match conditions. > Users can create a port group in northbound DB Port_Group table, > and then use the name of the port group in ACL match conditions > for "inport" or "outport". It can help reduce the number of ACLs > for CMS clients such as OpenStack Neutron, for the use cases > where a group of logical ports share same ACL rules except the > "inport"/"outport" part. Without this patch, the clients have to > create N (N = number of lports) ACLs, and this patch helps achieve > the same goal with only one ACL. E.g.: Thanks Han and others. I applied this to master. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 09/11] ofproto-dpif-slow: Add IPv6 agent address support.
Thanks for taking a look! On Fri, Apr 13, 2018 at 12:27:40PM -0700, Neil McKee wrote: > Looks ok to me. Will this be accessible via ovs-vsctl? Yes, it should work the same as before. > If so, hsflowd will be able to tell ovs to use the same agent-address > as the one it already settled on. Instead of just specifying the > device and hoping it works out: > https://github.com/sflow/host-sflow/blob/v2.0.15/src/Linux/mod_ovs.c#L258 > > So that's another use-case for this new option. Even better, great. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 09/11] ofproto-dpif-slow: Add IPv6 agent address support.
Looks ok to me. Will this be accessible via ovs-vsctl? If so, hsflowd will be able to tell ovs to use the same agent-address as the one it already settled on. Instead of just specifying the device and hoping it works out: https://github.com/sflow/host-sflow/blob/v2.0.15/src/Linux/mod_ovs.c#L258 So that's another use-case for this new option. On Fri, Apr 13, 2018 at 10:26 AM, Ben Pfaffwrote: > Suggested-by: Neil McKee > Signed-off-by: Ben Pfaff > --- > ofproto/ofproto-dpif-sflow.c | 55 > ++-- > 1 file changed, 33 insertions(+), 22 deletions(-) > > diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c > index 5d8c0e19f8e3..fe79a9dbbad6 100644 > --- a/ofproto/ofproto-dpif-sflow.c > +++ b/ofproto/ofproto-dpif-sflow.c > @@ -446,43 +446,45 @@ sflow_choose_agent_address(const char *agent_device, > const char *control_ip, > SFLAddress *agent_addr) > { > -const char *target; > -struct in_addr in4; > - > -memset(agent_addr, 0, sizeof *agent_addr); > -agent_addr->type = SFLADDRESSTYPE_IP_V4; > +struct in6_addr ip; > > if (agent_device) { > -if (!netdev_get_in4_by_name(agent_device, ) > -|| !lookup_ip(agent_device, )) { > +/* If 'agent_device' is the name of a network device, use its IP > + * address. */ > +if (!netdev_get_ip_by_name(agent_device, )) { > +goto success; > +} > + > +/* If 'agent_device' is itself an IP address, use it. */ > +struct sockaddr_storage ss; > +if (inet_parse_address(agent_device, )) { > +ip = ss_get_address(); > goto success; > } > } > > +/* Otherwise, use an appropriate local IP address for one of the > + * collectors' remote IP addresses. */ > +const char *target; > SSET_FOR_EACH (target, targets) { > -union { > -struct sockaddr_storage ss; > -struct sockaddr_in sin; > -} sa; > -char name[IFNAMSIZ]; > - > -if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT, ) > -&& sa.ss.ss_family == AF_INET) { > -struct in6_addr addr6, src, gw; > - > -in6_addr_set_mapped_ipv4(, sa.sin.sin_addr.s_addr); > +struct sockaddr_storage ss; > +if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT, )) { > /* sFlow only supports target in default routing table with > * packet mark zero. > */ > -if (ovs_router_lookup(0, , name, , )) { > +ip = ss_get_address(); > > -in4.s_addr = in6_addr_get_mapped_ipv4(); > +struct in6_addr src, gw; > +char name[IFNAMSIZ]; > +if (ovs_router_lookup(0, , name, , )) { > goto success; > } > } > } > > -if (control_ip && !lookup_ip(control_ip, )) { > +struct sockaddr_storage ss; > +if (control_ip && inet_parse_address(control_ip, )) { > +ip = ss_get_address(); > goto success; > } > > @@ -490,7 +492,16 @@ sflow_choose_agent_address(const char *agent_device, > return false; > > success: > -agent_addr->address.ip_v4.addr = (OVS_FORCE uint32_t) in4.s_addr; > +memset(agent_addr, 0, sizeof *agent_addr); > +if (IN6_IS_ADDR_V4MAPPED()) { > +agent_addr->type = SFLADDRESSTYPE_IP_V4; > +agent_addr->address.ip_v4.addr > += (OVS_FORCE uint32_t) in6_addr_get_mapped_ipv4(); > +} else { > +agent_addr->type = SFLADDRESSTYPE_IP_V6; > +memcpy(agent_addr->address.ip_v6.addr, ip.s6_addr, > + sizeof agent_addr->address.ip_v6.addr); > +} > return true; > } > > -- > 2.16.1 > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] Edit Open vSwitch license info so that GitHub recognizes it.
Ben Pfaffwrites: > From: Andrea Kao > > GitHub uses a library called Licensee to identify a project's license > type. It shows this information in the status bar and via the API if it > can unambiguously identify the license. > > This commit creates a LICENSE file that stores the full text of the > Apache license. It also removes the COPYING file and transfers its > contents to a new "License" section in the README. > > Collectively, these changes allow Licensee to successfully identify the > license type of Open vSwitch's codebase as Apache. > > Submitted-at: https://github.com/openvswitch/ovs/pull/224 > Signed-off-by: Andrea Kao > [b...@ovn.org removed references to COPYING and updated Makefile.am] > Signed-off-by: Ben Pfaff > --- According to the Licensee[1] code, it should be good enough to put the exact text of the Apache 2.0 license into the COPYING file. Then again COPYING is a GNU-ism, afaik. So either way: Acked-by: Aaron Conole As a side note, as I start editing files, I'll add SPDX identifiers if others think it makes sense. [1]: https://github.com/benbalter/licensee/blob/master/lib/licensee/project_files/license_file.rb ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/1] netdev-dpdk: fix RX jumbo for NICs not supporting scatter
On 04/13/2018 04:20 PM, Stokes, Ian wrote: >> Currently to RX jumbo packets fails for NICs not supporting scatter. >> Scatter is not strictly needed for jumbo support on RX. This change fixes >> the issue by only enabling scatter for NICs supporting it. Add a quirk for >> "igb" while the PMD is fixed to advertise scatter. >> > > Thanks for the v2 Pablo. > > Adding Eelco and Kevin as they had some comments on the v1. > > FYI I'm investigating on the DPDK side to see how/when the flag should be set > and used for igb and ixgbe as well as other drivers. > > https://dpdk.org/ml/archives/dev/2018-April/097056.html > >> Reported-by: Louis Peens>> Signed-off-by: Pablo Cascón >> Reviewed-by: Simon Horman >> --- >> lib/netdev-dpdk.c | 10 +- >> 1 file changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index ee39cbe..8f6a0a3 >> 100644 >> --- a/lib/netdev-dpdk.c >> +++ b/lib/netdev-dpdk.c >> @@ -694,11 +694,19 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, >> int n_rxq, int n_txq) >> int diag = 0; >> int i; >> struct rte_eth_conf conf = port_conf; >> +struct rte_eth_dev_info info; >> >> /* For some NICs (e.g. Niantic), scatter_rx mode needs to be >> explicitly >> * enabled. */ >> if (dev->mtu > ETHER_MTU) { >> -conf.rxmode.enable_scatter = 1; >> +rte_eth_dev_info_get(dev->port_id, ); >> +if (info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) { >> +conf.rxmode.enable_scatter = 1; >> +} else if (!strcmp(info.driver_name, "igb")) { >> +/* Quirk: as of DPDK 17.11.1 igb's PMD requires explicitly >> + enabling scatter but fails to advertise it. */ > Can you check name for "nfp" and don't set enable_scatter? I don't think most of the PMDs used these new offload defines in DPDK17.11. > I'm not sure this is acceptable. I'm worried it sets a precedent for code for > specific devices and could lead to further instances of this in the future. > > It could be argued the scatter approach up to now was specific to Niantic but > it also worked for igb and i40e. I40e devices don’t require scatter but can > handle it without issue if it is set. > > In the past this type of solution has been rejected as the preferred approach > was to keep netdev-dpdk code generic as possible. > > That’s why I suggest deferring the patch in OVS until the required changes > are made in DPDK to satisfy all cases. 17.11.2 is targeted for May 19th. We > could have a solution in place for then. > > I'm not trying to obstruct this but these cases do arise so interested to > hear what others think? > > Ian > >> +conf.rxmode.enable_scatter = 1; >> +} >> } >> >> conf.rxmode.hw_ip_checksum = (dev->hw_ol_features & >> -- >> 2.7.4 >> >> ___ >> dev mailing list >> d...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 0/2] Correct handling of double encap and decap actions
Thanks for checking. I applied both patches to branch-2.9. For branch-2.8, would you mind submitting the fixed-up patches? It would save me a few minutes. Thanks, Ben. On Fri, Apr 06, 2018 at 05:37:20PM +, Jan Scheurich wrote: > Yes that fix should be applied to branches 2.9 and 2.8. > > I checked that it applies and passes all unit tests pass. > > On branch-2.8 the patch for nsh.at patch must be slightly retrofitted as the > datapath action names changed from encap_nsh/decap_nsh to push_nsh/pop_nsh > and the nsh_ttl field was introduced in 2.9. > > diff --git a/tests/nsh.at b/tests/nsh.at > index 6ae71b5..6eb4637 100644 > --- a/tests/nsh.at > +++ b/tests/nsh.at > @@ -351,7 +351,7 @@ bridge("br0") > > Final flow: unchanged > Megaflow: recirc_id=0,eth,ip,in_port=1,dl_dst=66:77:88:99:aa:bb,nw_frag=no > -Datapath actions: > push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x1234,si=255,c1=0x1122334 > +Datapath actions: > encap_nsh(flags=0,mdtype=1,np=3,spi=0x1234,si=255,c1=0x11223344,c2=0 > ]) > > AT_CHECK([ > @@ -370,7 +370,7 @@ bridge("br0") > > Final flow: > recirc_id=0x1,eth,in_port=4,vlan_tci=0x,dl_src=00:00:00:00:00:00,dl_ds > Megaflow: > recirc_id=0x1,packet_type=(1,0x894f),in_port=4,nsh_mdtype=1,nsh_np=3,nsh_spi > -Datapath actions: pop_nsh(),recirc(0x2) > +Datapath actions: decap_nsh(),recirc(0x2) > ]) > > AT_CHECK([ > @@ -407,8 +407,8 @@ ovs-appctl time/warp 1000 > AT_CHECK([ > ovs-appctl dpctl/dump-flows dummy@ovs-dummy | strip_used | grep -v ipv6 > | sort > ], [0], [flow-dump from non-dpdk interfaces: > -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0 > -recirc_id(0x3),in_port(1),packet_type(ns=1,id=0x894f),nsh(mdtype=1,np=3,spi=0x1234,c1= > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0 > +recirc_id(0x3),in_port(1),packet_type(ns=1,id=0x894f),nsh(mdtype=1,np=3,spi=0x1234,c1= > > recirc_id(0x4),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packe > ]) > > Thanks, Jan > > > -Original Message- > > From: Ben Pfaff [mailto:b...@ovn.org] > > Sent: Friday, 06 April, 2018 18:36 > > To: Jan Scheurich> > Cc: d...@openvswitch.org; yi.y.y...@intel.com > > Subject: Re: [PATCH v2 0/2] Correct handling of double encap and decap > > actions > > > > On Fri, Apr 06, 2018 at 09:35:48AM -0700, Ben Pfaff wrote: > > > On Thu, Apr 05, 2018 at 04:11:02PM +0200, Jan Scheurich wrote: > > > > Recent tests with NSH encap have shown that the translation of multiple > > > > subsequent encap() or decap() actions was incorrect. This patch set > > > > corrects the handling and adds a unit test for NSH to cover two NSH > > > > and one Ethernet encapsulation levels. > > > > > > Thanks. Should this be applied to branch-2.9? > > > > To be clear, I applied it to master just now. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next 0/6] Add dpdk-bond support
On Thu, Apr 12, 2018 at 05:52:46AM -0700, xiangxia.m@gmail.com wrote: > From: Tonghao Zhang> > The bond of openvswitch has not good performance. I'd really say that the best solution to that is to improve the performance, rather than piling on another bonding layer. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next 0/6] Add dpdk-bond support
Why is this series tagged "net-next"? It doesn't appear to have anything to do with the kernel. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netns: Add documentation and update NEWS.
On Wed, Apr 11, 2018 at 08:50:56PM -0300, Flavio Leitner wrote: > Create a document to describe the how it works and known > limitations and update the NEWS accordingly. > > Signed-off-by: Flavio LeitnerThanks! I folded in the following to make "checkpatch" and "make" happy, and then I applied this to master. --8<--cut here-->8-- diff --git a/Documentation/automake.mk b/Documentation/automake.mk index 93cf3a11be57..c05a2313a5b8 100644 --- a/Documentation/automake.mk +++ b/Documentation/automake.mk @@ -40,6 +40,7 @@ DOC_SOURCE = \ Documentation/topics/high-availability.rst \ Documentation/topics/integration.rst \ Documentation/topics/language-bindings.rst \ + Documentation/topics/networking-namespaces.rst \ Documentation/topics/openflow.rst \ Documentation/topics/ovn-news-2.8.rst \ Documentation/topics/ovsdb-replication.rst \ diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst index 13b6d8abbb30..fa7f0a2fb089 100644 --- a/Documentation/topics/index.rst +++ b/Documentation/topics/index.rst @@ -42,6 +42,7 @@ OVS porting openflow bonding + networking-namespaces ovsdb-replication dpdk/index windows diff --git a/Documentation/topics/networking-namespaces.rst b/Documentation/topics/networking-namespaces.rst index 96589b513080..5c265cd0247b 100644 --- a/Documentation/topics/networking-namespaces.rst +++ b/Documentation/topics/networking-namespaces.rst @@ -66,4 +66,3 @@ setting MTU, lacks the proper API in the kernel, so they remain unsupported. In most use cases that needs to move ports to another networking namespaces should use veth pairs instead because it offers a cleaner and more robust solution with no noticeable performance penalty. - ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] Edit Open vSwitch license info so that GitHub recognizes it.
From: Andrea KaoGitHub uses a library called Licensee to identify a project's license type. It shows this information in the status bar and via the API if it can unambiguously identify the license. This commit creates a LICENSE file that stores the full text of the Apache license. It also removes the COPYING file and transfers its contents to a new "License" section in the README. Collectively, these changes allow Licensee to successfully identify the license type of Open vSwitch's codebase as Apache. Submitted-at: https://github.com/openvswitch/ovs/pull/224 Signed-off-by: Andrea Kao [b...@ovn.org removed references to COPYING and updated Makefile.am] Signed-off-by: Ben Pfaff --- v1->v2: I removed references to COPYING from the tree and updated Makefile.am to distribute the new LICENSE file. COPYING | 42 - LICENSE | 191 Makefile.am | 1 + README.rst | 33 +++ debian/rules| 3 +- rhel/openvswitch-fedora.spec.in | 4 +- rhel/openvswitch.spec.in| 2 +- 7 files changed, 228 insertions(+), 48 deletions(-) delete mode 100644 COPYING create mode 100644 LICENSE diff --git a/COPYING b/COPYING deleted file mode 100644 index afb98b9d7dd8.. --- a/COPYING +++ /dev/null @@ -1,42 +0,0 @@ -This file is a summary of the licensing of files in this distribution. -Some files may be marked specifically with a different license, in -which case that license applies to the file in question. - -Most files are licensed under the Apache License, Version 2.0: - -Licensed under the Apache License, Version 2.0 (the "License"); -you may not use this file except in compliance with the License. -You may obtain a copy of the License at: - -http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, software -distributed under the License is distributed on an "AS IS" BASIS, -WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -See the License for the specific language governing permissions and -limitations under the License. - -Files under the datapath directory are licensed under the GNU General -Public License, version 2. - -File build-aux/cccl is licensed under the GNU General Public -License, version 2. - -The following files are licensed under the 2-clause BSD license. -include/windows/getopt.h -lib/getopt_long.c -lib/conntrack-tcp.c - -The following files are licensed under the 3-clause BSD-license -include/windows/netinet/icmp6.h -include/windows/netinet/ip6.h -lib/strsep.c - -Files under the xenserver directory are licensed on a file-by-file -basis. Refer to each file for details. - -Files lib/sflow*.[ch] are licensed under the terms of either the -Sun Industry Standards Source License 1.1, that is available at: -http://host-sflow.sourceforge.net/sissl.html -or the InMon sFlow License, that is available at: -http://www.inmon.com/technology/sflowlicense.txt diff --git a/LICENSE b/LICENSE new file mode 100644 index ..4ba8cc0c88b0 --- /dev/null +++ b/LICENSE @@ -0,0 +1,191 @@ + + Apache License + Version 2.0, January 2004 +https://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in
[ovs-dev] Follow Up- Oil & Gas, Energy & Utility
Hi, Hope you are doing well. Would you be interested in acquiring an Email List for Oil & Gas, Energy & Utility Industry? PS: If you have a very specific and niche target, please do mention that here as we can customize the lists for you across industries and geographies/countries. Some of the counts available are mentioned below: . Oil & Gas . Energy Plants . Petroleum Engineers & Pipeline Providers . Exploration, Production, Land . Pipelining, Refining, Gas Processing . Drilling, Well Servicing, Work over Contractors . Geophysical Contractors and Data Brokers . Equipment and Supplies . Field Services . Environmental Services and Products . Manufacturers/Suppliers & Gas Plant Operators . Gas Compressor & Pump Operators . Lubricant Suppliers & Drillers, Drilling Contractors . Nuclear and offshore renewable sectors . Coal & Energy Division . Manufacturing and many more across US/UK/CANADA, Europe and all over the world. List Contains: Title, Name, Email, Company, Website, Fax, Website, Contact Number, Revenue Size, Industry and Social Media Link. Let me know your Target Industry/category that you are trying to reach, so that I can send you more information. Look forward to hearing from you soon Regards Erin Lofton Sr. Marketing Executive If you do not wish to receive future emails from us, please reply as 'leave out' ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/1] netdev-vport: reject concomitant incompatible tunnels
On Fri, Feb 09, 2018 at 03:42:56PM +0100, Eelco Chaudron wrote: > This patch will make sure VXLAN tunnels with and without the group > based policy (GBP) option enabled can not coexist on the same > destination UDP port. > > In theory, VXLAN tunnel with and without GBP enables can be > multiplexed on the same UDP port as long as different VNI's are > used. However currently OVS does not support this, hence this patch to > check for this condition. > > v1->v2 > Updated commit message as its now clear that the OVS implementation > does not support VNI multiplexing on the same UDP port. > > Signed-off-by: Eelco ChaudronThanks for the update. Doesn't this make tunnel configuration O(n**2) in the number of tunnels? It looks like every tunnel checks at configuration time whether there is another tunnel of the same kind. I know of some configurations with hundreds (thousands?) of tunnels. Is there a way to make it cheaper? Thanks, Ben. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] flow: Extend 5-tuple hash calculation for non-IP packets
On Thu, Apr 12, 2018 at 01:01:09PM +, Gabor Halász wrote: > In real-world vSwitch deployments, handling a few thousand flows, > EMC is quickly saturated, so it's optimal usage is critical to > reach the highest packet forwarding speed of the vSwitch. > > EMC lookup is initiated based on the hash value of the packet. > In case the packet does not already have a stored hash value > during processing, the miniflow_hash_5tuple() function is invoked > in the datapath. While packets entering the vSwitch from an > external interface usually have valid hashes (pre-computed by NICs > supporting RSS), the ones coming from vhostuser ports (internal > packets from VMs) do not. > > Non-IP traffic received from the VMs experiences very bad EMC hit > rates and hence forwarding performance, because the miniflow_hash_5tuple() > returns the same hash value and these packets will hit the same EMC > entries and cause collisions if there are more than two distinct > megaflows with traffic in the PMD. > > The purpose of the patch is to compute proper hashes with sufficient > entropy for EMC lookup also for non-IP traffic to avoid constant EMC > thrashing. The hash calculation has been extended to handle unrecognized > ethernet types and MPLS, using the header fields that are valid for a > specific protocol. > > Forwarding of non-IP packets in NFVI scenarios is very likely to happen > based on MAC addresses and/or VLAN tags. > By implementing a special case for matching on MPLS label, this change > prepares a separate commit that will enable hash recalculation for MPLS > packets received from L3 GRE tunnels. Today we skip re-computation of > the hash and the original GRE hash is only updated with the increased > recirc_depth. > > Signed-off-by: Gabor Halasz> --- > v1->v2: > - Fix for Clang error "cast increases required alignment from 2 to 4" Thanks for v2. It doesn't make a difference because OVS_FORCE does not affect Clang, only sparse. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] stopwatch: Add latch_poll to stopwatch loop.
On Wed, Apr 11, 2018 at 09:15:22AM -0500, Mark Michelson wrote: > Nothing was clearing the latch, so the loop was busy. This makes it so > the loop only runs on new calls to latch_set() by a separate thread. > > Signed-off-by: Mark MichelsonThanks, applied. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] Add multi-column index support for the Python IDL
On Fri, Apr 13, 2018 at 10:27:03AM -0500, Terry Wilson wrote: > One could argue that if if distro packaging is the issue, then distros > could patch in the simple try/except ImportError and add the > sortedcontainer code like I did above. I'm quite sympathetic to that viewpoint--I think that OVS currently has far too much distro-specific stuff in it. In the long run I'd like to drop the Debian and Red Hat and XenServer packaging from the tree. As a Debian developer myself, I know that it's not actually helpful for upstream to provide packaging. > But, if this works for the ovs team and distro folks, I'm happy with > it. Seems OK for now. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofproto: Allow bundle idle timeout to be configured.
On Fri, Apr 13, 2018 at 01:45:30PM -0300, Flavio Leitner wrote: > In some cases 10 seconds might be too much time and in > other cases it might be too little. > > The OpenFlow spec mandates that it should wait at least one > second, so enforce that as the minimum acceptable value. > > Signed-off-by: Flavio LeitnerThanks for the patch. I don't think that this will set the idle timeout back to the default if the setting is removed. It's better if there's that behavior. What do you think of this incremental? I have not tested it but it is meant to behave that way. Also it avoids integer overflow. Thanks, Ben. --8<--cut here-->8-- diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c index 677deef948ea..f78b4c5ff411 100644 --- a/ofproto/connmgr.c +++ b/ofproto/connmgr.c @@ -37,6 +37,7 @@ #include "openvswitch/poll-loop.h" #include "openvswitch/rconn.h" #include "openvswitch/shash.h" +#include "sat-math.h" #include "simap.h" #include "stream.h" #include "timeval.h" @@ -142,7 +143,7 @@ struct ofconn { #define BUNDLE_IDLE_TIMEOUT_DEFAULT 1 /* Expire idle bundles after * 10 seconds. */ -static int bundle_idle_timeout = BUNDLE_IDLE_TIMEOUT_DEFAULT; +static unsigned int bundle_idle_timeout = BUNDLE_IDLE_TIMEOUT_DEFAULT; static struct ofconn *ofconn_create(struct connmgr *, struct rconn *, enum ofconn_type, bool enable_async_msgs) @@ -471,12 +472,16 @@ ofconn_get_ofproto(const struct ofconn *ofconn) return ofconn->connmgr->ofproto; } +/* Sets the bundle idle timeout to 'timeout' seconds, interpreting 0 as + * requesting the default timeout. + * + * The OpenFlow spec mandates the timeout to be at least one second; . */ void -connmgr_set_bundle_idle_timeout(unsigned timeout) { -/* OpenFlow spec mandates the timeout to be at least one second. */ -if (timeout > 0) { -bundle_idle_timeout = timeout * 1000; -} +connmgr_set_bundle_idle_timeout(unsigned timeout) +{ +bundle_idle_timeout = (timeout + ? sat_mul(timeout, 1000) + : BUNDLE_IDLE_TIMEOUT_DEFAULT); } /* OpenFlow configuration. */ ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] netdev-dpdk: Free mempool only when no in-use mbufs.
DPDK mempools are freed when they are no longer needed. This can happen when a port is removed or a port's mtu is reconfigured so that a new mempool is used. It is possible that an mbuf is attempted to be returned to a freed mempool from NIC Tx queues and this can lead to a segfault. In order to prevent this, only free mempools when they are not needed and have no in-use mbufs. As this might not be possible immediately, create a free list of mempools and sweep it anytime a port tries to get a mempool. Fixes: 8d38823bdf8b ("netdev-dpdk: fix memory leak") Cc: mark.b.kavanag...@gmail.com Cc: Ilya MaximetsReported-by: Venkatesan Pradeep Signed-off-by: Kevin Traynor --- v2: Get number of entries on the mempool ring directly. lib/netdev-dpdk.c | 86 +++ 1 file changed, 81 insertions(+), 5 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index ee39cbe..3306b19 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -297,4 +297,14 @@ static struct ovs_mutex dpdk_mp_mutex OVS_ACQ_AFTER(dpdk_mutex) = OVS_MUTEX_INITIALIZER; +/* Contains all 'struct dpdk_mp's. */ +static struct ovs_list dpdk_mp_free_list OVS_GUARDED_BY(dpdk_mp_mutex) += OVS_LIST_INITIALIZER(_mp_free_list); + +/* Wrapper for a mempool released but not yet freed. */ +struct dpdk_mp { + struct rte_mempool *mp; + struct ovs_list list_node OVS_GUARDED_BY(dpdk_mp_mutex); + }; + /* There should be one 'struct dpdk_tx_queue' created for * each cpu core. */ @@ -512,4 +522,56 @@ ovs_rte_pktmbuf_init(struct rte_mempool *mp OVS_UNUSED, } +static int +dpdk_mp_full(const struct rte_mempool *mp) OVS_REQUIRES(dpdk_mp_mutex) +{ +unsigned ring_count; +/* This logic is needed because rte_mempool_full() is not guaranteed to + * be atomic and mbufs could be moved from mempool cache --> mempool ring + * during the call. However, as no mbufs will be taken from the mempool + * at this time, we can work around it by also checking the ring entries + * separately and ensuring that they have not changed. + */ +ring_count = rte_mempool_ops_get_count(mp); +if (rte_mempool_full(mp) && rte_mempool_ops_get_count(mp) == ring_count) { +return 1; +} + +return 0; +} + +/* Free unused mempools. */ +static void +dpdk_mp_sweep(void) +{ +struct dpdk_mp *dmp, *next; + +ovs_mutex_lock(_mp_mutex); +LIST_FOR_EACH_SAFE (dmp, next, list_node, _mp_free_list) { +if (dpdk_mp_full(dmp->mp)) { +VLOG_DBG("Freeing mempool \"%s\"", dmp->mp->name); +ovs_list_remove(>list_node); +rte_mempool_free(dmp->mp); +rte_free(dmp); +} +} +ovs_mutex_unlock(_mp_mutex); +} + +/* Ensure a mempool will not be freed. */ +static void +dpdk_mp_do_not_free(struct rte_mempool *mp) OVS_REQUIRES(dpdk_mp_mutex) +{ +struct dpdk_mp *dmp, *next; + +LIST_FOR_EACH_SAFE (dmp, next, list_node, _mp_free_list) { +if (dmp->mp == mp) { +VLOG_DBG("Removing mempool \"%s\" from free list", dmp->mp->name); +ovs_list_remove(>list_node); +rte_free(dmp); +break; +} +} +} + /* Returns a valid pointer when either of the following is true: * - a new mempool was just created; @@ -578,4 +640,6 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu) VLOG_DBG("A mempool with name \"%s\" already exists at %p.", mp_name, mp); +/* Ensure this reused mempool will not be freed. */ +dpdk_mp_do_not_free(mp); } else { VLOG_ERR("Failed mempool \"%s\" create request of %u mbufs", @@ -590,5 +654,5 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu) /* Release an existing mempool. */ static void -dpdk_mp_free(struct rte_mempool *mp) +dpdk_mp_release(struct rte_mempool *mp) { if (!mp) { @@ -597,6 +661,16 @@ dpdk_mp_free(struct rte_mempool *mp) ovs_mutex_lock(_mp_mutex); -VLOG_DBG("Releasing \"%s\" mempool", mp->name); -rte_mempool_free(mp); +if (dpdk_mp_full(mp)) { +VLOG_DBG("Freeing mempool \"%s\"", mp->name); +rte_mempool_free(mp); +} else { +struct dpdk_mp *dmp; + +dmp = dpdk_rte_mzalloc(sizeof *dmp); +if (dmp) { +dmp->mp = mp; +ovs_list_push_back(_mp_free_list, >list_node); +} +} ovs_mutex_unlock(_mp_mutex); } @@ -616,4 +690,6 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) int ret = 0; +dpdk_mp_sweep(); + mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size)); if (!mp) { @@ -628,5 +704,5 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) if (dev->mp != mp) { /* A new mempool was created, release the previous one. */ -dpdk_mp_free(dev->mp); +dpdk_mp_release(dev->mp); } else { ret =
[ovs-dev] [PATCH] tests: Fix typo in test name.
Signed-off-by: Ben Pfaff--- tests/ovn.at | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/ovn.at b/tests/ovn.at index 3ceeab3ce073..e5a648661f3c 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -177,7 +177,7 @@ ct_state = NXM_NX_CT_STATE ]]) AT_CLEANUP -AT_SETUP([ovn -- compsition]) +AT_SETUP([ovn -- composition]) AT_CHECK([ovstest test-ovn composition 2], [0], [ignore]) AT_CLEANUP -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] ovn-controller: Handle Port_Binding's "requested-chassis" option in physical.c
On Tue, Mar 20, 2018 at 04:59:42PM +0530, nusid...@redhat.com wrote: > From: Numan Siddique> > When a Logical_Switch_Port P's options is set with 'requested-chassis=hv1' > and if the user has bound this logical port to two OVS interfaces each in > different host (eg. hv1 and hv2), then ovn-controller in hv1 sets the > P's Port_Binding.chassis to hv1 which is as expected. But on hv2, > ovn-controller > is adding OF flows in table 0 and table 65 for the OVS interface instead of > considering 'P' as a remote port. When another logical port bound on hv2, > pings to the logical port 'P', the packet gets delivered to hv2 OVS interface > instead of hv1 OVS interface, which is wrong. > > This scenario is most likely to happen when requested-chassis option is used > by CMS during migration of a VM from one chassis to another. > > This patch fixes this issue by checking the Port_Binding's "requested-chassis" > option in physical.c before adding the flows in table 0 an 65. > > Reported-by: Marcin Mirecki > Reported-at: > https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/345266.html > Signed-off-by: Numan Siddique > Tested-by: Marcin Mirecki Thank you. I applied this to master and branch-2.9. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/2] vswitchd: Allow user to directly specify sFlow agent address.
The IPv6 feature was more work than I expected, but I did post the series: https://patchwork.ozlabs.org/project/openvswitch/list/?series=38837=* On Tue, Apr 10, 2018 at 12:23:12PM -0700, Ben Pfaff wrote: > It's currently IPv4 only but it's a good idea to add IPv6 support. I'll > put that on my to-do list. > > It's also a good idea to warn about the potential perils of this > feature. I'll do that too. > > I'll try to get to this soon, neither feature should be much work. > > On Tue, Apr 10, 2018 at 10:13:24AM -0700, Neil McKee wrote: > > No objections. Sounds useful for testing, and for situations where > > the agent-address selection has been made by a higher controller. > > Just one question: it seems like it is only IPv4: can an IPv6 > > agent-address be configured this way too? > > > > I'll admit to this triggering bad memories of a situation when two > > switches were sending sFlow using each other's agent address, so > > perhaps a warning not to use this "manually" would be appropriate? > > > > -- > > Neil McKee > > InMon Corp. > > http://www.inmon.com > > > > > > On Sat, Mar 31, 2018 at 5:12 PM, Ben Pfaffwrote: > > > At least for testing purposes, and perhaps in production, it is useful to > > > be able to fix the agent IP address directly, rather that indirecting it > > > through a device name or the routing table. > > > > > > This commit uses this feature to fix the agent IP address used in the unit > > > tests. This will be particularly useful in an upcoming commit that > > > disables the use of the system routing table in the unit tests, to make > > > the tests' results independent of the host's routes. > > > > > > CC: Neil McKee > > > Signed-off-by: Ben Pfaff > > > --- > > > ofproto/ofproto-dpif-sflow.c | 3 ++- > > > tests/ofproto-dpif.at| 8 > > > vswitchd/vswitch.xml | 25 ++--- > > > 3 files changed, 24 insertions(+), 12 deletions(-) > > > > > > diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c > > > index 60e1b4e0a661..5d8c0e19f8e3 100644 > > > --- a/ofproto/ofproto-dpif-sflow.c > > > +++ b/ofproto/ofproto-dpif-sflow.c > > > @@ -453,7 +453,8 @@ sflow_choose_agent_address(const char *agent_device, > > > agent_addr->type = SFLADDRESSTYPE_IP_V4; > > > > > > if (agent_device) { > > > -if (!netdev_get_in4_by_name(agent_device, )) { > > > +if (!netdev_get_in4_by_name(agent_device, ) > > > +|| !lookup_ip(agent_device, )) { > > > goto success; > > > } > > > } > > > diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at > > > index d2058eddd3eb..60f28e2a08dd 100644 > > > --- a/tests/ofproto-dpif.at > > > +++ b/tests/ofproto-dpif.at > > > @@ -6333,7 +6333,7 @@ ovs-vsctl \ > > >set Interface p1 options:ifindex=1003 -- \ > > >set Bridge br0 sflow=@sf -- \ > > >--id=@sf create sflow targets=\"127.0.0.1:$SFLOW_PORT\" \ > > > - header=128 sampling=1 polling=1 > > > + header=128 sampling=1 polling=1 agent=127.0.0.1 > > > > > > dnl sleep long enough to get the sFlow datagram flushed out (may be > > > delayed for up to 1 second) > > > AT_CHECK([ovs-appctl time/warp 2000 100], [0], [ignore]) > > > @@ -6382,7 +6382,7 @@ AT_CHECK([ovs-ofctl add-flow br0 action=3]) > > > dnl enable sflow > > > ovs-vsctl \ > > > set Bridge br0 sflow=@sf -- \ > > > - --id=@sf create sflow targets=\"127.0.0.1:$SFLOW_PORT\" \ > > > + --id=@sf create sflow targets=\"127.0.0.1:$SFLOW_PORT\" > > > agent=127.0.0.1 \ > > > header=128 sampling=1 polling=0 > > > > > > dnl introduce a packet that will be flooded to the tunnel > > > @@ -6474,7 +6474,7 @@ AT_CHECK([ovs-appctl netdev-dummy/receive p0 > > > 'recirc_id(0),in_port(1),eth(src=f8 > > > dnl configure sflow on int-br only > > > ovs-vsctl \ > > > set Bridge int-br sflow=@sf -- \ > > > - --id=@sf create sflow targets=\"127.0.0.1:$SFLOW_PORT\" \ > > > + --id=@sf create sflow targets=\"127.0.0.1:$SFLOW_PORT\" > > > agent=127.0.0.1 \ > > > header=128 sampling=1 polling=0 > > > > > > dnl set up route to 192.168.1.2 via br0 > > > @@ -6553,7 +6553,7 @@ ovs-appctl time/stop > > > dnl configure sflow > > > ovs-vsctl \ > > > set Bridge br0 sflow=@sf -- \ > > > - --id=@sf create sflow targets=\"127.0.0.1:$SFLOW_PORT\" \ > > > + --id=@sf create sflow targets=\"127.0.0.1:$SFLOW_PORT\" > > > agent=127.0.0.1 \ > > > header=128 sampling=1 polling=0 > > > > > > AT_CHECK([ovs-appctl netdev-dummy/receive p1 > > > 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x8847),mpls(label=11,tc=3,ttl=64,bos=1)']) > > > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > > > index f899a19764a4..9c2a8263e604 100644 > > > --- a/vswitchd/vswitch.xml > > > +++ b/vswitchd/vswitch.xml > > > @@ -5099,13 +5099,24 @@ ovs-vsctl add-port br0
[ovs-dev] [PATCH 11/11] ovn-nbctl: Use common code for sockaddr_storage.
This better reuses existing code. Signed-off-by: Ben Pfaff--- ovn/utilities/ovn-nbctl.c | 43 --- 1 file changed, 12 insertions(+), 31 deletions(-) diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c index 85d4d3534b54..5c18161fe1c2 100644 --- a/ovn/utilities/ovn-nbctl.c +++ b/ovn/utilities/ovn-nbctl.c @@ -1782,7 +1782,6 @@ nbctl_lb_add(struct ctl_context *ctx) const char *lb_proto; bool is_update_proto = false; -bool is_vip_with_port = true; if (ctx->argc == 4) { /* Default protocol. */ @@ -1803,38 +1802,17 @@ nbctl_lb_add(struct ctl_context *ctx) "and a port number with : as a separator).", lb_vip); } -char lb_vip_normalized[INET6_ADDRSTRLEN + 8]; -char normalized_ip[INET6_ADDRSTRLEN]; -if (ss_vip.ss_family == AF_INET) { -struct sockaddr_in *sin = ALIGNED_CAST(struct sockaddr_in *, _vip); -inet_ntop(AF_INET, >sin_addr, normalized_ip, - sizeof normalized_ip); -if (sin->sin_port) { -is_vip_with_port = true; -snprintf(lb_vip_normalized, sizeof lb_vip_normalized, "%s:%d", - normalized_ip, ntohs(sin->sin_port)); -} else { -is_vip_with_port = false; -ovs_strlcpy(lb_vip_normalized, normalized_ip, -sizeof lb_vip_normalized); -} +struct ds lb_vip_normalized_ds = DS_EMPTY_INITIALIZER; +uint16_t lb_vip_port = ss_get_port(_vip); +if (lb_vip_port) { +ss_format_address(_vip, _vip_normalized_ds); +ds_put_format(_vip_normalized_ds, ":%d", lb_vip_port); } else { -struct sockaddr_in6 *sin6 = ALIGNED_CAST(struct sockaddr_in6 *, - _vip); -inet_ntop(AF_INET6, >sin6_addr, normalized_ip, - sizeof normalized_ip); -if (sin6->sin6_port) { -is_vip_with_port = true; -snprintf(lb_vip_normalized, sizeof lb_vip_normalized, "[%s]:%d", - normalized_ip, ntohs(sin6->sin6_port)); -} else { -is_vip_with_port = false; -ovs_strlcpy(lb_vip_normalized, normalized_ip, -sizeof lb_vip_normalized); -} +ss_format_address_nobracks(_vip, _vip_normalized_ds); } +const char *lb_vip_normalized = ds_cstr(_vip_normalized_ds); -if (!is_vip_with_port && is_update_proto) { +if (!lb_vip_port && is_update_proto) { ctl_fatal("Protocol is unnecessary when no port of vip " "is given."); } @@ -1845,7 +1823,7 @@ nbctl_lb_add(struct ctl_context *ctx) token != NULL; token = strtok_r(NULL, ",", _ptr)) { struct sockaddr_storage ss_dst; -if (is_vip_with_port) { +if (lb_vip_port) { if (!inet_parse_active(token, -1, _dst)) { ctl_fatal("%s: should be an IP address and a port " "number with : as a separator.", token); @@ -1892,6 +1870,7 @@ nbctl_lb_add(struct ctl_context *ctx) nbrec_load_balancer_verify_vips(lb); nbrec_load_balancer_set_vips(lb, >vips); ds_destroy(_ips_new); +ds_destroy(_vip_normalized_ds); return; } } @@ -1904,6 +1883,8 @@ nbctl_lb_add(struct ctl_context *ctx) lb_vip_normalized, ds_cstr(_ips_new)); nbrec_load_balancer_set_vips(lb, >vips); ds_destroy(_ips_new); + +ds_destroy(_vip_normalized_ds); } static void -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 10/11] ovn-nbctl: Simplify lb_info_add_smap().
Signed-off-by: Ben Pfaff--- ovn/utilities/ovn-nbctl.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c index 16698de1f049..85d4d3534b54 100644 --- a/ovn/utilities/ovn-nbctl.c +++ b/ovn/utilities/ovn-nbctl.c @@ -1945,11 +1945,9 @@ static void lb_info_add_smap(const struct nbrec_load_balancer *lb, struct smap *lbs, int vip_width) { -struct ds key = DS_EMPTY_INITIALIZER; -struct ds val = DS_EMPTY_INITIALIZER; - const struct smap_node **nodes = smap_sort(>vips); if (nodes) { +struct ds val = DS_EMPTY_INITIALIZER; for (int i = 0; i < smap_count(>vips); i++) { const struct smap_node *node = nodes[i]; @@ -1971,11 +1969,8 @@ lb_info_add_smap(const struct nbrec_load_balancer *lb, node->key, node->value); } -ds_put_format(, "%-20.16s", lb->name); -smap_add(lbs, ds_cstr(), ds_cstr()); - -ds_destroy(); -ds_destroy(); +smap_add_nocopy(lbs, xasprintf("%-20.16s", lb->name), +ds_steal_cstr()); free(nodes); } } -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 09/11] ofproto-dpif-slow: Add IPv6 agent address support.
Suggested-by: Neil McKeeSigned-off-by: Ben Pfaff --- ofproto/ofproto-dpif-sflow.c | 55 ++-- 1 file changed, 33 insertions(+), 22 deletions(-) diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c index 5d8c0e19f8e3..fe79a9dbbad6 100644 --- a/ofproto/ofproto-dpif-sflow.c +++ b/ofproto/ofproto-dpif-sflow.c @@ -446,43 +446,45 @@ sflow_choose_agent_address(const char *agent_device, const char *control_ip, SFLAddress *agent_addr) { -const char *target; -struct in_addr in4; - -memset(agent_addr, 0, sizeof *agent_addr); -agent_addr->type = SFLADDRESSTYPE_IP_V4; +struct in6_addr ip; if (agent_device) { -if (!netdev_get_in4_by_name(agent_device, ) -|| !lookup_ip(agent_device, )) { +/* If 'agent_device' is the name of a network device, use its IP + * address. */ +if (!netdev_get_ip_by_name(agent_device, )) { +goto success; +} + +/* If 'agent_device' is itself an IP address, use it. */ +struct sockaddr_storage ss; +if (inet_parse_address(agent_device, )) { +ip = ss_get_address(); goto success; } } +/* Otherwise, use an appropriate local IP address for one of the + * collectors' remote IP addresses. */ +const char *target; SSET_FOR_EACH (target, targets) { -union { -struct sockaddr_storage ss; -struct sockaddr_in sin; -} sa; -char name[IFNAMSIZ]; - -if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT, ) -&& sa.ss.ss_family == AF_INET) { -struct in6_addr addr6, src, gw; - -in6_addr_set_mapped_ipv4(, sa.sin.sin_addr.s_addr); +struct sockaddr_storage ss; +if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT, )) { /* sFlow only supports target in default routing table with * packet mark zero. */ -if (ovs_router_lookup(0, , name, , )) { +ip = ss_get_address(); -in4.s_addr = in6_addr_get_mapped_ipv4(); +struct in6_addr src, gw; +char name[IFNAMSIZ]; +if (ovs_router_lookup(0, , name, , )) { goto success; } } } -if (control_ip && !lookup_ip(control_ip, )) { +struct sockaddr_storage ss; +if (control_ip && inet_parse_address(control_ip, )) { +ip = ss_get_address(); goto success; } @@ -490,7 +492,16 @@ sflow_choose_agent_address(const char *agent_device, return false; success: -agent_addr->address.ip_v4.addr = (OVS_FORCE uint32_t) in4.s_addr; +memset(agent_addr, 0, sizeof *agent_addr); +if (IN6_IS_ADDR_V4MAPPED()) { +agent_addr->type = SFLADDRESSTYPE_IP_V4; +agent_addr->address.ip_v4.addr += (OVS_FORCE uint32_t) in6_addr_get_mapped_ipv4(); +} else { +agent_addr->type = SFLADDRESSTYPE_IP_V6; +memcpy(agent_addr->address.ip_v6.addr, ip.s6_addr, + sizeof agent_addr->address.ip_v6.addr); +} return true; } -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 08/11] netdev: New function netdev_get_ip_by_name().
This is like netdev_get_in4_by_name() but accepts any IP address instead of just an IPv4 address. It will acquire its first user in an upcoming commit. Signed-off-by: Ben Pfaff--- lib/netdev.c | 81 +++- lib/netdev.h | 1 + 2 files changed, 59 insertions(+), 23 deletions(-) diff --git a/lib/netdev.c b/lib/netdev.c index 00192f0b00da..1c4ae2c6cdd2 100644 --- a/lib/netdev.c +++ b/lib/netdev.c @@ -1142,39 +1142,74 @@ netdev_set_in4(struct netdev *netdev, struct in_addr addr, struct in_addr mask) : EOPNOTSUPP); } -/* Obtains ad IPv4 address from device name and save the address in - * in4. Returns 0 if successful, otherwise a positive errno value. - */ +static int +netdev_get_addresses_by_name(const char *device_name, + struct in6_addr **addrsp, int *n_addrsp) +{ +struct netdev *netdev; +int error = netdev_open(device_name, NULL, ); +if (error) { +*addrsp = NULL; +*n_addrsp = 0; +return error; +} + +struct in6_addr *masks; +error = netdev_get_addr_list(netdev, addrsp, , n_addrsp); +netdev_close(netdev); +free(masks); +return error; +} + +/* Obtains an IPv4 address from 'device_name' and save the address in '*in4'. + * Returns 0 if successful, otherwise a positive errno value. */ int netdev_get_in4_by_name(const char *device_name, struct in_addr *in4) { -struct in6_addr *mask, *addr6; -int err, n_in6, i; -struct netdev *dev; +struct in6_addr *addrs; +int n; +int error = netdev_get_addresses_by_name(device_name, , ); -err = netdev_open(device_name, NULL, ); -if (err) { -return err; +in4->s_addr = 0; +if (!error) { +error = ENOENT; +for (int i = 0; i < n; i++) { +if (IN6_IS_ADDR_V4MAPPED([i])) { +in4->s_addr = in6_addr_get_mapped_ipv4([i]); +error = 0; +break; +} +} } +free(addrs); -err = netdev_get_addr_list(dev, , , _in6); -if (err) { -goto out; -} +return error; +} -for (i = 0; i < n_in6; i++) { -if (IN6_IS_ADDR_V4MAPPED([i])) { -in4->s_addr = in6_addr_get_mapped_ipv4([i]); -goto out; +/* Obtains an IPv4 or IPv6 address from 'device_name' and save the address in + * '*ss', representing IPv4 addressse as v6-mapped. Returns 0 if successful, + * otherwise a positive errno value. */ +int +netdev_get_ip_by_name(const char *device_name, struct in6_addr *in6) +{ +struct in6_addr *addrs; +int n; +int error = netdev_get_addresses_by_name(device_name, , ); + +*in6 = in6addr_any; +if (!error) { +error = ENOENT; +for (int i = 0; i < n; i++) { +if (!in6_is_lla([i])) { +*in6 = addrs[i]; +error = 0; +break; +} } } -err = -ENOENT; -out: -free(addr6); -free(mask); -netdev_close(dev); -return err; +free(addrs); +return error; } /* Adds 'router' as a default IP gateway for the TCP/IP stack that corresponds diff --git a/lib/netdev.h b/lib/netdev.h index ff1b604b24e2..441e53daeb69 100644 --- a/lib/netdev.h +++ b/lib/netdev.h @@ -282,6 +282,7 @@ void netdev_restore_flags(struct netdev_saved_flags *); /* TCP/IP stack interface. */ int netdev_set_in4(struct netdev *, struct in_addr addr, struct in_addr mask); int netdev_get_in4_by_name(const char *device_name, struct in_addr *in4); +int netdev_get_ip_by_name(const char *device_name, struct in6_addr *); int netdev_get_addr_list(const struct netdev *netdev, struct in6_addr **addr, struct in6_addr **mask, int *n_in6); -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 07/11] socket-util: Add more functions for IPv[46] sockaddr and sockaddr_storage.
The existing functions for working with sockaddr_storage that contain an IPv4 or IPv6 address are useful. This commit adds more functions for working with them, as well as a parallel set of functions for struct sockaddr. This also adds an initial user for some of the new sockaddr functions in netdev.c. Signed-off-by: Ben Pfaff--- lib/netdev.c | 31 lib/socket-util.c | 138 +++--- lib/socket-util.h | 12 + 3 files changed, 120 insertions(+), 61 deletions(-) diff --git a/lib/netdev.c b/lib/netdev.c index b303a7dc558d..00192f0b00da 100644 --- a/lib/netdev.c +++ b/lib/netdev.c @@ -50,6 +50,7 @@ #include "seq.h" #include "openvswitch/shash.h" #include "smap.h" +#include "socket-util.h" #include "sset.h" #include "svec.h" #include "openvswitch/vlog.h" @@ -2034,29 +2035,13 @@ netdev_get_addrs(const char dev[], struct in6_addr **paddr, addr_array = xzalloc(sizeof *addr_array * cnt); mask_array = xzalloc(sizeof *mask_array * cnt); for (ifa = if_addr_list; ifa; ifa = ifa->ifa_next) { -int family; - -if (!ifa->ifa_name || !ifa->ifa_addr || !ifa->ifa_netmask -|| strncmp(ifa->ifa_name, dev, IFNAMSIZ)) { -continue; -} - -family = ifa->ifa_addr->sa_family; -if (family == AF_INET) { -const struct sockaddr_in *sin; - -sin = ALIGNED_CAST(const struct sockaddr_in *, ifa->ifa_addr); -in6_addr_set_mapped_ipv4(_array[i], sin->sin_addr.s_addr); -sin = ALIGNED_CAST(const struct sockaddr_in *, ifa->ifa_netmask); -in6_addr_set_mapped_ipv4(_array[i], sin->sin_addr.s_addr); -i++; -} else if (family == AF_INET6) { -const struct sockaddr_in6 *sin6; - -sin6 = ALIGNED_CAST(const struct sockaddr_in6 *, ifa->ifa_addr); -memcpy(_array[i], >sin6_addr, sizeof *addr_array); -sin6 = ALIGNED_CAST(const struct sockaddr_in6 *, ifa->ifa_netmask); -memcpy(_array[i], >sin6_addr, sizeof *mask_array); +if (ifa->ifa_name +&& ifa->ifa_addr +&& ifa->ifa_netmask +&& !strncmp(ifa->ifa_name, dev, IFNAMSIZ) +&& sa_is_ip(ifa->ifa_addr)) { +addr_array[i] = sa_get_address(ifa->ifa_addr); +mask_array[i] = sa_get_address(ifa->ifa_netmask); i++; } } diff --git a/lib/socket-util.c b/lib/socket-util.c index 0964a015e3f9..12d16f582d52 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -53,6 +53,9 @@ VLOG_DEFINE_THIS_MODULE(socket_util); static int getsockopt_int(int fd, int level, int option, const char *optname, int *valuep); +static struct sockaddr_in *sin_cast(const struct sockaddr *); +static struct sockaddr_in6 *sin6_cast(const struct sockaddr *); +static const struct sockaddr *sa_cast(const struct sockaddr_storage *); /* Sets 'fd' to non-blocking mode. Returns 0 if successful, otherwise a * positive errno value. */ @@ -422,7 +425,7 @@ parse_sockaddr_components(struct sockaddr_storage *ss, const char *port_s, uint16_t default_port, const char *s) { -struct sockaddr_in *sin = ALIGNED_CAST(struct sockaddr_in *, ss); +struct sockaddr_in *sin = sin_cast(sa_cast(ss)); int port; if (port_s && port_s[0]) { @@ -436,9 +439,7 @@ parse_sockaddr_components(struct sockaddr_storage *ss, memset(ss, 0, sizeof *ss); if (host_s && strchr(host_s, ':')) { -struct sockaddr_in6 *sin6 -= ALIGNED_CAST(struct sockaddr_in6 *, ss); - +struct sockaddr_in6 *sin6 = sin6_cast(sa_cast(ss)); char *addr = strsep(_s, "%"); sin6->sin6_family = AF_INET6; @@ -1017,25 +1018,47 @@ describe_fd(int fd) #endif /* _WIN32 */ return ds_steal_cstr(); } - -/* sockaddr_storage helpers. */ +/* sockaddr helpers. */ + +static struct sockaddr_in * +sin_cast(const struct sockaddr *sa) +{ +return ALIGNED_CAST(struct sockaddr_in *, sa); +} + +static struct sockaddr_in6 * +sin6_cast(const struct sockaddr *sa) +{ +return ALIGNED_CAST(struct sockaddr_in6 *, sa); +} + +/* Returns true if 'sa' represents an IPv4 or IPv6 address, false otherwise. */ +bool +sa_is_ip(const struct sockaddr *sa) +{ +return sa->sa_family == AF_INET || sa->sa_family == AF_INET6; +} + +/* Returns the IPv4 or IPv6 address in 'sa'. Returns IPv4 addresses as + * v6-mapped. */ +struct in6_addr +sa_get_address(const struct sockaddr *sa) +{ +ovs_assert(sa_is_ip(sa)); +return (sa->sa_family == AF_INET +? in6_addr_mapped_ipv4(sin_cast(sa)->sin_addr.s_addr) +: sin6_cast(sa)->sin6_addr); +} -/* Returns the IPv4 or IPv6 port in 'ss'. */ +/* Returns the IPv4 or IPv6 port in 'sa'. */ uint16_t -ss_get_port(const struct sockaddr_storage *ss) +sa_get_port(const struct sockaddr *sa) { -if (ss->ss_family ==
[ovs-dev] [PATCH 06/11] Make : parsing uniform treewide.
I didn't realize until now that the tree had two different ways of parsing strings in the form : and :. There are the long-standing inet_parse_active() and inet_parse_passive() functions, and more recently the ipv46_parse() function. This commit eliminates the latter and changes the code to use the former. The two implementations interpreted some input differently. In particular, the older functions required IPv6 addresses to be [bracketed], but the newer ones did not. I'd prefer the brackets to be required, because of ambiguous cases like "::1:2:3:4:80" (is :80 part of the IPv6 address or a port number?) but for compatibility this patch changes the merged code to use the more liberal interpretation. Signed-off-by: Ben Pfaff--- lib/packets.c | 76 - lib/packets.h | 10 lib/socket-util.c | 140 +++--- lib/socket-util.h | 3 +- ovn/utilities/ovn-nbctl.c | 43 +++--- ovsdb/raft.c | 5 +- tests/ovn-nbctl.at| 56 +-- 7 files changed, 125 insertions(+), 208 deletions(-) diff --git a/lib/packets.c b/lib/packets.c index 8ebae8cc8fe8..38bfb6015b9e 100644 --- a/lib/packets.c +++ b/lib/packets.c @@ -652,82 +652,6 @@ ip_parse_cidr(const char *s, ovs_be32 *ip, unsigned int *plen) return error; } -/* Parses the string into an IPv4 or IPv6 address. - * The port flags act as follows: - * * PORT_OPTIONAL: A port may be present but is not required - * * PORT_REQUIRED: A port must be present - * * PORT_FORBIDDEN: A port must not be present - */ -char * OVS_WARN_UNUSED_RESULT -ipv46_parse(const char *s, enum port_flags flags, struct sockaddr_storage *ss) -{ -char *error = NULL; - -char *copy; -copy = xstrdup(s); - -char *addr; -char *port; -if (*copy == '[') { -char *end; - -addr = copy + 1; -end = strchr(addr, ']'); -if (!end) { -error = xasprintf("No closing bracket on address %s", s); -goto finish; -} -*end++ = '\0'; -if (*end == ':') { -port = end + 1; -} else { -port = NULL; -} -} else { -addr = copy; -port = strchr(copy, ':'); -if (port) { -if (strchr(port + 1, ':')) { -port = NULL; -} else { -*port++ = '\0'; -} -} -} - -if (port && !*port) { -error = xasprintf("Port is an empty string"); -goto finish; -} - -if (port && flags == PORT_FORBIDDEN) { -error = xasprintf("Port forbidden in address %s", s); -goto finish; -} else if (!port && flags == PORT_REQUIRED) { -error = xasprintf("Port required in address %s", s); -goto finish; -} - -struct addrinfo hints = { -.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV, -.ai_family = AF_UNSPEC, -}; -struct addrinfo *res; -int status; -status = getaddrinfo(addr, port, , ); -if (status) { -error = xasprintf("Error parsing address %s: %s", -s, gai_strerror(status)); -goto finish; -} -memcpy(ss, res->ai_addr, res->ai_addrlen); -freeaddrinfo(res); - -finish: -free(copy); -return error; -} - /* Parses string 's', which must be an IPv6 address. Stores the IPv6 address * into '*ip'. Returns true if successful, otherwise false. */ bool diff --git a/lib/packets.h b/lib/packets.h index 9a71aa3abbdb..b2bf70697e90 100644 --- a/lib/packets.h +++ b/lib/packets.h @@ -1362,16 +1362,6 @@ struct in6_addr ipv6_create_mask(int mask); int ipv6_count_cidr_bits(const struct in6_addr *netmask); bool ipv6_is_cidr(const struct in6_addr *netmask); -enum port_flags { -PORT_OPTIONAL, -PORT_REQUIRED, -PORT_FORBIDDEN, -}; - -char *ipv46_parse(const char *s, enum port_flags flags, - struct sockaddr_storage *ss) -OVS_WARN_UNUSED_RESULT; - bool ipv6_parse(const char *s, struct in6_addr *ip); char *ipv6_parse_masked(const char *s, struct in6_addr *ipv6, struct in6_addr *mask); diff --git a/lib/socket-util.c b/lib/socket-util.c index 223e3780ba5f..0964a015e3f9 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -333,39 +333,89 @@ guess_netmask(ovs_be32 ip_) : htonl(0)); /* ??? */ } -/* This is like strsep() except: - * - *- The separator string is ":". - * - *- Square brackets [] quote ":" separators and are removed from the - * tokens. */ -char * -inet_parse_token(char **pp) +static char * +unbracket(char *s) +{ +if (*s == '[') { +s++; + +char *end = strchr(s, '\0'); +if (end[-1] == ']') { +end[-1] = '\0'; +} +} +return s; +} + +/* 'host_index' is 0 if the host precedes the port within 's', 1 otherwise. */ +static void
[ovs-dev] [PATCH 05/11] socket-util: Make inet_parse_active() and inet_parse_passive() more alike.
Until now, the default_port parameters to these functions have had different types and different behavior. There is a reason for this, since it makes sense to listen on a kernel-selected port but it does not make sense to connect to a kernel-selected port, but this overlooks the possibility that a caller might want to parse a string in the format understood by inet_parse_active() without actually using it to connect to a remote host. This commit makes the behavior consistent and updates all the callers to work with the new semantics. Signed-off-by: Ben Pfaff--- lib/socket-util.c | 14 +++--- lib/socket-util.h | 4 ++-- lib/stream-tcp.c| 2 +- lib/stream.c| 3 +-- lib/stream.h| 2 +- lib/vlog.c | 2 +- ofproto/collectors.c| 4 ++-- ofproto/collectors.h| 2 +- ofproto/netflow.c | 2 +- ovn/northd/ovn-northd.c | 2 +- ovsdb/raft-private.c| 2 +- 11 files changed, 19 insertions(+), 20 deletions(-) diff --git a/lib/socket-util.c b/lib/socket-util.c index 2d893bc9feb6..223e3780ba5f 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -431,13 +431,13 @@ exit: /* Parses 'target', which should be a string in the format "[:]". * , which is required, may be an IPv4 address or an IPv6 address - * enclosed in square brackets. If 'default_port' is nonzero then is - * optional and defaults to 'default_port'. + * enclosed in square brackets. If 'default_port' is nonnegative then + * is optional and defaults to 'default_port'. * * On success, returns true and stores the parsed remote address into '*ss'. * On failure, logs an error, stores zeros into '*ss', and returns false. */ bool -inet_parse_active(const char *target_, uint16_t default_port, +inet_parse_active(const char *target_, int default_port, struct sockaddr_storage *ss) { char *target = xstrdup(target_); @@ -452,7 +452,7 @@ inet_parse_active(const char *target_, uint16_t default_port, if (!host) { VLOG_ERR("%s: host must be specified", target_); ok = false; -} else if (!port && !default_port) { +} else if (!port && default_port < 0) { VLOG_ERR("%s: port must be specified", target_); ok = false; } else if (p && p[strspn(p, " \t\r\n")] != '\0') { @@ -472,8 +472,8 @@ inet_parse_active(const char *target_, uint16_t default_port, /* Opens a non-blocking IPv4 or IPv6 socket of the specified 'style' and * connects to 'target', which should be a string in the format * "[:]". , which is required, may be an IPv4 address or an - * IPv6 address enclosed in square brackets. If 'default_port' is nonzero then - * is optional and defaults to 'default_port'. + * IPv6 address enclosed in square brackets. If 'default_port' is nonnegative + * then is optional and defaults to 'default_port'. * * 'style' should be SOCK_STREAM (for TCP) or SOCK_DGRAM (for UDP). * @@ -488,7 +488,7 @@ inet_parse_active(const char *target_, uint16_t default_port, * should be in the range [0, 63] and will automatically be shifted to the * appropriately place in the IP tos field. */ int -inet_open_active(int style, const char *target, uint16_t default_port, +inet_open_active(int style, const char *target, int default_port, struct sockaddr_storage *ssp, int *fdp, uint8_t dscp) { struct sockaddr_storage ss; diff --git a/lib/socket-util.h b/lib/socket-util.h index b1eca88eb131..d927d67a0e1b 100644 --- a/lib/socket-util.h +++ b/lib/socket-util.h @@ -47,9 +47,9 @@ void drain_fd(int fd, size_t n_packets); ovs_be32 guess_netmask(ovs_be32 ip); char *inet_parse_token(char **); -bool inet_parse_active(const char *target, uint16_t default_port, +bool inet_parse_active(const char *target, int default_port, struct sockaddr_storage *ssp); -int inet_open_active(int style, const char *target, uint16_t default_port, +int inet_open_active(int style, const char *target, int default_port, struct sockaddr_storage *ssp, int *fdp, uint8_t dscp); bool inet_parse_passive(const char *target, int default_port, diff --git a/lib/stream-tcp.c b/lib/stream-tcp.c index 4fae731d9b9e..e8dc2bfaa2df 100644 --- a/lib/stream-tcp.c +++ b/lib/stream-tcp.c @@ -53,7 +53,7 @@ tcp_open(const char *name, char *suffix, struct stream **streamp, uint8_t dscp) { int fd, error; -error = inet_open_active(SOCK_STREAM, suffix, 0, NULL, , dscp); +error = inet_open_active(SOCK_STREAM, suffix, -1, NULL, , dscp); if (fd >= 0) { return new_tcp_stream(xstrdup(name), fd, error, streamp); } else { diff --git a/lib/stream.c b/lib/stream.c index 083e2fb93f77..63c592344080 100644 --- a/lib/stream.c +++ b/lib/stream.c @@ -747,8 +747,7 @@ pstream_open_with_default_port(const char *name_, * - On error, function returns false and *ss contains garbage. */ bool -stream_parse_target_with_default_port(const char *target, -
[ovs-dev] [PATCH 04/11] socket-util: New function inet_parse_address().
This will acquire its first user in an upcoming commit. Signed-off-by: Ben Pfaff--- lib/socket-util.c | 26 ++ lib/socket-util.h | 2 ++ 2 files changed, 28 insertions(+) diff --git a/lib/socket-util.c b/lib/socket-util.c index b36de371baa1..2d893bc9feb6 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -698,6 +698,32 @@ error: return -error; } +/* Parses 'target', which may be an IPv4 address or an IPv6 address + * enclosed in square brackets. + * + * On success, returns true and stores the parsed remote address into '*ss'. + * On failure, logs an error, stores zeros into '*ss', and returns false. */ +bool +inet_parse_address(const char *target_, struct sockaddr_storage *ss) +{ +char *target = xstrdup(target_); +char *p = target; +char *host = inet_parse_token(); +bool ok = false; +if (!host) { +VLOG_ERR("%s: host must be specified", target_); +} else if (p && p[strspn(p, " \t\r\n")] != '\0') { +VLOG_ERR("%s: unexpected characters follow host", target_); +} else { +ok = parse_sockaddr_components(ss, host, NULL, 0, target_); +} +if (!ok) { +memset(ss, 0, sizeof *ss); +} +free(target); +return ok; +} + int read_fully(int fd, void *p_, size_t size, size_t *bytes_read) { diff --git a/lib/socket-util.h b/lib/socket-util.h index d8c6f082c8cf..b1eca88eb131 100644 --- a/lib/socket-util.h +++ b/lib/socket-util.h @@ -58,6 +58,8 @@ int inet_open_passive(int style, const char *target, int default_port, struct sockaddr_storage *ssp, uint8_t dscp, bool kernel_print_port); +bool inet_parse_address(const char *target, struct sockaddr_storage *); + int read_fully(int fd, void *, size_t, size_t *bytes_read); int write_fully(int fd, const void *, size_t, size_t *bytes_written); -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 02/11] socket-util: Make address parser check for trailing garbage.
Signed-off-by: Ben Pfaff--- lib/socket-util.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/lib/socket-util.c b/lib/socket-util.c index 5485e3b515dd..86ac4d433a2d 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -455,6 +455,9 @@ inet_parse_active(const char *target_, uint16_t default_port, } else if (!port && !default_port) { VLOG_ERR("%s: port must be specified", target_); ok = false; +} else if (p && p[strspn(p, " \t\r\n")] != '\0') { +VLOG_ERR("%s: unexpected characters follow host and port", target_); +ok = false; } else { ok = parse_sockaddr_components(ss, host, port, default_port, target_); } @@ -579,6 +582,9 @@ inet_parse_passive(const char *target_, int default_port, if (!port && default_port < 0) { VLOG_ERR("%s: port must be specified", target_); ok = false; +} else if (p && p[strspn(p, " \t\r\n")] != '\0') { +VLOG_ERR("%s: unexpected characters follow port and host", target_); +ok = false; } else { ok = parse_sockaddr_components(ss, host, port, default_port, target_); } -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 01/11] socket-util: Fix error in comment on ss_format_address().
The output for this function is a dynamic string and doesn't have a fixed buffer size, so the comment was wrong. Signed-off-by: Ben Pfaff--- lib/socket-util.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/socket-util.c b/lib/socket-util.c index 7fbcdf19feac..5485e3b515dd 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -995,8 +995,7 @@ is_safe_name(const char *name) } /* Formats the IPv4 or IPv6 address in 'ss' into 's'. If 'ss' is an IPv6 - * address, puts square brackets around the address. 'bufsize' should be at - * least SS_NTOP_BUFSIZE. */ + * address, puts square brackets around the address. */ void ss_format_address(const struct sockaddr_storage *ss, struct ds *s) { -- 2.16.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch 2.9 PATCH v3 2/2] netdev-dpdk: Add mempool reuse/free debug.
There is debug when a new mempool is created, but not when it is reused or freed. Add these as it is very difficult to debug mempool issues from logs without them. Signed-off-by: Kevin Traynor--- lib/netdev-dpdk.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index d8fb222..3cd7c64 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -616,4 +616,5 @@ dpdk_mp_sweep(void) OVS_REQUIRES(dpdk_mp_mutex) LIST_FOR_EACH_SAFE (dmp, next, list_node, _mp_list) { if (!dmp->refcount && dpdk_mp_full(dmp->mp)) { +VLOG_DBG("Freeing mempool \"%s\"", dmp->mp->name); ovs_list_remove(>list_node); rte_mempool_free(dmp->mp); @@ -632,4 +633,5 @@ dpdk_mp_get(int socket_id, int mtu) LIST_FOR_EACH (dmp, list_node, _mp_list) { if (dmp->socket_id == socket_id && dmp->mtu == mtu) { +VLOG_DBG("Reusing mempool \"%s\"", dmp->mp->name); dmp->refcount++; reuse = true; -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [branch 2.9 PATCH v3 1/2] netdev-dpdk: Free mempool only when no in-use mbufs.
DPDK mempools are freed when they are no longer needed. This can happen when a port is removed or a port's mtu is reconfigured so that a new mempool is used. It is possible that an mbuf is attempted to be returned to a freed mempool from NIC Tx queues and this can lead to a segfault. In order to prevent this, only free mempools when they are not needed and have no in-use mbufs. As this might not be possible immediately, sweep the mempools anytime a port tries to get a mempool. Fixes: 8d38823bdf8b ("netdev-dpdk: fix memory leak") Cc: mark.b.kavanag...@gmail.com Cc: Ilya MaximetsReported-by: Venkatesan Pradeep Signed-off-by: Kevin Traynor --- v3: Get number of entries on the mempool ring directly. v2: Add second call to rte_mempool_full() as it is not atomic so we can't trust one call to it. lib/netdev-dpdk.c | 58 ++- 1 file changed, 45 insertions(+), 13 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index c19cedc..d8fb222 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -590,8 +590,42 @@ dpdk_mp_create(int socket_id, int mtu) } +static int +dpdk_mp_full(const struct rte_mempool *mp) OVS_REQUIRES(dpdk_mp_mutex) +{ +unsigned ring_count; +/* This logic is needed because rte_mempool_full() is not guaranteed to + * be atomic and mbufs could be moved from mempool cache --> mempool ring + * during the call. However, as no mbufs will be taken from the mempool + * at this time, we can work around it by also checking the ring entries + * separately and ensuring that they have not changed. + */ +ring_count = rte_mempool_ops_get_count(mp); +if (rte_mempool_full(mp) && rte_mempool_ops_get_count(mp) == ring_count) { +return 1; +} + +return 0; +} + +/* Free unused mempools. */ +static void +dpdk_mp_sweep(void) OVS_REQUIRES(dpdk_mp_mutex) +{ +struct dpdk_mp *dmp, *next; + +LIST_FOR_EACH_SAFE (dmp, next, list_node, _mp_list) { +if (!dmp->refcount && dpdk_mp_full(dmp->mp)) { +ovs_list_remove(>list_node); +rte_mempool_free(dmp->mp); +rte_free(dmp); +} +} +} + static struct dpdk_mp * dpdk_mp_get(int socket_id, int mtu) { struct dpdk_mp *dmp; +bool reuse = false; ovs_mutex_lock(_mp_mutex); @@ -599,15 +633,18 @@ dpdk_mp_get(int socket_id, int mtu) if (dmp->socket_id == socket_id && dmp->mtu == mtu) { dmp->refcount++; -goto out; +reuse = true; +break; } } +/* Sweep mempools after reuse or before create. */ +dpdk_mp_sweep(); -dmp = dpdk_mp_create(socket_id, mtu); -if (dmp) { -ovs_list_push_back(_mp_list, >list_node); +if (!reuse) { +dmp = dpdk_mp_create(socket_id, mtu); +if (dmp) { +ovs_list_push_back(_mp_list, >list_node); +} } -out: - ovs_mutex_unlock(_mp_mutex); @@ -615,5 +652,5 @@ out: } -/* Release an existing mempool. */ +/* Decrement reference to a mempool. */ static void dpdk_mp_put(struct dpdk_mp *dmp) @@ -625,10 +662,5 @@ dpdk_mp_put(struct dpdk_mp *dmp) ovs_mutex_lock(_mp_mutex); ovs_assert(dmp->refcount); - -if (!--dmp->refcount) { -ovs_list_remove(>list_node); -rte_mempool_free(dmp->mp); -rte_free(dmp); - } +dmp->refcount--; ovs_mutex_unlock(_mp_mutex); } -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ofproto: Allow bundle idle timeout to be configured.
In some cases 10 seconds might be too much time and in other cases it might be too little. The OpenFlow spec mandates that it should wait at least one second, so enforce that as the minimum acceptable value. Signed-off-by: Flavio Leitner--- ofproto/connmgr.c | 19 ++ ofproto/connmgr.h | 2 ++ ofproto/ofproto.c | 7 ++ ofproto/ofproto.h | 1 + tests/ofproto.at | 62 ++ vswitchd/bridge.c | 3 ++- vswitchd/ovs-vswitchd.8.in | 5 +++- vswitchd/vswitch.xml | 12 + 8 files changed, 104 insertions(+), 7 deletions(-) diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c index 964b8c8d8..677deef94 100644 --- a/ofproto/connmgr.c +++ b/ofproto/connmgr.c @@ -138,10 +138,11 @@ struct ofconn { /* vswitchd/ovs-vswitchd.8.in documents the value of BUNDLE_IDLE_LIFETIME in * seconds. That documentation must be kept in sync with the value below. */ -enum { -BUNDLE_EXPIRY_INTERVAL = 1000, /* Check bundle expiry every 1 sec. */ -BUNDLE_IDLE_TIMEOUT = 1,/* Expire idle bundles after 10 seconds. */ -}; +#define BUNDLE_EXPIRY_INTERVAL 1000 /* Check bundle expiry every 1 sec. */ +#define BUNDLE_IDLE_TIMEOUT_DEFAULT 1 /* Expire idle bundles after + * 10 seconds. */ + +static int bundle_idle_timeout = BUNDLE_IDLE_TIMEOUT_DEFAULT; static struct ofconn *ofconn_create(struct connmgr *, struct rconn *, enum ofconn_type, bool enable_async_msgs) @@ -469,6 +470,14 @@ ofconn_get_ofproto(const struct ofconn *ofconn) { return ofconn->connmgr->ofproto; } + +void +connmgr_set_bundle_idle_timeout(unsigned timeout) { +/* OpenFlow spec mandates the timeout to be at least one second. */ +if (timeout > 0) { +bundle_idle_timeout = timeout * 1000; +} +} /* OpenFlow configuration. */ @@ -1247,7 +1256,7 @@ static void bundle_remove_expired(struct ofconn *ofconn, long long int now) { struct ofp_bundle *b, *next; -long long int limit = now - BUNDLE_IDLE_TIMEOUT; +long long int limit = now - bundle_idle_timeout; HMAP_FOR_EACH_SAFE (b, next, node, >bundles) { if (b->used <= limit) { diff --git a/ofproto/connmgr.h b/ofproto/connmgr.h index 2405fbd79..eb3be1668 100644 --- a/ofproto/connmgr.h +++ b/ofproto/connmgr.h @@ -86,6 +86,8 @@ void connmgr_get_memory_usage(const struct connmgr *, struct simap *usage); struct ofproto *ofconn_get_ofproto(const struct ofconn *); +void connmgr_set_bundle_idle_timeout(unsigned timeout); + void connmgr_retry(struct connmgr *); /* OpenFlow configuration. */ diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index 36f4c0b16..4af7e6486 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -680,6 +680,13 @@ ofproto_set_in_band_queue(struct ofproto *ofproto, int queue_id) connmgr_set_in_band_queue(ofproto->connmgr, queue_id); } +/* Sets the bundle max idle timeout */ +void +ofproto_set_bundle_idle_timeout(unsigned timeout) +{ +connmgr_set_bundle_idle_timeout(timeout); +} + /* Sets the number of flows at which eviction from the kernel flow table * will occur. */ void diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h index 8c85bbf7f..3ca88baf0 100644 --- a/ofproto/ofproto.h +++ b/ofproto/ofproto.h @@ -312,6 +312,7 @@ void ofproto_reconnect_controllers(struct ofproto *); void ofproto_set_extra_in_band_remotes(struct ofproto *, const struct sockaddr_in *, size_t n); void ofproto_set_in_band_queue(struct ofproto *, int queue_id); +void ofproto_set_bundle_idle_timeout(unsigned timeout); void ofproto_set_flow_limit(unsigned limit); void ofproto_set_max_idle(unsigned max_idle); void ofproto_set_forward_bpdu(struct ofproto *, bool forward_bpdu); diff --git a/tests/ofproto.at b/tests/ofproto.at index c1beea7ae..f5a75dbcb 100644 --- a/tests/ofproto.at +++ b/tests/ofproto.at @@ -5456,6 +5456,68 @@ OFPT_BARRIER_REPLY (OF1.4): OVS_VSWITCHD_STOP AT_CLEANUP +AT_SETUP([ofproto - bundle custom timeout (OpenFlow 1.4)]) +AT_KEYWORDS([monitor]) +OVS_VSWITCHD_START([set Open_vSwitch . other_config:bundle-idle-timeout=4]) + +# Start a monitor, use the required protocol version +ovs-ofctl -O OpenFlow14 monitor br0 --detach --no-chdir --pidfile >monitor.log 2>&1 +AT_CAPTURE_FILE([monitor.log]) + +ovs-appctl time/stop + +# Send an OpenFlow14 message (05), OFPT_BUNDLE_CONTROL (21), length (10), xid (01) +ovs-appctl -t ovs-ofctl ofctl/send "05 21 00 10 00 00 00 01 00 00 00 01 00 00 00 03" +ovs-appctl time/warp 2000 +# Send a bundle flow mod, it should keep the bundle alive. +ovs-appctl -t ovs-ofctl ofctl/send "05 22 00 a0 00 00 00 02 00 00 00 01 00 00 00 03 \ +05 0e 00 90 00 00 00 02 00 00 00 00 00 00 00 00 \ +00 00 00 00 00 00 00 00 01 00 00 00 00 00 ff ff \ +ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 00 \ +00 01 00 42 80 00 00 04 00 00 00
Re: [ovs-dev] [PATCH] flow: Extend 5-tuple hash calculation for non-IP packets
OVS_FORCE does not affect Clang, only sparse. On Fri, Apr 13, 2018 at 11:43:24AM +, Gabor Halász wrote: > Hi Ben, > Indeed there was some problem with that line. > Instead of using the eth_addr_to_uint64() func, I have added OVS_FORCE > compiler flag to keep the logic more visible. > +hash = hash_add64(hash, *((OVS_FORCE uint64_t *) >dl_dst)); > +hash = hash_add(hash, (OVS_FORCE uint32_t) > +*((uint8_t *) >dl_dst) + sizeof(uint64_t)); > > Thanks, > Gabor > > > > I haven't properly reviewed this, but Clang reports: > > > > ../lib/flow.c:2133:35: error: cast from 'const struct eth_addr *' to > > 'uint64_t *' (aka 'unsigned long long *') increases required alignment > > from 2 to 4 [-Werror,-Wcast-align] > > > > which is for this line: > > > > hash = hash_add64(hash, *((uint64_t *) >dl_dst)); > > > > Maybe you should use eth_addr_to_uint64(). ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] Add multi-column index support for the Python IDL
One could argue that if if distro packaging is the issue, then distros could patch in the simple try/except ImportError and add the sortedcontainer code like I did above. But, if this works for the ovs team and distro folks, I'm happy with it. Terry ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 1/1] netdev-dpdk: fix RX jumbo for NICs not supporting scatter
> Currently to RX jumbo packets fails for NICs not supporting scatter. > Scatter is not strictly needed for jumbo support on RX. This change fixes > the issue by only enabling scatter for NICs supporting it. Add a quirk for > "igb" while the PMD is fixed to advertise scatter. > Thanks for the v2 Pablo. Adding Eelco and Kevin as they had some comments on the v1. FYI I'm investigating on the DPDK side to see how/when the flag should be set and used for igb and ixgbe as well as other drivers. https://dpdk.org/ml/archives/dev/2018-April/097056.html > Reported-by: Louis Peens> Signed-off-by: Pablo Cascón > Reviewed-by: Simon Horman > --- > lib/netdev-dpdk.c | 10 +- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index ee39cbe..8f6a0a3 > 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -694,11 +694,19 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, > int n_rxq, int n_txq) > int diag = 0; > int i; > struct rte_eth_conf conf = port_conf; > +struct rte_eth_dev_info info; > > /* For some NICs (e.g. Niantic), scatter_rx mode needs to be > explicitly > * enabled. */ > if (dev->mtu > ETHER_MTU) { > -conf.rxmode.enable_scatter = 1; > +rte_eth_dev_info_get(dev->port_id, ); > +if (info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) { > +conf.rxmode.enable_scatter = 1; > +} else if (!strcmp(info.driver_name, "igb")) { > +/* Quirk: as of DPDK 17.11.1 igb's PMD requires explicitly > + enabling scatter but fails to advertise it. */ I'm not sure this is acceptable. I'm worried it sets a precedent for code for specific devices and could lead to further instances of this in the future. It could be argued the scatter approach up to now was specific to Niantic but it also worked for igb and i40e. I40e devices don’t require scatter but can handle it without issue if it is set. In the past this type of solution has been rejected as the preferred approach was to keep netdev-dpdk code generic as possible. That’s why I suggest deferring the patch in OVS until the required changes are made in DPDK to satisfy all cases. 17.11.2 is targeted for May 19th. We could have a solution in place for then. I'm not trying to obstruct this but these cases do arise so interested to hear what others think? Ian > +conf.rxmode.enable_scatter = 1; > +} > } > > conf.rxmode.hw_ip_checksum = (dev->hw_ol_features & > -- > 2.7.4 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 1/1] netdev-dpdk: fix RX jumbo for NICs not supporting scatter
Currently to RX jumbo packets fails for NICs not supporting scatter. Scatter is not strictly needed for jumbo support on RX. This change fixes the issue by only enabling scatter for NICs supporting it. Add a quirk for "igb" while the PMD is fixed to advertise scatter. Reported-by: Louis PeensSigned-off-by: Pablo Cascón Reviewed-by: Simon Horman --- lib/netdev-dpdk.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index ee39cbe..8f6a0a3 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -694,11 +694,19 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq) int diag = 0; int i; struct rte_eth_conf conf = port_conf; +struct rte_eth_dev_info info; /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly * enabled. */ if (dev->mtu > ETHER_MTU) { -conf.rxmode.enable_scatter = 1; +rte_eth_dev_info_get(dev->port_id, ); +if (info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) { +conf.rxmode.enable_scatter = 1; +} else if (!strcmp(info.driver_name, "igb")) { +/* Quirk: as of DPDK 17.11.1 igb's PMD requires explicitly + enabling scatter but fails to advertise it. */ +conf.rxmode.enable_scatter = 1; +} } conf.rxmode.hw_ip_checksum = (dev->hw_ol_features & -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 0/1] netdev-dpdk: fix RX jumbo for NICs not supporting scatter
Hello, here is a v2, this cover letter is to highlight that there is a quirk being added and that it has not been tested with "igb" (don't have an available system with igb at the moment). Some test with "igb" would be greatly appreciated, along with any feedback. Thanks, Pablo Pablo Cascón (1): netdev-dpdk: fix RX jumbo for NICs not supporting scatter lib/netdev-dpdk.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Yourlong awaited part payment of $5.5 Million USD
Attn: Your long awaited part payment of $5.5 Million USD is ready for immediate release to you, and it was electronically credited into an ATM Visa Card for easy delivery. Your new Payment Reference No.- 6363836, Password No: 006786, Pin Code No: 1787 Your Certificate of Merit Payment No: 05872, Released Code No: 1134; Immediate Telex confirmation No:- 043612; Secret Code No: TBKTA28 NAME,. ... ADDRESS,.. NEXT OF KIN. PHONE ... OCCUPATION ... REGARDS DR MARK HAMSEN ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] flow: Extend 5-tuple hash calculation for non-IP packets
Hi Ben, Indeed there was some problem with that line. Instead of using the eth_addr_to_uint64() func, I have added OVS_FORCE compiler flag to keep the logic more visible. +hash = hash_add64(hash, *((OVS_FORCE uint64_t *) >dl_dst)); +hash = hash_add(hash, (OVS_FORCE uint32_t) +*((uint8_t *) >dl_dst) + sizeof(uint64_t)); Thanks, Gabor > I haven't properly reviewed this, but Clang reports: > > ../lib/flow.c:2133:35: error: cast from 'const struct eth_addr *' to > 'uint64_t *' (aka 'unsigned long long *') increases required alignment > from 2 to 4 [-Werror,-Wcast-align] > > which is for this line: > > hash = hash_add64(hash, *((uint64_t *) >dl_dst)); > > Maybe you should use eth_addr_to_uint64(). ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Transfer
Good day to you. I am the director for payment and within my capacity as the person in charge, i want to transfer some amount out for safe keeping and investment. If you can assit, then let me know so i can send you the full details. I would want you to keep the letter secret. This is not spam mail. Regards, John Telephone +2348142131529 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 1/1] netdev-dpdk: fix RX jumbo for NICs not supporting scatter
On 12/04/18 14:05, Stokes, Ian wrote: On 10/04/18 21:08, Stokes, Ian wrote: Currently to RX jumbo packets fails for NICs not supporting scatter. Scatter is not strictly needed for jumbo support on RX. This change fixes the issue by only enabling scatter for NICs supporting it. Reported-by: Louis PeensSigned-off-by: Pablo Cascón Reviewed-by: Simon Horman --- lib/netdev-dpdk.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index ee39cbe..28b20b5 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -694,11 +694,14 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq) int diag = 0; int i; struct rte_eth_conf conf = port_conf; +struct rte_eth_dev_info info; /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly * enabled. */ if (dev->mtu > ETHER_MTU) { -conf.rxmode.enable_scatter = 1; +rte_eth_dev_info_get(dev->port_id, ); +if (info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) +conf.rxmode.enable_scatter = 1; Thanks for this, quick note: conf.rxmode.enable_scatter = 1; should be enclosed in braces as per OVS coding style. With an ixgbe device (a Niantic in this case), it never hits the offload condition above so that it can set scatter. Currently DEV_RX_OFFLOAD_SCATTER is not set for ixgbe or igb devices as part of info.rx_offload_capa. Surprisingly MTU still worked for the ixgbe device. Digging a little deeper it seems the need to set scatter explicitly for ixgbe was modified by commit net/ixgbe: remove MTU setting limitation (c009c6b1) "This patch allows setting this special MTU when device is stopped, because scattered_rx will be re-configured during next port start and driver may enable scattered receive according new MTU value." In the ixgbe case it's ok because the device is stopped at the time we call set mtu. However the patch breaks existing functionality for igb devices as they do not have a flag set for DEV_RX_OFFLOAD_SCATTER either and still explicitly require scatter to be set regardless of being stopped. Otherwise they fail to reconfigure and remain in a down state. I think a patch is needed for DPDK to set the DEV_RX_OFFLOAD_SCATTER flag for ixgbe and igb devices. I can look into that. Thanks for the testing and investigation. Sorry for unearthing this bug :) If setting the DEV_RX_OFFLOAD_SCATTER flag in the igb and ixgbe PMD fixes the issue (when this patch is applied to OVS) please warm other PMDs in the DPDK's mailing list In an effort to avoid breaking existing functionality in OVS I suggest we defer this patch until that support is in place for DPDK 17.11.2. Unfortunately deferring this patch would hurt Netronome's use case, this jumbo/scatter bug needs to be fixed. Will post a v2 including your style feedback and a check for driver_name being "igb" to set scatter regardless of the capability. That will make OVS-DPDK to work with "igb" before and after your fix. We can comment in that v2 if other extra checks are needed We could combine the current patch with the incremental below to fix the style issue and comment. Thoughts? -/* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly - * enabled. */ +/* For some NICs scatter_rx mode needs to be explicitly enabled. */ if (dev->mtu > ETHER_MTU) { rte_eth_dev_info_get(dev->port_id, ); -if (info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) +if (info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) { conf.rxmode.enable_scatter = 1; +} } Thanks Ian ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] dpdk docs: Drop file share in libvirt config.
On Thu, 2018-04-12 at 08:24 +0100, Lam, Tiago wrote: > On 11/04/2018 15:03, Stephen Finucane wrote: > > On Wed, 2018-04-11 at 09:54 -0400, Aaron Conole wrote: > > > Tiago Lamwrites: > > > > > > > When explaining on how to add vhost-user ports to a guest, using > > > > libvirt, the following piece of configuration is used: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This is used to facilitate sharing of a DPDK directory between the host > > > > and the guest. However, for this to work selinux also needs to be > > > > configured (or disabled). Furthermore, if one is using Ubuntu, libvirtd > > > > would need to be added to complain only in AppArmor. Instead, in [1] it > > > > is advised to use wget to get the DPDK sources over the internet, which > > > > avoids this differentiation. Thus, we drop this piece of configuration > > > > here as well and keep the example configuration as simple as possible. > > > > > > > > This has been verified on both a Fedora 27 image and a Ubuntu 16.04 LTS > > > > image. > > > > > > > > [1] > > > > http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/#dpdk-in-the-guest > > > > > > > > Signed-off-by: Tiago Lam > > > > --- > > > > > > > > CC'ed Stephen, > > > > > > > > I took the liberty of removing your TODO from here, as I read it to be > > > > related > > > > to the (now removed) SELinux instruction below. If you think it should > > > > still be > > > > there let me know and I'll gladly send a v2. > > > > > > I think it should remain until the selinux issues have been addressed. > > > > > > Is there a list somewhere of the AVC denials? Maybe it makes sense to > > > allow them. > > > > If I'm reading this correctly, Tiago is saying these exceptions only > > happen because we're sharing an arbitrary directory with the guest to > > avoid downloading the DPDK sources twice. Given that there's a valid > > workaround (just fetching sources twice), simply removing that section > > of the XML removes the need to disable SELinux. If so, dropping the > > warning does make sense in my mind. > > > > Stephen > > > > Thanks, Stephen. Yeah, that's what I was aiming at. In order to get the > file sharing working properly, one must fiddle around with either > SELinux or AppArmor, and that seems to be the sole reason why > `setenforce 0` is there. Losing the dependency on the file sharing means > we can lose any instructions that tell the user how to fiddle with > either of those systems. > > Just a note though, in that the user won't have to download the DPDK > sources twice, only once. Following the guide, the user first sets up > the vhost-user ports using libvirt, and once inside the VM he should > follow up on running `testpmd` inside the guest [1], where he will be > instructed to download the DPDK sources. This makes this piece of the > docs a bit more consistent, I think. > > [1] > http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/#dpdk-in-the-guest That all sounds fair to me. Acked-by: Stephen Finucane ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev