Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.
On Thu, Nov 14, 2019 at 10:56 AM Dumitru Ceara wrote: > > On Thu, Nov 14, 2019 at 7:43 PM Numan Siddique wrote: > > > > > > > > On Thu, Nov 14, 2019, 11:53 PM Han Zhou wrote: > >> > >> On Thu, Nov 14, 2019 at 6:22 AM Dumitru Ceara wrote: > >> > > >> > On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique wrote: > >> > > > >> > > On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara wrote: > >> > > > > >> > > > Reported-by: Numan Siddique > >> > > > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain > >> whenever possible.") > >> > > > Signed-off-by: Dumitru Ceara > >> > > > >> > > Thanks Dumitru for the fix. I applied this to master. > >> > > > >> > > Thanks > >> > > Numan > >> > > >> > Thanks Numan and sorry for missing this case in the first place. > >> > >> Sorry for missing this in review. Just want to confirm, is this a bug or > >> just improvement? I think it is not a bug, because the set is empty {}, it > >> should just get ignored in the end by ovn-controller when translating to > >> OpenFlow rules, correct? > > > > Hi Han, > > As far as I see ovn-controller doesn't accept empty set. In > parse_constant_set() we expect at least one constant: > > https://github.com/ovn-org/ovn/blob/master/lib/expr.c#L865 > > So I would say that this was a bug in the ovn-northd code I added. And > anyway it doesn't really make sense to add the flows if the address > sets are empty. > > Thanks, > Dumitru > Yes, you are right. I think I messed up with the empty address-set or port-group, which works, but not empty constant set in {}. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.
On Thu, Nov 14, 2019, 11:53 PM Han Zhou wrote: > On Thu, Nov 14, 2019 at 6:22 AM Dumitru Ceara wrote: > > > > On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique wrote: > > > > > > On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara > wrote: > > > > > > > > Reported-by: Numan Siddique > > > > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain > whenever possible.") > > > > Signed-off-by: Dumitru Ceara > > > > > > Thanks Dumitru for the fix. I applied this to master. > > > > > > Thanks > > > Numan > > > > Thanks Numan and sorry for missing this case in the first place. > > Sorry for missing this in review. Just want to confirm, is this a bug or > just improvement? I think it is not a bug, because the set is empty {}, it > should just get ignored in the end by ovn-controller when translating to > OpenFlow rules, correct? > This caused system tests to fail as there were warnings in the ovn-controller log because of empty set. Thanks > > > > > > > > --- > > > > northd/ovn-northd.c | 12 > > > > 1 file changed, 8 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > > index d6beb97..6742bc0 100644 > > > > --- a/northd/ovn-northd.c > > > > +++ b/northd/ovn-northd.c > > > > @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct > ovn_port *op, > > > > } > > > > } > > > > > > > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, > sw_op, > > > > -sw_od, 75, lflows); > > > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, > sw_op, > > > > -sw_od, 75, lflows); > > > > +if (!sset_is_empty(_ips_v4)) { > > > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, > AF_INET, sw_op, > > > > +sw_od, 75, lflows); > > > > +} > > > > +if (!sset_is_empty(_ips_v6)) { > > > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, > AF_INET6, sw_op, > > > > +sw_od, 75, lflows); > > > > +} > > > > > > > > sset_destroy(_ips_v4); > > > > sset_destroy(_ips_v6); > > > > -- > > > > 1.8.3.1 > > > > > > > > ___ > > > > dev mailing list > > > > d...@openvswitch.org > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > > > > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.
On Thu, Nov 14, 2019 at 6:22 AM Dumitru Ceara wrote: > > On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique wrote: > > > > On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara wrote: > > > > > > Reported-by: Numan Siddique > > > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.") > > > Signed-off-by: Dumitru Ceara > > > > Thanks Dumitru for the fix. I applied this to master. > > > > Thanks > > Numan > > Thanks Numan and sorry for missing this case in the first place. Sorry for missing this in review. Just want to confirm, is this a bug or just improvement? I think it is not a bug, because the set is empty {}, it should just get ignored in the end by ovn-controller when translating to OpenFlow rules, correct? > > > > > > --- > > > northd/ovn-northd.c | 12 > > > 1 file changed, 8 insertions(+), 4 deletions(-) > > > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > index d6beb97..6742bc0 100644 > > > --- a/northd/ovn-northd.c > > > +++ b/northd/ovn-northd.c > > > @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op, > > > } > > > } > > > > > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op, > > > -sw_od, 75, lflows); > > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op, > > > -sw_od, 75, lflows); > > > +if (!sset_is_empty(_ips_v4)) { > > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op, > > > +sw_od, 75, lflows); > > > +} > > > +if (!sset_is_empty(_ips_v6)) { > > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op, > > > +sw_od, 75, lflows); > > > +} > > > > > > sset_destroy(_ips_v4); > > > sset_destroy(_ips_v6); > > > -- > > > 1.8.3.1 > > > > > > ___ > > > dev mailing list > > > d...@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn v2 00/13] OVN Interconnection
On Wed, Nov 13, 2019 at 10:27 AM Numan Siddique wrote: > > On Thu, Oct 31, 2019 at 2:43 AM Han Zhou wrote: > > > > The series supports interconnecting multiple OVN deployments (e.g. located at > > multiple data centers) through logical routers connected with tansit logical > > switches with overlay tunnels, managed through OVN control plane. See the > > ovn-architecture.rst document updates for more details, and find the > > instructions in Documentation/tutorials/ovn-interconnection.rst. > > > > Han Zhou (13): > > ovn-architecture: Add documentation for OVN interconnection feature. > > ovn-inb: Interconnection northbound DB schema and CLI. > > ovn-isb: Interconnection southbound DB schema and CLI. > > ovn-ic: Interconnection controller with AZ registeration. > > ovn-northd.c: Refactor allocate_tnlid. > > ovn-ic: Transit switch controller. > > ovn-sb: Add columns is_interconn and is_remote to Chassis. > > ovn-ic: Interconnection gateway controller. > > ovn-ic: Interconnection port controller. > > ovn.at: e2e test for OVN interconnection. > > ovn-ctl: Refactor to reduce redundant code. > > ovn-ctl: Support commands for interconnection. > > tutorial: Add tutorial for OVN Interconnection. > > > > Hi Han, > > Thanks for working on this feature. It's an interesting use case. > > I had a quick look at all the patches. > Numan, thanks a lot for the thorough review! Please see my response inlined. > I have few comments > > 1. I would suggest to rename the DBs as ovn-ic-nb.ovsschema (and the > same for ovn-isb). > The DB name is - OVN_IC_Northbound. So it would make sense to have > - ovn-ic-nb.ovsschema > I would also suggest to rename the utilities to ovn-ic-nbctl and > ovn-ic-sbctl. > With ovn-inbctl/ovn-isbctl, it is really confusing. > Sure, I felt not quite convenient with two dashes in a command name. I agree that ovn-ic-nbctl and ovn-ic-sbctl are more clear. I can change it. > 2. ovn-ic service writes to interconnect south db, ovn north db and > ovn south db. Writing to ic south db and > ovn north db is fine. But it seems a little odd for ovn-ic to > write to south db. From what I understand it writes > to south db for 3 purposes > a. Updating the tunnel_key column of datapath_binding > representing the transit switch > b. Updating the key column of port_binding representing the > logical router port connecting to the transit switch. > c. Creating chassis rows for remote gateway chassis. > >I think it's better if ovn-ic can delegate all these to ovn-northd. > For (a) and (b), ovn-ic can set the generated tunnel key >in the other_config/options column of Logical switch/Logical switch > port. ovn-northd can internally set this value to >the south db. > >For (c), I think its better we add another table - Remote_Chassis > (or some other appropriate name) . ovn-ic will create rows >in this table for each remote chassis and ovn-northd will create > chassis row in south db. >We already have Gateway_Chassis table in North db. So I think it's > fine if we add Remote_Chassis table. The only odd thing >would be is to store the encap details in North db. > >With this, ovn-ic, doesn't need to worry about syncing between > interconnect south db and ovn south db. In my opinion ovn-ic >should act more like CMS to each availability zone and hence should > not do any write transactions to the south db. > > Any concerns with this proposed approach ? > We could avoid ovn-ic writing directly to SB with some extra logic in northd, but I don't see any problem for ovn-ic to update SB directly. First of all, we have hypervisors creating and updating SB directly for Chassis and Encap records. The design here is that ovn-ic updates Chassis and Encap on behalf of remote gateway nodes, which I think is straightforward and reasonable. Similarly, port-binding's chassis column is updated the same way as how hypervisors are updating it. Secondly, for tunnel keys updating, it may seem graceful to update from northd, since northd is the only client that updates tunnel keys today, but since ovn-ic is responsible for calculating these keys, and it already has connection to SB, I think it is just more natural and efficient to update it directly, to avoid the extra logic and unnecessary latency from northd with another round of computation. The scope of the ovn-ic is only for the interconnection objects, so I don't see any conflict or ownership ambiguity here. What do you think? > 3. In patch 7,its better to rename the ovs configuration option - > "external_ids:is-interconn" to "external_ids:ovn-is-interconn". >You also need to document it in ovn-controller-8.xml. > Agree! >Or maybe we can remove this option - external_ids:is-interconn. We > probably don't need this if we do like below > >2 (c) can also be done this way > - User creates transit switch. > - ovn-ic creates transit switch in north db. >
Re: [ovs-dev] ovs-vswitchd is crashing while adding dpdk ports to the ovs bridge
> Hi, > > I'm trying to add the dpdk ports to the ovs bridge, and while doing so, i > see the ovs daemon crashing with the 'operation not permitted' error, > here's my architecture, i am on a debian vm (pseudo host) that has virt-io > devices attached to it, i'm binding these virt-io's to the dpdk (igb_uio) > and trying to add it to the ovs bridge. > dpdk version: 17.11.2 > ovs-version: 2.9.0 Hi Deepak, You're using very old OVS and DPDK releases. A lot of issues was fixed in later stable releases. Try using OVS 2.9.6 and DPDK 17.11.6 that are current recommended versions. (There are already more recent DPDK 17.11.* releases, but they wasn't verified with OVS yet). With OVS 2.9.6 at least it should stop crashing because of the following commit: 3e417c853b6d ("netdev-dpdk: Don't use PMD driver if not configured successfully") Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [RFC ovn] ovn-northd: ls_*_acl behavior not consistent for untracked flows
From: venu iyer If one creates a port group and a MAC address set, and create an ACL that prevents packets being output to a port in the Port Group from any MAC address in the address set, the outcome is not consistent. The outcome depends on whether there is a stateful rule on the switch or not. Specifically: Assuming 'l2pg' is a port group with a list of ports and 'macs' is an Address Set with a list of MAC addresses and the intent is to drop all packets with source MAC address in 'macs' to any port in 'l2pg' using: ovn-nbctl acl-add to-lport 5000 \ "outport == @l2pg && eth.src == $macs" drop Without any stateful rule on the logical switch, the corresponding logical flow looks like: table=4 (ls_out_acl), priority=6000,\ match=(outport == @l2pg && eth.src == $macs), \ action=(/* drop */) Based on this rule, any packet destined to the ports in 'l2pg' with source Address in 'macs' will be dropped - as is expected from the ACL above. While with a Stateful rule on the switch (any stateful rule will do), the same rule looks like: table=4 (ls_out_acl), priority=6000, \ match=((!ct.est || (ct.est && ct_label.blocked == 1)) && \ (outport == @l2pg && eth.src == $macs)), action=(/* drop */) However, with this, only packets that are tracked will match the rule and be dropped, e.g. IP packets will be dropped, but ARP etc., will go through - this is not expected. Based on whether there are stateful rules or not on the switch, untracked packets will see different behavior. The fix is to complete the rules in the stateful case, i.e. instead of looking for flows that are not established or not new, we should first check if flows are not tracked. The fix was tested in the above scenario. Additionally, the following ACL was added to test the change in the "allow" case (i.e. to drop all the packets based on the above ACL, but have a higher priority rule that selectively allow ARP). ovn-nbctl acl-add ls1 to-lport 6000 \ "outport == @l2pg && eth.type == 0x806" allow with and without the staeful rule to make sure the behavior is the same. OVN test cases were run with this change and no failures were seen. Signed-off-by: venu iyer --- northd/ovn-northd.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 6742bc002..1f5f97796 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -4585,12 +4585,12 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, * deletion. There is no need to commit here, so we can just * proceed to the next table. We use this to ensure that this * connection is still allowed by the currently defined - * policy. */ + * policy. Match untracked packets too. */ ds_clear(); ds_clear(); ds_put_format(, - "!ct.new && ct.est && !ct.rpl" - " && ct_label.blocked == 0 && (%s)", + "(!ct.trk || (!ct.new && ct.est && !ct.rpl" + " && ct_label.blocked == 0)) && (%s)", acl->match); build_acl_log(, acl); @@ -4613,10 +4613,10 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, * depending on whether the connection was previously committed * to the connection tracker with ct_commit. */ if (has_stateful) { -/* If the packet is not part of an established connection, then +/* If the packet is not tracked or part of an established connection, then * we can simply reject/drop it. */ ds_put_cstr(, -"(!ct.est || (ct.est && ct_label.blocked == 1))"); +"(!ct.trk || !ct.est || (ct.est && ct_label.blocked == 1))"); if (!strcmp(acl->action, "reject")) { build_reject_acl_rules(od, lflows, stage, acl, , ); -- 2.17.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.
On Wed, Nov 13, 2019 at 9:18 AM Dumitru Ceara wrote: > > On Fri, Nov 8, 2019 at 8:27 PM Han Zhou wrote: > > > > > > > > On Fri, Nov 8, 2019 at 11:22 AM Han Zhou wrote: > > > > > > 1. storage data and the void *arg of init() breaks the engine node data > > > encapsulation. > > > 2. engine_node_valid(_flow_output, engine_run_id) is not needed? > > > Should use storage to access instead? > > > 3. refactor of engine is good but better to be a separate commit > > > 4. we can have a new interface: engine_get_data(), which returns data if > > > it is valid. we should never expose the data directly. We should init the > > > engine node with dynamically allocated engine data structure (and > > > remember to free during destroy) > > > > Oops! please ignore the above part since it was draft and I forgot to > > delete after editing the formal response, mostly redundant :-) > > Real response started here => > > > > > > Hi Dumitru, > > > > > > Sorry for late response. > > > On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara wrote: > > > > > > > > The incremental processing engine might stop a run before the > > > > en_runtime_data node is processed. In such cases the ed_runtime_data > > > > fields might contain pointers to already deleted SB records. For > > > > example, if a port binding corresponding to a patch port is removed from > > > > the SB database and the incremental processing engine aborts before the > > > > en_runtime_data node is processed then the corresponding local_datapath > > > > hashtable entry in ed_runtime_data is stale and will store a pointer to > > > > the already freed sbrec_port_binding record. > > > > > > > > This will cause invalid memory accesses in various places (e.g., > > > > pinctrl_run() -> prepare_ipv6_ras()). > > > > > > > > To fix the issue we need a way to track how each node was processed > > > > during an engine run. This commit transforms the 'changed' field in > > > > struct engine_node in a 'state' field. Possible node states are: > > > > - "New": the node is not yet initialized. > > > > - "Stale": data in the node is not up to date with the DB. > > > > - "Updated": data in the node is valid but was updated during > > > > the last run of the engine. > > > > - "Valid": data in the node is valid and didn't change during > > > > the last run of the engine. > > > > - "Aborted": during the last run, processing was aborted for > > > > this node. > > > > - "Destroyed": the node was already cleaned up. > > > > > > > > We also add a separation between engine node data that can be accessed > > > > at any time (regardless if the last engine run was successful or not) > > > > and data that may be accessed only if the nodes are up to date. This > > > > helps avoiding custom "engine_node_valid" handlers for different > > > > nodes. > > > > > > > > The commit also simplifies the logic of calling engine_run and > > > > engine_need_run in order to reduce the number of external variables > > > > required to track the result of the last engine execution. > > > > > > > > Functions that need to be called from the main loop and depend on > > > > various data contents of the engine's nodes are now called only if > > > > the data is up to date. > > > > > > > > CC: Han Zhou > > > > Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine > > > > - quiet mode.") > > > > Signed-off-by: Dumitru Ceara > > > > > > > > --- > > > > v2: Address Han's comments: > > > > - call engine_node_valid() in all the places where node local data is > > > > used. > > > > - move out "global" data outside the engine nodes. Make a clear > > > > separation between data that can be safely used at any time and data > > > > that can be used only when the engine run was successful. > > > > > > I am concerned with this kind of separation of *global* data, which > > > breaks the data encapsulation of engine node, and can easily result in > > > hidden dependency. As you know the main purpose of the I-P engine is to > > > force explicit dependency exposed between different engine nodes thus > > > ensure the correctness (at least it helps to ensure) of incremental > > > processing. > > > > > > Here is my proposal to address the problem with better encapsulation. > > > > > > Firstly, let's avoid direct engine data access outside of engine module. > > > At engine node construction, instead of using reference of stack variable > > > (such as struct ed_type_runtime_data ed_runtime_data), we can allocate > > > the memory in the engine node's init() interface, and free in the > > > cleanup() interface. This way, there will be no way to directly access > > > engine data like _runtime_data.local_datapaths. > > > > > > Secondly, let's add a engine module interface engine_get_data() to > > > retrieve *and validate* data for an engine node: > > > void * > > > engine_get_data(struct engine_node *node, uint64_t run_id) > > > { > > > if (engine_node_valid(node, run_id)) { > > > return
[ovs-dev] [PATCH v3 ovn 4/4] ovn-controller: Fix use of dangling pointers in I-P runtime_data.
The incremental processing engine might stop a run before the en_runtime_data node is processed. In such cases the ed_runtime_data fields might contain pointers to already deleted SB records. For example, if a port binding corresponding to a patch port is removed from the SB database and the incremental processing engine aborts before the en_runtime_data node is processed then the corresponding local_datapath hashtable entry in ed_runtime_data is stale and will store a pointer to the already freed sbrec_port_binding record. This will cause invalid memory accesses in various places (e.g., pinctrl_run() -> prepare_ipv6_ras()). To fix the issue we introduce the concept of "internal_data" vs "data" in engine nodes. The first field, "internal_data", is data that can be accessed by the incremental engine nodes handlers (data from other nodes must be considered read-only and data from other nodes must not be accessed if the nodes haven't been refreshed in the current iteration). The second field, "data" is a pointer reset at engine_run() and if non-NULL indicates to users outside the incremental engine that the data is safe to use. This commit also adds an "is_valid()" method to engine nodes to allow users to override the default behavior of determining if data is valid in a node (e.g., for the ct-zones node the data is always safe to access). CC: Han Zhou Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine - quiet mode.") Signed-off-by: Dumitru Ceara --- v3: - split the change in series. - Address Han's comments: - fix the data encapsulation issue. - add is_valid method to nodes. - add internal_data/data fields to nodes as it makes it easier to write the code instead of adding an "engine_get_data()" API. v2: Address Han's comments: - call engine_node_valid() in all the places where node local data is used. - move out "global" data outside the engine nodes. Make a clear separation between data that can be safely used at any time and data that can be used only when the engine run was successful. - add a debug log for iterations when the engine didn't run. - refactor a bit more the incremental engine code. --- controller/ovn-controller.c | 227 ++- lib/inc-proc-eng.c | 25 - lib/inc-proc-eng.h | 28 + 3 files changed, 162 insertions(+), 118 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 51f466a..0475a30 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -741,8 +741,7 @@ struct ed_type_ofctrl_is_connected { static void en_ofctrl_is_connected_init(struct engine_node *node) { -struct ed_type_ofctrl_is_connected *data = -(struct ed_type_ofctrl_is_connected *)node->data; +struct ed_type_ofctrl_is_connected *data = node->internal_data; data->connected = false; } @@ -754,8 +753,7 @@ en_ofctrl_is_connected_cleanup(struct engine_node *node OVS_UNUSED) static void en_ofctrl_is_connected_run(struct engine_node *node) { -struct ed_type_ofctrl_is_connected *data = -(struct ed_type_ofctrl_is_connected *)node->data; +struct ed_type_ofctrl_is_connected *data = node->internal_data; if (data->connected != ofctrl_is_connected()) { data->connected = !data->connected; engine_set_node_state(node, EN_UPDATED); @@ -775,7 +773,7 @@ struct ed_type_addr_sets { static void en_addr_sets_init(struct engine_node *node) { -struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data; +struct ed_type_addr_sets *as = node->internal_data; shash_init(>addr_sets); as->change_tracked = false; sset_init(>new); @@ -786,7 +784,7 @@ en_addr_sets_init(struct engine_node *node) static void en_addr_sets_cleanup(struct engine_node *node) { -struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data; +struct ed_type_addr_sets *as = node->internal_data; expr_const_sets_destroy(>addr_sets); shash_destroy(>addr_sets); sset_destroy(>new); @@ -797,7 +795,7 @@ en_addr_sets_cleanup(struct engine_node *node) static void en_addr_sets_run(struct engine_node *node) { -struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data; +struct ed_type_addr_sets *as = node->internal_data; sset_clear(>new); sset_clear(>deleted); @@ -817,7 +815,7 @@ en_addr_sets_run(struct engine_node *node) static bool addr_sets_sb_address_set_handler(struct engine_node *node) { -struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data; +struct ed_type_addr_sets *as = node->internal_data; sset_clear(>new); sset_clear(>deleted); @@ -852,7 +850,7 @@ struct ed_type_port_groups{ static void en_port_groups_init(struct engine_node *node) { -struct ed_type_port_groups *pg = (struct ed_type_port_groups *)node->data; +struct ed_type_port_groups *pg = node->internal_data;
[ovs-dev] [PATCH v3 ovn 3/4] ovn-controller: Add separate I-P engine node for processing ct-zones.
Signed-off-by: Dumitru Ceara --- controller/ovn-controller.c | 117 +-- 1 file changed, 78 insertions(+), 39 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 4f8ceae..51f466a 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -933,11 +933,6 @@ struct ed_type_runtime_data { * _ */ struct sset local_lport_ids; struct sset active_tunnels; - -/* connection tracking zones. */ -unsigned long ct_zone_bitmap[BITMAP_N_LONGS(MAX_CT_ZONES)]; -struct shash pending_ct_zones; -struct simap ct_zones; }; static void @@ -945,24 +940,11 @@ en_runtime_data_init(struct engine_node *node) { struct ed_type_runtime_data *data = (struct ed_type_runtime_data *)node->data; -struct ovsrec_open_vswitch_table *ovs_table = -(struct ovsrec_open_vswitch_table *)EN_OVSDB_GET( -engine_get_input("OVS_open_vswitch", node)); -struct ovsrec_bridge_table *bridge_table = -(struct ovsrec_bridge_table *)EN_OVSDB_GET( -engine_get_input("OVS_bridge", node)); + hmap_init(>local_datapaths); sset_init(>local_lports); sset_init(>local_lport_ids); sset_init(>active_tunnels); -shash_init(>pending_ct_zones); -simap_init(>ct_zones); - -/* Initialize connection tracking zones. */ -memset(data->ct_zone_bitmap, 0, sizeof data->ct_zone_bitmap); -bitmap_set1(data->ct_zone_bitmap, 0); /* Zone 0 is reserved. */ -restore_ct_zones(bridge_table, ovs_table, - >ct_zones, data->ct_zone_bitmap); } static void @@ -983,9 +965,6 @@ en_runtime_data_cleanup(struct engine_node *node) free(cur_node); } hmap_destroy(>local_datapaths); - -simap_destroy(>ct_zones); -shash_destroy(>pending_ct_zones); } static void @@ -997,9 +976,6 @@ en_runtime_data_run(struct engine_node *node) struct sset *local_lports = >local_lports; struct sset *local_lport_ids = >local_lport_ids; struct sset *active_tunnels = >active_tunnels; -unsigned long *ct_zone_bitmap = data->ct_zone_bitmap; -struct shash *pending_ct_zones = >pending_ct_zones; -struct simap *ct_zones = >ct_zones; static bool first_run = true; if (first_run) { @@ -1094,9 +1070,6 @@ en_runtime_data_run(struct engine_node *node) ovs_table, local_datapaths, local_lports, local_lport_ids); -update_ct_zones(local_lports, local_datapaths, ct_zones, -ct_zone_bitmap, pending_ct_zones); - engine_set_node_state(node, EN_UPDATED); } @@ -1138,6 +,55 @@ runtime_data_sb_port_binding_handler(struct engine_node *node) return !changed; } +/* Connection tracking zones. */ +struct ed_type_ct_zones { +unsigned long bitmap[BITMAP_N_LONGS(MAX_CT_ZONES)]; +struct shash pending; +struct simap current; +}; + +static void +en_ct_zones_init(struct engine_node *node) +{ +struct ed_type_ct_zones *data = node->data; +struct ovsrec_open_vswitch_table *ovs_table = +(struct ovsrec_open_vswitch_table *)EN_OVSDB_GET( +engine_get_input("OVS_open_vswitch", node)); +struct ovsrec_bridge_table *bridge_table = +(struct ovsrec_bridge_table *)EN_OVSDB_GET( +engine_get_input("OVS_bridge", node)); + +shash_init(>pending); +simap_init(>current); + +memset(data->bitmap, 0, sizeof data->bitmap); +bitmap_set1(data->bitmap, 0); /* Zone 0 is reserved. */ +restore_ct_zones(bridge_table, ovs_table, >current, data->bitmap); +} + +static void +en_ct_zones_cleanup(struct engine_node *node) +{ +struct ed_type_ct_zones *data = node->data; + +simap_destroy(>current); +shash_destroy(>pending); +} + +static void +en_ct_zones_run(struct engine_node *node) +{ +struct ed_type_ct_zones *data = node->data; +struct ed_type_runtime_data *rt_data = +(struct ed_type_runtime_data *)engine_get_input( +"runtime_data", node)->data; + +update_ct_zones(_data->local_lports, _data->local_datapaths, +>current, data->bitmap, >pending); + +engine_set_node_state(node, EN_UPDATED); +} + struct ed_type_mff_ovn_geneve { enum mf_field_id mff_ovn_geneve; }; @@ -1215,7 +1237,11 @@ en_flow_output_run(struct engine_node *node) struct sset *local_lports = _data->local_lports; struct sset *local_lport_ids = _data->local_lport_ids; struct sset *active_tunnels = _data->active_tunnels; -struct simap *ct_zones = _data->ct_zones; + +struct ed_type_ct_zones *ct_zones_data = +(struct ed_type_ct_zones *)engine_get_input( +"ct_zones", node)->data; +struct simap *ct_zones = _zones_data->current; struct ed_type_mff_ovn_geneve *ed_mff_ovn_geneve = (struct ed_type_mff_ovn_geneve *)engine_get_input( @@ -1445,7 +1471,11 @@ flow_output_sb_port_binding_handler(struct engine_node *node)
[ovs-dev] [PATCH v3 ovn 2/4] ovn-controller: Add per node states to I-P engine.
This commit transforms the 'changed' field in struct engine_node in a 'state' field. Possible node states are: - "Stale": data in the node is not up to date with the DB. - "Updated": data in the node is valid but was updated during the last run of the engine. - "Valid": data in the node is valid and didn't change during the last run of the engine. - "Aborted": during the last run, processing was aborted for this node. This commit also further refactors the I-P engine: - instead of recursively performing all the engine processing a preprocessing stage is added (engine_get_nodes()) before the main processing loop is executed in order to topologically sort nodes in the engine such that all inputs of a given node appear in the sorted array before the node itself. This simplifies a bit the code in engine_run(). - remove the need for using an engine_run_id by using the newly added states. Signed-off-by: Dumitru Ceara --- controller/ovn-controller.c | 88 ++--- lib/inc-proc-eng.c | 218 --- lib/inc-proc-eng.h | 74 +++ 3 files changed, 267 insertions(+), 113 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 3922f3d..4f8ceae 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -758,10 +758,10 @@ en_ofctrl_is_connected_run(struct engine_node *node) (struct ed_type_ofctrl_is_connected *)node->data; if (data->connected != ofctrl_is_connected()) { data->connected = !data->connected; -node->changed = true; +engine_set_node_state(node, EN_UPDATED); return; } -node->changed = false; +engine_set_node_state(node, EN_VALID); } struct ed_type_addr_sets { @@ -811,7 +811,7 @@ en_addr_sets_run(struct engine_node *node) addr_sets_init(as_table, >addr_sets); as->change_tracked = false; -node->changed = true; +engine_set_node_state(node, EN_UPDATED); } static bool @@ -830,11 +830,14 @@ addr_sets_sb_address_set_handler(struct engine_node *node) addr_sets_update(as_table, >addr_sets, >new, >deleted, >updated); -node->changed = !sset_is_empty(>new) || !sset_is_empty(>deleted) -|| !sset_is_empty(>updated); +if (!sset_is_empty(>new) || !sset_is_empty(>deleted) || +!sset_is_empty(>updated)) { +engine_set_node_state(node, EN_UPDATED); +} else { +engine_set_node_state(node, EN_VALID); +} as->change_tracked = true; -node->changed = true; return true; } @@ -885,7 +888,7 @@ en_port_groups_run(struct engine_node *node) port_groups_init(pg_table, >port_groups); pg->change_tracked = false; -node->changed = true; +engine_set_node_state(node, EN_UPDATED); } static bool @@ -904,11 +907,14 @@ port_groups_sb_port_group_handler(struct engine_node *node) port_groups_update(pg_table, >port_groups, >new, >deleted, >updated); -node->changed = !sset_is_empty(>new) || !sset_is_empty(>deleted) -|| !sset_is_empty(>updated); +if (!sset_is_empty(>new) || !sset_is_empty(>deleted) || +!sset_is_empty(>updated)) { +engine_set_node_state(node, EN_UPDATED); +} else { +engine_set_node_state(node, EN_VALID); +} pg->change_tracked = true; -node->changed = true; return true; } @@ -1091,7 +1097,7 @@ en_runtime_data_run(struct engine_node *node) update_ct_zones(local_lports, local_datapaths, ct_zones, ct_zone_bitmap, pending_ct_zones); -node->changed = true; +engine_set_node_state(node, EN_UPDATED); } static bool @@ -1157,10 +1163,10 @@ en_mff_ovn_geneve_run(struct engine_node *node) enum mf_field_id mff_ovn_geneve = ofctrl_get_mf_field_id(); if (data->mff_ovn_geneve != mff_ovn_geneve) { data->mff_ovn_geneve = mff_ovn_geneve; -node->changed = true; +engine_set_node_state(node, EN_UPDATED); return; } -node->changed = false; +engine_set_node_state(node, EN_VALID); } struct ed_type_flow_output { @@ -1322,7 +1328,7 @@ en_flow_output_run(struct engine_node *node) active_tunnels, flow_table); -node->changed = true; +engine_set_node_state(node, EN_UPDATED); } static bool @@ -1404,7 +1410,7 @@ flow_output_sb_logical_flow_handler(struct engine_node *node) flow_table, group_table, meter_table, lfrr, conj_id_ofs); -node->changed = true; +engine_set_node_state(node, EN_UPDATED); return handled; } @@ -1427,7 +1433,7 @@ flow_output_sb_mac_binding_handler(struct engine_node *node) lflow_handle_changed_neighbors(sbrec_port_binding_by_name, mac_binding_table, flow_table); -node->changed = true; +engine_set_node_state(node, EN_UPDATED); return true; } @@
[ovs-dev] [PATCH v3 ovn 1/4] ovn-controller: Refactor I-P engine_run() tracking.
This commit simplifies the logic of calling engine_run and engine_need_run in order to reduce the number of external variables required to track the result of the last engine execution. The engine code is also refactored a bit and the engine_run() function is split in different functions that handle computing/recomputing a node. Signed-off-by: Dumitru Ceara --- controller/ovn-controller.c | 33 ++- lib/inc-proc-eng.c | 124 +-- lib/inc-proc-eng.h |7 ++ 3 files changed, 107 insertions(+), 57 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 9ab98be..3922f3d 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -1942,7 +1942,6 @@ main(int argc, char *argv[]) _pkt); uint64_t engine_run_id = 0; -uint64_t old_engine_run_id = 0; bool engine_run_done = true; unsigned int ovs_cond_seqno = UINT_MAX; @@ -1952,10 +1951,11 @@ main(int argc, char *argv[]) exiting = false; restart = false; while (!exiting) { +engine_run_id++; + update_sb_db(ovs_idl_loop.idl, ovnsb_idl_loop.idl); update_ssl_config(ovsrec_ssl_table_get(ovs_idl_loop.idl)); ofctrl_set_probe_interval(get_ofctrl_probe_interval(ovs_idl_loop.idl)); -old_engine_run_id = engine_run_id; struct ovsdb_idl_txn *ovs_idl_txn = ovsdb_idl_loop_run(_idl_loop); unsigned int new_ovs_cond_seqno @@ -2047,12 +2047,12 @@ main(int argc, char *argv[]) if (engine_run_done) { engine_set_abort_recompute(true); engine_run_done = engine_run(_flow_output, - ++engine_run_id); + engine_run_id); } } else { engine_set_abort_recompute(false); engine_run_done = true; -engine_run(_flow_output, ++engine_run_id); +engine_run(_flow_output, engine_run_id); } } stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, @@ -2095,17 +2095,20 @@ main(int argc, char *argv[]) } } -if (old_engine_run_id == engine_run_id || !engine_run_done) { -if (!engine_run_done || engine_need_run(_flow_output)) { -VLOG_DBG("engine did not run, force recompute next time: " - "br_int %p, chassis %p", br_int, chassis); -engine_set_force_recompute(true); -poll_immediate_wake(); -} else { -VLOG_DBG("engine did not run, and it was not needed" - " either: br_int %p, chassis %p", - br_int, chassis); -} +if (engine_need_run(_flow_output, engine_run_id)) { +VLOG_DBG("engine did not run, force recompute next time: " +"br_int %p, chassis %p", br_int, chassis); +engine_set_force_recompute(true); +poll_immediate_wake(); +} else if (!engine_run_done) { +VLOG_DBG("engine was aborted, force recompute next time: " + "br_int %p, chassis %p", br_int, chassis); +engine_set_force_recompute(true); +poll_immediate_wake(); +} else if (!engine_has_run(_flow_output, engine_run_id)) { +VLOG_DBG("engine did not run, and it was not needed" + " either: br_int %p, chassis %p", + br_int, chassis); } else { engine_set_force_recompute(false); } diff --git a/lib/inc-proc-eng.c b/lib/inc-proc-eng.c index 1064a08..8a085e2 100644 --- a/lib/inc-proc-eng.c +++ b/lib/inc-proc-eng.c @@ -129,14 +129,72 @@ engine_ovsdb_node_add_index(struct engine_node *node, const char *name, } bool -engine_run(struct engine_node *node, uint64_t run_id) +engine_has_run(struct engine_node *node, uint64_t run_id) +{ +return node->run_id == run_id; +} + +/* Do a full recompute (or at least try). If we're not allowed then + * mark the node as "aborted". + */ +static bool +engine_recompute(struct engine_node *node, bool forced, bool allowed) +{ +VLOG_DBG("node: %s, recompute (%s)", node->name, + forced ? "forced" : "triggered"); + +if (!allowed) { +VLOG_DBG("node: %s, recompute aborted", node->name); +return false; +} + +node->run(node); +VLOG_DBG("node: %s, changed: %d", node->name, node->changed); +return true; +} + +/* Return true if the node could be computed without triggerring a
Re: [ovs-dev] [PATCH] dpdk: Deprecate pdump support.
Hi, Sorry, I don’t work on OVS, so I never tested pdump with OVS. I remembered, Ciara testing earlier versions of pdump with OVS in the past. Thanks, Reshma > -Original Message- > From: David Marchand > Sent: Thursday, November 14, 2019 4:41 PM > To: Pattan, Reshma > Cc: ovs-dev@openvswitch.org; Kevin Traynor ; > Aaron Conole ; Loftus, Ciara > ; Flavio Leitner ; Ilya Maximets > ; Stokes, Ian > Subject: Re: [PATCH] dpdk: Deprecate pdump support. > > Hello Reshma, > > Has pdump been tested (recently) with OVS? > > > On Mon, Nov 11, 2019 at 7:53 PM Ilya Maximets > wrote: > > > > The conventional way for packet dumping in OVS is to use ovs-tcpdump > > that works via traffic mirroring. DPDK pdump could probably be used > > for some lower level debugging, but it is not commonly used for > > various reasons. > > > > There are lots of limitations for using this functionality in practice. > > Most of them connected with running secondary pdump process and > memory > > layout issues like requirement to disable ASLR in kernel. > > More details are available in DPDK guide: > > https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi- > p > > rocess-limitations > > > > Beside the functional limitations it's also hard to use this > > functionality correctly. User must be sure that OVS and pdump utility > > are running on different CPU cores, which is hard because non-PMD > > threads could float over available CPU cores. This or any other > > misconfiguration will likely lead to crash of the pdump utility or/and > > OVS. > > > > Another problem is that the user must actually have this special pdump > > utility in a system and it might be not available in distributions. > > > > This change disables pdump support by default introducing special > > configuration option '--enable-dpdk-pdump'. Deprecation warnings will > > be shown to users on configuration and in runtime. > > > > Claiming to completely remove this functionality from OVS in one of > > the next releases. > > > > Signed-off-by: Ilya Maximets > > Acked-by: Aaron Conole > > - Recompiled from scratch, on OVS master (before this patch) with dpdk > 18.11.2. > > other_config: {dpdk-init="true", pmd-cpu-mask="0x8002"} > > 2 physical ports, 2 vhost ports. > > 2019-11-14T14:13:09.596Z|00018|dpdk|INFO|Using DPDK 18.11.2 2019-11- > 14T14:13:09.596Z|00019|dpdk|INFO|DPDK Enabled - initializing... > 2019-11-14T14:13:09.596Z|00020|dpdk|INFO|No vhost-sock-dir provided - > defaulting to //var/run/openvswitch 2019-11- > 14T14:13:09.596Z|00021|dpdk|INFO|IOMMU support for vhost-user-client > disabled. > 2019-11-14T14:13:09.596Z|00022|dpdk|INFO|POSTCOPY support for vhost- > user-client disabled. > 2019-11-14T14:13:09.596Z|00023|dpdk|INFO|Per port memory for DPDK > devices disabled. > 2019-11-14T14:13:09.596Z|00024|dpdk|INFO|EAL ARGS: ovs-vswitchd -- > socket-mem 1024 --socket-limit 1024 -l 0. > 2019-11-14T14:13:09.600Z|00025|dpdk|INFO|EAL: Detected 28 lcore(s) > 2019-11-14T14:13:09.600Z|00026|dpdk|INFO|EAL: Detected 1 NUMA nodes > 2019-11-14T14:13:09.602Z|00027|dpdk|INFO|EAL: Multi-process socket > /var/run/openvswitch/dpdk/rte/mp_socket > 2019-11-14T14:13:09.618Z|00028|dpdk|INFO|EAL: Probing VFIO support... > 2019-11-14T14:13:09.618Z|00029|dpdk|INFO|EAL: VFIO support initialized > 2019-11-14T14:13:14.612Z|00030|dpdk|INFO|EAL: PCI device :01:00.0 on > NUMA socket 0 > 2019-11-14T14:13:14.612Z|00031|dpdk|INFO|EAL: probe driver: > 8086:10fb net_ixgbe > 2019-11-14T14:13:14.613Z|00032|dpdk|INFO|EAL: using IOMMU type 1 > (Type 1) > 2019-11-14T14:13:14.744Z|00033|dpdk|INFO|EAL: Ignore mapping IO port > bar(2) > 2019-11-14T14:13:15.090Z|00034|dpdk|INFO|EAL: PCI device :01:00.1 on > NUMA socket 0 > 2019-11-14T14:13:15.090Z|00035|dpdk|INFO|EAL: probe driver: > 8086:10fb net_ixgbe > 2019-11-14T14:13:15.199Z|00036|dpdk|INFO|EAL: Ignore mapping IO port > bar(2) > 2019-11-14T14:13:15.530Z|00037|dpdk|INFO|EAL: PCI device :07:00.0 on > NUMA socket 0 > 2019-11-14T14:13:15.530Z|00038|dpdk|INFO|EAL: probe driver: > 8086:1521 net_e1000_igb > 2019-11-14T14:13:15.530Z|00039|dpdk|INFO|EAL: PCI device :07:00.1 on > NUMA socket 0 > 2019-11-14T14:13:15.530Z|00040|dpdk|INFO|EAL: probe driver: > 8086:1521 net_e1000_igb > ... > 2019-11-14T14:13:15.802Z|00042|dpdk|INFO|DPDK pdump packet capture > enabled 2019-11-14T14:13:15.803Z|00043|dpdk|INFO|DPDK Enabled - > initialized > > - Attached a gdb to ovs-vswitchd. > > - Started pdump: > # sudo -u openvswitch XDG_RUNTIME_DIR=/var/run/openvswitch > ./v18.11.2/app/dpdk-pdump -- --pdump > 'port=0,queue=*,rx-dev=/tmp/pkts.pcap' > EAL: Detected 28 lcore(s) > EAL: Detected 1 NUMA nodes > EAL: Multi-process socket > /var/run/openvswitch/dpdk/rte/mp_socket_83791_549cdfd05e328e > EAL: Probing VFIO support... > EAL: VFIO support initialized > EAL: PCI device :01:00.0 on NUMA socket 0 > EAL: probe driver: 8086:10fb net_ixgbe > EAL: using IOMMU type 1 (Type 1) > EAL: PCI device
Re: [ovs-dev] [PATCH] dpdk: Deprecate pdump support.
Hello Reshma, Has pdump been tested (recently) with OVS? On Mon, Nov 11, 2019 at 7:53 PM Ilya Maximets wrote: > > The conventional way for packet dumping in OVS is to use ovs-tcpdump > that works via traffic mirroring. DPDK pdump could probably be used > for some lower level debugging, but it is not commonly used for > various reasons. > > There are lots of limitations for using this functionality in practice. > Most of them connected with running secondary pdump process and > memory layout issues like requirement to disable ASLR in kernel. > More details are available in DPDK guide: > https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations > > Beside the functional limitations it's also hard to use this > functionality correctly. User must be sure that OVS and pdump utility > are running on different CPU cores, which is hard because non-PMD > threads could float over available CPU cores. This or any other > misconfiguration will likely lead to crash of the pdump utility > or/and OVS. > > Another problem is that the user must actually have this special pdump > utility in a system and it might be not available in distributions. > > This change disables pdump support by default introducing special > configuration option '--enable-dpdk-pdump'. Deprecation warnings will > be shown to users on configuration and in runtime. > > Claiming to completely remove this functionality from OVS in one > of the next releases. > > Signed-off-by: Ilya Maximets > Acked-by: Aaron Conole - Recompiled from scratch, on OVS master (before this patch) with dpdk 18.11.2. other_config: {dpdk-init="true", pmd-cpu-mask="0x8002"} 2 physical ports, 2 vhost ports. 2019-11-14T14:13:09.596Z|00018|dpdk|INFO|Using DPDK 18.11.2 2019-11-14T14:13:09.596Z|00019|dpdk|INFO|DPDK Enabled - initializing... 2019-11-14T14:13:09.596Z|00020|dpdk|INFO|No vhost-sock-dir provided - defaulting to //var/run/openvswitch 2019-11-14T14:13:09.596Z|00021|dpdk|INFO|IOMMU support for vhost-user-client disabled. 2019-11-14T14:13:09.596Z|00022|dpdk|INFO|POSTCOPY support for vhost-user-client disabled. 2019-11-14T14:13:09.596Z|00023|dpdk|INFO|Per port memory for DPDK devices disabled. 2019-11-14T14:13:09.596Z|00024|dpdk|INFO|EAL ARGS: ovs-vswitchd --socket-mem 1024 --socket-limit 1024 -l 0. 2019-11-14T14:13:09.600Z|00025|dpdk|INFO|EAL: Detected 28 lcore(s) 2019-11-14T14:13:09.600Z|00026|dpdk|INFO|EAL: Detected 1 NUMA nodes 2019-11-14T14:13:09.602Z|00027|dpdk|INFO|EAL: Multi-process socket /var/run/openvswitch/dpdk/rte/mp_socket 2019-11-14T14:13:09.618Z|00028|dpdk|INFO|EAL: Probing VFIO support... 2019-11-14T14:13:09.618Z|00029|dpdk|INFO|EAL: VFIO support initialized 2019-11-14T14:13:14.612Z|00030|dpdk|INFO|EAL: PCI device :01:00.0 on NUMA socket 0 2019-11-14T14:13:14.612Z|00031|dpdk|INFO|EAL: probe driver: 8086:10fb net_ixgbe 2019-11-14T14:13:14.613Z|00032|dpdk|INFO|EAL: using IOMMU type 1 (Type 1) 2019-11-14T14:13:14.744Z|00033|dpdk|INFO|EAL: Ignore mapping IO port bar(2) 2019-11-14T14:13:15.090Z|00034|dpdk|INFO|EAL: PCI device :01:00.1 on NUMA socket 0 2019-11-14T14:13:15.090Z|00035|dpdk|INFO|EAL: probe driver: 8086:10fb net_ixgbe 2019-11-14T14:13:15.199Z|00036|dpdk|INFO|EAL: Ignore mapping IO port bar(2) 2019-11-14T14:13:15.530Z|00037|dpdk|INFO|EAL: PCI device :07:00.0 on NUMA socket 0 2019-11-14T14:13:15.530Z|00038|dpdk|INFO|EAL: probe driver: 8086:1521 net_e1000_igb 2019-11-14T14:13:15.530Z|00039|dpdk|INFO|EAL: PCI device :07:00.1 on NUMA socket 0 2019-11-14T14:13:15.530Z|00040|dpdk|INFO|EAL: probe driver: 8086:1521 net_e1000_igb ... 2019-11-14T14:13:15.802Z|00042|dpdk|INFO|DPDK pdump packet capture enabled 2019-11-14T14:13:15.803Z|00043|dpdk|INFO|DPDK Enabled - initialized - Attached a gdb to ovs-vswitchd. - Started pdump: # sudo -u openvswitch XDG_RUNTIME_DIR=/var/run/openvswitch ./v18.11.2/app/dpdk-pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap' EAL: Detected 28 lcore(s) EAL: Detected 1 NUMA nodes EAL: Multi-process socket /var/run/openvswitch/dpdk/rte/mp_socket_83791_549cdfd05e328e EAL: Probing VFIO support... EAL: VFIO support initialized EAL: PCI device :01:00.0 on NUMA socket 0 EAL: probe driver: 8086:10fb net_ixgbe EAL: using IOMMU type 1 (Type 1) EAL: PCI device :01:00.1 on NUMA socket 0 EAL: probe driver: 8086:10fb net_ixgbe EAL: PCI device :07:00.0 on NUMA socket 0 EAL: probe driver: 8086:1521 net_e1000_igb EAL: PCI device :07:00.1 on NUMA socket 0 EAL: probe driver: 8086:1521 net_e1000_igb Port 3 MAC: 02 70 63 61 70 00 - Sent one packet to the first physical port from my tgen Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f4840659700 (LWP 84336)] bucket_dequeue_orphans (n_orphans=251, obj_table=0x14fe5af50, bd=0x14fdad880) at /root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:190 190objptr = bucket_stack_pop(bd->buckets[rte_lcore_id()]); (gdb) bt #0 bucket_dequeue_orphans
Re: [ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack
On Fri, Nov 08, 2019 at 04:07:14PM -0500, Aaron Conole wrote: > The act_ct TC module shares a common conntrack and NAT infrastructure > exposed via netfilter. It's possible that a packet needs both SNAT and > DNAT manipulation, due to e.g. tuple collision. Netfilter can support > this because it runs through the NAT table twice - once on ingress and > again after egress. The act_ct action doesn't have such capability. > > Like netfilter hook infrastructure, we should run through NAT twice to > keep the symmetry. > > Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") > > Signed-off-by: Aaron Conole > --- > net/sched/act_ct.c | 13 - > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c > index fcc46025e790..f3232a00970f 100644 > --- a/net/sched/act_ct.c > +++ b/net/sched/act_ct.c > @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb, > bool commit) > { > #if IS_ENABLED(CONFIG_NF_NAT) > + int err; > enum nf_nat_manip_type maniptype; > > if (!(ct_action & TCA_CT_ACT_NAT)) > @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb, > return NF_ACCEPT; > } > > - return ct_nat_execute(skb, ct, ctinfo, range, maniptype); > + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); > + if (err == NF_ACCEPT && > + ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) { > + if (maniptype == NF_NAT_MANIP_SRC) > + maniptype = NF_NAT_MANIP_DST; > + else > + maniptype = NF_NAT_MANIP_SRC; > + > + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); > + } I keep thinking about this and I'm not entirely convinced that this shouldn't be simpler. More like: if (DNAT) DNAT if (SNAT) SNAT So it always does DNAT before SNAT, similarly to what iptables would do on PRE/POSTROUTING chains. > + return err; > #else > return NF_ACCEPT; > #endif > -- > 2.21.0 > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] ovs-vswitchd is crashing while adding dpdk ports to the ovs bridge
Hi, I'm trying to add the dpdk ports to the ovs bridge, and while doing so, i see the ovs daemon crashing with the 'operation not permitted' error, here's my architecture, i am on a debian vm (pseudo host) that has virt-io devices attached to it, i'm binding these virt-io's to the dpdk (igb_uio) and trying to add it to the ovs bridge. dpdk version: 17.11.2 ovs-version: 2.9.0 i did try looking for cores in /var/lib/systemd/coredump, but there are no cores there, strange. And i cannot look for cores in the current directory of that process because it keeps getting killed (as the daemon tries to come up and crashes and keeps at it forever) 2019-11-14T16:08:54.438Z|05381|bridge|INFO|bridge br0: added interface br0 on port 65534 2019-11-14T16:08:54.443Z|05382|dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 3 created. 2019-11-14T16:08:54.443Z|05383|dpif_netdev|INFO|There are 1 pmd threads on numa node 0 2019-11-14T16:08:54.554Z|05384|netdev_dpdk|WARN|Rx checksum offload is not supported on port 0 2019-11-14T16:08:54.554Z|05385|netdev_dpdk|ERR|Interface p1 MTU (1500) setup error: Operation not supported 2019-11-14T16:08:54.554Z|05386|netdev_dpdk|ERR|Interface p1(rxq:1 txq:2) configure error: Operation not supported 2019-11-14T16:08:54.554Z|05387|dpif_netdev|INFO|Core 3 on numa node 0 assigned port 'p1' rx queue 0 (measured processing cycles 0). 2019-11-14T16:08:54.562Z|05340|daemon_unix(monitor)|ERR|fork child died before signaling startup (killed (Segmentation fault)) 2019-11-14T16:08:54.562Z|05341|daemon_unix(monitor)|WARN|1781 crashes: pid 10551 died, killed (Segmentation fault), core dumped, waiting until 10 seconds since last restart Should i be on the physical host (bare metal) instead a vm? and bind physical devices to dpdk and then add it to the bridge? I'm not sure where i'm going wrong. Any advise is appreciated. Thanks, -Deepak Gowda ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH net-next] net: openvswitch: don't call pad_packet if not necessary
From: Tonghao Zhang The nla_put_u16/nla_put_u32 makes sure that *attrlen is align. The call tree is that: nla_put_u16/nla_put_u32 -> nla_putattrlen = sizeof(u16) or sizeof(u32) -> __nla_put attrlen -> __nla_reserve attrlen -> skb_put(skb, nla_total_size(attrlen)) nla_total_size returns the total length of attribute including padding. Cc: Joe Stringer Cc: William Tu Signed-off-by: Tonghao Zhang --- net/openvswitch/datapath.c | 22 -- 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index 8ce1f773378d..93d4991ddc1f 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -487,23 +487,17 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb, } /* Add OVS_PACKET_ATTR_MRU */ - if (upcall_info->mru) { - if (nla_put_u16(user_skb, OVS_PACKET_ATTR_MRU, - upcall_info->mru)) { - err = -ENOBUFS; - goto out; - } - pad_packet(dp, user_skb); + if (upcall_info->mru && + nla_put_u16(user_skb, OVS_PACKET_ATTR_MRU, upcall_info->mru)) { + err = -ENOBUFS; + goto out; } /* Add OVS_PACKET_ATTR_LEN when packet is truncated */ - if (cutlen > 0) { - if (nla_put_u32(user_skb, OVS_PACKET_ATTR_LEN, - skb->len)) { - err = -ENOBUFS; - goto out; - } - pad_packet(dp, user_skb); + if (cutlen > 0 && + nla_put_u32(user_skb, OVS_PACKET_ATTR_LEN, skb->len)) { + err = -ENOBUFS; + goto out; } /* Add OVS_PACKET_ATTR_HASH */ -- 2.23.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ipf: bail out when ipf state is COMPLETED
Hi RongQing/Wang On Thu, Nov 14, 2019 at 1:26 AM Li RongQing wrote: > it is easy to crash ovs when a packet with same id > hits a list that already reassembled completedly > but have not been sent out yet, and this packet is > not duplicate with this hit ipf list due to bigger > offset > > 1 0x7f9fef0ae2d9 in __GI_abort () at abort.c:89 > Good DOS test. Fix is correct. This needs a 'Fixes' tag. Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") This will need to be backported to 2.10. Thanks Darrell 2 0x00464042 in ipf_list_state_transition at lib/ipf.c:545 > > Co-authored-by: Wang Li > Signed-off-by: Wang Li > Signed-off-by: Li RongQing > --- > lib/ipf.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/lib/ipf.c b/lib/ipf.c > index 4cc0f2df6..45c489122 100644 > --- a/lib/ipf.c > +++ b/lib/ipf.c > @@ -899,7 +899,8 @@ ipf_handle_frag(struct ipf *ipf, struct dp_packet > *pkt, ovs_be16 dl_type, >MIN(max_frag_list_size, > IPF_FRAG_LIST_MIN_INCREMENT)); > hmap_insert(>frag_lists, _list->node, hash); > ipf_expiry_list_add(>frag_exp_list, ipf_list, now); > -} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL) { > +} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL || > + ipf_list->state == IPF_LIST_STATE_COMPLETED) { > /* Bail out as early as possible. */ > return false; > } else if (ipf_list->last_inuse_idx + 1 >= ipf_list->size) { > -- > 2.16.2 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn v2] Require Python 3 and remove support for Python 2.
Bleep bloop. Greetings Numan Siddique, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. git-am: error: Your local changes to the following files would be overwritten by merge: utilities/checkpatch.py Please, commit your changes or stash them before you can merge. Aborting Failed to merge in the changes. Patch failed at 0001 Require Python 3 and remove support for Python 2. The copy of the patch that failed is found in: /var/lib/jenkins/jobs/upstream_build_from_pw/workspace/OVN/.git/rebase-apply/patch When you have resolved this problem, run "git am --resolved". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn v2] Require Python 3 and remove support for Python 2.
From: Numan Siddique OVS removed the support for Python 2 in the commit [1]. And its time we do the same for OVN. Python3 is now mandatory, otherwise configure will fail. This patch takes care of removing Python 2 references. [1] - 1ca0323e7c29("Require Python 3 and remove support for Python 2.") Signed-off-by: Numan Siddique --- v1 -> v2 === * Addressed the review comments from Mark - Removed HAVE_PYTHON3 var and fixed ovn-detrace.in exception seen with python3. Makefile.am | 12 ++-- automake.mk | 8 +-- build-aux/dpdkstrip.py | 2 +- build-aux/sodepends.py | 2 +- build-aux/soexpand.py | 2 +- configure.ac| 2 - ipsec/ovs-monitor-ipsec.in | 2 +- m4/ovn.m4 | 89 ++--- rhel/ovn-fedora.spec.in | 35 +- tests/atlocal.in| 23 ++- tests/checkpatch.at | 12 ++-- tests/ovn-controller-vtep.at| 1 - tests/ovn-northd.at | 10 --- tests/ovn.at| 63 - tests/ovsdb-macros.at | 4 -- tests/system-kmod-macros.at | 1 - tests/system-userspace-macros.at| 1 - tutorial/ovs-sandbox| 1 + utilities/bugtool/automake.mk | 2 - utilities/checkpatch.py | 3 +- utilities/ovn-detrace.in| 6 +- utilities/ovn-docker-overlay-driver.in | 2 +- utilities/ovn-docker-underlay-driver.in | 2 +- 23 files changed, 36 insertions(+), 249 deletions(-) diff --git a/Makefile.am b/Makefile.am index 1e41e49ea..8eed7a72b 100644 --- a/Makefile.am +++ b/Makefile.am @@ -71,7 +71,7 @@ endif # foo/__init__.pyc will cause Python to ignore foo.py. run_python = \ PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH \ - PYTHONDONTWRITEBYTECODE=yes $(PYTHON) + PYTHONDONTWRITEBYTECODE=yes $(PYTHON3) ALL_LOCAL = BUILT_SOURCES = @@ -165,13 +165,13 @@ ro_shell = printf '\043 Generated automatically -- do not modify!-*- buffer- SUFFIXES += .in .in: - $(AM_V_GEN)PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python $(PYTHON) $(srcdir)/build-aux/soexpand.py -I$(srcdir) -I$(OVS_SRCDIR) < $< | \ - $(PYTHON) $(srcdir)/build-aux/dpdkstrip.py $(DPDKSTRIP_FLAGS) | \ + $(AM_V_GEN)PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python $(PYTHON3) $(srcdir)/build-aux/soexpand.py -I$(srcdir) -I$(OVS_SRCDIR) < $< | \ + $(PYTHON3) $(srcdir)/build-aux/dpdkstrip.py $(DPDKSTRIP_FLAGS) | \ sed \ -e 's,[@]PKIDIR[@],$(PKIDIR),g' \ -e 's,[@]LOGDIR[@],$(LOGDIR),g' \ -e 's,[@]DBDIR[@],$(DBDIR),g' \ - -e 's,[@]PYTHON[@],$(PYTHON),g' \ + -e 's,[@]PYTHON3[@],$(PYTHON3),g' \ -e 's,[@]OVN_RUNDIR[@],$(OVN_RUNDIR),g' \ -e 's,[@]OVSBUILDDIR[@],$(OVSBUILDDIR),g' \ -e 's,[@]VERSION[@],$(VERSION),g' \ @@ -197,7 +197,7 @@ SUFFIXES += .xml PKIDIR='$(PKIDIR)' \ LOGDIR='$(LOGDIR)' \ DBDIR='$(DBDIR)' \ - PYTHON='$(PYTHON)' \ + PYTHON3='$(PYTHON3)' \ RUNDIR='$(RUNDIR)' \ OVN_RUNDIR='$(OVN_RUNDIR)' \ VERSION='$(VERSION)' \ @@ -425,7 +425,7 @@ CLEANFILES += flake8-check include $(srcdir)/manpages.mk $(srcdir)/manpages.mk: $(MAN_ROOTS) build-aux/sodepends.py $(OVS_SRCDIR)/python/build/soutil.py - @PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python $(PYTHON) $(srcdir)/build-aux/sodepends.py -I. -I$(srcdir) -I$(OVS_MANDIR) $(MAN_ROOTS) >$(@F).tmp + @PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python $(PYTHON3) $(srcdir)/build-aux/sodepends.py -I. -I$(srcdir) -I$(OVS_MANDIR) $(MAN_ROOTS) >$(@F).tmp @if cmp -s $(@F).tmp $@; then \ touch $@; \ rm -f $(@F).tmp; \ diff --git a/automake.mk b/automake.mk index ad801f1e5..591e00751 100644 --- a/automake.mk +++ b/automake.mk @@ -6,19 +6,17 @@ CLEANFILES += ovn-architecture.7 # # If "python" or "dot" is not available, then we do not add graphical diagram # to the documentation. -if HAVE_PYTHON if HAVE_DOT OVSDB_DOT = $(run_python) ${OVSDIR}/ovsdb/ovsdb-dot.in ovn-nb.gv: ${OVSDIR}/ovsdb/ovsdb-dot.in $(srcdir)/ovn-nb.ovsschema $(AM_V_GEN)$(OVSDB_DOT) --no-arrows $(srcdir)/ovn-nb.ovsschema > $@ ovn-nb.pic: ovn-nb.gv ${OVSDIR}/ovsdb/dot2pic - $(AM_V_GEN)(dot -T plain < ovn-nb.gv | $(PYTHON) ${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \ + $(AM_V_GEN)(dot -T plain < ovn-nb.gv | $(PYTHON3) ${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \ mv $@.tmp $@ OVN_NB_PIC = ovn-nb.pic OVN_NB_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_NB_PIC) CLEANFILES += ovn-nb.gv ovn-nb.pic endif -endif # OVN northbound
Re: [ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack
On 11/14/2019 4:22 PM, Roi Dayan wrote: > > On 2019-11-08 11:07 PM, Aaron Conole wrote: >> The act_ct TC module shares a common conntrack and NAT infrastructure >> exposed via netfilter. It's possible that a packet needs both SNAT and >> DNAT manipulation, due to e.g. tuple collision. Netfilter can support >> this because it runs through the NAT table twice - once on ingress and >> again after egress. The act_ct action doesn't have such capability. >> >> Like netfilter hook infrastructure, we should run through NAT twice to >> keep the symmetry. >> >> Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") >> >> Signed-off-by: Aaron Conole >> --- >> net/sched/act_ct.c | 13 - >> 1 file changed, 12 insertions(+), 1 deletion(-) >> >> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c >> index fcc46025e790..f3232a00970f 100644 >> --- a/net/sched/act_ct.c >> +++ b/net/sched/act_ct.c >> @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb, >>bool commit) >> { >> #if IS_ENABLED(CONFIG_NF_NAT) >> +int err; >> enum nf_nat_manip_type maniptype; >> >> if (!(ct_action & TCA_CT_ACT_NAT)) >> @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb, >> return NF_ACCEPT; >> } >> >> -return ct_nat_execute(skb, ct, ctinfo, range, maniptype); >> +err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); >> +if (err == NF_ACCEPT && >> +ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) { >> +if (maniptype == NF_NAT_MANIP_SRC) >> +maniptype = NF_NAT_MANIP_DST; >> +else >> +maniptype = NF_NAT_MANIP_SRC; >> + >> +err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); >> +} >> +return err; >> #else >> return NF_ACCEPT; >> #endif >> > +paul Hi Aaron, I think I understand the issue and this looks good, Can you describe the scenario to reproduce this? Thanks, Paul. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack
On 2019-11-08 11:07 PM, Aaron Conole wrote: > The act_ct TC module shares a common conntrack and NAT infrastructure > exposed via netfilter. It's possible that a packet needs both SNAT and > DNAT manipulation, due to e.g. tuple collision. Netfilter can support > this because it runs through the NAT table twice - once on ingress and > again after egress. The act_ct action doesn't have such capability. > > Like netfilter hook infrastructure, we should run through NAT twice to > keep the symmetry. > > Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") > > Signed-off-by: Aaron Conole > --- > net/sched/act_ct.c | 13 - > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c > index fcc46025e790..f3232a00970f 100644 > --- a/net/sched/act_ct.c > +++ b/net/sched/act_ct.c > @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb, > bool commit) > { > #if IS_ENABLED(CONFIG_NF_NAT) > + int err; > enum nf_nat_manip_type maniptype; > > if (!(ct_action & TCA_CT_ACT_NAT)) > @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb, > return NF_ACCEPT; > } > > - return ct_nat_execute(skb, ct, ctinfo, range, maniptype); > + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); > + if (err == NF_ACCEPT && > + ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) { > + if (maniptype == NF_NAT_MANIP_SRC) > + maniptype = NF_NAT_MANIP_DST; > + else > + maniptype = NF_NAT_MANIP_SRC; > + > + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype); > + } > + return err; > #else > return NF_ACCEPT; > #endif > +paul ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.
On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique wrote: > > On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara wrote: > > > > Reported-by: Numan Siddique > > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever > > possible.") > > Signed-off-by: Dumitru Ceara > > Thanks Dumitru for the fix. I applied this to master. > > Thanks > Numan Thanks Numan and sorry for missing this case in the first place. > > > --- > > northd/ovn-northd.c | 12 > > 1 file changed, 8 insertions(+), 4 deletions(-) > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > index d6beb97..6742bc0 100644 > > --- a/northd/ovn-northd.c > > +++ b/northd/ovn-northd.c > > @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct ovn_port > > *op, > > } > > } > > > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op, > > -sw_od, 75, lflows); > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op, > > -sw_od, 75, lflows); > > +if (!sset_is_empty(_ips_v4)) { > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, > > sw_op, > > +sw_od, 75, lflows); > > +} > > +if (!sset_is_empty(_ips_v6)) { > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, > > sw_op, > > +sw_od, 75, lflows); > > +} > > > > sset_destroy(_ips_v4); > > sset_destroy(_ips_v6); > > -- > > 1.8.3.1 > > > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.
On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara wrote: > > Reported-by: Numan Siddique > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever > possible.") > Signed-off-by: Dumitru Ceara Thanks Dumitru for the fix. I applied this to master. Thanks Numan > --- > northd/ovn-northd.c | 12 > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index d6beb97..6742bc0 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op, > } > } > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op, > -sw_od, 75, lflows); > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op, > -sw_od, 75, lflows); > +if (!sset_is_empty(_ips_v4)) { > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op, > +sw_od, 75, lflows); > +} > +if (!sset_is_empty(_ips_v6)) { > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op, > +sw_od, 75, lflows); > +} > > sset_destroy(_ips_v4); > sset_destroy(_ips_v6); > -- > 1.8.3.1 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Good afternoon.
Hello dear how are you doing ? Did you receive the previous mail i sent to you? I wait your response as soon as possible. Best Regards, Eng.Muhammad Haseeb ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.
Reported-by: Numan Siddique Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.") Signed-off-by: Dumitru Ceara --- northd/ovn-northd.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index d6beb97..6742bc0 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op, } } -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op, -sw_od, 75, lflows); -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op, -sw_od, 75, lflows); +if (!sset_is_empty(_ips_v4)) { +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op, +sw_od, 75, lflows); +} +if (!sset_is_empty(_ips_v6)) { +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op, +sw_od, 75, lflows); +} sset_destroy(_ips_v4); sset_destroy(_ips_v6); -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn 2/2] Add support for Route Info Option in RA - RFC 4191
Introduce support for Route Info Option sent in Router Advertisement according to RFC 4191. Route Info Option are configured providing route_info in ipv6_ra_configs column of Logical_Router_Port table. route_info is a comma separated string where each field provides PRF and prefix for a given route (e.g: HIGH-aef1::11/48,LOW-aef2::11/96) Signed-off-by: Lorenzo Bianconi --- controller/pinctrl.c | 80 lib/ovn-l7.h | 12 +++ northd/ovn-northd.c | 6 ovn-nb.xml | 14 tests/ovn.at | 32 -- 5 files changed, 133 insertions(+), 11 deletions(-) diff --git a/controller/pinctrl.c b/controller/pinctrl.c index b736b60ad..51c7f8a36 100644 --- a/controller/pinctrl.c +++ b/controller/pinctrl.c @@ -2195,6 +2195,7 @@ struct ipv6_ra_config { struct in6_addr rdnss; bool has_rdnss; struct ds dnssl; +struct ds route_info; }; struct ipv6_ra_state { @@ -2217,6 +2218,7 @@ ipv6_ra_config_delete(struct ipv6_ra_config *config) if (config) { destroy_lport_addresses(>prefixes); ds_destroy(>dnssl); +ds_destroy(>route_info); free(config); } } @@ -2256,6 +2258,7 @@ ipv6_ra_update_config(const struct sbrec_port_binding *pb) config->mtu = smap_get_int(>options, "ipv6_ra_mtu", ND_MTU_DEFAULT); config->la_flags = IPV6_ND_RA_OPT_PREFIX_ON_LINK; ds_init(>dnssl); +ds_init(>route_info); const char *address_mode = smap_get(>options, "ipv6_ra_address_mode"); if (!address_mode) { @@ -2313,6 +2316,11 @@ ipv6_ra_update_config(const struct sbrec_port_binding *pb) ds_put_buffer(>dnssl, dnssl, strlen(dnssl)); } +const char *route_info = smap_get(>options, "ipv6_ra_route_info"); +if (route_info) { +ds_put_buffer(>route_info, route_info, strlen(route_info)); +} + return config; fail: @@ -2420,6 +2428,74 @@ packet_put_ra_dnssl_opt(struct dp_packet *b, ovs_be32 lifetime, prev_l4_size + size)); } +static void +packet_put_ra_route_info_opt(struct dp_packet *b, ovs_be32 lifetime, + char *route_list) +{ +size_t prev_l4_size = dp_packet_l4_size(b); +struct ip6_hdr *nh = dp_packet_l3(b); +char *t0, *r0 = NULL; +size_t size = 0; + +for (t0 = strtok_r(route_list, ",", ); t0; + t0 = strtok_r(NULL, ",", )) { +struct ovs_nd_route_info *nd_rinfo = +dp_packet_put_uninit(b, sizeof *nd_rinfo); +char *t1, *r1 = NULL; +int index; + +for (t1 = strtok_r(t0, "-", ), index = 0; t1; + t1 = strtok_r(NULL, "-", ), index++) { + +nd_rinfo->type = ND_OPT_ROUTE_INFO; +nd_rinfo->route_lifetime = lifetime; + +switch (index) { +case 0: +if (!strcmp(t1, "HIGH")) { +nd_rinfo->flags = IPV6_ND_RA_OPT_PRF_HIGH; +} else if (!strcmp(t1, "LOW")) { +nd_rinfo->flags = IPV6_ND_RA_OPT_PRF_LOW; +} else { +nd_rinfo->flags = IPV6_ND_RA_OPT_PRF_NORMAL; +} +break; +case 1: { +struct lport_addresses route; +uint8_t plen; + +if (!extract_ip_addresses(t1, )) { +return; +} +if (!route.n_ipv6_addrs) { +destroy_lport_addresses(); +return; +} + +nd_rinfo->prefix_len = route.ipv6_addrs->plen; +plen = DIV_ROUND_UP(nd_rinfo->prefix_len, 64); +nd_rinfo->len = 1 + plen; +dp_packet_put(b, _addrs->network, plen * 8); +size += sizeof *nd_rinfo + plen * 8; + +destroy_lport_addresses(); +index = 0; +break; +} +default: +return; +} +} +} + +nh->ip6_plen = htons(prev_l4_size + size); +struct ovs_ra_msg *ra = dp_packet_l4(b); +ra->icmph.icmp6_cksum = 0; +uint32_t icmp_csum = packet_csum_pseudoheader6(dp_packet_l3(b)); +ra->icmph.icmp6_cksum = csum_finish(csum_continue(icmp_csum, ra, + prev_l4_size + size)); +} + /* Called with in the pinctrl_handler thread context. */ static long long int ipv6_ra_send(struct rconn *swconn, struct ipv6_ra_state *ra) @@ -2459,6 +2535,10 @@ ipv6_ra_send(struct rconn *swconn, struct ipv6_ra_state *ra) packet_put_ra_dnssl_opt(, htonl(0x), ra->config->dnssl.string); } +if (ra->config->route_info.length) { +packet_put_ra_route_info_opt(, htonl(0x), + ra->config->route_info.string); +} uint64_t ofpacts_stub[4096 / 8]; struct ofpbuf ofpacts =
[ovs-dev] [PATCH ovn 1/2] Add support to Default Router Preference (PRF) - RFC 4191
Introduce support for Default Router Preference (PRF) in IPv6 Router Advertisement according to RFC 4191 Signed-off-by: Lorenzo Bianconi --- controller/pinctrl.c | 20 +--- lib/ovn-l7.h | 4 northd/ovn-northd.c | 7 +++ ovn-nb.xml | 12 tests/ovn.at | 13 + 5 files changed, 49 insertions(+), 7 deletions(-) diff --git a/controller/pinctrl.c b/controller/pinctrl.c index a90ee73d6..b736b60ad 100644 --- a/controller/pinctrl.c +++ b/controller/pinctrl.c @@ -2263,10 +2263,10 @@ ipv6_ra_update_config(const struct sbrec_port_binding *pb) goto fail; } if (!strcmp(address_mode, "dhcpv6_stateless")) { -config->mo_flags = IPV6_ND_RA_FLAG_OTHER_ADDR_CONFIG; +config->mo_flags |= IPV6_ND_RA_FLAG_OTHER_ADDR_CONFIG; config->la_flags |= IPV6_ND_RA_OPT_PREFIX_AUTONOMOUS; } else if (!strcmp(address_mode, "dhcpv6_stateful")) { -config->mo_flags = IPV6_ND_RA_FLAG_MANAGED_ADDR_CONFIG; +config->mo_flags |= IPV6_ND_RA_FLAG_MANAGED_ADDR_CONFIG; } else if (!strcmp(address_mode, "slaac")) { config->la_flags |= IPV6_ND_RA_OPT_PREFIX_AUTONOMOUS; } else { @@ -2274,6 +2274,13 @@ ipv6_ra_update_config(const struct sbrec_port_binding *pb) goto fail; } +const char *prf = smap_get(>options, "ipv6_ra_prf"); +if (!strcmp(prf, "HIGH")) { +config->mo_flags |= IPV6_ND_RA_OPT_PRF_HIGH; +} else if (!strcmp(prf, "LOW")) { +config->mo_flags |= IPV6_ND_RA_OPT_PRF_LOW; +} + const char *prefixes = smap_get(>options, "ipv6_ra_prefixes"); if (prefixes && !extract_ip_addresses(prefixes, >prefixes)) { VLOG_WARN("Invalid IPv6 prefixes: %s", prefixes); @@ -2423,10 +2430,17 @@ ipv6_ra_send(struct rconn *swconn, struct ipv6_ra_state *ra) uint64_t packet_stub[128 / 8]; struct dp_packet packet; +uint16_t router_lt = IPV6_ND_RA_LIFETIME; + +if (!router_lt) { +/* Reset PRF to MEDIUM if router lifetime is not set */ +ra->config->mo_flags &= ~IPV6_ND_RA_OPT_PRF_LOW; +} + dp_packet_use_stub(, packet_stub, sizeof packet_stub); compose_nd_ra(, ra->config->eth_src, ra->config->eth_dst, >config->ipv6_src, >config->ipv6_dst, -255, ra->config->mo_flags, htons(IPV6_ND_RA_LIFETIME), 0, 0, +255, ra->config->mo_flags, htons(router_lt), 0, 0, ra->config->mtu); for (int i = 0; i < ra->config->prefixes.n_ipv6_addrs; i++) { diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h index 5fc370bf5..fd898b0dc 100644 --- a/lib/ovn-l7.h +++ b/lib/ovn-l7.h @@ -287,6 +287,10 @@ nd_ra_opts_destroy(struct hmap *nd_ra_opts) #define IPV6_ND_RA_OPT_PREFIX_VALID_LIFETIME0x #define IPV6_ND_RA_OPT_PREFIX_PREFERRED_LIFETIME0x +#define IPV6_ND_RA_OPT_PRF_NORMAL 0x00 +#define IPV6_ND_RA_OPT_PRF_HIGH 0x08 +#define IPV6_ND_RA_OPT_PRF_LOW 0x18 + static inline void nd_ra_opts_init(struct hmap *nd_ra_opts) { diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index d6beb9768..a1eaa9071 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -6757,6 +6757,13 @@ copy_ra_to_sb(struct ovn_port *op, const char *address_mode) smap_add(, "ipv6_ra_src_eth", op->lrp_networks.ea_s); +const char *prf = smap_get(>nbrp->ipv6_ra_configs, "prf"); +if (!prf || (strcmp(prf, "HIGH") && strcmp(prf, "LOW"))) { +smap_add(, "ipv6_ra_prf", "MEDIUM"); +} else { +smap_add(, "ipv6_ra_prf", prf); +} + sbrec_port_binding_set_options(op->sb, ); smap_destroy(); } diff --git a/ovn-nb.xml b/ovn-nb.xml index d8f3237fc..89d23ad7b 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -1859,6 +1859,18 @@ + +Default Router Preference (PRF) indicates whether to prefer this +router over other default routers (RFC 4191). +Possible values are: + + + HIGH: mapped to 0x01 in RA PRF field + MEDIUM: mapped to 0x00 in RA PRF field + LOW: mapped to 0x11 in RA PRF field + + + The recommended MTU for the link. Default is 0, which means no MTU Option will be included in RA packet replied by ovn-controller. diff --git a/tests/ovn.at b/tests/ovn.at index 4a10a11e9..85d539c78 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -11504,23 +11504,28 @@ ra_test 05dc 00 0 0 c0 40 aef0 ovn-nbctl --wait=hv set Logical_Router_port ro-sw networks='aef0\:\:1/64 fd0f\:\:1/48' ra_test 05dc 00 0 0 c0 40 aef0 30 fd0f +# Test PRF for default gw +ovn-nbctl --wait=hv set Logical_Router_port ro-sw ipv6_ra_configs:prf="LOW" +ra_test 05dc 18 0 0 c0 40 aef0 30 fd0f + # Now test for RDNSS ovn-nbctl --wait=hv set Logical_Router_port
[ovs-dev] [PATCH ovn 0/2] Add support for PRF and Route Info Option in RA (RFC 4191)
Lorenzo Bianconi (2): Add support to Default Router Preference (PRF) - RFC 4191 Add support for Route Info Option in RA - RFC 4191 controller/pinctrl.c | 100 +-- lib/ovn-l7.h | 16 +++ northd/ovn-northd.c | 13 ++ ovn-nb.xml | 26 +++ tests/ovn.at | 35 ++- 5 files changed, 177 insertions(+), 13 deletions(-) -- 2.21.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.
On 13 Nov 2019, at 14:23, Ilya Maximets wrote: On 13.11.2019 1:21, William Tu wrote: On Thu, Nov 7, 2019 at 3:36 AM Ilya Maximets wrote: Until now there was only two options for XDP mode in OVS: SKB or DRV. i.e. 'generic XDP' or 'native XDP with zero-copy enabled'. Devices like 'veth' interfaces in Linux supports native XDP, but doesn't support zero-copy mode. This case can not be covered by existing API and we have to use slower generic XDP for such devices. There are few more issues, e.g. TCP is not supported in generic XDP mode for veth interfaces due to kernel limitations, however it is supported in native mode. This change introduces ability to use native XDP without zero-copy along with best-effort configuration option that enabled by default. In best-effort case OVS will sequentially try different modes starting from the fastest one and will choose the first acceptable for current interface. This will guarantee the best possible performance. If user will want to choose specific mode, it's still possible by setting the 'options:xdp-mode'. This change additionally changes the API by renaming the configuration knob from 'xdpmode' to 'xdp-mode' and also renaming the modes themselves to be more user-friendly. The full list of currently supported modes: * native-with-zerocopy - former DRV * native - new one, DRV without zero-copy * generic - former SKB * best-effort - new one, chooses the best available from 3 above modes Since 'best-effort' is a default mode, users will not need to explicitely set 'xdp-mode' in most cases. TCP related tests enabled back in system afxdp testsuite, because 'best-effort' will choose 'native' mode for veth interfaces and this mode has no issues with TCP. Signed-off-by: Ilya Maximets --- I'm thinking about adding some tests case and patches depends on this API change. If there is no more comment, I will apply to master. I think, it might be good to wait for some comments from Eelco. @Eelco, do you have a plan to review the code? Do you agree with the API update? I’m rather busy this week, will try to review it next week, and I’m ok with the API change. The only thing that bothers me is the worse performance of the TAP interface with the new default config. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ipf: bail out when ipf state is COMPLETED
it is easy to crash ovs when a packet with same id hits a list that already reassembled completedly but have not been sent out yet, and this packet is not duplicate with this hit ipf list due to bigger offset 1 0x7f9fef0ae2d9 in __GI_abort () at abort.c:89 2 0x00464042 in ipf_list_state_transition at lib/ipf.c:545 Co-authored-by: Wang Li Signed-off-by: Wang Li Signed-off-by: Li RongQing --- lib/ipf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/ipf.c b/lib/ipf.c index 4cc0f2df6..45c489122 100644 --- a/lib/ipf.c +++ b/lib/ipf.c @@ -899,7 +899,8 @@ ipf_handle_frag(struct ipf *ipf, struct dp_packet *pkt, ovs_be16 dl_type, MIN(max_frag_list_size, IPF_FRAG_LIST_MIN_INCREMENT)); hmap_insert(>frag_lists, _list->node, hash); ipf_expiry_list_add(>frag_exp_list, ipf_list, now); -} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL) { +} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL || + ipf_list->state == IPF_LIST_STATE_COMPLETED) { /* Bail out as early as possible. */ return false; } else if (ipf_list->last_inuse_idx + 1 >= ipf_list->size) { -- 2.16.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 ovn 0/3] Improve ovn-detrace support for parsing OpenFlow cookies.
On Wed, Nov 13, 2019 at 9:02 PM Han Zhou wrote: > > > > On Wed, Nov 13, 2019 at 1:51 AM Dumitru Ceara wrote: > > > > Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow > > translations.") > > added support for more types of cookies (partial SB uuids) that are stored > > in OpenFlow entries created by ovn-controller. > > > > The last patch in this series implements support for parsing all these > > different cookies in ovn-detrace too. > > > > The first patch is a bug fix required as ovn-detrace was broken after > > moving OVN to its own separate rundir. > > > > The second patch fixes line parsing in ovn-detrace. > > > > CC: Han Zhou > > Signed-off-by: Dumitru Ceara > > > > Dumitru Ceara (3): > > ovn-detrace: Fix rundir. > > ovn-detrace: Fix line parsing. > > ovn-detrace: Add support for other types of SB cookies. > > > > > > utilities/ovn-detrace.in | 201 > > -- > > 1 file changed, 138 insertions(+), 63 deletions(-) > > > > > > --- > > v4: > > - Address Han's comments: > > - Fix printing of last logical flow information. > > - Added Acked-by from Mark on patches 1 & 3. I didn't add it to patch 2 > > because I significantly changed patch 2 in v4. > > v3: > > - Remove stray "%s". > > - Rename pprint to print_p to avoid name clashes with the library function. > > v2: > > - Address Mark's comments: > > - properly handle potential collisions between cookie -> UUID mappings. > > - when looking up SB records by cookie, match on the first part of the > > UUID. > > - Further refactor ovn-detrace to simplify prefixing outputs. > > - Change print statements to print() calls. > > > > Thanks Dumitru. I applied the series to master. Thanks Han and Mark for reviewing and merging this! ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v7 ovn 2/2] ovn-northd: Limit ARP/ND broadcast domain whenever possible.
On Wed, Nov 13, 2019 at 8:52 PM Han Zhou wrote: > > > > On Wed, Nov 13, 2019 at 8:14 AM Dumitru Ceara wrote: > > > > On Wed, Nov 13, 2019 at 4:30 PM Han Zhou wrote: > > > > > > > > > > > > On Wed, Nov 13, 2019 at 2:42 AM Dumitru Ceara wrote: > > >> > > >> On Tue, Nov 12, 2019 at 8:50 PM Han Zhou wrote: > > >> > > > >> > > > >> > > > >> > > > >> > On Tue, Nov 12, 2019 at 10:10 AM Dumitru Ceara > > >> > wrote: > > >> > > > > >> > > On Tue, Nov 12, 2019 at 6:17 PM Han Zhou wrote: > > >> > > > > > >> > > > > > >> > > > > > >> > > > On Tue, Nov 12, 2019 at 2:29 AM Dumitru Ceara > > >> > > > wrote: > > >> > > > > > > >> > > > > ARP request and ND NS packets for router owned IPs were being > > >> > > > > flooded in the complete L2 domain (using the MC_FLOOD multicast > > >> > > > > group). > > >> > > > > However this creates a scaling issue in scenarios where > > >> > > > > aggregation > > >> > > > > logical switches are connected to more logical routers (~350). > > >> > > > > The > > >> > > > > logical pipelines of all routers would have to be executed > > >> > > > > before the > > >> > > > > packet is finally replied to by a single router, the owner of > > >> > > > > the IP > > >> > > > > address. > > >> > > > > > > >> > > > > This commit limits the broadcast domain by bypassing the L2 > > >> > > > > Lookup stage > > >> > > > > for ARP requests that will be replied by a single router. The > > >> > > > > packets > > >> > > > > are forwarded only to the router port that owns the target IP > > >> > > > > address. > > >> > > > > > > >> > > > > IPs that are owned by the routers and for which this fix applies > > >> > > > > are: > > >> > > > > - IP addresses configured on the router ports. > > >> > > > > - VIPs. > > >> > > > > - NAT IPs. > > >> > > > > > > >> > > > > Reported-at: https://bugzilla.redhat.com/1756945 > > >> > > > > Reported-by: Anil Venkata > > >> > > > > Signed-off-by: Dumitru Ceara > > >> > > > > > > >> > > > > --- > > >> > > > > v7: > > >> > > > > - Address Han's comments: > > >> > > > > - Remove flooding for all ARPs received on VLAN networks. To > > >> > > > > avoid > > >> > > > > that we now identify self originated (G)ARPs by matching > > >> > > > > on source > > >> > > > > MAC address too. > > >> > > > > - Rename REGBIT_NOT_VXLAN to FLAGBIT_NOT_VXLAN. > > >> > > > > - Fix ovn-sb manpage. > > >> > > > > - Split patch in a series of 2: > > >> > > > > - patch1: fixes the get_router_load_balancer_ips() function. > > >> > > > > - patch2: limits the ARP/ND broadcast domain. > > >> > > > > v6: > > >> > > > > - Address Han's comments: > > >> > > > > - remove flooding of ARPs targeting OVN owned IP addresses. > > >> > > > > - update ovn-architecture documentation. > > >> > > > > - rename ARP handling functions. > > >> > > > > - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to > > >> > > > > take into > > >> > > > > account the new way of forwarding ARPs. > > >> > > > > - Also, properly deal with ARP packets on VLAN-backed networks. > > >> > > > > v5: Address Numan's comments: update comments & make autotest > > >> > > > > more > > >> > > > > robust. > > >> > > > > v4: Rebase. > > >> > > > > v3: Properly deal with VXLAN traffic. Address review comments > > >> > > > > from > > >> > > > > Numan (add autotests). Fix function > > >> > > > > get_router_load_balancer_ips. > > >> > > > > Rebase -> deal with IPv6 NAT too. > > >> > > > > v2: Move ARP broadcast domain limiting to table > > >> > > > > S_SWITCH_IN_L2_LKUP to > > >> > > > > address localnet ports too. > > >> > > > > --- > > >> > > > > northd/ovn-northd.8.xml | 14 ++ > > >> > > > > northd/ovn-northd.c | 230 > > >> > > > > +++ > > >> > > > > ovn-architecture.7.xml | 19 +++ > > >> > > > > tests/ovn.at| 307 > > >> > > > > +-- > > >> > > > > 4 files changed, 530 insertions(+), 40 deletions(-) > > >> > > > > > > >> > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > > >> > > > > index 0a33dcd..344cc0d 100644 > > >> > > > > --- a/northd/ovn-northd.8.xml > > >> > > > > +++ b/northd/ovn-northd.8.xml > > >> > > > > @@ -1005,6 +1005,20 @@ output; > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > +Priority-80 flows for each port connected to a logical > > >> > > > > router > > >> > > > > +matching self originated GARP/ARP request/ND packets. > > >> > > > > These packets > > >> > > > > +are flooded to the MC_FLOOD which contains > > >> > > > > all logical > > >> > > > > +ports. > > >> > > > > + > > >> > > > > + > > >> > > > > + > > >> > > > > +Priority-75 flows for each IP address/VIP/NAT address > > >> > > > > owned by a > > >> > > > > +router port connected to the switch. These flows match > > >> > > > > ARP requests > > >> > > > > +and ND