Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.

2019-11-14 Thread Han Zhou
On Thu, Nov 14, 2019 at 10:56 AM Dumitru Ceara  wrote:
>
> On Thu, Nov 14, 2019 at 7:43 PM Numan Siddique  wrote:
> >
> >
> >
> > On Thu, Nov 14, 2019, 11:53 PM Han Zhou  wrote:
> >>
> >> On Thu, Nov 14, 2019 at 6:22 AM Dumitru Ceara 
wrote:
> >> >
> >> > On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique 
wrote:
> >> > >
> >> > > On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara 
wrote:
> >> > > >
> >> > > > Reported-by: Numan Siddique 
> >> > > > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain
> >> whenever possible.")
> >> > > > Signed-off-by: Dumitru Ceara 
> >> > >
> >> > > Thanks Dumitru for the fix. I applied this to master.
> >> > >
> >> > > Thanks
> >> > > Numan
> >> >
> >> > Thanks Numan and sorry for missing this case in the first place.
> >>
> >> Sorry for missing this in review. Just want to confirm, is this a bug
or
> >> just improvement? I think it is not a bug, because the set is empty
{}, it
> >> should just get ignored in the end by ovn-controller when translating
to
> >> OpenFlow rules, correct?
> >
>
> Hi Han,
>
> As far as I see ovn-controller doesn't accept empty set. In
> parse_constant_set() we expect at least one constant:
>
> https://github.com/ovn-org/ovn/blob/master/lib/expr.c#L865
>
> So I would say that this was a bug in the ovn-northd code I added. And
> anyway it doesn't really make sense to add the flows if the address
> sets are empty.
>
> Thanks,
> Dumitru
>
Yes, you are right. I think I messed up with the empty address-set or
port-group, which works, but not empty constant set in {}.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.

2019-11-14 Thread Numan Siddique
On Thu, Nov 14, 2019, 11:53 PM Han Zhou  wrote:

> On Thu, Nov 14, 2019 at 6:22 AM Dumitru Ceara  wrote:
> >
> > On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique  wrote:
> > >
> > > On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara 
> wrote:
> > > >
> > > > Reported-by: Numan Siddique 
> > > > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain
> whenever possible.")
> > > > Signed-off-by: Dumitru Ceara 
> > >
> > > Thanks Dumitru for the fix. I applied this to master.
> > >
> > > Thanks
> > > Numan
> >
> > Thanks Numan and sorry for missing this case in the first place.
>
> Sorry for missing this in review. Just want to confirm, is this a bug or
> just improvement? I think it is not a bug, because the set is empty {}, it
> should just get ignored in the end by ovn-controller when translating to
> OpenFlow rules, correct?
>

This caused system tests to fail as there were warnings in the
ovn-controller log because of empty set.

Thanks

>
> > >
> > > > ---
> > > >  northd/ovn-northd.c | 12 
> > > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > index d6beb97..6742bc0 100644
> > > > --- a/northd/ovn-northd.c
> > > > +++ b/northd/ovn-northd.c
> > > > @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct
> ovn_port *op,
> > > >  }
> > > >  }
> > > >
> > > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET,
> sw_op,
> > > > -sw_od, 75, lflows);
> > > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6,
> sw_op,
> > > > -sw_od, 75, lflows);
> > > > +if (!sset_is_empty(_ips_v4)) {
> > > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4,
> AF_INET, sw_op,
> > > > +sw_od, 75, lflows);
> > > > +}
> > > > +if (!sset_is_empty(_ips_v6)) {
> > > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6,
> AF_INET6, sw_op,
> > > > +sw_od, 75, lflows);
> > > > +}
> > > >
> > > >  sset_destroy(_ips_v4);
> > > >  sset_destroy(_ips_v6);
> > > > --
> > > > 1.8.3.1
> > > >
> > > > ___
> > > > dev mailing list
> > > > d...@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > >
> > >
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.

2019-11-14 Thread Han Zhou
On Thu, Nov 14, 2019 at 6:22 AM Dumitru Ceara  wrote:
>
> On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique  wrote:
> >
> > On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara  wrote:
> > >
> > > Reported-by: Numan Siddique 
> > > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain
whenever possible.")
> > > Signed-off-by: Dumitru Ceara 
> >
> > Thanks Dumitru for the fix. I applied this to master.
> >
> > Thanks
> > Numan
>
> Thanks Numan and sorry for missing this case in the first place.

Sorry for missing this in review. Just want to confirm, is this a bug or
just improvement? I think it is not a bug, because the set is empty {}, it
should just get ignored in the end by ovn-controller when translating to
OpenFlow rules, correct?
>
> >
> > > ---
> > >  northd/ovn-northd.c | 12 
> > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > index d6beb97..6742bc0 100644
> > > --- a/northd/ovn-northd.c
> > > +++ b/northd/ovn-northd.c
> > > @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct
ovn_port *op,
> > >  }
> > >  }
> > >
> > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET,
sw_op,
> > > -sw_od, 75, lflows);
> > > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6,
sw_op,
> > > -sw_od, 75, lflows);
> > > +if (!sset_is_empty(_ips_v4)) {
> > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4,
AF_INET, sw_op,
> > > +sw_od, 75, lflows);
> > > +}
> > > +if (!sset_is_empty(_ips_v6)) {
> > > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6,
AF_INET6, sw_op,
> > > +sw_od, 75, lflows);
> > > +}
> > >
> > >  sset_destroy(_ips_v4);
> > >  sset_destroy(_ips_v6);
> > > --
> > > 1.8.3.1
> > >
> > > ___
> > > dev mailing list
> > > d...@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
> >
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 00/13] OVN Interconnection

2019-11-14 Thread Han Zhou
On Wed, Nov 13, 2019 at 10:27 AM Numan Siddique  wrote:
>
> On Thu, Oct 31, 2019 at 2:43 AM Han Zhou  wrote:
> >
> > The series supports interconnecting multiple OVN deployments (e.g.
 located at
> > multiple data centers) through logical routers connected with tansit
logical
> > switches with overlay tunnels, managed through OVN control plane.  See
the
> > ovn-architecture.rst document updates for more details, and find the
> > instructions in Documentation/tutorials/ovn-interconnection.rst.
> >
> > Han Zhou (13):
> >   ovn-architecture: Add documentation for OVN interconnection feature.
> >   ovn-inb: Interconnection northbound DB schema and CLI.
> >   ovn-isb: Interconnection southbound DB schema and CLI.
> >   ovn-ic: Interconnection controller with AZ registeration.
> >   ovn-northd.c: Refactor allocate_tnlid.
> >   ovn-ic: Transit switch controller.
> >   ovn-sb: Add columns is_interconn and is_remote to Chassis.
> >   ovn-ic: Interconnection gateway controller.
> >   ovn-ic: Interconnection port controller.
> >   ovn.at: e2e test for OVN interconnection.
> >   ovn-ctl: Refactor to reduce redundant code.
> >   ovn-ctl: Support commands for interconnection.
> >   tutorial: Add tutorial for OVN Interconnection.
> >
>
> Hi Han,
>
> Thanks for working on this feature. It's an interesting use case.
>
> I had a quick look at all the patches.
>

Numan, thanks a lot for the thorough review! Please see my response inlined.

> I have few comments
>
> 1. I would suggest to rename the DBs as  ovn-ic-nb.ovsschema (and the
> same for ovn-isb).
> The DB name is - OVN_IC_Northbound. So it would make sense to have
> - ovn-ic-nb.ovsschema
>  I would also suggest to rename the utilities to ovn-ic-nbctl and
> ovn-ic-sbctl.
> With  ovn-inbctl/ovn-isbctl, it is really confusing.
>
Sure, I felt not quite convenient with two dashes in a command name. I
agree that ovn-ic-nbctl and ovn-ic-sbctl are more clear. I can change it.

> 2. ovn-ic service writes to interconnect south db, ovn north db and
> ovn south db. Writing to ic south db and
> ovn north db is fine. But it seems a little odd for ovn-ic to
> write to south db. From what I understand it writes
> to south db for 3 purposes
>   a. Updating the tunnel_key column of datapath_binding
> representing the transit switch
>   b. Updating the key column of port_binding representing the
> logical router port connecting to the transit switch.
>   c. Creating chassis rows for remote gateway chassis.
>
>I think it's better if ovn-ic can delegate all these to ovn-northd.
> For (a) and (b), ovn-ic can set the generated tunnel key
>in the other_config/options column of Logical switch/Logical switch
> port. ovn-northd can internally set this value to
>the south db.
>
>For (c), I think its better we add another table - Remote_Chassis
> (or some other appropriate name) . ovn-ic will create rows
>in this table for each remote chassis and ovn-northd will create
> chassis row in south db.
>We already have Gateway_Chassis table in North db. So I think it's
> fine if we add Remote_Chassis table. The only odd thing
>would be is to store the encap details in North db.
>
>With this, ovn-ic, doesn't need to worry about syncing between
> interconnect south db and ovn south db. In my opinion ovn-ic
>should act more like CMS to each availability zone and hence should
> not do any write transactions to the south db.
>
> Any concerns with this proposed approach ?
>
We could avoid ovn-ic writing directly to SB with some extra logic in
northd, but I don't see any problem for ovn-ic to update SB directly. First
of all, we have hypervisors creating and updating SB directly for Chassis
and Encap records. The design here is that ovn-ic updates Chassis and Encap
on behalf of remote gateway nodes, which I think is straightforward and
reasonable. Similarly, port-binding's chassis column is updated the same
way as how hypervisors are updating it. Secondly, for tunnel keys updating,
it may seem graceful to update from northd, since northd is the only client
that updates tunnel keys today, but since ovn-ic is responsible for
calculating these keys, and it already has connection to SB, I think it is
just more natural and efficient to update it directly, to avoid the extra
logic and unnecessary latency from northd with another round of
computation. The scope of the ovn-ic is only for the interconnection
objects, so I don't see any conflict or ownership ambiguity here. What do
you think?

> 3. In patch 7,its better to rename the ovs configuration option -
> "external_ids:is-interconn"  to "external_ids:ovn-is-interconn".
>You also need to document it in ovn-controller-8.xml.
>
Agree!

>Or maybe we can remove this option - external_ids:is-interconn. We
> probably don't need this if we do like below
>
>2 (c) can also be done this way
>  - User creates transit switch.
>  - ovn-ic creates transit switch in north db.
>  

Re: [ovs-dev] ovs-vswitchd is crashing while adding dpdk ports to the ovs bridge

2019-11-14 Thread Ilya Maximets
> Hi,
> 
> I'm trying to add the dpdk ports to the ovs bridge, and while doing so, i
> see the ovs daemon crashing with the 'operation not permitted' error,
> here's my architecture, i am on a debian vm (pseudo host) that has virt-io
> devices attached to it, i'm binding these virt-io's to the dpdk (igb_uio)
> and trying to add it to the ovs bridge.
> dpdk version: 17.11.2
> ovs-version: 2.9.0

Hi Deepak,

You're using very old OVS and DPDK releases.  A lot of issues was fixed in
later stable releases.  Try using OVS 2.9.6 and DPDK 17.11.6 that are
current recommended versions. (There are already more recent DPDK 17.11.*
releases, but they wasn't verified with OVS yet).
With OVS 2.9.6 at least it should stop crashing because of the following
commit:
3e417c853b6d ("netdev-dpdk: Don't use PMD driver if not configured 
successfully")

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RFC ovn] ovn-northd: ls_*_acl behavior not consistent for untracked flows

2019-11-14 Thread venugopali
From: venu iyer 

If one creates a port group and a MAC address set, and create an
ACL that prevents packets being output to a port in the Port Group from
any MAC address in the address set, the outcome is not consistent.

The outcome depends on whether there is a stateful rule on the switch or not.

Specifically:

Assuming 'l2pg' is a port group with a list of ports and 'macs' is an Address
Set with a list of MAC addresses and the intent is to drop all packets
with source MAC address in 'macs' to any port in 'l2pg' using:

ovn-nbctl acl-add  to-lport 5000 \
"outport == @l2pg && eth.src == $macs" drop

Without any stateful rule on the logical switch, the corresponding
logical flow looks like:
table=4 (ls_out_acl), priority=6000,\
match=(outport == @l2pg && eth.src == $macs), \
action=(/* drop */)

Based on this rule, any packet destined to the ports in 'l2pg' with source
Address in 'macs' will be dropped - as is expected from the ACL above.

While with a Stateful rule on the switch (any stateful rule will do),
the same rule looks like:
table=4 (ls_out_acl), priority=6000, \
match=((!ct.est || (ct.est && ct_label.blocked == 1)) && \
(outport == @l2pg && eth.src == $macs)), action=(/* drop */)

However, with this, only packets that are tracked will match the rule
and be dropped, e.g. IP packets will be dropped, but ARP etc., will go
through - this is not expected.

Based on whether there are stateful rules or not on the switch,
untracked packets will see different behavior.

The fix is to complete the rules in the stateful case, i.e. instead
of looking for flows that are not established or not new, we should
first check if flows are not tracked.

The fix was tested in the above scenario. Additionally, the following
ACL was added to test the change in the "allow" case (i.e. to drop
all the packets based on the above ACL, but have a higher priority
rule that selectively allow ARP).

ovn-nbctl acl-add ls1 to-lport 6000  \
"outport == @l2pg && eth.type == 0x806" allow

with and without the staeful rule to make sure the behavior is the
same.

OVN test cases were run with this change and no failures were seen.

Signed-off-by: venu iyer 
---
 northd/ovn-northd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 6742bc002..1f5f97796 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -4585,12 +4585,12 @@ consider_acl(struct hmap *lflows, struct ovn_datapath 
*od,
  * deletion.  There is no need to commit here, so we can just
  * proceed to the next table. We use this to ensure that this
  * connection is still allowed by the currently defined
- * policy. */
+ * policy. Match untracked packets too. */
 ds_clear();
 ds_clear();
 ds_put_format(,
-  "!ct.new && ct.est && !ct.rpl"
-  " && ct_label.blocked == 0 && (%s)",
+  "(!ct.trk || (!ct.new && ct.est && !ct.rpl"
+  " && ct_label.blocked == 0)) && (%s)",
   acl->match);
 
 build_acl_log(, acl);
@@ -4613,10 +4613,10 @@ consider_acl(struct hmap *lflows, struct ovn_datapath 
*od,
  * depending on whether the connection was previously committed
  * to the connection tracker with ct_commit. */
 if (has_stateful) {
-/* If the packet is not part of an established connection, then
+/* If the packet is not tracked or part of an established 
connection, then
  * we can simply reject/drop it. */
 ds_put_cstr(,
-"(!ct.est || (ct.est && ct_label.blocked == 1))");
+"(!ct.trk || !ct.est || (ct.est && ct_label.blocked == 
1))");
 if (!strcmp(acl->action, "reject")) {
 build_reject_acl_rules(od, lflows, stage, acl, ,
);
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.

2019-11-14 Thread Dumitru Ceara
On Wed, Nov 13, 2019 at 9:18 AM Dumitru Ceara  wrote:
>
> On Fri, Nov 8, 2019 at 8:27 PM Han Zhou  wrote:
> >
> >
> >
> > On Fri, Nov 8, 2019 at 11:22 AM Han Zhou  wrote:
> > >
> > > 1. storage data and the void *arg of init() breaks the engine node data 
> > > encapsulation.
> > > 2. engine_node_valid(_flow_output, engine_run_id) is not needed? 
> > > Should use storage to access instead?
> > > 3. refactor of engine is good but better to be a separate commit
> > > 4. we can have a new interface: engine_get_data(), which returns data if 
> > > it is valid. we should never expose the data directly. We should init the 
> > > engine node with dynamically allocated engine data structure (and 
> > > remember to free during destroy)
> >
> > Oops! please ignore the above part since it was draft and I forgot to 
> > delete after editing the formal response, mostly redundant :-)
> > Real response started here =>
> > >
> > > Hi Dumitru,
> > >
> > > Sorry for late response.
> > > On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara  wrote:
> > > >
> > > > The incremental processing engine might stop a run before the
> > > > en_runtime_data node is processed. In such cases the ed_runtime_data
> > > > fields might contain pointers to already deleted SB records. For
> > > > example, if a port binding corresponding to a patch port is removed from
> > > > the SB database and the incremental processing engine aborts before the
> > > > en_runtime_data node is processed then the corresponding local_datapath
> > > > hashtable entry in ed_runtime_data is stale and will store a pointer to
> > > > the already freed sbrec_port_binding record.
> > > >
> > > > This will cause invalid memory accesses in various places (e.g.,
> > > > pinctrl_run() -> prepare_ipv6_ras()).
> > > >
> > > > To fix the issue we need a way to track how each node was processed
> > > > during an engine run. This commit transforms the 'changed' field in
> > > > struct engine_node in a 'state' field. Possible node states are:
> > > > - "New": the node is not yet initialized.
> > > > - "Stale": data in the node is not up to date with the DB.
> > > > - "Updated": data in the node is valid but was updated during
> > > >   the last run of the engine.
> > > > - "Valid": data in the node is valid and didn't change during
> > > >   the last run of the engine.
> > > > - "Aborted": during the last run, processing was aborted for
> > > >   this node.
> > > > - "Destroyed": the node was already cleaned up.
> > > >
> > > > We also add a separation between engine node data that can be accessed
> > > > at any time (regardless if the last engine run was successful or not)
> > > > and data that may be accessed only if the nodes are up to date. This
> > > > helps avoiding custom "engine_node_valid" handlers for different
> > > > nodes.
> > > >
> > > > The commit also simplifies the logic of calling engine_run and
> > > > engine_need_run in order to reduce the number of external variables
> > > > required to track the result of the last engine execution.
> > > >
> > > > Functions that need to be called from the main loop and depend on
> > > > various data contents of the engine's nodes are now called only if
> > > > the data is up to date.
> > > >
> > > > CC: Han Zhou 
> > > > Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine 
> > > > - quiet mode.")
> > > > Signed-off-by: Dumitru Ceara 
> > > >
> > > > ---
> > > > v2: Address Han's comments:
> > > > - call engine_node_valid() in all the places where node local data is
> > > >   used.
> > > > - move out "global" data outside the engine nodes. Make a clear
> > > >   separation between data that can be safely used at any time and data
> > > >   that can be used only when the engine run was successful.
> > >
> > > I am concerned with this kind of separation of *global* data, which 
> > > breaks the data encapsulation of engine node, and can easily result in 
> > > hidden dependency. As you know the main purpose of the I-P engine is to 
> > > force explicit dependency exposed between different engine nodes thus 
> > > ensure the correctness (at least it helps to ensure) of incremental 
> > > processing.
> > >
> > > Here is my proposal to address the problem with better encapsulation.
> > >
> > > Firstly, let's avoid direct engine data access outside of engine module. 
> > > At engine node construction, instead of using reference of stack variable 
> > > (such as struct ed_type_runtime_data ed_runtime_data), we can allocate 
> > > the memory in the engine node's init() interface, and free in the 
> > > cleanup() interface. This way, there will be no way to directly access 
> > > engine data like _runtime_data.local_datapaths.
> > >
> > > Secondly, let's add a engine module interface engine_get_data() to 
> > > retrieve *and validate* data for an engine node:
> > > void *
> > > engine_get_data(struct engine_node *node, uint64_t run_id)
> > > {
> > > if (engine_node_valid(node, run_id)) {
> > > return 

[ovs-dev] [PATCH v3 ovn 4/4] ovn-controller: Fix use of dangling pointers in I-P runtime_data.

2019-11-14 Thread Dumitru Ceara
The incremental processing engine might stop a run before the
en_runtime_data node is processed. In such cases the ed_runtime_data
fields might contain pointers to already deleted SB records. For
example, if a port binding corresponding to a patch port is removed from
the SB database and the incremental processing engine aborts before the
en_runtime_data node is processed then the corresponding local_datapath
hashtable entry in ed_runtime_data is stale and will store a pointer to
the already freed sbrec_port_binding record.

This will cause invalid memory accesses in various places (e.g.,
pinctrl_run() -> prepare_ipv6_ras()).

To fix the issue we introduce the concept of "internal_data" vs "data" in
engine nodes. The first field, "internal_data", is data that can be accessed
by the incremental engine nodes handlers (data from other nodes must be
considered read-only and data from other nodes must not be accessed if the
nodes haven't been refreshed in the current iteration). The second field,
"data" is a pointer reset at engine_run() and if non-NULL indicates to
users outside the incremental engine that the data is safe to use.

This commit also adds an "is_valid()" method to engine nodes to allow
users to override the default behavior of determining if data is valid in a
node (e.g., for the ct-zones node the data is always safe to access).

CC: Han Zhou 
Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine - quiet 
mode.")
Signed-off-by: Dumitru Ceara 

---
v3:
- split the change in series.
- Address Han's comments:
- fix the data encapsulation issue.
- add is_valid method to nodes.
- add internal_data/data fields to nodes as it makes it easier to write
  the code instead of adding an "engine_get_data()" API.
v2: Address Han's comments:
- call engine_node_valid() in all the places where node local data is
  used.
- move out "global" data outside the engine nodes. Make a clear
  separation between data that can be safely used at any time and data
  that can be used only when the engine run was successful.
- add a debug log for iterations when the engine didn't run.
- refactor a bit more the incremental engine code.
---
 controller/ovn-controller.c |  227 ++-
 lib/inc-proc-eng.c  |   25 -
 lib/inc-proc-eng.h  |   28 +
 3 files changed, 162 insertions(+), 118 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 51f466a..0475a30 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -741,8 +741,7 @@ struct ed_type_ofctrl_is_connected {
 static void
 en_ofctrl_is_connected_init(struct engine_node *node)
 {
-struct ed_type_ofctrl_is_connected *data =
-(struct ed_type_ofctrl_is_connected *)node->data;
+struct ed_type_ofctrl_is_connected *data = node->internal_data;
 data->connected = false;
 }
 
@@ -754,8 +753,7 @@ en_ofctrl_is_connected_cleanup(struct engine_node *node 
OVS_UNUSED)
 static void
 en_ofctrl_is_connected_run(struct engine_node *node)
 {
-struct ed_type_ofctrl_is_connected *data =
-(struct ed_type_ofctrl_is_connected *)node->data;
+struct ed_type_ofctrl_is_connected *data = node->internal_data;
 if (data->connected != ofctrl_is_connected()) {
 data->connected = !data->connected;
 engine_set_node_state(node, EN_UPDATED);
@@ -775,7 +773,7 @@ struct ed_type_addr_sets {
 static void
 en_addr_sets_init(struct engine_node *node)
 {
-struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data;
+struct ed_type_addr_sets *as = node->internal_data;
 shash_init(>addr_sets);
 as->change_tracked = false;
 sset_init(>new);
@@ -786,7 +784,7 @@ en_addr_sets_init(struct engine_node *node)
 static void
 en_addr_sets_cleanup(struct engine_node *node)
 {
-struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data;
+struct ed_type_addr_sets *as = node->internal_data;
 expr_const_sets_destroy(>addr_sets);
 shash_destroy(>addr_sets);
 sset_destroy(>new);
@@ -797,7 +795,7 @@ en_addr_sets_cleanup(struct engine_node *node)
 static void
 en_addr_sets_run(struct engine_node *node)
 {
-struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data;
+struct ed_type_addr_sets *as = node->internal_data;
 
 sset_clear(>new);
 sset_clear(>deleted);
@@ -817,7 +815,7 @@ en_addr_sets_run(struct engine_node *node)
 static bool
 addr_sets_sb_address_set_handler(struct engine_node *node)
 {
-struct ed_type_addr_sets *as = (struct ed_type_addr_sets *)node->data;
+struct ed_type_addr_sets *as = node->internal_data;
 
 sset_clear(>new);
 sset_clear(>deleted);
@@ -852,7 +850,7 @@ struct ed_type_port_groups{
 static void
 en_port_groups_init(struct engine_node *node)
 {
-struct ed_type_port_groups *pg = (struct ed_type_port_groups *)node->data;
+struct ed_type_port_groups *pg = node->internal_data;
 

[ovs-dev] [PATCH v3 ovn 3/4] ovn-controller: Add separate I-P engine node for processing ct-zones.

2019-11-14 Thread Dumitru Ceara
Signed-off-by: Dumitru Ceara 
---
 controller/ovn-controller.c |  117 +--
 1 file changed, 78 insertions(+), 39 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 4f8ceae..51f466a 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -933,11 +933,6 @@ struct ed_type_runtime_data {
  * _ */
 struct sset local_lport_ids;
 struct sset active_tunnels;
-
-/* connection tracking zones. */
-unsigned long ct_zone_bitmap[BITMAP_N_LONGS(MAX_CT_ZONES)];
-struct shash pending_ct_zones;
-struct simap ct_zones;
 };
 
 static void
@@ -945,24 +940,11 @@ en_runtime_data_init(struct engine_node *node)
 {
 struct ed_type_runtime_data *data =
 (struct ed_type_runtime_data *)node->data;
-struct ovsrec_open_vswitch_table *ovs_table =
-(struct ovsrec_open_vswitch_table *)EN_OVSDB_GET(
-engine_get_input("OVS_open_vswitch", node));
-struct ovsrec_bridge_table *bridge_table =
-(struct ovsrec_bridge_table *)EN_OVSDB_GET(
-engine_get_input("OVS_bridge", node));
+
 hmap_init(>local_datapaths);
 sset_init(>local_lports);
 sset_init(>local_lport_ids);
 sset_init(>active_tunnels);
-shash_init(>pending_ct_zones);
-simap_init(>ct_zones);
-
-/* Initialize connection tracking zones. */
-memset(data->ct_zone_bitmap, 0, sizeof data->ct_zone_bitmap);
-bitmap_set1(data->ct_zone_bitmap, 0); /* Zone 0 is reserved. */
-restore_ct_zones(bridge_table, ovs_table,
- >ct_zones, data->ct_zone_bitmap);
 }
 
 static void
@@ -983,9 +965,6 @@ en_runtime_data_cleanup(struct engine_node *node)
 free(cur_node);
 }
 hmap_destroy(>local_datapaths);
-
-simap_destroy(>ct_zones);
-shash_destroy(>pending_ct_zones);
 }
 
 static void
@@ -997,9 +976,6 @@ en_runtime_data_run(struct engine_node *node)
 struct sset *local_lports = >local_lports;
 struct sset *local_lport_ids = >local_lport_ids;
 struct sset *active_tunnels = >active_tunnels;
-unsigned long *ct_zone_bitmap = data->ct_zone_bitmap;
-struct shash *pending_ct_zones = >pending_ct_zones;
-struct simap *ct_zones = >ct_zones;
 
 static bool first_run = true;
 if (first_run) {
@@ -1094,9 +1070,6 @@ en_runtime_data_run(struct engine_node *node)
 ovs_table, local_datapaths,
 local_lports, local_lport_ids);
 
-update_ct_zones(local_lports, local_datapaths, ct_zones,
-ct_zone_bitmap, pending_ct_zones);
-
 engine_set_node_state(node, EN_UPDATED);
 }
 
@@ -1138,6 +,55 @@ runtime_data_sb_port_binding_handler(struct engine_node 
*node)
 return !changed;
 }
 
+/* Connection tracking zones. */
+struct ed_type_ct_zones {
+unsigned long bitmap[BITMAP_N_LONGS(MAX_CT_ZONES)];
+struct shash pending;
+struct simap current;
+};
+
+static void
+en_ct_zones_init(struct engine_node *node)
+{
+struct ed_type_ct_zones *data = node->data;
+struct ovsrec_open_vswitch_table *ovs_table =
+(struct ovsrec_open_vswitch_table *)EN_OVSDB_GET(
+engine_get_input("OVS_open_vswitch", node));
+struct ovsrec_bridge_table *bridge_table =
+(struct ovsrec_bridge_table *)EN_OVSDB_GET(
+engine_get_input("OVS_bridge", node));
+
+shash_init(>pending);
+simap_init(>current);
+
+memset(data->bitmap, 0, sizeof data->bitmap);
+bitmap_set1(data->bitmap, 0); /* Zone 0 is reserved. */
+restore_ct_zones(bridge_table, ovs_table, >current, data->bitmap);
+}
+
+static void
+en_ct_zones_cleanup(struct engine_node *node)
+{
+struct ed_type_ct_zones *data = node->data;
+
+simap_destroy(>current);
+shash_destroy(>pending);
+}
+
+static void
+en_ct_zones_run(struct engine_node *node)
+{
+struct ed_type_ct_zones *data = node->data;
+struct ed_type_runtime_data *rt_data =
+(struct ed_type_runtime_data *)engine_get_input(
+"runtime_data", node)->data;
+
+update_ct_zones(_data->local_lports, _data->local_datapaths,
+>current, data->bitmap, >pending);
+
+engine_set_node_state(node, EN_UPDATED);
+}
+
 struct ed_type_mff_ovn_geneve {
 enum mf_field_id mff_ovn_geneve;
 };
@@ -1215,7 +1237,11 @@ en_flow_output_run(struct engine_node *node)
 struct sset *local_lports = _data->local_lports;
 struct sset *local_lport_ids = _data->local_lport_ids;
 struct sset *active_tunnels = _data->active_tunnels;
-struct simap *ct_zones = _data->ct_zones;
+
+struct ed_type_ct_zones *ct_zones_data =
+(struct ed_type_ct_zones *)engine_get_input(
+"ct_zones", node)->data;
+struct simap *ct_zones = _zones_data->current;
 
 struct ed_type_mff_ovn_geneve *ed_mff_ovn_geneve =
 (struct ed_type_mff_ovn_geneve *)engine_get_input(
@@ -1445,7 +1471,11 @@ flow_output_sb_port_binding_handler(struct engine_node 
*node)
 

[ovs-dev] [PATCH v3 ovn 2/4] ovn-controller: Add per node states to I-P engine.

2019-11-14 Thread Dumitru Ceara
This commit transforms the 'changed' field in struct engine_node in a
'state' field. Possible node states are:
- "Stale": data in the node is not up to date with the DB.
- "Updated": data in the node is valid but was updated during
  the last run of the engine.
- "Valid": data in the node is valid and didn't change during
  the last run of the engine.
- "Aborted": during the last run, processing was aborted for
  this node.

This commit also further refactors the I-P engine:
- instead of recursively performing all the engine processing a
  preprocessing stage is added (engine_get_nodes()) before the main processing
  loop is executed in order to topologically sort nodes in the engine such
  that all inputs of a given node appear in the sorted array before the node
  itself. This simplifies a bit the code in engine_run().
- remove the need for using an engine_run_id by using the newly added states.

Signed-off-by: Dumitru Ceara 
---
 controller/ovn-controller.c |   88 ++---
 lib/inc-proc-eng.c  |  218 ---
 lib/inc-proc-eng.h  |   74 +++
 3 files changed, 267 insertions(+), 113 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 3922f3d..4f8ceae 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -758,10 +758,10 @@ en_ofctrl_is_connected_run(struct engine_node *node)
 (struct ed_type_ofctrl_is_connected *)node->data;
 if (data->connected != ofctrl_is_connected()) {
 data->connected = !data->connected;
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 return;
 }
-node->changed = false;
+engine_set_node_state(node, EN_VALID);
 }
 
 struct ed_type_addr_sets {
@@ -811,7 +811,7 @@ en_addr_sets_run(struct engine_node *node)
 addr_sets_init(as_table, >addr_sets);
 
 as->change_tracked = false;
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 }
 
 static bool
@@ -830,11 +830,14 @@ addr_sets_sb_address_set_handler(struct engine_node *node)
 addr_sets_update(as_table, >addr_sets, >new,
  >deleted, >updated);
 
-node->changed = !sset_is_empty(>new) || !sset_is_empty(>deleted)
-|| !sset_is_empty(>updated);
+if (!sset_is_empty(>new) || !sset_is_empty(>deleted) ||
+!sset_is_empty(>updated)) {
+engine_set_node_state(node, EN_UPDATED);
+} else {
+engine_set_node_state(node, EN_VALID);
+}
 
 as->change_tracked = true;
-node->changed = true;
 return true;
 }
 
@@ -885,7 +888,7 @@ en_port_groups_run(struct engine_node *node)
 port_groups_init(pg_table, >port_groups);
 
 pg->change_tracked = false;
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 }
 
 static bool
@@ -904,11 +907,14 @@ port_groups_sb_port_group_handler(struct engine_node 
*node)
 port_groups_update(pg_table, >port_groups, >new,
  >deleted, >updated);
 
-node->changed = !sset_is_empty(>new) || !sset_is_empty(>deleted)
-|| !sset_is_empty(>updated);
+if (!sset_is_empty(>new) || !sset_is_empty(>deleted) ||
+!sset_is_empty(>updated)) {
+engine_set_node_state(node, EN_UPDATED);
+} else {
+engine_set_node_state(node, EN_VALID);
+}
 
 pg->change_tracked = true;
-node->changed = true;
 return true;
 }
 
@@ -1091,7 +1097,7 @@ en_runtime_data_run(struct engine_node *node)
 update_ct_zones(local_lports, local_datapaths, ct_zones,
 ct_zone_bitmap, pending_ct_zones);
 
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 }
 
 static bool
@@ -1157,10 +1163,10 @@ en_mff_ovn_geneve_run(struct engine_node *node)
 enum mf_field_id mff_ovn_geneve = ofctrl_get_mf_field_id();
 if (data->mff_ovn_geneve != mff_ovn_geneve) {
 data->mff_ovn_geneve = mff_ovn_geneve;
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 return;
 }
-node->changed = false;
+engine_set_node_state(node, EN_VALID);
 }
 
 struct ed_type_flow_output {
@@ -1322,7 +1328,7 @@ en_flow_output_run(struct engine_node *node)
  active_tunnels,
  flow_table);
 
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 }
 
 static bool
@@ -1404,7 +1410,7 @@ flow_output_sb_logical_flow_handler(struct engine_node 
*node)
   flow_table, group_table, meter_table, lfrr,
   conj_id_ofs);
 
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 return handled;
 }
 
@@ -1427,7 +1433,7 @@ flow_output_sb_mac_binding_handler(struct engine_node 
*node)
 lflow_handle_changed_neighbors(sbrec_port_binding_by_name,
 mac_binding_table, flow_table);
 
-node->changed = true;
+engine_set_node_state(node, EN_UPDATED);
 return true;
 }
 
@@ 

[ovs-dev] [PATCH v3 ovn 1/4] ovn-controller: Refactor I-P engine_run() tracking.

2019-11-14 Thread Dumitru Ceara
This commit simplifies the logic of calling engine_run and engine_need_run in
order to reduce the number of external variables required to track the result
of the last engine execution.

The engine code is also refactored a bit and the engine_run() function is
split in different functions that handle computing/recomputing a node.

Signed-off-by: Dumitru Ceara 
---
 controller/ovn-controller.c |   33 ++-
 lib/inc-proc-eng.c  |  124 +--
 lib/inc-proc-eng.h  |7 ++
 3 files changed, 107 insertions(+), 57 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 9ab98be..3922f3d 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1942,7 +1942,6 @@ main(int argc, char *argv[])
  _pkt);
 
 uint64_t engine_run_id = 0;
-uint64_t old_engine_run_id = 0;
 bool engine_run_done = true;
 
 unsigned int ovs_cond_seqno = UINT_MAX;
@@ -1952,10 +1951,11 @@ main(int argc, char *argv[])
 exiting = false;
 restart = false;
 while (!exiting) {
+engine_run_id++;
+
 update_sb_db(ovs_idl_loop.idl, ovnsb_idl_loop.idl);
 update_ssl_config(ovsrec_ssl_table_get(ovs_idl_loop.idl));
 ofctrl_set_probe_interval(get_ofctrl_probe_interval(ovs_idl_loop.idl));
-old_engine_run_id = engine_run_id;
 
 struct ovsdb_idl_txn *ovs_idl_txn = ovsdb_idl_loop_run(_idl_loop);
 unsigned int new_ovs_cond_seqno
@@ -2047,12 +2047,12 @@ main(int argc, char *argv[])
 if (engine_run_done) {
 engine_set_abort_recompute(true);
 engine_run_done = engine_run(_flow_output,
- ++engine_run_id);
+ engine_run_id);
 }
 } else {
 engine_set_abort_recompute(false);
 engine_run_done = true;
-engine_run(_flow_output, ++engine_run_id);
+engine_run(_flow_output, engine_run_id);
 }
 }
 stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
@@ -2095,17 +2095,20 @@ main(int argc, char *argv[])
 }
 
 }
-if (old_engine_run_id == engine_run_id || !engine_run_done) {
-if (!engine_run_done || engine_need_run(_flow_output)) {
-VLOG_DBG("engine did not run, force recompute next time: "
- "br_int %p, chassis %p", br_int, chassis);
-engine_set_force_recompute(true);
-poll_immediate_wake();
-} else {
-VLOG_DBG("engine did not run, and it was not needed"
- " either: br_int %p, chassis %p",
- br_int, chassis);
-}
+if (engine_need_run(_flow_output, engine_run_id)) {
+VLOG_DBG("engine did not run, force recompute next time: "
+"br_int %p, chassis %p", br_int, chassis);
+engine_set_force_recompute(true);
+poll_immediate_wake();
+} else if (!engine_run_done) {
+VLOG_DBG("engine was aborted, force recompute next time: "
+ "br_int %p, chassis %p", br_int, chassis);
+engine_set_force_recompute(true);
+poll_immediate_wake();
+} else if (!engine_has_run(_flow_output, engine_run_id)) {
+VLOG_DBG("engine did not run, and it was not needed"
+ " either: br_int %p, chassis %p",
+ br_int, chassis);
 } else {
 engine_set_force_recompute(false);
 }
diff --git a/lib/inc-proc-eng.c b/lib/inc-proc-eng.c
index 1064a08..8a085e2 100644
--- a/lib/inc-proc-eng.c
+++ b/lib/inc-proc-eng.c
@@ -129,14 +129,72 @@ engine_ovsdb_node_add_index(struct engine_node *node, 
const char *name,
 }
 
 bool
-engine_run(struct engine_node *node, uint64_t run_id)
+engine_has_run(struct engine_node *node, uint64_t run_id)
+{
+return node->run_id == run_id;
+}
+
+/* Do a full recompute (or at least try). If we're not allowed then
+ * mark the node as "aborted".
+ */
+static bool
+engine_recompute(struct engine_node *node, bool forced, bool allowed)
+{
+VLOG_DBG("node: %s, recompute (%s)", node->name,
+ forced ? "forced" : "triggered");
+
+if (!allowed) {
+VLOG_DBG("node: %s, recompute aborted", node->name);
+return false;
+}
+
+node->run(node);
+VLOG_DBG("node: %s, changed: %d", node->name, node->changed);
+return true;
+}
+
+/* Return true if the node could be computed without triggerring a 

Re: [ovs-dev] [PATCH] dpdk: Deprecate pdump support.

2019-11-14 Thread Pattan, Reshma
Hi,

Sorry, I don’t work on OVS, so I never tested pdump with OVS.
I remembered,  Ciara testing earlier versions of pdump with  OVS in the past.

Thanks,
Reshma



> -Original Message-
> From: David Marchand 
> Sent: Thursday, November 14, 2019 4:41 PM
> To: Pattan, Reshma 
> Cc: ovs-dev@openvswitch.org; Kevin Traynor ;
> Aaron Conole ; Loftus, Ciara
> ; Flavio Leitner ; Ilya Maximets
> ; Stokes, Ian 
> Subject: Re: [PATCH] dpdk: Deprecate pdump support.
> 
> Hello Reshma,
> 
> Has pdump been tested (recently) with OVS?
> 
> 
> On Mon, Nov 11, 2019 at 7:53 PM Ilya Maximets 
> wrote:
> >
> > The conventional way for packet dumping in OVS is to use ovs-tcpdump
> > that works via traffic mirroring.  DPDK pdump could probably be used
> > for some lower level debugging, but it is not commonly used for
> > various reasons.
> >
> > There are lots of limitations for using this functionality in practice.
> > Most of them connected with running secondary pdump process and
> memory
> > layout issues like requirement to disable ASLR in kernel.
> > More details are available in DPDK guide:
> > https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-
> p
> > rocess-limitations
> >
> > Beside the functional limitations it's also hard to use this
> > functionality correctly.  User must be sure that OVS and pdump utility
> > are running on different CPU cores, which is hard because non-PMD
> > threads could float over available CPU cores.  This or any other
> > misconfiguration will likely lead to crash of the pdump utility or/and
> > OVS.
> >
> > Another problem is that the user must actually have this special pdump
> > utility in a system and it might be not available in distributions.
> >
> > This change disables pdump support by default introducing special
> > configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
> > be shown to users on configuration and in runtime.
> >
> > Claiming to completely remove this functionality from OVS in one of
> > the next releases.
> >
> > Signed-off-by: Ilya Maximets 
> > Acked-by: Aaron Conole 
> 
> - Recompiled from scratch, on OVS master (before this patch) with dpdk
> 18.11.2.
> 
> other_config: {dpdk-init="true", pmd-cpu-mask="0x8002"}
> 
> 2 physical ports, 2 vhost ports.
> 
> 2019-11-14T14:13:09.596Z|00018|dpdk|INFO|Using DPDK 18.11.2 2019-11-
> 14T14:13:09.596Z|00019|dpdk|INFO|DPDK Enabled - initializing...
> 2019-11-14T14:13:09.596Z|00020|dpdk|INFO|No vhost-sock-dir provided -
> defaulting to //var/run/openvswitch 2019-11-
> 14T14:13:09.596Z|00021|dpdk|INFO|IOMMU support for vhost-user-client
> disabled.
> 2019-11-14T14:13:09.596Z|00022|dpdk|INFO|POSTCOPY support for vhost-
> user-client disabled.
> 2019-11-14T14:13:09.596Z|00023|dpdk|INFO|Per port memory for DPDK
> devices disabled.
> 2019-11-14T14:13:09.596Z|00024|dpdk|INFO|EAL ARGS: ovs-vswitchd --
> socket-mem 1024 --socket-limit 1024 -l 0.
> 2019-11-14T14:13:09.600Z|00025|dpdk|INFO|EAL: Detected 28 lcore(s)
> 2019-11-14T14:13:09.600Z|00026|dpdk|INFO|EAL: Detected 1 NUMA nodes
> 2019-11-14T14:13:09.602Z|00027|dpdk|INFO|EAL: Multi-process socket
> /var/run/openvswitch/dpdk/rte/mp_socket
> 2019-11-14T14:13:09.618Z|00028|dpdk|INFO|EAL: Probing VFIO support...
> 2019-11-14T14:13:09.618Z|00029|dpdk|INFO|EAL: VFIO support initialized
> 2019-11-14T14:13:14.612Z|00030|dpdk|INFO|EAL: PCI device :01:00.0 on
> NUMA socket 0
> 2019-11-14T14:13:14.612Z|00031|dpdk|INFO|EAL:   probe driver:
> 8086:10fb net_ixgbe
> 2019-11-14T14:13:14.613Z|00032|dpdk|INFO|EAL:   using IOMMU type 1
> (Type 1)
> 2019-11-14T14:13:14.744Z|00033|dpdk|INFO|EAL: Ignore mapping IO port
> bar(2)
> 2019-11-14T14:13:15.090Z|00034|dpdk|INFO|EAL: PCI device :01:00.1 on
> NUMA socket 0
> 2019-11-14T14:13:15.090Z|00035|dpdk|INFO|EAL:   probe driver:
> 8086:10fb net_ixgbe
> 2019-11-14T14:13:15.199Z|00036|dpdk|INFO|EAL: Ignore mapping IO port
> bar(2)
> 2019-11-14T14:13:15.530Z|00037|dpdk|INFO|EAL: PCI device :07:00.0 on
> NUMA socket 0
> 2019-11-14T14:13:15.530Z|00038|dpdk|INFO|EAL:   probe driver:
> 8086:1521 net_e1000_igb
> 2019-11-14T14:13:15.530Z|00039|dpdk|INFO|EAL: PCI device :07:00.1 on
> NUMA socket 0
> 2019-11-14T14:13:15.530Z|00040|dpdk|INFO|EAL:   probe driver:
> 8086:1521 net_e1000_igb
> ...
> 2019-11-14T14:13:15.802Z|00042|dpdk|INFO|DPDK pdump packet capture
> enabled 2019-11-14T14:13:15.803Z|00043|dpdk|INFO|DPDK Enabled -
> initialized
> 
> - Attached a gdb to ovs-vswitchd.
> 
> - Started pdump:
> # sudo -u openvswitch XDG_RUNTIME_DIR=/var/run/openvswitch
> ./v18.11.2/app/dpdk-pdump -- --pdump
> 'port=0,queue=*,rx-dev=/tmp/pkts.pcap'
> EAL: Detected 28 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: Multi-process socket
> /var/run/openvswitch/dpdk/rte/mp_socket_83791_549cdfd05e328e
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> EAL: PCI device :01:00.0 on NUMA socket 0
> EAL:   probe driver: 8086:10fb net_ixgbe
> EAL:   using IOMMU type 1 (Type 1)
> EAL: PCI device 

Re: [ovs-dev] [PATCH] dpdk: Deprecate pdump support.

2019-11-14 Thread David Marchand
Hello Reshma,

Has pdump been tested (recently) with OVS?


On Mon, Nov 11, 2019 at 7:53 PM Ilya Maximets  wrote:
>
> The conventional way for packet dumping in OVS is to use ovs-tcpdump
> that works via traffic mirroring.  DPDK pdump could probably be used
> for some lower level debugging, but it is not commonly used for
> various reasons.
>
> There are lots of limitations for using this functionality in practice.
> Most of them connected with running secondary pdump process and
> memory layout issues like requirement to disable ASLR in kernel.
> More details are available in DPDK guide:
> https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
>
> Beside the functional limitations it's also hard to use this
> functionality correctly.  User must be sure that OVS and pdump utility
> are running on different CPU cores, which is hard because non-PMD
> threads could float over available CPU cores.  This or any other
> misconfiguration will likely lead to crash of the pdump utility
> or/and OVS.
>
> Another problem is that the user must actually have this special pdump
> utility in a system and it might be not available in distributions.
>
> This change disables pdump support by default introducing special
> configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
> be shown to users on configuration and in runtime.
>
> Claiming to completely remove this functionality from OVS in one
> of the next releases.
>
> Signed-off-by: Ilya Maximets 
> Acked-by: Aaron Conole 

- Recompiled from scratch, on OVS master (before this patch) with dpdk 18.11.2.

other_config: {dpdk-init="true", pmd-cpu-mask="0x8002"}

2 physical ports, 2 vhost ports.

2019-11-14T14:13:09.596Z|00018|dpdk|INFO|Using DPDK 18.11.2
2019-11-14T14:13:09.596Z|00019|dpdk|INFO|DPDK Enabled - initializing...
2019-11-14T14:13:09.596Z|00020|dpdk|INFO|No vhost-sock-dir provided -
defaulting to //var/run/openvswitch
2019-11-14T14:13:09.596Z|00021|dpdk|INFO|IOMMU support for
vhost-user-client disabled.
2019-11-14T14:13:09.596Z|00022|dpdk|INFO|POSTCOPY support for
vhost-user-client disabled.
2019-11-14T14:13:09.596Z|00023|dpdk|INFO|Per port memory for DPDK
devices disabled.
2019-11-14T14:13:09.596Z|00024|dpdk|INFO|EAL ARGS: ovs-vswitchd
--socket-mem 1024 --socket-limit 1024 -l 0.
2019-11-14T14:13:09.600Z|00025|dpdk|INFO|EAL: Detected 28 lcore(s)
2019-11-14T14:13:09.600Z|00026|dpdk|INFO|EAL: Detected 1 NUMA nodes
2019-11-14T14:13:09.602Z|00027|dpdk|INFO|EAL: Multi-process socket
/var/run/openvswitch/dpdk/rte/mp_socket
2019-11-14T14:13:09.618Z|00028|dpdk|INFO|EAL: Probing VFIO support...
2019-11-14T14:13:09.618Z|00029|dpdk|INFO|EAL: VFIO support initialized
2019-11-14T14:13:14.612Z|00030|dpdk|INFO|EAL: PCI device :01:00.0
on NUMA socket 0
2019-11-14T14:13:14.612Z|00031|dpdk|INFO|EAL:   probe driver:
8086:10fb net_ixgbe
2019-11-14T14:13:14.613Z|00032|dpdk|INFO|EAL:   using IOMMU type 1 (Type 1)
2019-11-14T14:13:14.744Z|00033|dpdk|INFO|EAL: Ignore mapping IO port bar(2)
2019-11-14T14:13:15.090Z|00034|dpdk|INFO|EAL: PCI device :01:00.1
on NUMA socket 0
2019-11-14T14:13:15.090Z|00035|dpdk|INFO|EAL:   probe driver:
8086:10fb net_ixgbe
2019-11-14T14:13:15.199Z|00036|dpdk|INFO|EAL: Ignore mapping IO port bar(2)
2019-11-14T14:13:15.530Z|00037|dpdk|INFO|EAL: PCI device :07:00.0
on NUMA socket 0
2019-11-14T14:13:15.530Z|00038|dpdk|INFO|EAL:   probe driver:
8086:1521 net_e1000_igb
2019-11-14T14:13:15.530Z|00039|dpdk|INFO|EAL: PCI device :07:00.1
on NUMA socket 0
2019-11-14T14:13:15.530Z|00040|dpdk|INFO|EAL:   probe driver:
8086:1521 net_e1000_igb
...
2019-11-14T14:13:15.802Z|00042|dpdk|INFO|DPDK pdump packet capture enabled
2019-11-14T14:13:15.803Z|00043|dpdk|INFO|DPDK Enabled - initialized

- Attached a gdb to ovs-vswitchd.

- Started pdump:
# sudo -u openvswitch XDG_RUNTIME_DIR=/var/run/openvswitch
./v18.11.2/app/dpdk-pdump -- --pdump
'port=0,queue=*,rx-dev=/tmp/pkts.pcap'
EAL: Detected 28 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket
/var/run/openvswitch/dpdk/rte/mp_socket_83791_549cdfd05e328e
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device :01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL:   using IOMMU type 1 (Type 1)
EAL: PCI device :01:00.1 on NUMA socket 0
EAL:   probe driver: 8086:10fb net_ixgbe
EAL: PCI device :07:00.0 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
EAL: PCI device :07:00.1 on NUMA socket 0
EAL:   probe driver: 8086:1521 net_e1000_igb
Port 3 MAC: 02 70 63 61 70 00

- Sent one packet to the first physical port from my tgen

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f4840659700 (LWP 84336)]
bucket_dequeue_orphans (n_orphans=251, obj_table=0x14fe5af50,
bd=0x14fdad880) at
/root/dpdk/drivers/mempool/bucket/rte_mempool_bucket.c:190
190objptr = bucket_stack_pop(bd->buckets[rte_lcore_id()]);
(gdb) bt
#0  bucket_dequeue_orphans 

Re: [ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack

2019-11-14 Thread Marcelo Ricardo Leitner
On Fri, Nov 08, 2019 at 04:07:14PM -0500, Aaron Conole wrote:
> The act_ct TC module shares a common conntrack and NAT infrastructure
> exposed via netfilter.  It's possible that a packet needs both SNAT and
> DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
> this because it runs through the NAT table twice - once on ingress and
> again after egress.  The act_ct action doesn't have such capability.
> 
> Like netfilter hook infrastructure, we should run through NAT twice to
> keep the symmetry.
> 
> Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
> 
> Signed-off-by: Aaron Conole 
> ---
>  net/sched/act_ct.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index fcc46025e790..f3232a00970f 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
> @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
> bool commit)
>  {
>  #if IS_ENABLED(CONFIG_NF_NAT)
> + int err;
>   enum nf_nat_manip_type maniptype;
>  
>   if (!(ct_action & TCA_CT_ACT_NAT))
> @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>   return NF_ACCEPT;
>   }
>  
> - return ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> + if (err == NF_ACCEPT &&
> + ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) {
> + if (maniptype == NF_NAT_MANIP_SRC)
> + maniptype = NF_NAT_MANIP_DST;
> + else
> + maniptype = NF_NAT_MANIP_SRC;
> +
> + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> + }

I keep thinking about this and I'm not entirely convinced that this
shouldn't be simpler. More like:

if (DNAT)
DNAT
if (SNAT)
SNAT

So it always does DNAT before SNAT, similarly to what iptables would
do on PRE/POSTROUTING chains.

> + return err;
>  #else
>   return NF_ACCEPT;
>  #endif
> -- 
> 2.21.0
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] ovs-vswitchd is crashing while adding dpdk ports to the ovs bridge

2019-11-14 Thread Deepak Gowda
Hi,

I'm trying to add the dpdk ports to the ovs bridge, and while doing so, i
see the ovs daemon crashing with the 'operation not permitted' error,
here's my architecture, i am on a debian vm (pseudo host) that has virt-io
devices attached to it, i'm binding these virt-io's to the dpdk (igb_uio)
and trying to add it to the ovs bridge.
dpdk version: 17.11.2
ovs-version: 2.9.0
i did try looking for cores in /var/lib/systemd/coredump, but there are no
cores there, strange.
And i cannot look for cores in the current directory of that process
because it keeps getting killed (as the daemon tries to come up and crashes
and keeps at it forever)

2019-11-14T16:08:54.438Z|05381|bridge|INFO|bridge br0: added interface br0
on port 65534
2019-11-14T16:08:54.443Z|05382|dpif_netdev|INFO|PMD thread on numa_id: 0,
core id:  3 created.
2019-11-14T16:08:54.443Z|05383|dpif_netdev|INFO|There are 1 pmd threads on
numa node 0
2019-11-14T16:08:54.554Z|05384|netdev_dpdk|WARN|Rx checksum offload is not
supported on port 0
2019-11-14T16:08:54.554Z|05385|netdev_dpdk|ERR|Interface p1 MTU (1500)
setup error: Operation not supported
2019-11-14T16:08:54.554Z|05386|netdev_dpdk|ERR|Interface p1(rxq:1 txq:2)
configure error: Operation not supported
2019-11-14T16:08:54.554Z|05387|dpif_netdev|INFO|Core 3 on numa node 0
assigned port 'p1' rx queue 0 (measured processing cycles 0).
2019-11-14T16:08:54.562Z|05340|daemon_unix(monitor)|ERR|fork child died
before signaling startup (killed (Segmentation fault))
2019-11-14T16:08:54.562Z|05341|daemon_unix(monitor)|WARN|1781 crashes: pid
10551 died, killed (Segmentation fault), core dumped, waiting until 10
seconds since last restart


Should i be on the physical host (bare metal) instead a vm? and bind
physical devices to dpdk and then add it to the bridge?

I'm not sure where i'm going wrong.
Any advise is appreciated.
Thanks,
-Deepak Gowda
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next] net: openvswitch: don't call pad_packet if not necessary

2019-11-14 Thread xiangxia . m . yue
From: Tonghao Zhang 

The nla_put_u16/nla_put_u32 makes sure that
*attrlen is align. The call tree is that:

nla_put_u16/nla_put_u32
  -> nla_putattrlen = sizeof(u16) or sizeof(u32)
  -> __nla_put  attrlen
  -> __nla_reserve  attrlen
  -> skb_put(skb, nla_total_size(attrlen))

nla_total_size returns the total length of attribute
including padding.

Cc: Joe Stringer 
Cc: William Tu 
Signed-off-by: Tonghao Zhang 
---
 net/openvswitch/datapath.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 8ce1f773378d..93d4991ddc1f 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -487,23 +487,17 @@ static int queue_userspace_packet(struct datapath *dp, 
struct sk_buff *skb,
}
 
/* Add OVS_PACKET_ATTR_MRU */
-   if (upcall_info->mru) {
-   if (nla_put_u16(user_skb, OVS_PACKET_ATTR_MRU,
-   upcall_info->mru)) {
-   err = -ENOBUFS;
-   goto out;
-   }
-   pad_packet(dp, user_skb);
+   if (upcall_info->mru &&
+   nla_put_u16(user_skb, OVS_PACKET_ATTR_MRU, upcall_info->mru)) {
+   err = -ENOBUFS;
+   goto out;
}
 
/* Add OVS_PACKET_ATTR_LEN when packet is truncated */
-   if (cutlen > 0) {
-   if (nla_put_u32(user_skb, OVS_PACKET_ATTR_LEN,
-   skb->len)) {
-   err = -ENOBUFS;
-   goto out;
-   }
-   pad_packet(dp, user_skb);
+   if (cutlen > 0 &&
+   nla_put_u32(user_skb, OVS_PACKET_ATTR_LEN, skb->len)) {
+   err = -ENOBUFS;
+   goto out;
}
 
/* Add OVS_PACKET_ATTR_HASH */
-- 
2.23.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ipf: bail out when ipf state is COMPLETED

2019-11-14 Thread Darrell Ball
Hi RongQing/Wang

On Thu, Nov 14, 2019 at 1:26 AM Li RongQing  wrote:

> it is easy to crash ovs when a packet with same id
> hits a list that already reassembled completedly
> but have not been sent out yet, and this packet is
> not duplicate with this hit ipf list due to bigger
> offset
>
> 1  0x7f9fef0ae2d9 in __GI_abort () at abort.c:89
>

Good DOS test.
Fix is correct.

This needs a 'Fixes' tag.
Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")

This will need to be backported to 2.10.

Thanks Darrell

2  0x00464042 in ipf_list_state_transition at lib/ipf.c:545
>
> Co-authored-by: Wang Li 
> Signed-off-by: Wang Li 
> Signed-off-by: Li RongQing 
> ---
>  lib/ipf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/ipf.c b/lib/ipf.c
> index 4cc0f2df6..45c489122 100644
> --- a/lib/ipf.c
> +++ b/lib/ipf.c
> @@ -899,7 +899,8 @@ ipf_handle_frag(struct ipf *ipf, struct dp_packet
> *pkt, ovs_be16 dl_type,
>MIN(max_frag_list_size,
> IPF_FRAG_LIST_MIN_INCREMENT));
>  hmap_insert(>frag_lists, _list->node, hash);
>  ipf_expiry_list_add(>frag_exp_list, ipf_list, now);
> -} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL) {
> +} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL ||
> +   ipf_list->state == IPF_LIST_STATE_COMPLETED) {
>  /* Bail out as early as possible. */
>  return false;
>  } else if (ipf_list->last_inuse_idx + 1 >= ipf_list->size) {
> --
> 2.16.2
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] Require Python 3 and remove support for Python 2.

2019-11-14 Thread 0-day Robot
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: Your local changes to the following files would be overwritten by merge:
utilities/checkpatch.py
Please, commit your changes or stash them before you can merge.
Aborting
Failed to merge in the changes.
Patch failed at 0001 Require Python 3 and remove support for Python 2.
The copy of the patch that failed is found in:
   
/var/lib/jenkins/jobs/upstream_build_from_pw/workspace/OVN/.git/rebase-apply/patch
When you have resolved this problem, run "git am --resolved".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] Require Python 3 and remove support for Python 2.

2019-11-14 Thread numans
From: Numan Siddique 

OVS removed the support for Python 2 in the commit [1].
And its time we do the same for OVN. Python3 is now mandatory,
otherwise configure will fail.

This patch takes care of removing Python 2 references.

[1] - 1ca0323e7c29("Require Python 3 and remove support for Python 2.")

Signed-off-by: Numan Siddique 
---

v1 -> v2
===
  * Addressed the review comments from Mark - Removed HAVE_PYTHON3 var
and fixed ovn-detrace.in exception seen with python3.

 Makefile.am | 12 ++--
 automake.mk |  8 +--
 build-aux/dpdkstrip.py  |  2 +-
 build-aux/sodepends.py  |  2 +-
 build-aux/soexpand.py   |  2 +-
 configure.ac|  2 -
 ipsec/ovs-monitor-ipsec.in  |  2 +-
 m4/ovn.m4   | 89 ++---
 rhel/ovn-fedora.spec.in | 35 +-
 tests/atlocal.in| 23 ++-
 tests/checkpatch.at | 12 ++--
 tests/ovn-controller-vtep.at|  1 -
 tests/ovn-northd.at | 10 ---
 tests/ovn.at| 63 -
 tests/ovsdb-macros.at   |  4 --
 tests/system-kmod-macros.at |  1 -
 tests/system-userspace-macros.at|  1 -
 tutorial/ovs-sandbox|  1 +
 utilities/bugtool/automake.mk   |  2 -
 utilities/checkpatch.py |  3 +-
 utilities/ovn-detrace.in|  6 +-
 utilities/ovn-docker-overlay-driver.in  |  2 +-
 utilities/ovn-docker-underlay-driver.in |  2 +-
 23 files changed, 36 insertions(+), 249 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 1e41e49ea..8eed7a72b 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -71,7 +71,7 @@ endif
 # foo/__init__.pyc will cause Python to ignore foo.py.
 run_python = \
PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH \
-   PYTHONDONTWRITEBYTECODE=yes $(PYTHON)
+   PYTHONDONTWRITEBYTECODE=yes $(PYTHON3)
 
 ALL_LOCAL =
 BUILT_SOURCES =
@@ -165,13 +165,13 @@ ro_shell = printf '\043 Generated automatically -- do not 
modify!-*- buffer-
 
 SUFFIXES += .in
 .in:
-   
$(AM_V_GEN)PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python
 $(PYTHON) $(srcdir)/build-aux/soexpand.py -I$(srcdir) -I$(OVS_SRCDIR) < $< | \
- $(PYTHON) $(srcdir)/build-aux/dpdkstrip.py $(DPDKSTRIP_FLAGS) | \
+   
$(AM_V_GEN)PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python
 $(PYTHON3) $(srcdir)/build-aux/soexpand.py -I$(srcdir) -I$(OVS_SRCDIR) < $< | \
+ $(PYTHON3) $(srcdir)/build-aux/dpdkstrip.py $(DPDKSTRIP_FLAGS) | \
  sed \
-e 's,[@]PKIDIR[@],$(PKIDIR),g' \
-e 's,[@]LOGDIR[@],$(LOGDIR),g' \
-e 's,[@]DBDIR[@],$(DBDIR),g' \
-   -e 's,[@]PYTHON[@],$(PYTHON),g' \
+   -e 's,[@]PYTHON3[@],$(PYTHON3),g' \
-e 's,[@]OVN_RUNDIR[@],$(OVN_RUNDIR),g' \
-e 's,[@]OVSBUILDDIR[@],$(OVSBUILDDIR),g' \
-e 's,[@]VERSION[@],$(VERSION),g' \
@@ -197,7 +197,7 @@ SUFFIXES += .xml
  PKIDIR='$(PKIDIR)' \
  LOGDIR='$(LOGDIR)' \
  DBDIR='$(DBDIR)' \
- PYTHON='$(PYTHON)' \
+ PYTHON3='$(PYTHON3)' \
  RUNDIR='$(RUNDIR)' \
  OVN_RUNDIR='$(OVN_RUNDIR)' \
  VERSION='$(VERSION)' \
@@ -425,7 +425,7 @@ CLEANFILES += flake8-check
 
 include $(srcdir)/manpages.mk
 $(srcdir)/manpages.mk: $(MAN_ROOTS) build-aux/sodepends.py 
$(OVS_SRCDIR)/python/build/soutil.py
-   
@PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python 
$(PYTHON) $(srcdir)/build-aux/sodepends.py -I. -I$(srcdir) -I$(OVS_MANDIR) 
$(MAN_ROOTS) >$(@F).tmp
+   
@PYTHONPATH=$(OVS_SRCDIR)/python$(psep)$$PYTHONPATH$(psep)$(srcdir)/python 
$(PYTHON3) $(srcdir)/build-aux/sodepends.py -I. -I$(srcdir) -I$(OVS_MANDIR) 
$(MAN_ROOTS) >$(@F).tmp
@if cmp -s $(@F).tmp $@; then \
  touch $@; \
  rm -f $(@F).tmp; \
diff --git a/automake.mk b/automake.mk
index ad801f1e5..591e00751 100644
--- a/automake.mk
+++ b/automake.mk
@@ -6,19 +6,17 @@ CLEANFILES += ovn-architecture.7
 #
 # If "python" or "dot" is not available, then we do not add graphical diagram
 # to the documentation.
-if HAVE_PYTHON
 if HAVE_DOT
 OVSDB_DOT = $(run_python) ${OVSDIR}/ovsdb/ovsdb-dot.in
 ovn-nb.gv: ${OVSDIR}/ovsdb/ovsdb-dot.in $(srcdir)/ovn-nb.ovsschema
$(AM_V_GEN)$(OVSDB_DOT) --no-arrows $(srcdir)/ovn-nb.ovsschema > $@
 ovn-nb.pic: ovn-nb.gv ${OVSDIR}/ovsdb/dot2pic
-   $(AM_V_GEN)(dot -T plain < ovn-nb.gv | $(PYTHON) 
${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \
+   $(AM_V_GEN)(dot -T plain < ovn-nb.gv | $(PYTHON3) 
${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \
mv $@.tmp $@
 OVN_NB_PIC = ovn-nb.pic
 OVN_NB_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_NB_PIC)
 CLEANFILES += ovn-nb.gv ovn-nb.pic
 endif
-endif
 
 # OVN northbound 

Re: [ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack

2019-11-14 Thread Paul Blakey
On 11/14/2019 4:22 PM, Roi Dayan wrote:
>
> On 2019-11-08 11:07 PM, Aaron Conole wrote:
>> The act_ct TC module shares a common conntrack and NAT infrastructure
>> exposed via netfilter.  It's possible that a packet needs both SNAT and
>> DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
>> this because it runs through the NAT table twice - once on ingress and
>> again after egress.  The act_ct action doesn't have such capability.
>>
>> Like netfilter hook infrastructure, we should run through NAT twice to
>> keep the symmetry.
>>
>> Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
>>
>> Signed-off-by: Aaron Conole 
>> ---
>>   net/sched/act_ct.c | 13 -
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>> index fcc46025e790..f3232a00970f 100644
>> --- a/net/sched/act_ct.c
>> +++ b/net/sched/act_ct.c
>> @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>>bool commit)
>>   {
>>   #if IS_ENABLED(CONFIG_NF_NAT)
>> +int err;
>>  enum nf_nat_manip_type maniptype;
>>   
>>  if (!(ct_action & TCA_CT_ACT_NAT))
>> @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>>  return NF_ACCEPT;
>>  }
>>   
>> -return ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>> +err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>> +if (err == NF_ACCEPT &&
>> +ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) {
>> +if (maniptype == NF_NAT_MANIP_SRC)
>> +maniptype = NF_NAT_MANIP_DST;
>> +else
>> +maniptype = NF_NAT_MANIP_SRC;
>> +
>> +err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>> +}
>> +return err;
>>   #else
>>  return NF_ACCEPT;
>>   #endif
>>
> +paul

Hi Aaron,

I think I understand the issue and this looks good,

Can you describe the scenario to reproduce this?


Thanks,

Paul.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack

2019-11-14 Thread Roi Dayan



On 2019-11-08 11:07 PM, Aaron Conole wrote:
> The act_ct TC module shares a common conntrack and NAT infrastructure
> exposed via netfilter.  It's possible that a packet needs both SNAT and
> DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
> this because it runs through the NAT table twice - once on ingress and
> again after egress.  The act_ct action doesn't have such capability.
> 
> Like netfilter hook infrastructure, we should run through NAT twice to
> keep the symmetry.
> 
> Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
> 
> Signed-off-by: Aaron Conole 
> ---
>  net/sched/act_ct.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index fcc46025e790..f3232a00970f 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
> @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
> bool commit)
>  {
>  #if IS_ENABLED(CONFIG_NF_NAT)
> + int err;
>   enum nf_nat_manip_type maniptype;
>  
>   if (!(ct_action & TCA_CT_ACT_NAT))
> @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>   return NF_ACCEPT;
>   }
>  
> - return ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> + if (err == NF_ACCEPT &&
> + ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) {
> + if (maniptype == NF_NAT_MANIP_SRC)
> + maniptype = NF_NAT_MANIP_DST;
> + else
> + maniptype = NF_NAT_MANIP_SRC;
> +
> + err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
> + }
> + return err;
>  #else
>   return NF_ACCEPT;
>  #endif
> 

+paul
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.

2019-11-14 Thread Dumitru Ceara
On Thu, Nov 14, 2019 at 2:14 PM Numan Siddique  wrote:
>
> On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara  wrote:
> >
> > Reported-by: Numan Siddique 
> > Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever 
> > possible.")
> > Signed-off-by: Dumitru Ceara 
>
> Thanks Dumitru for the fix. I applied this to master.
>
> Thanks
> Numan

Thanks Numan and sorry for missing this case in the first place.

>
> > ---
> >  northd/ovn-northd.c | 12 
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index d6beb97..6742bc0 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct ovn_port 
> > *op,
> >  }
> >  }
> >
> > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op,
> > -sw_od, 75, lflows);
> > -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op,
> > -sw_od, 75, lflows);
> > +if (!sset_is_empty(_ips_v4)) {
> > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, 
> > sw_op,
> > +sw_od, 75, lflows);
> > +}
> > +if (!sset_is_empty(_ips_v6)) {
> > +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, 
> > sw_op,
> > +sw_od, 75, lflows);
> > +}
> >
> >  sset_destroy(_ips_v4);
> >  sset_destroy(_ips_v6);
> > --
> > 1.8.3.1
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.

2019-11-14 Thread Numan Siddique
On Thu, Nov 14, 2019 at 6:14 PM Dumitru Ceara  wrote:
>
> Reported-by: Numan Siddique 
> Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever 
> possible.")
> Signed-off-by: Dumitru Ceara 

Thanks Dumitru for the fix. I applied this to master.

Thanks
Numan

> ---
>  northd/ovn-northd.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index d6beb97..6742bc0 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>  }
>  }
>
> -build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op,
> -sw_od, 75, lflows);
> -build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op,
> -sw_od, 75, lflows);
> +if (!sset_is_empty(_ips_v4)) {
> +build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op,
> +sw_od, 75, lflows);
> +}
> +if (!sset_is_empty(_ips_v6)) {
> +build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op,
> +sw_od, 75, lflows);
> +}
>
>  sset_destroy(_ips_v4);
>  sset_destroy(_ips_v6);
> --
> 1.8.3.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Good afternoon.

2019-11-14 Thread Engr. Muhammad Haseeb via dev
Hello dear how are you doing ?


Did you receive the previous mail i sent to you?
I wait your response as soon as possible.

Best Regards,
Eng.Muhammad Haseeb
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] ovn-northd: Avoid empty address list when limiting ARP/ND broadcast.

2019-11-14 Thread Dumitru Ceara
Reported-by: Numan Siddique 
Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever 
possible.")
Signed-off-by: Dumitru Ceara 
---
 northd/ovn-northd.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index d6beb97..6742bc0 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -5337,10 +5337,14 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
 }
 }
 
-build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op,
-sw_od, 75, lflows);
-build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op,
-sw_od, 75, lflows);
+if (!sset_is_empty(_ips_v4)) {
+build_lswitch_rport_arp_req_flow_for_ip(_ips_v4, AF_INET, sw_op,
+sw_od, 75, lflows);
+}
+if (!sset_is_empty(_ips_v6)) {
+build_lswitch_rport_arp_req_flow_for_ip(_ips_v6, AF_INET6, sw_op,
+sw_od, 75, lflows);
+}
 
 sset_destroy(_ips_v4);
 sset_destroy(_ips_v6);
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 2/2] Add support for Route Info Option in RA - RFC 4191

2019-11-14 Thread Lorenzo Bianconi
Introduce support for Route Info Option sent in Router
Advertisement according to RFC 4191. Route Info Option are
configured providing route_info in ipv6_ra_configs column of
Logical_Router_Port table. route_info is a comma separated string
where each field provides PRF and prefix for a given route
(e.g: HIGH-aef1::11/48,LOW-aef2::11/96)

Signed-off-by: Lorenzo Bianconi 
---
 controller/pinctrl.c | 80 
 lib/ovn-l7.h | 12 +++
 northd/ovn-northd.c  |  6 
 ovn-nb.xml   | 14 
 tests/ovn.at | 32 --
 5 files changed, 133 insertions(+), 11 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index b736b60ad..51c7f8a36 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -2195,6 +2195,7 @@ struct ipv6_ra_config {
 struct in6_addr rdnss;
 bool has_rdnss;
 struct ds dnssl;
+struct ds route_info;
 };
 
 struct ipv6_ra_state {
@@ -2217,6 +2218,7 @@ ipv6_ra_config_delete(struct ipv6_ra_config *config)
 if (config) {
 destroy_lport_addresses(>prefixes);
 ds_destroy(>dnssl);
+ds_destroy(>route_info);
 free(config);
 }
 }
@@ -2256,6 +2258,7 @@ ipv6_ra_update_config(const struct sbrec_port_binding *pb)
 config->mtu = smap_get_int(>options, "ipv6_ra_mtu", ND_MTU_DEFAULT);
 config->la_flags = IPV6_ND_RA_OPT_PREFIX_ON_LINK;
 ds_init(>dnssl);
+ds_init(>route_info);
 
 const char *address_mode = smap_get(>options, "ipv6_ra_address_mode");
 if (!address_mode) {
@@ -2313,6 +2316,11 @@ ipv6_ra_update_config(const struct sbrec_port_binding 
*pb)
 ds_put_buffer(>dnssl, dnssl, strlen(dnssl));
 }
 
+const char *route_info = smap_get(>options, "ipv6_ra_route_info");
+if (route_info) {
+ds_put_buffer(>route_info, route_info, strlen(route_info));
+}
+
 return config;
 
 fail:
@@ -2420,6 +2428,74 @@ packet_put_ra_dnssl_opt(struct dp_packet *b, ovs_be32 
lifetime,
   prev_l4_size + size));
 }
 
+static void
+packet_put_ra_route_info_opt(struct dp_packet *b, ovs_be32 lifetime,
+ char *route_list)
+{
+size_t prev_l4_size = dp_packet_l4_size(b);
+struct ip6_hdr *nh = dp_packet_l3(b);
+char *t0, *r0 = NULL;
+size_t size = 0;
+
+for (t0 = strtok_r(route_list, ",", ); t0;
+ t0 = strtok_r(NULL, ",", )) {
+struct ovs_nd_route_info *nd_rinfo =
+dp_packet_put_uninit(b, sizeof *nd_rinfo);
+char *t1, *r1 = NULL;
+int index;
+
+for (t1 = strtok_r(t0, "-", ), index = 0; t1;
+ t1 = strtok_r(NULL, "-", ), index++) {
+
+nd_rinfo->type = ND_OPT_ROUTE_INFO;
+nd_rinfo->route_lifetime = lifetime;
+
+switch (index) {
+case 0:
+if (!strcmp(t1, "HIGH")) {
+nd_rinfo->flags = IPV6_ND_RA_OPT_PRF_HIGH;
+} else if (!strcmp(t1, "LOW")) {
+nd_rinfo->flags = IPV6_ND_RA_OPT_PRF_LOW;
+} else {
+nd_rinfo->flags = IPV6_ND_RA_OPT_PRF_NORMAL;
+}
+break;
+case 1: {
+struct lport_addresses route;
+uint8_t plen;
+
+if (!extract_ip_addresses(t1, )) {
+return;
+}
+if (!route.n_ipv6_addrs) {
+destroy_lport_addresses();
+return;
+}
+
+nd_rinfo->prefix_len = route.ipv6_addrs->plen;
+plen = DIV_ROUND_UP(nd_rinfo->prefix_len, 64);
+nd_rinfo->len = 1 + plen;
+dp_packet_put(b, _addrs->network, plen * 8);
+size += sizeof *nd_rinfo + plen * 8;
+
+destroy_lport_addresses();
+index = 0;
+break;
+}
+default:
+return;
+}
+}
+}
+
+nh->ip6_plen = htons(prev_l4_size + size);
+struct ovs_ra_msg *ra = dp_packet_l4(b);
+ra->icmph.icmp6_cksum = 0;
+uint32_t icmp_csum = packet_csum_pseudoheader6(dp_packet_l3(b));
+ra->icmph.icmp6_cksum = csum_finish(csum_continue(icmp_csum, ra,
+  prev_l4_size + size));
+}
+
 /* Called with in the pinctrl_handler thread context. */
 static long long int
 ipv6_ra_send(struct rconn *swconn, struct ipv6_ra_state *ra)
@@ -2459,6 +2535,10 @@ ipv6_ra_send(struct rconn *swconn, struct ipv6_ra_state 
*ra)
 packet_put_ra_dnssl_opt(, htonl(0x),
 ra->config->dnssl.string);
 }
+if (ra->config->route_info.length) {
+packet_put_ra_route_info_opt(, htonl(0x),
+ ra->config->route_info.string);
+}
 
 uint64_t ofpacts_stub[4096 / 8];
 struct ofpbuf ofpacts = 

[ovs-dev] [PATCH ovn 1/2] Add support to Default Router Preference (PRF) - RFC 4191

2019-11-14 Thread Lorenzo Bianconi
Introduce support for Default Router Preference (PRF) in IPv6 Router
Advertisement according to RFC 4191

Signed-off-by: Lorenzo Bianconi 
---
 controller/pinctrl.c | 20 +---
 lib/ovn-l7.h |  4 
 northd/ovn-northd.c  |  7 +++
 ovn-nb.xml   | 12 
 tests/ovn.at | 13 +
 5 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index a90ee73d6..b736b60ad 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -2263,10 +2263,10 @@ ipv6_ra_update_config(const struct sbrec_port_binding 
*pb)
 goto fail;
 }
 if (!strcmp(address_mode, "dhcpv6_stateless")) {
-config->mo_flags = IPV6_ND_RA_FLAG_OTHER_ADDR_CONFIG;
+config->mo_flags |= IPV6_ND_RA_FLAG_OTHER_ADDR_CONFIG;
 config->la_flags |= IPV6_ND_RA_OPT_PREFIX_AUTONOMOUS;
 } else if (!strcmp(address_mode, "dhcpv6_stateful")) {
-config->mo_flags = IPV6_ND_RA_FLAG_MANAGED_ADDR_CONFIG;
+config->mo_flags |= IPV6_ND_RA_FLAG_MANAGED_ADDR_CONFIG;
 } else if (!strcmp(address_mode, "slaac")) {
 config->la_flags |= IPV6_ND_RA_OPT_PREFIX_AUTONOMOUS;
 } else {
@@ -2274,6 +2274,13 @@ ipv6_ra_update_config(const struct sbrec_port_binding 
*pb)
 goto fail;
 }
 
+const char *prf = smap_get(>options, "ipv6_ra_prf");
+if (!strcmp(prf, "HIGH")) {
+config->mo_flags |= IPV6_ND_RA_OPT_PRF_HIGH;
+} else if (!strcmp(prf, "LOW")) {
+config->mo_flags |= IPV6_ND_RA_OPT_PRF_LOW;
+}
+
 const char *prefixes = smap_get(>options, "ipv6_ra_prefixes");
 if (prefixes && !extract_ip_addresses(prefixes, >prefixes)) {
 VLOG_WARN("Invalid IPv6 prefixes: %s", prefixes);
@@ -2423,10 +2430,17 @@ ipv6_ra_send(struct rconn *swconn, struct ipv6_ra_state 
*ra)
 
 uint64_t packet_stub[128 / 8];
 struct dp_packet packet;
+uint16_t router_lt = IPV6_ND_RA_LIFETIME;
+
+if (!router_lt) {
+/* Reset PRF to MEDIUM if router lifetime is not set */
+ra->config->mo_flags &= ~IPV6_ND_RA_OPT_PRF_LOW;
+}
+
 dp_packet_use_stub(, packet_stub, sizeof packet_stub);
 compose_nd_ra(, ra->config->eth_src, ra->config->eth_dst,
 >config->ipv6_src, >config->ipv6_dst,
-255, ra->config->mo_flags, htons(IPV6_ND_RA_LIFETIME), 0, 0,
+255, ra->config->mo_flags, htons(router_lt), 0, 0,
 ra->config->mtu);
 
 for (int i = 0; i < ra->config->prefixes.n_ipv6_addrs; i++) {
diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
index 5fc370bf5..fd898b0dc 100644
--- a/lib/ovn-l7.h
+++ b/lib/ovn-l7.h
@@ -287,6 +287,10 @@ nd_ra_opts_destroy(struct hmap *nd_ra_opts)
 #define IPV6_ND_RA_OPT_PREFIX_VALID_LIFETIME0x
 #define IPV6_ND_RA_OPT_PREFIX_PREFERRED_LIFETIME0x
 
+#define IPV6_ND_RA_OPT_PRF_NORMAL   0x00
+#define IPV6_ND_RA_OPT_PRF_HIGH 0x08
+#define IPV6_ND_RA_OPT_PRF_LOW  0x18
+
 static inline void
 nd_ra_opts_init(struct hmap *nd_ra_opts)
 {
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index d6beb9768..a1eaa9071 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -6757,6 +6757,13 @@ copy_ra_to_sb(struct ovn_port *op, const char 
*address_mode)
 
 smap_add(, "ipv6_ra_src_eth", op->lrp_networks.ea_s);
 
+const char *prf = smap_get(>nbrp->ipv6_ra_configs, "prf");
+if (!prf || (strcmp(prf, "HIGH") && strcmp(prf, "LOW"))) {
+smap_add(, "ipv6_ra_prf", "MEDIUM");
+} else {
+smap_add(, "ipv6_ra_prf", prf);
+}
+
 sbrec_port_binding_set_options(op->sb, );
 smap_destroy();
 }
diff --git a/ovn-nb.xml b/ovn-nb.xml
index d8f3237fc..89d23ad7b 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1859,6 +1859,18 @@
 
   
 
+  
+Default Router Preference (PRF) indicates whether to prefer this
+router over other default routers (RFC 4191).
+Possible values are:
+
+
+  HIGH: mapped to 0x01 in RA PRF field
+  MEDIUM: mapped to 0x00 in RA PRF field
+  LOW: mapped to 0x11 in RA PRF field
+
+  
+
   
 The recommended MTU for the link. Default is 0, which means no MTU
 Option will be included in RA packet replied by ovn-controller.
diff --git a/tests/ovn.at b/tests/ovn.at
index 4a10a11e9..85d539c78 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -11504,23 +11504,28 @@ ra_test 05dc 00 0 0 c0 40 
aef0
 ovn-nbctl --wait=hv set Logical_Router_port ro-sw networks='aef0\:\:1/64 
fd0f\:\:1/48'
 ra_test 05dc 00 0 0 c0 40 aef0 30 
fd0f
 
+# Test PRF for default gw
+ovn-nbctl --wait=hv set Logical_Router_port ro-sw ipv6_ra_configs:prf="LOW"
+ra_test 05dc 18 0 0 c0 40 aef0 30 
fd0f
+
 # Now test for RDNSS
 ovn-nbctl --wait=hv set Logical_Router_port 

[ovs-dev] [PATCH ovn 0/2] Add support for PRF and Route Info Option in RA (RFC 4191)

2019-11-14 Thread Lorenzo Bianconi
Lorenzo Bianconi (2):
  Add support to Default Router Preference (PRF) - RFC 4191
  Add support for Route Info Option in RA - RFC 4191

 controller/pinctrl.c | 100 +--
 lib/ovn-l7.h |  16 +++
 northd/ovn-northd.c  |  13 ++
 ovn-nb.xml   |  26 +++
 tests/ovn.at |  35 ++-
 5 files changed, 177 insertions(+), 13 deletions(-)

-- 
2.21.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.

2019-11-14 Thread Eelco Chaudron



On 13 Nov 2019, at 14:23, Ilya Maximets wrote:


On 13.11.2019 1:21, William Tu wrote:
On Thu, Nov 7, 2019 at 3:36 AM Ilya Maximets  
wrote:


Until now there was only two options for XDP mode in OVS: SKB or 
DRV.

i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.

Devices like 'veth' interfaces in Linux supports native XDP, but
doesn't support zero-copy mode.  This case can not be covered by
existing API and we have to use slower generic XDP for such devices.
There are few more issues, e.g. TCP is not supported in generic XDP
mode for veth interfaces due to kernel limitations, however it is
supported in native mode.

This change introduces ability to use native XDP without zero-copy
along with best-effort configuration option that enabled by default.
In best-effort case OVS will sequentially try different modes 
starting
from the fastest one and will choose the first acceptable for 
current

interface.  This will guarantee the best possible performance.

If user will want to choose specific mode, it's still possible by
setting the 'options:xdp-mode'.

This change additionally changes the API by renaming the 
configuration

knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
themselves to be more user-friendly.

The full list of currently supported modes:
  * native-with-zerocopy - former DRV
  * native   - new one, DRV without zero-copy
  * generic  - former SKB
  * best-effort  - new one, chooses the best available from
   3 above modes

Since 'best-effort' is a default mode, users will not need to
explicitely set 'xdp-mode' in most cases.

TCP related tests enabled back in system afxdp testsuite, because
'best-effort' will choose 'native' mode for veth interfaces
and this mode has no issues with TCP.

Signed-off-by: Ilya Maximets 
---


I'm thinking about adding some tests case and patches depends on
this API change. If there is no more comment, I will apply to master.


I think, it might be good to wait for some comments from Eelco.

@Eelco, do you have a plan to review the code? Do you agree with the
API update?


I’m rather busy this week, will try to review it next week, and I’m 
ok with the API change.


The only thing that bothers me is the worse performance of the TAP 
interface with the new default config.


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ipf: bail out when ipf state is COMPLETED

2019-11-14 Thread Li RongQing
it is easy to crash ovs when a packet with same id
hits a list that already reassembled completedly
but have not been sent out yet, and this packet is
not duplicate with this hit ipf list due to bigger
offset

1  0x7f9fef0ae2d9 in __GI_abort () at abort.c:89
2  0x00464042 in ipf_list_state_transition at lib/ipf.c:545

Co-authored-by: Wang Li 
Signed-off-by: Wang Li 
Signed-off-by: Li RongQing 
---
 lib/ipf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/ipf.c b/lib/ipf.c
index 4cc0f2df6..45c489122 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -899,7 +899,8 @@ ipf_handle_frag(struct ipf *ipf, struct dp_packet *pkt, 
ovs_be16 dl_type,
   MIN(max_frag_list_size, IPF_FRAG_LIST_MIN_INCREMENT));
 hmap_insert(>frag_lists, _list->node, hash);
 ipf_expiry_list_add(>frag_exp_list, ipf_list, now);
-} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL) {
+} else if (ipf_list->state == IPF_LIST_STATE_REASS_FAIL ||
+   ipf_list->state == IPF_LIST_STATE_COMPLETED) {
 /* Bail out as early as possible. */
 return false;
 } else if (ipf_list->last_inuse_idx + 1 >= ipf_list->size) {
-- 
2.16.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 ovn 0/3] Improve ovn-detrace support for parsing OpenFlow cookies.

2019-11-14 Thread Dumitru Ceara
On Wed, Nov 13, 2019 at 9:02 PM Han Zhou  wrote:
>
>
>
> On Wed, Nov 13, 2019 at 1:51 AM Dumitru Ceara  wrote:
> >
> > Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow 
> > translations.")
> > added support for more types of cookies (partial SB uuids) that are stored
> > in OpenFlow entries created by ovn-controller.
> >
> > The last patch in this series implements support for parsing all these
> > different cookies in ovn-detrace too.
> >
> > The first patch is a bug fix required as ovn-detrace was broken after
> > moving OVN to its own separate rundir.
> >
> > The second patch fixes line parsing in ovn-detrace.
> >
> > CC: Han Zhou 
> > Signed-off-by: Dumitru Ceara 
> >
> > Dumitru Ceara (3):
> >   ovn-detrace: Fix rundir.
> >   ovn-detrace: Fix line parsing.
> >   ovn-detrace: Add support for other types of SB cookies.
> >
> >
> >  utilities/ovn-detrace.in |  201 
> > --
> >  1 file changed, 138 insertions(+), 63 deletions(-)
> >
> >
> > ---
> > v4:
> > - Address Han's comments:
> >   - Fix printing of last logical flow information.
> > - Added Acked-by from Mark on patches 1 & 3. I didn't add it to patch 2
> >   because I significantly changed patch 2 in v4.
> > v3:
> > - Remove stray "%s".
> > - Rename pprint to print_p to avoid name clashes with the library function.
> > v2:
> > - Address Mark's comments:
> >   - properly handle potential collisions between cookie -> UUID mappings.
> >   - when looking up SB records by cookie, match on the first part of the
> > UUID.
> > - Further refactor ovn-detrace to simplify prefixing outputs.
> > - Change print statements to print() calls.
> >
>
> Thanks Dumitru. I applied the series to master.

Thanks Han and Mark for reviewing and merging this!

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 ovn 2/2] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-14 Thread Dumitru Ceara
On Wed, Nov 13, 2019 at 8:52 PM Han Zhou  wrote:
>
>
>
> On Wed, Nov 13, 2019 at 8:14 AM Dumitru Ceara  wrote:
> >
> > On Wed, Nov 13, 2019 at 4:30 PM Han Zhou  wrote:
> > >
> > >
> > >
> > > On Wed, Nov 13, 2019 at 2:42 AM Dumitru Ceara  wrote:
> > >>
> > >> On Tue, Nov 12, 2019 at 8:50 PM Han Zhou  wrote:
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > On Tue, Nov 12, 2019 at 10:10 AM Dumitru Ceara  
> > >> > wrote:
> > >> > >
> > >> > > On Tue, Nov 12, 2019 at 6:17 PM Han Zhou  wrote:
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > On Tue, Nov 12, 2019 at 2:29 AM Dumitru Ceara  
> > >> > > > wrote:
> > >> > > > >
> > >> > > > > ARP request and ND NS packets for router owned IPs were being
> > >> > > > > flooded in the complete L2 domain (using the MC_FLOOD multicast 
> > >> > > > > group).
> > >> > > > > However this creates a scaling issue in scenarios where 
> > >> > > > > aggregation
> > >> > > > > logical switches are connected to more logical routers (~350). 
> > >> > > > > The
> > >> > > > > logical pipelines of all routers would have to be executed 
> > >> > > > > before the
> > >> > > > > packet is finally replied to by a single router, the owner of 
> > >> > > > > the IP
> > >> > > > > address.
> > >> > > > >
> > >> > > > > This commit limits the broadcast domain by bypassing the L2 
> > >> > > > > Lookup stage
> > >> > > > > for ARP requests that will be replied by a single router. The 
> > >> > > > > packets
> > >> > > > > are forwarded only to the router port that owns the target IP 
> > >> > > > > address.
> > >> > > > >
> > >> > > > > IPs that are owned by the routers and for which this fix applies 
> > >> > > > > are:
> > >> > > > > - IP addresses configured on the router ports.
> > >> > > > > - VIPs.
> > >> > > > > - NAT IPs.
> > >> > > > >
> > >> > > > > Reported-at: https://bugzilla.redhat.com/1756945
> > >> > > > > Reported-by: Anil Venkata 
> > >> > > > > Signed-off-by: Dumitru Ceara 
> > >> > > > >
> > >> > > > > ---
> > >> > > > > v7:
> > >> > > > > - Address Han's comments:
> > >> > > > > - Remove flooding for all ARPs received on VLAN networks. To 
> > >> > > > > avoid
> > >> > > > >   that we now identify self originated (G)ARPs by matching 
> > >> > > > > on source
> > >> > > > >   MAC address too.
> > >> > > > > - Rename REGBIT_NOT_VXLAN to FLAGBIT_NOT_VXLAN.
> > >> > > > > - Fix ovn-sb manpage.
> > >> > > > > - Split patch in a series of 2:
> > >> > > > > - patch1: fixes the get_router_load_balancer_ips() function.
> > >> > > > > - patch2: limits the ARP/ND broadcast domain.
> > >> > > > > v6:
> > >> > > > > - Address Han's comments:
> > >> > > > > - remove flooding of ARPs targeting OVN owned IP addresses.
> > >> > > > > - update ovn-architecture documentation.
> > >> > > > > - rename ARP handling functions.
> > >> > > > > - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to 
> > >> > > > > take into
> > >> > > > > account the new way of forwarding ARPs.
> > >> > > > > - Also, properly deal with ARP packets on VLAN-backed networks.
> > >> > > > > v5: Address Numan's comments: update comments & make autotest 
> > >> > > > > more
> > >> > > > > robust.
> > >> > > > > v4: Rebase.
> > >> > > > > v3: Properly deal with VXLAN traffic. Address review comments 
> > >> > > > > from
> > >> > > > > Numan (add autotests). Fix function 
> > >> > > > > get_router_load_balancer_ips.
> > >> > > > > Rebase -> deal with IPv6 NAT too.
> > >> > > > > v2: Move ARP broadcast domain limiting to table 
> > >> > > > > S_SWITCH_IN_L2_LKUP to
> > >> > > > > address localnet ports too.
> > >> > > > > ---
> > >> > > > >  northd/ovn-northd.8.xml |   14 ++
> > >> > > > >  northd/ovn-northd.c |  230 
> > >> > > > > +++
> > >> > > > >  ovn-architecture.7.xml  |   19 +++
> > >> > > > >  tests/ovn.at|  307 
> > >> > > > > +--
> > >> > > > >  4 files changed, 530 insertions(+), 40 deletions(-)
> > >> > > > >
> > >> > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > >> > > > > index 0a33dcd..344cc0d 100644
> > >> > > > > --- a/northd/ovn-northd.8.xml
> > >> > > > > +++ b/northd/ovn-northd.8.xml
> > >> > > > > @@ -1005,6 +1005,20 @@ output;
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > +Priority-80 flows for each port connected to a logical 
> > >> > > > > router
> > >> > > > > +matching self originated GARP/ARP request/ND packets. 
> > >> > > > > These packets
> > >> > > > > +are flooded to the MC_FLOOD which contains 
> > >> > > > > all logical
> > >> > > > > +ports.
> > >> > > > > +  
> > >> > > > > +
> > >> > > > > +  
> > >> > > > > +Priority-75 flows for each IP address/VIP/NAT address 
> > >> > > > > owned by a
> > >> > > > > +router port connected to the switch. These flows match 
> > >> > > > > ARP requests
> > >> > > > > +and ND