Re: [ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds

2019-11-08 Thread Numan Siddique
On Fri, Nov 8, 2019 at 6:09 PM Ilya Maximets  wrote:
>
> On 08.11.2019 13:12, Numan Siddique wrote:
> > Hi Ilya,
>
> Hi Numan,
>
> Comments inline.
>
> Best regards, Ilya Maximets.
>
> >
> > If you get some time, could you please take a look at this patch ?
> >
> > Thanks
> > Numan
> >
> >
> > On Fri, Nov 8, 2019 at 4:13 PM  wrote:
> >>
> >> From: Numan Siddique 
> >>
> >> The below failure is seen
> >>
> >> 
> >> checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3
> >> checking where Python six library is available... configure: error: 
> >> Missing Python six library.
> >> The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1.
> >> 
> >>
> >> This patch fixes it.
> >>
> >> Signed-off-by: Numan Siddique 
> >> ---
> >>   .travis/osx-prepare.sh | 6 ++
> >>   1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh
> >> index 4725fd829..7f639a62e 100755
> >> --- a/.travis/osx-prepare.sh
> >> +++ b/.travis/osx-prepare.sh
> >> @@ -1,7 +1,5 @@
> >>   #!/bin/bash
> >>   set -ev
> >> -pip2 install --user six
> >> -pip2 install --user --upgrade docutils
> >> +pip3 install --user six
> >> +pip3 install --user --upgrade docutils
>
> This is required because OVS requires python3 now.
> This might make sense to point to that fact in commit-message.
>
> >>
>
> This empty line should be removed too as it is the last in a file.
>
> >> -brew update || true
> >> -brew uninstall libtool && brew install libtool || true
>
> This part is just an optimization. Doesn't really related to this fix.
> Some details in a corresponding OVS commit:
>  3bfc9c1c30d5 ("travis: Drop OSX workarounds.")
>
> I'm OK with the change in general.  It might be better to mention
> in commit message why libtool update was also removed.
>
> Otherwise,
> Acked-by: Ilya Maximets 

Hi Ilya,

Thanks for the review.

I applied the below modified as per  your suggestions


>From 9ec3bed0d4ec62f08835ac1b04b1c65f332934bf Mon Sep 17 00:00:00 2001
From: Numan Siddique 
Date: Fri, 8 Nov 2019 00:46:18 +0530
Subject: [PATCH] travis: Fix CI failure for osx builds

The below failure is seen


checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3
checking where Python six library is available... configure: error:
Missing Python six library.
The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1.


This is because, OVS requires python3 now. This patch fixes it by installing
six library using pip3.

This patch also optimizes the build by removing the libtool update.
As per this commit [1] in OVS, it is no longer required and it would
save 4-6 minutes of build time.

[1] - 3bfc9c1c30d5 ("travis: Drop OSX workarounds.")

Acked-by: Ilya Maximets 
Signed-off-by: Numan Siddique 
---
 .travis/osx-prepare.sh | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh
index 4725fd829..78d5bb579 100755
--- a/.travis/osx-prepare.sh
+++ b/.travis/osx-prepare.sh
@@ -1,7 +1,4 @@
 #!/bin/bash
 set -ev
-pip2 install --user six
-pip2 install --user --upgrade docutils
-
-brew update || true
-brew uninstall libtool && brew install libtool || true
+pip3 install --user six
+pip3 install --user --upgrade docutils
-- 
2.23.0
*
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v6] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-08 Thread Han Zhou
On Fri, Nov 8, 2019 at 6:38 AM Dumitru Ceara  wrote:
>
> ARP request and ND NS packets for router owned IPs were being
> flooded in the complete L2 domain (using the MC_FLOOD multicast group).
> However this creates a scaling issue in scenarios where aggregation
> logical switches are connected to more logical routers (~350). The
> logical pipelines of all routers would have to be executed before the
> packet is finally replied to by a single router, the owner of the IP
> address.
>
> This commit limits the broadcast domain by bypassing the L2 Lookup stage
> for ARP requests that will be replied by a single router. The packets
> are forwarded only to the router port that owns the target IP address.
>
> IPs that are owned by the routers and for which this fix applies are:
> - IP addresses configured on the router ports.
> - VIPs.
> - NAT IPs.
>
> This commit also fixes function get_router_load_balancer_ips() which
> was incorrectly returning a single address_family even though the
> IP set could contain a mix of IPv4 and IPv6 addresses.
>
> Reported-at: https://bugzilla.redhat.com/1756945
> Reported-by: Anil Venkata 
> Signed-off-by: Dumitru Ceara 
>
> ---
> v6:
> - Address Han's comments:
>   - remove flooding of ARPs targeting OVN owned IP addresses.
>   - update ovn-architecture documentation.
>   - rename ARP handling functions.
>   - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to take into
> account the new way of forwarding ARPs.
> - Also, properly deal with ARP packets on VLAN-backed networks.

I am confused by this additional VLAN related change. VLAN is just handled
as bridged logical switch where a localnet port is used as *inport*. It
seems to me no difference from regular localnet port no matter with or
without VLAN tag. When an ARP request is going through the ingress pipeline
of the bridged logical switch, the logic of bypassing the flooding added by
this patch should just apply, right? It is also very common scenario that
the *aggregation switch* for the routers is an external physical network
backed by VLAN. I think the purpose of this patch is to make sure scenario
scale. Did I misunderstand anything here?

In addition, the below macro definition may be renamed to FLAGBIT_...,
because it is for the bits of MFF_LOG_FLAGS, which is one of the
pre-defined logical fields, instead of for the MFF_LOG_REG0-9 registers.
>
> +#define REGBIT_NOT_VXLAN "flags[1] == 0"
> +#define REGBIT_NOT_VLAN "flags[7] == 0"
> +

The other part looks good to me. Thanks for simply the patch.

Han
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] dpif-netdev: log rxq assignment in isolated pmd

2019-11-08 Thread Gowrishankar Muthukrishnan
There is no log about isolated rxq assignment in a pmd today, which
sometimes could be useful to trace rxq/pmd pinning, when debugging
with log. Ovs-appctl dpif-netdev/pmd-rxq-show reports about it
already, but logging is helpful to trace pinning in time.

Changes:
  v3: correction on pmd unref and numa_id.

Reported-at: https://bugzilla.redhat.com/1728616
Signed-off-by: Gowrishankar Muthukrishnan 
---
 lib/dpif-netdev.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4546b55e8..aeae66aea 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4572,6 +4572,10 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) 
OVS_REQUIRES(dp->port_mutex)
 } else {
 q->pmd = pmd;
 pmd->isolated = true;
+VLOG_INFO("Core %d on numa node %d assigned port \'%s\' "
+  "rx queue %d.", pmd->core_id, pmd->numa_id,
+  netdev_rxq_get_name(q->rx),
+  netdev_rxq_get_queue_id(q->rx));
 dp_netdev_pmd_unref(pmd);
 }
 } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) {
-- 
2.17.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 2/3 ovn] OVN ACL: Allow ct_mark and ct_label values to be set from register as well

2019-11-08 Thread 0-day Robot
Bleep bloop.  Greetings Ankur Sharma, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 117 characters long (recommended limit is 79)
#150 FILE: ovn-sb.xml:1248:
ct_commit(ct_mark=(value[/mask] OR regX OR 
xregX OR xxregX));

WARNING: Line is 118 characters long (recommended limit is 79)
#155 FILE: ovn-sb.xml:1253:
ct_commit(ct_label=(value[/mask] OR regX OR 
xregX OR xxregX));

WARNING: Line is 116 characters long (recommended limit is 79)
#160 FILE: ovn-sb.xml:1258:
ct_commit(ct_mark=(value[/mask] OR regX OR 
xregX OR xxregX),

WARNING: Line is 109 characters long (recommended limit is 79)
#161 FILE: ovn-sb.xml:1259:
ct_label=(value[/mask] OR regX OR 
xregX OR xxregX ));

Lines checked: 236, Warnings: 4, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 1/3 ovn] OVN ACL: Replace the usage of ct_label with ct_mark

2019-11-08 Thread 0-day Robot
Bleep bloop.  Greetings Ankur Sharma, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 141 characters long (recommended limit is 79)
#42 FILE: Documentation/tutorials/ovn-openstack.rst:1231:
   6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked 
== 0 && (inport == "ap" && ip4), priority 2002, uuid a12b39f0

WARNING: Line is 202 characters long (recommended limit is 79)
#51 FILE: Documentation/tutorials/ovn-openstack.rst:1300:
   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (outport == "cp" && ip4 && ip4.src == 
$as_ip4_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid a746fa0d

WARNING: Line is 176 characters long (recommended limit is 79)
#60 FILE: Documentation/tutorials/ovn-openstack.rst:1540:
   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (outport == "dp" && ip4 && ip4.src == 0.0.0.0/0 && icmp4), 
priority 2002, uuid b860fc9f

WARNING: Line is 141 characters long (recommended limit is 79)
#69 FILE: Documentation/tutorials/ovn-openstack.rst:1652:
   6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked 
== 0 && (inport == "ap" && ip6), priority 2002, uuid 7fdd607e

WARNING: Line is 202 characters long (recommended limit is 79)
#78 FILE: Documentation/tutorials/ovn-openstack.rst:1710:
   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (outport == "cp" && ip6 && ip6.src == 
$as_ip6_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid 12fc96f9

WARNING: Line is 227 characters long (recommended limit is 79)
#87 FILE: Documentation/tutorials/ovn-openstack.rst:1916:
   6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && ct.blocked 
== 0 && (inport == "ap" && ip4 && ip4.dst == {255.255.255.255, 10.1.1.0/24} && 
udp && udp.src == 68 && udp.dst == 67), priority 2002, uuid 9c90245d

Lines checked: 365, Warnings: 6, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 2/3 ovn] OVN ACL: Allow ct_mark and ct_label values to be set from register as well

2019-11-08 Thread Ankur Sharma
OVN allows only an integer (or masked integer) to be assigned to
ct_mark and ct_label.

This patch, enhances the parser code to allow ct_mark and ct_label
to be assigned from registers as well.

Signed-off-by: Ankur Sharma 
---
 include/ovn/actions.h |  3 +++
 lib/actions.c | 72 ---
 ovn-sb.xml| 41 +
 tests/ovn.at  | 31 ++
 4 files changed, 126 insertions(+), 21 deletions(-)

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index f4997e9..dda9a66 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -218,8 +218,11 @@ struct ovnact_ct_next {
 /* OVNACT_CT_COMMIT. */
 struct ovnact_ct_commit {
 struct ovnact ovnact;
+bool is_ct_mark_reg, is_ct_label_reg; /* If the value is from a register */
 uint32_t ct_mark, ct_mark_mask;
 ovs_be128 ct_label, ct_label_mask;
+enum mf_field_id ct_mark_reg, ct_label_reg;
+uint16_t ct_mark_reg_bits, ct_label_reg_bits;
 };
 
 /* OVNACT_CT_DNAT, OVNACT_CT_SNAT. */
diff --git a/lib/actions.c b/lib/actions.c
index a999a4f..2b00469 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -634,8 +634,21 @@ parse_ct_commit_arg(struct action_context *ctx,
 } else if (ctx->lexer->token.type == LEX_T_MASKED_INTEGER) {
 cc->ct_mark = ntohll(ctx->lexer->token.value.integer);
 cc->ct_mark_mask = ntohll(ctx->lexer->token.mask.integer);
+} else if (ctx->lexer->token.type == LEX_T_ID) {
+
+cc->ct_mark_mask = UINT32_MAX;
+
+const struct mf_field *mf = mf_from_name(ctx->lexer->token.s);
+if (mf && mf_is_register(mf->id)) {
+cc->is_ct_mark_reg = true;
+cc->ct_mark_reg = mf->id;
+   } else {
+   lexer_syntax_error(ctx->lexer, "invalid field name: %s",
+  ctx->lexer->token.s);
+  return;
+   }
 } else {
-lexer_syntax_error(ctx->lexer, "expecting integer");
+lexer_syntax_error(ctx->lexer, "invalid token type");
 return;
 }
 lexer_get(ctx->lexer);
@@ -649,9 +662,21 @@ parse_ct_commit_arg(struct action_context *ctx,
 } else if (ctx->lexer->token.type == LEX_T_MASKED_INTEGER) {
 cc->ct_label = ctx->lexer->token.value.be128_int;
 cc->ct_label_mask = ctx->lexer->token.mask.be128_int;
+} else if (ctx->lexer->token.type == LEX_T_ID) {
+
+   const struct mf_field *mf = mf_from_name(ctx->lexer->token.s);
+   if (mf && mf_is_register(mf->id)) {
+   cc->is_ct_label_reg = true;
+   cc->ct_label_reg = mf->id;
+   } else {
+  lexer_syntax_error(ctx->lexer, "invalid field name: %s",
+ ctx->lexer->token.s);
+  return;
+   }
+
 } else {
-lexer_syntax_error(ctx->lexer, "expecting integer");
-return;
+   lexer_syntax_error(ctx->lexer, "invalid token type");
+   return;
 }
 lexer_get(ctx->lexer);
 } else {
@@ -719,15 +744,42 @@ encode_CT_COMMIT(const struct ovnact_ct_commit *cc,
 size_t set_field_offset = ofpacts->size;
 ofpbuf_pull(ofpacts, set_field_offset);
 
-if (cc->ct_mark_mask) {
+if (cc->is_ct_mark_reg) {
+struct ofpact_reg_move *move = ofpact_put_REG_MOVE(ofpacts);
+const struct mf_field *src_reg = mf_from_id(cc->ct_mark_reg);
+const struct mf_field *ct_mark = mf_from_id(MFF_CT_MARK);
+
+move->src.field = src_reg;
+move->src.ofs = 0;
+move->src.n_bits = src_reg->n_bits < ct_mark->n_bits ?
+   src_reg->n_bits : ct_mark->n_bits;
+move->dst.field = mf_from_id(MFF_CT_MARK);
+move->dst.ofs = 0;
+move->dst.n_bits = src_reg->n_bits < ct_mark->n_bits ?
+   src_reg->n_bits : ct_mark->n_bits;
+} else if (cc->ct_mark_mask) {
 const ovs_be32 value = htonl(cc->ct_mark);
 const ovs_be32 mask = htonl(cc->ct_mark_mask);
-ofpact_put_set_field(ofpacts, mf_from_id(MFF_CT_MARK), , );
-}
-
-if (!ovs_be128_is_zero(cc->ct_label_mask)) {
-ofpact_put_set_field(ofpacts, mf_from_id(MFF_CT_LABEL), >ct_label,
- >ct_label_mask);
+ofpact_put_set_field(ofpacts, mf_from_id(MFF_CT_MARK), ,
+ );
+}
+
+if (cc->is_ct_label_reg) {
+struct ofpact_reg_move *move = ofpact_put_REG_MOVE(ofpacts);
+const struct mf_field *src_reg = mf_from_id(cc->ct_label_reg);
+const struct mf_field *ct_label = mf_from_id(MFF_CT_LABEL);
+
+move->src.field = src_reg;
+move->src.ofs = 0;
+move->src.n_bits = src_reg->n_bits < ct_label->n_bits ?
+   src_reg->n_bits : ct_label->n_bits;
+move->dst.field = ct_label;
+move->dst.ofs 

[ovs-dev] [PATCH v4 1/3 ovn] OVN ACL: Replace the usage of ct_label with ct_mark

2019-11-08 Thread Ankur Sharma
OVN ACL implementation used ct_label to indicate if a previosuly
allowed connection should not be allowed anymore and vice versa.

However, ct_label is a 128 bit value and we should rather leverage
on ct_mark which is a 32 bit value.

Using ct_mark for this purpose, allows us to use ct_label for storing
other values like, identifier for corresponidng OVN ACL/Security group etc.

Signed-off-by: Ankur Sharma 
---
 Documentation/tutorials/ovn-openstack.rst | 14 -
 lib/logical-fields.c  |  3 ++
 northd/ovn-northd.8.xml   | 14 -
 northd/ovn-northd.c   | 50 ---
 tests/ovn.at  | 11 +++
 5 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/Documentation/tutorials/ovn-openstack.rst 
b/Documentation/tutorials/ovn-openstack.rst
index 3ef0523..5134406 100644
--- a/Documentation/tutorials/ovn-openstack.rst
+++ b/Documentation/tutorials/ovn-openstack.rst
@@ -60,7 +60,7 @@ packaging for developers, in a way that allows you to follow 
along
 with the tutorial in full.
 
 Unless you have a spare computer laying about, it's easiest to install
-DevStacck in a virtual machine.  This tutorial was built using a VM
+DevStack in a virtual machine.  This tutorial was built using a VM
 implemented by KVM and managed by virt-manager.  I recommend
 configuring the VM configured for the x86-64 architecture, 6 GB RAM, 2
 VCPUs, and a 20 GB virtual disk.
@@ -1228,7 +1228,7 @@ as the output port::
 
   ct_next(ct_state=est|trk /* default (use --ct to customize) */)
   ---
-   6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct_label.blocked == 0 && (inport == "ap" && ip4), priority 2002, uuid a12b39f0
+   6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (inport == "ap" && ip4), priority 2002, uuid a12b39f0
   next;
   13. ls_in_l2_lkup (ovn-northd.c:3529): eth.dst == fa:16:3e:f6:e2:8f, 
priority 50, uuid c43ead31
   outport = "17d870";
@@ -1297,7 +1297,7 @@ Finally the logical switch for ``n2`` runs through the 
same logic as
 
   ct_next(ct_state=est|trk /* default (use --ct to customize) */)
   ---
-   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct_label.blocked == 0 && (outport == "cp" && ip4 && ip4.src == 
$as_ip4_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid a746fa0d
+   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (outport == "cp" && ip4 && ip4.src == 
$as_ip4_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid a746fa0d
   next;
7. ls_out_port_sec_ip (ovn-northd.c:2364): outport == "cp" && eth.dst == 
fa:16:3e:89:f2:36 && ip4.dst == {255.255.255.255, 224.0.0.0/4, 10.1.2.7}, 
priority 90, uuid 4d9862b5
   next;
@@ -1537,7 +1537,7 @@ firewall and is output to ``d``::
 
   ct_next(ct_state=est|trk /* default (use --ct to customize) */)
   ---
-   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct_label.blocked == 0 && (outport == "dp" && ip4 && ip4.src == 0.0.0.0/0 && 
icmp4), priority 2002, uuid b860fc9f
+   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (outport == "dp" && ip4 && ip4.src == 0.0.0.0/0 && icmp4), 
priority 2002, uuid b860fc9f
   next;
7. ls_out_port_sec_ip (ovn-northd.c:2364): outport == "dp" && eth.dst == 
fa:16:3e:c1:f5:a2 && ip4.dst == {255.255.255.255, 224.0.0.0/4, 10.0.0.6}, 
priority 90, uuid 15655a98
   next;
@@ -1649,7 +1649,7 @@ closely to those for IPv4 which we already discussed back 
under
 
   ct_next(ct_state=est|trk /* default (use --ct to customize) */)
   ---
-   6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct_label.blocked == 0 && (inport == "ap" && ip6), priority 2002, uuid 7fdd607e
+   6. ls_in_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (inport == "ap" && ip6), priority 2002, uuid 7fdd607e
   next;
   13. ls_in_l2_lkup (ovn-northd.c:3529): eth.dst == fa:16:3e:ef:2f:8b, 
priority 50, uuid e1d87fc5
   outport = "ad952e";
@@ -1707,7 +1707,7 @@ closely to those for IPv4 which we already discussed back 
under
 
   ct_next(ct_state=est|trk /* default (use --ct to customize) */)
   ---
-   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct_label.blocked == 0 && (outport == "cp" && ip6 && ip6.src == 
$as_ip6_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 2002, uuid 12fc96f9
+   4. ls_out_acl (ovn-northd.c:2925): !ct.new && ct.est && !ct.rpl && 
ct.blocked == 0 && (outport == "cp" && ip6 && ip6.src == 
$as_ip6_0fc1b6cf_f925_49e6_8f00_6dd13beca9dc), priority 

[ovs-dev] [PATCH v4 3/3 ovn] OVN ACL: Allow a user to input ct.label value for an acl

2019-11-08 Thread Ankur Sharma
This patch allows user to associate a value with acl,
which will be assigned to ct.label of the corresponding
connection tracking entry.

This value can be used to map a ct entry with corresponding
OVN ACL or higher level constructs like security group.

Signed-off-by: Ankur Sharma 
---
 northd/ovn-northd.c   | 37 ++---
 ovn-nb.ovsschema  |  5 +++--
 ovn-nb.xml| 12 +++
 tests/ovn-nbctl.at| 12 +--
 tests/ovn.at  | 57 +++
 utilities/ovn-nbctl.c | 24 +-
 6 files changed, 135 insertions(+), 12 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index dcd975d..226882e 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -190,12 +190,13 @@ enum ovn_stage {
 #define OVN_ACL_PRI_OFFSET 1000
 
 /* Register definitions specific to switches. */
-#define REGBIT_CONNTRACK_DEFRAG  "reg0[0]"
-#define REGBIT_CONNTRACK_COMMIT  "reg0[1]"
-#define REGBIT_CONNTRACK_NAT "reg0[2]"
-#define REGBIT_DHCP_OPTS_RESULT  "reg0[3]"
-#define REGBIT_DNS_LOOKUP_RESULT "reg0[4]"
-#define REGBIT_ND_RA_OPTS_RESULT "reg0[5]"
+#define REGBIT_CONNTRACK_DEFRAG "reg0[0]"
+#define REGBIT_CONNTRACK_COMMIT "reg0[1]"
+#define REGBIT_CONNTRACK_NAT"reg0[2]"
+#define REGBIT_CONNTRACK_SET_LABEL  "reg0[3]"
+#define REGBIT_DHCP_OPTS_RESULT "reg0[4]"
+#define REGBIT_DNS_LOOKUP_RESULT"reg0[5]"
+#define REGBIT_ND_RA_OPTS_RESULT"reg0[6]"
 
 /* Register definitions for switches and routers. */
 #define REGBIT_NAT_REDIRECT "reg9[0]"
@@ -4550,7 +4551,14 @@ consider_acl(struct hmap *lflows, struct ovn_datapath 
*od,
   " || (!ct.new && ct.est && !ct.rpl "
"&& ct.blocked == 1)) "
   "&& (%s)", acl->match);
-ds_put_cstr(, REGBIT_CONNTRACK_COMMIT" = 1; ");
+if (acl->label) {
+   ds_put_format(, REGBIT_CONNTRACK_COMMIT" = 1; "
+ ""REGBIT_CONNTRACK_SET_LABEL" = 1; "
+ "xxreg1 = %s; ", acl->label);
+} else {
+   ds_put_cstr(, REGBIT_CONNTRACK_COMMIT" = 1; ");
+}
+
 build_acl_log(, acl);
 ds_put_cstr(, "next;");
 ovn_lflow_add_with_hint(lflows, od, stage,
@@ -4988,6 +4996,21 @@ build_stateful(struct ovn_datapath *od, struct hmap 
*lflows)
 ovn_lflow_add(lflows, od, S_SWITCH_IN_STATEFUL, 0, "1", "next;");
 ovn_lflow_add(lflows, od, S_SWITCH_OUT_STATEFUL, 0, "1", "next;");
 
+/* If REGBIT_CONNTRACK_COMMIT is set as 1 and
+ * REGBIT_CONNTRACK_SET_LABEL is set to 1, then the packets should be
+ * committed to conntrack.
+ * We always set ct_mark.blocked to 0 here as
+ * any packet that makes it this far is part of a connection we
+ * want to allow to continue. */
+ovn_lflow_add(lflows, od, S_SWITCH_IN_STATEFUL, 150,
+  REGBIT_CONNTRACK_COMMIT" == 1 && "
+  ""REGBIT_CONNTRACK_SET_LABEL" == 1",
+  "ct_commit(ct_mark=0/1, ct_label=xxreg1); next;");
+ovn_lflow_add(lflows, od, S_SWITCH_OUT_STATEFUL, 150,
+  REGBIT_CONNTRACK_COMMIT" == 1 && "
+  ""REGBIT_CONNTRACK_SET_LABEL" == 1",
+  "ct_commit(ct_mark=0/1, ct_label=xxreg1); next;");
+
 /* If REGBIT_CONNTRACK_COMMIT is set as 1, then the packets should be
  * committed to conntrack. We always set ct.blocked to 0 here as
  * any packet that makes it this far is part of a connection we
diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
index 084305b..4e8a279 100644
--- a/ovn-nb.ovsschema
+++ b/ovn-nb.ovsschema
@@ -1,7 +1,7 @@
 {
 "name": "OVN_Northbound",
-"version": "5.17.0",
-"cksum": "1128988054 23237",
+"version": "5.18.0",
+"cksum": "1553853266 23311",
 "tables": {
 "NB_Global": {
 "columns": {
@@ -177,6 +177,7 @@
 "debug"]]},
   "min": 0, "max": 1}},
 "meter": {"type": {"key": "string", "min": 0, "max": 1}},
+"label": {"type": {"key": "string", "min": 0, "max": 1}},
 "external_ids": {
 "type": {"key": "string", "value": "string",
  "min": 0, "max": "unlimited"}}},
diff --git a/ovn-nb.xml b/ovn-nb.xml
index d8f3237..57faee4 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1438,6 +1438,18 @@
 default, log messages are not rate-limited.
 
   
+
+  
+
+Associates an identifier with the ACL.
+Same value will be written to corresponding connection
+tracker entry. Value should be in hex, for example: 0x1234.
+This value can help in debugging from connection tracker side,
+for example, through this "label" we 

[ovs-dev] [PATCH v4 0/3] Associate identifier with OVN ACL connection tracking entry

2019-11-08 Thread Ankur Sharma
I submitted this patch long time back and somehow lost track it.
Resubmitting the series, calling it as V4, as it addresses the
review comments given till v3.
https://mail.openvswitch.org/pipermail/ovs-dev/2019-April/358280.html

What:

a. Goal is to be able to associate some identifier with a connection tracking
entry.

b. This identifier can be used to map OVN ACL which added this entry or
higher level constructs like openstack security group etc.

c. There are 2 connection tracking fields which can be used for it.
ct.mark (32 bits) and ct.label (128 bits).

d. Patch intends to use ct.label, as this is a longer field and
hence would be put to a better use, if it stores the identifier.

Why:

a. Adding an identifier would help in debugging.
b. Now, we can map a connection tracking entry to corresponding
   acl, security group etc.
c. One of the use cases for this mapping would be to identify
   ACLs which added corresponding connection tracker entry, which
   is causing unexpected drops/leaks.

How:

Following is the sequence of changes:

Patch 1:
   i.  Current implementation uses a bit ct.label to handle policy update cases,
   where we use a bit in ct.label to indicate that reply traffic should
   be dropped now.
  ii.  Swap the usage of ct.label in current implementation with ct.mark.

Patch 2:
   i. Add support in parser to allow ct.label and mark to be set from registers
  as well (as of now only integer/masked integer is allowed).

Patch 3:
   i. Add a new column (named 'label') to Table ACL in northbound schema.
  ii. ovn-northd changes to enhance logical flows to set ct.label to acl->label.
  For example:
  table=4 (ls_out_acl ),  action=(reg0[1] = 1; reg0[3] = 1; 
xxreg1 = 0x1234; next;)
  .
  .
  .
  table=7 (ls_out_stateful), ... match=(reg0[1] == 1 && reg0[3] == 1),


Ankur Sharma (3):
  OVN ACL: Replace the usage of ct_label with ct_mark
  OVN ACL: Allow ct_mark and ct_label values to be set from register as
well
  OVN ACL: Allow a user to input ct.label value for an acl

 Documentation/tutorials/ovn-openstack.rst | 14 ++---
 include/ovn/actions.h |  3 +
 lib/actions.c | 72 ++
 lib/logical-fields.c  |  3 +
 northd/ovn-northd.8.xml   | 14 ++---
 northd/ovn-northd.c   | 87 +--
 ovn-nb.ovsschema  |  5 +-
 ovn-nb.xml| 12 
 ovn-sb.xml| 41 +
 tests/ovn-nbctl.at| 12 +++-
 tests/ovn.at  | 99 +--
 utilities/ovn-nbctl.c | 24 +++-
 12 files changed, 310 insertions(+), 76 deletions(-)

-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] same tcp session encapsulated with different udp src port in kernel mode if packet has do ip_forward

2019-11-08 Thread ychen



I have tested this patch in linux with ovs version 2.8.2, and it seems worked.




>From fa24d308d40f37f890fead0b79ac1f0f7baa28ba Mon Sep 17 00:00:00 2001
From: hzchenyuefang 
Date: Sat, 9 Nov 2019 10:14:23 +0800
Subject: [PATCH 1/1] fix skb_hash problem when sending from internal port


first packet need come to userspace first and when execute in datapath,
skb->hash value is changed to 0 hence the packet's skb->hash value need
to recalculate when used.  This may happend in such conditions: send
packet from internal port, and then do vxlan encapsulting, outer udp
header source port is calculated by skb->hash(see function
udp_flow_src_port), so same tcp session packet may have different outer
udp source port, this may affect load blance.
---
 datapath-windows/ovsext/BufferMgmt.h  |  1 +
 datapath-windows/ovsext/DpInternal.h  |  1 +
 datapath-windows/ovsext/User.c| 11 ++-
 datapath/datapath.c   | 17 -
 datapath/datapath.h   |  0
 datapath/linux/compat/include/linux/openvswitch.h |  1 +
 lib/dpif-netlink.c|  7 ++-
 lib/dpif.h|  2 ++
 ofproto/ofproto-dpif-upcall.c | 11 ++-
 9 files changed, 47 insertions(+), 4 deletions(-)
 mode change 100644 => 100755 datapath-windows/ovsext/BufferMgmt.h
 mode change 100644 => 100755 datapath-windows/ovsext/DpInternal.h
 mode change 100644 => 100755 datapath-windows/ovsext/User.c
 mode change 100644 => 100755 datapath/datapath.c
 mode change 100644 => 100755 datapath/datapath.h
 mode change 100644 => 100755 lib/dpif-netlink.c
 mode change 100644 => 100755 lib/dpif.h
 mode change 100644 => 100755 ofproto/ofproto-dpif-upcall.c


diff --git a/datapath-windows/ovsext/BufferMgmt.h 
b/datapath-windows/ovsext/BufferMgmt.h
old mode 100644
new mode 100755
index dcf310a..34f50a1
--- a/datapath-windows/ovsext/BufferMgmt.h
+++ b/datapath-windows/ovsext/BufferMgmt.h
@@ -59,6 +59,7 @@ typedef union _OVS_BUFFER_CONTEXT {
 UINT32 dataOffsetDelta;
 };
 UINT16 mru;
+UINT32 skb_hash;
 };


 UINT64 value[MEM_ALIGN_SIZE(32) >> 3];
diff --git a/datapath-windows/ovsext/DpInternal.h 
b/datapath-windows/ovsext/DpInternal.h
old mode 100644
new mode 100755
index 3e351b7..8bb7110
--- a/datapath-windows/ovsext/DpInternal.h
+++ b/datapath-windows/ovsext/DpInternal.h
@@ -300,6 +300,7 @@ typedef struct OvsPacketExecute {
uint32_t dpNo;
uint32_t inPort;
uint16 mru;
+   uint32 skb_hash;
uint32_t packetLen;
uint32_t actionsLen;
PNL_MSG_HDR nlMsgHdr;
diff --git a/datapath-windows/ovsext/User.c b/datapath-windows/ovsext/User.c
old mode 100644
new mode 100755
index 4693a8b..4439379
--- a/datapath-windows/ovsext/User.c
+++ b/datapath-windows/ovsext/User.c
@@ -292,7 +292,8 @@ OvsNlExecuteCmdHandler(POVS_USER_PARAMS_CONTEXT 
usrParamsCtx,
 [OVS_PACKET_ATTR_USERDATA] = {.type = NL_A_UNSPEC, .optional = TRUE},
 [OVS_PACKET_ATTR_EGRESS_TUN_KEY] = {.type = NL_A_UNSPEC,
 .optional = TRUE},
-[OVS_PACKET_ATTR_MRU] = { .type = NL_A_U16, .optional = TRUE }
+[OVS_PACKET_ATTR_MRU] = { .type = NL_A_U16, .optional = TRUE },
+[OVS_PACKET_ATTR_SKB_HASH] = { .type = NL_A_U16, .optional = TRUE }
 };


 RtlZeroMemory(, sizeof(OvsPacketExecute));
@@ -394,6 +395,9 @@ _MapNlAttrToOvsPktExec(PNL_MSG_HDR nlMsgHdr, PNL_ATTR 
*nlAttrs,
 if (nlAttrs[OVS_PACKET_ATTR_MRU]) {
 execute->mru = NlAttrGetU16(nlAttrs[OVS_PACKET_ATTR_MRU]);
 }
+if (nlAttrs[OVS_PACKET_ATTR_SKB_HASH]){
+execute->skb_hash = NlAttrGetU32(nlAttrs[OVS_PACKET_ATTR_SKB_HASH]);
+}
 }


 NTSTATUS
@@ -1101,6 +1105,11 @@ OvsCreateQueueNlPacket(PVOID userData,
 goto fail;
 }
 }
+if (ctx->skb_hash) {
+if (!NlMsgPutTailU32(, OVS_PACKET_ATTR_SKB_HASH, 
(UINT32)ctx->skb_hash)){
+goto fail;
+}
+}


 /* XXX must send OVS_PACKET_ATTR_EGRESS_TUN_KEY if set by vswtchd */
 if (userData){
diff --git a/datapath/datapath.c b/datapath/datapath.c
old mode 100644
new mode 100755
index b565fc5..9559b4d
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -287,7 +287,6 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct 
sw_flow_key *key)
stats_counter = >n_missed;
goto out;
}
-
ovs_flow_stats_update(flow, key->tp.flags, skb);
sf_acts = rcu_dereference(flow->sf_acts);
ovs_execute_actions(dp, skb, sf_acts, key);
@@ -521,6 +520,14 @@ static int queue_userspace_packet(struct datapath *dp, 
struct sk_buff *skb,
pad_packet(dp, user_skb);
}


+if (skb->hash) {
+if (nla_put_u32(user_skb, OVS_PACKET_ATTR_SKB_HASH, skb->hash)) {
+err = -ENOBUFS;
+goto out;
+}
+pad_packet(dp, 

Re: [ovs-dev] [PATCH] conntrack: Fix tcp payload length in case multi-segments.

2019-11-08 Thread Darrell Ball
Thanks for the patch

Would you mind describing the use case that this patch is aiming to support
?

On Fri, Nov 8, 2019 at 1:23 AM Zhike Wang  wrote:

> Signed-off-by: Zhike Wang 
> ---
>  lib/conntrack-private.h | 16 ++--
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
> index 590f139..1d21f6e 100644
> --- a/lib/conntrack-private.h
> +++ b/lib/conntrack-private.h
> @@ -233,13 +233,17 @@ conn_update_expiration(struct conntrack *ct, struct
> conn *conn,
>  static inline uint32_t
>  tcp_payload_length(struct dp_packet *pkt)
>  {
> -const char *tcp_payload = dp_packet_get_tcp_payload(pkt);
> -if (tcp_payload) {
> -return ((char *) dp_packet_tail(pkt) - dp_packet_l2_pad_size(pkt)
> -- tcp_payload);
> -} else {
> -return 0;
> +size_t l4_size = dp_packet_l4_size(pkt);
> +
> +if (OVS_LIKELY(l4_size >= TCP_HEADER_LEN)) {
> +struct tcp_header *tcp = dp_packet_l4(pkt);
> +int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4;
> +
> +if (OVS_LIKELY(tcp_len >= TCP_HEADER_LEN && tcp_len <= l4_size)) {
> +return (l4_size - tcp_len);
> +}
>

Maybe I missed something, but it looks like the same calculation is arrived
at.


>  }
> +return 0;
>  }
>
>  #endif /* conntrack-private.h */
> --
> 1.8.3.1
>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds

2019-11-08 Thread dwilder

On 2019-11-08 14:52, Ilya Maximets wrote:

On 06.11.2019 20:20, David Wilder wrote:

Add support for travis-ci ppc64le builds.

- Updated matrix in .travis.yml to include an arch: ppc64le build.
- Move package install needed for 32bit builds to 
.travis/linux-prepare.sh.


To keep the total build time at an acceptable level only a single 
build

job is included in the matrix for ppc64le.

A build report example can be found here [1]
[0] 
https://urldefense.proofpoint.com/v2/url?u=http-3A__travis-2Dci.org_=DwICaQ=jf_iaSHvJObTbx-siA1ZOg=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI=_UtshYcJsj4Pt3b9hfgEiyaEIT3j9gPEIgmBatCEqCo=r0fBOs-21CKcV4kyZGnzh3fcjrpR8caYSl8K2i1St54= 
[1] 
https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_djlwilder_ovs_builds_607851729=DwICaQ=jf_iaSHvJObTbx-siA1ZOg=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI=_UtshYcJsj4Pt3b9hfgEiyaEIT3j9gPEIgmBatCEqCo=7t2rzVasH7Xq_R7jWkWZO9rkgm4KHMH-WavBzCRbF74= 
Signed-off-by: David Wilder 

---
Addressed review comments:
- Cleaned up linux-prepare.sh (v2)
- Switch from os: linux-ppc64le to arch: ppc64le (v3)


What a wonderful world of undocumented features. :)

Anyway, I just tried this patch and it fails for me because of missing 
pip:


https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_igsilya_ovs_jobs_609402867=DwICaQ=jf_iaSHvJObTbx-siA1ZOg=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI=_UtshYcJsj4Pt3b9hfgEiyaEIT3j9gPEIgmBatCEqCo=PF1oO_KkZFd_RRKToj6UBN2t2YhvTVE5XnVD1GF9u60=

pip install --disable-pip-version-check --user six flake8 hacking
./.travis/linux-prepare.sh: line 15: pip: command not found

Restarting the job doesn't help.

I'm wondering what is the base image and who controls preinstalled 
software?
Maybe it makes sense to hold on until travis will start supporting 
ppc64le

officially?



That is a bummer, I am seeing that error now as well, it worked 
yesterday :(

I agree we should hold off.  I will let you know if I figure it out.



Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-dpdk: Track vhost tx contention.

2019-11-08 Thread Ilya Maximets

On 26.08.2019 16:33, David Marchand wrote:

Add a coverage counter to help diagnose contention on the vhost txqs.
This is seen as dropped packets on the physical ports for rates that
are usually handled fine by OVS.

Signed-off-by: David Marchand 
---


Thanks! I changed 'unlikely' to 'OVS_UNLIKELY' and applied to master.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-dpdk: Track vhost tx contention.

2019-11-08 Thread Ilya Maximets

On 08.11.2019 9:30, David Marchand wrote:

On Tue, Nov 5, 2019 at 4:37 PM Ilya Maximets  wrote:

That's an interesting debug method, but it looks not very suitable
for an end-user documentation.  One thing that bothers me the most
is referencing C code snippets in this kind of documentation.


Ok, can we conclude on the coverage counter wrt the v1 then?
https://patchwork.ozlabs.org/patch/1153238/

Should I submit a v3 with the doc update (but removing the parts about perf) ?


'perf' part should not be there.

I'm in doubt about the rest of the docs.  This part might be useful, but
it doesn't provide any solution for the problem and I really don't know
if there is something we can suggest in that case.  "Rework your network
topology" doesn't sound like a friendly solution or exact steps to do.

One more thing is that documenting coverage counters doesn't look like a
good idea to me and I'd like to not create a precedent.

One day we'll rework this to be some "PMD/netdev performance statistics"
and it'll be OK to document it.  But there is nothing more permanent
than a temporary solution.

Right now the easiest way for me is to just apply v1.


Ok for me.


OK. I'll apply v1 then.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds

2019-11-08 Thread Ilya Maximets

On 06.11.2019 20:20, David Wilder wrote:

Add support for travis-ci ppc64le builds.

- Updated matrix in .travis.yml to include an arch: ppc64le build.
- Move package install needed for 32bit builds to .travis/linux-prepare.sh.

To keep the total build time at an acceptable level only a single build
job is included in the matrix for ppc64le.

A build report example can be found here [1]
[0] http://travis-ci.org/
[1] https://travis-ci.org/djlwilder/ovs/builds/607851729

Signed-off-by: David Wilder 
---
Addressed review comments:
- Cleaned up linux-prepare.sh (v2)
- Switch from os: linux-ppc64le to arch: ppc64le (v3)


What a wonderful world of undocumented features. :)

Anyway, I just tried this patch and it fails for me because of missing pip:

https://travis-ci.org/igsilya/ovs/jobs/609402867

pip install --disable-pip-version-check --user six flake8 hacking
./.travis/linux-prepare.sh: line 15: pip: command not found

Restarting the job doesn't help.

I'm wondering what is the base image and who controls preinstalled software?
Maybe it makes sense to hold on until travis will start supporting ppc64le
officially?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds

2019-11-08 Thread William Tu
On Wed, Nov 06, 2019 at 11:20:48AM -0800, David Wilder wrote:
> Add support for travis-ci ppc64le builds.
> 
> - Updated matrix in .travis.yml to include an arch: ppc64le build.
> - Move package install needed for 32bit builds to .travis/linux-prepare.sh.
> 
> To keep the total build time at an acceptable level only a single build
> job is included in the matrix for ppc64le.
> 
> A build report example can be found here [1]
> [0] http://travis-ci.org/
> [1] https://travis-ci.org/djlwilder/ovs/builds/607851729
> 
> Signed-off-by: David Wilder 
> ---
LGTM,
Acked-by: William Tu 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Extending ovs_action_attr to add a new action

2019-11-08 Thread William Tu
On Fri, Nov 08, 2019 at 05:12:55PM +0100, Matteo Croce wrote:
> Hi,
> 
> I need to add a field to enum ovs_action_attr, but I see that the
> definition between the upstream header[1] and the one in compat[2]
> differs.
> Upstream enum stops at OVS_ACTION_ATTR_CHECK_PKT_LEN, with an extra
> "hidden" element after __OVS_ACTION_ATTR_MAX (22)
> Our compat version instead, has OVS_ACTION_ATTR_TUNNEL_{PUSH,POP}
> defined only #ifndef __KERNEL__, with __OVS_ACTION_ATTR_MAX being 22
> for the kernel and 24 for userspace.
> 
> If I add a field OVS_ACTION_ATTR_WHATEVER just before
> __OVS_ACTION_ATTR_MAX in the kernel, older userspace will incorrectly
> see the new action as OVS_ACTION_ATTR_TUNNEL_PUSH.

"older userspace" means you're using userspace datapath (dpif-netdev)?
If true, then it's not using kernel module.

if "older userspace" means just ovs-vswitchd using kernel module,
and you want to upgrade ovs kernel module with your new action
and without upgrade ovs-vswitchd?

Usually we also upgrade ovs-vswitchd, so I don't know how this can be done.

Regards,
William
> 
> How can we extend this enum without breaking compatibility?
> If OVS_ACTION_ATTR_TUNNEL_* really is unused in the kernel datapath,
> what if we pad the kernel header with two padding fields?
> 
> -%<-
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -925,6 +926,8 @@ enum ovs_action_attr {
>   OVS_ACTION_ATTR_METER,/* u32 meter ID. */
>   OVS_ACTION_ATTR_CLONE,/* Nested OVS_CLONE_ATTR_*.  */
>   OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
> + _OVS_ACTION_ATTR_TUNNEL_PUSH, /* unused in kernel datapath. */
> + _OVS_ACTION_ATTR_TUNNEL_POP,  /* unused in kernel datapath. */
> 
>   __OVS_ACTION_ATTR_MAX,   /* Nothing past this will be accepted
> ->%-
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/openvswitch.h?h=v5.3#n899
> [2] 
> https://github.com/openvswitch/ovs/blob/v2.12.0/datapath/linux/compat/include/linux/openvswitch.h#L962
> 
> Regards,
> -- 
> Matteo Croce
> per aspera ad upstream
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack

2019-11-08 Thread Aaron Conole
The act_ct TC module shares a common conntrack and NAT infrastructure
exposed via netfilter.  It's possible that a packet needs both SNAT and
DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
this because it runs through the NAT table twice - once on ingress and
again after egress.  The act_ct action doesn't have such capability.

Like netfilter hook infrastructure, we should run through NAT twice to
keep the symmetry.

Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")

Signed-off-by: Aaron Conole 
---
 net/sched/act_ct.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index fcc46025e790..f3232a00970f 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
  bool commit)
 {
 #if IS_ENABLED(CONFIG_NF_NAT)
+   int err;
enum nf_nat_manip_type maniptype;
 
if (!(ct_action & TCA_CT_ACT_NAT))
@@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
return NF_ACCEPT;
}
 
-   return ct_nat_execute(skb, ct, ctinfo, range, maniptype);
+   err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
+   if (err == NF_ACCEPT &&
+   ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) {
+   if (maniptype == NF_NAT_MANIP_SRC)
+   maniptype = NF_NAT_MANIP_DST;
+   else
+   maniptype = NF_NAT_MANIP_SRC;
+
+   err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
+   }
+   return err;
 #else
return NF_ACCEPT;
 #endif
-- 
2.21.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net 1/2] openvswitch: support asymmetric conntrack

2019-11-08 Thread Aaron Conole
The openvswitch module shares a common conntrack and NAT infrastructure
exposed via netfilter.  It's possible that a packet needs both SNAT and
DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
this because it runs through the NAT table twice - once on ingress and
again after egress.  The openvswitch module doesn't have such capability.

Like netfilter hook infrastructure, we should run through NAT twice to
keep the symmetry.

Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
Signed-off-by: Aaron Conole 
---
 net/openvswitch/conntrack.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 05249eb45082..283e8f9a5fd2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -903,6 +903,17 @@ static int ovs_ct_nat(struct net *net, struct sw_flow_key 
*key,
}
err = ovs_ct_nat_execute(skb, ct, ctinfo, >range, maniptype);
 
+   if (err == NF_ACCEPT &&
+   ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) {
+   if (maniptype == NF_NAT_MANIP_SRC)
+   maniptype = NF_NAT_MANIP_DST;
+   else
+   maniptype = NF_NAT_MANIP_SRC;
+
+   err = ovs_ct_nat_execute(skb, ct, ctinfo, >range,
+maniptype);
+   }
+
/* Mark NAT done if successful and update the flow key. */
if (err == NF_ACCEPT)
ovs_nat_update_key(key, skb, maniptype);
-- 
2.21.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.

2019-11-08 Thread Han Zhou
On Fri, Nov 8, 2019 at 11:22 AM Han Zhou  wrote:
>
> 1. storage data and the void *arg of init() breaks the engine node data
encapsulation.
> 2. engine_node_valid(_flow_output, engine_run_id) is not needed?
Should use storage to access instead?
> 3. refactor of engine is good but better to be a separate commit
> 4. we can have a new interface: engine_get_data(), which returns data if
it is valid. we should never expose the data directly. We should init the
engine node with dynamically allocated engine data structure (and remember
to free during destroy)

Oops! please ignore the above part since it was draft and I forgot to
delete after editing the formal response, mostly redundant :-)
Real response started here =>
>
> Hi Dumitru,
>
> Sorry for late response.
> On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara  wrote:
> >
> > The incremental processing engine might stop a run before the
> > en_runtime_data node is processed. In such cases the ed_runtime_data
> > fields might contain pointers to already deleted SB records. For
> > example, if a port binding corresponding to a patch port is removed from
> > the SB database and the incremental processing engine aborts before the
> > en_runtime_data node is processed then the corresponding local_datapath
> > hashtable entry in ed_runtime_data is stale and will store a pointer to
> > the already freed sbrec_port_binding record.
> >
> > This will cause invalid memory accesses in various places (e.g.,
> > pinctrl_run() -> prepare_ipv6_ras()).
> >
> > To fix the issue we need a way to track how each node was processed
> > during an engine run. This commit transforms the 'changed' field in
> > struct engine_node in a 'state' field. Possible node states are:
> > - "New": the node is not yet initialized.
> > - "Stale": data in the node is not up to date with the DB.
> > - "Updated": data in the node is valid but was updated during
> >   the last run of the engine.
> > - "Valid": data in the node is valid and didn't change during
> >   the last run of the engine.
> > - "Aborted": during the last run, processing was aborted for
> >   this node.
> > - "Destroyed": the node was already cleaned up.
> >
> > We also add a separation between engine node data that can be accessed
> > at any time (regardless if the last engine run was successful or not)
> > and data that may be accessed only if the nodes are up to date. This
> > helps avoiding custom "engine_node_valid" handlers for different
> > nodes.
> >
> > The commit also simplifies the logic of calling engine_run and
> > engine_need_run in order to reduce the number of external variables
> > required to track the result of the last engine execution.
> >
> > Functions that need to be called from the main loop and depend on
> > various data contents of the engine's nodes are now called only if
> > the data is up to date.
> >
> > CC: Han Zhou 
> > Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine
- quiet mode.")
> > Signed-off-by: Dumitru Ceara 
> >
> > ---
> > v2: Address Han's comments:
> > - call engine_node_valid() in all the places where node local data is
> >   used.
> > - move out "global" data outside the engine nodes. Make a clear
> >   separation between data that can be safely used at any time and data
> >   that can be used only when the engine run was successful.
>
> I am concerned with this kind of separation of *global* data, which
breaks the data encapsulation of engine node, and can easily result in
hidden dependency. As you know the main purpose of the I-P engine is to
force explicit dependency exposed between different engine nodes thus
ensure the correctness (at least it helps to ensure) of incremental
processing.
>
> Here is my proposal to address the problem with better encapsulation.
>
> Firstly, let's avoid direct engine data access outside of engine module.
At engine node construction, instead of using reference of stack variable
(such as struct ed_type_runtime_data ed_runtime_data), we can allocate the
memory in the engine node's init() interface, and free in the cleanup()
interface. This way, there will be no way to directly access engine data
like _runtime_data.local_datapaths.
>
> Secondly, let's add a engine module interface engine_get_data() to
retrieve *and validate* data for an engine node:
> void *
> engine_get_data(struct engine_node *node, uint64_t run_id)
> {
> if (engine_node_valid(node, run_id)) {
> return node->data;
> }
> return NULL;
> }
>
> This should be used whenever an engine node data need to be accessed. (It
may be even better to use node name as input instead of node structure, but
it seems ok to me either way)
>
> Now let's address the always-valid node problem. I was proposing an
is_valid() interface for engine node. It can be NULL by default, but if a
node xxx's data is always valid, it can be implemented like:
>
> static bool
> en_xxx_is_valid(struct engine_node *node)
> {
> /* This node's data will always be valid */
> return 

Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.

2019-11-08 Thread Han Zhou
1. storage data and the void *arg of init() breaks the engine node data
encapsulation.
2. engine_node_valid(_flow_output, engine_run_id) is not needed? Should
use storage to access instead?
3. refactor of engine is good but better to be a separate commit
4. we can have a new interface: engine_get_data(), which returns data if it
is valid. we should never expose the data directly. We should init the
engine node with dynamically allocated engine data structure (and remember
to free during destroy)

Hi Dumitru,

Sorry for late response.
On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara  wrote:
>
> The incremental processing engine might stop a run before the
> en_runtime_data node is processed. In such cases the ed_runtime_data
> fields might contain pointers to already deleted SB records. For
> example, if a port binding corresponding to a patch port is removed from
> the SB database and the incremental processing engine aborts before the
> en_runtime_data node is processed then the corresponding local_datapath
> hashtable entry in ed_runtime_data is stale and will store a pointer to
> the already freed sbrec_port_binding record.
>
> This will cause invalid memory accesses in various places (e.g.,
> pinctrl_run() -> prepare_ipv6_ras()).
>
> To fix the issue we need a way to track how each node was processed
> during an engine run. This commit transforms the 'changed' field in
> struct engine_node in a 'state' field. Possible node states are:
> - "New": the node is not yet initialized.
> - "Stale": data in the node is not up to date with the DB.
> - "Updated": data in the node is valid but was updated during
>   the last run of the engine.
> - "Valid": data in the node is valid and didn't change during
>   the last run of the engine.
> - "Aborted": during the last run, processing was aborted for
>   this node.
> - "Destroyed": the node was already cleaned up.
>
> We also add a separation between engine node data that can be accessed
> at any time (regardless if the last engine run was successful or not)
> and data that may be accessed only if the nodes are up to date. This
> helps avoiding custom "engine_node_valid" handlers for different
> nodes.
>
> The commit also simplifies the logic of calling engine_run and
> engine_need_run in order to reduce the number of external variables
> required to track the result of the last engine execution.
>
> Functions that need to be called from the main loop and depend on
> various data contents of the engine's nodes are now called only if
> the data is up to date.
>
> CC: Han Zhou 
> Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine -
quiet mode.")
> Signed-off-by: Dumitru Ceara 
>
> ---
> v2: Address Han's comments:
> - call engine_node_valid() in all the places where node local data is
>   used.
> - move out "global" data outside the engine nodes. Make a clear
>   separation between data that can be safely used at any time and data
>   that can be used only when the engine run was successful.

I am concerned with this kind of separation of *global* data, which breaks
the data encapsulation of engine node, and can easily result in hidden
dependency. As you know the main purpose of the I-P engine is to force
explicit dependency exposed between different engine nodes thus ensure the
correctness (at least it helps to ensure) of incremental processing.

Here is my proposal to address the problem with better encapsulation.

Firstly, let's avoid direct engine data access outside of engine module. At
engine node construction, instead of using reference of stack variable
(such as struct ed_type_runtime_data ed_runtime_data), we can allocate the
memory in the engine node's init() interface, and free in the cleanup()
interface. This way, there will be no way to directly access engine data
like _runtime_data.local_datapaths.

Secondly, let's add a engine module interface engine_get_data() to retrieve
*and validate* data for an engine node:
void *
engine_get_data(struct engine_node *node, uint64_t run_id)
{
if (engine_node_valid(node, run_id)) {
return node->data;
}
return NULL;
}

This should be used whenever an engine node data need to be accessed. (It
may be even better to use node name as input instead of node structure, but
it seems ok to me either way)

Now let's address the always-valid node problem. I was proposing an
is_valid() interface for engine node. It can be NULL by default, but if a
node xxx's data is always valid, it can be implemented like:

static bool
en_xxx_is_valid(struct engine_node *node)
{
/* This node's data will always be valid */
return true;
}

For the engine_node_valid() function, it can be changed slightly:

bool
engine_node_valid(struct engine_node *node, uint64_t run_id)
{
if (node->is_valid) {
return node->is_valid();
}
return node->run_id == run_id &&
(node->state == EN_UPDATED || node->state == EN_VALID);
}

So if is_valid is not implemented, it will be the default check, 

Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds

2019-11-08 Thread dwilder

Hi Wei

On 2019-11-08 02:02, Yanqin Wei (Arm Technology China) wrote:

Hi David


-Original Message-
From: David Wilder 
Sent: Thursday, November 7, 2019 3:21 AM
To: ovs-dev@openvswitch.org
Cc: i.maxim...@ovn.org; b...@ovn.org; Yanqin Wei (Arm Technology China)
; wil...@us.ibm.com
Subject: [PATCH v3] travis: support ppc64le builds

Add support for travis-ci ppc64le builds.

- Updated matrix in .travis.yml to include an arch: ppc64le build.
- Move package install needed for 32bit builds to 
.travis/linux-prepare.sh.


To keep the total build time at an acceptable level only a single 
build job is

included in the matrix for ppc64le.

A build report example can be found here [1] [0] 
https://urldefense.proofpoint.com/v2/url?u=http-3A__travis-2Dci.org_=DwIFAg=jf_iaSHvJObTbx-siA1ZOg=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI=6QTaKFSaP0A5SgshVJPuewRjtEH32pQD2AUjoEnZyTo=IILKkh0Orqnp1nKjHWWCkXSOMpBYjG9WKs8l34TyPts= 
 [1]

https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_djlwilder_ovs_builds_607851729=DwIFAg=jf_iaSHvJObTbx-siA1ZOg=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI=6QTaKFSaP0A5SgshVJPuewRjtEH32pQD2AUjoEnZyTo=mVCcGr-NWlZzQn2xSnackkm2Aawg7iyvYEyh4lMpjFw=

Signed-off-by: David Wilder 
---
Addressed review comments:
- Cleaned up linux-prepare.sh (v2)
- Switch from os: linux-ppc64le to arch: ppc64le (v3)

 .travis.yml  | 5 +++--
 .travis/linux-prepare.sh | 5 -
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 482efd2d1..308c09635 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -14,7 +14,6 @@ addons:
   apt:
 packages:
   - bc
-  - gcc-multilib
   - libssl-dev
   - llvm-dev
   - libjemalloc1
@@ -26,7 +25,6 @@ addons:
   - libelf-dev
   - selinux-policy-dev
   - libunbound-dev
-  - libunbound-dev:i386
   - libunwind-dev

 before_install: ./.travis/${TRAVIS_OS_NAME}-prepare.sh
@@ -52,6 +50,9 @@ matrix:
 - os: osx
   compiler: clang
   env: OPTS="--disable-ssl"
+- arch: ppc64le
+  compiler: gcc
+  env: OPTS="--disable-ssl"

 script: ./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS

diff --git a/.travis/linux-prepare.sh b/.travis/linux-prepare.sh index
9e3ac0df7..d66f480c6 100755
--- a/.travis/linux-prepare.sh
+++ b/.travis/linux-prepare.sh
@@ -18,7 +18,10 @@ pip install --user --upgrade docutils  if [ "$M32" 
]; then
 # 32-bit and 64-bit libunwind can not be installed at the same 
time.
 # This will remove the 64-bit libunwind and install 32-bit 
version.

-sudo apt-get install -y libunwind-dev:i386
+sudo apt-get install -y \
+ gcc-multilib \
+ libunwind-dev:i386 \
+ libunbound-dev:i386

[Yanqin] They are x86 specific dependency. It is better to use
"$TRAVIS_ARCH" == "amd64" condition.


[ wilder ] In this case it is not needed as ppc64le is not supporting 
32bit builds (not in the matrix).
If arm64 will support a 32bit build, then you will need to modify this 
section anyway,  to install arm packages.



[Yanqin]  Is gcc-multilib only required for 32bits build?


[ wilder ] Yes.


 fi

 # IPv6 is supported by kernel but disabled in TravisCI images:
--
2.23.0.162.gf1d4a28


IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended
recipient, please notify the sender immediately and do not disclose
the contents to any other person, use it for any purpose, or store or
copy the information in any medium. Thank you.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] ¿Cómo crear un equipo triunfador?

2019-11-08 Thread Coaching y liderazgo efectivo
26 de Noviembre | Horario de 10:00 a 17:00 hrs.  |  (hora del centro de México) 

- Coaching y liderazgo efectivo. - 


El líder coach es un estilo muy completo para utilizar cuando queremos 
capacitar a otros y conseguir resultados sólidos a
mediano y largo plazo.
Este webinar tiene como finalidad llevar a los participantes a ser capaces de 
intervenir positivamente en otros individuos en
funciones de liderazgo, comprendiendo diversos estilos y aplicando en la 
práctica acciones que permitan mejorar la toma de decisiones
para alcanzar los objetivos de su organización.

Solicita información respondiendo a este correo con la palabra Coaching, junto 
con los siguientes datos:

Nombre:
Correo electrónico:
Número telefónico:


Números de Atención: 

(045) 55 15 54 66 30 - (045) 55 85567293 - (045) 5530167085 

En caso de que haya recibido este correo sin haberlo solicitado o si desea 
dejar de recibir nuestra promoción favor de responder con la palabra baja o 
enviar un correo a bajas@ innovalearn.net


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Extending ovs_action_attr to add a new action

2019-11-08 Thread 0-day Robot
Bleep bloop.  Greetings Matteo Croce, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
fatal: corrupt patch at line 13
Repository lacks necessary blobs to fall back on 3-way merge.
Cannot fall back to three-way merge.
Patch failed at 0001 Extending ovs_action_attr to add a new action
The copy of the patch that failed is found in:
   
/var/lib/jenkins/jobs/upstream_build_from_pw/workspace/.git/rebase-apply/patch
When you have resolved this problem, run "git am --resolved".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] dpif-netdev: log rxq assignment in isolated pmd

2019-11-08 Thread Ilya Maximets

On 22.10.2019 11:34, Gowrishankar Muthukrishnan wrote:

There is no log about isolated rxq assignment in a pmd today, which
sometimes could be useful to trace rxq/pmd pinning, when debugging
with log. Ovs-appctl dpif-netdev/pmd-rxq-show reports about it
already, but logging is helpful to trace pinning in time.

Changes:
   v2: init numa_id for the info.

Reported-at: https://bugzilla.redhat.com/1728616
Signed-off-by: Gowrishankar Muthukrishnan 
---


Hi Gowrishankar,
Thanks for the patch.  I like the idea of this additional log.

Few comments for implementation inline.

Best regards, Ilya Maximets.


  lib/dpif-netdev.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4546b55e8..4d7c36787 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4573,6 +4573,11 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) 
OVS_REQUIRES(dp->port_mutex)
  q->pmd = pmd;
  pmd->isolated = true;
  dp_netdev_pmd_unref(pmd);


Print should go before dropping the pmd reference.


+numa_id = netdev_get_numa_id(q->port->netdev);
+VLOG_INFO("Core %d on numa node %d assigned port \'%s\' "
+  "rx queue %d.", pmd->core_id, numa_id,


'numa_id' is not correct.  You need to print the numa_id of the core/thread,
not the numa_id of netdev.  So, there should be 'pmd->numa_id' instead.

And, please, align the arguments to the next symbol after '(', i.e. 4 spaces
to the right.


+  netdev_rxq_get_name(q->rx),
+  netdev_rxq_get_queue_id(q->rx));
  }
  } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) {
  uint64_t cycle_hist = 0;


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] L2 acl on the logical switch

2019-11-08 Thread venugopal iyer via dev
 Sorry about that, hopefully this is better (else, I'll just attach it the next 
time)
Assuming :Logical Switch (ls1) with ls1_vm1 : 02:ac:10:ff:00:11/172.16.255.11
                                       ls1_vm2 : 
02:ac:10:ff:00:22/172.16.255.22                                       ls1_vm3 : 
02:ac:10:ff:00:33/172.16.255.33
Looking to :  To have MAC ACLs between vm1 and vm2 so they can't talk to each 
other.
Using:
  # ovn-nbctl  create Address_Set name=macs 
addresses=\"02:ac:10:ff:00:11\",\"02:ac:10:ff:00:22\"  # ovn-nbctl create 
Port_Group name=l2pg   # ovn-nbctl add Port_Group l2pg ports   # 
ovn-nbctl add Port_Group l2pg ports   # ovn-nbctl acl-add ls1 to-lport 
32767  "outport == @l2pg && eth.src == \$macs" drop
What we are seeing:  The above, by itself, will limit L3 and L2 between vm1 and 
vm2, but not vm3 (as expected)
  lflow:    table=4 (ls_out_acl         ), priority=33767, match=(outport == 
@l2pg && eth.src == $macs), action=(/* drop */)
  dpflow (sending a arbit packet of ether type 0x7a05):
    
(0),in_port(vm1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x7a05),
 packets:0, bytes:0, used:never, actions:drop    However, if there is a 
stateful rule, e.g.:
    # ovn-nbctl acl-add ls1 from-lport 2000  "inport == \"ls1-vm3\" && ip" 
allow-related
  Then, the behavior differs  lflow:    table=6 (ls_in_acl          ), 
priority=3000 , match=(!ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && 
(inport == "ls1-vm3" && ip)), action=(next;)    table=6 (ls_in_acl          ), 
priority=3000 , match=(((ct.new && !ct.est) || (!ct.new && ct.est && !ct.rpl && 
ct_label.blocked == 1)) && (inport == "ls1-vm3" && ip)), action=(reg0[1] = 1; 
next;)    table=4 (ls_out_acl         ), priority=33767, match=((!ct.est || 
(ct.est && ct_label.blocked == 1)) && (outport == @l2pg && eth.src == $macs)), 
action=(/* drop */)    table=4 (ls_out_acl         ), priority=33767, 
match=(ct.est && ct_label.blocked == 0 && (outport == @l2pg && eth.src == 
$macs)), action=(ct_commit(ct_label=1/1); /* drop */)
*Note: the condition !ct.new && ct.est && !ct.rpl && ct_label.blocked == 
0**

This does block L3 traffic:

  dpctl flow for ICMP    
recirc_id(0),in_port(vm1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8),
 packets:6, bytes:588, used:0.864s, actions:ct(zone=1),recirc(0x1)    
recirc_id(0x1),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(dst=02:ac:10:ff:00:22),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8),
 packets:6, bytes:588, used:0.864s, 
actions:ct(commit,zone=1,label=0/0x1),ct(zone=4),recirc(0x2)    
recirc_id(0x2),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(frag=no),
 packets:6, bytes:588, used:0.864s, actions:drop    
recirc_id(0),in_port(vm1),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x0806),arp(sha=02:ac:10:ff:00:11),
 packets:0, bytes:0, used:never, actions:vm2    
recirc_id(0),in_port(vm2),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:22,dst=02:ac:10:ff:00:11),eth_type(0x0806),arp(sha=02:ac:10:ff:00:22),
 packets:0, bytes:0, used:never, actions:vm1
  But, it doesn't block L2, see the ARP going through, without the stateful 
rule the ARP would have been dropped too.
  From what i understand it is because of the condition "the condition !ct.new 
&& ct.est && !ct.rpl && ct_label.blocked == 0",  which specifically looks fo 
"+trk" which makes sense for IP packets, but probably not for non-IP packets, 
i.e. the ARP rule   above has "-trk". Hence, I believe it gets through.
  Proposed changes:  If my understanding is right (and objective to remove the 
inconsistency between when there are stateful rules and not, w.r.t. L2 
packets), 
  we should have the condition as:  # git diffdiff --git 
a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.cindex 6c6de2afd..b0f524531 
100644--- a/ovn/northd/ovn-northd.c+++ b/ovn/northd/ovn-northd.c@@ -4191,10 
+4191,13 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od,          
* depending on whether the connection was previously committed          * to 
the connection tracker with ct_commit. */         if (has_stateful) {-          
  /* If the packet is not part of an established connection, then-             
* we can simply reject/drop it. */++            /* If the packet is not tracked 
or part of an established connection, then+             * we can simply 
reject/drop it.+             */             ds_put_cstr(,-                
        "(!ct.est || (ct.est && ct_label.blocked == 1))");+                     
   "(!ct.trk || !ct.est || (ct.est && ct_label.blocked == 1))");             if 
(!strcmp(acl->action, "reject")) {                 build_reject_acl_rules(od, 
lflows, stage, acl, ,                                        );
with this the rules above look like:
  lflow:    table=6 (ls_in_acl        

Re: [ovs-dev] [PATCH] netdev-dpdk: Fix dev attached flag.

2019-11-08 Thread Ilya Maximets

On 08.11.2019 9:55, Zhike Wang wrote:

If the dev was already probed correctly, the dev attached flag
should be set to true, or resource would leak during destruct.


We're not detaching devices that wasn't attached by us.

If device is already probed in DPDK than it means that most likely it
was attached on dpdk_eal_init(), because user explicitly added this
device in EAL arguments with 'dpdk-extra' knob.

Previously there was non-detachable devices and we protected ourselves
from trying to detach them.  I beleive (hope) that in modern DPDK
all the devices are detachable, but still.  If user added devices
to EAL arguments than he/she wanted them to exist whole time while
OVS is running and we're not eligible to detach these devices.

If you really want to detach them, use special command
'ovs-appctl netdev-dpdk/detach' that was designed for this exact case.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Extending ovs_action_attr to add a new action

2019-11-08 Thread Matteo Croce
Hi,

I need to add a field to enum ovs_action_attr, but I see that the
definition between the upstream header[1] and the one in compat[2]
differs.
Upstream enum stops at OVS_ACTION_ATTR_CHECK_PKT_LEN, with an extra
"hidden" element after __OVS_ACTION_ATTR_MAX (22)
Our compat version instead, has OVS_ACTION_ATTR_TUNNEL_{PUSH,POP}
defined only #ifndef __KERNEL__, with __OVS_ACTION_ATTR_MAX being 22
for the kernel and 24 for userspace.

If I add a field OVS_ACTION_ATTR_WHATEVER just before
__OVS_ACTION_ATTR_MAX in the kernel, older userspace will incorrectly
see the new action as OVS_ACTION_ATTR_TUNNEL_PUSH.

How can we extend this enum without breaking compatibility?
If OVS_ACTION_ATTR_TUNNEL_* really is unused in the kernel datapath,
what if we pad the kernel header with two padding fields?

-%<-
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -925,6 +926,8 @@ enum ovs_action_attr {
  OVS_ACTION_ATTR_METER,/* u32 meter ID. */
  OVS_ACTION_ATTR_CLONE,/* Nested OVS_CLONE_ATTR_*.  */
  OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
+ _OVS_ACTION_ATTR_TUNNEL_PUSH, /* unused in kernel datapath. */
+ _OVS_ACTION_ATTR_TUNNEL_POP,  /* unused in kernel datapath. */

  __OVS_ACTION_ATTR_MAX,   /* Nothing past this will be accepted
->%-

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/openvswitch.h?h=v5.3#n899
[2] 
https://github.com/openvswitch/ovs/blob/v2.12.0/datapath/linux/compat/include/linux/openvswitch.h#L962

Regards,
-- 
Matteo Croce
per aspera ad upstream

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev: Dynamic per-port Flow API.

2019-11-08 Thread Ilya Maximets

On 06.11.2019 18:11, Ophir Munk wrote:

Hi Ilya,
A few months ago we discussed the missing functionality of vports offloading 
under user space bridges.
Commit [1] was added to explicitly avoid installing userspace vport flows (to 
avoid confusion with the vport kernel).
When reverting commit [1] - we are left with this missing functionality.
Applying your suggested patch netdev_vport_has_system_port() API)  does not 
seem to work.
It always fails when it tries to look for the interface name (say "vxlan10") in the 
system list where vxlan interfaces are renamed (say "vxlan_sys_4789").


Hi Ophir,

No-one ever tried to run this code, so I'm not surprised.
I could take a look at this issue on next week.

Best regards, Ilya Maximets.



You wrote:

On master, after applying dynamic flow API patch-set, we'll be able to use
patch that I suggested in previous mail to properly handle this situation and
enable vport offloading.


It would be appreciated if we can resume the efforts in fixing this missing 
functionality.
Please advise on the next steps.

[1]
Commit 0da667e345 ("dpif-netdev: Forbid vport offloading attempts.")



-Original Message-
From: Ilya Maximets 
Sent: Monday, May 13, 2019 3:33 PM
To: Ophir Munk ; ovs-dev@openvswitch.org
Cc: Ian Stokes ; Flavio Leitner ;
Kevin Traynor ; Roni Bar Yanai
; Finn Christensen ; Ben Pfaff
; Simon Horman ; Olga
Shern ; Asaf Penso ; Oz
Shlomo ; Majd Dibbiny 
Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.

On 13.05.2019 14:41, Ophir Munk wrote:

Hi Ilya,


-Original Message-
From: Ilya Maximets 
Sent: Monday, May 13, 2019 12:22 PM
To: Ophir Munk ; ovs-dev@openvswitch.org
Cc: Ian Stokes ; Flavio Leitner
; Kevin Traynor ; Roni Bar
Yanai ; Finn Christensen ;

Ben

Pfaff ; Simon Horman ;

Olga

Shern ; Asaf Penso ; Oz
Shlomo 
Subject: Re: [PATCH] netdev: Dynamic per-port Flow API.

On 09.05.2019 1:59, Ophir Munk wrote:

Hi Ilya,


...

I am recreating this scenario in my setup.


I see. Yes, you're right. And I think that this case could be
reproduced on current master without any patches. So, it's a bug that we

need to fix.

Otherwise userspace datapath will try to offload its flows to the
unrelated system interfaces. For now we could just forbid offloading
to vports in dpif- netdev. I'll prepare the patch. This fix also should be

backported.




We need a way to enable offloading of vports in dpif-netdev. So if you
can address It with your new patch - that would be appreciated.


I'm suggesting disabling because it'll be easy to backport to older branches.
On master, after applying dynamic flow API patch-set, we'll be able to use
patch that I suggested in previous mail to properly handle this situation and
enable vport offloading.







This patch series is important but one of its main goals is to
enable different

offloads for different vports under Kernel/userspace.

Can you please advise how this goal can be achieved?


It looks like there is no way to distinguish system and userspace
vports by looking only on netdev. We should check with dpif type.


Agreed. But then looking at your sample patch below you are basing
your decision on the existence of system port (and not on dpif type).


Right now 'ifindex' is used for checking, so this is equally racy.


  I think it is risky because
you are dependent on the order of operations: Creation of the system port

versus checking for system port existence.

Which was first (under different scenarios)?


Actually, port creation and checking will always happen in the same thread
context, so creation and checking will be serialized. Kernel should guarantee
the port existence. No races here.



What do you say about the following patch:

diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c int
netdev_ports_insert(struct netdev *netdev, const struct dpif_class
*dpif_class,
  struct dpif_port *dpif_port) @@ -547,8 +555,9 @@
netdev_ports_insert(struct netdev *netdev, const struct dpif_class
*dpif_class,
  netdev_ports_hash(dpif_port->port_no, dpif_class));
  hmap_insert(_to_port, >ifindex_node, ifindex);
  ovs_mutex_unlock(_hmap_mutex);

-netdev_init_flow_api(netdev);
+netdev_init_flow_api(netdev, dpif_class->type); /* Pass class
+ type "netdev" or "syste" */

  return 0;
}

It's not a full solution. It is just a hint how to pass the dpif type ("netdev" 
or

"system")  when initializing the flow api.

We can use the dpif type when initializing vport offload.


This is, actually, the first thing I tried. However, this requires exposing the
internals of 'dpif-provider', which is bad from the point of code architecture.

For the below patch code, I missed actual port requesting from the datapath.
Following incremental needed:

---
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index
82943c102..1c88a91ed 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -124,13 +124,28 @@ netdev_vport_class_get_dpif_port(const struct
netdev_class 

[ovs-dev] [PATCH RFC] dpif-netdev: Add ovs-appctl dpif-netdev/subtable-show.

2019-11-08 Thread Emma Finn
Add an ovs-appctl command to iterate through the dpcls
and for each subtable output the miniflow bits for any
existing table.

$ ovs-appctl dpif-netdev/subatable-show
pmd thread numa_id 0
  dpcls port 2:
subtable:
  unit_0: 4 (0x4)
  unit_1: 2 (0x2)
pmd thread numa_id 1
  dpcls port 3:
subtable:
  unit_0: 4 (0x3)
  unit_1: 2 (0x5)

Signed-off-by: Emma Finn 
---
 NEWS|  2 ++
 lib/dpif-netdev-unixctl.man |  4 
 lib/dpif-netdev.c   | 54 -
 3 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 88b8189..c01c100 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,8 @@ Post-v2.12.0
if supported by libbpf.
  * Add option to enable, disable and query TCP sequence checking in
conntrack.
+ * New "ovs-appctl dpif-netdev/subtable-show" command for userspace
+   datapath to show subtable miniflow bits.
 
 v2.12.0 - 03 Sep 2019
 -
diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man
index 6c54f6f..c443465 100644
--- a/lib/dpif-netdev-unixctl.man
+++ b/lib/dpif-netdev-unixctl.man
@@ -217,3 +217,7 @@ with port names, which this thread polls.
 .
 .IP "\fBdpif-netdev/pmd-rxq-rebalance\fR [\fIdp\fR]"
 Reassigns rxqs to pmds in the datapath \fIdp\fR based on their current usage.
+.
+.IP "\fBdpif-netdev/subtable-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]"
+For one or all pmd threads of the datapath \fIdp\fR show the list of miniflow
+bits for each subtable in the datapath classifier.
\ No newline at end of file
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4720ba1..7ae422e 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -857,6 +857,8 @@ static inline bool
 pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd);
 static void queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd,
   struct dp_netdev_flow *flow);
+static void pmd_info_show_subtable(struct ds *reply,
+   struct dp_netdev_pmd_thread *pmd);
 
 static void
 emc_cache_init(struct emc_cache *flow_cache)
@@ -979,6 +981,7 @@ enum pmd_info_type {
 PMD_INFO_CLEAR_STATS, /* Set the cycles count to 0. */
 PMD_INFO_SHOW_RXQ,/* Show poll lists of pmd threads. */
 PMD_INFO_PERF_SHOW,   /* Show pmd performance details. */
+PMD_INFO_SHOW_SUBTABLE, /* Show subtable miniflow bits. */
 };
 
 static void
@@ -1334,6 +1337,8 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, 
const char *argv[],
 pmd_info_show_stats(, pmd);
 } else if (type == PMD_INFO_PERF_SHOW) {
 pmd_info_show_perf(, pmd, (struct pmd_perf_params *)aux);
+} else if (type == PMD_INFO_SHOW_SUBTABLE) {
+pmd_info_show_subtable(, pmd);
 }
 }
 free(pmd_list);
@@ -1391,7 +1396,8 @@ dpif_netdev_init(void)
 {
 static enum pmd_info_type show_aux = PMD_INFO_SHOW_STATS,
   clear_aux = PMD_INFO_CLEAR_STATS,
-  poll_aux = PMD_INFO_SHOW_RXQ;
+  poll_aux = PMD_INFO_SHOW_RXQ,
+  subtable_aux = PMD_INFO_SHOW_SUBTABLE;
 
 unixctl_command_register("dpif-netdev/pmd-stats-show", "[-pmd core] [dp]",
  0, 3, dpif_netdev_pmd_info,
@@ -1416,6 +1422,9 @@ dpif_netdev_init(void)
  "[-us usec] [-q qlen]",
  0, 10, pmd_perf_log_set_cmd,
  NULL);
+unixctl_command_register("dpif-netdev/subtable-show", "[-pmd core] [dp]",
+ 0, 3, dpif_netdev_pmd_info,
+ (void *)_aux);
 return 0;
 }
 
@@ -8036,3 +8045,46 @@ dpcls_lookup(struct dpcls *cls, const struct 
netdev_flow_key *keys[],
 }
 return false;
 }
+
+/* Iterate through all dpcls instances and dump out all subtable
+ * miniflow bits. */
+static void
+pmd_info_show_subtable(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
+{
+if (pmd->core_id != NON_PMD_CORE_ID) {
+struct rxq_poll *list;
+size_t n_rxq;
+struct dpcls *cls;
+struct dpcls_subtable *subtable;
+
+ovs_mutex_lock(>port_mutex);
+sorted_poll_list(pmd, , _rxq);
+for (int i = 0; i < n_rxq; i++) {
+struct dp_netdev_rxq *rxq = list[i].rxq;
+odp_port_t in_port = rxq->port->port_no;
+cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
+if (!cls) {
+continue;
+} else {
+struct pvector *pvec = >subtables;
+
+PVECTOR_FOR_EACH (subtable, pvec) {
+ds_put_format(reply, "pmd thread numa_id %d "
+  "core_id %u: \n",
+  pmd->numa_id, pmd->core_id);
+ds_put_format(reply, "  dpcls port %d: \n",cls->in_port);
+

Re: [ovs-dev] [PATCH ovn v5] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-08 Thread Dumitru Ceara
On Thu, Nov 7, 2019 at 7:02 PM Dumitru Ceara  wrote:
>
> On Thu, Nov 7, 2019 at 6:02 PM Han Zhou  wrote:
> >
> >
> >
> > On Thu, Nov 7, 2019 at 7:43 AM Dumitru Ceara  wrote:
> > >
> > > On Thu, Nov 7, 2019 at 3:51 AM Han Zhou  wrote:
> > > >
> > > >
> > > >
> > > > On Mon, Nov 4, 2019 at 11:32 AM Numan Siddique  wrote:
> > > > >
> > > > > On Mon, Nov 4, 2019 at 8:37 PM Dumitru Ceara  
> > > > > wrote:
> > > > > >
> > > > > > ARP request and ND NS packets for router owned IPs were being
> > > > > > flooded in the complete L2 domain (using the MC_FLOOD multicast 
> > > > > > group).
> > > > > > However this creates a scaling issue in scenarios where aggregation
> > > > > > logical switches are connected to more logical routers (~350). The
> > > > > > logical pipelines of all routers would have to be executed before 
> > > > > > the
> > > > > > packet is finally replied to by a single router, the owner of the IP
> > > > > > address.
> > > > > >
> > > > > > This commit limits the broadcast domain by bypassing the L2 Lookup 
> > > > > > stage
> > > > > > for ARP requests that will be replied by a single router. The 
> > > > > > packets
> > > > > > are still flooded in the L2 domain but not on any of the other patch
> > > > > > ports towards other routers connected to the switch. This restricted
> > > > > > flooding is done by using a new multicast group (MC_ARP_ND_FLOOD).
> > > > > >
> > > > > > IPs that are owned by the routers and for which this fix applies 
> > > > > > are:
> > > > > > - IP addresses configured on the router ports.
> > > > > > - VIPs.
> > > > > > - NAT IPs.
> > > > > >
> > > > > > This commit also fixes function get_router_load_balancer_ips() which
> > > > > > was incorrectly returning a single address_family even though the
> > > > > > IP set could contain a mix of IPv4 and IPv6 addresses.
> > > > > >
> > > > > > Reported-at: https://bugzilla.redhat.com/1756945
> > > > > > Reported-by: Anil Venkata 
> > > > > > Signed-off-by: Dumitru Ceara 
> > > > >
> > > > > Thanks Dumitru, for addressing the review comments.
> > > > >
> > > > > Acked-by: Numan Siddique 
> > > > >
> > > > > Han, if you can take a look in this patch and provide your comments,
> > > > > that would be great.
> > > > >
> > >
> > > Hi Han,
> > >
> > > >
> > > > Sorry for delayed response :(
> > >
> > > No worries and thanks for reviewing this change.
> > >
> > > >
> > > > The patch looks good to me except:
> > > >
> > > > 1. It changes the behavior that when an ARP request for a LRP's IP is 
> > > > coming to the logical switch, other routers will no longer learn the 
> > > > MAC-IP binding of the ARP sender. This has been discussed and I tend to 
> > > > agree with Dumitru that it should not cause real issue. I think it just 
> > > > worth to be documented clearly in the ovn-architecture, probably in the 
> > > > section: Logical Routers and Logical Patch Ports, because this is 
> > > > something different from a traditional switch's behavior, and network 
> > > > administrators may get suprised.
> > >
> > > Ack, I'll add this in v6.
> > >
> > > >
> > > > 2. Since we are anyway changing the behavior of ARP request handling 
> > > > and bypassing logical router ports, and we think it should not cause 
> > > > real problem, then I wonder why adding the MC_ARP_ND group to still 
> > > > flood to the non-router ports. Is it really useful? Maybe it is not a 
> > > > big deal, but I think the extra MC group would add some performance 
> > > > cost and it is almost redundant with the regular MC_FLOOD except that 
> > > > there is no router ports in it - it is a lot of redundancy considering 
> > > > that most LSPs are regular VIFs in typical large scale environments. I 
> > > > would suggest to simplify this unless there is clear concerns of not 
> > > > flooding to other ports.
> > >
> > > If I skip flooding to non-router ports the following test fails
> > > because one of the things checked by the test is that we flood
> > > broadcasts (including ARPs) in the broadcast domain:
> > >  36: ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR   FAILED 
> > > (ovs-macros.at:220)
> > >
> > > So my concern was that people might expect the ARP requests to be
> > > flooded within the L2 broadcast domain.
> > > However, we could add configuration knob to change the behavior and
> > > only flood them on localnet ports. Like this we could maintain the
> > > current L2 flooding but in deployments where this is not necessary the
> > > users could disable the flooding to VIF ports and this would remove
> > > the MC_ARP_ND group unless there's a localnet port in the datapath.
> > >
> > > What do you think?
> > >
> >
> > Adding configuration knob is a valid option, but I hoped we could simplify 
> > the change instead of making it even more complex :)
> >
> > My point is that since we changed the behavior, it might be better to 
> > change it with a more straightforward philosophy: for ARP request for an IP 
> > that is owned by a LRP, it is not 

[ovs-dev] [PATCH ovn v6] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-08 Thread Dumitru Ceara
ARP request and ND NS packets for router owned IPs were being
flooded in the complete L2 domain (using the MC_FLOOD multicast group).
However this creates a scaling issue in scenarios where aggregation
logical switches are connected to more logical routers (~350). The
logical pipelines of all routers would have to be executed before the
packet is finally replied to by a single router, the owner of the IP
address.

This commit limits the broadcast domain by bypassing the L2 Lookup stage
for ARP requests that will be replied by a single router. The packets
are forwarded only to the router port that owns the target IP address.

IPs that are owned by the routers and for which this fix applies are:
- IP addresses configured on the router ports.
- VIPs.
- NAT IPs.

This commit also fixes function get_router_load_balancer_ips() which
was incorrectly returning a single address_family even though the
IP set could contain a mix of IPv4 and IPv6 addresses.

Reported-at: https://bugzilla.redhat.com/1756945
Reported-by: Anil Venkata 
Signed-off-by: Dumitru Ceara 

---
v6:
- Address Han's comments:
  - remove flooding of ARPs targeting OVN owned IP addresses.
  - update ovn-architecture documentation.
  - rename ARP handling functions.
  - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to take into
account the new way of forwarding ARPs.
- Also, properly deal with ARP packets on VLAN-backed networks.
v5: Address Numan's comments: update comments & make autotest more
robust.
v4: Rebase.
v3: Properly deal with VXLAN traffic. Address review comments from
Numan (add autotests). Fix function get_router_load_balancer_ips.
Rebase -> deal with IPv6 NAT too.
v2: Move ARP broadcast domain limiting to table S_SWITCH_IN_L2_LKUP to
address localnet ports too.
---
 controller/physical.c|   2 +
 include/ovn/logical-fields.h |   4 +
 northd/ovn-northd.8.xml  |  16 ++
 northd/ovn-northd.c  | 337 ---
 ovn-architecture.7.xml   |  19 +++
 tests/ovn.at | 307 +--
 6 files changed, 591 insertions(+), 94 deletions(-)

diff --git a/controller/physical.c b/controller/physical.c
index 500d419..751dbbf 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -567,6 +567,7 @@ put_replace_chassis_mac_flows(const struct simap *ct_zones,
 
 if (tag) {
 ofpact_put_STRIP_VLAN(ofpacts_p);
+put_load(1, MFF_LOG_FLAGS, MLF_RCV_FROM_VLAN_BIT, 1, ofpacts_p);
 }
 load_logical_ingress_metadata(localnet_port, _ids, ofpacts_p);
 replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p);
@@ -1124,6 +1125,7 @@ consider_port_binding(struct ovsdb_idl_index 
*sbrec_port_binding_by_name,
  ofpacts_p);
 }
 ofpact_put_STRIP_VLAN(ofpacts_p);
+put_load(1, MFF_LOG_FLAGS, MLF_RCV_FROM_VLAN_BIT, 1, ofpacts_p);
 }
 
 /* Remember the size with just strip vlan added so far,
diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
index 9b7c34f..15e0d1e 100644
--- a/include/ovn/logical-fields.h
+++ b/include/ovn/logical-fields.h
@@ -57,6 +57,7 @@ enum mff_log_flags_bits {
 MLF_LOCAL_ONLY_BIT = 4,
 MLF_NESTED_CONTAINER_BIT = 5,
 MLF_LOOKUP_MAC_BIT = 6,
+MLF_RCV_FROM_VLAN_BIT = 7,
 };
 
 /* MFF_LOG_FLAGS_REG flag assignments */
@@ -88,6 +89,9 @@ enum mff_log_flags {
 
 /* Indicate that the lookup in the mac binding table was successful. */
 MLF_LOOKUP_MAC = (1 << MLF_LOOKUP_MAC_BIT),
+
+/* Indicate that a packet was received on a VLAN backed network. */
+MLF_RCV_FROM_VLAN = (1 << MLF_RCV_FROM_VLAN_BIT),
 };
 
 /* OVN logical fields
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 0a33dcd..6fbb3ab 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1005,6 +1005,22 @@ output;
   
 
   
+Priority-80 flows for each port connected to a logical router
+matching self originated GARP/ARP request/ND packets. These packets
+are flooded to the MC_FLOOD which contains all logical
+ports.
+  
+
+  
+Priority-75 flows for each IP address/VIP/NAT address owned by a
+router port connected to the switch. These flows match ARP requests
+and ND packets for the specific IP addresses.  Matched packets are
+forwarded in the L3 domain only to the router that owns the IP
+address and flooded in the L2 domain on all ports except patch
+ports connected to logical routers.
+  
+
+  
 A priority-70 flow that outputs all packets with an Ethernet broadcast
 or multicast eth.dst to the MC_FLOOD
 multicast group.
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 2f0f501..9de3d20 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -210,6 +210,9 @@ enum ovn_stage {
 #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[4]"
 

[ovs-dev] [PATCHv4] netdev-afxdp: Enable loading XDP program.

2019-11-08 Thread William Tu
Now netdev-afxdp always forwards all packets to userspace because
it is using libbpf's default XDP program, see 'xsk_load_xdp_prog'.
There are some cases when users want to keep packets in kernel instead
of sending to userspace, for example, management traffic such as SSH
should be processed in kernel.

The patch enables loading the user-provide XDP program by doing
  $ovs-vsctl -- set int afxdp-p0 options:xdp-obj=

So users can implement their filtering logic or traffic steering idea
in their XDP program, and rest of the traffic passes to AF_XDP socket
handled by OVS.

Signed-off-by: William Tu 
---
v4:
Feedbacks from Eelco.
- First load the program, then configure xsk.
  Let API take care of xdp prog and map loading, don't set
  XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD.
- When loading custom xdp, need to close(prog_fd) and close(map_fd)
  to release the resources
- make sure prog and map is unloaded by bpftool.
- update doc, afxdp.rst
- Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/608986781

v3:
Feedbacks from Eelco.
- keep using xdpobj not xdp-obj (because we alread use xdpmode)
  or we change both to xdp-obj and xdp-mode?
- log a info message when using external program for better debugging
- combine some failure messages
- update doc
NEW:
- add options:xdpobj=__default__, to set back to libbpf default prog
- Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/606153231

v2:
A couple fixes and remove RFC
---
 Documentation/intro/install/afxdp.rst |  59 +
 lib/netdev-afxdp.c| 121 +++---
 lib/netdev-linux-private.h|   4 ++
 3 files changed, 176 insertions(+), 8 deletions(-)

diff --git a/Documentation/intro/install/afxdp.rst 
b/Documentation/intro/install/afxdp.rst
index a136db0c950a..d95a85f39035 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -273,6 +273,65 @@ Or, use OVS pmd tool::
   ovs-appctl dpif-netdev/pmd-stats-show
 
 
+Loading Custom XDP Program
+--
+By defailt, netdev-afxdp always forwards all packets to userspace because
+it is using libbpf's default XDP program. There are some cases when users
+want to keep packets in kernel instead of sending to userspace, for example,
+management traffic such as SSH should be processed in kernel. This can be
+done by loading the user-provided XDP program::
+
+  ovs-vsctl -- set int afxdp-p0 options:xdpobj=
+
+So users can implement their filtering logic or traffic steering idea
+in their XDP program, and rest of the traffic passes to AF_XDP socket
+handled by OVS. To set it back to default, use::
+
+  ovs-vsctl -- set int afxdp-p0 options:xdpobj=__default__
+
+Below is a sample C program compiled under kernel's samples/bpf/.
+
+.. code-block:: c
+
+  #include 
+  #include "bpf_helpers.h"
+
+  #if LINUX_VERSION_CODE < KERNEL_VERSION(5,3,0)
+  /* Kernel version before 5.3 needed an additional map */
+  struct bpf_map_def SEC("maps") qidconf_map = {
+  .type = BPF_MAP_TYPE_ARRAY,
+  .key_size = sizeof(int),
+  .value_size = sizeof(int),
+  .max_entries = 64,
+  };
+  #endif
+
+  /* OVS will associate map 'xsks_map' to xsk socket. */
+  struct bpf_map_def SEC("maps") xsks_map = {
+  .type = BPF_MAP_TYPE_XSKMAP,
+  .key_size = sizeof(int),
+  .value_size = sizeof(int),
+  .max_entries = 32,
+  };
+
+  SEC("xdp_sock")
+  int xdp_sock_prog(struct xdp_md *ctx)
+  {
+  int index = ctx->rx_queue_index;
+
+  /* Customized by user.
+   * For example
+   * 1) filter out all SSH traffic and return XDP_PASS
+   *for kernel to process.
+   * 2) Drop unwanted packet by returning XDP_DROP.
+   */
+
+  /* Rest of packets goes to AF_XDP. */
+  return bpf_redirect_map(_map, index, 0);
+  }
+  char _license[] SEC("license") = "GPL";
+
+
 Example Script
 --
 
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index af654d498a88..853eeb8a8dbe 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -21,6 +21,7 @@
 #include "netdev-afxdp.h"
 #include "netdev-afxdp-pool.h"
 
+#include 
 #include 
 #include 
 #include 
@@ -30,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -88,9 +90,12 @@ BUILD_ASSERT_DECL(PROD_NUM_DESCS == CONS_NUM_DESCS);
 
 #define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base))
 
+#define LIBBPF_XDP_PROGRAM "__default__"
+
 static struct xsk_socket_info *xsk_configure(int ifindex, int xdp_queue_id,
  int mode, bool use_need_wakeup);
-static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode);
+static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode,
+   int prog_fd, int map_fd);
 static void xsk_destroy(struct xsk_socket_info *xsk);
 static int xsk_configure_all(struct netdev *netdev);
 

Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.

2019-11-08 Thread William Tu
On Thu, Nov 07, 2019 at 12:36:33PM +0100, Ilya Maximets wrote:
> Until now there was only two options for XDP mode in OVS: SKB or DRV.
> i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.
> 
> Devices like 'veth' interfaces in Linux supports native XDP, but
> doesn't support zero-copy mode.  This case can not be covered by
> existing API and we have to use slower generic XDP for such devices.
> There are few more issues, e.g. TCP is not supported in generic XDP
> mode for veth interfaces due to kernel limitations, however it is
> supported in native mode.
> 
> This change introduces ability to use native XDP without zero-copy
> along with best-effort configuration option that enabled by default.
> In best-effort case OVS will sequentially try different modes starting
> from the fastest one and will choose the first acceptable for current
> interface.  This will guarantee the best possible performance.
> 
> If user will want to choose specific mode, it's still possible by
> setting the 'options:xdp-mode'.
> 
> This change additionally changes the API by renaming the configuration
> knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
> themselves to be more user-friendly.
> 
> The full list of currently supported modes:
>   * native-with-zerocopy - former DRV
>   * native   - new one, DRV without zero-copy
>   * generic  - former SKB
>   * best-effort  - new one, chooses the best available from
>3 above modes

Since we are renaming the mode, in doc, should we tell user the mapping
of these mode to kernel AF_XDP's mode?
So let user know generic mode in OVS  = generic SKB in kernel,
native mode in OVS = native mode without zc...

> 
> Since 'best-effort' is a default mode, users will not need to
> explicitely set 'xdp-mode' in most cases.
> 
> TCP related tests enabled back in system afxdp testsuite, because
> 'best-effort' will choose 'native' mode for veth interfaces
> and this mode has no issues with TCP.
> 
> Signed-off-by: Ilya Maximets 
> ---

LGTM.
Acked-by: William Tu 

I look through the patch and everything looks good.
Thanks for fixing the documentation.
'make check-afxdp' also pass the previous failed cases due to TCP.

It would also be good if this 'best-effort' mode is supported in libbpf.
I think other users of af_xdp also have the same configuration headache.

Regards,
William

> 
> With this patch I modified the user-visible API, but I think it's OK
> since it's still an experimental netdev.  Comments are welcome.
> 
> Version 2:
>   * Rebased on current master.
> 
>  Documentation/intro/install/afxdp.rst |  54 ---
>  NEWS  |  12 +-
>  lib/netdev-afxdp.c| 223 --
>  lib/netdev-afxdp.h|   9 ++
>  lib/netdev-linux-private.h|   8 +-
>  tests/system-afxdp-macros.at  |   7 -
>  vswitchd/vswitch.xml  |  38 +++--
>  7 files changed, 227 insertions(+), 124 deletions(-)
> 
> diff --git a/Documentation/intro/install/afxdp.rst 
> b/Documentation/intro/install/afxdp.rst
> index a136db0c9..937770ad0 100644
> --- a/Documentation/intro/install/afxdp.rst
> +++ b/Documentation/intro/install/afxdp.rst
> @@ -153,9 +153,8 @@ To kick start end-to-end autotesting::
>make check-afxdp TESTSUITEFLAGS='1'
>  
>  .. note::
> -   Not all test cases pass at this time. Currenly all TCP related
> -   tests, ex: using wget or http, are skipped due to XDP limitations
> -   on veth. cvlan test is also skipped.
> +   Not all test cases pass at this time. Currenly all cvlan tests are skipped
> +   due to kernel issues.
>  
>  If a test case fails, check the log at::
>  
> @@ -177,33 +176,35 @@ in :doc:`general`::
>ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
>  
>  Make sure your device driver support AF_XDP, netdev-afxdp supports
> -the following additional options (see man ovs-vswitchd.conf.db for
> +the following additional options (see ``man ovs-vswitchd.conf.db`` for
>  more details):
>  
> - * **xdpmode**: use "drv" for driver mode, or "skb" for skb mode.
> + * ``xdp-mode``: ``best-effort``, ``native-with-zerocopy``,
> +   ``native`` or ``generic``.  Defaults to ``best-effort``, i.e. best of
> +   supported modes, so in most cases you don't need to change it.
>  
> - * **use-need-wakeup**: default "true" if libbpf supports it, otherwise 
> false.
> + * ``use-need-wakeup``: default ``true`` if libbpf supports it,
> +   otherwise ``false``.
>  
>  For example, to use 1 PMD (on core 4) on 1 queue (queue 0) device,
> -configure these options: **pmd-cpu-mask, pmd-rxq-affinity, and n_rxq**.
> -The **xdpmode** can be "drv" or "skb"::
> +configure these options: ``pmd-cpu-mask``, ``pmd-rxq-affinity``, and
> +``n_rxq``::
>  
>ethtool -L enp2s0 combined 1
>ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
>ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \
> 

Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.

2019-11-08 Thread William Tu
On Thu, Nov 07, 2019 at 02:28:18PM +0100, Eelco Chaudron wrote:
> 
> 
> On 7 Nov 2019, at 14:21, Ilya Maximets wrote:
> 
> >On 07.11.2019 13:39, Eelco Chaudron wrote:
> >>
> >>
> >>On 7 Nov 2019, at 12:36, Ilya Maximets wrote:
> >>
> >>>Until now there was only two options for XDP mode in OVS: SKB or DRV.
> >>>i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.
> >>>
> >>>Devices like 'veth' interfaces in Linux supports native XDP, but
> >>>doesn't support zero-copy mode.  This case can not be covered by
> >>>existing API and we have to use slower generic XDP for such devices.
> >>>There are few more issues, e.g. TCP is not supported in generic XDP
> >>>mode for veth interfaces due to kernel limitations, however it is
> >>>supported in native mode.
> >>>
> >>>This change introduces ability to use native XDP without zero-copy
> >>>along with best-effort configuration option that enabled by default.
> >>>In best-effort case OVS will sequentially try different modes starting
> >>>from the fastest one and will choose the first acceptable for current
> >>>interface.  This will guarantee the best possible performance.
> >>>
> >>>If user will want to choose specific mode, it's still possible by
> >>>setting the 'options:xdp-mode'.
> >>>
> >>>This change additionally changes the API by renaming the configuration
> >>>knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
> >>>themselves to be more user-friendly.
> >>>
> >>>The full list of currently supported modes:
> >>>  * native-with-zerocopy - former DRV
> >>>  * native   - new one, DRV without zero-copy
> >>>  * generic  - former SKB
> >>>  * best-effort  - new one, chooses the best available from
> >>>   3 above modes
> >>>
> >>>Since 'best-effort' is a default mode, users will not need to
> >>>explicitely set 'xdp-mode' in most cases.
> >>>
> >>>TCP related tests enabled back in system afxdp testsuite, because
> >>>'best-effort' will choose 'native' mode for veth interfaces
> >>>and this mode has no issues with TCP.
> >>>
> >>>Signed-off-by: Ilya Maximets 
> >>
> >>No review, I was running some performance tests for the OVS conference
> >>presentation so quickly tried this patch (assuming performance would
> >>increase :)…
> >>
> >>So with the native option (default for tap) I see a decrease, :(,  in
> >>performance over the skb mode (this is with my usual ovs_perf PVP test
> >>setup).
> >>
> >>With the patch applied, and default configuration
> >>(xdp-mode-in-use=native-with-zerocopy for my tap):
> >
> >Hmm. tap supports zero-copy? Interesting.

No, tap doesn't support zero-copy mode.

> >Have you tried forcing 'native' mode without zero-copy?
> >Maybe it will make sense to enable/disable need-wakeup feature.
> 
> Oops, wrong cut/paste: xdp-mode-in-use=native
> 
> All my the tests where with use-need-wakeup=true
> 
>   port 2: tapVM (afxdp: n_rxq=1, use-need-wakeup=true, xdp-mode=best-effort,
> xdp-mode-in-use=native)
> 
> >
> >>
> >>"Physical to Virtual to Physical test, L3 flows[port redirect]"
> >>, Packet size
> >>Number of flows,  64
> >>    1, 711581
> >>  100, 673152
> >>1000, 617733
> >>
> >>Here you will even see a performance drop with multiple IP flows (this
> >>is a single queue config).
> >
> >This is strange..

One explanation might be that
For using tap with native mode, OVS is sending XDP frame into the
kernel, but since tap device eventually has to create an skb and
forward to vhost_net then QEMU, so anyway we are having skb overhead.

On the other hand
For using tap with skb mode, there might be more optimization exist
in kernel, some batch allocation and batch rx/tx (See driver/net/tun.c)

--William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds

2019-11-08 Thread Ilya Maximets

On 08.11.2019 13:12, Numan Siddique wrote:

Hi Ilya,


Hi Numan,

Comments inline.

Best regards, Ilya Maximets.



If you get some time, could you please take a look at this patch ?

Thanks
Numan


On Fri, Nov 8, 2019 at 4:13 PM  wrote:


From: Numan Siddique 

The below failure is seen


checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3
checking where Python six library is available... configure: error: Missing 
Python six library.
The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1.


This patch fixes it.

Signed-off-by: Numan Siddique 
---
  .travis/osx-prepare.sh | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh
index 4725fd829..7f639a62e 100755
--- a/.travis/osx-prepare.sh
+++ b/.travis/osx-prepare.sh
@@ -1,7 +1,5 @@
  #!/bin/bash
  set -ev
-pip2 install --user six
-pip2 install --user --upgrade docutils
+pip3 install --user six
+pip3 install --user --upgrade docutils


This is required because OVS requires python3 now.
This might make sense to point to that fact in commit-message.





This empty line should be removed too as it is the last in a file.


-brew update || true
-brew uninstall libtool && brew install libtool || true


This part is just an optimization. Doesn't really related to this fix.
Some details in a corresponding OVS commit:
3bfc9c1c30d5 ("travis: Drop OSX workarounds.")

I'm OK with the change in general.  It might be better to mention
in commit message why libtool update was also removed.

Otherwise,
Acked-by: Ilya Maximets 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds

2019-11-08 Thread Numan Siddique
Hi Ilya,

If you get some time, could you please take a look at this patch ?

Thanks
Numan


On Fri, Nov 8, 2019 at 4:13 PM  wrote:
>
> From: Numan Siddique 
>
> The below failure is seen
>
> 
> checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3
> checking where Python six library is available... configure: error: Missing 
> Python six library.
> The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1.
> 
>
> This patch fixes it.
>
> Signed-off-by: Numan Siddique 
> ---
>  .travis/osx-prepare.sh | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh
> index 4725fd829..7f639a62e 100755
> --- a/.travis/osx-prepare.sh
> +++ b/.travis/osx-prepare.sh
> @@ -1,7 +1,5 @@
>  #!/bin/bash
>  set -ev
> -pip2 install --user six
> -pip2 install --user --upgrade docutils
> +pip3 install --user six
> +pip3 install --user --upgrade docutils
>
> -brew update || true
> -brew uninstall libtool && brew install libtool || true
> --
> 2.23.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 3/3] ovn-detrace: Add support for other types of SB cookies.

2019-11-08 Thread Dumitru Ceara
Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow
translations.") added cookies for Port_Binding, Mac_Binding,
Multicast_Group, Chassis records to the OpenfFlow entries they
generate.

Add support for parsing such cookies in ovn-detrace too.
Also, refactor ovn-detrace to allow a more generic way of defining
cookie handlers.

Signed-off-by: Dumitru Ceara 
---
 utilities/ovn-detrace.in |  166 --
 1 file changed, 117 insertions(+), 49 deletions(-)

diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
index 34b6b0e..79c6d3b 100755
--- a/utilities/ovn-detrace.in
+++ b/utilities/ovn-detrace.in
@@ -98,48 +98,112 @@ class OVSDB(object):
 def find_row_by_partial_uuid(self, table_name, value):
 return self._find_row(table_name, lambda row: value in str(row.uuid))
 
-
-def get_lflow_from_cookie(ovnsb_db, cookie):
-return ovnsb_db.find_row_by_partial_uuid('Logical_Flow', cookie)
-
-
-def print_lflow(lflow, prefix):
-ldp_uuid = lflow.logical_datapath.uuid
-ldp_name = str(lflow.logical_datapath.external_ids.get('name'))
-
-print '%sLogical datapath: "%s" (%s) [%s]' % (prefix,
-  ldp_name,
-  ldp_uuid,
-  lflow.pipeline)
-print "%sLogical flow: table=%s (%s), priority=%s, " \
-  "match=(%s), actions=(%s)" % (prefix,
-lflow.table_id,
-lflow.external_ids.get('stage-name'),
-lflow.priority,
-str(lflow.match).strip('"'),
-str(lflow.actions).strip('"'))
-
-
-def print_lflow_nb_hint(lflow, prefix, ovnnb_db):
-external_ids = lflow.external_ids
-if external_ids.get('stage-name') in ['ls_in_acl',
-  'ls_out_acl']:
-acl_hint = external_ids.get('stage-hint')
-if not acl_hint:
-return
-acl = ovnnb_db.find_row_by_partial_uuid('ACL', acl_hint)
-if not acl:
+class CookieHandler(object):
+def __init__(self, db, table):
+self._db = db
+self._table = table
+
+def get_record(self, cookie):
+return self._db.find_row_by_partial_uuid(self._table, cookie)
+
+def print_record(self, record, prefix):
+pass
+
+def print_hint(self, record, prefix, db):
+pass
+
+def datapath_str(datapath):
+return '"%s" (%s)' % (str(datapath.external_ids.get('name')),
+  datapath.uuid)
+
+def chassis_str(chassis):
+if len(chassis) == 0:
+return ''
+ch = chassis[0]
+return 'chassis-name "%s", chassis-str "%s"' % (ch.name, ch.hostname)
+
+class LogicalFlowHandler(CookieHandler):
+def __init__(self, ovnsb_db):
+super(LogicalFlowHandler, self).__init__(ovnsb_db, 'Logical_Flow')
+
+def print_record(self, lflow, prefix):
+print('%sLogical datapath: %s [%s]' %
+(prefix, datapath_str(lflow.logical_datapath), lflow.pipeline))
+print('%sLogical flow: table=%s (%s), priority=%s, '
+  'match=(%s), actions=(%s)' %
+(prefix, lflow.table_id, lflow.external_ids.get('stage-name'),
+ lflow.priority,
+ str(lflow.match).strip('"'),
+ str(lflow.actions).strip('"')))
+
+def print_hint(self, lflow, prefix, ovnnb_db):
+external_ids = lflow.external_ids
+if external_ids.get('stage-name') in ['ls_in_acl',
+  'ls_out_acl']:
+acl_hint = external_ids.get('stage-hint')
+if not acl_hint:
+return
+acl = ovnnb_db.find_row_by_partial_uuid('ACL', acl_hint)
+if not acl:
+return
+output = '%sACL: %s, priority=%s, ' \
+ 'match=(%s), %s' % (prefix,
+ acl.direction,
+ acl.priority,
+ acl.match.strip('"'),
+ acl.action)
+if acl.log:
+output += ' (log)'
+print(output)
+
+class PortBindingHandler(CookieHandler):
+def __init__(self, ovnsb_db):
+super(PortBindingHandler, self).__init__(ovnsb_db, 'Port_Binding')
+
+def print_record(self, pb, prefix):
+ldp_uuid = pb.datapath.uuid
+ldp_name = str(pb.datapath.external_ids.get('name'))
+
+print('%sLogical datapath: %s' % (prefix, datapath_str(pb.datapath)))
+print('%sPort Binding: logical_port "%s", tunnel_key %ld, %s' %
+(prefix, pb.logical_port, pb.tunnel_key,
+ chassis_str(pb.chassis)))
+
+class MacBindingHandler(CookieHandler):
+def __init__(self, 

[ovs-dev] [PATCH ovn 1/3] ovn-detrace: Fix rundir.

2019-11-08 Thread Dumitru Ceara
After the separation of OVS and OVN rundirs, the ovn-detrace script
hasn't been updated to use the OVN rundir instead of the OVS one.

Signed-off-by: Dumitru Ceara 
---
 utilities/ovn-detrace.in |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
index c842adc..9471e37 100755
--- a/utilities/ovn-detrace.in
+++ b/utilities/ovn-detrace.in
@@ -169,16 +169,16 @@ def main():
  "(use --help for help)\n" % argv0)
 sys.exit(1)
 
-ovs_rundir = os.getenv('OVS_RUNDIR', '@RUNDIR@')
+ovn_rundir = os.getenv('OVN_RUNDIR', '@OVN_RUNDIR@')
 if not ovnsb_db:
 ovnsb_db = os.getenv('OVN_SB_DB')
 if not ovnsb_db:
-ovnsb_db = 'unix:%s/ovnsb_db.sock' % ovs_rundir
+ovnsb_db = 'unix:%s/ovnsb_db.sock' % ovn_rundir
 
 if not ovnnb_db:
 ovnnb_db = os.getenv('OVN_NB_DB')
 if not ovnnb_db:
-ovnnb_db = 'unix:%s/ovnnb_db.sock' % ovs_rundir
+ovnnb_db = 'unix:%s/ovnnb_db.sock' % ovn_rundir
 
 ovsdb_ovnsb = OVSDB(ovnsb_db, 'OVN_Southbound')
 ovsdb_ovnnb = OVSDB(ovnnb_db, 'OVN_Northbound')

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 2/3] ovn-detrace: Fix line parsing.

2019-11-08 Thread Dumitru Ceara
The script was never parsing the first cookie in the input. Also, add a
check to make sure that the cookie refers to a Logical_Flow before
trying to print the record.

Signed-off-by: Dumitru Ceara 
---
 utilities/ovn-detrace.in |   24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
index 9471e37..34b6b0e 100755
--- a/utilities/ovn-detrace.in
+++ b/utilities/ovn-detrace.in
@@ -188,22 +188,26 @@ def main():
 cookie = None
 while True:
 line = sys.stdin.readline()
+
+if line == '':
+break
+
+line = line.strip()
+
 if cookie:
 # print lflow info when the current flow block ends
-if regex_table_id.match(line) or line.strip() == '':
+if regex_table_id.match(line):
 lflow = get_lflow_from_cookie(ovsdb_ovnsb, cookie)
-print_lflow(lflow, "  * ")
-print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb)
-cookie = None
+if lflow:
+print_lflow(lflow, "  * ")
+print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb)
+cookie = None
 
-print line.strip()
-if line == "":
-break
+print line
 
 m = regex_cookie.match(line)
-if not m:
-continue
-cookie = m.group(1)
+if m:
+cookie = m.group(1)
 
 
 if __name__ == "__main__":

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 0/3] Improve ovn-detrace support for parsing OpenFlow cookies.

2019-11-08 Thread Dumitru Ceara
Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow translations.")
added support for more types of cookies (partial SB uuids) that are stored
in OpenFlow entries created by ovn-controller.

The last patch in this series implements support for parsing all these
different cookies in ovn-detrace too.

The first two patches are bug fixes required as ovn-detrace was broken
after moving OVN to its own separte rundir.

CC: Han Zhou 
Signed-off-by: Dumitru Ceara 

Dumitru Ceara (3):
  ovn-detrace: Fix rundir.
  ovn-detrace: Fix line parsing.
  ovn-detrace: Add support for other types of SB cookies.


 utilities/ovn-detrace.in |  186 --
 1 file changed, 129 insertions(+), 57 deletions(-)


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] L2 acl on the logical switch

2019-11-08 Thread Numan Siddique
Your email is really cluttered.

Could you please send again with plain text ?

Thanks
Numan


On Fri, Nov 8, 2019 at 6:22 AM venugopal iyer via dev
 wrote:
>
> [Pardon the length of the mail; have left out the ofctl flows, but if it 
> helps i can send them too]
> Assuming :logical Switch (ls1) with   
> ls1_vm1 : 02:ac:10:ff:00:11/172.16.255.11  
> ls1_vm2 : 02:ac:10:ff:00:22/172.16.255.22  
> ls1_vm3 : 02:ac:10:ff:00:33/172.16.255.33
> Looking to :have MAC ACLs between vm1 and vm2 so they can't talk to 
> each other.
> Using:# ovn-nbctl  create Address_Set name=macs 
> addresses=\"02:ac:10:ff:00:11\",\"02:ac:10:ff:00:22\"# ovn-nbctl 
> create Port_Group name=l2pg# ovn-nbctl add Port_Group l2pg ports 
> # ovn-nbctl add Port_Group l2pg ports # 
> ovn-nbctl acl-add ls1 to-lport 32767  "outport == @l2pg && eth.src == \$macs" 
> drop
> What we are seeing:  The above, by itself, will limit L3 and L2 between 
> vm1 and vm2, but not vm3 (as expected)
> lflow:  table=4 (ls_out_acl ), 
> priority=33767, match=(outport == @l2pg && eth.src == $macs), action=(/* drop 
> */)
> dpflow (sending a arbit packet of ether type 0x7a05):--   
>  
> (0),in_port(vm1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x7a05),
>  packets:0, bytes:0, used:never, actions:drop
> However, if there is a stateful rule, e.g.:# ovn-nbctl acl-add ls1 
> from-lport 2000  "inport == \"ls1-vm3\" && ip" allow-related
> Then, the behavior differs
> lflow:---  table=6 (ls_in_acl  ), 
> priority=3000 , match=(!ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 
> && (inport == "ls1-vm3" && ip)), action=(next;)  table=6 (ls_in_acl   
>), priority=3000 , match=(((ct.new && !ct.est) || (!ct.new && ct.est 
> && !ct.rpl && ct_label.blocked == 1)) && (inport == "ls1-vm3" && ip)), 
> action=(reg0[1] = 1; next;)  table=4 (ls_out_acl ), 
> priority=33767, match=((!ct.est || (ct.est && ct_label.blocked == 1)) && 
> (outport == @l2pg && eth.src == $macs)), action=(/* drop */)  table=4 
> (ls_out_acl ), priority=33767, match=(ct.est && ct_label.blocked == 0 
> && (outport == @l2pg && eth.src == $macs)), action=(ct_commit(ct_label=1/1); 
> /* drop */)
> *Note: the condition !ct.new && ct.est && !ct.rpl && ct_label.blocked == 
> 0**
> This does block L3 traffic:
> dpctl flow for ICMP-   
> recirc_id(0),in_port(vm1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8),
>  packets:6, bytes:588, used:0.864s, actions:ct(zone=1),recirc(0x1)   
> recirc_id(0x1),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(dst=02:ac:10:ff:00:22),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8),
>  packets:6, bytes:588, used:0.864s, 
> actions:ct(commit,zone=1,label=0/0x1),ct(zone=4),recirc(0x2)   
> recirc_id(0x2),in_port(vm1),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11),eth_type(0x0800),ipv4(frag=no),
>  packets:6, bytes:588, used:0.864s, actions:drop   
> recirc_id(0),in_port(vm1),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x0806),arp(sha=02:ac:10:ff:00:11),
>  packets:0, bytes:0, used:never, actions:vm2   
> recirc_id(0),in_port(vm2),ct_state(-new-est-rel-rpl-inv
 
-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:22,dst=02:ac:10:ff:00:11),eth_type(0x0806),arp(sha=02:ac:10:ff:00:22),
 packets:0, bytes:0, used:never, actions:vm1
>
> But, it doesn't block L2, see the ARP going through, without the stateful 
> rule the ARP would have been dropped too.
> Using the arbit packet:
>
> recirc_id(0),in_port(vm1),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=02:ac:10:ff:00:11,dst=02:ac:10:ff:00:22),eth_type(0x7a05),
>  packets:0, bytes:0, used:never, actions:vm2
>
>
> From what i understand it is because of the condition "!ct.new && ct.est && 
> !ct.rpl && ct_label.blocked == 0", which implies trk, I suppose, which makes 
> sense for IP packets, but probably not for non-IP packets, i.e. the  rules 
> above have "-trk". Hence, I believe it gets through
>
> Proposed changes:
> If my understanding is right (and objective to remove the inconsistency 
> between when there are stateful rules and not, w.r.t. L2 packets), we 
> couldhave the condition as:
> # git diffdiff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.cindex 
> 6c6de2afd..b0f524531 100644--- a/ovn/northd/ovn-northd.c+++ 
> b/ovn/northd/ovn-northd.c@@ -4191,10 +4191,13 @@ consider_acl(struct hmap 
> *lflows, struct ovn_datapath *od,  * depending on whether the 
> connection was previously committed  * to the connection tracker with 

Re: [ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection

2019-11-08 Thread 0-day Robot
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Committer Numan Siddique  needs to sign off.
Lines checked: 66, Warnings: 0, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection

2019-11-08 Thread Frode Nordahl
On Fri, Nov 8, 2019 at 11:56 AM Numan Siddique  wrote:
>
> On Fri, Nov 8, 2019 at 11:22 AM Frode Nordahl
>  wrote:
> >
> > Signed-off-by: Frode Nordahl 
> > Acked-by: Aliasgar Ginwala 
> > Submitted-at: https://github.com/ovn-org/ovn/pull/25
>
> I applied this patch to master.
> Sorry I didn't notice that you already had sent the patch to the ML
> and I resubmitted here - https://patchwork.ozlabs.org/patch/1191808/.

No worries, and thank you for the merge!

I'll stick to sending patches through ml/patchworks in the future.

-- 
Frode Nordahl

> Thanks
> Numan
>
> > ---
> >  .../topics/role-based-access-control.rst  |  7 ++
> >  Documentation/tutorials/ovn-rbac.rst  | 25 +++
> >  2 files changed, 32 insertions(+)
> >
> > diff --git a/Documentation/topics/role-based-access-control.rst
> > b/Documentation/topics/role-based-access-control.rst
> > index 2acd1e88b..e13e2d5dc 100644
> > --- a/Documentation/topics/role-based-access-control.rst
> > +++ b/Documentation/topics/role-based-access-control.rst
> > @@ -82,6 +82,13 @@ command:
> >
> > $ ovn-sbctl set-connection role=ovn-controller ssl:192.168.0.1:6642
> >
> > +.. note::
> > +
> > +   There is currently no pre-defined role for ovn-northd. You must 
> > configure
> > +   a separate listener on the OVN southbound database that ovn-northd can
> > +   connect to if your deployment topology require ovn-northd to connect to 
> > a
> > +   OVN southbound database instance on a remote machine.
> > +
> >  Pre-defined Roles
> >  -
> >  This section describes roles that have been defined internally by OVS/OVN.
> > diff --git a/Documentation/tutorials/ovn-rbac.rst
> > b/Documentation/tutorials/ovn-rbac.rst
> > index 22b169d6d..fc2de5d5d 100644
> > --- a/Documentation/tutorials/ovn-rbac.rst
> > +++ b/Documentation/tutorials/ovn-rbac.rst
> > @@ -132,3 +132,28 @@ Configuring RBAC
> >  /path/to/chassis_2-cert.pem /path/to/cacert.pem
> >$ ovs-vsctl set open_vswitch . \
> >  external_ids:ovn-remote=ssl:machine_3-ip:6642
> > +
> > +The OVN central control daemon and RBAC
> > +~~~
> > +
> > +The OVN central control daemon (`ovn-northd`) needs full write access to
> > +the southbound database. When you have one machine hosting the central
> > +components, `ovn-northd` can talk to the databases through a local unix
> > +socket, bypassing the `ovn-controller` RBAC configured for the listener
> > +at port '6642'. However, if you want to deploy multiple machines for
> > +hosting the central components, `ovn-northd` will require a remote
> > +connection to all of them.
> > +
> > +1. Configure the southbound database with a second SSL listener on a
> > +   separate port without RBAC enabled for use by `ovn-northd`.
> > +
> > +   In `machine_3`::
> > +
> > +  $ ovn-sbctl -- --id=@conn_uuid create Connection \
> > +  target="pssl\:16642" \
> > +  -- add  SB_Global . connections=@conn_uuid
> > +
> > +   .. note::
> > +
> > + Care should be taken to restrict access to the above mentioned port
> > + so that only trusted machines can connect to it.
> > --
> > 2.20.1
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection

2019-11-08 Thread Numan Siddique
On Fri, Nov 8, 2019 at 11:22 AM Frode Nordahl
 wrote:
>
> Signed-off-by: Frode Nordahl 
> Acked-by: Aliasgar Ginwala 
> Submitted-at: https://github.com/ovn-org/ovn/pull/25

I applied this patch to master.
Sorry I didn't notice that you already had sent the patch to the ML
and I resubmitted here - https://patchwork.ozlabs.org/patch/1191808/.

Thanks
Numan

> ---
>  .../topics/role-based-access-control.rst  |  7 ++
>  Documentation/tutorials/ovn-rbac.rst  | 25 +++
>  2 files changed, 32 insertions(+)
>
> diff --git a/Documentation/topics/role-based-access-control.rst
> b/Documentation/topics/role-based-access-control.rst
> index 2acd1e88b..e13e2d5dc 100644
> --- a/Documentation/topics/role-based-access-control.rst
> +++ b/Documentation/topics/role-based-access-control.rst
> @@ -82,6 +82,13 @@ command:
>
> $ ovn-sbctl set-connection role=ovn-controller ssl:192.168.0.1:6642
>
> +.. note::
> +
> +   There is currently no pre-defined role for ovn-northd. You must configure
> +   a separate listener on the OVN southbound database that ovn-northd can
> +   connect to if your deployment topology require ovn-northd to connect to a
> +   OVN southbound database instance on a remote machine.
> +
>  Pre-defined Roles
>  -
>  This section describes roles that have been defined internally by OVS/OVN.
> diff --git a/Documentation/tutorials/ovn-rbac.rst
> b/Documentation/tutorials/ovn-rbac.rst
> index 22b169d6d..fc2de5d5d 100644
> --- a/Documentation/tutorials/ovn-rbac.rst
> +++ b/Documentation/tutorials/ovn-rbac.rst
> @@ -132,3 +132,28 @@ Configuring RBAC
>  /path/to/chassis_2-cert.pem /path/to/cacert.pem
>$ ovs-vsctl set open_vswitch . \
>  external_ids:ovn-remote=ssl:machine_3-ip:6642
> +
> +The OVN central control daemon and RBAC
> +~~~
> +
> +The OVN central control daemon (`ovn-northd`) needs full write access to
> +the southbound database. When you have one machine hosting the central
> +components, `ovn-northd` can talk to the databases through a local unix
> +socket, bypassing the `ovn-controller` RBAC configured for the listener
> +at port '6642'. However, if you want to deploy multiple machines for
> +hosting the central components, `ovn-northd` will require a remote
> +connection to all of them.
> +
> +1. Configure the southbound database with a second SSL listener on a
> +   separate port without RBAC enabled for use by `ovn-northd`.
> +
> +   In `machine_3`::
> +
> +  $ ovn-sbctl -- --id=@conn_uuid create Connection \
> +  target="pssl\:16642" \
> +  -- add  SB_Global . connections=@conn_uuid
> +
> +   .. note::
> +
> + Care should be taken to restrict access to the above mentioned port
> + so that only trusted machines can connect to it.
> --
> 2.20.1
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] docs: Add note about RBAC and remote ovn-northd connection

2019-11-08 Thread numans
From: Frode Nordahl 

Signed-off-by: Frode Nordahl 
Acked-by: Aliasgar Ginwala 
---
 .../topics/role-based-access-control.rst  |  7 ++
 Documentation/tutorials/ovn-rbac.rst  | 25 +++
 2 files changed, 32 insertions(+)

diff --git a/Documentation/topics/role-based-access-control.rst 
b/Documentation/topics/role-based-access-control.rst
index 2acd1e88b..e13e2d5dc 100644
--- a/Documentation/topics/role-based-access-control.rst
+++ b/Documentation/topics/role-based-access-control.rst
@@ -82,6 +82,13 @@ command:
 
$ ovn-sbctl set-connection role=ovn-controller ssl:192.168.0.1:6642
 
+.. note::
+
+   There is currently no pre-defined role for ovn-northd. You must configure
+   a separate listener on the OVN southbound database that ovn-northd can
+   connect to if your deployment topology require ovn-northd to connect to a
+   OVN southbound database instance on a remote machine.
+
 Pre-defined Roles
 -
 This section describes roles that have been defined internally by OVS/OVN.
diff --git a/Documentation/tutorials/ovn-rbac.rst 
b/Documentation/tutorials/ovn-rbac.rst
index 22b169d6d..fc2de5d5d 100644
--- a/Documentation/tutorials/ovn-rbac.rst
+++ b/Documentation/tutorials/ovn-rbac.rst
@@ -132,3 +132,28 @@ Configuring RBAC
 /path/to/chassis_2-cert.pem /path/to/cacert.pem
   $ ovs-vsctl set open_vswitch . \
 external_ids:ovn-remote=ssl:machine_3-ip:6642
+
+The OVN central control daemon and RBAC
+~~~
+
+The OVN central control daemon (`ovn-northd`) needs full write access to
+the southbound database. When you have one machine hosting the central
+components, `ovn-northd` can talk to the databases through a local unix
+socket, bypassing the `ovn-controller` RBAC configured for the listener
+at port '6642'. However, if you want to deploy multiple machines for
+hosting the central components, `ovn-northd` will require a remote
+connection to all of them.
+
+1. Configure the southbound database with a second SSL listener on a
+   separate port without RBAC enabled for use by `ovn-northd`.
+
+   In `machine_3`::
+
+  $ ovn-sbctl -- --id=@conn_uuid create Connection \
+  target="pssl\:16642" \
+  -- add  SB_Global . connections=@conn_uuid
+
+   .. note::
+
+ Care should be taken to restrict access to the above mentioned port
+ so that only trusted machines can connect to it.
-- 
2.23.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] travis: Fix CI failure for osx builds

2019-11-08 Thread numans
From: Numan Siddique 

The below failure is seen


checking for Python 3 (version 3.4 or later)... /usr/local/bin/python3
checking where Python six library is available... configure: error: Missing 
Python six library.
The command "./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS" exited with 1.


This patch fixes it.

Signed-off-by: Numan Siddique 
---
 .travis/osx-prepare.sh | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/.travis/osx-prepare.sh b/.travis/osx-prepare.sh
index 4725fd829..7f639a62e 100755
--- a/.travis/osx-prepare.sh
+++ b/.travis/osx-prepare.sh
@@ -1,7 +1,5 @@
 #!/bin/bash
 set -ev
-pip2 install --user six
-pip2 install --user --upgrade docutils
+pip3 install --user six
+pip3 install --user --upgrade docutils
 
-brew update || true
-brew uninstall libtool && brew install libtool || true
-- 
2.23.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds

2019-11-08 Thread Yanqin Wei (Arm Technology China)
Hi David

> -Original Message-
> From: David Wilder 
> Sent: Thursday, November 7, 2019 3:21 AM
> To: ovs-dev@openvswitch.org
> Cc: i.maxim...@ovn.org; b...@ovn.org; Yanqin Wei (Arm Technology China)
> ; wil...@us.ibm.com
> Subject: [PATCH v3] travis: support ppc64le builds
>
> Add support for travis-ci ppc64le builds.
>
> - Updated matrix in .travis.yml to include an arch: ppc64le build.
> - Move package install needed for 32bit builds to .travis/linux-prepare.sh.
>
> To keep the total build time at an acceptable level only a single build job is
> included in the matrix for ppc64le.
>
> A build report example can be found here [1] [0] http://travis-ci.org/ [1]
> https://travis-ci.org/djlwilder/ovs/builds/607851729
>
> Signed-off-by: David Wilder 
> ---
> Addressed review comments:
> - Cleaned up linux-prepare.sh (v2)
> - Switch from os: linux-ppc64le to arch: ppc64le (v3)
>
>  .travis.yml  | 5 +++--
>  .travis/linux-prepare.sh | 5 -
>  2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/.travis.yml b/.travis.yml
> index 482efd2d1..308c09635 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -14,7 +14,6 @@ addons:
>apt:
>  packages:
>- bc
> -  - gcc-multilib
>- libssl-dev
>- llvm-dev
>- libjemalloc1
> @@ -26,7 +25,6 @@ addons:
>- libelf-dev
>- selinux-policy-dev
>- libunbound-dev
> -  - libunbound-dev:i386
>- libunwind-dev
>
>  before_install: ./.travis/${TRAVIS_OS_NAME}-prepare.sh
> @@ -52,6 +50,9 @@ matrix:
>  - os: osx
>compiler: clang
>env: OPTS="--disable-ssl"
> +- arch: ppc64le
> +  compiler: gcc
> +  env: OPTS="--disable-ssl"
>
>  script: ./.travis/${TRAVIS_OS_NAME}-build.sh $OPTS
>
> diff --git a/.travis/linux-prepare.sh b/.travis/linux-prepare.sh index
> 9e3ac0df7..d66f480c6 100755
> --- a/.travis/linux-prepare.sh
> +++ b/.travis/linux-prepare.sh
> @@ -18,7 +18,10 @@ pip install --user --upgrade docutils  if [ "$M32" ]; then
>  # 32-bit and 64-bit libunwind can not be installed at the same time.
>  # This will remove the 64-bit libunwind and install 32-bit version.
> -sudo apt-get install -y libunwind-dev:i386
> +sudo apt-get install -y \
> + gcc-multilib \
> + libunwind-dev:i386 \
> + libunbound-dev:i386
[Yanqin] They are x86 specific dependency. It is better to use  "$TRAVIS_ARCH" 
== "amd64" condition.
[Yanqin]  Is gcc-multilib only required for 32bits build?
>  fi
>
>  # IPv6 is supported by kernel but disabled in TravisCI images:
> --
> 2.23.0.162.gf1d4a28

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] BUG: MAX_LOCKDEP_ENTRIES too low!

2019-11-08 Thread Jan Kara
I guess this is more for Peter or Ingo...

On Thu 07-11-19 19:54:08, syzbot wrote:
> syzbot has found a reproducer for the following crash on:
> 
> HEAD commit:99a8efbb NFC: st21nfca: fix double free
> git tree:   net
> console output: https://syzkaller.appspot.com/x/log.txt?x=15ed70d8e0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=cbbed3e8d4eb64bf
> dashboard link: https://syzkaller.appspot.com/bug?extid=cd0ec5211ac07c18c049
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13cf5594e0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1036c762e0
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+cd0ec5211ac07c18c...@syzkaller.appspotmail.com
> 
> device 5580n entered promiscuous mode
> BUG: MAX_LOCKDEP_ENTRIES too low!
> turning off the locking correctness validator.
> CPU: 0 PID: 14197 Comm: syz-executor527 Not tainted 5.4.0-rc5+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>  alloc_list_entry.cold+0x11/0x18 kernel/locking/lockdep.c:1292
>  add_lock_to_list kernel/locking/lockdep.c:1313 [inline]
>  check_prev_add kernel/locking/lockdep.c:2528 [inline]
>  check_prevs_add kernel/locking/lockdep.c:2581 [inline]
>  validate_chain kernel/locking/lockdep.c:2971 [inline]
>  __lock_acquire+0x2a15/0x4a00 kernel/locking/lockdep.c:3955
>  lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
>  __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
>  _raw_spin_lock_bh+0x33/0x50 kernel/locking/spinlock.c:175
>  spin_lock_bh include/linux/spinlock.h:343 [inline]
>  netif_addr_lock_bh include/linux/netdevice.h:4055 [inline]
>  dev_set_rx_mode+0x20/0x40 net/core/dev.c:7808
>  dev_set_promiscuity+0xbf/0xe0 net/core/dev.c:7716
>  internal_dev_create+0x387/0x550 net/openvswitch/vport-internal_dev.c:196
>  ovs_vport_add+0x150/0x500 net/openvswitch/vport.c:199
>  new_vport+0x1b/0x1d0 net/openvswitch/datapath.c:194
>  ovs_dp_cmd_new+0x5e5/0xe30 net/openvswitch/datapath.c:1644
>  genl_family_rcv_msg+0x74b/0xf90 net/netlink/genetlink.c:629
>  genl_rcv_msg+0xca/0x170 net/netlink/genetlink.c:654
>  netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
>  genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
>  netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
>  netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
>  netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
>  sock_sendmsg_nosec net/socket.c:637 [inline]
>  sock_sendmsg+0xd7/0x130 net/socket.c:657
>  ___sys_sendmsg+0x803/0x920 net/socket.c:2311
>  __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
>  __do_sys_sendmsg net/socket.c:2365 [inline]
>  __se_sys_sendmsg net/socket.c:2363 [inline]
>  __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x441779
> Code: e8 9c ad 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> 0f 83 1b 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7ffea7e5fcc8 EFLAGS: 0246 ORIG_RAX: 002e
> RAX: ffda RBX: 0003 RCX: 00441779
> RDX:  RSI: 2240 RDI: 0003
> RBP: 00058f66 R08: 7ffe0025 R09: 7ffe0025
> R10: 0004 R11: 0246 R12: 006cdbc0
> R13: 0013 R14:  R15: 
> 
-- 
Jan Kara 
SUSE Labs, CR
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] conntrack: Fix tcp payload length in case multi-segments.

2019-11-08 Thread Zhike Wang
Signed-off-by: Zhike Wang 
---
 lib/conntrack-private.h | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index 590f139..1d21f6e 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -233,13 +233,17 @@ conn_update_expiration(struct conntrack *ct, struct conn 
*conn,
 static inline uint32_t
 tcp_payload_length(struct dp_packet *pkt)
 {
-const char *tcp_payload = dp_packet_get_tcp_payload(pkt);
-if (tcp_payload) {
-return ((char *) dp_packet_tail(pkt) - dp_packet_l2_pad_size(pkt)
-- tcp_payload);
-} else {
-return 0;
+size_t l4_size = dp_packet_l4_size(pkt);
+
+if (OVS_LIKELY(l4_size >= TCP_HEADER_LEN)) {
+struct tcp_header *tcp = dp_packet_l4(pkt);
+int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4;
+
+if (OVS_LIKELY(tcp_len >= TCP_HEADER_LEN && tcp_len <= l4_size)) {
+return (l4_size - tcp_len);
+}
 }
+return 0;
 }
 
 #endif /* conntrack-private.h */
-- 
1.8.3.1


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] flow: Fix IPv6 header parser with partial offloading.

2019-11-08 Thread Zhike Wang
Set new_proto before it is used in parse_ipv6_ext_hdrs__().

Signed-off-by: Zhike Wang 
---
 lib/flow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/flow.c b/lib/flow.c
index a18a1e6..45bb96b 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -1136,11 +1136,11 @@ parse_tcp_flags(struct dp_packet *packet)
 dp_packet_set_l2_pad_size(packet, size - plen);
 size = plen;
 const struct ovs_16aligned_ip6_frag *frag_hdr;
+nw_proto = nh->ip6_nxt;
 if (!parse_ipv6_ext_hdrs__(, , _proto, _frag,
 _hdr)) {
 return 0;
 }
-nw_proto = nh->ip6_nxt;
 } else {
 return 0;
 }
-- 
1.8.3.1


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] netdev-dpdk: Fix dev attached flag.

2019-11-08 Thread Zhike Wang
If the dev was already probed correctly, the dev attached flag
should be set to true, or resource would leak during destruct.

Signed-off-by: Zhike Wang 
---
 lib/netdev-dpdk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index a65540c..10e2fe9 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1728,6 +1728,9 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
 new_port_id = DPDK_ETH_PORT_ID_INVALID;
 }
 }
+} else {
+dev->attached = true;
+VLOG_INFO("Device '%s' attached to DPDK", devargs);
 }
 }
 
-- 
1.8.3.1


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-afxdp: add afxdp specific maximum MTU check

2019-11-08 Thread Eelco Chaudron



On 7 Nov 2019, at 17:43, Ilya Maximets wrote:


On 07.11.2019 15:01, Eelco Chaudron wrote:

Any feedback on this?


On 1 Oct 2019, at 11:55, Eelco Chaudron wrote:

Drivers natively supporting AF_XDP will check that a configured MTU 
size

will not exceed the allowed size for AF_XDP. However, when the skb
compatibility mode is used there is no check and any value is 
accepted.


This sounds like a kernel issue.
So, maybe it's better to fix it there?


Cant remember the details but I did see an easy fix so decided to fix it 
here first :)



This, for example, is the case when using the TAP interface.

This fix adds a check to make sure only AF_XDP valid values are 
excepted.


Signed-off-by: Eelco Chaudron 
---
 lib/netdev-afxdp.c |   17 +
 lib/netdev-afxdp.h |    1 +
 lib/netdev-linux.c |    9 +
 3 files changed, 27 insertions(+)

diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index 6e0180327..140150f29 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -1001,6 +1001,23 @@ netdev_afxdp_destruct(struct netdev *netdev)
 ovs_mutex_destroy(>mutex);
 }

+int
+netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu)
+{
+    /*
+ * If a device is used in xdpmode skb, no driver-specific 
MTU size is
+ * checked and any value is allowed resulting in packet 
drops.
+ * This check will verify the maximum supported value based 
on the
+ * buffer size allocated and the additional headroom 
required.

+ */
+    if (netdev == NULL || mtu <= 0


I don't really think that we need to check above things.
netdev can't be NULL here.  You may put an assert here if you worried.

mtu shuld be validated by db schema, and it can not be < 1.


Ok will remove it…

+    || mtu > (FRAME_SIZE - OVS_XDP_HEADROOM - 
XDP_PACKET_HEADROOM)) {


mtu doesn't usually include ethernet header, vlans.  So, these should 
be
excluded too.  Not sure about FCS, if it passed to XDP program in 
generic

mode or it's stripped by the driver.


Thought I did some tests and the MTU given was from the ethernet header 
till the FCS (not included).
But I’ll re-test it on the new revision, and update the code here if 
needed…



+    return EINVAL;
+    }
+
+    return 0;
+}
+
 int
 netdev_afxdp_get_custom_stats(const struct netdev *netdev,
   struct 
netdev_custom_stats *custom_stats)

diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index e2f400b72..ee6939fce 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -39,6 +39,7 @@ int netdev_afxdp_rxq_construct(struct netdev_rxq 
*rxq_);

 void netdev_afxdp_rxq_destruct(struct netdev_rxq *rxq_);
 int netdev_afxdp_construct(struct netdev *netdev_);
 void netdev_afxdp_destruct(struct netdev *netdev_);
+int netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int 
mtu);


 int netdev_afxdp_rxq_recv(struct netdev_rxq *rxq_,
   struct 
dp_packet_batch *batch,

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index f48192373..89d0d9a9d 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1593,6 +1593,15 @@ netdev_linux_set_mtu(struct netdev *netdev_, 
int mtu)

 goto exit;
 }

+#ifdef HAVE_AF_XDP
+    if (!strcmp(netdev_get_type(netdev_), "afxdp")) {


It's better to compare netdev class with _afxdp_calss
as it done for tap.


Will do this in next version!


+    error = netdev_afxdp_verify_mtu_size(netdev_, mtu);
+    if (error) {
+    goto exit;
+    }
+    }
+#endif
+
 if (netdev->cache_valid & VALID_MTU) {
 error = netdev->netdev_mtu_error;
 if (error || netdev->mtu == mtu) {

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-dpdk: Track vhost tx contention.

2019-11-08 Thread David Marchand
On Tue, Nov 5, 2019 at 4:37 PM Ilya Maximets  wrote:
> >> That's an interesting debug method, but it looks not very suitable
> >> for an end-user documentation.  One thing that bothers me the most
> >> is referencing C code snippets in this kind of documentation.
> >
> > Ok, can we conclude on the coverage counter wrt the v1 then?
> > https://patchwork.ozlabs.org/patch/1153238/
> >
> > Should I submit a v3 with the doc update (but removing the parts about 
> > perf) ?
>
> 'perf' part should not be there.
>
> I'm in doubt about the rest of the docs.  This part might be useful, but
> it doesn't provide any solution for the problem and I really don't know
> if there is something we can suggest in that case.  "Rework your network
> topology" doesn't sound like a friendly solution or exact steps to do.
>
> One more thing is that documenting coverage counters doesn't look like a
> good idea to me and I'd like to not create a precedent.
>
> One day we'll rework this to be some "PMD/netdev performance statistics"
> and it'll be OK to document it.  But there is nothing more permanent
> than a temporary solution.
>
> Right now the easiest way for me is to just apply v1.

Ok for me.


-- 
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-afxdp: add afxdp specific maximum MTU check

2019-11-08 Thread Eelco Chaudron



On 8 Nov 2019, at 2:34, William Tu wrote:


On Thu, Nov 07, 2019 at 03:01:18PM +0100, Eelco Chaudron wrote:

Any feedback on this?


On 1 Oct 2019, at 11:55, Eelco Chaudron wrote:

Drivers natively supporting AF_XDP will check that a configured MTU 
size

will not exceed the allowed size for AF_XDP. However, when the skb
compatibility mode is used there is no check and any value is 
accepted.

This, for example, is the case when using the TAP interface.

This fix adds a check to make sure only AF_XDP valid values are 
excepted.


Signed-off-by: Eelco Chaudron 
---
lib/netdev-afxdp.c |   17 +
lib/netdev-afxdp.h |1 +
lib/netdev-linux.c |9 +
3 files changed, 27 insertions(+)

diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index 6e0180327..140150f29 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -1001,6 +1001,23 @@ netdev_afxdp_destruct(struct netdev *netdev)
ovs_mutex_destroy(>mutex);
}

+int
+netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu)
+{
+/*
+ * If a device is used in xdpmode skb, no driver-specific MTU 
size is

+ * checked and any value is allowed resulting in packet drops.
+ * This check will verify the maximum supported value based on 
the

+ * buffer size allocated and the additional headroom required.
+ */
+if (netdev == NULL || mtu <= 0
+|| mtu > (FRAME_SIZE - OVS_XDP_HEADROOM - 
XDP_PACKET_HEADROOM)) {


I remember XDP max MTU = 3520 bytes,
and it's (page_size(4096) - headroom(256) - shinfo(320))
so here we should also subtract shinfo?


That is the theoretical maximum, however you’re code allocated chunks 
of FRAME_SIZE, so that dictated the maximum value.


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev