Re: [ovs-dev] [PATCH v3 4/9] netdev-offload-tc: Check for valid netdev ifindex in flow_put

2022-02-22 Thread Roi Dayan via dev




On 2022-02-22 5:23 PM, Eelco Chaudron wrote:

Verify that the returned ifindex by netdev_get_ifindex() is valid.
This might not be the case in the ERSPAN port scenario, which can
not be offloaded.

Signed-off-by: Eelco Chaudron 
---
v3:
   - Fixed netdev reference issue on failure
   - Added netdev_flow_api_equals() check

  lib/netdev-offload-tc.c |   18 ++
  1 file changed, 18 insertions(+)

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 9845e8d3f..0105d883f 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1841,7 +1841,25 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
*match,
  VLOG_DBG_RL(, "Can't find netdev for output port %d", 
port);
  return ENODEV;
  }
+
+if (!netdev_flow_api_equals(netdev, outdev)) {
+VLOG_DBG_RL(,
+"Flow API provider mismatch between ingress (%s) "
+"and egress (%s) ports",
+netdev_get_name(netdev), netdev_get_name(outdev));
+netdev_close(outdev);
+return EOPNOTSUPP;
+}
+
  action->out.ifindex_out = netdev_get_ifindex(outdev);
+if (action->out.ifindex_out < 0) {
+VLOG_DBG_RL(,
+"Can't find ifindex for output port %s, error %d",
+netdev_get_name(outdev), action->out.ifindex_out);
+netdev_close(outdev);
+return -action->out.ifindex_out;
+}
+
  action->out.ingress = is_internal_port(netdev_get_type(outdev));
  action->type = TC_ACT_OUTPUT;
  flower.action_count++;



Acked-by: Roi Dayan 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 8/9] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-02-22 Thread Roi Dayan via dev




On 2022-02-22 5:26 PM, Eelco Chaudron wrote:

This patch checks for none offloadable ct_state match flag combinations.
If they exist force the +trk flag down to TC Flower

Signed-off-by: Eelco Chaudron 
---
v3:
  - Instead of warning about an invalid flag combination fix it.

  lib/netdev-offload-tc.c |6 ++
  1 file changed, 6 insertions(+)

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 0105d883f..3d2c1d844 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower *flower, 
struct match *match)
  flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
  flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
  }
+
+if (flower->key.ct_state &&
+!(flower->key.ct_state & TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) {
+flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
+flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
+}
  }
  
  if (mask->ct_zone) {




just to be sure, the check is if we have a ct state flag
that can be offloaded but no +trk then force add +trk ?
so any +state will always be with +trk ?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd: Don't log transaction failures on standby instances.

2022-02-22 Thread Han Zhou
On Fri, Feb 11, 2022 at 12:52 AM Dumitru Ceara  wrote:
>
> On 2/11/22 00:54, Numan Siddique wrote:
> > On Thu, Jan 27, 2022 at 10:00 AM Dumitru Ceara 
wrote:
> >>
> >> We still need to try to ovsdb_idl_loop_commit_and_wait() on instances
> >> that are in standby mode.  That's because they need to try to take the
> >> lock.  But if they're in standby they won't actually build a
transaction
> >> json and will return early because they don't own the SB lock.
> >>
> >> That's reported as an error by ovsdb_idl_loop_commit_and_wait() but we
> >> shouldn't act on it.
> >>
> >> Also, to ensure that no DB changes are missed, ovsdb_idl_track_clear()
> >> should be called only on active instances.  The standby or paused ones
> >> will get the incremental updates when they become active.  Otherwise we
> >> would be forced to perform a full recompute when the instance becomes
> >> active.
> >
> > Hi Dumitru,
> >
>
> Hi Numan,
>
> Thanks for the review!
>
> > I've a question on the track clear being moved out of the standby
instances.
> > To ensure correctness,  I suppose it's better to trigger a full
recompute when a
> > standby instance becomes active. What do you think?
> >
>
> I might be wrong but I don't think that's necessary.  It may also be
> quite costly as full recomputes can take quite long.
>
> > Also lets say CMS does the below operations
> >  - Add a logical switch S1
> >  - Add a  logical port p1 in S1
> >  - Add a logical port p2 in S1
> >  - Delete logical port p2
> >  - Delete a logical switch.
> >
> > With this patch since we are not clearing the tracking information,
> > how does ovn-northd
> > process the tracked changes when it becomes active ?
>
> When ovn-northd becomes active, from a Northbound database perspective,
> there were no changes; that is, S1 didn't exist when it was last active
> and it doesn't exist now either.
>
> So, there should be no NB change to process.  Accumulating tracked
> changes without calling clear() on the standby has exactly this effect.

Hi Dumitru,

I wonder how accumulating tracked changes without calling clear() would
work.

Firstly, I was under the impression that ovsdb_idl_track_clear() must be
called before the next ovsdb_idl_run(), and the current change tracking
implementation cannot safely carry tracking information across the
iterations. This was why in ovn-controller whenever (!engine_has_run() &&
engine_need_run()) we force recompute in the next iteration. If changes
could be carried over we would have incrementally processed the accumulated
changes without forcing recompute. I can't recall the details, and I
checked the IDL again briefly but I didn't find the exact reason why it is
not safe. But I believe it was never used this way before. I should have
added a TODO for this (but somehow forgot to, sorry).

Secondly, even if it is safe to accumulate changes, it is going to be a
memory problem. Active-standby failover happens very rarely in a healthy
production environment. So even if the DB size is small, the change
tracking can grow without any limit. I tried with this patch by doing
simply a loop of adding and deleting a single logical switch 10k times on
top of an empty NB DB, and the standby northd's memory grew to 1.1GB.

Thirdly, processing all the tracked changes when standby becomes active is
likely taking higher cost than recompute, if the tracked change size is
bigger than the DB size.

So for now I would suggest keeping the logic of clearing tracking on every
iteration and force recompute when failover.

Thanks,
Han

>
> From a Southbound database perspective there are two cases:
>
> a. The former active northd processed some (but not all) of the NB
> changes and executed their corresponding SB transactions.  In this case,
> the standby northd also receives update messages for the SB records that
> were changed.  The standby northd tracks these changes.
>
> When the standby northd becomes active it will:
> - determine that NB state didn't change
> - SB state changed and needs to be reconciled (today we do this with the
> help of a NULL change handler for SB_* tables which will trigger a full
> recompute).
>
> b. The former active northd processed all of the NB changes and executed
> their corresponding SB transactions.  In this case, the final state of
> the NB and SB databases should be equivalent to their initial states.
> NB/SB changes will be accumulated by the change tracking mechanism on
> the standby resulting in empty tracked changes lists.  This is fine
> because the new active northd doesn't need to do anything, the DB
> contents are already consistent.
>
> c. The former active northd processed none of the NB changes yet.  This
> is very similar to case "b" above, the new active northd doesn't need to
> change anything in the NB/SB and it won't do that either.
>
> >
> > Thanks
> > Numan
> >
>
> Thanks,
> Dumitru
>
> ___
> dev mailing list
> d...@openvswitch.org
> 

Re: [ovs-dev] [PATCH ovn v2] Copy external_ids from Logical_Router_Port to SB database

2022-02-22 Thread Han Zhou
Hi Selva,

Thanks for the patch. Please see my comments below.

On Tue, Feb 8, 2022 at 9:18 AM Selvaraj Palaniyappan 
wrote:
>
> Hello Mark,
>
> Thanks for reviewing the diff. Please find the background info related to
this patch.
>
> "The ML2 Plugin can add some useful info to NB database of Logical router
port(LRP) table's external-ids that can be propagated to SB Port_binding
table. Some module on compute node can consume these external-ids from
Port_binding entries. The useful info could be subnet type on LRP and other
data.”

Please keep in mind that any separate connections to SB DB could add
pressure to the SB DB at scale. With this considered, I am ok with this
patch, since we are doing the same for LSPs already.

>
> Regarding subject line, I have used "git send-mail” with format patch o/p
to upload the patch. Complete description has been taken from format patch
o/p. If needed, let me send the new patch. Let me know your input.
>
Not sure what's wrong, but you could run the format-patch command first and
check if it generates the patch file properly. (you can also send email by
git send-email followed by the file name)

Please see a minor comment on the test case.

> Thanks,
> Selva
>
> On 05-Feb-2022, at 12:38 AM, Mark Michelson > wrote:
>
> Hello,
>
> I had a look, and while the code does what it claims, it's not clear why
you want to do this. Can you provide some background?
>
> Also, it appears in this version of the patch, you put the entire
description in the subject line :)
>
> Thanks
>
> On 1/27/22 07:50, Selvaraj Palaniyappan wrote:
> Signed-off-by: Selvaraj Palaniyappan >
> ---
>  northd/northd.c |  1 +
>  ovn-nb.xml  |  6 ++
>  ovn-sb.xml  |  3 ++-
>  tests/ovn-northd.at | 14 ++
>  4 files changed, 23 insertions(+), 1 deletion(-)
> diff --git a/northd/northd.c b/northd/northd.c
> index fc7a64f99..090922ae2 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -3240,6 +3240,7 @@ ovn_port_update_sbrec(struct northd_input
*input_data,
>  ds_destroy();
>struct smap ids = SMAP_INITIALIZER();
> +smap_clone(, >nbrp->external_ids);
>  sbrec_port_binding_set_external_ids(op->sb, );
>sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0);
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 6a6972856..293d25b32 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -2895,6 +2895,12 @@
>  
>
>  See External IDs at the beginning of this document.
> +
> +  The ovn-northd program copies all these pairs
into the
> +   column of the
> +   table in 
> +  database.
> +
>
>  
>
> diff --git a/ovn-sb.xml b/ovn-sb.xml
> index 9ddacdf09..f7c41ccdc 100644
> --- a/ovn-sb.xml
> +++ b/ovn-sb.xml
> @@ -3354,7 +3354,8 @@ tcp.flags = RST;
>  
>The ovn-northd program populates this column with
>all entries into the  column of the
> -   table of the
> +   and
> +   tables of the
> database.
>  
>
> diff --git a/tests/ovn-northd.at b/tests/
ovn-northd.at
> index 84e52e701..f9c5259f1 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -144,6 +144,20 @@ AT_CHECK([test x`ovn-nbctl lsp-get-up S1-R1` = xup])
>  AT_CLEANUP
>  ])
>  +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([check external id propagation to SBDB])

The title of the test case is too general, because all the tables have an
external-ids column. Better to call out the "logical router port, or LRP".

> +ovn_start
> +
> +ovn-nbctl lr-add ro
> +ovn-nbctl lrp-add ro lrp0 00:00:00:00:00:01 192.168.1.1/24
> +ovn-nbctl --wait=sb set logical_router_port lrp0 external_ids=test=123
> +AT_CHECK([ovn-sbctl --columns=external_ids --bare find Port_Binding
logical_port=lrp0],
> +[0], [test=123
> +])

Just a hint, you can also use the check_column function which is slightly
more concise:
check_column "test=123" sb:Port_Binding external_ids logical_port=lrp0

With the minor comments addressed:
Acked-by: Han Zhou 

Thanks,
Han

> +
> +AT_CLEANUP
> +])
> +
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([check IPv6 RA config propagation to SBDB])
>  ovn_start
>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] conntrack: Support packets/bytes stats

2022-02-22 Thread Yifeng Sun
Userspace conntrack doesn't support conntrack stats for packets and
bytes. This patch implements it.

Signed-off-by: Yifeng Sun 
---
 lib/conntrack-private.h   |  9 +
 lib/conntrack.c   | 28 
 tests/system-common-macros.at |  2 +-
 tests/system-traffic.at   | 30 ++
 4 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index dfdf4e676..7f21d3772 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -91,6 +91,11 @@ enum OVS_PACKED_ENUM ct_conn_type {
 CT_CONN_TYPE_UN_NAT,
 };
 
+struct conn_counter {
+atomic_uint64_t packets;
+atomic_uint64_t bytes;
+};
+
 struct conn {
 /* Immutable data. */
 struct conn_key key;
@@ -123,6 +128,10 @@ struct conn {
 enum ct_conn_type conn_type;
 
 uint32_t tp_id; /* Timeout policy ID. */
+
+/* Counters. */
+struct conn_counter counters_orig;
+struct conn_counter counters_reply;
 };
 
 enum ct_update_res {
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 33a1a9295..177154cd8 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1245,6 +1245,21 @@ conn_update_state_alg(struct conntrack *ct, struct 
dp_packet *pkt,
 return false;
 }
 
+static void
+conn_update_counters(struct conn *conn,
+ const struct dp_packet *pkt, bool reply)
+{
+if (conn) {
+struct conn_counter *counter = (reply
+   ? >counters_reply
+   : >counters_orig);
+uint64_t old;
+
+atomic_count_inc64(>packets);
+atomic_add(>bytes, dp_packet_size(pkt), );
+}
+}
+
 static void
 set_cached_conn(const struct nat_action_info_t *nat_action_info,
 const struct conn_lookup_ctx *ctx, struct conn *conn,
@@ -1283,6 +1298,8 @@ process_one_fast(uint16_t zone, const uint32_t *setmark,
 if (setlabel) {
 set_label(pkt, conn, [0], [1]);
 }
+
+conn_update_counters(conn, pkt, pkt->md.reply);
 }
 
 static void
@@ -1420,6 +1437,8 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
 set_label(pkt, conn, [0], [1]);
 }
 
+conn_update_counters(conn, pkt, ctx->reply);
+
 handle_alg_ctl(ct, ctx, pkt, ct_alg_ctl, conn, now, !!nat_action_info);
 
 set_cached_conn(nat_action_info, ctx, conn, pkt);
@@ -2641,6 +2660,15 @@ conn_to_ct_dpif_entry(const struct conn *conn, struct 
ct_dpif_entry *entry,
 }
 ovs_mutex_unlock(>lock);
 
+entry->counters_orig.packets = atomic_count_get64(
+(atomic_uint64_t *)>counters_orig.packets);
+entry->counters_orig.bytes = atomic_count_get64(
+(atomic_uint64_t *)>counters_orig.bytes);
+entry->counters_reply.packets = atomic_count_get64(
+(atomic_uint64_t *)>counters_reply.packets);
+entry->counters_reply.bytes = atomic_count_get64(
+(atomic_uint64_t *)>counters_reply.bytes);
+
 entry->timeout = (expiration > 0) ? expiration / 1000 : 0;
 
 if (conn->alg) {
diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 19a0b125b..89cd7b83c 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -240,7 +240,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 
's/csum:.*/csum: /'])
 # and limit the output to the rows containing 'ip-addr'.
 #
 m4_define([FORMAT_CT],
-[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 
's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | 
uniq]])
+[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 
's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' -e 
's/timeout=[0-9]*/timeout=/g' | sort | uniq]])
 
 # NETNS_DAEMONIZE([namespace], [command], [pidfile])
 #
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index f22d86e46..15b2c288c 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -6743,6 +6743,36 @@ AT_CHECK([ovs-ofctl dump-flows br0 | grep table=2, | 
OFPROTO_CLEAR_DURATION_IDLE
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - stats])
+CHECK_CONNTRACK()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=100,in_port=1,icmp,action=ct(commit),2
+priority=100,in_port=2,ct_state=-trk,icmp,action=ct(table=0)
+priority=100,in_port=2,ct_state=+trk+est-new,icmp,action=1
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+NS_CHECK_EXEC([at_ns0], [ping -s 500 -q -c 3 -i 0.3 -w 2 10.1.1.2 | 
FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack -s | FORMAT_CT(10.1.1.2)], [0], [dnl

Re: [ovs-dev] [PATCH net v4 1/1] openvswitch: Fix setting ipv6 fields causing hw csum failure

2022-02-22 Thread Jakub Kicinski
You'll need to rebase, the patch which made everything force inlined
got merged.

On Sun, 20 Feb 2022 15:21:14 +0200 Paul Blakey wrote:
> +static inline __wsum
> +csum_block_replace(__wsum csum, __wsum old, __wsum new, int offset)
> +{
> + return csum_block_add(csum_block_sub(csum, old, offset), new, offset);

Why still _block? Since the arguments are pre-shifted we should be able
to subtract and add the entire thing, and the math will work out.
Obviously you'd need to shift by the offset in the word, not in a byte
in the caller.

> +}
> +
>  static inline __wsum csum_unfold(__sum16 n)
>  {
>   return (__force __wsum)n;
> @@ -184,4 +190,5 @@ static inline __wsum wsum_negate(__wsum val)
>  {
>   return (__force __wsum)-((__force u32)val);
>  }
> +

Spurious?

>  #endif
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 08/10] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-02-22 Thread Marcelo Leitner
On Tue, Feb 22, 2022 at 04:44:30PM +0100, Eelco Chaudron wrote:
>
>
> On 22 Feb 2022, at 12:36, Marcelo Leitner wrote:
>
> > +Cc Wenxu, Paul and netdev
> >
> > On Tue, Feb 22, 2022 at 10:33:44AM +0100, Eelco Chaudron wrote:
> >>
> >>
> >> On 21 Feb 2022, at 15:53, Eelco Chaudron wrote:
> >>
> >>> On 21 Feb 2022, at 14:33, Marcelo Leitner wrote:
> >>
> >> 
> >>
> >> Don’t think this is true, it will only print if +trk and any other 
> >> flags are set.
> >> Guess this is where the miscommunication is.>
> >>> The message also seems to be a bit aggressive, especially since it 
> >>> will
> >>> almost always be printed.
> >
> > Yeah.  I missed the fact that you're checking for zero and 
> > flower->key.ct_state
> > will actually mark existence of other flags.  So, that is fine.
> >
> > However, I'm still not sure that the condition is fully correct.
> >
> > If we'll take a match on '+est' with all other flags wildcarded, that 
> > will
> > trigger the condition, because 'flower->key.ct_state' will contain the 
> > 'est' bit,
> > but 'trk' bit will not be set.  The point is that even though -trk+est 
> > is not
> 
>  Oh ow. tc flower will reject this combination today, btw. I don't know
>  about hw implications for changing that by now.
> 
>  https://elixir.bootlin.com/linux/latest/C/ident/fl_validate_ct_state
>  'state' parameter in there is the value masked already.
> 
>  We directly mapped openflow restrictions to the datapath.
> 
> > a valid combination and +trk+est is, OVS may in theory produce the 
> > match with
> > 'est' bit set and 'trk' bit wildcarded.  And that can be a correct 
> > configuration.
> 
>  I guess that means that the only possible parameter validation on
>  ct_state at tc level is about its length. Thoughts?
> 
> >>>
> >>> Guess I get it now also :) I was missing the wildcard bit that OVS 
> >>> implies when not specifying any :)
> >>>
> >>> I think I can fix this by just adding +trk on the TC side when we get the 
> >>> OVS wildcard for +trk. Guess this holds true as for TC there is no -trk 
> >>> +flags.
> >>>
> >>> I’m trying to replicate patch 9 all afternoon, and due to the fact I did 
> >>> not write down which test was causing the problem, and it taking 20-30 
> >>> runs, it has not happened yet :( But will do it later tomorrow, see if it 
> >>> works in all cases ;)
> >>>
> >>
> >> So I’ve been doing some experiments (and running all system-traffic 
> >> tests), and I think the following fix will solve the problem by just 
> >> making sure the +trk flag is set in this case on the TC side.
> >> This will not change the behavior compared to the kernel.
> >>
> >> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> >> index 0105d883f..3d2c1d844 100644
> >> --- a/lib/netdev-offload-tc.c
> >> +++ b/lib/netdev-offload-tc.c
> >> @@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower 
> >> *flower, struct match *match)
> >>  flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
> >>  flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
> >>  }
> >> +
> >> +if (flower->key.ct_state &&
> >> +!(flower->key.ct_state & TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) {
> >> +flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
> >> +flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
> >> +}
> >
> > I had meant to update the kernel instead. As Ilya was saying, as this
> > is dealing with masks, the validation that tc is doing is not right. I
> > mean, for a connection to be in +est, it needs +trk, right, but for
> > matching, one could have the following value/mask:
> >  value=est
> >  mask=est
> > which means: match connections in Established AND also untracked ones.
>
> Maybe it was too late last night, but why also untracked ones?

Nah, it was too early today here. :D

> It should only match untracked ones with the est flag set, or do I miss 
> something?
> Untracked ones can never have the est flag set, right?

My bad. Please scratch that description. Right.. it can't match
untracked ones because it's checking that est is set, and untracked
ones can never have it.

Yet, the point is, the mask, per say, is not wrong. All conntrack
entries that have +est will also have +trk, okay, but a user filtering
only for +est is not wrong and tc shouldn't reject it.

I couldn't find a similar verification in ovs kernel. Maybe I missed
it. But if vswitchd would need the tweak in here, seems ovs kernel
doesn't need it, and then the two would potentially have different
behaviours.

>
> > Apparently this is what the test is triggering here, and the patch
> > above could lead to not match the 2nd part of the AND above.
> >
> > When fixing the parameter validation in flower, we went too far.
> >
> >   Marcelo
> >
> >>  }
> >>
> >> I will send out a 

Re: [ovs-dev] [PATCH] reconnect: Fix broken inactivity probe if there is no other reason to wake up.

2022-02-22 Thread Dumitru Ceara
On 2/22/22 16:45, Ilya Maximets wrote:
> On 2/22/22 13:05, Dumitru Ceara wrote:
>> On 2/21/22 15:16, Ilya Maximets wrote:
>>> The purpose of reconnect_deadline__() function is twofold:
>>>
>>> 1. Its result is used to tell if the state has to be changed right now
>>>in reconnect_run().
>>> 2. Its result also used to determine when the process need to wake up
>>>and call reconnect_run() for a next time, i.e. when the state may
>>>need to be changed next time.
>>>
>>> Since introduction of the 'receive-attempted' feature, the function
>>> returns LLONG_MAX if the deadline is in the future.  That works for
>>> the first case, but doesn't for the second one, because we don't
>>> really know when we need to call reconnect_run().
>>>
>>> This is the problem for applications where jsonrpc connection is the
>>> only source of wake ups, e.g. ovn-northd.  When the network goes down
>>> silently, e.g. server looses IP address due to DHCP failure, ovn-northd
>>> will sleep in the poll loop indefinitely after being told that it
>>> doesn't need to call reconnect_run() (deadline == LLONG_MAX).
>>>
>>> Fixing that by actually returning the expected time if it is in the
>>> future, so we will know when to wake up.  In order to keep the
>>> 'receive-attempted' feature, returning 'now + 1' in case where the
>>> time has already passed, but receive wasn't attempted.  That will
>>> trigger a fast wake up, so the application will be able to attempt the
>>> receive even if there was no real events.  In a correctly written
>>> application we should not fall into this case more than once in a row.
>>> '+ 1' ensures that we will not transition into a different state
>>> prematurely, i.e. before the receive is actually attempted.
>>>
>>> Fixes: 4241d652e465 ("jsonrpc: Avoid disconnecting prematurely due to long 
>>> poll intervals.")
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>
>> Hi Ilya,
>>
>> This change looks good to me, I just have a couple minor comment-related
>> remarks; in any case:
>>
>> Acked-by: Dumitru Ceara 
>>
>> Regards,
>> Dumitru
>>
>>>
>>> CC: Wentao Jia
>>> This might be the problem you tried to fix in northd by setting a probe
>>> interval.  Would be great, if you can test your case with this patch.
>>>
>>>  lib/reconnect.c |  29 +---
>>>  python/ovs/reconnect.py |  40 +++-
>>>  tests/reconnect.at  | 102 +++-
>>>  3 files changed, 133 insertions(+), 38 deletions(-)
>>>
>>> diff --git a/lib/reconnect.c b/lib/reconnect.c
>>> index a929ddfd2..f6fd165e4 100644
>>> --- a/lib/reconnect.c
>>> +++ b/lib/reconnect.c
>>> @@ -75,7 +75,8 @@ struct reconnect {
>>>  
>>>  static void reconnect_transition__(struct reconnect *, long long int now,
>>> enum state state);
>>> -static long long int reconnect_deadline__(const struct reconnect *);
>>> +static long long int reconnect_deadline__(const struct reconnect *,
>>> +  long long int now);
>>>  static bool reconnect_may_retry(struct reconnect *);
>>>  
>>>  static const char *
>>> @@ -539,7 +540,7 @@ reconnect_transition__(struct reconnect *fsm, long long 
>>> int now,
>>>  }
>>>  
>>>  static long long int
>>> -reconnect_deadline__(const struct reconnect *fsm)
>>> +reconnect_deadline__(const struct reconnect *fsm, long long int now)
>>>  {
>>>  ovs_assert(fsm->state_entered != LLONG_MIN);
>>>  switch (fsm->state) {
>>> @@ -557,8 +558,18 @@ reconnect_deadline__(const struct reconnect *fsm)
>>>  if (fsm->probe_interval) {
>>>  long long int base = MAX(fsm->last_activity, 
>>> fsm->state_entered);
>>>  long long int expiration = base + fsm->probe_interval;
>>> -if (fsm->last_receive_attempt >= expiration) {
>>> +if (now < expiration || fsm->last_receive_attempt >= 
>>> expiration) {
>>> +/* We still have time before the expiration or the time has
>>> + * already passed and there was no activity.  In the first 
>>> case
>>> + * we need to wait for the expiration, in the second - 
>>> we're
>>> + * already past the deadline. */
>>>  return expiration;
>>> +} else {
>>> +/* Time has already passed, but we didn't attempt to 
>>> receive
>>> + * anything.  We need to wake up and try to receive even if
>>> + * nothing is pending, so we can update the expiration time
>>> + * or transition into S_IDLE state. */
>>> +return now + 1;
>>>  }
>>>  }
>>>  return LLONG_MAX;
>>> @@ -566,8 +577,14 @@ reconnect_deadline__(const struct reconnect *fsm)
>>>  case S_IDLE:
>>>  if (fsm->probe_interval) {
>>>  long long int expiration = fsm->state_entered + 
>>> fsm->probe_interval;
>>> -if (fsm->last_receive_attempt >= expiration) {
>>> +if 

Re: [ovs-dev] [PATCH ovn v3] controller: add ovn-set-local-ip option

2022-02-22 Thread Vladislav Odintsov
Thanks Numan.

Regards,
Vladislav Odintsov

> On 22 Feb 2022, at 18:41, Numan Siddique  wrote:
> 
> On Sat, Feb 19, 2022 at 1:38 AM Han Zhou  wrote:
>> 
>> On Fri, Feb 18, 2022 at 10:38 AM Vladislav Odintsov 
>> wrote:
>>> 
>>> When transport node has multiple interfaces (vlans) and
>>> ovn-encap-ip on different hosts need to be configured
>>> from different VLANs source IP for encapsulated packet
>>> can be not the same, which is expected by remote system.
>>> 
>>> Explicitely setting local_ip resolves such problem.
>>> 
>>> Signed-off-by: Vladislav Odintsov 
>>> ---
>>> controller/encaps.c | 43 +
>>> controller/ovn-controller.8.xml |  7 ++
>>> tests/ovn-controller.at |  9 +++
>>> 3 files changed, 44 insertions(+), 15 deletions(-)
>>> 
>>> diff --git a/controller/encaps.c b/controller/encaps.c
>>> index 66e0cd8cd..8e6d290c1 100644
>>> --- a/controller/encaps.c
>>> +++ b/controller/encaps.c
>>> @@ -23,6 +23,7 @@
>>> #include "openvswitch/vlog.h"
>>> #include "lib/ovn-sb-idl.h"
>>> #include "ovn-controller.h"
>>> +#include "smap.h"
>>> 
>>> VLOG_DEFINE_THIS_MODULE(encaps);
>>> 
>>> @@ -176,8 +177,31 @@ tunnel_add(struct tunnel_ctx *tc, const struct
>> sbrec_sb_global *sbg,
>>> smap_add(, "dst_port", dst_port);
>>> }
>>> 
>>> +const struct ovsrec_open_vswitch *cfg =
>>> +ovsrec_open_vswitch_table_first(ovs_table);
>>> +
>>> +bool set_local_ip = false;
>>> +if (cfg) {
>>> +/* If the tos option is configured, get it */
>>> +const char *encap_tos = smap_get_def(>external_ids,
>>> +   "ovn-encap-tos", "none");
>>> +
>>> +if (encap_tos && strcmp(encap_tos, "none")) {
>>> +smap_add(, "tos", encap_tos);
>>> +}
>>> +
>>> +/* If ovn-set-local-ip option is configured, get it */
>>> +set_local_ip = smap_get_bool(>external_ids,
>> "ovn-set-local-ip",
>>> + false);
>>> +}
>>> +
>>> /* Add auth info if ipsec is enabled. */
>>> if (sbg->ipsec) {
>>> +set_local_ip = true;
>>> +smap_add(, "remote_name", new_chassis_id);
>>> +}
>>> +
>>> +if (set_local_ip) {
>>> const struct sbrec_chassis *this_chassis = tc->this_chassis;
>>> const char *local_ip = NULL;
>>> 
>>> @@ -187,8 +211,10 @@ tunnel_add(struct tunnel_ctx *tc, const struct
>> sbrec_sb_global *sbg,
>>>  */
>>> for (int i = 0; i < this_chassis->n_encaps; i++) {
>>> if (local_ip && strcmp(local_ip,
>> this_chassis->encaps[i]->ip)) {
>>> -VLOG_ERR("ovn-encap-ip has been configured as a list.
>> This "
>>> - "is unsupported for IPsec.");
>>> +static struct vlog_rate_limit rl =
>> VLOG_RATE_LIMIT_INIT(5, 1);
>>> +VLOG_ERR_RL(, "ovn-encap-ip has been configured as a
>> list. "
>>> +"This is unsupported for IPsec and explicit "
>>> +"local_ip configuration.");
>>> /* No need to loop further as we know this condition has
>> been
>>>  * hit */
>>> break;
>>> @@ -200,19 +226,6 @@ tunnel_add(struct tunnel_ctx *tc, const struct
>> sbrec_sb_global *sbg,
>>> if (local_ip) {
>>> smap_add(, "local_ip", local_ip);
>>> }
>>> -smap_add(, "remote_name", new_chassis_id);
>>> -}
>>> -
>>> -const struct ovsrec_open_vswitch *cfg =
>>> -ovsrec_open_vswitch_table_first(ovs_table);
>>> -/* If the tos option is configured, get it */
>>> -if (cfg) {
>>> -const char *encap_tos = smap_get_def(>external_ids,
>>> -   "ovn-encap-tos", "none");
>>> -
>>> -if (encap_tos && strcmp(encap_tos, "none")) {
>>> -smap_add(, "tos", encap_tos);
>>> -}
>>> }
>>> 
>>> /* If there's an existing chassis record that does not need any
>> change,
>>> diff --git a/controller/ovn-controller.8.xml
>> b/controller/ovn-controller.8.xml
>>> index e9708fe64..cc9a7d1c2 100644
>>> --- a/controller/ovn-controller.8.xml
>>> +++ b/controller/ovn-controller.8.xml
>>> @@ -304,6 +304,13 @@
>>> of how many entries there are in the cache.  By default this is
>> set to
>>> 3 (30 seconds).
>>>   
>>> +  external_ids:ovn-set-local-ip
>>> +  
>>> +The boolean flag indicates if ovn-controller when
>> create
>>> +tunnel ports should set local_ip parameter.  Can be
>>> +heplful to pin source outer IP for the tunnel when multiple
>> interfaces
>>> +are used on the host for overlay traffic.
>>> +  
>>> 
>>> 
>>> 
>>> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
>>> index e99eec1d6..89ae2c9e1 100644
>>> --- a/tests/ovn-controller.at
>>> +++ b/tests/ovn-controller.at
>>> @@ -298,6 +298,15 @@ OVS_WAIT_UNTIL([check_tunnel_property type geneve])
>>> ovs-vsctl del-port ovn-fakech-0
>>> 

Re: [ovs-dev] [PATCH] reconnect: Fix broken inactivity probe if there is no other reason to wake up.

2022-02-22 Thread Ilya Maximets
On 2/22/22 13:05, Dumitru Ceara wrote:
> On 2/21/22 15:16, Ilya Maximets wrote:
>> The purpose of reconnect_deadline__() function is twofold:
>>
>> 1. Its result is used to tell if the state has to be changed right now
>>in reconnect_run().
>> 2. Its result also used to determine when the process need to wake up
>>and call reconnect_run() for a next time, i.e. when the state may
>>need to be changed next time.
>>
>> Since introduction of the 'receive-attempted' feature, the function
>> returns LLONG_MAX if the deadline is in the future.  That works for
>> the first case, but doesn't for the second one, because we don't
>> really know when we need to call reconnect_run().
>>
>> This is the problem for applications where jsonrpc connection is the
>> only source of wake ups, e.g. ovn-northd.  When the network goes down
>> silently, e.g. server looses IP address due to DHCP failure, ovn-northd
>> will sleep in the poll loop indefinitely after being told that it
>> doesn't need to call reconnect_run() (deadline == LLONG_MAX).
>>
>> Fixing that by actually returning the expected time if it is in the
>> future, so we will know when to wake up.  In order to keep the
>> 'receive-attempted' feature, returning 'now + 1' in case where the
>> time has already passed, but receive wasn't attempted.  That will
>> trigger a fast wake up, so the application will be able to attempt the
>> receive even if there was no real events.  In a correctly written
>> application we should not fall into this case more than once in a row.
>> '+ 1' ensures that we will not transition into a different state
>> prematurely, i.e. before the receive is actually attempted.
>>
>> Fixes: 4241d652e465 ("jsonrpc: Avoid disconnecting prematurely due to long 
>> poll intervals.")
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> Hi Ilya,
> 
> This change looks good to me, I just have a couple minor comment-related
> remarks; in any case:
> 
> Acked-by: Dumitru Ceara 
> 
> Regards,
> Dumitru
> 
>>
>> CC: Wentao Jia
>> This might be the problem you tried to fix in northd by setting a probe
>> interval.  Would be great, if you can test your case with this patch.
>>
>>  lib/reconnect.c |  29 +---
>>  python/ovs/reconnect.py |  40 +++-
>>  tests/reconnect.at  | 102 +++-
>>  3 files changed, 133 insertions(+), 38 deletions(-)
>>
>> diff --git a/lib/reconnect.c b/lib/reconnect.c
>> index a929ddfd2..f6fd165e4 100644
>> --- a/lib/reconnect.c
>> +++ b/lib/reconnect.c
>> @@ -75,7 +75,8 @@ struct reconnect {
>>  
>>  static void reconnect_transition__(struct reconnect *, long long int now,
>> enum state state);
>> -static long long int reconnect_deadline__(const struct reconnect *);
>> +static long long int reconnect_deadline__(const struct reconnect *,
>> +  long long int now);
>>  static bool reconnect_may_retry(struct reconnect *);
>>  
>>  static const char *
>> @@ -539,7 +540,7 @@ reconnect_transition__(struct reconnect *fsm, long long 
>> int now,
>>  }
>>  
>>  static long long int
>> -reconnect_deadline__(const struct reconnect *fsm)
>> +reconnect_deadline__(const struct reconnect *fsm, long long int now)
>>  {
>>  ovs_assert(fsm->state_entered != LLONG_MIN);
>>  switch (fsm->state) {
>> @@ -557,8 +558,18 @@ reconnect_deadline__(const struct reconnect *fsm)
>>  if (fsm->probe_interval) {
>>  long long int base = MAX(fsm->last_activity, 
>> fsm->state_entered);
>>  long long int expiration = base + fsm->probe_interval;
>> -if (fsm->last_receive_attempt >= expiration) {
>> +if (now < expiration || fsm->last_receive_attempt >= 
>> expiration) {
>> +/* We still have time before the expiration or the time has
>> + * already passed and there was no activity.  In the first 
>> case
>> + * we need to wait for the expiration, in the second - we're
>> + * already past the deadline. */
>>  return expiration;
>> +} else {
>> +/* Time has already passed, but we didn't attempt to receive
>> + * anything.  We need to wake up and try to receive even if
>> + * nothing is pending, so we can update the expiration time
>> + * or transition into S_IDLE state. */
>> +return now + 1;
>>  }
>>  }
>>  return LLONG_MAX;
>> @@ -566,8 +577,14 @@ reconnect_deadline__(const struct reconnect *fsm)
>>  case S_IDLE:
>>  if (fsm->probe_interval) {
>>  long long int expiration = fsm->state_entered + 
>> fsm->probe_interval;
>> -if (fsm->last_receive_attempt >= expiration) {
>> +if (now < expiration || fsm->last_receive_attempt >= 
>> expiration) {
> 
> Should we duplicate or add a similar comment to what we have for S_ACTIVE?

I 

Re: [ovs-dev] [PATCH v2 08/10] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-02-22 Thread Eelco Chaudron


On 22 Feb 2022, at 12:36, Marcelo Leitner wrote:

> +Cc Wenxu, Paul and netdev
>
> On Tue, Feb 22, 2022 at 10:33:44AM +0100, Eelco Chaudron wrote:
>>
>>
>> On 21 Feb 2022, at 15:53, Eelco Chaudron wrote:
>>
>>> On 21 Feb 2022, at 14:33, Marcelo Leitner wrote:
>>
>> 
>>
>> Don’t think this is true, it will only print if +trk and any other flags 
>> are set.
>> Guess this is where the miscommunication is.>
>>> The message also seems to be a bit aggressive, especially since it will
>>> almost always be printed.
>
> Yeah.  I missed the fact that you're checking for zero and 
> flower->key.ct_state
> will actually mark existence of other flags.  So, that is fine.
>
> However, I'm still not sure that the condition is fully correct.
>
> If we'll take a match on '+est' with all other flags wildcarded, that will
> trigger the condition, because 'flower->key.ct_state' will contain the 
> 'est' bit,
> but 'trk' bit will not be set.  The point is that even though -trk+est is 
> not

 Oh ow. tc flower will reject this combination today, btw. I don't know
 about hw implications for changing that by now.

 https://elixir.bootlin.com/linux/latest/C/ident/fl_validate_ct_state
 'state' parameter in there is the value masked already.

 We directly mapped openflow restrictions to the datapath.

> a valid combination and +trk+est is, OVS may in theory produce the match 
> with
> 'est' bit set and 'trk' bit wildcarded.  And that can be a correct 
> configuration.

 I guess that means that the only possible parameter validation on
 ct_state at tc level is about its length. Thoughts?

>>>
>>> Guess I get it now also :) I was missing the wildcard bit that OVS implies 
>>> when not specifying any :)
>>>
>>> I think I can fix this by just adding +trk on the TC side when we get the 
>>> OVS wildcard for +trk. Guess this holds true as for TC there is no -trk 
>>> +flags.
>>>
>>> I’m trying to replicate patch 9 all afternoon, and due to the fact I did 
>>> not write down which test was causing the problem, and it taking 20-30 
>>> runs, it has not happened yet :( But will do it later tomorrow, see if it 
>>> works in all cases ;)
>>>
>>
>> So I’ve been doing some experiments (and running all system-traffic tests), 
>> and I think the following fix will solve the problem by just making sure the 
>> +trk flag is set in this case on the TC side.
>> This will not change the behavior compared to the kernel.
>>
>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>> index 0105d883f..3d2c1d844 100644
>> --- a/lib/netdev-offload-tc.c
>> +++ b/lib/netdev-offload-tc.c
>> @@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower 
>> *flower, struct match *match)
>>  flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
>>  flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
>>  }
>> +
>> +if (flower->key.ct_state &&
>> +!(flower->key.ct_state & TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) {
>> +flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
>> +flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
>> +}
>
> I had meant to update the kernel instead. As Ilya was saying, as this
> is dealing with masks, the validation that tc is doing is not right. I
> mean, for a connection to be in +est, it needs +trk, right, but for
> matching, one could have the following value/mask:
>  value=est
>  mask=est
> which means: match connections in Established AND also untracked ones.

Maybe it was too late last night, but why also untracked ones?
It should only match untracked ones with the est flag set, or do I miss 
something?
Untracked ones can never have the est flag set, right?

> Apparently this is what the test is triggering here, and the patch
> above could lead to not match the 2nd part of the AND above.
>
> When fixing the parameter validation in flower, we went too far.
>
>   Marcelo
>
>>  }
>>
>> I will send out a v3 of this set soon with this change included.
>>
>> //Eelco
>>
>> 
>>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] controller: add ovn-set-local-ip option

2022-02-22 Thread Numan Siddique
On Sat, Feb 19, 2022 at 1:38 AM Han Zhou  wrote:
>
> On Fri, Feb 18, 2022 at 10:38 AM Vladislav Odintsov 
> wrote:
> >
> > When transport node has multiple interfaces (vlans) and
> > ovn-encap-ip on different hosts need to be configured
> > from different VLANs source IP for encapsulated packet
> > can be not the same, which is expected by remote system.
> >
> > Explicitely setting local_ip resolves such problem.
> >
> > Signed-off-by: Vladislav Odintsov 
> > ---
> >  controller/encaps.c | 43 +
> >  controller/ovn-controller.8.xml |  7 ++
> >  tests/ovn-controller.at |  9 +++
> >  3 files changed, 44 insertions(+), 15 deletions(-)
> >
> > diff --git a/controller/encaps.c b/controller/encaps.c
> > index 66e0cd8cd..8e6d290c1 100644
> > --- a/controller/encaps.c
> > +++ b/controller/encaps.c
> > @@ -23,6 +23,7 @@
> >  #include "openvswitch/vlog.h"
> >  #include "lib/ovn-sb-idl.h"
> >  #include "ovn-controller.h"
> > +#include "smap.h"
> >
> >  VLOG_DEFINE_THIS_MODULE(encaps);
> >
> > @@ -176,8 +177,31 @@ tunnel_add(struct tunnel_ctx *tc, const struct
> sbrec_sb_global *sbg,
> >  smap_add(, "dst_port", dst_port);
> >  }
> >
> > +const struct ovsrec_open_vswitch *cfg =
> > +ovsrec_open_vswitch_table_first(ovs_table);
> > +
> > +bool set_local_ip = false;
> > +if (cfg) {
> > +/* If the tos option is configured, get it */
> > +const char *encap_tos = smap_get_def(>external_ids,
> > +   "ovn-encap-tos", "none");
> > +
> > +if (encap_tos && strcmp(encap_tos, "none")) {
> > +smap_add(, "tos", encap_tos);
> > +}
> > +
> > +/* If ovn-set-local-ip option is configured, get it */
> > +set_local_ip = smap_get_bool(>external_ids,
> "ovn-set-local-ip",
> > + false);
> > +}
> > +
> >  /* Add auth info if ipsec is enabled. */
> >  if (sbg->ipsec) {
> > +set_local_ip = true;
> > +smap_add(, "remote_name", new_chassis_id);
> > +}
> > +
> > +if (set_local_ip) {
> >  const struct sbrec_chassis *this_chassis = tc->this_chassis;
> >  const char *local_ip = NULL;
> >
> > @@ -187,8 +211,10 @@ tunnel_add(struct tunnel_ctx *tc, const struct
> sbrec_sb_global *sbg,
> >   */
> >  for (int i = 0; i < this_chassis->n_encaps; i++) {
> >  if (local_ip && strcmp(local_ip,
> this_chassis->encaps[i]->ip)) {
> > -VLOG_ERR("ovn-encap-ip has been configured as a list.
> This "
> > - "is unsupported for IPsec.");
> > +static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_ERR_RL(, "ovn-encap-ip has been configured as a
> list. "
> > +"This is unsupported for IPsec and explicit "
> > +"local_ip configuration.");
> >  /* No need to loop further as we know this condition has
> been
> >   * hit */
> >  break;
> > @@ -200,19 +226,6 @@ tunnel_add(struct tunnel_ctx *tc, const struct
> sbrec_sb_global *sbg,
> >  if (local_ip) {
> >  smap_add(, "local_ip", local_ip);
> >  }
> > -smap_add(, "remote_name", new_chassis_id);
> > -}
> > -
> > -const struct ovsrec_open_vswitch *cfg =
> > -ovsrec_open_vswitch_table_first(ovs_table);
> > -/* If the tos option is configured, get it */
> > -if (cfg) {
> > -const char *encap_tos = smap_get_def(>external_ids,
> > -   "ovn-encap-tos", "none");
> > -
> > -if (encap_tos && strcmp(encap_tos, "none")) {
> > -smap_add(, "tos", encap_tos);
> > -}
> >  }
> >
> >  /* If there's an existing chassis record that does not need any
> change,
> > diff --git a/controller/ovn-controller.8.xml
> b/controller/ovn-controller.8.xml
> > index e9708fe64..cc9a7d1c2 100644
> > --- a/controller/ovn-controller.8.xml
> > +++ b/controller/ovn-controller.8.xml
> > @@ -304,6 +304,13 @@
> >  of how many entries there are in the cache.  By default this is
> set to
> >  3 (30 seconds).
> >
> > +  external_ids:ovn-set-local-ip
> > +  
> > +The boolean flag indicates if ovn-controller when
> create
> > +tunnel ports should set local_ip parameter.  Can be
> > +heplful to pin source outer IP for the tunnel when multiple
> interfaces
> > +are used on the host for overlay traffic.
> > +  
> >  
> >
> >  
> > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> > index e99eec1d6..89ae2c9e1 100644
> > --- a/tests/ovn-controller.at
> > +++ b/tests/ovn-controller.at
> > @@ -298,6 +298,15 @@ OVS_WAIT_UNTIL([check_tunnel_property type geneve])
> >  ovs-vsctl del-port ovn-fakech-0
> >  OVS_WAIT_UNTIL([check_tunnel_property type geneve])
> >
> > +# set `ovn-set-local-ip` option to true and check if 

Re: [ovs-dev] [PATCH ovn v2 00/11] ovn-controller: Fine-grained address set incremental processing.

2022-02-22 Thread Numan Siddique
On Thu, Feb 17, 2022 at 5:54 PM Han Zhou  wrote:
>
> On Thu, Feb 17, 2022 at 1:18 PM Mark Michelson  wrote:
> >
> > Hi Han,
> >
> > I had a look through this series and to be honest, I probably don't have
> > the expertise to critique every line of code. However, the test cases
> > you added, plus my own independent "make sure nothing falls apart"
> > tests, give me enough confidence to OK this patch series.
> >
> > Acked-by: Mark Michelson 
> >
> Thanks Mark for the review! Since Numan mentioned that he will review it as
> well, I will wait for his feedback. It is always better to have more eyes
> on the patches.

Thanks.  I've started reviewing the patches and hoping to provide the
comments soon.

Numan

>
> Thanks,
> Han
>
> > On 2/9/22 12:54, Han Zhou wrote:
> > > Today although the incremental processing engine of ovn-controller
> handles
> > > address set changes incrementally, it is at the logical flow level
> instead of
> > > individual addresses level. A single address change in an address set
> would
> > > cause all the related logical flows being reprocessed.  The cost of
> > > reprocessing a lflow referencing a big address set can be very high.
> When the
> > > change rate of anaddress sets is high, ovn-controller would be busy
> reprocessing
> > > logical flows.
> > >
> > > This patch series optimizes this typical scenario for large scale
> environment
> > > by incrementally processing each individual address updates. When the
> change is
> > > small (e.g. adding/deleting a single address in an address set), this
> results
> > > in constant processing time, regardless of the size of the address set.
> > >
> > > There are limitations that these approaches can't apply. For example,
> when an
> > > ACL is in the below forms:
> > >
> > >  ip.src == $as1 || ip.dst == $as2
> > >  ip.src == {$as1, $as2}
> > >  ip.src == {$as1, ip1}
> > >
> > > In these cases during lflow parsing the expressions are combined to a
> single
> > > OR, which loses the tracking information for the address sets' IPs and
> flows
> > > generated. There are other cases that can't be handled that are
> documented for
> > > the function lflow_handle_addr_set_update, and also added in test
> cases.  In
> > > these cases it just fall back to the old approach that reprocesses the
> > > lflow. So, this change doesn't add any new constraint to the users, but
> just
> > > leave some cases as unoptimized as it was before.
> > >
> > > Scale test shows obvious performance gains because the time complexity
> > > changed from O(n) to O(1). The bigger the size of address set, the more
> > > CPU savings. With the AS size of 10k, the test shows ~40x speed up.
> > >
> > > Test setup:
> > > CPU: Intel(R) Core(TM) i9-7920X CPU @ 2.90GHz.
> > > 5 ACL all referencing an address set of 10,000 IPs.
> > >
> > > Measure the time spent by ovn-controller for handling one IP deletion
> > > from the address set:
> > >
> > > Before: ~400ms
> > > After: 11-12ms
> > >
> > > There is memory cost increase, due to the index built to track each
> individual
> > > addresses. The total memory cost for the OF flows in ovn-controller
> increased
> > > ~20% in the 10k AS size test.
> > >
> > > Before:
> > > ofctrl_desired_flow_usage-KB:22248
> > > ofctrl_installed_flow_usage-KB:14850
> > > ofctrl_sb_flow_ref_usage-KB:7208
> > >
> > > After:
> > > ofctrl_desired_flow_usage-KB:22248
> > > ofctrl_installed_flow_usage-KB:14850
> > > ofctrl_sb_flow_ref_usage-KB:15551
> > >
> > > ---
> > > v1 -> v2:
> > > - Fixed a build error of patch 4, which was caused by misplacing a
> change to
> > >patch 4 which should have been in patch 5. Updated patch 5 commit
> message as
> > >well.
> > >
> > > Changes after RFC:
> > > - Added a new patch for maintaining ref_count in logical flow resource
> > >reference for resource type: address-set, which is needed for the
> correctness
> > >of both IP addition and deletion I-P in some corner cases.
> > > - Fixed the corner case when the same address set is used multiple
> times in the
> > >same lflow with one of the references untrackable. Added test case
> to cover
> > >as well.
> > > - Added more documentation, such as the limitations and corner cases.
> > > - Added tests for ipv6 and mac address support.
> > > - Other minor improvements.
> > >
> > > Han Zhou (11):
> > >expr.c: Use expr_destroy and expr_clone instead of free and xmemdup.
> > >ofctrl.c: Combine remove_flows_from_sb_to_flow and
> > >  ofctrl_flood_remove_flows.
> > >ovn-controller: Track individual IP information of address set during
> > >  lflow parsing.
> > >ovn-controller.c: Remove unnecessary asserts and useless variables.
> > >ovn-controller.c: Refactor init_lflow_ctx.
> > >ovn-controller: Tracking SB address set updates.
> > >lflow.c: Set "changed" properly in lflow_handle_changed_ref().
> > >ovn-controller: Add tests for different ACL address set usage
> > >  patterns.
> > >lflow: 

[ovs-dev] [PATCH v3 9/9] odp-util: Fix output for tc to be equal to kernel

2022-02-22 Thread Eelco Chaudron
When the same flow is programmed in the kernel and tc, they
look different due to the way they are translated. They take
the userspace approach by always including the packet type
attribute. To make the outputs the same, show the ethernet
header when the packet type is wildcarded, and not printed.

So without the fix the kernel would show (ovs-appctl dpctl/dump-flows):

  in_port(3),eth(),eth_type(0x0800),ipv4(frag=no), packets:19, bytes:11614, 
used:0.001s, actions:output

Where as TC would show:

  in_port(3),eth_type(0x0800),ipv4(frag=no), packets:19, bytes:11614, 
used:0.001s, actions:output

Signed-off-by: Eelco Chaudron 
Acked-by: Roi Dayan 
---
 lib/odp-util.c  |5 +
 tests/tunnel.at |2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/odp-util.c b/lib/odp-util.c
index 99dd55807..cd7063046 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -4630,6 +4630,11 @@ odp_flow_format(const struct nlattr *key, size_t key_len,
 ds_put_char(ds, ',');
 }
 ds_put_cstr(ds, "eth()");
+} else if (attr_type == OVS_KEY_ATTR_PACKET_TYPE && is_wildcard) {
+/* See the above help text, however in the case where the
+ * packet type is not shown, we still need to display the
+ * eth() header if the packets type is wildcarded. */
+has_packet_type_key = false;
 }
 ofpbuf_clear();
 }
diff --git a/tests/tunnel.at b/tests/tunnel.at
index b8ae7caa9..fd482aa87 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -126,7 +126,7 @@ AT_CHECK([ovs-appctl dpif/show | tail -n +3], [0], [dnl
 AT_CHECK([ovs-appctl dpctl/add-flow 
"tunnel(dst=1.1.1.1,src=3.3.3.200/255.255.255.0,tp_dst=123,tp_src=1,ttl=64),recirc_id(0),in_port(1),eth(),eth_type(0x0800),ipv4()"
 "2"])
 
 AT_CHECK([ovs-appctl dpctl/dump-flows | tail -1], [0], [dnl
-tunnel(src=3.3.3.200/255.255.255.0,dst=1.1.1.1,ttl=64,tp_src=1,tp_dst=123),recirc_id(0),in_port(1),eth_type(0x0800),
 packets:0, bytes:0, used:never, actions:2
+tunnel(src=3.3.3.200/255.255.255.0,dst=1.1.1.1,ttl=64,tp_src=1,tp_dst=123),recirc_id(0),in_port(1),eth(),eth_type(0x0800),
 packets:0, bytes:0, used:never, actions:2
 ])
 
 OVS_VSWITCHD_STOP

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 8/9] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-02-22 Thread Eelco Chaudron
This patch checks for none offloadable ct_state match flag combinations.
If they exist force the +trk flag down to TC Flower

Signed-off-by: Eelco Chaudron 
---
v3:
 - Instead of warning about an invalid flag combination fix it.

 lib/netdev-offload-tc.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 0105d883f..3d2c1d844 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower *flower, 
struct match *match)
 flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
 flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
 }
+
+if (flower->key.ct_state &&
+!(flower->key.ct_state & TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) {
+flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
+flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
+}
 }
 
 if (mask->ct_zone) {

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 7/9] netdev-offload-tc: Fix IP and port ranges in flower returns.

2022-02-22 Thread Eelco Chaudron
When programming NAT rules OVS only sets the minimum value for a
single IP/port value. However, responses from flower will always
return min == max for single IP/port values. This is causing the
verification to fail as the request is different than the response.
To avoid this, we will update the response to match the request.

Signed-off-by: Eelco Chaudron 
Acked-by: Roi Dayan 
---
 lib/tc.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/lib/tc.c b/lib/tc.c
index ebec097dc..f2778af4c 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -1486,7 +1486,9 @@ nl_parse_act_ct(struct nlattr *options, struct tc_flower 
*flower)
 if (ipv4_max) {
 ovs_be32 addr = nl_attr_get_be32(ipv4_max);
 
-action->ct.range.ipv4.max = addr;
+if (action->ct.range.ipv4.min != addr) {
+action->ct.range.ipv4.max = addr;
+}
 }
 } else if (ipv6_min) {
 action->ct.range.ip_family = AF_INET6;
@@ -1495,7 +1497,9 @@ nl_parse_act_ct(struct nlattr *options, struct tc_flower 
*flower)
 if (ipv6_max) {
 struct in6_addr addr = nl_attr_get_in6_addr(ipv6_max);
 
-action->ct.range.ipv6.max = addr;
+if (!ipv6_addr_equals(>ct.range.ipv6.min, )) {
+action->ct.range.ipv6.max = addr;
+}
 }
 }
 
@@ -1503,6 +1507,10 @@ nl_parse_act_ct(struct nlattr *options, struct tc_flower 
*flower)
 action->ct.range.port.min = nl_attr_get_be16(port_min);
 if (port_max) {
 action->ct.range.port.max = nl_attr_get_be16(port_max);
+if (action->ct.range.port.min ==
+action->ct.range.port.max) {
+action->ct.range.port.max = 0;
+}
 }
 }
 }

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 6/9] netdev-offload-tc: Fix use of ICMP values instead of masks defines.

2022-02-22 Thread Eelco Chaudron
Signed-off-by: Eelco Chaudron 
Acked-by: Roi Dayan 
---
 lib/tc.c |   13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/lib/tc.c b/lib/tc.c
index 8cfd5aa5a..ebec097dc 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -939,24 +939,21 @@ nl_parse_flower_ip(struct nlattr **attrs, struct 
tc_flower *flower) {
 key->icmp_code =
nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV4_CODE]);
 mask->icmp_code =
-nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV4_CODE]);
+nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV4_CODE_MASK]);
 }
 if (attrs[TCA_FLOWER_KEY_ICMPV4_TYPE_MASK]) {
-key->icmp_type =
-   nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV4_TYPE_MASK]);
+key->icmp_type = nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV4_TYPE]);
 mask->icmp_type =
 nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV4_TYPE_MASK]);
 }
 } else if (ip_proto == IPPROTO_ICMPV6) {
 if (attrs[TCA_FLOWER_KEY_ICMPV6_CODE_MASK]) {
-key->icmp_code =
-   nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV6_CODE]);
+key->icmp_code = nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV6_CODE]);
 mask->icmp_code =
- nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV6_CODE]);
+ nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV6_CODE_MASK]);
 }
 if (attrs[TCA_FLOWER_KEY_ICMPV6_TYPE_MASK]) {
-key->icmp_type =
-   nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV6_TYPE_MASK]);
+key->icmp_type = nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV6_TYPE]);
 mask->icmp_type =
 nl_attr_get_u8(attrs[TCA_FLOWER_KEY_ICMPV6_TYPE_MASK]);
 }

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 5/9] netdev-offload-tc: Always include conntrack information to tc

2022-02-22 Thread Eelco Chaudron
Regardless of the traffic type, if requested, the conntrack information
should be included to keep the datapath and tc rules in sync.

Signed-off-by: Eelco Chaudron 
Acked-by: Roi Dayan 
---
 lib/tc.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/tc.c b/lib/tc.c
index 48a24282d..8cfd5aa5a 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -2903,13 +2903,13 @@ nl_msg_put_flower_options(struct ofpbuf *request, 
struct tc_flower *flower)
 FLOWER_PUT_MASKED_VALUE(icmp_code, TCA_FLOWER_KEY_ICMPV6_CODE);
 FLOWER_PUT_MASKED_VALUE(icmp_type, TCA_FLOWER_KEY_ICMPV6_TYPE);
 }
-
-FLOWER_PUT_MASKED_VALUE(ct_state, TCA_FLOWER_KEY_CT_STATE);
-FLOWER_PUT_MASKED_VALUE(ct_zone, TCA_FLOWER_KEY_CT_ZONE);
-FLOWER_PUT_MASKED_VALUE(ct_mark, TCA_FLOWER_KEY_CT_MARK);
-FLOWER_PUT_MASKED_VALUE(ct_label, TCA_FLOWER_KEY_CT_LABELS);
 }
 
+FLOWER_PUT_MASKED_VALUE(ct_state, TCA_FLOWER_KEY_CT_STATE);
+FLOWER_PUT_MASKED_VALUE(ct_zone, TCA_FLOWER_KEY_CT_ZONE);
+FLOWER_PUT_MASKED_VALUE(ct_mark, TCA_FLOWER_KEY_CT_MARK);
+FLOWER_PUT_MASKED_VALUE(ct_label, TCA_FLOWER_KEY_CT_LABELS);
+
 if (host_eth_type == ETH_P_IP) {
 FLOWER_PUT_MASKED_VALUE(ipv4.ipv4_src, TCA_FLOWER_KEY_IPV4_SRC);
 FLOWER_PUT_MASKED_VALUE(ipv4.ipv4_dst, TCA_FLOWER_KEY_IPV4_DST);

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 4/9] netdev-offload-tc: Check for valid netdev ifindex in flow_put

2022-02-22 Thread Eelco Chaudron
Verify that the returned ifindex by netdev_get_ifindex() is valid.
This might not be the case in the ERSPAN port scenario, which can
not be offloaded.

Signed-off-by: Eelco Chaudron 
---
v3:
  - Fixed netdev reference issue on failure
  - Added netdev_flow_api_equals() check

 lib/netdev-offload-tc.c |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 9845e8d3f..0105d883f 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1841,7 +1841,25 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
*match,
 VLOG_DBG_RL(, "Can't find netdev for output port %d", port);
 return ENODEV;
 }
+
+if (!netdev_flow_api_equals(netdev, outdev)) {
+VLOG_DBG_RL(,
+"Flow API provider mismatch between ingress (%s) "
+"and egress (%s) ports",
+netdev_get_name(netdev), netdev_get_name(outdev));
+netdev_close(outdev);
+return EOPNOTSUPP;
+}
+
 action->out.ifindex_out = netdev_get_ifindex(outdev);
+if (action->out.ifindex_out < 0) {
+VLOG_DBG_RL(,
+"Can't find ifindex for output port %s, error %d",
+netdev_get_name(outdev), action->out.ifindex_out);
+netdev_close(outdev);
+return -action->out.ifindex_out;
+}
+
 action->out.ingress = is_internal_port(netdev_get_type(outdev));
 action->type = TC_ACT_OUTPUT;
 flower.action_count++;

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 3/9] odp_util: Fix parse_key_and_mask_to_match() vlan parsing

2022-02-22 Thread Eelco Chaudron
The parse_key_and_mask_to_match() is a function to translate
a netlink formatted key/mask to match structure. And should
not consider any configuration setting when translating.

Signed-off-by: Eelco Chaudron 
Acked-by: Roi Dayan 
---
 lib/odp-util.c |   41 -
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/lib/odp-util.c b/lib/odp-util.c
index 0ff302856..99dd55807 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -7188,7 +7188,8 @@ parse_8021q_onward(const struct nlattr 
*attrs[OVS_KEY_ATTR_MAX + 1],
uint64_t present_attrs, int out_of_range_attr,
uint64_t expected_attrs, struct flow *flow,
const struct nlattr *key, size_t key_len,
-   const struct flow *src_flow, char **errorp)
+   const struct flow *src_flow, char **errorp,
+   bool ignore_vlan_limit)
 {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
 bool is_mask = src_flow != flow;
@@ -7196,9 +7197,11 @@ parse_8021q_onward(const struct nlattr 
*attrs[OVS_KEY_ATTR_MAX + 1],
 const struct nlattr *encap;
 enum odp_key_fitness encap_fitness;
 enum odp_key_fitness fitness = ODP_FIT_ERROR;
+int vlan_limit;
 int encaps = 0;
 
-while (encaps < flow_vlan_limit &&
+vlan_limit = ignore_vlan_limit ? FLOW_MAX_VLAN_HEADERS : flow_vlan_limit;
+while (encaps < vlan_limit &&
(is_mask
 ? (src_flow->vlans[encaps].tci & htons(VLAN_CFI)) != 0
 : eth_type_vlan(flow->dl_type))) {
@@ -7281,7 +7284,7 @@ parse_8021q_onward(const struct nlattr 
*attrs[OVS_KEY_ATTR_MAX + 1],
 static enum odp_key_fitness
 odp_flow_key_to_flow__(const struct nlattr *key, size_t key_len,
struct flow *flow, const struct flow *src_flow,
-   char **errorp)
+   char **errorp, bool ignore_vlan_limit)
 {
 /* New "struct flow" fields that are visible to the datapath (including all
  * data fields) should be translated from equivalent datapath flow fields
@@ -7431,7 +7434,7 @@ odp_flow_key_to_flow__(const struct nlattr *key, size_t 
key_len,
 : eth_type_vlan(src_flow->dl_type)) {
 fitness = parse_8021q_onward(attrs, present_attrs, out_of_range_attr,
  expected_attrs, flow, key, key_len,
- src_flow, errorp);
+ src_flow, errorp, ignore_vlan_limit);
 } else {
 if (is_mask) {
 /* A missing VLAN mask means exact match on vlan_tci 0 (== no
@@ -7497,7 +7500,7 @@ enum odp_key_fitness
 odp_flow_key_to_flow(const struct nlattr *key, size_t key_len,
  struct flow *flow, char **errorp)
 {
-return odp_flow_key_to_flow__(key, key_len, flow, flow, errorp);
+return odp_flow_key_to_flow__(key, key_len, flow, flow, errorp, false);
 }
 
 /* Converts the 'mask_key_len' bytes of OVS_KEY_ATTR_* attributes in 'mask_key'
@@ -7509,14 +7512,16 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t 
key_len,
  * If 'errorp' is nonnull, this function uses it for detailed error reports: if
  * the return value is ODP_FIT_ERROR, it stores a malloc()'d error string in
  * '*errorp', otherwise NULL. */
-enum odp_key_fitness
-odp_flow_key_to_mask(const struct nlattr *mask_key, size_t mask_key_len,
- struct flow_wildcards *mask, const struct flow *src_flow,
- char **errorp)
+static enum odp_key_fitness
+odp_flow_key_to_mask__(const struct nlattr *mask_key, size_t mask_key_len,
+   struct flow_wildcards *mask,
+   const struct flow *src_flow,
+   char **errorp, bool ignore_vlan_limit)
 {
 if (mask_key_len) {
 return odp_flow_key_to_flow__(mask_key, mask_key_len,
-  >masks, src_flow, errorp);
+  >masks, src_flow, errorp,
+  ignore_vlan_limit);
 } else {
 if (errorp) {
 *errorp = NULL;
@@ -7530,6 +7535,15 @@ odp_flow_key_to_mask(const struct nlattr *mask_key, 
size_t mask_key_len,
 }
 }
 
+enum odp_key_fitness
+odp_flow_key_to_mask(const struct nlattr *mask_key, size_t mask_key_len,
+ struct flow_wildcards *mask,
+ const struct flow *src_flow, char **errorp)
+{
+return odp_flow_key_to_mask__(mask_key, mask_key_len, mask, src_flow,
+  errorp, false);
+}
+
 /* Converts the netlink formated key/mask to match.
  * Fails if odp_flow_key_from_key/mask and odp_flow_key_key/mask
  * disagree on the acceptable form of flow */
@@ -7540,7 +7554,8 @@ parse_key_and_mask_to_match(const struct nlattr *key, 
size_t key_len,
 {
 enum odp_key_fitness fitness;
 
-fitness = odp_flow_key_to_flow(key, key_len, >flow, NULL);
+fitness = 

[ovs-dev] [PATCH v3 2/9] netdev-offload-tc: Set the correct VLAN_VID and VLAN_PCP masks

2022-02-22 Thread Eelco Chaudron
This change will set the correct VID and PCP masks, as well as the
ethernet type mask.

Signed-off-by: Eelco Chaudron 
Acked-by: Roi Dayan 
---
v2: Fixed sparse "expected restricted ovs_be16" warning

 lib/tc.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/tc.c b/lib/tc.c
index 77305a105..48a24282d 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -568,16 +568,17 @@ nl_parse_flower_vlan(struct nlattr **attrs, struct 
tc_flower *flower)
 
 flower->key.encap_eth_type[0] =
 nl_attr_get_be16(attrs[TCA_FLOWER_KEY_ETH_TYPE]);
+flower->mask.encap_eth_type[0] = CONSTANT_HTONS(0x);
 
 if (attrs[TCA_FLOWER_KEY_VLAN_ID]) {
 flower->key.vlan_id[0] =
 nl_attr_get_u16(attrs[TCA_FLOWER_KEY_VLAN_ID]);
-flower->mask.vlan_id[0] = 0x;
+flower->mask.vlan_id[0] = VLAN_VID_MASK >> VLAN_VID_SHIFT;
 }
 if (attrs[TCA_FLOWER_KEY_VLAN_PRIO]) {
 flower->key.vlan_prio[0] =
 nl_attr_get_u8(attrs[TCA_FLOWER_KEY_VLAN_PRIO]);
-flower->mask.vlan_prio[0] = 0xff;
+flower->mask.vlan_prio[0] = VLAN_PCP_MASK >> VLAN_PCP_SHIFT;
 }
 
 if (!attrs[TCA_FLOWER_KEY_VLAN_ETH_TYPE]) {
@@ -590,17 +591,18 @@ nl_parse_flower_vlan(struct nlattr **attrs, struct 
tc_flower *flower)
 }
 
 flower->key.encap_eth_type[1] = flower->key.encap_eth_type[0];
+flower->mask.encap_eth_type[1] = CONSTANT_HTONS(0x);
 flower->key.encap_eth_type[0] = encap_ethtype;
 
 if (attrs[TCA_FLOWER_KEY_CVLAN_ID]) {
 flower->key.vlan_id[1] =
 nl_attr_get_u16(attrs[TCA_FLOWER_KEY_CVLAN_ID]);
-flower->mask.vlan_id[1] = 0x;
+flower->mask.vlan_id[1] = VLAN_VID_MASK >> VLAN_VID_SHIFT;
 }
 if (attrs[TCA_FLOWER_KEY_CVLAN_PRIO]) {
 flower->key.vlan_prio[1] =
 nl_attr_get_u8(attrs[TCA_FLOWER_KEY_CVLAN_PRIO]);
-flower->mask.vlan_prio[1] = 0xff;
+flower->mask.vlan_prio[1] = VLAN_PCP_MASK >> VLAN_PCP_SHIFT;
 }
 }
 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 1/9] netdev-offload-tc: Add debug logs on tc rule verify failures

2022-02-22 Thread Eelco Chaudron
This patch adds more detailed debug logs on tc verify failures to
ease debugging the actual cause after the fact.

Signed-off-by: Eelco Chaudron 
Acked-by: Roi Dayan 
---
 lib/tc.c |   79 +-
 1 file changed, 73 insertions(+), 6 deletions(-)

diff --git a/lib/tc.c b/lib/tc.c
index 38a1dfc0e..77305a105 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -2980,12 +2980,78 @@ nl_msg_put_flower_options(struct ofpbuf *request, 
struct tc_flower *flower)
 return 0;
 }
 
+static void log_tc_flower_match(const char *msg,
+const struct tc_flower *a,
+const struct tc_flower *b)
+{
+uint8_t key_a[sizeof(struct tc_flower_key)];
+uint8_t key_b[sizeof(struct tc_flower_key)];
+struct ds s = DS_EMPTY_INITIALIZER;
+
+for (int i = 0; i < sizeof a->key; i++) {
+uint8_t mask_a = ((uint8_t *)>mask)[i];
+uint8_t mask_b = ((uint8_t *)>mask)[i];
+
+key_a[i] = ((uint8_t *)>key)[i] & mask_a;
+key_b[i] = ((uint8_t *)>key)[i] & mask_b;
+}
+ds_put_cstr(, "\nExpected Mask:\n");
+ds_put_hex(, >mask, sizeof a->mask);
+ds_put_cstr(, "\nReceived Mask:\n");
+ds_put_hex(, >mask, sizeof b->mask);
+ds_put_cstr(, "\nExpected Key:\n");
+ds_put_hex(, >key, sizeof a->key);
+ds_put_cstr(, "\nReceived Key:\n");
+ds_put_hex(, >key, sizeof b->key);
+ds_put_cstr(, "\nExpected Masked Key:\n");
+ds_put_hex(, key_a, sizeof key_a);
+ds_put_cstr(, "\nReceived Masked Key:\n");
+ds_put_hex(, key_b, sizeof key_b);
+
+if (a->action_count != b->action_count) {
+/* If action count is not equal, we print all actions to see which
+ * ones are missing. */
+const struct tc_action *action;
+int i;
+
+ds_put_cstr(, "\nExpected Actions:\n");
+for (i = 0, action = a->actions; i < a->action_count; i++, action++) {
+ds_put_cstr(, " - ");
+ds_put_hex(, action, sizeof *action);
+ds_put_cstr(, "\n");
+}
+ds_put_cstr(, "Received Actions:\n");
+for (i = 0, action = b->actions; i < b->action_count; i++, action++) {
+ds_put_cstr(, " - ");
+ds_put_hex(, action, sizeof *action);
+ds_put_cstr(, "\n");
+}
+} else {
+/* Only dump the delta in actions. */
+const struct tc_action *action_a = a->actions;
+const struct tc_action *action_b = b->actions;
+
+for (int i = 0; i < a->action_count; i++, action_a++, action_b++) {
+if (memcmp(action_a, action_b, sizeof *action_a)) {
+ds_put_format(,
+  "\nAction %d mismatch:\n - Expected Action: ",
+  i);
+ds_put_hex(, action_a, sizeof *action_a);
+ds_put_cstr(, "\n - Received Action: ");
+ds_put_hex(, action_b, sizeof *action_b);
+}
+}
+}
+VLOG_DBG_RL(_rl, "%s%s", msg, ds_cstr());
+ds_destroy();
+}
+
 static bool
 cmp_tc_flower_match_action(const struct tc_flower *a,
const struct tc_flower *b)
 {
 if (memcmp(>mask, >mask, sizeof a->mask)) {
-VLOG_DBG_RL(_rl, "tc flower compare failed mask compare");
+log_tc_flower_match("tc flower compare failed mask compare:", a, b);
 return false;
 }
 
@@ -2998,8 +3064,8 @@ cmp_tc_flower_match_action(const struct tc_flower *a,
 uint8_t key_b = ((uint8_t *)>key)[i] & mask;
 
 if (key_a != key_b) {
-VLOG_DBG_RL(_rl, "tc flower compare failed key compare at "
-"%d", i);
+log_tc_flower_match("tc flower compare failed masked key compare:",
+a, b);
 return false;
 }
 }
@@ -3009,14 +3075,15 @@ cmp_tc_flower_match_action(const struct tc_flower *a,
 const struct tc_action *action_b = b->actions;
 
 if (a->action_count != b->action_count) {
-VLOG_DBG_RL(_rl, "tc flower compare failed action length check");
+log_tc_flower_match("tc flower compare failed action length check",
+a, b);
 return false;
 }
 
 for (int i = 0; i < a->action_count; i++, action_a++, action_b++) {
 if (memcmp(action_a, action_b, sizeof *action_a)) {
-VLOG_DBG_RL(_rl, "tc flower compare failed action compare "
-"for %d", i);
+log_tc_flower_match("tc flower compare failed action compare",
+a, b);
 return false;
 }
 }

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 0/9] netdev-offload-tc: Fix various tc-offload related problems

2022-02-22 Thread Eelco Chaudron
This series fixes a bunch of TC offload-related issues found when
running the kernel data path self-test (make check-kernel) forcing
TC to be enabled. To do this manually I applied the following diff:

  
https://github.com/chaudron/ovs/commit/d7ff1060f371e2cbd4b2e8dee222a9eed55073c8

These changes are OVS-related fixes, however, I have still two
problems that look kernel to investigate.

I'm also planning to find a nice way to include the system-traffic.at
into "make check-offloads". However, this might take some time, and I
do not want to delay getting the actual fixes in.

The following branch has some raw changes to system-traffic.at to
make most of the tests work with TC enabled hardcoded:

  https://github.com/chaudron/ovs/tree/dev/tc_verify

v2:
  - Fixed sparse "expected restricted ovs_be16" warning
  - Removed patch "netdev-offload-tc: stats should be captured on first action 
in the list"
  - Added new patch "odp-util: Fix output for tc to be equal to kernel"

v3:
  - [4/10] Fixed netdev reference issue on failure
  - [4/10] Added netdev_flow_api_equals() check
  - [8/10] Instead of warning about an invalid CT flag combination fix it
  - [9/10] Removing patch 9, "revalidator: Fix datapath statistics update"

Eelco Chaudron (9):
  netdev-offload-tc: Add debug logs on tc rule verify failures
  netdev-offload-tc: Set the correct VLAN_VID and VLAN_PCP masks
  odp_util: Fix parse_key_and_mask_to_match() vlan parsing
  netdev-offload-tc: Check for valid netdev ifindex in flow_put
  netdev-offload-tc: Always include conntrack information to tc
  netdev-offload-tc: Fix use of ICMP values instead of masks defines.
  netdev-offload-tc: Fix IP and port ranges in flower returns.
  netdev-offload-tc: Check for none offloadable ct_state flag combination
  odp-util: Fix output for tc to be equal to kernel


 lib/netdev-offload-tc.c |  24 
 lib/odp-util.c  |  46 ++-
 lib/tc.c| 124 
 tests/tunnel.at |   2 +-
 4 files changed, 157 insertions(+), 39 deletions(-)

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovsdb: raft: Fix inability to join the cluster after interrupted attempt.

2022-02-22 Thread Dumitru Ceara
On 1/28/22 19:51, Ilya Maximets wrote:
> If the joining server re-connects while catching up (e.g. if it crashed
> or connection got closed due to inactivity), the data we sent might be
> lost, so the server will never reply to append request or a snapshot
> installation request.  At the same time, leader will decline all the
> subsequent requests to join from that server with the 'in progress'
> resolution.  At this point the new server will never be able to join
> the cluster, because it will never receive the raft log while leader
> thinks that it was already sent.
> 
> This happened in practice when one of the servers got preempted for a
> few seconds, so the leader closed connection due to inactivity.
> 
> Destroying the joining server if disconnection detected.  This will
> allow to start the joining from scratch when the server re-connects
> and sends the new join request.
> 
> We can't track re-connection in the raft_conn_run(), because it's
> incoming connection and the jsonrpc will not keep it alive or
> try to reconnect.  Next time the server re-connects it will be an
> entirely new raft conn.
> 
> Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered 
> databases.")
> Reported-at: https://bugzilla.redhat.com/2033514
> Signed-off-by: Ilya Maximets 
> ---

As far as I can tell, this change is fine; the test case also helps!

Acked-by: Dumitru Ceara 

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] |fail| pw1595962 [PATCH ovn] pinctrl: give a 'no such name' response when no DNS record is available

2022-02-22 Thread 0-day Robot
From: ro...@bytheb.org

Test-Label: github-robot: ovn-kubernetes
Test-Status: fail
http://patchwork.ozlabs.org/api/patches/1595962/

_github build: failed_
Build URL: https://github.com/ovsrobot/ovn/actions/runs/1881582770
Build Logs:
---Summary of failed steps---
"Build" failed at step Build ovn-kubernetes container
--End summary of failed steps

---BEGIN LOGS

 [Begin job log] "Build" at step Build ovn-kubernetes container

  |  ^~
controller/pinctrl.c:2992:54: note: each undeclared identifier is 
reported only once for each function it appears in
controller/pinctrl.c:3020:9: error: 'ancount' undeclared (first use in 
this function)
 3020 | if (ancount == 0) {
  | ^~~
make[2]: *** [Makefile:2308: controller/pinctrl.o] Error 1
make[2]: *** Waiting for unfinished jobs
make[2]: Leaving directory '/tmp/ovn/rpm/rpmbuild/BUILD/ovn-21.12.90'
make[1]: Leaving directory '/tmp/ovn/rpm/rpmbuild/BUILD/ovn-21.12.90'
make[1]: *** [Makefile:1519: all] Error 2


RPM build errors:
error: Bad exit status from /var/tmp/rpm-tmp.BisZ44 (%build)
line 33: Possible unexpanded macro in: Obsoletes: openvswitch-ovn-common < 
21.12.90-%{release}
line 34: Possible unexpanded macro in: Provides: openvswitch-ovn-common = 
21.12.90-%{release}
line 84: It's not recommended to have unversioned Obsoletes: Obsoletes: 
openvswitch-ovn-central
line 95: It's not recommended to have unversioned Obsoletes: Obsoletes: 
openvswitch-ovn-host
line 105: It's not recommended to have unversioned Obsoletes: Obsoletes: 
openvswitch-ovn-vtep
line 115: It's not recommended to have unversioned Obsoletes: Obsoletes: 
openvswitch-ovn-docker
Bad exit status from /var/tmp/rpm-tmp.BisZ44 (%build)
make: *** [Makefile:3767: rpm-fedora] Error 1
The command '/bin/sh -c make rpm-fedora' returned a non-zero code: 2

##[error]Process completed with exit code 2.

 [End job log] "Build" at step Build ovn-kubernetes container

END LOGS-
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] |fail| pw1595962 [PATCH ovn] pinctrl: give a 'no such name' response when no DNS record is available

2022-02-22 Thread 0-day Robot
From: ro...@bytheb.org

Test-Label: github-robot: Build and Test
Test-Status: fail
http://patchwork.ozlabs.org/api/patches/1595962/

_github build: failed_
Build URL: https://github.com/ovsrobot/ovn/actions/runs/1881582768
Build Logs:
---Summary of failed steps---
"osx clang --disable-ssl" failed at step build
"linux gcc --disable-ssl" failed at step build
"linux clang --disable-ssl" failed at step build
"linux gcc test" failed at step build
"linux gcc system-test" failed at step build
"linux clang test asan" failed at step build
"linux gcc test -ljemalloc" failed at step build
"linux clang test -ljemalloc" failed at step build
"linux gcc m32 --disable-ssl" failed at step build
--End summary of failed steps

---BEGIN LOGS

 [Begin job log] "osx clang --disable-ssl" at step build

#define HAVE_PRAGMA_MESSAGE 1

configure: exit 0

## -- ##
## Running config.status. ##
## -- ##

This file was extended by ovn config.status 21.12.90, which was
generated by GNU Autoconf 2.71.  Invocation command line was

  CONFIG_FILES= 
  CONFIG_HEADERS  = 
  CONFIG_LINKS= 
  CONFIG_COMMANDS = 
  $ ./config.status Makefile depfiles

on Mac-1645533059795.local

config.status:1309: creating Makefile
config.status:1538: executing depfiles commands
config.status:1640: cd .   && sed -e '/# am--include-marker/d' Makefile 
| make -f - am--depfiles
make[1]: Nothing to be done for `am--depfiles'.
config.status:1645: $? = 0
##[error]Process completed with exit code 1.

 [End job log] "osx clang --disable-ssl" at step build






 [Begin job log] "linux gcc --disable-ssl" at step build


## -- ##
## Running config.status. ##
## -- ##

This file was extended by ovn config.status 21.12.90, which was
generated by GNU Autoconf 2.69.  Invocation command line was

  CONFIG_FILES= 
  CONFIG_HEADERS  = 
  CONFIG_LINKS= 
  CONFIG_COMMANDS = 
  $ ./config.status Makefile depfiles

on fv-az163-348

config.status:1310: creating Makefile
config.status:1539: executing depfiles commands
config.status:1639: cd .   && sed -e '/# am--include-marker/d' Makefile 
| make -f - am--depfiles
make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make 
rule.
make[1]: Entering directory '/home/runner/work/ovn/ovn'
make[1]: Nothing to be done for 'am--depfiles'.
make[1]: Leaving directory '/home/runner/work/ovn/ovn'
config.status:1644: $? = 0
##[error]Process completed with exit code 1.

 [End job log] "linux gcc --disable-ssl" at step build






 [Begin job log] "linux clang --disable-ssl" at step build


## -- ##
## Running config.status. ##
## -- ##

This file was extended by ovn config.status 21.12.90, which was
generated by GNU Autoconf 2.69.  Invocation command line was

  CONFIG_FILES= 
  CONFIG_HEADERS  = 
  CONFIG_LINKS= 
  CONFIG_COMMANDS = 
  $ ./config.status Makefile depfiles

on fv-az163-348

config.status:1311: creating Makefile
config.status:1540: executing depfiles commands
config.status:1640: cd .   && sed -e '/# am--include-marker/d' Makefile 
| make -f - am--depfiles
make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make 
rule.
make[1]: Entering directory '/home/runner/work/ovn/ovn'
make[1]: Nothing to be done for 'am--depfiles'.
make[1]: Leaving directory '/home/runner/work/ovn/ovn'
config.status:1645: $? = 0
##[error]Process completed with exit code 1.

 [End job log] "linux clang --disable-ssl" at step build






 [Begin job log] "linux gcc test" at step build

writing... ovn-sim.1 { } done
build succeeded.

The 

Re: [ovs-dev] [PATCH ovn] pinctrl: give a 'no such name' response when no DNS record is available

2022-02-22 Thread 0-day Robot
Bleep bloop.  Greetings 294754599--- via dev, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


build:
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT controller/lflow-conj-ids.o -MD -MP -MF $depbase.Tpo -c -o 
controller/lflow-conj-ids.o controller/lflow-conj-ids.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo controller/lport.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT controller/lport.o -MD -MP -MF $depbase.Tpo -c -o controller/lport.o 
controller/lport.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo controller/ofctrl.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT controller/ofctrl.o -MD -MP -MF $depbase.Tpo -c -o controller/ofctrl.o 
controller/ofctrl.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo controller/ofctrl-seqno.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/lib
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT controller/ofctrl-seqno.o -MD -MP -MF $depbase.Tpo -c -o 
controller/ofctrl-seqno.o controller/ofctrl-seqno.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo controller/pinctrl.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/OVSDIR/include
 

[ovs-dev] [PATCH ovn] pinctrl: give a 'no such name' response when no DNS record is available

2022-02-22 Thread 294754599--- via dev
From: Yuan-96 <294754...@qq.com>

While the CentOS 8 requests an IPv4 DNS record, the response will be slow 
without a correct IPv6 DNS record on server. It will send a DNS request(type 
) message first but ovn controller will not reply. And the DNS request(type 
A) message will be sent until DNS request(type ) is timeout.
So the DNS client may expect to receive a reponse with RCODE 3(RFC1035. Name 
Error) instead of nothing when DNS record is not available. And this patch is 
for that.

Signed-off-by: Yuan-96 <294754...@qq.com>
---
 controller/pinctrl.c | 72 +++-
 1 file changed, 37 insertions(+), 35 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 1b8b475..a2fc1a2 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -2952,47 +2952,41 @@ pinctrl_handle_dns_lookup(
 }
 
 ds_destroy(_name);
-if (!answer_data) {
-goto exit;
-}
 
 
-uint16_t ancount = 0;
-uint64_t dns_ans_stub[128 / 8];
-struct ofpbuf dns_answer = OFPBUF_STUB_INITIALIZER(dns_ans_stub);
+if (answer_data) {
+uint16_t ancount = 0;
+uint64_t dns_ans_stub[128 / 8];
+struct ofpbuf dns_answer = OFPBUF_STUB_INITIALIZER(dns_ans_stub);
 
-if (query_type == DNS_QUERY_TYPE_PTR) {
-dns_build_ptr_answer(_answer, in_queryname, idx, answer_data);
-ancount++;
-} else {
-struct lport_addresses ip_addrs;
-if (!extract_ip_addresses(answer_data, _addrs)) {
-goto exit;
-}
+if (query_type == DNS_QUERY_TYPE_PTR) {
+dns_build_ptr_answer(_answer, in_queryname, idx, answer_data);
+ancount++;
+} else {
+struct lport_addresses ip_addrs;
+if (!extract_ip_addresses(answer_data, _addrs)) {
+goto exit;
+}
 
-if (query_type == DNS_QUERY_TYPE_A ||
-query_type == DNS_QUERY_TYPE_ANY) {
-for (size_t i = 0; i < ip_addrs.n_ipv4_addrs; i++) {
-dns_build_a_answer(_answer, in_queryname, idx,
-   ip_addrs.ipv4_addrs[i].addr);
-ancount++;
+if (query_type == DNS_QUERY_TYPE_A ||
+query_type == DNS_QUERY_TYPE_ANY) {
+for (size_t i = 0; i < ip_addrs.n_ipv4_addrs; i++) {
+dns_build_a_answer(_answer, in_queryname, idx,
+   ip_addrs.ipv4_addrs[i].addr);
+ancount++;
+}
 }
-}
 
-if (query_type == DNS_QUERY_TYPE_ ||
-query_type == DNS_QUERY_TYPE_ANY) {
-for (size_t i = 0; i < ip_addrs.n_ipv6_addrs; i++) {
-dns_build__answer(_answer, in_queryname, idx,
-  _addrs.ipv6_addrs[i].addr);
-ancount++;
+if (query_type == DNS_QUERY_TYPE_ ||
+query_type == DNS_QUERY_TYPE_ANY) {
+for (size_t i = 0; i < ip_addrs.n_ipv6_addrs; i++) {
+dns_build__answer(_answer, in_queryname, idx,
+  _addrs.ipv6_addrs[i].addr);
+ancount++;
+}
 }
+destroy_lport_addresses(_addrs);
 }
-destroy_lport_addresses(_addrs);
-}
-
-if (!ancount) {
-ofpbuf_uninit(_answer);
-goto exit;
 }
 
 uint16_t new_l4_size = ntohs(in_udp->udp_len) +  dns_answer.size;
@@ -3023,6 +3017,12 @@ pinctrl_handle_dns_lookup(
 /* Set the response bit to 1 in the flags. */
 out_dns_header->lo_flag |= 0x80;
 
+if (ancount == 0) {
+/*Set the reply code bits to 0011 in the flags
+ *when here is no dns record. */
+out_dns_header->hi_flag |= 0x03;
+}
+
 /* Set the answer RRs. */
 out_dns_header->ancount = htons(ancount);
 out_dns_header->arcount = 0;
@@ -3030,8 +3030,10 @@ pinctrl_handle_dns_lookup(
 /* Copy the Query section. */
 dp_packet_put(_out, dp_packet_data(pkt_in), dp_packet_size(pkt_in));
 
-/* Copy the answer sections. */
-dp_packet_put(_out, dns_answer.data, dns_answer.size);
+if (dns_answer.size > 0) {
+/* Copy the answer sections. */
+dp_packet_put(_out, dns_answer.data, dns_answer.size);
+}
 ofpbuf_uninit(_answer);
 
 out_udp->udp_len = htons(new_l4_size);
-- 
1.8.3.1


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] reconnect: Fix broken inactivity probe if there is no other reason to wake up.

2022-02-22 Thread Dumitru Ceara
On 2/21/22 15:16, Ilya Maximets wrote:
> The purpose of reconnect_deadline__() function is twofold:
> 
> 1. Its result is used to tell if the state has to be changed right now
>in reconnect_run().
> 2. Its result also used to determine when the process need to wake up
>and call reconnect_run() for a next time, i.e. when the state may
>need to be changed next time.
> 
> Since introduction of the 'receive-attempted' feature, the function
> returns LLONG_MAX if the deadline is in the future.  That works for
> the first case, but doesn't for the second one, because we don't
> really know when we need to call reconnect_run().
> 
> This is the problem for applications where jsonrpc connection is the
> only source of wake ups, e.g. ovn-northd.  When the network goes down
> silently, e.g. server looses IP address due to DHCP failure, ovn-northd
> will sleep in the poll loop indefinitely after being told that it
> doesn't need to call reconnect_run() (deadline == LLONG_MAX).
> 
> Fixing that by actually returning the expected time if it is in the
> future, so we will know when to wake up.  In order to keep the
> 'receive-attempted' feature, returning 'now + 1' in case where the
> time has already passed, but receive wasn't attempted.  That will
> trigger a fast wake up, so the application will be able to attempt the
> receive even if there was no real events.  In a correctly written
> application we should not fall into this case more than once in a row.
> '+ 1' ensures that we will not transition into a different state
> prematurely, i.e. before the receive is actually attempted.
> 
> Fixes: 4241d652e465 ("jsonrpc: Avoid disconnecting prematurely due to long 
> poll intervals.")
> Signed-off-by: Ilya Maximets 
> ---

Hi Ilya,

This change looks good to me, I just have a couple minor comment-related
remarks; in any case:

Acked-by: Dumitru Ceara 

Regards,
Dumitru

> 
> CC: Wentao Jia
> This might be the problem you tried to fix in northd by setting a probe
> interval.  Would be great, if you can test your case with this patch.
> 
>  lib/reconnect.c |  29 +---
>  python/ovs/reconnect.py |  40 +++-
>  tests/reconnect.at  | 102 +++-
>  3 files changed, 133 insertions(+), 38 deletions(-)
> 
> diff --git a/lib/reconnect.c b/lib/reconnect.c
> index a929ddfd2..f6fd165e4 100644
> --- a/lib/reconnect.c
> +++ b/lib/reconnect.c
> @@ -75,7 +75,8 @@ struct reconnect {
>  
>  static void reconnect_transition__(struct reconnect *, long long int now,
> enum state state);
> -static long long int reconnect_deadline__(const struct reconnect *);
> +static long long int reconnect_deadline__(const struct reconnect *,
> +  long long int now);
>  static bool reconnect_may_retry(struct reconnect *);
>  
>  static const char *
> @@ -539,7 +540,7 @@ reconnect_transition__(struct reconnect *fsm, long long 
> int now,
>  }
>  
>  static long long int
> -reconnect_deadline__(const struct reconnect *fsm)
> +reconnect_deadline__(const struct reconnect *fsm, long long int now)
>  {
>  ovs_assert(fsm->state_entered != LLONG_MIN);
>  switch (fsm->state) {
> @@ -557,8 +558,18 @@ reconnect_deadline__(const struct reconnect *fsm)
>  if (fsm->probe_interval) {
>  long long int base = MAX(fsm->last_activity, fsm->state_entered);
>  long long int expiration = base + fsm->probe_interval;
> -if (fsm->last_receive_attempt >= expiration) {
> +if (now < expiration || fsm->last_receive_attempt >= expiration) 
> {
> +/* We still have time before the expiration or the time has
> + * already passed and there was no activity.  In the first 
> case
> + * we need to wait for the expiration, in the second - we're
> + * already past the deadline. */
>  return expiration;
> +} else {
> +/* Time has already passed, but we didn't attempt to receive
> + * anything.  We need to wake up and try to receive even if
> + * nothing is pending, so we can update the expiration time
> + * or transition into S_IDLE state. */
> +return now + 1;
>  }
>  }
>  return LLONG_MAX;
> @@ -566,8 +577,14 @@ reconnect_deadline__(const struct reconnect *fsm)
>  case S_IDLE:
>  if (fsm->probe_interval) {
>  long long int expiration = fsm->state_entered + 
> fsm->probe_interval;
> -if (fsm->last_receive_attempt >= expiration) {
> +if (now < expiration || fsm->last_receive_attempt >= expiration) 
> {

Should we duplicate or add a similar comment to what we have for S_ACTIVE?

>  return expiration;
> +} else {
> +/* Time has already passed, but we didn't attempt to receive
> + * 

Re: [ovs-dev] [PATCH v2 08/10] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-02-22 Thread Marcelo Leitner
+Cc Wenxu, Paul and netdev

On Tue, Feb 22, 2022 at 10:33:44AM +0100, Eelco Chaudron wrote:
>
>
> On 21 Feb 2022, at 15:53, Eelco Chaudron wrote:
>
> > On 21 Feb 2022, at 14:33, Marcelo Leitner wrote:
>
> 
>
>  Don’t think this is true, it will only print if +trk and any other flags 
>  are set.
>  Guess this is where the miscommunication is.>
> > The message also seems to be a bit aggressive, especially since it will
> > almost always be printed.
> >>>
> >>> Yeah.  I missed the fact that you're checking for zero and 
> >>> flower->key.ct_state
> >>> will actually mark existence of other flags.  So, that is fine.
> >>>
> >>> However, I'm still not sure that the condition is fully correct.
> >>>
> >>> If we'll take a match on '+est' with all other flags wildcarded, that will
> >>> trigger the condition, because 'flower->key.ct_state' will contain the 
> >>> 'est' bit,
> >>> but 'trk' bit will not be set.  The point is that even though -trk+est is 
> >>> not
> >>
> >> Oh ow. tc flower will reject this combination today, btw. I don't know
> >> about hw implications for changing that by now.
> >>
> >> https://elixir.bootlin.com/linux/latest/C/ident/fl_validate_ct_state
> >> 'state' parameter in there is the value masked already.
> >>
> >> We directly mapped openflow restrictions to the datapath.
> >>
> >>> a valid combination and +trk+est is, OVS may in theory produce the match 
> >>> with
> >>> 'est' bit set and 'trk' bit wildcarded.  And that can be a correct 
> >>> configuration.
> >>
> >> I guess that means that the only possible parameter validation on
> >> ct_state at tc level is about its length. Thoughts?
> >>
> >
> > Guess I get it now also :) I was missing the wildcard bit that OVS implies 
> > when not specifying any :)
> >
> > I think I can fix this by just adding +trk on the TC side when we get the 
> > OVS wildcard for +trk. Guess this holds true as for TC there is no -trk 
> > +flags.
> >
> > I’m trying to replicate patch 9 all afternoon, and due to the fact I did 
> > not write down which test was causing the problem, and it taking 20-30 
> > runs, it has not happened yet :( But will do it later tomorrow, see if it 
> > works in all cases ;)
> >
>
> So I’ve been doing some experiments (and running all system-traffic tests), 
> and I think the following fix will solve the problem by just making sure the 
> +trk flag is set in this case on the TC side.
> This will not change the behavior compared to the kernel.
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 0105d883f..3d2c1d844 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower 
> *flower, struct match *match)
>  flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
>  flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
>  }
> +
> +if (flower->key.ct_state &&
> +!(flower->key.ct_state & TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) {
> +flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
> +flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
> +}

I had meant to update the kernel instead. As Ilya was saying, as this
is dealing with masks, the validation that tc is doing is not right. I
mean, for a connection to be in +est, it needs +trk, right, but for
matching, one could have the following value/mask:
 value=est
 mask=est
which means: match connections in Established AND also untracked ones.

Apparently this is what the test is triggering here, and the patch
above could lead to not match the 2nd part of the AND above.

When fixing the parameter validation in flower, we went too far.

  Marcelo

>  }
>
> I will send out a v3 of this set soon with this change included.
>
> //Eelco
>
> 
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Simplify atomic function pointer stores

2022-02-22 Thread Ferriter, Cian
Hi Amber,

Thanks for reviewing and Acking!


> -Original Message-
> From: Amber, Kumar 
> Sent: Tuesday 22 February 2022 06:15
> To: ovs-dev@openvswitch.org
> Cc: i.maxim...@ovn.org; Ferriter, Cian 
> Subject: RE: [PATCH] dpif-netdev: Simplify atomic function pointer stores
> 
> Hi Cian,
> 
> Thanks for the Patch.
> 
> I have tested the patch and reviewed as well.
> One small minor comment .
> 
> > +atomic_store_relaxed(>miniflow_extract_opt,
> > +miniflow_funcs[MFEX_IMPL_SCALAR].extract_func);
> >  VLOG_INFO("Not enough packets matched (%u/%u), disabling"
> >" optimized MFEX.", max_hits, stats->pkt_count);
> 
> Alignment of the comment is off.
> 

Yes, unfortunately I couldn't wrap the atomic_store_relaxed line and keep the 
alignment as we'd like. If I do that, the line is 80 characters long which is 
just over the 79 character limit.

> Acked-by: Kumar Amber 
> 
> Regards
> Amber

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 08/10] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-02-22 Thread Eelco Chaudron


On 21 Feb 2022, at 15:53, Eelco Chaudron wrote:

> On 21 Feb 2022, at 14:33, Marcelo Leitner wrote:



 Don’t think this is true, it will only print if +trk and any other flags 
 are set.
 Guess this is where the miscommunication is.>
> The message also seems to be a bit aggressive, especially since it will
> almost always be printed.
>>>
>>> Yeah.  I missed the fact that you're checking for zero and 
>>> flower->key.ct_state
>>> will actually mark existence of other flags.  So, that is fine.
>>>
>>> However, I'm still not sure that the condition is fully correct.
>>>
>>> If we'll take a match on '+est' with all other flags wildcarded, that will
>>> trigger the condition, because 'flower->key.ct_state' will contain the 
>>> 'est' bit,
>>> but 'trk' bit will not be set.  The point is that even though -trk+est is 
>>> not
>>
>> Oh ow. tc flower will reject this combination today, btw. I don't know
>> about hw implications for changing that by now.
>>
>> https://elixir.bootlin.com/linux/latest/C/ident/fl_validate_ct_state
>> 'state' parameter in there is the value masked already.
>>
>> We directly mapped openflow restrictions to the datapath.
>>
>>> a valid combination and +trk+est is, OVS may in theory produce the match 
>>> with
>>> 'est' bit set and 'trk' bit wildcarded.  And that can be a correct 
>>> configuration.
>>
>> I guess that means that the only possible parameter validation on
>> ct_state at tc level is about its length. Thoughts?
>>
>
> Guess I get it now also :) I was missing the wildcard bit that OVS implies 
> when not specifying any :)
>
> I think I can fix this by just adding +trk on the TC side when we get the OVS 
> wildcard for +trk. Guess this holds true as for TC there is no -trk +flags.
>
> I’m trying to replicate patch 9 all afternoon, and due to the fact I did not 
> write down which test was causing the problem, and it taking 20-30 runs, it 
> has not happened yet :( But will do it later tomorrow, see if it works in all 
> cases ;)
>

So I’ve been doing some experiments (and running all system-traffic tests), and 
I think the following fix will solve the problem by just making sure the +trk 
flag is set in this case on the TC side.
This will not change the behavior compared to the kernel.

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 0105d883f..3d2c1d844 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower *flower, 
struct match *match)
 flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
 flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
 }
+
+if (flower->key.ct_state &&
+!(flower->key.ct_state & TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) {
+flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
+flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
+}
 }

I will send out a v3 of this set soon with this change included.

//Eelco



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 09/10] revalidator: Fix datapath statistics update.

2022-02-22 Thread Eelco Chaudron


On 21 Feb 2022, at 12:29, Eelco Chaudron wrote:

> On 17 Feb 2022, at 14:10, Ilya Maximets wrote:
>
>> On 1/31/22 11:54, Eelco Chaudron wrote:
>>> Make sure to only update packet and byte counters when valid,
>>> or else this could lead to "temporarily/occasionally"
>>> out-of-sync flow counters.
>>
>> There was already the same patch submitted here:
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20200602075036.78112-1-zhaozha...@163.com/
>>
>> And I'm still not comfortable with the change, because it seems
>> like it only hides the underlying datapath problem.  Do you know
>> why exactly datapath stats become lower than previously reported?
>>
>> If it's some kind of a statistics flush, that will mean that flow
>> statistics will not be updated until new stats will catch up to the
>> old value leading to the flow revalidation and incorrect flow stats
>> anyway.
>
> This was very hard to reproduce, only once out of 20-30 runs if I remember 
> correctly.
>
> Without the patch, it would sometimes show very high numbers and then got 
> updated after a while with the correct numbers.
> At least this is what I remember, as I did not take any notes, and my brain 
> wanted to forget this patchset :)
>
>
> Guess the fix is needed anyway as this behavior was there since day one in 
> revalidate_ukey(), e79a6c833.
>
> I also see that you got no reply to your comments, so I’ll take another stab 
> to make sure this patch is really fixing the problem, or fixing a problem, 
> and hiding another TC problem.
>
>> We fixed incorrect stats on flow modification for dpdk offload provider
>> previously here:
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20201012142735.5304-1-el...@nvidia.com/
>>
>> Do we need something similar for tc?
>
> I’ll take a close look at the DPDK patch, once I get my reproducer going.

Unfortunately, after an afternoon and night or running tests, I can not 
replicate the problem with the weird counters :(

So for now I’ll drop this patch from the set, and hopefully, it will resurface 
when I’m integrating the system-traffic tests into the hardware offload set.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev