Re: [ovs-dev] [PATCH v3] ofp-prop: Fix unaligned 128 bit access.

2024-06-21 Thread Ilya Maximets
On 6/20/24 07:21, Ales Musil wrote:
> On Wed, Jun 19, 2024 at 3:19 PM Mike Pattrick  wrote:
> 
>> When compiling with '-fsanitize=address,undefined', the "ovs-ofctl
>> ct-flush" test will yield the following undefined behavior flagged by
>> UBSan. This problem is caused by the fact that 128bit property put/parse
>> functions weren't adding appropriate padding before writing or reading
>> the value.
>>
>> This patch uses get_32aligned_* functions to copy the bytes as they are
>> aligned.
>>
>> lib/ofp-prop.c:277:14: runtime error: load of misaligned address
>> 0x6060687c for type 'union ovs_be128', which requires 8 byte
>> alignment
>> 0x6060687c: note: pointer points here
>>   00 05 00 14 00 00 00 00  00 00 00 00 00 00 00 00  00 ff ab 00
>>   ^
>> 0: in ofpprop_parse_u128 lib/ofp-prop.c:277
>> 1: in ofp_ct_match_decode lib/ofp-ct.c:525
>> 2: in ofp_print_nxt_ct_flush lib/ofp-print.c:959
>> 3: in ofp_to_string__ lib/ofp-print.c:1206
>> 4: in ofp_to_string lib/ofp-print.c:1264
>> 5: in ofp_print lib/ofp-print.c:1308
>> 6: in ofctl_ofp_print utilities/ovs-ofctl.c:4899
>> 7: in ovs_cmdl_run_command__ lib/command-line.c:247
>> 8: in ovs_cmdl_run_command lib/command-line.c:278
>> 9: in main utilities/ovs-ofctl.c:186
>>
>> Signed-off-by: Mike Pattrick 
>> ---
>> v2: removed memcpy
>> v3: fixed checkpatch
>> ---
>>  lib/ofp-prop.c | 17 ++---
>>  1 file changed, 10 insertions(+), 7 deletions(-)
>>
> Looks good to me, thanks.
> 
> Acked-by: Ales Musil 
> 

Thanks, Mike and Ales!  I added the missing Fixes tag and applied
the change.  Also backported to 3.3.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5] tunnel: Allow UDP zero checksum with IPv6 tunnels.

2024-06-21 Thread Ilya Maximets
On 3/27/24 16:55, Mike Pattrick wrote:
> This patch adopts the proposed RFC 6935 by allowing null UDP checksums
> even if the tunnel protocol is IPv6. This is already supported by Linux
> through the udp6zerocsumtx tunnel option. It is disabled by default and
> IPv6 tunnels are flagged as requiring a checksum, but this patch enables
> the user to set csum=false on IPv6 tunnels.
> 
> Signed-off-by: Mike Pattrick 
> ---
> v2: Changed documentation, and added a NEWS item
> v3: NEWS file merge conflict
> v4: Better comments, new test
> v5: Addressed identified nit's
> ---
>  NEWS  |  4 
>  lib/netdev-native-tnl.c   |  2 +-
>  lib/netdev-vport.c| 17 +++--
>  lib/netdev.h  | 18 +-
>  ofproto/tunnel.c  | 10 --
>  tests/tunnel-push-pop-ipv6.at |  9 +
>  tests/tunnel-push-pop.at  |  7 +++
>  tests/tunnel.at   |  2 +-
>  vswitchd/vswitch.xml  | 12 +---
>  9 files changed, 71 insertions(+), 10 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index c9e4064e6..6c8c4a2dc 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -4,6 +4,10 @@ Post-v3.3.0
>   * Conntrack now supports 'random' flag for selecting ports in a range
> while natting and 'persistent' flag for selection of the IP address
> from a range.
> + * IPv6 UDP tunnel encapsulation including Geneve and VXLAN will now
> +   honour the csum option.  Configuring the interface with
> +   "options:csum=false" now has the same effect as the udp6zerocsumtx
> +   option has with Linux kernel UDP tunnels.
>  
>  
>  v3.3.0 - 16 Feb 2024
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index dee9ab344..e8258bc4e 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -424,7 +424,7 @@ udp_build_header(const struct netdev_tunnel_config 
> *tnl_cfg,
>  udp = netdev_tnl_ip_build_header(data, params, IPPROTO_UDP, 0);
>  udp->udp_dst = tnl_cfg->dst_port;
>  
> -if (params->is_ipv6 || params->flow->tunnel.flags & FLOW_TNL_F_CSUM) {
> +if (params->flow->tunnel.flags & FLOW_TNL_F_CSUM) {
>  /* Write a value in now to mark that we should compute the checksum
>   * later. 0x is handy because it is transparent to the
>   * calculation. */
> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> index 60caa02fb..234a4ebe1 100644
> --- a/lib/netdev-vport.c
> +++ b/lib/netdev-vport.c
> @@ -702,7 +702,9 @@ set_tunnel_config(struct netdev *dev_, const struct smap 
> *args, char **errp)
>  tnl_cfg.dst_port = htons(atoi(node->value));
>  } else if (!strcmp(node->key, "csum") && has_csum) {
>  if (!strcmp(node->value, "true")) {
> -tnl_cfg.csum = true;
> +tnl_cfg.csum = NETDEV_TNL_CSUM_ENABLED;
> +} else if (!strcmp(node->value, "false")) {
> +tnl_cfg.csum = NETDEV_TNL_CSUM_DISABLED;
>  }
>  } else if (!strcmp(node->key, "seq") && has_seq) {
>  if (!strcmp(node->value, "true")) {
> @@ -850,6 +852,15 @@ set_tunnel_config(struct netdev *dev_, const struct smap 
> *args, char **errp)
>  }
>  }
>  
> +/* The default csum state for GRE is special as it does have an optional
> + * checksum but the default configuration isn't correlated with IP 
> version
> + * like UDP tunnels are.  Likewise, tunnels with no checksum at all must 
> be
> + * in this state. */
> +if (tnl_cfg.csum == NETDEV_TNL_CSUM_DEFAULT &&
> +(!has_csum || strstr(type, "gre"))) {
> +tnl_cfg.csum = NETDEV_TNL_DEFAULT_NO_CSUM;
> +}
> +
>  enum tunnel_layers layers = tunnel_supported_layers(type, _cfg);
>  const char *full_type = (strcmp(type, "vxlan") ? type
>   : (tnl_cfg.exts & (1 << OVS_VXLAN_EXT_GPE)
> @@ -1026,8 +1037,10 @@ get_tunnel_config(const struct netdev *dev, struct 
> smap *args)
>  }
>  }
>  
> -if (tnl_cfg->csum) {
> +if (tnl_cfg->csum == NETDEV_TNL_CSUM_ENABLED) {
>  smap_add(args, "csum", "true");
> +} else if (tnl_cfg->csum == NETDEV_TNL_CSUM_DISABLED) {
> +smap_add(args, "csum", "false");
>  }
>  
>  if (tnl_cfg->set_seq) {
> diff --git a/lib/netdev.h b/lib/netdev.h
> index 67a8486bd..5d253157c 100644
> --- a/lib/netdev.h
> +++ b/lib/netdev.h
> @@ -111,6 +111,22 @@ enum netdev_srv6_flowlabel {
>  SRV6_FLOWLABEL_COMPUTE,
>  };
>  
> +enum netdev_tnl_csum {
> +/* Default value for UDP tunnels if no configurations is present.  
> Enforce
> + * checksum calculation in IPv6 tunnels, disable in IPv4 tunnels. */
> +NETDEV_TNL_CSUM_DEFAULT = 0,
> +
> +/* Checksum explicitly to be calculated. */
> +NETDEV_TNL_CSUM_ENABLED,
> +
> +/* Checksum calculation explicitly disabled. */
> +NETDEV_TNL_CSUM_DISABLED,
> +
> +/* A value for when there is no checksum or the default value is no
> + 

Re: [ovs-dev] [PATCH net] openvswitch: get related ct labels from its master if it is not confirmed

2024-06-20 Thread Ilya Maximets
On 6/20/24 00:08, Xin Long wrote:
> Ilya found a failure in running check-kernel tests with at_groups=144
> (144: conntrack - FTP SNAT orig tuple) in OVS repo. After his further
> investigation, the root cause is that the labels sent to userspace
> for related ct are incorrect.
> 
> The labels for unconfirmed related ct should use its master's labels.
> However, the changes made in commit 8c8b73320805 ("openvswitch: set
> IPS_CONFIRMED in tmpl status only when commit is set in conntrack")
> led to getting labels from this related ct.
> 
> So fix it in ovs_ct_get_labels() by changing to copy labels from its
> master ct if it is a unconfirmed related ct. Note that there is no
> fix needed for ct->mark, as it was already copied from its master
> ct for related ct in init_conntrack().
> 
> Fixes: 8c8b73320805 ("openvswitch: set IPS_CONFIRMED in tmpl status only when 
> commit is set in conntrack")
> Reported-by: Ilya Maximets 
> Signed-off-by: Xin Long 
> ---
>  net/openvswitch/conntrack.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 331730fd3580..920e802ff01e 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -167,8 +167,13 @@ static u32 ovs_ct_get_mark(const struct nf_conn *ct)
>  static void ovs_ct_get_labels(const struct nf_conn *ct,
> struct ovs_key_ct_labels *labels)
>  {
> - struct nf_conn_labels *cl = ct ? nf_ct_labels_find(ct) : NULL;
> + struct nf_conn_labels *cl = NULL;
>  
> + if (ct) {
> + if (ct->master && !nf_ct_is_confirmed(ct))
> + ct = ct->master;
> + cl = nf_ct_labels_find(ct);
> + }
>   if (cl)
>   memcpy(labels, cl->bits, OVS_CT_LABELS_LEN);
>   else

Thanks, Xin!  LGTM.

Tested with OVS testsuite and it works fine.  Also re-checked OVN
system tests and they also work as expected.

Reviewed-by: Ilya Maximets 
Tested-by: Ilya Maximets 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample

2024-06-19 Thread Ilya Maximets
On 6/19/24 22:40, Adrián Moreno wrote:
> On Wed, Jun 19, 2024 at 08:21:02PM GMT, Ilya Maximets wrote:
>> On 6/19/24 08:35, Adrián Moreno wrote:
>>> On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
>>>> On 6/18/24 12:50, Adrián Moreno wrote:
>>>>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>>>>>> On 6/18/24 09:00, Adrián Moreno wrote:
>>>>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>>>>>> observability-oriented.
>>>>>>>>>>
>>>>>>>>>> Apart from some corner case in which it's used a replacement of 
>>>>>>>>>> clone()
>>>>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>>>>>> local emit_sample.
>>>>>>>>>>
>>>>>>>>>> With this in mind, it doesn't make much sense to report
>>>>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>>>>>
>>>>>>>>>> For instance, if the flow:
>>>>>>>>>>
>>>>>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>>>>>
>>>>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>>>>>> confusing for users since the packet did reach its destination.
>>>>>>>>>>
>>>>>>>>>> This patch makes internal action execution silently consume the skb
>>>>>>>>>> instead of notifying a drop for this case.
>>>>>>>>>>
>>>>>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>>>>>
>>>>>>>>>> actions:sample(..,emit_sample(..))
>>>>>>>>>>
>>>>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we 
>>>>>>>>>> aren't.
>>>>>>>>>>
>>>>>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>>>>>> But, given explicit drop actions are now supported, OVS can just add 
>>>>>>>>>> one
>>>>>>>>>> after the last sample() and rewrite the flow as:
>>>>>>>>>>
>>>>>>>>>> actions:sample(..,emit_sample(..)),drop
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Adrian Moreno 
>>>>>>>>>> ---
>>>>>>>>>>  net/openvswitch/actions.c | 13 +++--
>>>>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>>>>>> --- a/net/openvswitch/actions.c
>>>>>>>>>> +++ b/net/openvswitch/actions.c
>>>>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>>>>>
>>>>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>>>>>> +{
>>>>>>>>>> +/* Do not emit packet drops inside sample(). */
>>>>>>>>>> +if (OVS_CB(skb)->probability)
>>>>>>>>>> +consume_skb(skb);
>>>>>>>>>> +else
>>>>>>>>>> +ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>>>> +}
>>>>>>

Re: [ovs-dev] [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample

2024-06-19 Thread Ilya Maximets
On 6/19/24 08:35, Adrián Moreno wrote:
> On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
>> On 6/18/24 12:50, Adrián Moreno wrote:
>>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>>>> On 6/18/24 09:00, Adrián Moreno wrote:
>>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>>>> observability-oriented.
>>>>>>>>
>>>>>>>> Apart from some corner case in which it's used a replacement of clone()
>>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>>>> local emit_sample.
>>>>>>>>
>>>>>>>> With this in mind, it doesn't make much sense to report
>>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>>>
>>>>>>>> For instance, if the flow:
>>>>>>>>
>>>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>>>
>>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>>>> confusing for users since the packet did reach its destination.
>>>>>>>>
>>>>>>>> This patch makes internal action execution silently consume the skb
>>>>>>>> instead of notifying a drop for this case.
>>>>>>>>
>>>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>>>
>>>>>>>> actions:sample(..,emit_sample(..))
>>>>>>>>
>>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we 
>>>>>>>> aren't.
>>>>>>>>
>>>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>>>> But, given explicit drop actions are now supported, OVS can just add 
>>>>>>>> one
>>>>>>>> after the last sample() and rewrite the flow as:
>>>>>>>>
>>>>>>>> actions:sample(..,emit_sample(..)),drop
>>>>>>>>
>>>>>>>> Signed-off-by: Adrian Moreno 
>>>>>>>> ---
>>>>>>>>  net/openvswitch/actions.c | 13 +++--
>>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>>>> --- a/net/openvswitch/actions.c
>>>>>>>> +++ b/net/openvswitch/actions.c
>>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>>>
>>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>>>> +{
>>>>>>>> +  /* Do not emit packet drops inside sample(). */
>>>>>>>> +  if (OVS_CB(skb)->probability)
>>>>>>>> +  consume_skb(skb);
>>>>>>>> +  else
>>>>>>>> +  ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 
>>>>>>>> 'flow_keys'
>>>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>>>   */
>>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct 
>>>>>>>> sk_buff *skb,
>>>>>>>>if ((arg->probability != U32_MAX) &&
>>>>>>>>(!arg->probability || get_random_u32() > arg->probability)) 
>>>>>>>> {
>&

Re: [ovs-dev] [PATCH net-next 3/3] openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack

2024-06-19 Thread Ilya Maximets
On 6/19/24 16:07, Xin Long wrote:
> On Wed, Jun 19, 2024 at 8:58 AM Ilya Maximets  wrote:
>>
>> On 6/18/24 17:50, Ilya Maximets wrote:
>>> On 6/18/24 16:58, Xin Long wrote:
>>>> On Tue, Jun 18, 2024 at 7:34 AM Ilya Maximets  wrote:
>>>>>
>>>>> On 6/17/24 22:10, Ilya Maximets wrote:
>>>>>> On 7/16/23 23:09, Xin Long wrote:
>>>>>>> By not setting IPS_CONFIRMED in tmpl that allows the exp not to be 
>>>>>>> removed
>>>>>>> from the hashtable when lookup, we can simplify the exp processing code 
>>>>>>> a
>>>>>>> lot in openvswitch conntrack.
>>>>>>>
>>>>>>> Signed-off-by: Xin Long 
>>>>>>> ---
>>>>>>>  net/openvswitch/conntrack.c | 78 +
>>>>>>>  1 file changed, 10 insertions(+), 68 deletions(-)
>>>>>>>
>>>>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>>>>>>> index 331730fd3580..fa955e892210 100644
>>>>>>> --- a/net/openvswitch/conntrack.c
>>>>>>> +++ b/net/openvswitch/conntrack.c
>>>>>>> @@ -455,45 +455,6 @@ static int ovs_ct_handle_fragments(struct net 
>>>>>>> *net, struct sw_flow_key *key,
>>>>>>>  return 0;
>>>>>>>  }
>>>>>>>
>>>>>>> -static struct nf_conntrack_expect *
>>>>>>> -ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone 
>>>>>>> *zone,
>>>>>>> -   u16 proto, const struct sk_buff *skb)
>>>>>>> -{
>>>>>>> -struct nf_conntrack_tuple tuple;
>>>>>>> -struct nf_conntrack_expect *exp;
>>>>>>> -
>>>>>>> -if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, net, 
>>>>>>> ))
>>>>>>> -return NULL;
>>>>>>> -
>>>>>>> -exp = __nf_ct_expect_find(net, zone, );
>>>>>>> -if (exp) {
>>>>>>> -struct nf_conntrack_tuple_hash *h;
>>>>>>> -
>>>>>>> -/* Delete existing conntrack entry, if it clashes with the
>>>>>>> - * expectation.  This can happen since conntrack ALGs do 
>>>>>>> not
>>>>>>> - * check for clashes between (new) expectations and 
>>>>>>> existing
>>>>>>> - * conntrack entries.  nf_conntrack_in() will check the
>>>>>>> - * expectations only if a conntrack entry can not be found,
>>>>>>> - * which can lead to OVS finding the expectation (here) in 
>>>>>>> the
>>>>>>> - * init direction, but which will not be removed by the
>>>>>>> - * nf_conntrack_in() call, if a matching conntrack entry is
>>>>>>> - * found instead.  In this case all init direction packets
>>>>>>> - * would be reported as new related packets, while reply
>>>>>>> - * direction packets would be reported as un-related
>>>>>>> - * established packets.
>>>>>>> - */
>>>>>>> -h = nf_conntrack_find_get(net, zone, );
>>>>>>> -if (h) {
>>>>>>> -struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>>>>>>> -
>>>>>>> -nf_ct_delete(ct, 0, 0);
>>>>>>> -nf_ct_put(ct);
>>>>>>> -}
>>>>>>> -}
>>>>>>> -
>>>>>>> -return exp;
>>>>>>> -}
>>>>>>> -
>>>>>>>  /* This replicates logic from nf_conntrack_core.c that is not 
>>>>>>> exported. */
>>>>>>>  static enum ip_conntrack_info
>>>>>>>  ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
>>>>>>> @@ -852,36 +813,16 @@ static int ovs_ct_lookup(struct net *net, struct 
>>>>>>> sw_flow_key *key,
>>>>>>>   const struct ovs_conntrack_info *info,

Re: [ovs-dev] [PATCH net-next 3/3] openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack

2024-06-19 Thread Ilya Maximets
On 6/18/24 17:50, Ilya Maximets wrote:
> On 6/18/24 16:58, Xin Long wrote:
>> On Tue, Jun 18, 2024 at 7:34 AM Ilya Maximets  wrote:
>>>
>>> On 6/17/24 22:10, Ilya Maximets wrote:
>>>> On 7/16/23 23:09, Xin Long wrote:
>>>>> By not setting IPS_CONFIRMED in tmpl that allows the exp not to be removed
>>>>> from the hashtable when lookup, we can simplify the exp processing code a
>>>>> lot in openvswitch conntrack.
>>>>>
>>>>> Signed-off-by: Xin Long 
>>>>> ---
>>>>>  net/openvswitch/conntrack.c | 78 +
>>>>>  1 file changed, 10 insertions(+), 68 deletions(-)
>>>>>
>>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>>>>> index 331730fd3580..fa955e892210 100644
>>>>> --- a/net/openvswitch/conntrack.c
>>>>> +++ b/net/openvswitch/conntrack.c
>>>>> @@ -455,45 +455,6 @@ static int ovs_ct_handle_fragments(struct net *net, 
>>>>> struct sw_flow_key *key,
>>>>>  return 0;
>>>>>  }
>>>>>
>>>>> -static struct nf_conntrack_expect *
>>>>> -ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone *zone,
>>>>> -   u16 proto, const struct sk_buff *skb)
>>>>> -{
>>>>> -struct nf_conntrack_tuple tuple;
>>>>> -struct nf_conntrack_expect *exp;
>>>>> -
>>>>> -if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, net, 
>>>>> ))
>>>>> -return NULL;
>>>>> -
>>>>> -exp = __nf_ct_expect_find(net, zone, );
>>>>> -if (exp) {
>>>>> -struct nf_conntrack_tuple_hash *h;
>>>>> -
>>>>> -/* Delete existing conntrack entry, if it clashes with the
>>>>> - * expectation.  This can happen since conntrack ALGs do not
>>>>> - * check for clashes between (new) expectations and existing
>>>>> - * conntrack entries.  nf_conntrack_in() will check the
>>>>> - * expectations only if a conntrack entry can not be found,
>>>>> - * which can lead to OVS finding the expectation (here) in 
>>>>> the
>>>>> - * init direction, but which will not be removed by the
>>>>> - * nf_conntrack_in() call, if a matching conntrack entry is
>>>>> - * found instead.  In this case all init direction packets
>>>>> - * would be reported as new related packets, while reply
>>>>> - * direction packets would be reported as un-related
>>>>> - * established packets.
>>>>> - */
>>>>> -h = nf_conntrack_find_get(net, zone, );
>>>>> -if (h) {
>>>>> -struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>>>>> -
>>>>> -nf_ct_delete(ct, 0, 0);
>>>>> -nf_ct_put(ct);
>>>>> -}
>>>>> -}
>>>>> -
>>>>> -return exp;
>>>>> -}
>>>>> -
>>>>>  /* This replicates logic from nf_conntrack_core.c that is not exported. 
>>>>> */
>>>>>  static enum ip_conntrack_info
>>>>>  ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
>>>>> @@ -852,36 +813,16 @@ static int ovs_ct_lookup(struct net *net, struct 
>>>>> sw_flow_key *key,
>>>>>   const struct ovs_conntrack_info *info,
>>>>>   struct sk_buff *skb)
>>>>>  {
>>>>> -struct nf_conntrack_expect *exp;
>>>>> -
>>>>> -/* If we pass an expected packet through nf_conntrack_in() the
>>>>> - * expectation is typically removed, but the packet could still be
>>>>> - * lost in upcall processing.  To prevent this from happening we
>>>>> - * perform an explicit expectation lookup.  Expected connections are
>>>>> - * always new, and will be passed through conntrack only when they 
>>>>> are
>>>>> - * committed, as it is OK to remove the expectation at that time.
>>>>> - */
>>>>> -exp = ovs_ct_expect_find(

Re: [ovs-dev] [PATCH v2 2/2] ipf: Handle common case of ipf defragmentation.

2024-06-18 Thread Ilya Maximets
On 6/11/24 16:02, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>> On 6/5/24 16:54, Aaron Conole wrote:
>>> Mike Pattrick  writes:
>>>
>>>> When conntrack is reassembling packet fragments, the same reassembly
>>>> context can be shared across multiple threads handling different packets
>>>> simultaneously. Once a full packet is assembled, it is added to a packet
>>>> batch for processing, in the case where there are multiple different pmd
>>>> threads accessing conntrack simultaneously, there is a race condition
>>>> where the reassembled packet may be added to an arbitrary batch even if
>>>> the current batch is available.
>>>>
>>>> When this happens, the packet may be handled incorrectly as it is
>>>> inserted into a random openflow execution pipeline, instead of the
>>>> pipeline for that packets flow.
>>>>
>>>> This change makes a best effort attempt to try to add the defragmented
>>>> packet to the current batch. directly. This should succeed most of the
>>>> time.
>>>>
>>>> Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
>>>> Reported-at: https://issues.redhat.com/browse/FDP-560
>>>> Signed-off-by: Mike Pattrick 
>>>> ---
>>>
>>> The patch overall looks good to me.  I'm considering applying with the
>>> following addition:
>>>
>>>   diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
>>>   index 6b293770dd..d9b9e0c23f 100755
>>>   --- a/utilities/checkpatch.py
>>>   +++ b/utilities/checkpatch.py
>>>   @@ -63,7 +63,8 @@ def open_spell_check_dict():
>>>  'dhcpv6', 'opts', 'metadata', 'geneve', 
>>> 'mutex',
>>>  'netdev', 'netdevs', 'subtable', 'virtio', 
>>> 'qos',
>>>  'policer', 'datapath', 'tunctl', 'attr', 
>>> 'ethernet',
>>>   -  'ether', 'defrag', 'defragment', 'loopback', 
>>> 'sflow',
>>>   +  'ether', 'defrag', 'defragment', 
>>> 'defragmented',
>>>   +  'loopback', 'sflow',
>>>  'acl', 'initializer', 'recirc', 'xlated', 
>>> 'unclosed',
>>>  'netlink', 'msec', 'usec', 'nsec', 'ms', 'us', 
>>> 'ns',
>>>  'kilobits', 'kbps', 'kilobytes', 'megabytes', 
>>> 'mbps',
>>>
>>>
>>> unless anyone objects.  This is to squelch:
>>>
>>> == Checking 16f6885353c2 ("ipf: Handle common case of ipf 
>>> defragmentation.") ==
>>> WARNING: Possible misspelled word: "defragmented"
>>> Did you mean:  ['defragment ed', 'defragment-ed', 'defragment']
>>> Lines checked: 129, Warnings: 1, Errors: 0
>>
>> It doesn't affect CI today, so can be a separate patch, I think.  We
>> have a few more
>> words like this in relatively recent commits, like 'poller' or
>> 'autovalidator', these
>> can be bundled in that separate commit as well.
>>
>> Though updating the dictionary along with the patch that is using the
>> word sounds OK
>> to me as well.
> 
> That makes sense to me.
> 
> I've been thinking of adding a spell-check test to the robot.  Rather
> than the existing apply check doing the spell checking.  The spell
> checker would only ever generate a warning.  WDYT?

Sounds fine to me, but we really need to make the checking more robust.
Currently it feeds into the checker too many things that it shouldn't.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 3/3] openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack

2024-06-18 Thread Ilya Maximets
On 6/18/24 16:58, Xin Long wrote:
> On Tue, Jun 18, 2024 at 7:34 AM Ilya Maximets  wrote:
>>
>> On 6/17/24 22:10, Ilya Maximets wrote:
>>> On 7/16/23 23:09, Xin Long wrote:
>>>> By not setting IPS_CONFIRMED in tmpl that allows the exp not to be removed
>>>> from the hashtable when lookup, we can simplify the exp processing code a
>>>> lot in openvswitch conntrack.
>>>>
>>>> Signed-off-by: Xin Long 
>>>> ---
>>>>  net/openvswitch/conntrack.c | 78 +
>>>>  1 file changed, 10 insertions(+), 68 deletions(-)
>>>>
>>>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>>>> index 331730fd3580..fa955e892210 100644
>>>> --- a/net/openvswitch/conntrack.c
>>>> +++ b/net/openvswitch/conntrack.c
>>>> @@ -455,45 +455,6 @@ static int ovs_ct_handle_fragments(struct net *net, 
>>>> struct sw_flow_key *key,
>>>>  return 0;
>>>>  }
>>>>
>>>> -static struct nf_conntrack_expect *
>>>> -ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone *zone,
>>>> -   u16 proto, const struct sk_buff *skb)
>>>> -{
>>>> -struct nf_conntrack_tuple tuple;
>>>> -struct nf_conntrack_expect *exp;
>>>> -
>>>> -if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, net, 
>>>> ))
>>>> -return NULL;
>>>> -
>>>> -exp = __nf_ct_expect_find(net, zone, );
>>>> -if (exp) {
>>>> -struct nf_conntrack_tuple_hash *h;
>>>> -
>>>> -/* Delete existing conntrack entry, if it clashes with the
>>>> - * expectation.  This can happen since conntrack ALGs do not
>>>> - * check for clashes between (new) expectations and existing
>>>> - * conntrack entries.  nf_conntrack_in() will check the
>>>> - * expectations only if a conntrack entry can not be found,
>>>> - * which can lead to OVS finding the expectation (here) in the
>>>> - * init direction, but which will not be removed by the
>>>> - * nf_conntrack_in() call, if a matching conntrack entry is
>>>> - * found instead.  In this case all init direction packets
>>>> - * would be reported as new related packets, while reply
>>>> - * direction packets would be reported as un-related
>>>> - * established packets.
>>>> - */
>>>> -h = nf_conntrack_find_get(net, zone, );
>>>> -if (h) {
>>>> -struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>>>> -
>>>> -nf_ct_delete(ct, 0, 0);
>>>> -nf_ct_put(ct);
>>>> -}
>>>> -}
>>>> -
>>>> -return exp;
>>>> -}
>>>> -
>>>>  /* This replicates logic from nf_conntrack_core.c that is not exported. */
>>>>  static enum ip_conntrack_info
>>>>  ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
>>>> @@ -852,36 +813,16 @@ static int ovs_ct_lookup(struct net *net, struct 
>>>> sw_flow_key *key,
>>>>   const struct ovs_conntrack_info *info,
>>>>   struct sk_buff *skb)
>>>>  {
>>>> -struct nf_conntrack_expect *exp;
>>>> -
>>>> -/* If we pass an expected packet through nf_conntrack_in() the
>>>> - * expectation is typically removed, but the packet could still be
>>>> - * lost in upcall processing.  To prevent this from happening we
>>>> - * perform an explicit expectation lookup.  Expected connections are
>>>> - * always new, and will be passed through conntrack only when they are
>>>> - * committed, as it is OK to remove the expectation at that time.
>>>> - */
>>>> -exp = ovs_ct_expect_find(net, >zone, info->family, skb);
>>>> -if (exp) {
>>>> -u8 state;
>>>> -
>>>> -/* NOTE: New connections are NATted and Helped only when
>>>> - * committed, so we are not calling into NAT here.
>>>> - */
>>>> -state = OVS_CS_F_TRACKED | OVS_CS_F_NEW | OVS_CS_F_RELATE

Re: [ovs-dev] [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample

2024-06-18 Thread Ilya Maximets
On 6/18/24 12:50, Adrián Moreno wrote:
> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>> On 6/18/24 09:00, Adrián Moreno wrote:
>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>> observability-oriented.
>>>>>>
>>>>>> Apart from some corner case in which it's used a replacement of clone()
>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>> local emit_sample.
>>>>>>
>>>>>> With this in mind, it doesn't make much sense to report
>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>
>>>>>> For instance, if the flow:
>>>>>>
>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>
>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>> confusing for users since the packet did reach its destination.
>>>>>>
>>>>>> This patch makes internal action execution silently consume the skb
>>>>>> instead of notifying a drop for this case.
>>>>>>
>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>
>>>>>> actions:sample(..,emit_sample(..))
>>>>>>
>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we 
>>>>>> aren't.
>>>>>>
>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>>>> after the last sample() and rewrite the flow as:
>>>>>>
>>>>>> actions:sample(..,emit_sample(..)),drop
>>>>>>
>>>>>> Signed-off-by: Adrian Moreno 
>>>>>> ---
>>>>>>  net/openvswitch/actions.c | 13 +++--
>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>> --- a/net/openvswitch/actions.c
>>>>>> +++ b/net/openvswitch/actions.c
>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>
>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>> +{
>>>>>> +/* Do not emit packet drops inside sample(). */
>>>>>> +if (OVS_CB(skb)->probability)
>>>>>> +consume_skb(skb);
>>>>>> +else
>>>>>> +ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +}
>>>>>> +
>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>   */
>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct 
>>>>>> sk_buff *skb,
>>>>>>  if ((arg->probability != U32_MAX) &&
>>>>>>  (!arg->probability || get_random_u32() > arg->probability)) 
>>>>>> {
>>>>>>  if (last)
>>>>>> -ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +ovs_drop_skb_last_action(skb);
>>>>
>>>> Always consuming the skb at this point makes sense, since having smaple()
>>>> as a last action is a reasonable thing to have.  But this looks more like
>>>> a fix for the original drop reason patch set.
>>>>
>>>
>>> I don't think consuming the skb at this point makes sense. It was very
>>> intentionally changed to a drop since a very common use-case for
>>> sampling is drop-sampling, i.e:

Re: [ovs-dev] [PATCH net-next 3/3] openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack

2024-06-18 Thread Ilya Maximets
On 6/17/24 22:10, Ilya Maximets wrote:
> On 7/16/23 23:09, Xin Long wrote:
>> By not setting IPS_CONFIRMED in tmpl that allows the exp not to be removed
>> from the hashtable when lookup, we can simplify the exp processing code a
>> lot in openvswitch conntrack.
>>
>> Signed-off-by: Xin Long 
>> ---
>>  net/openvswitch/conntrack.c | 78 +
>>  1 file changed, 10 insertions(+), 68 deletions(-)
>>
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index 331730fd3580..fa955e892210 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
>> @@ -455,45 +455,6 @@ static int ovs_ct_handle_fragments(struct net *net, 
>> struct sw_flow_key *key,
>>  return 0;
>>  }
>>  
>> -static struct nf_conntrack_expect *
>> -ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone *zone,
>> -   u16 proto, const struct sk_buff *skb)
>> -{
>> -struct nf_conntrack_tuple tuple;
>> -struct nf_conntrack_expect *exp;
>> -
>> -if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, net, 
>> ))
>> -return NULL;
>> -
>> -exp = __nf_ct_expect_find(net, zone, );
>> -if (exp) {
>> -struct nf_conntrack_tuple_hash *h;
>> -
>> -/* Delete existing conntrack entry, if it clashes with the
>> - * expectation.  This can happen since conntrack ALGs do not
>> - * check for clashes between (new) expectations and existing
>> - * conntrack entries.  nf_conntrack_in() will check the
>> - * expectations only if a conntrack entry can not be found,
>> - * which can lead to OVS finding the expectation (here) in the
>> - * init direction, but which will not be removed by the
>> - * nf_conntrack_in() call, if a matching conntrack entry is
>> - * found instead.  In this case all init direction packets
>> - * would be reported as new related packets, while reply
>> - * direction packets would be reported as un-related
>> - * established packets.
>> - */
>> -h = nf_conntrack_find_get(net, zone, );
>> -if (h) {
>> -struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>> -
>> -nf_ct_delete(ct, 0, 0);
>> -nf_ct_put(ct);
>> -}
>> -}
>> -
>> -return exp;
>> -}
>> -
>>  /* This replicates logic from nf_conntrack_core.c that is not exported. */
>>  static enum ip_conntrack_info
>>  ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
>> @@ -852,36 +813,16 @@ static int ovs_ct_lookup(struct net *net, struct 
>> sw_flow_key *key,
>>   const struct ovs_conntrack_info *info,
>>   struct sk_buff *skb)
>>  {
>> -struct nf_conntrack_expect *exp;
>> -
>> -/* If we pass an expected packet through nf_conntrack_in() the
>> - * expectation is typically removed, but the packet could still be
>> - * lost in upcall processing.  To prevent this from happening we
>> - * perform an explicit expectation lookup.  Expected connections are
>> - * always new, and will be passed through conntrack only when they are
>> - * committed, as it is OK to remove the expectation at that time.
>> - */
>> -exp = ovs_ct_expect_find(net, >zone, info->family, skb);
>> -if (exp) {
>> -u8 state;
>> -
>> -/* NOTE: New connections are NATted and Helped only when
>> - * committed, so we are not calling into NAT here.
>> - */
>> -state = OVS_CS_F_TRACKED | OVS_CS_F_NEW | OVS_CS_F_RELATED;
>> -__ovs_ct_update_key(key, state, >zone, exp->master);
> 
> Hi, Xin, others.
> 
> Unfortunately, it seems like removal of this code broke the expected behavior.
> OVS in userspace expects that SYN packet of a new related FTP connection will
> get +new+rel+trk flags, but after this patch we're only getting +rel+trk and 
> not
> new.  This is a problem because we need to commit this connection with the 
> label
> and we do that for +new packets.  If we can't get +new packet we'll have to 
> commit
> every single +rel+trk packet, which doesn't make a lot of sense.  And it's a
> significant behavior change regardless.

Interestingly enough I see +new+rel+trk packets in cases without SNAT,
but we can only get +rel+trk in cases with SNAT.  So, this may be just
a generic conntrack bug somewhere.  At least the behavior seems fairly
inconsistent.

> 
> Could you, please, take a look?
> 
> The issue can be reproduced by running check-kernel tests in OVS repo.
> 'FTP SNAT orig tuple' tests fail 100% of the time.
> 
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample

2024-06-18 Thread Ilya Maximets
On 6/18/24 09:00, Adrián Moreno wrote:
> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>> On 6/17/24 13:55, Ilya Maximets wrote:
>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>> observability-oriented.
>>>>
>>>> Apart from some corner case in which it's used a replacement of clone()
>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>> local emit_sample.
>>>>
>>>> With this in mind, it doesn't make much sense to report
>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>
>>>> For instance, if the flow:
>>>>
>>>>   actions:sample(..,emit_sample(..)),2
>>>>
>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>> confusing for users since the packet did reach its destination.
>>>>
>>>> This patch makes internal action execution silently consume the skb
>>>> instead of notifying a drop for this case.
>>>>
>>>> Unfortunately, this patch does not remove all potential sources of
>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>
>>>> actions:sample(..,emit_sample(..))
>>>>
>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>
>>>> Sadly, this case is difficult to solve without breaking the
>>>> optimization by which the skb is not cloned on last sample actions.
>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>> after the last sample() and rewrite the flow as:
>>>>
>>>> actions:sample(..,emit_sample(..)),drop
>>>>
>>>> Signed-off-by: Adrian Moreno 
>>>> ---
>>>>  net/openvswitch/actions.c | 13 +++--
>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>> --- a/net/openvswitch/actions.c
>>>> +++ b/net/openvswitch/actions.c
>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>
>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>> +{
>>>> +  /* Do not emit packet drops inside sample(). */
>>>> +  if (OVS_CB(skb)->probability)
>>>> +  consume_skb(skb);
>>>> +  else
>>>> +  ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +}
>>>> +
>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>   * space. Return NULL if out of key spaces.
>>>>   */
>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct 
>>>> sk_buff *skb,
>>>>if ((arg->probability != U32_MAX) &&
>>>>(!arg->probability || get_random_u32() > arg->probability)) {
>>>>if (last)
>>>> -  ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +  ovs_drop_skb_last_action(skb);
>>
>> Always consuming the skb at this point makes sense, since having smaple()
>> as a last action is a reasonable thing to have.  But this looks more like
>> a fix for the original drop reason patch set.
>>
> 
> I don't think consuming the skb at this point makes sense. It was very
> intentionally changed to a drop since a very common use-case for
> sampling is drop-sampling, i.e: replacing an empty action list (that
> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> that replacement should not have any effect on the number of
> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> the same way (only observed in one case).
> 
> 
>>>>return 0;
>>>>}
>>>>
>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, 
>>>> struct sk_buff *skb,
>>>>}
>>>>}
>>>>
>>>> -  ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +  ovs_drop_skb_last_action(skb);
>>>
>>> I don't think I agree with this one.  If we have a sample() action with
>>> a lot of different

Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action

2024-06-18 Thread Ilya Maximets
On 6/18/24 11:47, Ilya Maximets wrote:
> On 6/18/24 09:33, Adrián Moreno wrote:
>> On Mon, Jun 17, 2024 at 12:44:45PM GMT, Ilya Maximets wrote:
>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>> Add support for a new action: emit_sample.
>>>>
>>>> This action accepts a u32 group id and a variable-length cookie and uses
>>>> the psample multicast group to make the packet available for
>>>> observability.
>>>>
>>>> The maximum length of the user-defined cookie is set to 16, same as
>>>> tc_cookie, to discourage using cookies that will not be offloadable.
>>>>
>>>> Signed-off-by: Adrian Moreno 
>>>> ---
>>>>  Documentation/netlink/specs/ovs_flow.yaml | 17 
>>>>  include/uapi/linux/openvswitch.h  | 25 
>>>>  net/openvswitch/actions.c | 50 +++
>>>>  net/openvswitch/flow_netlink.c| 33 ++-
>>>>  4 files changed, 124 insertions(+), 1 deletion(-)
>>>
>>> Some nits below, beside ones already mentioned.
>>>
>>
>> Thanks, Ilya.
>>
>>>>
>>>> diff --git a/Documentation/netlink/specs/ovs_flow.yaml 
>>>> b/Documentation/netlink/specs/ovs_flow.yaml
>>>> index 4fdfc6b5cae9..a7ab5593a24f 100644
>>>> --- a/Documentation/netlink/specs/ovs_flow.yaml
>>>> +++ b/Documentation/netlink/specs/ovs_flow.yaml
>>>> @@ -727,6 +727,12 @@ attribute-sets:
>>>>  name: dec-ttl
>>>>  type: nest
>>>>  nested-attributes: dec-ttl-attrs
>>>> +  -
>>>> +name: emit-sample
>>>> +type: nest
>>>> +nested-attributes: emit-sample-attrs
>>>> +doc: |
>>>> +  Sends a packet sample to psample for external observation.
>>>>-
>>>>  name: tunnel-key-attrs
>>>>  enum-name: ovs-tunnel-key-attr
>>>> @@ -938,6 +944,17 @@ attribute-sets:
>>>>-
>>>>  name: gbp
>>>>  type: u32
>>>> +  -
>>>> +name: emit-sample-attrs
>>>> +enum-name: ovs-emit-sample-attr
>>>> +name-prefix: ovs-emit-sample-attr-
>>>> +attributes:
>>>> +  -
>>>> +name: group
>>>> +type: u32
>>>> +  -
>>>> +name: cookie
>>>> +type: binary
>>>>
>>>>  operations:
>>>>name-prefix: ovs-flow-cmd-
>>>> diff --git a/include/uapi/linux/openvswitch.h 
>>>> b/include/uapi/linux/openvswitch.h
>>>> index efc82c318fa2..a0e9dde0584a 100644
>>>> --- a/include/uapi/linux/openvswitch.h
>>>> +++ b/include/uapi/linux/openvswitch.h
>>>> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>>>>  };
>>>>  #endif
>>>>
>>>> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
>>>> +/**
>>>> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
>>>> + * action.
>>>> + *
>>>> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of 
>>>> the
>>>> + * sample.
>>>> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that 
>>>> contains
>>>> + * user-defined metadata. The maximum length is 16 bytes.
>>>
>>> s/16/OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE/
>>>
>>>> + *
>>>> + * Sends the packet to the psample multicast group with the specified 
>>>> group and
>>>> + * cookie. It is possible to combine this action with the
>>>> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being 
>>>> emitted.
>>>> + */
>>>> +enum ovs_emit_sample_attr {
>>>> +  OVS_EMIT_SAMPLE_ATTR_UNPSEC,
>>>> +  OVS_EMIT_SAMPLE_ATTR_GROUP, /* u32 number. */
>>>> +  OVS_EMIT_SAMPLE_ATTR_COOKIE,/* Optional, user specified cookie. */
>>>> +  __OVS_EMIT_SAMPLE_ATTR_MAX
>>>> +};
>>>> +
>>>> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
>>>> +
>>>> +
>>>>  /**
>>>>   * enum ovs_action_attr - Action types.
>>>>   *
>>>> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>>>>OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>>>&g

Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action

2024-06-18 Thread Ilya Maximets
On 6/18/24 09:33, Adrián Moreno wrote:
> On Mon, Jun 17, 2024 at 12:44:45PM GMT, Ilya Maximets wrote:
>> On 6/3/24 20:56, Adrian Moreno wrote:
>>> Add support for a new action: emit_sample.
>>>
>>> This action accepts a u32 group id and a variable-length cookie and uses
>>> the psample multicast group to make the packet available for
>>> observability.
>>>
>>> The maximum length of the user-defined cookie is set to 16, same as
>>> tc_cookie, to discourage using cookies that will not be offloadable.
>>>
>>> Signed-off-by: Adrian Moreno 
>>> ---
>>>  Documentation/netlink/specs/ovs_flow.yaml | 17 
>>>  include/uapi/linux/openvswitch.h  | 25 
>>>  net/openvswitch/actions.c | 50 +++
>>>  net/openvswitch/flow_netlink.c| 33 ++-
>>>  4 files changed, 124 insertions(+), 1 deletion(-)
>>
>> Some nits below, beside ones already mentioned.
>>
> 
> Thanks, Ilya.
> 
>>>
>>> diff --git a/Documentation/netlink/specs/ovs_flow.yaml 
>>> b/Documentation/netlink/specs/ovs_flow.yaml
>>> index 4fdfc6b5cae9..a7ab5593a24f 100644
>>> --- a/Documentation/netlink/specs/ovs_flow.yaml
>>> +++ b/Documentation/netlink/specs/ovs_flow.yaml
>>> @@ -727,6 +727,12 @@ attribute-sets:
>>>  name: dec-ttl
>>>  type: nest
>>>  nested-attributes: dec-ttl-attrs
>>> +  -
>>> +name: emit-sample
>>> +type: nest
>>> +nested-attributes: emit-sample-attrs
>>> +doc: |
>>> +  Sends a packet sample to psample for external observation.
>>>-
>>>  name: tunnel-key-attrs
>>>  enum-name: ovs-tunnel-key-attr
>>> @@ -938,6 +944,17 @@ attribute-sets:
>>>-
>>>  name: gbp
>>>  type: u32
>>> +  -
>>> +name: emit-sample-attrs
>>> +enum-name: ovs-emit-sample-attr
>>> +name-prefix: ovs-emit-sample-attr-
>>> +attributes:
>>> +  -
>>> +name: group
>>> +type: u32
>>> +  -
>>> +name: cookie
>>> +type: binary
>>>
>>>  operations:
>>>name-prefix: ovs-flow-cmd-
>>> diff --git a/include/uapi/linux/openvswitch.h 
>>> b/include/uapi/linux/openvswitch.h
>>> index efc82c318fa2..a0e9dde0584a 100644
>>> --- a/include/uapi/linux/openvswitch.h
>>> +++ b/include/uapi/linux/openvswitch.h
>>> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>>>  };
>>>  #endif
>>>
>>> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
>>> +/**
>>> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
>>> + * action.
>>> + *
>>> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
>>> + * sample.
>>> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that 
>>> contains
>>> + * user-defined metadata. The maximum length is 16 bytes.
>>
>> s/16/OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE/
>>
>>> + *
>>> + * Sends the packet to the psample multicast group with the specified 
>>> group and
>>> + * cookie. It is possible to combine this action with the
>>> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being 
>>> emitted.
>>> + */
>>> +enum ovs_emit_sample_attr {
>>> +   OVS_EMIT_SAMPLE_ATTR_UNPSEC,
>>> +   OVS_EMIT_SAMPLE_ATTR_GROUP, /* u32 number. */
>>> +   OVS_EMIT_SAMPLE_ATTR_COOKIE,/* Optional, user specified cookie. */
>>> +   __OVS_EMIT_SAMPLE_ATTR_MAX
>>> +};
>>> +
>>> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
>>> +
>>> +
>>>  /**
>>>   * enum ovs_action_attr - Action types.
>>>   *
>>> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>>> OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>>> OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
>>> OVS_ACTION_ATTR_DROP, /* u32 error code. */
>>> +   OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>>>
>>> __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
>>>* from userspace. */
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>&

Re: [ovs-dev] [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample

2024-06-18 Thread Ilya Maximets
On 6/18/24 09:38, Adrián Moreno wrote:
> On Mon, Jun 17, 2024 at 12:00:04PM GMT, Ilya Maximets wrote:
>> On 6/3/24 20:56, Adrian Moreno wrote:
>>> If the action has a user_cookie, pass it along to the sample so it can
>>> be easily identified.
>>>
>>> Signed-off-by: Adrian Moreno 
>>> ---
>>>  net/sched/act_sample.c | 12 
>>>  1 file changed, 12 insertions(+)
>>>
>>> diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
>>> index a69b53d54039..5c3f86ec964a 100644
>>> --- a/net/sched/act_sample.c
>>> +++ b/net/sched/act_sample.c
>>> @@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff 
>>> *skb,
>>>  const struct tc_action *a,
>>>  struct tcf_result *res)
>>>  {
>>> +   u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};
>>
>> Is it necessary to initialize these 16 bytes on every call?
>> Might be expensive.  We're passing the data length around,
>> so the uninitialized parts should not be accessed.
>>
> 
> They "should" not, indeed. I was just trying to be extra careful.
> Are you worried TC_COOKIE_MAX_SIZE could grow or the cycles needed to
> clear the current 16 bytes?

I'm assuming that any extra cycles spent per packet are undesirable,
so should be avoided, if possible.  Even if we save 1-2 cycles per
packet, it's a lot when we talk about millions of packets per second.

In this particular case, it seems, we do not sacrifice anything, so
it's just a couple of cycles back for free.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 3/3] openvswitch: set IPS_CONFIRMED in tmpl status only when commit is set in conntrack

2024-06-17 Thread Ilya Maximets
On 7/16/23 23:09, Xin Long wrote:
> By not setting IPS_CONFIRMED in tmpl that allows the exp not to be removed
> from the hashtable when lookup, we can simplify the exp processing code a
> lot in openvswitch conntrack.
> 
> Signed-off-by: Xin Long 
> ---
>  net/openvswitch/conntrack.c | 78 +
>  1 file changed, 10 insertions(+), 68 deletions(-)
> 
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 331730fd3580..fa955e892210 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -455,45 +455,6 @@ static int ovs_ct_handle_fragments(struct net *net, 
> struct sw_flow_key *key,
>   return 0;
>  }
>  
> -static struct nf_conntrack_expect *
> -ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone *zone,
> -u16 proto, const struct sk_buff *skb)
> -{
> - struct nf_conntrack_tuple tuple;
> - struct nf_conntrack_expect *exp;
> -
> - if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, net, 
> ))
> - return NULL;
> -
> - exp = __nf_ct_expect_find(net, zone, );
> - if (exp) {
> - struct nf_conntrack_tuple_hash *h;
> -
> - /* Delete existing conntrack entry, if it clashes with the
> -  * expectation.  This can happen since conntrack ALGs do not
> -  * check for clashes between (new) expectations and existing
> -  * conntrack entries.  nf_conntrack_in() will check the
> -  * expectations only if a conntrack entry can not be found,
> -  * which can lead to OVS finding the expectation (here) in the
> -  * init direction, but which will not be removed by the
> -  * nf_conntrack_in() call, if a matching conntrack entry is
> -  * found instead.  In this case all init direction packets
> -  * would be reported as new related packets, while reply
> -  * direction packets would be reported as un-related
> -  * established packets.
> -  */
> - h = nf_conntrack_find_get(net, zone, );
> - if (h) {
> - struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
> -
> - nf_ct_delete(ct, 0, 0);
> - nf_ct_put(ct);
> - }
> - }
> -
> - return exp;
> -}
> -
>  /* This replicates logic from nf_conntrack_core.c that is not exported. */
>  static enum ip_conntrack_info
>  ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
> @@ -852,36 +813,16 @@ static int ovs_ct_lookup(struct net *net, struct 
> sw_flow_key *key,
>const struct ovs_conntrack_info *info,
>struct sk_buff *skb)
>  {
> - struct nf_conntrack_expect *exp;
> -
> - /* If we pass an expected packet through nf_conntrack_in() the
> -  * expectation is typically removed, but the packet could still be
> -  * lost in upcall processing.  To prevent this from happening we
> -  * perform an explicit expectation lookup.  Expected connections are
> -  * always new, and will be passed through conntrack only when they are
> -  * committed, as it is OK to remove the expectation at that time.
> -  */
> - exp = ovs_ct_expect_find(net, >zone, info->family, skb);
> - if (exp) {
> - u8 state;
> -
> - /* NOTE: New connections are NATted and Helped only when
> -  * committed, so we are not calling into NAT here.
> -  */
> - state = OVS_CS_F_TRACKED | OVS_CS_F_NEW | OVS_CS_F_RELATED;
> - __ovs_ct_update_key(key, state, >zone, exp->master);

Hi, Xin, others.

Unfortunately, it seems like removal of this code broke the expected behavior.
OVS in userspace expects that SYN packet of a new related FTP connection will
get +new+rel+trk flags, but after this patch we're only getting +rel+trk and not
new.  This is a problem because we need to commit this connection with the label
and we do that for +new packets.  If we can't get +new packet we'll have to 
commit
every single +rel+trk packet, which doesn't make a lot of sense.  And it's a
significant behavior change regardless.

Could you, please, take a look?

The issue can be reproduced by running check-kernel tests in OVS repo.
'FTP SNAT orig tuple' tests fail 100% of the time.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample

2024-06-17 Thread Ilya Maximets
On 6/17/24 13:55, Ilya Maximets wrote:
> On 6/3/24 20:56, Adrian Moreno wrote:
>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>> observability-oriented.
>>
>> Apart from some corner case in which it's used a replacement of clone()
>> for old kernels, it's really only used for sFlow, IPFIX and now,
>> local emit_sample.
>>
>> With this in mind, it doesn't make much sense to report
>> OVS_DROP_LAST_ACTION inside sample actions.
>>
>> For instance, if the flow:
>>
>>   actions:sample(..,emit_sample(..)),2
>>
>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>> confusing for users since the packet did reach its destination.
>>
>> This patch makes internal action execution silently consume the skb
>> instead of notifying a drop for this case.
>>
>> Unfortunately, this patch does not remove all potential sources of
>> confusion since, if the sample action itself is the last action, e.g:
>>
>> actions:sample(..,emit_sample(..))
>>
>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>
>> Sadly, this case is difficult to solve without breaking the
>> optimization by which the skb is not cloned on last sample actions.
>> But, given explicit drop actions are now supported, OVS can just add one
>> after the last sample() and rewrite the flow as:
>>
>> actions:sample(..,emit_sample(..)),drop
>>
>> Signed-off-by: Adrian Moreno 
>> ---
>>  net/openvswitch/actions.c | 13 +++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>> index 33f6d93ba5e4..54fc1abcff95 100644
>> --- a/net/openvswitch/actions.c
>> +++ b/net/openvswitch/actions.c
>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>  static struct action_flow_keys __percpu *flow_keys;
>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>  
>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>> +{
>> +/* Do not emit packet drops inside sample(). */
>> +if (OVS_CB(skb)->probability)
>> +consume_skb(skb);
>> +else
>> +ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +}
>> +
>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>   * space. Return NULL if out of key spaces.
>>   */
>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff 
>> *skb,
>>  if ((arg->probability != U32_MAX) &&
>>  (!arg->probability || get_random_u32() > arg->probability)) {
>>  if (last)
>> -ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +ovs_drop_skb_last_action(skb);

Always consuming the skb at this point makes sense, since having smaple()
as a last action is a reasonable thing to have.  But this looks more like
a fix for the original drop reason patch set.

>>  return 0;
>>  }
>>  
>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, 
>> struct sk_buff *skb,
>>  }
>>  }
>>  
>> -ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +ovs_drop_skb_last_action(skb);
> 
> I don't think I agree with this one.  If we have a sample() action with
> a lot of different actions inside and we reached the end while the last
> action didn't consume the skb, then we should report that.  E.g.
> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> 
> The only actions that are actually consuming the skb are "output",
> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> consuming the skb "naturally" by stealing it when it is the last action.
> "userspace" has an explicit check to consume the skb if it is the last
> action.  "emit_sample" should have the similar check.  It should likely
> be added at the point of action introduction instead of having a separate
> patch.
> 
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample

2024-06-17 Thread Ilya Maximets
On 6/3/24 20:56, Adrian Moreno wrote:
> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> observability-oriented.
> 
> Apart from some corner case in which it's used a replacement of clone()
> for old kernels, it's really only used for sFlow, IPFIX and now,
> local emit_sample.
> 
> With this in mind, it doesn't make much sense to report
> OVS_DROP_LAST_ACTION inside sample actions.
> 
> For instance, if the flow:
> 
>   actions:sample(..,emit_sample(..)),2
> 
> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> confusing for users since the packet did reach its destination.
> 
> This patch makes internal action execution silently consume the skb
> instead of notifying a drop for this case.
> 
> Unfortunately, this patch does not remove all potential sources of
> confusion since, if the sample action itself is the last action, e.g:
> 
> actions:sample(..,emit_sample(..))
> 
> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> 
> Sadly, this case is difficult to solve without breaking the
> optimization by which the skb is not cloned on last sample actions.
> But, given explicit drop actions are now supported, OVS can just add one
> after the last sample() and rewrite the flow as:
> 
> actions:sample(..,emit_sample(..)),drop
> 
> Signed-off-by: Adrian Moreno 
> ---
>  net/openvswitch/actions.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 33f6d93ba5e4..54fc1abcff95 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>  static struct action_flow_keys __percpu *flow_keys;
>  static DEFINE_PER_CPU(int, exec_actions_level);
>  
> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> +{
> + /* Do not emit packet drops inside sample(). */
> + if (OVS_CB(skb)->probability)
> + consume_skb(skb);
> + else
> + ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> +}
> +
>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>   * space. Return NULL if out of key spaces.
>   */
> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff 
> *skb,
>   if ((arg->probability != U32_MAX) &&
>   (!arg->probability || get_random_u32() > arg->probability)) {
>   if (last)
> - ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> + ovs_drop_skb_last_action(skb);
>   return 0;
>   }
>  
> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, 
> struct sk_buff *skb,
>   }
>   }
>  
> - ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> + ovs_drop_skb_last_action(skb);

I don't think I agree with this one.  If we have a sample() action with
a lot of different actions inside and we reached the end while the last
action didn't consume the skb, then we should report that.  E.g.
"sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.

The only actions that are actually consuming the skb are "output",
"userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
consuming the skb "naturally" by stealing it when it is the last action.
"userspace" has an explicit check to consume the skb if it is the last
action.  "emit_sample" should have the similar check.  It should likely
be added at the point of action introduction instead of having a separate
patch.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.

2024-06-17 Thread Ilya Maximets
On 6/17/24 09:08, Adrián Moreno wrote:
> On Fri, Jun 14, 2024 at 12:55:59PM GMT, Aaron Conole wrote:
>> Adrian Moreno  writes:
>>
>>> The behavior of actions might not be the exact same if they are being
>>> executed inside a nested sample action. Store the probability of the
>>> parent sample action in the skb's cb area.
>>
>> What does that mean?
>>
> 
> Emit action, for instance, needs the probability so that psample
> consumers know what was the sampling rate applied. Also, the way we
> should inform about packet drops (via kfree_skb_reason) changes (see
> patch 7/9).
> 
>>> Use the probability in emit_sample to pass it down to psample.
>>>
>>> Signed-off-by: Adrian Moreno 
>>> ---
>>>  include/uapi/linux/openvswitch.h |  3 ++-
>>>  net/openvswitch/actions.c| 25 ++---
>>>  net/openvswitch/datapath.h   |  3 +++
>>>  net/openvswitch/vport.c  |  1 +
>>>  4 files changed, 28 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/openvswitch.h 
>>> b/include/uapi/linux/openvswitch.h
>>> index a0e9dde0584a..9d675725fa2b 100644
>>> --- a/include/uapi/linux/openvswitch.h
>>> +++ b/include/uapi/linux/openvswitch.h
>>> @@ -649,7 +649,8 @@ enum ovs_flow_attr {
>>>   * Actions are passed as nested attributes.
>>>   *
>>>   * Executes the specified actions with the given probability on a 
>>> per-packet
>>> - * basis.
>>> + * basis. Nested actions will be able to access the probability value of 
>>> the
>>> + * parent @OVS_ACTION_ATTR_SAMPLE.
>>>   */
>>>  enum ovs_sample_attr {
>>> OVS_SAMPLE_ATTR_UNSPEC,
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>> index 3b4dba0ded59..33f6d93ba5e4 100644
>>> --- a/net/openvswitch/actions.c
>>> +++ b/net/openvswitch/actions.c
>>> @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct 
>>> sk_buff *skb,
>>> struct nlattr *sample_arg;
>>> int rem = nla_len(attr);
>>> const struct sample_arg *arg;
>>> +   u32 init_probability;
>>> bool clone_flow_key;
>>> +   int err;
>>>
>>> /* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
>>> sample_arg = nla_data(attr);
>>> arg = nla_data(sample_arg);
>>> actions = nla_next(sample_arg, );
>>> +   init_probability = OVS_CB(skb)->probability;
>>>
>>> if ((arg->probability != U32_MAX) &&
>>> (!arg->probability || get_random_u32() > arg->probability)) {
>>> @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct 
>>> sk_buff *skb,
>>> return 0;
>>> }
>>>
>>> +   if (init_probability) {
>>> +   OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
>>> +   arg->probability / U32_MAX);
>>> +   } else {
>>> +   OVS_CB(skb)->probability = arg->probability;
>>> +   }
>>> +
>>
>> I'm confused by this.  Eventually, integer arithmetic will practically
>> guarantee that nested sample() calls will go to 0.  So eventually, the
>> test above will be impossible to meet mathematically.
>>
>> OTOH, you could argue that a 1% of 50% is low anyway, but it still would
>> have a positive probability count, and still be possible for
>> get_random_u32() call to match.
>>
> 
> Using OVS's probability semantics, we can express probabilities as low
> as (100/U32_MAX)% which is pretty low indeed. However, just because the
> probability of executing the action is low I don't think we should not
> report it.
> 
> Rethinking the integer arithmetics, it's true that we should avoid
> hitting zero on the division, eg: nesting 6x 1% sampling rates will make
> the result be zero which will make probability restoration fail on the
> way back. Threrefore, the new probability should be at least 1.
> 
> 
>> I'm not sure about this particular change.  Why do we need it?
>>
> 
> Why do we need to propagate the probability down to nested "sample"
> actions? or why do we need to store the probability in the cb area in
> the first place?
> 
> The former: Just for correctness as only storing the last one would be
> incorrect. Although I don't know of any use for nested "sample" actions.

I think, we can drop this for now.  All the user interfaces specify
the probability per action.  So, it should be fine to report the
probability of the action that emitted the sample without taking into
account the whole timeline of that packet.  Besides, packet can leave
OVS and go back loosing the metadata, so it will not actually be a
full solution anyway.  Single-action metadata is easier to define.

> The latter: To pass it down to psample so that sample receivers know how
> the sampling rate applied (and, e.g: do throughput estimations like OVS
> does with IPFIX).
> 
> 
>>> clone_flow_key = !arg->exec;
>>> -   return clone_execute(dp, skb, key, 0, actions, rem, last,
>>> -clone_flow_key);
>>> +   err = clone_execute(dp, skb, key, 0, actions, rem, last,
>>> +   clone_flow_key);
>>> +
>>> +   if (!last)
>>
>> Is this 

Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action

2024-06-17 Thread Ilya Maximets
On 6/3/24 20:56, Adrian Moreno wrote:
> Add support for a new action: emit_sample.
> 
> This action accepts a u32 group id and a variable-length cookie and uses
> the psample multicast group to make the packet available for
> observability.
> 
> The maximum length of the user-defined cookie is set to 16, same as
> tc_cookie, to discourage using cookies that will not be offloadable.
> 
> Signed-off-by: Adrian Moreno 
> ---
>  Documentation/netlink/specs/ovs_flow.yaml | 17 
>  include/uapi/linux/openvswitch.h  | 25 
>  net/openvswitch/actions.c | 50 +++
>  net/openvswitch/flow_netlink.c| 33 ++-
>  4 files changed, 124 insertions(+), 1 deletion(-)

Some nits below, beside ones already mentioned.

> 
> diff --git a/Documentation/netlink/specs/ovs_flow.yaml 
> b/Documentation/netlink/specs/ovs_flow.yaml
> index 4fdfc6b5cae9..a7ab5593a24f 100644
> --- a/Documentation/netlink/specs/ovs_flow.yaml
> +++ b/Documentation/netlink/specs/ovs_flow.yaml
> @@ -727,6 +727,12 @@ attribute-sets:
>  name: dec-ttl
>  type: nest
>  nested-attributes: dec-ttl-attrs
> +  -
> +name: emit-sample
> +type: nest
> +nested-attributes: emit-sample-attrs
> +doc: |
> +  Sends a packet sample to psample for external observation.
>-
>  name: tunnel-key-attrs
>  enum-name: ovs-tunnel-key-attr
> @@ -938,6 +944,17 @@ attribute-sets:
>-
>  name: gbp
>  type: u32
> +  -
> +name: emit-sample-attrs
> +enum-name: ovs-emit-sample-attr
> +name-prefix: ovs-emit-sample-attr-
> +attributes:
> +  -
> +name: group
> +type: u32
> +  -
> +name: cookie
> +type: binary
>  
>  operations:
>name-prefix: ovs-flow-cmd-
> diff --git a/include/uapi/linux/openvswitch.h 
> b/include/uapi/linux/openvswitch.h
> index efc82c318fa2..a0e9dde0584a 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>  };
>  #endif
>  
> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> +/**
> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> + * action.
> + *
> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> + * sample.
> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that 
> contains
> + * user-defined metadata. The maximum length is 16 bytes.

s/16/OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE/

> + *
> + * Sends the packet to the psample multicast group with the specified group 
> and
> + * cookie. It is possible to combine this action with the
> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being 
> emitted.
> + */
> +enum ovs_emit_sample_attr {
> + OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> + OVS_EMIT_SAMPLE_ATTR_GROUP, /* u32 number. */
> + OVS_EMIT_SAMPLE_ATTR_COOKIE,/* Optional, user specified cookie. */
> + __OVS_EMIT_SAMPLE_ATTR_MAX
> +};
> +
> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> +
> +
>  /**
>   * enum ovs_action_attr - Action types.
>   *
> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>   OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>   OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
>   OVS_ACTION_ATTR_DROP, /* u32 error code. */
> + OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>  
>   __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
>  * from userspace. */
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 964225580824..3b4dba0ded59 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -24,6 +24,11 @@
>  #include 
>  #include 
>  #include 
> +
> +#if IS_ENABLED(CONFIG_PSAMPLE)
> +#include 
> +#endif
> +
>  #include 
>  
>  #include "datapath.h"
> @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct 
> sw_flow_key *key)
>   return 0;
>  }
>  
> +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> +const struct sw_flow_key *key,
> +const struct nlattr *attr)
> +{
> +#if IS_ENABLED(CONFIG_PSAMPLE)
> + struct psample_group psample_group = {};
> + struct psample_metadata md = {};
> + struct vport *input_vport;
> + const struct nlattr *a;
> + int rem;
> +
> + for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> +  a = nla_next(a, )) {

Since the action is strictly validated, can use use nla_for_each_attr()
or nla_for_each_nested() ?

> + switch (nla_type(a)) {
> + case OVS_EMIT_SAMPLE_ATTR_GROUP:
> + psample_group.group_num = nla_get_u32(a);
> + break;
> +
> + case OVS_EMIT_SAMPLE_ATTR_COOKIE:
> + 

Re: [ovs-dev] [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample

2024-06-17 Thread Ilya Maximets
On 6/3/24 20:56, Adrian Moreno wrote:
> If the action has a user_cookie, pass it along to the sample so it can
> be easily identified.
> 
> Signed-off-by: Adrian Moreno 
> ---
>  net/sched/act_sample.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
> index a69b53d54039..5c3f86ec964a 100644
> --- a/net/sched/act_sample.c
> +++ b/net/sched/act_sample.c
> @@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
>const struct tc_action *a,
>struct tcf_result *res)
>  {
> + u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};

Is it necessary to initialize these 16 bytes on every call?
Might be expensive.  We're passing the data length around,
so the uninitialized parts should not be accessed.

Best regards, Ilya Maximets.

>   struct tcf_sample *s = to_sample(a);
>   struct psample_group *psample_group;
>   struct psample_metadata md = {};
> + struct tc_cookie *user_cookie;
>   int retval;
>  
>   tcf_lastuse_update(>tcf_tm);
> @@ -189,6 +191,16 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
>   if (skb_at_tc_ingress(skb) && tcf_sample_dev_ok_push(skb->dev))
>   skb_push(skb, skb->mac_len);
>  
> + rcu_read_lock();
> + user_cookie = rcu_dereference(a->user_cookie);
> + if (user_cookie) {
> + memcpy(cookie_data, user_cookie->data,
> +user_cookie->len);
> + md.user_cookie = cookie_data;
> + md.user_cookie_len = user_cookie->len;
> + }
> + rcu_read_unlock();
> +
>   md.trunc_size = s->truncate ? s->trunc_size : skb->len;
>   psample_sample_packet(psample_group, skb, s->rate, );
>  

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] netdev-dpdk: Use LSC interrupt mode.

2024-06-17 Thread Ilya Maximets
On 6/17/24 09:46, David Marchand wrote:
> On Fri, Jun 14, 2024 at 6:22 PM Ilya Maximets  wrote:
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index 0fa37d5145..a260bc8485 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -2397,7 +2397,18 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
>>> struct smap *args,
>>>  }
>>>  }
>>>
>>> -lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", false);
>>> +lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", true);
>>> +if (lsc_interrupt_mode && !(*info.dev_flags & RTE_ETH_DEV_INTR_LSC)) {
>>> +if (smap_get(args, "dpdk-lsc-interrupt")) {
>>> +VLOG_ERR("interface '%s': link status interrupt is not 
>>> supported.",
>>> + netdev_get_name(netdev));
>>
>> Since we're exiting with an error set, the message should be buffered
>> into errp instead, so it can be visible in the database record and
>> returned as a result of the ovs-vsctl.
>>
>> Also, we're using WARN level for all other configuration issues, so we
>> should do that here as well.  ERR is usually some sort of internal error.
>> And we're usually just using "%s: ..." and not "interface '%s': ...".
> 
> Ok for ERR vs WARN.
> 
> For the rest, well, I copied the logs right before.
> 
> vf_mac = smap_get(args, "dpdk-vf-mac");
> if (vf_mac) {
> struct eth_addr mac;
> 
> if (!dpdk_port_is_representor(dev)) {
> VLOG_WARN("'%s' is trying to set the VF MAC '%s' "
>   "but 'options:dpdk-vf-mac' is only supported for "
>   "VF representors.",
>   netdev_get_name(netdev), vf_mac);
> } else if (!eth_addr_from_string(vf_mac, )) {
> VLOG_WARN("interface '%s': cannot parse VF MAC '%s'.",
>   netdev_get_name(netdev), vf_mac);
> } else if (eth_addr_is_multicast(mac)) {
> VLOG_WARN("interface '%s': cannot set VF MAC to multicast "
>   "address '%s'.", netdev_get_name(netdev), vf_mac);
> } else if (!eth_addr_equals(dev->requested_hwaddr, mac)) {
> dev->requested_hwaddr = mac;
> netdev_request_reconfigure(netdev);
> }
> }
> 
> lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", true);
> 
> 
> So I'll fix the dpdk-vf-mac stuff (and double check the rest of this
> function), then go with your suggestion for this added log of mine.
> 
> 

We must not initialize errp if we do not fail with error, otherwise we leak
the memory.  VF mac code does not fail the configuration, so we only log the
warning.  All the paths that fail should set errp instead.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] appveyor: Fix caching of OpenSSL installer.

2024-06-14 Thread Ilya Maximets
On 6/11/24 00:09, Alin Serdean wrote:
> Acked-by: Alin-Gabriel Serdean 

Thanks!  Applied.

Best regards, Ilya Maximets.

> 
> On Mon, Jun 10, 2024 at 11:18 PM Ilya Maximets  <mailto:i.maxim...@ovn.org>> wrote:
> 
> Apparently, if the cache dependency is specified, the cache folder
> is not checked at the end of a build and so the cache is never
> updated unless we change appveyor.yml.  This makes the cache to not
> actually work, because on each build we discover that the installer
> is outdated, download the new one and it is not uploaded to the cache,
> so it is still outdated on the next build.
> 
> Removing the dependency to get a normal cache behavior.  We're
> manually comparing the hash of the cached binary with the most
> latest one, so we will still catch any OpenSSL updates, but now
> we will also upload the updated cache back.
> 
> Fixes: 9d8208484a35 ("appveyor: Build with OpenSSL 3.0.")
> Reported-at: 
> https://help.appveyor.com/discussions/problems/36144-cache-reports-up-to-date-while-it-is-not
> Signed-off-by: Ilya Maximets 
> ---
>  appveyor.yml | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/appveyor.yml b/appveyor.yml
> index d11e46399..d0293b211 100644
> --- a/appveyor.yml
> +++ b/appveyor.yml
> @@ -15,7 +15,7 @@ init:
>                                -Value "C:\Python312-x64\python.exe"
> 
>  cache:
> -- C:\ovs-build-downloads -> appveyor.yml
> +- C:\ovs-build-downloads
> 
>  install:
>  - ps: |
> -- 
> 2.45.0
> 
> ___
> dev mailing list
> d...@openvswitch.org <mailto:d...@openvswitch.org>
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev 
> <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] netdev-dpdk: Use LSC interrupt mode.

2024-06-14 Thread Ilya Maximets
On 6/14/24 17:08, David Marchand wrote:
> Querying link status may get delayed for an undeterministic (long) time
> with mlx5 ports. This is a consequence of the mlx5 driver calling ethtool
> kernel API and getting stuck on the kernel RTNL lock while some other
> operation is in progress under this lock.
> 
> One impact for long link status query is that it is called under the bond
> lock taken in write mode periodically in bond_run().
> In parallel, datapath threads may block requesting to read bonding related
> info (like for example in bond_check_admissibility()).
> 
> The LSC interrupt mode is available with many DPDK drivers and is used by
> default with testpmd.
> 
> It seems safe enough to switch on this feature by default in OVS.
> We keep the per interface option to disable this feature in case of an
> unforeseen bug.
> 
> Signed-off-by: David Marchand 
> Reviewed-by: Robin Jarry 
> Acked-by: Mike Pattrick 
> ---
> Changes since v2:
> - fixed typo in NEWS,
> 
> Changes since v1:
> - (early) fail when interrupt lsc is requested by user but not supported
>   by the driver,
> - otherwise, log a debug message if user did not request interrupt mode,
> 
> ---
>  Documentation/topics/dpdk/phy.rst |  4 ++--
>  NEWS  |  3 +++
>  lib/netdev-dpdk.c | 13 -
>  vswitchd/vswitch.xml  |  8 
>  4 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/phy.rst 
> b/Documentation/topics/dpdk/phy.rst
> index efd168cba8..eefc25613d 100644
> --- a/Documentation/topics/dpdk/phy.rst
> +++ b/Documentation/topics/dpdk/phy.rst
> @@ -546,8 +546,8 @@ the firmware every time to fulfil this request.
>  
>  Note that not all PMD drivers support LSC interrupts.
>  
> -The default configuration is polling mode. To set interrupt mode, option
> -``dpdk-lsc-interrupt`` has to be set to ``true``.
> +The default configuration is interrupt mode. To set polling mode, option
> +``dpdk-lsc-interrupt`` has to be set to ``false``.
>  
>  Command to set interrupt mode for a specific interface::
>  $ ovs-vsctl set interface  options:dpdk-lsc-interrupt=true
> diff --git a/NEWS b/NEWS
> index 5ae0108d55..d05f2d0f89 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,6 +9,9 @@ Post-v3.3.0
>   https://github.com/openvswitch/ovs.git
> - DPDK:
>   * OVS validated with DPDK 23.11.1.
> + * Link status changes are now handled via interrupt mode if the DPDK
> +   driver supports it.  It is possible to revert to polling mode by 
> setting
> +   per interface 'options:dpdk-lsc-interrupt' to 'false'.
>  
>  
>  v3.3.0 - 16 Feb 2024
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 0fa37d5145..a260bc8485 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2397,7 +2397,18 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
> struct smap *args,
>  }
>  }
>  
> -lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", false);
> +lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", true);
> +if (lsc_interrupt_mode && !(*info.dev_flags & RTE_ETH_DEV_INTR_LSC)) {
> +if (smap_get(args, "dpdk-lsc-interrupt")) {
> +VLOG_ERR("interface '%s': link status interrupt is not 
> supported.",
> + netdev_get_name(netdev));

Since we're exiting with an error set, the message should be buffered
into errp instead, so it can be visible in the database record and
returned as a result of the ovs-vsctl.

Also, we're using WARN level for all other configuration issues, so we
should do that here as well.  ERR is usually some sort of internal error.
And we're usually just using "%s: ..." and not "interface '%s': ...".

> +err = EINVAL;
> +goto out;
> +}
> +VLOG_DBG("interface '%s': not enabling link status interrupt.",
> + netdev_get_name(netdev));
> +lsc_interrupt_mode = false;
> +}
>  if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) {
>  dev->requested_lsc_interrupt_mode = lsc_interrupt_mode;
>  netdev_request_reconfigure(netdev);
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 8a1b607d71..e3afb78a4e 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -4647,12 +4647,12 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 
> type=patch options:peer=p1 \
>type='{"type": "boolean"}'>
>  
> -  Set this value to true to configure interrupt mode for
> -  Link State Change (LSC) detection instead of poll mode for the DPDK
> -  interface.
> +  Set this value to false to configure poll mode for
> +  Link State Change (LSC) detection instead of interrupt mode for the
> +  DPDK interface.
>  
>  
> -  If this value is not set, poll mode is configured.
> +  If this value is not set, interrupt mode is configured.
>

Re: [ovs-dev] [PATCH] dpif-netdev: Disable XPS (Transmit Packet Steering) for non-pmd ports.

2024-06-14 Thread Ilya Maximets
On 6/13/24 12:15, Eli Britstein wrote:
> 
> 
>> -Original Message-----
>> From: Ilya Maximets 
>> Sent: Monday, 10 June 2024 15:53
>> To: Roi Dayan ; d...@openvswitch.org
>> Cc: Eli Britstein ; Maor Dickman ;
>> i.maxim...@ovn.org
>> Subject: Re: [ovs-dev] [PATCH] dpif-netdev: Disable XPS (Transmit Packet
>> Steering) for non-pmd ports.
>>
>> External email: Use caution opening links or attachments
>>
>>
>> On 6/9/24 12:16, Roi Dayan via dev wrote:
>>> From: Eli Britstein 
>>>
>>> In the cited commit, XPS was introduced. It is NA for non-pmd ports.
>>> Upon port creation it is indeed disabled, but at port reconfigure, the
>>> condition of netdev_is_pmd() is missing.
>>> As a result, XPS is configured, and such messages are repeating in the log:
>>>   DBG|Core 2: New TX queue ID 0 for port 'v1_r'.
>>> Fix it.
>>
>> Hi, Eli.  Thanks for the patch!
>>
>> While it's maybe true that it was an original intention to not have XPS 
>> engaged
>> for non-PMD ports (frankly, I don't remember), the behavior was changed
>> quickly after in commit:
>>  e32971b8ddb4 ("dpif-netdev: Centralized threads and queues handling
>> code.") The logic was centralized in the reconfiguration code and no port is
>> actually used until it went through datapath reconfiguration.
>>
>> And later we had AF_XDP ports introduced and even afxdp-nonpmd.  For
>> these it is still important to have balanced use of Tx queues even if the 
>> port is
>> not polled by PMD threads on Rx side.
>>
>> We also changed netdev_send() API to include 'concurrent_txq' flag to make
>> the netdev implementation know if it needs to lock the queue before using.
>> Default STATIC mode doesn't set this flag.
>> This also means that we can't actually supply Tx queue IDs to netdev_send()
>> out of range of the allocated queues, since netdev implementation will have
>> to lock every time otherwise.  STATIC mode will use out-of-range queue IDs
>> with this change applied.
> Thanks for this explanation. Let me clarify the scenario:
> We add a veth port to netdev bridge (dpif-netdev). It doesn't have multiple 
> queues, so it is 1.
> The "wanted_txqs" is 2 at the minimum - one for at least one PMD, and another 
> for the main thread.
> The veth port is still configured as XPS, though there are no multiple TX 
> queues.
> When I looked back to what condition to add, I saw the is_pmd in the cited 
> commit.
> How about this then?
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 5196183ff..dac7de851 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -6646,7 +6646,8 @@ reconfigure_datapath(struct dp_netdev *dp)
>  if (port->txq_requested_mode == TXQ_REQ_MODE_HASH &&
>  netdev_n_txq(port->netdev) > 1) {
>  port->txq_mode = TXQ_MODE_XPS_HASH;
> -} else if (netdev_n_txq(port->netdev) < wanted_txqs) {
> +} else if (netdev_n_txq(port->netdev) > 1 &&
> +   netdev_n_txq(port->netdev) < wanted_txqs) {
>  port->txq_mode = TXQ_MODE_XPS;
>  } else {
>  port->txq_mode = TXQ_MODE_STATIC;
> 

This will still break non-pmd ports that support multi-queue, e.g.
afxdp-nonpmd.

>>
>> With that, I don't think we can accept this change.  At the current state of 
>> the
>> netdev API, dpif-netdev should never actually use Tx queue IDs out of the
>> allocated range and it must set 'concurrent_txq' flag whenever queues can be
>> shared, otherwise we'll get data races and crashes on out-of-range memory
>> accesses.
> Do you mean TXQ_MODE_STATIC is broken regardless of this change? I guess if 
> so it
> is currently hidden as it is never used.

It's not broken, but it must not be used (and it is not being used) if number of
available queues is less than number of threads.  If you have a device with 8 
queues
and only 4 threads in OVS, TXQ_MODE_STATIC will be used and will work fine.

Why the debug log message is a problem?

>>
>> We should technically remove all the 'qid % n_txq' stuff from all the netdev
>> implementations and replace them with ovs_assert() on the API level.  We
>> had a few patches for that in the past, but they didn't get proper attention
>> and went stale.
> Maybe, but it's not related to this commit.
>>
>> Best regards, Ilya Maximets.
>>
>>>
>>> Fixes: 324c8374852a ("dpif-netdev: XPS (Transmit Packet Steering)
>>> implementation."

Re: [ovs-dev] [PATCH] ofp-prop: Fix unaligned 128 bit access.

2024-06-14 Thread Ilya Maximets
On 6/12/24 20:12, Mike Pattrick wrote:
> On Wed, Jun 12, 2024 at 9:50 AM Ales Musil  wrote:
> 
>>
>>
>> On Wed, Jun 12, 2024 at 3:32 PM Mike Pattrick  wrote:
>>
>>> When compiling with '-fsanitize=address,undefined', the "ovs-ofctl
>>> ct-flush" test will yield the following undefined behavior flagged
>>> by UBSan. This patch uses memcpy to move the 128bit value into the
>>> stack before reading it.
>>>
>>> lib/ofp-prop.c:277:14: runtime error: load of misaligned address
>>> for type 'union ovs_be128', which requires 8 byte alignment
>>>   ^
>>> #0 0x7735d4 in ofpprop_parse_u128 lib/ofp-prop.c:277
>>> #1 0x6c6c83 in ofp_ct_match_decode lib/ofp-ct.c:529
>>> #2 0x76f3b5 in ofp_print_nxt_ct_flush lib/ofp-print.c:959
>>> #3 0x76f3b5 in ofp_to_string__ lib/ofp-print.c:1206
>>> #4 0x76f3b5 in ofp_to_string lib/ofp-print.c:1264
>>> #5 0x770c0d in ofp_print lib/ofp-print.c:1308
>>> #6 0x484a9d in ofctl_ofp_print utilities/ovs-ofctl.c:4899
>>> #7 0x4ddb77 in ovs_cmdl_run_command__ lib/command-line.c:247
>>> #8 0x47f6b3 in main utilities/ovs-ofctl.c:186

Thanks for cleaning up the trace.  Please, also remove the '#' tags.
GitHub trats them as references to issues/PRs and that is annoying.

Also, while most of the addresses in the trace are not important and it's
good to strip or shorten them, the actual address where the memory access
happened is important here, so we can see what the actual alignment was.
Was it part of the original error message?  Clang usually provides them,
not sure about gcc.

>>>
>>> Signed-off-by: Mike Pattrick 
>>> ---
>>>
>>
>> Hi Mike,
>>
>> this is interesting, do you have an idea why it didn't fail in CI by now?
>> Also AFAIR the ofprops is supposed to be aligned to 8 bytes so unless the
>> buffer itself isn't allocated at an address that is not itself 8 bytes
>> aligned it shouldn't happen. In that case we might actually have a problem
>> with other sizes.
>>
> 
> Report is seen with gcc + ubsan, but not clang + ubsan. It is possible that
> this is only seen due the test, this warning wasn't seen live.

I agree with Ales on this one.  Properties supposed to be aligned.
We need to find why they are not.  i.e. is it a property itself or
something before it.

We may need to take similar approach as in commit:
  a5cc859a4228 ("ofp-actions: Use aligned structures when decoding ofp 
actions.")

> 
> Cheers,
> M
> 
> 
>>
>>
>>>  lib/ofp-prop.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/ofp-prop.c b/lib/ofp-prop.c
>>> index 0a685750c..ed6365414 100644
>>> --- a/lib/ofp-prop.c
>>> +++ b/lib/ofp-prop.c
>>> @@ -271,10 +271,12 @@ enum ofperr
>>>  ofpprop_parse_u128(const struct ofpbuf *property, ovs_u128 *value)
>>>  {
>>>  ovs_be128 *p = property->msg;
>>> +ovs_be128 aligned;
>>>  if (ofpbuf_msgsize(property) != sizeof *p) {
>>>  return OFPERR_OFPBPC_BAD_LEN;
>>>  }
>>> -*value = ntoh128(*p);
>>> +memcpy(, p, sizeof aligned);

FWIW, this doesn't actually fix the issue.  At least not in all the
cases.  Compiler is free to make alignment assumptions based on the
pointer type, so we can still have unaligned access inside the memcpy.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [BUG][meter] ovs crash when add meter openflow

2024-06-14 Thread Ilya Maximets
On 6/14/24 13:33, Simon Jones wrote:
> ```
> Date:   Fri Jun 14 19:25:43 2024 +0800
> 
> bugfix of meter tc crash.
> 
> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
> index 9fde5f7a9..d08c5a35f 100644
> --- a/lib/netdev-offload.c
> +++ b/lib/netdev-offload.c
> @@ -186,7 +186,6 @@ netdev_assign_flow_api(struct netdev *netdev)
>  ovsrcu_set(>flow_api, rfa->flow_api);
>  VLOG_INFO("%s: Assigned flow API '%s'.",
>netdev_get_name(netdev), rfa->flow_api->type);
> -return 0;
>  }
>  VLOG_DBG("%s: flow API '%s' is not suitable.",
>   netdev_get_name(netdev), rfa->flow_api->type);
> ```
> 
> 
> Simon Jones
> 
> 
> Simon Jones  于2024年6月14日周五 19:25写道:
> 
>> Maybe reason is this:
>> ```
>> @netdev_offload_tc and @netdev_offload_dpdk will always be register.
>> Then the @meter_set api will be called.
>> The @meter_set could be called only after @init_flow_api, but the BUG
>> happens when @meter_set is called before @init_flow_api is called.
>>
>> Check these code:
>>
>> static int
>> netdev_assign_flow_api(struct netdev *netdev)
>> {
>> struct netdev_registered_flow_api *rfa;
>>
>> CMAP_FOR_EACH (rfa, cmap_node, _flow_apis) {
>> if (!rfa->flow_api->init_flow_api(netdev)) {
>> ovs_refcount_ref(>refcnt);
>> ovsrcu_set(>flow_api, rfa->flow_api);
>> VLOG_INFO("%s: Assigned flow API '%s'.",
>>   netdev_get_name(netdev), rfa->flow_api->type);
>> return 0;
>> }
>> VLOG_DBG("%s: flow API '%s' is not suitable.",
>>  netdev_get_name(netdev), rfa->flow_api->type);
>> }
>> VLOG_INFO("%s: No suitable flow API found.", netdev_get_name(netdev));
>>
>> return -1;
>> }
>>
>> ONLY one type of rfa->flow_api->init_flow_api (DPDK or TC) could be called.
>> Because after one called, then return 0.
>> ```
>>
>> So I suggest to just remove 'return 0'.

Hi, Simon.  This is not a correct thing to do.  By design, only one flow API
should be able to accept a particular port type.  If two API implementations
can accept the same port, that would be a bug.  Also, initializing more than
one API will cause resource leaks and potentially incorrect datapath behavior.

Since you're using userspace datapath, init_flow_api() from the TC 
implementation
should always fail.

What is your OVS version?

Best regards, Ilya Maximets.

>> Like this patch:
>>
>> 
>> Simon Jones
>>
>>
>> Simon Jones  于2024年6月11日周二 17:32写道:
>>
>>>
>>> Hi all,
>>>
>>> I'm using ovs-dpdk with this patch:
>>> ```
>>> commit 4c226944f7c55c9d6e7c85f7c33c5ce11c35ce54
>>> Author: Jianbo Liu 
>>> Date:   Fri Jul 8 03:06:26 2022 +
>>>
>>> netdev-offload-tc: Implement meter offload API for tc
>>> ```
>>> This patch is for offload flow meter by tc.
>>>
>>> Now I found a bug: ovs crash when add meter openflow.
>>>
>>> 1. How to produce:
>>> (NOTICE: This bug is not always reproducible.)
>>> ```
>>> Add these commands:
>>>
>>> ovs-ofctl -O OpenFlow13 add-meter br-int
>>> meter=1,kbps,band=type=drop,rate=1000
>>> ovs-ofctl -O OpenFlow13 add-flow br-int in_port=\"pf0vf0\",ip,nw_src=
>>> 16.0.0.0/24,nw_dst=48.0.0.0/24,nw_proto=17,actions=\
>>> <http://16.0.0.0/24,nw_dst=48.0.0.0/24,nw_proto=17,actions=%5C>
>>> "meter:1,output:p0\"
>>> ovs-ofctl -O OpenFlow13 add-flow br-int
>>> in_port=\"pf0vf0\",udp6,ipv6_src=2001:4860:0:2001::/64,ipv6_dst=2001:0:4137:9350:8000:f12a:b9c8:2815,actions=\"meter:1,output:p0\"
>>>
>>> Then ovs crash, this is core file call trace:
>>>
>>> (gdb) bt
>>> #0  id_pool_alloc_id (pool=0x0, id_=id_@entry=0x7fff180c841c) at
>>> lib/id-pool.c:112
>>> #1  0x0055f0a0 in meter_alloc_police_index
>>> (police_index=0x7fff180c841c) at lib/netdev-offload-tc.c:2530
>>> #2  meter_tc_set_policer (meter_id=..., config=0x7fff180c8538) at
>>> lib/netdev-offload-tc.c:2567
>>> #3  0x004af207 in meter_offload_set (meter_id=meter_id@entry=...,
>>> config=config@entry=0x7fff180c8538) at lib/netdev-offload.c:207
>>> #4  0x00474c2b in dpif_netdev_meter_set (dpif=,
>>> meter_id=..., conf

[ovs-dev] [PATCH] vswitchd: Only lock pages that are faulted in.

2024-06-14 Thread Ilya Maximets
The main purpose of locking the memory is to ensure that OVS can keep
doing what it did before in case of increased memory pressure, e.g.,
during VM ingest / migration.  Fulfilling this requirement can be
achieved without locking all the allocated memory, but only the pages
already accessed in the past (faulted in).  Processing of the new
traffic involves new memory allocations.  Latency on these operations
can't be guaranteed by the locking.  The main difference would be
the pre-faulting of the stack memory.  However, in order to revalidate
or process upcalls on the same traffic, the same amount of stack is
likely needed, so all the necessary memory will already be faulted in.

Switch 'mlockall' to MCL_ONFAULT to avoid consuming unnecessarily
large amounts of RAM on systems with high core counts.  For example,
in a densely populated OVN cluster this saves about 650 MB of RAM per
node on a system with 64 cores.  This equates to 320 GB of allocated
but unused RAM in a 500 node cluster.

This also makes OVS better suited by default for small systems with
limited amount of memory.

The MCL_ONFAULT flag was introduced in Linux kernel 4.4 and wasn't
available at the time of '--mlockall' introduction, but we can use it
now.  Falling back to an old way of locking in case we're running on
an older kernel just in case.

Only locking the faulted in pages also makes locking compatible with
vhost post-copy live migration by default, because we'll no longer
pre-fault all the guest's memory.  Post-copy relies on userfaultfd
to work on shared huge pages, which is only available in 4.11+ kernels.
So, technically, it should not be possible for MCL_ONFAULT to fail and
the call without it to succeed.  But keeping the check just in case
for now.

Signed-off-by: Ilya Maximets 
---
 Documentation/ref/ovs-ctl.8.rst  |  5 +++--
 Documentation/topics/dpdk/vhost-user.rst |  6 --
 NEWS |  2 ++
 lib/netdev-dpdk.c|  2 +-
 lib/util.c   | 12 ++--
 lib/util.h   |  4 ++--
 vswitchd/ovs-vswitchd.8.in   |  9 +
 vswitchd/ovs-vswitchd.c  | 17 -
 8 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/Documentation/ref/ovs-ctl.8.rst b/Documentation/ref/ovs-ctl.8.rst
index 9f077a122..cdbaac4dc 100644
--- a/Documentation/ref/ovs-ctl.8.rst
+++ b/Documentation/ref/ovs-ctl.8.rst
@@ -170,8 +170,9 @@ The following options are less important:
 * ``--no-mlockall``
 
   By default ``ovs-ctl`` passes ``--mlockall`` to ``ovs-vswitchd``,
-  requesting that it lock all of its virtual memory, preventing it
-  from being paged to disk.  This option suppresses that behavior.
+  requesting that it lock all of its virtual memory on page fault (on
+  allocation, when running on Linux kernel 4.4 and older), preventing
+  it from being paged to disk.  This option suppresses that behavior.
 
 * ``--no-self-confinement``
 
diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
index 7866543d8..d9d87aa08 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -340,8 +340,10 @@ The default value is ``false``.
 fixes (like userfaulfd leak) was released in 3.0.1.
 
 DPDK Post-copy feature requires avoiding to populate the guest memory
-(application must not call mlock* syscall). So enabling mlockall is
-incompatible with post-copy feature.
+(application must not call mlock* syscall without MCL_ONFAULT).
+So enabling mlockall is incompatible with post-copy feature in OVS 3.3 and
+older. Newer versions of OVS only lock memory pages that are faulted in,
+so both features can be used at the same time.
 
 Note that during migration of vhost-user device, PMD threads hang for the
 time of faulted pages download from source host. Transferring 1GB hugepage
diff --git a/NEWS b/NEWS
index 5ae0108d5..66c370f20 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,7 @@
 Post-v3.3.0
 
+   - Option '--mlockall' now only locks memory pages on fault, if possible.
+ This also makes it compatible with vHost Post-copy Live Migration.
- Userspace datapath:
  * Conntrack now supports 'random' flag for selecting ports in a range
while natting and 'persistent' flag for selection of the IP address
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 0fa37d514..bdc08bcf5 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -6704,7 +6704,7 @@ parse_vhost_config(const struct smap *ovs_other_config)
 
 vhost_postcopy_enabled = smap_get_bool(ovs_other_config,
"vhost-postcopy-support", false);
-if (vhost_postcopy_enabled && memory_locked()) {
+if (vhost_postcopy_enabled && memory_all_locked()) {
 VLOG_WARN("vhost-postcopy-support 

[ovs-dev] [PATCH] appveyor: Fix caching of OpenSSL installer.

2024-06-10 Thread Ilya Maximets
Apparently, if the cache dependency is specified, the cache folder
is not checked at the end of a build and so the cache is never
updated unless we change appveyor.yml.  This makes the cache to not
actually work, because on each build we discover that the installer
is outdated, download the new one and it is not uploaded to the cache,
so it is still outdated on the next build.

Removing the dependency to get a normal cache behavior.  We're
manually comparing the hash of the cached binary with the most
latest one, so we will still catch any OpenSSL updates, but now
we will also upload the updated cache back.

Fixes: 9d8208484a35 ("appveyor: Build with OpenSSL 3.0.")
Reported-at: 
https://help.appveyor.com/discussions/problems/36144-cache-reports-up-to-date-while-it-is-not
Signed-off-by: Ilya Maximets 
---
 appveyor.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/appveyor.yml b/appveyor.yml
index d11e46399..d0293b211 100644
--- a/appveyor.yml
+++ b/appveyor.yml
@@ -15,7 +15,7 @@ init:
   -Value "C:\Python312-x64\python.exe"
 
 cache:
-- C:\ovs-build-downloads -> appveyor.yml
+- C:\ovs-build-downloads
 
 install:
 - ps: |
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] checkpatch: Extend and move extra_keywords list to file.

2024-06-10 Thread Ilya Maximets
On 6/10/24 16:55, Aaron Conole wrote:
> Eelco Chaudron  writes:
> 
>> On 7 Jun 2024, at 15:46, Mike Pattrick wrote:
>>
>>> On Fri, Jun 7, 2024 at 2:35 AM Eelco Chaudron  wrote:
>>>>
>>>>
>>>>
>>>> On 6 Jun 2024, at 3:07, Mike Pattrick wrote:
>>>>
>>>>> This patch extends the extra_keywords list from 324 to 747 keywords and
>>>>> moves this list to a separate file. The methodology used to create this
>>>>> list was running the spell checker on a large volume of historical
>>>>> patches and selecting any words that appeared multiple times.
>>>>
>>>> Thanks Mike,
>>>>
>>>> I like the idea of having this in a separate file (I would add the
>>>> .txt extension to it), however, just blindly taking the last x
>>>> errors does not seem to be the right approach.
>>>>
>>>> Last time I took the words from the last 1000 commits that made
>>>> sense. For example, things like countersfn, deviceiocontrol,
>>>> etc. do not make sense to me to add.
>>>
>>> Why wouldn't we want something like deviceiocontrol in an exclusion
>>> list? It's a common Windows function name, any commit that touches the
>>> windows code has a high likelihood of including it.
>>
>> Well deviceiocontrol is maybe an outlier, but there is a lot of other
>> stuff we should not add. And even though, it only checks comments and
>> commit messages.

I agree with Eelco on that.  Some parts of the code will inevitably be
mentioned in commit messages or comments and we can't add all of them
in the dictionary.  Even 'deviceiocontrol' doesn't seem like a good word
for the dictionary.  It might be better to work on better recognition of
code parts or quoted words and strings within comments and commit messages.

Some added words are just abbreviated normal words, so it may be better
to encourage people to use a full word instead by not adding to the
dictionary.

Some added words like 'lexograpically' seem like just legit typos.  This
is not how this word supposed to be spelled.

>>
>> To make it easier to review the words added, maybe split the patch in
>> two patches. One moving words to checkpatch.words, or some other self
>> explanatory name, and the other one introducing the additional words.
> 
> I agree with this approach.  It is a bit difficult to review when moving
> things and introducing changes in the same patch.  Better to split it to
> the move and then the additions.

+1

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Disable XPS (Transmit Packet Steering) for non-pmd ports.

2024-06-10 Thread Ilya Maximets
On 6/9/24 12:16, Roi Dayan via dev wrote:
> From: Eli Britstein 
> 
> In the cited commit, XPS was introduced. It is NA for non-pmd ports.
> Upon port creation it is indeed disabled, but at port reconfigure, the
> condition of netdev_is_pmd() is missing.
> As a result, XPS is configured, and such messages are repeating in the log:
>   DBG|Core 2: New TX queue ID 0 for port 'v1_r'.
> Fix it.

Hi, Eli.  Thanks for the patch!

While it's maybe true that it was an original intention to not have XPS
engaged for non-PMD ports (frankly, I don't remember), the behavior was
changed quickly after in commit:
  e32971b8ddb4 ("dpif-netdev: Centralized threads and queues handling code.")
The logic was centralized in the reconfiguration code and no port is
actually used until it went through datapath reconfiguration.

And later we had AF_XDP ports introduced and even afxdp-nonpmd.  For these
it is still important to have balanced use of Tx queues even if the port
is not polled by PMD threads on Rx side.

We also changed netdev_send() API to include 'concurrent_txq' flag to
make the netdev implementation know if it needs to lock the queue before
using.  Default STATIC mode doesn't set this flag.
This also means that we can't actually supply Tx queue IDs to netdev_send()
out of range of the allocated queues, since netdev implementation will
have to lock every time otherwise.  STATIC mode will use out-of-range
queue IDs with this change applied.

With that, I don't think we can accept this change.  At the current state
of the netdev API, dpif-netdev should never actually use Tx queue IDs
out of the allocated range and it must set 'concurrent_txq' flag whenever
queues can be shared, otherwise we'll get data races and crashes on
out-of-range memory accesses.

We should technically remove all the 'qid % n_txq' stuff from all the
netdev implementations and replace them with ovs_assert() on the API
level.  We had a few patches for that in the past, but they didn't get
proper attention and went stale.

Best regards, Ilya Maximets.

> 
> Fixes: 324c8374852a ("dpif-netdev: XPS (Transmit Packet Steering) 
> implementation.")
> Signed-off-by: Eli Britstein 
> Acked-by: Roi Dayan 
> ---
>  lib/dpif-netdev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index c7f9e149025e..94e1204575ea 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -6804,7 +6804,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>  if (port->txq_requested_mode == TXQ_REQ_MODE_HASH &&
>  netdev_n_txq(port->netdev) > 1) {
>  port->txq_mode = TXQ_MODE_XPS_HASH;
> -} else if (netdev_n_txq(port->netdev) < wanted_txqs) {
> +} else if (netdev_n_txq(port->netdev) < wanted_txqs && 
> netdev_is_pmd(port->netdev)) {
>  port->txq_mode = TXQ_MODE_XPS;
>  } else {
>  port->txq_mode = TXQ_MODE_STATIC;

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH branch-3.3 0/2] Release patches for v3.3.1.

2024-06-07 Thread Ilya Maximets
On 6/7/24 16:42, Kevin Traynor wrote:
> On 07/06/2024 15:01, Ilya Maximets wrote:
>> We didn't make a stable release for a while.  It's definitely time
>> to make one.
>>
>> Ilya Maximets (2):
>>   Set release date for 3.3.1.
>>   Prepare for 3.3.2.
>>
>>  NEWS | 6 +-
>>  configure.ac | 2 +-
>>  debian/changelog | 8 +++-
>>  3 files changed, 13 insertions(+), 3 deletions(-)
>>
> 
> Acked-by: Kevin Traynor 
> 

Thanks, Kevin and Eelco!

All releases are now tagged and pushed.  Will update pypi,
the website and make an announcement soon.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.3 1/2] Set release date for 3.3.1.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 ++-
 debian/changelog | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 31d235a40..aaf8d4b4a 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
-v3.3.1 - xx xxx 
+v3.3.1 - 07 Jun 2024
 
+   - Bug fixes
- DPDK:
  * OVS validated with DPDK 23.11.1.
 
diff --git a/debian/changelog b/debian/changelog
index 22c767a4c..3ff112cd5 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ openvswitch (3.3.1-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
 
- -- Open vSwitch team   Fri, 16 Feb 2024 12:25:58 +0100
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:27 +0200
 
 openvswitch (3.3.0-1) unstable; urgency=low
 
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.3 2/2] Prepare for 3.3.2.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 +++
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index aaf8d4b4a..f62d06ffb 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+v3.3.2 - xx xxx 
+
+
 v3.3.1 - 07 Jun 2024
 
- Bug fixes
diff --git a/configure.ac b/configure.ac
index a3ea65c0f..4b00d1878 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 3.3.1, b...@openvswitch.org)
+AC_INIT(openvswitch, 3.3.2, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([vswitchd/ovs-vswitchd.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index 3ff112cd5..82af18415 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (3.3.2-1) unstable; urgency=low
+   [ Open vSwitch team ]
+   * New upstream version
+
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:27 +0200
+
 openvswitch (3.3.1-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.3 0/2] Release patches for v3.3.1.

2024-06-07 Thread Ilya Maximets
We didn't make a stable release for a while.  It's definitely time
to make one.

Ilya Maximets (2):
  Set release date for 3.3.1.
  Prepare for 3.3.2.

 NEWS | 6 +-
 configure.ac | 2 +-
 debian/changelog | 8 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.2 2/2] Prepare for 3.2.4.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 +++
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 43ab6d806..492b096db 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+v3.2.4 - xx xxx 
+
+
 v3.2.3 - 07 Jun 2024
 
- Bug fixes
diff --git a/configure.ac b/configure.ac
index 25a00dcb1..40c386d92 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 3.2.3, b...@openvswitch.org)
+AC_INIT(openvswitch, 3.2.4, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([vswitchd/ovs-vswitchd.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index a4ae141aa..056bd9e7f 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (3.2.4-1) unstable; urgency=low
+   [ Open vSwitch team ]
+   * New upstream version
+
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:16 +0200
+
 openvswitch (3.2.3-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.2 1/2] Set release date for 3.2.3.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 ++-
 debian/changelog | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 8baf3d2aa..43ab6d806 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
-v3.2.3 - xx xxx 
+v3.2.3 - 07 Jun 2024
 
+   - Bug fixes
- DPDK:
  * OVS validated with DPDK 22.11.5.
 
diff --git a/debian/changelog b/debian/changelog
index 302fc0a45..a4ae141aa 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ openvswitch (3.2.3-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
 
- -- Open vSwitch team   Thu, 08 Feb 2024 17:55:30 +0100
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:16 +0200
 
 openvswitch (3.2.2-1) unstable; urgency=low
[ Open vSwitch team ]
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.2 0/2] Release patches for v3.2.3.

2024-06-07 Thread Ilya Maximets
We didn't make a stable release for a while.  It's definitely time
to make one.

Ilya Maximets (2):
  Set release date for 3.2.3.
  Prepare for 3.2.4.

 NEWS | 6 +-
 configure.ac | 2 +-
 debian/changelog | 8 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.1 1/2] Set release date for 3.1.5.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 ++-
 debian/changelog | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 44a3e6247..9604939b3 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
-v3.1.5 - xx xxx 
+v3.1.5 - 07 Jun 2024
 
+   - Bug fixes
- DPDK:
  * OVS validated with DPDK 22.11.5.
 
diff --git a/debian/changelog b/debian/changelog
index 232c3d266..91fee3dee 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ openvswitch (3.1.5-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
 
- -- Open vSwitch team   Thu, 08 Feb 2024 17:55:19 +0100
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:06 +0200
 
 openvswitch (3.1.4-1) unstable; urgency=low
[ Open vSwitch team ]
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.1 2/2] Prepare for 3.1.6.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 +++
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 9604939b3..cf0700ab3 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+v3.1.6 - xx xxx 
+
+
 v3.1.5 - 07 Jun 2024
 
- Bug fixes
diff --git a/configure.ac b/configure.ac
index c30e273c5..d647b547c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 3.1.5, b...@openvswitch.org)
+AC_INIT(openvswitch, 3.1.6, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([vswitchd/ovs-vswitchd.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index 91fee3dee..56ca7fd48 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (3.1.6-1) unstable; urgency=low
+   [ Open vSwitch team ]
+   * New upstream version
+
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:06 +0200
+
 openvswitch (3.1.5-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.1 0/2] Release patches for v3.1.5.

2024-06-07 Thread Ilya Maximets
We didn't make a stable release for a while.  It's definitely time
to make one.

Ilya Maximets (2):
  Set release date for 3.1.5.
  Prepare for 3.1.6.

 NEWS | 6 +-
 configure.ac | 2 +-
 debian/changelog | 8 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.0 0/2] Release patches for v3.0.7.

2024-06-07 Thread Ilya Maximets
We didn't make a stable release for a while.  It's definitely time
to make one.

Ilya Maximets (2):
  Set release date for 3.0.7.
  Prepare for 3.0.8.

 NEWS | 6 +-
 configure.ac | 2 +-
 debian/changelog | 8 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.0 2/2] Prepare for 3.0.8.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 +++
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 088bf0dd2..504c70b6b 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+v3.0.8 - xx xxx 
+
+
 v3.0.7 - 07 Jun 2024
 
- Bug fixes
diff --git a/configure.ac b/configure.ac
index d59b87622..3bb882643 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 3.0.7, b...@openvswitch.org)
+AC_INIT(openvswitch, 3.0.8, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([vswitchd/ovs-vswitchd.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index daaaed38a..d65ebee98 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (3.0.8-1) unstable; urgency=low
+   [ Open vSwitch team ]
+   * New upstream version
+
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:01 +0200
+
 openvswitch (3.0.7-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-3.0 1/2] Set release date for 3.0.7.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 ++-
 debian/changelog | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 8550e8e89..088bf0dd2 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
-v3.0.7 - xx xxx 
+v3.0.7 - 07 Jun 2024
 
+   - Bug fixes
- DPDK:
  * OVS validated with DPDK 21.11.7.
 
diff --git a/debian/changelog b/debian/changelog
index 7494e83c6..daaaed38a 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ openvswitch (3.0.7-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
 
- -- Open vSwitch team   Thu, 08 Feb 2024 17:54:21 +0100
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:58:01 +0200
 
 openvswitch (3.0.6-1) unstable; urgency=low
[ Open vSwitch team ]
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.17 0/2] Release patches for v2.17.10.

2024-06-07 Thread Ilya Maximets
We didn't make a stable release for a while.  It's definitely time
to make one.

Ilya Maximets (2):
  Set release date for 2.17.10.
  Prepare for 2.17.11.

 NEWS | 6 +-
 configure.ac | 2 +-
 debian/changelog | 8 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.17 1/2] Set release date for 2.17.10.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 ++-
 debian/changelog | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index e91072dd6..a809e8e3d 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
-v2.17.10 - xx xxx 
+v2.17.10 - 07 Jun 2024
 --
+   - Bug fixes
- DPDK:
  * OVS validated with DPDK 21.11.7.
 
diff --git a/debian/changelog b/debian/changelog
index f451bf643..0778c1c74 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ openvswitch (2.17.10-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
 
- -- Open vSwitch team   Thu, 08 Feb 2024 17:52:54 +0100
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:57:48 +0200
 
 openvswitch (2.17.9-1) unstable; urgency=low
[ Open vSwitch team ]
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.17 2/2] Prepare for 2.17.11.

2024-06-07 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 3 +++
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index a809e8e3d..8ba8a8d62 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+v2.17.11 - xx xxx 
+--
+
 v2.17.10 - 07 Jun 2024
 --
- Bug fixes
diff --git a/configure.ac b/configure.ac
index 44af9eb6d..187431e51 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 2.17.10, b...@openvswitch.org)
+AC_INIT(openvswitch, 2.17.11, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([datapath/datapath.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index 0778c1c74..dee876165 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (2.17.11-1) unstable; urgency=low
+   [ Open vSwitch team ]
+   * New upstream version
+
+ -- Open vSwitch team   Fri, 07 Jun 2024 15:57:48 +0200
+
 openvswitch (2.17.10-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] python: ovs: flow: Fix nested check_pkt_len acts.

2024-06-07 Thread Ilya Maximets
On 6/6/24 19:53, Adrián Moreno wrote:
> On Thu, Jun 06, 2024 at 06:00:26PM GMT, Ilya Maximets wrote:
>> On 6/6/24 17:15, Adrian Moreno wrote:
>>> Add check_pkt_len action to the decoder list that it, itself, uses.
>>>
>>> This makes nested check_pkt_len (i.e:a check_pkt_len inside another)
>>> work.
>>>
>>> Signed-off-by: Adrian Moreno 
>>> ---
>>>  python/ovs/flow/odp.py   | 43 ++--
>>>  python/ovs/tests/test_odp.py | 29 
>>>  2 files changed, 51 insertions(+), 21 deletions(-)
>>
>> Hi, Adrian.
>>
>> Could you, please, provide a Fixes tag for this?
>> No need to send v2 just for this, just reply with it to this thread.
>> (Tags should start from the beginning of the line for patchwork to
>> recognize them.)
>>
> 
> Sure, how about this:
> 
> Reported-by: Ilya Maximets 
> Fixes: 076663b31edc ("python: Add ovs datapath flow parsing.")
> 
> Thanks
> Adrián
> 

Thanks, Adrian and Eelco!

Applied and backported down to 3.0.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] python: idl: Fix index not being updated on row modification.

2024-06-07 Thread Ilya Maximets
On 6/6/24 20:55, Terry Wilson wrote:
> On Thu, Jun 6, 2024 at 10:41 AM Dumitru Ceara  wrote:
>>
>> On 5/27/24 23:39, Ilya Maximets wrote:
>>> When a row is modified, python IDL doesn't perform any operations on
>>> existing client-side indexes.  This means that if the column on which
>>> index is created changes, the old value will remain in the index and
>>> the new one will not be added to the index.  Beside lookup failures
>>> this is also causing inability to remove modified rows, because the
>>> new column value doesn't exist in the index causing an exception on
>>> attempt to remove it:
>>>
>>>  Traceback (most recent call last):
>>>File "ovsdbapp/backend/ovs_idl/connection.py", line 110, in run
>>>  self.idl.run()
>>>File "ovs/db/idl.py", line 465, in run
>>>  self.__parse_update(msg.params[2], OVSDB_UPDATE3)
>>>File "ovs/db/idl.py", line 924, in __parse_update
>>>  self.__do_parse_update(update, version, self.tables)
>>>File "ovs/db/idl.py", line 964, in __do_parse_update
>>>  changes = self.__process_update2(table, uuid, row_update)
>>>File "ovs/db/idl.py", line 991, in __process_update2
>>>  del table.rows[uuid]
>>>File "ovs/db/custom_index.py", line 102, in __delitem__
>>>  index.remove(val)
>>>File "ovs/db/custom_index.py", line 66, in remove
>>>  self.values.remove(self.index_entry_from_row(row))
>>>File "sortedcontainers/sortedlist.py", line 2015, in remove
>>>  raise ValueError('{0!r} not in list'.format(value))
>>>  ValueError: Datapath_Binding(
>>>uuid=UUID('498e66a2-70bc-4587-a66f-0433baf82f60'),
>>>tunnel_key=16711683, load_balancers=[], external_ids={}) not in list
>>>
>>> Fix that by always removing an existing row from indexes before
>>> modification and adding back afterwards.  This ensures that old
>>> values are removed from the index and new ones are added.
>>>
>>> This behavior is consistent with the C implementation.
>>>
>>> The new test that reproduces the removal issue is added.  Some extra
>>> testing infrastructure added to be able to handle and print out the
>>> 'indexed' table from the idltest schema.
>>>
>>> Fixes: 13973bc41524 ("Add multi-column index support for the Python IDL")
>>> Reported-at: 
>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-May/053159.html
>>> Reported-by: Roberto Bartzen Acosta 
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>
>> Looks good to me:
>>
>> Acked-by: Dumitru Ceara 
>>
>> Regards,
>> Dumitru
>>
> 
> Looks good to me. I don't like that my code for IndexedRows strongly
> implies that it behaves exactly like a dict, and in this case it
> doesn't. Maybe some comments explaining why a delete has to be done
> for posterity would be helpful.
> 
> Acked-by: Terry Wilson 
> 

Thanks, Terry, Mike and Dumitru for reviews and Roberto and Vladislav
for testing!

I think, we can try to make the interfaces better, this ties a little
with the persistent uuid discussion.  But for now, applied this fix
and backported down to 2.17.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] python: ovsdb-idl: Make IndexedRows mirror hmap.

2024-06-07 Thread Ilya Maximets
On 4/11/24 15:57, Terry Wilson wrote:
> I tried to get this to thread under the "ovsdb-idl: potential issues
> with the persistent UUID implementation" thread, but failed. This is
> one potential solution to that which mirrors the current C IDL
> implementation which seems to work for the most part, but relying on
> the fact that hmaps allow duplicate keys wasn't necessarily intended
> there. Reading that thread is definitely a prerequisite for
> understanding why this patch exists. And it may be better to solve
> this some other way, but it seems difficult to do for the case of
> "inserting a UUID that already exists" w/o creating a race condition
> with multiple IDL clients.

Thanks, Terry!

As discussed off list, we need to think a bit more how to change the API,
so it is less awkward.  For now the current fix makes sense to me.

Applied remaining 2 patches and backported them down to 3.1.

Best regards, Ilya Maximets.

> 
> Terry
> 
> On Wed, Apr 10, 2024 at 4:39 PM Terry Wilson  wrote:
>>
>> The Python IDL code very closely mirrors the C IDL code, which uses
>> an hmap to store table rows. hmap code allows duplicate keys, while
>> IndexedRows, which is derived from DictBase does not.
>>
>> The persistent UUID code can attempt to temporarily add a Row with
>> a duplicate UUID to table.rows, so IndexedRows is modified to
>> behave similarly to the C IDL's hmap implementation.
>>
>> Fixes: 55b9507e6824 ("ovsdb-idl: Add the support to specify the uuid for row 
>> insert.")
>> Signed-off-by: Terry Wilson 
>> ---
>>  python/ovs/db/custom_index.py | 13 ++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/python/ovs/db/custom_index.py b/python/ovs/db/custom_index.py
>> index 587caf5e3..3fa03d3c9 100644
>> --- a/python/ovs/db/custom_index.py
>> +++ b/python/ovs/db/custom_index.py
>> @@ -90,14 +90,21 @@ class IndexedRows(DictBase, object):
>>  index = self.indexes[name] = MultiColumnIndex(name)
>>  return index
>>
>> +def __getitem__(self, key):
>> +return self.data[key][-1]
>> +
>>  def __setitem__(self, key, item):
>> -self.data[key] = item
>> +try:
>> +self.data[key].append(item)
>> +except KeyError:
>> +self.data[key] = [item]
>>  for index in self.indexes.values():
>>  index.add(item)
>>
>>  def __delitem__(self, key):
>> -val = self.data[key]
>> -del self.data[key]
>> +val = self.data[key].pop()
>> +if len(self.data[key]) == 0:
>> +del self.data[key]
>>  for index in self.indexes.values():
>>  index.remove(val)
>>
>> --
>> 2.34.3
>>
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] python: ovs: flow: Fix nested check_pkt_len acts.

2024-06-06 Thread Ilya Maximets
On 6/6/24 17:15, Adrian Moreno wrote:
> Add check_pkt_len action to the decoder list that it, itself, uses.
> 
> This makes nested check_pkt_len (i.e:a check_pkt_len inside another)
> work.
> 
> Signed-off-by: Adrian Moreno 
> ---
>  python/ovs/flow/odp.py   | 43 ++--
>  python/ovs/tests/test_odp.py | 29 
>  2 files changed, 51 insertions(+), 21 deletions(-)

Hi, Adrian.

Could you, please, provide a Fixes tag for this?
No need to send v2 just for this, just reply with it to this thread.
(Tags should start from the beginning of the line for patchwork to
recognize them.)

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] datapath-windows: Fix parsing of split buffers in OvsGetTcpHeader.

2024-06-06 Thread Ilya Maximets
NdisGetDataBuffer() is called without providing a buffer to copy packet
data in case it is not contiguous.  So, it fails in some scenarios
where the packet is handled by the general network stack before OVS
and headers become split in multiple buffers.

Use existing helpers to retrieve the headers instead, they are using
OvsGetPacketBytes() which should be able to handle split data.

It might be a bit slower than getting direct pointers that may be
provided by NdisGetDataBuffer(), but it's better to optimize commonly
used OvsGetPacketBytes() helper in the future instead of optimizing
every single caller separately.  And we're still copying the TCP
header anyway.

Fixes: 9726a016d9d6 ("datapath-windows: Implement locking in conntrack NAT.")
Reported-at: https://github.com/openvswitch/ovs-issues/issues/323
Signed-off-by: Ilya Maximets 
---

WARNING: I beleive this code is correct, but I did not test it with real
 traffic, I only verified that it compiles.  Should not be applied
 unless someone tests it in an actual Windows setup.

 datapath-windows/ovsext/Conntrack.c | 45 ++---
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index 39ba5cc10..4649805dd 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -678,46 +678,43 @@ OvsGetTcpHeader(PNET_BUFFER_LIST nbl,
 VOID *storage,
 UINT32 *tcpPayloadLen)
 {
-IPHdr *ipHdr;
-IPv6Hdr *ipv6Hdr;
-TCPHdr *tcp;
+IPv6Hdr ipv6HdrStorage;
+IPHdr ipHdrStorage;
+const IPv6Hdr *ipv6Hdr;
+const IPHdr *ipHdr;
+const TCPHdr *tcp;
 VOID *dest = storage;
 uint16_t ipv6ExtLength = 0;
 
 if (layers->isIPv6) {
-ipv6Hdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
-layers->l4Offset + sizeof(TCPHdr),
-NULL, 1, 0);
+ipv6Hdr = OvsGetPacketBytes(nbl, sizeof *ipv6Hdr,
+layers->l3Offset, );
 if (ipv6Hdr == NULL) {
 return NULL;
 }
 
-tcp = (TCPHdr *)((PCHAR)ipv6Hdr + layers->l4Offset);
-ipv6Hdr = (IPv6Hdr *)((PCHAR)ipv6Hdr + layers->l3Offset);
-if (tcp->doff * 4 >= sizeof *tcp) {
-NdisMoveMemory(dest, tcp, sizeof(TCPHdr));
-ipv6ExtLength = layers->l4Offset - layers->l3Offset - 
sizeof(IPv6Hdr);
-*tcpPayloadLen = (ntohs(ipv6Hdr->payload_len) - ipv6ExtLength - 
TCP_HDR_LEN(tcp));
-return storage;
+tcp = OvsGetTcp(nbl, layers->l4Offset, dest);
+if (tcp == NULL) {
+return NULL;
 }
+
+ipv6ExtLength = layers->l4Offset - layers->l3Offset - sizeof(IPv6Hdr);
+*tcpPayloadLen = (ntohs(ipv6Hdr->payload_len) - ipv6ExtLength - 
TCP_HDR_LEN(tcp));
 } else {
-ipHdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
-  layers->l4Offset + sizeof(TCPHdr),
-  NULL, 1 /*no align*/, 0);
+ipHdr = OvsGetIp(nbl, layers->l3Offset, );
 if (ipHdr == NULL) {
 return NULL;
 }
 
-ipHdr = (IPHdr *)((PCHAR)ipHdr + layers->l3Offset);
-tcp = (TCPHdr *)((PCHAR)ipHdr + ipHdr->ihl * 4);
-
-if (tcp->doff * 4 >= sizeof *tcp) {
-NdisMoveMemory(dest, tcp, sizeof(TCPHdr));
-*tcpPayloadLen = TCP_DATA_LENGTH(ipHdr, tcp);
-return storage;
+tcp = OvsGetTcp(nbl, layers->l4Offset, dest);
+if (tcp == NULL) {
+return NULL;
 }
+
+*tcpPayloadLen = TCP_DATA_LENGTH(ipHdr, tcp);
 }
-return NULL;
+
+return storage;
 }
 
 static UINT8
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/3] nsh: Add support to compose-packet and use it in system tests.

2024-06-05 Thread Ilya Maximets
On 6/3/24 17:09, Simon Horman wrote:
> On Fri, May 31, 2024 at 11:45:12PM +0200, Ilya Maximets wrote:
>> OVS can parse NSH, but can't compose.  Fix that and get rid of plain
>> hex NSH packets in system tests as they are hard to read or modify.
>>
>> Tcpdump calls modified to write actual pcaps instead of text output,
>> so ovs-pcap can be used while checking the results.
>>
>> While at it, replacing sleeps with more robust waiting for tcpdump
>> to start listening.
> 
> I might have separated a) adding NSH compose support

The tests in the patch provide some test coverage for this as well,
so I didn't want to split for that reason.

> b) more robust tcpdump waiting,

We still need to change '>' to '-w' in very same lines, so I though
it's better to not touch the same code twice and just combine the
change, since it's fairly minor.

> and c) using text rather than hex,
> into 3 patches. But I don't feel strongly about it.

For now, I kept the patches as they are for reasons stated above.
Mostly to avoid touching the same code multiple times.

> 
>> M4 macros are better than shell variables, because we can see the
>> substitution result in the test log.  So, using m4_define and m4_join
>> extensively.
>>
>> Signed-off-by: Ilya Maximets 
> 
> Acked-by: Simon Horman 
> 

Thanks, Simon and Eelco!  I applied the set and backported to all
branches, so we can backport tests for future fixes without trouble
of re-working them for each branch.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 6/6] netdev-dpdk: Refactor tunnel checksum offloading.

2024-06-05 Thread Ilya Maximets
On 5/31/24 16:10, Kevin Traynor wrote:
> On 30/05/2024 14:10, David Marchand wrote:
>> All informations required for checksum offloading can be deducted by
> 
> nit:  "All information required for checksum offloading can be deduced
> by" can update on applying, assuming no more revs are needed.
> 
>> already tracked dp_packet l3_ofs, l4_ofs, inner_l3_ofs and inner_l4_ofs
>> fields.
>> Remove DPDK specific l[2-4]_len from generic OVS code.
>>
>> netdev-dpdk code then fills mbuf specifics step by step:
>> - outer_l2_len and outer_l3_len are needed for tunneling (and below
>>   features),
>> - l2_len and l3_len are needed for IP and L4 checksum (and below features),
>> - l4_len and tso_segsz are needed when doing TSO,
>>
>> Signed-off-by: David Marchand 
>> ---
>>  lib/dp-packet.h | 37 --
>>  lib/netdev-dpdk.c   | 35 ++---
>>  lib/netdev-native-tnl.c | 50 +
>>  3 files changed, 27 insertions(+), 95 deletions(-)
> 
> Acked-by: Kevin Traynor 

Thanks, David and Kevin!

I generally like the direction of this patch set, especially the
cleanup of the generic tunnel code.

I didn't test it with a real hardware nor I re-checked the math,
so will not Ack it, but it looks good to me otherwise, and I think
we should backport the whole thing to at least branch-3.3 as well.

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/2] ipf: Handle common case of ipf defragmentation.

2024-06-05 Thread Ilya Maximets
On 6/5/24 16:54, Aaron Conole wrote:
> Mike Pattrick  writes:
> 
>> When conntrack is reassembling packet fragments, the same reassembly
>> context can be shared across multiple threads handling different packets
>> simultaneously. Once a full packet is assembled, it is added to a packet
>> batch for processing, in the case where there are multiple different pmd
>> threads accessing conntrack simultaneously, there is a race condition
>> where the reassembled packet may be added to an arbitrary batch even if
>> the current batch is available.
>>
>> When this happens, the packet may be handled incorrectly as it is
>> inserted into a random openflow execution pipeline, instead of the
>> pipeline for that packets flow.
>>
>> This change makes a best effort attempt to try to add the defragmented
>> packet to the current batch. directly. This should succeed most of the
>> time.
>>
>> Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
>> Reported-at: https://issues.redhat.com/browse/FDP-560
>> Signed-off-by: Mike Pattrick 
>> ---
> 
> The patch overall looks good to me.  I'm considering applying with the
> following addition:
> 
>   diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
>   index 6b293770dd..d9b9e0c23f 100755
>   --- a/utilities/checkpatch.py
>   +++ b/utilities/checkpatch.py
>   @@ -63,7 +63,8 @@ def open_spell_check_dict():
>  'dhcpv6', 'opts', 'metadata', 'geneve', 'mutex',
>  'netdev', 'netdevs', 'subtable', 'virtio', 'qos',
>  'policer', 'datapath', 'tunctl', 'attr', 
> 'ethernet',
>   -  'ether', 'defrag', 'defragment', 'loopback', 
> 'sflow',
>   +  'ether', 'defrag', 'defragment', 'defragmented',
>   +  'loopback', 'sflow',
>  'acl', 'initializer', 'recirc', 'xlated', 
> 'unclosed',
>  'netlink', 'msec', 'usec', 'nsec', 'ms', 'us', 
> 'ns',
>  'kilobits', 'kbps', 'kilobytes', 'megabytes', 
> 'mbps',
> 
> 
> unless anyone objects.  This is to squelch:
> 
> == Checking 16f6885353c2 ("ipf: Handle common case of ipf defragmentation.") 
> ==
> WARNING: Possible misspelled word: "defragmented"
> Did you mean:  ['defragment ed', 'defragment-ed', 'defragment']
> Lines checked: 129, Warnings: 1, Errors: 0

It doesn't affect CI today, so can be a separate patch, I think.  We have a few 
more
words like this in relatively recent commits, like 'poller' or 'autovalidator', 
these
can be bundled in that separate commit as well.

Though updating the dictionary along with the patch that is using the word 
sounds OK
to me as well.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] checkpatch: Don't warn on pointer to pointer.

2024-06-05 Thread Ilya Maximets
On 6/5/24 10:18, Adrian Moreno wrote:
> Current regexp used to check whitespaces around operators does not
> consider that there can be more than one "*" together to express pointer
> to pointer. As a result, false positive warnings are raised [1].
> 
> Fix the regexp to allow more than one consecutive "+" characters.
> 
> Signed-off-by: Adrian Moreno 
> 
> [1] Example of patch triggering false positives:
>> cat < diff --git a/test.c b/test.c
> --- a/test.c
> +++ b/test.c
> @@ -1 +1,2 @@
> +static void foo(struct oftable ***list, char **errorp);
> EOF

Hi, Adrian.  Please, don't include diffs like that in the commit
message.  It is not possible to apply the patch formatted like this.

If you need to add a diff - indent it and also better prefix each
line with something, e.g. ' | diff --git a/test.c b/test.c'.

> 
> WARNING: Line lacks whitespace around operator
> static void foo(struct oftable ***list, char **errorp);
> 
> Lines checked: 6, Warnings: 1, Errors: 0
> ---
>  utilities/checkpatch.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
> index 6b293770d..891b24dcf 100755
> --- a/utilities/checkpatch.py
> +++ b/utilities/checkpatch.py
> @@ -739,7 +739,7 @@ infix_operators = \
>  '&=', '^=', '|=', '<<=', '>>=']] \
>  + [r'[^<" ]<[^=" ]',
> r'[^\->" ]>[^=" ]',
> -   r'[^ !()/"]\*[^/]',
> +   r'[^ !()/"\*]\*[^/]',
> r'[^ !&()"]&',
> r'[^" +(]\+[^"+;]',
> r'[^" \-(]\-[^"\->;]',

Please, add a unit test for this issue.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-tc-offloads: Don't offload header modification on ip fragments.

2024-06-04 Thread Ilya Maximets
On 6/4/24 13:42, Eelco Chaudron wrote:
> 
> 
> On 1 Jun 2024, at 0:08, Ilya Maximets wrote:
> 
>> On 5/7/24 15:52, Eelco Chaudron wrote:
>>> While offloading header modifications to TC, OVS is using {TCA_PEDIT} +
>>> {TCA_CSUM} combination as that it the only way to represent header
>>> rewrite.  However, {TCA_CSUM} is unable to calculate L4 checksums for
>>> IP fragments.
>>>
>>> Since TC already applies fragmentation bit masking, this patch simply
>>> needs to prevent these packets from being processed through TC.
>>>
>>> Signed-off-by: Eelco Chaudron 
>>> ---
>>> v2: - Fixed and added some comments.
>>> - Use ovs-pcap to compare packets.
>>>
>>> NOTE: This patch needs an AVX512 fix before it can be applied.
>>>   Intel is working on this.
>>> ---
>>>  lib/netdev-offload-tc.c | 32 ++
>>>  lib/tc.c|  5 ++-
>>>  tests/system-traffic.at | 93 +
>>>  3 files changed, 129 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>>> index 921d52317..bdd307933 100644
>>> --- a/lib/netdev-offload-tc.c
>>> +++ b/lib/netdev-offload-tc.c
>>> @@ -1488,6 +1488,30 @@ parse_put_flow_ct_action(struct tc_flower *flower,
>>>  return 0;
>>>  }
>>>
>>> +static bool
>>> +will_tc_add_l4_checksum(struct tc_flower *flower, int type)
>>> +{
>>> +/* This function returns true if the tc layer will add a l4 checksum 
>>> action
>>> + * for this set action.  Refer to the csum_update_flag() function for
>>> + * detailed logic.  Note that even the kernel only supports updating 
>>> TCP,
>>> + * UDP and ICMPv6. */
>>
>> This comment should be outside of the function, I think.  It's strange
>> to have it here.  I see csum_update_flag() has a comment inside, but
>> that's strange as well.  That function has other style issues as well,
>> there is no need to copy them.
> 
> ACK, will clean this up in the next rev.
> 
>>> +switch (type) {
>>> +case OVS_KEY_ATTR_IPV4:
>>> +case OVS_KEY_ATTR_IPV6:
>>> +case OVS_KEY_ATTR_TCP:
>>> +case OVS_KEY_ATTR_UDP:
>>> +switch (flower->key.ip_proto) {
>>> +case IPPROTO_TCP:
>>> +case IPPROTO_UDP:
>>> +case IPPROTO_ICMPV6:
>>> +case IPPROTO_UDPLITE:
>>> +return true;
>>> +}
>>> +break;
>>> +}
>>> +return false;
>>> +}
>>> +
>>>  static int
>>>  parse_put_flow_set_masked_action(struct tc_flower *flower,
>>>   struct tc_action *action,
>>> @@ -1520,6 +1544,14 @@ parse_put_flow_set_masked_action(struct tc_flower 
>>> *flower,
>>>  return EOPNOTSUPP;
>>>  }
>>>
>>> +if (flower->key.flags & TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT
>>
>> We're using this field to make an offloading decision.  We must
>> also set in the mask.  If for some reason we're not matching on
>> the fragment bits, we may first receive a non-fragmented packet
>> and offload it, then fragmented traffic may match and fail the
>> checksumming.  So, we need to enable the bit in the mask.
> 
> The dp always matches on the fragment bit for IPv4 packets, I did some tests 
> with this.
> So if we sent a non-fragment packet first the dp rule will match on fragment 
> = 0. Or
> did I miss something?

It is true today, but nothing ensures that on the netdev-offload-tc level.
Moreover, the netdev-offload-tc is written in a way that it expects the
frag bits to potentially not be in the mask:
  
https://github.com/openvswitch/ovs/blob/1d681ffe3b208a0db4945b6389142ab18404a4d1/lib/netdev-offload-tc.c#L2432-L2448
So, it is going to be internally inconsistent if we do not set the bit
explicitly.

And if someday we'll stop adding the frag bit, we'll never know if we
forget to set it in netdev-offload-tc.  At the very least we'll need an
assertion that it is set.  But having an assertion will still be internally
inconsistent with the code linked above.  So, it may be better to just fix
it instead anyway.

> 
>>> +&& will_tc_add_l4_checksum(flower, type)) {
>>> +VLOG_DBG_RL(, "set action type %d not supported on fragments "
>>> +"due to checksum limitation", type);
>>> +ofpbuf_uninit(_bu

Re: [ovs-dev] [PATCH] socket: Increase listen backlog to 128 everywhere.

2024-06-04 Thread Ilya Maximets
On 6/4/24 11:54, Alin Serdean wrote:
> Does it make sense to make this configurable via automake in order to avoid
> future patches if we need to bump the value further?

I'm still not convinced this is an issue worth fixing.

The original discussion is here:
  https://mail.openvswitch.org/pipermail/ovs-discuss/2024-April/053058.html

It's not clear if these reconnections actually cause any harm or are
even helpful in getting to connect to a less busy server in a general
case.

The original number 10 was clearly way too low for any reasonable workload.
But even with that we lived for a very long time with very large clusters
without any issues.

The main problem in the original thread was that a lot of neutron clients
are having leader-only connections to the database for seemingly no reason.
That results in unnecessary mass re-connection on leadership change.
So, I'd prefer this fixed in OpenStack instead.

Best regards, Ilya Maximets.

> 
> On Tue, Jun 4, 2024 at 11:05 AM Simon Horman  wrote:
> 
>> + Ihar
>>
>> On Fri, May 31, 2024 at 03:40:08PM -0400, Brian Haley wrote:
>>> An earlier patch [1] increased the size of the listen
>>> backlog to 64. While that was a huge improvement over
>>> 10, further testing in large deployments showed 128
>>> was even better.
>>
>> nit: I would slightly prefer if a commit was referenced like this:
>>
>>   commit 2b7efee031c3 ("socket: Increase listen backlog to 64 everywhere.")
>>
>>> Looking at 'ss -lmt' output over more than one week for
>>> port 6641, captured across three different controllers,
>>> the average was:
>>>
>>> listen(s, 10) : 1213 drops/day
>>> listen(s, 64) : 791 drops/day
>>> listen(s, 128): 657 drops/day
>>>
>>> Looking at 'netstat -s | egrep -i 'listen|drop|SYN_RECV''
>>> output over one week for port 6641, again captured across
>>> three different controllers, the average was:
>>>
>>> listen(s, 10) : 741 drops/day
>>> listen(s, 64) : 200 drops/day
>>> listen(s, 128): 22 drops/day
>>>
>>> While having this value configurable would be the
>>> best solution, changing to 128 is a quick fix that
>>> should be good for all deployments. A link to the
>>> original discussion is at [2].
>>>
>>> [1]
>> https://github.com/openvswitch/ovs/commit/2b7efee031c3a2205ad2ee999275893edd083c1c
>>> [2]
>> https://github.com/openvswitch/ovs/commit/2b7efee031c3a2205ad2ee999275893edd083c1c
>>
>> nit: These two references are the same?
>>
>>> Signed-off-by: Brian Haley 
>>
>> I'd value input on this from Ihar (CCed) who worked on the cited commit.
>>
>>> ---
>>>  lib/socket-util.c| 2 +-
>>>  python/ovs/stream.py | 2 +-
>>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/socket-util.c b/lib/socket-util.c
>>> index c569b7d16..552310266 100644
>>> --- a/lib/socket-util.c
>>> +++ b/lib/socket-util.c
>>> @@ -769,7 +769,7 @@ inet_open_passive(int style, const char *target, int
>> default_port,
>>>  }
>>>
>>>  /* Listen. */
>>> -if (style == SOCK_STREAM && listen(fd, 64) < 0) {
>>> +if (style == SOCK_STREAM && listen(fd, 128) < 0) {
>>>  error = sock_errno();
>>>  VLOG_ERR("%s: listen: %s", target, sock_strerror(error));
>>>  goto error;
>>> diff --git a/python/ovs/stream.py b/python/ovs/stream.py
>>> index dbb6b2e1f..874fe0bd5 100644
>>> --- a/python/ovs/stream.py
>>> +++ b/python/ovs/stream.py
>>> @@ -620,7 +620,7 @@ class PassiveStream(object):
>>>  raise Exception('Unknown connection string')
>>>
>>>  try:
>>> -sock.listen(64)
>>> +sock.listen(128)
>>>  except socket.error as e:
>>>  vlog.err("%s: listen: %s" % (name, os.strerror(e.error)))
>>>  sock.close()
>>> --
>>> 2.34.1
>>>
>>> ___
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/2] python: ovsdb-idl: Use monitor_cond for _Server DB.

2024-06-03 Thread Ilya Maximets
On 5/6/24 18:58, Terry Wilson wrote:
> Unlike the C IDL code, the Python version still monitors the
> _Server DB with "monitor" instead of "monitor_cond". This results
> in receiving an entire Database row every time the "index" value
> is updated which includes the 'schema' column. Using "monitor_cond"
> will result in "update2" notifications which just include the
> changed "index" value.
> 
> Unlike the C IDL, the Python IDL requires a SchemaHelper object
> to instanitate the IDL, leaving it to the user of the library to
> call "get_schema" themselves. Since the Python IDL did not have
> support for retrieving the schema automatically and did not have
> a state for doing so, instead of transitioning on an error response
> from retrieving the _Server schema to requesting the "data" schema,
> this moves directly to monitoring the "data" DB.
> 
> Signed-off-by: Terry Wilson 
> ---
>  python/ovs/db/idl.py | 28 ++--
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 

Thanks, Terry!  Sorry for a long delay.

I'm considering this as a bug fix, because the use of a plain monitor
causes a significant traffic amplification with every small transactions
and can potentially create serious issues in large scale clusters.

Applied to main.  Backporting to 2.17 is probably too much, so only ported
to 3.3 for now to avoid this issue on a future LTS branch.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovsdb: Use table indexes if available for ovsdb_query().

2024-06-03 Thread Ilya Maximets
On 6/3/24 06:20, Mike Pattrick wrote:
> Currently all OVSDB database queries except for UUID lookups all result
> in linear lookups over the entire table, even if an index is present.
> 
> This patch modifies ovsdb_query() to attempt an index lookup first, if
> possible. If no matching indexes are present then a linear index is
> still conducted.
> 
> Reported-at: https://issues.redhat.com/browse/FDP-590
> Signed-off-by: Mike Pattrick 
> ---
>  NEWS |   3 ++
>  ovsdb/query.c| 102 +++
>  ovsdb/row.h  |  28 +++
>  ovsdb/transaction.c  |  27 ---
>  tests/ovsdb-execution.at |  34 -
>  tests/ovsdb-server.at|   2 +-
>  tests/ovsdb-tool.at  |   2 +-
>  7 files changed, 159 insertions(+), 39 deletions(-)

Hi, Mike.  Thanks for the patch.

Besides what Simon asked, the patch has a few other issues:

1. Lookup is performed only on the committed index and it doesn't include
   rows that are in-flight in the current transaction.

   Unlike rows in a hash table, indexes are updated only after the whole
   transaction is committed.  With this change we'll not be able to find
   newly added rows.

   Another thing related to this is that it is allowed to have duplicates
   within a transaction as long as they are removed before the transaction
   ends.  So it is possible that multiple rows will satisfy the condition
   on indexed columns while the transaction is in-flight.

   Consider the following commands executed in a sandbox:

   # ovs-vsctl set-manager "tcp:my-first-target"
   # ovsdb-client transact unix:$(pwd)/sandbox/db.sock '
   ["Open_vSwitch",
{"op": "select",
 "table": "Manager",
 "columns": ["_uuid", "target"],
 "where": [["target", "==", "tcp:my-first-target"]]},
{"op": "insert", 
 "table": "Manager",
 "uuid-name": "duplicate",
 "row": {"target": "tcp:my-first-target"}},
{"op": "select",
 "table": "Manager",
 "columns": ["_uuid", "target"],
 "where": [["target", "==", "tcp:my-first-target"]]},
{"op": "delete",
 "table": "Manager",
 "where":[["_uuid","==",["named-uuid","duplicate"]]]},
{"op": "select",
 "table": "Manager",
 "columns": ["_uuid", "target"],
 "where": [["target", "==", "tcp:my-first-target"]]}]'

   Transaction must succeed.  The first selection should return 1 row,
   the second should return both duplicates and the third should again
   return one row.

   Ideally, implementation should not leak the transaction details to
   the query module, though I'm not sure if that is 100% achievable.

2. Taking above case into account, this change needs way more unit tests
   with different order of operations and complex data updates.

3. Since this is a performance-oriented change, please, include some
   performance numbers in the commit message as well, including impact
   on non-indexed lookups, if any.

4. There seems to be a lot of logic overlap with existing functions like
   ovsdb_condition_match_every_clause(), ovsdb_index_search() and
   ovsdb_row_hash_columns().  Can we re-use those instead?  For example,
   by creating a row from the conditions before the lookup?  What a
   performance impact will look like?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Fix non-portable plus match in python vlog test.

2024-06-03 Thread Ilya Maximets
On 6/3/24 13:14, Eelco Chaudron wrote:
> 
> 
> On 3 Jun 2024, at 13:12, Ilya Maximets wrote:
> 
>> '\+' as a one-or-more match is a GNU extension and it doesn't work
>> in BSD sed.  This makes the python vlog test to fail on FreeBSD 14
>> that recently got python 3.11 in CirrusCI:
>>
>>  |  --- -   2024-06-03 10:42:26.363566000 +
>>  |  +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/2541/stdout
>>  |  @@ -7,31 +7,37 @@
>>  |   Traceback (most recent call last):
>>  | File , line , in main
>>  |   assert fail
>>  |  +   
>>
>> Remove the '\+' match to make the line removal work.  It doesn't do
>> much for us as we would remove the same lines either way.
>>
>> This change makes CirruCI green again.
>>
>> Fixes: 9185793e7543 ("tests: Fix compatibility issue with Python 3.13 in 
>> vlog.at.")
>> Signed-off-by: Ilya Maximets 
> 
> The change looks good to me (I did not test it).
> 
> Acked-by: Eelco Chaudron 

Thanks!  I retested this and applied to all branches down to 2.17.

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] tests: Fix non-portable plus match in python vlog test.

2024-06-03 Thread Ilya Maximets
'\+' as a one-or-more match is a GNU extension and it doesn't work
in BSD sed.  This makes the python vlog test to fail on FreeBSD 14
that recently got python 3.11 in CirrusCI:

 |  --- -   2024-06-03 10:42:26.363566000 +
 |  +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/2541/stdout
 |  @@ -7,31 +7,37 @@
 |   Traceback (most recent call last):
 | File , line , in main
 |   assert fail
 |  +   

Remove the '\+' match to make the line removal work.  It doesn't do
much for us as we would remove the same lines either way.

This change makes CirruCI green again.

Fixes: 9185793e7543 ("tests: Fix compatibility issue with Python 3.13 in 
vlog.at.")
Signed-off-by: Ilya Maximets 
---
 tests/vlog.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/vlog.at b/tests/vlog.at
index efe91479a..2768c0740 100644
--- a/tests/vlog.at
+++ b/tests/vlog.at
@@ -8,7 +8,7 @@ AT_CHECK([$PYTHON3 $srcdir/test-vlog.py --log-file log_file \
 
 AT_CHECK([sed -e 's/.*-.*-.*T..:..:..Z |//' \
 -e 's/File ".*", line [[0-9]][[0-9]]*,/File , line ,/' \
--e '/\^\+/d' \
+-e '/\^/d' \
 stderr_log], [0], [dnl
   0  | module_0 | EMER | emergency
   1  | module_0 | ERR | error
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-tc-offloads: Don't offload header modification on ip fragments.

2024-05-31 Thread Ilya Maximets
s=mod_nw_src=11.1.1.1,ovs-p1
> +  in_port=ovs-p0,ipv6,ipv6_src=fc00::1 
> actions=set_field:fc00::100->ipv6_src,ovs-p1
> +])
> +
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl -Oopenflow13 add-flows br0 flows.txt])
> +
> +NETNS_DAEMONIZE([at_ns1],
> +[tcpdump -l -nn -xx -U -i p1 -w p1.pcap 2> tcpdump.err],
> +[tcpdump.pid])
> +OVS_WAIT_UNTIL([grep "listening" tcpdump.err])
> +
> +dnl IPv4 Packet content:
> +dnl   Ethernet II, Src: 36:b1:ee:7c:01:03, Dst: 36:b1:ee:7c:01:02
> +dnl   Type: IPv4 (0x0800)
> +dnl   Internet Protocol Version 4, Src: 10.1.1.1, Dst: 10.1.1.2
> +dnl   0100  = Version: 4
> +dnl    0101 = Header Length: 20 bytes (5)
> +dnl   Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
> +dnl   Total Length: 38
> +dnl   Identification: 0x0001 (1)
> +dnl   001.  = Flags: 0x1, More fragments
> +dnl   0...  = Reserved bit: Not set
> +dnl   .0..  = Don't fragment: Not set
> +dnl   ..1.  = More fragments: Set
> +dnl   ...0    = Fragment Offset: 0
> +dnl   Time to Live: 64
> +dnl   Protocol: UDP (17)
> +dnl   Header Checksum: 0x44c2
> +dnl   Data (18 bytes)
> +eth="36 b1 ee 7c 01 02 36 b1 ee 7c 01 03 08 00"
> +ip="45 00 00 26 00 01 20 00 40 11 44 c2 0a 01 01 01 0a 01 01 02"
> +data="0b c4 08 84 00 26 e9 64 01 02 03 04 05 06 07 08 09 0a"
> +packet="${eth} ${ip} ${data}"

Since you're backporting the compose-packet functionality now, it's better
if we use it here instead.  We may want to get my sendpkt patch first:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/20240531214635.2084937-2-i.maxim...@ovn.org/

And then follow a pattern from the other patches from the same set:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/20240531214635.2084937-3-i.maxim...@ovn.org/

What do you think?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 3/3] nsh: Add support to compose-packet and use it in system tests.

2024-05-31 Thread Ilya Maximets
OVS can parse NSH, but can't compose.  Fix that and get rid of plain
hex NSH packets in system tests as they are hard to read or modify.

Tcpdump calls modified to write actual pcaps instead of text output,
so ovs-pcap can be used while checking the results.

While at it, replacing sleeps with more robust waiting for tcpdump
to start listening.

M4 macros are better than shell variables, because we can see the
substitution result in the test log.  So, using m4_define and m4_join
extensively.

Signed-off-by: Ilya Maximets 
---
 lib/flow.c  |  18 
 tests/system-traffic.at | 177 ++--
 2 files changed, 134 insertions(+), 61 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index 8e3402388..dc5fb328d 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -3420,6 +3420,24 @@ flow_compose(struct dp_packet *p, const struct flow 
*flow,
 arp->ar_sha = flow->arp_sha;
 arp->ar_tha = flow->arp_tha;
 }
+} else if (flow->dl_type == htons(ETH_TYPE_NSH)) {
+struct nsh_hdr *nsh;
+
+nsh = dp_packet_put_zeros(p, sizeof *nsh);
+dp_packet_set_l3(p, nsh);
+
+nsh_set_flags_ttl_len(nsh, flow->nsh.flags, flow->nsh.ttl,
+  flow->nsh.mdtype == NSH_M_TYPE1
+  ? NSH_M_TYPE1_LEN : NSH_BASE_HDR_LEN);
+nsh->next_proto = flow->nsh.np;
+nsh->md_type = flow->nsh.mdtype;
+put_16aligned_be32(>path_hdr, flow->nsh.path_hdr);
+
+if (flow->nsh.mdtype == NSH_M_TYPE1) {
+for (size_t i = 0; i < 4; i++) {
+put_16aligned_be32(>md1.context[i], flow->nsh.context[i]);
+}
+}
 }
 
 if (eth_type_mpls(flow->dl_type)) {
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index c4cebb0a3..3f1a15445 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -8920,21 +8920,29 @@ dnl The flow will encap a nsh header to the TCP syn 
packet
 dnl eth/ip/tcp --> OVS --> eth/nsh/eth/ip/tcp
 AT_CHECK([ovs-ofctl -Oopenflow13 add-flow br0 
"table=0,priority=100,in_port=ovs-p0,ip,actions=encap(nsh(md_type=1)),set_field:0x1234->nsh_spi,set_field:0x11223344->nsh_c1,encap(ethernet),set_field:f2:ff:00:00:00:02->dl_dst,set_field:f2:ff:00:00:00:01->dl_src,ovs-p1"])
 
-NETNS_DAEMONIZE([at_ns1], [tcpdump -l -n -xx -U -i p1 > p1.pcap], 
[tcpdump.pid])
-sleep 1
+NETNS_DAEMONIZE([at_ns1],
+  [tcpdump -l -n -xx -U -i p1 -w p1.pcap 2>tcpdump_err], [tcpdump.pid])
+OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
 
-dnl The hex dump is a TCP syn packet. pkt=eth/ip/tcp
-dnl The packet is sent from p0(at_ns0) interface directed to
-dnl p1(at_ns1) interface
-NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 f2 00 00 00 00 02 f2 
00 00 00 00 01 08 00 45 00 00 28 00 01 00 00 40 06 b0 13 c0 a8 00 0a 0a 00 00 
0a 04 00 08 00 00 00 00 c8 00 00 00 00 50 02 20 00 b8 5e 00 00 > /dev/null])
+m4_define([TCP_SYN_PKT], [m4_join([,],
+  [eth_src=f2:00:00:00:00:01,eth_dst=f2:00:00:00:00:02,eth_type=0x0800],
+  [nw_src=192.168.0.10,nw_dst=10.0.0.10],
+  [nw_proto=6,nw_ttl=64,nw_frag=no],
+  [tcp_src=1024,tcp_dst=2048,tcp_flags=syn])])
 
-dnl Check the expected nsh encapsulated packet on the egress interface
-OVS_WAIT_UNTIL([cat p1.pcap | grep -E "0x: *f2ff * *0002 *f2ff * 
*0001 *894f *0fc6" 2>&1 1>/dev/null])
-OVS_WAIT_UNTIL([cat p1.pcap | grep -E "0x0010: *0103 *0012 *34ff *1122 *3344 
* * *" 2>&1 1>/dev/null])
-OVS_WAIT_UNTIL([cat p1.pcap | grep -E "0x0020: * * * *f200 * 
*0002 *f200 *" 2>&1 1>/dev/null])
-OVS_WAIT_UNTIL([cat p1.pcap | grep -E "0x0030: *0001 *0800 *4500 *0028 *0001 
* *4006 *b013" 2>&1 1>/dev/null])
-OVS_WAIT_UNTIL([cat p1.pcap | grep -E "0x0040: *c0a8 *000a *0a00 *000a *0400 
*0800 * *00c8" 2>&1 1>/dev/null])
-OVS_WAIT_UNTIL([cat p1.pcap | grep -E "0x0050: * * *5002 *2000 *b85e 
*" 2>&1 1>/dev/null])
+dnl Send the TCP SYN packet from p0(at_ns0) interface directed to
+dnl p1(at_ns1) interface.
+NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 \
+$(ovs-ofctl compose-packet --bare 'TCP_SYN_PKT')], [0], [ignore])
+
+m4_define([NSH_HEADER], [m4_join([,],
+  [eth_src=f2:ff:00:00:00:01,eth_dst=f2:ff:00:00:00:02,eth_type=0x894f],
+  [nsh_ttl=63,nsh_np=3,nsh_spi=0x1234,nsh_si=255],
+  [nsh_mdtype=1,nsh_c1=0x11223344])])
+
+OVS_WAIT_UNTIL([ovs-pcap p1.pcap | grep -q "m4_join([], [^],
+$(ovs-ofctl compose-packet --bare 'NSH_HEADER'),
+$(ovs-ofctl compose-packet --bare 'TCP_SYN_PKT'), [\$])"])
 
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
@@ -8952,19 +8960,31 @@ dnl The flow will decap a nsh header which in turn 
carries a TCP syn packet
 dnl eth/nsh/eth/ip/tcp --> OVS --> eth/ip/tcp
 AT_CHECK([ovs-

[ovs-dev] [PATCH 2/3] tests: Convert ND, MPLS and CT sendpkt tests to compose-packet.

2024-05-31 Thread Ilya Maximets
These tests contain plain hex dumps that are hard to read and modify.
Replace with equivalent calls to ovs-ofctl compose-packet --bare and
ovs-pcap.

Tcpdump calls modified to write actual pcaps instead of text output,
so ovs-pcap can be used while checking the results.

While at it, replacing sleeps with more robust waiting for tcpdump
to start listening.

M4 macros are better than shell variables, because we can see the
substitution result in the test log.  So, using m4_define and m4_join
extensively.

Signed-off-by: Ilya Maximets 
---
 tests/system-traffic.at | 233 ++--
 1 file changed, 152 insertions(+), 81 deletions(-)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index bd7647cbe..c4cebb0a3 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2390,11 +2390,22 @@ table=20 actions=drop
 AT_CHECK([ovs-ofctl del-flows br0])
 AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
 
+m4_define([ND_NS_PKT], [m4_join([,],
+  [eth_src=36:b1:ee:7c:01:03,eth_dst=36:b1:ee:7c:01:02,eth_type=0x86dd],
+  [ipv6_src=fe80::f816:3eff:fe04:6604,ipv6_dst=fe80::f816:3eff:fea7:dd0e],
+  [nw_proto=58,nw_ttl=255,nw_frag=no],
+  [icmpv6_type=136,icmpv6_code=0],
+  [nd_options_type=2,nd_tll=36:b1:ee:7c:01:03])])
+
 dnl Send a mismatching neighbor discovery.
-NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 36 b1 ee 7c 01 02 36 
b1 ee 7c 01 03 86 dd 60 00 00 00 00 20 3a ff fe 80 00 00 00 00 00 00 f8 16 3e 
ff fe 04 66 04 fe 80 00 00 00 00 00 00 f8 16 3e ff fe a7 dd 0e 88 00 f1 f2 20 
00 00 00 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 01 36 b1 ee 7c 01 
03 > /dev/null])
+NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 \
+$(ovs-ofctl compose-packet --bare 'ND_NS_PKT,nd_target=3000::1')],
+  [0], [ignore])
 
 dnl Send a matching neighbor discovery.
-NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 36 b1 ee 7c 01 02 36 
b1 ee 7c 01 03 86 dd 60 00 00 00 00 20 3a ff fe 80 00 00 00 00 00 00 f8 16 3e 
ff fe 04 66 04 fe 80 00 00 00 00 00 00 f8 16 3e ff fe a7 dd 0e 88 00 fe 5f 20 
00 00 00 20 01 00 00 00 00 00 00 00 00 00 01 00 00 03 92 02 01 36 b1 ee 7c 01 
03 > /dev/null])
+NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 \
+$(ovs-ofctl compose-packet --bare 'ND_NS_PKT,nd_target=2001::1:0:392')],
+  [0], [ignore])
 
 AT_CHECK([ovs-appctl dpctl/dump-flows | strip_stats | strip_used | dnl
   strip_key32 | strip_ptype | strip_eth | strip_recirc | dnl
@@ -2406,10 +2417,14 @@ 
recirc_id(),in_port(2),eth_type(0x86dd),ipv6(proto=58,frag=no),icmpv6(ty
 OVS_WAIT_UNTIL([ovs-appctl dpctl/dump-flows | grep ",nd" | wc -l | grep -E ^0])
 
 dnl Send a matching neighbor discovery.
-NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 36 b1 ee 7c 01 02 36 
b1 ee 7c 01 03 86 dd 60 00 00 00 00 20 3a ff fe 80 00 00 00 00 00 00 f8 16 3e 
ff fe 04 66 04 fe 80 00 00 00 00 00 00 f8 16 3e ff fe a7 dd 0e 88 00 fe 5f 20 
00 00 00 20 01 00 00 00 00 00 00 00 00 00 01 00 00 03 92 02 01 36 b1 ee 7c 01 
03 > /dev/null])
+NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 \
+$(ovs-ofctl compose-packet --bare 'ND_NS_PKT,nd_target=2001::1:0:392')],
+  [0], [ignore])
 
 dnl Send a mismatching neighbor discovery.
-NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 36 b1 ee 7c 01 02 36 
b1 ee 7c 01 03 86 dd 60 00 00 00 00 20 3a ff fe 80 00 00 00 00 00 00 f8 16 3e 
ff fe 04 66 04 fe 80 00 00 00 00 00 00 f8 16 3e ff fe a7 dd 0e 88 00 f1 f2 20 
00 00 00 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 02 01 36 b1 ee 7c 01 
03 > /dev/null])
+NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 \
+$(ovs-ofctl compose-packet --bare 'ND_NS_PKT,nd_target=3000::1')],
+  [0], [ignore])
 
 AT_CHECK([ovs-appctl dpctl/dump-flows | strip_stats | strip_used | dnl
   strip_key32 | strip_ptype | strip_eth | strip_recirc | dnl
@@ -2438,20 +2453,29 @@ dnl The flow will encap a mpls header to the ip packet
 dnl eth/ip/icmp --> OVS --> eth/mpls/eth/ip/icmp
 AT_CHECK([ovs-ofctl -Oopenflow13 add-flow br0 
"table=0,priority=100,dl_type=0x0800 
actions=encap(mpls),set_mpls_label:2,encap(ethernet),set_field:00:00:00:00:00:02->dl_dst,set_field:00:00:00:00:00:01->dl_src,ovs-p1"])
 
-rm -rf p1.pcap
-NETNS_DAEMONIZE([at_ns1], [tcpdump -l -n -xx -U -i p1 > p1.pcap], 
[tcpdump.pid])
-sleep 1
+NETNS_DAEMONIZE([at_ns1],
+  [tcpdump -l -n -xx -U -i p1 -w p1.pcap 2>tcpdump_err], [tcpdump.pid])
+OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
+
+m4_define([ICMP_PKT], [m4_join([,],
+  [eth_src=36:b1:ee:7c:01:03,eth_dst=36:b1:ee:7c:01:02,eth_type=0x0800],
+  [nw_src=10.1.1.1,nw_dst=10.1.1.2],
+  [nw_proto=1,nw_ttl=64,nw_frag=no],
+  [icmp_type=8,icmp_code=0])])
 
-dnl The hex dump is a icmp packet. pkt=eth/ip/icmp
 dnl The packet is sent from p0(at_ns0) interface directed to
-dnl p1(at_ns1) interface
-NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 36 b1 ee 7c 01 02 36 
b1 ee 7c 01 03 08 00 

[ovs-dev] [PATCH 1/3] tests: sendpkt: Allow different input formats.

2024-05-31 Thread Ilya Maximets
We require python 3, so instead of manually parsing bytes on input we
can use built-in bytes.fromhex().  This function ignores whitespaces,
so we can use different input formats - the old style space-separated
bytes as well as pure hex strings provided by ovs-ofctl compose-packet
and ovs-pcap.

Signed-off-by: Ilya Maximets 
---
 tests/sendpkt.py | 26 --
 1 file changed, 4 insertions(+), 22 deletions(-)

diff --git a/tests/sendpkt.py b/tests/sendpkt.py
index 49ac45275..7cbea5165 100755
--- a/tests/sendpkt.py
+++ b/tests/sendpkt.py
@@ -48,28 +48,10 @@ if len(args) < 2:
 if options.packet_type != "eth":
 parser.error('invalid argument to "-t"/"--type". Allowed value is "eth".')
 
-# store the hex bytes with 0x appended at the beginning
-# if not present in the user input and validate the hex bytes
-hex_list = []
-for a in args[1:]:
-if a[:2] != "0x":
-hex_byte = "0x" + a
-else:
-hex_byte = a
-try:
-temp = int(hex_byte, 0)
-except:
-parser.error("invalid hex byte " + a)
-
-if temp > 0xff:
-parser.error("hex byte " + a + " cannot be greater than 0xff!")
-
-hex_list.append(temp)
-
-if sys.version_info < (3, 0):
-pkt = "".join(map(chr, hex_list))
-else:
-pkt = bytes(hex_list)
+# Strip '0x' prefixes from hex input, combine into a single string and
+# convert to bytes.
+hex_str = "".join([a[2:] if a.startswith("0x") else a for a in args[1:]])
+pkt = bytes.fromhex(hex_str)
 
 try:
 sockfd = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 0/3] tests: Use compose-packet for sendpkt calls.

2024-05-31 Thread Ilya Maximets
sendpkt.py is used in system tests to send raw packets to interfaces.
It is fed with plain hex bytes that are hard to read and understand.

This patch set teaches sendpkt to accept hex strings in slightly
different formats and teaches compose-packet command to generate NSH
headers.  With that most of the sendpkt.py calls updated to consume
packets generated from OpenFlow descriptions instead of plain coding
the bytes and checking code updated to compare results against these
composed packets.

Suggestion is to get these changes to all supported branches and
start writing new tests in the same way.  This patch set does add
some new functionality, but it is test-only, so should be OK to
backport.

There are still a few plain hex calls to sendpkt.py.  One is a
intentionally malformed geneve packet that we can't generate (at least
not fully), the others are IGMP packets that OVS doesn't fully parse
and doesn't have a way to represent fully in OpenFlow.

Next step might be to replace some of the packet-out calls as well.

Ilya Maximets (3):
  tests: sendpkt: Allow different input formats.
  tests: Convert ND, MPLS and CT sendpkt tests to compose-packet.
  nsh: Add support to compose-packet and use it in system tests.

 lib/flow.c  |  18 ++
 tests/sendpkt.py|  26 +--
 tests/system-traffic.at | 412 ++--
 3 files changed, 291 insertions(+), 165 deletions(-)

-- 
2.45.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 8/8] netdev-linux: Fix uninitialized gso_type case.

2024-05-31 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> This patch fixes an uninitialized gso_type case in
> netdev_linux_prepend_vnet_hdr() by returning an error.
> 
> Fixes: 3337e6d91c5b ("userspace: Enable L4 checksum offloading by default.")
> Signed-off-by: Eelco Chaudron 
> ---
>  lib/netdev-linux.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index eb0c5c624..dc67e1268 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -7167,6 +7167,10 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int 
> mtu)
>  vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
>  } else if (dp_packet_hwol_tx_ipv6(b)) {
>  vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> +} else {
> +VLOG_ERR_RL(, "Unknown gso_type for TSO packet. Flags: 
> %"PRIx64,

I'd suggest adding the # qualifier to the format string, i.e. %#"PRIx64.
Can be fixed on commit, I suppose.

Acked-by: Ilya Maximets 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 7/8] db-ctl-base: Initialize the output variable in the ctx structure.

2024-05-31 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> Coverity was flagged that the uninitialized output variable was used
> in the ctl_context_init_command() function. This patch initializes
> the variable.
> 
> In addition it also destroys the ds string in ctl_context_done()
> in case it's not cleared properly.
> 
> Fixes: 07ff77ccb82a ("db-ctl-base: Make common database command code into 
> library.")
> Signed-off-by: Eelco Chaudron 
> ---
>  lib/db-ctl-base.c | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Ilya Maximets 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 3/6] netdev-dpdk: Fix inner checksum when outer is not supported.

2024-05-31 Thread Ilya Maximets
On 5/30/24 15:10, David Marchand wrote:
> If outer checksum is not supported and OVS already set L3/L4 outer
> checksums in the packet, no outer mark should be left in ol_flags
> (as it confuses some driver, like net/ixgbe).
> 
> l2_len must be adjusted to account for the tunnel header.
> 
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Signed-off-by: David Marchand 
> Acked-by: Kevin Traynor 
> ---
>  lib/netdev-dpdk.c | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 


Recheck-request: github-robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 1/6] netdev-dpdk: Fallback to non tunnel checksum offloading.

2024-05-31 Thread Ilya Maximets
On 5/30/24 15:10, David Marchand wrote:
> The outer checksum offloading API in DPDK is ambiguous and was
> implemented by Intel folks in their drivers with the assumption that
> any outer offloading always goes with an inner offloading request.
> 
> With net/i40e and net/ice drivers, in the case of encapsulating a ARP
> packet in a vxlan tunnel (which results in requesting outer ip checksum
> with a tunnel context but no inner offloading request), a Tx failure is
> triggered, associated with a port MDD event.
> 2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR:
>   MDD event
> 
> To avoid this situation, if no checksum or segmentation offloading is
> requested on the inner part of a packet, fallback to "normal" (non outer)
> offloading request.
> 
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Fixes: f81d782c1906 ("netdev-native-tnl: Mark all vxlan/geneve packets as 
> tunneled.")
> Signed-off-by: David Marchand 
> Acked-by: Kevin Traynor 
> ---
> Changes since v2:
> - kept offloads disabled for net/i40e and net/ice as this patch does not
>   fix outer udp checksum (a DPDK fix is required),
> - updated commitlog with details to reproduce the issue,
> - adjusted indent,
> 
> Changes since v1:
> - reset inner marks before converting outer requests,
> - fixed some coding style,
> 
> ---
>  lib/netdev-dpdk.c | 71 +++
>  1 file changed, 41 insertions(+), 30 deletions(-)


Recheck-request: github-robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 4/8] sflow: Use uint32_t instead of time_t for tick handling in the poller.

2024-05-31 Thread Ilya Maximets
On 5/30/24 09:28, Eelco Chaudron wrote:
> 
> 
> On 29 May 2024, at 12:53, Eelco Chaudron wrote:
> 
>> The sFlow library uses a uint32_t to configure timeout ticks, but
>> stores this value as a time_t. Although this doesn't cause functional
>> issues, it wastes space and confuses Coverity, potentially indicating
>> a Y2K38 problem when storing uint32_t values in time_t. This patch
>> updates the internal data structures to use uint32_t variables.
>>
>> Fixes: c72e245a0e2c ("Add InMon's sFlow Agent library to the build system.")
>> Acked-by: Mike Pattrick 
>> Signed-off-by: Eelco Chaudron 
>> ---
>>  lib/sflow_api.h | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/sflow_api.h b/lib/sflow_api.h
>> index eb23e2acd..f4bfa5ead 100644
>> --- a/lib/sflow_api.h
>> +++ b/lib/sflow_api.h
>> @@ -148,7 +148,7 @@ typedef struct _SFLPoller {
>>  /* MIB fields */
>>  SFLDataSource_instance dsi;
>>  u_int32_t sFlowCpReceiver;
>> -time_t sFlowCpInterval;
>> +u_int32_t sFlowCpInterval;
>>  /* public fields */
>>  struct _SFLAgent *agent; /* pointer to my agent */
>>  void *magic; /* ptr to pass back in getCountersFn() */
>> @@ -156,7 +156,7 @@ typedef struct _SFLPoller {
>>  u_int32_t bridgePort; /* port number local to bridge */
>>  /* private fields */
>>  SFLReceiver *myReceiver;
>> -time_t countersCountdown;
>> +u_int32_t countersCountdown;
>>  u_int32_t countersSampleSeqNo;
>>  } SFLPoller;
>>
>> -- 
>> 2.44.0
> 
> Recheck-request: github-robot
> 

GitHub claims that the issue is fixed.  Let's try again.

Recheck-request: github-robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [BUG][revalidator] ovs crash and could NOT fix again after set request_mtu

2024-05-31 Thread Ilya Maximets
On 5/31/24 04:00, Simon Jones wrote:
> Hi all,
> 
> I'm using ovs-dpdk(ovs:2.17.1, dpdk:21.11.1).
> Now I found a BUG that ovs crash and could NOT fix again after set
> request_mtu.
> 
> 1. How to reproduce and my Analysis:
> ```
> # start ovs and add bridge and port and openflow
> 
> [root@bogon ~]# ovs-vsctl show
> 0444869c-dc4d-462f-8caf-074ecbab1a55
> Bridge br-int
> datapath_type: netdev
> Port p0
> Interface p0
> type: dpdk
> options: {dpdk-devargs=":c1:00.0"}
> Port br-int
> Interface br-int
> type: internal
> Bridge br-phy
> datapath_type: netdev
> Port pf1vf0
> Interface pf1vf0
> type: dpdk
> options: {dpdk-devargs=":c1:00.1,representor=[0]"}
> Port pf1vf1
> Interface pf1vf1
> type: dpdk
> options: {dpdk-devargs=":c1:00.1,representor=[1]"}
> Port br-phy
> Interface br-phy
> type: internal
> Port pf1vf3
> Interface pf1vf3
> type: dpdk
> options: {dpdk-devargs=":c1:00.1,representor=[3]"}
> Port pf1vf2
> Interface pf1vf2
> type: dpdk
> options: {dpdk-devargs=":c1:00.1,representor=[2]"}
> ovs_version: "2.17.2"
> 
> [root@bogon ~]# ovs-ofctl dump-flows br-int
>  cookie=0x0, duration=60216.364s, table=0, n_packets=16923639262,
> n_bytes=984712027272, priority=0 actions=NORMAL
> 
>  865084 root  10 -10  522.9g   1.6g  42808 S  17.3   0.6 175:48.23
> revalidator53
>  865123 root  10 -10  522.9g   1.6g  42808 S  17.3   0.6 175:00.43
> revalidator92
>  865158 root  10 -10  522.9g   1.6g  42808 S  17.3   0.6 175:58.49
> revalidator127
>  865171 root  10 -10  522.9g   1.6g  42808 S  17.3   0.6 176:29.69
> revalidator140
>  865058 root  10 -10  522.9g   1.6g  42808 S  16.9   0.6 176:58.03
> revalidator27
>  865091 root  10 -10  522.9g   1.6g  42808 S  16.9   0.6 175:41.81
> revalidator60
>  865111 root  10 -10  522.9g   1.6g  42808 S  16.9   0.6 176:05.97
> revalidator80
>  865113 root  10 -10  522.9g   1.6g  42808 S  16.9   0.6 177:09.64
> revalidator82
>  865130 root  10 -10  522.9g   1.6g  42808 S  16.9   0.6 176:16.27
> revalidator99
>  865155 root  10 -10  522.9g   1.6g  42808 S  16.9   0.6 176:11.22
> revalidator124
>  865097 root  10 -10  522.9g   1.6g  42808 S  16.6   0.6 177:00.22
> revalidator66
>  865110 root  10 -10  522.9g   1.6g  42808 S  16.6   0.6 175:16.52
> revalidator79
>  865149 root  10 -10  522.9g   1.6g  42808 S  16.6   0.6 176:00.84
> revalidator118
>  865151 root  10 -10  522.9g   1.6g  42808 S  16.6   0.6 176:29.06
> revalidator120
>  865057 root  10 -10  522.9g   1.6g  42808 S  16.3   0.6 178:03.60
> revalidator26
>  865070 root  10 -10  522.9g   1.6g  42808 S  16.3   0.6 176:17.63
> revalidator39
>  865112 root  10 -10  522.9g   1.6g  42808 S  16.3   0.6 175:35.65
> revalidator81
>  865083 root  10 -10  522.9g   1.6g  42808 S  15.9   0.6 176:21.53
> revalidator52
>  865124 root  10 -10  522.9g   1.6g  42808 S  15.9   0.6 175:31.27
> revalidator93
>  865127 root  10 -10  522.9g   1.6g  42808 S  15.9   0.6 176:59.65
> revalidator96
>  865147 root  10 -10  522.9g   1.6g  42808 S  15.9   0.6 176:51.85
> revalidator116
>  865164 root  10 -10  522.9g   1.6g  42808 S  15.9   0.6 177:34.16
> revalidator133
>  865051 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 175:27.68
> revalidator20
>  865066 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 175:54.05
> revalidator35
>  865087 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 175:38.54
> revalidator56
>  865100 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 177:12.42
> revalidator69
>  865118 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 176:02.57
> revalidator87
>  865121 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 176:06.20
> revalidator90
>  865132 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 177:24.71
> revalidator101
>  865148 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 179:07.53
> revalidator117
>  865162 root  10 -10  522.9g   1.6g  42808 S  15.6   0.6 177:18.34
> revalidator131
>  865047 root  10 -10  522.9g   1.6g  42808 S  15.3   0.6 176:30.75
> revalidator16
>  865080 root  10 -10  522.9g   1.6g  42808 S  15.3   0.6 175:36.41
> revalidator49
>  865117 root  10 -10  522.9g   1.6g  42808 S  15.3   0.6 176:03.18
> revalidator86
>  865125 root  10 -10  522.9g   1.6g  42808 S  15.3   0.6 177:15.42
> revalidator94
>  865122 root  10 -10  522.9g   1.6g  42808 S  15.0   0.6 176:45.37
> revalidator91
>  865065 root  10 -10  522.9g   1.6g  42808 S  14.6   0.6 176:49.66
> revalidator34
>  865116 root  10 -10  522.9g   1.6g  42808 S  14.6   0.6 174:57.67
> revalidator85
>  865161 root  10 

Re: [ovs-dev] [PATCH v5] route-table: Add support for v4 via v6 route.

2024-05-30 Thread Ilya Maximets
On 5/30/24 20:17, Ilya Maximets wrote:
> On 5/30/24 01:27, William Tu wrote:
>> Add route-table support for ipv4 dst via ipv6. One use case is BGP
>> unnumbered, a mechanism that establishes peering sessions without the
>> need to explicitly configure IPv4 addresses on the interfaces involved
>> in the peering. Without using IPv4 address assignments, it uses
>> link-local IPv6 addresses of the directly connected neighbors for
>> peering purposes. For example, BGP might install the following route:
>> $ ip route get 100.87.18.3
>> 100.87.18.3 via inet6 fe80::920a:84ff:fe9e:9570 \
>> dev br-phy src 100.87.18.6
>>
>> Note that the v6 addr fe80::920a:84ff:fe9e:9570 is not being used in
>> the packet header, but only used for lookup the out dev br-phy.
>> Currently OVS can only support either all-ipv4 or all-ipv6, the patch
>> adds support for such use case.
>>
>> Reported-at: 
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-January/052908.html
>> Acked-by: Simon Horman 
>> Signed-off-by: William Tu 
>> ---
>> v5: fix minor CI failure
>> v4: feedback from Ilya
>> - add route del test case, wrap around test width
>> - not set neighbor cache manually
>> - on br-phy, use /32 on address in steead of /24
>> compare v3 and v4
>> https://github.com/williamtu/ovs/compare/router..router-v4
>> v3: add vxlan test, remove rfc
>> v2: fix CI error
>> ---
> 
> 
> 
>> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
>> index 508737c53ec6..7266f0990570 100644
>> --- a/tests/tunnel-push-pop.at
>> +++ b/tests/tunnel-push-pop.at
>> @@ -196,6 +196,69 @@ OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep 
>> 100022eb00012237 | wc -l`
>>  OVS_VSWITCHD_STOP
>>  AT_CLEANUP
>>  
>> +AT_SETUP([tunnel_push_pop - v4 via v6 route])
>> +
>> +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy 
>> ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00])
>> +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br 
>> datapath_type=dummy], [0])
>> +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=vxlan \
>> +   options:remote_ip=1.1.2.92 options:key=123 
>> ofport_request=1\
>> +   ], [0])
>> +
>> +AT_CHECK([ovs-appctl dpif/show], [0], [dnl
>> +dummy@ovs-dummy: hit:0 missed:0
>> +  br0:
>> +br0 65534/100: (dummy-internal)
>> +p0 1/1: (dummy)
>> +  int-br:
>> +int-br 65534/2: (dummy-internal)
>> +t1 1/4789: (vxlan: key=123, remote_ip=1.1.2.92)
>> +])
>> +
>> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
>> +
>> +dnl Setup dummy interface IP addresses.
>> +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/32], [0], [OK
>> +])
>> +AT_CHECK([ovs-appctl netdev-dummy/ip6addr br0 2001:cafe::88/64], [0], [OK
>> +])
>> +dnl Add a static v4 via v6 route
>> +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/32 br0 2001:cafe::10 
>> src=1.1.2.89], [0], [OK
>> +])
>> +
>> +AT_CHECK([ovs-appctl ovs/route/show | grep br0 | sort], [0], [dnl
>> +Cached: 1.1.2.88/32 dev br0 SRC 1.1.2.88 local
>> +Cached: 2001:cafe::/64 dev br0 SRC 2001:cafe::88 local
>> +User: 1.1.2.92/32 dev br0 GW 2001:cafe::10 SRC 1.1.2.89
>> +])
>> +
>> +dnl Check ARP Snoop
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p0 'recirc_id(0),in_port(100),dnl
>> +eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),dnl
>> +arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)'])
> 
> This is still not a correct test, we would never receive an ARP
> from an IPv6-only network.  This must be an IPv6 NA packet instead.
> 
> All in all, I'd expect the following test to work without modifications
> (unless I mistyped something):
> 
> ---
> AT_SETUP([tunnel_push_pop - v4 via v6 route])
> 
> OVS_VSWITCHD_START(
> [add-port br0 p0 \
>  -- set Interface p0 type=dummy ofport_request=1 \
>  other-config:hwaddr=aa:55:aa:55:00:00])
> AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg])
> AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy])
> AT_CHECK([ovs-vsctl add-port int-br t2 \
>   -- set Interface t2 type=geneve \
>   options:remote_ip=1.1.2.92 \
>   options:key=123 ofport_request=2])
> 
> dnl Setup IP addresses.
> AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/32], [0], [OK
> ])
> AT_CHECK([ovs-appctl netdev-dummy/ip6addr br0 2001:cafe::88/64], [0], [OK

Re: [ovs-dev] [PATCH v5] route-table: Add support for v4 via v6 route.

2024-05-30 Thread Ilya Maximets
w int-br action=normal])

AT_CHECK([ovs-vsctl -- set Interface p0 options:tx_pcap=p0.pcap])

dnl Check that v4-over-v6 route is used in the trace and that a tunnel neighbor
dnl lookup miss generates ND and not an ARP.
AT_CHECK([ovs-appctl ofproto/trace int-br in_port=LOCAL \
| grep -E 'tunnel|neighbor|actions'], [0], [dnl
 -> output to native tunnel
 -> tunneling to 2001:cafe::10 via br0
 -> neighbor cache miss for 2001:cafe::10 on bridge br0, sending ND request
Datapath actions: drop
])

dnl Check that the correct Neighbor Solicitation was sent out via p0.
m4_define([ND_NS_PACKET], [m4_joinall([,],
  [eth_src=aa:55:aa:55:00:00,eth_dst=33:33:ff:00:00:10,eth_type=0x86dd],
  [ipv6_src=2001:cafe::88,ipv6_dst=ff02::1:ff00:10],
  [nw_proto=58,nw_ttl=255,nw_frag=no],
  [icmpv6_type=135,icmpv6_code=0],
  [nd_target=2001:cafe::10,nd_options_type=1,nd_sll=aa:55:aa:55:00:00])])

OVS_WAIT_UNTIL([test $(ovs-pcap p0.pcap \
| grep -c "$(ovs-ofctl compose-packet --bare 'ND_NS_PACKET')") -eq 1])

dnl Now send a Neighbor Advertisement from p0 which has two effects:
dnl 1. The neighbor cache will learn that 2001:cafe::10 is at f8:bc:12:44:34:b6.
dnl 2. The br0 mac learning will learn that f8:bc:12:44:34:b6 is on p0.
AT_CHECK([ovs-appctl netdev-dummy/receive p0 dnl
 'recirc_id(0),in_port(1),dnl
  eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:00),eth_type(0x86dd),dnl
  
ipv6(src=2001:cafe::10,dst=2001:cafe::88,label=0,proto=58,tclass=0,hlimit=255,frag=no),dnl
  icmpv6(type=136,code=0),dnl
  nd(target=2001:cafe::10,sll=00:00:00:00:00:00,tll=f8:bc:12:44:34:b6)'
])

dnl Check that v4-over-v6 route is used in the trace and the tunnel is working.
AT_CHECK([ovs-appctl ofproto/trace int-br in_port=LOCAL \
| grep -E 'tunnel|neighbor|actions'], [0], [dnl
 -> output to native tunnel
 -> tunneling to 2001:cafe::10 via br0
 -> tunneling from aa:55:aa:55:00:00 1.1.2.88 to f8:bc:12:44:34:b6 
2001:cafe::10
Datapath actions: tnl_push(tnl_port(6081),header(size=50,type=5,dnl
eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),dnl
ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x4000),dnl
udp(src=0,dst=6081,csum=0x0),geneve(vni=0x7b)),out_port(100)),1
])

dnl Now check that the packet is actually encapsulated and delivered.
packet=5054000a505400091234
eth=f8bc124434b6aa55aa550800
ip4=4532400040113406010102580101025c
dnl Source port is based on a packet hash, so it may differ depending on the
dnl compiler flags and CPU type.  Masked with ''.
udp=17c1001e
geneve=65587b00
encap=${eth}${ip4}${udp}${geneve}
dnl Output to the tunnel from the int-br internal port.
dnl Checking that the packet arrived and it was correctly encapsulated.
AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}"])
OVS_WAIT_UNTIL([test $(ovs-pcap p0.pcap | grep -c "${encap}${packet}") -eq 1])

dnl Sending again to exercise the non-miss upcall path.
AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}"])
OVS_WAIT_UNTIL([test $(ovs-pcap p0.pcap | grep -c "${encap}${packet}") -eq 2])

dnl Finally, checking that the datapath flow is also correct.
AT_CHECK([ovs-appctl dpctl/dump-flows | grep tnl_push \
| strip_ufid | strip_used], [0], [dnl
recirc_id(0),in_port(2),packet_type(ns=0,id=0),dnl
eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x1234), dnl
packets:1, bytes:14, used:0.0s, dnl
actions:tnl_push(tnl_port(6081),header(size=50,type=5,dnl
eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),dnl
ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x4000),dnl
udp(src=0,dst=6081,csum=0x0),geneve(vni=0x7b)),out_port(100)),1
])

OVS_VSWITCHD_STOP
AT_CLEANUP
---

With the current version of a patch it fails on the tunnel neighbor
lookup because it looks for IPv4 neighbor and there are no IPv4
neighbors in this setup, all the neighbors are IPv6-only.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v8] rhel: Make the version, displayed to the user, customizable.

2024-05-29 Thread Ilya Maximets
gt; +
> +AC_DEFUN([OVS_CHECK_VERSION_SUFFIX], [
> +  AC_ARG_WITH([version-suffix],
> +  [AS_HELP_STRING([--with-version-suffix=ver_suffix],
> +  [Specify a version suffix that will be appended

s/a version suffix/a string/

> +   to OVS version])])
> +  AC_DEFINE_UNQUOTED([VERSION_SUFFIX], ["$with_version_suffix"],
> + [Package version suffix])
> +  AC_SUBST([VERSION_SUFFIX], [$with_version_suffix])
> +  ])
> +])
> +
>  dnl Checks for net/if_dl.h.
>  dnl
>  dnl (We use this as a proxy for checking whether we're building on FreeBSD
> diff --git a/configure.ac b/configure.ac
> index dd6553fea..8323e481d 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -202,6 +202,7 @@ OVS_CHECK_LINUX_SCTP_CT
>  OVS_CHECK_LINUX_VIRTIO_TYPES
>  OVS_CHECK_DPDK
>  OVS_CHECK_PRAGMA_MESSAGE
> +OVS_CHECK_VERSION_SUFFIX
>  AC_SUBST([CFLAGS])
>  AC_SUBST([OVS_CFLAGS])
>  AC_SUBST([OVS_LDFLAGS])
> diff --git a/include/openvswitch/version.h.in 
> b/include/openvswitch/version.h.in
> index 23d8fde4f..231f61e30 100644
> --- a/include/openvswitch/version.h.in
> +++ b/include/openvswitch/version.h.in
> @@ -19,7 +19,7 @@
>  #define OPENVSWITCH_VERSION_H 1
>  
>  #define OVS_PACKAGE_STRING  "@PACKAGE_STRING@"
> -#define OVS_PACKAGE_VERSION "@PACKAGE_VERSION@"
> +#define OVS_PACKAGE_VERSION "@PACKAGE_VERSION@@VERSION_SUFFIX@"
>  
>  #define OVS_LIB_VERSION @LT_CURRENT@
>  #define OVS_LIB_REVISION@LT_REVISION@
> diff --git a/lib/ovsdb-error.c b/lib/ovsdb-error.c
> index 9ad42b232..56512fc28 100644
> --- a/lib/ovsdb-error.c
> +++ b/lib/ovsdb-error.c
> @@ -146,7 +146,7 @@ ovsdb_internal_error(struct ovsdb_error *inner_error,
>  ds_put_char(, ')');
>  }
>  
> -ds_put_format(, " (%s %s)", program_name, VERSION);
> +ds_put_format(, " (%s %s)", program_name, VERSION VERSION_SUFFIX);
>  
>  if (inner_error) {
>  char *s = ovsdb_error_to_string_free(inner_error);
> diff --git a/lib/util.c b/lib/util.c
> index 5c31d983a..c03a1ae15 100644
> --- a/lib/util.c
> +++ b/lib/util.c
> @@ -618,8 +618,9 @@ ovs_set_program_name(const char *argv0, const char 
> *version)
>      program_name = basename;
>  
>  free(program_version);
> -if (!strcmp(version, VERSION)) {
> -program_version = xasprintf("%s (Open vSwitch) "VERSION"\n",
> +if (!strcmp(version, VERSION VERSION_SUFFIX)) {
> +program_version = xasprintf("%s (Open vSwitch) "VERSION
> +VERSION_SUFFIX"\n",
>  program_name);
>  } else {
>  program_version = xasprintf("%s %s\n"

This one also needs an update.  You may test this by running
$ ./tests/test-lib --version

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 8/8] netdev-linux: Fix uninitialized gso_type case.

2024-05-29 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> This patch fixes an uninitialized gso_type case in
> netdev_linux_prepend_vnet_hdr() by returning an error.
> 
> Fixes: 3337e6d91c5b ("userspace: Enable L4 checksum offloading by default.")
> Signed-off-by: Eelco Chaudron 
> ---

Recheck-request: github-robot

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 3/8] sflow: Replace libc's random() function with the OVS's random_range().

2024-05-29 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> Coverity has flagged the use of a potentially unsafe function.
> Although this is not a concern in this case since it's not used for
> encryption, we should replace it with the OVS implementation to
> achieve better randomness.
> 
> Fixes: c72e245a0e2c ("Add InMon's sFlow Agent library to the build system.")
> Acked-by: Mike Pattrick 
> Signed-off-by: Eelco Chaudron 
> ---

Recheck-request: github-robot

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 6/8] ofproto-dpif: Define age as time_t in ofproto_unixctl_fdb_add().

2024-05-29 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> Fix the warning from Coverity about potential truncation of the
> time_t value when copying to a local variable by changing the
> local variable's type to time_t.
> 
> Fixes: ccc24fc88d59 ("ofproto-dpif: APIs and CLI option to add/delete static 
> fdb entry.")
> Acked-by: Mike Pattrick 
> Acked-by: Paolo Valerio 
> Signed-off-by: Eelco Chaudron 
> ---

Recheck-request: github-robot

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 5/8] sflow: Fix check for disabled receive time.

2024-05-29 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> Changed sFlowRcvrTimeout to a uint32_t to avoid time_t warnings
> reported by Coverity. A uint32_t is more than large enough as
> this is a (seconds) tick counter and OVS is not even using this.
> 
> Fixes: c72e245a0e2c ("Add InMon's sFlow Agent library to the build system.")
> Acked-by: Ilya Maximets 
> Signed-off-by: Eelco Chaudron 
> --
> Note that this checkpatch reports an 'Improper whitespace
> around control block' error on this patch + some warnings.
> But I did not want to change the code style in this entire file.
> ---

Recheck-request: github-robot

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 4/8] sflow: Use uint32_t instead of time_t for tick handling in the poller.

2024-05-29 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> The sFlow library uses a uint32_t to configure timeout ticks, but
> stores this value as a time_t. Although this doesn't cause functional
> issues, it wastes space and confuses Coverity, potentially indicating
> a Y2K38 problem when storing uint32_t values in time_t. This patch
> updates the internal data structures to use uint32_t variables.
> 
> Fixes: c72e245a0e2c ("Add InMon's sFlow Agent library to the build system.")
> Acked-by: Mike Pattrick 
> Signed-off-by: Eelco Chaudron 
> ---

Recheck-request: github-robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 2/8] cfm: Fix possible integer overflow in tc_add_matchall_policer().

2024-05-29 Thread Ilya Maximets
On 5/29/24 12:53, Eelco Chaudron wrote:
> Fix unintentional integer overflow reported by Coverity by adding
> the LL suffix to the numerical literals used in the multiplication.
> 
> Fixes: 5767a79a4059 ("cfm: Require ccm received in demand mode.")
> Acked-by: Mike Pattrick 
> Signed-off-by: Eelco Chaudron 
> ---
>  lib/cfm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Recheck-request: github-robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v4] odp-execute: Fix AVX checksum calculation.

2024-05-29 Thread Ilya Maximets
On 5/29/24 11:01, Eelco Chaudron wrote:
> 
> 
> On 28 May 2024, at 16:49, Ilya Maximets wrote:
> 
>> On 5/28/24 14:36, Eelco Chaudron wrote:
>>>
>>>
>>> On 24 May 2024, at 11:20, Emma Finn wrote:
>>>
>>>> The AVX implementation for calcualting checksums was not
>>>> handling carry-over addition correctly in some cases.
>>>> This patch adds an additional shuffle to add 16-bit padding to
>>>> the final part of the calculation to handle such cases. This
>>>> commit also adds a unit test to check the checksum carry-bits
>>>> issue with actions autovalidator enabled.
>>>
>>> Hi Emma,
>>>
>>> Thanks for sending out the v4. I have some small nits below, which I can 
>>> fix during commit time. Assuming Ilya has no other simple to fix comments.
>>>
>>> Cheers,
>>>
>>> Eelco
>>>
>>>> Signed-off-by: Emma Finn 
>>>> Reported-by: Eelco Chaudron 
>>>> ---
>>>>  lib/odp-execute-avx512.c |  5 
>>>>  tests/dpif-netdev.at | 64 
>>>>  2 files changed, 69 insertions(+)
>>>>
>>>> diff --git a/lib/odp-execute-avx512.c b/lib/odp-execute-avx512.c
>>>> index 50c48bfd4..a74a85dc1 100644
>>>> --- a/lib/odp-execute-avx512.c
>>>> +++ b/lib/odp-execute-avx512.c
>>>> @@ -366,6 +366,8 @@ avx512_get_delta(__m256i old_header, __m256i 
>>>> new_header)
>>>>0xF, 0xF, 0xF, 0xF);
>>>>  v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
>>>>
>>>> +v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
>>>> +v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a);
>>>>  v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
>>>>  v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
>>>>
>>>> @@ -575,6 +577,9 @@ avx512_ipv6_sum_header(__m512i ip6_header)
>>>>0xF, 0xF, 0xF, 0xF);
>>>>
>>>>  v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
>>>> +
>>>> +v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
>>>> +v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a);
>>>>  v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
>>>>  v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
>>>>
>>>> diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
>>>> index 790b5a43a..260986ba9 100644
>>>> --- a/tests/dpif-netdev.at
>>>> +++ b/tests/dpif-netdev.at
>>>> @@ -1091,3 +1091,67 @@ OVS_VSWITCHD_STOP(["dnl
>>>>  /Error: unknown miniflow extract implementation superstudy./d
>>>>  /Error: invalid study_pkt_cnt value: -pmd./d"])
>>>>  AT_CLEANUP
>>>> +
>>>> +AT_SETUP([datapath - Actions Autovalidator Checksum])
>>>> +
>>>> +OVS_VSWITCHD_START(add-port br0 p0 -- set Interface p0 type=dummy \
>>>> +   -- add-port br0 p1 -- set Interface p1 type=dummy)
>>>> +
>>>> +AT_CHECK([ovs-appctl odp-execute/action-impl-set autovalidator], [0], [dnl
>>>> +Action implementation set to autovalidator.
>>>> +])
>>>> +
>>>> +# Add flows to trigger checksum calculation
>>>
>>> Comments should end with a dot(.). Also, not sure if ‘#’ is fine here, as 
>>> we are
>>> moving to ‘dnl’, but this file has both (most are ‘#’). Ilya?
>>
>> Both are fine, 'dnl' is a bit cleaner, so if you want to swap those
>> on commit that's fine, but there is no point in new version just for
>> that.
>>
>> Note that while backporting the fix we'll need to substitute the
>> 'compose-packet' calls with their results, since bare packet compose
>> is not available pre 3.3.
>>
>>>
>>>> +AT_DATA([flows.txt], [ddl
>>>> +  in_port=p0,ip,actions=mod_nw_src=10.1.1.1,p1
>>>> +  in_port=p0,ipv6,actions=set_field:fc00::100->ipv6_src,p1
>>>> +])
>>>> +AT_CHECK([ovs-ofctl del-flows br0])
>>>> +AT_CHECK([ovs-ofctl -Oopenflow13 add-flows br0 flows.txt])
>>>> +
>>>> +# Make sure checksum won't be offloaded
>>>> +AT_CHECK([ovs-vsctl set Interface p0 options:ol_ip_csum=false])
>>>> +AT_CHECK([ovs-vsctl set Interface p0 options:ol_ip_csum_set_good=false])
>>>> +
>>>> +AT_CHECK([ovs-vsctl set Interface p1 options:pcap

Re: [ovs-dev] [PATCH 2/2] netdev-linux: Fix ethtool_cmd is partly outside array bounds.

2024-05-28 Thread Ilya Maximets
On 5/27/24 13:00, Roi Dayan via dev wrote:
> Used Ubuntu with gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0
> 
> lib/netdev-linux.c: In function ‘netdev_linux_construct’:
> lib/netdev-linux.c:7003:15: error: array subscript ‘struct ethtool_cmd[0]’ is 
> partly outside array bounds of ‘union [1]’ [-Werror=array-bounds]
>  7003 | ecmd->cmd = cmd;
>   | ~~^
> lib/netdev-linux.c:2411:7: note: while referencing ‘sset_info’
>  2411 | } sset_info;
>   |   ^
> 
> Fixes: 6c59c195266c ("netdev-linux: Use ethtool to detect offload support.")
> Signed-off-by: Roi Dayan 
> ---
>  lib/netdev-linux.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index 0cb379295af1..ec6fcf7b2c6a 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -2402,6 +2402,7 @@ static int
>  netdev_linux_read_stringset_info(struct netdev_linux *netdev, uint32_t *len)
>  {
>  union {
> +struct ethtool_cmd ecmd;
>  struct ethtool_sset_info hdr;
>  struct {
>  uint64_t pad[2];

We need to migrate from the legacy ethtool_cmd API at some point,
but this change makes sense for now.

Applied to main and backported down to 3.3.  Thanks!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 6/6] netdev-linux: Initialize link speed in error conditions.

2024-05-28 Thread Ilya Maximets
On 5/27/24 21:08, Mike Pattrick wrote:
> Clang's static analyzer noted that the output from
> netdev_linux_get_speed_locked can be checked even if this function
> doesn't set any values.
> 
> Now we always set those values to a sane default in all cases.
> 
> Fixes: b8f8fad86435 ("netdev-linux: Use speed as max rate in tc classes.")

This is still an incorrect Fixes tag.  The correct one is:

Fixes: 19cffe30cfda ("netdev-linux: Avoid deadlock in netdev_get_speed.")

The original netdev_get_speed() call was fine, because it ensures that values
are zeroed out even on errors.  That is defined in netdev-provider API.  But
the new static netdev_linux_get_speed_locked() function didn't do the same.

I fixed that and applied the set.  Individual patches backported according to
their Fixes tags.  Thanks!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 8/8] netdev-linux: Fix uninitialized gso_type case.

2024-05-28 Thread Ilya Maximets
On 5/28/24 13:39, Eelco Chaudron wrote:
> This patch fixes an uninitialized gso_type case in
> netdev_linux_prepend_vnet_hdr() by returning an error.
> 
> Fixes: 3337e6d91c5b ("userspace: Enable L4 checksum offloading by default.")
> Signed-off-by: Eelco Chaudron 
> ---
>  lib/netdev-linux.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index eb0c5c624..7cffc0e13 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -7167,6 +7167,11 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int 
> mtu)
>  vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
>  } else if (dp_packet_hwol_tx_ipv6(b)) {
>  vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> +} else {
> +VLOG_ERR_RL(, "Unknown gso_type for TSO hw offload packet. "
> +"Flags: %"PRIu64,
> +(uint64_t)*dp_packet_ol_flags_ptr(b));

I'm not sure if this should be an error or warning, up to you.  But I'd suggest
removing the 'hw offload' part from the message since 'TSO' contains 'offload'
and it's not necessarily hardware here.  Also, flags are better printed in hex,
i.e. %#"PRIx64.  And there is a missing space between the cast and a variable.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 5/8] sflow: Fix check for disabled receive time.

2024-05-28 Thread Ilya Maximets
On 5/28/24 13:39, Eelco Chaudron wrote:
> Changed sFlowRcvrTimeout to a uint32_t to avoid time_t warnings
> reported by Coverity. A uint32_t is more than large enough as
> this is a (seconds) tick counter and OVS is not even using this.
> 
> Fixes: c72e245a0e2c ("Add InMon's sFlow Agent library to the build system.")
> Signed-off-by: Eelco Chaudron 
> --
> Note that this checkpatch reports an 'Improper whitespace
> around control block' error on this patch + some warnings.
> But I did not want to change the code style in this entire file.
> ---
>  lib/sflow_api.h  | 6 +++---
>  lib/sflow_receiver.c | 7 ---
>  ofproto/ofproto-dpif-sflow.c | 2 +-
>  3 files changed, 8 insertions(+), 7 deletions(-)

Acked-by: Ilya Maximets 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [Patch] ovsdb-client: Add "COLUMN" arg to help for 'dump'.

2024-05-28 Thread Ilya Maximets
On 5/28/24 19:35, Ilya Maximets wrote:
> On 5/21/24 10:38, Martin Kalcok wrote:
>> Help text for 'ovsdb-client dump' does not mention that it's capable
>> of dumping a specific column's contents if the user supplies the
>> column's name as a fourth positional argument.
>>
>> Signed-off-by: Martin Kalcok 
>> ---
>>  ovsdb/ovsdb-client.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/ovsdb/ovsdb-client.c b/ovsdb/ovsdb-client.c
>> index cf2ecfd08..0a3f1d4df 100644
>> --- a/ovsdb/ovsdb-client.c
>> +++ b/ovsdb/ovsdb-client.c
>> @@ -451,9 +451,9 @@ usage(void)
>> "wait until DATABASE reaches STATE "
>> "(\"added\" or \"connected\" or \"removed\")\n"
>> "in DATBASE on SERVER.\n"
>> -   "\n  dump [SERVER] [DATABASE] [TABLE]\n"
>> -   "dump contents of TABLE (or all tables) in DATABASE on 
>> SERVER\n"
>> -   "to stdout\n"
>> +   "\n  dump [SERVER] [DATABASE] [TABLE] [COLUMN]\n"
>> +   "dump contents of COLUMN, TABLE (or all tables) in 
>> DATABASE\n"
>> +   "on SERVER to stdout\n"
> 
> I think it was '[TABLE [COLUMN]...]' until commit 85226894ddec
> ("ovsdb-client: support monitor2") removed that part on accident.

It is also how it is defined in the man page.

> 
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [Patch] ovsdb-client: Add "COLUMN" arg to help for 'dump'.

2024-05-28 Thread Ilya Maximets
On 5/21/24 10:38, Martin Kalcok wrote:
> Help text for 'ovsdb-client dump' does not mention that it's capable
> of dumping a specific column's contents if the user supplies the
> column's name as a fourth positional argument.
> 
> Signed-off-by: Martin Kalcok 
> ---
>  ovsdb/ovsdb-client.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/ovsdb/ovsdb-client.c b/ovsdb/ovsdb-client.c
> index cf2ecfd08..0a3f1d4df 100644
> --- a/ovsdb/ovsdb-client.c
> +++ b/ovsdb/ovsdb-client.c
> @@ -451,9 +451,9 @@ usage(void)
> "wait until DATABASE reaches STATE "
> "(\"added\" or \"connected\" or \"removed\")\n"
> "in DATBASE on SERVER.\n"
> -   "\n  dump [SERVER] [DATABASE] [TABLE]\n"
> -   "dump contents of TABLE (or all tables) in DATABASE on 
> SERVER\n"
> -   "to stdout\n"
> +   "\n  dump [SERVER] [DATABASE] [TABLE] [COLUMN]\n"
> +   "dump contents of COLUMN, TABLE (or all tables) in DATABASE\n"
> +   "on SERVER to stdout\n"

I think it was '[TABLE [COLUMN]...]' until commit 85226894ddec
("ovsdb-client: support monitor2") removed that part on accident.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] ofproto-dpif-rid: Fix duplicate entries.

2024-05-28 Thread Ilya Maximets
On 5/25/24 12:18, wushao...@chinatelecom.cn wrote:
> From: Shaohua Wu 
> 
> In scenarios with multiple PMDs, there may be
> simultaneous requests for recirc_id from multiple
> PMD threads.In recirc_alloc_id_ctx, we first check
> if there is a duplicate entry in the metadata_map
> for the same frozen_state field. If successful,
> we directly retrieve the recirc_id. If unsuccessful,
> we create a new recirc_node and insert it into
> id_map and metadata_map. There is no locking mechanism
> to prevent the possibility of two threads with the same
> state simultaneously inserting, meaning their IDs are
> different, but their frozen_states are the same.

Hi, Shaohua Wu.  Could you, please, explain why having multiple IDs
allocated for the same state is a problem?  This may create a few
more datapath flows, but should not cause any correctness issues, as
Simon pointed out previously.  The change you're proposing may have
some performance impact as we'll be holding the shared mutex for
longer while allocating IDs, so I'd like to understand why it is a
problem before making the change.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] netdev-offload-tc: Reserve lower tc prio for vlan ethertype.

2024-05-28 Thread Ilya Maximets
On 5/26/24 10:31, Roi Dayan via dev wrote:
> From: Maor Dickman 
> 
> The cited commit reserved lower tc priorities for IP ethertypes in order
> to give IP traffic higher priority than other management traffic.
> In case of of vlan encap traffic, IP traffic will still get lower
> priority.
> 
> Fix it by also reserving low priority tc prio for vlan.
> 
> Fixes: c230c7579c14 ("netdev-offload-tc: Reserve lower tc prios for ip 
> ethertypes")
> Signed-off-by: Maor Dickman 
> Acked-by: Roi Dayan 
> ---
>  lib/netdev-offload-tc.c | 2 ++
>  lib/tc.h| 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 921d5231777e..3be1c08d24f6 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -400,6 +400,8 @@ get_next_available_prio(ovs_be16 protocol)
>  return TC_RESERVED_PRIORITY_IPV4;
>  } else if (protocol == htons(ETH_P_IPV6)) {
>  return TC_RESERVED_PRIORITY_IPV6;
> +} else if (protocol == htons(ETH_P_8021Q)) {

Should 802.1ad traffic also get the priority?
What about MPLS?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Resolved flow table reference issue.

2024-05-28 Thread Ilya Maximets
On 5/23/24 09:50, wushao...@chinatelecom.cn wrote:
> From: Shaohua Wu 
> 
> description:
> The statistics for the cached DP flow table are updated in
> packet_batch_per_flow_execute. There might be a timing gap
> where the dp flow in the batch reaches its aging time and
> is prematurely aged. If actions on the dp flows  are
> executed at this time,and the memory for the DP flow
> has been released, it can lead to a crash.
> 
> 0  raise () from /lib64/libc.so.6
> 1  abort () from /lib64/libc.so.6
> 2  odp_execute_actions at lib/odp-execute.c:1166
> 3  dp_netdev_execute_actions at lib/dpif-netdev.c:11339
> 4  packet_batch_per_flow_execute  at lib/dpif-netdev.c:8537
> 5  dp_netdev_input__  at lib/dpif-netdev.c:10722
> 6  dp_netdev_input at lib/dpif-netdev.c:10731
> 7  dp_netdev_process_rxq_port at lib/dpif-netdev.c:6332
> 8  dpif_netdev_run at lib/dpif-netdev.c:7343
> 9  dpif_run at lib/dpif.c:479
> 10 type_run at ofproto/ofproto-dpif.c:370
> 11 ofproto_type_run at ofproto/ofproto.c:1789
> 12 bridge_run__ at vswitchd/bridge.c:3245
> 13 bridge_run at vswitchd/bridge.c:3310
> 14 main  at vswitchd/ovs-vswitchd.c:127
> (gdb) f 4
> (gdb) p flow->ref_cnt
> $4 = {count = 0}
> (gdb) p flow->dead
> $5 = true

Hmm.  Thanks for the patch!  Though the flow is supposed to be
protected by RCU and PMD threads are not supposed to enter
quiescent state in the middle of packet processing, so the flow
should not be freed until the processing is over.

The flow will be dead in this case with a zero counter, but it
is not freed, i.e. should be accessible.

Do you know what exactly failed in odp_execute_actions ?
Your version of the code seems to be very different from the
upstream main branch, so line numbers do not help much.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] docs: afxdp: Fix CONFIG_HAVE_EBPF_JIT Kconfig option spelling.

2024-05-28 Thread Ilya Maximets
On 5/22/24 14:47, Eelco Chaudron wrote:
> 
> 
> On 21 May 2024, at 17:47, Simon Horman wrote:
> 
>> On Tue, May 21, 2024 at 08:35:21AM +0200, Eelco Chaudron wrote:
>>>
>>>
>>> On 20 May 2024, at 20:13, Simon Horman wrote:
>>>
>>>> From: Ville Skyttä 
>>>>
>>>> Fix CONFIG_HAVE_EBPF_JIT Kconfig option spelling "EBPF" vs "BPF").
>>>>
>>>> Signed-off-by: Ville Skyttä 
>>>> [simon: added commit meesage]
>>>
>>> I guess this line was a leftover of your merge/tooling?
>>
>> It was intentional, but I can drop it.
> 
> It looked odd, maybe this is more a remark for the -- section, i.e. not part 
> of the actual commit message?

It's a normal practice to add such remarks to the commit message
when committer made some noticeable changes to the original patch,
or if they want to clarify something that wasn't fully explained
by the original commit message.  I add those sometimes as well.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] dpdk: Use DPDK 23.11.1 release.

2024-05-28 Thread Ilya Maximets
On 5/28/24 11:25, Kevin Traynor wrote:
> Update the CI and docs to use DPDK 23.11.1.
> 
> Signed-off-by: Kevin Traynor 
> ---
> v2: update NEWS
> ---
>  .github/workflows/build-and-test.yml |  4 ++--
>  Documentation/faq/releases.rst   | 10 +-
>  Documentation/intro/install/dpdk.rst |  8 
>  NEWS |  2 ++
>  4 files changed, 13 insertions(+), 11 deletions(-)

Acked-by: Ilya Maximets 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 branch-3.3] dpdk: Use DPDK 23.11.1 release for OVS 3.3.

2024-05-28 Thread Ilya Maximets
On 5/28/24 11:25, Kevin Traynor wrote:
> Update the CI and docs to use DPDK 23.11.1.
> 
> Signed-off-by: Kevin Traynor 
> ---
> v2: update NEWS
> ---
>  .github/workflows/build-and-test.yml |  4 ++--
>  Documentation/faq/releases.rst   | 10 +-
>  Documentation/intro/install/dpdk.rst |  8 
>  NEWS |  2 ++
>  4 files changed, 13 insertions(+), 11 deletions(-)

Acked-by: Ilya Maximets 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 branch-3.2] dpdk: Use DPDK 22.11.5 release for OVS 3.2.

2024-05-28 Thread Ilya Maximets
On 5/28/24 11:25, Kevin Traynor wrote:
> Update the CI and docs to use DPDK 22.11.5.
> 
> Signed-off-by: Kevin Traynor 
> ---
> v2: no change
> ---
>  .github/workflows/build-and-test.yml | 2 +-
>  Documentation/faq/releases.rst   | 8 
>  Documentation/intro/install/dpdk.rst | 8 
>  NEWS | 2 ++
>  4 files changed, 11 insertions(+), 9 deletions(-)

Acked-by: Ilya Maximets 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 branch-3.1] dpdk: Use DPDK 22.11.5 release for OVS 3.1.

2024-05-28 Thread Ilya Maximets
On 5/28/24 11:25, Kevin Traynor wrote:
> Update the CI and docs to use DPDK 22.11.5.
> 
> Signed-off-by: Kevin Traynor 
> ---
> v2: no change
> ---
>  .github/workflows/build-and-test.yml | 2 +-
>  Documentation/faq/releases.rst   | 6 +++---
>  Documentation/intro/install/dpdk.rst | 8 
>  NEWS | 2 ++
>  4 files changed, 10 insertions(+), 8 deletions(-)

Acked-by: Ilya Maximets 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 branch-3.0] dpdk: Use DPDK 21.11.7 release for OVS 3.0.

2024-05-28 Thread Ilya Maximets
On 5/28/24 11:25, Kevin Traynor wrote:
> Update the CI and docs to use DPDK 21.11.7.
> 
> Signed-off-by: Kevin Traynor 
> ---
> v2: no change
> ---
>  .ci/linux-build.sh   | 2 +-
>  Documentation/faq/releases.rst   | 4 ++--
>  Documentation/intro/install/dpdk.rst | 8 
>  NEWS | 2 ++
>  4 files changed, 9 insertions(+), 7 deletions(-)

Acked-by: Ilya Maximets 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 branch-2.17] dpdk: Use DPDK 21.11.7 release for OVS 2.17.

2024-05-28 Thread Ilya Maximets
On 5/28/24 11:25, Kevin Traynor wrote:
> Update the CI and docs to use DPDK 21.11.7.
> 
> Signed-off-by: Kevin Traynor 
> ---
> v2: no change
> ---
>  .ci/linux-build.sh   | 2 +-
>  Documentation/faq/releases.rst   | 2 +-
>  Documentation/intro/install/dpdk.rst | 8 
>  NEWS | 2 ++
>  4 files changed, 8 insertions(+), 6 deletions(-)

Didn't test, but LGTM.

Acked-by: Ilya Maximets 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH]ipf: Fix ovs ipf crash.

2024-05-28 Thread Ilya Maximets
On 5/23/24 09:40, laixiangwu wrote:
> Description:
> 
> when 1) The fragment timeout is between 15 seconds and 25 seconds; 2) 
> The ipf_list currently has received more than 32 fragments, and there 
> are other fragments of same big packet that have not been received.
> 
> When the above two scenario conditions are met, due to exceeding the 
> capacity of the packet batch(here is 32), ipf_dp_packet_batch_add 
> returns false, and ipf_list will not be cleared. However, the 32 
> fragments packets added to the packet batch will be processed normally. 
> When receiving the subsequent fragments of the ipf_list, because the 
> first 32 fragments have been processed, when processing subsequent 
> fragment packets, relevant information about the processed fragment 
> packets will be read,therefore will occur carsh.
> One solution is do not forward timeout fragment packets from the above 
> scenarios, that is, do not add them to the packet batch, and handle 
> other scenarios according to the original logic.
> Signed-off-by: laixiangwu <15310488...@163.com>
> ---
>  lib/ipf.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)

Hi, laixiangwu.  This version of the patch looks the same as the
previous one here:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/20240522021957.2292-1-15310488...@163.com/

And I see Mike asked a few questions for the approach there.
Could you, please, answer those?

For now, I'll mark this patch with 'Changes Requested'.

If you plan to send a new version based on Mike's comments, please, add
'v6' to the subject prefix, i.e. [PATCH v6], since it's technically a
6th version of it.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


  1   2   3   4   5   6   7   8   9   10   >