[ovs-dev] [PATCH v2] table: Fix freeing global variable.

2024-05-14 Thread Yunjian Wang via dev
From: Pengfei Sun 

In function shash_replace_nocopy, argument to free() is the address
of a global variable (argument passed by function table_print_json__),
which is not memory allocated by malloc().

ovsdb-client -f json monitor Open_vSwitch --timestamp

ASan reports:
=
==1443083==ERROR: AddressSanitizer: attempting free on address
which was not malloc()-ed: 0x00535980 in thread T0
#0 0xaf1c9eac in __interceptor_free (/usr/lib64/libasan.so.6+
0xa4eac)
#1 0x4826e4 in json_destroy_object lib/json.c:445
#2 0x4826e4 in json_destroy__ lib/json.c:403
#3 0x4cc4e4 in table_print lib/table.c:633
#4 0x410650 in monitor_print_table ovsdb/ovsdb-client.c:1019
#5 0x410650 in monitor_print ovsdb/ovsdb-client.c:1040
#6 0x4110cc in monitor_print ovsdb/ovsdb-client.c:1030
#7 0x4110cc in do_monitor__ ovsdb/ovsdb-client.c:1503
#8 0x40743c in main ovsdb/ovsdb-client.c:283
#9 0xaeb50038  (/usr/lib64/libc.so.6+0x2b038)
#10 0xaeb50110 in __libc_start_main (/usr/lib64/libc.so.6+
0x2b110)
#11 0x40906c in _start (/usr/local/bin/ovsdb-client+0x40906c)

Fixes: cb139fa8b3a1 ("table: New function table_format() for formatting a table 
as a string.")
Signed-off-by: Pengfei Sun 
---
V2:
Use json_object_put instead of json_object_put_nocopy
by Ilya Maximets
---
 lib/table.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/table.c b/lib/table.c
index 48d18b6..b7addbf 100644
--- a/lib/table.c
+++ b/lib/table.c
@@ -522,7 +522,7 @@ table_print_json__(const struct table *table, const struct 
table_style *style,
 json_object_put_string(json, "caption", table->caption);
 }
 if (table->timestamp) {
-json_object_put_nocopy(
+json_object_put(
 json, "time",
 json_string_create_nocopy(table_format_timestamp__()));
 }
-- 
2.33.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] vlog: Destroy async_append first then close log_fd.

2024-05-14 Thread hepeng via dev
From: Peng He 

async_append stores log_fd, it should be destructed before log_fd
is closed.

Signed-off-by: Peng He 
---
 lib/vlog.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vlog.c b/lib/vlog.c
index e78c785f7..59b524b09 100644
--- a/lib/vlog.c
+++ b/lib/vlog.c
@@ -411,10 +411,10 @@ vlog_set_log_file__(char *new_log_file_name)
 
 /* Close old log file, if any. */
 ovs_mutex_lock(&log_file_mutex);
+async_append_destroy(log_writer);
 if (log_fd >= 0) {
 close(log_fd);
 }
-async_append_destroy(log_writer);
 free(log_file_name);
 
 /* Install new log file. */
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd, ic: Fix handling of ovn-appctl resume.

2024-05-14 Thread Numan Siddique
On Fri, May 3, 2024 at 2:07 AM Ales Musil  wrote:
>
> On Tue, Apr 23, 2024 at 2:44 PM Xavier Simonart  wrote:
>
> > After ovn-appctl resume was issued for northd or ovn-ic, there was no
> > guarantee that northd or ovn-ic were waking up, potentially handling
> > changes received while they were paused..
> > Usually, poll_block would be woken up by POLLHUP, but race conditions could
> > cause this not to happen.
> > ovn-controller is already properly handling the resume.
> >
> > This caused the following tests to fail sporadically:
> > - ovn-ic -- sync ISB status to INB
> > - propagate Port_Binding.up to NB and OVS.
> >
> > Signed-off-by: Xavier Simonart 
> > ---
> >  ic/ovn-ic.c | 2 +-
> >  northd/ovn-northd.c | 1 +
> >  2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
> > index e947323bf..be23f199d 100644
> > --- a/ic/ovn-ic.c
> > +++ b/ic/ovn-ic.c
> > @@ -2409,7 +2409,7 @@ ovn_ic_resume(struct unixctl_conn *conn, int argc
> > OVS_UNUSED,
> >  {
> >  struct ic_state *state = state_;
> >  state->paused = false;
> > -
> > +poll_immediate_wake();
> >  unixctl_command_reply(conn, NULL);
> >  }
> >
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 3a5544b0c..d71114f35 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -1107,6 +1107,7 @@ ovn_northd_resume(struct unixctl_conn *conn, int
> > argc OVS_UNUSED,
> >  {
> >  struct northd_state *state = state_;
> >  state->paused = false;
> > +poll_immediate_wake();
> >
> >  unixctl_command_reply(conn, NULL);
> >  }
> > --
> > 2.31.1
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
> Looks good to me, thanks.
>
> Acked-by: Ales Musil 

Thanks.  Applied to main and backported until branch-23.03.

Numan

>
> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA 
>
> amu...@redhat.com
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Add lsp option force_fdb_lookup.

2024-05-14 Thread Numan Siddique
On Thu, May 9, 2024 at 2:17 PM Numan Siddique  wrote:
>
> On Thu, May 9, 2024 at 5:54 AM Shibir Basak  wrote:
> >
> > Hi team,
> >
> > Could someone please review these changes?
>
> Sorry for the delay.  I'll take a look and come back.

Thanks for the patch.  I applied it to the main with the below changes.

-
diff --git a/AUTHORS.rst b/AUTHORS.rst
index b4e0a2695d..2f5713d8b6 100644
--- a/AUTHORS.rst
+++ b/AUTHORS.rst
@@ -366,6 +366,7 @@ Shan Wei   davids...@tencent.com
 Sharon Krendel thekaf...@gmail.com
 Shashank Ram   r...@vmware.com
 Shashwat Srivastavashashwat.srivast...@tcs.com
+Shibir-basak   shibir.ba...@nutanix.com
 Shih-Hao Lishi...@vmware.com
 Shu Shen   shu.s...@radisys.com
 Simon Horman   ho...@ovn.org
diff --git a/NEWS b/NEWS
index 3b5e93dc9a..9883fd2ae9 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,9 @@ Post v24.03.0
 external-ids, the option is no longer needed as it became effectively
 "true" for all scenarios.
   - Added DHCPv4 relay support.
+  - A new LSP option "force_fdb_lookup" has been added to ensure the additional
+MAC addresses configured on the LSP with "unknown", are learnt via the
+OVN native FDB.

 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 2e588e28f6..83cdbdd9f0 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2038,6 +2038,12 @@ output;
   also forwarded to the  MC_UNKNOWN multicast group.
 

+
+  The above flow is not added if the logical switch port is of type
+  VIF, has unknown as one of its address and has the
+  option options:force_fdb_lookup set to true.
+
+
 
   For the Ethernet address on a logical switch port of type
   router, when that logical switch port's

-

Numan

>
> Numan
>
> >
> > Thanks,
> > Shibir Basak
> >
> > > On 24-Apr-2024, at 12:15 PM, Shibir Basak  
> > > wrote:
> > >
> > > This option is applicable only if the lsp is of default 'type'
> > > i.e. type=empty_string (which is a VM (or VIF) interface) and the
> > > lsp also has 'unknown' addresses configured.
> > > If lsp option 'force_fdb_lookup' is set to true, mac addresses
> > > of the lsp (if configured) are not installed in the l2 lookup
> > > table of the Logical Switch pipeline. However, MAC addresses
> > > are learnt using the FDB Table.
> > >
> > > Usecase:
> > > =
> > > This option is required to reduce packet loss when VM is being
> > > migrated across AZs (via VXLAN tunnel). lsp is configured in both
> > > AZs on different logical switches which are sharing the same IP
> > > subnet. If the port has unknown address set along with MAC, IP
> > > then, any packet destined to VM's MAC on destination AZ will get
> > > forwarded locally, which may lead to packet loss during VM migration.
> > >
> > > This option is useful to reduce packet loss during VM migration
> > > by forcing the logical switch to learn MAC using FDB.
> > >
> > > The other way to address this issue is to use pkt_clone_type
> > > option but this causes too much of packet duplication when there
> > > are multiple ports with unknown address in the logical switch.
> > >
> > > Signed-off-by: shibir-basak 
> > > ---
> > > northd/northd.c | 16 
> > > ovn-nb.xml  | 11 +++
> > > tests/ovn-northd.at | 38 ++
> > > 3 files changed, 65 insertions(+)
> > >
> > > diff --git a/northd/northd.c b/northd/northd.c
> > > index 37f443e70..c29396716 100644
> > > --- a/northd/northd.c
> > > +++ b/northd/northd.c
> > > @@ -1380,6 +1380,16 @@ lrport_is_enabled(const struct 
> > > nbrec_logical_router_port *lrport)
> > > return !lrport->enabled || *lrport->enabled;
> > > }
> > >
> > > +static bool
> > > +lsp_force_fdb_lookup(const struct ovn_port *op)
> > > +{
> > > +/* To enable FDB Table lookup on a logical switch port, it has to be
> > > + * of 'type' empty_string and "addresses" must have "unknown".
> > > + */
> > > +return !op->nbsp->type[0] && op->has_unknown &&
> > > +smap_get_bool(&op->nbsp->options, "force_fdb_lookup", false);
> > > +}
> > > +
> > > static struct ovn_port *
> > > ovn_port_get_peer(const struct hmap *lr_ports, struct ovn_port *op)
> > > {
> > > @@ -9479,6 +9489,12 @@ build_lswitch_ip_unicast_lookup(struct ovn_port 
> > > *op,
> > >
> > > if (ovs_scan(op->nbsp->addresses[i],
> > > ETH_ADDR_SCAN_FMT, ETH_ADDR_SCAN_ARGS(mac))) {
> > > +/* Skip adding flows related to the MAC address
> > > + * as force FDB Lookup is enabled on the lsp.

Re: [ovs-dev] [PATCH ovn] Do not reply on unicast arps for targets.

2024-05-14 Thread Numan Siddique
On Mon, May 13, 2024 at 9:00 AM Dumitru Ceara  wrote:
>
> On 5/4/24 11:38, Vasyl Saienko wrote:
> > Hello Numan
> >
> > Thanks for review and feedback.
> >
> > On Fri, May 3, 2024 at 7:14 PM Numan Siddique  wrote:
> >
> >> On Mon, Apr 22, 2024 at 2:54 AM Vasyl Saienko 
> >> wrote:
> >>>
> >>> Reply only if target ethernet address is broadcast, if
> >>> address is specified explicitly do noting to let target
> >>> reply by itself. This technique allows to monitor target
> >>> aliveness with arping.
> >>
> >> Thanks for the patch.
> >>
> >> This patch does change the behavior of OVN.  Having ARP responder
> >> flows avoids tunnelling the request packet if the destination is on a
> >> different compute node.
> >> But I don't think its a big deal.
> >>
> >> You are totally correct that the ARP responder allows limiting ARP
> > broadcast traffic between nodes. Normally ARP requests are sent to
> > broadcast ff:ff:ff:ff:ff:ff ethernet, as the protocol is designed to
> > identify ethernet addresses for known IP addresses. In this case since
> > traffic is broadcast it will be flooded among all nodes if ARP responder is
> > not applied. At the same time the client may specify the target
> > ethernet address as unicast (in this case a broadcast storm will not
> > happen, as the destination ethernet address is unicast).
> >
> > In OpenStack we use unicast ARP requests as a way to monitor VM aliveness
> > from remote locations to make sure there are no issues with
> > flows/underlying networking infra between nodes. We use ARPs due to several
> > reasons:
> > 1. This protocol is mandatory and can't be disabled on the target machine
> > (which guarantees that the target VM will always reply to ARPs if it is
> > alive)
> > 2. This protocol is not filtered by firewalls/security groups as it is
> > mandatory for network workability
> > 3. We can't use upper layers protocols such as ICMP as they will require
> > specific firewall rules inside VMs, and we do not control VMs in cloud
> > cases, but still need to monitor infrastructure aliveness.
> >
> > If ARP responder replies to requests with broadcast and unicast target
> > ethernet address, we can't use this technique for monitoring, as even if
> > target VM is down (not necessary that ovn port is down, the VM may stuck or
> > be in kernel panic for example, or datapath between monitoring tool and VM
> > is broken) the ARP responder will do reply to our request so we can't
> > identify if VM is really accessible or not as we will always got replies
> > from local ARP responder. At the same time there is no need to set up ARP
> > responder for requests with a unicast ethernet address, as they will not
> > generate broadcast storms by design (they are unicasts).
> >
>
> Right, this sounds acceptable to me too.
>
> >
> >> 1.  Accept your patch
> >> 2.  Use the NB Global option - "ignore_lsp_down" and set it to false,
> >> so that ovn-northd will install the ARP responder flows only if the
> >> logical port is up.
> >>
> > This option will not allow catching a situation when there is an issue in
> > datapath between monitoring software and VM located on another node, for
> > example  issues with flows or some underlying networking problems.
> >
> >  Have you considered this approach for your use case ?
> >> 3.  Add another global option - "disable_arp_responder" which when set
> >> to true, ovn-northd will not install ARP responder flows.
> >>
> >> Disabling ARP responder for ports completely will allow broadcast storms
> > for ARP requests. Which we would like to avoid, but enable the possibility
> > to monitor infrastructure aliveness.
> >
> >
> >> Given that OpenStack neutron ml2ovs uses ARP responder flows only for
> >> broad cast my preference would be (1).
> >>
> >> I'd like to know other's opinion on this.
> >>
>
> I'd accept the patch too so I'd go for (1).
>
> Regards,
> Dumitru
>
> >> Thanks
> >> Numan
> >>
> >>>
> >>> Closes  #239
> >>>
> >>> Signed-off-by: Vasyl Saienko 
> >>> ---
> >>>  northd/northd.c | 11 +--
> >>>  tests/ovn-northd.at |  4 ++--
> >>>  2 files changed, 11 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/northd/northd.c b/northd/northd.c
> >>> index 37f443e70..e80e1885d 100644
> >>> --- a/northd/northd.c
> >>> +++ b/northd/northd.c
> >>> @@ -8832,8 +8832,15 @@ build_lswitch_arp_nd_responder_known_ips(struct
> >> ovn_port *op,
> >>>  for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >>>  for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
> >>>  ds_clear(match);
> >>> -ds_put_format(match, "arp.tpa == %s && arp.op == 1",
> >>> -op->lsp_addrs[i].ipv4_addrs[j].addr_s);
> >>> +/* NOTE(vsaienko): Do not reply on unicast ARPs, forward
> >>> + * them to the target to have ability to monitor target
> >>> + * aliveness via ARPs.
> >>> +*/
> >>> +ds_put_format(match,

Re: [ovs-dev] [PATCH ovn] controller: Store src_mac, src_ip in svc_monitor struct.

2024-05-14 Thread Vladislav Odintsov
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-April/413198.html

> On 14 May 2024, at 22:25, Vladislav Odintsov  wrote:
> 
> These structure members are read in pinctrl_handler() in a separare thread.
> This is unsafe: when IDL is re-connected or some IDL objects are freed
> after svc_monitors list with svc_monitor structures, which point to
> sbrec_service_monitor is initialized, sb_svc_mon structure property can
> point to wrong address, which then leads to segmentation fault in
> svc_monitor_send_tcp_health_check__() and
> svc_monitor_send_udp_health_check() on accessing svc_mon->sb_svc_mon.
> 
> Imagine situation:
> 
> Main ovn-controller thread:
> 1. Connects to SB and fills IDL with DB contents.
> 2. run pinctrl_run() with pinctrl mutex and sync_svc_monitors() as a part
>   of it.
> 3. Release mutex.
> 
> ...
> 4. Loss of OVNSB connection for any reason (network outage/inactivity probe
>   timeout/etc), start new main-loop iteration, re-initialize IDL in
>   ovsdb_idl_run() (which probably will destroy previous IDL objects).
> ...
> 
> pinctrl thread:
> 5. Awake from poll_block().
> ... run new iteration in its main loop ...
> 6. Aqcuire mutex lock.
> 7. Run svc_monitors_run(), run svc_monitor_send_tcp_health_check__() or
>   svc_monitor_send_udp_health_check(), which try to access IDL objects
>   with outdated pointers.
> 
> 8. Segmentation fault:
> 
> Stack trace of thread 212406:
>>> __GI_strlen (libc.so.6)
>>> inet_pton (libc.so.6)
>>> ip_parse (ovn-controller)
>>> svc_monitor_send_tcp_health_check__.part.37 (ovn-controller)
>>> svc_monitor_send_health_check (ovn-controller)
>>> pinctrl_handler (ovn-controller)
>>> ovsthread_wrapper (ovn-controller)
>>> start_thread (libpthread.so.0)
>>> __clone (libc.so.6)
> 
> This patch removes unsafe access to IDL objects from pinctrl thread.
> New svc_monitor structure members are used to store needed data.
> 
> CC: Numan Siddique 
> Fixes: 8be01f4a5329 ("Send service monitor health checks")
> Signed-off-by: Vladislav Odintsov 
> ---
> controller/pinctrl.c | 43 ++-
> 1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 6a2c3dc68..b843edb35 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -7307,6 +7307,9 @@ struct svc_monitor {
> long long int timestamp;
> bool is_ip6;
> 
> +struct eth_addr src_mac;
> +struct in6_addr src_ip;
> +
> long long int wait_time;
> long long int next_send_time;
> 
> @@ -7501,6 +7504,15 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
> svc_mon->n_success = 0;
> svc_mon->n_failures = 0;
> 
> +eth_addr_from_string(sb_svc_mon->src_mac, &svc_mon->src_mac);
> +if (is_ipv4) {
> +ovs_be32 ip4_src;
> +ip_parse(sb_svc_mon->src_ip, &ip4_src);
> +svc_mon->src_ip = in6_addr_mapped_ipv4(ip4_src);
> +} else {
> +ipv6_parse(sb_svc_mon->src_ip, &svc_mon->src_ip);
> +}
> +
> hmap_insert(&svc_monitors_map, &svc_mon->hmap_node, hash);
> ovs_list_push_back(&svc_monitors, &svc_mon->list_node);
> changed = true;
> @@ -8259,19 +8271,14 @@ svc_monitor_send_tcp_health_check__(struct rconn 
> *swconn,
> struct dp_packet packet;
> dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
> 
> -struct eth_addr eth_src;
> -eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, ð_src);
> if (svc_mon->is_ip6) {
> -struct in6_addr ip6_src;
> -ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
> -pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
> - &ip6_src, &svc_mon->ip, IPPROTO_TCP,
> +pinctrl_compose_ipv6(&packet, svc_mon->src_mac, svc_mon->ea,
> + &svc_mon->src_ip, &svc_mon->ip, IPPROTO_TCP,
>  63, TCP_HEADER_LEN);
> } else {
> -ovs_be32 ip4_src;
> -ip_parse(svc_mon->sb_svc_mon->src_ip, &ip4_src);
> -pinctrl_compose_ipv4(&packet, eth_src, svc_mon->ea,
> - ip4_src, in6_addr_get_mapped_ipv4(&svc_mon->ip),
> +pinctrl_compose_ipv4(&packet, svc_mon->src_mac, svc_mon->ea,
> + in6_addr_get_mapped_ipv4(&svc_mon->src_ip),
> + in6_addr_get_mapped_ipv4(&svc_mon->ip),
>  IPPROTO_TCP, 63, TCP_HEADER_LEN);
> }
> 
> @@ -8327,24 +8334,18 @@ svc_monitor_send_udp_health_check(struct rconn 
> *swconn,
>   struct svc_monitor *svc_mon,
>   ovs_be16 udp_src)
> {
> -struct eth_addr eth_src;
> -eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, ð_src);
> -
> uint64_t packet_stub[128 / 8];
> struct dp_packet packet;
> dp_packet_use_stub(&packet, packet_stub, sizeof packet

[ovs-dev] [PATCH ovn] controller: Store src_mac, src_ip in svc_monitor struct.

2024-05-14 Thread Vladislav Odintsov
These structure members are read in pinctrl_handler() in a separare thread.
This is unsafe: when IDL is re-connected or some IDL objects are freed
after svc_monitors list with svc_monitor structures, which point to
sbrec_service_monitor is initialized, sb_svc_mon structure property can
point to wrong address, which then leads to segmentation fault in
svc_monitor_send_tcp_health_check__() and
svc_monitor_send_udp_health_check() on accessing svc_mon->sb_svc_mon.

Imagine situation:

Main ovn-controller thread:
1. Connects to SB and fills IDL with DB contents.
2. run pinctrl_run() with pinctrl mutex and sync_svc_monitors() as a part
   of it.
3. Release mutex.

...
4. Loss of OVNSB connection for any reason (network outage/inactivity probe
   timeout/etc), start new main-loop iteration, re-initialize IDL in
   ovsdb_idl_run() (which probably will destroy previous IDL objects).
...

pinctrl thread:
5. Awake from poll_block().
... run new iteration in its main loop ...
6. Aqcuire mutex lock.
7. Run svc_monitors_run(), run svc_monitor_send_tcp_health_check__() or
   svc_monitor_send_udp_health_check(), which try to access IDL objects
   with outdated pointers.

8. Segmentation fault:

Stack trace of thread 212406:
>> __GI_strlen (libc.so.6)
>> inet_pton (libc.so.6)
>> ip_parse (ovn-controller)
>> svc_monitor_send_tcp_health_check__.part.37 (ovn-controller)
>> svc_monitor_send_health_check (ovn-controller)
>> pinctrl_handler (ovn-controller)
>> ovsthread_wrapper (ovn-controller)
>> start_thread (libpthread.so.0)
>> __clone (libc.so.6)

This patch removes unsafe access to IDL objects from pinctrl thread.
New svc_monitor structure members are used to store needed data.

CC: Numan Siddique 
Fixes: 8be01f4a5329 ("Send service monitor health checks")
Signed-off-by: Vladislav Odintsov 
---
 controller/pinctrl.c | 43 ++-
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 6a2c3dc68..b843edb35 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -7307,6 +7307,9 @@ struct svc_monitor {
 long long int timestamp;
 bool is_ip6;
 
+struct eth_addr src_mac;
+struct in6_addr src_ip;
+
 long long int wait_time;
 long long int next_send_time;
 
@@ -7501,6 +7504,15 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
 svc_mon->n_success = 0;
 svc_mon->n_failures = 0;
 
+eth_addr_from_string(sb_svc_mon->src_mac, &svc_mon->src_mac);
+if (is_ipv4) {
+ovs_be32 ip4_src;
+ip_parse(sb_svc_mon->src_ip, &ip4_src);
+svc_mon->src_ip = in6_addr_mapped_ipv4(ip4_src);
+} else {
+ipv6_parse(sb_svc_mon->src_ip, &svc_mon->src_ip);
+}
+
 hmap_insert(&svc_monitors_map, &svc_mon->hmap_node, hash);
 ovs_list_push_back(&svc_monitors, &svc_mon->list_node);
 changed = true;
@@ -8259,19 +8271,14 @@ svc_monitor_send_tcp_health_check__(struct rconn 
*swconn,
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
 
-struct eth_addr eth_src;
-eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, ð_src);
 if (svc_mon->is_ip6) {
-struct in6_addr ip6_src;
-ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
-pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
- &ip6_src, &svc_mon->ip, IPPROTO_TCP,
+pinctrl_compose_ipv6(&packet, svc_mon->src_mac, svc_mon->ea,
+ &svc_mon->src_ip, &svc_mon->ip, IPPROTO_TCP,
  63, TCP_HEADER_LEN);
 } else {
-ovs_be32 ip4_src;
-ip_parse(svc_mon->sb_svc_mon->src_ip, &ip4_src);
-pinctrl_compose_ipv4(&packet, eth_src, svc_mon->ea,
- ip4_src, in6_addr_get_mapped_ipv4(&svc_mon->ip),
+pinctrl_compose_ipv4(&packet, svc_mon->src_mac, svc_mon->ea,
+ in6_addr_get_mapped_ipv4(&svc_mon->src_ip),
+ in6_addr_get_mapped_ipv4(&svc_mon->ip),
  IPPROTO_TCP, 63, TCP_HEADER_LEN);
 }
 
@@ -8327,24 +8334,18 @@ svc_monitor_send_udp_health_check(struct rconn *swconn,
   struct svc_monitor *svc_mon,
   ovs_be16 udp_src)
 {
-struct eth_addr eth_src;
-eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, ð_src);
-
 uint64_t packet_stub[128 / 8];
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
 
 if (svc_mon->is_ip6) {
-struct in6_addr ip6_src;
-ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
-pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
- &ip6_src, &svc_mon->ip, IPPROTO_UDP,
+pinctrl_compose_ipv6(&packet, svc_mon->src_mac, svc_mon->ea,
+  

[ovs-dev] [PATCH ovn] northd: Skip arp-proxy flows if the lsp is a router port.

2024-05-14 Thread Lorenzo Bianconi
Skip proxy-arp logical flows for traffic that is entering the logical
switch pipeline from a lsp of type router. This patch will avoid
recirculating back the traffic entering  by the logical router
pipeline if proxy-arp hasn been configured by the CMS.

Reported-at: https://issues.redhat.com/browse/FDP-96
Signed-off-by: Lorenzo Bianconi 
---
 northd/northd.c | 15 +--
 tests/ovn.at|  8 
 tests/system-ovn.at | 22 +-
 3 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 0cabda7ea..29dc58ef4 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -118,6 +118,7 @@ static bool default_acl_drop;
 #define REGBIT_PORT_SEC_DROP  "reg0[15]"
 #define REGBIT_ACL_STATELESS  "reg0[16]"
 #define REGBIT_ACL_HINT_ALLOW_REL "reg0[17]"
+#define REGBIT_FROM_ROUTER_PORT   "reg0[18]"
 
 #define REG_ORIG_DIP_IPV4 "reg1"
 #define REG_ORIG_DIP_IPV6 "xxreg1"
@@ -5785,6 +5786,13 @@ build_lswitch_port_sec_op(struct ovn_port *op, struct 
lflow_table *lflows,
 &op->od->localnet_ports[0]->nbsp->header_,
 op->lflow_ref);
 }
+} else if (lsp_is_router(op->nbsp)) {
+ds_put_format(actions, REGBIT_FROM_ROUTER_PORT" = 1; next;");
+ovn_lflow_add_with_lport_and_hint(lflows, op->od,
+  S_SWITCH_IN_CHECK_PORT_SEC, 70,
+  ds_cstr(match), ds_cstr(actions),
+  op->key, &op->nbsp->header_,
+  op->lflow_ref);
 }
 }
 
@@ -9051,7 +9059,9 @@ build_lswitch_arp_nd_responder_known_ips(struct ovn_port 
*op,
 if (op->proxy_arp_addrs.n_ipv4_addrs) {
 /* Match rule on all proxy ARP IPs. */
 ds_clear(match);
-ds_put_cstr(match, "arp.op == 1 && arp.tpa == {");
+ds_put_cstr(match,
+REGBIT_FROM_ROUTER_PORT" == 0 "
+"&& arp.op == 1 && arp.tpa == {");
 
 for (i = 0; i < op->proxy_arp_addrs.n_ipv4_addrs; i++) {
 ds_put_format(match, "%s/%u,",
@@ -9105,7 +9115,8 @@ build_lswitch_arp_nd_responder_known_ips(struct ovn_port 
*op,
 ds_truncate(&nd_target_match, nd_target_match.length - 2);
 ds_clear(match);
 ds_put_format(match,
-  "nd_ns "
+  REGBIT_FROM_ROUTER_PORT" == 0 "
+  "&& nd_ns "
   "&& ip6.dst == { %s } "
   "&& nd.target == { %s }",
   ds_cstr(&ip6_dst_match),
diff --git a/tests/ovn.at b/tests/ovn.at
index 486680649..e419516a7 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -32640,7 +32640,7 @@ AT_CHECK([ovn-sbctl dump-flows |
   grep ls_in_arp_rsp |
   grep "${arp_proxy_ls1[[1]]}" |
   ovn_strip_lflows], [0], [dnl
-  table=??(ls_in_arp_rsp  ), priority=30   , match=(arp.op == 1 && dnl
+  table=??(ls_in_arp_rsp  ), priority=30   , match=(reg0[[18]] == 0 && 
arp.op == 1 && dnl
 arp.tpa == {169.254.238.0/24,169.254.239.2/32}), dnl
 action=(eth.dst = eth.src; eth.src = 00:00:00:01:02:f1; arp.op = 2; dnl
 /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:01:02:f1; dnl
@@ -32653,7 +32653,7 @@ AT_CHECK([ovn-sbctl dump-flows |
   grep "${arp_proxy_ls1[[3]]}" |
   ovn_strip_lflows], [0], [dnl
   table=??(ls_in_arp_rsp  ), priority=30   , dnl
-match=(nd_ns && ip6.dst == { fd7b:6b4d:7b25:d22d::/64, ff02::1:ff00:0/64, dnl
+match=(reg0[[18]] == 0 && nd_ns && ip6.dst == { fd7b:6b4d:7b25:d22d::/64, 
ff02::1:ff00:0/64, dnl
 fd7b:6b4d:7b25:d22f::1/128, ff02::1:ff00:1/128 } && dnl
 nd.target == { fd7b:6b4d:7b25:d22d::/64, fd7b:6b4d:7b25:d22f::1/128 }), dnl
 action=(nd_na_router { eth.src = 00:00:00:01:02:f1; ip6.src = nd.target; dnl
@@ -32667,7 +32667,7 @@ AT_CHECK([ovn-sbctl dump-flows |
   grep "${arp_proxy_ls2[[2]]}" |
   ovn_strip_lflows], [0], [dnl
   table=??(ls_in_arp_rsp  ), priority=30   , dnl
-match=(arp.op == 1 && arp.tpa == {169.254.236.0/24,169.254.237.2/32}), dnl
+match=(reg0[[18]] == 0 && arp.op == 1 && arp.tpa == 
{169.254.236.0/24,169.254.237.2/32}), dnl
 action=(eth.dst = eth.src; eth.src = 00:00:00:02:02:f1; arp.op = 2; dnl
 /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:02:02:f1; dnl
 arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
@@ -32679,7 +32679,7 @@ AT_CHECK([ovn-sbctl dump-flows |
   grep "${arp_proxy_ls2[[4]]}" |
   ovn_strip_lflows], [0], [dnl
   table=??(ls_in_arp_rsp  ), priority=30   , dnl
-match=(nd_ns && ip6.dst == { fd7b:6b4d:7b25:d22b::/64, ff02::1:ff00:0/64, dnl
+match=(reg0[[18]] == 0 && nd_ns && ip6.dst == { fd7b:6b4d:7b25:d22b::/64, 
ff02::1:ff00:0/64, dnl
 fd7b:6b4d:7b25:d22c::1/128, ff02::1:ff00:1/128 } && dnl
 nd.target == { fd7b:6b4d:7b25:d22b::/64, fd7b:6b4d:

[ovs-dev] [v2] odp-execute: Fix AVX checksum calculation.

2024-05-14 Thread Emma Finn
The AVX implementation for calcualting checksums was not
handling carry-over addition correctly in some cases.
This patch adds an additional shuffle to add 16-bit padding
to the final part of the calculation to handle such cases.
This commit also adds a unit test to fuzz test the actions
autovalidator.

Signed-off-by: Emma Finn 
Reported-by: Eelco Chaudron 
---
 lib/odp-execute-avx512.c |  5 +
 tests/dpif-netdev.at | 26 ++
 2 files changed, 31 insertions(+)

diff --git a/lib/odp-execute-avx512.c b/lib/odp-execute-avx512.c
index 50c48bfd4..a74a85dc1 100644
--- a/lib/odp-execute-avx512.c
+++ b/lib/odp-execute-avx512.c
@@ -366,6 +366,8 @@ avx512_get_delta(__m256i old_header, __m256i new_header)
   0xF, 0xF, 0xF, 0xF);
 v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
 
+v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
+v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a);
 v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
 v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
 
@@ -575,6 +577,9 @@ avx512_ipv6_sum_header(__m512i ip6_header)
   0xF, 0xF, 0xF, 0xF);
 
 v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
+
+v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
+v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a);
 v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
 v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
 
diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
index 790b5a43a..4db6a99e1 100644
--- a/tests/dpif-netdev.at
+++ b/tests/dpif-netdev.at
@@ -1091,3 +1091,29 @@ OVS_VSWITCHD_STOP(["dnl
 /Error: unknown miniflow extract implementation superstudy./d
 /Error: invalid study_pkt_cnt value: -pmd./d"])
 AT_CLEANUP
+
+AT_SETUP([datapath - Actions Autovalidator Fuzzy])
+AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
+AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 2000 > packets])
+
+OVS_VSWITCHD_START(add-port br0 p0 -- set Interface p0 type=dummy \
+   -- add-port br0 p1 -- set Interface p1 type=dummy)
+
+AT_CHECK([ovs-appctl odp-execute/action-impl-set autovalidator], [0], [dnl
+Action implementation set to autovalidator.
+])
+
+AT_DATA([flows.txt], [dnl
+  in_port=p0,ip,actions=mod_nw_src=10.1.1.1,p1
+  in_port=p0,ipv6,actions=set_field:fc00::100->ipv6_src,p1
+])
+
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl -Oopenflow13 add-flows br0 flows.txt])
+
+cat packets | while read line; do
+  AT_CHECK([ovs-appctl netdev-dummy/receive p0 $line], [0], [ignore])
+done
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 5/7] ci: Add missing packages to run Fedora image in GH CI.

2024-05-14 Thread Ales Musil
On Tue, May 14, 2024 at 3:19 PM Ilya Maximets  wrote:

> On 5/14/24 10:38, Ales Musil wrote:
> > There were two things missing for the Fedora builds, 32-bit
> > version of glibc to allows the -m32 compilation on Fedora
> > and numactl-devel package.
> >
> > Signed-off-by: Ales Musil 
> > ---
> >  .ci/linux-build.sh | 3 +++
> >  utilities/containers/fedora/Dockerfile | 1 +
> >  2 files changed, 4 insertions(+)
> >
> > diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> > index 78f17f8bd..12966f532 100755
> > --- a/.ci/linux-build.sh
> > +++ b/.ci/linux-build.sh
> > @@ -83,6 +83,9 @@ function configure_gcc()
> >  # do it directly because gcc-multilib is not available
> >  # for arm64
> >  sudo apt update && sudo apt install -y gcc-multilib
> > +elif which dnf; then
> > +# Install equivalent of gcc-multilib for Fedora.
> > +sudo dnf -y update && sudo dnf -y install glibc-devel.i686
>
> dnf always refreshes package cache.  'dnf update' will actually update
> all the packages to the latest versions.  I'm not sure it is an intended
> behavior here.
>

Yeah good point, we shouldn't update the packages just install the missing
one. So only the install part is needed, I'll wait for other reviews before
posting v2.


>
> >  fi
> >  fi
> >  }
> > diff --git a/utilities/containers/fedora/Dockerfile
> b/utilities/containers/fedora/Dockerfile
> > index 9b8386aae..d40a7b31f 100755
> > --- a/utilities/containers/fedora/Dockerfile
> > +++ b/utilities/containers/fedora/Dockerfile
> > @@ -28,6 +28,7 @@ RUN dnf -y update \
> >  libtool \
> >  net-tools \
> >  nmap-ncat \
> > +numactl-devel \
> >  openssl \
> >  openssl-devel \
> >  procps-ng \
>
>
Thanks,
Ales
-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 5/7] ci: Add missing packages to run Fedora image in GH CI.

2024-05-14 Thread Ilya Maximets
On 5/14/24 10:38, Ales Musil wrote:
> There were two things missing for the Fedora builds, 32-bit
> version of glibc to allows the -m32 compilation on Fedora
> and numactl-devel package.
> 
> Signed-off-by: Ales Musil 
> ---
>  .ci/linux-build.sh | 3 +++
>  utilities/containers/fedora/Dockerfile | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> index 78f17f8bd..12966f532 100755
> --- a/.ci/linux-build.sh
> +++ b/.ci/linux-build.sh
> @@ -83,6 +83,9 @@ function configure_gcc()
>  # do it directly because gcc-multilib is not available
>  # for arm64
>  sudo apt update && sudo apt install -y gcc-multilib
> +elif which dnf; then
> +# Install equivalent of gcc-multilib for Fedora.
> +sudo dnf -y update && sudo dnf -y install glibc-devel.i686

dnf always refreshes package cache.  'dnf update' will actually update
all the packages to the latest versions.  I'm not sure it is an intended
behavior here.

>  fi
>  fi
>  }
> diff --git a/utilities/containers/fedora/Dockerfile 
> b/utilities/containers/fedora/Dockerfile
> index 9b8386aae..d40a7b31f 100755
> --- a/utilities/containers/fedora/Dockerfile
> +++ b/utilities/containers/fedora/Dockerfile
> @@ -28,6 +28,7 @@ RUN dnf -y update \
>  libtool \
>  net-tools \
>  nmap-ncat \
> +numactl-devel \
>  openssl \
>  openssl-devel \
>  procps-ng \

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] utilities: Correct deletion reason in flow_reval_monitor.py.

2024-05-14 Thread Eelco Chaudron



On 13 May 2024, at 17:02, Ilya Maximets wrote:

> On 5/8/24 11:19, Eelco Chaudron wrote:
>> The flow_reval_monitor.py script incorrectly reported the reasons for
>> FDR_PURGE and FDR_TOO_EXPENSIVE, as their descriptions were swapped.
>> This patch rectifies the order using a dictionary to avoid similar
>> problems in the future.
>>
>> In addition this patch also syncs the delete reason output of the
>> script, with the comments in the code.
>>
>> Fixes: 86b9e653ef22 ("revalidator: Add a USDT probe during flow deletion 
>> with purge reason.")
>> Signed-off-by: Eelco Chaudron 
>>
>> ---
>> v2: - Converted the list of strings to dictionary.
>> - Added comment to code to keep code and script in sync.
>> - Unified flow_delete reason comments and script output.
>> ---
>>  ofproto/ofproto-dpif-upcall.c| 25 ---
>>  utilities/usdt-scripts/flow_reval_monitor.py | 32 ++--
>>  2 files changed, 30 insertions(+), 27 deletions(-)
>>
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index 73901b651..e4d348985 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -1,3 +1,4 @@
>> +
>>  /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
>>   *
>>   * Licensed under the Apache License, Version 2.0 (the "License");
>> @@ -270,18 +271,20 @@ enum ukey_state {
>>  };
>>  #define N_UKEY_STATES (UKEY_DELETED + 1)
>>
>> +/* Ukey delete reasons used by USDT probes.  Please keep in sync with the
>> + * definition in utilities/usdt-scripts/flow_reval_monitor.py.  */
>>  enum flow_del_reason {
>> -FDR_NONE = 0,   /* No deletion reason for the flow. */
>> -FDR_AVOID_CACHING,  /* Flow deleted to avoid caching. */
>> -FDR_BAD_ODP_FIT,/* The flow had a bad ODP flow fit. */
>> -FDR_FLOW_IDLE,  /* The flow went unused and was deleted. */
>> -FDR_FLOW_LIMIT, /* All flows being killed. */
>> -FDR_FLOW_WILDCARDED,/* The flow needed a narrower wildcard mask. */
>> -FDR_NO_OFPROTO, /* The flow didn't have an associated ofproto. 
>> */
>> -FDR_PURGE,  /* User action caused flows to be killed. */
>> -FDR_TOO_EXPENSIVE,  /* The flow was too expensive to revalidate. */
>> -FDR_UPDATE_FAIL,/* Flow state transition was unexpected. */
>> -FDR_XLATION_ERROR,  /* There was an error translating the flow. */
>> +FDR_NONE = 0,   /* No delete reason specified. */
>> +FDR_AVOID_CACHING,  /* Cache avoidance flag set. */
>> +FDR_BAD_ODP_FIT,/* Bad ODP flow fit. */
>> +FDR_FLOW_IDLE,  /* Flow idle timeout. */
>> +FDR_FLOW_LIMIT, /* Kill all flows condition reached. */
>> +FDR_FLOW_WILDCARDED,/* Flow needs a narrower wildcard mask. */
>> +FDR_NO_OFPROTO, /* Flow lacks ofproto dpif association. */
>> +FDR_PURGE,  /* User requested flow deletion. */
>> +FDR_TOO_EXPENSIVE,  /* Too expensive to revalidate. */
>> +FDR_UPDATE_FAIL,/* Datapath update failed. */
>> +FDR_XLATION_ERROR,  /* Flow translation error. */
>>  };
>>
>>  /* 'udpif_key's are responsible for tracking the little bit of state udpif
>> diff --git a/utilities/usdt-scripts/flow_reval_monitor.py 
>> b/utilities/usdt-scripts/flow_reval_monitor.py
>> index 534ba8fa2..bc3a28443 100755
>> --- a/utilities/usdt-scripts/flow_reval_monitor.py
>> +++ b/utilities/usdt-scripts/flow_reval_monitor.py
>> @@ -254,19 +254,19 @@ FdrReasons = IntEnum(
>>  start=0,
>>  )
>>
>> -FdrReasonStrings = [
>> -"No deletion reason",
>> -"Cache avoidance flag set",
>> -"Bad ODP flow fit",
>> -"Idle flow timed out",
>> -"Kill all flows condition detected",
>> -"Mask too wide - need narrower match",
>> -"No matching ofproto rules",
>> -"Too expensive to revalidate",
>> -"Purged with user action",
>> -"Flow state inconsistent after updates",
>> -"Flow translation error",
>> -]
>
> Should we also have a comment here describing from where these are
> coming from?
>
>> +FdrReasonStrings = {
>> +FdrReasons.FDR_NONE: "No delete reason specified",
>> +FdrReasons.FDR_AVOID_CACHING: "Cache avoidance flag set",
>> +FdrReasons.FDR_BAD_ODP_FIT: "Bad ODP flow fit",
>> +FdrReasons.FDR_FLOW_IDLE: "Flow idle timeout",
>> +FdrReasons.FDR_FLOW_LIMIT: "Kill all flows condition reached",
>> +FdrReasons.FDR_FLOW_WILDCARDED: "Flow needs a narrower wildcard mask",
>> +FdrReasons.FDR_NO_OFPROTO: "Flow lacks ofproto dpif association",
>
> Can we rename this one into "Bridge not found" maybe?
>
> 'ofproto' and 'dpif' are not user-friendly words.  'ofproto' is an entirely
> internal concept.

ACK on both. Just sent out a v3.

//Eelco

>> +FdrReasons.FDR_PURGE: "User requested flow deletion",
>> +FdrReasons.FDR_TOO_EXPENSIVE: "Too expensive to revalidate",
>> +FdrReasons.FDR_UPDATE_FAIL: "Datapa

[ovs-dev] [PATCH v3] utilities: Correct deletion reason in flow_reval_monitor.py.

2024-05-14 Thread Eelco Chaudron
The flow_reval_monitor.py script incorrectly reported the reasons for
FDR_PURGE and FDR_TOO_EXPENSIVE, as their descriptions were swapped.
This patch rectifies the order using a dictionary to avoid similar
problems in the future.

In addition this patch also syncs the delete reason output of the
script, with the comments in the code.

Fixes: 86b9e653ef22 ("revalidator: Add a USDT probe during flow deletion with 
purge reason.")
Signed-off-by: Eelco Chaudron 

---
v3: - Renamed ofproto dpif to bridge in delete reasons.
- Added comment pointing back to delete reasons in .c.
v2: - Converted the list of strings to dictionary.
- Added comment to code to keep code and script in sync.
- Unified flow_delete reason comments and script output.
---
 ofproto/ofproto-dpif-upcall.c| 24 +++--
 utilities/usdt-scripts/flow_reval_monitor.py | 37 +++-
 2 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 73901b651..83609ec62 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -270,18 +270,20 @@ enum ukey_state {
 };
 #define N_UKEY_STATES (UKEY_DELETED + 1)
 
+/* Ukey delete reasons used by USDT probes.  Please keep in sync with the
+ * definition in utilities/usdt-scripts/flow_reval_monitor.py.  */
 enum flow_del_reason {
-FDR_NONE = 0,   /* No deletion reason for the flow. */
-FDR_AVOID_CACHING,  /* Flow deleted to avoid caching. */
-FDR_BAD_ODP_FIT,/* The flow had a bad ODP flow fit. */
-FDR_FLOW_IDLE,  /* The flow went unused and was deleted. */
-FDR_FLOW_LIMIT, /* All flows being killed. */
-FDR_FLOW_WILDCARDED,/* The flow needed a narrower wildcard mask. */
-FDR_NO_OFPROTO, /* The flow didn't have an associated ofproto. */
-FDR_PURGE,  /* User action caused flows to be killed. */
-FDR_TOO_EXPENSIVE,  /* The flow was too expensive to revalidate. */
-FDR_UPDATE_FAIL,/* Flow state transition was unexpected. */
-FDR_XLATION_ERROR,  /* There was an error translating the flow. */
+FDR_NONE = 0,   /* No delete reason specified. */
+FDR_AVOID_CACHING,  /* Cache avoidance flag set. */
+FDR_BAD_ODP_FIT,/* Bad ODP flow fit. */
+FDR_FLOW_IDLE,  /* Flow idle timeout. */
+FDR_FLOW_LIMIT, /* Kill all flows condition reached. */
+FDR_FLOW_WILDCARDED,/* Flow needs a narrower wildcard mask. */
+FDR_NO_OFPROTO, /* Bridge not found. */
+FDR_PURGE,  /* User requested flow deletion. */
+FDR_TOO_EXPENSIVE,  /* Too expensive to revalidate. */
+FDR_UPDATE_FAIL,/* Datapath update failed. */
+FDR_XLATION_ERROR,  /* Flow translation error. */
 };
 
 /* 'udpif_key's are responsible for tracking the little bit of state udpif
diff --git a/utilities/usdt-scripts/flow_reval_monitor.py 
b/utilities/usdt-scripts/flow_reval_monitor.py
index 534ba8fa2..28479a565 100755
--- a/utilities/usdt-scripts/flow_reval_monitor.py
+++ b/utilities/usdt-scripts/flow_reval_monitor.py
@@ -236,6 +236,11 @@ RevalResult = IntEnum(
 ],
 start=0,
 )
+
+#
+# The below FdrReasons and FdrReasonStrings definitions can be found in the
+# ofproto/ofproto-dpif-upcall.c file.  Please keep them in sync.
+#
 FdrReasons = IntEnum(
 "flow_del_reason",
 [
@@ -254,19 +259,19 @@ FdrReasons = IntEnum(
 start=0,
 )
 
-FdrReasonStrings = [
-"No deletion reason",
-"Cache avoidance flag set",
-"Bad ODP flow fit",
-"Idle flow timed out",
-"Kill all flows condition detected",
-"Mask too wide - need narrower match",
-"No matching ofproto rules",
-"Too expensive to revalidate",
-"Purged with user action",
-"Flow state inconsistent after updates",
-"Flow translation error",
-]
+FdrReasonStrings = {
+FdrReasons.FDR_NONE: "No delete reason specified",
+FdrReasons.FDR_AVOID_CACHING: "Cache avoidance flag set",
+FdrReasons.FDR_BAD_ODP_FIT: "Bad ODP flow fit",
+FdrReasons.FDR_FLOW_IDLE: "Flow idle timeout",
+FdrReasons.FDR_FLOW_LIMIT: "Kill all flows condition reached",
+FdrReasons.FDR_FLOW_WILDCARDED: "Flow needs a narrower wildcard mask",
+FdrReasons.FDR_NO_OFPROTO: "Bridge not found",
+FdrReasons.FDR_PURGE: "User requested flow deletion",
+FdrReasons.FDR_TOO_EXPENSIVE: "Too expensive to revalidate",
+FdrReasons.FDR_UPDATE_FAIL: "Datapath update failed",
+FdrReasons.FDR_XLATION_ERROR: "Flow translation error"
+}
 
 
 def err(msg, code=-1):
@@ -572,10 +577,10 @@ def print_expiration(event):
 """Prints a UFID eviction with a reason."""
 ufid_str = format_ufid(event.ufid)
 
-if event.reason > len(FdrReasons):
-reason = f"Unknown reason '{event.reason}'"
-else:
+try:
 reason = FdrReasonStrings[event.reason]
+except KeyError:
+reason = f"Unknown reason '{event.

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Eelco Chaudron


On 14 May 2024, at 14:50, Ilya Maximets wrote:

> On 5/14/24 14:43, Eelco Chaudron wrote:
>>
>>
>> On 14 May 2024, at 14:28, Ilya Maximets wrote:
>>
>>> On 5/14/24 13:27, Eelco Chaudron wrote:


 On 14 May 2024, at 13:05, Ilya Maximets wrote:

> On 5/14/24 12:14, Adrian Moreno wrote:
>>
>>
>> On 5/14/24 11:09 AM, Ilya Maximets wrote:
>>> On 5/14/24 09:39, Adrian Moreno wrote:


 On 5/10/24 12:45 PM, Adrian Moreno wrote:
>
>
> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>
>>> The new odp sample attributes allow userspace to specify a group_id 
>>> and
>>> user-defined cookie to be passed down to psample.
>>>
>>> Add support for parsing and formatting such action.
>>>
>>> Signed-off-by: Adrian Moreno 
>>
>> Hi Adrian,
>>
>> See my comments below inline.
>>
>> //Eelco
>>
>>> ---
>>>    include/linux/openvswitch.h  |  49 +---
>>>    lib/odp-execute.c    |   3 +
>>>    lib/odp-util.c   | 150 
>>> ++-
>>>    ofproto/ofproto-dpif-ipfix.c |   2 +
>>>    python/ovs/flow/odp.py   |   2 +
>>>    tests/odp.at |   5 ++
>>>    6 files changed, 163 insertions(+), 48 deletions(-)
>>>
>>> diff --git a/include/linux/openvswitch.h 
>>> b/include/linux/openvswitch.h
>>> index d9fb991ef..3e748405c 100644
>>> --- a/include/linux/openvswitch.h
>>> +++ b/include/linux/openvswitch.h
>>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>>    #define OVS_UFID_F_OMIT_MASK (1 << 1)
>>>    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>>>
>>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>>    /**
>>>     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
>>> action.
>>>     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to 
>>> sample with
>>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>>>     * %UINT32_MAX samples all packets and intermediate values 
>>> sample intermediate
>>>     * fractions of packets.
>>>     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in 
>>> sampling event.
>>> - * Actions are passed as nested attributes.
>>> + * Actions are passed as nested attributes. Optional if
>>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
>>> psample group.
>>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary 
>>> cookie that, if
>>> + * provided, will be copied to the psample cookie.
>>
>> Should we mention any limit on the actual length of the cookie?
>>
>> Any reason we explicitly call this psample? From an OVS perspective, 
>> this
>> could just be
>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE
>>  and
>> _GROUP. Other datapaths, do not have PSAMPLE.
>>
>
> See my response to your comment on the cover-letter for possible name 
> alternatives.
>
>
>> Thinking about it more, from an OVS perspective we should probably 
>> not even
>> send down a COOKIE, but we should send down an obs_domain_id and 
>> obs_point_id,
>> and then the kernel can move this into a cookie.
>>
>
> I did consider this. However, the opaque cookie fits well with the tc 
> API which
> does not have two 32-bit values but an opaque 128-bit cookie. So if 
> the action
> is offloaded OVS needs to "encode" the two 32-bit values into the 
> cookie anyways.
>
>> The collector itself could be called system/local collector, or 
>> something like
>> that. This way it can use for example psample for kernel and UDST 
>> probes for
>> usespace. Both can pass the group and cookie (obs_domain_id and 
>> obs_point_id).
>>
>> Also, the presence of any of this should not dictate the psample 
>> action, we
>> probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>>
>
> It clearer and it might simplify option verification a bit, but isn't 
> a bit
> redundant? There is no flag for action execution for instance, the 
> presence of
> the attribute is enough.
>
> An alternative would be to have nested attributes, e.g:
>
> enum ovs_sample_attr {
>

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Adrian Moreno



On 5/14/24 1:27 PM, Eelco Chaudron wrote:



On 14 May 2024, at 13:05, Ilya Maximets wrote:


On 5/14/24 12:14, Adrian Moreno wrote:



On 5/14/24 11:09 AM, Ilya Maximets wrote:

On 5/14/24 09:39, Adrian Moreno wrote:



On 5/10/24 12:45 PM, Adrian Moreno wrote:



On 5/10/24 12:06 PM, Eelco Chaudron wrote:

On 24 Apr 2024, at 21:53, Adrian Moreno wrote:


The new odp sample attributes allow userspace to specify a group_id and
user-defined cookie to be passed down to psample.

Add support for parsing and formatting such action.

Signed-off-by: Adrian Moreno 


Hi Adrian,

See my comments below inline.

//Eelco


---
    include/linux/openvswitch.h  |  49 +---
    lib/odp-execute.c    |   3 +
    lib/odp-util.c   | 150 ++-
    ofproto/ofproto-dpif-ipfix.c |   2 +
    python/ovs/flow/odp.py   |   2 +
    tests/odp.at |   5 ++
    6 files changed, 163 insertions(+), 48 deletions(-)

diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index d9fb991ef..3e748405c 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -696,6 +696,7 @@ enum ovs_flow_attr {
    #define OVS_UFID_F_OMIT_MASK (1 << 1)
    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)

+#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
    /**
     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE action.
     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample with
@@ -703,15 +704,27 @@ enum ovs_flow_attr {
     * %UINT32_MAX samples all packets and intermediate values sample 
intermediate
     * fractions of packets.
     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling event.
- * Actions are passed as nested attributes.
+ * Actions are passed as nested attributes. Optional if
+ * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
+ * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as psample group.
+ * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
+ * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie that, if
+ * provided, will be copied to the psample cookie.


Should we mention any limit on the actual length of the cookie?

Any reason we explicitly call this psample? From an OVS perspective, this
could just be
SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE and
_GROUP. Other datapaths, do not have PSAMPLE.



See my response to your comment on the cover-letter for possible name 
alternatives.



Thinking about it more, from an OVS perspective we should probably not even
send down a COOKIE, but we should send down an obs_domain_id and obs_point_id,
and then the kernel can move this into a cookie.



I did consider this. However, the opaque cookie fits well with the tc API which
does not have two 32-bit values but an opaque 128-bit cookie. So if the action
is offloaded OVS needs to "encode" the two 32-bit values into the cookie 
anyways.


The collector itself could be called system/local collector, or something like
that. This way it can use for example psample for kernel and UDST probes for
usespace. Both can pass the group and cookie (obs_domain_id and obs_point_id).

Also, the presence of any of this should not dictate the psample action, we
probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.



It clearer and it might simplify option verification a bit, but isn't a bit
redundant? There is no flag for action execution for instance, the presence of
the attribute is enough.

An alternative would be to have nested attributes, e.g:

enum ovs_sample_attr {
   OVS_SAMPLE_ATTR_UNSPEC,
   OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
   OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */
   OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
attributes. */

   __OVS_SAMPLE_ATTR_MAX,
};

enum ovs_sample_system_attr {
   OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
   OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
   OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_*

   __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
};

WDYT?



Another benefit of nested attributes is the easier addition of other
sample-related attributes, I already have a candidate: OVS_SAMPLE_SYSTEM_TRUNC

This will be passed to psample (md->trunc_size) who will truncate the packet
size to the specified value. This can be useful for large packets for which we
are only interested in the headers.

WDYT?

[snip]



Hrm.  This makes me think these should not be attributes of a sample action at 
all,
but rather we should have a separate EMIT_SAMPLE action that just calls 
psample, so
we can combine it with other actions:

sample(..., actions(userspace(...),trunc(100),emit_sample(cookie=...)))

or even:

sample(..., 
actions(clone(trunc(100),emit_sample(cookie=...)),userspace(...)))

This will also simplify the requires_datapath_assistance() check as we can just
check the action and not iterate ove

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Ilya Maximets
On 5/14/24 14:43, Eelco Chaudron wrote:
> 
> 
> On 14 May 2024, at 14:28, Ilya Maximets wrote:
> 
>> On 5/14/24 13:27, Eelco Chaudron wrote:
>>>
>>>
>>> On 14 May 2024, at 13:05, Ilya Maximets wrote:
>>>
 On 5/14/24 12:14, Adrian Moreno wrote:
>
>
> On 5/14/24 11:09 AM, Ilya Maximets wrote:
>> On 5/14/24 09:39, Adrian Moreno wrote:
>>>
>>>
>>> On 5/10/24 12:45 PM, Adrian Moreno wrote:


 On 5/10/24 12:06 PM, Eelco Chaudron wrote:
> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>
>> The new odp sample attributes allow userspace to specify a group_id 
>> and
>> user-defined cookie to be passed down to psample.
>>
>> Add support for parsing and formatting such action.
>>
>> Signed-off-by: Adrian Moreno 
>
> Hi Adrian,
>
> See my comments below inline.
>
> //Eelco
>
>> ---
>>    include/linux/openvswitch.h  |  49 +---
>>    lib/odp-execute.c    |   3 +
>>    lib/odp-util.c   | 150 
>> ++-
>>    ofproto/ofproto-dpif-ipfix.c |   2 +
>>    python/ovs/flow/odp.py   |   2 +
>>    tests/odp.at |   5 ++
>>    6 files changed, 163 insertions(+), 48 deletions(-)
>>
>> diff --git a/include/linux/openvswitch.h 
>> b/include/linux/openvswitch.h
>> index d9fb991ef..3e748405c 100644
>> --- a/include/linux/openvswitch.h
>> +++ b/include/linux/openvswitch.h
>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>    #define OVS_UFID_F_OMIT_MASK (1 << 1)
>>    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>>
>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>    /**
>>     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
>> action.
>>     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to 
>> sample with
>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>>     * %UINT32_MAX samples all packets and intermediate values sample 
>> intermediate
>>     * fractions of packets.
>>     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in 
>> sampling event.
>> - * Actions are passed as nested attributes.
>> + * Actions are passed as nested attributes. Optional if
>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
>> psample group.
>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
>> that, if
>> + * provided, will be copied to the psample cookie.
>
> Should we mention any limit on the actual length of the cookie?
>
> Any reason we explicitly call this psample? From an OVS perspective, 
> this
> could just be
> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE
>  and
> _GROUP. Other datapaths, do not have PSAMPLE.
>

 See my response to your comment on the cover-letter for possible name 
 alternatives.


> Thinking about it more, from an OVS perspective we should probably 
> not even
> send down a COOKIE, but we should send down an obs_domain_id and 
> obs_point_id,
> and then the kernel can move this into a cookie.
>

 I did consider this. However, the opaque cookie fits well with the tc 
 API which
 does not have two 32-bit values but an opaque 128-bit cookie. So if 
 the action
 is offloaded OVS needs to "encode" the two 32-bit values into the 
 cookie anyways.

> The collector itself could be called system/local collector, or 
> something like
> that. This way it can use for example psample for kernel and UDST 
> probes for
> usespace. Both can pass the group and cookie (obs_domain_id and 
> obs_point_id).
>
> Also, the presence of any of this should not dictate the psample 
> action, we
> probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>

 It clearer and it might simplify option verification a bit, but isn't 
 a bit
 redundant? There is no flag for action execution for instance, the 
 presence of
 the attribute is enough.

 An alternative would be to have nested attributes, e.g:

 enum ovs_sample_attr {
   OVS_SAMPLE_ATTR_UNSPEC,
   OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
   OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* 
>>>

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Eelco Chaudron


On 14 May 2024, at 14:28, Ilya Maximets wrote:

> On 5/14/24 13:27, Eelco Chaudron wrote:
>>
>>
>> On 14 May 2024, at 13:05, Ilya Maximets wrote:
>>
>>> On 5/14/24 12:14, Adrian Moreno wrote:


 On 5/14/24 11:09 AM, Ilya Maximets wrote:
> On 5/14/24 09:39, Adrian Moreno wrote:
>>
>>
>> On 5/10/24 12:45 PM, Adrian Moreno wrote:
>>>
>>>
>>> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
 On 24 Apr 2024, at 21:53, Adrian Moreno wrote:

> The new odp sample attributes allow userspace to specify a group_id 
> and
> user-defined cookie to be passed down to psample.
>
> Add support for parsing and formatting such action.
>
> Signed-off-by: Adrian Moreno 

 Hi Adrian,

 See my comments below inline.

 //Eelco

> ---
>    include/linux/openvswitch.h  |  49 +---
>    lib/odp-execute.c    |   3 +
>    lib/odp-util.c   | 150 
> ++-
>    ofproto/ofproto-dpif-ipfix.c |   2 +
>    python/ovs/flow/odp.py   |   2 +
>    tests/odp.at |   5 ++
>    6 files changed, 163 insertions(+), 48 deletions(-)
>
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index d9fb991ef..3e748405c 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>    #define OVS_UFID_F_OMIT_MASK (1 << 1)
>    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>
> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>    /**
>     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
> action.
>     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to 
> sample with
> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>     * %UINT32_MAX samples all packets and intermediate values sample 
> intermediate
>     * fractions of packets.
>     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
> event.
> - * Actions are passed as nested attributes.
> + * Actions are passed as nested attributes. Optional if
> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
> psample group.
> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
> that, if
> + * provided, will be copied to the psample cookie.

 Should we mention any limit on the actual length of the cookie?

 Any reason we explicitly call this psample? From an OVS perspective, 
 this
 could just be
 SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE
  and
 _GROUP. Other datapaths, do not have PSAMPLE.

>>>
>>> See my response to your comment on the cover-letter for possible name 
>>> alternatives.
>>>
>>>
 Thinking about it more, from an OVS perspective we should probably not 
 even
 send down a COOKIE, but we should send down an obs_domain_id and 
 obs_point_id,
 and then the kernel can move this into a cookie.

>>>
>>> I did consider this. However, the opaque cookie fits well with the tc 
>>> API which
>>> does not have two 32-bit values but an opaque 128-bit cookie. So if the 
>>> action
>>> is offloaded OVS needs to "encode" the two 32-bit values into the 
>>> cookie anyways.
>>>
 The collector itself could be called system/local collector, or 
 something like
 that. This way it can use for example psample for kernel and UDST 
 probes for
 usespace. Both can pass the group and cookie (obs_domain_id and 
 obs_point_id).

 Also, the presence of any of this should not dictate the psample 
 action, we
 probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.

>>>
>>> It clearer and it might simplify option verification a bit, but isn't a 
>>> bit
>>> redundant? There is no flag for action execution for instance, the 
>>> presence of
>>> the attribute is enough.
>>>
>>> An alternative would be to have nested attributes, e.g:
>>>
>>> enum ovs_sample_attr {
>>>   OVS_SAMPLE_ATTR_UNSPEC,
>>>   OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>   OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* 
>>> attributes. */
>>>   OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
>>> attributes. */
>>>
>>>   __OVS_SAMPLE_ATTR_MAX,
>>> };
>>>

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Ilya Maximets
On 5/14/24 13:27, Eelco Chaudron wrote:
> 
> 
> On 14 May 2024, at 13:05, Ilya Maximets wrote:
> 
>> On 5/14/24 12:14, Adrian Moreno wrote:
>>>
>>>
>>> On 5/14/24 11:09 AM, Ilya Maximets wrote:
 On 5/14/24 09:39, Adrian Moreno wrote:
>
>
> On 5/10/24 12:45 PM, Adrian Moreno wrote:
>>
>>
>> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>>
 The new odp sample attributes allow userspace to specify a group_id and
 user-defined cookie to be passed down to psample.

 Add support for parsing and formatting such action.

 Signed-off-by: Adrian Moreno 
>>>
>>> Hi Adrian,
>>>
>>> See my comments below inline.
>>>
>>> //Eelco
>>>
 ---
    include/linux/openvswitch.h  |  49 +---
    lib/odp-execute.c    |   3 +
    lib/odp-util.c   | 150 
 ++-
    ofproto/ofproto-dpif-ipfix.c |   2 +
    python/ovs/flow/odp.py   |   2 +
    tests/odp.at |   5 ++
    6 files changed, 163 insertions(+), 48 deletions(-)

 diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
 index d9fb991ef..3e748405c 100644
 --- a/include/linux/openvswitch.h
 +++ b/include/linux/openvswitch.h
 @@ -696,6 +696,7 @@ enum ovs_flow_attr {
    #define OVS_UFID_F_OMIT_MASK (1 << 1)
    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)

 +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
    /**
     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
 action.
     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to 
 sample with
 @@ -703,15 +704,27 @@ enum ovs_flow_attr {
     * %UINT32_MAX samples all packets and intermediate values sample 
 intermediate
     * fractions of packets.
     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
 event.
 - * Actions are passed as nested attributes.
 + * Actions are passed as nested attributes. Optional if
 + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
 + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
 psample group.
 + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
 + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
 that, if
 + * provided, will be copied to the psample cookie.
>>>
>>> Should we mention any limit on the actual length of the cookie?
>>>
>>> Any reason we explicitly call this psample? From an OVS perspective, 
>>> this
>>> could just be
>>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE
>>>  and
>>> _GROUP. Other datapaths, do not have PSAMPLE.
>>>
>>
>> See my response to your comment on the cover-letter for possible name 
>> alternatives.
>>
>>
>>> Thinking about it more, from an OVS perspective we should probably not 
>>> even
>>> send down a COOKIE, but we should send down an obs_domain_id and 
>>> obs_point_id,
>>> and then the kernel can move this into a cookie.
>>>
>>
>> I did consider this. However, the opaque cookie fits well with the tc 
>> API which
>> does not have two 32-bit values but an opaque 128-bit cookie. So if the 
>> action
>> is offloaded OVS needs to "encode" the two 32-bit values into the cookie 
>> anyways.
>>
>>> The collector itself could be called system/local collector, or 
>>> something like
>>> that. This way it can use for example psample for kernel and UDST 
>>> probes for
>>> usespace. Both can pass the group and cookie (obs_domain_id and 
>>> obs_point_id).
>>>
>>> Also, the presence of any of this should not dictate the psample 
>>> action, we
>>> probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>>>
>>
>> It clearer and it might simplify option verification a bit, but isn't a 
>> bit
>> redundant? There is no flag for action execution for instance, the 
>> presence of
>> the attribute is enough.
>>
>> An alternative would be to have nested attributes, e.g:
>>
>> enum ovs_sample_attr {
>>   OVS_SAMPLE_ATTR_UNSPEC,
>>   OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>   OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* 
>> attributes. */
>>   OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
>> attributes. */
>>
>>   __OVS_SAMPLE_ATTR_MAX,
>> };
>>
>> enum ovs_sample_system_attr {
>>   OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
>>   OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
>>   OVS_SAMP

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Eelco Chaudron


On 14 May 2024, at 13:05, Ilya Maximets wrote:

> On 5/14/24 12:14, Adrian Moreno wrote:
>>
>>
>> On 5/14/24 11:09 AM, Ilya Maximets wrote:
>>> On 5/14/24 09:39, Adrian Moreno wrote:


 On 5/10/24 12:45 PM, Adrian Moreno wrote:
>
>
> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>
>>> The new odp sample attributes allow userspace to specify a group_id and
>>> user-defined cookie to be passed down to psample.
>>>
>>> Add support for parsing and formatting such action.
>>>
>>> Signed-off-by: Adrian Moreno 
>>
>> Hi Adrian,
>>
>> See my comments below inline.
>>
>> //Eelco
>>
>>> ---
>>>    include/linux/openvswitch.h  |  49 +---
>>>    lib/odp-execute.c    |   3 +
>>>    lib/odp-util.c   | 150 
>>> ++-
>>>    ofproto/ofproto-dpif-ipfix.c |   2 +
>>>    python/ovs/flow/odp.py   |   2 +
>>>    tests/odp.at |   5 ++
>>>    6 files changed, 163 insertions(+), 48 deletions(-)
>>>
>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>>> index d9fb991ef..3e748405c 100644
>>> --- a/include/linux/openvswitch.h
>>> +++ b/include/linux/openvswitch.h
>>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>>    #define OVS_UFID_F_OMIT_MASK (1 << 1)
>>>    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>>>
>>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>>    /**
>>>     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
>>> action.
>>>     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to 
>>> sample with
>>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>>>     * %UINT32_MAX samples all packets and intermediate values sample 
>>> intermediate
>>>     * fractions of packets.
>>>     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
>>> event.
>>> - * Actions are passed as nested attributes.
>>> + * Actions are passed as nested attributes. Optional if
>>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
>>> psample group.
>>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
>>> that, if
>>> + * provided, will be copied to the psample cookie.
>>
>> Should we mention any limit on the actual length of the cookie?
>>
>> Any reason we explicitly call this psample? From an OVS perspective, this
>> could just be
>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE 
>> and
>> _GROUP. Other datapaths, do not have PSAMPLE.
>>
>
> See my response to your comment on the cover-letter for possible name 
> alternatives.
>
>
>> Thinking about it more, from an OVS perspective we should probably not 
>> even
>> send down a COOKIE, but we should send down an obs_domain_id and 
>> obs_point_id,
>> and then the kernel can move this into a cookie.
>>
>
> I did consider this. However, the opaque cookie fits well with the tc API 
> which
> does not have two 32-bit values but an opaque 128-bit cookie. So if the 
> action
> is offloaded OVS needs to "encode" the two 32-bit values into the cookie 
> anyways.
>
>> The collector itself could be called system/local collector, or 
>> something like
>> that. This way it can use for example psample for kernel and UDST probes 
>> for
>> usespace. Both can pass the group and cookie (obs_domain_id and 
>> obs_point_id).
>>
>> Also, the presence of any of this should not dictate the psample action, 
>> we
>> probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>>
>
> It clearer and it might simplify option verification a bit, but isn't a 
> bit
> redundant? There is no flag for action execution for instance, the 
> presence of
> the attribute is enough.
>
> An alternative would be to have nested attributes, e.g:
>
> enum ovs_sample_attr {
>   OVS_SAMPLE_ATTR_UNSPEC,
>   OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>   OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* 
> attributes. */
>   OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
> attributes. */
>
>   __OVS_SAMPLE_ATTR_MAX,
> };
>
> enum ovs_sample_system_attr {
>   OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
>   OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
>   OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_*
>
>   __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
> };
>
> WDYT?
>

 Another benefit of nested attributes i

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Ilya Maximets
On 5/14/24 12:14, Adrian Moreno wrote:
> 
> 
> On 5/14/24 11:09 AM, Ilya Maximets wrote:
>> On 5/14/24 09:39, Adrian Moreno wrote:
>>>
>>>
>>> On 5/10/24 12:45 PM, Adrian Moreno wrote:


 On 5/10/24 12:06 PM, Eelco Chaudron wrote:
> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>
>> The new odp sample attributes allow userspace to specify a group_id and
>> user-defined cookie to be passed down to psample.
>>
>> Add support for parsing and formatting such action.
>>
>> Signed-off-by: Adrian Moreno 
>
> Hi Adrian,
>
> See my comments below inline.
>
> //Eelco
>
>> ---
>>    include/linux/openvswitch.h  |  49 +---
>>    lib/odp-execute.c    |   3 +
>>    lib/odp-util.c   | 150 ++-
>>    ofproto/ofproto-dpif-ipfix.c |   2 +
>>    python/ovs/flow/odp.py   |   2 +
>>    tests/odp.at |   5 ++
>>    6 files changed, 163 insertions(+), 48 deletions(-)
>>
>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>> index d9fb991ef..3e748405c 100644
>> --- a/include/linux/openvswitch.h
>> +++ b/include/linux/openvswitch.h
>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>    #define OVS_UFID_F_OMIT_MASK (1 << 1)
>>    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>>
>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>    /**
>>     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
>> action.
>>     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample 
>> with
>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>>     * %UINT32_MAX samples all packets and intermediate values sample 
>> intermediate
>>     * fractions of packets.
>>     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
>> event.
>> - * Actions are passed as nested attributes.
>> + * Actions are passed as nested attributes. Optional if
>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
>> psample group.
>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
>> that, if
>> + * provided, will be copied to the psample cookie.
>
> Should we mention any limit on the actual length of the cookie?
>
> Any reason we explicitly call this psample? From an OVS perspective, this
> could just be
> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE 
> and
> _GROUP. Other datapaths, do not have PSAMPLE.
>

 See my response to your comment on the cover-letter for possible name 
 alternatives.


> Thinking about it more, from an OVS perspective we should probably not 
> even
> send down a COOKIE, but we should send down an obs_domain_id and 
> obs_point_id,
> and then the kernel can move this into a cookie.
>

 I did consider this. However, the opaque cookie fits well with the tc API 
 which
 does not have two 32-bit values but an opaque 128-bit cookie. So if the 
 action
 is offloaded OVS needs to "encode" the two 32-bit values into the cookie 
 anyways.

> The collector itself could be called system/local collector, or something 
> like
> that. This way it can use for example psample for kernel and UDST probes 
> for
> usespace. Both can pass the group and cookie (obs_domain_id and 
> obs_point_id).
>
> Also, the presence of any of this should not dictate the psample action, 
> we
> probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>

 It clearer and it might simplify option verification a bit, but isn't a bit
 redundant? There is no flag for action execution for instance, the 
 presence of
 the attribute is enough.

 An alternative would be to have nested attributes, e.g:

 enum ovs_sample_attr {
   OVS_SAMPLE_ATTR_UNSPEC,
   OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
   OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. 
 */
   OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
 attributes. */

   __OVS_SAMPLE_ATTR_MAX,
 };

 enum ovs_sample_system_attr {
   OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
   OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
   OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_*

   __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
 };

 WDYT?

>>>
>>> Another benefit of nested attributes is the easier addition of other
>>> sample-related attributes, I already have a candidate: 
>>> OVS_SAMPLE_SYSTEM_TRUNC
>>>
>>> This will be passed to psample (md->trunc_size) who will truncate the pa

[ovs-dev] OVN technical community meeting - May 13th

2024-05-14 Thread Dumitru Ceara


> I went ahead and scheduled a new instance:
> Date/Time: Monday, May 13th, 3PM UTC
> Link: https://meet.google.com/zns-gqsd-jdn?authuser=0&hs=122
> Agenda:
> https://docs.google.com/document/d/1dG4GwcYOSs4uArPGtOoaP5tH4KCto-GH_C3tIXSnZZ8/edit?usp=meetingnotes
> 

Thanks, everyone, for attending the OVN community meeting yesterday!  It
was a very interesting discussion about how to better integrate OVN and BGP!

Here's the link to the recording:
https://youtu.be/k448ada9aFQ

And the transcript of the chat:
https://drive.google.com/file/d/1VuF5l9wSR9z4rIH_LVF3PJBScj4Mb8JT

I'd like to schedule a new session in 5 weeks as it seems there's still
quite some details worth while discussing about how to actually
implement the BGP+OVN integration.

I was thinking of Monday, June 17th, 3PM UTC.  I'll wait for a few days
for people to check and potentially suggest alternatives before sending
out an invite.

Best regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Adrian Moreno



On 5/14/24 11:09 AM, Ilya Maximets wrote:

On 5/14/24 09:39, Adrian Moreno wrote:



On 5/10/24 12:45 PM, Adrian Moreno wrote:



On 5/10/24 12:06 PM, Eelco Chaudron wrote:

On 24 Apr 2024, at 21:53, Adrian Moreno wrote:


The new odp sample attributes allow userspace to specify a group_id and
user-defined cookie to be passed down to psample.

Add support for parsing and formatting such action.

Signed-off-by: Adrian Moreno 


Hi Adrian,

See my comments below inline.

//Eelco


---
   include/linux/openvswitch.h  |  49 +---
   lib/odp-execute.c    |   3 +
   lib/odp-util.c   | 150 ++-
   ofproto/ofproto-dpif-ipfix.c |   2 +
   python/ovs/flow/odp.py   |   2 +
   tests/odp.at |   5 ++
   6 files changed, 163 insertions(+), 48 deletions(-)

diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index d9fb991ef..3e748405c 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -696,6 +696,7 @@ enum ovs_flow_attr {
   #define OVS_UFID_F_OMIT_MASK (1 << 1)
   #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)

+#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
   /**
    * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE action.
    * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample with
@@ -703,15 +704,27 @@ enum ovs_flow_attr {
    * %UINT32_MAX samples all packets and intermediate values sample 
intermediate
    * fractions of packets.
    * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling event.
- * Actions are passed as nested attributes.
+ * Actions are passed as nested attributes. Optional if
+ * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
+ * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as psample group.
+ * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
+ * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie that, if
+ * provided, will be copied to the psample cookie.


Should we mention any limit on the actual length of the cookie?

Any reason we explicitly call this psample? From an OVS perspective, this
could just be
SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE and
_GROUP. Other datapaths, do not have PSAMPLE.



See my response to your comment on the cover-letter for possible name 
alternatives.



Thinking about it more, from an OVS perspective we should probably not even
send down a COOKIE, but we should send down an obs_domain_id and obs_point_id,
and then the kernel can move this into a cookie.



I did consider this. However, the opaque cookie fits well with the tc API which
does not have two 32-bit values but an opaque 128-bit cookie. So if the action
is offloaded OVS needs to "encode" the two 32-bit values into the cookie 
anyways.


The collector itself could be called system/local collector, or something like
that. This way it can use for example psample for kernel and UDST probes for
usespace. Both can pass the group and cookie (obs_domain_id and obs_point_id).

Also, the presence of any of this should not dictate the psample action, we
probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.



It clearer and it might simplify option verification a bit, but isn't a bit
redundant? There is no flag for action execution for instance, the presence of
the attribute is enough.

An alternative would be to have nested attributes, e.g:

enum ovs_sample_attr {
  OVS_SAMPLE_ATTR_UNSPEC,
  OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
  OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */
  OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
attributes. */

  __OVS_SAMPLE_ATTR_MAX,
};

enum ovs_sample_system_attr {
  OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
  OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
  OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_*

  __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
};

WDYT?



Another benefit of nested attributes is the easier addition of other
sample-related attributes, I already have a candidate: OVS_SAMPLE_SYSTEM_TRUNC

This will be passed to psample (md->trunc_size) who will truncate the packet
size to the specified value. This can be useful for large packets for which we
are only interested in the headers.

WDYT?

[snip]



Hrm.  This makes me think these should not be attributes of a sample action at 
all,
but rather we should have a separate EMIT_SAMPLE action that just calls 
psample, so
we can combine it with other actions:

   sample(..., actions(userspace(...),trunc(100),emit_sample(cookie=...)))

or even:

   sample(..., 
actions(clone(trunc(100),emit_sample(cookie=...)),userspace(...)))

This will also simplify the requires_datapath_assistance() check as we can just
check the action and not iterate over the list of attributes.  And it will be
capable to only send this one action to the datapath and not the whole thing.



My suggestion of adding a OVS_SAMPLE_{}

Re: [ovs-dev] [RFC 00/11] Add psample support to NXAST_SAMPLE action.

2024-05-14 Thread Ilya Maximets
On 5/14/24 09:42, Adrian Moreno wrote:
> 
> 
> On 5/10/24 12:14 PM, Eelco Chaudron wrote:
>>
>>
>> On 10 May 2024, at 10:23, Adrian Moreno wrote:
>>
>>> On 5/10/24 9:14 AM, Eelco Chaudron wrote:


 On 24 Apr 2024, at 21:53, Adrian Moreno wrote:

> This is the userspace counterpart of the work being done in the kernel
> [1]. Sending it as RFC to get some early feedback on the overall
> solution.
>
> ** Problem description **
> Currently, OVS supports several observability features, such as
> per-bridge IPFIX, per-flow IPFIX and sFlow. However, given the
> complexity of OVN-generated pipelines, a single sample is typically not
> enough to troubleshoot what is OVN/OVS is doing with the packet. We need
> highler level metadata alongside the packet sample.
>
> This can be achieved by the means of per-flow IPFIX sampling and
> NXAST_SAMPLE action, through which OVN can add two arbitrary 32-bit
> values (observation_domain_id and observation_point_id) that are sent
> alongside the packet information in the IPFIX frames.
>
> However, there is a fundamental limitation of the existing sampling
> infrastructure, specially in the kernel datapath: samples are handled by
> ovs-vswitchd, forcing the need for an upcall (userspace action). The
> fact that samples share the same netlink sockets and handler thread cpu
> time with actual packet misse, can easily cause packet drops making this
> solution unfit for use in highly loaded production systems.
>
> Users are left with little option than guessing what sampling rate will
> be OK for their traffic pattern and dealing with the lost accuracy.
>
> ** Feature Description **
> In order to solve this situation and enable this feature to be safely
> turned on in production, this RFC uses the psample support proposed in
> [1] to send NXAST_SAMPLE samples to psample adding the observability
> domain and point information as metadata.
>
> ~~ API ~~
> The API is simply a new field called "psample_group" in the
> Flow_Sample_Collector_Set (FSCS) table. Configuring this value is
> orthogonal to also associating the FSCS entry to an entry in the IPFIX
> table.
>
> Apart from that configuration, the controller needs to add NXAST_SAMPLE
> actions that refer the entry's id.
>
> ~~ HW Offload ~~
> psample is already supported by tc using the act_sample action. The
> metadata is taken from the actions' cookie.
> This series also adds support for offloading the odp sample action,
> only when it includes psample information but not nested actions.
>
> A bit of special care has to be taken in the tc layer to not store the
> ufid in the sample's cookie as it's used to carry action-specific data.
>
> ~~ Datapath support ~~
> This new behavior of the datapth sample action is only supported on the
> kernel datapath. A more detailed analysis is needed (and planned) to
> find a way to also optimize the userspace datapath. However, although
> potentially inefficient, there is not that much risk of dropping packets
> with the existing IPFIX infrastructure.
>
> ~~ Testing ~~
> The series includes an utility program called "ovs-psample" that listens
> to packets multicasted by the psample module and prints them (also
> printing the obs_domain and obs_point ids). In addition the kernel
> series includes a tracepoint for easy testing and troubleshooting.
>
> [1] https://patchwork.kernel.org/project/netdevbpf/list/?series=847473

 Hi Andrew,

>>>
>>> Who is Andrew? :-D
>>
>> Sorry :( Handling too many things in parallel...
>>
 Thanks for sending out this RFC series. I did a code-only review (no 
 testing), and the main concern I have (besides some locking) is the direct 
 mapping to psample. If we decide that for userspace we are going to use 
 USDT probes, we need another target, and we will duplicate a lot of code. 
 My proposal would be to have a more general target name like system, or 
 local (or a better name ;). This will be a system-local dpif target with a 
 32-bit group id.

>>>
>>> Regarding the name, I agree adding "psample" to the API can be confusing 
>>> for userspace datapath. Besides, I think it should also be in sync with the 
>>> odp naming:
>>>
>>> "system": My initial thought was it could be confused with the "system@" 
>>> name of the kernel datapath, but that is not present in the OVSDB level so 
>>> it could work. SAMPLE_ATTR_{SYSTEM_GROUP,SYSTEM_USER_COOKIE}?
>>>
>>> "offload": I did consider this but I discarded it as it could be confused 
>>> with actual hw/tc offloading of the sample action.
>>>
>>> "local": Funny, I initially thought of the opposite, i.e: "external" as the 
>>> sample collection and potential encoding (which is currently "local" to 
>>> ovs-vswitchd

Re: [ovs-dev] [RFC 08/11] netdev-offload-tc: Add sample support.

2024-05-14 Thread Eelco Chaudron


On 13 May 2024, at 21:59, Adrian Moreno wrote:

> On 5/13/24 2:38 PM, Adrian Moreno wrote:
>>
>>
>> On 5/13/24 1:32 PM, Eelco Chaudron wrote:
>>>
>>>
>>> On 13 May 2024, at 10:44, Adrian Moreno wrote:
>>>
 On 5/10/24 12:06 PM, Eelco Chaudron wrote:
> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>
>> Offload the sample action if it contains psample information by creating
>> a tc "sample" action with the user cookie inside the action's cookie.
>>
>> Avoid using the "sample" action's cookie to store the ufid.
>
> I have some concerns about the sample action-only solution. What happened 
> to the idea of adding an additional cookie to the TC match? Can we add a 
> test case for the explicit drop?
>


 I guess we can ask the kernel community. However, I'm not 100% convinced 
 it makes a lot of sense form a kernel perspective. There is already a 
 cookie in every action so there is plenty of space to store user data. 
 Kernel does not enforce any semantics on the cookies, it's up to userspace 
 to interpret them at will. Also, you cannot have a flow without an action 
 IIUC.
 So having a new cookie added just because it makes OVS code sligthly 
 cleaner doesn't seem to be a super strong argument.
>>>
>>> ACK, but I thought this was part of the earlier discussion, so it should be 
>>> explored. I guess we can make the opposite argument, why a per-action 
>>> cookie if we could have a per-flow one? Having a match cookie will free the 
>>> action cookie for action related metadata.
>>>
>
> I did explore, theoretically, the idea. Adding TCA_COOKIE seemed to me like a 
> significant effort. It would probably mean adding it as a top-level filter 
> attribute (common to all filters, not only flower), handling concurrent 
> access and modification, adding support to iproute tooling, making sure it's 
> carried over through hw-offloading, etc.
>
> My bulk estimation is that this, itself, is way bigger than the psample 
> feature I'm trying to add.
>
> So, with all, I just considered the benefits would not compensate the effort 
> of creating a new filter attribute. considering:
> 1) AFAIKS, we have a way to identify flows without the need for cookies at all
> 2) We could very well use some other action (e.g: what this patch does)
> 3) I felt chances of it even being accepted are not supper high

If this is the case, it makes no sense to get this in right now. Although it 
still would be nice to have :) Simon do you know if this was ever discussed 
before?

 Regarding tc <-> ufid correlation. I'm a bit confused by the fact that 
 flow replacement seems to work on a tcf_id basis, meaning that each tc 
 flow has it's own unique tcf_id (or we would replace some other flow). 
 Also, there is a hmap mapping tcf_id to ufids already. Am I missing 
 something?
>>>
>>> You are right, I guess this exists to support kernels where we did not have 
>>> an actions cookie, so we use find_ufid() in the flow dump. So in theory if 
>>> there is no action with a flow cookie it will all still work.
>>>
>>
>> So, could we just rely on this mechanism which is also backwards compatible 
>> and just use the action cookie for action information?

Guess this would do, we should have some adequate test cases to cover any 
corner cases we can think of.

> In general, I think we should apply the sflow patch before this series.
>

 Agree. Are we all OK with the compatibility issues that will introduce?

> //Eelco
>
>> Signed-off-by: Adrian Moreno 
>> ---
>>    include/linux/automake.mk    |  5 +-
>>    include/linux/pkt_cls.h  |  2 +
>>    include/linux/tc_act/tc_sample.h | 26 ++
>>    lib/netdev-offload-tc.c  | 67 +
>>    lib/tc.c | 84 ++--
>>    lib/tc.h |  7 +++
>>    utilities/ovs-psample.c  |  8 +--
>>    7 files changed, 190 insertions(+), 9 deletions(-)
>>    create mode 100644 include/linux/tc_act/tc_sample.h
>>
>> diff --git a/include/linux/automake.mk b/include/linux/automake.mk
>> index ac306b53c..c2a270152 100644
>> --- a/include/linux/automake.mk
>> +++ b/include/linux/automake.mk
>> @@ -5,9 +5,10 @@ noinst_HEADERS += \
>>    include/linux/pkt_cls.h \
>>    include/linux/psample.h \
>>    include/linux/gen_stats.h \
>> +    include/linux/tc_act/tc_ct.h \
>>    include/linux/tc_act/tc_mpls.h \
>>    include/linux/tc_act/tc_pedit.h \
>> +    include/linux/tc_act/tc_sample.h \
>>    include/linux/tc_act/tc_skbedit.h \
>>    include/linux/tc_act/tc_tunnel_key.h \
>> -    include/linux/tc_act/tc_vlan.h \
>> -    include/linux/tc_act/tc_ct.h
>> +    include/linux/tc_act/tc_vlan.h
>> diff --git a/include/linux/pkt_cls.h b/includ

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Eelco Chaudron



On 13 May 2024, at 14:44, Adrian Moreno wrote:

> On 5/13/24 2:38 PM, Ilya Maximets wrote:
>> On 5/13/24 09:17, Eelco Chaudron wrote:
>>>
>>>
>>> On 10 May 2024, at 15:06, Ilya Maximets wrote:
>>>
 On 5/10/24 14:01, Eelco Chaudron wrote:
>
>
> On 10 May 2024, at 12:45, Adrian Moreno wrote:
>
>> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>>
 The new odp sample attributes allow userspace to specify a group_id and
 user-defined cookie to be passed down to psample.

 Add support for parsing and formatting such action.

 Signed-off-by: Adrian Moreno 
>>>
>>> Hi Adrian,
>>>
>>> See my comments below inline.
>>>
>>> //Eelco
>>>
 ---
include/linux/openvswitch.h  |  49 +---
lib/odp-execute.c|   3 +
lib/odp-util.c   | 150 
 ++-
ofproto/ofproto-dpif-ipfix.c |   2 +
python/ovs/flow/odp.py   |   2 +
tests/odp.at |   5 ++
6 files changed, 163 insertions(+), 48 deletions(-)

 diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
 index d9fb991ef..3e748405c 100644
 --- a/include/linux/openvswitch.h
 +++ b/include/linux/openvswitch.h
 @@ -696,6 +696,7 @@ enum ovs_flow_attr {
#define OVS_UFID_F_OMIT_MASK (1 << 1)
#define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)

 +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
/**
 * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
 action.
 * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to 
 sample with
 @@ -703,15 +704,27 @@ enum ovs_flow_attr {
 * %UINT32_MAX samples all packets and intermediate values sample 
 intermediate
 * fractions of packets.
 * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
 event.
 - * Actions are passed as nested attributes.
 + * Actions are passed as nested attributes. Optional if
 + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
 + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
 psample group.
 + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
 + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
 that, if
 + * provided, will be copied to the psample cookie.
>>>
>>> Should we mention any limit on the actual length of the cookie?
>>>
>>> Any reason we explicitly call this psample? From an OVS perspective, 
>>> this could just be 
>>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE
>>>  and _GROUP. Other datapaths, do not have PSAMPLE.
>>>
>>
>> See my response to your comment on the cover-letter for possible name 
>> alternatives.
>
> ACK, we can continue the discussion there.
>
>>> Thinking about it more, from an OVS perspective we should probably not 
>>> even send down a COOKIE, but we should send down an obs_domain_id and 
>>> obs_point_id, and then the kernel can move this into a cookie.
>>>
>>
>> I did consider this. However, the opaque cookie fits well with the tc 
>> API which does not have two 32-bit values but an opaque 128-bit cookie. 
>> So if the action is offloaded OVS needs to "encode" the two 32-bit 
>> values into the cookie anyways.
>
> Which is fine to me, the OVS API should be clear that we have two u32 
> entries. The cookie is implementation-specific, and we use the netlink 
> format as our API for all DPs.


 I'm not sure about this.  It is a kernel interface and we can't deliver
 two separate values from the psample.  So, passing two separate values
 to the kernel doesn't make a lot of sense.  They are going to be mangled
 into a single cookie anyway and we'll have to document that somehow.
 We may as well always mangle the data from the beginning, document once
 at the top level and save ourselves from documenting a million differences
 between different implementations.  While USDT could report them 
 separately,
 I'm not sure there is a ton of value in that.
>>>
>>> Our OVS action API is not tight to psample at all, so passing in a psample 
>>> cookie
>>> does not make sense. We should not pass in any psample references to the 
>>> sample
>>> action API. If it looks like the below, the API is clean and we can mangle 
>>> it before
>>> passing it to the psample function. I see no need to document this in the 
>>> kernel, as
>>> all the kernel does is provide the TC action cookie (unrelated to the OVS 
>>> actions API).
>>
>> We're passing this data directly to psample, not TC.  An

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Ilya Maximets
On 5/13/24 20:59, Adrian Moreno wrote:
> 
> 
> On 5/10/24 3:06 PM, Ilya Maximets wrote:
>> On 5/10/24 14:01, Eelco Chaudron wrote:
>>>
>>>
>>> On 10 May 2024, at 12:45, Adrian Moreno wrote:
>>>
 On 5/10/24 12:06 PM, Eelco Chaudron wrote:
> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>
>> The new odp sample attributes allow userspace to specify a group_id and
>> user-defined cookie to be passed down to psample.
>>
>> Add support for parsing and formatting such action.
>>
>> Signed-off-by: Adrian Moreno 
>
> Hi Adrian,
>
> See my comments below inline.
>
> //Eelco
>
>> ---
>>include/linux/openvswitch.h  |  49 +---
>>lib/odp-execute.c|   3 +
>>lib/odp-util.c   | 150 ++-
>>ofproto/ofproto-dpif-ipfix.c |   2 +
>>python/ovs/flow/odp.py   |   2 +
>>tests/odp.at |   5 ++
>>6 files changed, 163 insertions(+), 48 deletions(-)
>>
>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>> index d9fb991ef..3e748405c 100644
>> --- a/include/linux/openvswitch.h
>> +++ b/include/linux/openvswitch.h
>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>#define OVS_UFID_F_OMIT_MASK (1 << 1)
>>#define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>>
>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>/**
>> * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
>> action.
>> * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample 
>> with
>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>> * %UINT32_MAX samples all packets and intermediate values sample 
>> intermediate
>> * fractions of packets.
>> * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
>> event.
>> - * Actions are passed as nested attributes.
>> + * Actions are passed as nested attributes. Optional if
>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
>> psample group.
>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
>> that, if
>> + * provided, will be copied to the psample cookie.
>
> Should we mention any limit on the actual length of the cookie?
>
> Any reason we explicitly call this psample? From an OVS perspective, this 
> could just be 
> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE 
> and _GROUP. Other datapaths, do not have PSAMPLE.
>

 See my response to your comment on the cover-letter for possible name 
 alternatives.
>>>
>>> ACK, we can continue the discussion there.
>>>
> Thinking about it more, from an OVS perspective we should probably not 
> even send down a COOKIE, but we should send down an obs_domain_id and 
> obs_point_id, and then the kernel can move this into a cookie.
>

 I did consider this. However, the opaque cookie fits well with the tc API 
 which does not have two 32-bit values but an opaque 128-bit cookie. So if 
 the action is offloaded OVS needs to "encode" the two 32-bit values into 
 the cookie anyways.
>>>
>>> Which is fine to me, the OVS API should be clear that we have two u32 
>>> entries. The cookie is implementation-specific, and we use the netlink 
>>> format as our API for all DPs.
>>
>>
>> I'm not sure about this.  It is a kernel interface and we can't deliver
>> two separate values from the psample.  So, passing two separate values
>> to the kernel doesn't make a lot of sense.  They are going to be mangled
>> into a single cookie anyway and we'll have to document that somehow.
>> We may as well always mangle the data from the beginning, document once
>> at the top level and save ourselves from documenting a million differences
>> between different implementations.  While USDT could report them separately,
>> I'm not sure there is a ton of value in that.
>>
>>>
> The collector itself could be called system/local collector, or something 
> like that. This way it can use for example psample for kernel and UDST 
> probes for usespace. Both can pass the group and cookie (obs_domain_id 
> and obs_point_id).
>
> Also, the presence of any of this should not dictate the psample action, 
> we probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>

 It clearer and it might simplify option verification a bit, but isn't a 
 bit redundant? There is no flag for action execution for instance, the 
 presence of the attribute is enough.

 An alternative would be to have nested attributes, e.g:

 enum ovs_sample_attr {
OVS_SAMPLE_ATTR_UNSPEC,
OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
OVS_SAMPLE_ATTR_A

Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Ilya Maximets
On 5/14/24 09:39, Adrian Moreno wrote:
> 
> 
> On 5/10/24 12:45 PM, Adrian Moreno wrote:
>>
>>
>> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>>
 The new odp sample attributes allow userspace to specify a group_id and
 user-defined cookie to be passed down to psample.

 Add support for parsing and formatting such action.

 Signed-off-by: Adrian Moreno 
>>>
>>> Hi Adrian,
>>>
>>> See my comments below inline.
>>>
>>> //Eelco
>>>
 ---
   include/linux/openvswitch.h  |  49 +---
   lib/odp-execute.c    |   3 +
   lib/odp-util.c   | 150 ++-
   ofproto/ofproto-dpif-ipfix.c |   2 +
   python/ovs/flow/odp.py   |   2 +
   tests/odp.at |   5 ++
   6 files changed, 163 insertions(+), 48 deletions(-)

 diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
 index d9fb991ef..3e748405c 100644
 --- a/include/linux/openvswitch.h
 +++ b/include/linux/openvswitch.h
 @@ -696,6 +696,7 @@ enum ovs_flow_attr {
   #define OVS_UFID_F_OMIT_MASK (1 << 1)
   #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)

 +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
   /**
    * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE action.
    * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample 
 with
 @@ -703,15 +704,27 @@ enum ovs_flow_attr {
    * %UINT32_MAX samples all packets and intermediate values sample 
 intermediate
    * fractions of packets.
    * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling event.
 - * Actions are passed as nested attributes.
 + * Actions are passed as nested attributes. Optional if
 + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
 + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as psample 
 group.
 + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
 + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie that, 
 if
 + * provided, will be copied to the psample cookie.
>>>
>>> Should we mention any limit on the actual length of the cookie?
>>>
>>> Any reason we explicitly call this psample? From an OVS perspective, this 
>>> could just be 
>>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE 
>>> and 
>>> _GROUP. Other datapaths, do not have PSAMPLE.
>>>
>>
>> See my response to your comment on the cover-letter for possible name 
>> alternatives.
>>
>>
>>> Thinking about it more, from an OVS perspective we should probably not even 
>>> send down a COOKIE, but we should send down an obs_domain_id and 
>>> obs_point_id, 
>>> and then the kernel can move this into a cookie.
>>>
>>
>> I did consider this. However, the opaque cookie fits well with the tc API 
>> which 
>> does not have two 32-bit values but an opaque 128-bit cookie. So if the 
>> action 
>> is offloaded OVS needs to "encode" the two 32-bit values into the cookie 
>> anyways.
>>
>>> The collector itself could be called system/local collector, or something 
>>> like 
>>> that. This way it can use for example psample for kernel and UDST probes 
>>> for 
>>> usespace. Both can pass the group and cookie (obs_domain_id and 
>>> obs_point_id).
>>>
>>> Also, the presence of any of this should not dictate the psample action, we 
>>> probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>>>
>>
>> It clearer and it might simplify option verification a bit, but isn't a bit 
>> redundant? There is no flag for action execution for instance, the presence 
>> of 
>> the attribute is enough.
>>
>> An alternative would be to have nested attributes, e.g:
>>
>> enum ovs_sample_attr {
>>  OVS_SAMPLE_ATTR_UNSPEC,
>>  OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>  OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */
>>  OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
>> attributes. */
>>
>>  __OVS_SAMPLE_ATTR_MAX,
>> };
>>
>> enum ovs_sample_system_attr {
>>  OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
>>  OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
>>  OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_*
>>
>>  __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
>> };
>>
>> WDYT?
>>
> 
> Another benefit of nested attributes is the easier addition of other 
> sample-related attributes, I already have a candidate: OVS_SAMPLE_SYSTEM_TRUNC
> 
> This will be passed to psample (md->trunc_size) who will truncate the packet 
> size to the specified value. This can be useful for large packets for which 
> we 
> are only interested in the headers.
> 
> WDYT?
> 
> [snip]
> 

Hrm.  This makes me think these should not be attributes of a sample action at 
all,
but rather we should have a separate EMIT_SAMPLE action that just calls 
psample, so
we can combine it with other actions:

  sample(..., actions(userspa

[ovs-dev] [PATCH ovn 5/7] ci: Add missing packages to run Fedora image in GH CI.

2024-05-14 Thread Ales Musil
There were two things missing for the Fedora builds, 32-bit
version of glibc to allows the -m32 compilation on Fedora
and numactl-devel package.

Signed-off-by: Ales Musil 
---
 .ci/linux-build.sh | 3 +++
 utilities/containers/fedora/Dockerfile | 1 +
 2 files changed, 4 insertions(+)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 78f17f8bd..12966f532 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -83,6 +83,9 @@ function configure_gcc()
 # do it directly because gcc-multilib is not available
 # for arm64
 sudo apt update && sudo apt install -y gcc-multilib
+elif which dnf; then
+# Install equivalent of gcc-multilib for Fedora.
+sudo dnf -y update && sudo dnf -y install glibc-devel.i686
 fi
 fi
 }
diff --git a/utilities/containers/fedora/Dockerfile 
b/utilities/containers/fedora/Dockerfile
index 9b8386aae..d40a7b31f 100755
--- a/utilities/containers/fedora/Dockerfile
+++ b/utilities/containers/fedora/Dockerfile
@@ -28,6 +28,7 @@ RUN dnf -y update \
 libtool \
 net-tools \
 nmap-ncat \
+numactl-devel \
 openssl \
 openssl-devel \
 procps-ng \
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 7/7] ci: Bump the Fedora container to 40.

2024-05-14 Thread Ales Musil
Now that all failures were resolved for Fedora 40
we can bump the version in the container.

Signed-off-by: Ales Musil 
---
 utilities/containers/fedora/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utilities/containers/fedora/Dockerfile 
b/utilities/containers/fedora/Dockerfile
index d40a7b31f..019e9f138 100755
--- a/utilities/containers/fedora/Dockerfile
+++ b/utilities/containers/fedora/Dockerfile
@@ -1,4 +1,4 @@
-FROM quay.io/fedora/fedora:39
+FROM quay.io/fedora/fedora:40
 
 ARG CONTAINERS_PATH
 
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 4/7] tests: Replace wget with curl for failing commands.

2024-05-14 Thread Ales Musil
wget2 has a bug and doesn't return proper exit code on error [0].
Replace wget with curl in places where we expect exit code to
be different from 0.

[0] https://gitlab.com/gnuwget/wget2/-/issues/652
Signed-off-by: Ales Musil 
---
 tests/system-ovn.at| 15 +++
 utilities/containers/ubuntu/Dockerfile |  1 +
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 86fd240d2..f49330a1e 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -1439,7 +1439,7 @@ OVS_START_L7([bar3], [http])
 check ovn-nbctl --apply-after-lb acl-add foo from-lport 1002 "ip4 && ip4.dst 
== {172.16.1.2,172.16.1.3,172.16.1.4} && ct.new" drop
 check ovn-nbctl --wait=hv sync
 
-AT_CHECK([ip netns exec foo1 wget 30.0.0.1 -t 3 -T 1], [4], [ignore], [ignore])
+AT_CHECK([ip netns exec foo1 curl 30.0.0.1 --retry 3 --max-time 1], [28], 
[ignore], [ignore])
 
 AT_CHECK([ovs-appctl dpctl/flush-conntrack])
 
@@ -1603,7 +1603,7 @@ ovn-nbctl --reject lb-add lb3 30.0.0.10:80 ""
 ovn-nbctl ls-lb-add foo lb3
 # Filter reset segments
 NETNS_START_TCPDUMP([foo1], [-c 1 -neei foo1 ip[[33:1]]=0x14], [rst])
-NS_CHECK_EXEC([foo1], [wget -q 30.0.0.10],[4])
+NS_CHECK_EXEC([foo1], [curl 30.0.0.10 -s --retry 3 --max-time 1], [7])
 
 OVS_WAIT_UNTIL([
 n_reset=$(cat rst.tcpdump | wc -l)
@@ -4627,7 +4627,7 @@ NS_CHECK_EXEC([sw1-p1], [kill $(cat $pid_file)])
 NETNS_START_TCPDUMP([sw0-p2], [-c 1 -neei sw0-p2 ip[[33:1]]=0x14], [rst])
 OVS_WAIT_UNTIL([test 2 = `ovn-sbctl --bare --columns status find \
 service_monitor protocol=tcp | sed '/^$/d' | grep offline | wc -l`])
-NS_CHECK_EXEC([sw0-p2], [wget 10.0.0.10 -v -o wget$i.log],[4])
+NS_CHECK_EXEC([sw0-p2], [curl 10.0.0.10 -v > curl$i.log 2>&1],[7])
 
 OVS_WAIT_UNTIL([
 n_reset=$(cat rst.tcpdump | wc -l)
@@ -9770,7 +9770,7 @@ OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups 
br-int | \
 grep 'nat(dst=192.168.2.2:80)'])
 
 # should not dnat so will not be able to connect
-AT_CHECK([ip netns exec foo1 wget   30.30.30.30  -t 3 -T 1], [4], [ignore], 
[ignore])
+AT_CHECK([ip netns exec foo1 curl 30.30.30.30 --retry 3 --max-time 1], [28], 
[ignore], [ignore])
 
 # check conntrack zone has no tcp entry
 AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=$zone_id | \
@@ -9840,14 +9840,14 @@ ovn-nbctl lsp-add bar rp-bar -- set Logical_Switch_Port 
rp-bar \
 # Logical port 'foo1' in switch 'foo'.
 ADD_NAMESPACES(foo1)
 ADD_VETH(foo1, foo1, br-int, "fd11::2/64", "f0:00:00:01:02:03", \
- "fd11::1")
+ "fd11::1", "nodad")
 ovn-nbctl lsp-add foo foo1 \
 -- lsp-set-addresses foo1 "f0:00:00:01:02:03 fd11::2"
 
 # Logical port 'bar1' in switch 'bar'.
 ADD_NAMESPACES(bar1)
 ADD_VETH(bar1, bar1, br-int, "fd12::2/64", "f0:00:00:01:02:05", \
-"fd12::1")
+ "fd12::1",  "nodad")
 ovn-nbctl lsp-add bar bar1 \
 -- lsp-set-addresses bar1 "f0:00:00:01:02:05 fd12::2"
 
@@ -9864,7 +9864,6 @@ grep 'nat(dst=\[[fd12::2\]]:80)'])
 zone_id=$(ovn-appctl -t ovn-controller ct-zone-list | grep foo1 | cut -d ' ' 
-f2)
 
 OVS_START_L7([bar1], [http6])
-
 AT_CHECK([ip netns exec foo1  wget http://[[fd12::2]] -t 3 -T 1], [0], 
[ignore], [ignore])
 
 # check conntrack zone has tcp entry
@@ -9915,7 +9914,7 @@ OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups 
br-int | \
 grep 'nat(dst=\[[fd12::2\]]:80)'])
 
 # should not dnat so will not be able to connect
-AT_CHECK([ip netns exec foo1 wget  http://[[fd30::2]]  -t 3 -T 1], [4], 
[ignore], [ignore])
+AT_CHECK([ip netns exec foo1 curl http://[[fd30::2]] --retry 3 --max-time 1], 
[28], [ignore], [ignore])
 #
 # check conntrack zone has no tcp entry
 AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=$zone_id | \
diff --git a/utilities/containers/ubuntu/Dockerfile 
b/utilities/containers/ubuntu/Dockerfile
index c1ff711c5..ce7ce16c6 100755
--- a/utilities/containers/ubuntu/Dockerfile
+++ b/utilities/containers/ubuntu/Dockerfile
@@ -11,6 +11,7 @@ RUN apt update -y \
 automake \
 bc \
 clang \
+curl \
 ethtool \
 gcc \
 git \
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 6/7] ci: Make sure that we are using proper image.

2024-05-14 Thread Ales Musil
The container image for scheduled jobs was supposed to be Fedora,
however there was already Ubuntu image in the cache. Make sure
the cache is distinguished also by the event name.

Signed-off-by: Ales Musil 
---
 .github/workflows/test.yml | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index 22e4d339d..efe2dac25 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -131,7 +131,7 @@ jobs:
 uses: actions/cache@v4
 with:
   path: /tmp/image.tar
-  key: ${{ github.sha }}
+  key: ${{ github.sha }}/${{ github.event_name }}
 
   build-linux:
 needs: [build-dpdk, prepare-container]
@@ -212,10 +212,11 @@ jobs:
 key: ${{ needs.build-dpdk.outputs.dpdk_key }}
 
 - name: image cache
+  id: image_cache
   uses: actions/cache@v4
   with:
 path: /tmp/image.tar
-key: ${{ github.sha }}
+key: ${{ github.sha }}/${{ github.event_name }}
 
 # XXX This should be removed when native crun >=1.9.1
 - name: update crun script
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 0/7] Bump of CI Ubuntu and Fedora versions

2024-05-14 Thread Ales Musil
The series is pretty small however it is required for the
bump to Ubuntu 24.04 and Fedora 40. Both have newer GCC and
Clang which brought up some issues that needed to be fixed.

The series also includes fix for weekly runs to use Fedora
because the cache string wasn't specific enough.

Ales Musil (7):
  ci: Pin Fedora version for the build-rpm job.
  ovs: Bump the submodule to the tip of branch-3.3.
  ci: Update the Ubuntu container to 24.04.
  tests: Replace wget with curl for failing commands.
  ci: Add missing packages to run Fedora image in GH CI.
  ci: Make sure that we are using proper image.
  ci: Bump the Fedora container to 40.

 .ci/linux-build.sh |  3 +++
 .github/workflows/test.yml |  7 ---
 ovs|  2 +-
 tests/system-ovn.at| 15 +++
 utilities/containers/fedora/Dockerfile |  3 ++-
 utilities/containers/ubuntu/Dockerfile | 11 ++-
 6 files changed, 27 insertions(+), 14 deletions(-)

-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 3/7] ci: Update the Ubuntu container to 24.04.

2024-05-14 Thread Ales Musil
The Ubuntu 24.04 marks the Python installation as externally managed
this prevents pip from installing system-wide packages. Set the
PIP_BREAK_SYSTEM_PACKAGES env variable that allows pip to ignore
this and install the packages anyway.

At the same time the Python Babel fails to detect timezone when
it is just set to UTC. Setting it to Etc/UTC fixes the issue:

ValueError: ZoneInfo keys may not be absolute paths, got: /UTC

Signed-off-by: Ales Musil 
---
 utilities/containers/ubuntu/Dockerfile | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/utilities/containers/ubuntu/Dockerfile 
b/utilities/containers/ubuntu/Dockerfile
index ac1e6a5bf..c1ff711c5 100755
--- a/utilities/containers/ubuntu/Dockerfile
+++ b/utilities/containers/ubuntu/Dockerfile
@@ -1,4 +1,4 @@
-FROM registry.hub.docker.com/library/ubuntu:22.04
+FROM registry.hub.docker.com/library/ubuntu:24.04
 
 ARG CONTAINERS_PATH
 
@@ -37,6 +37,7 @@ RUN apt update -y \
 selinux-policy-dev \
 sudo \
 tcpdump \
+tzdata \
 wget \
 && \
 apt autoremove \
@@ -73,6 +74,10 @@ WORKDIR /workspace
 
 COPY $CONTAINERS_PATH/py-requirements.txt /tmp/py-requirements.txt
 
+# Ubuntu 24.04 marks the Python installation as externally managed, allow pip
+# to install the packages despite that.
+ENV PIP_BREAK_SYSTEM_PACKAGES 1
+
 # Update and install pip dependencies
 RUN python3 -m pip install --upgrade pip \
 && \
@@ -80,4 +85,7 @@ RUN python3 -m pip install --upgrade pip \
 && \
 python3 -m pip install -r /tmp/py-requirements.txt
 
+# The Python Babel fails to detect timezone when it is set to UTC only.
+ENV TZ Etc/UTC
+
 CMD ["/sbin/init"]
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 2/7] ovs: Bump the submodule to the tip of branch-3.3.

2024-05-14 Thread Ales Musil
The branch-3.3 includes fixes that are required for Fedora 40
and Ubuntu 24.04 bump.

cf461fe282c9 ("conntrack: Do not use {0} to initialize unions.")
4756bf4baf1e ("ofproto-dpif-trace: Fix access to an out-of-scope stack memory.")
01eca18be187 ("hash, jhash: Fix unaligned access to the hash remainder.")
4f61523c0d50 ("sparse: Add additional define for sparse on GCC >= 14.")
9a5c24d70fb6 ("sparse: Add immintrin.h header.")
3528cc6f452a ("tc: Fix -Wgnu-variable-sized-type-not-at-end warning with Clang 
18.")
5814de56878f ("tests: Fix build failure with Clang 18 due to 
-Wformat-truncation.")

Signed-off-by: Ales Musil 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index f19448b86..bf1b16364 16
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit f19448b8618967a108ec6f34713dd811ce1d1334
+Subproject commit bf1b16364b3f01b0ff5f2f6e76842e666226a17b
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 1/7] ci: Pin Fedora version for the build-rpm job.

2024-05-14 Thread Ales Musil
Keep the Fedora version pinned to 40 for the build-rpm.

Fixes: 8f18b3b6c52f ("ci: Keep the container version pinned.")
Signed-off-by: Ales Musil 
---
 .github/workflows/test.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index 456ab5c69..22e4d339d 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -315,7 +315,7 @@ jobs:
   build-linux-rpm:
 name: linux rpm fedora
 runs-on: ubuntu-22.04
-container: fedora:latest
+container: fedora:40
 timeout-minutes: 30
 
 strategy:
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC 00/11] Add psample support to NXAST_SAMPLE action.

2024-05-14 Thread Adrian Moreno



On 5/10/24 12:14 PM, Eelco Chaudron wrote:



On 10 May 2024, at 10:23, Adrian Moreno wrote:


On 5/10/24 9:14 AM, Eelco Chaudron wrote:



On 24 Apr 2024, at 21:53, Adrian Moreno wrote:


This is the userspace counterpart of the work being done in the kernel
[1]. Sending it as RFC to get some early feedback on the overall
solution.

** Problem description **
Currently, OVS supports several observability features, such as
per-bridge IPFIX, per-flow IPFIX and sFlow. However, given the
complexity of OVN-generated pipelines, a single sample is typically not
enough to troubleshoot what is OVN/OVS is doing with the packet. We need
highler level metadata alongside the packet sample.

This can be achieved by the means of per-flow IPFIX sampling and
NXAST_SAMPLE action, through which OVN can add two arbitrary 32-bit
values (observation_domain_id and observation_point_id) that are sent
alongside the packet information in the IPFIX frames.

However, there is a fundamental limitation of the existing sampling
infrastructure, specially in the kernel datapath: samples are handled by
ovs-vswitchd, forcing the need for an upcall (userspace action). The
fact that samples share the same netlink sockets and handler thread cpu
time with actual packet misse, can easily cause packet drops making this
solution unfit for use in highly loaded production systems.

Users are left with little option than guessing what sampling rate will
be OK for their traffic pattern and dealing with the lost accuracy.

** Feature Description **
In order to solve this situation and enable this feature to be safely
turned on in production, this RFC uses the psample support proposed in
[1] to send NXAST_SAMPLE samples to psample adding the observability
domain and point information as metadata.

~~ API ~~
The API is simply a new field called "psample_group" in the
Flow_Sample_Collector_Set (FSCS) table. Configuring this value is
orthogonal to also associating the FSCS entry to an entry in the IPFIX
table.

Apart from that configuration, the controller needs to add NXAST_SAMPLE
actions that refer the entry's id.

~~ HW Offload ~~
psample is already supported by tc using the act_sample action. The
metadata is taken from the actions' cookie.
This series also adds support for offloading the odp sample action,
only when it includes psample information but not nested actions.

A bit of special care has to be taken in the tc layer to not store the
ufid in the sample's cookie as it's used to carry action-specific data.

~~ Datapath support ~~
This new behavior of the datapth sample action is only supported on the
kernel datapath. A more detailed analysis is needed (and planned) to
find a way to also optimize the userspace datapath. However, although
potentially inefficient, there is not that much risk of dropping packets
with the existing IPFIX infrastructure.

~~ Testing ~~
The series includes an utility program called "ovs-psample" that listens
to packets multicasted by the psample module and prints them (also
printing the obs_domain and obs_point ids). In addition the kernel
series includes a tracepoint for easy testing and troubleshooting.

[1] https://patchwork.kernel.org/project/netdevbpf/list/?series=847473


Hi Andrew,



Who is Andrew? :-D


Sorry :( Handling too many things in parallel...


Thanks for sending out this RFC series. I did a code-only review (no testing), 
and the main concern I have (besides some locking) is the direct mapping to 
psample. If we decide that for userspace we are going to use USDT probes, we 
need another target, and we will duplicate a lot of code. My proposal would be 
to have a more general target name like system, or local (or a better name ;). 
This will be a system-local dpif target with a 32-bit group id.



Regarding the name, I agree adding "psample" to the API can be confusing for 
userspace datapath. Besides, I think it should also be in sync with the odp naming:

"system": My initial thought was it could be confused with the "system@" name 
of the kernel datapath, but that is not present in the OVSDB level so it could work. 
SAMPLE_ATTR_{SYSTEM_GROUP,SYSTEM_USER_COOKIE}?

"offload": I did consider this but I discarded it as it could be confused with 
actual hw/tc offloading of the sample action.

"local": Funny, I initially thought of the opposite, i.e: "external" as the sample collection and 
potential encoding (which is currently "local" to ovs-vswitchd) is now some "external" process. 
Still I guess the fact you thought of the opposite term suggests it's not expressive enough.

"raw": As in, samples are not IPFIX any more (unless the external collector 
decides to export them in IPFIX) but we're exporting the raw packet (alongside some 
metadata). SAMPLE_ATTR_{RAW_GROUP, RAW_USER_COOKIE}?

"dpif": As in, it depends on the dpif implementation. However, this term kind of internal to OVS 
(it appears only once in ovs-vswitchd.conf.db(5) without much clarification). Its more human synonym 
"datapath", or "d

Re: [ovs-dev] [PATCH] table: Fix freeing global variable.

2024-05-14 Thread wangyunjian via dev



> -Original Message-
> From: Ilya Maximets [mailto:i.maxim...@ovn.org]
> Sent: Tuesday, May 14, 2024 3:48 PM
> To: wangyunjian ; d...@openvswitch.org
> Cc: i.maxim...@ovn.org; luyicai ; sunpengfei (G)
> 
> Subject: Re: [ovs-dev][PATCH] table: Fix freeing global variable.
> 
> On 5/14/24 05:25, Yunjian Wang wrote:
> > From: Pengfei Sun 
> >
> > In function shash_replace_nocopy, argument to free() is the address of
> > a global variable (argument passed by function table_print_json__),
> > which is not memory allocated by malloc().
> >
> > ovsdb-client -f json monitor Open_vSwitch --timestamp
> >
> > ASan reports:
> >
> 
> =
> > ==1443083==ERROR: AddressSanitizer: attempting free on address which
> > was not malloc()-ed: 0x00535980 in thread T0
> > #0 0xaf1c9eac in __interceptor_free (/usr/lib64/libasan.so.6+
> > 0xa4eac)
> > #1 0x4826e4 in json_destroy_object lib/json.c:445
> > #2 0x4826e4 in json_destroy__ lib/json.c:403
> > #3 0x4cc4e4 in table_print lib/table.c:633
> > #4 0x410650 in monitor_print_table ovsdb/ovsdb-client.c:1019
> > #5 0x410650 in monitor_print ovsdb/ovsdb-client.c:1040
> > #6 0x4110cc in monitor_print ovsdb/ovsdb-client.c:1030
> > #7 0x4110cc in do_monitor__ ovsdb/ovsdb-client.c:1503
> > #8 0x40743c in main ovsdb/ovsdb-client.c:283
> > #9 0xaeb50038  (/usr/lib64/libc.so.6+0x2b038)
> > #10 0xaeb50110 in __libc_start_main (/usr/lib64/libc.so.6+
> > 0x2b110)
> > #11 0x40906c in _start (/usr/local/bin/ovsdb-client+0x40906c)
> >
> > Use xstrdup to allocate memory for argument "time".
> >
> > Fixes: cb139fa8b3a1 ("table: New function table_format() for
> > formatting a table as a string.")
> > Signed-off-by: Pengfei Sun 
> > ---
> >  lib/table.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Hi.  Thanks for the patch!
> 
> >
> > diff --git a/lib/table.c b/lib/table.c index 48d18b6..bcc6fb0 100644
> > --- a/lib/table.c
> > +++ b/lib/table.c
> > @@ -523,7 +523,7 @@ table_print_json__(const struct table *table, const
> struct table_style *style,
> >  }
> >  if (table->timestamp) {
> >  json_object_put_nocopy(
> 
> I think, we should just remove 'nocopy' part here instead of adding xstrdup()
> below.  json_object_put() will copy the name.

Thanks for your suggestion, will include it in next version.

> 
> > -json, "time",
> > +json, xstrdup("time"),
> >  json_string_create_nocopy(table_format_timestamp__()));
> >  }
> >
> 
> Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] fix ct tp policy when create zone.

2024-05-14 Thread Chandler Wu
Hi, IIya, thanks for the attention.

If I create a zone for the first time, the new tp_cfg will be copied to the
zone, see `ct_zone_alloc`. Then update_timeout_policy will check that the
new copied tp equals to tp_cfg, so ofproto_ct_set_zone_timeout_policy will
not got called. Or you can check tag v3.0.0 or before. There's no such
issue for these versions.

Best regards.

On Tue, May 14, 2024 at 5:11 AM Ilya Maximets  wrote:

> On 5/6/24 06:05, Chandler Wu wrote:
> > From 5b4d479a9083272e56c51ef9521e6ef665dd534d Mon Sep 17 00:00:00 2001
> > From: chandlerwu 
> > Date: Mon, 6 May 2024 11:58:21 +0800
> > Subject: [PATCH] Subject:[PATCH] fix ct tp policy when create zone.
> >
> > Signed-off-by: chandlerwu 
> > ---
> >  vswitchd/bridge.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> > index 95a65fcdc..e60359b59 100644
> > --- a/vswitchd/bridge.c
> > +++ b/vswitchd/bridge.c
> > @@ -766,6 +766,8 @@ ct_zones_reconfigure(struct datapath *dp, struct
> ovsrec_datapath *dp_cfg)
> >  if (!ct_zone) {
> >  ct_zone = ct_zone_alloc(zone_id, tp_cfg);
> >  hmap_insert(&dp->ct_zones, &ct_zone->node,
> hash_int(zone_id, 0));
> > +ofproto_ct_set_zone_timeout_policy(dp->type,
> ct_zone->zone_id,
> > +   &ct_zone->tp);
> >  }
> >
> >  struct simap new_tp = SIMAP_INITIALIZER(&new_tp);
>
> Hi, Chandler Wu.  Thanks for the patch!
>
> But could you, please, explain what is the issue you're trying to fix?
>
> Reading the code, it seems that update_timeout_policy() call later in
> the function should correctly set the policy.  Is that not happening?
>
> Best regards, Ilya Maximets.
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] table: Fix freeing global variable.

2024-05-14 Thread Ilya Maximets
On 5/14/24 05:25, Yunjian Wang wrote:
> From: Pengfei Sun 
> 
> In function shash_replace_nocopy, argument to free() is the address
> of a global variable (argument passed by function table_print_json__),
> which is not memory allocated by malloc().
> 
> ovsdb-client -f json monitor Open_vSwitch --timestamp
> 
> ASan reports:
> =
> ==1443083==ERROR: AddressSanitizer: attempting free on address
> which was not malloc()-ed: 0x00535980 in thread T0
> #0 0xaf1c9eac in __interceptor_free (/usr/lib64/libasan.so.6+
> 0xa4eac)
> #1 0x4826e4 in json_destroy_object lib/json.c:445
> #2 0x4826e4 in json_destroy__ lib/json.c:403
> #3 0x4cc4e4 in table_print lib/table.c:633
> #4 0x410650 in monitor_print_table ovsdb/ovsdb-client.c:1019
> #5 0x410650 in monitor_print ovsdb/ovsdb-client.c:1040
> #6 0x4110cc in monitor_print ovsdb/ovsdb-client.c:1030
> #7 0x4110cc in do_monitor__ ovsdb/ovsdb-client.c:1503
> #8 0x40743c in main ovsdb/ovsdb-client.c:283
> #9 0xaeb50038  (/usr/lib64/libc.so.6+0x2b038)
> #10 0xaeb50110 in __libc_start_main (/usr/lib64/libc.so.6+
> 0x2b110)
> #11 0x40906c in _start (/usr/local/bin/ovsdb-client+0x40906c)
> 
> Use xstrdup to allocate memory for argument "time".
> 
> Fixes: cb139fa8b3a1 ("table: New function table_format() for formatting a 
> table as a string.")
> Signed-off-by: Pengfei Sun 
> ---
>  lib/table.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Hi.  Thanks for the patch!

> 
> diff --git a/lib/table.c b/lib/table.c
> index 48d18b6..bcc6fb0 100644
> --- a/lib/table.c
> +++ b/lib/table.c
> @@ -523,7 +523,7 @@ table_print_json__(const struct table *table, const 
> struct table_style *style,
>  }
>  if (table->timestamp) {
>  json_object_put_nocopy(

I think, we should just remove 'nocopy' part here instead of
adding xstrdup() below.  json_object_put() will copy the name.

> -json, "time",
> +json, xstrdup("time"),
>  json_string_create_nocopy(table_format_timestamp__()));
>  }
>  

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] conntrack: Fully initialize conn struct before insertion.

2024-05-14 Thread Ilya Maximets
On 5/13/24 14:38, Simon Horman wrote:
> On Fri, May 10, 2024 at 05:45:54PM +0200, Paolo Valerio wrote:
>> From: Mike Pattrick 
>>
>> In case packets are concurrently received in both directions, there's
>> a chance that the ones in the reverse direction get received right
>> after the connection gets added to the connection tracker but before
>> some of the connection's fields are fully initialized.
>> This could cause OVS to access potentially invalid, as the lookup may
>> end up retrieving the wrong offsets during CONTAINER_OF(), or
>> uninitialized memory.
>>
>> This may happen in case of regular NAT or all-zero SNAT.
>>
>> Fix it by initializing early the connections fields.
>>
>> Fixes: 1116459b3ba8 ("conntrack: Remove nat_conn introducing key 
>> directionality.")
>> Reported-at: https://issues.redhat.com/browse/FDP-616
>> Signed-off-by: Mike Pattrick 
>> Co-authored-by: Paolo Valerio 
>> Signed-off-by: Paolo Valerio 
> 
> Acked-by: Simon Horman 

Thanks, Paolo, Mike and Simon!

Applied and backported down to 2.17.

Bets regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] conntrack: Do not use {0} to initialize unions.

2024-05-14 Thread Ilya Maximets
On 5/9/24 13:37, Paolo Valerio wrote:
> Xavier Simonart  writes:
> 
>> In the following case:
>> union ct_addr {
>> unsigned int ipv4;
>> struct in6_addr ipv6;
>> };
>> union ct_addr zero_ip = {0};
>>
>> The ipv6 field might not be properly initialized.
>> For instance, clang 18.1.1 does not initialize the ipv6 field.
>>
>> Reported-at: https://issues.redhat.com/browse/FDP-608
>> Signed-off-by: Xavier Simonart 
>> ---
>> v2: updated based on nit from Paolo.
>> ---
> 
> Thanks Xavier.
> 
> Acked-by: Paolo Valerio 

Thanks, Xavier and Paolo!

Applied and backported down to 2.17.

Bets regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC 01/11] odp-util: Add support to new psample uAPI.

2024-05-14 Thread Adrian Moreno



On 5/10/24 12:45 PM, Adrian Moreno wrote:



On 5/10/24 12:06 PM, Eelco Chaudron wrote:

On 24 Apr 2024, at 21:53, Adrian Moreno wrote:


The new odp sample attributes allow userspace to specify a group_id and
user-defined cookie to be passed down to psample.

Add support for parsing and formatting such action.

Signed-off-by: Adrian Moreno 


Hi Adrian,

See my comments below inline.

//Eelco


---
  include/linux/openvswitch.h  |  49 +---
  lib/odp-execute.c    |   3 +
  lib/odp-util.c   | 150 ++-
  ofproto/ofproto-dpif-ipfix.c |   2 +
  python/ovs/flow/odp.py   |   2 +
  tests/odp.at |   5 ++
  6 files changed, 163 insertions(+), 48 deletions(-)

diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index d9fb991ef..3e748405c 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -696,6 +696,7 @@ enum ovs_flow_attr {
  #define OVS_UFID_F_OMIT_MASK (1 << 1)
  #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)

+#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
  /**
   * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE action.
   * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample with
@@ -703,15 +704,27 @@ enum ovs_flow_attr {
   * %UINT32_MAX samples all packets and intermediate values sample intermediate
   * fractions of packets.
   * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling event.
- * Actions are passed as nested attributes.
+ * Actions are passed as nested attributes. Optional if
+ * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
+ * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as psample group.
+ * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
+ * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie that, if
+ * provided, will be copied to the psample cookie.


Should we mention any limit on the actual length of the cookie?

Any reason we explicitly call this psample? From an OVS perspective, this 
could just be 
SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE and 
_GROUP. Other datapaths, do not have PSAMPLE.




See my response to your comment on the cover-letter for possible name 
alternatives.


Thinking about it more, from an OVS perspective we should probably not even 
send down a COOKIE, but we should send down an obs_domain_id and obs_point_id, 
and then the kernel can move this into a cookie.




I did consider this. However, the opaque cookie fits well with the tc API which 
does not have two 32-bit values but an opaque 128-bit cookie. So if the action 
is offloaded OVS needs to "encode" the two 32-bit values into the cookie anyways.


The collector itself could be called system/local collector, or something like 
that. This way it can use for example psample for kernel and UDST probes for 
usespace. Both can pass the group and cookie (obs_domain_id and obs_point_id).


Also, the presence of any of this should not dictate the psample action, we 
probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.




It clearer and it might simplify option verification a bit, but isn't a bit 
redundant? There is no flag for action execution for instance, the presence of 
the attribute is enough.


An alternative would be to have nested attributes, e.g:

enum ovs_sample_attr {
 OVS_SAMPLE_ATTR_UNSPEC,
 OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
 OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */
 OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* attributes. 
*/

 __OVS_SAMPLE_ATTR_MAX,
};

enum ovs_sample_system_attr {
 OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
 OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
 OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_*

 __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
};

WDYT?



Another benefit of nested attributes is the easier addition of other 
sample-related attributes, I already have a candidate: OVS_SAMPLE_SYSTEM_TRUNC


This will be passed to psample (md->trunc_size) who will truncate the packet 
size to the specified value. This can be useful for large packets for which we 
are only interested in the headers.


WDYT?

[snip]

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] controller: Fix an issue wrt cleanup of stale patch port.

2024-05-14 Thread Priyankar Jain


On 10/05/24 10:06 pm, Numan Siddique wrote:

On Fri, May 3, 2024 at 11:00 AM Numan Siddique  wrote:

On Mon, Apr 22, 2024 at 2:49 AM Priyankar Jain
 wrote:

Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
set: existing_ports.
2. For each localnet port in SB:
2.1 Create a patch port on br-int. (if it already exists on br-int,
do not create but remove the entry from exisitng_ports set).
2.2 Create a peer patch port on br-phys-x. (if it already exists on the
bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in #3 it will be considered.

Fixes: b600316f252a ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain 

Acked-by: Numan Siddique 

Applied this patch to main and backported till branch-23.03.  It
doesn't apply cleanly for branch-22.12.

Thanks
Numan

Thanks for merging the changes Numan. Appreciate the help.
22.12 does not have the commit which introduced this regression:

b600316f252a ("Don't delete patch ports that don't belong to br-int")

So i think we are good.

Thanks,
Priyankar


Numan


---
  controller/patch.c |  32 +++--
  tests/ovn.at   | 109 +
  2 files changed, 124 insertions(+), 17 deletions(-)

diff --git a/controller/patch.c b/controller/patch.c
index c1cd5060d..4fed6e375 100644
--- a/controller/patch.c
+++ b/controller/patch.c
@@ -314,6 +314,21 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
  || smap_get(&port->external_ids, "ovn-l3gateway-port")
  || smap_get(&port->external_ids, "ovn-logical-patch-port")) {
  shash_add(&existing_ports, port->name, port);
+/* Also add peer ports to the list. */
+for (size_t j = 0; j < port->n_interfaces; j++) {
+struct ovsrec_interface *p_iface = port->interfaces[j];
+if (strcmp(p_iface->type, "patch")) {
+continue;
+}
+const char *peer_name = smap_get(&p_iface->options, "peer");
+if (peer_name) {
+const struct ovsrec_port *peer_port =
+get_port(ovsrec_port_by_name, peer_name);
+if (peer_port) {
+shash_add(&existing_ports, peer_port->name, peer_port);
+}
+}
+}
  }
  }

@@ -336,23 +351,6 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
   * ovn-controller.  Otherwise it may cause unncessary dataplane
   * interruption during restart/upgrade. */
  if (!daemon_started_recently()) {
-/* delete peer patch port first */
-for (size_t i = 0; i < port->n_interfaces; i++) {
-struct ovsrec_interface *iface = port->interfaces[i];
-if (strcmp(iface->type, "patch")) {
-continue;
-}
-const char *iface_peer = smap_get(&iface->options, "peer");
-if (iface_peer) {
-const struct ovsrec_port *peer_port =
-get_port(ovsrec_port_by_name, iface_peer);
-if (peer_port) {
-remove_port(bridge_table, peer_port);
-}
-}
-}
-
-/* now delete integration bridge patch port */
  remove_port(bridge_table, port);
  }
  }
diff --git a/tests/ovn.at b/tests/ovn.at
index 4d0c7ad53..4b71e4916 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -36454,6 +36454,115 @@ OVN_CLEANUP([hv1])
  AT_CLEANUP
  ])

+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn-controller update network_name option for localnet port])
+ovn_start
+net_add n1
+
+sim_add hv1
+as hv1
+
+# Bridge Topology
+# Initial: br-int and br-phys-1 connected through ovn-localnet patch port.
+#
+# br-phys-1 -- br-int
+#
+# Fina