Re: [ovs-dev] [PATCH ovn] controller: Restore MAC and vlan for DVR scenario

2022-12-16 Thread Jakub Libosvar
On 12/12/22 7:41 AM, Dumitru Ceara wrote:
> On 12/5/22 17:07, Dumitru Ceara wrote:
>> On 9/30/22 12:43, Dumitru Ceara wrote:
>>> On 9/20/22 22:18, Mark Michelson wrote:
 Thanks Ales,

 Acked-by: Mark Michelson 

>>>
>>> I applied this to the main branch and backported it to all stable
>>> branches down to branch-22.03.
>>>
>>
>> Hi all,
>>
>> There was an internal request from within Red Hat (from Jakub in CC) to
>> backport this fix to branch-21.12 too.  If nobody is it I can take care

Hi,

is there a BZ/Jira that I can set our OSP bug as depending on?

Thank you very much for the backport!

Kuba

> 
> Oops, s/is it/is against it/
> 
> But I hope the intention of the question was clear.
> 
>> of porting the patch to 21.12.
>>
>> I'll wait a day or two to give people time to reply.
>>
> 
> I went ahead and applied this to branch 21.12 too.
> 
> Regards,
> Dumitru
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] northd: bypass connection tracking for stateless flows when there are LB flows present

2022-12-16 Thread Numan Siddique
On Thu, Dec 15, 2022 at 4:38 PM Han Zhou  wrote:
>
> On Mon, Dec 12, 2022 at 1:28 AM venu iyer  wrote:
> >
> > Currently, even stateless flows are subject to connection tracking when
> there are
> > LB rules (for DNAT). However, if a flow needs to be subjected to LB, then
> it shouldn't
> > be configured as stateless.
> >
> > Stateless flow means we should not track it, and this change exempts
> stateless
> > flows from being tracked regardless of whether LB rules are present or
> not.
> >
> > Signed-off-by: venu iyer 
> > Acked-by: Han Zhou 
> > ---
> >  northd/northd.c |  25 +++-
> >  northd/ovn-northd.8.xml |  57 
> >  ovn-nb.xml  |   3 +
> >  tests/ovn-northd.at |  76 +--
> >  tests/ovn.at|   4 +-
> >  tests/system-ovn.at | 296 
> >  6 files changed, 383 insertions(+), 78 deletions(-)
> >
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 7c48bb3b4..5d8ef612f 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -140,8 +140,8 @@ enum ovn_stage {
> >  PIPELINE_STAGE(SWITCH, IN,  L2_UNKNOWN,26, "ls_in_l2_unknown")
>  \
> >
>  \
> >  /* Logical switch egress stages. */
>   \
> > -PIPELINE_STAGE(SWITCH, OUT, PRE_LB,   0, "ls_out_pre_lb")
>   \
> > -PIPELINE_STAGE(SWITCH, OUT, PRE_ACL,  1, "ls_out_pre_acl")
>  \
> > +PIPELINE_STAGE(SWITCH, OUT, PRE_ACL,  0, "ls_out_pre_acl")
>  \
> > +PIPELINE_STAGE(SWITCH, OUT, PRE_LB,   1, "ls_out_pre_lb")
>   \
> >  PIPELINE_STAGE(SWITCH, OUT, PRE_STATEFUL, 2, "ls_out_pre_stateful")
>   \
> >  PIPELINE_STAGE(SWITCH, OUT, ACL_HINT, 3, "ls_out_acl_hint")
>   \
> >  PIPELINE_STAGE(SWITCH, OUT, ACL,  4, "ls_out_acl")
>  \
> > @@ -215,6 +215,7 @@ enum ovn_stage {
> >  #define REGBIT_ACL_LABEL  "reg0[13]"
> >  #define REGBIT_FROM_RAMP  "reg0[14]"
> >  #define REGBIT_PORT_SEC_DROP  "reg0[15]"
> > +#define REGBIT_ACL_STATELESS  "reg0[16]"
> >
> >  #define REG_ORIG_DIP_IPV4 "reg1"
> >  #define REG_ORIG_DIP_IPV6 "xxreg1"
> > @@ -290,7 +291,7 @@ enum ovn_stage {
> >   * | R0 | REGBIT_{CONNTRACK/DHCP/DNS}  |   |
>   |
> >   * || REGBIT_{HAIRPIN/HAIRPIN_REPLY}   |   |
>   |
> >   * || REGBIT_ACL_HINT_{ALLOW_NEW/ALLOW/DROP/BLOCK} |   |
>   |
> > - * || REGBIT_ACL_LABEL | X |
>   |
> > + * || REGBIT_ACL_{LABEL/STATELESS} | X |
>   |
> >   * ++--+ X |
>   |
> >   * | R5 |   UNUSED | X |
> LB_L2_AFF_BACKEND_IP6   |
> >   * | R1 | ORIG_DIP_IPV4 (>= IN_PRE_STATEFUL)   | R |
>   |
> > @@ -5693,17 +5694,18 @@ build_stateless_filter(struct ovn_datapath *od,
> > const struct nbrec_acl *acl,
> > struct hmap *lflows)
> >  {
> > +const char *action = REGBIT_ACL_STATELESS" = 1; next;";
> >  if (!strcmp(acl->direction, "from-lport")) {
> >  ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_PRE_ACL,
> >  acl->priority + OVN_ACL_PRI_OFFSET,
> >  acl->match,
> > -"next;",
> > +action,
> >  >header_);
> >  } else {
> >  ovn_lflow_add_with_hint(lflows, od, S_SWITCH_OUT_PRE_ACL,
> >  acl->priority + OVN_ACL_PRI_OFFSET,
> >  acl->match,
> > -"next;",
> > +action,
> >  >header_);
> >  }
> >  }
> > @@ -5795,6 +5797,10 @@ build_pre_acls(struct ovn_datapath *od, const
> struct hmap *port_groups,
> >REGBIT_CONNTRACK_DEFRAG" = 1; next;");
> >  ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 100, "ip",
> >REGBIT_CONNTRACK_DEFRAG" = 1; next;");
> > +} else if (od->has_lb_vip) {
> > +/* We'll build stateless filters if there are LB rules so that
> > + * the stateless flows are not tracked in pre-lb. */
> > + build_stateless_filters(od, port_groups, lflows);
> >  }
> >  }
> >
> > @@ -5930,6 +5936,12 @@ build_pre_lb(struct ovn_datapath *od, const struct
> shash *meter_groups,
> >   110, lflows);
> >  }
> >
> > +/* Do not sent statless flows via conntrack */
> > +ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110,
> > +  REGBIT_ACL_STATELESS" == 1", "next;");
> > +ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110,
> > +  REGBIT_ACL_STATELESS" == 1", "next;");
> > +
> >  /* 'REGBIT_CONNTRACK_NAT' is set to let the 

Re: [ovs-dev] [PATCH 6/6] ci: add the opts about ALLOW_EXPERIMENTAL_API

2022-12-16 Thread David Marchand
On Fri, Dec 16, 2022 at 4:52 PM Simon Horman  wrote:
>
> From: Peng Zhang 
>
> This commit adds support for OVS-DPDK with -DALLOW_EXPERIMENTAL_API.
>
> Tunnel offloads and Meter offloads are experimental APIs in DPDK. To
> enable these features, compile need add -DALLOW_EXPERIMENTAL_API. So
> in workflow, we also need need the new test with
> -DALLOW_EXPERIMENTAL_API.
>
> Signed-off-by: Peng Zhang 

We have a similar patch in the dpdk-latest branch.
https://github.com/openvswitch/ovs/commit/a8f6be98801f0c43d52173843d649df2af5e1c0d
Is something wrong with it?


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Load based PMD sleeping.

2022-12-16 Thread Kevin Traynor

On 09/12/2022 15:13, David Marchand wrote:

Hi Kevin,

Here are some comments on the stats side:

On Tue, Nov 29, 2022 at 3:01 PM Kevin Traynor  wrote:

@@ -6964,5 +6999,19 @@ reload:
   * There was no time updates on current iteration. */
  pmd_thread_ctx_time_update(pmd);
-tx_packets = dp_netdev_pmd_flush_output_packets(pmd, false);
+tx_packets = dp_netdev_pmd_flush_output_packets(pmd,
+   sleep_time ? true : false);
+}
+
+if (powersave) {
+if (sleep_time) {
+xnanosleep(sleep_time);
+time_slept = sleep_time;


- time_slept is the requested time, and not the time that ovs actually slept.
PMD_PWR_SLEEP_TIME stat won't be accurate if xnanosleep() takes longer
than expected.
I would add a "timer" around xnanosleep() and store cycles internally
(and then translate back to us in the stats output).



Good idea, I added this in v2.



- Both pmd idling (PMD_CYCLES_ITER_IDLE) and pmd processing
(PMD_CYCLES_ITER_BUSY) stats now include cycles spent sleeping.
Those stats are used for animating pmd stats ("avg processing cycles
per packet") and rxq stats ("overhead").
I think those "sleeping" cycles should be counted as idle so that
"busy" cycles do represent time spent processing the packets.



As we chatted about, I removed it from both idle and busy stats. It 
didn't seem to fit well loading all the sleeps into idle as they may 
have taken place during a busy iteration.


So now it is keeping idle and busy stats meaning the same, whether 
powersave is true/false. For the sleeps there is a new stat for total 
time slept and average/iteration. So we can see idle iterations/cycles, 
busy iterations/cycles and sleep time.


I also removed max sleeps but kept "no-sleep thresh hit".

Let me know if you think it's enough visibility. Thanks for suggestion 
and review.




WDYT?


+}
+if (sleep_time < PMD_PWR_MAX_SLEEP) {
+/* Increase potential sleep time for next iteration. */
+sleep_time += PMD_PWR_INC;
+} else {
+max_sleep_hit = true;
+}
  }






___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] dpif-netdev: Load based PMD sleeping.

2022-12-16 Thread Kevin Traynor
Sleep for an incremental amount of time if none of the Rx queues
assigned to a PMD have at least half a batch of packets (i.e. 16 pkts)
on an polling iteration of the PMD.

Upon detecting the threshold of >= 16 pkts on an Rxq, reset the
sleep time to zero (i.e. no sleep).

Sleep time will be increased by 1 uS on each iteration where
the low load conditions remain up to a total of the max sleep
time which has a default of 250 uS.

The feature is off by default and can be enabled by:
ovs-vsctl set Open_vSwitch . other_config:pmd-powersave=true

The max sleep time per iteration can be set e.g. to set to 500 uS:
ovs-vsctl set Open_vSwitch . other_config:pmd-powersave-maxsleep=500

Also add new stats to pmd-perf-show to get visibility of operation
e.g.


  - No-sleep hit:36445  ( 98.4 % of busy it)
 Sleep time:   3350902  uS ( 34 us/it avg.)


Signed-off-by: Kevin Traynor 

---
v2:
- Updated to mark feature as experimental as there is still discussion
  on it's operation and control knobs
- Added pmd-powersave-maxsleep to set the max requested sleep time
- Added unit tests for pmd-powersave and pmd-powersave-maxsleep config
  knobs
- Added docs to explain that requested sleep time and actual sleep time
  may differ
- Added actual measurement of sleep time instead of reporting requested
  time
- Removed Max sleep hit statistics
- Added total sleep time statistic for the length of the measurement
  period (avg. uS per iteration still exists also)
- Updated other statistics to account for sleep time
- Some renaming
- Replaced xnanosleep with nanosleep to avoid having to start/end
  quiesce for every sleep (this may KO this feature on Windows)
- Limited max requested sleep to max PMD quiesce time (10 ms)
- Adapted ALB measurement about whether a PMD is overloaded to account
  for time spent sleeping
---
 Documentation/topics/dpdk/pmd.rst | 46 +
 lib/dpif-netdev-perf.c| 26 --
 lib/dpif-netdev-perf.h|  5 +-
 lib/dpif-netdev.c | 86 +--
 tests/pmd.at  | 43 
 vswitchd/vswitch.xml  | 34 
 6 files changed, 229 insertions(+), 11 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst 
b/Documentation/topics/dpdk/pmd.rst
index b259cc8b3..abc552029 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -312,4 +312,50 @@ reassignment due to PMD Auto Load Balance. For example, 
this could be set
 (in min) such that a reassignment is triggered at most every few hours.
 
+PMD Power Saving (Experimental)
+---
+
+PMD threads constantly poll Rx queues which are assigned to them. In order to
+reduce the CPU cycles they use, they can sleep for small periods of time
+when there is no load or very-low load from all the Rx queues they poll.
+
+This can be enabled by::
+
+$ ovs-vsctl set open_vswitch . other_config:pmd-powersave="true"
+
+With this enabled a PMD may request to sleep by an incrementing amount of time
+up to a maximum time. If at any point the threshold of at least half a batch of
+packets (i.e. 16) is received from an Rx queue that the PMD is polling  is met,
+the sleep time will be reset to 0. (i.e. no sleep).
+
+The default maximum sleep time is set 250 us. A user can configure a new
+maximum requested sleep time in uS. e.g. to set to 1 ms::
+
+$ ovs-vsctl set open_vswitch . other_config:pmd-powersave-maxsleep=1000
+
+Sleeping in a PMD thread will mean there is a period of time when the PMD
+thread will not process packets. Sleep times requested are not guaranteed
+and can differ significantly depending on system configuration. The actual
+time not processing packets will be determined by the sleep and processor
+wake-up times and should be tested with each system configuration.
+
+Sleep time statistics for 10 secs can be seen with::
+
+$ ovs-appctl dpif-netdev/pmd-stats-clear \
+&& sleep 10 && ovs-appctl dpif-netdev/pmd-perf-show
+
+Example output, showing that the 16 packet no-sleep threshhold occurred in
+98.2% of busy iterations and there was an average sleep time of
+33 us per iteration::
+
+ No-sleep hit: 119043  ( 98.2 % of busy it)
+ Sleep time:  10638025 uS ( 33 us/it avg.)
+
+.. note::
+
+If there is a sudden spike of packets while the PMD thread is sleeping and
+the processor is in a low-power state it may result in some lost packets or
+extra latency before the PMD thread returns to processing packets at full
+rate.
+
 .. _ovs-vswitchd(8):
 http://openvswitch.org/support/dist-docs/ovs-vswitchd.8.html
diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c
index a2a7d8f0b..16445c68f 100644
--- a/lib/dpif-netdev-perf.c
+++ b/lib/dpif-netdev-perf.c
@@ -231,4 +231,6 @@ pmd_perf_format_overall_stats(struct ds *str, struct 
pmd_perf_stats *s,
 uint64_t idle_iter = s->pkts.bin[0];
 uint64_t 

[ovs-dev] [PATCH v2 ovn] Add IPv6 support for lb health-check

2022-12-16 Thread Lorenzo Bianconi
Add Similar to IPv4 counterpart, introduce IPv6 load-balancer health
check support.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2136094
Signed-off-by: Lorenzo Bianconi 
---
Changes since v1:
- fix potential crash in ovn-northd
- improve documentation
---
 controller/pinctrl.c| 213 -
 northd/northd.c |  79 ++
 northd/ovn-northd.8.xml |  17 +++
 ovn-nb.xml  |  21 ++--
 tests/ovn.at| 201 ++-
 tests/system-ovn.at | 230 +++-
 6 files changed, 658 insertions(+), 103 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index f1176a2f2..d7dbdab0e 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -6740,7 +6740,7 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
 ovs_be32 ip4;
 if (ip_parse(sb_svc_mon->ip, )) {
 ip_addr = in6_addr_mapped_ipv4(ip4);
-} else {
+} else if (!ipv6_parse(sb_svc_mon->ip, _addr)) {
 continue;
 }
 
@@ -6753,16 +6753,27 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
 continue;
 }
 
-for (size_t j = 0; j < laddrs.n_ipv4_addrs; j++) {
-if (ip4 == laddrs.ipv4_addrs[j].addr) {
-ea = laddrs.ea;
-mac_found = true;
-break;
+if (IN6_IS_ADDR_V4MAPPED(_addr)) {
+for (size_t j = 0; j < laddrs.n_ipv4_addrs; j++) {
+if (ip4 == laddrs.ipv4_addrs[j].addr) {
+ea = laddrs.ea;
+mac_found = true;
+break;
+}
+}
+} else {
+for (size_t j = 0; j < laddrs.n_ipv6_addrs; j++) {
+if (IN6_ARE_ADDR_EQUAL(_addr,
+   _addrs[j].addr)) {
+ea = laddrs.ea;
+mac_found = true;
+break;
+}
 }
 }
 
-if (!mac_found && !laddrs.n_ipv4_addrs) {
-/* IPv4 address(es) are not configured. Use the first mac. */
+if (!mac_found && !laddrs.n_ipv4_addrs && !laddrs.n_ipv6_addrs) {
+/* IP address(es) are not configured. Use the first mac. */
 ea = laddrs.ea;
 mac_found = true;
 }
@@ -6796,7 +6807,7 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
 svc_mon->port_key = port_key;
 svc_mon->proto_port = sb_svc_mon->port;
 svc_mon->ip = ip_addr;
-svc_mon->is_ip6 = false;
+svc_mon->is_ip6 = !IN6_IS_ADDR_V4MAPPED(_addr);
 svc_mon->state = SVC_MON_S_INIT;
 svc_mon->status = SVC_MON_ST_UNKNOWN;
 svc_mon->protocol = protocol;
@@ -7564,26 +7575,30 @@ svc_monitor_send_tcp_health_check__(struct rconn 
*swconn,
 ovs_be32 tcp_ack,
 ovs_be16 tcp_src)
 {
-if (svc_mon->is_ip6) {
-return;
-}
-
 /* Compose a TCP-SYN packet. */
 uint64_t packet_stub[128 / 8];
 struct dp_packet packet;
+dp_packet_use_stub(, packet_stub, sizeof packet_stub);
 
 struct eth_addr eth_src;
 eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, _src);
-ovs_be32 ip4_src;
-ip_parse(svc_mon->sb_svc_mon->src_ip, _src);
-
-dp_packet_use_stub(, packet_stub, sizeof packet_stub);
-pinctrl_compose_ipv4(, eth_src, svc_mon->ea,
- ip4_src, in6_addr_get_mapped_ipv4(_mon->ip),
- IPPROTO_TCP, 63, TCP_HEADER_LEN);
+if (svc_mon->is_ip6) {
+struct in6_addr ip6_src;
+ipv6_parse(svc_mon->sb_svc_mon->src_ip, _src);
+pinctrl_compose_ipv6(, eth_src, svc_mon->ea,
+ _src, _mon->ip, IPPROTO_TCP,
+ 63, TCP_HEADER_LEN);
+} else {
+ovs_be32 ip4_src;
+ip_parse(svc_mon->sb_svc_mon->src_ip, _src);
+pinctrl_compose_ipv4(, eth_src, svc_mon->ea,
+ ip4_src, in6_addr_get_mapped_ipv4(_mon->ip),
+ IPPROTO_TCP, 63, TCP_HEADER_LEN);
+}
 
 struct tcp_header *th = dp_packet_l4();
 dp_packet_set_l4(, th);
+th->tcp_csum = 0;
 th->tcp_dst = htons(svc_mon->proto_port);
 th->tcp_src = tcp_src;
 
@@ -7594,7 +7609,11 @@ svc_monitor_send_tcp_health_check__(struct rconn *swconn,
 th->tcp_winsz = htons(65160);
 
 uint32_t csum;
-csum = packet_csum_pseudoheader(dp_packet_l3());
+if (svc_mon->is_ip6) {
+csum = packet_csum_pseudoheader6(dp_packet_l3());
+} else {
+csum = packet_csum_pseudoheader(dp_packet_l3());
+}
 csum = csum_continue(csum, th, dp_packet_size() -
  

[ovs-dev] [PATCH v5] netdev-dpdk: add control plane protection support

2022-12-16 Thread Robin Jarry
Some control protocols are used to maintain link status between
forwarding engines (e.g. LACP). When the system is not sized properly,
the PMD threads may not be able to process all incoming traffic from the
configured Rx queues. When a signaling packet of such protocols is
dropped, it can cause link flapping, worsening the situation.

Use the RTE flow API to redirect these protocols into a dedicated Rx
queue. The assumption is made that the ratio between control protocol
traffic and user data traffic is very low and thus this dedicated Rx
queue will never get full. The RSS redirection table is re-programmed to
only use the other Rx queues. The RSS table size is stored in the
netdev_dpdk structure at port initialization to avoid requesting the
information again when changing the port configuration.

The additional Rx queue will be assigned a PMD core like any other Rx
queue. Polling that extra queue may introduce increased latency and
a slight performance penalty at the benefit of preventing link flapping.

This feature must be enabled per port on specific protocols via the
cp-protection option. This option takes a coma-separated list of
protocol names. It is only supported on ethernet ports.

If the user has already configured multiple Rx queues on the port, an
additional one will be allocated for control plane packets. If the
hardware cannot satisfy the requested number of requested Rx queues, the
last Rx queue will be assigned for control plane. If only one Rx queue
is available, the cp-protection feature will be disabled. If the
hardware does not support the RTE flow matchers/actions, the feature
will be disabled.

It cannot be enabled when other_config:hw-offload=true as it may
conflict with the offloaded RTE flows. Similarly, if hw-offload is
enabled while some ports already have cp-protection enabled, RTE flow
offloading will be disabled on these ports.

Example use:

 ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \
   set interface phy0 type=dpdk options:dpdk-devargs=:ca:00.0 -- \
   set interface phy0 options:cp-protection=lacp -- \
   set interface phy1 type=dpdk options:dpdk-devargs=:ca:00.1 -- \
   set interface phy1 options:cp-protection=lacp

As a starting point, only one protocol is supported: LACP. Other
protocols can be added in the future. NIC compatibility should be
checked.

To validate that this works as intended, I used a traffic generator to
generate random traffic slightly above the machine capacity at line rate
on a two ports bond interface. OVS is configured to receive traffic on
two VLANs and pop/push them in a br-int bridge based on tags set on
patch ports.

   +--+
   | DUT  |
   |++|
   ||   br-int   || default flow, action=NORMAL
   ||||
   || patch10patch11 ||
   |+---|---|+|
   ||   | |
   |+---|---|+|
   || patch00patch01 ||
   ||  tag:10tag:20  ||
   ||||
   ||   br-phy   || default flow, action=NORMAL
   ||||
   ||   bond0|| balance-slb, lacp=passive, lacp-time=fast
   ||phy0   phy1 ||
   |+--|-|---+|
   +---|-|+
   | |
   +---|-|+
   | port0  port1 | balance L3/L4, lacp=active, lacp-time=fast
   | lag  | mode trunk VLANs 10, 20
   |  |
   |switch|
   |  |
   |  vlan 10vlan 20  |  mode access
   |   port2  port3   |
   +-|--|-+
 |  |
   +-|--|-+
   |   port0  port1   |  Random traffic that is properly balanced
   |  |  across the bond ports in both directions.
   |  traffic generator   |
   +--+

Without cp-protection, the bond0 links are randomly switching to
"defaulted" when one of the LACP packets sent by the switch is dropped
because the RX queues are full and the PMD threads did not process them
fast enough. When that happens, all traffic must go through a single
link which causes above line rate traffic to be dropped.

When cp-protection is enabled, no LACP packet is dropped and the bond
links remain enabled at all times, maximizing the throughput.

This feature may be considered as "QoS". However, it does not work by
limiting the rate of traffic explicitly. It only guarantees that some
protocols have a lower chance of being dropped because the PMD cores
cannot keep up with regular traffic.

The choice of protocols is limited on purpose. This is not meant to be
configurable by users. Some limited configurability could be considered
in the future but it would expose to more potential issues if users are
accidentally redirecting all traffic in the control plane queue.

Cc: Christophe Fontaine 
Cc: Kevin Traynor 
Cc: David Marchand 
Signed-off-by: Robin Jarry 
---
v4 -> v5:

* Added NEWS entry
* Updated dpdk documentation link 

Re: [ovs-dev] [PATCH ovn] Add IPv6 support for lb health-check

2022-12-16 Thread Lorenzo Bianconi
> On Tue, Nov 22, 2022 at 12:54 PM Lorenzo Bianconi
>  wrote:
> >
> > Add Similar to IPv4 counterpart, introduce IPv6 load-balancer health
> > check support.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2136094
> > Signed-off-by: Lorenzo Bianconi 
> 
> Hi Lorenzo,
> 
> Thanks for adding this missing feature.  Please see a few comments below.

Hi Numan,

thx for the review :)

> 
> 
> > ---
> >  controller/pinctrl.c| 213 -
> >  northd/northd.c |  77 ++
> >  northd/ovn-northd.8.xml |  17 +++
> >  ovn-nb.xml  |   8 +-
> >  tests/ovn.at| 201 ++-
> >  tests/system-ovn.at | 230 +++-
> >  6 files changed, 647 insertions(+), 99 deletions(-)
> >
> > diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> > index f44775c7e..89c2f207b 100644
> > --- a/controller/pinctrl.c
> > +++ b/controller/pinctrl.c
> > @@ -6676,7 +6676,7 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >  ovs_be32 ip4;
> >  if (ip_parse(sb_svc_mon->ip, )) {
> >  ip_addr = in6_addr_mapped_ipv4(ip4);
> > -} else {
> > +} else if (!ipv6_parse(sb_svc_mon->ip, _addr)) {
> >  continue;
> >  }
> >
> > @@ -6689,16 +6689,27 @@ sync_svc_monitors(struct ovsdb_idl_txn 
> > *ovnsb_idl_txn,
> >  continue;
> >  }
> >
> > -for (size_t j = 0; j < laddrs.n_ipv4_addrs; j++) {
> > -if (ip4 == laddrs.ipv4_addrs[j].addr) {
> > -ea = laddrs.ea;
> > -mac_found = true;
> > -break;
> > +if (IN6_IS_ADDR_V4MAPPED(_addr)) {
> > +for (size_t j = 0; j < laddrs.n_ipv4_addrs; j++) {
> > +if (ip4 == laddrs.ipv4_addrs[j].addr) {
> > +ea = laddrs.ea;
> > +mac_found = true;
> > +break;
> > +}
> > +}
> > +} else {
> > +for (size_t j = 0; j < laddrs.n_ipv6_addrs; j++) {
> > +if (!memcmp(_addr, _addrs[j].addr,
> > +sizeof(ovs_be32[4]))) {
> 
> It's a bit odd to use size of ovs_be32 here since we are memcmping
> 'struct in6_addr'.
> 
> I'd suggest using the macro - IN6_ARE_ADDR_EQUAL here.

ack, I will fix it in v2.

> 
> 
> > +ea = laddrs.ea;
> > +mac_found = true;
> > +break;
> > +}
> >  }
> >  }
> >
> > -if (!mac_found && !laddrs.n_ipv4_addrs) {
> > -/* IPv4 address(es) are not configured. Use the first mac. 
> > */
> > +if (!mac_found && !laddrs.n_ipv4_addrs && 
> > !laddrs.n_ipv6_addrs) {
> > +/* IP address(es) are not configured. Use the first mac. */
> >  ea = laddrs.ea;
> >  mac_found = true;
> >  }
> > @@ -6732,7 +6743,7 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >  svc_mon->port_key = port_key;
> >  svc_mon->proto_port = sb_svc_mon->port;
> >  svc_mon->ip = ip_addr;
> > -svc_mon->is_ip6 = false;
> > +svc_mon->is_ip6 = !IN6_IS_ADDR_V4MAPPED(_addr);
> >  svc_mon->state = SVC_MON_S_INIT;
> >  svc_mon->status = SVC_MON_ST_UNKNOWN;
> >  svc_mon->protocol = protocol;
> > @@ -7500,26 +7511,30 @@ svc_monitor_send_tcp_health_check__(struct rconn 
> > *swconn,
> >  ovs_be32 tcp_ack,
> >  ovs_be16 tcp_src)
> >  {
> > -if (svc_mon->is_ip6) {
> > -return;
> > -}
> > -
> >  /* Compose a TCP-SYN packet. */
> >  uint64_t packet_stub[128 / 8];
> >  struct dp_packet packet;
> > +dp_packet_use_stub(, packet_stub, sizeof packet_stub);
> >
> >  struct eth_addr eth_src;
> >  eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, _src);
> > -ovs_be32 ip4_src;
> > -ip_parse(svc_mon->sb_svc_mon->src_ip, _src);
> > -
> > -dp_packet_use_stub(, packet_stub, sizeof packet_stub);
> > -pinctrl_compose_ipv4(, eth_src, svc_mon->ea,
> > - ip4_src, in6_addr_get_mapped_ipv4(_mon->ip),
> > - IPPROTO_TCP, 63, TCP_HEADER_LEN);
> > +if (svc_mon->is_ip6) {
> > +struct in6_addr ip6_src;
> > +ipv6_parse(svc_mon->sb_svc_mon->src_ip, _src);
> > +pinctrl_compose_ipv6(, eth_src, svc_mon->ea,
> > + _src, _mon->ip, IPPROTO_TCP,
> > + 63, TCP_HEADER_LEN);
> > +} else {
> > +ovs_be32 ip4_src;
> > +ip_parse(svc_mon->sb_svc_mon->src_ip, _src);
> > +pinctrl_compose_ipv4(, eth_src, svc_mon->ea,
> > + ip4_src, 

Re: [ovs-dev] [PATCH 5/6] netdev-dpdk-offload: Add support for meter action

2022-12-16 Thread 0-day Robot
References:  <20221216155054.986464-6-simon.hor...@corigine.com>
 

Bleep bloop.  Greetings Simon Horman, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Jin Liu 
Lines checked: 48, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 4/6] netdev-dpdk: add meter algorithms

2022-12-16 Thread 0-day Robot
References:  <20221216155054.986464-5-simon.hor...@corigine.com>
 

Bleep bloop.  Greetings Simon Horman, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Jin Liu 
Lines checked: 97, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/6] netdev-offload-dpdk: Implement meter offload API for DPDK

2022-12-16 Thread 0-day Robot
References:  <20221216155054.986464-4-simon.hor...@corigine.com>
 

Bleep bloop.  Greetings Simon Horman, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Jin Liu 
Lines checked: 405, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/6] dpif-netdev: Offloading meter with DPDK

2022-12-16 Thread 0-day Robot
References:  <20221216155054.986464-3-simon.hor...@corigine.com>
 

Bleep bloop.  Greetings Simon Horman, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Jin Liu 
Lines checked: 153, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/6] netdev-offload: Add DPDK meter offload API

2022-12-16 Thread 0-day Robot
References:  <20221216155054.986464-2-simon.hor...@corigine.com>
 

Bleep bloop.  Greetings Simon Horman, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Jin Liu 
Lines checked: 186, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/6] dpif-netdev: Offloading meter with DPDK

2022-12-16 Thread Simon Horman
From: Peng Zhang 

OVS-DPDK meters are created in advance and OpenFlow rules refer to them by
their unique ID. A new API is used to offload them. By calling the API,
meters are created and try to be offload to all ports in the bridge.

Signed-off-by: Peng Zhang 
Signed-off-by: Jin Liu 
---
 lib/dpif-netdev.c | 102 ++
 1 file changed, 102 insertions(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 2c08a71c8db2..e1cc20d399a9 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7206,6 +7206,96 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct 
dp_packet_batch *packets_,
 ovs_mutex_unlock(>lock);
 }
 
+static void
+dpif_netdev_offload_meter_set(struct dp_netdev *dp,
+  ofproto_meter_id meter_id,
+  struct ofputil_meter_config *config)
+{
+struct dp_netdev_port *port;
+struct netdev *dev;
+
+ovs_mutex_lock(_netdev_mutex);
+
+/* Add the meter with all ports in the datapath */
+HMAP_FOR_EACH (port, node, >ports) {
+dev = port->netdev;
+if (netdev_is_pmd(dev)) {
+dpdk_meter_offload_set(dev, meter_id, config);
+}
+}
+
+ovs_mutex_unlock(_netdev_mutex);
+}
+
+static void
+dpif_netdev_offload_meter_del(struct dp_netdev *dp,
+  ofproto_meter_id meter_id_,
+  struct ofputil_meter_stats *stats)
+{
+uint32_t meter_id = meter_id_.uint32;
+const struct dp_meter *meter;
+struct dp_netdev_port *port;
+struct netdev *dev;
+
+meter = dp_meter_lookup(>meters, meter_id);
+if (!meter) {
+return;
+}
+
+ovs_mutex_lock(_netdev_mutex);
+
+HMAP_FOR_EACH (port, node, >ports) {
+dev = port->netdev;
+if (netdev_is_pmd(dev)) {
+dpdk_meter_offload_del(dev, meter_id_, stats);
+}
+}
+
+ovs_mutex_unlock(_netdev_mutex);
+}
+
+static void
+dpif_netdev_offload_meter_get(struct dp_netdev *dp,
+  ofproto_meter_id meter_id_,
+  struct ofputil_meter_stats *stats)
+{
+struct ofputil_meter_stats offload_stats;
+uint32_t meter_id = meter_id_.uint32;
+const struct dp_meter *meter;
+struct dp_netdev_port *port;
+struct netdev *dev;
+
+meter = dp_meter_lookup(>meters, meter_id);
+if (!meter) {
+return;
+}
+
+ovs_mutex_lock(_netdev_mutex);
+
+HMAP_FOR_EACH (port, node, >ports) {
+memset(_stats, 0, sizeof(struct ofputil_meter_stats));
+dev = port->netdev;
+if (netdev_is_pmd(dev)) {
+dpdk_meter_offload_get(dev, meter_id_, _stats);
+if (!offload_stats.byte_in_count &&
+   !offload_stats.packet_in_count) {
+continue;
+}
+ovs_mutex_lock(>lock);
+
+stats->byte_in_count += offload_stats.byte_in_count;
+stats->packet_in_count += offload_stats.packet_in_count;
+/* nit: Meter offload currently only supports one band */
+if (meter->n_bands) {
+stats->bands[0].packet_count = stats->packet_in_count;
+stats->bands[0].byte_count = stats->byte_in_count;
+}
+ovs_mutex_unlock(>lock);
+}
+}
+ovs_mutex_unlock(_netdev_mutex);
+}
+
 /* Meter set/get/del processing is still single-threaded. */
 static int
 dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id meter_id,
@@ -7277,6 +7367,10 @@ dpif_netdev_meter_set(struct dpif *dpif, 
ofproto_meter_id meter_id,
 
 ovs_mutex_unlock(>meters_lock);
 
+if (netdev_is_flow_api_enabled()) {
+dpif_netdev_offload_meter_set(dp, meter_id, config);
+}
+
 return 0;
 }
 
@@ -7315,6 +7409,10 @@ dpif_netdev_meter_get(const struct dpif *dpif,
 stats->n_bands = i;
 }
 
+if (netdev_is_flow_api_enabled()) {
+dpif_netdev_offload_meter_get(dp, meter_id_, stats);
+}
+
 return 0;
 }
 
@@ -7330,6 +7428,10 @@ dpif_netdev_meter_del(struct dpif *dpif,
 if (!error) {
 uint32_t meter_id = meter_id_.uint32;
 
+if (netdev_is_flow_api_enabled()) {
+dpif_netdev_offload_meter_del(dp, meter_id_, stats);
+}
+
 ovs_mutex_lock(>meters_lock);
 dp_meter_detach_free(>meters, meter_id);
 ovs_mutex_unlock(>meters_lock);
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 5/6] netdev-dpdk-offload: Add support for meter action

2022-12-16 Thread Simon Horman
From: Peng Zhang 

Add support of DPDK meter action logic.

Signed-off-by: Peng Zhang 
Signed-off-by: Jin Liu 
---
 lib/netdev-offload-dpdk.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 26c21f270a47..aeb5af05a8d9 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -2114,6 +2114,16 @@ parse_clone_actions(struct netdev *netdev,
 return 0;
 }
 
+static void OVS_UNUSED
+parse_meter_action(struct flow_actions *actions, uint32_t meter_id)
+{
+struct rte_flow_action_meter *rte_meter;
+
+rte_meter = xzalloc(sizeof *rte_meter);
+rte_meter->mtr_id = meter_id;
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_METER, rte_meter);
+}
+
 static void
 add_jump_action(struct flow_actions *actions, uint32_t group)
 {
@@ -2220,6 +2230,9 @@ parse_flow_actions(struct netdev *netdev,
 if (add_tnl_pop_action(netdev, actions, nla)) {
 return -1;
 }
+}  else if (nl_attr_type(nla) == OVS_ACTION_ATTR_METER) {
+uint32_t meter_id =  nl_attr_get_u32(nla);
+parse_meter_action(actions, meter_id);
 #endif
 } else {
 VLOG_DBG_RL(, "Unsupported action type %d", nl_attr_type(nla));
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 4/6] netdev-dpdk: add meter algorithms

2022-12-16 Thread Simon Horman
From: Peng Zhang 

Add the meter algorithms. DPDK meter support three algorithms,
and OVS also can support these algorithms.

Signed-off-by: Peng Zhang 
Signed-off-by: Jin Liu 
---
 lib/netdev-dpdk.c | 58 +--
 1 file changed, 51 insertions(+), 7 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 180f3fb4b299..4e96c678aff7 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -5275,6 +5275,50 @@ netdev_dpdk_rte_flow_query_count(struct netdev *netdev,
 return ret;
 }
 
+/* RTE_MTR_TRTCM_RFC2697 meter profile */
+static void
+netdev_dpdk_meter_profile_rfc2697_create(struct rte_mtr_meter_profile *profile,
+ const uint64_t rate,
+ const uint64_t burst,
+ const int flag)
+{
+   profile->alg = RTE_MTR_SRTCM_RFC2697;
+   profile->packet_mode = flag;
+   profile->srtcm_rfc2697.cir = rate;
+   profile->srtcm_rfc2697.cbs = burst;
+   profile->srtcm_rfc2697.ebs = burst;
+}
+
+/* RTE_MTR_TRTCM_RFC2698 meter profile */
+static void
+netdev_dpdk_meter_profile_rfc2698_create(struct rte_mtr_meter_profile *profile,
+ const uint64_t rate,
+ const uint64_t burst,
+ const int flag)
+{
+   profile->alg = RTE_MTR_TRTCM_RFC2698;
+   profile->packet_mode = flag;
+   profile->trtcm_rfc2698.cir = rate;
+   profile->trtcm_rfc2698.cbs = burst;
+   profile->trtcm_rfc2698.pir = rate;
+   profile->trtcm_rfc2698.pbs = burst;
+}
+
+/* RTE_MTR_TRTCM_RFC2698 meter profile */
+static void
+netdev_dpdk_meter_profile_rfc4115_create(struct rte_mtr_meter_profile *profile,
+ const uint64_t rate,
+ const uint64_t burst,
+ const int flag)
+{
+   profile->alg = RTE_MTR_TRTCM_RFC4115;
+   profile->packet_mode = flag;
+   profile->trtcm_rfc4115.cir = rate;
+   profile->trtcm_rfc4115.cbs = burst;
+   profile->trtcm_rfc4115.eir = rate;
+   profile->trtcm_rfc4115.ebs = burst;
+}
+
 static int OVS_UNUSED
 netdev_dpdk_meter_profile_create(struct rte_mtr_meter_profile *profile,
  struct rte_mtr_capabilities *cap,
@@ -5282,16 +5326,16 @@ netdev_dpdk_meter_profile_create(struct 
rte_mtr_meter_profile *profile,
  const uint64_t burst,
  const int flag)
 {
-if (!cap->meter_srtcm_rfc2697_n_max) {
+if (cap->meter_srtcm_rfc2697_n_max) {
+netdev_dpdk_meter_profile_rfc2697_create(profile, rate, burst, flag);
+} else if (cap->meter_trtcm_rfc2698_n_max) {
+netdev_dpdk_meter_profile_rfc2698_create(profile, rate, burst, flag);
+} else if (cap->meter_trtcm_rfc4115_n_max) {
+netdev_dpdk_meter_profile_rfc4115_create(profile, rate, burst, flag);
+} else {
 return EOPNOTSUPP;
 }
 
-profile->alg = RTE_MTR_SRTCM_RFC2697;
-profile->packet_mode = flag;
-profile->srtcm_rfc2697.cir = rate;
-profile->srtcm_rfc2697.cbs = burst;
-profile->srtcm_rfc2697.ebs = burst;
-
 return 0;
 }
 
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 6/6] ci: add the opts about ALLOW_EXPERIMENTAL_API

2022-12-16 Thread Simon Horman
From: Peng Zhang 

This commit adds support for OVS-DPDK with -DALLOW_EXPERIMENTAL_API.

Tunnel offloads and Meter offloads are experimental APIs in DPDK. To
enable these features, compile need add -DALLOW_EXPERIMENTAL_API. So
in workflow, we also need need the new test with
-DALLOW_EXPERIMENTAL_API.

Signed-off-by: Peng Zhang 
---
 .ci/linux-build.sh   |  4 
 .github/workflows/build-and-test.yml | 31 
 2 files changed, 35 insertions(+)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 485109672388..1dc6414a162d 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -227,6 +227,10 @@ assert ovs.json.from_string('{\"a\": 42}') == {'a': 42}"
 exit 0
 fi
 
+if [ "$CFG_OPTS" ]; then
+CFLAGS_FOR_OVS="${CFLAGS_FOR_OVS} ${CFG_OPTS}"
+fi
+
 if [ "$KERNEL" ]; then
 install_kernel $KERNEL
 fi
diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index e08d7b1bac1e..d42b61acfb7e 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -20,6 +20,7 @@ jobs:
   M32: ${{ matrix.m32 }}
   OPTS:${{ matrix.opts }}
   TESTSUITE:   ${{ matrix.testsuite }}
+  CFG_OPTS:${{ matrix.cfg_opts }}
 
 name: linux ${{ join(matrix.*, ' ') }}
 runs-on: ubuntu-20.04
@@ -53,9 +54,17 @@ jobs:
   - compiler: gcc
 testsuite:test
 dpdk: dpdk
+  - compiler: gcc
+testsuite:test
+dpdk: dpdk
+cfg_opts: -DALLOW_EXPERIMENTAL_API
+  - compiler: clang
+testsuite:test
+dpdk: dpdk
   - compiler: clang
 testsuite:test
 dpdk: dpdk
+cfg_opts: -DALLOW_EXPERIMENTAL_API
 
   - compiler: gcc
 testsuite:test
@@ -74,21 +83,43 @@ jobs:
   - compiler: gcc
 dpdk: dpdk
 opts: --enable-shared
+  - compiler: gcc
+dpdk: dpdk
+opts: --enable-shared
+cfg_opts: -DALLOW_EXPERIMENTAL_API
+  - compiler: clang
+dpdk: dpdk
+opts: --enable-shared
   - compiler: clang
 dpdk: dpdk
 opts: --enable-shared
+cfg_opts: -DALLOW_EXPERIMENTAL_API
 
   - compiler: gcc
 dpdk_shared:  dpdk-shared
+  - compiler: gcc
+dpdk_shared:  dpdk-shared
+cfg_opts: -DALLOW_EXPERIMENTAL_API
   - compiler: clang
 dpdk_shared:  dpdk-shared
+  - compiler: clang
+dpdk_shared:  dpdk-shared
+cfg_opts: -DALLOW_EXPERIMENTAL_API
 
   - compiler: gcc
 dpdk_shared:  dpdk-shared
 opts: --enable-shared
+  - compiler: gcc
+dpdk_shared:  dpdk-shared
+opts: --enable-shared
+cfg_opts: -DALLOW_EXPERIMENTAL_API
+  - compiler: clang
+dpdk_shared:  dpdk-shared
+opts: --enable-shared
   - compiler: clang
 dpdk_shared:  dpdk-shared
 opts: --enable-shared
+cfg_opts: -DALLOW_EXPERIMENTAL_API
 
   - compiler: gcc
 m32:  m32
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/6] netdev-offload: Add DPDK meter offload API

2022-12-16 Thread Simon Horman
From: Peng Zhang 

Add API to offload DPDK meter to HW, and the corresponding functions to call
the DPDK meter callbacks from all the registered flow API providers.
The interfaces are like those related to DPDK meter in dpif_class, in order
to pass necessary info to HW.

Signed-off-by: Peng Zhang 
Signed-off-by: Jin Liu 
---
 Documentation/howto/dpdk.rst  |  5 +--
 lib/netdev-offload-provider.h | 30 ++
 lib/netdev-offload.c  | 59 +++
 lib/netdev-offload.h  |  9 ++
 4 files changed, 101 insertions(+), 2 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 04609b20bd21..02fc568770ee 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -401,10 +401,11 @@ Supported actions for hardware offload are:
 - Modification of IPv6 (set_field:->ipv6_src/ipv6_dst/mod_nw_ttl).
 - Clone/output (tnl_push and output) for encapsulating over a tunnel.
 - Tunnel pop, for packets received on physical ports.
+- Meter.
 
 .. note::
-  Tunnel offloads are experimental APIs in DPDK. In order to enable it,
-  compile with -DALLOW_EXPERIMENTAL_API.
+  Tunnel offloads and Meter offloads are experimental APIs in DPDK. To enable
+  these features, compile with -DALLOW_EXPERIMENTAL_API.
 
 Multiprocess
 
diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h
index 9108856d18d1..7ecbb8d026f1 100644
--- a/lib/netdev-offload-provider.h
+++ b/lib/netdev-offload-provider.h
@@ -102,6 +102,16 @@ struct netdev_flow_api {
 int (*meter_set)(ofproto_meter_id meter_id,
  struct ofputil_meter_config *config);
 
+/* Offloads or modifies the offloaded meter on the netdev with the given
+ * 'meter_id' and the configuration in 'config'. On failure, a non-zero
+ * error code is returned.
+ *
+ * The meter id specified through 'config->meter_id' is converted as an
+ * internal meter id. */
+int (*dpdk_meter_set)(struct netdev *,
+  ofproto_meter_id meter_id,
+  struct ofputil_meter_config *);
+
 /* Queries HW for meter stats with the given 'meter_id'. Store the stats
  * of dropped packets to band 0. On failure, a non-zero error code is
  * returned.
@@ -113,6 +123,18 @@ struct netdev_flow_api {
 int (*meter_get)(ofproto_meter_id meter_id,
  struct ofputil_meter_stats *stats);
 
+/* Queries netdev for meter stats with the given 'meter_id'. Store the
+ * stats of dropped packets to band 0. On failure, a non-zero error code
+ * is returned.
+ *
+ * Note that the 'stats' structure is already initialized, and only the
+ * available statistics should be incremented, not replaced. Those fields
+ * are packet_in_count, byte_in_count and band[]->byte_count and
+ * band[]->packet_count. */
+int (*dpdk_meter_get)(struct netdev *,
+  ofproto_meter_id meter_id,
+  struct ofputil_meter_stats *);
+
 /* Removes meter 'meter_id' from HW. Store the stats of dropped packets to
  * band 0. On failure, a non-zero error code is returned.
  *
@@ -121,6 +143,14 @@ struct netdev_flow_api {
 int (*meter_del)(ofproto_meter_id meter_id,
  struct ofputil_meter_stats *stats);
 
+/* Removes meter 'meter_id' from netdev. Store the stats of dropped packets
+ * to band 0. On failure, a non-zero error code is returned.
+ *
+ * If del success, 'stats' will be set zero. */
+int (*dpdk_meter_del)(struct netdev *,
+  ofproto_meter_id meter_id,
+  struct ofputil_meter_stats *stats);
+
 /* Initializies the netdev flow api.
  * Return 0 if successful, otherwise returns a positive errno value. */
 int (*init_flow_api)(struct netdev *);
diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index 4592262bd34e..6aedf2e5c5e9 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -256,6 +256,65 @@ meter_offload_del(ofproto_meter_id meter_id, struct 
ofputil_meter_stats *stats)
 return 0;
 }
 
+void
+dpdk_meter_offload_set(struct netdev *dev,
+   ofproto_meter_id meter_id,
+   struct ofputil_meter_config *config)
+{
+struct netdev_registered_flow_api *rfa;
+
+/* Offload APIs could fail, for example, because the offload is not
+ * supported. This is fine, as the offload API should take care of this. */
+CMAP_FOR_EACH (rfa, cmap_node, _flow_apis) {
+if (rfa->flow_api->dpdk_meter_set) {
+int ret = rfa->flow_api->dpdk_meter_set(dev, meter_id, config);
+if (ret) {
+VLOG_DBG_RL(, "Failed setting meter %u for flow api %s, "
+"error %d", meter_id.uint32, rfa->flow_api->type,
+ret);
+   }
+}
+}
+}
+
+void

[ovs-dev] [PATCH 3/6] netdev-offload-dpdk: Implement meter offload API for DPDK

2022-12-16 Thread Simon Horman
From: Peng Zhang 

For dpif-netdev, meters are mapped by DPDK meter with one-to-one
relationship. Implement meter offload API to set/get/del
the DPDK meter.

Signed-off-by: Peng Zhang 
Signed-off-by: Jin Liu 
---
 lib/netdev-dpdk.c | 199 ++
 lib/netdev-dpdk.h |  41 
 lib/netdev-offload-dpdk.c |  88 +
 3 files changed, 328 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fff57f78279a..180f3fb4b299 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -5274,8 +5275,206 @@ netdev_dpdk_rte_flow_query_count(struct netdev *netdev,
 return ret;
 }
 
+static int OVS_UNUSED
+netdev_dpdk_meter_profile_create(struct rte_mtr_meter_profile *profile,
+ struct rte_mtr_capabilities *cap,
+ const uint64_t rate,
+ const uint64_t burst,
+ const int flag)
+{
+if (!cap->meter_srtcm_rfc2697_n_max) {
+return EOPNOTSUPP;
+}
+
+profile->alg = RTE_MTR_SRTCM_RFC2697;
+profile->packet_mode = flag;
+profile->srtcm_rfc2697.cir = rate;
+profile->srtcm_rfc2697.cbs = burst;
+profile->srtcm_rfc2697.ebs = burst;
+
+return 0;
+}
+
 #ifdef ALLOW_EXPERIMENTAL_API
 
+static int
+netdev_dpdk_rte_mtr_meter_add(struct rte_mtr_meter_profile *profile,
+  struct netdev *netdev,
+  uint32_t meter_id,
+  const uint32_t rate,
+  const uint32_t burst,
+  const int flag,
+  struct rte_mtr_error *error)
+{
+struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+uint32_t meter_profile_id = meter_id;
+uint32_t meter_policy_id = meter_id;
+struct rte_mtr_capabilities cap;
+struct rte_mtr_stats mtr_stats;
+struct rte_mtr_params params;
+uint64_t stats_mask = 0;
+int clear = 0;
+int mod;
+int ret;
+
+memset(_stats, 0, sizeof(struct rte_mtr_stats));
+memset(, 0, sizeof(cap));
+
+ovs_mutex_lock(>mutex);
+
+ret = rte_mtr_capabilities_get(dev->port_id, , error);
+if (ret) {
+goto out;
+}
+
+ret = netdev_dpdk_meter_profile_create(profile, , rate, burst, flag);
+if (ret) {
+goto out;
+}
+
+
+/* If can get the meter stats, the meter is offload in the HW.
+ * So the operate is mod, just update the meter_profile.
+ *
+ * If can't get the meter stats, the meter is not offload in the HW.
+ * So the operate is add, need create the profile, policy, mtr. */
+mod = rte_mtr_stats_read(dev->port_id, meter_id, _stats, _mask,
+ clear, error);
+ret = rte_mtr_meter_profile_add(dev->port_id, meter_profile_id, profile,
+error);
+if (!mod || ret) {
+goto out;
+}
+
+rte_mtr_policy_drop_red(policy);
+ret = rte_mtr_meter_policy_add(dev->port_id, meter_policy_id, ,
+   error);
+
+if (ret) {
+goto out;
+}
+
+memset(, 0 , sizeof(struct rte_mtr_params));
+params.meter_profile_id = meter_profile_id;
+params.meter_policy_id = meter_policy_id;
+params.stats_mask = cap.stats_mask;
+params.meter_enable = 1;
+
+ret = rte_mtr_create(dev->port_id, meter_id, , 1, error);
+out:
+ovs_mutex_unlock(>mutex);
+return ret;
+}
+
+int
+netdev_dpdk_meter_create(struct netdev *netdev,
+ const uint32_t meter_profile_id,
+ const uint64_t rate,
+ const uint64_t burst,
+ const int flag)
+{
+struct rte_mtr_meter_profile profile;
+struct rte_mtr_error error;
+int ret;
+
+memset(, 0 , sizeof(struct rte_mtr_meter_profile));
+memset(, 0 , sizeof(struct rte_mtr_error));
+
+ret = netdev_dpdk_rte_mtr_meter_add(, netdev, meter_profile_id,
+rate, burst, flag, );
+if (!ret) {
+if (!VLOG_DROP_DBG()) {
+VLOG_DBG("%s: rte_meter_id %d  port_id %d mtr create ",
+ netdev_get_name(netdev), meter_profile_id,
+ netdev_dpdk_get_port_id(netdev));
+}
+} else {
+VLOG_DBG("%s: rte_mtr creation failed: %d (%s).",
+ netdev_get_name(netdev), error.type, error.message);
+}
+return ret;
+}
+
+int
+netdev_dpdk_meter_del(struct netdev *netdev,
+  const uint32_t meter_id,
+  const uint32_t meter_profile_id,
+  const uint32_t meter_policy_id)
+{
+struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+struct rte_mtr_stats mtr_stats;
+struct rte_mtr_error error;
+uint64_t stats_mask = 0;
+int clear = 

[ovs-dev] [PATCH 0/6] Add support for DPDK meter HW offload

2022-12-16 Thread Simon Horman
Hi,

this series adds support for DPDK meter HW offload

* Patch 1/6: Add netdev provider API for HW offload of DPDK meters
* Patch 2/6: Add DPIF API to offload OpenFlow meters to DPDK
* Patch 3/6: Implement netdev provider API for HW offload of DPDK meters
* Patch 4/6: Add more DPDK meter algorithms
* Patch 4/6: Add support for meter action ti DPDK HW offload
* Patch 4/6: Add CI builds with ALLOW_EXPERIMENTAL_API

Peng Zhang (6):
  netdev-offload: Add DPDK meter offload API
  dpif-netdev: Offloading meter with DPDK
  netdev-offload-dpdk: Implement meter offload API for DPDK
  netdev-dpdk: add meter algorithms
  netdev-dpdk-offload: Add support for meter action
  ci: add the opts about ALLOW_EXPERIMENTAL_API

 .ci/linux-build.sh   |   4 +
 .github/workflows/build-and-test.yml |  31 
 Documentation/howto/dpdk.rst |   5 +-
 lib/dpif-netdev.c| 102 +++
 lib/netdev-dpdk.c| 243 +++
 lib/netdev-dpdk.h|  41 +
 lib/netdev-offload-dpdk.c| 101 +++
 lib/netdev-offload-provider.h|  30 
 lib/netdev-offload.c |  59 +++
 lib/netdev-offload.h |   9 +
 10 files changed, 623 insertions(+), 2 deletions(-)

-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] rhel: avoid creating an empty database file

2022-12-16 Thread Timothy Redaelli
In 59e8cb8a053d ("rhel: Move conf.db to /var/lib/openvswitch, using symlinks.")
conf.db is created as empty file in /var/lib/openvswitch, if it doesn't
exists, but this prevent ovsdb-server to start.

This commit changes the previous behaviour to set
/var/lib/openvswitch owner to openvswitch:hugetlbfs, if built with
dpdk, or openvswitch:openvswitch.

Fixes: 59e8cb8a053d ("rhel: Move conf.db to /var/lib/openvswitch, using 
symlinks.")
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2022-December/400045.html
Reported-by: Roi Dayan 
Signed-off-by: Timothy Redaelli 
---
 rhel/openvswitch-fedora.spec.in | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
index 8d692b36c..6c8813793 100644
--- a/rhel/openvswitch-fedora.spec.in
+++ b/rhel/openvswitch-fedora.spec.in
@@ -340,12 +340,6 @@ for base in conf.db .conf.db.~lock~; do
 if test ! -e $old && test ! -h $old; then
 ln -s $new $old
 fi
-touch $new
-%if %{with dpdk}
-chown openvswitch:hugetlbfs $new
-%else
-chown openvswitch:openvswitch $new
-%endif
 done
 
 %if 0%{?systemd_post:1}
@@ -506,7 +500,11 @@ fi
 %{_prefix}/lib/udev/rules.d/91-vfio.rules
 %endif
 %doc NOTICE README.rst NEWS rhel/README.RHEL.rst
-/var/lib/openvswitch
+%if %{with dpdk}
+%attr(750,openvswitch,hugetlbfs) /var/lib/openvswitch
+%else
+%attr(750,openvswitch,openvswitch) /var/lib/openvswitch
+%endif
 %attr(750,root,root) /var/log/openvswitch
 %ghost %attr(755,root,root) %{_rundir}/openvswitch
 %ghost %attr(644,root,root) %{_rundir}/openvswitch.useropts
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovs-dev v7 1/3] ofproto-dpif-upcall: fix push_dp_ops

2022-12-16 Thread Eelco Chaudron


On 16 Dec 2022, at 8:56, Peng He wrote:

> From: Peng He 
> To: Eelco Chaudron 
> Cc: Ilya Maximets , ovs-dev@openvswitch.org
> Subject: Re: [ovs-dev v7 1/3] ofproto-dpif-upcall: fix push_dp_ops
> Date: Fri, 16 Dec 2022 15:56:32 +0800
>
> Eelco Chaudron  于2022年12月13日周二 20:36写道:
>
>>
>>
>> On 10 Dec 2022, at 1:37, Peng He wrote:
>>
>>> Patch v5 has statistics issues.
>>>
>>> In order to solve this issue, we had a discussion.
>>>
>>> below is the quote of the email.
>>>
>>> ”
>>> After a second thought, I think maybe keeping INCONSISTENT just for the
>>> modify error is a better option.
>>>
>>> With current patch:
>>> 1.
>>> the modify error case:
>>> OPERATIONAL -> INCONSISTENT ->  EVICTING -> EVICTED
>>> 2.
>>> the delete error case:
>>> EVICTING -> EVICTED
>>>
>>> Change both to INCONSISTENT:
>>>
>>> the modify error case:
>>> did not change.
>>>
>>> the delete error case:
>>> EVICTING -> INCONSISTENT -> EVICTED?
>>>
>>> “
>>>
>>> And we agree to take the second solution.
>>
>> I know, but going over the state meanings again, UKEY_EVICTING means the
>> following:
>>
>>  /* Ukey is in umap, datapath flow delete is queued. */
>>
>> Which now no longer is the case, so should a new state not make more sense?
>>
>
> Why it's no longer valid?
>
> In the patch, only modify failed ukey will be set to EVICTING, is it just
> right fit the meaning of
> EVICTING? (ukey in the umap, but delete operation is queued?)

But it’s not as the delete operation is not queued, that is done in the 
revalidator_sweep__() part.

>>
>> Any one else has some input on this??
>>
>>> Eelco Chaudron  于2022年12月8日周四 18:54写道:
>>>


 On 27 Nov 2022, at 8:28, Peng He wrote:

> push_dp_ops only handles delete ops errors but ignores the modify
> ops results. It's better to handle all the dp operation errors in
> a consistent way.
>
> We observe in the production environment that sometimes a megaflow
> with wrong actions keep staying in datapath. The coverage command shows
> revalidators have dumped several times, however the correct
> actions are not set. This implies that the ukey's action does not
> equal to the meagaflow's, i.e. revalidators think the underlying
> megaflow's actions are correct however they are not.
>
> We also check the megaflow using the ofproto/trace command, and the
> actions are not matched with the ones in the actual magaflow. By
> performing a revalidator/purge command, the right actions are set.
>
> This patch prevents the inconsistency by considering modify failure
> in revalidators.
>
> To note, we cannot perform two state transitions and change ukey_state
> into UKEY_EVICTED directly here, because, if we do so, the
> sweep will remove the ukey alone and leave dp flow alive. Later, the
> dump will retrieve the dp flow and might even recover it. This will
> contribute the stats of this dp flow twice.
>
> Signed-off-by: Peng He 
> ---
>  ofproto/ofproto-dpif-upcall.c | 34 +++---
>  1 file changed, 23 insertions(+), 11 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-upcall.c
 b/ofproto/ofproto-dpif-upcall.c
> index 7ad728adf..c2cefbeb8 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2416,26 +2416,30 @@ push_dp_ops(struct udpif *udpif, struct ukey_op
 *ops, size_t n_ops)
>
>  for (i = 0; i < n_ops; i++) {
>  struct ukey_op *op = [i];
> -struct dpif_flow_stats *push, *stats, push_buf;
> -
> -stats = op->dop.flow_del.stats;
> -push = _buf;
> -
> -if (op->dop.type != DPIF_OP_FLOW_DEL) {
> -/* Only deleted flows need their stats pushed. */
> -continue;
> -}
>
>  if (op->dop.error) {
> -/* flow_del error, 'stats' is unusable. */
>  if (op->ukey) {
>  ovs_mutex_lock(>ukey->mutex);
> -transition_ukey(op->ukey, UKEY_EVICTED);
> +if (op->dop.type == DPIF_OP_FLOW_DEL) {
> +transition_ukey(op->ukey, UKEY_EVICTED);
> +} else {
>>
>> I think we could use a comment here to make sure why we set it to
>> evicting. Maybe just a reference to the comment in revalidator_sweep__()
>> might be enough.
>>
> +transition_ukey(op->ukey, UKEY_EVICTING);
> +}
>  ovs_mutex_unlock(>ukey->mutex);
>  }
>  continue;
>  }
>
> +if (op->dop.type != DPIF_OP_FLOW_DEL) {
> +/* Only deleted flows need their stats pushed. */
> +continue;
> +}
> +
> +struct dpif_flow_stats *push, *stats, push_buf;
> +
> +stats = op->dop.flow_del.stats;
> +push = 

[ovs-dev] [PATCH ovn v3] northd: Refactor build_lrouter_nat_flows_for_lb function

2022-12-16 Thread Ales Musil
To make it easier to add flows to this stage, refactor the function.
This also has the benefit that we should see fewer allocations due to
rearranging how we create flows and how we manipulate the match string.

Signed-off-by: Ales Musil 
---
 northd/northd.c | 456 ++--
 1 file changed, 207 insertions(+), 249 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 4751feab4..7d1425180 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3857,10 +3857,12 @@ static bool
 build_lb_vip_actions(struct ovn_lb_vip *lb_vip,
  struct ovn_northd_lb_vip *lb_vip_nb,
  struct ds *action, char *selection_fields,
+ struct ds *skip_snat_action, struct ds *force_snat_action,
  bool ls_dp, bool ct_lb_mark)
 {
 const char *ct_lb_action = ct_lb_mark ? "ct_lb_mark" : "ct_lb";
-bool skip_hash_fields = false, reject = false;
+bool reject = !lb_vip->n_backends && lb_vip->empty_backend_rej;
+bool drop = false;
 
 if (lb_vip_nb->lb_health_check) {
 ds_put_format(action, "%s(backends=", ct_lb_action);
@@ -3881,23 +3883,12 @@ build_lb_vip_actions(struct ovn_lb_vip *lb_vip,
 ds_put_format(action, "%s:%"PRIu16",",
   backend->ip_str, backend->port);
 }
+ds_chomp(action, ',');
 
-if (!n_active_backends) {
-if (!lb_vip->empty_backend_rej) {
-ds_clear(action);
-ds_put_cstr(action, debug_drop_action());
-skip_hash_fields = true;
-} else {
-reject = true;
-}
-} else {
-ds_chomp(action, ',');
-ds_put_cstr(action, ");");
-}
-} else if (lb_vip->empty_backend_rej && !lb_vip->n_backends) {
-reject = true;
+drop = !n_active_backends && !lb_vip->empty_backend_rej;
+reject = !n_active_backends && lb_vip->empty_backend_rej;
 } else {
-ds_put_format(action, "%s(backends=%s);", ct_lb_action,
+ds_put_format(action, "%s(backends=%s", ct_lb_action,
   lb_vip_nb->backend_ips);
 }
 
@@ -3907,12 +3898,26 @@ build_lb_vip_actions(struct ovn_lb_vip *lb_vip,
 ds_clear(action);
 ds_put_format(action, "reg0 = 0; reject { outport <-> inport; "
   "next(pipeline=egress,table=%d);};", stage);
-} else if (!skip_hash_fields && selection_fields && selection_fields[0]) {
-ds_chomp(action, ';');
-ds_chomp(action, ')');
-ds_put_format(action, "; hash_fields=\"%s\");", selection_fields);
+} else if (drop) {
+ds_clear(action);
+ds_put_cstr(action, debug_drop_action());
+} else if (selection_fields && selection_fields[0]) {
+ds_put_format(action, "; hash_fields=\"%s\"", selection_fields);
+}
+
+bool is_lb_action = !(reject || drop);
+
+if (!ls_dp) {
+ds_put_format(skip_snat_action, "flags.skip_snat_for_lb = 1; %s%s",
+  ds_cstr(action), is_lb_action ? "; skip_snat);" : "");
+ds_put_format(force_snat_action, "flags.force_snat_for_lb = 1; %s%s",
+  ds_cstr(action), is_lb_action ? "; force_snat);" : "");
 }
-return reject;
+if (is_lb_action) {
+ds_put_cstr(action, ");");
+}
+
+return false;
 }
 
 static void
@@ -7493,8 +7498,8 @@ build_lb_rules(struct hmap *lflows, struct ovn_northd_lb 
*lb, bool ct_lb_mark,
 /* New connections in Ingress table. */
 const char *meter = NULL;
 bool reject = build_lb_vip_actions(lb_vip, lb_vip_nb, action,
-   lb->selection_fields, true,
-   ct_lb_mark);
+   lb->selection_fields, NULL,
+   NULL, true, ct_lb_mark);
 
 ds_put_format(match, "ct.new && %s.dst == %s", ip_match,
   lb_vip->vip_str);
@@ -10431,49 +10436,123 @@ get_force_snat_ip(struct ovn_datapath *od, const 
char *key_type,
 return true;
 }
 
+#define LROUTER_NAT_LB_FLOW_INIT(MATCH, ACTION, PRIO) \
+(struct lrouter_nat_lb_flow)  \
+{ .action = (ACTION), .lflow_ref = NULL,  \
+  .hash = ovn_logical_flow_hash(  \
+  ovn_stage_get_table(S_ROUTER_IN_DNAT),  \
+  ovn_stage_get_pipeline(S_ROUTER_IN_DNAT),   \
+  (PRIO), ds_cstr(MATCH), (ACTION)) }
+
+enum lrouter_nat_lb_flow_type {
+LROUTER_NAT_LB_FLOW_NORMAL = 0,
+LROUTER_NAT_LB_FLOW_SKIP_SNAT,
+LROUTER_NAT_LB_FLOW_FORCE_SNAT,
+LROUTER_NAT_LB_FLOW_MAX,
+};
+
+struct lrouter_nat_lb_flow {
+char *action;
+struct ovn_lflow *lflow_ref;
+
+uint32_t hash;
+};
+
+struct lrouter_nat_lb_flows_ctx {
+struct lrouter_nat_lb_flow new[LROUTER_NAT_LB_FLOW_MAX];
+struct lrouter_nat_lb_flow 

Re: [ovs-dev] [PATCH ovn branch-22.12 2/2] Prepare for 22.12.1.

2022-12-16 Thread Mark Michelson

On 12/15/22 18:38, Dumitru Ceara wrote:

On 12/15/22 22:43, Mark Michelson wrote:

Signed-off-by: Mark Michelson 
---
I more-or-less randomized the time of day since this will be at some
point early tomorrow for me. I plan to regenerate the patch with the
correct time when I actually push the commit.
---
  NEWS | 3 +++
  configure.ac | 2 +-
  debian/changelog | 6 ++
  3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 226c67505..b8a31a496 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+OVN v22.12.1 - xx xxx 
+--
+
  OVN v22.12.0 - 16 Dec 2022
  --
- Add load balancer "affinity_timeout" option to configure load balancing
diff --git a/configure.ac b/configure.ac
index 101467253..357758e0c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
  # limitations under the License.
  
  AC_PREREQ(2.63)

-AC_INIT(ovn, 22.12.0, b...@openvswitch.org)
+AC_INIT(ovn, 22.12.1, b...@openvswitch.org)
  AC_CONFIG_MACRO_DIR([m4])
  AC_CONFIG_AUX_DIR([build-aux])
  AC_CONFIG_HEADERS([config.h])
diff --git a/debian/changelog b/debian/changelog
index cda378c12..54305c26d 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+OVN (22.12.1-1) unstable; urgency=low
+   [ OVN team ]
+   * New upstream version
+
+ -- OVN team   Thu, 16 Dec 2022 09:07:31 -0500


You mean "Fri, 16 Dec", right? :)


Yep, it'll be correct when I regenerate the patches again today.



Acked-by: Dumitru Ceara 

Thanks!


+
  ovn (22.12.0-1) unstable; urgency=low
  
 * New upstream version




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5 2/2] openflow: Add extension to flush CT by generic match

2022-12-16 Thread Paolo Valerio
Ales Musil  writes:

> Add extension that allows to flush connections from CT
> by specifying fields that the connections should be
> matched against. This allows to match only some fields
> of the connection e.g. source address for orig direrction.
>
> Reported-at: https://bugzilla.redhat.com/2120546
> Signed-off-by: Ales Musil 
> ---
> v5: Add missing usage and man for ovs-ofctl command.
> v4: Allow ovs-ofctl flush/conntrack without any zone/tuple.
> v3: Rebase on top of master.
> v2: Rebase on top of master.
> Use suggestion from Ilya.
> ---

Thanks Ales.

LGTM,

Acked-by: Paolo Valerio 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3 0/5] OVN IC multiple same routes fixes

2022-12-16 Thread Vladislav Odintsov
Thanks Dumitru!

Regards,
Vladislav Odintsov

> On 16 Dec 2022, at 15:17, Dumitru Ceara  wrote:
> 
> Hi Vladislav,
> 
> I pushed the series to main branch and branch-22.12.  I also backported
> the first 4 patches all the way down to branch-22.03.
> 
> Thanks a lot for the fixes (and thanks Numan for the review)!
> 
> Regards,
> Dumitru
> 
> On 12/16/22 10:58, Vladislav Odintsov wrote:
>> Hi Dumitru,
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>>> On 16 Dec 2022, at 02:16, Dumitru Ceara  wrote:
>>> 
>>> On 12/15/22 22:30, Vladislav Odintsov wrote:
 Thanks Numan for the review.
 
 It seems that for some reason appeared a memory leak, which constantly 
 reproduces.
>>> 
>>> Do you mean this one?
>>> 
>>> Direct leak of 64 byte(s) in 1 object(s) allocated from:
>>>   #0 0x49a052 in calloc 
>>> (/home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/ic/ovn-ic+0x49a052)
>>>   #1 0x751142 in xcalloc__ /home/runner/work/ovn/ovn/ovs/lib/util.c:121:31
>>>   #2 0x751170 in xzalloc__ /home/runner/work/ovn/ovn/ovs/lib/util.c:131:12
>>>   #3 0x751235 in xzalloc /home/runner/work/ovn/ovn/ovs/lib/util.c:165:12
>>>   #4 0x740353 in ovsdb_idl_txn_add_map_op 
>>> /home/runner/work/ovn/ovn/ovs/lib/ovsdb-idl.c:4160:29
>>>   #5 0x7402bf in ovsdb_idl_txn_write_partial_map 
>>> /home/runner/work/ovn/ovn/ovs/lib/ovsdb-idl.c:4317:5
>>>   #6 0x71c57f in icsbrec_port_binding_update_external_ids_setkey 
>>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/lib/ovn-ic-sb-idl.c:7220:5
>>>   #7 0x4d3696 in update_isb_pb_external_ids 
>>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:573:5
>>>   #8 0x4d2017 in create_isb_pb 
>>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:713:5
>>>   #9 0x4ce6af in port_binding_run 
>>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:800:21
>>>   #10 0x4cbb87 in ovn_db_run 
>>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:1737:5
>>>   #11 0x4cac7a in main 
>>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:2026:17
>>>   #12 0x7fcd3d9e7082 in __libc_start_main 
>>> (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
>>> 
>>> From: https://github.com/ovsrobot/ovn/actions/runs/3706396547
>>> 
>>> If so, I've seen it a few times before in CI and didn't have
>>> time to investigate closely.  I'm quite confident it's not
>>> related to your changes so I think it's fine to push your
>>> series.  Like that it would also make it in time for the
>>> 22.12.0 release.
>> 
>> Yes, I was talking about this trace. So, I walked through it and see that 
>> it’s really not connected with my patchset, and it can be applied.
>> 
>>> 
 I’ll give it a time tomorrow to find the source of a problem.
>>> 
>>> I'll wait at least until Friday morning (CET).
>> 
>> Anyway if I succeed to fix this issue, I’ll send a separate patch for that.
>> 
>>> 
 
 Regards,
 Vladislav Odintsov
>>> 
>>> Regards,
>>> Dumitru
>>> 
 
> On 16 Dec 2022, at 00:21, Numan Siddique  wrote:
> 
> On Thu, Dec 15, 2022 at 12:02 PM Vladislav Odintsov  > wrote:
>> 
>> v2 -> v3:
>> - Split patch #3 by two: first fixes a bug with duplicated route
>>  advertisement and will be considered for back-porting; the second one
>>  changes ovn-ic SB:Route schema and documents ovn-ic upgrade details.
>> - Address Dumitru's comment regarding loggin rate-limit.
>> 
>> v1 -> v2:
>> - Split series by two: OVN IC-related changes and northd OF bucket limits
>> - Squash two patches in one
>> - Fix memory leak in add_to_routes_ad()
>> - Addressed review comments by Dumitru
>> 
>> v1 description here:
>> https://patchwork.ozlabs.org/project/ovn/cover/20221202173147.3032702-1-odiv...@gmail.com/
>> 
>> Vladislav Odintsov (5):
>> ic: remove orphan ovn interconnection routes
>> ic: lookup southbound port_binding only if needed
>> ic: prevent advertising/learning multiple same routes
>> ic: minor code improvements
>> ic-sb schema: add index for routes table & document upgrade path
> 
> For the entire series:
> 
> Acked-by: Numan Siddique mailto:num...@ovn.org>>
> 
> Numan
> 
>> 
>> Documentation/intro/install/ovn-upgrades.rst |  20 +++
>> NEWS |   4 +
>> ic/ovn-ic.c  | 142 +--
>> ovn-ic-sb.ovsschema  |   6 +-
>> tests/ovn-ic.at   | 133 
>> +
>> 5 files changed, 257 insertions(+), 48 deletions(-)
>> 
>> --
>> 2.36.1
>> 
>>> 
>>> ___
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> 
>> 
> 
> ___
> dev mailing list
> d...@openvswitch.org
> 

Re: [ovs-dev] [PATCH v4 2/2] openflow: Add extension to flush CT by generic match

2022-12-16 Thread Ales Musil
On Fri, Dec 16, 2022 at 1:01 PM Paolo Valerio  wrote:

> Ales Musil  writes:
>
> > Add extension that allows to flush connections from CT
> > by specifying fields that the connections should be
> > matched against. This allows to match only some fields
> > of the connection e.g. source address for orig direrction.
> >
> > Reported-at: https://bugzilla.redhat.com/2120546
> > Signed-off-by: Ales Musil 
> > ---
> > v4: Allow ovs-ofctl flush/conntrack without any zone/tuple.
> > v3: Rebase on top of master.
> > v2: Rebase on top of master.
> > Use suggestion from Ilya.
> > ---
> >  NEWS   |   3 +
> >  include/openflow/nicira-ext.h  |  30 +++
> >  include/openvswitch/ofp-msgs.h |   4 +
> >  include/openvswitch/ofp-util.h |   4 +
> >  lib/ofp-bundle.c   |   1 +
> >  lib/ofp-ct-util.c  | 146 +
> >  lib/ofp-ct-util.h  |   9 ++
> >  lib/ofp-print.c|  20 +
> >  lib/ofp-util.c |  25 ++
> >  lib/rconn.c|   1 +
> >  ofproto/ofproto-dpif.c |   8 +-
> >  ofproto/ofproto-provider.h |   7 +-
> >  ofproto/ofproto.c  |  30 ++-
> >  tests/ofp-print.at |  93 +
> >  tests/ovs-ofctl.at |  26 ++
> >  tests/system-traffic.at| 116 ++
> >  utilities/ovs-ofctl.c  |  38 +
> >  17 files changed, 503 insertions(+), 58 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index ff8904b02..46b8faa41 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -16,6 +16,9 @@ Post-v3.0.0
> >   by specifying 'max-rate' or '[r]stp-path-cost' accordingly.
> > - ovs-dpctl and related ovs-appctl commands:
> >   * "flush-conntrack" is capable of handling partial 5-tuple.
> > +   - OpenFlow:
> > +  * New OpenFlow extension NXT_CT_FLUSH to flush connections
> matching
> > +the specified fields.
> >
>
> I guess we miss an entry for ovs-ofctl flush-conntrack
>
> >
> >  v3.0.0 - 15 Aug 2022
> > diff --git a/include/openflow/nicira-ext.h
> b/include/openflow/nicira-ext.h
> > index b68804991..32ce56d31 100644
> > --- a/include/openflow/nicira-ext.h
> > +++ b/include/openflow/nicira-ext.h
> > @@ -1064,4 +1064,34 @@ struct nx_zone_id {
> >  };
> >  OFP_ASSERT(sizeof(struct nx_zone_id) == 8);
> >
> > +/* CT flush available TLVs. */
> > +enum nx_ct_flush_tlv_type {
> > +/* Outer types. */
> > +NXT_CT_ORIG_DIRECTION,/* CT orig direction outer type. */
> > +NXT_CT_REPLY_DIRECTION,   /* CT reply direction outer type. */
> > +
> > +/* Nested types. */
> > +NXT_CT_SRC,   /* CT source IPv6 or mapped IPv4 address.
> */
> > +NXT_CT_DST,   /* CT destination IPv6 or mapped IPv4
> address. */
> > +NXT_CT_SRC_PORT,  /* CT source port. */
> > +NXT_CT_DST_PORT,  /* CT destination port. */
> > +NXT_CT_ICMP_ID,   /* CT ICMP id. */
> > +NXT_CT_ICMP_TYPE, /* CT ICMP type. */
> > +NXT_CT_ICMP_CODE, /* CT ICMP code. */
> > +
> > +/* Primitive types. */
> > +NXT_CT_ZONE_ID,   /* CT zone id. */
> > +};
> > +
> > +/* NXT_CT_FLUSH.
> > + *
> > + * Flushes the connection tracking specified by 5-tuple.
> > + * The struct should be followed by TLVs specifying the matching
> parameters. */
> > +struct nx_ct_flush {
> > +uint8_t ip_proto;  /* IP protocol. */
> > +uint8_t family;/* L3 address family. */
> > +uint8_t zero[6];   /* Must be zero. */
> > +};
> > +OFP_ASSERT(sizeof(struct nx_ct_flush) == 8);
> > +
> >  #endif /* openflow/nicira-ext.h */
> > diff --git a/include/openvswitch/ofp-msgs.h
> b/include/openvswitch/ofp-msgs.h
> > index 921a937e5..659b0a3e7 100644
> > --- a/include/openvswitch/ofp-msgs.h
> > +++ b/include/openvswitch/ofp-msgs.h
> > @@ -526,6 +526,9 @@ enum ofpraw {
> >
> >  /* NXST 1.0+ (4): struct nx_ipfix_stats_reply[]. */
> >  OFPRAW_NXST_IPFIX_FLOW_REPLY,
> > +
> > +/* NXT 1.0+ (32): struct nx_ct_flush, uint8_t[8][]. */
> > +OFPRAW_NXT_CT_FLUSH,
> >  };
> >
> >  /* Decoding messages into OFPRAW_* values. */
> > @@ -772,6 +775,7 @@ enum ofptype {
> >  OFPTYPE_IPFIX_FLOW_STATS_REQUEST, /* OFPRAW_NXST_IPFIX_FLOW_REQUEST
> */
> >  OFPTYPE_IPFIX_FLOW_STATS_REPLY,   /* OFPRAW_NXST_IPFIX_FLOW_REPLY */
> >  OFPTYPE_CT_FLUSH_ZONE,/* OFPRAW_NXT_CT_FLUSH_ZONE. */
> > +OFPTYPE_CT_FLUSH,   /* OFPRAW_NXT_CT_FLUSH. */
> >
> >  /* Flow monitor extension. */
> >  OFPTYPE_FLOW_MONITOR_CANCEL,  /* OFPRAW_NXT_FLOW_MONITOR_CANCEL.
> > diff --git a/include/openvswitch/ofp-util.h
> b/include/openvswitch/ofp-util.h
> > index 84937ae26..e10d90b9f 100644
> > --- a/include/openvswitch/ofp-util.h
> > +++ b/include/openvswitch/ofp-util.h
> > @@ -65,6 +65,10 @@ struct ofpbuf *ofputil_encode_echo_reply(const struct
> ofp_header *);
> >
> >  struct ofpbuf 

[ovs-dev] [PATCH v5 2/2] openflow: Add extension to flush CT by generic match

2022-12-16 Thread Ales Musil
Add extension that allows to flush connections from CT
by specifying fields that the connections should be
matched against. This allows to match only some fields
of the connection e.g. source address for orig direrction.

Reported-at: https://bugzilla.redhat.com/2120546
Signed-off-by: Ales Musil 
---
v5: Add missing usage and man for ovs-ofctl command.
v4: Allow ovs-ofctl flush/conntrack without any zone/tuple.
v3: Rebase on top of master.
v2: Rebase on top of master.
Use suggestion from Ilya.
---
 NEWS   |   6 ++
 include/openflow/nicira-ext.h  |  30 +++
 include/openvswitch/ofp-msgs.h |   4 +
 include/openvswitch/ofp-util.h |   4 +
 lib/ofp-bundle.c   |   1 +
 lib/ofp-ct-util.c  | 146 +
 lib/ofp-ct-util.h  |   9 ++
 lib/ofp-print.c|  20 +
 lib/ofp-util.c |  25 ++
 lib/rconn.c|   1 +
 ofproto/ofproto-dpif.c |   8 +-
 ofproto/ofproto-provider.h |   7 +-
 ofproto/ofproto.c  |  30 ++-
 tests/ofp-print.at |  93 +
 tests/ovs-ofctl.at |  26 ++
 tests/system-traffic.at| 116 ++
 utilities/ovs-ofctl.8.in   |  20 +
 utilities/ovs-ofctl.c  |  40 +
 18 files changed, 528 insertions(+), 58 deletions(-)

diff --git a/NEWS b/NEWS
index ff8904b02..ac73d6293 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,12 @@ Post-v3.0.0
  by specifying 'max-rate' or '[r]stp-path-cost' accordingly.
- ovs-dpctl and related ovs-appctl commands:
  * "flush-conntrack" is capable of handling partial 5-tuple.
+   - OpenFlow:
+ * New OpenFlow extension NXT_CT_FLUSH to flush connections matching
+   the specified fields.
+   - ovs-ofctl:
+ * New command "flush-conntrack" that accepts zone and 5-tuple or partial
+   5-tuple.
 
 
 v3.0.0 - 15 Aug 2022
diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
index b68804991..32ce56d31 100644
--- a/include/openflow/nicira-ext.h
+++ b/include/openflow/nicira-ext.h
@@ -1064,4 +1064,34 @@ struct nx_zone_id {
 };
 OFP_ASSERT(sizeof(struct nx_zone_id) == 8);
 
+/* CT flush available TLVs. */
+enum nx_ct_flush_tlv_type {
+/* Outer types. */
+NXT_CT_ORIG_DIRECTION,/* CT orig direction outer type. */
+NXT_CT_REPLY_DIRECTION,   /* CT reply direction outer type. */
+
+/* Nested types. */
+NXT_CT_SRC,   /* CT source IPv6 or mapped IPv4 address. */
+NXT_CT_DST,   /* CT destination IPv6 or mapped IPv4 address. */
+NXT_CT_SRC_PORT,  /* CT source port. */
+NXT_CT_DST_PORT,  /* CT destination port. */
+NXT_CT_ICMP_ID,   /* CT ICMP id. */
+NXT_CT_ICMP_TYPE, /* CT ICMP type. */
+NXT_CT_ICMP_CODE, /* CT ICMP code. */
+
+/* Primitive types. */
+NXT_CT_ZONE_ID,   /* CT zone id. */
+};
+
+/* NXT_CT_FLUSH.
+ *
+ * Flushes the connection tracking specified by 5-tuple.
+ * The struct should be followed by TLVs specifying the matching parameters. */
+struct nx_ct_flush {
+uint8_t ip_proto;  /* IP protocol. */
+uint8_t family;/* L3 address family. */
+uint8_t zero[6];   /* Must be zero. */
+};
+OFP_ASSERT(sizeof(struct nx_ct_flush) == 8);
+
 #endif /* openflow/nicira-ext.h */
diff --git a/include/openvswitch/ofp-msgs.h b/include/openvswitch/ofp-msgs.h
index 921a937e5..659b0a3e7 100644
--- a/include/openvswitch/ofp-msgs.h
+++ b/include/openvswitch/ofp-msgs.h
@@ -526,6 +526,9 @@ enum ofpraw {
 
 /* NXST 1.0+ (4): struct nx_ipfix_stats_reply[]. */
 OFPRAW_NXST_IPFIX_FLOW_REPLY,
+
+/* NXT 1.0+ (32): struct nx_ct_flush, uint8_t[8][]. */
+OFPRAW_NXT_CT_FLUSH,
 };
 
 /* Decoding messages into OFPRAW_* values. */
@@ -772,6 +775,7 @@ enum ofptype {
 OFPTYPE_IPFIX_FLOW_STATS_REQUEST, /* OFPRAW_NXST_IPFIX_FLOW_REQUEST */
 OFPTYPE_IPFIX_FLOW_STATS_REPLY,   /* OFPRAW_NXST_IPFIX_FLOW_REPLY */
 OFPTYPE_CT_FLUSH_ZONE,/* OFPRAW_NXT_CT_FLUSH_ZONE. */
+OFPTYPE_CT_FLUSH,   /* OFPRAW_NXT_CT_FLUSH. */
 
 /* Flow monitor extension. */
 OFPTYPE_FLOW_MONITOR_CANCEL,  /* OFPRAW_NXT_FLOW_MONITOR_CANCEL.
diff --git a/include/openvswitch/ofp-util.h b/include/openvswitch/ofp-util.h
index 84937ae26..e10d90b9f 100644
--- a/include/openvswitch/ofp-util.h
+++ b/include/openvswitch/ofp-util.h
@@ -65,6 +65,10 @@ struct ofpbuf *ofputil_encode_echo_reply(const struct 
ofp_header *);
 
 struct ofpbuf *ofputil_encode_barrier_request(enum ofp_version);
 
+struct ofpbuf *ofputil_ct_match_encode(const struct ofputil_ct_match *match,
+   uint16_t *zone_id,
+   enum ofp_version version);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/ofp-bundle.c b/lib/ofp-bundle.c
index 0161c2bc6..941a8370e 100644
--- a/lib/ofp-bundle.c
+++ b/lib/ofp-bundle.c

[ovs-dev] [PATCH v5 1/2] ofp, dpif: Allow CT flush based on partial match

2022-12-16 Thread Ales Musil
Currently, the CT can be flushed by dpctl only be specifying
the whole 5-tuple. This is not very convenient when there are
only some fields known to the user of CT flush. Add new struct
ofputil_ct_match which represents the generic filtering that can
be done for CT flush. The match is done only on fields that are
non-zero with exception to the icmp fields.

This allows the filtering just within dpctl, however
it is a preparation for OpenFlow extension.

Reported-at: https://bugzilla.redhat.com/2120546
Signed-off-by: Ales Musil 
Acked-by: Paolo Valerio 
---
v5: Add ack from Paolo.
v4: Fix a flush all scenario.
v3: Rebase on top of master.
Address the C99 comment and missing dpif_close call.
v2: Rebase on top of master.
Address comments from Paolo.
---
 NEWS   |   2 +
 include/openvswitch/ofp-util.h |  28 +++
 lib/automake.mk|   2 +
 lib/ct-dpif.c  | 208 +
 lib/ct-dpif.h  |   4 +-
 lib/dpctl.c|  42 ++---
 lib/dpctl.man  |   3 +-
 lib/ofp-ct-util.c  | 326 +
 lib/ofp-ct-util.h  |  36 
 tests/system-traffic.at|  82 -
 10 files changed, 586 insertions(+), 147 deletions(-)
 create mode 100644 lib/ofp-ct-util.c
 create mode 100644 lib/ofp-ct-util.h

diff --git a/NEWS b/NEWS
index 265375e1c..ff8904b02 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,8 @@ Post-v3.0.0
  10 Gbps link speed by default in case the actual link speed cannot be
  determined.  Previously it was 10 Mbps.  Values can still be overridden
  by specifying 'max-rate' or '[r]stp-path-cost' accordingly.
+   - ovs-dpctl and related ovs-appctl commands:
+ * "flush-conntrack" is capable of handling partial 5-tuple.
 
 
 v3.0.0 - 15 Aug 2022
diff --git a/include/openvswitch/ofp-util.h b/include/openvswitch/ofp-util.h
index 091a09cad..84937ae26 100644
--- a/include/openvswitch/ofp-util.h
+++ b/include/openvswitch/ofp-util.h
@@ -19,6 +19,9 @@
 
 #include 
 #include 
+#include 
+#include 
+
 #include "openvswitch/ofp-protocol.h"
 
 struct ofp_header;
@@ -27,6 +30,31 @@ struct ofp_header;
 extern "C" {
 #endif
 
+struct ofputil_ct_tuple {
+struct in6_addr src;
+struct in6_addr dst;
+
+union {
+ovs_be16 src_port;
+ovs_be16 icmp_id;
+};
+union {
+ovs_be16 dst_port;
+struct {
+uint8_t icmp_code;
+uint8_t icmp_type;
+};
+};
+};
+
+struct ofputil_ct_match {
+uint8_t ip_proto;
+uint16_t l3_type;
+
+struct ofputil_ct_tuple tuple_orig;
+struct ofputil_ct_tuple tuple_reply;
+};
+
 bool ofputil_decode_hello(const struct ofp_header *,
   uint32_t *allowed_versions);
 struct ofpbuf *ofputil_encode_hello(uint32_t version_bitmap);
diff --git a/lib/automake.mk b/lib/automake.mk
index a0fabe38f..37135f118 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -227,6 +227,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/ofp-actions.c \
lib/ofp-bundle.c \
lib/ofp-connection.c \
+   lib/ofp-ct-util.c \
+   lib/ofp-ct-util.h \
lib/ofp-ed-props.c \
lib/ofp-errors.c \
lib/ofp-flow.c \
diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index 6f17a26b5..9b324ff79 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -20,6 +20,7 @@
 #include 
 
 #include "ct-dpif.h"
+#include "ofp-ct-util.h"
 #include "openvswitch/ofp-parse.h"
 #include "openvswitch/vlog.h"
 
@@ -71,6 +72,31 @@ ct_dpif_dump_start(struct dpif *dpif, struct 
ct_dpif_dump_state **dump,
 return err;
 }
 
+static void
+ct_dpif_tuple_from_ofputil_ct_tuple(const struct ofputil_ct_tuple *ofp_tuple,
+struct ct_dpif_tuple *tuple,
+uint16_t l3_type, uint8_t ip_proto)
+{
+if (l3_type == AF_INET) {
+tuple->src.ip = in6_addr_get_mapped_ipv4(_tuple->src);
+tuple->dst.ip = in6_addr_get_mapped_ipv4(_tuple->dst);
+} else {
+tuple->src.in6 = ofp_tuple->src;
+tuple->dst.in6 = ofp_tuple->dst;
+}
+
+tuple->l3_type = l3_type;
+tuple->ip_proto = ip_proto;
+tuple->src_port = ofp_tuple->src_port;
+
+if (ip_proto == IPPROTO_ICMP || ip_proto == IPPROTO_ICMPV6) {
+tuple->icmp_code = ofp_tuple->icmp_code;
+tuple->icmp_type = ofp_tuple->icmp_type;
+} else {
+tuple->dst_port = ofp_tuple->dst_port;
+}
+}
+
 /* Dump one connection from a tracker, and put it in 'entry'.
  *
  * 'dump' should have been initialized by ct_dpif_dump_start().
@@ -100,25 +126,77 @@ ct_dpif_dump_done(struct ct_dpif_dump_state *dump)
 ? dpif->dpif_class->ct_dump_done(dpif, dump)
 : EOPNOTSUPP);
 }
-
+
+static int
+ct_dpif_flush_tuple(struct dpif *dpif, const uint16_t *zone,
+const struct ofputil_ct_match *match) {
+struct ct_dpif_dump_state *dump;
+struct ct_dpif_entry cte;
+  

[ovs-dev] [PATCH v5 0/2] Add extension for partial CT flush

2022-12-16 Thread Ales Musil
Add extension that will allow to flush
connection from CT specified by full
orig 5-tuple or partial match that combines
orig or reply direction.

Ales Musil (2):
  ofp, dpif: Allow CT flush based on partial match
  openflow: Add extension to flush CT by generic match

 NEWS   |   8 +
 include/openflow/nicira-ext.h  |  30 +++
 include/openvswitch/ofp-msgs.h |   4 +
 include/openvswitch/ofp-util.h |  32 +++
 lib/automake.mk|   2 +
 lib/ct-dpif.c  | 208 +++
 lib/ct-dpif.h  |   4 +-
 lib/dpctl.c|  42 ++-
 lib/dpctl.man  |   3 +-
 lib/ofp-bundle.c   |   1 +
 lib/ofp-ct-util.c  | 472 +
 lib/ofp-ct-util.h  |  45 
 lib/ofp-print.c|  20 ++
 lib/ofp-util.c |  25 ++
 lib/rconn.c|   1 +
 ofproto/ofproto-dpif.c |   8 +-
 ofproto/ofproto-provider.h |   7 +-
 ofproto/ofproto.c  |  30 ++-
 tests/ofp-print.at |  93 +++
 tests/ovs-ofctl.at |  26 ++
 tests/system-traffic.at| 132 +++--
 utilities/ovs-ofctl.8.in   |  20 ++
 utilities/ovs-ofctl.c  |  40 +++
 23 files changed, 1081 insertions(+), 172 deletions(-)
 create mode 100644 lib/ofp-ct-util.c
 create mode 100644 lib/ofp-ct-util.h

-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3 0/5] OVN IC multiple same routes fixes

2022-12-16 Thread Dumitru Ceara
Hi Vladislav,

I pushed the series to main branch and branch-22.12.  I also backported
the first 4 patches all the way down to branch-22.03.

Thanks a lot for the fixes (and thanks Numan for the review)!

Regards,
Dumitru

On 12/16/22 10:58, Vladislav Odintsov wrote:
> Hi Dumitru,
> 
> Regards,
> Vladislav Odintsov
> 
>> On 16 Dec 2022, at 02:16, Dumitru Ceara  wrote:
>>
>> On 12/15/22 22:30, Vladislav Odintsov wrote:
>>> Thanks Numan for the review.
>>>
>>> It seems that for some reason appeared a memory leak, which constantly 
>>> reproduces.
>>
>> Do you mean this one?
>>
>> Direct leak of 64 byte(s) in 1 object(s) allocated from:
>>#0 0x49a052 in calloc 
>> (/home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/ic/ovn-ic+0x49a052)
>>#1 0x751142 in xcalloc__ /home/runner/work/ovn/ovn/ovs/lib/util.c:121:31
>>#2 0x751170 in xzalloc__ /home/runner/work/ovn/ovn/ovs/lib/util.c:131:12
>>#3 0x751235 in xzalloc /home/runner/work/ovn/ovn/ovs/lib/util.c:165:12
>>#4 0x740353 in ovsdb_idl_txn_add_map_op 
>> /home/runner/work/ovn/ovn/ovs/lib/ovsdb-idl.c:4160:29
>>#5 0x7402bf in ovsdb_idl_txn_write_partial_map 
>> /home/runner/work/ovn/ovn/ovs/lib/ovsdb-idl.c:4317:5
>>#6 0x71c57f in icsbrec_port_binding_update_external_ids_setkey 
>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/lib/ovn-ic-sb-idl.c:7220:5
>>#7 0x4d3696 in update_isb_pb_external_ids 
>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:573:5
>>#8 0x4d2017 in create_isb_pb 
>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:713:5
>>#9 0x4ce6af in port_binding_run 
>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:800:21
>>#10 0x4cbb87 in ovn_db_run 
>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:1737:5
>>#11 0x4cac7a in main 
>> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:2026:17
>>#12 0x7fcd3d9e7082 in __libc_start_main 
>> (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
>>
>> From: https://github.com/ovsrobot/ovn/actions/runs/3706396547
>>
>> If so, I've seen it a few times before in CI and didn't have
>> time to investigate closely.  I'm quite confident it's not
>> related to your changes so I think it's fine to push your
>> series.  Like that it would also make it in time for the
>> 22.12.0 release.
> 
> Yes, I was talking about this trace. So, I walked through it and see that 
> it’s really not connected with my patchset, and it can be applied.
> 
>>
>>> I’ll give it a time tomorrow to find the source of a problem.
>>
>> I'll wait at least until Friday morning (CET).
> 
> Anyway if I succeed to fix this issue, I’ll send a separate patch for that.
> 
>>
>>>
>>> Regards,
>>> Vladislav Odintsov
>>
>> Regards,
>> Dumitru
>>
>>>
 On 16 Dec 2022, at 00:21, Numan Siddique  wrote:

 On Thu, Dec 15, 2022 at 12:02 PM Vladislav Odintsov >>> > wrote:
>
> v2 -> v3:
> - Split patch #3 by two: first fixes a bug with duplicated route
>   advertisement and will be considered for back-porting; the second one
>   changes ovn-ic SB:Route schema and documents ovn-ic upgrade details.
> - Address Dumitru's comment regarding loggin rate-limit.
>
> v1 -> v2:
> - Split series by two: OVN IC-related changes and northd OF bucket limits
> - Squash two patches in one
> - Fix memory leak in add_to_routes_ad()
> - Addressed review comments by Dumitru
>
> v1 description here:
> https://patchwork.ozlabs.org/project/ovn/cover/20221202173147.3032702-1-odiv...@gmail.com/
>
> Vladislav Odintsov (5):
> ic: remove orphan ovn interconnection routes
> ic: lookup southbound port_binding only if needed
> ic: prevent advertising/learning multiple same routes
> ic: minor code improvements
> ic-sb schema: add index for routes table & document upgrade path

 For the entire series:

 Acked-by: Numan Siddique mailto:num...@ovn.org>>

 Numan

>
> Documentation/intro/install/ovn-upgrades.rst |  20 +++
> NEWS |   4 +
> ic/ovn-ic.c  | 142 +--
> ovn-ic-sb.ovsschema  |   6 +-
> tests/ovn-ic.at   | 133 
> +
> 5 files changed, 257 insertions(+), 48 deletions(-)
>
> --
> 2.36.1
>
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 2/2] openflow: Add extension to flush CT by generic match

2022-12-16 Thread Paolo Valerio
Ales Musil  writes:

> Add extension that allows to flush connections from CT
> by specifying fields that the connections should be
> matched against. This allows to match only some fields
> of the connection e.g. source address for orig direrction.
>
> Reported-at: https://bugzilla.redhat.com/2120546
> Signed-off-by: Ales Musil 
> ---
> v4: Allow ovs-ofctl flush/conntrack without any zone/tuple.
> v3: Rebase on top of master.
> v2: Rebase on top of master.
> Use suggestion from Ilya.
> ---
>  NEWS   |   3 +
>  include/openflow/nicira-ext.h  |  30 +++
>  include/openvswitch/ofp-msgs.h |   4 +
>  include/openvswitch/ofp-util.h |   4 +
>  lib/ofp-bundle.c   |   1 +
>  lib/ofp-ct-util.c  | 146 +
>  lib/ofp-ct-util.h  |   9 ++
>  lib/ofp-print.c|  20 +
>  lib/ofp-util.c |  25 ++
>  lib/rconn.c|   1 +
>  ofproto/ofproto-dpif.c |   8 +-
>  ofproto/ofproto-provider.h |   7 +-
>  ofproto/ofproto.c  |  30 ++-
>  tests/ofp-print.at |  93 +
>  tests/ovs-ofctl.at |  26 ++
>  tests/system-traffic.at| 116 ++
>  utilities/ovs-ofctl.c  |  38 +
>  17 files changed, 503 insertions(+), 58 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index ff8904b02..46b8faa41 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -16,6 +16,9 @@ Post-v3.0.0
>   by specifying 'max-rate' or '[r]stp-path-cost' accordingly.
> - ovs-dpctl and related ovs-appctl commands:
>   * "flush-conntrack" is capable of handling partial 5-tuple.
> +   - OpenFlow:
> +  * New OpenFlow extension NXT_CT_FLUSH to flush connections matching
> +the specified fields.
>

I guess we miss an entry for ovs-ofctl flush-conntrack

>  
>  v3.0.0 - 15 Aug 2022
> diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
> index b68804991..32ce56d31 100644
> --- a/include/openflow/nicira-ext.h
> +++ b/include/openflow/nicira-ext.h
> @@ -1064,4 +1064,34 @@ struct nx_zone_id {
>  };
>  OFP_ASSERT(sizeof(struct nx_zone_id) == 8);
>  
> +/* CT flush available TLVs. */
> +enum nx_ct_flush_tlv_type {
> +/* Outer types. */
> +NXT_CT_ORIG_DIRECTION,/* CT orig direction outer type. */
> +NXT_CT_REPLY_DIRECTION,   /* CT reply direction outer type. */
> +
> +/* Nested types. */
> +NXT_CT_SRC,   /* CT source IPv6 or mapped IPv4 address. */
> +NXT_CT_DST,   /* CT destination IPv6 or mapped IPv4 address. 
> */
> +NXT_CT_SRC_PORT,  /* CT source port. */
> +NXT_CT_DST_PORT,  /* CT destination port. */
> +NXT_CT_ICMP_ID,   /* CT ICMP id. */
> +NXT_CT_ICMP_TYPE, /* CT ICMP type. */
> +NXT_CT_ICMP_CODE, /* CT ICMP code. */
> +
> +/* Primitive types. */
> +NXT_CT_ZONE_ID,   /* CT zone id. */
> +};
> +
> +/* NXT_CT_FLUSH.
> + *
> + * Flushes the connection tracking specified by 5-tuple.
> + * The struct should be followed by TLVs specifying the matching parameters. 
> */
> +struct nx_ct_flush {
> +uint8_t ip_proto;  /* IP protocol. */
> +uint8_t family;/* L3 address family. */
> +uint8_t zero[6];   /* Must be zero. */
> +};
> +OFP_ASSERT(sizeof(struct nx_ct_flush) == 8);
> +
>  #endif /* openflow/nicira-ext.h */
> diff --git a/include/openvswitch/ofp-msgs.h b/include/openvswitch/ofp-msgs.h
> index 921a937e5..659b0a3e7 100644
> --- a/include/openvswitch/ofp-msgs.h
> +++ b/include/openvswitch/ofp-msgs.h
> @@ -526,6 +526,9 @@ enum ofpraw {
>  
>  /* NXST 1.0+ (4): struct nx_ipfix_stats_reply[]. */
>  OFPRAW_NXST_IPFIX_FLOW_REPLY,
> +
> +/* NXT 1.0+ (32): struct nx_ct_flush, uint8_t[8][]. */
> +OFPRAW_NXT_CT_FLUSH,
>  };
>  
>  /* Decoding messages into OFPRAW_* values. */
> @@ -772,6 +775,7 @@ enum ofptype {
>  OFPTYPE_IPFIX_FLOW_STATS_REQUEST, /* OFPRAW_NXST_IPFIX_FLOW_REQUEST */
>  OFPTYPE_IPFIX_FLOW_STATS_REPLY,   /* OFPRAW_NXST_IPFIX_FLOW_REPLY */
>  OFPTYPE_CT_FLUSH_ZONE,/* OFPRAW_NXT_CT_FLUSH_ZONE. */
> +OFPTYPE_CT_FLUSH,   /* OFPRAW_NXT_CT_FLUSH. */
>  
>  /* Flow monitor extension. */
>  OFPTYPE_FLOW_MONITOR_CANCEL,  /* OFPRAW_NXT_FLOW_MONITOR_CANCEL.
> diff --git a/include/openvswitch/ofp-util.h b/include/openvswitch/ofp-util.h
> index 84937ae26..e10d90b9f 100644
> --- a/include/openvswitch/ofp-util.h
> +++ b/include/openvswitch/ofp-util.h
> @@ -65,6 +65,10 @@ struct ofpbuf *ofputil_encode_echo_reply(const struct 
> ofp_header *);
>  
>  struct ofpbuf *ofputil_encode_barrier_request(enum ofp_version);
>  
> +struct ofpbuf *ofputil_ct_match_encode(const struct ofputil_ct_match *match,
> +   uint16_t *zone_id,
> +   enum ofp_version version);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git 

Re: [ovs-dev] [PATCH v4 1/2] ofp, dpif: Allow CT flush based on partial match

2022-12-16 Thread Paolo Valerio
Ales Musil  writes:

> Currently, the CT can be flushed by dpctl only be specifying
> the whole 5-tuple. This is not very convenient when there are
> only some fields known to the user of CT flush. Add new struct
> ofputil_ct_match which represents the generic filtering that can
> be done for CT flush. The match is done only on fields that are
> non-zero with exception to the icmp fields.
>
> This allows the filtering just within dpctl, however
> it is a preparation for OpenFlow extension.
>
> Reported-at: https://bugzilla.redhat.com/2120546
> Signed-off-by: Ales Musil 
> ---
> v4: Fix a flush all scenario.
> v3: Rebase on top of master.
> Address the C99 comment and missing dpif_close call.
> v2: Rebase on top of master.
> Address comments from Paolo.
> ---

stressed a bit more some corner cases mostly related to icmp and after a
quick discussion offline things LGTM

Acked-by: Paolo Valerio 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net v3] openvswitch: Fix flow lookup to use unmasked key

2022-12-16 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller :

On Thu, 15 Dec 2022 15:46:33 +0100 you wrote:
> The commit mentioned below causes the ovs_flow_tbl_lookup() function
> to be called with the masked key. However, it's supposed to be called
> with the unmasked key. This due to the fact that the datapath supports
> installing wider flows, and OVS relies on this behavior. For example
> if ipv4(src=1.1.1.1/192.0.0.0, dst=1.1.1.2/192.0.0.0) exists, a wider
> flow (smaller mask) of ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/
> 128.0.0.0) is allowed to be added.
> 
> [...]

Here is the summary with links:
  - [net,v3] openvswitch: Fix flow lookup to use unmasked key
https://git.kernel.org/netdev/net/c/68bb10101e6b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3 0/5] OVN IC multiple same routes fixes

2022-12-16 Thread Vladislav Odintsov
Hi Dumitru,

Regards,
Vladislav Odintsov

> On 16 Dec 2022, at 02:16, Dumitru Ceara  wrote:
> 
> On 12/15/22 22:30, Vladislav Odintsov wrote:
>> Thanks Numan for the review.
>> 
>> It seems that for some reason appeared a memory leak, which constantly 
>> reproduces.
> 
> Do you mean this one?
> 
> Direct leak of 64 byte(s) in 1 object(s) allocated from:
>#0 0x49a052 in calloc 
> (/home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/ic/ovn-ic+0x49a052)
>#1 0x751142 in xcalloc__ /home/runner/work/ovn/ovn/ovs/lib/util.c:121:31
>#2 0x751170 in xzalloc__ /home/runner/work/ovn/ovn/ovs/lib/util.c:131:12
>#3 0x751235 in xzalloc /home/runner/work/ovn/ovn/ovs/lib/util.c:165:12
>#4 0x740353 in ovsdb_idl_txn_add_map_op 
> /home/runner/work/ovn/ovn/ovs/lib/ovsdb-idl.c:4160:29
>#5 0x7402bf in ovsdb_idl_txn_write_partial_map 
> /home/runner/work/ovn/ovn/ovs/lib/ovsdb-idl.c:4317:5
>#6 0x71c57f in icsbrec_port_binding_update_external_ids_setkey 
> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/lib/ovn-ic-sb-idl.c:7220:5
>#7 0x4d3696 in update_isb_pb_external_ids 
> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:573:5
>#8 0x4d2017 in create_isb_pb 
> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:713:5
>#9 0x4ce6af in port_binding_run 
> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:800:21
>#10 0x4cbb87 in ovn_db_run 
> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:1737:5
>#11 0x4cac7a in main 
> /home/runner/work/ovn/ovn/ovn-22.12.90/_build/sub/../../ic/ovn-ic.c:2026:17
>#12 0x7fcd3d9e7082 in __libc_start_main 
> (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
> 
> From: https://github.com/ovsrobot/ovn/actions/runs/3706396547
> 
> If so, I've seen it a few times before in CI and didn't have
> time to investigate closely.  I'm quite confident it's not
> related to your changes so I think it's fine to push your
> series.  Like that it would also make it in time for the
> 22.12.0 release.

Yes, I was talking about this trace. So, I walked through it and see that it’s 
really not connected with my patchset, and it can be applied.

> 
>> I’ll give it a time tomorrow to find the source of a problem.
> 
> I'll wait at least until Friday morning (CET).

Anyway if I succeed to fix this issue, I’ll send a separate patch for that.

> 
>> 
>> Regards,
>> Vladislav Odintsov
> 
> Regards,
> Dumitru
> 
>> 
>>> On 16 Dec 2022, at 00:21, Numan Siddique  wrote:
>>> 
>>> On Thu, Dec 15, 2022 at 12:02 PM Vladislav Odintsov >> > wrote:
 
 v2 -> v3:
 - Split patch #3 by two: first fixes a bug with duplicated route
   advertisement and will be considered for back-porting; the second one
   changes ovn-ic SB:Route schema and documents ovn-ic upgrade details.
 - Address Dumitru's comment regarding loggin rate-limit.
 
 v1 -> v2:
 - Split series by two: OVN IC-related changes and northd OF bucket limits
 - Squash two patches in one
 - Fix memory leak in add_to_routes_ad()
 - Addressed review comments by Dumitru
 
 v1 description here:
 https://patchwork.ozlabs.org/project/ovn/cover/20221202173147.3032702-1-odiv...@gmail.com/
 
 Vladislav Odintsov (5):
 ic: remove orphan ovn interconnection routes
 ic: lookup southbound port_binding only if needed
 ic: prevent advertising/learning multiple same routes
 ic: minor code improvements
 ic-sb schema: add index for routes table & document upgrade path
>>> 
>>> For the entire series:
>>> 
>>> Acked-by: Numan Siddique mailto:num...@ovn.org>>
>>> 
>>> Numan
>>> 
 
 Documentation/intro/install/ovn-upgrades.rst |  20 +++
 NEWS |   4 +
 ic/ovn-ic.c  | 142 +--
 ovn-ic-sb.ovsschema  |   6 +-
 tests/ovn-ic.at   | 133 
 +
 5 files changed, 257 insertions(+), 48 deletions(-)
 
 --
 2.36.1
 
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 2/2] northd: Store skip_snat and force_snat in ct_label/mark

2022-12-16 Thread Dumitru Ceara
On 12/16/22 08:20, Ales Musil wrote:
> On Fri, Dec 16, 2022 at 12:32 AM Dumitru Ceara  wrote:
> 
>> On 12/7/22 16:34, Ales Musil wrote:
>>> In order to have the SNAT action available for
>>> related traffic store the flag in CT register.
>>>
>>> Extend the ct_lb and ct_lb_mark action to
>>> allow additional parameter that sets the
>>> corresponding flag if needed in ct_label/ct_mark
>>> register. This allows us to later on match on it
>>> for related traffic and set the corresponding flags
>>> fro additional flows.
>>>
>>> Currently only two flags are supported "skip_snat"
>>> and "force_snat" which are mutually exclusive.
>>>
>>> Reported-at: https://bugzilla.redhat.com/2126083
>>> Signed-off-by: Ales Musil 
>>> ---
>>
>> Hi Ales,
>>
>> First of all thanks a lot for fixing this, I know it wasn't easy to
>> debug!
>>
>> [...]
>>
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index a16e8a8a7..cb5b14181 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>
>> [...]
>>
>>> @@ -10411,6 +10422,11 @@ build_lrouter_nat_flows_for_lb(struct
>> ovn_lb_vip *lb_vip,
>>>  bool reject = build_lb_vip_actions(lb_vip, vips_nb, action,
>>> lb->selection_fields, false,
>>> ct_lb_mark);
>>> +bool drop = !!strncmp(ds_cstr(action), "ct_lb", strlen("ct_lb"));
>>> +if (!drop) {
>>> +/* Remove the trailing ");". */
>>> +ds_truncate(action, action->length - 2);
>>> +}
>>>
>>>  /* Higher priority rules are added for load-balancing in DNAT
>>>   * table.  For every match (on a VIP[:port]), we add two flows.
>>> @@ -10427,8 +10443,9 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip
>> *lb_vip,
>>>  }
>>>
>>>  if (lb->skip_snat) {
>>> -skip_snat_new_action = xasprintf("flags.skip_snat_for_lb = 1;
>> %s",
>>> - ds_cstr(action));
>>> +skip_snat_new_action = xasprintf("flags.skip_snat_for_lb = 1;
>> %s%s",
>>> + ds_cstr(action),
>>> + drop ? "" : "; skip_snat);");
>>>  skip_snat_est_action = xasprintf("flags.skip_snat_for_lb = 1; "
>>>   "next;");
>>>  }
>>> @@ -10561,8 +10578,9 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip
>> *lb_vip,
>>>  skip_snat_new_action, est_match,
>>>  skip_snat_est_action, lflows, prio, meter_groups);
>>>
>>> -char *new_actions = xasprintf("flags.force_snat_for_lb = 1; %s",
>>> -  ds_cstr(action));
>>> +char *new_actions = xasprintf("flags.force_snat_for_lb = 1; %s%s",
>>> +  ds_cstr(action),
>>> +  drop ? "" : "; force_snat);");
>>>  build_gw_lrouter_nat_flows_for_lb(lb, gw_router_force_snat,
>>>  n_gw_router_force_snat, reject, new_match,
>>>  new_actions, est_match,
>>> @@ -10577,6 +10595,10 @@ build_lrouter_nat_flows_for_lb(struct
>> ovn_lb_vip *lb_vip,
>>> lb_aff_force_snat_router,
>>> n_lb_aff_force_snat_router);
>>>
>>> +if (!drop) {
>>> +ds_put_cstr(action, ");");
>>> +}
>>> +
>>
>> I think I should've spoken earlier but I was a bit behind with reviewing
>> this patch until now: this whole part seems a bit fragile because we
>> make assumptions about how the string built by build_lb_vip_actions()
>> looks like.  What if in the future the action string built by
>> build_lb_vip_actions() doesn't start with "ct_lb.."?  What if it
>> doesn't end with ");"?
>>
> 
>> I also don't really like the 'drop' variable: it's almost always equal
>> to 'reject' with one exception, when healthcheck is enabled and no
>> backends are active but the LB VIP is not configured to reject on
>> empty backend sets.
>>
>> But I won't block the release for this part.  Instead, I think we
>> should find a way to refactor all this code in the near future so that
>> all LB actions are built in a single place.  build_lb_vip_actions() was
>> supposed to do that but now part of it is done outside the function.
>> What do you think, does it sound feasible to you?
>>
> 
> I agree it's not the best approach, the refactor I've prepared some time
> ago might address some of those concerns.
> The problem is that build_lb_vip_actions doesn't know the context so it's
> hard to make decision if there should
> be a flag or not. I don't know what would be the best solution for that,
> but I'll try to think about the solution.
> Hopefully I'll be able to include that in the refactor.
> 

Great, thanks again!

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev