[ovs-dev] [PATCH v7] conntrack: support default timeout policy get/set cmd for netdev datapath

2022-01-18 Thread wenxu
From: wenxu 

Now, the default timeout policy for netdev datapath is hard codeing. In
some case show or modify is needed.
Add command for get/set default timeout policy. Using like this:

ovs-appctl dpctl/ct-get-default-tp [dp]
ovs-appctl dpctl/ct-set-default-tp [dp] policies

Signed-off-by: wenxu 
---
 NEWS |  4 +++
 lib/conntrack-tp.c   | 11 +++
 lib/conntrack-tp.h   |  2 ++
 lib/ct-dpif.c| 56 
 lib/ct-dpif.h|  9 ++
 lib/dpctl.c  | 69 
 lib/dpif-netdev.c| 25 +++
 lib/dpif-netlink.c   |  2 ++
 lib/dpif-provider.h  |  8 +
 tests/system-kmod-macros.at  | 10 ++
 tests/system-traffic.at  | 67 ++
 tests/system-userspace-macros.at |  7 
 12 files changed, 270 insertions(+)

diff --git a/NEWS b/NEWS
index e2cdc1a..122b1a7 100644
--- a/NEWS
+++ b/NEWS
@@ -44,6 +44,10 @@ Post-v2.16.0
- Ingress policing on Linux now uses 'matchall' classifier instead of
  'basic', if available.
- Add User Statically-Defined Tracing (USDT) probe framework support.
+   - ovs-appctl dpctl/:
+ * New commands 'ct-set-default-tp' and
+   'ct-set-default-tp' that allows to get or configure
+   netdev datapath ct default timeout policy.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/conntrack-tp.c b/lib/conntrack-tp.c
index a586d3a..4677d0b 100644
--- a/lib/conntrack-tp.c
+++ b/lib/conntrack-tp.c
@@ -230,6 +230,17 @@ tm_to_ct_dpif_tp(enum ct_timeout tm)
 return CT_DPIF_TP_ATTR_MAX;
 }
 
+void
+dpif_netdev_format_timeout_policy(const struct ct_dpif_timeout_policy *tp,
+  struct ds *ds)
+{
+for (unsigned i = 0; i < N_CT_TM; i++) {
+ds_put_format(ds, "\n\t%s = %"PRIu32, ct_timeout_str[i],
+  tp->attrs[tm_to_ct_dpif_tp(i)]);
+}
+}
+
+
 static void
 conn_update_expiration__(struct conntrack *ct, struct conn *conn,
  enum ct_timeout tm, long long now,
diff --git a/lib/conntrack-tp.h b/lib/conntrack-tp.h
index 4d411d1..07dcb4e 100644
--- a/lib/conntrack-tp.h
+++ b/lib/conntrack-tp.h
@@ -27,4 +27,6 @@ void conn_init_expiration(struct conntrack *ct, struct conn 
*conn,
   enum ct_timeout tm, long long now);
 void conn_update_expiration(struct conntrack *ct, struct conn *conn,
 enum ct_timeout tm, long long now);
+void dpif_netdev_format_timeout_policy(const struct ct_dpif_timeout_policy *tp,
+   struct ds *ds);
 #endif
diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index cfc2315..b061d28 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -180,6 +180,25 @@ ct_dpif_get_tcp_seq_chk(struct dpif *dpif, bool *enabled)
 }
 
 int
+ct_dpif_set_default_timeout_policy(struct dpif *dpif,
+   struct ct_dpif_timeout_policy *tp)
+{
+return (dpif->dpif_class->ct_set_default_timeout_policy
+? dpif->dpif_class->ct_set_default_timeout_policy(dpif, tp)
+: EOPNOTSUPP);
+}
+
+int
+ct_dpif_get_default_timeout_policy(struct dpif *dpif,
+   struct ct_dpif_timeout_policy *tp,
+   struct ds *ds)
+{
+return (dpif->dpif_class->ct_get_default_timeout_policy
+? dpif->dpif_class->ct_get_default_timeout_policy(dpif, tp, ds)
+: EOPNOTSUPP);
+}
+
+int
 ct_dpif_set_limits(struct dpif *dpif, const uint32_t *default_limit,
const struct ovs_list *zone_limits)
 {
@@ -710,6 +729,43 @@ ct_dpif_free_zone_limits(struct ovs_list *zone_limits)
 }
 }
 
+
+/* Parses a specification of a timeout policy from 's' into '*tp'.
+ * Returns true on success.  Otherwise, returns false and puts the
+ * error message in 'ds'. */
+bool
+ct_dpif_parse_timeout_policy_tuple(const char *s, struct ds *ds,
+   struct ct_dpif_timeout_policy *tp)
+{
+char *pos, *key, *value, *copy, *err;
+
+pos = copy = xstrdup(s);
+while (ofputil_parse_key_value(, , )) {
+uint32_t tmp;
+
+if (!*value) {
+ds_put_format(ds, "field %s missing value", key);
+goto error;
+}
+
+err = str_to_u32(value, );
+if (err) {
+  free(err);
+  goto error_with_msg;
+}
+
+ct_dpif_set_timeout_policy_attr_by_name(tp, key, tmp);
+}
+free(copy);
+
+return true;
+
+error_with_msg:
+ds_put_format(ds, "failed to parse field %s", key);
+error:
+free(copy);
+return false;
+}
 /* Parses a specification of a conntrack zone limit from 's' into '*pzone'
  * and '*plimit'.  Returns true on success.  Otherwise, returns false and
  * and puts the error message in 'ds'. */
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index b59cba9..f04182b 

Re: [ovs-dev] [PATCH 0/2] Patches to branch for 2.17.

2022-01-18 Thread Ilya Maximets
On 1/18/22 20:24, Ilya Maximets wrote:
> 
> Ilya Maximets (2):
>   Prepare for 2.17.0.
>   Prepare for post-2.17.0 (2.17.90).
> 
>  Documentation/faq/releases.rst |  2 ++
>  NEWS   |  6 +-
>  configure.ac   |  2 +-
>  debian/changelog   | 10 --
>  4 files changed, 16 insertions(+), 4 deletions(-)
> 

Thanks, Aaron and Flavio!

Applied.  With the first patch only, branch-2.17 is created now.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5 00/27] dpif-netdev: Parallel offload processing

2022-01-18 Thread Ilya Maximets
On 9/15/21 15:45, Maxime Coquelin wrote:
> Hi,
> 
> On 9/8/21 11:47 AM, Gaetan Rivet wrote:
>> This patch series aims to improve the performance of the management
>> of hw-offloads in dpif-netdev. In the current version, some setup
>> will experience high memory usage and poor latency between a flow
>> decision and its execution regarding hardware offloading.
>>
>> This series starts by measuring key metrics regarding both issues
>> Those patches are introduced first to compare the current status
>> with each improvements introduced.
>> Offloads enqueued and inserted, as well as the latency
>> from queue insertion to hardware insertion is measured. A new
>> command 'ovs-appctl dpctl/offload-stats-show' is introduced
>> to show the current measure.
>>
>> In my current performance test setup I am measuring an
>> average latency hovering between 1~2 seconds.
>> After the optimizations, it is reduced to 500~900 ms.
>> Finally when using multiple threads and with proper driver
>> support[1], it is measured in the order of 1 ms.
>>
>> A few modules are introduced:
>>
>>   * An ID pool with reduced capabilities, simplifying its
>> operations and allowing better performances in both
>> single and multi-thread setup.
>>
>>   * A lockless queue between PMDs / revalidators and
>> offload thread(s). As the number of PMDs increases,
>> contention can be high on the shared queue.
>> This queue is designed to serve as message queue
>> between threads.
>>
>>   * A bounded lockless MPMC ring and some helpers for
>> calculating moving averages.
>>
>>   * A moving average module for Cumulative and Exponential
>> moving averages.
>>
>> The netdev-offload-dpdk module is made thread-safe.
>> Internal maps are made per-netdev instead, and locks are
>> taken for shorter critical sections within the module.
>>
>> CI result: https://github.com/grivet/ovs/actions/runs/554918929
>>
>> [1]: The rte_flow API was made thread-safe in the 20.11 DPDK
>>  release. Drivers that do not implement those operations
>>  concurrently are protected by a lock. Others will
>>  allow better concurrency, that improve the result
>>  of this series.
>>
>> v2:
>>
>>   * Improved the MPSC queue API to simplify usage.
>>
>>   * Moved flush operation from initiator thread to offload
>> thread(s). This ensures offload metadata are shared only
>> among the offload thread pool.
>>
>>   * Flush operation needs additional thread synchronization.
>> The ovs_barrier currently triggers a UAF. Add a unit-test to
>> validate its operations and a fix for the UAF.
>>
>> CI result: https://github.com/grivet/ovs/actions/runs/741430135
>>The error comes from a failure to download 'automake' on
>>osx, unrelated to any change in this series.
>>
>> v3:
>>
>>   * Re-ordered commits so fixes are first. No conflict seen currently,
>> but it might prevent them if some requested changes to the series
>> were to move code in the same parts.
>>
>>   * Modified the reduced quiescing the thread to use ovsrcu_quiesce(),
>> and base next_rcu on the current time value (after quiescing happened,
>> however long it takes).
>>
>>   * Added Reviewed-by tags to the relevant commits.
>>
>> CI result: https://github.com/grivet/ovs/actions/runs/782655601
>>
>> v4:
>>
>>   * Modified the seq-pool to use batches of IDs with a spinlock
>> instead of lockless rings.
>>
>>   * The llring structure is removed.
>>
>>   * Due to the length of the changes to the structure, some
>> acked-by or reviewed-by were not ported to the id-fpool patch.
>>
>> CI result: https://github.com/grivet/ovs/actions/runs/921095015
>>
>> v5:
>>
>>   * Rebase on master.
>> Conflicts were seen related to the vxlan-decap and pmd rebalance
>> series.
>>
>>   * Fix typo in xchg patch spotted by Maxime Coquelin.
>>
>>   * Added Reviewed-by Maxime Coquelin on 4 patches.
> 
> 
> I went through the changes between v4 and v5, and would like to confirm
> these are are minor and OK to me.
> 
> Regards,
> Maxime
> 
>> CI result: https://github.com/grivet/ovs/actions/runs/1212804378
>>
>> Gaetan Rivet (27):
>>   ovs-thread: Fix barrier use-after-free
>>   dpif-netdev: Rename flow offload thread
>>   tests: Add ovs-barrier unit test
>>   netdev: Add flow API uninit function
>>   netdev-offload-dpdk: Use per-netdev offload metadata
>>   netdev-offload-dpdk: Implement hw-offload statistics read
>>   dpctl: Add function to read hardware offload statistics
>>   dpif-netdev: Rename offload thread structure
>>   mov-avg: Add a moving average helper structure
>>   dpif-netdev: Implement hardware offloads stats query
>>   ovs-atomic: Expose atomic exchange operation
>>   mpsc-queue: Module for lock-free message passing
>>   id-fpool: Module for fast ID generation
>>   netdev-offload: Add multi-thread API
>>   dpif-netdev: Quiesce offload thread periodically
>>   dpif-netdev: Postpone flow offload item freeing
>>   dpif-netdev: 

Re: [ovs-dev] [RFC ovn] Add LTS section to release documentation.

2022-01-18 Thread Han Zhou
On Wed, Jan 12, 2022 at 1:35 PM Mark Michelson  wrote:
>
> OVN LTS releases have a lot of ambiguity, so this is intended to codify
> LTS support and cadence.
>
> Signed-off-by: Mark Michelson 
> ---
>  Documentation/internals/release-process.rst | 28 +
>  1 file changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/internals/release-process.rst
b/Documentation/internals/release-process.rst
> index f37c09e51..1ad43f3cc 100644
> --- a/Documentation/internals/release-process.rst
> +++ b/Documentation/internals/release-process.rst
> @@ -75,16 +75,24 @@ Scheduling`_ for the timing of each stage:
> 2019.10.2, and so on.  The process is the same for these additional
release
> as for a .0 release.
>
> -At most two release branches are formally maintained at any given time:
the
> -latest release and the latest release designed as LTS.  An LTS release
is one
> -that the OVN project has designated as being maintained for a longer
period of
> -time.  Currently, an LTS release is maintained until the next LTS is
chosen.
> -There is not currently a strict guideline on how often a new LTS release
is
> -chosen, but so far it has been about every 2 years.  That could change
based on
> -the current state of OVN development.  For example, we do not want to
designate
> -a new release as LTS that includes disruptive internal changes, as that
may
> -make it harder to support for a longer period of time.  Discussion about
> -choosing the next LTS release occurs on the OVS development mailing list.
> +Long-term Support Releases
> +--
> +
> +The OVN project will periodically designate a release as "long-term
support" or
> +LTS for short. An LTS release has the distinction of being maintained for
> +longer than a standard release.
> +
> +LTS releases will receive bug fixes until the point that another LTS is
> +released. At that point, the old LTS will receive an additional year of
> +critical and security fixes. Critical fixes are those that are required
to
> +ensure basic operation (e.g. memory leak fixes, crash fixes). Security
fixes
> +are those that address concerns about exploitable flaws in OVN and that
have a
> +corresponding CVE report.
> +
> +LTS releases are scheduled to be released once every three years. This
means
> +that any given LTS will receive bug fix support for three years,
followed by
> +one year of critical bug fixes and security fixes.
> +

Thanks Mark. It is great to have a clear timeline of the LTS releases. Just
a little concern about the 3 + 1 cycle.

Firstly it may be a big burden for maintainers to maintain branches up to 4
years old, even if it is for bug fixes only. We need to consider bug
reproduction and debugging, too, if some bugs are reported on the old LTS
branch but not the newer branches. Secondly, it doesn't seem to be very
valuable to maintain branches so old. I understand that people would choose
to do minor upgrades for just bug fixes without worrying about
compatibility and other problems incurred by a major release change, but
cloud-infrastructure changes fast. A major upgrade in 2 years sounds
already very long. Thirdly, the frequency of LTS release every 3 years may
be also too low. Some user may want to use LTS releases only, so at the 2
year point, they can either choose the last 2-year old branch which may not
have the features they need, or will have to wait another year for the next
LTS.

Would 1 + 1 (LTS once every year, and 2 latest LTS releases be maintained,
meaning each LTS's life is 2 years) be better? Or, to match the ubuntu LTS
cycle, 2 + 1?

Thanks,
Han

>
>  Release Numbering
>  -
> --
> 2.31.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] Prepare for post-2.17.0 (2.17.90).

2022-01-18 Thread Flavio Leitner
On Tue, Jan 18, 2022 at 08:24:43PM +0100, Ilya Maximets wrote:
> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] Prepare for 2.17.0.

2022-01-18 Thread Flavio Leitner
On Tue, Jan 18, 2022 at 08:24:42PM +0100, Ilya Maximets wrote:
> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-dpdk: prepare for tso offload for tx copy packet

2022-01-18 Thread Flavio Leitner
On Thu, Jan 13, 2022 at 04:23:17PM +0800, Harold Huang wrote:
> From: Harold Huang 
> 
> When one flow is output to multiple egress ports, OVS copy the packets
> and send the copy packets to the intermediate ports. The original packets
> is sent to the last port. If the intermediate port is a dpdk port, the copy
> packets should also be prepared for tso offload.
> 
> Fixes: 29cf9c1b3b ("userspace: Add TCP Segmentation Offload support")
> Signed-off-by: Harold Huang 
> ---
>  lib/netdev-dpdk.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 6782d3e8f..83029405e 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2737,10 +2737,11 @@ dpdk_pktmbuf_alloc(struct rte_mempool *mp, uint32_t 
> data_len)
>  }
>  
>  static struct dp_packet *
> -dpdk_copy_dp_packet_to_mbuf(struct rte_mempool *mp, struct dp_packet 
> *pkt_orig)
> +dpdk_copy_dp_packet_to_mbuf(struct netdev_dpdk *dev, struct dp_packet 
> *pkt_orig)
>  {
>  struct rte_mbuf *mbuf_dest;
>  struct dp_packet *pkt_dest;
> +struct rte_mempool *mp = dev->dpdk_mp->mp;
>  uint32_t pkt_len;
>  
>  pkt_len = dp_packet_size(pkt_orig);
> @@ -2761,11 +2762,9 @@ dpdk_copy_dp_packet_to_mbuf(struct rte_mempool *mp, 
> struct dp_packet *pkt_orig)
>  memcpy(_dest->l2_pad_size, _orig->l2_pad_size,
> sizeof(struct dp_packet) - offsetof(struct dp_packet, 
> l2_pad_size));
>  
> -if (mbuf_dest->ol_flags & RTE_MBUF_F_TX_L4_MASK) {
> -mbuf_dest->l2_len = (char *)dp_packet_l3(pkt_dest)
> -- (char *)dp_packet_eth(pkt_dest);
> -mbuf_dest->l3_len = (char *)dp_packet_l4(pkt_dest)
> -- (char *) dp_packet_l3(pkt_dest);
> +if (!netdev_dpdk_prep_hwol_packet(dev, mbuf_dest)) {
> +rte_pktmbuf_free(mbuf_dest);
> +return NULL;

What happens if a packet comes from a non-DPDK port and
goes to a vhost-user port? I think we will get into this:

netdev_dpdk_vhost_send()
\-- (not a DPDK packet)
\-- dpdk_do_tx_copy()
   \-- dpdk_copy_dp_packet_to_mbuf()
   |   \-- netdev_dpdk_prep_hwol_packet() <--
   |
   \-- __netdev_dpdk_vhost_send()
   \-- netdev_dpdk_prep_hwol_batch()
   \-- netdev_dpdk_prep_hwol_packet() <--

I think we will prepare the same packet twice.

BTW, this bug should be fixed by the patch here:
https://patchwork.ozlabs.org/project/openvswitch/patch/20210110030505.325722-1-...@sysclose.org/#2610119

fbl

>  }
>  
>  return pkt_dest;
> @@ -2813,7 +2812,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct 
> dp_packet_batch *batch)
>  continue;
>  }
>  
> -pkts[txcnt] = dpdk_copy_dp_packet_to_mbuf(dev->dpdk_mp->mp, packet);
> +pkts[txcnt] = dpdk_copy_dp_packet_to_mbuf(dev, packet);
>  if (OVS_UNLIKELY(!pkts[txcnt])) {
>  dropped = cnt - i;
>  break;
> -- 
> 2.27.0
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

-- 
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] Prepare for post-2.17.0 (2.17.90).

2022-01-18 Thread Aaron Conole
Ilya Maximets  writes:

> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] Prepare for 2.17.0.

2022-01-18 Thread Aaron Conole
Ilya Maximets  writes:

> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1] conntrack: Fix FTP NAT when TCP but not IP offload is supported

2022-01-18 Thread Mike Pattrick
On Tue, Jan 18, 2022 at 2:24 PM Flavio Leitner  wrote:
>
> On Tue, Jan 18, 2022 at 12:27:21PM -0500, Mike Pattrick wrote:
> > On Tue, Jan 18, 2022 at 11:53 AM Flavio Leitner  wrote:
> > >
> > > On Mon, Jan 17, 2022 at 09:47:18AM -0500, Mike Pattrick wrote:
> > > > On Fri, Jan 14, 2022 at 4:13 PM Flavio Leitner  
> > > > wrote:
> > > > >
> > > > > On Fri, Jan 14, 2022 at 03:50:52PM -0500, Mike Pattrick wrote:
> > > > > > On Fri, Jan 14, 2022 at 3:33 PM Flavio Leitner  
> > > > > > wrote:
> > > > > > >
> > > > > > >
> > > > > > > Hi Mike,
> > > > > > >
> > > > > > > Thanks for working on this issue.
> > > > > > >
> > > > > > > On Fri, Jan 14, 2022 at 10:45:35AM -0500, Mike Pattrick wrote:
> > > > > > > > Formerly when userspace TSO was enabled but with a non-DKDK 
> > > > > > > > interface
> > > > > > > > without support IP checksum offloading, FTP NAT connections 
> > > > > > > > would fail
> > > > > > > > if the packet length changed. This can happen if the packets 
> > > > > > > > length
> > > > > > > > changes during L7 NAT translation, predominantly with FTP.
> > > > > > > >
> > > > > > > > Now we correct the IP header checksum if hwol is disabled or if 
> > > > > > > > DPDK
> > > > > > > > will not handle the IP checksum. This fixes the conntrack - 
> > > > > > > > IPv4 FTP
> > > > > > > > Passive with DNAT" test when run with check-system-tso.
> > > > > > > >
> > > > > > > > Reported-by: Flavio Leitner 
> > > > > > >
> > > > > > > Actually, this was initially reported by Ilya.
> > > > > > >
> > > > > > > > Signed-off-by: Mike Pattrick 
> > > > > > > > ---
> > > > > > > >  lib/conntrack.c | 3 ++-
> > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > > > > > > > index 33a1a9295..1b8a26ac2 100644
> > > > > > > > --- a/lib/conntrack.c
> > > > > > > > +++ b/lib/conntrack.c
> > > > > > > > @@ -3402,7 +3402,8 @@ handle_ftp_ctl(struct conntrack *ct, 
> > > > > > > > const struct conn_lookup_ctx *ctx,
> > > > > > > >  }
> > > > > > > >  if (seq_skew) {
> > > > > > > >  ip_len = ntohs(l3_hdr->ip_tot_len) + 
> > > > > > > > seq_skew;
> > > > > > > > -if (!dp_packet_hwol_is_ipv4(pkt)) {
> > > > > > > > +if (!dp_packet_hwol_is_ipv4(pkt) ||
> > > > > > > > +!dp_packet_ip_checksum_valid(pkt)) {
> > > > > > > >  l3_hdr->ip_csum = 
> > > > > > > > recalc_csum16(l3_hdr->ip_csum,
> > > > > > > >  
> > > > > > > > l3_hdr->ip_tot_len,
> > > > > > > >  
> > > > > > > > htons(ip_len));
> > > > > > >
> > > > > > > The problem is that the current code doesn't include IPv4 csum
> > > > > > > handling as required by the Linux software ports.
> > > > > > >
> > > > > > > The patch above resolves the unit test issue because non-DPDK
> > > > > > > interfaces will not flag the packet with good IP csum, and then
> > > > > > > the csum is updated accordingly. However, a packet coming from
> > > > > > > a physical DPDK port can have that flag set by the PMD, then if
> > > > > > > it goes through that part the IP csum is not updated, which
> > > > > > > will cause a problem if that packet is sent out over a Linux
> > > > > > > software port later.
> > > >
> > > > I was curious about the performance impact of using csum() instead of
> > > > recalc_csum16(). While setting up this benchmark, I also noticed some
> > > > ways that we could improve recalc_csum16(). For example, making the
> > > > function inline and unrolling the while loop.
> > > >
> > > > In my test, Performed 2^32 checksum updates with both the original,
> > > > and my updated recalc_csum16() function, and then 2^32 full ipv4
> > > > header checksum calculations with the csum function(). I also tested
> > > > both with and without clearing the cpu cache between calls.
> > > >
> > > > I found that the optimized recalc_csum16() was 1.5x faster then the
> > > > current implementation, and the current recalc_csum16() implementation
> > > > was only 2x faster then a full header calculation with csum().
> > > >
> > > > Given these results, and the fact that any time we update the header,
> > > > we will usually be affecting more then two bytes anyways, I think the
> > > > performance improvement from using recalc_csum16() instead of csum()
> > > > isn't so fantastic.
> > >
> > > Ok, so that means we could update the checksum there, but then I
> > > wonder if there are other places where this issue can happen.
> > > I mean, if this is an isolated case, then we can fix using full
> > > checksum in 2.17 and work on a more robust solution in master.
> > > However, if there are more places with the same issue then we
> > > may need to go with an approach as I posted earlier (implementing
> > > a generic IP csum handling before sending to the 

[ovs-dev] [PATCH 2/2] Prepare for post-2.17.0 (2.17.90).

2022-01-18 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 4 
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ba0f4282c..ff40da872 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,7 @@
+Post-v2.17.0
+-
+
+
 v2.17.0 - xx xxx 
 -
- Userspace datapath:
diff --git a/configure.ac b/configure.ac
index 4e9bcce27..298ea85ab 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 2.17.0, b...@openvswitch.org)
+AC_INIT(openvswitch, 2.17.90, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([datapath/datapath.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index 15051c937..c7a252dfe 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (2.17.90-1) unstable; urgency=low
+
+   * New upstream version
+
+ -- Open vSwitch team   Tue, 18 Jan 2022 20:14:01 +0100
+
 openvswitch (2.17.0-1) unstable; urgency=low
 
* New upstream version
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] Prepare for 2.17.0.

2022-01-18 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 Documentation/faq/releases.rst | 2 ++
 NEWS   | 2 +-
 configure.ac   | 2 +-
 debian/changelog   | 4 ++--
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 59d55202d..af524251f 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -74,6 +74,7 @@ Q: What Linux kernel versions does each Open vSwitch release 
work with?
 2.14.x   3.16 to 5.5
 2.15.x   3.16 to 5.8
 2.16.x   3.16 to 5.8
+2.17.x   3.16 to 5.8
  ==
 
 Open vSwitch userspace should also work with the Linux kernel module built
@@ -209,6 +210,7 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.14.x   19.11.10
 2.15.x   20.11.1
 2.16.x   20.11.1
+2.17.x   21.11.0
  
 
 Q: Are all the DPDK releases that OVS versions work with maintained?
diff --git a/NEWS b/NEWS
index 363cc2f83..ba0f4282c 100644
--- a/NEWS
+++ b/NEWS
@@ -1,4 +1,4 @@
-Post-v2.16.0
+v2.17.0 - xx xxx 
 -
- Userspace datapath:
  * Optimized flow lookups for datapath flows with simple match criteria.
diff --git a/configure.ac b/configure.ac
index 3e72e28bf..4e9bcce27 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 2.16.90, b...@openvswitch.org)
+AC_INIT(openvswitch, 2.17.0, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([datapath/datapath.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index deb3a5980..15051c937 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,8 +1,8 @@
-openvswitch (2.16.90-1) unstable; urgency=low
+openvswitch (2.17.0-1) unstable; urgency=low
 
* New upstream version
 
- -- Open vSwitch team   Fri, 16 Jul 2021 21:12:05 +0200
+ -- Open vSwitch team   Tue, 18 Jan 2022 20:14:00 +0100
 
 openvswitch (2.16.0-1) unstable; urgency=low
 
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 0/2] Patches to branch for 2.17.

2022-01-18 Thread Ilya Maximets


Ilya Maximets (2):
  Prepare for 2.17.0.
  Prepare for post-2.17.0 (2.17.90).

 Documentation/faq/releases.rst |  2 ++
 NEWS   |  6 +-
 configure.ac   |  2 +-
 debian/changelog   | 10 --
 4 files changed, 16 insertions(+), 4 deletions(-)

-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1] conntrack: Fix FTP NAT when TCP but not IP offload is supported

2022-01-18 Thread Flavio Leitner
On Tue, Jan 18, 2022 at 12:27:21PM -0500, Mike Pattrick wrote:
> On Tue, Jan 18, 2022 at 11:53 AM Flavio Leitner  wrote:
> >
> > On Mon, Jan 17, 2022 at 09:47:18AM -0500, Mike Pattrick wrote:
> > > On Fri, Jan 14, 2022 at 4:13 PM Flavio Leitner  wrote:
> > > >
> > > > On Fri, Jan 14, 2022 at 03:50:52PM -0500, Mike Pattrick wrote:
> > > > > On Fri, Jan 14, 2022 at 3:33 PM Flavio Leitner  
> > > > > wrote:
> > > > > >
> > > > > >
> > > > > > Hi Mike,
> > > > > >
> > > > > > Thanks for working on this issue.
> > > > > >
> > > > > > On Fri, Jan 14, 2022 at 10:45:35AM -0500, Mike Pattrick wrote:
> > > > > > > Formerly when userspace TSO was enabled but with a non-DKDK 
> > > > > > > interface
> > > > > > > without support IP checksum offloading, FTP NAT connections would 
> > > > > > > fail
> > > > > > > if the packet length changed. This can happen if the packets 
> > > > > > > length
> > > > > > > changes during L7 NAT translation, predominantly with FTP.
> > > > > > >
> > > > > > > Now we correct the IP header checksum if hwol is disabled or if 
> > > > > > > DPDK
> > > > > > > will not handle the IP checksum. This fixes the conntrack - IPv4 
> > > > > > > FTP
> > > > > > > Passive with DNAT" test when run with check-system-tso.
> > > > > > >
> > > > > > > Reported-by: Flavio Leitner 
> > > > > >
> > > > > > Actually, this was initially reported by Ilya.
> > > > > >
> > > > > > > Signed-off-by: Mike Pattrick 
> > > > > > > ---
> > > > > > >  lib/conntrack.c | 3 ++-
> > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > > > > > > index 33a1a9295..1b8a26ac2 100644
> > > > > > > --- a/lib/conntrack.c
> > > > > > > +++ b/lib/conntrack.c
> > > > > > > @@ -3402,7 +3402,8 @@ handle_ftp_ctl(struct conntrack *ct, const 
> > > > > > > struct conn_lookup_ctx *ctx,
> > > > > > >  }
> > > > > > >  if (seq_skew) {
> > > > > > >  ip_len = ntohs(l3_hdr->ip_tot_len) + 
> > > > > > > seq_skew;
> > > > > > > -if (!dp_packet_hwol_is_ipv4(pkt)) {
> > > > > > > +if (!dp_packet_hwol_is_ipv4(pkt) ||
> > > > > > > +!dp_packet_ip_checksum_valid(pkt)) {
> > > > > > >  l3_hdr->ip_csum = 
> > > > > > > recalc_csum16(l3_hdr->ip_csum,
> > > > > > >  
> > > > > > > l3_hdr->ip_tot_len,
> > > > > > >  
> > > > > > > htons(ip_len));
> > > > > >
> > > > > > The problem is that the current code doesn't include IPv4 csum
> > > > > > handling as required by the Linux software ports.
> > > > > >
> > > > > > The patch above resolves the unit test issue because non-DPDK
> > > > > > interfaces will not flag the packet with good IP csum, and then
> > > > > > the csum is updated accordingly. However, a packet coming from
> > > > > > a physical DPDK port can have that flag set by the PMD, then if
> > > > > > it goes through that part the IP csum is not updated, which
> > > > > > will cause a problem if that packet is sent out over a Linux
> > > > > > software port later.
> > >
> > > I was curious about the performance impact of using csum() instead of
> > > recalc_csum16(). While setting up this benchmark, I also noticed some
> > > ways that we could improve recalc_csum16(). For example, making the
> > > function inline and unrolling the while loop.
> > >
> > > In my test, Performed 2^32 checksum updates with both the original,
> > > and my updated recalc_csum16() function, and then 2^32 full ipv4
> > > header checksum calculations with the csum function(). I also tested
> > > both with and without clearing the cpu cache between calls.
> > >
> > > I found that the optimized recalc_csum16() was 1.5x faster then the
> > > current implementation, and the current recalc_csum16() implementation
> > > was only 2x faster then a full header calculation with csum().
> > >
> > > Given these results, and the fact that any time we update the header,
> > > we will usually be affecting more then two bytes anyways, I think the
> > > performance improvement from using recalc_csum16() instead of csum()
> > > isn't so fantastic.
> >
> > Ok, so that means we could update the checksum there, but then I
> > wonder if there are other places where this issue can happen.
> > I mean, if this is an isolated case, then we can fix using full
> > checksum in 2.17 and work on a more robust solution in master.
> > However, if there are more places with the same issue then we
> > may need to go with an approach as I posted earlier (implementing
> > a generic IP csum handling before sending to the port).
> >
> > What do you think?
> 
> Given that Ilya indicated that the 2.17 release should be tagged
> today, I think it's reasonable to work towards getting your full
> solution into 2.18.
> 
> In the case of all NAT, right now we are recalculating the 

Re: [ovs-dev] [PATCH v1] conntrack: Fix FTP NAT when TCP but not IP offload is supported

2022-01-18 Thread Mike Pattrick
On Tue, Jan 18, 2022 at 11:53 AM Flavio Leitner  wrote:
>
> On Mon, Jan 17, 2022 at 09:47:18AM -0500, Mike Pattrick wrote:
> > On Fri, Jan 14, 2022 at 4:13 PM Flavio Leitner  wrote:
> > >
> > > On Fri, Jan 14, 2022 at 03:50:52PM -0500, Mike Pattrick wrote:
> > > > On Fri, Jan 14, 2022 at 3:33 PM Flavio Leitner  
> > > > wrote:
> > > > >
> > > > >
> > > > > Hi Mike,
> > > > >
> > > > > Thanks for working on this issue.
> > > > >
> > > > > On Fri, Jan 14, 2022 at 10:45:35AM -0500, Mike Pattrick wrote:
> > > > > > Formerly when userspace TSO was enabled but with a non-DKDK 
> > > > > > interface
> > > > > > without support IP checksum offloading, FTP NAT connections would 
> > > > > > fail
> > > > > > if the packet length changed. This can happen if the packets length
> > > > > > changes during L7 NAT translation, predominantly with FTP.
> > > > > >
> > > > > > Now we correct the IP header checksum if hwol is disabled or if DPDK
> > > > > > will not handle the IP checksum. This fixes the conntrack - IPv4 FTP
> > > > > > Passive with DNAT" test when run with check-system-tso.
> > > > > >
> > > > > > Reported-by: Flavio Leitner 
> > > > >
> > > > > Actually, this was initially reported by Ilya.
> > > > >
> > > > > > Signed-off-by: Mike Pattrick 
> > > > > > ---
> > > > > >  lib/conntrack.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > > > > > index 33a1a9295..1b8a26ac2 100644
> > > > > > --- a/lib/conntrack.c
> > > > > > +++ b/lib/conntrack.c
> > > > > > @@ -3402,7 +3402,8 @@ handle_ftp_ctl(struct conntrack *ct, const 
> > > > > > struct conn_lookup_ctx *ctx,
> > > > > >  }
> > > > > >  if (seq_skew) {
> > > > > >  ip_len = ntohs(l3_hdr->ip_tot_len) + seq_skew;
> > > > > > -if (!dp_packet_hwol_is_ipv4(pkt)) {
> > > > > > +if (!dp_packet_hwol_is_ipv4(pkt) ||
> > > > > > +!dp_packet_ip_checksum_valid(pkt)) {
> > > > > >  l3_hdr->ip_csum = 
> > > > > > recalc_csum16(l3_hdr->ip_csum,
> > > > > >  
> > > > > > l3_hdr->ip_tot_len,
> > > > > >  
> > > > > > htons(ip_len));
> > > > >
> > > > > The problem is that the current code doesn't include IPv4 csum
> > > > > handling as required by the Linux software ports.
> > > > >
> > > > > The patch above resolves the unit test issue because non-DPDK
> > > > > interfaces will not flag the packet with good IP csum, and then
> > > > > the csum is updated accordingly. However, a packet coming from
> > > > > a physical DPDK port can have that flag set by the PMD, then if
> > > > > it goes through that part the IP csum is not updated, which
> > > > > will cause a problem if that packet is sent out over a Linux
> > > > > software port later.
> >
> > I was curious about the performance impact of using csum() instead of
> > recalc_csum16(). While setting up this benchmark, I also noticed some
> > ways that we could improve recalc_csum16(). For example, making the
> > function inline and unrolling the while loop.
> >
> > In my test, Performed 2^32 checksum updates with both the original,
> > and my updated recalc_csum16() function, and then 2^32 full ipv4
> > header checksum calculations with the csum function(). I also tested
> > both with and without clearing the cpu cache between calls.
> >
> > I found that the optimized recalc_csum16() was 1.5x faster then the
> > current implementation, and the current recalc_csum16() implementation
> > was only 2x faster then a full header calculation with csum().
> >
> > Given these results, and the fact that any time we update the header,
> > we will usually be affecting more then two bytes anyways, I think the
> > performance improvement from using recalc_csum16() instead of csum()
> > isn't so fantastic.
>
> Ok, so that means we could update the checksum there, but then I
> wonder if there are other places where this issue can happen.
> I mean, if this is an isolated case, then we can fix using full
> checksum in 2.17 and work on a more robust solution in master.
> However, if there are more places with the same issue then we
> may need to go with an approach as I posted earlier (implementing
> a generic IP csum handling before sending to the port).
>
> What do you think?

Given that Ilya indicated that the 2.17 release should be tagged
today, I think it's reasonable to work towards getting your full
solution into 2.18.

In the case of all NAT, right now we are recalculating the checksum
multiple times in packet_set_ipv4_addr alone. Only your patch series
could address this.

Were you planning on making any changes and reposting? Or do you want
comments in the already posted series?


-M

> Thanks,
> fbl
>
> >
> > -M
> >
> > > >
> > > > This is a good point, I was trying to get to a happy medium 

Re: [ovs-dev] [PATCH v1] conntrack: Fix FTP NAT when TCP but not IP offload is supported

2022-01-18 Thread Flavio Leitner
On Mon, Jan 17, 2022 at 09:47:18AM -0500, Mike Pattrick wrote:
> On Fri, Jan 14, 2022 at 4:13 PM Flavio Leitner  wrote:
> >
> > On Fri, Jan 14, 2022 at 03:50:52PM -0500, Mike Pattrick wrote:
> > > On Fri, Jan 14, 2022 at 3:33 PM Flavio Leitner  wrote:
> > > >
> > > >
> > > > Hi Mike,
> > > >
> > > > Thanks for working on this issue.
> > > >
> > > > On Fri, Jan 14, 2022 at 10:45:35AM -0500, Mike Pattrick wrote:
> > > > > Formerly when userspace TSO was enabled but with a non-DKDK interface
> > > > > without support IP checksum offloading, FTP NAT connections would fail
> > > > > if the packet length changed. This can happen if the packets length
> > > > > changes during L7 NAT translation, predominantly with FTP.
> > > > >
> > > > > Now we correct the IP header checksum if hwol is disabled or if DPDK
> > > > > will not handle the IP checksum. This fixes the conntrack - IPv4 FTP
> > > > > Passive with DNAT" test when run with check-system-tso.
> > > > >
> > > > > Reported-by: Flavio Leitner 
> > > >
> > > > Actually, this was initially reported by Ilya.
> > > >
> > > > > Signed-off-by: Mike Pattrick 
> > > > > ---
> > > > >  lib/conntrack.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > > > > index 33a1a9295..1b8a26ac2 100644
> > > > > --- a/lib/conntrack.c
> > > > > +++ b/lib/conntrack.c
> > > > > @@ -3402,7 +3402,8 @@ handle_ftp_ctl(struct conntrack *ct, const 
> > > > > struct conn_lookup_ctx *ctx,
> > > > >  }
> > > > >  if (seq_skew) {
> > > > >  ip_len = ntohs(l3_hdr->ip_tot_len) + seq_skew;
> > > > > -if (!dp_packet_hwol_is_ipv4(pkt)) {
> > > > > +if (!dp_packet_hwol_is_ipv4(pkt) ||
> > > > > +!dp_packet_ip_checksum_valid(pkt)) {
> > > > >  l3_hdr->ip_csum = 
> > > > > recalc_csum16(l3_hdr->ip_csum,
> > > > >  
> > > > > l3_hdr->ip_tot_len,
> > > > >  
> > > > > htons(ip_len));
> > > >
> > > > The problem is that the current code doesn't include IPv4 csum
> > > > handling as required by the Linux software ports.
> > > >
> > > > The patch above resolves the unit test issue because non-DPDK
> > > > interfaces will not flag the packet with good IP csum, and then
> > > > the csum is updated accordingly. However, a packet coming from
> > > > a physical DPDK port can have that flag set by the PMD, then if
> > > > it goes through that part the IP csum is not updated, which
> > > > will cause a problem if that packet is sent out over a Linux
> > > > software port later.
> 
> I was curious about the performance impact of using csum() instead of
> recalc_csum16(). While setting up this benchmark, I also noticed some
> ways that we could improve recalc_csum16(). For example, making the
> function inline and unrolling the while loop.
> 
> In my test, Performed 2^32 checksum updates with both the original,
> and my updated recalc_csum16() function, and then 2^32 full ipv4
> header checksum calculations with the csum function(). I also tested
> both with and without clearing the cpu cache between calls.
> 
> I found that the optimized recalc_csum16() was 1.5x faster then the
> current implementation, and the current recalc_csum16() implementation
> was only 2x faster then a full header calculation with csum().
> 
> Given these results, and the fact that any time we update the header,
> we will usually be affecting more then two bytes anyways, I think the
> performance improvement from using recalc_csum16() instead of csum()
> isn't so fantastic.

Ok, so that means we could update the checksum there, but then I
wonder if there are other places where this issue can happen.
I mean, if this is an isolated case, then we can fix using full
checksum in 2.17 and work on a more robust solution in master.
However, if there are more places with the same issue then we
may need to go with an approach as I posted earlier (implementing
a generic IP csum handling before sending to the port).

What do you think?

Thanks,
fbl

> 
> -M
> 
> > >
> > > This is a good point, I was trying to get to a happy medium without
> > > repurposing the patchset that you had previously submitted. That way
> > > this patch could be available immediately, and your more thorough TSO
> > > patchset could be applied after.
> >
> > I see what you did, and I appreciate that. My concern is that it's
> > not unusual to have packets moving between dpdk and linux ports, so
> > we might have to visit this issue again, though the unit test is not
> > failing.
> >
> > > I had also prepared a larger patch as part of this work, but scrapped
> > > it because it was duplicating a lot of the work you had previously
> > > done. But if you prefer the more substantial solution, we can scrap
> > > this patch and I can submit the 

Re: [ovs-dev] [ovs-dev v4] ofproto: add refcount to ofproto to fix ofproto use-after-free

2022-01-18 Thread 0-day Robot
Bleep bloop.  Greetings Peng He, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author Peng He  needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Peng He , guohongzhi 

Lines checked: 325, Warnings: 1, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [ovs-dev v4] ofproto: add refcount to ofproto to fix ofproto use-after-free

2022-01-18 Thread Peng He
From hepeng:
https://patchwork.ozlabs.org/project/openvswitch/patch/20200717015041.82746-1-hepeng.0...@bytedance.com/#2487473

also from guohongzhi :
http://patchwork.ozlabs.org/project/openvswitch/patch/20200306130555.19884-1-guohongz...@huawei.com/

also from a discussion about the mixing use of RCU and refcount in the mail
list with Ilya Maximets, William Tu, Ben Pfaf, and Gaëtan Rivet.

A summary, as quoted from Ilya:

"
RCU for ofproto was introduced for one
and only one reason - to avoid freeing ofproto while rules are still
alive.  This was done in commit f416c8d61601 ("ofproto: RCU postpone
rule destruction.").  The goal was to allow using rules without
refcounting them within a single grace period.  And that forced us
to postpone destruction of the ofproto for a single grace period.
Later commit 39c9459355b6 ("Use classifier versioning.") made it
possible for rules to be alive for more than one grace period, so
the commit made ofproto wait for 2 grace periods by double postponing.
As we can see now, that wasn't enough and we have to wait for more
than 2 grace periods in certain cases.
"

In a short, the ofproto should have a longer life time than rule, if
the rule lasts for more than 2 grace periods, the ofproto should live
longer to ensure rule->ofproto is valid. It's hard to predict how long
a ofproto should live, thus we need to use refcount on ofproto to make
things easy. The controversial part is that we have already used RCU postpone
to delay ofproto destrution, if we have to add refcount, is it simpler to
use just refcount without RCU postpone?

IMO, I think going back to the pure refcount solution is more
complicated than mixing using both.

Gaëtan Rive asks some questions on guohongzhi's v2 patch:

during ofproto_rule_create, should we use ofproto_ref
or ofproto_try_ref? how can we make sure the ofproto is alive?

By using RCU, ofproto has three states:

state 1: alive, with refcount >= 1
state 2: dying, with refcount == 0, however pointer is valid
state 3: died, memory freed, pointer might be dangling.

Without using RCU, there is no state 2, thus, we have to be very careful
every time we see a ofproto pointer. In contrast, with RCU, we can be sure
that it's alive at least in this grace peroid, so we can just check if
it is dying by ofproto_try_ref.

This shows that by mixing use of RCU and refcount we can save a lot of work
worrying if ofproto is dangling.

In short, the RCU part makes sure the ofproto is alive when we use it,
and the refcount part makes sure it lives longer enough.

Also regarding a new patch filed recently, people are now making use
of RCU to protect ofproto:

https://patchwork.ozlabs.org/project/openvswitch/patch/1638530715-44436-1-git-send-email-wangyunj...@huawei.com/

In this patch, I have merged guohongzhi's patch and mine, and fixes
accoring to the previous comments.

Signed-off-by: Peng He 
Signed-off-by: guohongzhi 
---
 ofproto/ofproto-dpif-xlate-cache.c |  2 +
 ofproto/ofproto-dpif-xlate.c   | 14 ---
 ofproto/ofproto-dpif.c | 24 +++-
 ofproto/ofproto-provider.h |  2 +
 ofproto/ofproto.c  | 62 +++---
 ofproto/ofproto.h  |  4 ++
 6 files changed, 87 insertions(+), 21 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate-cache.c 
b/ofproto/ofproto-dpif-xlate-cache.c
index dcc91cb38..9224ee2e6 100644
--- a/ofproto/ofproto-dpif-xlate-cache.c
+++ b/ofproto/ofproto-dpif-xlate-cache.c
@@ -209,6 +209,7 @@ xlate_cache_clear_entry(struct xc_entry *entry)
 {
 switch (entry->type) {
 case XC_TABLE:
+ofproto_unref(&(entry->table.ofproto->up));
 break;
 case XC_RULE:
 ofproto_rule_unref(>rule->up);
@@ -231,6 +232,7 @@ xlate_cache_clear_entry(struct xc_entry *entry)
 free(entry->learn.ofm);
 break;
 case XC_NORMAL:
+ofproto_unref(&(entry->normal.ofproto->up));
 break;
 case XC_FIN_TIMEOUT:
 /* 'u.fin.rule' is always already held as a XC_RULE, which
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 6fb59e170..129cdf714 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3024,12 +3024,14 @@ xlate_normal(struct xlate_ctx *ctx)
 struct xc_entry *entry;
 
 /* Save just enough info to update mac learning table later. */
-entry = xlate_cache_add_entry(ctx->xin->xcache, XC_NORMAL);
-entry->normal.ofproto = ctx->xbridge->ofproto;
-entry->normal.in_port = flow->in_port.ofp_port;
-entry->normal.dl_src = flow->dl_src;
-entry->normal.vlan = vlan;
-entry->normal.is_gratuitous_arp = is_grat_arp;
+if (ofproto_try_ref(>xbridge->ofproto->up)) {
+entry = xlate_cache_add_entry(ctx->xin->xcache, XC_NORMAL);
+entry->normal.ofproto = ctx->xbridge->ofproto;
+entry->normal.in_port = flow->in_port.ofp_port;
+entry->normal.dl_src = flow->dl_src;
+  

Re: [ovs-dev] [ovs-dev v3] ofproto: add refcount to ofproto to fix ofproto use-after-free

2022-01-18 Thread 贺鹏
Ah,
found a bug, will submit a v4.

Peng He  于2022年1月18日周二 21:08写道:

> From hepeng:
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/20200717015041.82746-1-hepeng.0...@bytedance.com/#2487473
>
> also from guohongzhi :
>
> http://patchwork.ozlabs.org/project/openvswitch/patch/20200306130555.19884-1-guohongz...@huawei.com/
>
> also from a discussion about the mixing use of RCU and refcount in the mail
> list with Ilya Maximets, William Tu, Ben Pfaf, and Gaëtan Rivet.
>
> A summary, as quoted from Ilya:
>
> "
> RCU for ofproto was introduced for one
> and only one reason - to avoid freeing ofproto while rules are still
> alive.  This was done in commit f416c8d61601 ("ofproto: RCU postpone
> rule destruction.").  The goal was to allow using rules without
> refcounting them within a single grace period.  And that forced us
> to postpone destruction of the ofproto for a single grace period.
> Later commit 39c9459355b6 ("Use classifier versioning.") made it
> possible for rules to be alive for more than one grace period, so
> the commit made ofproto wait for 2 grace periods by double postponing.
> As we can see now, that wasn't enough and we have to wait for more
> than 2 grace periods in certain cases.
> "
>
> In a short, the ofproto should have a longer life time than rule, if
> the rule lasts for more than 2 grace periods, the ofproto should live
> longer to ensure rule->ofproto is valid. It's hard to predict how long
> a ofproto should live, thus we need to use refcount on ofproto to make
> things easy. The controversial part is that we have already used RCU
> postpone
> to delay ofproto destrution, if we have to add refcount, is it simpler to
> use just refcount without RCU postpone?
>
> IMO, I think going back to the pure refcount solution is more
> complicated than mixing using both.
>
> Gaëtan Rive asks some questions on guohongzhi's v2 patch:
>
> during ofproto_rule_create, should we use ofproto_ref
> or ofproto_try_ref? how can we make sure the ofproto is alive?
>
> by using RCU, I think the answer is that, when you see a pointer to
> ofproto, it's alive at least in this grace peroid, so it is ok to use
> ofproto_ref instead of ofproto_try_ref. This shows that by mixing use
> of RCU and refcount we can save a lot of work to query if ofproto is
> alive.
>
> In short, the RCU part makes sure the ofproto is alive when we use it,
> and the refcount part makes sure it lives longer enough.
>
> Also regarding a new patch filed recently, people are now making use
> of RCU to protect ofproto:
>
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/1638530715-44436-1-git-send-email-wangyunj...@huawei.com/
>
> In this patch, I have merged guohongzhi's patch and mine, and fixes
> accoring to the previous comments.
>
> Signed-off-by: Peng He 
> Signed-off-by: guohongzhi 
> ---
>  ofproto/ofproto-dpif-xlate-cache.c |  1 +
>  ofproto/ofproto-dpif-xlate.c   |  1 +
>  ofproto/ofproto-dpif.c |  4 +--
>  ofproto/ofproto-provider.h |  2 ++
>  ofproto/ofproto.c  | 45 ++
>  ofproto/ofproto.h  |  3 ++
>  6 files changed, 49 insertions(+), 7 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate-cache.c
> b/ofproto/ofproto-dpif-xlate-cache.c
> index dcc91cb38..0deee365d 100644
> --- a/ofproto/ofproto-dpif-xlate-cache.c
> +++ b/ofproto/ofproto-dpif-xlate-cache.c
> @@ -209,6 +209,7 @@ xlate_cache_clear_entry(struct xc_entry *entry)
>  {
>  switch (entry->type) {
>  case XC_TABLE:
> +ofproto_unref(&(entry->table.ofproto->up));
>  break;
>  case XC_RULE:
>  ofproto_rule_unref(>rule->up);
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 6fb59e170..0380928a9 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3025,6 +3025,7 @@ xlate_normal(struct xlate_ctx *ctx)
>
>  /* Save just enough info to update mac learning table later. */
>  entry = xlate_cache_add_entry(ctx->xin->xcache, XC_NORMAL);
> +ofproto_ref(>xbridge->ofproto->up);
>  entry->normal.ofproto = ctx->xbridge->ofproto;
>  entry->normal.in_port = flow->in_port.ofp_port;
>  entry->normal.dl_src = flow->dl_src;
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 8143dd965..6816896dd 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -4472,8 +4472,8 @@ rule_dpif_lookup_from_table(struct ofproto_dpif
> *ofproto,
>  }
>  if (xcache) {
>  struct xc_entry *entry;
> -
>  entry = xlate_cache_add_entry(xcache, XC_TABLE);
> +ofproto_ref(>up);
>  entry->table.ofproto = ofproto;
>  entry->table.id = *table_id;
>  entry->table.match = true;
> @@ -4508,8 +4508,8 @@ rule_dpif_lookup_from_table(struct ofproto_dpif
> *ofproto,
>  }
>  if (xcache) {
>  

Re: [ovs-dev] [PATCH] system-tso: Skip encap tests when userspace TSO is enabled.

2022-01-18 Thread David Marchand
On Wed, Jan 5, 2022 at 3:58 PM Flavio Leitner  wrote:
>
> It seems Linux native tunnel configuration changed to enable
> checksum by default and that causes the check-system-tso unit
> test below to fail:
>  10: datapath - ping over vxlan tunnelFAILED (system-traffic.at:248)
>
> That happens because userspace TSO doesn't support encapsulation
> as mentioned in the current documentation. In this specific case,
> udp_extract_tnl_md() checks if the checksum is correct, but since
> TSO is enabled, the outer UDP header contains only the pseudo
> checksum and not the full packet checksum.
>
> Although the packet is marked correctly with UDP csum offload flag
> and the code could use that to verify the pseudo csum, more work
> is needed to properly translate the offloading flags from the outer
> headers to the inner headers.  For example, if the payload is a
> TCP packet, most probably the flag DP_PACKET_OL_TX_UDP_CKSUM doesn't
> make sense after decapsulating that.
>
> This patch skips the tunnel tests when the userspace TSO is enabled.

> Fixes: 29bb3093eb8b ("userspace: Enable TSO support for non-DPDK.")
> Signed-off-by: Flavio Leitner 

Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/3] python: idl: Resend requested but not acked conditions when reconnecting.

2022-01-18 Thread Terry Wilson
On Tue, Jan 11, 2022 at 10:37 AM Dumitru Ceara  wrote:
>
> When reconnecting forget about in-flight monitor condition changes
> if the user requested a newer condition already.
>
> This matches the C implementation, in ovsdb_cs_db_sync_condition().
>
> Fixes: 46d44cf3be0d ("python: idl: Add monitor_cond_since support.")
> Signed-off-by: Dumitru Ceara 
> ---
> Note: I spotted this while reading the code when investigating a
> different issue and I didn't manage to find a test case that would
> reliably replicate it.  Therefore I didn't add a test case.
> ---
>  python/ovs/db/idl.py |6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py
> index 02fc4c7368fa..035191fc56b9 100644
> --- a/python/ovs/db/idl.py
> +++ b/python/ovs/db/idl.py
> @@ -132,8 +132,10 @@ class ConditionState(object):
>
>  def reset(self):
>  """Reset a requested condition change back to new"""
> -if self._req_cond is not None and self._new_cond is None:
> -self._new_cond, self._req_cond = (self._req_cond, None)
> +if self._req_cond is not None:
> +if self._new_cond is None:
> +self._new_cond = self._req_cond
> +self._req_cond = None
>
>
>  class Idl(object):
>

Acked-By: Terry Wilson 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Fix packet drops with decap(packet_type(ns=1, type=0x8848).

2022-01-18 Thread Martin Varghese
On Tue, Jan 18, 2022 at 01:29:12PM +0100, Ilya Maximets wrote:
> On 1/18/22 13:10, Martin Varghese wrote:
> > From: Martin Varghese 
> > 
> > Added PT_MPLS_MC support in function xlate_generic_decap_action to fix 
> > packet
> > drops when decap(packet_type(ns=1,type=0x8848) action is applied.
> > 
> > Fixes: 1917ace89364("Encap & Decap actions for MPLS packet type.")
> > Signed-off-by: Martin Varghese 
> > ---
> >  ofproto/ofproto-dpif-xlate.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index 6fb59e170..9a8e8e777 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -6707,7 +6707,8 @@ xlate_generic_decap_action(struct xlate_ctx *ctx,
> >  ctx->pending_decap = true;
> >  /* Trigger recirculation. */
> >  return true;
> > -case PT_MPLS: {
> > +case PT_MPLS:
> > +case PT_MPLS_MC: {
> >  int n;
> >  ovs_be16 ethertype;
> >  
> > 
> 
> Oh.  Seems like another 'case' got lost in the process.
> 
> The change seems correct, but could you, please, add a unit test,
> so we can have this path covered?
>
Sure. I will send that out soon.
> Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto: fix use-after-free for "ofproto".

2022-01-18 Thread wangyunjian via dev
http://patchwork.ozlabs.org/project/openvswitch/patch/20200306130555.19884-1-guohongz...@huawei.com/
This patch has been applied. But this patch do not fix the problem I stated.
I think this problem is that hmap does not support concurrent access.

Thanks,
Yunjian

From: 贺鹏 [mailto:xnhp0...@gmail.com]
Sent: Tuesday, January 18, 2022 3:23 PM
To: wangyunjian 
Cc: d...@openvswitch.org; i.maxim...@ovn.org; dingxiaoxiong 

Subject: Re: [ovs-dev] [PATCH] ofproto: fix use-after-free for "ofproto".

Hi, yunjian and Ilya,

this patch reminds me of the discussion we have in March last year, and during 
the discussion, you have spotted this
thread-safety issue of uuid map. Unfortunately, in that email, you did not 
reply to the mailist, so I cannot give a link to
the email. I attach it as a reference.

I quote it here:

"
That is an interesting example here...
I can't help but notice that this function is typically called
from different handler or pmd threads and modified by the main thread
while upcalls enabled.  And hmap is not a thread-safe structure.
I guess, we have another possible problem here.  We need to protect
at least this hmap and the other one with rw lock or something...
Am I right or am I missing something?  What else we need to protect?
"

And this patch fixes exactly the problem you stated.



wangyunjian via dev mailto:ovs-dev@openvswitch.org>> 
于2022年1月8日周六 17:18写道:
Friendly ping.

> -Original Message-
> From: wangyunjian
> Sent: Friday, December 3, 2021 7:25 PM
> To: d...@openvswitch.org; 
> i.maxim...@ovn.org
> Cc: dingxiaoxiong 
> mailto:dingxiaoxi...@huawei.com>>; xudingke
> mailto:xudin...@huawei.com>>; wangyunjian 
> mailto:wangyunj...@huawei.com>>; Justin
> Pettit mailto:jpet...@ovn.org>>
> Subject: [ovs-dev] [PATCH] ofproto: fix use-after-free for "ofproto".
>
> When handler threads lookup a "ofproto" and use it, main thread maybe
> remove and free the "ofproto" at the same time. The "ofproto" has not
> been protected well, which can lead to an OVS crash.
>
> This patch fixes this by making the "ofproto" lookup RCU-safe by using
> cmap instead of hmap and moving remove "ofproto" call before
> xlate_txn_commit().
>
> (gdb) bt
>   hmap_next (hmap.h:398)
>   seq_wake_waiters (seq.c:326)
>   seq_change_protected (seq.c:134)
>   seq_change (seq.c:144)
>   ofproto_dpif_send_async_msg (ofproto_dpif.c:263)
>   process_upcall (ofproto_dpif_upcall.c:1782)
>   recv_upcalls (ofproto_dpif_upcall.c:1026)
>   udpif_upcall_handler (ofproto/ofproto_dpif_upcall.c:945)
>   ovsthread_wrapper (ovs_thread.c:734)
>
> Cc: Justin Pettit mailto:jpet...@ovn.org>>
> Fixes: fcb9579be3c7 ("ofproto: Add 'ofproto_uuid' and 'ofp_in_port' to user
> action cookie.")
> Signed-off-by: Yunjian Wang 
> mailto:wangyunj...@huawei.com>>
> ---
>  ofproto/ofproto-dpif.c | 12 ++--
>  ofproto/ofproto-dpif.h |  2 +-
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index cba49a99e..aa8d32f81 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -216,8 +216,7 @@ static struct hmap all_ofproto_dpifs_by_name =
>
> HMAP_INITIALIZER(_ofproto_dpifs_by_name);
>
>  /* All existing ofproto_dpif instances, indexed by ->uuid. */
> -static struct hmap all_ofproto_dpifs_by_uuid =
> -
> HMAP_INITIALIZER(_ofproto_dpifs_by_uuid);
> +static struct cmap all_ofproto_dpifs_by_uuid = CMAP_INITIALIZER;
>
>  static bool ofproto_use_tnl_push_pop = true;
>  static void ofproto_unixctl_init(void);
> @@ -1682,7 +1681,7 @@ construct(struct ofproto *ofproto_)
>  hmap_insert(_ofproto_dpifs_by_name,
>  >all_ofproto_dpifs_by_name_node,
>  hash_string(ofproto->up.name, 0));
> -hmap_insert(_ofproto_dpifs_by_uuid,
> +cmap_insert(_ofproto_dpifs_by_uuid,
>  >all_ofproto_dpifs_by_uuid_node,
>  uuid_hash(>uuid));
>  memset(>stats, 0, sizeof ofproto->stats);
> @@ -1778,12 +1777,13 @@ destruct(struct ofproto *ofproto_, bool del)
>  ofproto->backer->need_revalidate = REV_RECONFIGURE;
>  xlate_txn_start();
>  xlate_remove_ofproto(ofproto);
> +cmap_remove(_ofproto_dpifs_by_uuid,
> +>all_ofproto_dpifs_by_uuid_node,
> +uuid_hash(>uuid));

I guess some comments are needed here, you actually make use of
the rcu_synchronize in the xlate_txn_commit to avoid access to the
ofproto from other thread through uuid map.

>  xlate_txn_commit();
>
>  hmap_remove(_ofproto_dpifs_by_name,
>  >all_ofproto_dpifs_by_name_node);
> -hmap_remove(_ofproto_dpifs_by_uuid,
> ->all_ofproto_dpifs_by_uuid_node);
>
>  OFPROTO_FOR_EACH_TABLE (table, >up) {
>  CLS_FOR_EACH (rule, up.cr, >cls) {
> @@ -5781,7 +5781,7 @@ ofproto_dpif_lookup_by_uuid(const struct uuid
> *uuid)
>  {
>  struct ofproto_dpif *ofproto;
>
> -

Re: [ovs-dev] [ovs-dev v3] ofproto: add refcount to ofproto to fix ofproto use-after-free

2022-01-18 Thread 0-day Robot
Bleep bloop.  Greetings Peng He, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author Peng He  needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Peng He , guohongzhi 

Lines checked: 258, Warnings: 1, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [ovs-dev v3] ofproto: add refcount to ofproto to fix ofproto use-after-free

2022-01-18 Thread Peng He
From hepeng:
https://patchwork.ozlabs.org/project/openvswitch/patch/20200717015041.82746-1-hepeng.0...@bytedance.com/#2487473

also from guohongzhi :
http://patchwork.ozlabs.org/project/openvswitch/patch/20200306130555.19884-1-guohongz...@huawei.com/

also from a discussion about the mixing use of RCU and refcount in the mail
list with Ilya Maximets, William Tu, Ben Pfaf, and Gaëtan Rivet.

A summary, as quoted from Ilya:

"
RCU for ofproto was introduced for one
and only one reason - to avoid freeing ofproto while rules are still
alive.  This was done in commit f416c8d61601 ("ofproto: RCU postpone
rule destruction.").  The goal was to allow using rules without
refcounting them within a single grace period.  And that forced us
to postpone destruction of the ofproto for a single grace period.
Later commit 39c9459355b6 ("Use classifier versioning.") made it
possible for rules to be alive for more than one grace period, so
the commit made ofproto wait for 2 grace periods by double postponing.
As we can see now, that wasn't enough and we have to wait for more
than 2 grace periods in certain cases.
"

In a short, the ofproto should have a longer life time than rule, if
the rule lasts for more than 2 grace periods, the ofproto should live
longer to ensure rule->ofproto is valid. It's hard to predict how long
a ofproto should live, thus we need to use refcount on ofproto to make
things easy. The controversial part is that we have already used RCU postpone
to delay ofproto destrution, if we have to add refcount, is it simpler to
use just refcount without RCU postpone?

IMO, I think going back to the pure refcount solution is more
complicated than mixing using both.

Gaëtan Rive asks some questions on guohongzhi's v2 patch:

during ofproto_rule_create, should we use ofproto_ref
or ofproto_try_ref? how can we make sure the ofproto is alive?

by using RCU, I think the answer is that, when you see a pointer to
ofproto, it's alive at least in this grace peroid, so it is ok to use
ofproto_ref instead of ofproto_try_ref. This shows that by mixing use
of RCU and refcount we can save a lot of work to query if ofproto is
alive.

In short, the RCU part makes sure the ofproto is alive when we use it,
and the refcount part makes sure it lives longer enough.

Also regarding a new patch filed recently, people are now making use
of RCU to protect ofproto:

https://patchwork.ozlabs.org/project/openvswitch/patch/1638530715-44436-1-git-send-email-wangyunj...@huawei.com/

In this patch, I have merged guohongzhi's patch and mine, and fixes
accoring to the previous comments.

Signed-off-by: Peng He 
Signed-off-by: guohongzhi 
---
 ofproto/ofproto-dpif-xlate-cache.c |  1 +
 ofproto/ofproto-dpif-xlate.c   |  1 +
 ofproto/ofproto-dpif.c |  4 +--
 ofproto/ofproto-provider.h |  2 ++
 ofproto/ofproto.c  | 45 ++
 ofproto/ofproto.h  |  3 ++
 6 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate-cache.c 
b/ofproto/ofproto-dpif-xlate-cache.c
index dcc91cb38..0deee365d 100644
--- a/ofproto/ofproto-dpif-xlate-cache.c
+++ b/ofproto/ofproto-dpif-xlate-cache.c
@@ -209,6 +209,7 @@ xlate_cache_clear_entry(struct xc_entry *entry)
 {
 switch (entry->type) {
 case XC_TABLE:
+ofproto_unref(&(entry->table.ofproto->up));
 break;
 case XC_RULE:
 ofproto_rule_unref(>rule->up);
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 6fb59e170..0380928a9 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3025,6 +3025,7 @@ xlate_normal(struct xlate_ctx *ctx)
 
 /* Save just enough info to update mac learning table later. */
 entry = xlate_cache_add_entry(ctx->xin->xcache, XC_NORMAL);
+ofproto_ref(>xbridge->ofproto->up);
 entry->normal.ofproto = ctx->xbridge->ofproto;
 entry->normal.in_port = flow->in_port.ofp_port;
 entry->normal.dl_src = flow->dl_src;
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 8143dd965..6816896dd 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4472,8 +4472,8 @@ rule_dpif_lookup_from_table(struct ofproto_dpif *ofproto,
 }
 if (xcache) {
 struct xc_entry *entry;
-
 entry = xlate_cache_add_entry(xcache, XC_TABLE);
+ofproto_ref(>up);
 entry->table.ofproto = ofproto;
 entry->table.id = *table_id;
 entry->table.match = true;
@@ -4508,8 +4508,8 @@ rule_dpif_lookup_from_table(struct ofproto_dpif *ofproto,
 }
 if (xcache) {
 struct xc_entry *entry;
-
 entry = xlate_cache_add_entry(xcache, XC_TABLE);
+ofproto_ref(>up);
 entry->table.ofproto = ofproto;
 entry->table.id = next_id;
 entry->table.match = (rule != NULL);
diff --git 

Re: [ovs-dev] [PATCH v3 0/5] PMD docs and ALB status.

2022-01-18 Thread Ilya Maximets
On 12/20/21 15:38, Kevin Traynor wrote:
> The first patches are small fixes and additions to the PMD documentation
> as was I reading it.
> 
> The last patch removes the experimental tag from PMD Auto Load Balance.
> 
> v3:
> Fixed checkpatch warning.
> Added Sunil's Acks.
> 
> v2:
> Fixes as per Sunil's comments.
> 
> GHA: https://github.com/kevintraynor/ovs/actions/runs/1602426498
> 
> Kevin Traynor (5):
>   Documentation: Fix Rx/Tx queue configuration section.
>   Documentation: Minor spelling and grammar fixes.
>   Documentation: Update PMD thread statistics.
>   Documentation: Update PMD Auto Load Balance section.
>   Documentation: Remove experimental tag for PMD ALB.
> 
>  Documentation/topics/dpdk/pmd.rst | 128 --
>  NEWS  |   2 +
>  2 files changed, 72 insertions(+), 58 deletions(-)
> 

Thanks, Kevin, Sunil and David!  Applied.

The first patch also backported down to 2.13, as this part of a
documentation is very outdated and incorrect.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Fix packet drops with decap(packet_type(ns=1, type=0x8848).

2022-01-18 Thread Ilya Maximets
On 1/18/22 13:10, Martin Varghese wrote:
> From: Martin Varghese 
> 
> Added PT_MPLS_MC support in function xlate_generic_decap_action to fix packet
> drops when decap(packet_type(ns=1,type=0x8848) action is applied.
> 
> Fixes: 1917ace89364("Encap & Decap actions for MPLS packet type.")
> Signed-off-by: Martin Varghese 
> ---
>  ofproto/ofproto-dpif-xlate.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 6fb59e170..9a8e8e777 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -6707,7 +6707,8 @@ xlate_generic_decap_action(struct xlate_ctx *ctx,
>  ctx->pending_decap = true;
>  /* Trigger recirculation. */
>  return true;
> -case PT_MPLS: {
> +case PT_MPLS:
> +case PT_MPLS_MC: {
>  int n;
>  ovs_be16 ethertype;
>  
> 

Oh.  Seems like another 'case' got lost in the process.

The change seems correct, but could you, please, add a unit test,
so we can have this path covered?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ofproto-dpif-xlate: Fix packet drops with decap(packet_type(ns=1, type=0x8848).

2022-01-18 Thread Martin Varghese
From: Martin Varghese 

Added PT_MPLS_MC support in function xlate_generic_decap_action to fix packet
drops when decap(packet_type(ns=1,type=0x8848) action is applied.

Fixes: 1917ace89364("Encap & Decap actions for MPLS packet type.")
Signed-off-by: Martin Varghese 
---
 ofproto/ofproto-dpif-xlate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 6fb59e170..9a8e8e777 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -6707,7 +6707,8 @@ xlate_generic_decap_action(struct xlate_ctx *ctx,
 ctx->pending_decap = true;
 /* Trigger recirculation. */
 return true;
-case PT_MPLS: {
+case PT_MPLS:
+case PT_MPLS_MC: {
 int n;
 ovs_be16 ethertype;
 
-- 
2.18.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] vswitchd.xml: Add missing tx-steering PMD option.

2022-01-18 Thread Kevin Traynor

On 17/01/2022 22:25, Maxime Coquelin wrote:

This patch documents PMD's other_config:tx-steering option.

Signed-off-by: Maxime Coquelin 
---
  vswitchd/vswitch.xml | 22 ++
  1 file changed, 22 insertions(+)

diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 026b5e2ca..ef7f8f2c8 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -3375,6 +3375,28 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch 
options:peer=p1 \
  
  This option may only be used with dpdk VF representors.

+
+  
+
+  Specifies the Tx steering mode for the interface.
+
+
+  thread enables static txq mapping when the number of txq
+  is greater or equal than the number of PMD threads, and XPS mode if
+  lower than the number of PMD threads.


I think it should be 'greater than PMD threads, and XPS mode if equal or 
lower than the number of PMD threads.', because one txq is also needed 
for the OVS main thread. I checked the code and this txq is accounted 
for in wanted_txqs.


With this fix:
Acked-by: Kevin Traynor 


+
+
+  hash enables hash-based Tx steering, which distributes
+  the packets on all the transmit queues based on their 5-tuples
+  hashes.
+
+
+  Defaults to thread.
+
+  
+
  
  
  




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev