When RTE_ETH_RSS_LEVEL_INNERMOST is requested, the IP key extracts use
the innermost header index (HDR_INDEX_LAST). The hardware only resolves
that index when several IP headers are stacked: for a non-tunnelled
frame, which carries a single IP header, the extraction returns nothing.
The RSS hash is then constant and all such frames are steered to a
single Rx queue.

Always extract the outer IP (header index 0) as well, since the hardware
resolves it for any frame. Non-tunnelled frames are thus hashed on their
only IP header, while tunnelled frames are hashed on both their outer
and inner IP.

This is a deliberate tradeoff: the ethdev API defines
RTE_ETH_RSS_LEVEL_INNERMOST as hashing the innermost header only, but the
hardware cannot do that without breaking RSS for plain traffic. As a
consequence, two tunnelled flows with the same inner header but
different outer IPs may hash to different queues. This limitation is
documented in the dpaa2 guide.

Alternatives considered (feedback welcome, hence RFC):

- Hash both outer and inner only under RTE_ETH_RSS_LEVEL_PMD_DEFAULT and
  keep INNERMOST strictly inner-only. The ethdev API leaves the default
  level to the PMD, so this stays API-compliant; it changes the default
  hash for tunnelled traffic.

- Add a generic RTE_ETH_RSS_LEVEL_OUTER_INNER value to the ethdev API so
  applications can request hashing on both encapsulation levels
  explicitly, instead of overloading INNERMOST. This needs an ethdev API
  change and agreement from other PMDs.

Fixes: 32f701671d2f ("net/dpaa2: support inner RSS level for tunnelled traffic")
Signed-off-by: Maxime Leroy <[email protected]>
---
 doc/guides/nics/dpaa2.rst              |  6 +++
 drivers/net/dpaa2/base/dpaa2_hw_dpni.c | 64 +++++++++++---------------
 2 files changed, 33 insertions(+), 37 deletions(-)
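
As a rough illustration of the extract layout described in the commit
message, here is a minimal standalone sketch. It is not the driver code:
struct extract, build_ip_extracts(), MAX_EXTRACTS and the numeric values
of the header index macros are hypothetical stand-ins chosen for the
example, and the real code fills kg_cfg->extracts[] instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's definitions (values assumed). */
#define HDR_INDEX_OUTER 0x00
#define HDR_INDEX_LAST  0xff
#define MAX_EXTRACTS    10

enum ip_field { IP_SRC, IP_DST, IP_PROTO };

struct extract {
	uint8_t hdr_index;      /* which stacked IP header to read */
	enum ip_field field;    /* which field of that header */
};

/*
 * Fill 'out' with the IP extracts for the requested level and return
 * how many were written. The outer header is always extracted; the
 * innermost one is added only when 'innermost' is non-zero.
 */
static size_t
build_ip_extracts(int innermost, struct extract out[MAX_EXTRACTS])
{
	static const enum ip_field fields[] = { IP_SRC, IP_DST, IP_PROTO };
	static const uint8_t hdr_indexes[] = {
		HDR_INDEX_OUTER, HDR_INDEX_LAST };
	const size_t n_fields = sizeof(fields) / sizeof(fields[0]);
	size_t n_hdr = innermost ? 2 : 1;
	size_t n = 0, h, f;

	for (h = 0; h < n_hdr; h++) {
		for (f = 0; f < n_fields; f++) {
			assert(n < MAX_EXTRACTS);
			out[n].hdr_index = hdr_indexes[h];
			out[n].field = fields[f];
			n++;
		}
	}
	return n;
}
```

With innermost set to 0 the sketch yields three extracts, all at header
index 0. With innermost set, it yields six: the outer triple followed by
the same three fields at the innermost index.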

diff --git a/doc/guides/nics/dpaa2.rst b/doc/guides/nics/dpaa2.rst
index 2d70bd0ab9..b7b68f0cd5 100644
--- a/doc/guides/nics/dpaa2.rst
+++ b/doc/guides/nics/dpaa2.rst
@@ -558,6 +558,12 @@ Other Limitations
 
 - RSS hash key cannot be modified.
 - RSS RETA cannot be configured.
+- Under ``RTE_ETH_RSS_LEVEL_INNERMOST``, the IP hash also covers the
+  outermost IP, not only the innermost one. The hardware extracts no IP
+  at the innermost index for non-tunnelled frames, so the outer IP is
+  added to keep RSS working on plain traffic. As a result, tunnelled
+  flows with the same inner header but different outer IPs may be
+  distributed to different queues.
 
 .. _dptmapi:
 
diff --git a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
index 07f4a3d414..b002dba171 100644
--- a/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
+++ b/drivers/net/dpaa2/base/dpaa2_hw_dpni.c
@@ -398,48 +398,38 @@ dpaa2_distset_to_dpkg_profile_cfg(
                        case RTE_ETH_RSS_IPV6:
                        case RTE_ETH_RSS_FRAG_IPV6:
                        case RTE_ETH_RSS_NONFRAG_IPV6_OTHER:
-                       case RTE_ETH_RSS_IPV6_EX:
+                       case RTE_ETH_RSS_IPV6_EX: {
+                               static const uint32_t ip_fields[] = {
+                                       NH_FLD_IP_SRC, NH_FLD_IP_DST,
+                                       NH_FLD_IP_PROTO };
+                               static const uint8_t ip_hdr_index[] = {
+                                       0, DPAA2_DIST_HDR_INDEX_LAST };
+                               unsigned int n_hdr, f, h;
 
                                if (l3_configured)
                                        break;
                                l3_configured = 1;
 
-                               kg_cfg->extracts[i].extract.from_hdr.prot =
-                                       NET_PROT_IP;
-                               kg_cfg->extracts[i].extract.from_hdr.hdr_index =
-                                       hdr_index;
-                               kg_cfg->extracts[i].extract.from_hdr.field =
-                                       NH_FLD_IP_SRC;
-                               kg_cfg->extracts[i].type =
-                                       DPKG_EXTRACT_FROM_HDR;
-                               kg_cfg->extracts[i].extract.from_hdr.type =
-                                       DPKG_FULL_FIELD;
-                               i++;
-
-                               kg_cfg->extracts[i].extract.from_hdr.prot =
-                                       NET_PROT_IP;
-                               kg_cfg->extracts[i].extract.from_hdr.hdr_index =
-                                       hdr_index;
-                               kg_cfg->extracts[i].extract.from_hdr.field =
-                                       NH_FLD_IP_DST;
-                               kg_cfg->extracts[i].type =
-                                       DPKG_EXTRACT_FROM_HDR;
-                               kg_cfg->extracts[i].extract.from_hdr.type =
-                                       DPKG_FULL_FIELD;
-                               i++;
-
-                               kg_cfg->extracts[i].extract.from_hdr.prot =
-                                       NET_PROT_IP;
-                               kg_cfg->extracts[i].extract.from_hdr.hdr_index =
-                                       hdr_index;
-                               kg_cfg->extracts[i].extract.from_hdr.field =
-                                       NH_FLD_IP_PROTO;
-                               kg_cfg->extracts[i].type =
-                                       DPKG_EXTRACT_FROM_HDR;
-                               kg_cfg->extracts[i].extract.from_hdr.type =
-                                       DPKG_FULL_FIELD;
-                               i++;
-                       break;
+                               /* outer IP always; inner IP too for INNERMOST */
+                               n_hdr = (hdr_index == DPAA2_DIST_HDR_INDEX_LAST) ?
+                                       2 : 1;
+
+                               for (h = 0; h < n_hdr; h++)
+                                       for (f = 0; f < RTE_DIM(ip_fields); f++) {
+                                               kg_cfg->extracts[i].extract.from_hdr.prot =
+                                                       NET_PROT_IP;
+                                               kg_cfg->extracts[i].extract.from_hdr.hdr_index =
+                                                       ip_hdr_index[h];
+                                               kg_cfg->extracts[i].extract.from_hdr.field =
+                                                       ip_fields[f];
+                                               kg_cfg->extracts[i].type =
+                                                       DPKG_EXTRACT_FROM_HDR;
+                                               kg_cfg->extracts[i].extract.from_hdr.type =
+                                                       DPKG_FULL_FIELD;
+                                               i++;
+                                       }
+                               break;
+                       }
 
                        case RTE_ETH_RSS_NONFRAG_IPV4_TCP:
                        case RTE_ETH_RSS_NONFRAG_IPV6_TCP:
-- 
2.43.0
