Commit dfaf00e started using the result of dpdk_buf_size() to calculate
the available size of each mbuf, instead of the previous MBUF_SIZE
macro. However, dpdk_buf_size() calculated the mbuf size by adding
RTE_PKTMBUF_HEADROOM to the MTU and only then aligning the sum to
NETDEV_DPDK_MBUF_ALIGN. Instead, RTE_PKTMBUF_HEADROOM should only be
added after alignment, as shown below.

Before alignment:
ROUNDUP(MTU(1500) + RTE_PKTMBUF_HEADROOM(128), 1024) = 2048

After alignment:
ROUNDUP(MTU(1500), 1024) + 128 = 2176

This might seem insignificant, but it can have performance
implications in DPDK, where each mbuf is expected to have 2k +
RTE_PKTMBUF_HEADROOM of available space. Some NICs have coarse-grained
alignments of 1k, and they also take RTE_PKTMBUF_HEADROOM bytes from
the overall available space in an mbuf when setting up their Rx
requirements. Thus, only the "After alignment" case above guarantees
2k of available room, as the "Before alignment" case would report only
1920B (2048 - 128).
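
The two orderings can be compared with a small standalone sketch. It is
not the OVS code: the constants below simply reuse the values quoted in
this message (128 for RTE_PKTMBUF_HEADROOM, 1024 for
NETDEV_DPDK_MBUF_ALIGN), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the commit message. The real definitions
 * live in DPDK and lib/netdev-dpdk.c and may differ per build. */
#define HEADROOM   128u   /* stands in for RTE_PKTMBUF_HEADROOM */
#define MBUF_ALIGN 1024u  /* stands in for NETDEV_DPDK_MBUF_ALIGN */

/* Rounds 'x' up to the next multiple of 'align'. */
static inline uint32_t
round_up(uint32_t x, uint32_t align)
{
    return (x + align - 1) / align * align;
}

/* Old ordering: add the headroom first, then align. */
static inline uint32_t
buf_size_headroom_before(uint32_t len)
{
    return round_up(len + HEADROOM, MBUF_ALIGN);
}

/* New ordering: align first, then add the headroom. */
static inline uint32_t
buf_size_headroom_after(uint32_t len)
{
    return round_up(len, MBUF_ALIGN) + HEADROOM;
}

/* Room left for packet data once the headroom has been taken out. */
static inline uint32_t
data_room(uint32_t buf_size)
{
    return buf_size - HEADROOM;
}

/* Reproduces the figures quoted above for a length of 1500 bytes. */
static inline void
check_commit_message_values(void)
{
    assert(buf_size_headroom_before(1500) == 2048);
    assert(buf_size_headroom_after(1500) == 2176);
    assert(data_room(buf_size_headroom_before(1500)) == 1920);
    assert(data_room(buf_size_headroom_after(1500)) == 2048);
}
```

Only the second ordering leaves a full 2048 bytes of data room after the
headroom is subtracted.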

Some extra information can be found at:
https://mails.dpdk.org/archives/dev/2018-November/119219.html

Note: This was found by Ian Stokes while going through some
af_packet checks.

Reported-by: Ian Stokes <[email protected]>
Fixes: dfaf00e ("netdev-dpdk: fix mbuf sizing")
Signed-off-by: Tiago Lam <[email protected]>
---
v3:
   - Take trailer_size into account when calculating mbuf size - Ian.

v2:
   - Rebase to master 85706c3 ("ovn: Avoid tunneling for VLAN packets
     redirected to a gateway chassis").

   - Fix mbuf size calculations under Documentation/topics/dpdk/memory.rst
     to take into account the header_size added to each mempool element (64
     bytes) - Ian.
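
As a rough cross-check of the documentation figures, the memory
requirement is the number of mbufs multiplied by the per-mbuf size. The
sketch below assumes "MB" means 10^6 bytes rounded to the nearest
integer, which matches the updated shared-mempool examples; the helper
name is hypothetical and not part of OVS.

```c
#include <stdint.h>

/* Memory needed for 'n_mbufs' mbufs of 'mbuf_size' bytes each, in MB.
 * Assumption: 1 MB = 1,000,000 bytes, rounded to the nearest integer. */
static inline uint64_t
mempool_mem_mb(uint64_t n_mbufs, uint64_t mbuf_size)
{
    return (n_mbufs * mbuf_size + 500000) / 1000000;
}
```

For example, mempool_mem_mb(262144, 2944) gives 772 and
mempool_mem_mb(262144, 7040) gives 1845, matching the updated shared
mempool examples for the 1500 and 6000 byte MTUs.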
---
 Documentation/topics/dpdk/memory.rst | 28 ++++++++++++++--------------
 lib/netdev-dpdk.c                    |  6 ++++--
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/Documentation/topics/dpdk/memory.rst b/Documentation/topics/dpdk/memory.rst
index c9b739f..c20dfed 100644
--- a/Documentation/topics/dpdk/memory.rst
+++ b/Documentation/topics/dpdk/memory.rst
@@ -107,8 +107,8 @@ Example 1
 
  MTU = 1500 Bytes
  Number of mbufs = 262144
- Mbuf size = 2752 Bytes
- Memory required = 262144 * 2752 = 721 MB
+ Mbuf size = 2944 Bytes
+ Memory required = 262144 * 2944 = 772 MB
 
 Example 2
 +++++++++
@@ -116,8 +116,8 @@ Example 2
 
  MTU = 1800 Bytes
  Number of mbufs = 262144
- Mbuf size = 2752 Bytes
- Memory required = 262144 * 2752 = 721 MB
+ Mbuf size = 2944 Bytes
+ Memory required = 262144 * 2944 = 772 MB
 
 .. note::
 
@@ -130,8 +130,8 @@ Example 3
 
  MTU = 6000 Bytes
  Number of mbufs = 262144
- Mbuf size = 8000 Bytes
- Memory required = 262144 * 8000 = 2097 MB
+ Mbuf size = 7040 Bytes
+ Memory required = 262144 * 7040 = 1845 MB
 
 Example 4
 +++++++++
@@ -139,8 +139,8 @@ Example 4
 
  MTU = 9000 Bytes
  Number of mbufs = 262144
- Mbuf size = 10048 Bytes
- Memory required = 262144 * 10048 = 2634 MB
+ Mbuf size = 10112 Bytes
+ Memory required = 262144 * 10112 = 2651 MB
 
 Per Port Memory Calculations
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -194,8 +194,8 @@ Example 1: (1 rxq, 1 PMD, 1500 MTU)
 
  MTU = 1500
  Number of mbufs = (1 * 2048) + (2 * 2048) + (1 * 32) + (16384) = 22560
- Mbuf size = 2752 Bytes
- Memory required = 22560 * 2752 = 62 MB
+ Mbuf size = 2944 Bytes
+ Memory required = 22560 * 2944 = 66 MB
 
 Example 2: (1 rxq, 2 PMD, 6000 MTU)
 +++++++++++++++++++++++++++++++++++
@@ -203,8 +203,8 @@ Example 2: (1 rxq, 2 PMD, 6000 MTU)
 
  MTU = 6000
  Number of mbufs = (1 * 2048) + (3 * 2048) + (1 * 32) + (16384) = 24608
- Mbuf size = 8000 Bytes
- Memory required = 24608 * 8000 = 196 MB
+ Mbuf size = 7040 Bytes
+ Memory required = 24608 * 7040 = 173 MB
 
 Example 3: (2 rxq, 2 PMD, 9000 MTU)
 +++++++++++++++++++++++++++++++++++
@@ -212,5 +212,5 @@ Example 3: (2 rxq, 2 PMD, 9000 MTU)
 
  MTU = 9000
  Number of mbufs = (2 * 2048) + (3 * 2048) + (1 * 32) + (16384) = 26656
- Mbuf size = 10048 Bytes
- Memory required = 26656 * 10048 = 267 MB
+ Mbuf size = 10112 Bytes
+ Memory required = 26656 * 10112 = 270 MB
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index e8618a6..a871743 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -521,8 +521,8 @@ is_dpdk_class(const struct netdev_class *class)
 static uint32_t
 dpdk_buf_size(int mtu)
 {
-    return ROUND_UP((MTU_TO_MAX_FRAME_LEN(mtu) + RTE_PKTMBUF_HEADROOM),
-                     NETDEV_DPDK_MBUF_ALIGN);
+    return ROUND_UP(MTU_TO_MAX_FRAME_LEN(mtu), NETDEV_DPDK_MBUF_ALIGN)
+            + RTE_PKTMBUF_HEADROOM;
 }
 
 /* Allocates an area of 'sz' bytes from DPDK.  The memory is zero'ed.
@@ -681,6 +681,8 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu, bool per_port_mp)
                   dev->requested_n_rxq, dev->requested_n_txq,
                   RTE_CACHE_LINE_SIZE);
 
+        /* The size of the mbuf's private area (i.e. area that holds OvS'
+         * dp_packet data). */
         mbuf_priv_data_len = sizeof(struct dp_packet) -
                                  sizeof(struct rte_mbuf);
         /* The size of the entire dp_packet. */
-- 
2.7.4
