Commit dfaf00e started using the result of dpdk_buf_size() to calculate the available size on each mbuf, as opposed to using the previous MBUF_SIZE macro. However, this was calculating the mbuf size by adding up the MTU with RTE_PKTMBUF_HEADROOM and only then aligning to NETDEV_DPDK_MBUF_ALIGN. Instead, the accounting for the RTE_PKTMBUF_HEADROOM should only happen after alignment, as per below.
Before alignment: ROUNDUP(MTU(1500) + RTE_PKTMBUF_HEADROOM(128), 1024) = 2048 After aligment: ROUNDUP(MTU(1500), 1024) + 128 = 2176 This might seem insignificant, however, it might have performance implications in DPDK, where each mbuf is expected to have 2k + RTE_PKTMBUF_HEADROOM of available space. This is because not only some NICs have course grained alignments of 1k, they will also take RTE_PKTMBUF_HEADROOM bytes from the overall available space in an mbuf when setting up their Rx requirements. Thus, only the "After alignment" case above would guarantee a 2k of available room, as the "Before alignment" would report only 1920B. Some extra information can be found at: https://mails.dpdk.org/archives/dev/2018-November/119219.html Note: This has been found by Ian Stokes while going through some af_packet checks. Reported-by: Ian Stokes <[email protected]> Fixes: dfaf00e ("netdev-dpdk: fix mbuf sizing") Signed-off-by: Tiago Lam <[email protected]> --- v3: - Take trailer_size into account when calculating mbuf size - Ian. v2: - Rebase to master 85706c3 ("ovn: Avoid tunneling for VLAN packets redirected to a gateway chassis"). - Fix mbuf size calculations under Documentation/topics/dpdk/memory.rst to take into account the header_size added to each mepool element (64 bytes) - Ian. --- Documentation/topics/dpdk/memory.rst | 28 ++++++++++++++-------------- lib/netdev-dpdk.c | 6 ++++-- 2 files changed, 18 insertions(+), 16 deletions(-) diff --git a/Documentation/topics/dpdk/memory.rst b/Documentation/topics/dpdk/memory.rst index c9b739f..c20dfed 100644 --- a/Documentation/topics/dpdk/memory.rst +++ b/Documentation/topics/dpdk/memory.rst @@ -107,8 +107,8 @@ Example 1 MTU = 1500 Bytes Number of mbufs = 262144 - Mbuf size = 2752 Bytes - Memory required = 262144 * 2752 = 721 MB + Mbuf size = 2944 Bytes + Memory required = 262144 * 2944 = 772 MB Example 2 +++++++++ @@ -116,8 +116,8 @@ Example 2 MTU = 1800 Bytes Number of mbufs = 262144 - Mbuf size = 2752 Bytes - Memory required = 262144 * 2752 = 721 MB + Mbuf size = 2944 Bytes + Memory required = 262144 * 2944 = 772 MB .. note:: @@ -130,8 +130,8 @@ Example 3 MTU = 6000 Bytes Number of mbufs = 262144 - Mbuf size = 8000 Bytes - Memory required = 262144 * 8000 = 2097 MB + Mbuf size = 7040 Bytes + Memory required = 262144 * 7040 = 1845 MB Example 4 +++++++++ @@ -139,8 +139,8 @@ Example 4 MTU = 9000 Bytes Number of mbufs = 262144 - Mbuf size = 10048 Bytes - Memory required = 262144 * 10048 = 2634 MB + Mbuf size = 10112 Bytes + Memory required = 262144 * 10112 = 2651 MB Per Port Memory Calculations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -194,8 +194,8 @@ Example 1: (1 rxq, 1 PMD, 1500 MTU) MTU = 1500 Number of mbufs = (1 * 2048) + (2 * 2048) + (1 * 32) + (16384) = 22560 - Mbuf size = 2752 Bytes - Memory required = 22560 * 2752 = 62 MB + Mbuf size = 2944 Bytes + Memory required = 22560 * 2944 = 65 MB Example 2: (1 rxq, 2 PMD, 6000 MTU) +++++++++++++++++++++++++++++++++++ @@ -203,8 +203,8 @@ Example 2: (1 rxq, 2 PMD, 6000 MTU) MTU = 6000 Number of mbufs = (1 * 2048) + (3 * 2048) + (1 * 32) + (16384) = 24608 - Mbuf size = 8000 Bytes - Memory required = 24608 * 8000 = 196 MB + Mbuf size = 7040 Bytes + Memory required = 24608 * 7040 = 173 MB Example 3: (2 rxq, 2 PMD, 9000 MTU) +++++++++++++++++++++++++++++++++++ @@ -212,5 +212,5 @@ Example 3: (2 rxq, 2 PMD, 9000 MTU) MTU = 9000 Number of mbufs = (2 * 2048) + (3 * 2048) + (1 * 32) + (16384) = 26656 - Mbuf size = 10048 Bytes - Memory required = 26656 * 10048 = 267 MB + Mbuf size = 10112 Bytes + Memory required = 26656 * 10112 = 270 MB diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index e8618a6..a871743 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -521,8 +521,8 @@ is_dpdk_class(const struct netdev_class *class) static uint32_t dpdk_buf_size(int mtu) { - return ROUND_UP((MTU_TO_MAX_FRAME_LEN(mtu) + RTE_PKTMBUF_HEADROOM), - NETDEV_DPDK_MBUF_ALIGN); + return ROUND_UP(MTU_TO_MAX_FRAME_LEN(mtu), NETDEV_DPDK_MBUF_ALIGN) + + RTE_PKTMBUF_HEADROOM; } /* Allocates an area of 'sz' bytes from DPDK. The memory is zero'ed. @@ -681,6 +681,8 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu, bool per_port_mp) dev->requested_n_rxq, dev->requested_n_txq, RTE_CACHE_LINE_SIZE); + /* The size of the mbuf's private area (i.e. area that holds OvS' + * dp_packet data)*/ mbuf_priv_data_len = sizeof(struct dp_packet) - sizeof(struct rte_mbuf); /* The size of the entire dp_packet. */ -- 2.7.4 _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
