Re: [ovs-dev] [PATCH v4 01/15] netdev-dpdk: Differentiate between mtu/mbuf size.
> On 11/07/2018 12:40, Ian Stokes wrote: > > On 7/10/2018 12:06 PM, Tiago Lam wrote: > >> When configuring a mempool, in netdev_dpdk_mempool_configure(), the > >> result of a call to dpdk_buf_size() is being used as the MTU. This, > >> however, is not the MTU but a ROUND_UP aligned number based on the > >> MTU, which could lead to the reuse of mempools even when the real > >> MTUs between different ports differ but somehow round up to the same > >> mbuf size. > >> > >> To support multi-segment mbufs, which is coming in later commits, > >> this distinction is important. For multi-segment mbufs the mbuf size > >> is always the same (multiple mbufs are chained together to hold the > >> packet's data), which means that using the mbuf size as the mtu would > >> lead to reusing always the same mempool, independent of the MTU set. > >> > >> To fix this, two fields are now passed to dpdk_mp_create(), the mtu > >> and the mbuf_size, thus creating a distinction between the two. The > >> latter is used for telling rte_pktmbuf_pool_create() the size of each > >> mbuf, while the former is used for tracking purposes in struct > >> dpdk_mp and as an input to create a unique hash for the mempool's name. > >> > >> Signed-off-by: Tiago Lam > > > > Hi Tiago, > > > > I don't think these changes are acceptable as they break from the > > expected behavior of the shared mempool model. > > > > The reason we introduced the shared mempool model was that it was used > > from OVS 2.6 -> 2.9. > > > > A user with the same deployment moving between releases would not have > > to re-calculate the memory they dimension for OVS DPDK. > > > > With these changes however they may have to. > > > > Consider the following example: > > > > A user adds a port with mtu 1500 and a second port with mtu 1800. > > Assuming both ports are on the same socket, the same mempool would be > > re-used for both ports in the current shared mempool model. > > > However with this change, separate mempools would have to be created > > as the MTU is different, potentially requiring an increase in the > > memory provisioned by the user. It may not be possible for users to > > upgrade from OVS 2.9 to 2.10 without re-dimension their memory with > this. > > > > Hi Ian, > > Thanks for the context here. > > It's a valid point. I did consider this, but wasn't aware that users / > deployments were also taking this round up into account. In that case, > this could certainly cause problems during upgrade. > > > We had a similar issue to this in OVS 2.9 release with the per port > > memory model, it was flagged as a blocking factor for Red Hat and > > Ericsson upgrades which led to the removal of the per port model in 2.9. > > > > One possible solution here is that multiseg is supported for the per > > port memory model only? > > > > It seems we can adapt this in a different way. I was trying to reuse the > "mbuf_size" for both cases; In the per-port and shared mempool case, where > it is being used as the MTU, and in the multi-segment case where it is > being used for telling rte_pktmbuf_pool_create() the size of each mbuf. > Hence why I thought it would be better to differentiate between the real > MTU and the "mbuf_size". Instead, we can calculate two "mbuf_sizes" - the > one the shared and per-port mempool models use for tracking the MTU > changes, and in case multi-segment mbufs are in use, the "mbuf_size" for > when creating the mempool. This should suffice. > > I'll update the series with this; This patch 01/15 will probably be > dropped entirely, and patch 12/15 will likely be the one taking the > changes mentioned above. > > I'll also update patches 05/15 and 11/15 to fix the Travis CI. > > Thanks, > Tiago. That sounds ok, will review in the next revision. Thanks Ian ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 01/15] netdev-dpdk: Differentiate between mtu/mbuf size.
On 11/07/2018 12:40, Ian Stokes wrote: > On 7/10/2018 12:06 PM, Tiago Lam wrote: >> When configuring a mempool, in netdev_dpdk_mempool_configure(), the >> result of a call to dpdk_buf_size() is being used as the MTU. This, >> however, is not the MTU but a ROUND_UP aligned number based on the MTU, >> which could lead to the reuse of mempools even when the real MTUs >> between different ports differ but somehow round up to the same mbuf >> size. >> >> To support multi-segment mbufs, which is coming in later commits, this >> distinction is important. For multi-segment mbufs the mbuf size is >> always the same (multiple mbufs are chained together to hold the >> packet's data), which means that using the mbuf size as the mtu would >> lead to reusing always the same mempool, independent of the MTU set. >> >> To fix this, two fields are now passed to dpdk_mp_create(), the mtu and >> the mbuf_size, thus creating a distinction between the two. The latter >> is used for telling rte_pktmbuf_pool_create() the size of each mbuf, >> while the former is used for tracking purposes in struct dpdk_mp and as >> an input to create a unique hash for the mempool's name. >> >> Signed-off-by: Tiago Lam > > Hi Tiago, > > I don't think these changes are acceptable as they break from the > expected behavior of the shared mempool model. > > The reason we introduced the shared mempool model was that it was used > from OVS 2.6 -> 2.9. > > A user with the same deployment moving between releases would not have > to re-calculate the memory they dimension for OVS DPDK. > > With these changes however they may have to. > > Consider the following example: > > A user adds a port with mtu 1500 and a second port with mtu 1800. > Assuming both ports are on the same socket, the same mempool would be > re-used for both ports in the current shared mempool model. > However with this change, separate mempools would have to be created as > the MTU is different, potentially requiring an increase in the memory > provisioned by the user. It may not be possible for users to upgrade > from OVS 2.9 to 2.10 without re-dimension their memory with this. > Hi Ian, Thanks for the context here. It's a valid point. I did consider this, but wasn't aware that users / deployments were also taking this round up into account. In that case, this could certainly cause problems during upgrade. > We had a similar issue to this in OVS 2.9 release with the per port > memory model, it was flagged as a blocking factor for Red Hat and > Ericsson upgrades which led to the removal of the per port model in 2.9. > > One possible solution here is that multiseg is supported for the per > port memory model only? > It seems we can adapt this in a different way. I was trying to reuse the "mbuf_size" for both cases; In the per-port and shared mempool case, where it is being used as the MTU, and in the multi-segment case where it is being used for telling rte_pktmbuf_pool_create() the size of each mbuf. Hence why I thought it would be better to differentiate between the real MTU and the "mbuf_size". Instead, we can calculate two "mbuf_sizes" - the one the shared and per-port mempool models use for tracking the MTU changes, and in case multi-segment mbufs are in use, the "mbuf_size" for when creating the mempool. This should suffice. I'll update the series with this; This patch 01/15 will probably be dropped entirely, and patch 12/15 will likely be the one taking the changes mentioned above. I'll also update patches 05/15 and 11/15 to fix the Travis CI. Thanks, Tiago. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 01/15] netdev-dpdk: Differentiate between mtu/mbuf size.
On 7/10/2018 12:06 PM, Tiago Lam wrote: When configuring a mempool, in netdev_dpdk_mempool_configure(), the result of a call to dpdk_buf_size() is being used as the MTU. This, however, is not the MTU but a ROUND_UP aligned number based on the MTU, which could lead to the reuse of mempools even when the real MTUs between different ports differ but somehow round up to the same mbuf size. To support multi-segment mbufs, which is coming in later commits, this distinction is important. For multi-segment mbufs the mbuf size is always the same (multiple mbufs are chained together to hold the packet's data), which means that using the mbuf size as the mtu would lead to reusing always the same mempool, independent of the MTU set. To fix this, two fields are now passed to dpdk_mp_create(), the mtu and the mbuf_size, thus creating a distinction between the two. The latter is used for telling rte_pktmbuf_pool_create() the size of each mbuf, while the former is used for tracking purposes in struct dpdk_mp and as an input to create a unique hash for the mempool's name. Signed-off-by: Tiago Lam Hi Tiago, I don't think these changes are acceptable as they break from the expected behavior of the shared mempool model. The reason we introduced the shared mempool model was that it was used from OVS 2.6 -> 2.9. A user with the same deployment moving between releases would not have to re-calculate the memory they dimension for OVS DPDK. With these changes however they may have to. Consider the following example: A user adds a port with mtu 1500 and a second port with mtu 1800. Assuming both ports are on the same socket, the same mempool would be re-used for both ports in the current shared mempool model. However with this change, separate mempools would have to be created as the MTU is different, potentially requiring an increase in the memory provisioned by the user. It may not be possible for users to upgrade from OVS 2.9 to 2.10 without re-dimension their memory with this. We had a similar issue to this in OVS 2.9 release with the per port memory model, it was flagged as a blocking factor for Red Hat and Ericsson upgrades which led to the removal of the per port model in 2.9. One possible solution here is that multiseg is supported for the per port memory model only? Ian --- lib/netdev-dpdk.c | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index bb4d60f..be8e8b9 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -632,7 +632,8 @@ dpdk_calculate_mbufs(struct netdev_dpdk *dev, int mtu, bool per_port_mp) } static struct dpdk_mp * -dpdk_mp_create(struct netdev_dpdk *dev, int mtu, bool per_port_mp) +dpdk_mp_create(struct netdev_dpdk *dev, int mtu, uint32_t mbuf_size, + bool per_port_mp) { char mp_name[RTE_MEMPOOL_NAMESIZE]; const char *netdev_name = netdev_get_name(&dev->up); @@ -666,21 +667,23 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu, bool per_port_mp) if (ret < 0 || ret >= RTE_MEMPOOL_NAMESIZE) { VLOG_DBG("snprintf returned %d. " "Failed to generate a mempool name for \"%s\". " - "Hash:0x%x, socket_id: %d, mtu:%d, mbufs:%u.", - ret, netdev_name, hash, socket_id, mtu, n_mbufs); + "Hash:0x%x, socket_id: %d, mtu:%d, mbufs:%u, " + "mbuf size: %u", + ret, netdev_name, hash, socket_id, mtu, n_mbufs, + mbuf_size); break; } -VLOG_DBG("Port %s: Requesting a mempool of %u mbufs " +VLOG_DBG("Port %s: Requesting a mempool of %u mbufs of size %u " "on socket %d for %d Rx and %d Tx queues.", - netdev_name, n_mbufs, socket_id, + netdev_name, n_mbufs, mbuf_size, socket_id, dev->requested_n_rxq, dev->requested_n_txq); dmp->mp = rte_pktmbuf_pool_create(mp_name, n_mbufs, MP_CACHE_SZ, sizeof (struct dp_packet) - sizeof (struct rte_mbuf), - MBUF_SIZE(mtu) + MBUF_SIZE(mbuf_size) - sizeof(struct dp_packet), socket_id); @@ -718,7 +721,8 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu, bool per_port_mp) } static struct dpdk_mp * -dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool per_port_mp) +dpdk_mp_get(struct netdev_dpdk *dev, int mtu, uint32_t mbuf_size, +bool per_port_mp) { struct dpdk_mp *dmp, *next; bool reuse = false; @@ -741,7 +745,7 @@ dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool per_port_mp) dpdk_mp_sweep(); if (!reu
Re: [ovs-dev] [PATCH v4 01/15] netdev-dpdk: Differentiate between mtu/mbuf size.
On 10 Jul 2018, at 13:06, Tiago Lam wrote: > When configuring a mempool, in netdev_dpdk_mempool_configure(), the > result of a call to dpdk_buf_size() is being used as the MTU. This, > however, is not the MTU but a ROUND_UP aligned number based on the MTU, > which could lead to the reuse of mempools even when the real MTUs > between different ports differ but somehow round up to the same mbuf > size. > > To support multi-segment mbufs, which is coming in later commits, this > distinction is important. For multi-segment mbufs the mbuf size is > always the same (multiple mbufs are chained together to hold the > packet's data), which means that using the mbuf size as the mtu would > lead to reusing always the same mempool, independent of the MTU set. > > To fix this, two fields are now passed to dpdk_mp_create(), the mtu and > the mbuf_size, thus creating a distinction between the two. The latter > is used for telling rte_pktmbuf_pool_create() the size of each mbuf, > while the former is used for tracking purposes in struct dpdk_mp and as > an input to create a unique hash for the mempool's name. > > Signed-off-by: Tiago Lam > --- Changes look good to me. Acked-by: Eelco Chaudron ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v4 01/15] netdev-dpdk: Differentiate between mtu/mbuf size.
On 10/07/2018 12:06, Tiago Lam wrote: > When configuring a mempool, in netdev_dpdk_mempool_configure(), the > result of a call to dpdk_buf_size() is being used as the MTU. This, > however, is not the MTU but a ROUND_UP aligned number based on the MTU, > which could lead to the reuse of mempools even when the real MTUs > between different ports differ but somehow round up to the same mbuf > size. > > To support multi-segment mbufs, which is coming in later commits, this > distinction is important. For multi-segment mbufs the mbuf size is > always the same (multiple mbufs are chained together to hold the > packet's data), which means that using the mbuf size as the mtu would > lead to reusing always the same mempool, independent of the MTU set. > > To fix this, two fields are now passed to dpdk_mp_create(), the mtu and > the mbuf_size, thus creating a distinction between the two. The latter > is used for telling rte_pktmbuf_pool_create() the size of each mbuf, > while the former is used for tracking purposes in struct dpdk_mp and as > an input to create a unique hash for the mempool's name. > > Signed-off-by: Tiago Lam > --- > lib/netdev-dpdk.c | 22 +- > 1 file changed, 13 insertions(+), 9 deletions(-) > Hi Ian, This patch changes the way the dpdk_mp struct tracks the "mtu", which was introduced in "Support both shared and per port mempools". Instead of assigning the "buf_size" to the "mtu" member in the dpdk_mp struct, we assign the "requested_mtu". This was needed because for multi-segment mbufs the "mbuf_size" may end up being always the same even for different MTUs (as the mbufs are chained together to hold all data), which meant the same mempool would end up being reused. I don't expect this to change the functionality (ran a few tests myself to confirm the correct behavior), but does this sound OK to you? Any thoughts here? Thanks, Tiago. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev