On 10.04.2018 21:12, Kevin Traynor wrote: > DPDK mempools are freed when they are no longer needed. > This can happen when a port is removed or a port's mtu > is reconfigured so that a new mempool is used. > > It is possible that an mbuf is attempted to be returned > to a freed mempool from NIC Tx queues and this can lead > to a segfault. > > In order to prevent this, only free mempools when they > are not needed and have no in-use mbufs. As this might > not be possible immediately, sweep the mempools anytime > a port tries to get a mempool. > > Fixes: 8d38823bdf8b ("netdev-dpdk: fix memory leak") > Cc: mark.b.kavanag...@gmail.com > Cc: Ilya Maximets <i.maxim...@samsung.com> > Reported-by: Venkatesan Pradeep <venkatesan.prad...@ericsson.com> > Signed-off-by: Kevin Traynor <ktray...@redhat.com> > --- > > v2: Add second call to rte_mempool_full() as it is not atomic > so we can't trust one call to it. > > lib/netdev-dpdk.c | 48 +++++++++++++++++++++++++++++++++++------------- > 1 file changed, 35 insertions(+), 13 deletions(-) > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index c19cedc..00306c4 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -590,8 +590,32 @@ dpdk_mp_create(int socket_id, int mtu) > } > > +/* Free unused mempools. */ > +static void > +dpdk_mp_sweep(void) OVS_REQUIRES(dpdk_mp_mutex) > +{ > + struct dpdk_mp *dmp, *next; > + > + LIST_FOR_EACH_SAFE (dmp, next, list_node, &dpdk_mp_list) { > + if (!dmp->refcount && rte_mempool_full(dmp->mp)) { > + /* > + * rte_mempool_full() is not atomic, but at this point > + * we will not be getting any more mbufs from the mempool > + * so if it reports full again on a subsequent call, then > + * we can be sure that it really is full. > + */
Checking of the rte_mempool_full() twice will not guarantee that it's really full because it sums cache sizes with ring size and truncates the value from the upper side. For example: 0. Let the mempool size == 250. Current mempool ring contains 150 packets. Some of the packets in caches and in use. 1. Sum of the caches calculated. total_cache_len = 100; 2. In a separate thread some of the packets (let it be 50) flushed from the cache. 3. Getting the mempool ring count (200) and summing with previously calculated cache sizes. total_size = 100 + 200 == 300. --> Here we have twice accounted flushed packets. Truncated to 250 before return. 4. rte_mempool_full() returns true. 5. Second call: 6. Sum of the caches calculated. total_cache_len = 50; 7. In a separate thread some of the packets (let it be 30) flushed from the cache. 8. Getting the mempool ring count (230) and summing with previously calculated cache sizes. total_size = 50 + 230 == 280. --> We still have twice accounted mbufs. Truncated to 250 before return. 9. rte_mempool_full() returns true which could be false positive. So, it's possible to have 2 false positives in a row. The only way to be sure is to check twice rte_mempool_ops_get_count(), which is thread safe. > + if (OVS_LIKELY(rte_mempool_full(dmp->mp))) { > + ovs_list_remove(&dmp->list_node); > + rte_mempool_free(dmp->mp); > + rte_free(dmp); > + } > + } > + } > +} > + > static struct dpdk_mp * > dpdk_mp_get(int socket_id, int mtu) > { > struct dpdk_mp *dmp; > + bool reuse = false; > > ovs_mutex_lock(&dpdk_mp_mutex); > @@ -599,15 +623,18 @@ dpdk_mp_get(int socket_id, int mtu) > if (dmp->socket_id == socket_id && dmp->mtu == mtu) { > dmp->refcount++; > - goto out; > + reuse = true; > + break; > } > } > + /* Sweep mempools after reuse or before create. */ > + dpdk_mp_sweep(); > > - dmp = dpdk_mp_create(socket_id, mtu); > - if (dmp) { > - ovs_list_push_back(&dpdk_mp_list, &dmp->list_node); > + if (!reuse) { > + dmp = dpdk_mp_create(socket_id, mtu); > + if (dmp) { > + ovs_list_push_back(&dpdk_mp_list, &dmp->list_node); > + } > } > > -out: > - > ovs_mutex_unlock(&dpdk_mp_mutex); > > @@ -615,5 +642,5 @@ out: > } > > -/* Release an existing mempool. */ > +/* Decrement reference to a mempool. */ > static void > dpdk_mp_put(struct dpdk_mp *dmp) > @@ -625,10 +652,5 @@ dpdk_mp_put(struct dpdk_mp *dmp) > ovs_mutex_lock(&dpdk_mp_mutex); > ovs_assert(dmp->refcount); > - > - if (!--dmp->refcount) { > - ovs_list_remove(&dmp->list_node); > - rte_mempool_free(dmp->mp); > - rte_free(dmp); > - } > + dmp->refcount--; > ovs_mutex_unlock(&dpdk_mp_mutex); > } > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev