On 26.06.2018 15:32, Eelco Chaudron wrote: > > > On 26 Jun 2018, at 12:02, Ilya Maximets wrote: > >> On 26.06.2018 12:19, Eelco Chaudron wrote: >>> >>> >>> On 22 Jun 2018, at 21:03, Lam, Tiago wrote: >>> >>>> On 18/06/2018 12:28, Eelco Chaudron wrote: >>>>> >>>>> >>>>> On 11 Jun 2018, at 18:21, Tiago Lam wrote: >>>>> >>>>>> A new mutex, 'nonpmd_mp_mutex', has been introduced to serialise >>>>>> allocation and free operations by non-pmd threads on a given mempool. >>>>>> >>>>> >>>>> Can you explain why we need the mutex here? Can't see any reason why >>>>> rte_pktmbuf_free() needs to be protected for non-pmd threads? >>>> >>>> My understanding is that each PMD has a thread-local cache of its own, >>>> which it can access without locking. However, all non-PMD threads have >>>> direct access to the mempool, and thus need to lock in order to access >>>> it. Otherwise memory corruption could ensue. >>>> >>>> For OvS this means all these functions which can be called outside of a >>>> PMD context, or without holding the `non_pmd_mutex` mutex, in struct >>>> dp_netdev, need to lock before issue operations that modify a mempool. >>>> >>>> Hopefully I've put it succinctly enough without botching up the concept >>>> too much. >>> >>> I do not think this is a problem, as DPDK uses rte_mempool_default_cache() >>> to figure out which cache to use. >>> As the non PMD threads have a higher than RTE_MAX_LCORE id the cache will >>> not be used. >> >> >> It's not the problem. The main reason is that implementation of ring >> buffers on which the mempools based is not preemptible: >> >> lib/librte_mempool/rte_mempool.h: >> * Note: the mempool implementation is not preemptible. An lcore must not be >> * interrupted by another task that uses the same mempool (because it uses a >> * ring which is not preemptible). >> >> As soon as non-PMD threads are not pinned, they could be scheduled on the >> same >> core and preempted while executing mempool operations. > > Thanks for the additional clarification! Tiago, maybe its good to add this > info some where in the code as a comment?
Some more detailed explanation also exists in DPDK prog guide: https://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#known-issues > >>> >>>>> >>>>>> free_dpdk_buf() has been modified to make use of the introduced mutex. >>>>>> >>>>>> Signed-off-by: Tiago Lam <[email protected]> >>>>>> --- >>>>>> lib/netdev-dpdk.c | 21 ++++++++++++++++++++- >>>>>> 1 file changed, 20 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >>>>>> index f546507..efd7c20 100644 >>>>>> --- a/lib/netdev-dpdk.c >>>>>> +++ b/lib/netdev-dpdk.c >>>>>> @@ -294,6 +294,10 @@ static struct ovs_mutex dpdk_mp_mutex >>>>>> OVS_ACQ_AFTER(dpdk_mutex) >>>>>> static struct ovs_list dpdk_mp_free_list >>>>>> OVS_GUARDED_BY(dpdk_mp_mutex) >>>>>> = OVS_LIST_INITIALIZER(&dpdk_mp_free_list); >>>>>> >>>>>> +/* This mutex must be used by non pmd threads when allocating or >>>>>> freeing >>>>>> + * mbufs through mempools. */ >>>>>> +static struct ovs_mutex nonpmd_mp_mutex = OVS_MUTEX_INITIALIZER; >>>>>> + >>>>>> /* Wrapper for a mempool released but not yet freed. */ >>>>>> struct dpdk_mp { >>>>>> struct rte_mempool *mp; >>>>>> @@ -461,6 +465,8 @@ struct netdev_rxq_dpdk { >>>>>> dpdk_port_t port_id; >>>>>> }; >>>>>> >>>>>> +static bool dpdk_thread_is_pmd(void); >>>>>> + >>>>>> static void netdev_dpdk_destruct(struct netdev *netdev); >>>>>> static void netdev_dpdk_vhost_destruct(struct netdev *netdev); >>>>>> >>>>>> @@ -494,6 +500,12 @@ dpdk_buf_size(int mtu) >>>>>> NETDEV_DPDK_MBUF_ALIGN); >>>>>> } >>>>>> >>>>>> +static bool >>>>>> +dpdk_thread_is_pmd(void) >>>>>> +{ >>>>>> + return rte_lcore_id() != NON_PMD_CORE_ID; >>>>> >>>>> Will this continue to work with newer DPDK versions? I've not looked at >>>>> it in detail, but I did notice that the vhost threads in the newer DPDK >>>>> now get created with rte_ctrl_thread_create() and does some lcore >>>>> mangling. >>>>> >>>> >>>> I had a look. This seems to be a wrapper for creating management >>>> threads only, to standardize pthread creation around that in DPDK. Usual >>>> EAL thread creation and assigning of LCORE_ID_ANY to non-EAL threads >>>> seems to still be the same. >>> >>> Thanks for checking. >>>> >>>>>> +} >>>>>> + >>>>>> /* Allocates an area of 'sz' bytes from DPDK. The memory is zero'ed. >>>>>> * >>>>>> * Unlike xmalloc(), this function can return NULL on failure. */ >>>>>> @@ -506,9 +518,16 @@ dpdk_rte_mzalloc(size_t sz) >>>>>> void >>>>>> free_dpdk_buf(struct dp_packet *p) >>>>>> { >>>>>> - struct rte_mbuf *pkt = (struct rte_mbuf *) p; >>>>>> + if (!dpdk_thread_is_pmd()) { >>>>>> + ovs_mutex_lock(&nonpmd_mp_mutex); >>>>>> + } >>>>>> >>>>>> + struct rte_mbuf *pkt = (struct rte_mbuf *) p; >>>>> >>>>> Should we not use a container_of macro here? >>>>> >>>> >>>> I'll use it for the next iteration. >>>> >>>> Tiago. >>> >>> >>> > > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
