On 26 Jun 2018, at 12:02, Ilya Maximets wrote:
On 26.06.2018 12:19, Eelco Chaudron wrote:
On 22 Jun 2018, at 21:03, Lam, Tiago wrote:
On 18/06/2018 12:28, Eelco Chaudron wrote:
On 11 Jun 2018, at 18:21, Tiago Lam wrote:
A new mutex, 'nonpmd_mp_mutex', has been introduced to serialise
allocation and free operations by non-pmd threads on a given
mempool.
Can you explain why we need the mutex here? Can't see any reason
why
rte_pktmbuf_free() needs to be protected for non-pmd threads?
My understanding is that each PMD has a thread-local cache of its
own,
which it can access without locking. However, all non-PMD threads
have
direct access to the mempool, and thus need to lock in order to
access
it. Otherwise memory corruption could ensue.
For OvS this means all these functions which can be called outside
of a
PMD context, or without holding the `non_pmd_mutex` mutex, in struct
dp_netdev, need to lock before issue operations that modify a
mempool.
Hopefully I've put it succinctly enough without botching up the
concept
too much.
I do not think this is a problem, as DPDK uses
rte_mempool_default_cache() to figure out which cache to use.
As the non PMD threads have a higher than RTE_MAX_LCORE id the cache
will not be used.
It's not the problem. The main reason is that implementation of ring
buffers on which the mempools based is not preemptible:
lib/librte_mempool/rte_mempool.h:
* Note: the mempool implementation is not preemptible. An lcore must
not be
* interrupted by another task that uses the same mempool (because it
uses a
* ring which is not preemptible).
As soon as non-PMD threads are not pinned, they could be scheduled on
the same
core and preempted while executing mempool operations.
Thanks for the additional clarification! Tiago, maybe its good to add
this info some where in the code as a comment?
free_dpdk_buf() has been modified to make use of the introduced
mutex.
Signed-off-by: Tiago Lam <[email protected]>
---
lib/netdev-dpdk.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f546507..efd7c20 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -294,6 +294,10 @@ static struct ovs_mutex dpdk_mp_mutex
OVS_ACQ_AFTER(dpdk_mutex)
static struct ovs_list dpdk_mp_free_list
OVS_GUARDED_BY(dpdk_mp_mutex)
= OVS_LIST_INITIALIZER(&dpdk_mp_free_list);
+/* This mutex must be used by non pmd threads when allocating or
freeing
+ * mbufs through mempools. */
+static struct ovs_mutex nonpmd_mp_mutex = OVS_MUTEX_INITIALIZER;
+
/* Wrapper for a mempool released but not yet freed. */
struct dpdk_mp {
struct rte_mempool *mp;
@@ -461,6 +465,8 @@ struct netdev_rxq_dpdk {
dpdk_port_t port_id;
};
+static bool dpdk_thread_is_pmd(void);
+
static void netdev_dpdk_destruct(struct netdev *netdev);
static void netdev_dpdk_vhost_destruct(struct netdev *netdev);
@@ -494,6 +500,12 @@ dpdk_buf_size(int mtu)
NETDEV_DPDK_MBUF_ALIGN);
}
+static bool
+dpdk_thread_is_pmd(void)
+{
+ return rte_lcore_id() != NON_PMD_CORE_ID;
Will this continue to work with newer DPDK versions? I've not
looked at
it in detail, but I did notice that the vhost threads in the newer
DPDK
now get created with rte_ctrl_thread_create() and does some lcore
mangling.
I had a look. This seems to be a wrapper for creating management
threads only, to standardize pthread creation around that in DPDK.
Usual
EAL thread creation and assigning of LCORE_ID_ANY to non-EAL threads
seems to still be the same.
Thanks for checking.
+}
+
/* Allocates an area of 'sz' bytes from DPDK. The memory is
zero'ed.
*
* Unlike xmalloc(), this function can return NULL on failure.
*/
@@ -506,9 +518,16 @@ dpdk_rte_mzalloc(size_t sz)
void
free_dpdk_buf(struct dp_packet *p)
{
- struct rte_mbuf *pkt = (struct rte_mbuf *) p;
+ if (!dpdk_thread_is_pmd()) {
+ ovs_mutex_lock(&nonpmd_mp_mutex);
+ }
+ struct rte_mbuf *pkt = (struct rte_mbuf *) p;
Should we not use a container_of macro here?
I'll use it for the next iteration.
Tiago.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev