Re: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility

Mattias Rönnblom Tue, 17 Sep 2024 07:28:47 -0700

On 2024-09-16 16:02, Konstantin Ananyev wrote:

Introduce DPDK per-lcore id variables, or lcore variables for short.

An lcore variable has one value for every current and future lcore
id-equipped thread.

The primary <rte_lcore_var.h> use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.

Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decoupling the values' life time with that of the
threads.

Lcore variables are also similar in terms of functionality provided by
FreeBSD kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.

The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoid excessive use of padding,
polluting caches with unused data.

Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
Acked-by: Morten Brørup <m...@smartsharesystems.com>


LGTM in general, few small questions (mostly nits), see below.

--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include <malloc.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_debug.h>
+#include <rte_log.h>
+
+#include <rte_lcore_var.h>
+
+#include "eal_private.h"
+
+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
+
+static void *lcore_buffer;
+static size_t offset = RTE_MAX_LCORE_VAR;
+
+static void *
+lcore_var_alloc(size_t size, size_t align)
+{
+       void *handle;
+       void *value;
+
+       offset = RTE_ALIGN_CEIL(offset, align);
+
+       if (offset + size > RTE_MAX_LCORE_VAR) {
+#ifdef RTE_EXEC_ENV_WINDOWS
+               lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
+                                              RTE_CACHE_LINE_SIZE);
+#else
+               lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
+                                            LCORE_BUFFER_SIZE);
+#endif


Don't remember did that question already arise or not:
For debugging and health-checking purposes - would it make sense to link all
lcore_buffer values into a linked list?
So user/developer/some tool can walk over it to check that provided handle value
is really a valid lcore_var, etc.

At least you could add some basic statistics, like the total sizeallocated my lcore variables, and the number of variables.


One could also add tracing.

+               RTE_VERIFY(lcore_buffer != NULL);
+
+               offset = 0;
+       }
+
+       handle = RTE_PTR_ADD(lcore_buffer, offset);
+
+       offset += size;
+
+       RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
+               memset(value, 0, size);
+
+       EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
+               "%"PRIuPTR"-byte alignment", size, align);
+
+       return handle;
+}
+
+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+       /* Having the per-lcore buffer size aligned on cache lines
+        * assures as well as having the base pointer aligned on cache
+        * size assures that aligned offsets also translate to alipgned
+        * pointers across all values.
+        */
+       RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+       RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+       RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
+
+       /* '0' means asking for worst-case alignment requirements */
+       if (align == 0)
+               align = alignof(max_align_t);
+
+       RTE_ASSERT(rte_is_power_of_2(align));
+
+       return lcore_var_alloc(size, align);
+}


....

diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
new file mode 100644
index 0000000000..ec3ab714a8
--- /dev/null
+++ b/lib/eal/include/rte_lcore_var.h

...

+/**
+ * Given the lcore variable type, produces the type of the lcore
+ * variable handle.
+ */
+#define RTE_LCORE_VAR_HANDLE_TYPE(type)                \
+       type *
+
+/**
+ * Define an lcore variable handle.
+ *
+ * This macro defines a variable which is used as a handle to access
+ * the various instances of a per-lcore id variable.
+ *
+ * The aim with this macro is to make clear at the point of
+ * declaration that this is an lcore handle, rather than a regular
+ * pointer.
+ *
+ * Add @b static as a prefix in case the lcore variable is only to be
+ * accessed from a particular translation unit.
+ */
+#define RTE_LCORE_VAR_HANDLE(type, name)       \
+       RTE_LCORE_VAR_HANDLE_TYPE(type) name
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align)    \
+       handle = rte_lcore_var_alloc(size, align)
+
+/**
+ * Allocate space for an lcore variable, and initialize its handle,
+ * with values aligned for any type of object.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size) \
+       RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
+
+/**
+ * Allocate space for an lcore variable of the size and alignment requirements
+ * suggested by the handle pointer type, and initialize its handle.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_ALLOC(handle)                                    \
+       RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)),       \
+                                      alignof(typeof(*(handle))))
+
+/**
+ * Allocate an explicitly-sized, explicitly-aligned lcore variable by
+ * means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align)               \
+       RTE_INIT(rte_lcore_var_init_ ## name)                           \
+       {                                                               \
+               RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align);      \
+       }
+
+/**
+ * Allocate an explicitly-sized lcore variable by means of a @ref
+ * RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT_SIZE(name, size)            \
+       RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
+
+/**
+ * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
+ *
+ * The values of the lcore variable are initialized to zero.
+ */
+#define RTE_LCORE_VAR_INIT(name)                                       \
+       RTE_INIT(rte_lcore_var_init_ ## name)                           \
+       {                                                               \
+               RTE_LCORE_VAR_ALLOC(name);                              \
+       }
+
+/**
+ * Get void pointer to lcore variable instance with the specified
+ * lcore id.
+ *
+ * @param lcore_id
+ *   The lcore id specifying which of the @c RTE_MAX_LCORE value
+ *   instances should be accessed. The lcore id need not be valid
+ *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ *   is also not valid (and thus should not be dereferenced).
+ * @param handle
+ *   The lcore variable handle.
+ */
+static inline void *
+rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
+{
+       return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
+}
+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ *   The lcore id specifying which of the @c RTE_MAX_LCORE value
+ *   instances should be accessed. The lcore id need not be valid
+ *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ *   is also not valid (and thus should not be dereferenced).
+ * @param handle
+ *   The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle)                    \
+       ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+       RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)


Would it make sense to check that rte_lcore_id() !=  LCORE_ID_ANY?
After all if people do not want this extra check, they can probably use
RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
explicitly.

It would make sense, if it was an RTE_ASSERT(). Otherwise, I don't thinkso. Attempting to gracefully handle API violations is bad practice, imo.

+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ *   A pointer successively set to point to lcore variable value
+ *   corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ *   The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle)                     \
+       for (unsigned int lcore_id =                                    \
+                    (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+            lcore_id < RTE_MAX_LCORE;                                       \
+            lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))


Might be a bit better (and safer) to make lcore_id a macro parameter?
I.E.:
define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
for ((lcore_id) = ...


Why?

Re: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility

Reply via email to