>On Tue, Nov 28, 2017 at 09:06:09PM +0000, Bodireddy, Bhanuprakash wrote:
>> >Until now, xmalloc_cacheline() has provided its caller memory that
>> >does not share a cache line, but when posix_memalign() is not
>> >available it did not provide a full cache line; instead, it returned
>> >memory that was offset 8 bytes into a cache line.  This makes it hard
>> >for clients to design structures to be cache line-aligned.  This
>> >commit changes xmalloc_cacheline() to always return a full cache line
>> >instead of memory offset into one.
>> >
>> >Signed-off-by: Ben Pfaff <[email protected]>
>> >---
>> > lib/util.c | 60 ++++++++++++++++++++++++++++++++----------------------------
>> > 1 file changed, 32 insertions(+), 28 deletions(-)
>> >
>> >diff --git a/lib/util.c b/lib/util.c
>> >index 9e6edd27ae4c..137091a3cd4f 100644
>> >--- a/lib/util.c
>> >+++ b/lib/util.c
>> >@@ -196,15 +196,9 @@ x2nrealloc(void *p, size_t *n, size_t s)
>> >     return xrealloc(p, *n * s);
>> > }
>> >
>> >-/* The desired minimum alignment for an allocated block of memory. */
>> >-#define MEM_ALIGN MAX(sizeof(void *), 8)
>> >-BUILD_ASSERT_DECL(IS_POW2(MEM_ALIGN));
>> >-BUILD_ASSERT_DECL(CACHE_LINE_SIZE >= MEM_ALIGN);
>> >-
>> >-/* Allocates and returns 'size' bytes of memory in dedicated cache lines.  That
>> >- * is, the memory block returned will not share a cache line with other data,
>> >- * avoiding "false sharing".  (The memory returned will not be at the start of
>> >- * a cache line, though, so don't assume such alignment.)
>> >+/* Allocates and returns 'size' bytes of memory aligned to a cache line and in
>> >+ * dedicated cache lines.  That is, the memory block returned will not share a
>> >+ * cache line with other data, avoiding "false sharing".
>> >  *
>> >  * Use free_cacheline() to free the returned memory block. */
>> > void *
>> >@@ -221,28 +215,37 @@ xmalloc_cacheline(size_t size)
>> >     }
>> >     return p;
>> > #else
>> >-    void **payload;
>> >-    void *base;
>> >-
>> >     /* Allocate room for:
>> >      *
>> >-     *     - Up to CACHE_LINE_SIZE - 1 bytes before the payload, so that the
>> >-     *       start of the payload doesn't potentially share a cache line.
>> >+     *     - Header padding: Up to CACHE_LINE_SIZE - 1 bytes, to allow the
>> >+     *       pointer to be aligned exactly sizeof(void *) bytes before the
>> >+     *       beginning of a cache line.
>> >      *
>> >-     *     - A payload consisting of a void *, followed by padding out to
>> >-     *       MEM_ALIGN bytes, followed by 'size' bytes of user data.
>> >+     *     - Pointer: A pointer to the start of the header padding, to allow us
>> >+     *       to free() the block later.
>> >      *
>> >-     *     - Space following the payload up to the end of the cache line, so
>> >-     *       that the end of the payload doesn't potentially share a cache line
>> >-     *       with some following block. */
>> >-    base = xmalloc((CACHE_LINE_SIZE - 1)
>> >-                   + ROUND_UP(MEM_ALIGN + size, CACHE_LINE_SIZE));
>> >-
>> >-    /* Locate the payload and store a pointer to the base at the beginning. */
>> >-    payload = (void **) ROUND_UP((uintptr_t) base, CACHE_LINE_SIZE);
>> >-    *payload = base;
>> >-
>> >-    return (char *) payload + MEM_ALIGN;
>> >+     *     - User data: 'size' bytes.
>> >+     *
>> >+     *     - Trailer padding: Enough to bring the user data up to a cache line
>> >+     *       multiple.
>> >+     *
>> >+     * +---------------+---------+------------------------+---------+
>> >+     * | header        | pointer | user data              | trailer |
>> >+     * +---------------+---------+------------------------+---------+
>> >+     * ^               ^         ^
>> >+     * |               |         |
>> >+     * p               q         r
>> >+     *
>> >+     */
>> >+    void *p = xmalloc((CACHE_LINE_SIZE - 1)
>> >+                      + sizeof(void *)
>> >+                      + ROUND_UP(size, CACHE_LINE_SIZE));
>> >+    bool runt = PAD_SIZE((uintptr_t) p, CACHE_LINE_SIZE) < sizeof(void *);
>> >+    void *r = (void *) ROUND_UP((uintptr_t) p + (runt ? CACHE_LINE_SIZE : 0),
>> >+                                CACHE_LINE_SIZE);
>> >+    void **q = (void **) r - 1;
>> >+    *q = p;
>> >+    return r;
>> > #endif
>> > }
>> >
>> >@@ -265,7 +268,8 @@ free_cacheline(void *p)
>> >     free(p);
>> > #else
>> >     if (p) {
>> >-        free(*(void **) ((uintptr_t) p - MEM_ALIGN));
>> >+        void **q = (void **) p - 1;
>> >+        free(*q);
>> >     }
>> > #endif
>> > }
>> >--
>>
>> Thanks for the patch.
>> Reviewed and tested this, and now it returns a 64-byte-aligned address.
>>
>> Acked-by: Bhanuprakash Bodireddy <[email protected]>
>
>Thank you for the review.
>
>You are likely testing on Linux or another system that has posix_memalign().
>On such a system, the code that I just modified is not used.  Did you add
>"#undef HAVE_POSIX_MEMALIGN" or otherwise make sure that the new code was
>actually used?
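As a concrete illustration of the scheme in the patch above, here is a minimal standalone sketch for readers without the OVS tree at hand. It is an illustration under stated assumptions, not the OVS code itself: CACHE_LINE_SIZE, ROUND_UP, and PAD_SIZE below are simplified stand-ins for the OVS macros of the same names, and plain malloc()/assert() replace xmalloc().

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-ins for the OVS macros of the same names. */
#define CACHE_LINE_SIZE 64
#define ROUND_UP(X, Y) (((X) + ((Y) - 1)) / (Y) * (Y))
#define PAD_SIZE(X, Y) (ROUND_UP(X, Y) - (X))

/* Returns 'size' bytes in dedicated cache lines, aligned to a cache line. */
static void *
xmalloc_cacheline(size_t size)
{
    /* Header padding + back-pointer + user data rounded to a line multiple. */
    void *p = malloc((CACHE_LINE_SIZE - 1)
                     + sizeof(void *)
                     + ROUND_UP(size, CACHE_LINE_SIZE));
    assert(p);

    /* If 'p' lands fewer than sizeof(void *) bytes below the next cache
     * line boundary (including exactly on a boundary), there is no room
     * for the back-pointer before that boundary, so skip one line ahead. */
    bool runt = PAD_SIZE((uintptr_t) p, CACHE_LINE_SIZE) < sizeof(void *);
    void *r = (void *) ROUND_UP((uintptr_t) p + (runt ? CACHE_LINE_SIZE : 0),
                                CACHE_LINE_SIZE);

    /* Stash the malloc()ed base just before the returned block so that
     * free_cacheline() can find it. */
    void **q = (void **) r - 1;
    *q = p;
    return r;
}

static void
free_cacheline(void *p)
{
    if (p) {
        void **q = (void **) p - 1;
        free(*q);
    }
}

int
main(void)
{
    void *block = xmalloc_cacheline(100);
    assert((uintptr_t) block % CACHE_LINE_SIZE == 0);
    printf("block at %p is cache line-aligned\n", block);
    free_cacheline(block);
    return 0;
}

The "runt" test is the subtle part: in that case the result skips one full cache line ahead, and the CACHE_LINE_SIZE - 1 + sizeof(void *) extra bytes requested up front guarantee the shifted layout still fits inside the allocation.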
I am testing this on Linux, and my system does support posix_memalign(). So I disabled the HAVE_POSIX_MEMALIGN block so that allocation would go through the #else block you implemented here. I also applied the patch (https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341231.html) on top of this one to check that the returned address is 64-byte aligned, and I cross-verified with gdb on the vswitchd thread, changing pmd-cpu-mask, to confirm that the code block is actually executed.

- Bhanuprakash.
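To make that verification step concrete, a hypothetical harness along the lines Bhanuprakash describes might look like the sketch below. It assumes it is built inside the OVS tree and linked against lib/util.c, which declares xmalloc_cacheline(), free_cacheline(), and CACHE_LINE_SIZE in lib/util.h. Note that HAVE_POSIX_MEMALIGN has to be disabled where lib/util.c itself is compiled (for example, in config.h before rebuilding), since that is where the #else branch lives; #undef'ing it in a test file alone has no effect, which is exactly Ben's point above.

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#include "util.h"

int
main(void)
{
    /* Spot-check a spread of sizes; with the patch applied, every block
     * returned by xmalloc_cacheline() should start exactly on a cache
     * line boundary, on both the posix_memalign() and fallback paths. */
    for (size_t size = 1; size <= 4096; size *= 2) {
        void *p = xmalloc_cacheline(size);
        assert((uintptr_t) p % CACHE_LINE_SIZE == 0);
        free_cacheline(p);
    }
    return 0;
}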
