Hi,
here is a respin of the buffer allocation optimization patch.

Changes V2 -> V3:
-----------------

  - Allocate all aligned buffers with mmap(), not only QP buffers.

Changes V1 -> V2:
-----------------

  - Use mmap() regardless of the page size, not only with 64K pages.

Buffers are allocated with mthca_alloc_buf(), which rounds the buffer
size up to the page size and then allocates page-aligned memory using
posix_memalign(). However, this allocation is quite wasteful on
architectures using 64K pages (ia64 for example), because we then hit
glibc's MMAP_THRESHOLD malloc parameter and chunks are allocated using
mmap(). We thus end up allocating:

  (requested size rounded to the page size) + (page size) + (malloc overhead)

rounded internally to the page size. For example, if we request a buffer
of page_size bytes, we end up consuming 3 pages. In short, every buffer
we allocate carries an overhead of 2 pages. This is quite visible on
large clusters, where the number of QPs can reach several thousands.

This patch replaces the call to posix_memalign() in mthca_alloc_buf()
with a direct call to mmap(). A small standalone sketch of the approach
follows the patch.

Signed-off-by: Sebastien Dugue <[email protected]>
---
 src/buf.c |   21 +++++++++++++--------
 1 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/buf.c b/src/buf.c
index 6c1be4f..985c1f7 100644
--- a/src/buf.c
+++ b/src/buf.c
@@ -35,6 +35,8 @@
 #endif /* HAVE_CONFIG_H */
 
 #include <stdlib.h>
+#include <sys/mman.h>
+#include <errno.h>
 
 #include "mthca.h"
 
@@ -61,16 +63,19 @@ int mthca_alloc_buf(struct mthca_buf *buf, size_t size, int page_size)
 {
 	int ret;
 
-	ret = posix_memalign(&buf->buf, page_size, align(size, page_size));
-	if (ret)
-		return ret;
+	/* Use mmap directly to allocate an aligned buffer */
+	buf->buf = mmap(0, align(size, page_size), PROT_READ | PROT_WRITE,
+			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+	if (buf->buf == MAP_FAILED)
+		return errno;
 
 	ret = ibv_dontfork_range(buf->buf, size);
-	if (ret)
-		free(buf->buf);
-	if (!ret)
-		buf->length = size;
+	if (ret)
+		munmap(buf->buf, align(size, page_size));
+	else
+		buf->length = align(size, page_size);
 
 	return ret;
 }
@@ -78,5 +83,5 @@ int mthca_alloc_buf(struct mthca_buf *buf, size_t size, int page_size)
 void mthca_free_buf(struct mthca_buf *buf)
 {
 	ibv_dofork_range(buf->buf, buf->length);
-	free(buf->buf);
+	munmap(buf->buf, buf->length);
 }
-- 
1.6.3.1
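For anyone who wants to experiment with this outside the driver, here is
a minimal standalone sketch of the same mmap()-based aligned allocation.
alloc_aligned_buf() and free_aligned_buf() are illustrative names, not
libmthca functions, and the align() macro simply mirrors the rounding the
driver does:

#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS */
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

/* Same rounding as the align() helper used by the driver. */
#define align(val, a)	(((val) + (a) - 1) & ~((a) - 1))

/* Illustrative stand-in for mthca_alloc_buf(): mmap() returns
 * page-aligned memory with no allocator bookkeeping, so a request of
 * page_size bytes costs exactly one page. */
static void *alloc_aligned_buf(size_t size, size_t page_size, size_t *length)
{
	void *buf = mmap(NULL, align(size, page_size),
			 PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return NULL;

	*length = align(size, page_size);
	return buf;
}

/* Illustrative stand-in for mthca_free_buf(). */
static void free_aligned_buf(void *buf, size_t length)
{
	munmap(buf, length);
}

int main(void)
{
	size_t page_size = sysconf(_SC_PAGESIZE);
	size_t length;
	void *buf = alloc_aligned_buf(page_size, page_size, &length);

	if (!buf) {
		perror("mmap");
		return 1;
	}

	printf("got %zu bytes at %p\n", length, buf);
	free_aligned_buf(buf, length);
	return 0;
}

Note that mmap() reports failure through errno while posix_memalign()
returns an error code directly, which is why the patch returns errno on
MAP_FAILED: callers of mthca_alloc_buf() keep seeing the same return
convention as before.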
