Hi,

I looked at the possibility of optimizing and simplifying SA request
processing in OpenSM and found that a very common practice there is to use
cl_qlock_pool* as a record allocator (it must be locked because requests of
the same type share the pool). It is also used as a MAD allocator (via
osm_mad_pool).
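
To make the pattern concrete, here is a minimal sketch of how such a
shared record pool is typically used. This is illustrative only, not
actual OpenSM code; the type and function names are made up:

#include <complib/cl_qlockpool.h>

typedef struct sa_record {
        cl_pool_item_t pool_item;       /* must be the first member */
        char data[64];                  /* record payload */
} sa_record_t;

/* One pool is shared by all requests of a given type, hence the lock. */
static cl_qlock_pool_t record_pool;

static int record_pool_setup(void)
{
        cl_qlock_pool_construct(&record_pool);
        if (cl_qlock_pool_init(&record_pool, 32, 0, 32, sizeof(sa_record_t),
                               NULL, NULL, NULL) != CL_SUCCESS)
                return -1;
        return 0;
}

static sa_record_t *record_get(void)
{
        /* cl_qlock_pool_get() serializes all callers on the pool's lock */
        return (sa_record_t *)cl_qlock_pool_get(&record_pool);
}

static void record_put(sa_record_t *rec)
{
        cl_qlock_pool_put(&record_pool, &rec->pool_item);
}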

Looking at the implementation of q[lock_]pool, I thought it would be
interesting to compare its performance with that of standard malloc, which
by itself should be reasonably fast. So I wrote a quick and dirty program,
test_pool.c (do_nothing() here is there to prevent a smart optimizer from
dropping some of the cycles):


#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <complib/cl_qlockpool.h>
#include <complib/cl_qpool.h>

/* Define exactly one of these to select the allocator under test;
 * with neither defined, the plain cl_qlock_pool is used. */
#define USE_MALLOC 1
/* #define USE_QPOOL 1 */

#ifdef USE_MALLOC
/* Map the pool calls to plain malloc/free.  The put() side works because
 * pool_item is the first member of item_t, so &item->pool_item and the
 * item itself have the same address. */
#define cl_qlock_pool_get(p) malloc(sizeof(item_t))
#define cl_qlock_pool_put(p, mem) free(mem)
#else
#ifdef USE_QPOOL
#define cl_qlock_pool_t cl_qpool_t
#define cl_qlock_pool_construct(p) cl_qpool_construct(p)
#define cl_qlock_pool_init(p, a, b, c, d, e, f, g) \
        cl_qpool_init(p, a, b, c, d, e, f, g)
#define cl_qlock_pool_destroy(p) cl_qpool_destroy(p)
#define cl_qlock_pool_get(p) cl_qpool_get(p)
#define cl_qlock_pool_put(p, mem) cl_qpool_put(p, mem)
#endif
#endif

typedef struct item {
        cl_pool_item_t pool_item;
        char data[64];
} item_t;

#define POOL_MIN_SIZE      32
#define POOL_GROW_SIZE     32

#define N_TESTS 1000000000
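/* Note: N_TESTS and the per-call batch size (1000000 in test_pool() below)
 * were varied between runs, as mentioned after the listing. */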

/* Touch every item's data so the optimizer cannot drop the allocation work. */
static void do_nothing(struct item *items[], unsigned n)
{
        unsigned i;
        for (i = 0 ; i < n ; i++) {
                if (!strcmp(items[i]->data, "12345678"))
                        printf("Yes!!!\n");
        }
}

/* Get n items from the pool, touch them via do_nothing(), then put them back. */
static int pool_get_and_put_items(cl_qlock_pool_t *p, unsigned n)
{
        struct item *items[n];
        struct item *item;
        unsigned i;

        for (i = 0 ; i < n ; i++) {
                item = (struct item *)cl_qlock_pool_get(p);
                if (!item)
                        return -1;
                memset(item->data, 0, sizeof(item->data));
                items[i] = item;
        }

        do_nothing(items, n);

        for (i = 0 ; i < n ; i++)
                cl_qlock_pool_put(p, &items[i]->pool_item);

        return 0;
}

static int test_pool(void)
{
        cl_qlock_pool_t pool;
        int i, j;
        cl_status_t status;

        cl_qlock_pool_construct(&pool);

        status = cl_qlock_pool_init(&pool, POOL_MIN_SIZE, 0, POOL_GROW_SIZE,
                                    sizeof(struct item), NULL, NULL, NULL);
        if (status != CL_SUCCESS)
                return -1;

        for (i = 0 ; i < N_TESTS; i++)
                if (pool_get_and_put_items(&pool, 1000000))
                        return -1;

        for (i = 0 ; i < N_TESTS; i++) {
                if (pool_get_and_put_items(&pool, 1000000))
                        return -1;
                for (j = 0; j < N_TESTS; j++)
                        if (pool_get_and_put_items(&pool, 1000000))
                                return -1;
        }

        cl_qlock_pool_destroy(&pool);

        return 0;
}

int main(void)
{
        int ret = test_pool();

        return ret;
}


And I got these typical numbers:

* with cl_qlock_pool:

real    0m0.541s
user    0m0.488s
sys     0m0.056s

* with cl_qpool:

real    0m0.350s
user    0m0.288s
sys     0m0.060s

cl_qpool is much faster; this is expected, since the locking cycle is
skipped there (see the sketch after the numbers).

* with regular malloc/free:

real    0m0.292s
user    0m0.216s
sys     0m0.072s

And this one is the *fastest*.
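
For reference, here is roughly what that skipped locking cycle looks like:
cl_qlock_pool_get() is essentially just a locked wrapper around
cl_qpool_get(). This is paraphrased from memory of complib/cl_qlockpool.h,
so the exact member names may differ:

static inline cl_pool_item_t *cl_qlock_pool_get(cl_qlock_pool_t * const p_pool)
{
        cl_pool_item_t *p_item;

        /* every get (and put) pays for a spinlock acquire/release */
        cl_spinlock_acquire(&p_pool->lock);
        p_item = cl_qpool_get(&p_pool->pool);
        cl_spinlock_release(&p_pool->lock);
        return p_item;
}

cl_qlock_pool_put() is the same kind of wrapper around cl_qpool_put(), so
the gap between the first two results above is essentially the cost of the
(uncontended) spinlock.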

In this test I used various numbers for the subsequent test cycles and
different optimization flags - the ratios between the numbers stayed similar.

This shows that regular malloc/free is the fastest allocator here: when it
is used, no locking is required (all allocations are per individual
request), and it is nearly twice as fast as the current cl_qlock_pool
(more than twice, looking at the user time).
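
For comparison with the locked pool pattern, the converted per-request
version would look roughly like this. Again, this is only an illustrative
sketch with made-up names, not a patch against actual OpenSM code:

#include <stdlib.h>
#include <string.h>

typedef struct sa_record {
        char data[64];          /* payload only; no cl_pool_item_t needed */
} sa_record_t;

/* Each request allocates and frees its own records, so there is no
 * shared allocator state and therefore nothing to lock. */
static sa_record_t *record_get(void)
{
        sa_record_t *rec = malloc(sizeof(*rec));

        if (rec)
                memset(rec, 0, sizeof(*rec));
        return rec;
}

static void record_put(sa_record_t *rec)
{
        free(rec);
}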

The obvious question is: why not convert away from cl_qlock_pool? Or are
there some holes in the test? Any thoughts?

Sasha