Thanks. Indeed, we wrote an analogous test program and came to the same conclusion: the memset() call slows things down for certain sizes of the requested memory.

On 03/20/2014 09:21 PM, Hongjia Cao wrote:

I wrote a simple test program test.c:

gcc -I/home/hjcao/work/slurm/ /home/hjcao/work/build-slurm/src/api/libslurm.o -lpthread -ldl test.c

# ./a.out
a.out: xmalloc usecs: 28530
a.out: malloc usecs: 5


the code:
==============================================================================================

#include <stdlib.h>
#include <string.h>

#include "src/common/xstring.h"
#include "src/common/xmalloc.h"
#include "src/common/macros.h"
#include "src/common/hostlist.h"
#include "src/common/timers.h"

#define COUNT 64
#define MAX_RANGES (64*1024)
struct _range {
        unsigned long lo, hi;
        int width;
};


int main(int argc, char *argv[])
{
        char *ptr = NULL;
        int i;
        DEF_TIMERS;

        START_TIMER;
        for (i = 0; i < COUNT; i++) {
                ptr = xmalloc(MAX_RANGES * sizeof(struct _range));
                xfree(ptr);
        }
        END_TIMER;
        info("xmalloc usecs: %lld", delta_t);


        START_TIMER;
        for (i = 0; i < COUNT; i++) {
                ptr = malloc(MAX_RANGES * sizeof(struct _range));
                free(ptr);
        }
        END_TIMER;
        info("malloc usecs: %lld", delta_t);


        return 0;
}


On Thu, 2014-03-20 at 10:42 -0700, David Bigagli wrote:
Hi,
     do you have any data comparing the performance of malloc() versus
xmalloc()? If you suspect the problem is in memset() it would be
interesting to see the performance of xmalloc() without it.

Is there any side effect of this patch since the memory is no longer
initialized to 0 if using malloc()?

On 03/19/2014 10:22 PM, Hongjia Cao wrote:
In commit 799a753be0c116a6117b0afa84f971fc2bdb9d87 the hostlist range
count was expanded and xmalloc/xfree is used to allocate memory for the
_range structure data. xmalloc is slow because it sets the allocated
memory to 0: the size of the _range array will be 12 * 64 * 1024 = 768K bytes.
Replacing xmalloc/xfree with malloc/free improves the performance
notably.

I found this while investigating the slowness of sinfo after upgrading
from 2.6.0 to 2.6.6. The multithread patch (commit
17449c066af69441b741110ef51fc2f534272871) does not help. Replacing
hostlist_push with hostlist_push_host (commit
1b0b135f9579e253ddd5bf680d2ea70ad12f9bda) fixes the problem in sinfo,
but I think the root cause is in xmalloc.




--

Thanks,
      /David/Bigagli

www.schedmd.com
