> When the program runs by itself, all the malloc calls are successful.
> However when I run it with valgrind's memcheck or massif tools (v
> 3.6.0), a malloc call fails (which is trying to allocate around 6.4 Gb).

Which Linux distribution, which Linux kernel ("uname -a"), and which
C runtime library ("ls -l /lib*/libc.so*") are you running?  There are
various policies (such as automatic huge pages in some kernels) and
various algorithms (such as the malloc implementations in glibc) which
might matter.

> ==808== Warning: set address range perms: large range [0x5b5c5040,
> 0x3564cd040) (undefined)
> ==808== Warning: set address range perms: large range [0x40d985040,
> 0x70888d040) (undefined)

Such warnings from valgrind are expected: the sizes are rather large,
and sometimes such large sizes are clues to errors.  Note that there
are two of them, corresponding to the two malloc() calls which did
succeed.  The address intervals are the ranges returned by malloc().
Immediately after a successful malloc(), the contents of the region
are uninitialized ("undefined").

> pde_alloc: sparse matrix allocs failed: Success: nzval = 0x40d985040,
> nzcol = (nil), rowptr = 0x58e9a6e0

That line above is your error message.  It would be MUCH better if you
listed the values in the same order as the calls to malloc().

> ==808== Warning: set address range perms: large range [0x5b5c5030,
> 0x3564cd050) (noaccess)
> ==808== Warning: set address range perms: large range [0x40d985030,
> 0x70888d050) (noaccess)

Those two large ranges must have been reported _after_ the snippet
below had run (your error message precedes them).  The difference
between "undefined" and "noaccess" is one clue.  However, notice that
each "noaccess" range extends 16 bytes past the corresponding
"undefined" range on both ends.  Hmmm....

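The arithmetic on the four ranges quoted above can be checked with a few lines of C. This is only a sketch: the struct and function names are made up for illustration, and the addresses are copied from the valgrind warnings as half-open intervals [start, end).

```c
#include <stdint.h>
#include <stdio.h>

/* Half-open address range [start, end), as printed by valgrind.
 * The type and helper names are hypothetical, for illustration only. */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Number of bytes covered by the range. */
static uint64_t range_size(struct range r)
{
    return r.end - r.start;
}

/* How far 'outer' starts below 'inner'. */
static uint64_t margin_below(struct range inner, struct range outer)
{
    return inner.start - outer.start;
}

/* How far 'outer' ends above 'inner'. */
static uint64_t margin_above(struct range inner, struct range outer)
{
    return outer.end - inner.end;
}

/* Print the size of the "undefined" range and the extra bytes that the
 * matching "noaccess" range covers on each side. */
static void describe(struct range undefined, struct range noaccess)
{
    printf("size = %llu bytes, noaccess extends %llu below, %llu above\n",
           (unsigned long long)range_size(undefined),
           (unsigned long long)margin_below(undefined, noaccess),
           (unsigned long long)margin_above(undefined, noaccess));
}
```

Feeding in the two pairs from the log, both "undefined" ranges come out at 12800000000 bytes, which equals 40000 * 40000 * sizeof(double), and each "noaccess" range is 16 bytes wider at both ends.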
>
> The corresponding code snippet is:
>
>   w->rowptr = malloc((PDE_MAT_SIZE2 + 1) * sizeof(int));
>   w->nzval = malloc(PDE_MAT_SIZE1 * PDE_MAT_SIZE2 * sizeof(double));
>   w->nzcol = malloc(PDE_MAT_SIZE1 * PDE_MAT_SIZE2 * sizeof(int));
>   if (w->nzval == 0 || w->nzcol == 0 || w->rowptr == 0)
>     {
>       fprintf(stderr, "pde_alloc: sparse matrix allocs failed: %s: nzval = %p, nzcol = %p, rowptr = %p\n",
>               strerror(errno), w->nzval, w->nzcol, w->rowptr);
>       pde_free(w);
>       return 0;
>     }
>
> In this code, PDE_MAT_SIZE1 = PDE_MAT_SIZE2 = 40000. Therefore the
> 'nzval' call is allocating 12.8Gb and the 'nzcol' is allocating 6.4 Gb.
> (sizeof(double) = 8, sizeof(int) = 4)
>
> The machine has 50Gb of ram and both of these calls are successful
> without using valgrind.

How much paging space ("swapon -s") does the machine have?

> Furthermore, it appears that the valgrind malloc function does not set
> errno to the appropriate value, since the output says "Success" even
> when the pointer is null.

You didn't check immediately after each malloc().  This makes it harder
to figure out whether errno should be valid.  If the last call to
malloc() [or anything else which _might_ set errno] did succeed, then
the value of errno might not correspond to the last failure of malloc().
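Checking right after each call might look something like the sketch below. The helper name checked_malloc and its message format are assumptions for illustration, not part of the original program. It saves errno before calling fprintf(), because fprintf() may itself change errno.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate count * elem_size bytes and report a
 * failure at once, while errno still belongs to this malloc() call.
 * Note: a request for 0 bytes may legitimately return NULL. */
static void *checked_malloc(const char *name, size_t count, size_t elem_size)
{
    void *p;

    /* Refuse requests whose byte count would overflow size_t. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        fprintf(stderr, "pde_alloc: %s: size overflows size_t\n", name);
        errno = ENOMEM;
        return NULL;
    }

    errno = 0;
    p = malloc(count * elem_size);
    if (p == NULL) {
        int saved = errno;  /* capture before fprintf() can overwrite it */
        fprintf(stderr, "pde_alloc: %s: malloc(%zu bytes) failed: %s\n",
                name, count * elem_size, strerror(saved));
        errno = saved;      /* leave errno describing the malloc() failure */
    }
    return p;
}
```

Calling it three times in the order rowptr, nzval, nzcol, and returning on the first NULL, would make each message name the allocation that failed and would keep the reports in the same order as the calls.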