> When the program runs by itself, all the malloc calls are successful. 
> However when I run it with valgrind's memcheck or massif tools (v 
> 3.6.0), a malloc call fails (which is trying to allocate around 6.4 Gb). 

Which Linux distribution, which Linux kernel ("uname -a"), and which
C runtime library ("ls -l /lib*/libc.so*") are you running?
There are various policies (such as automatic huge pages in some kernels)
and various algorithms (such as malloc implementations in glibc)
which might matter.


> ==808== Warning: set address range perms: large range [0x5b5c5040, 
> 0x3564cd040) (undefined)
> ==808== Warning: set address range perms: large range [0x40d985040, 
> 0x70888d040) (undefined)

Such warnings from valgrind are expected: the sizes are rather large,
and sometimes such large sizes are clues to errors.  Note that there
are two of them, corresponding to the two malloc() calls which did succeed.
The address intervals are the ranges returned by malloc().  Immediately
after a successful malloc(), the contents of the region are uninitialized
("undefined").
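
As a quick check on what those warnings cover, the length of each interval
can be computed directly from the two printed addresses.  This is only a
sketch: range_len() is a name made up here for illustration, not anything
from valgrind or from your program.

```c
#include <assert.h>
#include <stdint.h>

/* Length in bytes of the half-open interval [lo, hi), using the two
 * addresses exactly as they appear in a "set address range perms"
 * warning.  Hypothetical helper, for illustration only. */
uint64_t range_len(uint64_t lo, uint64_t hi)
{
    assert(hi >= lo);   /* the warning prints the start address first */
    return hi - lo;
}

/* Example:
 *   range_len(0x5b5c5040ULL,  0x3564cd040ULL)
 *   range_len(0x40d985040ULL, 0x70888d040ULL)
 */
```

For both "undefined" warnings above this gives 12800000000 bytes, which is
the size your message quotes for the nzval request
(40000 * 40000 * sizeof(double)).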

> pde_alloc: sparse matrix allocs failed: Success: nzval = 0x40d985040, 
> nzcol = (nil), rowptr = 0x58e9a6e0

That line above is your error message.  It would be MUCH better
if you listed the values in the same order as the calls to malloc().

> ==808== Warning: set address range perms: large range [0x5b5c5030, 
> 0x3564cd050) (noaccess)
> ==808== Warning: set address range perms: large range [0x40d985030, 
> 0x70888d050) (noaccess)

Those two large ranges must have happened _after_ the snippet below
(your error message precedes them).
The difference between "undefined" and "noaccess" is one clue.
However, notice that each "noaccess" range extends beyond the corresponding
"undefined" range by 16 bytes on both ends.  Hmmm....
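
The 16 bytes can be read off by subtracting the endpoints of each pair of
warnings.  A small sketch follows; the struct and function names are
invented for this example and do not come from valgrind.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open address interval [lo, hi), as printed in the warnings.
 * Hypothetical type, for illustration only. */
struct addr_range {
    uint64_t lo;
    uint64_t hi;
};

/* Bytes by which `outer` starts before `inner`. */
uint64_t margin_below(struct addr_range outer, struct addr_range inner)
{
    assert(outer.lo <= inner.lo);
    return inner.lo - outer.lo;
}

/* Bytes by which `outer` ends after `inner`. */
uint64_t margin_above(struct addr_range outer, struct addr_range inner)
{
    assert(outer.hi >= inner.hi);
    return outer.hi - inner.hi;
}
```

With the "noaccess" range as outer and the matching "undefined" range as
inner, both functions return 16 for both pairs of warnings.  That would be
consistent with memcheck's red zones around each heap block, although the
numbers alone do not prove it.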

> 
> The corresponding code snippet is:
> 
>        w->rowptr = malloc((PDE_MAT_SIZE2 + 1) * sizeof(int));
>        w->nzval = malloc(PDE_MAT_SIZE1 * PDE_MAT_SIZE2 * sizeof(double));
>        w->nzcol = malloc(PDE_MAT_SIZE1 * PDE_MAT_SIZE2 * sizeof(int));
>        if (w->nzval == 0 || w->nzcol == 0 || w->rowptr == 0)
>          {
>            fprintf(stderr, "pde_alloc: sparse matrix allocs failed: %s: "
>                    "nzval = %p, nzcol = %p, rowptr = %p\n", strerror(errno),
>                    w->nzval, w->nzcol, w->rowptr);
>            pde_free(w);
>            return 0;
>          }
> 
> In this code, PDE_MAT_SIZE1 = PDE_MAT_SIZE2 = 40000. Therefore the 
> 'nzval' call is allocating 12.8Gb and the 'nzcol' is allocating 6.4 Gb. 
> (sizeof(double) = 8, sizeof(int) = 4)
> 
> The machine has 50Gb of ram and both of these calls are successful 
> without using valgrind.

How much paging space ("swapon -s") does the machine have?

> Furthermore, it appears that the valgrind malloc function does not set 
> errno to the appropriate value, since the output says "Success" even 
> when the pointer is null.

You didn't check immediately after each malloc().  This makes it harder
to figure out whether errno should be valid.  If the last call to malloc()
[or anything else which _might_ set errno] did succeed, then the value of
errno might not correspond to the last failure of malloc().
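
One way to make errno trustworthy is to check each result as soon as
malloc() returns and to save errno before calling anything else.  The
following is a minimal sketch, not a drop-in fix: alloc_or_report() is a
hypothetical helper name, and the message text is only modelled on yours.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `size` bytes and, on failure, report
 * immediately, while errno can still only refer to this call.
 * Returns NULL on failure, like malloc(). */
void *alloc_or_report(const char *what, size_t size)
{
    void *p;

    assert(what != NULL);
    errno = 0;              /* a stale value cannot be mistaken for ours */
    p = malloc(size);
    if (p == NULL) {
        int saved = errno;  /* fprintf() itself is allowed to change errno */
        fprintf(stderr, "pde_alloc: malloc(%zu) for %s failed: %s\n",
                size, what, saved ? strerror(saved) : "errno not set");
    }
    return p;
}

/* Usage, in the same order as the original calls:
 *   w->rowptr = alloc_or_report("rowptr", (PDE_MAT_SIZE2 + 1) * sizeof(int));
 *   w->nzval  = alloc_or_report("nzval",
 *                   PDE_MAT_SIZE1 * PDE_MAT_SIZE2 * sizeof(double));
 *   w->nzcol  = alloc_or_report("nzcol",
 *                   PDE_MAT_SIZE1 * PDE_MAT_SIZE2 * sizeof(int));
 */
```

Because each failure is reported by name and size at the point where it
happens, the output also shows which request failed without having to
match pointer values against the order of the calls.  If "errno not set"
is printed, then whichever malloc() was in use did not set errno for that
call.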
