I am a little confused by some behavior. I've been trying to understand some poor performance numbers, and I noticed that if I doubled the preallocation numbers (presently I'm setting d_nz and o_nz to constants; no effort has been made to get the numbers _right_), performance improved dramatically (about 20X).
This isn't so surprising, except for the report from MatGetInfo. Even with the lower preallocation numbers I'm getting 0 in info.mallocs for

    MatInfo info;
    MatGetInfo(pc_A, MAT_LOCAL, &info); // done and reported on each machine individually

So my question is... well, why? If I'm seeing 0 mallocs before doubling the preallocation, shouldn't that mean I've preallocated enough? Or are there some switches I need to use to enable malloc counting?

Also, you can see (according to the same call to MatGetInfo) that I'm wasting a lot of memory:

    // after doubling preallocation
    nz_alloc  6.2704e+07
    nz_used   1.9125e+07
    nz_unneed 4.3579e+07

(these are printouts of info.nz_allocated, info.nz_used and info.nz_unneeded).

Any thoughts? For now memory is not a bottleneck, so I guess I'll be satisfied with guessing big numbers for d_nz and o_nz. Still, I spent a lot of time scratching my head, since guessing higher numbers didn't seem likely to have an effect.

Thanks,
-Andrew
