Satish Balay wrote:
> The whole reason for having PetscMalloc2() PetscMalloc3() etc is to
> reduce the number of calls to malloc() - thus hoping for a performance
> boost.
>
> For one - this usage was triggered by old SunOS boxes [where malloc
> took a long time]. We don't know if this is still true for any of the
> current OSes.
>
> With the increased complexity of managing alignment here - is it still
> worth keeping these merged mallocs? [do we still get any payoff from
> this complexity? - instead of relying directly on malloc() for the
> necessary alignment?]

I agree with Barry that the macros are good even if they always did separate mallocs underneath: it's much easier to see what is allocated together, and I frequently look for the PetscFree6 that must be associated with the PetscMalloc6 I'm looking at.

Here are some benchmarks. In the following, I compare doing four mallocs in each loop (calling malloc directly) with doing a single malloc (and fixing up alignment). If there is nothing between the malloc and free, we get

  malloc4 1000000 loops in 1412541359 cycles, average 1412.541359
  malloc1 1000000 loops in 298137728 cycles, average 298.137728

If there is an extra allocation between malloc and free, we get

  malloc4 1000000 loops in 1667959560 cycles, average 1667.959560
  malloc1 1000000 loops in 538137819 cycles, average 538.137819

Clearly glibc is optimized for freeing the very last thing allocated. Fixing the alignment takes only a couple of cycles. Remarkably, these results are reproducible to 6 significant figures or so.

BTW, should we put PAPI support into PETSc? It no longer needs kernel patches with Linux 2.6.31.

Jed
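For readers unfamiliar with the "single malloc (and fixing up alignment)" approach compared in the benchmarks above, here is a minimal sketch in C. It is neither the PETSc implementation nor the benchmark code from this message: the names sketch_malloc2, sketch_free2 and SKETCH_ALIGN, and the choice of 16-byte alignment, are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical alignment for the second array; not necessarily what PETSc uses. */
#define SKETCH_ALIGN 16

/* One malloc() backing two arrays: n1 doubles followed by n2 ints.
   The first array starts at the address returned by malloc(); the second
   is pushed forward to the next SKETCH_ALIGN boundary.
   Returns 0 on success, 1 if the allocation fails. */
static int sketch_malloc2(size_t n1, double **a, size_t n2, int **b)
{
  size_t    bytes1 = n1 * sizeof(double);
  char     *base;
  uintptr_t addr;
  size_t    pad;

  assert(a && b);
  /* Reserve SKETCH_ALIGN extra bytes so the fix-up below always fits. */
  base = malloc(bytes1 + SKETCH_ALIGN + n2 * sizeof(int));
  if (!base) return 1;

  /* The alignment fix-up: a few integer operations on the address. */
  addr = (uintptr_t)(base + bytes1);
  pad  = (size_t)((SKETCH_ALIGN - addr % SKETCH_ALIGN) % SKETCH_ALIGN);

  *a = (double *)base;
  *b = (int *)(base + bytes1 + pad);
  return 0;
}

/* Both arrays live in one block, so only the first pointer is freed. */
static void sketch_free2(double *a, int *b)
{
  (void)b;
  free(a);
}
```

A caller would pair sketch_malloc2(n1, &a, n2, &b) with sketch_free2(a, b), much like the PetscMalloc6/PetscFree6 pairing mentioned above. The fix-up is only a modulo and an addition, which fits the observation that it costs just a couple of cycles.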
