On May 6, 2010, at 1:38 PM, McLaughlin, Michael P. wrote:
> The reason, in my case, is almost certainly malloc contention (since threads share memory space). Each of my subtasks calls malloc more than a million times even for an average run. These are mostly dynamic allocations of vectors and matrices as well as smaller numbers of lists, sets and maps.
Using a custom allocator should give exactly the same benefits as multiple processes, then. You just need to make sure there is no lock contention when allocating memory. Allocators like tcmalloc do this by giving each thread a private cache of free space. (Safari uses tcmalloc for this reason.) A simpler way is to use a single-threaded allocator and just give every thread its own heap. Or if the tasks allocate memory but don't free any of it until they finish, a simple arena-based allocator will work well and is about as fast as you can possibly get.
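To make the arena idea concrete, here is a minimal sketch in C++. The class name Arena and its methods are hypothetical, invented for illustration; it assumes each thread creates and uses its own instance, so no locking is needed, and that nothing is freed individually before the task finishes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical bump-pointer arena. Not thread-safe by design:
// the assumption is one Arena per thread, so there is no lock to contend on.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : base_(static_cast<char*>(std::malloc(capacity))),
          capacity_(capacity),
          used_(0) {
        if (!base_) throw std::bad_alloc();
    }

    // All allocations are released together when the arena goes away.
    ~Arena() { std::free(base_); }

    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Returns size bytes aligned to align (which must be a power of two),
    // or nullptr if the arena is full. There is no per-block free.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::uintptr_t cur = reinterpret_cast<std::uintptr_t>(base_) + used_;
        std::uintptr_t aligned =
            (cur + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        std::size_t padding = static_cast<std::size_t>(aligned - cur);
        std::size_t remaining = capacity_ - used_;
        if (size > remaining || padding > remaining - size) return nullptr;
        used_ += padding + size;
        return reinterpret_cast<void*>(aligned);
    }

    // Discards every allocation at once so the block can be reused.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    char* base_;
    std::size_t capacity_;
    std::size_t used_;
};
```

Each worker would construct its own Arena at the start of a subtask, call allocate() wherever it currently calls malloc, and let the destructor (or reset()) release everything at the end. Allocation is just a pointer bump, which is why this is about as fast as it gets. The fixed capacity is a simplification; a real version would probably chain additional blocks when the first one fills up.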
--Jens
