> On Wed, 5 Dec 2001, Sander Striker wrote:
>
> > Right, ok. The 8192 was a number used in the original pools code,
> > I just ripped it :). The BOUNDARY_SIZE is set to be 4096, which
> > is the size of a page on most systems.
>
> i recall doing some statistics gathering and trying to get a single block
> to handle many common requests ...
>
> > > On a thought from Dean Gaudet, how would performance be helped if
> > > we #define'd apr_pcalloc to be:
> > > #define apr_pcalloc(pool, size) memset(apr_palloc(pool, size), '\0',
> > > size);
> >
> > When out of mem, this will segfault at the point where the
> > apr_pcalloc macro is used.
>
> on linux (and anything else with optimistic memory allocation) you can't
> check for out of memory just by checking for a NULL. segfault is how you
> find out you ran out of memory. this thread has come up many times ;)

Damn, yes! Keep banging my head on my desk every time you repeat it :)

> so it's pretty much impossible to do portable out of memory checks.
>
> (if this is puzzling, think about copy-on-write fork semantics... and how
> a pessimistic OS like solaris requires GB of swap which are never used.)
>
> > Secondly, the size is aligned to the next multiple of 8 bytes within
> > apr_pcalloc, this to the advantage of memset. I wonder if taking
> > the memset out of the function will improve performance. I personally
> > doubt it. Dean, care to enlighten me?
>
> i doubt there are compilers that can figure out that the size has been
> aligned inside the apr_pcalloc function.
>
> whereas in the macro version, structure sizes are always naturally aligned
> for the target processor, and the compiler knows that as long as it has
> the constant.
>
> the macro won't really help performance 'cause the cases where calloc is a
> perf problem are typically because the memset itself is overkill.
> (there's a great example of this in EAPI, or at least there was in the
> EAPI that came with the apache on redhat 6.x... i haven't looked to see if
> it was ever fixed. memset of 8KB of otherwise unused memory for every
> request.)
>
> -dean

So, actually, there isn't much difference between the plain apr_pcalloc
implementation and the apr_pcalloc macro (other than code duplication)?
The real solution seems to be eliminating apr_pcalloc calls where possible
(and sensible). Maybe we can have another quantify run to identify the
most prominent ones left? Brian?

Sander
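
For illustration, the two variants discussed above can be sketched against
a toy pool. This is a minimal sketch and not APR's code: toy_pool,
toy_palloc, toy_pcalloc and TOY_PCALLOC_MACRO are hypothetical names, and
the 8-byte rounding is assumed from the description in this thread. It
shows the function form clearing the rounded-up size while the macro form
clears only the requested size, and that the macro hands whatever the
allocator returns straight to memset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy pool: one fixed buffer, bump allocation, no growth. */
typedef struct toy_pool {
    unsigned char *base; /* start of the backing buffer */
    size_t used;         /* bytes handed out so far (multiple of 8) */
    size_t cap;          /* total size of the backing buffer */
} toy_pool;

/* Round n up to the next multiple of 8 (assumed alignment). */
#define TOY_ALIGN(n) (((size_t)(n) + 7u) & ~(size_t)7u)

/* Returns 0 on success, -1 if the backing malloc fails. */
static int toy_pool_init(toy_pool *p, size_t cap)
{
    p->base = malloc(cap);
    p->used = 0;
    p->cap = cap;
    return p->base != NULL ? 0 : -1;
}

/* Bump-allocate size bytes, rounded up to 8; NULL when the pool is full. */
static void *toy_palloc(toy_pool *p, size_t size)
{
    size_t aligned = TOY_ALIGN(size);
    void *mem;

    if (aligned > p->cap - p->used)
        return NULL;
    mem = p->base + p->used;
    p->used += aligned;
    return mem;
}

/* Function form: rounds the size first, so memset clears the whole
 * rounded-up block, and a failed allocation can be checked for NULL. */
static void *toy_pcalloc(toy_pool *p, size_t size)
{
    void *mem;

    size = TOY_ALIGN(size);
    mem = toy_palloc(p, size);
    if (mem == NULL)
        return NULL;
    memset(mem, 0, size);
    return mem;
}

/* Macro form, as proposed in the thread (without the trailing semicolon).
 * memset sees the caller's size expression, which is often a sizeof
 * constant. Caveats: size is evaluated twice, and a NULL from toy_palloc
 * goes straight into memset, which is undefined behaviour. */
#define TOY_PCALLOC_MACRO(p, size) memset(toy_palloc((p), (size)), 0, (size))
```

With a compile-time constant such as sizeof(struct foo) the macro lets the
compiler see the memset length at the call site. Either way the memset
still runs on every call, so neither form helps when the zeroing itself is
the overkill.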
