Hi -- This issue with httpd illustrates a difficulty when working with reslists and sub-pools (note that the patches attached are not a complete solution; that's still forthcoming, I hope):
http://issues.apache.org/bugzilla/show_bug.cgi?id=39985 The reslist code itself never creates any sub-pools with apr_pool_create(). It passes the pool given to apr_reslist_create() to the constructor function whenever a resource needs to be created. If the constructor function merely allocates its opaque structure out of the pool, all is (mostly) OK. An internal mutex ensures that two constructors won't try to allocate out of the memory pool at the same time. The destructor doesn't have to do anything related to memory management. As a sidebar, there are two minor issues that still arise from this simple case. First, the caller has to remember not to use the pool they passed to apr_reslist_create() in any other places where it might execute at the same time as a constructor function, since the caller doesn't have access to the reslist's internal mutex. Second, over the lifetime of the reslist, the reslist will gradually leak memory, since there's no way to free the memory allocated in the constructor. The more serious consequence, and the one that pertains to the httpd bug report, occurs if one tries to plug the memory leak. Suppose the constructor function creates a sub-pool, and then allocates its opaque structure out of that sub-pool. The destructor can then call apr_pool_destroy() on the sub-pool and that will free the memory used by the opaque structure. Now the reslist won't leak memory over time (from the constructor, anyway). Assume, though, that a well-behaved program will eventually call apr_pool_destroy() on the pool that was passed to apr_reslist_create(). Now apr_reslist_create() registers an internal cleanup function on this pool, and this cleanup function iterates over all current resources and calls the destructor function on them. But, before the cleanup function runs, apr_pool_destroy() will have recursively called itself on all the sub-pools of the reslist's pool, including all the sub-pools created by the constructor function. Thus when the reslist's cleanup function runs and calls the destructor on a resource, the resource's sub-pool and opaque structure have been freed; the destructor will issue a second call to apr_pool_destroy() on the sub-pool and likely cause the program to crash. I'm just finishing up a proposed solution to this for httpd, but it depends on a fortuitous situation which can't be expected in most well-behaved programs. My immediate thought is that the reslist code could be changed to (a) create a sub-pool in apr_reslist_create(), so that callers don't have to avoid using the pool elsewhere, and (b) creating a sub-pool prior to each invocation of the constructor. These per-resource sub-pools would be passed to the constructor, so it could allocate without having to worry about leaks. Also, the destructor would be registered as a cleanup function on the sub-pool. To destroy a resource, just calling apr_pool_destroy() on the resource's sub-pool would then call the destructor automatically. It would be possible to simplify or remove the main cleanup function, because there would be no need to iterate over all the resources and call the destructor -- this would happen automatically within apr_pool_destroy(). The internal mutex might be able to used less extensively, because you wouldn't need to ensure that two constructor functions couldn't run at the same time; they'd be allocating out of private sub-pools and therefore be thread-safe (at least in this regard). The problem I see with this change is that it would break existing applications, like httpd, that work around the current behaviour. If a program's destructor function already calls apr_pool_destroy() on any sub-pools created in its constructor, this would cause the same kind of double-free as before, because these sub-pools would already have been destroyed before the destructor ran. (The destructor would now be a cleanup function registered on the per-resource sub-pool, the one that was passed to the constructor.) So I'm not sure much can be attempted until, perhaps, APR 2.x. Maybe one can change such things in with a minor version release, but it seems to me that while it might not break strict binary compatibility, it would be a surprise to anyone upgrading to 1.3. Is there anywhere to put suggestions for APR 2.x? Are there other solutions or problems that I haven't seen? Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
