Hi,

IMHO there's one serious problem with the way we currently
clear or destroy a pool with respect to cleanup callbacks.

As always, these problems are visible in threaded applications
only ;)

The core of the problem is that a pool's child pools are
destroyed before the pool's own cleanups have been run.

apr_pool_destroy/clear {
  for each child in child_pools run
    apr_pool_destroy child

  run our own cleanups
  run our process cleanups
  ...
}

This ordering makes a few things impossible to do without causing core dumps.

1. If we are in a blocking APR function (apr_socket_accept, for example)
   with our own pool that is a child of the pool being cleared or
   destroyed, we cannot trust that our local data still points to valid
   memory allocated from our child pool once the accept is broken by
   the parent pool's destroy.
2. If we have multiple threads spawning processes from their own pools,
   we cannot register cleanup_for_exec on the parent pool, because it
   will core dump in free_proc_chain. Again, if the process was created
   from a child pool, free_proc_chain will reference deallocated memory.
   The only solution is to register the cleanup_for_exec in the child
   pool, and that leads to multiple 3 second delays (the very thing
   cleanup_for_exec was supposed to deal with).
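
The ordering behind both cases can be reproduced in a toy model. This is
only a sketch: toy_pool, memory_valid and the other names are made up for
illustration and are not the real apr_pool_t internals. A flag stands in
for "memory has been returned to the allocator", so nothing is actually
dereferenced after free.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a pool; NOT the real apr_pool_t. */
typedef struct toy_pool {
    struct toy_pool *child;                  /* single child, for brevity */
    void (*cleanup)(struct toy_pool *self);  /* one cleanup, for brevity */
    int memory_valid;                        /* 1 while the pool's memory is usable */
    int saw_dead_child;                      /* set by parent_cleanup below */
} toy_pool;

/* A parent cleanup that needs data allocated from the child pool, in the
 * same spirit as free_proc_chain touching a process created from a child. */
static void parent_cleanup(toy_pool *self)
{
    if (self->child != NULL && !self->child->memory_valid)
        self->saw_dead_child = 1;  /* real code would touch freed memory here */
}

/* Current ordering: destroy the children first, then run our own cleanups. */
static void toy_destroy_current(toy_pool *p)
{
    if (p->child != NULL)
        toy_destroy_current(p->child);
    if (p->cleanup != NULL)
        p->cleanup(p);
    p->memory_valid = 0;           /* models giving the memory back */
}

/* Builds a parent/child pair, destroys it, and reports whether the parent's
 * cleanup ran after the child's memory was already gone. */
static int toy_demo_current_order(void)
{
    toy_pool child  = { NULL, NULL, 1, 0 };
    toy_pool parent = { &child, parent_cleanup, 1, 0 };

    toy_destroy_current(&parent);
    assert(child.memory_valid == 0);
    return parent.saw_dead_child;
}
```

In this model toy_demo_current_order() reports that the parent cleanup
observed an already-destroyed child, which is the situation both cases
above run into.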

My proposal is that we change the way apr_pool_clear/destroy operates.

apr_pool_destroy/clear {
  for each child in child_pools run
    run child cleanups recursively

  run our own cleanups

  for each child in child_pools run
    run child process cleanups recursively

  run our process cleanups

  for each child in child_pools run
    apr_pool_destroy child

  ...
}

This way the cleanups are run for all child pools, and their child pools,
down the whole pool chain without deallocating any memory.
Only after all the cleanups have been run does the memory get destroyed,
in the same order.
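
A minimal sketch of the proposed three-phase ordering, again on a toy pool
tree rather than the real APR structures. All toy_* names are hypothetical,
and the "cleanups" only append to a trace buffer so the order can be
inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Toy pool tree node; NOT the real apr_pool_t. */
typedef struct toy_pool {
    const char *name;
    struct toy_pool *children[TOY_MAX_CHILDREN];
    size_t nchildren;
} toy_pool;

static char toy_trace[256];

/* Append "what(name) " to the trace, never writing past the buffer. */
static void toy_log(const char *what, const toy_pool *p)
{
    size_t used = strlen(toy_trace);
    snprintf(toy_trace + used, sizeof(toy_trace) - used, "%s(%s) ", what, p->name);
}

/* Phase 1: ordinary cleanups, children first, no memory released. */
static void toy_run_cleanups(toy_pool *p)
{
    for (size_t i = 0; i < p->nchildren; i++)
        toy_run_cleanups(p->children[i]);
    toy_log("cleanup", p);
}

/* Phase 2: process cleanups, children first, no memory released. */
static void toy_run_process_cleanups(toy_pool *p)
{
    for (size_t i = 0; i < p->nchildren; i++)
        toy_run_process_cleanups(p->children[i]);
    toy_log("proc", p);
}

/* Phase 3: only now release memory, children first. */
static void toy_free_memory(toy_pool *p)
{
    for (size_t i = 0; i < p->nchildren; i++)
        toy_free_memory(p->children[i]);
    toy_log("free", p);
    p->nchildren = 0;
}

static void toy_destroy_proposed(toy_pool *p)
{
    toy_run_cleanups(p);
    toy_run_process_cleanups(p);
    toy_free_memory(p);
}

/* Destroys a three-level chain and returns the recorded order of events. */
static const char *toy_demo_proposed_order(void)
{
    toy_pool grandchild = { "grandchild", { NULL }, 0 };
    toy_pool child      = { "child",  { &grandchild }, 1 };
    toy_pool parent     = { "parent", { &child }, 1 };

    toy_trace[0] = '\0';
    toy_destroy_proposed(&parent);
    return toy_trace;
}
```

In the trace every cleanup(...) and proc(...) entry, including the
parent's, appears before the first free(...) entry, so no cleanup in this
model can observe memory that has already been released.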
We can even add two new functions
apr_pool_run_cleanups(apr_pool_t *)
and
apr_pool_run_cleanup_for_exec(apr_pool_t *)

that will recursively run all the cleanups and remove them from the cleanup
chains. After that call, a multithreaded app can call a join and then
safely call clear/destroy, assured that all blocking calls have
exited.
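
A rough sketch of that shutdown sequence, using plain pthreads instead of
APR so it stands on its own. Everything named toy_* is hypothetical: a
condition-variable wait stands in for a blocking call such as
apr_socket_accept, toy_cleanup_wake plays the role of a cleanup that the
proposed apr_pool_run_cleanups would run, and malloc/free stand in for the
worker's child pool.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* State the worker would allocate from its own child pool. */
typedef struct toy_worker {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    int shutdown;        /* set by the cleanup to break the blocking wait */
    int exited_cleanly;  /* set by the worker just before it returns */
} toy_worker;

/* Stand-in for a thread sitting in a blocking call. */
static void *toy_worker_main(void *arg)
{
    toy_worker *w = arg;

    pthread_mutex_lock(&w->lock);
    while (!w->shutdown)
        pthread_cond_wait(&w->wake, &w->lock);
    w->exited_cleanly = 1;  /* touches its own data after waking up */
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* The "cleanup": interrupts the blocking call but releases nothing. */
static void toy_cleanup_wake(toy_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->shutdown = 1;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
}

/* Run cleanups, join, and only then release the worker's memory.
 * Returns 1 if the worker left its blocking call with valid data. */
static int toy_shutdown_sequence(void)
{
    toy_worker *w = malloc(sizeof(*w));
    pthread_t tid;
    int ok;

    if (w == NULL)
        return 0;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wake, NULL);
    w->shutdown = 0;
    w->exited_cleanly = 0;

    if (pthread_create(&tid, NULL, toy_worker_main, w) != 0) {
        pthread_cond_destroy(&w->wake);
        pthread_mutex_destroy(&w->lock);
        free(w);
        return 0;
    }

    toy_cleanup_wake(w);             /* step 1: cleanups only, memory intact */
    pthread_join(tid, NULL);         /* step 2: wait until the worker is out */
    assert(w->shutdown == 1);
    ok = w->exited_cleanly;

    pthread_cond_destroy(&w->wake);  /* step 3: now it is safe to destroy */
    pthread_mutex_destroy(&w->lock);
    free(w);
    return ok;
}
```

The point of the split is step 2: because the cleanup frees nothing, the
worker can still use its own data while it unwinds, and the memory goes
away only after the join has confirmed the thread is gone.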

Does that make any sense?

Regards,
Mladen
