On 10.05.2011 22:36, Rainer Jung wrote:
On 10.05.2011 22:03, Rainer Jung wrote:
On 10.05.2011 20:57, Jim Jagielski wrote:

On May 10, 2011, at 2:46 PM, Rainer Jung wrote:

On 10.05.2011 14:30, Jim Jagielski wrote:
Once Jeff applies his hook-probes patch, I'll be doing the
T&R within the next few hours.

On May 9, 2011, at 3:18 PM, Jim Jagielski wrote:

I plan on doing a T&R tomorrow...

I notice strange trunk failures on my Solaris 10 system. The failures
were already happening before the probe changes. The Perl script
RewriteMap process crashes shortly after the fork. In truss I can see
it closing file descriptors after the fork and then it crashes before
calling exec or similar. So something around apr_proc_create() seems
to go wrong, or possibly the apr_procattr are not right.

It doesn't happen on Solaris 8, so it is possible my system is
borked. It also doesn't happen for 2.2.x.

I'll try to investigate further, but if there is no immediate idea
about that, I'm fine with rolling the beta, because it is not clear
whether I have enough time available right now to debug.

Do the APR tests run cleanly?

Unfortunately yes, at least most of the time. The proc tests never
failed. I added debug output to apr_proc_create(), the crash happens in

apr_pool_cleanup_for_exec();

Digging further shows that the crash happens while running the child cleanups
for the pconf pool (in the 9th cleanup). Maybe it is related to the
testreslist failures, because some of them happen in
apr_pool_cleanup_kill. Just a wild speculation.

I will try to stop the process before the crash and investigate with the
debugger. Unfortunately the core, if written, doesn't seem usable.

child_cleanup_fn is NULL in a cleanup whose plain_cleanup_fn equals
apr_ldap_pool_cleanup_set_null. Getting closer. At least it is not
implausible, because my builds for Solaris 8 and 10 differ precisely in
their LDAP behavior.
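
To make the finding above concrete, here is a minimal, self-contained model of a cleanup list. It is not APR's code: the struct, the field names and both functions are hypothetical stand-ins, and the only assumption carried over is that each cleanup entry holds a plain callback and a child callback. The helper reports the position of the first entry whose child callback is NULL, which is roughly the check done by hand in the debugger.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success. */
typedef int (*cleanup_fn)(void *data);

/* Hypothetical stand-in for one pool cleanup entry: one callback for
 * normal pool destruction, one for the forked child before exec. */
struct cleanup_entry {
    struct cleanup_entry *next;
    void *data;
    cleanup_fn plain_fn;
    cleanup_fn child_fn;
};

/* Return the 1-based position of the first entry whose child callback
 * is NULL, or 0 if every child callback is safe to call. */
int first_null_child_cleanup(const struct cleanup_entry *head)
{
    int pos = 1;
    for (const struct cleanup_entry *c = head; c != NULL; c = c->next, pos++) {
        assert(c->plain_fn != NULL); /* this model expects a plain callback */
        if (c->child_fn == NULL) {
            return pos;
        }
    }
    return 0;
}

/* Callback that does nothing, used to fill the slots in the model. */
int noop_cleanup(void *data)
{
    (void)data;
    return 0;
}
```

Walking a list built this way returns the index of the offending entry, so a "9th cleanup" style result points straight at the registration that supplied the NULL.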

Maybe related to log line

[Tue May 10 22:33:08.626119 2011] [ldap:info] [pid 25137] LDAP: SSL
support unavailable: LDAP: ldapssl_client_init() failed.

maybe not ...

Investigating further.

At least one reason in apr-util: File ldap/apr_ldap_rebind.c contains:

/* APR utility routine used to create the xref_lock. */
APU_DECLARE_LDAP(apr_status_t) apr_ldap_rebind_init(apr_pool_t *pool)
{
    apr_status_t retcode = APR_SUCCESS;

#ifdef NETWARE
    get_apd
#endif

    /* run after apr_thread_mutex_create cleanup */
    apr_pool_cleanup_register(pool, &apr_ldap_xref_lock,
                              apr_ldap_pool_cleanup_set_null, NULL);

#if APR_HAS_THREADS
    if (apr_ldap_xref_lock == NULL) {
        retcode = apr_thread_mutex_create(&apr_ldap_xref_lock,
                                          APR_THREAD_MUTEX_DEFAULT, pool);
    }
#endif

    return(retcode);
}


The call

apr_pool_cleanup_register(pool, &apr_ldap_xref_lock, apr_ldap_pool_cleanup_set_null, NULL);

registers NULL as the child cleanup function, which will always crash, because the child cleanup functions are called unconditionally in apr_pool.
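
The usual way to avoid this in APR is to pass a no-op such as apr_pool_cleanup_null as the child cleanup instead of NULL. The sketch below illustrates the idea with a toy pool that does not depend on APR. Every name starting with toy_ is hypothetical, and rejecting NULL at registration time is a choice made for this sketch, not something apr_pool_cleanup_register() is claimed to do.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cleanup_fn)(void *data);

struct toy_cleanup {
    void *data;
    cleanup_fn plain_fn;
    cleanup_fn child_fn;
};

#define TOY_MAX_CLEANUPS 16

/* Toy pool: just a fixed-size stack of cleanup entries. */
struct toy_pool {
    struct toy_cleanup entries[TOY_MAX_CLEANUPS];
    size_t count;
};

/* No-op callback, playing the role of apr_pool_cleanup_null(). */
int toy_cleanup_null(void *data)
{
    (void)data;
    return 0;
}

/* Plain cleanup that resets the pointer it is handed, which is what the
 * name apr_ldap_pool_cleanup_set_null suggests (an assumption here). */
int toy_cleanup_set_null(void *data)
{
    *(void **)data = NULL;
    return 0;
}

/* Register a cleanup. Returns -1 for a NULL callback or a full pool,
 * so the mistake shows up at registration instead of after fork. */
int toy_cleanup_register(struct toy_pool *p, void *data,
                         cleanup_fn plain_fn, cleanup_fn child_fn)
{
    if (plain_fn == NULL || child_fn == NULL || p->count == TOY_MAX_CLEANUPS) {
        return -1;
    }
    p->entries[p->count++] = (struct toy_cleanup){ data, plain_fn, child_fn };
    return 0;
}

/* Call every child callback unconditionally, newest first, and empty
 * the pool. Returns the number of callbacks that were run. */
size_t toy_run_child_cleanups(struct toy_pool *p)
{
    size_t ran = 0;
    while (p->count > 0) {
        struct toy_cleanup *c = &p->entries[--p->count];
        assert(c->child_fn != NULL); /* guaranteed by toy_cleanup_register */
        c->child_fn(c->data);
        ran++;
    }
    return ran;
}
```

With the no-op in the child slot, running the child cleanups leaves the registered pointer untouched and nothing dereferences NULL. A registration that passes NULL for the child callback is refused up front.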

I will check all apr_pool_cleanup_register() calls in apr, apr-util and httpd for similar occurrences...

Regards,

Rainer
