Full_Name: Paul Smith
Version: 2.4.7
OS: Ubuntu 8.04
URL: 
Submission from: (NULL) (65.78.30.67)


I've been trying to track down a crasher in Evolution's Exchange server backend,
and I believe it's a bug in OpenLDAP's libldap.  The core manifests as an
assert() in io.c:ber_flush2(), but by using valgrind I can see that the real
problem is using already-freed memory.

You can see various backtraces and valgrind outputs in the Gnome bug tracking
system here: http://bugzilla.gnome.org/show_bug.cgi?id=512605  Note that the
later stack traces are better as I've installed debugging versions of OpenLDAP
so there are better symbols shown.

In Evolution we have a loop we use to look up global contacts, which looks like
this:

        for (try = 0; try < 2; try++) {
            ldap_error = get_gc_connection (gc, op);
            if (ldap_error != LDAP_SUCCESS)
                return ldap_error;
            ldap_error = ldap_search_ext (gc->priv->ldap, base, scope,
                                          filter, (char **)attrs,
                                          FALSE, NULL, NULL, NULL, 0,
                                          &msgid);
            if (ldap_error == LDAP_SERVER_DOWN)
                continue;
            else if (ldap_error != LDAP_SUCCESS)
                return ldap_error;
        
            ldap_error = gc_ldap_result (gc->priv->ldap, op, msgid, msg);
            if (ldap_error == LDAP_SERVER_DOWN)
                continue;
            else if (ldap_error != LDAP_SUCCESS)
                return ldap_error;
        
            return LDAP_SUCCESS;
        }

where get_gc_connection() eventually calls ldap_ntlm_bind() and where
gc_ldap_result() calls ldap_result().  When the core happens, the first trip
through the for loop here gives an error LDAP_SERVER_DOWN as the return code
from ldap_result().  We go back to the top of the loop, and ldap_ntlm_bind() is
called.  That eventually ends up in ldap_send_initial_request() which invokes
ldap_send_server_request() like this:

        rc = ldap_send_server_request( ld, ber, msgid, NULL,
                NULL, NULL, NULL );

Inside ldap_send_server_request() we need to find an LDAPConn* structure, but
the one passed in is NULL as is the srvlist, so, we use the default connection:

        if ( lc == NULL ) {
            if ( srvlist == NULL ) {
                lc = ld->ld_defconn;

Valgrind reports that the very next line, attempting to access lc->lconn_status,
is referencing freed memory:

        if ( lc != NULL && lc->lconn_status == LDAP_CONNST_CONNECTING ) {

So, basically, we know that ld->ld_defconn is a non-NULL pointer pointing to
already freed memory.  This seems like it is a buggy state.

How did we get here?  In ldap_result() we get into this loop invoking
try_read1msg() (I'm removing the mutex locking for readability, even though it
IS enabled in my version of libldap):

        for ( lc = ld->ld_conns;
            rc == LDAP_MSG_X_KEEP_LOOKING && lc != NULL; )
        {
            if ( lc->lconn_status == LDAP_CONNST_CONNECTED &&
                ldap_is_read_ready( ld, lc->lconn_sb ) )
            {
                rc = try_read1msg( ld, msgid, all, &lc, result );
                if ( lc == NULL ) {
                    /* if lc gets free()'d,
                     * there's no guarantee
                     * lc->lconn_next is still
                     * sane; better restart
                     * (ITS#4405) */
                    lc = ld->ld_conns;

                    /* don't get to next conn! */
                    break;
                }
            }

            /* next conn */
            lc = lc->lconn_next;
        }

Inside try_read1msg() we call "tag = ber_get_next( lc->lconn_sb, &len, ber );"
and if the tag we get back is LBER_DEFAULT, then we declare an error and we free
the connection and set the pointer to NULL:

        case LBER_DEFAULT:
                ld->ld_errno = LDAP_SERVER_DOWN;
                ldap_free_connection( ld, lc, 1, 0 );
                lc = *lcp = NULL;
                return -1;

If you check ldap_free_connection() you'll see that it removes the LDAPConn
pointer "lc" from the list of connections before it is freed.

BUT!  The ldap_free_connection() function never does anything with the
ld->ld_defconn pointer, so if the connection we just freed is the one pointed to
by ld->ld_defconn, it is now pointing to freed memory.  And that causes the
problem detected above by valgrind, or causing an assert later on: accessing
freed memory.

I'm not really sure what the right thing to do here is, or I'd provide a patch. 
Should we set ld_defconn to NULL?  Is that ever a valid state?  Or should we
just pick another connection from the list (and what if there isn't one?)


Reply via email to