[email protected] wrote: > Full_Name: Scott Ireland > Version: 2.4.23 > OS: RHEL 6.4 > URL: ftp://ftp.openldap.org/incoming/ > Submission from: (NULL) (207.34.120.71) > > > A 2-server, multi-master replication setup appears to deadlock during sync. > After a period of time (sometimes hours), both servers will lock up at the > same > time and fail to handle requests (although they do accept TCP connections). > This only occurs when the sync occurs with TLS (either StartTLS or ldaps); > when > TLS is not used the sync runs without incident. Stopping/restarting one of > the > servers clears the lockup on the other server. Sometimes the issue can be > triggered immediately when a change is made to the database (in which case the > client does not see the operation complete), but will also occur on its own as > periodic replication checks occur.
You're running a very old release, current is 2.4.35. You're running a Red Hat customized build, contact Red Hat for support. Closing this ITS. > Typical log (level 0x4100) before the freeze: > > Jul 25 11:09:43 cd1911-o99 slapd[5789]: conn=1380 fd=23 ACCEPT from > IP=redacted:48388 (IP=redacted:389) > Jul 25 11:09:43 cd1911-o99 slapd[5789]: conn=1380 op=0 EXT > oid=1.3.6.1.4.1.1466.20037 > Jul 25 11:09:43 cd1911-o99 slapd[5789]: conn=1380 op=0 STARTTLS > Jul 25 11:09:43 cd1911-o99 slapd[5789]: conn=1380 op=0 RESULT oid= err=0 text= > [freezes here] > > GDB backtrace (this lock was triggered by a change in cn=config): > > (gdb) info threads > 5 Thread 0x7f7eeb3d0700 (LWP 7048) 0x00007f7eee227f03 in epoll_wait () > from /lib64/libc.so.6 > 4 Thread 0x7f7ed5028700 (LWP 7049) 0x00007f7eee6eb054 in __lll_lock_wait > () > from /lib64/libpthread.so.0 > 3 Thread 0x7f7ed4827700 (LWP 7050) 0x00007f7eee6eb54d in read () > from /lib64/libpthread.so.0 > 2 Thread 0x7f7ecffff700 (LWP 7051) 0x00007f7eee6e843c in > pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 > * 1 Thread 0x7f7ef0700700 (LWP 7043) 0x00007f7eee6e50ad in pthread_join () > from /lib64/libpthread.so.0 > > (gdb) bt > #0 0x00007f7eee6e50ad in pthread_join () from /lib64/libpthread.so.0 > #1 0x00007f7ef075685c in slapd_daemon () > at ../../../servers/slapd/daemon.c:2842 > #2 0x00007f7ef0742b0e in main (argc=5, argv=<value optimized out>) > at ../../../servers/slapd/main.c:961 > > (gdb) thread 2 > [Switching to thread 2 (Thread 0x7f7ecffff700 (LWP 7051))]#0 > 0x00007f7eee6e843c > in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 > (gdb) bt > #0 0x00007f7eee6e843c in pthread_cond_wait@@GLIBC_2.3.2 () > from /lib64/libpthread.so.0 > #1 0x00007f7ef085c6fb in handle_pause (tpool=<value optimized out>, > do_pause=1) at ../../../libraries/libldap_r/tpool.c:757 > #2 0x00007f7ef074a53c in config_back_modify (op=0x7f7ec81adaa0, > rs=0x7f7ecfffe920) at ../../../servers/slapd/bconfig.c:5677 > #3 0x00007f7ef07c8977 in overlay_op_walk (op=0x7f7ec81adaa0, > rs=0x7f7ecfffe920, which=op_modify, oi=0x7f7ef1a644c0, on=0x0) > at ../../../servers/slapd/backover.c:669 > #4 0x00007f7ef07c93cb in over_op_func (op=0x7f7ec81adaa0, > rs=<value optimized out>, which=<value optimized out>) > at ../../../servers/slapd/backover.c:721 > #5 0x00007f7ef0775ece in fe_op_modify (op=0x7f7ec81adaa0, rs=0x7f7ecfffe920) > at ../../../servers/slapd/modify.c:301 > #6 0x00007f7ef077685c in do_modify (op=0x7f7ec81adaa0, rs=0x7f7ecfffe920) > at ../../../servers/slapd/modify.c:175 > #7 0x00007f7ef075c9f9 in connection_operation (ctx=0x7f7ecfffeb70, > arg_v=0x7f7ec81adaa0) at ../../../servers/slapd/connection.c:1109 > #8 0x00007f7ef075d34d in connection_read_thread (ctx=0x7f7ecfffeb70, > argv=<value optimized out>) at ../../../servers/slapd/connection.c:1245 > #9 0x00007f7ef085cdd8 in ldap_int_thread_pool_wrapper (xpool=0x7f7ef19957b0) > at ../../../libraries/libldap_r/tpool.c:685 > #10 0x00007f7eee6e4851 in start_thread () from /lib64/libpthread.so.0 > #11 0x00007f7eee22790d in clone () from /lib64/libc.so.6 > > (gdb) thread 3 > [Switching to thread 3 (Thread 0x7f7ed4827700 (LWP 7050))]#0 > 0x00007f7eee6eb54d > in read () from /lib64/libpthread.so.0 > (gdb) bt > #0 0x00007f7eee6eb54d in read () from /lib64/libpthread.so.0 > #1 0x00007f7ef088c67c in sb_debug_read (sbiod=0x7f7ec0174cb0, > buf=0x7f7ec0191130, len=5) at ../../../libraries/liblber/sockbuf.c:829 > #2 0x00007f7ef0882564 in tlsm_PR_Recv (fd=<value optimized out>, > buf=0x7f7ec0191130, len=5, flags=<value optimized out>, > timeout=<value optimized out>) at tls_m.c:3032 > #3 0x00007f7eef6ed1ed in ssl_DefRecv (ss=<value optimized out>, > buf=<value optimized out>, len=5, flags=<value optimized out>) > at ssldef.c:62 > #4 0x00007f7eef6e8480 in ssl3_GatherData (ss=0x7f7ec0190d50, flags=0) > at ssl3gthr.c:59 > #5 ssl3_GatherCompleteHandshake (ss=0x7f7ec0190d50, flags=0) at > ssl3gthr.c:318 > #6 0x00007f7eef6eaed2 in ssl_GatherRecord1stHandshake (ss=0x7f7ec0190d50) > at sslcon.c:1227 > #7 0x00007f7eef6f1135 in ssl_Do1stHandshake (ss=0x7f7ec0190d50) > at sslsecur.c:120 > #8 0x00007f7eef6f296f in SSL_ForceHandshake (fd=<value optimized out>) > at sslsecur.c:390 > #9 0x00007f7ef0882288 in tlsm_session_accept_or_connect ( > session=0x7f7ec0174910, is_accept=<value optimized out>) at tls_m.c:2677 > #10 0x00007f7ef0880e8a in ldap_int_tls_connect (ld=0x7f7ec0107ca0, > conn=<value optimized out>, host=0x7f7ec0170710 "redacted") > at tls2.c:366 > #11 0x00007f7ef088110a in ldap_int_tls_start (ld=0x7f7ec0107ca0, > conn=0x7f7ec0107f10, srv=<value optimized out>) at tls2.c:845 > #12 0x00007f7ef088120e in ldap_start_tls_s (ld=0x7f7ec0107ca0, > serverctrls=0x0, clientctrls=0x0) at tls2.c:938 > #13 0x00007f7ef075269e in slap_client_connect (ldp=0x7f7ec01709f0, > sb=0x7f7ec01707d0) at ../../../servers/slapd/config.c:1921 > #14 0x00007f7ef07c5062 in do_syncrep1 (ctx=<value optimized out>, > arg=0x7f7ec0170b60) at ../../../servers/slapd/syncrepl.c:571 > #15 do_syncrepl (ctx=<value optimized out>, arg=0x7f7ec0170b60) > at ../../../servers/slapd/syncrepl.c:1422 > #16 0x00007f7ef085cdd8 in ldap_int_thread_pool_wrapper (xpool=0x7f7ef19957b0) > at ../../../libraries/libldap_r/tpool.c:685 > #17 0x00007f7eee6e4851 in start_thread () from /lib64/libpthread.so.0 > #18 0x00007f7eee22790d in clone () from /lib64/libc.so.6 > > (gdb) thread 4 > [Switching to thread 4 (Thread 0x7f7ed5028700 (LWP 7049))]#0 > 0x00007f7eee6eb054 > in __lll_lock_wait () from /lib64/libpthread.so.0 > (gdb) bt > #0 0x00007f7eee6eb054 in __lll_lock_wait () from /lib64/libpthread.so.0 > #1 0x00007f7eee6e6388 in _L_lock_854 () from /lib64/libpthread.so.0 > #2 0x00007f7eee6e6257 in pthread_mutex_lock () from /lib64/libpthread.so.0 > #3 0x00007f7ef0882280 in tlsm_session_accept_or_connect ( > session=0x7f7ec4131df0, is_accept=<value optimized out>) at tls_m.c:2674 > #4 0x00007f7ef08812c7 in ldap_pvt_tls_accept (sb=0x7f7ec4140420, > ctx_arg=0x7f7ef1a82e70) at tls2.c:433 > #5 0x00007f7ef075d3bb in connection_read (ctx=0x7f7ed5027b70, argv=0x17) > at ../../../servers/slapd/connection.c:1326 > #6 connection_read_thread (ctx=0x7f7ed5027b70, argv=0x17) > at ../../../servers/slapd/connection.c:1238 > #7 0x00007f7ef085cdd8 in ldap_int_thread_pool_wrapper (xpool=0x7f7ef19957b0) > at ../../../libraries/libldap_r/tpool.c:685 > #8 0x00007f7eee6e4851 in start_thread () from /lib64/libpthread.so.0 > #9 0x00007f7eee22790d in clone () from /lib64/libc.so.6 > > (gdb) thread 5 > [Switching to thread 5 (Thread 0x7f7eeb3d0700 (LWP 7048))]#0 > 0x00007f7eee227f03 > in epoll_wait () from /lib64/libc.so.6 > (gdb) bt > #0 0x00007f7eee227f03 in epoll_wait () from /lib64/libc.so.6 > #1 0x00007f7ef07576f5 in slapd_daemon_task (ptr=<value optimized out>) > at ../../../servers/slapd/daemon.c:2467 > #2 0x00007f7eee6e4851 in start_thread () from /lib64/libpthread.so.0 > #3 0x00007f7eee22790d in clone () from /lib64/libc.so.6 > > -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
