Hello, this seems to be a RHEL/CentOS-specific issue. Please contact Red Hat Support. I think this ITS ticket can be closed.

Regards.

On 03/11/2016 04:01 PM, [email protected] wrote:
> Full_Name: Kevin Sullivan
> Version: 2.4.39-8
> OS: RHEL 6.4
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (38.127.164.138)
>
>
> Setup:
>
> I have two OpenLDAP servers running on separate hosts and replicating via delta
> syncrepl in mirrormode.
>
> Problem:
>
> On rare occasions when one of my servers comes online, both servers will lock up
> and be unresponsive. Any query to either server will time out. The servers will
> stay in this state indefinitely.
>
> Workaround:
>
> Killing either of the servers will resolve this problem.
>
> Observations:
>
> - I don't know how to reproduce this reliably.
> - Both servers have a thread that is stuck in a do_syncrep1() call.
> - Netstat shows that each server has unacknowledged data in one of their
>   socket's receive queue.
>
>
> GDB on host1:
>
> (gdb) info threads
>   18 Thread 0x7f8b7b80d700 (LWP 31620)  0x00007f8ba7fa6f03 in epoll_wait () from /lib64/libc.so.6
>   17 Thread 0x7f8b7b00c700 (LWP 31621)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   16 Thread 0x7f8b7a80b700 (LWP 31622)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   15 Thread 0x7f8b7a00a700 (LWP 31623)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   14 Thread 0x7f8b79809700 (LWP 31625)  0x00007f8ba846a54d in read () from /lib64/libpthread.so.0
>   13 Thread 0x7f8b79008700 (LWP 31626)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   12 Thread 0x7f8b59ffd700 (LWP 25459)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   11 Thread 0x7f8b597fc700 (LWP 10195)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   10 Thread 0x7f8b70b37700 (LWP 28996)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   9 Thread 0x7f8b58ffb700 (LWP 8763)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   8 Thread 0x7f8b4bfff700 (LWP 28614)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   7 Thread 0x7f8b4b7fe700 (LWP 23367)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   6 Thread 0x7f8b4affd700 (LWP 13359)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   5 Thread 0x7f8b4a7fc700 (LWP 29064)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   4 Thread 0x7f8b49ffb700 (LWP 2184)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   3 Thread 0x7f8b497fa700 (LWP 31175)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   2 Thread 0x7f8b48ff9700 (LWP 2316)  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
> * 1 Thread 0x7f8baa8db700 (LWP 31615)  0x00007f8ba84640ad in pthread_join () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007f8ba84640ad in pthread_join () from /lib64/libpthread.so.0
> #1  0x00007f8baa93c969 in slapd_daemon () at ../../../servers/slapd/daemon.c:2929
> #2  0x00007f8baa927733 in main (argc=6, argv=<value optimized out>) at ../../../servers/slapd/main.c:1012
> (gdb) thread 2
> [Switching to thread 2 (Thread 0x7f8b48ff9700 (LWP 2316))]#0  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007f8ba846a054 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00007f8ba8465388 in _L_lock_854 () from /lib64/libpthread.so.0
> #2  0x00007f8ba8465257 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00007f8baa4b96a0 in tlsm_session_accept_or_connect (session=0x7f8b2c104110, is_accept=<value optimized out>) at tls_m.c:2656
> #4  0x00007f8baa4b8787 in ldap_pvt_tls_accept (sb=0x7f8b2c104150, ctx_arg=0x7f8bac6f8230) at tls2.c:425
> #5  0x00007f8baa943f23 in connection_read (ctx=0x7f8b48ff8b70, argv=0x2f) at ../../../servers/slapd/connection.c:1372
> #6  connection_read_thread (ctx=0x7f8b48ff8b70, argv=0x2f) at ../../../servers/slapd/connection.c:1284
> #7  0x00007f8baa490a98 in ldap_int_thread_pool_wrapper (xpool=0x7f8bac5c8fb0) at ../../../libraries/libldap_r/tpool.c:688
> #8  0x00007f8ba8463851 in start_thread () from /lib64/libpthread.so.0
> #9  0x00007f8ba7fa690d in clone () from /lib64/libc.so.6
> (gdb) thread 14
> [Switching to thread 14 (Thread 0x7f8b79809700 (LWP 31625))]#0  0x00007f8ba846a54d in read () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007f8ba846a54d in read () from /lib64/libpthread.so.0
> #1  0x00007f8baa278b1c in sb_debug_read (sbiod=0x7f8b681def80, buf=0x7f8b6819c830, len=5) at ../../../libraries/liblber/sockbuf.c:829
> #2  0x00007f8baa4b9984 in tlsm_PR_Recv (fd=<value optimized out>, buf=0x7f8b6819c830, len=5, flags=<value optimized out>, timeout=<value optimized out>) at tls_m.c:3007
> #3  0x00007f8ba946c1ed in ?? () from /usr/lib64/libssl3.so
> #4  0x00007f8ba9467480 in ?? () from /usr/lib64/libssl3.so
> #5  0x00007f8ba9469ed2 in ?? () from /usr/lib64/libssl3.so
> #6  0x00007f8ba9470135 in ?? () from /usr/lib64/libssl3.so
> #7  0x00007f8ba947196f in SSL_ForceHandshake () from /usr/lib64/libssl3.so
> #8  0x00007f8baa4b96a8 in tlsm_session_accept_or_connect (session=0x7f8b681b0f20, is_accept=<value optimized out>) at tls_m.c:2658
> #9  0x00007f8baa4b82e2 in ldap_int_tls_connect (ld=0x7f8b681d9250, conn=<value optimized out>) at tls2.c:362
> #10 0x00007f8baa4b8574 in ldap_int_tls_start (ld=0x7f8b681d9250, conn=0x7f8b681b0ba0, srv=<value optimized out>) at tls2.c:860
> #11 0x00007f8baa4b86ce in ldap_start_tls_s (ld=0x7f8b681d9250, serverctrls=0x0, clientctrls=0x0) at tls2.c:1040
> #12 0x00007f8baa937ff6 in slap_client_connect (ldp=0x7f8bac6d35a8, sb=0x7f8bac6d3380) at ../../../servers/slapd/config.c:2012
> #13 0x00007f8baa9ae3f5 in do_syncrep1 (ctx=<value optimized out>, arg=0x7f8bac6cc430) at ../../../servers/slapd/syncrepl.c:613
> #14 do_syncrepl (ctx=<value optimized out>, arg=0x7f8bac6cc430) at ../../../servers/slapd/syncrepl.c:1527
> #15 0x00007f8baa490a98 in ldap_int_thread_pool_wrapper (xpool=0x7f8bac5c8fb0) at ../../../libraries/libldap_r/tpool.c:688
> #16 0x00007f8ba8463851 in start_thread () from /lib64/libpthread.so.0
> #17 0x00007f8ba7fa690d in clone () from /lib64/libc.so.6
> (gdb) thread 18
> [Switching to thread 18 (Thread 0x7f8b7b80d700 (LWP 31620))]#0  0x00007f8ba7fa6f03 in epoll_wait () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00007f8ba7fa6f03 in epoll_wait () from /lib64/libc.so.6
> #1  0x00007f8baa93d9e2 in slapd_daemon_task (ptr=<value optimized out>) at ../../../servers/slapd/daemon.c:2536
> #2  0x00007f8ba8463851 in start_thread () from /lib64/libpthread.so.0
> #3  0x00007f8ba7fa690d in clone () from /lib64/libc.so.6
>
>
> GDB on host2:
>
> (gdb) info threads
>   7 Thread 0x7f5c942d5700 (LWP 3869)  0x00007f5cc0797f03 in epoll_wait () from /lib64/libc.so.6
>   6 Thread 0x7f5c93ad4700 (LWP 3870)  0x00007f5cc0c5b54d in read () from /lib64/libpthread.so.0
>   5 Thread 0x7f5c932d3700 (LWP 3871)  0x00007f5cc0c5b054 in __lll_lock_wait () from /lib64/libpthread.so.0
>   4 Thread 0x7f5c92ad2700 (LWP 3872)  0x00007f5cc0c5843c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
>   3 Thread 0x7f5c922d1700 (LWP 3873)  0x00007f5cc0c5843c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
>   2 Thread 0x7f5c91ad0700 (LWP 3874)  0x00007f5cc0c5843c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> * 1 Thread 0x7f5cc30cc700 (LWP 3868)  0x00007f5cc0c550ad in pthread_join () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007f5cc0c550ad in pthread_join () from /lib64/libpthread.so.0
> #1  0x00007f5cc312d969 in slapd_daemon () at ../../../servers/slapd/daemon.c:2929
> #2  0x00007f5cc3118733 in main (argc=6, argv=<value optimized out>) at ../../../servers/slapd/main.c:1012
> (gdb) thread 2
> [Switching to thread 2 (Thread 0x7f5c91ad0700 (LWP 3874))]#0  0x00007f5cc0c5843c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007f5cc0c5843c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #1  0x00007f5cc2c81af5 in ldap_int_thread_pool_wrapper (xpool=0x7f5cc3d18030) at ../../../libraries/libldap_r/tpool.c:675
> #2  0x00007f5cc0c54851 in start_thread () from /lib64/libpthread.so.0
> #3  0x00007f5cc079790d in clone () from /lib64/libc.so.6
> (gdb) thread 5
> [Switching to thread 5 (Thread 0x7f5c932d3700 (LWP 3871))]#0  0x00007f5cc0c5b054 in __lll_lock_wait () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007f5cc0c5b054 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00007f5cc0c56388 in _L_lock_854 () from /lib64/libpthread.so.0
> #2  0x00007f5cc0c56257 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00007f5cc2caa6a0 in tlsm_session_accept_or_connect (session=0x7f5c841165c0, is_accept=<value optimized out>) at tls_m.c:2656
> #4  0x00007f5cc2ca9787 in ldap_pvt_tls_accept (sb=0x7f5c800008c0, ctx_arg=0x7f5cc3e3b240) at tls2.c:425
> #5  0x00007f5cc3134f23 in connection_read (ctx=0x7f5c932d2b70, argv=0x10) at ../../../servers/slapd/connection.c:1372
> #6  connection_read_thread (ctx=0x7f5c932d2b70, argv=0x10) at ../../../servers/slapd/connection.c:1284
> #7  0x00007f5cc2c81a98 in ldap_int_thread_pool_wrapper (xpool=0x7f5cc3d18030) at ../../../libraries/libldap_r/tpool.c:688
> #8  0x00007f5cc0c54851 in start_thread () from /lib64/libpthread.so.0
> #9  0x00007f5cc079790d in clone () from /lib64/libc.so.6
> (gdb) thread 6
> [Switching to thread 6 (Thread 0x7f5c93ad4700 (LWP 3870))]#0  0x00007f5cc0c5b54d in read () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x00007f5cc0c5b54d in read () from /lib64/libpthread.so.0
> #1  0x00007f5cc2a69b1c in sb_debug_read (sbiod=0x7f5c7c185e10, buf=0x7f5c7c18a7f0, len=5) at ../../../libraries/liblber/sockbuf.c:829
> #2  0x00007f5cc2caa984 in tlsm_PR_Recv (fd=<value optimized out>, buf=0x7f5c7c18a7f0, len=5, flags=<value optimized out>, timeout=<value optimized out>) at tls_m.c:3007
> #3  0x00007f5cc1c5d1ed in ?? () from /usr/lib64/libssl3.so
> #4  0x00007f5cc1c58480 in ?? () from /usr/lib64/libssl3.so
> #5  0x00007f5cc1c5aed2 in ?? () from /usr/lib64/libssl3.so
> #6  0x00007f5cc1c61135 in ?? () from /usr/lib64/libssl3.so
> #7  0x00007f5cc1c6296f in SSL_ForceHandshake () from /usr/lib64/libssl3.so
> #8  0x00007f5cc2caa6a8 in tlsm_session_accept_or_connect (session=0x7f5c7c182100, is_accept=<value optimized out>) at tls_m.c:2658
> #9  0x00007f5cc2ca92e2 in ldap_int_tls_connect (ld=0x7f5c7c100910, conn=<value optimized out>) at tls2.c:362
> #10 0x00007f5cc2ca9574 in ldap_int_tls_start (ld=0x7f5c7c100910, conn=0x7f5c7c10a020, srv=<value optimized out>) at tls2.c:860
> #11 0x00007f5cc2ca96ce in ldap_start_tls_s (ld=0x7f5c7c100910, serverctrls=0x0, clientctrls=0x0) at tls2.c:1040
> #12 0x00007f5cc3128ff6 in slap_client_connect (ldp=0x7f5cc3e22558, sb=0x7f5cc3e22330) at ../../../servers/slapd/config.c:2012
> #13 0x00007f5cc319f3f5 in do_syncrep1 (ctx=<value optimized out>, arg=0x7f5cc3e1f410) at ../../../servers/slapd/syncrepl.c:613
> #14 do_syncrepl (ctx=<value optimized out>, arg=0x7f5cc3e1f410) at ../../../servers/slapd/syncrepl.c:1527
> #15 0x00007f5cc2c81a98 in ldap_int_thread_pool_wrapper (xpool=0x7f5cc3d18030) at ../../../libraries/libldap_r/tpool.c:688
> #16 0x00007f5cc0c54851 in start_thread () from /lib64/libpthread.so.0
> #17 0x00007f5cc079790d in clone () from /lib64/libc.so.6
> (gdb) thread 7
> [Switching to thread 7 (Thread 0x7f5c942d5700 (LWP 3869))]#0  0x00007f5cc0797f03 in epoll_wait () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00007f5cc0797f03 in epoll_wait () from /lib64/libc.so.6
> #1  0x00007f5cc312e9e2 in slapd_daemon_task (ptr=<value optimized out>) at ../../../servers/slapd/daemon.c:2536
> #2  0x00007f5cc0c54851 in start_thread () from /lib64/libpthread.so.0
> #3  0x00007f5cc079790d in clone () from /lib64/libc.so.6
>
>
> Relevant configuration information (from host2):
>
> syncrepl rid=001
>   provider=ldap://host1/
>   type=refreshAndPersist
>   retry="10 +"
>   searchbase="dc=example,dc=com"
>   starttls=critical
>   tls_cacert=/etc/openldap/ssl-certs/cacert.pem
>   tls_cert=/etc/openldap/ssl-certs/host1.crt
>   tls_key=/etc/openldap/ssl-certs/host1.key
>   tls_reqcert=demand
>   bindmethod=sasl
>   saslmech=EXTERNAL
>   sizelimit=unlimited
>   keepalive="10:2:10"
>   logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
>   logbase="cn=accesslog"
>   syncdata=accesslog
>
>
> Netstat of host1:
>
> [root@host1 ~]# netstat -anp | grep slapd
> tcp        0      0 0.0.0.0:389          0.0.0.0:*            LISTEN      31615/slapd
> tcp       56      0 192.168.1.1:389      192.168.1.2:57276    ESTABLISHED 31615/slapd
> tcp        0      0 192.168.1.1:45569    192.168.1.2:389      ESTABLISHED 31615/slapd
> unix  2      [ ]         STREAM     CONNECTED     45346152 31615/slapd
> unix  2      [ ]         DGRAM                    45346144 31615/slapd
>
> Netstat of host2:
>
> [root@host2 ~]# netstat -anp | grep slapd
> tcp        0      0 0.0.0.0:389          0.0.0.0:*            LISTEN      3868/slapd
> tcp       56      0 192.168.1.2:389      192.168.1.1:45569    ESTABLISHED 3868/slapd
> tcp        0      0 192.168.1.2:57276    192.168.1.1:389      ESTABLISHED 3868/slapd
> unix  2      [ ]         STREAM     CONNECTED     19144    3868/slapd
> unix         [ ]         DGRAM                    19136    3868/slapd
>
>

-- 
Matúš Honěk
