Howard Chu wrote: > Nick Geron wrote: >> Howard Chu wrote: >>> OK, this sounds like the background thread to propagate updates isn't >>> getting scheduled when it should. That could be a bug in the syncprov >>> overlay. >> >> Should I file a report, and if so, what information is required from >> this end? > > I've filed this as ITS#5405 and fixed it in CVS HEAD. It would be good > if you could test the patch and report your results as a followup to > the ITS.
I'll try to look into this soon. > >> Thanks for the info. That would certainly point to a problem with my >> build environment, or a new bug. >>>> I'm seeing a number of aborts when >>>> testing under high load. The latest came from running scripted >>>> ldapsearches and ldapmodifies which resulted in a mutex error (or so I >>>> am told by one of our developers). >>>> >>>> Specifically: >>>> >>>> 1) adding about 100 attributes to an entry >>>> 2) diffing the output of ldapsearch between the two nodes in loop >>>> 3) once synced, grabbing the attributes, shoving them in a temp file >>>> with delete instructions and using that with ldapmodify. >>>> >>>> I complied with debugging on which results in an abort with >>>> "connection.c: 676: connection_state_closing: Assertion >>>> 'c_struct_state >>>> == 0x02' failed" logged. >>> Interesting. It would be useful to get a gdb stack trace from that >>> situation. >>> >> Yesterday I was able to successfully reproduce this beahvior at least >> three times. This morning, I was able to reproduce it with the above >> steps yet again. From a gdb session, no backtrace was available, >> however. I then recompiled with debugging enabled and was unable to >> reproduce the bug until I added '-d 7' to the run arguments. It should >> be noted that before recompiling, I was able to reproduce the behavior >> with and without the command line debug argument. >> >> Here's the stack trace from a gdb session with arguments, -h 'ldap:/// >> ldaps:///' -d 7 >> >> => acl_string_expand: expanded: >> uid=[^,]+,ou=employees,ou=people,dc=example,dc=com >> => regex_matches: string: uid=syncrepl,ou=ldap,dc=example,dc=com >> => regex_matches: rc: 1 no matches >> slapd: connection.c:676: connection_state_closing: Assertion >> `c->c_struct_state == 0x02' failed. >> >> Program received signal SIGABRT, Aborted. >> [Switching to Thread 1124096320 (LWP 7301)] >> 0x0000003918230055 in raise () from /lib64/libc.so.6 >> (gdb) bt >> #0 0x0000003918230055 in raise () from /lib64/libc.so.6 >> #1 0x0000003918231af0 in abort () from /lib64/libc.so.6 >> #2 0x0000003918229756 in __assert_fail () from /lib64/libc.so.6 >> #3 0x000000000042d345 in connection_state_closing () >> #4 0x000000000043d43b in slap_freeself_cb () >> #5 0x000000000043ef81 in slap_send_search_entry () >> #6 0x00000000004c88c4 in syncprov_initialize () >> #7 0x00002aaaaaabc1c7 in ldap_int_thread_pool_wrapper >> (xpool=0x1a0b06d0) at tpool.c:625 >> #8 0x00000039196062f7 in start_thread () from /lib64/libpthread.so.0 >> #9 0x00000039182ce85d in clone () from /lib64/libc.so.6 >> (gdb) >> >> Envrionment = CentOS 5 updated to whatever RH thinks is current as of >> last week. Oracle db 4.5.20 and openldap 2.4.7 compiled by hand. >> >> Please let me know if any further information would be helpful. > > Your stack trace is a bit odd, because I can't find anywhere in the > source tree that uses the function "slap_freeself_cb()". Are you using > any custom overlays? It appears that your stack trace is still missing > a lot of details. You should compile with -g and without any > optimization, and make sure you're testing with the unstripped binary. The only overlays I'm using are those found in the openldap tarball. I found that function referenced in result.c [EMAIL PROTECTED] openldap-2.4.7]# grep -R slap_freeself_cb . Binary file ./servers/slapd/.libs/slapd matches ./servers/slapd/proto-slap.h:LDAP_SLAPD_F (int) slap_freeself_cb LDAP_P(( Operation *op, SlapReply *rs )); Binary file ./servers/slapd/result.o matches ./servers/slapd/result.c:int slap_freeself_cb( Operation *op, SlapReply *rs ) [EMAIL PROTECTED] openldap-2.4.7]# strings /usr/local/libexec/slapd | grep freeself slap_freeself_cb slap_freeself_cb [EMAIL PROTECTED] openldap-2.4.7]# It does look like I had stripped binaries though. I've removed optimization and verified that the -g option is being given. It took a bit to track down where stripping was being set, but 'file' now tells me the binaries are not stripped. Here's the latest backtrace: > regex_matches: string: uid=syncrepl,ou=ldap,dc=corenap,dc=com => regex_matches: rc: 1 no matches ldap_read: want=8, got=7 0000: 30 05 02 01 03 42 00 0....B. ber_get_next: tag 0x30 len 5 contents: ber_get_next ldap_read: want=8, got=0 ber_get_next on fd 21 failed errno=0 (Success) connection_closing: readying conn=1 sd=21 for close connection_close: deferring conn=1 sd=21 conn=1 op=2 do_unbind connection_resched: attempting closing conn=1 sd=21 connection_close: conn=1 sd=21 slapd: connection.c:676: connection_state_closing: Assertion `c->c_struct_state == 0x02' failed. Program received signal SIGABRT, Aborted. [Switching to Thread 1107310912 (LWP 17114)] 0x00000030ae430055 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00000030ae430055 in raise () from /lib64/libc.so.6 #1 0x00000030ae431af0 in abort () from /lib64/libc.so.6 #2 0x00000030ae429756 in __assert_fail () from /lib64/libc.so.6 #3 0x000000000043854a in connection_state_closing (c=0xa35ef60) at connection.c:676 #4 0x000000000044dc38 in send_ldap_ber (conn=0xa35ef60, ber=0x42002410) at result.c:153 #5 0x000000000045173d in slap_send_search_entry (op=0x420029a0, rs=0x420026c0) at result.c:1187 #6 0x000000000051072f in syncprov_sendresp (op=0x420029a0, opc=0x42002780, so=0xa882b80, e=0x420027d8, mode=2) at syncprov.c:809 #7 0x0000000000510a8d in syncprov_qplay (op=0x420029a0, on=0xa094690, so=0xa882b80) at syncprov.c:878 #8 0x0000000000510c54 in syncprov_qtask (ctx=0x42002dc0, arg=0x2aaac4111500) at syncprov.c:923 #9 0x00002aaaaaabc605 in ldap_int_thread_pool_wrapper (xpool=0xa0466d0) at tpool.c:625 #10 0x00000030aec062f7 in start_thread () from /lib64/libpthread.so.0 #11 0x00000030ae4ce85d in clone () from /lib64/libc.so.6 (gdb) -Nick
