[EMAIL PROTECTED] wrote:
> Full_Name: Rein Tollevik
> Version: CVS head
> OS: CentOS 4.4
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (81.93.160.250)
>
>
> We have been hit by what looks like a race condition bug in syncprov. We got
> some core dumps, all showing stack frames like the one at the end. As such
> nasty bugs tend to do, it has behaved OK since I restarted slapd with more
> debug output :-( (trace + stats + stats2 + sync).
>
> The configuration is a master server with multiple bdb backend databases, all
> subordinate to the same glue database where syncprov is used. One of the
> backends is a syncrepl consumer from another server; the server is master for
> the other backends. There are multiple consumers for the syncprov suffix,
> which I assume is what causes the race condition to happen.
>
> Note the a=0xBAD argument to attr_find(), which I expect is the result of some
> other thread freeing the attribute list it was called with while it was
> processing it. The rs->sr_entry->e_attrs argument passed to attr_find() as the
> original "a" argument by findpres_cb() looks like a perfectly valid structure,
> as do all the attributes found by following the a_next pointer. The list is
> terminated by an attribute with a NULL a_next value; none of the a_next values
> are 0xBAD.

I don't believe that's the cause. Notice that arg0 in stack frame #9 is also
0xbad, even though it is shown correctly in frames 8 and 10. Something else is
going on.

> I'm currently trying to gather more information related to this bug; any
> pointers as to what I should look for are appreciated. I'm posting this bug
> report now in the hope that the stack frame will enlighten someone with
> better knowledge of the code than I have.

Check for stack overruns, compile without optimization and make sure it's not
a compiler optimization bug, etc.

>
> Rein Tollevik
> Basefarm AS
>
> #0  0x0807d03a in attr_find (a=0xbad, desc=0x81e8680) at attr.c:665
> #1  0xb7a656f6 in findpres_cb (op=0xaf068ba4, rs=0xaf068b68) at syncprov.c:546
> #2  0x0808416d in slap_response_play (op=0xaf068ba4, rs=0xaf068b68)
>     at result.c:307
> #3  0x0808555b in slap_send_search_entry (op=0xaf068ba4, rs=0xaf068b68)
>     at result.c:770
> #4  0x080f2cdc in bdb_search (op=0xaf068ba4, rs=0xaf068b68) at search.c:870
> #5  0x080db72b in overlay_op_walk (op=0xaf068ba4, rs=0xaf068b68,
>     which=op_search, oi=0x8274218, on=0x8274318) at backover.c:653
> #6  0x080dbcaf in over_op_func (op=0xaf068ba4, rs=0xaf068b68, which=op_search)
>     at backover.c:705
> #7  0x080dbdef in over_op_search (op=0xaf068ba4, rs=0xaf068b68)
>     at backover.c:727
> #8  0x080d9570 in glue_sub_search (op=0xaf068ba4, rs=0xaf068b68,
>     b0=0xaf068ba4, on=0xaf068ba4) at backglue.c:340
> #9  0x080da131 in glue_op_search (op=0xbad, rs=0xaf068b68) at backglue.c:459
> #10 0x080db6d5 in overlay_op_walk (op=0xaf068ba4, rs=0xaf068b68,
>     which=op_search, oi=0x8271860, on=0x8271a60) at backover.c:643
> #11 0x080dbcaf in over_op_func (op=0xaf068ba4, rs=0xaf068b68, which=op_search)
>     at backover.c:705
> #12 0x080dbdef in over_op_search (op=0xaf068ba4, rs=0xaf068b68)
>     at backover.c:727
> #13 0xb7a65ff4 in syncprov_findcsn (op=0x85c7e60, mode=FIND_PRESENT)
>     at syncprov.c:700
> #14 0xb7a670a0 in syncprov_op_search (op=0x85c7e60, rs=0xaf06a1c0)
>     at syncprov.c:2277
> #15 0x080db6d5 in overlay_op_walk (op=0x85c7e60, rs=0xaf06a1c0,
>     which=op_search, oi=0x8271860, on=0x8271b60) at backover.c:643
> #16 0x080dbcaf in over_op_func (op=0x85c7e60, rs=0xaf06a1c0, which=op_search)
>     at backover.c:705
> #17 0x080dbdef in over_op_search (op=0x85c7e60, rs=0xaf06a1c0)
>     at backover.c:727
> #18 0x08076554 in fe_op_search (op=0x85c7e60, rs=0xaf06a1c0) at search.c:368
> #19 0x080770e4 in do_search (op=0x85c7e60, rs=0xaf06a1c0) at search.c:217
> #20 0x08073e28 in connection_operation (ctx=0xaf06a2b8, arg_v=0x85c7e60)
>     at connection.c:1084
> #21 0x08074f14 in connection_read_thread (ctx=0xaf06a2b8, argv=0x59)
>     at connection.c:1211
> #22 0xb7fb5546 in ldap_int_thread_pool_wrapper (xpool=0x81ee240)
>     at tpool.c:663
> #23 0xb7c80371 in start_thread () from /lib/tls/libpthread.so.0
> #24 0xb7c17ffe in clone () from /lib/tls/libc.so.6

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/
