Raphaël Ouazana-Sustowski wrote:
> Hi,
>
> On Fri, 2 May 2008 11:01, [EMAIL PROTECTED] wrote:
>> [EMAIL PROTECTED] wrote:
>>> [EMAIL PROTECTED] wrote:
>>>> Howard Chu wrote:
>>>>
>>>>> Thanks. Please try HEAD again.
>>>>>
>>>> No way.
>>>> new testrun directory in
>>>> ftp://ftp.sys-net.it/luca_scamoni_its5470_20080430-new.tgz
>>>>
>>>> backtrace attached
>>>>
>>> recent commits seem to have fixed it (at least, right now I'm not able
>>> to reproduce it anymore...)
>> Right. Confirmed here too; I (temporarily) added an assert(0) to the
>> offending branch of code to make sure the patch was actually getting hit.
>> It takes a very particular timing to trigger that code path.
>>
>> I'm not sure how we can reliably test for this down the road. Perhaps we
>> should add a "disabled" config keyword for backends and syncrepl
>> consumers, so that we can start up the individual servers, (which takes
>> an unpredictable amount of time for each) and then enable various parts
>> in a fixed sequence (e.g. 1 second sleeps between ldapmodify/enable
>> requests). Even that's hit or miss, because our test database is so small
>> it's unlikely that we can hit the window of time on demand.
>
> I'm testing the last RE24 tag. After 201 successful runs of test050, I got
> a failure :/
> Cleaning up test run directory leftover from previous run.
> Running ./scripts/test050-syncrepl-multimaster...
> running defines.sh
> Initializing server configurations...
> Starting producer slapd on TCP/IP port 9011...
> Using ldapsearch to check that producer slapd is running...
> Inserting syncprov overlay on producer...
> Starting consumer slapd on TCP/IP port 9012...
> Using ldapsearch to check that consumer slapd is running...
> Configuring syncrepl on consumer...
> Starting consumer2 slapd on TCP/IP port 9013...
> Using ldapsearch to check that consumer2 slapd is running...
> Configuring syncrepl on consumer2...
> Adding schema and databases on producer...
> Using ldapadd to populate producer...
> Waiting 20 seconds for syncrepl to receive changes...
> Using ldapadd to populate consumer...
> Waiting 20 seconds for syncrepl to receive changes...
> Using ldapsearch to check that syncrepl received database changes...
> Waiting 5 seconds for syncrepl to receive changes...
> Waiting 5 seconds for syncrepl to receive changes...
> Waiting 5 seconds for syncrepl to receive changes...
> Waiting 5 seconds for syncrepl to receive changes...
> Waiting 5 seconds for syncrepl to receive changes...
> Waiting 5 seconds for syncrepl to receive changes...
> ldapsearch failed (32)!
>
> testrun uploaded in
> ftp://ftp.openldap.org/incoming/raphael-ouazana-testrun-080505.tgz

The logs show that the syncrepl consumers all timed out periodically when
trying to bind to a provider. It seems that the 1 second timeout in the
syncrepl configs is too short, or your test machine was too slow during that
run. We should probably remove that timeout now, since the cn=config/thread
pause issue has already been resolved.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
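
For reference, the timeout in question is the "timeout" parameter of the
syncrepl directive, which limits how long the consumer waits for its initial
Bind to the provider to complete. A rough sketch of a consumer stanza with
that limit relaxed might look like the LDIF below. The provider URL, bind DN,
credentials, rid and retry values are placeholders chosen for illustration,
not the actual contents of the test050 script; deleting the timeout=
parameter entirely would be the "remove" variant.

```
# Illustrative only: all values below are assumed placeholders.
# timeout=5 replaces the 1 second limit discussed above; omit the
# parameter altogether to run without a bind timeout.
dn: olcDatabase={0}config,cn=config
changetype: modify
replace: olcSyncRepl
olcSyncRepl: rid=001 provider=ldap://localhost:9011/ binddn="cn=config"
  bindmethod=simple credentials=secret searchbase="cn=config"
  type=refreshAndPersist retry="5 5 300 5" timeout=5
```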
