Thanks!

Michael

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dejan Muhamedagic
Sent: Wednesday, 14 May 2008 8:05 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] 2.0.5 longclock() wrap test

Hi,

On Wed, May 14, 2008 at 03:55:43PM +1000, Jones Michael wrote:
> Hello,
> I have been asked to prove that our cluster can handle a times() wrap
> around after 247 days so I downloaded a src rpm for the version we are
> using (2.0.4) and also 2.0.5. Engineered a white box wrap by modifying
> longclock and observed some problems. The primary server when wrapping
> would report "No local heartbeat. Forcing restart" but fail to restart
> and instead stop heartbeat without stopping the virtualaddress, this
> resulted in the secondary kicking in and also grabbing the
> virtualaddress so that both servers held it. When performing the wrap
> test on the secondary server it did something similar but succeeded in
> restarting without affecting cluster operation. Our operating system is
> Red Hat 3.4.3-9.EL4 and the 2.0.5 rpm was the latest I could get to
> build without errors.
>
> Could you tell me if there is a times() wrap problem or whether there is
> a flaw in my white box test?

There was, and you can find the description and the patch
for release 2.0.7 here:

http://developerbugs.linux-foundation.org/show_bug.cgi?id=1407
http://hg.linux-ha.org/dev/rev/6ae1d84136d3

BTW, the 2.0.5, in case you're running a CRM/v2 style
configuration, is very old.

Thanks,

Dejan

> Thanks,
>
> Michael
>
> I added the following code to longclock, the setting of lasttimes to
> MINJUMP was necessary to trigger the "normal wrap" code as opposed to
> the "jump back in error" code.
> I've triggered the wrap at various times
> after startup:
>
>     timesval = (unsigned long) times(&longclock_dummy_tms_struct);
>
>     if ( (callcount % wrapmod) == 0 )
>     {
>         wrapcount++;
>         cl_log(LOG_INFO, "MJ2 WRAPPING %lu!!!!!!!!!!!!!!!!!!",
>                wrapmod);
>         timesval = 0;
>         lasttimes = MINJUMP;
>         wrapmod = 8000;
>     }
>
>     if (callcount % 100 == 0)
>     {
>         if (wrapcount)
>         {
>             cl_log(LOG_INFO, "MJ WRAPPED %lu (%lu)",
>                    timesval, wrapcount);
>         }
>         else
>         {
>             cl_log(LOG_INFO, "MJ2 %s %d (%ld)",
>                    __FUNCTION__, sizeof(longclock_t),
>                    timesval);
>         }
>     }
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
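
For readers following the thread, the distinction the test relies on (a "normal wrap" versus a "jump back in error") can be sketched roughly as below. This is only an illustration under stated assumptions, not the heartbeat longclock code: the names tick_state, extend_ticks and SKETCH_MINJUMP are hypothetical, the threshold value is made up, and the raw counter is passed in as a 32-bit value instead of being read from times() so that a wrap can be simulated directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold (an assumption, not heartbeat's MINJUMP value):
 * a backward step smaller than this is treated as an erroneous jump back,
 * a backward step at least this large is treated as a real counter wrap. */
#define SKETCH_MINJUMP 0x40000000u

struct tick_state {
    uint32_t last_raw;  /* last raw counter value accepted */
    uint32_t wraps;     /* number of wraps counted so far */
    int      primed;    /* nonzero once last_raw holds a real reading */
};

/* Extend a wrapping 32-bit tick counter to a 64-bit value. */
static uint64_t extend_ticks(struct tick_state *st, uint32_t raw)
{
    assert(st != NULL);

    if (st->primed && raw < st->last_raw) {
        uint32_t back = st->last_raw - raw;

        if (back < SKETCH_MINJUMP) {
            /* Small backward step: discard the reading and report
             * the previously accepted time again. */
            return ((uint64_t)st->wraps << 32) | st->last_raw;
        }
        /* Large backward step: the counter wrapped around. */
        st->wraps++;
    }

    st->last_raw = raw;
    st->primed = 1;
    return ((uint64_t)st->wraps << 32) | raw;
}
```

In this sketch, feeding a raw value of 0 right after last_raw was SKETCH_MINJUMP gives a backward step exactly equal to the threshold, so it is counted as a wrap. That is the same kind of forcing the test above does by setting lasttimes to MINJUMP before injecting timesval = 0.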
