Thanks!

Michael

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
On Behalf Of Dejan Muhamedagic
Sent: Wednesday, 14 May 2008 8:05 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] 2.0.5 longclock() wrap test

Hi,

On Wed, May 14, 2008 at 03:55:43PM +1000, Jones Michael wrote:
> Hello,
> I have been asked to prove that our cluster can handle a times() wrap
> around after 247 days so I downloaded a src rpm for the version we are
> using (2.0.4) and also 2.0.5. I engineered a white box wrap by modifying
> longclock and observed some problems. When wrapping, the primary server
> would report "No local heartbeat. Forcing restart" but fail to restart
> and instead stop heartbeat without stopping the virtualaddress. This
> resulted in the secondary kicking in and also grabbing the
> virtualaddress, so that both servers held it. When performing the wrap
> test on the secondary server, it did something similar but succeeded in
> restarting without affecting cluster operation. Our operating system is
> Red Hat 3.4.3-9.EL4 and the 2.0.5 rpm was the latest I could get to
> build without errors.
> 
> Could you tell me if there is a times() wrap problem or whether there is
> a flaw in my white box test?

There was a times() wrap problem. You can find the description and the
patch for release 2.0.7 here:

http://developerbugs.linux-foundation.org/show_bug.cgi?id=1407
http://hg.linux-ha.org/dev/rev/6ae1d84136d3

BTW, 2.0.5 is very old, in case you're running a CRM/v2 style
configuration.

Thanks,

Dejan

> Thanks,
> 
> Michael
> 
> I added the following code to longclock. Setting lasttimes to
> MINJUMP was necessary to trigger the "normal wrap" code as opposed to
> the "jump back in error" code. I've triggered the wrap at various times
> after startup:
> 
>       timesval = (unsigned long) times(&longclock_dummy_tms_struct);
> 
>       /* Test hook: whenever callcount is a multiple of wrapmod,
>        * pretend that times() has just wrapped around to zero. */
>       if ( (callcount % wrapmod) == 0 )
>       {
>               wrapcount++;
>               cl_log(LOG_INFO, "MJ2 WRAPPING %lu!!!!!!!!!!!!!!!!!!", wrapmod);
>               timesval = 0;
>               lasttimes = MINJUMP;
>               wrapmod = 8000;
>       }
> 
>       /* Log the current state on every 100th call. */
>       if (callcount % 100 == 0)
>       {
>               if (wrapcount)
>               {
>                       cl_log(LOG_INFO, "MJ WRAPPED %lu (%lu)", timesval, wrapcount);
>               }
>               else
>               {
>                       /* sizeof yields size_t; cast it to match %lu. */
>                       cl_log(LOG_INFO, "MJ2 %s %lu (%lu)",
>                               __FUNCTION__, (unsigned long) sizeof(longclock_t), timesval);
>               }
>       }
> 
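For readers unfamiliar with the "normal wrap" versus "jump back in error"
distinction mentioned above, here is a minimal sketch of the general idea,
not the actual longclock code. It assumes a raw 32-bit tick counter; the
names ext_clock, ext_clock_update and EXT_MIN_JUMP are made up for this
illustration, and the 99% threshold is an assumption rather than a value
taken from heartbeat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold: a backward step at least this large is
 * treated as a counter wrap, anything smaller as a glitch. */
#define EXT_MIN_JUMP (UINT32_MAX / 100u * 99u)

struct ext_clock {
    uint32_t last;        /* last raw tick value accepted */
    uint32_t wraps;       /* number of wraps seen so far */
    int      initialized; /* zero until the first reading */
};

/* Feed one raw 32-bit tick reading and return a 64-bit tick count
 * that never moves backwards. */
static uint64_t ext_clock_update(struct ext_clock *c, uint32_t raw)
{
    assert(c != 0);

    if (c->initialized && raw < c->last) {
        uint32_t back = c->last - raw;

        if (back >= EXT_MIN_JUMP) {
            c->wraps++;      /* huge backward step: the counter wrapped */
        } else {
            raw = c->last;   /* small backward step: hold the old value */
        }
    }

    c->last = raw;
    c->initialized = 1;

    /* The wrap count supplies the upper 32 bits of the result. */
    return ((uint64_t)c->wraps << 32) | raw;
}
```

In this sketch, a reading that drops from near the top of the 32-bit range
to near zero counts as a wrap, while a reading that drops by only a few
ticks is ignored. This is presumably why the test code above had to set
lasttimes to MINJUMP before zeroing timesval.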
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems