On Wed, Jul 01, 2009 at 05:13:42PM +0200, Lars Ellenberg wrote:
> On Wed, Jul 01, 2009 at 04:59:24PM +0200, Dejan Muhamedagic wrote:
> > Hi Lars,
> >
> > On Wed, Jul 01, 2009 at 04:44:06PM +0200, Lars Ellenberg wrote:
> > > On Wed, Jul 01, 2009 at 10:22:58AM -0400, Tavanyar, Simon wrote:
> > > > Aahh ... so it's measured after reboot. That makes sense.
> > > > Thanks, Lars.
> > > >
> > > > Should I wait for you to log a bugzilla on this?
> > >
> > > bugzilla and me don't go along very well, really ;-)
> > > so please feel free to open that bugzilla,
> > > and attach my proposed patch there.
> > >
> > > just to reiterate:
> > > the problem is not the uptime wrap or resulting longclock wrap,
> > > as that is handled correctly in cl_times() and time_longclock.
> > >
> > > but the cmp_longclock comparing unsigned values
> > > without checking for wrap.
> >
> > The wrap is stuffed into upper bits of the longclock_t variables:
> >
> > return (lc_wrapcount | (longclock_t)timesval);
> >
> > Or am I missing something? The only explanation I can think of is
> > that this is actually wrong:
>
> absolutely.
Looking again, this is probably not a problem. The nexttime is
always in the future.
> > if (cmp_longclock(lnow, append->nexttime) >= 0) {
> >
> > i.e. it should be strictly greater. I guess that what happened is
> > that somehow the prepare function got called fast enough, so that
> > both timestamps ended equal.
>
> absolutely not.
> ;)
>
> Much worse!
> as far as I see, cmp_longclock is broken (does not care for wrap)
I still don't understand what do you mean by "does not care for
wrap". In case of 32-bit clock_t the wrapcount is stuffed into
the upper bits of longclock_t which is 64-bit. How can it then
not take it into account? In case of 64-bit clock_t, there is no
wrap. It has enough bits to serve us until the end of the world.
> even on 64bit archs, and the time window for this to happen is about
> as large as the larges action (or else) timeout you set somewhere in
> crm/lrm/whatever.
>
> did you have a look at my explanation ealier in this thread,
> and the patch? (part of that message pasted here for reference)
>
>
> if (cmp_longclock(lnow, append->nexttime) >= 0)
> assuming that nexttime was set correctly, and lnow is correct, too,
> and further assuming your clock_t is only 32 bit,
> longclock_t is defined as unsigned long long,
> and that thing becomes:
>
> if ((unsigned long long)(lnow) >= (unsigned long
> long)(append->nexttime))
>
> which exactly does _NOT_ care for wrap around :(
>
>
> example:
> say, you start with a _large_ lnow (e.g. the equivalent of "-15
But you can't start with large lnow. lnow is 64-bit and you can't
get that far into the future.
> seconds"), and add an intervall of 30 seconds, resulting
> "nexttime" become the equivalent of +15 seconds.
> comparing absolute _unsigned_ means that test ends up comparing
> if (something verry large >= something very small).
> which is always true :(
>
> you need instead cast both values to _signed_, and compare the
> _difference_ against zero, as below:
>
>
> cmp_longclock macro and functions are broken. conceivably some other
> longclock related macros/functions may be broken as well.
>
> it should probably be more like
> (caution, not even compile tested)
> ((and this time incompletely copy'n'pasted,
> so whitespace will be wrong))
>
> diff -r 731f8f7b5450 include/clplumbing/longclock.h
> --- a/include/clplumbing/longclock.h Tue Jun 30 12:02:16 2009 +0200
> +++ b/include/clplumbing/longclock.h Wed Jul 01 12:27:27 2009 +0200
> @@ -127,9 +127,9 @@
> ((longclock_t)(l) - (longclock_t)(r))
>
> # define cmp_longclock(l,r) \
> - (((longclock_t)(l) < (longclock_t)(r)) \
> + ((((long long)(l) - (long long)(r)) < 0) \
> ? -1 \
> - : (((longclock_t)(l) > (longclock_t)(r)) \
> + : ((((long long)(l) - (long long)(r)) > 0) \
> ? +1 : 0))
This patch may be correct, but for practical reasons it won't
make any difference.
Thanks,
Dejan
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD? and LINBIT? are registered trademarks of LINBIT, Austria.
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems