Bump! Any updates for this?

-----Original Message-----
From: Akhnin Nikita [mailto:n.akh...@ftc.ru] 
Sent: Monday, May 15, 2017 9:31 AM
To: Willy Tarreau <w...@1wt.eu>; Benoît GARNIER <chezbu...@gmail.com>
Cc: haproxy@formilux.org
Subject: RE: Failed to compile haproxy with lua on Solaris 10

Thanks guys! I'm looking forward to the patch

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Saturday, May 13, 2017 4:27 PM
To: Benoît GARNIER <chezbu...@gmail.com>
Cc: Ахнин Никита Андреевич <n.akh...@ftc.ru>; haproxy@formilux.org
Subject: Re: Failed to compile haproxy with lua on Solaris 10

On Sat, May 13, 2017 at 10:42:58AM +0200, Benoît GARNIER wrote:
> On 13/05/2017 at 09:46, Willy Tarreau wrote:
> > On Sat, May 13, 2017 at 08:44:06AM +0200, Benoît GARNIER wrote:
> >> Time handling is not easy. I hate to say that, but POSIX and glibc 
> >> manage to make it even harder. Especially the global timezone 
> >> handling, which is not thread-safe as you pointed out.
> >> Anyway, hand-coding a simple timegm() is not very hard, since you 
> >> don't have to take any timezone into account, only leap years.
> > That's what I've seen, none of them implements leap seconds (I 
> > thought it was needed).
> 
> They don't because leap seconds don't exist. Really. The kernel (at 
> least on Linux) hides this extraneous second(*) from userland.
> There are two ways the kernel can do this:
> 1) skipping it entirely - i.e. the kernel will return 23:59:59 for 2 
> seconds

I disagree on this one, as the kernel doesn't return human-readable time but 
only seconds and optionally micros/nanos. It's the tools and libc which turn 
that into a human-readable format. Furthermore, I saw one with my own eyes 
by accident around 2000, so based on the list, I think it was the 1999 one. 
That was the first time I'd heard about this. I spent more than an hour 
searching for the bug in my system that could have caused a second to appear 
as "60"!

> 2) slowly adjusting time returned by the kernel until it matches UTC 
> time - for a few hours kernel seconds will last longer than a real 
> second
> So, at the end of the day, from the userland point of view, a day 
> always lasts 86400 seconds, even when a leap second is inserted (and 
> although the day was officially 86401 seconds long).

That's indeed what NTP helps do, by slightly slowing down the internal 
clock's PLL so that no second is off by more than 1/86400 s (about 11.6 
microseconds) during the day.

> That means you don't have to worry about leap seconds when making 
> computations on past dates.

But yeah in fact that's a good point.

> But it can reveal subtle bugs (especially time going backward in the 
> first case and giving negative values when computing elapsed times).
> Measuring elapsed time should preferably be done with
> clock_gettime(CLOCK_MONOTONIC,...) which is always monotonic, but is 
> adjusted by NTP (if the internal clock is drifting).
> Another variant is clock_gettime(CLOCK_MONOTONIC_RAW,...) which is 
> monotonic and never adjusted but is specific to Linux >= 2.6.28.
> (Note: you can still have time going backward on an SMP platform during 
> task switching since each CPU has its own clock, even though the 
> kernel tries to compensate for the difference).
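
For illustration, the advice quoted above about measuring elapsed time
could be sketched roughly as follows. The helper names mono_now() and
ts_diff_ns() are made up for this example and are not taken from any
existing code; only clock_gettime() and CLOCK_MONOTONIC are real POSIX
interfaces.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() on strict compilers */
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read the monotonic clock.
 * Returns 0 on success, -1 on error, like clock_gettime() itself. */
int mono_now(struct timespec *ts)
{
    return clock_gettime(CLOCK_MONOTONIC, ts);
}

/* Hypothetical helper: nanoseconds elapsed from 'start' to 'end',
 * both of which are expected to come from mono_now(). */
int64_t ts_diff_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}
```

A caller would take one reading before the work and one after, then pass
both to ts_diff_ns(). Depending on the platform and libc version, linking
may additionally require -lrt.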

We ran into this kind of trouble in the past, on the first dual-core Opteron 
systems and then in many VMs, which led us to implement an internal 
monotonic, corrected clock to address this for the internal timers. That's 
why the internal time for the tasks is measured in "ticks", which we all know 
are corrected milliseconds but could change if needed. The code was later 
ported to keepalived, which had started to suffer from the same problems, but 
with more serious consequences there, such as random timeouts triggering 
occasional switchovers.
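
To give an idea of the principle, here is a very rough sketch of such a
corrected clock. It is not the actual haproxy or keepalived code: the
struct tick_clock type, the tick_clock_update() function and the
max_step_ms threshold are all invented for this example. The idea is
simply to detect a raw reading that goes backward or jumps too far
forward, and to absorb the jump into an offset.

```c
#include <stdint.h>

/* Hypothetical state of a corrected millisecond clock. */
struct tick_clock {
    int64_t last_raw_ms;  /* last raw reading that was fed in */
    int64_t offset_ms;    /* correction added to raw readings */
    int     started;      /* 0 until the first reading is seen */
};

/* Feed a raw millisecond reading and return the corrected tick value.
 * A step is treated as a clock jump when it is negative or larger than
 * max_step_ms; the jump is then cancelled so that the corrected clock
 * resumes from its previous value and never goes backward. */
int64_t tick_clock_update(struct tick_clock *c, int64_t raw_ms,
                          int64_t max_step_ms)
{
    if (c->started) {
        int64_t delta = raw_ms - c->last_raw_ms;

        if (delta < 0 || delta > max_step_ms)
            c->offset_ms -= delta;  /* absorb the whole jump */
    }
    c->started = 1;
    c->last_raw_ms = raw_ms;
    return raw_ms + c->offset_ms;
}
```

The time that really elapsed during a detected jump is lost in this
sketch, which is the price to pay for never reporting a negative or
absurdly large interval to the timers.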

Anyway thanks for your insights. I think we can go with a "my_timegm()"
implementation to address this problem once and for all; it will be the most 
durable solution.
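
As a rough illustration only (this is not the patch itself), such a
function could look like the sketch below. It assumes that the struct tm
fields are already within their normal ranges and that the year is 1970
or later; it ignores time zones and leap seconds entirely and only
accounts for leap years. The helper names are made up for the example.

```c
#include <time.h>

/* Days elapsed before the first day of each month in a non-leap year. */
static const int days_before_month[12] = {
    0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
};

/* Number of Gregorian leap years from year 1 through year y inclusive. */
static long leaps_through(long y)
{
    return y / 4 - y / 100 + y / 400;
}

static int is_leap(long y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Sketch of a timegm() replacement: converts a broken-down UTC time to
 * seconds since 1970-01-01 00:00:00 UTC. Assumes normalized fields and
 * tm_year >= 70; no range checking is done. */
time_t my_timegm(const struct tm *tm)
{
    long year = tm->tm_year + 1900L;
    long days;

    days = (year - 1970) * 365L
         + leaps_through(year - 1) - leaps_through(1969)
         + days_before_month[tm->tm_mon]
         + tm->tm_mday - 1;

    /* From March onward, a leap year adds its extra day. */
    if (tm->tm_mon > 1 && is_leap(year))
        days++;

    return (time_t)days * 86400
         + tm->tm_hour * 3600L + tm->tm_min * 60L + tm->tm_sec;
}
```

Since it touches no global state such as the timezone, a function of
this shape is also safe to call from several threads.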

Willy
