[SSSD] Re: Monotonic clock for timed events
On Wed, 2016-10-12 at 10:52 +0200, Pavel Březina wrote:
> On 10/11/2016 03:26 PM, Simo Sorce wrote:
> > On Mon, 2016-10-10 at 14:04 +0200, Pavel Březina wrote:
> >> On 10/10/2016 10:09 AM, Fabiano Fidêncio wrote:
> >>> Victor,
> >>>
> >>> On Mon, Oct 10, 2016 at 10:04 AM, Victor Tapia wrote:
> >>>> Hi list,
> >>>>
> >>>> I've faced a race condition when SSSD boots in a machine with a big
> >>>> clock drift. This is what I see:
> >>>>
> >>>> 1. SSSD starts before the network is up, queries the LDAP server without
> >>>> success and sets a retry timer (~60 secs)
> >>>> 2. NTP starts and corrects the clock, 1 hour back for example.
> >>>> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> >>>> connection.
> >>>>
> >>>> In this particular scenario the credentials cache is disabled, so the
> >>>> wait time to login is noticeable. How feasible would it be to use a
> >>>> monotonic clock for this kind of timed events?
> >>>
> >>> Are you running git master? This issue is supposed to be already
> >>> solved by
> >>> https://github.com/SSSD/sssd/commit/b8ceaeb80cffb00c26390913ea959b77f7e848b9
> >>
> >> This patch fix the issue only in watchdog which would result in
> >> terminating sssd otherwise. Fixing it across whole sssd would be
> >> difficult. The fix should go to tevent.
> >
> > It also seem to fix the issue only if the time jumps backwards, not if
> > it jumps forward, in that case if I read the code right, we'd still end
> > up killing sssd.
>
> Yes, we don't need to care about forward jump since that means all
> tevent timers that are within this time shift are fired anyway.

Well I do care if sssd kills all children just because I suspended the
laptop for a while :-)

Simo.

--
Simo Sorce * Red Hat, Inc * New York
___
sssd-devel mailing list -- sssd-devel@lists.fedorahosted.org
To unsubscribe send an email to sssd-devel-le...@lists.fedorahosted.org
[SSSD] Re: Monotonic clock for timed events
On 10/11/2016 03:26 PM, Simo Sorce wrote:
> On Mon, 2016-10-10 at 14:04 +0200, Pavel Březina wrote:
>> On 10/10/2016 10:09 AM, Fabiano Fidêncio wrote:
>>> Victor,
>>>
>>> On Mon, Oct 10, 2016 at 10:04 AM, Victor Tapia wrote:
>>>> Hi list,
>>>>
>>>> I've faced a race condition when SSSD boots in a machine with a big
>>>> clock drift. This is what I see:
>>>>
>>>> 1. SSSD starts before the network is up, queries the LDAP server without
>>>> success and sets a retry timer (~60 secs)
>>>> 2. NTP starts and corrects the clock, 1 hour back for example.
>>>> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
>>>> connection.
>>>>
>>>> In this particular scenario the credentials cache is disabled, so the
>>>> wait time to login is noticeable. How feasible would it be to use a
>>>> monotonic clock for this kind of timed events?
>>>
>>> Are you running git master? This issue is supposed to be already
>>> solved by
>>> https://github.com/SSSD/sssd/commit/b8ceaeb80cffb00c26390913ea959b77f7e848b9
>>
>> This patch fix the issue only in watchdog which would result in
>> terminating sssd otherwise. Fixing it across whole sssd would be
>> difficult. The fix should go to tevent.
>
> It also seem to fix the issue only if the time jumps backwards, not if
> it jumps forward, in that case if I read the code right, we'd still end
> up killing sssd.

Yes, we don't need to care about forward jump since that means all
tevent timers that are within this time shift are fired anyway.

> Simo.
[SSSD] Re: Monotonic clock for timed events
On Mon, 2016-10-10 at 11:19 +0200, Jakub Hrozek wrote:
> On Mon, Oct 10, 2016 at 11:09:35AM +0200, Victor Tapia wrote:
> > On 10/10/16 at 10:56, Ian Kent wrote:
> > > On Mon, 2016-10-10 at 10:42 +0200, Jakub Hrozek wrote:
> > >> On Mon, Oct 10, 2016 at 10:04:30AM +0200, Victor Tapia wrote:
> > >>> Hi list,
> > >>>
> > >>> I've faced a race condition when SSSD boots in a machine with a big
> > >>> clock drift. This is what I see:
> > >>>
> > >>> 1. SSSD starts before the network is up, queries the LDAP server without
> > >>> success and sets a retry timer (~60 secs)
> > >>> 2. NTP starts and corrects the clock, 1 hour back for example.
> > >>> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> > >>> connection.
> > >>>
> > >>> In this particular scenario the credentials cache is disabled, so the
> > >>> wait time to login is noticeable. How feasible would it be to use a
> > >>> monotonic clock for this kind of timed events?
> > >>
> > >> I really have not tried this and I guess I don't know tevent internals
> > >> well enough if this works, but I wonder if just using:
> > >> clock_gettime()
> > >
> > > With a CLOCK_MONOTONIC I presume?
> > > I think that's what has been suggested.
> > >
> >
> > I was thinking about using CLOCK_MONOTONIC_RAW:
> >
> > CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
> >     Similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based
> >     time that is not subject to NTP adjustments or the incremental
> >     adjustments performed by adjtime(3).
> >
> > >> and constructing struct timeval
> > >> in place of:
> > >> tevent_timeval_current_ofs()
> > >> could solve this particular issue.
> > >>
> > >> On the other hand, this is a pattern we use in SSSD all through the code
> > >> for timed events and we're just not well equipped to handle time drifts.
> > >> Did you investigate why doesn't sssd detect the networking change from
> > >> libnl messages or from resolv.conf being touched?
> >
> > I didn't dig much into it yet (I just checked tevent to confirm it uses
> > gettimeofday()), so I'll take this as my next step.
>
> btw the samba-technical mailing list is the best source of info about
> libtevent and the best place to ask questions about libtevent..

We should suggest that libtevent starts using timerfd with epoll to do
time tracking instead of the internal computations, and then a monotonic
clock can be used with some calculation at set time to convert the time
of day to the current monotonic clock.

Simo.

--
Simo Sorce * Red Hat, Inc * New York
[SSSD] Re: Monotonic clock for timed events
On Mon, 2016-10-10 at 14:04 +0200, Pavel Březina wrote:
> On 10/10/2016 10:09 AM, Fabiano Fidêncio wrote:
> > Victor,
> >
> > On Mon, Oct 10, 2016 at 10:04 AM, Victor Tapia wrote:
> >> Hi list,
> >>
> >> I've faced a race condition when SSSD boots in a machine with a big
> >> clock drift. This is what I see:
> >>
> >> 1. SSSD starts before the network is up, queries the LDAP server without
> >> success and sets a retry timer (~60 secs)
> >> 2. NTP starts and corrects the clock, 1 hour back for example.
> >> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> >> connection.
> >>
> >> In this particular scenario the credentials cache is disabled, so the
> >> wait time to login is noticeable. How feasible would it be to use a
> >> monotonic clock for this kind of timed events?
> >
> > Are you running git master? This issue is supposed to be already
> > solved by
> > https://github.com/SSSD/sssd/commit/b8ceaeb80cffb00c26390913ea959b77f7e848b9
>
> This patch fix the issue only in watchdog which would result in
> terminating sssd otherwise. Fixing it across whole sssd would be
> difficult. The fix should go to tevent.

It also seem to fix the issue only if the time jumps backwards, not if
it jumps forward, in that case if I read the code right, we'd still end
up killing sssd.

Simo.

--
Simo Sorce * Red Hat, Inc * New York
[SSSD] Re: Monotonic clock for timed events
On Mon, 2016-10-10 at 10:04 +0200, Victor Tapia wrote:
> Hi list,
>
> I've faced a race condition when SSSD boots in a machine with a big
> clock drift. This is what I see:
>
> 1. SSSD starts before the network is up, queries the LDAP server without
> success and sets a retry timer (~60 secs)
> 2. NTP starts and corrects the clock, 1 hour back for example.
> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> connection.
>
> In this particular scenario the credentials cache is disabled, so the
> wait time to login is noticeable. How feasible would it be to use a
> monotonic clock for this kind of timed events?

We should use a monotonic clock for most internal events.

Simo.

--
Simo Sorce * Red Hat, Inc * New York
[SSSD] Re: Monotonic clock for timed events
On 10/10/2016 10:09 AM, Fabiano Fidêncio wrote:
> Victor,
>
> On Mon, Oct 10, 2016 at 10:04 AM, Victor Tapia wrote:
>> Hi list,
>>
>> I've faced a race condition when SSSD boots in a machine with a big
>> clock drift. This is what I see:
>>
>> 1. SSSD starts before the network is up, queries the LDAP server without
>> success and sets a retry timer (~60 secs)
>> 2. NTP starts and corrects the clock, 1 hour back for example.
>> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
>> connection.
>>
>> In this particular scenario the credentials cache is disabled, so the
>> wait time to login is noticeable. How feasible would it be to use a
>> monotonic clock for this kind of timed events?
>
> Are you running git master? This issue is supposed to be already
> solved by
> https://github.com/SSSD/sssd/commit/b8ceaeb80cffb00c26390913ea959b77f7e848b9

This patch fix the issue only in watchdog which would result in
terminating sssd otherwise. Fixing it across whole sssd would be
difficult. The fix should go to tevent.
[SSSD] Re: Monotonic clock for timed events
On Mon, Oct 10, 2016 at 11:09:35AM +0200, Victor Tapia wrote:
> On 10/10/16 at 10:56, Ian Kent wrote:
> > On Mon, 2016-10-10 at 10:42 +0200, Jakub Hrozek wrote:
> >> On Mon, Oct 10, 2016 at 10:04:30AM +0200, Victor Tapia wrote:
> >>> Hi list,
> >>>
> >>> I've faced a race condition when SSSD boots in a machine with a big
> >>> clock drift. This is what I see:
> >>>
> >>> 1. SSSD starts before the network is up, queries the LDAP server without
> >>> success and sets a retry timer (~60 secs)
> >>> 2. NTP starts and corrects the clock, 1 hour back for example.
> >>> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> >>> connection.
> >>>
> >>> In this particular scenario the credentials cache is disabled, so the
> >>> wait time to login is noticeable. How feasible would it be to use a
> >>> monotonic clock for this kind of timed events?
> >>
> >> I really have not tried this and I guess I don't know tevent internals
> >> well enough if this works, but I wonder if just using:
> >> clock_gettime()
> >
> > With a CLOCK_MONOTONIC I presume?
> > I think that's what has been suggested.
> >
>
> I was thinking about using CLOCK_MONOTONIC_RAW:
>
> CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
>     Similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based
>     time that is not subject to NTP adjustments or the incremental
>     adjustments performed by adjtime(3).
>
> >> and constructing struct timeval
> >> in place of:
> >> tevent_timeval_current_ofs()
> >> could solve this particular issue.
> >>
> >> On the other hand, this is a pattern we use in SSSD all through the code
> >> for timed events and we're just not well equipped to handle time drifts.
> >> Did you investigate why doesn't sssd detect the networking change from
> >> libnl messages or from resolv.conf being touched?
>
> I didn't dig much into it yet (I just checked tevent to confirm it uses
> gettimeofday()), so I'll take this as my next step.

btw the samba-technical mailing list is the best source of info about
libtevent and the best place to ask questions about libtevent..
[SSSD] Re: Monotonic clock for timed events
On 10/10/16 at 10:56, Ian Kent wrote:
> On Mon, 2016-10-10 at 10:42 +0200, Jakub Hrozek wrote:
>> On Mon, Oct 10, 2016 at 10:04:30AM +0200, Victor Tapia wrote:
>>> Hi list,
>>>
>>> I've faced a race condition when SSSD boots in a machine with a big
>>> clock drift. This is what I see:
>>>
>>> 1. SSSD starts before the network is up, queries the LDAP server without
>>> success and sets a retry timer (~60 secs)
>>> 2. NTP starts and corrects the clock, 1 hour back for example.
>>> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
>>> connection.
>>>
>>> In this particular scenario the credentials cache is disabled, so the
>>> wait time to login is noticeable. How feasible would it be to use a
>>> monotonic clock for this kind of timed events?
>>
>> I really have not tried this and I guess I don't know tevent internals
>> well enough if this works, but I wonder if just using:
>> clock_gettime()
>
> With a CLOCK_MONOTONIC I presume?
> I think that's what has been suggested.
>

I was thinking about using CLOCK_MONOTONIC_RAW:

CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
    Similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based
    time that is not subject to NTP adjustments or the incremental
    adjustments performed by adjtime(3).

>> and constructing struct timeval
>> in place of:
>> tevent_timeval_current_ofs()
>> could solve this particular issue.
>>
>> On the other hand, this is a pattern we use in SSSD all through the code
>> for timed events and we're just not well equipped to handle time drifts.
>> Did you investigate why doesn't sssd detect the networking change from
>> libnl messages or from resolv.conf being touched?

I didn't dig much into it yet (I just checked tevent to confirm it uses
gettimeofday()), so I'll take this as my next step.

Thanks,
Victor
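[Editor's sketch] Reading CLOCK_MONOTONIC_RAW as proposed above might look like this. It is a minimal sketch, assuming a Linux/glibc build; read_steady_clock() and timespec_diff_ms() are hypothetical names, and the fallback to CLOCK_MONOTONIC is an added assumption for systems whose headers or kernel lack the raw clock.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a clock that is not affected by steps of the
 * wall clock. Prefers CLOCK_MONOTONIC_RAW (Linux-specific) and falls back
 * to CLOCK_MONOTONIC. Returns 0 on success, -1 on error. */
int read_steady_clock(struct timespec *ts)
{
    assert(ts != NULL);
#ifdef CLOCK_MONOTONIC_RAW
    if (clock_gettime(CLOCK_MONOTONIC_RAW, ts) == 0) {
        return 0;
    }
#endif
    return clock_gettime(CLOCK_MONOTONIC, ts);
}

/* Elapsed milliseconds between two readings of the same clock. */
int64_t timespec_diff_ms(const struct timespec *start,
                         const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000
           + (end->tv_nsec - start->tv_nsec) / 1000000;
}
```

A retry timer would then compare elapsed time from read_steady_clock() against the retry interval instead of comparing gettimeofday() values.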
[SSSD] Re: Monotonic clock for timed events
On Mon, 2016-10-10 at 10:42 +0200, Jakub Hrozek wrote:
> On Mon, Oct 10, 2016 at 10:04:30AM +0200, Victor Tapia wrote:
> > Hi list,
> >
> > I've faced a race condition when SSSD boots in a machine with a big
> > clock drift. This is what I see:
> >
> > 1. SSSD starts before the network is up, queries the LDAP server without
> > success and sets a retry timer (~60 secs)
> > 2. NTP starts and corrects the clock, 1 hour back for example.
> > 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> > connection.
> >
> > In this particular scenario the credentials cache is disabled, so the
> > wait time to login is noticeable. How feasible would it be to use a
> > monotonic clock for this kind of timed events?
>
> I really have not tried this and I guess I don't know tevent internals
> well enough if this works, but I wonder if just using:
> clock_gettime()

With a CLOCK_MONOTONIC I presume?
I think that's what has been suggested.

> and constructing struct timeval
> in place of:
> tevent_timeval_current_ofs()
> could solve this particular issue.
>
> On the other hand, this is a pattern we use in SSSD all through the code
> for timed events and we're just not well equipped to handle time drifts.
> Did you investigate why doesn't sssd detect the networking change from
> libnl messages or from resolv.conf being touched?

I had a patch series contributed to autofs recently that changed to use
CLOCK_MONOTONIC for the same type of problem.

However, the man page says of CLOCK_MONOTONIC:
"Clock that cannot be set and represents monotonic time since some
unspecified starting point. This clock is not affected by discontinuous
jumps in the system time (e.g., if the system administrator manually
changes the clock), but is affected by the incremental adjustments
performed by adjtime(3) and NTP."

So perhaps this won't actually help when NTP adjusts the clock (although
NTP should, perhaps, adjust the clock slowly enough). I didn't catch that
when I reviewed the autofs patch series, mmm.

Ian
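[Editor's sketch] The difference between a clock step and the incremental adjustments mentioned in the man page excerpt can be put into rough numbers. The two functions below are hypothetical and purely arithmetic; the 500 ppm slew limit used in the usage note is an assumption (a commonly cited maximum for NTP slewing), not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case error, in microseconds, that slewing at max_slew_ppm parts
 * per million can add to an interval of interval_sec seconds:
 * interval_sec * 1e6 us/s * (max_slew_ppm / 1e6) = interval_sec * ppm. */
int64_t slew_error_us(int64_t interval_sec, int64_t max_slew_ppm)
{
    assert(interval_sec >= 0 && max_slew_ppm >= 0);
    return interval_sec * max_slew_ppm;
}

/* Seconds a timer based on the wall clock still has to wait if the clock
 * is stepped by step_sec (negative means backwards) right after a timer
 * of delay_sec seconds was armed. A forward step past the deadline makes
 * the timer fire immediately, hence the clamp to 0. */
int64_t wall_clock_wait_after_step(int64_t delay_sec, int64_t step_sec)
{
    int64_t wait = delay_sec - step_sec;
    return wait > 0 ? wait : 0;
}
```

With these assumptions, slew_error_us(60, 500) gives 30000, that is, about 30 ms of error on a 60 second retry timer, whereas wall_clock_wait_after_step(60, -3600) gives 3660 seconds, which matches the "~60 secs + 1 hour" delay described in the original report.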
[SSSD] Re: Monotonic clock for timed events
On Mon, Oct 10, 2016 at 10:04:30AM +0200, Victor Tapia wrote:
> Hi list,
>
> I've faced a race condition when SSSD boots in a machine with a big
> clock drift. This is what I see:
>
> 1. SSSD starts before the network is up, queries the LDAP server without
> success and sets a retry timer (~60 secs)
> 2. NTP starts and corrects the clock, 1 hour back for example.
> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> connection.
>
> In this particular scenario the credentials cache is disabled, so the
> wait time to login is noticeable. How feasible would it be to use a
> monotonic clock for this kind of timed events?

I really have not tried this and I guess I don't know tevent internals
well enough if this works, but I wonder if just using:
    clock_gettime()
and constructing struct timeval
in place of:
    tevent_timeval_current_ofs()
could solve this particular issue.

On the other hand, this is a pattern we use in SSSD all through the code
for timed events and we're just not well equipped to handle time drifts.
Did you investigate why doesn't sssd detect the networking change from
libnl messages or from resolv.conf being touched?
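[Editor's sketch] Building a struct timeval from clock_gettime(), as suggested above, might look like the following. The helper names are hypothetical and not part of tevent or SSSD. One caveat, stated as an assumption: such a deadline is only meaningful if the event loop that consumes it also reads CLOCK_MONOTONIC; handing it to a loop that compares against gettimeofday() would not give the intended result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical counterpart of tevent_timeval_current_ofs(): compute
 * "now + secs" on CLOCK_MONOTONIC and store it as a struct timeval.
 * Returns 0 on success, -1 if clock_gettime() fails (errno is set). */
int monotonic_timeval_current_ofs(uint32_t secs, struct timeval *out)
{
    struct timespec now;

    assert(out != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0) {
        return -1;
    }
    out->tv_sec = now.tv_sec + (time_t)secs;
    out->tv_usec = now.tv_nsec / 1000; /* nanoseconds to microseconds */
    return 0;
}

/* Whole seconds left until the deadline, measured on the same clock.
 * Returns 0 once the deadline second has been reached, -1 on error. */
long monotonic_seconds_left(const struct timeval *deadline)
{
    struct timespec now;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0) {
        return -1;
    }
    if (deadline->tv_sec <= now.tv_sec) {
        return 0;
    }
    return (long)(deadline->tv_sec - now.tv_sec);
}
```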
[SSSD] Re: Monotonic clock for timed events
Hi Fabiano and list,

Please forgive me, I just realized that I wasn't, indeed, running from
master. I'll try again with the proper version and come back if needed.

Thanks!
Victor

On 10/10/16 at 10:09, Fabiano Fidêncio wrote:
> Victor,
>
> On Mon, Oct 10, 2016 at 10:04 AM, Victor Tapia wrote:
>> Hi list,
>>
>> I've faced a race condition when SSSD boots in a machine with a big
>> clock drift. This is what I see:
>>
>> 1. SSSD starts before the network is up, queries the LDAP server without
>> success and sets a retry timer (~60 secs)
>> 2. NTP starts and corrects the clock, 1 hour back for example.
>> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
>> connection.
>>
>> In this particular scenario the credentials cache is disabled, so the
>> wait time to login is noticeable. How feasible would it be to use a
>> monotonic clock for this kind of timed events?
>
> Are you running git master? This issue is supposed to be already
> solved by
> https://github.com/SSSD/sssd/commit/b8ceaeb80cffb00c26390913ea959b77f7e848b9
>
> Best Regards,
[SSSD] Re: Monotonic clock for timed events
Victor,

On Mon, Oct 10, 2016 at 10:04 AM, Victor Tapia wrote:
> Hi list,
>
> I've faced a race condition when SSSD boots in a machine with a big
> clock drift. This is what I see:
>
> 1. SSSD starts before the network is up, queries the LDAP server without
> success and sets a retry timer (~60 secs)
> 2. NTP starts and corrects the clock, 1 hour back for example.
> 3. SSSD takes ~60 secs + the drift correction (1 hour) to retry the
> connection.
>
> In this particular scenario the credentials cache is disabled, so the
> wait time to login is noticeable. How feasible would it be to use a
> monotonic clock for this kind of timed events?

Are you running git master? This issue is supposed to be already
solved by
https://github.com/SSSD/sssd/commit/b8ceaeb80cffb00c26390913ea959b77f7e848b9

Best Regards,
--
Fabiano Fidêncio