[chrony-users] chrony and ntpd xleave interoperability
Hello,

First, my apologies for the mix-up on chrony-dev when I tried to subscribe to chrony-users...

I'm running some tests to replace ntpd with chrony on some groups of servers.
These servers use a peer association with the interleave option.
When I try to do the same with ntpd on one side and chrony on the other, things go bad.
At best, chrony gets a working association with interleave status and a very long response time.
On the ntpd side, the association never works. The chrony server never gets to the "reach" state and the reach counter is stuck at zero.
As soon as I remove the xleave option on the ntpd side, everything immediately starts to work as expected.

ntpd:
peer y.y.y.y minpoll 5 maxpoll 10 xleave
restrict y.y.y.y notrap nomodify noquery

chrony:
peer x.x.x.x xleave minpoll 5 maxpoll 10
allow x.x.x.0/24

Since yesterday, the xleave option had been removed on the ntpd side. All was good on both sides.
So I tried to reactivate the xleave option -> Boom, it works!!!
I restarted chrony -> ntpd logged "receive: KoD packet from 192.54.145.235 has a zero org or rec timestamp. Ignoring." and four minutes later "y.y.y.y 8613 83 unreachable".
The previously working association is now dead. No working association from chrony.
So I restarted ntpd -> chrony starts to see the other server (ntpdata) but never reaches a good state.
-> ntpd does not reach the "reach" state.
Remove xleave from ntpd and restart -> everything is still stuck.
Restart chrony -> ntpd starts to see the chrony server, the reach counter increments, and it reaches a "backup" condition. All is good on the chrony side.
Re-add the xleave option on the ntpd side: the unreach counter increments, flash=1606 so packet_bogus... On the chrony side, "Total valid RX" no longer increments...

I'm lost.

chrony 3.2
ntp-4.2.8p8, ntp-4.2.8p10

Can I normally expect xleave interoperability between chrony and ntpd, or is it something too "implementation specific"?

Emmanuel.
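For readers trying to interpret the flash=1606 value reported above, here is a minimal sketch that splits an ntpd flash word into its individual test flags. The bit names follow the flash status table in the ntpq documentation as I remember it, so they are worth checking against the documentation of the ntp release in use; the function name decode_flash is made up for this example.

```python
# Flash status bits (TEST1..TEST13), names assumed from the ntpq
# "flash status word" documentation; verify against your ntp release.
FLASH_BITS = [
    (0x0001, "pkt_dup"),
    (0x0002, "pkt_bogus"),
    (0x0004, "pkt_unsync"),
    (0x0008, "pkt_denied"),
    (0x0010, "pkt_auth"),
    (0x0020, "pkt_stratum"),
    (0x0040, "pkt_header"),
    (0x0080, "pkt_autokey"),
    (0x0100, "pkt_crypto"),
    (0x0200, "peer_stratum"),
    (0x0400, "peer_dist"),
    (0x0800, "peer_loop"),
    (0x1000, "peer_unreach"),
]


def decode_flash(flash_hex: str) -> list[str]:
    """Return the names of all test flags set in a hexadecimal flash word."""
    value = int(flash_hex, 16)
    return [name for mask, name in FLASH_BITS if value & mask]


# The value from the report above.
print(decode_flash("1606"))
# ['pkt_bogus', 'pkt_unsync', 'peer_stratum', 'peer_dist', 'peer_unreach']
```

Under these assumptions, 1606 carries the bogus-packet flag mentioned in the report together with the unsynchronized and unreachable flags.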
Re: [chrony-users] chrony and ntpd xleave interoperability
On 23/01/2018 at 16:58, Miroslav Lichvar wrote:
> On Tue, Jan 23, 2018 at 02:44:56PM +0100, FUSTE Emmanuel wrote:
>> On 23/01/2018 at 13:00, Miroslav Lichvar wrote:
>>> With the current versions, if you can avoid the issue with
>>> unsynchronized sources, they should interoperate, at least when their
>>> polling intervals match. If it doesn't work for you, I'd like to see a
>>> tcpdump output.
>> OK. I fixed the min/max polling interval to 5 for testing purposes.
>> I first restarted chrony and waited for it to sync to an online source.
>> Then I restarted ntpd and took a capture.
>> I will send you all the data.
>>
>> ntpd is stuck in the unreachable state.
>> chrony is stuck with only one valid RX.
> Ok. I can reproduce this problem. It seems ntpd doesn't update its
> state in the interleaved mode when it receives a packet with an
> unexpected origin timestamp. There was a similar issue fixed for the
> basic mode a few ntp releases ago:
> https://bugs.ntp.org/show_bug.cgi?id=2952
>
> As chronyd doesn't switch to the interleaved mode until it's receiving
> valid responses and ntpd doesn't accept responses in the basic mode,
> they are stuck waiting forever on each other.
>
> A similar thing seems to happen when trying to use the interleaved mode
> between two 4.2.8p10 ntpds. You said it worked for you before, so I
> assume one of the ntpds was an older version which didn't have this
> bug?
>
Here is data from the working 4.2.8p10 platform, which is composed of w.w.w.w, y.y.y.y and z.z.z.z:

ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 29450  f414   yes   yes   ok  candidate  reachable    1
  2 29451  f414   yes   yes   ok  candidate  reachable    1
  3 29452  f31f   yes   yes   ok  outlier                 1
  4 29453  961a   yes   yes  none sys.peer   sys_peer     1
  5 29454  931d   yes   yes  none outlier                 1

ntpq> lpe
     remote           refid   st t when poll reach   delay   offset  jitter
===========================================================================
+x.x.x.x         .MRS.         1 u    5    8  377    0.363    0.038   0.030
+y.y.y.y         .PTP0.        1 s   25   64  377    0.071    0.017   0.035
-z.z.z.z         .PTP0.        1 s   45   64  376    0.058    0.041   0.044
*SHM(0)          .PTP0.        0 l    2    8  377    0.000   -0.017   0.005
-ntp-gps-1.thale .GPS.         1 u    4    8  377    5.031   -0.435   0.020

ntpq> rv 29451
associd=29451 status=f414 conf, authenb, auth, reach, sel_candidate, 1 event, reachable,
srcadr=y.y.y.y, srcport=123, dstadr=w.w.w.w, dstport=123, leap=00,
stratum=1, precision=-23, rootdelay=0.000, rootdisp=1.099, refid=PTP0,
reftime=de11e3d4.1850d73b  Tue, Jan 23 2018 17:39:48.094,
rec=de11e3db.18563cd1  Tue, Jan 23 2018 17:39:55.095, reach=376,
unreach=0, hmode=1, pmode=1, hpoll=6, ppoll=6, headway=51, flash=00 ok,
keyid=112, offset=0.017, delay=0.071, dispersion=1.719, jitter=0.035,
xleave=0.024,
filtdelay=     0.09    0.10    0.07    0.12    0.13    0.11    0.11    0.16,
filtoffset=   -0.01   -0.02    0.02    0.06    0.05   -0.01   -0.04    0.00,
filtdisp=      0.00    0.96    1.95    2.94    3.90    4.89    5.88    6.86

ntpq> rv 29452
associd=29452 status=f31f conf, authenb, auth, reach, sel_outlier, 1 event, interleave_error,
srcadr=z.z.z.z, srcport=123, dstadr=w.w.w.w, dstport=123, leap=00,
stratum=1, precision=-23, rootdelay=0.000, rootdisp=1.099, refid=PTP0,
reftime=de11e4c0.a5c3751c  Tue, Jan 23 2018 17:43:44.647,
rec=de11e4c7.a5ca043a  Tue, Jan 23 2018 17:43:51.647, reach=377,
unreach=0, hmode=1, pmode=1, hpoll=6, ppoll=6, headway=13, flash=00 ok,
keyid=113, offset=0.041, delay=0.058, dispersion=5.542, jitter=0.062,
xleave=0.014,
filtdelay=     0.11    0.14    0.11    0.11    0.10    0.08    0.06    0.08,
filtoffset=    0.03   -0.05   -0.02   -0.02   -0.03   -0.02    0.04    0.09,
filtdisp=      0.00    0.98    1.92    2.87    3.84    4.83    5.78    6.75

Emmanuel.
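The reach values in the output above (377, 376) are octal renderings of ntpd's 8-bit reachability shift register. As a small sketch (decode_reach is a name made up for this example), the register can be unpacked into the outcome of the last eight polls, with the lowest bit being the most recent one:

```python
def decode_reach(reach_octal: str) -> list[bool]:
    """Unpack an ntpq 'reach' value (octal) into eight poll results.

    Index 0 is the most recent poll; True means a valid response was
    recorded for that poll.
    """
    value = int(reach_octal, 8) & 0xFF
    return [bool((value >> i) & 1) for i in range(8)]


print(decode_reach("377"))  # all eight recent polls were answered
print(decode_reach("376"))  # latest poll not (yet) answered, seven before it were
```

A value of 376 is therefore not necessarily a sign of trouble: it can simply mean the register was shifted for a new poll and the response had not been processed when ntpq was run.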
Re: [chrony-users] chrony and ntpd xleave interoperability
On 23/01/2018 at 13:00, Miroslav Lichvar wrote:
> On Tue, Jan 23, 2018 at 11:31:38AM +0100, FUSTE Emmanuel wrote:
>> When I try to do the same with ntpd on one side and chrony on the other,
>> things go bad.
>> At best, chrony got a working association with interleave status with
>> very long response time.
> A long response time up to the polling interval of the peer is normal
> in symmetric associations.
>
>> On the ntpd side, the association never work. The chrony server never
>> get the "reach" state and the reach counter is stuck a zero.
> Have you tried the same configuration and the timing of restarts
> between two ntpd servers? I suspect you would see some of the issues
> in this case too.
>
> There are probably multiple issues involved, which make it difficult
> to see what's going on. I'm aware of the following:
>
> - ntpd doesn't accept packets from peers that are not synchronized
>   (yet), so peers have to be configured with other sources in order
>   for the symmetric association (in both basic and interleaved modes)
>   to start. See https://bugs.ntp.org/show_bug.cgi?id=3445.
> - interleaved mode in ntpd works only when the peers use the same
>   polling interval. If they have the same minpoll and maxpoll, but
>   minpoll != maxpoll, they should in theory both get to the maxpoll
>   if the association doesn't work, but there may be a bug that
>   prevents that.
> - chrony switches to the basic mode when the polling intervals don't
>   match, but ntpd doesn't accept responses in the basic mode if the
>   interleaved mode is enabled
>
>> chrony 3.2
>> ntp-4.2.8p8, ntp-4.2.8p10
>>
>> Could I normally expect xleave interoperability between chrony and ntpd
>> or it is something too much "implementation specific" ?
> With the current versions, if you can avoid the issue with
> unsynchronized sources, they should interoperate, at least when their
> polling intervals match. If it doesn't work for you, I'd like to see a
> tcpdump output.
OK. I fixed the min/max polling interval to 5 for testing purposes.
I first restarted chrony and waited for it to sync to an online source.
Then I restarted ntpd and took a capture.
I will send you all the data.

ntpd is stuck in the unreachable state.
chrony is stuck with only one valid RX.
>
> Please note that the symmetric mode has some security issues and it's
> generally recommended to use the client/server mode instead. Even if
> authentication is enabled, it is possible to break a symmetric
> association by replaying old packets. (chrony has a partial protection
> against this attack, but it works only in the basic mode when the
> polling intervals match and there are no packets with timestamps from
> future that could be replayed. It's too fragile, don't rely on it!)
Yes, I know. It is only used on "trusted" LAN segments and/or to try to interoperate with ntpd xleave.
>
> It is possible that support for symmetric associations will be dropped
> from chrony in future.
>
I am only using it to transition from ntpd to chrony, so it will not be missed.
I hope my clock vendor will someday move from ntpd to something else (chrony) to get good xleave support (and much more).
In any case, I mainly use these clocks with PTP, so the NTP part only affects fail-over scenarios.

Emmanuel.
Re: [chrony-users] chrony and ntpd xleave interoperability
On 24/01/2018 at 13:45, Miroslav Lichvar wrote:
> On Tue, Jan 23, 2018 at 05:42:22PM +0100, FUSTE Emmanuel wrote:
>> On 23/01/2018 at 16:58, Miroslav Lichvar wrote:
>>> A similar thing seem to happen when trying to use the interleaved mode
>>> between two 4.2.8p10 ntpds. You said it worked for you before, so I
>>> assume one of the ntpds was an older version which didn't have this
>>> bug?
>> I have a platform with three ntpds in interleaved mode.
>> It was on 4.2.8p8.
>> They were upgraded today to 4.2.8p10 and are still working properly.
> You are right. My test was bad (it hit the bug with unsynchronized
> source).
>
> The bug in the interleaved mode is a bit more subtle. The state is
> updated from the received packet, but only when one of the timestamps is
> zero (i.e. it's the first packet of the association). This means two
> ntpd 4.2.8p10 can interoperate, but I suspect the association will not
> recover if there is a mismatch between the receive timestamps.
>
> I'll send a bug report to the ntp maintainers.
>
> In the meantime, if you are willing to patch ntp, this should fix it:
>
> diff -up ntp-4.2.8p10/ntpd/ntp_proto.c.orig ntp-4.2.8p10/ntpd/ntp_proto.c
> --- ntp-4.2.8p10/ntpd/ntp_proto.c.orig	2018-01-24 13:35:16.611488502 +0100
> +++ ntp-4.2.8p10/ntpd/ntp_proto.c	2018-01-24 13:35:24.113505866 +0100
> @@ -1774,7 +1774,6 @@ receive(
>  			peer->bogusorg++;
>  			peer->flags |= FLAG_XBOGUS;
>  			peer->flash |= TEST2; /* bogus */
> -			return; /* Bogus packet, we are done */
>  		}
>
Yes, it works!
Thank you.

Emmanuel.
Re: [chrony-users] NTP bogus timestamps - Chrony on openSUSE 15.1
On 21/08/2019 at 16:00, James Knott wrote:
> On 2019-08-21 09:44 AM, Miroslav Lichvar wrote:
>> It has no impact on accuracy.
> Maybe not on my local network, but what if the server was some distance
> away? I realize NTP was developed back in the days when a 56 Kb/s
> connection was really something, but even with today's high bandwidth
> connections there is some latency that would cause the client to be
> slightly behind the server. The calculations based on those time stamps
> were meant to determine that latency and correct for it.
>
> Incidentally, at work a few months ago, there was some discussion about
> NTP on a major LRT project I was working on, though I wasn't directly
> involved with the NTP servers. On this system, they have 2 GPS/NTP
> servers, at different locations, that were to be synced with 2 other
> servers. This system runs over a fibre backbone, that's 11 km long and
> they're somewhat fussy about NTP. I had to explain, to one of my
> co-workers, how NTP worked.
>
Please read the spec. It is not used the way you think. It has NO impact on the way the calculations are done, and therefore no impact on accuracy.

Emmanuel.
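For readers following the latency argument, here is a rough sketch of the basic-mode offset and delay calculation as described in the NTP specification (RFC 5905). It uses four timestamps: T1 (client transmit), T2 (server receive), T3 (server transmit) and T4 (client receive). The function name and the sample values are invented for illustration; this is not chrony's or ntpd's actual code.

```python
def ntp_offset_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Basic NTP on-wire calculation from the four exchange timestamps.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)            # round trip minus server processing
    return offset, delay


# Made-up exchange: server 100 ms ahead, 5 ms each way, 1 ms processing.
offset, delay = ntp_offset_delay(0.000, 0.105, 0.106, 0.011)
print(f"offset={offset:.3f}s delay={delay:.3f}s")
```

Only these four timestamps enter the calculation, which is the sense in which other fields of the packet cannot change the result. The offset estimate assumes the path is symmetric; any asymmetry shows up as an error of at most half the measured delay.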
Re: [chrony-users] Resume from suspend and default makestep configuration
Hello Pali,

On 18/05/2020 at 12:37, Pali Rohár wrote:
> The main problem is when system is put into suspend or hibernate state.
>
> In my opinion resuming from suspend / hibernate state should be handled
> in the same way as (re)starting chronyd. You do not know what may
> happened during sleep.
Yes, and if a workaround is needed, it should be done at the system level, not in chrony.
A job for systemd.
> And as I pointed there are existing problems that UEFI/BIOS firmware
> changes RTC clock without good reason which results in completely wrong
> system clock.
>
Such systems could well be identified by a blacklist at the udev/systemd level, to decide whether or not to apply the workaround (restart chrony or launch a chronyc command at resume).

Emmanuel.
Re: [chrony-users] Resume from suspend and default makestep configuration
On 18/05/2020 at 13:15, Pali Rohár wrote:
> On Monday 18 May 2020 10:45:02 FUSTE Emmanuel wrote:
>> Hello Pali,
>>
>> On 18/05/2020 at 12:37, Pali Rohár wrote:
>>> The main problem is when system is put into suspend or hibernate state.
>>>
>>> In my opinion resuming from suspend / hibernate state should be handled
>>> in the same way as (re)starting chronyd. You do not know what may
>>> happened during sleep.
>> Yes and in case of needed workaround, it should be done at the system
>> level, not chrony.
>> A job for systemd.
> Hello! Sorry for a stupid question, but what has systemd in common with
> chronyd? Why should systemd care about chronyd time synchronization?
Nothing.
But it is up to your "process manager", be it systemd, a sysvinit pile of scripts or whatever, to restart or notify chrony. It has to do housekeeping for other things anyway when you suspend/resume.
In exactly the same way, NetworkManager, ifupdown scripts and systemd-networkd reload/restart some network services when interfaces/tunnels/VPNs are brought up or down.
>>> And as I pointed there are existing problems that UEFI/BIOS firmware
>>> changes RTC clock without good reason which results in completely wrong
>>> system clock.
>>>
>> Could well be identified by blacklist at the udev/systemd level for
>> applying or not the workaround (restart chrony or launch a chronyc
>> command at resume)
> Could you describe in details what do you mean by blacklist? Which udev
> blacklist you mean and what should be put into that blacklist? I have
> not caught this part.
Faulty systems could be identified by their DMI/ACPI strings and a quirk applied.
See for example /lib/udev/hwdb.d/60-sensor.hwdb for some laptop sensors.
We could add an attribute to the RTC if it matches some vendor/BIOS version/model etc. listed in the hwdb (the blacklist).
A udev rule would assign this attribute to the RTC if you are running on a known buggy system.
A script in /lib/systemd/system-sleep could then do anything you want at suspend/resume time if your RTC has the offending attribute (see the systemd-sleep man page).
Or better, a unit run at resume time could do the same.
The hwdb abstraction is not needed if it is a local hack; otherwise it should be properly defined with the hwdb/udev/systemd developers.
If raised with the systemd developers, systemd-sleep / resume could handle this directly and fire an appropriate target based on a formally defined attribute in the hwdb. What to do with this target could be configurable and default to a time daemon restart.
I'm not a systemd/udev/hwdb expert/developer, but I think this is a good track and deserves a discussion with them.

Anyway, the level at which to tackle the problem is not chrony; the proper level for managing it is the init/process manager.
Hwdb/udev is "a" way to share the faulty-systems information across the "init" ecosystem, information that is useful not only for chrony.

Emmanuel.
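The system-sleep hook idea described above could look roughly like the following sketch. It relies on the documented systemd-sleep behaviour of calling each executable in the system-sleep directory with two arguments ("pre" or "post", then the sleep action), and on the existing "chronyc makestep" command. The file name, the helper names and the choice of command are assumptions for illustration, and the hwdb/udev check for known-buggy machines discussed above is deliberately left out.

```python
#!/usr/bin/env python3
# Hypothetical hook, e.g. installed as /lib/systemd/system-sleep/50-chrony
# (the file name is an assumption). systemd-sleep calls it with:
#   argv[1] = "pre" or "post"
#   argv[2] = "suspend", "hibernate", "hybrid-sleep", ...
import subprocess
import sys


def commands_for(phase: str, action: str) -> list[list[str]]:
    """Return the chronyc commands to run for a given sleep phase."""
    if phase != "post":
        return []  # nothing to do before going to sleep
    # After resume, ask chronyd to step the clock instead of slewing.
    # A real deployment might first check a quirk for known-buggy machines.
    return [["chronyc", "makestep"]]


def main(argv: list[str]) -> int:
    if len(argv) < 3 or argv[1] not in ("pre", "post"):
        return 0  # not called by systemd-sleep; do nothing
    for cmd in commands_for(argv[1], argv[2]):
        try:
            subprocess.run(cmd, check=False)
        except FileNotFoundError:
            pass  # chronyc not installed; ignore
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the decision in commands_for makes it easy to swap in a different reaction, for example restarting the time daemon, without touching the hook plumbing.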
Re: [chrony-users] Resume from suspend and default makestep configuration
Le 19/05/2020 à 12:29, Pali Rohár a écrit : > On Monday 18 May 2020 13:45:04 FUSTE Emmanuel wrote: >> Le 18/05/2020 à 13:15, Pali Rohár a écrit : >>> On Monday 18 May 2020 10:45:02 FUSTE Emmanuel wrote: >>>> Hello Pali, >>>> >>>> Le 18/05/2020 à 12:37, Pali Rohár a écrit : >>>>> The main problem is when system is put into suspend or hibernate state. >>>>> >>>>> In my opinion resuming from suspend / hibernate state should be handled >>>>> in the same way as (re)starting chronyd. You do not know what may >>>>> happened during sleep. >>>> Yes and in case of needed workaround, it should be done at the system >>>> level, not chrony. >>>> A job for systemd. >>> Hello! Sorry for a stupid question, but what has systemd in common with >>> chronyd? Why should systemd care about chronyd time synchronization? >> Nothing. >> But it is to your "process manager" being systemd, sysvinit pile of >> scripts or whatever to restart or notify chrony, it has do do >> housekeeping anyway for other things when you suspend/resume. > Hm... I remember that in past it was needed to blacklist broken daemons, > software and kernel modules which did not work correctly during S3 or > hibernate state. It was in some pm scripts utils... > > But I thought that these days are already passed and software can deal > with fact that machine may be put into suspend or hibernate state. > > So what you are suggesting is to put chronyd daemon into list of broken > software (which needs to be stopped prior suspend / resume)? > > It does not make sense for me as the immediate step after putting > software or kernel module into such "blacklist" was to inform upstream > authors of that daemon or kernel module they it is broken / incompatible > with suspend state and it should be fixed. > > That "blacklist" was just workaround for buggy software and not > permanent solution. 
No, not chrony, but the machine which changes the RTC behind your back: the buggy BIOS.
>
>> Exactly as networkmanager, ifupdown scripts, systemd-networkd
>> reload/restart some network services when interfaces/tunnels/vpn are
>> upped/downed.
> This is something totally different. all those mentioned "services" are
> just independent part of system which manages network connections.
>
> chronyd is there to manage time synchronization.
It was a figurative comparison for an event-driven configuration change.
In the suspend vs. time case, the event is only known to, and should be managed by, your init system, not your time daemon.
>
>>>>> And as I pointed there are existing problems that UEFI/BIOS firmware
>>>>> changes RTC clock without good reason which results in completely wrong
>>>>> system clock.
>>>>>
>>>> Could well be identified by blacklist at the udev/systemd level for
>>>> applying or not the workaround (restart chrony or launch a chronyc
>>>> command at resume)
>>> Could you describe in details what do you mean by blacklist? Which udev
>>> blacklist you mean and what should be put into that blacklist? I have
>>> not caught this part.
>> Faulty systems could be identified by DMI/ACPI strings and quirk applied.
> And what is the faulty system?
Quoting yourself: "as I pointed there are existing problems that UEFI/BIOS firmware changes RTC clock without good reason"
>
> I think this is something general and not related to particular machine.
> I guess under specific conditions it may happen on any system.
>
>> See for example /lib/udev/hwdb.d/60-sensor.hwdb for some laptop sensors.
>> We could add an attribute to the RTC if it matche some vendor/bios
>> version/model etc... to put in the hwdb (the blacklist)
>> A udev rule will assign this attribute to the RTC if you are running on
>> a known buggy system.
>> A script could do anything you want at suspend/resume time in
>> /lib/systemd/system-sleep if your RTC has the offended attribute (see
>> systemd-sleep man page).
>> Or better, a unit run at resume time could do anything too. >> The hwdb abstraction is not need if it is a local hack and should be >> properly defined with the hwdb/udev/systemd developers. > This database is for describing hardware differences or issues. > > But above problem with time synchronization is general and hardware > independent. You can simulate same issue on your machine. > > Just put your computer into hibernation. Then boot from liveUSB some > Linxu distribution and change RTC time. Turn off liveUSB and boot your > hibern
Re: [chrony-users] Resume from suspend and default makestep configuration
Le 19/05/2020 à 13:30, Pali Rohár a écrit : > On Tuesday 19 May 2020 11:10:01 FUSTE Emmanuel wrote: >> Le 19/05/2020 à 12:29, Pali Rohár a écrit : >>> On Monday 18 May 2020 13:45:04 FUSTE Emmanuel wrote: >>>> Le 18/05/2020 à 13:15, Pali Rohár a écrit : >>>>> On Monday 18 May 2020 10:45:02 FUSTE Emmanuel wrote: >>>>>> Hello Pali, >>>>>> >>>>>> Le 18/05/2020 à 12:37, Pali Rohár a écrit : >>>>>>> The main problem is when system is put into suspend or hibernate state. >>>>>>> >>>>>>> In my opinion resuming from suspend / hibernate state should be handled >>>>>>> in the same way as (re)starting chronyd. You do not know what may >>>>>>> happened during sleep. >>>>>> Yes and in case of needed workaround, it should be done at the system >>>>>> level, not chrony. >>>>>> A job for systemd. >>>>> Hello! Sorry for a stupid question, but what has systemd in common with >>>>> chronyd? Why should systemd care about chronyd time synchronization? >>>> Nothing. >>>> But it is to your "process manager" being systemd, sysvinit pile of >>>> scripts or whatever to restart or notify chrony, it has do do >>>> housekeeping anyway for other things when you suspend/resume. >>> Hm... I remember that in past it was needed to blacklist broken daemons, >>> software and kernel modules which did not work correctly during S3 or >>> hibernate state. It was in some pm scripts utils... >>> >>> But I thought that these days are already passed and software can deal >>> with fact that machine may be put into suspend or hibernate state. >>> >>> So what you are suggesting is to put chronyd daemon into list of broken >>> software (which needs to be stopped prior suspend / resume)? >>> >>> It does not make sense for me as the immediate step after putting >>> software or kernel module into such "blacklist" was to inform upstream >>> authors of that daemon or kernel module they it is broken / incompatible >>> with suspend state and it should be fixed. 
>>>
>>> That "blacklist" was just workaround for buggy software and not
>>> permanent solution.
>> No not chrony, but the machine which change RTC on your back : buggy Bios
> Sorry, but I have not caught this line. Blacklist contained list of
> buggy software, daemons and kernel modules which had to be (in past)
> stopped / unloaded prior system went to S3 and started / (re)loaded
> after system resumed. So obviously putting "buggy Bios" into blacklist
> not only does not make sense, but also it did nothing. In that
> particular case chronyd had to be put into that blacklist of buggy
> software as it as you described is chronyd which needs to be stopped /
> started... But as I said this was used in past when buggy software and
> kernel modules were there when they was not able to correctly handle S3
> state.
I said the machine, not chrony.
Please, I am not a native English speaker, but this conversation is becoming more and more like trolling.
A blacklist is a black list; it is a generic term, as you point out.
>
>>>> Exactly as networkmanager, ifupdown scripts, systemd-networkd
>>>> reload/restart some network services when interfaces/tunnels/vpn are
>>>> upped/downed.
>>> This is something totally different. all those mentioned "services" are
>>> just independent part of system which manages network connections.
>>>
>>> chronyd is there to manage time synchronization.
>> It was an "imaged comparison" for event driven config change.
>> The event in the suspend vs time case, the event is only know and
>> should be managed by your init system not by your time daemon.
>>
>>>>>>> And as I pointed there are existing problems that UEFI/BIOS firmware
>>>>>>> changes RTC clock without good reason which results in completely wrong
>>>>>>> system clock.
>>>>>>> >>>>>> Could well be identified by blacklist at the udev/systemd level for >>>>>> applying or not the workaround (restart chrony or launch a chronyc >>>>>> command at resume) >>>>> Could you describe in details what do you mean by blacklist? Which udev >>>>> blacklist you mean and what should be put into that blacklist? I have >>>>> not caught this part. >>>> Faulty systems could be identified by D
Re: [chrony-users] Resume from suspend and default makestep configuration
Le 19/05/2020 à 15:11, Pali Rohár a écrit : > On Tuesday 19 May 2020 12:42:28 FUSTE Emmanuel wrote: >> Le 19/05/2020 à 13:30, Pali Rohár a écrit : >>> On Tuesday 19 May 2020 11:10:01 FUSTE Emmanuel wrote: >>>> Le 19/05/2020 à 12:29, Pali Rohár a écrit : >>>>> On Monday 18 May 2020 13:45:04 FUSTE Emmanuel wrote: >>>>>> Le 18/05/2020 à 13:15, Pali Rohár a écrit : >>>>>>> On Monday 18 May 2020 10:45:02 FUSTE Emmanuel wrote: >>>>>>>> Hello Pali, >>>>>>>> >>>>>>>> Le 18/05/2020 à 12:37, Pali Rohár a écrit : >>>>>>>>> The main problem is when system is put into suspend or hibernate >>>>>>>>> state. >>>>>>>>> >>>>>>>>> In my opinion resuming from suspend / hibernate state should be >>>>>>>>> handled >>>>>>>>> in the same way as (re)starting chronyd. You do not know what may >>>>>>>>> happened during sleep. >>>>>>>> Yes and in case of needed workaround, it should be done at the system >>>>>>>> level, not chrony. >>>>>>>> A job for systemd. >>>>>>> Hello! Sorry for a stupid question, but what has systemd in common with >>>>>>> chronyd? Why should systemd care about chronyd time synchronization? >>>>>> Nothing. >>>>>> But it is to your "process manager" being systemd, sysvinit pile of >>>>>> scripts or whatever to restart or notify chrony, it has do do >>>>>> housekeeping anyway for other things when you suspend/resume. >>>>> Hm... I remember that in past it was needed to blacklist broken daemons, >>>>> software and kernel modules which did not work correctly during S3 or >>>>> hibernate state. It was in some pm scripts utils... >>>>> >>>>> But I thought that these days are already passed and software can deal >>>>> with fact that machine may be put into suspend or hibernate state. >>>>> >>>>> So what you are suggesting is to put chronyd daemon into list of broken >>>>> software (which needs to be stopped prior suspend / resume)? 
>>>>>
>>>>> It does not make sense for me as the immediate step after putting
>>>>> software or kernel module into such "blacklist" was to inform upstream
>>>>> authors of that daemon or kernel module they it is broken / incompatible
>>>>> with suspend state and it should be fixed.
>>>>>
>>>>> That "blacklist" was just workaround for buggy software and not
>>>>> permanent solution.
>>>> No not chrony, but the machine which change RTC on your back : buggy Bios
>>> Sorry, but I have not caught this line. Blacklist contained list of
>>> buggy software, daemons and kernel modules which had to be (in past)
>>> stopped / unloaded prior system went to S3 and started / (re)loaded
>>> after system resumed. So obviously putting "buggy Bios" into blacklist
>>> not only does not make sense, but also it did nothing. In that
>>> particular case chronyd had to be put into that blacklist of buggy
>>> software as it as you described is chronyd which needs to be stopped /
>>> started... But as I said this was used in past when buggy software and
>>> kernel modules were there when they was not able to correctly handle S3
>>> state.
>> I said the machine not chrony.
>> Please I'm not native English, but this conversation became more and
>> more like a trolling one.
>> Blacklist are black list, this is a generic term as you point out.
> Sorry for that. Lets call it just list. If you want to somehow use
> machine in that list, then you probably need tuple
> and teach scripts around to read that list as tuple and restart
> "software" if "machine" matches string of current machine on which it is
> running.
Yes, and the software in this case is "software that provides time sync".
>
> I'm saying that in past this was just list of "buggy" software and
> kernel modules which needs to be restarted during S3. It was not some
> smart structure where you was able to define rules like "if you are
> running on machine ABC then restart software CDE".
And this is I guess > what you want to achieve by putting machine on list. > >>>>>> Exactly as networkmanager, ifupdown scripts
Re: [chrony-users] Resume from suspend and default makestep configuration
On 19/05/2020 at 17:54, Pali Rohár wrote:
> On Tuesday 19 May 2020 17:36:15 Miroslav Lichvar wrote:
>> On Tue, May 19, 2020 at 03:11:42PM +0200, Pali Rohár wrote:
>>> Also when resuming from hibernation you may have been completely powered
>>> off and also memory of system may have been modified. Plus multiOS
>>> scenario may have applied, e.g. ordinary user just "booted" windows and
>>> then turned it off and resumed linux from hibernation. I guess we would
>>> agree that ordinary user does not use any virtualisation as you
>>> described below.
>> I don't think that's a common practice. If you suspend an OS and boot
>> another, all kind of things can break, like corrupted swaps, etc. If
>> you know what you are doing, fine, but don't be surprised when things
>> break.
> I know that lot of people are doing it. They are not developers,
> sysadmins or people who watch mailing list, ... just normal users.
> So from my observation, this is common. Maybe it is less common by
> developers who know what can happen and break, but not uncommon by
> ordinary non-power users.
>
> When hibernating windows it puts special signature on NTFS filesystems
> and Linux's ntfs-fuse refuse to mount in R/W mode such "hibernated" NTFS
> filesystem. So there is no corruption of hibernated windows state.
>
> Windows does not support accessing ext4, btrfs or linux swap so there is
> no corruption of linux fs/swap from windows.
>
>> When chronyd is running, it assumes it has full control over the
>> system clock. When you suspend and resume the OS or machine, the
>> system clock is reset to the RTC. chronyd can see there was a forward
>> jump, but it doesn't know what happened. systemd should know that and
>> there could be a unit to call the chronyc reset and makestep commands
>> if a significant offset is expected.
> But systemd cannot know that. It is chronyd who see that significant
systemd knows that you are resuming from suspend.
> jump occurred and only after it synchronize time via NTP. And until NTP
Wrong. chrony does not need to sync via NTP to see that the system clock jumped.
> daemon tell (somehow) hat this jumps occurred, systemd cannot know that
> during hibernation RTC clock was modified.
That should not happen in the normal case.
>
> This looks like a chicken and egg problem. systemd (or any other init /
> service system) does not know correct time after resuming system from
> suspend/hibernate, so it cannot check if RTC jump occurred. chronyd is
The RTC should be the trusted source in this case, so the init system, knowing that you are resuming from suspend, should notify the NTP daemon that all is OK (launch "chronyc reset" in the development version).
After that, on a "sane" computer, paranoid people like me could even drop the makestep parameter.
Re: [chrony-users] Resume from suspend and default makestep configuration
On 19/05/2020 at 16:11, Pali Rohár wrote:
> On Tuesday 19 May 2020 13:40:18 FUSTE Emmanuel wrote:
>> On 19/05/2020 at 15:11, Pali Rohár wrote:
>>
>>> In past I lot of time seen problem that Windows stored system time in
>>> local timezone to RTC, then computer was rebooted to Linux which reads
>>> system time from RTC in UTC and saw incorrect time. Installing NTP
>>> daemon fixed this problem. And then after reboot Windows time was
>>> shifted and after few seconds/minutes it synchronized it again against
>>> Windows time server.
>> A better workaround: just instruct linux that RTC is in localtimezone and
>> not UTC and it would have worked.
> I remember that this setup did not work in one case: when linux system
> was booted prior booting windows system after DST change. Time was
> shifted two times, once by linux, once by windows as windows did not
> know that it should do it...
> Emmanuel.
Yes, the problem of shadow states. Only one system can control the RTC, whether the systems run simultaneously (VM) or one after the other (multiboot).
Your hibernation image is a shadow state too.

Emmanuel.
Re: [chrony-users] NTS: Limiting
On 20/01/2021 at 10:03, Karol Babioch wrote:
> Hi,
>
> On 19.01.21 at 19:02, Kurt Roeckx wrote:
>> In your config file
>> you need to say something like "server ntp.example.org nts". This
>> means you will only accept certificates that have ntp.example.org
>> in the certificate. If you only trust Let's encrypt, you will only
>> trust certificates issued by Let's encrypt for ntp.example.org.
> Yes, that is correct when you specify servers explicitly.
>
>> I have no idea what kind of attack surface you have in mind.
> I'm wondering how this behaves in case of pools, i.e. when I run a
> private pool of NTP servers, i.e. "pool.example.com".
>
> When I have something like this in my chrony.conf:
>
>> pool pool.example.com iburst maxsources 3
> Is NTS even possible in such a context? AFAIK only A records with IP
> addresses are resolved, so I'm not sure if and how certificates can be
> validated.
There is no NTS for pools for now. Some technical pieces are missing and need to be defined/specified.
There are some proposals for using SRV records for NTP/NTS, but any prediction is premature.
So the problem you are trying to solve does not exist yet: in an NTS context you always specify the server explicitly.

Emmanuel.