Re: [chrony-users] makestep in Chrony

2018-05-10 Thread Ariel Garcia
Hello Hei,

I use the call
chronyc waitsync 1 1 500
to find out whether chrony has already achieved synchronization.
Attached is a trivial shell script which you can use as a boolean command:
chrony_is_synced && do_something || fail_somehow_if_not_synced
or you can call it
chrony_is_synced -v
to get the result on stdout.

The first parameter to waitsync (1) tells chronyc to check only once, i.e.,
the call will be (roughly) non-blocking.
You can change that to "0" or a big number to get a blocking command that
waits (long/forever) for sync (my "chrony_wait_sync" shell command is also
attached).
On purpose I use large max-correction and max-skew values (1 second and
500 ppm) to allow for a fast(er), rough initial synchronization.
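
The attached scripts are not reproduced in this archive. As a rough sketch
only (this is not the attached chrony_is_synced itself), a boolean wrapper
around the waitsync call could look like the following. The function body and
the "synced" / "not synced" strings are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for the attached chrony_is_synced script
# (the real attachment is not shown in the archive).

chrony_is_synced() {
    # waitsync arguments: max-tries=1 (check once), max-correction=1 s,
    # max-skew=500 ppm. chronyc exits with 0 only if the clock is
    # synchronized within those limits.
    if chronyc waitsync 1 1 500 >/dev/null 2>&1; then
        [ "${1:-}" = "-v" ] && echo "synced"
        return 0
    fi
    [ "${1:-}" = "-v" ] && echo "not synced"
    return 1
}
```

One caveat when using it as "a && b || c": the last command also runs when b
itself fails, so an explicit if/else is safer if do_something can fail.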


BTW, chronyc burst will _NOT_ give you any synchronization guarantee; it will
only force a given number of measurements from each server.

Hope it helps, Ariel



On Thursday, 10 May 2018 12:16:29 CEST Hei Chan wrote:
>  Thanks Miroslav and Bill!
> One last related question -- how can I be able to tell the sync/calibration
> is done after I manually ask chrony to synch/calibrate? I saw one of the
> posts 4 years ago suggesting that there is no way? Which command is better
> to force chrony to synchronize time right now -- chronyc burst or chronyc
> waitsync?
> 
> Or even better -- is there a way that I can call through a C API and get a
> callback or get blocked after it is done? Thanks!


chrony_wait_sync
Description: application/shellscript


chrony_is_synced
Description: application/shellscript


Re: [chrony-users] Tracking lost but server selected

2018-03-26 Thread Ariel Garcia
> > > It would probably help if we could see debug output from the time when
> > > the sources cannot be selected.
> > 
> > Ok, got it :)
> > 
> > There seems to be a 1 to 1 correlation with the "Fallback drift 13"
> > activation, which is the value i selected in the config file:
> > fallbackdrift 13 19
> 
> Ok, I think chronyd switching to a fallback drift when no update of
> the clock was made in 2^13 seconds explains the unsynchronized status
> (0.0.0.0 entries in the tracking log). That's actually how it was
> designed to work. I think it could be improved to keep the current
> reference and adjust the dispersion rate to allow it to operate as an
> NTP server and make the tracking report useful.

Ok. I'm misinterpreting the fallbackdrift documentation then (or it is perhaps 
a bit misleading), since it says: "They (drifts) are used when the clock is no 
longer synchronised ...".  From what i understand 
   "synchronized" == "a reference is set"
so the fallback drift causes the reference to be lost, and not the other way 
around.

I would like to make sure that a reference source stays selected even if the
network is offline, at least until chrony estimates that the clock may have
drifted too far (e.g., by more than 1s): I poll chrony with
"chronyc waitsync 1 1 500"
every 5 minutes to make sure it is still synced, and trigger a network
connection if not.
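
A minimal sketch of that periodic check, meant to be run from cron or a timer
every 5 minutes. Note that bring_up_network is a hypothetical placeholder for
whatever starts the 3G/ppp link on the device; it is not a chrony command.

```shell
#!/bin/sh
# Sketch of the 5-minute check. bring_up_network is a hypothetical
# placeholder, to be replaced by the device's own connect command.

check_sync_or_connect() {
    # Single non-blocking check: within 1 s correction and 500 ppm skew.
    if chronyc waitsync 1 1 500 >/dev/null 2>&1; then
        return 0            # still synchronized, nothing to do
    fi
    bring_up_network        # not synchronized: request a connection
}
```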

Should I simply remove the fallbackdrift setting in that case?

> The "Can't synchronise: no selectable sources" messages seem to
> correspond to chronyc offline commands. Maybe you have a networking
> script running automatically when an interface is brought up/down?

yes, exactly, although I tried to ensure the 3G modem stays connected all the
time. I probably can't recover a log of that anymore, but I could retest
if that is of interest to you.

Thanks a lot for your support!


-- 
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org 
with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org 
with "help" in the subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-users] Tracking lost but server selected

2018-03-20 Thread Ariel Garcia
> maxdistance is only related to the source selection. The maxdelay*
> options are related only to the individual measurements. If we ignore
> the skew and other sources of dispersion, a source which doesn't have
> a measurement with better delay than 6 seconds would always fail the
> maxdistance check. (distance = delay / 2 + dispersion)

great, thanks for the explanations!
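
As a worked example of the formula quoted above (my own arithmetic, assuming
chrony's default maxdistance of 3 seconds): a source whose best measurement
has a 6 s delay already reaches a distance of 3 s before any dispersion is
added, while an example delay of 0.8 s with 0.1 s dispersion stays far below
it. The function name and the sample values are only illustrative.

```shell
#!/bin/sh
# distance = delay / 2 + dispersion (both arguments in seconds)
root_distance() {
    awk -v delay="$1" -v disp="$2" 'BEGIN { printf "%.3f\n", delay / 2 + disp }'
}

root_distance 6.0 0.0   # delay alone already gives 3.000
root_distance 0.8 0.1   # prints 0.500
```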

> Do you set any of the sources to offline (with chronyc)?

no, didn't

> Offline sources are treated as unreachable and have some additional
> requirements for selection.
> 
> It would probably help if we could see debug output from the time when
> the sources cannot be selected.

I've already recompiled with --enable-debug and restarted the daemon, but it
hasn't happened again so far.  I will post again as soon as I get the data,
perhaps in 1-2 days.

Thanks!




Re: [chrony-users] Tracking lost but server selected

2018-03-19 Thread Ariel Garcia
Hello Miroslav,

thanks a lot for your enlightening answer :)
In the meantime I've noticed that the state "normalizes" itself after
several hours (I've seen it remain without a reference for anything between
1 and 18 hours), but the problem eventually happens again.

The device is connected via a mobile network, with highly variable and poor
RTTs to the NTP servers (I've seen anything between 0.1 and 2+ seconds; the
mean is around 0.8s). Peer delay (chronyc ntpdata) is usually ~0.8s as well.

> Tracking reporting no reference probably means that the selection
> failed at some point (e.g. no majority reached) and the newly selected
> source (if there is one) didn't have an update yet.
> 
> Is there a message in the system log corresponding to the selection of
> 0.0.0.0?

yes, right, it always prints
Can't synchronise: no selectable sources
whenever it turns to "Ref time 1970" state

> Was there a network change which could cause the delay to be increased?

well, not from the device, but as said above the mobile network is quite
variable; delays could be better and are certainly not regular.

> The sources seem to be failing in the test C, which suggests the
> network delay is larger than it used to be before. chronyd is waiting
> for the delay to get back to normal, but if it is a permanent change
> (e.g. network routing has changed which added more delay), it may take
> a while before the sample with the old minimum delay is flushed, or a
> different source with better statistics is selected.

great! that is very helpful information.

Is there any parameter I could adjust to make it less "sensitive" to the
(bad) network behaviour, like increasing maxdelay? (currently I have the
default value, 3s)
As said above, I've seen RTTs of 2 seconds, but I only look sporadically, so
they may get even worse...

Would having fewer sources (4 intended) help in reaching a majority?

> If you look further back in the measurements log, do you see a point
> when all sources suddenly started to consistently fail the test C?

it is rather the other way around: most of the time i see test C failing:
111 111 1101
and every once in a while you find a server where all bits are set (logs 
attached in case helpful, i can't really see a pattern)

So I guess that when the rate of "all good" points falls even further below
its normal value, I lose the reference?
Do any of the parameters printed in the logs (root delay, dispersion, etc.)
help to predict when the synchronization is going to be lost?

Thanks a lot for your help and for the great tool :)
Ariel


logs.tgz
Description: application/compressed-tar


Re: [chrony-users] Minimal online duration

2018-02-09 Thread Ariel Garcia
> > Assuming the servers are reachable/fine, would a short connection time of
> > 2-3 seconds twice per day be enough to keep synchronized, or should i
> > rather enforce some minimum "on" time? (no big precision required, just
> > within 1s)
>
> That should be ok. 1 second per 12 hours is a frequency error of 23
> ppm. Unless the clock on the board is very unstable, its error should
> stay well below that.

I'm not (so) worried about the precision, but rather about whether chrony can
make good use of the single query to each server that it will manage to send
within that 2-3 second window.
There seems to be an implicit "yes" in your answer :)
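
For reference, the 23 ppm figure quoted above is simply the time error divided
by the interval. A small sketch of that arithmetic (the function name is made
up for this example):

```shell
#!/bin/sh
# Frequency error (in ppm) that accumulates a given time error (seconds)
# over a given interval (seconds).
ppm_for_error() {
    awk -v err="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", err / secs * 1000000 }'
}

ppm_for_error 1 43200   # 1 s over 12 h (43200 s): prints 23.1
```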

> > server SERVER1/2/3 auto_offline offline iburst presend 10 minpoll 4
> > initstepslew 3600 SERVER1/2/3
> 
> If you don't need chronyd to block the boot sequence until the clock
> is stepped I'd suggest to drop the initstepslew directive

ok thanks for the hint!

BR, Ariel





[chrony-users] Minimal online duration

2018-02-09 Thread Ariel Garcia
Hello,

my use case for chrony is a device which connects for a few seconds
every 12 hours to send data. The initial synchronization on first boot will
of course take longer, that's ok.
The ppp daemon will set the chrony servers online/offline when the link goes up/down.

Assuming the servers are reachable/fine, would a short connection time of 
2-3 seconds twice per day be enough to keep synchronized, or should i rather 
enforce some minimum "on" time? (no big precision required, just within 1s)

Can anybody with a similar use case share their experience?

My relevant config is:
---
server SERVER1/2/3 auto_offline offline iburst presend 10 minpoll 4
initstepslew 3600 SERVER1/2/3
makestep 3600 2
fallbackdrift 13 19
driftfile /var/lib/chrony/sysclock.drift
dumpdir /var/lib/chrony/history
dumponexit
---

Thanks in advance!
Ariel




Re: [chrony-users] How to avoid oversteering

2018-02-06 Thread Ariel Garcia
> There is no config.gz file in the /proc directory.

well, it depends on how you (or your distribution) compiled the kernel...

The options are:

CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y

under "General setup"
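
If /proc/config.gz is missing, many distributions install the build
configuration under /boot instead. A small sketch that looks in both places;
the /boot/config-$(uname -r) location is a common convention, not something
guaranteed on every system, and the two optional path arguments exist only to
make the function easy to try out.

```shell
#!/bin/sh
# Print the tickless/HZ settings of the running kernel, trying
# /proc/config.gz first and a distribution-style /boot file second.
show_hz_config() {
    proc_cfg="${1:-/proc/config.gz}"
    boot_cfg="${2:-/boot/config-$(uname -r)}"
    if [ -r "$proc_cfg" ]; then
        zcat "$proc_cfg" | grep 'HZ'
    elif [ -r "$boot_cfg" ]; then
        grep 'HZ' "$boot_cfg"
    else
        echo "no kernel config found" >&2
        return 1
    fi
}
```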






Re: [chrony-users] How to avoid oversteering

2018-02-06 Thread Ariel Garcia
> > > What kernel version is the board running? There was a bug which caused
> > > issues like that and it was fixed few years ago. IIRC it was specific
> > > to 32-bit platforms.
...
> No, I think it's this one:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2619d7e9c92d524cb155ec89fd72875321512e5b

perfect, that was the problem!! Thanks a lot!

I've also tested fully disabling the "tickless" functionality (forcing the
kernel to emit ticks both when busy and when idle), but that did not help.
Put the other way around: having a "tickless" kernel does not seem to be an
obstacle to Chrony :)

The behaviour of a kernel without the abs64 fix, on the other hand, is
completely different, with
Residual freq
and Skew
jumping to ridiculously high values once the correct NTP time is reached...

Thanks again for your support!




Re: [chrony-users] How to avoid oversteering

2018-02-06 Thread Ariel Garcia
> > i am trying to get Chrony 3.2 running without stepping the clock, but i
> > find that it oversteers by a huge amount.  I observe that it starts
> > driving the system clock to the right value, but then moves it more than
> > the initial offset in the opposite direction, and starts "oscillating".
> 
> This looks like a broken system clock. chronyd is telling the kernel
> to slow down or speed up the clock, but it doesn't happen.
> 
> What kernel version is the board running? There was a bug which caused
> issues like that and it was fixed few years ago. IIRC it was specific
> to 32-bit platforms.

Thanks for your feedback!
You probably mean this one?
https://bugzilla.redhat.com/show_bug.cgi?id=1188074

I'm running 4.2.8 32-bit (embedded device...)
I will recompile chrony, adding some debugging for the frequency it tries to
set. My impression is also that the clock always stays at plus or minus the
max frequency drift once chrony starts to touch it.

On the other hand, i already tried Bill Unruh's hint:

> Alternatively it could be a severe hardware issue, or you could be running
> the system on a tickless clock.
 
i recompiled the kernel with ticks (and without power governors/frequency 
drivers, just performance) but it did not help.

Thanks!





Re: [chrony-users] How to avoid oversteering

2018-02-05 Thread Ariel Garcia
> Alternatively it could be a severe hardware issue, or you could be running
> the system on a tickless clock.

# zcat /proc/config.gz  | grep HZ
CONFIG_NO_HZ_COMMON=y
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ=y
CONFIG_HZ_FIXED=0
CONFIG_HZ_100=y
CONFIG_HZ=100

i guess the CONFIG_NO_HZ=y  setting is what you mean, right?





Re: [chrony-users] How to avoid oversteering

2018-02-05 Thread Ariel Garcia
> chronyd should not be doing that. It has a very efficient way of determining
> what the correct rate should be, and uses an extra drift to drive the
> offset to zero.
> 
> Is this on real hardware or a virtual machine? Chrony should not be run on
> virtual machines, because the timing on those is a mess. Their internal
> clocks are not regular.

thanks for your fast answer!
It is definitely NOT a VM, but an ARM board (similar to the BeagleBone Black)

> Alternatively it could be a severe hardware issue

the system clock runs fine if left alone!! It drifts a few seconds per day, but 
that's it.
Chrony changes the clock frequency enormously when it starts "correcting" 
it... I can see that comparing system-clock and RTC times both with and 
without chrony.

> or you could be running the system on a tickless clock.

That is a good hint! that is a kernel config option right?




Re: [chrony-users] gpsd:ERROR... Permission denied

2018-02-05 Thread Ariel Garcia
> During reboot, I see [OK] Started NTP client/server.
> Is it okay for the NTP client/server to be running next to chronyd?

I guess that is the message printed when starting chrony.
Are you using systemd?

Check chrony's systemd status with
# systemctl status chrony
mine shows:
chrony.service - NTP client/server

or just check the [Unit] Description field in
/lib/systemd/system/chrony.service
or 
/etc/systemd/system/chrony.service


If you really have NTPd running alongside chrony, then you have two programs
trying to "correct" the same clock... which will make both misjudge the
clock drift. In that case you must definitely choose one and turn off the other.
Since you are on the chrony mailing list, you must of course turn NTPd
off ;-)
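
A quick way to see whether both daemons are really running at the same time
is to look for their process names. This is only a sketch: it assumes the
binaries are called chronyd and ntpd, which may differ on your distribution.

```shell
#!/bin/sh
# List which of the two daemons are currently running
# (process names chronyd and ntpd are assumed).
running_time_daemons() {
    for name in chronyd ntpd; do
        if pgrep -x "$name" >/dev/null 2>&1; then
            echo "$name"
        fi
    done
}
```

If running_time_daemons prints both names, one of them has to be stopped and
disabled.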

-- 
Dr. Ariel García
Gemfony scientific UG (haftungsbeschränkt)
Hauptstraße 2
D-76344 Eggenstein-Leopoldshafen
GERMANY

Phone:   +49 7247 934 2783
Fax: +49 7247 934 2781
E-Mail:  a.gar...@gemfony.eu
Web: http://www.gemfony.eu

Geschäftsführer:   Dr. Rüdiger Berlich
HandelsregisterNr: HRB 710566, Amtsgericht Mannheim, Ust-IdNr: DE274421406





[chrony-users] How to avoid oversteering

2018-02-05 Thread Ariel Garcia
Hello,

I am trying to get Chrony 3.2 running without stepping the clock, but I
find that it oversteers by a huge amount.  I observe that it starts driving
the system clock to the right value, but then moves it more than the
initial offset in the opposite direction, and starts "oscillating".

I rebooted to make sure the system clock runs at the "default" speed
initially, having previously set the RTC ~10s off from the correct time
for testing. After chrony started it reported:
 System time : 10.891304016 seconds slow of NTP time
After ~10 minutes running it was already:
 System time : 2.529696226 seconds slow of NTP time
but then after ~20 minutes it was already in the opposite direction:
 System time : 12.160170555 seconds fast of NTP time
After one hour running it is worse than at the beginning:
 System time : 16.009445190 seconds slow of NTP time
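
To follow such an oscillation over time, the "System time" line can be turned
into a signed number and logged. A small sketch; the function name and the
sign convention (negative = system clock slow, positive = fast) are choices
made for this example, not something chronyc defines.

```shell
#!/bin/sh
# Convert "System time : N seconds slow|fast of NTP time" lines read from
# stdin into signed offsets in seconds.
# Assumed sign convention: negative = slow (behind NTP time), positive = fast.
system_time_offset() {
    awk '$1 == "System" && $2 == "time" {
        sign = ($6 == "slow") ? -1 : 1
        printf "%+.9f\n", sign * $4
    }'
}
```

It could be used as "chronyc tracking | system_time_offset", for example from
a loop that appends a timestamp and the value to a file.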

What am I doing wrong? Do I have the wrong expectations, or could
it be a hardware or a configuration issue?
(the drift values shown above are consistent with both the device's RTC,
which is not being adjusted, and an external PC)

I've tried to collect all details below,
thanks in advance for any hints/help!!
Ariel



Running on a board similar to the BeagleBone Black, with an AM335X CPU.

My configuration was chosen as simple as possible to start:

pool de.pool.ntp.org   minpoll 4
makestep 3600 3
logchange 0.05
cmdallow 127.0.0.1
bindcmdaddress /run/chrony/chronyd.sock
pidfile /run/chrony/chronyd.pid
---

Of course I made sure there is no other NTP daemon running.
These are all the running processes (I only removed my open ssh connections):

    1 ?        Ss     0:10 /sbin/init
  978 ?        Ss     0:03 /usr/sbin/rngd -f
  985 ?        Ss     0:04 /lib/systemd/systemd-journald
 1773 ?        Ss     0:00 /lib/systemd/systemd-udevd
 1805 ttyO0    Ss+    0:00 /sbin/agetty --keep-baud 115200 38400 9600 ttyO0 vt220
 1812 ?        Ss     0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
 1813 ?        Ss     0:00 /usr/sbin/crond -b
 1827 ?        Ss     0:00 /lib/systemd/systemd-networkd
 1834 ?        Ss     0:00 /lib/systemd/systemd-resolved
15344 ?        S      0:00 /usr/sbin/chronyd -4 -f /etc/chrony/chrony.conf


The servers are reachable and seem ok:

$ chronyc -n sources
210 Number of sources = 4
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^- 172.104.239.150               2   4   377     3  -1158ms[-1158ms] +/-   58ms
^- 138.201.135.108               2   4   377     4  -1057ms[-1057ms] +/-   47ms
^* 195.34.187.132                2   4   377    14    +17ms[-1527ms] +/-   28ms
^- 5.9.78.71                     2   4   377    13   -142ms[ -142ms] +/-   50ms

$ chronyc -n sourcestats
210 Number of sources = 4
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
172.104.239.150             6   3    75    -103422   1668.695  -1033ms    14ms
138.201.135.108             6   3    75    -103305   1165.815  -1020ms    11ms
195.34.187.132              6   3    75    -103052   2391.831  -1021ms    15ms
5.9.78.71                   6   3    75    -102973   2085.425  -1017ms    14ms


The logs show
---
Feb 05 17:18:22 BBx systemd[1]: Starting NTP client/server...
Feb 05 17:18:22 BBx chronyd[15344]: chronyd version 3.2 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER -SECHASH -SIGND +ASYNCDNS +IPV6 -DEBUG)
Feb 05 17:18:22 BBx chronyd[15344]: Initial frequency 42.124 ppm
Feb 05 17:18:22 BBx systemd[1]: Started NTP client/server.
Feb 05 17:18:55 BBx chronyd[15344]: Selected source 129.70.132.35
Feb 05 17:18:56 BBx chronyd[15344]: System clock wrong by 11.087803 seconds, adjustment started
Feb 05 17:19:11 BBx chronyd[15344]: Can't synchronise: no majority
Feb 05 17:19:12 BBx chronyd[15344]: Selected source 129.70.132.35
Feb 05 17:19:12 BBx chronyd[15344]: System clock wrong by 1.096591 seconds, adjustment started
Feb 05 17:19:27 BBx chronyd[15344]: Selected source 138.201.135.108
Feb 05 17:19:27 BBx chronyd[15344]: System clock wrong by 0.943644 seconds, adjustment started
Feb 05 17:19:43 BBx chronyd[15344]: System clock wrong by 1.241887 seconds, adjustment started
...
(lots of similar lines, but the system clock is always reported to be wrong
  by +2 to -2 seconds, never by the amount shown by "chronyc tracking")
...
---
The tracking data at different times:
-
$ chronyc tracking
Reference ID: 81468423 (stratum2-2.NTP.TechFak.NET)
Stratum :