[chrony-users] Chrony plugin for collectd?

2014-05-18 Thread Holger Hoffstätte

Hello fellow timekeepers,

is anybody aware of a chrony plugin for collectd? They have one for ntpd [1] 
but I figured lightweight native monitoring of chrony would be neat. Searched 
the web but came up empty, so I thought I'd ask here first.

thanks
Holger

[1] https://collectd.org/wiki/index.php/Plugin:NTPd

-- 
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org 
with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org 
with "help" in the subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



[chrony-users] New with 1.30: stuck after resume from suspend

2014-07-03 Thread Holger Hoffstätte

Hi,

I did not notice this with the 1.30-prerelease, so obviously this bug report 
comes right after release. Sorry. :)

After updating to 1.30-final on my laptop (kernel 3.14.10, Gentoo userland) 
everything still seems to work fine. However after waking up from 
suspend-to-RAM over night, it seems chrony is "stuck": it's still running and 
accessible to requests (e.g. to list sources), but no longer performs any 
update polls. The "LastRx" counter shows a "10y" interval, and waiting ~maxpoll 
doesn't seem to help. 10 years does seem a bit long. ;)

Giving such a stuck chrony a nudge via "chronyc burst" gets things going again.

The previous 1.29.1 release never had this problem and properly recovered 
itself after wakeup, initially downscaling the poll interval and subsequently 
ramping up again.

Any ideas?

thanks
Holger




Re: [chrony-users] New with 1.30: stuck after resume from suspend

2014-07-03 Thread Holger Hoffstätte

On 07/03/14 14:06, Miroslav Lichvar wrote:
> This is probably related to the new detection of forward time jumps,
> which was mainly intended to handle system suspends. When chronyd

I read that in the release notes and figured as much. :-)

> When this happens, in the log you should see a "forward time jump was
> detected" message, immediately followed by "no reachable sources" and
> then in about 2 minimum polling intervals (assuming the suspend was
> longer than the polling interval used before suspend) a new source
> should be selected.

Indeed - from this morning's wakeup, after sleeping overnight:

Jul  3 09:18:39 hho chronyd[28177]: Forward time jump detected!
Jul  3 09:18:39 hho chronyd[28177]: Can't synchronise: no reachable sources
Jul  3 09:19:19 hho chronyd[28177]: Selected source 192.168.100.222

I don't remember how long I waited for it to restart polling before kicking it, 
but I'm sure it was >> 2*minpoll.

And just as I wanted to send this mail I figured I re-check on a second system 
(workstation, identically configured but newer/better HW), and what do you 
know? It woke up and started syncing right away:

Jul  3 14:33:05 ragnarok chronyd[2529]: Forward time jump detected!
Jul  3 14:33:05 ragnarok chronyd[2529]: Can't synchronise: no reachable sources
Jul  3 14:33:43 ragnarok chronyd[2529]: Selected source 192.168.100.222

..and it's running fine now. That system ran with a shorter polling interval 
since I suspended briefly after reboot, so it was still at minpoll.

> Interesting. I suspect the scheduled timeout got lost somehow.

Apparently not always. The laptop where this happened is pretty old though 
(Thinkpad T60 from ~2007) with a pretty dodgy clock, so maybe it's a 
race/timing condition somewhere.

I will verify the behaviour again when polling has reached a higher value.

> How is your chronyd configured? Do you set the online/offline status
> from chronyc?

No, laptop/workstation start chrony at boot and continuously sync to an 
always-on inhouse chronyd server ("ntp") for reference, with a very minimal 
config:

server ntp iburst minpoll 4 maxpoll 8
initstepslew 1 ntp

..plus various other unrelated settings like keyfile, driftfile etc. Nothing 
fancy.
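For reference, the minpoll/maxpoll values in that server line are base-2 exponents of seconds, as usual for NTP. A minimal Python sketch of the arithmetic (the helper name poll_seconds is made up for illustration and is not part of chrony):

```python
def poll_seconds(poll_exponent):
    """Convert an NTP poll exponent (log2 of seconds) to seconds."""
    return 2.0 ** poll_exponent

# Values taken from the "server ntp iburst minpoll 4 maxpoll 8" line.
minpoll, maxpoll = 4, 8

print(poll_seconds(minpoll))      # 16.0
print(poll_seconds(maxpoll))      # 256.0
print(2 * poll_seconds(minpoll))  # 32.0
```

So "about 2 minimum polling intervals" is roughly 32 seconds with this config, which is in the same ballpark as the 38 to 40 seconds between the log lines quoted above.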

Other than this it's working fine, so no drama. For now I can add a post-wakeup 
script to kick it into gear with a few bursts. If you have any ideas I can 
gladly try patches/build from git if that would help.

thanks!
Holger




Re: [chrony-users] New with 1.30: stuck after resume from suspend

2014-07-03 Thread Holger Hoffstätte
On 07/03/14 16:33, Miroslav Lichvar wrote:
> So it seems it selected the source pretty quickly after the jump was 
> detected. So the problem is that it took too long to detect it? Do
> you know at what time you issued the sources command?

I have a "chronytop" "realtime monitor" (using the term loosely here ;), with 
"watch" issuing various statistics calls every two seconds. I started it 
immediately after waking up and observed, mostly to see how quickly chrony 
would start polling again.

> There is a difference in how are the scheduled timers corrected when 
> chronyd reaches a timeout (select() returning 0) and when an
> external request (e.g. chronyc command) wakes it up.
> 
> When it doesn't timeout, all scheduled timers are moved by the 
> interval between last select() call and the current time. This means 
> if a chronyc command is issued before chronyd timeouts, the actual 
> time it takes to send first request after suspend could be up to 
> 2*maxpoll.

Oh! That might explain why I didn't observe it sooner and kicked it.

Anyway, I'll just give it a few tries again over the next few days without 
post-wakeup nudge, since the expected behaviour should eventually happen (and 
apparently eventually does). So that's good.

> Running chronyc offline before suspend and chronyc online after
> resume should work too. That's what happens with the NetworkManager
> dispatcher script in Fedora.

That's a good idea! I might try that.
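A rough sketch of what such a suspend/resume hook could look like in Python. The "offline" and "online" chronyc commands are real; the phase names "pre" and "post" and all function names are assumptions for illustration, and depending on the chrony version chronyc may need to authenticate before these commands are accepted:

```python
import subprocess

# Hypothetical mapping from a hook phase to a chronyc command.
# "pre" = about to suspend, "post" = just resumed (assumed names).
PHASE_TO_COMMAND = {"pre": "offline", "post": "online"}


def chronyc_args(phase):
    """Return the chronyc argv for a hook phase, or None if unknown."""
    command = PHASE_TO_COMMAND.get(phase)
    if command is None:
        return None
    return ["chronyc", command]


def run_hook(phase, runner=subprocess.run):
    """Run chronyc for the given phase; return True if a command was run."""
    args = chronyc_args(phase)
    if args is None:
        return False
    # check=False: a failing chronyc should not abort suspend or resume.
    runner(args, check=False)
    return True
```

The runner argument only exists so the logic can be exercised without a running chronyd; a real hook would simply call run_hook() with whatever phase the suspend framework passes in.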

cheers
Holger




Re: [chrony-users] New with 1.30: stuck after resume from suspend

2014-07-04 Thread Holger Hoffstätte
On 07/03/14 16:53, Holger Hoffstätte wrote:
> Anyway, I'll just give it a few tries again over the next few days
> without post-wakeup nudge, since the expected behaviour should
> eventually happen (and apparently eventually does). So that's good.

So after a few more suspend/wakeup cycles without any interference & waiting 
long enough I can confirm that the poll restart does indeed work reliably - 
just with a longer delay than before.
All good then - thanks :)

-h




Re: [chrony-users] Chrony Accuracy

2014-08-12 Thread Holger Hoffstätte
On 08/12/14 10:13, Hosam Hittini wrote:
> I was wondering how accurate Chrony is, in terms of seconds

Very. The built-in stats say that my inhouse server with a Stratum-1 upstream 
source at my ISP is currently "0.000010289 seconds fast of NTP time". That's 
about 10µs.

If you are thinking in terms of "seconds", chrony is good enough for you. :)
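To pull that figure out of the stats automatically, something like the following Python sketch could work. It assumes the "System time" line of "chronyc tracking" has the form shown in the sample; the sample line itself is made up to match the roughly 10µs figure, and the function name is hypothetical:

```python
import re

# Matches the "System time" line of "chronyc tracking" output.
SYSTEM_TIME_RE = re.compile(
    r"System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time"
)


def system_time_offset_us(tracking_text):
    """Return the signed offset in microseconds (positive = fast), or None."""
    match = SYSTEM_TIME_RE.search(tracking_text)
    if match is None:
        return None
    seconds = float(match.group(1))
    sign = 1.0 if match.group(2) == "fast" else -1.0
    return sign * seconds * 1e6


sample = "System time     : 0.000010289 seconds fast of NTP time"
print(round(system_time_offset_us(sample), 3))  # 10.289
```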

-h





Re: [chrony-users] Chrony Accuracy

2014-08-12 Thread Holger Hoffstätte
On 08/12/14 10:54, Hosam Hittini wrote:
> Actually I need accuracy in the order of nano seconds
> Is it possible to achieve such accuracy if I have a dedicated LAN NTP server?

Uhm.. Nanoseconds (and what deviation?) might be a tall order, and you are 
certainly NOT going to get that from chrony, normal whitebox PC hardware or 
even 1Gb Ethernet on a cheap switch.

AFAIK the best you can do on a LAN right now is PTP (the Precision Time 
Protocol, used e.g. for industrial control systems or HFT) with NICs that have 
HW timestamping, but you will still need a reference clock.

Here is a great article that describes PTP and how it differs from NTP:
http://queue.acm.org/detail.cfm?id=2354406

The more or less official Linux package:
http://linuxptp.sourceforge.net/

For the HW clock your options mostly depend on your budget.. :)

-h





Re: [chrony-users] chrony-2.0-pre1 released

2015-01-28 Thread Holger Hoffstätte
On 01/27/15 15:12, Miroslav Lichvar wrote:
> The first prerelease for chrony-2.0 is now available.

Just some quick feedback since I haven't tested all new features yet (love the 
pool directive!) - several existing setups still work fine, no regressions 
noticed so far.

thanks!
Holger






Re: [chrony-users] chrony-2.0-pre1 released

2015-01-28 Thread Holger Hoffstätte
On 01/28/15 12:47, Miroslav Lichvar wrote:
> On Wed, Jan 28, 2015 at 10:38:14AM +0100, Holger Hoffstätte wrote:
>> On 01/27/15 15:12, Miroslav Lichvar wrote:
>>> The first prerelease for chrony-2.0 is now available.
>> 
>> Just some quick feedback since I haven't tested all new features
>> yet (love the pool directive!) - several existing setups still work
>> fine, no regressions noticed so far.
> 
> Great, thanks for the feedback. There were some major changes in the 
> NTP code, so I'm interested if it works as a client and also server 
> with other NTP implementations and older chrony versions.

Sure - in my case I have one box acting as client to my ISP's NTP (stratum-1) 
server, and server to my LAN. Both old and new inhouse chrony clients work. All 
on Linux (Gentoo, kernel 3.18.x).

cheers
Holger




[chrony-users] Delays before daemonizing

2015-12-01 Thread Holger Hoffstätte
Hi,

Peter Humphrey recently posted about his findings with an apparently
racy init script, see: https://bugs.gentoo.org/show_bug.cgi?id=566972

So far we've concluded that indeed the init script is inherently racy
due to accidental double-forking. I've since then investigated this a
bit more (see the bug) and wanted to ask for possible explanations why
chronyd's own daemonizing seems to take so surprisingly long before
returning.

I haven't looked at the source yet, but it seems to me that loading and
verifying the configuration should not take several seconds before
forking and returning. I don't have a ton of sources (2), they are all
inhouse, healthy, reachable and DNS resolvable etc. So..any ideas?

thanks
Holger




Re: [chrony-users] Delays before daemonizing

2015-12-01 Thread Holger Hoffstätte
On 12/01/15 12:45, Miroslav Lichvar wrote:
> Two things that by design delay the exit of the foreground process are
> the -s option and the initstepslew directive. As you pointed out in

..as I just found out concurrently. :)
Makes sense, of course.

> Do you have allow/deny with a hostname in your config? 

No. The difference is that I wasn't using rtcsync (like Peter) but the
rtcautotrim/rtcfile stuff. That initialization also seems to cause an
additional slowdown.

I've now removed initstepslew in favor of makestep - should accomplish
mostly the same thing, i.e. a basic sanity check in case the RTC went
bonkers - and use rtcsync instead of rtcfile. Now the init script starts
immediately without using external --background forking.
This also has the added benefit of properly announcing any configuration
or other startup errors to the init system.

All this is not terribly relevant *for me* since my HW clock is relatively
sane and I'm connected to a Stratum-1 source with ridiculous precision
and constant path latency. It was more to figure out a good path forward
for the packaged init script.

Thanks!
Holger





Re: [chrony-users] Delays before daemonizing

2015-12-01 Thread Holger Hoffstätte
On 12/01/15 13:31, Miroslav Lichvar wrote:
> On Tue, Dec 01, 2015 at 01:08:24PM +0100, Holger Hoffstätte wrote:
>> On 12/01/15 12:45, Miroslav Lichvar wrote:
>>> Do you have allow/deny with a hostname in your config? 
>>
>> No. The difference is that I wasn't using rtcsync (like Peter) but the
>> rtcautotrim/rtcfile stuff. That initialization also seems to cause an
>> additional slowdown.
> 
> Hm, was that with the -s option? rtcfile/rtcautotrim alone shouldn't

Yes, it was.

> add any delay. At least that's what I see when I try it here.
> If you don't need -s, it's probably better to use rtcsync instead of
> rtcfile+rtcautotrim in any case.

That's what I have now. Works just as fine - I don't remember why I
had the old driver enabled, probably because it came with an old
configuration template and has been working fine since then.

All good now :)

cheers
Holger





Re: [chrony-users] Delays before daemonizing

2015-12-01 Thread Holger Hoffstätte
On 12/01/15 14:46, Miroslav Lichvar wrote:
> On Tue, Dec 01, 2015 at 01:39:15PM +0100, Holger Hoffstätte wrote:
>> That's what I have now. Works just as fine - I don't remember why I
>> had the old driver enabled, probably because it came with an old
>> configuration template and has been working fine since then.
>>
>> All good now :)
> 
> Great :). Now the question is how many users have initstepslew in
> their config. If it was in a default/recommended config at some point,
> it could be a lot of unhappy users when the background option is
> removed from the init script.

I'm reasonably certain that was my own doing, a loong time ago. :)

The current package installs examples/chrony.conf.example1 as initial
configuration [1], and that looks very stripped down and sensible
out of the box. So I think we're good. The worst that can happen is
that people have a slower, but correct startup for all the right
reasons, so that wouldn't even be too wrong.

thanks!
Holger

[1] 
https://gitweb.gentoo.org/repo/gentoo.git/tree/net-misc/chrony/chrony-2.2.ebuild#n102





Re: [chrony-users] Delays before daemonizing

2015-12-01 Thread Holger Hoffstätte

...and of course I get to answer my own question. :)

> I haven't looked at the source yet, but it seems to me that loading and
> verifying the configuration should not take several seconds before
> forking and returning. I don't have a ton of sources (2), they are all
> inhouse, healthy, reachable and DNS resolvable etc. So..any ideas?

So this turned out to be my use of initstepslew with two sources, which of
course must block before returning. Without it chronyd forks immediately
(unless the configuration has an error), just as expected.
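For anyone wanting to check their own config for this, here is a small Python sketch that lists the lines which, per this thread, delay the exit of the foreground process. Only initstepslew is checked; the -s option has a similar effect but lives on the command line, not in the config. The function name and the comment-character handling are my assumptions, not anything shipped with chrony:

```python
# Directive that makes chronyd wait before the foreground process exits.
DELAYING_DIRECTIVES = {"initstepslew"}

# Characters assumed to start a comment line in chrony.conf.
COMMENT_CHARS = "#!;%"


def delaying_directives(conf_text):
    """Return the config lines that use a start-up delaying directive."""
    found = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in COMMENT_CHARS:
            continue
        directive = stripped.split()[0].lower()
        if directive in DELAYING_DIRECTIVES:
            found.append(stripped)
    return found


sample_conf = """\
server ntp iburst minpoll 4 maxpoll 8
# initstepslew 1 other
initstepslew 1 ntp
"""
print(delaying_directives(sample_conf))  # ['initstepslew 1 ntp']
```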

Thanks! :)

-h





Re: [chrony-users] Asking chronyd not to combine time from multiple sources

2017-09-21 Thread Holger Hoffstätte
On 09/21/17 13:54, Chris Perl wrote:
> I would like a way to be able to ask chronyd not to combine time from
> multiple sources, but instead to just trim the local clock from the
> selected system peer.

Are you looking for the "combinelimit" directive?
If I understand you correctly, "combinelimit 0" does exactly what you are
asking for and works fine here.

cheers,
Holger




[chrony-users] Chrony vs. Linux RNG

2018-04-22 Thread Holger Hoffstätte


Hello!

I test stable/LTS kernels to help Greg KH and just updated to 4.16.4-rc1.
This contains a few patches that are supposed to help with CVEs around
randomness, and which cause an interesting catch-22 that affects chrony,
hence this mail.

The patches in question are in the stable queue and can be found under the
"random-*" prefix at:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.16

Not sure exactly which patch is "at fault" because I don't feel like bisecting
this mess and it's unlikely to be reverted anyway.

The initial symptom was that starting chronyd on boot seemed to "hang",
but eventually continued after ~30 secs or so, working fine as usual, so
I blamed Gremlins and continued.

Since the symptom reliably reproduced on two other machines I investigated
further and eventually found that it relates to access of the CRNG: as soon
as "random: crng init done" appeared in the kernel log, chrony would start
up without delay. Apparently accessing the CRNG now blocks in early phases
of the boot process, when not enough entropy has been collected - which is
typically the time when chrony is started as well. This can make e.g. a
headless server without concurrent background activity take a *really* long
time to boot: in one instance I measured a blocked boot process taking over
a minute instead of the usual 5 seconds. IMHO furiously pinging a booting
remote host is not really a solution, though it does seem to help. :)

Long story short, is there something chrony can do to avoid this?
Why does it need to access any random number generators in the first
place?

For now I just quick-fixed this issue for myself by starting chrony in the
background, allowing the system to boot and so creating more entropy faster -
but I realize of course the downside of adjusting time later etc.

Thanks!
Holger




Re: [chrony-users] Chrony vs. Linux RNG

2018-04-23 Thread Holger Hoffstätte

On 04/23/18 11:04, Miroslav Lichvar wrote:
> On Sun, Apr 22, 2018 at 07:15:12PM +0200, Holger Hoffstätte wrote:
>> I test stable/LTS kernels to help Greg KH and just updated to 4.16.4-rc1.
>> This contains a few patches that are supposed to help with CVEs around
>> randomness, and which cause an interesting catch-22 that affects chrony,
>> hence this mail.
>
> Thanks for the heads up.
>
> I tried booting a VM with 4.17-rc2, which should include the patches

Yeah, I could have mentioned that..

> you are referring to, but didn't see any delay problems.
>
> On what distro do you test it? Does it save and restore the random
> seed on boot (e.g. the systemd-random-seed)?


Gentoo using OpenRC, chronyd 3.3. It uses start-stop-daemon and it
was definitely chronyd hanging the boot sequence; for tests I disabled
chronyd from the default runlevel and all was back to smooth sailing.
Since s-s-d relies on chronyd going into the background, the temporary
fix was to add the --background flag to s-s-d so that OpenRC returns
immediately.

I just saw that it does indeed have a "urandom" service in the boot
runlevel, reading/writing from/to /var/lib/misc/random-seed.
But that happens way before chrony's default runlevel.

glibc is 2.26, so it should be using getrandom() and not use the
urandom fallbacks. Unfortunately it's really hard to trace/debug
this since the bug only manifests itself during the early stages, and
as soon as I do anything on a freshly booted system I create entropy,
initialising the crng and thus making everything work.


> I guess it could use a non-blocking read for the urandom device (or
> getrandom() syscall) and fall back to random(), but I'm not sure if it
> would be a good idea from the security point of view.


I found in util.c that it *should* be using getrandom() already?
Maybe the HAVE_GETRANDOM detection didn't work, but even then the
urandom fallback should not be blocking either. I'll double-check the
package script's autoconf log.

thanks,
Holger




Re: [chrony-users] Chrony vs. Linux RNG

2018-04-23 Thread Holger Hoffstätte

On 04/23/18 11:52, Holger Hoffstätte wrote:
>> I guess it could use a non-blocking read for the urandom device (or
>> getrandom() syscall) and fall back to random(), but I'm not sure if it
>> would be a good idea from the security point of view.
>
> I found in util.c that it *should* be using getrandom() already?
> Maybe the HAVE_GETRANDOM detection didn't work, but even then the
> urandom fallback should not be blocking either. I'll double-check the
> package script's autoconf log.


As I suspected..everything looking good:

$ebuild chrony-3.3.ebuild configure
 * chrony-3.3.tar.gz BLAKE2B SHA512 size ;-) ...  [ ok ]

Unpacking source...
Unpacking chrony-3.3.tar.gz to /tmp/portage/net-misc/chrony-3.3/work
Source unpacked in /tmp/portage/net-misc/chrony-3.3/work
Preparing source in /tmp/portage/net-misc/chrony-3.3/work/chrony-3.3 ...
Source prepared.
Configuring source in /tmp/portage/net-misc/chrony-3.3/work/chrony-3.3 ...

 * ./configure --enable-scfilter --disable-pps --without-editline 
--docdir=/usr/share/doc/chrony-3.3 --chronysockdir=/run/chrony 
--mandir=/usr/share/man --prefix=/usr --sysconfdir=/etc/chrony 
--disable-sechash --without-nss --without-tomcrypt
Configuring for  Linux-x86_64
Checking for x86_64-pc-linux-gnu-gcc : Yes
Checking for 64-bit time_t : Yes
NTP time mapped to 1968-05-05T09:56:04Z/2104-06-11T16:24:20Z
Checking for math : No
Checking for math in -lm : Yes
Checking for  : Yes
Checking for  : Yes
Checking for struct in_pktinfo : Yes
Checking for IPv6 support : Yes
Checking for struct in6_pktinfo : No
Checking for struct in6_pktinfo with _GNU_SOURCE : Yes
Checking for clock_gettime() : Yes
Checking for getaddrinfo() : Yes
Checking for pthread : Yes
Checking for arc4random_buf() : No
Checking for getrandom() : Yes
Checking for recvmmsg() : Yes
Checking for SW/HW timestamping : Yes
Checking for other timestamping options : Yes
Checking for libcap : Yes
Checking for seccomp : Yes
Checking for  : Yes
Checking for  : Yes
Checking for sched_setscheduler() : Yes
Checking for mlockall() : Yes
Checking for readline : Yes
Features : +CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER -SIGND +ASYNCDNS 
+READLINE -SECHASH +IPV6 -DEBUG
Creating Makefile
Creating doc/Makefile
Creating test/unit/Makefile

Source configured.


So it's probably indeed blocking in too-early getrandom() (thanks for
pointing that out!) and falling back to urandom with GRND_NONBLOCK could
work. Let me know if I can try any patches.

thanks,
Holger




Re: [chrony-users] Chrony vs. Linux RNG

2018-04-23 Thread Holger Hoffstätte

On 04/23/18 12:13, Miroslav Lichvar wrote:
> On Mon, Apr 23, 2018 at 11:52:00AM +0200, Holger Hoffstätte wrote:
>> Gentoo using OpenRC, chronyd 3.3. It uses start-stop-daemon and it
>> was definitely chronyd hanging the boot sequence; for tests I disabled
>> chronyd from the default runlevel and all was back to smooth sailing.
>> Since s-s-d relies on chronyd going into the background, the temporary
>> fix was to add the --background flag to s-s-d so that OpenRC returns
>> immediately.
>
> Ok, if it is blocking before the foreground process exits, that
> probably means it's not due to NTP, but something else is using random
> numbers, e.g. a timer is added to the scheduler.
>
>> I found in util.c that it *should* be using getrandom() already?
>
> It should and that's probably why it is blocking. If you disable
> HAVE_GETRANDOM, it should stop.



Indeed it does. I configured as usual but added a swift
"sed -i '/HAVE_GETRANDOM/d' config.h" post-configure, built,
removed the --background flag from s-s-d, rebooted and it
immediately starts just as before.

Now trying the other patch. :)

thanks!
Holger




Re: [chrony-users] Chrony vs. Linux RNG

2018-04-23 Thread Holger Hoffstätte

On 04/23/18 12:40, Miroslav Lichvar wrote:

> On Mon, Apr 23, 2018 at 12:05:55PM +0200, Holger Hoffstätte wrote:
>> So it's probably indeed blocking in too-early getrandom() (thanks for
>> pointing that out!) and falling back to urandom with GRND_NONBLOCK could
>> work. Let me know if I can try any patches.
>
> You can try the following patch. It should prevent getrandom() from
> blocking and allow fall back to /dev/urandom.
>
> --- a/util.c
> +++ b/util.c
> @@ -1224,7 +1224,7 @@ get_random_bytes_getrandom(char *buf, unsigned int len)
>        if (disabled)
>          break;
>  
> -      if (getrandom(rand_buf, sizeof (rand_buf), 0) != sizeof (rand_buf)) {
> +      if (getrandom(rand_buf, sizeof (rand_buf), GRND_NONBLOCK) != sizeof (rand_buf)) {
>          disabled = 1;
>          break;
>        }



Works as expected. \o/

Tested-by: Holger Hoffstätte 

Thanks!
Holger




Re: [chrony-users] Chrony vs. Linux RNG

2018-04-23 Thread Holger Hoffstätte

On 04/23/18 13:07, Miroslav Lichvar wrote:

> Great. Thanks. I'll think a bit about possible implications before
> pushing the change.


Maybe make "available" and "disabled" non-static so that they are
not just evaluated once? On subsequent calls the CRNG will eventually
be initialized, so at some point it will start working with the
expected randomness. Just an idea.
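To illustrate the idea (not chrony's actual C code), here is a Python analogue: try a non-blocking getrandom() on every call and only fall back to /dev/urandom for that one call, so the proper source gets picked up as soon as the CRNG is ready. The function name is made up; os.getrandom() and os.GRND_NONBLOCK are the Python wrappers for the Linux syscall and flag:

```python
import os


def get_random_bytes(length):
    """Return `length` random bytes without blocking on an uninitialised CRNG."""
    getrandom = getattr(os, "getrandom", None)  # Linux only, Python 3.6+
    if getrandom is not None:
        try:
            data = getrandom(length, os.GRND_NONBLOCK)
            if len(data) == length:
                return data
        except OSError:
            # Covers BlockingIOError (CRNG not initialised yet) and a
            # missing syscall. Nothing is latched, so the next call
            # simply tries getrandom() again.
            pass
    # Fallback for this call only: read from /dev/urandom directly.
    with open("/dev/urandom", "rb") as urandom:
        return urandom.read(length)


print(len(get_random_bytes(16)))  # 16
```

Whether the early, lower-quality fallback is acceptable from a security point of view is exactly the open question raised earlier in this thread; the sketch does not settle that.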

cheers
Holger




Re: [chrony-users] Does chrony support the DHCP option ntp-servers?

2020-03-30 Thread Holger Hoffstätte

On 3/31/20 1:18 AM, Jason W. Lewis wrote:

> I’m looking at using chrony on our network for the first time, and
> want it to accept the ntp-servers DHCP option.  Does it?  And if so,
> how?  I haven’t seen any documentation showing how to do this, so I
> suspect it’s not supported, but at the same time, nothing says it
> doesn’t either,  so I wanted to be sure.

That's something you would configure in your DHCP server, not chrony.
chrony just hands out the time; the DHCP server is the one pointing clients
at your chrony server(s).

-h




Re: [chrony-users] How to use Facebook's NTP-service correctly?

2020-04-08 Thread Holger Hoffstätte

Hi Lars-Daniel,

an important point for you since you're presumably in Germany and
I already went through the futile attempts to use these servers:
they're all outside Germany, with rather high latency and quite
terrible connectivity, depending on your ISP's routing and current
backbone load. Interestingly enough FB has some timeX.facebook.*de*
servers with rather decent (meaning: stable) latency, but those boxes
do not return any answers. So apart from the smearing issue, purely
by accuracy FB's time services are quite useless if you're in
Germany.

You're much better off using the public .de NTP pool or addressing
1..3 public servers explicitly. I've set my router to pull from three
stable Stratum 1 sources and get excellent, always-available results
on all my inhouse clients, e.g. this on my workstation:

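$chronycmd sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* bifrost.applied-asynchro>     2   6   377    25  -3937ns[-7799ns] +/- 5388us

$chronycmd sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
bifrost.applied-asynchro>   8   6   229     -0.002      0.164    -46ns  7193ns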
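In case the units and the Reach column look cryptic: offsets are printed with ns/us/ms/s suffixes, and Reach is an octal bitmask of the last eight polls (e.g. 377 means all eight got an answer). A small Python sketch with hypothetical helper names to decode both:

```python
UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(value):
    """Convert a value such as "-3937ns" or "5388us" to seconds."""
    # Check the two-letter suffixes first so that "s" does not match "ns".
    for unit in ("ns", "us", "ms", "s"):
        if value.endswith(unit):
            return float(value[: -len(unit)]) * UNIT_SECONDS[unit]
    raise ValueError("unknown unit in %r" % value)


def reach_successes(reach_octal):
    """Count the successful polls among the last eight from the Reach value."""
    return bin(int(reach_octal, 8)).count("1")


print(reach_successes("377"))  # 8
```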

hth,
Holger





Re: [chrony-users] How to use Facebook's NTP-service correctly?

2020-04-08 Thread Holger Hoffstätte

On 4/8/20 11:00 AM, Lars-Daniel Weber wrote:

> Hey Holger,
>
> Holger wrote:
>> an important point for you since you're presumably in Germany and
>
> Jawoll, under protection of CoronaSchVO NW.
>
>> I already went through the futile attempts to use these servers:
>> they're all outside Germany, with rather high latency and quite
>> terrible connectivity, depending on your ISP's routing and current
>> backbone load.
>
> According to their blog, they're doing geo-routing to the nearest server.
> Maybe it's for their own services only, not for the public one.


That's what I figured as well. Maybe they'll open the .de servers for
the public, but then there's still the smearing issue..


>> You're much better off using the public .de NTP pool or addressing
>> 1..3 public servers explicitly. I've set my router to pull from three
>> stable Stratum 1 sources and get excellent, always-available results
>> on all my inhouse clients, e.g. this on my workstation:
>>
>> $chronycmd sources
>> MS Name/IP address         Stratum Poll Reach LastRx Last sample
>> ===============================================================================
>> ^* bifrost.applied-asynchro>     2   6   377    25  -3937ns[-7799ns] +/- 5388us
>>
>> $chronycmd sourcestats
>> Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
>> ==============================================================================
>> bifrost.applied-asynchro>   8   6   229     -0.002      0.164    -46ns  7193ns
>
> Would you share your complete config as a good/best practice example?


There's nothing special in there (bifrost is my router, a plain FritzBox where
the upstreams are set):

--snip--
# Servers
server bifrost iburst minpoll 3 maxpoll 6

# Allow the system clock to be stepped in the first three updates
# if its offset is larger than 1 second.
makestep 1.0 3

# Limit update skew
maxupdateskew 10

# Leap second handling
leapsecmode slew

# Allow clients
allow 127.0.0.1

# Automatically sync the RTC
rtcsync

# Priority boost
sched_priority 1
lock_all

# Stats
dumponexit
dumpdir /var/lib/chrony
driftfile /var/lib/chrony/drift
--snip--

It's really just a cleaned up default (template) config..
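
For readers unfamiliar with the minpoll/maxpoll values in the server line:
they are base-2 exponents of the polling interval in seconds, so this config
polls the router every 8 to 64 seconds. A small Python sketch of that
arithmetic; the helper name is only for illustration and it handles just the
two options discussed here:

```python
def poll_range_seconds(server_line: str) -> tuple[float, float]:
    """Return (min, max) polling interval in seconds for a chrony 'server' line."""
    words = server_line.split()
    # chrony's documented defaults when the options are absent.
    options = {"minpoll": 6, "maxpoll": 10}
    for key, value in zip(words, words[1:]):
        if key in options:
            options[key] = int(value)
    # The interval is 2**poll seconds; negative values give sub-second polling.
    return 2.0 ** options["minpoll"], 2.0 ** options["maxpoll"]


if __name__ == "__main__":
    print(poll_range_seconds("server bifrost iburst minpoll 3 maxpoll 6"))
```

Without the two options the same line would fall back to the defaults, i.e.
64 to 1024 seconds.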

cheers
Holger




Re: [chrony-users] Monitoring chrony, Prometheus-friendly metrics

2020-04-10 Thread Holger Hoffstätte

On 4/10/20 12:08 AM, Watson Ladd wrote:

On Wed, Apr 8, 2020 at 3:23 PM Watson Ladd  wrote:


On Wed, Apr 8, 2020 at 5:58 AM Luca BRUNO  wrote:


Hi all,
I'm following up from this old thread from 2016 regarding monitoring
chrony [0], and from this more recent discussion in Prometheus land [1].

[0] 
https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-users/2016/02/msg3.html
[1] https://github.com/prometheus/node_exporter/issues/1666


I've got a python script lying around for exactly this. Let me get the
approvals sorted out to submit it/send it to you.


Here is the script. Same licensing terms as Chrony itself. I'll submit
a patch to put it in the contrib section shortly.

Note that we have a framework to turn a tool like this into part of
the scrape, so maybe a standalone monitor suits you a bit better.


And for the cherry on top here's a patch to make it compatible with
python3. Bytes vs. Strings and all that.

enjoy,
Holger
--- chrony_metrics.py~	2020-04-10 10:12:04.0 +0200
+++ chrony_metrics.py	2020-04-10 10:15:48.932272177 +0200
@@ -44,7 +44,7 @@ def get_cmdoutput(command):
     if return_code:
         raise RuntimeError('Call to "{}" returned error: \
         {}'.format(command, return_code))
-    return out
+    return out.decode("utf-8")
 
 
 def printPrometheusformat(metric, values):
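
For context on why the one-line change is needed: in Python 3 the subprocess
module returns the child's output as bytes unless told otherwise, so string
operations on it fail until it is decoded. A self-contained sketch of the same
idea; it runs the Python interpreter itself as the child process so that it
works without chrony installed, and the function body is an illustration, not
the actual code from chrony_metrics.py:

```python
import subprocess
import sys


def get_cmdoutput(command: list[str]) -> str:
    """Run a command and return its stdout as text (str), not bytes."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    if proc.returncode:
        raise RuntimeError('Call to "{}" returned error: {}'.format(
            command, proc.returncode))
    # On Python 3 'out' is bytes; decode it so str methods like split(",") work.
    return out.decode("utf-8")


if __name__ == "__main__":
    text = get_cmdoutput([sys.executable, "-c", "print('a,b,c')"])
    print(text.strip().split(","))
```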


Re: [chrony-users] Compiling chrony 4.0 with nts support on Ubuntu 18.04

2021-04-04 Thread Holger Hoffstätte

On 2021-04-04 22:44, Uwe Fechner wrote:

Dear all,

I am trying to configure chrony with nts support, but so far it doesn't work:

ufechner@TUD277255:~/00Software/chrony-4.0$ ./configure
Configuring for  Linux-x86_64



Checking for nettle : Yes
Checking for CMAC in nettle : No
Checking for gnutls : No

^^^

At least according to my Gentoo build NTS needs gnutls as well,
so give that a try.

-h




Re: [chrony-users] vanilla configure and make generates "DEVELOPMENT" version

2021-05-17 Thread Holger Hoffstätte

On 2021-05-17 09:15, Miroslav Lichvar wrote:

On Sat, May 15, 2021 at 09:11:03PM -0700, w...@comcast.net wrote:

After a "./configure" and "make" after fetching
https://download.tuxfamily.org/chrony/chrony-4.1.tar.gz, "chronyd -v"
reports:

  


chronyd (chrony) version DEVELOPMENT (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP
-SCFILTER -SIGND +ASYNCDNS -NTS -SECHASH +IPV6 -DEBUG)


After running "make", did you install it with "make install" or
copy manually to a directory like /usr/local/sbin?

Maybe you have two versions installed in different paths. Try
"which chronyd" command to see what is actually executed.



Seems to work fine:

$wget https://download.tuxfamily.org/chrony/chrony-4.1.tar.gz
..
$tar xf chrony-4.1.tar.gz
$cd chrony-4.1
$./configure && make -j8
...
$./chronyd -v
chronyd (chrony) version 4.1 (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER 
-SIGND +ASYNCDNS +NTS +SECHASH +IPV6 -DEBUG)
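
The feature list in that output is easy to check mechanically, e.g. to spot a
build without NTS like the one quoted above. A rough Python sketch that parses
the version line as it appears in this thread; the function name and the
regular expression are assumptions based only on these two sample outputs:

```python
import re


def parse_chronyd_version(line: str) -> tuple[str, dict[str, bool]]:
    """Split 'chronyd (chrony) version X (+A -B ...)' into version and features."""
    match = re.match(r"chronyd \(chrony\) version (\S+) \((.*)\)\s*$", line, re.S)
    if match is None:
        raise ValueError("unrecognised chronyd -v output")
    version, flags = match.groups()
    # '+NAME' means compiled in, '-NAME' means compiled out.
    features = {flag[1:]: flag[0] == "+" for flag in flags.split()}
    return version, features


if __name__ == "__main__":
    sample = ("chronyd (chrony) version 4.1 (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP "
              "-SCFILTER -SIGND +ASYNCDNS +NTS +SECHASH +IPV6 -DEBUG)")
    version, features = parse_chronyd_version(sample)
    print(version, "NTS" if features["NTS"] else "no NTS")
```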

-h




[chrony-users] Monitoring chrony, Prometheus-friendly metrics: redux

2022-10-26 Thread Holger Hoffstätte



Hello chrony-users -

Some of you may remember past threads about monitoring chrony (e.g. [1]),
preferably with Prometheus but without adding a hard dependency on
output formats or an http server within chrony itself.
Well..rejoice!

Ben Kochie has started an external exporter project which does not
scrape chrony output but instead uses the proper way of retrieving
any metrics via the protocol. It's fast, small(ish) and already has
replaced the previous script I used for the past few years.

The project can be found at: https://github.com/SuperQ/chrony_exporter
and there are prebuilt binaries, though building it is quite easy as well.
An unofficial Gentoo ebuild can be found at [2].

If there is a particular metric missing that you want to see exposed,
please file an issue, or even better just add it and send a PR.
Adding more metrics is relatively simple; just look at my own contributions
as inspiration. :)

Hopefully some of you find this useful.

cheers
Holger

[1] https://www.mail-archive.com/chrony-users@chrony.tuxfamily.org/msg02175.html
[2] 
https://github.com/hhoffstaette/portage/tree/master/app-metrics/chrony_exporter




Re: [chrony-users] Silent Failure -- Enhancement Request

2024-04-19 Thread Holger Hoffstätte



On 2024-04-19 16:40, Chris Knox wrote:

Bryah, thanks for the answer.  Yes, now that we have the scars, we're
monitoring chronyd's health carefully.  But my question goes a bit


Glad you're back up and running.

Just to make sure since the details/constraints of your operational
setup were not mentioned yet - I take it you have seen the
"Installation, configuration, and monitoring" section on
https://chrony-project.org/links.html ?

It contains many pointers to third-party monitoring & alerting
tools. In particular the chrony_exporter for Prometheus, in combination
with Alertmanager, is just plain great and flexible enough for
any conceivable operational process.

In fact based on this thread I filed an issue:
https://github.com/SuperQ/chrony_exporter/issues/75
earlier today and it already resulted in a PR:
https://github.com/SuperQ/chrony_exporter/pull/76

(..just in case anybody else using Prometheus is reading this :)

Fundamentally it's not clear what chrony can or should do when
upstream servers are not available, because it's a bottomless pit
of compounding rules, problems and workarounds, all of which are
very environment- and process-dependent.

So instead of relying on a human to read syslog it's IMHO probably
more reliable and stress-free to let a machine do the job of reading
the existing statistics, aggregating a metric that distinguishes a
warning from an error caused by wonky network delays, switch reboots
or data center movements, and then acts according to *your* specific
processes (email, SMS, reboot..)
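
To make the warning-versus-error idea concrete, here is a deliberately simple
Python sketch. It assumes some collector (for example chrony_exporter via
Prometheus) already provides the time since the last successful clock update;
the thresholds, names and severity levels are invented for illustration and
would need tuning to the local environment:

```python
from enum import Enum


class Severity(Enum):
    OK = "ok"
    WARNING = "warning"
    ERROR = "error"


def classify_sync_age(seconds_since_update: float,
                      warn_after: float = 15 * 60,
                      error_after: float = 4 * 3600) -> Severity:
    """Map the age of the last clock update to a severity level.

    Short gaps (network hiccups, a switch reboot) only warn; long gaps
    are treated as an error that should trigger the local process.
    """
    if seconds_since_update < 0:
        raise ValueError("age cannot be negative")
    if seconds_since_update >= error_after:
        return Severity.ERROR
    if seconds_since_update >= warn_after:
        return Severity.WARNING
    return Severity.OK


if __name__ == "__main__":
    for age in (30, 20 * 60, 6 * 3600):
        print(age, classify_sync_age(age).value)
```

In a real setup this kind of rule would more likely live in an Alertmanager
rule than in a script, but the logic is the same.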

cheers
Holger
