On Tue, Sep 01, 2020 at 07:58:44PM +0200, Otto Moerbeek wrote:
> On Tue, Sep 01, 2020 at 06:33:34PM +0200, Christian Weisgerber wrote:
>
> > Even after otto@'s commit in -current...
> >
> > If no replies are received for a while due to connectivity issues
> > go into unsynced mode. The existing code to check if we're unsycned
> > is only done on receiving an ntp packet which does not happen if
> > there are connectivity issues.
> >
> > ... ntpd can still fail to recognize that a server is unreachable.
> > 36 hours after packets to the server have been blocked, ntpd still
> > pretends that nothing is wrong:
> >
> > -----------------------------
> > 1/1 peers valid, clock synced, stratum 2
> >
> > peer
> > wt tl st next poll offset delay jitter
> > fddd:28ee:243:2:213:95ff:fe06:1bb7 ntp
> > * 1 10 1 2201s 3296s -0.068ms 1.346ms 0.438ms
> > -----------------------------
> >
> >
> > ===> What works
> >
> > If I use pf(4) to block outgoing packets on the client (block return
> > out proto udp to port ntp), ntpd quickly invalidates the peer.
> >
> > ktrace of the ntp engine process shows this pattern:
> >
> > 23799 ntpd CALL sendto(6,0x278c98a1784,0x30,0,0,0)
> > 23799 ntpd RET sendto -1 errno 65 No route to host
> > ...
> > 23799 ntpd CALL recvmsg(6,0x7f7ffffddd60,0)
> > 23799 ntpd RET recvmsg -1 errno 61 Connection refused
> >
> >
> > ===> What doesn't work
> >
> > If I block incoming NTP packets on the gateway to the server (block
> > return on $if proto udp to port ntp), ntpd does not notice that the
> > peer is unavailable. tcpdump shows the expected ICMP error responses
> > to the NTP queries.
> >
> > ktrace of the ntp engine process shows this pattern:
> >
> > 23799 ntpd CALL sendto(6,0x278c98a1784,0x30,0,0,0)
> > 23799 ntpd RET sendto 48/0x30
> > ...
> > 23799 ntpd CALL recvmsg(6,0x7f7ffffddd60,0)
> > 23799 ntpd RET recvmsg -1 errno 61 Connection refused
> >
> >
> > Tentative conclusion: ntpd notices when it can't send queries to
> > the server, but not when it fails to receive replies.
> >
> > --
> > Christian "naddy" Weisgerber [email protected]
> >
>
> Yeah, this was my next step. client_dispatch() is a bit dumb in the
> sense that it always returns 0, if it did something useful or not. Ot
> should distinguis between packet processes ok, packet rejected for
> some reason and error.
>
> Note that a complete absense of a reply (with no icmp) does work and
> sets the state to unsynced.
>
> -Otto
>
Currently testing this.
-Otto
Index: client.c
===================================================================
RCS file: /cvs/src/usr.sbin/ntpd/client.c,v
retrieving revision 1.113
diff -u -p -r1.113 client.c
--- client.c 30 Jan 2020 15:55:41 -0000 1.113
+++ client.c 1 Sep 2020 19:15:22 -0000
@@ -456,7 +456,7 @@ client_dispatch(struct ntp_peer *p, u_in
if (++p->shift >= OFFSET_ARRAY_SIZE)
p->shift = 0;
- return (0);
+ return (1);
}
int
Index: ntp.c
===================================================================
RCS file: /cvs/src/usr.sbin/ntpd/ntp.c,v
retrieving revision 1.166
diff -u -p -r1.166 ntp.c
--- ntp.c 30 Aug 2020 16:21:29 -0000 1.166
+++ ntp.c 1 Sep 2020 19:15:22 -0000
@@ -403,12 +403,9 @@ ntp_main(struct ntpd_conf *nconf, struct
for (; nfds > 0 && j < idx_clients; j++) {
if (pfd[j].revents & (POLLIN|POLLERR)) {
nfds--;
- last_action = now;
if (client_dispatch(idx2peer[j - idx_peers],
- conf->settime, conf->automatic) == -1) {
- log_warn("pipe write error (settime)");
- ntp_quit = 1;
- }
+ conf->settime, conf->automatic) == 1)
+ last_action = now;
}
}