------- Forwarded Message
Date: Fri, 18 Mar 2022 06:02:51 +0530
From: Mukund Sivaraman <m...@mukund.org>
To: hmur...@megapathdsl.net
Subject: NTPsec panic and abort
Message-ID: <YjPTM/9y9kZPtB04@d1>

Hi Hal

I apologize for emailing you directly instead of creating an NTPsec
issue but I am currently not able to login into my GitLab account.

I am reporting an ntpd abort/crash. It is from a stock Fedora RPM:

> [muks@gw1 ~]$ rpm -q ntpsec
> ntpsec-1.2.1-4.fc35.x86_64
> [muks@gw1 ~]$

The computer has a Garmin 18x LVC GPS receiver device hooked up to a
serial port, and ntpd's builtin NMEA driver is used to interface with it
directly. It also has a working PPS signal. The relevant ntp.conf config
lines are:

> server 127.127.20.0 mode 1 prefer minpoll 4
> fudge 127.127.20.0 flag1 1 flag2 0 flag3 0 flag4 1 time2 0.5100621

This is how it looks normally (the device's datasheet claims 1us
accuracy):

> [muks@gw1 ~]$ ntpq -np
>  remote         refid      st t when poll reach    delay   offset   jitter
> ==========================================================================
> oNMEA(0)        .GPS.       0 l   15   16   377   0.0000   0.0002   0.0003
> [muks@gw1 ~]$ ntpq -c clocklist
> associd=0 status=0000 no events, clk_unspec,
> name="NMEA",
> timecode="$GPRMC,002016,A,____.____,_,_____.____,_,000.0,315.7,180322,001.5,W*__",
> poll=103, noreply=0, badformat=0, baddata=0, fudgetime2=510.062, stratum=0, refid=GPS,
> flags=9, device="NMEA GPS Clock"
> [muks@gw1 ~]$

I've been running it this way for several years now, previously with the
ntp.org implementation of ntpd, and for a few months now with NTPsec.

The Garmin 18x LVC GPS receiver device stopped working yesterday due to
hardware failure, and I replaced it with another identical unit with the
same firmware version. Within a few hours of running ntpd with the new
device, the ntpd process terminated with the following syslog message:

> Mar 18 05:10:10 gw1 ntpd[2200]: CLOCK: Panic: offset too big: -604800.000
> Mar 18 05:10:10 gw1 systemd[1]: ntpd.service: Main process exited, code=exited, status=1/FAILURE
> Mar 18 05:10:10 gw1 systemd[1]: ntpd.service: Failed with result 'exit-code'.

It appears that the GPS receiver sent a faulty date in the $GPRMC NMEA
sentence. The following are a sequence of lines from
/var/log/ntpstats/clockstats:

> 59655 85114.640 NMEA(0) $GPRMC,233834,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 85130.640 NMEA(0) $GPRMC,233850,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 85146.640 NMEA(0) $GPRMC,233906,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 85162.640 NMEA(0) $GPRMC,233922,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 85178.640 NMEA(0) $GPRMC,233938,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 85194.640 NMEA(0) $GPRMC,233954,A,____.____,_,_____.____,_,000.0,315.7,100322,001.5,W*__
> 59655 85982.616 NMEA(0) $GPRMC,235302,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 85998.616 NMEA(0) $GPRMC,235318,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 86014.616 NMEA(0) $GPRMC,235334,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__
> 59655 86030.616 NMEA(0) $GPRMC,235350,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__

Note the spurious "100322" date that is 1 week in the past. -604800 from
the syslog message is -1 week in seconds (7 * 24 * 3600).

Note the ",A," is returned in the status (<2>) field in the $GPRMC
sentence, by which the GPS receiver still claims it has the "fix" and a
valid position.
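
The one-week figure can be checked with a short Python sketch. This is
only an illustration, not how ntpd's NMEA driver parses sentences: the
helper name rmc_timestamp is hypothetical, and the "expected" sentence
is not from the log; it is the logged 233954 sentence with the date that
its neighbours carry (170322).

```python
from datetime import datetime, timezone


def rmc_timestamp(sentence: str) -> datetime:
    """Return the UTC time carried by a $GPRMC sentence.

    Field 1 is hhmmss and field 9 is ddmmyy. Python's %y maps the
    two-digit years 00-68 to 2000-2068, which is assumed to be
    acceptable for this sketch.
    """
    fields = sentence.split(",")
    hhmmss, ddmmyy = fields[1], fields[9]
    parsed = datetime.strptime(ddmmyy + hhmmss[:6], "%d%m%y%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


# Sentence as logged in clockstats, carrying the spurious date.
bogus = "$GPRMC,233954,A,____.____,_,_____.____,_,000.0,315.7,100322,001.5,W*__"
# Hypothetical: the same sentence with the date its neighbours report.
expected = "$GPRMC,233954,A,____.____,_,_____.____,_,000.0,315.7,170322,001.5,W*__"

offset = (rmc_timestamp(bogus) - rmc_timestamp(expected)).total_seconds()
print(offset)           # -604800.0
print(-7 * 24 * 3600)   # -604800
```

The computed difference matches the offset in the panic message, which
is consistent with the date field alone being wrong while the time field
stayed correct.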

If you want a reference for the $GPRMC NMEA sentence for this receiver,
please see page 18 of:

https://static.garmin.com/pumac/GPS_18x_Tech_Specs.pdf

It appears that some bogus condition has occurred within the GPS
receiver and it has sent a spurious $GPRMC sentence. However, it seems
too extreme for ntpd to abort due to this. Could it ignore the sentence
with the big offset instead? The GPS receiver appears to correct itself
eventually. If ntpd aborts, the running NTP service is no longer present
causing other problems.

I have never come across such an ntpd abort before. This is the first
time I'm seeing it. Can this condition be handled in any other way, so
that the service doesn't terminate?

		Mukund

------- End of Forwarded Message

--
These are my opinions.  I hate spam.

_______________________________________________
devel mailing list
devel@ntpsec.org
https://lists.ntpsec.org/mailman/listinfo/devel