subject:"NTP Issues Today"

Re: NTP Issues Today

2012-11-21 Thread Robert E. Seastrom

Blake Dunlap iki...@gmail.com writes:

> That's what happens when you just follow vendor recommendations blindly. If
> you do follow that on vm's (which can actually be a good practice), make
> sure they pull from your own time infrastructure, and not just the world at
> large, and that those servers behave in a sane fashion with regard to time
> jumps.

Emphatically disagree on the pull from your own infrastructure
point. You probably don't have the budget even in a big company for
sufficient diversity of sources [*] for your NTP server and even if
you do the NTP servers will probably be run by the same
person/organization. Mills has called the latter practice out as bad
in the past.

As Leo pointed out, the key is having a large diverse set so that if a
couple of them go nuts they can be voted off the island.

If you have a requirement for super low jitter or holdover if you lose
network, you're looking at on-site devices with OCXO or Rb frequency
standards in them. That doesn't mean you shouldn't be talking to the
rest of the world too though. What if your on-site sources go nuts?
This happens periodically, say every 10 years or so, because of crappy
implementations and worst-current-practices. A re-read of
https://groups.google.com/forum/?fromgroups=#!search/mills$20ntp$20byzantine/comp.protocols.time.ntp/TryjqtAd1XM/R0zzzE13Tl8J
may prove instructive.

(reading list also includes http://www.amazon.com/dp/1439814635/ )

In my experience NTP beats out even DNS for blatantly wrong configs
in the wild that nevertheless seem to work well enough that dilettante
tech folks don't notice.

I might have replied to this thread yesterday but I was blissfully
unaware of any problems:

rs@bifrost [8] % ntpq -c peers | egrep -v '(===|remote)' | wc -l
11
rs@bifrost [9] %

-r

[*] particularly due to shortsighted US federal government choices on
LORAN, GOES, WWVB time format, etc

Re: NTP Issues Today

2012-11-21 Thread Ryan Malayter



On Nov 19, 2012, at 6:12 PM, Scott Weeks sur...@mauigateway.com wrote:

 wbai...@satelliteintelligencegroup.com
 
 Or you could just concede the fact that the navy is playing with time travel 
 again.
 --
 
 
 To finish this thread off for the archives...
 
 Apparently something was up with the navy stuff as a post on
 the outages shows.

Re: NTP Issues Today

2012-11-21 Thread Ryan Malayter



On Nov 19, 2012, at 6:12 PM, Scott Weeks sur...@mauigateway.com wrote:

 Lesson learned: Use more than one NTP source.
 

The lesson is: use MORE THAN TWO diverse NTP sources.

A man with two watches has no idea what time it actually is.

Re: NTP Issues Today

2012-11-21 Thread Neil Harris


On 21/11/12 12:34, Ryan Malayter wrote:
> On Nov 19, 2012, at 6:12 PM, Scott Weeks sur...@mauigateway.com wrote:
>> Lesson learned: Use more than one NTP source.
>
> The lesson is: use MORE THAN TWO diverse NTP sources.
>
> A man with two watches has no idea what time it actually is.

Per David Mills, from the discussion linked upthread, this should be 
FOUR OR MORE...


Every critical server should have at least four sources, no two from the
same organization and, as much as possible, reachable only via diverse,
nonintersecting paths.

Four, so that the remaining three can reach consensus even if one fails.

-- Neil

Re: NTP Issues Today

2012-11-21 Thread Sid Rao

Guys:

We were synchronized against multiple sources. Unfortunately the Navy NTP 
source contaminated multiple downstream sources. 

Unless you can trace all your sources, if these sources all have a root source 
you will break. 
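
One partial check for a shared upstream is the refid column of ntpq -p, which names each peer's own sync source. It only shows one hop, so deeper common ancestry stays invisible. A minimal sketch in Python, using the remote/refid pairs from the listing Jared Mauch posted in this thread and the two USNO addresses Leo Bicknell identified; the names `PEERS`, `USNO` and `exposed_peers` are made up for illustration:

```python
from collections import Counter

# (remote, refid) pairs copied from the ntpq -p listing in this thread.
PEERS = [
    ("129.250.35.250", "164.244.221.197"),
    ("129.250.35.251", "192.5.41.40"),
    ("206.57.44.17", "204.123.2.5"),
    ("4.53.160.75", "209.81.9.7"),
    ("64.73.32.135", "192.5.41.41"),
    ("50.116.38.157", "64.250.177.145"),
    ("208.87.221.228", "10.0.22.49"),
    ("206.212.242.132", "128.252.19.1"),
    ("38.229.71.1", "204.123.2.72"),
]

# tick.usno.navy.mil and tock.usno.navy.mil, per the peerstats post.
USNO = {"192.5.41.40", "192.5.41.41"}


def exposed_peers(peers, suspect_refids):
    """Return the remotes whose immediate upstream is in suspect_refids."""
    return [remote for remote, refid in peers if refid in suspect_refids]


exposed = exposed_peers(PEERS, USNO)
refid_counts = Counter(refid for _, refid in PEERS)

print(len(exposed), "of", len(PEERS), "peers sync directly to USNO")
print("largest group sharing one refid:", max(refid_counts.values()))
```

With two of nine peers exposed, the remaining seven still form a majority, which fits the "falseticker" outcome described for that host.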

Sid Rao | CTI Group | +1 (317) 262-4677

On Nov 21, 2012, at 8:01 AM, Neil Harris n...@tonal.clara.co.uk wrote:

 On 21/11/12 12:34, Ryan Malayter wrote:
 
 On Nov 19, 2012, at 6:12 PM, Scott Weeks sur...@mauigateway.com wrote:
 
 Lesson learned: Use more than one NTP source.
 The lesson is: use MORE THAN TWO diverse NTP sources.
 
 A man with two watches has no idea what time it actually is.
 
 Per David Mills, from the discussion linked upthread, this should be FOUR OR 
 MORE...
 
 Every critical server should have at least four sources, no two from the
 same organization and, as much as possible, reachable only via diverse,
 nonintersecting paths.
 
 Four, so that the remaining three can reach consensus even if one fails.
 
 -- Neil

RE: NTP Issues Today

2012-11-21 Thread Chuck Church

-----Original Message-----
From: Jimmy Hess [mailto:mysi...@gmail.com] 
Sent: Tuesday, November 20, 2012 7:50 PM
To: Van Wolfe
Cc: nanog@nanog.org
Subject: Re: NTP Issues Today

This _should_ have caused NTP to execute a panic shutdown,
instead of setting the clock back 30 million seconds.

--
-JH

Sounds like the client might have been running SNTP, which does little if
any sanity checking.  Windows used to use that, and was more than happy to
change the time by years if it received bad time.  Not sure if that is
still the case.

Chuck

Re: NTP Issues Today

2012-11-21 Thread Greg Ihnen

It sounds like the Navy and whoever else they partner with (NIST?) need
some egress filtering on their NTP servers to catch and prevent events like
this.

Re: NTP Issues Today

2012-11-21 Thread Jay Ashworth

----- Original Message -----
 From: Sid Rao s...@ctigroup.com

 We were synchronized against multiple sources. Unfortunately the Navy
 NTP source contaminated multiple downstream sources.
 
 Unless you can trace all your sources, if these sources all have a
 root source you will break.

... against multiple [Stratum 1] sources...

Baby, if you've ever wondered... whether it matters whether your sources
are strat 1 or not, now you know -- since there's no real way to get 
provenance on down-strat time sources that I'm aware of.

Does the NTP code, people who know, give any extra credence to strat-1
sources in its byzantine code?

Cheers,
-- jra
-- 
Jay R. Ashworth  Baylink   j...@baylink.com
Designer The Things I Think   RFC 2100
Ashworth  Associates http://baylink.pitas.com 2000 Land Rover DII
St Petersburg FL USA   #natog  +1 727 647 1274

Re: NTP Issues Today

2012-11-21 Thread Majdi S. Abbas

On Wed, Nov 21, 2012 at 10:41:01AM -0500, Jay Ashworth wrote:
 ... against multiple [Stratum 1] sources...
 
 Baby, if you've ever wondered... whether it matters whether your sources
 are strat 1 or not, now you know -- since there's no real way to get 
 provenance on down-strat time sources that I'm aware of.
 
 Does the NTP code, people who know, give any extra credence to strat-1
 sources in its byzantine code?

Not in a way that matters if one of them suddenly becomes a 
falseticker.  If a reference clock goes insane, it's pretty easily 
detected provided you have at least two more servers (or even
peers) configured.

Stratum 1 just means it thinks it has a reference clock
attached, but those clocks fail, go into holdover, what have you
all the time.

NTP will happily select a stratum 2 or lower clock instead
provided it appears stable (low jitter, responded to our last 8
queries, i.e. reach 377, and is an eligible candidate.)

To get an idea what your NTP server will do, try ntpq -p:

msa@paladin:/home/msa (582)$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-nist1.symmetric .ACTS.           1 u  304 1024  377    5.140    3.271   0.581
+nist1-sj.ustimi .ACTS.           1 u  307 1024  377    7.843    5.227   0.729
+64.147.116.229  .ACTS.           1 u  414 1024  377    9.406    5.742   0.068
*usno.pa-x.dec.c .USNO.           1 u  540 1024  377    1.373    4.242   0.032
-pegasus.latt.ne 64.250.177.145   2 u  304 1024  377   61.383    5.920   6.578
-pyramid.latt.ne 216.171.124.36   2 u  361 1024  377    1.076    4.181   0.066

This is a stratum 2 server in the public pool.  It's peering
with two other stratum 2 servers that I run.  Those two are deselected
(-).  The server marked with a * is selected, and those with a + are
included in a weighted average used to maintain the system clock.
If the primary selected server does something wonky, it's going to 
select one of the candidates marked with a +.

In this case it has enough stratum 1 servers that it's not
likely to fall back to its peers, but it can do so if those servers
suddenly give it a set of unexpected replies.
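
The tally characters described above can be pulled out of ntpq -p output mechanically. The following is a rough Python sketch, not part of any NTP tooling: the `TALLY` table and `classify` function are illustrative names, and the descriptions for codes other than *, + and - come from my reading of the ntpq documentation rather than from this thread.

```python
# First-column tally codes printed by ntpq -p. The wording of each
# description is this sketch's own paraphrase.
TALLY = {
    "*": "selected (system peer)",
    "+": "candidate (included in the combined average)",
    "-": "deselected (outlier)",
    "x": "falseticker",
    " ": "rejected or not yet considered",
}

SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+nist1-sj.ustimi .ACTS.           1 u  307 1024  377    7.843    5.227   0.729
*usno.pa-x.dec.c .USNO.           1 u  540 1024  377    1.373    4.242   0.032
-pegasus.latt.ne 64.250.177.145   2 u  304 1024  377   61.383    5.920   6.578
"""


def classify(ntpq_output):
    """Map each remote in ntpq -p output to a description of its tally code."""
    result = {}
    for line in ntpq_output.splitlines():
        # Skip blank lines, the column header, and the ===== separator.
        if not line.strip() or line.lstrip().startswith(("remote", "=")):
            continue
        tally, fields = line[0], line[1:].split()
        if len(fields) < 2:
            continue
        result[fields[0]] = TALLY.get(tally, "unknown tally %r" % tally)
    return result


for remote, meaning in classify(SAMPLE).items():
    print(remote, "->", meaning)
```

Run periodically, something like this makes it easy to alert when a previously selected server turns into a falseticker.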

--msa

Re: NTP Issues Today

2012-11-21 Thread Ask Bjørn Hansen


On Nov 20, 2012, at 13:00, Darius Jahandarie djahanda...@gmail.com wrote:

Hi everyone,

I run the NTP Pool system - http://www.pool.ntp.org/ - so I have some opinions 
on some of this. :-)

 But beyond that, I'm honestly rather curious what server selections
 are a good idea. A first thought would be an adjacent country, but
 maybe there is a benefit to picking things outside of the pool.ntp.org
 selection entirely?

First of all: None of the ~3800 servers in the NTP Pool system were affected by 
this as far as I can tell from the (copious) monitoring data.

The big benefit to adding some non-pool servers is that you wouldn't be 
depending basically on a bunch of volunteers (and to a large extent me) for 
your time keeping. Though likely you'd just be depending on another group of 
volunteers.

In addition to depending on the server operators who run the ntpd servers you 
also depend on:

1) The monitoring system keeping accurate time.
2) The monitoring system does its job catching bad servers.
3) The process updating and distributing the DNS data working.
4) The DNS servers working (and not being under a DoS attack or similar).
5) Anything I haven't thought of!

Empirically I believe we've done a better job than just about anyone with a 
similar scale, but past performance is no promise of the future.

 I see that Jared used *.fedora.pool.ntp.org -- I wonder if there was a
 specific reason for that or if my questions are even worth thinking
 about at all :-).


The servers for x.fedora.pool.ntp.org are in the same group as 
x.pool.ntp.org.  If you are in a country with many servers in the pool then 
you'll very likely get different IPs for the two. If you are in a country with 
few servers your odds for that aren't so good and it'd be a bit pointless.

Anyone using the NTP Pool in a default configuration (like Fedora does) must 
get a vendor zone setup - http://www.pool.ntp.org/en/vendors.html - so we 
have at least a little bit of a chance to monitor and mitigate problems.

It also allows us to change what servers are selected, how many IPs are 
returned etc for a particular vendor.  For example if Fedora in the future 
changes to use 'pool' instead of 'server' in the configuration we could 
optimize for that.


Ask

-- 
http://askask.com/

Re: NTP Issues Today

2012-11-20 Thread Sid Rao

We had multiple servers synchronized with Windows/MS time change their clock to 
the year 2000 today.  It broke many things, including AD authentication. 

These servers had been properly synchronized for years. 

They were synchronized with Microsoft and NIST NTP servers. 

This may not be isolated. 

Sid Rao | CTI Group | +1 (317) 262-4677

On Nov 19, 2012, at 10:29 PM, George Herbert george.herb...@gmail.com wrote:

 crossreplying to outages list.
 
 Is anyone ELSE seeing GPS issues?  This could well have been an
 unrelated issue on that particular PBX.
 
 If this was real, then the mother of all infrastructure attacks might
 be underway...
 
 One glitch on tick and tock and one malfunctioning PBX is not
 sufficient evidence of pattern - much less hostile activity - to
 induce panic, but it would perhaps be a wise time to check
 time-related logs?
 
 
 -george
 
 On Mon, Nov 19, 2012 at 6:08 PM, Wallace Keith
 kwall...@pcconnection.com wrote:
 Just got paged with a pbx alarm that had 1970 as the year. By the time I 
 logged in , it was showing 2012.  Using GPS for time and date.
 
 -Original Message-
 From: Mark Andrews [mailto:ma...@isc.org]
 Sent: Monday, November 19, 2012 8:42 PM
 To: Van Wolfe
 Cc: nanog@nanog.org
 Subject: Re: NTP Issues Today
 
 
 In message 
 cameggd4cdqwhxqe_jbvpnr-pkke9lxqa+kzj97anhfonjwz...@mail.gmail.com
 , Van Wolfe writes:
 Hello,
 
 Did anyone else experience issues with NTP today?  We had our server
 times update to the year 2000 at around 3:30 MT, then revert back to 2012.
 
 Thanks,
 Van
 
 NTP should be immune from this sort of behaviour unless you did a ntpdate at 
 the wrong moment.  The clocks should have been marked as insane.
 
 Mark
 --
 Mark Andrews, ISC
 1 Seymour St., Dundas Valley, NSW 2117, Australia
 PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org
 
 
 
 
 
 -- 
 -george william herbert
 george.herb...@gmail.com

Re: NTP Issues Today

2012-11-20 Thread Leo Bicknell

In a message written on Mon, Nov 19, 2012 at 04:21:55PM -0700, Van Wolfe wrote:
 Did anyone else experience issues with NTP today?  We had our server
 times update to the year 2000 at around 3:30 MT, then revert back to 2012.

I'm surprised the various time geeks aren't all posting their logs, so
I'll kick off:

/tmp/parse-peerstats.pl peerstats.20121119
56250 76367.354 192.5.41.41 91b4 -378691200.312258363 0.088274002 0.014835425 0.263515353
56250 77391.354 192.5.41.41 91b4 -378691200.312258363 0.088274002 0.018668790 0.263749719
56250 78204.354 192.5.41.40 90b4 -378691200.785377324 0.088179350 0.014812585 0.263668835
56250 78416.355 192.5.41.41 91b4 -378691200.785974681 0.088312507 0.014832943 0.209966600
56250 79229.355 192.5.41.40 90b4 -378691200.785377324 0.088179350 0.018668723 378691200.785523713
56250 79442.355 192.5.41.41 91b4 -378691200.785974681 0.088312507 0.018689918 378691200.786114931

Or in more human readable form:
/tmp/parse-peerstats.pl peerstats.20121119
192.5.41.41 off by -378691200.312258363
192.5.41.41 off by -378691200.312258363
192.5.41.40 off by -378691200.785377324
192.5.41.41 off by -378691200.785974681
192.5.41.40 off by -378691200.785377324
192.5.41.41 off by -378691200.785974681

The script, if you want to run against your own stats:

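#!/usr/bin/perl
use strict;
use warnings;

# Read peerstats lines from the files named on the command line (or stdin).
while (<>) {
  chomp;
  my ($day, $second, $addr, $status, $offset, $delay, $disp, $skew) = split;
  next unless defined $offset;   # skip blank or short lines
  # Report any peer whose offset exceeds 10 seconds in either direction.
  if (($offset > 10) || ($offset < -10)) {
    #print "$addr off by $offset\n"; # More human friendly
    print "$_\n";   # Full details
  }
}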

It just looks for servers off by more than 10 seconds and then prints
the line.  378691200 seconds is ~12 years, which lines up with the
year 2000 dates some are reporting.
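
The "~12 years" figure can be checked directly. A small Python sketch of the arithmetic, assuming the offset is applied to a clock reading 2012-11-19 (time of day ignored); the variable names are illustrative:

```python
from datetime import datetime, timedelta

OFFSET_SECONDS = 378_691_200  # magnitude of the offset in the peerstats lines

days = OFFSET_SECONDS / 86_400
print(days)           # 4383.0
print(days / 365.25)  # 12.0

# 4383 days = 12 * 365 + 3 leap days (2004, 2008, 2012).
reported = datetime(2012, 11, 19) - timedelta(seconds=OFFSET_SECONDS)
print(reported.date())  # 2000-11-19
```

So the whole-second part of the offset is exactly twelve calendar years, which points at a wrong year rather than random clock noise.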

The IPs are those of tick.usno.navy.mil and tock.usno.navy.mil.

I can confirm from my vantage point that tick and tock both went about
12 years wrong on Nov 19th for a bit.  I can also report that my NTP
server, which has sufficient sources, correctly determined they were
haywire and ignored them.

If your machines switched dates yesterday it probably means your
NTP infrastructure is insufficiently peered and diversified.

-- 
   Leo Bicknell - bickn...@ufp.org - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/



Re: [outages] NTP Issues Today

2012-11-20 Thread Colin Johnston


On 20 Nov 2012, at 15:38, Jeremy Chadwick j...@koitsu.org wrote:

 I'm still waiting for someone who was affected by this to provide
 coherent logs from ntpd showing exactly when the time change happened.
 Getting these, at least on an *IX system, is far from difficult folks.
 

from firewall ntp logs
Nov 19 09:58:06 [192.168.0.1.128.176] 2012:11:19-09:58:06 ntpd[21385]: ntpd exiting on signal 15
Nov 19 09:58:19 [192.168.0.1.128.176] 2012:11:19-09:58:19 selfmonng[3503]: W check Failed increment ntpd_running counter 3 - 3
Nov 19 09:58:22 [192.168.0.1.128.176] 2012:11:19-09:58:22 selfmonng[3503]: W NOTIFYEVENT Name=ntpd_running Level=INFO Id=147 sent
Nov 19 09:58:22 [192.168.0.1.128.176] 2012:11:19-09:58:22 selfmonng[3503]: W triggerAction: 'cmd'
Nov 19 09:58:22 [192.168.0.1.128.176] 2012:11:19-09:58:22 selfmonng[3503]: W actionCmd(+):  '/var/mdw/scripts/ntp restart'
Nov 19 09:58:25 [192.168.0.1.128.176] 2012:11:19-09:58:25 ntpd[24120]: ntpd 4.2.4p8@1.1612-o Tue Feb  2 21:46:54 UTC 2010 (1)
Nov 19 09:58:25 [192.168.0.1.128.176] 2012:11:19-09:58:25 selfmonng[3503]: W child returned status: exit='0' signal='0'
Nov 19 09:58:35 [192.168.0.1.128.176] 2012:11:19-09:58:35 ntpd[24121]: kernel time sync status change 0001

was sync'd to 84.25.175.98, stratum 2 at the time I believe

Colin

Re: NTP Issues Today

2012-11-20 Thread Steve Meuse

On Tue, Nov 20, 2012 at 11:38 AM, Leo Bicknell bickn...@ufp.org wrote:


 If your machines switched dates yesterday it probably means your
 NTP infrastructure is insufficiently peered and diversified.


If you take anything away from this thread, this is it

-Steve

Re: [outages] NTP Issues Today

2012-11-20 Thread Colin Johnston

no idea, re sigterm cause
checked firewall system logs and could not see cause from that either
times are GMT

Colin

On 20 Nov 2012, at 17:05, Jeremy Chadwick j...@koitsu.org wrote:

 Colin,
 
 Signal 15 = SIGTERM, so something intentionally shut ntpd down on your
 side.  The logs I'd be interested in would be prior to what you've
 provided, i.e. what led to the SIGTERM.
 
 Also, no timezone is mentioned anywhere in your timestamps, so please
 provide that (UTC offset would be best).
 
 -- 
 | Jeremy Chadwick   j...@koitsu.org |
 | UNIX Systems Administratorhttp://jdc.koitsu.org/ |
 | Mountain View, CA, US|
 | Making life hard for others since 1977. PGP 4BD6C0CB |
 
 On Tue, Nov 20, 2012 at 05:02:06PM +, Colin Johnston wrote:
 
 On 20 Nov 2012, at 15:38, Jeremy Chadwick j...@koitsu.org wrote:
 
 I'm still waiting for someone who was affected by this to provide
 coherent logs from ntpd showing exactly when the time change happened.
 Getting these, at least on an *IX system, is far from difficult folks.
 
 
 from firewall ntp logs
 Nov 19 09:58:06 [192.168.0.1.128.176] 2012:11:19-09:58:06 ntpd[21385]: ntpd 
 exiting on signal 15
 Nov 19 09:58:19 [192.168.0.1.128.176] 2012:11:19-09:58:19 selfmonng[3503]: W 
 check Failed increment ntpd_running counter 3 - 3
 Nov 19 09:58:22 [192.168.0.1.128.176] 2012:11:19-09:58:22 selfmonng[3503]: W 
 NOTIFYEVENT Name=ntpd_running Level=INFO Id=147 sent
 Nov 19 09:58:22 [192.168.0.1.128.176] 2012:11:19-09:58:22 selfmonng[3503]: W 
 triggerAction: 'cmd'
 Nov 19 09:58:22 [192.168.0.1.128.176] 2012:11:19-09:58:22 selfmonng[3503]: W 
 actionCmd(+):  '/var/mdw/scripts/ntp restart'
 Nov 19 09:58:25 [192.168.0.1.128.176] 2012:11:19-09:58:25 ntpd[24120]: ntpd 
 4.2.4p8@1.1612-o Tue Feb  2 21:46:54 UTC 2010 (1)
 Nov 19 09:58:25 [192.168.0.1.128.176] 2012:11:19-09:58:25 selfmonng[3503]: W 
 child returned status: exit='0' signal='0'
 Nov 19 09:58:35 [192.168.0.1.128.176] 2012:11:19-09:58:35 ntpd[24121]: 
 kernel time sync status change 0001
 
 was sync'd to 84.25.175.98, stratum 2 at the time I believe
 
 Colin

Re: NTP Issues Today

2012-11-20 Thread Seth Mattinen

On 11/19/12 6:08 PM, Wallace Keith wrote:
 Just got paged with a pbx alarm that had 1970 as the year. By the time I 
 logged in , it was showing 2012.  Using GPS for time and date. 
 


I use GPS for my NTP server and didn't notice anything, but it's PPS
disciplined after initial sync so it doesn't matter as long as the pulse
keeps going.

ntp0# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.          12 l   10   64  377    0.000    0.000   0.015
+216.171.124.36  .ACTS.           1 u  167 1024  377   26.801    2.387   0.015
+127.127.20.0    .GPS.            0 l   45   64  377    0.000   -0.048   0.015
o127.127.22.0    .PPS.            0 l   27   64  377    0.000   -0.048   0.015


~Seth

Re: NTP Issues Today

2012-11-20 Thread Leo Bicknell


After some private replies, I'm going to reply to my own post with
some information here.

It appears many people don't understand how the NTP protocol works.
I suspect many people have configured a primary and a backup
NTP server on many of their devices.  It turns out this is the
_WORST_ possible configuration if you want accurate time:

http://support.ntp.org/bin/view/Support/SelectingOffsiteNTPServers#Section_5.3.3.

To protect against two falseticking servers (tick and tock, as we saw on
the 19th) you need _FIVE_ servers minimum configured if they are both in
the list.  More importantly, if you want to protect against a source
(GPS, CDMA, IRIG, WWV, ACTS, etc) falseticking, you need a minimum of
_FOUR_ different source technologies in the list as well.

It's not hard, my box that I posted the logs from peers with 18 servers
using 8 source technologies, all freely available on the Internet...
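
To see why two sources are useless and five survive two falsetickers, here is a sketch of Marzullo's interval-intersection algorithm in Python. ntpd's real selection (intersection followed by clustering and combining) is more elaborate; this only illustrates the majority-agreement idea. The function names, the 0.05 s error bound and the sample offsets are assumptions chosen for the example.

```python
def best_agreement(intervals):
    """Return (count, lo, hi): the range covered by the most source intervals.

    Each interval is (lo, hi), i.e. a source's offset minus/plus its error bound.
    """
    edges = []
    for lo, hi in intervals:
        edges.append((lo, -1))  # interval opens
        edges.append((hi, +1))  # interval closes
    edges.sort()  # at equal values, opens (-1) sort before closes (+1)

    best = count = 0
    best_lo = best_hi = None
    for i, (value, kind) in enumerate(edges):
        count -= kind  # +1 on an open edge, -1 on a close edge
        if count > best:
            # A new maximum can only start at an open edge, so a later
            # edge always exists and bounds the overlap from above.
            best, best_lo, best_hi = count, value, edges[i + 1][0]
    return best, best_lo, best_hi


def interval(offset, err=0.05):
    """Hypothetical helper: offset with a symmetric error bound, in seconds."""
    return (offset - err, offset + err)


BAD = -378691200.0  # roughly what tick and tock reported on the 19th

five = [interval(0.001), interval(0.002), interval(-0.001),
        interval(BAD - 0.3), interval(BAD - 0.8)]
two = [interval(0.001), interval(BAD - 0.3)]

count5, lo5, hi5 = best_agreement(five)
count2, lo2, hi2 = best_agreement(two)

print(count5, count5 > len(five) / 2)  # 3 True: a majority agrees near zero
print(count2, count2 > len(two) / 2)   # 1 False: no majority, no way to choose
```

With five sources the three sane ones overlap and outvote the two bad ones; with two sources each interval stands alone and neither can be preferred.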

-- 
   Leo Bicknell - bickn...@ufp.org - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/



Re: NTP Issues Today

2012-11-20 Thread Jay Ashworth

----- Original Message -----
 From: Leo Bicknell bickn...@ufp.org

 To protect against two falseticking servers (tick and tock, as we saw on
 the 19th) you need _FIVE_ servers minimum configured if they are both in
 the list. More importantly, if you want to protect against a source
 (GPS, CDMA, IRIG, WWIV, ACTS, etc) false ticking, you need a minimum of
 _FOUR_ different source technologies in the list as well.
 
 It's not hard, my box that I posted the logs from peers with 18
 servers using 8 source technologies, all freely available on the Internet...

I'm curious, Leo, what your internal setup looks like.  Do you have an
internal pair of masters, all slaved to those externals and one another, 
with your machines homed to them?  Full mesh?  Or something else?

In my last big gig, it was recommended to me that I have all the machines 
which had to speak to my DBMS NTP *to it*, and have only it connect to the
rest of my NTP infrastructure.  It coming unstuck was of less operational
impact than *pieces of it* going out of sync with one another...

Cheers,
-- jra
-- 
Jay R. Ashworth  Baylink   j...@baylink.com
Designer The Things I Think   RFC 2100
Ashworth  Associates http://baylink.pitas.com 2000 Land Rover DII
St Petersburg FL USA   #natog  +1 727 647 1274

Re: NTP Issues Today

2012-11-20 Thread Jared Mauch


On Nov 20, 2012, at 2:28 PM, Jay Ashworth j...@baylink.com wrote:

 - Original Message -
 From: Leo Bicknell bickn...@ufp.org
 
 To protect against two falseticking servers (tick and tock, as we saw on
 the 19th) you need _FIVE_ servers minimum configured if they are both in
 the list. More importantly, if you want to protect against a source
 (GPS, CDMA, IRIG, WWIV, ACTS, etc) false ticking, you need a minimum of
 _FOUR_ different source technologies in the list as well.
 
 It's not hard, my box that I posted the logs from peers with 18
 servers using 8 source technologies, all freely available on the Internet...
 
 I'm curious, Leo, what your internal setup looks like.  Do you have an
 internal pair of masters, all slaved to those externals and one another, 
 with your machines homed to them?  Full mesh?  Or something else?
 
 In my last big gig, it was recommended to me that I have all the machines 
 which had to speak to my DBMS NTP *to it*, and have only it connect to the
 rest of my NTP infrastructure.  It coming unstuck was of less operational
 impact than *pieces of it* going out of sync with one another...


here's a sample ntp config from one of my systems.

-- snip --
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.fedora.pool.ntp.org
server 1.fedora.pool.ntp.org
server 2.fedora.pool.ntp.org
server 3.fedora.pool.ntp.org

#
server 0.us.pool.ntp.org iburst maxpoll 9
server 1.us.pool.ntp.org iburst maxpoll 9
server 2.us.pool.ntp.org iburst maxpoll 9
server 129.250.35.250 iburst maxpoll 9
server 129.250.35.251 iburst maxpoll 9

-- snip --

You can audit its operation like this:

nat:~$ ntpq -p -n -c ass
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-129.250.35.250  164.244.221.197  2 u   68  512  377   19.248   -0.135   3.195
+129.250.35.251  192.5.41.40      2 u  439  512  377   41.817    1.109  15.660
-206.57.44.17    204.123.2.5      2 u  126  512  377   37.133   -6.443   9.631
+4.53.160.75     209.81.9.7       2 u   48  512  377   25.209    1.551   8.804
-64.73.32.135    192.5.41.41      2 u  349  512  377   23.418   -0.703   1.721
*50.116.38.157   64.250.177.145   2 u  380  512  377   43.021    1.267   2.136
+208.87.221.228  10.0.22.49       2 u  517  512  377   92.000    0.974   0.678
-206.212.242.132 128.252.19.1     2 u  323  512  377   21.781   -2.873   1.304
+38.229.71.1     204.123.2.72     2 u  211  512  377   21.977   -0.055   2.274

ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 39973  931a   yes   yes  none   outlyer    sys_peer  1
  2 39974  941a   yes   yes  none candidate    sys_peer  1
  3 39975  9324   yes   yes  none   outlyer   reachable  2
  4 39976  942a   yes   yes  none candidate    sys_peer  2
  5 39977  931a   yes   yes  none   outlyer    sys_peer  1
  6 39978  961a   yes   yes  none  sys.peer    sys_peer  1
  7 39979  9414   yes   yes  none candidate   reachable  1
  8 39980  931a   yes   yes  none   outlyer    sys_peer  1
  9 39981  941a   yes   yes  none candidate    sys_peer  1


What you would have seen is a falseticker from the impacted clocks.
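
The hex status column in the associations listing encodes the same information as the condition and last_event columns. A hedged Python sketch of the decoding follows; the bit layout reflects my reading of the ntpq documentation and is consistent with the rows above, but the condition names vary between ntpd versions, and `decode_peer_status` is a made-up helper name. Only the two event codes that appear in the listing are named.

```python
# Selection field values (low three bits of the high byte).
CONDITION = {0: "reject", 1: "falsetick", 2: "excess", 3: "outlyer",
             4: "candidate", 5: "backup", 6: "sys.peer", 7: "pps.peer"}

# Event codes seen in the listing above; other codes are left as numbers.
EVENT = {0x4: "reachable", 0xA: "sys_peer"}


def decode_peer_status(word):
    """Split a 16-bit peer status word into its documented fields."""
    high = word >> 8
    code = word & 0xF
    return {
        "configured": bool(high & 0x80),
        "reachable": bool(high & 0x10),
        "condition": CONDITION[high & 0x07],
        "event_count": (word >> 4) & 0xF,
        "event_code": code,
        "last_event": EVENT.get(code, hex(code)),
    }


for status in (0x961A, 0x931A, 0x9324):
    print("%04x" % status, decode_peer_status(status))
```

A falseticker would show condition value 1 here, for example a status word beginning 91 instead of 93, 94 or 96.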

This is a fairly reasonable setup.

I've also been looking at an item like this:

http://www.netburnerstore.com/ProductDetails.asp?ProductCode=PK70EX-NTP

which is about $300 + misc parts.

Should be well worth it to avoid a 'major outage' that some folks had with 
needing to reboot their servers, etc.

- Jared

Re: NTP Issues Today

2012-11-20 Thread Leo Bicknell

In a message written on Tue, Nov 20, 2012 at 02:28:19PM -0500, Jay Ashworth 
wrote:
 I'm curious, Leo, what your internal setup looks like.  Do you have an
 internal pair of masters, all slaved to those externals and one another, 
 with your machines homed to them?  Full mesh?  Or something else?

My particular internal setup is a tad weird, and so rather than
answer your question, I'm going to answer with some generalities.
The right answer of course depends a lot on how important it is
that boxes have the right time.

If you have 4 or more physical sites, I believe the right answer
is to have on the order of 8 NTP servers.  2 each in 4 sites reaches
the minimum nicely with redundancy.  These boxes can have GPS, CDMA
or other technologies if you want, but MUST peer with at least 10
stratum-1 sources outside of your network.  Of course if you have
more sites, one server in each of 8 sites is peachy.  Those on a
budget could probably get by with 4 servers total, but never less!

All critical devices should then be synced to the full set of
internal servers.  4 boxes minimum, 8-10 preferred.  NTP will only
use the 10 best servers in its calculations, so there is a steep
dropoff of diminishing returns beyond 10.  For most ISPs I would
include all routers in this list.

For the non-critical devices?  Well, there it gets more complex.
For most I would only configure one server, their default gateway
router.  Of course, pushing out a set of 4+ to them if that is
easy is a great thing to do.

The interesting thing here is that no devices except for your NTP
servers should ever peer with anything outside of your network.
Why?  Let's say your NTP servers all go crazy together.  The outside
world is cut off, GPS is spoofed, the world is ending.  All that
you have left is that all of your devices are in time to each
other, so at least your logs still correlate and such.  So having
every device under your master set of NTP servers is important.
One guy with an external peer may choose to use that, and leave the
hive mind, so to speak.

For small players, less than 4 sites, typically just use the NTP
pool servers, configuring 4 per box minimum.  If you want the same
protection I just outlined in the paragraph before, make 4 of your
servers talk to the outside world, and make everything else talk
to those.  Want to give back to the community?  Get a GPS/CDMA/Whatever
box and make it part of the NTP pool.  Want to step up your game (which
is what I do), reach out to various Stratum-1's on the net (or find
free, open ones) and peer up 8-20 of them.

 In my last big gig, it was recommended to me that I have all the machines 
 which had to speak to my DBMS NTP *to it*, and have only it connect to the
 rest of my NTP infrastructure.  It coming unstuck was of less operational
 impact than *pieces of it* going out of sync with one another...

Yep, a prime example of the scenario I described above.  Depending on
your level of network redundancy, number of NTP servers, and so on, this
is a fine solution.  With one NTP server (the DBMS) the downstream will
always use it, and stay in sync.  It's a valid and good config in many
situations.

-- 
   Leo Bicknell - bickn...@ufp.org - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/



Re: NTP Issues Today

2012-11-20 Thread George Herbert





On Nov 20, 2012, at 11:39 AM, Jared Mauch ja...@puck.nether.net wrote:
.
 
 I've also been looking at an item like this:
 
 http://www.netburnerstore.com/ProductDetails.asp?ProductCode=PK70EX-NTP
 
 which is about $300 + misc parts.
 
 Should be well worth it to avoid a 'major outage' that some folks had with 
 needing to reboot their servers, etc.
 
 - Jared


Caution - that Netburner device is just GPS synced, so if GPS ever does go 
insane you're out of luck.  It doesn't list a precision internal clock part.

I am not sure what all is in the dev kit version, but I know the company owner 
and can ask if anyone cares.




George William Herbert
Sent from my iPhone

Re: NTP Issues Today

2012-11-20 Thread Darius Jahandarie

On Tue, Nov 20, 2012 at 3:15 PM, Leo Bicknell bickn...@ufp.org wrote:
 For small players, less than 4 sites, typically just use the NTP
 pool servers, configuring 4 per box minimum.  If you want the same
 protection I just outlined in the paragraph before, make 4 of your
 servers talk to the outside world, and make everything else talk
 to those.  Want to give back to the community?  Get a GPS/CDMA/Whatever

Choosing the first four servers is usually pretty straightforward:
*.CC.pool.ntp.org

But beyond that, I'm honestly rather curious what server selections
are a good idea. A first thought would be an adjacent country, but
maybe there is a benefit to picking things outside of the pool.ntp.org
selection entirely?

I see that Jared used *.fedora.pool.ntp.org -- I wonder if there was a
specific reason for that or if my questions are even worth thinking
about at all :-).


Happy to hear thoughts.

-- 
Darius Jahandarie

Re: NTP Issues Today

2012-11-20 Thread Mike Lyon

I usually use time.nist.gov.

On Tue, Nov 20, 2012 at 1:00 PM, Darius Jahandarie djahanda...@gmail.com wrote:

 On Tue, Nov 20, 2012 at 3:15 PM, Leo Bicknell bickn...@ufp.org wrote:
  For small players, less than 4 sites, typically just use the NTP
  pool servers, configuring 4 per box minimum.  If you want the same
  protection I just outlined in the paragraph before, make 4 of your
  servers talk to the outside world, and make everything else talk
  to those.  Want to give back to the community?  Get a GPS/CDMA/Whatever

 Choosing the first four servers is usually pretty straightforward:
 *.CC.pool.ntp.org

 But beyond that, I'm honestly rather curious what server selections
 are a good idea. A first thought would be an adjacent country, but
 maybe there is a benefit to picking things outside of the pool.ntp.org
 selection entirely?

 I see that Jared used *.fedora.pool.ntp.org -- I wonder if there was a
 specific reason for that or if my questions are even worth thinking
 about at all :-).


 Happy to hear thoughts.

 --
 Darius Jahandarie




-- 
Mike Lyon
408-621-4826
mike.l...@gmail.com

http://www.linkedin.com/in/mlyon

RE: [outages] NTP Issues Today

2012-11-20 Thread R. Benjamin Kessler

 to 172.20.167.252, stratum=2
Nov 20 11:25:16   xntpd[70766]: synchronized to 172.20.167.251, stratum=2
Nov 20 12:33:56   xntpd[70766]: synchronized to 172.20.167.252, stratum=2
Nov 20 14:16:05   xntpd[70766]: kernel time sync enabled 6001
Nov 20 14:33:10   xntpd[70766]: kernel time sync enabled 2001
Nov 20 15:07:19   xntpd[70766]: synchronized to 172.20.167.251, stratum=2




-----Original Message-----
From: outages-boun...@outages.org [mailto:outages-boun...@outages.org] On 
Behalf Of Jeremy Chadwick
Sent: Tuesday, November 20, 2012 10:38 AM
To: Scott Voll
Cc: Sid Rao; outages; nanog@nanog.org
Subject: Re: [outages] NTP Issues Today

I'm still waiting for someone who was affected by this to provide coherent logs 
from ntpd showing exactly when the time change happened.
Getting these, at least on an *IX system, is far from difficult folks.

Please don't omit anything from the logs either; for example if you know
*exactly* what NTP servers were in use (not ones you had configured
but which one was primarily chosen by ntpd ('*' mark) and which were secondary 
comparisons/fallbacks ('+' mark)), that would also be greatly helpful.  This 
would be output from ntpq -c peers when run on your NTP server *at or around 
the time* the incident happened and recovered.

What's been provided so far is that something happened, with reports of 
clocks going back to year 2000, and other reports of clocks going back to 
(presumably) epoch time; those reporting it were using either usno.navy.mil, 
NIST, or Microsoft NTP servers.  usno.navy.mil uses dedicated IRIG/AFNOR TCRs 
boxes, while NIST uses GPS.  No idea what Microsoft uses.

I asked on a public *IX forum if anyone saw anything NTP-wise that was out of 
the ordinary and not a single admin saw anything.  I also saw nothing anomalous 
on either of my FreeBSD machines (9.1-PRERELEASE, running base system ntpd 
4.2.4p8), but I sync with very specific stratum
1 and stratum 2 servers across the United States.

As Mark Andrews from the ISC stated below (read slowly/carefully), ntpd will 
not allow large clock jumps -- the largest it'll allow out of the box is 1000s 
(and on some systems like Solaris ntpd, 500s) -- unless you're running with the 
-g flag (and shame on you if you're doing that).
So I'm very surprised by this problem altogether.  Can't deny what happened 
did, but figuring out *why* is important.

Also, for Mike Lyon -- I looked at NIST's GPS graphs.  Did you notice they have 
no data for 11/18, 11/19, or 11/20?  I find that unnerving, do you not?

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator http://jdc.koitsu.org/ |
| Mountain View, CA, US|
| Making life hard for others since 1977. PGP 4BD6C0CB |

On Tue, Nov 20, 2012 at 07:18:45AM -0800, Scott Voll wrote:
 Same thing happened to us yesterday.  ended up having to reboot 
 everything after we got time fixed.  Major outage.
 
 Scott
 
 
 On Mon, Nov 19, 2012 at 7:58 PM, Sid Rao s...@ctigroup.com wrote:
 
  We had multiple servers synchronized with Windows/MS time change 
  their clock to the year 2000 today.  It broke many things, including 
  AD authentication.
 
  These servers had been properly synchronized for years.
 
  They were synchronized with Microsoft and NIST NTP servers.
 
  This may not be isolated.
 
  Sid Rao | CTI Group | +1 (317) 262-4677
 
  On Nov 19, 2012, at 10:29 PM, George Herbert 
  george.herb...@gmail.com
  wrote:
 
   crossreplying to outages list.
  
   Is anyone ELSE seeing GPS issues?  This could well have been an 
   unrelated issue on that particular PBX.
  
   If this was real, then the mother of all infrastructure attacks 
   might be underway...
  
   One glitch on tick and tock and one malfunctioning PBX is not 
   sufficient evidence of pattern - much less hostile activity - to 
   induce panic, but it would perhaps be a wise time to check 
   time-related logs?
  
  
   -george
  
   On Mon, Nov 19, 2012 at 6:08 PM, Wallace Keith 
   kwall...@pcconnection.com wrote:
   Just got paged with a pbx alarm that had 1970 as the year. By the 
   time
  I logged in , it was showing 2012.  Using GPS for time and date.
  
   -Original Message-
   From: Mark Andrews [mailto:ma...@isc.org]
   Sent: Monday, November 19, 2012 8:42 PM
   To: Van Wolfe
   Cc: nanog@nanog.org
   Subject: Re: NTP Issues Today
  
  
   In message 
  cameggd4cdqwhxqe_jbvpnr-pkke9lxqa+kzj97anhfonjwz...@mail.gmail.com
   , Van Wolfe writes:
   Hello,
  
   Did anyone else experience issues with NTP today?  We had our 
   server times update to the year 2000 at around 3:30 MT, then 
   revert back to
  2012.
  
   Thanks,
   Van
  
   NTP should be immune from this sort of behaviour unless you did a
  ntpdate at the wrong moment.  The clocks should have been marked as insane.
  
   Mark
   --
   Mark Andrews, ISC
   1 Seymour St., Dundas Valley, NSW 2117

Re: NTP Issues Today

2012-11-20 Thread Jared Mauch


On Nov 20, 2012, at 4:00 PM, Darius Jahandarie djahanda...@gmail.com wrote:

 Choosing the first four servers is usually pretty straightforward:
 *.CC.pool.ntp.org
 
 But beyond that, I'm honestly rather curious what server selections
 are a good idea. A first thought would be an adjacent country, but
 maybe there is a benefit to picking things outside of the pool.ntp.org
 selection entirely?
 
 I see that Jared used *.fedora.pool.ntp.org -- I wonder if there was a
 specific reason for that or if my questions are even worth thinking
 about at all :-).

I'm by no means a time geek, but … I have some ideas about what you want and 
can tell you why I picked the settings I did…

1) The 129.250 ones are clocks run by my employer.  It is a good idea to know how 
accurate they are.

2) The pool ones, some were default (e.g.: fedora) from my OS distro on the 
machine I took the example from.  You will see FreeBSD, CentOS and others based 
on your settings.  You may even see time.apple.com if you are on Mac OS.

3) CC ntp pool were selected to provide additional clock diversity.

4) You want low jitter to your clocks.  This will allow you to have an accurate 
timing source.  This means don't congest that path.  If you want something very 
reliable, don't run it on a server with the other misc functions you need 
(e.g.: DNS, etc).  If it's important, dedicate some hardware to it.  If it is 
of passing importance, use a fair number of peers.

I was playing with the OWAMP software.  Having consistent clocks is important 
for that, (even if they are all off by a few ms). It can be fun to play with 
and measure things… http://www.internet2.edu/performance/owamp/index.html

5) Monitor your NTP setup periodically.  You may see clocks be rejected or 
outliers.  Depending on how close your clocks are, you may see a fair number be 
unusable.  Take this output:

nat:~$ ntpq -n -p -c ass
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*129.250.35.250  164.244.221.197  2 u  507  512  377   18.883    0.196  18.311
+129.250.35.251  209.51.161.238   2 u  366  512  377   41.349    0.429   2.184
-206.57.44.17    204.123.2.5      2 u   91  512  377   35.884   -5.982   7.099
-4.53.160.75     209.81.9.7       2 u    5  512  377   24.250    1.522   1.353
+64.73.32.135    164.67.62.194    2 u  296  512  377   26.405   -0.956  11.244
+50.116.38.157   64.250.177.145   2 u  897 1024  377   42.978    0.685   1.211
-208.87.221.228  10.0.22.51       2 u  390  512  377   83.858   -2.717   0.814
-206.212.242.132 128.252.19.1     2 u  262  512  377   22.278   -1.640   1.150
+38.229.71.1     204.123.2.72     2 u   95  512  377   20.688    0.113   1.878

ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 39973  961a   yes   yes  none  sys.peer    sys_peer  1
  2 39974  941a   yes   yes  none candidate    sys_peer  1
  3 39975  9324   yes   yes  none   outlyer   reachable  2
  4 39976  932a   yes   yes  none   outlyer    sys_peer  2
  5 39977  941a   yes   yes  none candidate    sys_peer  1
  6 39978  941a   yes   yes  none candidate    sys_peer  1
  7 39979  9314   yes   yes  none   outlyer   reachable  1
  8 39980  931a   yes   yes  none   outlyer    sys_peer  1
  9 39981  941a   yes   yes  none candidate    sys_peer  1

Only 5 of the 9 clocks are usable: four are 'candidate' and one is the actual reference clock.  The 
jitter on the reference clock is nearly equal to the delay (!).  This is on a business 
class internet link/tier, but from one of the 'usual suspects' that offers 
residential services as well.  I haven't been able to find them operating any 
customer accessible clocks, but they may exist.

My config, or one resembling it, will give you a fair amount of diversity of 
clocks.  Syncing to one can easily result in being lied to and resetting the 
clock, as everyone who went back to 2000 observed.
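
A minimal sketch of the periodic check in point 5, assuming you capture the text of `ntpq -n -p` yourself and only care about the three tally characters that appear in the listing above ('*', '+', '-'). The function name `tally_counts` and the `SAMPLE` constant are made up for illustration; the sample rows are retyped from the listing.

```python
from collections import Counter

# Tally characters seen in the listing above; anything else is lumped as "other".
TALLY = {"*": "sys.peer", "+": "candidate", "-": "outlyer"}

SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*129.250.35.250  164.244.221.197  2 u  507  512  377   18.883    0.196  18.311
+129.250.35.251  209.51.161.238   2 u  366  512  377   41.349    0.429   2.184
-206.57.44.17    204.123.2.5      2 u   91  512  377   35.884   -5.982   7.099
-4.53.160.75     209.81.9.7       2 u    5  512  377   24.250    1.522   1.353
+64.73.32.135    164.67.62.194    2 u  296  512  377   26.405   -0.956  11.244
+50.116.38.157   64.250.177.145   2 u  897 1024  377   42.978    0.685   1.211
-208.87.221.228  10.0.22.51       2 u  390  512  377   83.858   -2.717   0.814
-206.212.242.132 128.252.19.1     2 u  262  512  377   22.278   -1.640   1.150
+38.229.71.1     204.123.2.72     2 u   95  512  377   20.688    0.113   1.878
"""


def tally_counts(peers_text):
    """Count peers per tally code, skipping the header and separator lines."""
    counts = Counter()
    for line in peers_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("remote") or set(stripped) == {"="}:
            continue
        counts[TALLY.get(line[0], "other")] += 1
    return counts


counts = tally_counts(SAMPLE)
usable = counts["sys.peer"] + counts["candidate"]
print(f"{usable} of {sum(counts.values())} sources usable")
```

Running something like this from cron and alerting when the usable count drops below a floor you choose is one way to notice rejected clocks before they matter.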

- Jared

Picking outside NTP servers (Re: NTP Issues Today)

2012-11-20 Thread Jay Ashworth

- Original Message -
 From: Darius Jahandarie djahanda...@gmail.com

 Choosing the first four servers is usually pretty straightforward:
 *.CC.pool.ntp.org
 
 But beyond that, I'm honestly rather curious what server selections
 are a good idea. A first thought would be an adjacent country, but
 maybe there is a benefit to picking things outside of the pool.ntp.org
 selection entirely?
 
 I see that Jared used *.fedora.pool.ntp.org -- I wonder if there was a
 specific reason for that or if my questions are even worth thinking
 about at all :-).

Ah; the question that has plagued mankind since the beginning of.. time.

:-)

There are a couple of documents on this topic at ntp.org, and there's the
traditional list -- of questionable accuracy at this point -- of open-access
Stratum 1 and 2 servers.

For myself, I usually pick the first three in us.pool.ntp.org, tick and tock,
time.nist.gov, and a couple of regionally appropriate large universities.

I have always aimed for 6 to 8 outside servers, and a pair inside,
preferably in different locations, both talking to one another.

If your site is in Internet Business, you should probably peer with 
your business partners.  If you deal with Google Docs or AWS, you should
probably peer with them, if they have servers for that.

Cheers,
-- jra
-- 
Jay R. Ashworth  Baylink   j...@baylink.com
Designer The Things I Think   RFC 2100
Ashworth  Associates http://baylink.pitas.com 2000 Land Rover DII
St Petersburg FL USA   #natog  +1 727 647 1274

Re: Picking outside NTP servers (Re: NTP Issues Today)

2012-11-20 Thread George Herbert

On Tue, Nov 20, 2012 at 1:53 PM, Jay Ashworth j...@baylink.com wrote:

 For myself, I usually pick the first three in us.pool.ntp.org, tick and tock,
 time.nist.gov, and a couple of regionally appropriate large universities.

As this week indicated, perhaps tick and tock are not sufficiently far
apart to be a good redundancy choice from a geographical failover
point of view or common mode failure point of view.

As part of a set of 8 servers as you indicate later, perhaps OK, but I
fear for people who think "OK, I want redundancy, so... tick and
tock."  Which, it turns out, was a significant number of people.

-- 
-george william herbert
george.herb...@gmail.com

Re: Picking outside NTP servers (Re: NTP Issues Today)

2012-11-20 Thread Majdi S. Abbas

On Tue, Nov 20, 2012 at 04:53:39PM -0500, Jay Ashworth wrote:
 For myself, I usually pick the first three in us.pool.ntp.org, tick and tock,
 time.nist.gov, and a couple of regionally appropriate large universities.

I'd advise going through the RR for a while, and picking servers
close to you.  ntpd won't select a server that's more than 128ms away.
Distance also degrades accuracy.  Select for minimum latency, as well as
a diverse set of sources.  [Watch their refid over time, and make sure
they aren't slaving to the same set of servers, as well as others
you may be using.]

It requires a bit of effort, but over time you get an idea what
public time servers are close to each of your locations, and diverse
from each other.
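
A rough sketch of that selection, assuming you have already collected (remote, refid, delay) triples, for example from repeated `ntpq -p` runs. The function name `pick_sources`, the sample addresses (documentation ranges), and the delays are all hypothetical; the 128 ms cutoff is simply the figure quoted above, not something this code verifies.

```python
def pick_sources(candidates, want=4, max_delay_ms=128.0):
    """Pick the lowest-delay servers, at most one per upstream refid.

    candidates: iterable of (remote, refid, delay_ms) tuples.
    """
    chosen = []
    seen_refids = set()
    for remote, refid, delay_ms in sorted(candidates, key=lambda c: c[2]):
        if delay_ms > max_delay_ms:
            continue  # too far away to be worth configuring
        if refid in seen_refids:
            continue  # shares an upstream with a server already chosen
        chosen.append(remote)
        seen_refids.add(refid)
        if len(chosen) == want:
            break
    return chosen


# Hypothetical measurements: two servers share the same upstream refid.
candidates = [
    ("192.0.2.10", "198.51.100.1", 22.3),
    ("192.0.2.11", "198.51.100.1", 18.9),
    ("192.0.2.12", "198.51.100.2", 35.9),
    ("192.0.2.13", "198.51.100.3", 150.0),
    ("192.0.2.14", "198.51.100.4", 41.3),
]
result = pick_sources(candidates, want=3)
print(result)
```

Since refids change over time, a single snapshot like this is only a starting point; rerunning it against several days of samples is closer to the advice above.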

--msa

Re: NTP Issues Today

2012-11-20 Thread Jimmy Hess

On 11/19/12, Van Wolfe vanwo...@gmail.com wrote:
 Did anyone else experience issues with NTP today?  We had our server
 times update to the year 2000 at around 3:30 MT, then revert back to 2012.

Are you sure that you are actually using NTP to set your clock?
For you to sync with 2000,  you should have had multiple confused
peers from multiple time sources;  possibly a false radio signal

NTP by default has a panic threshold of 1000 seconds.

This  _should_   have caused NTP to execute a panic shutdown,
instead of setting the clock back  30 million seconds.


 Thanks,
 Van
--
-JH

Re: NTP Issues Today

2012-11-20 Thread Darius Jahandarie

On Tue, Nov 20, 2012 at 7:49 PM, Jimmy Hess mysi...@gmail.com wrote:
 Are you sure that you are actually using NTP to set your clock?
 For you to sync with 2000,  you should have had multiple confused
 peers from multiple time sources;  possibly a false radio signal

 NTP by default has a panic threshold of 1000 seconds.

 This  _should_   have caused NTP to execute a panic shutdown,
 instead of setting the clock back  30 million seconds.

For VMWare at least, their official recommendation[1] for NTP is to

tinker panic 0

for suspend/resume reasons. I've seen it default in some places.

[1] 
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427

-- 
Darius Jahandarie

Re: NTP Issues Today

2012-11-20 Thread Damian Menscher

On Tue, Nov 20, 2012 at 4:49 PM, Jimmy Hess mysi...@gmail.com wrote:

 On 11/19/12, Van Wolfe vanwo...@gmail.com wrote:
  Did anyone else experience issues with NTP today?  We had our server
  times update to the year 2000 at around 3:30 MT, then revert back to
 2012.

 Are you sure that you are actually using NTP to set your clock?
 For you to sync with 2000,  you should have had multiple confused
 peers from multiple time sources;  possibly a false radio signal

 NTP by default has a panic threshold of 1000 seconds.

 This  _should_   have caused NTP to execute a panic shutdown,
 instead of setting the clock back  30 million seconds.


From logs various people have posted, it appears NTPd saw the excessive
time shift and took the reasonable(?) step of killing itself.  The OS
detected ntpd's death and took the reasonable step of restarting it.  On
startup, ntpd can be reasonably(?) configured with the -g option to bypass
the 1000s limit to set the starting time before going into the regular ntpd
time adjustment code.

In this case that would have set them back to 2000.

It's a good lesson on how a chain of reasonable decisions can lead to a bad
outcome, so you really need to understand the interactions of the whole
system.
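
The chain above can be sketched as a toy model. This is not ntpd's code: `ntpd_reaction` is a made-up function that only encodes the two rules mentioned in this thread (exit when the offset exceeds the 1000 s panic threshold; with -g, allow one unrestricted step at startup). The offset assumes the bad source reported the same date in 2000, which is a guess; any offset past 1000 s takes the same path.

```python
from datetime import datetime

PANIC_THRESHOLD_S = 1000  # default panic threshold cited in this thread


def ntpd_reaction(offset_s, first_adjustment, started_with_g):
    """Toy model of the decision: 'adjust', 'step', or 'exit'."""
    if abs(offset_s) <= PANIC_THRESHOLD_S:
        return "adjust"  # normal operation
    if started_with_g and first_adjustment:
        return "step"    # -g permits one jump of any size at startup
    return "exit"        # panic: refuse the jump and shut down


# Assumed offset: same calendar date, twelve years earlier.
offset = (datetime(2000, 11, 19) - datetime(2012, 11, 19)).total_seconds()

# 1. The long-running daemon sees the bad time and exits.
first = ntpd_reaction(offset, first_adjustment=False, started_with_g=True)

# 2. The OS restarts it; the first adjustment after startup is unrestricted.
second = ntpd_reaction(offset, first_adjustment=True, started_with_g=True)

print(offset, first, second)
```

Each step is defensible on its own; it is the restart that turns the panic exit into the very jump it was meant to prevent.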

Damian

Re: NTP Issues Today

2012-11-20 Thread Alvaro Pereira

Looks like something bad has happened:
Behind the Random NTP Bizarreness of Incorrect Year Being Set
https://isc.sans.edu/diary.html?nstoryid=14548

---
A few people have written in within the past 18 hours about their NTP
server/clients getting set to the year 2000.  The cause of this behavior is
that an NTP server at the US Naval Observatory (pretty much the
authoritative time source in the US) was rebooted and somehow reverted to
the year 2000.  This, then, propagated out for a limited time and
downstream time sources also got this value.  It's a transient problem and
should already be rectified.  Not much really to report except an error at
the top of the food chain causing problems to the layers below.  If you
have a problem, just fix the year or resync your NTP server.

Just goes to show how reliant NTP is that it is all but a fire and forget
service once configured until bad things happen. John Bambenek

---


Alvaro Pereira

Re: NTP Issues Today

2012-11-20 Thread Blake Dunlap

That's what happens when you just follow vendor recommendations blindly. If
you do follow that on vm's (which can actually be a good practice), make
sure they pull from your own time infrastructure, and not just the world at
large, and that those servers behave in a sane fashion with regard to time
jumps.


On Tue, Nov 20, 2012 at 6:56 PM, Darius Jahandarie djahanda...@gmail.com wrote:

 On Tue, Nov 20, 2012 at 7:49 PM, Jimmy Hess mysi...@gmail.com wrote:
  Are you sure that you are actually using NTP to set your clock?
  For you to sync with 2000,  you should have had multiple confused
  peers from multiple time sources;  possibly a false radio signal
 
  NTP by default has a panic threshold of 1000 seconds.
 
  This  _should_   have caused NTP to execute a panic shutdown,
  instead of setting the clock back  30 million seconds.

 For VMWare at least, their official recommendation[1] for NTP is to

 tinker panic 0

 for suspend/resume reasons. I've seen it default in some places.

 [1]
 http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427

 --
 Darius Jahandarie

Re: NTP Issues Today

2012-11-20 Thread George Herbert

As a reminder - time infrastructure is not recommended for
virtualization.  Make them physicals.


On Tue, Nov 20, 2012 at 5:03 PM, Blake Dunlap iki...@gmail.com wrote:
 That's what happens when you just follow vendor recommendations blindly. If
 you do follow that on vm's (which can actually be a good practice), make
 sure they pull from your own time infrastructure, and not just the world at
 large, and that those servers behave in a sane fashion with regard to time
 jumps.


 On Tue, Nov 20, 2012 at 6:56 PM, Darius Jahandarie 
 djahanda...@gmail.com wrote:

 On Tue, Nov 20, 2012 at 7:49 PM, Jimmy Hess mysi...@gmail.com wrote:
  Are you sure that you are actually using NTP to set your clock?
  For you to sync with 2000,  you should have had multiple confused
  peers from multiple time sources;  possibly a false radio signal
 
  NTP by default has a panic threshold of 1000 seconds.
 
  This  _should_   have caused NTP to execute a panic shutdown,
  instead of setting the clock back  30 million seconds.

 For VMWare at least, their official recommendation[1] for NTP is to

 tinker panic 0

 for suspend/resume reasons. I've seen it default in some places.

 [1]
 http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427

 --
 Darius Jahandarie





-- 
-george william herbert
george.herb...@gmail.com

NTP Issues Today

2012-11-19 Thread Van Wolfe

Hello,

Did anyone else experience issues with NTP today?  We had our server
times update to the year 2000 at around 3:30 MT, then revert back to 2012.

Thanks,
Van

Re: NTP Issues Today

2012-11-19 Thread Scott Weeks



--- vanwo...@gmail.com wrote:
From: Van Wolfe vanwo...@gmail.com

Did anyone else experience issues with NTP today?  We had our server
times update to the year 2000 at around 3:30 MT, then revert back to 2012.
-


You need to provide more information.  For example, what NTP
source are you using?

scott

Re: NTP Issues Today

2012-11-19 Thread Scott Weeks






On 11/19/12 6:32 PM, Scott Weeks sur...@mauigateway.com wrote:
--- vanwo...@gmail.com wrote:
From: Van Wolfe vanwo...@gmail.com

Did anyone else experience issues with NTP today?  We had our server
times update to the year 2000 at around 3:30 MT, then revert back to 2012.
-

You need to provide more information.  For example, what NTP
source are you using?
--
--- chay...@centracomm.net wrote:
From: Clay Haynes chay...@centracomm.net

I can confirm this had happened on one of my test servers - it was
pointing to tick.usno.navy.mil and tock.usno.navy.mil at the time.
---

That's not a very diverse set of NTP servers.  In the future if 
you think it might be an outage, you might try on the 'outages' 
list: http://puck.nether.net/mailman/listinfo/outages

For this one, you might ask the server contact if there was a
problem: http://support.ntp.org/bin/view/Servers/TockUsnoNavyMil

That assumes you've done your homework first and made sure it
wasn't something in your network.

scott

Re: NTP Issues Today

2012-11-19 Thread Scott Weeks



--- wbai...@satelliteintelligencegroup.com wrote:
From: Warren Bailey wbai...@satelliteintelligencegroup.com

Or you could just concede the fact that the navy is playing with time travel 
again.
--


To finish this thread off for the archives...

Apparently something was up with the navy stuff, as a post on
the outages list shows.

Lesson learned: Use more than one NTP source.

scott

NTP Issues Today

2012-11-19 Thread Oscar Orosco

We had the same issue on our NTP server pointing to tick.usno.navy.mil. It set 
the date back to the year 2000.

Date: Mon, 19 Nov 2012 16:21:55 -0700
From: Van Wolfe vanwo...@gmail.com
To: nanog@nanog.org
Subject: NTP Issues Today
Message-ID: cameggd4cdqwhxqe_jbvpnr-pkke9lxqa+kzj97anhfonjwz...@mail.gmail.com
Content-Type: text/plain; charset=ISO-8859-1

Hello,

Did anyone else experience issues with NTP today?  We had our server
times update to the year 2000 at around 3:30 MT, then revert back to 2012.

Thanks,
Van

Re: NTP Issues Today

2012-11-19 Thread Warren Bailey

Or you could just concede the fact that the navy is playing with time travel 
again.




From my Galaxy Note II, please excuse any mistakes.


 Original message 
From: Scott Weeks sur...@mauigateway.com
Date: 11/19/2012 3:52 PM (GMT-08:00)
To: nanog@nanog.org
Subject: Re: NTP Issues Today







On 11/19/12 6:32 PM, Scott Weeks sur...@mauigateway.com wrote:
--- vanwo...@gmail.com wrote:
From: Van Wolfe vanwo...@gmail.com

Did anyone else experience issues with NTP today?  We had our server
times update to the year 2000 at around 3:30 MT, then revert back to 2012.
-

You need to provide more information.  For example, what NTP
source are you using?
--
--- chay...@centracomm.net wrote:
From: Clay Haynes chay...@centracomm.net

I can confirm this had happened on one of my test servers - it was
pointing to tick.usno.navy.mil and tock.usno.navy.mil at the time.
---

That's not a very diverse set of NTP servers.  In the future if
you think it might be an outage, you might try on the 'outages'
list: http://puck.nether.net/mailman/listinfo/outages

For this one, you might ask the server contact if there was a
problem: http://support.ntp.org/bin/view/Servers/TockUsnoNavyMil

That assumes you've done your homework first and made sure it
wasn't something in your network.

scott

Re: NTP Issues Today

2012-11-19 Thread Mark Andrews


In message cameggd4cdqwhxqe_jbvpnr-pkke9lxqa+kzj97anhfonjwz...@mail.gmail.com
, Van Wolfe writes:
 Hello,
 
 Did anyone else experience issues with NTP today?  We had our server
 times update to the year 2000 at around 3:30 MT, then revert back to 2012.
 
 Thanks,
 Van

NTP should be immune from this sort of behaviour unless you did a
ntpdate at the wrong moment.  The clocks should have been marked
as insane.

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org

RE: NTP Issues Today

2012-11-19 Thread Wallace Keith

Just got paged with a pbx alarm that had 1970 as the year. By the time I logged 
in , it was showing 2012.  Using GPS for time and date. 

-Original Message-
From: Mark Andrews [mailto:ma...@isc.org] 
Sent: Monday, November 19, 2012 8:42 PM
To: Van Wolfe
Cc: nanog@nanog.org
Subject: Re: NTP Issues Today

In message cameggd4cdqwhxqe_jbvpnr-pkke9lxqa+kzj97anhfonjwz...@mail.gmail.com
, Van Wolfe writes:
 Hello,

 Did anyone else experience issues with NTP today?  We had our server 
 times update to the year 2000 at around 3:30 MT, then revert back to 2012.

 Thanks,
 Van

NTP should be immune from this sort of behaviour unless you did a ntpdate at 
the wrong moment.  The clocks should have been marked as insane.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org

Re: NTP Issues Today

2012-11-19 Thread George Herbert

crossreplying to outages list.

Is anyone ELSE seeing GPS issues?  This could well have been an
unrelated issue on that particular PBX.

If this was real, then the mother of all infrastructure attacks might
be underway...

One glitch on tick and tock and one malfunctioning PBX is not
sufficient evidence of pattern - much less hostile activity - to
induce panic, but it would perhaps be a wise time to check
time-related logs?


-george

On Mon, Nov 19, 2012 at 6:08 PM, Wallace Keith
kwall...@pcconnection.com wrote:
 Just got paged with a pbx alarm that had 1970 as the year. By the time I 
 logged in , it was showing 2012.  Using GPS for time and date.

 -Original Message-
 From: Mark Andrews [mailto:ma...@isc.org]
 Sent: Monday, November 19, 2012 8:42 PM
 To: Van Wolfe
 Cc: nanog@nanog.org
 Subject: Re: NTP Issues Today


 In message 
 cameggd4cdqwhxqe_jbvpnr-pkke9lxqa+kzj97anhfonjwz...@mail.gmail.com
 , Van Wolfe writes:
 Hello,

 Did anyone else experience issues with NTP today?  We had our server
 times update to the year 2000 at around 3:30 MT, then revert back to 2012.

 Thanks,
 Van

 NTP should be immune from this sort of behaviour unless you did a ntpdate at 
 the wrong moment.  The clocks should have been marked as insane.

 Mark
 --
 Mark Andrews, ISC
 1 Seymour St., Dundas Valley, NSW 2117, Australia
 PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org





-- 
-george william herbert
george.herb...@gmail.com

Re: [outages] NTP Issues Today

2012-11-19 Thread Mike Lyon

Anyone check out the NIST GPS Archive?

http://www.nist.gov/pml/div688/grp40/gpsarchive.cfm

-Mike


On Mon, Nov 19, 2012 at 7:58 PM, Sid Rao s...@ctigroup.com wrote:

 We had multiple servers synchronized with Windows/MS time change their
 clock to the year 2000 today.  It broke many things, including AD
 authentication.

 These servers had been properly synchronized for years.

 They were synchronized with Microsoft and NIST NTP servers.

 This may not be isolated.

 Sid Rao | CTI Group | +1 (317) 262-4677

 On Nov 19, 2012, at 10:29 PM, George Herbert george.herb...@gmail.com
 wrote:

  crossreplying to outages list.
 
  Is anyone ELSE seeing GPS issues?  This could well have been an
  unrelated issue on that particular PBX.
 
  If this was real, then the mother of all infrastructure attacks might
  be underway...
 
  One glitch on tick and tock and one malfunctioning PBX is not
  sufficient evidence of pattern - much less hostile activity - to
  induce panic, but it would perhaps be a wise time to check
  time-related logs?
 
 
  -george
 
  On Mon, Nov 19, 2012 at 6:08 PM, Wallace Keith
  kwall...@pcconnection.com wrote:
  Just got paged with a pbx alarm that had 1970 as the year. By the time
 I logged in , it was showing 2012.  Using GPS for time and date.
 
  -Original Message-
  From: Mark Andrews [mailto:ma...@isc.org]
  Sent: Monday, November 19, 2012 8:42 PM
  To: Van Wolfe
  Cc: nanog@nanog.org
  Subject: Re: NTP Issues Today
 
 
  In message 
 cameggd4cdqwhxqe_jbvpnr-pkke9lxqa+kzj97anhfonjwz...@mail.gmail.com
  , Van Wolfe writes:
  Hello,
 
  Did anyone else experience issues with NTP today?  We had our server
  times update to the year 2000 at around 3:30 MT, then revert back to
 2012.
 
  Thanks,
  Van
 
  NTP should be immune from this sort of behaviour unless you did a
 ntpdate at the wrong moment.  The clocks should have been marked as insane.
 
  Mark
  --
  Mark Andrews, ISC
  1 Seymour St., Dundas Valley, NSW 2117, Australia
  PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org
 
 
 
 
 
  --
  -george william herbert
  george.herb...@gmail.com
 
 






-- 
Mike Lyon
408-621-4826
mike.l...@gmail.com

http://www.linkedin.com/in/mlyon