Re: Erroneous Leap Second Introduced at 2014-06-30 23:59:59 UTC

2014-07-01 Thread Daniël W. Crompton
That's strange, as I remember reading this yesterday: NO leap second will be
introduced at the end of June 2014.

http://hpiers.obspm.fr/iers/bul/bulc/bulletinc.dat

D.


-- 
Daniël W. Crompton daniel.cromp...@gmail.com

http://specialbrands.net/


On 1 July 2014 04:27, Majdi S. Abbas m...@latt.net wrote:

 On Mon, Jun 30, 2014 at 05:33:52PM -0700, Tim Heckman wrote:
  I was just alerted that one of the systems I manage had a time skew
  greater than 100ms from NTP sources. Upon further investigation it
  seemed that the time was off by almost exactly 1 second.
 
  Looking back over our NTP monitoring, it would appear that this system
  had a large time adjustment at approximately 00:00 UTC:

 Okay.  Do you have any logging configured (peerstats, etc?) for
 ntpd?

  A few of our systems did alert early this morning, indicating they
  were going to be receiving a leap second today. However, I was unable
  to determine the exact cause for NTP believing a leap second should be
  added. And after some time a few of the systems were no longer
  indicating that a leap second would be introduced.

 This can happen if a server is either passing along a leap
 notification that it received, or is configured to use a leapseconds
 file that is incorrect.

  This specific system is hosted in AWS US-WEST-2C and uses the
  0.amazon.pool.ntp.org pool.

 0 is just one server in the pool (whichever you draw by
 rotation); is this the only server you have configured?

 --msa



Re: Erroneous Leap Second Introduced at 2014-06-30 23:59:59 UTC

2014-07-01 Thread Tim Heckman
On Mon, Jun 30, 2014 at 7:27 PM, Majdi S. Abbas m...@latt.net wrote:
 On Mon, Jun 30, 2014 at 05:33:52PM -0700, Tim Heckman wrote:
 I was just alerted that one of the systems I manage had a time skew
 greater than 100ms from NTP sources. Upon further investigation it
 seemed that the time was off by almost exactly 1 second.

 Looking back over our NTP monitoring, it would appear that this system
 had a large time adjustment at approximately 00:00 UTC:

 Okay.  Do you have any logging configured (peerstats, etc?) for
 ntpd?

Our systems all have loopstats and peerstats logging enabled. I have
those log files available if you're interested. However, when I searched
the files I wasn't able to find anything that indicated which peer told
the system to introduce a leap second. That said, I might just not know
what to look for in the logs.

 A few of our systems did alert early this morning, indicating they
 were going to be receiving a leap second today. However, I was unable
 to determine the exact cause for NTP believing a leap second should be
 added. And after some time a few of the systems were no longer
 indicating that a leap second would be introduced.

 This can happen if a server is either passing along a leap
 notification that it received, or is configured to use a leapseconds
 file that is incorrect.

Correct, I was hoping to determine which peer it was so I can reach
out to them to make sure this doesn't bleed into the pool at the end
of the year. I was also curious how widespread an issue this was, but
I'm starting to think I may have been the only person to catch it in
the act. :)

 This specific system is hosted in AWS US-WEST-2C and uses the
 0.amazon.pool.ntp.org pool.

 0 is just one server in the pool (whichever you draw by
 rotation); is this the only server you have configured?

We use 0.amazon.pool.ntp.org, 1.amazon.pool.ntp.org, and
2.amazon.pool.ntp.org. As with the other widely-used pool hostnames,
each of these is a round-robin DNS entry with 4 hosts and a TTL of
150s.

 --msa

Thank you for getting back to me.

Cheers!
-Tim


Re: Erroneous Leap Second Introduced at 2014-06-30 23:59:59 UTC

2014-07-01 Thread Majdi S. Abbas
On Tue, Jul 01, 2014 at 12:20:12PM -0700, Tim Heckman wrote:
 Our systems all have loopstats and peerstats logging enabled. I have
 those log files available if you're interested. However, when I searched
 the files I wasn't able to find anything that indicated which peer told
 the system to introduce a leap second. That said, I might just not know
 what to look for in the logs.

Look at the status word in peerstats; if the high bit is 
set, that's your huckleberry.

See: http://www.eecis.udel.edu/~mills/ntp/html/decode.html
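
A minimal sketch of that check in Python. It assumes the documented
peerstats layout (MJD, seconds past midnight, peer address, hex status
word, then offset, delay, dispersion, jitter) and simply tests the most
significant bit of the 16-bit status word, as suggested above. The
function names and the sample records are made up for illustration, and
which bit matters for a given ntpd version should be confirmed against
the decode page.

```python
def parse_peerstats_line(line):
    """Split one peerstats record into (mjd, seconds, peer, status_word)."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("not a peerstats record: %r" % line)
    # Field 4 is the peer status word, written in hexadecimal.
    return int(fields[0]), float(fields[1]), fields[2], int(fields[3], 16)


def peers_with_high_bit(lines):
    """Return the set of peer addresses whose status word has 0x8000 set."""
    flagged = set()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _mjd, _secs, peer, status = parse_peerstats_line(line)
        if status & 0x8000:
            flagged.add(peer)
    return flagged


if __name__ == "__main__":
    # Hypothetical records; addresses are from the documentation range.
    sample = [
        "56838 86390.123 192.0.2.10 9614 0.000123 0.010 0.002 0.001",
        "56838 86391.456 192.0.2.20 1614 -0.000456 0.020 0.003 0.002",
    ]
    print(sorted(peers_with_high_bit(sample)))
```

Feeding it the lines of a real peerstats file (for example via
`open(path)`) gives the list of peers worth a closer look around the
time the leap was armed.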

 Correct, I was hoping to determine which peer it was so I can reach
 out to them to make sure this doesn't bleed into the pool at the end
 of the year. I was also curious how widespread an issue this was, but
 I'm starting to think I may have been the only person to catch it in
 the act. :)

You might want to upgrade to current 4.2.7 development code,
wherein a majority rule is used to qualify the leap indicator.
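
The idea behind a majority rule can be sketched as follows. This is only
an illustration in Python, not the ntpd implementation: the function
name, the input (one leap-indicator value per peer), and the strict
more-than-half threshold are all assumptions. The leap-indicator values
themselves (0 = no warning, 1 = insert, 2 = delete, 3 = unsynchronized)
follow the NTP packet header definition.

```python
LEAP_NONE, LEAP_ADD, LEAP_DEL, LEAP_ALARM = 0, 1, 2, 3


def qualified_leap(peer_leap_bits):
    """Accept a leap warning only if more than half of the usable peers agree.

    peer_leap_bits: iterable of leap-indicator values, one per peer.
    Peers reporting LEAP_ALARM (unsynchronized) are ignored.
    """
    usable = [li for li in peer_leap_bits if li != LEAP_ALARM]
    if not usable:
        return LEAP_NONE
    for candidate in (LEAP_ADD, LEAP_DEL):
        # Strict majority: count * 2 must exceed the number of usable peers.
        if usable.count(candidate) * 2 > len(usable):
            return candidate
    return LEAP_NONE


if __name__ == "__main__":
    # One peer out of four announcing a leap second is not enough.
    print(qualified_leap([LEAP_ADD, LEAP_NONE, LEAP_NONE, LEAP_NONE]))
    # Two out of three is.
    print(qualified_leap([LEAP_ADD, LEAP_ADD, LEAP_NONE]))
```

With a rule like this, a single pool server passing along a bogus leap
flag would be outvoted by the other configured servers.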

Cheers,

--msa


Re: Erroneous Leap Second Introduced at 2014-06-30 23:59:59 UTC

2014-07-01 Thread Tim Heckman
On Tue, Jul 1, 2014 at 12:35 PM, Majdi S. Abbas m...@latt.net wrote:
 On Tue, Jul 01, 2014 at 12:20:12PM -0700, Tim Heckman wrote:
 Our systems all have loopstats and peerstats logging enabled. I have
 those log files available if you're interested. However, when I searched
 the files I wasn't able to find anything that indicated which peer told
 the system to introduce a leap second. That said, I might just not know
 what to look for in the logs.

 Look at the status word in peerstats; if the high bit is
 set, that's your huckleberry.

 See: http://www.eecis.udel.edu/~mills/ntp/html/decode.html

I've taken a look at all of the peerstats available for this host, and
surprisingly none of them are showing code 09 (leap_armed). I'm also
fairly certain that I know when some of my systems armed the leap
second (within a 60-120s window) based on our monitoring. Around those
times everything seems normal according to peerstats. Looking at

I am running Ubuntu 10.04 on this box, which ships ntp v4.2.4p8. I'll
need to look into whether the printing of this flag was added later;
otherwise, it would seem some of my systems picked up a phantom leap
second from an unknown source with one of them actually executing it.

Thanks for the decoder ring. My Google-fu wasn't hitting the right keywords.

 Correct, I was hoping to determine which peer it was so I can reach
 out to them to make sure this doesn't bleed into the pool at the end
 of the year. I was also curious how widespread an issue this was, but
 I'm starting to think I may have been the only person to catch it in
 the act. :)

 You might want to upgrade to current 4.2.7 development code,
 wherein a majority rule is used to qualify the leap indicator.

We're going to be doing some system refreshes soon, so that may
be something we'll need to look at. I didn't realize this was
happening as part of the 4.2.7 development branch. Definitely an
interesting feature, especially after this. :p

 Cheers,

 --msa

Thanks again, Majdi.

Cheers!
-Tim


Erroneous Leap Second Introduced at 2014-06-30 23:59:59 UTC

2014-06-30 Thread Tim Heckman
Hey Everyone,

I was just alerted that one of the systems I manage had a time skew
greater than 100ms from NTP sources. Upon further investigation it
seemed that the time was off by almost exactly 1 second.

Looking back over our NTP monitoring, it would appear that this system
had a large time adjustment at approximately 00:00 UTC:

- http://puu.sh/9Rs6O/a514ad7c97.png (times are in Pacific in these
graphs, sorry about that)

A few of our systems did alert early this morning, indicating they
were going to be receiving a leap second today. However, I was unable
to determine the exact cause for NTP believing a leap second should be
added. And after some time a few of the systems were no longer
indicating that a leap second would be introduced.

This specific system is hosted in AWS US-WEST-2C and uses the
0.amazon.pool.ntp.org pool.

Has anyone else seen any erroneous leap seconds being added to their system?

Cheers!
-Tim Heckman


Re: Erroneous Leap Second Introduced at 2014-06-30 23:59:59 UTC

2014-06-30 Thread Majdi S. Abbas
On Mon, Jun 30, 2014 at 05:33:52PM -0700, Tim Heckman wrote:
 I was just alerted that one of the systems I manage had a time skew
 greater than 100ms from NTP sources. Upon further investigation it
 seemed that the time was off by almost exactly 1 second.
 
 Looking back over our NTP monitoring, it would appear that this system
 had a large time adjustment at approximately 00:00 UTC:

Okay.  Do you have any logging configured (peerstats, etc?) for
ntpd?

 A few of our systems did alert early this morning, indicating they
 were going to be receiving a leap second today. However, I was unable
 to determine the exact cause for NTP believing a leap second should be
 added. And after some time a few of the systems were no longer
 indicating that a leap second would be introduced.

This can happen if a server is either passing along a leap
notification that it received, or is configured to use a leapseconds
file that is incorrect.
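
To rule out the second case, a leapseconds file can be checked for an
entry at the date in question. The sketch below, in Python, assumes the
usual leap-seconds.list layout (comment lines starting with "#", data
lines beginning with an NTP-era timestamp followed by the TAI offset);
the function names are hypothetical and the file contents are passed in
as text rather than read from any particular path.

```python
from datetime import date, datetime, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


def leap_dates(text):
    """Return the UTC dates on which a new TAI offset takes effect."""
    dates = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments
        if not data:
            continue
        ntp_seconds = int(data.split()[0])
        stamp = datetime.fromtimestamp(ntp_seconds - NTP_UNIX_OFFSET,
                                       tz=timezone.utc)
        dates.append(stamp.date())
    return dates


def has_leap_at(text, year, month, day):
    """True if the file lists an offset change taking effect on that date."""
    return date(year, month, day) in leap_dates(text)


if __name__ == "__main__":
    sample = "# excerpt\n3550089600\t35\t# 1 Jul 2012\n"
    print(has_leap_at(sample, 2012, 7, 1))
    print(has_leap_at(sample, 2014, 7, 1))
```

An entry dated 1 Jul 2014 in a server's file would explain a leap second
inserted at 2014-06-30 23:59:59 UTC; a correct file has none.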

 This specific system is hosted in AWS US-WEST-2C and uses the
 0.amazon.pool.ntp.org pool.

0 is just one server in the pool (whichever you draw by 
rotation); is this the only server you have configured?

--msa