[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
Hi Cam,

On our hosts there are 4 physical interfaces and then a bunch of bonds and bridges, taking the total up to 12 entries in /etc/network/interfaces. So contention certainly seems plausible?

My guests have actually gone back to working normally, so I have likely mis-attributed an unrelated problem that occurred at the same time.

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to ntp in Ubuntu.
https://bugs.launchpad.net/bugs/1125726

Title:
  boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"

Status in ntp package in Ubuntu:
  Fix Released
Status in ntp source package in Precise:
  Fix Released
Status in ntp source package in Trusty:
  Fix Released

Bug description:
  [Impact]

   * Hardware clocks are not stepped at boot, which can prevent NTP from ever syncing the clock. Incorrect clocks can cause serious issues in distributed systems.

   * Upstream originally added a lock file to eliminate a race between the ntp service (which keeps the clock synchronized during normal operation) and ntpdate (which is used to step the clock by large intervals at boot time). That change had a flaw which introduced a deadlock. An Ubuntu patch was applied which broke the locking mechanism entirely, reintroducing the race condition.

   * This change undoes the Ubuntu patch and fixes the deadlock by unlocking before attempting to start the ntp service.

  [Test Case]

   * There are two bugs: the race and the deadlock. To reproduce the race more consistently:

     - Add 'sleep 30' to '/etc/network/if-up.d/ntpdate' on the line preceding '/usr/sbin/ntpdate-debian -s $OPTS 2>/dev/null || :', and comment out 'invoke-rc.d --quiet $service stop >/dev/null 2>&1 || true'. This will reproduce the case where the ntp service starts between the stop command and the ntpdate command. The result will be that the ntpdate command fails. There will be a message in syslog like: 'ntpdate[17660]: the NTP socket is in use, exiting'

     - Reintroducing the lock brings back the deadlock issue. Both the ntpdate if-up.d script and the ntp init script check the lock file, but the ntpdate script attempted to start the ntp init script before unlocking the lock. Moving the unlock before the init script invocation fixes the deadlock. The original deadlock behavior is described here: https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/246203

  [Regression Potential]

   * Low. Out-of-sync clocks could be changed by a large amount at boot time, but only for machines with static IPs. The clock is only likely to be in this state if it was very skewed at boot time, which is also unlikely since NTP usually keeps the software clock in sync during operation and the hardware clock is updated at shutdown.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/1125726/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
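The ordering fix from the bug description above (unlock first, then start the ntp service) can be sketched roughly as follows. This is only an illustration, not the packaged if-up.d script: mkdir stands in for lockfile-create, and LOCKDIR, acquire_lock, release_lock, step_clock and start_ntp_service are hypothetical names invented for the sketch.

```shell
#!/bin/sh
# Sketch only. mkdir is used as a stand-in for lockfile-create, and the
# functions below are hypothetical stand-ins for ntpdate-debian and
# "/etc/init.d/ntp start".
LOCKDIR="${TMPDIR:-/tmp}/ntpdate-sketch.$$.lock"
LOG=""

acquire_lock() { mkdir "$LOCKDIR" 2>/dev/null; }   # atomic: fails while the lock is held
release_lock() { rmdir "$LOCKDIR" 2>/dev/null; }

step_clock() { LOG="${LOG:+$LOG }step"; }          # stand-in for ntpdate-debian

start_ntp_service() {
    # The init script checks the same lock, so it can only proceed
    # once the caller has already released it.
    if acquire_lock; then
        LOG="${LOG:+$LOG }start"
        release_lock
    else
        LOG="${LOG:+$LOG }blocked"
    fi
}

if acquire_lock; then
    step_clock
    release_lock          # unlock first ...
    start_ntp_service     # ... then start the service (the fixed ordering)
fi
echo "$LOG"
```

Swapping the last two calls inside the `if` (starting the service while the lock is still held) makes the stand-in init script report "blocked", which corresponds to the deadlock the description mentions.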
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
Nathan: How many interfaces or IP's are you bringing up? That error message makes it sound like there could be a lot of contention on the lock. Could you also get the output of `pstree | grep -B3 lockfile` while a VM is coming up? (You'll need to attach to a free virtual terminal using the kvm console).

Upon reading more of the lockfile-create manpage, it appears that there's a non-configurable 5-minute timeout on stale locks. Setting the --use-pid option might free up the lock more quickly if the parent process has died for some reason.

It's not clear to me how this could prevent networking from coming up, since the network has to be up for NTP to run, and the if-up.d script backgrounds the ntpdate locking+syncing script. sshd in 12.04 and 14.04 is started from an upstart script which does not depend on the NTP service. The NTP service itself is fairly early in the sysvinit order at S23, so there might be other init scripts blocked behind it.
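The idea behind the --use-pid suggestion above (treat a lock as stale once the process that took it is gone, rather than waiting for a timeout) can be illustrated with a hand-rolled check. This is a sketch under assumptions, not how lockfile-create is implemented: LOCKFILE and lock_is_stale are hypothetical names, and the lock is just a file containing a PID.

```shell
#!/bin/sh
# Sketch only: a hand-rolled PID check, not lockfile-create's implementation.
LOCKFILE="${TMPDIR:-/tmp}/ntp-stale-sketch.$$.lock"

lock_is_stale() {
    # Stale if the file exists but the recorded PID no longer names a live process.
    [ -f "$1" ] || return 1
    owner=$(cat "$1")
    ! kill -0 "$owner" 2>/dev/null
}

# Simulate a holder that died without unlocking: record the PID of a
# child shell that has already exited by the time we look at the file.
sh -c 'echo $$' > "$LOCKFILE"

if lock_is_stale "$LOCKFILE"; then
    STATE="stale"
    rm -f "$LOCKFILE"     # safe to clear: the owner is gone
else
    STATE="held"
fi
echo "$STATE"
```

A PID-based check like this can release a dead owner's lock immediately, which is why it might help if the parent of the locking script has died during boot.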
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
This fix is causing problems on Ubuntu 12.04 for me; for both KVM hosts and KVM guests. I see a message like

  lockfile creation failed: exceeded maximum number of lock attempts

On my hosts, it delays boot finishing for several minutes; while some of my guests just never become network accessible.

For anyone else bitten by same issue, I am currently using this workaround:

  chmod -x /usr/sbin/ntpdate-debian
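A failure of the kind reported above ("exceeded maximum number of lock attempts") is what a bounded-retry lock produces when something else holds the lock for longer than the retries last. The following is a minimal sketch of that behaviour, assuming mkdir as a stand-in for lockfile-create; LOCKDIR, try_lock and MAX_ATTEMPTS are hypothetical, and the value 3 is not the tool's real default.

```shell
#!/bin/sh
# Sketch only: mkdir stands in for lockfile-create; MAX_ATTEMPTS is a
# hypothetical value chosen for the illustration.
LOCKDIR="${TMPDIR:-/tmp}/ntp-retry-sketch.$$.lock"
MAX_ATTEMPTS=3

try_lock() {
    attempt=1
    while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
        mkdir "$LOCKDIR" 2>/dev/null && return 0
        attempt=$((attempt + 1))
        # a real caller would sleep between attempts
    done
    return 1
}

mkdir "$LOCKDIR"          # another interface's hook already holds the lock
if try_lock; then
    RESULT="acquired"
else
    RESULT="gave up after $MAX_ATTEMPTS attempts"
fi
rmdir "$LOCKDIR"
echo "$RESULT"
```

With many interfaces coming up at once, each hook instance competes for the same lock, so a slow holder can push the others past their retry limit.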
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
This bug was fixed in the package ntp - 1:4.2.6.p5+dfsg-3ubuntu2.14.04.7

---
ntp (1:4.2.6.p5+dfsg-3ubuntu2.14.04.7) trusty; urgency=medium

  * Use a single lockfile again - instead unlock the file before starting the init script. The lock should be shared - both services can't run at the same time. (LP: #1125726)

 -- Cam Cope Tue, 19 Jan 2016 10:22:39 +

** Changed in: ntp (Ubuntu Trusty)
       Status: Fix Committed => Fix Released
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
This bug was fixed in the package ntp - 1:4.2.6.p3+dfsg-1ubuntu3.8

---
ntp (1:4.2.6.p3+dfsg-1ubuntu3.8) precise; urgency=medium

  * Use a single lockfile again - instead unlock the file before starting the init script. The lock should be shared - both services can't run at the same time. (LP: #1125726)

 -- Cam Cope Tue, 19 Jan 2016 10:20:07 +

** Changed in: ntp (Ubuntu Precise)
       Status: Fix Committed => Fix Released
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
I can confirm I've been running this in production and have not seen any further issues.

** Tags removed: verification-needed
** Tags added: verification-done
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
Hello Thomas, or anyone else affected,

Accepted ntp into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ntp/1:4.2.6.p5+dfsg-3ubuntu2.14.04.7 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Changed in: ntp (Ubuntu Trusty)
       Status: In Progress => Fix Committed

** Tags added: verification-needed

** Changed in: ntp (Ubuntu Precise)
       Status: In Progress => Fix Committed
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
Uploaded both now, thanks again!

** Changed in: ntp (Ubuntu Precise)
       Status: Triaged => In Progress

** Changed in: ntp (Ubuntu Trusty)
       Status: Triaged => In Progress

** Changed in: ntp (Ubuntu Precise)
     Assignee: (unassigned) => Cam Cope (ccope)

** Changed in: ntp (Ubuntu Trusty)
     Assignee: (unassigned) => Cam Cope (ccope)
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
** Changed in: ntp (Ubuntu Precise)
       Status: New => Triaged

** Changed in: ntp (Ubuntu Trusty)
       Status: New => Triaged
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
** Changed in: ntp (Ubuntu Precise)
   Importance: Undecided => Medium

** Changed in: ntp (Ubuntu Trusty)
   Importance: Undecided => Medium

--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to ntp in Ubuntu.
https://bugs.launchpad.net/bugs/1125726

Title:
  boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"

Status in ntp package in Ubuntu:
  Fix Released
Status in ntp source package in Precise:
  New
Status in ntp source package in Trusty:
  New

Bug description:
  [Impact]

   * Hardware clocks are not stepped at boot, which can prevent NTP from ever
     syncing the clock. Incorrect clocks can cause serious issues in
     distributed systems.

   * Upstream originally added a lock file to eliminate a race between the ntp
     service (which keeps the clock synchronized during normal operation) and
     ntpdate (which is used to step the clock by large intervals at boot time).
     That change had a flaw which introduced a deadlock. An Ubuntu patch was
     applied which broke the locking mechanism entirely, reintroducing the race
     condition.

   * This change undoes the Ubuntu patch and fixes the deadlock by unlocking
     before attempting to start the ntp service.

  [Test Case]

   * There are two bugs: The race, and the deadlock. To reproduce the race more
     consistently:
     - add 'sleep 30' to '/etc/network/if-up.d/ntpdate' on the line preceding
       '/usr/sbin/ntpdate-debian -s $OPTS 2>/dev/null || :', and comment out
       'invoke-rc.d --quiet $service stop >/dev/null 2>&1 || true'. This will
       reproduce the case where the ntp service starts between the stop command
       and the ntpdate command. The result will be that the ntpdate command
       fails. There will be a message in syslog like:
       'ntpdate[17660]: the NTP socket is in use, exiting'
     - Reintroducing the lock brings back the deadlock issue. Both the ntpdate
       if-up.d script and the ntp init script check the lock file, but the
       ntpdate script attempted to start the ntp init script before unlocking
       the lock. Moving the unlock before the init script invocation fixes
       the deadlock. The original deadlock behavior is described here:
       https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/246203

  [Regression Potential]

   * Low. Out-of-sync clocks could be changed a large amount at boot time, but
     only for machines with static IP's. The clock is only likely to be in this
     state if the clock was very skewed at boot time, which is also unlikely
     since NTP usually keeps the software clock in sync during operation and
     the hardware clock is updated at shutdown.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/1125726/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
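The deadlock fix in the bug description (unlock first, then start the ntp service) can be illustrated with a small, self-contained sketch. This is not the packaged script: the function names, the temporary lock path, and the use of `mkdir` as a stand-in lock are all assumptions made for illustration, and the "old ordering" is modelled as a failed lock attempt instead of a real hang so that the demo terminates.

```shell
#!/bin/sh
# Sketch only: stub functions stand in for ntpdate and the ntp init script.
# LOCKDIR is a hypothetical throwaway path, not the lock used by the package.
LOCKDIR="${TMPDIR:-/tmp}/ntpdate-demo.lock.$$"
LOG=""

acquire_lock() { mkdir "$LOCKDIR" 2>/dev/null; }   # atomic: fails if already held
release_lock() { rmdir "$LOCKDIR" 2>/dev/null; }

# Stand-in for "/etc/init.d/ntp start": needs the same lock before it may run.
start_ntp_service() {
    if acquire_lock; then
        LOG="$LOG service-started"
        release_lock
        return 0
    fi
    LOG="$LOG service-blocked"
    return 1
}

# Stand-in for the if-up.d hook with the fixed ordering.
ifup_hook_fixed() {
    acquire_lock || return 0
    LOG="$LOG ntpdate-ran"      # the real hook would step the clock here
    release_lock                # unlock first ...
    start_ntp_service           # ... then start the service
}

# Same hook with the old ordering: the service is started while the lock is held.
ifup_hook_old_order() {
    acquire_lock || return 0
    LOG="$LOG ntpdate-ran"
    start_ntp_service || true   # cannot take the lock this hook still holds
    release_lock
}

LOG=""; ifup_hook_fixed;     FIXED_LOG="$LOG"
LOG=""; ifup_hook_old_order; OLD_LOG="$LOG"
echo "fixed ordering:$FIXED_LOG"
echo "old ordering:  $OLD_LOG"
```

With a blocking lock, as in the real scripts, the "service-blocked" branch would wait on a lock that its own caller holds, which is the deadlock described above.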
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
** Also affects: ntp (Ubuntu Precise)
   Importance: Undecided
       Status: New

** Also affects: ntp (Ubuntu Trusty)
   Importance: Undecided
       Status: New

--
Status in ntp package in Ubuntu:
  Fix Released
Status in ntp source package in Precise:
  New
Status in ntp source package in Trusty:
  New
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
** Description changed:

- We're seeing a race between if-up.d/ntpdate and the ntp startup script.
+ [Impact]
+  * Hardware clocks are not stepped at boot, which can prevent NTP from ever
+    syncing the clock.
+    Incorrect clocks can cause serious issues in distributed systems.

- 1) if-up.d/ntpdate starts.
- 2) if-up.d/ntpdate acquires the lock "/var/lock/ntpdate-ifup".
- 3) if-up.d/ntpdate stops the ntp service [which isn't running anyway].
- 4) if-up.d/ntpdate starts running ntpdate, which binds UDP *.ntp
- 5) /etc/init.d/rc 2 executes "/etc/rc2.d/S20ntp start"
- 6) /etc/init.d/ntp acquires the lock "/var/lock/ntpdate".
- 7) /etc/init.d/ntp starts the ntp daemon.
- 8) The ntp daemon logs an error, complaining that it cannot bind UDP *.ntp.
- 9) if-up.d/ntpdate now starts the ntp service.
+  * Upstream originally added a lock file to eliminate a race between the ntp
+    service (which keeps the clock synchronized during normal operation) and
+    ntpdate (which is used to step the clock by large intervals at boot time).
+    That change had a flaw which introduced a deadlock. An Ubuntu patch was
+    applied which broke the locking mechanism entirely, reintroducing the race
+    condition.

- The result is a weird churn, though ntpd does end up running at the end.
+  * This change undoes the Ubuntu patch and fixes the deadlock by unlocking
+    before attempting to start the ntp service.

- Should these not be using the same lock file?
+ [Test Case]
+
+  * There are two bugs: The race, and the deadlock. To reproduce the race more
+    consistently:
+    - add 'sleep 30' to '/etc/network/if-up.d/ntpdate' on the line preceding
+      '/usr/sbin/ntpdate-debian -s $OPTS 2>/dev/null || :', and comment out
+      'invoke-rc.d --quiet $service stop >/dev/null 2>&1 || true'. This will
+      reproduce the case where the ntp service starts between the stop command
+      and the ntpdate command.
+      The result will be that the ntpdate command fails. There will be a
+      message in syslog like:
+      'ntpdate[17660]: the NTP socket is in use, exiting'
+    - Reintroducing the lock brings back the deadlock issue. Both the ntpdate
+      if-up.d script and the ntp init script check the lock file, but the
+      ntpdate script attempted to start the ntp init script before unlocking
+      the lock. Moving the unlock before the init script invocation fixes
+      the deadlock. The original deadlock behavior is described here:
+      https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/246203
+
+ [Regression Potential]
+
+  * Low. Out-of-sync clocks could be changed a large amount at boot time, but
+    only for machines with static IP's. The clock is only likely to be in this
+    state if the clock was very skewed at boot time, which is also unlikely
+    since NTP usually keeps the software clock in sync during operation and
+    the hardware clock is updated at shutdown.

--
Status in ntp package in Ubuntu:
  Fix Released
Status in ntp source package in Precise:
  New
Status in ntp source package in Trusty:
  New
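The race test case in the description ends with a syslog message from ntpdate, so one way to check whether the race was reproduced is to search the log for that message. The following is a minimal sketch under stated assumptions: the helper name `ntpdate_race_seen` is made up, the default path `/var/log/syslog` is an assumption about where syslog writes on the test machine, and the sample log line uses a placeholder hostname in front of the message quoted in the bug.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the log shows ntpdate losing the race.
# The default log path is an assumption; pass the real file as $1.
ntpdate_race_seen() {
    log="${1:-/var/log/syslog}"
    [ -r "$log" ] || return 2
    # Match any PID inside the brackets, then the message from the bug report.
    grep -q 'ntpdate\[[0-9]*]: the NTP socket is in use, exiting' "$log"
}

# Self-check against a throwaway log; "host" is a placeholder hostname.
demo_log="$(mktemp)"
printf '%s\n' 'host ntpdate[17660]: the NTP socket is in use, exiting' > "$demo_log"
if ntpdate_race_seen "$demo_log"; then
    RESULT="race reproduced"
else
    RESULT="no race message found"
fi
echo "$RESULT"
rm -f "$demo_log"
```

On a real system the check would be run after a reboot with the 'sleep 30' modification in place, for example `ntpdate_race_seen /var/log/syslog && echo "race reproduced"`.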
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
This bug was fixed in the package ntp - 1:4.2.6.p5+dfsg-3ubuntu9

---
ntp (1:4.2.6.p5+dfsg-3ubuntu9) xenial; urgency=medium

  [ Cam Cope ]
  * Use a single lockfile again - instead unlock the file before starting the
    init script. The lock should be shared - both services can't run at the
    same time. (LP: #1125726)

 -- Iain Lane Mon, 07 Dec 2015 13:38:16 +

** Changed in: ntp (Ubuntu)
       Status: Confirmed => Fix Released

--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to ntp in Ubuntu.
https://bugs.launchpad.net/bugs/1125726

Title:
  boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"

Status in ntp package in Ubuntu:
  Fix Released

Bug description:
  We're seeing a race between if-up.d/ntpdate and the ntp startup script.

  1) if-up.d/ntpdate starts.
  2) if-up.d/ntpdate acquires the lock "/var/lock/ntpdate-ifup".
  3) if-up.d/ntpdate stops the ntp service [which isn't running anyway].
  4) if-up.d/ntpdate starts running ntpdate, which binds UDP *.ntp
  5) /etc/init.d/rc 2 executes "/etc/rc2.d/S20ntp start"
  6) /etc/init.d/ntp acquires the lock "/var/lock/ntpdate".
  7) /etc/init.d/ntp starts the ntp daemon.
  8) The ntp daemon logs an error, complaining that it cannot bind UDP *.ntp.
  9) if-up.d/ntpdate now starts the ntp service.

  The result is a weird churn, though ntpd does end up running at the end.

  Should these not be using the same lock file?
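The original report asks whether the two scripts should share a lock file. A small sketch shows why separate lock files give no mutual exclusion. It rests on assumptions: `mkdir` is used as a stand-in for the real locking, the lock names are recreated under a temporary directory instead of /var/lock, and the helper names are made up.

```shell
#!/bin/sh
# Sketch: mkdir-based locks stand in for the real locking mechanism.
# Lock names mirror the bug report but live in a throwaway directory.
BASE="$(mktemp -d)"

try_lock() { mkdir "$1" 2>/dev/null; }   # succeeds only if nobody holds it
unlock()   { rmdir "$1"; }

# Returns 0 if the init script could enter while the hook holds its lock.
both_can_run() {
    hook_lock="$1"; init_lock="$2"
    try_lock "$hook_lock" || return 2     # if-up.d hook takes its lock
    if try_lock "$init_lock"; then        # init script takes "its" lock
        unlock "$init_lock"
        overlap=0                         # both inside at once: the race
    else
        overlap=1                         # init script is held off
    fi
    unlock "$hook_lock"
    return "$overlap"
}

if both_can_run "$BASE/ntpdate-ifup" "$BASE/ntpdate"; then
    SEPARATE="race possible"
else
    SEPARATE="excluded"
fi
if both_can_run "$BASE/ntpdate" "$BASE/ntpdate"; then
    SHARED="race possible"
else
    SHARED="excluded"
fi
echo "separate lock files: $SEPARATE"
echo "shared lock file:    $SHARED"
rmdir "$BASE"
```

With two different paths, each script succeeds in taking its own lock, so steps 4 and 7 of the sequence can overlap; with one shared path the second locker is held off.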
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
Thanks Cam, I'm going to upload this to Xenial. If you want this to be uploaded to a stable release, please provide the required information (QA information, regression potential, etc) from https://wiki.ubuntu.com/StableReleaseUpdates#Procedure

--
Status in ntp package in Ubuntu:
  Confirmed
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
** Changed in: ntp (Ubuntu)
   Importance: Low => Medium

--
Status in ntp package in Ubuntu:
  Confirmed
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
In case it wasn't clear, my patch is supposed to be for the debian/ntpdate.if-up file.

Also, I think the priority of this bug should be higher; it was assigned 'low' when there was no clear problem caused by the race. Systems booting with uncorrectable clock skew can be a serious problem.

--
Status in ntp package in Ubuntu:
  Confirmed
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
** Tags added: patch

--
Status in ntp package in Ubuntu:
  Confirmed
[Touch-packages] [Bug 1125726] Re: boot-time race between /etc/network/if-up.d/ntpdate and "/etc/init.d/ntp start"
See also https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/288905 where I said:

The way I read "man ntpd" (on Debian wheezy), we could (should?) replace ntpdate by "ntpd -q"; and if we are going to run ntpd then ntpdate is unnecessary anyway. If we have (or are going to have) ntpd, then we should simply skip /etc/network/if-up.d/ntpdate; seeing how that depends on NTPSERVERS in /etc/ntp.conf or somesuch, I do not see that /etc/network/if-up.d/ntpdate is ever any use.

--
Status in ntp package in Ubuntu:
  Confirmed