Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??
On 08/01/2012 06:31 PM, Leonid Isaev wrote: On Wed, 01 Aug 2012 17:50:45 -0500 David C. Rankin drankina...@suddenlinkmail.com wrote: The only config change I can see in the apcupsd.conf.pacnew is: LOCKFILE /var/lock - LOCKFILE /etc/apcupsd I can't see that causing a segfault, could it? I don't think so, no. But since something does, I would look at what has been updated lately (apcupsd for example). On a side note... I don't use this application (although I do have an APC UPS unit) but this entry looks very wrong -- is it a lockfile in /etc? Also, I quickly skimmed through the initscript in /etc/rc.d/apcupsd, at it also does few writes to /etc. Unless this is absolutely necessary, such behavior represents a bug and is in violation of FHS. Something is FUBAR with apcupsd. I am running the package from the community repo, so it might need looking at. Aug 1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0 192.168.7.124 UDP 123 Aug 1 13:20:15 providence ntpd[3454]: peers refreshed Aug 1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up resolver Aug 1 13:20:15 providence postfix/postfix-script[13735]: stopping the Postfix mail system Aug 1 13:20:15 providence postfix/master[3868]: terminating on signal 15 Aug 1 13:20:15 providence postfix/postfix-script[13738]: waiting for the Postfix mail system to terminate Aug 1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router Solicitation Aug 1 13:20:19 providence ntpd[3454]: ntpd exiting on signal 15 Your NTP behaves strangely... why would it refresh timeservers and then shutdown at the same timestamp? It looks like your ntp did something funny and incorrectly adjusted the HW clock. Now that you mention it, I don't know why it would refresh at shutdown. I think this is just a normal log event that occurred between the time the apcupsd shutdown was commanded at 13:20:15 and the ntpd refresh at 13:20:15. I just don't think ntpd had time to get the message yet... Well, unless you have hwclock daemon, ntpd is the only thing which affects HW clock. No, it's ntpd and it WAS the problem. This issue started a couple of weeks ago. This box usually just sits idle. Going though the various log files, you can see it happening and then switching back -- really strange. The history is laid out below, but it appears that the hdwclock got zapped by ntpd on a Jul 11 reboot. Then the system would correct itself and run until the next reboot and read the hdwclock again and go though the same cycle. I've never seen anything like it and I can't explain it. hdwclock tracks sysclock fine, but we have seen some weird things in the past, but never by a whole month. It did screw up the cronstamps: [14:59 providence:/var/spool] # l cronstamps/ total 28 drwxr-xr-x 2 root root 4096 May 4 2011 . drwxr-xr-x 9 root root 4096 Jul 27 19:28 .. -rw-r--r-- 1 root root 16 Aug 30 2012 root.prune-cronstamps -rw-r--r-- 1 root root 16 Jul 23 14:22 root.sys-daily -rw-r--r-- 1 root root 16 Sep 1 2012 root.sys-hourly -rw-r--r-- 1 root root 16 Aug 12 2012 root.sys-monthly -rw-r--r-- 1 root root 16 Aug 30 2012 root.sys-weekly For the curios, here is the log history: In the everything.log.4 I think I found the fist event: Jul 11 13:53:12 providence shutdown[1939]: shutting down for system reboot Jul 11 13:53:13 providence init: Switching to runlevel: 6 Jul 11 13:53:14 providence ntfs-3g[290]: Unmounting /dev/sdb2 () Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting cleanly. Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoting known real-time threads. Jul 11 18:53:14 providence rtkit-daemon[1536]: Successfully demoted thread 1534 of process 1534 (/usr/bin/pulseaudio). Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoted 1 threads. Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting watchdog thread. Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting canary thread. Jul 11 13:53:14 providence apcupsd[1082]: apcupsd exiting, signal 15 Jul 11 13:53:14 providence apcupsd[1082]: apcupsd shutdown succeeded Jul 11 13:53:15 providence postfix/postfix-script[2116]: stopping the Postfix mail system Jul 11 13:53:15 providence postfix/master[1247]: terminating on signal 15 Jul 11 13:53:16 providence ntpd[870]: ntpd exiting on signal 15 Jul 11 13:53:16 providence dhcpcd[2234]: sending signal 1 to pid 413 Jul 11 13:53:16 providence dhcpcd[413]: received SIGHUP, releasing Jul 11 13:53:16 providence dhcpcd[413]: eth0: releasing lease of 192.168.7.124 Jul 11 13:53:16 providence dhcpcd[2234]: waiting for pid 413 to exit Jul 11 13:53:16 providence dhcpcd[413]: eth0: removing interface Jul 11 13:53:16 providence kernel: [ 977.567012] pcieport :00:1c.0: wake-up capability enabled by ACPI Jul 11 13:53:16 providence syslog-ng[352]: syslog-ng shutting down; version='3.3.5' Aug 12 02:00:38 providence kernel: [0.00] Initializing cgroup subsys cpuset Aug 12 02:00:38 providence kernel: [0.00]
Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??
Hey David, -apologies for not quoting correct on this one- could you check the date and time settings in the BIOS ? My guess is that it's something weird in the RTC. To make sure it's best to check outside of the OS. I'll admit; it's just a huch. mvg, Guus
Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??
On Thu, Aug 2, 2012 at 3:06 PM, David C. Rankin drankina...@suddenlinkmail.com wrote: On 08/01/2012 06:31 PM, Leonid Isaev wrote: On Wed, 01 Aug 2012 17:50:45 -0500 David C. Rankin drankina...@suddenlinkmail.com wrote: The only config change I can see in the apcupsd.conf.pacnew is: LOCKFILE /var/lock - LOCKFILE /etc/apcupsd I can't see that causing a segfault, could it? I don't think so, no. But since something does, I would look at what has been updated lately (apcupsd for example). On a side note... I don't use this application (although I do have an APC UPS unit) but this entry looks very wrong -- is it a lockfile in /etc? Also, I quickly skimmed through the initscript in /etc/rc.d/apcupsd, at it also does few writes to /etc. Unless this is absolutely necessary, such behavior represents a bug and is in violation of FHS. Something is FUBAR with apcupsd. I am running the package from the community repo, so it might need looking at. Aug 1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0 192.168.7.124 UDP 123 Aug 1 13:20:15 providence ntpd[3454]: peers refreshed Aug 1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up resolver Aug 1 13:20:15 providence postfix/postfix-script[13735]: stopping the Postfix mail system Aug 1 13:20:15 providence postfix/master[3868]: terminating on signal 15 Aug 1 13:20:15 providence postfix/postfix-script[13738]: waiting for the Postfix mail system to terminate Aug 1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router Solicitation Aug 1 13:20:19 providence ntpd[3454]: ntpd exiting on signal 15 Your NTP behaves strangely... why would it refresh timeservers and then shutdown at the same timestamp? It looks like your ntp did something funny and incorrectly adjusted the HW clock. Now that you mention it, I don't know why it would refresh at shutdown. I think this is just a normal log event that occurred between the time the apcupsd shutdown was commanded at 13:20:15 and the ntpd refresh at 13:20:15. I just don't think ntpd had time to get the message yet... Well, unless you have hwclock daemon, ntpd is the only thing which affects HW clock. No, it's ntpd and it WAS the problem. This issue started a couple of weeks ago. This box usually just sits idle. Going though the various log files, you can see it happening and then switching back -- really strange. The history is laid out below, but it appears that the hdwclock got zapped by ntpd on a Jul 11 reboot. Then the system would correct itself and run until the next reboot and read the hdwclock again and go though the same cycle. I've never seen anything like it and I can't explain it. hdwclock tracks sysclock fine, but we have seen some weird things in the past, but never by a whole month. It did screw up the cronstamps: [14:59 providence:/var/spool] # l cronstamps/ total 28 drwxr-xr-x 2 root root 4096 May 4 2011 . drwxr-xr-x 9 root root 4096 Jul 27 19:28 .. -rw-r--r-- 1 root root 16 Aug 30 2012 root.prune-cronstamps -rw-r--r-- 1 root root 16 Jul 23 14:22 root.sys-daily -rw-r--r-- 1 root root 16 Sep 1 2012 root.sys-hourly -rw-r--r-- 1 root root 16 Aug 12 2012 root.sys-monthly -rw-r--r-- 1 root root 16 Aug 30 2012 root.sys-weekly For the curios, here is the log history: In the everything.log.4 I think I found the fist event: Jul 11 13:53:12 providence shutdown[1939]: shutting down for system reboot Jul 11 13:53:13 providence init: Switching to runlevel: 6 Jul 11 13:53:14 providence ntfs-3g[290]: Unmounting /dev/sdb2 () Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting cleanly. Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoting known real-time threads. Jul 11 18:53:14 providence rtkit-daemon[1536]: Successfully demoted thread 1534 of process 1534 (/usr/bin/pulseaudio). Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoted 1 threads. Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting watchdog thread. Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting canary thread. Jul 11 13:53:14 providence apcupsd[1082]: apcupsd exiting, signal 15 Jul 11 13:53:14 providence apcupsd[1082]: apcupsd shutdown succeeded Jul 11 13:53:15 providence postfix/postfix-script[2116]: stopping the Postfix mail system Jul 11 13:53:15 providence postfix/master[1247]: terminating on signal 15 Jul 11 13:53:16 providence ntpd[870]: ntpd exiting on signal 15 Jul 11 13:53:16 providence dhcpcd[2234]: sending signal 1 to pid 413 Jul 11 13:53:16 providence dhcpcd[413]: received SIGHUP, releasing Jul 11 13:53:16 providence dhcpcd[413]: eth0: releasing lease of 192.168.7.124 Jul 11 13:53:16 providence dhcpcd[2234]: waiting for pid 413 to exit Jul 11 13:53:16 providence dhcpcd[413]: eth0: removing interface Jul 11 13:53:16 providence kernel: [ 977.567012] pcieport :00:1c.0: wake-up capability enabled by ACPI Jul 11 13:53:16 providence syslog-ng[352]: syslog-ng shutting
[arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??
Guys, This is one for you to puzzle over. I've never seen it before. I had an apcupsd commanded shutdown at Aug 1 13:20:15 (not uncommon - power is terrible). However, when I restarted the hwclock had magically jumped ahead by 1 full month to Sept. 1. (the box was down for 56 minutes until I returned from lunch) Then on restart, I was dropped into the maintenance shell and required to run fsck manually due to the future date. There was a message that stated something line run fsck manually on /dev/sdc5 ... the system will reboot automatically after fsck completes (this looked like an abbreviated maintenance shell -- I guess due to the fact that the root fs was the one that was being checked /dev/sdc5. fsck completed (e2fsck), but the box didn't reboot. So I typed 'reboot' -- nothing happened. Checked /usr/bin and reboot was present. So then I finally did ctrl+alt+del and the box restarted. I was then dumped into the normal maintenance shell to fsck /home on /dev/sdc7. It completed and reboot worked normally. My questions are (1) does anybody have any idea how this can happen? (2) Anybody else every seen this? and (3) could the kernel apcupsd segfault (shown below) be the cause? Any ideas welcome. Strange stuff... The relevant everything.log snippet and bios info is below: Aug 1 13:20:15 providence apcupsd[3477]: apcupsd shutdown succeeded Aug 1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000] Aug 1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0 192.168.7.124 UDP 123 Aug 1 13:20:15 providence ntpd[3454]: peers refreshed Aug 1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up resolver Aug 1 13:20:15 providence postfix/postfix-script[13735]: stopping the Postfix mail system Aug 1 13:20:15 providence postfix/master[3868]: terminating on signal 15 Aug 1 13:20:15 providence postfix/postfix-script[13738]: waiting for the Postfix mail system to terminate Aug 1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router Solicitation Aug 1 13:20:19 providence ntpd[3454]: ntpd exiting on signal 15 Aug 1 13:20:19 providence dhcpcd[13884]: sending signal 1 to pid 424 Aug 1 13:20:19 providence dhcpcd[424]: received SIGHUP, releasing Aug 1 13:20:19 providence dhcpcd[424]: eth0: releasing lease of 192.168.7.124 Aug 1 13:20:19 providence dhcpcd[13884]: waiting for pid 424 to exit Aug 1 13:20:19 providence dhcpcd[424]: eth0: removing interface Aug 1 13:20:19 providence kernel: [171096.79] pcieport :00:1c.0: wake-up capability enabled by ACPI Aug 1 13:20:19 providence syslog-ng[352]: syslog-ng shutting down; version='3.3.5' Sep 1 14:16:20 providence kernel: [0.00] Initializing cgroup subsys cpuset Sep 1 14:16:20 providence kernel: [0.00] Initializing cgroup subsys cpu Sep 1 14:16:20 providence kernel: [0.00] Linux version 3.4.7-1-ARCH (tobias@T-POWA-LX) (gcc version 4.7.1 20120721 (prerelease) (GCC) ) #1 SMP PREEMPT Sun Jul 29 20:05:01 UTC 2012 BIOS Information Vendor: Dell Inc. Version: A04 Release Date: 02/09/2005 -- David C. Rankin, J.D.,P.E.
Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 08/01/2012 03:42 PM, Leonid Isaev wrote: snip Aug 1 13:20:15 providence apcupsd[3477]: apcupsd shutdown succeeded Aug 1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000] Does apcupsd routinely segfault? No, the segfault just started on July 30: [17:40 providence:/var/log] # grep apcupsd everything.log* | grep segfault everything.log:Jul 30 13:48:24 providence kernel: [603728.758431] apcupsd[3190]: segfault at 0 ip 0805e0fc sp bfe397c0 error 4 in apcupsd (deleted)[8048000+39000] everything.log:Aug 1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000] The only config change I can see in the apcupsd.conf.pacnew is: LOCKFILE /var/lock - LOCKFILE /etc/apcupsd I can't see that causing a segfault, could it? Aug 1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0 192.168.7.124 UDP 123 Aug 1 13:20:15 providence ntpd[3454]: peers refreshed Aug 1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up resolver Aug 1 13:20:15 providence postfix/postfix-script[13735]: stopping the Postfix mail system Aug 1 13:20:15 providence postfix/master[3868]: terminating on signal 15 Aug 1 13:20:15 providence postfix/postfix-script[13738]: waiting for the Postfix mail system to terminate Aug 1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router Solicitation Aug 1 13:20:19 providence ntpd[3454]: ntpd exiting on signal 15 Your NTP behaves strangely... why would it refresh timeservers and then shutdown at the same timestamp? It looks like your ntp did something funny and incorrectly adjusted the HW clock. Now that you mention it, I don't know why it would refresh at shutdown. I think this is just a normal log event that occurred between the time the apcupsd shutdown was commanded at 13:20:15 and the ntpd refresh at 13:20:15. I just don't think ntpd had time to get the message yet... Thanks for your help. I occurred during a power event, so I guess we'll just chock it up to stray voltage that somehow zapped the month bit in the bios ahead by 1. Sounds screwy, but I don't know the system/kernel/bios well enough to know how else it could have occurred or what even interfaces with the hardware clock for that matter. System time was fine up until the time of restart... - -- David C. Rankin, J.D.,P.E. -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlAZssUACgkQZMpuZ8CyrcgM6gCfcqf+TsykvukqW/ca0La2l6Tc lIYAn1YizdUmq9+lCD0Ns5jEDd02NwXf =d3sE -END PGP SIGNATURE-
Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??
On Wed, 01 Aug 2012 17:50:45 -0500 David C. Rankin drankina...@suddenlinkmail.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 08/01/2012 03:42 PM, Leonid Isaev wrote: snip Aug 1 13:20:15 providence apcupsd[3477]: apcupsd shutdown succeeded Aug 1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000] Does apcupsd routinely segfault? No, the segfault just started on July 30: [17:40 providence:/var/log] # grep apcupsd everything.log* | grep segfault everything.log:Jul 30 13:48:24 providence kernel: [603728.758431] apcupsd[3190]: segfault at 0 ip 0805e0fc sp bfe397c0 error 4 in apcupsd (deleted)[8048000+39000] everything.log:Aug 1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000] The only config change I can see in the apcupsd.conf.pacnew is: LOCKFILE /var/lock - LOCKFILE /etc/apcupsd I can't see that causing a segfault, could it? I don't think so, no. But since something does, I would look at what has been updated lately (apcupsd for example). On a side note... I don't use this application (although I do have an APC UPS unit) but this entry looks very wrong -- is it a lockfile in /etc? Also, I quickly skimmed through the initscript in /etc/rc.d/apcupsd, at it also does few writes to /etc. Unless this is absolutely necessary, such behavior represents a bug and is in violation of FHS. Aug 1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0 192.168.7.124 UDP 123 Aug 1 13:20:15 providence ntpd[3454]: peers refreshed Aug 1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up resolver Aug 1 13:20:15 providence postfix/postfix-script[13735]: stopping the Postfix mail system Aug 1 13:20:15 providence postfix/master[3868]: terminating on signal 15 Aug 1 13:20:15 providence postfix/postfix-script[13738]: waiting for the Postfix mail system to terminate Aug 1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router Solicitation Aug 1 13:20:19 providence ntpd[3454]: ntpd exiting on signal 15 Your NTP behaves strangely... why would it refresh timeservers and then shutdown at the same timestamp? It looks like your ntp did something funny and incorrectly adjusted the HW clock. Now that you mention it, I don't know why it would refresh at shutdown. I think this is just a normal log event that occurred between the time the apcupsd shutdown was commanded at 13:20:15 and the ntpd refresh at 13:20:15. I just don't think ntpd had time to get the message yet... Well, unless you have hwclock daemon, ntpd is the only thing which affects HW clock. Thanks for your help. I occurred during a power event, so I guess we'll just chock it up to stray voltage that somehow zapped the month bit in the bios ahead by 1. Sounds screwy, but I don't know the system/kernel/bios well enough to know how else it could have occurred or what even interfaces with the hardware clock for that matter. System time was fine up until the time of restart... - -- David C. Rankin, J.D.,P.E. -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlAZssUACgkQZMpuZ8CyrcgM6gCfcqf+TsykvukqW/ca0La2l6Tc lIYAn1YizdUmq9+lCD0Ns5jEDd02NwXf =d3sE -END PGP SIGNATURE- -- Leonid Isaev GnuPG key: 0x164B5A6D Fingerprint: C0DF 20D0 C075 C3F1 E1BE 775A A7AE F6CB 164B 5A6D signature.asc Description: PGP signature