Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??

2012-08-02 Thread David C. Rankin

On 08/01/2012 06:31 PM, Leonid Isaev wrote:

On Wed, 01 Aug 2012 17:50:45 -0500
David C. Rankin drankina...@suddenlinkmail.com wrote:


The only config change I can see in the apcupsd.conf.pacnew is:

LOCKFILE /var/lock  -  LOCKFILE /etc/apcupsd

I can't see that causing a segfault, could it?


I don't think so, no. But since something does, I would look at what has been
updated lately (apcupsd for example).

On a side note... I don't use this application (although I do have an APC UPS
unit) but this entry looks very wrong -- is it a lockfile in /etc? Also, I
quickly skimmed through the initscript in /etc/rc.d/apcupsd, at it also does
few writes to /etc. Unless this is absolutely necessary, such behavior
represents a bug and is in violation of FHS.



Something is FUBAR with apcupsd. I am running the package from the community 
repo, so it might need looking at.






Aug  1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0
192.168.7.124 UDP 123
Aug  1 13:20:15 providence ntpd[3454]: peers refreshed
Aug  1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up
resolver Aug  1 13:20:15 providence postfix/postfix-script[13735]:
stopping the Postfix mail system
Aug  1 13:20:15 providence postfix/master[3868]: terminating on signal 15
Aug  1 13:20:15 providence postfix/postfix-script[13738]: waiting for the
Postfix mail system to terminate
Aug  1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router
Solicitation Aug  1 13:20:19 providence ntpd[3454]: ntpd exiting on
signal 15


Your NTP behaves strangely... why would it refresh timeservers and then
shutdown at the same timestamp? It looks like your ntp did something funny
and incorrectly adjusted the HW clock.


Now that you mention it, I don't know why it would refresh at shutdown. I
think this is just a normal log event that occurred between the time the
apcupsd shutdown was commanded at 13:20:15 and the ntpd refresh at 13:20:15.
I just don't think ntpd had time to get the message yet...


Well, unless you have hwclock daemon, ntpd is the only thing which affects HW
clock.



  No, it's ntpd and it WAS the problem. This issue started a couple of weeks 
ago. This box usually just sits idle. Going though the various log files, you 
can see it happening and then switching back -- really strange. The history is 
laid out below, but it appears that the hdwclock got zapped by ntpd on a Jul 
11 reboot. Then the system would correct itself and run until the next reboot 
and read the hdwclock again and go though the same cycle. I've never seen 
anything like it and I can't explain it. hdwclock tracks sysclock fine, but we 
have seen some weird things in the past, but never by a whole month.


  It did screw up the cronstamps:

[14:59 providence:/var/spool] # l cronstamps/
total 28
drwxr-xr-x 2 root root 4096 May  4  2011 .
drwxr-xr-x 9 root root 4096 Jul 27 19:28 ..
-rw-r--r-- 1 root root   16 Aug 30  2012 root.prune-cronstamps
-rw-r--r-- 1 root root   16 Jul 23 14:22 root.sys-daily
-rw-r--r-- 1 root root   16 Sep  1  2012 root.sys-hourly
-rw-r--r-- 1 root root   16 Aug 12  2012 root.sys-monthly
-rw-r--r-- 1 root root   16 Aug 30  2012 root.sys-weekly


  For the curios, here is the log history:


  In the everything.log.4 I think I found the fist event:

Jul 11 13:53:12 providence shutdown[1939]: shutting down for system reboot
Jul 11 13:53:13 providence init: Switching to runlevel: 6
Jul 11 13:53:14 providence ntfs-3g[290]: Unmounting /dev/sdb2 ()
Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting cleanly.
Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoting known real-time threads.
Jul 11 18:53:14 providence rtkit-daemon[1536]: Successfully demoted thread 
1534 of process 1534 (/usr/bin/pulseaudio).

Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoted 1 threads.
Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting watchdog thread.
Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting canary thread.
Jul 11 13:53:14 providence apcupsd[1082]: apcupsd exiting, signal 15
Jul 11 13:53:14 providence apcupsd[1082]: apcupsd shutdown succeeded
Jul 11 13:53:15 providence postfix/postfix-script[2116]: stopping the Postfix 
mail system

Jul 11 13:53:15 providence postfix/master[1247]: terminating on signal 15
Jul 11 13:53:16 providence ntpd[870]: ntpd exiting on signal 15
Jul 11 13:53:16 providence dhcpcd[2234]: sending signal 1 to pid 413
Jul 11 13:53:16 providence dhcpcd[413]: received SIGHUP, releasing
Jul 11 13:53:16 providence dhcpcd[413]: eth0: releasing lease of 192.168.7.124
Jul 11 13:53:16 providence dhcpcd[2234]: waiting for pid 413 to exit
Jul 11 13:53:16 providence dhcpcd[413]: eth0: removing interface
Jul 11 13:53:16 providence kernel: [  977.567012] pcieport :00:1c.0: 
wake-up capability enabled by ACPI
Jul 11 13:53:16 providence syslog-ng[352]: syslog-ng shutting down; 
version='3.3.5'
Aug 12 02:00:38 providence kernel: [0.00] Initializing cgroup subsys 
cpuset

Aug 12 02:00:38 providence kernel: [0.00] 

Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??

2012-08-02 Thread Guus Snijders
Hey David,
-apologies for not quoting correct on  this one-

could you check the date and time settings in the BIOS ?

My guess is that it's something weird in the RTC. To make sure it's best to
check outside of the OS.

I'll admit; it's just a huch.

mvg, Guus


Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??

2012-08-02 Thread Myra Nelson
On Thu, Aug 2, 2012 at 3:06 PM, David C. Rankin
drankina...@suddenlinkmail.com wrote:
 On 08/01/2012 06:31 PM, Leonid Isaev wrote:

 On Wed, 01 Aug 2012 17:50:45 -0500
 David C. Rankin drankina...@suddenlinkmail.com wrote:


 The only config change I can see in the apcupsd.conf.pacnew is:

 LOCKFILE /var/lock  -  LOCKFILE /etc/apcupsd

 I can't see that causing a segfault, could it?


 I don't think so, no. But since something does, I would look at what has
 been
 updated lately (apcupsd for example).

 On a side note... I don't use this application (although I do have an APC
 UPS
 unit) but this entry looks very wrong -- is it a lockfile in /etc? Also, I
 quickly skimmed through the initscript in /etc/rc.d/apcupsd, at it also
 does
 few writes to /etc. Unless this is absolutely necessary, such behavior
 represents a bug and is in violation of FHS.


 Something is FUBAR with apcupsd. I am running the package from the community
 repo, so it might need looking at.




 Aug  1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0
 192.168.7.124 UDP 123
 Aug  1 13:20:15 providence ntpd[3454]: peers refreshed
 Aug  1 13:20:15 providence ntpd[3454]: new interface(s) found: waking
 up
 resolver Aug  1 13:20:15 providence postfix/postfix-script[13735]:
 stopping the Postfix mail system
 Aug  1 13:20:15 providence postfix/master[3868]: terminating on signal
 15
 Aug  1 13:20:15 providence postfix/postfix-script[13738]: waiting for
 the
 Postfix mail system to terminate
 Aug  1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router
 Solicitation Aug  1 13:20:19 providence ntpd[3454]: ntpd exiting on
 signal 15


 Your NTP behaves strangely... why would it refresh timeservers and then
 shutdown at the same timestamp? It looks like your ntp did something
 funny
 and incorrectly adjusted the HW clock.


 Now that you mention it, I don't know why it would refresh at shutdown. I
 think this is just a normal log event that occurred between the time the
 apcupsd shutdown was commanded at 13:20:15 and the ntpd refresh at
 13:20:15.
 I just don't think ntpd had time to get the message yet...


 Well, unless you have hwclock daemon, ntpd is the only thing which affects
 HW
 clock.


   No, it's ntpd and it WAS the problem. This issue started a couple of weeks
 ago. This box usually just sits idle. Going though the various log files,
 you can see it happening and then switching back -- really strange. The
 history is laid out below, but it appears that the hdwclock got zapped by
 ntpd on a Jul 11 reboot. Then the system would correct itself and run until
 the next reboot and read the hdwclock again and go though the same cycle.
 I've never seen anything like it and I can't explain it. hdwclock tracks
 sysclock fine, but we have seen some weird things in the past, but never by
 a whole month.

   It did screw up the cronstamps:

 [14:59 providence:/var/spool] # l cronstamps/
 total 28
 drwxr-xr-x 2 root root 4096 May  4  2011 .
 drwxr-xr-x 9 root root 4096 Jul 27 19:28 ..
 -rw-r--r-- 1 root root   16 Aug 30  2012 root.prune-cronstamps
 -rw-r--r-- 1 root root   16 Jul 23 14:22 root.sys-daily
 -rw-r--r-- 1 root root   16 Sep  1  2012 root.sys-hourly
 -rw-r--r-- 1 root root   16 Aug 12  2012 root.sys-monthly
 -rw-r--r-- 1 root root   16 Aug 30  2012 root.sys-weekly


   For the curios, here is the log history:


   In the everything.log.4 I think I found the fist event:

 Jul 11 13:53:12 providence shutdown[1939]: shutting down for system reboot
 Jul 11 13:53:13 providence init: Switching to runlevel: 6
 Jul 11 13:53:14 providence ntfs-3g[290]: Unmounting /dev/sdb2 ()
 Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting cleanly.
 Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoting known real-time
 threads.
 Jul 11 18:53:14 providence rtkit-daemon[1536]: Successfully demoted thread
 1534 of process 1534 (/usr/bin/pulseaudio).
 Jul 11 18:53:14 providence rtkit-daemon[1536]: Demoted 1 threads.
 Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting watchdog thread.
 Jul 11 18:53:14 providence rtkit-daemon[1536]: Exiting canary thread.
 Jul 11 13:53:14 providence apcupsd[1082]: apcupsd exiting, signal 15
 Jul 11 13:53:14 providence apcupsd[1082]: apcupsd shutdown succeeded
 Jul 11 13:53:15 providence postfix/postfix-script[2116]: stopping the
 Postfix mail system
 Jul 11 13:53:15 providence postfix/master[1247]: terminating on signal 15
 Jul 11 13:53:16 providence ntpd[870]: ntpd exiting on signal 15
 Jul 11 13:53:16 providence dhcpcd[2234]: sending signal 1 to pid 413
 Jul 11 13:53:16 providence dhcpcd[413]: received SIGHUP, releasing
 Jul 11 13:53:16 providence dhcpcd[413]: eth0: releasing lease of
 192.168.7.124
 Jul 11 13:53:16 providence dhcpcd[2234]: waiting for pid 413 to exit
 Jul 11 13:53:16 providence dhcpcd[413]: eth0: removing interface
 Jul 11 13:53:16 providence kernel: [  977.567012] pcieport :00:1c.0:
 wake-up capability enabled by ACPI
 Jul 11 13:53:16 providence syslog-ng[352]: syslog-ng shutting 

[arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??

2012-08-01 Thread David C. Rankin

Guys,

  This is one for you to puzzle over. I've never seen it before. I had an 
apcupsd commanded shutdown at Aug  1 13:20:15 (not uncommon - power is 
terrible). However, when I restarted the hwclock had magically jumped ahead by 
1 full month to Sept. 1. (the box was down for 56 minutes until I returned 
from lunch)


  Then on restart, I was dropped into the maintenance shell and required to 
run fsck manually due to the future date. There was a message that stated 
something line run fsck manually on /dev/sdc5 ... the system will reboot 
automatically after fsck completes (this looked like an abbreviated 
maintenance shell -- I guess due to the fact that the root fs was the one that 
was being checked /dev/sdc5.


  fsck completed (e2fsck), but the box didn't reboot. So I typed 'reboot' -- 
nothing happened. Checked /usr/bin and reboot was present. So then I finally 
did ctrl+alt+del and the box restarted. I was then dumped into the normal 
maintenance shell to fsck /home on /dev/sdc7. It completed and reboot worked 
normally.


  My questions are (1) does anybody have any idea how this can happen? (2) 
Anybody else every seen this? and (3) could the kernel apcupsd segfault (shown 
below) be the cause?


  Any ideas welcome. Strange stuff...

  The relevant everything.log snippet and bios info is below:

Aug  1 13:20:15 providence apcupsd[3477]: apcupsd shutdown succeeded
Aug  1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault at 
0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000]
Aug  1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0 192.168.7.124 
UDP 123

Aug  1 13:20:15 providence ntpd[3454]: peers refreshed
Aug  1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up 
resolver
Aug  1 13:20:15 providence postfix/postfix-script[13735]: stopping the Postfix 
mail system

Aug  1 13:20:15 providence postfix/master[3868]: terminating on signal 15
Aug  1 13:20:15 providence postfix/postfix-script[13738]: waiting for the 
Postfix mail system to terminate

Aug  1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router Solicitation
Aug  1 13:20:19 providence ntpd[3454]: ntpd exiting on signal 15
Aug  1 13:20:19 providence dhcpcd[13884]: sending signal 1 to pid 424
Aug  1 13:20:19 providence dhcpcd[424]: received SIGHUP, releasing
Aug  1 13:20:19 providence dhcpcd[424]: eth0: releasing lease of 192.168.7.124
Aug  1 13:20:19 providence dhcpcd[13884]: waiting for pid 424 to exit
Aug  1 13:20:19 providence dhcpcd[424]: eth0: removing interface
Aug  1 13:20:19 providence kernel: [171096.79] pcieport :00:1c.0: 
wake-up capability enabled by ACPI
Aug  1 13:20:19 providence syslog-ng[352]: syslog-ng shutting down; 
version='3.3.5'
Sep  1 14:16:20 providence kernel: [0.00] Initializing cgroup subsys 
cpuset

Sep  1 14:16:20 providence kernel: [0.00] Initializing cgroup subsys cpu
Sep  1 14:16:20 providence kernel: [0.00] Linux version 3.4.7-1-ARCH 
(tobias@T-POWA-LX) (gcc version 4.7.1 20120721 (prerelease) (GCC) ) #1 SMP 
PREEMPT Sun Jul 29 20:05:01 UTC 2012


BIOS Information
Vendor: Dell Inc.
Version: A04
Release Date: 02/09/2005


--
David C. Rankin, J.D.,P.E.


Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??

2012-08-01 Thread David C. Rankin
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/01/2012 03:42 PM, Leonid Isaev wrote:
snip

 Aug  1 13:20:15 providence apcupsd[3477]: apcupsd shutdown succeeded
 Aug  1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault
 at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000]
 
 Does apcupsd routinely segfault?
 

No, the segfault just started on July 30:

[17:40 providence:/var/log] # grep apcupsd everything.log* | grep segfault
everything.log:Jul 30 13:48:24 providence kernel: [603728.758431] apcupsd[3190]:
segfault at 0 ip 0805e0fc sp bfe397c0 error 4 in apcupsd 
(deleted)[8048000+39000]
everything.log:Aug  1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]:
segfault at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000]

The only config change I can see in the apcupsd.conf.pacnew is:

LOCKFILE /var/lock  -  LOCKFILE /etc/apcupsd

I can't see that causing a segfault, could it?


 Aug  1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0
 192.168.7.124 UDP 123
 Aug  1 13:20:15 providence ntpd[3454]: peers refreshed
 Aug  1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up
 resolver Aug  1 13:20:15 providence postfix/postfix-script[13735]: stopping
 the Postfix mail system
 Aug  1 13:20:15 providence postfix/master[3868]: terminating on signal 15
 Aug  1 13:20:15 providence postfix/postfix-script[13738]: waiting for the 
 Postfix mail system to terminate
 Aug  1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router
 Solicitation Aug  1 13:20:19 providence ntpd[3454]: ntpd exiting on signal 15
 
 Your NTP behaves strangely... why would it refresh timeservers and then
 shutdown at the same timestamp? It looks like your ntp did something funny and
 incorrectly adjusted the HW clock.

Now that you mention it, I don't know why it would refresh at shutdown. I think
this is just a normal log event that occurred between the time the apcupsd
shutdown was commanded at 13:20:15 and the ntpd refresh at 13:20:15. I just
don't think ntpd had time to get the message yet...

Thanks for your help. I occurred during a power event, so I guess we'll just
chock it up to stray voltage that somehow zapped the month bit in the bios ahead
by 1. Sounds screwy, but I don't know the system/kernel/bios well enough to
know how else it could have occurred or what even interfaces with the hardware
clock for that matter. System time was fine up until the time of restart...

- -- 
David C. Rankin, J.D.,P.E.
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.16 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlAZssUACgkQZMpuZ8CyrcgM6gCfcqf+TsykvukqW/ca0La2l6Tc
lIYAn1YizdUmq9+lCD0Ns5jEDd02NwXf
=d3sE
-END PGP SIGNATURE-


Re: [arch-general] Bizarre Clock Reset 8/1 - 9/1 after apcupsd shutdown??

2012-08-01 Thread Leonid Isaev
On Wed, 01 Aug 2012 17:50:45 -0500
David C. Rankin drankina...@suddenlinkmail.com wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 On 08/01/2012 03:42 PM, Leonid Isaev wrote:
 snip
 
  Aug  1 13:20:15 providence apcupsd[3477]: apcupsd shutdown succeeded
  Aug  1 13:20:15 providence kernel: [171092.364247] apcupsd[3477]: segfault
  at 0 ip 0805c706 sp bfcdfff0 error 4 in apcupsd[8048000+38000]
  
  Does apcupsd routinely segfault?
  
 
 No, the segfault just started on July 30:
 
 [17:40 providence:/var/log] # grep apcupsd everything.log* | grep segfault
 everything.log:Jul 30 13:48:24 providence kernel: [603728.758431]
 apcupsd[3190]: segfault at 0 ip 0805e0fc sp bfe397c0 error 4 in apcupsd
 (deleted)[8048000+39000] everything.log:Aug  1 13:20:15 providence kernel:
 [171092.364247] apcupsd[3477]: segfault at 0 ip 0805c706 sp bfcdfff0 error 4
 in apcupsd[8048000+38000]
 
 The only config change I can see in the apcupsd.conf.pacnew is:
 
 LOCKFILE /var/lock  -  LOCKFILE /etc/apcupsd
 
 I can't see that causing a segfault, could it?

I don't think so, no. But since something does, I would look at what has been
updated lately (apcupsd for example).

On a side note... I don't use this application (although I do have an APC UPS
unit) but this entry looks very wrong -- is it a lockfile in /etc? Also, I
quickly skimmed through the initscript in /etc/rc.d/apcupsd, at it also does
few writes to /etc. Unless this is absolutely necessary, such behavior
represents a bug and is in violation of FHS.

 
 
  Aug  1 13:20:15 providence ntpd[3454]: Listen normally on 6 eth0
  192.168.7.124 UDP 123
  Aug  1 13:20:15 providence ntpd[3454]: peers refreshed
  Aug  1 13:20:15 providence ntpd[3454]: new interface(s) found: waking up
  resolver Aug  1 13:20:15 providence postfix/postfix-script[13735]:
  stopping the Postfix mail system
  Aug  1 13:20:15 providence postfix/master[3868]: terminating on signal 15
  Aug  1 13:20:15 providence postfix/postfix-script[13738]: waiting for the 
  Postfix mail system to terminate
  Aug  1 13:20:16 providence dhcpcd[424]: eth0: sending IPv6 Router
  Solicitation Aug  1 13:20:19 providence ntpd[3454]: ntpd exiting on
  signal 15
  
  Your NTP behaves strangely... why would it refresh timeservers and then
  shutdown at the same timestamp? It looks like your ntp did something funny
  and incorrectly adjusted the HW clock.
 
 Now that you mention it, I don't know why it would refresh at shutdown. I
 think this is just a normal log event that occurred between the time the
 apcupsd shutdown was commanded at 13:20:15 and the ntpd refresh at 13:20:15.
 I just don't think ntpd had time to get the message yet...

Well, unless you have hwclock daemon, ntpd is the only thing which affects HW
clock.

 
 Thanks for your help. I occurred during a power event, so I guess we'll just
 chock it up to stray voltage that somehow zapped the month bit in the bios
 ahead by 1. Sounds screwy, but I don't know the system/kernel/bios well
 enough to know how else it could have occurred or what even interfaces with
 the hardware clock for that matter. System time was fine up until the time
 of restart...
 
 - -- 
 David C. Rankin, J.D.,P.E.
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v2.0.16 (GNU/Linux)
 Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
 
 iEYEARECAAYFAlAZssUACgkQZMpuZ8CyrcgM6gCfcqf+TsykvukqW/ca0La2l6Tc
 lIYAn1YizdUmq9+lCD0Ns5jEDd02NwXf
 =d3sE
 -END PGP SIGNATURE-



-- 
Leonid Isaev
GnuPG key: 0x164B5A6D
Fingerprint: C0DF 20D0 C075 C3F1 E1BE  775A A7AE F6CB 164B 5A6D


signature.asc
Description: PGP signature