I just reported this to [EMAIL PROTECTED] in more detail, but would
also like to know if anyone else has experienced something similar recently.

On 1996-03-04 our cell lost two of its three kaserver databases, the
ones on gutemine and kneipix.

Authentication was extremely slow and appeared to hang; password changes
and the creation of new users were not possible.

The problem was fixed by identifying and removing the two broken kaservers
and recreating them.

We investigated the problem, of course, but the only unusual entries we
found in the syslog were the following.

isis.wu-wien.ac.at> fgrep 'logical clock adjust timeout' syslog.0 syslog|less
Mar  4 03:00:34 zechine  ntpd[11208]: logical clock adjust timeout (86400 seconds)
Mar  4 03:01:19 kneipix  ntpd[9219]:  logical clock adjust timeout (86400 seconds)
Mar  4 03:01:53 gutemine ntpd[12307]: logical clock adjust timeout (86400 seconds)
Mar  4 03:02:24 idefix   ntpd[11445]: logical clock adjust timeout (86400 seconds)
Mar  4 03:03:25 asterix  ntpd[14776]: logical clock adjust timeout (86400 seconds)
Mar  4 03:03:54 falbala  ntpd[14001]: logical clock adjust timeout (86400 seconds)

Please note:
- falbala's entry (the only surviving database server) was the last.
- 60*60*24 = 86400, so the timeout in the messages is exactly one day.
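
For anyone who wants to compare this with their own logs, here is a
minimal Python sketch that orders the six entries above and prints how
long after the first message each host logged the timeout.  The log
lines are copied from the excerpt; the year 1996 is an assumption taken
from the incident date (syslog does not record it), and the names
parse, events and offsets are only illustrative.

```python
from datetime import datetime

# Syslog lines copied verbatim from the excerpt above.
LOG = """\
Mar  4 03:00:34 zechine  ntpd[11208]: logical clock adjust timeout (86400 seconds)
Mar  4 03:01:19 kneipix  ntpd[9219]:  logical clock adjust timeout (86400 seconds)
Mar  4 03:01:53 gutemine ntpd[12307]: logical clock adjust timeout (86400 seconds)
Mar  4 03:02:24 idefix   ntpd[11445]: logical clock adjust timeout (86400 seconds)
Mar  4 03:03:25 asterix  ntpd[14776]: logical clock adjust timeout (86400 seconds)
Mar  4 03:03:54 falbala  ntpd[14001]: logical clock adjust timeout (86400 seconds)
"""


def parse(line, year=1996):
    """Return (timestamp, host) for one syslog line.

    The year is an assumption: syslog timestamps do not carry it.
    """
    month, day, clock, host = line.split()[:4]
    stamp = datetime.strptime(f"{year} {month} {day} {clock}",
                              "%Y %b %d %H:%M:%S")
    return stamp, host


# Sort by timestamp, then express each entry relative to the first one.
events = sorted(parse(line) for line in LOG.splitlines() if line.strip())
first = events[0][0]
offsets = [(host, int((stamp - first).total_seconds()))
           for stamp, host in events]

for host, seconds in offsets:
    print(f"{host:<9} +{seconds:>3d} s")
```

The six messages span 200 seconds, with falbala last.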

Our network monitoring showed extremely high data transfer beginning right
after 03:00 in the morning.  About 3.7 GB per hour left subnet 3 for
subnets 7 and 241, and about 1.8 GB per hour arrived on each of subnets 7
and 241 from subnet 3.  This volume was almost constant from 03:00 to
08:00.  After that it went down until it reached a more normal level at
10:00, when the problem was finally fixed.  The surviving database server
is located in subnet 3; the crashed servers are in subnets 7 and 241
respectively.
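
To put those volumes into more familiar units, here is a rough
conversion to average line rates.  It assumes that "GB" means 10**9
bytes, which our monitoring output does not spell out, and it treats
03:00 to 08:00 as five hours at a constant rate.  The helper name
mbit_per_s is only illustrative.

```python
# Assumption: 1 GB = 10**9 bytes (the monitoring figures do not state
# which unit convention they use).
BYTES_PER_GB = 10**9


def mbit_per_s(gb_per_hour):
    """Convert a volume in GB per hour to an average rate in Mbit/s."""
    bits_per_hour = gb_per_hour * BYTES_PER_GB * 8
    return bits_per_hour / 3600 / 1e6


out_of_subnet_3 = mbit_per_s(3.7)    # total leaving subnet 3
into_each_subnet = mbit_per_s(1.8)   # arriving on subnet 7 and on subnet 241
total_gb = 3.7 * (8 - 3)             # 03:00 to 08:00 at a constant rate

print(f"leaving subnet 3:       {out_of_subnet_3:.1f} Mbit/s")
print(f"into subnet 7 and 241:  {into_each_subnet:.1f} Mbit/s each")
print(f"total, 03:00 to 08:00:  {total_gb:.1f} GB")
```

Under these assumptions the sustained rate out of subnet 3 works out to
roughly 8.2 Mbit/s, about 4.0 Mbit/s into each of the other two subnets,
and about 18.5 GB in total over the five hours.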

We don't have automatic mass backups scheduled during these hours.

+gg
 
--
[EMAIL PROTECTED]     Fax: +43/1/31336/702     [EMAIL PROTECTED]
