OK, so the box crashed again after being up less than an hour. It was
running gtk-gnutella, mplayer, kmail and one Mozilla Firebird browser.
/var/log/messages shows nothing. I've opened 2 konsoles on my
desktop after disabling the screen saver, and the plan is to run top
in one and ps aux in the other.
        What I'd like to do is run ps aux every 30 seconds or so to see if it
catches something I'm overlooking. I've looked around for a way to
automate that, without success. Can someone give me a command that
will loop the ps aux command?

TKS, Ernie
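
One possible way to loop ps aux, offered as a minimal sketch rather than a
tested fix for this box: a small shell function that appends a timestamped
snapshot to a file and then sleeps. The function name ps_loop, the log file
name ps-snapshots.log and the optional count argument are all made up for
illustration. Writing to a file (and calling sync) is an assumption about
what is useful here: if the machine locks solid, the konsole scrollback is
lost, but the last snapshots on disk may survive.

```shell
#!/bin/sh
# Hypothetical helper (name and defaults are illustrative assumptions):
# append a timestamped `ps aux` snapshot to a log file every INTERVAL seconds.
#   $1 = interval in seconds (default 30)
#   $2 = log file (default ~/ps-snapshots.log)
#   $3 = number of snapshots; 0 means loop until interrupted with Ctrl-C
ps_loop() {
    interval=${1:-30}
    logfile=${2:-"$HOME/ps-snapshots.log"}
    count=${3:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        {
            echo "=== $(date) ==="   # marker line so snapshots can be told apart
            ps aux
        } >> "$logfile"
        sync                          # flush to disk in case of a hard lockup
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Usage would be something like "ps_loop 30 ~/ps-snapshots.log" in one konsole.
If only on-screen output is wanted, "while true; do ps aux; sleep 30; done"
or "watch -n 30 ps aux" (where watch is installed) does the same job
without the log file.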

On Saturday 29 November 2003 11:00 am, Ernie Schroder wrote:
> For the last 3 mornings, my main Gentoo box has been locked solid.
> It has no keyboard or mouse response, and attempting to ssh in from
> another box on the LAN gives "no route to host" errors. I've been
> looking at logs to try to figure out what's going on and I think
> I've narrowed it down to my backup script or rsync itself.
>       The script uses passwordless ssh login, mounts a disk on the
> secondary box and does a pretty basic rsync backup, which then sends
> 2 emails to root. One mail is a summary of what it did and the
> other a more involved report that includes a file list of all files
> deleted or written. The backup takes about 18 minutes to complete
> and has been running fine for 3 months. For some odd reason the
> backups have been done but the mail wasn't delivered until
> just now, when I received 3 days of reports all at once.
>       Checking /var/log/messages and /var/log/messages.0 for the last
> 3 days shows that syslog has been restarted at the exact moment that
> the short summary shows it was sent. There is no more info written
> until I hit the reset to reboot the machine.
>       Here are the first 3 lines (only pertinent information) from
> /var/log/messages.0:
>
> $ sudo more /var/log/messages.0
> Nov 28 00:18:32 MRK syslogd 1.4.1: restart.
> Nov 28 20:37:21 MRK syslogd 1.4.1: restart.
> Nov 28 20:37:22 MRK kernel: klogd 1.4.1, log source = /proc/kmsg started.
>
> And from the headers of the cronjob summary:
>
> Received: by localhost (Postfix, from userid 0)
>         id 0E3C47F97A; Fri, 28 Nov 2003 00:18:32 -0500 (EST)
>
>       Prior to the start of these symptoms I did an emerge sync and
> emerge -u world. The only apps that could have affected this seem
> to be shadow, glibc, wget and syslogd. etc-update did show a couple
> of files, but as usual I didn't accept changes to anything I had
> edited myself. IIRC, there were 2 files, both having to do with
> syslogd, which I don't remember ever altering, so after a quick look
> I accepted the new files.
>       Any help solving this would be appreciated. The box is generally
> very reliable. Before this situation the box had been up 89 days.

-- 
Regards, Ernie
100% Microsoft and Intel free

