My colo box stopped responding this morning. Thankfully, my ISP (Sonic.net) got an automated page, went into the server room, and called me asking if they should hit the reboot button, to which I said "can't ping it? sure, reboot!"
Anyway, I looked over the Apache logs to see if there was anything particularly interesting, which there wasn't. Just a gap in time between when the server stopped responding and when it finished rebooting.

Looking at /var/log/messages, though, at about the time of the 'crash' (or whatever it was that happened), I'm seeing these, which are unfamiliar to me (not that I know much about /var/log/messages to begin with :) )

Oct 6 08:26:02 plink kernel: Rx ring a020f028: 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000
Oct 6 08:26:02 plink kernel: Tx ring a020f128: 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00
Oct 6 08:26:05 plink kernel: Rx ring a020f028: 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000
Oct 6 08:26:05 plink kernel: Tx ring a020f128: 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00
Oct 6 08:26:07 plink kernel: Rx ring a020f028: 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000 80000000
Oct 6 08:26:07 plink kernel: Tx ring a020f128: 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00 7fffbc00

Any idea what these are about? They continue until 8:54, then I only see a few of these every 15 minutes:

Oct 6 10:00:05 plink telnetd[25738]: ttloop: read: Broken pipe

(I see a lot of them earlier up in /var/log/messages, and they, too, are about 15 minutes apart from each other. But, I guess that's a completely different question I'll need to ask :) )

Anyway, then at 10:32, the machine got rebooted.
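
One way to check the timing of those repeating entries is to pull out the timestamps and look at the gaps between them. Below is a rough Python sketch, not anything taken from the box itself: the names `parse` and `summarize`, the regex, and the placeholder year (syslog lines don't carry one) are all assumptions made for illustration.

```python
import re
from datetime import datetime

# Classic syslog prefix: "Oct 6 08:26:02 host tag[pid]: message"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+(\S+)\s+"
    r"([^:\[\s]+)(?:\[\d+\])?:\s*(.*)$"
)

def parse(line, year=2000):
    """Return (timestamp, tag, message), or None if the line doesn't match.

    The year is a placeholder (an assumption): these log lines omit it.
    """
    m = LINE_RE.match(line)
    if not m:
        return None
    mon, day, clock, _host, tag, msg = m.groups()
    ts = datetime.strptime(f"{year} {mon} {day} {clock}", "%Y %b %d %H:%M:%S")
    return ts, tag, msg

def summarize(lines, tag, contains=""):
    """Count entries for `tag` whose message contains `contains`,
    and list the gaps (in seconds) between consecutive matches."""
    stamps = []
    for line in lines:
        parsed = parse(line)
        if parsed and parsed[1] == tag and contains in parsed[2]:
            stamps.append(parsed[0])
    gaps = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    return {
        "count": len(stamps),
        "first": stamps[0] if stamps else None,
        "last": stamps[-1] if stamps else None,
        "gaps": gaps,
    }

# The Rx/Tx ring lines quoted above, rebuilt from their repeated values.
RX = " ".join(["80000000"] * 16)
TX = " ".join(["7fffbc00"] * 16)
SAMPLE = []
for clock in ("08:26:02", "08:26:05", "08:26:07"):
    SAMPLE.append(f"Oct 6 {clock} plink kernel: Rx ring a020f028: {RX}")
    SAMPLE.append(f"Oct 6 {clock} plink kernel: Tx ring a020f128: {TX}")

if __name__ == "__main__":
    print(summarize(SAMPLE, "kernel", "Rx ring"))
```

To run it against the real file, something like `summarize(open("/var/log/messages"), "telnetd", "ttloop")` should show whether those telnetd entries really are about 15 minutes (900 seconds) apart.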
(Note: Looks like the box's clock is ahead of itself by 1/2 hr. D'oh!)

-bill! (who is seeing a correlation between those 15-min-apart telnetd messages in /var/log/messages, and a bunch of telnetd and proftpd connection refusals in /var/log/secure ... interesting)
