Hi,
 Now for the answer.

Some very good suggestions all round - thanks for all the contributions.

The box had been replaced with an identical box. Same kernel panic.

Yes, internet access is provided by a 3G cellular link.

The CentOS kernel had been upgraded to the latest version available through yum.

The kicker was noticing at this point that sftp would not connect.

Turns out the Belkin router was faulty. A replacement was installed and the customer is happy. There have been no reported incidents since the new router was installed, so I can only conclude that it is all good.

The kernel upgrade happened at the same time as the router replacement. Since the kernel panics were sporadic, one cannot "know for sure" whether it was the kernel upgrade or the router replacement that stopped the panics. My view is that the older kernel version was not coping with the huge number of TCP fragments and retries. It would have been "leaking memory" or some such thing until the box died. Which is actually a very good suggestion - I could log and report memory usage.
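
For the memory logging idea, a minimal sketch in Python follows. It assumes a Linux box where /proc/meminfo is readable and Python is already installed, so nothing extra needs to go onto a field machine. The function names, the choice of fields and the log path are my own assumptions for illustration, not anything from the box in question.

```python
import time

# Fields worth trending when hunting a slow leak (an assumed selection).
FIELDS = ("MemTotal", "MemFree", "Buffers", "Cached", "SwapFree")


def parse_meminfo(text):
    """Return {field: value} for lines of the form 'MemFree:  123456 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def format_sample(values, now=None):
    """Build one log line: a local timestamp followed by FIELD=value pairs."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    cols = ["%s=%s" % (field, values.get(field, "?")) for field in FIELDS]
    return stamp + " " + " ".join(cols)


def log_sample(log_path="/var/tmp/meminfo.log", meminfo_path="/proc/meminfo"):
    """Append one sample to log_path (a hypothetical location)."""
    with open(meminfo_path) as src:
        line = format_sample(parse_meminfo(src.read()))
    with open(log_path, "a") as log:
        log.write(line + "\n")
    return line
```

Calling log_sample() from a cron job every few minutes would leave a trail on disk, so the last lines before a panic show whether free memory was trending towards zero.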

The suggestions to install "some-tool" are valid for home machines, but not for machines in the field. Some googling suggests that netstat reports cumulative counts of TCP errors, but it is not clear which counters are the essential ones to look at. I suspect that some answers are valid sometimes, and other answers are valid at other times.

If someone has a favourite netstat command for diagnosing the state of the network link, I would appreciate hearing about it.
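
In the meantime, one possible starting point: as far as I know, the totals printed by "netstat -s" on Linux come from /proc/net/snmp (and /proc/net/netstat) and are cumulative since boot, so a single reading says little. The sketch below, in Python, takes two snapshots of /proc/net/snmp text and reports the differences. The counter names (OutSegs, RetransSegs, InErrs, ReasmFails) are the standard ones in that file; the function names and the idea of using the retransmit percentage as the headline figure are my own assumptions.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {counter: int}}.

    Each protocol appears as a header line of names followed by a
    line of values, both prefixed with 'Proto:'.
    """
    tables = {}
    headers = {}
    for line in text.splitlines():
        proto, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if proto not in headers:
            headers[proto] = fields          # first line: counter names
        else:
            names = headers.pop(proto)       # second line: counter values
            tables[proto] = dict((k, int(v)) for k, v in zip(names, fields))
    return tables


def link_health(before, after):
    """Compare two parse_snmp() results taken some time apart."""
    def delta(proto, name):
        return after[proto].get(name, 0) - before[proto].get(name, 0)

    out_segs = delta("Tcp", "OutSegs")
    retrans = delta("Tcp", "RetransSegs")
    pct = 100.0 * retrans / out_segs if out_segs else 0.0
    return {
        "out_segs": out_segs,
        "retrans_segs": retrans,
        "retrans_pct": pct,                  # share of sent segments resent
        "in_errs": delta("Tcp", "InErrs"),
        "reasm_fails": delta("Ip", "ReasmFails"),
    }
```

Reading /proc/net/snmp, sleeping for a minute and reading it again gives the two inputs. A retransmit percentage that stays high between snapshots would point at the link or the router rather than at the box itself.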

================
One suggestion (off list) was:
Probably environmental then: dirty power, magnetism, cleaner (is there
a regular pattern to the crashes?), script kiddies on the uplink or
somesuch.

However, as Craig mentioned, if it's still in that state and published
on the web, then security patches are essential...

My advice - look in /var/log and /var/crash - is still valid, and should
be the first step anyway.
======================
On 13/09/12 19:37, Christopher Sawtell wrote:
On 13 September 2012 18:47, Mark Beharrell wrote:

        Hi,
         No activity for some time, so I thought I would kick things
        off with a question.
        Any suggestions (including shell commands) are welcome..

        I have a problematic box on a customer's site in the US. It is
        not going to be easy to get physical access. SSH access is
        doable.


--
Derek J Smithies Ph.D.
Christchurch,
New Zealand

     -- "How did you make it work??"  "the usual, got everything right"

_______________________________________________
Linux-users mailing list
[email protected]
http://lists.canterbury.ac.nz/mailman/listinfo/linux-users