On Wed, Feb 05, 2003 at 01:26:47PM -0600, John Goerzen wrote:
> Paul Sladen <[EMAIL PROTECTED]> writes:
>
> > Sorry I didn't express a `bite', I [still] haven't had time to look through
> > the call-path.
>
> That's OK; knowing that somebody knowledgeable is going to look at it
> at least is comforting.
>
> > If you can regularly reproduce these, then dumping them (in full) via
> > serial/netconsole/lights-out would be useful.
>
> This is the first time it's happened. I was so flustered at seeing
> our server go down that I didn't take the time to ready myself for the
> next one. Sigh. Anyway, next time it goes down, I'll transcribe the
> full trace using my laptop, and at the same time drop in an extra
> serial card (our serial ports are full) so we can do the serial
> console thing.
>
beware, do not draw wrong conclusions from insufficient data ...
although I tend to agree that your assumptions point in the
right direction, I would be careful ...

> Some data points:
>
> 1. I recently moved our vservers from a 2.4.19 uniprocessor machine to
> a 2.4.20 SMP one, with appropriate upgrades in the ctx patch. So,
> three variables there: kernel version, SMP, and ctx version.

... and hardware (RAM, I/O, etc.) and maybe network ...
maybe even the distribution? some upgrade/update?

> 2. I had never experienced this sort of thing on the 2.4.19 machine.

... for how long, and under what load/conditions?
have they changed too?

> 3. This machine is a brand-new Linux-supported Dell server, so
> hardware is unlikely to be an issue.

... a brand-new "Linux-supported" big-company server never
caused any problems (like compaq/hp/ibm ???), please wake up ...

don't get me wrong, but I have two SMP 2.4.20-p8c13e machines
that have been running smoothly for about 126 days ...

best,
Herbert

PS: I value your input, and I'm sure the bug will be found,
sooner or later ...

> -- John
