It does almost sound like hardware. When you run Red Hat or Debian, do you
stress the system to see whether it dies?
(like running several instances of "cp /dev/hda /dev/null" or "find / |
cpio -o | compress > /dev/null")
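
If it helps, here is a rough sketch of what I mean by "several instances".
The function name stress_read and the default of 4 jobs are just my own
placeholders, not anything from a distro; it starts the copies in parallel
and reports failure if any one of them dies.

```shell
# Hypothetical helper: copy SRC to /dev/null in N parallel jobs.
# Returns 0 only if every copy finished without an error.
stress_read() {
    src=$1
    jobs=${2:-4}          # number of parallel readers (assumed default)
    pids=""
    i=0
    while [ "$i" -lt "$jobs" ]; do
        cp "$src" /dev/null &
        pids="$pids $!"
        i=$((i + 1))
    done
    fail=0
    for pid in $pids; do
        wait "$pid" || fail=1
    done
    return "$fail"
}
```

As root you would call it as "stress_read /dev/hda 4" and check "$?"
afterwards. On a SCSI box the disk is more likely /dev/sda, so adjust the
device name to whatever yours is.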
I was going to say that memory may be an issue, but you have ECC RAM and
that should squawk if there are errors.
How did you get your distro? Off the net? I saw a few posts suggesting
that some of the Mandrake ISOs might be corrupt on some of the mirrors or
get corrupted during download.
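
One way to rule that out, assuming the mirror publishes an MD5 sum for the
ISO (I have not checked whether yours does), is to compare it against what
md5sum reports for your downloaded copy. A minimal sketch; verify_md5 is a
name I made up:

```shell
# Hypothetical helper: compare a file's MD5 sum against an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
verify_md5() {
    [ -r "$1" ] || return 2
    actual=$(md5sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}
```

You would run it as "verify_md5 your.iso <sum from the mirror>" and look
at the exit status. A mismatch means the download is bad, whatever the
installer says.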
BTW: Why do you have to recompile for SMP support? Maybe I'm thinking of
Red Hat, but I could swear that Mandrake provided SMP kernel RPMs.
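
Before recompiling anything, it may be worth checking what the running
kernel already reports. This is only a sketch with helper names of my own
choosing: it counts the "processor" entries in /proc/cpuinfo and looks for
"SMP" in the kernel version string, which is where Linux SMP builds
normally put it.

```shell
# Hypothetical helper: count CPUs listed in a cpuinfo-style file
# (defaults to /proc/cpuinfo).
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: succeed if a kernel version string mentions SMP
# (defaults to the output of "uname -v").
is_smp_version() {
    case "${1:-$(uname -v)}" in
        *SMP*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

If count_cpus prints 2 and is_smp_version succeeds, the stock kernel is
already running both processors and no rebuild should be needed.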
Maybe you should try temporarily pulling out one of the processors? (I'm
grasping at straws here.)
----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, January 02, 2001 10:02 AM
Subject: Re: [expert] Critical problem (Not a Hardware Problem)
[ snip ]
> This problem came up within 24 hours of performing a complete system
> update from one of the Mandrake mirrors. However, this may be a
> coincidence.
>
> When I booted the machine, the LILO dialog screen came up as usual. If I
> selected the standard "linux" or just let the default do the work, the
> same result: An immediate reboot. LILO dialog comes up, default, reboot.
> The cycle continued.
>
> Not having a working box, I performed an experiment. This is what I did
> (with attendant results):
>
> 1. Performed a complete reinstall of Mandrake 7.2 (from scratch, including
> reformatting the hard disk (I wanted to get rid of CUPS anyway)). Result:
> Exact same problem, LILO, select, reboot.
>
> 2. Next, I switched distributions and loaded Debian 2.2r1 onto the box.
> It ran just fine. No memory problems, no hard disk problems, nothing out
> of the ordinary.
>
> 3. Next, tried loading Mandrake 7.2 back on. Exact same problem! LILO,
> select, reboot cycle. Moreover, I tried to load from a boot floppy. Same
> problem!
>
> 4. Next, I switched yet again to RedHat 6.2. Same result as Debian.
> Normal install and stable performance.
>
> Here is the configuration:
>
> Compaq Professional Workstation 8000
> Dual Pentium Pro (200 MHz)
> 128 MB ECC RAM
> two 4 GB Seagate SCSI HD's (no RAID).
> Voodoo3 2000 PCI video card.
>
> The odd thing is that Mandrake ran stable for several weeks before failure
> (albeit just after a major update). However, if it was something with the
> update, the reinstall should have cleared it up -- but it didn't.
>
> I plan to do some more troubleshooting tonight. I will first try to
> reload Mandrake, only this time I will specify GRUB in lieu of LILO.
>
> Next, I will load Mandrake 7.1 and see if I can narrow the problem down
> to 7.2.
>
> One last thing, since this is an SMP system, and I have to recompile the
> kernel for SMP (and I'm not sure about RH 6.2), I will first check out SMP
> enablement in RH 6.2 and, if necessary, recompile the kernel for SMP and
> reboot to see whether that is causing the problem. Note, however, that
> upon boot both processors are initialized (as normal) so I don't think
> that SMP is a problem.
>
> Any other suggestions?
>
> TIA,
>
> Ron
> ./.