----- Original Message -----
From: "Nick Fisher" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, December 12, 2003 7:00 PM
Subject: Re: [gentoo-user] What does this crash output mean?

> >> Hia,
> >> I have been having a few odd stability problems with one of my gentoo
> >> machines. It's a dual PIII500 with 768MB of ram running two SCSI drives
> >> in a software RAID 1 array. It had crashed a few times so I used the NMI
> >> Watchdog to try and figure out what was going on. Anyhow it's crashed
> >> again, this time I got to see the watchdog output....
> >>
> >> Bank 3: b20000000002010a
> >> Kernel Panic: CPU Context Corrupt
> >
> > The CPU context is the state of the registers, flags and program counter
> > at a certain point. For example, say you have two processes, p1 and p2.
> > When the timeslice for p1 is over and p2 gets the CPU, the CPU context
> > of p1 is saved; once p2 has run and p1 gets the CPU again, the old
> > context is restored. Now you might be able to imagine what "CPU Context
> > Corrupt" can mean. Since the CPU context is usually stored in RAM, it is
> > very likely that your RAM has problems. For fast task switches the CPU
> > context can also be stored in the CPU itself, so there is also a chance
> > that your CPU is bad.
> > Hope that helps. I'd run a memtest again; if you don't find anything, it
> > is probably the CPU.
> Interesting...... takes me back to my C and assembler days ;)
>
> Can anyone tell me what the other two mean? The 'bank' line appears to be
> an address.... could that be a clue as to what DIMM has gone squiffy
> (There are three in there)? Could it be as simple as DIMM Bank 3 at
> b20000000002010a is where it was trying to retrieve the data that was
> corrupt? If it were that simple I could just replace that DIMM....

Yes, I think this message is probably trying to say that the problem is in
Bank 3.

>
> And what of this Idle task not syncing? What's that all about? Is that
> just the upshot of the error on the previous line?

The idle task is the process that runs when no other process is runnable,
just to keep the processor busy with nothing :-) The error is of course a
result of that bad CPU context.
What I forgot to say is that it could also be a problem with the kernel
itself: if a bug in the kernel stores a bad CPU context in memory, you would
see the same effect. So trying a different kernel is also an option, to make
sure you don't buy new hardware before you know the old hardware is really
broken.

>
> Also if anyone has any good links to articles or posts dealing with these
> issues I would appreciate you posting them. Kernel crash debugging (basic)
> is an area I could do with a few more good sources on....

I think O'Reilly's "Understanding the Linux Kernel" might cover those things.
I haven't read the complete book though, just some parts concerning
scheduling.

>
> Many thanks
>
> Nick
>
> >> In Idle Task - Not Syncing
> >>
> >> I've found parts of the error on google... references to various things.
> >> The one almost exact match I got was here:
> >> http://www.ussg.iu.edu/hypermail/linux/kernel/0201.3/0382.html
> >>
> >> Sometimes the quoted problem is memory, or clock speed. However before I
> >> built this instance of Gentoo on there it was running as a Gentoo test
> >> machine for quite a while. During that time I ran memcheck and cpu_burn
> >> for days. No problems found and no crashes.
> >

--
[EMAIL PROTECTED] mailing list
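
One more way to read the "Bank 3" line, offered as a sketch rather than a diagnosis: assuming the line comes from the kernel's x86 machine check handler, the 64-bit number would be that bank's status register (IA32_MCi_STATUS in Intel's manuals) rather than a memory address, and "Bank 3" would be a CPU error-reporting bank rather than a DIMM slot. The bit layout below follows Intel's documented format; the function name and the dictionary are made up for this example.

```python
# Decode a 64-bit machine-check bank status value.
# Assumption: the value uses the IA32_MCi_STATUS layout from Intel's manuals.
STATUS_FLAGS = {
    63: "VAL (register contents valid)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mc_status(status):
    """Split a status value into its flag bits and the two 16-bit error codes."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_error_code": status & 0xFFFF,               # bits 15:0
    }


# The value from the "Bank 3" line of the crash output.
decoded = decode_mc_status(0xB20000000002010A)
for flag in decoded["flags"]:
    print(flag)
print("model-specific code: 0x%04x" % decoded["model_specific_code"])
print("MCA error code:      0x%04x" % decoded["mca_error_code"])
```

Under that assumption the PCC bit is set, which would match the "CPU Context Corrupt" panic text, and ADDRV is clear, so the bank did not record a faulting address. The MCA error code comes out as 0x010a which, if I read Intel's encoding correctly, falls in the cache hierarchy error class; take that as a hint about where to look, not as proof.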
