I've been tracking a tough problem with the M754LMR mainboard. The symptom
is that in intel_set_var_mtrr, MTRR MSR 0x200 (physbase 0) was getting
loaded with 0x106, which is invalid (bit 8 is reserved), and that resulted
in a GPF.
Using the Arium ICE, I have tracked it to the following code:
    // it is recommended that we disable and enable cache when we
    // do this.
    /* Disable cache */
    /* Write back the cache and flush TLB */
    asm volatile ("movl %%cr0, %0\n\t"
                  "orl $0x40000000, %0\n\t"
                  "wbinvd\n\t"
                  "movl %0, %%cr0\n\t"
                  "wbinvd\n\t":"=r" (tmp)::"memory");
As soon as the first wbinvd instruction is executed, the stack variable
holding the physbase (i.e. the memory location containing the variable)
is corrupted: it held 0x6, and that is replaced with an old value, so it
ends up as either 0x106 or 0x146.
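
For reference, here is a small sketch of why the good value 0x6 is accepted
while 0x106 and 0x146 fault. It only models the low 12 bits of a PHYSBASE
register as I understand them (type in bits 7:0, bits 11:8 reserved). The
helper names are made up for illustration and are not from our tree, and it
does not check the base address bits at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 7:0 of a variable-range PHYSBASE MSR hold the memory type. */
#define MTRR_TYPE_MASK      0xffULL
/* Bits 11:8 are reserved; setting any of them makes wrmsr fault. */
#define MTRR_BASE_RESERVED  0xf00ULL

/* Defined memory types: UC=0, WC=1, WT=4, WP=5, WB=6. */
static bool mtrr_type_is_valid(uint8_t type)
{
    return type == 0 || type == 1 || type == 4 || type == 5 || type == 6;
}

/*
 * Hypothetical helper (illustration only): returns true if the low
 * 12 bits of a value are acceptable for a PHYSBASE write.
 */
static bool mtrr_physbase_low_bits_ok(uint64_t value)
{
    if (value & MTRR_BASE_RESERVED)
        return false;
    return mtrr_type_is_valid((uint8_t)(value & MTRR_TYPE_MASK));
}
```

With this check, 0x6 (write-back) passes, while 0x106 and 0x146 both fail
because bit 8 is set; 0x146 would also fail on its type field (0x46).
Something like this just before the wrmsr would let us print the bad value
instead of taking the GPF.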
OK, so the cache has some junk inside, left over from before, I hope. Is
it possible our cache setup has gotten somewhat out of order? In other
words, do we need a wbinvd earlier in the setup, or is there something
wrong with this sequence? Are we somehow not cleaning the cache out
properly before we enable it? I'm stumped.
Collins has not seen this on his box. I have seen it on every single
M754LMR I've tried. One other person is also seeing this problem on one
other machine, a VIA system. It dies with the same POST code, 0x60, which
means it was trying to set MTRR MSR 0x200 and failed.
I'm worried that our cache setup and invalidate sequence may not be totally
solid any more.
ron