(need volunteers to test the patch below on 8xx)

Hi,

I've been investigating the 8xx update_mmu_cache() oops for the last few
weeks, and here is what I have gathered.

Oops: kernel access of bad area, sig: 11 [#1]
NIP: C00049E8 LR: C000A5D0 SP: C4F53E10 REGS: c4f53d60 TRAP: 0300    Not tainted
MSR: 00009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 100113A0, DSISR: C2000000
TASK = c53f17e0[1224] 'a' THREAD: c4f52000
Last syscall: 47
GPR00: C783D2A0 C4F53E10 C53F17E0 10050000 00000100 0009F0A0 10050000 00000000
GPR08: 00075925 C783D2A0 C53F17E0 00000000 00076924 10077178 00000000 100B4338
GPR16: 100BBDE8 0ED792CE 7FFFF670 00000000 00000000 00000000 00000000 C4F41100
GPR24: 00000000 C4F3CAD4 C783D2A0 1005078C C4EB9140 C53861D0 04F85889 C034A0A0
NIP [c00049e8] __flush_dcache_icache+0x14/0x40
LR [c000a5d0] update_mmu_cache+0x64/0x98
Call trace:
 [c003fa7c] do_no_page+0x2f8/0x370
 [c003fc44] handle_mm_fault+0x88/0x160
 [c0009b58] do_page_fault+0x168/0x394
 [c0002c28] handle_page_fault+0xc/0x80

What is happening here is that update_mmu_cache() calls
__flush_dcache_icache() to sync the d-cache with memory and invalidate
any stale i-cache entries for the address being faulted in.

The problem is that the "dcbst" instruction will _sometimes_ (the
failure/success rate is about 1/4 with my test application) fault as a
_write_ operation on the data.

The address in question is always at the very beginning of the read-only
data section, so the write fault (as can be verified by the 0x02000000
bit in DSISR) is rejected because the vma is marked read-only
(VM_WRITE is not set in vma->vm_flags).

8xx machines running v2.6 are operating at the moment with a "tlbie()"
call in update_mmu_cache() just before __flush_dcache_icache(), which
works around the problem.

I've been able to watch the "problematic" TLB entry just before
update_mmu_cache().
Here it is:

SPR  824 : 0x10011f0b   268508939
BDI>rds 825
SPR  825 : 0x000001e0         480
BDI>rds 826
SPR  826 : 0x00001f00        7936

As you can see from bit 18 of the D-TLB debugging register MD_RAM1
(SPR 826), this entry is marked as invalid, which will invoke
DataTLBError on an access at this point and handle the fault properly
in most cases.

This is expected, and is how the sequence "DataTLBMiss" (no effective
address in TLB entry) -> "DataTLBError" (existent EA but valid bit not
set) works on 8xx.

Kumar Gala suggested inspecting the memory which holds
__flush_dcache_icache(). With the BDI I could verify that the
instruction sequence is there, intact.

I'm unable to determine why a "dcbst" fault is incorrectly being treated
as a WRITE operation. That seems to be the real problem. Likely to be
Yet Another CPU bug?

I've come up with a workaround which looks acceptable (unlike the tlbie
one).

The solution is to jump directly from the data TLB miss exception to
DataAccess, which in turn calls do_page_fault() and friends.

This prevents dcbst from being executed to sync an address with an
"invalid" TLB entry.

Signed-off-by: Marcelo Tosatti <marcelo.tosatti at cyclades.com>

--- a/arch/ppc/kernel/head_8xx.S.orig	2005-04-04 19:43:23.000000000 -0300
+++ b/arch/ppc/kernel/head_8xx.S	2005-04-04 19:47:40.000000000 -0300
@@ -359,9 +359,7 @@
 	. = 0x1200
 DataStoreTLBMiss:
-#ifdef CONFIG_8xx_CPU6
 	stw	r3, 8(r0)
-#endif
 	DO_8xx_CPU6(0x3f80, r3)
 	mtspr	M_TW, r10	/* Save a couple of working registers */
 	mfcr	r10
@@ -390,6 +388,16 @@
 	mfspr	r10, MD_TWC	/* ....and get the pte address */
 	lwz	r10, 0(r10)	/* Get the pte */
+	li	r3, 0
+	cmpw	r10, r3		/* does the pte contain a valid address? */
+	bne	4f
+	mfspr	r10, M_TW	/* Restore registers */
+	lwz	r11, 0(r0)
+	mtcr	r11
+	lwz	r11, 4(r0)
+	lwz	r3, 8(r0)
+	b	DataAccess
+4:
 	/* Insert the Guarded flag into the TWC from the Linux PTE.
 	 * It is bit 27 of both the Linux PTE and the TWC (at least
 	 * I got that right :-).  It will be better when we can put
@@ -419,9 +427,7 @@
 	lwz	r11, 0(r0)
 	mtcr	r11
 	lwz	r11, 4(r0)
-#ifdef CONFIG_8xx_CPU6
 	lwz	r3, 8(r0)
-#endif
 	rfi
 /* This is an instruction TLB error on the MPC8xx.  This could be due