Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
On Thu, 2009-10-08 at 08:45 +0200, Joakim Tjernlund wrote:

> > Generic code should sort it out in handle_mm_fault() (or earlier if it
> > can't find a VMA at all).
>
> How can it? You need to know more than just read and write.

It does. It's going to look for the VMA, which will tell it what is
allowed or not. You'll notice 4xx/BookE doesn't use DSISR (except the
ESR bit we pass to separate loads from stores). If the region has no
access, the kernel will know it (no VMA, for example) and will trigger a
SEGV. Really, the DSISR stuff is not as necessary as you think it is :-)

You should be able to jump to C code straight from both TLB error
interrupts.

> > But that's a slow path anyways.
>
> How so? You take a TLB Error for the first write to every page.

Compared to the TLB miss, that is :-) But my main point is that a TLB
error caused by a lack of DIRTY or ACCESSED will be rare.

Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
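Ben's point above, that the generic fault path derives permissions from the VMA rather than from DSISR protection bits, can be sketched roughly as follows. This is an illustrative model only; `struct vma`, `access_allowed` and the `VM_*` values are stand-ins, not the real kernel symbols:

```c
#include <stdbool.h>

#define VM_READ  0x1u
#define VM_WRITE 0x2u
#define VM_EXEC  0x4u

struct vma {
    unsigned long start, end;
    unsigned long flags;
};

/* Decide whether the faulting access is permitted by the mapping.
 * A missing VMA, or a disallowed access, becomes a SEGV in the
 * generic path -- so the asm handler only needs to tell the C code
 * whether the access was a store, nothing more. */
static bool access_allowed(const struct vma *vma, bool is_write)
{
    if (!vma)
        return false;                 /* no mapping at all -> SEGV */
    if (is_write)
        return (vma->flags & VM_WRITE) != 0;
    return (vma->flags & (VM_READ | VM_EXEC)) != 0;
}
```

The point being made in the thread is exactly this: the permission decision needs only the VMA flags and a store/load indication, never the full DSISR decode.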
[PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
Update the TLB asm to make proper use of _PAGE_DIRTY and _PAGE_ACCESSED.

Pros:
 - I/D TLB Miss never needs to write to the Linux pte.
 - _PAGE_ACCESSED is only set on TLB Error, fixing accounting.
 - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
   when a page has been made dirty.
 - Proper RO/RW mapping of user space.
 - Frees up 2 SW TLB bits in the Linux pte (add back _PAGE_WRITETHRU?)

Cons:
 - 1 more instruction in I/D TLB Miss, but since the Linux pte is no
   longer written, it should still be a big win.
---
 arch/powerpc/include/asm/pte-8xx.h |   13 +++---
 arch/powerpc/kernel/head_8xx.S     |   82
 2 files changed, 43 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
index 8c6e312..f23cd15 100644
--- a/arch/powerpc/include/asm/pte-8xx.h
+++ b/arch/powerpc/include/asm/pte-8xx.h
@@ -32,22 +32,21 @@
 #define _PAGE_FILE	0x0002	/* when !present: nonlinear file mapping */
 #define _PAGE_NO_CACHE	0x0002	/* I: cache inhibit */
 #define _PAGE_SHARED	0x0004	/* No ASID (context) compare */
+#define _PAGE_DIRTY	0x0100	/* C: page changed */

-/* These five software bits must be masked out when the entry is loaded
- * into the TLB.
+/* These 3 software bits must be masked out when the entry is loaded
+ * into the TLB, 2 SW bits left.
  */
 #define _PAGE_EXEC	0x0008	/* software: i-cache coherency required */
 #define _PAGE_GUARDED	0x0010	/* software: guarded access */
-#define _PAGE_DIRTY	0x0020	/* software: page changed */
-#define _PAGE_RW	0x0040	/* software: user write access allowed */
-#define _PAGE_ACCESSED	0x0080	/* software: page referenced */
+#define _PAGE_ACCESSED	0x0020	/* software: page referenced */

 /* Setting any bits in the nibble with the follow two controls will
  * require a TLB exception handler change. It is assumed unused bits
  * are always zero.
  */
-#define _PAGE_HWWRITE	0x0100	/* h/w write enable: never set in Linux PTE */
-#define _PAGE_USER	0x0800	/* One of the PP bits, the other is USER&~RW */
+#define _PAGE_RW	0x0400	/* lsb PP bits, inverted in HW */
+#define _PAGE_USER	0x0800	/* msb PP bits */

 #define _PMD_PRESENT	0x0001
 #define _PMD_BAD	0x0ff0
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 118bb05..3cf1289 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -333,26 +333,18 @@ InstructionTLBMiss:
 	mfspr	r11, SPRN_MD_TWC	/* and get the pte address */
 	lwz	r10, 0(r11)	/* Get the pte */

-#ifdef CONFIG_SWAP
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
-	stw	r10, 0(r11)
-4:
-#else
-	ori	r10, r10, _PAGE_ACCESSED
-	stw	r10, 0(r11)
-#endif
+	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
+	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
+	bne-	cr0, 2f
+	/* Don't bother with PP lsb, bit 21 for now */

 	/* The Linux PTE won't go exactly into the MMU TLB.
-	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bits 22 and 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
 	 * of the MMU.
 	 */
-2:	li	r11, 0x00f0
+	li	r11, 0x00f0
 	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
 	DO_8xx_CPU6(0x2d80, r3)
 	mtspr	SPRN_MI_RPN, r10	/* Update TLB entry */
@@ -365,6 +357,19 @@ InstructionTLBMiss:
 	lwz	r3, 8(r0)
 #endif
 	rfi
+2:
+	mfspr	r11, SRR1
+	rlwinm	r11, r11, 0, 5, 3	/* clear guarded */
+	mtspr	SRR1, r11
+
+	mfspr	r10, SPRN_M_TW	/* Restore registers */
+	lwz	r11, 0(r0)
+	mtcr	r11
+	lwz	r11, 4(r0)
+#ifdef CONFIG_8xx_CPU6
+	lwz	r3, 8(r0)
+#endif
+	b	InstructionAccess

 	. = 0x1200
 DataStoreTLBMiss:
@@ -409,21 +414,14 @@ DataStoreTLBMiss:
 	DO_8xx_CPU6(0x3b80, r3)
 	mtspr	SPRN_MD_TWC, r11

-#ifdef CONFIG_SWAP
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-4:
-	/* and update pte in table */
-#else
-	ori	r10, r10, _PAGE_ACCESSED
-#endif
-	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
-	stw	r10, 0(r11)
+	andi.	r11, r10, _PAGE_ACCESSED
+	bne+	cr0, 5f	/* branch if access allowed */
+	rlwinm	r10, r10, 0, 21, 19	/* Clear _PAGE_USER */
+	ori	r10, r10, _PAGE_RW	/* Set RW bit for xor below to clear it */
+5:	xori	r10, r10,
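The `li r11, 0x00f0 / rlwimi r10, r11, 0, 24, 28` fixup in the patch above can be modelled in C to show exactly which bits it forces. This is a hedged sketch using the PPC big-endian bit numbering (bit 0 = MSB), not kernel code:

```c
#include <stdint.h>

/* Model of "li r11, 0x00f0; rlwimi r10, r11, 0, 24, 28":
 * bits 24..28 of the pte (mask 0x000000f8 when bit 0 is the MSB of a
 * 32-bit word) are replaced by the corresponding bits of 0x00f0, i.e.
 * bits 24-27 are set and bit 28 is cleared, matching the "Set 24-27,
 * clear 28" comment in the patch. */
static uint32_t pte_to_tlb_rpn(uint32_t pte)
{
    return (pte & ~0x000000f8u) | 0x000000f0u;
}
```

All other pte bits pass through untouched, which is why the software-only bits have to live outside this nibble.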
Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
On Thu, 2009-10-08 at 00:08 +0200, Joakim Tjernlund wrote:

> Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 07/10/2009 23:14:52:
>
> > On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
> >
> > > +	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
> > > +	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> > > +	bne-	cr0, 2f
> >
> > Did you mean _PAGE_PRESENT | _PAGE_ACCESSED ?
> >
> > > +2:
> > > +	mfspr	r11, SRR1
> > > +	rlwinm	r11, r11, 0, 5, 3	/* clear guarded */
> > > +	mtspr	SRR1, r11
> >
> > What is the above for ?
>
> TLB Miss will set that bit unconditionally and that is the same bit as
> protection error in TLB error.

And ? Big deal :-) IE. once you get to InstructionAccess, it doesn't
matter if that bit is set, does it ?

> Lets start simple, shall we? :)
>
> Anyhow, I looked some more at that and I don't think the best thing is
> to use shifts. All bits are correct if you invert RW and add an
> exception for extended coding.

Right, as long as you avoid doing a conditional branch :-)

> Because if you go to C with a protection fault, you are in trouble.

Why ?

> So deal with it here. Now, I got another idea too that will make this
> go away if it works out

I don't understand your point about protection faults. You should be
able to go straight to C with -anything-, that's what I do for all
other platforms.

Ben.
Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 08/10/2009 00:20:17:

> On Thu, 2009-10-08 at 00:08 +0200, Joakim Tjernlund wrote:
>
> > > > +	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
> > > > +	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> > > > +	bne-	cr0, 2f
> > >
> > > Did you mean _PAGE_PRESENT | _PAGE_ACCESSED ?

YES! Cut and paste error, will send a new much improved patch with my
new idea.

> > > > +2:
> > > > +	mfspr	r11, SRR1
> > > > +	rlwinm	r11, r11, 0, 5, 3	/* clear guarded */
> > > > +	mtspr	SRR1, r11
> > >
> > > What is the above for ?
> >
> > TLB Miss will set that bit unconditionally and that is the same bit
> > as protection error in TLB error.
>
> And ? Big deal :-) IE. once you get to InstructionAccess, it doesn't
> matter if that bit is set, does it ?

Yes it does. If one adds HWEXEC it will fail, right? Also this counts as
a read and you could easily end up in the protection case (in 2.4 you
do).

> > Lets start simple, shall we? :)
> >
> > Anyhow, I looked some more at that and I don't think the best thing
> > is to use shifts. All bits are correct if you invert RW and add an
> > exception for extended coding.
>
> Right, as long as you avoid doing a conditional branch :-)

Hey, I think you have to show how then :) I am not good at ppc shift,
mask, rotate insns.

> > Because if you go to C with a protection fault, you are in trouble.
>
> Why ?

In 2.4 you end up in a read protection fault and get a SEGV back :)

> > So deal with it here. Now, I got another idea too that will make this
> > go away if it works out
>
> I don't understand your point about protection faults. You should be
> able to go straight to C with -anything-, that's what I do for all
> other platforms.

Well, you don't force a TLB error like I do; however, my new version
handles this better. Now I only handle DIRTY and the rest in C. Figured
it is much faster and really simple now, stay tuned.
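For readers equally unfamiliar with the ppc rotate-and-mask instructions discussed here, the mask that `rlwinm` applies can be sketched in C. This is an illustrative model (IBM bit numbering, bit 0 = MSB), not kernel code:

```c
#include <stdint.h>

/* Build the mask rlwinm uses. For MB <= ME the mask covers bits
 * MB..ME; for MB > ME it wraps around, covering everything except
 * bits ME+1..MB-1. Bit 0 is the most significant bit. */
static uint32_t rlwinm_mask(int mb, int me)
{
    uint32_t hi = 0xffffffffu >> mb;         /* bits mb..31 set */
    uint32_t lo = 0xffffffffu << (31 - me);  /* bits 0..me set  */
    return (mb <= me) ? (hi & lo) : (hi | lo);
}

/* rlwinm rD, rS, 0, MB, ME with a zero rotate is just an AND
 * with that mask. */
static uint32_t rlwinm0(uint32_t rs, int mb, int me)
{
    return rs & rlwinm_mask(mb, me);
}
```

For instance, `rlwinm r10, r10, 0, 21, 19` keeps everything except bit 20 (0x0800, i.e. _PAGE_USER in the patch), and `rlwinm r11, r11, 0, 5, 3` clears bit 4 (0x08000000), the SRR1 bit the thread calls guarded/protection.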
Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
Joakim Tjernlund/Transmode wrote on 08/10/2009 01:11:23:

> > > > +	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
> > > > +	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> > > > +	bne-	cr0, 2f
> > >
> > > Did you mean _PAGE_PRESENT | _PAGE_ACCESSED ?
>
> YES! Cut and paste error, will send a new much improved patch with my
> new idea.

So here it is (on top for now), what do you think?

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 8c4c416..fea9f5b 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -339,8 +339,8 @@ InstructionTLBMiss:
 	mfspr	r11, SPRN_MD_TWC	/* and get the pte address */
 	lwz	r10, 0(r11)	/* Get the pte */

-	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
-	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
+	andi.	r21, r20, _PAGE_ACCESSED | _PAGE_PRESENT
+	cmpwi	cr0, r21, _PAGE_ACCESSED | _PAGE_PRESENT
 	bne-	cr0, 2f

 	/* Don't bother with PP lsb, bit 21 for now */
@@ -365,7 +365,10 @@ InstructionTLBMiss:
 	rfi
 2:
 	mfspr	r11, SRR1
-	rlwinm	r11, r11, 0, 5, 3	/* clear guarded */
+	/* clear all error bits as TLB Miss
+	 * sets a few unconditionally
+	 */
+	rlwinm	r21, r21, 0, 0x
 	mtspr	SRR1, r11

 	mfspr	r10, SPRN_M_TW	/* Restore registers */
@@ -422,8 +425,8 @@ DataStoreTLBMiss:

 	andi.	r11, r10, _PAGE_ACCESSED
 	bne+	cr0, 5f	/* branch if access allowed */
-	rlwinm	r10, r10, 0, 21, 19	/* Clear _PAGE_USER */
-	ori	r10, r10, _PAGE_RW	/* Set RW bit for xor below to clear it */
+	/* Need to know if load/store - force a TLB Error */
+	rlwinm	r20, r20, 0, 0, 30	/* Clear _PAGE_PRESENT */
 5:	xori	r10, r10, _PAGE_RW	/* invert RW bit */

 	/* The Linux PTE won't go exactly into the MMU TLB.
@@ -482,8 +485,11 @@ DARFix:	/* Return from dcbx instruction bug workaround, r10 holds value of DAR */
 	/* First, make sure this was a store operation.
 	 */
 	mfspr	r11, SPRN_DSISR
-	andis.	r11, r11, 0x4000	/* no translation */
-	bne	2f	/* branch if set */
+	andis.	r21, r21, 0x4800	/* !translation or protection */
+	bne	2f	/* branch if either is set */
+	/* Only Change bit left now, do it here as it is faster
+	 * than trapping to the C fault handler.
+	 */

 	/* The EA of a data TLB miss is automatically stored in the MD_EPN
 	 * register.  The EA of a data TLB error is automatically stored in
@@ -533,16 +539,8 @@ DARFix:	/* Return from dcbx instruction bug workaround, r10 holds value of DAR */
 	mfspr	r11, SPRN_MD_TWC	/* and get the pte address */
 	lwz	r10, 0(r11)	/* Get the pte */

-	mfspr	r11, DSISR
-	andis.	r11, r11, 0x0200	/* store */
-	beq	5f
-	andi.	r11, r10, _PAGE_RW	/* writeable? */
-	beq	2f	/* nope */
-	ori	r10, r10, _PAGE_DIRTY|_PAGE_HWWRITE
-5:	ori	r10, r10, _PAGE_ACCESSED
-	mfspr	r11, MD_TWC	/* Get pte address again */
+	ori	r10, r10, _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_HWWRITE
 	stw	r10, 0(r11)	/* and update pte in table */
-	xori	r10, r10, _PAGE_RW	/* RW bit is inverted */

 	/* The Linux PTE won't go exactly into the MMU TLB.
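The decision the DataTLBError path above makes can be sketched in C: if the !translation or protection bits are set in DSISR, bail out to the C fault handler; otherwise only the Change (dirty) case is left and the pte can be fixed up directly. A hedged model, with the bit values taken from the `andis. r21, r21, 0x4800` test in the patch (the patch also ORs in _PAGE_HWWRITE, omitted here for simplicity):

```c
#include <stdbool.h>
#include <stdint.h>

#define DSISR_NOTRANS  0x40000000u  /* no translation */
#define DSISR_PROT     0x08000000u  /* protection violation */

#define _PAGE_ACCESSED 0x0020u
#define _PAGE_DIRTY    0x0100u

/* Returns true when the fast path may handle the fault (only the
 * Change bit remains), updating the pte in place; false means fall
 * through to the C fault handler. */
static bool dtlb_error_fastpath(uint32_t dsisr, uint32_t *pte)
{
    if (dsisr & (DSISR_NOTRANS | DSISR_PROT))
        return false;                        /* let C sort it out */
    *pte |= _PAGE_DIRTY | _PAGE_ACCESSED;    /* record the write */
    return true;
}
```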
Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
> Yes it does. If one adds HWEXEC it will fail, right?

Why ? We can just filter out DSISR, we don't really care why it failed
as long as we know whether it was a store or not.

> Also this counts as a read and you could easily end up in the
> protection case (in 2.4 you do)

I'm not sure what you mean by the protection case. Again, the C code
shouldn't care.

> hey, I think you have to show how then :) I am not good at ppc shift,
> mask, rotate insns.

They are so fun ! :-0

> > > Because if you go to C with a protection fault, you are in trouble.
> >
> > Why ?
>
> In 2.4 you end up in a read protection fault and get a SEGV back :)

We probably should ignore the DSISR bits then. So it just goes to
generic C code which then fixes up ACCESSED etc... and returns.

> Now I only handle DIRTY and the rest in C. Figured it is much faster
> and really simple now, stay tuned.

You should not even have to handle DIRTY at all in asm. At least in 2.6.
I can't vouch for what 2.4 generic code does... You should really port
your board over :-)

Cheers,
Ben.
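Ben's "we only care whether it was a store" can be made concrete: the patch earlier in the thread tests the store indication with `andis. r11, r11, 0x0200`, i.e. bit 0x02000000 of DSISR. A minimal sketch of the only DSISR decode the C path would need:

```c
#include <stdbool.h>
#include <stdint.h>

/* The store bit the asm tests with "andis. r11, r11, 0x0200";
 * everything else in DSISR can be ignored by the generic fault path. */
#define DSISR_STORE 0x02000000u

static bool fault_was_store(uint32_t dsisr)
{
    return (dsisr & DSISR_STORE) != 0;
}
```

The generic handler then passes this single flag into the VMA permission check; any other DSISR detail (no-translation, protection) falls out of the VMA lookup anyway.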
Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
On Thu, 2009-10-08 at 02:19 +0200, Joakim Tjernlund wrote:

> Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 08/10/2009 02:04:56:
>
> > > Yes it does. If one adds HWEXEC it will fail, right?
> >
> > Why ? We can just filter out DSISR, we don't really care why it
> > failed as long as we know whether it was a store or not.
> >
> > > Also this counts as a read and you could easily end up in the
> > > protection case (in 2.4 you do)
> >
> > I'm not sure what you mean by the protection case. Again, the C code
> > shouldn't care.
>
> It does, and it should. How else should you know if you try to read a
> NA space?

Generic code should sort it out in handle_mm_fault() (or earlier if it
can't find a VMA at all). The DSISR munging is really not necessary I
believe.

> 2.4 and 2.6 have the same handling in asm.

Yeah but the C code, especially the generic part, is different.

> hmm, maybe I should just call C, but 8xx isn't a speed monster so
> every cycle counts :)

But that's a slow path anyways.

> It works if I trap to C for DIRTY too. Before thinking on porting my
> old board, I want 2.4 to enjoy the new TLB code too :)

Hehehe.

Cheers,
Ben.