Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.

2009-10-08 Thread Benjamin Herrenschmidt
On Thu, 2009-10-08 at 08:45 +0200, Joakim Tjernlund wrote:

  Generic code should sort it out in handle_mm_fault() (or earlier if it
  can't find a VMA at all).
 
 How can it? You need to know more than just read and write.

It does. It's going to look for the VMA, which will tell it what is
allowed or not. You'll notice 4xx/BookE doesn't use DSISR (except the
ESR bit we pass to separate loads from stores).

If the region has no access, the kernel will know it (no VMA for
example) and will trigger a SEGV.

Really, the DSISR stuff is not as necessary as you think it is :-) You
should be able to jump to C code straight from both TLB error
interrupts.
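
For concreteness, a minimal C sketch of that flow, loosely modeled on the
generic Linux fault path of the time (the function name, the simplified
locking, the ignored handle_mm_fault() return value and the missing
stack-growth handling are all shortcuts of this sketch, not the actual 8xx
entry code):

    #include <linux/mm.h>
    #include <linux/sched.h>
    #include <linux/signal.h>
    #include <asm/ptrace.h>

    /* powerpc helper that delivers the signal (arch/powerpc/kernel/traps.c) */
    extern void _exception(int signr, struct pt_regs *regs, int code,
                           unsigned long addr);

    /* Sketch: the TLB error entry only needs to hand over the faulting
     * address and whether it was a store; the VMA carries the rest of the
     * permission information.
     */
    static void sketch_tlb_error_fault(struct pt_regs *regs,
                                       unsigned long address, int is_write)
    {
            struct mm_struct *mm = current->mm;
            struct vm_area_struct *vma;

            down_read(&mm->mmap_sem);
            vma = find_vma(mm, address);
            if (!vma || vma->vm_start > address)
                    goto bad_area;          /* no mapping: SIGSEGV */
            if (is_write && !(vma->vm_flags & VM_WRITE))
                    goto bad_area;          /* store into a read-only VMA */

            /* Generic code fixes up ACCESSED/DIRTY, faults pages in, etc.
             * (return value checks omitted in this sketch).
             */
            handle_mm_fault(mm, vma, address, is_write ? FAULT_FLAG_WRITE : 0);
            up_read(&mm->mmap_sem);
            return;

    bad_area:
            up_read(&mm->mmap_sem);
            _exception(SIGSEGV, regs, SEGV_MAPERR, address);
    }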

 
  But that's a slow path anyways.
 
 How so? You take a TLB Error for the first write to
 every page.

Compared to the TLB miss that is :-) But my main point is that a TLB
error caused by a lack of DIRTY or ACCESSED will be rare.

Ben.



[PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.

2009-10-07 Thread Joakim Tjernlund
Update the TLB asm to make proper use of _PAGE_DIRTY and _PAGE_ACCESSED.
Pros:
 - I/D TLB Miss never needs to write to the linux pte.
 - _PAGE_ACCESSED is only set on TLB Error, fixing accounting
 - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
when a page has been made dirty.
 - Proper RO/RW mapping of user space (see the sketch below).
 - Free up 2 SW TLB bits in the linux pte (add back _PAGE_WRITETHRU?)
Cons:
 - 1 more instruction in I/D TLB Miss, but since the linux pte is
   not written anymore, it should still be a big win.
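
For illustration only (not part of the patch; linux_pte_to_hw() is a made-up
helper name), the gist of the new bit handling in C:

    /* Bit values as defined by this patch in pte-8xx.h. */
    #define _PAGE_DIRTY     0x0100  /* the hardware Changed (C) bit */
    #define _PAGE_ACCESSED  0x0020  /* software only */
    #define _PAGE_RW        0x0400  /* lsb PP bit, opposite polarity in HW */

    /* Because _PAGE_DIRTY now sits on the hardware Changed bit, setting it
     * in the Linux PTE needs no _PAGE_HWWRITE shadow copy, and because the
     * hardware reads the PP write-permission bit with the opposite sense,
     * the TLB reload only has to flip it -- the "xori r10, r10, _PAGE_RW"
     * in the handlers below.
     */
    static inline unsigned long linux_pte_to_hw(unsigned long pte)
    {
            return pte ^ _PAGE_RW;  /* invert RW for the hardware view */
    }
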
---
 arch/powerpc/include/asm/pte-8xx.h |   13 +++---
 arch/powerpc/kernel/head_8xx.S |   82 
 2 files changed, 43 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
index 8c6e312..f23cd15 100644
--- a/arch/powerpc/include/asm/pte-8xx.h
+++ b/arch/powerpc/include/asm/pte-8xx.h
@@ -32,22 +32,21 @@
 #define _PAGE_FILE 0x0002  /* when !present: nonlinear file mapping */
 #define _PAGE_NO_CACHE 0x0002  /* I: cache inhibit */
 #define _PAGE_SHARED   0x0004  /* No ASID (context) compare */
+#define _PAGE_DIRTY0x0100  /* C: page changed */
 
-/* These five software bits must be masked out when the entry is loaded
- * into the TLB.
+/* These 3 software bits must be masked out when the entry is loaded
+ * into the TLB, 2 SW bits left.
  */
 #define _PAGE_EXEC 0x0008  /* software: i-cache coherency required */
 #define _PAGE_GUARDED  0x0010  /* software: guarded access */
-#define _PAGE_DIRTY0x0020  /* software: page changed */
-#define _PAGE_RW   0x0040  /* software: user write access allowed */
-#define _PAGE_ACCESSED 0x0080  /* software: page referenced */
+#define _PAGE_ACCESSED 0x0020  /* software: page referenced */
 
 /* Setting any bits in the nibble with the follow two controls will
  * require a TLB exception handler change.  It is assumed unused bits
  * are always zero.
  */
-#define _PAGE_HWWRITE  0x0100  /* h/w write enable: never set in Linux PTE */
-#define _PAGE_USER 0x0800  /* One of the PP bits, the other is USER&~RW */
+#define _PAGE_RW   0x0400  /* lsb PP bits, inverted in HW */
+#define _PAGE_USER 0x0800  /* msb PP bits */
 
 #define _PMD_PRESENT   0x0001
 #define _PMD_BAD   0x0ff0
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 118bb05..3cf1289 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -333,26 +333,18 @@ InstructionTLBMiss:
mfspr   r11, SPRN_MD_TWC/* and get the pte address */
lwz r10, 0(r11) /* Get the pte */
 
-#ifdef CONFIG_SWAP
-   /* do not set the _PAGE_ACCESSED bit of a non-present page */
-   andi.   r11, r10, _PAGE_PRESENT
-   beq 4f
-   ori r10, r10, _PAGE_ACCESSED
-   mfspr   r11, SPRN_MD_TWC/* get the pte address again */
-   stw r10, 0(r11)
-4:
-#else
-   ori r10, r10, _PAGE_ACCESSED
-   stw r10, 0(r11)
-#endif
+   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
+   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
+   bne-   cr0, 2f
+   /* Don't bother with PP lsb, bit 21 for now */
 
/* The Linux PTE won't go exactly into the MMU TLB.
-* Software indicator bits 21, 22 and 28 must be clear.
+* Software indicator bits 22 and 28 must be clear.
 * Software indicator bits 24, 25, 26, and 27 must be
 * set.  All other Linux PTE bits control the behavior
 * of the MMU.
 */
-2: li  r11, 0x00f0
+   li  r11, 0x00f0
rlwimi  r10, r11, 0, 24, 28 /* Set 24-27, clear 28 */
DO_8xx_CPU6(0x2d80, r3)
mtspr   SPRN_MI_RPN, r10/* Update TLB entry */
@@ -365,6 +357,19 @@ InstructionTLBMiss:
lwz r3, 8(r0)
 #endif
rfi
+2:
+   mfspr   r11, SRR1
+   rlwinm  r11, r11, 0, 5, 3 /* clear guarded */
+   mtspr   SRR1, r11
+
+   mfspr   r10, SPRN_M_TW  /* Restore registers */
+   lwz r11, 0(r0)
+   mtcr    r11
+   lwz r11, 4(r0)
+#ifdef CONFIG_8xx_CPU6
+   lwz r3, 8(r0)
+#endif
+   b   InstructionAccess
 
. = 0x1200
 DataStoreTLBMiss:
@@ -409,21 +414,14 @@ DataStoreTLBMiss:
DO_8xx_CPU6(0x3b80, r3)
mtspr   SPRN_MD_TWC, r11
 
-#ifdef CONFIG_SWAP
-   /* do not set the _PAGE_ACCESSED bit of a non-present page */
-   andi.   r11, r10, _PAGE_PRESENT
-   beq 4f
-   ori r10, r10, _PAGE_ACCESSED
-4:
-   /* and update pte in table */
-#else
-   ori r10, r10, _PAGE_ACCESSED
-#endif
-   mfspr   r11, SPRN_MD_TWC/* get the pte address again */
-   stw r10, 0(r11)
+   andi.   r11, r10, _PAGE_ACCESSED
+   bne+   cr0, 5f /* branch if access allowed */
+   rlwinm  r10, r10, 0, 21, 19 /* Clear _PAGE_USER */
+   ori r10, r10, _PAGE_RW  /* Set RW bit for xor below to clear it */
+5: xori    r10, r10, _PAGE_RW  /* invert RW bit */

Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.

2009-10-07 Thread Benjamin Herrenschmidt
On Thu, 2009-10-08 at 00:08 +0200, Joakim Tjernlund wrote:
 
 Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 07/10/2009 
 23:14:52:
 
  On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
 
   +   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
   +   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
   +   bne-   cr0, 2f
 
  Did you mean _PAGE_PRESENT | _PAGE_ACCESSED ?
 
   +2:
   +   mfspr   r11, SRR1
   +   rlwinm   r11, r11, 0, 5, 3 /* clear guarded */
   +   mtspr   SRR1, r11
 
  What is the above for ?
 
 TLB Miss will set that bit unconditionally and that is
 the same bit as protection error in TLB error.

And ? Big deal :-) IE. Once you get to InstructionAccess, it doesn't
matter if that bit is set, does it ?

 Let's start simple, shall we? :)
 Anyhow, I looked some more at that and I don't think the best thing is
 to use shifts. All bits are correct if you invert RW and add an exception
 for extended coding.

Right, as long as you avoid doing a conditional branch :-)

 Because if you go to C with a protection fault, you are in trouble.

Why ?

 So deal with it here. Now, I got another idea too that will make this go away
 if it works out

I don't understand your point about protection faults.

You should be able to go straight to C with -anything-, that's what I do
for all other platforms.

Ben.



Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.

2009-10-07 Thread Joakim Tjernlund
Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 08/10/2009 00:20:17:


 On Thu, 2009-10-08 at 00:08 +0200, Joakim Tjernlund wrote:
 
  Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 07/10/2009 
  23:14:52:
  
   On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
  
+   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
+   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
+   bne-   cr0, 2f
  
   Did you mean _PAGE_PRESENT | _PAGE_ACCESSED ?

YES! Cut-and-paste error; will send a new, much improved patch
with my new idea.

  
+2:
+   mfspr   r11, SRR1
+   rlwinm   r11, r11, 0, 5, 3 /* clear guarded */
+   mtspr   SRR1, r11
  
   What is the above for ?
 
  TLB Miss will set that bit unconditionally and that is
  the same bit as protection error in TLB error.

 And ? Big deal :-) IE. Once you get to InstructionAccess, it doesn't
 matter if that bit is set, does it ?

Yes it does. If one adds HWEXEC it will fail, right?
Also this counts as a read and you could easily end up
in the protection case (in 2.4 you do)


  Let's start simple, shall we? :)
  Anyhow, I looked some more at that and I don't think the best thing is
  to use shifts. All bits are correct if you invert RW and add an exception
  for extended coding.

 Right, as long as you avoid doing a conditional branch :-)

hey, I think you have to show how then :) I am not
good at ppc shift, mask, rotate insn.


  Because if you go to C with a protection fault, you are in trouble.

 Why ?

In 2.4 you end up in a read protection fault and get a SEGV back :)


  So deal with it here. Now, I got another idea too that will make this go away
  if it works out

 I don't understand your point about protection faults.

 You should be able to go straight to C with -anything-, that's what I do
 for all other platforms.

Well, you don't force a TLB error like I do; however, my new version
handles this better.

Now I only handle DIRTY and the rest in C. Figured it is
much faster and really simple now, stay tuned.



Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.

2009-10-07 Thread Joakim Tjernlund
Joakim Tjernlund/Transmode wrote on 08/10/2009 01:11:23:

 Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 08/10/2009 
 00:20:17:
 
  On Thu, 2009-10-08 at 00:08 +0200, Joakim Tjernlund wrote:
  
   Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 07/10/2009 
   23:14:52:
   
On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
   
 +   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
 +   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
 +   bne-   cr0, 2f
   
Did you mean _PAGE_PRESENT | _PAGE_ACCESSED ?

 YES! Cut-and-paste error; will send a new, much improved patch
 with my new idea.

So here it is (on top for now), what do you think?

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 8c4c416..fea9f5b 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -339,8 +339,8 @@ InstructionTLBMiss:
mfspr   r11, SPRN_MD_TWC/* and get the pte address */
lwz r10, 0(r11) /* Get the pte */

-   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
-   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
+   andi.   r21, r20, _PAGE_ACCESSED | _PAGE_PRESENT
+   cmpwi   cr0, r21, _PAGE_ACCESSED | _PAGE_PRESENT
bne-   cr0, 2f
/* Don't bother with PP lsb, bit 21 for now */

@@ -365,7 +365,10 @@ InstructionTLBMiss:
rfi
 2:
mfspr   r11, SRR1
-   rlwinm  r11, r11, 0, 5, 3 /* clear guarded */
+   /* clear all error bits as TLB Miss
+* sets a few unconditionally
+   */
+   rlwinm  r21, r21, 0, 0x
mtspr   SRR1, r11

mfspr   r10, SPRN_M_TW  /* Restore registers */
@@ -422,8 +425,8 @@ DataStoreTLBMiss:

andi.   r11, r10, _PAGE_ACCESSED
bne+   cr0, 5f /* branch if access allowed */
-   rlwinm  r10, r10, 0, 21, 19 /* Clear _PAGE_USER */
-   ori r10, r10, _PAGE_RW  /* Set RW bit for xor below to clear it */
+   /* Need to know if load/store - force a TLB Error */
+   rlwinm  r20, r20, 0, 0, 30 /* Clear _PAGE_PRESENT */
5: xori    r10, r10, _PAGE_RW  /* invert RW bit */

/* The Linux PTE won't go exactly into the MMU TLB.
@@ -482,8 +485,11 @@ DARFix:/* Return from dcbx instruction bug workaround, r10 holds value of DAR *
/* First, make sure this was a store operation.
*/
mfspr   r11, SPRN_DSISR
-   andis.  r11, r11, 0x4000 /* no translation */
-   bne 2f  /* branch if set */
+   andis.  r21, r21, 0x4800/* !translation or protection */
+   bne 2f  /* branch if either is set */
+   /* Only Change bit left now, do it here as it is faster
+* than trapping to the C fault handler.
+   */

/* The EA of a data TLB miss is automatically stored in the MD_EPN
 * register.  The EA of a data TLB error is automatically stored in
@@ -533,16 +539,8 @@ DARFix:/* Return from dcbx instruction bug workaround, r10 holds value of DAR *
mfspr   r11, SPRN_MD_TWC/* and get the pte address */
lwz r10, 0(r11) /* Get the pte */

-   mfspr   r11, DSISR
-   andis.  r11, r11, 0x0200/* store */
-   beq 5f
-   andi.   r11, r10, _PAGE_RW  /* writeable? */
-   beq 2f /* nope */
-   ori r10, r10, _PAGE_DIRTY|_PAGE_HWWRITE
-5: ori r10, r10, _PAGE_ACCESSED
-   mfspr   r11, MD_TWC /* Get pte address again */
+   ori r10, r10, _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_HWWRITE
stw r10, 0(r11) /* and update pte in table */
-
xori    r10, r10, _PAGE_RW  /* RW bit is inverted */

/* The Linux PTE won't go exactly into the MMU TLB.
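
Restated in C for readability (an illustrative paraphrase of the new
DataTLBError fast path above, not code from the patch; the function name is
invented, and it mirrors the bits the diff sets, including _PAGE_HWWRITE
which this follow-up still references):

    #include <asm/pgtable.h>

    /* 0x40000000 is the DSISR "no translation" bit and 0x08000000 the
     * protection-violation bit; the asm tests both at once with
     * "andis. r21, r21, 0x4800".
     */
    static int sketch_dtlb_error_fastpath(unsigned long dsisr, pte_t *ptep)
    {
            if (dsisr & 0x48000000)
                    return 0;       /* a real fault: go to the C handler */

            /* Only the Changed bit is left to fix up: set DIRTY (plus
             * ACCESSED, and HWWRITE as the diff above still does) in the
             * Linux PTE; the asm then writes the entry into the TLB with
             * the RW bit inverted, skipping the C fault handler entirely.
             */
            *ptep = __pte(pte_val(*ptep) | _PAGE_DIRTY | _PAGE_ACCESSED |
                          _PAGE_HWWRITE);
            return 1;               /* handled in the exception path */
    }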





Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.

2009-10-07 Thread Benjamin Herrenschmidt

 Yes it does. If one adds HWEXEC it will fail, right?

Why ? We can just filter out DSISR, we don't really care why it failed
as long as we know whether it was a store or not.
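
A minimal sketch of that filtering (the helper name here is invented;
0x02000000 is the store bit the patch's asm already tests as
"andis. r11, r11, 0x0200"):

    #include <asm/ptrace.h>

    /* The only DSISR fact the C fault path needs is load vs. store. */
    static inline int fault_was_store(struct pt_regs *regs)
    {
            return (regs->dsisr & 0x02000000) != 0; /* DSISR store-access bit */
    }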

 Also this counts as a read and you could easily end up
 in the protection case (in 2.4 you do)

I'm not sure what you mean by the protection case. Again, the C code
shouldn't care.

 hey, I think you have to show how then :) I am not
 good at ppc shift, mask, rotate insn.

They are so fun ! :-0

 
   Because if you go to C with a protection fault, you are in trouble.
 
  Why ?
 
 In 2.4 you end up in a read protection fault and get a SEGV back :)

We probably should ignore the DSISR bits then. So it just goes to
generic C code which then fixes up ACCESSED etc... and returns.
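
Roughly what that generic fix-up boils down to for a page that is already
mapped (a simplified sketch using the update_mmu_cache() signature of that
era; the real 2.6 path goes through handle_mm_fault()/ptep_set_access_flags(),
and the function name here is invented):

    #include <linux/mm.h>

    /* Sketch: the page is present, so the fault only means ACCESSED (and,
     * for a store to a writable PTE, DIRTY) still need to be set.
     */
    static void sketch_fixup_young_dirty(struct mm_struct *mm,
                                         struct vm_area_struct *vma,
                                         unsigned long address,
                                         pte_t *ptep, int is_write)
    {
            pte_t entry = *ptep;

            entry = pte_mkyoung(entry);             /* _PAGE_ACCESSED */
            if (is_write && pte_write(entry))
                    entry = pte_mkdirty(entry);     /* _PAGE_DIRTY */
            set_pte_at(mm, address, ptep, entry);
            update_mmu_cache(vma, address, entry);  /* arch hook, may preload the TLB */
    }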

 Now I only handle DIRTY and the rest in C. Figured it is
 much faster and really simple now, stay tuned.

You should not even have to handle DIRTY at all in asm. At least in 2.6.
I can't vouch for what 2.4 generic code does... You should really port
your board over :-)

Cheers,
Ben.




Re: [PATCH 2/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.

2009-10-07 Thread Benjamin Herrenschmidt
On Thu, 2009-10-08 at 02:19 +0200, Joakim Tjernlund wrote:
 Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 08/10/2009 
 02:04:56:
 
 
   Yes it does. If one adds HWEXEC it will fail, right?
 
  Why ? We can just filter out DSISR, we don't really care why it failed
  as long as we know whether it was a store or not.
 
   Also this counts as a read and you could easily end up
   in the protection case (in 2.4 you do)
 
  I'm not sure what you mean by the protection case. Again, the C code
  shouldn't care.
 
 it does, and it should. How else should you know if you try
 to read a NA space?

Generic code should sort it out in handle_mm_fault() (or earlier if it
can't find a VMA at all).

The DSISR munging is really not necessary I believe.

 2.4 and 2.6 have the same handling in asm.

Yeah but the C code, especially the generic part, is different.

 hmm, maybe I should just call C, but 8xx isn't a speed monster so every
 cycle counts :)

But that's a slow path anyways.

 It works if I trap to C for DIRTY too.
 Before thinking about porting my old board, I want 2.4 to enjoy
 the new TLB code too :)

Hehehe.

Cheers,
Ben.

