Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Scott Wood
On Sun, Oct 11, 2009 at 06:35:08PM +0200, Joakim Tjernlund wrote:
 This is an assembler version to fixup DAR not being set
 by dcbX, icbi instructions. There are two versions, one
 uses self-modifying code, the other uses a
 jump table but is much bigger (default).
 ---
  arch/powerpc/kernel/head_8xx.S |  146 +++-
  1 files changed, 145 insertions(+), 1 deletions(-)
 
 diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
 index 093176c..9839e79 100644
 --- a/arch/powerpc/kernel/head_8xx.S
 +++ b/arch/powerpc/kernel/head_8xx.S
 @@ -494,7 +494,8 @@ DataTLBError:
  
   mfspr   r10, SPRN_DAR
   cmpwi   cr0, r10, 0x00f0
 - beq-    2f      /* must be a buggy dcbX, icbi insn. */
 + beq-    FixDAR  /* must be a buggy dcbX, icbi insn. */
 +DARFix:  /* Return from dcbx instruction bug workaround, r10 holds value of DAR */

Both FixDAR and DARFix?  Could we make the labels a little clearer?

 +/* This is the procedure to calculate the data EA for buggy dcbx,dcbi instructions
 + * by decoding the registers used by the dcbx instruction and adding them.
 + * DAR is set to the calculated address and r10 also holds the EA on exit.
 + */

How often does this happen?  Could we just do it in C code after saving all
the registers, and avoid the self modifying stuff (or the big switch
statement equivalent)?
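
As a rough illustration of the C alternative raised above, here is a minimal user-space sketch of the address calculation: decode the RA and RB fields of the faulting X-form instruction and add the register values, treating RA = 0 as the constant zero. The names `is_cache_op`, `dcbx_ea` and the `regs` array are hypothetical and are not taken from the patch; a real handler would read the saved registers from the exception frame.

```c
#include <assert.h>
#include <stdint.h>

/* Extended opcodes (XO field) of the PowerPC cache instructions. */
enum { XO_DCBST = 54, XO_DCBF = 86, XO_DCBI = 470, XO_ICBI = 982, XO_DCBZ = 1014 };

/* Return 1 if insn is one of the X-form cache instructions listed above. */
static int is_cache_op(uint32_t insn)
{
    if ((insn >> 26) != 31)        /* primary opcode must be 31 */
        return 0;
    switch ((insn >> 1) & 0x3ff) { /* 10-bit extended opcode */
    case XO_DCBST: case XO_DCBF: case XO_DCBI: case XO_ICBI: case XO_DCBZ:
        return 1;
    default:
        return 0;
    }
}

/*
 * Compute EA = (RA|0) + (RB) for an X-form cache instruction.
 * regs[] stands in for the 32 saved GPRs (an assumption of this sketch).
 */
static uint32_t dcbx_ea(uint32_t insn, const uint32_t regs[32])
{
    uint32_t ra = (insn >> 16) & 0x1f;  /* RA field */
    uint32_t rb = (insn >> 11) & 0x1f;  /* RB field */
    uint32_t base;

    assert(regs != 0);
    base = (ra == 0) ? 0 : regs[ra];    /* RA == 0 means literal zero, not r0 */
    return base + regs[rb];             /* wraps modulo 2^32, like the hardware */
}
```

For example, `dcbz r3,r4` encodes as 0x7C0327EC, so with r3 = 0x1000 and r4 = 0x20 the sketch yields an EA of 0x1020.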

-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Joakim Tjernlund
Scott Wood scottw...@freescale.com wrote on 14/10/2009 19:20:03:

 On Sun, Oct 11, 2009 at 06:35:08PM +0200, Joakim Tjernlund wrote:
  This is an assembler version to fixup DAR not being set
  by dcbX, icbi instructions. There are two versions, one
  uses self-modifying code, the other uses a
  jump table but is much bigger (default).
  ---
   arch/powerpc/kernel/head_8xx.S |  146 +++-
   1 files changed, 145 insertions(+), 1 deletions(-)
 
  diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
  index 093176c..9839e79 100644
  --- a/arch/powerpc/kernel/head_8xx.S
  +++ b/arch/powerpc/kernel/head_8xx.S
  @@ -494,7 +494,8 @@ DataTLBError:
 
  mfspr   r10, SPRN_DAR
  cmpwi   cr0, r10, 0x00f0
  -   beq-   2f   /* must be a buggy dcbX, icbi insn. */
  +   beq-   FixDAR   /* must be a buggy dcbX, icbi insn. */
  +DARFix:   /* Return from dcbx instruction bug workaround, r10 holds value of DAR */

 Both FixDAR and DARFix?  Could we make the labels a little clearer?

Yes, need to come up with better names :)


  +/* This is the procedure to calculate the data EA for buggy dcbx,dcbi instructions
  + * by decoding the registers used by the dcbx instruction and adding them.
  + * DAR is set to the calculated address and r10 also holds the EA on exit.
  + */

 How often does this happen?  Could we just do it in C code after saving all
 the registers, and avoid the self modifying stuff (or the big switch
 statement equivalent)?

I had some problems with the C version. I got lots of extra TLB errors for the
same address, so I am not confident it will work in the long run.

BTW, you could add a test and printk in do_page_fault on address 0x00f0.
If that ever hits, there is a problem with the dcbX fixup.
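
A minimal sketch of the kind of check suggested here, written as a user-space stand-in so it can be compiled on its own. `DAR_SENTINEL`, `check_dar_sentinel` and the message text are hypothetical; in the kernel the test would sit in do_page_fault and use printk instead of fprintf.

```c
#include <stdint.h>
#include <stdio.h>

/* Value the DTLB error handler compares DAR against in the quoted patch. */
#define DAR_SENTINEL 0x00f0u

/*
 * Return 1 and log a warning if a fault address still carries the
 * sentinel, meaning the dcbX fixup did not replace it with a real EA.
 * Return 0 for any other address.
 */
static int check_dar_sentinel(uint32_t address)
{
    if (address == DAR_SENTINEL) {
        fprintf(stderr, "dcbX fixup missed: fault at 0x%08x\n",
                (unsigned int)address);
        return 1;
    }
    return 0;
}
```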



Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Scott Wood

Joakim Tjernlund wrote:
 BTW, you could add a test and printk in do_page_fault on address 0x00f0.
 if that ever hits there is a problem with dcbX fixup.

It doesn't get any 0xf0 faults.

FWIW, I'm not seeing the segfault any more, but I still get the lockup.

-Scott


Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Joakim Tjernlund
Scott Wood scottw...@freescale.com wrote on 14/10/2009 21:23:02:
 Joakim Tjernlund wrote:
  BTW, you could add a test and printk in do_page_fault on address 0x00f0.
  if that ever hits there is a problem with dcbX fixup.

 It doesn't get any 0xf0 faults.

 FWIW, I'm not seeing the segfault any more, but I still get the lockup.

Have you reverted
 8xx: start using dcbX instructions in various copy routines ?

After that you could stick a
 b DataAccess

 directly in the DTLB error handler to skip any dcbX fixups.



Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Scott Wood

Joakim Tjernlund wrote:
 Scott Wood scottw...@freescale.com wrote on 14/10/2009 21:23:02:
  Joakim Tjernlund wrote:
   BTW, you could add a test and printk in do_page_fault on address 0x00f0.
   if that ever hits there is a problem with dcbX fixup.

  It doesn't get any 0xf0 faults.

  FWIW, I'm not seeing the segfault any more, but I still get the lockup.

 Have you reverted
  8xx: start using dcbX instructions in various copy routines ?

 After that you could stick a
  b DataAccess

  directly in the DTLB error handler to skip any dcbX fixups.

With that, I don't see the hard lockup, but things get stuck during
bootup with everything idle.  I see this even if I revert everything but
the invalidate non present TLBs patch, and I was seeing similar things
sometimes with the other tlbil_va hacks.

I think there's something else going on in the 2.6 8xx code that needs
to be fixed before we can tell what the impact of these patches is.
I'll look into it.


-Scott


Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Joakim Tjernlund

Scott Wood scottw...@freescale.com wrote on 14/10/2009 22:22:25:

 Joakim Tjernlund wrote:
  Scott Wood scottw...@freescale.com wrote on 14/10/2009 21:23:02:
   Joakim Tjernlund wrote:
    BTW, you could add a test and printk in do_page_fault on address 0x00f0.
    if that ever hits there is a problem with dcbX fixup.

   It doesn't get any 0xf0 faults.

   FWIW, I'm not seeing the segfault any more, but I still get the lockup.

  Have you reverted
   8xx: start using dcbX instructions in various copy routines ?

  After that you could stick a
   b DataAccess

   directly in the DTLB error handler to skip any dcbX fixups.

 With that, I don't see the hard lockup, but things get stuck during

You needed both to lose the hard lockup? I would think
it should be enough to revert the various copy routines stuff?
I figure that these routines aren't working in 8xx for other reasons
since they haven't been used on 8xx since at least early 2.4.

 bootup with everything idle.  I see this even if I revert everything but
 the invalidate non present TLBs patch, and I was seeing similar things
 sometimes with the other tlbil_va hacks.

OK, something else is up.


 I think there's something else going on in the 2.6 8xx code that needs
 to be fixed before we can tell what the impact of these patches is.
 I'll look into it.

Great because I am really out of ideas. Perhaps back down to 2.6.30 and test
from there?



Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Scott Wood

Joakim Tjernlund wrote:
  With that, I don't see the hard lockup, but things get stuck during

 You needed both to lose the hard lockup? I would think
 it should be enough to revert the various copy routines stuff?

No, but when I just reverted the patch and didn't change the TLB error handler,
I got some other weirdness (assertion failure in some userspace program).  It
may have been coincidental, though.

  I think there's something else going on in the 2.6 8xx code that needs
  to be fixed before we can tell what the impact of these patches is.
  I'll look into it.

 Great because I am really out of ideas. Perhaps back down to 2.6.30 and test
 from there?

I think the last working version was a little older than that -- and it's quite
possible that there was underlying badness even earlier that just recently got
exposed.  I think I want to just debug it and find out what's really going on.


-Scott


Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Benjamin Herrenschmidt
On Wed, 2009-10-14 at 16:14 -0500, Scott Wood wrote:
 
 I think the last working version was a little older than that -- and it's quite
 possible that there was underlying badness even earlier that just recently got
 exposed.  I think I want to just debug it and find out what's really going on.

That would be good :-)

I've been itching to do that but without HW it's not trivial :-)

Cheers,
Ben.




Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Joakim Tjernlund
Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 14/10/2009 23:17:09:

 On Wed, 2009-10-14 at 16:14 -0500, Scott Wood wrote:
 
  I think the last working version was a little older than that -- and it's quite
  possible that there was underlying badness even earlier that just recently got
  exposed.  I think I want to just debug it and find out what's really going on.

 That would be good :-)

 I've been itching to do that but without HW it's not trivial :-)

Meanwhile, how about the tlb asm you promised me? :)
It will be a challenge, I think, since you only have 2 GPRs.
I guess it would be possible to stash yet another reg since it
will fit in the cache line already used by the TLB handlers.

 Jocke



Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Benjamin Herrenschmidt
On Wed, 2009-10-14 at 23:41 +0200, Joakim Tjernlund wrote:
 Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 14/10/2009 23:17:09:
 
  On Wed, 2009-10-14 at 16:14 -0500, Scott Wood wrote:
  
   I think the last working version was a little older than that -- and it's quite
   possible that there was underlying badness even earlier that just recently got
   exposed.  I think I want to just debug it and find out what's really going on.
 
  That would be good :-)
 
  I've been itching to do that but without HW it's not trivial :-)
 
 Meanwhile, how about the tlb asm you promised me? :)
 It will be a challenge I think since you only have 2 GPRs
 I guess it would be possible to stash yet another reg since it
 will fit in the cache line already used by the TLB handlers.

Let's just get it working first :-)

Ben.




Re: [PATCH 4/8] 8xx: Fixup DAR from buggy dcbX instructions.

2009-10-14 Thread Joakim Tjernlund
Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 14/10/2009 23:52:10:

 On Wed, 2009-10-14 at 23:41 +0200, Joakim Tjernlund wrote:
  Benjamin Herrenschmidt b...@kernel.crashing.org wrote on 14/10/2009 23:17:09:
  
   On Wed, 2009-10-14 at 16:14 -0500, Scott Wood wrote:
   
    I think the last working version was a little older than that -- and it's quite
    possible that there was underlying badness even earlier that just recently got
    exposed.  I think I want to just debug it and find out what's really going on.
  
   That would be good :-)
  
   I've been itching to do that but without HW it's not trivial :-)
 
  Meanwhile, how about the tlb asm you promised me? :)
  It will be a challenge I think since you only have 2 GPRs
  I guess it would be possible to stash yet another reg since it
  will fit in the cache line already used by the TLB handlers.

 Let's just get it working first :-)

Chicken :):)
