[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-30 Thread Joakim Tjernlund
> 
> On Mon, Nov 07, 2005 at 07:37:45PM +0100, Joakim Tjernlund wrote:
> >  > 
> > > On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > > > > -Original Message-
> > > > > From: Tom Rini [mailto:trini at kernel.crashing.org] 
> > > > > Sent: 07 November 2005 16:52
> > > > > To: Marcelo Tosatti
> > > > > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> > > > > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > > > > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > > > > 
> > > > > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo 
> Tosatti wrote:
> > > > > > Joakim!
> > > > > > 
> > > > > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim 
> > > Tjernlund wrote:
> > > > > > > Hi Marcelo
> > > > > > > 
> > > > > > > [SNIP] 
> > > > > > > > The root of the problem are the changes against 
> the 8xx TLB 
> > > > > > > > handlers introduced
> > > > > > > > during v2.6. What happens is the TLBMiss 
> handlers load the 
> > > > > > > > zeroed pte into
> > > > > > > > the TLB, causing the TLBError handler to be 
> invoked (thats 
> > > > > > > > two TLB faults per 
> > > > > > > > pagefault), which then jumps to the generic MM code to 
> > > > > setup the pte.
> > > > > > > > 
> > > > > > > > The bug is that the zeroed TLB is not invalidated (the 
> > > > > same reason
> > > > > > > > for the "dcbst" misbehaviour), resulting in infinite 
> > > > > TLBError faults.
> > > > > > > > 
> > > > > > > > Dan, I wonder why we just don't go back to v2.4 
> behaviour.
> > > > > > > 
> > > > > > > This is one reason why it is the way it is:
> > > > > > > 
> > > > > 
> > > 
> http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > > > > > This details are little fuzzy ATM, but I think the 
> > > reason for the
> > > > > > > current
> > > > > > > impl. was only that it was less intrusive to impl.
> > > > > > 
> > > > > > Ah, I see. I wonder if the bug is processor specific: we 
> > > > > don't have such
> > > > > > changes in our v2.4 tree and never experienced such problem.
> > > > > > 
> > > > > > It should be pretty easy to hit it right? (instruction 
> > > > > pagefaults should
> > > > > > fail).
> > > > > > 
> > > > > > Grigori, Tom, can you enlight us about the issue on the URL 
> > > > > above. How
> > > > > > can it be triggered?
> > > > > 
> > > > > So after looking at the code in 2.6.14 and current git, I 
> > > think the
> > > > > above URL isn't relevant, unless there was a change I 
> > > missed (which
> > > > > could totally be possible) that reverted the patch there and 
> > > > > fixed that
> > > > > issue in a different manner.  But since I didn't figure that 
> > > > > out until I
> > > > > had finished researching it again:
> > > > 
> > > > I wasn't clear enough. What I meant was that the above 
> patch made me
> > > > think and
> > > > the result was that I came up with a simpler fix, the "two 
> > > exception"
> > > > fix that
> > > > is in current kernels. See
> > > > 
> > >
http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/head_8xx.S@
> > > > 
> > > 1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/k
> > > ernel|hist
> > > > /arch/ppc/kernel/head_8xx.S
> > > > It appears this fix has some other issues :(
> > > > 
> > > > How do the other ppc arches do? I am guessing that they 
> don't double
> > > > fault, but bails
> > > > out to do_page_fault from the TLB Miss handler, like 
> 8xx used to do.
> > > 
> > > Assuming Dan doesn't come up with a more simple & better 
> fix, maybe we
> > > shoul

[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-16 Thread Marcelo Tosatti
On Sun, Nov 13, 2005 at 01:47:53PM +0100, Joakim Tjernlund wrote:
>  
> 
> > -Original Message-
> > From: Marcelo Tosatti [mailto:marcelo.tosatti at cyclades.com] 
> > Sent: den 12 november 2005 20:28
> > To: Joakim Tjernlund
> > Cc: Tom Rini; Dan Malek; gtolstolytkin at ru.mvista.com; 
> > linuxppc-embedded at ozlabs.org
> > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > 
> > On Mon, Nov 07, 2005 at 07:37:45PM +0100, Joakim Tjernlund wrote:
> > >  >
> > > > On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > > > > > -Original Message-
> > > > > > From: Tom Rini [mailto:trini at kernel.crashing.org]
> > > > > > Sent: 07 November 2005 16:52
> > > > > > To: Marcelo Tosatti
> > > > > > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> > > > > > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > > > > > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > > > > > 
> > > > > > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo 
> > Tosatti wrote:
> > > > > > > Joakim!
> > > > > > > 
> > > > > > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim
> > > > Tjernlund wrote:
> > > > > > > > Hi Marcelo
> > > > > > > > 
> > > > > > > > [SNIP]
> > > > > > > > > The root of the problem are the changes against the 8xx 
> > > > > > > > > TLB handlers introduced during v2.6. What 
> > happens is the 
> > > > > > > > > TLBMiss handlers load the zeroed pte into the 
> > TLB, causing 
> > > > > > > > > the TLBError handler to be invoked (thats two 
> > TLB faults 
> > > > > > > > > per pagefault), which then jumps to the generic 
> > MM code to
> > > > > > setup the pte.
> > > > > > > > > 
> > > > > > > > > The bug is that the zeroed TLB is not invalidated (the
> > > > > > same reason
> > > > > > > > > for the "dcbst" misbehaviour), resulting in infinite
> > > > > > TLBError faults.
> > > > > > > > > 
> > > > > > > > > Dan, I wonder why we just don't go back to v2.4 
> > behaviour.
> > > > > > > > 
> > > > > > > > This is one reason why it is the way it is:
> > > > > > > > 
> > > > > > 
> > > > 
> > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.ht
> > > > ml
> > > > > > > > This details are little fuzzy ATM, but I think the
> > > > reason for the
> > > > > > > > current
> > > > > > > > impl. was only that it was less intrusive to impl.
> > > > > > > 
> > > > > > > Ah, I see. I wonder if the bug is processor specific: we
> > > > > > don't have such
> > > > > > > changes in our v2.4 tree and never experienced such problem.
> > > > > > > 
> > > > > > > It should be pretty easy to hit it right? (instruction
> > > > > > pagefaults should
> > > > > > > fail).
> > > > > > > 
> > > > > > > Grigori, Tom, can you enlight us about the issue on the URL
> > > > > > above. How
> > > > > > > can it be triggered?
> > > > > > 
> > > > > > So after looking at the code in 2.6.14 and current git, I
> > > > think the
> > > > > > above URL isn't relevant, unless there was a change I
> > > > missed (which
> > > > > > could totally be possible) that reverted the patch there and 
> > > > > > fixed that issue in a different manner.  But since I didn't 
> > > > > > figure that out until I had finished researching it again:
> > > > > 
> > > > > I wasn't clear enough. What I meant was that the above 
> > patch made 
> > > > > me think and the result was that I came up with a 
> > simpler fix, the 
> > > > > "two
> > > > exception"
> > > > > fix that
> > > > > is in current kernels. See
> > > > >

[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-13 Thread Joakim Tjernlund
 

> -Original Message-
> From: Marcelo Tosatti [mailto:marcelo.tosatti at cyclades.com] 
> Sent: den 12 november 2005 20:28
> To: Joakim Tjernlund
> Cc: Tom Rini; Dan Malek; gtolstolytkin at ru.mvista.com; 
> linuxppc-embedded at ozlabs.org
> Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> 
> On Mon, Nov 07, 2005 at 07:37:45PM +0100, Joakim Tjernlund wrote:
> >  >
> > > On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > > > > -Original Message-
> > > > > From: Tom Rini [mailto:trini at kernel.crashing.org]
> > > > > Sent: 07 November 2005 16:52
> > > > > To: Marcelo Tosatti
> > > > > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> > > > > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > > > > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > > > > 
> > > > > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo 
> Tosatti wrote:
> > > > > > Joakim!
> > > > > > 
> > > > > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim
> > > Tjernlund wrote:
> > > > > > > Hi Marcelo
> > > > > > > 
> > > > > > > [SNIP]
> > > > > > > > The root of the problem are the changes against the 8xx 
> > > > > > > > TLB handlers introduced during v2.6. What 
> happens is the 
> > > > > > > > TLBMiss handlers load the zeroed pte into the 
> TLB, causing 
> > > > > > > > the TLBError handler to be invoked (thats two 
> TLB faults 
> > > > > > > > per pagefault), which then jumps to the generic 
> MM code to
> > > > > setup the pte.
> > > > > > > > 
> > > > > > > > The bug is that the zeroed TLB is not invalidated (the
> > > > > same reason
> > > > > > > > for the "dcbst" misbehaviour), resulting in infinite
> > > > > TLBError faults.
> > > > > > > > 
> > > > > > > > Dan, I wonder why we just don't go back to v2.4 
> behaviour.
> > > > > > > 
> > > > > > > This is one reason why it is the way it is:
> > > > > > > 
> > > > > 
> > > 
> http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.ht
> > > ml
> > > > > > > This details are little fuzzy ATM, but I think the
> > > reason for the
> > > > > > > current
> > > > > > > impl. was only that it was less intrusive to impl.
> > > > > > 
> > > > > > Ah, I see. I wonder if the bug is processor specific: we
> > > > > don't have such
> > > > > > changes in our v2.4 tree and never experienced such problem.
> > > > > > 
> > > > > > It should be pretty easy to hit it right? (instruction
> > > > > pagefaults should
> > > > > > fail).
> > > > > > 
> > > > > > Grigori, Tom, can you enlight us about the issue on the URL
> > > > > above. How
> > > > > > can it be triggered?
> > > > > 
> > > > > So after looking at the code in 2.6.14 and current git, I
> > > think the
> > > > > above URL isn't relevant, unless there was a change I
> > > missed (which
> > > > > could totally be possible) that reverted the patch there and 
> > > > > fixed that issue in a different manner.  But since I didn't 
> > > > > figure that out until I had finished researching it again:
> > > > 
> > > > I wasn't clear enough. What I meant was that the above 
> patch made 
> > > > me think and the result was that I came up with a 
> simpler fix, the 
> > > > "two
> > > exception"
> > > > fix that
> > > > is in current kernels. See
> > > > 
> > > http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/h
> > > ead_8xx.S@
> > > > 
> > > 1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/k
> > > ernel|hist
> > > > /arch/ppc/kernel/head_8xx.S
> > > > It appears this fix has some other issues :(
> > > > 
> > > > How do the other ppc arches do? I am guessing that they don't 
> > > > double fault, but bails out to do_page_fault 

[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-12 Thread Marcelo Tosatti
On Mon, Nov 07, 2005 at 07:37:45PM +0100, Joakim Tjernlund wrote:
>  > 
> > On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > > > -Original Message-
> > > > From: Tom Rini [mailto:trini at kernel.crashing.org] 
> > > > Sent: 07 November 2005 16:52
> > > > To: Marcelo Tosatti
> > > > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> > > > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > > > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > > > 
> > > > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> > > > > Joakim!
> > > > > 
> > > > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim 
> > Tjernlund wrote:
> > > > > > Hi Marcelo
> > > > > > 
> > > > > > [SNIP] 
> > > > > > > The root of the problem are the changes against the 8xx TLB 
> > > > > > > handlers introduced
> > > > > > > during v2.6. What happens is the TLBMiss handlers load the 
> > > > > > > zeroed pte into
> > > > > > > the TLB, causing the TLBError handler to be invoked (thats 
> > > > > > > two TLB faults per 
> > > > > > > pagefault), which then jumps to the generic MM code to 
> > > > setup the pte.
> > > > > > > 
> > > > > > > The bug is that the zeroed TLB is not invalidated (the 
> > > > same reason
> > > > > > > for the "dcbst" misbehaviour), resulting in infinite 
> > > > TLBError faults.
> > > > > > > 
> > > > > > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > > > > > 
> > > > > > This is one reason why it is the way it is:
> > > > > > 
> > > > 
> > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > > > > This details are little fuzzy ATM, but I think the 
> > reason for the
> > > > > > current
> > > > > > impl. was only that it was less intrusive to impl.
> > > > > 
> > > > > Ah, I see. I wonder if the bug is processor specific: we 
> > > > don't have such
> > > > > changes in our v2.4 tree and never experienced such problem.
> > > > > 
> > > > > It should be pretty easy to hit it right? (instruction 
> > > > pagefaults should
> > > > > fail).
> > > > > 
> > > > > Grigori, Tom, can you enlight us about the issue on the URL 
> > > > above. How
> > > > > can it be triggered?
> > > > 
> > > > So after looking at the code in 2.6.14 and current git, I 
> > think the
> > > > above URL isn't relevant, unless there was a change I 
> > missed (which
> > > > could totally be possible) that reverted the patch there and 
> > > > fixed that
> > > > issue in a different manner.  But since I didn't figure that 
> > > > out until I
> > > > had finished researching it again:
> > > 
> > > I wasn't clear enough. What I meant was that the above patch made me
> > > think and the result was that I came up with a simpler fix, the "two
> > > exception" fix that is in current kernels. See
> > > http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/head_8xx.S@1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/kernel|hist/arch/ppc/kernel/head_8xx.S
> > > It appears this fix has some other issues :(
> > > 
> > > How do the other ppc arches do? I am guessing that they don't double
> > > fault, but bails out to do_page_fault from the TLB Miss handler, like
> > > 8xx used to do.
> > 
> > Assuming Dan doesn't come up with a more simple & better fix, maybe we
> > should go back to the original patch I made?
> 
> That was what I was thinking too(or some variation of your patch)
> I wonder if that would solve the misbehaving dcbst problem Marcelo found
> some time ago too?

Hi Joakim,

Yes, it would fix the "dcbst" issue. That problem was triggered by a
zeroed TLB entry.

In practice it seems that the "three exception" approach does not impose
a significant overhead in comparison with the "two exception" version
(as can be seen by the results of the latency tests).

Anyway, if we decide on it, the "two exception" version (no zeroed TLB
entry state) needs the TLBMiss handler to check the present bit, as Dan
mentioned.

I don't know what Dan is up to, he meant to be doing significant changes.

I'll be playing with TLB preloading next week... how's your TLB handler
shrinkage idea?




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-10 Thread David Jander
On Thursday 10 November 2005 08:48, David Jander wrote:
>[...]
> Hmmm. This is a lot in the line of the tests I did with (the more generic
> benchmark) nbench. After looking at those results (see my other post in
> this thread) I already suspected something like this.

Sorry, I obviously did not mean this thread, but the following post on another 
thread:

http://ozlabs.org/pipermail/linuxppc-embedded/2005-November/020775.html

Regards,

-- 
David Jander



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-10 Thread David Jander
On Wednesday 09 November 2005 13:04, Marcelo Tosatti wrote:
>[...]
>
> ** 2.6.14 DataTLBHandler jump direct ("two exceptions"):
>
> first batch:
> avg: 287ms
> avg: 287ms
> avg: 287ms
> avg: 287ms
> avg: 287ms
>
> second batch:
> avg: 287ms
> avg: 287ms
> avg: 287ms
> avg: 287ms
> avg: 287ms
>
> ** 2.6.14 vanilla ("three exceptions"):
>
> first batch:
> avg: 288ms
> avg: 285ms
> avg: 287ms
> avg: 287ms
> avg: 288ms
>
> second batch:
> avg: 288ms
> avg: 288ms
> avg: 287ms
> avg: 287ms
> avg: 287ms
>
> ** 2.4.17 (root on RAMDISK):
>
> avg: 309ms
> avg: 313ms
> avg: 312ms
> avg: 311ms
> avg: 310ms

Hmmm. This is very much in line with the tests I did with (the more generic
benchmark) nbench. After looking at those results (see my other post in this
thread) I already suspected something like this.

> The v2.6.14's kernel jump-direct is more consistent at 287ms,
> while vanilla 2.6.14 oscillates between 285 and 288ms, but
> no significant difference between the two.
>
> v2.6's fault handling is clearly faster than 2.4's (note that the compiler
> is also different, 2.4 uses gcc 2.95 and 2.6 gcc 3.3).

I don't think the compiler makes much difference here though. In my test the
exact same compiler (gcc-3.3.3) was used for both kernels, along with the same
rootfs and the same nbench binary. I also used oprofile to get an idea of where
the code spent most of its CPU time during nbench, and AFAIR
flush_dcache_icache() took quite a chunk of it, so I assume page fault latency
is important there too, and might account for the huge difference between 2.4
and 2.6.

Greetings,

-- 
David Jander



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-09 Thread Marcelo Tosatti
On Tue, Nov 08, 2005 at 07:39:59AM +1100, Benjamin Herrenschmidt wrote:

> I think the current code, even with your fix, is sub-optimal. But of
> course, the only way to be sure is to do real measurements


Hi folks,

I've written a simple app to estimate pagefault latency using gettimeofday().

Can be found at http://hera.kernel.org/~marcelo/measurefault/

/* This simple program attempts to estimate how long a pagefault takes.
 * It does that by mmap()ing /tmp/latency-test and touching a page.
 * Time measurement is done with gettimeofday() before and after the
 * data touch.
 *
 * In the hope of a more precise measurement, two values are subtracted
 * from the pagefault time delta:
 *
 * - Estimated time between two subsequent gettimeofday() calls, average
 *   of 100 runs (this average is around 8ms on 48MHz PPC 8xx,
 *   0ms on 1GHz Pegasos G4)
 *
 * - Time taken to touch the data after it's TLB cached, aka second run.
 *   This takes between 1 and 2ms on 8xx (it varies) and 0ms on 1GHz Pegasos.
 */
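
For readers who want to try the idea without fetching the program, here is a
minimal sketch of the technique the comment describes. It is not the code from
the URL above: the function names and the scratch file template are made up for
illustration, it times a single touch instead of averaging 100 runs, and it
reports microseconds (the native resolution of gettimeofday()). Subtracting the
second-touch time from the first-touch time removes both the gettimeofday()
overhead and the cost of the access itself, which is a simplified form of the
two corrections listed in the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <unistd.h>

/* Microseconds elapsed between two gettimeofday() samples. */
long elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000000L
         + (end->tv_usec - start->tv_usec);
}

/*
 * Map one page of a scratch file, then time the first touch (which takes
 * the page fault) and the second touch (translation already present).
 * Returns 0 on success, -1 on error.
 * The file name template is a placeholder, not the path the real tool uses.
 */
int measure_touch(long *first_usec, long *second_usec)
{
    char path[] = "/tmp/latency-sketch-XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    struct timeval t0, t1, t2;
    volatile char *p;
    void *map;
    int fd;

    assert(first_usec != NULL && second_usec != NULL);

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file disappears once unmapped */
    if (page <= 0 || ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }
    map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;
    p = map;

    gettimeofday(&t0, NULL);
    p[0] = 1;                           /* first touch: page fault */
    gettimeofday(&t1, NULL);
    p[0] = 2;                           /* second touch: no fault expected */
    gettimeofday(&t2, NULL);

    *first_usec = elapsed_usec(&t0, &t1);
    *second_usec = elapsed_usec(&t1, &t2);
    munmap(map, (size_t)page);
    return 0;
}
```

A caller would take `first_usec - second_usec` as the fault estimate and
average it over many runs, since a single sample is noisy.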

Here are the results on a 48MHz 855T, comparing internal v2.4.17, vanilla
v2.6.14 and v2.6.14-jump-direct (jumping directly to handle_page_fault if the
pte is zeroed).

Each "avg:" entry is an average of 100 "measure-fault-latency.c" runs.

2.6's root is mounted on NFS.

** 2.6.14 DataTLBHandler jump direct ("two exceptions"):

first batch:
avg: 287ms
avg: 287ms
avg: 287ms
avg: 287ms
avg: 287ms

second batch:
avg: 287ms
avg: 287ms
avg: 287ms
avg: 287ms
avg: 287ms

** 2.6.14 vanilla ("three exceptions"):

first batch:
avg: 288ms
avg: 285ms
avg: 287ms
avg: 287ms
avg: 288ms

second batch:
avg: 288ms
avg: 288ms
avg: 287ms
avg: 287ms
avg: 287ms

** 2.4.17 (root on RAMDISK):

avg: 309ms
avg: 313ms
avg: 312ms
avg: 311ms
avg: 310ms


The v2.6.14 jump-direct kernel is more consistent at 287ms,
while vanilla 2.6.14 oscillates between 285 and 288ms, but
there is no significant difference between the two.

v2.6's fault handling is clearly faster than 2.4's (note that the compiler
is also different, 2.4 uses gcc 2.95 and 2.6 gcc 3.3).
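
To put numbers on "no significant difference", the per-run averages above can
be reduced to one mean per kernel. The sketch below only redoes that arithmetic
on the posted values; the helper names are made up for illustration and the
units are whatever the tables use.

```c
#include <assert.h>
#include <stddef.h>

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Per-run averages copied from the tables above (both batches). */
const int jump_direct[] = { 287, 287, 287, 287, 287, 287, 287, 287, 287, 287 };
const int vanilla[]     = { 288, 285, 287, 287, 288, 288, 288, 287, 287, 287 };
const int v2_4_17[]     = { 309, 313, 312, 311, 310 };

/* Arithmetic mean of n samples. */
double mean(const int *samples, size_t n)
{
    long sum = 0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += samples[i];
    return (double)sum / (double)n;
}

/* Saving of `fast` relative to `slow`, as a fraction of `slow`. */
double relative_saving(double fast, double slow)
{
    return (slow - fast) / slow;
}
```

With these inputs the means come out at 287.0 (jump direct), 287.2 (vanilla)
and 311.0 (2.4.17). The two 2.6.14 variants differ by less than 0.1%, and both
are roughly 7.7% below 2.4.17, keeping in mind the compiler and root filesystem
differences already noted.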



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-08 Thread Benjamin Herrenschmidt
On Mon, 2005-11-07 at 06:44 -0200, Marcelo Tosatti wrote:

> 
> The bug is that the zeroed TLB is not invalidated (the same reason
> for the "dcbst" misbehaviour), resulting in infinite TLBError faults.

I see, so you are in the same situation as ia64 which has valid but
unmapped TLBs ?

> Dan, I wonder why we just don't go back to v2.4 behaviour. It is not very
> clear to me that "two exception" speedup offsets the additional code required
> for "one exception" version. Have you actually done any measurements? 

What do you mean by "one exception" version ? You probably get 3 in fact
since after you have serviced the fault in the common code, you take
another fault to fill the PTE.

In fact, you could even go back to one exception by pre-filling the TLB
in update_mmu_cache :)

> There is chance that the additional code ends up in the same cacheline,
> which would mean no huge gain by the "two exception" approach. Might be
> even harmful for performance (you need two exceptions instead of one
> after all).
> 
> The "two exception" approach requires a TLB flush (to nuke the zeroed)
> at each PTE update for correct behaviour (which BTW is another slowdown):

I think the current code, even with your fix, is sub-optimal. But of
course, the only way to be sure is to do real measurements

Ben.
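
The exception counts being debated can be illustrated with a toy model. This
is only a sketch under strong assumptions: one page, one TLB slot, and three
made-up strategy names; it is not the logic in head_8xx.S. The "three" and
"two" labels follow the usage in the benchmark post elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a single TLB slot for a single page. */
enum tlb_state { TLB_EMPTY, TLB_ZEROED, TLB_VALID };

enum strategy {
    LOAD_ZEROED_PTE, /* miss handler loads the PTE as-is, even if zeroed */
    JUMP_DIRECT,     /* miss handler sees a non-present PTE and goes
                        straight to the generic fault code */
    PRELOAD_TLB      /* as JUMP_DIRECT, plus the fault path fills the TLB
                        (the update_mmu_cache idea) */
};

/*
 * Count the exceptions taken until the first access to an unmapped page
 * succeeds. Returns -1 if it is still faulting after `limit` exceptions.
 */
int exceptions_for_first_touch(enum strategy s, bool invalidate_on_pte_update,
                               int limit)
{
    enum tlb_state tlb = TLB_EMPTY;
    bool pte_present = false;
    int exceptions = 0;

    assert(limit > 0);
    while (tlb != TLB_VALID) {
        if (exceptions == limit)
            return -1;              /* stuck: the endless TLBError case */
        exceptions++;

        if (tlb == TLB_EMPTY) {     /* TLB Miss exception */
            if (pte_present) {
                tlb = TLB_VALID;
            } else if (s == LOAD_ZEROED_PTE) {
                tlb = TLB_ZEROED;   /* next access raises TLB Error */
            } else {
                pte_present = true; /* generic MM code sets up the PTE */
                if (s == PRELOAD_TLB)
                    tlb = TLB_VALID;
            }
        } else {                    /* TLB_ZEROED: TLB Error exception */
            pte_present = true;     /* generic MM code sets up the PTE */
            if (invalidate_on_pte_update)
                tlb = TLB_EMPTY;    /* stale entry dropped, miss again */
        }
    }
    return exceptions;
}
```

In this model LOAD_ZEROED_PTE needs 3 exceptions when the stale entry is
invalidated and never completes when it is not, JUMP_DIRECT needs 2, and
PRELOAD_TLB needs 1.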




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Pantelis Antoniou
On Monday 07 November 2005 22:39, Benjamin Herrenschmidt wrote:
> On Mon, 2005-11-07 at 06:44 -0200, Marcelo Tosatti wrote:
> 
> > 
> > The bug is that the zeroed TLB is not invalidated (the same reason
> > for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
> 
> I see, so you are in the same situation as ia64 which has valid but
> unmapped TLBs ?
> 
> > Dan, I wonder why we just don't go back to v2.4 behaviour. It is not very
> > clear to me that "two exception" speedup offsets the additional code 
> > required
> > for "one exception" version. Have you actually done any measurements? 
> 
> What do you mean by "one exception" version ? You probably get 3 in fact
> since after you have serviced the fault in the common code, you take
> another fault to fill the PTE.
> 
> In fact, you could even go back to one exception by pre-filling the TLB
> in update_mmu_cache :)
> 

Yep. That should be the target. Remember the poor 8xx is not exactly a 
speed demon :).

> > There is chance that the additional code ends up in the same cacheline,
> > which would mean no huge gain by the "two exception" approach. Might be
> > even harmful for performance (you need two exceptions instead of one
> > after all).
> > 
> > The "two exception" approach requires a TLB flush (to nuke the zeroed)
> > at each PTE update for correct behaviour (which BTW is another slowdown):
> 
> I think the current code, even with your fix, is sub-optimal. But of
> course, the only way to be sure is to do real measurements
> 
> Ben.
> 
> 

The TLB flush is bogus IMO. I'm going to try the last patch by Marcelo to
see if it works for me.

Pantelis.



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Dan Malek

On Nov 7, 2005, at 1:22 PM, Tom Rini wrote:

> Assuming Dan doesn't come up with a more simple & better fix, maybe we
> should go back to the original patch I made?

I'm working on it.  It'll look more like 2.4.

Thanks.

-- Dan




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Dan Malek

On Nov 7, 2005, at 3:50 PM, Pantelis Antoniou wrote:

> Yep. That should be the target. Remember the poor 8xx is not exactly a
> speed demon :).

It really isn't a big speed difference.  The context save/restore
is minimal.  The original thought was " ...well, I'm already here,
I know we will take another exception, so may as well fake the
error case and call do_page_fault."   However, I really do like
a minimal TLB miss case for valid PTEs, and push everything
else to the heavyweight functions.

Thanks.

-- Dan




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Joakim Tjernlund
 > 
> On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > > -Original Message-
> > > From: Tom Rini [mailto:trini at kernel.crashing.org] 
> > > Sent: 07 November 2005 16:52
> > > To: Marcelo Tosatti
> > > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> > > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > > 
> > > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> > > > Joakim!
> > > > 
> > > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim 
> Tjernlund wrote:
> > > > > Hi Marcelo
> > > > > 
> > > > > [SNIP] 
> > > > > > The root of the problem are the changes against the 8xx TLB 
> > > > > > handlers introduced
> > > > > > during v2.6. What happens is the TLBMiss handlers load the 
> > > > > > zeroed pte into
> > > > > > the TLB, causing the TLBError handler to be invoked (thats 
> > > > > > two TLB faults per 
> > > > > > pagefault), which then jumps to the generic MM code to 
> > > setup the pte.
> > > > > > 
> > > > > > The bug is that the zeroed TLB is not invalidated (the 
> > > same reason
> > > > > > for the "dcbst" misbehaviour), resulting in infinite 
> > > TLBError faults.
> > > > > > 
> > > > > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > > > > 
> > > > > This is one reason why it is the way it is:
> > > > > 
> > > 
> http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > > > This details are little fuzzy ATM, but I think the 
> reason for the
> > > > > current
> > > > > impl. was only that it was less intrusive to impl.
> > > > 
> > > > Ah, I see. I wonder if the bug is processor specific: we 
> > > don't have such
> > > > changes in our v2.4 tree and never experienced such problem.
> > > > 
> > > > It should be pretty easy to hit it right? (instruction 
> > > pagefaults should
> > > > fail).
> > > > 
> > > > Grigori, Tom, can you enlight us about the issue on the URL 
> > > above. How
> > > > can it be triggered?
> > > 
> > > So after looking at the code in 2.6.14 and current git, I 
> think the
> > > above URL isn't relevant, unless there was a change I 
> missed (which
> > > could totally be possible) that reverted the patch there and 
> > > fixed that
> > > issue in a different manner.  But since I didn't figure that 
> > > out until I
> > > had finished researching it again:
> > 
> > I wasn't clear enough. What I meant was that the above patch made me
> > think and the result was that I came up with a simpler fix, the "two
> > exception" fix that is in current kernels. See
> > http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/head_8xx.S@1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/kernel|hist/arch/ppc/kernel/head_8xx.S
> > It appears this fix has some other issues :(
> > 
> > How do the other ppc arches do? I am guessing that they don't double
> > fault, but bails out to do_page_fault from the TLB Miss handler, like
> > 8xx used to do.
> 
> Assuming Dan doesn't come up with a more simple & better fix, maybe we
> should go back to the original patch I made?

That was what I was thinking too (or some variation of your patch).
I wonder if that would solve the misbehaving dcbst problem Marcelo found
some time ago too?

 Jocke



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Joakim Tjernlund
> -Original Message-
> From: Tom Rini [mailto:trini at kernel.crashing.org] 
> Sent: 07 November 2005 16:52
> To: Marcelo Tosatti
> Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> 
> On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> > Joakim!
> > 
> > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > > Hi Marcelo
> > > 
> > > [SNIP] 
> > > > The root of the problem are the changes against the 8xx TLB 
> > > > handlers introduced
> > > > during v2.6. What happens is the TLBMiss handlers load the 
> > > > zeroed pte into
> > > > the TLB, causing the TLBError handler to be invoked (thats 
> > > > two TLB faults per 
> > > > pagefault), which then jumps to the generic MM code to 
> setup the pte.
> > > > 
> > > > The bug is that the zeroed TLB is not invalidated (the 
> same reason
> > > > for the "dcbst" misbehaviour), resulting in infinite 
> TLBError faults.
> > > > 
> > > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > > 
> > > This is one reason why it is the way it is:
> > > 
> http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > This details are little fuzzy ATM, but I think the reason for the
> > > current
> > > impl. was only that it was less intrusive to impl.
> > 
> > Ah, I see. I wonder if the bug is processor specific: we 
> don't have such
> > changes in our v2.4 tree and never experienced such problem.
> > 
> > It should be pretty easy to hit it right? (instruction 
> pagefaults should
> > fail).
> > 
> > Grigori, Tom, can you enlight us about the issue on the URL 
> above. How
> > can it be triggered?
> 
> So after looking at the code in 2.6.14 and current git, I think the
> above URL isn't relevant, unless there was a change I missed (which
> could totally be possible) that reverted the patch there and fixed that
> issue in a different manner.  But since I didn't figure that out until I
> had finished researching it again:

I wasn't clear enough. What I meant was that the above patch made me
think, and the result was that I came up with a simpler fix, the "two
exception" fix that is in current kernels. See
http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/head_8xx.S@1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/kernel|hist/arch/ppc/kernel/head_8xx.S
It appears this fix has some other issues :(

How do the other ppc arches do it? I am guessing that they don't double
fault, but bail out to do_page_fault from the TLB Miss handler, like 8xx
used to do.

> 
> Switching hats for a minute, this came from a bug a customer of
> MontaVista found, so I can't give out the testcase :(
> 
> To repeat what Joakim said back then:
> "I think I have figured this out. The first TLB misses that happen at
> app startup are Data TLB misses. These will then hit the NULL L1 entry
> and end up in do_page_fault(), which will populate the L1 entry. But when
> you have a very large app that spans more than one L1 entry (16 MB I
> think) it may happen that you will have an I-TLB Miss first on one of the
> L1 entries, which will make the I-TLB handler bail out to do_page_fault()
> and the app crashes (SEGV)."

This still stands I think.
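
To make the "spans more than one L1 entry" condition concrete, here is a small
sketch that counts how many first-level entries an address range touches. The
span covered by one L1 entry is a parameter, because the quoted text only
guesses at 16 MB; the function names are made up for illustration and do not
come from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the first-level (L1) entry covering addr. */
uint32_t l1_index(uint32_t addr, uint32_t span_bytes)
{
    assert(span_bytes != 0);
    return addr / span_bytes;
}

/*
 * Number of L1 entries touched by the range [start, start + len).
 * Assumes the range does not wrap around the 32-bit address space.
 */
uint32_t l1_entries_spanned(uint32_t start, uint32_t len, uint32_t span_bytes)
{
    assert(len != 0);
    return l1_index(start + (len - 1), span_bytes)
         - l1_index(start, span_bytes) + 1;
}
```

For example, a 20 MB image loaded at a hypothetical 0x10000000 touches two
entries if each covers 16 MB, so a first miss on the second entry could be an
instruction fetch and not a data access. With a smaller span such as 4 MB the
same image touches five entries.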

> 
> Looking at the patch again, what I don't see is why I talk about fudging
> I-TLB Miss at 0x400 when it's I-TLB Error we fudge at being there, but
> then get hung up that there can be a slight diff between the two ("This
> is because we check bit 4 of SRR1 in both cases, but in the case of an
> I-TLB Miss, this bit is always set, and it only indicates a protection
> fault on an I-TLB Error.") so instead of 0x1300 jumping to the handler
> at 0x400, we treat it like a regular exception so we know where we came
> from, and perhaps missed fixing a case somewhere?

Didn't look into this part of your patch, sorry.

 Jocke



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread David Jander
On Monday 07 November 2005 09:44, Marcelo Tosatti wrote:
> Seems the bug is exposed by the change which avoids flushing the
> TLB when not necessary (in case the pte has not changed), introduced
> recently:
>[...]

Brilliant!
I just checked, and now it boots again.
Btw, it did boot before this patch, but took about 2 or 3 hours to get
halfway through the init scripts ;-)

Thanks for the good work!

-- 
David Jander



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Joakim Tjernlund
 
> Joakim!
> 
> On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > Hi Marcelo
> > 
> > [SNIP] 
> > > The root of the problem are the changes against the 8xx TLB 
> > > handlers introduced
> > > during v2.6. What happens is the TLBMiss handlers load the 
> > > zeroed pte into
> > > the TLB, causing the TLBError handler to be invoked (thats 
> > > two TLB faults per 
> > > pagefault), which then jumps to the generic MM code to 
> setup the pte.
> > > 
> > > The bug is that the zeroed TLB is not invalidated (the same reason
> > > for the "dcbst" misbehaviour), resulting in infinite 
> TLBError faults.
> > > 
> > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > 
> > This is one reason why it is the way it is:
> > 
> http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > This details are little fuzzy ATM, but I think the reason for the
> > current
> > impl. was only that it was less intrusive to impl.
> 
> Ah, I see. I wonder if the bug is processor specific: we 
> don't have such
> changes in our v2.4 tree and never experienced such problem.
> 
> It should be pretty easy to hit it right? (instruction 
> pagefaults should
> fail).

No, it's pretty hard to trigger. Read all the mails on the subject to
see why.
I don't think the one- or two-exception approach matters performance-wise
(at least for ITLB exceptions).

> 
> Grigori, Tom, can you enlight us about the issue on the URL above. How
> can it be triggered?
> 
> 
> 
> 



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Pantelis Antoniou
Marcelo Tosatti wrote:
> Hi folks,
> 
> Seems the bug is exposed by the change which avoids flushing the
> TLB when not necessary (in case the pte has not changed), introduced
> recently:
> 

[snip]

> 
> 

Good job Marcelo! :)

FWIW I'd rather have the single exception version if at all possible.

Regards

Pantelis




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Joakim Tjernlund
Hi Marcelo

[SNIP] 
> The root of the problem are the changes against the 8xx TLB 
> handlers introduced
> during v2.6. What happens is the TLBMiss handlers load the 
> zeroed pte into
> the TLB, causing the TLBError handler to be invoked (thats 
> two TLB faults per 
> pagefault), which then jumps to the generic MM code to setup the pte.
> 
> The bug is that the zeroed TLB is not invalidated (the same reason
> for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
> 
> Dan, I wonder why we just don't go back to v2.4 behaviour.

This is one reason why it is the way it is:
http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
The details are a little fuzzy ATM, but I think the only reason for the
current implementation was that it was less intrusive to implement.

 Jocke




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Marcelo Tosatti
On Tue, Nov 08, 2005 at 07:39:59AM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2005-11-07 at 06:44 -0200, Marcelo Tosatti wrote:
> 
> > 
> > The bug is that the zeroed TLB is not invalidated (the same reason
> > for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
> 
> I see, so you are in the same situation as ia64 which has valid but
> unmapped TLBs ?
> 
> > Dan, I wonder why we just don't go back to v2.4 behaviour. It is not very
> > clear to me that "two exception" speedup offsets the additional code 
> > required
> > for "one exception" version. Have you actually done any measurements? 
> 
> What do you mean by "one exception" version ? You probably get 3 in fact
> since after you have serviced the fault in the common code, you take
> another fault to fill the PTE.

Yep, that would be 3!

> In fact, you could even go back to one exception by pre-filling the TLB
> in update_mmu_cache :)

OK, that's a good idea, as we discussed on IRC. Working on that.
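
The exception counts being discussed can be tallied with a small toy model.
This is plain user-space C, not kernel code, and the names toy_scheme and
toy_exceptions_per_fault are made up for illustration. It assumes one reading
of the exchange above: the "one exception" handler reaches the page-fault code
directly from TLBMiss, the "two exception" handler needs TLBMiss plus TLBError,
and without pre-filling the retried access takes one more TLBMiss to load the
now-valid PTE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two handler styles discussed in the thread. */
enum toy_scheme {
    TOY_ONE_EXCEPTION, /* TLBMiss handler bails out to the page-fault code itself */
    TOY_TWO_EXCEPTION  /* TLBMiss loads the zeroed PTE, TLBError then reports the fault */
};

/*
 * Number of exceptions taken for one first-touch page fault in this model.
 * prefill_tlb stands in for "update_mmu_cache() loads the new PTE into the
 * TLB", which is an assumption about how pre-filling would behave.
 */
int toy_exceptions_per_fault(enum toy_scheme scheme, bool prefill_tlb)
{
    int count;

    assert(scheme == TOY_ONE_EXCEPTION || scheme == TOY_TWO_EXCEPTION);

    /* Exceptions needed to reach the generic MM code. */
    count = (scheme == TOY_ONE_EXCEPTION) ? 1 : 2;

    /* Without pre-filling, the retried access misses once more. */
    if (!prefill_tlb)
        count += 1;

    return count;
}
```

Under these assumptions the current scheme without pre-filling costs three
exceptions per fault, and only the combination of a TLBMiss bail-out and
pre-filling gets down to one. Real measurements would still be needed to say
anything about performance.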

> > There is chance that the additional code ends up in the same cacheline,
> > which would mean no huge gain by the "two exception" approach. Might be
> > even harmful for performance (you need two exceptions instead of one
> > after all).
> > 
> > The "two exception" approach requires a TLB flush (to nuke the zeroed)
> > at each PTE update for correct behaviour (which BTW is another slowdown):
> 
> I think the current code, even with your fix, is sub-optimal. But of
> course, the only way to be sure is to do real measurements

Indeed.

Thanks!



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Tom Rini
On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > -Original Message-
> > From: Tom Rini [mailto:trini at kernel.crashing.org] 
> > Sent: 07 November 2005 16:52
> > To: Marcelo Tosatti
> > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > 
> > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> > > Joakim!
> > > 
> > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > > > Hi Marcelo
> > > > 
> > > > [SNIP] 
> > > > > The root of the problem are the changes against the 8xx TLB 
> > > > > handlers introduced
> > > > > during v2.6. What happens is the TLBMiss handlers load the 
> > > > > zeroed pte into
> > > > > the TLB, causing the TLBError handler to be invoked (thats 
> > > > > two TLB faults per 
> > > > > pagefault), which then jumps to the generic MM code to 
> > setup the pte.
> > > > > 
> > > > > The bug is that the zeroed TLB is not invalidated (the 
> > same reason
> > > > > for the "dcbst" misbehaviour), resulting in infinite 
> > TLBError faults.
> > > > > 
> > > > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > > > 
> > > > This is one reason why it is the way it is:
> > > > 
> > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > > This details are little fuzzy ATM, but I think the reason for the
> > > > current
> > > > impl. was only that it was less intrusive to impl.
> > > 
> > > Ah, I see. I wonder if the bug is processor specific: we 
> > don't have such
> > > changes in our v2.4 tree and never experienced such problem.
> > > 
> > > It should be pretty easy to hit it right? (instruction 
> > pagefaults should
> > > fail).
> > > 
> > > Grigori, Tom, can you enlight us about the issue on the URL 
> > above. How
> > > can it be triggered?
> > 
> > So after looking at the code in 2.6.14 and current git, I think the
> > above URL isn't relevant, unless there was a change I missed (which
> > could totally be possible) that reverted the patch there and 
> > fixed that
> > issue in a different manner.  But since I didn't figure that 
> > out until I
> > had finished researching it again:
> 
> I wasn't clear enough. What I meant was that the above patch made me
> think, and the result was that I came up with a simpler fix, the
> "two exception" fix that is in current kernels. See
> http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/head_8xx.S@1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/kernel|hist/arch/ppc/kernel/head_8xx.S
> It appears this fix has some other issues :(
> 
> How do the other ppc arches do it? I am guessing that they don't double
> fault, but bail out to do_page_fault from the TLB Miss handler, like 8xx
> used to do.

Assuming Dan doesn't come up with a simpler and better fix, maybe we
should go back to the original patch I made?

-- 
Tom Rini
http://gate.crashing.org/~trini/



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Dan Malek

On Nov 7, 2005, at 9:32 AM, Joakim Tjernlund wrote:

> This is one reason why it is the way it is:

Oh geeze, what a hack!  :-)  This could have been fixed
with a line of assembler code in the TLB miss exception.
I'll take a look at all of this and fix it up.

Thanks.

-- Dan




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Dan Malek

On Nov 7, 2005, at 3:44 AM, Marcelo Tosatti wrote:

> Dan, I wonder why we just don't go back to v2.4 behaviour. It is not 
> very
> clear to me that "two exception" speedup offsets the additional code 
> required
> for "one exception" version. Have you actually done any measurements?

No, and I didn't actually make these changes, either :-)
I'm working on some 8xx debugging right now, so let's experiment
with some changes.  I don't understand why other processors, especially
G2 cores like 82xx, aren't finding the same problems we are having
with 8xx.  Logically, we are all doing the same thing, unless there are
some tlb invalidates on these other processors that I'm forgetting 
about.
We just seem to be running into stale entries, and we have to fix it.

Thanks.

-- Dan




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Marcelo Tosatti
On Mon, Nov 07, 2005 at 04:44:32PM +0100, Joakim Tjernlund wrote:
>  
> > Joakim!
> > 
> > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > > Hi Marcelo
> > > 
> > > [SNIP] 
> > > > The root of the problem are the changes against the 8xx TLB 
> > > > handlers introduced
> > > > during v2.6. What happens is the TLBMiss handlers load the 
> > > > zeroed pte into
> > > > the TLB, causing the TLBError handler to be invoked (thats 
> > > > two TLB faults per 
> > > > pagefault), which then jumps to the generic MM code to 
> > setup the pte.
> > > > 
> > > > The bug is that the zeroed TLB is not invalidated (the same reason
> > > > for the "dcbst" misbehaviour), resulting in infinite 
> > TLBError faults.
> > > > 
> > > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > > 
> > > This is one reason why it is the way it is:
> > > 
> > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > This details are little fuzzy ATM, but I think the reason for the
> > > current
> > > impl. was only that it was less intrusive to impl.
> > 
> > Ah, I see. I wonder if the bug is processor specific: we 
> > don't have such
> > changes in our v2.4 tree and never experienced such problem.
> > 
> > It should be pretty easy to hit it right? (instruction 
> > pagefaults should
> > fail).
> 
> No, its pretty hard to trigger it. Read the all mails on the subject to
> see why.
> The one or two exception approach doesn't matter performancewise(at
> least for ITLB exceptions)
> I think.

Fine, let it continue the way it is then.




[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Tom Rini
On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> Joakim!
> 
> On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > Hi Marcelo
> > 
> > [SNIP] 
> > > The root of the problem are the changes against the 8xx TLB 
> > > handlers introduced
> > > during v2.6. What happens is the TLBMiss handlers load the 
> > > zeroed pte into
> > > the TLB, causing the TLBError handler to be invoked (thats 
> > > two TLB faults per 
> > > pagefault), which then jumps to the generic MM code to setup the pte.
> > > 
> > > The bug is that the zeroed TLB is not invalidated (the same reason
> > > for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
> > > 
> > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > 
> > This is one reason why it is the way it is:
> > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > This details are little fuzzy ATM, but I think the reason for the
> > current
> > impl. was only that it was less intrusive to impl.
> 
> Ah, I see. I wonder if the bug is processor specific: we don't have such
> changes in our v2.4 tree and never experienced such problem.
> 
> It should be pretty easy to hit it right? (instruction pagefaults should
> fail).
> 
> Grigori, Tom, can you enlight us about the issue on the URL above. How
> can it be triggered?

So after looking at the code in 2.6.14 and current git, I think the
above URL isn't relevant, unless there was a change I missed (which
could totally be possible) that reverted the patch there and fixed that
issue in a different manner.  But since I didn't figure that out until I
had finished researching it again:

Switching hats for a minute, this came from a bug a customer of
MontaVista found, so I can't give out the testcase :(

To repeat what Joakim said back then:
"I think I have figured this out. The first TLB misses that happen at
app startup is Data TLB misses. These will then hit the NULL L1 entry
and end up in do_page_fault() which will populate the L1 entry. But when
you have a very large app that spans more than one L1 entry (16 MB I
think) it may happen that you will have I-TLB Miss first one of the L1
entries which will make the I-TLB handler bail out to do_page_fault() and
the app crashes (SEGV)."

Looking at the patch again, what I don't see is why I talk about fudging
I-TLB Miss at 0x400 when it's I-TLB Error we fudge at being there, but
then get hung up that there can be a slight diff between the two ("This
is because we check bit 4 of SRR1 in both cases, but in the case of an
I-TLB Miss, this bit is always set, and it only indicates a protection
fault on an I-TLB Error.") so instead of 0x1300 jumping to the handler
at 0x400, we treat it like a regular exception so we know where we came
from, and perhaps missed fixing a case somewhere?

-- 
Tom Rini
http://gate.crashing.org/~trini/



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Marcelo Tosatti
On Mon, Nov 07, 2005 at 09:35:59AM -0500, Dan Malek wrote:
> 
> On Nov 7, 2005, at 3:44 AM, Marcelo Tosatti wrote:
> 
> >Dan, I wonder why we just don't go back to v2.4 behaviour. It is not 
> >very
> >clear to me that "two exception" speedup offsets the additional code 
> >required
> >for "one exception" version. Have you actually done any measurements?
> 
> No, and I didn't actually make these changes, either :-)

Ahh, ok. sorry. I remember you arguing that it was faster this way (less
code).

> I'm working on some 8xx debugging right now, so let's experiment
> with some changes.  I don't understand why other processors, especially
> G2 cores like 82xx, aren't finding the same problems we are having
> with 8xx.  Logically, we are all doing the same thing, unless there are
> some tlb invalidates on these other processors that I'm forgetting 
> about.

I really don't know how the 82xx TLB works, so...

> We just seem to be running into stale entries, and we have to fix it.

Right - the issue Joakim noted would be one reason for the "two exception"
approach.



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Marcelo Tosatti
Joakim!

On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> Hi Marcelo
> 
> [SNIP] 
> > The root of the problem are the changes against the 8xx TLB 
> > handlers introduced
> > during v2.6. What happens is the TLBMiss handlers load the 
> > zeroed pte into
> > the TLB, causing the TLBError handler to be invoked (thats 
> > two TLB faults per 
> > pagefault), which then jumps to the generic MM code to setup the pte.
> > 
> > The bug is that the zeroed TLB is not invalidated (the same reason
> > for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
> > 
> > Dan, I wonder why we just don't go back to v2.4 behaviour.
> 
> This is one reason why it is the way it is:
> http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> This details are little fuzzy ATM, but I think the reason for the
> current
> impl. was only that it was less intrusive to impl.

Ah, I see. I wonder if the bug is processor specific: we don't have such
changes in our v2.4 tree and have never experienced such a problem.

It should be pretty easy to hit, right? (Instruction pagefaults should
fail.)

Grigori, Tom, can you enlighten us about the issue at the URL above? How
can it be triggered?





[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-07 Thread Marcelo Tosatti
Hi folks,

Seems the bug is exposed by the change which avoids flushing the
TLB when not necessary (in case the pte has not changed), introduced
recently:

__handle_mm_fault():

	entry = pte_mkyoung(entry);
	if (!pte_same(old_entry, entry)) {
		ptep_set_access_flags(vma, address, pte, entry, write_access);
		update_mmu_cache(vma, address, entry);
		lazy_mmu_prot_update(entry);
	} else {
		/*
		 * This is needed only for protection faults but the arch code
		 * is not yet telling us if this is a protection fault or not.
		 * This still avoids useless tlb flushes for .text page faults
		 * with threads.
		 */
		if (write_access)
			flush_tlb_page(vma, address);
	}

The "update_mmu_cache()" call was unconditional before, which caused the TLB
to be flushed by:

	if (pfn_valid(pfn)) {
		struct page *page = pfn_to_page(pfn);
		if (!PageReserved(page)
		    && !test_bit(PG_arch_1, &page->flags)) {
			if (vma->vm_mm == current->active_mm) {
#ifdef CONFIG_8xx
				/* On 8xx, cache control instructions (particularly
				 * "dcbst" from flush_dcache_icache) fault as write
				 * operation if there is an unpopulated TLB entry
				 * for the address in question. To workaround that,
				 * we invalidate the TLB here, thus avoiding dcbst
				 * misbehaviour.
				 */
				_tlbie(address);
#endif
				__flush_dcache_icache((void *) address);
			} else
				flush_dcache_icache_page(page);
			set_bit(PG_arch_1, &page->flags);
		}

Which worked due to pure luck: PG_arch_1 was always unset before, but
now it isn't.

The root of the problem is the set of changes to the 8xx TLB handlers introduced
during v2.6. What happens is that the TLBMiss handlers load the zeroed pte into
the TLB, causing the TLBError handler to be invoked (that's two TLB faults per
pagefault), which then jumps to the generic MM code to set up the pte.

The bug is that the zeroed TLB entry is not invalidated (the same reason
for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
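
A minimal sketch of that failure mode, written as a user-space toy model
rather than kernel code. Everything named toy_* is hypothetical, and the
single-page, single-slot TLB and the fault cap are simplifying assumptions;
the only thing it tries to capture is that a loaded-but-zeroed TLB entry keeps
raising TLBError until something invalidates it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PTE_PRESENT 0x1u
#define TOY_MAX_FAULTS  16   /* arbitrary cap: beyond this, call it a livelock */

/* Toy model: one page, one TLB slot. */
struct toy_mmu {
    uint32_t pte;        /* PTE in memory; 0 means "not set up yet" */
    bool     tlb_loaded; /* a TLB entry exists for the page */
    uint32_t tlb_pte;    /* copy of the PTE that the TLBMiss handler loaded */
};

/*
 * Retry one access until it succeeds. Returns the number of exceptions
 * taken, or -1 if the access is still faulting after TOY_MAX_FAULTS.
 * invalidate_on_pte_update stands in for calling _tlbie(address) when the
 * generic MM code updates the PTE.
 */
int toy_access(struct toy_mmu *m, bool invalidate_on_pte_update)
{
    int faults = 0;

    assert(m != NULL);

    while (faults < TOY_MAX_FAULTS) {
        if (!m->tlb_loaded) {
            /* TLBMiss: load whatever the PTE holds, even if it is zero. */
            m->tlb_pte = m->pte;
            m->tlb_loaded = true;
            faults++;
            continue;
        }
        if (m->tlb_pte & TOY_PTE_PRESENT)
            return faults;  /* translation is valid, access completes */

        /* TLBError: entry exists but is invalid; generic MM sets up the PTE. */
        faults++;
        m->pte |= TOY_PTE_PRESENT;
        if (invalidate_on_pte_update)
            m->tlb_loaded = false;  /* drop the stale zeroed entry */
    }
    return -1;
}
```

With invalidate_on_pte_update set, a fresh page costs three exceptions in this
model (miss, error, miss); with it clear, the access never gets past the stale
entry and the function reports -1.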

Dan, I wonder why we just don't go back to v2.4 behaviour. It is not very
clear to me that the "two exception" speedup offsets the additional code required
for the "one exception" version. Have you actually done any measurements? 

There is a chance that the additional code ends up in the same cacheline,
which would mean no huge gain from the "two exception" approach. It might
even be harmful for performance (you need two exceptions instead of one
after all).

The "two exception" approach requires a TLB flush (to nuke the zeroed entry)
at each PTE update for correct behaviour (which BTW is another slowdown):

--- ../git/linux-2.6/arch/ppc/mm/init.c 2005-11-01 07:58:12.0 -0600
+++ linux-2.6-git-wednov02/arch/ppc/mm/init.c   2005-11-07 06:13:58.0 -0600
@@ -597,19 +597,12 @@
 
 	if (pfn_valid(pfn)) {
 		struct page *page = pfn_to_page(pfn);
+#ifdef CONFIG_8xx
+		_tlbie(address);
+#endif
 		if (!PageReserved(page)
 		    && !test_bit(PG_arch_1, &page->flags)) {
 			if (vma->vm_mm == current->active_mm) {
-#ifdef CONFIG_8xx
-				/* On 8xx, cache control instructions (particularly
-				 * "dcbst" from flush_dcache_icache) fault as write
-				 * operation if there is an unpopulated TLB entry
-				 * for the address in question. To workaround that,
-				 * we invalidate the TLB here, thus avoiding dcbst
-				 * misbehaviour.
-				 */
-				_tlbie(address);
-#endif
 				__flush_dcache_icache((void *) address);
 			} else
 				flush_dcache_icache_page(page);
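
To make the effect of moving _tlbie() concrete, here is a toy model of the two
placements. It is user-space C, not the kernel function: toy_page,
toy_update_mmu_cache and the tlbie_first switch are hypothetical names, and the
model only tracks whether a given call would invalidate the TLB entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the two page flags the real code tests. */
struct toy_page {
    bool reserved; /* models PageReserved(page) */
    bool arch_1;   /* models PG_arch_1 being set */
};

/*
 * Returns true if this call would have invalidated the TLB entry.
 * tlbie_first == false models the original placement (inside the
 * cache-flush branch); tlbie_first == true models the diff above,
 * where the invalidate runs before the PG_arch_1 test.
 */
bool toy_update_mmu_cache(struct toy_page *page, bool same_mm, bool tlbie_first)
{
    bool invalidated = false;

    assert(page != NULL);

    if (tlbie_first)
        invalidated = true;      /* unconditional invalidate */

    if (!page->reserved && !page->arch_1) {
        if (same_mm && !tlbie_first)
            invalidated = true;  /* only reached while PG_arch_1 is unset */
        page->arch_1 = true;     /* later calls skip this whole branch */
    }
    return invalidated;
}
```

In this model the original placement invalidates only on the first call for a
page, because the branch sets the flag that disables it afterwards, while the
moved invalidate fires on every call regardless of the flag.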


On Sun, Oct 30, 2005 at 11:03:24PM +0300, Pantelis Antoniou wrote:
> Latest MMU changes caused 8xx to stop working. Flushing tlb of the faulting
> address fixes the problem.
> 
> ---
> commit 978e2f36b1ae53e37ba27b3ab8f1c5ddbb8c8a10
> tree 7dd0e403c240162b1925db0834d694f4b4a0e95e
> parent ca02ea5aebcda886d1552c6af73ca96c02bf9fed
> author Pantelis Antoniou  Sun, 30 Oct 2005 21:53:48 +0200
> committer Pantelis Antoniou  Sun, 30 Oct 2005 21:53:48 +0200
> 
>  arch/ppc/mm/fault.c |   13 +
>  1 files changed, 13 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/ppc/mm/fault.c b/arch/ppc/mm/fault.c
> --- a/arch/ppc/mm/fault.c
> +++ b/arch/ppc/mm/fault.c
> @@ -240,6 +240,19 @@ good_area:

[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-02 Thread Marcelo Tosatti
On Wed, Nov 02, 2005 at 12:55:57AM +0200, Pantelis Antoniou wrote:
> On Tuesday 01 November 2005 19:25, Marcelo Tosatti wrote:
> > On Sun, Oct 30, 2005 at 11:03:24PM +0300, Pantelis Antoniou wrote:
> > > Latest MMU changes caused 8xx to stop working. Flushing tlb of the 
> > > faulting
> > > address fixes the problem.
> > 
> > Hi Panto,
> > 
> > Its working fine around here. How much of a vanilla 2.6.14 your is?
> > 
> > [root at CAS root]# cat /proc/cpuinfo
> > processor   : 0
> > cpu : 8xx
> > clock   : 48MHz
> > bus clock   : 48MHz
> > revision: 0.0 (pvr 0050 )
> > bogomips: 47.82
> > [root at CAS root]# uname -a
> > Linux CAS 2.6.14 #2 Tue Nov 1 16:20:28 CST 2005 ppc unknown
> > 
> > 
> 
> Vanila 2.6.14 worked fine too.
> 
> It's the mm patches that started coming in later. 
> Unfortunately the version did not change, so I can't provide it.
> Did you used a current git tree?

No I did not - will do and chase the bug.

Thanks.



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-02 Thread Pantelis Antoniou
On Tuesday 01 November 2005 19:25, Marcelo Tosatti wrote:
> On Sun, Oct 30, 2005 at 11:03:24PM +0300, Pantelis Antoniou wrote:
> > Latest MMU changes caused 8xx to stop working. Flushing tlb of the faulting
> > address fixes the problem.
> 
> Hi Panto,
> 
> Its working fine around here. How much of a vanilla 2.6.14 your is?
> 
> [root at CAS root]# cat /proc/cpuinfo
> processor   : 0
> cpu : 8xx
> clock   : 48MHz
> bus clock   : 48MHz
> revision: 0.0 (pvr 0050 )
> bogomips: 47.82
> [root at CAS root]# uname -a
> Linux CAS 2.6.14 #2 Tue Nov 1 16:20:28 CST 2005 ppc unknown
> 
> 

Vanilla 2.6.14 worked fine too.

It's the mm patches that started coming in later. 
Unfortunately the version did not change, so I can't provide it.
Did you use a current git tree?

Regards

Pantelis



[PATCH 2.6.14] mm: 8xx MM fix for

2005-11-01 Thread Marcelo Tosatti
On Sun, Oct 30, 2005 at 11:03:24PM +0300, Pantelis Antoniou wrote:
> Latest MMU changes caused 8xx to stop working. Flushing tlb of the faulting
> address fixes the problem.

Hi Panto,

It's working fine around here. How close to vanilla 2.6.14 is yours?

[root at CAS root]# cat /proc/cpuinfo
processor   : 0
cpu : 8xx
clock   : 48MHz
bus clock   : 48MHz
revision: 0.0 (pvr 0050 )
bogomips: 47.82
[root at CAS root]# uname -a
Linux CAS 2.6.14 #2 Tue Nov 1 16:20:28 CST 2005 ppc unknown




[PATCH 2.6.14] mm: 8xx MM fix for

2005-10-31 Thread Benjamin Herrenschmidt
On Sun, 2005-10-30 at 23:03 +0300, Pantelis Antoniou wrote:
> Latest MMU changes caused 8xx to stop working. Flushing tlb of the faulting
> address fixes the problem.

Ugh ?

What is the problem precisely ? This is just a dodgy workaround for an
unexplained problem. Normally, the kernel _WILL_ cause a tlb flush after
manipulating a PTE.

Ben.

> ---
> commit 978e2f36b1ae53e37ba27b3ab8f1c5ddbb8c8a10
> tree 7dd0e403c240162b1925db0834d694f4b4a0e95e
> parent ca02ea5aebcda886d1552c6af73ca96c02bf9fed
> author Pantelis Antoniou  Sun, 30 Oct 2005 21:53:48 +0200
> committer Pantelis Antoniou  Sun, 30 Oct 2005 21:53:48 +0200
> 
>  arch/ppc/mm/fault.c |   13 +
>  1 files changed, 13 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/ppc/mm/fault.c b/arch/ppc/mm/fault.c
> --- a/arch/ppc/mm/fault.c
> +++ b/arch/ppc/mm/fault.c
> @@ -240,6 +240,19 @@ good_area:
>   goto bad_area;
>   if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
>   goto bad_area;
> +
> +#ifdef CONFIG_8xx
> + {
> + /* 8xx is retarded; news at 11 */
> + pte_t *ptep = NULL;
> +
> + if (get_pteptr(mm, address, &ptep) && pte_present(*ptep))
> + _tlbie(address);
> +
> + if (ptep != NULL)
> + pte_unmap(ptep);
> + }
> +#endif
>   }
>  
>   /*




[PATCH 2.6.14] mm: 8xx MM fix for

2005-10-30 Thread Pantelis Antoniou
Latest MMU changes caused 8xx to stop working. Flushing the TLB entry of the
faulting address fixes the problem.

---
commit 978e2f36b1ae53e37ba27b3ab8f1c5ddbb8c8a10
tree 7dd0e403c240162b1925db0834d694f4b4a0e95e
parent ca02ea5aebcda886d1552c6af73ca96c02bf9fed
author Pantelis Antoniou  Sun, 30 Oct 2005 21:53:48 +0200
committer Pantelis Antoniou  Sun, 30 Oct 2005 21:53:48 +0200

 arch/ppc/mm/fault.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/ppc/mm/fault.c b/arch/ppc/mm/fault.c
--- a/arch/ppc/mm/fault.c
+++ b/arch/ppc/mm/fault.c
@@ -240,6 +240,19 @@ good_area:
goto bad_area;
if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
goto bad_area;
+
+#ifdef CONFIG_8xx
+   {
+   /* 8xx is retarded; news at 11 */
+   pte_t *ptep = NULL;
+
+   if (get_pteptr(mm, address, &ptep) && pte_present(*ptep))
+   _tlbie(address);
+
+   if (ptep != NULL)
+   pte_unmap(ptep);
+   }
+#endif
}
 
/*