On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > -----Original Message-----
> > From: Tom Rini [mailto:trini at kernel.crashing.org]
> > Sent: 07 November 2005 16:52
> > To: Marcelo Tosatti
> > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek;
> > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> >
> > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> > > Joakim!
> > >
> > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > > > Hi Marcelo
> > > >
> > > > [SNIP]
> > > > > The root of the problem are the changes against the 8xx TLB
> > > > > handlers introduced during v2.6. What happens is the TLBMiss
> > > > > handlers load the zeroed pte into the TLB, causing the TLBError
> > > > > handler to be invoked (thats two TLB faults per pagefault),
> > > > > which then jumps to the generic MM code to setup the pte.
> > > > >
> > > > > The bug is that the zeroed TLB is not invalidated (the same
> > > > > reason for the "dcbst" misbehaviour), resulting in infinite
> > > > > TLBError faults.
> > > > >
> > > > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > > >
> > > > This is one reason why it is the way it is:
> > > > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > > This details are little fuzzy ATM, but I think the reason for the
> > > > current impl. was only that it was less intrusive to impl.
> > >
> > > Ah, I see. I wonder if the bug is processor specific: we don't have
> > > such changes in our v2.4 tree and never experienced such problem.
> > >
> > > It should be pretty easy to hit it right? (instruction pagefaults
> > > should fail).
> > >
> > > Grigori, Tom, can you enlight us about the issue on the URL above.
> > > How can it be triggered?
> >
> > So after looking at the code in 2.6.14 and current git, I think the
> > above URL isn't relevant, unless there was a change I missed (which
> > could totally be possible) that reverted the patch there and fixed
> > that issue in a different manner. But since I didn't figure that out
> > until I had finished researching it again:
>
> I wasn't clear enough. What I meant was that the above patch made me
> think and the result was that I came up with a simpler fix, the "two
> exception" fix that is in current kernels. See
> http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/head_8xx.S@1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/kernel|hist/arch/ppc/kernel/head_8xx.S
> It appears this fix has some other issues :(
>
> How do the other ppc arches do? I am guessing that they don't double
> fault, but bails out to do_page_fault from the TLB Miss handler, like
> 8xx used to do.

Assuming Dan doesn't come up with a simpler and better fix, maybe we
should go back to the original patch I made?

-- 
Tom Rini
http://gate.crashing.org/~trini/