On Mon, Jul 29, 2013 at 10:19:37AM +0200, Jonas Bonn wrote:
> > As a rough estimate (by looking at simulation waveforms and comparing
> > the time spent in the tlb miss exception handler and the time spent when
> > doing a hw reload), the hardware tlb reload should be about 7.5 times faster
> > than the software reload.
> > The hardware reload isn't completely optimized, so you could still shave off
> > a couple of cycles there.
> > Perhaps that is true for the tlb miss handler in Linux too, so the rough
> > estimate is probably a good enough indicator at what kind of speedup we
> > can estimate from this.
>
> Walking the page table is pretty much the same operation whether it be
> in software or hardware... the savings are the context switch
> associated with the exception handler.
>

+ the overhead of doing shift and mask operations.

> >
> > First, it doesn't exactly follow the arch specification's definition of the
> > pagetable entries (pte), instead it uses the pte layout that our Linux port
> > defines.
> >
> > Let me illustrate the differences.
> > or1k arch spec pte layout:
> > | 31 ... 10 | 9 | 8 ... 6  | 5 | 4 |  3  |  2  | 1  | 0  |
> > |    PPN    | L | PP INDEX | D | A | WOM | WBC | CI | CC |
> >
>
> We have 8K pages... why so many bits for the PPN? Bits 31, 30, and 29
> are never used?
>

My guess is that this is another oversight in the arch spec.
The page size was probably not decided when they defined the PTE.
Bits 12, 11, and 10 are not used.

> > Linux pte layout:
> > | 31 ... 12 |   11   |  10  |  9  |  8  |  7  |  6  | 5 | 4 |  3  |  2  | 1  |    0    |
> > |    PPN    | SHARED | EXEC | SWE | SRE | UWE | URE | D | A | WOM | WBC | CI | PRESENT |
> >

So, I was sloppy here: the PPN is actually bits 31 - 13, and bit 12 is not used.

> > The biggest difference is that the arch spec defines a separate register
> > (xMMUPR) which holds a table of protection bits, and the PP INDEX field
> > of the pte is used to pick out the "right" protection flags from that.
> > In our Linux port on the other hand, it has been chosen to not follow
> > this and embed the protection bits straight into the pte (which of
> > course is perfectly fine as it was designed for software tlb reload).
> > So, the question is, should we change Linux to be compliant with the
> > arch spec's definition of the ptes and start using a PP index field or
> > change the arch spec to allow usage of the Linux definition?
>
> What are the protection combinations that are actually used:
>
> SR|SW|SX /* really? */
> SR|SW
> SR|SX
> SR
>
> /* We may want to drop the SX's here */
> SR|SW|SX|UR|UW|UX
> SR|SW|SX|UR|UX
> SR|SW|SX|UR|UW
> SR|SW|SX|UR
>
> Is that all? If yes, then SR is always set, and UR is always set for
> user pages.
>

I'm not sure, but it doesn't sound too far-fetched.

> So we have:
>
> USERPAGE?
> WRITABLE?
> EXECUTABLE?
>
> ...that's 3 bits, which maps nicely into the 3 bits available for PP
> INDEX. So the hardware and software implementations aren't
> contradictory there.
>

Yes, that sounds good.
And we could set up a static mapping in the PPI registers for that,
and leave it up to implementations to actually implement the PPI registers
or use the same static mapping.

> L ("link") isn't interesting for the software implementation, so
> reusing that for SHARED is fine there, but the HW implementation wants
> L... where does SHARED go in that case and, furthermore, what is it
> even used for? Somebody please check what that SHARED flag is doing.
>

We can do without the L if we just assume a two-level page table,
but it sounds better to actually do it properly and set L ("last" not "link")
on the actual PTE.

I'm not sure about SHARED. Looking around, many architectures seem
to set it to a combination of PRESENT, USER, RW and ACCESSED.
I'll continue investigating this.

> Finally, we play games with the CC bit since we don't have an SMP
> Implementation of OpenRISC. It's used to indicate that a page is
> swapped out. And the WBC bit is aliased to distinguish page cache via
> the PAGE_FILE flag. For the HW implementation we can't do this, so
> where do we put these? Bits 31 and 30 and have the HW mapper mask
> them out?
>

So, under the assumption that all of the above turns out ok (and the SHARED
flag has to have its own bit), our Linux PTE could look something like this:

| 31 ... 13 |  12  |   11   |   10    | 9 | 8 | 7 | 6 | 5 | 4 |  3  |  2  | 1  | 0  |
|    PPN    | FILE | SHARED | PRESENT | L | X | W | U | D | A | WOM | WBC | CI | CC |

> > arch/openrisc/include/asm/spr_defs.h:
> > The defines for the bitfields of xMMUCR are wrong in all of our spr_defs.h,
> > I tried to dig into where those defines come from, but both the arch spec
> > and spr_defs.h have been different since the beginning of time (or as long
> > back as the commit histories date back, some time in year 2000).
> >
>
> I think that file has even more errors than that. Didn't somebody fix
> this file up in or1ksim but not sync the kernel version?
>

I think they are pretty much in sync, but I'm not sure.
There might be more errors in it, but at least this one was in all versions
I looked at.

> > arch/openrisc/mm/init.c:
> > arch/openrisc/mm/tlb.c:
> > The correct value of the pagetable base pointer is updated to the xMMUCR
> > registers right after paging is initially set up and on each switch_mm.
> >
> > arch/openrisc/mm/fault.c:
> > do_pagefault is called a bit differently when it is called from the
> > pagefault exception vectors and when it is called from the tlb miss
> > exception vectors.
> > I've put in a hack there to make that difference disappear, but this has
> > to be addressed properly and as I see it there are two ways.
> >
> >
> > 1) Do the necessary checks in do_pagefault to see if it should handle a
> > protection fault, or a missing page fault.
>
> I think this is the right approach.
>

I think so too, so let's do that.
I'll take a look at what needs to be done.

> >     if (address >= VMALLOC_START &&
> > -       (vector != 0x300 && vector != 0x400) &&
> > +       /*(vector != 0x300 && vector != 0x400) &&*/
> >         !user_mode(regs))
> >             goto vmalloc_fault;
>
> This won't work as things stand today...
>

Certainly not, but it served as a good example of the problem I wanted to show.

Stefan
_______________________________________________
Linux mailing list
Linux@lists.openrisc.net
http://lists.openrisc.net/listinfo/linux
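
To make the proposed PTE layout and the 3-bit U/W/X idea from the discussion
above a bit more concrete, here is a rough sketch in C. It is only an
illustration under stated assumptions: all macro, enum and function names are
hypothetical (they are not taken from the kernel sources or the arch spec), and
the table contents simply mirror the protection combinations listed in the
thread, with the U/W/X bits at positions 6..8 doubling as the PP INDEX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the PTE layout proposed above. */
#define PTE_CC        (1u << 0)
#define PTE_CI        (1u << 1)
#define PTE_WBC       (1u << 2)
#define PTE_WOM       (1u << 3)
#define PTE_A         (1u << 4)
#define PTE_D         (1u << 5)
#define PTE_U         (1u << 6)   /* user page */
#define PTE_W         (1u << 7)   /* writable */
#define PTE_X         (1u << 8)   /* executable */
#define PTE_L         (1u << 9)   /* "last" level */
#define PTE_PRESENT   (1u << 10)
#define PTE_SHARED    (1u << 11)
#define PTE_FILE      (1u << 12)
#define PTE_PPN_SHIFT 13          /* 8K pages: PPN is bits 31..13 */

/* U, W and X sit exactly where the arch spec puts PP INDEX (bits 8..6). */
#define PTE_PPI_SHIFT 6
#define PTE_PPI_MASK  (0x7u << PTE_PPI_SHIFT)

/* Illustrative encoding of access rights, not the real xMMUPR format. */
enum {
    RIGHT_SR = 1 << 0, RIGHT_SW = 1 << 1, RIGHT_SX = 1 << 2,
    RIGHT_UR = 1 << 3, RIGHT_UW = 1 << 4, RIGHT_UX = 1 << 5
};

#define RIGHT_SALL (RIGHT_SR | RIGHT_SW | RIGHT_SX)

/*
 * Static PP INDEX -> rights mapping, indexed by (X << 2) | (W << 1) | U.
 * The eight entries are the eight combinations listed in the thread:
 * SR is always set, and user pages keep full supervisor rights.
 */
static const uint8_t ppi_rights[8] = {
    /* 0: ---  */ RIGHT_SR,
    /* 1: U    */ RIGHT_SALL | RIGHT_UR,
    /* 2: W    */ RIGHT_SR | RIGHT_SW,
    /* 3: U W  */ RIGHT_SALL | RIGHT_UR | RIGHT_UW,
    /* 4: X    */ RIGHT_SR | RIGHT_SX,
    /* 5: U X  */ RIGHT_SALL | RIGHT_UR | RIGHT_UX,
    /* 6: W X  */ RIGHT_SALL,
    /* 7: U W X*/ RIGHT_SALL | RIGHT_UR | RIGHT_UW | RIGHT_UX
};

/* Extract the 3-bit protection index from a PTE. */
static inline unsigned pte_ppi(uint32_t pte)
{
    return (pte & PTE_PPI_MASK) >> PTE_PPI_SHIFT;
}

/* Build a PTE from a physical page number and flag bits. */
static inline uint32_t pte_make(uint32_t ppn, uint32_t flags)
{
    assert(ppn < (1u << (32 - PTE_PPN_SHIFT)));   /* PPN must fit in 19 bits */
    assert((flags >> PTE_PPN_SHIFT) == 0);        /* flags must stay below PPN */
    return (ppn << PTE_PPN_SHIFT) | flags;
}
```

With this, something like ppi_rights[pte_ppi(pte)] would give the rights for a
PTE, which is roughly what a hardware walker with a fixed PPI mapping would have
to compute. Whether the SX bits should be dropped for user pages, as suggested
above, only changes the table contents and not the PTE layout.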