Great, this will definitely speed things up. I would suggest enabling the hardware tlb refill through bit 17 of the Supervision Register (SR) rather than through a zero or nonzero DMMUCR or IMMUCR register. That would be more consistent with how the specification controls such features.
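
To make the difference concrete, here is a rough sketch in plain C of the two enable policies, using a toy model of the registers rather than real SPR accesses. The name SPR_SR_HTR, the struct mmu_sprs and the helper functions are made up for illustration only; the arch spec does not define a name for SR bit 17, and none of this is taken from mor1kx or the kernel. The PTBP mask is the one from the patch quoted below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the proposed enable bit (SR bit 17). */
#define SPR_SR_HTR      (UINT32_C(1) << 17)
/* Page table base pointer field of xMMUCR, as defined in the quoted patch. */
#define SPR_DMMUCR_PTBP UINT32_C(0xfffffc00)

/* Toy model of the registers involved; not real hardware access. */
struct mmu_sprs {
    uint32_t sr;     /* Supervision Register */
    uint32_t dmmucr; /* Data MMU Control Register */
};

/* Policy described in the quoted mail: refill is active iff xMMUCR != 0. */
static inline bool hw_refill_enabled_by_mmucr(const struct mmu_sprs *s)
{
    return s->dmmucr != 0;
}

/* Proposed policy: refill is active iff the dedicated SR bit is set. */
static inline bool hw_refill_enabled_by_sr(const struct mmu_sprs *s)
{
    return (s->sr & SPR_SR_HTR) != 0;
}

/* Store the physical pgd address into the PTBP field (low 10 bits dropped). */
static inline void set_pgd_base(struct mmu_sprs *s, uint32_t pgd_phys)
{
    s->dmmucr = pgd_phys & SPR_DMMUCR_PTBP;
    /* The stored value never touches the bits outside the PTBP field. */
    assert((s->dmmucr & ~SPR_DMMUCR_PTBP) == 0);
}

/* Switch the proposed SR enable bit on or off without touching xMMUCR. */
static inline void set_hw_refill(struct mmu_sprs *s, bool on)
{
    if (on)
        s->sr |= SPR_SR_HTR;
    else
        s->sr &= ~SPR_SR_HTR;
}
```

In this model the base pointer and the enable state are independent under the SR policy: the pgd can be written at any time and refill switched on later, whereas under the xMMUCR policy writing any nonzero base pointer switches refill on as a side effect.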

On 7/27/2013 9:02 PM, Stefan Kristiansson wrote:
> Good news everybody!
>
> We've got some hardware tlb reload going on in the hottest OpenRISC 1000
> implementation there is, no more wasting instructions on tlb miss exceptions
> when running Linux.
>
> As a rough estimate (by looking at simulation waveforms and comparing
> the time spent in the tlb miss exception handler and the time spent when
> doing a hw reload), the hardware tlb reload should be about 7.5 times faster
> than the software reload.
> The hardware reload isn't completely optimized, so you could still shave off
> a couple of cycles there.
> Perhaps that is true for the tlb miss handler in Linux too, so the rough
> estimate is probably a good enough indicator of what kind of speedup we
> can expect from this.
>
> Another rough estimate of how much time is spent in the tlb miss vectors
> was done by running 'gcc hello_world.c -o hello_world' in the jor1k
> emulator (http://s-macke.github.io/jor1k/) and by using the stats from that
> we saw that (momentarily) roughly up to 25% of the time was spent in the
> dtlb miss exception handler.
> This could of course also be improved by increasing the number of sets and
> ways used in the mmus, but that's another topic that might be addressed in
> the future.
>
> As always, you can find it in the github repos at:
> https://github.com/openrisc/mor1kx
>
> But before we bring out the champagne and start celebrating, some notes about
> the implementation that need some discussion.
>
> First, it doesn't exactly follow the arch specification's definition of the
> pagetable entries (pte), instead it uses the pte layout that our Linux port
> defines.
>
> Let me illustrate the differences.
> or1k arch spec pte layout:
> | 31 ... 10 | 9 | 8 ... 6  | 5 | 4 | 3 | 2 | 1 | 0 |
> |    PPN    | L | PP INDEX | D | A |WOM|WBC|CI |CC |
>
> Linux pte layout:
> | 31 ... 12 |  11  | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 |   0   |
> |    PPN    |SHARED|EXEC|SWE|SRE|UWE|URE| D | A |WOM|WBC|CI |PRESENT|
>
> The biggest difference is that the arch spec defines a separate register
> (xMMUPR) which holds a table of protection bits, and the PP INDEX field
> of the pte is used to pick out the "right" protection flags from that.
> In our Linux port on the other hand, it has been chosen to not follow
> this and embed the protection bits straight into the pte (which of
> course is perfectly fine as it was designed for software tlb reload).
> So, the question is, should we change Linux to be compliant with the
> arch spec's definition of the ptes and start using a PP index field or
> change the arch spec to allow usage of the Linux definition?
>
> Second, naturally there are a couple of changes needed to Linux for this to
> work.
> The changes are minor but need commenting before proper patches are sent out.
> The full diff is available at the end of this mail, but I'll first comment on
> the changes to each file.
>
> arch/openrisc/include/asm/spr_defs.h:
> The defines for the bitfields of xMMUCR are wrong in all of our spr_defs.h,
> I tried to dig into where those defines come from, but both the arch spec
> and spr_defs.h have been different since the beginning of time (or as long
> back as the commit histories date back, some time in year 2000).
>
> arch/openrisc/kernel/head.S:
> The implementation in mor1kx works such that if the xMMUCR register is 0,
> it will generate tlb miss exceptions, so we have to make sure that it
> is zero when the MMUs are enabled, so the boot tlb miss handlers are used
> until paging is set up.
>
> arch/openrisc/mm/init.c:
> arch/openrisc/mm/tlb.c:
> The correct value of the pagetable base pointer is updated to the xMMUCR
> registers right after paging is initially set up and on each switch_mm.
>
> arch/openrisc/mm/fault.c:
> do_pagefault is called a bit differently when it is called from the pagefault
> exception vectors and when it is called from the tlb miss exception vectors.
> I've put in a hack there to make that difference disappear, but this has
> to be addressed properly and as I see it there are two ways.
>
> 1) Do the necessary checks in do_pagefault to see if it should handle a
>    protection fault, or a missing page fault.
> 2) Make mor1kx generate a tlb miss exception instead of a pagefault when the
>    pte table pointer is zero or the PRESENT bit is not set.
>
> Some thoughts and comments on those issues, please!
>
> Stefan
>
> --- >8 ---
> diff --git a/arch/openrisc/include/asm/spr_defs.h b/arch/openrisc/include/asm/spr_defs.h
> index 5dbc668..1d20915 100644
> --- a/arch/openrisc/include/asm/spr_defs.h
> +++ b/arch/openrisc/include/asm/spr_defs.h
> @@ -226,19 +226,15 @@
>   * Bit definitions for the Data MMU Control Register
>   *
>   */
> -#define SPR_DMMUCR_P2S         0x0000003e  /* Level 2 Page Size */
> -#define SPR_DMMUCR_P1S         0x000007c0  /* Level 1 Page Size */
> -#define SPR_DMMUCR_VADDR_WIDTH 0x0000f800  /* Virtual ADDR Width */
> -#define SPR_DMMUCR_PADDR_WIDTH 0x000f0000  /* Physical ADDR Width */
> +#define SPR_DMMUCR_PTBP        0xfffffc00  /* Page Table Base Pointer */
> +#define SPR_DMMUCR_DTF         0x00000001  /* DTLB Flush */
>
>  /*
>   * Bit definitions for the Instruction MMU Control Register
>   *
>   */
> -#define SPR_IMMUCR_P2S         0x0000003e  /* Level 2 Page Size */
> -#define SPR_IMMUCR_P1S         0x000007c0  /* Level 1 Page Size */
> -#define SPR_IMMUCR_VADDR_WIDTH 0x0000f800  /* Virtual ADDR Width */
> -#define SPR_IMMUCR_PADDR_WIDTH 0x000f0000  /* Physical ADDR Width */
> +#define SPR_IMMUCR_PTBP        0xfffffc00  /* Page Table Base Pointer */
> +#define SPR_IMMUCR_ITF         0x00000001  /* ITLB Flush */
>
>  /*
>   * Bit definitions for the Data TLB Match Register
> diff --git a/arch/openrisc/kernel/head.S b/arch/openrisc/kernel/head.S
> index 1d3c9c2..59a3263 100644
> --- a/arch/openrisc/kernel/head.S
> +++ b/arch/openrisc/kernel/head.S
> @@ -541,6 +541,15 @@ flush_tlb:
>
>  enable_mmu:
>          /*
> +         * Make sure the page table base pointer is cleared
> +         * ( = hardware tlb fill disabled)
> +         */
> +        l.movhi r30,0
> +        l.mtspr r0,r30,SPR_DMMUCR
> +        l.movhi r30,0
> +        l.mtspr r0,r30,SPR_IMMUCR
> +
> +        /*
>           * enable dmmu & immu
>           * SR[5] = 0, SR[6] = 0, 6th and 7th bit of SR set to 0
>           */
> diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
> index e2bfafc..4c07a20 100644
> --- a/arch/openrisc/mm/fault.c
> +++ b/arch/openrisc/mm/fault.c
> @@ -78,7 +78,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
>           */
>
>          if (address >= VMALLOC_START &&
> -            (vector != 0x300 && vector != 0x400) &&
> +            /*(vector != 0x300 && vector != 0x400) &&*/
>              !user_mode(regs))
>                  goto vmalloc_fault;
>
> diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
> index e7fdc50..d8b8068 100644
> --- a/arch/openrisc/mm/init.c
> +++ b/arch/openrisc/mm/init.c
> @@ -191,6 +191,14 @@ void __init paging_init(void)
>          mtspr(SPR_ICBIR, 0x900);
>          mtspr(SPR_ICBIR, 0xa00);
>
> +        /*
> +         * Update the pagetable base pointer, to enable hardware tlb refill if
> +         * supported by the hardware
> +         */
> +        mtspr(SPR_IMMUCR, __pa(current_pgd) & SPR_IMMUCR_PTBP);
> +        mtspr(SPR_DMMUCR, __pa(current_pgd) & SPR_DMMUCR_PTBP);
> +
> +
>          /* New TLB miss handlers and kernel page tables are in now place.
>           * Make sure that page flags get updated for all pages in TLB by
>           * flushing the TLB and forcing all TLB entries to be recreated
> diff --git a/arch/openrisc/mm/tlb.c b/arch/openrisc/mm/tlb.c
> index 683bd4d..96e6df3 100644
> --- a/arch/openrisc/mm/tlb.c
> +++ b/arch/openrisc/mm/tlb.c
> @@ -151,6 +151,14 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>           */
>          current_pgd = next->pgd;
>
> +        /*
> +         * Update the pagetable base pointer with the new pgd.
> +         * This only have effect on implementations with hardware tlb refill
> +         * support.
> +         */
> +        mtspr(SPR_IMMUCR, __pa(current_pgd) & SPR_IMMUCR_PTBP);
> +        mtspr(SPR_DMMUCR, __pa(current_pgd) & SPR_DMMUCR_PTBP);
> +
>          /* We don't have context support implemented, so flush all
>           * entries belonging to previous map
>           */
> --- >8 ---
> _______________________________________________
> Linux mailing list
> Linux@lists.openrisc.net
> http://lists.openrisc.net/listinfo/linux