Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Wed, Sep 04, 2019 at 01:23:36PM +1000, Alastair D'Silva wrote:
> > Maybe also add "msr" in the clobbers.
>
> Ok.

There is no known register "msr" in GCC.


Segher
RE: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Tue, 2019-09-03 at 22:11 +0200, Gabriel Paubert wrote:
> On Tue, Sep 03, 2019 at 01:31:57PM -0500, Segher Boessenkool wrote:
> > On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote:
> > > Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> > > > (Why are they separate though? It could just be one loop var).
> > >
> > > Yes it could just be a single loop var, but in that case it would
> > > have to be reset at the start of the second loop, which means we
> > > would have to pass 'addr' for resetting the loop anyway,
> >
> > Right, I noticed that after hitting send, as usual.
> >
> > > so I opted to do it outside the inline asm by using two separate
> > > loop vars set to their starting value outside the inline asm.
> >
> > The thing is, the way it is written now, it will get separate
> > registers for each loop (with proper earlyclobbers added). Not that
> > that really matters of course, it just feels wrong :-)
>
> After "mtmsr %3", it is always possible to copy %0 to %3 and use it as
> an address register for the second loop. One register less to allocate
> for the compiler. Constraints of course have to be adjusted.

Given that we're dealing with registers holding data that has been named
outside the assembler, this feels dirty. We'd be using the register
passed in as 'msr' to hold the address instead. Since we're not short on
registers, I don't see this as a good change.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
RE: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Tue, 2019-09-03 at 11:04 -0500, Segher Boessenkool wrote:
> On Tue, Sep 03, 2019 at 04:28:09PM +0200, Christophe Leroy wrote:
> > Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
> > > On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> > > > +	asm volatile(
> > > > +		"	mtctr %2;"
> > > > +		"	mtmsr %3;"
> > > > +		"	isync;"
> > > > +		"0:	dcbst	0, %0;"
> > > > +		"	addi	%0, %0, %4;"
> > > > +		"	bdnz	0b;"
> > > > +		"	sync;"
> > > > +		"	mtctr %2;"
> > > > +		"1:	icbi	0, %1;"
> > > > +		"	addi	%1, %1, %4;"
> > > > +		"	bdnz	1b;"
> > > > +		"	sync;"
> > > > +		"	mtmsr %5;"
> > > > +		"	isync;"
> > > > +		: "+r" (loop1), "+r" (loop2)
> > > > +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> > > > +		: "ctr", "memory");
> > >
> > > This outputs as one huge assembler statement, all on one line.
> > > That's going to be fun to read or debug.
> >
> > Do you mean \n has to be added after the ; ?
>
> Something like that. There is no really satisfying way for doing huge
> inline asm, and maybe that is a good thing ;-)
>
> Often people write \n\t at the end of each line of inline asm. This
> works pretty well (but then there are labels, oh joy).
>
> > > loop1 and/or loop2 can be assigned the same register as msr0 or
> > > nb. They need to be made earlyclobbers. (msr is fine, all of its
> > > reads are before any writes to loop1 or loop2; and bytes is fine,
> > > it's not a register).
> >
> > Can you be explicit please? Doesn't '+r' mean that they are input
> > and output at the same time ?
>
> That is what + means, yes -- that this output is an input as well. It
> is the same to write
>
>   asm("mov %1,%0 ; mov %0,42" : "+r"(x), "=r"(y));
> or to write
>   asm("mov %1,%0 ; mov %0,42" : "=r"(x), "=r"(y) : "0"(x));
>
> (So not "at the same time" as in "in the same machine instruction",
> but more loosely, as in "in the same inline asm statement").
>
> > "to be made earlyclobbers", what does this mean exactly ? How to
> > do that ?
>
> You write &, like "+" in this case. It means the machine code writes
> to this register before it has consumed all asm inputs (remember, GCC
> does not understand (or even parse!) the assembler string).
>
> So just
>
>   : "+&r" (loop1), "+&r" (loop2)
>
> will do. (Why are they separate though? It could just be one loop
> var).

Thanks, I've updated these.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
RE: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Tue, 2019-09-03 at 08:08 +0200, Christophe Leroy wrote:
> Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> > From: Alastair D'Silva
> >
> > Similar to commit 22e9c88d486a
> > ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
> > this patch converts the following ASM symbols to C:
> >   flush_icache_range()
> >   __flush_dcache_icache()
> >   __flush_dcache_icache_phys()
> >
> > This was done as we discovered a long-standing bug where the length
> > of the range was truncated due to using a 32 bit shift instead of a
> > 64 bit one.
> >
> > By converting these functions to C, it becomes easier to maintain.
> >
> > flush_dcache_icache_phys() retains a critical assembler section as
> > we must ensure there are no memory accesses while the data MMU is
> > disabled (authored by Christophe Leroy). Since this has no external
> > callers, it has also been made static, allowing the compiler to
> > inline it within flush_dcache_icache_page().
> >
> > Signed-off-by: Alastair D'Silva
> > Signed-off-by: Christophe Leroy
> > ---
> >   arch/powerpc/include/asm/cache.h      |  26 ++---
> >   arch/powerpc/include/asm/cacheflush.h |  24 ++--
> >   arch/powerpc/kernel/misc_32.S         | 117
> >   arch/powerpc/kernel/misc_64.S         | 102 -
> >   arch/powerpc/mm/mem.c                 | 152 +-
> >   5 files changed, 173 insertions(+), 248 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
> > index f852d5cd746c..91c808c6738b 100644
> > --- a/arch/powerpc/include/asm/cache.h
> > +++ b/arch/powerpc/include/asm/cache.h
> > @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
> >   #endif
> >   #endif /* ! __ASSEMBLY__ */
> >
> > -#if defined(__ASSEMBLY__)
> > -/*
> > - * For a snooping icache, we still need a dummy icbi to purge all the
> > - * prefetched instructions from the ifetch buffers. We also need a sync
> > - * before the icbi to order the the actual stores to memory that might
> > - * have modified instructions with the icbi.
> > - */
> > -#define PURGE_PREFETCHED_INS	\
> > -	sync;			\
> > -	icbi	0,r3;		\
> > -	sync;			\
> > -	isync
> > -
> > -#else
> > +#if !defined(__ASSEMBLY__)
> >   #define __read_mostly __attribute__((__section__(".data..read_mostly")))
> >
> >   #ifdef CONFIG_PPC_BOOK3S_32
> > @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
> >   {
> >   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
> >   }
> > +
> > +static inline void icbi(void *addr)
> > +{
> > +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
>
> I think "__asm__ __volatile__" is deprecated. Use "asm volatile"
> instead.

Ok.

> > +}
> > +
> > +static inline void iccci(void *addr)
> > +{
> > +	__asm__ __volatile__ ("iccci 0, %0" : : "r"(addr) : "memory");
> > +}
> > +
>
> Same
>
> >   #endif /* !__ASSEMBLY__ */
> >   #endif /* __KERNEL__ */
> >   #endif /* _ASM_POWERPC_CACHE_H */
> > diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> > index ed57843ef452..4a1c9f0200e1 100644
> > --- a/arch/powerpc/include/asm/cacheflush.h
> > +++ b/arch/powerpc/include/asm/cacheflush.h
> > @@ -42,24 +42,20 @@ extern void flush_dcache_page(struct page *page);
> >   #define flush_dcache_mmap_lock(mapping)		do { } while (0)
> >   #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
> >
> > -extern void flush_icache_range(unsigned long, unsigned long);
> > +void flush_icache_range(unsigned long start, unsigned long stop);
> >   extern void flush_icache_user_range(struct vm_area_struct *vma,
> >   				    struct page *page, unsigned long addr,
> >   				    int len);
> > -extern void __flush_dcache_icache(void *page_va);
> >   extern void flush_dcache_icache_page(struct page *page);
> > -#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
> > -extern void __flush_dcache_icache_phys(unsigned long physaddr);
> > -#else
> > -static inline void __flush_dcache_icache_phys(unsigned long physaddr)
> > -{
> > -	BUG();
> > -}
> > -#endif
> > -
> > -/*
> > - * Write any modified data cache blocks out to memory and invalidate them.
> > - * Does not invalidate the corresponding instruction cache blocks.
> > +void __flush_dcache_icache(void *page);
> > +
> > +/**
> > + * flush_dcache_range(): Write any modified data cache blocks out to memory and
> > + * invalidate them. Does not invalidate the corresponding instruction cache
> > + * blocks.
> > + *
> > + * @start: the start address
> > + * @stop: the stop address (exclusive)
> >    */
> >   static inline void flush_dcache_range(unsigned long start, unsigned long stop)
> >   {
> > diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> > index
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Tue, Sep 03, 2019 at 01:31:57PM -0500, Segher Boessenkool wrote:
> On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote:
> > Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> > > (Why are they separate though? It could just be one loop var).
> >
> > Yes it could just be a single loop var, but in that case it would
> > have to be reset at the start of the second loop, which means we
> > would have to pass 'addr' for resetting the loop anyway,
>
> Right, I noticed that after hitting send, as usual.
>
> > so I opted to do it outside the inline asm by using two separate
> > loop vars set to their starting value outside the inline asm.
>
> The thing is, the way it is written now, it will get separate
> registers for each loop (with proper earlyclobbers added). Not that
> that really matters of course, it just feels wrong :-)

After "mtmsr %3", it is always possible to copy %0 to %3 and use it as
an address register for the second loop. One register less to allocate
for the compiler. Constraints of course have to be adjusted.

	Gabriel

>
> Segher
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote:
> Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> > (Why are they separate though? It could just be one loop var).
>
> Yes it could just be a single loop var, but in that case it would have
> to be reset at the start of the second loop, which means we would have
> to pass 'addr' for resetting the loop anyway,

Right, I noticed that after hitting send, as usual.

> so I opted to do it outside the inline asm by using two separate loop
> vars set to their starting value outside the inline asm.

The thing is, the way it is written now, it will get separate registers
for each loop (with proper earlyclobbers added). Not that that really
matters of course, it just feels wrong :-)


Segher
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> On Tue, Sep 03, 2019 at 04:28:09PM +0200, Christophe Leroy wrote:
> > Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
> > > On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> > > > +	asm volatile(
> > > > +		"	mtctr %2;"
> > > > +		"	mtmsr %3;"
> > > > +		"	isync;"
> > > > +		"0:	dcbst	0, %0;"
> > > > +		"	addi	%0, %0, %4;"
> > > > +		"	bdnz	0b;"
> > > > +		"	sync;"
> > > > +		"	mtctr %2;"
> > > > +		"1:	icbi	0, %1;"
> > > > +		"	addi	%1, %1, %4;"
> > > > +		"	bdnz	1b;"
> > > > +		"	sync;"
> > > > +		"	mtmsr %5;"
> > > > +		"	isync;"
> > > > +		: "+r" (loop1), "+r" (loop2)
> > > > +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> > > > +		: "ctr", "memory");
> > >
> > > This outputs as one huge assembler statement, all on one line.
> > > That's going to be fun to read or debug.
> >
> > Do you mean \n has to be added after the ; ?
>
> Something like that. There is no really satisfying way for doing huge
> inline asm, and maybe that is a good thing ;-)
>
> Often people write \n\t at the end of each line of inline asm. This
> works pretty well (but then there are labels, oh joy).
>
> > > loop1 and/or loop2 can be assigned the same register as msr0 or
> > > nb. They need to be made earlyclobbers. (msr is fine, all of its
> > > reads are before any writes to loop1 or loop2; and bytes is fine,
> > > it's not a register).
> >
> > Can you be explicit please? Doesn't '+r' mean that they are input
> > and output at the same time ?
>
> That is what + means, yes -- that this output is an input as well. It
> is the same to write
>
>   asm("mov %1,%0 ; mov %0,42" : "+r"(x), "=r"(y));
> or to write
>   asm("mov %1,%0 ; mov %0,42" : "=r"(x), "=r"(y) : "0"(x));
>
> (So not "at the same time" as in "in the same machine instruction",
> but more loosely, as in "in the same inline asm statement").
>
> > "to be made earlyclobbers", what does this mean exactly ? How to
> > do that ?
>
> You write &, like "+" in this case. It means the machine code writes
> to this register before it has consumed all asm inputs (remember, GCC
> does not understand (or even parse!) the assembler string).
>
> So just
>
>   : "+&r" (loop1), "+&r" (loop2)
>
> will do. (Why are they separate though? It could just be one loop
> var).

Yes it could just be a single loop var, but in that case it would have
to be reset at the start of the second loop, which means we would have
to pass 'addr' for resetting the loop anyway, so I opted to do it
outside the inline asm by using two separate loop vars set to their
starting value outside the inline asm.

Christophe
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Tue, Sep 03, 2019 at 04:28:09PM +0200, Christophe Leroy wrote:
> Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
> > On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> > > +	asm volatile(
> > > +		"	mtctr %2;"
> > > +		"	mtmsr %3;"
> > > +		"	isync;"
> > > +		"0:	dcbst	0, %0;"
> > > +		"	addi	%0, %0, %4;"
> > > +		"	bdnz	0b;"
> > > +		"	sync;"
> > > +		"	mtctr %2;"
> > > +		"1:	icbi	0, %1;"
> > > +		"	addi	%1, %1, %4;"
> > > +		"	bdnz	1b;"
> > > +		"	sync;"
> > > +		"	mtmsr %5;"
> > > +		"	isync;"
> > > +		: "+r" (loop1), "+r" (loop2)
> > > +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> > > +		: "ctr", "memory");
> >
> > This outputs as one huge assembler statement, all on one line. That's
> > going to be fun to read or debug.
>
> Do you mean \n has to be added after the ; ?

Something like that. There is no really satisfying way for doing huge
inline asm, and maybe that is a good thing ;-)

Often people write \n\t at the end of each line of inline asm. This
works pretty well (but then there are labels, oh joy).

> > loop1 and/or loop2 can be assigned the same register as msr0 or nb.
> > They need to be made earlyclobbers. (msr is fine, all of its reads
> > are before any writes to loop1 or loop2; and bytes is fine, it's not
> > a register).
>
> Can you be explicit please? Doesn't '+r' mean that they are input and
> output at the same time ?

That is what + means, yes -- that this output is an input as well. It is
the same to write

  asm("mov %1,%0 ; mov %0,42" : "+r"(x), "=r"(y));
or to write
  asm("mov %1,%0 ; mov %0,42" : "=r"(x), "=r"(y) : "0"(x));

(So not "at the same time" as in "in the same machine instruction", but
more loosely, as in "in the same inline asm statement").

> "to be made earlyclobbers", what does this mean exactly ? How to do
> that ?

You write &, like "+" in this case. It means the machine code writes to
this register before it has consumed all asm inputs (remember, GCC does
not understand (or even parse!) the assembler string).

So just

  : "+&r" (loop1), "+&r" (loop2)

will do. (Why are they separate though? It could just be one loop var).


Segher
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
Le 03/09/2019 à 15:04, Segher Boessenkool a écrit :
> Hi!
>
> On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> > +#if !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)
>
> Please write that as &&? That is more usual, and thus, easier to read.
>
> > +static void flush_dcache_icache_phys(unsigned long physaddr)
> > +	asm volatile(
> > +		"	mtctr %2;"
> > +		"	mtmsr %3;"
> > +		"	isync;"
> > +		"0:	dcbst	0, %0;"
> > +		"	addi	%0, %0, %4;"
> > +		"	bdnz	0b;"
> > +		"	sync;"
> > +		"	mtctr %2;"
> > +		"1:	icbi	0, %1;"
> > +		"	addi	%1, %1, %4;"
> > +		"	bdnz	1b;"
> > +		"	sync;"
> > +		"	mtmsr %5;"
> > +		"	isync;"
> > +		: "+r" (loop1), "+r" (loop2)
> > +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> > +		: "ctr", "memory");
>
> This outputs as one huge assembler statement, all on one line. That's
> going to be fun to read or debug.

Do you mean \n has to be added after the ; ?

> loop1 and/or loop2 can be assigned the same register as msr0 or nb.
> They need to be made earlyclobbers. (msr is fine, all of its reads are
> before any writes to loop1 or loop2; and bytes is fine, it's not a
> register).

Can you be explicit please? Doesn't '+r' mean that they are input and
output at the same time ?

"to be made earlyclobbers", what does this mean exactly ? How to do
that ?

Christophe
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
Hi!

On Tue, Sep 03, 2019 at 03:23:57PM +1000, Alastair D'Silva wrote:
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c

> +#if !defined(CONFIG_PPC_8xx) & !defined(CONFIG_PPC64)

Please write that as &&? That is more usual, and thus, easier to read.

> +static void flush_dcache_icache_phys(unsigned long physaddr)

> +	asm volatile(
> +		"	mtctr %2;"
> +		"	mtmsr %3;"
> +		"	isync;"
> +		"0:	dcbst	0, %0;"
> +		"	addi	%0, %0, %4;"
> +		"	bdnz	0b;"
> +		"	sync;"
> +		"	mtctr %2;"
> +		"1:	icbi	0, %1;"
> +		"	addi	%1, %1, %4;"
> +		"	bdnz	1b;"
> +		"	sync;"
> +		"	mtmsr %5;"
> +		"	isync;"
> +		: "+r" (loop1), "+r" (loop2)
> +		: "r" (nb), "r" (msr), "i" (bytes), "r" (msr0)
> +		: "ctr", "memory");

This outputs as one huge assembler statement, all on one line. That's
going to be fun to read or debug.

loop1 and/or loop2 can be assigned the same register as msr0 or nb. They
need to be made earlyclobbers. (msr is fine, all of its reads are before
any writes to loop1 or loop2; and bytes is fine, it's not a register).


Segher
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
Christophe Leroy writes:
> Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
>> From: Alastair D'Silva
>>
>> Similar to commit 22e9c88d486a
>> ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
>> this patch converts the following ASM symbols to C:
>>   flush_icache_range()
>>   __flush_dcache_icache()
>>   __flush_dcache_icache_phys()
>>
>> This was done as we discovered a long-standing bug where the length of the
>> range was truncated due to using a 32 bit shift instead of a 64 bit one.
>>
>> By converting these functions to C, it becomes easier to maintain.
>>
>> flush_dcache_icache_phys() retains a critical assembler section as we must
>> ensure there are no memory accesses while the data MMU is disabled
>> (authored by Christophe Leroy). Since this has no external callers, it has
>> also been made static, allowing the compiler to inline it within
>> flush_dcache_icache_page().
>>
>> Signed-off-by: Alastair D'Silva
>> Signed-off-by: Christophe Leroy
>> ---
>>   arch/powerpc/include/asm/cache.h      |  26 ++---
>>   arch/powerpc/include/asm/cacheflush.h |  24 ++--
>>   arch/powerpc/kernel/misc_32.S         | 117
>>   arch/powerpc/kernel/misc_64.S         | 102 -
>>   arch/powerpc/mm/mem.c                 | 152 +-
>>   5 files changed, 173 insertions(+), 248 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
>> index f852d5cd746c..91c808c6738b 100644
>> --- a/arch/powerpc/include/asm/cache.h
>> +++ b/arch/powerpc/include/asm/cache.h
>> @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
>>   #endif
>>   #endif /* ! __ASSEMBLY__ */
>>
>> -#if defined(__ASSEMBLY__)
>> -/*
>> - * For a snooping icache, we still need a dummy icbi to purge all the
>> - * prefetched instructions from the ifetch buffers. We also need a sync
>> - * before the icbi to order the the actual stores to memory that might
>> - * have modified instructions with the icbi.
>> - */
>> -#define PURGE_PREFETCHED_INS	\
>> -	sync;			\
>> -	icbi	0,r3;		\
>> -	sync;			\
>> -	isync
>> -
>> -#else
>> +#if !defined(__ASSEMBLY__)
>>   #define __read_mostly __attribute__((__section__(".data..read_mostly")))
>>
>>   #ifdef CONFIG_PPC_BOOK3S_32
>> @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
>>   {
>>   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
>>   }
>> +
>> +static inline void icbi(void *addr)
>> +{
>> +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
>
> I think "__asm__ __volatile__" is deprecated. Use "asm volatile" instead.

Yes please.

>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>> index 9191a66b3bc5..cd540123874d 100644
>> --- a/arch/powerpc/mm/mem.c
>> +++ b/arch/powerpc/mm/mem.c
>> @@ -321,6 +321,105 @@ void free_initmem(void)
>>   	free_initmem_default(POISON_FREE_INITMEM);
>>   }
>>
>> +/*
>> + * Warning: This macro will perform an early return if the CPU has
>> + * a coherent icache. The intent is is call this early in function,
>> + * and handle the non-coherent icache variant afterwards.
>> + *
>> + * For a snooping icache, we still need a dummy icbi to purge all the
>> + * prefetched instructions from the ifetch buffers. We also need a sync
>> + * before the icbi to order the the actual stores to memory that might
>> + * have modified instructions with the icbi.
>> + */
>> +#define flush_coherent_icache_or_return(addr) {			\
>> +	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {		\
>> +		mb(); /* sync */				\
>> +		icbi(addr);					\
>> +		mb(); /* sync */				\
>> +		isync();					\
>> +		return;						\
>> +	}							\
>> +}
>
> I hate this kind of awful macro which kills code readability.

Yes I agree.

> Please do something like
>
> static bool flush_coherent_icache_or_return(unsigned long addr)
> {
> 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
> 		return false;
>
> 	mb(); /* sync */
> 	icbi(addr);
> 	mb(); /* sync */
> 	isync();
> 	return true;
> }
>
> then callers will do:
>
> 	if (flush_coherent_icache_or_return(addr))
> 		return;

I don't think it needs the "_or_return" in the name. eg, it can just be:

	if (flush_coherent_icache(addr))
		return;

Which reads fine I think, ie. flush the coherent icache, and if that
succeeds return, else continue.

cheers
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
Le 03/09/2019 à 07:23, Alastair D'Silva a écrit :
> From: Alastair D'Silva
>
> Similar to commit 22e9c88d486a
> ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
> this patch converts the following ASM symbols to C:
>   flush_icache_range()
>   __flush_dcache_icache()
>   __flush_dcache_icache_phys()
>
> This was done as we discovered a long-standing bug where the length of the
> range was truncated due to using a 32 bit shift instead of a 64 bit one.
>
> By converting these functions to C, it becomes easier to maintain.
>
> flush_dcache_icache_phys() retains a critical assembler section as we must
> ensure there are no memory accesses while the data MMU is disabled
> (authored by Christophe Leroy). Since this has no external callers, it has
> also been made static, allowing the compiler to inline it within
> flush_dcache_icache_page().
>
> Signed-off-by: Alastair D'Silva
> Signed-off-by: Christophe Leroy
> ---
>   arch/powerpc/include/asm/cache.h      |  26 ++---
>   arch/powerpc/include/asm/cacheflush.h |  24 ++--
>   arch/powerpc/kernel/misc_32.S         | 117
>   arch/powerpc/kernel/misc_64.S         | 102 -
>   arch/powerpc/mm/mem.c                 | 152 +-
>   5 files changed, 173 insertions(+), 248 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
> index f852d5cd746c..91c808c6738b 100644
> --- a/arch/powerpc/include/asm/cache.h
> +++ b/arch/powerpc/include/asm/cache.h
> @@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
>   #endif
>   #endif /* ! __ASSEMBLY__ */
>
> -#if defined(__ASSEMBLY__)
> -/*
> - * For a snooping icache, we still need a dummy icbi to purge all the
> - * prefetched instructions from the ifetch buffers. We also need a sync
> - * before the icbi to order the the actual stores to memory that might
> - * have modified instructions with the icbi.
> - */
> -#define PURGE_PREFETCHED_INS	\
> -	sync;			\
> -	icbi	0,r3;		\
> -	sync;			\
> -	isync
> -
> -#else
> +#if !defined(__ASSEMBLY__)
>   #define __read_mostly __attribute__((__section__(".data..read_mostly")))
>
>   #ifdef CONFIG_PPC_BOOK3S_32
> @@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
>   {
>   	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
>   }
> +
> +static inline void icbi(void *addr)
> +{
> +	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");

I think "__asm__ __volatile__" is deprecated. Use "asm volatile" instead.

> +}
> +
> +static inline void iccci(void *addr)
> +{
> +	__asm__ __volatile__ ("iccci 0, %0" : : "r"(addr) : "memory");
> +}
> +

Same

>   #endif /* !__ASSEMBLY__ */
>   #endif /* __KERNEL__ */
>   #endif /* _ASM_POWERPC_CACHE_H */
> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> index ed57843ef452..4a1c9f0200e1 100644
> --- a/arch/powerpc/include/asm/cacheflush.h
> +++ b/arch/powerpc/include/asm/cacheflush.h
> @@ -42,24 +42,20 @@ extern void flush_dcache_page(struct page *page);
>   #define flush_dcache_mmap_lock(mapping)		do { } while (0)
>   #define flush_dcache_mmap_unlock(mapping)	do { } while (0)
>
> -extern void flush_icache_range(unsigned long, unsigned long);
> +void flush_icache_range(unsigned long start, unsigned long stop);
>   extern void flush_icache_user_range(struct vm_area_struct *vma,
>   				    struct page *page, unsigned long addr,
>   				    int len);
> -extern void __flush_dcache_icache(void *page_va);
>   extern void flush_dcache_icache_page(struct page *page);
> -#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
> -extern void __flush_dcache_icache_phys(unsigned long physaddr);
> -#else
> -static inline void __flush_dcache_icache_phys(unsigned long physaddr)
> -{
> -	BUG();
> -}
> -#endif
> -
> -/*
> - * Write any modified data cache blocks out to memory and invalidate them.
> - * Does not invalidate the corresponding instruction cache blocks.
> +void __flush_dcache_icache(void *page);
> +
> +/**
> + * flush_dcache_range(): Write any modified data cache blocks out to memory and
> + * invalidate them. Does not invalidate the corresponding instruction cache
> + * blocks.
> + *
> + * @start: the start address
> + * @stop: the stop address (exclusive)
>    */
>   static inline void flush_dcache_range(unsigned long start, unsigned long stop)
>   {
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index fe4bd321730e..12b95e6799d4 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -318,123 +318,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
>   EXPORT_SYMBOL(flush_instruction_cache)
>   #endif /* CONFIG_PPC_8xx */
>
> -/*
> - * Write any modified data cache blocks out to memory
> - * and invalidate the corresponding instruction cache blocks.
> - * This is a no-op on the 601.
> - *
> - * flush_icache_range(unsigned long start, unsigned long stop)
> - */
> -_GLOBAL(flush_icache_range)
> -BEGIN_FTR_SECTION
> -	PURGE_PREFETCHED_INS
> -	blr
[PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
From: Alastair D'Silva

Similar to commit 22e9c88d486a
("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
this patch converts the following ASM symbols to C:
  flush_icache_range()
  __flush_dcache_icache()
  __flush_dcache_icache_phys()

This was done as we discovered a long-standing bug where the length of the
range was truncated due to using a 32 bit shift instead of a 64 bit one.

By converting these functions to C, it becomes easier to maintain.

flush_dcache_icache_phys() retains a critical assembler section as we must
ensure there are no memory accesses while the data MMU is disabled
(authored by Christophe Leroy). Since this has no external callers, it has
also been made static, allowing the compiler to inline it within
flush_dcache_icache_page().

Signed-off-by: Alastair D'Silva
Signed-off-by: Christophe Leroy
---
 arch/powerpc/include/asm/cache.h      |  26 ++---
 arch/powerpc/include/asm/cacheflush.h |  24 ++--
 arch/powerpc/kernel/misc_32.S         | 117
 arch/powerpc/kernel/misc_64.S         | 102 -
 arch/powerpc/mm/mem.c                 | 152 +-
 5 files changed, 173 insertions(+), 248 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index f852d5cd746c..91c808c6738b 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
 #endif
 #endif /* ! __ASSEMBLY__ */

-#if defined(__ASSEMBLY__)
-/*
- * For a snooping icache, we still need a dummy icbi to purge all the
- * prefetched instructions from the ifetch buffers. We also need a sync
- * before the icbi to order the the actual stores to memory that might
- * have modified instructions with the icbi.
- */
-#define PURGE_PREFETCHED_INS	\
-	sync;			\
-	icbi	0,r3;		\
-	sync;			\
-	isync
-
-#else
+#if !defined(__ASSEMBLY__)
 #define __read_mostly __attribute__((__section__(".data..read_mostly")))

 #ifdef CONFIG_PPC_BOOK3S_32
@@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
 {
 	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
+
+static inline void icbi(void *addr)
+{
+	__asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void iccci(void *addr)
+{
+	__asm__ __volatile__ ("iccci 0, %0" : : "r"(addr) : "memory");
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CACHE_H */
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index ed57843ef452..4a1c9f0200e1 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -42,24 +42,20 @@ extern void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)	do { } while (0)

-extern void flush_icache_range(unsigned long, unsigned long);
+void flush_icache_range(unsigned long start, unsigned long stop);
 extern void flush_icache_user_range(struct vm_area_struct *vma,
 				    struct page *page, unsigned long addr,
 				    int len);
-extern void __flush_dcache_icache(void *page_va);
 extern void flush_dcache_icache_page(struct page *page);
-#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
-extern void __flush_dcache_icache_phys(unsigned long physaddr);
-#else
-static inline void __flush_dcache_icache_phys(unsigned long physaddr)
-{
-	BUG();
-}
-#endif
-
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
- * Does not invalidate the corresponding instruction cache blocks.
+void __flush_dcache_icache(void *page);
+
+/**
+ * flush_dcache_range(): Write any modified data cache blocks out to memory and
+ * invalidate them. Does not invalidate the corresponding instruction cache
+ * blocks.
+ *
+ * @start: the start address
+ * @stop: the stop address (exclusive)
  */
 static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 {
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index fe4bd321730e..12b95e6799d4 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -318,123 +318,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
 EXPORT_SYMBOL(flush_instruction_cache)
 #endif /* CONFIG_PPC_8xx */

-/*
- * Write any modified data cache blocks out to memory
- * and invalidate the corresponding instruction cache blocks.
- * This is a no-op on the 601.
- *
- * flush_icache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_icache_range)
-BEGIN_FTR_SECTION
-	PURGE_PREFETCHED_INS
-	blr				/* for 601, do nothing */
-END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
-	subf	r4,r3,r4
-