On Tue, Aug 06, 2013 at 08:35:10PM +1000, Benjamin Herrenschmidt wrote: > On Tue, 2013-08-06 at 18:23 +0800, Kevin Hao wrote: > > In function flush_icache_range(), we use cpu_has_feature() to test > > the feature bit of CPU_FTR_COHERENT_ICACHE. But this seems not optimal > > for two reasons: > > a) For ppc32, the function __flush_icache_range() already do this > > check with the macro END_FTR_SECTION_IFSET. > > b) Compare with the cpu_has_feature(), the method of using macro > > END_FTR_SECTION_IFSET will not introduce any runtime overhead. > > Nak. > > It adds the overhead of calling into a function :-) > > What about modifying cpu_has_feature to use jump labels ?
That's a great idea. I would like to gave it a try later. >It might solve > the problem of no runtime overhead ... however it might also be hard to > keep the ability to remove the whole statement at compile time if the > bit doesn't fit in the POSSIBLE mask... unless you find the right macro > magic. > > In any case, I suspect the function call introduces more overhead than > the bit test + conditional branch which will generally predict very > well, so the patch as-is is probably a regression. I don't think so. For the 64bit CPU which has a non-coherent icache, there is no any effect introduced by this patch since (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) always yield true at compile time. But for the 64bit CPU which does have the coherent icache, the following is the asm code before applying this patch: c000000000023b4c: e9 22 86 68 ld r9,-31128(r2) c000000000023b50: e9 29 00 00 ld r9,0(r9) c000000000023b54: e9 29 00 10 ld r9,16(r9) c000000000023b58: 79 2a 07 e1 clrldi. r10,r9,63 c000000000023b5c: 41 82 00 94 beq c000000000023bf0 <.handle_rt_signal64+0x670> ... c000000000023bf0: 7f 23 cb 78 mr r3,r25 c000000000023bf4: 7f 64 db 78 mr r4,r27 c000000000023bf8: 48 7a 65 99 bl c0000000007ca190 <.__flush_icache_range> After applying this patch, the following code is generated: c000000000023b48: 7f 23 cb 78 mr r3,r25 c000000000023b4c: 7f 64 db 78 mr r4,r27 c000000000023b50: 48 7a 65 81 bl c0000000007ca0d0 <.flush_icache_range> The run path for these two cases are: before after ld r9,-31128(r2) mr r3,r25 ld r9,0(r9) mr r4,r27 ld r9,16(r9) bl flush_icache_range clrldi. r10,r9,63 blr beq xxxx So I don't think this will introduce more overhead than before. On the contrary, I believe it yield less overhead than before. Correct me if I am wrong. > > Did you measure ? No. I don't have a access to 64bit CPU which has a coherent icache. For a non-coherent icache 64bit CPU this patch will not cause any effect. Thanks, Kevin > > Ben. > > > > Signed-off-by: Kevin Hao <haoke...@gmail.com> > > --- > > arch/powerpc/include/asm/cacheflush.h | 3 +-- > > arch/powerpc/kernel/misc_64.S | 4 +++- > > 2 files changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/cacheflush.h > > b/arch/powerpc/include/asm/cacheflush.h > > index b843e35..60b620d 100644 > > --- a/arch/powerpc/include/asm/cacheflush.h > > +++ b/arch/powerpc/include/asm/cacheflush.h > > @@ -35,8 +35,7 @@ extern void __flush_disable_L1(void); > > extern void __flush_icache_range(unsigned long, unsigned long); > > static inline void flush_icache_range(unsigned long start, unsigned long > > stop) > > { > > - if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) > > - __flush_icache_range(start, stop); > > + __flush_icache_range(start, stop); > > } > > > > extern void flush_icache_user_range(struct vm_area_struct *vma, > > diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S > > index 6820e45..74d87f1 100644 > > --- a/arch/powerpc/kernel/misc_64.S > > +++ b/arch/powerpc/kernel/misc_64.S > > @@ -68,7 +68,9 @@ PPC64_CACHES: > > */ > > > > _KPROBE(__flush_icache_range) > > - > > +BEGIN_FTR_SECTION > > + blr > > +END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > /* > > * Flush the data cache to memory > > * > >
pgpnxVg_fWBzi.pgp
Description: PGP signature
_______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev