Re: [PATCH v2 7/9] powerpc/powernv: Add platform support for stop instruction
On Tue, May 03, 2016 at 01:54:36PM +0530, Shreyas B. Prabhu wrote:
> POWER ISA v3 defines a new idle processor core mechanism. In summary,
> a) new instruction named stop is added. This instruction replaces
> instructions like nap, sleep, rvwinkle.
> b) new per thread SPR named PSSCR is added which controls the behavior
> of stop instruction.
>
> PSSCR has following key fields
> Bits 0:3 - Power-Saving Level Status. This field indicates the lowest
> power-saving state the thread entered since stop instruction was last
> executed.
>
> Bit 42 - Enable State Loss
> 0 - No state is lost irrespective of other fields
> 1 - Allows state loss
>
> Bits 44:47 - Power-Saving Level Limit
> This limits the power-saving level that can be entered into.
>
> Bits 60:63 - Requested Level
> Used to specify which power-saving level must be entered on executing
> stop instruction
>
> This patch adds support for stop instruction and PSSCR handling.

I notice that you have duplicated a whole lot of assembly code
relating to synchronizing between threads going into and out of
power-saving modes, saving/restoring SPRs, resyncing the timebase, and
so on. Two questions arise:

- Are we really going to have to do all of that in the same way for
  POWER9 as we did for POWER8? You even copied over a comment about
  the fastsleep workaround, which I really hope we won't have to do on
  POWER9. Also, on POWER9, the threads are much more independent, so
  I was not expecting that there would still be shared registers.

- If we do have to do all that, could we use the same code as on
  POWER8 rather than having another copy of all that code?

Paul.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
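The PSSCR layout quoted above is given in IBM bit numbering, where bit 0 is the most significant bit of the 64-bit register. As a rough sketch of how those fields map onto ordinary masks, something like the following could be used. The macro and function names here (IBM_BIT, IBM_BITMASK, PSSCR_SKETCH_*, psscr_make_request, psscr_level_status) are made up for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB of a 64-bit register. */
#define IBM_BIT(n)          (UINT64_C(1) << (63 - (n)))
/* Mask covering IBM bits hi..lo inclusive, where hi is the more
 * significant (numerically smaller) bit number. */
#define IBM_BITMASK(hi, lo) ((IBM_BIT(hi) - IBM_BIT(lo)) | IBM_BIT(hi))

/* Field masks, following the bit positions listed in the quoted text.
 * The names are hypothetical. */
#define PSSCR_SKETCH_PLS_MASK   IBM_BITMASK(0, 3)    /* Power-Saving Level Status */
#define PSSCR_SKETCH_ESL        IBM_BIT(42)          /* Enable State Loss */
#define PSSCR_SKETCH_PSLL_MASK  IBM_BITMASK(44, 47)  /* Power-Saving Level Limit */
#define PSSCR_SKETCH_RL_MASK    IBM_BITMASK(60, 63)  /* Requested Level */

#define PSSCR_SKETCH_PLS_SHIFT  60  /* 63 - 3 */
#define PSSCR_SKETCH_PSLL_SHIFT 16  /* 63 - 47 */

/* Build a PSSCR value requesting a level, with a level limit and an
 * optional "state loss allowed" flag. Out-of-range inputs are masked. */
static inline uint64_t psscr_make_request(unsigned int level,
					  unsigned int limit,
					  int allow_state_loss)
{
	uint64_t v = 0;

	v |= (uint64_t)level & PSSCR_SKETCH_RL_MASK;
	v |= ((uint64_t)limit << PSSCR_SKETCH_PSLL_SHIFT) & PSSCR_SKETCH_PSLL_MASK;
	if (allow_state_loss)
		v |= PSSCR_SKETCH_ESL;
	return v;
}

/* Extract the Power-Saving Level Status field (bits 0:3). */
static inline unsigned int psscr_level_status(uint64_t psscr)
{
	return (unsigned int)((psscr & PSSCR_SKETCH_PLS_MASK) >> PSSCR_SKETCH_PLS_SHIFT);
}
```

With this numbering, the Requested Level field ends up in the low four bits of the register and the status field in the top four, which is easy to get backwards when reading the ISA text.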
Re: [PATCH v2 2/9] powerpc/kvm: make hypervisor state restore a function
On Tue, May 03, 2016 at 01:54:31PM +0530, Shreyas B. Prabhu wrote:
> In the current code, when the thread wakes up in reset vector, some
> of the state restore code and check for whether a thread needs to
> branch to kvm is duplicated. Reorder the code such that this
> duplication is avoided.

This is a nice cleanup. The one minor comment I have is that since
power7_restore_hyp_resource has some unusual entry requirements (such
as requiring cr3 to be set a certain way), those requirements should
be documented in the comment just above the function entry point. I
didn't see any unusual exit conditions, but if there are any they
should be documented too.

Reviewed-by: Paul Mackerras
Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking
On Thu, May 19, 2016 at 4:15 PM, Josh Poimboeuf wrote:
> On Mon, May 02, 2016 at 08:52:41AM -0700, Andy Lutomirski wrote:
>> On Mon, May 2, 2016 at 6:52 AM, Josh Poimboeuf wrote:
>> > On Fri, Apr 29, 2016 at 05:08:50PM -0700, Andy Lutomirski wrote:
>> >> On Apr 29, 2016 3:41 PM, "Josh Poimboeuf" wrote:
>> >> >
>> >> > On Fri, Apr 29, 2016 at 02:37:41PM -0700, Andy Lutomirski wrote:
>> >> > > On Fri, Apr 29, 2016 at 2:25 PM, Josh Poimboeuf wrote:
>> >> > > >> I suppose we could try to rejigger the code so that rbp points to
>> >> > > >> pt_regs or similar.
>> >> > > >
>> >> > > > I think we should avoid doing something like that because it would
>> >> > > > break gdb and all the other unwinders who don't know about it.
>> >> > >
>> >> > > How so?
>> >> > >
>> >> > > Currently, rbp in the entry code is meaningless. I'm suggesting that,
>> >> > > when we do, for example, 'call \do_sym' in idtentry, we point rbp to
>> >> > > the pt_regs. Currently it points to something stale (which the
>> >> > > dump_stack code might be relying on. Hmm.) But it's probably also
>> >> > > safe to assume that if you unwind to the 'call \do_sym', then pt_regs
>> >> > > is the next thing on the stack, so just doing the section thing would
>> >> > > work.
>> >> >
>> >> > Yes, rbp is meaningless on the entry from user space. But if an
>> >> > in-kernel interrupt occurs (e.g. page fault, preemption) and you have
>> >> > nested entry, rbp keeps its old value, right? So the unwinder can walk
>> >> > past the nested entry frame and keep going until it gets to the original
>> >> > entry.
>> >>
>> >> Yes.
>> >>
>> >> It would be nice if we could do better, though, and actually notice
>> >> the pt_regs and identify the entry. For example, I'd love to see
>> >> "page fault, RIP=xyz" printed in the middle of a stack dump on a
>> >> crash.
>> >>
>> >> Also, I think that just following rbp links will lose the
>> >> actual function that took the page fault (or whatever function
>> >> pt_regs->ip actually points to).
>> >
>> > Hm. I think we could fix all that in a more standard way. Whenever a
>> > new pt_regs frame gets saved on entry, we could also create a new stack
>> > frame which points to a fake kernel_entry() function. That would tell
>> > the unwinder there's a pt_regs frame without otherwise breaking frame
>> > pointers across the frame.
>> >
>> > Then I guess we wouldn't need my other solution of putting the idt
>> > entries in a special section.
>> >
>> > How does that sound?
>>
>> Let me try to understand.
>>
>> The normal call sequence is call; push %rbp; mov %rsp, %rbp. So rbp
>> points to (prev rbp, prev rip) on the stack, and you can follow the
>> chain back. Right now, on a user access page fault or similar, we
>> have rbp (probably) pointing to the interrupted frame, and the
>> interrupted rip isn't saved anywhere that a naive unwinder can find
>> it. (It's in pt_regs, but the rbp chain skips right over that.)
>>
>> We could change the entry code so that an interrupt / idtentry does:
>>
>> push pt_regs
>> push kernel_entry
>> push %rbp
>> mov %rsp, %rbp
>> call handler
>> pop %rbp
>> addq $8, %rsp
>>
>> or similar. That would make it appear that the actual C handler was
>> caused by a dummy function "kernel_entry". Now the unwinder would get
>> to kernel_entry, but it *still* wouldn't find its way to the calling
>> frame, which only solves part of the problem. We could at least teach
>> the unwinder how kernel_entry works and let it decode pt_regs to
>> continue unwinding. This would be nice, and I think it could work.
>>
>> I think I like this, except that, if it used a separate section, it
>> could potentially be faster, as, for each actual entry type, the
>> offset from the C handler frame to pt_regs is a foregone conclusion.
>> But this is pretty simple and performance is already abysmal in most
>> handlers.
>>
>> There's an added benefit to using a separate section, though: we could
>> also annotate the calls with what type of entry they were so the
>> unwinder could print it out nicely.
>>
>> I could be convinced either way.
>
> Ok, I took a stab at this. See the patch below.
>
> In addition to annotating interrupt/exception pt_regs frames, I also
> annotated all the syscall pt_regs, for consistency.
>
> As you mentioned, it will affect performance a bit, but I think it will
> be insignificant.
>
> I think I like this approach better than putting the
> interrupt/idtentry's in a special section, because this is much more
> precise. Especially now that I'm annotating pt_regs syscalls.
>
> Also I think with a few minor changes we could implement your idea of
> annotating the calls with what type of entry they are. But I don't
> think that's really needed, because the name of the interrupt/idtentry
> is already on the stack trace.
>
> Before:
>
> [] dump_stack+0x85/0xc2
> []
[PATCH] ps3_gelic: use kmemdup
Use kmemdup when another buffer is immediately copied into the
allocated region. This replaces an allocation followed by a memcpy
with a single call to kmemdup.

Signed-off-by: Muhammad Falak R Wani
---
 drivers/net/ethernet/toshiba/ps3_gelic_wireless.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c
index 743b182..446ea58 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c
@@ -1616,13 +1616,13 @@ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl)
 			target->valid = 1;
 			target->eurus_index = i;
 			kfree(target->hwinfo);
-			target->hwinfo = kzalloc(be16_to_cpu(scan_info->size),
+			target->hwinfo = kmemdup(scan_info,
+						 be16_to_cpu(scan_info->size),
 						 GFP_KERNEL);
 			if (!target->hwinfo)
 				continue;
 
 			/* copy hw scan info */
-			memcpy(target->hwinfo, scan_info, be16_to_cpu(scan_info->size));
 			target->essid_len = strnlen(scan_info->essid,
 						    sizeof(scan_info->essid));
 			target->rate_len = 0;
-- 
1.9.1
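For readers less familiar with the pattern being replaced, the idea behind the patch can be sketched in userspace C. This is only an analogue: memdup_sketch() is a hypothetical helper built on malloc(), not the kernel's kmemdup(), and it takes no GFP flags.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of the allocate-then-copy pattern.
 * Returns a newly allocated copy of the first len bytes of src, or
 * NULL if the allocation fails. The caller frees the result.
 */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

Because the whole buffer is overwritten straight after allocation, a zeroing allocator buys nothing in this pattern, and folding the two steps into one call also keeps the size expression from being written twice.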
Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking
On Mon, May 02, 2016 at 08:52:41AM -0700, Andy Lutomirski wrote:
> On Mon, May 2, 2016 at 6:52 AM, Josh Poimboeuf wrote:
> > On Fri, Apr 29, 2016 at 05:08:50PM -0700, Andy Lutomirski wrote:
> >> On Apr 29, 2016 3:41 PM, "Josh Poimboeuf" wrote:
> >> >
> >> > On Fri, Apr 29, 2016 at 02:37:41PM -0700, Andy Lutomirski wrote:
> >> > > On Fri, Apr 29, 2016 at 2:25 PM, Josh Poimboeuf wrote:
> >> > > >> I suppose we could try to rejigger the code so that rbp points to
> >> > > >> pt_regs or similar.
> >> > > >
> >> > > > I think we should avoid doing something like that because it would
> >> > > > break gdb and all the other unwinders who don't know about it.
> >> > >
> >> > > How so?
> >> > >
> >> > > Currently, rbp in the entry code is meaningless. I'm suggesting that,
> >> > > when we do, for example, 'call \do_sym' in idtentry, we point rbp to
> >> > > the pt_regs. Currently it points to something stale (which the
> >> > > dump_stack code might be relying on. Hmm.) But it's probably also
> >> > > safe to assume that if you unwind to the 'call \do_sym', then pt_regs
> >> > > is the next thing on the stack, so just doing the section thing would
> >> > > work.
> >> >
> >> > Yes, rbp is meaningless on the entry from user space. But if an
> >> > in-kernel interrupt occurs (e.g. page fault, preemption) and you have
> >> > nested entry, rbp keeps its old value, right? So the unwinder can walk
> >> > past the nested entry frame and keep going until it gets to the original
> >> > entry.
> >>
> >> Yes.
> >>
> >> It would be nice if we could do better, though, and actually notice
> >> the pt_regs and identify the entry. For example, I'd love to see
> >> "page fault, RIP=xyz" printed in the middle of a stack dump on a
> >> crash.
> >>
> >> Also, I think that just following rbp links will lose the
> >> actual function that took the page fault (or whatever function
> >> pt_regs->ip actually points to).
> >
> > Hm. I think we could fix all that in a more standard way. Whenever a
> > new pt_regs frame gets saved on entry, we could also create a new stack
> > frame which points to a fake kernel_entry() function. That would tell
> > the unwinder there's a pt_regs frame without otherwise breaking frame
> > pointers across the frame.
> >
> > Then I guess we wouldn't need my other solution of putting the idt
> > entries in a special section.
> >
> > How does that sound?
>
> Let me try to understand.
>
> The normal call sequence is call; push %rbp; mov %rsp, %rbp. So rbp
> points to (prev rbp, prev rip) on the stack, and you can follow the
> chain back. Right now, on a user access page fault or similar, we
> have rbp (probably) pointing to the interrupted frame, and the
> interrupted rip isn't saved anywhere that a naive unwinder can find
> it. (It's in pt_regs, but the rbp chain skips right over that.)
>
> We could change the entry code so that an interrupt / idtentry does:
>
> push pt_regs
> push kernel_entry
> push %rbp
> mov %rsp, %rbp
> call handler
> pop %rbp
> addq $8, %rsp
>
> or similar. That would make it appear that the actual C handler was
> caused by a dummy function "kernel_entry". Now the unwinder would get
> to kernel_entry, but it *still* wouldn't find its way to the calling
> frame, which only solves part of the problem. We could at least teach
> the unwinder how kernel_entry works and let it decode pt_regs to
> continue unwinding. This would be nice, and I think it could work.
>
> I think I like this, except that, if it used a separate section, it
> could potentially be faster, as, for each actual entry type, the
> offset from the C handler frame to pt_regs is a foregone conclusion.
> But this is pretty simple and performance is already abysmal in most
> handlers.
>
> There's an added benefit to using a separate section, though: we could
> also annotate the calls with what type of entry they were so the
> unwinder could print it out nicely.
>
> I could be convinced either way.

Ok, I took a stab at this. See the patch below.

In addition to annotating interrupt/exception pt_regs frames, I also
annotated all the syscall pt_regs, for consistency.

As you mentioned, it will affect performance a bit, but I think it will
be insignificant.

I think I like this approach better than putting the
interrupt/idtentry's in a special section, because this is much more
precise. Especially now that I'm annotating pt_regs syscalls.

Also I think with a few minor changes we could implement your idea of
annotating the calls with what type of entry they are. But I don't
think that's really needed, because the name of the interrupt/idtentry
is already on the stack trace.

Before:

  [] dump_stack+0x85/0xc2
  [] __do_page_fault+0x576/0x5a0
  [] trace_do_page_fault+0x5c/0x2e0
  [] do_async_page_fault+0x2c/0xa0
  [] async_page_fault+0x28/0x30
  [] ? copy_page_to_iter+0x70/0x440
  [] ? pagecache_get_page+0x2c/0x290
  []
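To make the discussion above concrete, here is a small userspace model of the idea: an unwinder follows the saved-rbp chain, and when it meets a frame whose return address is the kernel_entry marker it reads the interrupted instruction pointer out of the register block that sits just above that frame. Everything in it is an assumption made for illustration (the struct layouts, the KERNEL_ENTRY_MARKER value, the unwind_sketch() name); it is not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* What "push %rbp; mov %rsp, %rbp" leaves at the frame pointer: the
 * saved caller rbp, with the return address directly above it. */
struct frame {
	const struct frame *prev_rbp;
	uintptr_t ret_addr;
};

/* Stand-in for pt_regs; only the saved instruction pointer is modelled. */
struct regs_sketch {
	uintptr_t ip;
};

/* Layout produced by "push pt_regs; push kernel_entry; push %rbp":
 * a normal-looking frame immediately followed by the register block. */
struct entry_frame {
	struct frame frame;	/* frame.ret_addr holds the marker */
	struct regs_sketch regs;
};

/* Hypothetical stand-in for the address of kernel_entry(). */
#define KERNEL_ENTRY_MARKER ((uintptr_t)0x1)

/*
 * Follow the rbp chain starting at rbp, storing one address per frame
 * into out[]. For a marker frame, report the interrupted ip from the
 * register block instead of the marker itself. Returns the number of
 * entries written (at most max).
 */
static size_t unwind_sketch(const struct frame *rbp, uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (rbp != NULL && n < max) {
		if (rbp->ret_addr == KERNEL_ENTRY_MARKER) {
			const struct entry_frame *e =
				(const struct entry_frame *)rbp;

			out[n++] = e->regs.ip;
		} else {
			out[n++] = rbp->ret_addr;
		}
		rbp = rbp->prev_rbp;
	}
	return n;
}
```

An unwinder that does not know about the marker still walks the same chain and simply reports the marker address, which is the compatibility property argued for in the thread.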
Re: [PATCH 3/4] rcutorture: Make -soundhw a x86 specific option
On Thu, May 19, 2016 at 12:38:47PM -0700, Paul E. McKenney wrote:
> On Thu, May 19, 2016 at 09:23:39AM -0700, Paul E. McKenney wrote:
> > On Thu, May 19, 2016 at 08:40:42AM -0700, Josh Triplett wrote:
> > > On Thu, May 19, 2016 at 07:10:13AM -0700, Paul E. McKenney wrote:
> > > > On Wed, May 18, 2016 at 09:23:10PM -0700, Josh Triplett wrote:
> > > > > On Thu, May 19, 2016 at 11:42:23AM +0800, Boqun Feng wrote:
> > > > > > The option "-soundhw pcspk" gives me a error on PPC as follow:
> > > > > >
> > > > > > qemu-system-ppc64: ISA bus not available for pcspk
> > > > > >
> > > > > > , which means this option doesn't work on ppc by default. So simply
> > > > > > make this an x86-specific option via identify_qemu_args().
> > > > > >
> > > > > > Signed-off-by: Boqun Feng
> > > > >
> > > > > The emulated system for RCU testing does not need sound hardware at
> > > > > all.
> > > > > Paul added this option in commit
> > > > > 16c77ea7d0f4a74e49009aa2d26c275f7f93de7c to disable the default sound
> > > > > hardware, saying that '"-soundhw pcspk" makes the script a bit less
> > > > > dependent on odd audio libraries being installed'. Unfortunately, it
> > > > > looks like there isn't a "-soundhw none". As far as I can tell,
> > > > > currently the only way to completely eliminate sound hardware is to
> > > > > pass "-nodefaults" and then explicitly specify each desired device;
> > > > > while that would solve the issue, it would likely introduce *more*
> > > > > hardware-specific command-line options...
> > > > >
> > > > > I've filed two feature requests on upstream qemu to make this simpler:
> > > > > https://bugs.launchpad.net/qemu/+bug/1583420 and
> > > > > https://bugs.launchpad.net/qemu/+bug/1583421 .
> > > > >
> > > > > Paul, what did you mean by "dependent on odd audio libraries"? Did
> > > > > you mean in the guest or the host? And either way, is this something
> > > > > that could potentially be solved another way?
> > > >
> > > > If I remember correctly, Ubuntu 14.04 qemu refused to run the guest
> > > > without this option, but I don't recall the exact error message.
> > > > I chalked it up to my ignorance of qemu, but I would very much welcome
> > > > some way to not have to specify irrelevant hardware. So thank you very
> > > > much for filing the bugs!
> > >
> > > According to qemu upstream, qemu doesn't enable any sound hardware by
> > > default, so I can't think of any obvious reason why adding "-soundhw
> > > pcspkr" would make the rcutorture VM boot. Did qemu refuse to run at
> > > all, or did the VM start but fail during the boot process?
> > >
> > > Could you check if you can currently run without this option? If so,
> > > perhaps we should just drop it for now.
> >
> > Will do! As soon as the current test completes.
>
> And it now works just fine without the "-soundhw pcspkr". Search me!

In that case, can you replace the patch in this series making "-soundhw
pcspkr" target-specific with one removing "-soundhw pcspkr"?

- Josh Triplett
Re: [PATCH 3/4] rcutorture: Make -soundhw a x86 specific option
On Thu, May 19, 2016 at 09:23:39AM -0700, Paul E. McKenney wrote:
> On Thu, May 19, 2016 at 08:40:42AM -0700, Josh Triplett wrote:
> > On Thu, May 19, 2016 at 07:10:13AM -0700, Paul E. McKenney wrote:
> > > On Wed, May 18, 2016 at 09:23:10PM -0700, Josh Triplett wrote:
> > > > On Thu, May 19, 2016 at 11:42:23AM +0800, Boqun Feng wrote:
> > > > > The option "-soundhw pcspk" gives me a error on PPC as follow:
> > > > >
> > > > > qemu-system-ppc64: ISA bus not available for pcspk
> > > > >
> > > > > , which means this option doesn't work on ppc by default. So simply
> > > > > make this an x86-specific option via identify_qemu_args().
> > > > >
> > > > > Signed-off-by: Boqun Feng
> > > >
> > > > The emulated system for RCU testing does not need sound hardware at all.
> > > > Paul added this option in commit
> > > > 16c77ea7d0f4a74e49009aa2d26c275f7f93de7c to disable the default sound
> > > > hardware, saying that '"-soundhw pcspk" makes the script a bit less
> > > > dependent on odd audio libraries being installed'. Unfortunately, it
> > > > looks like there isn't a "-soundhw none". As far as I can tell,
> > > > currently the only way to completely eliminate sound hardware is to pass
> > > > "-nodefaults" and then explicitly specify each desired device; while
> > > > that would solve the issue, it would likely introduce *more*
> > > > hardware-specific command-line options...
> > > >
> > > > I've filed two feature requests on upstream qemu to make this simpler:
> > > > https://bugs.launchpad.net/qemu/+bug/1583420 and
> > > > https://bugs.launchpad.net/qemu/+bug/1583421 .
> > > >
> > > > Paul, what did you mean by "dependent on odd audio libraries"? Did you
> > > > mean in the guest or the host? And either way, is this something that
> > > > could potentially be solved another way?
> > >
> > > If I remember correctly, Ubuntu 14.04 qemu refused to run the guest
> > > without this option, but I don't recall the exact error message.
> > > I chalked it up to my ignorance of qemu, but I would very much welcome
> > > some way to not have to specify irrelevant hardware. So thank you very
> > > much for filing the bugs!
> >
> > According to qemu upstream, qemu doesn't enable any sound hardware by
> > default, so I can't think of any obvious reason why adding "-soundhw
> > pcspkr" would make the rcutorture VM boot. Did qemu refuse to run at
> > all, or did the VM start but fail during the boot process?
> >
> > Could you check if you can currently run without this option? If so,
> > perhaps we should just drop it for now.
>
> Will do! As soon as the current test completes.

And it now works just fine without the "-soundhw pcspkr". Search me!

> BTW, am I the only one getting "interesting" failures in the merge
> window?

I will be chasing these down, but am likely to be off the grid until
Monday morning, Pacific time. Looks like the same failure to awaken
as before, but much higher probability.

Thanx, Paul
[PATCH] powerpc: Improve comment explaining why we modify VRSAVE
The comment explaining why we modify VRSAVE is misleading: glibc does
rely on the behaviour. Update the comment.

Signed-off-by: Anton Blanchard
---

diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index 1c2e7a3..3907fcf 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -70,10 +70,11 @@ _GLOBAL(load_up_altivec)
 	MTMSRD(r5)			/* enable use of AltiVec now */
 	isync
 
-	/* Hack: if we get an altivec unavailable trap with VRSAVE
-	 * set to all zeros, we assume this is a broken application
-	 * that fails to set it properly, and thus we switch it to
-	 * all 1's
+	/*
+	 * While userspace in general ignores VRSAVE, glibc uses it as a
+	 * boolean to optimise userspace context save/restore. Whenever we
+	 * take an altivec unavailable exception we must set VRSAVE to
+	 * something non zero. Set it to all 1s.
 	 */
 	mfspr	r4,SPRN_VRSAVE
 	cmpwi	0,r4,0
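The behaviour the new comment describes can be written down as a one-line rule. The sketch below is an assumption-laden C restatement, not the kernel code: the function name is invented, VRSAVE is modelled as a plain 32-bit value, and leaving a non-zero value untouched is inferred from the compare against zero visible in the diff context.

```c
#include <stdint.h>

/*
 * Hypothetical model of the fixup performed on an altivec unavailable
 * exception: a VRSAVE of zero is replaced by all 1s so that userspace
 * code treating it as a boolean sees "vector state in use"; any other
 * value is assumed to be left as it is.
 */
static uint32_t vrsave_after_altivec_unavailable(uint32_t vrsave)
{
	if (vrsave == 0)
		return UINT32_MAX;
	return vrsave;
}
```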
[RFC PATCH 2/2] powerpc/mm: Support segment table for Power9
PowerISA 3.0 adds an in-memory table for storing segment translation
information. In this mode, which is enabled by setting both the HOST
RADIX and GUEST RADIX bits in the partition table to 0 and setting UPRT
to 1, we have a per-process segment table. The segment table details
are stored in the process table, indexed by PID value.

Segment table mode also requires us to map the process table at the
beginning of a 1TB segment.

On the Linux kernel side we enable this model if we find that radix is
explicitly disabled, that is, the ibm,pa-feature radix bit (byte 40
bit 0) is set to 0. If the size of the ibm,pa-feature node is less than
40 bytes, we enable the legacy HPT mode using SLB. If the radix bit is
set to 1, we use the radix mode.

With respect to SLB mapping, we bolt-map the entire kernel range and
only handle user space segment faults.

We also have access to 4 SLB registers in software, so we continue to
use 3 of them for bolted kernel SLB entries, as we do currently.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/hash.h     |  10 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  17 ++
 arch/powerpc/include/asm/book3s/64/mmu.h      |   5 +
 arch/powerpc/include/asm/mmu.h                |   6 +-
 arch/powerpc/include/asm/mmu_context.h        |   5 +-
 arch/powerpc/kernel/prom.c                    |   1 +
 arch/powerpc/mm/hash_utils_64.c               |  84 ++-
 arch/powerpc/mm/mmu_context_book3s64.c        |  32 ++-
 arch/powerpc/mm/slb.c                         | 327 +-
 9 files changed, 470 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index f61cad3de4e6..5f0deeda7884 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -58,6 +58,16 @@
 #define H_VMALLOC_END	(H_VMALLOC_START + H_VMALLOC_SIZE)
 
 /*
+ * Process table with ISA 3.0 need to be mapped at the beginning of a 1TB segment
+ * We put that in the top of VMALLOC region. For each region we can go upto 64TB
+ * for now. Hence we have space to put process table there. We should not get
+ * an SLB miss for this address, because the VSID for this is placed in the
+ * partition table.
+ */
+#define H_SEG_PROC_TBL_START	ASM_CONST(0xD0002000)
+#define H_SEG_PROC_TBL_END	ASM_CONST(0xD00020ff)
+
+/*
  * Region IDs
  */
 #define REGION_SHIFT		60UL
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index a5fa6be7d5ae..75016f8cbd51 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -101,6 +101,18 @@
 #define HPTE_V_1TB_SEG		ASM_CONST(0x4000)
 #define HPTE_V_VRMA_MASK	ASM_CONST(0x4001ff00)
 
+/* segment table entry masks/bits */
+/* Upper 64 bit */
+#define STE_VALID	ASM_CONST(0x800)
+/*
+ * lower 64 bit
+ * 64th bit become 0 bit
+ */
+/*
+ * Software defined bolted bit
+ */
+#define STE_BOLTED	ASM_CONST(0x1)
+
 /* Values for PP (assumes Ks=0, Kp=1) */
 #define PP_RWXX	0	/* Supervisor read/write, User none */
 #define PP_RWRX	1	/* Supervisor read/write, User read */
@@ -128,6 +140,11 @@ struct hash_pte {
 	__be64 r;
 };
 
+struct seg_entry {
+	__be64 ste_e;
+	__be64 ste_v;
+};
+
 extern struct hash_pte *htab_address;
 extern unsigned long htab_size_bytes;
 extern unsigned long htab_hash_mask;
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index c6b1ff795632..b7464bc013c9 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -60,7 +60,9 @@ extern struct patb_entry *partition_tb;
  * Power9 currently only support 64K partition table size.
  */
 #define PATB_SIZE_SHIFT	16
+#define SEGTB_SIZE_SHIFT	PAGE_SHIFT
 
+extern unsigned long segment_table_initialize(struct prtb_entry *prtb);
 typedef unsigned long mm_context_id_t;
 struct spinlock;
 
@@ -90,6 +92,9 @@ typedef struct {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	struct list_head iommu_group_mem_list;
 #endif
+	unsigned long seg_table;
+	struct spinlock *seg_tbl_lock;
+
 } mm_context_t;
 
 /*
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 4ad66a547d4c..4c58b470f9c9 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -24,6 +24,10 @@
  * Radix page table available
  */
 #define MMU_FTR_TYPE_RADIX		ASM_CONST(0x0040)
+
+/* Seg table only supported for book3s 64 */
+#define MMU_FTR_TYPE_SEG_TABLE		ASM_CONST(0x0080)
+
 /*
  * individual features
  */
@@ -124,7 +128,7 @@ enum {
 		MMU_FTR_LOCKLESS_TLBIE |
 		MMU_FTR_CI_LARGE_PAGE | MMU_FTR_1T_SEGMENT |
 #ifdef CONFIG_PPC_RADIX_MMU
-		MMU_FTR_TYPE_RADIX |
+
[RFC PATCH 1/2] powerpc/mm: Switch user slb fault handling to translation enabled
We also handle the fault with a proper stack initialized. This enables
us to call out to C in the fault handling routines. We don't do this
for kernel mappings, because of the possibility of taking a recursive
fault if the kernel stack is not yet mapped by an SLB entry.

This enables us to handle Power9 SLB faults better. We will add bolted
entries for the entire kernel mapping in the segment table; for user
SLB entries we take the fault and insert on demand. With translation
on, we should be able to access the segment table from the fault
handler.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/kernel/exceptions-64s.S | 55
 arch/powerpc/mm/slb.c                | 11
 2 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index f2bd375b9a4e..2f2c52559ea9 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -794,7 +794,7 @@ data_access_slb_relon_pSeries:
 	mfspr	r3,SPRN_DAR
 	mfspr	r12,SPRN_SRR1
 #ifndef CONFIG_RELOCATABLE
-	b	slb_miss_realmode
+	b	handle_slb_miss_relon
 #else
 	/*
 	 * We can't just use a direct branch to slb_miss_realmode
@@ -803,7 +803,7 @@ data_access_slb_relon_pSeries:
 	 */
 	mfctr	r11
 	ld	r10,PACAKBASE(r13)
-	LOAD_HANDLER(r10, slb_miss_realmode)
+	LOAD_HANDLER(r10, handle_slb_miss_relon)
 	mtctr	r10
 	bctr
 #endif
@@ -819,11 +819,11 @@ instruction_access_slb_relon_pSeries:
 	mfspr	r3,SPRN_SRR0		/* SRR0 is faulting address */
 	mfspr	r12,SPRN_SRR1
 #ifndef CONFIG_RELOCATABLE
-	b	slb_miss_realmode
+	b	handle_slb_miss_relon
 #else
 	mfctr	r11
 	ld	r10,PACAKBASE(r13)
-	LOAD_HANDLER(r10, slb_miss_realmode)
+	LOAD_HANDLER(r10, handle_slb_miss_relon)
 	mtctr	r10
 	bctr
 #endif
@@ -961,7 +961,23 @@ h_data_storage_common:
 	bl	unknown_exception
 	b	ret_from_except
 
+/* r3 point to DAR */
 	.align	7
+	.globl slb_miss_user
+slb_miss_user:
+	std	r3,PACA_EXSLB+EX_DAR(r13)
+	/* Restore r3 as expected by PROLOG_COMMON below */
+	ld	r3,PACA_EXSLB+EX_R3(r13)
+	EXCEPTION_PROLOG_COMMON(0x380, PACA_EXSLB)
+	RECONCILE_IRQ_STATE(r10, r11)
+	ld	r4,PACA_EXSLB+EX_DAR(r13)
+	li	r5,0x380
+	std	r4,_DAR(r1)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	handle_slb_miss
+	b	ret_from_except_lite
+
+.align 7
 	.globl instruction_access_common
 instruction_access_common:
 	EXCEPTION_PROLOG_COMMON(0x400, PACA_EXGEN)
@@ -1379,11 +1395,17 @@ unrecover_mce:
  * We assume we aren't going to take any exceptions during this procedure.
  */
 slb_miss_realmode:
-	mflr	r10
 #ifdef CONFIG_RELOCATABLE
 	mtctr	r11
 #endif
 
+	/*
+	 * Handle user slb miss with translation enabled
+	 */
+	cmpdi	r3,0
+	bge	3f
+slb_miss_kernel:
+	mflr	r10
 	stw	r9,PACA_EXSLB+EX_CCR(r13)	/* save CR in exc. frame */
 	std	r10,PACA_EXSLB+EX_LR(r13)	/* save LR */
 
@@ -1428,6 +1450,29 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
 	mtspr	SPRN_SRR1,r10
 	rfid
 	b	.
+3:
+	/*
+	 * Enable IR/DR and handle the fault
+	 */
+	EXCEPTION_PROLOG_PSERIES_1(slb_miss_user, EXC_STD)
+	/*
+	 * handler with relocation on
+	 */
+handle_slb_miss_relon:
+#ifdef CONFIG_RELOCATABLE
+	mtctr	r11
+#endif
+	/*
+	 * Handle user slb miss with stack initialized.
+	 */
+	cmpdi	r3,0
+	bge	4f
+	/*
+	 * go back to slb_miss_realmode
+	 */
+	b	slb_miss_kernel
+4:
+	EXCEPTION_RELON_PROLOG_PSERIES_1(slb_miss_user, EXC_STD)
 
 unrecov_slb:
 	EXCEPTION_PROLOG_COMMON(0x4100, PACA_EXSLB)
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 48fc28bab544..b18d7df5601d 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -25,6 +25,8 @@
 #include
 #include
 
+#include
+
 enum slb_index {
 	LINEAR_INDEX	= 0, /* Kernel linear map (0xc000) */
 	VMALLOC_INDEX	= 1, /* Kernel virtual map (0xd000) */
@@ -346,3 +348,12 @@ void slb_initialize(void)
 
 	asm volatile("isync":::"memory");
 }
+
+void handle_slb_miss(struct pt_regs *regs,
+		     unsigned long address, unsigned long trap)
+{
+	enum ctx_state prev_state = exception_enter();
+
+	slb_allocate(address);
+	exception_exit(prev_state);
+}
-- 
2.7.4
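The user/kernel split in the assembly above hinges on a signed compare of the faulting address against zero ("cmpdi r3,0; bge"): kernel effective addresses such as the 0xc000... linear map and the 0xd000... vmalloc region have the top bit set and so compare as negative. A minimal C restatement of that test follows; the function name is hypothetical and the sketch only models the branch decision, not the handlers themselves.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirror of "cmpdi r3,0; bge <user path>": an effective address is
 * routed to the user SLB miss path when it is non-negative as a signed
 * 64-bit value, i.e. when its most significant bit is clear.
 */
static bool slb_miss_takes_user_path(uint64_t ea)
{
	return (ea >> 63) == 0;
}
```

Testing the top bit directly avoids relying on how an out-of-range unsigned value converts to a signed type, while giving the same answer as the signed compare in the assembly.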
Re: [PATCH 3/4] rcutorture: Make -soundhw a x86 specific option
On Thu, May 19, 2016 at 08:40:42AM -0700, Josh Triplett wrote:
> On Thu, May 19, 2016 at 07:10:13AM -0700, Paul E. McKenney wrote:
> > On Wed, May 18, 2016 at 09:23:10PM -0700, Josh Triplett wrote:
> > > On Thu, May 19, 2016 at 11:42:23AM +0800, Boqun Feng wrote:
> > > > The option "-soundhw pcspk" gives me a error on PPC as follow:
> > > >
> > > > qemu-system-ppc64: ISA bus not available for pcspk
> > > >
> > > > , which means this option doesn't work on ppc by default. So simply make
> > > > this an x86-specific option via identify_qemu_args().
> > > >
> > > > Signed-off-by: Boqun Feng
> > >
> > > The emulated system for RCU testing does not need sound hardware at all.
> > > Paul added this option in commit
> > > 16c77ea7d0f4a74e49009aa2d26c275f7f93de7c to disable the default sound
> > > hardware, saying that '"-soundhw pcspk" makes the script a bit less
> > > dependent on odd audio libraries being installed'. Unfortunately, it
> > > looks like there isn't a "-soundhw none". As far as I can tell,
> > > currently the only way to completely eliminate sound hardware is to pass
> > > "-nodefaults" and then explicitly specify each desired device; while
> > > that would solve the issue, it would likely introduce *more*
> > > hardware-specific command-line options...
> > >
> > > I've filed two feature requests on upstream qemu to make this simpler:
> > > https://bugs.launchpad.net/qemu/+bug/1583420 and
> > > https://bugs.launchpad.net/qemu/+bug/1583421 .
> > >
> > > Paul, what did you mean by "dependent on odd audio libraries"? Did you
> > > mean in the guest or the host? And either way, is this something that
> > > could potentially be solved another way?
> >
> > If I remember correctly, Ubuntu 14.04 qemu refused to run the guest
> > without this option, but I don't recall the exact error message.
> > I chalked it up to my ignorance of qemu, but I would very much welcome
> > some way to not have to specify irrelevant hardware. So thank you very
> > much for filing the bugs!
>
> According to qemu upstream, qemu doesn't enable any sound hardware by
> default, so I can't think of any obvious reason why adding "-soundhw
> pcspkr" would make the rcutorture VM boot. Did qemu refuse to run at
> all, or did the VM start but fail during the boot process?
>
> Could you check if you can currently run without this option? If so,
> perhaps we should just drop it for now.

Will do! As soon as the current test completes.

BTW, am I the only one getting "interesting" failures in the merge
window?

Thanx, Paul
Re: [PATCH 3/4] rcutorture: Make -soundhw a x86 specific option
On Thu, May 19, 2016 at 07:10:13AM -0700, Paul E. McKenney wrote:
> On Wed, May 18, 2016 at 09:23:10PM -0700, Josh Triplett wrote:
> > On Thu, May 19, 2016 at 11:42:23AM +0800, Boqun Feng wrote:
> > > The option "-soundhw pcspk" gives me an error on PPC as follows:
> > >
> > > qemu-system-ppc64: ISA bus not available for pcspk
> > >
> > > , which means this option doesn't work on ppc by default. So simply make
> > > this an x86-specific option via identify_qemu_args().
> > >
> > > Signed-off-by: Boqun Feng
> >
> > The emulated system for RCU testing does not need sound hardware at all.
> > Paul added this option in commit
> > 16c77ea7d0f4a74e49009aa2d26c275f7f93de7c to disable the default sound
> > hardware, saying that '"-soundhw pcspk" makes the script a bit less
> > dependent on odd audio libraries being installed'. Unfortunately, it
> > looks like there isn't a "-soundhw none". As far as I can tell,
> > currently the only way to completely eliminate sound hardware is to pass
> > "-nodefaults" and then explicitly specify each desired device; while
> > that would solve the issue, it would likely introduce *more*
> > hardware-specific command-line options...
> >
> > I've filed two feature requests on upstream qemu to make this simpler:
> > https://bugs.launchpad.net/qemu/+bug/1583420 and
> > https://bugs.launchpad.net/qemu/+bug/1583421 .
> >
> > Paul, what did you mean by "dependent on odd audio libraries"? Did you
> > mean in the guest or the host? And either way, is this something that
> > could potentially be solved another way?
>
> If I remember correctly, Ubuntu 14.04 qemu refused to run the guest
> without this option, but I don't recall the exact error message.
> I chalked it up to my ignorance of qemu, but I would very much welcome
> some way to not have to specify irrelevant hardware. So thank you very
> much for filing the bugs!

According to qemu upstream, qemu doesn't enable any sound hardware by
default, so I can't think of any obvious reason why adding "-soundhw
pcspk" would make the rcutorture VM boot. Did qemu refuse to run at
all, or did the VM start but fail during the boot process?

Could you check if you can currently run without this option? If so,
perhaps we should just drop it for now.

- Josh Triplett
[PATCH v3 4/8] powerpc: add io{read,write}64 accessors
This will allow device drivers to consistently use io{read,write}XX
also for 64-bit accesses.

Acked-by: Michael Ellerman
Signed-off-by: Horia Geantă
---
 arch/powerpc/kernel/iomap.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/powerpc/kernel/iomap.c b/arch/powerpc/kernel/iomap.c
index 12e48d56f771..3963f0b68d52 100644
--- a/arch/powerpc/kernel/iomap.c
+++ b/arch/powerpc/kernel/iomap.c
@@ -38,6 +38,18 @@ EXPORT_SYMBOL(ioread16);
 EXPORT_SYMBOL(ioread16be);
 EXPORT_SYMBOL(ioread32);
 EXPORT_SYMBOL(ioread32be);
+#ifdef __powerpc64__
+u64 ioread64(void __iomem *addr)
+{
+	return readq(addr);
+}
+u64 ioread64be(void __iomem *addr)
+{
+	return readq_be(addr);
+}
+EXPORT_SYMBOL(ioread64);
+EXPORT_SYMBOL(ioread64be);
+#endif /* __powerpc64__ */

 void iowrite8(u8 val, void __iomem *addr)
 {
@@ -64,6 +76,18 @@ EXPORT_SYMBOL(iowrite16);
 EXPORT_SYMBOL(iowrite16be);
 EXPORT_SYMBOL(iowrite32);
 EXPORT_SYMBOL(iowrite32be);
+#ifdef __powerpc64__
+void iowrite64(u64 val, void __iomem *addr)
+{
+	writeq(val, addr);
+}
+void iowrite64be(u64 val, void __iomem *addr)
+{
+	writeq_be(val, addr);
+}
+EXPORT_SYMBOL(iowrite64);
+EXPORT_SYMBOL(iowrite64be);
+#endif /* __powerpc64__ */

 /*
  * These are the "repeat read/write" functions. Note the
--
2.4.4
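[Editorial sketch, not part of the patch.] For readers unfamiliar with the naming, the difference between the plain and the "be" accessor can be illustrated in userspace. The sketch assumes the usual convention that the plain accessor treats the register as little-endian and the "be" variant as big-endian. sketch_read64_le() and sketch_read64_be() are made-up names that work on an ordinary byte buffer; they are not the kernel helpers and do no real MMIO.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ioread64(): assemble 8 bytes as little-endian. */
uint64_t sketch_read64_le(const uint8_t *reg)
{
	uint64_t val = 0;

	for (size_t i = 0; i < 8; i++)
		val |= (uint64_t)reg[i] << (8 * i);
	return val;
}

/* Hypothetical stand-in for ioread64be(): assemble 8 bytes as big-endian. */
uint64_t sketch_read64_be(const uint8_t *reg)
{
	uint64_t val = 0;

	for (size_t i = 0; i < 8; i++)
		val = (val << 8) | reg[i];
	return val;
}
```

A driver would presumably pick one variant according to the byte order its device documents for the register, rather than byte-swapping by hand.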
[RFC PATCH v2 3/3] arch/powerpc : Enable optprobes support in powerpc
Signed-off-by: Anju T
---
 Documentation/features/debug/optprobes/arch-support.txt | 2 +-
 arch/powerpc/Kconfig                                    | 1 +
 arch/powerpc/kernel/Makefile                            | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/features/debug/optprobes/arch-support.txt b/Documentation/features/debug/optprobes/arch-support.txt
index b8999d8..45bc99d 100644
--- a/Documentation/features/debug/optprobes/arch-support.txt
+++ b/Documentation/features/debug/optprobes/arch-support.txt
@@ -27,7 +27,7 @@
 |       nios2: | TODO |
 |    openrisc: | TODO |
 |      parisc: | TODO |
-|     powerpc: | TODO |
+|     powerpc: |  ok  |
 |        s390: | TODO |
 |       score: | TODO |
 |          sh: | TODO |
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 7cd32c0..a87c9b1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -104,6 +104,7 @@ config PPC
 	select HAVE_IOREMAP_PROT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS if !CPU_LITTLE_ENDIAN
 	select HAVE_KPROBES
+	select HAVE_OPTPROBES
 	select HAVE_ARCH_KGDB
 	select HAVE_KRETPROBES
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 2da380f..7994e22 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -99,6 +99,7 @@ endif
 obj-$(CONFIG_BOOTX_TEXT)	+= btext.o
 obj-$(CONFIG_SMP)		+= smp.o
 obj-$(CONFIG_KPROBES)		+= kprobes.o
+obj-$(CONFIG_OPTPROBES)		+= optprobes.o optprobes_head.o
 obj-$(CONFIG_UPROBES)		+= uprobes.o
 obj-$(CONFIG_PPC_UDBG_16550)	+= legacy_serial.o udbg_16550.o
 obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
--
2.1.0
[RFC PATCH v2 2/3] arch/powerpc : optprobes for powerpc core
ppc_get_optinsn_slot() and ppc_free_optinsn_slot() are geared towards the allocation and freeing of memory from the area reserved for detour buffer. Signed-off-by: Anju T--- arch/powerpc/kernel/optprobes.c | 480 1 file changed, 480 insertions(+) create mode 100644 arch/powerpc/kernel/optprobes.c diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c new file mode 100644 index 000..bb61e18 --- /dev/null +++ b/arch/powerpc/kernel/optprobes.c @@ -0,0 +1,480 @@ +/* + * Code for Kernel probes Jump optimization. + * + * Copyright 2016, Anju T, IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define SLOT_SIZE 65536 +#define TMPL_CALL_HDLR_IDX \ + (optprobe_template_call_handler - optprobe_template_entry) +#define TMPL_EMULATE_IDX \ + (optprobe_template_call_emulate - optprobe_template_entry) +#define TMPL_RET_BRANCH_IDX\ + (optprobe_template_ret_branch - optprobe_template_entry) +#define TMPL_RET_IDX \ + (optprobe_template_ret - optprobe_template_entry) +#define TMPL_OP1_IDX \ + (optprobe_template_op_address1 - optprobe_template_entry) +#define TMPL_OP2_IDX \ + (optprobe_template_op_address2 - optprobe_template_entry) +#define TMPL_INSN_IDX \ + (optprobe_template_insn - optprobe_template_entry) +#define TMPL_END_IDX \ + (optprobe_template_end - optprobe_template_entry) + +struct kprobe_ppc_insn_page { + struct list_head list; + kprobe_opcode_t *insns; /* Page of instruction slots */ + struct kprobe_insn_cache *cache; + int nused; + int ngarbage; + char slot_used[]; +}; + +#define PPC_KPROBE_INSN_PAGE_SIZE(slots) \ + (offsetof(struct kprobe_ppc_insn_page, slot_used) + \ + (sizeof(char) * (slots))) + +enum 
ppc_kprobe_slot_state { + SLOT_CLEAN = 0, + SLOT_DIRTY = 1, + SLOT_USED = 2, +}; + +static struct kprobe_insn_cache kprobe_ppc_optinsn_slots = { + .mutex = __MUTEX_INITIALIZER(kprobe_ppc_optinsn_slots.mutex), + .pages = LIST_HEAD_INIT(kprobe_ppc_optinsn_slots.pages), + /* .insn_size is initialized later */ + .nr_garbage = 0, +}; + +static int ppc_slots_per_page(struct kprobe_insn_cache *c) +{ + /* +* Here the #slots per page differs from x86 as we have +* only 64KB reserved. +*/ + return SLOT_SIZE / (c->insn_size * sizeof(kprobe_opcode_t)); +} + +/* Return 1 if all garbages are collected, otherwise 0. */ +static int collect_one_slot(struct kprobe_ppc_insn_page *kip, int idx) +{ + kip->slot_used[idx] = SLOT_CLEAN; + kip->nused--; + return 0; +} + +static int collect_garbage_slots(struct kprobe_insn_cache *c) +{ + struct kprobe_ppc_insn_page *kip, *next; + + /* Ensure no-one is interrupted on the garbages */ + synchronize_sched(); + + list_for_each_entry_safe(kip, next, >pages, list) { + int i; + + if (kip->ngarbage == 0) + continue; + kip->ngarbage = 0; /* we will collect all garbages */ + for (i = 0; i < ppc_slots_per_page(c); i++) { + if (kip->slot_used[i] == SLOT_DIRTY && + collect_one_slot(kip, i)) + break; + } + } + c->nr_garbage = 0; + return 0; +} + +kprobe_opcode_t *__ppc_get_optinsn_slot(struct kprobe_insn_cache *c) +{ + struct kprobe_ppc_insn_page *kip; + kprobe_opcode_t *slot = NULL; + + mutex_lock(>mutex); + list_for_each_entry(kip, >pages, list) { + if (kip->nused < ppc_slots_per_page(c)) { + int i; + + for (i = 0; i < ppc_slots_per_page(c); i++) { + if (kip->slot_used[i] == SLOT_CLEAN) { + kip->slot_used[i] = SLOT_USED; + kip->nused++; + slot = kip->insns + (i * c->insn_size); + goto out; + } + } + /* kip->nused reached max value. */ + kip->nused = ppc_slots_per_page(c); + WARN_ON(1); + } + if (!list_empty(>pages)) { + pr_info("No more slots to allocate\n"); + return NULL; + } + } + kip = kmalloc(PPC_KPROBE_INSN_PAGE_SIZE(ppc_slots_per_page(c)), +
[RFC PATCH v2 1/3] arch/powerpc : Add detour buffer support for optprobes
Detour buffer contains instructions to create an in memory pt_regs. After the execution of prehandler a call is made for instruction emulation. The NIP is decided after the probed instruction is executed. Hence a branch instruction is created to the NIP returned by emulate_step(). Instruction slot for detour buffer is allocated from the reserved area. For the time being 64KB is reserved in memory for this purpose. Signed-off-by: Anju T--- arch/powerpc/include/asm/kprobes.h | 25 arch/powerpc/kernel/optprobes_head.S | 108 +++ 2 files changed, 133 insertions(+) create mode 100644 arch/powerpc/kernel/optprobes_head.S diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h index 039b583..3e4c998 100644 --- a/arch/powerpc/include/asm/kprobes.h +++ b/arch/powerpc/include/asm/kprobes.h @@ -38,7 +38,25 @@ struct pt_regs; struct kprobe; typedef ppc_opcode_t kprobe_opcode_t; + +extern kprobe_opcode_t optinsn_slot; +/* Optinsn template address */ +extern kprobe_opcode_t optprobe_template_entry[]; +extern kprobe_opcode_t optprobe_template_call_handler[]; +extern kprobe_opcode_t optprobe_template_call_emulate[]; +extern kprobe_opcode_t optprobe_template_ret_branch[]; +extern kprobe_opcode_t optprobe_template_ret[]; +extern kprobe_opcode_t optprobe_template_insn[]; +extern kprobe_opcode_t optprobe_template_op_address1[]; +extern kprobe_opcode_t optprobe_template_op_address2[]; +extern kprobe_opcode_t optprobe_template_end[]; + #define MAX_INSN_SIZE 1 +#define MAX_OPTIMIZED_LENGTH4 +#define MAX_OPTINSN_SIZE \ + ((unsigned long)_template_end -\ + (unsigned long)_template_entry) +#define RELATIVEJUMP_SIZE 4 #ifdef CONFIG_PPC64 #if defined(_CALL_ELF) && _CALL_ELF == 2 @@ -129,5 +147,12 @@ struct kprobe_ctlblk { extern int kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *data); extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr); + +struct arch_optimized_insn { + kprobe_opcode_t copied_insn[1]; + /* detour buffer 
*/ + kprobe_opcode_t *insn; +}; + #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_KPROBES_H */ diff --git a/arch/powerpc/kernel/optprobes_head.S b/arch/powerpc/kernel/optprobes_head.S new file mode 100644 index 000..ce32aec --- /dev/null +++ b/arch/powerpc/kernel/optprobes_head.S @@ -0,0 +1,108 @@ +/* + * Code to prepare detour buffer for optprobes in kernel. + * + * Copyright 2016, Anju T, IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include + +.global optinsn_slot +optinsn_slot: + /* Reserve an area to allocate slots for detour buffer */ + .space 65536 +.global optprobe_template_entry +optprobe_template_entry: + stdur1,-INT_FRAME_SIZE(r1) + SAVE_GPR(0,r1) + /* Save the previous SP into stack */ + addir0,r1,INT_FRAME_SIZE + std 0,GPR1(r1) + SAVE_2GPRS(2,r1) + SAVE_8GPRS(4,r1) + SAVE_10GPRS(12,r1) + SAVE_10GPRS(22,r1) + /* Save SPRS */ + mfcfar r5 + std r5,_NIP(r1) + mfmsr r5 + std r5,_MSR(r1) + mfctr r5 + std r5,_CTR(r1) + mflrr5 + std r5,_LINK(r1) + mfspr r5,SPRN_XER + std r5,_XER(r1) + li r5,0 + std r5,_TRAP(r1) + mfdar r5 + std r5,_DAR(r1) + mfdsisr r5 + std r5,_DSISR(r1) + /* Pass parameters for optimized_callback */ +.global optprobe_template_op_address1 +optprobe_template_op_address1: + nop + nop + nop + nop + nop + addir4,r1,STACK_FRAME_OVERHEAD + /* Branch to the prehandler */ +.global optprobe_template_call_handler +optprobe_template_call_handler: + nop + /* Pass parameters for instruction emulation */ + addir3,r1,STACK_FRAME_OVERHEAD +.global optprobe_template_insn +optprobe_template_insn: + nop + nop + /* Branch to instruction emulation */ +.global optprobe_template_call_emulate +optprobe_template_call_emulate: + nop +.global optprobe_template_op_address2 +optprobe_template_op_address2: + nop + nop + nop 
+ nop + nop + addir4,r1,STACK_FRAME_OVERHEAD + /* Branch to create_return_branch() function */ +.global optprobe_template_ret_branch +optprobe_template_ret_branch: + nop + /* Restore the registers */ + ld r5,_MSR(r1) + mtmsr r5 + ld r5,_CTR(r1) + mtctr r5 + ld r5,_LINK(r1) + mtlrr5 + ld r5,_XER(r1) + mtxer r5 +
[RFC PATCH v2 0/3] OPTPROBES for powerpc
Here is the RFC patchset of the kprobes jump optimization
(a.k.a. OPTPROBES) for powerpc. Kprobe being an inevitable tool
for kernel developers, enhancing the performance of kprobe has
got much importance.

Currently kprobes inserts a trap instruction to probe a running kernel.
Jump optimization allows kprobes to replace the trap with a branch,
reducing the probe overhead drastically.

Performance:
============
An optimized kprobe in powerpc is 1.05 to 4.7 times faster than a kprobe.

Example:

Placed a probe at an offset 0x50 in _do_fork().
*Time Diff here is the difference in time before hitting the probe and
after the probed instruction. mftb() is employed in kernel/fork.c for
this purpose.

# echo 0 > /proc/sys/debug/kprobes-optimization
Kprobes globally unoptimized
[ 233.607120] Time Diff = 0x1f0
[ 233.608273] Time Diff = 0x1ee
[ 233.609228] Time Diff = 0x203
[ 233.610400] Time Diff = 0x1ec
[ 233.611335] Time Diff = 0x200
[ 233.612552] Time Diff = 0x1f0
[ 233.613386] Time Diff = 0x1ee
[ 233.614547] Time Diff = 0x212
[ 233.615570] Time Diff = 0x206
[ 233.616819] Time Diff = 0x1f3
[ 233.617773] Time Diff = 0x1ec
[ 233.618944] Time Diff = 0x1fb
[ 233.619879] Time Diff = 0x1f0
[ 233.621066] Time Diff = 0x1f9
[ 233.621999] Time Diff = 0x283
[ 233.623281] Time Diff = 0x24d
[ 233.624172] Time Diff = 0x1ea
[ 233.625381] Time Diff = 0x1f0
[ 233.626358] Time Diff = 0x200
[ 233.627572] Time Diff = 0x1ed

# echo 1 > /proc/sys/debug/kprobes-optimization
Kprobes globally optimized
[ 70.797075] Time Diff = 0x103
[ 70.799102] Time Diff = 0x181
[ 70.801861] Time Diff = 0x15e
[ 70.803466] Time Diff = 0xf0
[ 70.804348] Time Diff = 0xd0
[ 70.805653] Time Diff = 0xad
[ 70.806477] Time Diff = 0xe0
[ 70.807725] Time Diff = 0xbe
[ 70.808541] Time Diff = 0xc3
[ 70.810191] Time Diff = 0xc7
[ 70.811007] Time Diff = 0xc0
[ 70.812629] Time Diff = 0xc0
[ 70.813640] Time Diff = 0xda
[ 70.814915] Time Diff = 0xbb
[ 70.815726] Time Diff = 0xc4
[ 70.816955] Time Diff = 0xc0
[ 70.817778] Time Diff = 0xcd
[ 70.818999] Time Diff = 0xcd
[ 70.820099] Time Diff = 0xcb
[ 70.821333] Time Diff = 0xf0

Implementation:
===============
The trap instruction is replaced by a branch to a detour buffer. To address
the limitation of the branch instruction in the power architecture, the
detour buffer slot is allocated from a reserved area. This will ensure that
the branch is within +/- 32 MB range. Patch 2/3 furnishes this. The current
kprobes insn caches allocate memory area for insn slots with module_alloc().
This will always be beyond +/- 32MB range. Hence for allocating and freeing
slots from this reserved area ppc_get_optinsn_slot() and
ppc_free_optinsn_slot() are introduced.

The detour buffer contains a call to optimized_callback() which in turn
calls the pre_handler(). Once the pre-handler is run, the original
instruction is emulated from the detour buffer itself. Also the detour
buffer is equipped with a branch back to the normal work flow after the
probed instruction is emulated.

Before preparing optimization, Kprobes inserts original (user-defined)
kprobe on the specified address. So, even if the kprobe is not possible
to be optimized, it just uses a normal kprobe.

Limitations:
============
- Number of probes which can be optimized is limited by the size of the
  area reserved.
  * TODO: Have a template based implementation that will alleviate the
    probe count by using a lesser space from the reserved area for
    optimization.
- Currently instructions which can be emulated are the only candidates
  for optimization.

Changes from RFC-v1:
--------------------
- Detour buffer memory reservation code moved to optprobes.c
- optimized_callback() is marked as NOKPROBE_SYMBOL.
- Return NULL when there are no more slots to allocate from detour buffer.
- Other comments by Masami are addressed.

Kindly let me know your suggestions and comments.

Thanks
-Anju

Anju T (3):
  arch/powerpc : Add detour buffer support for optprobes
  arch/powerpc : optprobes for powerpc core
  arch/powerpc : Enable optprobes support in powerpc

 .../features/debug/optprobes/arch-support.txt |   2 +-
 arch/powerpc/Kconfig                          |   1 +
 arch/powerpc/include/asm/kprobes.h            |  25 ++
 arch/powerpc/kernel/Makefile                  |   1 +
 arch/powerpc/kernel/optprobes.c               | 474 +
 arch/powerpc/kernel/optprobes_head.S          | 108 +
 6 files changed, 610 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/kernel/optprobes.c
 create mode 100644 arch/powerpc/kernel/optprobes_head.S

--
2.1.0
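[Editorial sketch, not part of the series.] The two constraints named in the cover letter, the +/- 32 MB branch reach and the probe count bounded by the reserved area, can be written down as small userspace helpers. This is only an illustration under stated assumptions: the function and macro names are hypothetical, the range check assumes the unconditional relative branch with its signed 26-bit, word-aligned displacement, and the slot count simply divides the 64KB area by a per-slot size in 4-byte instruction words.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RESERVED_BYTES 65536u /* the 64KB area from the cover letter */
#define SKETCH_OPCODE_BYTES   4u     /* one powerpc instruction word */

/*
 * True if a relative branch from 'from' to 'to' fits the signed 26-bit,
 * word-aligned displacement of the unconditional branch instruction.
 */
bool sketch_branch_in_range(uint64_t from, uint64_t to)
{
	int64_t offset = (int64_t)(to - from);

	return offset >= -0x2000000 && offset <= 0x1fffffc && (offset & 3) == 0;
}

/* Number of detour buffers of 'insns' instructions that fit in the area. */
unsigned int sketch_slots_in_area(unsigned int insns)
{
	if (insns == 0)
		return 0;
	return SKETCH_RESERVED_BYTES / (insns * SKETCH_OPCODE_BYTES);
}
```

With a purely illustrative template of 64 instructions, the second helper yields 256 slots; the real template length is whatever optprobe_template_end minus optprobe_template_entry works out to in the posted code.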
Re: [PATCH v2 2/9] powerpc/kvm: make hypervisor state restore a function
Hi Shreyas,

On Wed, May 18, 2016 at 12:37:56PM +0530, Shreyas B Prabhu wrote:

[..snip..]

> >> diff --git a/arch/powerpc/kernel/exceptions-64s.S
> >> b/arch/powerpc/kernel/exceptions-64s.S
> >> index 7716ceb..7ebfbb0 100644
> >> --- a/arch/powerpc/kernel/exceptions-64s.S
> >> +++ b/arch/powerpc/kernel/exceptions-64s.S
> >> @@ -107,25 +107,8 @@ BEGIN_FTR_SECTION
> >>        beq     9f
> >>
> >>        cmpwi   cr3,r13,2
> >> +      bl      power7_restore_hyp_resource
> >>
> >> -      /*
> >> -       * Check if last bit of HSPGR0 is set. This indicates whether we are
> >> -       * waking up from winkle.
> >> -       */
> >> -      GET_PACA(r13)
> >> -      clrldi  r5,r13,63
> >> -      clrrdi  r13,r13,1
> >> -      cmpwi   cr4,r5,1
> >> -      mtspr   SPRN_HSPRG0,r13
> >> -
> >> -      lbz     r0,PACA_THREAD_IDLE_STATE(r13)
> >> -      cmpwi   cr2,r0,PNV_THREAD_NAP
> >> -      bgt     cr2,8f          /* Either sleep or Winkle */
> >> -
> >> -      /* Waking up from nap should not cause hypervisor state loss */
> >> -      bgt     cr3,.
> >> -
> >> -      /* Waking up from nap */
> >>        li      r0,PNV_THREAD_RUNNING
> >>        stb     r0,PACA_THREAD_IDLE_STATE(r13)  /* Clear thread state */
> >>
> >> @@ -143,13 +126,9 @@ BEGIN_FTR_SECTION
> >>
> >>        /* Return SRR1 from power7_nap() */
> >>        mfspr   r3,SPRN_SRR1
> >> -      beq     cr3,2f
> >> -      b       power7_wakeup_noloss
> >> -2:    b       power7_wakeup_loss
> >> -
> >> -      /* Fast Sleep wakeup on PowerNV */
> >> -8:    GET_PACA(r13)
> >
> > In the old code, we do a GET_PACA(r13) before invoking the
> > power7_wakeup_tb_loss. In the new code we don't. Can you explain
> > this omission ?
>
> GET_PACA(13) is called at the beginning of
> power7_restore_hyp_resource. So r13 contains pointer to PACA when
> power7_wakeup_tb_loss is invoked later in the same function.

Ah, I see it now. So the GET_PACA(r13) at 8: was anyway redundant in
the older code.

You can add my Reviewed-by: to this patch.

--
Thanks and Regards
gautham.
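[Editorial aside, not part of the thread.] The removed lines quoted above split the value fetched by GET_PACA(r13) into a flag and a pointer: "clrldi r5,r13,63" keeps only the lowest bit and "clrrdi r13,r13,1" clears it. A rough C equivalent, with hypothetical function names and an arbitrary example value, might look like this:

```c
#include <stdint.h>

/* Lowest bit of the saved value, as isolated by 'clrldi r5,r13,63'. */
uint64_t sketch_winkle_flag(uint64_t hsprg0)
{
	return hsprg0 & 1u;
}

/* Value with the lowest bit cleared, as produced by 'clrrdi r13,r13,1'. */
uint64_t sketch_paca_pointer(uint64_t hsprg0)
{
	return hsprg0 & ~(uint64_t)1;
}
```

Per the quoted comment, a set flag indicates a wakeup from winkle; the cleared value is what gets written back with mtspr.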
Re: [RFC PATCH] Increase in idle power with schedutil
On Thu, May 19, 2016 at 1:40 PM, Peter Zijlstra wrote:
> On Wed, May 18, 2016 at 11:11:51PM +0200, Rafael J. Wysocki wrote:
>> On Wed, May 18, 2016 at 2:53 PM, Shilpasri G Bhat wrote:
>> > This patch adds driver callback for fast_switch and below observations
>> > on schedutil governor are done with this patch.
>> >
>> > In POWER8 there is a regression observed with schedutil compared to
>> > ondemand. With schedutil the frequency is not ramping down and is
>> > mostly stuck at max frequency during idle. This is because of the
>> > watchdog timer, an RT task which is fired every 4 seconds which
>> > results in requesting max frequency.
>>
>> Well, yes, that would be problematic.
>>
>
> Right; we need to come up with something for RT tasks;

I think we need the hints thing for that to be able to distinguish
between RT and the rest.

Also in this particular case it looks like an RT task is the only task
that wakes up often enough and we don't drop the frequency when going
idle. Do we need a hook somewhere in the idle path?
Re: [PATCH v2 5/9] powerpc/powernv: Move idle related macros to cpuidle.h
On Tue, May 03, 2016 at 01:54:34PM +0530, Shreyas B. Prabhu wrote:
> Move idle related macros to a common location asm/cpuidle.h so that
> they can be used for stop instruction support.
>
> Signed-off-by: Shreyas B. Prabhu

Reviewed-by: Gautham R. Shenoy

--
Thanks and Regards
gautham.
Re: [PATCH v2 4/9] powerpc/powernv: Make power7_powersave_common more generic
On Wed, May 18, 2016 at 12:21:17PM +0530, Shreyas B Prabhu wrote:
> With this patch, r5 which is the third parameter to
> power_powersave_common contains the return address that needs to be
> written to SRR0. So here I'm keeping r5 unaltered and using r7 for the MSR.

Ok.

Reviewed-by: Gautham R. Shenoy

>
> Thanks,
> Shreyas
Re: [PATCH 0/4] rcutorture: Several fixes to run selftest scripts on PPC
On Wed, May 18, 2016 at 09:25:17PM -0700, Josh Triplett wrote:
> On Thu, May 19, 2016 at 11:42:20AM +0800, Boqun Feng wrote:
> > I spent some time to make tools/testing/selftests/rcutorture run on PPC,
> > here are some documentation and fixes made while I was trying.
> >
> > The scripts are able to run and get results on PPC, however please
> > note there are some stalls and even build errors that could be found
> > by the tests currently.
> >
> > As I'm certainly not an expert of qemu or bash programming, there
> > may be something I am missing in those patches. So tests and comments
> > are welcome ;-)
> >
> > Regards,
> > Boqun
> >
> > Boqun Feng (4):
> >   rcutorture/doc: Add a new way to create initrd using dracut
> >   rcutorture: Use vmlinux as the fallback kernel image
> >   rcutorture: Make -soundhw a x86 specific option
> >   rcutorture: Don't specify the cpu type of QEMU on PPC
>
> All four of these seem reasonable to me:
> Reviewed-by: Josh Triplett

Thank you both! I have queued all four for further review and for testing.

> I responded to the -soundhw patch, trying to track down why that option
> was needed in the first place, and seeking a solution that doesn't
> require adding to the set of target-specific options. But I don't think
> that investigation should block your fix.

Agreed, it should work better now than it did before!

							Thanx, Paul
Re: [PATCH 3/4] rcutorture: Make -soundhw a x86 specific option
On Wed, May 18, 2016 at 09:23:10PM -0700, Josh Triplett wrote:
> On Thu, May 19, 2016 at 11:42:23AM +0800, Boqun Feng wrote:
> > The option "-soundhw pcspk" gives me an error on PPC as follows:
> >
> > qemu-system-ppc64: ISA bus not available for pcspk
> >
> > , which means this option doesn't work on ppc by default. So simply make
> > this an x86-specific option via identify_qemu_args().
> >
> > Signed-off-by: Boqun Feng
>
> The emulated system for RCU testing does not need sound hardware at all.
> Paul added this option in commit
> 16c77ea7d0f4a74e49009aa2d26c275f7f93de7c to disable the default sound
> hardware, saying that '"-soundhw pcspk" makes the script a bit less
> dependent on odd audio libraries being installed'. Unfortunately, it
> looks like there isn't a "-soundhw none". As far as I can tell,
> currently the only way to completely eliminate sound hardware is to pass
> "-nodefaults" and then explicitly specify each desired device; while
> that would solve the issue, it would likely introduce *more*
> hardware-specific command-line options...
>
> I've filed two feature requests on upstream qemu to make this simpler:
> https://bugs.launchpad.net/qemu/+bug/1583420 and
> https://bugs.launchpad.net/qemu/+bug/1583421 .
>
> Paul, what did you mean by "dependent on odd audio libraries"? Did you
> mean in the guest or the host? And either way, is this something that
> could potentially be solved another way?

If I remember correctly, Ubuntu 14.04 qemu refused to run the guest
without this option, but I don't recall the exact error message.
I chalked it up to my ignorance of qemu, but I would very much welcome
some way to not have to specify irrelevant hardware. So thank you very
much for filing the bugs!

							Thanx, Paul
Re: next build: 37 warnings 2 failures (next/next-20160519)
On Thursday 19 May 2016 15:03:34 Kishon Vijay Abraham I wrote:
> >
> >> 1 drivers/phy/phy-exynos-mipi-video.c:238:13: warning: 'val' may be
> >> used uninitialized in this function [-Wmaybe-uninitialized]
> >
> > I sent a patch on May 11, it was reviewed by Krzysztof Kozlowski, but
> > not yet applied.
>
> Is it okay if I send this during the -rc cycle?
>

Yes, it's a bug fix, so it should just go in as soon as possible, that's
what the -rc cycle is for after all.

	Arnd
Re: next build: 37 warnings 2 failures (next/next-20160519)
On Thursday 19 May 2016 11:20:44 Michal Hocko wrote:
> On Thu 19-05-16 11:07:09, Arnd Bergmann wrote:
> [...]
> > > 6 mm/page_alloc.c:3651:6: warning: 'compact_result' may be used
> > > uninitialized in this function [-Wmaybe-uninitialized]
> >
> > I'm surprised this one is still there, I sent a patch but Michal Hocko
> > came up with a better fix on May 12, which was not applied yet.
> >
> > Michal, can you resend this one to Andrew? I suspect he missed it as it
> > was sent as a reply to mine.
>
> Andrew has taken the patch IIRC but he hasn't released any mmotm since
> then so it didn't get to the linux-next.
>

Ok, cool, that explains it.

	Arnd
Re: WARNING at kernel/sched/core.c:1166 while booting 4.6.0 mainline on ppc64le bare metal
On Thu, May 19, 2016 at 04:27:49PM +0530, abdhalee wrote: >Hi > >Today's mainline stable 4.6 on ppc64le bare metal booted with the following >warning. > >[0.080615] EEH: PowerNV platform initialized >[0.080709] POWER8 performance monitor hardware support registered >[0.080791] power8-pmu: PMAO restore workaround active. >[0.100780] [ cut here ] >[0.100869] WARNING: CPU: 40 PID: 248 at kernel/sched/core.c:1166 >__set_cpus_allowed_ptr+0x21c/0x290 I ran into same issue on yesterday's linux-next. Also, I added some logs and it seems the CPU isn't marked as active in time. The stack trace is poped up under the circumstance: CPU#80 is online, but not active yet. ==> cpuhp_thread_fun: CPU=80 cpuhp_thread_fun: state=10 target=45 cpuhp_ap_online: CPU=80, state=10 target=45 smpboot_unpark_threads: CPU=80 notify_online: CPU=80 CPU#80 isn't active yet. [ cut here ] WARNING: CPU: 80 PID: 408 at kernel/sched/core.c:1166 __set_cpus_allowed_ptr+0x22c/0x290 Modules linked in: CPU: 80 PID: 408 Comm: cpuhp/80 Not tainted 4.6.0-next-20160517-gavin-00020-g176bf86-dirty #35 task: c01e5243de00 ti: c01ffc10c000 task.ti: c01ffc10c000 NIP: c00d923c LR: c00d9224 CTR: REGS: c01ffc10f730 TRAP: 0700 Not tainted (4.6.0-next-20160517-gavin-00020-g176bf86-dirty) MSR: 90029033CR: 28002044 XER: 2000 CFAR: c047135c SOFTE: 0 GPR00: c00d9138 c01ffc10f9b0 c1321300 GPR04: c135aa18 0400 0010 GPR08: 0050 c135aa90 GPR12: 2200 cff14000 c00ffa60c5d0 c1292800 GPR16: 0001 c12780a8 c139b678 0001 GPR20: c01e523b c1278048 0008 c12cfa8e GPR24: c12780c8 c01ffc10fa40 c1278048 c135a898 GPR28: c00ff133ff08 c00ff9c0c780 c01e5240 NIP [c00d923c] __set_cpus_allowed_ptr+0x22c/0x290 LR [c00d9224] __set_cpus_allowed_ptr+0x214/0x290 Call Trace: [c01ffc10f9b0] [c00d9138] __set_cpus_allowed_ptr+0x128/0x290 (unreliable) [c01ffc10fa20] [c00c65e0] workqueue_cpu_up_callback+0x460/0x5d0 [c01ffc10faf0] [c00cee6c] notifier_call_chain+0xac/0x110 [c01ffc10fb40] [c009fc64] __cpu_notify+0x54/0xa0 [c01ffc10fb60] [c009fd9c] 
notify_online+0x4c/0x70 [c01ffc10fbd0] [c009f5b4] cpuhp_up_callbacks+0x74/0x1a0 [c01ffc10fc20] [c00a0100] cpuhp_thread_fun+0x1e0/0x2a0 [c01ffc10fcc0] [c00d2ac0] smpboot_thread_fn+0x290/0x2a0 [c01ffc10fd20] [c00cd578] kthread+0x108/0x130 [c01ffc10fe30] [c0009578] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 419eff38 3c820004 38849718 7f83e378 38a00400 483980f1 6000 2fa3 409eff18 813e0254 2f890001 419eff0c <0fe0> 4b04 80810038 387d0018 ---[ end trace 5cf6676167cdd41c ]--- sched_cpu_activate: CPU=80 < CPU#80 is marked as active Thanks, Gavin ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Pull request: scottwood/linux.git next
On Mon, 2016-05-16 at 20:37 -0500, Scott Wood wrote:
> Sorry for the lateness...
>
> Contains include 86xx fixes, minor device tree fixes, an erratum
> workaround, and a kconfig dependency fix.

Thanks, merged into next and pushed.

Will send a pull request to Linus late tomorrow once the whole assemblage
has been through linux-next.

cheers
Re: [RFC PATCH] Increase in idle power with schedutil
On Wed, May 18, 2016 at 11:11:51PM +0200, Rafael J. Wysocki wrote:
> On Wed, May 18, 2016 at 2:53 PM, Shilpasri G Bhat wrote:
> > This patch adds driver callback for fast_switch and below observations
> > on schedutil governor are done with this patch.
> >
> > In POWER8 there is a regression observed with schedutil compared to
> > ondemand. With schedutil the frequency is not ramping down and is
> > mostly stuck at max frequency during idle. This is because of the
> > watchdog timer, an RT task which is fired every 4 seconds which
> > results in requesting max frequency.
>
> Well, yes, that would be problematic.
>

Right; we need to come up with something for RT tasks; but what happens
if you disable the watchdog? This should be entirely doable and might
give a better comparison.
Re: [kernel-hardening] [PATCH v8 2/4] GCC plugin infrastructure
On 19 May 2016 at 16:22, Michael Ellerman wrote:
> On Wed, 2016-05-18 at 12:33 +0200, Emese Revfy wrote:
> > Did you test the plugins with all gcc versions (4.5-6)?
>
> What's the concern about gcc versions? Just not breaking the build on old
> compilers?

the earlier plugin capable gcc versions used to install gcc headers in a
somewhat ad-hoc manner resulting in compile time breakage for plugins and
since some of those potentially missing headers are target specific, each
target arch should be verified before enabling plugin support on them.
things have much improved with gcc 5 (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61176) though there's still
an occasional missing header but with wider use of plugins they will
hopefully be discovered earlier now. perhaps linux-arch should be cc'ed
on the plugin infrastructure so that arch maintainers are aware of this?

> I'm pretty sure powerpc big endian still builds with gcc 4.4.
>
> However if Andrew's only tested on little endian, then that select should be
> guarded with an "if CPU_LITTLE_ENDIAN". And to build LE you need gcc >= 4.9.

i guess that's part of the target tuple so in general arch maintainers
should test the target tuples used on their arch with all the supported
gcc versions (speaking of CC, not HOSTCC/HOSTCXX).

cheers,
 PaX Team
Re: next build: 37 warnings 2 failures (next/next-20160519)
On Thu, 19 May 2016, Arnd Bergmann wrote:
> > 2 drivers/xen/balloon.c:154:13: warning: 'release_memory_resource' declared 'static' but never defined [-Wunused-function]
>
> I sent a patch on May 11, subject "xen: remove incorrect forward declaration" and
> Stefano Stabellini reviewed it. Ross Lagerwall did the same patch a day earlier,
> but neither of them has made it into linux-next so far. According to Ross, this
> one should be backported to v4.4.

It's on our radar, the patch hasn't been lost.
Re: next build: 37 warnings 2 failures (next/next-20160519)
Hi Arnd, On Thursday 19 May 2016 02:37 PM, Arnd Bergmann wrote: > On Thursday 19 May 2016 00:45:16 Olof's autobuilder wrote: >> Errors: >> >> arm64.allmodconfig: >> samples/seccomp/bpf-fancy.c:13:27: fatal error: linux/seccomp.h: No such >> file or directory >> samples/seccomp/dropper.c:20:27: fatal error: linux/seccomp.h: No such file >> or directory >> samples/seccomp/bpf-helper.h:20:50: fatal error: linux/seccomp.h: No such >> file or directory >> samples/seccomp/bpf-direct.c:21:27: fatal error: linux/seccomp.h: No such >> file or directory > > This one is interesting: the same header dependency seems to be present for > samples/bpf, > but only samples/seccomp fails. Can you check if both are attempted to be > built? > > samples/bpf/README.rst says about this: > > |Kernel headers > |-- > | > |There are usually dependencies to header files of the current kernel. > |To avoid installing devel kernel headers system wide, as a normal > |user, simply call:: > | > | make headers_install > | > |This will creates a local "usr/include" directory in the git/build top > |level directory, that the make system automatically pickup first. > > which I assume would fix the problem, but it would be better if Kbuild was > smart enough > to do this implicitly when building these samples. > >> powerpc.pasemi_defconfig: >> arch/powerpc/kernel/ptrace.c:380:24: error: index 32 denotes an offset >> greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' >> [-Werror=array-bounds] >> arch/powerpc/kernel/ptrace.c:408:24: error: index 32 denotes an offset >> greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' >> [-Werror=array-bounds] > > I don't see a good way to avoid the warning other than dropping the > >BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) != > offsetof(struct thread_fp_state, fpr[32][0])); > > statements in the powerpc ptrace implementation. It doesn't seem too > important to check for though. 
> > >> Warnings: > >> 2 drivers/net/wireless/intel/iwlegacy/3945.c:1022:5: warning: suggest >> explicit braces to avoid ambiguous 'else' [-Wparentheses] > > I had not seen this before, sent a patch now. > >> 3 drivers/pinctrl/stm32/pinctrl-stm32.c:797:17: warning: too many >> arguments for format [-Wformat-extra-args] > > sent a fix yesterday, got an ack but it wasn't applied yet. I'm sure Linus > Walleij > will take care of it soon. > >> 6 mm/page_alloc.c:3651:6: warning: 'compact_result' may be used >> uninitialized in this function [-Wmaybe-uninitialized] > > I'm surprised this one is still there, I sent a patch but Michal Hocko came > up with > a better fix on May 12, which was not applied yet. > > Michael, can you resend this one to Andrew? I suspect he missed it as it was > sent as a reply to mine. > >> 2 drivers/xen/balloon.c:154:13: warning: 'release_memory_resource' >> declared 'static' but never defined [-Wunused-function] > > I sent a patch on May 11, subject "xen: remove incorrect forward declaration" > and > Stefano Stabellini reviewed it. Ross Lagerwall did the same patch a day > earlier, > but neither of them has made it into linux-next so far. According to Ross, > this > one should be backported to v4.4. > >> 3 fs/xfs/xfs_aops.c:97:16: warning: unused variable 'blockmask' >> [-Wunused-variable] > > I sent a patch on April 16, but got no reply. Resending it now. > >> 2 arch/arm/mach-lpc32xx/include/mach/irqs.h:115:0: warning: "NR_IRQS" >> redefined > > I missed this one, as I have some other patches for lp32xx in my randconfig > fixup tree that hides it. > > I've created a fix now and applied it to the arm-soc fixes branch. 
>
>> 1 drivers/soc/mediatek/mtk-pmic-wrap.c:1062:16: warning: large integer implicitly truncated to unsigned type [-Woverflow]
>> 1 drivers/soc/mediatek/mtk-pmic-wrap.c:1074:16: warning: large integer implicitly truncated to unsigned type [-Woverflow]
>> 1 drivers/soc/mediatek/mtk-pmic-wrap.c:1086:16: warning: large integer implicitly truncated to unsigned type [-Woverflow]
>
> I sent out a patch on May 12 for this, got no reply. I've applied my own patch
> now on the arm-soc fixes branch.
>
>> 1 drivers/phy/phy-exynos-mipi-video.c:238:13: warning: 'val' may be used uninitialized in this function [-Wmaybe-uninitialized]
>
> I sent a patch on May 11, it was reviewed by Krzysztof Kozlowski, but not yet
> applied.

Is it okay if I send this during the -rc cycle?

Thanks
Kishon
Re: next build: 37 warnings 2 failures (next/next-20160519)
On Thu 19-05-16 11:07:09, Arnd Bergmann wrote:
[...]
> > 6 mm/page_alloc.c:3651:6: warning: 'compact_result' may be used uninitialized in this function [-Wmaybe-uninitialized]
>
> I'm surprised this one is still there, I sent a patch but Michal Hocko came up with
> a better fix on May 12, which was not applied yet.
>
> Michael, can you resend this one to Andrew? I suspect he missed it as it was
> sent as a reply to mine.

Andrew has taken the patch IIRC but he hasn't released any mmotm since
then so it didn't get to the linux-next.
-- 
Michal Hocko
SUSE Labs
Re: next build: 37 warnings 2 failures (next/next-20160519)
On Thursday 19 May 2016 00:45:16 Olof's autobuilder wrote:
> Errors:
>
> arm64.allmodconfig:
> samples/seccomp/bpf-fancy.c:13:27: fatal error: linux/seccomp.h: No such file or directory
> samples/seccomp/dropper.c:20:27: fatal error: linux/seccomp.h: No such file or directory
> samples/seccomp/bpf-helper.h:20:50: fatal error: linux/seccomp.h: No such file or directory
> samples/seccomp/bpf-direct.c:21:27: fatal error: linux/seccomp.h: No such file or directory

This one is interesting: the same header dependency seems to be present for
samples/bpf, but only samples/seccomp fails. Can you check if both are
attempted to be built?

samples/bpf/README.rst says about this:

|Kernel headers
|--
|
|There are usually dependencies to header files of the current kernel.
|To avoid installing devel kernel headers system wide, as a normal
|user, simply call::
|
| make headers_install
|
|This will creates a local "usr/include" directory in the git/build top
|level directory, that the make system automatically pickup first.

which I assume would fix the problem, but it would be better if Kbuild was
smart enough to do this implicitly when building these samples.

> powerpc.pasemi_defconfig:
> arch/powerpc/kernel/ptrace.c:380:24: error: index 32 denotes an offset greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' [-Werror=array-bounds]
> arch/powerpc/kernel/ptrace.c:408:24: error: index 32 denotes an offset greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' [-Werror=array-bounds]

I don't see a good way to avoid the warning other than dropping the

	BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) !=
		     offsetof(struct thread_fp_state, fpr[32][0]));

statements in the powerpc ptrace implementation. It doesn't seem too
important to check for though.

> Warnings:
> 2 drivers/net/wireless/intel/iwlegacy/3945.c:1022:5: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]

I had not seen this before, sent a patch now.

> 3 drivers/pinctrl/stm32/pinctrl-stm32.c:797:17: warning: too many arguments for format [-Wformat-extra-args]

sent a fix yesterday, got an ack but it wasn't applied yet. I'm sure Linus
Walleij will take care of it soon.

> 6 mm/page_alloc.c:3651:6: warning: 'compact_result' may be used uninitialized in this function [-Wmaybe-uninitialized]

I'm surprised this one is still there, I sent a patch but Michal Hocko came
up with a better fix on May 12, which was not applied yet.

Michael, can you resend this one to Andrew? I suspect he missed it as it was
sent as a reply to mine.

> 2 drivers/xen/balloon.c:154:13: warning: 'release_memory_resource' declared 'static' but never defined [-Wunused-function]

I sent a patch on May 11, subject "xen: remove incorrect forward declaration"
and Stefano Stabellini reviewed it. Ross Lagerwall did the same patch a day
earlier, but neither of them has made it into linux-next so far. According to
Ross, this one should be backported to v4.4.

> 3 fs/xfs/xfs_aops.c:97:16: warning: unused variable 'blockmask' [-Wunused-variable]

I sent a patch on April 16, but got no reply. Resending it now.

> 2 arch/arm/mach-lpc32xx/include/mach/irqs.h:115:0: warning: "NR_IRQS" redefined

I missed this one, as I have some other patches for lp32xx in my randconfig
fixup tree that hides it.

I've created a fix now and applied it to the arm-soc fixes branch.

> 1 drivers/soc/mediatek/mtk-pmic-wrap.c:1062:16: warning: large integer implicitly truncated to unsigned type [-Woverflow]
> 1 drivers/soc/mediatek/mtk-pmic-wrap.c:1074:16: warning: large integer implicitly truncated to unsigned type [-Woverflow]
> 1 drivers/soc/mediatek/mtk-pmic-wrap.c:1086:16: warning: large integer implicitly truncated to unsigned type [-Woverflow]

I sent out a patch on May 12 for this, got no reply. I've applied my own
patch now on the arm-soc fixes branch.

> 1 drivers/phy/phy-exynos-mipi-video.c:238:13: warning: 'val' may be used uninitialized in this function [-Wmaybe-uninitialized]

I sent a patch on May 11, it was reviewed by Krzysztof Kozlowski, but not yet
applied.

> 1 include/soc/nps/common.h:148:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
> 1 include/soc/nps/common.h:162:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

I sent a patch on May 12, but it hasn't appeared in linux-next yet.

> 1 drivers/infiniband/core/cma.c:1253:12: warning: 'src_addr_storage.sin_addr.s_addr' may be used uninitialized in this function [-Wmaybe-uninitialized]

This seems to only happen on powerpc. What compiler version are you using
there? If it's an older compiler, we might not necessarily care about the
warnings but you may want to upgrade.

I've confirmed that this is a false positive, but I see
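A note on the BUILD_BUG_ON quoted in the message above: it checks at compile time that fpscr sits immediately after the fpr array, and it does so by taking the offset of the one-past-the-end element fpr[32][0], which is the index gcc complains about under -Warray-bounds. The following is only a minimal user-space sketch of the same layout check written without the out-of-range index. The struct fp_state, the FPR_END_OFFSET macro and the fpr_end_offset() helper are hypothetical stand-ins, not kernel code, and nobody in the thread proposed this as the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct thread_fp_state. */
struct fp_state {
	uint64_t fpr[32][1];
	uint64_t fpscr;
};

/*
 * Offset of the first byte after the fpr array, computed from the
 * array's own offset and size instead of indexing element 32.
 */
#define FPR_END_OFFSET \
	(offsetof(struct fp_state, fpr) + sizeof(((struct fp_state *)0)->fpr))

/* Compile-time layout check: fpscr must directly follow fpr. */
_Static_assert(offsetof(struct fp_state, fpscr) == FPR_END_OFFSET,
	       "fpscr must directly follow fpr");

/* Runtime accessor so the computed offset can be inspected. */
static size_t fpr_end_offset(void)
{
	return FPR_END_OFFSET;
}
```

The sizeof operand is never evaluated, so the null pointer cast is only used to name the member type.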
[PATCH 7/7] powerpc/mm: remove flush_tlb_page_nohash
This should be the same as flush_tlb_page except for hash32. For hash32 I
guess the existing code is wrong, because we don't seem to be flushing the
tlb for the Hash != 0 case at all. Fix this by switching to calling
flush_tlb_page(), which does the right thing by flushing the tlb for both
the hash and nohash cases with hash32.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 -
 arch/powerpc/include/asm/book3s/64/tlbflush.h | 8
 arch/powerpc/include/asm/tlbflush.h | 1 -
 arch/powerpc/mm/pgtable.c | 2 +-
 arch/powerpc/mm/tlb_hash32.c | 11 ---
 5 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
index f12ddf5e8de5..2f6373144e2c 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
@@ -75,11 +75,6 @@ static inline void hash__flush_tlb_page(struct vm_area_struct *vma,
 {
 }
 
-static inline void hash__flush_tlb_page_nohash(struct vm_area_struct *vma,
-					       unsigned long vmaddr)
-{
-}
-
 static inline void hash__flush_tlb_range(struct vm_area_struct *vma,
 				unsigned long start, unsigned long end)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 3b3e5e944af7..ea29cc3318d2 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -57,14 +57,6 @@ static inline void local_flush_tlb_page(struct vm_area_struct *vma,
 	return hash__local_flush_tlb_page(vma, vmaddr);
 }
 
-static inline void flush_tlb_page_nohash(struct vm_area_struct *vma,
-					 unsigned long vmaddr)
-{
-	if (radix_enabled())
-		return radix__flush_tlb_page(vma, vmaddr);
-	return hash__flush_tlb_page_nohash(vma, vmaddr);
-}
-
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
 	if (radix_enabled())
diff --git a/arch/powerpc/include/asm/tlbflush.h b/arch/powerpc/include/asm/tlbflush.h
index 1b38eea28e5a..13dbcd41885e 100644
--- a/arch/powerpc/include/asm/tlbflush.h
+++ b/arch/powerpc/include/asm/tlbflush.h
@@ -54,7 +54,6 @@ extern void __flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
 #define flush_tlb_page(vma,addr)	local_flush_tlb_page(vma,addr)
 #define __flush_tlb_page(mm,addr,p,i)	__local_flush_tlb_page(mm,addr,p,i)
 #endif
-#define flush_tlb_page_nohash(vma,addr)	flush_tlb_page(vma,addr)
 
 #elif defined(CONFIG_PPC_STD_MMU_32)
 
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 88a307504b5a..0b6fb244d0a1 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -225,7 +225,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 		if (!is_vm_hugetlb_page(vma))
 			assert_pte_locked(vma->vm_mm, address);
 		__ptep_set_access_flags(ptep, entry);
-		flush_tlb_page_nohash(vma, address);
+		flush_tlb_page(vma, address);
 	}
 	return changed;
 }
diff --git a/arch/powerpc/mm/tlb_hash32.c b/arch/powerpc/mm/tlb_hash32.c
index 558e30cce33e..702d7689d714 100644
--- a/arch/powerpc/mm/tlb_hash32.c
+++ b/arch/powerpc/mm/tlb_hash32.c
@@ -49,17 +49,6 @@ void flush_hash_entry(struct mm_struct *mm, pte_t *ptep, unsigned long addr)
 EXPORT_SYMBOL(flush_hash_entry);
 
 /*
- * Called by ptep_set_access_flags, must flush on CPUs for which the
- * DSI handler can't just "fixup" the TLB on a write fault
- */
-void flush_tlb_page_nohash(struct vm_area_struct *vma, unsigned long addr)
-{
-	if (Hash != 0)
-		return;
-	_tlbie(addr);
-}
-
-/*
  * Called at the end of a mmu_gather operation to make sure the
  * TLB flush is completely done.
  */
-- 
2.7.4
[PATCH 1/7] powerpc/mm: Use hugetlb flush functions
Use flush_hugetlb_page instead of flush_tlb_page when we clear and flush
the pte.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/hugetlb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index e2d9f4996e5c..c5517f463ec7 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -147,7 +147,7 @@ static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
 {
 	pte_t pte;
 	pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
-	flush_tlb_page(vma, addr);
+	flush_hugetlb_page(vma, addr);
 }
 
 static inline int huge_pte_none(pte_t pte)
-- 
2.7.4
[PATCH 6/7] powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
Some archs like ppc64 need to do special things when flushing tlb for
hugepage. Add a new helper to flush hugetlb tlb range. This helps us to
avoid flushing the entire tlb mapping for the pid.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 2 ++
 arch/powerpc/include/asm/book3s/64/tlbflush.h | 10 ++
 arch/powerpc/mm/hugetlbpage-radix.c | 10 ++
 mm/hugetlb.c | 10 +-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 68839e6adcf1..73953a44d4e3 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -10,6 +10,8 @@ static inline int mmu_get_ap(int psize)
 	return mmu_psize_defs[psize].ap;
 }
 
+extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start, unsigned long end);
 extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start,
 					 unsigned long end, int psize);
 extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index f0d6c9d38916..3b3e5e944af7 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -16,6 +16,16 @@ static inline void flush_pmd_tlb_range(struct vm_area_struct *vma,
 	return hash__flush_tlb_range(vma, start, end);
 }
 
+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	if (radix_enabled())
+		return radix__flush_hugetlb_tlb_range(vma, start, end);
+	return hash__flush_tlb_range(vma, start, end);
+}
+
 static inline void flush_tlb_range(struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c
index 1eca0deaf89b..35254a678456 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -25,6 +25,16 @@ void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long v
 	radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
+void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, unsigned long start,
+				   unsigned long end)
+{
+	int psize;
+	struct hstate *hstate = hstate_file(vma->vm_file);
+
+	psize = hstate_get_psize(hstate);
+	radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
+}
+
 /*
  * A vairant of hugetlb_get_unmapped_area doing topdown search
  * FIXME!! should we do as x86 does or non hugetlb area does ?
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 19d0d08b396f..076a57ee8790 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3893,6 +3893,14 @@ same_page:
 	return i ? i : -EFAULT;
 }
 
+#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+/*
+ * ARCHes with special requirements for evicting HUGETLB backing TLB entries can
+ * implement this.
+ */
+#define flush_hugetlb_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
+#endif
+
 unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 		unsigned long address, unsigned long end, pgprot_t newprot)
 {
@@ -3953,7 +3961,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * once we release i_mmap_rwsem, another task can do the final put_page
 	 * and that page table be reused and filled with junk.
 	 */
-	flush_tlb_range(vma, start, end);
+	flush_hugetlb_tlb_range(vma, start, end);
 	mmu_notifier_invalidate_range(mm, start, end);
 	i_mmap_unlock_write(vma->vm_file->f_mapping);
 	mmu_notifier_invalidate_range_end(mm, start, end);
-- 
2.7.4
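The mm/hugetlb.c hunk in this patch relies on a common kernel idiom: generic code supplies a fallback macro unless the architecture has defined a marker together with its own implementation. Below is a small user-space sketch of that idiom. Only the names flush_tlb_range and flush_hugetlb_tlb_range come from the patch; the flush_kind enum, the two caller functions and the marker spelling HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE (without the leading underscores, which are reserved outside the kernel) are assumptions made for illustration.

```c
#include <assert.h>

/* Records which flush routine ran last, purely for demonstration. */
enum flush_kind { FLUSH_NONE, FLUSH_GENERIC, FLUSH_HUGETLB };
static enum flush_kind last_flush = FLUSH_NONE;

/* Generic range flush that every "architecture" provides. */
static void flush_tlb_range(unsigned long start, unsigned long end)
{
	assert(start <= end);
	last_flush = FLUSH_GENERIC;
}

/* "Arch header": defines the marker and a specialised implementation. */
#define HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
static void flush_hugetlb_tlb_range(unsigned long start, unsigned long end)
{
	assert(start <= end);
	last_flush = FLUSH_HUGETLB;
}

/* "Generic code": fall back to the plain range flush if no override exists. */
#ifndef HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
#define flush_hugetlb_tlb_range(start, end) flush_tlb_range(start, end)
#endif

/* Hypothetical caller standing in for hugetlb_change_protection(). */
static enum flush_kind change_hugetlb_protection(unsigned long start,
						 unsigned long end)
{
	flush_hugetlb_tlb_range(start, end);
	return last_flush;
}

/* Hypothetical caller that works on normal pages. */
static enum flush_kind change_normal_protection(unsigned long start,
						unsigned long end)
{
	flush_tlb_range(start, end);
	return last_flush;
}
```

Removing the marker and the specialised function makes the hugetlb caller silently use the generic flush, which is the behaviour architectures other than ppc64 keep after this patch.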
[PATCH 5/7] powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate
Use the helper instead of open coding the same logic in multiple places.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/hugetlb-radix.h | 15 +++
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 4 +--
 arch/powerpc/mm/hugetlbpage-radix.c | 29 ++
 arch/powerpc/mm/tlb-radix.c | 10 +---
 4 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
index 60f47649306f..c45189aa7476 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
@@ -11,4 +11,19 @@
 extern unsigned long radix__hugetlb_get_unmapped_area(struct file *file,
 						unsigned long addr, unsigned long len,
 						unsigned long pgoff, unsigned long flags);
+
+static inline int hstate_get_psize(struct hstate *hstate)
+{
+	unsigned long shift;
+
+	shift = huge_page_shift(hstate);
+	if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
+		return MMU_PAGE_2M;
+	else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
+		return MMU_PAGE_1G;
+	else {
+		WARN(1, "Wrong huge page shift\n");
+		return mmu_virtual_psize;
+	}
+}
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 07b2e0031dad..68839e6adcf1 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -21,13 +21,13 @@ extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end
 extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
 extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
 extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
-					      unsigned long ap);
+					      int psize);
 extern void radix__tlb_flush(struct mmu_gather *tlb);
 #ifdef CONFIG_SMP
 extern void radix__flush_tlb_mm(struct mm_struct *mm);
 extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
 extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
-					unsigned long ap);
+					int psize);
 #else
 #define radix__flush_tlb_mm(mm)	radix__local_flush_tlb_mm(mm)
 #define radix__flush_tlb_page(vma,addr)	radix__local_flush_tlb_page(vma,addr)
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c
index 0dfa1816f0c6..1eca0deaf89b 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -5,39 +5,24 @@
 #include
 #include
 #include
+#include
 
 void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 {
-	unsigned long ap, shift;
+	int psize;
 	struct hstate *hstate = hstate_file(vma->vm_file);
 
-	shift = huge_page_shift(hstate);
-	if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-		ap = mmu_get_ap(MMU_PAGE_2M);
-	else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
-		ap = mmu_get_ap(MMU_PAGE_1G);
-	else {
-		WARN(1, "Wrong huge page shift\n");
-		return ;
-	}
-	radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
+	psize = hstate_get_psize(hstate);
+	radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
 void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 {
-	unsigned long ap, shift;
+	int psize;
 	struct hstate *hstate = hstate_file(vma->vm_file);
 
-	shift = huge_page_shift(hstate);
-	if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-		ap = mmu_get_ap(MMU_PAGE_2M);
-	else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
-		ap = mmu_get_ap(MMU_PAGE_1G);
-	else {
-		WARN(1, "Wrong huge page shift\n");
-		return ;
-	}
-	radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
+	psize = hstate_get_psize(hstate);
+	radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
 /*
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index b1dc4675925d..7bc3d1402c63 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -128,9 +128,10 @@ void radix__local_flush_tlb_mm(struct mm_struct *mm)
 EXPORT_SYMBOL(radix__local_flush_tlb_mm);
 
 void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
-				       unsigned long ap)
+				       int psize)
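The helper introduced by this patch turns a huge page shift into a page-size index and falls back to the base page size with a warning when the shift is unknown. A minimal user-space sketch of that lookup follows. The psize_defs table, its 4K/64K entries, the virtual_psize fallback value and the function name shift_to_psize are assumptions for illustration; only the 2M and 1G cases and the fallback behaviour mirror the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Page-size indices; the 2M and 1G names follow the patch. */
enum { MMU_PAGE_4K, MMU_PAGE_64K, MMU_PAGE_2M, MMU_PAGE_1G, MMU_PAGE_COUNT };

struct psize_def {
	unsigned int shift;	/* log2 of the page size in bytes */
};

/* Hypothetical stand-in for the kernel's mmu_psize_defs[] table. */
static const struct psize_def psize_defs[MMU_PAGE_COUNT] = {
	[MMU_PAGE_4K]  = { 12 },
	[MMU_PAGE_64K] = { 16 },
	[MMU_PAGE_2M]  = { 21 },
	[MMU_PAGE_1G]  = { 30 },
};

/* Assumed base page size, used when the shift is not a known huge page. */
static const int virtual_psize = MMU_PAGE_64K;

/* Map a huge page shift to a page-size index, warning on unknown shifts. */
static int shift_to_psize(unsigned int shift)
{
	assert(shift < 64);
	if (shift == psize_defs[MMU_PAGE_2M].shift)
		return MMU_PAGE_2M;
	if (shift == psize_defs[MMU_PAGE_1G].shift)
		return MMU_PAGE_1G;
	fprintf(stderr, "Wrong huge page shift %u\n", shift);
	return virtual_psize;
}
```

Returning an index rather than the AP encoding lets each flush routine look up whatever per-size field it needs, which is why the patch also changes the flush prototypes from `unsigned long ap` to `int psize`.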
[PATCH 4/7] powerpc/mm/radix: Rename function and drop unused arg
Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 10 +-
 arch/powerpc/mm/hugetlbpage-radix.c | 4 ++--
 arch/powerpc/mm/tlb-radix.c | 16
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 823528d34688..07b2e0031dad 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -20,18 +20,18 @@ extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end
 
 extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
 extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
-extern void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
-				    unsigned long ap, int nid);
+extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
+					      unsigned long ap);
 extern void radix__tlb_flush(struct mmu_gather *tlb);
 #ifdef CONFIG_SMP
 extern void radix__flush_tlb_mm(struct mm_struct *mm);
 extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
-extern void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
-				   unsigned long ap, int nid);
+extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
+					unsigned long ap);
 #else
 #define radix__flush_tlb_mm(mm)	radix__local_flush_tlb_mm(mm)
 #define radix__flush_tlb_page(vma,addr)	radix__local_flush_tlb_page(vma,addr)
-#define radix___flush_tlb_page(mm,addr,p,i)	radix___local_flush_tlb_page(mm,addr,p,i)
+#define radix__flush_tlb_page_psize(mm,addr,p)	radix__local_flush_tlb_page_psize(mm,addr,p)
 #endif
 
 #endif
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c
index 1e11559e1aac..0dfa1816f0c6 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -20,7 +20,7 @@ void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 		WARN(1, "Wrong huge page shift\n");
 		return ;
 	}
-	radix___flush_tlb_page(vma->vm_mm, vmaddr, ap, 0);
+	radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
 }
 
 void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
@@ -37,7 +37,7 @@ void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long v
 		WARN(1, "Wrong huge page shift\n");
 		return ;
 	}
-	radix___local_flush_tlb_page(vma->vm_mm, vmaddr, ap, 0);
+	radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
 }
 
 /*
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index fe2fc58d2e00..b1dc4675925d 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -127,8 +127,8 @@ void radix__local_flush_tlb_mm(struct mm_struct *mm)
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_mm);
 
-void radix___local_flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
-			    unsigned long ap, int nid)
+void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
+				       unsigned long ap)
 {
 	unsigned int pid;
 
@@ -146,8 +146,8 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd
 	if (vma && is_vm_hugetlb_page(vma))
 		return __local_flush_hugetlb_page(vma, vmaddr);
 #endif
-	radix___local_flush_tlb_page(vma ? vma->vm_mm : NULL, vmaddr,
-				     mmu_get_ap(mmu_virtual_psize), 0);
+	radix__local_flush_tlb_page_psize(vma ? vma->vm_mm : NULL, vmaddr,
+					  mmu_get_ap(mmu_virtual_psize));
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_page);
 
@@ -176,8 +176,8 @@ no_context:
 }
 EXPORT_SYMBOL(radix__flush_tlb_mm);
 
-void radix___flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
-		       unsigned long ap, int nid)
+void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
+				 unsigned long ap)
 {
 	unsigned int pid;
 
@@ -205,8 +205,8 @@ void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 	if (vma && is_vm_hugetlb_page(vma))
 		return flush_hugetlb_page(vma, vmaddr);
 #endif
-	radix___flush_tlb_page(vma ? vma->vm_mm : NULL, vmaddr,
-			       mmu_get_ap(mmu_virtual_psize), 0);
+	radix__flush_tlb_page_psize(vma ? vma->vm_mm : NULL, vmaddr,
+				    mmu_get_ap(mmu_virtual_psize));
 }
[PATCH 3/7] powerpc/mm/radix: Add tlb flush of THP ptes
Instead of flushing the entire mm, implement a flush_pmd_tlb_range.

Signed-off-by: Aneesh Kumar K.V
---
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 4 ++
 arch/powerpc/include/asm/book3s/64/tlbflush.h | 9
 arch/powerpc/mm/pgtable-book3s64.c | 4 +-
 arch/powerpc/mm/tlb-radix.c | 54 ++
 4 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 13ef38828dfe..823528d34688 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -10,6 +10,10 @@ static inline int mmu_get_ap(int psize)
 	return mmu_psize_defs[psize].ap;
 }
 
+extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start,
+					 unsigned long end, int psize);
+extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
+				       unsigned long start, unsigned long end);
 extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
 				   unsigned long end);
 extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end);
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index d98424ae356c..f0d6c9d38916 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -7,6 +7,15 @@
 #include
 #include
 
+#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
+static inline void flush_pmd_tlb_range(struct vm_area_struct *vma,
+				       unsigned long start, unsigned long end)
+{
+	if (radix_enabled())
+		return radix__flush_pmd_tlb_range(vma, start, end);
+	return hash__flush_tlb_range(vma, start, end);
+}
+
 static inline void flush_tlb_range(struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 670318766545..7bb8acffe876 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -33,7 +33,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 	changed = !pmd_same(*(pmdp), entry);
 	if (changed) {
 		__ptep_set_access_flags(pmdp_ptep(pmdp), pmd_pte(entry));
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
 	}
 	return changed;
 }
@@ -66,7 +66,7 @@ void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 		     pmd_t *pmdp)
 {
 	pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, 0);
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
 	/*
 	 * This ensures that generic code that rely on IRQ disabling
 	 * to prevent a parallel THP split work as expected.
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 5807f5d72e1b..fe2fc58d2e00 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -243,3 +243,57 @@ void radix__tlb_flush(struct mmu_gather *tlb)
 	struct mm_struct *mm = tlb->mm;
 	radix__flush_tlb_mm(mm);
 }
+
+#define TLB_FLUSH_ALL -1UL
+/*
+ * Number of pages above which we will do a bcast tlbie. Just a
+ * number at this point copied from x86
+ */
+static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
+
+void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start,
+				  unsigned long end, int psize)
+{
+	unsigned int pid;
+	unsigned long addr;
+	int local = mm_is_core_local(mm);
+	unsigned long ap = mmu_get_ap(psize);
+	int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
+	unsigned long page_size = 1UL << mmu_psize_defs[psize].shift;
+
+
+	preempt_disable();
+	pid = mm ? mm->context.id : 0;
+	if (unlikely(pid == MMU_NO_CONTEXT))
+		goto err_out;
+
+	if (end == TLB_FLUSH_ALL ||
+	    (end - start) > tlb_single_page_flush_ceiling * page_size) {
+		if (local)
+			_tlbiel_pid(pid);
+		else
+			_tlbie_pid(pid);
+		goto err_out;
+	}
+	for (addr = start; addr < end; addr += page_size) {
+
+		if (local)
+			_tlbiel_va(addr, pid, ap);
+		else {
+			if (lock_tlbie)
+				raw_spin_lock(_tlbie_lock);
+			_tlbie_va(addr, pid, ap);
+			if (lock_tlbie)
+
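The core decision in radix__flush_tlb_range_psize() is whether to invalidate every translation for the PID or to walk the range one page at a time, based on a ceiling of 33 pages. The user-space sketch below models only that decision and counts the per-page invalidations it would issue. The struct flush_plan and the function plan_range_flush() are hypothetical names; the ceiling value and the TLB_FLUSH_ALL sentinel follow the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define TLB_FLUSH_ALL (~0UL)

/* Ceiling in pages, the same value the patch copies from x86. */
static const unsigned long flush_ceiling = 33;

/* Outcome of the decision: flush the whole PID, or N single pages. */
struct flush_plan {
	bool whole_pid;
	unsigned long nr_page_flushes;
};

/* Decide how a [start, end) range of 2^page_shift-byte pages is flushed. */
static struct flush_plan plan_range_flush(unsigned long start,
					  unsigned long end,
					  unsigned int page_shift)
{
	struct flush_plan plan = { false, 0 };
	unsigned long page_size, addr;

	assert(page_shift < 8 * sizeof(unsigned long));
	page_size = 1UL << page_shift;

	/* Large or unbounded ranges: one PID-wide invalidation is cheaper. */
	if (end == TLB_FLUSH_ALL ||
	    (end - start) > flush_ceiling * page_size) {
		plan.whole_pid = true;
		return plan;
	}

	/* Small ranges: one invalidation per page of the given size. */
	for (addr = start; addr < end; addr += page_size)
		plan.nr_page_flushes++;
	return plan;
}
```

Because the ceiling is scaled by the page size, a 2M-page range of up to 33 pages is still flushed page by page, while the same byte range of base pages would exceed the ceiling and trigger a PID-wide flush.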
[PATCH 2/7] powerpc/mm: Drop multiple definition of mm_is_core_local
Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/tlb.h | 13 +
 arch/powerpc/mm/tlb-radix.c    |  6 --
 arch/powerpc/mm/tlb_nohash.c   |  6 --
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index 20733fa518ae..f6f68f73e858 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -46,5 +46,18 @@ static inline void __tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep,
 #endif
 }
 
+#ifdef CONFIG_SMP
+static inline int mm_is_core_local(struct mm_struct *mm)
+{
+	return cpumask_subset(mm_cpumask(mm),
+			      topology_sibling_cpumask(smp_processor_id()));
+}
+#else
+static inline int mm_is_core_local(struct mm_struct *mm)
+{
+	return 1;
+}
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_TLB_H */
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 0fdaf93a3e09..5807f5d72e1b 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -152,12 +152,6 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd
 EXPORT_SYMBOL(radix__local_flush_tlb_page);
 
 #ifdef CONFIG_SMP
-static int mm_is_core_local(struct mm_struct *mm)
-{
-	return cpumask_subset(mm_cpumask(mm),
-			      topology_sibling_cpumask(smp_processor_id()));
-}
-
 void radix__flush_tlb_mm(struct mm_struct *mm)
 {
 	unsigned int pid;
diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index f4668488512c..050badc0ebd3 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -215,12 +215,6 @@ EXPORT_SYMBOL(local_flush_tlb_page);
 
 static DEFINE_RAW_SPINLOCK(tlbivax_lock);
 
-static int mm_is_core_local(struct mm_struct *mm)
-{
-	return cpumask_subset(mm_cpumask(mm),
-			      topology_sibling_cpumask(smp_processor_id()));
-}
-
 struct tlb_flush_param {
 	unsigned long addr;
 	unsigned int pid;
-- 
2.7.4
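For readers unfamiliar with the helper being consolidated, here is a toy model of the subset test behind mm_is_core_local(). It assumes one bit per CPU in a 64-bit word, which is a simplification of struct cpumask; the sketch_ names are hypothetical and not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy cpumask: bit n set means CPU n is in the mask (at most 64 CPUs). */
static bool sketch_mask_subset(uint64_t a, uint64_t b)
{
	/* a is a subset of b when a has no bit that b lacks. */
	return (a & ~b) == 0;
}

/*
 * Model of the SMP variant: the mm is core local when every CPU it
 * has run on is a sibling thread of the current CPU's core.
 */
static bool sketch_mm_is_core_local(uint64_t mm_cpus, uint64_t sibling_cpus)
{
	return sketch_mask_subset(mm_cpus, sibling_cpus);
}
```

If the current core owns CPUs 4-7 (mask 0xF0), an mm that has only run on CPUs 4 and 5 (mask 0x30) is core local, while one that has also run on CPU 0 (mask 0x31) is not.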
[PATCH 0/7] TLB flush update for radix
Hi,

This patch series introduces a range-based TLB flush and uses it for the
radix implementation. We still need to handle the mmu_gather-related TLB
flush; that will be done in a later patch.

Aneesh Kumar K.V (7):
  powerpc/mm: Use hugetlb flush functions
  powerpc/mm: Drop multiple definition of mm_is_core_local
  powerpc/mm/radix: Add tlb flush of THP ptes
  powerpc/mm/radix: Rename function and drop unused arg
  powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate
  powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
  powerpc/mm: remove flush_tlb_page_nohash

 arch/powerpc/include/asm/book3s/64/hugetlb-radix.h | 15 +
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h |  5 --
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h | 16 +++--
 arch/powerpc/include/asm/book3s/64/tlbflush.h      | 27 +---
 arch/powerpc/include/asm/hugetlb.h                 |  2 +-
 arch/powerpc/include/asm/tlb.h                     | 13
 arch/powerpc/include/asm/tlbflush.h                |  1 -
 arch/powerpc/mm/hugetlbpage-radix.c                | 39 +--
 arch/powerpc/mm/pgtable-book3s64.c                 |  4 +-
 arch/powerpc/mm/pgtable.c                          |  2 +-
 arch/powerpc/mm/tlb-radix.c                        | 78 ++
 arch/powerpc/mm/tlb_hash32.c                       | 11 ---
 arch/powerpc/mm/tlb_nohash.c                       |  6 --
 mm/hugetlb.c                                       | 10 ++-
 14 files changed, 152 insertions(+), 77 deletions(-)

-- 
2.7.4
[PATCH 5/6] powerpc/mm: Make MMU_FTR_RADIX a MMU family feature
MMU feature bits are defined such that the lower half represents MMU
family features. Remove the strict half-and-half split and move Radix to
an MMU family feature. Radix introduces a new MMU model and, strictly
speaking, is a new MMU family. This also frees up bits that can be used
for individual features later.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/mmu.h |  2 +-
 arch/powerpc/include/asm/mmu.h           | 16 +++-
 arch/powerpc/kernel/entry_64.S           |  2 +-
 arch/powerpc/kernel/exceptions-64s.S     |  8
 arch/powerpc/kernel/prom.c               |  2 +-
 5 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 5854263d4d6e..c6b1ff795632 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -23,7 +23,7 @@ struct mmu_psize_def {
 };
 extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 
-#define radix_enabled() mmu_has_feature(MMU_FTR_RADIX)
+#define radix_enabled() mmu_has_feature(MMU_FTR_TYPE_RADIX)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index e53ebebff474..4ad66a547d4c 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -12,7 +12,7 @@
  */
 
 /*
- * First half is MMU families
+ * MMU families
  */
 #define MMU_FTR_HPTE_TABLE		ASM_CONST(0x0001)
 #define MMU_FTR_TYPE_8xx		ASM_CONST(0x0002)
@@ -20,9 +20,12 @@
 #define MMU_FTR_TYPE_44x		ASM_CONST(0x0008)
 #define MMU_FTR_TYPE_FSL_E		ASM_CONST(0x0010)
 #define MMU_FTR_TYPE_47x		ASM_CONST(0x0020)
-
 /*
- * This is individual features
+ * Radix page table available
+ */
+#define MMU_FTR_TYPE_RADIX		ASM_CONST(0x0040)
+/*
+ * individual features
  */
 
 /* Enable use of high BAT registers */
@@ -88,11 +91,6 @@
  */
 #define MMU_FTR_1T_SEGMENT		ASM_CONST(0x4000)
 
-/*
- * Radix page table available
- */
-#define MMU_FTR_RADIX			ASM_CONST(0x8000)
-
 /* MMU feature bit sets for various CPUs */
 #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2	\
 	MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2
@@ -126,7 +124,7 @@ enum {
 		MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE |
 		MMU_FTR_1T_SEGMENT |
 #ifdef CONFIG_PPC_RADIX_MMU
-		MMU_FTR_RADIX |
+		MMU_FTR_TYPE_RADIX |
 #endif
 		0,
 };
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 73e461a3dfbb..dd26d4ed7513 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -532,7 +532,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
 #ifdef CONFIG_PPC_STD_MMU_64
 BEGIN_MMU_FTR_SECTION
 	b	2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
 BEGIN_FTR_SECTION
 	clrrdi	r6,r8,28	/* get its ESID */
 	clrrdi	r9,r1,28	/* get current sp ESID */
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4c9440629128..f2bd375b9a4e 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -945,7 +945,7 @@ BEGIN_MMU_FTR_SECTION
 	b	do_hash_page		/* Try to handle as hpte fault */
 MMU_FTR_SECTION_ELSE
 	b	handle_page_fault
-ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
 	.align	7
 	.globl	h_data_storage_common
@@ -976,7 +976,7 @@ BEGIN_MMU_FTR_SECTION
 	b	do_hash_page		/* Try to handle as hpte fault */
 MMU_FTR_SECTION_ELSE
 	b	handle_page_fault
-ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_RADIX)
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
 	STD_EXCEPTION_COMMON(0xe20, h_instr_storage, unknown_exception)
 
@@ -1390,7 +1390,7 @@ slb_miss_realmode:
 #ifdef CONFIG_PPC_STD_MMU_64
 BEGIN_MMU_FTR_SECTION
 	bl	slb_allocate_realmode
-END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
 #endif
 
 	/* All done -- return from exception. */
@@ -1401,7 +1401,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
 	mtlr	r10
 BEGIN_MMU_FTR_SECTION
 	b	2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_RADIX)
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
 	andi.	r10,r12,MSR_RI	/* check for unrecoverable exception */
 	beq-	2f
 
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index d924cd60fc8e..8d5579b5b6c8 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -166,7 +166,7 @@ static struct ibm_pa_feature {
 	 * which is 0 if the kernel doesn't support TM.
 	 */
 	{CPU_FTR_TM_COMP, 0, 0,		22, 0, 0},
-	{0, MMU_FTR_RADIX, 0,
[PATCH 4/6] powerpc/mm/hash: Compute the segment size correctly for ISA 3.0
PowerISA 3.0 encodes the segment size in the second half of the hash page
table entry. Update hpte_decode accordingly.

Fixes: 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash")
Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/mm/hash_native_64.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index d873f6507f72..c9715fc99d68 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -550,7 +550,10 @@ static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
 		}
 	}
 	/* This works for all page sizes, and for 256M and 1T segments */
-	*ssize = hpte_v >> HPTE_V_SSIZE_SHIFT;
+	if (cpu_has_feature(CPU_FTR_ARCH_300))
+		*ssize = hpte_r >> HPTE_R_3_0_SSIZE_SHIFT;
+	else
+		*ssize = hpte_v >> HPTE_V_SSIZE_SHIFT;
 	shift = mmu_psize_defs[size].shift;
 
 	avpn = (HPTE_V_AVPN_VAL(hpte_v) & ~mmu_psize_defs[size].avpnm);
-- 
2.7.4
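A small user-space sketch of the selection hpte_decode() makes after this change: read the segment size from the second doubleword on ISA 3.0 and from the first doubleword otherwise. The two shift values below are assumptions that do not appear in the patch and should be checked against mmu-hash.h in the tree being used. The 0x3 mask is an addition of the sketch; the patch shifts without masking. All sketch_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, not shown in the patch; verify against mmu-hash.h. */
#define SKETCH_HPTE_V_SSIZE_SHIFT	62
#define SKETCH_HPTE_R_3_0_SSIZE_SHIFT	58

/*
 * Pick the doubleword that carries the segment size field.
 * isa_300 stands in for cpu_has_feature(CPU_FTR_ARCH_300).
 * The 0x3 mask is an addition of this sketch, not part of the patch.
 */
static unsigned int sketch_decode_ssize(uint64_t hpte_v, uint64_t hpte_r,
					bool isa_300)
{
	if (isa_300)
		return (unsigned int)((hpte_r >> SKETCH_HPTE_R_3_0_SSIZE_SHIFT) & 0x3);
	return (unsigned int)((hpte_v >> SKETCH_HPTE_V_SSIZE_SHIFT) & 0x3);
}
```

Under these assumptions, an entry whose segment size lives only in the first doubleword decodes to 0 on an ISA 3.0 CPU, which is the kind of mismatch the patch addresses.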
[PATCH 2/6] powerpc/mm/radix: Update PID switch sequence
Update the PID switch as per the ISA doc. slbia is needed in radix to
invalidate any implementation-specific lookaside information.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/mm/mmu_context_book3s64.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 227b2a6c4544..565f1b1da33b 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -181,7 +181,10 @@ void destroy_context(struct mm_struct *mm)
 #ifdef CONFIG_PPC_RADIX_MMU
 void radix__switch_mmu_context(struct mm_struct *prev, struct mm_struct *next)
 {
-	mtspr(SPRN_PID, next->context.id);
 	asm volatile("isync": : :"memory");
+	mtspr(SPRN_PID, next->context.id);
+	asm volatile("isync \n"
+		     "slbia 0x7 \n"
+		     : : :"memory");
 }
 #endif
 
-- 
2.7.4
[PATCH 6/6] powerpc/mm/hash: Add helper for finding SLBE LLP encoding
Replace open-coded copies of the same computation in multiple places with
the helper. No functional change with this patch.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 9 +
 arch/powerpc/include/asm/kvm_book3s_64.h      | 3 +--
 arch/powerpc/mm/hash_native_64.c              | 6 ++
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 290157e8d5b2..a5fa6be7d5ae 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -150,6 +150,15 @@ static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize)
 	BUG();
 }
 
+static inline unsigned long get_sllp_encoding(int psize)
+{
+	unsigned long sllp;
+
+	sllp = ((mmu_psize_defs[psize].sllp & SLB_VSID_L) >> 6) |
+		((mmu_psize_defs[psize].sllp & SLB_VSID_LP) >> 4);
+	return sllp;
+}
+
 #endif /* __ASSEMBLY__ */
 
 /*
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index 1f4497fb5b83..88d17b4ea9c8 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -181,8 +181,7 @@ static inline unsigned long compute_tlbie_rb(unsigned long v, unsigned long r,
 
 	switch (b_psize) {
 	case MMU_PAGE_4K:
-		sllp = ((mmu_psize_defs[a_psize].sllp & SLB_VSID_L) >> 6) |
-			((mmu_psize_defs[a_psize].sllp & SLB_VSID_LP) >> 4);
+		sllp = get_sllp_encoding(a_psize);
 		rb |= sllp << 5;	/* AP field */
 		rb |= (va_low & 0x7ff) << 12;	/* remaining 11 bits of AVA */
 		break;
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index c9715fc99d68..db108e478c80 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -71,8 +71,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
 		/* clear out bits after (52) [0....52.....63] */
 		va &= ~((1ul << (64 - 52)) - 1);
 		va |= ssize << 8;
-		sllp = ((mmu_psize_defs[apsize].sllp & SLB_VSID_L) >> 6) |
-			((mmu_psize_defs[apsize].sllp & SLB_VSID_LP) >> 4);
+		sllp = get_sllp_encoding(apsize);
 		va |= sllp << 5;
 		asm volatile(ASM_FTR_IFCLR("tlbie %0,0", PPC_TLBIE(%1,%0), %2)
 			     : : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
@@ -120,8 +119,7 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
 		/* clear out bits after(52) [0....52.....63] */
 		va &= ~((1ul << (64 - 52)) - 1);
 		va |= ssize << 8;
-		sllp = ((mmu_psize_defs[apsize].sllp & SLB_VSID_L) >> 6) |
-			((mmu_psize_defs[apsize].sllp & SLB_VSID_LP) >> 4);
+		sllp = get_sllp_encoding(apsize);
 		va |= sllp << 5;
 		asm volatile(".long 0x7c000224 | (%0 << 11) | (0 << 21)"
 			     : : "r"(va) : "memory");
-- 
2.7.4
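The bit manipulation that get_sllp_encoding() factors out can be tried in isolation. The sketch below takes the raw sllp value directly instead of indexing mmu_psize_defs[], and the two mask values are assumptions not shown in the patch (they should be checked against mmu-hash.h). The sketch_ names are hypothetical.

```c
/* Assumed mask values, not shown in the patch; verify against mmu-hash.h. */
#define SKETCH_SLB_VSID_L	0x100UL
#define SKETCH_SLB_VSID_LP	0x030UL

/*
 * Pack the L bit and the two LP bits of an sllp value into a compact
 * 3-bit field, using the same shifts as the helper in the patch:
 * L moves down by 6, LP moves down by 4.
 */
static unsigned long sketch_sllp_encoding(unsigned long sllp)
{
	return ((sllp & SKETCH_SLB_VSID_L) >> 6) |
	       ((sllp & SKETCH_SLB_VSID_LP) >> 4);
}
```

With these assumed masks, an sllp of 0x110 (L set, LP = 01) packs to 5, and an sllp of 0x100 (L set, LP = 00) packs to 4. The callers in the patch then shift the result left by 5 into the tlbie operand.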
[PATCH 3/6] powerpc/mm/hash: Update SDR1 size encoding as documented in ISA 3.0
ISA 3.0 documents the hash table size in bytes as 2^(HTABSIZE + 18).
No functional change with this patch.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/mm/hash_utils_64.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 59268969a0bc..3849de15b65f 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -677,10 +677,9 @@ int remove_section_mapping(unsigned long start, unsigned long end)
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
 static void __init hash_init_partition_table(phys_addr_t hash_table,
-					     unsigned long pteg_count)
+					     unsigned long htab_size)
 {
 	unsigned long ps_field;
-	unsigned long htab_size;
 	unsigned long patb_size = 1UL << PATB_SIZE_SHIFT;
 
 	/*
@@ -688,7 +687,7 @@ static void __init hash_init_partition_table(phys_addr_t hash_table,
 	 * We can ignore that for lpid 0
 	 */
 	ps_field = 0;
-	htab_size =  __ilog2(pteg_count) - 11;
+	htab_size =  __ilog2(htab_size) - 18;
 
 	BUILD_BUG_ON_MSG((PATB_SIZE_SHIFT > 24), "Partition table size too large.");
 	partition_tb = __va(memblock_alloc_base(patb_size, patb_size,
@@ -774,7 +773,7 @@ static void __init htab_initialize(void)
 		htab_address = __va(table);
 
 		/* htab absolute addr + encoded htabsize */
-		_SDR1 = table + __ilog2(pteg_count) - 11;
+		_SDR1 = table + __ilog2(htab_size_bytes) - 18;
 
 		/* Initialize the HPT with no entries */
 		memset((void *)table, 0, htab_size_bytes);
@@ -783,7 +782,7 @@ static void __init htab_initialize(void)
 			/* Set SDR1 */
 			mtspr(SPRN_SDR1, _SDR1);
 		else
-			hash_init_partition_table(table, pteg_count);
+			hash_init_partition_table(table, htab_size_bytes);
 	}
 
 	prot = pgprot_val(PAGE_KERNEL);
-- 
2.7.4
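Why the old and new expressions agree can be checked with a few lines of arithmetic. The sketch assumes a PTEG is 128 bytes (8 entries of 16 bytes), so htab_size_bytes = pteg_count * 2^7 and log2(htab_size_bytes) - 18 = log2(pteg_count) + 7 - 18 = log2(pteg_count) - 11 for power-of-two table sizes. sketch_ilog2() is a hypothetical stand-in for the kernel's __ilog2().

```c
/* floor(log2(x)) for x > 0; hypothetical stand-in for __ilog2(). */
static int sketch_ilog2(unsigned long x)
{
	int n = -1;

	while (x) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Assumption of this sketch: one PTEG is 8 HPTEs of 16 bytes each. */
#define SKETCH_PTEG_BYTES 128UL

/* Old encoding from the patch: derived from the number of PTEGs. */
static int sketch_htabsize_from_ptegs(unsigned long pteg_count)
{
	return sketch_ilog2(pteg_count) - 11;
}

/* New encoding from the patch: derived from the table size in bytes. */
static int sketch_htabsize_from_bytes(unsigned long htab_size_bytes)
{
	return sketch_ilog2(htab_size_bytes) - 18;
}
```

For example, 2^11 PTEGs give a 2^18 byte (256KB) table, and both forms yield an HTABSIZE of 0.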
[PATCH 1/6] powerpc/mm/radix: Update LPCR only if it is powernv
LPCR cannot be updated when running in guest mode.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/mm/pgtable-radix.c | 23 ++-
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 18b2c11604fa..c939e6e57a9e 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -296,11 +296,6 @@ found:
 void __init radix__early_init_mmu(void)
 {
 	unsigned long lpcr;
-	/*
-	 * setup LPCR UPRT based on mmu_features
-	 */
-	lpcr = mfspr(SPRN_LPCR);
-	mtspr(SPRN_LPCR, lpcr | LPCR_UPRT);
 
 #ifdef CONFIG_PPC_64K_PAGES
 	/* PAGE_SIZE mappings */
@@ -343,8 +338,11 @@ void __init radix__early_init_mmu(void)
 	__pte_frag_size_shift = H_PTE_FRAG_SIZE_SHIFT;
 
 	radix_init_page_sizes();
-	if (!firmware_has_feature(FW_FEATURE_LPAR))
+	if (!firmware_has_feature(FW_FEATURE_LPAR)) {
+		lpcr = mfspr(SPRN_LPCR);
+		mtspr(SPRN_LPCR, lpcr | LPCR_UPRT);
 		radix_init_partition_table();
+	}
 
 	radix_init_pgtable();
 }
@@ -353,16 +351,15 @@ void radix__early_init_mmu_secondary(void)
 {
 	unsigned long lpcr;
 	/*
-	 * setup LPCR UPRT based on mmu_features
+	 * update partition table control register and UPRT
 	 */
-	lpcr = mfspr(SPRN_LPCR);
-	mtspr(SPRN_LPCR, lpcr | LPCR_UPRT);
-	/*
-	 * update partition table control register, 64 K size.
-	 */
-	if (!firmware_has_feature(FW_FEATURE_LPAR))
+	if (!firmware_has_feature(FW_FEATURE_LPAR)) {
+		lpcr = mfspr(SPRN_LPCR);
+		mtspr(SPRN_LPCR, lpcr | LPCR_UPRT);
+
 		mtspr(SPRN_PTCR,
 		      __pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
+	}
 }
 
 void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
-- 
2.7.4
Re: [RFC PATCH 2/3] arch/powerpc : optprobes for powerpc core
Hi Masami,

Thank you for reviewing the patch.

On Wednesday 18 May 2016 08:43 PM, Masami Hiramatsu wrote:
On Wed, 18 May 2016 02:09:37 +0530 Anju T wrote:

Instruction slot for detour buffer is allocated from
the reserved area. For the time being 64KB is reserved
in memory for this purpose. ppc_get_optinsn_slot() and
ppc_free_optinsn_slot() are geared towards the allocation and freeing
of memory from this area.

Thank you for porting optprobe on ppc!!

I have some comments on this patch.

Signed-off-by: Anju T
---
 arch/powerpc/kernel/optprobes.c | 463
 1 file changed, 463 insertions(+)
 create mode 100644 arch/powerpc/kernel/optprobes.c

diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
new file mode 100644
index 000..50a60c1
--- /dev/null
+++ b/arch/powerpc/kernel/optprobes.c
@@ -0,0 +1,463 @@
+/*
+ * Code for Kernel probes Jump optimization.
+ *
+ * Copyright 2016, Anju T, IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/* Reserve an area to allocate slots for detour buffer */
+extern void optprobe_trampoline_holder(void)
+{
+	asm volatile(".global optinsn_slot\n"
+			"optinsn_slot:\n"
+			".space 65536");
+}

Would we better move this into optprobes_head.S?

Yes. Will do.
+
+#define SLOT_SIZE 65536
+#define TMPL_CALL_HDLR_IDX	\
+	(optprobe_template_call_handler - optprobe_template_entry)
+#define TMPL_EMULATE_IDX	\
+	(optprobe_template_call_emulate - optprobe_template_entry)
+#define TMPL_RET_BRANCH_IDX	\
+	(optprobe_template_ret_branch - optprobe_template_entry)
+#define TMPL_RET_IDX	\
+	(optprobe_template_ret - optprobe_template_entry)
+#define TMPL_OP1_IDX	\
+	(optprobe_template_op_address1 - optprobe_template_entry)
+#define TMPL_OP2_IDX	\
+	(optprobe_template_op_address2 - optprobe_template_entry)
+#define TMPL_INSN_IDX	\
+	(optprobe_template_insn - optprobe_template_entry)
+#define TMPL_END_IDX	\
+	(optprobe_template_end - optprobe_template_entry)
+
+struct kprobe_ppc_insn_page {
+	struct list_head list;
+	kprobe_opcode_t *insns;	/* Page of instruction slots */
+	struct kprobe_insn_cache *cache;
+	int nused;
+	int ngarbage;
+	char slot_used[];
+};
+
+#define PPC_KPROBE_INSN_PAGE_SIZE(slots)	\
+	(offsetof(struct kprobe_ppc_insn_page, slot_used) +	\
+	(sizeof(char) * (slots)))
+
+enum ppc_kprobe_slot_state {
+	SLOT_CLEAN = 0,
+	SLOT_DIRTY = 1,
+	SLOT_USED = 2,
+};
+
+static struct kprobe_insn_cache kprobe_ppc_optinsn_slots = {
+	.mutex = __MUTEX_INITIALIZER(kprobe_ppc_optinsn_slots.mutex),
+	.pages = LIST_HEAD_INIT(kprobe_ppc_optinsn_slots.pages),
+	/* .insn_size is initialized later */
+	.nr_garbage = 0,
+};
+
+static int ppc_slots_per_page(struct kprobe_insn_cache *c)
+{
+	/*
+	 * Here the #slots per page differs from x86 as we have
+	 * only 64KB reserved.
+	 */
+	return SLOT_SIZE / (c->insn_size * sizeof(kprobe_opcode_t));
+}
+
+/* Return 1 if all garbages are collected, otherwise 0. */
+static int collect_one_slot(struct kprobe_ppc_insn_page *kip, int idx)
+{
+	kip->slot_used[idx] = SLOT_CLEAN;
+	kip->nused--;
+	return 0;
+}
+
+static int collect_garbage_slots(struct kprobe_insn_cache *c)
+{
+	struct kprobe_ppc_insn_page *kip, *next;
+
+	/* Ensure no-one is interrupted on the garbages */
+	synchronize_sched();
+
+	list_for_each_entry_safe(kip, next, &c->pages, list) {
+		int i;
+
+		if (kip->ngarbage == 0)
+			continue;
+		kip->ngarbage = 0;	/* we will collect all garbages */
+		for (i = 0; i < ppc_slots_per_page(c); i++) {
+			if (kip->slot_used[i] == SLOT_DIRTY &&
+			    collect_one_slot(kip, i))
+				break;
+		}
+	}
+	c->nr_garbage = 0;
+	return 0;
+}
+
+kprobe_opcode_t  *__ppc_get_optinsn_slot(struct kprobe_insn_cache *c)
+{
+	struct kprobe_ppc_insn_page *kip;
+	kprobe_opcode_t *slot = NULL;
+
+	mutex_lock(&c->mutex);
+	list_for_each_entry(kip, &c->pages, list) {
+		if (kip->nused < ppc_slots_per_page(c)) {
+			int i;
+
+			for (i = 0; i < ppc_slots_per_page(c); i++) {
+				if (kip->slot_used[i] == SLOT_CLEAN) {
+					kip->slot_used[i] =
Re: [PATCH] kvm-pr: manage illegal instructions
On 18.05.2016 12:53, Thomas Huth wrote: > On 18.05.2016 12:18, Thomas Huth wrote: >> On 17.05.2016 19:49, Laurent Vivier wrote: >>> >>> >>> On 17/05/2016 10:37, Alexander Graf wrote: On 05/17/2016 10:35 AM, Laurent Vivier wrote: > > On 12/05/2016 16:23, Laurent Vivier wrote: >> >> On 12/05/2016 11:27, Alexander Graf wrote: >>> On 05/12/2016 11:10 AM, Laurent Vivier wrote: On 11/05/2016 13:49, Alexander Graf wrote: > On 05/11/2016 01:14 PM, Laurent Vivier wrote: >> On 11/05/2016 12:35, Alexander Graf wrote: >>> On 03/15/2016 09:18 PM, Laurent Vivier wrote: While writing some instruction tests for kvm-unit-tests for powerpc, I've found that illegal instructions are not managed correctly with kvm-pr, while it is fine with kvm-hv. When an illegal instruction (like ".long 0") is processed by kvm-pr, the kernel logs are filled with: Couldn't emulate instruction 0x (op 0 xop 0) kvmppc_handle_exit_pr: emulation at 700 failed () While the exception handler receives an interrupt for each instruction executed after the illegal instruction. Signed-off-by: Laurent Vivier--- arch/powerpc/kvm/book3s_emulate.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c index 2afdb9c..4ee969d 100644 --- a/arch/powerpc/kvm/book3s_emulate.c +++ b/arch/powerpc/kvm/book3s_emulate.c @@ -99,7 +99,6 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu, switch (get_op(inst)) { case 0: -emulated = EMULATE_FAIL; if ((kvmppc_get_msr(vcpu) & MSR_LE) && (inst == swab32(inst_sc))) { /* @@ -112,6 +111,9 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu, kvmppc_set_gpr(vcpu, 3, EV_UNIMPLEMENTED); kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) + 4); emulated = EMULATE_DONE; +} else { +kvmppc_core_queue_program(vcpu, SRR1_PROGILL); >>> But isn't that exactly what the semantic of EMULATE_FAIL is? >>> Fixing it >>> up in book3s_emulate.c is definitely the wrong spot. 
>>> >>> So what is the problem you're trying to solve? Is the SRR0 at the >>> wrong >>> spot or are the log messages the problem? >> No, the problem is the host kernel logs are filled by the message >> and >> the execution hangs. And the host becomes unresponsiveness, even >> after >> the end of the tests. >> >> Please, try to run kvm-unit-tests (the emulator test) on a KVM-PR >> host, >> and check the kernel logs (dmesg), then try to ssh to the host... > Ok, so the log messages are the problem. Please fix the message > output > then - or remove it altogether. Or if you like, create a module > parameter that allows you to emit them. > > I personally think the best solution would be to just convert the > message into a trace point. > > While at it, please see whether the guest can trigger similar host > log > output excess in other code paths. The problem is not really with the log messages: they are consequence of the bug I try to fix. What happens is once kvm_pr decodes an invalid instruction all the valid following instructions trigger a Program exception to the guest (but are executed correctly). It has no real consequence on big machine like POWER8, except that the guest become very slow and the log files of the host are filled with messages (and qemu uses 100% of the CPU). On a smaller machine like a PowerMac G5, the machine becomes simply unusable. >>> It's probably more related to your verbosity level of kernel messages. >>> If you pass loglevel=0 (or quiet) to you kernel cmdline you won't get >>> the messages printed to serial which is what's slowing you down. >>> >>> The other problem sounds pretty severe, but the only thing your patch >>> does any different from the current code flow would be the patch below. >>> Or did I miss anything? >>>
Re: [kernel-hardening] [PATCH v8 2/4] GCC plugin infrastructure
On 19/05/16 16:22, Michael Ellerman wrote:
>> Did you test the plugins with all gcc versions (4.5-6)?
>
> What's the concern about gcc versions? Just not breaking the build on old
> compilers? I'm pretty sure powerpc big endian still builds with gcc 4.4.

gcc's plugin support only landed in 4.5, so we don't care about <=4.4.

> However if Andrew's only tested on little endian, then that select should be
> guarded with an "if CPU_LITTLE_ENDIAN". And to build LE you need gcc >= 4.9.

I'm going to give BE a test too just to be sure.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited
Re: [kernel-hardening] [PATCH v8 2/4] GCC plugin infrastructure
On Wed, 2016-05-18 at 12:33 +0200, Emese Revfy wrote:
> > I've done some basic sanity testing on powerpc with the cyclomatic
> > complexity plugin (with LE native + cross-compilers) and it seems to
> > work with the patch below.
> >
> > Signed-off-by: Andrew Donnellan
> >
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index a18a0dc..0cfed5b 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -97,6 +97,7 @@ config PPC
> >  	select HAVE_DYNAMIC_FTRACE_WITH_REGS if MPROFILE_KERNEL
> >  	select HAVE_FUNCTION_TRACER
> >  	select HAVE_FUNCTION_GRAPH_TRACER
> > +	select HAVE_GCC_PLUGINS
> >  	select SYSCTL_EXCEPTION_TRACE
> >  	select ARCH_WANT_OPTIONAL_GPIOLIB
> >  	select VIRT_TO_BUS if !PPC64
>
> Hi,
>
> Did you test the plugins with all gcc versions (4.5-6)?

What's the concern about gcc versions? Just not breaking the build on old
compilers? I'm pretty sure powerpc big endian still builds with gcc 4.4.

However if Andrew's only tested on little endian, then that select should be
guarded with an "if CPU_LITTLE_ENDIAN". And to build LE you need gcc >= 4.9.

cheers