Re: Several suspected memory leaks

2018-07-10 Thread Benjamin Herrenschmidt
On Tue, 2018-07-10 at 17:17 +0200, Paul Menzel wrote:
> Dear Linux folks,
> 
> 
> On the IBM S822LC (8335-GTA) with Ubuntu 18.04 I built Linux master
> – 4.18-rc4+, commit 092150a2 (Merge branch 'for-linus'
> of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid) – with
> kmemleak. Several issues are found.

Some of these are completely uninteresting though and look like
kmemleak bugs to me :-)

> [] __pud_alloc+0x80/0x270
> [<07135d64>] hash__map_kernel_page+0x30c/0x4d0
> [<71677858>] __ioremap_at+0x108/0x140
> [<0023e921>] __ioremap_caller+0x130/0x180
> [<9dbc3923>] icp_native_init_one_node+0x5cc/0x760
> [<15f3168a>] icp_native_init+0x70/0x13c
> [<60ed>] xics_init+0x38/0x1ac
> [<88dbf9d1>] pnv_init_IRQ+0x30/0x5c

This is the interrupt controller mapping its registers, why on earth
would that be considered a leak ? kmemleak needs to learn to ignore
kernel page tables allocations.


> [] __pud_alloc+0x80/0x270
> [<2cdcd2db>] vmap_page_range_noflush+0x670/0x880
> [<41cc3e80>] map_vm_area+0x58/0xb0
> [] __vmalloc_node_range+0x1cc/0x3f0
> [] __vmalloc+0x50/0x60
> [<27a0846e>] alloc_large_system_hash+0x3b8/0x554
> [] vfs_caches_init+0xd4/0x138
> [<55b60f04>] start_kernel+0x60c/0x684
> [] start_here_common+0x1c/0x520

This looks like some generic VFS stuff which similarly is meant to
remain for the whole duration of the system, what's the point of
reporting on init leaks like that ?

> unreferenced object 0xc7bd8000 (size 16384):
>   comm "init", pid 1, jiffies 4294895064 (age 1316.236s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace:
> [] __pud_alloc+0x80/0x270
> [<6ee9b8a3>] move_page_tables+0xd6c/0x13d0
> [<91930c94>] shift_arg_pages+0xc8/0x220
> [<9cfa5804>] setup_arg_pages+0x26c/0x380
> [] load_elf_binary+0x600/0x29ac
> [<5bbeae4b>] search_binary_handler+0x114/0x330
> [] __do_execve_file.isra.8+0x8a4/0x10b0
> [<44d8a16f>] sys_execve+0x58/0x70
> [<1deb923d>] system_call+0x5c/0x70

This is odd, looks like a page table allocation, I'm pretty sure those
get freed when the corresponding process dies, but this is init (PID1),
it probably never does. Again, a false positive.

> unreferenced object 0xc7bc8000 (size 16384):
>   comm "init", pid 1, jiffies 4294895064 (age 1316.236s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace:
> [] __pud_alloc+0x80/0x270
> [<1cb1e8bb>] __handle_mm_fault+0x34c/0x2f20
> [<28b41470>] handle_mm_fault+0x1f0/0x4e0
> [] __do_page_fault+0x274/0xf90
> [<94f967e0>] handle_page_fault+0x18/0x38
> [] 0xc07fce3bbaf0
> [] load_elf_binary+0x99c/0x29ac
> [<5bbeae4b>] search_binary_handler+0x114/0x330
> [] __do_execve_file.isra.8+0x8a4/0x10b0
> [<44d8a16f>] sys_execve+0x58/0x70
> [<1deb923d>] system_call+0x5c/0x70
> unreferenced object 0xc07fb84d (size 16384):
>   comm "systemd-journal", pid 1796, jiffies 4294895847 (age 1313.168s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace:
> [] __pud_alloc+0x80/0x270
> [<6ee9b8a3>] move_page_tables+0xd6c/0x13d0
> [<91930c94>] shift_arg_pages+0xc8/0x220
> [<9cfa5804>] setup_arg_pages+0x26c/0x380
> [] load_elf_binary+0x600/0x29ac
> [<5bbeae4b>] search_binary_handler+0x114/0x330
> [] __do_execve_file.isra.8+0x8a4/0x10b0
> [<44d8a16f>] sys_execve+0x58/0x70
> [<1deb923d>] system_call+0x5c/0x70
> unreferenced object 0xc07fb84b (size 16384):
>   comm "systemd-journal", pid 1796, jiffies 4294895847 (age 1313.168s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
>   backtrace:
> [] __pud_alloc+0x80/0x270
> [<1cb1e8bb>] __handle_mm_fault+0x34c/0x2f20
> [<28b41470>] handle_mm_fault+0x1f0/0x4e0
> [] __do_page_fault+0x274/0xf90
> [<94f967e0>] handle_page_fault+0x18/0x38
> 

Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100

2018-07-10 Thread Alex Williamson
On Tue, 10 Jul 2018 14:10:20 +1000
Alexey Kardashevskiy  wrote:

> On Thu, 7 Jun 2018 23:03:23 -0600
> Alex Williamson  wrote:
> 
> > On Fri, 8 Jun 2018 14:14:23 +1000
> > Alexey Kardashevskiy  wrote:
> >   
> > > On 8/6/18 1:44 pm, Alex Williamson wrote:
> > > > On Fri, 8 Jun 2018 13:08:54 +1000
> > > > Alexey Kardashevskiy  wrote:
> > > >   
> > > >> On 8/6/18 8:15 am, Alex Williamson wrote:  
> > > >>> On Fri, 08 Jun 2018 07:54:02 +1000
> > > >>> Benjamin Herrenschmidt  wrote:
> > > >>> 
> > > > >>>> On Thu, 2018-06-07 at 11:04 -0600, Alex Williamson wrote:
> > > > >>>>>
> > > > >>>>> Can we back up and discuss whether the IOMMU grouping of NVLink
> > > > >>>>> connected devices makes sense?  AIUI we have a PCI view of these
> > > > >>>>> devices and from that perspective they're isolated.  That's the view of
> > > > >>>>> the device used to generate the grouping.  However, not visible to us,
> > > > >>>>> these devices are interconnected via NVLink.  What isolation properties
> > > > >>>>> does NVLink provide given that its entire purpose for existing seems to
> > > > >>>>> be to provide a high performance link for p2p between devices?
> > > > >>>>
> > > > >>>> Not entire. On POWER chips, we also have an nvlink between the device
> > > > >>>> and the CPU which is running significantly faster than PCIe.
> > > > >>>>
> > > > >>>> But yes, there are cross-links and those should probably be accounted
> > > > >>>> for in the grouping.
> > > >>>
> > > >>> Then after we fix the grouping, can we just let the host driver manage
> > > > >>> this coherent memory range and expose vGPUs to guests?  The use case of
> > > > >>> assigning all 6 GPUs to one VM seems pretty limited.  (Might need to
> > > > >>> convince NVIDIA to support more than a single vGPU per VM though)
> > > >>>
> > > >>
> > > >> These are physical GPUs, not virtual sriov-alike things they are
> > > >> implementing as well elsewhere.  
> > > > 
> > > > vGPUs as implemented on M- and P-series Teslas aren't SR-IOV like
> > > > either.  That's why we have mdev devices now to implement software
> > > > defined devices.  I don't have first hand experience with V-series, but
> > > > I would absolutely expect a PCIe-based Tesla V100 to support vGPU.  
> > > 
> > > > So assuming V100 can do vGPU, you are suggesting ditching this patchset and
> > > > using mediated vGPUs instead, correct?
> > 
> > If it turns out that our PCIe-only-based IOMMU grouping doesn't
> > account for lack of isolation on the NVLink side and we correct that,
> > limiting assignment to sets of 3 interconnected GPUs, is that still a
> > useful feature?  OTOH, it's entirely an NVIDIA proprietary decision
> > whether they choose to support vGPU on these GPUs or whether they can
> > be convinced to support multiple vGPUs per VM.
> >   
> > > > >> My current understanding is that every P9 chip in that box has some NVLink2
> > > > >> logic on it so each P9 is directly connected to 3 GPUs via PCIe and
> > > > >> 2xNVLink2, and GPUs in that big group are interconnected by NVLink2 links
> > > > >> as well.
> > > > >>
> > > > >> From small bits of information I have it seems that a GPU can perfectly
> > > > >> work alone and if the NVIDIA driver does not see these interconnects
> > > > >> (because we do not pass the rest of the big 3xGPU group to this guest), it
> > > > >> continues with a single GPU. There is an "nvidia-smi -r" big reset hammer
> > > > >> which simply refuses to work until all 3 GPUs are passed so there is some
> > > > >> distinction between passing 1 or 3 GPUs, and I am trying (as we speak) to
> > > > >> get a confirmation from NVIDIA that it is ok to pass just a single GPU.
> > > >>
> > > >> So we will either have 6 groups (one per GPU) or 2 groups (one per
> > > >> interconnected group).  
> > > > 
> > > > I'm not gaining much confidence that we can rely on isolation between
> > > > NVLink connected GPUs, it sounds like you're simply expecting that
> > > > proprietary code from NVIDIA on a proprietary interconnect from NVIDIA
> > > > is going to play nice and nobody will figure out how to do bad things
> > > > because... obfuscation?  Thanks,  
> > > 
> > > Well, we already believe that a proprietary firmware of a sriov-capable
> > > > adapter like Mellanox ConnectX is not doing bad things, how is this
> > > different in principle?
> > 
> > It seems like the scope and hierarchy are different.  Here we're
> > talking about exposing big discrete devices, which are peers of one
> > another (and have history of being reverse engineered), to userspace
> > drivers.  Once handed to userspace, each of those devices needs to be
> > considered untrusted.  In the case of SR-IOV, we typically have a
> > trusted host driver for the PF managing untrusted VFs.  We do rely on
> > some sanity in the hardware/firmware in isolating the 

Re: [PATCH] Documentation: Add powerpc options for spec_store_bypass_disable

2018-07-10 Thread Jonathan Corbet
On Tue, 10 Jul 2018 12:08:36 +1000
Michael Ellerman  wrote:

> Document the support for spec_store_bypass_disable that was added for
> powerpc in commit a048a07d7f45 ("powerpc/64s: Add support for a store
> forwarding barrier at kernel entry/exit").
> 
> Signed-off-by: Michael Ellerman 

Applied, thanks.

jon


[PATCH NEXT 1/4] powerpc/pasemi: Add PCI initialisation for Nemo board.

2018-07-10 Thread Christian Zigotzky
Hello Michael,

Thanks a lot for your reply. OK, first I would like to add 

pr_info("NEMO SB600 IOB base %08llx\n",res.start)

to the Nemo patch. Is this line correct now?

After that I will try to contact Darren because of your other comments.

If I don’t reach Darren then I will try to fix the issues but I think I need 
your help for fixing them.

Thanks,
Christian

> On 10. Jul 2018, at 15:50, Michael Ellerman wrote:
> 
> pr_info() would be nice.
> 
> But I replied with lots of other comments previously.
> 
> None of them were super critical, but it would be nice to get them fixed
> before merging if possible.
> 
> cheers


[tip:sched/core] watchdog/softlockup: Fix the SOFTLOCKUP_DETECTOR=n build

2018-07-10 Thread tip-bot for Peter Zijlstra
Commit-ID:  aef92a8bed25d03b8f03ce499a56e8e8e5e2c05e
Gitweb: https://git.kernel.org/tip/aef92a8bed25d03b8f03ce499a56e8e8e5e2c05e
Author: Peter Zijlstra 
AuthorDate: Tue, 10 Jul 2018 13:42:10 +0200
Committer:  Ingo Molnar 
CommitDate: Tue, 10 Jul 2018 17:56:22 +0200

watchdog/softlockup: Fix the SOFTLOCKUP_DETECTOR=n build

I got confused by all the various CONFIG options hereabouts and
conflated CONFIG_LOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR.

This results in a build failure for:

   CONFIG_LOCKUP_DETECTOR=y && CONFIG_SOFTLOCKUP_DETECTOR=n

As reported by Abdul.

Reported-and-tested-by: Abdul Haleem 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-next 
Cc: linuxppc-dev 
Cc: mpe 
Cc: sachinp 
Cc: stephen Rothwell 
Fixes: 9cf57731b63e ("watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work")
Link: http://lkml.kernel.org/r/20180710114210.gi2...@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar 
---
 include/linux/nmi.h | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 80664bbeca43..08f9247e9827 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -33,15 +33,10 @@ extern int sysctl_hardlockup_all_cpu_backtrace;
 #define sysctl_hardlockup_all_cpu_backtrace 0
 #endif /* !CONFIG_SMP */
 
-extern int lockup_detector_online_cpu(unsigned int cpu);
-extern int lockup_detector_offline_cpu(unsigned int cpu);
-
 #else /* CONFIG_LOCKUP_DETECTOR */
 static inline void lockup_detector_init(void) { }
 static inline void lockup_detector_soft_poweroff(void) { }
 static inline void lockup_detector_cleanup(void) { }
-#define lockup_detector_online_cpu NULL
-#define lockup_detector_offline_cpu NULL
 #endif /* !CONFIG_LOCKUP_DETECTOR */
 
 #ifdef CONFIG_SOFTLOCKUP_DETECTOR
@@ -50,12 +45,18 @@ extern void touch_softlockup_watchdog(void);
 extern void touch_softlockup_watchdog_sync(void);
 extern void touch_all_softlockup_watchdogs(void);
 extern unsigned int  softlockup_panic;
-#else
+
+extern int lockup_detector_online_cpu(unsigned int cpu);
+extern int lockup_detector_offline_cpu(unsigned int cpu);
+#else /* CONFIG_SOFTLOCKUP_DETECTOR */
 static inline void touch_softlockup_watchdog_sched(void) { }
 static inline void touch_softlockup_watchdog(void) { }
 static inline void touch_softlockup_watchdog_sync(void) { }
 static inline void touch_all_softlockup_watchdogs(void) { }
-#endif
+
+#define lockup_detector_online_cpu NULL
+#define lockup_detector_offline_cpu NULL
+#endif /* CONFIG_SOFTLOCKUP_DETECTOR */
 
 #ifdef CONFIG_DETECT_HUNG_TASK
 void reset_hung_task_detector(void);
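
The idea behind the fix above can be sketched in userspace C: the NULL
stubs for the hotplug callbacks have to be guarded by the same option
that guards the real definitions, otherwise one configuration ends up
with a declaration but no definition. This is only an illustration.
All demo_* names and the DEMO_* switches are hypothetical stand-ins,
not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two Kconfig options discussed above. */
#define DEMO_LOCKUP_DETECTOR 1
/* #define DEMO_SOFTLOCKUP_DETECTOR 1 */   /* left off on purpose */

typedef int (*demo_cpuhp_cb)(unsigned int cpu);

#ifdef DEMO_SOFTLOCKUP_DETECTOR
/* The real callback would be defined in another file in this case. */
int demo_online_cpu(unsigned int cpu);
#else
/* Feature off: the callback name collapses to NULL, so users still build. */
#define demo_online_cpu NULL
#endif

/* Hypothetical hotplug step that stores the callback (or NULL). */
struct demo_hp_step {
    const char *name;
    demo_cpuhp_cb startup;
};

static const struct demo_hp_step demo_step = { "watchdog", demo_online_cpu };

/* Run the startup callback if there is one; a missing callback is a no-op. */
static int demo_run_startup(const struct demo_hp_step *step, unsigned int cpu)
{
    assert(step != NULL);
    return step->startup ? step->startup(cpu) : 0;
}
```

With the soft-lockup switch off, demo_step.startup is NULL and
demo_run_startup() simply returns 0, which is the behaviour the
NULL defines in the patch are there to provide.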


Re: [PATCH v6 5/8] powerpc/pseries: flush SLB contents on SLB MCE errors.

2018-07-10 Thread Michal Suchánek
Hello,

On Wed, 04 Jul 2018 23:28:21 +0530
"Mahesh J Salgaonkar"  wrote:

> From: Mahesh Salgaonkar 
> 
> On pseries, the system currently crashes if we get a machine check
> exception due to SLB errors. These are soft errors and can be fixed
> by flushing the SLBs, so the kernel can continue to function instead
> of crashing. We do this in real mode before turning on the MMU;
> otherwise we would run into nested machine checks. This patch fetches
> the RTAS error log in real mode and flushes the SLBs on SLB errors.
> 
> Signed-off-by: Mahesh Salgaonkar 
> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |    1
>  arch/powerpc/include/asm/machdep.h            |    1
>  arch/powerpc/kernel/exceptions-64s.S          |   42 +
>  arch/powerpc/kernel/mce.c                     |   16 +++-
>  arch/powerpc/mm/slb.c                         |    6 +++
>  arch/powerpc/platforms/pseries/pseries.h      |    1
>  arch/powerpc/platforms/pseries/ras.c          |   51 +
>  arch/powerpc/platforms/pseries/setup.c        |    1
>  8 files changed, 116 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 50ed64fba4ae..cc00a7088cf3 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -487,6 +487,7 @@ extern void hpte_init_native(void);
>  
>  extern void slb_initialize(void);
>  extern void slb_flush_and_rebolt(void);
> +extern void slb_flush_and_rebolt_realmode(void);
>  
>  extern void slb_vmalloc_update(void);
>  extern void slb_set_size(u16 size);
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index ffe7c71e1132..fe447e0d4140 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -108,6 +108,7 @@ struct machdep_calls {
>  
>   /* Early exception handlers called in realmode */
>   int (*hmi_exception_early)(struct pt_regs *regs);
> + int (*machine_check_early)(struct pt_regs *regs);
>  
>   /* Called during machine check exception to retrive fixup address. */
>   bool (*mce_check_early_recovery)(struct pt_regs *regs);
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index f283958129f2..0038596b7906 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -332,6 +332,9 @@ TRAMP_REAL_BEGIN(machine_check_pSeries)
>  machine_check_fwnmi:
>   SET_SCRATCH0(r13)   /* save r13 */
>   EXCEPTION_PROLOG_0(PACA_EXMC)
> +BEGIN_FTR_SECTION
> + b   machine_check_pSeries_early
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>  machine_check_pSeries_0:
>   EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
>   /*
> @@ -343,6 +346,45 @@ machine_check_pSeries_0:
>  
>  TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
>  
> +TRAMP_REAL_BEGIN(machine_check_pSeries_early)
> +BEGIN_FTR_SECTION
> + EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
> + mr  r10,r1  /* Save r1 */
> + ld  r1,PACAMCEMERGSP(r13)   /* Use MC emergency stack */
> + subi    r1,r1,INT_FRAME_SIZE    /* alloc stack frame */
> + mfspr   r11,SPRN_SRR0   /* Save SRR0 */
> + mfspr   r12,SPRN_SRR1   /* Save SRR1 */
> + EXCEPTION_PROLOG_COMMON_1()
> + EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> + EXCEPTION_PROLOG_COMMON_3(0x200)
> + addi    r3,r1,STACK_FRAME_OVERHEAD
> + BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
> +
> + /* Move original SRR0 and SRR1 into the respective regs */
> + ld  r9,_MSR(r1)
> + mtspr   SPRN_SRR1,r9
> + ld  r3,_NIP(r1)
> + mtspr   SPRN_SRR0,r3
> + ld  r9,_CTR(r1)
> + mtctr   r9
> + ld  r9,_XER(r1)
> + mtxer   r9
> + ld  r9,_LINK(r1)
> + mtlr    r9
> + REST_GPR(0, r1)
> + REST_8GPRS(2, r1)
> + REST_GPR(10, r1)
> + ld  r11,_CCR(r1)
> + mtcr    r11
> + REST_GPR(11, r1)
> + REST_2GPRS(12, r1)
> + /* restore original r1. */
> + ld  r1,GPR1(r1)
> + SET_SCRATCH0(r13)   /* save r13 */
> + EXCEPTION_PROLOG_0(PACA_EXMC)
> + b   machine_check_pSeries_0
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> +
>  EXC_COMMON_BEGIN(machine_check_common)
>   /*
>* Machine check is different because we use a different
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index efdd16a79075..221271c96a57 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -488,9 +488,21 @@ long machine_check_early(struct pt_regs *regs)
>  {
>   long handled = 0;
>  
> - __this_cpu_inc(irq_stat.mce_exceptions);
> + /*
> +  * For pSeries we count mce when we go into virtual mode machine
> +  * check handler. Hence skip it. Also, We can't access per cpu
> +  * 

Re: [PATCH] watchdog/softlockup: Fix SOFTLOCKUP_DETECTOR=n build

2018-07-10 Thread Ingo Molnar


* Peter Zijlstra  wrote:

> On Mon, Jul 09, 2018 at 11:40:14PM +0530, Abdul Haleem wrote:
> 
> > Thanks Peter for the patch, build and boot is fine.
> > 
> > Reported-and-tested-by: Abdul Haleem 
> 
> Excellent, Ingo can you stick this in?

Sure, done!

Thanks,

Ingo


Re: [RFC PATCH 2/2] dma-mapping: Clean up dma_get_required_mask() hooks

2018-07-10 Thread Christoph Hellwig
On Tue, Jul 10, 2018 at 01:29:20PM +0100, Robin Murphy wrote:
>> What I've done is to:
>>
>>   1) provide the get_required_mask unconditionally in struct dma_map_ops
>>   2) default to what is the current dma_get_required_mask implementation
>>  if nothing else is specified.
>
> Yeah, there's already 17 pointers in dma_map_ops of which about half are 
> optional, so these awkward #ifdefs to save one more probably aren't worth 
> the inconsistency they bring. It feels like this guy mostly goes 
> hand-in-hand with dma_supported, so ack to giving it the same look and 
> feel.

This whole area needs a major refactoring - we currently have three
different APIs to deal with addressability: dma_get_required_mask,
dma_capable/dma_set_mask and dma_capable from dma-direct.h, and there
are plenty of unexplainable mismatches between them.

Sorting this out has been on my TODO list, but I think it can only
effectively be done once the direct mapping implementations are
reasonably consolidated.

>> What I still had on my todo list but not done yet:
>>
>>   3) go through all instances and check if the current default
>>  makes sense, as it is based on direct addressability.  For most
>>  iommu instances it seems like we should just return a 64-bit mask.
>
> That's reasonable, although in many cases we should know the effective 
> IOMMU input address size which would be even neater.

Sure.  Maybe I just need to do steps 1 and 2 and let maintainers fill
in.

>>   4) figure out how to take the dma offsets into account for it
>
> AFAICS it might boil down to simply:
>
>   mask = roundup_pow_of_two(phys_to_dma(dev, PFN_PHYS(max_pfn))) - 1;

That looks way too sensible.  Which reminds me that I need to research
the history behind the low_totalram/high_totalram magic in
dma_get_required_mask.
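
To make the one-liner quoted above concrete, here is a rough userspace
sketch of the same arithmetic. It is not the kernel implementation:
demo_phys_to_dma() models the bus translation as a single constant
offset, the page size is fixed at 4 KiB, and every demo_* name is
made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Round v up to the next power of two; v must be in [1, 2^63]. */
static uint64_t demo_roundup_pow_of_two(uint64_t v)
{
    uint64_t p = 1;

    assert(v != 0 && v <= (UINT64_C(1) << 63));
    while (p < v)
        p <<= 1;
    return p;
}

/* Hypothetical translation: DMA address = physical address - offset. */
static uint64_t demo_phys_to_dma(uint64_t phys, uint64_t dma_offset)
{
    return phys - dma_offset;
}

/* Smallest all-ones mask covering every DMA address below the top of RAM. */
static uint64_t demo_required_mask(uint64_t max_pfn, uint64_t dma_offset)
{
    uint64_t top = demo_phys_to_dma(max_pfn << DEMO_PAGE_SHIFT, dma_offset);

    return demo_roundup_pow_of_two(top) - 1;
}
```

For example, 4 GiB of RAM starting at physical 0 gives a 32-bit mask,
6 GiB gives a 33-bit mask, and 4 GiB of RAM sitting behind a 2 GiB
offset is back to a 32-bit mask once the offset is taken into account.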


Re: [PATCH] powerpc: Replaced msleep with usleep_range

2018-07-10 Thread Michael Ellerman
kbuild test robot  writes:

> Hi Daniel,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on powerpc/next]
> [also build test ERROR on v4.18-rc4 next-20180709]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Daniel-Klamt/powerpc-Replaced-msleep-with-usleep_range/20180709-231913
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> config: powerpc-defconfig (attached as .config)
> compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> GCC_VERSION=7.2.0 make.cross ARCH=powerpc 
>
> All errors (new ones prefixed by >>):
>
>arch/powerpc/sysdev/xive/native.c: In function 'xive_native_configure_irq':
>>> arch/powerpc/sysdev/xive/native.c:113:2: error: expected ';' before '}' token
>  }
>  ^

There are also instructions here for building the powerpc kernel:

  https://github.com/linuxppc/linux/wiki/Building-powerpc-kernels

cheers


Re: [PATCH NEXT 1/4] powerpc/pasemi: Add PCI initialisation for Nemo board.

2018-07-10 Thread Michael Ellerman
Christian Zigotzky  writes:

> Hi Michael,
> Hi All,
>
> kbuild test robot Wed, 03 Jan 2018 04:17:20 -0800 wrote:
>
> Hi Darren,
>
> Thank you for the patch! Perhaps something to improve:
>
> arch/powerpc/platforms/pasemi/pci.c: In function 'sb600_set_flag':
> >> include/linux/kern_levels.h:5:18: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'resource_size_t {aka long long unsigned int}' [-Wformat=]
>  #define KERN_SOH "\001" /* ASCII Start Of Header */
>
> —-
>
> I was able to fix this small format issue. I replaced the format '%08lx' with 
> '%08llx'.
>
> + printk(KERN_CRIT "NEMO SB600 IOB base %08llx\n",res.start);
>
> Is this fix OK or is there a better solution?
>
>> On 3. May 2018, at 15:06, Michael Ellerman  wrote:
>> 
>>> +
>>> +printk(KERN_CRIT "NEMO SB600 IOB base %08lx\n",res.start);
>> 
>> That's INFO or even DEBUG.
>>> 
>
> Michael,
>
> What do you think about this fix?
>
> + printk(KERN_INFO "NEMO SB600 IOB base %08llx\n",res.start);

pr_info() would be nice.

But I replied with lots of other comments previously.

None of them were super critical, but it would be nice to get them fixed
before merging if possible.

cheers
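
As a side note on the format warning discussed in this thread, the
usual portable approach in plain C is to cast the value to
unsigned long long so that it matches %llx whatever the underlying
width of the type. The sketch below is userspace only;
demo_resource_size_t is a hypothetical stand-in for resource_size_t.
In kernel code the %pa specifier, which takes a pointer to the value
(e.g. &res.start), is another way to avoid depending on the width.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in; the real type may be 32 or 64 bits wide. */
typedef uint64_t demo_resource_size_t;

/* Format the message into buf; returns the length snprintf() reports. */
static int demo_format_iob_base(char *buf, size_t len,
                                demo_resource_size_t start)
{
    assert(buf != NULL && len > 0);
    /* The cast makes the argument match %llx on every configuration. */
    return snprintf(buf, len, "NEMO SB600 IOB base %08llx\n",
                    (unsigned long long)start);
}
```

Without the cast, the same format string would trigger a -Wformat
warning again on any build where the type is only 32 bits wide.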


Re: [PATCH 1/2] powerpc: Add ppc32_allmodconfig defconfig target

2018-07-10 Thread Michael Ellerman
Mathieu Malaterre  writes:
> On Mon, Jul 9, 2018 at 4:24 PM Michael Ellerman  wrote:
>>
>> Because the allmodconfig logic just sets every symbol to M or Y, it
>> has the effect of always generating a 64-bit config, because
>> CONFIG_PPC64 becomes Y.
>>
>> So to make it easier for folks to test 32-bit code, provide a phony
>> defconfig target that generates a 32-bit allmodconfig.
>>
>> The 32-bit port has several mutually exclusive CPU types, we choose
>> the Book3S variants as that's what the help text in Kconfig says is
>> most common.
>
> Ok then.

That was just me taking a stab in the dark. You suggested we should
mimic the Debian config, what does that use?

>> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
>> index 2ea575cb3401..2556c2182789 100644
>> --- a/arch/powerpc/Makefile
>> +++ b/arch/powerpc/Makefile
>> @@ -354,6 +354,11 @@ mpc86xx_smp_defconfig:
>> $(call merge_into_defconfig,mpc86xx_basic_defconfig,\
>> 86xx-smp 86xx-hw fsl-emb-nonhw)
>>
>> +PHONY += ppc32_allmodconfig
>> +ppc32_allmodconfig:
>> +   $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \
>> +   -f $(srctree)/Makefile allmodconfig
>> +
>
> Is this a good time to update line 34 at the same time:
>
> KBUILD_DEFCONFIG := $(shell uname -m)_defconfig
>
> ?

34 or 36?

  ifeq ($(CROSS_COMPILE),)
  KBUILD_DEFCONFIG := $(shell uname -m)_defconfig
  else
  KBUILD_DEFCONFIG := ppc64_defconfig
  endif

Do you mean the else case?

cheers


Re: [powerpc/powervm]Oops: Kernel access of bad area, sig: 11 [#1] while running stress-ng

2018-07-10 Thread Michael Ellerman
vrbagal1  writes:

> On 2018-07-10 13:37, Nicholas Piggin wrote:
>> On Tue, 10 Jul 2018 11:58:40 +0530
>> vrbagal1  wrote:
>> 
>>> Hi,
>>> 
>>> Observing kernel oops on Power9(ZZ) box, running on PowerVM, while
>>> running stress-ng.
>>> 
>>> 
>>> Kernel: 4.18.0-rc4
>>> Machine: Power9 ZZ (PowerVM)
>>> Test: Stress-ng
>>> 
>>> Attached is .config file
>>> 
>>> Traces:
>>> 
>>>   [12251.245209] Oops: Kernel access of bad area, sig: 11 [#1]
>> 
>> Can you post the lines above this? Otherwise we don't know what address
>> it tried to access (without decoding the instructions and 
>> reconstructing
>> it from registers at least, which the XFS devs wouldn't be inclined to
>> do).
>> 
>
> ah my bad.
>
>   [12251.245179] Unable to handle kernel paging request for data at address 0x60006000
>   [12251.245199] Faulting instruction address: 0xc0319e2c

Which matches the regs & disassembly:

r4 = 60006000 
r9 = 0
ldx r9,r4,r9    <- pop

So object was 0x60006000.

That looks like two nops, ie. we got some code?

And there's only one caller of prefetch_freepointer() in slab_alloc_node():

prefetch_freepointer(s, next_object);


So slab corruption is looking likely.

Do you have slub_debug=FZP on the kernel command line?

cheers


Re: [RFC PATCH 2/2] dma-mapping: Clean up dma_get_required_mask() hooks

2018-07-10 Thread Robin Murphy

On 10/07/18 12:39, Christoph Hellwig wrote:

> On Wed, Jul 04, 2018 at 06:50:12PM +0100, Robin Murphy wrote:
>> As for the other mask-related hooks, standardise the arch override into
>> a Kconfig option, and also pull the generic implementation into the DMA
>> mapping code rather than having it hide away in the platform bus code.
>
> I compared this a bit to what I had around against an older kernel,
> and I think we should probably go with something more like the
> version I had, which I can dust off again.
>
> What I've done is to:
>
>   1) provide the get_required_mask unconditionally in struct dma_map_ops
>   2) default to what is the current dma_get_required_mask implementation
>      if nothing else is specified.

Yeah, there's already 17 pointers in dma_map_ops of which about half are 
optional, so these awkward #ifdefs to save one more probably aren't 
worth the inconsistency they bring. It feels like this guy mostly goes 
hand-in-hand with dma_supported, so ack to giving it the same look and feel.

> What I still had on my todo list but not done yet:
>
>   3) go through all instances and check if the current default
>      makes sense, as it is based on direct addressability.  For most
>      iommu instances it seems like we should just return a 64-bit mask.

That's reasonable, although in many cases we should know the effective 
IOMMU input address size which would be even neater.

>   4) figure out how to take the dma offsets into account for it

AFAICS it might boil down to simply:

	mask = roundup_pow_of_two(phys_to_dma(dev, PFN_PHYS(max_pfn))) - 1;

>   5) move the function to the dma-direct code, as that is where it
>      belongs
>   5) figure out if there is a better name for the method, as with
>      swiotlb & co it isn't really the required mask, but more something
>      like the optimal mask
>   6) document the whole thing..
>   7) sort out the powerpc indirection mess.
>
> Do you agree with that general plan?  If so I can dust off my old
> patch.

Sounds good; in the meantime I'll happily drop these two.

Robin.


Re: [RFC PATCH 1/2] dma-mapping: Clean up dma_set_*mask() hooks

2018-07-10 Thread Christoph Hellwig
On Mon, Jul 09, 2018 at 03:53:50PM +0100, Robin Murphy wrote:
> Oh, for sure, the generic fix would be the longer-term goal, this was just 
> an expedient compromise because I want to get *something* landed for 4.19. 
> Since in practice this is predominantly affecting arm64, doing the 
> arch-specific fix to appease affected customers then working to generalise 
> it afterwards seemed to carry the lowest risk.
>
> That said, I think I can see a relatively safe and clean alternative 
> approach based on converting dma_32bit_limit to a mask, so I'll spin some 
> patches around that idea ASAP to continue the discussion.

Great!  I really want to sort out this area as soon as possible, and
introducing more arch specific code isn't really helping with that.  In
fact my whole drive to consolidate code is based on the fact that
I want to fix issues like this in one code base instead of 20 slightly
different ones.

FYI, this is the Xilinx/RISC-V use case for the 32-bit limit,
which I'll need to respin a bit based on linux-pci feedback:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/xilinx-dma-32

> It's entirely possible to plug an old PCI soundcard via a bridge adapter 
> into a modern board where the card's 24-bit DMA mask reaches nothing but 
> the SoC's boot flash, and no IOMMU is available (e.g. some of the smaller 
> NXP Layerscape stuff); I still think there should be an error in such rare 
> cases when DMA is utterly impossible, but otherwise I agree it would be 
> much nicer for drivers to just provide their preferred mask and let the ops 
> massage it as necessary.

Ok, I guess we still need to keep the return value for that.
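
The "DMA is utterly impossible" case described above can be pictured
with a small check: a device that can only address DMA addresses up to
its mask has nothing to work with when the lowest RAM address visible
on the bus lies above that mask. This is a simplified userspace
sketch, not the kernel's logic; the demo_* names and the single
"lowest RAM address" parameter are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask with n low bits set, for n in [1, 64]. */
static uint64_t demo_dma_bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return n == 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/*
 * The device can reach DMA addresses 0..dev_mask. DMA is possible only
 * if at least the lowest RAM address (as seen on the bus) is in range.
 */
static int demo_dma_possible(uint64_t dev_mask, uint64_t ram_base_dma)
{
    return ram_base_dma <= dev_mask;
}
```

With RAM starting at 0x80000000 on the bus, a 24-bit card fails the
check while a 32-bit device passes, which is the situation where
returning an error from the mask-setting call still carries
information.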


[PATCH] watchdog/softlockup: Fix SOFTLOCKUP_DETECTOR=n build

2018-07-10 Thread Peter Zijlstra
On Mon, Jul 09, 2018 at 11:40:14PM +0530, Abdul Haleem wrote:

> Thanks Peter for the patch, build and boot is fine.
> 
> Reported-and-tested-by: Abdul Haleem 

Excellent, Ingo can you stick this in?

---
Subject: watchdog/softlockup: Fix SOFTLOCKUP_DETECTOR=n build
From: Peter Zijlstra 
Date: Mon, 9 Jul 2018 13:47:16 +0200

I got confused by all the various CONFIG options hereabouts and
conflated CONFIG_LOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR. This
results in a build failure for:

   CONFIG_LOCKUP_DETECTOR=y && CONFIG_SOFTLOCKUP_DETECTOR=n

As reported by Abdul.

Cc: Ingo Molnar 
Cc: stephen Rothwell 
Cc: linuxppc-dev 
Cc: sachinp 
Cc: mpe 
Reported-and-tested-by: Abdul Haleem 
Fixes: 9cf57731b63e ("watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work")
Signed-off-by: Peter Zijlstra (Intel) 
Link: http://lkml.kernel.org/r/20180709114716.gn2...@hirez.programming.kicks-ass.net
---
 include/linux/nmi.h |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -33,15 +33,10 @@ extern int sysctl_hardlockup_all_cpu_bac
 #define sysctl_hardlockup_all_cpu_backtrace 0
 #endif /* !CONFIG_SMP */
 
-extern int lockup_detector_online_cpu(unsigned int cpu);
-extern int lockup_detector_offline_cpu(unsigned int cpu);
-
 #else /* CONFIG_LOCKUP_DETECTOR */
 static inline void lockup_detector_init(void) { }
 static inline void lockup_detector_soft_poweroff(void) { }
 static inline void lockup_detector_cleanup(void) { }
-#define lockup_detector_online_cpu NULL
-#define lockup_detector_offline_cpu NULL
 #endif /* !CONFIG_LOCKUP_DETECTOR */
 
 #ifdef CONFIG_SOFTLOCKUP_DETECTOR
@@ -50,12 +45,18 @@ extern void touch_softlockup_watchdog(vo
 extern void touch_softlockup_watchdog_sync(void);
 extern void touch_all_softlockup_watchdogs(void);
 extern unsigned int  softlockup_panic;
-#else
+
+extern int lockup_detector_online_cpu(unsigned int cpu);
+extern int lockup_detector_offline_cpu(unsigned int cpu);
+#else /* CONFIG_SOFTLOCKUP_DETECTOR */
 static inline void touch_softlockup_watchdog_sched(void) { }
 static inline void touch_softlockup_watchdog(void) { }
 static inline void touch_softlockup_watchdog_sync(void) { }
 static inline void touch_all_softlockup_watchdogs(void) { }
-#endif
+
+#define lockup_detector_online_cpu NULL
+#define lockup_detector_offline_cpu NULL
+#endif /* CONFIG_SOFTLOCKUP_DETECTOR */
 
 #ifdef CONFIG_DETECT_HUNG_TASK
 void reset_hung_task_detector(void);


Re: [RFC PATCH 2/2] dma-mapping: Clean up dma_get_required_mask() hooks

2018-07-10 Thread Christoph Hellwig
On Wed, Jul 04, 2018 at 06:50:12PM +0100, Robin Murphy wrote:
> As for the other mask-related hooks, standardise the arch override into
> a Kconfig option, and also pull the generic implementation into the DMA
> mapping code rather than having it hide away in the platform bus code.

I compared this a bit to what I had around against an older kernel,
and I think we should probably go with something more like the
version I had, which I can dust off again.

What I've done is to:

 1) provide the get_required_mask unconditionally in struct dma_map_ops
 2) default to what is the current dma_get_required_mask implementation
    if nothing else is specified.

What I still have on my todo list but have not done yet:

 3) go through all instances and check if the current default
    makes sense, as it is based on direct addressability.  For most
    iommu instances it seems like we should just return a 64-bit mask.
 4) figure out how to take the dma offsets into account for it
 5) move the function to the dma-direct code, as that is where it
    belongs
 6) figure out if there is a better name for the method, as with
    swiotlb & co it isn't really the required mask, but more something
    like the optimal mask
 7) document the whole thing.
 8) sort out the powerpc indirection mess.

Do you agree with that general plan?  If so I can dust off my old
patch.
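
Points 1) and 2) of the plan above (a hook that is always present in the ops table, with a default when an instance leaves it unset) can be sketched in plain C. This is a userspace illustration under stated assumptions, not the kernel code: all demo_* names are hypothetical, and the default shown here is simply "the smallest all-ones mask that covers the highest address", standing in for the direct-addressability default mentioned in point 3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_device;

/* Hypothetical ops table: the hook always exists but may be NULL. */
struct demo_dma_ops {
	uint64_t (*get_required_mask)(const struct demo_device *dev);
};

struct demo_device {
	const struct demo_dma_ops *ops;
	uint64_t max_phys_addr;	/* highest address the device must reach */
};

/* Default: smear the top set bit downwards to get an all-ones mask. */
static uint64_t demo_direct_required_mask(const struct demo_device *dev)
{
	uint64_t mask = dev->max_phys_addr;

	mask |= mask >> 1;
	mask |= mask >> 2;
	mask |= mask >> 4;
	mask |= mask >> 8;
	mask |= mask >> 16;
	mask |= mask >> 32;
	return mask;
}

/* An IOMMU-style instance that simply reports a full 64-bit mask. */
static uint64_t demo_iommu_required_mask(const struct demo_device *dev)
{
	(void)dev;
	return UINT64_MAX;
}

/* Dispatch: use the instance hook if set, else fall back to the default. */
static uint64_t demo_get_required_mask(const struct demo_device *dev)
{
	if (dev->ops && dev->ops->get_required_mask)
		return dev->ops->get_required_mask(dev);
	return demo_direct_required_mask(dev);
}

void demo_required_mask_check(void)
{
	struct demo_dma_ops iommu = { demo_iommu_required_mask };
	struct demo_device direct = { NULL, 0x12345678ULL };
	struct demo_device behind_iommu = { &iommu, 0x12345678ULL };

	assert(demo_get_required_mask(&direct) == 0x1fffffffULL);
	assert(demo_get_required_mask(&behind_iommu) == UINT64_MAX);
}
```

With this shape, callers never need to know whether an instance overrides the hook, which is what makes a separate arch override unnecessary.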


Re: [RFC PATCH] powerpc/64s: Move ISAv3.0 / POWER9 idle code to powernv C code

2018-07-10 Thread Gautham R Shenoy
Hello Nicholas,


On Mon, Jul 09, 2018 at 12:24:36AM +1000, Nicholas Piggin wrote:
> Reimplement POWER9 idle code in C, in the powernv platform code.
> Assembly stubs are used to save and restore the stack frame and
> non-volatile GPRs before going to idle, but these are small and
> mostly agnostic to microarchitecture implementation details.
>

Thanks for this patch.  It is indeed hard to maneuver through the
current assembly code and change things without introducing new bugs.


> POWER7/8 code is not converted (yet), but that's not a moving
> target, and it doesn't make you want to claw your eyes out so
> much with the POWER9 code untangled from it.
> 
> The optimisation where EC=ESL=0 idle modes did not have to save
> GPRs or mtmsrd L=0 is restored, because it's simple to do.
> 
> Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs,
> but saves and restores them all explicitly.

Right! The ->cpu_restore call in the current code is rendered
ineffective by "restore_additional_sprs" which is called after the
cpu_restore. 

> 
> Moving the HMI, SPR, OPAL, locking, etc. to C is the only real
> way this stuff will cope with non-trivial new CPU implementation
> details, firmware changes, etc., without becoming unmaintainable.

Some comments inline. 

> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>  arch/powerpc/include/asm/cpuidle.h            |  14 +-
>  arch/powerpc/include/asm/paca.h               |  38 +-
>  arch/powerpc/include/asm/processor.h          |   3 +-
>  arch/powerpc/include/asm/reg.h                |   7 +-
>  arch/powerpc/kernel/Makefile                  |   2 +-
>  arch/powerpc/kernel/asm-offsets.c             |  11 +-
>  arch/powerpc/kernel/exceptions-64s.S          |  10 +-
>  arch/powerpc/kernel/idle_book3s.S             | 348 ++-
>  arch/powerpc/kernel/idle_isa3.S               |  73
>  arch/powerpc/kernel/setup-common.c            |   4 +-
>  arch/powerpc/mm/slb.c                         |   7 +-
>  arch/powerpc/platforms/powernv/idle.c         | 402 +++---
>  arch/powerpc/xmon/xmon.c                      |  25 +-
>  14 files changed, 496 insertions(+), 449 deletions(-)
>  create mode 100644 arch/powerpc/kernel/idle_isa3.S
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 50ed64fba4ae..c626319a962d 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -486,6 +486,7 @@ static inline void hpte_init_pseries(void) { }
>  extern void hpte_init_native(void);
> 
>  extern void slb_initialize(void);
> +extern void __slb_flush_and_rebolt(void);
>  extern void slb_flush_and_rebolt(void);
> 
>  extern void slb_vmalloc_update(void);
> diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
> index e210a83eb196..b668f030d531 100644
> --- a/arch/powerpc/include/asm/cpuidle.h
> +++ b/arch/powerpc/include/asm/cpuidle.h
> @@ -28,6 +28,7 @@
>   * yet woken from the winkle state.
>   */
>  #define PNV_CORE_IDLE_LOCK_BIT   0x1000
> +#define NR_PNV_CORE_IDLE_LOCK_BIT28

We can define PNV_CORE_IDLE_LOCK_BIT mask based on
NR_PNV_CORE_IDLE_LOCK_BIT ?
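
A minimal sketch of that suggestion, assuming the bit number 28 from the quoted hunk. The demo_* helpers are hypothetical, and this is plain C only; a header that is also included from assembly would likely need an assembler-safe constant form rather than the UL suffix:

```c
#include <assert.h>

/* Bit number as added by the quoted hunk. */
#define NR_PNV_CORE_IDLE_LOCK_BIT	28
/* Mask derived from the bit number instead of a second literal. */
#define PNV_CORE_IDLE_LOCK_BIT		(1UL << NR_PNV_CORE_IDLE_LOCK_BIT)

/* Hypothetical helper: test the lock bit through the mask. */
static int demo_is_locked(unsigned long core_idle_state)
{
	return (core_idle_state & PNV_CORE_IDLE_LOCK_BIT) != 0;
}

/* Hypothetical helper: test the same bit through its number. */
static int demo_is_locked_by_nr(unsigned long core_idle_state)
{
	return (int)((core_idle_state >> NR_PNV_CORE_IDLE_LOCK_BIT) & 1UL);
}

void demo_lock_bit_check(void)
{
	assert(PNV_CORE_IDLE_LOCK_BIT == 0x10000000UL);
	assert(demo_is_locked(0x10000000UL) == demo_is_locked_by_nr(0x10000000UL));
	assert(demo_is_locked(0x0fffffffUL) == demo_is_locked_by_nr(0x0fffffffUL));
}
```

Deriving one from the other means the two definitions cannot drift apart.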

> 
>  #define PNV_CORE_IDLE_WINKLE_COUNT   0x0001
>  #define PNV_CORE_IDLE_WINKLE_COUNT_ALL_BIT   0x0008
> @@ -68,22 +69,9 @@
>  #define ERR_DEEP_STATE_ESL_MISMATCH  -2
> 
>  #ifndef __ASSEMBLY__
> -/* Additional SPRs that need to be saved/restored during stop */
> -struct stop_sprs {
> - u64 pid;
> - u64 ldbar;
> - u64 fscr;
> - u64 hfscr;
> - u64 mmcr1;
> - u64 mmcr2;
> - u64 mmcra;
> -};
> -
>  extern u32 pnv_fastsleep_workaround_at_entry[];
>  extern u32 pnv_fastsleep_workaround_at_exit[];
> 
> -extern u64 pnv_first_deep_stop_state;
> -
>  unsigned long pnv_cpu_offline(unsigned int cpu);
>  int validate_psscr_val_mask(u64 *psscr_val, u64 *psscr_mask, u32 flags);
>  static inline void report_invalid_psscr_val(u64 psscr_val, int err)
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index 4e9cede5a7e7..a7a4892d39c0 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -178,23 +178,29 @@ struct paca_struct {
>  #endif
> 
>  #ifdef CONFIG_PPC_POWERNV
> - /* Per-core mask tracking idle threads and a lock bit-[L][] */
> - u32 *core_idle_state_ptr;
> - u8 thread_idle_state;   /* PNV_THREAD_RUNNING/NAP/SLEEP */
> - /* Mask to indicate thread id in core */
> - u8 thread_mask;
> - /* Mask to denote subcore sibling threads */
> - u8 subcore_sibling_mask;
> - /* Flag to request this thread not to stop */
> - atomic_t dont_stop;
> - /* The PSSCR value that the kernel requested before going to stop */
> - u64 requested_psscr;
> + union {
> + /* P7/P8 specific fields */
> + struct {
> + /* 

[PATCH] powerpc/perf: Update perf_regs structure to include SIER

2018-07-10 Thread Madhavan Srinivasan
On each sample, the Sample Instruction Event Register (SIER) content
is saved in pt_regs. SIER does not have an entry of its own in pt_regs;
instead, its content is saved in the "dar" field of pt_regs.

This patch adds another entry to the perf_regs structure so that "SIER"
can be printed; internally it maps to the "dar" field of pt_regs.

Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Alexander Shishkin 
Cc: Anju T Sudhakar 
Cc: Ravi Bangoria 
Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/uapi/asm/perf_regs.h       | 1 +
 arch/powerpc/perf/perf_regs.c                   | 1 +
 tools/arch/powerpc/include/uapi/asm/perf_regs.h | 1 +
 tools/perf/arch/powerpc/include/perf_regs.h     | 3 ++-
 tools/perf/arch/powerpc/util/perf_regs.c        | 1 +
 5 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index 9e52c86ccbd3..ff91192407d1 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -46,6 +46,7 @@ enum perf_event_powerpc_regs {
    PERF_REG_POWERPC_TRAP,
    PERF_REG_POWERPC_DAR,
    PERF_REG_POWERPC_DSISR,
+   PERF_REG_POWERPC_SIER,
    PERF_REG_POWERPC_MAX,
 };
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index 09ceea6175ba..c262aea22ad9 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -69,6 +69,7 @@ static unsigned int pt_regs_offset[PERF_REG_POWERPC_MAX] = {
    PT_REGS_OFFSET(PERF_REG_POWERPC_TRAP, trap),
    PT_REGS_OFFSET(PERF_REG_POWERPC_DAR, dar),
    PT_REGS_OFFSET(PERF_REG_POWERPC_DSISR, dsisr),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_SIER, dar),
 };
 
 u64 perf_reg_value(struct pt_regs *regs, int idx)
diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index 9e52c86ccbd3..ff91192407d1 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -46,6 +46,7 @@ enum perf_event_powerpc_regs {
    PERF_REG_POWERPC_TRAP,
    PERF_REG_POWERPC_DAR,
    PERF_REG_POWERPC_DSISR,
+   PERF_REG_POWERPC_SIER,
    PERF_REG_POWERPC_MAX,
 };
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
index 00e37b106913..1076393e6f43 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -62,7 +62,8 @@ static const char *reg_names[] = {
    [PERF_REG_POWERPC_SOFTE] = "softe",
    [PERF_REG_POWERPC_TRAP] = "trap",
    [PERF_REG_POWERPC_DAR] = "dar",
-   [PERF_REG_POWERPC_DSISR] = "dsisr"
+   [PERF_REG_POWERPC_DSISR] = "dsisr",
+   [PERF_REG_POWERPC_SIER] = "sier"
 };
 
 static inline const char *perf_reg_name(int id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index ec50939b0418..07fcd977d93e 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -52,6 +52,7 @@ const struct sample_reg sample_reg_masks[] = {
    SMPL_REG(trap, PERF_REG_POWERPC_TRAP),
    SMPL_REG(dar, PERF_REG_POWERPC_DAR),
    SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
+   SMPL_REG(sier, PERF_REG_POWERPC_SIER),
    SMPL_REG_END
 };
 
-- 
2.7.4
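
The aliasing described in the commit message above (a SIER index that reads the "dar" slot of pt_regs) can be sketched with an offset table in userspace C. The demo_* struct, enum and function below are hypothetical, trimmed-down stand-ins and not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for pt_regs. */
struct demo_pt_regs {
	uint64_t trap;
	uint64_t dar;
	uint64_t dsisr;
};

enum demo_perf_regs {
	DEMO_REG_TRAP,
	DEMO_REG_DAR,
	DEMO_REG_DSISR,
	DEMO_REG_SIER,	/* new index, no field of its own */
	DEMO_REG_MAX,
};

/* One byte offset per index; SIER deliberately reuses the dar offset. */
static const size_t demo_offset[DEMO_REG_MAX] = {
	[DEMO_REG_TRAP]  = offsetof(struct demo_pt_regs, trap),
	[DEMO_REG_DAR]   = offsetof(struct demo_pt_regs, dar),
	[DEMO_REG_DSISR] = offsetof(struct demo_pt_regs, dsisr),
	[DEMO_REG_SIER]  = offsetof(struct demo_pt_regs, dar),
};

/* Look a register up by index through the offset table. */
static uint64_t demo_reg_value(const struct demo_pt_regs *regs, int idx)
{
	assert(idx >= 0 && idx < DEMO_REG_MAX);
	return *(const uint64_t *)((const char *)regs + demo_offset[idx]);
}
```

In this sketch both indices always return the same value, so a consumer has to know from context which meaning applies to a given sample.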



Re: [powerpc/powervm]Oops: Kernel access of bad area, sig: 11 [#1] while running stress-ng

2018-07-10 Thread vrbagal1

On 2018-07-10 13:37, Nicholas Piggin wrote:
> On Tue, 10 Jul 2018 11:58:40 +0530
> vrbagal1  wrote:
>
>> Hi,
>>
>> Observing kernel oops on Power9(ZZ) box, running on PowerVM, while
>> running stress-ng.
>>
>> Kernel: 4.18.0-rc4
>> Machine: Power9 ZZ (PowerVM)
>> Test: Stress-ng
>>
>> Attached is .config file
>>
>> Traces:
>>
>>   [12251.245209] Oops: Kernel access of bad area, sig: 11 [#1]
>
> Can you post the lines above this? Otherwise we don't know what address
> it tried to access (without decoding the instructions and reconstructing
> it from registers at least, which the XFS devs wouldn't be inclined to
> do).

ah my bad.

 [12251.245179] Unable to handle kernel paging request for data at address 0x60006000
 [12251.245199] Faulting instruction address: 0xc0319e2c

> And I assume there is nothing else relevant to XFS in the dmesg before
> this?

Nothing relevant in dmesg

Regards,
Venkat.

> Thanks,
> Nick



Re: [PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()

2018-07-10 Thread Michal Hocko
On Tue 10-07-18 16:19:32, Joonsoo Kim wrote:
> Hello, Marek.
> 
> 2018-07-09 21:19 GMT+09:00 Marek Szyprowski :
> > cma_alloc() function doesn't really support gfp flags other than
> > __GFP_NOWARN, so convert gfp_mask parameter to boolean no_warn parameter.
> 
> Although gfp_mask isn't used in cma_alloc() except for no_warn, it can be used
> in alloc_contig_range(). For example, if the passed gfp mask lacks __GFP_FS,
> compaction (isolation) would work differently. Have you considered
> such a case?

Do any of the cma_alloc() users actually care about GFP_NO{FS,IO}?
-- 
Michal Hocko
SUSE Labs


Re: [powerpc/powervm]Oops: Kernel access of bad area, sig: 11 [#1] while running stress-ng

2018-07-10 Thread Nicholas Piggin
On Tue, 10 Jul 2018 11:58:40 +0530
vrbagal1  wrote:

> Hi,
> 
> Observing kernel oops on Power9(ZZ) box, running on PowerVM, while 
> running stress-ng.
> 
> 
> Kernel: 4.18.0-rc4
> Machine: Power9 ZZ (PowerVM)
> Test: Stress-ng
> 
> Attached is .config file
> 
> Traces:
> 
>   [12251.245209] Oops: Kernel access of bad area, sig: 11 [#1]

Can you post the lines above this? Otherwise we don't know what address
it tried to access (without decoding the instructions and reconstructing
it from registers at least, which the XFS devs wouldn't be inclined to
do).

And I assume there is nothing else relevant to XFS in the dmesg before
this?

Thanks,
Nick


Re: [PATCH 1/2] powerpc: Add ppc32_allmodconfig defconfig target

2018-07-10 Thread Mathieu Malaterre
On Mon, Jul 9, 2018 at 4:24 PM Michael Ellerman  wrote:
>
> Because the allmodconfig logic just sets every symbol to M or Y, it
> has the effect of always generating a 64-bit config, because
> CONFIG_PPC64 becomes Y.
>
> So to make it easier for folks to test 32-bit code, provide a phony
> defconfig target that generates a 32-bit allmodconfig.
>
> The 32-bit port has several mutually exclusive CPU types, we choose
> the Book3S variants as that's what the help text in Kconfig says is
> most common.

Ok then.

> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/Makefile                 | 5 +++++
>  arch/powerpc/configs/book3s_32.config | 2 ++
>  2 files changed, 7 insertions(+)
>  create mode 100644 arch/powerpc/configs/book3s_32.config
>
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index 2ea575cb3401..2556c2182789 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -354,6 +354,11 @@ mpc86xx_smp_defconfig:
> $(call merge_into_defconfig,mpc86xx_basic_defconfig,\
> 86xx-smp 86xx-hw fsl-emb-nonhw)
>
> +PHONY += ppc32_allmodconfig
> +ppc32_allmodconfig:
> +   $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \
> +   -f $(srctree)/Makefile allmodconfig
> +

Is this a good time to update line 34 as well:

KBUILD_DEFCONFIG := $(shell uname -m)_defconfig

?

>  define archhelp
>@echo '* zImage  - Build default images selected by kernel config'
>@echo '  zImage.*- Compressed kernel image (arch/$(ARCH)/boot/zImage.*)'
> diff --git a/arch/powerpc/configs/book3s_32.config b/arch/powerpc/configs/book3s_32.config
> new file mode 100644
> index ..8721eb7b1294
> --- /dev/null
> +++ b/arch/powerpc/configs/book3s_32.config
> @@ -0,0 +1,2 @@
> +CONFIG_PPC64=n
> +CONFIG_PPC_BOOK3S_32=y
> --
> 2.14.1
>


Re: [PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()

2018-07-10 Thread Joonsoo Kim
Hello, Marek.

2018-07-09 21:19 GMT+09:00 Marek Szyprowski :
> cma_alloc() function doesn't really support gfp flags other than
> __GFP_NOWARN, so convert gfp_mask parameter to boolean no_warn parameter.

Although gfp_mask isn't used in cma_alloc() except for no_warn, it can be used
in alloc_contig_range(). For example, if the passed gfp mask lacks __GFP_FS,
compaction (isolation) would work differently. Have you considered
such a case?

Thanks.
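
The concern above can be illustrated with a small userspace sketch: once the mask is reduced to a boolean, a caller's "no FS" restriction can no longer reach the lower layers. The DEMO_GFP_* bit values and the demo_* helpers are hypothetical and are not the kernel's __GFP_* definitions or the cma_alloc() code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the kernel's real __GFP_* values are not used. */
#define DEMO_GFP_NOWARN	(1u << 0)
#define DEMO_GFP_FS	(1u << 1)
#define DEMO_GFP_IO	(1u << 2)

struct demo_request {
	bool no_warn;		/* what a bool parameter can carry */
	bool may_use_fs;	/* what only the full mask can carry */
};

/* Mask-based interface: every flag reaches the lower layer. */
static struct demo_request demo_from_mask(unsigned int gfp_mask)
{
	struct demo_request r = {
		.no_warn = (gfp_mask & DEMO_GFP_NOWARN) != 0,
		.may_use_fs = (gfp_mask & DEMO_GFP_FS) != 0,
	};
	return r;
}

/* Bool-based interface: the FS restriction has to be assumed. */
static struct demo_request demo_from_bool(bool no_warn)
{
	struct demo_request r = { .no_warn = no_warn, .may_use_fs = true };
	return r;
}

void demo_gfp_check(void)
{
	unsigned int nofs = DEMO_GFP_NOWARN | DEMO_GFP_IO; /* no DEMO_GFP_FS */
	struct demo_request full = demo_from_mask(nofs);
	struct demo_request reduced = demo_from_bool((nofs & DEMO_GFP_NOWARN) != 0);

	assert(full.no_warn && reduced.no_warn);	/* both keep no_warn */
	assert(!full.may_use_fs);			/* restriction kept */
	assert(reduced.may_use_fs);			/* restriction lost */
}
```

Whether that loss matters depends on whether any caller actually passes such a restricted mask.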


Re: [PATCH] Documentation: Add powerpc options for spec_store_bypass_disable

2018-07-10 Thread Thomas Gleixner
On Tue, 10 Jul 2018, Michael Ellerman wrote:

> Document the support for spec_store_bypass_disable that was added for
> powerpc in commit a048a07d7f45 ("powerpc/64s: Add support for a store
> forwarding barrier at kernel entry/exit").
> 
> Signed-off-by: Michael Ellerman 
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> I tried documenting the differences between the PPC options and X86 ones in
> one section, but it got quite messy, so I went with this instead. Happy to take
> advice on how better to structure it if anyone has opinions.

Looks good to me.

Acked-by: Thomas Gleixner