Re: [PATCH] add POWER Virtual Management Channel driver

2016-02-17 Thread Stewart Smith
Steven Royer  writes:
> On 2016-02-17 16:31, Greg Kroah-Hartman wrote:
>> On Wed, Feb 17, 2016 at 03:18:26PM -0600, Steven Royer wrote:
>>> On 2016-02-16 16:18, Greg Kroah-Hartman wrote:
>>> >On Tue, Feb 16, 2016 at 02:43:13PM -0600, Steven Royer wrote:
>>> >>From: Steven Royer 
>>> >>
>>> >>The ibmvmc driver is a device driver for the POWER Virtual Management
>>> >>Channel virtual adapter on the PowerVM platform.  It is used to
>>> >>communicate with the hypervisor for virtualization management.  It
>>> >>provides both request/response and asynchronous message support through
>>> >>the /dev/ibmvmc node.
>>> >
>>> >What is the protocol for that device node?
>>> The protocol is not currently published.  I am pushing on getting it
>>> published, but that process will take time.  If you have a PowerVM 
>>> system
>>> with NovaLink, it would not be hard to reverse engineer it...  If you 
>>> don't
>>> have a PowerVM system, then this driver isn't interesting anyway...

Steven - if you need some help pushing for it to be published, let me
know, there are a few internal things I could help push.

>> You can't just expect us to review this code without at least having a
>> clue as to how it is supposed to work?
> There are two layers to the protocol.  The first layer is the only layer 
> that the driver actually cares about.  The second layer is just a 
> payload that is between the application and the hypervisor and can 
> change independently from the kernel/driver (this is what is transported 
> over the /dev/ibmvmc node).  The first layer technically is published in 
> the PAPR (appendix G), but it is not trivial for most people to access

https://members.openpowerfoundation.org/document/dl/469 is LoPAPR which
has been published through OpenPower Foundation and anyone can access,
although Appendix G there is on EEH. Although VMC (Virtual Management
Channel) is mentioned in that document the details aren't there... so
it's possible that this is only in some other PAPR version :/
and... looking in internal places, it is. *sigh*

With my OpenPower Foundation hat on, I'll say that it's a
work-in-progress getting all this documentation in order.

Whether it's a sensible hypervisor-to-partition interface, and whether
it's a sensible userspace API, are both open for debate :)

Would we implement this way of communicating between a KVM guest and the
host Linux system? If not, then it's probably not a generally good
idea. That being said, it seems to be what already exists in PowerVM.

-- 
Stewart Smith
OPAL Architect, IBM.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V6 07/35] powerpc/mm: Don't have generic headers introduce functions touching pte bits

2016-02-17 Thread Balbir Singh


On 01/12/15 14:36, Aneesh Kumar K.V wrote:
> We are going to drop pte_common.h in the later patch. The idea is to
> enable hash code not require to define all PTE bits. Having PTE bits
> defined in pte_common.h made the code unnecessarily complex.
>
> Acked-by: Scott Wood 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/book3s/pgtable.h  | 176 +++
>  .../include/asm/{pgtable.h => pgtable-book3e.h}|  96 +--
>  arch/powerpc/include/asm/pgtable.h | 192 
> +
>  3 files changed, 185 insertions(+), 279 deletions(-)
>  copy arch/powerpc/include/asm/{pgtable.h => pgtable-book3e.h} (70%)
>
> diff --git a/arch/powerpc/include/asm/book3s/pgtable.h 
> b/arch/powerpc/include/asm/book3s/pgtable.h
> index 3818cc7bc9b7..fa270cfcf30a 100644
> --- a/arch/powerpc/include/asm/book3s/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/pgtable.h
> @@ -8,4 +8,180 @@
>  #endif
>  
>  #define FIRST_USER_ADDRESS   0UL
> +#ifndef __ASSEMBLY__
> +
> +/* Generic accessors to PTE bits */
> +static inline int pte_write(pte_t pte)
> +{
> + return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
> +}
> +static inline int pte_dirty(pte_t pte)   { return pte_val(pte) & 
> _PAGE_DIRTY; }
> +static inline int pte_young(pte_t pte)   { return pte_val(pte) & 
> _PAGE_ACCESSED; }
> +static inline int pte_special(pte_t pte) { return pte_val(pte) & 
> _PAGE_SPECIAL; }
> +static inline int pte_none(pte_t pte){ return (pte_val(pte) 
> & ~_PTE_NONE_MASK) == 0; }
> +static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) 
> & PAGE_PROT_BITS); }
> +
> +#ifdef CONFIG_NUMA_BALANCING
> +/*
> + * These work without NUMA balancing but the kernel does not care. See the
> + * comment in include/asm-generic/pgtable.h . On powerpc, this will only
> + * work for user pages and always return true for kernel pages.
> + */
> +static inline int pte_protnone(pte_t pte)
> +{
> + return (pte_val(pte) &
> + (_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
> +}
> +
> +static inline int pmd_protnone(pmd_t pmd)
> +{
> + return pte_protnone(pmd_pte(pmd));
> +}
> +#endif /* CONFIG_NUMA_BALANCING */
> +
> +static inline int pte_present(pte_t pte)
> +{
> + return pte_val(pte) & _PAGE_PRESENT;
> +}
> +
> +/* Conversion functions: convert a page and protection to a page entry,
Nit: comment style. For multi-line comments, shouldn't the text start on the line after the opening /*?
> + * and a page entry and page directory to the page they refer to.
> + *
> + * Even if PTEs can be unsigned long long, a PFN is always an unsigned
> + * long for now.
> + */
> +static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
> + return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
> +  pgprot_val(pgprot)); }
> +static inline unsigned long pte_pfn(pte_t pte)   {
> + return pte_val(pte) >> PTE_RPN_SHIFT; }
> +
> +/* Generic modifiers for PTE bits */
> +static inline pte_t pte_wrprotect(pte_t pte) {
> + pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
> + pte_val(pte) |= _PAGE_RO; return pte; }
> +static inline pte_t pte_mkclean(pte_t pte) {
> + pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
> +static inline pte_t pte_mkold(pte_t pte) {
> + pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
> +static inline pte_t pte_mkwrite(pte_t pte) {
> + pte_val(pte) &= ~_PAGE_RO;
> + pte_val(pte) |= _PAGE_RW; return pte; }
> +static inline pte_t pte_mkdirty(pte_t pte) {
> + pte_val(pte) |= _PAGE_DIRTY; return pte; }
> +static inline pte_t pte_mkyoung(pte_t pte) {
> + pte_val(pte) |= _PAGE_ACCESSED; return pte; }
> +static inline pte_t pte_mkspecial(pte_t pte) {
> + pte_val(pte) |= _PAGE_SPECIAL; return pte; }
> +static inline pte_t pte_mkhuge(pte_t pte) {
> + return pte; }
> +static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
> +{
> + pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
> + return pte;
> +}
> +
> +
> +/* Insert a PTE, top-level function is out of line. It uses an inline
Comment style?
> + * low level function in the respective pgtable-* files
> + */
> +extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> +pte_t pte);
> +
> +/* This low level function performs the actual PTE insertion
> + * Setting the PTE depends on the MMU type and other factors. It's
> + * an horrible mess that I'm not going to try to clean up now but
> + * I'm keeping it in one place rather than spread around
> + */
> +static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
> + pte_t *ptep, pte_t pte, int percpu)
> +{
> +#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && 
> !defined(CONFIG_PTE_64BIT)
> + /* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use 
> the
> +  
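
As an aside for readers following the accessor hunk quoted above, here is a minimal user-space sketch of how the flag tests and the pfn_pte()/pte_pfn() pair behave. The bit values and the shift are invented for illustration and do not match any real powerpc configuration; pte_t is modelled as a plain integer and every x_/X_ name is local to this example.

```c
#include <stdint.h>

/* Hypothetical bit layout, chosen only for this illustration. */
#define X_PAGE_PRESENT  0x001ULL
#define X_PAGE_USER     0x002ULL
#define X_PAGE_RW       0x004ULL
#define X_PAGE_RO       0x008ULL
#define X_PTE_RPN_SHIFT 12

typedef uint64_t xpte_t;

/* Same test as pte_write(): not writable only when RO is set and RW is clear. */
static inline int x_pte_write(xpte_t pte)
{
	return (pte & (X_PAGE_RW | X_PAGE_RO)) != X_PAGE_RO;
}

/* Same test as pte_protnone(): present but not user-accessible. */
static inline int x_pte_protnone(xpte_t pte)
{
	return (pte & (X_PAGE_PRESENT | X_PAGE_USER)) == X_PAGE_PRESENT;
}

/* Build a PTE from a page frame number and protection bits. */
static inline xpte_t x_pfn_pte(uint64_t pfn, uint64_t prot)
{
	return (pfn << X_PTE_RPN_SHIFT) | prot;
}

/* Recover the page frame number from a PTE. */
static inline uint64_t x_pte_pfn(xpte_t pte)
{
	return pte >> X_PTE_RPN_SHIFT;
}

/* Same shape as pte_wrprotect(): drop RW, set RO. */
static inline xpte_t x_pte_wrprotect(xpte_t pte)
{
	pte &= ~X_PAGE_RW;
	pte |= X_PAGE_RO;
	return pte;
}
```

Building a PTE with x_pfn_pte() and reading it back with x_pte_pfn() returns the original frame number as long as the protection bits fit below the shift.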

Re: [PATCH V6 05/35] powerpc/mm: Move hash specific pte width and other defines to book3s

2016-02-17 Thread Balbir Singh


On 01/12/15 14:36, Aneesh Kumar K.V wrote:
> This further make a copy of pte defines to book3s/64/hash*.h. This
> remove the dependency on pgtable-ppc64-4k.h and pgtable-ppc64-64k.h
>
> Acked-by: Scott Wood 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  | 86 
> ++-
>  arch/powerpc/include/asm/book3s/64/hash-64k.h | 46 +-
>  arch/powerpc/include/asm/book3s/64/pgtable.h  |  6 +-
>  3 files changed, 129 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
> b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index c134e809aac3..f2c51cd61f69 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -1,4 +1,51 @@
> -/* To be include by pgtable-hash64.h only */
> +#ifndef _ASM_POWERPC_BOOK3S_64_HASH_4K_H
> +#define _ASM_POWERPC_BOOK3S_64_HASH_4K_H
> +/*
> + * Entries per page directory level.  The PTE level must use a 64b record
> + * for each page table entry.  The PMD and PGD level use a 32b record for
> + * each entry by assuming that each entry is page aligned.
> + */

More clarity on this please
> +#define PTE_INDEX_SIZE  9
> +#define PMD_INDEX_SIZE  7
> +#define PUD_INDEX_SIZE  9
> +#define PGD_INDEX_SIZE  9
> +
We use 9+7+9+9+(12) bits?
> +#ifndef __ASSEMBLY__
> +#define PTE_TABLE_SIZE   (sizeof(pte_t) << PTE_INDEX_SIZE)
> +#define PMD_TABLE_SIZE   (sizeof(pmd_t) << PMD_INDEX_SIZE)
> +#define PUD_TABLE_SIZE   (sizeof(pud_t) << PUD_INDEX_SIZE)
> +#define PGD_TABLE_SIZE   (sizeof(pgd_t) << PGD_INDEX_SIZE)
> +#endif   /* __ASSEMBLY__ */
> +
> +#define PTRS_PER_PTE (1 << PTE_INDEX_SIZE)
> +#define PTRS_PER_PMD (1 << PMD_INDEX_SIZE)
> +#define PTRS_PER_PUD (1 << PUD_INDEX_SIZE)
> +#define PTRS_PER_PGD (1 << PGD_INDEX_SIZE)
> +
> +/* PMD_SHIFT determines what a second-level page table entry can map */
> +#define PMD_SHIFT(PAGE_SHIFT + PTE_INDEX_SIZE)
> +#define PMD_SIZE (1UL << PMD_SHIFT)
> +#define PMD_MASK (~(PMD_SIZE-1))
> +
> +/* With 4k base page size, hugepage PTEs go at the PMD level */
> +#define MIN_HUGEPTE_SHIFTPMD_SHIFT
> +
> +/* PUD_SHIFT determines what a third-level page table entry can map */
> +#define PUD_SHIFT(PMD_SHIFT + PMD_INDEX_SIZE)
> +#define PUD_SIZE (1UL << PUD_SHIFT)
> +#define PUD_MASK (~(PUD_SIZE-1))
> +
> +/* PGDIR_SHIFT determines what a fourth-level page table entry can map */
> +#define PGDIR_SHIFT  (PUD_SHIFT + PUD_INDEX_SIZE)
> +#define PGDIR_SIZE   (1UL << PGDIR_SHIFT)
> +#define PGDIR_MASK   (~(PGDIR_SIZE-1))
> +
> +/* Bits to mask out from a PMD to get to the PTE page */
> +#define PMD_MASKED_BITS  0
> +/* Bits to mask out from a PUD to get to the PMD page */
> +#define PUD_MASKED_BITS  0
> +/* Bits to mask out from a PGD to get to the PUD page */
> +#define PGD_MASKED_BITS  0
>  
Don't get why these are all 0?
>  /* PTE bits */
>  #define _PAGE_HASHPTE0x0400 /* software: pte has an associated HPTE 
> */
> @@ -15,3 +62,40 @@
>  /* shift to put page number into pte */
>  #define PTE_RPN_SHIFT(17)
>  
> +#ifndef __ASSEMBLY__
> +/*
> + * 4-level page tables related bits
> + */
> +
> +#define pgd_none(pgd)(!pgd_val(pgd))
> +#define pgd_bad(pgd) (pgd_val(pgd) == 0)
> +#define pgd_present(pgd) (pgd_val(pgd) != 0)
> +#define pgd_clear(pgdp)  (pgd_val(*(pgdp)) = 0)
> +#define pgd_page_vaddr(pgd)  (pgd_val(pgd) & ~PGD_MASKED_BITS)
> +
> +static inline pte_t pgd_pte(pgd_t pgd)
> +{
> + return __pte(pgd_val(pgd));
> +}
> +
> +static inline pgd_t pte_pgd(pte_t pte)
> +{
> + return __pgd(pte_val(pte));
> +}
> +extern struct page *pgd_page(pgd_t pgd);
> +
> +#define pud_offset(pgdp, addr)   \
> +  (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
> +(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
> +


> +#define pud_ERROR(e) \
> + pr_err("%s:%d: bad pud %08lx.\n", __FILE__, __LINE__, pud_val(e))
> +
> +/*
> + * On all 4K setups, remap_4k_pfn() equates to remap_pfn_range() */
> +#define remap_4k_pfn(vma, addr, pfn, prot)   \
> + remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
> +
> +#endif /* !__ASSEMBLY__ */
> +
> +#endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
> b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 4f4ec2ab45c9..ee073822145d 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -1,4 +1,35 @@
> -/* To be include by pgtable-hash64.h only */
> +#ifndef _ASM_POWERPC_BOOK3S_64_HASH_64K_H
> +#define _ASM_POWERPC_BOOK3S_64_HASH_64K_H
> +
> +#include 
> +
> +#define PTE_INDEX_SIZE  8
> +#define PMD_INDEX_SIZE  10
> +#define PUD_INDEX_SIZE   0
> +#define PGD_INDEX_SIZE  12
> +

OK.. So there is PGD but no PUD?
> +#define PTRS_PER_PTE (1 << PTE_INDEX_SIZE)
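
To make the index-size question concrete, this sketch redoes the shift arithmetic from the quoted 4K header in plain C. It assumes PAGE_SHIFT is 12 (4K pages), as the "(12)" in the question suggests; the X_ prefixed names exist only in this example.

```c
/* Index sizes copied from the quoted hash-4k.h hunk. */
#define X_PAGE_SHIFT      12   /* assumption: 4K base pages */
#define X_PTE_INDEX_SIZE  9
#define X_PMD_INDEX_SIZE  7
#define X_PUD_INDEX_SIZE  9
#define X_PGD_INDEX_SIZE  9

#define X_PMD_SHIFT    (X_PAGE_SHIFT + X_PTE_INDEX_SIZE)    /* 21 */
#define X_PUD_SHIFT    (X_PMD_SHIFT + X_PMD_INDEX_SIZE)     /* 28 */
#define X_PGDIR_SHIFT  (X_PUD_SHIFT + X_PUD_INDEX_SIZE)     /* 37 */

/* Total virtual address bits covered by one page table tree. */
static inline unsigned int x_va_bits(void)
{
	return X_PGDIR_SHIFT + X_PGD_INDEX_SIZE;   /* 46 */
}

/* Index into the PUD page for an address, as pud_offset() computes it. */
static inline unsigned long x_pud_index(unsigned long long addr)
{
	return (addr >> X_PUD_SHIFT) & ((1UL << X_PUD_INDEX_SIZE) - 1);
}
```

With these values the tree covers 9+7+9+9+12 = 46 bits of address space. The 64K sizes (8, 10, 0, 12) add up to the same 46 bits if the page offset is 16 bits there, which is an assumption about PAGE_SHIFT rather than something the quoted hunk states.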

Re: [PATCH 2/2] powerpc: Add POWER9 cputable entry

2016-02-17 Thread Michael Neuling
On Wed, 2016-02-17 at 22:09 +1100, Michael Ellerman wrote:
> On Wed, 2016-02-17 at 16:07 +1100, Michael Neuling wrote:
> 
> > Add a cputable entry for POWER9.  More code is required to actually
> > boot and run on a POWER9 but this gets the base piece in which we
> > can
> > start building on.
> > 
> > Copies over from POWER8 except for:
> > - Adds a new CPU_FTR_ARCH_30 bit to start hanging new architecture
> 
> ARCH thirty?
> 
> Would CPU_FTR_ARCH_3 read better?
> 
> Or CPU_FTR_ARCH_3_00 ?

The actual architecture book used to say 2.07 but now says just 3.0.
Hence why I picked 30 vs 207.

That being said, I don't really care what we call it.

> 
> > diff --git a/arch/powerpc/include/asm/cputable.h
> > b/arch/powerpc/include/asm/cputable.h
> > index a47e175..7fb238c 100644
> > --- a/arch/powerpc/include/asm/cputable.h
> > +++ b/arch/powerpc/include/asm/cputable.h
> > @@ -171,7 +171,7 @@ enum {
> >  #define CPU_FTR_ARCH_201   LONG_ASM_CONST(0x00020
> > 000)
> >  #define CPU_FTR_ARCH_206   LONG_ASM_CONST(0x00040
> > 000)
> >  #define CPU_FTR_ARCH_207S  LONG_ASM_CONST(0x0008
> > )
> > -/* FreeLONG_ASM_CONST(0x00
> > 10) */
> > +#define CPU_FTR_ARCH_30LONG_ASM_CONST(0x00
> > 10)
> >  #define CPU_FTR_MMCRA  LONG_ASM_CONST(0x
> > 0020)
> >  #define CPU_FTR_CTRL   LONG_ASM_CONST(0x0
> > 040)
> >  #define CPU_FTR_SMTLONG_ASM_CONST(0x00
> > 80)
> > @@ -447,6 +447,16 @@ enum {
> > CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_SUBCORE)
> >  #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
> >  #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL)
> > +#define CPU_FTRS_POWER9 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
> > +   CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL |
> > CPU_FTR_ARCH_206 |\
> > +   CPU_FTR_MMCRA | CPU_FTR_SMT | \
> > +   CPU_FTR_COHERENT_ICACHE | \
> > +   CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
> > +   CPU_FTR_DSCR | CPU_FTR_SAO  | \
> > +   CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB |
> > CPU_FTR_POPCNTD | \
> > +   CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE |
> > CPU_FTR_VMX_COPY | \
> > +   CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
> > +   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_30)
> >  #define CPU_FTRS_CELL  (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
> > CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
> > CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> > @@ -465,7 +475,7 @@ enum {
> > (CPU_FTRS_POWER4 | CPU_FTRS_PPC970 | CPU_FTRS_POWER5 |
> > \
> >  CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E
> > | \
> >  CPU_FTRS_POWER8 | CPU_FTRS_POWER8_DD1 | CPU_FTRS_CELL
> > | \
> > -CPU_FTRS_PA6T | CPU_FTR_VSX)
> > +CPU_FTRS_PA6T | CPU_FTR_VSX | CPU_FTRS_POWER9)
> >  #endif
> 
> That's you adding it to CPU_FTRS_POSSIBLE I think.
> 
> But you forgot to add it to CPU_FTRS_ALWAYS.

OK, thanks, I'll fix
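
For anyone unfamiliar with the two masks: as I understand cputable.h, CPU_FTRS_POSSIBLE is the OR of every supported CPU's feature set and CPU_FTRS_ALWAYS is the AND of them, so a new CPU has to be added to both. A small sketch with invented feature bits and CPU sets (none of these values come from the real header):

```c
/* Invented feature bits, for illustration only. */
#define X_FTR_A  0x1UL
#define X_FTR_B  0x2UL
#define X_FTR_C  0x4UL

/* Invented per-CPU feature sets. */
#define X_FTRS_OLD  (X_FTR_A | X_FTR_B)
#define X_FTRS_NEW  (X_FTR_A | X_FTR_C)

/* Anything some supported CPU might have: OR of all sets. */
#define X_FTRS_POSSIBLE  (X_FTRS_OLD | X_FTRS_NEW)
/* Only what every supported CPU has: AND of all sets. */
#define X_FTRS_ALWAYS    (X_FTRS_OLD & X_FTRS_NEW)

/* The runtime check is skipped when the answer is known at build time. */
static inline int x_has_feature(unsigned long cur_features, unsigned long ftr)
{
	if (X_FTRS_ALWAYS & ftr)
		return 1;               /* every CPU has it */
	if (!(X_FTRS_POSSIBLE & ftr))
		return 0;               /* no supported CPU has it */
	return (cur_features & ftr) != 0;
}
```

If X_FTRS_NEW were left out of the AND, X_FTR_B would wrongly count as always present and x_has_feature() would return 1 for it even on the new CPU.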

> 
> > diff --git a/arch/powerpc/include/asm/mmu-hash64.h
> > b/arch/powerpc/include/asm/mmu-hash64.h
> > index 7352d3f..e36dc90 100644
> > --- a/arch/powerpc/include/asm/mmu-hash64.h
> > +++ b/arch/powerpc/include/asm/mmu-hash64.h
> > @@ -114,6 +114,7 @@
> >  
> >  #define POWER7_TLB_SETS128 /* # sets in
> > POWER7 TLB */
> >  #define POWER8_TLB_SETS512 /* # sets in
> > POWER8 TLB */
> > +#define POWER9_TLB_SETS_HASH   256 /* # sets in POWER9
> > TLB Hash mode */
> >  
> >  #ifndef __ASSEMBLY__
> >  
> > diff --git a/arch/powerpc/include/asm/mmu.h
> > b/arch/powerpc/include/asm/mmu.h
> > index 3d5abfe..54d4650 100644
> > --- a/arch/powerpc/include/asm/mmu.h
> > +++ b/arch/powerpc/include/asm/mmu.h
> > @@ -97,6 +97,7 @@
> >  #define MMU_FTRS_POWER6MMU_FTRS_POWER4 |
> > MMU_FTR_LOCKLESS_TLBIE
> >  #define MMU_FTRS_POWER7MMU_FTRS_POWER4 |
> > MMU_FTR_LOCKLESS_TLBIE
> >  #define MMU_FTRS_POWER8MMU_FTRS_POWER4 |
> > MMU_FTR_LOCKLESS_TLBIE
> > +#define MMU_FTRS_POWER9MMU_FTRS_POWER4 |
> > MMU_FTR_LOCKLESS_TLBIE
> >  #define MMU_FTRS_CELL  MMU_FTRS_DEFAULT_HPTE_ARCH_V2
> > | \
> > MMU_FTR_CI_LARGE_PAGE
> >  #define MMU_FTRS_PA6T  MMU_FTRS_DEFAULT_HPTE_ARCH_V2
> > | \
> > diff --git a/arch/powerpc/kernel/cpu_setup_power.S
> > b/arch/powerpc/kernel/cpu_setup_power.S
> > index 9c9b741..1785480 100644
> > --- a/arch/powerpc/kernel/cpu_setup_power.S
> > +++ b/arch/powerpc/kernel/cpu_setup_power.S
> > @@ -83,6 +83,43 @@ _GLOBAL(__restore_cpu_power8)
> > mtlrr11
> > blr
> >  
> > +_GLOBAL(__setup_cpu_power9)
> > +   mflrr11
> > +   bl  __init_FSCR
> > +   bl  __init_PMU
> 
> You might be better off leaving the PMU alone until we have a P9
> perf implementation?

ok, I'll drop 

Re: [PATCH V6 04/35] powerpc/mm: make a separate copy for book3s (part 2)

2016-02-17 Thread Balbir Singh
On Tue, 2015-12-01 at 09:06 +0530, Aneesh Kumar K.V wrote:
> Keep it seperate to make rebasing easier
> 
> Acked-by: Scott Wood 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/book3s/32/pgtable.h | 6 +++---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 6 +++---
>  arch/powerpc/include/asm/pgtable-ppc32.h | 2 --
>  arch/powerpc/include/asm/pgtable-ppc64.h | 4 
>  4 files changed, 6 insertions(+), 12 deletions(-)
> 

One side effect is that someone could by mistake include both
asm/book3s/32/pgtable.h and asm/pgtable-ppc32.h, get duplicate
definitions, and have the compiler complain.

Ditto for the 64-bit headers.

Balbir

Re: [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory

2016-02-17 Thread Balbir Singh
On Tue, 2015-12-01 at 09:06 +0530, Aneesh Kumar K.V wrote:
> Acked-by: Scott Wood 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} | 0
>  arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} | 0
>  arch/powerpc/include/asm/pgtable-ppc32.h| 2 +-
>  arch/powerpc/include/asm/pgtable-ppc64.h| 2 +-
>  4 files changed, 2 insertions(+), 2 deletions(-)
>  rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (100%)
>  rename arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} (100%)

Why not pte-hash.h?

Balbir Singh.


[RESEND][PATCH] Fix BUG_ON() reporting in real mode on PPC

2016-02-17 Thread Balbir Singh
Changelog:
Don't use REGION_ID, breaks on some platforms
Don't blindly add PAGE_OFFSET to bugaddr

I ran into this issue while debugging an early boot problem.
The system hit a BUG_ON(), but report_bug() failed to print the
line number and file name. The reason is that the system
was running in real mode, while report_bug() searches for
addresses in the PAGE_OFFSET+ region.

Suggested-by: Paul Mackerras 
Signed-off-by: Balbir Singh 
---
 arch/powerpc/kernel/traps.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index b6becc7..4e5c11d 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1148,6 +1148,7 @@ void __kprobes program_check_exception(struct pt_regs 
*regs)
goto bail;
}
if (reason & REASON_TRAP) {
+   unsigned long bugaddr;
/* Debugger is first in line to stop recursive faults in
 * rcu_lock, notify_die, or atomic_notifier_call_chain */
if (debugger_bpt(regs))
@@ -1158,8 +1159,15 @@ void __kprobes program_check_exception(struct pt_regs 
*regs)
== NOTIFY_STOP)
goto bail;
 
+   bugaddr = regs->nip;
+   /*
+* Fixup bugaddr for BUG_ON() in real mode
+*/
+   if (!is_kernel_addr(bugaddr) && !(regs->msr & MSR_IR))
+   bugaddr += PAGE_OFFSET;
+
if (!(regs->msr & MSR_PR) &&  /* not user-mode */
-   report_bug(regs->nip, regs) == BUG_TRAP_TYPE_WARN) {
+   report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
regs->nip += 4;
goto bail;
}
-- 
2.5.0
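
The address fixup in the patch above can be tried out in user space with the sketch below. The PAGE_OFFSET and MSR_IR values are assumptions for a 64-bit kernel and should be checked against the headers for a given config; the x_ helpers are local to this example and only mimic is_kernel_addr() and the new check.

```c
#include <stdint.h>

/* Assumed values for a 64-bit kernel; not taken from the patch. */
#define X_PAGE_OFFSET  0xc000000000000000ULL
#define X_MSR_IR       (1ULL << 5)   /* instruction relocation on */

/* Stand-in for is_kernel_addr(): at or above PAGE_OFFSET. */
static inline int x_is_kernel_addr(uint64_t addr)
{
	return addr >= X_PAGE_OFFSET;
}

/*
 * Mirror of the fixup in the patch: if the trap happened with instruction
 * relocation off and nip is not already a kernel virtual address, turn it
 * into one so that a lookup keyed on virtual addresses can match.
 */
static inline uint64_t x_fixup_bugaddr(uint64_t nip, uint64_t msr)
{
	uint64_t bugaddr = nip;

	if (!x_is_kernel_addr(bugaddr) && !(msr & X_MSR_IR))
		bugaddr += X_PAGE_OFFSET;
	return bugaddr;
}
```

A real-mode nip such as 0x8000 with MSR_IR clear is shifted up by PAGE_OFFSET; an address that is already virtual, or a trap taken with MSR_IR set, is left alone.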


Re: Fix BUG_ON() reporting in real mode on powerpc

2016-02-17 Thread Aneesh Kumar K.V
Balbir Singh  writes:

>>> Changelog:
>>>  Don't add PAGE_OFFSET blindly, check if REGION_ID is 0
>>>
>>> I ran into this issue while debugging an early boot problem.
>>> The system hit a BUG_ON() but report bug failed to print the
>>> line number and file name. The reason being that the system
>>> was running in real mode and report_bug() searches for
>>> addresses in the PAGE_OFFSET+ region
>>>
>>> Suggested-by: Paul Mackerras 
>>> Signed-off-by: Balbir Singh 
>
> 
>
>> Can we add some comments around this. When i looked at this first, i was
>> wondering how nip can be in user region. But then realized that what we
>> are checking here is kernel address used in real mode. The use of
>> REGION_ID eventhough simpler is confusing. Hence adding the comment with
>> details Paul mentioned in email will help.
>>
>>
> I've tried and covered it in the changelog, I thought a code comment
> would make sense for the very non  obvious cases and not repeat what
> the code does as comment
>

The use of REGION_ID indicates that you are checking for a region, hence
the suggestion. Looking at this again, I suggest we either add a new
macro or open code the check, because in the radix series we make
REGION_ID a hash config thing and this is generic code.

-aneesh


Re: [PATCH kernel v3 7/7] KVM: PPC: Add support for multiple-TCE hcalls

2016-02-17 Thread Alexey Kardashevskiy

On 02/15/2016 12:55 PM, Alexey Kardashevskiy wrote:

This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and
H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO
devices or emulated PCI. These calls allow adding multiple entries
(up to 512) into the TCE table in one call which saves time on
transition between kernel and user space.

The current implementation of kvmppc_h_stuff_tce() allows it to be
executed in both real and virtual modes so there is one helper.
The kvmppc_rm_h_put_tce_indirect() needs to translate the guest address
to the host address and since the translation is different, there are
2 helpers - one for each mode.

This implements the KVM_CAP_PPC_MULTITCE capability. When present,
the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE if these
are enabled by the userspace via KVM_CAP_PPC_ENABLE_HCALL.
If they can not be handled by the kernel, they are passed on to
the user space. The user space still has to have an implementation
for these.

Both HV and PR-style KVM are supported.

Signed-off-by: Alexey Kardashevskiy 
---

[skip]


diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
b/arch/powerpc/kvm/book3s_64_vio_hv.c
index b608fdd..0486aa2 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -14,6 +14,7 @@
   *
   * Copyright 2010 Paul Mackerras, IBM Corp. 
   * Copyright 2011 David Gibson, IBM Corporation 
+ * Copyright 2016 Alexey Kardashevskiy, IBM Corporation 
   */

  #include 
@@ -30,6 +31,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -37,6 +39,7 @@
  #include 
  #include 
  #include 
+#include 

  #define TCES_PER_PAGE (PAGE_SIZE / sizeof(u64))

@@ -46,7 +49,7 @@
   * WARNING: This will be called in real or virtual mode on HV KVM and virtual
   *  mode on PR KVM
   */
-static struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm_vcpu *vcpu,
+struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm_vcpu *vcpu,
unsigned long liobn)
  {
struct kvm *kvm = vcpu->kvm;
@@ -58,6 +61,7 @@ static struct kvmppc_spapr_tce_table 
*kvmppc_find_table(struct kvm_vcpu *vcpu,

return NULL;
  }
+EXPORT_SYMBOL_GPL(kvmppc_find_table);

  /*
   * Validates IO address.
@@ -151,9 +155,29 @@ void kvmppc_tce_put(struct kvmppc_spapr_tce_table *stt,
  }
  EXPORT_SYMBOL_GPL(kvmppc_tce_put);

-/* WARNING: This will be called in real-mode on HV KVM and virtual
- *  mode on PR KVM
- */
+long kvmppc_gpa_to_ua(struct kvm *kvm, unsigned long gpa,
+   unsigned long *ua, unsigned long **prmap)
+{
+   unsigned long gfn = gpa >> PAGE_SHIFT;
+   struct kvm_memory_slot *memslot;
+
+   memslot = search_memslots(kvm_memslots(kvm), gfn);
+   if (!memslot)
+   return -EINVAL;
+
+   *ua = __gfn_to_hva_memslot(memslot, gfn) |
+   (gpa & ~(PAGE_MASK | TCE_PCI_READ | TCE_PCI_WRITE));
+
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+   if (prmap)
+   *prmap = &memslot->arch.rmap[gfn - memslot->base_gfn];
+#endif
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(kvmppc_gpa_to_ua);
+
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
  long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
  unsigned long ioba, unsigned long tce)
  {
@@ -180,6 +204,122 @@ long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned 
long liobn,
  }
  EXPORT_SYMBOL_GPL(kvmppc_h_put_tce);

+static long kvmppc_rm_ua_to_hpa(struct kvm_vcpu *vcpu,
+   unsigned long ua, unsigned long *phpa)
+{
+   pte_t *ptep, pte;
+   unsigned shift = 0;
+
+   ptep = __find_linux_pte_or_hugepte(vcpu->arch.pgdir, ua, NULL, &shift);




The latest powerkvm kernel passes @thp instead of NULL and checks for it 
below in addition to (shift > PAGE_SHIFT); should that be fixed here as well?


Is it possible for __find_linux_pte_or_hugepte() to return thp==true but 
shift<=PAGE_SHIFT, assuming we call it on vcpu->arch.pgdir, not an ordinary 
task pgdir?





+   if (!ptep || !pte_present(*ptep))
+   return -ENXIO;
+   pte = *ptep;
+
+   if (!shift)
+   shift = PAGE_SHIFT;
+
+   /* Avoid handling anything potentially complicated in realmode */
+   if (shift > PAGE_SHIFT)
+   return -EAGAIN;
+
+   if (!pte_young(pte))
+   return -EAGAIN;
+
+   *phpa = (pte_pfn(pte) << PAGE_SHIFT) | (ua & ((1ULL << shift) - 1)) |
+   (ua & ~PAGE_MASK);
+
+   return 0;
+}



--
Alexey
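
For reference, the address composition at the end of kvmppc_rm_ua_to_hpa() can be exercised on its own with the sketch below. It assumes 4K pages purely to keep the numbers small; the X_ macros and x_ua_to_hpa() are names made up for this example.

```c
#include <stdint.h>

#define X_PAGE_SHIFT  12   /* assumption: 4K pages for this example */
#define X_PAGE_SIZE   (1ULL << X_PAGE_SHIFT)
#define X_PAGE_MASK   (~(X_PAGE_SIZE - 1))

/*
 * Combine a page frame number with the in-page offset of a userspace
 * address, the way the last statement of kvmppc_rm_ua_to_hpa() does.
 */
static inline uint64_t x_ua_to_hpa(uint64_t pfn, uint64_t ua, unsigned int shift)
{
	return (pfn << X_PAGE_SHIFT) | (ua & ((1ULL << shift) - 1)) |
		(ua & ~X_PAGE_MASK);
}
```

Since the function has already returned -EAGAIN for shift > PAGE_SHIFT, shift is expected to equal PAGE_SHIFT at that point, in which case the two offset terms select the same low bits of ua.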

Re: [RESEND][PATCH] Fix BUG_ON() reporting in real mode on PPC

2016-02-17 Thread kbuild test robot
Hi Balbir,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.5-rc4 next-20160217]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improving the system]

url:
https://github.com/0day-ci/linux/commits/Balbir-Singh/Fix-BUG_ON-reporting-in-real-mode-on-PPC/20160218-082259
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allnoconfig (attached as .config)
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/traps.c: In function 'program_check_exception':
>> arch/powerpc/kernel/traps.c:1163:8: error: implicit declaration of function 
>> 'REGION_ID' [-Werror=implicit-function-declaration]
  if ((REGION_ID(bugaddr) == 0) && !(regs->msr & MSR_IR))
   ^
   cc1: all warnings being treated as errors

vim +/REGION_ID +1163 arch/powerpc/kernel/traps.c

  1157  /* trap exception */
  1158  if (notify_die(DIE_BPT, "breakpoint", regs, 5, 5, 
SIGTRAP)
  1159  == NOTIFY_STOP)
  1160  goto bail;
  1161  
  1162  bugaddr = regs->nip;
> 1163  if ((REGION_ID(bugaddr) == 0) && !(regs->msr & MSR_IR))
  1164  bugaddr += PAGE_OFFSET;
  1165  
  1166  if (!(regs->msr & MSR_PR) &&  /* not user-mode */

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t

2016-02-17 Thread Aneesh Kumar K.V
Paul Mackerras  writes:

> On Tue, Dec 01, 2015 at 09:06:45AM +0530, Aneesh Kumar K.V wrote:
>> This free up 11 bits in pte_t. In the later patch we also change
>> the pte_t format so that we can start supporting migration pte
>> at pmd level. We now track 4k subpage valid bit as below
>> 
>> If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT
>> and _PAGE_F_SECOND. Together we have 4 bits, each of them
>> used to indicate whether any of the 4 4k subpage in that group
>> is valid. ie,
>> 
>> [ group 1 bit ]   [ group 2 bit ]  . [ group 4 ]
>> [ subpage 1 - 4]  [ subpage 5- 8]  . [ subpage 13 - 16]
>> 
>> We still track each 4k subpage slot number and secondary hash
>> information in the second half of pgtable_t. Removing the subpage
>> tracking have some significant overhead on aim9 and ebizzy benchmark and
>> to support THP with 4K subpage, we do need a pgtable_t of 4096 bytes.
>
> I know this has already been applied, but this hunk looks wrong:
>
>> @@ -102,7 +131,7 @@ int __hash_page_4K(unsigned long ea, unsigned long 
>> access, unsigned long vsid,
>>   */
>>  if (!(old_pte & _PAGE_COMBO)) {
>>  flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
>> -old_pte &= ~_PAGE_HPTE_SUB;
>> +old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
>
> Shouldn't this be:
>
> + old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>
> instead?

Thanks for checking this closely. Yes it should be what you suggested. I
will do a patch for this.
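
The difference between the two forms is purely operator precedence, which the following sketch shows with invented bit values (the real _PAGE_* constants are not reproduced here, and the x_ functions are hypothetical helpers for this example only).

```c
/* Invented bit values; only the operator precedence matters here. */
#define X_PAGE_HASHPTE   0x0400UL
#define X_PAGE_F_GIX     0x7000UL
#define X_PAGE_F_SECOND  0x8000UL

/* As applied: ~ binds tighter than |, so only HASHPTE is inverted. */
static inline unsigned long x_clear_as_applied(unsigned long old_pte)
{
	old_pte &= ~X_PAGE_HASHPTE | X_PAGE_F_GIX | X_PAGE_F_SECOND;
	return old_pte;
}

/* As suggested: invert the whole group, clearing all three fields. */
static inline unsigned long x_clear_as_suggested(unsigned long old_pte)
{
	old_pte &= ~(X_PAGE_HASHPTE | X_PAGE_F_GIX | X_PAGE_F_SECOND);
	return old_pte;
}
```

With these values, the applied form clears only the HASHPTE bit and leaves the GIX and SECOND fields set, while the parenthesised form clears all three.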

-aneesh


[PATCH v2 7/7] T104xQDS: Add qe node to t104xqds

2016-02-17 Thread Zhao Qiang
Add a qe node to t104xqds.dtsi.

Signed-off-by: Zhao Qiang 
---
Changes for v2
- rebase

 arch/powerpc/boot/dts/fsl/t104xqds.dtsi | 40 +
 1 file changed, 40 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/t104xqds.dtsi 
b/arch/powerpc/boot/dts/fsl/t104xqds.dtsi
index 1498d1e..1a8e60d 100644
--- a/arch/powerpc/boot/dts/fsl/t104xqds.dtsi
+++ b/arch/powerpc/boot/dts/fsl/t104xqds.dtsi
@@ -190,4 +190,44 @@
  0 0x0001>;
};
};
+
+   qe: qe@ffe14 {
+   ranges = <0x0 0xf 0xfe14 0x4>;
+   reg = <0xf 0xfe14 0 0x480>;
+   brg-frequency = <0>;
+   bus-frequency = <0>;
+
+   si1: si@700 {
+   compatible = "fsl,qe-si";
+   reg = <0x700 0x80>;
+   };
+
+   siram1: siram@1000 {
+   compatible = "fsl,qe-siram";
+   reg = <0x1000 0x800>;
+   };
+
+   ucc_hdlc: ucc@2000 {
+   compatible = "fsl,ucc-hdlc";
+   rx-clock-name = "clk8";
+   tx-clock-name = "clk9";
+   fsl,rx-sync-clock = "rsync_pin";
+   fsl,tx-sync-clock = "tsync_pin";
+   fsl,tx-timeslot = <0xfffe>;
+   fsl,rx-timeslot = <0xfffe>;
+   fsl,tdm-framer-type = "e1";
+   fsl,tdm-mode = "normal";
+   fsl,tdm-id = <0>;
+   fsl,siram-entry-id = <0>;
+   fsl,tdm-interface;
+   };
+
+   ucc_serial: ucc@2200 {
+   device_type = "serial";
+   compatible = "ucc_uart";
+   port-number = <1>;
+   rx-clock-name = "brg2";
+   tx-clock-name = "brg2";
+   };
+   };
 };
-- 
2.1.0.27.g96db324


[PATCH v2 5/7] T104xD4RDB: Add qe node to t104xd4rdb

2016-02-17 Thread Zhao Qiang
Add a qe node to t104xd4rdb.dtsi and t1040si-post.dtsi.

Signed-off-by: Zhao Qiang 
---
Changes for v2
- rebase

 arch/powerpc/boot/dts/fsl/t1040si-post.dtsi | 45 +
 arch/powerpc/boot/dts/fsl/t104xd4rdb.dtsi   | 40 +
 2 files changed, 85 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/t1040si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/t1040si-post.dtsi
index e0f4da5..012f813 100644
--- a/arch/powerpc/boot/dts/fsl/t1040si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/t1040si-post.dtsi
@@ -673,3 +673,48 @@
};
};
 };
+
+&qe {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   device_type = "qe";
+   compatible = "fsl,qe";
+   fsl,qe-num-riscs = <1>;
+   fsl,qe-num-snums = <28>;
+
+   qeic: interrupt-controller@80 {
+   interrupt-controller;
+   compatible = "fsl,qe-ic";
+   #address-cells = <0>;
+   #interrupt-cells = <1>;
+   reg = <0x80 0x80>;
+   interrupts = <95 2 0 0  94 2 0 0>; //high:79 low:78
+   };
+
+   ucc@2000 {
+   cell-index = <1>;
+   reg = <0x2000 0x200>;
+   interrupts = <32>;
+   interrupt-parent = <&qeic>;
+   };
+
+   ucc@2200 {
+   cell-index = <3>;
+   reg = <0x2200 0x200>;
+   interrupts = <34>;
+   interrupt-parent = <&qeic>;
+   };
+
+   muram@1 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "fsl,qe-muram", "fsl,cpm-muram";
+   ranges = <0x0 0x1 0x6000>;
+
+   data-only@0 {
+   compatible = "fsl,qe-muram-data",
+   "fsl,cpm-muram-data";
+   reg = <0x0 0x6000>;
+   };
+   };
+};
diff --git a/arch/powerpc/boot/dts/fsl/t104xd4rdb.dtsi 
b/arch/powerpc/boot/dts/fsl/t104xd4rdb.dtsi
index 3f6d7c6..2e24322 100644
--- a/arch/powerpc/boot/dts/fsl/t104xd4rdb.dtsi
+++ b/arch/powerpc/boot/dts/fsl/t104xd4rdb.dtsi
@@ -212,4 +212,44 @@
  0 0x0001>;
};
};
+
+   qe: qe@ffe14 {
+   ranges = <0x0 0xf 0xfe14 0x4>;
+   reg = <0xf 0xfe14 0 0x480>;
+   brg-frequency = <0>;
+   bus-frequency = <0>;
+
+   si1: si@700 {
+   compatible = "fsl,qe-si";
+   reg = <0x700 0x80>;
+   };
+
+   siram1: siram@1000 {
+   compatible = "fsl,qe-siram";
+   reg = <0x1000 0x800>;
+   };
+
+   ucc_hdlc: ucc@2000 {
+   compatible = "fsl,ucc-hdlc";
+   rx-clock-name = "clk8";
+   tx-clock-name = "clk9";
+   fsl,rx-sync-clock = "rsync_pin";
+   fsl,tx-sync-clock = "tsync_pin";
+   fsl,tx-timeslot = <0xfffe>;
+   fsl,rx-timeslot = <0xfffe>;
+   fsl,tdm-framer-type = "e1";
+   fsl,tdm-mode = "normal";
+   fsl,tdm-id = <0>;
+   fsl,siram-entry-id = <0>;
+   fsl,tdm-interface;
+   };
+
+   ucc_serial: ucc@2200 {
+   device_type = "serial";
+   compatible = "ucc_uart";
+   port-number = <1>;
+   rx-clock-name = "brg2";
+   tx-clock-name = "brg2";
+   };
+   };
 };
-- 
2.1.0.27.g96db324


Re: [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t

2016-02-17 Thread Paul Mackerras
On Tue, Dec 01, 2015 at 09:06:45AM +0530, Aneesh Kumar K.V wrote:
> This free up 11 bits in pte_t. In the later patch we also change
> the pte_t format so that we can start supporting migration pte
> at pmd level. We now track 4k subpage valid bit as below
> 
> If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT
> and _PAGE_F_SECOND. Together we have 4 bits, each of them
> used to indicate whether any of the 4 4k subpage in that group
> is valid. ie,
> 
> [ group 1 bit ]   [ group 2 bit ]  . [ group 4 ]
> [ subpage 1 - 4]  [ subpage 5- 8]  . [ subpage 13 - 16]
> 
> We still track each 4k subpage slot number and secondary hash
> information in the second half of pgtable_t. Removing the subpage
> tracking have some significant overhead on aim9 and ebizzy benchmark and
> to support THP with 4K subpage, we do need a pgtable_t of 4096 bytes.

I know this has already been applied, but this hunk looks wrong:

> @@ -102,7 +131,7 @@ int __hash_page_4K(unsigned long ea, unsigned long 
> access, unsigned long vsid,
>*/
>   if (!(old_pte & _PAGE_COMBO)) {
>   flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
> - old_pte &= ~_PAGE_HPTE_SUB;
> + old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;

Shouldn't this be:

+   old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);

instead?

Paul.
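
The precedence point above can be checked with a small stand-alone sketch. The EX_* bit values below are made up for illustration and are not the real _PAGE_* definitions; only the shape of the two expressions is taken from the hunk. Unary ~ binds tighter than |, so the posted form inverts only the first mask and ends up clearing just that one bit.

```c
/* Illustrative bit values only (assumptions); the real _PAGE_* constants
 * are defined in the powerpc page table headers and differ from these. */
#define EX_PAGE_HASHPTE  0x0400UL
#define EX_PAGE_F_GIX    0x7000UL
#define EX_PAGE_F_SECOND 0x8000UL

/* Form posted in the hunk: ~ applies to EX_PAGE_HASHPTE alone, so the
 * resulting mask only clears the HASHPTE bit. */
static unsigned long ex_clear_as_posted(unsigned long pte)
{
	return pte & (~EX_PAGE_HASHPTE | EX_PAGE_F_GIX | EX_PAGE_F_SECOND);
}

/* Form suggested in the reply: the parentheses make ~ apply to the
 * combined mask, so all three fields are cleared. */
static unsigned long ex_clear_as_suggested(unsigned long pte)
{
	return pte & ~(EX_PAGE_HASHPTE | EX_PAGE_F_GIX | EX_PAGE_F_SECOND);
}
```

With every low bit set on input, the first helper leaves the GIX and SECOND fields untouched, while the second clears them along with HASHPTE.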

[PATCH v2 1/7] QE: Add IC, SI and SIRAM document to device tree bindings.

2016-02-17 Thread Zhao Qiang
Add IC, SI and SIRAM document of QE to
Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt

Signed-off-by: Zhao Qiang 
---
Changes for v2
- Add interrupt-controller in Required properties
- delete address-cells and size-cells for qe-si and qe-siram

 .../devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt  | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt 
b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt
index 4f89302..84052a7 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt
+++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt
@@ -69,6 +69,56 @@ Example:
};
  };
 
+* Interrupt Controller (IC)
+
+Required properties:
+- compatible : should be "fsl,qe-ic".
+- reg : Address range of IC register set.
+- interrupts : interrupts generated by the device.
+- interrupt-controller : this device is an interrupt controller.
+
+Example:
+
+   qeic: interrupt-controller@80 {
+   interrupt-controller;
+   compatible = "fsl,qe-ic";
+   #address-cells = <0>;
+   #interrupt-cells = <1>;
+   reg = <0x80 0x80>;
+   interrupts = <95 2 0 0  94 2 0 0>; //high:79 low:78
+   };
+
+* Serial Interface Block (SI)
+
+The SI manages the routing of eight TDM lines to the QE block serial drivers,
+the MCC and the UCCs, for receive and transmit.
+
+Required properties:
+- compatible : should be "fsl,qe-si".
+- reg : Address range of SI register set.
+
+Example:
+
+   si1: si@700 {
+   compatible = "fsl,qe-si";
+   reg = <0x700 0x80>;
+   };
+
+* Serial Interface Block RAM (SIRAM)
+
+Stores the routing entries of the SI.
+
+Required properties:
+- compatible : should be "fsl,qe-siram".
+- reg : Address range of SI RAM.
+
+Example:
+
+   siram1: siram@1000 {
+   compatible = "fsl,qe-siram";
+   reg = <0x1000 0x800>;
+   };
+
 * QE Firmware Node
 
 This node defines a firmware binary that is embedded in the device tree, for
-- 
2.1.0.27.g96db324


[PATCH v2 2/7] QE: Add ucc hdlc document to bindings

2016-02-17 Thread Zhao Qiang
Add ucc hdlc document to
Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt

Signed-off-by: Zhao Qiang 
---
Changes for v2
- use ucc-hdlc instead of ucc_hdlc
- add more information to properties.

 .../bindings/powerpc/fsl/cpm_qe/network.txt| 93 ++
 1 file changed, 93 insertions(+)

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt 
b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt
index 29b28b8..936158c 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt
+++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt
@@ -41,3 +41,96 @@ Example:
fsl,mdio-pin = <12>;
fsl,mdc-pin = <13>;
};
+
+* HDLC
+
+Currently defined compatibles:
+- fsl,ucc-hdlc
+
+Properties for fsl,ucc-hdlc:
+- rx-clock-name
+- tx-clock-name
+   Usage: required
+   Value type: 
+   Definition : should be "brg1"-"brg16" for internal clock source,
+should be "clk1"-"clk28" for external clock source.
+
+- fsl,rx-sync-clock
+   Usage: required
+   Value type: 
+   Definition : should be "none" when using internal clock source,
+should be "rsync_pin" when using external clock source.
+
+- fsl,tx-sync-clock
+   Usage: required
+   Value type: 
+   Definition : should be "none" when using internal clock source,
+should be "tsync_pin" when using external clock source.
+
+- fsl,tx-timeslot
+- fsl,rx-timeslot
+   Usage: required
+   Value type: 
+   Definition : time slot for TDM operation. Indicates which time slots
+used for transmitting and receiving.
+
+- fsl,tdm-framer-type
+   Usage: required
+   Value type: 
+   Definition : "e1" or "t1"
+
+- fsl,tdm-mode
+   Usage: required
+   Value type: 
+   Definition : "normal" or "internal-loopback"
+
+- fsl,tdm-id
+   Usage: required
+   Value type: 
+   Definition : number of TDM ID
+
+- fsl,siram-entry-id
+   Usage: required
+   Value type: 
+   Definition : should be 0,2,4...64. the number of TDM entry.
+
+- fsl,tdm-interface
+   Usage: optional
+   Value type: 
+   Definition : Specify that hdlc is based on tdm-interface
+
+Example:
+
+   ucc@2000 {
+   compatible = "fsl,ucc-hdlc";
+   rx-clock-name = "clk8";
+   tx-clock-name = "clk9";
+   fsl,rx-sync-clock = "rsync_pin";
+   fsl,tx-sync-clock = "tsync_pin";
+   fsl,tx-timeslot = <0xfffe>;
+   fsl,rx-timeslot = <0xfffe>;
+   fsl,tdm-framer-type = "e1";
+   fsl,tdm-mode = "normal";
+   fsl,tdm-id = <0>;
+   fsl,siram-entry-id = <0>;
+   fsl,tdm-interface;
+   };
+fsl,siram-entry-id : SI RAM entry ID for the TDM
+fsl,tdm-interface : hdlc is based on tdm-interface
+
+Example:
+
+   ucc@2000 {
+   compatible = "fsl,ucc-hdlc";
+   rx-clock-name = "clk8";
+   tx-clock-name = "clk9";
+   fsl,rx-sync-clock = "rsync_pin";
+   fsl,tx-sync-clock = "tsync_pin";
+   fsl,tx-timeslot = <0xfffe>;
+   fsl,rx-timeslot = <0xfffe>;
+   fsl,tdm-framer-type = "e1";
+   fsl,tdm-mode = "normal";
+   fsl,tdm-id = <0>;
+   fsl,siram-entry-id = <0>;
+   fsl,tdm-interface;
+   };
-- 
2.1.0.27.g96db324


Re: [PATCH][v3] drivers/memory: Add deep sleep support for IFC

2016-02-17 Thread Scott Wood
On Wed, 2016-02-17 at 14:40 +, Raghav Dogra wrote:
> 
> > -Original Message-
> > From: Scott Wood [mailto:o...@buserror.net]
> > Sent: Tuesday, February 16, 2016 2:05 PM
> > To: Raghav Dogra ; linuxppc-dev@lists.ozlabs.org
> > Cc: Prabhakar Kushwaha 
> > Subject: Re: [PATCH][v3] drivers/memory: Add deep sleep support for IFC
> > 
> > On Mon, 2016-02-15 at 11:44 +0530, Raghav Dogra wrote:
> > > 
> > > +
> > > + ctrl->saved_regs = kzalloc(sizeof(struct fsl_ifc_regs),
> > > GFP_KERNEL);
> > > + if (!ctrl->saved_regs)
> > > + return -ENOMEM;
> > 
> > Allocate memory at probe time, not here.
> > 
> 
> But, why allocate memory at the probe when it is not known at that time
> whether
> deep sleep state would be required or not? Is that because we want to save
> time
> while going to deep sleep?

We also want to avoid potential failures here.  We can also keep the code
simpler by embedding this into the ctrl struct itself, and not dynamically
allocating it at all.

> > > 
> > > + ifc_out32(0x0, &ifc->ifc_nand.nand_evter_intr_en);
> > > + ifc_out32(0x0, &ifc->ifc_nor.nor_evter_intr_en);
> > > + ifc_out32(0x0, &ifc->ifc_gpcm.gpcm_evter_intr_en);
> > > +
> > > + memcpy_fromio(ctrl->saved_regs, ifc, sizeof(struct
> > > fsl_ifc_regs));
> > > +
> > > +/* save the interrupt values */
> > > + ctrl->saved_regs->cm_evter_intr_en = cm_evter_intr_en;
> > > + ctrl->saved_regs->ifc_nand.nand_evter_intr_en =
> > nand_evter_intr_en;
> > > + ctrl->saved_regs->ifc_nor.nor_evter_intr_en =
> > > nor_evter_intr_en;
> > > + ctrl->saved_regs->ifc_gpcm.gpcm_evter_intr_en =
> > gpcm_evter_intr_en;
> > 
> > Why didn't you use the memcpy_fromio() to save these, and clear intr_en
> > later?
> > 
> 
> I used it whenever I did a write/read on iomem. In this case, both memories 
> are non iomem.

Huh?  &ifc->ifc_nand.nand_evter_intr_en and such are iomem.

> > > +
> > > +/*
> > > + * IFC interrupts disabled
> > > + */
> > > + ifc_out32(0x0, &ifc->cm_evter_intr_en);
> > > + ifc_out32(0x0, &ifc->ifc_nand.nand_evter_intr_en);
> > > + ifc_out32(0x0, &ifc->ifc_nor.nor_evter_intr_en);
> > > + ifc_out32(0x0, &ifc->ifc_gpcm.gpcm_evter_intr_en);
> > > +
> > > +
> > > + if (ctrl->saved_regs) {
> > > + for (ifc_bank = 0; ifc_bank < FSL_IFC_BANK_COUNT;
> > > ifc_bank++) {
> > > + ifc_out32(savd_regs->cspr_cs[ifc_bank].cspr_ext,
> > > + &ifc->cspr_cs[ifc_bank].cspr_ext);
> > > + ifc_out32(savd_regs->cspr_cs[ifc_bank].cspr,
> > > + &ifc->cspr_cs[ifc_bank].cspr);
> > > + ifc_out32(savd_regs->amask_cs[ifc_bank].amask,
> > > + &ifc->amask_cs[ifc_bank].amask);
> > > + ifc_out32(savd_regs->csor_cs[ifc_bank].csor_ext,
> > > + &ifc->csor_cs[ifc_bank].csor_ext);
> > > + ifc_out32(savd_regs->csor_cs[ifc_bank].csor,
> > > + &ifc->csor_cs[ifc_bank].csor);
> > 
> > Align continuation lines the way patchwork suggests ("&ifc" aligned with
> > "savd").
> 
> Okay, I will take care of this in the next patch.
> 
> > 
> > Does resume from deep sleep go via U-Boot (which would initialize these
> > registers) on these chips?
> 
> Yes, deep sleep resume goes via u-boot and these registers should be
> initialized 
> By u-boot.

So then we don't need to save/restore them.

> 

> > > +
> > > +/*
> > > +* IFC controller NOR machine registers */
> > > + ifc_out32(savd_regs->ifc_nor.nor_evter_en,
> > > + &ifc->ifc_nor.nor_evter_en);
> > > + ifc_out32(savd_regs->ifc_nor.norcr, &ifc->ifc_nor.norcr);
> > 
> > What uses these?
> > 
> 
> These registers are not used as such, but we would like to retain their
> value as they
> can be of help in case of error conditions.

I don't follow.  Neither of those registers reports errors, and the registers
that *do* report errors are generally w1c and thus you can't save/restore
them.

> > > 
> > > +
> > > + ver = ifc_in32(&ctrl->regs->ifc_rev);
> > > + ncfgr = ifc_in32(&ifc->ifc_nand.ncfgr);
> > > + if (ver >= FSL_IFC_V1_3_0) {
> > > +
> > > + ifc_out32(ncfgr | IFC_NAND_SRAM_INIT_EN,
> > > + &ifc->ifc_nand.ncfgr);
> > > + /* wait for  SRAM_INIT bit to be clear or timeout */
> > > + timeout = IFC_TIMEOUT_MSECS;
> > > + while ((ifc_in32(&ifc->ifc_nand.ncfgr) &
> > > + IFC_NAND_SRAM_INIT_EN) &&
> > timeout)
> > > {
> > > + cpu_relax();
> > > + timeout--;
> > > + }
> > 
> > How can this timeout be in milliseconds or any other real unit of time, if
> > it's
> > actually measuring loop iterations with no udelay() or similar?
> > 
> 
> Yes, it's not in millisecond any longer. Will change the name to
> IFC_WAIT_ITR

What does ITR mean?  And my complaint was not just about naming -- this type
of delay loop is 

Re: [V2,1/2] powerpc/powernv: new function to access OPAL msglog

2016-02-17 Thread Andrew Donnellan

On 17/02/16 23:41, Michael Ellerman wrote:

I see you've posted a v3 since I merged this, please send an incremental patch
with the changes.


http://patchwork.ozlabs.org/patch/584416/

--
Andrew Donnellan  Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)IBM Australia Limited


[PATCH v2 6/7] T104xRDB: Add qe node to t104xrdb

2016-02-17 Thread Zhao Qiang
add qe node to t104xrdb.dtsi

Signed-off-by: Zhao Qiang 
---
Changes for v2
- rebase

 arch/powerpc/boot/dts/fsl/t104xrdb.dtsi | 40 +
 1 file changed, 40 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/t104xrdb.dtsi 
b/arch/powerpc/boot/dts/fsl/t104xrdb.dtsi
index 830ea48..3b08601 100644
--- a/arch/powerpc/boot/dts/fsl/t104xrdb.dtsi
+++ b/arch/powerpc/boot/dts/fsl/t104xrdb.dtsi
@@ -186,4 +186,44 @@
  0 0x0001>;
};
};
+
+   qe: qe@ffe14 {
+   ranges = <0x0 0xf 0xfe14 0x4>;
+   reg = <0xf 0xfe14 0 0x480>;
+   brg-frequency = <0>;
+   bus-frequency = <0>;
+
+   si1: si@700 {
+   compatible = "fsl,qe-si";
+   reg = <0x700 0x80>;
+   };
+
+   siram1: siram@1000 {
+   compatible = "fsl,qe-siram";
+   reg = <0x1000 0x800>;
+   };
+
+   ucc_hdlc: ucc@2000 {
+   compatible = "fsl,ucc-hdlc";
+   rx-clock-name = "clk8";
+   tx-clock-name = "clk9";
+   fsl,rx-sync-clock = "rsync_pin";
+   fsl,tx-sync-clock = "tsync_pin";
+   fsl,tx-timeslot = <0xfffe>;
+   fsl,rx-timeslot = <0xfffe>;
+   fsl,tdm-framer-type = "e1";
+   fsl,tdm-mode = "normal";
+   fsl,tdm-id = <0>;
+   fsl,siram-entry-id = <0>;
+   fsl,tdm-interface;
+   };
+
+   ucc_serial: ucc@2200 {
+   device_type = "serial";
+   compatible = "ucc_uart";
+   port-number = <1>;
+   rx-clock-name = "brg2";
+   tx-clock-name = "brg2";
+   };
+   };
 };
-- 
2.1.0.27.g96db324


[PATCH v2 4/7] bindings: move cpm_qe binding from powerpc/fsl to soc/fsl

2016-02-17 Thread Zhao Qiang
cpm_qe is supported on both powerpc and arm.
and the QE code has been moved from arch/powerpc into
drivers/soc/fsl, so move cpm_qe binding from powerpc/fsl
to soc/fsl

Signed-off-by: Zhao Qiang 
---
Changes for v2
- new added

 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/cpm.txt | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/cpm/brg.txt | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/cpm/i2c.txt | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/cpm/pic.txt | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/cpm/usb.txt | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/gpio.txt| 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/network.txt | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/qe.txt  | 0
 .../devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/qe/firmware.txt   | 0
 .../devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/qe/par_io.txt | 0
 .../devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/qe/pincfg.txt | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/qe/ucc.txt  | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/qe/usb.txt  | 0
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/serial.txt  | 0
 .../devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/uqe_serial.txt| 0
 15 files changed, 0 insertions(+), 0 deletions(-)
 rename Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/cpm.txt 
(100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/cpm/brg.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/cpm/i2c.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/cpm/pic.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/cpm/usb.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/gpio.txt 
(100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/network.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => soc}/fsl/cpm_qe/qe.txt 
(100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/qe/firmware.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/qe/par_io.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/qe/pincfg.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/qe/ucc.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/qe/usb.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/serial.txt (100%)
 rename Documentation/devicetree/bindings/{powerpc => 
soc}/fsl/cpm_qe/uqe_serial.txt (100%)

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm.txt 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm.txt
rename to Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm.txt
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/brg.txt 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/brg.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/brg.txt
rename to Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/brg.txt
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/i2c.txt 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/i2c.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/i2c.txt
rename to Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/i2c.txt
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/pic.txt 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/pic.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/pic.txt
rename to Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/pic.txt
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/usb.txt 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/usb.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/usb.txt
rename to Documentation/devicetree/bindings/soc/fsl/cpm_qe/cpm/usb.txt
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/gpio.txt 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/gpio.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/gpio.txt
rename to Documentation/devicetree/bindings/soc/fsl/cpm_qe/gpio.txt
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt 
b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/network.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt
rename to Documentation/devicetree/bindings/soc/fsl/cpm_qe/network.txt
diff 

[PATCH v2 3/7] QE: Add uqe_serial document to bindings

2016-02-17 Thread Zhao Qiang
Add uqe_serial document to
Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt

Signed-off-by: Zhao Qiang 
---
Changes for v2
- modify tx/rx-clock-name specification

 .../bindings/powerpc/fsl/cpm_qe/uqe_serial.txt| 19 +++
 1 file changed, 19 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt

diff --git 
a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt 
b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt
new file mode 100644
index 000..436c71c
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt
@@ -0,0 +1,19 @@
+* Serial
+
+Currently defined compatibles:
+- ucc_uart
+
+Properties for ucc_uart:
+port-number : port number of UCC-UART
+tx/rx-clock-name : should be "brg1"-"brg16" for internal clock source,
+  should be "clk1"-"clk28" for external clock source.
+
+Example:
+
+   ucc_serial: ucc@2200 {
+   device_type = "serial";
+   compatible = "ucc_uart";
+   port-number = <1>;
+   rx-clock-name = "brg2";
+   tx-clock-name = "brg2";
+   };
-- 
2.1.0.27.g96db324


[PATCH] powerpc/powernv: don't create OPAL msglog sysfs entry if memcons init fails

2016-02-17 Thread Andrew Donnellan
When initialising OPAL interfaces, there is a possibility that
opal_msglog_init() may fail to initialise the msglog/memory console.

Fix opal_msglog_sysfs_init() so it doesn't try to create sysfs entry for
the msglog if this occurs.

Suggested-by: Joel Stanley 
Fixes: 9b4fffa14906 ("powerpc/powernv: new function to access OPAL msglog")
Signed-off-by: Andrew Donnellan 
---
 arch/powerpc/platforms/powernv/opal-msglog.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-msglog.c 
b/arch/powerpc/platforms/powernv/opal-msglog.c
index 59fa6e1..39d6ff9 100644
--- a/arch/powerpc/platforms/powernv/opal-msglog.c
+++ b/arch/powerpc/platforms/powernv/opal-msglog.c
@@ -128,6 +128,11 @@ void __init opal_msglog_init(void)
 
 void __init opal_msglog_sysfs_init(void)
 {
+   if (!opal_memcons) {
+   pr_warn("OPAL: message log initialisation failed, not creating sysfs entry\n");
+   return;
+   }
+
if (sysfs_create_bin_file(opal_kobj, &opal_msglog_attr) != 0)
pr_warn("OPAL: sysfs file creation failed\n");
 }
-- 
Andrew Donnellan  Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)IBM Australia Limited


Re: [PATCH][v3] drivers/memory: Add deep sleep support for IFC

2016-02-17 Thread Scott Wood
On Wed, 2016-02-17 at 17:19 -0600, Leo Li wrote:
> On Wed, Feb 17, 2016 at 8:40 AM, Raghav Dogra  wrote:
> > 
> > > Is it really necessary to spin here rather than waiting for an interrupt
> > > like
> > > normal?
> > > 
> > 
> > Aren't the global interrupts disabled at this stage? Can we use the
> > interrupt based
> > waits in the deep sleep code? We used it based on the assumption that
> > interrupts
> > cannot be used here.
> 
> At the resume() stage, interrupts are already enabled.  But the
> problem of using interrupt based wait here is that we cannot give a
> correct return value at this point.  And it can also defeat the
> ordering of resume() callbacks for dependent devices.

I didn't say to return from the resume() function before the operation is
done, just to have the resume() function wait for the interrupt.  At the very
least it would make it easier to reuse existing code once this is moved to the
NAND driver, if we don't need a special way of waiting for this operation.

-Scott


Re: [PATCH kernel v2] powerpc/ioda: Set "read" permission when "write" is set

2016-02-17 Thread Benjamin Herrenschmidt
On Wed, 2016-02-17 at 18:26 +1100, Alexey Kardashevskiy wrote:
> Quite often drivers set only "write" permission assuming that this
> includes "read" permission as well and this works on plenty
> platforms.
> However IODA2 is strict about this and produces an EEH when "read"
> permission is not and reading happens.
> 
> This adds a workaround in IODA code to always add the "read" bit when
> the "write" bit is set.
> 
> This fixes breakage introduced in
> 10b35b2b74 powerpc/powernv: Do not set "read" flag if
> direction==DMA_NONE
> 
> Cc: sta...@vger.kernel.org # 4.2+

Acked-by: Benjamin Herrenschmidt 

> Signed-off-by: Alexey Kardashevskiy 
> Tested-by: Douglas Miller 
> ---
>  arch/powerpc/platforms/powernv/pci.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci.c
> b/arch/powerpc/platforms/powernv/pci.c
> index 2f55c86..6a97ba4 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -599,6 +599,9 @@ int pnv_tce_build(struct iommu_table *tbl, long
> index, long npages,
>   u64 rpn = __pa(uaddr) >> tbl->it_page_shift;
>   long i;
>  
> + if (proto_tce & TCE_PCI_WRITE)
> + proto_tce |= TCE_PCI_READ;
> +
>   for (i = 0; i < npages; i++) {
>   unsigned long newtce = proto_tce |
>   ((rpn + i) << tbl->it_page_shift);
> @@ -620,6 +623,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long
> index,
>  
>   BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl));
>  
> + if (newtce & TCE_PCI_WRITE)
> + newtce |= TCE_PCI_READ;
> +
>   oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
>   *hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ |
> TCE_PCI_WRITE);
>   *direction = iommu_tce_direction(oldtce);
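
The workaround quoted above amounts to a small normalisation of the permission bits before a TCE is written. A stand-alone sketch follows; the EX_* bit values and the helper name are assumptions made for illustration, not the kernel's TCE_PCI_* definitions or any existing function.

```c
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define EX_TCE_READ  0x1ULL
#define EX_TCE_WRITE 0x2ULL

/* Hypothetical helper: if the caller asked for write access, grant read
 * access as well, mirroring the two "if (... & TCE_PCI_WRITE)" checks in
 * the quoted patch. Entries without the write bit are returned unchanged. */
static uint64_t ex_fixup_tce_perms(uint64_t tce)
{
	if (tce & EX_TCE_WRITE)
		tce |= EX_TCE_READ;
	return tce;
}
```

A write-only request comes back as read+write, while read-only and no-access requests pass through unmodified, which is consistent with the earlier change that left the "read" flag clear for DMA_NONE.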

Re: Fix BUG_ON() reporting in real mode on powerpc

2016-02-17 Thread Balbir Singh
>> Changelog:
>>  Don't add PAGE_OFFSET blindly, check if REGION_ID is 0
>>
>> I ran into this issue while debugging an early boot problem.
>> The system hit a BUG_ON() but report bug failed to print the
>> line number and file name. The reason being that the system
>> was running in real mode and report_bug() searches for
>> addresses in the PAGE_OFFSET+ region
>>
>> Suggested-by: Paul Mackerras 
>> Signed-off-by: Balbir Singh 



> Can we add some comments around this. When i looked at this first, i was
> wondering how nip can be in user region. But then realized that what we
> are checking here is kernel address used in real mode. The use of
> REGION_ID eventhough simpler is confusing. Hence adding the comment with
> details Paul mentioned in email will help.
>
>
I've tried to cover it in the changelog. I thought a code comment
would make sense for the very non-obvious cases, rather than repeating
what the code does as a comment.

Balbir Singh.

[RESEND][PATCH] Fix BUG_ON() reporting in real mode on PPC

2016-02-17 Thread Balbir Singh
Resending to remove whitespace issues created by email client

Changelog:
Don't add PAGE_OFFSET blindly, check if REGION_ID is 0

I ran into this issue while debugging an early boot problem.
The system hit a BUG_ON() but report bug failed to print the
line number and file name. The reason being that the system
was running in real mode and report_bug() searches for
addresses in the PAGE_OFFSET+ region

Suggested-by: Paul Mackerras 
Signed-off-by: Balbir Singh 
---
 arch/powerpc/kernel/traps.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index b6becc7..4de4fe7 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1148,6 +1148,7 @@ void __kprobes program_check_exception(struct pt_regs 
*regs)
goto bail;
}
if (reason & REASON_TRAP) {
+   unsigned long bugaddr;
/* Debugger is first in line to stop recursive faults in
 * rcu_lock, notify_die, or atomic_notifier_call_chain */
if (debugger_bpt(regs))
@@ -1158,8 +1159,12 @@ void __kprobes program_check_exception(struct pt_regs 
*regs)
== NOTIFY_STOP)
goto bail;
 
+   bugaddr = regs->nip;
+   if ((REGION_ID(bugaddr) == 0) && !(regs->msr & MSR_IR))
+   bugaddr += PAGE_OFFSET;
+
if (!(regs->msr & MSR_PR) &&  /* not user-mode */
-   report_bug(regs->nip, regs) == BUG_TRAP_TYPE_WARN) {
+   report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
regs->nip += 4;
goto bail;
}
-- 
2.5.0
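
The address adjustment in the hunk above can be exercised outside the kernel with a minimal sketch. The constants here are assumptions chosen to resemble a 64-bit layout (top four address bits as the region ID, a linear mapping at 0xc000000000000000, and bit 5 of the MSR as instruction relocation); they are labelled EX_* to make clear they are not the kernel's own definitions, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define EX_PAGE_OFFSET   0xc000000000000000ULL      /* base of the linear mapping */
#define EX_MSR_IR        (1ULL << 5)                /* instruction relocation on */
#define EX_REGION_ID(ea) ((uint64_t)(ea) >> 60)     /* top four address bits */

/* Hypothetical helper mirroring the patch: when the trap was taken with
 * instruction relocation off and the address sits in region 0, translate
 * it to the PAGE_OFFSET-based address that the bug table is keyed on. */
static uint64_t ex_bug_lookup_addr(uint64_t nip, uint64_t msr)
{
	uint64_t bugaddr = nip;

	if (EX_REGION_ID(bugaddr) == 0 && !(msr & EX_MSR_IR))
		bugaddr += EX_PAGE_OFFSET;
	return bugaddr;
}
```

An address that is already in the kernel region, or a trap taken with relocation enabled, is passed through untouched, which is the "don't add PAGE_OFFSET blindly" part of the changelog.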


Re: [BUG] random kernel crashes after THP rework on s390 (maybe also on PowerPC and ARM)

2016-02-17 Thread Kirill A. Shutemov
On Wed, Feb 17, 2016 at 08:13:40PM +0100, Gerald Schaefer wrote:
> On Sat, 13 Feb 2016 12:58:31 +0100 (CET)
> Sebastian Ott  wrote:
> 
> > [   59.875935] [ cut here ]
> > [   59.875937] kernel BUG at mm/huge_memory.c:2884!
> > [   59.875979] illegal operation: 0001 ilc:1 [#1] PREEMPT SMP 
> > DEBUG_PAGEALLOC
> > [   59.875986] Modules linked in: bridge stp llc btrfs xor mlx4_en vxlan 
> > ip6_udp_tunnel udp_tunnel mlx4_ib ptp pps_core ib_sa ib_mad ib_core ib_addr 
> > ghash_s390 prng raid6_pq ecb aes_s390 des_s390 des_generic sha512_s390 
> > sha256_s390 sha1_s390 mlx4_core sha_common genwqe_card scm_block crc_itu_t 
> > vhost_net tun vhost dm_mod macvtap eadm_sch macvlan kvm autofs4
> > [   59.876033] CPU: 2 PID: 5402 Comm: git Tainted: GW   
> > 4.4.0-07794-ga4eff16-dirty #77
> > [   59.876036] task: d2312948 ti: cfecc000 task.ti: 
> > cfecc000
> > [   59.876039] Krnl PSW : 0704d0018000 002bf3aa 
> > (__split_huge_pmd_locked+0x562/0xa10)
> > [   59.876045]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 
> > PM:0 EA:3
> >Krnl GPRS: 01a7a1cf 03d10177c000 
> > 00044068 5df00215
> > [   59.876051]0001 0001 
> >  774e6900
> > [   59.876054]03ff5200 6d403b10 
> > 6e1eb800 03ff51f0
> > [   59.876058]03d10177c000 00715190 
> > 002bf234 cfecfb58
> > [   59.876068] Krnl Code: 002bf39c: d507d010a000clc 
> > 16(8,%%r13),0(%%r10)
> >   002bf3a2: a7840004brc 
> > 8,2bf3aa
> >  #002bf3a6: a7f40001brc 
> > 15,2bf3a8
> >  >002bf3aa: 91407440tm  
> > 1088(%%r7),64
> >   002bf3ae: a7840208brc 
> > 8,2bf7be
> >   002bf3b2: a7f401e9brc 
> > 15,2bf784
> >   002bf3b6: 9104a006tm  
> > 6(%%r10),4
> >   002bf3ba: a7740004brc 
> > 7,2bf3c2
> > [   59.876089] Call Trace:
> > [   59.876092] ([<002bf234>] __split_huge_pmd_locked+0x3ec/0xa10)
> > [   59.876095]  [<002c4310>] __split_huge_pmd+0x118/0x218
> > [   59.876099]  [<002810e8>] unmap_single_vma+0x2d8/0xb40
> > [   59.876102]  [<00282d66>] zap_page_range+0x116/0x318
> > [   59.876105]  [<0029b834>] SyS_madvise+0x23c/0x5e8
> > [   59.876108]  [<006f9f56>] system_call+0xd6/0x258
> > [   59.876111]  [<03ff9bbfd282>] 0x3ff9bbfd282
> > [   59.876113] INFO: lockdep is turned off.
> > [   59.876115] Last Breaking-Event-Address:
> > [   59.876118]  [<002bf3a6>] __split_huge_pmd_locked+0x55e/0xa10
> 
> The BUG at mm/huge_memory.c:2884 is interesting, it's the 
> BUG_ON(!pte_none(*pte))
> check in __split_huge_pmd_locked(). Obviously we expect the pre-allocated
> pagetables to be empty, but in collapse_huge_page() we deposit the original
> pagetable instead of allocating a new (empty) one. This saves an allocation,
> which is good, but doesn't that mean that if such a collapsed hugepage will
> ever be split, we will always run into the BUG_ON(!pte_none(*pte)), or one
> of the two other VM_BUG_ONs in mm/huge_memory.c that check the same?
> 
> This behavior is not new, it was the same before the THP rework, so I do not
> assume that it is related to the current problems, maybe with the exception
> of this specific crash. I never saw the BUG at mm/huge_memory.c:2884 myself,
> and the other crashes probably cannot be explained with this. Maybe I am
> also missing something, but I do not see how collapse_huge_page() and the
> (non-empty) pgtable deposit there can work out with the BUG_ON(!pte_none(*pte))
> checks. Any thoughts?

I don't think there's a problem: ptes in the pgtable are cleared with
pte_clear() in __collapse_huge_page_copy().
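
To make the argument concrete, here is a small userspace model of it. It is
only a sketch: the toy_* names, the 8-entry table and the use of 0 for an
empty entry are made up for illustration and are not kernel code. It shows
that a page table whose entries are cleared during the collapse copy step
passes an "all entries are none" check when it is later reused for a split.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PTRS_PER_PTE 8	/* toy size; real page tables are much larger */

typedef uint64_t toy_pte_t;	/* assumption: 0 stands for "pte_none" here */

struct toy_pgtable {
	toy_pte_t pte[TOY_PTRS_PER_PTE];
};

/*
 * Models the copy step of a collapse: the data copy itself is omitted,
 * only the clearing of each entry (the pte_clear() stand-in) is kept.
 */
static void toy_collapse_copy(struct toy_pgtable *pt)
{
	for (size_t i = 0; i < TOY_PTRS_PER_PTE; i++)
		pt->pte[i] = 0;
}

/* Models the split-time expectation: every entry must be empty. */
static bool toy_pgtable_is_empty(const struct toy_pgtable *pt)
{
	for (size_t i = 0; i < TOY_PTRS_PER_PTE; i++)
		if (pt->pte[i] != 0)
			return false;
	return true;
}

/*
 * Populate a table, run the collapse copy step, then report whether the
 * (now deposited) table would pass the emptiness check at split time.
 */
static bool toy_collapse_then_split_ok(void)
{
	struct toy_pgtable pt;

	for (size_t i = 0; i < TOY_PTRS_PER_PTE; i++)
		pt.pte[i] = 0x1000u * (i + 1);	/* pretend mappings */

	if (toy_pgtable_is_empty(&pt))
		return false;			/* sanity: it was populated */

	toy_collapse_copy(&pt);
	return toy_pgtable_is_empty(&pt);
}
```

In this model the deposited table is the original one, but it is empty by the
time it is deposited, so reusing it does not by itself trip the check.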

-- 
 Kirill A. Shutemov
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [V3] powerpc/powernv: Remove support for p5ioc2

2016-02-17 Thread Alexey Kardashevskiy

On 02/17/2016 11:41 PM, Michael Ellerman wrote:
> On Mon, 2016-08-02 at 04:08:20 UTC, Russell Currey wrote:
>> "p5ioc2 is used by approximately 2 machines in the world, and has never
>> ever been a supported configuration."
>>
>> The code for p5ioc2 is essentially unused and complicates what is already
>> a very complicated codebase.  Its removal is essentially a "free win" in
>> the effort to simplify the powernv PCI code.
>>
>> In addition, support for p5ioc2 has been dropped from skiboot.  There's no
>> reason to keep it around in the kernel.
>>
>> Signed-off-by: Russell Currey 
>> Acked-by: Gavin Shan 
>> Acked-by: Stewart Smith 
>
> Applied to powerpc next, thanks.
>
> https://git.kernel.org/powerpc/c/2de50e9674fc4ca3c6174b0447

WOW! Nice :( I am using one, every day, even for HV KVM sometime. Thanks,
folks.



--
Alexey

Re: [PATCH] ibmvfc: byteswap scsi_id, wwpn, and node_name prior to logging

2016-02-17 Thread Martin K. Petersen
> "Tyrel" == Tyrel Datwyler  writes:

Tyrel> When logging async events the scsi_id, wwpn, and node_name values
Tyrel> are used directly from the CRQ struct which are of type
Tyrel> __be64. This can be confusing to someone looking through the log
Tyrel> on a LE system.  Instead byteswap these values to host endian
Tyrel> prior to logging.

Applied to 4.6/scsi-queue.
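
For readers following along, the conversion described in the quoted changelog
can be sketched in portable userspace C. This is an illustration only: the
toy_* helpers and the sample WWPN are made up, and the kernel itself would use
be64_to_cpu() on the __be64 fields rather than anything like this.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Interpret 8 bytes stored in big-endian order as a host-order value. */
static uint64_t toy_be64_to_cpu(uint64_t be_value)
{
	unsigned char b[8];
	uint64_t host = 0;

	memcpy(b, &be_value, sizeof(b));
	for (int i = 0; i < 8; i++)
		host = (host << 8) | b[i];	/* most significant byte first */
	return host;
}

/* Build the big-endian in-memory image of a host-order value. */
static uint64_t toy_cpu_to_be64(uint64_t host)
{
	unsigned char b[8];
	uint64_t be;

	for (int i = 7; i >= 0; i--) {
		b[i] = (unsigned char)(host & 0xff);
		host >>= 8;
	}
	memcpy(&be, b, sizeof(be));
	return be;
}

/* Format a big-endian WWPN for a log line, converting to host order first. */
static void toy_format_wwpn(char *buf, size_t len, uint64_t be_wwpn)
{
	snprintf(buf, len, "%016llx",
		 (unsigned long long)toy_be64_to_cpu(be_wwpn));
}
```

Printing the raw __be64 value on a little-endian host would show the bytes
reversed; converting first gives the same text on either endianness.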

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] add POWER Virtual Management Channel driver

2016-02-17 Thread Steven Royer

On 2016-02-17 16:31, Greg Kroah-Hartman wrote:
> On Wed, Feb 17, 2016 at 03:18:26PM -0600, Steven Royer wrote:
>> On 2016-02-16 16:18, Greg Kroah-Hartman wrote:
>> >On Tue, Feb 16, 2016 at 02:43:13PM -0600, Steven Royer wrote:
>> >>From: Steven Royer 
>> >>
>> >>The ibmvmc driver is a device driver for the POWER Virtual Management
>> >>Channel virtual adapter on the PowerVM platform.  It is used to
>> >>communicate with the hypervisor for virtualization management.  It
>> >>provides both request/response and asynchronous message support through
>> >>the /dev/ibmvmc node.
>> >
>> >What is the protocol for that device node?
>> The protocol is not currently published.  I am pushing on getting it
>> published, but that process will take time.  If you have a PowerVM system
>> with NovaLink, it would not be hard to reverse engineer it...  If you don't
>> have a PowerVM system, then this driver isn't interesting anyway...
>
> You can't just expect us to review this code without at least having a
> clue as to how it is supposed to work?

There are two layers to the protocol.  The first layer is the only layer 
that the driver actually cares about.  The second layer is just a 
payload that is between the application and the hypervisor and can 
change independently from the kernel/driver (this is what is transported 
over the /dev/ibmvmc node).  The first layer technically is published in 
the PAPR (appendix G), but it is not trivial for most people to access 
online.  I'll put together some documentation that describes that first 
layer of the protocol in my next revision of the patch.  In many 
respects, the interface between driver and hypervisor is similar to 
ibmvscsi.  Both are CRQ based devices.  ibmvmc is actually a little 
closer to the old ibmvstgt driver since it is a CRQ server device.



>> >Where is the documentation here?  Why does this have to be a character
>> >device?  Why can't it fit in with other drivers of this type?
>> This is a character device for historical reasons.  The short version is
>> that this driver is a clean-room rewrite of an AIX driver which made it a
>> character device.  The user space application was ported from AIX to Linux
>> and it is convenient to have the AIX and Linux drivers match behavior where
>> possible.
>
> Note that we don't let random userspace applications dictate kernel api
> decisions, please make the best choice for this interface without being
> influenced by AIX.

That is fair.  I actually started down the path of using a block 
interface early on, and I ran into some complications that made it seem 
less desirable than character.  Specifically the interface has variable 
length messages from 32 bytes to 4kb (mostly closer to 32 bytes than 4kb 
per message) and I was worried about overhead of dealing with all the 
zeros in the majority of the messages.  I've never made a block 
interface before and it's entirely possible (likely) I missed something 
obvious.  I'll revisit before I post the next revision.
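
To put a rough number on the overhead concern, here is a back-of-the-envelope
sketch. It assumes (my assumption for illustration, not something the driver
does) that a block-style interface would pad every message to one fixed
4096-byte unit; the toy_* names are hypothetical.

```c
#include <stddef.h>

#define TOY_BLOCK_SIZE 4096u	/* assumed fixed transfer unit, "4kb" above */

/* Zero bytes added when one message is padded up to a full block. */
static size_t toy_padding_bytes(size_t msg_len)
{
	if (msg_len == 0 || msg_len > TOY_BLOCK_SIZE)
		return 0;	/* outside the range this sketch covers */
	return TOY_BLOCK_SIZE - msg_len;
}

/* Share of the block that is padding, as a whole percentage (rounded down). */
static unsigned int toy_padding_percent(size_t msg_len)
{
	return (unsigned int)(toy_padding_bytes(msg_len) * 100 / TOY_BLOCK_SIZE);
}
```

Under that assumption a 32-byte message carries 4064 bytes of zeros, so about
99% of each transfer is padding, while a full 4096-byte message carries none.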



>> >>+/*
>> >>+ * IBM Power Systems Virtual Management Channel Support.
>> >>+ *
>> >>+ * Copyright (c) 2004, 2016 IBM Corp.
>> >>+ *   Dave Engebretsen engeb...@us.ibm.com
>> >>+ *   Steven Royer sero...@linux.vnet.ibm.com
>> >>+ *   Adam Reznechek adrez...@linux.vnet.ibm.com
>> >>+ *
>> >>+ * This program is free software; you can redistribute it and/or
>> >>+ * modify it under the terms of the GNU General Public License
>> >>+ * as published by the Free Software Foundation; either version 2
>> >>+ * of the License, or (at your option) any later version.
>> >
>> >I have to ask, but do you really mean "or any later version"?
>> This actually matches closely to other similar PowerVM virtual device
>> drivers, like ibmvscsi or ibmveth.
>
> That did not answer the question, picking a license in a cargo-cult
> manner is not a wise decision :(

This is boilerplate for IBM provided PowerVM drivers.  So yes, I did 
mean this.



Re: [PATCH][v3] drivers/memory: Add deep sleep support for IFC

2016-02-17 Thread Leo Li
On Wed, Feb 17, 2016 at 8:40 AM, Raghav Dogra  wrote:
>
>
>> -Original Message-
>> From: Scott Wood [mailto:o...@buserror.net]
>> Sent: Tuesday, February 16, 2016 2:05 PM
>> To: Raghav Dogra ; linuxppc-dev@lists.ozlabs.org
>> Cc: Prabhakar Kushwaha 
>> Subject: Re: [PATCH][v3] drivers/memory: Add deep sleep support for IFC
>>
>> On Mon, 2016-02-15 at 11:44 +0530, Raghav Dogra wrote:
>> > Add support of suspend, resume function to support deep sleep.
>> > Also make sure of SRAM initialization  during resume.
>> >
>> > Signed-off-by: Prabhakar Kushwaha 
>> > Signed-off-by: Raghav Dogra 

Similar comment as last time, that we should involve the MTD guys.

>> > ---
>> > Changes for v3: Replace spin_event_timeout() with arch independent
>> > macro
>> >
>> > Based on
>> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>> > branch "master"
>> >
>> >  drivers/memory/fsl_ifc.c | 165
>> > +++
>> >  include/linux/fsl_ifc.h  |   6 ++
>> >  2 files changed, 171 insertions(+)
>> >
>> > diff --git a/drivers/memory/fsl_ifc.c b/drivers/memory/fsl_ifc.c index
>> > acd1460..fa028bd 100644
>> > --- a/drivers/memory/fsl_ifc.c
>> > +++ b/drivers/memory/fsl_ifc.c
>> > @@ -24,6 +24,7 @@
>> >  #include 
>> >  #include 
>> >  #include 
>> > +#include 
>> >  #include 
>> >  #include 
>> >  #include 
>> > @@ -35,6 +36,8 @@
>> >
>> >  struct fsl_ifc_ctrl *fsl_ifc_ctrl_dev;
>> > EXPORT_SYMBOL(fsl_ifc_ctrl_dev);
>> > +#define FSL_IFC_V1_3_0 0x0103
>> > +#define IFC_TIMEOUT_MSECS  10 /* 100ms */
>>
>> What does the "MSECS" mean in IFC_TIMEOUT_MSECS?  It's a unit without a
>> quantity.
>
> Yes, I agree. I will rename it to IFC_WAIT_ITR.
>
>>
>> >
>> >  /*
>> >   * convert_ifc_address - convert the base address @@ -309,6 +312,163
>> > @@ err:
>> > return ret;
>> >  }
>> >
>> > +#ifdef CONFIG_PM_SLEEP
>> > +/* save ifc registers */
>> > +static int fsl_ifc_suspend(struct device *dev) {
>> > +   struct fsl_ifc_ctrl *ctrl = dev_get_drvdata(dev);
>> > +   struct fsl_ifc_regs __iomem *ifc = ctrl->regs;
>> > +   __be32 nand_evter_intr_en, cm_evter_intr_en, nor_evter_intr_en,
>> > +gpcm_evter_intr_en
>> > ;
>>
>> s/__be32/u32/ as they've already been converted to host endianness.
>>
>> Also please repeat the type on a new line rather than use continuation lines
>> to declare more variables (and don't indent continuation lines so far).
>>
>
> Okay, will take care of this in the next version.
>
>> > +
>> > +   ctrl->saved_regs = kzalloc(sizeof(struct fsl_ifc_regs),
>> > GFP_KERNEL);
>> > +   if (!ctrl->saved_regs)
>> > +   return -ENOMEM;
>>
>> Allocate memory at probe time, not here.
>>
>
> But, why allocate memory at the probe when it is not known at that time whether
> deep sleep state would be required or not? Is that because we want to save time
> while going to deep sleep?
>
>> > +   cm_evter_intr_en = ifc_in32(>cm_evter_intr_en);
>> > +   nand_evter_intr_en = ifc_in32(>ifc_nand.nand_evter_intr_en);
>> > +   nor_evter_intr_en = ifc_in32(>ifc_nor.nor_evter_intr_en);
>> > +   gpcm_evter_intr_en = ifc_in32(
>> >ifc_gpcm.gpcm_evter_intr_en);
>> > +
>> > +/* IFC interrupts disabled */
>> > +
>> > +   ifc_out32(0x0, >cm_evter_intr_en);
>>
>> Indent the comments the same as the code.
>>
>
> Okay.
>
>> > +   ifc_out32(0x0, >ifc_nand.nand_evter_intr_en);
>> > +   ifc_out32(0x0, >ifc_nor.nor_evter_intr_en);
>> > +   ifc_out32(0x0, >ifc_gpcm.gpcm_evter_intr_en);
>> > +
>> > +   memcpy_fromio(ctrl->saved_regs, ifc, sizeof(struct fsl_ifc_regs));
>> > +
>> > +/* save the interrupt values */
>> > +   ctrl->saved_regs->cm_evter_intr_en = cm_evter_intr_en;
>> > +   ctrl->saved_regs->ifc_nand.nand_evter_intr_en =
>> nand_evter_intr_en;
>> > +   ctrl->saved_regs->ifc_nor.nor_evter_intr_en = nor_evter_intr_en;
>> > +   ctrl->saved_regs->ifc_gpcm.gpcm_evter_intr_en =
>> gpcm_evter_intr_en;
>>
>> Why didn't you use the memcpy_fromio() to save these, and clear intr_en
>> later?
>>
>
> I used it whenever I did a write/read on iomem. In this case, both memories
> are non iomem.
>
>> That said, I still don't like this approach.  I'd rather see the nand driver save
>> the registers it cares about, and this driver wouldn't have to do much other
>> than quiesce the rest of the interrupts.
>>
>
> Okay, we will analyze the required changes and include them.
>
>> > +
>> > +   return 0;
>> > +}
>> > +
>> > +/* restore ifc registers */
>> > +static int fsl_ifc_resume(struct device *dev) {
>> > +   struct fsl_ifc_ctrl *ctrl = dev_get_drvdata(dev);
>> > +   struct fsl_ifc_regs __iomem *ifc = ctrl->regs;
>> > +   struct fsl_ifc_regs *savd_regs = ctrl->saved_regs;
>> > +   uint32_t ver = 0, ncfgr, timeout, ifc_bank, i;
>>
>> s/savd/saved/
>>
>
> Okay.
>
>> > +
>> > +/*
>> > + * IFC interrupts disabled
>> > 

Re: [PATCH] MAINTAINERS: Update EEH details and maintainership

2016-02-17 Thread Gavin Shan
On Wed, Feb 17, 2016 at 11:32:24AM -0600, Bjorn Helgaas wrote:
>On Wed, Feb 17, 2016 at 05:06:04PM +1100, Russell Currey wrote:
>> Enhanced Error Handling could mean anything in the context of the entire
>> kernel, so change the name to reference that it is both for PCI and
>> powerpc.
>> 
>> EEH covers a bit more than the previously listed files, so add the headers
>> and platform-specific code to the EEH maintained section.
>> 
>> In addition, I am taking over the maintainership.
>> 
>> Signed-off-by: Russell Currey 
>
>This is fine with me.  I expect it will be merged via the powerpc tree,
>since I think that's how all of Gavin Shan's recent patches in this area
>are being handled.
>

Bjorn, thanks for your great help in the past. Yeah, I think the code
will be merged via the powerpc tree as before. Looking forward to your
support of Russell's work in future :-)

Thanks,
Gavin

>> ---
>>  MAINTAINERS | 16 +---
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>> 
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 28eb61b..95d999e 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -4222,13 +4222,6 @@ M:Maxim Levitsky 
>>  S:  Maintained
>>  F:  drivers/media/rc/ene_ir.*
>>  
>> -ENHANCED ERROR HANDLING (EEH)
>> -M:  Gavin Shan 
>> -L:  linuxppc-dev@lists.ozlabs.org
>> -S:  Supported
>> -F:  Documentation/powerpc/eeh-pci-error-recovery.txt
>> -F:  arch/powerpc/kernel/eeh*.c
>> -
>>  EPSON S1D13XXX FRAMEBUFFER DRIVER
>>  M:  Kristoffer Ericson 
>>  S:  Maintained
>> @@ -8244,6 +8237,15 @@ L:linux-...@vger.kernel.org
>>  S:  Supported
>>  F:  Documentation/PCI/pci-error-recovery.txt
>>  
>> +PCI ENHANCED ERROR HANDLING (EEH) FOR POWERPC
>> +M:  Russell Currey 
>> +L:  linuxppc-dev@lists.ozlabs.org
>> +S:  Supported
>> +F:  Documentation/powerpc/eeh-pci-error-recovery.txt
>> +F:  arch/powerpc/kernel/eeh*.c
>> +F:  arch/powerpc/platforms/*/eeh*.c
>> +F:  arch/powerpc/include/*/eeh*.h
>> +
>>  PCI SUBSYSTEM
>>  M:  Bjorn Helgaas 
>>  L:  linux-...@vger.kernel.org
>> -- 
>> 2.7.1
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


Re: [PATCH] MAINTAINERS: Update EEH details and maintainership

2016-02-17 Thread Gavin Shan
On Wed, Feb 17, 2016 at 05:06:04PM +1100, Russell Currey wrote:
>Enhanced Error Handling could mean anything in the context of the entire
>kernel, so change the name to reference that it is both for PCI and
>powerpc.
>
>EEH covers a bit more than the previously listed files, so add the headers
>and platform-specific code to the EEH maintained section.
>
>In addition, I am taking over the maintainership.
>
>Signed-off-by: Russell Currey 

Acked-by: Gavin Shan 

Thanks,
Gavin

>---
> MAINTAINERS | 16 +---
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
>diff --git a/MAINTAINERS b/MAINTAINERS
>index 28eb61b..95d999e 100644
>--- a/MAINTAINERS
>+++ b/MAINTAINERS
>@@ -4222,13 +4222,6 @@ M:  Maxim Levitsky 
> S:Maintained
> F:drivers/media/rc/ene_ir.*
>
>-ENHANCED ERROR HANDLING (EEH)
>-M:Gavin Shan 
>-L:linuxppc-dev@lists.ozlabs.org
>-S:Supported
>-F:Documentation/powerpc/eeh-pci-error-recovery.txt
>-F:arch/powerpc/kernel/eeh*.c
>-
> EPSON S1D13XXX FRAMEBUFFER DRIVER
> M:Kristoffer Ericson 
> S:Maintained
>@@ -8244,6 +8237,15 @@ L:  linux-...@vger.kernel.org
> S:Supported
> F:Documentation/PCI/pci-error-recovery.txt
>
>+PCI ENHANCED ERROR HANDLING (EEH) FOR POWERPC
>+M:Russell Currey 
>+L:linuxppc-dev@lists.ozlabs.org
>+S:Supported
>+F:Documentation/powerpc/eeh-pci-error-recovery.txt
>+F:arch/powerpc/kernel/eeh*.c
>+F:arch/powerpc/platforms/*/eeh*.c
>+F:arch/powerpc/include/*/eeh*.h
>+
> PCI SUBSYSTEM
> M:Bjorn Helgaas 
> L:linux-...@vger.kernel.org
>-- 
>2.7.1
>


Re: [PATCH] add POWER Virtual Management Channel driver

2016-02-17 Thread Greg Kroah-Hartman
On Wed, Feb 17, 2016 at 03:18:26PM -0600, Steven Royer wrote:
> On 2016-02-16 16:18, Greg Kroah-Hartman wrote:
> >On Tue, Feb 16, 2016 at 02:43:13PM -0600, Steven Royer wrote:
> >>From: Steven Royer 
> >>
> >>The ibmvmc driver is a device driver for the POWER Virtual Management
> >>Channel virtual adapter on the PowerVM platform.  It is used to
> >>communicate with the hypervisor for virtualization management.  It
> >>provides both request/response and asynchronous message support through
> >>the /dev/ibmvmc node.
> >
> >What is the protocol for that device node?
> The protocol is not currently published.  I am pushing on getting it
> published, but that process will take time.  If you have a PowerVM system
> with NovaLink, it would not be hard to reverse engineer it...  If you don't
> have a PowerVM system, then this driver isn't interesting anyway...

You can't just expect us to review this code without at least having a
clue as to how it is supposed to work?

> >Where is the documentation here?  Why does this have to be a character
> >device?  Why can't it fit in with other drivers of this type?
> This is a character device for historical reasons.  The short version is
> that this driver is a clean-room rewrite of an AIX driver which made it a
> character device.  The user space application was ported from AIX to Linux
> and it is convenient to have the AIX and Linux drivers match behavior where
> possible.

Note that we don't let random userspace applications dictate kernel api
decisions, please make the best choice for this interface without being
influenced by AIX.

> >>+/*
> >>+ * IBM Power Systems Virtual Management Channel Support.
> >>+ *
> >>+ * Copyright (c) 2004, 2016 IBM Corp.
> >>+ *   Dave Engebretsen engeb...@us.ibm.com
> >>+ *   Steven Royer sero...@linux.vnet.ibm.com
> >>+ *   Adam Reznechek adrez...@linux.vnet.ibm.com
> >>+ *
> >>+ * This program is free software; you can redistribute it and/or
> >>+ * modify it under the terms of the GNU General Public License
> >>+ * as published by the Free Software Foundation; either version 2
> >>+ * of the License, or (at your option) any later version.
> >
> >I have to ask, but do you really mean "or any later version"?
> This actually matches closely to other similar PowerVM virtual device
> drivers, like ibmvscsi or ibmveth.

That did not answer the question, picking a license in a cargo-cult
manner is not a wise decision :(


Re: [PATCH][v3] mtd/ifc: Add support for IFC controller version 2.0

2016-02-17 Thread Scott Wood
On Wed, 2016-02-17 at 16:54 +0530, Raghav Dogra wrote:
> The new IFC controller version 2.0 has a different memory map page.
> Upto IFC 1.4 PAGE size is 4 KB and from IFC2.0 PAGE size is 64KB.
> This patch segregates the IFC global and runtime registers to appropriate
> PAGE sizes.
> 
> Signed-off-by: Jaiprakash Singh 
> Signed-off-by: Raghav Dogra 
> Acked-by: Li Yang 
> Signed-off-by: Raghav Dogra 
> ---
> Changes for v3: not dependent on 
> "drivers/memory: Add deep sleep support for IFC" patch
> 
> Changes for v2: rebased to resolve conflicts
> Applicable to git://git.infradead.org/l2-mtd.git
> 
> This patch is dependent on "drivers/memory: Add deep sleep support for IFC"
> https://patchwork.ozlabs.org/patch/582762/
> which is also applicable to git://git.infradead.org/l2-mtd.git
> 
> This patch is the new version of following patch with changed title:
> https://patchwork.ozlabs.org/patch/557391/
> 
>  drivers/memory/fsl_ifc.c| 36 ++---
>  drivers/mtd/nand/fsl_ifc_nand.c | 72 ++
> ---
>  include/linux/fsl_ifc.h | 45 +-
>  3 files changed, 87 insertions(+), 66 deletions(-)

Acked-by: Scott Wood 

-Scott
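
As a reading aid for the page-size change in the quoted description, here is
a minimal sketch. It assumes the runtime registers start one page after the
global registers, and the toy_* names and the version encoding are
hypothetical, not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical version encoding for this sketch: major in bits 31..24,
 * minor in bits 23..16. */
#define TOY_IFC_VERSION(major, minor) \
	(((uint32_t)(major) << 24) | ((uint32_t)(minor) << 16))

#define TOY_PGOFFSET_4K		(4u * 1024u)
#define TOY_PGOFFSET_64K	(64u * 1024u)

/*
 * Offset of the runtime register page from the start of the global
 * registers: 4 KB up to IFC 1.4, 64 KB from IFC 2.0 onwards.
 */
static uint32_t toy_ifc_runtime_offset(uint32_t version)
{
	if (version >= TOY_IFC_VERSION(2, 0))
		return TOY_PGOFFSET_64K;
	return TOY_PGOFFSET_4K;
}
```

A driver following this scheme would compute the runtime register base once
at probe time from the version it reads back, instead of hardcoding either
page size.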


Re: [PATCH] add POWER Virtual Management Channel driver

2016-02-17 Thread Steven Royer

On 2016-02-16 16:18, Greg Kroah-Hartman wrote:
> On Tue, Feb 16, 2016 at 02:43:13PM -0600, Steven Royer wrote:
>> From: Steven Royer 
>>
>> The ibmvmc driver is a device driver for the POWER Virtual Management
>> Channel virtual adapter on the PowerVM platform.  It is used to
>> communicate with the hypervisor for virtualization management.  It
>> provides both request/response and asynchronous message support through
>> the /dev/ibmvmc node.
>
> What is the protocol for that device node?

The protocol is not currently published.  I am pushing on getting it
published, but that process will take time.  If you have a PowerVM
system with NovaLink, it would not be hard to reverse engineer it...  If
you don't have a PowerVM system, then this driver isn't interesting
anyway...

> Where is the documentation here?  Why does this have to be a character
> device?  Why can't it fit in with other drivers of this type?

This is a character device for historical reasons.  The short version is
that this driver is a clean-room rewrite of an AIX driver which made it
a character device.  The user space application was ported from AIX to
Linux and it is convenient to have the AIX and Linux drivers match
behavior where possible.




Signed-off-by: Steven Royer 
---
This is used by the PowerVM NovaLink project.  You can see development 
history on github:

https://github.com/powervm/ibmvmc

 Documentation/ioctl/ioctl-number.txt |1 +
 MAINTAINERS  |5 +
 arch/powerpc/include/asm/hvcall.h|3 +-
 drivers/misc/Kconfig |9 +
 drivers/misc/Makefile|1 +
 drivers/misc/ibmvmc.c| 1882 ++
 drivers/misc/ibmvmc.h|  203 
 7 files changed, 2103 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/ibmvmc.c
 create mode 100644 drivers/misc/ibmvmc.h

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 91261a3..d5f5f4f 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -324,6 +324,7 @@ Code  Seq#(hex) Include FileComments
 0xCA   80-8F   uapi/scsi/cxlflash_ioctl.h
 0xCB   00-1F   CBM serial IEC bus  in development:


+0xCC   00-0F   drivers/misc/ibmvmc.h   pseries VMC driver
 0xCD   01  linux/reiserfs_fs.h
 0xCF   02  fs/cifs/ioctl.c
 0xDB   00-0F   drivers/char/mwave/mwavepub.h
diff --git a/MAINTAINERS b/MAINTAINERS
index cc2f753..c39dca2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5353,6 +5353,11 @@ L:   net...@vger.kernel.org
 S: Supported
 F: drivers/net/ethernet/ibm/ibmvnic.*

+IBM Power Virtual Management Channel Driver
+M: Steven Royer 
+S: Supported
+F: drivers/misc/ibmvmc.*
+
 IBM Power Virtual SCSI Device Drivers
 M: Tyrel Datwyler 
 L: linux-s...@vger.kernel.org
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index e3b54dd..1ee6f2b 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -274,7 +274,8 @@
 #define H_COP  0x304
 #define H_GET_MPP_X0x314
 #define H_SET_MODE 0x31C
-#define MAX_HCALL_OPCODE   H_SET_MODE
+#define H_REQUEST_VMC  0x360
+#define MAX_HCALL_OPCODE   H_REQUEST_VMC

 /* H_VIOCTL functions */
 #define H_GET_VIOA_DUMP_SIZE   0x01
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 054fc10..f8d9113 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -526,6 +526,15 @@ config VEXPRESS_SYSCFG
  bus. System Configuration interface is one of the possible means
  of generating transactions on this bus.

+config IBMVMC
+   tristate "IBM Virtual Management Channel support"
+   depends on PPC_PSERIES
+   help
+ This is the IBM POWER Virtual Management Channel
+
+ To compile this driver as a module, choose M here: the
+ module will be called ibmvmc.
+
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 537d7f3..08336b3 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -56,3 +56,4 @@ obj-$(CONFIG_GENWQE)  += genwqe/
 obj-$(CONFIG_ECHO) += echo/
 obj-$(CONFIG_VEXPRESS_SYSCFG)  += vexpress-syscfg.o
 obj-$(CONFIG_CXL_BASE) += cxl/
+obj-$(CONFIG_IBMVMC)   += ibmvmc.o
diff --git a/drivers/misc/ibmvmc.c b/drivers/misc/ibmvmc.c
new file mode 100644
index 000..fb943b7
--- /dev/null
+++ b/drivers/misc/ibmvmc.c
@@ -0,0 +1,1882 @@
+/*
+ * IBM Power Systems Virtual Management Channel Support.
+ *
+ * Copyright (c) 2004, 2016 IBM Corp.
+ *   Dave Engebretsen engeb...@us.ibm.com
+ *   Steven 

Re: [PATCH v8 43/45] drivers/of: Specify parent node in of_fdt_unflatten_tree()

2016-02-17 Thread Jyri Sarha

On 02/17/16 05:44, Gavin Shan wrote:

This adds one more argument to of_fdt_unflatten_tree() to specify
the parent node of the FDT blob that is going to be unflattened.
In the result, the function can be used to unflatten FDT blob that
represents device sub-tree in PowerNV PCI hotplug driver.

Cc: Jyri Sarha 
Signed-off-by: Gavin Shan 
---
  drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c |  2 +-


Acked-by: Jyri Sarha 


  drivers/of/fdt.c | 14 ++
  drivers/of/unittest.c|  2 +-
  include/linux/of_fdt.h   |  1 +
  4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c b/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c
index 106679b..f9c79da 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c
@@ -157,7 +157,7 @@ struct device_node * __init tilcdc_get_overlay(struct 
kfree_table *kft)
if (!overlay_data || kfree_table_add(kft, overlay_data))
return NULL;

-   of_fdt_unflatten_tree(overlay_data, );
+   of_fdt_unflatten_tree(overlay_data, NULL, );
if (!overlay) {
pr_warn("%s: Unfattening overlay tree failed\n", __func__);
return NULL;
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 3fc9a30..16a1ba5 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -450,11 +450,13 @@ static int unflatten_dt_nodes(const void *blob,
   * pointers of the nodes so the normal device-tree walking functions
   * can be used.
   * @blob: The blob to expand
+ * @dad: Parent device node
   * @mynodes: The device_node tree created by the call
   * @dt_alloc: An allocator that provides a virtual address to memory
   * for the resulting tree
   */
  static void __unflatten_device_tree(const void *blob,
+struct device_node *dad,
 struct device_node **mynodes,
 void * (*dt_alloc)(u64 size, u64 align))
  {
@@ -479,7 +481,7 @@ static void __unflatten_device_tree(const void *blob,
}

/* First pass, scan for size */
-   size = unflatten_dt_nodes(blob, NULL, NULL, NULL);
+   size = unflatten_dt_nodes(blob, NULL, dad, NULL);
if (size < 0)
return;

@@ -495,7 +497,7 @@ static void __unflatten_device_tree(const void *blob,
pr_debug("  unflattening %p...\n", mem);

/* Second pass, do actual unflattening */
-   unflatten_dt_nodes(blob, mem, NULL, mynodes);
+   unflatten_dt_nodes(blob, mem, dad, mynodes);
if (be32_to_cpup(mem + size) != 0xdeadbeef)
pr_warning("End of tree marker overwritten: %08x\n",
   be32_to_cpup(mem + size));
@@ -512,6 +514,9 @@ static DEFINE_MUTEX(of_fdt_unflatten_mutex);

  /**
   * of_fdt_unflatten_tree - create tree of device_nodes from flat blob
+ * @blob: Flat device tree blob
+ * @dad: Parent device node
+ * @mynodes: The device tree created by the call
   *
   * unflattens the device-tree passed by the firmware, creating the
   * tree of struct device_node. It also fills the "name" and "type"
@@ -519,10 +524,11 @@ static DEFINE_MUTEX(of_fdt_unflatten_mutex);
   * can be used.
   */
  void of_fdt_unflatten_tree(const unsigned long *blob,
+   struct device_node *dad,
struct device_node **mynodes)
  {
mutex_lock(_fdt_unflatten_mutex);
-   __unflatten_device_tree(blob, mynodes, _tree_alloc);
+   __unflatten_device_tree(blob, dad, mynodes, _tree_alloc);
mutex_unlock(_fdt_unflatten_mutex);
  }
  EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);
@@ -1180,7 +1186,7 @@ bool __init early_init_dt_scan(void *params)
   */
  void __init unflatten_device_tree(void)
  {
-   __unflatten_device_tree(initial_boot_params, _root,
+   __unflatten_device_tree(initial_boot_params, NULL, _root,
early_init_dt_alloc_memory_arch);

/* Get pointer to "/chosen" and "/aliases" nodes for use everywhere */
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 979b6e4..ec36f93 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -921,7 +921,7 @@ static int __init unittest_data_add(void)
"not running tests\n", __func__);
return -ENOMEM;
}
-   of_fdt_unflatten_tree(unittest_data, _data_node);
+   of_fdt_unflatten_tree(unittest_data, NULL, _data_node);
if (!unittest_data_node) {
pr_warn("%s: No tree to attach; not running tests\n", __func__);
return -ENODATA;
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index df9ef38..3644960 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -38,6 +38,7 @@ extern bool of_fdt_is_big_endian(const void *blob,
  extern int of_fdt_match(const void 

Re: [BUG] random kernel crashes after THP rework on s390 (maybe also on PowerPC and ARM)

2016-02-17 Thread Gerald Schaefer
On Sat, 13 Feb 2016 12:58:31 +0100 (CET)
Sebastian Ott  wrote:

> [   59.875935] [ cut here ]
> [   59.875937] kernel BUG at mm/huge_memory.c:2884!
> [   59.875979] illegal operation: 0001 ilc:1 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [   59.875986] Modules linked in: bridge stp llc btrfs xor mlx4_en vxlan 
> ip6_udp_tunnel udp_tunnel mlx4_ib ptp pps_core ib_sa ib_mad ib_core ib_addr 
> ghash_s390 prng raid6_pq ecb aes_s390 des_s390 des_generic sha512_s390 
> sha256_s390 sha1_s390 mlx4_core sha_common genwqe_card scm_block crc_itu_t 
> vhost_net tun vhost dm_mod macvtap eadm_sch macvlan kvm autofs4
> [   59.876033] CPU: 2 PID: 5402 Comm: git Tainted: GW   
> 4.4.0-07794-ga4eff16-dirty #77
> [   59.876036] task: d2312948 ti: cfecc000 task.ti: 
> cfecc000
> [   59.876039] Krnl PSW : 0704d0018000 002bf3aa 
> (__split_huge_pmd_locked+0x562/0xa10)
> [   59.876045]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 
> EA:3
>Krnl GPRS: 01a7a1cf 03d10177c000 00044068 
> 5df00215
> [   59.876051]0001 0001  
> 774e6900
> [   59.876054]03ff5200 6d403b10 6e1eb800 
> 03ff51f0
> [   59.876058]03d10177c000 00715190 002bf234 
> cfecfb58
> [   59.876068] Krnl Code: 002bf39c: d507d010a000  clc 
> 16(8,%%r13),0(%%r10)
>   002bf3a2: a7840004  brc 8,2bf3aa
>  #002bf3a6: a7f40001  brc 
> 15,2bf3a8
>  >002bf3aa: 91407440  tm  
> 1088(%%r7),64
>   002bf3ae: a7840208  brc 8,2bf7be
>   002bf3b2: a7f401e9  brc 
> 15,2bf784
>   002bf3b6: 9104a006  tm  
> 6(%%r10),4
>   002bf3ba: a7740004  brc 7,2bf3c2
> [   59.876089] Call Trace:
> [   59.876092] ([<002bf234>] __split_huge_pmd_locked+0x3ec/0xa10)
> [   59.876095]  [<002c4310>] __split_huge_pmd+0x118/0x218
> [   59.876099]  [<002810e8>] unmap_single_vma+0x2d8/0xb40
> [   59.876102]  [<00282d66>] zap_page_range+0x116/0x318
> [   59.876105]  [<0029b834>] SyS_madvise+0x23c/0x5e8
> [   59.876108]  [<006f9f56>] system_call+0xd6/0x258
> [   59.876111]  [<03ff9bbfd282>] 0x3ff9bbfd282
> [   59.876113] INFO: lockdep is turned off.
> [   59.876115] Last Breaking-Event-Address:
> [   59.876118]  [<002bf3a6>] __split_huge_pmd_locked+0x55e/0xa10

The BUG at mm/huge_memory.c:2884 is interesting, it's the BUG_ON(!pte_none(*pte))
check in __split_huge_pmd_locked(). Obviously we expect the pre-allocated
pagetables to be empty, but in collapse_huge_page() we deposit the original
pagetable instead of allocating a new (empty) one. This saves an allocation,
which is good, but doesn't that mean that if such a collapsed hugepage will
ever be split, we will always run into the BUG_ON(!pte_none(*pte)), or one
of the two other VM_BUG_ONs in mm/huge_memory.c that check the same?

This behavior is not new, it was the same before the THP rework, so I do not
assume that it is related to the current problems, maybe with the exception
of this specific crash. I never saw the BUG at mm/huge_memory.c:2884 myself,
and the other crashes probably cannot be explained with this. Maybe I am
also missing something, but I do not see how collapse_huge_page() and the
(non-empty) pgtable deposit there can work out with the BUG_ON(!pte_none(*pte))
checks. Any thoughts?


Re: [BUG] random kernel crashes after THP rework on s390 (maybe also on PowerPC and ARM)

2016-02-17 Thread Sebastian Ott
Hi,

On Wed, 17 Feb 2016, Kirill A. Shutemov wrote:
> On Tue, Feb 16, 2016 at 05:24:44PM +0100, Gerald Schaefer wrote:
> > On Mon, 15 Feb 2016 23:35:26 +0200
> > "Kirill A. Shutemov"  wrote:
> > 
> > > Is there any chance that I'll be able to trigger the bug using QEMU?
> > > Does anybody have an QEMU image I can use?
> > > 
> > 
> > I have no image, but trying to reproduce this under virtualization may
> > help to trigger this also on other architectures. After ruling out IPI
> > vs. fast_gup I do not really see why this should be arch-specific, and
> > it wouldn't be the first time that we hit subtle races first on s390, due
> > to our virtualized environment (my test case is make -j20 with 10 CPUs and
> > 4GB of memory, no swap).
> 
> Could you post your kernel config?

Attached.

> It would be nice also to check if disabling split_huge_page() would make
> any difference:
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a75081ca31cf..26d2b7b21021 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3364,6 +3364,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>   bool mlocked;
>   unsigned long flags;
> 
> + return -EBUSY;
> +
>   VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
>   VM_BUG_ON_PAGE(!PageAnon(page), page);
>   VM_BUG_ON_PAGE(!PageLocked(page), page);
> -- 

65c23c6 + this patch also oopsed:

[ 1707.903808] ODEBUG: active_state not available (active state 0) object type: rcu_head hint:   (null)
[ 1707.903852] [ cut here ]
[ 1707.903854] WARNING: at lib/debugobjects.c:263
[ 1707.903856] Modules linked in: bridge stp llc btrfs mlx4_ib mlx4_en ib_sa vxlan ib_mad ip6_udp_tunnel ib_core udp_tunnel ptp pps_core ib_addr xor raid6_pq ghash_s390 mlx4_core prng ecb aes_s390 des_s390 des_generic sha512_s390 dm_mod sha256_s390 genwqe_card sha1_s390 sha_common crc_itu_t scm_block eadm_sch vhost_net tun vhost macvtap macvlan kvm autofs4
[ 1707.903892] CPU: 4 PID: 25215 Comm: git Not tainted 4.5.0-rc4-00037-g65c23c6-dirty #273
[ 1707.903894] task: 06a6 ti: 63b04000 task.ti: 63b04000
[ 1707.903896] Krnl PSW : 0404c0018000 00486ce0 (debug_print_object+0xb0/0xd0)
[ 1707.903905] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 01a361c7 06a6 0060 0101
[ 1707.903908] 00486cdc  0088cbdc 000001b53848
[ 1707.903910] 0701  01b53850 0000008bb820
[ 1707.903912] 00a8d710 dcdd3d38 00486cdc 0000dcdd3c38
[ 1707.903920] Krnl Code: 00486cd0: c0200021a496   larl %%r2,8bb5fc
00486cd6: c0e5ffee03a1   brasl   %%r14,247418
#00486cdc: a7f40001   brc 15,486cde
>00486ce0: c41d002f488e   lrl %%r1,a6fdfc
00486ce6: e340f0e80004   lg  %%r4,232(%%r15)
00486cec: a71a0001   ahi %%r1,1
00486cf0: eb6ff0a80004   lmg %%r6,%%r15,168(%%r15)
00486cf6: c41f002f4883   strl %%r1,a6fdfc
[ 1707.903960] Call Trace:
[ 1707.903962] ([<00486cdc>] debug_print_object+0xac/0xd0)
[ 1707.903964]  [<00488094>] debug_object_active_state+0x164/0x178
[ 1707.903969]  [<001b991c>] rcu_process_callbacks+0x564/0x9e8
[ 1707.903973]  [<0013d3ee>] __do_softirq+0x256/0x568
[ 1707.903975]  [<0013da3a>] irq_exit+0x7a/0xd8
[ 1707.903979]  [<0010c87e>] do_IRQ+0x86/0xc0
[ 1707.903984]  [<006fa3f2>] ext_int_handler+0x11e/0x124
[ 1707.903987]  [<00199bfe>] lock_release+0x5ce/0x670
[ 1707.903989] ([<00199be0>] lock_release+0x5b0/0x670)
[ 1707.903993]  [<002dffa2>] getname_flags+0x82/0x218
[ 1707.903994]  [<002e04e8>] user_path_at_empty+0x40/0x68
[ 1707.903998]  [<002d44a4>] vfs_fstatat+0x6c/0xc8
[ 1707.903999]  [<002d4894>] SyS_newlstat+0x2c/0x48
[ 1707.904002]  [<006f9cce>] system_call+0xd6/0x258
[ 1707.904003]  [<03ffb45f1124>] 0x3ffb45f1124
[ 1707.904005] 1 lock held by git/25215:
[ 1707.904006]  #0:  (_hash[i].lock){-.-.-.}, at: [<00487fdc>] debug_object_active_state+0xac/0x178
[ 1707.904012] Last Breaking-Event-Address:
[ 1707.904014]  [<00486cdc>] debug_print_object+0xac/0xd0
[ 1707.904016] ---[ end trace 8ce68dc422e8321c ]---
[ 1707.904018] ODEBUG: deactivate not available (active state 0) object type: rcu_head hint:   (null)
[ 1707.904026] [ cut here ]
[ 1707.904027] WARNING: at lib/debugobjects.c:263
[ 1707.904028] Modules linked in: bridge stp llc btrfs mlx4_ib mlx4_en ib_sa vxlan ib_mad ip6_udp_tunnel ib_core udp_tunnel ptp pps_core ib_addr xor raid6_pq ghash_s390 mlx4_core prng ecb aes_s390 des_s390 des_generic sha512_s390

Re: [PATCH] MAINTAINERS: Update EEH details and maintainership

2016-02-17 Thread Bjorn Helgaas
On Wed, Feb 17, 2016 at 05:06:04PM +1100, Russell Currey wrote:
> Enhanced Error Handling could mean anything in the context of the entire
> kernel, so change the name to reference that it is both for PCI and
> powerpc.
> 
> EEH covers a bit more than the previously listed files, so add the headers
> and platform-specific code to the EEH maintained section.
> 
> In addition, I am taking over the maintainership.
> 
> Signed-off-by: Russell Currey 

This is fine with me.  I expect it will be merged via the powerpc tree,
since I think that's how all of Gavin Shan's recent patches in this area
are being handled.

> ---
>  MAINTAINERS | 16 +---
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 28eb61b..95d999e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4222,13 +4222,6 @@ M: Maxim Levitsky 
>  S:   Maintained
>  F:   drivers/media/rc/ene_ir.*
>  
> -ENHANCED ERROR HANDLING (EEH)
> -M:   Gavin Shan 
> -L:   linuxppc-dev@lists.ozlabs.org
> -S:   Supported
> -F:   Documentation/powerpc/eeh-pci-error-recovery.txt
> -F:   arch/powerpc/kernel/eeh*.c
> -
>  EPSON S1D13XXX FRAMEBUFFER DRIVER
>  M:   Kristoffer Ericson 
>  S:   Maintained
> @@ -8244,6 +8237,15 @@ L: linux-...@vger.kernel.org
>  S:   Supported
>  F:   Documentation/PCI/pci-error-recovery.txt
>  
> +PCI ENHANCED ERROR HANDLING (EEH) FOR POWERPC
> +M:   Russell Currey 
> +L:   linuxppc-dev@lists.ozlabs.org
> +S:   Supported
> +F:   Documentation/powerpc/eeh-pci-error-recovery.txt
> +F:   arch/powerpc/kernel/eeh*.c
> +F:   arch/powerpc/platforms/*/eeh*.c
> +F:   arch/powerpc/include/*/eeh*.h
> +
>  PCI SUBSYSTEM
>  M:   Bjorn Helgaas 
>  L:   linux-...@vger.kernel.org
> -- 
> 2.7.1
> 

Re: [PATCH v5] powerpc32: provide VIRT_CPU_ACCOUNTING

2016-02-17 Thread Christophe Leroy



On 16/02/2016 22:21, Scott Wood wrote:
> On Thu, 2016-02-11 at 17:16 +0100, Christophe Leroy wrote:
> > This patch provides VIRT_CPU_ACCOUNTING to PPC32 architecture.
> > PPC32 doesn't have the PACA structure, so we use the task_info
> > structure to store the accounting data.
> >
> > In order to reuse on PPC32 the PPC64 functions, all u64 data has
> > been replaced by 'unsigned long' so that it is u32 on PPC32 and
> > u64 on PPC64
> >
> > Signed-off-by: Christophe Leroy 
> > ---
> > Changes in v3: unlike previous version of the patch that was inspired
> > from IA64 architecture, this new version tries to reuse as much as
> > possible the PPC64 implementation.
> >
> > PPC32 doesn't have PACA and past discussion on v2 version has shown
> > that it is not worth implementing a PACA in PPC32 architecture
> > (see below benh opinion)
> >
> > benh: PACA is actually a data structure and you really really don't want it
> > on ppc32 :-) Having a register point to current works, having a register
> > point to per-cpu data instead works too (ie, change what we do today),
> > but don't introduce a PACA *please* :-)
>
> And Ben never replied to my reply at the time:
>
> "What is special about 64-bit that warrants doing things differently from
> 32-bit?  What is the difference between PACA and "per-cpu data", other than
> the obscure name?"
>
> I can understand wanting to avoid churn, but other than that, doing things
> differently on 64-bit versus 32-bit sucks.



What I can see is that PACA is always available via register r13. Do we
have anything equivalent on PPC32?
If we define per-cpu data for accounting, what would be the quick way
to access it in entry_32.S?
Something like a table of accounting data for each CPU, indexed by
thread_info->cpu?
This would allow fairly quick access. Is that the right way to proceed in
order to have something closer to PPC64?


Christophe
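
The table idea in the question above could look roughly like the following user-space sketch. Everything here is an assumption made for illustration: `struct cpu_accounting`, `struct thread_info_model`, `accounting_table`, and the helper names are hypothetical and do not come from the patch or from the kernel.

```c
#define NR_CPUS 4  /* assumed small CPU count for the sketch */

/* Hypothetical per-CPU accounting record; field names are illustrative. */
struct cpu_accounting {
	unsigned long user_time;
	unsigned long system_time;
	unsigned long starttime;
};

/* Hypothetical stand-in for the part of thread_info that matters here. */
struct thread_info_model {
	int cpu;
};

/* One accounting record per CPU, indexed by the CPU number. */
static struct cpu_accounting accounting_table[NR_CPUS];

/* Look up the accounting record of the CPU the thread is running on. */
static struct cpu_accounting *get_accounting(const struct thread_info_model *ti)
{
	return &accounting_table[ti->cpu];
}

/* Charge 'delta' ticks of system time to the thread's current CPU. */
static void account_system(const struct thread_info_model *ti,
			   unsigned long delta)
{
	get_accounting(ti)->system_time += delta;
}
```

The lookup costs one load of the CPU number plus an indexed address computation, which is the "quite quick access" the question hopes for. Whether that is cheap enough in the entry_32.S fast path is exactly what the thread leaves open.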

[PATCH RESEND] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing

2016-02-17 Thread Nicholas Krause
mpic_set_default_irq_routing() currently always returns zero,
signalling success to its caller even when kvm_set_irq_routing()
fails. Return the result of kvm_set_irq_routing() instead, so
that a failure of that call is correctly propagated to the
caller of mpic_set_default_irq_routing().

Signed-off-by: Nicholas Krause 
---
 arch/powerpc/kvm/mpic.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 6249cdc..b14b85a 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -1641,16 +1641,17 @@ static void mpic_destroy(struct kvm_device *dev)
 static int mpic_set_default_irq_routing(struct openpic *opp)
 {
struct kvm_irq_routing_entry *routing;
+   int ret;
 
/* Create a nop default map, so that dereferencing it still works */
routing = kzalloc((sizeof(*routing)), GFP_KERNEL);
if (!routing)
return -ENOMEM;
 
-   kvm_set_irq_routing(opp->kvm, routing, 0, 0);
+   ret = kvm_set_irq_routing(opp->kvm, routing, 0, 0);
 
kfree(routing);
-   return 0;
+   return ret;
 }
 
 static int mpic_create(struct kvm_device *dev, u32 type)
-- 
2.1.4
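
The control flow after the patch can be modelled in user space as below. This is only a sketch under stated assumptions: `set_irq_routing_model()` is a hypothetical stand-in for kvm_set_irq_routing(), and the allocation size is arbitrary.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for kvm_set_irq_routing(); fails on request. */
static int set_irq_routing_model(void *routing, int should_fail)
{
	(void)routing;
	return should_fail ? -EINVAL : 0;
}

/* Mirrors the patched flow: allocate, call, free, return the result. */
static int set_default_irq_routing_model(int should_fail)
{
	void *routing;
	int ret;

	routing = calloc(1, 16);
	if (!routing)
		return -ENOMEM;

	/* Keep the callee's result instead of discarding it. */
	ret = set_irq_routing_model(routing, should_fail);

	free(routing);
	return ret;
}
```

The buffer is freed on both paths, so propagating the error does not introduce a leak.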


Re: Fix BUG_ON() reporting in real mode on powerpc

2016-02-17 Thread Aneesh Kumar K.V
Balbir Singh  writes:

>> It might be a little better to do this:
>> 
>>  bugaddr = regs->nip;
>>  if (REGION_ID(bugaddr) == 0 && !(regs->msr & MSR_IR))
>>  bugaddr += PAGE_OFFSET;
>> 
>> It is possible to execute from addresses with the 0xc000... on top in
>> real mode, because the CPU ignores the top 4 address bits in real
>> mode.
>
> Good catch! Thank you
>
> Changelog:
>  Don't add PAGE_OFFSET blindly, check if REGION_ID is 0
>
> I ran into this issue while debugging an early boot problem.
> The system hit a BUG_ON() but report bug failed to print the
> line number and file name. The reason being that the system
> was running in real mode and report_bug() searches for
> addresses in the PAGE_OFFSET+ region
>
> Suggested-by: Paul Mackerras 
> Signed-off-by: Balbir Singh 
> ---
>  arch/powerpc/kernel/traps.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index b6becc7..4de4fe7 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1148,6 +1148,7 @@ void __kprobes program_check_exception(struct pt_regs *regs)
>   goto bail;
>   }
>   if (reason & REASON_TRAP) {
> + unsigned long bugaddr;
>   /* Debugger is first in line to stop recursive faults in
>    * rcu_lock, notify_die, or atomic_notifier_call_chain */
>   if (debugger_bpt(regs))
> @@ -1158,8 +1159,12 @@ void __kprobes program_check_exception(struct pt_regs *regs)
>   == NOTIFY_STOP)
>   goto bail;
>  
> + bugaddr = regs->nip;
> + if ((REGION_ID(bugaddr) == 0) && !(regs->msr & MSR_IR))
> + bugaddr += PAGE_OFFSET;
> +

Can we add some comments around this? When I first looked at this, I was
wondering how nip can be in the user region. But then I realized that what
we are checking here is a kernel address used in real mode. The use of
REGION_ID, even though simpler, is confusing. Hence adding a comment with
the details Paul mentioned in his email will help.


>   if (!(regs->msr & MSR_PR) &&  /* not user-mode */
> - report_bug(regs->nip, regs) == BUG_TRAP_TYPE_WARN) {
> + report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
>   regs->nip += 4;
>   goto bail;
>   }
> -- 

-aneesh
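
For readers following the REGION_ID discussion, here is a small user-space sketch of the address fixup with the kind of comment being asked for. The constants are assumptions modelled on 64-bit powerpc (top four address bits select the region, MSR_IR is the instruction-relocation bit), and `bug_lookup_addr()` is a hypothetical helper, not a function from the patch.

```c
#include <stdint.h>

/* Assumed constants, modelled on 64-bit powerpc; illustrative only. */
#define PAGE_OFFSET   0xc000000000000000ULL  /* start of the kernel linear map */
#define REGION_SHIFT  60                     /* top 4 bits select the region */
#define REGION_ID(ea) ((uint64_t)(ea) >> REGION_SHIFT)
#define MSR_IR        (1ULL << 5)            /* instruction relocation on */

/*
 * Hypothetical helper: return the address to use when looking up the bug
 * table entry for a trap taken at 'nip' with machine state 'msr'.
 */
static uint64_t bug_lookup_addr(uint64_t nip, uint64_t msr)
{
	/*
	 * With MSR_IR clear the CPU is fetching instructions in real mode,
	 * so a kernel text address may appear without the 0xc000... prefix.
	 * The bug table is indexed by virtual addresses, so add PAGE_OFFSET
	 * back. Real mode ignores the top address bits, so nip may already
	 * carry the prefix; only region 0 addresses are adjusted.
	 */
	if (REGION_ID(nip) == 0 && !(msr & MSR_IR))
		return nip + PAGE_OFFSET;
	return nip;
}
```

A region 0 address here is not a user address: the MSR_IR test is what distinguishes "kernel running in real mode" from "translated address in the user region".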


Re: [BUG] random kernel crashes after THP rework on s390 (maybe also on PowerPC and ARM)

2016-02-17 Thread Kirill A. Shutemov
On Tue, Feb 16, 2016 at 05:24:44PM +0100, Gerald Schaefer wrote:
> On Mon, 15 Feb 2016 23:35:26 +0200
> "Kirill A. Shutemov"  wrote:
> 
> > Is there any chance that I'll be able to trigger the bug using QEMU?
> > Does anybody have an QEMU image I can use?
> > 
> 
> I have no image, but trying to reproduce this under virtualization may
> help to trigger this also on other architectures. After ruling out IPI
> vs. fast_gup I do not really see why this should be arch-specific, and
> it wouldn't be the first time that we hit subtle races first on s390, due
> to our virtualized environment (my test case is make -j20 with 10 CPUs and
> 4GB of memory, no swap).

Could you post your kernel config?

It would be nice also to check if disabling split_huge_page() would make
any difference:

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a75081ca31cf..26d2b7b21021 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3364,6 +3364,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
bool mlocked;
unsigned long flags;
 
+   return -EBUSY;
+
VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
-- 
 Kirill A. Shutemov

Re: [PATCH 1/2] powerpc/powernv: Create separate subcores CPU feature bit

2016-02-17 Thread Aneesh Kumar K.V
Michael Neuling  writes:

> Subcores isn't really part of the 2.07 architecture but currently we
> turn it on using the 2.07 feature bit.  Subcores is really a POWER8
> specific feature.
>
> This adds a new CPU_FTR bit just for subcores and moves the subcore
> init code over to use this.
>

Reviewed-by: Aneesh Kumar K.V 

> Signed-off-by: Michael Neuling 
> ---
>  arch/powerpc/include/asm/cputable.h  | 3 ++-
>  arch/powerpc/platforms/powernv/subcore.c | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> index b118072..a47e175 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -196,6 +196,7 @@ enum {
>  #define CPU_FTR_DAWR LONG_ASM_CONST(0x0400)
>  #define CPU_FTR_DABRX LONG_ASM_CONST(0x0800)
>  #define CPU_FTR_PMAO_BUG LONG_ASM_CONST(0x1000)
> +#define CPU_FTR_SUBCORE LONG_ASM_CONST(0x2000)
>
>  #ifndef __ASSEMBLY__
>
> @@ -443,7 +444,7 @@ enum {
>   CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
>   CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
>   CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
> - CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP)
> + CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_SUBCORE)
>  #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
>  #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL)
>  #define CPU_FTRS_CELL(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
> diff --git a/arch/powerpc/platforms/powernv/subcore.c b/arch/powerpc/platforms/powernv/subcore.c
> index 503a73f..0babef1 100644
> --- a/arch/powerpc/platforms/powernv/subcore.c
> +++ b/arch/powerpc/platforms/powernv/subcore.c
> @@ -407,7 +407,7 @@ static DEVICE_ATTR(subcores_per_core, 0644,
>
>  static int subcore_init(void)
>  {
> - if (!cpu_has_feature(CPU_FTR_ARCH_207S))
> + if (!cpu_has_feature(CPU_FTR_SUBCORE))
>   return 0;
>
>   /*
> -- 
> 2.5.0
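
The effect of giving subcores their own feature bit can be shown with a small sketch. The bit values and the `FTRS_OTHER_207S` mask are hypothetical and chosen only for illustration; they are not the kernel's assignments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; not the kernel's actual assignments. */
#define FTR_ARCH_207S  (1ULL << 0)
#define FTR_SUBCORE    (1ULL << 1)

/* Hypothetical feature masks for two CPU types. */
#define FTRS_POWER8      (FTR_ARCH_207S | FTR_SUBCORE)
#define FTRS_OTHER_207S  (FTR_ARCH_207S)  /* 2.07 CPU without subcores */

static bool has_feature(uint64_t cpu_features, uint64_t feature)
{
	return (cpu_features & feature) != 0;
}

/* Returns true when subcore setup should proceed on this CPU. */
static bool subcore_supported(uint64_t cpu_features)
{
	return has_feature(cpu_features, FTR_SUBCORE);
}
```

With the old test on the architecture bit, the hypothetical `FTRS_OTHER_207S` CPU would have run the subcore init code. With the dedicated bit it is skipped.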


Re: [PATCH v8 43/45] drivers/of: Specify parent node in of_fdt_unflatten_tree()

2016-02-17 Thread Rob Herring
On Tue, Feb 16, 2016 at 9:44 PM, Gavin Shan  wrote:
> This adds one more argument to of_fdt_unflatten_tree() to specify
> the parent node of the FDT blob that is going to be unflattened.
> In the result, the function can be used to unflatten FDT blob that
> represents device sub-tree in PowerNV PCI hotplug driver.
>
> Cc: Jyri Sarha 
> Signed-off-by: Gavin Shan 
> ---
>  drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c |  2 +-
>  drivers/of/fdt.c | 14 ++
>  drivers/of/unittest.c|  2 +-
>  include/linux/of_fdt.h   |  1 +
>  4 files changed, 13 insertions(+), 6 deletions(-)

Acked-by: Rob Herring 

Re: [PATCH v8 42/45] drivers/of: Rename unflatten_dt_node()

2016-02-17 Thread Rob Herring
On Tue, Feb 16, 2016 at 9:44 PM, Gavin Shan  wrote:
> This renames unflatten_dt_node() to unflatten_dt_nodes() as it
> populates multiple device nodes from FDT blob. No logical changes
> introduced.
>
> Signed-off-by: Gavin Shan 
> ---
>  drivers/of/fdt.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)

Acked-by: Rob Herring 

I'm happy to take patches 40-42 for 4.6 if the rest of the series
doesn't go in given they fix a separate problem. I just need to know
soon (or at least they need to go into -next soon).

Rob

Re: [PATCH v8 41/45] drivers/of: Avoid recursively calling unflatten_dt_node()

2016-02-17 Thread Rob Herring
On Tue, Feb 16, 2016 at 9:44 PM, Gavin Shan  wrote:
> The current implementation calls unflatten_dt_node() recursively to
> unflatten device nodes in the FDT blob. This stresses the limited stack,
> especially when the function is adapted to unflatten a device sub-tree
> that may have multiple root nodes. In that case, we run out of stack
> and the system can't boot up successfully.
>
> In order to reuse the function to unflatten a device sub-tree, this avoids
> calling the function recursively, meaning the device nodes are unflattened
> in one call to unflatten_dt_node(): two arrays are introduced to track the
> parent path size and the device node of the current level of depth, which
> will be used by the device node on the next level of depth to be
> unflattened. All device nodes deeper than 64 levels are dropped and,
> hopefully, the system can boot up successfully with the partial device-tree.
>
> Also, the parameters "poffset" and "fpsize" are unused and dropped, and the
> parameter "dryrun" is derived from "mem == NULL". Besides, the return
> value of the function is changed to indicate the size of memory consumed by
> the unflattened device tree, or an error code.
>
> Signed-off-by: Gavin Shan 
> ---
>  drivers/of/fdt.c | 122 
> +--
>  1 file changed, 74 insertions(+), 48 deletions(-)

Acked-by: Rob Herring 
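
The recursion-free walk described in the commit message can be sketched on a toy token stream. This is a simplified model under stated assumptions, not the code from the patch: the token type, `struct node`, and `unflatten_model()` are hypothetical, and only the per-depth node array is modelled (the parent path size array is left out).

```c
#include <stddef.h>

#define MAX_DEPTH 64   /* depth limit quoted in the commit message */

enum token { TOK_BEGIN_NODE, TOK_END_NODE, TOK_END };

struct node {
	int id;               /* order in which the node was created */
	struct node *parent;  /* NULL for a root node */
};

/*
 * Walk a flat token stream and fill 'out' with the nodes found, using an
 * explicit per-depth array of parents instead of recursion. Nodes nested
 * deeper than MAX_DEPTH are dropped. Returns the number of nodes created.
 */
static size_t unflatten_model(const enum token *toks, struct node *out,
			      size_t max_nodes)
{
	/* parents[d] is the most recent node seen at depth d. */
	struct node *parents[MAX_DEPTH + 1] = { NULL };
	size_t count = 0;
	int depth = 0;

	for (; *toks != TOK_END; toks++) {
		if (*toks == TOK_BEGIN_NODE) {
			depth++;
			if (depth > MAX_DEPTH || count == max_nodes)
				continue;  /* too deep or out of space */
			out[count].id = (int)count;
			out[count].parent = parents[depth - 1];
			parents[depth] = &out[count];
			count++;
		} else if (*toks == TOK_END_NODE) {
			if (depth > 0)
				depth--;
		}
	}
	return count;
}
```

Stack use is bounded by the fixed `parents` array regardless of tree depth, and a stream with several top-level nodes simply yields several nodes whose parent is NULL.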

Re: [RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements

2016-02-17 Thread Aneesh Kumar K.V
Anshuman Khandual  writes:

> The structure to track single virtual to physical mapping has
> been renamed from vmemmap_backing to vmemmap_hw_map which sounds
> more appropriate. This forms a single entry of the global linked
> list tracking all of the vmemmap physical mapping. The changes
> are as follows.
>
>   vmemmap_backing.list -> vmemmap_hw_map.link
>   vmemmap_backing.phys -> vmemmap_hw_map.paddr
>   vmemmap_backing.virt_addr -> vmemmap_hw_map.vaddr
>

I am not sure this helps. If we are going to take these renames, can you
wait till the book3s p9 preparation patches [1] hit upstream?

[1] 
http://mid.gmane.org/1454923241-6681-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com
-aneesh


Re: [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout

2016-02-17 Thread Aneesh Kumar K.V
Anshuman Khandual  writes:

> Add some explanation to the layout of vmemmap virtual address
> space and how physical page mapping is only used for valid PFNs
> present at any point on the system.
>

Reviewed-by: Aneesh Kumar K.V 


> Signed-off-by: Anshuman Khandual 
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 41 
> 
>  1 file changed, 41 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 8d1c41d..9db4a86 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -26,6 +26,47 @@
>  #define IOREMAP_BASE (PHB_IO_END)
>  #define IOREMAP_END  (KERN_VIRT_START + KERN_VIRT_SIZE)
>
> +/*
> + * Starting address of the virtual address space where all page structs
> + * for the system physical memory are stored under the vmemmap sparse
> + * memory model. All possible struct pages are logically stored in a
> + * sequence in this virtual address space irrespective of the fact
> + * whether any given PFN is valid or even the memory section is valid
> + * or not. During boot and memory hotplug add operation when new memory
> + * sections are added, real physical allocation and hash table bolting
> + * will be performed. This saves precious physical memory when the system
> + * really does not have valid PFNs in some address ranges.
> + *
> + *  vmemmap +--------------+
> + *     +    |  page struct +----------+  PFN is valid
> + *     |    +--------------+          |
> + *     |    |  page struct |          |  PFN is invalid
> + *     |    +--------------+          |
> + *     |    |  page struct +------+   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct |      |   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct |      |   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct +--+   |   |
> + *     |    +--------------+  |   |   |
> + *     |    |  page struct |  |   |   |       +-------------+
> + *     |    +--------------+  |   |   +-----> |     PFN     |
> + *     |    |  page struct |  |   |           +-------------+
> + *     |    +--------------+  |   +---------> |     PFN     |
> + *     |    |  page struct |  |               +-------------+
> + *     |    +--------------+  +-------------> |     PFN     |
> + *     |    |  page struct |                  +-------------+
> + *     |    +--------------+           +----> |     PFN     |
> + *     |    |  page struct |           |      +-------------+
> + *     |    +--------------+           |     Bolted in hash table
> + *     |    |  page struct +-----------+
> + *     v    +--------------+
> + *
> + * VMEMMAP_BASE (0xf000) region has a total range of 64TB but
> + * then it uses NR_MEM_SECTIONS * PAGES_PER_SECTION * sizeof(page struct)
> + * amount of virtual memory from it.
> + */
>  #define vmemmap  ((struct page *)VMEMMAP_BASE)
>
>  /* Advertise special mapping type for AGP */
> -- 
> 2.1.0
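
The layout described in the comment boils down to plain pointer arithmetic on one large virtual array. A minimal user-space sketch, assuming a hypothetical `struct page_model` and a small static array in place of the fixed kernel virtual address:

```c
#include <stddef.h>

/* Hypothetical, tiny stand-in for struct page. */
struct page_model {
	unsigned long flags;
};

/* In the kernel vmemmap is a fixed virtual address; here it is an array. */
#define NR_PFNS 16
static struct page_model vmemmap_model[NR_PFNS];

/* The struct page for a PFN is simply vmemmap + pfn. */
static struct page_model *pfn_to_page_model(unsigned long pfn)
{
	return vmemmap_model + pfn;
}

/* The inverse mapping is a pointer difference. */
static unsigned long page_to_pfn_model(const struct page_model *page)
{
	return (unsigned long)(page - vmemmap_model);
}

/* Virtual space used: sections * pages per section * sizeof(struct page). */
static size_t vmemmap_virtual_size(size_t nr_sections,
				   size_t pages_per_section)
{
	return nr_sections * pages_per_section * sizeof(struct page_model);
}
```

In the sketch every slot is backed by memory. In the layout the comment describes, only the slots covering valid PFNs get physical backing, while the virtual range stays contiguous so the arithmetic above keeps working.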


RE: [PATCH][v3] drivers/memory: Add deep sleep support for IFC

2016-02-17 Thread Raghav Dogra


> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Tuesday, February 16, 2016 2:05 PM
> To: Raghav Dogra ; linuxppc-dev@lists.ozlabs.org
> Cc: Prabhakar Kushwaha 
> Subject: Re: [PATCH][v3] drivers/memory: Add deep sleep support for IFC
> 
> On Mon, 2016-02-15 at 11:44 +0530, Raghav Dogra wrote:
> > Add support of suspend, resume function to support deep sleep.
> > Also make sure of SRAM initialization  during resume.
> >
> > Signed-off-by: Prabhakar Kushwaha 
> > Signed-off-by: Raghav Dogra 
> > ---
> > Changes for v3: Replace spin_event_timeout() with arch independent
> > macro
> >
> > Based on
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > branch "master"
> >
> >  drivers/memory/fsl_ifc.c | 165
> > +++
> >  include/linux/fsl_ifc.h  |   6 ++
> >  2 files changed, 171 insertions(+)
> >
> > diff --git a/drivers/memory/fsl_ifc.c b/drivers/memory/fsl_ifc.c index
> > acd1460..fa028bd 100644
> > --- a/drivers/memory/fsl_ifc.c
> > +++ b/drivers/memory/fsl_ifc.c
> > @@ -24,6 +24,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -35,6 +36,8 @@
> >
> >  struct fsl_ifc_ctrl *fsl_ifc_ctrl_dev;
> > EXPORT_SYMBOL(fsl_ifc_ctrl_dev);
> > +#define FSL_IFC_V1_3_0 0x0103
> > +#define IFC_TIMEOUT_MSECS  10 /* 100ms */
> 
> What does the "MSECS" mean in IFC_TIMEOUT_MSECS?  It's a unit without a
> quantity.

Yes, I agree. I will rename it to IFC_WAIT_ITR.

> 
> >
> >  /*
> >   * convert_ifc_address - convert the base address @@ -309,6 +312,163
> > @@ err:
> > return ret;
> >  }
> >
> > +#ifdef CONFIG_PM_SLEEP
> > +/* save ifc registers */
> > +static int fsl_ifc_suspend(struct device *dev) {
> > +   struct fsl_ifc_ctrl *ctrl = dev_get_drvdata(dev);
> > +   struct fsl_ifc_regs __iomem *ifc = ctrl->regs;
> > +   __be32 nand_evter_intr_en, cm_evter_intr_en, nor_evter_intr_en,
> > +gpcm_evter_intr_en
> > ;
> 
> s/__be32/u32/ as they've already been converted to host endianness.
> 
> Also please repeat the type on a new line rather than use continuation lines
> to declare more variables (and don't indent continuation lines so far).
> 

Okay, will take care of this in the next version.

> > +
> > +   ctrl->saved_regs = kzalloc(sizeof(struct fsl_ifc_regs),
> > GFP_KERNEL);
> > +   if (!ctrl->saved_regs)
> > +   return -ENOMEM;
> 
> Allocate memory at probe time, not here.
> 

But, why allocate memory at the probe when it is not known at that time whether
deep sleep state would be required or not? Is that because we want to save time
while going to deep sleep?

> > +   cm_evter_intr_en = ifc_in32(&ifc->cm_evter_intr_en);
> > +   nand_evter_intr_en = ifc_in32(&ifc->ifc_nand.nand_evter_intr_en);
> > +   nor_evter_intr_en = ifc_in32(&ifc->ifc_nor.nor_evter_intr_en);
> > +   gpcm_evter_intr_en = ifc_in32(&ifc->ifc_gpcm.gpcm_evter_intr_en);
> > +
> > +/* IFC interrupts disabled */
> > +
> > +   ifc_out32(0x0, &ifc->cm_evter_intr_en);
> 
> Indent the comments the same as the code.
> 

Okay.

> > +   ifc_out32(0x0, &ifc->ifc_nand.nand_evter_intr_en);
> > +   ifc_out32(0x0, &ifc->ifc_nor.nor_evter_intr_en);
> > +   ifc_out32(0x0, &ifc->ifc_gpcm.gpcm_evter_intr_en);
> > +
> > +   memcpy_fromio(ctrl->saved_regs, ifc, sizeof(struct fsl_ifc_regs));
> > +
> > +/* save the interrupt values */
> > +   ctrl->saved_regs->cm_evter_intr_en = cm_evter_intr_en;
> > +   ctrl->saved_regs->ifc_nand.nand_evter_intr_en =
> nand_evter_intr_en;
> > +   ctrl->saved_regs->ifc_nor.nor_evter_intr_en = nor_evter_intr_en;
> > +   ctrl->saved_regs->ifc_gpcm.gpcm_evter_intr_en =
> gpcm_evter_intr_en;
> 
> Why didn't you use the memcpy_fromio() to save these, and clear intr_en
> later?
> 

I used it whenever I did a write/read on iomem. In this case, both memories 
are non iomem.

> That said, I still don't like this approach.  I'd rather see the nand driver 
> save
> the registers it cares about, and this driver wouldn't have to do much other
> than quiesce the rest of the interrupts.
> 

Okay, we will analyze the required changes and include them.

> > +
> > +   return 0;
> > +}
> > +
> > +/* restore ifc registers */
> > +static int fsl_ifc_resume(struct device *dev) {
> > +   struct fsl_ifc_ctrl *ctrl = dev_get_drvdata(dev);
> > +   struct fsl_ifc_regs __iomem *ifc = ctrl->regs;
> > +   struct fsl_ifc_regs *savd_regs = ctrl->saved_regs;
> > +   uint32_t ver = 0, ncfgr, timeout, ifc_bank, i;
> 
> s/savd/saved/
> 

Okay.

> > +
> > +/*
> > + * IFC interrupts disabled
> > + */
> > +   ifc_out32(0x0, &ifc->cm_evter_intr_en);
> > +   ifc_out32(0x0, &ifc->ifc_nand.nand_evter_intr_en);
> > +   ifc_out32(0x0, &ifc->ifc_nor.nor_evter_intr_en);
> > +   ifc_out32(0x0, &ifc->ifc_gpcm.gpcm_evter_intr_en);
> > +
> > +
> > +   if (ctrl->saved_regs) {
> > +   for (ifc_bank = 0; ifc_bank < 

Re: [PATCH v8 40/45] drivers/of: Split unflatten_dt_node()

2016-02-17 Thread Rob Herring
On Tue, Feb 16, 2016 at 9:44 PM, Gavin Shan  wrote:
> The function unflatten_dt_node() is called recursively to unflatten
> device nodes and properties in the FDT blob. It looks complicated
> and hard to be understood.
>
> This splits the function into 3 functions: populate_properties(),
> populate_node() and unflatten_dt_node(). populate_properties(),
> which is called by populate_node(), creates properties for the
> indicated device node. The latter one creates the device nodes
> from FDT blob. unflatten_dt_node() gets the offset in FDT blob for
> next device nodes and then calls populate_node(). No logical
> changes introduced.
>
> Signed-off-by: Gavin Shan 
> ---
>  drivers/of/fdt.c | 249 
> ---
>  1 file changed, 147 insertions(+), 102 deletions(-)

One nit, otherwise:

Acked-by: Rob Herring 

[...]

> +   /* And we process the "ibm,phandle" property
> +* used in pSeries dynamic device tree
> +* stuff
> +*/
> +   if (!strcmp(pname, "ibm,phandle"))
> +   np->phandle = be32_to_cpup(val);
> +
> +   pp->name   = (char *)pname;
> +   pp->length = sz;
> +   pp->value  = (__be32 *)val;

This cast should not be needed.

> +   *pprev = pp;
> +   pprev  = >next;
> +   pprev  = &pp->next;

Re: [PATCH 2/2] powerpc: Add POWER9 cputable entry

2016-02-17 Thread oliver
On Wed, Feb 17, 2016 at 10:09 PM, Michael Ellerman wrote:

> On Wed, 2016-02-17 at 16:07 +1100, Michael Neuling wrote:
>
> > Add a cputable entry for POWER9.  More code is required to actually
> > boot and run on a POWER9 but this gets the base piece in which we can
> > start building on.
> >
> > Copies over from POWER8 except for:
> > - Adds a new CPU_FTR_ARCH_30 bit to start hanging new architecture
>
> ARCH thirty?
>
> Would CPU_FTR_ARCH_3 read better?
>
> Or CPU_FTR_ARCH_3_00 ?


The user visible version flags all have the pattern ARCH_X_XX while the
in-kernel flags use ARCH_XXX. It should probably be CPU_FTR_ARCH_300 for
consistency with the other kernel flags.

> > +#define COMMON_USER_POWER9   (COMMON_USER_PPC64 | PPC_FEATURE_ARCH_2_06 |\
> > +  PPC_FEATURE_SMT | PPC_FEATURE_ICACHE_SNOOP | \
> > +  PPC_FEATURE_TRUE_LE | \
> > +  PPC_FEATURE_PSERIES_PERFMON_COMPAT)
>
> That looks like it's == COMMON_USER_POWER8.
>
> > +#define COMMON_USER2_POWER9  (PPC_FEATURE2_ARCH_2_07 | \
> > +  PPC_FEATURE2_HTM_COMP | \
> > +  PPC_FEATURE2_HTM_NOSC_COMP | \
> > +  PPC_FEATURE2_DSCR | \
> > +  PPC_FEATURE2_ISEL | PPC_FEATURE2_TAR | \
> > +  PPC_FEATURE2_VEC_CRYPTO | \
> > +  PPC_FEATURE2_ARCH_3_00 | \
> > +  PPC_FEATURE2_HAS_IEEE128)
>
> And this could be COMMON_USER_POWER8 + ARCH_3 + HAS_IEEE128 I think?


It could be, but similarly the POWER8 flags could also be POWER7 + some. I
think they're separate so flags can be easily removed if need be, but I'm
not sure how useful that is.

Re: [PATCH 2/2] powerpc: Add POWER9 cputable entry

2016-02-17 Thread Michael Ellerman
On Wed, 2016-02-17 at 23:28 +1100, oliver wrote:

> On Wed, Feb 17, 2016 at 10:09 PM, Michael Ellerman wrote:

> > On Wed, 2016-02-17 at 16:07 +1100, Michael Neuling wrote:
> > 
> > > Add a cputable entry for POWER9.  More code is required to actually
> > > boot and run on a POWER9 but this gets the base piece in which we can
> > > start building on.
> > >
> > > Copies over from POWER8 except for:
> > > - Adds a new CPU_FTR_ARCH_30 bit to start hanging new architecture
> > 
> > ARCH thirty?
> > 
> > Would CPU_FTR_ARCH_3 read better?
> > 
> > Or CPU_FTR_ARCH_3_00 ?

> The user visible version flags all have the pattern ARCH_X_XX while the
> in-kernel flags use ARCH_XXX. It should probably be CPU_FTR_ARCH_300 for
> consistency with the other kernel flags.

Yeah, 300 is ugly too.

I'm not sure if the plan is for the next version to be 3.01 or 4, hopefully the
latter.

It would be a pity if we had ARCH_300 and then the next version was 4.

So I'm inclined to say now is the time where we break from the 2.0x tradition,
and just use ARCH_3.

> > > +#define COMMON_USER_POWER9   (COMMON_USER_PPC64 | PPC_FEATURE_ARCH_2_06 |\
> > > +  PPC_FEATURE_SMT | PPC_FEATURE_ICACHE_SNOOP | \
> > > +  PPC_FEATURE_TRUE_LE | \
> > > +  PPC_FEATURE_PSERIES_PERFMON_COMPAT)
> > 
> > That looks like it's == COMMON_USER_POWER8.
> > 
> > > +#define COMMON_USER2_POWER9  (PPC_FEATURE2_ARCH_2_07 | \
> > > +  PPC_FEATURE2_HTM_COMP | \
> > > +  PPC_FEATURE2_HTM_NOSC_COMP | \
> > > +  PPC_FEATURE2_DSCR | \
> > > +  PPC_FEATURE2_ISEL | PPC_FEATURE2_TAR | \
> > > +  PPC_FEATURE2_VEC_CRYPTO | \
> > > +  PPC_FEATURE2_ARCH_3_00 | \
> > > +  PPC_FEATURE2_HAS_IEEE128)
> > 
> > And this could be COMMON_USER_POWER8 + ARCH_3 + HAS_IEEE128 I think?

> It could be, but similarly the POWER8 flags could also be POWER7 + some.

You just found yourself another cleanup to do :)

> I think they're separate so flags can be easily removed if need be, but I'm not
> sure how useful that is.

It's not useful. Looking at the history it looks like we have literally *never*
removed a bit.
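
For illustration only, here is a minimal sketch of the layering suggested above, where the POWER9 user-feature mask is built from the POWER8 one plus the new bits. The bit values and the FEAT2_*/USER2_* names are placeholders invented for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values, chosen only for this example. */
#define FEAT2_ARCH_2_07    UINT32_C(0x80000000)
#define FEAT2_DSCR         UINT32_C(0x20000000)
#define FEAT2_ISEL         UINT32_C(0x08000000)
#define FEAT2_ARCH_3_00    UINT32_C(0x00800000)
#define FEAT2_HAS_IEEE128  UINT32_C(0x00400000)

/* The older CPU's mask is written out once ... */
#define USER2_POWER8 (FEAT2_ARCH_2_07 | FEAT2_DSCR | FEAT2_ISEL)
/* ... and the newer CPU's mask is "older mask + new bits". */
#define USER2_POWER9 (USER2_POWER8 | FEAT2_ARCH_3_00 | FEAT2_HAS_IEEE128)

/* Compile-time check that the layered mask is a superset of its base. */
static_assert((USER2_POWER9 & USER2_POWER8) == USER2_POWER8,
              "POWER9 mask must contain every POWER8 bit");

/* Returns the bits present in 'newer' that 'older' does not have. */
static uint32_t added_bits(uint32_t newer, uint32_t older)
{
    return newer & ~older;
}
```

The trade-off is the one raised in the thread: with layering, dropping a single bit from the newer mask needs an explicit `& ~bit`, which only matters if bits are ever removed.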

cheers

Re: [V3] powerpc/powernv: Remove support for p5ioc2

2016-02-17 Thread Michael Ellerman
On Mon, 2016-02-08 at 04:08:20 UTC, Russell Currey wrote:
> "p5ioc2 is used by approximately 2 machines in the world, and has never
> ever been a supported configuration."
> 
> The code for p5ioc2 is essentially unused and complicates what is already
> a very complicated codebase.  Its removal is essentially a "free win" in
> the effort to simplify the powernv PCI code.
> 
> In addition, support for p5ioc2 has been dropped from skiboot.  There's no
> reason to keep it around in the kernel.
> 
> Signed-off-by: Russell Currey 
> Acked-by: Gavin Shan 
> Acked-by: Stewart Smith 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2de50e9674fc4ca3c6174b0447

cheers

Re: powerpc/xmon: fix typo in usage message

2016-02-17 Thread Michael Ellerman
On Wed, 2016-01-27 at 00:29:44 UTC, Andrew Donnellan wrote:
> Signed-off-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2e34057929cad8a90b77558121

cheers

Re: powerpc/eeh: fix incorrect function name in comment

2016-02-17 Thread Michael Ellerman
On Mon, 2016-02-08 at 03:39:19 UTC, Andrew Donnellan wrote:
> The comment block above pcibios_set_pcie_reset_state() incorrectly refers
> to pcibios_set_pcie_slot_reset(). Fix the comment accordingly.
> 
> Signed-off-by: Andrew Donnellan 
> Acked-by: Gavin Shan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/31f6a4ada14de04ee6cd7ff03c

cheers

Re: powerpc/xmon: Add xmon command to dump process/task similar to ps(1)

2016-02-17 Thread Michael Ellerman
On Mon, 2015-11-23 at 15:01:15 UTC, Douglas Miller wrote:
> Add 'P' command with optional task_struct address to dump all/one task's
> information: task pointer, kernel stack pointer, PID, PPID, state
> (interpreted), CPU where (last) running, and command.
> 
> Introduce XMON_PROTECT macro to standardize memory-access-fault
> protection (setjmp). Initially used only by the 'P' command.
> 
> Signed-off-by: Douglas Miller 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6dfb54049f9a99b24fe5d5cd2d

cheers

Re: [v2, 4/4] powerpc/powernv: Simplify definitions of EEH debugfs handlers

2016-02-17 Thread Michael Ellerman
On Tue, 2016-02-09 at 04:50:24 UTC, Gavin Shan wrote:
> The EEH debugfs handlers all have the same prototype. This introduces
> a macro to define them, which simplifies the code. No logical
> changes.
> 
> Signed-off-by: Gavin Shan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ccc9662da5494a7c4ff5ed5d16

cheers

Re: [V2,2/2] powerpc/xmon: add command to dump OPAL msglog

2016-02-17 Thread Michael Ellerman
On Tue, 2016-02-09 at 07:17:49 UTC, Andrew Donnellan wrote:
> Add the 'do' command to dump the OPAL msglog in xmon.
> 
> Signed-off-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/fde93a0f774f510bfaabccd5ba

cheers

Re: [V7, 1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR

2016-02-17 Thread Michael Ellerman
On Thu, 2015-10-22 at 01:22:14 UTC, Wei Yang wrote:
> On PHB3, we enable SRIOV devices by mapping IOV BAR with M64 BARs. If a
> SRIOV device's IOV BAR is not 64bit-prefetchable, this is not assigned from
> 64bit prefetchable window, which means M64 BAR can't work on it.
> 
> The reason is PCI bridges support only 2 memory windows and the kernel code
> programs bridges in the way that one window is 32bit-nonprefetchable and
> the other one is 64bit-prefetchable. So if devices' IOV BAR is 64bit and
> non-prefetchable, it will be mapped into 32bit space and therefore M64
> cannot be used for it.
> 
> This patch makes this explicit and truncates the IOV resource in this case to
> save MMIO space.
> 
> Signed-off-by: Wei Yang 
> Reviewed-by: Gavin Shan 
> Acked-by: Alexey Kardashevskiy 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/b0331854190e70b9d96d392572

cheers

Re: [V2,1/2] powerpc/powernv: new function to access OPAL msglog

2016-02-17 Thread Michael Ellerman
On Tue, 2016-02-09 at 07:17:48 UTC, Andrew Donnellan wrote:
> Currently, the OPAL msglog/console buffer is exposed as a sysfs file, with
> the sysfs read handler responsible for retrieving the log from the OPAL
> buffer. We'd like to be able to use it in xmon as well.
> 
> Refactor the OPAL msglog code to create a new function, opal_msglog_copy(),
> that copies to an arbitrary buffer. Separate the initialisation code into
> generic memcons init and sysfs file creation.
> 
> Signed-off-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9b4fffa14906fce7aabf1f032d


I see you've posted a v3 since I merged this, please send an incremental patch
with the changes.

cheers

Re: [V2,1/1] powerpc/perf/hv-gpci: Increase request buffer size

2016-02-17 Thread Michael Ellerman
On Wed, 2016-02-10 at 03:24:05 UTC, Sukadev Bhattiprolu wrote:
> From f1afe08fbc9797ff63adf03efe564a807a37cfe6 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu 
> Date: Tue, 9 Feb 2016 02:47:45 -0500
> Subject: [PATCH V2 1/1] powerpc/perf/hv-gpci: Increase request buffer size
> 
> The GPCI hcall allows for a 4K buffer but we limit the buffer to 1K.
> The problem with a 1K buffer is that if a request returns more values
> than can be accommodated in the 1K buffer, the request will
> fail.
> 
> The buffer we are using is currently allocated on the stack and hence
> limited in size. Instead use a per-CPU 4K buffer like we do with 24x7
> counters (hv-24x7.c).
> 
> While here, rename the macro GPCI_MAX_DATA_BYTES to HGPCI_MAX_DATA_BYTES
> for consistency with 24x7 counters.
> 
> Signed-off-by: Sukadev Bhattiprolu 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e4f226b1580b36550727c324b4

cheers

[RFC 2/4] powerpc/mm: Add comments to the vmemmap layout

2016-02-17 Thread Anshuman Khandual
Add some explanation of the layout of the vmemmap virtual address
space and of how physical page mapping is only used for valid PFNs
present at any point on the system.

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 41 
 1 file changed, 41 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 8d1c41d..9db4a86 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -26,6 +26,47 @@
 #define IOREMAP_BASE   (PHB_IO_END)
 #define IOREMAP_END    (KERN_VIRT_START + KERN_VIRT_SIZE)
 
+/*
+ * Starting address of the virtual address space where all page structs
+ * for the system physical memory are stored under the vmemmap sparse
+ * memory model. All possible struct pages are logically stored in a
+ * sequence in this virtual address space irrespective of the fact
+ * whether any given PFN is valid or even the memory section is valid
+ * or not. During boot and memory hotplug add operation when new memory
+ * sections are added, real physical allocation and hash table bolting
+ * will be performed. This saves precious physical memory when the system
+ * really does not have valid PFNs in some address ranges.
+ *
+ *  vmemmap +--------------+
+ *     +    |  page struct +----------+  PFN is valid
+ *     |    +--------------+          |
+ *     |    |  page struct |          |  PFN is invalid
+ *     |    +--------------+          |
+ *     |    |  page struct +------+   |
+ *     |    +--------------+      |   |
+ *     |    |  page struct |      |   |
+ *     |    +--------------+      |   |
+ *     |    |  page struct |      |   |
+ *     |    +--------------+      |   |
+ *     |    |  page struct +--+   |   |
+ *     |    +--------------+  |   |   |
+ *     |    |  page struct |  |   |   |   +-------------+
+ *     |    +--------------+  |   |   +-> |     PFN     |
+ *     |    |  page struct |  |   |       +-------------+
+ *     |    +--------------+  |   +-----> |     PFN     |
+ *     |    |  page struct |  |           +-------------+
+ *     |    +--------------+  +---------> |     PFN     |
+ *     |    |  page struct |              +-------------+
+ *     |    +--------------+   +--------> |     PFN     |
+ *     |    |  page struct |   |          +-------------+
+ *     |    +--------------+   |     Bolted in hash table
+ *     |    |  page struct +---+
+ *     v    +--------------+
+ *
+ * VMEMMAP_BASE (0xf000) region has a total range of 64TB but
+ * then it uses NR_MEM_SECTIONS * PAGES_PER_SECTION * sizeof(page struct)
+ * amount of virtual memory from it.
+ */
 #define vmemmap        ((struct page *)VMEMMAP_BASE)
 
 /* Advertise special mapping type for AGP */
-- 
2.1.0
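
As a worked example of the size formula in the comment above (NR_MEM_SECTIONS * PAGES_PER_SECTION * sizeof(page struct)), the sketch below computes the vmemmap footprint from its inputs. The parameter values used in the usage note (46 physical address bits, 16MB sections, 64K pages, a 64-byte struct page) are assumptions for illustration and should be checked against the actual configuration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of vmemmap virtual space needed to describe all possible memory:
 *   NR_MEM_SECTIONS   = 2^(max_physmem_bits - section_size_bits)
 *   PAGES_PER_SECTION = 2^(section_size_bits - page_shift)
 */
static uint64_t vmemmap_bytes(unsigned max_physmem_bits,
                              unsigned section_size_bits,
                              unsigned page_shift,
                              uint64_t struct_page_size)
{
    assert(max_physmem_bits >= section_size_bits);
    assert(section_size_bits >= page_shift);

    uint64_t nr_mem_sections = UINT64_C(1) << (max_physmem_bits - section_size_bits);
    uint64_t pages_per_section = UINT64_C(1) << (section_size_bits - page_shift);

    return nr_mem_sections * pages_per_section * struct_page_size;
}
```

Under those assumed values, vmemmap_bytes(46, 24, 16, 64) is 2^22 * 2^8 * 2^6 = 2^36 bytes, i.e. 64GB of virtual space out of the 64TB region, and only the parts covering valid sections get physically backed.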

[RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping

2016-02-17 Thread Anshuman Khandual
This renames the global list which tracks all the virtual to physical
mapping and also the global list which tracks all the available unused
vmemmap_hw_map node structures. It also attempts to explain the purpose
of these global linked lists and points out a possible race condition.

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/include/asm/pgalloc-64.h |  2 +-
 arch/powerpc/kernel/machine_kexec.c   |  2 +-
 arch/powerpc/mm/init_64.c | 82 +--
 3 files changed, 52 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index e03b41c..6e21a2a 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -22,7 +22,7 @@ struct vmemmap_hw_map {
 	unsigned long paddr;
 	unsigned long vaddr;
 };
-extern struct vmemmap_hw_map *vmemmap_list;
+extern struct vmemmap_hw_map *vmemmap_global;
 
 /*
  * Functions that deal with pagetables that could be at any level of
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index 0d90798..eb6876c 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -77,7 +77,7 @@ void arch_crash_save_vmcoreinfo(void)
 	VMCOREINFO_SYMBOL(contig_page_data);
 #endif
 #if defined(CONFIG_PPC64) && defined(CONFIG_SPARSEMEM_VMEMMAP)
-	VMCOREINFO_SYMBOL(vmemmap_list);
+	VMCOREINFO_SYMBOL(vmemmap_global);
 	VMCOREINFO_SYMBOL(mmu_vmemmap_psize);
 	VMCOREINFO_SYMBOL(mmu_psize_defs);
 	VMCOREINFO_STRUCT_SIZE(vmemmap_hw_map);
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 9b5dea3..d998f3f 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -245,45 +245,63 @@ static void vmemmap_remove_mapping(unsigned long start,
 
 #endif /* CONFIG_PPC_BOOK3E */
 
-struct vmemmap_hw_map *vmemmap_list;
-static struct vmemmap_hw_map *next;
-static int num_left;
-static int num_freed;
+/*
+ * vmemmap virtual address space does not have a page table to track
+ * existing physical mapping. The vmemmap_global list maintains the
+ * physical mapping at all times, whereas the vmemmap_avail list
+ * maintains the available vmemmap_hw_map structures which got deleted
+ * from the vmemmap_global list during system runtime (memory hotplug
+ * remove operation for example). The freed structures are reused later
+ * when new requests come in without allocating new fresh memory. This
+ * pointer also tracks the allocated vmemmap_hw_map structures as we
+ * allocate one full page memory at a time when we don't have any.
+ */
+struct vmemmap_hw_map *vmemmap_global;
+static struct vmemmap_hw_map *vmemmap_avail;
+
+/* XXX: The same pointer vmemmap_avail tracks individual chunks inside
+ * the allocated full page during the boot time and again tracks the
+ * freed nodes during runtime. It is racy but it does not happen as
+ * both uses are separated by the boot process. Will create problem if
+ * somehow we have memory hotplug operation during boot !!
+ */
+static int free_chunk; /* Allocated chunks available */
+static int free_node;  /* Freed nodes available */
 
-static __meminit struct vmemmap_hw_map * vmemmap_list_alloc(int node)
+static __meminit struct vmemmap_hw_map * vmemmap_global_alloc(int node)
 {
 	struct vmemmap_hw_map *vmem_back;
 	/* get from freed entries first */
-	if (num_freed) {
-		num_freed--;
-		vmem_back = next;
-		next = next->link;
+	if (free_node) {
+		free_node--;
+		vmem_back = vmemmap_avail;
+		vmemmap_avail = vmemmap_avail->link;
 
 		return vmem_back;
 	}
 
 	/* allocate a page when required and hand out chunks */
-	if (!num_left) {
-		next = vmemmap_alloc_block(PAGE_SIZE, node);
-		if (unlikely(!next)) {
+	if (!free_chunk) {
+		vmemmap_avail = vmemmap_alloc_block(PAGE_SIZE, node);
+		if (unlikely(!vmemmap_avail)) {
 			WARN_ON(1);
 			return NULL;
 		}
-		num_left = PAGE_SIZE / sizeof(struct vmemmap_hw_map);
+		free_chunk = PAGE_SIZE / sizeof(struct vmemmap_hw_map);
 	}
 
-	num_left--;
+	free_chunk--;
 
-	return next++;
+	return vmemmap_avail++;
 }
 
-static __meminit void vmemmap_list_populate(unsigned long paddr,
+static __meminit void vmemmap_global_populate(unsigned long paddr,
 					    unsigned long start,
 					    int node)
 {
 	struct vmemmap_hw_map *vmem_back;
 
-	vmem_back = vmemmap_list_alloc(node);
+	vmem_back = vmemmap_global_alloc(node);
 	if (unlikely(!vmem_back)) {
 		WARN_ON(1);
 		return;
@@ -291,9 +309,9 @@ 
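
For readers following the allocator logic in this patch, here is a small user-space sketch of the same idea: hand out nodes from a free list first, otherwise carve them out of a freshly allocated page-sized chunk. It is not the kernel code. The names (hw_map_pool, hw_map_alloc, hw_map_free, CHUNK_BYTES) are invented for this example, malloc() stands in for vmemmap_alloc_block(), and, unlike the patch, it keeps the free list and the chunk cursor in two separate pointers, which sidesteps the shared-pointer concern flagged in the XXX comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_BYTES 4096  /* stand-in for PAGE_SIZE; an assumption */

struct hw_map {
    struct hw_map *link;
    unsigned long paddr;
    unsigned long vaddr;
};

struct hw_map_pool {
    struct hw_map *free_list;   /* nodes handed back via hw_map_free() */
    struct hw_map *next_chunk;  /* next unused slot in the current chunk */
    int chunks_left;            /* unused slots left in the current chunk */
};

static struct hw_map *hw_map_alloc(struct hw_map_pool *pool)
{
    struct hw_map *node;

    assert(pool != NULL);

    /* Reuse a previously freed node if one is available. */
    if (pool->free_list) {
        node = pool->free_list;
        pool->free_list = node->link;
        return node;
    }

    /* Otherwise allocate a whole chunk and hand out slots from it. */
    if (pool->chunks_left == 0) {
        pool->next_chunk = malloc(CHUNK_BYTES);
        if (!pool->next_chunk)
            return NULL;
        pool->chunks_left = CHUNK_BYTES / sizeof(struct hw_map);
    }

    pool->chunks_left--;
    return pool->next_chunk++;
}

static void hw_map_free(struct hw_map_pool *pool, struct hw_map *node)
{
    assert(pool != NULL && node != NULL);

    /* Push the node onto the free list for later reuse. */
    node->link = pool->free_list;
    pool->free_list = node;
}
```

As in the original, chunks are never returned to the system; freed nodes are only recycled.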

[RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section

2016-02-17 Thread Anshuman Khandual
The commit (16a05bff1: powerpc: start loop at section start of
start in vmemmap_populated()) reused the 'start' variable to compute
the starting address of the memory section to which the given address
belongs. The same variable is then used to iterate over the starting
addresses of all memory sections before reaching the 'end' address.
Renaming it to 'section_start' makes the logic clearer.

Fixes: 16a05bff1 ("powerpc: start loop at section start of start in vmemmap_populated()")
Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/mm/init_64.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 379a6a9..d6b9b4d 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -170,11 +170,15 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
  */
 static int __meminit vmemmap_populated(unsigned long start, int page_size)
 {
-	unsigned long end = start + page_size;
-	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
+	unsigned long end, section_start;
 
-	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
-		if (pfn_valid(page_to_pfn((struct page *)start)))
+	end = start + page_size;
+	section_start = (unsigned long)(pfn_to_page
+					(vmemmap_section_start(start)));
+
+	for (; section_start < end; section_start
+				+= (PAGES_PER_SECTION * sizeof(struct page)))
+		if (pfn_valid(page_to_pfn((struct page *)section_start)))
 			return 1;
 
 	return 0;
-- 
2.1.0
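
The loop shape in this patch (round the start address down to a section boundary, then step one section at a time until 'end') can be sketched in plain C as below. This is a simplification under stated assumptions: sections are a power-of-two number of bytes, validity is looked up in a caller-supplied array rather than via pfn_valid(), and the function name range_populated is invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if any section overlapping [start, start + size) is marked
 * valid in section_valid[], otherwise 0.
 */
static int range_populated(unsigned long start, unsigned long size,
                           unsigned long section_bytes,
                           const unsigned char *section_valid,
                           size_t nr_sections)
{
    /* section_bytes must be a non-zero power of two for the mask below. */
    assert(section_bytes != 0 && (section_bytes & (section_bytes - 1)) == 0);

    unsigned long end = start + size;
    /* Start of the section that contains 'start'. */
    unsigned long section_start = start & ~(section_bytes - 1);

    for (; section_start < end; section_start += section_bytes) {
        size_t idx = section_start / section_bytes;

        if (idx < nr_sections && section_valid[idx])
            return 1;
    }

    return 0;
}
```

Keeping 'start' untouched and iterating with a separate 'section_start' is the readability point the rename makes.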

[RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements

2016-02-17 Thread Anshuman Khandual
The structure to track single virtual to physical mapping has
been renamed from vmemmap_backing to vmemmap_hw_map which sounds
more appropriate. This forms a single entry of the global linked
list tracking all of the vmemmap physical mapping. The changes
are as follows.

vmemmap_backing.list -> vmemmap_hw_map.link
vmemmap_backing.phys -> vmemmap_hw_map.paddr
vmemmap_backing.virt_addr -> vmemmap_hw_map.vaddr

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/include/asm/pgalloc-64.h | 16 +++---
 arch/powerpc/kernel/machine_kexec.c   |  8 ++---
 arch/powerpc/mm/init_64.c | 58 +--
 3 files changed, 44 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index 69ef28a..e03b41c 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -11,12 +11,18 @@
 #include 
 #include 
 
-struct vmemmap_backing {
-	struct vmemmap_backing *list;
-	unsigned long phys;
-	unsigned long virt_addr;
+/*
+ * This structure tracks a single virtual page mapping from
+ * the vmemmap address space. This element is required to
+ * track virtual to physical mapping of page structures in
+ * absence of a page table at boot time.
+ */
+struct vmemmap_hw_map {
+	struct vmemmap_hw_map *link;
+	unsigned long paddr;
+	unsigned long vaddr;
 };
-extern struct vmemmap_backing *vmemmap_list;
+extern struct vmemmap_hw_map *vmemmap_list;
 
 /*
  * Functions that deal with pagetables that could be at any level of
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index 015ae55..0d90798 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -80,10 +80,10 @@ void arch_crash_save_vmcoreinfo(void)
 	VMCOREINFO_SYMBOL(vmemmap_list);
 	VMCOREINFO_SYMBOL(mmu_vmemmap_psize);
 	VMCOREINFO_SYMBOL(mmu_psize_defs);
-	VMCOREINFO_STRUCT_SIZE(vmemmap_backing);
-	VMCOREINFO_OFFSET(vmemmap_backing, list);
-	VMCOREINFO_OFFSET(vmemmap_backing, phys);
-	VMCOREINFO_OFFSET(vmemmap_backing, virt_addr);
+	VMCOREINFO_STRUCT_SIZE(vmemmap_hw_map);
+	VMCOREINFO_OFFSET(vmemmap_hw_map, link);
+	VMCOREINFO_OFFSET(vmemmap_hw_map, paddr);
+	VMCOREINFO_OFFSET(vmemmap_hw_map, vaddr);
 	VMCOREINFO_STRUCT_SIZE(mmu_psize_def);
 	VMCOREINFO_OFFSET(mmu_psize_def, shift);
 #endif
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index d6b9b4d..9b5dea3 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -194,7 +194,7 @@ static int __meminit vmemmap_populated(unsigned long start, int page_size)
 #ifdef CONFIG_PPC_BOOK3E
 static void __meminit vmemmap_create_mapping(unsigned long start,
 					    unsigned long page_size,
-					    unsigned long phys)
+					    unsigned long paddr)
 {
 	/* Create a PTE encoding without page size */
 	unsigned long i, flags = _PAGE_PRESENT | _PAGE_ACCESSED |
@@ -207,11 +207,11 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
 	flags |= mmu_psize_defs[mmu_vmemmap_psize].enc << 8;
 
 	/* For each PTE for that area, map things. Note that we don't
-	 * increment phys because all PTEs are of the large size and
+	 * increment paddr because all PTEs are of the large size and
 	 * thus must have the low bits clear
 	 */
 	for (i = 0; i < page_size; i += PAGE_SIZE)
-		BUG_ON(map_kernel_page(start + i, phys, flags));
+		BUG_ON(map_kernel_page(start + i, paddr, flags));
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -223,9 +223,9 @@ static void vmemmap_remove_mapping(unsigned long start,
 #else /* CONFIG_PPC_BOOK3E */
 static void __meminit vmemmap_create_mapping(unsigned long start,
 					    unsigned long page_size,
-					    unsigned long phys)
+					    unsigned long paddr)
 {
-	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
+	int  mapped = htab_bolt_mapping(start, start + page_size, paddr,
 					pgprot_val(PAGE_KERNEL),
 					mmu_vmemmap_psize,
 					mmu_kernel_ssize);
@@ -245,19 +245,19 @@ static void vmemmap_remove_mapping(unsigned long start,
 
 #endif /* CONFIG_PPC_BOOK3E */
 
-struct vmemmap_backing *vmemmap_list;
-static struct vmemmap_backing *next;
+struct vmemmap_hw_map *vmemmap_list;
+static struct vmemmap_hw_map *next;
 static int num_left;
 static int num_freed;
 
-static __meminit struct vmemmap_backing * vmemmap_list_alloc(int node)
+static __meminit struct vmemmap_hw_map * 

[PATCH][v3] mtd/ifc: Add support for IFC controller version 2.0

2016-02-17 Thread Raghav Dogra
The new IFC controller version 2.0 has a different memory map page.
Up to IFC 1.4 the PAGE size is 4 KB; from IFC 2.0 onward the PAGE size is 64 KB.
This patch segregates the IFC global and runtime registers to appropriate
PAGE sizes.

Signed-off-by: Jaiprakash Singh 
Signed-off-by: Raghav Dogra 
Acked-by: Li Yang 
Signed-off-by: Raghav Dogra 
---
Changes for v3: not dependent on "drivers/memory: Add deep sleep support for IFC" patch

Changes for v2: rebased to resolve conflicts
Applicable to git://git.infradead.org/l2-mtd.git

This patch is dependent on "drivers/memory: Add deep sleep support for IFC"
https://patchwork.ozlabs.org/patch/582762/
which is also applicable to git://git.infradead.org/l2-mtd.git

This patch is the new version of following patch with changed title:
https://patchwork.ozlabs.org/patch/557391/

 drivers/memory/fsl_ifc.c| 36 ++---
 drivers/mtd/nand/fsl_ifc_nand.c | 72 ++---
 include/linux/fsl_ifc.h | 45 +-
 3 files changed, 87 insertions(+), 66 deletions(-)

diff --git a/drivers/memory/fsl_ifc.c b/drivers/memory/fsl_ifc.c
index 2a691da..904b4af 100644
--- a/drivers/memory/fsl_ifc.c
+++ b/drivers/memory/fsl_ifc.c
@@ -59,11 +59,11 @@ int fsl_ifc_find(phys_addr_t addr_base)
 {
 	int i = 0;
 
-	if (!fsl_ifc_ctrl_dev || !fsl_ifc_ctrl_dev->regs)
+	if (!fsl_ifc_ctrl_dev || !fsl_ifc_ctrl_dev->gregs)
 		return -ENODEV;
 
 	for (i = 0; i < fsl_ifc_ctrl_dev->banks; i++) {
-		u32 cspr = ifc_in32(&fsl_ifc_ctrl_dev->regs->cspr_cs[i].cspr);
+		u32 cspr = ifc_in32(&fsl_ifc_ctrl_dev->gregs->cspr_cs[i].cspr);
 		if (cspr & CSPR_V && (cspr & CSPR_BA) ==
 				convert_ifc_address(addr_base))
 			return i;
@@ -75,7 +75,7 @@ EXPORT_SYMBOL(fsl_ifc_find);
 
 static int fsl_ifc_ctrl_init(struct fsl_ifc_ctrl *ctrl)
 {
-	struct fsl_ifc_regs __iomem *ifc = ctrl->regs;
+	struct fsl_ifc_global __iomem *ifc = ctrl->gregs;
 
 	/*
 	 * Clear all the common status and event registers
@@ -104,7 +104,7 @@ static int fsl_ifc_ctrl_remove(struct platform_device *dev)
 	irq_dispose_mapping(ctrl->nand_irq);
 	irq_dispose_mapping(ctrl->irq);
 
-	iounmap(ctrl->regs);
+	iounmap(ctrl->gregs);
 
 	dev_set_drvdata(&dev->dev, NULL);
 	kfree(ctrl);
@@ -122,7 +122,7 @@ static DEFINE_SPINLOCK(nand_irq_lock);
 
 static u32 check_nand_stat(struct fsl_ifc_ctrl *ctrl)
 {
-	struct fsl_ifc_regs __iomem *ifc = ctrl->regs;
+	struct fsl_ifc_runtime __iomem *ifc = ctrl->rregs;
 	unsigned long flags;
 	u32 stat;
 
@@ -157,7 +157,7 @@ static irqreturn_t fsl_ifc_nand_irq(int irqno, void *data)
 static irqreturn_t fsl_ifc_ctrl_irq(int irqno, void *data)
 {
 	struct fsl_ifc_ctrl *ctrl = data;
-	struct fsl_ifc_regs __iomem *ifc = ctrl->regs;
+	struct fsl_ifc_global __iomem *ifc = ctrl->gregs;
 	u32 err_axiid, err_srcid, status, cs_err, err_addr;
 	irqreturn_t ret = IRQ_NONE;
 
@@ -215,6 +215,7 @@ static int fsl_ifc_ctrl_probe(struct platform_device *dev)
 {
 	int ret = 0;
 	int version, banks;
+	void __iomem *addr;
 
 	dev_info(&dev->dev, "Freescale Integrated Flash Controller\n");
 
@@ -225,22 +226,13 @@ static int fsl_ifc_ctrl_probe(struct platform_device *dev)
 	dev_set_drvdata(&dev->dev, fsl_ifc_ctrl_dev);
 
 	/* IOMAP the entire IFC region */
-	fsl_ifc_ctrl_dev->regs = of_iomap(dev->dev.of_node, 0);
-	if (!fsl_ifc_ctrl_dev->regs) {
+	fsl_ifc_ctrl_dev->gregs = of_iomap(dev->dev.of_node, 0);
+	if (!fsl_ifc_ctrl_dev->gregs) {
 		dev_err(&dev->dev, "failed to get memory region\n");
 		ret = -ENODEV;
 		goto err;
 	}
 
-	version = ifc_in32(&fsl_ifc_ctrl_dev->regs->ifc_rev) &
-			FSL_IFC_VERSION_MASK;
-	banks = (version == FSL_IFC_VERSION_1_0_0) ? 4 : 8;
-	dev_info(&dev->dev, "IFC version %d.%d, %d banks\n",
-		version >> 24, (version >> 16) & 0xf, banks);
-
-	fsl_ifc_ctrl_dev->version = version;
-	fsl_ifc_ctrl_dev->banks = banks;
-
 	if (of_property_read_bool(dev->dev.of_node, "little-endian")) {
 		fsl_ifc_ctrl_dev->little_endian = true;
 		dev_dbg(&dev->dev, "IFC REGISTERS are LITTLE endian\n");
@@ -249,8 +241,9 @@ static int fsl_ifc_ctrl_probe(struct platform_device *dev)
 		dev_dbg(&dev->dev, "IFC REGISTERS are BIG endian\n");
 	}
 
-	version = ioread32be(&fsl_ifc_ctrl_dev->regs->ifc_rev) &
+	version = ifc_in32(&fsl_ifc_ctrl_dev->gregs->ifc_rev) &
 			FSL_IFC_VERSION_MASK;
+
 	banks = (version == FSL_IFC_VERSION_1_0_0) ? 4 : 8;
 	dev_info(&dev->dev, "IFC version %d.%d, %d banks\n",
 		version >> 24, (version >> 16) & 0xf, banks);
@@ -258,6 
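
To make the version handling in the probe path easier to follow, here is a stand-alone sketch of the decode the patch performs on the revision register, plus the page-size rule from the commit message (4 KB up to IFC 1.4, 64 KB from IFC 2.0). The mask and version constants below are assumed values for illustration and should be checked against include/linux/fsl_ifc.h; the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against the real header before relying on them. */
#define IFC_VERSION_MASK   UINT32_C(0x0F0F0000)
#define IFC_VERSION_1_0_0  UINT32_C(0x01000000)

static_assert((IFC_VERSION_1_0_0 & IFC_VERSION_MASK) == IFC_VERSION_1_0_0,
              "version constant must fit inside the version mask");

struct ifc_version {
    unsigned major;
    unsigned minor;
    unsigned banks;
};

/* Mirrors the shifts used in the dev_info() call in the probe function. */
static struct ifc_version ifc_decode_rev(uint32_t ifc_rev)
{
    uint32_t version = ifc_rev & IFC_VERSION_MASK;
    struct ifc_version v;

    v.major = version >> 24;
    v.minor = (version >> 16) & 0xf;
    v.banks = (version == IFC_VERSION_1_0_0) ? 4 : 8;
    return v;
}

/* Register page size per the commit message: 4 KB before 2.0, else 64 KB. */
static uint32_t ifc_page_size(uint32_t ifc_rev)
{
    return ifc_decode_rev(ifc_rev).major >= 2 ? UINT32_C(64) * 1024
                                              : UINT32_C(4) * 1024;
}
```

Splitting the register map into a global block and a runtime block then amounts to placing the runtime block one such page after the global one, which is presumably what the new 'addr' variable in the probe function is for.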

Re: [PATCH v8 1/8] ppc64 (le): prepare for -mprofile-kernel

2016-02-17 Thread Michael Ellerman
On Wed, 2016-02-17 at 12:30 +0100, Torsten Duwe wrote:
> On Wed, Feb 17, 2016 at 09:55:40PM +1100, Michael Ellerman wrote:
> > 
> > On a kernel built with the 2 instruction version this will fault when the
> > function we're looking at is located at the beginning of a page. Because
> > instruction[-3] goes off the front of the mapping.
> > 
> > We can probably fix that. But it's still a bit dicey.
> 
> Not necessarily. Now that it's a separate function, it can be nested a bit deeper,
> so we don't take chances on compiler optimisation:
> 
> if (instruction[-2] == PPC_INST_STD_LR) /* where should R0 come from? there must be... */
>   {
>     if (instruction[-3] == PPC_INST_MFLR)
>       return 1;
>   }
> else if (instruction[-2] == PPC_INST_MFLR)
>     return 1;
> return 0;

Yeah true that should work in practice.

It's still trivial to construct a module that will oops the loader, but I guess
that's always been true.
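
For illustration, the nested check quoted above can be written as a self-contained function that only looks at instruction[-3] once instruction[-2] has been seen to be the std. The explicit index and lower-bound checks are an addition made for this user-space sketch (the kernel code works on a raw pointer), and the two opcode constants are the encodings of "mflr r0" and "std r0,16(r1)" as assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INST_MFLR_R0    UINT32_C(0x7c0802a6)  /* mflr r0 */
#define INST_STD_R0_LR  UINT32_C(0xf8010010)  /* std r0,16(r1) */

/*
 * 'idx' is the index of the instruction following the "bl _mcount",
 * so text[idx - 1] is the branch itself.
 * Returns 1 for either -mprofile-kernel prologue:
 *   3-insn form: mflr r0; std r0,16(r1); bl _mcount
 *   2-insn form: mflr r0; bl _mcount
 */
static int is_early_mcount_callsite(const uint32_t *text, size_t idx)
{
    assert(text != NULL);

    if (idx < 2)
        return 0;

    if (text[idx - 2] == INST_STD_R0_LR) {
        /* Only now is it safe (and necessary) to look one further back. */
        return idx >= 3 && text[idx - 3] == INST_MFLR_R0;
    }

    return text[idx - 2] == INST_MFLR_R0;
}
```

With the 2-instruction form the function returns before ever touching the third-previous word, which is the property that avoids reading off the front of the mapping.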

> > I'm wondering if we want to just say we only support the 2 instruction version.
> > Currently that means GCC 6 only, or a distro compiler with the backport of
> > e95d0248dace. But we could also ask GCC to backport it to 4.9 and 5.
> 
> IMHO that's too weak a reason for too strong a limitation. OTOH getting everyone
> to use the 2 insn version sounds appealing...

Fair enough. I'm just trying to manage the complexity explosion.

I'd certainly advocate that you backport it to your toolchain.

> Is e95d0248dace self-sufficient or does it depend on other improvements?

AFAIK it's self sufficient, it just deletes a single line. I'll ask the GCC
guys tomorrow if they can backport it if you don't beat me to it :)

cheers

Re: [PATCH v8 1/8] ppc64 (le): prepare for -mprofile-kernel

2016-02-17 Thread Torsten Duwe
On Wed, Feb 17, 2016 at 09:55:40PM +1100, Michael Ellerman wrote:
> On Wed, 2016-02-10 at 17:21 +0100, Torsten Duwe wrote:
> 
> > --- a/arch/powerpc/kernel/module_64.c
> > +++ b/arch/powerpc/kernel/module_64.c
> > @@ -476,17 +474,44 @@ static unsigned long stub_for_addr(Elf64_Shdr *sechdrs,
> >  	return (unsigned long)&stubs[i];
> >  }
> >  
> > +#ifdef CC_USING_MPROFILE_KERNEL
> > +static int is_early_mcount_callsite(u32 *instruction)
> > +{
> > +   /* -mprofile-kernel sequence starting with
> > +* mflr r0 and maybe std r0, LRSAVE(r1).
> > +*/
> > +   if ((instruction[-3] == PPC_INST_MFLR &&
> > +instruction[-2] == PPC_INST_STD_LR) ||
> > +   instruction[-2] == PPC_INST_MFLR) {
> > +   /* Nothing to be done here, it's an _mcount
> > +* call location and r2 will have to be
> > +* restored in the _mcount function.
> > +*/
> > +   return 1;
> > +   }
> > +   return 0;
> > +}
> 
> On a kernel built with the 2 instruction version this will fault when the
> function we're looking at is located at the beginning of a page. Because
> instruction[-3] goes off the front of the mapping.
> 
> We can probably fix that. But it's still a bit dicey.

Not necessarily. Now that it's a separate function, it can be nested a bit deeper,
so we don't take chances on compiler optimisation:

if (instruction[-2] == PPC_INST_STD_LR) /* where should R0 come from? there must be... */
  {
    if (instruction[-3] == PPC_INST_MFLR)
      return 1;
  }
else if (instruction[-2] == PPC_INST_MFLR)
    return 1;
return 0;

> I'm wondering if we want to just say we only support the 2 instruction version.
> Currently that means GCC 6 only, or a distro compiler with the backport of
> e95d0248dace. But we could also ask GCC to backport it to 4.9 and 5.
> 
> Thoughts?

IMHO that's too weak a reason for too strong a limitation. OTOH getting everyone
to use the 2 insn version sounds appealing...

Is e95d0248dace self-sufficient or does it depend on other improvements?

Torsten

Re: [PATCH 2/2] powerpc: Add POWER9 cputable entry

2016-02-17 Thread Michael Ellerman
On Wed, 2016-02-17 at 16:07 +1100, Michael Neuling wrote:

> Add a cputable entry for POWER9.  More code is required to actually
> boot and run on a POWER9 but this gets the base piece in which we can
> start building on.
> 
> Copies over from POWER8 except for:
> - Adds a new CPU_FTR_ARCH_30 bit to start hanging new architecture

ARCH thirty?

Would CPU_FTR_ARCH_3 read better?

Or CPU_FTR_ARCH_3_00 ?

> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> index a47e175..7fb238c 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -171,7 +171,7 @@ enum {
>  #define CPU_FTR_ARCH_201 LONG_ASM_CONST(0x0002)
>  #define CPU_FTR_ARCH_206 LONG_ASM_CONST(0x0004)
>  #define CPU_FTR_ARCH_207S LONG_ASM_CONST(0x0008)
> -/* Free LONG_ASM_CONST(0x0010) */
> +#define CPU_FTR_ARCH_30 LONG_ASM_CONST(0x0010)
>  #define CPU_FTR_MMCRA LONG_ASM_CONST(0x0020)
>  #define CPU_FTR_CTRL LONG_ASM_CONST(0x0040)
>  #define CPU_FTR_SMT  LONG_ASM_CONST(0x0080)
> @@ -447,6 +447,16 @@ enum {
>   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_SUBCORE)
>  #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
>  #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL)
> +#define CPU_FTRS_POWER9 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
> + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
> + CPU_FTR_MMCRA | CPU_FTR_SMT | \
> + CPU_FTR_COHERENT_ICACHE | \
> + CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
> + CPU_FTR_DSCR | CPU_FTR_SAO  | \
> + CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
> + CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
> + CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
> + CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_30)
>  #define CPU_FTRS_CELL (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
>   CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>   CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> @@ -465,7 +475,7 @@ enum {
>   (CPU_FTRS_POWER4 | CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
>CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
>CPU_FTRS_POWER8 | CPU_FTRS_POWER8_DD1 | CPU_FTRS_CELL | \
> -  CPU_FTRS_PA6T | CPU_FTR_VSX)
> +  CPU_FTRS_PA6T | CPU_FTR_VSX | CPU_FTRS_POWER9)
>  #endif

That's you adding it to CPU_FTRS_POSSIBLE I think.

But you forgot to add it to CPU_FTRS_ALWAYS.
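
For illustration, a minimal sketch of why a new CPU has to appear in both masks: "possible" is the OR of every supported CPU's feature set, "always" is the AND, and a feature test can short-circuit on "always" without looking at the running CPU. The bit values and the FTR_*/FTRS_* names here are placeholders invented for this example, not the kernel's.

```c
#include <assert.h>

/* Placeholder feature bits, chosen only for this example. */
#define FTR_USE_TB     0x1UL
#define FTR_ARCH_207S  0x2UL
#define FTR_SUBCORE    0x4UL
#define FTR_ARCH_300   0x8UL

#define FTRS_P8  (FTR_USE_TB | FTR_ARCH_207S | FTR_SUBCORE)
#define FTRS_P9  (FTR_USE_TB | FTR_ARCH_207S | FTR_ARCH_300)

/* Any bit some supported CPU may have. */
#define FTRS_POSSIBLE  (FTRS_P8 | FTRS_P9)
/* Bits every supported CPU has; each CPU must be ANDed in here. */
#define FTRS_ALWAYS    (FTRS_P8 & FTRS_P9 & FTRS_POSSIBLE)

/* 'cur' is the feature mask of the CPU actually running. */
static int cpu_has(unsigned long cur, unsigned long feature)
{
    return (FTRS_ALWAYS & feature) ||
           (FTRS_POSSIBLE & cur & feature);
}
```

If a CPU is left out of the AND, a bit it lacks can still end up in the "always" mask, and the first half of the test then reports the feature as present on that CPU regardless of its own mask.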

> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> index 7352d3f..e36dc90 100644
> --- a/arch/powerpc/include/asm/mmu-hash64.h
> +++ b/arch/powerpc/include/asm/mmu-hash64.h
> @@ -114,6 +114,7 @@
>  
>  #define POWER7_TLB_SETS  128 /* # sets in POWER7 TLB */
>  #define POWER8_TLB_SETS  512 /* # sets in POWER8 TLB */
> +#define POWER9_TLB_SETS_HASH 256 /* # sets in POWER9 TLB Hash mode */
>  
>  #ifndef __ASSEMBLY__
>  
> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
> index 3d5abfe..54d4650 100644
> --- a/arch/powerpc/include/asm/mmu.h
> +++ b/arch/powerpc/include/asm/mmu.h
> @@ -97,6 +97,7 @@
>  #define MMU_FTRS_POWER6  MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
>  #define MMU_FTRS_POWER7  MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
>  #define MMU_FTRS_POWER8  MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
> +#define MMU_FTRS_POWER9  MMU_FTRS_POWER4 | MMU_FTR_LOCKLESS_TLBIE
>  #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
>   MMU_FTR_CI_LARGE_PAGE
>  #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
> diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
> index 9c9b741..1785480 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.S
> +++ b/arch/powerpc/kernel/cpu_setup_power.S
> @@ -83,6 +83,43 @@ _GLOBAL(__restore_cpu_power8)
>   mtlr    r11
>   blr
>  
> +_GLOBAL(__setup_cpu_power9)
> + mflr    r11
> + bl  __init_FSCR
> + bl  __init_PMU

You might be better off leaving the PMU alone until we have a P9
perf implementation?

> + bl  __init_hvmode_206
> + mtlr    r11
> + beqlr
> + li  r0,0
> + mtspr   SPRN_LPID,r0
> + mfspr   r3,SPRN_LPCR
> + ori r3, r3, LPCR_PECEDH
> + bl  __init_LPCR
> + bl  __init_HFSCR
> + bl  __init_tlb_power9
> + bl  __init_PMU_HV
> + mtlr    r11
> + blr
> +
> +_GLOBAL(__restore_cpu_power9)
> + mflr    r11
> + bl  __init_FSCR
> + bl  __init_PMU
> + mfmsr   r3
> + rldicl. 

Re: [PATCH v8 1/8] ppc64 (le): prepare for -mprofile-kernel

2016-02-17 Thread Michael Ellerman
On Wed, 2016-02-10 at 17:21 +0100, Torsten Duwe wrote:

> The gcc switch -mprofile-kernel, available for ppc64 on gcc > 4.8.5,
> allows to call _mcount very early in the function, which low-level
> ASM code and code patching functions need to consider.
> Especially the link register and the parameter registers are still
> alive and not yet saved into a new stack frame.

...

> diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
> index ac64ffd..72a1a52 100644
> --- a/arch/powerpc/kernel/module_64.c
> +++ b/arch/powerpc/kernel/module_64.c
> @@ -476,17 +474,44 @@ static unsigned long stub_for_addr(Elf64_Shdr *sechdrs,
>   return (unsigned long)[i];
>  }
>  
> +#ifdef CC_USING_MPROFILE_KERNEL
> +static int is_early_mcount_callsite(u32 *instruction)
> +{
> + /* -mprofile-kernel sequence starting with
> +  * mflr r0 and maybe std r0, LRSAVE(r1).
> +  */
> + if ((instruction[-3] == PPC_INST_MFLR &&
> +  instruction[-2] == PPC_INST_STD_LR) ||
> + instruction[-2] == PPC_INST_MFLR) {
> + /* Nothing to be done here, it's an _mcount
> +  * call location and r2 will have to be
> +  * restored in the _mcount function.
> +  */
> + return 1;
> + }
> + return 0;
> +}

So this logic to deal with the 2 vs 3 instruction version of the mcount
sequence is problematic.

On a kernel built with the 2 instruction version this will fault when the
function we're looking at is located at the beginning of a page. Because
instruction[-3] goes off the front of the mapping.

We can probably fix that. But it's still a bit dicey.
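
One possible shape for such a fix, as a rough userspace sketch rather than a tested kernel patch: reorder the test so instruction[-3] is only read once instruction[-2] has been seen to be the std. The opcode values are assumed to be the encodings of "mflr r0" and "std r0,16(r1)", and the function name is made up:

```c
#include <stdint.h>

/* Assumed encodings of "mflr r0" and "std r0,16(r1)". */
#define PPC_INST_MFLR   0x7c0802a6u
#define PPC_INST_STD_LR 0xf8010010u

/*
 * Same test as in the patch, reordered so that instruction[-3] is
 * only dereferenced when instruction[-2] is the std, i.e. when the
 * 3 instruction sequence is at least plausible.
 */
static int is_early_mcount_callsite_reordered(const uint32_t *instruction)
{
	if (instruction[-2] == PPC_INST_MFLR)
		return 1;	/* 2 instruction version: mflr; bl */
	if (instruction[-2] == PPC_INST_STD_LR &&
	    instruction[-3] == PPC_INST_MFLR)
		return 1;	/* 3 instruction version: mflr; std; bl */
	return 0;
}
```

On a 2 instruction kernel the first test matches and instruction[-3] is never touched. It still relies on a std of LR never being the first instruction of a function at the start of a mapping, which is part of why this stays dicey.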

I'm wondering if we want to just say we only support the 2 instruction version.
Currently that means GCC 6 only, or a distro compiler with the backport of
e95d0248dace. But we could also ask GCC to backport it to 4.9 and 5.

Thoughts?

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/7] fix IS_ERR_VALUE usage

2016-02-17 Thread Arnd Bergmann
On Monday 15 February 2016 15:35:18 Andrzej Hajda wrote:
> 
> Andrzej Hajda (7):
>   netfilter: fix IS_ERR_VALUE usage
>   MIPS: module: fix incorrect IS_ERR_VALUE macro usages
>   drivers: char: mem: fix IS_ERROR_VALUE usage
>   atmel-isi: fix IS_ERR_VALUE usage
>   serial: clps711x: fix IS_ERR_VALUE usage
>   fbdev: exynos: fix IS_ERR_VALUE usage
>   usb: gadget: fsl_qe_udc: fix IS_ERR_VALUE usage
> 

Can you Cc me the next time on all of the patches? I only got
three of them this time.

Arnd

Re: [PATCH v3 0/2] Consolidate redundant register/stack access code

2016-02-17 Thread Ingo Molnar

* David Long  wrote:

> On 02/09/2016 04:45 AM, Ingo Molnar wrote:
> >
> >* Michael Ellerman  wrote:
> >
> >>On Tue, 2016-02-09 at 00:38 -0500, David Long wrote:
> >>
> >>>From: "David A. Long" 
> >>>
> >>>Move duplicate and functionally equivalent code for accessing registers
> >>>and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into
> >>>common kernel files.
> >>>
> >>>I'm sending this out again (with updated distribution list) because v2
> >>>just never got pulled in, even though I don't think there were any
> >>>outstanding issues.
> >>
> >>A big cross arch patch like this would often get taken by Andrew Morton, but
> >>AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for
> >>us :D
> >
> >The other problem is that the second patch is commingling changes to 6 separate
> >architectures:
> >
> >  16 files changed, 106 insertions(+), 343 deletions(-)
> >
> >that should probably be 6 separate patches. Easier to review, easier to bisect to,
> >easier to revert, etc.
> >
> >Thanks,
> >
> > Ingo
> >
> 
> I see your point but I'm not sure it could have been broken into separate 
> successive patches that would each build for all architectures.

Why? AFAICS all the functionality appears to be conditional on 
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API, so it ought to build standalone as well, 
on a per arch basis, as long as the core kernel patch is applied first.

Thanks,

Ingo

Re: [PATCH 2/2] powerpc: Add POWER9 cputable entry

2016-02-17 Thread Michael Neuling
> > +_GLOBAL(__setup_cpu_power9)
> > +   mflr    r11
> > +   bl  __init_FSCR
> > +   bl  __init_PMU
> Just to keep in mind, I am not sure whether
> PowerISA 3.0 supports the MMCRS SPR, so we
> will need a feature check in __init_PMU()
> for POWER9.

Yeah, I'm not expecting this to work.

I'm trying to lay down a common base we can start working on.  There
are lots of people working on a bunch of different bases.  I want to avoid
that and we can do that by upstreaming.

> > +   bl  __init_hvmode_206
> > +   mtlr    r11
> > +   beqlr
> > +   li  r0,0
> > +   mtspr   SPRN_LPID,r0
> > +   mfspr   r3,SPRN_LPCR
> > +   ori r3, r3, LPCR_PECEDH
> > +   bl  __init_LPCR
> > +   bl  __init_HFSCR
> > +   bl  __init_tlb_power9
> > +   bl  __init_PMU_HV
> 
> Again, we need to check whether PowerISA 3.0 supports the MMCRH SPR,
> which is used in __init_PMU_HV()

Same here.
> > +   {   /*  Hacked up Power9 */

/me reviews his own patch...

Oops

> > +   .pvr_mask   = 0x,
> > +   .pvr_value  = 0x004e,
> > +   .cpu_name   = "POWER9 (raw)",
> > +   .cpu_features   = CPU_FTRS_POWER9,
> > +   .cpu_user_features  = COMMON_USER_POWER9,
> > +   .cpu_user_features2 = COMMON_USER2_POWER9,
> > +   .mmu_features   = MMU_FTRS_POWER9,
> > +   .icache_bsize   = 128,
> > +   .dcache_bsize   = 128,
> > +   .num_pmcs   = 6,
> > +   .pmc_type   = PPC_PMC_IBM,
> > +   .oprofile_cpu_type  = "ppc64/power8",
>
> This should be ppc64/power9. We use "oprofile_cpu_type" in PMU init.

Yep, we can fix that up when we post PMU patches, but if I repost I'll
change it so it doesn't match the old one.
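
To spell out why the string matters, here is a minimal sketch, assuming PMU init picks its backend by comparing oprofile_cpu_type against a fixed string. The struct and function below are cut-down, hypothetical stand-ins, not the kernel's own code:

```c
#include <string.h>

/* Cut-down stand-in for the kernel's struct cpu_spec (hypothetical). */
struct cpu_spec_sketch {
	const char *cpu_name;
	const char *oprofile_cpu_type;
};

/* Hypothetical probe: returns 0 if the POWER8 PMU code would bind. */
static int power8_pmu_probe_sketch(const struct cpu_spec_sketch *spec)
{
	if (!spec->oprofile_cpu_type ||
	    strcmp(spec->oprofile_cpu_type, "ppc64/power8") != 0)
		return -1;	/* not a POWER8: leave the PMU alone */
	return 0;
}
```

With "ppc64/power8" left in the POWER9 entry, a probe like this would succeed and the POWER8 PMU code would be used on POWER9; with "ppc64/power9" it would decline.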

Mikey

Re: Fix BUG_ON() reporting in real mode on powerpc

2016-02-17 Thread Anshuman Khandual
On 02/17/2016 12:46 PM, Balbir Singh wrote:
>> > It might be a little better to do this:
>> > 
>> >bugaddr = regs->nip;
>> >if (REGION_ID(bugaddr) == 0 && !(regs->msr & MSR_IR))
>> >bugaddr += PAGE_OFFSET;
>> > 
>> > It is possible to execute from addresses with the 0xc000... on top in
>> > real mode, because the CPU ignores the top 4 address bits in real
>> > mode.
> Good catch! Thank you
> 
> Changelog:
>  Don't add PAGE_OFFSET blindly, check if REGION_ID is 0

Can't we use USER_REGION_ID directly?


Re: Fix BUG_ON() reporting in real mode on powerpc

2016-02-17 Thread Paul Mackerras
On Wed, Feb 17, 2016 at 01:33:32PM +0530, Anshuman Khandual wrote:
> On 02/17/2016 12:46 PM, Balbir Singh wrote:
> >> > It might be a little better to do this:
> >> > 
> >> >  bugaddr = regs->nip;
> >> >  if (REGION_ID(bugaddr) == 0 && !(regs->msr & MSR_IR))
> >> >  bugaddr += PAGE_OFFSET;
> >> > 
> >> > It is possible to execute from addresses with the 0xc000... on top in
> >> > real mode, because the CPU ignores the top 4 address bits in real
> >> > mode.
> > Good catch! Thank you
> > 
> > Changelog:
> >  Don't add PAGE_OFFSET blindly, check if REGION_ID is 0
> 
> Can't we use USER_REGION_ID directly?

If we use USER_REGION_ID then the reader needs to know that the user
region is region 0 to understand the code.  Thus I think it is clearer
to use REGION_ID(bugaddr) == 0.  Whether or not the address is a user
region address is not really relevant to the question of whether it's
a physical address being accessed directly in real mode vs. a kernel
virtual address, which is what we're trying to determine.
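
As a standalone illustration of that check, here is a userspace sketch. The constants are assumed values for a 64-bit kernel (region in the top 4 address bits, PAGE_OFFSET at 0xc000..., MSR_IR as bit 5), and the function name is made up:

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define PAGE_OFFSET   0xc000000000000000ULL
#define REGION_SHIFT  60
#define REGION_ID(ea) (((uint64_t)(ea)) >> REGION_SHIFT)
#define MSR_IR        (1ULL << 5)	/* instruction relocation on */

/* Map the faulting NIP to the address used to look up the bug entry. */
static uint64_t bug_lookup_addr(uint64_t nip, uint64_t msr)
{
	uint64_t bugaddr = nip;

	/* Real mode and no 0xc... on top: treat it as a physical address. */
	if (REGION_ID(bugaddr) == 0 && !(msr & MSR_IR))
		bugaddr += PAGE_OFFSET;
	return bugaddr;
}
```

A real-mode NIP of 0x1000 becomes 0xc000000000001000, while a NIP that already has 0xc000... on top, or any NIP with MSR_IR set, is left alone.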

Paul.

Re: [PATCH v2 3/7] ibmvscsi: Replace magic values in set_adpater_info() with defines

2016-02-17 Thread Johannes Thumshirn
On Tue, Feb 16, 2016 at 10:03:24PM -0500, Martin K. Petersen wrote:
> > "Tyrel" == Tyrel Datwyler  writes:
> 
> >> Is there some reason you didn't carry the review tag over from this:
> >> 
> >> http://mid.gmane.org/20160204084459.gw27...@c203.arch.suse.de
> >> 
> >> ?
> >> 
> >> James
> 
> Tyrel> The patch is slightly changed from v1. A define for AIX os type
> Tyrel> was added as mentioned in the cover letter v2 changes, and I
> Tyrel> moved the defines to the mad_adapter_info_data structure around
> Tyrel> the fields they apply.
> 
> Johannes: Mind checking this out?

I'm sorry, I thought I already did.

Reviewed-by: Johannes Thumshirn 

> 
> https://patchwork.kernel.org/patch/8276101/
> 
> -- 
> Martin K. PetersenOracle Linux Engineering

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850