[GIT PULL] Please pull powerpc/linux.git powerpc-4.10-1 tag

2016-12-15 Thread Michael Ellerman
Hi Linus,

Please pull powerpc updates for 4.10:

The following changes since commit a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6:

  Linux 4.9-rc5 (2016-11-13 10:32:32 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-4.10-1

for you to fetch changes up to c6f6634721c871bfab4235e1cbcad208d3063798:

  Merge branch 'next' of 
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next 
(2016-12-16 15:05:38 +1100)


The changes to the generic and arch/x86 kexec code were acked by Dave Young, and
have been in linux-next for a long time.

I see one conflict in asm-prototypes.h; AFAICS the resolution is to drop
the include of linux/kprobes.h and keep the rest.

cheers


powerpc updates for 4.10

Highlights include:

 - Support for the kexec_file_load() syscall, which is a prereq for secure and
   trusted boot.

 - Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN).

 - Sort the exception tables at build time, to save time at boot, and store
   them as relative offsets to save space in the kernel image & memory (see
   the sketch after this list).

 - Allow building the kernel with thin archives, which should allow us to build
   an allyesconfig once some other fixes land.

 - Build fixes to allow us to correctly rebuild when changing the kernel endian
   from big to little or vice versa.

 - Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix.

 - Initial stack protector support (-fstack-protector).

 - Support for dumping the radix (aka. Linux) and hash page tables via debugfs.

 - Fix an oops in cxl coredump generation when cxl_get_fd() is used.

 - Freescale updates from Scott: "Highlights include 8xx hugepage support,
   qbman fixes/cleanup, device tree updates, and some misc cleanup."

 - Many and varied fixes and minor enhancements as always.
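
For illustration, a rough sketch of the "relative offsets" exception table
scheme mentioned above (a generic sketch only; the field names and helpers
are assumptions, not the actual arch/powerpc code):

  /*
   * Each entry stores 32-bit signed offsets from its own address instead
   * of 64-bit absolute pointers, halving the entry size and allowing the
   * table to be sorted at build time without runtime relocation.
   */
  struct exception_table_entry {
          int insn;       /* offset from &entry->insn to the faulting insn */
          int fixup;      /* offset from &entry->fixup to the fixup code   */
  };

  static inline unsigned long extable_insn(const struct exception_table_entry *e)
  {
          return (unsigned long)&e->insn + e->insn;
  }

  static inline unsigned long extable_fixup(const struct exception_table_entry *e)
  {
          return (unsigned long)&e->fixup + e->fixup;
  }

Sorting the table at build time then lets the kernel binary-search it
directly at runtime, skipping the boot-time sort pass.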

Thanks to:
  Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual,
  Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet,
  Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat,
  Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold,
  Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan
  Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin,
  Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
  Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain.


Alexey Kardashevskiy (7):
  powerpc/iommu: Pass mm_struct to init/cleanup helpers
  powerpc/iommu: Stop using @current in mm_iommu_xxx
  vfio/spapr: Postpone allocation of userspace version of TCE table
  vfio/spapr: Add a helper to create default DMA window
  vfio/spapr: Postpone default window creation
  vfio/spapr: Reference mm in tce_container
  powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown

Andrew Donnellan (1):
  cxl: Fix coccinelle warnings

Andy Fleming (1):
  powerpc/85xx: Enable gpio power/reset driver

Aneesh Kumar K.V (8):
  powerpc/mm/coproc: Handle bad address on coproc slb fault
  powerpc/mm: Rename hugetlb-radix.h to hugetlb.h
  powerpc/mm/hugetlb: Handle hugepage size supported by hash config
  powerpc/mm: Introduce _PAGE_LARGE software pte bits
  powerpc/mm: Add radix__tlb_flush_pte_p9_dd1()
  powerpc/mm: update radix__ptep_set_access_flag to not do full mm tlb flush
  powerpc/mm: update radix__pte_update to not do full mm tlb flush
  powerpc/mm: Batch tlb flush when invalidating pte entries

Anshuman Khandual (11):
  selftests/powerpc: Add more SPR numbers, TM & VMX instructions to 
'reg.h'/'instructions.h'
  selftests/powerpc: Add ptrace tests for GPR/FPR registers
  selftests/powerpc: Add ptrace tests for GPR/FPR registers in TM
  selftests/powerpc: Add ptrace tests for GPR/FPR registers in suspended TM
  selftests/powerpc: Add ptrace tests for TAR, PPR, DSCR registers
  selftests/powerpc: Add ptrace tests for TAR, PPR, DSCR in TM
  selftests/powerpc: Add ptrace tests for TAR, PPR, DSCR in suspended TM
  selftests/powerpc: Add ptrace tests for VSX, VMX registers
  selftests/powerpc: Add ptrace tests for VSX, VMX registers in TM
  selftests/powerpc: Add ptrace tests for VSX, VMX registers in suspended TM
  selftests/powerpc: Add ptrace tests for TM SPR registers

Anton Blanchard (2):
  selftests/powerpc: Add Anton's null_syscall benchmark to the selftests
  powerpc/pseries: Use H_CLEAR_HPT to clear MMU hash table during kexec

Balbir Singh (7):
  powerpc/hash64: Be more careful when generating tlbiel
  powerpc/mm: Fix typo in radix encodings print
  powerpc/mm/radix: Setup AMOR in HV mode to allow key 0
  powerpc/mm: Detect instruction fetch denied and report
  powerpc/mm/

Re: [PATCH v2 2/5] ia64: reuse append_elf_note() and final_note() functions

2016-12-15 Thread Hari Bathini



On Saturday 03 December 2016 12:52 AM, Eric W. Biederman wrote:

Hari Bathini  writes:


Hi Dave,


Thanks for the review.


On Thursday 01 December 2016 10:26 AM, Dave Young wrote:

Hi Hari

Personally I like V1 more, but splitting out patch 2 makes it easier for ia64
people to review.  I did basic x86 testing; it runs OK.

On 11/25/16 at 05:24pm, Hari Bathini wrote:

Get rid of multiple definitions of append_elf_note() & final_note()
functions. Reuse these functions compiled under CONFIG_CRASH_CORE.

Signed-off-by: Hari Bathini 
---
   arch/ia64/kernel/crash.c   |   22 --
   include/linux/crash_core.h |4 
   kernel/crash_core.c|6 +++---
   kernel/kexec_core.c|   28 
   4 files changed, 7 insertions(+), 53 deletions(-)

diff --git a/arch/ia64/kernel/crash.c b/arch/ia64/kernel/crash.c
index 2955f35..75859a0 100644
--- a/arch/ia64/kernel/crash.c
+++ b/arch/ia64/kernel/crash.c
@@ -27,28 +27,6 @@ static int kdump_freeze_monarch;
   static int kdump_on_init = 1;
   static int kdump_on_fatal_mca = 1;
   -static inline Elf64_Word
-*append_elf_note(Elf64_Word *buf, char *name, unsigned type, void *data,
-   size_t data_len)
-{
-   struct elf_note *note = (struct elf_note *)buf;
-   note->n_namesz = strlen(name) + 1;
-   note->n_descsz = data_len;
-   note->n_type   = type;
-   buf += (sizeof(*note) + 3)/4;
-   memcpy(buf, name, note->n_namesz);
-   buf += (note->n_namesz + 3)/4;
-   memcpy(buf, data, data_len);
-   buf += (data_len + 3)/4;
-   return buf;
-}
-
-static void
-final_note(void *buf)
-{
-   memset(buf, 0, sizeof(struct elf_note));
-}
-

The above IA64 version looks better than the functions in kexec_core.c
about the Elf64_Word type usage and the simpler final_note function.

Hmmm.. Is void * better than Elf64_Word * for staying agnostic of the
Elf32/Elf64 type?

Both Elf64_Word and Elf32_Word result in a u32.  So I expect the right
solution is to add a definition of Elf_Word to include/linux/elf.h
and to make the buffer "Elf_Word *buf".

That way we preserve the alignment knowledge, while making the code
depend on 32bit or 64bit.

Eric
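
For illustration, a minimal sketch of what the suggested Elf_Word definition
could look like (the exact placement in include/linux/elf.h and the
surrounding definitions are assumptions):

  /* Pick the generic name based on ELF_CLASS, in the same style as the
   * existing Elf_* helpers in include/linux/elf.h.  Both variants are a
   * u32, but the name keeps the alignment intent explicit. */
  #if ELF_CLASS == ELFCLASS32
  #define Elf_Word	Elf32_Word
  #else
  #define Elf_Word	Elf64_Word
  #endif

  /* The shared helper would then be declared as: */
  Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
                            void *data, size_t data_len);

Since both underlying types are a u32, the generated code is unchanged;
only the declared intent differs.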



Thanks for the review, Eric. Will address this in the next version.

As some recent changes in the powerpc tree may cause merge conflicts,
I will wait for those changes to get into Linus's tree and rebase the
next version on top of that.

Thanks
Hari



[PATCH] powerpc/livepatch: Remove klp_write_module_reloc() stub

2016-12-15 Thread Kamalesh Babulal
Commit 425595a7fc20 ("livepatch: reuse module loader code
to write relocations") offloaded writing livepatch module
relocations to the arch-specific module loader code.

Remove the now-unused klp_write_module_reloc() function stub.

Signed-off-by: Kamalesh Babulal 
---
 arch/powerpc/include/asm/livepatch.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/arch/powerpc/include/asm/livepatch.h 
b/arch/powerpc/include/asm/livepatch.h
index a402f7f..47a03b9 100644
--- a/arch/powerpc/include/asm/livepatch.h
+++ b/arch/powerpc/include/asm/livepatch.h
@@ -28,13 +28,6 @@ static inline int klp_check_compiler_support(void)
return 0;
 }
 
-static inline int klp_write_module_reloc(struct module *mod, unsigned long
-   type, unsigned long loc, unsigned long value)
-{
-   /* This requires infrastructure changes; we need the loadinfos. */
-   return -ENOSYS;
-}
-
 static inline void klp_arch_set_pc(struct pt_regs *regs, unsigned long ip)
 {
regs->nip = ip;
-- 
2.7.4



Re: [PATCH v3] powerpc/powernv: Initialise nest mmu

2016-12-15 Thread Alistair Popple
> Michael the skiboot fix to stop this breaking in mambo has been posted
> (see http://patchwork.ozlabs.org/patch/702564/). Will let you know
> when it has gone upstream.

Upstream in skiboot master as of 9418533911728f6d8bb7aa647033c317772ddb97.

Thanks!

> 
>  arch/powerpc/include/asm/opal-api.h|  3 ++-
>  arch/powerpc/include/asm/opal.h|  1 +
>  arch/powerpc/mm/pgtable-radix.c|  8 ++--
>  arch/powerpc/platforms/powernv/opal-wrappers.S |  1 +
>  arch/powerpc/platforms/powernv/opal.c  | 11 +++
>  arch/powerpc/platforms/powernv/powernv.h   |  6 ++
>  6 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h 
> b/arch/powerpc/include/asm/opal-api.h
> index 0e2e57b..a0aa285 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -167,7 +167,8 @@
>  #define OPAL_INT_EOI			124
>  #define OPAL_INT_SET_MFRR		125
>  #define OPAL_PCI_TCE_KILL		126
> -#define OPAL_LAST			126
> +#define OPAL_NMMU_SET_PTCR		127
> +#define OPAL_LAST			127
> 
>  /* Device tree flags */
> 
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index e958b70..b61c3d3 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -229,6 +229,7 @@ int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t 
> kill_type,
>  int64_t opal_rm_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
>uint32_t pe_num, uint32_t tce_size,
>uint64_t dma_addr, uint32_t npages);
> +int64_t opal_nmmu_set_ptcr(uint64_t chip_id, uint64_t ptcr);
> 
>  /* Internal functions */
>  extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> index 688b545..d5e868b 100644
> --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
> 
> @@ -177,7 +178,7 @@ static void __init radix_init_pgtable(void)
> 
>  static void __init radix_init_partition_table(void)
>  {
> - unsigned long rts_field;
> + unsigned long rts_field, ptcr;
> 
>   rts_field = radix__get_tree_size();
> 
> @@ -193,7 +194,9 @@ static void __init radix_init_partition_table(void)
>* update partition table control register,
>* 64 K size.
>*/
> - mtspr(SPRN_PTCR, __pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
> + ptcr = __pa(partition_tb) | (PATB_SIZE_SHIFT - 12);
> + mtspr(SPRN_PTCR, ptcr);
> + powernv_set_nmmu_ptcr(ptcr);
>  }
> 
>  void __init radix_init_native(void)
> @@ -408,6 +411,7 @@ void radix__mmu_cleanup_all(void)
>   lpcr = mfspr(SPRN_LPCR);
>   mtspr(SPRN_LPCR, lpcr & ~LPCR_UPRT);
>   mtspr(SPRN_PTCR, 0);
> + powernv_set_nmmu_ptcr(0);
>   radix__flush_tlb_all();
>   }
>  }
> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S 
> b/arch/powerpc/platforms/powernv/opal-wrappers.S
> index 44d2d84..894639b 100644
> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
> @@ -308,4 +308,5 @@ OPAL_CALL(opal_int_set_cppr,		OPAL_INT_SET_CPPR);
>  OPAL_CALL(opal_int_eoi,			OPAL_INT_EOI);
>  OPAL_CALL(opal_int_set_mfrr,		OPAL_INT_SET_MFRR);
>  OPAL_CALL(opal_pci_tce_kill,		OPAL_PCI_TCE_KILL);
> +OPAL_CALL(opal_nmmu_set_ptcr,		OPAL_NMMU_SET_PTCR);
>  OPAL_CALL_REAL(opal_rm_pci_tce_kill,	OPAL_PCI_TCE_KILL);
> diff --git a/arch/powerpc/platforms/powernv/opal.c 
> b/arch/powerpc/platforms/powernv/opal.c
> index 6c9a65b..773077e 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -885,6 +885,17 @@ int opal_error_code(int rc)
>   }
>  }
> 
> +void powernv_set_nmmu_ptcr(unsigned long ptcr)
> +{
> + int rc;
> +
> + if (firmware_has_feature(FW_FEATURE_OPAL)) {
> + rc = opal_nmmu_set_ptcr(-1UL, ptcr);
> + if (rc != OPAL_SUCCESS && rc != OPAL_UNSUPPORTED)
> + pr_warn("%s: Unable to set nest mmu ptcr\n", __func__);
> + }
> +}
> +
>  EXPORT_SYMBOL_GPL(opal_poll_events);
>  EXPORT_SYMBOL_GPL(opal_rtc_read);
>  EXPORT_SYMBOL_GPL(opal_rtc_write);
> diff --git a/arch/powerpc/platforms/powernv/powernv.h 
> b/arch/powerpc/platforms/powernv/powernv.h
> index da7c843..c49a2b0 100644
> --- a/arch/powerpc/platforms/powernv/powernv.h
> +++ b/arch/powerpc/platforms/powernv/powernv.h
> @@ -9,6 +9,12 @@ static inline void pnv_smp_init(void) { }
> 
>  struct pci_dev;
> 
> +#ifdef CONFIG_PPC_POWERNV
> +extern void powernv_set_nmmu_ptcr(unsigned long ptcr);
>

Re: [PATCH kernel 7/9] KVM: PPC: Enable IOMMU_API for KVM_BOOK3S_64 permanently

2016-12-15 Thread David Gibson
On Thu, Dec 08, 2016 at 07:19:54PM +1100, Alexey Kardashevskiy wrote:
> It does not make much sense to have KVM in book3s-64 and
> not to have IOMMU bits for PCI pass through support as it costs little
> and allows VFIO to function on book3s KVM.
> 
> Having IOMMU_API always enabled makes it unnecessary to have a lot of
> "#ifdef IOMMU_API" in arch/powerpc/kvm/book3s_64_vio*. With those
> ifdef's we could have only user space emulated devices accelerated
> (but not VFIO) which do not seem to be very useful.
> 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/kvm/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
> index 029be26b5a17..65a471de96de 100644
> --- a/arch/powerpc/kvm/Kconfig
> +++ b/arch/powerpc/kvm/Kconfig
> @@ -67,6 +67,7 @@ config KVM_BOOK3S_64
>   select KVM_BOOK3S_64_HANDLER
>   select KVM
>   select KVM_BOOK3S_PR_POSSIBLE if !KVM_BOOK3S_HV_POSSIBLE
> + select SPAPR_TCE_IOMMU if IOMMU_SUPPORT
>   ---help---
> Support running unmodified book3s_64 and book3s_32 guest kernels
> in virtual machines on book3s_64 host processors.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH kernel 6/9] powerpc/powernv/iommu: Add real mode version of iommu_table_ops::exchange()

2016-12-15 Thread David Gibson
On Thu, Dec 08, 2016 at 07:19:53PM +1100, Alexey Kardashevskiy wrote:
> In real mode, TCE tables are invalidated using special
> cache-inhibited store instructions which are not available in
> virtual mode.
> 
> This defines and implements the exchange_rm() callback. It does not
> define set_rm/clear_rm/flush_rm callbacks as there is no user for those -
> exchange/exchange_rm are only to be used by KVM for VFIO.
> 
> The exchange_rm callback is defined for IODA1/IODA2 powernv platforms.
> 
> This replaces list_for_each_entry_rcu with its lockless version as
> from now on pnv_pci_ioda2_tce_invalidate() can be called in
> the real mode too.
> 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/include/asm/iommu.h  |  7 +++
>  arch/powerpc/kernel/iommu.c   | 23 +++
>  arch/powerpc/platforms/powernv/pci-ioda.c | 26 +-
>  3 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/iommu.h 
> b/arch/powerpc/include/asm/iommu.h
> index 9de8bad1fdf9..82e77ebf85f4 100644
> --- a/arch/powerpc/include/asm/iommu.h
> +++ b/arch/powerpc/include/asm/iommu.h
> @@ -64,6 +64,11 @@ struct iommu_table_ops {
>   long index,
>   unsigned long *hpa,
>   enum dma_data_direction *direction);
> + /* Real mode */
> + int (*exchange_rm)(struct iommu_table *tbl,
> + long index,
> + unsigned long *hpa,
> + enum dma_data_direction *direction);
>  #endif
>   void (*clear)(struct iommu_table *tbl,
>   long index, long npages);
> @@ -209,6 +214,8 @@ extern void iommu_del_device(struct device *dev);
>  extern int __init tce_iommu_bus_notifier_init(void);
>  extern long iommu_tce_xchg(struct iommu_table *tbl, unsigned long entry,
>   unsigned long *hpa, enum dma_data_direction *direction);
> +extern long iommu_tce_xchg_rm(struct iommu_table *tbl, unsigned long entry,
> + unsigned long *hpa, enum dma_data_direction *direction);
>  #else
>  static inline void iommu_register_group(struct iommu_table_group 
> *table_group,
>   int pci_domain_number,
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index d12496889ce9..d02b8d22fb50 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -1022,6 +1022,29 @@ long iommu_tce_xchg(struct iommu_table *tbl, unsigned 
> long entry,
>  }
>  EXPORT_SYMBOL_GPL(iommu_tce_xchg);
>  
> +long iommu_tce_xchg_rm(struct iommu_table *tbl, unsigned long entry,
> + unsigned long *hpa, enum dma_data_direction *direction)
> +{
> + long ret;
> +
> + ret = tbl->it_ops->exchange_rm(tbl, entry, hpa, direction);
> +
> + if (!ret && ((*direction == DMA_FROM_DEVICE) ||
> + (*direction == DMA_BIDIRECTIONAL))) {
> + struct page *pg = realmode_pfn_to_page(*hpa >> PAGE_SHIFT);
> +
> + if (likely(pg)) {
> + SetPageDirty(pg);
> + } else {
> + tbl->it_ops->exchange_rm(tbl, entry, hpa, direction);
> + ret = -EFAULT;
> + }
> + }
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_tce_xchg_rm);
> +
>  int iommu_take_ownership(struct iommu_table *tbl)
>  {
>   unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index ea181f02bebd..f2c2ab8fbb3e 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1855,6 +1855,17 @@ static int pnv_ioda1_tce_xchg(struct iommu_table *tbl, 
> long index,
>  
>   return ret;
>  }
> +
> +static int pnv_ioda1_tce_xchg_rm(struct iommu_table *tbl, long index,
> + unsigned long *hpa, enum dma_data_direction *direction)
> +{
> + long ret = pnv_tce_xchg(tbl, index, hpa, direction);
> +
> + if (!ret)
> + pnv_pci_p7ioc_tce_invalidate(tbl, index, 1, true);
> +
> + return ret;
> +}
>  #endif
>  
>  static void pnv_ioda1_tce_free(struct iommu_table *tbl, long index,
> @@ -1869,6 +1880,7 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {
>   .set = pnv_ioda1_tce_build,
>  #ifdef CONFIG_IOMMU_API
>   .exchange = pnv_ioda1_tce_xchg,
> + .exchange_rm = pnv_ioda1_tce_xchg_rm,
>  #endif
>   .clear = pnv_ioda1_tce_free,
>   .get = pnv_tce_get,
> @@ -1943,7 +1955,7 @@ static void pnv_pci_ioda2_tce_invalidate(struct 
> iommu_table *tbl,
>  {
>   struct iommu_table_group_link *tgl;
>  
> - list_for_each_entry_rcu(tgl, &tbl->it_group_list, next) {
> + list_for_each_entry_lockless(tgl, &tbl->it_group_list, next) {
>   struct pnv_ioda_pe *pe = container_of(tgl->table_group,
>   struct pnv_ioda_pe, t

Re: [PATCH kernel 8/9] KVM: PPC: Pass kvm* to kvmppc_find_table()

2016-12-15 Thread David Gibson
On Thu, Dec 08, 2016 at 07:19:55PM +1100, Alexey Kardashevskiy wrote:
> The guest view TCE tables are per KVM anyway (not per VCPU) so pass kvm*
> there. This will be used in the following patches where we will be
> attaching VFIO containers to LIOBNs via ioctl() to KVM (rather than
> to VCPU).
> 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/include/asm/kvm_ppc.h  |  2 +-
>  arch/powerpc/kvm/book3s_64_vio.c|  7 ---
>  arch/powerpc/kvm/book3s_64_vio_hv.c | 13 +++--
>  3 files changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index f6e49640dbe1..0a21c8503974 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -167,7 +167,7 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu);
>  extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>   struct kvm_create_spapr_tce_64 *args);
>  extern struct kvmppc_spapr_tce_table *kvmppc_find_table(
> - struct kvm_vcpu *vcpu, unsigned long liobn);
> + struct kvm *kvm, unsigned long liobn);
>  extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
>   unsigned long ioba, unsigned long npages);
>  extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt,
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> b/arch/powerpc/kvm/book3s_64_vio.c
> index c379ff5a4438..15df8ae627d9 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -212,12 +212,13 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>  long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
> unsigned long ioba, unsigned long tce)
>  {
> - struct kvmppc_spapr_tce_table *stt = kvmppc_find_table(vcpu, liobn);
> + struct kvmppc_spapr_tce_table *stt;
>   long ret;
>  
>   /* udbg_printf("H_PUT_TCE(): liobn=0x%lx ioba=0x%lx, tce=0x%lx\n", */
>   /*  liobn, ioba, tce); */
>  
> + stt = kvmppc_find_table(vcpu->kvm, liobn);
>   if (!stt)
>   return H_TOO_HARD;
>  
> @@ -245,7 +246,7 @@ long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu,
>   u64 __user *tces;
>   u64 tce;
>  
> - stt = kvmppc_find_table(vcpu, liobn);
> + stt = kvmppc_find_table(vcpu->kvm, liobn);
>   if (!stt)
>   return H_TOO_HARD;
>  
> @@ -299,7 +300,7 @@ long kvmppc_h_stuff_tce(struct kvm_vcpu *vcpu,
>   struct kvmppc_spapr_tce_table *stt;
>   long i, ret;
>  
> - stt = kvmppc_find_table(vcpu, liobn);
> + stt = kvmppc_find_table(vcpu->kvm, liobn);
>   if (!stt)
>   return H_TOO_HARD;
>  
> diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
> b/arch/powerpc/kvm/book3s_64_vio_hv.c
> index a3be4bd6188f..8a6834e6e1c8 100644
> --- a/arch/powerpc/kvm/book3s_64_vio_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
> @@ -49,10 +49,9 @@
>   * WARNING: This will be called in real or virtual mode on HV KVM and virtual
>   *  mode on PR KVM
>   */
> -struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm_vcpu *vcpu,
> +struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm *kvm,
>   unsigned long liobn)
>  {
> - struct kvm *kvm = vcpu->kvm;
>   struct kvmppc_spapr_tce_table *stt;
>  
>   list_for_each_entry_lockless(stt, &kvm->arch.spapr_tce_tables, list)
> @@ -194,12 +193,13 @@ static struct mm_iommu_table_group_mem_t 
> *kvmppc_rm_iommu_lookup(
>  long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
>   unsigned long ioba, unsigned long tce)
>  {
> - struct kvmppc_spapr_tce_table *stt = kvmppc_find_table(vcpu, liobn);
> + struct kvmppc_spapr_tce_table *stt;
>   long ret;
>  
>   /* udbg_printf("H_PUT_TCE(): liobn=0x%lx ioba=0x%lx, tce=0x%lx\n", */
>   /*  liobn, ioba, tce); */
>  
> + stt = kvmppc_find_table(vcpu->kvm, liobn);
>   if (!stt)
>   return H_TOO_HARD;
>  
> @@ -252,7 +252,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
>   unsigned long tces, entry, ua = 0;
>   unsigned long *rmap = NULL;
>  
> - stt = kvmppc_find_table(vcpu, liobn);
> + stt = kvmppc_find_table(vcpu->kvm, liobn);
>   if (!stt)
>   return H_TOO_HARD;
>  
> @@ -335,7 +335,7 @@ long kvmppc_rm_h_stuff_tce(struct kvm_vcpu *vcpu,
>   struct kvmppc_spapr_tce_table *stt;
>   long i, ret;
>  
> - stt = kvmppc_find_table(vcpu, liobn);
> + stt = kvmppc_find_table(vcpu->kvm, liobn);
>   if (!stt)
>   return H_TOO_HARD;
>  
> @@ -356,12 +356,13 @@ long kvmppc_rm_h_stuff_tce(struct kvm_vcpu *vcpu,
>  long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
> unsigned long ioba)
>  {
> - struct kvmppc_spapr_tce_table *stt = kvmppc_find_table(vcpu, liobn);
> + struct kvmppc_spapr_tce_table *stt;
>  

Re: [PATCH kernel 5/9] KVM: PPC: Use preregistered memory API to access TCE list

2016-12-15 Thread David Gibson
On Thu, Dec 08, 2016 at 07:19:52PM +1100, Alexey Kardashevskiy wrote:
> VFIO on sPAPR already implements guest memory pre-registration
> when the entire guest RAM gets pinned. This can be used to translate
> the physical address of a guest page containing the TCE list
> from H_PUT_TCE_INDIRECT.
> 
> This makes use of the pre-registered memory API to access TCE list
> pages in order to avoid unnecessary locking on the KVM memory
> reverse map as we know that all of guest memory is pinned and
> we have a flat array mapping GPA to HPA which makes it simpler and
> quicker to index into that array (even with looking up the
> kernel page tables in vmalloc_to_phys) than it is to find the memslot,
> lock the rmap entry, look up the user page tables, and unlock the rmap
> entry. Note that the rmap pointer is initialized to NULL where declared
> (not in this patch).
> 
> Signed-off-by: Alexey Kardashevskiy 


Hrm.  So, pinning all of guest memory is the usual case, but nothing
in the pre-registration APIs actually guarantees that.  Now I think
this patch is still correct because..

> ---
> Changes:
> v2:
> * updated the commit log with Paul's comment
> ---
>  arch/powerpc/kvm/book3s_64_vio_hv.c | 65 
> -
>  1 file changed, 49 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
> b/arch/powerpc/kvm/book3s_64_vio_hv.c
> index d461c440889a..a3be4bd6188f 100644
> --- a/arch/powerpc/kvm/book3s_64_vio_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
> @@ -180,6 +180,17 @@ long kvmppc_gpa_to_ua(struct kvm *kvm, unsigned long gpa,
>  EXPORT_SYMBOL_GPL(kvmppc_gpa_to_ua);
>  
>  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +static inline bool kvmppc_preregistered(struct kvm_vcpu *vcpu)
> +{
> + return mm_iommu_preregistered(vcpu->kvm->mm);
> +}
> +
> +static struct mm_iommu_table_group_mem_t *kvmppc_rm_iommu_lookup(
> + struct kvm_vcpu *vcpu, unsigned long ua, unsigned long size)
> +{
> + return mm_iommu_lookup_rm(vcpu->kvm->mm, ua, size);
> +}
> +
>  long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
>   unsigned long ioba, unsigned long tce)
>  {
> @@ -260,23 +271,44 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
>   if (ret != H_SUCCESS)
>   return ret;
>  
> - if (kvmppc_gpa_to_ua(vcpu->kvm, tce_list, &ua, &rmap))
> - return H_TOO_HARD;
> + if (kvmppc_preregistered(vcpu)) {
> + /*
> +  * We get here if guest memory was pre-registered which
> +  * is normally VFIO case and gpa->hpa translation does not
> +  * depend on hpt.
> +  */
> + struct mm_iommu_table_group_mem_t *mem;
>  
> - rmap = (void *) vmalloc_to_phys(rmap);
> + if (kvmppc_gpa_to_ua(vcpu->kvm, tce_list, &ua, NULL))
> + return H_TOO_HARD;

..this will fail if the relevant chunk of memory has not been
pre-registered and you'll fall back to the virtual mode version.  The
commit message doesn't make that terribly clear though.

> - /*
> -  * Synchronize with the MMU notifier callbacks in
> -  * book3s_64_mmu_hv.c (kvm_unmap_hva_hv etc.).
> -  * While we have the rmap lock, code running on other CPUs
> -  * cannot finish unmapping the host real page that backs
> -  * this guest real page, so we are OK to access the host
> -  * real page.
> -  */
> - lock_rmap(rmap);
> - if (kvmppc_rm_ua_to_hpa(vcpu, ua, &tces)) {
> - ret = H_TOO_HARD;
> - goto unlock_exit;
> + mem = kvmppc_rm_iommu_lookup(vcpu, ua, IOMMU_PAGE_SIZE_4K);
> + if (!mem || mm_iommu_ua_to_hpa_rm(mem, ua, &tces))
> + return H_TOO_HARD;
> + } else {
> + /*
> +  * This is emulated devices case.
> +  * We do not require memory to be preregistered in this case
> +  * so lock rmap and do __find_linux_pte_or_hugepte().
> +  */
> + if (kvmppc_gpa_to_ua(vcpu->kvm, tce_list, &ua, &rmap))
> + return H_TOO_HARD;

If I follow correctly, you could also fall back to this path in the
failing case, but I guess there's probably no advantage to doing so.

> + rmap = (void *) vmalloc_to_phys(rmap);
> +
> + /*
> +  * Synchronize with the MMU notifier callbacks in
> +  * book3s_64_mmu_hv.c (kvm_unmap_hva_hv etc.).
> +  * While we have the rmap lock, code running on other CPUs
> +  * cannot finish unmapping the host real page that backs
> +  * this guest real page, so we are OK to access the host
> +  * real page.
> +  */
> + lock_rmap(rmap);
> + if (kvmppc_rm_ua_to_hpa(vcpu, ua, &tces)) {
> + ret = H_TOO_HARD;
> + goto unlock_exit;
> + }
>   }
>  
>   for (i = 0; i < npa

Re: [PATCH kernel 4/9] powerpc/mmu: Add real mode support for IOMMU preregistered memory

2016-12-15 Thread David Gibson
On Thu, Dec 08, 2016 at 07:19:51PM +1100, Alexey Kardashevskiy wrote:
> This makes mm_iommu_lookup() able to work in realmode by replacing
> list_for_each_entry_rcu() (which can do debug stuff which can fail in
> real mode) with list_for_each_entry_lockless().
> 
> This adds realmode version of mm_iommu_ua_to_hpa() which adds
> explicit vmalloc'd-to-linear address conversion.
> Unlike mm_iommu_ua_to_hpa(), mm_iommu_ua_to_hpa_rm() can fail.
> 
> This changes mm_iommu_preregistered() to receive @mm as in real mode
> @current does not always have a correct pointer.
> 
> This adds realmode version of mm_iommu_lookup() which receives @mm
> (for the same reason as for mm_iommu_preregistered()) and uses
> lockless version of list_for_each_entry_rcu().
> 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/include/asm/mmu_context.h |  4 
>  arch/powerpc/mm/mmu_context_iommu.c| 39 
> ++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/mmu_context.h 
> b/arch/powerpc/include/asm/mmu_context.h
> index b9e3f0aca261..c70c8272523d 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -29,10 +29,14 @@ extern void mm_iommu_init(struct mm_struct *mm);
>  extern void mm_iommu_cleanup(struct mm_struct *mm);
>  extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct 
> *mm,
>   unsigned long ua, unsigned long size);
> +extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
> + struct mm_struct *mm, unsigned long ua, unsigned long size);
>  extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
>   unsigned long ua, unsigned long entries);
>  extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
>   unsigned long ua, unsigned long *hpa);
> +extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
> + unsigned long ua, unsigned long *hpa);
>  extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
>  extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
>  #endif
> diff --git a/arch/powerpc/mm/mmu_context_iommu.c 
> b/arch/powerpc/mm/mmu_context_iommu.c
> index 104bad029ce9..631d32f5937b 100644
> --- a/arch/powerpc/mm/mmu_context_iommu.c
> +++ b/arch/powerpc/mm/mmu_context_iommu.c
> @@ -314,6 +314,25 @@ struct mm_iommu_table_group_mem_t 
> *mm_iommu_lookup(struct mm_struct *mm,
>  }
>  EXPORT_SYMBOL_GPL(mm_iommu_lookup);
>  
> +struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(struct mm_struct *mm,
> + unsigned long ua, unsigned long size)
> +{
> + struct mm_iommu_table_group_mem_t *mem, *ret = NULL;
> +
> + list_for_each_entry_lockless(mem, &mm->context.iommu_group_mem_list,
> + next) {
> + if ((mem->ua <= ua) &&
> + (ua + size <= mem->ua +
> +  (mem->entries << PAGE_SHIFT))) {
> + ret = mem;
> + break;
> + }
> + }
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(mm_iommu_lookup_rm);
> +
>  struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
>   unsigned long ua, unsigned long entries)
>  {
> @@ -345,6 +364,26 @@ long mm_iommu_ua_to_hpa(struct 
> mm_iommu_table_group_mem_t *mem,
>  }
>  EXPORT_SYMBOL_GPL(mm_iommu_ua_to_hpa);
>  
> +long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
> + unsigned long ua, unsigned long *hpa)
> +{
> + const long entry = (ua - mem->ua) >> PAGE_SHIFT;
> + void *va = &mem->hpas[entry];
> + unsigned long *pa;
> +
> + if (entry >= mem->entries)
> + return -EFAULT;
> +
> + pa = (void *) vmalloc_to_phys(va);
> + if (!pa)
> + return -EFAULT;
> +
> + *hpa = *pa | (ua & ~PAGE_MASK);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(mm_iommu_ua_to_hpa_rm);
> +
>  long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem)
>  {
>   if (atomic64_inc_not_zero(&mem->mapped))

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




RE: [PATCH v6 1/4] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2016-12-15 Thread Qiang Zhao
Hello,

Any comments on this patchset?

Best Regards
Zhao Qiang

> -Original Message-
> From: Zhao Qiang [mailto:qiang.z...@nxp.com]
> Sent: Wednesday, September 28, 2016 11:25 AM
> To: o...@buserror.net; t...@linutronix.de
> Cc: ja...@lakedaemon.net; marc.zyng...@arm.com; X.B. Xie
> ; linux-ker...@vger.kernel.org; linuxppc-
> d...@lists.ozlabs.org; Qiang Zhao 
> Subject: [PATCH v6 1/4] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
> 
> move the driver from drivers/soc/fsl/qe to drivers/irqchip, merge qe_ic.h and
> qe_ic.c into irq-qeic.c.
> 
> Signed-off-by: Zhao Qiang 
> ---
> Changes for v2:
>   - modify the subject and commit msg
> Changes for v3:
>   - merge .h file to .c, rename it with irq-qeic.c
> Changes for v4:
>   - modify comments
> Changes for v5:
>   - disable rename detection
> Changes for v6:
>   - rebase
> 
>  drivers/irqchip/Makefile   |   1 +
>  drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} |  95 ++-
>  drivers/soc/fsl/qe/Makefile|   2 +-
>  drivers/soc/fsl/qe/qe_ic.h | 103 
> -
>  4 files changed, 94 insertions(+), 107 deletions(-)
>  rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%)
>  delete mode 100644 drivers/soc/fsl/qe/qe_ic.h
> 
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 4c203b6..face608 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -71,3 +71,4 @@ obj-$(CONFIG_MVEBU_ODMI)	+= irq-mvebu-odmi.o
>  obj-$(CONFIG_LS_SCFG_MSI)	+= irq-ls-scfg-msi.o
>  obj-$(CONFIG_EZNPS_GIC)		+= irq-eznps.o
>  obj-$(CONFIG_ARCH_ASPEED)	+= irq-aspeed-vic.o
> +obj-$(CONFIG_QUICC_ENGINE)	+= irq-qeic.o
> diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c
> similarity index 85%
> rename from drivers/soc/fsl/qe/qe_ic.c
> rename to drivers/irqchip/irq-qeic.c
> index ec2ca86..48ceded 100644
> --- a/drivers/soc/fsl/qe/qe_ic.c
> +++ b/drivers/irqchip/irq-qeic.c
> @@ -1,7 +1,7 @@
>  /*
> - * arch/powerpc/sysdev/qe_lib/qe_ic.c
> + * drivers/irqchip/irq-qeic.c
>   *
> - * Copyright (C) 2006 Freescale Semiconductor, Inc.  All rights reserved.
> + * Copyright (C) 2016 Freescale Semiconductor, Inc.  All rights reserved.
>   *
>   * Author: Li Yang 
>   * Based on code from Shlomi Gridish 
> @@ -30,7 +30,96 @@
>  #include 
>  #include 
> 
> -#include "qe_ic.h"
> +#define NR_QE_IC_INTS		64
> +
> +/* QE IC registers offset */
> +#define QEIC_CICR		0x00
> +#define QEIC_CIVEC   0x04
> +#define QEIC_CRIPNR  0x08
> +#define QEIC_CIPNR   0x0c
> +#define QEIC_CIPXCC  0x10
> +#define QEIC_CIPYCC  0x14
> +#define QEIC_CIPWCC  0x18
> +#define QEIC_CIPZCC  0x1c
> +#define QEIC_CIMR		0x20
> +#define QEIC_CRIMR   0x24
> +#define QEIC_CICNR   0x28
> +#define QEIC_CIPRTA  0x30
> +#define QEIC_CIPRTB  0x34
> +#define QEIC_CRICR   0x3c
> +#define QEIC_CHIVEC  0x60
> +
> +/* Interrupt priority registers */
> +#define CIPCC_SHIFT_PRI0 29
> +#define CIPCC_SHIFT_PRI1 26
> +#define CIPCC_SHIFT_PRI2 23
> +#define CIPCC_SHIFT_PRI3 20
> +#define CIPCC_SHIFT_PRI4 13
> +#define CIPCC_SHIFT_PRI5 10
> +#define CIPCC_SHIFT_PRI6 7
> +#define CIPCC_SHIFT_PRI7 4
> +
> +/* CICR priority modes */
> +#define CICR_GWCC		0x0004
> +#define CICR_GXCC		0x0002
> +#define CICR_GYCC		0x0001
> +#define CICR_GZCC		0x0008
> +#define CICR_GRTA		0x0020
> +#define CICR_GRTB		0x0040
> +#define CICR_HPIT_SHIFT		8
> +#define CICR_HPIT_MASK		0x0300
> +#define CICR_HP_SHIFT		24
> +#define CICR_HP_MASK		0x3f00
> +
> +/* CICNR */
> +#define CICNR_WCC1T_SHIFT	20
> +#define CICNR_ZCC1T_SHIFT	28
> +#define CICNR_YCC1T_SHIFT	12
> +#define CICNR_XCC1T_SHIFT	4
> +
> +/* CRICR */
> +#define CRICR_RTA1T_SHIFT	20
> +#define CRICR_RTB1T_SHIFT	28
> +
> +/* Signal indicator */
> +#define SIGNAL_MASK  3
> +#define SIGNAL_HIGH  2
> +#define SIGNAL_LOW   0
> +
> +struct qe_ic {
> + /* Control registers offset */
> + volatile u32 __iomem *regs;
> +
> + /* The remapper for this QEIC */
> + struct irq_domain *irqhost;
> +
> + /* The "linux" controller struct */
> + struct irq_chip hc_irq;
> +
> + /* VIRQ numbers of QE high/low irqs */
> + unsigned int virq_high;
> + unsigned int virq_low;
> +};
> +
> +/*
> + * QE interrupt controller internal structure
> + */
> +struct qe_ic_info {
> + /* location of this source at the QIMR register. */
> + u32 mask;
> +
> + /* Mask register offset */
> + u32 mask_reg;
> +
> + /*
> +  * for grouped interrupts sources - the interrupt
> +  * code as appears 

[PATCH] powerpc/time: clear LPCR.LD when unneeded

2016-12-15 Thread Oliver O'Halloran
Currently the kernel enables LD mode at boot when required. However,
when using kexec the second kernel may not want LD enabled. This patch
ensures the kernel explicitly clears the LD flag when it is not
required, rather than inheriting the setting from the previous kernel.

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/kernel/time.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index be9751f1cb2a..816700e8a475 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -925,18 +925,16 @@ static void register_decrementer_clockevent(int cpu)
 
 static void enable_large_decrementer(void)
 {
-   if (!cpu_has_feature(CPU_FTR_ARCH_300))
-   return;
-
-   if (decrementer_max <= DECREMENTER_DEFAULT_MAX)
-   return;
-
/*
 * If we're running as the hypervisor we need to enable the LD manually
 * otherwise firmware should have done it for us.
 */
-   if (cpu_has_feature(CPU_FTR_HVMODE))
+   if (decrementer_max > DECREMENTER_DEFAULT_MAX
+   && cpu_has_feature(CPU_FTR_HVMODE)
+   && cpu_has_feature(CPU_FTR_ARCH_300))
mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_LD);
+   else
+   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~LPCR_LD);
 }
 
 static void __init set_decrementer_max(void)
-- 
2.7.4



[PATCH v2 3/3] powerpc/pseries: Update affinity for memory and cpus specified in a PRRN event

2016-12-15 Thread John Allen
Extend the existing PRRN infrastructure to perform the actual affinity
updating for cpus and memory in addition to the device tree updating. For
cpus, dynamic affinity updating already appears to exist in the kernel in
the form of arch_update_cpu_topology. For memory, we must place a READD
operation on the hotplug queue for any phandle included in the PRRN event
that is determined to be an LMB.

Signed-off-by: John Allen 
---
diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index a26a020..8836130 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -282,6 +283,7 @@ static void prrn_work_fn(struct work_struct *work)
 * the RTAS event.
 */
pseries_devicetree_update(-prrn_update_scope);
+   arch_update_cpu_topology();
 }

 static DECLARE_WORK(prrn_work, prrn_work_fn);
@@ -434,7 +436,10 @@ static void do_event_scan(void)
}

if (error == 0) {
-   pSeries_log_error(logdata, ERR_TYPE_RTAS_LOG, 0);
+   if (rtas_error_type((struct rtas_error_log *)logdata) !=
+   RTAS_TYPE_PRRN)
+   pSeries_log_error(logdata, ERR_TYPE_RTAS_LOG,
+ 0);
handle_rtas_event((struct rtas_error_log *)logdata);
}

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index a560a98..d62d48a 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -39,6 +39,7 @@ struct update_props_workarea {
 #define ADD_DT_NODE		0x0300

 #define MIGRATION_SCOPE	(1)
+#define PRRN_SCOPE		-2

 static int mobility_rtas_call(int token, char *buf, s32 scope)
 {
@@ -236,6 +237,33 @@ static int add_dt_node(__be32 parent_phandle, __be32 
drc_index)
return rc;
 }

+void pseries_prrn_update_node(__be32 phandle)
+{
+   struct pseries_hp_errorlog *hp_elog;
+   struct device_node *dn;
+
+   hp_elog = kzalloc(sizeof(*hp_elog), GFP_KERNEL);
+   if (!hp_elog)
+   return;
+
+   dn = of_find_node_by_phandle(be32_to_cpu(phandle));
+
+   /*
+* If the phandle was not found, assume phandle is the drc index of
+* an LMB.
+*/
+   if (!dn) {
+   hp_elog->resource = PSERIES_HP_ELOG_RESOURCE_MEM;
+   hp_elog->action = PSERIES_HP_ELOG_ACTION_READD;
+   hp_elog->id_type = PSERIES_HP_ELOG_ID_DRC_INDEX;
+   hp_elog->_drc_u.drc_index = phandle;
+
+   queue_hotplug_event(hp_elog, NULL, NULL);
+   }
+
+   kfree(hp_elog);
+}
+
 int pseries_devicetree_update(s32 scope)
 {
char *rtas_buf;
@@ -274,6 +302,10 @@ int pseries_devicetree_update(s32 scope)
break;
case UPDATE_DT_NODE:
update_dt_node(phandle, scope);
+
+   if (scope == PRRN_SCOPE)
+   
pseries_prrn_update_node(phandle);
+
break;
case ADD_DT_NODE:
drc_index = *data++;



[PATCH v2 2/3] powerpc/pseries: Introduce memory hotplug READD operation

2016-12-15 Thread John Allen
Currently, memory must be hot removed and subsequently re-added in order
to dynamically update the affinity of LMBs specified by a PRRN event.
Earlier implementations of the PRRN event handler ran into issues in which
the hot remove would occur successfully, but a hotplug event would be
initiated from another source and grab the hotplug lock preventing the hot
add from occurring. To prevent this situation, this patch introduces the
notion of a hot "readd" action for memory which atomizes a hot remove and
a hot add into a single, serialized operation on the hotplug queue.

Signed-off-by: John Allen 
---
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 9c23baa..076b892 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -318,6 +318,7 @@ struct pseries_hp_errorlog {

 #define PSERIES_HP_ELOG_ACTION_ADD 1
 #define PSERIES_HP_ELOG_ACTION_REMOVE  2
+#define PSERIES_HP_ELOG_ACTION_READD   3

 #define PSERIES_HP_ELOG_ID_DRC_NAME1
 #define PSERIES_HP_ELOG_ID_DRC_INDEX   2
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 0eb4b1d..06f10a8 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -560,6 +560,44 @@ static int dlpar_memory_remove_by_index(u32 drc_index, 
struct property *prop)
return rc;
 }

+static int dlpar_memory_readd_by_index(u32 drc_index, struct property *prop)
+{
+   struct of_drconf_cell *lmbs;
+   u32 num_lmbs, *p;
+   int lmb_found;
+   int i, rc;
+
+   pr_info("Attempting to update LMB, drc index %x\n", drc_index);
+
+   p = prop->value;
+   num_lmbs = *p++;
+   lmbs = (struct of_drconf_cell *)p;
+
+   lmb_found = 0;
+   for (i = 0; i < num_lmbs; i++) {
+   if (lmbs[i].drc_index == drc_index) {
+   lmb_found = 1;
+   rc = dlpar_remove_lmb(&lmbs[i]);
+   if (!rc) {
+   rc = dlpar_add_lmb(&lmbs[i]);
+   if (rc)
+   dlpar_release_drc(lmbs[i].drc_index);
+   }
+   break;
+   }
+   }
+
+   if (!lmb_found)
+   rc = -EINVAL;
+
+   if (rc)
+   pr_info("Failed to update memory at %llx\n",
+   lmbs[i].base_addr);
+   else
+   pr_info("Memory at %llx was updated\n", lmbs[i].base_addr);
+
+   return rc;
+}
 #else
 static inline int pseries_remove_memblock(unsigned long base,
  unsigned int memblock_size)
@@ -776,6 +814,9 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
else
rc = -EINVAL;
break;
+   case PSERIES_HP_ELOG_ACTION_READD:
+   rc = dlpar_memory_readd_by_index(drc_index, prop);
+   break;
default:
pr_err("Invalid action (%d) specified\n", hp_elog->action);
rc = -EINVAL;



[PATCH v2 1/3] powerpc/pseries: Make the acquire/release of the drc for memory a separate step

2016-12-15 Thread John Allen
When adding and removing LMBs we should make the acquire/release of
the DRC a separate step to allow for a few improvements. First,
this ensures that LMBs removed during a remove-by-count operation
are all available if an error occurs and we need to add them back:
by removing all the LMBs from the kernel before releasing their
DRCs, the LMBs remain available to add back should an error occur.

Also, this allows for faster memory re-add operations for
PRRN event handling, since we can skip the unneeded step of
releasing the DRC and then acquiring it back.

Signed-off-by: Nathan Fontenot 
Signed-off-by: John Allen 
---
 arch/powerpc/platforms/pseries/hotplug-memory.c |   34 +++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2617f9f..be11fc3 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -446,9 +446,7 @@ static int dlpar_remove_lmb(struct of_drconf_cell *lmb)
/* Update memory regions for memory remove */
memblock_remove(lmb->base_addr, block_sz);

-   dlpar_release_drc(lmb->drc_index);
dlpar_remove_device_tree_lmb(lmb);
-
return 0;
 }

@@ -516,6 +514,7 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove,
if (!lmbs[i].reserved)
continue;

+   dlpar_release_drc(lmbs[i].drc_index);
pr_info("Memory at %llx was hot-removed\n",
lmbs[i].base_addr);

@@ -545,6 +544,9 @@ static int dlpar_memory_remove_by_index(u32 drc_index, 
struct property *prop)
if (lmbs[i].drc_index == drc_index) {
lmb_found = 1;
rc = dlpar_remove_lmb(&lmbs[i]);
+   if (!rc)
+   dlpar_release_drc(lmbs[i].drc_index);
+
break;
}
}
@@ -599,10 +601,6 @@ static int dlpar_add_lmb(struct of_drconf_cell *lmb)
if (lmb->flags & DRCONF_MEM_ASSIGNED)
return -EINVAL;

-   rc = dlpar_acquire_drc(lmb->drc_index);
-   if (rc)
-   return rc;
-
rc = dlpar_add_device_tree_lmb(lmb);
if (rc) {
pr_err("Couldn't update device tree for drc index %x\n",
@@ -618,12 +616,10 @@ static int dlpar_add_lmb(struct of_drconf_cell *lmb)

/* Add the memory */
rc = add_memory(nid, lmb->base_addr, block_sz);
-   if (rc) {
+   if (rc)
dlpar_remove_device_tree_lmb(lmb);
-   dlpar_release_drc(lmb->drc_index);
-   } else {
+   else
lmb->flags |= DRCONF_MEM_ASSIGNED;
-   }

return rc;
 }
@@ -655,10 +651,16 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add, 
struct property *prop)
return -EINVAL;

for (i = 0; i < num_lmbs && lmbs_to_add != lmbs_added; i++) {
-   rc = dlpar_add_lmb(&lmbs[i]);
+   rc = dlpar_acquire_drc(lmbs[i].drc_index);
if (rc)
continue;

+   rc = dlpar_add_lmb(&lmbs[i]);
+   if (rc) {
+   dlpar_release_drc(lmbs[i].drc_index);
+   continue;
+   }
+
lmbs_added++;

/* Mark this lmb so we can remove it later if all of the
@@ -678,6 +680,8 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add, 
struct property *prop)
if (rc)
pr_err("Failed to remove LMB, drc index %x\n",
   be32_to_cpu(lmbs[i].drc_index));
+   else
+   dlpar_release_drc(lmbs[i].drc_index);
}
rc = -EINVAL;
} else {
@@ -711,7 +715,13 @@ static int dlpar_memory_add_by_index(u32 drc_index, struct 
property *prop)
for (i = 0; i < num_lmbs; i++) {
if (lmbs[i].drc_index == drc_index) {
lmb_found = 1;
-   rc = dlpar_add_lmb(&lmbs[i]);
+   rc = dlpar_acquire_drc(lmbs[i].drc_index);
+   if (!rc) {
+   rc = dlpar_add_lmb(&lmbs[i]);
+   if (rc)
+   dlpar_release_drc(lmbs[i].drc_index);
+   }
+
break;
}
}



[PATCH v2 0/3] powerpc/pseries: Perform PRRN topology updates in kernel

2016-12-15 Thread John Allen
Formerly, when we received a PRRN RTAS event, device tree updating was
performed in the kernel and the actual topology updating was performed in
userspace. This was necessary because updating the topology for memory
requires a hot remove and a subsequent hot add, and until recently memory
hotplug was not included in the kernel. Since memory hotplug is now
available, this patchset moves the PRRN topology updating into the kernel.

Changes from v1:
-Introduce patch to separate the acquire and release drc from existing
 memory hotplug
-Create new function "dlpar_memory_readd_by_index" that consolidates the
 necessary steps of memory hot remove and hot add into a single function
-Remove conversion of phandle to BE
-Since error messages are already generated in the memory hotplug code,
 remove redundant error messages in pseries_prrn_update_node. Since we no
 longer use the return code from the hotplug event, remove the
 wait_for_completion infrastructure.

John Allen (3):
  powerpc/pseries: Make the acquire/release of the drc for memory a
separate step
  powerpc/pseries: Introduce memory hotplug READD operation
  powerpc/pseries: Update affinity for memory and cpus specified in a PRRN 
event

 arch/powerpc/include/asm/rtas.h |1
 arch/powerpc/kernel/rtasd.c |7 ++
 arch/powerpc/platforms/pseries/hotplug-memory.c |   75 +++
 arch/powerpc/platforms/pseries/mobility.c   |   32 ++
 4 files changed, 102 insertions(+), 13 deletions(-)



[PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix

2016-12-15 Thread Reza Arbab
Memory hotplug is leading to hash page table calls, even on radix:

...
arch_add_memory
	create_section_mapping
		htab_bolt_mapping
			BUG_ON(!ppc_md.hpte_insert);

To fix, refactor {create,remove}_section_mapping() into hash__ and radix__
variants. Implement the radix versions by borrowing from existing vmemmap
and x86 code.

This passes basic verification of plugging and removing memory, but this 
stuff is tricky and I'd appreciate extra scrutiny of the series for 
correctness--in particular, the adaptation of remove_pagetable() from x86.

/* changelog */

v3:
* Port remove_pagetable() et al. from x86 for unmapping.

* [RFC] -> [PATCH]

v2:
* 
https://lkml.kernel.org/r/1471449083-15931-1-git-send-email-ar...@linux.vnet.ibm.com

* Do not simply fall through to vmemmap_{create,remove}_mapping(). As Aneesh
  and Michael pointed out, they are tied to CONFIG_SPARSEMEM_VMEMMAP and only
  did what I needed by luck anyway.

v1:
* 
https://lkml.kernel.org/r/1466699962-22412-1-git-send-email-ar...@linux.vnet.ibm.com

Reza Arbab (5):
  powerpc/mm: set the radix linear page mapping size
  powerpc/mm: refactor {create,remove}_section_mapping()
  powerpc/mm: add radix__create_section_mapping()
  powerpc/mm: add radix__remove_section_mapping()
  powerpc/mm: unstub radix__vmemmap_remove_mapping()

 arch/powerpc/include/asm/book3s/64/hash.h  |   5 +
 arch/powerpc/include/asm/book3s/64/radix.h |   5 +
 arch/powerpc/mm/hash_utils_64.c|   4 +-
 arch/powerpc/mm/pgtable-book3s64.c |  18 +++
 arch/powerpc/mm/pgtable-radix.c| 207 -
 5 files changed, 236 insertions(+), 3 deletions(-)

-- 
1.8.3.1



[PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping()

2016-12-15 Thread Reza Arbab
Add the linear page mapping function for radix, used by memory hotplug.
This is similar to vmemmap_populate().

Signed-off-by: Reza Arbab 
---
 arch/powerpc/include/asm/book3s/64/radix.h |  4 
 arch/powerpc/mm/pgtable-book3s64.c |  2 +-
 arch/powerpc/mm/pgtable-radix.c| 19 +++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h 
b/arch/powerpc/include/asm/book3s/64/radix.h
index b4d1302..43c2571 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -291,5 +291,9 @@ static inline unsigned long radix__get_tree_size(void)
}
return rts_field;
 }
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int radix__create_section_mapping(unsigned long start, unsigned long end);
+#endif /* CONFIG_MEMORY_HOTPLUG */
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/mm/pgtable-book3s64.c 
b/arch/powerpc/mm/pgtable-book3s64.c
index 653ff6c..2b13f6b 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -131,7 +131,7 @@ void mmu_cleanup_all(void)
 int create_section_mapping(unsigned long start, unsigned long end)
 {
if (radix_enabled())
-   return -ENODEV;
+   return radix__create_section_mapping(start, end);
 
return hash__create_section_mapping(start, end);
 }
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 54bd70e..8201d1f 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -465,6 +465,25 @@ void radix__setup_initial_memory_limit(phys_addr_t 
first_memblock_base,
memblock_set_current_limit(first_memblock_base + first_memblock_size);
 }
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+int radix__create_section_mapping(unsigned long start, unsigned long end)
+{
+   unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
+
+   /* Align to the page size of the linear mapping. */
+   start = _ALIGN_DOWN(start, page_size);
+
+   for (; start < end; start += page_size) {
+   int rc = radix__map_kernel_page(start, __pa(start),
+   PAGE_KERNEL, page_size);
+   if (rc)
+   return rc;
+   }
+
+   return 0;
+}
+#endif /* CONFIG_MEMORY_HOTPLUG */
+
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 int __meminit radix__vmemmap_create_mapping(unsigned long start,
  unsigned long page_size,
-- 
1.8.3.1



[PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping()

2016-12-15 Thread Reza Arbab
Tear down and free the four-level page tables of the linear mapping
during memory hotremove.

We borrow the basic structure of remove_pagetable() and friends from the
identically-named x86 functions.

Signed-off-by: Reza Arbab 
---
 arch/powerpc/include/asm/book3s/64/radix.h |   1 +
 arch/powerpc/mm/pgtable-book3s64.c |   2 +-
 arch/powerpc/mm/pgtable-radix.c| 163 +
 3 files changed, 165 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h 
b/arch/powerpc/include/asm/book3s/64/radix.h
index 43c2571..0032b66 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -294,6 +294,7 @@ static inline unsigned long radix__get_tree_size(void)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 int radix__create_section_mapping(unsigned long start, unsigned long end);
+int radix__remove_section_mapping(unsigned long start, unsigned long end);
 #endif /* CONFIG_MEMORY_HOTPLUG */
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/mm/pgtable-book3s64.c 
b/arch/powerpc/mm/pgtable-book3s64.c
index 2b13f6b..b798ff6 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -139,7 +139,7 @@ int create_section_mapping(unsigned long start, unsigned 
long end)
 int remove_section_mapping(unsigned long start, unsigned long end)
 {
if (radix_enabled())
-   return -ENODEV;
+   return radix__remove_section_mapping(start, end);
 
return hash__remove_section_mapping(start, end);
 }
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 8201d1f..315237c 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -466,6 +466,159 @@ void radix__setup_initial_memory_limit(phys_addr_t 
first_memblock_base,
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+static void free_pte_table(pte_t *pte_start, pmd_t *pmd)
+{
+   pte_t *pte;
+   int i;
+
+   for (i = 0; i < PTRS_PER_PTE; i++) {
+   pte = pte_start + i;
+   if (!pte_none(*pte))
+   return;
+   }
+
+   pte_free_kernel(&init_mm, pte_start);
+   spin_lock(&init_mm.page_table_lock);
+   pmd_clear(pmd);
+   spin_unlock(&init_mm.page_table_lock);
+}
+
+static void free_pmd_table(pmd_t *pmd_start, pud_t *pud)
+{
+   pmd_t *pmd;
+   int i;
+
+   for (i = 0; i < PTRS_PER_PMD; i++) {
+   pmd = pmd_start + i;
+   if (!pmd_none(*pmd))
+   return;
+   }
+
+   pmd_free(&init_mm, pmd_start);
+   spin_lock(&init_mm.page_table_lock);
+   pud_clear(pud);
+   spin_unlock(&init_mm.page_table_lock);
+}
+
+static void free_pud_table(pud_t *pud_start, pgd_t *pgd)
+{
+   pud_t *pud;
+   int i;
+
+   for (i = 0; i < PTRS_PER_PUD; i++) {
+   pud = pud_start + i;
+   if (!pud_none(*pud))
+   return;
+   }
+
+   pud_free(&init_mm, pud_start);
+   spin_lock(&init_mm.page_table_lock);
+   pgd_clear(pgd);
+   spin_unlock(&init_mm.page_table_lock);
+}
+
+static void remove_pte_table(pte_t *pte_start, unsigned long addr,
+unsigned long end)
+{
+   unsigned long next;
+   pte_t *pte;
+
+   pte = pte_start + pte_index(addr);
+   for (; addr < end; addr = next, pte++) {
+   next = (addr + PAGE_SIZE) & PAGE_MASK;
+   if (next > end)
+   next = end;
+
+   if (!pte_present(*pte))
+   continue;
+
+   spin_lock(&init_mm.page_table_lock);
+   pte_clear(&init_mm, addr, pte);
+   spin_unlock(&init_mm.page_table_lock);
+   }
+
+   flush_tlb_mm(&init_mm);
+}
+
+static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
+unsigned long end, unsigned long map_page_size)
+{
+   unsigned long next;
+   pte_t *pte_base;
+   pmd_t *pmd;
+
+   pmd = pmd_start + pmd_index(addr);
+   for (; addr < end; addr = next, pmd++) {
+   next = pmd_addr_end(addr, end);
+
+   if (!pmd_present(*pmd))
+   continue;
+
+   if (map_page_size == PMD_SIZE) {
+   spin_lock(&init_mm.page_table_lock);
+   pte_clear(&init_mm, addr, (pte_t *)pmd);
+   spin_unlock(&init_mm.page_table_lock);
+
+   continue;
+   }
+
+   pte_base = (pte_t *)pmd_page_vaddr(*pmd);
+   remove_pte_table(pte_base, addr, next);
+   free_pte_table(pte_base, pmd);
+   }
+}
+
+static void remove_pud_table(pud_t *pud_start, unsigned long addr,
+unsigned long end, unsigned long map_page_size)
+{
+   unsigned long next;
+   pmd_t *pmd_base;
+   pud_t *pud;
+
+   pud = pud_start + pud_index(ad

[PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size

2016-12-15 Thread Reza Arbab
This was defaulting to 4K, regardless of PAGE_SIZE.

Signed-off-by: Reza Arbab 
---
 arch/powerpc/mm/pgtable-radix.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 623a0dc..54bd70e 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -351,8 +351,10 @@ void __init radix__early_init_mmu(void)
 #ifdef CONFIG_PPC_64K_PAGES
/* PAGE_SIZE mappings */
mmu_virtual_psize = MMU_PAGE_64K;
+   mmu_linear_psize = MMU_PAGE_64K;
 #else
mmu_virtual_psize = MMU_PAGE_4K;
+   mmu_linear_psize = MMU_PAGE_4K;
 #endif
 
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-- 
1.8.3.1



[PATCH v3 5/5] powerpc/mm: unstub radix__vmemmap_remove_mapping()

2016-12-15 Thread Reza Arbab
Use remove_pagetable() and friends for radix vmemmap removal.

We do not require the special-case handling of vmemmap done in the x86
versions of these functions. This is because vmemmap_free() has already
freed the mapped pages, and calls us with an aligned address range.

So, add a few failsafe WARNs, but otherwise the code to remove linear
mappings is already sufficient for vmemmap.

Signed-off-by: Reza Arbab 
---
 arch/powerpc/mm/pgtable-radix.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 315237c..9d1d51e 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -532,6 +532,15 @@ static void remove_pte_table(pte_t *pte_start, unsigned 
long addr,
if (!pte_present(*pte))
continue;
 
+   if (!PAGE_ALIGNED(addr) || !PAGE_ALIGNED(next)) {
+   /*
+* The vmemmap_free() and remove_section_mapping()
+* codepaths call us with aligned addresses.
+*/
+   WARN_ONCE(1, "%s: unaligned range\n", __func__);
+   continue;
+   }
+
spin_lock(&init_mm.page_table_lock);
pte_clear(&init_mm, addr, pte);
spin_unlock(&init_mm.page_table_lock);
@@ -555,6 +564,12 @@ static void remove_pmd_table(pmd_t *pmd_start, unsigned 
long addr,
continue;
 
if (map_page_size == PMD_SIZE) {
+   if (!IS_ALIGNED(addr, PMD_SIZE) ||
+   !IS_ALIGNED(next, PMD_SIZE)) {
+   WARN_ONCE(1, "%s: unaligned range\n", __func__);
+   continue;
+   }
+
spin_lock(&init_mm.page_table_lock);
pte_clear(&init_mm, addr, (pte_t *)pmd);
spin_unlock(&init_mm.page_table_lock);
@@ -583,6 +598,12 @@ static void remove_pud_table(pud_t *pud_start, unsigned 
long addr,
continue;
 
if (map_page_size == PUD_SIZE) {
+   if (!IS_ALIGNED(addr, PUD_SIZE) ||
+   !IS_ALIGNED(next, PUD_SIZE)) {
+   WARN_ONCE(1, "%s: unaligned range\n", __func__);
+   continue;
+   }
+
spin_lock(&init_mm.page_table_lock);
pte_clear(&init_mm, addr, (pte_t *)pud);
spin_unlock(&init_mm.page_table_lock);
@@ -662,7 +683,7 @@ int __meminit radix__vmemmap_create_mapping(unsigned long 
start,
 #ifdef CONFIG_MEMORY_HOTPLUG
 void radix__vmemmap_remove_mapping(unsigned long start, unsigned long 
page_size)
 {
-   /* FIXME!! intel does more. We should free page tables mapping vmemmap 
? */
+   remove_pagetable(start, start + page_size, page_size);
 }
 #endif
 #endif
-- 
1.8.3.1



[PATCH v3 2/5] powerpc/mm: refactor {create, remove}_section_mapping()

2016-12-15 Thread Reza Arbab
Change {create,remove}_section_mapping() to be wrappers around functions
prefixed with "hash__".

This is preparation for the addition of their "radix__" variants. No
functional change.

Signed-off-by: Reza Arbab 
---
 arch/powerpc/include/asm/book3s/64/hash.h |  5 +
 arch/powerpc/mm/hash_utils_64.c   |  4 ++--
 arch/powerpc/mm/pgtable-book3s64.c| 18 ++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index f61cad3..dd90574 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -201,6 +201,11 @@ extern int __meminit hash__vmemmap_create_mapping(unsigned 
long start,
  unsigned long phys);
 extern void hash__vmemmap_remove_mapping(unsigned long start,
 unsigned long page_size);
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int hash__create_section_mapping(unsigned long start, unsigned long end);
+int hash__remove_section_mapping(unsigned long start, unsigned long end);
+#endif /* CONFIG_MEMORY_HOTPLUG */
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index b9a062f..96a4fb7 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -743,7 +743,7 @@ static unsigned long __init htab_get_table_size(void)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-int create_section_mapping(unsigned long start, unsigned long end)
+int hash__create_section_mapping(unsigned long start, unsigned long end)
 {
int rc = htab_bolt_mapping(start, end, __pa(start),
   pgprot_val(PAGE_KERNEL), mmu_linear_psize,
@@ -757,7 +757,7 @@ int create_section_mapping(unsigned long start, unsigned 
long end)
return rc;
 }
 
-int remove_section_mapping(unsigned long start, unsigned long end)
+int hash__remove_section_mapping(unsigned long start, unsigned long end)
 {
int rc = htab_remove_mapping(start, end, mmu_linear_psize,
 mmu_kernel_ssize);
diff --git a/arch/powerpc/mm/pgtable-book3s64.c 
b/arch/powerpc/mm/pgtable-book3s64.c
index ebf9782..653ff6c 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -126,3 +126,21 @@ void mmu_cleanup_all(void)
else if (mmu_hash_ops.hpte_clear_all)
mmu_hash_ops.hpte_clear_all();
 }
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int create_section_mapping(unsigned long start, unsigned long end)
+{
+   if (radix_enabled())
+   return -ENODEV;
+
+   return hash__create_section_mapping(start, end);
+}
+
+int remove_section_mapping(unsigned long start, unsigned long end)
+{
+   if (radix_enabled())
+   return -ENODEV;
+
+   return hash__remove_section_mapping(start, end);
+}
+#endif /* CONFIG_MEMORY_HOTPLUG */
-- 
1.8.3.1



Re: [upstream-release] [PATCH net 2/4] fsl/fman: arm: call of_platform_populate() for arm64 platform

2016-12-15 Thread Scott Wood
On 12/15/2016 07:11 AM, Madalin Bucur wrote:
> From: Igal Liberman 
> 
> Signed-off-by: Igal Liberman 
> ---
>  drivers/net/ethernet/freescale/fman/fman.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/net/ethernet/freescale/fman/fman.c 
> b/drivers/net/ethernet/freescale/fman/fman.c
> index dafd9e1..f36b4eb 100644
> --- a/drivers/net/ethernet/freescale/fman/fman.c
> +++ b/drivers/net/ethernet/freescale/fman/fman.c
> @@ -2868,6 +2868,16 @@ static struct fman *read_dts_node(struct 
> platform_device *of_dev)
>  
>   fman->dev = &of_dev->dev;
>  
> +#ifdef CONFIG_ARM64
> + /* call of_platform_populate in order to probe sub-nodes on arm64 */
> + err = of_platform_populate(fm_node, NULL, NULL, &of_dev->dev);
> + if (err) {
> + dev_err(&of_dev->dev, "%s: of_platform_populate() failed\n",
> + __func__);
> + goto fman_free;
> + }
> +#endif

Should we remove fsl,fman from the PPC of_device_ids[], so this doesn't
need an ifdef?

Why is it #ifdef CONFIG_ARM64 rather than #ifndef CONFIG_PPC?
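
For illustration only (not a tested change), the alternative guard would keep
the quoted hunk as-is and only flip the condition:

/* Illustrative alternative to the #ifdef CONFIG_ARM64 guard above:
 * probe sub-nodes on everything that is not PPC. */
#ifndef CONFIG_PPC
	err = of_platform_populate(fm_node, NULL, NULL, &of_dev->dev);
	if (err) {
		dev_err(&of_dev->dev, "%s: of_platform_populate() failed\n",
			__func__);
		goto fman_free;
	}
#endif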

-Scott



Re: [powerpc/nmi: RFC 2/2] Keep interrupts enabled even on soft disable

2016-12-15 Thread Benjamin Herrenschmidt
On Wed, 2016-12-14 at 11:41 +1100, Balbir Singh wrote:
> I was planning to skipping other IRQ chips for now and support just
> XICS/XIVE with BOOK3S and PPC64. But we can discuss this.

Well, you still need to make sure you don't do your lazy stuff on
them and actually mask EE.

> > That's why I mentioned opt-in. Maybe make it conditional on a
> > global
> > boolean that gets enabled by the PIC itself, or make it an enum
> > 
> > enum lazy_irq_masking_mode {
> >    lazy_irq_mask_ee,   /* Use CPU EE bit (default) */
> >    lazy_irq_mask_fetch,/* Fetch the interrupt and stash it
> > away */
> >    lazy_irq_mask_prio  /* Change processor priority */
> > };
> > 
> > For the latter we'd need a ppc_md. hook to do the priority change
> > which xive (and potentially others like MPIC) could use.
> 
> We have set_cpu_priority for XICS, which sets the base_priority
> only for the CPPR at the moment. It can be extended

Well, that's what I said earlier. XICS can do that in *theory* but it's
broken in HW. There's a race condition or two: if you whack the CPPR in
a way that causes a pending interrupt to be rejected, there's a timing
window where the ICP can wedge itself or the interrupt can be lost, I don't
remember which.

The only safe way on XICS is to fetch the interrupt (which implicitly
raises the CPPR) and lower it using EOI.

Cheers,
Ben.



Re: [PATCH 3/3] powerpc/corenet: add support for the kmcent2 board

2016-12-15 Thread Joakim Tjernlund
On Thu, 2016-12-15 at 14:22 +0100, Valentin Longchamp wrote:
> This board is built around Freescale's T1040 SoC.
> 
> The peripherals used by this design are:
> - DDR3 RAM with SPD support
> - parallel NOR Flash as boot medium
> - 1 PCIe bus (PCIe1 x1)
> - 3 FMAN Ethernet devices (FMAN1 DTSEC1/2/5)
> - 4 IFC bus devices:
>   - NOR flash
>   - NAND flash
>   - QRIO reset/power mgmt CPLD
>   - BFTIC chassis management CPLD
> - 2 I2C buses
> - 1 SPI bus
> - HDLC bus with the QE's UCC1
> - last but not least, the mandatory serial port
> 
> The board can be used with the corenet32_smp_defconfig.
> 
> Signed-off-by: Valentin Longchamp 
> ---
>  arch/powerpc/boot/dts/fsl/kmcent2.dts | 303 
> ++
>  arch/powerpc/platforms/85xx/corenet_generic.c |   1 +
>  2 files changed, 304 insertions(+)
>  create mode 100644 arch/powerpc/boot/dts/fsl/kmcent2.dts
> 
> diff --git a/arch/powerpc/boot/dts/fsl/kmcent2.dts 
> b/arch/powerpc/boot/dts/fsl/kmcent2.dts
> new file mode 100644
> index 000..47afa43
> --- /dev/null
> +++ b/arch/powerpc/boot/dts/fsl/kmcent2.dts
> @@ -0,0 +1,303 @@
> +/*
> + * Keymile kmcent2 Device Tree Source, based on T1040RDB DTS
> + *
> + * (C) Copyright 2016
> + * Valentin Longchamp, Keymile AG, valentin.longch...@keymile.com
> + *
> + * Copyright 2014 - 2015 Freescale Semiconductor Inc.
> + *
> + * This program is free software; you can redistribute  it and/or modify it
> + * under  the terms of  the GNU General  Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +

[SNIP]

> +
> + ucc_hdlc: ucc@2000 {
> + device_type = "hdlc";
> + compatible = "fsl,ucc-hdlc";
> + rx-clock-name = "clk9";
> + tx-clock-name = "clk9";

Should it be clk9 on both tx and rx clock?

 Jocke

[PATCH 3/3] powerpc/corenet: add support for the kmcent2 board

2016-12-15 Thread Valentin Longchamp
This board is built around Freescale's T1040 SoC.

The peripherals used by this design are:
- DDR3 RAM with SPD support
- parallel NOR Flash as boot medium
- 1 PCIe bus (PCIe1 x1)
- 3 FMAN Ethernet devices (FMAN1 DTSEC1/2/5)
- 4 IFC bus devices:
  - NOR flash
  - NAND flash
  - QRIO reset/power mgmt CPLD
  - BFTIC chassis management CPLD
- 2 I2C buses
- 1 SPI bus
- HDLC bus with the QE's UCC1
- last but not least, the mandatory serial port

The board can be used with the corenet32_smp_defconfig.

Signed-off-by: Valentin Longchamp 
---
 arch/powerpc/boot/dts/fsl/kmcent2.dts | 303 ++
 arch/powerpc/platforms/85xx/corenet_generic.c |   1 +
 2 files changed, 304 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/fsl/kmcent2.dts

diff --git a/arch/powerpc/boot/dts/fsl/kmcent2.dts 
b/arch/powerpc/boot/dts/fsl/kmcent2.dts
new file mode 100644
index 000..47afa43
--- /dev/null
+++ b/arch/powerpc/boot/dts/fsl/kmcent2.dts
@@ -0,0 +1,303 @@
+/*
+ * Keymile kmcent2 Device Tree Source, based on T1040RDB DTS
+ *
+ * (C) Copyright 2016
+ * Valentin Longchamp, Keymile AG, valentin.longch...@keymile.com
+ *
+ * Copyright 2014 - 2015 Freescale Semiconductor Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+/include/ "t104xsi-pre.dtsi"
+
+/ {
+   model = "keymile,kmcent2";
+   compatible = "keymile,kmcent2";
+
+   aliases {
+   front_phy = &front_phy;
+   };
+
+   reserved-memory {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   bman_fbpr: bman-fbpr {
+   size = <0 0x100>;
+   alignment = <0 0x100>;
+   };
+   qman_fqd: qman-fqd {
+   size = <0 0x40>;
+   alignment = <0 0x40>;
+   };
+   qman_pfdr: qman-pfdr {
+   size = <0 0x200>;
+   alignment = <0 0x200>;
+   };
+   };
+
+   ifc: localbus@ffe124000 {
+   reg = <0xf 0xfe124000 0 0x2000>;
+   ranges = <0 0 0xf 0xe800 0x0400
+ 1 0 0xf 0xfa00 0x0001
+ 2 0 0xf 0xfb00 0x0001
+ 4 0 0xf 0xc000 0x0800
+ 6 0 0xf 0xd000 0x0800
+ 7 0 0xf 0xd800 0x0800>;
+
+   nor@0,0 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "cfi-flash";
+   reg = <0x0 0x0 0x0400>;
+   bank-width = <2>;
+   device-width = <2>;
+   };
+
+   nand@1,0 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "fsl,ifc-nand";
+   reg = <0x1 0x0 0x1>;
+   };
+
+   board-control@2,0 {
+   compatible = "keymile,qriox";
+   reg = <0x2 0x0 0x80>;
+   };
+
+   chassis-mgmt@6,0 {
+   compatible = "keymile,bfticu";
+   reg = <6 0 0x100>;
+   interrupt-controller;
+   interrupt-parent = <&mpic>;
+   interrupts = <11 1 0 0>;
+   #interrupt-cells = <1>;
+   };
+
+   };
+
+   memory {
+   device_type = "memory";
+   };
+
+   dcsr: dcsr@f {
+   ranges = <0x 0xf 0x 0x01072000>;
+   };
+
+   bportals: bman-portals@ff400 {
+   ranges = <0x0 0xf 0xf400 0x200>;
+   };
+
+   qportals: qman-portals@ff600 {
+   ranges = <0x0 0xf 0xf600 0x200>;
+   };
+
+   soc: soc@ffe00 {
+   ranges = <0x 0xf 0xfe00 0x100>;
+   reg = <0xf 0xfe00 0 0x1000>;
+
+   spi@11 {
+   network-clock@1 {
+   compatible = "zarlink,zl30364";
+   reg = <1>;
+   spi-max-frequency = <100>;
+   };
+   };
+
+   sdhc@114000 {
+   status = "disabled";
+   };
+
+   i2c@118000 {
+   clock-frequency = <10>;
+
+   mux@70 {
+   compatible = "nxp,pca9547";
+   reg = <0x70>;
+   #address-cells = <1>;
+

[PATCH 2/3] powerpc/85xx: remove the kmp204x_defconfig

2016-12-15 Thread Valentin Longchamp
It is not maintained and thus obsolete. corenet32_smp_defconfig can be
used as a reference for the kmcoge4/kmp204x boards.

Signed-off-by: Valentin Longchamp 
---
 arch/powerpc/configs/85xx/kmp204x_defconfig | 220 
 1 file changed, 220 deletions(-)
 delete mode 100644 arch/powerpc/configs/85xx/kmp204x_defconfig

diff --git a/arch/powerpc/configs/85xx/kmp204x_defconfig 
b/arch/powerpc/configs/85xx/kmp204x_defconfig
deleted file mode 100644
index a60..000
--- a/arch/powerpc/configs/85xx/kmp204x_defconfig
+++ /dev/null
@@ -1,220 +0,0 @@
-CONFIG_PPC_85xx=y
-CONFIG_SMP=y
-CONFIG_NR_CPUS=8
-CONFIG_SYSVIPC=y
-CONFIG_POSIX_MQUEUE=y
-CONFIG_AUDIT=y
-CONFIG_NO_HZ=y
-CONFIG_HIGH_RES_TIMERS=y
-CONFIG_BSD_PROCESS_ACCT=y
-CONFIG_IKCONFIG=y
-CONFIG_IKCONFIG_PROC=y
-CONFIG_LOG_BUF_SHIFT=14
-CONFIG_CGROUPS=y
-CONFIG_CGROUP_SCHED=y
-CONFIG_RELAY=y
-CONFIG_BLK_DEV_INITRD=y
-CONFIG_KALLSYMS_ALL=y
-CONFIG_EMBEDDED=y
-CONFIG_PERF_EVENTS=y
-CONFIG_SLAB=y
-CONFIG_MODULES=y
-CONFIG_MODULE_UNLOAD=y
-CONFIG_MODULE_FORCE_UNLOAD=y
-CONFIG_MODVERSIONS=y
-# CONFIG_BLK_DEV_BSG is not set
-CONFIG_PARTITION_ADVANCED=y
-CONFIG_MAC_PARTITION=y
-CONFIG_CORENET_GENERIC=y
-CONFIG_MPIC_MSGR=y
-CONFIG_HIGHMEM=y
-# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
-CONFIG_BINFMT_MISC=m
-CONFIG_KEXEC=y
-CONFIG_FORCE_MAX_ZONEORDER=13
-CONFIG_PCI=y
-CONFIG_PCIEPORTBUS=y
-# CONFIG_PCIEASPM is not set
-CONFIG_PCI_MSI=y
-CONFIG_ADVANCED_OPTIONS=y
-CONFIG_LOWMEM_SIZE_BOOL=y
-CONFIG_LOWMEM_SIZE=0x2000
-CONFIG_NET=y
-CONFIG_PACKET=y
-CONFIG_UNIX=y
-CONFIG_XFRM_USER=y
-CONFIG_XFRM_SUB_POLICY=y
-CONFIG_XFRM_STATISTICS=y
-CONFIG_NET_KEY=y
-CONFIG_NET_KEY_MIGRATE=y
-CONFIG_INET=y
-CONFIG_IP_MULTICAST=y
-CONFIG_IP_ADVANCED_ROUTER=y
-CONFIG_IP_MULTIPLE_TABLES=y
-CONFIG_IP_ROUTE_MULTIPATH=y
-CONFIG_IP_ROUTE_VERBOSE=y
-CONFIG_IP_PNP=y
-CONFIG_IP_PNP_DHCP=y
-CONFIG_IP_PNP_BOOTP=y
-CONFIG_IP_PNP_RARP=y
-CONFIG_NET_IPIP=y
-CONFIG_IP_MROUTE=y
-CONFIG_IP_PIMSM_V1=y
-CONFIG_IP_PIMSM_V2=y
-CONFIG_INET_AH=y
-CONFIG_INET_ESP=y
-CONFIG_INET_IPCOMP=y
-CONFIG_IPV6=y
-CONFIG_IP_SCTP=m
-CONFIG_TIPC=y
-CONFIG_NET_SCHED=y
-CONFIG_NET_SCH_CBQ=y
-CONFIG_NET_SCH_HTB=y
-CONFIG_NET_SCH_HFSC=y
-CONFIG_NET_SCH_PRIO=y
-CONFIG_NET_SCH_MULTIQ=y
-CONFIG_NET_SCH_RED=y
-CONFIG_NET_SCH_SFQ=y
-CONFIG_NET_SCH_TEQL=y
-CONFIG_NET_SCH_TBF=y
-CONFIG_NET_SCH_GRED=y
-CONFIG_NET_CLS_BASIC=y
-CONFIG_NET_CLS_TCINDEX=y
-CONFIG_NET_CLS_U32=y
-CONFIG_CLS_U32_PERF=y
-CONFIG_CLS_U32_MARK=y
-CONFIG_NET_CLS_FLOW=y
-CONFIG_NET_CLS_CGROUP=y
-CONFIG_UEVENT_HELPER_PATH="/sbin/mdev"
-CONFIG_DEVTMPFS=y
-CONFIG_MTD=y
-CONFIG_MTD_CMDLINE_PARTS=y
-CONFIG_MTD_BLOCK=y
-CONFIG_MTD_CFI=y
-CONFIG_MTD_CFI_AMDSTD=y
-CONFIG_MTD_PHYSMAP_OF=y
-CONFIG_MTD_PHRAM=y
-CONFIG_MTD_NAND=y
-CONFIG_MTD_NAND_ECC_BCH=y
-CONFIG_MTD_NAND_FSL_ELBC=y
-CONFIG_MTD_UBI=y
-CONFIG_MTD_UBI_GLUEBI=y
-CONFIG_BLK_DEV_LOOP=y
-CONFIG_BLK_DEV_RAM=y
-CONFIG_BLK_DEV_RAM_COUNT=2
-CONFIG_BLK_DEV_RAM_SIZE=2048
-CONFIG_EEPROM_AT24=y
-CONFIG_SCSI=y
-CONFIG_BLK_DEV_SD=y
-CONFIG_CHR_DEV_ST=y
-CONFIG_BLK_DEV_SR=y
-CONFIG_CHR_DEV_SG=y
-CONFIG_SCSI_LOGGING=y
-CONFIG_SCSI_SYM53C8XX_2=y
-CONFIG_NETDEVICES=y
-# CONFIG_NET_VENDOR_3COM is not set
-# CONFIG_NET_VENDOR_ADAPTEC is not set
-# CONFIG_NET_VENDOR_ALTEON is not set
-# CONFIG_NET_VENDOR_AMD is not set
-# CONFIG_NET_VENDOR_ATHEROS is not set
-# CONFIG_NET_VENDOR_BROADCOM is not set
-# CONFIG_NET_VENDOR_BROCADE is not set
-# CONFIG_NET_VENDOR_CHELSIO is not set
-# CONFIG_NET_VENDOR_CISCO is not set
-# CONFIG_NET_VENDOR_DEC is not set
-# CONFIG_NET_VENDOR_DLINK is not set
-# CONFIG_NET_VENDOR_EMULEX is not set
-# CONFIG_NET_VENDOR_EXAR is not set
-CONFIG_FSL_PQ_MDIO=y
-CONFIG_FSL_XGMAC_MDIO=y
-# CONFIG_NET_VENDOR_HP is not set
-# CONFIG_NET_VENDOR_INTEL is not set
-# CONFIG_NET_VENDOR_MARVELL is not set
-# CONFIG_NET_VENDOR_MELLANOX is not set
-# CONFIG_NET_VENDOR_MICREL is not set
-# CONFIG_NET_VENDOR_MICROCHIP is not set
-# CONFIG_NET_VENDOR_MYRI is not set
-# CONFIG_NET_VENDOR_NATSEMI is not set
-# CONFIG_NET_VENDOR_NVIDIA is not set
-# CONFIG_NET_VENDOR_OKI is not set
-# CONFIG_NET_PACKET_ENGINE is not set
-# CONFIG_NET_VENDOR_QLOGIC is not set
-# CONFIG_NET_VENDOR_REALTEK is not set
-# CONFIG_NET_VENDOR_RDC is not set
-# CONFIG_NET_VENDOR_SEEQ is not set
-# CONFIG_NET_VENDOR_SILAN is not set
-# CONFIG_NET_VENDOR_SIS is not set
-# CONFIG_NET_VENDOR_SMSC is not set
-# CONFIG_NET_VENDOR_STMICRO is not set
-# CONFIG_NET_VENDOR_SUN is not set
-# CONFIG_NET_VENDOR_TEHUTI is not set
-# CONFIG_NET_VENDOR_TI is not set
-# CONFIG_NET_VENDOR_VIA is not set
-# CONFIG_NET_VENDOR_WIZNET is not set
-# CONFIG_NET_VENDOR_XILINX is not set
-CONFIG_MARVELL_PHY=y
-CONFIG_VITESSE_PHY=y
-CONFIG_FIXED_PHY=y
-# CONFIG_WLAN is not set
-# CONFIG_INPUT_MOUSEDEV is not set
-# CONFIG_INPUT_KEYBOARD is not set
-# CONFIG_INPUT_MOUSE is not set
-CONFIG_SERIO_LIBPS2=y
-# CONFIG_LEGACY_PTYS is not set
-CONFIG_PPC_EPAPR_HV_BYTECHAN=y
-CONFIG_SERIAL_8250=y
-CONFIG_SERIAL_8250_CONSOLE=y
-CONFIG_SERIAL_8250_MANY_POR

[PATCH 0/3] powerpc: update for the Keymile QorIQ boards

2016-12-15 Thread Valentin Longchamp
This series contains some updates for the Keymile QorIQ boards.
There is a small fix for the kmcoge4 board DTS, the removal of the
kmp204x_defconfig file which is unmaintained (corenet32_smp_defconfig
can be used instead) and the addition of the kmcent2 board.

Valentin Longchamp (3):
  powerpc/corenet: explicitly disable the SDHC controller on kmcoge4
  powerpc/85xx: remove the kmp204x_defconfig
  powerpc/corenet: add support for the kmcent2 board

 arch/powerpc/boot/dts/fsl/kmcent2.dts | 303 ++
 arch/powerpc/boot/dts/fsl/kmcoge4.dts |   4 +
 arch/powerpc/configs/85xx/kmp204x_defconfig   | 220 ---
 arch/powerpc/platforms/85xx/corenet_generic.c |   1 +
 4 files changed, 308 insertions(+), 220 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/fsl/kmcent2.dts
 delete mode 100644 arch/powerpc/configs/85xx/kmp204x_defconfig

-- 
1.8.3.1


[PATCH 1/3] powerpc/corenet: explicitly disable the SDHC controller on kmcoge4

2016-12-15 Thread Valentin Longchamp
It is not implemented on the kmcoge4 hardware and, if not disabled, it
leads to error messages with corenet32_smp_defconfig.

Signed-off-by: Valentin Longchamp 
---
 arch/powerpc/boot/dts/fsl/kmcoge4.dts | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/kmcoge4.dts 
b/arch/powerpc/boot/dts/fsl/kmcoge4.dts
index ae70a24..e103c0f 100644
--- a/arch/powerpc/boot/dts/fsl/kmcoge4.dts
+++ b/arch/powerpc/boot/dts/fsl/kmcoge4.dts
@@ -83,6 +83,10 @@
};
};
 
+   sdhc@114000 {
+   status = "disabled";
+   };
+
i2c@119000 {
status = "disabled";
};
-- 
1.8.3.1


[PATCH v8] powerpc: Do not make the entire heap executable

2016-12-15 Thread Denys Vlasenko
On 32-bit powerpc the ELF PLT sections of binaries (built with --bss-plt,
or with a toolchain which defaults to it) look like this:

  [17] .sbss NOBITS  0002aff8 01aff8 14 00  WA  0   0  4
  [18] .plt  NOBITS  0002b00c 01aff8 84 00 WAX  0   0  4
  [19] .bss  NOBITS  0002b090 01aff8 a4 00  WA  0   0  4

Which results in an ELF load header:

  Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD   0x019c70 0x00029c70 0x00029c70 0x01388 0x014c4 RWE 0x1

This is all correct, the load region containing the PLT is marked as
executable. Note that the PLT starts at 0002b00c but the file mapping ends at
0002aff8, so the PLT falls in the 0 fill section described by the load header,
and after a page boundary.

Unfortunately the generic ELF loader ignores the X bit in the load headers
when it creates the 0 filled non-file backed mappings. It assumes all of these
mappings are RW BSS sections, which is not the case for PPC.

gcc/ld has an option (--secure-plt) to not do this; it is said to incur
a small performance penalty.

Currently, to support 32-bit binaries with a PLT in BSS, the kernel maps the
*entire brk area* with executable rights for all binaries, even --secure-plt
ones.

Stop doing that.

Teach the ELF loader to check the X bit in the relevant load header
and create 0 filled anonymous mappings that are executable
if the load header requests that.
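
As a minimal, self-contained sketch of the idea (not the kernel patch itself;
the helper name is made up for illustration), the permissions of the
zero-filled anonymous part of a segment can be derived from the load header
instead of being assumed to be RW:

#include <elf.h>
#include <sys/mman.h>

/* Sketch only: derive mapping permissions from the PT_LOAD p_flags,
 * including PF_X, rather than assuming the zero-filled tail is RW BSS. */
static int elf_flags_to_prot(Elf32_Word p_flags)
{
	int prot = 0;

	if (p_flags & PF_R)
		prot |= PROT_READ;
	if (p_flags & PF_W)
		prot |= PROT_WRITE;
	if (p_flags & PF_X)
		prot |= PROT_EXEC;	/* the bit the loader used to ignore */

	return prot;
}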

Test program showing the difference in /proc/$PID/maps:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main() {
	char buf[16*1024];
	char *p = malloc(123); /* make "[heap]" mapping appear */
	int fd = open("/proc/self/maps", O_RDONLY);
	int len = read(fd, buf, sizeof(buf));
	write(1, buf, len);
	printf("%p\n", p);
	return 0;
}

Compiled using: gcc -mbss-plt -m32 -Os test.c -otest

Unpatched ppc64 kernel:
0010-0012 r-xp  00:00 0  [vdso]
0fe1-0ffd r-xp  fd:00 67898094   
/usr/lib/libc-2.17.so
0ffd-0ffe r--p 001b fd:00 67898094   
/usr/lib/libc-2.17.so
0ffe-0fff rw-p 001c fd:00 67898094   
/usr/lib/libc-2.17.so
1000-1001 r-xp  fd:00 100674505  
/home/user/test
1001-1002 r--p  fd:00 100674505  
/home/user/test
1002-1003 rw-p 0001 fd:00 100674505  
/home/user/test
1069-106c rwxp  00:00 0  [heap]
f7f7-f7fa r-xp  fd:00 67898089   
/usr/lib/ld-2.17.so
f7fa-f7fb r--p 0002 fd:00 67898089   
/usr/lib/ld-2.17.so
f7fb-f7fc rw-p 0003 fd:00 67898089   
/usr/lib/ld-2.17.so
ffa9-ffac rw-p  00:00 0  [stack]
0x10690008

Patched ppc64 kernel:
0010-0012 r-xp  00:00 0  [vdso]
0fe1-0ffd r-xp  fd:00 67898094   
/usr/lib/libc-2.17.so
0ffd-0ffe r--p 001b fd:00 67898094   
/usr/lib/libc-2.17.so
0ffe-0fff rw-p 001c fd:00 67898094   
/usr/lib/libc-2.17.so
1000-1001 r-xp  fd:00 100674505  
/home/user/test
1001-1002 r--p  fd:00 100674505  
/home/user/test
1002-1003 rw-p 0001 fd:00 100674505  
/home/user/test
1018-101b rw-p  00:00 0  [heap]
   this has changed
f7c6-f7c9 r-xp  fd:00 67898089   
/usr/lib/ld-2.17.so
f7c9-f7ca r--p 0002 fd:00 67898089   
/usr/lib/ld-2.17.so
f7ca-f7cb rw-p 0003 fd:00 67898089   
/usr/lib/ld-2.17.so
ff86-ff89 rw-p  00:00 0  [stack]
0x10180008

The patch was originally posted in 2012 by Jason Gunthorpe
and apparently ignored:

https://lkml.org/lkml/2012/9/30/138

Lightly run-tested.

Signed-off-by: Jason Gunthorpe 
Signed-off-by: Denys Vlasenko 
Acked-by: Kees Cook 
Acked-by: Michael Ellerman 
Tested-by: Jason Gunthorpe 
CC: Andrew Morton 
CC: Benjamin Herrenschmidt 
CC: Paul Mackerras 
CC: "Aneesh Kumar K.V" 
CC: Kees Cook 
CC: Oleg Nesterov 
CC: Michael Ellerman 
CC: Florian Weimer 
CC: linux...@kvack.org
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-ker...@vger.kernel.org
---
Changes since v7:
* added /proc/$PID/maps example in the commit message

Changes since v6:
* rebased to current Linus tree
* sending to akpm

Changes since v5:
* made do_brk_flags() error out if any bits other than VM_EXEC are set.
  (Kees Cook: "With this, I'd be happy to Ack.")
  See https://patchwork.ozlabs.org/patch/6

[PATCH net 3/3] MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver

2016-12-15 Thread Madalin Bucur
Add a record for the Freescale QorIQ DPAA Ethernet driver, adding myself
as maintainer.

Signed-off-by: Madalin Bucur 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e2463ba..0ff9757 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5058,6 +5058,12 @@ S:   Maintained
 F: drivers/net/ethernet/freescale/fman
 F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt
 
+FREESCALE QORIQ DPAA ETHERNET DRIVER
+M: Madalin Bucur 
+L: net...@vger.kernel.org
+S: Maintained
+F: drivers/net/ethernet/freescale/dpaa
+
 FREESCALE QUICC ENGINE LIBRARY
 L: linuxppc-dev@lists.ozlabs.org
 S: Orphan
-- 
2.1.0



[PATCH net 2/3] dpaa_eth: remove redundant dependency on FSL_SOC

2016-12-15 Thread Madalin Bucur
Signed-off-by: Madalin Bucur 
---
 drivers/net/ethernet/freescale/dpaa/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/Kconfig 
b/drivers/net/ethernet/freescale/dpaa/Kconfig
index f3a3454..a654736 100644
--- a/drivers/net/ethernet/freescale/dpaa/Kconfig
+++ b/drivers/net/ethernet/freescale/dpaa/Kconfig
@@ -1,6 +1,6 @@
 menuconfig FSL_DPAA_ETH
tristate "DPAA Ethernet"
-   depends on FSL_SOC && FSL_DPAA && FSL_FMAN
+   depends on FSL_DPAA && FSL_FMAN
select PHYLIB
select FSL_FMAN_MAC
---help---
-- 
2.1.0



[PATCH net 1/3] dpaa_eth: use big endian accessors

2016-12-15 Thread Madalin Bucur
From: Claudiu Manoil 

Ensure correct access to the big endian QMan HW through proper
accessors.
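
The pattern, taken from the first hunk below, is simply to wrap the CPU-endian
constants in cpu_to_be16() before they are written into the (big endian)
hardware descriptor fields:

	/* before: CPU byte order, wrong on little endian hosts */
	initcgr.we_mask = QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES;

	/* after: explicit conversion, correct on either endianness */
	initcgr.we_mask = cpu_to_be16(QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES);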

Signed-off-by: Claudiu Manoil 
Signed-off-by: Madalin Bucur 
---
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 71 ++
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 3c48a84..624ba90 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -733,7 +733,7 @@ static int dpaa_eth_cgr_init(struct dpaa_priv *priv)
priv->cgr_data.cgr.cb = dpaa_eth_cgscn;
 
/* Enable Congestion State Change Notifications and CS taildrop */
-   initcgr.we_mask = QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES;
+   initcgr.we_mask = cpu_to_be16(QM_CGR_WE_CSCN_EN | QM_CGR_WE_CS_THRES);
initcgr.cgr.cscn_en = QM_CGR_EN;
 
/* Set different thresholds based on the MAC speed.
@@ -747,7 +747,7 @@ static int dpaa_eth_cgr_init(struct dpaa_priv *priv)
cs_th = DPAA_CS_THRESHOLD_1G;
qm_cgr_cs_thres_set64(&initcgr.cgr.cs_thres, cs_th, 1);
 
-   initcgr.we_mask |= QM_CGR_WE_CSTD_EN;
+   initcgr.we_mask |= cpu_to_be16(QM_CGR_WE_CSTD_EN);
initcgr.cgr.cstd_en = QM_CGR_EN;
 
err = qman_create_cgr(&priv->cgr_data.cgr, QMAN_CGR_FLAG_USE_INIT,
@@ -896,18 +896,18 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
if (dpaa_fq->init) {
memset(&initfq, 0, sizeof(initfq));
 
-   initfq.we_mask = QM_INITFQ_WE_FQCTRL;
+   initfq.we_mask = cpu_to_be16(QM_INITFQ_WE_FQCTRL);
/* Note: we may get to keep an empty FQ in cache */
-   initfq.fqd.fq_ctrl = QM_FQCTRL_PREFERINCACHE;
+   initfq.fqd.fq_ctrl = cpu_to_be16(QM_FQCTRL_PREFERINCACHE);
 
/* Try to reduce the number of portal interrupts for
 * Tx Confirmation FQs.
 */
if (dpaa_fq->fq_type == FQ_TYPE_TX_CONFIRM)
-   initfq.fqd.fq_ctrl |= QM_FQCTRL_HOLDACTIVE;
+   initfq.fqd.fq_ctrl |= cpu_to_be16(QM_FQCTRL_HOLDACTIVE);
 
/* FQ placement */
-   initfq.we_mask |= QM_INITFQ_WE_DESTWQ;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_DESTWQ);
 
qm_fqd_set_destwq(&initfq.fqd, dpaa_fq->channel, dpaa_fq->wq);
 
@@ -920,8 +920,8 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
if (dpaa_fq->fq_type == FQ_TYPE_TX ||
dpaa_fq->fq_type == FQ_TYPE_TX_CONFIRM ||
dpaa_fq->fq_type == FQ_TYPE_TX_CONF_MQ) {
-   initfq.we_mask |= QM_INITFQ_WE_CGID;
-   initfq.fqd.fq_ctrl |= QM_FQCTRL_CGE;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_CGID);
+   initfq.fqd.fq_ctrl |= cpu_to_be16(QM_FQCTRL_CGE);
initfq.fqd.cgid = (u8)priv->cgr_data.cgr.cgrid;
/* Set a fixed overhead accounting, in an attempt to
 * reduce the impact of fixed-size skb shells and the
@@ -932,7 +932,7 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
 * insufficient value, but even that is better than
 * no overhead accounting at all.
 */
-   initfq.we_mask |= QM_INITFQ_WE_OAC;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_OAC);
qm_fqd_set_oac(&initfq.fqd, QM_OAC_CG);
qm_fqd_set_oal(&initfq.fqd,
   min(sizeof(struct sk_buff) +
@@ -941,9 +941,9 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
}
 
if (td_enable) {
-   initfq.we_mask |= QM_INITFQ_WE_TDTHRESH;
+   initfq.we_mask |= cpu_to_be16(QM_INITFQ_WE_TDTHRESH);
qm_fqd_set_taildrop(&initfq.fqd, DPAA_FQ_TD, 1);
-   initfq.fqd.fq_ctrl = QM_FQCTRL_TDE;
+   initfq.fqd.fq_ctrl = cpu_to_be16(QM_FQCTRL_TDE);
}
 
if (dpaa_fq->fq_type == FQ_TYPE_TX) {
@@ -951,7 +951,8 @@ static int dpaa_fq_init(struct dpaa_fq *dpaa_fq, bool 
td_enable)
if (queue_id >= 0)
confq = priv->conf_fqs[queue_id];
if (confq) {
-   initfq.we_mask |= QM_INITFQ_WE_CONTEXTA;
+   initfq.we_mask |=
+   cpu_to_be16(QM_INITFQ_WE_CONTEXTA);
/* ContextA: OVOM=1(use contextA2 bits instead of ICAD)
 *   A2V=1 (contextA A2 field is valid)
 *   A0V=1 (contextA 

[PATCH net 0/3] dpaa_eth: a couple of fixes

2016-12-15 Thread Madalin Bucur
This patch set introduces big endian accessors in the dpaa_eth driver,
making sure accesses to the QBMan HW are correct on little endian
platforms. It also removes a redundant Kconfig dependency on FSL_SOC and
adds myself as maintainer of the dpaa_eth driver.

Claudiu Manoil (1):
  dpaa_eth: use big endian accessors

Madalin Bucur (2):
  dpaa_eth: remove redundant dependency on FSL_SOC
  MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver

 MAINTAINERS|  6 +++
 drivers/net/ethernet/freescale/dpaa/Kconfig|  2 +-
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 71 ++
 3 files changed, 44 insertions(+), 35 deletions(-)

-- 
2.1.0



[PATCH net 4/4] fsl/fman: enable compilation on ARM64

2016-12-15 Thread Madalin Bucur
Signed-off-by: Madalin Bucur 
---
 drivers/net/ethernet/freescale/fman/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fman/Kconfig 
b/drivers/net/ethernet/freescale/fman/Kconfig
index 79b7c84..dc0850b 100644
--- a/drivers/net/ethernet/freescale/fman/Kconfig
+++ b/drivers/net/ethernet/freescale/fman/Kconfig
@@ -1,6 +1,6 @@
 config FSL_FMAN
tristate "FMan support"
-   depends on FSL_SOC || COMPILE_TEST
+   depends on FSL_SOC || ARCH_LAYERSCAPE || COMPILE_TEST
select GENERIC_ALLOCATOR
select PHYLIB
default n
-- 
2.1.0



[PATCH net 2/4] fsl/fman: arm: call of_platform_populate() for arm64 platform

2016-12-15 Thread Madalin Bucur
From: Igal Liberman 

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/fman.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/freescale/fman/fman.c 
b/drivers/net/ethernet/freescale/fman/fman.c
index dafd9e1..f36b4eb 100644
--- a/drivers/net/ethernet/freescale/fman/fman.c
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -2868,6 +2868,16 @@ static struct fman *read_dts_node(struct platform_device 
*of_dev)
 
fman->dev = &of_dev->dev;
 
+#ifdef CONFIG_ARM64
+   /* call of_platform_populate in order to probe sub-nodes on arm64 */
+   err = of_platform_populate(fm_node, NULL, NULL, &of_dev->dev);
+   if (err) {
+   dev_err(&of_dev->dev, "%s: of_platform_populate() failed\n",
+   __func__);
+   goto fman_free;
+   }
+#endif
+
return fman;
 
 fman_node_put:
-- 
2.1.0



[PATCH net 3/4] fsl/fman: A007273 only applies to PPC SoCs

2016-12-15 Thread Madalin Bucur
Signed-off-by: Madalin Bucur 
Reviewed-by: Camelia Groza 
---
 drivers/net/ethernet/freescale/fman/fman.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/freescale/fman/fman.c 
b/drivers/net/ethernet/freescale/fman/fman.c
index f36b4eb..93d6a36 100644
--- a/drivers/net/ethernet/freescale/fman/fman.c
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -1890,6 +1890,7 @@ static int fman_reset(struct fman *fman)
 
goto _return;
} else {
+#ifdef CONFIG_PPC
struct device_node *guts_node;
struct ccsr_guts __iomem *guts_regs;
u32 devdisr2, reg;
@@ -1921,6 +1922,7 @@ static int fman_reset(struct fman *fman)
 
/* Enable all MACs */
iowrite32be(reg, &guts_regs->devdisr2);
+#endif
 
/* Perform FMan reset */
iowrite32be(FPM_RSTC_FM_RESET, &fman->fpm_regs->fm_rstc);
@@ -1932,25 +1934,31 @@ static int fman_reset(struct fman *fman)
} while (((ioread32be(&fman->fpm_regs->fm_rstc)) &
 FPM_RSTC_FM_RESET) && --count);
if (count == 0) {
+#ifdef CONFIG_PPC
iounmap(guts_regs);
of_node_put(guts_node);
+#endif
err = -EBUSY;
goto _return;
}
+#ifdef CONFIG_PPC
 
/* Restore devdisr2 value */
iowrite32be(devdisr2, &guts_regs->devdisr2);
 
iounmap(guts_regs);
of_node_put(guts_node);
+#endif
 
goto _return;
 
+#ifdef CONFIG_PPC
 guts_regs:
of_node_put(guts_node);
 guts_node:
dev_dbg(fman->dev, "%s: Didn't perform FManV3 reset due to 
Errata A007273!\n",
__func__);
+#endif
}
 _return:
return err;
-- 
2.1.0



[PATCH net 1/4] fsl/fman: fix 1G support for QSGMII interfaces

2016-12-15 Thread Madalin Bucur
QSGMII ports were not advertising 1G speed.

Signed-off-by: Madalin Bucur 
Reviewed-by: Camelia Groza 
---
 drivers/net/ethernet/freescale/fman/mac.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/freescale/fman/mac.c 
b/drivers/net/ethernet/freescale/fman/mac.c
index 69ca42c..0b31f85 100644
--- a/drivers/net/ethernet/freescale/fman/mac.c
+++ b/drivers/net/ethernet/freescale/fman/mac.c
@@ -594,6 +594,7 @@ static const u16 phy2speed[] = {
[PHY_INTERFACE_MODE_RGMII_RXID] = SPEED_1000,
[PHY_INTERFACE_MODE_RGMII_TXID] = SPEED_1000,
[PHY_INTERFACE_MODE_RTBI]   = SPEED_1000,
+   [PHY_INTERFACE_MODE_QSGMII] = SPEED_1000,
[PHY_INTERFACE_MODE_XGMII]  = SPEED_1
 };
 
-- 
2.1.0



[PATCH net 0/4] fsl/fman: fixes for ARM

2016-12-15 Thread Madalin Bucur
The patch set fixes advertised speeds for QSGMII interfaces, disables
A007273 erratum workaround on non-PowerPC platforms where it does not
apply, enables compilation on ARM64 and addresses a probing issue on
ARM64.

Igal Liberman (1):
  fsl/fman: arm: call of_platform_populate() for arm64 platform

Madalin Bucur (3):
  fsl/fman: fix 1G support for QSGMII interfaces
  fsl/fman: A007273 only applies to PPC SoCs
  fsl/fman: enable compilation on ARM64

 drivers/net/ethernet/freescale/fman/Kconfig |  2 +-
 drivers/net/ethernet/freescale/fman/fman.c  | 18 ++
 drivers/net/ethernet/freescale/fman/mac.c   |  1 +
 3 files changed, 20 insertions(+), 1 deletion(-)

-- 
2.1.0



Re: [PATCH] powerpc/8xx: Perf events on PPC 8xx

2016-12-15 Thread Christophe LEROY

Note that this patch applies on top of the following patches:
- powerpc/32: Remove FIX_SRR1
- [2/2] powerpc/8xx: Implement hw_breakpoint

Christophe

Le 15/12/2016 à 13:42, Christophe Leroy a écrit :

This patch has been reworked since the RFC version. In the RFC, this patch
was preceded by a patch clearing MSR RI for all PPC32 at all times in
exception prologs. Now MSR RI clearing is done only when this 8xx perf
events functionality is compiled in; it is therefore limited to the 8xx
and merged into this patch.
Other main changes have been to take into account detailed review from
Peter Zijlstra. The instructions counter has been reworked to behave
as a free running counter like the three other counters.

The 8xx has no PMU, however some events can be emulated by other means.

This patch implements the following events (as reported by 'perf list'):
  cpu-cycles OR cycles  [Hardware event]
  instructions  [Hardware event]
  dTLB-load-misses  [Hardware cache event]
  iTLB-load-misses  [Hardware cache event]

'cycles' event is implemented using the timebase clock. Timebase clock
corresponds to the CPU clock divided by 16, so the number of cycles is
approximately 16 times the number of TB ticks.

On the 8xx, TLB misses are handled by software. It is therefore
easy to count all TLB misses each time the TLB miss exception is
called.

'instructions' is calculated by using instruction watchpoint counter.
This patch sets counter A to count instructions at address greater
than 0, hence we count all instructions executed while MSR RI bit is
set. The counter is set to the maximum, which is 0xFFFF. Every 65535
instructions, the debug instruction breakpoint exception fires. The
exception handler increments a counter in memory which then
represents the upper part of the instruction counter. We therefore
end up with a 48-bit counter. In order to avoid unnecessary overhead
while no perf event is active, this counter is started when the first
event referring to this counter is added, and the counter is stopped
when the last event referring to it is deleted. In order to properly
support breakpoint exceptions, the MSR RI bit has to be unset in exception
epilogs to avoid breakpoint exceptions during critical sections, where
changes to SRR0 and SRR1 would be problematic.

All counters are handled as free running counters.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/reg.h |   2 +
 arch/powerpc/include/asm/reg_8xx.h |   4 +
 arch/powerpc/kernel/entry_32.S |  15 +++
 arch/powerpc/kernel/head_8xx.S |  46 -
 arch/powerpc/perf/8xx-pmu.c| 173 +
 arch/powerpc/perf/Makefile |   2 +
 arch/powerpc/platforms/Kconfig.cputype |   7 ++
 7 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/perf/8xx-pmu.c

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 0d4531a..9098b35 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -548,7 +548,9 @@
 #define SPRN_IBAT7U0x236   /* Instruction BAT 7 Upper Register */
 #define SPRN_ICMP  0x3D5   /* Instruction TLB Compare Register */
 #define SPRN_ICTC  0x3FB   /* Instruction Cache Throttling Control Reg */
+#ifndef SPRN_ICTRL
 #define SPRN_ICTRL 0x3F3   /* 1011 7450 icache and interrupt ctrl */
+#endif
 #define ICTRL_EICE 0x0800  /* enable icache parity errs */
 #define ICTRL_EDC  0x0400  /* enable dcache parity errs */
 #define ICTRL_EICP 0x0100  /* enable icache par. check */
diff --git a/arch/powerpc/include/asm/reg_8xx.h 
b/arch/powerpc/include/asm/reg_8xx.h
index c52725b..ae16fef 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -28,12 +28,16 @@
 /* Special MSR manipulation registers */
 #define SPRN_EIE   80  /* External interrupt enable (EE=1, RI=1) */
 #define SPRN_EID   81  /* External interrupt disable (EE=0, RI=1) */
+#define SPRN_NRI   82  /* Non recoverable interrupt (EE=0, RI=0) */

 /* Debug registers */
+#define SPRN_CMPA  144
+#define SPRN_COUNTA150
 #define SPRN_CMPE  152
 #define SPRN_CMPF  153
 #define SPRN_LCTRL1156
 #define SPRN_LCTRL2157
+#define SPRN_ICTRL 158
 #define SPRN_BAR   159

 /* Commands.  Only the first few are available to the instruction cache.
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 980626a..f3e4fc1 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -205,6 +205,9 @@ transfer_to_handler_cont:
mflrr9
lwz r11,0(r9)   /* virtual address of handler */
lwz r9,4(r9)/* where to go when done */
+#ifdef CONFIG_PPC_8xx_PERF_EVENT
+   mtspr   SPRN_NRI, r0
+#endif
 #ifdef CONFIG_T

[PATCH] powerpc/8xx: Perf events on PPC 8xx

2016-12-15 Thread Christophe Leroy
This patch has been reworked since the RFC version. In the RFC, this patch
was preceded by a patch clearing MSR RI for all PPC32 at all times in
exception prologs. Now MSR RI clearing is done only when this 8xx perf
events functionality is compiled in; it is therefore limited to the 8xx
and merged into this patch.
Other main changes have been to take into account detailed review from
Peter Zijlstra. The instructions counter has been reworked to behave
as a free running counter like the three other counters.

The 8xx has no PMU, however some events can be emulated by other means.

This patch implements the following events (as reported by 'perf list'):
  cpu-cycles OR cycles  [Hardware event]
  instructions  [Hardware event]
  dTLB-load-misses  [Hardware cache event]
  iTLB-load-misses  [Hardware cache event]

'cycles' event is implemented using the timebase clock. Timebase clock
corresponds to the CPU clock divided by 16, so the number of cycles is
approximately 16 times the number of TB ticks.

On the 8xx, TLB misses are handled by software. It is therefore
easy to count all TLB misses each time the TLB miss exception is
called.

'instructions' is calculated by using instruction watchpoint counter.
This patch sets counter A to count instructions at address greater
than 0, hence we count all instructions executed while MSR RI bit is
set. The counter is set to the maximum, which is 0xFFFF. Every 65535
instructions, the debug instruction breakpoint exception fires. The
exception handler increments a counter in memory which then
represents the upper part of the instruction counter. We therefore
end up with a 48-bit counter. In order to avoid unnecessary overhead
while no perf event is active, this counter is started when the first
event referring to this counter is added, and the counter is stopped
when the last event referring to it is deleted. In order to properly
support breakpoint exceptions, the MSR RI bit has to be unset in exception
epilogs to avoid breakpoint exceptions during critical sections, where
changes to SRR0 and SRR1 would be problematic.

All counters are handled as free running counters.
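
As a rough, self-contained sketch of the arithmetic described above (this is
not the patch's code; the 16-bit hardware counter width and the wrap handling
are simplified for illustration):

#include <stdint.h>

/* A 48-bit free-running instruction count built from a 16-bit hardware
 * counter plus a software upper part that the debug breakpoint exception
 * handler bumps on every wrap. */
static uint64_t instr_upper;	/* incremented by the exception handler */

static uint64_t instr_count(uint16_t hw_elapsed)
{
	return (instr_upper << 16) | hw_elapsed;	/* 48 significant bits */
}

/* 'cycles' on the 8xx: the timebase ticks at CPU clock / 16, so the cycle
 * count is approximated as 16 times the timebase delta. */
static uint64_t cycles_from_tb(uint64_t tb_ticks)
{
	return tb_ticks << 4;
}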

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/reg.h |   2 +
 arch/powerpc/include/asm/reg_8xx.h |   4 +
 arch/powerpc/kernel/entry_32.S |  15 +++
 arch/powerpc/kernel/head_8xx.S |  46 -
 arch/powerpc/perf/8xx-pmu.c| 173 +
 arch/powerpc/perf/Makefile |   2 +
 arch/powerpc/platforms/Kconfig.cputype |   7 ++
 7 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/perf/8xx-pmu.c

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 0d4531a..9098b35 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -548,7 +548,9 @@
 #define SPRN_IBAT7U0x236   /* Instruction BAT 7 Upper Register */
 #define SPRN_ICMP  0x3D5   /* Instruction TLB Compare Register */
 #define SPRN_ICTC  0x3FB   /* Instruction Cache Throttling Control Reg */
+#ifndef SPRN_ICTRL
 #define SPRN_ICTRL 0x3F3   /* 1011 7450 icache and interrupt ctrl */
+#endif
 #define ICTRL_EICE 0x0800  /* enable icache parity errs */
 #define ICTRL_EDC  0x0400  /* enable dcache parity errs */
 #define ICTRL_EICP 0x0100  /* enable icache par. check */
diff --git a/arch/powerpc/include/asm/reg_8xx.h 
b/arch/powerpc/include/asm/reg_8xx.h
index c52725b..ae16fef 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -28,12 +28,16 @@
 /* Special MSR manipulation registers */
 #define SPRN_EIE   80  /* External interrupt enable (EE=1, RI=1) */
 #define SPRN_EID   81  /* External interrupt disable (EE=0, RI=1) */
+#define SPRN_NRI   82  /* Non recoverable interrupt (EE=0, RI=0) */
 
 /* Debug registers */
+#define SPRN_CMPA  144
+#define SPRN_COUNTA150
 #define SPRN_CMPE  152
 #define SPRN_CMPF  153
 #define SPRN_LCTRL1156
 #define SPRN_LCTRL2157
+#define SPRN_ICTRL 158
 #define SPRN_BAR   159
 
 /* Commands.  Only the first few are available to the instruction cache.
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 980626a..f3e4fc1 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -205,6 +205,9 @@ transfer_to_handler_cont:
mflrr9
lwz r11,0(r9)   /* virtual address of handler */
lwz r9,4(r9)/* where to go when done */
+#ifdef CONFIG_PPC_8xx_PERF_EVENT
+   mtspr   SPRN_NRI, r0
+#endif
 #ifdef CONFIG_TRACE_IRQFLAGS
lis r12,reenable_mmu@h
ori r12,r12,reenable_mmu@l
@@ -292,6 +295,9 @@ stack_ovf:
lis r9,StackOverflow@ha
addir9,r9,StackOverflow@l
   

Re: [PATCH] genirq/affinity: fix node generation from cpumask

2016-12-15 Thread Guilherme G. Piccoli
On 12/15/2016 07:36 AM, Thomas Gleixner wrote:
> On Thu, 15 Dec 2016, Gavin Shan wrote:
>>> static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t 
>>> *nodemsk)
>>> {
>>> -   int n, nodes;
>>> +   int n, nodes = 0;
>>>
>>> /* Calculate the number of nodes in the supplied affinity mask */
>>> -   for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
>>> +   for_each_online_node(n)
>>> if (cpumask_intersects(mask, cpumask_of_node(n))) {
>>> node_set(n, *nodemsk);
>>> nodes++;
>>> }
>>> -   }
>>> +
>>
>> It'd better to keep the brackets so that we needn't add them when adding
>> more code into the block next time.
> 
> Removing the brackets is outright wrong. See:
>   https://marc.info/?l=linux-kernel&m=147351236615103
> 
> I'll fix that up when applying the patch.
> 
> Thanks,
> 
>   tglx
> 

Thank you all very much for the reviews and comments - lesson learned
about the brackets in multi-line if/for statements!

Thanks for fixing it Thomas.
Cheers,


Guilherme



Re: [PATCH] genirq/affinity: fix node generation from cpumask

2016-12-15 Thread Balbir Singh


On 15/12/16 05:01, Guilherme G. Piccoli wrote:
> Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
> infrastructure") introduced a better IRQ spreading mechanism, taking
> account of the available NUMA nodes in the machine.
> 
> Problem is that the algorithm of retrieving the nodemask iterates
> "linearly" based on the number of online nodes - some architectures
> present non-linear node distribution among the nodemask, like PowerPC.
> If this is the case, the algorithm leads to a wrong node count
> and therefore to a bad/incomplete IRQ affinity distribution.
> 
> For example, this problem was found on a machine with 128 CPUs and two
> nodes, namely nodes 0 and 8 (instead of 0 and 1, if it was linearly
> distributed). This led to a wrong affinity distribution which then led to
> a bad mq allocation for nvme driver.
> 
> Finally, we take the opportunity to fix a comment regarding the affinity
> distribution when we have _more_ nodes than vectors.
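
A tiny self-contained illustration of the failure mode described above,
assuming the two online nodes are numbered 0 and 8 as in the example:

#include <stdio.h>

/* With online nodes {0, 8}, looping over 0..num_online_nodes()-1 only ever
 * visits IDs 0 and 1 and misses node 8; walking the actual online node IDs
 * (what for_each_online_node() does) visits both. */
int main(void)
{
	int online_nodes[] = { 0, 8 };	/* assumed PowerPC-style numbering */
	int num_online = 2;

	for (int n = 0; n < num_online; n++)
		printf("linear loop visits node %d\n", n);

	for (int i = 0; i < num_online; i++)
		printf("online node walk visits node %d\n", online_nodes[i]);

	return 0;
}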

Very good catch! 

Acked-by: Balbir Singh 


Re: [PATCH] genirq/affinity: fix node generation from cpumask

2016-12-15 Thread Thomas Gleixner
On Thu, 15 Dec 2016, Gavin Shan wrote:
> > static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t 
> > *nodemsk)
> > {
> >-int n, nodes;
> >+int n, nodes = 0;
> >
> > /* Calculate the number of nodes in the supplied affinity mask */
> >-for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
> >+for_each_online_node(n)
> > if (cpumask_intersects(mask, cpumask_of_node(n))) {
> > node_set(n, *nodemsk);
> > nodes++;
> > }
> >-}
> >+
> 
> It'd better to keep the brackets so that we needn't add them when adding
> more code into the block next time.

Removing the brackets is outright wrong. See:
  https://marc.info/?l=linux-kernel&m=147351236615103

I'll fix that up when applying the patch.

Thanks,

tglx



Re: [PATCH] genirq/affinity: fix node generation from cpumask

2016-12-15 Thread Christoph Hellwig
Looks fine:

Reviewed-by: Christoph Hellwig 

(but I agree with the bracing nitpick from Gavin)