build regression from c153693: Simplify module TOC handling

2016-02-09 Thread Peter Robinson
Hi Alan,

Your patch for "powerpc: Simplify module TOC handling" is causing the
Fedora ppc64le kernel to fail to build with depmod failures. Reverting the
commit fixes it for us on rawhide.

We're getting the output below, full logs at [1]. Let me know if you
have any other queries.

Regards,
Peter

[1] 
http://ppc.koji.fedoraproject.org/kojifiles/work/tasks/5115/3125115/build.log

+ depmod -b . -aeF ./System.map 4.5.0-0.rc2.git0.1.fc24.ppc64le
Depmod failure
+ '[' -s depmod.out ']'
+ echo 'Depmod failure'
+ cat depmod.out
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/powernv/opal-prd.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/pseries_energy.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/hvcserver.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-pr.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-hv.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/rcu/rcutorture.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/trace/ring_buffer_benchmark.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/torture.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/nfs_acl.ko
needs unknown symbol .TOC.
depmod: WARNING:
/builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/grace.ko
needs unknown symbol .TOC.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages

2016-02-09 Thread Christophe Leroy
On a live running system (a VoIP gateway for Air Traffic Control), over
a 10 minute period (with 277s idle), we get 87 million DTLB misses
and approximately 35 seconds are spent in the DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 million DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

The MPC8xx has no BATs but it has an 8M page size. This patch implements
mapping of kernel RAM using 8M pages, on the same model as what is
done on the 40x.

In 4k pages mode, each PGD entry maps a 4M area: we map every two
consecutive entries to the same 8M physical page. In each second entry,
we add 4M to the page physical address to ease the life of the FixupDAR
routine; this is simply ignored by the hardware.

In 16k pages mode, each PGD entry maps a 64M area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from the EPN to select the correct 8M page within the area.
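
To illustrate the 4k pages mode scheme, here is a small self-contained
sketch (not code from this patch) that prints, for the first 24M of RAM,
which 8M page each 4M level-1 slot belongs to and which physical address
is recorded in it:

#include <stdio.h>

#define SZ_4M	(4UL << 20)	/* one level-1 (PGD) slot in 4k pages mode */
#define SZ_8M	(8UL << 20)	/* one large page */

/*
 * Illustration only: the +4M recorded in every second slot is ignored by
 * the hardware but helps the FixupDAR routine recompute the address.
 */
int main(void)
{
	unsigned long offset;

	for (offset = 0; offset < (24UL << 20); offset += SZ_4M) {
		unsigned long page = offset & ~(SZ_8M - 1);	/* 8M page base */
		unsigned long rpn  = page + (offset & SZ_4M);	/* value stored */

		printf("slot %2lu: 8M page at 0x%08lx, recorded address 0x%08lx\n",
		       offset / SZ_4M, page, rpn);
	}
	return 0;
}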

With this patch applied, we now get only 13 million TLB misses
during the 10 minute period. The idle time has increased to 313s
and the overall time spent in the DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of the non-idle time.

Signed-off-by: Christophe Leroy 
---
v2: using bt instead of bgt and named the label explicitly
v3: no change
v4: no change
v5: removed use of pmd_val() as L-value
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 35 +-
 arch/powerpc/mm/8xx_mmu.c  | 83 ++
 arch/powerpc/mm/Makefile   |  1 +
 arch/powerpc/mm/mmu_decl.h | 15 ++--
 4 files changed, 120 insertions(+), 14 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a89492e..87d1f5f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -398,11 +398,13 @@ DataStoreTLBMiss:
BRANCH_UNLESS_KERNEL(3f)
lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-   mtcr    r3
 
/* Insert level 1 index */
rlwimi  r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 
29
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the 
level 1 entry */
+   mtcr    r11
+   bt- 28,DTLBMiss8M   /* bit 28 = Large page (8M) */
+   mtcr    r3
 
/* We have a pte table, so load fetch the pte from the table.
 */
@@ -455,6 +457,29 @@ DataStoreTLBMiss:
EXCEPTION_EPILOG_0
rfi
 
+DTLBMiss8M:
+   mtcr    r3
+   ori r11, r11, MD_SVALID
+   MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+#ifdef CONFIG_PPC_16K_PAGES
+   /*
+* In 16k pages mode, each PGD entry defines a 64M block.
+* Here we select the 8M page within the block.
+*/
+   rlwimi  r11, r10, 0, 0x0380
+#endif
+   rlwinm  r10, r11, 0, 0xff80
+   ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+ _PAGE_PRESENT
+   MTSPR_CPU6(SPRN_MD_RPN, r10, r3)/* Update TLB entry */
+
+   li  r11, RPN_PATTERN
+   mfspr   r3, SPRN_SPRG_SCRATCH2
+   mtspr   SPRN_DAR, r11   /* Tag DAR */
+   EXCEPTION_EPILOG_0
+   rfi
+
+
 /* This is an instruction TLB error on the MPC8xx.  This could be due
  * to many reasons, such as executing guarded memory or illegal instruction
  * addresses.  There is nothing to do but handle a big time error fault.
@@ -532,13 +557,15 @@ FixupDAR:/* Entry point for dcbx workaround. */
/* Insert level 1 index */
 3: rlwimi  r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 
29
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the 
level 1 entry */
+   mtcr    r11
+   bt  28,200f /* bit 28 = Large page (8M) */
rlwinm  r11, r11,0,0,19 /* Extract page descriptor page address */
/* Insert level 2 index */
rlwimi  r11, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
lwz r11, 0(r11) /* Get the pte */
/* concat physical page address(r11) and page offset(r10) */
rlwimi  r11, r10, 0, 32 - PAGE_SHIFT, 31
-   lwz r11,0(r11)
+201:   lwz r11,0(r11)
 /* Check if it really is a dcbx instruction. */
 /* dcbt and dcbtst does not generate DTLB Misses/Errors,
  * no need to include them here */
@@ -557,6 +584,10 @@ FixupDAR:/* Entry point for dcbx workaround. */
 141:   mfspr   r10,SPRN_SPRG_SCRATCH2
b   DARFixed/* Nope, go back to normal TLB processing */
 
+   /* concat physical page address(r11) and page offset(r10) */
+200:   rlwimi  r11, r10, 0, 32 - (PAGE_SHIFT << 1), 31
+   b   201b
+
 144:   mfspr   r10, SPRN_DSISR
rlwinm  r10, r10,0,7,5  /* Clear store bit for buggy dcbst insn */
mtspr   

[PATCH v7 03/23] powerpc: Update documentation for noltlbs kernel parameter

2016-02-09 Thread Christophe Leroy
Now the noltlbs kernel parameter is also applicable to the PPC8xx.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 Documentation/kernel-parameters.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 59e1515..c3e420b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2592,7 +2592,7 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
nolapic_timer   [X86-32,APIC] Do not use the local APIC timer.
 
noltlbs [PPC] Do not use large page/tlb entries for kernel
-   lowmem mapping on PPC40x.
+   lowmem mapping on PPC40x and PPC8xx
 
nomca   [IA-64] Disable machine check abort handling
 
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 10/23] powerpc/8xx: map more RAM at startup when needed

2016-02-09 Thread Christophe Leroy
On recent kernels, with some debug options like for instance
CONFIG_LOCKDEP, the BSS requires more than 8M of memory, although
the kernel code fits in the first 8M.
Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M
at startup, although pinning the TLB is not necessary for that.

This patch adds more pages (up to 24M) to the initial mapping if
possible/needed in order to have the necessary mappings regardless of
CONFIG_PIN_TLB.

We could have mapped 16M or 24M unconditionally, but since some
platforms only have 8M of memory, we need something a bit more elaborate.

Therefore, if the bootloader is compliant with the ePAPR standard, we use
r7 to know how much memory was mapped by the bootloader.
Otherwise, we try to determine the required memory size by looking at
the _end symbol and the address of the device tree.

This patch does not modify the behaviour when CONFIG_PIN_TLB is
selected.
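
For reference, a C sketch only of the sizing logic implemented in assembly
at __start below (the function name is illustrative; r6/r7 are the ePAPR
magic and mapped size handed over by the bootloader, dtb_phys is the
physical address of the device tree in r3, kernel_end_phys is
_end - PAGE_OFFSET):

#define EPAPR_SMAGIC	0x65504150	/* ePAPR magic for non Book III-E CPUs */

static unsigned long pick_initial_memory_size(unsigned long r6, unsigned long r7,
					      unsigned long dtb_phys,
					      unsigned long kernel_end_phys)
{
	unsigned long highest;

	if (r6 == EPAPR_SMAGIC) {		/* trust the bootloader, cap at 24M */
		if (r7 > (24UL << 20))
			return 24UL << 20;
		return r7;
	}

	/* No ePAPR info: make sure _end and the DTB fit in what we map */
	highest = (kernel_end_phys - 1) | dtb_phys;
	if (highest < (8UL << 20))
		return 8UL << 20;
	if (highest < (16UL << 20))
		return 16UL << 20;
	return 24UL << 20;
}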

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: Automatic detection of available/needed memory instead of allocating 16M 
for all.
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 56 +++---
 arch/powerpc/mm/8xx_mmu.c  | 10 +++-
 2 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index ae721a1..a268cf4 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -72,6 +72,9 @@
 #define RPN_PATTERN0x00f0
 #endif
 
+/* ePAPR magic value for non BOOK III-E CPUs */
+#define EPAPR_SMAGIC   0x65504150
+
__HEAD
 _ENTRY(_stext);
 _ENTRY(_start);
@@ -101,6 +104,38 @@ _ENTRY(_start);
  */
.globl  __start
 __start:
+/*
+ * Determine initial RAM size
+ *
+ * If the Bootloader is ePAPR compliant, the size is given in r7
+ * otherwise, we have to determine how much is needed. For that, we have to
+ * check whether _end of kernel and device tree are within the first 8Mb.
+ */
+   lis r30, 0x0080@h   /* 8Mb by default */
+
+   lis r8, EPAPR_SMAGIC@h
+   ori r8, r8, EPAPR_SMAGIC@l
+   cmplw   cr0, r8, r6
+   bne 1f
+   lis r30, 0x0180@h   /* 24Mb max */
+   cmplw   cr0, r7, r30
+   bgt 2f
+   mr  r30, r7 /* save initial ram size */
+   b   2f
+1:
+   /* is kernel _end or DTB in the first 8M ? if not map 16M */
+   lis r8, (_end - PAGE_OFFSET)@h
+   ori r8, r8, (_end - PAGE_OFFSET)@l
+   addir8, r8, -1
+   or  r8, r8, r3
+   cmplw   cr0, r8, r30
+   blt 2f
+   lis r30, 0x0100@h   /* 16Mb */
+   /* is kernel _end or DTB in the first 16M ? if not map 24M */
+   cmplw   cr0, r8, r30
+   blt 2f
+   lis r30, 0x0180@h   /* 24Mb */
+2:
mr  r31,r3  /* save device tree ptr */
 
/* We have to turn on the MMU right away so we get cache modes
@@ -737,6 +772,8 @@ start_here:
 /*
  * Decide what sort of machine this is and initialize the MMU.
  */
+   lis r3, initial_memory_size@ha
+   stw r30, initial_memory_size@l(r3)
li  r3,0
mr  r4,r31
bl  machine_init
@@ -868,10 +905,15 @@ initial_mmu:
mtspr   SPRN_MD_RPN, r8
 
 #ifdef CONFIG_PIN_TLB
-   /* Map two more 8M kernel data pages.
-   */
+   /* Map one more 8M kernel data page. */
addir10, r10, 0x0100
mtspr   SPRN_MD_CTR, r10
+#else
+   /* Map one more 8M kernel data page if needed */
+   lis r10, 0x0080@h
+   cmplw   cr0, r30, r10
+   ble 1f
+#endif
 
lis r8, KERNELBASE@h/* Create vaddr for TLB */
addis   r8, r8, 0x0080  /* Add 8M */
@@ -884,20 +926,28 @@ initial_mmu:
addis   r11, r11, 0x0080/* Add 8M */
mtspr   SPRN_MD_RPN, r11
 
+#ifdef CONFIG_PIN_TLB
+   /* Map one more 8M kernel data page. */
addir10, r10, 0x0100
mtspr   SPRN_MD_CTR, r10
+#else
+   /* Map one more 8M kernel data page if needed */
+   lis r10, 0x0100@h
+   cmplw   cr0, r30, r10
+   ble 1f
+#endif
 
addis   r8, r8, 0x0080  /* Add 8M */
mtspr   SPRN_MD_EPN, r8
mtspr   SPRN_MD_TWC, r9
addis   r11, r11, 0x0080/* Add 8M */
mtspr   SPRN_MD_RPN, r11
-#endif
 
/* Since the cache is enabled according to the information we
 * just loaded into the TLB, invalidate and enable the caches here.
 * We should probably check/set other modeslater.
 */
+1:
lis r8, IDC_INVALL@h
mtspr   SPRN_IC_CST, r8
mtspr   SPRN_DC_CST, r8
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index f37d5ec..50f17d2 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -20,6 +20,7 @@
 #define IMMR_SIZE 

[PATCH v7 17/23] powerpc/8xx: rewrite flush_instruction_cache() in C

2016-02-09 Thread Christophe Leroy
On PPC8xx, flushing the instruction cache is performed by writing
to the SPRN_IC_CST register. This register is affected by the CPU6 ERRATA.
This patch rewrites the function in C so that the CPU6 ERRATA is
handled transparently.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/misc_32.S | 10 --
 arch/powerpc/mm/8xx_mmu.c |  7 +++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index be8edd6..7d1284f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -296,12 +296,9 @@ _GLOBAL(real_writeb)
  * Flush instruction cache.
  * This is a no-op on the 601.
  */
+#ifndef CONFIG_PPC_8xx
 _GLOBAL(flush_instruction_cache)
-#if defined(CONFIG_8xx)
-   isync
-   lis r5, IDC_INVALL@h
-   mtspr   SPRN_IC_CST, r5
-#elif defined(CONFIG_4xx)
+#if defined(CONFIG_4xx)
 #ifdef CONFIG_403GCX
li  r3, 512
mtctr   r3
@@ -334,9 +331,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
mfspr   r3,SPRN_HID0
ori r3,r3,HID0_ICFI
mtspr   SPRN_HID0,r3
-#endif /* CONFIG_8xx/4xx */
+#endif /* CONFIG_4xx */
isync
blr
+#endif /* CONFIG_PPC_8xx */
 
 /*
  * Write any modified data cache blocks out to memory
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index b75c461..e2ce480 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -181,3 +181,10 @@ void set_context(unsigned long id, pgd_t *pgd)
/* sync */
mb();
 }
+
+void flush_instruction_cache(void)
+{
+   isync();
+   mtspr(SPRN_IC_CST, IDC_INVALL);
+   isync();
+}
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH][v3] mpc85xx/lbc: modify suspend/resume entry sequence

2016-02-09 Thread Raghav Dogra
Convert the platform driver suspend/resume callbacks to syscore
suspend/resume. This is because the p1022ds needs to use the
localbus when entering PCIe resume.

Signed-off-by: Raghav Dogra 
---
Changes for v3: rebased to linux.git main branch

 arch/powerpc/sysdev/Makefile  |  2 +-
 arch/powerpc/sysdev/fsl_lbc.c | 49 +--
 2 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index bd6bd72..ee972aa 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -18,9 +18,9 @@ obj-$(CONFIG_PPC_PMI) += pmi.o
 obj-$(CONFIG_U3_DART)  += dart_iommu.o
 obj-$(CONFIG_MMIO_NVRAM)   += mmio_nvram.o
 obj-$(CONFIG_FSL_SOC)  += fsl_soc.o fsl_mpic_err.o
+obj-$(CONFIG_FSL_LBC)  += fsl_lbc.o
 obj-$(CONFIG_FSL_PCI)  += fsl_pci.o $(fsl-msi-obj-y)
 obj-$(CONFIG_FSL_PMC)  += fsl_pmc.o
-obj-$(CONFIG_FSL_LBC)  += fsl_lbc.o
 obj-$(CONFIG_FSL_GTM)  += fsl_gtm.o
 obj-$(CONFIG_FSL_85XX_CACHE_SRAM)  += fsl_85xx_l2ctlr.o 
fsl_85xx_cache_sram.o
 obj-$(CONFIG_SIMPLE_GPIO)  += simple_gpio.o
diff --git a/arch/powerpc/sysdev/fsl_lbc.c b/arch/powerpc/sysdev/fsl_lbc.c
index 47f7810..424b67f 100644
--- a/arch/powerpc/sysdev/fsl_lbc.c
+++ b/arch/powerpc/sysdev/fsl_lbc.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -352,24 +353,42 @@ err:
 #ifdef CONFIG_SUSPEND
 
 /* save lbc registers */
-static int fsl_lbc_suspend(struct platform_device *pdev, pm_message_t state)
+static int fsl_lbc_syscore_suspend(void)
 {
-   struct fsl_lbc_ctrl *ctrl = dev_get_drvdata(&pdev->dev);
-   struct fsl_lbc_regs __iomem *lbc = ctrl->regs;
+   struct fsl_lbc_ctrl *ctrl;
+   struct fsl_lbc_regs __iomem *lbc;
+
+   ctrl = fsl_lbc_ctrl_dev;
+   if (!ctrl)
+   goto out;
+
+   lbc = ctrl->regs;
+   if (!lbc)
+   goto out;
 
ctrl->saved_regs = kmalloc(sizeof(struct fsl_lbc_regs), GFP_KERNEL);
if (!ctrl->saved_regs)
return -ENOMEM;
 
_memcpy_fromio(ctrl->saved_regs, lbc, sizeof(struct fsl_lbc_regs));
+
+out:
return 0;
 }
 
 /* restore lbc registers */
-static int fsl_lbc_resume(struct platform_device *pdev)
+static void fsl_lbc_syscore_resume(void)
 {
-   struct fsl_lbc_ctrl *ctrl = dev_get_drvdata(&pdev->dev);
-   struct fsl_lbc_regs __iomem *lbc = ctrl->regs;
+   struct fsl_lbc_ctrl *ctrl;
+   struct fsl_lbc_regs __iomem *lbc;
+
+   ctrl = fsl_lbc_ctrl_dev;
+   if (!ctrl)
+   goto out;
+
+   lbc = ctrl->regs;
+   if (!lbc)
+   goto out;
 
if (ctrl->saved_regs) {
_memcpy_toio(lbc, ctrl->saved_regs,
@@ -377,7 +396,9 @@ static int fsl_lbc_resume(struct platform_device *pdev)
kfree(ctrl->saved_regs);
ctrl->saved_regs = NULL;
}
-   return 0;
+
+out:
+   return;
 }
 #endif /* CONFIG_SUSPEND */
 
@@ -389,20 +410,26 @@ static const struct of_device_id fsl_lbc_match[] = {
{},
 };
 
+#ifdef CONFIG_SUSPEND
+static struct syscore_ops lbc_syscore_pm_ops = {
+   .suspend = fsl_lbc_syscore_suspend,
+   .resume = fsl_lbc_syscore_resume,
+};
+#endif
+
 static struct platform_driver fsl_lbc_ctrl_driver = {
.driver = {
.name = "fsl-lbc",
.of_match_table = fsl_lbc_match,
},
.probe = fsl_lbc_ctrl_probe,
-#ifdef CONFIG_SUSPEND
-   .suspend = fsl_lbc_suspend,
-   .resume  = fsl_lbc_resume,
-#endif
 };
 
 static int __init fsl_lbc_init(void)
 {
+#ifdef CONFIG_SUSPEND
+   register_syscore_ops(&lbc_syscore_pm_ops);
+#endif
return platform_driver_register(&fsl_lbc_ctrl_driver);
 }
 subsys_initcall(fsl_lbc_init);
-- 
1.9.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address

2016-02-09 Thread Christophe Leroy
Once the linear memory space has been mapped with 8M pages, as
seen in the related commit, we get 11 million DTLB misses during
the reference 600s period. 77% of the misses are on user addresses
and 23% are on kernel addresses (one fourth for the linear address space
and three fourths for the virtual address space).

Traditionally, each driver manages one computer board which has its
own components with its own memory maps.
But on embedded chips like the MPC8xx, the SoC has all registers
located in the same IO area.

When looking at the ioremaps done during startup, we see that
many drivers are re-mapping small parts of the IMMR for their own use
and all those small pieces get their own 4k page, amplifying the
number of TLB misses: in our system we get 0xff00 mapped 31 times
and 0xff003000 mapped 9 times.

Even if each part of the IMMR was mapped only once with 4k pages, it would
still result in several small mappings outside the linear area.

With the patch, on the same principle as what was done for the RAM,
the IMMR gets mapped by a 512k page.

In 4k pages mode, we reserve a 4M area for mapping the IMMR. The TLB
miss handler checks that we are within the first 512k and bails out
without marking the page valid if we are outside.

In 16k pages mode, it is not realistic to reserve a 64Mb area, so
we do a standard mapping of the 512k area using 32 pages of 16k.
The CPM will be mapped via the first two pages, and the SEC engine
will be mapped via the 16th and 17th pages. As the pages are marked
guarded, there will be no speculative accesses.

With this patch applied, the number of DTLB misses during the 10 minute
period is reduced to 11.8 million for a duration of 5.8s, which
represents 2% of the non-idle time, hence yet another 10% reduction.

Signed-off-by: Christophe Leroy 
---
v2:
- using bt instead of blt/bgt
- reorganised in order to have only one taken branch for both 512k
and 8M instead of a first branch for both 8M and 512k then a second
branch for 512k

v3:
- using fixmap
- using the new x_block_mapped() functions

v4: no change
v5: no change
v6: removed use of pmd_val() as L-value
v7: no change

 arch/powerpc/include/asm/fixmap.h |  9 ++-
 arch/powerpc/kernel/head_8xx.S| 36 +-
 arch/powerpc/mm/8xx_mmu.c | 53 +++
 arch/powerpc/mm/mmu_decl.h|  3 ++-
 4 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fixmap.h 
b/arch/powerpc/include/asm/fixmap.h
index d7dd8fb..b954dc3 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -52,12 +52,19 @@ enum fixed_addresses {
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
 #endif
 #ifdef CONFIG_PPC_8xx
-   /* For IMMR we need an aligned 512K area */
FIX_IMMR_START,
+#ifdef CONFIG_PPC_4K_PAGES
+   /* For IMMR we need an aligned 4M area (full PGD entry) */
+   FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((4 * 1024 * 1024) / PAGE_SIZE)) &
+  ~(((4 * 1024 * 1024) / PAGE_SIZE) - 1),
+   FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((4 * 1024 * 1024) / PAGE_SIZE),
+#else
+   /* For IMMR we need an aligned 512K area */
FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
   ~(((512 * 1024) / PAGE_SIZE) - 1),
FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
 #endif
+#endif
/* FIX_PCIE_MCFG, */
__end_of_fixed_addresses
 };
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 09173ae..ae721a1 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -254,6 +254,37 @@ DataAccess:
. = 0x400
 InstructionAccess:
 
+/*
+ * Bottom part of DTLBMiss handler for 512k pages
+ * not enough space in the primary location
+ */
+#ifdef CONFIG_PPC_4K_PAGES
+/*
+ * 512k pages are only used for mapping IMMR area in 4K pages mode.
+ * Only map the first 512k page of the 4M area covered by the PGD entry.
+ * This should not happen, but if we are called for another page of that
+ * area, don't mark it valid
+ *
+ * In 16k pages mode, IMMR is directly mapped with 16k pages
+ */
+DTLBMiss512k:
+   rlwinm. r10, r10, 0, 0x0038
+   bne-1f
+   ori r11, r11, MD_SVALID
+1: mtcr    r3
+   MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+   rlwinm  r10, r11, 0, 0xffc0
+   ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+ _PAGE_PRESENT | _PAGE_NO_CACHE
+   MTSPR_CPU6(SPRN_MD_RPN, r10, r3)/* Update TLB entry */
+
+   li  r11, RPN_PATTERN
+   mfspr   r3, SPRN_SPRG_SCRATCH2
+   mtspr   SPRN_DAR, r11   /* Tag DAR */
+   EXCEPTION_EPILOG_0
+   rfi
+#endif
+
 /* External interrupt */
EXCEPTION(0x500, HardwareInterrupt, do_IRQ, EXC_XFER_LITE)
 
@@ -405,6 +436,9 @@ DataStoreTLBMiss:
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the 
level 1 entry 

[PATCH v7 21/23] powerpc: Simplify test in __dma_sync()

2016-02-09 Thread Christophe Leroy
This simplification helps the compiler. We now have only one test
instead of two, so it reduces the number of branches.
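
Since end is computed as start + size in __dma_sync(), the two forms are
equivalent; a small throwaway check (not part of the patch) that exercises
the equivalence:

#include <assert.h>

#define L1_CACHE_BYTES	32	/* illustrative value only */

int main(void)
{
	unsigned long start, size;

	for (start = 0; start < 256; start++) {
		for (size = 1; size < 256; size++) {
			unsigned long end = start + size;
			int two_tests = (start & (L1_CACHE_BYTES - 1)) ||
					(size & (L1_CACHE_BYTES - 1));
			int one_test  = ((start | end) & (L1_CACHE_BYTES - 1)) != 0;

			/* both decide "flush" vs "invalidate" identically */
			assert(two_tests == one_test);
		}
	}
	return 0;
}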

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/mm/dma-noncoherent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/dma-noncoherent.c 
b/arch/powerpc/mm/dma-noncoherent.c
index 169aba4..2dc74e5 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
 * invalidate only when cache-line aligned otherwise there is
 * the potential for discarding uncommitted data from the cache
 */
-   if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 
1)))
+   if ((start | end) & (L1_CACHE_BYTES - 1))
flush_dcache_range(start, end);
else
invalidate_dcache_range(start, end);
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 0/2] Consolidate redundant register/stack access code

2016-02-09 Thread Michael Ellerman
On Tue, 2016-02-09 at 00:38 -0500, David Long wrote:

> From: "David A. Long" 
>
> Move duplicate and functionally equivalent code for accessing registers
> and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into
> common kernel files.
>
> I'm sending this out again (with updated distribution list) because v2
> just never got pulled in, even though I don't think there were any
> outstanding issues.

A big cross arch patch like this would often get taken by Andrew Morton, but
AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for
us :D

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages

2016-02-09 Thread Christophe Leroy
The fixmap related functions try to map kernel pages that are
already mapped through large TLBs (LTLBs). pte_offset_kernel() has to
return NULL for LTLBs, otherwise the caller will try to access
a level 2 table which doesn't exist.
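
A rough caller-side sketch (not from this patch; the helper name is made
up) of why the NULL return matters:

/* Sketch only: an address already covered by an LTLB has no level 2
 * table to update, so a NULL return lets the caller bail out safely. */
static void sketch_update_kernel_pte(pmd_t *pmd, unsigned long addr, pte_t val)
{
	pte_t *pte = pte_offset_kernel(pmd, addr);

	if (!pte)		/* block-mapped by a large TLB: nothing to do */
		return;

	set_pte_at(&init_mm, addr, pte, val);
}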

Signed-off-by: Christophe Leroy 
---
v3: new
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/nohash/32/pgtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index c82cbf5..e201600 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -309,7 +309,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, 
pte_t entry)
 #define pte_index(address) \
(((address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
 #define pte_offset_kernel(dir, addr)   \
-   ((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr))
+   (pmd_bad(*(dir)) ? NULL : (pte_t *)pmd_page_vaddr(*(dir)) + \
+ pte_index(addr))
 #define pte_offset_map(dir, addr)  \
((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr))
 #define pte_unmap(pte) kunmap_atomic(pte)
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h

2016-02-09 Thread Christophe Leroy
Add missing SPRN defines into reg_8xx.h.
Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h from
reg_8xx.h. For that, we remove the references to PAGE_SHIFT in mmu-8xx.h
to make it self-sufficient, as files including reg_8xx.h don't all
include asm/page.h.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: We just add missing ones, don't move anymore the ones from mmu-8xx.h
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/mmu-8xx.h |  4 ++--
 arch/powerpc/include/asm/reg_8xx.h | 11 +++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-8xx.h 
b/arch/powerpc/include/asm/mmu-8xx.h
index f05500a..0a566f1 100644
--- a/arch/powerpc/include/asm/mmu-8xx.h
+++ b/arch/powerpc/include/asm/mmu-8xx.h
@@ -171,9 +171,9 @@ typedef struct {
 } mm_context_t;
 #endif /* !__ASSEMBLY__ */
 
-#if (PAGE_SHIFT == 12)
+#if defined(CONFIG_PPC_4K_PAGES)
 #define mmu_virtual_psize  MMU_PAGE_4K
-#elif (PAGE_SHIFT == 14)
+#elif defined(CONFIG_PPC_16K_PAGES)
 #define mmu_virtual_psize  MMU_PAGE_16K
 #else
 #error "Unsupported PAGE_SIZE"
diff --git a/arch/powerpc/include/asm/reg_8xx.h 
b/arch/powerpc/include/asm/reg_8xx.h
index e8ea346..0f71c81 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -4,6 +4,8 @@
 #ifndef _ASM_POWERPC_REG_8xx_H
 #define _ASM_POWERPC_REG_8xx_H
 
+#include 
+
 /* Cache control on the MPC8xx is provided through some additional
  * special purpose registers.
  */
@@ -14,6 +16,15 @@
 #define SPRN_DC_ADR569 /* Address needed for some commands */
 #define SPRN_DC_DAT570 /* Read-only data register */
 
+/* Misc Debug */
+#define SPRN_DPDR  630
+#define SPRN_MI_CAM816
+#define SPRN_MI_RAM0   817
+#define SPRN_MI_RAM1   818
+#define SPRN_MD_CAM824
+#define SPRN_MD_RAM0   825
+#define SPRN_MD_RAM1   826
+
 /* Commands.  Only the first few are available to the instruction cache.
 */
 #defineIDC_ENABLE  0x0200  /* Cache enable */
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 19/23] powerpc32: Remove clear_pages() and define clear_page() inline

2016-02-09 Thread Christophe Leroy
clear_pages() is never used except by clear_page(), and PPC32 is the
only architecture (still) having this function. Neither PPC64 nor
any other architecture has it.

This patch removes clear_pages() and makes clear_page() an inline
function (same as PPC64), as it is only a few instructions.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/page_32.h | 17 ++---
 arch/powerpc/kernel/misc_32.S  | 16 
 arch/powerpc/kernel/ppc_ksyms_32.c |  1 -
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/page_32.h 
b/arch/powerpc/include/asm/page_32.h
index 68d73b2..6a8e179 100644
--- a/arch/powerpc/include/asm/page_32.h
+++ b/arch/powerpc/include/asm/page_32.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_POWERPC_PAGE_32_H
 #define _ASM_POWERPC_PAGE_32_H
 
+#include 
+
 #if defined(CONFIG_PHYSICAL_ALIGN) && (CONFIG_PHYSICAL_START != 0)
 #if (CONFIG_PHYSICAL_START % CONFIG_PHYSICAL_ALIGN) != 0
 #error "CONFIG_PHYSICAL_START must be a multiple of CONFIG_PHYSICAL_ALIGN"
@@ -36,9 +38,18 @@ typedef unsigned long long pte_basic_t;
 typedef unsigned long pte_basic_t;
 #endif
 
-struct page;
-extern void clear_pages(void *page, int order);
-static inline void clear_page(void *page) { clear_pages(page, 0); }
+/*
+ * Clear page using the dcbz instruction, which doesn't cause any
+ * memory traffic (except to write out any cache lines which get
+ * displaced).  This only works on cacheable memory.
+ */
+static inline void clear_page(void *addr)
+{
+   unsigned int i;
+
+   for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
+   dcbz(addr);
+}
 extern void copy_page(void *to, void *from);
 
 #include 
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 7d1284f..181afc1 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -517,22 +517,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 #endif /* CONFIG_BOOKE */
 
 /*
- * Clear pages using the dcbz instruction, which doesn't cause any
- * memory traffic (except to write out any cache lines which get
- * displaced).  This only works on cacheable memory.
- *
- * void clear_pages(void *page, int order) ;
- */
-_GLOBAL(clear_pages)
-   li  r0,PAGE_SIZE/L1_CACHE_BYTES
-   slw r0,r0,r4
-   mtctr   r0
-1: dcbz0,r3
-   addir3,r3,L1_CACHE_BYTES
-   bdnz1b
-   blr
-
-/*
  * Copy a whole page.  We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
  * the destination into cache).  This requires that the destination
diff --git a/arch/powerpc/kernel/ppc_ksyms_32.c 
b/arch/powerpc/kernel/ppc_ksyms_32.c
index 30ddd8a..2bfaafe 100644
--- a/arch/powerpc/kernel/ppc_ksyms_32.c
+++ b/arch/powerpc/kernel/ppc_ksyms_32.c
@@ -10,7 +10,6 @@
 #include 
 #include 
 
-EXPORT_SYMBOL(clear_pages);
 EXPORT_SYMBOL(ISA_DMA_THRESHOLD);
 EXPORT_SYMBOL(DMA_MODE_READ);
 EXPORT_SYMBOL(DMA_MODE_WRITE);
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 0/2] Consolidate redundant register/stack access code

2016-02-09 Thread Ingo Molnar

* Michael Ellerman  wrote:

> On Tue, 2016-02-09 at 00:38 -0500, David Long wrote:
> 
> > From: "David A. Long" 
> >
> > Move duplicate and functionally equivalent code for accessing registers
> > and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into
> > common kernel files.
> >
> > I'm sending this out again (with updated distribution list) because v2
> > just never got pulled in, even though I don't think there were any
> > outstanding issues.
> 
> A big cross arch patch like this would often get taken by Andrew Morton, but
> AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for
> us :D

The other problem is that the second patch is commingling changes to 6 separate 
architectures:

 16 files changed, 106 insertions(+), 343 deletions(-)

that should probably be 6 separate patches. Easier to review, easier to
bisect to, easier to revert, etc.

Thanks,

Ingo
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvements

2016-02-09 Thread Christophe Leroy
The main purpose of this patchset is to dramatically reduce the time
spent in the DTLB miss handler. This is achieved by:
1/ Mapping RAM with 8M pages
2/ Mapping the IMMR with a fixed 512K page

On a live running system (a VoIP gateway for Air Traffic Control), over
a 10 minute period (with 277s idle), we get 87 million DTLB misses
and approximately 35 seconds are spent in the DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 million DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

Once the full patchset is applied, the number of DTLB misses during the
period is reduced to 11.8 million for a duration of 5.8s, which
represents 2% of the non-idle time.

This patchset also includes other miscellaneous improvements:
1/ Handling of the CPU6 ERRATA directly in the mtspr() C macro to reduce
code specific to PPC8xx
2/ Rewrite of a few non-critical ASM functions in C
3/ Removal of some unused items

See related patches for details

Main changes in v3:
* Using fixmap instead of fix address for mapping IMMR

Change in v4:
* Fix of a wrong #if notified by kbuild robot in 07/23

Change in v5:
* Removed use of pmd_val() as L-value
* Adapted to match the new include files layout in Linux 4.5

Change in v6:
* Removed remaining use of pmd_val() as L-value (reported by kbuild test robot)

Change in v7:
* Don't include x_block_mapped() from compilation in
arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set
(reported by kbuild test robot)

Christophe Leroy (23):
  powerpc/8xx: Save r3 all the time in DTLB miss handler
  powerpc/8xx: Map linear kernel RAM with 8M pages
  powerpc: Update documentation for noltlbs kernel parameter
  powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
  powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
  powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam()
together
  powerpc/8xx: Fix vaddr for IMMR early remap
  powerpc/8xx: Map IMMR area with 512k page at a fixed address
  powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
  powerpc/8xx: map more RAM at startup when needed
  powerpc32: Remove useless/wrong MMU:setio progress message
  powerpc32: remove ioremap_base
  powerpc/8xx: Add missing SPRN defines into reg_8xx.h
  powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
  powerpc/8xx: remove special handling of CPU6 errata in set_dec()
  powerpc/8xx: rewrite set_context() in C
  powerpc/8xx: rewrite flush_instruction_cache() in C
  powerpc: add inline functions for cache related instructions
  powerpc32: Remove clear_pages() and define clear_page() inline
  powerpc32: move x_dcache_range() functions inline
  powerpc: Simplify test in __dma_sync()
  powerpc32: small optimisation in flush_icache_range()
  powerpc32: Remove one insn in mulhdu

 Documentation/kernel-parameters.txt  |   2 +-
 arch/powerpc/Kconfig.debug   |   1 -
 arch/powerpc/include/asm/cache.h |  19 +++
 arch/powerpc/include/asm/cacheflush.h|  52 ++-
 arch/powerpc/include/asm/fixmap.h|  14 ++
 arch/powerpc/include/asm/mmu-8xx.h   |   4 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h |   5 +-
 arch/powerpc/include/asm/page_32.h   |  17 ++-
 arch/powerpc/include/asm/reg.h   |   2 +
 arch/powerpc/include/asm/reg_8xx.h   |  93 
 arch/powerpc/include/asm/time.h  |   6 +-
 arch/powerpc/kernel/asm-offsets.c|   8 ++
 arch/powerpc/kernel/head_8xx.S   | 207 +--
 arch/powerpc/kernel/misc_32.S| 107 ++
 arch/powerpc/kernel/ppc_ksyms.c  |   2 +
 arch/powerpc/kernel/ppc_ksyms_32.c   |   1 -
 arch/powerpc/mm/8xx_mmu.c| 190 
 arch/powerpc/mm/Makefile |   1 +
 arch/powerpc/mm/dma-noncoherent.c|   2 +-
 arch/powerpc/mm/fsl_booke_mmu.c  |   4 +-
 arch/powerpc/mm/init_32.c|  23 ---
 arch/powerpc/mm/mmu_decl.h   |  34 +++--
 arch/powerpc/mm/pgtable_32.c |  47 +-
 arch/powerpc/mm/ppc_mmu_32.c |   4 +-
 arch/powerpc/platforms/embedded6xx/mpc10x.h  |  10 --
 arch/powerpc/sysdev/cpm_common.c |  15 +-
 26 files changed, 583 insertions(+), 287 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 07/23] powerpc/8xx: Fix vaddr for IMMR early remap

2016-02-09 Thread Christophe Leroy
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata,
648K rodata, 508K init, 290K bss, 6644K reserved)
Kernel virtual memory layout:
  * 0xfffdf000..0xf000  : fixmap
  * 0xfde0..0xfe00  : consistent mem
  * 0xfddf6000..0xfde0  : early ioremap
  * 0xc900..0xfddf6000  : vmalloc & ioremap
SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

Today, the IMMR is mapped 1:1 at startup.

Mapping the IMMR 1:1 is just wrong because it may overlap with another
area. On most mpc8xx boards it is OK as the IMMR is set to 0xff00,
but for instance on the EP88xC board, the IMMR is at 0xfa20 which
overlaps with the VM ioremap area.

This patch fixes the virtual address for remapping the IMMR by using the
fixmap, regardless of the value of the IMMR.

The size of the IMMR area is 256 kbytes (CPM at offset 0, security engine
at offset 128k), so a 512k page is enough.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: Using fixmap instead of fixed address
v4: Fix a wrong #if notified by kbuild robot
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/fixmap.h |  7 +++
 arch/powerpc/kernel/asm-offsets.c |  8 
 arch/powerpc/kernel/head_8xx.S| 11 ++-
 arch/powerpc/mm/mmu_decl.h|  7 +++
 arch/powerpc/sysdev/cpm_common.c  | 15 ---
 5 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/fixmap.h 
b/arch/powerpc/include/asm/fixmap.h
index 90f604b..d7dd8fb 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -51,6 +51,13 @@ enum fixed_addresses {
FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
 #endif
+#ifdef CONFIG_PPC_8xx
+   /* For IMMR we need an aligned 512K area */
+   FIX_IMMR_START,
+   FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
+  ~(((512 * 1024) / PAGE_SIZE) - 1),
+   FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
+#endif
/* FIX_PCIE_MCFG, */
__end_of_fixed_addresses
 };
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 07cebc3..9724ff8 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -68,6 +68,10 @@
 #include "../mm/mmu_decl.h"
 #endif
 
+#ifdef CONFIG_PPC_8xx
+#include 
+#endif
+
 int main(void)
 {
DEFINE(THREAD, offsetof(struct task_struct, thread));
@@ -772,5 +776,9 @@ int main(void)
 
DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
 
+#ifdef CONFIG_PPC_8xx
+   DEFINE(VIRT_IMMR_BASE, __fix_to_virt(FIX_IMMR_BASE));
+#endif
+
return 0;
 }
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 87d1f5f..09173ae 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Macro to make the code more readable. */
 #ifdef CONFIG_8xx_CPU6
@@ -763,7 +764,7 @@ start_here:
  * virtual to physical.  Also, set the cache mode since that is defined
  * by TLB entries and perform any additional mapping (like of the IMMR).
  * If configured to pin some TLBs, we pin the first 8 Mbytes of kernel,
- * 24 Mbytes of data, and the 8M IMMR space.  Anything not covered by
+ * 24 Mbytes of data, and the 512k IMMR space.  Anything not covered by
  * these mappings is mapped by page tables.
  */
 initial_mmu:
@@ -812,7 +813,7 @@ initial_mmu:
ori r8, r8, MD_APG_INIT@l
mtspr   SPRN_MD_AP, r8
 
-   /* Map another 8 MByte at the IMMR to get the processor
+   /* Map a 512k page for the IMMR to get the processor
 * internal registers (among other things).
 */
 #ifdef CONFIG_PIN_TLB
@@ -820,12 +821,12 @@ initial_mmu:
mtspr   SPRN_MD_CTR, r10
 #endif
mfspr   r9, 638 /* Get current IMMR */
-   andis.  r9, r9, 0xff80  /* Get 8Mbyte boundary */
+   andis.  r9, r9, 0xfff8  /* Get 512 kbytes boundary */
 
-   mr  r8, r9  /* Create vaddr for TLB */
+   lis r8, VIRT_IMMR_BASE@h/* Create vaddr for TLB */
ori r8, r8, MD_EVALID   /* Mark it valid */
mtspr   SPRN_MD_EPN, r8
-   li  r8, MD_PS8MEG   /* Set 8M byte page */
+   li  r8, MD_PS512K | MD_GUARDED  /* Set 512k byte page */
ori r8, r8, MD_SVALID   /* Make it valid */
mtspr   SPRN_MD_TWC, r8
mr  r8, r9  /* Create paddr for TLB */
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 40dd5d3..e7228b7 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -107,6 +107,13 @@ struct hash_pte;
 extern struct hash_pte *Hash, *Hash_end;
 extern unsigned long Hash_size, Hash_mask;
 
+#define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8)
+#ifdef CONFIG_PPC_8xx
+#define VIRT_IMMR_BASE 

[PATCH v7 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together

2016-02-09 Thread Christophe Leroy
x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of
purpose, and are never defined at the same time.
So rename them x_block_mapped() and define them in the relevant
places.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: Functions are mutually exclusive so renamed iaw Scott comment instead of 
grouping into a single function
v4: no change
v5: no change
v6: no change
v7: Don't include x_block_mapped() from compilation in
arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set
(problem reported by kbuild robot with a configuration having
CONFIG_FSL_BOOK3E and not CONFIG_FSL_BOOKE)

 arch/powerpc/mm/fsl_booke_mmu.c |  4 ++--
 arch/powerpc/mm/mmu_decl.h  | 10 ++
 arch/powerpc/mm/pgtable_32.c| 44 ++---
 arch/powerpc/mm/ppc_mmu_32.c|  4 ++--
 4 files changed, 20 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
index f3afe3d..5d45341 100644
--- a/arch/powerpc/mm/fsl_booke_mmu.c
+++ b/arch/powerpc/mm/fsl_booke_mmu.c
@@ -75,7 +75,7 @@ unsigned long tlbcam_sz(int idx)
 /*
  * Return PA for this VA if it is mapped by a CAM, or 0
  */
-phys_addr_t v_mapped_by_tlbcam(unsigned long va)
+phys_addr_t v_block_mapped(unsigned long va)
 {
int b;
for (b = 0; b < tlbcam_index; ++b)
@@ -87,7 +87,7 @@ phys_addr_t v_mapped_by_tlbcam(unsigned long va)
 /*
  * Return VA for a given PA or 0 if not mapped
  */
-unsigned long p_mapped_by_tlbcam(phys_addr_t pa)
+unsigned long p_block_mapped(phys_addr_t pa)
 {
int b;
for (b = 0; b < tlbcam_index; ++b)
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 7faeb9f..40dd5d3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -158,3 +158,13 @@ struct tlbcam {
u32 MAS7;
 };
 #endif
+
+#if defined(CONFIG_6xx) || defined(CONFIG_FSL_BOOKE)
+/* 6xx have BATS */
+/* FSL_BOOKE have TLBCAM */
+phys_addr_t v_block_mapped(unsigned long va);
+unsigned long p_block_mapped(phys_addr_t pa);
+#else
+static inline phys_addr_t v_block_mapped(unsigned long va) { return 0; }
+static inline unsigned long p_block_mapped(phys_addr_t pa) { return 0; }
+#endif
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 7692d1b..db0d35e 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -41,32 +41,8 @@ unsigned long ioremap_base;
 unsigned long ioremap_bot;
 EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */
 
-#ifdef CONFIG_6xx
-#define HAVE_BATS  1
-#endif
-
-#if defined(CONFIG_FSL_BOOKE)
-#define HAVE_TLBCAM1
-#endif
-
 extern char etext[], _stext[];
 
-#ifdef HAVE_BATS
-extern phys_addr_t v_mapped_by_bats(unsigned long va);
-extern unsigned long p_mapped_by_bats(phys_addr_t pa);
-#else /* !HAVE_BATS */
-#define v_mapped_by_bats(x)(0UL)
-#define p_mapped_by_bats(x)(0UL)
-#endif /* HAVE_BATS */
-
-#ifdef HAVE_TLBCAM
-extern phys_addr_t v_mapped_by_tlbcam(unsigned long va);
-extern unsigned long p_mapped_by_tlbcam(phys_addr_t pa);
-#else /* !HAVE_TLBCAM */
-#define v_mapped_by_tlbcam(x)  (0UL)
-#define p_mapped_by_tlbcam(x)  (0UL)
-#endif /* HAVE_TLBCAM */
-
 #define PGDIR_ORDER(32 + PGD_T_LOG2 - PGDIR_SHIFT)
 
 #ifndef CONFIG_PPC_4K_PAGES
@@ -228,19 +204,10 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, 
unsigned long flags,
 
/*
 * Is it already mapped?  Perhaps overlapped by a previous
-* BAT mapping.  If the whole area is mapped then we're done,
-* otherwise remap it since we want to keep the virt addrs for
-* each request contiguous.
-*
-* We make the assumption here that if the bottom and top
-* of the range we want are mapped then it's mapped to the
-* same virt address (and this is contiguous).
-*  -- Cort
+* mapping.
 */
-   if ((v = p_mapped_by_bats(p)) /*&& p_mapped_by_bats(p+size-1)*/ )
-   goto out;
-
-   if ((v = p_mapped_by_tlbcam(p)))
+   v = p_block_mapped(p);
+   if (v)
goto out;
 
if (slab_is_available()) {
@@ -278,7 +245,8 @@ void iounmap(volatile void __iomem *addr)
 * If mapped by BATs then there is nothing to do.
 * Calling vfree() generates a benign warning.
 */
-   if (v_mapped_by_bats((unsigned long)addr)) return;
+   if (v_block_mapped((unsigned long)addr))
+   return;
 
if (addr > high_memory && (unsigned long) addr < ioremap_bot)
vunmap((void *) (PAGE_MASK & (unsigned long)addr));
@@ -403,7 +371,7 @@ static int __change_page_attr(struct page *page, pgprot_t 
prot)
BUG_ON(PageHighMem(page));
address = (unsigned long)page_address(page);
 
-   if (v_mapped_by_bats(address) || v_mapped_by_tlbcam(address))
+   if (v_block_mapped(address))
return 0;
if (!get_pteptr(&init_mm, address, &kpte, &kpmd))
   

[PATCH v7 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec()

2016-02-09 Thread Christophe Leroy
The CPU6 ERRATA is now handled directly in mtspr(), so we can use the
standard set_dec() function in all cases.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/time.h |  6 +-
 arch/powerpc/kernel/head_8xx.S  | 18 --
 2 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 2d7109a..1092fdd 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -31,8 +31,6 @@ extern void tick_broadcast_ipi_handler(void);
 
 extern void generic_calibrate_decr(void);
 
-extern void set_dec_cpu6(unsigned int val);
-
 /* Some sane defaults: 125 MHz timebase, 1GHz processor */
 extern unsigned long ppc_proc_freq;
 #define DEFAULT_PROC_FREQ  (DEFAULT_TB_FREQ * 8)
@@ -166,14 +164,12 @@ static inline void set_dec(int val)
 {
 #if defined(CONFIG_40x)
mtspr(SPRN_PIT, val);
-#elif defined(CONFIG_8xx_CPU6)
-   set_dec_cpu6(val - 1);
 #else
 #ifndef CONFIG_BOOKE
--val;
 #endif
mtspr(SPRN_DEC, val);
-#endif /* not 40x or 8xx_CPU6 */
+#endif /* not 40x */
 }
 
 static inline unsigned long tb_ticks_since(unsigned long tstamp)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a268cf4..637f8e9 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -1011,24 +1011,6 @@ _GLOBAL(set_context)
SYNC
blr
 
-#ifdef CONFIG_8xx_CPU6
-/* It's here because it is unique to the 8xx.
- * It is important we get called with interrupts disabled.  I used to
- * do that, but it appears that all code that calls this already had
- * interrupt disabled.
- */
-   .globl  set_dec_cpu6
-set_dec_cpu6:
-   lis r7, cpu6_errata_word@h
-   ori r7, r7, cpu6_errata_word@l
-   li  r4, 0x2c00
-   stw r4, 8(r7)
-   lwz r4, 8(r7)
-mtspr   22, r3 /* Update Decrementer */
-   SYNC
-   blr
-#endif
-
 /*
  * We put a few things here that have to be page-aligned.
  * This stuff goes at the beginning of the data segment,
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 16/23] powerpc/8xx: rewrite set_context() in C

2016-02-09 Thread Christophe Leroy
There is no real need to have set_context() in assembly.
Now that we have mtspr() handling CPU6 ERRATA directly, we
can rewrite set_context() in C language for easier maintenance.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 44 --
 arch/powerpc/mm/8xx_mmu.c  | 34 
 2 files changed, 34 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 637f8e9..bb2b657 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -968,50 +968,6 @@ initial_mmu:
 
 
 /*
- * Set up to use a given MMU context.
- * r3 is context number, r4 is PGD pointer.
- *
- * We place the physical address of the new task page directory loaded
- * into the MMU base register, and set the ASID compare register with
- * the new "context."
- */
-_GLOBAL(set_context)
-
-#ifdef CONFIG_BDI_SWITCH
-   /* Context switch the PTE pointer for the Abatron BDI2000.
-* The PGDIR is passed as second argument.
-*/
-   lis r5, KERNELBASE@h
-   lwz r5, 0xf0(r5)
-   stw r4, 0x4(r5)
-#endif
-
-   /* Register M_TW will contain base address of level 1 table minus the
-* lower part of the kernel PGDIR base address, so that all accesses to
-* level 1 table are done relative to lower part of kernel PGDIR base
-* address.
-*/
-   li  r5, (swapper_pg_dir-PAGE_OFFSET)@l
-   sub r4, r4, r5
-   tophys  (r4, r4)
-#ifdef CONFIG_8xx_CPU6
-   lis r6, cpu6_errata_word@h
-   ori r6, r6, cpu6_errata_word@l
-   li  r7, 0x3f80
-   stw r7, 12(r6)
-   lwz r7, 12(r6)
-#endif
-   mtspr   SPRN_M_TW, r4   /* Update pointeur to level 1 table */
-#ifdef CONFIG_8xx_CPU6
-   li  r7, 0x3380
-   stw r7, 12(r6)
-   lwz r7, 12(r6)
-#endif
-   mtspr   SPRN_M_CASID, r3/* Update context */
-   SYNC
-   blr
-
-/*
  * We put a few things here that have to be page-aligned.
  * This stuff goes at the beginning of the data segment,
  * which is page-aligned.
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 50f17d2..b75c461 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -147,3 +147,37 @@ void setup_initial_memory_limit(phys_addr_t 
first_memblock_base,
memblock_set_current_limit(min_t(u64, first_memblock_size,
 initial_memory_size));
 }
+
+/*
+ * Set up to use a given MMU context.
+ * id is context number, pgd is PGD pointer.
+ *
+ * We place the physical address of the new task page directory loaded
+ * into the MMU base register, and set the ASID compare register with
+ * the new "context."
+ */
+void set_context(unsigned long id, pgd_t *pgd)
+{
+   s16 offset = (s16)(__pa(swapper_pg_dir));
+
+#ifdef CONFIG_BDI_SWITCH
+   pgd_t   **ptr = *(pgd_t ***)(KERNELBASE + 0xf0);
+
+   /* Context switch the PTE pointer for the Abatron BDI2000.
+* The PGDIR is passed as second argument.
+*/
+   *(ptr + 1) = pgd;
+#endif
+
+   /* Register M_TW will contain base address of level 1 table minus the
+* lower part of the kernel PGDIR base address, so that all accesses to
+* level 1 table are done relative to lower part of kernel PGDIR base
+* address.
+*/
+   mtspr(SPRN_M_TW, __pa(pgd) - offset);
+
+   /* Update context */
+   mtspr(SPRN_M_CASID, id);
+   /* sync */
+   mb();
+}
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 2/2] powerpc: tracing: don't trace hcalls on offline CPUs

2016-02-09 Thread Michael Ellerman
On Fri, 2016-02-05 at 09:36 -0500, Steven Rostedt wrote:
> On Fri, 5 Feb 2016 14:20:17 +0300
> Denis Kirjanov  wrote:
> > > > > Signed-off-by: Denis Kirjanov 
> > >
> > > Hi Steven,
> > >
> > > please apply with Michael's acked-by tag.
> >
> > ping
>
> Actually, can you take this through the ppc tree? The
> TRACE_EVENT_FN_COND is already in mainline.
>
> You can add my:
>
> Acked-by: Steven Rostedt 

Thanks, will do.

I tidied up the change log a bit:

powerpc/pseries: Don't trace hcalls on offline CPUs

If a cpu is hotplugged while the hcall trace points are active, it's
possible to hit a warning from RCU due to the trace points calling into
RCU from an offline cpu, eg:

  RCU used illegally from offline CPU!
  rcu_scheduler_active = 1, debug_locks = 1

Make the hypervisor tracepoints conditional by using
TRACE_EVENT_FN_COND.

Acked-by: Steven Rostedt 
Signed-off-by: Denis Kirjanov 
Signed-off-by: Michael Ellerman 
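
For context, a simplified sketch of what the conditional tracepoint looks
like (field list trimmed and register/unregister hooks shown as used in
arch/powerpc/include/asm/trace.h; this is not the verbatim definition).
The TP_CONDITION() is evaluated before the probe fires, so nothing calls
into RCU on an offline CPU:

TRACE_EVENT_FN_COND(hcall_entry,

	TP_PROTO(unsigned long opcode, unsigned long *args),

	TP_ARGS(opcode, args),

	/* skip the event entirely when this CPU is offline */
	TP_CONDITION(cpu_online(raw_smp_processor_id())),

	TP_STRUCT__entry(
		__field(unsigned long, opcode)
	),

	TP_fast_assign(
		__entry->opcode = opcode;
	),

	TP_printk("opcode=%lu", __entry->opcode),

	hcall_tracepoint_regfunc, hcall_tracepoint_unregfunc
);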


cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc/perf/hv-gpci: Increase request buffer size

2016-02-09 Thread Michael Ellerman
On Tue, 2016-09-02 at 03:08:30 UTC, Sukadev Bhattiprolu wrote:
> >From 31edd352fb7c2a72913f1977fa1bf168109089ad Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu 
> Date: Tue, 9 Feb 2016 02:47:45 -0500
> Subject: [PATCH] powerpc/perf/hv-gpci: Increase request buffer size
> 
> The GPCI hcall allows for a 4K buffer but we limit the buffer
> to 1K. The problem with a 1K buffer is that if a request results in
> returning more values than can be accommodated in the 1K buffer,
> the request will fail.
> 
> The buffer we are using is currently allocated on the stack and
> hence limited in size. Instead use a per-CPU 4K buffer like we do
> with 24x7 counters (hv-24x7.c).
> 
> diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
> index 856fe6e..e6fad73 100644
> --- a/arch/powerpc/perf/hv-gpci.c
> +++ b/arch/powerpc/perf/hv-gpci.c
> @@ -127,8 +127,16 @@ static const struct attribute_group *attr_groups[] = {
>   NULL,
>  };
>  
> +#define HGPCI_REQ_BUFFER_SIZE4096
>  #define GPCI_MAX_DATA_BYTES \
> - (1024 - sizeof(struct hv_get_perf_counter_info_params))
> + (HGPCI_REQ_BUFFER_SIZE - sizeof(struct hv_get_perf_counter_info_params))
> +
> +DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) 
> __aligned(sizeof(uint64_t));
> +
> +struct hv_gpci_request_buffer {
> + struct hv_get_perf_counter_info_params params;
> + uint8_t bytes[1];

bytes is 1 byte long, but ..

> @@ -163,9 +168,11 @@ static unsigned long single_gpci_request(u32 req, u32 
> starting_index,
>*/
>   count = 0;
>   for (i = offset; i < offset + length; i++)
> - count |= arg.bytes[i] << (i - offset);
> + count |= arg->bytes[i] << (i - offset);


Here you read from bytes[i] where i can be > 1 (AFAICS).

That's fishy at best, and newer GCCs just don't allow it.

I think you could do this and it would work, but untested:

   struct hv_gpci_request_buffer {
struct hv_get_perf_counter_info_params params;
uint8_t bytes[4096 - sizeof(struct hv_get_perf_counter_info_params)];
   };
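
A minimal usage sketch under those assumptions (the real patch may handle
preemption and the hcall details differently; the function name is
illustrative):

static unsigned long sketch_single_gpci_request(u32 req)
{
	struct hv_gpci_request_buffer *arg;
	unsigned long ret;

	/* grab this CPU's 4K buffer instead of a 1K on-stack one;
	 * preemption stays disabled while the buffer is in use */
	arg = (void *)get_cpu_var(hv_gpci_reqb);
	memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);

	/* fill in arg->params (request id, starting index, ...) here,
	 * issue the H_GET_PERF_COUNTER_INFO hcall on the buffer, then
	 * read the returned values out of arg->bytes[] */
	ret = 0;

	put_cpu_var(hv_gpci_reqb);
	return ret;
}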

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler

2016-02-09 Thread Christophe Leroy
We are spending between 40 and 160 cycles, with a mean of 65 cycles, in
the DTLB handling routine (measured with mftbl), so make it simpler
although it adds one instruction.
With this modification, we get three registers available at all times,
which will help with the following patch.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index e629e28..a89492e 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -385,23 +385,20 @@ InstructionTLBMiss:
 
. = 0x1200
 DataStoreTLBMiss:
-#ifdef CONFIG_8xx_CPU6
mtspr   SPRN_SPRG_SCRATCH2, r3
-#endif
EXCEPTION_PROLOG_0
-   mfcr    r10
+   mfcr    r3
 
/* If we are faulting a kernel address, we have to use the
 * kernel page tables.
 */
-   mfspr   r11, SPRN_MD_EPN
-   IS_KERNEL(r11, r11)
+   mfspr   r10, SPRN_MD_EPN
+   IS_KERNEL(r11, r10)
mfspr   r11, SPRN_M_TW  /* Get level 1 table */
BRANCH_UNLESS_KERNEL(3f)
lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-   mtcr    r10
-   mfspr   r10, SPRN_MD_EPN
+   mtcr    r3
 
/* Insert level 1 index */
rlwimi  r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 
29
@@ -453,9 +450,7 @@ DataStoreTLBMiss:
MTSPR_CPU6(SPRN_MD_RPN, r10, r3)/* Update TLB entry */
 
/* Restore registers */
-#ifdef CONFIG_8xx_CPU6
mfspr   r3, SPRN_SPRG_SCRATCH2
-#endif
mtspr   SPRN_DAR, r11   /* Tag DAR */
EXCEPTION_EPILOG_0
rfi
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM

2016-02-09 Thread Christophe Leroy
IMMR is now mapped by page tables, so it is no longer
necessary to pin TLBs.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/Kconfig.debug | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 638f9ce..136b09c 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -220,7 +220,6 @@ config PPC_EARLY_DEBUG_40x
 config PPC_EARLY_DEBUG_CPM
bool "Early serial debugging for Freescale CPM-based serial ports"
depends on SERIAL_CPM
-   select PIN_TLB if PPC_8xx
help
  Select this to enable early debugging for Freescale chips
  using a CPM-based serial port.  This assumes that the bootwrapper
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc/xmon: Add xmon command to dump process/task similar to ps(1)

2016-02-09 Thread Michael Ellerman
On Mon, 2015-23-11 at 15:01:15 UTC, Douglas Miller wrote:
> Add 'P' command with optional task_struct address to dump all/one task's
> information: task pointer, kernel stack pointer, PID, PPID, state
> (interpreted), CPU where (last) running, and command.
> 
> Introduce XMON_PROTECT macro to standardize memory-access-fault
> protection (setjmp). Initially used only by the 'P' command.

Hi Doug,

Sorry this has taken a while, it keeps getting preempted by more important
patches.

I'm also not a big fan of the protect macro, it works for this case, but it's
already a bit ugly calling for_each_process() inside the macro, and it would be
even worse for multi line logic.

I think I'd rather just open code it, and hopefully we can come up with a
better solution for catching errors in the long run.

I also renamed the routines to use "task", because "proc" in xmon is already
used to mean "procedure", and the struct is task_struct after all.

How does this look?

cheers

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 47e195d66a9a..942796fa4767 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -163,6 +163,7 @@ static int  cpu_cmd(void);
 static void csum(void);
 static void bootcmds(void);
 static void proccall(void);
+static void show_tasks(void);
 void dump_segments(void);
 static void symbol_lookup(void);
 static void xmon_show_stack(unsigned long sp, unsigned long lr,
@@ -238,6 +239,7 @@ Commands:\n\
   mz   zero a block of memory\n\
   mi   show information about memory allocation\n\
  p    call a procedure\n\
+  P    list processes/tasks\n\
  r    print registers\n\
  s    single step\n"
 #ifdef CONFIG_SPU_BASE
@@ -967,6 +969,9 @@ cmds(struct pt_regs *excp)
case 'p':
proccall();
break;
+   case 'P':
+   show_tasks();
+   break;
 #ifdef CONFIG_PPC_STD_MMU
case 'u':
dump_segments();
@@ -2566,6 +2571,61 @@ memzcan(void)
printf("%.8x\n", a - mskip);
 }
 
+static void show_task(struct task_struct *tsk)
+{
+   char state;
+
+   /*
+* Cloned from kdb_task_state_char(), which is not entirely
+* appropriate for calling from xmon. This could be moved
+* to a common, generic, routine used by both.
+*/
+   state = (tsk->state == 0) ? 'R' :
+   (tsk->state < 0) ? 'U' :
+   (tsk->state & TASK_UNINTERRUPTIBLE) ? 'D' :
+   (tsk->state & TASK_STOPPED) ? 'T' :
+   (tsk->state & TASK_TRACED) ? 'C' :
+   (tsk->exit_state & EXIT_ZOMBIE) ? 'Z' :
+   (tsk->exit_state & EXIT_DEAD) ? 'E' :
+   (tsk->state & TASK_INTERRUPTIBLE) ? 'S' : '?';
+
+   printf("%p %016lx %6d %6d %c %2d %s\n", tsk,
+   tsk->thread.ksp,
+   tsk->pid, tsk->parent->pid,
+   state, task_thread_info(tsk)->cpu,
+   tsk->comm);
+}
+
+static void show_tasks(void)
+{
+   unsigned long tskv;
+   struct task_struct *tsk = NULL;
+
+   printf(" task_struct ->thread.kspPID   PPID S  P CMD\n");
+
+   if (scanhex(&tskv))
+   tsk = (struct task_struct *)tskv;
+
+   if (setjmp(bus_error_jmp) != 0) {
+   catch_memory_errors = 0;
+   printf("*** Error dumping task %p\n", tsk);
+   return;
+   }
+
+   catch_memory_errors = 1;
+   sync();
+
+   if (tsk)
+   show_task(tsk);
+   else
+   for_each_process(tsk)
+   show_task(tsk);
+
+   sync();
+   __delay(200);
+   catch_memory_errors = 0;
+}
+
 static void proccall(void)
 {
unsigned long args[8];

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v7 04/10] ppc64 ftrace_with_regs configuration variables

2016-02-09 Thread Torsten Duwe
On Mon, Feb 08, 2016 at 10:49:28AM -0500, Steven Rostedt wrote:
> On Mon, 8 Feb 2016 16:23:06 +0100
> Petr Mladek  wrote:
> 
> > >From 2b0fcb678d7720d03f9c9f233b61ed9ed4d420b3 Mon Sep 17 00:00:00 2001  
> > From: Petr Mladek 
> > Date: Mon, 8 Feb 2016 16:03:03 +0100
> > Subject: [PATCH] ftrace: Allow to explicitly disable the build of the 
> > dynamic
> >  ftrace with regs
> > 
> > This patch allows explicitly disabling
> > CONFIG_DYNAMIC_FTRACE_WITH_REGS. We will need to do so on
> > PPC with a broken gcc. This situation will be detected at
> > build time and cannot be handled by Kbuild automatically.
> 
> Wait. Can it be detected at build time? That is, does it cause a build

Yes, I wrote a test to detect it at build time. It is similar to "asm goto"
and part of the v7 patch set.

> error? If so, then you can have Kbuild automatically detect this and
> set the proper value. We do this with 'asm goto'. There's tricks in the
> build system that can change the configs based on if a compiler is
> broken or not.

Please clarify. All I could find is Makefile magic that does it. AFAICS
This runs _after_ Kconfig.

But what I'd like to see is to offer the user the full choice, where possible,
e.g.

Kernel Tracing ...
0) none
1) static FTRACE
2) DYNAMIC_FTRACE
3) DYNAMIC_FTRACE_WITH_REGS

Can such a test be used to simply reduce these options?
With Petr's patch, it comes quite close to the above, and if you select "3"
and your compiler is broken, compilation will fail. For "2", it will just do
the right thing ( fall back to plain "-pg" ).

Without Petr's patch you have *no* choice between "2" and "3".
(That's what I'd call a bug :)

So, the question is, can such a test be used to provide _input_ to
"make config" ? I can see the "env=" mechanism, but it seems not to be used
very heavily. That would then be a prerequisite to all "make *config".
Even if it can provide this input, you can still not choose between 2 and 3
where both are available.

Torsten

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c

2016-02-09 Thread Christophe Leroy
Now that we have an 8xx specific .c file for this, put it in there,
as the other powerpc variants do.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/mm/8xx_mmu.c | 17 +
 arch/powerpc/mm/init_32.c | 19 ---
 2 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 2d42745..a84f5eb 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -81,3 +81,20 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
 
return mapped;
 }
+
+void setup_initial_memory_limit(phys_addr_t first_memblock_base,
+   phys_addr_t first_memblock_size)
+{
+   /* We don't currently support the first MEMBLOCK not mapping 0
+* physical on those processors
+*/
+   BUG_ON(first_memblock_base != 0);
+
+#ifdef CONFIG_PIN_TLB
+   /* 8xx can only access 24MB at the moment */
+   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180));
+#else
+   /* 8xx can only access 8MB at the moment */
+   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080));
+#endif
+}
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index a10be66..1a18e4b 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -193,22 +193,3 @@ void __init MMU_init(void)
/* Shortly after that, the entire linear mapping will be available */
memblock_set_current_limit(lowmem_end_addr);
 }
-
-#ifdef CONFIG_8xx /* No 8xx specific .c file to put that in ... */
-void setup_initial_memory_limit(phys_addr_t first_memblock_base,
-   phys_addr_t first_memblock_size)
-{
-   /* We don't currently support the first MEMBLOCK not mapping 0
-* physical on those processors
-*/
-   BUG_ON(first_memblock_base != 0);
-
-#ifdef CONFIG_PIN_TLB
-   /* 8xx can only access 24MB at the moment */
-   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180));
-#else
-   /* 8xx can only access 8MB at the moment */
-   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080));
-#endif
-}
-#endif /* CONFIG_8xx */
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro

2016-02-09 Thread Christophe Leroy
MPC8xx has an erratum on the use of mtspr() for some registers.
This patch includes the erratum handling directly in the mtspr() macro,
so that mtspr() users don't need to worry about it.
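
As an illustration (not part of the patch), callers can then simply write:

	mtspr(SPRN_MD_CTR, val);	/* expands to the stw/lwz + mtspr CPU6
					 * workaround when CONFIG_8xx_CPU6 is set,
					 * and to a plain mtspr otherwise */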

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/reg.h |  2 +
 arch/powerpc/include/asm/reg_8xx.h | 82 ++
 2 files changed, 84 insertions(+)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index c4cb2ff..7b5d97f 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1211,9 +1211,11 @@ static inline void mtmsr_isync(unsigned long val)
 #define mfspr(rn)  ({unsigned long rval; \
asm volatile("mfspr %0," __stringify(rn) \
: "=r" (rval)); rval;})
+#ifndef mtspr
 #define mtspr(rn, v)   asm volatile("mtspr " __stringify(rn) ",%0" : \
 : "r" ((unsigned long)(v)) \
 : "memory")
+#endif
 
 extern void msr_check_and_set(unsigned long bits);
 extern bool strict_msr_control;
diff --git a/arch/powerpc/include/asm/reg_8xx.h 
b/arch/powerpc/include/asm/reg_8xx.h
index 0f71c81..d41412c 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -50,4 +50,86 @@
 #define DC_DFWT0x4000  /* Data cache is forced write 
through */
 #define DC_LES 0x2000  /* Caches are little endian mode */
 
+#ifdef CONFIG_8xx_CPU6
+#define do_mtspr_cpu6(rn, rn_addr, v)  \
+   do {\
+   int _reg_cpu6 = rn_addr, _tmp_cpu6[1];  \
+   asm volatile("stw %0, %1;"  \
+"lwz %0, %1;"  \
+"mtspr " __stringify(rn) ",%2" :   \
+: "r" (_reg_cpu6), "m"(_tmp_cpu6), \
+  "r" ((unsigned long)(v)) \
+: "memory");   \
+   } while (0)
+
+#define do_mtspr(rn, v)asm volatile("mtspr " __stringify(rn) ",%0" :   
\
+: "r" ((unsigned long)(v)) \
+: "memory")
+#define mtspr(rn, v) \
+   do {\
+   if (rn == SPRN_IMMR)\
+   do_mtspr_cpu6(rn, 0x3d30, v);   \
+   else if (rn == SPRN_IC_CST) \
+   do_mtspr_cpu6(rn, 0x2110, v);   \
+   else if (rn == SPRN_IC_ADR) \
+   do_mtspr_cpu6(rn, 0x2310, v);   \
+   else if (rn == SPRN_IC_DAT) \
+   do_mtspr_cpu6(rn, 0x2510, v);   \
+   else if (rn == SPRN_DC_CST) \
+   do_mtspr_cpu6(rn, 0x3110, v);   \
+   else if (rn == SPRN_DC_ADR) \
+   do_mtspr_cpu6(rn, 0x3310, v);   \
+   else if (rn == SPRN_DC_DAT) \
+   do_mtspr_cpu6(rn, 0x3510, v);   \
+   else if (rn == SPRN_MI_CTR) \
+   do_mtspr_cpu6(rn, 0x2180, v);   \
+   else if (rn == SPRN_MI_AP)  \
+   do_mtspr_cpu6(rn, 0x2580, v);   \
+   else if (rn == SPRN_MI_EPN) \
+   do_mtspr_cpu6(rn, 0x2780, v);   \
+   else if (rn == SPRN_MI_TWC) \
+   do_mtspr_cpu6(rn, 0x2b80, v);   \
+   else if (rn == SPRN_MI_RPN) \
+   do_mtspr_cpu6(rn, 0x2d80, v);   \
+   else if (rn == SPRN_MI_CAM) \
+   do_mtspr_cpu6(rn, 0x2190, v);   \
+   else if (rn == SPRN_MI_RAM0)\
+   do_mtspr_cpu6(rn, 0x2390, v);   \
+   else if (rn == SPRN_MI_RAM1)\
+   do_mtspr_cpu6(rn, 0x2590, v);   \
+   else if (rn == SPRN_MD_CTR) \
+   do_mtspr_cpu6(rn, 0x3180, v);   \
+   else if (rn == SPRN_M_CASID)  

[PATCH v7 22/23] powerpc32: small optimisation in flush_icache_range()

2016-02-09 Thread Christophe Leroy
Inlining of the _dcache_range() functions has shown that the compiler
does the same thing a bit better, with one instruction less.
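
Concretely, rlwinm r3,r3,0,0,31 - L1_CACHE_SHIFT masks off the low
L1_CACHE_SHIFT bits of r3 in a single instruction, replacing the li/andc
pair; and since r5 no longer holds the mask, the add r4,r4,r5 becomes
addi r4,r4,L1_CACHE_BYTES - 1.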

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/misc_32.S | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 09e1e5d..3ec5a22 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -348,10 +348,9 @@ BEGIN_FTR_SECTION
PURGE_PREFETCHED_INS
blr /* for 601, do nothing */
 END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-   li      r5,L1_CACHE_BYTES-1
-   andc    r3,r3,r5
+   rlwinm  r3,r3,0,0,31 - L1_CACHE_SHIFT
    subf    r4,r3,r4
-   add     r4,r4,r5
+   addi    r4,r4,L1_CACHE_BYTES - 1
srwi.   r4,r4,L1_CACHE_SHIFT
beqlr
mtctr   r4
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Fix kgdb on little endian ppc64le

2016-02-09 Thread Michael Ellerman
On Mon, 2016-01-02 at 06:03:25 UTC, Balbir Singh wrote:
> From: Balbir Singh 
>
> I spent some time trying to use kgdb and debugged my inability to
> resume from kgdb_handle_breakpoint(). NIP is not incremented
> and that leads to a loop in the debugger.
>
> I've tested this lightly on a virtual instance with KDB enabled.
> After the patch, I am able to get the "go" command to work as
> expected

The test suite isn't working for me (I think?), so I think maybe we need
something more?


  KGDB: Registered I/O driver kgdbts
  kgdbts:RUN plant and detach test

  Entering kdb (current=0xc001fefc, pid 1) on processor 12 due to 
Keyboard Entry
  [12]kdb> kgdbts:RUN sw breakpoint test
  kgdbts: BP mismatch c018e2cc expected c061c510
  KGDB: re-enter exception: ALL breakpoints killed
  CPU: 12 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc3-6-g893791ee8b01 #5
  Call Trace:
  [c001fb082e10] [c09c4608] dump_stack+0xb0/0xf0 (unreliable)
  [c001fb082e50] [c018fa8c] kgdb_handle_exception+0x2ac/0x2c0
  [c001fb082f20] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
  [c001fb082f50] [c09bcee4] program_check_exception+0x144/0x370
  [c001fb082fc0] [c0006244] program_check_common+0x144/0x180
  --- interrupt: 700 at check_and_rewind_pc+0x100/0x130
  LR = check_and_rewind_pc+0xfc/0x130
  [c001fb083340] [c061bfc0] validate_simple_test+0x60/0x170
  [c001fb083370] [c061c784] run_simple_test+0x194/0x3c0
  [c001fb0833f0] [c061c17c] kgdbts_put_char+0x4c/0x70
  [c001fb083420] [c01902e0] put_packet+0x130/0x210
  [c001fb083470] [c0191338] gdb_serial_stub+0x478/0x1110
  [c001fb083560] [c018f19c] kgdb_cpu_enter+0x3fc/0x800
  [c001fb083660] [c018f970] kgdb_handle_exception+0x190/0x2c0
  [c001fb083730] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
  [c001fb083760] [c09bcee4] program_check_exception+0x144/0x370
  [c001fb0837d0] [c0006244] program_check_common+0x144/0x180
  --- interrupt: 700 at kgdb_breakpoint+0x3c/0x70
  LR = run_breakpoint_test+0xa4/0x120
  [c001fb083ac0] []   (null) (unreliable)
  [c001fb083ae0] [c061d714] run_breakpoint_test+0xa4/0x120
  [c001fb083b50] [c061dcb4] configure_kgdbts+0x2c4/0x6e0
  [c001fb083c30] [c000b3d0] do_one_initcall+0xd0/0x250
  [c001fb083d00] [c0cc42f8] kernel_init_freeable+0x270/0x350
  [c001fb083dc0] [c000bd3c] kernel_init+0x2c/0x150
  [c001fb083e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
  Kernel panic - not syncing: Recursive entry to debugger
  CPU: 12 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc3-6-g893791ee8b01 #5
  Call Trace:
  [c001fb082d70] [c09c4608] dump_stack+0xb0/0xf0 (unreliable)
  [c001fb082db0] [c09c2edc] panic+0x138/0x300
  [c001fb082e50] [c018fa9c] kgdb_handle_exception+0x2bc/0x2c0
  [c001fb082f20] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
  [c001fb082f50] [c09bcee4] program_check_exception+0x144/0x370
  [c001fb082fc0] [c0006244] program_check_common+0x144/0x180
  --- interrupt: 700 at check_and_rewind_pc+0x100/0x130
  LR = check_and_rewind_pc+0xfc/0x130
  [c001fb083340] [c061bfc0] validate_simple_test+0x60/0x170
  [c001fb083370] [c061c784] run_simple_test+0x194/0x3c0
  [c001fb0833f0] [c061c17c] kgdbts_put_char+0x4c/0x70
  [c001fb083420] [c01902e0] put_packet+0x130/0x210
  [c001fb083470] [c0191338] gdb_serial_stub+0x478/0x1110
  [c001fb083560] [c018f19c] kgdb_cpu_enter+0x3fc/0x800
  [c001fb083660] [c018f970] kgdb_handle_exception+0x190/0x2c0
  [c001fb083730] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
  [c001fb083760] [c09bcee4] program_check_exception+0x144/0x370
  [c001fb0837d0] [c0006244] program_check_common+0x144/0x180
  --- interrupt: 700 at kgdb_breakpoint+0x3c/0x70
  LR = run_breakpoint_test+0xa4/0x120
  [c001fb083ac0] []   (null) (unreliable)
  [c001fb083ae0] [c061d714] run_breakpoint_test+0xa4/0x120
  [c001fb083b50] [c061dcb4] configure_kgdbts+0x2c4/0x6e0
  [c001fb083c30] [c000b3d0] do_one_initcall+0xd0/0x250
  [c001fb083d00] [c0cc42f8] kernel_init_freeable+0x270/0x350
  [c001fb083dc0] [c000bd3c] kernel_init+0x2c/0x150
  [c001fb083e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
  Rebooting in 10 seconds..


cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [1/2] powerpc/powernv: new function to access OPAL msglog

2016-02-09 Thread Michael Ellerman
On Tue, 2016-02-09 at 16:29 +1100, Andrew Donnellan wrote:
> On 08/02/16 22:31, Michael Ellerman wrote:
> > Pulling the memcons out of the bin_attr here is not all that nice. This 
> > routine
> > should really stand on its own without reference to the bin_attr. In theory 
> > I
> > might want to disable building sysfs but still have this routine available.
>
> Yeah it's a bit ugly, though does disabling sysfs actually break it?

Probably not, it looks like bin_attribute is still defined even when sysfs is
disabled. And the build would break in other places too.

> I can separate it out anyway - there's no reason for the memcons to be
> tied to the sysfs entry.

Yeah that was more my point.

> > It's also a bit fishy if it's called before the bin_attr is initialised or 
> > when
> > the memcons initialisation fails. In both cases it should be OK, because the
> > structs in question are static and so the private pointer will be NULL, but
> > that's a bit fragile.
> >
> > I think the solution is simply to create a:
> >
> >static struct memcons *opal_memcons;
> >
> > And use that in opal_msglog_copy() and so on.
>
> Will respin.

Thanks.

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 12/23] powerpc32: remove ioremap_base

2016-02-09 Thread Christophe Leroy
ioremap_base is not initialised and is not used anywhere, so remove it.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: fix comment as well
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/nohash/32/pgtable.h |  2 +-
 arch/powerpc/mm/mmu_decl.h   |  1 -
 arch/powerpc/mm/pgtable_32.c |  3 +--
 arch/powerpc/platforms/embedded6xx/mpc10x.h  | 10 --
 4 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index e201600..7808475 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -86,7 +86,7 @@ extern int icache_44x_need_flush;
  * We no longer map larger than phys RAM with the BATs so we don't have
  * to worry about the VMALLOC_OFFSET causing problems.  We do have to worry
  * about clashes between our early calls to ioremap() that start growing down
- * from ioremap_base being run into the VM area allocations (growing upwards
+ * from IOREMAP_TOP being run into the VM area allocations (growing upwards
  * from VMALLOC_START).  For this reason we have ioremap_bot to check when
  * we actually run into our mappings setup in the early boot with the VM
  * system.  This really does become a problem for machines with good amounts
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 3872332..53564a3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -100,7 +100,6 @@ extern void setbat(int index, unsigned long virt, 
phys_addr_t phys,
 
 extern int __map_without_bats;
 extern int __allow_ioremap_reserved;
-extern unsigned long ioremap_base;
 extern unsigned int rtas_data, rtas_size;
 
 struct hash_pte;
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index db0d35e..815ccd7 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -37,7 +37,6 @@
 
 #include "mmu_decl.h"
 
-unsigned long ioremap_base;
 unsigned long ioremap_bot;
 EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */
 
@@ -173,7 +172,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, 
unsigned long flags,
/*
 * Choose an address to map it to.
 * Once the vmalloc system is running, we use it.
-* Before then, we use space going down from ioremap_base
+* Before then, we use space going down from IOREMAP_TOP
 * (ioremap_bot records where we're up to).
 */
p = addr & PAGE_MASK;
diff --git a/arch/powerpc/platforms/embedded6xx/mpc10x.h 
b/arch/powerpc/platforms/embedded6xx/mpc10x.h
index b290b63..5ad1202 100644
--- a/arch/powerpc/platforms/embedded6xx/mpc10x.h
+++ b/arch/powerpc/platforms/embedded6xx/mpc10x.h
@@ -24,13 +24,11 @@
  *   Processor: 0x8000 - 0x807f -> PCI I/O: 0x - 0x007f
  *   Processor: 0xc000 - 0xdfff -> PCI MEM: 0x - 0x1fff
  *   PCI MEM:   0x8000 -> Processor System Memory: 0x
- *   EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB)
  *
  * MAP B (CHRP Map)
  *   Processor: 0xfe00 - 0xfebf -> PCI I/O: 0x - 0x00bf
  *   Processor: 0x8000 - 0xbfff -> PCI MEM: 0x8000 - 0xbfff
  *   PCI MEM:   0x -> Processor System Memory: 0x
- *   EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB)
  */
 
 /*
@@ -138,14 +136,6 @@
 #define MPC10X_EUMB_WP_OFFSET  0x000ff000 /* Data path diagnostic, 
watchpoint reg offset */
 #define MPC10X_EUMB_WP_SIZE0x1000 /* Data path diagnostic, 
watchpoint reg size */
 
-/*
- * Define some recommended places to put the EUMB regs.
- * For both maps, recommend putting the EUMB from 0xeff0 to 0xefff.
- */
-extern unsigned long   ioremap_base;
-#defineMPC10X_MAPA_EUMB_BASE   (ioremap_base - 
MPC10X_EUMB_SIZE)
-#defineMPC10X_MAPB_EUMB_BASE   MPC10X_MAPA_EUMB_BASE
-
 enum ppc_sys_devices {
MPC10X_IIC1,
MPC10X_DMA0,
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 11/23] powerpc32: Remove useless/wrong MMU:setio progress message

2016-02-09 Thread Christophe Leroy
Commit 771168494719 ("[POWERPC] Remove unused machine call outs")
removed the call to setup_io_mappings(), so remove the associated
progress message.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/mm/init_32.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 1a18e4b..4eb1b8f 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -178,10 +178,6 @@ void __init MMU_init(void)
/* Initialize early top-down ioremap allocator */
ioremap_bot = IOREMAP_TOP;
 
-   /* Map in I/O resources */
-   if (ppc_md.progress)
-   ppc_md.progress("MMU:setio", 0x302);
-
if (ppc_md.progress)
ppc_md.progress("MMU:exit", 0x211);
 
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 18/23] powerpc: add inline functions for cache related instructions

2016-02-09 Thread Christophe Leroy
This patch adds inline functions so that dcbz, dcbi, dcbf and dcbst
can be used from C code.

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/cache.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 5f8229e..ffbafbf 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -69,6 +69,25 @@ extern void _set_L3CR(unsigned long);
 #define _set_L3CR(val) do { } while(0)
 #endif
 
+static inline void dcbz(void *addr)
+{
+   __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbi(void *addr)
+{
+   __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbf(void *addr)
+{
+   __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbst(void *addr)
+{
+   __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+}
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CACHE_H */
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v7 20/23] powerpc32: move xxxxx_dcache_range() functions inline

2016-02-09 Thread Christophe Leroy
The flush/clean/invalidate _dcache_range() functions are all very
similar and quite short. They are mainly used in __dma_sync();
perf_event shows them among the top 3 consuming functions during
heavy ethernet activity.

They are good candidates for inlining, as __dma_sync() does
almost nothing but call them.
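
For reference, __dma_sync() boils down to roughly this dispatch (a
simplified sketch, not the exact code):

	switch (direction) {
	case DMA_TO_DEVICE:		/* CPU wrote, device will read */
		clean_dcache_range(start, end);
		break;
	case DMA_FROM_DEVICE:		/* device will write, drop stale lines */
		invalidate_dcache_range(start, end);
		break;
	case DMA_BIDIRECTIONAL:
		flush_dcache_range(start, end);
		break;
	}

so with the three helpers inline the whole sync path runs without an
extra function call per buffer.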

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/cacheflush.h | 52 ++--
 arch/powerpc/kernel/misc_32.S | 65 ---
 arch/powerpc/kernel/ppc_ksyms.c   |  2 ++
 3 files changed, 51 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index 6229e6b..97c9978 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -47,12 +47,58 @@ static inline void __flush_dcache_icache_phys(unsigned long 
physaddr)
 }
 #endif
 
-extern void flush_dcache_range(unsigned long start, unsigned long stop);
 #ifdef CONFIG_PPC32
-extern void clean_dcache_range(unsigned long start, unsigned long stop);
-extern void invalidate_dcache_range(unsigned long start, unsigned long stop);
+/*
+ * Write any modified data cache blocks out to memory and invalidate them.
+ * Does not invalidate the corresponding instruction cache blocks.
+ */
+static inline void flush_dcache_range(unsigned long start, unsigned long stop)
+{
+   void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+   unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+   unsigned long i;
+
+   for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+   dcbf(addr);
+   mb();   /* sync */
+}
+
+/*
+ * Write any modified data cache blocks out to memory.
+ * Does not invalidate the corresponding cache lines (especially for
+ * any corresponding instruction cache).
+ */
+static inline void clean_dcache_range(unsigned long start, unsigned long stop)
+{
+   void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+   unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+   unsigned long i;
+
+   for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+   dcbst(addr);
+   mb();   /* sync */
+}
+
+/*
+ * Like above, but invalidate the D-cache.  This is used by the 8xx
+ * to invalidate the cache so the PPC core doesn't get stale data
+ * from the CPM (no cache snooping here :-).
+ */
+static inline void invalidate_dcache_range(unsigned long start,
+  unsigned long stop)
+{
+   void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+   unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+   unsigned long i;
+
+   for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+   dcbi(addr);
+   mb();   /* sync */
+}
+
 #endif /* CONFIG_PPC32 */
 #ifdef CONFIG_PPC64
+extern void flush_dcache_range(unsigned long start, unsigned long stop);
 extern void flush_inval_dcache_range(unsigned long start, unsigned long stop);
 extern void flush_dcache_phys_range(unsigned long start, unsigned long stop);
 #endif
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 181afc1..09e1e5d 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -375,71 +375,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
isync
blr
 /*
- * Write any modified data cache blocks out to memory.
- * Does not invalidate the corresponding cache lines (especially for
- * any corresponding instruction cache).
- *
- * clean_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(clean_dcache_range)
-   li  r5,L1_CACHE_BYTES-1
-   andcr3,r3,r5
-   subfr4,r3,r4
-   add r4,r4,r5
-   srwi.   r4,r4,L1_CACHE_SHIFT
-   beqlr
-   mtctr   r4
-
-1: dcbst   0,r3
-   addir3,r3,L1_CACHE_BYTES
-   bdnz1b
-   sync/* wait for dcbst's to get to ram */
-   blr
-
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
- * Does not invalidate the corresponding instruction cache blocks.
- *
- * flush_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_dcache_range)
-   li  r5,L1_CACHE_BYTES-1
-   andcr3,r3,r5
-   subfr4,r3,r4
-   add r4,r4,r5
-   srwi.   r4,r4,L1_CACHE_SHIFT
-   beqlr
-   mtctr   r4
-
-1: dcbf0,r3
-   addir3,r3,L1_CACHE_BYTES
-   bdnz1b
-   sync/* wait for dcbst's to get to ram */
-   blr
-
-/*
- * Like above, but invalidate the D-cache.  This is used by the 8xx
- * to invalidate the cache so the PPC core doesn't get stale data
- * from the CPM (no cache snooping here :-).
- *
- * 

[PATCH v7 23/23] powerpc32: Remove one insn in mulhdu

2016-02-09 Thread Christophe Leroy
Remove one instruction in mulhdu

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/misc_32.S | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 3ec5a22..bf5160f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -91,17 +91,16 @@ _GLOBAL(mulhdu)
    addc    r7,r0,r7
addze   r4,r4
 1: beqlr   cr1 /* all done if high part of A is 0 */
-   mr  r10,r3
mullw   r9,r3,r5
-   mulhwu  r3,r3,r5
+   mulhwu  r10,r3,r5
beq 2f
-   mullw   r0,r10,r6
-   mulhwu  r8,r10,r6
+   mullw   r0,r3,r6
+   mulhwu  r8,r3,r6
    addc    r7,r0,r7
    adde    r4,r4,r8
-   addze   r3,r3
+   addze   r10,r10
 2: addc    r4,r4,r9
-   addze   r3,r3
+   addze   r3,r10
blr
 
 /*
-- 
2.1.0
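
For readers following the register shuffling: mulhdu returns the high
64 bits of an unsigned 64x64 multiply, built from 32-bit partial
products. A rough C model of what the routine computes (illustrative
only, not the kernel code; u64/u32 as in <linux/types.h>):

	static inline u64 mulhdu_model(u64 a, u64 b)
	{
		u64 a_lo = (u32)a, a_hi = a >> 32;
		u64 b_lo = (u32)b, b_hi = b >> 32;
		u64 p0 = a_lo * b_lo;
		u64 p1 = a_lo * b_hi;
		u64 p2 = a_hi * b_lo;
		u64 p3 = a_hi * b_hi;
		/* carry out of the low 64 bits of the 128-bit product */
		u64 carry = ((p0 >> 32) + (u32)p1 + (u32)p2) >> 32;

		return p3 + (p1 >> 32) + (p2 >> 32) + carry;
	}

The patch avoids the mr r10,r3 by accumulating the high word in r10 and
only producing the final result in r3 with the last addze.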

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH kernel] powerpc/ioda: Set "read" permission when "write" is set

2016-02-09 Thread Douglas Miller
We finally got the chance to test it at the end of last week. I forgot to
update everyone Monday. By all appearances, the patch fixes the problem.
We did not see any new issues with the patch (vs. the same test scenarios
without it).


I'll also update the bugzilla.

Thanks,
Doug

On 02/08/2016 07:37 PM, Alexey Kardashevskiy wrote:

On 01/20/2016 06:01 AM, Douglas Miller wrote:



On 01/18/2016 09:52 PM, Alexey Kardashevskiy wrote:

On 01/13/2016 01:24 PM, Douglas Miller wrote:



On 01/12/2016 05:07 PM, Benjamin Herrenschmidt wrote:

On Tue, 2016-01-12 at 15:40 +1100, Alexey Kardashevskiy wrote:

Quite often drivers set only "write" permission assuming that this
includes "read" permission as well and this works on plenty
platforms.
However IODA2 is strict about this and produces an EEH when "read"
permission is not and reading happens.

This adds a workaround in IODA code to always add the "read" bit 
when

the "write" bit is set.

Cc: Benjamin Herrenschmidt 
Signed-off-by: Alexey Kardashevskiy 
---


Ben, what was the driver which did not set "read" and caused EEH?

aacraid

Cheers,
Ben.
Just to be precise, the driver wasn't responsible for setting READ. The
driver called scsi_dma_map() and the scsicmd was set (by the scsi layer) as
DMA_FROM_DEVICE so the current code would set the permissions to
WRITE-ONLY. Previously, and in other architectures, this scsicmd would have
resulted in READ+WRITE permissions on the DMA map.



Does the patch fix the issue? Thanks.






---
  arch/powerpc/platforms/powernv/pci.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci.c
b/arch/powerpc/platforms/powernv/pci.c
index f2dd772..c7dcae5 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tbl, long
index, long npages,
  u64 rpn = __pa(uaddr) >> tbl->it_page_shift;
  long i;
+if (proto_tce & TCE_PCI_WRITE)
+proto_tce |= TCE_PCI_READ;
+
  for (i = 0; i < npages; i++) {
  unsigned long newtce = proto_tce |
  ((rpn + i) << tbl->it_page_shift);
@@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long
index,
  BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl));
+if (newtce & TCE_PCI_WRITE)
+newtce |= TCE_PCI_READ;
+
  oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
  *hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ |
TCE_PCI_WRITE);
  *direction = iommu_tce_direction(oldtce);

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev




I am still working on getting a machine to try this on. From code
inspection, it looks like it should work. The problem is shortage of
machines and machines tied-up by Test.


Any progress here? Thanks.






___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: PowerPC agpmode issues

2016-02-09 Thread luigi burdo
Mike and Gerhard, I don't think the situation of PCIe PowerPC is better. But compared
with past years, with the new kernels and the latest Xorg on a Radeon HD 4650 I have a
performance increase of about 250x... for example QuakeSpasm was giving 157 fps at
640x480 on the Radeon 4650, and now it is 380 fps. Yes, compared to the old Nvidia
7800GTX on OS X (450 fps) these results are lower, but for sure better than before.
The worst part lately is that I am facing r600 radeon ring test errors, and I can't
use the 5450 and 6570 with GPU acceleration now, only in fbdev, although they were
working perfectly before.
Luigi

> From: gerhard_pirc...@gmx.net
> To: michael.hel...@gmail.com
> Subject: Re: PowerPC agpmode issues
> Date: Tue, 9 Feb 2016 12:52:15 +0100
> CC: aneesh.ku...@linux.vnet.ibm.com; mic...@daenzer.net; 
> linuxppc-dev@lists.ozlabs.org; reinhard.bo...@googlemail.com; 
> bobby.pr...@gmail.com
> 
> > On 9 Feb 2016 03:27, "Mike"  wrote:
> > Ok, so its quirks to be added then? Something not implemented in KMS
> > that was in UMS?
> > Reports are that the same issue exsist on PPC Amiga Ones with a VIA
> > chipset, and the Pegasos 2 with the Artica s chipset, i posted a
> > mail from detailiing that.
> Just to avoid some confusion:
> Old long story short: the issues for AmigaOnes and the Pegasos _1_ with
> ArticiaS northbridge and VIA southbridge are that:
> 1. the AGP controller corrupts data transfers in AGP mode (also depending
> on the AGP HW request queue size). So there is no official AGP driver that
> would require radeon.agpmode=-1. The microA1 is supposed to have a fix
> for this HW data corruption, but I yet have to dig out my ArticiaS AGP
> driver code for some test runs...
> 2. At least the AmigaOne with ArticiaS chip need non-coherent DMA
> allocations and/or proper cache flushes to avoid corrupted DMA transfers.
> 
> Nonetheless I had DRI1 working _only_ on my A1SE under Debian Squeeze (i.e.
> glxgears could run on the desktop with hardware acceleration), but DRI2
> with its very dynamic GART mapping is a no-go on every first-gen AmigaOne
> machine, even if the GART driver test (radeon.test=1) runs through in
> PCIGART mode (could it be that it uses a more or less static GART mapping
> for the test?).
> 
> > Sure that might be it, but i get different results trying agpmode=1-2-4,
> > 2 gave a noisy screen before the hard crash. i find it rather impossible
> > to debug at all as the crash happens so fast no logs seem to be written..
> > I think i would need serial...
> > I'd personally love nothing more then to see support restored and a
> > default as expected working condition ought be the minimum requirement.
> > I use a powerbook a1106, 5,6. With a 5,8 on the way. Those are the last
> > two revision powerbooks in the 15" series. In swrast they become useless,
> > impossible to use for any productivity. Most people trying to use linux
> > on ppc for personal use come in macs, with the exception of the Amiga PPC
> > crowd now running their amcc 440/460ex or e600 based x500/5000, all of
> > which have of course pci-e more cores and more threads. Yet struggle even
> > with regressions left and right to keep up with the single core performance
> > of the G4's. Sure it's pushing 10 years , but it's the only alternative
> > if one wishes to remain mobile.
> swrast definitely isn't fun on 10 years old PPC machines. Current Firefox
> is already slow enough on these machines... :-)
> 
> > On 9 Feb 2016 02:41, "Michel Dänzer"  wrote:
> > > On 08.02.2016 22:28, Mike wrote:
> > > Certainly 750~800 fps in glxgears vs 3000+ in debian squeeze, i cant
> > > bring myself to say that it's an acceptable situation no matter how
> > > tired i am of the problem knowing how well the setup could do. It's
> > > clear that the implementation is broken for everything but x86, [...]
> > 
> > Why is that? It was working fine on my last-gen PowerBook. AFAIK Darwin
> > / OS X never used anything but a static AGP GART mapping though, so it
> > seems very likely that the issues with older UniNorth revisions are
> > simply due to the hardware being unable to support the usage patterns of
> > modern GPU drivers.
> > 
> > That said, if you guys have specific suggestions for a "proper"
> > solution, nobody's standing in your way.
> I have to admit that I lack the knowledge of the inner workings of the
> TTM/radeon code (and its TTM AGP backend) to do any useful work here.
> I was hoping that the TTM DMA allocator could be of some help at least for
> non-cache coherent machines given that (IIRC) ARM is using it together
> with the nouveau driver on the TEGRA platform, but I guess that would need
> some modifications also on the powerpc architecture side (maybe a new
> non-coherent DMA allocator that is not limited to 2M virtual address space
> for mappings). Thus I guess a lot of things could be improved/fixed, but
> nowadays Linux code doesn't seem to be something for the "occasional hobby
> hacker". :-)
> 
> regards,
> Gerhard
> 

[powerpc:topic/math-emu 6/9] arch/sh/math-emu/math.c:129:1: warning: ISO C90 forbids mixed declarations and code

2016-02-09 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
topic/math-emu
head:   0d351023a638b7c82abdd8d66ebf5b5b3d6cb169
commit: 3b76bfd2f01f3b9101f040878e2636cdb7f3bbdd [6/9] sh/math-emu: Move sh 
from math-emu-old to math-emu
config: sh-allyesconfig (attached as .config)
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 3b76bfd2f01f3b9101f040878e2636cdb7f3bbdd
# save the attached .config to linux build tree
make.cross ARCH=sh 

All warnings (new ones prefixed by >>):

   In file included from arch/sh/math-emu/math.c:23:0:
   include/math-emu/single.h:76:21: warning: "__BIG_ENDIAN" is not defined 
[-Wundef]
   In file included from arch/sh/math-emu/math.c:24:0:
   include/math-emu/double.h:81:22: warning: "__BIG_ENDIAN" is not defined 
[-Wundef]
   arch/sh/math-emu/math.c:54:0: warning: "WRITE" redefined [enabled by default]
   include/linux/fs.h:199:0: note: this is the location of the previous 
definition
   arch/sh/math-emu/math.c:55:0: warning: "READ" redefined [enabled by default]
   include/linux/fs.h:198:0: note: this is the location of the previous 
definition
   arch/sh/math-emu/math.c: In function 'fadd':
>> arch/sh/math-emu/math.c:129:1: warning: ISO C90 forbids mixed declarations 
>> and code [-Wdeclaration-after-statement]
>> arch/sh/math-emu/math.c:129:1: warning: ISO C90 forbids mixed declarations 
>> and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: In function 'fsub':
   arch/sh/math-emu/math.c:136:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:136:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: In function 'fmac':
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: In function 'ffloat':
   arch/sh/math-emu/math.c:313:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:315:1: warning: ISO C90 forbids mixed declarations 
and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: At top level:
   arch/sh/math-emu/math.c:524:12: warning: 'ieee_fpe_handler' defined but not 
used [-Wunused-function]
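
For reference, -Wdeclaration-after-statement fires whenever a C90-mode
translation unit declares a variable after a statement, e.g. (a minimal
illustration, not from the sh sources):

    static int counter;

    void example(void)
    {
            counter++;              /* a statement ... */
            int later = counter;    /* ... then a declaration: warned in C90 mode */
            (void)later;
    }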

vim +129 arch/sh/math-emu/math.c

17  #include 
18  #include 
19  #include 
20  
21  #include "sfp-util.h"
22  #include 
  > 23  #include 
24  #include 
25  
    26  #define FPUL    (fregs->fpul)
    27  #define FPSCR   (fregs->fpscr)
    28  #define FPSCR_RM    (FPSCR&3)
    29  #define FPSCR_DN    ((FPSCR>>18)&1)
    30  #define FPSCR_PR    ((FPSCR>>19)&1)
    31  #define FPSCR_SZ    ((FPSCR>>20)&1)
    32  #define FPSCR_FR    ((FPSCR>>21)&1)
33  #define FPSCR_MASK  0x003fUL
34  
35  #define BANK(n) (n^(FPSCR_FR?16:0))
36  #define FR  ((unsigned long*)(fregs->fp_regs))
37  #define FR0 (FR[BANK(0)])
38  #define FRn (FR[BANK(n)])
39  #define FRm (FR[BANK(m)])
40  #define DR  ((unsigned long long*)(fregs->fp_regs))
41  #define DRn (DR[BANK(n)/2])
42  #define DRm (DR[BANK(m)/2])
43  
44  #define XREG(n) (n^16)
45  #define XFn (FR[BANK(XREG(n))])
46  #define XFm (FR[BANK(XREG(m))])
47  #define XDn (DR[BANK(XREG(n))/2])
48  #define XDm (DR[BANK(XREG(m))/2])
49  
50  #define R0  (regs->regs[0])
51  #define Rn  (regs->regs[n])
52  #define Rm  (regs->regs[m])
53  
54  #define WRITE(d,a)  ({if(put_user(d, (typeof (d)*)a)) return 
-EFAULT;})
55  #define READ(d,a)   ({if(get_user(d, (typeof (d)*)a)) return 
-EFAULT;})
56  
57  #define PACK_S(r,f) FP_PACK_SP(,f)
58  #define PACK_SEMIRAW_S(r,f) FP_PACK_SEMIRAW_SP(,f)
59  #define PACK_RAW_S(r,f) FP_PACK_RAW_SP(,f)
60  #define UNPACK_S(f,r)   FP_UNPACK_SP(f,)
61  #define UNPACK_SEMIRAW_S(f,r)   FP_UNPACK_SEMIRAW_SP(f,)
62  #define UNPACK_RAW_S(f,r)   FP_UNPACK_RAW_SP(f,)
63  #define PACK_D(r,f) \
64  {u32 t[2]; FP_PACK_DP(t,f); ((u32*))[0]=t[1]; 
((u32*))[1]=t[0];}
65  #define PACK_SEMIRAW_D(r,f) \
66  {u32 t[2]; FP_PACK_SEMIRAW_DP(t,f); ((u32*))[0]=t[1]; \
67  ((u32*))[1]=t[0];}

enable kdump capture kernel functionality

2016-02-09 Thread Cosmin Banu
Hi,

I am trying to rebuild the kernel for one of our powerpc devices to include 
kdump capture kernel functionality, but I'm having trouble getting it to load 
for use on panic.
I'm running Linux 3.14.60 on e500v2 (COMX-P2020 module).

I've tried following the documentation at 
https://www.kernel.org/doc/Documentation/kdump/kdump.txt:
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_NONSTATIC_KERNEL=y
CONFIG_PROC_VMCORE=y
CONFIG_RELOCATABLE=y
CONFIG_RELOCATABLE_PPC32=y

These are the parameters I'm using to boot the kernel:
> cat /proc/cmdline
root=/dev/mmcblk0p2 rw rootdelay=15 
ip=10.215.181.92::10.215.180.1:255.255.254.0:XSTREAM-DEV2:eth0 loglevel=0 
mfgstring=1:XF40-LB-SS-MMDDYY console=ttyS0,115200 ramdisk_size=70 
cache-sram-size=0x1 crashkernel=0x0400@0x0400 slub_debug=FPZ

Running the following command:
> kexec -l /boot/uImage -t uImage-ppc --dtb=/boot/comx.dtb 
> --initrd=/boot/initramfs_data.cpio --command-line="root=/dev/mmcblk0p2 3 
> maxcpus=1 irqpoll noirqdistrib reset_devices rw rootdelay=15 
> ip=10.215.181.92::10.215.180.1:255.255.254.0:XSTREAM-DEV2:eth0 loglevel=0 
> mfgstring=1:XF40-LB-SS-MMDDYY console=ttyS0,115200 ramdisk_size=70 
> cache-sram-size=0x1"
Returns:
Can't add kernel to addr 0x len 0
Cannot load /boot/uImage

I'm not sure what mechanism is used to load the kernel and where this can be 
configured. I assumed that the crashkernel parameter from the cmdline is the 
actual setting.
Where should I set this value? Also, how can I ensure that the two kernel 
images don't overlap?

Thanks in advance,
Cosmin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 13/18] cxl: sysfs support for guests

2016-02-09 Thread Frederic Barrat



On 08/02/2016 04:02, Stewart Smith wrote:

Frederic Barrat  writes:

--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -183,7 +183,7 @@ Description:read only
  Identifies the revision level of the PSL.
  Users:https://github.com/ibm-capi/libcxl

-What:   /sys/class/cxl//base_image
+What:   /sys/class/cxl//base_image (not in a guest)


Is this going to be the case for KVM guest as well as PowerVM guest?



That's too early to say.
The entries we've removed are because the information is filtered by 
pHyp and not available to the OS. Some of it because nobody thought it 
would be useful, some of it because it's not meant to be seen by the OS. 
For KVM, if the card can be shared between guests, I would expect the 
same kind of restrictions.


  Fred

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc/xmon: Add xmon command to dump process/task similar to ps(1)

2016-02-09 Thread Douglas Miller

That looks fine to me.

Thanks!


On 02/09/2016 04:58 AM, Michael Ellerman wrote:

On Mon, 2015-23-11 at 15:01:15 UTC, Douglas Miller wrote:

Add 'P' command with optional task_struct address to dump all/one task's
information: task pointer, kernel stack pointer, PID, PPID, state
(interpreted), CPU where (last) running, and command.

Introduce XMON_PROTECT macro to standardize memory-access-fault
protection (setjmp). Initially used only by the 'P' command.

Hi Doug,

Sorry this has taken a while, it keeps getting preempted by more important
patches.

I'm also not a big fan of the protect macro, it works for this case, but it's
already a bit ugly calling for_each_process() inside the macro, and it would be
even worse for multi line logic.

I think I'd rather just open code it, and hopefully we can come up with a
better solution for catching errors in the long run.

I also renamed the routines to use "task", because "proc" in xmon is already
used to mean "procedure", and the struct is task_struct after all.

How does this look?

cheers

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 47e195d66a9a..942796fa4767 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -163,6 +163,7 @@ static int  cpu_cmd(void);
  static void csum(void);
  static void bootcmds(void);
  static void proccall(void);
+static void show_tasks(void);
  void dump_segments(void);
  static void symbol_lookup(void);
  static void xmon_show_stack(unsigned long sp, unsigned long lr,
@@ -238,6 +239,7 @@ Commands:\n\
mz  zero a block of memory\n\
mi  show information about memory allocation\n\
p   call a procedure\n\
+  P    list processes/tasks\n\
r   print registers\n\
s   single step\n"
  #ifdef CONFIG_SPU_BASE
@@ -967,6 +969,9 @@ cmds(struct pt_regs *excp)
case 'p':
proccall();
break;
+   case 'P':
+   show_tasks();
+   break;
  #ifdef CONFIG_PPC_STD_MMU
case 'u':
dump_segments();
@@ -2566,6 +2571,61 @@ memzcan(void)
printf("%.8x\n", a - mskip);
  }
  
+static void show_task(struct task_struct *tsk)

+{
+   char state;
+
+   /*
+* Cloned from kdb_task_state_char(), which is not entirely
+* appropriate for calling from xmon. This could be moved
+* to a common, generic, routine used by both.
+*/
+   state = (tsk->state == 0) ? 'R' :
+   (tsk->state < 0) ? 'U' :
+   (tsk->state & TASK_UNINTERRUPTIBLE) ? 'D' :
+   (tsk->state & TASK_STOPPED) ? 'T' :
+   (tsk->state & TASK_TRACED) ? 'C' :
+   (tsk->exit_state & EXIT_ZOMBIE) ? 'Z' :
+   (tsk->exit_state & EXIT_DEAD) ? 'E' :
+   (tsk->state & TASK_INTERRUPTIBLE) ? 'S' : '?';
+
+   printf("%p %016lx %6d %6d %c %2d %s\n", tsk,
+   tsk->thread.ksp,
+   tsk->pid, tsk->parent->pid,
+   state, task_thread_info(tsk)->cpu,
+   tsk->comm);
+}
+
+static void show_tasks(void)
+{
+   unsigned long tskv;
+   struct task_struct *tsk = NULL;
+
+   printf(" task_struct ->thread.kspPID   PPID S  P CMD\n");
+
+   if (scanhex(&tskv))
+   tsk = (struct task_struct *)tskv;
+
+   if (setjmp(bus_error_jmp) != 0) {
+   catch_memory_errors = 0;
+   printf("*** Error dumping task %p\n", tsk);
+   return;
+   }
+
+   catch_memory_errors = 1;
+   sync();
+
+   if (tsk)
+   show_task(tsk);
+   else
+   for_each_process(tsk)
+   show_task(tsk);
+
+   sync();
+   __delay(200);
+   catch_memory_errors = 0;
+}
+
  static void proccall(void)
  {
unsigned long args[8];

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V2 00/29] Book3s abstraction in preparation for new MMU model

2016-02-09 Thread Aneesh Kumar K.V


Hi Scott,

I missed adding you on CC. Can you take a look at this and make sure we
are not breaking anything on Freescale?

"Aneesh Kumar K.V"  writes:

> Hello,
>
> This is a large series, mostly consisting of code movement. No new features
> are done in this series. The changes are done to accommodate the upcoming new
> memory
> model in future powerpc chips. The details of the new MMU model can be found 
> at
>
>  http://ibm.biz/power-isa3 (Needs registration). I am including a summary of 
> the changes below.
>
> ISA 3.0 adds support for the radix tree style of MMU with full
> virtualization and related control mechanisms that manage its
> coexistence with the HPT. Radix-using operating systems will
> manage their own translation tables instead of relying on hcalls.
>
> Radix style MMU model requires us to do a 4 level page table
> with 64K and 4K page sizes. The table index size for each page size
> is listed below:
>
> PGD -> 13 bits
> PUD -> 9 (1G hugepage)
> PMD -> 9 (2M huge page)
> PTE -> 5 (for 64k), 9 (for 4k)
>
> We also require the page table to be in big endian format.
>
> The changes proposed in this series enable us to support both
> hash page table and radix tree style MMU using a single kernel
> with limited impact. The idea is to change core page table
> accessors to static inline functions and later hotpatch them
> to switch to hash or radix tree functions. For ex:
>
> static inline int pte_write(pte_t pte)
> {
>if (radix_enabled())
>return rpte_write(pte);
> return hlpte_write(pte);
> }
>
> On boot we will hotpatch the code so as to avoid conditional operation.
>
> The other two major changes proposed in this series are to switch the hash
> linux page table to a 4 level table and to store it in big endian format.
> This is done so that functions like pte_val() and pud_populate() don't need
> hotpatching and thereby helps in limiting the runtime impact of the changes.
>
> I didn't include the radix related changes in this series. You can
> find them at https://github.com/kvaneesh/linux/commits/radix-mmu-v1
>
> Changes from V1:
> * move patches adding helpers to the next series
>


Thanks
-aneesh

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: PowerPC agpmode issues

2016-02-09 Thread Gerhard Pircher
> On 9 Feb 2016 03:27, "Mike"  wrote:
> Ok, so its quirks to be added then? Something not implemented in KMS
> that was in UMS?
> Reports are that the same issue exsist on PPC Amiga Ones with a VIA
> chipset, and the Pegasos 2 with the Artica s chipset, i posted a
> mail from detailiing that.
Just to avoid some confusion:
Old long story short: the issues for AmigaOnes and the Pegasos _1_ with
ArticiaS northbridge and VIA southbridge are that:
1. the AGP controller corrupts data transfers in AGP mode (also depending
on the AGP HW request queue size). So there is no official AGP driver that
would require radeon.agpmode=-1. The microA1 is supposed to have a fix
for this HW data corruption, but I yet have to dig out my ArticiaS AGP
driver code for some test runs...
2. At least the AmigaOne with ArticiaS chip need non-coherent DMA
allocations and/or proper cache flushes to avoid corrupted DMA transfers.

Nonetheless I had DRI1 working _only_ on my A1SE under Debian Squeeze (i.e.
glxgears could run on the desktop with hardware acceleration), but DRI2
with its very dynamic GART mapping is a no-go on every first-gen AmigaOne
machine, even if the GART driver test (radeon.test=1) runs through in
PCIGART mode (could it be that it uses a more or less static GART mapping
for the test?).

> Sure that might be it, but i get different results trying agpmode=1-2-4,
> 2 gave a noisy screen before the hard crash. i find it rather impossible
> to debug at all as the crash happens so fast no logs seem to be written..
> I think i would need serial...
> I'd personally love nothing more than to see support restored and a
> default as expected working condition ought be the minimum requirement.
> I use a powerbook a1106, 5,6. With a 5,8 on the way. Those are the last
> two revision powerbooks in the 15" series. In swrast they become useless,
> impossible to use for any productivity. Most people trying to use linux
> on ppc for personal use come in macs, with the exception of the Amiga PPC
> crowd now running their amcc 440/460ex or e600 based x500/5000, all of
> which have of course pci-e more cores and more threads. Yet struggle even
> with regressions left and right to keep up with the single core performance
> of the G4's. Sure it's pushing 10 years , but it's the only alternative
> if one wishes to remain mobile.
swrast definitely isn't fun on 10 years old PPC machines. Current Firefox
is already slow enough on these machines... :-)

> On 9 Feb 2016 02:41, "Michel Dänzer"  wrote:
> > On 08.02.2016 22:28, Mike wrote:
> > Certainly 750~800 fps in glxgears vs 3000+ in debian squeeze, i cant
> > bring myself to say that it's an acceptable situation no matter how
> > tired i am of the problem knowing how well the setup could do. It's
> > clear that the implementation is broken for everything but x86, [...]
> 
> Why is that? It was working fine on my last-gen PowerBook. AFAIK Darwin
> / OS X never used anything but a static AGP GART mapping though, so it
> seems very likely that the issues with older UniNorth revisions are
> simply due to the hardware being unable to support the usage patterns of
> modern GPU drivers.
> 
> That said, if you guys have specific suggestions for a "proper"
> solution, nobody's standing in your way.
I have to admit that I lack the knowledge of the inner workings of the
TTM/radeon code (and its TTM AGP backend) to do any useful work here.
I was hoping that the TTM DMA allocator could be of some help at least for
non-cache coherent machines given that (IIRC) ARM is using it together
with the nouveau driver on the TEGRA platform, but I guess that would need
some modifications also on the powerpc architecture side (maybe a new
non-coherent DMA allocator that is not limited to 2M virtual address space
for mappings). Thus I guess a lot of things could be improved/fixed, but
nowadays Linux code doesn't seem to be something for the "occasional hobby
hacker". :-)

regards,
Gerhard
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [V3] powerpc/mm: Fix Multi hit ERAT cause by recent THP update

2016-02-09 Thread Michael Ellerman
On Tue, 2016-09-02 at 01:20:31 UTC, "Aneesh Kumar K.V" wrote:
> With ppc64 we use the deposited pgtable_t to store the hash pte slot
> information. We should not withdraw the deposited pgtable_t without
> marking the pmd none. This ensures that low level hash fault handling
> will skip this huge pte and we will handle them at upper levels.
> 
> Recent change to pmd splitting changed the above in order to handle the
> race between pmd split and exit_mmap. The race is explained below.
> 
> Consider following race:
> 
>   CPU0                                CPU1
> shrink_page_list()
>   add_to_swap()
>     split_huge_page_to_list()
>       __split_huge_pmd_locked()
>         pmdp_huge_clear_flush_notify()
>         // pmd_none() == true
>                                       exit_mmap()
>                                         unmap_vmas()
>                                           zap_pmd_range()
>                                             // no action on pmd since
>                                             // pmd_none() == true
>   pmd_populate()
> 
> As a result, the THP will not be freed. The leak is detected by check_mm():
> 
>   BUG: Bad rss-counter state mm:880058d2e580 idx:1 val:512
> 
> The above required us to not mark pmd none during a pmd split.
> 
> The fix for ppc is to clear the huge pte of _PAGE_USER, so that low
> level fault handling code skip this pte. At higher level we do take ptl
> lock. That should serialize us against the pmd split. Once the lock is
> acquired we do check the pmd again using pmd_same. That should always
> return false for us and hence we should retry the access. We do the
> pmd_same check in all cases after taking ptl with
> THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and
> huge_pmd_set_accessed)
> 
> Also make sure we wait for irq disable section in other cpus to finish
> before flipping a huge pte entry with a regular pmd entry. Code paths
> like find_linux_pte_or_hugepte depend on irq disable to get
> a stable pte_t pointer. A parallel thp split needs to make sure we
> don't convert a pmd pte to a regular pmd entry without waiting for the
> irq disable section to finish.
> 
> Acked-by: Kirill A. Shutemov 
> Signed-off-by: Aneesh Kumar K.V 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/9db4cd6c21535a4846b38808f3

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [v3,2/2] powerpc: tracing: don't trace hcalls on offline CPUs

2016-02-09 Thread Michael Ellerman
On Mon, 2015-14-12 at 20:18:06 UTC, Denis Kirjanov wrote:
> ./drmgr -c cpu -a -r gives the following warning:
> 
> ...

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/168a20bb35122539682671d15c

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc: fix dedotify for binutils >= 2.26

2016-02-09 Thread Michael Ellerman
On Fri, 2016-05-02 at 18:50:03 UTC, Andreas Schwab wrote:
> Since binutils 2.26 BFD is doing suffix merging on STRTAB sections.  But
> dedotify modifies the symbol names in place, which can also modify
> unrelated symbols with a name that matches a suffix of a dotted name.  To
> remove the leading dot of a symbol name we can just increment the pointer
> into the STRTAB section instead.
> 
> Signed-off-by: Andreas Schwab 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/f15838e9cac8f78f0cc506529b

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v7 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments

2016-02-09 Thread Christophe Leroy



On 09/02/2016 11:23, Christophe Leroy wrote:

The main purpose of this patchset is to dramatically reduce the time
spent in DTLB miss handler. This is achieved by:
1/ Mapping RAM with 8M pages
2/ Mapping IMMR with a fixed 512K page


Change in v7:
* Don't include x_block_mapped() from compilation in
arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set
(reported by kbuild test robot)




Please don't apply it; for some reason the modification that was supposed
to be in v7 didn't get in. I will submit v8. Sorry for the noise.


Christophe
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages

2016-02-09 Thread Christophe Leroy
On a live running system (VoIP gateway for Air Traffic Control), over
a 10-minute period (with 277s idle), we get 87 million DTLB misses
and approximately 35 seconds are spent in the DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 million DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

MPC8xx has no BATs but it has an 8Mb page size. This patch implements
mapping of kernel RAM using 8Mb pages, on the same model as what is
done on the 40x.

In 4k pages mode, each PGD entry maps a 4Mb area: we map every two
entries to the same 8Mb physical page. In each second entry, we add
4Mb to the page physical address to ease the life of the FixupDAR
routine. This is simply ignored by the HW.

In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from EPN to map the correct page.
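
For illustration only, here is a minimal C sketch of how the 4k pages
mode linear mapping could be populated (the 8xx_mmu.c hunk of this patch
is not visible in this message; _PMD_PRESENT, _PMD_PAGE_8M and the exact
page table walk are assumptions, not a quote of the patch):

/*
 * Illustrative sketch: map linear RAM with 8M pages in 4k pages mode.
 * Each pair of level 1 entries covers one 8M physical page, the second
 * entry being offset by 4M to help FixupDAR, as explained above.
 */
static unsigned long __init sketch_mapin_ram_8M(unsigned long top)
{
	unsigned long v = PAGE_OFFSET;
	phys_addr_t p = 0;

	while (p < top) {
		pmd_t *pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);

		*pmdp++ = __pmd(p | _PMD_PRESENT | _PMD_PAGE_8M);
		*pmdp = __pmd((p + 0x400000) | _PMD_PRESENT | _PMD_PAGE_8M);

		v += 0x800000;	/* advance 8M of virtual space */
		p += 0x800000;	/* advance 8M of physical space */
	}
	return p;		/* amount of RAM now block-mapped */
}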

With this patch applied, we now get only 13 million TLB misses
during the 10-minute period. The idle time has increased to 313s
and the overall time spent in DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of non-idle time.

Signed-off-by: Christophe Leroy 
---
v2: using bt instead of bgt and named the label explicitly
v3: no change
v4: no change
v5: removed use of pmd_val() as L-value
v6: no change
v8: no change

 arch/powerpc/kernel/head_8xx.S | 35 +-
 arch/powerpc/mm/8xx_mmu.c  | 83 ++
 arch/powerpc/mm/Makefile   |  1 +
 arch/powerpc/mm/mmu_decl.h | 15 ++--
 4 files changed, 120 insertions(+), 14 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a89492e..87d1f5f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -398,11 +398,13 @@ DataStoreTLBMiss:
BRANCH_UNLESS_KERNEL(3f)
lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-   mtcrr3
 
/* Insert level 1 index */
rlwimi  r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 
29
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the 
level 1 entry */
+   mtcrr11
+   bt- 28,DTLBMiss8M   /* bit 28 = Large page (8M) */
+   mtcrr3
 
/* We have a pte table, so load fetch the pte from the table.
 */
@@ -455,6 +457,29 @@ DataStoreTLBMiss:
EXCEPTION_EPILOG_0
rfi
 
+DTLBMiss8M:
+   mtcrr3
+   ori r11, r11, MD_SVALID
+   MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+#ifdef CONFIG_PPC_16K_PAGES
+   /*
+* In 16k pages mode, each PGD entry defines a 64M block.
+* Here we select the 8M page within the block.
+*/
+   rlwimi  r11, r10, 0, 0x0380
+#endif
+   rlwinm  r10, r11, 0, 0xff80
+   ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+ _PAGE_PRESENT
+   MTSPR_CPU6(SPRN_MD_RPN, r10, r3)/* Update TLB entry */
+
+   li  r11, RPN_PATTERN
+   mfspr   r3, SPRN_SPRG_SCRATCH2
+   mtspr   SPRN_DAR, r11   /* Tag DAR */
+   EXCEPTION_EPILOG_0
+   rfi
+
+
 /* This is an instruction TLB error on the MPC8xx.  This could be due
  * to many reasons, such as executing guarded memory or illegal instruction
  * addresses.  There is nothing to do but handle a big time error fault.
@@ -532,13 +557,15 @@ FixupDAR:/* Entry point for dcbx workaround. */
/* Insert level 1 index */
 3: rlwimi  r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 
29
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the 
level 1 entry */
+   mtcrr11
+   bt  28,200f /* bit 28 = Large page (8M) */
rlwinm  r11, r11,0,0,19 /* Extract page descriptor page address */
/* Insert level 2 index */
rlwimi  r11, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
lwz r11, 0(r11) /* Get the pte */
/* concat physical page address(r11) and page offset(r10) */
rlwimi  r11, r10, 0, 32 - PAGE_SHIFT, 31
-   lwz r11,0(r11)
+201:   lwz r11,0(r11)
 /* Check if it really is a dcbx instruction. */
 /* dcbt and dcbtst does not generate DTLB Misses/Errors,
  * no need to include them here */
@@ -557,6 +584,10 @@ FixupDAR:/* Entry point for dcbx workaround. */
 141:   mfspr   r10,SPRN_SPRG_SCRATCH2
b   DARFixed/* Nope, go back to normal TLB processing */
 
+   /* concat physical page address(r11) and page offset(r10) */
+200:   rlwimi  r11, r10, 0, 32 - (PAGE_SHIFT << 1), 31
+   b   201b
+
 144:   mfspr   r10, SPRN_DSISR
rlwinm  r10, r10,0,7,5  /* Clear store bit for buggy dcbst insn */
mtspr   

[PATCH v8 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address

2016-02-09 Thread Christophe Leroy
Once the linear memory space has been mapped with 8Mb pages, as
seen in the related commit, we get 11 million DTLB misses during
the reference 600s period. 77% of the misses are on user addresses
and 23% are on kernel addresses (one fourth for the linear address
space and three fourths for the virtual address space).

Traditionally, each driver manages one computer board which has its
own components with its own memory maps.
But on embedded chips like the MPC8xx, the SOC has all registers
located in the same IO area.

When looking at ioremaps done during startup, we see that
many drivers are re-mapping small parts of the IMMR for their own use
and all those small pieces get their own 4k page, amplifying the
number of TLB misses: in our system we get 0xff00 mapped 31 times
and 0xff003000 mapped 9 times.

Even if each part of IMMR was mapped only once with 4k pages, it would
still be several small mappings towards linear area.

With the patch, on the same principle as what was done for the RAM,
the IMMR gets mapped by a 512k page.

In 4k pages mode, we reserve a 4Mb area for mapping IMMR. The TLB
miss handler checks that we are within the first 512k and bails out
with the page not marked valid if we are outside.

In 16k pages mode, it is not realistic to reserve a 64Mb area, so
we do a standard mapping of the 512k area using 32 pages of 16k.
The CPM will be mapped via the first two pages, and the SEC engine
will be mapped via the 16th and 17th pages. As the pages are marked
guarded, there will be no speculative accesses.

With this patch applied, the number of DTLB misses during the 10 min
period is reduced to 11.8 million for a duration of 5.8s, which
represents 2% of the non-idle time, hence yet another 10% reduction.
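
As the 8xx_mmu.c hunk is truncated in this message, here is a hedged
sketch of what the fixmap based IMMR mapping could look like in 16k
pages mode; IMMR_SIZE, PHYS_IMMR_BASE, VIRT_IMMR_BASE and map_page()
are taken on trust from the rest of the series:

/*
 * Illustrative sketch: map the 512k IMMR area page by page at the fixed
 * virtual address reserved in the fixmap, with non-cached guarded pages.
 */
static void __init sketch_mapin_immr(void)
{
	unsigned long p = PHYS_IMMR_BASE;
	unsigned long v = VIRT_IMMR_BASE;
	unsigned long f = pgprot_val(PAGE_KERNEL_NCG);
	int offset;

	for (offset = 0; offset < IMMR_SIZE; offset += PAGE_SIZE)
		map_page(v + offset, p + offset, f);
}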

Signed-off-by: Christophe Leroy 
---
v2:
- using bt instead of blt/bgt
- reorganised in order to have only one taken branch for both 512k
and 8M instead of a first branch for both 8M and 512k then a second
branch for 512k

v3:
- using fixmap
- using the new x_block_mapped() functions

v4: no change
v5: no change
v6: removed use of pmd_val() as L-value
v8: no change

 arch/powerpc/include/asm/fixmap.h |  9 ++-
 arch/powerpc/kernel/head_8xx.S| 36 +-
 arch/powerpc/mm/8xx_mmu.c | 53 +++
 arch/powerpc/mm/mmu_decl.h|  3 ++-
 4 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fixmap.h 
b/arch/powerpc/include/asm/fixmap.h
index d7dd8fb..b954dc3 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -52,12 +52,19 @@ enum fixed_addresses {
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
 #endif
 #ifdef CONFIG_PPC_8xx
-   /* For IMMR we need an aligned 512K area */
FIX_IMMR_START,
+#ifdef CONFIG_PPC_4K_PAGES
+   /* For IMMR we need an aligned 4M area (full PGD entry) */
+   FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((4 * 1024 * 1024) / PAGE_SIZE)) &
+  ~(((4 * 1024 * 1024) / PAGE_SIZE) - 1),
+   FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((4 * 1024 * 1024) / PAGE_SIZE),
+#else
+   /* For IMMR we need an aligned 512K area */
FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
   ~(((512 * 1024) / PAGE_SIZE) - 1),
FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
 #endif
+#endif
/* FIX_PCIE_MCFG, */
__end_of_fixed_addresses
 };
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 09173ae..ae721a1 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -254,6 +254,37 @@ DataAccess:
. = 0x400
 InstructionAccess:
 
+/*
+ * Bottom part of DTLBMiss handler for 512k pages
+ * not enough space in the primary location
+ */
+#ifdef CONFIG_PPC_4K_PAGES
+/*
+ * 512k pages are only used for mapping IMMR area in 4K pages mode.
+ * Only map the first 512k page of the 4M area covered by the PGD entry.
+ * This should not happen, but if we are called for another page of that
+ * area, don't mark it valid
+ *
+ * In 16k pages mode, IMMR is directly mapped with 16k pages
+ */
+DTLBMiss512k:
+   rlwinm. r10, r10, 0, 0x0038
+   bne-1f
+   ori r11, r11, MD_SVALID
+1: mtcrr3
+   MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+   rlwinm  r10, r11, 0, 0xffc0
+   ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+ _PAGE_PRESENT | _PAGE_NO_CACHE
+   MTSPR_CPU6(SPRN_MD_RPN, r10, r3)/* Update TLB entry */
+
+   li  r11, RPN_PATTERN
+   mfspr   r3, SPRN_SPRG_SCRATCH2
+   mtspr   SPRN_DAR, r11   /* Tag DAR */
+   EXCEPTION_EPILOG_0
+   rfi
+#endif
+
 /* External interrupt */
EXCEPTION(0x500, HardwareInterrupt, do_IRQ, EXC_XFER_LITE)
 
@@ -405,6 +436,9 @@ DataStoreTLBMiss:
lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the 
level 1 entry 

[PATCH v8 11/23] powerpc32: Remove useless/wrong MMU:setio progress message

2016-02-09 Thread Christophe Leroy
Commit 771168494719 ("[POWERPC] Remove unused machine call outs")
removed the call to setup_io_mappings(), so remove the associated
progress line message

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/mm/init_32.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 1a18e4b..4eb1b8f 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -178,10 +178,6 @@ void __init MMU_init(void)
/* Initialize early top-down ioremap allocator */
ioremap_bot = IOREMAP_TOP;
 
-   /* Map in I/O resources */
-   if (ppc_md.progress)
-   ppc_md.progress("MMU:setio", 0x302);
-
if (ppc_md.progress)
ppc_md.progress("MMU:exit", 0x211);
 
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments

2016-02-09 Thread Christophe Leroy
The main purpose of this patchset is to dramatically reduce the time
spent in DTLB miss handler. This is achieved by:
1/ Mapping RAM with 8M pages
2/ Mapping IMMR with a fixed 512K page

On a live running system (VoIP gateway for Air Traffic Control), over
a 10-minute period (with 277s idle), we get 87 million DTLB misses
and approximately 35 seconds are spent in the DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 million DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

Once the full patchset is applied, the number of DTLB misses during the
period is reduced to 11.8 million for a duration of 5.8s, which
represents 2% of the non-idle time.

This patchset also includes other miscellaneous improvements:
1/ Handling of CPU6 ERRATA directly in mtspr() C macro to reduce code
specific to PPC8xx
2/ Rewrite of a few non critical ASM functions in C
3/ Removal of some unused items

See related patches for details

Main changes in v3:
* Using fixmap instead of fix address for mapping IMMR

Change in v4:
* Fix of a wrong #if notified by kbuild robot in 07/23

Change in v5:
* Removed use of pmd_val() as L-value
* Adapted to match the new include files layout in Linux 4.5

Change in v6:
* Removed remaining use of pmd_val() as L-value (reported by kbuild test robot)

Change in v7:
* No change (commit error)

Change in v8:
* Don't include x_block_mapped() from compilation in
arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set
(reported by kbuild test robot)


Christophe Leroy (23):
  powerpc/8xx: Save r3 all the time in DTLB miss handler
  powerpc/8xx: Map linear kernel RAM with 8M pages
  powerpc: Update documentation for noltlbs kernel parameter
  powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
  powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
  powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam()
together
  powerpc/8xx: Fix vaddr for IMMR early remap
  powerpc/8xx: Map IMMR area with 512k page at a fixed address
  powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
  powerpc/8xx: map more RAM at startup when needed
  powerpc32: Remove useless/wrong MMU:setio progress message
  powerpc32: remove ioremap_base
  powerpc/8xx: Add missing SPRN defines into reg_8xx.h
  powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
  powerpc/8xx: remove special handling of CPU6 errata in set_dec()
  powerpc/8xx: rewrite set_context() in C
  powerpc/8xx: rewrite flush_instruction_cache() in C
  powerpc: add inline functions for cache related instructions
  powerpc32: Remove clear_pages() and define clear_page() inline
  powerpc32: move x_dcache_range() functions inline
  powerpc: Simplify test in __dma_sync()
  powerpc32: small optimisation in flush_icache_range()
  powerpc32: Remove one insn in mulhdu

 Documentation/kernel-parameters.txt  |   2 +-
 arch/powerpc/Kconfig.debug   |   1 -
 arch/powerpc/include/asm/cache.h |  19 +++
 arch/powerpc/include/asm/cacheflush.h|  52 ++-
 arch/powerpc/include/asm/fixmap.h|  14 ++
 arch/powerpc/include/asm/mmu-8xx.h   |   4 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h |   5 +-
 arch/powerpc/include/asm/page_32.h   |  17 ++-
 arch/powerpc/include/asm/reg.h   |   2 +
 arch/powerpc/include/asm/reg_8xx.h   |  93 
 arch/powerpc/include/asm/time.h  |   6 +-
 arch/powerpc/kernel/asm-offsets.c|   8 ++
 arch/powerpc/kernel/head_8xx.S   | 207 +--
 arch/powerpc/kernel/misc_32.S| 107 ++
 arch/powerpc/kernel/ppc_ksyms.c  |   2 +
 arch/powerpc/kernel/ppc_ksyms_32.c   |   1 -
 arch/powerpc/mm/8xx_mmu.c| 190 
 arch/powerpc/mm/Makefile |   1 +
 arch/powerpc/mm/dma-noncoherent.c|   2 +-
 arch/powerpc/mm/fsl_booke_mmu.c  |   6 +-
 arch/powerpc/mm/init_32.c|  23 ---
 arch/powerpc/mm/mmu_decl.h   |  34 +++--
 arch/powerpc/mm/pgtable_32.c |  47 +-
 arch/powerpc/mm/ppc_mmu_32.c |   4 +-
 arch/powerpc/platforms/embedded6xx/mpc10x.h  |  10 --
 arch/powerpc/sysdev/cpm_common.c |  15 +-
 26 files changed, 585 insertions(+), 287 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler

2016-02-09 Thread Christophe Leroy
We are spending between 40 and 160 cycles with a mean of 65 cycles in
the DTLB handling routine (measured with mftbl), so make it
simpler although it adds one instruction.
With this modification, we get three registers available at all times,
which will help with the following patch.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/head_8xx.S | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index e629e28..a89492e 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -385,23 +385,20 @@ InstructionTLBMiss:
 
. = 0x1200
 DataStoreTLBMiss:
-#ifdef CONFIG_8xx_CPU6
mtspr   SPRN_SPRG_SCRATCH2, r3
-#endif
EXCEPTION_PROLOG_0
-   mfcrr10
+   mfcrr3
 
/* If we are faulting a kernel address, we have to use the
 * kernel page tables.
 */
-   mfspr   r11, SPRN_MD_EPN
-   IS_KERNEL(r11, r11)
+   mfspr   r10, SPRN_MD_EPN
+   IS_KERNEL(r11, r10)
mfspr   r11, SPRN_M_TW  /* Get level 1 table */
BRANCH_UNLESS_KERNEL(3f)
lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-   mtcrr10
-   mfspr   r10, SPRN_MD_EPN
+   mtcrr3
 
/* Insert level 1 index */
rlwimi  r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 
29
@@ -453,9 +450,7 @@ DataStoreTLBMiss:
MTSPR_CPU6(SPRN_MD_RPN, r10, r3)/* Update TLB entry */
 
/* Restore registers */
-#ifdef CONFIG_8xx_CPU6
mfspr   r3, SPRN_SPRG_SCRATCH2
-#endif
mtspr   SPRN_DAR, r11   /* Tag DAR */
EXCEPTION_EPILOG_0
rfi
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 03/23] powerpc: Update documentation for noltlbs kernel parameter

2016-02-09 Thread Christophe Leroy
Now the noltlbs kernel parameter is also applicable to PPC8xx

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 Documentation/kernel-parameters.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 59e1515..c3e420b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2592,7 +2592,7 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
nolapic_timer   [X86-32,APIC] Do not use the local APIC timer.
 
noltlbs [PPC] Do not use large page/tlb entries for kernel
-   lowmem mapping on PPC40x.
+   lowmem mapping on PPC40x and PPC8xx
 
nomca   [IA-64] Disable machine check abort handling
 
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c

2016-02-09 Thread Christophe Leroy
Now that we have an 8xx-specific .c file for that, put it in there
as other powerpc variants do.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/mm/8xx_mmu.c | 17 +
 arch/powerpc/mm/init_32.c | 19 ---
 2 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 2d42745..a84f5eb 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -81,3 +81,20 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
 
return mapped;
 }
+
+void setup_initial_memory_limit(phys_addr_t first_memblock_base,
+   phys_addr_t first_memblock_size)
+{
+   /* We don't currently support the first MEMBLOCK not mapping 0
+* physical on those processors
+*/
+   BUG_ON(first_memblock_base != 0);
+
+#ifdef CONFIG_PIN_TLB
+   /* 8xx can only access 24MB at the moment */
+   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180));
+#else
+   /* 8xx can only access 8MB at the moment */
+   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080));
+#endif
+}
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index a10be66..1a18e4b 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -193,22 +193,3 @@ void __init MMU_init(void)
/* Shortly after that, the entire linear mapping will be available */
memblock_set_current_limit(lowmem_end_addr);
 }
-
-#ifdef CONFIG_8xx /* No 8xx specific .c file to put that in ... */
-void setup_initial_memory_limit(phys_addr_t first_memblock_base,
-   phys_addr_t first_memblock_size)
-{
-   /* We don't currently support the first MEMBLOCK not mapping 0
-* physical on those processors
-*/
-   BUG_ON(first_memblock_base != 0);
-
-#ifdef CONFIG_PIN_TLB
-   /* 8xx can only access 24MB at the moment */
-   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180));
-#else
-   /* 8xx can only access 8MB at the moment */
-   memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080));
-#endif
-}
-#endif /* CONFIG_8xx */
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages

2016-02-09 Thread Christophe Leroy
The fixmap related functions try to map kernel pages that are
already mapped through Large TLBs. pte_offset_kernel() has to
return NULL for LTLBs, otherwise the caller will try to access a
level 2 table which doesn't exist.
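
A hedged sketch of the kind of caller this protects (illustrative names
only, not code from this patch):

/*
 * Illustrative sketch: a fixmap-style helper that wants to modify a
 * kernel PTE can now detect that the address is covered by a large TLB
 * entry instead of dereferencing a non-existent level 2 table.
 */
static int sketch_set_kernel_pte(unsigned long va, pte_t pte)
{
	pmd_t *pmdp = pmd_offset(pud_offset(pgd_offset_k(va), va), va);
	pte_t *ptep = pte_offset_kernel(pmdp, va);

	if (!ptep)		/* block mapped: nothing to do at PTE level */
		return -EINVAL;

	set_pte_at(&init_mm, va, ptep, pte);
	return 0;
}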

Signed-off-by: Christophe Leroy 
---
v3: new
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/nohash/32/pgtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index c82cbf5..e201600 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -309,7 +309,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, 
pte_t entry)
 #define pte_index(address) \
(((address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
 #define pte_offset_kernel(dir, addr)   \
-   ((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr))
+   (pmd_bad(*(dir)) ? NULL : (pte_t *)pmd_page_vaddr(*(dir)) + \
+ pte_index(addr))
 #define pte_offset_map(dir, addr)  \
((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr))
 #define pte_unmap(pte) kunmap_atomic(pte)
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together

2016-02-09 Thread Christophe Leroy
x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of
purpose, and are never defined at the same time.
So rename them x_block_mapped() and define them in the relevant
places

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: Functions are mutually exclusive so renamed iaw Scott comment instead of 
grouping into a single function
v4: no change
v5: no change
v6: no change
v8: Don't include x_block_mapped() from compilation in
arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set
(problem reported by kbuild robot with a configuration having
CONFIG_FSL_BOOK3E and not CONFIG_FSL_BOOKE)

 arch/powerpc/mm/fsl_booke_mmu.c |  6 --
 arch/powerpc/mm/mmu_decl.h  | 10 ++
 arch/powerpc/mm/pgtable_32.c| 44 ++---
 arch/powerpc/mm/ppc_mmu_32.c|  4 ++--
 4 files changed, 22 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
index f3afe3d..a1b2713 100644
--- a/arch/powerpc/mm/fsl_booke_mmu.c
+++ b/arch/powerpc/mm/fsl_booke_mmu.c
@@ -72,10 +72,11 @@ unsigned long tlbcam_sz(int idx)
return tlbcam_addrs[idx].limit - tlbcam_addrs[idx].start + 1;
 }
 
+#ifdef CONFIG_FSL_BOOKE
 /*
  * Return PA for this VA if it is mapped by a CAM, or 0
  */
-phys_addr_t v_mapped_by_tlbcam(unsigned long va)
+phys_addr_t v_block_mapped(unsigned long va)
 {
int b;
for (b = 0; b < tlbcam_index; ++b)
@@ -87,7 +88,7 @@ phys_addr_t v_mapped_by_tlbcam(unsigned long va)
 /*
  * Return VA for a given PA or 0 if not mapped
  */
-unsigned long p_mapped_by_tlbcam(phys_addr_t pa)
+unsigned long p_block_mapped(phys_addr_t pa)
 {
int b;
for (b = 0; b < tlbcam_index; ++b)
@@ -97,6 +98,7 @@ unsigned long p_mapped_by_tlbcam(phys_addr_t pa)
return tlbcam_addrs[b].start+(pa-tlbcam_addrs[b].phys);
return 0;
 }
+#endif
 
 /*
  * Set up a variable-size TLB entry (tlbcam). The parameters are not checked;
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 7faeb9f..40dd5d3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -158,3 +158,13 @@ struct tlbcam {
u32 MAS7;
 };
 #endif
+
+#if defined(CONFIG_6xx) || defined(CONFIG_FSL_BOOKE)
+/* 6xx have BATS */
+/* FSL_BOOKE have TLBCAM */
+phys_addr_t v_block_mapped(unsigned long va);
+unsigned long p_block_mapped(phys_addr_t pa);
+#else
+static inline phys_addr_t v_block_mapped(unsigned long va) { return 0; }
+static inline unsigned long p_block_mapped(phys_addr_t pa) { return 0; }
+#endif
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 7692d1b..db0d35e 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -41,32 +41,8 @@ unsigned long ioremap_base;
 unsigned long ioremap_bot;
 EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */
 
-#ifdef CONFIG_6xx
-#define HAVE_BATS  1
-#endif
-
-#if defined(CONFIG_FSL_BOOKE)
-#define HAVE_TLBCAM1
-#endif
-
 extern char etext[], _stext[];
 
-#ifdef HAVE_BATS
-extern phys_addr_t v_mapped_by_bats(unsigned long va);
-extern unsigned long p_mapped_by_bats(phys_addr_t pa);
-#else /* !HAVE_BATS */
-#define v_mapped_by_bats(x)(0UL)
-#define p_mapped_by_bats(x)(0UL)
-#endif /* HAVE_BATS */
-
-#ifdef HAVE_TLBCAM
-extern phys_addr_t v_mapped_by_tlbcam(unsigned long va);
-extern unsigned long p_mapped_by_tlbcam(phys_addr_t pa);
-#else /* !HAVE_TLBCAM */
-#define v_mapped_by_tlbcam(x)  (0UL)
-#define p_mapped_by_tlbcam(x)  (0UL)
-#endif /* HAVE_TLBCAM */
-
 #define PGDIR_ORDER(32 + PGD_T_LOG2 - PGDIR_SHIFT)
 
 #ifndef CONFIG_PPC_4K_PAGES
@@ -228,19 +204,10 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, 
unsigned long flags,
 
/*
 * Is it already mapped?  Perhaps overlapped by a previous
-* BAT mapping.  If the whole area is mapped then we're done,
-* otherwise remap it since we want to keep the virt addrs for
-* each request contiguous.
-*
-* We make the assumption here that if the bottom and top
-* of the range we want are mapped then it's mapped to the
-* same virt address (and this is contiguous).
-*  -- Cort
+* mapping.
 */
-   if ((v = p_mapped_by_bats(p)) /*&& p_mapped_by_bats(p+size-1)*/ )
-   goto out;
-
-   if ((v = p_mapped_by_tlbcam(p)))
+   v = p_block_mapped(p);
+   if (v)
goto out;
 
if (slab_is_available()) {
@@ -278,7 +245,8 @@ void iounmap(volatile void __iomem *addr)
 * If mapped by BATs then there is nothing to do.
 * Calling vfree() generates a benign warning.
 */
-   if (v_mapped_by_bats((unsigned long)addr)) return;
+   if (v_block_mapped((unsigned long)addr))
+   return;
 
if (addr > high_memory && (unsigned long) addr < ioremap_bot)
vunmap((void *) (PAGE_MASK & 

[PATCH v8 10/23] powerpc/8xx: map more RAM at startup when needed

2016-02-09 Thread Christophe Leroy
On recent kernels, with some debug options like for instance
CONFIG_LOCKDEP, the BSS requires more than 8M of memory, although
the kernel code fits in the first 8M.
Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M
at startup, although pinning TLBs is not necessary for that.

This patch adds more pages (up to 24Mb) to the initial mapping if
possible/needed in order to have the necessary mappings regardless of
CONFIG_PIN_TLB.

We could have mapped 16M or 24M unconditionally, but since some
platforms only have 8M of memory, we need something a bit more elaborate.

Therefore, if the bootloader is compliant with the ePAPR standard, we use
r7 to know how much memory was mapped by the bootloader.
Otherwise, we try to determine the required memory size by looking at
the _end symbol and the address of the device tree.

This patch does not modify the behaviour when CONFIG_PIN_TLB is
selected.
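
For readability, the decision implemented by the assembly below can be
summarised by this C sketch (illustrative only, not code added by the
patch; the assembly ORs _end and the DTB address, which for these
power-of-two thresholds is equivalent to taking the larger of the two):

/* Illustrative sketch: pick the initial block-mapped RAM size */
static unsigned long sketch_initial_ram_size(unsigned long dtb_pa,
					     unsigned long r6_magic,
					     unsigned long r7_size)
{
	unsigned long need;

	if (r6_magic == EPAPR_SMAGIC)		/* size given by bootloader */
		return min(r7_size, 0x01800000UL);	/* cap at 24M */

	need = max((unsigned long)_end - PAGE_OFFSET - 1, dtb_pa);
	if (need < 0x00800000)
		return 0x00800000;		/*  8M is enough */
	if (need < 0x01000000)
		return 0x01000000;		/* need 16M */
	return 0x01800000;			/* need 24M */
}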

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: Automatic detection of available/needed memory instead of allocating 16M 
for all.
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/head_8xx.S | 56 +++---
 arch/powerpc/mm/8xx_mmu.c  | 10 +++-
 2 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index ae721a1..a268cf4 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -72,6 +72,9 @@
 #define RPN_PATTERN0x00f0
 #endif
 
+/* ePAPR magic value for non BOOK III-E CPUs */
+#define EPAPR_SMAGIC   0x65504150
+
__HEAD
 _ENTRY(_stext);
 _ENTRY(_start);
@@ -101,6 +104,38 @@ _ENTRY(_start);
  */
.globl  __start
 __start:
+/*
+ * Determine initial RAM size
+ *
+ * If the Bootloader is ePAPR compliant, the size is given in r7
+ * otherwise, we have to determine how much is needed. For that, we have to
+ * check whether _end of kernel and device tree are within the first 8Mb.
+ */
+   lis r30, 0x0080@h   /* 8Mb by default */
+
+   lis r8, EPAPR_SMAGIC@h
+   ori r8, r8, EPAPR_SMAGIC@l
+   cmplw   cr0, r8, r6
+   bne 1f
+   lis r30, 0x0180@h   /* 24Mb max */
+   cmplw   cr0, r7, r30
+   bgt 2f
+   mr  r30, r7 /* save initial ram size */
+   b   2f
+1:
+   /* is kernel _end or DTB in the first 8M ? if not map 16M */
+   lis r8, (_end - PAGE_OFFSET)@h
+   ori r8, r8, (_end - PAGE_OFFSET)@l
+   addir8, r8, -1
+   or  r8, r8, r3
+   cmplw   cr0, r8, r30
+   blt 2f
+   lis r30, 0x0100@h   /* 16Mb */
+   /* is kernel _end or DTB in the first 16M ? if not map 24M */
+   cmplw   cr0, r8, r30
+   blt 2f
+   lis r30, 0x0180@h   /* 24Mb */
+2:
mr  r31,r3  /* save device tree ptr */
 
/* We have to turn on the MMU right away so we get cache modes
@@ -737,6 +772,8 @@ start_here:
 /*
  * Decide what sort of machine this is and initialize the MMU.
  */
+   lis r3, initial_memory_size@ha
+   stw r30, initial_memory_size@l(r3)
li  r3,0
mr  r4,r31
bl  machine_init
@@ -868,10 +905,15 @@ initial_mmu:
mtspr   SPRN_MD_RPN, r8
 
 #ifdef CONFIG_PIN_TLB
-   /* Map two more 8M kernel data pages.
-   */
+   /* Map one more 8M kernel data page. */
addir10, r10, 0x0100
mtspr   SPRN_MD_CTR, r10
+#else
+   /* Map one more 8M kernel data page if needed */
+   lis r10, 0x0080@h
+   cmplw   cr0, r30, r10
+   ble 1f
+#endif
 
lis r8, KERNELBASE@h/* Create vaddr for TLB */
addis   r8, r8, 0x0080  /* Add 8M */
@@ -884,20 +926,28 @@ initial_mmu:
addis   r11, r11, 0x0080/* Add 8M */
mtspr   SPRN_MD_RPN, r11
 
+#ifdef CONFIG_PIN_TLB
+   /* Map one more 8M kernel data page. */
addir10, r10, 0x0100
mtspr   SPRN_MD_CTR, r10
+#else
+   /* Map one more 8M kernel data page if needed */
+   lis r10, 0x0100@h
+   cmplw   cr0, r30, r10
+   ble 1f
+#endif
 
addis   r8, r8, 0x0080  /* Add 8M */
mtspr   SPRN_MD_EPN, r8
mtspr   SPRN_MD_TWC, r9
addis   r11, r11, 0x0080/* Add 8M */
mtspr   SPRN_MD_RPN, r11
-#endif
 
/* Since the cache is enabled according to the information we
 * just loaded into the TLB, invalidate and enable the caches here.
 * We should probably check/set other modeslater.
 */
+1:
lis r8, IDC_INVALL@h
mtspr   SPRN_IC_CST, r8
mtspr   SPRN_DC_CST, r8
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index f37d5ec..50f17d2 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -20,6 +20,7 @@
 #define IMMR_SIZE 

[PATCH v8 12/23] powerpc32: remove ioremap_base

2016-02-09 Thread Christophe Leroy
ioremap_base is not initialised and is used nowhere, so remove it.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: fix comment as well
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/nohash/32/pgtable.h |  2 +-
 arch/powerpc/mm/mmu_decl.h   |  1 -
 arch/powerpc/mm/pgtable_32.c |  3 +--
 arch/powerpc/platforms/embedded6xx/mpc10x.h  | 10 --
 4 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index e201600..7808475 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -86,7 +86,7 @@ extern int icache_44x_need_flush;
  * We no longer map larger than phys RAM with the BATs so we don't have
  * to worry about the VMALLOC_OFFSET causing problems.  We do have to worry
  * about clashes between our early calls to ioremap() that start growing down
- * from ioremap_base being run into the VM area allocations (growing upwards
+ * from IOREMAP_TOP being run into the VM area allocations (growing upwards
  * from VMALLOC_START).  For this reason we have ioremap_bot to check when
  * we actually run into our mappings setup in the early boot with the VM
  * system.  This really does become a problem for machines with good amounts
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 3872332..53564a3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -100,7 +100,6 @@ extern void setbat(int index, unsigned long virt, 
phys_addr_t phys,
 
 extern int __map_without_bats;
 extern int __allow_ioremap_reserved;
-extern unsigned long ioremap_base;
 extern unsigned int rtas_data, rtas_size;
 
 struct hash_pte;
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index db0d35e..815ccd7 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -37,7 +37,6 @@
 
 #include "mmu_decl.h"
 
-unsigned long ioremap_base;
 unsigned long ioremap_bot;
 EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */
 
@@ -173,7 +172,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, 
unsigned long flags,
/*
 * Choose an address to map it to.
 * Once the vmalloc system is running, we use it.
-* Before then, we use space going down from ioremap_base
+* Before then, we use space going down from IOREMAP_TOP
 * (ioremap_bot records where we're up to).
 */
p = addr & PAGE_MASK;
diff --git a/arch/powerpc/platforms/embedded6xx/mpc10x.h 
b/arch/powerpc/platforms/embedded6xx/mpc10x.h
index b290b63..5ad1202 100644
--- a/arch/powerpc/platforms/embedded6xx/mpc10x.h
+++ b/arch/powerpc/platforms/embedded6xx/mpc10x.h
@@ -24,13 +24,11 @@
  *   Processor: 0x8000 - 0x807f -> PCI I/O: 0x - 0x007f
  *   Processor: 0xc000 - 0xdfff -> PCI MEM: 0x - 0x1fff
  *   PCI MEM:   0x8000 -> Processor System Memory: 0x
- *   EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB)
  *
  * MAP B (CHRP Map)
  *   Processor: 0xfe00 - 0xfebf -> PCI I/O: 0x - 0x00bf
  *   Processor: 0x8000 - 0xbfff -> PCI MEM: 0x8000 - 0xbfff
  *   PCI MEM:   0x -> Processor System Memory: 0x
- *   EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB)
  */
 
 /*
@@ -138,14 +136,6 @@
 #define MPC10X_EUMB_WP_OFFSET  0x000ff000 /* Data path diagnostic, 
watchpoint reg offset */
 #define MPC10X_EUMB_WP_SIZE0x1000 /* Data path diagnostic, 
watchpoint reg size */
 
-/*
- * Define some recommended places to put the EUMB regs.
- * For both maps, recommend putting the EUMB from 0xeff0 to 0xefff.
- */
-extern unsigned long   ioremap_base;
-#defineMPC10X_MAPA_EUMB_BASE   (ioremap_base - 
MPC10X_EUMB_SIZE)
-#defineMPC10X_MAPB_EUMB_BASE   MPC10X_MAPA_EUMB_BASE
-
 enum ppc_sys_devices {
MPC10X_IIC1,
MPC10X_DMA0,
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 07/23] powerpc/8xx: Fix vaddr for IMMR early remap

2016-02-09 Thread Christophe Leroy
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata,
648K rodata, 508K init, 290K bss, 6644K reserved)
Kernel virtual memory layout:
  * 0xfffdf000..0xf000  : fixmap
  * 0xfde0..0xfe00  : consistent mem
  * 0xfddf6000..0xfde0  : early ioremap
  * 0xc900..0xfddf6000  : vmalloc & ioremap
SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

Today, IMMR is mapped 1:1 at startup

Mapping IMMR 1:1 is just wrong because it may overlap with another
area. On most mpc8xx boards it is OK as IMMR is set to 0xff00
but for instance on the EP88xC board, IMMR is at 0xfa20, which
overlaps with the VM ioremap area.

This patch fixes the virtual address for remapping IMMR with the fixmap
regardless of the value of IMMR.

The size of the IMMR area is 256 kbytes (CPM at offset 0, security engine
at offset 128k), so a 512k page is enough.
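
As a hedged illustration of what the fixmap gives us, early code can now
derive a pointer into the IMMR without assuming a 1:1 mapping (the names
below are illustrative, not part of the patch):

/*
 * Illustrative sketch: return a pointer to a block inside the IMMR,
 * using the fixed virtual address reserved by FIX_IMMR_BASE.
 */
static void __iomem *sketch_immr_ptr(unsigned long offset_in_immr)
{
	unsigned long va = __fix_to_virt(FIX_IMMR_BASE) + offset_in_immr;

	return (void __iomem *)va;
}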

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: Using fixmap instead of fixed address
v4: Fix a wrong #if notified by kbuild robot
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/fixmap.h |  7 +++
 arch/powerpc/kernel/asm-offsets.c |  8 
 arch/powerpc/kernel/head_8xx.S| 11 ++-
 arch/powerpc/mm/mmu_decl.h|  7 +++
 arch/powerpc/sysdev/cpm_common.c  | 15 ---
 5 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/fixmap.h 
b/arch/powerpc/include/asm/fixmap.h
index 90f604b..d7dd8fb 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -51,6 +51,13 @@ enum fixed_addresses {
FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
 #endif
+#ifdef CONFIG_PPC_8xx
+   /* For IMMR we need an aligned 512K area */
+   FIX_IMMR_START,
+   FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
+  ~(((512 * 1024) / PAGE_SIZE) - 1),
+   FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
+#endif
/* FIX_PCIE_MCFG, */
__end_of_fixed_addresses
 };
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 07cebc3..9724ff8 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -68,6 +68,10 @@
 #include "../mm/mmu_decl.h"
 #endif
 
+#ifdef CONFIG_PPC_8xx
+#include 
+#endif
+
 int main(void)
 {
DEFINE(THREAD, offsetof(struct task_struct, thread));
@@ -772,5 +776,9 @@ int main(void)
 
DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
 
+#ifdef CONFIG_PPC_8xx
+   DEFINE(VIRT_IMMR_BASE, __fix_to_virt(FIX_IMMR_BASE));
+#endif
+
return 0;
 }
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 87d1f5f..09173ae 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Macro to make the code more readable. */
 #ifdef CONFIG_8xx_CPU6
@@ -763,7 +764,7 @@ start_here:
  * virtual to physical.  Also, set the cache mode since that is defined
  * by TLB entries and perform any additional mapping (like of the IMMR).
  * If configured to pin some TLBs, we pin the first 8 Mbytes of kernel,
- * 24 Mbytes of data, and the 8M IMMR space.  Anything not covered by
+ * 24 Mbytes of data, and the 512k IMMR space.  Anything not covered by
  * these mappings is mapped by page tables.
  */
 initial_mmu:
@@ -812,7 +813,7 @@ initial_mmu:
ori r8, r8, MD_APG_INIT@l
mtspr   SPRN_MD_AP, r8
 
-   /* Map another 8 MByte at the IMMR to get the processor
+   /* Map a 512k page for the IMMR to get the processor
 * internal registers (among other things).
 */
 #ifdef CONFIG_PIN_TLB
@@ -820,12 +821,12 @@ initial_mmu:
mtspr   SPRN_MD_CTR, r10
 #endif
mfspr   r9, 638 /* Get current IMMR */
-   andis.  r9, r9, 0xff80  /* Get 8Mbyte boundary */
+   andis.  r9, r9, 0xfff8  /* Get 512 kbytes boundary */
 
-   mr  r8, r9  /* Create vaddr for TLB */
+   lis r8, VIRT_IMMR_BASE@h/* Create vaddr for TLB */
ori r8, r8, MD_EVALID   /* Mark it valid */
mtspr   SPRN_MD_EPN, r8
-   li  r8, MD_PS8MEG   /* Set 8M byte page */
+   li  r8, MD_PS512K | MD_GUARDED  /* Set 512k byte page */
ori r8, r8, MD_SVALID   /* Make it valid */
mtspr   SPRN_MD_TWC, r8
mr  r8, r9  /* Create paddr for TLB */
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 40dd5d3..e7228b7 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -107,6 +107,13 @@ struct hash_pte;
 extern struct hash_pte *Hash, *Hash_end;
 extern unsigned long Hash_size, Hash_mask;
 
+#define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8)
+#ifdef CONFIG_PPC_8xx
+#define VIRT_IMMR_BASE 

[PATCH v8 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro

2016-02-09 Thread Christophe Leroy
MPC8xx has an ERRATA on the use of mtspr() for some registers.
This patch includes the ERRATA handling directly in the mtspr() macro
so that mtspr() users don't need to bother about that errata.
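
A short usage sketch (illustrative only): callers keep writing a plain
mtspr() and, on CONFIG_8xx_CPU6 builds, the macro expands to the dummy
store/load at the register-specific address before the real mtspr:

static inline void sketch_update_md_ctr(unsigned long val)
{
	/*
	 * With CONFIG_8xx_CPU6 this becomes
	 * do_mtspr_cpu6(SPRN_MD_CTR, 0x3180, val);
	 * without it, a single mtspr instruction.
	 */
	mtspr(SPRN_MD_CTR, val);
}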

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/reg.h |  2 +
 arch/powerpc/include/asm/reg_8xx.h | 82 ++
 2 files changed, 84 insertions(+)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index c4cb2ff..7b5d97f 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1211,9 +1211,11 @@ static inline void mtmsr_isync(unsigned long val)
 #define mfspr(rn)  ({unsigned long rval; \
asm volatile("mfspr %0," __stringify(rn) \
: "=r" (rval)); rval;})
+#ifndef mtspr
 #define mtspr(rn, v)   asm volatile("mtspr " __stringify(rn) ",%0" : \
 : "r" ((unsigned long)(v)) \
 : "memory")
+#endif
 
 extern void msr_check_and_set(unsigned long bits);
 extern bool strict_msr_control;
diff --git a/arch/powerpc/include/asm/reg_8xx.h 
b/arch/powerpc/include/asm/reg_8xx.h
index 0f71c81..d41412c 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -50,4 +50,86 @@
 #define DC_DFWT0x4000  /* Data cache is forced write 
through */
 #define DC_LES 0x2000  /* Caches are little endian mode */
 
+#ifdef CONFIG_8xx_CPU6
+#define do_mtspr_cpu6(rn, rn_addr, v)  \
+   do {\
+   int _reg_cpu6 = rn_addr, _tmp_cpu6[1];  \
+   asm volatile("stw %0, %1;"  \
+"lwz %0, %1;"  \
+"mtspr " __stringify(rn) ",%2" :   \
+: "r" (_reg_cpu6), "m"(_tmp_cpu6), \
+  "r" ((unsigned long)(v)) \
+: "memory");   \
+   } while (0)
+
+#define do_mtspr(rn, v)asm volatile("mtspr " __stringify(rn) ",%0" :   
\
+: "r" ((unsigned long)(v)) \
+: "memory")
+#define mtspr(rn, v) \
+   do {\
+   if (rn == SPRN_IMMR)\
+   do_mtspr_cpu6(rn, 0x3d30, v);   \
+   else if (rn == SPRN_IC_CST) \
+   do_mtspr_cpu6(rn, 0x2110, v);   \
+   else if (rn == SPRN_IC_ADR) \
+   do_mtspr_cpu6(rn, 0x2310, v);   \
+   else if (rn == SPRN_IC_DAT) \
+   do_mtspr_cpu6(rn, 0x2510, v);   \
+   else if (rn == SPRN_DC_CST) \
+   do_mtspr_cpu6(rn, 0x3110, v);   \
+   else if (rn == SPRN_DC_ADR) \
+   do_mtspr_cpu6(rn, 0x3310, v);   \
+   else if (rn == SPRN_DC_DAT) \
+   do_mtspr_cpu6(rn, 0x3510, v);   \
+   else if (rn == SPRN_MI_CTR) \
+   do_mtspr_cpu6(rn, 0x2180, v);   \
+   else if (rn == SPRN_MI_AP)  \
+   do_mtspr_cpu6(rn, 0x2580, v);   \
+   else if (rn == SPRN_MI_EPN) \
+   do_mtspr_cpu6(rn, 0x2780, v);   \
+   else if (rn == SPRN_MI_TWC) \
+   do_mtspr_cpu6(rn, 0x2b80, v);   \
+   else if (rn == SPRN_MI_RPN) \
+   do_mtspr_cpu6(rn, 0x2d80, v);   \
+   else if (rn == SPRN_MI_CAM) \
+   do_mtspr_cpu6(rn, 0x2190, v);   \
+   else if (rn == SPRN_MI_RAM0)\
+   do_mtspr_cpu6(rn, 0x2390, v);   \
+   else if (rn == SPRN_MI_RAM1)\
+   do_mtspr_cpu6(rn, 0x2590, v);   \
+   else if (rn == SPRN_MD_CTR) \
+   do_mtspr_cpu6(rn, 0x3180, v);   \
+   else if (rn == SPRN_M_CASID)  

[PATCH v8 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM

2016-02-09 Thread Christophe Leroy
IMMR is now mapped by page tables, so it is no longer
necessary to pin TLBs.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/Kconfig.debug | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 638f9ce..136b09c 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -220,7 +220,6 @@ config PPC_EARLY_DEBUG_40x
 config PPC_EARLY_DEBUG_CPM
bool "Early serial debugging for Freescale CPM-based serial ports"
depends on SERIAL_CPM
-   select PIN_TLB if PPC_8xx
help
  Select this to enable early debugging for Freescale chips
  using a CPM-based serial port.  This assumes that the bootwrapper
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 18/23] powerpc: add inline functions for cache related instructions

2016-02-09 Thread Christophe Leroy
This patch adds inline functions so that dcbz, dcbi, dcbf and dcbst
can be used from C code.
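
As a hedged example of what this enables (a later patch in the series
rewrites clear_page() along these lines, but the code below is only an
illustration, not a quote of it):

/*
 * Illustrative sketch: zero a page cache line by cache line with dcbz,
 * avoiding the read-for-ownership a memset() would trigger.
 */
static inline void sketch_clear_page(void *addr)
{
	unsigned int i;

	for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
		dcbz(addr);
}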

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/cache.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 5f8229e..ffbafbf 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -69,6 +69,25 @@ extern void _set_L3CR(unsigned long);
 #define _set_L3CR(val) do { } while(0)
 #endif
 
+static inline void dcbz(void *addr)
+{
+   __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbi(void *addr)
+{
+   __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbf(void *addr)
+{
+   __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbst(void *addr)
+{
+   __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+}
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CACHE_H */
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 20/23] powerpc32: move xxxxx_dcache_range() functions inline

2016-02-09 Thread Christophe Leroy
The flush/clean/invalidate _dcache_range() functions are all very
similar and quite short. They are mainly used in __dma_sync();
perf_event shows them in the top 3 consuming functions during
heavy ethernet activity.

They are good candidates for inlining, as __dma_sync() does
almost nothing but call them.
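
For context, a hedged sketch of the kind of caller that benefits (this
mirrors what __dma_sync() does, but it is not a quote of it):

/*
 * Illustrative sketch: with the inline helpers, the DMA sync path
 * compiles down to the loops above instead of branching to misc_32.S.
 */
static void sketch_dma_sync(void *vaddr, size_t size, int to_device)
{
	unsigned long start = (unsigned long)vaddr;
	unsigned long end = start + size;

	if (to_device)
		clean_dcache_range(start, end);		/* push dirty lines to RAM */
	else
		invalidate_dcache_range(start, end);	/* drop stale lines */
}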

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/cacheflush.h | 52 ++--
 arch/powerpc/kernel/misc_32.S | 65 ---
 arch/powerpc/kernel/ppc_ksyms.c   |  2 ++
 3 files changed, 51 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index 6229e6b..97c9978 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -47,12 +47,58 @@ static inline void __flush_dcache_icache_phys(unsigned long 
physaddr)
 }
 #endif
 
-extern void flush_dcache_range(unsigned long start, unsigned long stop);
 #ifdef CONFIG_PPC32
-extern void clean_dcache_range(unsigned long start, unsigned long stop);
-extern void invalidate_dcache_range(unsigned long start, unsigned long stop);
+/*
+ * Write any modified data cache blocks out to memory and invalidate them.
+ * Does not invalidate the corresponding instruction cache blocks.
+ */
+static inline void flush_dcache_range(unsigned long start, unsigned long stop)
+{
+   void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+   unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+   unsigned long i;
+
+   for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+   dcbf(addr);
+   mb();   /* sync */
+}
+
+/*
+ * Write any modified data cache blocks out to memory.
+ * Does not invalidate the corresponding cache lines (especially for
+ * any corresponding instruction cache).
+ */
+static inline void clean_dcache_range(unsigned long start, unsigned long stop)
+{
+   void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+   unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+   unsigned long i;
+
+   for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+   dcbst(addr);
+   mb();   /* sync */
+}
+
+/*
+ * Like above, but invalidate the D-cache.  This is used by the 8xx
+ * to invalidate the cache so the PPC core doesn't get stale data
+ * from the CPM (no cache snooping here :-).
+ */
+static inline void invalidate_dcache_range(unsigned long start,
+  unsigned long stop)
+{
+   void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+   unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+   unsigned long i;
+
+   for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+   dcbi(addr);
+   mb();   /* sync */
+}
+
 #endif /* CONFIG_PPC32 */
 #ifdef CONFIG_PPC64
+extern void flush_dcache_range(unsigned long start, unsigned long stop);
 extern void flush_inval_dcache_range(unsigned long start, unsigned long stop);
 extern void flush_dcache_phys_range(unsigned long start, unsigned long stop);
 #endif
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 181afc1..09e1e5d 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -375,71 +375,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
isync
blr
 /*
- * Write any modified data cache blocks out to memory.
- * Does not invalidate the corresponding cache lines (especially for
- * any corresponding instruction cache).
- *
- * clean_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(clean_dcache_range)
-   li  r5,L1_CACHE_BYTES-1
-   andcr3,r3,r5
-   subfr4,r3,r4
-   add r4,r4,r5
-   srwi.   r4,r4,L1_CACHE_SHIFT
-   beqlr
-   mtctr   r4
-
-1: dcbst   0,r3
-   addir3,r3,L1_CACHE_BYTES
-   bdnz1b
-   sync/* wait for dcbst's to get to ram */
-   blr
-
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
- * Does not invalidate the corresponding instruction cache blocks.
- *
- * flush_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_dcache_range)
-   li  r5,L1_CACHE_BYTES-1
-   andcr3,r3,r5
-   subfr4,r3,r4
-   add r4,r4,r5
-   srwi.   r4,r4,L1_CACHE_SHIFT
-   beqlr
-   mtctr   r4
-
-1: dcbf0,r3
-   addir3,r3,L1_CACHE_BYTES
-   bdnz1b
-   sync/* wait for dcbst's to get to ram */
-   blr
-
-/*
- * Like above, but invalidate the D-cache.  This is used by the 8xx
- * to invalidate the cache so the PPC core doesn't get stale data
- * from the CPM (no cache snooping here :-).
- *
- * 

[PATCH v8 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h

2016-02-09 Thread Christophe Leroy
Add missing SPRN defines into reg_8xx.h.
Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in
reg_8xx.h. For that we remove references to PAGE_SHIFT in mmu-8xx.h
to make it self-sufficient, as includers of reg_8xx.h don't all
include asm/page.h.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: We just add missing ones, don't move anymore the ones from mmu-8xx.h
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/mmu-8xx.h |  4 ++--
 arch/powerpc/include/asm/reg_8xx.h | 11 +++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-8xx.h 
b/arch/powerpc/include/asm/mmu-8xx.h
index f05500a..0a566f1 100644
--- a/arch/powerpc/include/asm/mmu-8xx.h
+++ b/arch/powerpc/include/asm/mmu-8xx.h
@@ -171,9 +171,9 @@ typedef struct {
 } mm_context_t;
 #endif /* !__ASSEMBLY__ */
 
-#if (PAGE_SHIFT == 12)
+#if defined(CONFIG_PPC_4K_PAGES)
 #define mmu_virtual_psize  MMU_PAGE_4K
-#elif (PAGE_SHIFT == 14)
+#elif defined(CONFIG_PPC_16K_PAGES)
 #define mmu_virtual_psize  MMU_PAGE_16K
 #else
 #error "Unsupported PAGE_SIZE"
diff --git a/arch/powerpc/include/asm/reg_8xx.h 
b/arch/powerpc/include/asm/reg_8xx.h
index e8ea346..0f71c81 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -4,6 +4,8 @@
 #ifndef _ASM_POWERPC_REG_8xx_H
 #define _ASM_POWERPC_REG_8xx_H
 
+#include 
+
 /* Cache control on the MPC8xx is provided through some additional
  * special purpose registers.
  */
@@ -14,6 +16,15 @@
 #define SPRN_DC_ADR569 /* Address needed for some commands */
 #define SPRN_DC_DAT570 /* Read-only data register */
 
+/* Misc Debug */
+#define SPRN_DPDR  630
+#define SPRN_MI_CAM816
+#define SPRN_MI_RAM0   817
+#define SPRN_MI_RAM1   818
+#define SPRN_MD_CAM824
+#define SPRN_MD_RAM0   825
+#define SPRN_MD_RAM1   826
+
 /* Commands.  Only the first few are available to the instruction cache.
 */
 #defineIDC_ENABLE  0x0200  /* Cache enable */
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec()

2016-02-09 Thread Christophe Leroy
CPU6 ERRATA is now handled directly in mtspr(), so we can use the
standard set_dec() function in all cases.
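
In effect, on the 8xx the generic path now boils down to something like
the sketch below (illustrative; it assumes the mtspr() macro of the
previous patch covers SPRN_DEC with the CPU6 sequence):

static inline void sketch_set_dec_8xx(int val)
{
	mtspr(SPRN_DEC, val - 1);	/* CPU6 handling, if any, is inside mtspr() */
}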

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/time.h |  6 +-
 arch/powerpc/kernel/head_8xx.S  | 18 --
 2 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 2d7109a..1092fdd 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -31,8 +31,6 @@ extern void tick_broadcast_ipi_handler(void);
 
 extern void generic_calibrate_decr(void);
 
-extern void set_dec_cpu6(unsigned int val);
-
 /* Some sane defaults: 125 MHz timebase, 1GHz processor */
 extern unsigned long ppc_proc_freq;
 #define DEFAULT_PROC_FREQ  (DEFAULT_TB_FREQ * 8)
@@ -166,14 +164,12 @@ static inline void set_dec(int val)
 {
 #if defined(CONFIG_40x)
mtspr(SPRN_PIT, val);
-#elif defined(CONFIG_8xx_CPU6)
-   set_dec_cpu6(val - 1);
 #else
 #ifndef CONFIG_BOOKE
--val;
 #endif
mtspr(SPRN_DEC, val);
-#endif /* not 40x or 8xx_CPU6 */
+#endif /* not 40x */
 }
 
 static inline unsigned long tb_ticks_since(unsigned long tstamp)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a268cf4..637f8e9 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -1011,24 +1011,6 @@ _GLOBAL(set_context)
SYNC
blr
 
-#ifdef CONFIG_8xx_CPU6
-/* It's here because it is unique to the 8xx.
- * It is important we get called with interrupts disabled.  I used to
- * do that, but it appears that all code that calls this already had
- * interrupt disabled.
- */
-   .globl  set_dec_cpu6
-set_dec_cpu6:
-   lis r7, cpu6_errata_word@h
-   ori r7, r7, cpu6_errata_word@l
-   li  r4, 0x2c00
-   stw r4, 8(r7)
-   lwz r4, 8(r7)
-mtspr   22, r3 /* Update Decrementer */
-   SYNC
-   blr
-#endif
-
 /*
  * We put a few things here that have to be page-aligned.
  * This stuff goes at the beginning of the data segment,
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 16/23] powerpc/8xx: rewrite set_context() in C

2016-02-09 Thread Christophe Leroy
There is no real need to have set_context() in assembly.
Now that we have mtspr() handling the CPU6 ERRATA directly, we
can rewrite set_context() in C for easier maintenance.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/head_8xx.S | 44 --
 arch/powerpc/mm/8xx_mmu.c  | 34 
 2 files changed, 34 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 637f8e9..bb2b657 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -968,50 +968,6 @@ initial_mmu:
 
 
 /*
- * Set up to use a given MMU context.
- * r3 is context number, r4 is PGD pointer.
- *
- * We place the physical address of the new task page directory loaded
- * into the MMU base register, and set the ASID compare register with
- * the new "context."
- */
-_GLOBAL(set_context)
-
-#ifdef CONFIG_BDI_SWITCH
-   /* Context switch the PTE pointer for the Abatron BDI2000.
-* The PGDIR is passed as second argument.
-*/
-   lis r5, KERNELBASE@h
-   lwz r5, 0xf0(r5)
-   stw r4, 0x4(r5)
-#endif
-
-   /* Register M_TW will contain base address of level 1 table minus the
-* lower part of the kernel PGDIR base address, so that all accesses to
-* level 1 table are done relative to lower part of kernel PGDIR base
-* address.
-*/
-   li  r5, (swapper_pg_dir-PAGE_OFFSET)@l
-   sub r4, r4, r5
-   tophys  (r4, r4)
-#ifdef CONFIG_8xx_CPU6
-   lis r6, cpu6_errata_word@h
-   ori r6, r6, cpu6_errata_word@l
-   li  r7, 0x3f80
-   stw r7, 12(r6)
-   lwz r7, 12(r6)
-#endif
-   mtspr   SPRN_M_TW, r4   /* Update pointeur to level 1 table */
-#ifdef CONFIG_8xx_CPU6
-   li  r7, 0x3380
-   stw r7, 12(r6)
-   lwz r7, 12(r6)
-#endif
-   mtspr   SPRN_M_CASID, r3/* Update context */
-   SYNC
-   blr
-
-/*
  * We put a few things here that have to be page-aligned.
  * This stuff goes at the beginning of the data segment,
  * which is page-aligned.
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 50f17d2..b75c461 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -147,3 +147,37 @@ void setup_initial_memory_limit(phys_addr_t 
first_memblock_base,
memblock_set_current_limit(min_t(u64, first_memblock_size,
 initial_memory_size));
 }
+
+/*
+ * Set up to use a given MMU context.
+ * id is context number, pgd is PGD pointer.
+ *
+ * We place the physical address of the new task page directory loaded
+ * into the MMU base register, and set the ASID compare register with
+ * the new "context."
+ */
+void set_context(unsigned long id, pgd_t *pgd)
+{
+   s16 offset = (s16)(__pa(swapper_pg_dir));
+
+#ifdef CONFIG_BDI_SWITCH
+   pgd_t   **ptr = *(pgd_t ***)(KERNELBASE + 0xf0);
+
+   /* Context switch the PTE pointer for the Abatron BDI2000.
+* The PGDIR is passed as second argument.
+*/
+   *(ptr + 1) = pgd;
+#endif
+
+   /* Register M_TW will contain base address of level 1 table minus the
+* lower part of the kernel PGDIR base address, so that all accesses to
+* level 1 table are done relative to lower part of kernel PGDIR base
+* address.
+*/
+   mtspr(SPRN_M_TW, __pa(pgd) - offset);
+
+   /* Update context */
+   mtspr(SPRN_M_CASID, id);
+   /* sync */
+   mb();
+}
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 17/23] powerpc/8xx: rewrite flush_instruction_cache() in C

2016-02-09 Thread Christophe Leroy
On PPC8xx, flushing the instruction cache is performed by writing
to register SPRN_IC_CST. This register suffers from the CPU6 ERRATA.
The patch rewrites the function in C so that the CPU6 ERRATA is
handled transparently.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/misc_32.S | 10 --
 arch/powerpc/mm/8xx_mmu.c |  7 +++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index be8edd6..7d1284f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -296,12 +296,9 @@ _GLOBAL(real_writeb)
  * Flush instruction cache.
  * This is a no-op on the 601.
  */
+#ifndef CONFIG_PPC_8xx
 _GLOBAL(flush_instruction_cache)
-#if defined(CONFIG_8xx)
-   isync
-   lis r5, IDC_INVALL@h
-   mtspr   SPRN_IC_CST, r5
-#elif defined(CONFIG_4xx)
+#if defined(CONFIG_4xx)
 #ifdef CONFIG_403GCX
li  r3, 512
mtctr   r3
@@ -334,9 +331,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
mfspr   r3,SPRN_HID0
ori r3,r3,HID0_ICFI
mtspr   SPRN_HID0,r3
-#endif /* CONFIG_8xx/4xx */
+#endif /* CONFIG_4xx */
isync
blr
+#endif /* CONFIG_PPC_8xx */
 
 /*
  * Write any modified data cache blocks out to memory
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index b75c461..e2ce480 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -181,3 +181,10 @@ void set_context(unsigned long id, pgd_t *pgd)
/* sync */
mb();
 }
+
+void flush_instruction_cache(void)
+{
+   isync();
+   mtspr(SPRN_IC_CST, IDC_INVALL);
+   isync();
+}
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 19/23] powerpc32: Remove clear_pages() and define clear_page() inline

2016-02-09 Thread Christophe Leroy
clear_pages() is never used except by clear_page(), and PPC32 is the
only architecture (still) having this function. Neither PPC64 nor
any other architecture has it.

This patch removes clear_pages() and moves the clear_page() function
inline (same as PPC64), as it is only a few instructions.

Signed-off-by: Christophe Leroy 
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/page_32.h | 17 ++---
 arch/powerpc/kernel/misc_32.S  | 16 
 arch/powerpc/kernel/ppc_ksyms_32.c |  1 -
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/page_32.h 
b/arch/powerpc/include/asm/page_32.h
index 68d73b2..6a8e179 100644
--- a/arch/powerpc/include/asm/page_32.h
+++ b/arch/powerpc/include/asm/page_32.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_POWERPC_PAGE_32_H
 #define _ASM_POWERPC_PAGE_32_H
 
+#include 
+
 #if defined(CONFIG_PHYSICAL_ALIGN) && (CONFIG_PHYSICAL_START != 0)
 #if (CONFIG_PHYSICAL_START % CONFIG_PHYSICAL_ALIGN) != 0
 #error "CONFIG_PHYSICAL_START must be a multiple of CONFIG_PHYSICAL_ALIGN"
@@ -36,9 +38,18 @@ typedef unsigned long long pte_basic_t;
 typedef unsigned long pte_basic_t;
 #endif
 
-struct page;
-extern void clear_pages(void *page, int order);
-static inline void clear_page(void *page) { clear_pages(page, 0); }
+/*
+ * Clear page using the dcbz instruction, which doesn't cause any
+ * memory traffic (except to write out any cache lines which get
+ * displaced).  This only works on cacheable memory.
+ */
+static inline void clear_page(void *addr)
+{
+   unsigned int i;
+
+   for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
+   dcbz(addr);
+}
 extern void copy_page(void *to, void *from);
 
 #include 
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 7d1284f..181afc1 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -517,22 +517,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 #endif /* CONFIG_BOOKE */
 
 /*
- * Clear pages using the dcbz instruction, which doesn't cause any
- * memory traffic (except to write out any cache lines which get
- * displaced).  This only works on cacheable memory.
- *
- * void clear_pages(void *page, int order) ;
- */
-_GLOBAL(clear_pages)
-   li  r0,PAGE_SIZE/L1_CACHE_BYTES
-   slw r0,r0,r4
-   mtctr   r0
-1: dcbz0,r3
-   addir3,r3,L1_CACHE_BYTES
-   bdnz1b
-   blr
-
-/*
  * Copy a whole page.  We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
  * the destination into cache).  This requires that the destination
diff --git a/arch/powerpc/kernel/ppc_ksyms_32.c 
b/arch/powerpc/kernel/ppc_ksyms_32.c
index 30ddd8a..2bfaafe 100644
--- a/arch/powerpc/kernel/ppc_ksyms_32.c
+++ b/arch/powerpc/kernel/ppc_ksyms_32.c
@@ -10,7 +10,6 @@
 #include 
 #include 
 
-EXPORT_SYMBOL(clear_pages);
 EXPORT_SYMBOL(ISA_DMA_THRESHOLD);
 EXPORT_SYMBOL(DMA_MODE_READ);
 EXPORT_SYMBOL(DMA_MODE_WRITE);
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 21/23] powerpc: Simplify test in __dma_sync()

2016-02-09 Thread Christophe Leroy
This simplification helps the compiler. We now have only one test
instead of two, so it reduces the number of branches.
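
For readers following the change, here is a minimal sketch of why the single
test is equivalent, assuming end = start + size as in __dma_sync() and
L1_CACHE_BYTES from asm/cache.h; the helper name is made up for illustration
and is not proposed kernel code:

	/* Illustration only: if start is cache-line aligned, end is
	 * misaligned exactly when size is; if start is misaligned, both
	 * the old and the new expression are already non-zero.
	 */
	static inline int dma_sync_misaligned(unsigned long start, size_t size)
	{
		unsigned long end = start + size;

		return ((start | end) & (L1_CACHE_BYTES - 1)) != 0;
	}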

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/mm/dma-noncoherent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/dma-noncoherent.c 
b/arch/powerpc/mm/dma-noncoherent.c
index 169aba4..2dc74e5 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
 * invalidate only when cache-line aligned otherwise there is
 * the potential for discarding uncommitted data from the cache
 */
-   if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 
1)))
+   if ((start | end) & (L1_CACHE_BYTES - 1))
flush_dcache_range(start, end);
else
invalidate_dcache_range(start, end);
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 23/23] powerpc32: Remove one insn in mulhdu

2016-02-09 Thread Christophe Leroy
Remove one instruction in mulhdu
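
For context, mulhdu returns the high 64 bits of an unsigned 64x64-bit
multiply, built from 32-bit partial products on PPC32. Below is a hedged C
sketch of the arithmetic the assembly implements (illustration only, not the
kernel code; the helper name is made up):

	/* High 64 bits of a * b, from 32-bit partial products:
	 * a * b = hi*2^64 + (m1 + m2)*2^32 + lo
	 */
	static unsigned long long mulhdu_sketch(unsigned long long a,
						unsigned long long b)
	{
		unsigned int ah = a >> 32, al = (unsigned int)a;
		unsigned int bh = b >> 32, bl = (unsigned int)b;
		unsigned long long lo = (unsigned long long)al * bl;
		unsigned long long m1 = (unsigned long long)al * bh;
		unsigned long long m2 = (unsigned long long)ah * bl;
		unsigned long long hi = (unsigned long long)ah * bh;
		/* carry out of the low 64 bits of the product */
		unsigned long long mid = (lo >> 32) + (unsigned int)m1
					 + (unsigned int)m2;

		return hi + (m1 >> 32) + (m2 >> 32) + (mid >> 32);
	}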

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/misc_32.S | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 3ec5a22..bf5160f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -91,17 +91,16 @@ _GLOBAL(mulhdu)
addcr7,r0,r7
addze   r4,r4
 1: beqlr   cr1 /* all done if high part of A is 0 */
-   mr  r10,r3
mullw   r9,r3,r5
-   mulhwu  r3,r3,r5
+   mulhwu  r10,r3,r5
beq 2f
-   mullw   r0,r10,r6
-   mulhwu  r8,r10,r6
+   mullw   r0,r3,r6
+   mulhwu  r8,r3,r6
addcr7,r0,r7
adder4,r4,r8
-   addze   r3,r3
+   addze   r10,r10
 2: addcr4,r4,r9
-   addze   r3,r3
+   addze   r3,r10
blr
 
 /*
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v8 22/23] powerpc32: small optimisation in flush_icache_range()

2016-02-09 Thread Christophe Leroy
Inlining of the _dcache_range() functions has shown that the compiler
does the same thing a bit better, with one instruction less.
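
For reference, here is a hedged C equivalent of the cache-line count the
assembly sets up before its loop (assuming r3 holds the start address and r4
the end address, with L1_CACHE_BYTES/L1_CACHE_SHIFT from asm/cache.h); the
helper name is made up for illustration:

	/* Illustration only: align start down to a cache line, then count
	 * the lines covering [start, end).  The rlwinm form masks the low
	 * bits without first loading the mask into a register, which is
	 * where the saved instruction comes from.
	 */
	static unsigned long icache_range_lines(unsigned long start,
						unsigned long end)
	{
		unsigned long aligned = start & ~(unsigned long)(L1_CACHE_BYTES - 1);

		return (end - aligned + L1_CACHE_BYTES - 1) >> L1_CACHE_SHIFT;
	}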

Signed-off-by: Christophe Leroy 
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/misc_32.S | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 09e1e5d..3ec5a22 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -348,10 +348,9 @@ BEGIN_FTR_SECTION
PURGE_PREFETCHED_INS
blr /* for 601, do nothing */
 END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-   li  r5,L1_CACHE_BYTES-1
-   andcr3,r3,r5
+   rlwinm  r3,r3,0,0,31 - L1_CACHE_SHIFT
subfr4,r3,r4
-   add r4,r4,r5
+   addir4,r4,L1_CACHE_BYTES - 1
srwi.   r4,r4,L1_CACHE_SHIFT
beqlr
mtctr   r4
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 0/2] Consolidate redundant register/stack access code

2016-02-09 Thread David Long

On 02/09/2016 04:36 AM, Michael Ellerman wrote:

On Tue, 2016-02-09 at 00:38 -0500, David Long wrote:


From: "David A. Long" 

Move duplicate and functionally equivalent code for accessing registers
and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into
common kernel files.

I'm sending this out again (with updated distribution list) because v2
just never got pulled in, even though I don't think there were any
outstanding issues.


A big cross arch patch like this would often get taken by Andrew Morton, but
AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for
us :D

cheers



Thanks much.

-dl

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 0/2] Consolidate redundant register/stack access code

2016-02-09 Thread David Long

On 02/09/2016 04:45 AM, Ingo Molnar wrote:


* Michael Ellerman  wrote:


On Tue, 2016-02-09 at 00:38 -0500, David Long wrote:


From: "David A. Long" 

Move duplicate and functionally equivalent code for accessing registers
and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into
common kernel files.

I'm sending this out again (with updated distribution list) because v2
just never got pulled in, even though I don't think there were any
outstanding issues.


A big cross arch patch like this would often get taken by Andrew Morton, but
AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for
us :D


The other problem is that the second patch is commingling changes to 6 separate
architectures:

  16 files changed, 106 insertions(+), 343 deletions(-)

that should probably be 6 separate patches. Easier to review, easier to bisect 
to,
easier to revert, etc.

Thanks,

Ingo



I see your point but I'm not sure it could have been broken into 
separate successive patches that would each build for all architectures.


-dl

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: build regression from c153693: Simplify module TOC handling

2016-02-09 Thread Dinar Valeev
On Tue, Feb 9, 2016 at 5:28 PM, Peter Robinson  wrote:
> Hi Alan,
>
> Your patch for "powerpc: Simplify module TOC handling" is causing the
> Fedora ppc64le to fail to build with depmod failures. Reverting the
> commit fixes it for us on rawhide.
Anton's patch [1] fixes it.

[1] 
https://build.opensuse.org/package/view_file/Base:System/kmod/depmod-Ignore_PowerPC64_ABIv2_.TOC.symbol.patch
>
> We're getting the out put below, full logs at [1]. Let me know if you
> have any other queries.
>
> Regards,
> Peter
>
> [1] 
> http://ppc.koji.fedoraproject.org/kojifiles/work/tasks/5115/3125115/build.log
>
> + depmod -b . -aeF ./System.map 4.5.0-0.rc2.git0.1.fc24.ppc64le
> Depmod failure
> + '[' -s depmod.out ']'
> + echo 'Depmod failure'
> + cat depmod.out
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/powernv/opal-prd.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/pseries_energy.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/hvcserver.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-pr.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-hv.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/rcu/rcutorture.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/trace/ring_buffer_benchmark.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/torture.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/nfs_acl.ko
> needs unknown symbol .TOC.
> depmod: WARNING:
> /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/grace.ko
> needs unknown symbol .TOC.
> ___
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/6] ibmvscsi: Add and use enums for valid CRQ header values

2016-02-09 Thread Tyrel Datwyler
On 02/09/2016 09:41 AM, Manoj Kumar wrote:
>> Yeah, I can see how that is confusing. Since, all three possible valid
>> crq message types have the first bit set I think this was originally a
>> cute hack to grab anything that was likely valid. Then in
>> ibmvscsi_handle_crq() we explicitly match the full header value in a
>> switch statement logging anything that turned out actually invalid.
>>
>>>
>>> If 'valid' will only have one of these four enums defined, would
>>> this be better written as:
>>>
>>> if (crq->valid != VIOSRP_CRQ_FREE)
>>
>> This definitely would make the logic easier to read and follow. Also,
>> this would make sure any crq with an invalid header that doesn't have
>> its first bit set will also be logged by the ibmvscsi_handle_crq()
>> switch statement default block and not silently ignored.
>>
>> -Tyrel
> 
> Sounds good, Tyrel. Does this mean I should expect a v2 of this patch
> series?
> 
> - Manoj N. Kumar

Haven't had a chance to clean up and resubmit, but yes there will be a
v2 coming along soon.

-Tyrel

> 
> ___
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: build regression from c153693: Simplify module TOC handling

2016-02-09 Thread Michael Ellerman
On Tue, 2016-02-09 at 22:02 +0100, Dinar Valeev wrote:

> On Tue, Feb 9, 2016 at 5:28 PM, Peter Robinson  wrote:

> > Hi Alan,
> >
> > Your patch for "powerpc: Simplify module TOC handling" is causing the
> > Fedora ppc64le to fail to build with depmod failures. Reverting the
> > commit fixes it for us on rawhide.

> Anton's patch [1] fixes it.
>
> [1] 
> https://build.opensuse.org/package/view_file/Base:System/kmod/depmod-Ignore_PowerPC64_ABIv2_.TOC.symbol.patch

Yep, you need an updated depmod.

Anton sent a patch to linux-modules, reproduced below for the benefit of the
list archive:

depmod: Ignore PowerPC64 ABIv2 .TOC. symbol

The .TOC. symbol on the PowerPC64 ABIv2 identifies the GOT
pointer, similar to how other architectures use _GLOBAL_OFFSET_TABLE_.

This is not a symbol that needs relocation, and should be ignored
by depmod.

---
 tools/depmod.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/depmod.c b/tools/depmod.c
index 6e9bb4d..a2e07c1 100644
--- a/tools/depmod.c
+++ b/tools/depmod.c
@@ -2153,6 +2153,8 @@ static void depmod_add_fake_syms(struct depmod *depmod)
depmod_symbol_add(depmod, "__this_module", true, 0, NULL);
/* On S390, this is faked up too */
depmod_symbol_add(depmod, "_GLOBAL_OFFSET_TABLE_", true, 0, NULL);
+   /* On PowerPC64 ABIv2, .TOC. is more or less _GLOBAL_OFFSET_TABLE_ */
+   depmod_symbol_add(depmod, "TOC.", true, 0, NULL);
 }

 static int depmod_load_symvers(struct depmod *depmod, const char *filename)
--
2.5.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v7 07/10] ppc64 ftrace: disable profiling for some files

2016-02-09 Thread Michael Ellerman
On Mon, 2016-01-25 at 16:31 +0100, Torsten Duwe wrote:

> This patch complements the "notrace" attribute for selected functions.
> It adds -mprofile-kernel to the cc flags to be stripped from the command
> line for code-patching.o and feature-fixups.o, in addition to "-pg"

This could probably be folded into patch 5, and the combined patch would be
"remove -mprofile-kernel in all the same places we remove -pg and for the same
reasons".

I can't think of anywhere we would want to disable -pg but not disable
-mprofile-kernel? Or vice versa.

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH kernel] powerpc/ioda: Set "read" permission when "write" is set

2016-02-09 Thread Alexey Kardashevskiy

On 02/10/2016 01:28 AM, Douglas Miller wrote:

We finally got the chance to test it at the end of last week. I forgot to
update everyone Monday. By all appearances, the patch fixes the problem. We
did not see any new issues with the patch (vs. the same test scenarios without).

I'll also update the bugzilla.


Thanks. Care to add "Tested-by"?




Thanks,
Doug

On 02/08/2016 07:37 PM, Alexey Kardashevskiy wrote:

On 01/20/2016 06:01 AM, Douglas Miller wrote:



On 01/18/2016 09:52 PM, Alexey Kardashevskiy wrote:

On 01/13/2016 01:24 PM, Douglas Miller wrote:



On 01/12/2016 05:07 PM, Benjamin Herrenschmidt wrote:

On Tue, 2016-01-12 at 15:40 +1100, Alexey Kardashevskiy wrote:

Quite often drivers set only "write" permission assuming that this
includes "read" permission as well and this works on plenty
platforms.
However IODA2 is strict about this and produces an EEH when "read"
permission is not and reading happens.

This adds a workaround in IODA code to always add the "read" bit when
the "write" bit is set.

Cc: Benjamin Herrenschmidt 
Signed-off-by: Alexey Kardashevskiy 
---


Ben, what was the driver which did not set "read" and caused EEH?

aacraid

Cheers,
Ben.

Just to be precise, the driver wasn't responsible for setting READ. The
driver called scsi_dma_map() and the scsicmd was set (by scsi layer) as
DMA_FROM_DEVICE so the current code would set the permissions to
WRITE-ONLY. Previously, and in other architectures, this scsicmd would
have
resulted in READ+WRITE permissions on the DMA map.



Does the patch fix the issue? Thanks.






---
  arch/powerpc/platforms/powernv/pci.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci.c
b/arch/powerpc/platforms/powernv/pci.c
index f2dd772..c7dcae5 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tbl, long
index, long npages,
  u64 rpn = __pa(uaddr) >> tbl->it_page_shift;
  long i;
+if (proto_tce & TCE_PCI_WRITE)
+proto_tce |= TCE_PCI_READ;
+
  for (i = 0; i < npages; i++) {
  unsigned long newtce = proto_tce |
  ((rpn + i) << tbl->it_page_shift);
@@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long
index,
  BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl));
+if (newtce & TCE_PCI_WRITE)
+newtce |= TCE_PCI_READ;
+
  oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
  *hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ |
TCE_PCI_WRITE);
  *direction = iommu_tce_direction(oldtce);

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev




I am still working on getting a machine to try this on. From code
inspection, it looks like it should work. The problem is shortage of
machines and machines tied-up by Test.


Any progress here? Thanks.









--
Alexey
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc/perf/hv-gpci: Increase request buffer size

2016-02-09 Thread Sukadev Bhattiprolu
Michael Ellerman [m...@ellerman.id.au] wrote:
> Here you read from bytes[i] where i can be > 1 (AFAICS).

Yes, the buffer is large enough and I thought this construct of
array was used in several places. Maybe they are being
changed out now (struct pid has one such usage).

> 
> That's fishy at best, and newer GCCs just don't allow it.

Ah, ok.
> 
> I think you could do this and it would work, but untested:
> 
>struct hv_gpci_request_buffer {
>   struct hv_get_perf_counter_info_params params;
>   uint8_t bytes[4096 - sizeof(struct hv_get_perf_counter_info_parms)];

There is a macro for this computation in that file. I could have
used that. Will change it and repost.

Thanks,

Sukadev

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH V2 1/1] powerpc/perf/hv-gpci: Increase request buffer size

2016-02-09 Thread Sukadev Bhattiprolu
From f1afe08fbc9797ff63adf03efe564a807a37cfe6 Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu 
Date: Tue, 9 Feb 2016 02:47:45 -0500
Subject: [PATCH V2 1/1] powerpc/perf/hv-gpci: Increase request buffer size

The GPCI hcall allows for a 4K buffer but we limit the buffer to 1K.
The problem with a 1K buffer is that if a request returns
more values than can be accommodated in the 1K buffer, the request will
fail.

The buffer we are using is currently allocated on the stack and hence
limited in size. Instead use a per-CPU 4K buffer like we do with 24x7
counters (hv-24x7.c).

While here, rename the macro GPCI_MAX_DATA_BYTES to HGPCI_MAX_DATA_BYTES
for consistency with 24x7 counters.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v2]:
- [Michael Ellerman] Specify the exact size of the buffer in
  the GPCI request rather than use an array element of 1.
---
 arch/powerpc/perf/hv-gpci.c | 43 +--
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 856fe6e..7aa3723 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -127,8 +127,16 @@ static const struct attribute_group *attr_groups[] = {
NULL,
 };
 
-#define GPCI_MAX_DATA_BYTES \
-   (1024 - sizeof(struct hv_get_perf_counter_info_params))
+#define HGPCI_REQ_BUFFER_SIZE  4096
+#define HGPCI_MAX_DATA_BYTES \
+   (HGPCI_REQ_BUFFER_SIZE - sizeof(struct hv_get_perf_counter_info_params))
+
+DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) 
__aligned(sizeof(uint64_t));
+
+struct hv_gpci_request_buffer {
+   struct hv_get_perf_counter_info_params params;
+   uint8_t bytes[HGPCI_MAX_DATA_BYTES];
+} __packed;
 
 static unsigned long single_gpci_request(u32 req, u32 starting_index,
u16 secondary_index, u8 version_in, u32 offset, u8 length,
@@ -137,24 +145,21 @@ static unsigned long single_gpci_request(u32 req, u32 
starting_index,
unsigned long ret;
size_t i;
u64 count;
+   struct hv_gpci_request_buffer *arg;
+
+   arg = (void *)get_cpu_var(hv_gpci_reqb);
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
 
-   struct {
-   struct hv_get_perf_counter_info_params params;
-   uint8_t bytes[GPCI_MAX_DATA_BYTES];
-   } __packed __aligned(sizeof(uint64_t)) arg = {
-   .params = {
-   .counter_request = cpu_to_be32(req),
-   .starting_index = cpu_to_be32(starting_index),
-   .secondary_index = cpu_to_be16(secondary_index),
-   .counter_info_version_in = version_in,
-   }
-   };
+   arg->params.counter_request = cpu_to_be32(req);
+   arg->params.starting_index = cpu_to_be32(starting_index);
+   arg->params.secondary_index = cpu_to_be16(secondary_index);
+   arg->params.counter_info_version_in = version_in;
 
ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
-   virt_to_phys(&arg), sizeof(arg));
+   virt_to_phys(arg), HGPCI_REQ_BUFFER_SIZE);
if (ret) {
pr_devel("hcall failed: 0x%lx\n", ret);
-   return ret;
+   goto out;
}
 
/*
@@ -163,9 +168,11 @@ static unsigned long single_gpci_request(u32 req, u32 
starting_index,
 */
count = 0;
for (i = offset; i < offset + length; i++)
-   count |= arg.bytes[i] << (i - offset);
+   count |= arg->bytes[i] << (i - offset);
 
*value = count;
+out:
+   put_cpu_var(hv_gpci_reqb);
return ret;
 }
 
@@ -245,10 +252,10 @@ static int h_gpci_event_init(struct perf_event *event)
}
 
/* last byte within the buffer? */
-   if ((event_get_offset(event) + length) > GPCI_MAX_DATA_BYTES) {
+   if ((event_get_offset(event) + length) > HGPCI_MAX_DATA_BYTES) {
pr_devel("request outside of buffer: %zu > %zu\n",
(size_t)event_get_offset(event) + length,
-   GPCI_MAX_DATA_BYTES);
+   HGPCI_MAX_DATA_BYTES);
return -EINVAL;
}
 
-- 
1.8.3.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v7 06/10] ppc64 ftrace: disable profiling for some functions

2016-02-09 Thread Michael Ellerman
On Mon, 2016-01-25 at 16:31 +0100, Torsten Duwe wrote:

> At least POWER7/8 have MMUs that don't completely autoload;
> a normal, recoverable memory fault might pass through these functions.
> If a dynamic tracer function causes such a fault, any of these functions
> being traced with -mprofile-kernel may cause an endless recursion.

I'm not really happy with this one, still :)

At the moment I can trace these without any problems, with either ftrace or
kprobes, but obviously it was causing you some trouble. So I'd like to
understand why you were having issues when regular tracing doesn't.

If it's the case that tracing can work for these functions, but live patching
doesn't (for some reason), then maybe these should be blocked by the live
patching infrastructure rather than at the ftrace/kprobes level.

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v14 5/9] powerpc/eeh: EEH device for VF

2016-02-09 Thread Gavin Shan
From: Wei Yang 

VFs and their corresponding pdn are created and released dynamically
when their PF's SRIOV capability is enabled and disabled. This creates
and releases EEH devices for VFs when creating and releasing their pdn
instances, which means EEH devices and pdn instances have the same life
cycle. Also, a VF's EEH device is identified by (struct eeh_dev::physfn).

Signed-off-by: Wei Yang 
Acked-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h |  1 +
 arch/powerpc/kernel/pci_dn.c   | 15 +++
 2 files changed, 16 insertions(+)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 867c39b..574ed49a 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -141,6 +141,7 @@ struct eeh_dev {
struct pci_controller *phb; /* Associated PHB   */
struct pci_dn *pdn; /* Associated PCI device node   */
struct pci_dev *pdev;   /* Associated PCI device*/
+   struct pci_dev *physfn; /* Associated SRIOV PF  */
struct pci_bus *bus;/* PCI bus for partial hotplug  */
 };
 
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index b3b4df9..e23bdf7 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -179,6 +179,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 {
 #ifdef CONFIG_PCI_IOV
struct pci_dn *parent, *pdn;
+   struct eeh_dev *edev;
int i;
 
/* Only support IOV for now */
@@ -204,6 +205,12 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 __func__, i);
return NULL;
}
+
+   /* Create the EEH device for the VF */
+   eeh_dev_init(pdn, pci_bus_to_host(pdev->bus));
+   edev = pdn_to_eeh_dev(pdn);
+   BUG_ON(!edev);
+   edev->physfn = pdev;
}
 #endif /* CONFIG_PCI_IOV */
 
@@ -215,6 +222,7 @@ void remove_dev_pci_data(struct pci_dev *pdev)
 #ifdef CONFIG_PCI_IOV
struct pci_dn *parent;
struct pci_dn *pdn, *tmp;
+   struct eeh_dev *edev;
int i;
 
/*
@@ -256,6 +264,13 @@ void remove_dev_pci_data(struct pci_dev *pdev)
pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
continue;
 
+   /* Release EEH device for the VF */
+   edev = pdn_to_eeh_dev(pdn);
+   if (edev) {
+   pdn->edev = NULL;
+   kfree(edev);
+   }
+
if (!list_empty(&pdn->list))
list_del(&pdn->list);
 
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v14 0/9] EEH Support for SRIOV VFs

2016-02-09 Thread Gavin Shan
This applies to linux-powerpc-next and additional unmerged patches:

[v2,1/4] powerpc/eeh: Fix stale cached primary bus
powerpc/eeh: fix incorrect function name in comment
[V2] powerpc/powernv: Remove support for p5ioc2
[V7,1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 
64bit-prefetchable BAR
92e963f Linux 4.5-rc1 - Linux powerpc next branch

This patchset enables EEH on SRIOV VFs. The general idea is to create proper
VF edev and VF PE and handle them properly.

Different from a Bus PE, a VF PE contains just one VF. This makes EEH
error handling on a VF PE different. Generally, there are several
differences.

First, a VF's removal and re-enumeration rely on its PF. A VF has a tight
relationship with its PF, so it is not proper to enumerate a VF by the usual
scan procedure. That's why virtfn_add/virtfn_remove are exported in this patch
set.

Second, the reset/restore of a VF is done in kernel space. FW is not aware of
the VF, which means the usual reset function done in FW will not work. One of
the patches imitates the reset/restore function in kernel space.

Third, the VF may be removed during the PF's error_detected function. In this
case, the original error_detected->slot_reset->resume sequence is not proper
for those removed VFs, since they are re-created by the PF in a fresh state. A
flag in eeh_dev is introduced to mark that the eeh_dev is in an error state. By
doing so, we track whether this device needs to be reset or not.

This has been tested both on the host and in a guest on Power8 with the latest
kernel version.

Changelog
=
v14:
   * Rebased to linux-powerpc-next branch, plus additional patches related to
 powerpc/pci and powerpc/eeh subsystems.
   * Minor code changes
   * Fix build error on pSeries reported by mpe
v13:
   * move eeh_rmv_data{} to eeh_driver.c
v12:
   * Rephrase some commit log to make it more clear and specific
   * move vf_index assignment in CONFIG_PPC_POWERNV
   * merge "Cache VF index in pci_dn" with "Support error recovery for VF PE"
   * check the return value after eeh_dev_init() for VF
   * initialize the parameter before pass to read_config()
   * make pnv_pci_fixup_vf_mps() a dedicated patch, which fixup and store mps
 value in pci_dn
v11:
   * move vf_index assignment in marco CONFIG_PPC_POWERNV
   * merge Patch "Cache VF index in pci_dn" into Patch "Support error recovery
 for VF PE"
v10:
   * rebased on v4.2
   * delete the last patch "powerpc/powernv: compound PE for VFs" since after
 redesign of SRIOV, there is no compound PE for VFs now.
   * add two patches which fix problems found during tests
 powerpc/eeh: Support error recovery for VF PE
 powerpc/eeh: Handle hot removed VF when PF is EEH aware
v9:
   * split pcibios_bus_add_device() into a separate patch
   * Bjorn acked the PCI part and agreed this patch set to be merged from ppc
 tree
   * rebased on mpe/linux.git next branch
v8:
   * fix on checking the return value of pnv_eeh_do_flr()
   * introduced a weak function pcibios_bus_add_device() to create PE for VFs
v7:
   * fix compile error when PCI_IOV is not set
v6:
   * code / commit log refactor by Gavin
v5:
   * remove the compound field, iterate on Master VF PE instead
   * some code refine on PCI config restore and reset on VF
 the wait time for assert and deassert
 PCI device address format
 check on edev->pcie_cap and edev->aer_cap before access them
v4:
   * refine the change logs, comment and code style
   * change pnv_pci_fixup_vf_eeh() to pnv_eeh_vf_final_fixup() and remove the
 CONFIG_PCI_IOV macro
   * reorder patch 5/6 to make the logic more reasonable
   * remove remove_dev_pci_data()
   * remove the EEH_DEV_VF flag, use edev->physfn to identify a VF EEH DEV and
 remove related CONFIG_PCI_IOV macro
   * add the option for VF reset
   * fix the pnv_eeh_cfg_blocked() logic
   * replace pnv_pci_cfg_{read,write} with eeh_ops->{read,write}_config in
 pnv_eeh_vf_restore_config()
   * rename pnv_eeh_vf_restore_config() to pnv_eeh_restore_vf_config()
   * rename pnv_pci_fixup_vf_caps() to pnv_pci_vf_header_fixup() and move it
 to arch/powerpc/platforms/powernv/pci.c
   * add a field compound in pnv_ioda_pe to link compound PEs
   * handle compound PE for VF PEs
v3:
   * add back vf_index in pci_dn to track the VF's index
   * rename ppdev in eeh_dev to physfn for consistency
   * move edev->physfn assignment before dev->dev.archdata.edev is set
   * move pnv_pci_fixup_vf_eeh() and pnv_pci_fixup_vf_caps() to eeh-powernv.c
   * more clear and detail in commit log and comment in code
   * merge eeh_rmv_virt_device() with eeh_rmv_device()
   * move the cfg_blocked check logic from pnv_eeh_read/write_config() to
 pnv_eeh_cfg_blocked()
   * move the vf reset/restore logic into its own patch, two patches are
 created.
 powerpc/powernv: Support PCI config restore for VFs
 powerpc/powernv: Support EEH reset for VFs
   * simplify the vf reset logic
v2:
   * add prefix pci_iov_ 

[PATCH v14 3/9] powerpc/pci: Remove VFs prior to PF

2016-02-09 Thread Gavin Shan
From: Wei Yang 

As commit ac205b7bb72f ("PCI: make sriov work with hotplug remove")
indicates, VFs which are on the same PCI bus as their PF should be
removed before the PF. Otherwise, we might run into a kernel crash
at PCI unplugging time.

This applies the above pattern to powerpc PCI hotplug path.

Signed-off-by: Wei Yang 
Acked-by: Gavin Shan 
---
 arch/powerpc/kernel/pci-hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c 
b/arch/powerpc/kernel/pci-hotplug.c
index 7f9ed0c..59c4361 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -55,7 +55,7 @@ void pcibios_remove_pci_devices(struct pci_bus *bus)
 
pr_debug("PCI: Removing devices on bus %04x:%02x\n",
 pci_domain_nr(bus),  bus->number);
-   list_for_each_entry_safe(dev, tmp, >devices, bus_list) {
+   list_for_each_entry_safe_reverse(dev, tmp, >devices, bus_list) {
pr_debug("   Removing %s...\n", pci_name(dev));
pci_stop_and_remove_bus_device(dev);
}
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 13/18] cxl: sysfs support for guests

2016-02-09 Thread Ian Munsie
Excerpts from Frederic Barrat's message of 2016-02-10 02:21:19 +1100:
> 
> Le 08/02/2016 04:02, Stewart Smith a écrit :
> > Frederic Barrat  writes:
> >> --- a/Documentation/ABI/testing/sysfs-class-cxl
> >> +++ b/Documentation/ABI/testing/sysfs-class-cxl
> >> @@ -183,7 +183,7 @@ Description:read only
> >>   Identifies the revision level of the PSL.
> >>   Users:https://github.com/ibm-capi/libcxl
> >>
> >> -What:   /sys/class/cxl//base_image
> >> +What:   /sys/class/cxl//base_image (not in a guest)
> >
> > Is this going to be the case for KVM guest as well as PowerVM guest?
> 
> 
> That's too early to say.
> The entries we've removed are because the information is filtered by 
> pHyp and not available to the OS. Some of it because nobody thought it 
> would be useful, some of it because it's not meant to be seen by the OS. 
> For KVM, if the card can be shared between guests, I would expect the 
> same kind of restrictions.

The OS doesn't particularly care about this - the only people who might
even possibly need to know will be whoever is trying to flash their PSL
image, and probably not even then.

On KVM we are thinking that it will have to be root on the hypervisor
responsible for flashing the PSL image (there isn't much other option
unless we want to go into signed images and whatnot, but even if we do
I'm 100% committed to making that a userspace problem to solve and not
trying to do anything fancy in the kernel), so we won't really need it,
but I also don't see any harm in exposing it to guests.

Cheers,
-Ian

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 05/18] cxl: Rename some bare-metal specific functions

2016-02-09 Thread Ian Munsie
Excerpts from Frederic Barrat's message of 2016-02-07 00:28:52 +1100:
> Rename a few functions, mostly prefixed by 'cxl_', to make clear that
> the implementation is 'bare-metal' specific.

Patch looks fine to me, though the commit message should probably say
that you are changing the 'cxl_' prefix to 'cxl_pci_'.

Acked-by: Ian Munsie 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 07/18] cxl: Update cxl_irq() prototype

2016-02-09 Thread Ian Munsie
> The context parameter when calling cxl_irq() should be strongly typed.

Fair enough ;-)

Acked-by: Ian Munsie 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v14 8/9] powerpc/powernv: Support PCI config restore for VFs

2016-02-09 Thread Gavin Shan
From: Wei Yang 

After PE reset, the OPAL API opal_pci_reinit() is called on all devices
contained in the PE to reinitialize them. However, skiboot is not aware of
VFs, so we have to implement the function in the kernel to reinitialize VFs
after a reset of a PE that contains VFs.

In this patch, two functions, pnv_pci_fixup_vf_mps() and
pnv_eeh_restore_vf_config(), both manipulate the MPS of the VF, since for a
VF there are three cases.

1. Normal creation for a VF
   In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper
   value compared with its parent.
2. EEH recovery without VF removed
   In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is
   called to restore it and reinitialize other part.
3. EEH recovery with VF removed
   In this case, VF will be removed then re-created. Both functions are
   called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS
   to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper
   thing.

This introduces two functions: pnv_pci_fixup_vf_mps(), to fix up the VF's
MPS to make sure it is equal to its parent's and store this value in pci_dn
for future use; and pnv_eeh_restore_vf_config(), to re-initialize the VF by
restoring MPS, disabling completion timeout, enabling SERR, etc.

Signed-off-by: Wei Yang 
Acked-by: Gavin Shan 
---
 arch/powerpc/include/asm/pci-bridge.h|  1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c | 95 +++-
 2 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index b0b43f5..f4d1758 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -220,6 +220,7 @@ struct pci_dn {
 #define IODA_INVALID_M64(-1)
int (*m64_map)[PCI_SRIOV_NUM_BARS];
 #endif /* CONFIG_PCI_IOV */
+   int mps;/* Maximum Payload Size */
 #endif
struct list_head child_list;
struct list_head list;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c 
b/arch/powerpc/platforms/powernv/eeh-powernv.c
index e26256b..950b3e5 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1588,6 +1588,65 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
return ret;
 }
 
+static int pnv_eeh_restore_vf_config(struct pci_dn *pdn)
+{
+   struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+   u32 devctl, cmd, cap2, aer_capctl;
+   int old_mps;
+
+   if (edev->pcie_cap) {
+   /* Restore MPS */
+   old_mps = (ffs(pdn->mps) - 8) << 5;
+   eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+2, &devctl);
+   devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
+   devctl |= old_mps;
+   eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+ 2, devctl);
+
+   /* Disable Completion Timeout */
+   eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP2,
+4, &cap2);
+   if (cap2 & 0x10) {
+   eeh_ops->read_config(pdn,
+edev->pcie_cap + PCI_EXP_DEVCTL2,
+4, &cap2);
+   cap2 |= 0x10;
+   eeh_ops->write_config(pdn,
+ edev->pcie_cap + PCI_EXP_DEVCTL2,
+ 4, cap2);
+   }
+   }
+
+   /* Enable SERR and parity checking */
+   eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd);
+   cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
+   eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd);
+
+   /* Enable report various errors */
+   if (edev->pcie_cap) {
+   eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+2, &devctl);
+   devctl &= ~PCI_EXP_DEVCTL_CERE;
+   devctl |= (PCI_EXP_DEVCTL_NFERE |
+  PCI_EXP_DEVCTL_FERE |
+  PCI_EXP_DEVCTL_URRE);
+   eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+ 2, devctl);
+   }
+
+   /* Enable ECRC generation and check */
+   if (edev->pcie_cap && edev->aer_cap) {
+   eeh_ops->read_config(pdn, edev->aer_cap + PCI_ERR_CAP,
+4, &aer_capctl);
+   aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
+   eeh_ops->write_config(pdn, edev->aer_cap + PCI_ERR_CAP,
+ 4, aer_capctl);
+   }
+
+   return 0;
+}
+
 static int pnv_eeh_restore_config(struct pci_dn *pdn)
 {
struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
@@ -1597,9 

[PATCH v14 7/9] powerpc/powernv: Support EEH reset for VF PE

2016-02-09 Thread Gavin Shan
From: Wei Yang 

PEs for VFs don't have a primary bus, so they have to have their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for a VF's PE by issuing an FLR or AF FLR to the VFs contained
in the PE.

Signed-off-by: Wei Yang 
Acked-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h   |   1 +
 arch/powerpc/kernel/eeh.c|   9 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c | 127 ++-
 3 files changed, 133 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 0c551a2..b5b5f45 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -137,6 +137,7 @@ struct eeh_dev {
int pcix_cap;   /* Saved PCIx capability*/
int pcie_cap;   /* Saved PCIe capability*/
int aer_cap;/* Saved AER capability */
+   int af_cap; /* Saved AF capability  */
struct eeh_pe *pe;  /* Associated PE*/
struct list_head list;  /* Form link list in the PE */
struct pci_controller *phb; /* Associated PHB   */
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 8c6005c..0d72462 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -761,7 +761,8 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum 
pcie_reset_state stat
case pcie_deassert_reset:
eeh_ops->reset(pe, EEH_RESET_DEACTIVATE);
eeh_unfreeze_pe(pe, false);
-   eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
+   if (!(pe->type & EEH_PE_VF))
+   eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
eeh_pe_dev_traverse(pe, eeh_restore_dev_state, dev);
eeh_pe_state_clear(pe, EEH_PE_ISOLATED);
break;
@@ -769,14 +770,16 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, 
enum pcie_reset_state stat
eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED);
eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
-   eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
+   if (!(pe->type & EEH_PE_VF))
+   eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
eeh_ops->reset(pe, EEH_RESET_HOT);
break;
case pcie_warm_reset:
eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED);
eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
-   eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
+   if (!(pe->type & EEH_PE_VF))
+   eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
eeh_ops->reset(pe, EEH_RESET_FUNDAMENTAL);
break;
default:
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c 
b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 830526e..e26256b 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -371,6 +371,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
edev->mode  &= 0xFF00;
edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
+   edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
edev->mode |= EEH_DEV_BRIDGE;
@@ -879,6 +880,120 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
}
 }
 
+static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type,
+int pos, u16 mask)
+{
+   struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+   int i, status = 0;
+
+   /* Wait for Transaction Pending bit to be cleared */
+   for (i = 0; i < 4; i++) {
+   eeh_ops->read_config(pdn, pos, 2, &status);
+   if (!(status & mask))
+   return;
+
+   msleep((1 << i) * 100);
+   }
+
+   pr_warn("%s: Pending transaction while issuing %sFLR to 
%04x:%02x:%02x.%01x\n",
+   __func__, type,
+   edev->phb->global_number, pdn->busno,
+   PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+}
+
+static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
+{
+   struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+   u32 reg = 0;
+
+   if (WARN_ON(!edev->pcie_cap))
+   return -ENOTTY;
+
+   eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &reg);
+   if (!(reg & 

[PATCH v14 6/9] powerpc/eeh: Create PE for VFs

2016-02-09 Thread Gavin Shan
From: Wei Yang 

This creates PEs for VFs in the weak function pcibios_bus_add_device().
Those PEs for VFs are identified with the newly introduced flag EEH_PE_VF
so that we treat them differently during EEH recovery.

Signed-off-by: Wei Yang 
Acked-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h   |  1 +
 arch/powerpc/kernel/eeh_pe.c | 10 --
 arch/powerpc/platforms/powernv/eeh-powernv.c | 16 
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 574ed49a..0c551a2 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -72,6 +72,7 @@ struct pci_dn;
 #define EEH_PE_PHB (1 << 1)/* PHB PE*/
 #define EEH_PE_DEVICE  (1 << 2)/* Device PE */
 #define EEH_PE_BUS (1 << 3)/* Bus PE*/
+#define EEH_PE_VF  (1 << 4)/* VF PE */
 
 #define EEH_PE_ISOLATED(1 << 0)/* Isolated PE  
*/
 #define EEH_PE_RECOVERING  (1 << 1)/* Recovering PE*/
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index faaf19e..e441d7b 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -299,7 +299,10 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev 
*edev)
 * EEH device already having associated PE, but
 * the direct parent EEH device doesn't have yet.
 */
-   pdn = pdn ? pdn->parent : NULL;
+   if (edev->physfn)
+   pdn = pci_get_pdn(edev->physfn);
+   else
+   pdn = pdn ? pdn->parent : NULL;
while (pdn) {
/* We're poking out of PCI territory */
parent = pdn_to_eeh_dev(pdn);
@@ -382,7 +385,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
}
 
/* Create a new EEH PE */
-   pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
+   if (edev->physfn)
+   pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
+   else
+   pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
if (!pe) {
pr_err("%s: out of memory!\n", __func__);
return -ENOMEM;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c 
b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 8119172..830526e 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1503,6 +1503,22 @@ static struct eeh_ops pnv_eeh_ops = {
.restore_config = pnv_eeh_restore_config
 };
 
+void pcibios_bus_add_device(struct pci_dev *pdev)
+{
+   struct pci_dn *pdn = pci_get_pdn(pdev);
+
+   if (!pdev->is_virtfn)
+   return;
+
+   /*
+* The following operations will fail if VF's sysfs files
+* aren't created or its resources aren't finalized.
+*/
+   eeh_add_device_early(pdn);
+   eeh_add_device_late(pdev);
+   eeh_sysfs_add_device(pdev);
+}
+
 /**
  * eeh_powernv_init - Register platform dependent EEH operations
  *
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 01/18] cxl: Move common code away from bare-metal-specific files

2016-02-09 Thread Ian Munsie
Acked-by: Ian Munsie 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 02/18] cxl: Move bare-metal specific code to specialized files

2016-02-09 Thread Ian Munsie
Acked-by: Ian Munsie 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 04/18] cxl: Introduce implementation-specific API

2016-02-09 Thread Ian Munsie
Acked-by: Ian Munsie 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] powerpc32: PAGE_EXEC required for inittext

2016-02-09 Thread Christophe Leroy
PAGE_EXEC is required for inittext, otherwise CONFIG_DEBUG_PAGEALLOC
ends up with an Oops

[0.00] Inode-cache hash table entries: 8192 (order: 1, 32768 bytes)
[0.00] Sorting __ex_table...
[0.00] bootmem::free_all_bootmem_core nid=0 start=0 end=2000
[0.00] Unable to handle kernel paging request for instruction fetch
[0.00] Faulting instruction address: 0xc045b970
[0.00] Oops: Kernel access of bad area, sig: 11 [#1]
[0.00] PREEMPT DEBUG_PAGEALLOC CMPC885
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.25-local-dirty #1673
[0.00] task: c04d83d0 ti: c04f8000 task.ti: c04f8000
[0.00] NIP: c045b970 LR: c045b970 CTR: 000a
[0.00] REGS: c04f9ea0 TRAP: 0400   Not tainted  (3.18.25-local-dirty)
[0.00] MSR: 08001032   CR: 39955d35  XER: a000ff40
[0.00]
GPR00: c045b970 c04f9f50 c04d83d0   c04dcdf4 0048 c04f6b10
GPR08: c04f6ab0 0001 c0563488 c04f6ab0 c04f8000   b6db6db7
GPR16: 3474 0180 2000 c7fec000  03ff 0176 c0415014
GPR24: c0471018 c0414ee8 c05304e8 c03aeaac c051 c0471018 c0471010 
[0.00] NIP [c045b970] free_all_bootmem+0x164/0x228
[0.00] LR [c045b970] free_all_bootmem+0x164/0x228
[0.00] Call Trace:
[0.00] [c04f9f50] [c045b970] free_all_bootmem+0x164/0x228 (unreliable)
[0.00] [c04f9fa0] [c0454044] mem_init+0x3c/0xd0
[0.00] [c04f9fb0] [c045080c] start_kernel+0x1f4/0x390
[0.00] [c04f9ff0] [c0002214] start_here+0x38/0x98
[0.00] Instruction dump:
[0.00] 2f15 7f968840 72a90001 3ad60001 56b5f87e 419a0028 419e0024 
41a20018
[0.00] 807cc20c 3880 7c638214 4bffd2f5 <3a940001> 3a100024 4bc8 
7e368b78
[0.00] ---[ end trace dc8fa200cb88537f ]---

Signed-off-by: Christophe Leroy 
---
This patch goes on top of the following serie:
[PATCH v8 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other 
improvments

 arch/powerpc/mm/pgtable_32.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 815ccd7..ba2ee66 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -40,7 +40,7 @@
 unsigned long ioremap_bot;
 EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */
 
-extern char etext[], _stext[];
+extern char etext[], _stext[], _sinittext[], _einittext[];
 
 #define PGDIR_ORDER(32 + PGD_T_LOG2 - PGDIR_SHIFT)
 
@@ -289,7 +289,8 @@ void __init __mapin_ram_chunk(unsigned long offset, 
unsigned long top)
v = PAGE_OFFSET + s;
p = memstart_addr + s;
for (; s < top; s += PAGE_SIZE) {
-   ktext = ((char *) v >= _stext && (char *) v < etext);
+   ktext = ((char *)v >= _stext && (char *)v < etext) ||
+   ((char *)v >= _sinittext && (char *)v < _einittext);
f = ktext ? pgprot_val(PAGE_KERNEL_TEXT) : 
pgprot_val(PAGE_KERNEL);
map_page(v, p, f);
 #ifdef CONFIG_PPC_STD_MMU_32
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 08/18] cxl: IRQ allocation for guests

2016-02-09 Thread Ian Munsie
Acked-by: Ian Munsie 

> +/*
> + * Look for the interrupt number.
> + * On bare-metal, we know the range 0 only contains the PSL
> + * interrupt so, we could start counting at range 1 and initialize
> + * afu_irq at 1.
> + * In a guest, range 0 also contains AFU interrupts, so it must
> + * be counted for, but we initialize afu_irq at 0 to take into
> + * account the PSL interrupt.
> + *
> + * For code-readability, it just seems easier to go over all
> + * the ranges.
> + */

Thanks for adding that explanation :)

> +if (cpu_has_feature(CPU_FTR_HVMODE))
> +alloc_count = count;
> +else
> +alloc_count = count + 1;

Almost a shame you can't reuse the afu_irq_range_start function you
defined for this, but doing so would probably make the code less
readable, so fine to leave this as is.
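
Judging from the quoted comment, afu_irq_range_start() presumably boils down
to something like the sketch below (an illustration, not code copied from the
patch):

        static inline int afu_irq_range_start(void)
        {
                /*
                 * On bare metal, range 0 holds only the PSL interrupt,
                 * so AFU interrupts start at range 1. In a guest,
                 * range 0 carries AFU interrupts as well.
                 */
                if (cpu_has_feature(CPU_FTR_HVMODE))
                        return 1;
                return 0;
        }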

>  /* We've allocated all memory now, so let's do the irq allocations */
>  irq_name = list_first_entry(&ctx->irq_names, struct cxl_irq_name, list);
> -for (r = 1; r < CXL_IRQ_RANGES; r++) {
> +for (r = afu_irq_range_start(); r < CXL_IRQ_RANGES; r++) {
>  hwirq = ctx->irqs.offset[r];
>  for (i = 0; i < ctx->irqs.range[r]; hwirq++, i++) {
> -cxl_map_irq(ctx->afu->adapter, hwirq,
> -cxl_irq_afu, ctx, irq_name->name);
> +if (r == 0 && i == 0)
> +/* PSL interrupt, only for guest */

That comment is perhaps not as clear as it could be - the interrupt is
used in both environments, but it's only allocated per context on PowerVM guests.
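
A wording along these lines might make that clearer (suggestion only):

        /*
         * PSL interrupt: present on bare metal and in guests alike, but
         * only allocated per context (and therefore only handled here)
         * on PowerVM guests.
         */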

Cheers,
-Ian

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v14 1/9] PCI/IOV: Rename and export virtfn_{add, remove}

2016-02-09 Thread Gavin Shan
From: Wei Yang 

During EEH recovery, hotplug is applied to the devices which don't
have drivers or whose drivers don't support EEH. However, hotplug,
which is implemented at the PCI bus level, can't be applied to VFs
directly. Instead, we unplug and replug individual PCI devices (VFs).

This renames virtfn_{add,remove}() and exports them so they can be used
for PCI hotplug during EEH recovery.

Signed-off-by: Wei Yang 
Reviewed-by: Gavin Shan 
Acked-by: Bjorn Helgaas 
---
 drivers/pci/iov.c   | 10 +-
 include/linux/pci.h |  8 
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 31f31d4..fa4f138 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -113,7 +113,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, 
int resno)
return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
 }
 
-static int virtfn_add(struct pci_dev *dev, int id, int reset)
+int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
 {
int i;
int rc = -ENOMEM;
@@ -188,7 +188,7 @@ failed:
return rc;
 }
 
-static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset)
 {
char buf[VIRTFN_ID_LEN];
struct pci_dev *virtfn;
@@ -321,7 +321,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
}
 
for (i = 0; i < initial; i++) {
-   rc = virtfn_add(dev, i, 0);
+   rc = pci_iov_add_virtfn(dev, i, 0);
if (rc)
goto failed;
}
@@ -333,7 +333,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 failed:
while (i--)
-   virtfn_remove(dev, i, 0);
+   pci_iov_remove_virtfn(dev, i, 0);
 
pcibios_sriov_disable(dev);
 err_pcibios:
@@ -359,7 +359,7 @@ static void sriov_disable(struct pci_dev *dev)
return;
 
for (i = 0; i < iov->num_VFs; i++)
-   virtfn_remove(dev, i, 0);
+   pci_iov_remove_virtfn(dev, i, 0);
 
pcibios_sriov_disable(dev);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 27df4a6..3db5e30 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1738,6 +1738,8 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
+int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset);
+void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset);
 int pci_num_vf(struct pci_dev *dev);
 int pci_vfs_assigned(struct pci_dev *dev);
 int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
@@ -1754,6 +1756,12 @@ static inline int pci_iov_virtfn_devfn(struct pci_dev 
*dev, int id)
 }
 static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 { return -ENODEV; }
+static inline int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
+{
+   return -ENOSYS;
+}
+static inline void pci_iov_remove_virtfn(struct pci_dev *dev,
+int id, int reset) { }
 static inline void pci_disable_sriov(struct pci_dev *dev) { }
 static inline int pci_num_vf(struct pci_dev *dev) { return 0; }
 static inline int pci_vfs_assigned(struct pci_dev *dev)
-- 
2.1.0
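
To illustrate the intended use of the newly exported helpers (a sketch only;
eeh_replug_vfs() is an invented name, not the actual EEH recovery code):

        /* Hypothetical caller on the EEH recovery path */
        static void eeh_replug_vfs(struct pci_dev *pf, int num_vfs)
        {
                int i;

                /* unplug each VF individually instead of the whole bus */
                for (i = 0; i < num_vfs; i++)
                        pci_iov_remove_virtfn(pf, i, 0);

                /* ... reset / reconfigure the PF here ... */

                /* plug the VFs back in one by one */
                for (i = 0; i < num_vfs; i++)
                        pci_iov_add_virtfn(pf, i, 0);
        }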

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v14 2/9] PCI: Add pcibios_bus_add_device() weak function

2016-02-09 Thread Gavin Shan
From: Wei Yang 

This adds a weak function, pcibios_bus_add_device(), so that arch-dependent
code can do proper setup. For example, powerpc could set up EEH-related
resources for SR-IOV VFs.

Signed-off-by: Wei Yang 
Reviewed-by: Gavin Shan 
Acked-by: Bjorn Helgaas 
---
 drivers/pci/bus.c   | 3 +++
 include/linux/pci.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 89b3bef..6469ff6 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -271,6 +271,8 @@ bool pci_bus_clip_resource(struct pci_dev *dev, int idx)
 
 void __weak pcibios_resource_survey_bus(struct pci_bus *bus) { }
 
+void __weak pcibios_bus_add_device(struct pci_dev *pdev) { }
+
 /**
  * pci_bus_add_device - start driver for a single device
  * @dev: device to add
@@ -285,6 +287,7 @@ void pci_bus_add_device(struct pci_dev *dev)
 * Can not put in pci_device_add yet because resources
 * are not assigned yet for some devices.
 */
+   pcibios_bus_add_device(dev);
pci_fixup_device(pci_fixup_final, dev);
pci_create_sysfs_dev_files(dev);
pci_proc_attach_device(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3db5e30..bc435d62 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -770,6 +770,7 @@ extern struct list_head pci_root_buses; /* list of all 
known PCI buses */
 int no_pci_devices(void);
 
 void pcibios_resource_survey_bus(struct pci_bus *bus);
+void pcibios_bus_add_device(struct pci_dev *pdev);
 void pcibios_add_bus(struct pci_bus *bus);
 void pcibios_remove_bus(struct pci_bus *bus);
 void pcibios_fixup_bus(struct pci_bus *);
-- 
2.1.0
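
For context, this is roughly how an architecture might override the new weak
hook (hypothetical sketch; eeh_setup_vf() is an invented name, not the actual
powerpc hook):

        /* illustrative arch-side override */
        void pcibios_bus_add_device(struct pci_dev *pdev)
        {
                /* e.g. set up EEH state for a freshly created SR-IOV VF */
                if (pdev->is_virtfn)
                        eeh_setup_vf(pdev);     /* hypothetical helper */
        }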

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v14 4/9] powerpc/eeh: Cache normal BARs, not windows or IOV BARs

2016-02-09 Thread Gavin Shan
From: Wei Yang 

This restricts the EEH address cache to use only the first 7 BARs. This
makes __eeh_addr_cache_insert_dev() ignore PCI bridge windows and IOV BARs.
As a result of this change, eeh_addr_cache_get_dev() will return VFs for
VF resource addresses instead of their parent PFs.

This also removes the PCI bridge check, since limiting
__eeh_addr_cache_insert_dev() to 7 BARs effectively excludes PCI bridges
from being cached.

Signed-off-by: Wei Yang 
Acked-by: Gavin Shan 
---
 arch/powerpc/kernel/eeh_cache.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index a1e86e1..ddbcfab 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -195,8 +195,11 @@ static void __eeh_addr_cache_insert_dev(struct pci_dev 
*dev)
return;
}
 
-   /* Walk resources on this device, poke them into the tree */
-   for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+   /*
+* Walk resources on this device, poke the first 7 (6 normal BAR and 1
+* ROM BAR) into the tree.
+*/
+   for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
resource_size_t start = pci_resource_start(dev,i);
resource_size_t end = pci_resource_end(dev,i);
unsigned long flags = pci_resource_flags(dev,i);
@@ -222,10 +225,6 @@ void eeh_addr_cache_insert_dev(struct pci_dev *dev)
 {
unsigned long flags;
 
-   /* Ignore PCI bridges */
-   if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE)
-   return;
-
spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
__eeh_addr_cache_insert_dev(dev);
spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
-- 
2.1.0
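
As a reminder, the resource indices involved are laid out roughly as follows
(see include/linux/pci.h):

        0..5                       standard BARs
        PCI_ROM_RESOURCE (6)       expansion ROM
        PCI_IOV_RESOURCES..        SR-IOV BARs (with CONFIG_PCI_IOV)
        PCI_BRIDGE_RESOURCES..     bridge windows

Looping with i <= PCI_ROM_RESOURCE therefore walks only the six standard
BARs plus the ROM BAR, skipping the IOV BARs and the bridge windows, which
is why the explicit bridge check above can go.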

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v14 9/9] powerpc/eeh: Support error recovery for VF PE

2016-02-09 Thread Gavin Shan
From: Wei Yang 

PFs are enumerated on the PCI bus, while VFs are created by the PF's driver.

EEH recovery has two cases:
1. The device and driver are EEH aware: the error handlers are called.
2. The device and driver are not EEH aware: the device is unplugged and
plugged again by re-enumerating it.

The special handling is needed in the second case. For a PF, we can use the
regular PCI core to re-enumerate the bus, while for VFs we need to record
which VFs were unplugged and then plug them back in individually.

The patch also caches the VF index in pci_dn, which can be used to
calculate the VF's bus, device and function numbers. That information helps
to locate the VF's PCI device instance when doing hotplug during EEH
recovery, if necessary.

Signed-off-by: Wei Yang 
Acked-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h|   2 +
 arch/powerpc/include/asm/pci-bridge.h |   1 +
 arch/powerpc/kernel/eeh.c |   8 ++
 arch/powerpc/kernel/eeh_dev.c |   1 +
 arch/powerpc/kernel/eeh_driver.c  | 137 +++---
 arch/powerpc/kernel/pci_dn.c  |   4 +-
 6 files changed, 127 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index b5b5f45..fb9f376 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -140,9 +140,11 @@ struct eeh_dev {
int af_cap; /* Saved AF capability  */
struct eeh_pe *pe;  /* Associated PE*/
struct list_head list;  /* Form link list in the PE */
+   struct list_head rmv_list;  /* Record the removed edevs */
struct pci_controller *phb; /* Associated PHB   */
struct pci_dn *pdn; /* Associated PCI device node   */
struct pci_dev *pdev;   /* Associated PCI device*/
+   bool in_error;  /* Error flag for edev  */
struct pci_dev *physfn; /* Associated SRIOV PF  */
struct pci_bus *bus;/* PCI bus for partial hotplug  */
 };
diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index f4d1758..9f165e8 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -212,6 +212,7 @@ struct pci_dn {
 #define IODA_INVALID_PE(-1)
 #ifdef CONFIG_PPC_POWERNV
int pe_number;
+   int vf_index;   /* VF index in the PF */
 #ifdef CONFIG_PCI_IOV
u16 vfs_expanded;   /* number of VFs IOV BAR expanded */
u16 num_vfs;/* number of VFs enabled*/
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 0d72462..b7338a9 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1246,6 +1246,14 @@ void eeh_remove_device(struct pci_dev *dev)
 * from the parent PE during the BAR resotre.
 */
edev->pdev = NULL;
+
+   /*
+* The flag "in_error" is used to trace EEH devices for VFs
+* in error state or not. It's set in eeh_report_error(). If
+* it's not set, eeh_report_{reset,resume}() won't be called
+* for the VF EEH device.
+*/
+   edev->in_error = false;
dev->dev.archdata.edev = NULL;
if (!(edev->pe->state & EEH_PE_KEEP))
eeh_rmv_from_parent_pe(edev);
diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
index aabba94..7815095 100644
--- a/arch/powerpc/kernel/eeh_dev.c
+++ b/arch/powerpc/kernel/eeh_dev.c
@@ -67,6 +67,7 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data)
edev->pdn = pdn;
edev->phb = phb;
INIT_LIST_HEAD(&edev->list);
+   INIT_LIST_HEAD(&edev->rmv_list);
 
return NULL;
 }
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 301be31..d1c65f5 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -34,6 +34,11 @@
 #include 
 #include 
 
+struct eeh_rmv_data {
+   struct list_head edev_list;
+   int removed;
+};
+
 /**
  * eeh_pcid_name - Retrieve name of PCI device driver
  * @pdev: PCI device
@@ -211,6 +216,7 @@ static void *eeh_report_error(void *data, void *userdata)
if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
if (*res == PCI_ERS_RESULT_NONE) *res = rc;
 
+   edev->in_error = true;
eeh_pcid_put(dev);
return NULL;
 }
@@ -282,7 +288,8 @@ static void *eeh_report_reset(void *data, void *userdata)
 
if (!driver->err_handler ||
!driver->err_handler->slot_reset ||
-   (edev->mode & EEH_DEV_NO_HANDLER)) {
+   (edev->mode & EEH_DEV_NO_HANDLER) ||
+   (!edev->in_error)) {
eeh_pcid_put(dev);
return NULL;
}
@@ -326,6 +333,7 @@ static void *eeh_report_resume(void 

Re: [PATCH v3 03/18] cxl: Define process problem state area at attach time only

2016-02-09 Thread Ian Munsie
Excerpts from Frederic Barrat's message of 2016-02-07 00:28:50 +1100:
> The cxl kernel API was defining the process problem state area during
> context initialization, making it possible to map the problem state
> area before attaching the context. This won't work on a PowerVM
> guest. So do the logical thing, like in userspace: attach first, then
> map the problem state area.
> Remove calls to cxl_assign_psn_space during init. The function is
> already called on the attach paths.

Looks good.

It might be a reasonable idea to make cxl_psa_map fail outright if it is
called on a context that has not been attached yet, like we do in the
user API, but I trust kernel devs to get this right more than userspace,
so I'm not too worried :)
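
If such a guard were ever added, it would presumably amount to something like
this sketch (field and state names assumed from the existing driver, not taken
from this patch):

        void __iomem *cxl_psa_map(struct cxl_context *ctx)
        {
                /* refuse to map the problem state area before attach */
                if (ctx->status != STARTED)
                        return NULL;

                return ioremap(ctx->psn_phys, ctx->psn_size);
        }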

Cheers,
-Ian

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 06/18] cxl: Isolate a few bare-metal-specific calls

2016-02-09 Thread Ian Munsie
Acked-by: Ian Munsie 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
