Re: [PATCH 1/1] powerpc: correct DSCR during TM context switch
Hi Sam,

Comments inline ..

On Wed, 2014-06-04 at 13:33 +1000, Sam Bobroff wrote:

Correct the DSCR SPR becoming temporarily corrupted when a task is context switched when within a transaction. It is corrected when the transaction is aborted (which will happen after a context switch) but if the task has suspended (TSUSPEND) the transaction the incorrect value can be seen.

I don't quite follow this description. How is it corrected when the transaction is aborted, and when does that usually happen? If that happens the task can't ever see the corrupted value?

To hit the suspended case, the task starts a transaction, suspends it, is then context switched out and back in, and at that point it can see the wrong value?

The problem is caused by saving a thread's DSCR after it has already been reverted to the CPU's default value:

  __switch_to() calls __switch_to_tm()
    which calls tm_reclaim_task()
    which calls tm_reclaim_thread()
    which calls tm_reclaim()
      where the DSCR is reset

Where the DSCR is set to DSCR_DEFAULT? Or now PACA_DSCR since your previous patches?

Could we instead fix the bug there by reverting to the thread's DSCR value?

  __switch_to() calls _switch
    _switch() saves the DSCR to thread.dscr

The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to().

The program below will expose the problem:

Can you drop this in tools/testing/selftests/powerpc/tm? You'll need to create that directory; you can ape the Makefile from the pmu directory, it should be fairly obvious.

See the pmu tests for how to integrate with the test harness etc., or bug me if it's not straightforward.
diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index 2737f46..3efd0e5 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -16,13 +16,15 @@ struct thread_struct;
 extern struct task_struct *_switch(struct thread_struct *prev,
 				   struct thread_struct *next);
 #ifdef CONFIG_PPC_BOOK3S_64
-static inline void save_tar(struct thread_struct *prev)
+static inline void save_early_sprs(struct thread_struct *prev)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		prev->tar = mfspr(SPRN_TAR);
+	if (cpu_has_feature(CPU_FTR_DSCR))
+		prev->dscr = mfspr(SPRN_DSCR);
 }

Are we going to end up saving more SPRs in this code? What makes the TAR and DSCR special vs everything else?

The nice thing about doing this in asm is it's nop'ed out for cpus that don't have the DSCR. What does the generated code for this look like?

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 662c6dd..a107f4a 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -432,12 +432,6 @@ BEGIN_FTR_SECTION
 	std	r24,THREAD_VRSAVE(r3)
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif /* CONFIG_ALTIVEC */
-#ifdef CONFIG_PPC64
-BEGIN_FTR_SECTION
-	mfspr	r25,SPRN_DSCR
-	std	r25,THREAD_DSCR(r3)
-END_FTR_SECTION_IFSET(CPU_FTR_DSCR)
-#endif
 	and.	r0,r0,r22
 	beq+	1f
 	andc	r22,r22,r0

cheers

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc/powernv: Fix killed EEH event
On the PowerNV platform, EEH errors are reported by IO accessors or by a poller driven by interrupt. After a PE is isolated, we won't produce another EEH event for it. The current implementation can lose an EEH event in this way:

The interrupt handler queues one special event, which drives the poller. The EEH thread hasn't yet picked up the special event. IO accessors kick in, the frozen PE is marked as isolated and an EEH event is queued to the list. The EEH thread runs because of the special event and purges all existing EEH events. However, we never produce another EEH event for the frozen PE. Eventually, the PE is left marked as isolated and we don't have an EEH event to recover it.

The patch fixes the issue by keeping EEH events for PEs that have been marked as isolated, with the help of an additional "force" argument to eeh_remove_event().

Reported-by: Rolf Brudeseth ro...@us.ibm.com
Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/eeh_event.h      |  2 +-
 arch/powerpc/kernel/eeh_driver.c          |  4 ++--
 arch/powerpc/kernel/eeh_event.c           | 21 +++-
 arch/powerpc/platforms/powernv/eeh-ioda.c |  2 +-
 4 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
index 89d5670..1e551a2 100644
--- a/arch/powerpc/include/asm/eeh_event.h
+++ b/arch/powerpc/include/asm/eeh_event.h
@@ -33,7 +33,7 @@ struct eeh_event {
 int eeh_event_init(void);
 int eeh_send_failure_event(struct eeh_pe *pe);
-void eeh_remove_event(struct eeh_pe *pe);
+void eeh_remove_event(struct eeh_pe *pe, bool force);
 void eeh_handle_event(struct eeh_pe *pe);
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 8bb40e7..420da61 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -770,7 +770,7 @@ static void eeh_handle_special_event(void)
 	eeh_serialize_lock(&flags);
 	/* Purge all events */
-	eeh_remove_event(NULL);
+	eeh_remove_event(NULL, true);
 	list_for_each_entry(hose, &hose_list, list_node) {
 		phb_pe = eeh_phb_pe_get(hose);
@@ -789,7 +789,7 @@ static void eeh_handle_special_event(void)
 	eeh_serialize_lock(&flags);
 	/* Purge all events of the PHB */
-	eeh_remove_event(pe);
+	eeh_remove_event(pe, true);
 	if (rc == EEH_NEXT_ERR_DEAD_PHB)
 		eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index 72d748b..4eefb6e 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -152,24 +152,33 @@ int eeh_send_failure_event(struct eeh_pe *pe)
 /**
  * eeh_remove_event - Remove EEH event from the queue
  * @pe: Event binding to the PE
+ * @force: Event will be removed unconditionally
  *
  * On PowerNV platform, we might have subsequent coming events
  * is part of the former one. For that case, those subsequent
  * coming events are totally duplicated and unnecessary, thus
  * they should be removed.
  */
-void eeh_remove_event(struct eeh_pe *pe)
+void eeh_remove_event(struct eeh_pe *pe, bool force)
 {
 	unsigned long flags;
 	struct eeh_event *event, *tmp;
+	/*
+	 * If we have a NULL PE passed in, we have a dead IOC
+	 * or the caller is sure it can report all existing
+	 * errors itself.
+	 *
+	 * Unless "force" is set, events whose PE has already
+	 * been marked isolated are kept on the list, to avoid
+	 * losing the event.
+	 */
 	spin_lock_irqsave(&eeh_eventlist_lock, flags);
 	list_for_each_entry_safe(event, tmp, &eeh_eventlist, list) {
-		/*
-		 * If we don't have valid PE passed in, that means
-		 * we already have event corresponding to dead IOC
-		 * and all events should be purged.
-		 */
+		if (!force && event->pe &&
+		    (event->pe->state & EEH_PE_ISOLATED))
+			continue;
+
 		if (!pe) {
 			list_del(&event->list);
 			kfree(event);
diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
index cab3e62..f1f089d 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -797,7 +797,7 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 	 * And we should keep the cached OPAL notifier event sychronized
 	 * between the kernel and firmware.
 	 */
-	eeh_remove_event(NULL);
+	eeh_remove_event(NULL, false);
 	opal_notifier_update_evt(OPAL_EVENT_PCI_ERROR, 0x0ul);
 	list_for_each_entry(hose, &hose_list,
[PATCH v2 1/2] powerpc: Allow ppc_md platform hook to override memory_block_size_bytes
The pseries platform code unconditionally overrides memory_block_size_bytes regardless of the running platform. Create a ppc_md hook so that each platform can choose what it wants to do.

Signed-off-by: Anton Blanchard an...@samba.org
---
 arch/powerpc/include/asm/machdep.h              |  3 +++
 arch/powerpc/kernel/setup_64.c                  | 10 ++
 arch/powerpc/platforms/pseries/hotplug-memory.c | 17 +++--
 arch/powerpc/platforms/pseries/pseries.h        |  2 ++
 arch/powerpc/platforms/pseries/setup.c          |  3 +++
 5 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 374abc2..f92b0b5 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -98,6 +98,9 @@ struct machdep_calls {
 	void		(*iommu_save)(void);
 	void		(*iommu_restore)(void);
 #endif
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+	unsigned long	(*memory_block_size)(void);
+#endif
 #endif /* CONFIG_PPC64 */
 	void		(*pci_dma_dev_setup)(struct pci_dev *dev);
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 90b532a..ee082d7 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -36,6 +36,7 @@
 #include <linux/lockdep.h>
 #include <linux/memblock.h>
 #include <linux/hugetlb.h>
+#include <linux/memory.h>
 #include <asm/io.h>
 #include <asm/kdump.h>
@@ -780,6 +781,15 @@ void __init setup_per_cpu_areas(void)
 }
 #endif
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+unsigned long memory_block_size_bytes(void)
+{
+	if (ppc_md.memory_block_size)
+		return ppc_md.memory_block_size();
+
+	return MIN_MEMORY_BLOCK_SIZE;
+}
+#endif
 #if defined(CONFIG_PPC_INDIRECT_PIO) || defined(CONFIG_PPC_INDIRECT_MMIO)
 struct ppc_pci_io ppc_pci_io;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 7f75c94..7995135 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -21,7 +21,7 @@
 #include <asm/prom.h>
 #include <asm/sparsemem.h>
-static unsigned long get_memblock_size(void)
+unsigned long pseries_memory_block_size(void)
 {
 	struct device_node *np;
 	unsigned int memblock_size = MIN_MEMORY_BLOCK_SIZE;
@@ -64,17 +64,6 @@ static unsigned long get_memblock_size(void)
 	return memblock_size;
 }
-/* WARNING: This is going to override the generic definition whenever
- * pseries is built-in regardless of what platform is active at boot
- * time. This is fine for now as this is the only option and it
- * should work everywhere. If not, we'll have to turn this into a
- * ppc_md. callback
- */
-unsigned long memory_block_size_bytes(void)
-{
-	return get_memblock_size();
-}
-
 #ifdef CONFIG_MEMORY_HOTREMOVE
 static int pseries_remove_memory(u64 start, u64 size)
 {
@@ -105,7 +94,7 @@ static int pseries_remove_memblock(unsigned long base, unsigned int memblock_siz
 	if (!pfn_valid(start_pfn))
 		goto out;
-	block_sz = memory_block_size_bytes();
+	block_sz = pseries_memory_block_size();
 	sections_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
 	nid = memory_add_physaddr_to_nid(base);
@@ -201,7 +190,7 @@ static int pseries_update_drconf_memory(struct of_prop_reconfig *pr)
 	u32 *p;
 	int i, rc = -EINVAL;
-	memblock_size = get_memblock_size();
+	memblock_size = pseries_memory_block_size();
 	if (!memblock_size)
 		return -EINVAL;
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 9921953..361add6 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -64,4 +64,6 @@ extern int dlpar_detach_node(struct device_node *);
 struct pci_host_bridge;
 int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
+unsigned long pseries_memory_block_size(void);
+
 #endif /* _PSERIES_PSERIES_H */
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 215c3c2..adc21a0 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -810,4 +810,7 @@
 define_machine(pseries) {
 #ifdef CONFIG_KEXEC
 	.machine_kexec = pSeries_machine_kexec,
 #endif
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+	.memory_block_size = pseries_memory_block_size,
+#endif
 };
-- 
1.9.1
[PATCH v2 2/2] powerpc/powernv: Set memory_block_size_bytes to 256MB
powerpc sets a low SECTION_SIZE_BITS to accommodate small pseries boxes. We default to 16MB memory blocks, and boxes with a lot of memory end up with enormous numbers of sysfs memory nodes. Set a more reasonable default for powernv of 256MB.

Signed-off-by: Anton Blanchard an...@samba.org
---
 arch/powerpc/platforms/powernv/setup.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 865aab4..8c16a5f 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -244,6 +244,13 @@ static void pnv_kexec_cpu_down(int crash_shutdown, int secondary)
 }
 #endif /* CONFIG_KEXEC */
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+static unsigned long pnv_memory_block_size(void)
+{
+	return 256UL * 1024 * 1024;
+}
+#endif
+
 static void __init pnv_setup_machdep_opal(void)
 {
 	ppc_md.get_boot_time = opal_get_boot_time;
@@ -326,4 +333,7 @@ define_machine(powernv) {
 #ifdef CONFIG_KEXEC
 	.kexec_cpu_down = pnv_kexec_cpu_down,
 #endif
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+	.memory_block_size = pnv_memory_block_size,
+#endif
 };
-- 
1.9.1
Re: [PATCH 2/2] powerpc/powernv: Enable POWER8 doorbell IPIs
On Mon, 2014-06-02 at 16:57 +1000, Michael Neuling wrote:

This patch enables POWER8 doorbell IPIs on powernv.

Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wake us up from nap/sleep via the LPCR PECEDH bit.

Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns.

Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/kernel/cpu_setup_power.S | 2 ++
 arch/powerpc/platforms/powernv/smp.c  | 6 ++
 arch/powerpc/sysdev/xics/icp-native.c | 9 -
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 1557e7c..4673353 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -56,6 +56,7 @@ _GLOBAL(__setup_cpu_power8)
 	li	r0,0
 	mtspr	SPRN_LPID,r0
 	mfspr	r3,SPRN_LPCR
+	ori	r3, r3, LPCR_PECEDH
 	bl	__init_LPCR
 	bl	__init_HFSCR
 	bl	__init_tlb_power8
@@ -74,6 +75,7 @@ _GLOBAL(__restore_cpu_power8)
 	li	r0,0
 	mtspr	SPRN_LPID,r0
 	mfspr	r3,SPRN_LPCR
+	ori	r3, r3, LPCR_PECEDH
 	bl	__init_LPCR
 	bl	__init_HFSCR
 	bl	__init_tlb_power8
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 0062a43..5fcfcf4 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -32,6 +32,7 @@
 #include <asm/opal.h>
 #include <asm/runlatch.h>
 #include <asm/code-patching.h>
+#include <asm/dbell.h>
 #include "powernv.h"
@@ -46,6 +47,11 @@ static void pnv_smp_setup_cpu(int cpu)
 {
 	if (cpu != boot_cpuid)
 		xics_setup_cpu();
+
+#ifdef CONFIG_PPC_DOORBELL
+	if (cpu_has_feature(CPU_FTR_DBELL))
+		doorbell_setup_this_cpu();
+#endif
 }

Where does CONFIG_PPC_DOORBELL get set? It looks like the only place you can be getting it is from CONFIG_PPC_PSERIES. If you build a POWERNV only kernel it will get disabled, which I don't think is what we want.
I think we should select it for POWERNV as well as PSERIES. That would mean it's always enabled for any build of the files you touch, so you shouldn't need the #ifdefs.

cheers
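The suggestion above amounts to a Kconfig change along these lines (a hypothetical sketch only; the surrounding options in arch/powerpc/platforms/Kconfig are abbreviated and may differ):

```kconfig
config PPC_PSERIES
	bool "IBM pSeries & new (POWER5-based) iSeries"
	select PPC_DOORBELL

config PPC_POWERNV
	bool "IBM PowerNV (Non-Virtualized) platform support"
	select PPC_DOORBELL	# proposed: keep CONFIG_PPC_DOORBELL on in a POWERNV-only build
```

With `select PPC_DOORBELL` on both platforms, any kernel that compiles the touched files has the symbol enabled, so the `#ifdef CONFIG_PPC_DOORBELL` guards become unnecessary.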
RE: [RESEND PATCH] memory: Freescale CoreNet Coherency Fabric error reporting driver
-----Original Message-----
From: Linuxppc-dev [mailto:linuxppc-dev-bounces+bharat.bhushan=freescale@lists.ozlabs.org] On Behalf Of Scott Wood
Sent: Saturday, May 31, 2014 3:58 AM
To: Greg Kroah-Hartman
Cc: linuxppc-dev@lists.ozlabs.org; linux-ker...@vger.kernel.org
Subject: [RESEND PATCH] memory: Freescale CoreNet Coherency Fabric error reporting driver

The CoreNet Coherency Fabric is part of the memory subsystem on some Freescale QorIQ chips. It can report coherency violations (e.g. due to misusing memory that is mapped noncoherent) as well as transactions that do not hit any local access window, or which hit a local access window with an invalid target ID.

Signed-off-by: Scott Wood scottw...@freescale.com
---
Resending to the proper list addresses -- sorry for the duplicate.

 arch/powerpc/configs/corenet32_smp_defconfig |   1 +
 arch/powerpc/configs/corenet64_smp_defconfig |   1 +
 drivers/memory/Kconfig                       |  10 ++
 drivers/memory/Makefile                      |   1 +
 drivers/memory/fsl-corenet-cf.c              | 246 +++
 5 files changed, 259 insertions(+)
 create mode 100644 drivers/memory/fsl-corenet-cf.c

diff --git a/arch/powerpc/configs/corenet32_smp_defconfig b/arch/powerpc/configs/corenet32_smp_defconfig
index c19ff05..0c99d7e 100644
--- a/arch/powerpc/configs/corenet32_smp_defconfig
+++ b/arch/powerpc/configs/corenet32_smp_defconfig
@@ -179,3 +179,4 @@ CONFIG_CRYPTO_SHA512=y
 CONFIG_CRYPTO_AES=y
 # CONFIG_CRYPTO_ANSI_CPRNG is not set
 CONFIG_CRYPTO_DEV_FSL_CAAM=y
+CONFIG_FSL_CORENET_CF=y
diff --git a/arch/powerpc/configs/corenet64_smp_defconfig b/arch/powerpc/configs/corenet64_smp_defconfig
index 5c7fa19..8fb616d 100644
--- a/arch/powerpc/configs/corenet64_smp_defconfig
+++ b/arch/powerpc/configs/corenet64_smp_defconfig
@@ -175,3 +175,4 @@ CONFIG_CRYPTO_SHA256=y
 CONFIG_CRYPTO_SHA512=y
 # CONFIG_CRYPTO_ANSI_CPRNG is not set
 CONFIG_CRYPTO_DEV_FSL_CAAM=y
+CONFIG_FSL_CORENET_CF=y
diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c59e9c9..fab81a1 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -61,6 +61,16 @@ config TEGRA30_MC
 	  analysis, especially for IOMMU/SMMU(System Memory Management
 	  Unit) module.
+config FSL_CORENET_CF
+	tristate "Freescale CoreNet Error Reporting"
+	depends on FSL_SOC_BOOKE
+	help
+	  Say Y for reporting of errors from the Freescale CoreNet
+	  Coherency Fabric. Errors reported include accesses to
+	  physical addresses that mapped by no local access window
+	  (LAW) or an invalid LAW, as well as bad cache state that
+	  represents a coherency violation.
+
 config FSL_IFC
 	bool
 	depends on FSL_SOC
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index 71160a2..4055c47 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_OF) += of_memory.o
 endif
 obj-$(CONFIG_TI_AEMIF)	+= ti-aemif.o
 obj-$(CONFIG_TI_EMIF)	+= emif.o
+obj-$(CONFIG_FSL_CORENET_CF) += fsl-corenet-cf.o
 obj-$(CONFIG_FSL_IFC)	+= fsl_ifc.o
 obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
 obj-$(CONFIG_TEGRA20_MC) += tegra20-mc.o
diff --git a/drivers/memory/fsl-corenet-cf.c b/drivers/memory/fsl-corenet-cf.c
new file mode 100644
index 000..a57a614
--- /dev/null
+++ b/drivers/memory/fsl-corenet-cf.c
@@ -0,0 +1,246 @@
+/*
+ * CoreNet Coherency Fabric error reporting
+ *
+ * Copyright 2014 Freescale Semiconductor Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+
+enum ccf_version {
+	CCF1,
+	CCF2,
+};
+
+struct ccf_info {
+	enum ccf_version version;
+	int err_reg_offs;
+};
+
+static const struct ccf_info ccf1_info = {
+	.version = CCF1,
+	.err_reg_offs = 0xa00,
+};
+
+static const struct ccf_info ccf2_info = {
+	.version = CCF2,
+	.err_reg_offs = 0xe40,
+};
+
+static const struct of_device_id ccf_matches[] = {
+	{
+		.compatible = "fsl,corenet1-cf",
+		.data = &ccf1_info,
+	},
+	{
+		.compatible = "fsl,corenet2-cf",
+		.data = &ccf2_info,
+	},
+	{}
+};
+
+struct ccf_err_regs {
+	u32 errdet;	/* 0x00 Error Detect Register */
+	/* 0x04 Error Enable (ccf1)/Disable (ccf2) Register */
+	u32 errdis;
+	/* 0x08 Error Interrupt Enable Register
Re: [PATCH] powerpc, xmon: Enable hardware instruction breakpoint support on POWER8
On 06/03/2014 11:33 AM, Anshuman Khandual wrote:

On 05/30/2014 07:12 PM, Aneesh Kumar K.V wrote:

Anshuman Khandual khand...@linux.vnet.ibm.com writes:

This patch enables support for hardware instruction breakpoints on POWER8 with the help of a new register called CIABR (Completed Instruction Address Breakpoint Register). With this patch, a single hardware instruction breakpoint can be added and cleared during any active xmon debug session. This hardware based instruction breakpoint mechanism works correctly along with the existing TRAP based instruction breakpoints available on xmon. Example usage as follows.

(A) Start xmon:

	$ echo x > /proc/sysrq-trigger
	SysRq : Entering xmon
	cpu 0x0: Vector: 0  at [c01f6c67f960]
	    pc: c0072078: .sysrq_handle_xmon+0x58/0x60
	    lr: c0072078: .sysrq_handle_xmon+0x58/0x60
	    sp: c01f6c67fac0
	   msr: 90009032
	  current = 0xc01f6e709ac0
	  paca    = 0xcfffa000
	   softe: 0  irq_happened: 0x00
	    pid   = 3250, comm = bash
	enter ? for help
	0:mon> b
	   type            address

(B) Set the breakpoint:

	0:mon> ls .power_pmu_add
	.power_pmu_add: c0078f50
	0:mon> bi c0078f50
	0:mon> b
	   type            address
	1 inst   c0078f50  .power_pmu_add+0x0/0x2e0
	0:mon> ls .perf_event_interrupt
	.perf_event_interrupt: c007aee0
	0:mon> bi c007aee0
	One instruction breakpoint possible with CIABR
	0:mon> x

(C) Run the workload (with the breakpoint):

	$ ./perf record ls
	cpu 0x2: Vector: d00 (Single Step) at [c01f718133a0]
	    pc: c0078f54: .power_pmu_add+0x4/0x2e0
	    lr: c0155be0: .event_sched_in+0x90/0x1d0
	    sp: c01f71813620
	   msr: 900040109032
	  current = 0xc01f6ce3
	  paca    = 0xcfffa600
	   softe: 0  irq_happened: 0x01
	    pid   = 3270, comm = ls
	std	r22,-80(r1)
	enter ? for help

(D) Clear the breakpoint:

	2:mon> bc
	All breakpoints cleared
	2:mon> x
	[ perf record: Woken up 1 times to write data ]
	[ perf record: Captured and wrote 0.002 MB perf.data (~66 samples) ]

(E) Run the workload again (without any breakpoints):

	$ ./perf record ls
	[ perf record: Woken up 1 times to write data ]
	[ perf record: Captured and wrote 0.001 MB perf.data (~61 samples) ]

Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com
---
 arch/powerpc/xmon/xmon.c | 62
 1 file changed, 58 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 3fd1d9a..f74ec83 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -48,6 +48,7 @@
 #ifdef CONFIG_PPC64
 #include <asm/hvcall.h>
 #include <asm/paca.h>
+#include <asm/plpar_wrappers.h>
 #endif
 #include "nonstdio.h"
@@ -89,6 +90,7 @@ struct bpt {
 /* Bits in bpt.enabled */
 #define BP_IABR_TE	1	/* IABR translation enabled */
 #define BP_IABR		2
+#define BP_CIABR	4
 #define BP_TRAP		8
 #define BP_DABR		0x10
@@ -97,6 +99,7 @@
 static struct bpt bpts[NBPTS];
 static struct bpt dabr;
 static struct bpt *iabr;
 static unsigned bpinstr = 0x7fe00008;	/* trap */
+static bool ciabr_used = false;		/* CIABR instruction breakpoint */
 #define BP_NUM(bp)	((bp) - bpts + 1)
@@ -269,6 +272,34 @@ static inline void cinval(void *p)
 {
 	asm volatile ("dcbi 0,%0; icbi 0,%0" : : "r" (p));
 }
+static void write_ciabr(unsigned long ciabr)
+{
+	if (cpu_has_feature(CPU_FTR_HVMODE)) {
+		mtspr(SPRN_CIABR, ciabr);
+		return;
+	}
+
+#ifdef CONFIG_PPC64
+	plapr_set_ciabr(ciabr);
+#endif
+}
+
+static void set_ciabr(unsigned long addr)
+{
+	addr &= ~CIABR_PRIV;
+	if (cpu_has_feature(CPU_FTR_HVMODE))
+		addr |= CIABR_PRIV_HYPER;
+	else
+		addr |= CIABR_PRIV_SUPER;
+	write_ciabr(addr);
+}
+
+static void clear_ciabr(void)
+{
+	if (cpu_has_feature(CPU_FTR_ARCH_207S))
+		write_ciabr(0);
+}
+
 /*
  * Disable surveillance (the service processor watchdog function)
  * while we are in xmon.
@@ -764,6 +795,9 @@ static void insert_cpu_bpts(void)
 	if (iabr && cpu_has_feature(CPU_FTR_IABR))
 		mtspr(SPRN_IABR, iabr->address
 			 | (iabr->enabled & (BP_IABR|BP_IABR_TE)));
+
+	if (iabr && cpu_has_feature(CPU_FTR_ARCH_207S))
+		set_ciabr(iabr->address);
 }

 static void remove_bpts(void)
@@ -791,6 +825,7 @@ static void remove_cpu_bpts(void)
 	hw_breakpoint_disable();
 	if (cpu_has_feature(CPU_FTR_IABR))
 		mtspr(SPRN_IABR, 0);
+	clear_ciabr();
 }

 /* Command interpreting routine */
@@ -1124,7
Re: [PATCH 2/2] powerpc/powernv: Enable POWER8 doorbell IPIs
On Wed, 2014-06-04 at 18:03 +1000, Michael Ellerman wrote:

On Mon, 2014-06-02 at 16:57 +1000, Michael Neuling wrote:

This patch enables POWER8 doorbell IPIs on powernv. Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wake us up from nap/sleep via the LPCR PECEDH bit.

Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns.

Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/kernel/cpu_setup_power.S | 2 ++
 arch/powerpc/platforms/powernv/smp.c  | 6 ++
 arch/powerpc/sysdev/xics/icp-native.c | 9 -
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 1557e7c..4673353 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -56,6 +56,7 @@ _GLOBAL(__setup_cpu_power8)
 	li	r0,0
 	mtspr	SPRN_LPID,r0
 	mfspr	r3,SPRN_LPCR
+	ori	r3, r3, LPCR_PECEDH
 	bl	__init_LPCR
 	bl	__init_HFSCR
 	bl	__init_tlb_power8
@@ -74,6 +75,7 @@ _GLOBAL(__restore_cpu_power8)
 	li	r0,0
 	mtspr	SPRN_LPID,r0
 	mfspr	r3,SPRN_LPCR
+	ori	r3, r3, LPCR_PECEDH
 	bl	__init_LPCR
 	bl	__init_HFSCR
 	bl	__init_tlb_power8
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 0062a43..5fcfcf4 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -32,6 +32,7 @@
 #include <asm/opal.h>
 #include <asm/runlatch.h>
 #include <asm/code-patching.h>
+#include <asm/dbell.h>
 #include "powernv.h"
@@ -46,6 +47,11 @@ static void pnv_smp_setup_cpu(int cpu)
 {
 	if (cpu != boot_cpuid)
 		xics_setup_cpu();
+
+#ifdef CONFIG_PPC_DOORBELL
+	if (cpu_has_feature(CPU_FTR_DBELL))
+		doorbell_setup_this_cpu();
+#endif
 }

Where does CONFIG_PPC_DOORBELL get set?

Nowhere.

It looks like the only place you can be getting it is from CONFIG_PPC_PSERIES.
If you build a POWERNV only kernel it will get disabled, which I don't think is what we want. I think we should select it for POWERNV as well as PSERIES. That would mean it's always enabled for any build of the files you touch, so you shouldn't need the #ifdefs.

Ok, I'll add and resend.

Mikey
Re: [PATCH 1/1] powerpc: correct DSCR during TM context switch
On Wed, 2014-06-04 at 17:31 +1000, Michael Ellerman wrote:

Hi Sam,

Comments inline ..

Ditto

On Wed, 2014-06-04 at 13:33 +1000, Sam Bobroff wrote:

Correct the DSCR SPR becoming temporarily corrupted when a task is context switched when within a transaction. It is corrected when the transaction is aborted (which will happen after a context switch) but if the task has suspended (TSUSPEND) the transaction the incorrect value can be seen.

I don't quite follow this description. How is it corrected when the transaction is aborted, and when does that usually happen? If that happens the task can't ever see the corrupted value?

To hit the suspended case, the task starts a transaction, suspends it, is then context switched out and back in, and at that point it can see the wrong value?

Yep, that's it, and it's corrupted until the transaction is rolled back (normally at the tresume). At the tresume it gets rolled back to the checkpointed value at tbegin and is no longer corrupt.

The problem is caused by saving a thread's DSCR after it has already been reverted to the CPU's default value:

  __switch_to() calls __switch_to_tm()
    which calls tm_reclaim_task()
    which calls tm_reclaim_thread()
    which calls tm_reclaim()
      where the DSCR is reset

Where the DSCR is set to DSCR_DEFAULT? Or now PACA_DSCR since your previous patches? Could we instead fix the bug there by reverting to the thread's DSCR value?

No, it's lost at that point as we've not saved it and it was overwritten when we did the treclaim. We really need to save it earlier, before the treclaim which will override it.

  __switch_to() calls _switch
    _switch() saves the DSCR to thread.dscr

The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to().

The program below will expose the problem:

Can you drop this in tools/testing/selftests/powerpc/tm? You'll need to create that directory; you can ape the Makefile from the pmu directory, it should be fairly obvious.
See the pmu tests for how to integrate with the test harness etc., or bug me if it's not straightforward.

diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index 2737f46..3efd0e5 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -16,13 +16,15 @@ struct thread_struct;
 extern struct task_struct *_switch(struct thread_struct *prev,
 				   struct thread_struct *next);
 #ifdef CONFIG_PPC_BOOK3S_64
-static inline void save_tar(struct thread_struct *prev)
+static inline void save_early_sprs(struct thread_struct *prev)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		prev->tar = mfspr(SPRN_TAR);
+	if (cpu_has_feature(CPU_FTR_DSCR))
+		prev->dscr = mfspr(SPRN_DSCR);
 }

Are we going to end up saving more SPRs in this code? What makes the TAR and DSCR special vs everything else?

There are only a limited set of SPRs that TM checkpoints. The full list is CR, LR, CTR, FPSCR, AMR, PPR, VRSAVE, VSCR, DSCR, and TAR.

http://www.scribd.com/doc/142877680/PowerISA-v2-07#outer_page_826

CR, LR, CTR, PPR are handled really early in the exception handler. FPSCR, VSCR are done in the FP/VMX/VSX code. AMR we don't care about. That just leaves the DSCR and the TAR for here ... and the VRSAVE.

Sam: did you have a patch to save that one early too? I think we talked about it but forgot, or did we decide that it's always broken anyway so we don't care? :-D

Mikey

The nice thing about doing this in asm is it's nop'ed out for cpus that don't have the DSCR. What does the generated code for this look like?
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 662c6dd..a107f4a 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -432,12 +432,6 @@ BEGIN_FTR_SECTION
 	std	r24,THREAD_VRSAVE(r3)
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif /* CONFIG_ALTIVEC */
-#ifdef CONFIG_PPC64
-BEGIN_FTR_SECTION
-	mfspr	r25,SPRN_DSCR
-	std	r25,THREAD_DSCR(r3)
-END_FTR_SECTION_IFSET(CPU_FTR_DSCR)
-#endif
 	and.	r0,r0,r22
 	beq+	1f
 	andc	r22,r22,r0

cheers
Re: [PATCH] PPC: BOOK3S: Disable/Enable TM looking at the ibm, pa-features device tree entry
Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com writes:

Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com writes:

Michael Neuling mi...@neuling.org writes:

Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

Runtime disable the transactional memory feature by looking at the pa-features device tree entry. This provides a mechanism to disable TM on P8 systems.

What are we actually achieving with this?

PAPR compliance :) ? Also I wanted to stop the guest kernel from doing TM related save/restore. The guest kernel already looks at the cpu feature before doing that, hence we needed a mechanism to disable the feature. Things like:

static inline void __switch_to_tm(struct task_struct *prev)
{
	if (cpu_has_feature(CPU_FTR_TM)) {
		tm_enable();
		tm_reclaim_task(prev);
	}
}

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/prom.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 668aa4791fd7..537bd7e7db0b 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -161,6 +161,11 @@ static struct ibm_pa_feature {
 	{CPU_FTR_NODSISRALIGN, 0, 0, 1, 1, 1},
 	{0, MMU_FTR_CI_LARGE_PAGE, 0, 1, 2, 0},
 	{CPU_FTR_REAL_LE, PPC_FEATURE_TRUE_LE, 5, 0, 0},
+	/*
+	 * We should use CPU_FTR_TM_COMP so that if we disable TM, it won't get
+	 * enabled via device tree
+	 */
+	{CPU_FTR_TM_COMP, 0, 0, 22, 0, 0},

What does this do to guests? Will it turn TM unavailable into an illegal instruction?

Good suggestion. I guess it should be a facility unavailable interrupt? I should also make sure __init_HFSCR only sets HFSCR_TM if the cpu feature is enabled?

I looked at this and I guess we don't need to update HFSCR, considering that the guest kernel (privileged) access to TM always happens within an if (cpu_has_feature(CPU_FTR_TM)) conditional block. Also we want to disable this per guest, and there is no easy way to tell the hypervisor to disable TM in HFSCR. BTW we already do this for guest problem state.
We do in the guest kernel: if (cpu_has_feature(CPU_FTR_TM)) regs->msr |= MSR_TM; IIUC that should result in a facility unavailable interrupt when problem state tries to access TM? I will try to run some tests with the patch and update here. This will actually result in an illegal instruction. -aneesh ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
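The gating Aneesh describes — the TM save/restore path disappears at runtime when the feature bit is cleared from the device tree — follows the usual cpu-feature-bitmap pattern. A minimal userspace sketch of that pattern (the names `cpu_features`, `FTR_TM` and `switch_to_tm` are illustrative stand-ins, not the kernel's actual implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bits; the kernel's CPU_FTR_* values differ. */
#define FTR_TM   (1ULL << 0)
#define FTR_DSCR (1ULL << 1)

/* Populated at boot, e.g. from the ibm,pa-features device tree entry.
 * Here TM starts out disabled by "firmware". */
static uint64_t cpu_features = FTR_DSCR;

static bool cpu_has_feature(uint64_t feature)
{
    return (cpu_features & feature) != 0;
}

static int tm_reclaims; /* counts how often TM state was saved */

/* Model of __switch_to_tm(): the TM save/restore work is skipped
 * entirely when the feature bit was cleared at boot. */
static void switch_to_tm(void)
{
    if (cpu_has_feature(FTR_TM))
        tm_reclaims++;
}
```

The point of the patch is exactly this: clearing one bit at boot makes every feature-gated path a no-op, with no per-call configuration needed.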
Re: [PATCH] KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
Paul Mackerras pau...@samba.org writes: On Tue, Jun 03, 2014 at 05:46:11PM +0530, Aneesh Kumar K.V wrote: We use time base for PURR and SPURR emulation with PR KVM since we are emulating a single threaded core. When using time base we need to make sure that we don't accumulate time spent in the host in the PURR and SPURR values. Mostly looks good except for this... @@ -170,6 +175,11 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu, out: preempt_enable(); +/* + * Update purr and spurr using time base + */ +vcpu->arch.purr += get_tb() - vcpu->arch.entry_tb; +vcpu->arch.spurr += get_tb() - vcpu->arch.entry_tb; You need to do those updates before the out: label. Otherwise if this function gets called with !svcpu->in_use (which can happen if CONFIG_PREEMPT is enabled) we would do these updates a second time for one guest exit. The thing is that kvmppc_copy_from_svcpu() can get called from kvmppc_core_vcpu_put_pr() if the vcpu task gets preempted on the way out from the guest before we get to the regular call of kvmppc_copy_from_svcpu(). It would then get called again when the task gets to run, but this time it does nothing because svcpu->in_use is false. Looking at the code, since we enable MSR.EE early now, we might possibly end up calling this function late in the guest exit path. That implies we may account that time (time spent after a preempt immediately following a guest exit) in purr/spurr. I guess that amount of inaccuracy is ok, because that is the best we could do here? -aneesh
[PATCH v2 2/2] powerpc/powernv: Enable POWER8 doorbell IPIs
This patch enables POWER8 doorbell IPIs on powernv. Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wake us up from nap/sleep via the LPCR PECEDH bit. Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns. Signed-off-by: Michael Neuling mi...@neuling.org --- v2: Adds select of PPC_DOORBELL for PSERIES and POWERNV as suggested by mpe. diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S index 1557e7c..4673353 100644 --- a/arch/powerpc/kernel/cpu_setup_power.S +++ b/arch/powerpc/kernel/cpu_setup_power.S @@ -56,6 +56,7 @@ _GLOBAL(__setup_cpu_power8) li r0,0 mtspr SPRN_LPID,r0 mfspr r3,SPRN_LPCR + ori r3, r3, LPCR_PECEDH bl __init_LPCR bl __init_HFSCR bl __init_tlb_power8 @@ -74,6 +75,7 @@ _GLOBAL(__restore_cpu_power8) li r0,0 mtspr SPRN_LPID,r0 mfspr r3,SPRN_LPCR + ori r3, r3, LPCR_PECEDH bl __init_LPCR bl __init_HFSCR bl __init_tlb_power8 diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig index c252ee9..45a8ed0 100644 --- a/arch/powerpc/platforms/powernv/Kconfig +++ b/arch/powerpc/platforms/powernv/Kconfig @@ -17,6 +17,7 @@ config PPC_POWERNV select CPU_FREQ_GOV_USERSPACE select CPU_FREQ_GOV_ONDEMAND select CPU_FREQ_GOV_CONSERVATIVE + select PPC_DOORBELL default y config PPC_POWERNV_RTAS diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index 0062a43..5fcfcf4 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -32,6 +32,7 @@ #include <asm/opal.h> #include <asm/runlatch.h> #include <asm/code-patching.h> +#include <asm/dbell.h> #include "powernv.h" @@ -46,6 +47,11 @@ static void pnv_smp_setup_cpu(int cpu) { if (cpu != boot_cpuid) xics_setup_cpu(); + +#ifdef CONFIG_PPC_DOORBELL + if (cpu_has_feature(CPU_FTR_DBELL)) + doorbell_setup_this_cpu(); +#endif } int
pnv_smp_kick_cpu(int nr) diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 2cb8b77..756b482 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -21,6 +21,7 @@ config PPC_PSERIES select HAVE_CONTEXT_TRACKING select HOTPLUG_CPU if SMP select ARCH_RANDOM + select PPC_DOORBELL default y config PPC_SPLPAR diff --git a/arch/powerpc/sysdev/xics/icp-native.c b/arch/powerpc/sysdev/xics/icp-native.c index 9dee470..de8d948 100644 --- a/arch/powerpc/sysdev/xics/icp-native.c +++ b/arch/powerpc/sysdev/xics/icp-native.c @@ -26,6 +26,7 @@ #include <asm/errno.h> #include <asm/xics.h> #include <asm/kvm_ppc.h> +#include <asm/dbell.h> struct icp_ipl { union { @@ -145,7 +146,13 @@ static unsigned int icp_native_get_irq(void) static void icp_native_cause_ipi(int cpu, unsigned long data) { kvmppc_set_host_ipi(cpu, 1); - icp_native_set_qirr(cpu, IPI_PRIORITY); +#ifdef CONFIG_PPC_DOORBELL + if (cpu_has_feature(CPU_FTR_DBELL) && (cpumask_test_cpu(cpu, cpu_sibling_mask(smp_processor_id())))) + doorbell_cause_ipi(cpu, data); + else +#endif + icp_native_set_qirr(cpu, IPI_PRIORITY); } void xics_wake_cpu(int cpu)
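The selection logic in icp_native_cause_ipi() — use the fast doorbell only when the target thread shares a core with the sender, otherwise fall back to the XICS interrupt controller — can be modelled in userspace. The core size and sibling test below are made-up stand-ins for cpu_sibling_mask(); only the decision structure matches the patch:

```c
#include <assert.h>
#include <stdbool.h>

#define THREADS_PER_CORE 8  /* POWER8-like SMT8, for illustration */

enum ipi_path { IPI_DOORBELL, IPI_XICS };

static bool has_doorbell = true; /* stands in for CPU_FTR_DBELL */

/* Two CPUs are siblings when they live on the same core; a stand-in
 * for cpumask_test_cpu(cpu, cpu_sibling_mask(self)). */
static bool same_core(int a, int b)
{
    return a / THREADS_PER_CORE == b / THREADS_PER_CORE;
}

/* Returns the IPI mechanism chosen for target cpu, sent from self. */
static enum ipi_path cause_ipi(int self, int cpu)
{
    if (has_doorbell && same_core(self, cpu))
        return IPI_DOORBELL; /* fast intra-core path */
    return IPI_XICS;         /* interrupt controller fallback */
}
```

The else-falls-through shape mirrors the patch: the doorbell is an optimization layered on top of a path that must always work.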
[PATCH V2] KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
We use time base for PURR and SPURR emulation with PR KVM since we are emulating a single threaded core. When using time base we need to make sure that we don't accumulate time spent in the host in the PURR and SPURR values. Also we don't need to emulate mtspr because both the registers are hypervisor resources. Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com --- Changes from V1: Move the purr/spurr update to happen only once in case of exit with preemption enabled. arch/powerpc/include/asm/kvm_book3s.h | 2 -- arch/powerpc/include/asm/kvm_host.h | 4 ++-- arch/powerpc/kvm/book3s_emulate.c | 16 arch/powerpc/kvm/book3s_pr.c | 11 +++ 4 files changed, 21 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index f52f65694527..a20cc0bbd048 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -83,8 +83,6 @@ struct kvmppc_vcpu_book3s { u64 sdr1; u64 hior; u64 msr_mask; - u64 purr_offset; - u64 spurr_offset; #ifdef CONFIG_PPC_BOOK3S_32 u32 vsid_pool[VSID_POOL_SIZE]; u32 vsid_next; diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index bb66d8b8efdf..4a58731a0a72 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -503,8 +503,8 @@ struct kvm_vcpu_arch { #ifdef CONFIG_BOOKE u32 decar; #endif - u32 tbl; - u32 tbu; + /* Time base value when we entered the guest */ + u64 entry_tb; u32 tcr; ulong tsr; /* we need to perform set/clr_bits() which requires ulong */ u32 ivor[64]; diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c index 3f295269af37..3565e775b61b 100644 --- a/arch/powerpc/kvm/book3s_emulate.c +++ b/arch/powerpc/kvm/book3s_emulate.c @@ -439,12 +439,6 @@ int kvmppc_core_emulate_mtspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val) (mfmsr() & MSR_HV)) vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32; break; - case SPRN_PURR: -
to_book3s(vcpu)->purr_offset = spr_val - get_tb(); - break; - case SPRN_SPURR: - to_book3s(vcpu)->spurr_offset = spr_val - get_tb(); - break; case SPRN_GQR0: case SPRN_GQR1: case SPRN_GQR2: @@ -572,10 +566,16 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val *spr_val = 0; break; case SPRN_PURR: - *spr_val = get_tb() + to_book3s(vcpu)->purr_offset; + /* +* On exit we would have updated purr +*/ + *spr_val = vcpu->arch.purr; break; case SPRN_SPURR: - *spr_val = get_tb() + to_book3s(vcpu)->purr_offset; + /* +* On exit we would have updated spurr +*/ + *spr_val = vcpu->arch.spurr; break; case SPRN_GQR0: case SPRN_GQR1: diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 23367a7e44c3..ff15f8e31a27 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -120,6 +120,11 @@ void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu, #ifdef CONFIG_PPC_BOOK3S_64 svcpu->shadow_fscr = vcpu->arch.shadow_fscr; #endif + /* +* Now also save the current time base value. We use this +* to find the guest purr and spurr value. +*/ + vcpu->arch.entry_tb = get_tb(); svcpu->in_use = true; } @@ -166,6 +171,12 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu, #ifdef CONFIG_PPC_BOOK3S_64 vcpu->arch.shadow_fscr = svcpu->shadow_fscr; #endif + /* +* Update purr and spurr using time base on exit. +*/ + vcpu->arch.purr += get_tb() - vcpu->arch.entry_tb; + vcpu->arch.spurr += get_tb() - vcpu->arch.entry_tb; + svcpu->in_use = false; out: -- 1.9.1
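The accounting scheme in this patch — snapshot the time base on guest entry, accumulate only the delta on exit so host time never lands in the emulated registers — reduces to a small userspace sketch. The `tb` argument stands in for get_tb(); the struct is a toy, not the kernel's kvm_vcpu_arch:

```c
#include <assert.h>
#include <stdint.h>

struct toy_vcpu {
    uint64_t purr, spurr;  /* emulated SPR values seen by the guest */
    uint64_t entry_tb;     /* time base snapshot taken at guest entry */
};

/* Model of the kvmppc_copy_to_svcpu() hunk: record when we entered. */
static void guest_entry(struct toy_vcpu *v, uint64_t tb)
{
    v->entry_tb = tb;
}

/* Model of the kvmppc_copy_from_svcpu() hunk: only guest-side ticks
 * accumulate, so time the host spends between one exit and the next
 * entry is excluded from PURR/SPURR. */
static void guest_exit(struct toy_vcpu *v, uint64_t tb)
{
    v->purr  += tb - v->entry_tb;
    v->spurr += tb - v->entry_tb;
}
```

Compare with the old offset scheme being removed: `get_tb() + purr_offset` keeps ticking while the host runs, which is exactly the inaccuracy the commit message describes.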
Re: [PATCH] KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com writes: Paul Mackerras pau...@samba.org writes: On Tue, Jun 03, 2014 at 05:46:11PM +0530, Aneesh Kumar K.V wrote: We use time base for PURR and SPURR emulation with PR KVM since we are emulating a single threaded core. When using time base we need to make sure that we don't accumulate time spent in the host in the PURR and SPURR values. Mostly looks good except for this... @@ -170,6 +175,11 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu, out: preempt_enable(); + /* +* Update purr and spurr using time base +*/ + vcpu->arch.purr += get_tb() - vcpu->arch.entry_tb; + vcpu->arch.spurr += get_tb() - vcpu->arch.entry_tb; You need to do those updates before the out: label. Otherwise if this function gets called with !svcpu->in_use (which can happen if CONFIG_PREEMPT is enabled) we would do these updates a second time for one guest exit. The thing is that kvmppc_copy_from_svcpu() can get called from kvmppc_core_vcpu_put_pr() if the vcpu task gets preempted on the way out from the guest before we get to the regular call of kvmppc_copy_from_svcpu(). It would then get called again when the task gets to run, but this time it does nothing because svcpu->in_use is false. Looking at the code, since we enable MSR.EE early now, we might possibly end up calling this function late in the guest exit path. That implies we may account that time (time spent after a preempt immediately following a guest exit) in purr/spurr. I guess that amount of inaccuracy is ok, because that is the best we could do here? Maybe not; we do call kvmppc_copy_to_svcpu from the preempt notifier callbacks. That should ensure that we get the right value. I have sent patch V2 taking care of the above review comment. -aneesh
Re: [PATCH] powerpc, kexec: Fix Processor X is stuck issue during kexec from ST mode
On Wed, Jun 04, 2014 at 01:58:40AM +0530, Srivatsa S. Bhat wrote: On 05/28/2014 07:01 PM, Vivek Goyal wrote: On Tue, May 27, 2014 at 04:25:34PM +0530, Srivatsa S. Bhat wrote: If we try to perform a kexec when the machine is in ST (Single-Threaded) mode (ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we get the following messages during boot: [0.089866] POWER8 performance monitor hardware support registered [0.089985] power8-pmu: PMAO restore workaround active. [5.095419] Processor 1 is stuck. [ 10.097933] Processor 2 is stuck. [ 15.100480] Processor 3 is stuck. [ 20.102982] Processor 4 is stuck. [ 25.105489] Processor 5 is stuck. [ 30.108005] Processor 6 is stuck. [ 35.110518] Processor 7 is stuck. [ 40.113369] Processor 9 is stuck. [ 45.115879] Processor 10 is stuck. [ 50.118389] Processor 11 is stuck. [ 55.120904] Processor 12 is stuck. [ 60.123425] Processor 13 is stuck. [ 65.125970] Processor 14 is stuck. [ 70.128495] Processor 15 is stuck. [ 75.131316] Processor 17 is stuck. Note that only the sibling threads are stuck, while the primary threads (0, 8, 16 etc) boot just fine. Looking closer at the previous step of kexec, we observe that kexec tries to wakeup (bring online) the sibling threads of all the cores, before performing kexec: [ 9464.131231] Starting new kernel [ 9464.148507] kexec: Waking offline cpu 1. [ 9464.148552] kexec: Waking offline cpu 2. [ 9464.148600] kexec: Waking offline cpu 3. [ 9464.148636] kexec: Waking offline cpu 4. [ 9464.148671] kexec: Waking offline cpu 5. [ 9464.148708] kexec: Waking offline cpu 6. [ 9464.148743] kexec: Waking offline cpu 7. [ 9464.148779] kexec: Waking offline cpu 9. [ 9464.148815] kexec: Waking offline cpu 10. [ 9464.148851] kexec: Waking offline cpu 11. [ 9464.148887] kexec: Waking offline cpu 12. [ 9464.148922] kexec: Waking offline cpu 13. [ 9464.148958] kexec: Waking offline cpu 14. [ 9464.148994] kexec: Waking offline cpu 15. [ 9464.149030] kexec: Waking offline cpu 17. 
Instrumenting this piece of code revealed that the cpu_up() operation actually fails with -EBUSY. Thus, only the primary threads of all the cores are online during kexec, and hence this is a sure-shot recipe for disaster, as explained in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec), as well as in the comment above wake_offline_cpus(). It turns out that cpu_up() was returning -EBUSY because the variable 'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done by migrate_to_reboot_cpu() inside kernel_kexec(). Now, migrate_to_reboot_cpu() was originally written with the assumption that any further code will not need to perform CPU hotplug, since we are anyway in the reboot path. However, kexec is clearly not such a case, since we depend on onlining CPUs, at least on powerpc. So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the kexec path, to fix this regression in kexec on powerpc. Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we can catch such issues more easily in the future. Fixes: c97102ba963 (kexec: migrate to reboot cpu) Cc: sta...@vger.kernel.org Signed-off-by: Srivatsa S.
Bhat srivatsa.b...@linux.vnet.ibm.com --- arch/powerpc/kernel/machine_kexec_64.c |2 +- kernel/kexec.c |8 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c index 59d229a..879b3aa 100644 --- a/arch/powerpc/kernel/machine_kexec_64.c +++ b/arch/powerpc/kernel/machine_kexec_64.c @@ -237,7 +237,7 @@ static void wake_offline_cpus(void) if (!cpu_online(cpu)) { printk(KERN_INFO "kexec: Waking offline cpu %d.\n", cpu); - cpu_up(cpu); + WARN_ON(cpu_up(cpu)); } } } diff --git a/kernel/kexec.c b/kernel/kexec.c index c8380ad..28c5706 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -1683,6 +1683,14 @@ int kernel_kexec(void) kexec_in_progress = true; kernel_restart_prepare(NULL); migrate_to_reboot_cpu(); + + /* + * migrate_to_reboot_cpu() disables CPU hotplug assuming that + * no further code needs to use CPU hotplug (which is true in + * the reboot case). However, the kexec path depends on using + * CPU hotplug again; so re-enable it here. + */ + cpu_hotplug_enable(); printk(KERN_EMERG "Starting new kernel\n"); machine_shutdown(); After migrate_to_reboot_cpu(), we are calling machine_shutdown() which calls disable_nonboot_cpus() and which in
Re: [PATCH] powerpc, kexec: Fix Processor X is stuck issue during kexec from ST mode
On Wed, Jun 04, 2014 at 08:09:25AM +1000, Benjamin Herrenschmidt wrote: On Wed, 2014-06-04 at 01:58 +0530, Srivatsa S. Bhat wrote: Yep, that makes sense. But unfortunately I don't have enough insight into why exactly powerpc has to online the CPUs before doing a kexec. I just know from the commit log and the comment mentioned above (and from my own experiments) that the CPUs will get stuck if they were offline. Perhaps somebody more knowledgeable can explain this in detail and suggest a proper long-term solution. Matt, Ben, any thoughts on this? The problem is with our soft offline which we do on some platforms. When we offline we don't actually send the CPUs back to firmware or anything like that. We put them into a very low power loop inside Linux. The new kernel has no way to extract them from that loop. So we must re-online them before we kexec so they can be passed to the new kernel normally (or returned to firmware like we do on powernv). Srivatsa, Looks like your patch has been merged. I don't like the following change in arch independent code. /* * migrate_to_reboot_cpu() disables CPU hotplug assuming that * no further code needs to use CPU hotplug (which is true in * the reboot case). However, the kexec path depends on using * CPU hotplug again; so re-enable it here. */ cpu_hotplug_enable(); As it is a very powerpc-specific requirement, can you enable hotplug in powerpc arch dependent code as a short term solution. Ideally one needs to fix the requirement of onlining all cpus in powerpc as a long term solution and then get rid of the hotplug enable call. Thanks Vivek
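The failure mode this thread is debugging — cpu_up() returning -EBUSY because migrate_to_reboot_cpu() had set cpu_hotplug_disabled — can be captured with a small state-machine sketch. This is a deliberate simplification: the real kernel paths take locks and hold the hotplug lock, and the names below model only the flag interaction:

```c
#include <assert.h>

#define EBUSY 16  /* value from errno-base.h, redefined here locally */

static int cpu_hotplug_disabled;
static int cpus_onlined;

/* Model of cpu_up(): refuses to online anything while hotplug is off. */
static int cpu_up(void)
{
    if (cpu_hotplug_disabled)
        return -EBUSY;      /* what the kexec wake_offline_cpus() path hit */
    cpus_onlined++;
    return 0;
}

/* migrate_to_reboot_cpu() pins the reboot task and, as a side effect,
 * disables hotplug for the remainder of the reboot path. */
static void migrate_to_reboot_cpu(void) { cpu_hotplug_disabled = 1; }

/* The fix: kexec re-enables hotplug before waking the offline threads. */
static void cpu_hotplug_enable(void)    { cpu_hotplug_disabled = 0; }
```

The ordering is the whole bug: the flag was invisible at the call site, so the failing cpu_up() was silent — hence the WARN_ON() added in the powerpc patch.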
[PATCH] powerpc/mpc85xx: fix fsl/p2041-post.dtsi clockgen mux2
The mux2 node is missing the clock-output-names property that is required by the clk-ppc-corenet driver. Signed-off-by: Valentin Longchamp valentin.longch...@keymile.com --- arch/powerpc/boot/dts/fsl/p2041si-post.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi index e2987a3..45ce43c 100644 --- a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi @@ -358,6 +358,7 @@ compatible = "fsl,qoriq-core-mux-1.0"; clocks = <&pll0 0>, <&pll0 1>, <&pll1 0>, <&pll1 1>; clock-names = "pll0", "pll0-div2", "pll1", "pll1-div2"; + clock-output-names = "cmux2"; }; mux3: mux3@60 { -- 1.8.0.1
Re: [RESEND PATCH] memory: Freescale CoreNet Coherency Fabric error reporting driver
On Wed, 2014-06-04 at 03:17 -0500, Bhushan Bharat-R65777 wrote: +struct ccf_err_regs { + u32 errdet; /* 0x00 Error Detect Register */ + /* 0x04 Error Enable (ccf1)/Disable (ccf2) Register */ + u32 errdis; + /* 0x08 Error Interrupt Enable Register (ccf2 only) */ + u32 errinten; + u32 cecar; /* 0x0c Error Capture Attribute Register */ + u32 cecadrh; /* 0x10 Error Capture Address High */ s/cecadrh/cecaddrh/g This way we will be consistent with the Reference manual. It's cecadrh in ccf1 and cecaddrh in ccf2. I suppose I should use the latter since errdet/errdis/errinten are the ccf2 names. + u32 cecadrl; /* 0x14 Error Capture Address Low */ s/cecadrl/cecaddrl/g + u32 cecar2; /* 0x18 Error Capture Attribute Register 2 */ +}; + +/* LAE/CV also valid for errdis and errinten */ +#define ERRDET_LAE (1 << 0) /* Local Access Error */ +#define ERRDET_CV (1 << 1) /* Coherency Violation */ +#define ERRDET_CTYPE_SHIFT 26 /* Capture Type (ccf2 only) */ +#define ERRDET_CTYPE_MASK (0x3f << ERRDET_CTYPE_SHIFT) Should not this be (0x1f << ERRDET_CTYPE_SHIFT) Yes, thanks for catching that. +#define ERRDET_CAP (1 << 31) /* Capture Valid (ccf2 only) */ + +#define CECAR_VAL (1 << 0) /* Valid (ccf1 only) */ +#define CECAR_UVT (1 << 15) /* Unavailable target ID (ccf1) */ +#define CECAR_SRCID_SHIFT_CCF1 24 +#define CECAR_SRCID_MASK_CCF1 (0xff << CECAR_SRCID_SHIFT_CCF1) +#define CECAR_SRCID_SHIFT_CCF2 18 +#define CECAR_SRCID_MASK_CCF2 (0xff << CECAR_SRCID_SHIFT_CCF2) + +#define CECADRH_ADDRH 0xf On ccf2 this is 0xff. OK. I think we can get away with using 0xff on both. +static int ccf_remove(struct platform_device *pdev) { + struct ccf_private *ccf = dev_get_drvdata(&pdev->dev); + + switch (ccf->info->version) { + case CCF1: + iowrite32be(0, &ccf->err_regs->errdis); + break; + + case CCF2: + iowrite32be(0, &ccf->err_regs->errinten); Do you think it is the same to disable detection bits in ccf->err_regs->errdis? Disabling the interrupt is what we're aiming for here, but ccf1 doesn't provide a way to do that separate from disabling detection.
-Scott
RE: [RESEND PATCH] memory: Freescale CoreNet Coherency Fabric error reporting driver
-Original Message- From: Wood Scott-B07421 Sent: Wednesday, June 04, 2014 10:12 PM To: Bhushan Bharat-R65777 Cc: Greg Kroah-Hartman; linuxppc-dev@lists.ozlabs.org; linux- ker...@vger.kernel.org Subject: Re: [RESEND PATCH] memory: Freescale CoreNet Coherency Fabric error reporting driver On Wed, 2014-06-04 at 03:17 -0500, Bhushan Bharat-R65777 wrote: +struct ccf_err_regs { + u32 errdet; /* 0x00 Error Detect Register */ + /* 0x04 Error Enable (ccf1)/Disable (ccf2) Register */ + u32 errdis; + /* 0x08 Error Interrupt Enable Register (ccf2 only) */ + u32 errinten; + u32 cecar; /* 0x0c Error Capture Attribute Register */ + u32 cecadrh; /* 0x10 Error Capture Address High */ s/cecadrh/cecaddrh/g This way we will be consistent with the Reference manual. It's cecadrh in ccf1 and cecaddrh in ccf2. I suppose I should use the latter since errdet/errdis/errinten are the ccf2 names. + u32 cecadrl; /* 0x14 Error Capture Address Low */ s/cecadrl/cecaddrl/g + u32 cecar2; /* 0x18 Error Capture Attribute Register 2 */ +}; + +/* LAE/CV also valid for errdis and errinten */ +#define ERRDET_LAE (1 << 0) /* Local Access Error */ +#define ERRDET_CV (1 << 1) /* Coherency Violation */ +#define ERRDET_CTYPE_SHIFT 26 /* Capture Type (ccf2 only) */ +#define ERRDET_CTYPE_MASK (0x3f << ERRDET_CTYPE_SHIFT) Should not this be (0x1f << ERRDET_CTYPE_SHIFT) Yes, thanks for catching that. +#define ERRDET_CAP (1 << 31) /* Capture Valid (ccf2 only) */ + +#define CECAR_VAL (1 << 0) /* Valid (ccf1 only) */ +#define CECAR_UVT (1 << 15) /* Unavailable target ID (ccf1) */ +#define CECAR_SRCID_SHIFT_CCF1 24 +#define CECAR_SRCID_MASK_CCF1 (0xff << CECAR_SRCID_SHIFT_CCF1) +#define CECAR_SRCID_SHIFT_CCF2 18 +#define CECAR_SRCID_MASK_CCF2 (0xff << CECAR_SRCID_SHIFT_CCF2) + +#define CECADRH_ADDRH 0xf On ccf2 this is 0xff. OK. I think we can get away with using 0xff on both.
+static int ccf_remove(struct platform_device *pdev) { + struct ccf_private *ccf = dev_get_drvdata(&pdev->dev); + + switch (ccf->info->version) { + case CCF1: + iowrite32be(0, &ccf->err_regs->errdis); + break; + + case CCF2: + iowrite32be(0, &ccf->err_regs->errinten); Do you think it is the same to disable detection bits in ccf->err_regs->errdis? Disabling the interrupt is what we're aiming for here, but ccf1 doesn't provide a way to do that separate from disabling detection. What I wanted to say is: do we also need to disable detection (set the ERRDET_LAE | ERRDET_CV bits in errdis) apart from clearing errinten on ccf2? Thanks -Bharat -Scott
Re: [RESEND PATCH] memory: Freescale CoreNet Coherency Fabric error reporting driver
On Wed, 2014-06-04 at 12:04 -0500, Bhushan Bharat-R65777 wrote: -Original Message- From: Wood Scott-B07421 Sent: Wednesday, June 04, 2014 10:12 PM To: Bhushan Bharat-R65777 Cc: Greg Kroah-Hartman; linuxppc-dev@lists.ozlabs.org; linux- ker...@vger.kernel.org Subject: Re: [RESEND PATCH] memory: Freescale CoreNet Coherency Fabric error reporting driver On Wed, 2014-06-04 at 03:17 -0500, Bhushan Bharat-R65777 wrote: +static int ccf_remove(struct platform_device *pdev) { + struct ccf_private *ccf = dev_get_drvdata(&pdev->dev); + + switch (ccf->info->version) { + case CCF1: + iowrite32be(0, &ccf->err_regs->errdis); + break; + + case CCF2: + iowrite32be(0, &ccf->err_regs->errinten); Do you think it is the same to disable detection bits in ccf->err_regs->errdis? Disabling the interrupt is what we're aiming for here, but ccf1 doesn't provide a way to do that separate from disabling detection. What I wanted to say is: do we also need to disable detection (set the ERRDET_LAE | ERRDET_CV bits in errdis) apart from clearing errinten on ccf2? I don't think we need to. You could argue that we should for consistency, though I think there's value in errors continuing to be detected even without the driver (e.g. can dump the registers in a debugger). -Scott
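The shift/mask pairs being reviewed follow the standard extract-a-field idiom: mask off the field, then shift it down. A userspace sketch using the driver's CCF2 source-ID field as the example (the register value fed in below is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Field definitions as in the driver under review (corrected form). */
#define CECAR_SRCID_SHIFT_CCF2 18
#define CECAR_SRCID_MASK_CCF2  (0xffu << CECAR_SRCID_SHIFT_CCF2)

/* Extract the 8-bit source ID from a captured CECAR value. */
static uint32_t cecar_srcid_ccf2(uint32_t cecar)
{
    return (cecar & CECAR_SRCID_MASK_CCF2) >> CECAR_SRCID_SHIFT_CCF2;
}
```

This is also why Bharat's mask-width catch matters: a mask one bit too wide (`0x3f` for a 5-bit field) would silently fold a neighbouring bit into the extracted value.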
[PATCH] cpufreq: ppc-corenet-cpu-freq: do_div use quotient
6712d2931933ada259b82f06c03a855b19937074 (cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost error) used the remainder from do_div instead of the quotient. Fix that and add one to ensure the minimum is met. Signed-off-by: Ed Swarthout ed.swarth...@freescale.com --- drivers/cpufreq/ppc-corenet-cpufreq.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/ppc-corenet-cpufreq.c b/drivers/cpufreq/ppc-corenet-cpufreq.c index 0af618a..3607070 100644 --- a/drivers/cpufreq/ppc-corenet-cpufreq.c +++ b/drivers/cpufreq/ppc-corenet-cpufreq.c @@ -138,7 +138,7 @@ static int corenet_cpufreq_cpu_init(struct cpufreq_policy *policy) struct cpufreq_frequency_table *table; struct cpu_data *data; unsigned int cpu = policy->cpu; - u64 transition_latency_hz; + u64 u64temp; np = of_get_cpu_node(cpu, NULL); if (!np) @@ -206,9 +206,10 @@ static int corenet_cpufreq_cpu_init(struct cpufreq_policy *policy) for_each_cpu(i, per_cpu(cpu_mask, cpu)) per_cpu(cpu_data, i) = data; - transition_latency_hz = 12ULL * NSEC_PER_SEC; - policy->cpuinfo.transition_latency = - do_div(transition_latency_hz, fsl_get_sys_freq()); + /* Minimum transition latency is 12 platform clocks */ + u64temp = 12ULL * NSEC_PER_SEC; + do_div(u64temp, fsl_get_sys_freq()); + policy->cpuinfo.transition_latency = u64temp + 1; of_node_put(np); -- 1.9.1
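The bug being fixed hinges on do_div()'s slightly surprising contract: it divides the 64-bit dividend in place, leaving the quotient in the variable, and *returns* the remainder. The buggy commit assigned the return value. A userspace model of that contract (assumed to match the asm-generic definition; the latency helper below is a sketch of the fixed code, not the driver itself):

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of the kernel's do_div() semantics: *n becomes the
 * quotient, and the remainder is returned. */
static uint32_t model_do_div(uint64_t *n, uint32_t base)
{
    uint32_t rem = (uint32_t)(*n % base);
    *n /= base;
    return rem;
}

/* 12 platform clocks expressed in nanoseconds, as in the patch. */
static uint32_t transition_latency_ns(uint32_t sys_freq_hz)
{
    uint64_t tmp = 12ULL * 1000000000ULL; /* 12 * NSEC_PER_SEC */
    model_do_div(&tmp, sys_freq_hz);      /* quotient left in tmp */
    return (uint32_t)tmp + 1;             /* round up to meet the minimum */
}
```

With the old code, `transition_latency` would have been the remainder (often 0), i.e. cpufreq was told transitions are effectively free.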
Re: [PATCH V2] KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
On 04.06.14 13:17, Aneesh Kumar K.V wrote: We use time base for PURR and SPURR emulation with PR KVM since we are emulating a single threaded core. When using time base we need to make sure that we don't accumulate time spent in the host in PURR and SPURR value. Also we don't need to emulate mtspr because both the registers are hypervisor resource. Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Thanks, applied to kvm-ppc-queue. Alex
[PATCH] powerpc: exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC()
__clear_user and copy_page load from the TOC and are also exported to modules. This means we have to use _GLOBAL_TOC() so that we create the global entry point that sets up the TOC. Signed-off-by: Anton Blanchard an...@samba.org --- Index: b/arch/powerpc/lib/copypage_64.S === --- a/arch/powerpc/lib/copypage_64.S +++ b/arch/powerpc/lib/copypage_64.S @@ -16,7 +16,7 @@ PPC64_CACHES: .tc ppc64_caches[TC],ppc64_caches .section .text -_GLOBAL(copy_page) +_GLOBAL_TOC(copy_page) BEGIN_FTR_SECTION lis r5,PAGE_SIZE@h FTR_SECTION_ELSE Index: b/arch/powerpc/lib/string_64.S === --- a/arch/powerpc/lib/string_64.S +++ b/arch/powerpc/lib/string_64.S @@ -77,7 +77,7 @@ err3; stb r0,0(r3) mr r3,r4 blr -_GLOBAL(__clear_user) +_GLOBAL_TOC(__clear_user) cmpdi r4,32 neg r6,r3 li r0,0
Re: [RESEND PATCH] slub: search partial list on numa_mem_id(), instead of numa_node_id()
On Wed, 21 May 2014, Joonsoo Kim wrote: Currently, if the allocation constraint node is NUMA_NO_NODE, we search for a partial slab on the numa_node_id() node. This doesn't work properly on a system with a memoryless node, since it can have no memory on that node and therefore no partial slab on that node. On such a node, page allocation always falls back to numa_mem_id() first. So searching for a partial slab on numa_mem_id() in that case is the proper solution for the memoryless node case. Acked-by: Nishanth Aravamudan n...@linux.vnet.ibm.com Acked-by: David Rientjes rient...@google.com Acked-by: Christoph Lameter c...@linux.com Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/slub.c b/mm/slub.c index 545a170..cc1f995 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1698,7 +1698,7 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node, struct kmem_cache_cpu *c) { void *object; - int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node; + int searchnode = (node == NUMA_NO_NODE) ? numa_mem_id() : node; object = get_partial_node(s, get_node(s, searchnode), c, flags); if (object || node != NUMA_NO_NODE) Andrew, can you merge this please? It's still not in linux-next.
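The one-line change picks the search node the same way the page allocator falls back. A sketch with a made-up two-node topology where node 1 is memoryless; `numa_node_id` and `numa_mem_id` here are toy stand-ins for the kernel's per-cpu accessors:

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

static int current_node = 1;  /* this cpu sits on memoryless node 1 */

/* Node the cpu belongs to topologically. */
static int numa_node_id(void) { return current_node; }

/* Nearest node that actually has memory; in this toy topology the
 * memoryless node 1 falls back to node 0. */
static int numa_mem_id(void) { return current_node == 1 ? 0 : current_node; }

/* The fixed logic from get_partial(): honour an explicit node request,
 * otherwise search where this cpu's allocations actually land. */
static int searchnode(int node)
{
    return (node == NUMA_NO_NODE) ? numa_mem_id() : node;
}
```

With the old `numa_node_id()` form, a cpu on node 1 would search node 1's (always empty) partial lists and allocate a fresh slab every time.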
Re: [PATCH 1/1] powerpc: correct DSCR during TM context switch
On 04/06/14 20:03, Michael Neuling wrote: On Wed, 2014-06-04 at 17:31 +1000, Michael Ellerman wrote: Hi Sam, Comments inline .. Ditto, responses inline... On Wed, 2014-06-04 at 13:33 +1000, Sam Bobroff wrote: Correct the DSCR SPR becoming temporarily corrupted when a task is context switched when within a transaction. It is corrected when the transaction is aborted (which will happen after a context switch) but if the task has suspended (TSUSPEND) the transaction the incorrect value can be seen. I don't quite follow this description. How is it corrected when the transaction is aborted, and when does that usually happen? If that happens the task can't ever see the corrupted value? To hit the suspended case, the task starts a transaction, suspends it, is then context switched out and back in, and at that point it can see the wrong value? Yep, that's it and it's corrupted until the transaction is rolled back (normally at the tresume). At the tresume it gets rolled back to the checkpointed value at tbegin and is no longer corrupt. I'll re-work the explanation to be clearer about how it becomes corrupt and how it is corrected. The problem is caused by saving a thread's DSCR after it has already been reverted to the CPU's default value: No, it's lost at that point as we've not saved it and it was overwritten when we did the treclaim. __switch_to() calls __switch_to_tm() which calls tm_reclaim_task() which calls tm_reclaim_thread() which calls tm_reclaim() where the DSCR is reset Where the DSCR is set to DSCR_DEFAULT ? Or now PACA_DSCR since your previous patches? Could we instead fix the bug there by reverting to the thread's DSCR value? We really need to save it earlier, before the treclaim which will override it. I'll try to improve this explanation as well. __switch_to() calls _switch; _switch() saves the DSCR to thread.dscr. The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to().
The program below will expose the problem: Can you drop this in tools/testing/selftests/powerpc/tm ? You'll need to create that directory, you can ape the Makefile from the pmu directory, it should be fairly obvious. See the pmu tests for how to integrate with the test harness etc., or bug me if it's not straight forward. Will do :-) diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h index 2737f46..3efd0e5 100644 --- a/arch/powerpc/include/asm/switch_to.h +++ b/arch/powerpc/include/asm/switch_to.h @@ -16,13 +16,15 @@ struct thread_struct; extern struct task_struct *_switch(struct thread_struct *prev, struct thread_struct *next); #ifdef CONFIG_PPC_BOOK3S_64 -static inline void save_tar(struct thread_struct *prev) +static inline void save_early_sprs(struct thread_struct *prev) { if (cpu_has_feature(CPU_FTR_ARCH_207S)) prev->tar = mfspr(SPRN_TAR); + if (cpu_has_feature(CPU_FTR_DSCR)) + prev->dscr = mfspr(SPRN_DSCR); } Are we going to end up saving more SPRs in this code? What makes the TAR & DSCR special vs everything else? There are only a limited set of SPRs that TM checkpoints. The full list is CR, LR, CTR, FPSCR, AMR, PPR, VRSAVE, VSCR, DSCR, and TAR. http://www.scribd.com/doc/142877680/PowerISA-v2-07#outer_page_826 CR, LR, CTR, PPR are handled really early in the exception handler. FPSCR, VSCR are done in the FP/VMX/VSX code. AMR we don't care about. That just leaves the DSCR and the TAR for here ... and the VRSAVE. Sam: did you have a patch to save that one early too? I think we talked about it but forgot, or did we decide that it's always broken anyway so we don't care? :-D I thought we'd decided that VRSAVE was already probably broken ;-) I haven't tested VRSAVE yet so we don't know if it's actually getting corrupted in this situation (although it seems likely), and from a quick look at the code it's not being treated like DSCR or TAR at the moment so I would need to investigate it separately.
Mikey The nice thing about doing this in asm is it's nop'ed out for cpus that don't have the DSCR. What does the generated code for this look like? diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 662c6dd..a107f4a 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -432,12 +432,6 @@ BEGIN_FTR_SECTION std r24,THREAD_VRSAVE(r3) END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC) #endif /* CONFIG_ALTIVEC */ -#ifdef CONFIG_PPC64 -BEGIN_FTR_SECTION - mfspr r25,SPRN_DSCR - std r25,THREAD_DSCR(r3) -END_FTR_SECTION_IFSET(CPU_FTR_DSCR) -#endif and. r0,r0,r22 beq+ 1f andc r22,r22,r0 I wondered a little about this as well. The C code calls this function: static inline int cpu_has_feature(unsigned long feature) { return (CPU_FTRS_ALWAYS
Re: [PATCH] powerpc: Remove platforms/wsp and associated pieces
On Mon, 2014-06-02 at 10:04 +0200, Paul Bolle wrote: On Mon, 2014-06-02 at 11:20 +1000, Michael Ellerman wrote: __attribute__ ((unused)) WSP is the last user of CONFIG_PPC_A2, so we remove that as well. Although CONFIG_PPC_ICSWX still exists, it's no longer selectable for any Book3E platform, so we can remove the code in mmu-book3e.h that depended on it. Signed-off-by: Michael Ellerman m...@ellerman.id.au --- arch/powerpc/Kconfig.debug |5 - arch/powerpc/configs/chroma_defconfig | 307 - arch/powerpc/include/asm/mmu-book3e.h |4 - arch/powerpc/include/asm/reg_a2.h |9 - arch/powerpc/include/asm/wsp.h | 14 - arch/powerpc/kernel/Makefile |1 - arch/powerpc/kernel/cpu_setup_a2.S | 120 arch/powerpc/kernel/cputable.c | 38 -- arch/powerpc/kernel/exceptions-64e.S | 16 - arch/powerpc/kernel/udbg.c |2 - arch/powerpc/kernel/udbg_16550.c | 11 - arch/powerpc/platforms/Kconfig |1 - arch/powerpc/platforms/Kconfig.cputype |6 +- arch/powerpc/platforms/Makefile|1 - arch/powerpc/platforms/wsp/Kconfig | 30 - arch/powerpc/platforms/wsp/Makefile| 10 - arch/powerpc/platforms/wsp/chroma.c| 56 -- arch/powerpc/platforms/wsp/h8.c| 135 arch/powerpc/platforms/wsp/ics.c | 762 - arch/powerpc/platforms/wsp/ics.h | 25 - arch/powerpc/platforms/wsp/msi.c | 102 --- arch/powerpc/platforms/wsp/msi.h | 19 - arch/powerpc/platforms/wsp/opb_pic.c | 321 - arch/powerpc/platforms/wsp/psr2.c | 67 -- arch/powerpc/platforms/wsp/scom_smp.c | 434 This (trivially) conflicts with commit 2751b628c97e (powerpc: Fix SMP issues with ppc64le ABIv2) in next-20140530. Ah thanks. I'll assume Ben can handle it unless he yells at me. For what it's worth, according to my scripts this doesn't introduce Kconfig related regressions (eg, you've removed all references to the Kconfig symbols that were removed). And, of course, that puzzling check for CONFIG_WSP_DD1_WORKAROUND_DD1_TCE_BUGS is now gone. Thanks for checking. cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/2] powerpc/cpuidle: Only clear LPCR decrementer wakeup bit on fast sleep entry
On 06/02/2014 12:27 PM, Michael Neuling wrote: Currently when entering fastsleep we clear all LPCR PECE bits. This patch changes it to only clear the decrementer bit (ie. PECE1), which is the only bit we really need to clear here. This is needed if we want to set other wakeup causes like the PECEDH bit so we can use hypervisor doorbells on powernv. Signed-off-by: Michael Neuling mi...@neuling.org --- drivers/cpuidle/cpuidle-powernv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c index 719f6fb..7f7798e 100644 --- a/drivers/cpuidle/cpuidle-powernv.c +++ b/drivers/cpuidle/cpuidle-powernv.c @@ -73,7 +73,7 @@ static int fastsleep_loop(struct cpuidle_device *dev, return index; new_lpcr = old_lpcr; - new_lpcr &= ~(LPCR_MER | LPCR_PECE); /* lpcr[mer] must be 0 */ + new_lpcr &= ~(LPCR_MER | LPCR_PECE1); /* lpcr[mer] must be 0 */ /* exit powersave upon external interrupt, but not decrementer * interrupt. You might want to remove this comment and instead simply add /* Do not exit powersave upon decrementer interrupt */ Besides this, the following line, which sets the wakeup-from-external-interrupt bit, can be removed as well, since we are not clearing it anyway? new_lpcr |= LPCR_PECE0 ^^^ Regards Preeti U Murthy ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/6] powerpc, powernv, CPU hotplug: Put offline CPUs in Fast-Sleep instead of Nap
On Wed, May 28, 2014 at 10:08:56AM +0530, Preeti U Murthy wrote: From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com The offline cpus are put to fast sleep if the idle state is discovered in the device tree. This is to gain maximum powersavings in the offline state. ... while (!generic_check_cpu_restart(cpu)) { ppc64_runlatch_off(); - power7_nap(); + + /* If sleep is supported, go to sleep, instead of nap */ + if (idle_states & IDLE_USE_SLEEP) + power7_sleep(); + else + power7_nap(); + ppc64_runlatch_on(); if (!generic_check_cpu_restart(cpu)) { DBG("CPU%d Unexpected exit while offline !\n", cpu); What is the latency for waking up from fast sleep state? I'm concerned this will increase the latency for entering KVM guests. Paul. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc/powernv: Pass buffer size to OPAL validate flash call
We pass the actual buffer size to the opal_validate_flash() OPAL API call, and on return it contains the output buffer size. Commit cc146d1d (Fix little endian issues) failed to set the size param before making the OPAL call, so firmware image validation fails. This patch sets the size variable before making the OPAL call. Signed-off-by: Vasant Hegde hegdevas...@linux.vnet.ibm.com Tested-by: Thomas Falcon tlfal...@linux.vnet.ibm.com --- arch/powerpc/platforms/powernv/opal-flash.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c index dc487ff..4e541e6 100644 --- a/arch/powerpc/platforms/powernv/opal-flash.c +++ b/arch/powerpc/platforms/powernv/opal-flash.c @@ -130,7 +130,8 @@ static inline void opal_flash_validate(void) { long ret; void *buf = validate_flash_data.buf; - __be32 size, result; + __be32 size = cpu_to_be32(validate_flash_data.buf_size); + __be32 result; ret = opal_validate_flash(__pa(buf), &size, &result); ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev