Re: [PATCH v2 2/2] leds/powernv: Add driver for PowerNV platform
On Fri, 24 Apr 2015 14:18:30 +1000
Stewart Smith stew...@linux.vnet.ibm.com wrote:

> Jacek Anaszewski j.anaszewsk...@gmail.com writes:
> >> These device tree comes from out firmware ... which is immutable .
> >
> > How the firmware is related to kernel? These bindings are for kernel,
> > not for the firmware. DT bindings are compiled to *.dtb file which is
> > concatenated with zImage. During system boot device drivers are matched
> > with DT bindings through 'compatible' property.
> >
> > A driver should have single matching DT node, i.e. no other driver
> > can probe with the same DT node. This implies that the node should
> > contain only the properties required for configuring the related
> > device.
>
> For OPAL firmware on POWER, firmware hands kernel a flattened device
> tree of the machine it's booting on. It's not added to kernel as the
> kernels aren't board specific - they're generic.

Is the DT node we are discussing used by some other drivers than the
LED class driver? Or is it required in this form by other components
of your platform?

> https://github.com/open-power/skiboot/ is the firmware that generates
> the device tree for booting under OPAL.

-- 
Best Regards,
Jacek Anaszewski
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/2] pci-phb: check for the 32-bit overflow
On Fri, 24 Apr 2015 09:22:33 +0530 Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote: Hi Thomas, Thomas Huth th...@redhat.com writes: Am Wed, 22 Apr 2015 16:27:19 +0530 schrieb Nikunj A Dadhania nik...@linux.vnet.ibm.com: With the addition of 64-bit BARS and increase in the mmio address space, the code was hitting this limit. The memory of pci devices across the bridges were not accessible due to which the drivers failed. Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com --- board-qemu/slof/pci-phb.fs | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/board-qemu/slof/pci-phb.fs b/board-qemu/slof/pci-phb.fs index 529772f..e307d95 100644 --- a/board-qemu/slof/pci-phb.fs +++ b/board-qemu/slof/pci-phb.fs @@ -258,7 +258,8 @@ setup-puid decode-64 2 / dup r\ Decode and calc size/2 pci-next-mem @ + dup pci-max-mem ! \ and calc max mem address Could pci-max-mem overflow, too? Should not, its only the boundary that was an issue. Qemu sends base and size, base + size can be till uint32 max. So for example base was 0xC000. and size was 0x4000., we add up base + size and put pci-max-mmio as 0x1.., which would get programmend in the bridge bars: lower limit as 0xC000 and 0x as upper limit. And no mmio access were going across the bridge. In my testing, I have found one more issue with translate-my-address, it does not take care of 64-bit addresses. I have a patch working for SLOF, but its breaking the guest kernel booting. dup pci-next-mmio ! \ which is the same as MMIO base -r + pci-max-mmio ! \ calc max MMIO address +r + min pci-max-mmio !\ calc max MMIO address and +\ check the 32-bit boundary Ok, thanks a lot for the example! I think your patch likely works in practice, but after staring at the code for a while, I think the real bug is slightly different. If I get the code above right, pci-max-mmio is normally set to the first address that is _not_ part of the mmio window anymore, right. 
Now have a look at pci-bridge-set-mmio-base in pci-scan.fs: : pci-bridge-set-mmio-base ( addr -- ) pci-next-mmio @ 10 #aligned \ read the current Value and align to 1MB boundary dup 10 + pci-next-mmio !\ and write back with 1MB for bridge 10 rshift \ mmio-base reg is only the upper 16 bits pci-max-mmio @ and or \ and Insert mmio Limit (set it to max) swap 20 + rtas-config-l!\ and write it into the bridge ; Seems like the pci-max-mmio, i.e. the first address that is not in the window anymore, is programmed into the memory limit register here - but according to the pci-to-pci bridge specification, it should be the last address of the window instead. So I think the correct fix would be to decrease the pci-max-mmio value in pci-bridge-set-mmio-base by 1- before programming it into the limit register (note: in pci-bridge-set-mmio-limit you can find a 1- already, so I think this also should be done in pci-bridge-set-mmio-base, too) So if you've got some spare minutes, could you please check whether that would fix the issue, too? Thomas ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v2 2/2] leds/powernv: Add driver for PowerNV platform
On Fri, 24 Apr 2015 11:00:41 +0530 Hi Vasant, Vasant Hegde hegdevas...@linux.vnet.ibm.com wrote: On 04/23/2015 07:43 PM, Jacek Anaszewski wrote: On Thu, 23 Apr 2015 10:55:40 +0530 Vasant Hegde hegdevas...@linux.vnet.ibm.com wrote: Hi Jacek, .../... These device tree comes from out firmware ... which is immutable . How the firmware is related to kernel? These bindings are for kernel, not for the firmware. DT bindings are compiled to *.dtb file which is concatenated with zImage. During system boot device drivers are matched with DT bindings through 'compatible' property. A driver should have single matching DT node, i.e. no other driver can probe with the same DT node. This implies that the node should contain only the properties required for configuring the related device. As Stewart mentioned, its not .dtb file in our case.. we pass flattened device tree .. which is built by OPAL. No matter what format of device tree OPAL produces, I assume that it must compile it from some sources. dtb file is a compiled form of human readable dts file containing Flattened Device Tree - a data structure for describing the hardware in the system. Please refer to: http://elinux.org/Device_Tree We can use LED node name + led-type property for naming...which is what I do currently (v4.. which I haven't posted) 1. Each LED would have one corresponding LED class device. 2. Operations on attn and fault LED types: turn on: echo 255 brightness turn off: echo 0 brightness get status cat brightness 3. Operations on identify LED: turn on: echo timer trigger (blink_set op would have to be implemented in the driver) turn off: echo 0 brightness get status: support for this would have to be added to the LED subsystem core I see few issues here. - Overloading same LED device with multiple opeartion complicates things .. 
as these operations can be done independently (say user is allowed to enable both identify and fault simultaneously) I agree, it would be hard to distinguish whether by executing `echo 0 brightness` we want to turn off identify or fault function. - point 3: IIUC after duration value expires identify indicator reverts.. we don't want to revert until user asks . From what you shared, blinking has hardware acceleration on OPAL side. At first timer trigger tries to use HW accelerated blinking by calling blink_set op and resorts to using software fallback only if the op fails or is not defined. Blinking is the physical state of LED to represent identify state. which is taken care by hardware. and OS doesn't have control on this .. I am aware of it. Therefore we would probably need to add a flag LED_BLINK_HW_ONLY to the LED subsystem core and modify led_blink_set function to log an error and avoid setting software fallback in case blink_set op fails and the flag is set. Nevertheless, I am leaning towards using brightness_set op for this. From software point of view its just another LED with two state (ON and OFF). BTW timer trigger re-sets blink after timer expires, unless LED_BLINK_ONESHOT flag is set by LED class device. In my case, I want to retain the state.' - point 3: if I use brightness for both identify/fault, how to disable these LEDs independently? Another sysfs attribute would be required, but it would be ugly. yeah. - Also how to use trigger property for each LED (if at all we want to use them later)? After analyzing pros and cons I think that separate LED class devices for each LED type would be most suitable solution in this case. Agree. 
For 'identify' LED the operation would be: #echo timer trigger //set 'identify' (blinking) #cat trigger//check identify state #none [timer] //'identify' is ON #echo 0 brightness//unset 'identify #cat trigger #[none] timer //'identify' is OFF You would have to implement blink_set op (see Documentation/leds/leds-class.txt and other LED class drivers for reference). Implementing another op should be fine.. I can try to implement it. But from user perspective identify is just another LED. Hence can we just use brightness property itself? OK, let's use only brightness. Usage of blinking API would impose turning on led-triggers, which would be used only for exposing trigger sysfs attribute. Triggers however would not be used, as the intention is using only HW accelerated blinking. Please add comment to the driver, describing the reasons for abusing API semantics. For attention and fault LEDs only brightness attribute would matter. Sure. DT bindings would look as follows: opal-leds { compatible =
Re: [PATCH 1/2] pci-phb: check for the 32-bit overflow
On Fri, 24 Apr 2015 12:56:57 +0200 Thomas Huth th...@redhat.com wrote: On Fri, 24 Apr 2015 09:22:33 +0530 Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote: Hi Thomas, Thomas Huth th...@redhat.com writes: Am Wed, 22 Apr 2015 16:27:19 +0530 schrieb Nikunj A Dadhania nik...@linux.vnet.ibm.com: With the addition of 64-bit BARS and increase in the mmio address space, the code was hitting this limit. The memory of pci devices across the bridges were not accessible due to which the drivers failed. Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com --- board-qemu/slof/pci-phb.fs | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/board-qemu/slof/pci-phb.fs b/board-qemu/slof/pci-phb.fs index 529772f..e307d95 100644 --- a/board-qemu/slof/pci-phb.fs +++ b/board-qemu/slof/pci-phb.fs @@ -258,7 +258,8 @@ setup-puid decode-64 2 / dup r\ Decode and calc size/2 pci-next-mem @ + dup pci-max-mem ! \ and calc max mem address Could pci-max-mem overflow, too? Should not, its only the boundary that was an issue. Qemu sends base and size, base + size can be till uint32 max. So for example base was 0xC000. and size was 0x4000., we add up base + size and put pci-max-mmio as 0x1.., which would get programmend in the bridge bars: lower limit as 0xC000 and 0x as upper limit. And no mmio access were going across the bridge. In my testing, I have found one more issue with translate-my-address, it does not take care of 64-bit addresses. I have a patch working for SLOF, but its breaking the guest kernel booting. dup pci-next-mmio ! \ which is the same as MMIO base -r + pci-max-mmio ! \ calc max MMIO address +r + min pci-max-mmio !\ calc max MMIO address and +\ check the 32-bit boundary Ok, thanks a lot for the example! I think your patch likely works in practice, but after staring at the code for a while, I think the real bug is slightly different. 
If I get the code above right, pci-max-mmio is normally set to the first address that is _not_ part of the mmio window anymore, right. Now have a look at pci-bridge-set-mmio-base in pci-scan.fs: : pci-bridge-set-mmio-base ( addr -- ) pci-next-mmio @ 10 #aligned \ read the current Value and align to 1MB boundary dup 10 + pci-next-mmio !\ and write back with 1MB for bridge 10 rshift \ mmio-base reg is only the upper 16 bits pci-max-mmio @ and or \ and Insert mmio Limit (set it to max) swap 20 + rtas-config-l!\ and write it into the bridge ; Seems like the pci-max-mmio, i.e. the first address that is not in the window anymore, is programmed into the memory limit register here - but according to the pci-to-pci bridge specification, it should be the last address of the window instead. So I think the correct fix would be to decrease the pci-max-mmio value in pci-bridge-set-mmio-base by 1- before programming it into the limit register (note: in pci-bridge-set-mmio-limit you can find a 1- already, so I think this also should be done in pci-bridge-set-mmio-base, too) So if you've got some spare minutes, could you please check whether that would fix the issue, too? By the way, if I'm right, pci-bridge-set-mem-base seems to suffer from the same problem, too. Thomas ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] powerpc/ftrace: add powerpc timebase as a trace clock source
On 2015/04/23 09:10AM, Steven Rostedt wrote:
> On Thu, 23 Apr 2015 12:15:04 +0530
> Naveen N. Rao naveen.n@linux.vnet.ibm.com wrote:
>
> > diff --git a/arch/powerpc/include/asm/trace_clock.h b/arch/powerpc/include/asm/trace_clock.h
> > new file mode 100644
> > index 000..0b0d094
> > --- /dev/null
> > +++ b/arch/powerpc/include/asm/trace_clock.h
> > @@ -0,0 +1,27 @@
> > +/*
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License, version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + * Copyright (C) 2015 Naveen N. Rao, IBM Corporation
> > + */
> > +
> > +#ifndef _ASM_PPC_TRACE_CLOCK_H
> > +#define _ASM_PPC_TRACE_CLOCK_H
> > +
> > +#include <linux/compiler.h>
> > +#include <linux/types.h>
> > +
> > +#ifdef CONFIG_TRACE_CLOCK
>
> You don't need this #if statement. What else is using this besides
> kernel/trace/trace.c, which selects TRACE_CLOCK if it is compiled.
>
> If you were trying to match x86, where it has:
>
>   #ifdef CONFIG_X86_TSC
>
> where you have CONFIG_TRACE_CLOCK. We needed the #ifdef because you can
> compile the x86 kernel without TSC support, and we did not want to
> export a tsc tracing clock if one did not exist.
>
> And the only place that I see that even includes this header in ppc, is
> also only compiled if CONFIG_TRACE_CLOCK is selected.

Ah yes, agreed. I have removed it and seeing as CONFIG_TRACE_CLOCK is
really for the generic clocks, I have moved the dependency on
arch/powerpc/kernel/trace_clock.o to CONFIG_TRACING since that is what
gates kernel/trace/trace.o

> I'm fine with the change, just nuke the unnecessary #ifdef.

Thanks for the review!

- Naveen
[PATCH v2] powerpc/ftrace: add powerpc timebase as a trace clock source
Add a new powerpc-specific trace clock using the timebase register,
similar to x86-tsc. This gives us
- a fast, monotonic, hardware clock source for trace entries, and
- a clock that can be used to correlate events across cpus as well as across
  hypervisor and guests.

Signed-off-by: Naveen N. Rao naveen.n@linux.vnet.ibm.com
---
Changes since v1:
- removed unnecessary #ifdef in trace_clock.h
- changed config build dependency for trace_clock.o from TRACE_CLOCK to TRACING

 Documentation/trace/ftrace.txt         |  5 +++++
 arch/powerpc/include/asm/Kbuild        |  1 -
 arch/powerpc/include/asm/trace_clock.h | 19 +++++++++++++++++++
 arch/powerpc/kernel/Makefile           |  1 +
 arch/powerpc/kernel/trace_clock.c      | 15 +++++++++++++++
 5 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/include/asm/trace_clock.h
 create mode 100644 arch/powerpc/kernel/trace_clock.c

diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index 572ca92..689f61a 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -346,6 +346,11 @@ of ftrace. Here is a list of some of the key files:
 	  x86-tsc: Architectures may define their own clocks. For
 	           example, x86 uses its own TSC cycle clock here.
 
+	  ppc-tb: This uses the powerpc timebase register value.
+		  This is in sync across CPUs and can also be used
+		  to correlate events across hypervisor/guest if
+		  tb_offset is known.
+
 	To set a clock, simply echo the clock name into this file.
 
 	  echo global > trace_clock
diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
index 382b28e..5041c66 100644
--- a/arch/powerpc/include/asm/Kbuild
+++ b/arch/powerpc/include/asm/Kbuild
@@ -5,5 +5,4 @@ generic-y += mcs_spinlock.h
 generic-y += preempt.h
 generic-y += rwsem.h
 generic-y += scatterlist.h
-generic-y += trace_clock.h
 generic-y += vtime.h
diff --git a/arch/powerpc/include/asm/trace_clock.h b/arch/powerpc/include/asm/trace_clock.h
new file mode 100644
index 000..cf1ee75
--- /dev/null
+++ b/arch/powerpc/include/asm/trace_clock.h
@@ -0,0 +1,19 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2015 Naveen N. Rao, IBM Corporation
+ */
+
+#ifndef _ASM_PPC_TRACE_CLOCK_H
+#define _ASM_PPC_TRACE_CLOCK_H
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+extern u64 notrace trace_clock_ppc_tb(void);
+
+#define ARCH_TRACE_CLOCKS { trace_clock_ppc_tb, "ppc-tb", 0 },
+
+#endif  /* _ASM_PPC_TRACE_CLOCK_H */
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 502cf69..18e038e 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -118,6 +118,7 @@ obj-$(CONFIG_PPC_IO_WORKAROUNDS)	+= io-workarounds.o
 obj-$(CONFIG_DYNAMIC_FTRACE)	+= ftrace.o
 obj-$(CONFIG_FUNCTION_GRAPH_TRACER)	+= ftrace.o
 obj-$(CONFIG_FTRACE_SYSCALLS)	+= ftrace.o
+obj-$(CONFIG_TRACING)		+= trace_clock.o
 
 ifneq ($(CONFIG_PPC_INDIRECT_PIO),y)
 obj-y				+= iomap.o
diff --git a/arch/powerpc/kernel/trace_clock.c b/arch/powerpc/kernel/trace_clock.c
new file mode 100644
index 000..4917069
--- /dev/null
+++ b/arch/powerpc/kernel/trace_clock.c
@@ -0,0 +1,15 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2015 Naveen N. Rao, IBM Corporation
+ */
+
+#include <asm/trace_clock.h>
+#include <asm/time.h>
+
+u64 notrace trace_clock_ppc_tb(void)
+{
+	return get_tb();
+}
-- 
2.3.5
Re: [PATCH] spi: fsl-spi: fix devm_ioremap_resource() error case
On Thu, Apr 23, 2015 at 02:11:47PM +0200, Christophe Leroy wrote:
> devm_ioremap_resource() doesn't return NULL but an ERR_PTR on error.

Applied, thanks.
[git pull] Please pull mpe/linux.git powerpc-4.1-2 tag
Hi Linus,

Please pull powerpc fixes for 4.1:

The following changes since commit d19d5efd8c8840aa4f38a6dfbfe500d8cc27de46:

  Merge tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux (2015-04-16 13:53:32 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux.git tags/powerpc-4.1-2

for you to fetch changes up to 2e826695d87c2d213def07bc344ae97d88384f62:

  powerpc/mm: Fix build error with CONFIG_PPC_TRANSACTIONAL_MEM disabled (2015-04-23 17:42:14 +1000)

powerpc fixes for 4.1

- Fix for mm_dec_nr_pmds() from Scott.
- Fixes for oopses seen with KVM + THP from Aneesh.
- Build fixes from Aneesh and Shreyas.

Aneesh Kumar K.V (5):
      KVM: PPC: Use READ_ONCE when dereferencing pte_t pointer
      KVM: PPC: Remove page table walk helpers
      powerpc/mm/thp: Make page table walk safe against thp split/collapse
      powerpc/mm/thp: Return pte address if we find trans_splitting.
      powerpc/mm: Fix build error with CONFIG_PPC_TRANSACTIONAL_MEM disabled

Michael Ellerman (1):
      Merge branch 'master' of git://git.kernel.org/.../scottwood/linux into fixes

Scott Wood (1):
      powerpc/hugetlb: Call mm_dec_nr_pmds() in hugetlb_free_pmd_range()

Shreyas B. Prabhu (1):
      powerpc/kvm: Fix ppc64_defconfig + PPC_POWERNV=n build error

 arch/powerpc/include/asm/kvm_book3s_64.h | 17
 arch/powerpc/include/asm/pgtable.h       | 28
 arch/powerpc/kernel/eeh.c                |  6
 arch/powerpc/kernel/io-workarounds.c     | 10
 arch/powerpc/kvm/Kconfig                 |  2
 arch/powerpc/kvm/book3s_64_mmu_hv.c      | 14
 arch/powerpc/kvm/book3s_hv_rm_mmu.c      | 86
 arch/powerpc/kvm/e500_mmu_host.c         | 32
 arch/powerpc/mm/hash_utils_64.c          |  3
 arch/powerpc/mm/hugetlbpage.c            | 32
 arch/powerpc/perf/callchain.c            | 24
 11 files changed, 137 insertions(+), 117 deletions(-)
[PATCH v2 0/2] powerpc/kvm: Enable running guests on RT Linux
This patchset enables running KVM SMP guests with external interrupts on an
underlying RT-enabled Linux. Before this patchset, a guest with in-kernel MPIC
emulation could easily panic the kernel due to preemption when delivering IPIs
and external interrupts, because the openpic spinlock becomes a sleeping
mutex on PREEMPT_RT_FULL Linux.

0001: converts the openpic spinlock to a raw spinlock, in order to circumvent
this behavior. While this change is targeted at an RT-enabled Linux, it has no
effect on upstream kvm-ppc, so it is sent upstream for better future
maintenance.

0002: disables in-kernel MPIC emulation for guests running on RT, in order to
prevent a potential DoS attack due to large system latencies. This patch is
targeted at RT (due to CONFIG_PREEMPT_RT_FULL), but it can also be applied on
upstream Linux, with no effect.

- applied & compiled against vanilla 4.0
- applied & compiled against stable-rt 3.18-rt

v2:
- updated commit messages
- change the fix for potentially large latencies from limiting the max number
  of VCPUs a guest can have to disabling the in-kernel MPIC

Bogdan Purcareata (2):
  powerpc/kvm: Convert openpic lock to raw_spinlock
  powerpc/kvm: Disable in-kernel MPIC emulation for PREEMPT_RT_FULL

 arch/powerpc/kvm/Kconfig |  1 +
 arch/powerpc/kvm/mpic.c  | 44 ++++++++++++++++++++++----------------------
 2 files changed, 23 insertions(+), 22 deletions(-)

-- 
2.1.4
[PATCH v2 1/2] powerpc/kvm: Convert openpic lock to raw_spinlock
The lock in the KVM openpic emulation on PPC is a spinlock_t, meaning it becomes
a sleeping mutex under PREEMPT_RT_FULL. This leads to a situation where this
non-raw lock is grabbed with interrupts already disabled by hard_irq_disable():

kvmppc_prepare_to_enter()
  hard_irq_disable()
  kvmppc_core_prepare_to_enter()
    kvmppc_core_check_exceptions()
      kvmppc_booke_irqprio_deliver()
        kvmppc_mpic_set_epr()
          spin_lock_irqsave()
          ...

This happens for guest interrupts that go through this openpic emulation code.
The result is a kernel crash on guest enter (include/linux/kvm_host.h:784).

Converting the lock to a raw_spinlock fixes the issue and enables the guest to
run I/O intensive workloads in a SMP configuration. A similar fix can be found
for the i8254 PIT emulation on x86 [1].

[1] https://lkml.org/lkml/2010/1/11/289

v2:
- updated commit message

Signed-off-by: Bogdan Purcareata bogdan.purcare...@freescale.com
---
 arch/powerpc/kvm/mpic.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 6249cdc..2f70660 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -196,7 +196,7 @@ struct openpic {
 	int num_mmio_regions;
 
 	gpa_t reg_base;
-	spinlock_t lock;
+	raw_spinlock_t lock;
 
 	/* Behavior control */
 	struct fsl_mpic_info *fsl;
@@ -1103,9 +1103,9 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
 			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 
-		spin_unlock(&opp->lock);
+		raw_spin_unlock(&opp->lock);
 		kvm_notify_acked_irq(opp->kvm, 0, notify_eoi);
-		spin_lock(&opp->lock);
+		raw_spin_lock(&opp->lock);
 
 		break;
 	}
@@ -1180,12 +1180,12 @@ void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
 	int cpu = vcpu->arch.irq_cpu_id;
 	unsigned long flags;
 
-	spin_lock_irqsave(&opp->lock, flags);
+	raw_spin_lock_irqsave(&opp->lock, flags);
 
 	if ((opp->gcr & opp->mpic_mode_mask) == GCR_MODE_PROXY)
 		kvmppc_set_epr(vcpu, openpic_iack(opp, &opp->dst[cpu], cpu));
 
-	spin_unlock_irqrestore(&opp->lock, flags);
+	raw_spin_unlock_irqrestore(&opp->lock, flags);
 }
 
 static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
@@ -1386,9 +1386,9 @@ static int kvm_mpic_read(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 	}
 
-	spin_lock_irq(&opp->lock);
+	raw_spin_lock_irq(&opp->lock);
 	ret = kvm_mpic_read_internal(opp, addr - opp->reg_base, &u.val);
-	spin_unlock_irq(&opp->lock);
+	raw_spin_unlock_irq(&opp->lock);
 
 	/*
 	 * Technically only 32-bit accesses are allowed, but be nice to
@@ -1427,10 +1427,10 @@ static int kvm_mpic_write(struct kvm_vcpu *vcpu,
 		return -EOPNOTSUPP;
 	}
 
-	spin_lock_irq(&opp->lock);
+	raw_spin_lock_irq(&opp->lock);
 	ret = kvm_mpic_write_internal(opp, addr - opp->reg_base,
 				      *(const u32 *)ptr);
-	spin_unlock_irq(&opp->lock);
+	raw_spin_unlock_irq(&opp->lock);
 
 	pr_debug("%s: addr %llx ret %d val %x\n",
 		 __func__, addr, ret, *(const u32 *)ptr);
@@ -1501,14 +1501,14 @@ static int access_reg(struct openpic *opp, gpa_t addr, u32 *val, int type)
 	if (addr & 3)
 		return -ENXIO;
 
-	spin_lock_irq(&opp->lock);
+	raw_spin_lock_irq(&opp->lock);
 
 	if (type == ATTR_SET)
 		ret = kvm_mpic_write_internal(opp, addr, *val);
 	else
 		ret = kvm_mpic_read_internal(opp, addr, val);
 
-	spin_unlock_irq(&opp->lock);
+	raw_spin_unlock_irq(&opp->lock);
 
 	pr_debug("%s: type %d addr %llx val %x\n", __func__, type, addr, *val);
 
@@ -1545,9 +1545,9 @@ static int mpic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 		if (attr32 != 0 && attr32 != 1)
 			return -EINVAL;
 
-		spin_lock_irq(&opp->lock);
+		raw_spin_lock_irq(&opp->lock);
 		openpic_set_irq(opp, attr->attr, attr32);
-		spin_unlock_irq(&opp->lock);
+		raw_spin_unlock_irq(&opp->lock);
 		return 0;
 	}
 
@@ -1592,9 +1592,9 @@ static int mpic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 		if (attr->attr > MAX_SRC)
 			return -EINVAL;
 
-		spin_lock_irq(&opp->lock);
+		raw_spin_lock_irq(&opp->lock);
 		attr32 = opp->src[attr->attr].pending;
-		spin_unlock_irq(&opp->lock);
+		raw_spin_unlock_irq(&opp->lock);
 
 		if (put_user(attr32, (u32 __user *)(long)attr->addr))
 			return -EFAULT;
@@ -1670,7 +1670,7 @@ static int mpic_create(struct kvm_device *dev, u32 type)
 	opp->kvm = dev->kvm;
Re: [PATCH v2] powerpc/ftrace: add powerpc timebase as a trace clock source
On Fri, 24 Apr 2015 14:24:44 +0530
Naveen N. Rao naveen.n@linux.vnet.ibm.com wrote:

> Add a new powerpc-specific trace clock using the timebase register,
> similar to x86-tsc. This gives us
> - a fast, monotonic, hardware clock source for trace entries, and
> - a clock that can be used to correlate events across cpus as well as across
>   hypervisor and guests.
>
> Signed-off-by: Naveen N. Rao naveen.n@linux.vnet.ibm.com
> ---
> Changes since v1:
> - removed unnecessary #ifdef in trace_clock.h
> - changed config build dependency for trace_clock.o from TRACE_CLOCK to TRACING

Looks fine to me.

Acked-by: Steven Rostedt rost...@goodmis.org

-- Steve

>  Documentation/trace/ftrace.txt         |  5 +++++
>  arch/powerpc/include/asm/Kbuild        |  1 -
>  arch/powerpc/include/asm/trace_clock.h | 19 +++++++++++++++++++
>  arch/powerpc/kernel/Makefile           |  1 +
>  arch/powerpc/kernel/trace_clock.c      | 15 +++++++++++++++
>  5 files changed, 40 insertions(+), 1 deletion(-)
>  create mode 100644 arch/powerpc/include/asm/trace_clock.h
>  create mode 100644 arch/powerpc/kernel/trace_clock.c
[PATCH v2 2/2] powerpc/kvm: Disable in-kernel MPIC emulation for PREEMPT_RT_FULL
While converting the openpic emulation code to use a raw_spinlock_t enables
guests to run on RT, there's still a performance issue. For interrupts sent in
directed delivery mode with a multiple CPU mask, the emulated openpic will loop
through all of the VCPUs, and for each VCPU, it calls IRQ_check, which will loop
through all the pending interrupts for that VCPU. This is done while holding the
raw_lock, meaning that during all this time interrupts and preemption are
disabled on the host Linux. A malicious user app can max both these numbers and
cause a DoS.

This temporary fix is sent for two reasons. First is so that users who want to
use the in-kernel MPIC emulation are aware of the potential latencies, thus
making sure that the hardware MPIC and their usage scenario do not involve
interrupts sent in directed delivery mode, and the number of possible pending
interrupts is kept small. Secondly, this should incentivize the development of a
proper openpic emulation that would be better suited for RT.

Signed-off-by: Bogdan Purcareata bogdan.purcare...@freescale.com
---
 arch/powerpc/kvm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 11850f3..415499a 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -158,6 +158,7 @@ config KVM_E500MC
 config KVM_MPIC
 	bool "KVM in-kernel MPIC emulation"
 	depends on KVM && E500
+	depends on !PREEMPT_RT_FULL
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQFD
 	select HAVE_KVM_IRQ_ROUTING
-- 
2.1.4
Re: [PATCH 2/2] pci: Use Qemu created PCI device nodes
Hi Nikunj,

On Wed, 22 Apr 2015 16:27:20 +0530
Nikunj A Dadhania nik...@linux.vnet.ibm.com wrote:

> PCI Enumeration has been part of SLOF. Now with hotplug code addition
> in Qemu, it makes more sense to have this code a one place, i.e. Qemu.

s/Qemu/QEMU/ and s/code a one place/code in one place/ ?

> Adding routines to walk through the device nodes created by Qemu. SLOF
> will configure the device/bridges and program the BARs for
> communicating with the devices.

I wonder whether it would make more sense to also set up the BARs etc.
in QEMU instead of SLOF?

> diff --git a/board-qemu/slof/pci-phb.fs b/board-qemu/slof/pci-phb.fs
> index e307d95..30b7443 100644
> --- a/board-qemu/slof/pci-phb.fs
> +++ b/board-qemu/slof/pci-phb.fs
> @@ -283,6 +283,41 @@ setup-puid
>     THEN
>  ;
>
> +: phb-pci-walk-bridge ( -- )
> +    phb-debug? IF ." Calling pci-walk-bridge " pwd cr THEN
> +
> +    get-node child ?dup 0= IF EXIT THEN    \ get and check if we have children
> +    BEGIN
> +        dup                                \ Continue as long as there are children
> +    WHILE

Most Forth code uses the same indentation for the code between
BEGIN...WHILE and WHILE...REPEAT ... so I think you could decrease
the indentation of the following block by one level.

> +            \ Set child node as current node:
> +            dup set-node

Below you are calling pci-device-setup which in turn might include some
pci-class_*.fs or pci-device_*.fs files (or even run some FCODE?).
At least pci-class_02.fs seems to use an INSTANCE VARIABLE, i.e. the
instance template should get modified in that case
==> Please double-check whether you need to use extend-device here
instead (I'm not 100% sure right now ... what happens for example when
you run qemu with a network device that SLOF does not provide a
pci-device_*.fs for? I guess it will try to include pci-class_02.fs
and fail due to the INSTANCE VARIABLE ?)

> +            my-space pci-set-slot          \ set the slot bit

pci-set-slot seems to rely on the pci-device-slots global variable.
This is normally initialized by pci-probe-bus. Now that you provide
your own implementation of that function below, I think it should
likely also set up the pci-device-slots variable, shouldn't it?

> +            my-space pci-htype@            \ read HEADER-Type
> +            7f and                         \ Mask bit 7 - multifunction device
> +            CASE
> +                0 OF my-space pci-device-setup ENDOF  \ | set up the device
> +                1 OF my-space pci-bridge-setup ENDOF  \ | set up the bridge
> +                dup OF my-space pci-htype@ pci-out ENDOF
> +            ENDCASE
> +            peer
> +    REPEAT drop
> +    get-parent set-node
> +;

The remaining part of the patch looks ok to me.

 Thomas