Re: pci and pcie device-tree binding - range No cells
On 12/10/2012 10:41 PM, Grant Likely wrote:
> On Mon, 10 Dec 2012 09:21:51 -0600, Rob Herring robherri...@gmail.com wrote:
>> On 12/10/2012 09:05 AM, Michal Simek wrote:
>>> On 12/10/2012 03:26 PM, Rob Herring wrote:
>>>> On 12/10/2012 06:20 AM, Michal Simek wrote:
>>>>> Hi Grant and others,
>>>>>
>>>>> I have a question regarding number of cells in ranges property
>>>>> for pci and pcie nodes.
>>>>>
>>>>> Linux pci/pcie powerpc DTSes contain 7 cells (xpedite5370.dts,
>>>>> sequoia.dts, etc) but also 6 cells format too (mpc832x_mds.dts)
>>>>>
>>>>> Here is shown 6 cells ranges format and describe
>>>>> http://devicetree.org/Device_Tree_Usage#PCI_Host_Bridge
>>>>>
>>>>> And also in documentation in the linux
>>>>> Documentation/devicetree/bindings/pci/83xx-512x-pci.txt
>>>>>
>>>>> Both format uses:
>>>>>   #size-cells = <2>;
>>>>>   #address-cells = <3>;
>>>>>
>>>>> What is valid format?
>>>>
>>>> Both. 7 cells are valid when the host (parent) bus is 64-bit and
>>>> 6 cells are valid when the host bus is 32-bit.
>>>>
>>>> The ranges property is child address parent address size. The
>>>> parent address #address-cells is taken from the parent node.
>>>
>>> Ok. Got it. Here is what we use on zynq and microblaze - both 32bit
>>> which should be fine.
>>>
>>> ps7_axi_interconnect_0: axi@0 {
>>>     #address-cells = <1>;
>>>     #size-cells = <1>;
>>>     axi_pcie_0: axi-pcie@5000 {
>>>         #address-cells = <3>;
>>>         #size-cells = <2>;
>>>         compatible = "xlnx,axi-pcie-1.05.a";
>>>         ranges = < 0x0200 0 0x6000 0x6000 0 0x1000 >;
>>>         ...
>>>     }
>>> }
>>>
>>> What I am wondering is pci_process_bridge_OF_ranges()
>>> at arch/powerpc/kernel/pci-common.c
>>> where there are used some hardcoded values which should be probably
>>> loaded from device-tree.
>>>
>>> For example:
>>> 683         int np = pna + 5;
>>> ...
>>> 702                 pci_addr = of_read_number(ranges + 1, 2);
>>> 703                 cpu_addr = of_translate_address(dev, ranges + 3);
>>> 704                 size = of_read_number(ranges + pna + 3, 2);
>>
>> These would always be correct whether you have 6 or 7 cells. pna is
>> the parent bus address cells size. The pci address is fixed at 3 cells.
>>
>>> Unfortunately we have copied it to microblaze.
>>
>> I look at the PCI DT code in powerpc and see a whole bunch of code that
>> seems like it should be common. The different per arch pci structs
>> complicates that. No one has really gotten to looking at PCI DT on ARM
>> yet except you and Thierry for Tegra. We definitely don't want to
>> create a 3rd copy. Starting the process of moving it to something like
>> drivers/pci/pci-of.c would be great. A lot of it should be common.
>
> The microblaze code is a copy of the powerpc version. I'll strongly nack
> any attempt to add a third! :-)

Yes it. There are some things which we had fixed because that powerpc
port is big endian only and we support PCIe on little endian too.
But changes are really cosmetic.

> drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
> which might actually be a good idea in the short term so that it gets
> appropriate supervision while being generalized before being moved into
> the pci directory.

Ben: Are you willing to move that ppc code to this location?
It is probably not good idea that I should do it when I even don't have
hardware available for testing (Asking someone else).

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
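The cell arithmetic discussed in this message (3 cells of PCI address, then the parent address, then 2 cells of size) can be sketched in user-space C. This is only an illustration under stated assumptions: read_cells(), range_cells() and parse_range() are hypothetical helpers and not kernel API, the cells are assumed to be already in CPU byte order, and the parent address is taken as-is rather than translated as of_translate_address() would do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for of_read_number(): combine `ncells` 32-bit
 * cells (assumed to be in CPU byte order already) into one value.
 */
static uint64_t read_cells(const uint32_t *cell, int ncells)
{
	uint64_t v = 0;

	assert(ncells >= 1 && ncells <= 2);
	while (ncells--)
		v = (v << 32) | *cell++;
	return v;
}

struct pci_range {
	uint32_t flags;     /* first cell of the 3-cell PCI address */
	uint64_t pci_addr;  /* remaining two cells of the PCI address */
	uint64_t cpu_addr;  /* parent bus address, pna cells, untranslated */
	uint64_t size;      /* two cells */
};

/* Cells per ranges entry: 3 (PCI address) + pna (parent) + 2 (size). */
static int range_cells(int pna)
{
	return 3 + pna + 2;
}

/* Split one ranges entry; pna is the parent's #address-cells (1 or 2). */
static struct pci_range parse_range(const uint32_t *ranges, int pna)
{
	struct pci_range r;

	r.flags    = ranges[0];
	r.pci_addr = read_cells(ranges + 1, 2);
	r.cpu_addr = read_cells(ranges + 3, pna);
	r.size     = read_cells(ranges + 3 + pna, 2);
	return r;
}
```

With pna = 1 an entry is 6 cells and with pna = 2 it is 7 cells, which is why the fixed offsets in the quoted powerpc code work for both layouts.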
Re: pci and pcie device-tree binding - range No cells
On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:
> On 12/10/2012 10:41 PM, Grant Likely wrote:
>> drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
>> which might actually be a good idea in the short term so that it gets
>> appropriate supervision while being generalized before being moved into
>> the pci directory.
>
> Ben: Are you willing to move that ppc code to this location?
> It is probably not good idea that I should do it when I even don't have
> hardware available for testing (Asking someone else).

You're a clever guy, you are more than capable of crafting the patch,
even if you can't test on hardware. :-) I refactored most of the OF
support code without having access to most of the affected hardware.
Once I got the changes out there for review I also asked for spot
testing before getting it into linux-next for even more testing.

g.
[PATCH] vfio powerpc: enabled on powernv platform
This patch initializes IOMMU groups based on the IOMMU
configuration discovered during the PCI scan on POWERNV
(POWER non virtualized) platform. The IOMMU groups are
to be used later by VFIO driver (PCI pass through).

It also implements an API for mapping/unmapping pages for
guest PCI drivers and providing DMA window properties.
This API is going to be used later by QEMU-VFIO to handle
h_put_tce hypercalls from the KVM guest.

Although this driver has been tested only on the POWERNV
platform, it should work on any platform which supports
TCE tables.

To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
option and configure VFIO as required.

Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 arch/powerpc/include/asm/iommu.h     |   10 ++
 arch/powerpc/kernel/iommu.c          |  329 ++
 arch/powerpc/platforms/powernv/pci.c |  134 ++
 drivers/iommu/Kconfig                |    8 +
 4 files changed, 481 insertions(+)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index cbfe678..3c861ae 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -76,6 +76,9 @@ struct iommu_table {
 	struct iommu_pool large_pool;
 	struct iommu_pool pools[IOMMU_NR_POOLS];
 	unsigned long *it_map;       /* A simple allocation bitmap for now */
+#ifdef CONFIG_IOMMU_API
+	struct iommu_group *it_group;
+#endif
 };
 
 struct scatterlist;
@@ -147,5 +150,12 @@ static inline void iommu_restore(void)
 }
 #endif
 
+extern void iommu_reset_table(struct iommu_table *tbl, bool restore);
+extern long iommu_clear_tces(struct iommu_table *tbl, unsigned long ioba,
+		unsigned long size);
+extern long iommu_put_tces(struct iommu_table *tbl, unsigned long ioba,
+		uint64_t tce, enum dma_data_direction direction,
+		unsigned long size);
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_IOMMU_H */
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index ff5a6ce..f3bb2e7 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -36,6 +36,7 @@
 #include <linux/hash.h>
 #include <linux/fault-inject.h>
 #include <linux/pci.h>
+#include <linux/uaccess.h>
 #include <asm/io.h>
 #include <asm/prom.h>
 #include <asm/iommu.h>
@@ -44,6 +45,7 @@
 #include <asm/kdump.h>
 #include <asm/fadump.h>
 #include <asm/vio.h>
+#include <asm/tce.h>
 
 #define DBG(...)
 
@@ -856,3 +858,330 @@ void iommu_free_coherent(struct iommu_table *tbl, size_t size,
 		free_pages((unsigned long)vaddr, get_order(size));
 	}
 }
+
+#ifdef CONFIG_IOMMU_API
+/*
+ * SPAPR TCE API
+ */
+
+struct vwork {
+	struct mm_struct	*mm;
+	long			npage;
+	struct work_struct	work;
+};
+
+/* delayed decrement/increment for locked_vm */
+static void lock_acct_bg(struct work_struct *work)
+{
+	struct vwork *vwork = container_of(work, struct vwork, work);
+	struct mm_struct *mm;
+
+	mm = vwork->mm;
+	down_write(&mm->mmap_sem);
+	mm->locked_vm += vwork->npage;
+	up_write(&mm->mmap_sem);
+	mmput(mm);
+	kfree(vwork);
+}
+
+static void lock_acct(long npage)
+{
+	struct vwork *vwork;
+	struct mm_struct *mm;
+
+	if (!current->mm)
+		return; /* process exited */
+
+	if (down_write_trylock(&current->mm->mmap_sem)) {
+		current->mm->locked_vm += npage;
+		up_write(&current->mm->mmap_sem);
+		return;
+	}
+
+	/*
+	 * Couldn't get mmap_sem lock, so must setup to update
+	 * mm->locked_vm later. If locked_vm were atomic, we
+	 * wouldn't need this silliness
+	 */
+	vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
+	if (!vwork)
+		return;
+	mm = get_task_mm(current);
+	if (!mm) {
+		kfree(vwork);
+		return;
+	}
+	INIT_WORK(&vwork->work, lock_acct_bg);
+	vwork->mm = mm;
+	vwork->npage = npage;
+	schedule_work(&vwork->work);
+}
+
+/*
+ * iommu_reset_table is called when it started/stopped being used.
+ *
+ * restore==true says to bring the iommu_table into the state as it was
+ * before being used by VFIO.
+ */
+void iommu_reset_table(struct iommu_table *tbl, bool restore)
+{
+	/* Page#0 is marked as used in iommu_init_table, so we clear it... */
+	if (!restore && (tbl->it_offset == 0))
+		clear_bit(0, tbl->it_map);
+
+	iommu_clear_tces(tbl, tbl->it_offset, tbl->it_size);
+
+	/* ... or restore  */
+	if (restore && (tbl->it_offset == 0))
+		set_bit(0, tbl->it_map);
+}
+EXPORT_SYMBOL_GPL(iommu_reset_table);
+
+/*
+ * Returns the number of used IOMMU pages (4K) within
+ * the same system page (4K or 64K).
+ *
+ * syspage_weight_zero is optimized for expected case == 0
+ * syspage_weight_one is optimized for expected case 1
+ * Other case are not used in
[PATCH] vfio powerpc: implemented IOMMU driver for VFIO
VFIO implements platform independent stuff such as
a PCI driver, BAR access (via read/write on a file descriptor
or direct mapping when possible) and IRQ signaling.

The platform dependent part includes IOMMU initialization
and handling. This patch implements an IOMMU driver for VFIO
which does mapping/unmapping pages for the guest IO and
provides information about DMA window (required by a POWERPC
guest).

The counterpart in QEMU is required to support this functionality.

Cc: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 drivers/vfio/Kconfig                |    6 +
 drivers/vfio/Makefile               |    1 +
 drivers/vfio/vfio_iommu_spapr_tce.c |  249 +++
 include/linux/vfio.h                |   31 +
 4 files changed, 287 insertions(+)
 create mode 100644 drivers/vfio/vfio_iommu_spapr_tce.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 7cd5dec..b464687 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -3,10 +3,16 @@ config VFIO_IOMMU_TYPE1
 	depends on VFIO
 	default n
 
+config VFIO_IOMMU_SPAPR_TCE
+	tristate
+	depends on VFIO && SPAPR_TCE_IOMMU
+	default n
+
 menuconfig VFIO
 	tristate "VFIO Non-Privileged userspace driver framework"
 	depends on IOMMU_API
 	select VFIO_IOMMU_TYPE1 if X86
+	select VFIO_IOMMU_SPAPR_TCE if PPC_POWERNV
 	help
 	  VFIO provides a framework for secure userspace device drivers.
 	  See Documentation/vfio.txt for more details.
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 2398d4a..72bfabc 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_VFIO) += vfio.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
+obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
new file mode 100644
index 000..714bf57
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -0,0 +1,249 @@
+/*
+ * VFIO: IOMMU DMA mapping support for TCE on POWER
+ *
+ * Copyright (C) 2012 IBM Corp.  All rights reserved.
+ *     Author: Alexey Kardashevskiy a...@ozlabs.ru
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio_iommu_type1.c:
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ *     Author: Alex Williamson alex.william...@redhat.com
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/err.h>
+#include <linux/vfio.h>
+#include <asm/iommu.h>
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "a...@ozlabs.ru"
+#define DRIVER_DESC     "VFIO IOMMU SPAPR TCE"
+
+static void tce_iommu_detach_group(void *iommu_data,
+		struct iommu_group *iommu_group);
+
+/*
+ * VFIO IOMMU fd for SPAPR_TCE IOMMU implementation
+ *
+ * This code handles mapping and unmapping of user data buffers
+ * into DMA'ble space using the IOMMU
+ */
+
+/*
+ * The container descriptor supports only a single group per container.
+ * Required by the API as the container is not supplied with the IOMMU group
+ * at the moment of initialization.
+ */
+struct tce_container {
+	struct mutex lock;
+	struct iommu_table *tbl;
+};
+
+static void *tce_iommu_open(unsigned long arg)
+{
+	struct tce_container *container;
+
+	if (arg != VFIO_SPAPR_TCE_IOMMU) {
+		pr_err("tce_vfio: Wrong IOMMU type\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	container = kzalloc(sizeof(*container), GFP_KERNEL);
+	if (!container)
+		return ERR_PTR(-ENOMEM);
+
+	mutex_init(&container->lock);
+
+	return container;
+}
+
+static void tce_iommu_release(void *iommu_data)
+{
+	struct tce_container *container = iommu_data;
+
+	WARN_ON(container->tbl && !container->tbl->it_group);
+	if (container->tbl && container->tbl->it_group)
+		tce_iommu_detach_group(iommu_data, container->tbl->it_group);
+
+	mutex_destroy(&container->lock);
+
+	kfree(container);
+}
+
+static long tce_iommu_ioctl(void *iommu_data,
+				 unsigned int cmd, unsigned long arg)
+{
+	struct tce_container *container = iommu_data;
+	unsigned long minsz;
+	long ret;
+
+	switch (cmd) {
+	case VFIO_CHECK_EXTENSION:
+		return (arg == VFIO_SPAPR_TCE_IOMMU) ? 1 : 0;
+
+	case VFIO_IOMMU_SPAPR_TCE_GET_INFO: {
+		struct vfio_iommu_spapr_tce_info info;
+		struct iommu_table *tbl = container->tbl;
+
+		if (WARN_ON(!tbl))
+			return -ENXIO;
+
+		minsz = offsetofend(struct vfio_iommu_spapr_tce_info,
+
Re: [PATCH] vfio powerpc: enabled on powernv platform
Hi Alex,

I posted other pair of patches. While debugging and testing my stuff I
implemented some rough hack to support IOMMU mappings without passing
those hypercalls to the QEMU, this is why I moved pieces of code around -
want to support both QEMU-VFIO and kernel optimized H_PUT_TCE handler.

On 12/12/12 23:34, Alexey Kardashevskiy wrote:
> This patch initializes IOMMU groups based on the IOMMU
> configuration discovered during the PCI scan on POWERNV
> (POWER non virtualized) platform. The IOMMU groups are
> to be used later by VFIO driver (PCI pass through).
>
> It also implements an API for mapping/unmapping pages for
> guest PCI drivers and providing DMA window properties.
> This API is going to be used later by QEMU-VFIO to handle
> h_put_tce hypercalls from the KVM guest.
>
> Although this driver has been tested only on the POWERNV
> platform, it should work on any platform which supports
> TCE tables.
>
> To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
> option and configure VFIO as required.
>
> Cc: David Gibson da...@gibson.dropbear.id.au
> Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
> ---

-- 
Alexey
Re: [PATCH] vfio powerpc: enabled on powernv platform
On Wed, 2012-12-12 at 17:14 +1100, Alexey Kardashevskiy wrote:
> On 08/12/12 04:38, Alex Williamson wrote:
>>> +static int __init tce_iommu_init(void)
>>> +{
>>> +	struct pci_dev *pdev = NULL;
>>> +	struct iommu_table *tbl;
>>> +	struct iommu_group *grp;
>>> +
>>> +	/* Allocate and initialize IOMMU groups */
>>> +	for_each_pci_dev(pdev) {
>>> +		tbl = get_iommu_table_base(&pdev->dev);
>>> +		if (!tbl)
>>> +			continue;
>>> +
>>> +		/* Skip already initialized */
>>> +		if (tbl->it_group)
>>> +			continue;
>>> +
>>> +		grp = iommu_group_alloc();
>>> +		if (IS_ERR(grp)) {
>>> +			pr_info("tce_vfio: cannot create new IOMMU group, ret=%ld\n",
>>> +				PTR_ERR(grp));
>>> +			return PTR_ERR(grp);
>>> +		}
>>> +		tbl->it_group = grp;
>>> +		iommu_group_set_iommudata(grp, tbl, group_release);
>>
>> BTW, groups have a name property that shows up in sysfs that can be set
>> with iommu_group_set_name(). IIRC, this was a feature David requested
>> for PEs. It'd be nice if it was used for PEs... Thanks,
>
> But what would I put there?... IOMMU ID is more than enough at the moment
> and struct iommu_table does not have anything what would have made sense
> to show in the sysfs...

I believe David mentioned that PEs had user visible names. Perhaps they
match an enclosure location or something. Group numbers are rather
arbitrary and really have no guarantee of persistence. Thanks,

Alex
Re: [PATCH] vfio powerpc: implemented IOMMU driver for VFIO
On Wed, 2012-12-12 at 17:59 +1100, Alexey Kardashevskiy wrote:
> On 08/12/12 04:01, Alex Williamson wrote:
>>> +	case VFIO_IOMMU_MAP_DMA: {
>>> +		vfio_iommu_spapr_tce_dma_map param;
>>> +		struct iommu_table *tbl = container->tbl;
>>> +		enum dma_data_direction direction;
>>> +		unsigned long locked, lock_limit;
>>> +
>>> +		if (WARN_ON(!tbl))
>>> +			return -ENXIO;
>>> +
>>> +		minsz = offsetofend(vfio_iommu_spapr_tce_dma_map, size);
>>> +
>>> +		if (copy_from_user(&param, (void __user *)arg, minsz))
>>> +			return -EFAULT;
>>> +
>>> +		if (param.argsz < minsz)
>>> +			return -EINVAL;
>>> +
>>> +		if ((param.flags & VFIO_DMA_MAP_FLAG_READ) &&
>>> +				(param.flags & VFIO_DMA_MAP_FLAG_WRITE))
>>> +			direction = DMA_BIDIRECTIONAL;
>>> +		else if (param.flags & VFIO_DMA_MAP_FLAG_READ)
>>> +			direction = DMA_TO_DEVICE;
>>> +		else if (param.flags & VFIO_DMA_MAP_FLAG_WRITE)
>>> +			direction = DMA_FROM_DEVICE;
>>> +		else
>>> +			return -EINVAL;
>>
>> flags needs to be sanitized too. Return EINVAL if any unknown bit is
>> set or else sloppy users may make it very difficult to make use of those
>> flag bits later.
>
> It already returns -EINVAL on any bit set except READ/WRITE, no?

No. I could pass flags ~0 through there to get a read/write mapping and
cause you problems if you later want to define another bit. Thanks,

Alex
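The point about unknown flag bits can be illustrated with a small user-space sketch. It is not the driver code: the MAP_FLAG_* values, enum dir and flags_to_direction() are local stand-ins invented for the example, and the only thing it is meant to show is rejecting any bit outside the known mask before looking at the direction bits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for this sketch; not taken from any kernel header. */
#define MAP_FLAG_READ   (1u << 0)
#define MAP_FLAG_WRITE  (1u << 1)
#define MAP_FLAG_KNOWN  (MAP_FLAG_READ | MAP_FLAG_WRITE)

enum dir { DIR_BIDIRECTIONAL, DIR_TO_DEVICE, DIR_FROM_DEVICE };

/* Returns 0 and stores the direction, or -EINVAL for unusable flags. */
static int flags_to_direction(uint32_t flags, enum dir *out)
{
	assert(out != NULL);

	/* Reject unknown bits first so they stay free for later use. */
	if (flags & ~MAP_FLAG_KNOWN)
		return -EINVAL;

	switch (flags) {
	case MAP_FLAG_READ | MAP_FLAG_WRITE:
		*out = DIR_BIDIRECTIONAL;
		return 0;
	case MAP_FLAG_READ:
		*out = DIR_TO_DEVICE;
		return 0;
	case MAP_FLAG_WRITE:
		*out = DIR_FROM_DEVICE;
		return 0;
	default:
		return -EINVAL; /* no direction bit set */
	}
}
```

In this sketch a caller passing ~0 gets -EINVAL instead of a read/write mapping, which is the behaviour asked for in the review.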
Re: pci and pcie device-tree binding - range No cells
Dear Rob Herring,

On Mon, 10 Dec 2012 17:24:44 -0600, Rob Herring wrote:

>> Marvell SoCs have up to 20 configurable address windows, which allow
>> you, at run time, to say I would like the range from physical
>> address 0x to 0x to correspond to the PCIe device in port 1, lane 2,
>> or to the NAND, or to this or that device.
>>
>> Therefore, in the PCIe driver I proposed for the Armada 370/XP SoCs
>> [1], there is no need to encode all those ranges statically in the DT.
>
> That's not a unique feature. I'm not sure if any powerpc systems do
> that though.

Yes, probably not an unique feature.

>> The only ranges property I'm using is to allow the DT sub-nodes
>> describing each PCIe port/lane to access the CPU registers that allow
>> to see if the PCIe link is up or down, access the PCI configuration
>> space and so on. So all ranges in my ranges property correspond to
>> normal CPU registers, like the one you would put in the reg property
>> for any device. The fact that those devices are PCIe is really
>> orthogonal here.
>
> That doesn't really sound right.

Very likely, but I still don't get what is the right way.

> I don't think deviating from the normal binding is the right approach.
> Perhaps the host driver should fill in the ranges property with the
> addresses it uses. Then any child devices will get the right address
> translation.

I don't really understand what you mean here. If you look at the host
driver code (arch/arm/mach-mvebu/pcie.c), for each PCIe interface is
simply does:

 * Create an address decoding window for the memory BAR
 * Create an address decoding window for the I/O BAR
 * Associate the memory BAR window address and the I/O bar window
   address with the PCIe interface

And that's it. See
https://github.com/MISL-EBU-System-SW/mainline-public/blob/marvell-pcie-v1/arch/arm/mach-mvebu/pcie.c#L107.

So this driver is both deciding of the physical addresses for each PCIe
interface, and associating them with the PCIe interfaces. How is it
useful to feed some addresses back into the Device Tree?

> Also, while the h/w may support practically any config, there are
> practical constraints of what Linux will use like there's no reason to
> support more than 64K i/o space. PCI memory addresses generally start
> at 0x10. You probably don't need more than 1 memory window per root
> complex (although prefetchable memory may also be needed).

I allocate one 64K I/O window and one memory window per PCIe interface
whose link is up (i.e a PCIe device is connected).

> You could let the DT settings drive the address window configuration.

No, because I don't want to have absolute addresses for the windows: I
have 10 PCIe interfaces, but often, only a few of them are used. So I
don't want in the Device Tree to over-allocate hundreds of MB of
physical address space if it's not useful.

PCIe is dynamic, address window configuration is dynamic. And we should
hardcode all this configuration statically in the DT? Doesn't seem like
the right solution.

Best regards,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
Re: pci and pcie device-tree binding - range No cells
On 12/12/2012 11:49 AM, Grant Likely wrote:
> On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:
>> On 12/10/2012 10:41 PM, Grant Likely wrote:
>>> drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
>>> which might actually be a good idea in the short term so that it gets
>>> appropriate supervision while being generalized before being moved
>>> into the pci directory.
>>
>> Ben: Are you willing to move that ppc code to this location?
>> It is probably not good idea that I should do it when I even don't have
>> hardware available for testing (Asking someone else).
>
> You're a clever guy, you are more than capable of crafting the patch,
> even if you can't test on hardware. :-) I refactored most of the OF
> support code without having access to most of the affected hardware.
> Once I got the changes out there for review I also asked for spot
> testing before getting it into linux-next for even more testing.

Fair enough. :-)
Good time to start to look for how to work with board farm.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
Re: pci and pcie device-tree binding - range No cells
On Wed, Dec 12, 2012 at 4:16 PM, Thomas Petazzoni
thomas.petazz...@free-electrons.com wrote:
> Dear Rob Herring,
>
> On Mon, 10 Dec 2012 17:24:44 -0600, Rob Herring wrote:
>
>>> Marvell SoCs have up to 20 configurable address windows, which allow
>>> you, at run time, to say I would like the range from physical
>>> address 0x to 0x to correspond to the PCIe device in port 1, lane 2,
>>> or to the NAND, or to this or that device.
>>>
>>> Therefore, in the PCIe driver I proposed for the Armada 370/XP SoCs
>>> [1], there is no need to encode all those ranges statically in the DT.
>>
>> That's not a unique feature. I'm not sure if any powerpc systems do
>> that though.
>
> Yes, probably not an unique feature.
>
>>> The only ranges property I'm using is to allow the DT sub-nodes
>>> describing each PCIe port/lane to access the CPU registers that allow
>>> to see if the PCIe link is up or down, access the PCI configuration
>>> space and so on. So all ranges in my ranges property correspond to
>>> normal CPU registers, like the one you would put in the reg property
>>> for any device. The fact that those devices are PCIe is really
>>> orthogonal here.
>>
>> That doesn't really sound right.
>
> Very likely, but I still don't get what is the right way.

Hi Thomas,

I just went and looked at your binding. Here's the snippet I found
interesting:

pcie-controller {
+	compatible = "marvell,armada-370-xp-pcie";
+	status = "disabled";
+	#address-cells = <1>;
+	#size-cells = <1>;
+	ranges = <0       0xd004 0x2000 /* port0x1_port0 */
+		  0x2000  0xd0042000 0x2000 /* port2x1_port0 */
+		  0x4000  0xd0044000 0x2000 /* port0x1_port1 */
+		  0x8000  0xd0048000 0x2000 /* port0x1_port2 */
+		  0xC000  0xd004C000 0x2000 /* port0x1_port3 */
+		  0x1     0xd008 0x2000 /* port1x1_port0 */
+		  0x12000 0xd0082000 0x2000 /* port3x1_port0 */
+		  0x14000 0xd0084000 0x2000 /* port1x1_port1 */
+		  0x18000 0xd0088000 0x2000 /* port1x1_port2 */
+		  0x1C000 0xd008C000 0x2000 /* port1x1_port3 */>;
+
+	pcie0.0@0xd004 {
+		reg = <0x0 0x2000>;
+		interrupts = <58>;
+		clocks = <&gateclk 5>;
+		marvell,pcie-port = <0>;
+		marvell,pcie-lane = <0>;
+		status = "disabled";
+	};
+
+	pcie0.1@0xd0044000 {
+		reg = <0x4000 0x2000>;
+		interrupts = <59>;
+		clocks = <&gateclk 5>;
+		marvell,pcie-port = <0>;
+		marvell,pcie-lane = <1>;
+		status = "disabled";
+	};
[... rest trimmed for brevity]

You're right, if you're doing dynamic allocation of windows, then you
really don't need to have a ranges property. However, the way the PCI
node is set up definitely looks incorrect. PCI already has a very
specific binding for pci host controller nodes.

First, #address-cells = <3>; #size-cells = <2>; and device_type = "pcie"
must be there. You don't want to break this. You can find details on the
pci and pci-express binding here:

http://www.openfirmware.org/1275/bindings/pci/pci2_1.pdf
http://www.openfirmware.org/1275/bindings/pci/pci-express.txt

For the child nodes, PCI is a discoverable bus, so normally I wouldn't
expect to see child nodes at all when using a dtb. The only time nodes
should be populated is when a device has non-discoverable
characteristics. In your example above you do have some additional data,
but I don't know enough about pci-express to say how best to encode them
or whether they are needed at all. Ben might have some thoughts on this.

When the PCI child nodes are present, it is important to stick with the
established PCI addressing scheme which uses 3 cells for addressing. The
first entry in the reg property must represent the configuration space
so that DT nodes can be matched up with discovered devices. There is no
requirement to include mappings for the memory and io regions if the
host controller can assign them dynamically.

I don't think you should need a ranges property at all for what you're
doing. Access to config space is generally managed by the PCI host
controller drivers and subsystem, and PCI device drivers don't typically
use of_ calls directly.

g.
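As a rough illustration of the 3-cell PCI addressing scheme mentioned above, the following user-space sketch packs and unpacks the first address cell (phys.hi) using the bit layout from the IEEE 1275 PCI bus binding. The struct and function names are hypothetical and exist only for this example; it is a sketch of the encoding, not code from any driver.

```c
#include <assert.h>
#include <stdint.h>

/* Space codes carried in bits 25:24 of the first address cell. */
enum pci_space {
	PCI_SPACE_CONFIG = 0,
	PCI_SPACE_IO     = 1,
	PCI_SPACE_MEM32  = 2,
	PCI_SPACE_MEM64  = 3,
};

struct pci_phys_hi {
	int prefetchable;      /* bit 30 */
	enum pci_space space;  /* bits 25:24 */
	unsigned bus;          /* bits 23:16 */
	unsigned dev;          /* bits 15:11 */
	unsigned fn;           /* bits 10:8 */
	unsigned reg;          /* bits 7:0 */
};

/* Unpack the first cell of a 3-cell PCI address. */
static struct pci_phys_hi decode_phys_hi(uint32_t hi)
{
	struct pci_phys_hi d;

	d.prefetchable = (hi >> 30) & 0x1;
	d.space = (enum pci_space)((hi >> 24) & 0x3);
	d.bus = (hi >> 16) & 0xff;
	d.dev = (hi >> 11) & 0x1f;
	d.fn  = (hi >> 8) & 0x7;
	d.reg = hi & 0xff;
	return d;
}

/* First cell of a configuration-space reg entry for bus/dev/fn. */
static uint32_t encode_config_addr(unsigned bus, unsigned dev, unsigned fn)
{
	assert(bus <= 0xff && dev <= 0x1f && fn <= 0x7);
	return ((uint32_t)bus << 16) | ((uint32_t)dev << 11) | ((uint32_t)fn << 8);
}
```

A child node whose first reg cell decodes to configuration space with a given bus/device/function is what allows the node to be matched against the device found by bus enumeration.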
Re: pci and pcie device-tree binding - range No cells
On 12/12/2012 10:16 AM, Thomas Petazzoni wrote:
> Dear Rob Herring,
>
> On Mon, 10 Dec 2012 17:24:44 -0600, Rob Herring wrote:
>
>>> Marvell SoCs have up to 20 configurable address windows, which allow
>>> you, at run time, to say I would like the range from physical
>>> address 0x to 0x to correspond to the PCIe device in port 1, lane 2,
>>> or to the NAND, or to this or that device.
>>>
>>> Therefore, in the PCIe driver I proposed for the Armada 370/XP SoCs
>>> [1], there is no need to encode all those ranges statically in the DT.
>>
>> That's not a unique feature. I'm not sure if any powerpc systems do
>> that though.
>
> Yes, probably not an unique feature.
>
>>> The only ranges property I'm using is to allow the DT sub-nodes
>>> describing each PCIe port/lane to access the CPU registers that allow
>>> to see if the PCIe link is up or down, access the PCI configuration
>>> space and so on. So all ranges in my ranges property correspond to
>>> normal CPU registers, like the one you would put in the reg property
>>> for any device. The fact that those devices are PCIe is really
>>> orthogonal here.
>>
>> That doesn't really sound right.
>
> Very likely, but I still don't get what is the right way.
>
>> I don't think deviating from the normal binding is the right approach.
>> Perhaps the host driver should fill in the ranges property with the
>> addresses it uses. Then any child devices will get the right address
>> translation.
>
> I don't really understand what you mean here. If you look at the host
> driver code (arch/arm/mach-mvebu/pcie.c), for each PCIe interface is
> simply does:
>
>  * Create an address decoding window for the memory BAR
>  * Create an address decoding window for the I/O BAR
>  * Associate the memory BAR window address and the I/O bar window
>    address with the PCIe interface
>
> And that's it. See
> https://github.com/MISL-EBU-System-SW/mainline-public/blob/marvell-pcie-v1/arch/arm/mach-mvebu/pcie.c#L107.
>
> So this driver is both deciding of the physical addresses for each PCIe
> interface, and associating them with the PCIe interfaces. How is it
> useful to feed some addresses back into the Device Tree?

I'm not completely sure for PCI, but the ranges is necessary to
translate addresses of child nodes. If you don't need ranges then you
could omit it. If you need ranges, then you should follow the PCI
binding whether it is put in the DTS or you dynamically fill it in. This
could be filled in by the bootloader as well if you have PCI devices you
need to boot from.

>> Also, while the h/w may support practically any config, there are
>> practical constraints of what Linux will use like there's no reason to
>> support more than 64K i/o space. PCI memory addresses generally start
>> at 0x10. You probably don't need more than 1 memory window per root
>> complex (although prefetchable memory may also be needed).
>
> I allocate one 64K I/O window and one memory window per PCIe interface
> whose link is up (i.e a PCIe device is connected).
>
>> You could let the DT settings drive the address window configuration.
>
> No, because I don't want to have absolute addresses for the windows: I
> have 10 PCIe interfaces, but often, only a few of them are used. So I
> don't want in the Device Tree to over-allocate hundreds of MB of
> physical address space if it's not useful.

How many you have is probably board dependent and not probe-able, right?
So you would at least know the subset of root complexes that you are
using. I know you want to find the size of all the cards up front and
size windows based on that, but I don't think that is going to be
possible.

> PCIe is dynamic, address window configuration is dynamic. And we should
> hardcode all this configuration statically in the DT? Doesn't seem like
> the right solution.

I'm just throwing out ideas. There are many cases of flexibility in h/w
designs which are never used. H/w is often designed in a vacuum without
s/w input. Not saying that is the case here, but you do have to consider
that.

Rob

> Best regards,
>
> Thomas
[TRIVIAL PATCH 00/26] treewide: Add and use vsprintf extension %pSR
Remove the somewhat awkward uses of print_symbol and convert all the
existing uses to a new vsprintf pointer type of %pSR.

print_symbol can be interleaved when it is used in a sequence like:

	printk("something: ...");
	print_symbol("%s", addr);
	printk("\n");

Instead use:

	printk("something: %pSR\n", (void *)addr);

Add a new %p[SsFf]R vsprintf extension that can perform the same
symbol function/address/offset formatting as print_symbol to
reduce the number and styles of message logging functions.

print_symbol used __builtin_extract_return_addr for those architectures
like S/390 and SPARC that have offset or masked addressing.
%p[FfSs]R uses the same gcc __builtin

Joe Perches (26):
  vsprintf: Add extension %pSR - print_symbol replacement
  alpha: Convert print_symbol to %pSR
  arm: Convert print_symbol to %pSR
  arm64: Convert print_symbol to %pSR
  avr32: Convert print_symbol to %pSR
  c6x: Convert print_symbol to %pSR
  ia64: Convert print_symbol to %pSR
  m32r: Convert print_symbol to %pSR
  mn10300: Convert print_symbol to %pSR
  openrisc: Convert print_symbol to %pSR
  powerpc: Convert print_symbol to %pSR
  s390: Convert print_symbol to %pSR
  sh: Convert print_symbol to %pSR
  um: Convert print_symbol to %pSR
  unicore32: Convert print_symbol to %pSR
  x86: Convert print_symbol to %pSR
  xtensa: Convert print_symbol to %pSR
  drivers: base: Convert print_symbol to %pSR
  gfs2: Convert print_symbol to %pSR
  sysfs: Convert print_symbol to %pSR
  irq: Convert print_symbol to %pSR
  smp_processor_id: Convert print_symbol to %pSR
  mm: Convert print_symbol to %pSR
  xtensa: Convert print_symbol to %pSR
  x86: head_64.S: Use vsprintf extension %pSR not print_symbol
  kallsyms: Remove print_symbol

 Documentation/filesystems/sysfs.txt         |    4 +-
 Documentation/printk-formats.txt            |    2 +
 Documentation/zh_CN/filesystems/sysfs.txt   |    4 +-
 arch/alpha/kernel/traps.c                   |    8 ++
 arch/arm/kernel/process.c                   |    4 +-
 arch/arm64/kernel/process.c                 |    4 +-
 arch/avr32/kernel/process.c                 |   25 ++-
 arch/c6x/kernel/traps.c                     |    3 +-
 arch/ia64/kernel/process.c                  |   13 ---
 arch/m32r/kernel/traps.c                    |    6 +---
 arch/mn10300/kernel/traps.c                 |    8 +++---
 arch/openrisc/kernel/traps.c                |    7 +
 arch/powerpc/platforms/cell/spu_callbacks.c |   12 --
 arch/s390/kernel/traps.c                    |   28 +++---
 arch/sh/kernel/process_32.c                 |    4 +-
 arch/um/kernel/sysrq.c                      |    6 +---
 arch/unicore32/kernel/process.c             |    5 ++-
 arch/x86/kernel/cpu/mcheck/mce.c            |   13 ++-
 arch/x86/kernel/dumpstack.c                 |    5 +--
 arch/x86/kernel/head_64.S                   |    4 +-
 arch/x86/kernel/process_32.c                |    2 +-
 arch/x86/mm/mmio-mod.c                      |    4 +-
 arch/x86/um/sysrq_32.c                      |    9 ++-
 arch/xtensa/kernel/traps.c                  |    6 +---
 drivers/base/core.c                         |    4 +-
 fs/gfs2/glock.c                             |    4 +-
 fs/gfs2/trans.c                             |    3 +-
 fs/sysfs/file.c                             |    4 +-
 include/linux/kallsyms.h                    |   18 -
 kernel/irq/debug.h                          |   15 ++---
 kernel/kallsyms.c                           |   11 --
 lib/smp_processor_id.c                      |    2 +-
 lib/vsprintf.c                              |   18 
 mm/memory.c                                 |    8 +++---
 mm/slab.c                                   |    8 ++
 35 files changed, 117 insertions(+), 164 deletions(-)

-- 
1.7.8.112.g3fd21
[TRIVIAL PATCH 11/26] powerpc: Convert print_symbol to %pSR
Use the new vsprintf extension to avoid any possible
message interleaving.

Convert the #ifdef DEBUG block to a single pr_debug.

Signed-off-by: Joe Perches j...@perches.com
---
 arch/powerpc/platforms/cell/spu_callbacks.c |   12 +++++-------
 1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spu_callbacks.c b/arch/powerpc/platforms/cell/spu_callbacks.c
index 75d6133..c5fe6d2 100644
--- a/arch/powerpc/platforms/cell/spu_callbacks.c
+++ b/arch/powerpc/platforms/cell/spu_callbacks.c
@@ -60,13 +60,11 @@ long spu_sys_callback(struct spu_syscall_block *s)
 
 	syscall = spu_syscall_table[s->nr_ret];
 
-#ifdef DEBUG
-	print_symbol(KERN_DEBUG "SPU-syscall %s:", (unsigned long)syscall);
-	printk("syscall%ld(%lx, %lx, %lx, %lx, %lx, %lx)\n",
-			s->nr_ret,
-			s->parm[0], s->parm[1], s->parm[2],
-			s->parm[3], s->parm[4], s->parm[5]);
-#endif
+	pr_debug("SPU-syscall %pSR:syscall%ld(%lx, %lx, %lx, %lx, %lx, %lx)\n",
+		 syscall,
+		 s->nr_ret,
+		 s->parm[0], s->parm[1], s->parm[2],
+		 s->parm[3], s->parm[4], s->parm[5]);
 
 	return syscall(s->parm[0], s->parm[1], s->parm[2],
 		       s->parm[3], s->parm[4], s->parm[5]);
-- 
1.7.8.112.g3fd21
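As an illustration of why a single call avoids interleaving, here is a minimal userspace sketch, not kernel code. It assumes the symbol name has already been resolved to a string (which %pSR does inside vsprintf) and builds the whole line with one snprintf, so no other writer can slip output between the symbol and the arguments. The function name format_spu_syscall and the sample symbol string are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the converted pr_debug() call:
 * the symbol text and all six parameters are formatted in one call,
 * so the resulting line is emitted as a single unit.
 * 'sym' plays the role of the text that %pSR would produce.
 */
static int format_spu_syscall(char *buf, size_t len, const char *sym,
			      long nr, const unsigned long parm[6])
{
	return snprintf(buf, len,
			"SPU-syscall %s:syscall%ld(%lx, %lx, %lx, %lx, %lx, %lx)\n",
			sym, nr,
			parm[0], parm[1], parm[2],
			parm[3], parm[4], parm[5]);
}
```

A caller would pass a buffer and write it out once, for example with fputs(buf, stderr), instead of issuing three separate print calls.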
Re: pci and pcie device-tree binding - range No cells
On Wed, Dec 12, 2012 at 10:49 AM, Grant Likely wrote:
> On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:
> > On 12/10/2012 10:41 PM, Grant Likely wrote:
> > > drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
> > > which might actually be a good idea in the short term so that it gets
> > > appropriate supervision while being generalized before being moved
> > > into the pci directory.
> >
> > Ben: Are you willing to move that ppc code to this location?
> > It is probably not good idea that I should do it when I even don't have
> > hardware available for testing (Asking someone else).
>
> You're a clever guy, you are more than capable of crafting the patch,
> even if you can't test on hardware. :-) I refactored most of the OF
> support code without having access to most of the affected hardware.
> Once I got the changes out there for review I also asked for spot
> testing before getting it into linux-next for even more testing.

I've been working on a relatively architecture agnostic PCI host bridge
driver and also wanted to avoid duplicating more generic DT parsing code
for PCI bindings.

I've ended up with a patch which provides an iterator for returning
resources based on the typical 'ranges' binding. This has ended up
living in drivers/of/address.c. I originally started out in
drivers/of/pci.c and drivers/pci/pci-of.c but found there were good (and
static) implementations in drivers/of/address.c which can be reused
(e.g. of_bus_pci_get_flags, bus->count_cells).

I'm not just ready to post it - but can do before early next week if you
can wait.

Andrew Murray
Re: pci and pcie device-tree binding - range No cells
On Wed, Dec 12, 2012 at 12:19:12PM +, Andrew Murray wrote:
> On Wed, Dec 12, 2012 at 10:49 AM, Grant Likely wrote:
> > On Wed, Dec 12, 2012 at 10:37 AM, Michal Simek mon...@monstr.eu wrote:
> > > On 12/10/2012 10:41 PM, Grant Likely wrote:
> > > > drivers/pci/pci-of.c would be good. I'd also accept drivers/of/pci.c
> > > > which might actually be a good idea in the short term so that it
> > > > gets appropriate supervision while being generalized before being
> > > > moved into the pci directory.
> > >
> > > Ben: Are you willing to move that ppc code to this location?
> > > It is probably not good idea that I should do it when I even don't
> > > have hardware available for testing (Asking someone else).
> >
> > You're a clever guy, you are more than capable of crafting the patch,
> > even if you can't test on hardware. :-) I refactored most of the OF
> > support code without having access to most of the affected hardware.
> > Once I got the changes out there for review I also asked for spot
> > testing before getting it into linux-next for even more testing.
>
> I've been working on a relatively architecture agnostic PCI host bridge
> driver and also wanted to avoid duplicating more generic DT parsing code
> for PCI bindings.
>
> I've ended up with a patch which provides an iterator for returning
> resources based on the typical 'ranges' binding. This has ended up
> living in drivers/of/address.c. I originally started out in
> drivers/of/pci.c and drivers/pci/pci-of.c but found there were good (and
> static) implementations in drivers/of/address.c which can be reused
> (e.g. of_bus_pci_get_flags, bus->count_cells).
>
> I'm not just ready to post it - but can do before early next week if you
> can wait.

I already posted a similar patch[0] as part of a larger series to bring
DT support to Tegra PCIe back in July. I suppose what you have must be
something pretty close to that. Most of the stuff that had me occupied
since then should be done soon and I was planning on resurrecting the
series one of these days.

Thierry

[0]: https://patchwork.kernel.org/patch/1244451/
[PATCH] pci: Provide support for parsing PCI DT ranges property
DT bindings for PCI host bridges often use the ranges property to describe
memory and IO ranges. This binding tends to be the same across
architectures, yet several parsing implementations exist, e.g.
arch/mips/pci/pci.c, arch/powerpc/kernel/pci-common.c,
arch/sparc/kernel/pci.c and arch/microblaze/pci/pci-common.c (clone of PPC).
Some of these duplicate functionality provided by drivers/of/address.c.

This patch provides a common iterator-based parser for the ranges property.
It is hoped this will reduce DT representation differences between
architectures and that architectures will migrate in part to this new
parser.

It is also hoped (and the motivation for the patch) that this patch will
reduce duplication of code when writing host bridge drivers that are
supported by multiple architectures.

This patch provides struct resources from a device tree node, e.g.:

	const __be32 *last = NULL;
	struct resource res;

	while ((last = of_pci_process_ranges(np, &res, last))) {
		/* do something with res */
	}

Platforms with quirks can then do what they like with the resource or
migrate common quirk handling to the parser. In an ideal world drivers can
just request the obtained resources and pass them on (e.g.
pci_add_resource_offset).
Signed-off-by: Andrew Murray andrew.mur...@arm.com
Signed-off-by: Liviu Dudau liviu.du...@arm.com
---
 drivers/of/address.c       |   53 +++-
 include/linux/of_address.h |    7 +
 2 files changed, 59 insertions(+), 1 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 7e262a6..03bfe61 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -219,6 +219,57 @@ int of_pci_address_to_resource(struct device_node *dev, int bar,
 	return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
 }
 EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
+
+const __be32 *of_pci_process_ranges(struct device_node *node,
+				    struct resource *res, const __be32 *from)
+{
+	const __be32 *start, *end;
+	int na, ns, np, pna;
+	int rlen;
+	struct of_bus *bus;
+	WARN_ON(!res);
+
+	bus = of_match_bus(node);
+	bus->count_cells(node, &na, &ns);
+	if (!OF_CHECK_COUNTS(na, ns)) {
+		pr_err("Bad cell count for %s\n", node->full_name);
+		return NULL;
+	}
+
+	pna = of_n_addr_cells(node);
+	np = pna + na + ns;
+
+	start = of_get_property(node, "ranges", &rlen);
+	if (start == NULL)
+		return NULL;
+
+	end = start + rlen / sizeof(__be32);	/* rlen is in bytes */
+
+	if (!from)
+		from = start;
+
+	while (from + np <= end) {
+		u64 cpu_addr, size;
+
+		cpu_addr = of_translate_address(node, from + na);
+		size = of_read_number(from + na + pna, ns);
+		res->flags = bus->get_flags(from);
+		from += np;
+
+		if (cpu_addr == OF_BAD_ADDR || size == 0)
+			continue;
+
+		res->name = node->full_name;
+		res->start = cpu_addr;
+		res->end = res->start + size - 1;
+		res->parent = res->child = res->sibling = NULL;
+		return from;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(of_pci_process_ranges);
+
 #endif /* CONFIG_PCI */
 
 /*
@@ -421,7 +472,7 @@ u64 __of_translate_address(struct device_node *dev, const __be32 *in_addr,
 		goto bail;
 	bus = of_match_bus(parent);
 
-	/* Cound address cells & copy address locally */
+	/* Count address cells & copy address locally */
 	bus->count_cells(dev, &na, &ns);
 	if (!OF_CHECK_COUNTS(na, ns)) {
 		printk(KERN_ERR "prom_parse: Bad cell count for %s\n",
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 01b925a..4582b20 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -26,6 +26,8 @@ static inline unsigned long pci_address_to_pio(phys_addr_t addr) { return -1; }
 #define pci_address_to_pio pci_address_to_pio
 #endif
 
+const __be32 *of_pci_process_ranges(struct device_node *node,
+				    struct resource *res, const __be32 *from);
 #else /* CONFIG_OF_ADDRESS */
 static inline int of_address_to_resource(struct device_node *dev, int index,
 					 struct resource *r)
@@ -48,6 +50,11 @@ static inline const u32 *of_get_address(struct device_node *dev, int index,
 {
 	return NULL;
 }
+static inline const __be32 *of_pci_process_ranges(struct device_node *node,
+				    struct resource *res, const __be32 *from)
+{
+	return NULL;
+}
 #endif /* CONFIG_OF_ADDRESS */
 
 
-- 
1.7.0.4
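To make the cell arithmetic in the iterator concrete, here is a minimal userspace sketch, not the kernel implementation. It assumes cells are already in host byte order (a real device tree stores them big-endian), fixes the PCI child address at 3 cells and the size at 2 cells, and takes the parent #address-cells (pna) as a parameter, so one entry is 6 cells on a 32-bit parent bus and 7 cells on a 64-bit one. It does no translation through parent buses, which of_translate_address handles in the kernel. The names pci_range_demo, read_cells and next_pci_range are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded "ranges" entry; a demo type, not the kernel's struct resource. */
struct pci_range_demo {
	uint32_t flags;		/* first cell of the 3-cell PCI address */
	uint64_t pci_addr;	/* remaining two cells of the PCI address */
	uint64_t cpu_addr;	/* parent bus address, pna cells wide */
	uint64_t size;		/* range size, 2 cells */
};

/* Fold n 32-bit cells into one number, most significant cell first. */
static uint64_t read_cells(const uint32_t *cell, int n)
{
	uint64_t v = 0;

	while (n-- > 0)
		v = (v << 32) | *cell++;
	return v;
}

/*
 * Decode the entry at 'from' into 'out' and return a pointer to the
 * next entry, or NULL when less than one full entry remains before
 * 'end'. 'end' points one past the last cell, so it is counted in
 * cells rather than bytes.
 */
static const uint32_t *next_pci_range(const uint32_t *from,
				      const uint32_t *end, int pna,
				      struct pci_range_demo *out)
{
	const int na = 3;		/* PCI child address cells */
	const int ns = 2;		/* PCI size cells */
	const int np = na + pna + ns;	/* cells per ranges entry */

	if (end - from < np)
		return NULL;

	out->flags = from[0];
	out->pci_addr = read_cells(from + 1, 2);
	out->cpu_addr = read_cells(from + na, pna);
	out->size = read_cells(from + na + pna, ns);
	return from + np;
}
```

A caller would loop in the same style as the patch description: start with from pointing at the first cell and keep calling until NULL comes back.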
Re: pci and pcie device-tree binding - range No cells
On Wed, Dec 12, 2012 at 01:34:24PM +, Thierry Reding wrote:
> On Wed, Dec 12, 2012 at 12:19:12PM +, Andrew Murray wrote:
> > I've been working on a relatively architecture agnostic PCI host
> > bridge driver and also wanted to avoid duplicating more generic DT
> > parsing code for PCI bindings.
> >
> > I've ended up with a patch which provides an iterator for returning
> > resources based on the typical 'ranges' binding. This has ended up
> > living in drivers/of/address.c. I originally started out in
> > drivers/of/pci.c and drivers/pci/pci-of.c but found there were good
> > (and static) implementations in drivers/of/address.c which can be
> > reused (e.g. of_bus_pci_get_flags, bus->count_cells).
> >
> > I'm not just ready to post it - but can do before early next week if
> > you can wait.
>
> I already posted a similar patch[0] as part of a larger series to bring
> DT support to Tegra PCIe back in July. I suppose what you have must be
> something pretty close to that. Most of the stuff that had me occupied
> since then should be done soon and I was planning on resurrecting the
> series one of these days.

Thanks for the reference. I've submitted my patch; it's along the lines of
your existing patch. I'm happy to take the best bits from both, drop mine,
etc.

Andrew Murray
Re: [PATCH] Revert crypto: caam - Updated SEC-4.0 device tree binding for ERA information.
On Dec 7, 2012, at 2:57 AM, Vakul Garg wrote:

> This reverts commit a2c0911c09190125f52c9941b9d187f601c2f7be.
>
> Signed-off-by: Vakul Garg va...@freescale.com
> ---
> Instead of adding SEC era information in crypto node's compatible, a new
> property 'fsl,sec-era' is being introduced into crypto node.
>
>  .../devicetree/bindings/crypto/fsl-sec4.txt |    5 ++---
>  1 files changed, 2 insertions(+), 3 deletions(-)

What tree do you think this has been applied to?

- k
[PATCH v3] powerpc: fix wii_memory_fixups() compile error on 3.0.y tree
Fix the following wii_memory_fixups() compile error on the 3.0.y tree
with wii_defconfig.

  CC      arch/powerpc/platforms/embedded6xx/wii.o
arch/powerpc/platforms/embedded6xx/wii.c: In function ‘wii_memory_fixups’:
arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘phys_addr_t’ [-Werror=format]
arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Werror=format]
arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘phys_addr_t’ [-Werror=format]
arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Werror=format]
cc1: all warnings being treated as errors
make[2]: *** [arch/powerpc/platforms/embedded6xx/wii.o] Error 1
make[1]: *** [arch/powerpc/platforms/embedded6xx] Error 2
make: *** [arch/powerpc/platforms] Error 2

Signed-off-by: Shuah Khan shuah.k...@hp.com
CC: sta...@vger.kernel.org 3.0.y
---
 arch/powerpc/platforms/embedded6xx/wii.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/embedded6xx/wii.c b/arch/powerpc/platforms/embedded6xx/wii.c
index 1b5dc1a..daf793b 100644
--- a/arch/powerpc/platforms/embedded6xx/wii.c
+++ b/arch/powerpc/platforms/embedded6xx/wii.c
@@ -85,9 +85,11 @@ void __init wii_memory_fixups(void)
 	wii_hole_start = p[0].base + p[0].size;
 	wii_hole_size = p[1].base - wii_hole_start;
 
-	pr_info("MEM1: %08llx %08llx\n", p[0].base, p[0].size);
+	pr_info("MEM1: %08llx %08llx\n",
+		(unsigned long long) p[0].base, (unsigned long long) p[0].size);
 	pr_info("HOLE: %08lx %08lx\n", wii_hole_start, wii_hole_size);
-	pr_info("MEM2: %08llx %08llx\n", p[1].base, p[1].size);
+	pr_info("MEM2: %08llx %08llx\n",
+		(unsigned long long) p[1].base, (unsigned long long) p[1].size);
 
 	p[0].size += wii_hole_size + p[1].size;
 
-- 
1.7.9.5
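The fix follows a common pattern for types whose width depends on the configuration: cast to unsigned long long at the call site so the argument always matches %llx. A minimal userspace sketch of the same idea follows. The typedef phys_addr_demo_t is a hypothetical stand-in that assumes a 32-bit address type, and format_mem_range is an illustrative helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for phys_addr_t, assumed to be 32 bits wide here. */
typedef uint32_t phys_addr_demo_t;

/*
 * Format a "TAG: base size" line. The explicit casts make both
 * arguments match %llx regardless of how wide phys_addr_demo_t is,
 * which is what silences the -Wformat diagnostic.
 */
static int format_mem_range(char *buf, size_t len, const char *tag,
			    phys_addr_demo_t base, phys_addr_demo_t size)
{
	return snprintf(buf, len, "%s: %08llx %08llx\n", tag,
			(unsigned long long)base,
			(unsigned long long)size);
}
```

Changing the typedef to uint64_t leaves the format string and the call unchanged, which is the point of casting rather than switching the specifier.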
Re: [PATCH] vfio powerpc: enabled on powernv platform
On Wed, 2012-12-12 at 23:34 +1100, Alexey Kardashevskiy wrote: This patch initializes IOMMU groups based on the IOMMU configuration discovered during the PCI scan on POWERNV (POWER non virtualized) platform. The IOMMU groups are to be used later by VFIO driver (PCI pass through). It also implements an API for mapping/unmapping pages for guest PCI drivers and providing DMA window properties. This API is going to be used later by QEMU-VFIO to handle h_put_tce hypercalls from the KVM guest. Although this driver has been tested only on the POWERNV platform, it should work on any platform which supports TCE tables. To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config option and configure VFIO as required. Cc: David Gibson da...@gibson.dropbear.id.au Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- arch/powerpc/include/asm/iommu.h | 10 ++ arch/powerpc/kernel/iommu.c | 329 ++ arch/powerpc/platforms/powernv/pci.c | 134 ++ drivers/iommu/Kconfig|8 + 4 files changed, 481 insertions(+) diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h index cbfe678..3c861ae 100644 --- a/arch/powerpc/include/asm/iommu.h +++ b/arch/powerpc/include/asm/iommu.h @@ -76,6 +76,9 @@ struct iommu_table { struct iommu_pool large_pool; struct iommu_pool pools[IOMMU_NR_POOLS]; unsigned long *it_map; /* A simple allocation bitmap for now */ +#ifdef CONFIG_IOMMU_API + struct iommu_group *it_group; +#endif }; struct scatterlist; @@ -147,5 +150,12 @@ static inline void iommu_restore(void) } #endif +extern void iommu_reset_table(struct iommu_table *tbl, bool restore); +extern long iommu_clear_tces(struct iommu_table *tbl, unsigned long ioba, + unsigned long size); +extern long iommu_put_tces(struct iommu_table *tbl, unsigned long ioba, + uint64_t tce, enum dma_data_direction direction, + unsigned long size); + #endif /* __KERNEL__ */ #endif /* _ASM_IOMMU_H */ diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index ff5a6ce..f3bb2e7 100644 --- 
a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -36,6 +36,7 @@ #include linux/hash.h #include linux/fault-inject.h #include linux/pci.h +#include linux/uaccess.h #include asm/io.h #include asm/prom.h #include asm/iommu.h @@ -44,6 +45,7 @@ #include asm/kdump.h #include asm/fadump.h #include asm/vio.h +#include asm/tce.h #define DBG(...) @@ -856,3 +858,330 @@ void iommu_free_coherent(struct iommu_table *tbl, size_t size, free_pages((unsigned long)vaddr, get_order(size)); } } + +#ifdef CONFIG_IOMMU_API +/* + * SPAPR TCE API + */ + +struct vwork { + struct mm_struct*mm; + longnpage; + struct work_struct work; +}; + +/* delayed decrement/increment for locked_vm */ +static void lock_acct_bg(struct work_struct *work) +{ + struct vwork *vwork = container_of(work, struct vwork, work); + struct mm_struct *mm; + + mm = vwork-mm; + down_write(mm-mmap_sem); + mm-locked_vm += vwork-npage; + up_write(mm-mmap_sem); + mmput(mm); + kfree(vwork); +} + +static void lock_acct(long npage) +{ + struct vwork *vwork; + struct mm_struct *mm; + + if (!current-mm) + return; /* process exited */ + + if (down_write_trylock(current-mm-mmap_sem)) { + current-mm-locked_vm += npage; + up_write(current-mm-mmap_sem); + return; + } + + /* + * Couldn't get mmap_sem lock, so must setup to update + * mm-locked_vm later. If locked_vm were atomic, we + * wouldn't need this silliness + */ + vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL); + if (!vwork) + return; + mm = get_task_mm(current); + if (!mm) { + kfree(vwork); + return; + } + INIT_WORK(vwork-work, lock_acct_bg); + vwork-mm = mm; + vwork-npage = npage; + schedule_work(vwork-work); +} Locked page accounting in this version is very, very broken. How do powerpc folks feel about seemingly generic kernel iommu interfaces messing with the current task mm? Besides that, more problems below... + +/* + * iommu_reset_table is called when it started/stopped being used. 
+ * + * restore==true says to bring the iommu_table into the state as it was + * before being used by VFIO. + */ +void iommu_reset_table(struct iommu_table *tbl, bool restore) +{ + /* Page#0 is marked as used in iommu_init_table, so we clear it... */ + if (!restore (tbl-it_offset == 0)) + clear_bit(0, tbl-it_map); + + iommu_clear_tces(tbl, tbl-it_offset, tbl-it_size); This does locked page accounting and unpins pages, even on startup when the pages
[PATCH] powerpc+of: Rename and fix OF reconfig notifier error inject module
This module used to inject errors in the pSeries specific dynamic reconfiguration notifiers. Those are gone however, replaced by generic notifiers for changes to the device-tree. So let's update the module to deal with these instead and rename it along the way. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- lib/Kconfig.debug| 10 ++--- lib/Makefile |4 +- lib/of-reconfig-notifier-error-inject.c | 51 ++ lib/pSeries-reconfig-notifier-error-inject.c | 51 -- 4 files changed, 58 insertions(+), 58 deletions(-) create mode 100644 lib/of-reconfig-notifier-error-inject.c delete mode 100644 lib/pSeries-reconfig-notifier-error-inject.c diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 28e9d6c9..c2d89f3 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1192,14 +1192,14 @@ config MEMORY_NOTIFIER_ERROR_INJECT If unsure, say N. -config PSERIES_RECONFIG_NOTIFIER_ERROR_INJECT - tristate pSeries reconfig notifier error injection module - depends on PPC_PSERIES NOTIFIER_ERROR_INJECTION +config OF_RECONFIG_NOTIFIER_ERROR_INJECT + tristate OF reconfig notifier error injection module + depends on OF_DYNAMIC NOTIFIER_ERROR_INJECTION help This option provides the ability to inject artifical errors to - pSeries reconfig notifier chain callbacks. It is controlled + OF reconfig notifier chain callbacks. It is controlled through debugfs interface under - /sys/kernel/debug/notifier-error-inject/pSeries-reconfig/ + /sys/kernel/debug/notifier-error-inject/OF-reconfig/ If the notifier call chain should be failed with some events notified, write the error code to actions/notifier event/error. 
diff --git a/lib/Makefile b/lib/Makefile index 821a162..7c00908 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -94,8 +94,8 @@ obj-$(CONFIG_NOTIFIER_ERROR_INJECTION) += notifier-error-inject.o obj-$(CONFIG_CPU_NOTIFIER_ERROR_INJECT) += cpu-notifier-error-inject.o obj-$(CONFIG_PM_NOTIFIER_ERROR_INJECT) += pm-notifier-error-inject.o obj-$(CONFIG_MEMORY_NOTIFIER_ERROR_INJECT) += memory-notifier-error-inject.o -obj-$(CONFIG_PSERIES_RECONFIG_NOTIFIER_ERROR_INJECT) += \ - pSeries-reconfig-notifier-error-inject.o +obj-$(CONFIG_OF_RECONFIG_NOTIFIER_ERROR_INJECT) += \ + of-reconfig-notifier-error-inject.o lib-$(CONFIG_GENERIC_BUG) += bug.o diff --git a/lib/of-reconfig-notifier-error-inject.c b/lib/of-reconfig-notifier-error-inject.c new file mode 100644 index 000..8dc7986 --- /dev/null +++ b/lib/of-reconfig-notifier-error-inject.c @@ -0,0 +1,51 @@ +#include linux/kernel.h +#include linux/module.h +#include linux/of.h + +#include notifier-error-inject.h + +static int priority; +module_param(priority, int, 0); +MODULE_PARM_DESC(priority, specify OF reconfig notifier priority); + +static struct notifier_err_inject reconfig_err_inject = { + .actions = { + { NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_ATTACH_NODE) }, + { NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_DETACH_NODE) }, + { NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_ADD_PROPERTY) }, + { NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_REMOVE_PROPERTY) }, + { NOTIFIER_ERR_INJECT_ACTION(OF_RECONFIG_UPDATE_PROPERTY) }, + {} + } +}; + +static struct dentry *dir; + +static int err_inject_init(void) +{ + int err; + + dir = notifier_err_inject_init(OF-reconfig, + notifier_err_inject_dir, reconfig_err_inject, priority); + if (IS_ERR(dir)) + return PTR_ERR(dir); + + err = of_reconfig_notifier_register(reconfig_err_inject.nb); + if (err) + debugfs_remove_recursive(dir); + + return err; +} + +static void err_inject_exit(void) +{ + of_reconfig_notifier_unregister(reconfig_err_inject.nb); + debugfs_remove_recursive(dir); +} + 
+module_init(err_inject_init); +module_exit(err_inject_exit); + +MODULE_DESCRIPTION(OF reconfig notifier error injection module); +MODULE_LICENSE(GPL); +MODULE_AUTHOR(Akinobu Mita akinobu.m...@gmail.com); diff --git a/lib/pSeries-reconfig-notifier-error-inject.c b/lib/pSeries-reconfig-notifier-error-inject.c deleted file mode 100644 index 7f7c98d..000 --- a/lib/pSeries-reconfig-notifier-error-inject.c +++ /dev/null @@ -1,51 +0,0 @@ -#include linux/kernel.h -#include linux/module.h - -#include asm/pSeries_reconfig.h - -#include notifier-error-inject.h - -static int priority; -module_param(priority, int, 0); -MODULE_PARM_DESC(priority, specify pSeries reconfig notifier priority); - -static struct notifier_err_inject reconfig_err_inject = { - .actions = { - { NOTIFIER_ERR_INJECT_ACTION(PSERIES_RECONFIG_ADD) }, - { NOTIFIER_ERR_INJECT_ACTION(PSERIES_RECONFIG_REMOVE) }, - {
[PATCH 1/3] powerpc: Run savedefconfig over pseries, ppc64 and ppc64e defconfig
No changes, just update the configs with savedefconfig. Signed-off-by: Anton Blanchard an...@samba.org --- Index: b/arch/powerpc/configs/ppc64_defconfig === --- a/arch/powerpc/configs/ppc64_defconfig +++ b/arch/powerpc/configs/ppc64_defconfig @@ -5,6 +5,9 @@ CONFIG_SMP=y CONFIG_EXPERIMENTAL=y CONFIG_SYSVIPC=y CONFIG_POSIX_MQUEUE=y +CONFIG_IRQ_DOMAIN_DEBUG=y +CONFIG_NO_HZ=y +CONFIG_HIGH_RES_TIMERS=y CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y CONFIG_IKCONFIG=y @@ -21,6 +24,7 @@ CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y +CONFIG_PARTITION_ADVANCED=y CONFIG_PPC_SPLPAR=y CONFIG_SCANLOG=m CONFIG_PPC_SMLPAR=y @@ -42,11 +46,8 @@ CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_GOV_POWERSAVE=y CONFIG_CPU_FREQ_GOV_USERSPACE=y CONFIG_CPU_FREQ_PMAC64=y -CONFIG_NO_HZ=y -CONFIG_HIGH_RES_TIMERS=y CONFIG_HZ_100=y CONFIG_BINFMT_MISC=m -CONFIG_HOTPLUG_CPU=y CONFIG_KEXEC=y CONFIG_IRQ_ALL_CPUS=y CONFIG_MEMORY_HOTREMOVE=y @@ -73,7 +74,6 @@ CONFIG_INET_ESP=m CONFIG_INET_IPCOMP=m # CONFIG_IPV6 is not set CONFIG_NETFILTER=y -CONFIG_NETFILTER_NETLINK_QUEUE=m CONFIG_NF_CONNTRACK=m CONFIG_NF_CONNTRACK_EVENTS=y CONFIG_NF_CT_PROTO_SCTP=m @@ -130,19 +130,12 @@ CONFIG_NETFILTER_XT_MATCH_U32=m CONFIG_NF_CONNTRACK_IPV4=m CONFIG_IP_NF_QUEUE=m CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m CONFIG_IP_NF_MATCH_AH=m CONFIG_IP_NF_MATCH_ECN=m CONFIG_IP_NF_MATCH_TTL=m CONFIG_IP_NF_FILTER=m CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_NF_NAT=m -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_NF_NAT_SNMP_BASIC=m CONFIG_IP_NF_MANGLE=m CONFIG_IP_NF_TARGET_CLUSTERIP=m CONFIG_IP_NF_TARGET_ECN=m @@ -151,6 +144,7 @@ CONFIG_IP_NF_RAW=m CONFIG_IP_NF_ARPTABLES=m CONFIG_IP_NF_ARPFILTER=m CONFIG_IP_NF_ARP_MANGLE=m +CONFIG_BPF_JIT=y CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug CONFIG_PROC_DEVICETREE=y CONFIG_BLK_DEV_FD=y @@ -173,7 +167,6 @@ CONFIG_CHR_DEV_SG=y 
CONFIG_SCSI_MULTI_LUN=y CONFIG_SCSI_CONSTANTS=y CONFIG_SCSI_FC_ATTRS=y -CONFIG_SCSI_SAS_ATTRS=m CONFIG_SCSI_CXGB3_ISCSI=m CONFIG_SCSI_CXGB4_ISCSI=m CONFIG_SCSI_BNX2_ISCSI=m @@ -205,13 +198,6 @@ CONFIG_DM_SNAPSHOT=m CONFIG_DM_MIRROR=m CONFIG_DM_ZERO=m CONFIG_DM_MULTIPATH=m -CONFIG_IEEE1394=y -CONFIG_IEEE1394_OHCI1394=y -CONFIG_IEEE1394_SBP2=m -CONFIG_IEEE1394_ETH1394=m -CONFIG_IEEE1394_RAWIO=y -CONFIG_IEEE1394_VIDEO1394=m -CONFIG_IEEE1394_DV1394=m CONFIG_ADB_PMU=y CONFIG_PMAC_SMU=y CONFIG_THERM_PM72=y @@ -220,50 +206,43 @@ CONFIG_WINDFARM_PM81=y CONFIG_WINDFARM_PM91=y CONFIG_WINDFARM_PM112=y CONFIG_WINDFARM_PM121=y -CONFIG_NETDEVICES=y -CONFIG_DUMMY=m CONFIG_BONDING=m +CONFIG_DUMMY=m +CONFIG_NETCONSOLE=y +CONFIG_NETPOLL_TRAP=y CONFIG_TUN=m -CONFIG_MARVELL_PHY=y -CONFIG_BROADCOM_PHY=m -CONFIG_NET_ETHERNET=y -CONFIG_SUNGEM=y -CONFIG_NET_VENDOR_3COM=y CONFIG_VORTEX=y -CONFIG_IBMVETH=m -CONFIG_NET_PCI=y -CONFIG_PCNET32=y -CONFIG_E100=y CONFIG_ACENIC=m CONFIG_ACENIC_OMIT_TIGON_I=y -CONFIG_E1000=y -CONFIG_E1000E=y +CONFIG_PCNET32=y CONFIG_TIGON3=y -CONFIG_BNX2=m -CONFIG_SPIDER_NET=m -CONFIG_GELIC_NET=m -CONFIG_GELIC_WIRELESS=y CONFIG_CHELSIO_T1=m -CONFIG_CHELSIO_T3=m -CONFIG_CHELSIO_T4=m +CONFIG_BE2NET=m +CONFIG_S2IO=m +CONFIG_IBMVETH=m CONFIG_EHEA=m -CONFIG_IXGBE=m +CONFIG_E100=y +CONFIG_E1000=y +CONFIG_E1000E=y CONFIG_IXGB=m -CONFIG_S2IO=m +CONFIG_IXGBE=m +CONFIG_MLX4_EN=m CONFIG_MYRI10GE=m -CONFIG_NETXEN_NIC=m CONFIG_PASEMI_MAC=y -CONFIG_MLX4_EN=m CONFIG_QLGE=m -CONFIG_BE2NET=m +CONFIG_NETXEN_NIC=m +CONFIG_SUNGEM=y +CONFIG_GELIC_NET=m +CONFIG_GELIC_WIRELESS=y +CONFIG_SPIDER_NET=m +CONFIG_MARVELL_PHY=y +CONFIG_BROADCOM_PHY=m CONFIG_PPP=m -CONFIG_PPP_ASYNC=m -CONFIG_PPP_SYNC_TTY=m -CONFIG_PPP_DEFLATE=m CONFIG_PPP_BSDCOMP=m +CONFIG_PPP_DEFLATE=m CONFIG_PPPOE=m -CONFIG_NETCONSOLE=y -CONFIG_NETPOLL_TRAP=y +CONFIG_PPP_ASYNC=m +CONFIG_PPP_SYNC_TTY=m # CONFIG_INPUT_MOUSEDEV_PSAUX is not set CONFIG_INPUT_EVDEV=m CONFIG_INPUT_MISC=y @@ -279,13 +258,10 @@ CONFIG_HVC_RTAS=y 
CONFIG_HVC_BEAT=y CONFIG_HVCS=m CONFIG_IBM_BSR=m -CONFIG_HW_RANDOM=m -CONFIG_HW_RANDOM_PSERIES=m CONFIG_RAW_DRIVER=y CONFIG_I2C_CHARDEV=y CONFIG_I2C_AMD8111=y CONFIG_I2C_PASEMI=y -# CONFIG_HWMON is not set CONFIG_VIDEO_OUTPUT_CONTROL=m CONFIG_FB=y CONFIG_FIRMWARE_EDID=y @@ -300,7 +276,6 @@ CONFIG_FB_RADEON=y CONFIG_FB_IBM_GXT4500=y CONFIG_FB_PS3=m CONFIG_LCD_CLASS_DEVICE=y -CONFIG_DISPLAY_SUPPORT=y # CONFIG_VGA_CONSOLE is not set CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_LOGO=y @@ -317,18 +292,16 @@ CONFIG_SND_AOA_FABRIC_LAYOUT=m CONFIG_SND_AOA_ONYX=m CONFIG_SND_AOA_TAS=m CONFIG_SND_AOA_TOONIE=m -CONFIG_USB_HIDDEV=y CONFIG_HID_GYRATION=y CONFIG_HID_PANTHERLORD=y CONFIG_HID_PETALYNX=y CONFIG_HID_SAMSUNG=y CONFIG_HID_SONY=y CONFIG_HID_SUNPLUS=y +CONFIG_USB_HIDDEV=y CONFIG_USB=y -CONFIG_USB_DEVICEFS=y CONFIG_USB_MON=m CONFIG_USB_EHCI_HCD=y
[PATCH 2/3] powerpc: Cleanup NLS config options on pseries, ppc64 and ppc64e defconfig
Set CONFIG_NLS_DEFAULT to utf8. The distros do this (eg ppc64 FC17 and RHEL6) as well as the x86 defconfigs. Userspace these days is most likely to expect utf8 anyway. Signed-off-by: Anton Blanchard an...@samba.org --- Index: b/arch/powerpc/configs/ppc64_defconfig === --- a/arch/powerpc/configs/ppc64_defconfig +++ b/arch/powerpc/configs/ppc64_defconfig @@ -372,43 +372,11 @@ CONFIG_NFSD_V4=y CONFIG_CIFS=m CONFIG_CIFS_XATTR=y CONFIG_CIFS_POSIX=y +CONFIG_NLS_DEFAULT=utf8 CONFIG_NLS_CODEPAGE_437=y -CONFIG_NLS_CODEPAGE_737=m -CONFIG_NLS_CODEPAGE_775=m -CONFIG_NLS_CODEPAGE_850=m -CONFIG_NLS_CODEPAGE_852=m -CONFIG_NLS_CODEPAGE_855=m -CONFIG_NLS_CODEPAGE_857=m -CONFIG_NLS_CODEPAGE_860=m -CONFIG_NLS_CODEPAGE_861=m -CONFIG_NLS_CODEPAGE_862=m -CONFIG_NLS_CODEPAGE_863=m -CONFIG_NLS_CODEPAGE_864=m -CONFIG_NLS_CODEPAGE_865=m -CONFIG_NLS_CODEPAGE_866=m -CONFIG_NLS_CODEPAGE_869=m -CONFIG_NLS_CODEPAGE_936=m -CONFIG_NLS_CODEPAGE_950=m -CONFIG_NLS_CODEPAGE_932=m -CONFIG_NLS_CODEPAGE_949=m -CONFIG_NLS_CODEPAGE_874=m -CONFIG_NLS_ISO8859_8=m -CONFIG_NLS_CODEPAGE_1250=m -CONFIG_NLS_CODEPAGE_1251=m -CONFIG_NLS_ASCII=m +CONFIG_NLS_ASCII=y CONFIG_NLS_ISO8859_1=y -CONFIG_NLS_ISO8859_2=m -CONFIG_NLS_ISO8859_3=m -CONFIG_NLS_ISO8859_4=m -CONFIG_NLS_ISO8859_5=m -CONFIG_NLS_ISO8859_6=m -CONFIG_NLS_ISO8859_7=m -CONFIG_NLS_ISO8859_9=m -CONFIG_NLS_ISO8859_13=m -CONFIG_NLS_ISO8859_14=m -CONFIG_NLS_ISO8859_15=m -CONFIG_NLS_KOI8_R=m -CONFIG_NLS_KOI8_U=m +CONFIG_NLS_UTF8=y CONFIG_CRC_T10DIF=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_KERNEL=y Index: b/arch/powerpc/configs/ppc64e_defconfig === --- a/arch/powerpc/configs/ppc64e_defconfig +++ b/arch/powerpc/configs/ppc64e_defconfig @@ -290,43 +290,11 @@ CONFIG_NFSD_V4=y CONFIG_CIFS=m CONFIG_CIFS_XATTR=y CONFIG_CIFS_POSIX=y +CONFIG_NLS_DEFAULT=utf8 CONFIG_NLS_CODEPAGE_437=y -CONFIG_NLS_CODEPAGE_737=m -CONFIG_NLS_CODEPAGE_775=m -CONFIG_NLS_CODEPAGE_850=m -CONFIG_NLS_CODEPAGE_852=m -CONFIG_NLS_CODEPAGE_855=m -CONFIG_NLS_CODEPAGE_857=m -CONFIG_NLS_CODEPAGE_860=m 
-CONFIG_NLS_CODEPAGE_861=m -CONFIG_NLS_CODEPAGE_862=m -CONFIG_NLS_CODEPAGE_863=m -CONFIG_NLS_CODEPAGE_864=m -CONFIG_NLS_CODEPAGE_865=m -CONFIG_NLS_CODEPAGE_866=m -CONFIG_NLS_CODEPAGE_869=m -CONFIG_NLS_CODEPAGE_936=m -CONFIG_NLS_CODEPAGE_950=m -CONFIG_NLS_CODEPAGE_932=m -CONFIG_NLS_CODEPAGE_949=m -CONFIG_NLS_CODEPAGE_874=m -CONFIG_NLS_ISO8859_8=m -CONFIG_NLS_CODEPAGE_1250=m -CONFIG_NLS_CODEPAGE_1251=m -CONFIG_NLS_ASCII=m +CONFIG_NLS_ASCII=y CONFIG_NLS_ISO8859_1=y -CONFIG_NLS_ISO8859_2=m -CONFIG_NLS_ISO8859_3=m -CONFIG_NLS_ISO8859_4=m -CONFIG_NLS_ISO8859_5=m -CONFIG_NLS_ISO8859_6=m -CONFIG_NLS_ISO8859_7=m -CONFIG_NLS_ISO8859_9=m -CONFIG_NLS_ISO8859_13=m -CONFIG_NLS_ISO8859_14=m -CONFIG_NLS_ISO8859_15=m -CONFIG_NLS_KOI8_R=m -CONFIG_NLS_KOI8_U=m +CONFIG_NLS_UTF8=y CONFIG_CRC_T10DIF=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_KERNEL=y Index: b/arch/powerpc/configs/pseries_defconfig === --- a/arch/powerpc/configs/pseries_defconfig +++ b/arch/powerpc/configs/pseries_defconfig @@ -298,9 +298,11 @@ CONFIG_NFSD_V4=y CONFIG_CIFS=m CONFIG_CIFS_XATTR=y CONFIG_CIFS_POSIX=y +CONFIG_NLS_DEFAULT=utf8 CONFIG_NLS_CODEPAGE_437=y CONFIG_NLS_ASCII=y CONFIG_NLS_ISO8859_1=y +CONFIG_NLS_UTF8=y CONFIG_CRC_T10DIF=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_KERNEL=y ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 3/3] powerpc: Enable devtmpfs, EFI partition support and tmpfs ACLs on pseries, ppc64 and ppc64e defconfig
We need devtmpfs enabled to boot on recent versions of Fedora. EFI partitions will be useful for large block devices. tmpfs ACL support is used by some distros for managing access to devices.

Signed-off-by: Anton Blanchard <an...@samba.org>
---

Index: b/arch/powerpc/configs/pseries_defconfig
===
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -32,6 +32,8 @@ CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
+CONFIG_PARTITION_ADVANCED=y
+CONFIG_EFI_PARTITION=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_SCANLOG=m
 CONFIG_PPC_SMLPAR=y
@@ -118,6 +120,8 @@ CONFIG_IP_NF_FILTER=m
 CONFIG_IP_NF_TARGET_REJECT=m
 CONFIG_IP_NF_TARGET_ULOG=m
 CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_PROC_DEVICETREE=y
 CONFIG_PARPORT=m
 CONFIG_PARPORT_PC=m
@@ -283,6 +287,7 @@ CONFIG_MSDOS_FS=y
 CONFIG_VFAT_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_HUGETLBFS=y
 CONFIG_CRAMFS=m
 CONFIG_SQUASHFS=m
Index: b/arch/powerpc/configs/ppc64_defconfig
===
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -25,6 +25,7 @@ CONFIG_MODULE_UNLOAD=y
 CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
 CONFIG_PARTITION_ADVANCED=y
+CONFIG_EFI_PARTITION=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_SCANLOG=m
 CONFIG_PPC_SMLPAR=y
@@ -146,6 +147,8 @@ CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
 CONFIG_BPF_JIT=y
 CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_PROC_DEVICETREE=y
 CONFIG_BLK_DEV_FD=y
 CONFIG_BLK_DEV_LOOP=y
@@ -354,6 +357,7 @@ CONFIG_MSDOS_FS=y
 CONFIG_VFAT_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_HUGETLBFS=y
 CONFIG_HFS_FS=m
 CONFIG_HFSPLUS_FS=m
Index: b/arch/powerpc/configs/ppc64e_defconfig
===
--- a/arch/powerpc/configs/ppc64e_defconfig
+++ b/arch/powerpc/configs/ppc64e_defconfig
@@ -22,6 +22,7 @@ CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_MAC_PARTITION=y
+CONFIG_EFI_PARTITION=y
 CONFIG_P5020_DS=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_GOV_POWERSAVE=y
@@ -119,6 +120,8 @@ CONFIG_IP_NF_ARPTABLES=m
 CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
 CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_PROC_DEVICETREE=y
 CONFIG_BLK_DEV_FD=y
 CONFIG_BLK_DEV_LOOP=y
@@ -277,6 +280,7 @@ CONFIG_MSDOS_FS=y
 CONFIG_VFAT_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
 CONFIG_HFS_FS=m
 CONFIG_HFSPLUS_FS=m
 CONFIG_CRAMFS=y
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc: Avoid load of static chain register when calling nested functions through a pointer on 64bit
The ppc64 ABI has a static chain register (r11) which is only used when calling nested functions through a pointer. Considering that we take a dim view of nested functions in the kernel, we have a lot of unnecessary overhead here.

gcc 4.7 has an option to disable loading of r11, so let's use it. If hell freezes over and hipsters manage to litter the kernel with nested functions, gcc will give us an error message and won't simply compile bad code:

You cannot take the address of a nested function if you use the -mno-pointers-to-nested-functions option.

Furthermore, our kernel module trampolines don't set up the static chain register, so adding this option and forcing gcc to error out makes even more sense.

Signed-off-by: Anton Blanchard <an...@samba.org>
---

Index: b/arch/powerpc/Makefile
===
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -85,6 +85,7 @@ endif
 CFLAGS-$(CONFIG_PPC64) := -mtraceback=no -mcall-aixdesc
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,-mminimal-toc)
+CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions)
 CFLAGS-$(CONFIG_PPC32) := -ffixed-r2 -mmultiple
 CFLAGS-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=power7,-mtune=power4)
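For readers unfamiliar with why the kernel can do without the static chain, here is a minimal sketch of the usual alternative to a nested function: an ordinary file-scope callback that receives its state through an explicit context pointer. All names here (visit_fn, for_each_int, sum_ctx, add_to_sum, sum_ints) are made up for illustration and do not come from the patch.

```c
#include <stddef.h>

/* Hypothetical callback type: the context travels as an explicit argument. */
typedef void (*visit_fn)(int value, void *ctx);

/* Call fn(value, ctx) for every element of the array. */
static void for_each_int(const int *values, size_t count, visit_fn fn, void *ctx)
{
	for (size_t i = 0; i < count; i++)
		fn(values[i], ctx);
}

/* State that a nested function would have captured from its parent frame. */
struct sum_ctx {
	long total;
};

/* Ordinary file-scope function: no static chain is needed to reach its state. */
static void add_to_sum(int value, void *ctx)
{
	struct sum_ctx *sum = ctx;

	sum->total += value;
}

/* Sum an array by passing add_to_sum through a function pointer. */
static long sum_ints(const int *values, size_t count)
{
	struct sum_ctx sum = { .total = 0 };

	for_each_int(values, count, add_to_sum, &sum);
	return sum.total;
}
```

Because the callee never reaches into its caller's frame, a call through the pointer has no use for r11, which is the situation the new compiler option relies on.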
Re: [PATCH v3] powerpc: fix wii_memory_fixups() compile error on 3.0.y tree
On Wed, 2012-12-12 at 16:27 -0700, Shuah Khan wrote:
> Fix the following wii_memory_fixups() compile error on the 3.0.y tree with wii_defconfig.
>
> CC arch/powerpc/platforms/embedded6xx/wii.o
> arch/powerpc/platforms/embedded6xx/wii.c: In function ‘wii_memory_fixups’:
> arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘phys_addr_t’ [-Werror=format]
> arch/powerpc/platforms/embedded6xx/wii.c:88:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Werror=format]
> arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘phys_addr_t’ [-Werror=format]
> arch/powerpc/platforms/embedded6xx/wii.c:90:2: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Werror=format]
> cc1: all warnings being treated as errors
> make[2]: *** [arch/powerpc/platforms/embedded6xx/wii.o] Error 1
> make[1]: *** [arch/powerpc/platforms/embedded6xx] Error 2
> make: *** [arch/powerpc/platforms] Error 2
>
> Signed-off-by: Shuah Khan <shuah.k...@hp.com>
> CC: sta...@vger.kernel.org 3.0.y
[...]

This looks fine, and I've queued it up for 3.2 since the errant code wasn't removed until 3.3.

Ben.

-- 
Ben Hutchings
Theory and practice are closer in theory than in practice. - John Levine, moderator of comp.compilers
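The body of the patch is elided above, so the following is only a sketch of one common way to silence this class of error, not necessarily what the patch did: cast the value to unsigned long long so it matches %llx whatever the width of the typedef. The phys_addr_t typedef below is a stand-in and format_phys_range() is a hypothetical helper.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel typedef; its real width depends on the configuration. */
typedef uint32_t phys_addr_t;

/*
 * Format "base-end" into buf. The casts make each argument match %llx
 * regardless of how wide phys_addr_t is.
 */
static int format_phys_range(char *buf, size_t len,
			     phys_addr_t base, phys_addr_t size)
{
	return snprintf(buf, len, "%08llx-%08llx",
			(unsigned long long)base,
			(unsigned long long)(base + size - 1));
}
```

Without the casts, a 32-bit phys_addr_t passed to %llx triggers the -Wformat diagnostic quoted in the report.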
Re: [PATCH] vfio powerpc: enabled on powernv platform
On 13/12/12 10:30, Alex Williamson wrote: On Wed, 2012-12-12 at 23:34 +1100, Alexey Kardashevskiy wrote: This patch initializes IOMMU groups based on the IOMMU configuration discovered during the PCI scan on POWERNV (POWER non virtualized) platform. The IOMMU groups are to be used later by VFIO driver (PCI pass through). It also implements an API for mapping/unmapping pages for guest PCI drivers and providing DMA window properties. This API is going to be used later by QEMU-VFIO to handle h_put_tce hypercalls from the KVM guest. Although this driver has been tested only on the POWERNV platform, it should work on any platform which supports TCE tables. To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config option and configure VFIO as required. Cc: David Gibson da...@gibson.dropbear.id.au Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- arch/powerpc/include/asm/iommu.h | 10 ++ arch/powerpc/kernel/iommu.c | 329 ++ arch/powerpc/platforms/powernv/pci.c | 134 ++ drivers/iommu/Kconfig|8 + 4 files changed, 481 insertions(+) diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h index cbfe678..3c861ae 100644 --- a/arch/powerpc/include/asm/iommu.h +++ b/arch/powerpc/include/asm/iommu.h @@ -76,6 +76,9 @@ struct iommu_table { struct iommu_pool large_pool; struct iommu_pool pools[IOMMU_NR_POOLS]; unsigned long *it_map; /* A simple allocation bitmap for now */ +#ifdef CONFIG_IOMMU_API + struct iommu_group *it_group; +#endif }; struct scatterlist; @@ -147,5 +150,12 @@ static inline void iommu_restore(void) } #endif +extern void iommu_reset_table(struct iommu_table *tbl, bool restore); +extern long iommu_clear_tces(struct iommu_table *tbl, unsigned long ioba, + unsigned long size); +extern long iommu_put_tces(struct iommu_table *tbl, unsigned long ioba, + uint64_t tce, enum dma_data_direction direction, + unsigned long size); + #endif /* __KERNEL__ */ #endif /* _ASM_IOMMU_H */ diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c 
index ff5a6ce..f3bb2e7 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -36,6 +36,7 @@ #include linux/hash.h #include linux/fault-inject.h #include linux/pci.h +#include linux/uaccess.h #include asm/io.h #include asm/prom.h #include asm/iommu.h @@ -44,6 +45,7 @@ #include asm/kdump.h #include asm/fadump.h #include asm/vio.h +#include asm/tce.h #define DBG(...) @@ -856,3 +858,330 @@ void iommu_free_coherent(struct iommu_table *tbl, size_t size, free_pages((unsigned long)vaddr, get_order(size)); } } + +#ifdef CONFIG_IOMMU_API +/* + * SPAPR TCE API + */ + +struct vwork { + struct mm_struct*mm; + longnpage; + struct work_struct work; +}; + +/* delayed decrement/increment for locked_vm */ +static void lock_acct_bg(struct work_struct *work) +{ + struct vwork *vwork = container_of(work, struct vwork, work); + struct mm_struct *mm; + + mm = vwork-mm; + down_write(mm-mmap_sem); + mm-locked_vm += vwork-npage; + up_write(mm-mmap_sem); + mmput(mm); + kfree(vwork); +} + +static void lock_acct(long npage) +{ + struct vwork *vwork; + struct mm_struct *mm; + + if (!current-mm) + return; /* process exited */ + + if (down_write_trylock(current-mm-mmap_sem)) { + current-mm-locked_vm += npage; + up_write(current-mm-mmap_sem); + return; + } + + /* +* Couldn't get mmap_sem lock, so must setup to update +* mm-locked_vm later. If locked_vm were atomic, we +* wouldn't need this silliness +*/ + vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL); + if (!vwork) + return; + mm = get_task_mm(current); + if (!mm) { + kfree(vwork); + return; + } + INIT_WORK(vwork-work, lock_acct_bg); + vwork-mm = mm; + vwork-npage = npage; + schedule_work(vwork-work); +} Locked page accounting in this version is very, very broken. How do powerpc folks feel about seemingly generic kernel iommu interfaces messing with the current task mm? Besides that, more problems below... + +/* + * iommu_reset_table is called when it started/stopped being used. 
+ * + * restore==true says to bring the iommu_table into the state as it was + * before being used by VFIO. + */ +void iommu_reset_table(struct iommu_table *tbl, bool restore) +{ + /* Page#0 is marked as used in iommu_init_table, so we clear it... */ + if (!restore (tbl-it_offset == 0)) + clear_bit(0, tbl-it_map); + + iommu_clear_tces(tbl, tbl-it_offset, tbl-it_size); This does locked page accounting and unpins pages, even on startup when
Re: [PATCH] vfio powerpc: enabled on powernv platform
On Wed, 2012-12-12 at 07:34 -0700, Alex Williamson wrote:
>> But what would I put there?... IOMMU ID is more than enough at the moment and struct iommu_table does not have anything that would make sense to show in sysfs...
>
> I believe David mentioned that PEs had user visible names. Perhaps they match an enclosure location or something. Group numbers are rather arbitrary and really have no guarantee of persistence. Thanks,

I agree. Make up something, for example domain[PE] or something like that.

Cheers,
Ben.
Re: [PATCH] vfio powerpc: enabled on powernv platform
On Wed, 2012-12-12 at 16:30 -0700, Alex Williamson wrote: Locked page accounting in this version is very, very broken. How do powerpc folks feel about seemingly generic kernel iommu interfaces messing with the current task mm? Besides that, more problems below... Not good at all :-) I don't understand tho ... H_PUT_TCE calls should be in the qemu context (or the guest) as current at the point of the call, so everything should be accounted fine on the *current* task when those calls occur, what's the point of the work queue Alexey ? This code looks horribly complicated ... where does it come from ? +/* + * iommu_reset_table is called when it started/stopped being used. + * + * restore==true says to bring the iommu_table into the state as it was + * before being used by VFIO. + */ +void iommu_reset_table(struct iommu_table *tbl, bool restore) +{ + /* Page#0 is marked as used in iommu_init_table, so we clear it... */ + if (!restore (tbl-it_offset == 0)) + clear_bit(0, tbl-it_map); + + iommu_clear_tces(tbl, tbl-it_offset, tbl-it_size); This does locked page accounting and unpins pages, even on startup when the pages aren't necessarily pinned or accounted against the current process. Not sure what you mean Alex, and not sure either what Alexey implementation actually does but indeed, pages inside an iommu table that was used by the host don't have their refcount elevated by the fact that they are there. So when taking ownership of an iommu for vfio, you probably need to FAIL if any page is already mapped. Only once you know the iommu is clear for use, then you can start populating it and account for anything you put in it (and de-account anything you remove from it when cleaning things up). + + /* ... or restore */ + if (restore (tbl-it_offset == 0)) + set_bit(0, tbl-it_map); +} +EXPORT_SYMBOL_GPL(iommu_reset_table); + +/* + * Returns the number of used IOMMU pages (4K) within + * the same system page (4K or 64K). 
+ * + * syspage_weight_zero is optimized for expected case == 0 + * syspage_weight_one is optimized for expected case 1 + * Other case are not used in this file. + */ +#if PAGE_SIZE == IOMMU_PAGE_SIZE + +#define syspage_weight_zero(map, offset) test_bit((map), (offset)) +#define syspage_weight_one(map, offset)test_bit((map), (offset)) + +#elif PAGE_SIZE/IOMMU_PAGE_SIZE == 16 + +static int syspage_weight_zero(unsigned long *map, unsigned long offset) +{ + offset = PAGE_MASK IOMMU_PAGE_SHIFT; + return 0xUL (map[BIT_WORD(offset)] + (offset (BITS_PER_LONG-1))); +} I would have expected these to be bools and return true if the weight matches the value. What is that business anyway ? It's very obscure. If you replaced 0x above w/ this, would you need the #error below? (1UL (PAGE_SIZE/IOMMU_PAGE_SIZE)) - 1) + +static int syspage_weight_one(unsigned long *map, unsigned long offset) +{ + int ret = 0, nbits = PAGE_SIZE/IOMMU_PAGE_SIZE; + + /* Aligns TCE entry number to system page boundary */ + offset = PAGE_MASK IOMMU_PAGE_SHIFT; + + /* Count used 4K pages */ + while (nbits (ret 2)) { Don't you have a ffs()? Could also be used for _zero. Surely there are some bitops helpers that could help here even on big endian. hweight really doesn't work? + if (test_bit(offset, map)) + ++ret; + + --nbits; + ++offset; + } + + return ret; +} +#else +#error TODO: support other page size +#endif What combinations do you support ? 
+static void tce_flush(struct iommu_table *tbl) +{ + /* Flush/invalidate TLB caches if necessary */ + if (ppc_md.tce_flush) + ppc_md.tce_flush(tbl); + + /* Make sure updates are seen by hardware */ + mb(); +} + +/* + * iommu_clear_tces clears tces and returned the number of system pages + * which it called put_page() on + */ +static long clear_tces_nolock(struct iommu_table *tbl, unsigned long entry, + unsigned long pages) +{ + int i, retpages = 0, clr; + unsigned long oldtce, oldweight; + struct page *page; + + for (i = 0; i pages; ++i, ++entry) { + if (!test_bit(entry - tbl-it_offset, tbl-it_map)) + continue; + + oldtce = ppc_md.tce_get(tbl, entry); + ppc_md.tce_free(tbl, entry, 1); + + oldweight = syspage_weight_one(tbl-it_map, + entry - tbl-it_offset); + clr = __test_and_clear_bit(entry - tbl-it_offset, + tbl-it_map); + + if (WARN_ON(!(oldtce (TCE_PCI_WRITE | TCE_PCI_READ + continue; + + page = pfn_to_page(oldtce PAGE_SHIFT); + + if (WARN_ON(!page)) + continue; + + if
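The review comments above suggest deriving the mask from the page-size ratio and using a population count instead of the two hand-written helpers. A minimal user-space sketch of that idea follows. It assumes the ratio PAGE_SIZE / IOMMU_PAGE_SIZE is a power of two smaller than the word size, and syspage_weight() is a hypothetical name, not a function from the patch.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Count the IOMMU pages marked used in the system page that contains
 * bitmap entry `entry`. `per_page` stands for PAGE_SIZE / IOMMU_PAGE_SIZE
 * and is assumed to be a power of two smaller than BITS_PER_LONG, so an
 * aligned group never straddles two bitmap words.
 */
static unsigned int syspage_weight(const unsigned long *map,
				   unsigned long entry, unsigned int per_page)
{
	/* Align the entry number down to the start of its system page. */
	unsigned long first = entry & ~((unsigned long)per_page - 1);
	unsigned long mask = (1UL << per_page) - 1;
	unsigned long group =
		(map[first / BITS_PER_LONG] >> (first % BITS_PER_LONG)) & mask;
	unsigned int weight = 0;

	/* Portable population count; in the kernel an hweight helper would do. */
	while (group) {
		group &= group - 1;
		weight++;
	}
	return weight;
}
```

With the mask computed from per_page, the same function covers both the 4K and 64K system page cases, and callers can compare the result against 0 or 1 as needed.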
Re: [PATCH] vfio powerpc: enabled on powernv platform
On Wed, 2012-12-12 at 16:30 -0700, Alex Williamson wrote:
> Locked page accounting in this version is very, very broken. How do powerpc folks feel about seemingly generic kernel iommu interfaces messing with the current task mm? Besides that, more problems below...

After a second look though...

This whole accounting business is fucked. First, we simply can't just randomly return errors from H_PUT_TCE because the process reached some rlimit. This is not a proper failure mode. That means that the guest will probably panic() ... possibly right in the middle of some disk writeback or god knows what. Not good.

Also the overhead of doing all that crap on every TCE map/unmap is ridiculous.

Finally, it's just not going to work for real mode which we really want, since we can't take the mmap_sem in real mode anyway, so unless we convert that counter to an atomic, we can't do it.

I'd suggest just not bothering, or if you want to bother, check once when creating a TCE table that the rlimit is enough to bolt as many pages as can be populated in that table and fail to create *that*. The failure mode is much better, ie, qemu failing to create a PCI bus due to insufficient rlimits.

Cheers,
Ben.
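The "check once at table creation" suggestion can be sketched as a single comparison made before any TCE is populated. This is an illustration under assumed inputs, not kernel code: tce_window_fits_limit() and its three parameters are hypothetical, with the limit standing in for an RLIMIT_MEMLOCK-style ceiling.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide once, at table-creation time, whether the whole DMA window may be
 * pinned. All names are illustrative:
 *   window_bytes - size of the DMA window the TCE table can map
 *   locked_bytes - memory the process already has locked
 *   limit_bytes  - the RLIMIT_MEMLOCK-style ceiling
 */
static bool tce_window_fits_limit(uint64_t window_bytes,
				  uint64_t locked_bytes, uint64_t limit_bytes)
{
	if (locked_bytes > limit_bytes)
		return false;
	/* Compare against the remaining headroom to avoid overflow. */
	return window_bytes <= limit_bytes - locked_bytes;
}
```

If the check fails, table creation fails and the guest never sees a spurious H_PUT_TCE error at run time, which is the failure mode argued for above.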
Re: [PATCH] vfio powerpc: enabled on powernv platform
On Thu, 2012-12-13 at 13:57 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2012-12-12 at 16:30 -0700, Alex Williamson wrote:
>> Locked page accounting in this version is very, very broken. How do powerpc folks feel about seemingly generic kernel iommu interfaces messing with the current task mm? Besides that, more problems below...
>
> After a second look though...
>
> This whole accounting business is fucked. First, we simply can't just randomly return errors from H_PUT_TCE because the process reached some rlimit. This is not a proper failure mode. That means that the guest will probably panic() ... possibly right in the middle of some disk writeback or god knows what. Not good.
>
> Also the overhead of doing all that crap on every TCE map/unmap is ridiculous.
>
> Finally, it's just not going to work for real mode which we really want, since we can't take the mmap_sem in real mode anyway, so unless we convert that counter to an atomic, we can't do it.
>
> I'd suggest just not bothering, or if you want to bother, check once when creating a TCE table that the rlimit is enough to bolt as many pages as can be populated in that table and fail to create *that*. The failure mode is much better, ie, qemu failing to create a PCI bus due to insufficient rlimits.

I agree, we don't seem to be headed in the right direction. x86 needs to track rlimits or else a user can exploit the interface to pin all the memory in the system. On power, only the iova window can be pinned, so it's a fixed amount. I could see it as granting access to a group implicitly grants access to pinning the iova window. We can still make it more explicit by handling the rlimit accounting upfront. Thanks,

Alex
[PATCH] powerpc: added DSCR support to ptrace
The DSCR (aka Data Stream Control Register) is supported on some server PowerPC chips and allows some control over the prefetch of data streams.

The kernel already supports a per-thread DSCR value, but there is also a need for the ability to change it from an external process for a specific pid.

The patch adds a new register index PT_DSCR (index=44) which can be set/get by:

  ptrace(PTRACE_POKEUSER, traced_process, PT_DSCR << 3, dscr);
  dscr = ptrace(PTRACE_PEEKUSER, traced_process, PT_DSCR << 3, NULL);

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 arch/powerpc/include/asm/ptrace.h |    1 +
 arch/powerpc/kernel/ptrace.c      |   16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 9c21ed4..340fe36 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -276,6 +276,7 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
 #define PT_DAR 41
 #define PT_DSISR 42
 #define PT_RESULT 43
+#define PT_DSCR 44
 #define PT_REGS_COUNT 44
 
 #define PT_FPR0 48 /* each FP reg occupies 2 slots in this space */
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index c10fc28..d3ba67b 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -179,6 +179,17 @@ static int set_user_msr(struct task_struct *task, unsigned long msr)
 	return 0;
 }
 
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+	return task->thread.dscr;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+	task->thread.dscr = dscr;
+	return 0;
+}
+
 /*
  * We prevent mucking around with the reserved area of trap
  * which are used internally by the kernel.
@@ -200,6 +211,9 @@ unsigned long ptrace_get_reg(struct task_struct *task, int regno)
 	if (regno == PT_MSR)
 		return get_user_msr(task);
 
+	if (regno == PT_DSCR)
+		return get_user_dscr(task);
+
 	if (regno < (sizeof(struct pt_regs) / sizeof(unsigned long)))
 		return ((unsigned long *)task->thread.regs)[regno];
 
@@ -218,6 +232,8 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
 		return set_user_msr(task, data);
 	if (regno == PT_TRAP)
 		return set_user_trap(task, data);
+	if (regno == PT_DSCR)
+		return set_user_dscr(task, data);
 
 	if (regno <= PT_MAX_PUT_REG) {
 		((unsigned long *)task->thread.regs)[regno] = data;
-- 
1.7.10.4
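PTRACE_PEEKUSER and PTRACE_POKEUSER take a byte offset into the user area rather than a register index, which is why the index is scaled in the example calls in the commit message. A small sketch of that arithmetic, assuming 8-byte words on a 64-bit kernel; ptrace_user_offset() is a hypothetical helper and only PT_DSCR comes from the patch.

```c
#include <stddef.h>

#define PT_DSCR 44	/* register index proposed by the patch */

/* Byte offset into the user area for a register index, given the word size. */
static size_t ptrace_user_offset(unsigned int regno, size_t word_size)
{
	return (size_t)regno * word_size;
}
```

For a 64-bit tracee, ptrace_user_offset(PT_DSCR, 8) equals PT_DSCR << 3, the value a tracer would pass as the address argument.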
RE: [PATCH] Revert crypto: caam - Updated SEC-4.0 device tree binding for ERA information.
Hello Kumar

This has been applied to: git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git.

Regards
Vakul

-----Original Message-----
From: Kumar Gala [mailto:ga...@kernel.crashing.org]
Sent: Thursday, December 13, 2012 3:00 AM
To: Garg Vakul-B16394
Cc: linux-cry...@vger.kernel.org; linuxppc-...@ozlabs.org; devicetree-disc...@lists.ozlabs.org
Subject: Re: [PATCH] Revert crypto: caam - Updated SEC-4.0 device tree binding for ERA information.

On Dec 7, 2012, at 2:57 AM, Vakul Garg wrote:
> This reverts commit a2c0911c09190125f52c9941b9d187f601c2f7be.
>
> Signed-off-by: Vakul Garg <va...@freescale.com>
> ---
> Instead of adding SEC era information in crypto node's compatible, a new property 'fsl,sec-era' is being introduced into crypto node.
>
> .../devicetree/bindings/crypto/fsl-sec4.txt | 5 ++---
> 1 files changed, 2 insertions(+), 3 deletions(-)

What tree do you think this has been applied to?

- k
[PATCH] powerpc: added DSCR support to ptrace
The DSCR (aka Data Stream Control Register) is supported on some server PowerPC chips and allows some control over the prefetch of data streams.

The kernel already supports a per-thread DSCR value, but there is also a need for the ability to change it from an external process for a specific pid.

The patch adds a new register index PT_DSCR (index=44) which can be set/get by:

  ptrace(PTRACE_POKEUSER, traced_process, PT_DSCR << 3, dscr);
  dscr = ptrace(PTRACE_PEEKUSER, traced_process, PT_DSCR << 3, NULL);

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 arch/powerpc/include/asm/ptrace.h |    1 +
 arch/powerpc/kernel/ptrace.c      |   17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 9c21ed4..340fe36 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -276,6 +276,7 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
 #define PT_DAR 41
 #define PT_DSISR 42
 #define PT_RESULT 43
+#define PT_DSCR 44
 #define PT_REGS_COUNT 44
 
 #define PT_FPR0 48 /* each FP reg occupies 2 slots in this space */
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index c10fc28..aa19389 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -179,6 +179,18 @@ static int set_user_msr(struct task_struct *task, unsigned long msr)
 	return 0;
 }
 
+static unsigned long get_user_dscr(struct task_struct *task)
+{
+	return task->thread.dscr;
+}
+
+static int set_user_dscr(struct task_struct *task, unsigned long dscr)
+{
+	task->thread.dscr = dscr;
+	task->thread.dscr_inherit = 1;
+	return 0;
+}
+
 /*
  * We prevent mucking around with the reserved area of trap
  * which are used internally by the kernel.
@@ -200,6 +212,9 @@ unsigned long ptrace_get_reg(struct task_struct *task, int regno)
 	if (regno == PT_MSR)
 		return get_user_msr(task);
 
+	if (regno == PT_DSCR)
+		return get_user_dscr(task);
+
 	if (regno < (sizeof(struct pt_regs) / sizeof(unsigned long)))
 		return ((unsigned long *)task->thread.regs)[regno];
 
@@ -218,6 +233,8 @@ int ptrace_put_reg(struct task_struct *task, int regno, unsigned long data)
 		return set_user_msr(task, data);
 	if (regno == PT_TRAP)
 		return set_user_trap(task, data);
+	if (regno == PT_DSCR)
+		return set_user_dscr(task, data);
 
 	if (regno <= PT_MAX_PUT_REG) {
 		((unsigned long *)task->thread.regs)[regno] = data;
-- 
1.7.10.4
Re: [PATCH] vfio powerpc: enabled on powernv platform
On 13/12/12 13:29, Benjamin Herrenschmidt wrote:
> On Wed, 2012-12-12 at 07:34 -0700, Alex Williamson wrote:
>>> But what would I put there?... IOMMU ID is more than enough at the moment and struct iommu_table does not have anything that would make sense to show in sysfs...
>>
>> I believe David mentioned that PEs had user visible names. Perhaps they match an enclosure location or something. Group numbers are rather arbitrary and really have no guarantee of persistence. Thanks,
>
> I agree. Make up something, for example domain[PE] or something like that.

To be able to add a PE number, I need to call iommu_group_alloc() in the correct place where I know this number OR I have to carry it in iommu_table till the moment the iommu_group_alloc() is called (acceptable but not cool).

I will post a patch which would help as a response to this mail.

-- 
Alexey