Re: [PATCH 19/37] powerpc: consolidate ipi message mux and demux
On Thu, 2011-05-19 at 16:57 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2011-05-11 at 00:29 -0500, Milton Miller wrote:
> > Consolidate the mux and demux of ipi messages into smp.c and call
> > a new smp_ops callback to actually trigger the ipi.
>
> .../...
>
> I'm merging the whole series. I had to do some fixups to this one and
> the one adding the CONFIG option, missing cell & wsp bits among others,
> but mostly trivial.

I forgot to mention... I dropped the change to include/linux/smp.h to
remove the unused MSG_ flags for now. It will not have been in -next long
enough to hit Linus via my tree, just in case somebody started using the
flags while we were not looking :-)

I suggest you send it to Linus directly after he pulls my tree during the
merge window.

Cheers,
Ben.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 19/37] powerpc: consolidate ipi message mux and demux
On Wed, 2011-05-11 at 00:29 -0500, Milton Miller wrote:
> Consolidate the mux and demux of ipi messages into smp.c and call
> a new smp_ops callback to actually trigger the ipi.

.../...

I'm merging the whole series. I had to do some fixups to this one and
the one adding the CONFIG option, missing cell & wsp bits among others,
but mostly trivial.

Cheers,
Ben.

Re: [linuxppc-dev] [PATCH][upstream] powerpc:Integrated Flash controller device tree bindings
On May 19, 2011, at 1:38 AM, Dipen Dudhat wrote:
> Signed-off-by: Dipen Dudhat
> Acked-By: Scott Wood
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git (branch -> master)
>
>  .../devicetree/bindings/powerpc/fsl/ifc.txt|   76
>  1 files changed, 76 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/ifc.txt

applied to next

- k

[PATCH][upstream] powerpc:Integrated Flash controller device tree bindings
Signed-off-by: Dipen Dudhat
Acked-By: Scott Wood
---
Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git (branch -> master)

 .../devicetree/bindings/powerpc/fsl/ifc.txt|   76
 1 files changed, 76 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/ifc.txt

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/ifc.txt b/Documentation/devicetree/bindings/powerpc/fsl/ifc.txt
new file mode 100644
index 000..939a26d
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/ifc.txt
@@ -0,0 +1,76 @@
+Integrated Flash Controller
+
+Properties:
+- name : Should be ifc
+- compatible : should contain "fsl,ifc". The version of the integrated
+               flash controller can be found in the IFC_REV register at
+               offset zero.
+
+- #address-cells : Should be either two or three. The first cell is the
+                   chipselect number, and the remaining cells are the
+                   offset into the chipselect.
+- #size-cells : Either one or two, depending on how large each chipselect
+                can be.
+- reg : Offset and length of the register set for the device
+- interrupts : IFC has two interrupts. The first one is the "common"
+               interrupt(CM_EVTER_STAT), and second is the NAND interrupt
+               (NAND_EVTER_STAT).
+
+- ranges : Each range corresponds to a single chipselect, and covers
+           the entire access window as configured.
+
+Child device nodes describe the devices connected to IFC such as NOR (e.g.
+cfi-flash) and NAND (fsl,ifc-nand). There might be board specific devices
+like FPGAs, CPLDs, etc.
+
+Example:
+
+        ifc@ffe1e000 {
+                compatible = "fsl,ifc", "simple-bus";
+                #address-cells = <2>;
+                #size-cells = <1>;
+                reg = <0x0 0xffe1e000 0 0x2000>;
+                interrupts = <16 2 19 2>;
+
+                /* NOR, NAND Flashes and CPLD on board */
+                ranges = <0x0 0x0 0x0 0xee00 0x0200
+                          0x1 0x0 0x0 0xffa0 0x0001
+                          0x3 0x0 0x0 0xffb0 0x0002>;
+
+                flash@0,0 {
+                        #address-cells = <1>;
+                        #size-cells = <1>;
+                        compatible = "cfi-flash";
+                        reg = <0x0 0x0 0x200>;
+                        bank-width = <2>;
+                        device-width = <1>;
+
+                        partition@0 {
+                                /* 32MB for user data */
+                                reg = <0x0 0x0200>;
+                                label = "NOR Data";
+                        };
+                };
+
+                flash@1,0 {
+                        #address-cells = <1>;
+                        #size-cells = <1>;
+                        compatible = "fsl,ifc-nand";
+                        reg = <0x1 0x0 0x1>;
+
+                        partition@0 {
+                                /* This location must not be altered */
+                                /* 1MB for u-boot Bootloader Image */
+                                reg = <0x0 0x0010>;
+                                label = "NAND U-Boot Image";
+                                read-only;
+                        };
+                };
+
+                cpld@3,0 {
+                        #address-cells = <1>;
+                        #size-cells = <1>;
+                        compatible = "fsl,p1010rdb-cpld";
+                        reg = <0x3 0x0 0x01f>;
+                };
+        };
-- 
1.5.6.5

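A note on reading the ranges property in the binding example: assuming the parent bus uses two address cells (which the four-cell reg property suggests), each ranges entry is five cells: chip select, offset within the chip select, a two-cell parent address, and one size cell. The sketch below is a user-space illustration only, not kernel code. The names `struct ifc_range`, `decode_range` and `ifc_translate` are hypothetical, the cells are assumed to be already converted to host byte order, and callers have to supply their own values because the numbers in the posted example appear truncated in this archive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded ranges entry (hypothetical layout, see assumptions above). */
struct ifc_range {
	uint32_t cs;      /* child cell 0: chip-select number */
	uint32_t cs_off;  /* child cell 1: offset inside the chip select */
	uint64_t parent;  /* two parent address cells combined */
	uint32_t size;    /* one size cell */
};

#define CELLS_PER_RANGE 5 /* 2 child + 2 parent (assumed) + 1 size */

/* Decode five consecutive host-endian cells into a range entry. */
static struct ifc_range decode_range(const uint32_t *cells)
{
	struct ifc_range r;

	r.cs = cells[0];
	r.cs_off = cells[1];
	r.parent = ((uint64_t)cells[2] << 32) | cells[3];
	r.size = cells[4];
	return r;
}

/*
 * Translate (chip select, offset) into a parent bus address.
 * Returns 0 and fills *out on success, -1 if no window covers the offset.
 */
static int ifc_translate(const uint32_t *ranges, size_t nranges,
			 uint32_t cs, uint32_t off, uint64_t *out)
{
	assert(ranges != NULL && out != NULL);

	for (size_t i = 0; i < nranges; i++) {
		struct ifc_range r = decode_range(ranges + i * CELLS_PER_RANGE);

		if (r.cs == cs && off >= r.cs_off && off - r.cs_off < r.size) {
			*out = r.parent + (off - r.cs_off);
			return 0;
		}
	}
	return -1;
}
```

Each chip select gets its own window, so a lookup for a chip select that has no ranges entry simply fails rather than falling through to another device.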
Re: [PATCH] powerpc/qoriq: Add default mode for P1020RDB USB
On May 4, 2011, at 8:26 AM, Ramneek Mehresh wrote:
> Add P1020 USB controller default value for "dr_mode" property
>
> Signed-off-by: Ramneek Mehresh
> ---
> Applies on git://git.am.freescale.net/mirrors/linux-2.6.git (branch master)
>
>  arch/powerpc/boot/dts/p1020rdb.dts |   10 --
>  1 files changed, 4 insertions(+), 6 deletions(-)

Can you update the patch against the next branch of
git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git?
Also make sure to update the p1020rdb_camp*.dts files.

thanks

- k

Re: [PATCH] powerpc/85xx:Create dts of each core in CAMP mode for P1020RDB
On Apr 28, 2011, at 2:00 AM, Prabhakar Kushwaha wrote:
> Create the dts files for each core and split the devices between the two
> cores for P1020RDB.
>
> Core0 has memory, l2, i2c, spi, gpio, tdm, dma, usb, eth1, eth2,
> sdhc, crypto, global-util, message, pci0, pci1, msi.
> Core1 has l2, eth0, crypto.
>
> MPIC is shared between the two cores, but each core will protect its
> interrupts from the other core by using "protected-sources" of mpic.
>
> Fix compatible property for global-util node of P1020si.dtsi.
>
> Signed-off-by: Prabhakar Kushwaha
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)
>
> This patch depends on following patch
> "powerpc/85xx: P1020 DTS : re-organize dts files"
>
>  arch/powerpc/boot/dts/p1020rdb_camp_core0.dts |  213 +
>  arch/powerpc/boot/dts/p1020rdb_camp_core1.dts |  148 +
>  arch/powerpc/boot/dts/p1020si.dtsi|2 +-
>  3 files changed, 362 insertions(+), 1 deletions(-)
>  create mode 100644 arch/powerpc/boot/dts/p1020rdb_camp_core0.dts
>  create mode 100644 arch/powerpc/boot/dts/p1020rdb_camp_core1.dts

applied to next

- k

Re: [PATCH] powerpc/85xx: Save and restore pcie ATMU windows for PM
On Apr 28, 2011, at 1:38 AM, Prabhakar Kushwaha wrote:
> D3-cold state indicates removal of the clock and power. However, auxiliary
> (AUX) power may remain available even after the main power rails are
> powered down.
>
> Wakeup from D3-cold state requires full context restore. Other things are
> taken care of in the pci driver, except ATMUs.
> ATMU windows need to be saved and restored during suspend and resume.
>
> Signed-off-by: Jiang Yutang
> Signed-off-by: Prabhakar Kushwaha
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)
>
>  arch/powerpc/sysdev/fsl_pci.c |  116 +
>  arch/powerpc/sysdev/fsl_pci.h |7 ++-
>  2 files changed, 121 insertions(+), 2 deletions(-)

Is this patch for when we are a host or agent?

- k

Re: [PATCH] powerpc/85xx: add host-pci(e) bridge only for RC
On Apr 27, 2011, at 12:35 AM, Prabhakar Kushwaha wrote:
> FSL PCIe controller can act as agent(EP) or host(RC).
> Under Agent(EP) mode they are configured via Host. So it is not required
> to add with the PCI(e) sub-system.
>
> Add and configure PCIe controller only for RC mode.
>
> Signed-off-by: Vivek Mahajan
> Signed-off-by: Prabhakar Kushwaha
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)
>
>  arch/powerpc/sysdev/fsl_pci.c |   14 ++
>  1 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> index 68ca929..87ac11b 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -323,6 +323,7 @@ int __init fsl_add_bridge(struct device_node *dev, int is_primary)
>          struct pci_controller *hose;
>          struct resource rsrc;
>          const int *bus_range;
> +        u8 is_agent;
>
>          if (!of_device_is_available(dev)) {
>                  pr_warning("%s: disabled\n", dev->full_name);
> @@ -353,6 +354,19 @@ int __init fsl_add_bridge(struct device_node *dev, int is_primary)
>
>          setup_indirect_pci(hose, rsrc.start, rsrc.start + 0x4,
>                  PPC_INDIRECT_TYPE_BIG_ENDIAN);
> +
> +        early_read_config_byte(hose, 0, 0, PCI_HEADER_TYPE, &is_agent);

Why are we looking at PCI_HEADER_TYPE? We should look at PCI_CLASS_PROG.

> +        if ((is_agent & 0x7f) == PCI_HEADER_TYPE_NORMAL) {
> +                u32 temp;
> +
> +                temp = (u32)hose->cfg_data & ~PAGE_MASK;
> +                if (((u32)hose->cfg_data & PAGE_MASK) != (u32)hose->cfg_addr)
> +                        iounmap(hose->cfg_data - temp);
> +                iounmap(hose->cfg_addr);
> +                pcibios_free_controller(hose);
> +                return 0;
> +        }
> +
>          setup_pci_cmd(hose);
>
>          /* check PCI express link status */
> --
> 1.7.3

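The check in the quoted hunk masks the header-type byte with 0x7f because bit 7 is the multi-function flag and only the low seven bits select the header layout. A minimal user-space sketch of that decoding follows. The constant values match the PCI header-type encoding (0 for a normal device, 1 for a PCI-to-PCI bridge); the helper names are hypothetical and nothing here touches real config space. It illustrates only what the patch tests, not the PCI_CLASS_PROG check suggested in the reply.

```c
#include <stdint.h>

/* Header layout codes, same values as in the PCI header-type register. */
#define PCI_HEADER_TYPE_NORMAL 0
#define PCI_HEADER_TYPE_BRIDGE 1

#define HEADER_LAYOUT_MASK 0x7f /* low 7 bits: header layout */
#define HEADER_MULTIFUNC   0x80 /* bit 7: multi-function device */

/* Hypothetical helper: 1 if the byte describes a type 0 (normal) header. */
static int header_is_type0(uint8_t header_type)
{
	return (header_type & HEADER_LAYOUT_MASK) == PCI_HEADER_TYPE_NORMAL;
}

/* Hypothetical helper: 1 if the byte describes a type 1 (bridge) header. */
static int header_is_bridge(uint8_t header_type)
{
	return (header_type & HEADER_LAYOUT_MASK) == PCI_HEADER_TYPE_BRIDGE;
}

/* Hypothetical helper: 1 if the multi-function bit is set. */
static int header_is_multifunction(uint8_t header_type)
{
	return (header_type & HEADER_MULTIFUNC) != 0;
}
```

Without the mask, a multi-function endpoint (0x80) would not compare equal to PCI_HEADER_TYPE_NORMAL and would be misclassified.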
Re: [PATCH] powerpc/85xx:DTS: Fix PCIe IDSEL for Px020RDB
On Apr 19, 2011, at 11:12 PM, Prabhakar Kushwaha wrote:
> PCIe device in legacy mode can trigger interrupts using the wires #INTA,
> #INTB, #INTC and #INTD. PCI devices are obligated to use #INTx for
> interrupts under legacy mode. Each PCI slot or device is typically wired
> to different inputs on the interrupt controller.
>
> So, define interrupt-map and interrupt-map-mask properties in the device
> tree to map each PCI interrupt signal to the inputs of the interrupt
> controller.
>
> Signed-off-by: Prabhakar Kushwaha
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)
>
> This patch has a dependency on the following 2 patches
> -- powerpc/85xx: P2020 DTS: re-organize dts files
> -- powerpc/85xx: P1020 DTS : re-organize dts files
>
>  arch/powerpc/boot/dts/p1020rdb.dts|   16
>  arch/powerpc/boot/dts/p2020rdb.dts|   16
>  arch/powerpc/boot/dts/p2020rdb_camp_core0.dts |8
>  arch/powerpc/boot/dts/p2020rdb_camp_core1.dts |8
>  4 files changed, 48 insertions(+), 0 deletions(-)

applied to next

- k

Re: [PATCH][v2] powerpc/85xx: P1020 DTS : re-organize dts files
On Apr 7, 2011, at 4:10 AM, Prabhakar Kushwaha wrote:
> Creates P1020si.dtsi, containing information for the P1020 SoC. Modifies dts
> files for P1020 based systems to use dtsi file
>
> Signed-off-by: Prabhakar Kushwaha
> Acked-by: Kumar Gala
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)
>
> Please see mpc5200b.dtsi for reference.
>
> Tested on P1020RDB
>
> Changes for v2: Incorporated Grant Likely's comment
> - updated model name
>
>  arch/powerpc/boot/dts/p1020rdb.dts |  316 +--
>  arch/powerpc/boot/dts/p1020si.dtsi |  377
>  2 files changed, 380 insertions(+), 313 deletions(-)
>  create mode 100644 arch/powerpc/boot/dts/p1020si.dtsi

applied to next

- k

Re: [PATCH] powerpc/85xx: P2020 DTS: re-organize dts files
On Apr 8, 2011, at 7:27 AM, Prabhakar Kushwaha wrote:
> Creates P2020si.dtsi, containing information for P2020 SoC. Modifies dts files
> for P2020 based systems to use dtsi file.
>
> Signed-off-by: Prabhakar Kushwaha
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git(branch master)
>
> Please see mpc5200b.dtsi for reference.
>
> Tested on P2020RDB and P2020DS
>
>  arch/powerpc/boot/dts/p2020ds.dts |  374 ++--
>  arch/powerpc/boot/dts/p2020rdb.dts|  362 ++-
>  arch/powerpc/boot/dts/p2020rdb_camp_core0.dts |  237 +++-
>  arch/powerpc/boot/dts/p2020rdb_camp_core1.dts |  142 ++
>  arch/powerpc/boot/dts/p2020si.dtsi|  382 +
>  5 files changed, 564 insertions(+), 933 deletions(-)
>  create mode 100644 arch/powerpc/boot/dts/p2020si.dtsi

applied to next

- k

Re: [PATCH][upstream] powerpc: Adding bindings for flexcan controller
On Apr 19, 2011, at 8:58 AM, Bhaskar Upadhaya wrote:
> From: Bhaskar Upadhaya
>
> Signed-off-by: Bhaskar Upadhaya
> Acked-By: Scott Wood
> ---
> Based upon git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git (branch -> master)
>
>  .../devicetree/bindings/net/can/fsl-flexcan.txt|   61
>  1 files changed, 61 insertions(+), 0 deletions(-)
>  create mode 100755 Documentation/devicetree/bindings/net/can/fsl-flexcan.txt

applied to next

- k

Re: [PATCH 13/13] kvm/powerpc: Allow book3s_hv guests to use SMT processor modes
On Tue, May 17, 2011 at 01:36:26PM +0200, Alexander Graf wrote:
>
> Just so I understand the scheme: One vcpu needs to go to MMU mode in
> KVM, it then sends IPIs to stop the other threads and finally we
> return from this wait here?

Actually, if one thread needs to get the other threads out of the
guest, it sets the HDEC to 0. Since it's a shared register and
interrupts all threads on a 0 to -1 transition, setting it to 0 makes
all threads come out of the guest.

The IPI is for when we're going into the guest. When we're in the
host, all the secondary threads are in nap and only the primary thread
is running. (Offlining a cpu in the host results in the cpu/thread
going to nap mode.) Sending an IPI to a napping thread wakes it up and
it resumes at the system reset vector with some bits set in SRR1 to
say that it was previously in nap mode.

> Oh, I'm certainly fine with the scheme :). I would just like to
> understand it and see it documented somewhere, as it's slightly
> unintuitive.

It took some thought to work it out, so you're right, I should
definitely document it.

> Also, this scheme might confuse the host scheduler for a bit, as it
> might migrate threads to other host CPUs while it would prove
> beneficial for cache usage to keep them local. But since the
> scheduler doesn't know about the correlation between the threads, it
> can't be clever about it.

Well, it's not going to migrate a sleeping thread. The accounting
gets slightly strange in that all the CPU time for running the 4 vcpus
in the vcore gets accounted to one of the vcpu threads (which one can
change over time). However, the total across all qemu threads should
be correct.

Paul.

Re: [PATCH 02/13] powerpc/e500: SPE register saving: take arbitrary struct offset
On May 17, 2011, at 6:36 PM, Scott Wood wrote:
> Previously, these macros hardcoded THREAD_EVR0 as the base of the save
> area, relative to the base register passed. This base offset is now
> passed as a separate macro parameter, allowing reuse with other SPE
> save areas, such as used by KVM.
>
> Signed-off-by: Scott Wood
> ---
> This is a resending of http://www.spinics.net/lists/kvm-ppc/msg02672.html
>
> Kumar, please ack to go via kvm.
>
>  arch/powerpc/include/asm/ppc_asm.h |   28
>  arch/powerpc/kernel/head_fsl_booke.S |6 +++---
>  2 files changed, 19 insertions(+), 15 deletions(-)

Acked-by: Kumar Gala

[ Alex, let me know if you want this via my powerpc.git tree or your kvm tree ]

- k

Re: [PATCH 01/13] powerpc/e500: Save SPEFCSR in flush_spe_to_thread()
On May 17, 2011, at 6:35 PM, Scott Wood wrote:
> From: yu liu
>
> giveup_spe() saves the SPE state which is protected by MSR[SPE].
> However, modifying SPEFSCR does not trap when MSR[SPE]=0.
> And since SPEFSCR is already saved/restored in _switch(),
> not all the callers want to save SPEFSCR again.
> Thus, saving SPEFSCR should not belong to giveup_spe().
>
> This patch moves SPEFSCR saving to flush_spe_to_thread(),
> and cleans up the caller that needs to save SPEFSCR accordingly.
>
> Signed-off-by: Liu Yu
> Signed-off-by: Scott Wood
> ---
> This is a resending of http://patchwork.ozlabs.org/patch/88677/
>
> Kumar, please ack to go via kvm. This is holding up the rest of the SPE
> patches, which in turn are holding up the MMU patches due to both
> touching the MSR update code.
>
>  arch/powerpc/kernel/head_fsl_booke.S |2 --
>  arch/powerpc/kernel/process.c|1 +
>  arch/powerpc/kernel/traps.c |5 +
>  3 files changed, 2 insertions(+), 6 deletions(-)

Acked-by: Kumar Gala

[ Alex, let me know if you want this via my powerpc.git tree or your kvm tree ]

- k

Re: [PATCH 2/2] powerpc/fsl: enable verbose bug output
On May 10, 2011, at 1:02 PM, Scott Wood wrote:
> This debug option has no overhead other than a slight increase in
> kernel size, and makes bug reports more useful. While some end users
> may prefer to save the space, as a default on a kernel config aimed
> primarily at development on reference boards, it should be enabled.
>
> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/configs/83xx/mpc8313_rdb_defconfig |1 -
>  arch/powerpc/configs/83xx/mpc8315_rdb_defconfig |1 -
>  arch/powerpc/configs/85xx/mpc8540_ads_defconfig |1 -
>  arch/powerpc/configs/85xx/mpc8560_ads_defconfig |1 -
>  arch/powerpc/configs/85xx/mpc85xx_cds_defconfig |1 -
>  arch/powerpc/configs/86xx/mpc8641_hpcn_defconfig |1 -
>  arch/powerpc/configs/e55xx_smp_defconfig |1 -
>  arch/powerpc/configs/mpc85xx_defconfig |1 -
>  arch/powerpc/configs/mpc85xx_smp_defconfig |1 -
>  arch/powerpc/configs/mpc86xx_defconfig |1 -
>  10 files changed, 0 insertions(+), 10 deletions(-)

applied to next

- k

Re: [PATCH 4/4] powerpc/mpic: add the mpic global timer support
On Mar 24, 2011, at 4:43 PM, Scott Wood wrote:
> Add support for MPIC timers as requestable interrupt sources.
>
> Based on http://patchwork.ozlabs.org/patch/20941/ by Dave Liu.
>
> Signed-off-by: Dave Liu
> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/include/asm/mpic.h |3 +-
>  arch/powerpc/sysdev/mpic.c |   92 ---
>  2 files changed, 88 insertions(+), 7 deletions(-)

applied to next, fixed for upstream changes.

- k

Re: [PATCH 3/4] powerpc/mpic: parse 4-cell intspec types other than zero
On Mar 24, 2011, at 4:43 PM, Scott Wood wrote:
> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/include/asm/mpic.h |2 ++
>  arch/powerpc/sysdev/mpic.c |   37 -
>  2 files changed, 38 insertions(+), 1 deletions(-)

applied to next

- k

Re: [PATCH 2/4] powerpc/p1022ds: fix broken mpic timer node
On Mar 24, 2011, at 4:43 PM, Scott Wood wrote:
> There is no hardware interrupt 0xf7. But now we can express the timer
> interrupt using 4-cell interrupts. This requires converting all of the
> other interrupt specifiers in the tree as well.
>
> Also add the second timer group, and fix the reg property to only
> describe the timer registers.
>
> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/boot/dts/p1022ds.dts |  106
>  1 files changed, 59 insertions(+), 47 deletions(-)

applied to next

- k

Re: [PATCH 1/4] powerpc: Add fsl mpic timer binding
On Mar 24, 2011, at 4:43 PM, Scott Wood wrote:
> Update the existing example in the general mpic binding to have a
> separate TCRx region. Currently the example doesn't describe TCRx at
> all. The one upstream device tree with an mpic timer node (p1022ds)
> uses one large reg region to describe both, even though there are other
> unrelated registers in between. That device tree also contains a bogus
> interrupt specifier, and there's no upstream software that uses this yet,
> so changing this shouldn't be a problem.
>
> Add a full binding for the MPIC timer node, not just an example of
> 4-cell interrupts in the MPIC binding.
>
> Add fsl,available-ranges, similar to msi-available-ranges.
>
> Signed-off-by: Scott Wood
> ---
>  .../devicetree/bindings/powerpc/fsl/mpic-timer.txt |   38
>  .../devicetree/bindings/powerpc/fsl/mpic.txt |2 +-
>  2 files changed, 39 insertions(+), 1 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/mpic-timer.txt

applied to next

- k

Re: [PATCH 0/13] Hypervisor-mode KVM on POWER7
On 19.05.2011, at 07:22, Paul Mackerras wrote:
> On Tue, May 17, 2011 at 02:42:08PM +0300, Avi Kivity wrote:
>> On 05/17/2011 02:38 PM, Alexander Graf wrote:
>>>> What would be the path for these patches to get upstream? Would this
>>>> stuff normally go through Avi's tree? There is a bit of a
>>>> complication in that they are based on Ben's next branch. Would Avi
>>>> pull Ben's next branch, or would they go in via Ben's tree?
>>>
>>> Usually the ppc tree gets merged into Avi's tree and goes on from
>>> there. When we have interdependencies, we can certainly do it
>>> differently though. We can also shove them through Ben's tree this
>>> time around, as there are more dependencies on ppc code than KVM
>>> code.
>>>
>>
>> Yes, both options are fine. If it goes through kvm.git I can merge
>> Ben's tree (provided it is append-only) and apply the kvm-ppc
>> patches on top.
>
> OK, the easiest thing is for them to go via Ben's tree, I think, since
> they depend so much on other stuff in Ben's tree.
>
> Alex, could you give Ben an acked-by for patches 1-8 of the series?
> There haven't been any changes requested for them.

Let me give them a spin on a G5, so I can at least verify nothing breaks ;).
I'll hopefully get to this before next week.

Alex

Re: [PATCH 1/2] powerpc,e5500: add networking to defconfig
On May 10, 2011, at 1:01 PM, Scott Wood wrote:
> Even though support for the p5020's on-chip ethernet is not yet upstream,
> it is not appropriate to disable all networking support (including
> loopback, unix domain sockets, external ethernet devices, etc) in the
> defconfig. The networking settings are taken from mpc85xx_smp_defconfig,
> minus the drivers for ethernet devices not found on any current e5500
> chip.
>
> The other changes are the result of running "make savedefconfig".
>
> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/configs/e55xx_smp_defconfig |   38 ++---
>  1 files changed, 29 insertions(+), 9 deletions(-)

applied to next

- k

Re: [PATCH 3/7] [RFC] add support for BlueGene/P FPU
Eric,

> This patch adds save/restore register support for the BlueGene/P
> double hummer FPU.

What does this mean? Needs more details here.

> Signed-off-by: Eric Van Hensbergen
> ---
>  arch/powerpc/include/asm/ppc_asm.h |   39 -- -
>  arch/powerpc/kernel/fpu.S |8 +++---
>  arch/powerpc/platforms/44x/Kconfig |9
>  3 files changed, 40 insertions(+), 16 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
> index 9821006..daa22bb 100644
> --- a/arch/powerpc/include/asm/ppc_asm.h
> +++ b/arch/powerpc/include/asm/ppc_asm.h
> @@ -88,6 +88,13 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
>          REST_10GPRS(22, base)
>  #endif
>
> +#ifdef CONFIG_BGP
> +#define LFPDX(frt, ra, rb) .long (31<<26)|((frt)<<21)|((ra)<<16)| \
> +                           ((rb)<<11)|(462<<1)
> +#define STFPDX(frt, ra, rb) .long (31<<26)|((frt)<<21)|((ra)<<16)| \
> +                            ((rb)<<11)|(974<<1)
> +#endif /* CONFIG_BGP */

Put these in arch/powerpc/include/asm/ppc-opcode.h and reformat to fit
what's there already. Also, you don't need to put these defines inside
an #ifdef.

> +
>  #define SAVE_2GPRS(n, base) SAVE_GPR(n, base); SAVE_GPR(n+1, base)
>  #define SAVE_4GPRS(n, base) SAVE_2GPRS(n, base); SAVE_2GPRS(n+2, base)
>  #define SAVE_8GPRS(n, base) SAVE_4GPRS(n, base); SAVE_4GPRS(n+4, base)
> @@ -97,18 +104,26 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
>  #define REST_8GPRS(n, base) REST_4GPRS(n, base); REST_4GPRS(n+4, base)
>  #define REST_10GPRS(n, base) REST_8GPRS(n, base); REST_2GPRS(n+8, base)
>
> -#define SAVE_FPR(n, base) stfd n,THREAD_FPR0+8*TS_FPRWIDTH*(n)(base)
> -#define SAVE_2FPRS(n, base) SAVE_FPR(n, base); SAVE_FPR(n+1, base)
> -#define SAVE_4FPRS(n, base) SAVE_2FPRS(n, base); SAVE_2FPRS(n+2, base)
> -#define SAVE_8FPRS(n, base) SAVE_4FPRS(n, base); SAVE_4FPRS(n+4, base)
> -#define SAVE_16FPRS(n, base) SAVE_8FPRS(n, base); SAVE_8FPRS(n+8, base)
> -#define SAVE_32FPRS(n, base) SAVE_16FPRS(n, base); SAVE_16FPRS(n+16, base)
> -#define REST_FPR(n, base) lfd n,THREAD_FPR0+8*TS_FPRWIDTH*(n)(base)
> -#define REST_2FPRS(n, base) REST_FPR(n, base); REST_FPR(n+1, base)
> -#define REST_4FPRS(n, base) REST_2FPRS(n, base); REST_2FPRS(n+2, base)
> -#define REST_8FPRS(n, base) REST_4FPRS(n, base); REST_4FPRS(n+4, base)
> -#define REST_16FPRS(n, base) REST_8FPRS(n, base); REST_8FPRS(n+8, base)
> -#define REST_32FPRS(n, base) REST_16FPRS(n, base); REST_16FPRS(n+16, base)
> +#ifdef CONFIG_BGP
> +#define SAVE_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); STFPDX(n, base, b)
> +#define REST_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); LFPDX(n, base, b)

16*? Are these FP regs 64 or 128 bits wide? If 128, you are going to
have to play with TS_FPRWIDTH to get the size of the FPs correct in the
thread_struct. I think there's a bug here.

> +#else /* CONFIG_BGP */
> +#define SAVE_FPR(n, b, base) (stfd n, THREAD_FPR0+8*TS_FPRWIDTH*(n)(base))
> +#define REST_FPR(n, b, base) (lfd n, THREAD_FPR0+8*TS_FPRWIDTH*(n)(base))
> +#endif /* CONFIG_BGP */
> +
> +#define SAVE_2FPRS(n, b, base) SAVE_FPR(n, b, base); SAVE_FPR(n+1, b, base)
> +#define SAVE_4FPRS(n, b, base) SAVE_2FPRS(n, b, base); SAVE_2FPRS(n+2, b, base)
> +#define SAVE_8FPRS(n, b, base) SAVE_4FPRS(n, b, base); SAVE_4FPRS(n+4, b, base)
> +#define SAVE_16FPRS(n, b, base) SAVE_8FPRS(n, b, base); SAVE_8FPRS(n+8, b, base)
> +#define SAVE_32FPRS(n, b, base) SAVE_16FPRS(n, b, base); \
> +                                SAVE_16FPRS(n+16, b, base)
> +#define REST_2FPRS(n, b, base) REST_FPR(n, b, base); REST_FPR(n+1, b, base)
> +#define REST_4FPRS(n, b, base) REST_2FPRS(n, b, base); REST_2FPRS(n+2, b, base)
> +#define REST_8FPRS(n, b, base) REST_4FPRS(n, b, base); REST_4FPRS(n+4, b, base)
> +#define REST_16FPRS(n, b, base) REST_8FPRS(n, b, base); REST_8FPRS(n+8, b, base)
> +#define REST_32FPRS(n, b, base) REST_16FPRS(n, b, base); \
> +                                REST_16FPRS(n+16, b, base)
>
>  #define SAVE_VR(n,b,base) li b,THREAD_VR0+(16*(n)); stvx n,base,b
>  #define SAVE_2VRS(n,b,base) SAVE_VR(n,b,base); SAVE_VR(n+1,b,base)
> diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
> index de36955..9f11c66 100644
> --- a/arch/powerpc/kernel/fpu.S
> +++ b/arch/powerpc/kernel/fpu.S
> @@ -30,7 +30,7 @@
>  BEGIN_FTR_SECTION \
>          b 2f; \
>  END_FTR_SECTION_IFSET(CPU_FTR_VSX); \
> -        REST_32FPRS(n,base); \
> +        REST_32FPRS(n,c,base); \
>          b 3f; \
>  2:      REST_32VSRS(n,c,base);

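To make the two review points concrete, here is a small C sketch that builds the same instruction words as the quoted LFPDX/STFPDX macros (primary opcode 31, extended opcodes 462 and 974 as given in the patch) and compares save-area sizes for 8-byte and 16-byte slots. The function names are hypothetical, and the 32-register count is taken from the SAVE_32FPRS macro; this is an illustration, not the code that belongs in ppc-opcode.h.

```c
#include <assert.h>
#include <stdint.h>

#define PRIMARY_OPCODE 31u  /* bits 0-5 of the word, per the quoted macros */
#define XO_LFPDX       462u /* extended opcode from the patch */
#define XO_STFPDX      974u /* extended opcode from the patch */

/* Pack the fields the same way the .long expressions in the patch do. */
static uint32_t xform_encode(uint32_t xo, uint32_t frt, uint32_t ra, uint32_t rb)
{
	assert(frt < 32 && ra < 32 && rb < 32 && xo < 1024);
	return (PRIMARY_OPCODE << 26) | (frt << 21) | (ra << 16) |
	       (rb << 11) | (xo << 1);
}

/* Byte offset of register n in a save area with the given slot size. */
static uint32_t fpr_slot_offset(uint32_t n, uint32_t slot_bytes)
{
	assert(n < 32);
	return n * slot_bytes;
}

/* Total bytes needed to save all 32 registers with the given slot size. */
static uint32_t fpr_area_bytes(uint32_t slot_bytes)
{
	return 32u * slot_bytes;
}
```

With 16-byte slots the area is twice as large as with 8-byte slots, which is why the stride used by the macros and the space reserved in thread_struct have to agree.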
Re: [PATCH] powerpc/86xx: don't pretend that we support 8-bit pixels on the MPC8610 HPCD
On May 9, 2011, at 2:29 PM, Timur Tabi wrote:
> If the video mode is set to 16-, 24-, or 32-bit pixels, then the pixel data
> contains actual levels of red, blue, and green. However, if the video mode is
> set to 8-bit pixels, then the 8-bit value represents an index into a color
> table. This is called "palette mode" on the Freescale DIU video controller.
>
> The DIU driver does not currently support palette mode, but the MPC8610 HPCD
> board file returned a non-zero (although incorrect) pixel format value for
> 8-bit mode.
>
> Signed-off-by: Timur Tabi
> ---
>  arch/powerpc/platforms/86xx/mpc8610_hpcd.c |   97 ++--
>  1 files changed, 64 insertions(+), 33 deletions(-)

applied to next

- k

Re: [PATCH] powerpc/mpc8610_hpcd: Do not use "/" in interrupt names
On May 4, 2011, at 9:29 AM, Geert Uytterhoeven wrote:
> It may trigger a warning in fs/proc/generic.c:__xlate_proc_name() when
> trying to add an entry for the interrupt handler to sysfs.
>
> Signed-off-by: Geert Uytterhoeven
> ---
>  arch/powerpc/platforms/86xx/mpc8610_hpcd.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

applied to next, will CC for stable (2.6.39.1)

- k

Re: [PATCH] powerpc/e5500: set non-base IVORs
On May 9, 2011, at 4:26 PM, Scott Wood wrote:
> Without this, we attempt to use doorbells for IPIs, and end up
> branching to some bad address. Plus, even for the exceptions
> we don't implement, it's good to handle it and get a message out.
>
> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/include/asm/reg_booke.h |4 ++
>  arch/powerpc/kernel/cpu_setup_fsl_booke.S |3 ++
>  arch/powerpc/kernel/exceptions-64e.S |   47 +
>  3 files changed, 54 insertions(+), 0 deletions(-)

applied to next

- k

Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Thu, 2011-05-19 at 08:46 +0400, James Bottomley wrote: > This can't really be done generically. There are several considerations > to do with hardware requirements. I can see some hw requiring a > specific write order (I think this applies more to read order, though). Right. Or there can be a need for a completely different access pattern to do 32-bit, or maybe write only one half because both might have a side effect etc etc etc ... Also a global lock would be suboptimal vs. a per device lock burried in the driver. > The specific mpt2sas problem is that if we write a 64 bit register non > atomically, we can't allow any interleaving writes for any other region > on the chip, otherwise the HW will take the write as complete in the 64 > bit register and latch the wrong value. The only way to achieve that > given the semantics of writeq is a global static spinlock. > > > How do you think about them? If you cannot agree with the above two > > solutions, I'll agree with reverting them. > > Having x86 roll its own never made any sense, so I think they need > reverting anyway. Agreed. > This is a driver/platform bus problem not an > architecture problem. The assumption we can make is that the platform > CPU can write atomically at its chip width. We *may* be able to make > the assumption that the bus controller can translate an atomic chip > width transaction to a single atomic bus transaction; I think that > assumption holds true for at least PCI and on the parisc legacy busses, > so if we can agree on semantics, this should be a global define > somewhere. If there are problems with the bus assumption, we'll likely > need some type of opt-in (or just not bother). And we want a well defined #ifdef drivers test to know whether there's a writeq/readq (just #define writeq/readq itself is fine even if it's an inline function, we do that elsewhere) so they can have a fallback scenario. This is important as these can be used in very performance critical code path. 
Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
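The opt-in pattern discussed in this message (a driver tests `#ifdef writeq` and otherwise falls back to its own scheme under a per-device lock) could look roughly like the userspace sketch below. Everything named `fake_*` is hypothetical and merely stands in for real MMIO accessors and a real kernel spinlock. Which half must be written first is hardware specific and is only assumed here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical device: two 32-bit halves standing in for one 64-bit
 * MMIO register, plus a lock that belongs to this device only. */
struct fake_dev {
	atomic_flag lock;
	volatile uint32_t reg_lo;
	volatile uint32_t reg_hi;
};

static void fake_dev_init(struct fake_dev *dev)
{
	assert(dev);
	atomic_flag_clear(&dev->lock);
	dev->reg_lo = 0;
	dev->reg_hi = 0;
}

#ifdef fake_writeq
/* The platform advertises a native 64-bit store by defining the macro. */
#define fake_dev_write64(dev, val) fake_writeq((dev), (val))
#else
/* Fallback: two 32-bit stores serialised by the per-device lock, so no
 * other register write that takes this lock can land in between. */
static void fake_dev_write64(struct fake_dev *dev, uint64_t val)
{
	while (atomic_flag_test_and_set_explicit(&dev->lock,
						 memory_order_acquire))
		;	/* spin; a kernel driver would use its own spinlock */
	dev->reg_lo = (uint32_t)val;		/* assumed order: low half first */
	dev->reg_hi = (uint32_t)(val >> 32);
	atomic_flag_clear_explicit(&dev->lock, memory_order_release);
}
#endif

/* Helper for inspecting the model; a real device would not need it. */
static uint64_t fake_dev_peek64(const struct fake_dev *dev)
{
	return ((uint64_t)dev->reg_hi << 32) | dev->reg_lo;
}
```

The lock only helps if every other register write to the same device takes it as well, which is why it belongs in the driver rather than in a generic accessor.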
Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wed, 2011-05-18 at 21:16 -0700, Roland Dreier wrote:

> On Wed, May 18, 2011 at 11:31 AM, Milton Miller wrote:
> > So the real question should be why is x86-32 supplying a broken writeq
> > instead of letting drivers work out what to do it when needed?
>
> Sounds a lot like what I was asking a couple of years ago :)
> http://lkml.org/lkml/2009/4/19/164
>
> But Ingo insisted that non-atomic writeq would be fine:
> http://lkml.org/lkml/2009/4/19/167

Yuck... Ingo, I think that was very wrong. Those are for MMIO, which must almost ALWAYS know precisely what the resulting access size is going to be. It's not even about atomicity between multiple CPUs. I have seen plenty of HW for which a 64-bit access to a register is -not- equivalent to two 32-bit ones. In fact, in some cases, you can get the side effects twice ... or none at all.

The only case where you can be lax is when you explicitly know that there are no side effects -and- the HW copes with different access sizes. This is not the general case and drivers need at the very least a way to know what the behaviour will be.

Cheers,
Ben.
Re: powerpc: mpc85xx regression since 2.6.39-rc2, one cpu core lame
On May 17, 2011, at 4:40 PM, Benjamin Herrenschmidt wrote: > On Tue, 2011-05-17 at 18:28 +0200, Richard Cochran wrote: >> Ben, >> >> Recent 2.6.39-rc kernels behave strangely on the Freescale dual core >> mpc8572 and p2020. There is a long pause (like 2 seconds) in the boot >> sequence after "mpic: requesting IPIs..." >> >> When the system comes up, only one core shows in /proc/cpuinfo. Later >> on, lots of messages appear like the following: >> >> INFO: task ksoftirqd/1:9 blocked for more than 120 seconds. >> >> I bisected [1] the problem to: >> >> commit c56e58537d504706954a06570b4034c04e5b7500 >> Author: Benjamin Herrenschmidt >> Date: Tue Mar 8 14:40:04 2011 +1100 >> >> powerpc/smp: Create idle threads on demand and properly reset them >> >> I don't see from that commit what had gone wrong. Perhaps you can >> help resolve this? > > Hrm, odd. Kumar, care to have a look ? That's what happens when you > don't get me HW to test with :-) I'm trying to work on it ;) - k ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: powerpc: mpc85xx regression since 2.6.39-rc2, one cpu core lame
On May 18, 2011, at 4:48 PM, Benjamin Herrenschmidt wrote: > >> (I get the feeling that I am the only one testing recent kernels with >> the mpc85xx.) >> >> Anyhow, I see that this commit was one of a series. For my own use, >> can I simply revert this one commit independently? > > For your own use sure :-) But I'd still like to get to the bottom of > this ! > > Cheers, > Ben. Tested the 'merge' branch and it appears to fix the issues with secondary cores coming up. - k ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [git pull] Please pull powerpc.git merge branch
On May 18, 2011, at 11:06 PM, Benjamin Herrenschmidt wrote: > Hi Linus > > Dunno if it's too late or not yet but here's 3 fixes for powerpc that > would be welcome to have in before the release. If not I'll send them > first thing next (one of them is already in -next in fact). > > Those are regression fixes and a build breakage. > > Cheers, > Ben. > > The following changes since commit fce519588acfac249e8fdc1f5016c73d617de315: > > Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6 > (2011-05-18 13:25:57 -0700) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge > > Ben Hutchings (1): > powerpc/kexec: Fix build failure on 32-bit SMP > > Benjamin Herrenschmidt (1): > powerpc/smp: Make start_secondary_resume available to all CPU variants > > kerstin jonsson (1): > powerpc/4xx: Fix regression in SMP on 476 > > arch/powerpc/kernel/crash.c | 59 + > arch/powerpc/kernel/head_32.S |9 -- > arch/powerpc/kernel/misc_32.S | 11 +++ > arch/powerpc/kernel/smp.c |4 +- > 4 files changed, 43 insertions(+), 40 deletions(-) > Can you pull this into next. - k ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [git pull] Please pull powerpc.git merge branch
On Wed, 2011-05-18 at 21:11 -0700, Linus Torvalds wrote: > On Wed, May 18, 2011 at 9:06 PM, Benjamin Herrenschmidt > wrote: > > > > Dunno if it's too late or not yet but here's 3 fixes for powerpc that > > would be welcome to have in before the release. If not I'll send them > > first thing next (one of them is already in -next in fact). > > Gah. I just cut 2.6.39. Bah, no biggie. I'll stick some CC: stable and put them in -next :-) Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/13] Hypervisor-mode KVM on POWER7
On Tue, May 17, 2011 at 02:42:08PM +0300, Avi Kivity wrote: > On 05/17/2011 02:38 PM, Alexander Graf wrote: > >> > >> What would be the path for these patches to get upstream? Would this > >> stuff normally go through Avi's tree? There is a bit of a > >> complication in that they are based on Ben's next branch. Would Avi > >> pull Ben's next branch, or would they go in via Ben's tree? > > > >Usually the ppc tree gets merged into Avi's tree and goes on from > >there. When we have interdependencies, we can certainly do it > >differently though. We can also shove them through Ben's tree this > >time around, as there are more dependencies on ppc code than KVM > >code. > > > > Yes, both options are fine. If it goes through kvm.git I can merge > Ben's tree (provided it is append-only) and apply the kvm-ppc > patches on top. OK, the easiest thing is for them to go via Ben's tree, I think, since they depend so much on other stuff in Ben's tree. Alex, could you give Ben an acked-by for patches 1-8 of the series? There haven't been any changes requested for them. Thanks, Paul. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Thu, 2011-05-19 at 13:08 +0900, Hitoshi Mitake wrote:

> On Thu, May 19, 2011 at 04:11, Moore, Eric wrote:
> > On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
> >> Ingo I would propose the following commits added in 2.6.29 be reverted.
> >> I think the current concensus is drivers must know if the writeq is
> >> not atomic so they can provide their own locking or other workaround.
> >>
> >
> > Exactly.
> >
>
> The original motivation of preparing common readq/writeq is that
> letting each driver have their own readq/writeq is bad for maintenance
> of source code.
>
> But if you really dislike them, there might be two solutions:
>
> 1. changing the name of readq/writeq to readq_nonatomic/writeq_nonatomic

This is fine, but not really very useful.

> 2. adding new C file to somewhere and defining spinlock for them.
> With spin_lock_irqsave() and spin_unlock_irqrestore() on the spinlock,
> readq/writeq can be atomic.

This can't really be done generically. There are several considerations to do with hardware requirements. I can see some hw requiring a specific write order (I think this applies more to read order, though).

The specific mpt2sas problem is that if we write a 64 bit register non atomically, we can't allow any interleaving writes for any other region on the chip, otherwise the HW will take the write as complete in the 64 bit register and latch the wrong value. The only way to achieve that given the semantics of writeq is a global static spinlock.

> How do you think about them? If you cannot agree with the above two
> solutions, I'll agree with reverting them.

Having x86 roll its own never made any sense, so I think they need reverting anyway.

This is a driver/platform bus problem not an architecture problem. The assumption we can make is that the platform CPU can write atomically at its chip width. We *may* be able to make the assumption that the bus controller can translate an atomic chip width transaction to a single atomic bus transaction; I think that assumption holds true for at least PCI and on the parisc legacy busses, so if we can agree on semantics, this should be a global define somewhere. If there are problems with the bus assumption, we'll likely need some type of opt-in (or just not bother).

James
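To make the interleaving hazard described in this message concrete, here is a small single-threaded model of it. This is a sketch under stated assumptions, not a description of the real chip: the register offsets, the `model_*` names and the exact latching rule (any write to another register ends a pending 64-bit write) are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register offsets; not taken from any mpt2sas documentation. */
enum { REG_REQ_LO = 0x0, REG_REQ_HI = 0x4, REG_OTHER = 0x8 };

struct model_chip {
	uint32_t staged_lo;		/* low half waiting for its high half */
	bool lo_staged;
	uint64_t latched;		/* value the chip last accepted */
	unsigned int latch_count;	/* how many times the chip latched */
};

static void model_latch(struct model_chip *c, uint64_t val)
{
	c->latched = val;
	c->latch_count++;
	c->lo_staged = false;
}

static void model_write32(struct model_chip *c, unsigned int off, uint32_t val)
{
	switch (off) {
	case REG_REQ_LO:
		c->staged_lo = val;
		c->lo_staged = true;
		break;
	case REG_REQ_HI:
		model_latch(c, ((uint64_t)val << 32) |
			       (c->lo_staged ? c->staged_lo : 0));
		break;
	default:
		/* Assumed rule: any other write ends a pending 64-bit write. */
		if (c->lo_staged)
			model_latch(c, c->staged_lo);
		break;
	}
}

/* A 64-bit write split into two 32-bit writes, optionally with another
 * register write slipping in between (as a second CPU might cause). */
static void model_split_write64(struct model_chip *c, uint64_t val,
				bool interleave)
{
	assert(c);
	model_write32(c, REG_REQ_LO, (uint32_t)val);
	if (interleave)
		model_write32(c, REG_OTHER, 0);
	model_write32(c, REG_REQ_HI, (uint32_t)(val >> 32));
}
```

In this model an uninterrupted split write latches the intended value once, while an interleaved one latches twice and never sees the intended value. Serialising only the two halves of the 64-bit write against each other would therefore not be enough.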
Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wed, May 18, 2011 at 11:31 AM, Milton Miller wrote: > So the real question should be why is x86-32 supplying a broken writeq > instead of letting drivers work out what to do it when needed? Sounds a lot like what I was asking a couple of years ago :) http://lkml.org/lkml/2009/4/19/164 But Ingo insisted that non-atomic writeq would be fine: http://lkml.org/lkml/2009/4/19/167 - R. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Thu, May 19, 2011 at 04:11, Moore, Eric wrote: > On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote: >> Ingo I would propose the following commits added in 2.6.29 be reverted. >> I think the current concensus is drivers must know if the writeq is >> not atomic so they can provide their own locking or other workaround. >> > > > Exactly. > The original motivation of preparing common readq/writeq is that letting each driver have their own readq/writeq is bad for maintenance of source code. But if you really dislike them, there might be two solutions: 1. changing the name of readq/writeq to readq_nonatomic/writeq_nonatomic 2. adding new C file to somewhere and defining spinlock for them. With spin_lock_irqsave() and spin_unlock_irqrestore() on the spinlock, readq/writeq can be atomic. How do you think about them? If you cannot agree with the above two solutions, I'll agree with reverting them. -- Hitoshi Mitake h.mit...@gmail.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [git pull] Please pull powerpc.git merge branch
From: Linus Torvalds Date: Wed, 18 May 2011 21:11:47 -0700 > On Wed, May 18, 2011 at 9:06 PM, Benjamin Herrenschmidt > wrote: >> >> Dunno if it's too late or not yet but here's 3 fixes for powerpc that >> would be welcome to have in before the release. If not I'll send them >> first thing next (one of them is already in -next in fact). > > Gah. I just cut 2.6.39. I know we can't let these things go forever, but in my opinion we should have given this one or two more -rc's. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH RFCv7 0/2] CARMA Board Support
On Fri, 2011-02-11 at 15:34 -0800, Ira W. Snyder wrote: > Hello everyone, > > This is the seventh posting of these drivers, taking into account comments > from earlier postings. I've made sure that the drivers both pass checkpatch > without any errors or warnings. I would appreciate as much review as you > can offer, so that these can get into the next merge cycle. They've been > sitting outside mainline for far too long. This has been bitrotting for way too long indeed. I'm sticking this into powerpc -next today. Cheers, Ben. > RFCv6 -> RFCv7: > - reference count private data structure (to support unbind) > - use #defines instead of hex values for registers > - keep lines <=80 characters > > RFCv5 -> RFCv6: > - change locking in several functions > - use list_move_tail() to simplify code > - remove unused helper functions > > RFCv4 -> RFCv5: > - remove unecessary locking per review comments > - do not clobber return values from *_interruptible() > - explicitly track buffer DMA mapping > - use #defines instead of raw hex addresses > - change enable sysfs attribute to root-writeable only > > RFCv3 -> RFCv4: > - updates for DATA-FPGA version 2 > > RFCv2 -> RFCv3: > - use miscdevice framework (removing the carma class) > - add bitfile readback capability to the programmer > > RFCv1 -> RFCv2: > - change comments to kerneldoc format > - Kconfig improvements > - use the videobuf_dma_sg API in the programmer > - updates for Freescale DMAEngine DMA_SLAVE API changes > > KNOWN ISSUES: > - untested with a setup that can generate interrupts (will get access soon) > - does not handle runtime "unbind" > > Information about the CARMA board: > > The CARMA board is essentially an MPC8349EA MDS reference design with a > 1GHz ADC and 4 high powered data processing FPGAs connected to the local > bus. It is all packed into a compact PCI form factor. It is used at the > Owens Valley Radio Observatory as the main component in the correlator > system. 
> > For board information, see: > http://www.mmarray.org/~dwh/carma_board/index.html > > For DATA-FPGA register layout, see: > http://www.mmarray.org/memos/carma_memo46.pdf > > These drivers are the necessary pieces to get the data processing FPGAs > working and producing data. Despite the fact that the hardware is custom > and we are the only users, I'd still like to get the drivers upstream. > Several people have suggested that this is possible. > > Some further patches will be forthcoming. I have a driver for the LED > subsystem and the PPS subsystem. The LED register layout is expected to > change soon, so I won't post the driver until that is finished. The PPS > driver will be posted seperately from this patch series; it is very > generic. > > Thanks to everyone who has provided comments on earlier versions! > > Ira W. Snyder (2): > misc: add CARMA DATA-FPGA Access Driver > misc: add CARMA DATA-FPGA Programmer support > > drivers/misc/Kconfig|1 + > drivers/misc/Makefile |1 + > drivers/misc/carma/Kconfig | 18 + > drivers/misc/carma/Makefile |2 + > drivers/misc/carma/carma-fpga-program.c | 1141 > drivers/misc/carma/carma-fpga.c | 1433 > +++ > 6 files changed, 2596 insertions(+), 0 deletions(-) > create mode 100644 drivers/misc/carma/Kconfig > create mode 100644 drivers/misc/carma/Makefile > create mode 100644 drivers/misc/carma/carma-fpga-program.c > create mode 100644 drivers/misc/carma/carma-fpga.c > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [git pull] Please pull powerpc.git merge branch
On Wed, May 18, 2011 at 9:06 PM, Benjamin Herrenschmidt wrote: > > Dunno if it's too late or not yet but here's 3 fixes for powerpc that > would be welcome to have in before the release. If not I'll send them > first thing next (one of them is already in -next in fact). Gah. I just cut 2.6.39. Linus ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering
On Tue, May 17, 2011 at 6:19 AM, Ingo Molnar wrote: > > * Steven Rostedt wrote: > >> On Tue, 2011-05-17 at 14:42 +0200, Ingo Molnar wrote: >> > * Steven Rostedt wrote: >> > >> > > On Mon, 2011-05-16 at 18:52 +0200, Ingo Molnar wrote: >> > > > * Steven Rostedt wrote: >> > > > >> > > > > I'm a bit nervous about the 'active' role of (trace_)events, because >> > > > > of the >> > > > > way multiple callbacks can be registered. How would: >> > > > > >> > > > > err = event_x(); >> > > > > if (err == -EACCESS) { >> > > > > >> > > > > be handled? [...] >> > > > >> > > > The default behavior would be something obvious: to trigger all >> > > > callbacks and >> > > > use the first non-zero return value. >> > > >> > > But how do we know which callback that was from? There's no ordering of >> > > what >> > > callbacks are called first. >> > >> > We do not have to know that - nor do the calling sites care in general. Do >> > you >> > have some specific usecase in mind where the identity of the callback that >> > generates a match matters? >> >> Maybe I'm confused. I was thinking that these event_*() are what we >> currently call trace_*(), but the event_*(), I assume, can return a >> value if a call back returns one. > > Yeah - and the call site can treat it as: > > - Ugh, if i get an error i need to abort whatever i was about to do > > or (more advanced future use): > > - If i get a positive value i need to re-evaluate the parameters that were > passed in, they were changed Do event_* that return non-void exist in the tree at all now? I've looked at the various tracepoint macros as well as some of the other handlers (trace_function, perf_tp_event, etc) and I'm not seeing any places where a return value is honored nor could be. At best, the perf_tp_event can be short-circuited it in the hlist_for_each, but it'd still need a way to bubble up a failure and result in not calling the trace/event that the hook precedes. Am I missing something really obvious? 
I don't feel I've gotten a good handle on exactly how all the tracing code gets triggered, so perhaps I'm still a level (or three) too shallow. (I can see the asm hooks for trace functions and I can see where that translates to registered calls - like trace_function - but I don't see how the hooked calls can be trivially aborted).

As is, I'm not sure how the perf and ftrace infrastructure could be reused cleanly without a fair number of hacks to the interface and a good bit of reworking. I can already see a number of challenges around reusing the sys_perf_event_open interface and the fact that reimplementing something even as simple as seccomp mode=1 seems to require a fair amount of tweaking to avoid being leaky. (E.g., enabling all TRACE_EVENT()s for syscalls will miss unhooked syscalls so either acceptance matching needs to be propagated up the stack along with some seccomp-like task modality or seccomp-on-perf would have to depend on sys_enter events with syscall number predicate matching and fail when a filter discard applies to all active events.)

At present, I'm leaning back towards the v2 series (plus the requested minor changes) for the benefit of code clarity and its fail-secure behavior. Even just considering the reduced case of seccomp mode 1 being implemented on the shared infrastructure, I feel like I'm missing something that makes it viable. Any clues?

If not, I don't think a seccomp mode 2 interface via prctl would be intractable if the long term movement is to a ftrace/perf backend - it just means that the in-kernel code would change to wrap whatever the final design ended up being.

Thanks and sorry if I'm being dense!

>> Thus, we now have the ability to dynamically attach function calls to
>> arbitrary points in the kernel that can have an effect on the code that
>> called it. Right now, we only have the ability to attach function calls to
>> these locations that have passive effects (tracing/profiling).
> > Well, they can only have the effect that the calling site accepts and handles. > So the 'effect' is not arbitrary and not defined by the callbacks, it is > controlled and handled by the calling code. > > We do not want invisible side-effects, opaque hooks, etc. > > Instead of that we want (this is the getname() example i cited in the thread) > explicit effects, like: > > if (event_vfs_getname(result)) > return ERR_PTR(-EPERM); > >> But you say, "nor do the calling sites care in general". Then what do >> these calling sites do with the return code? Are we limiting these >> actions to security only? Or can we have some other feature. [...] > > Yeah, not just security. One other example that came up recently is whether to > panic the box on certain (bad) events such as NMI errors. This too could be > made flexible via the event filter code: we already capture many events, so > places that might conceivably do some policy could do so based on a filter > condition. This sounds great - I just wish I could
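The default behaviour proposed earlier in this thread ("trigger all callbacks and use the first non-zero return value") can be sketched in plain C as follows. This is only an illustration of that rule. `model_event_fire` and the sample callbacks are hypothetical names, and nothing here reflects how the tracepoint or perf code is actually structured.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type; `data` carries whatever the call site passes. */
typedef int (*model_event_cb)(void *data);

/* Run every registered callback, and report the first non-zero result. */
static int model_event_fire(const model_event_cb *cbs, size_t n, void *data)
{
	int result = 0;

	assert(cbs || n == 0);
	for (size_t i = 0; i < n; i++) {
		int rc = cbs[i](data);

		if (rc != 0 && result == 0)
			result = rc;	/* keep it, but still run the rest */
	}
	return result;
}

/* Sample callbacks: each counts its invocation through `data`. */
static int cb_allow(void *data)  { ++*(int *)data; return 0; }
static int cb_eperm(void *data)  { ++*(int *)data; return -EPERM; }
static int cb_eacces(void *data) { ++*(int *)data; return -EACCES; }
```

With the callbacks registered as allow, eperm, eacces the call site sees -EPERM; swap the last two and it sees -EACCES. That is the ordering concern raised in the thread: the value the call site sees depends on registration order.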
[git pull] Please pull powerpc.git merge branch
Hi Linus Dunno if it's too late or not yet but here's 3 fixes for powerpc that would be welcome to have in before the release. If not I'll send them first thing next (one of them is already in -next in fact). Those are regression fixes and a build breakage. Cheers, Ben. The following changes since commit fce519588acfac249e8fdc1f5016c73d617de315: Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6 (2011-05-18 13:25:57 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge Ben Hutchings (1): powerpc/kexec: Fix build failure on 32-bit SMP Benjamin Herrenschmidt (1): powerpc/smp: Make start_secondary_resume available to all CPU variants kerstin jonsson (1): powerpc/4xx: Fix regression in SMP on 476 arch/powerpc/kernel/crash.c | 59 + arch/powerpc/kernel/head_32.S |9 -- arch/powerpc/kernel/misc_32.S | 11 +++ arch/powerpc/kernel/smp.c |4 +- 4 files changed, 43 insertions(+), 40 deletions(-) ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] lib: Consolidate DEBUG_STACK_USAGE option
On Fri, 2011-05-06 at 22:57 -0700, Stephen Boyd wrote: > Most arches define CONFIG_DEBUG_STACK_USAGE exactly the same way. > Move it to lib/Kconfig.debug so each arch doesn't have to define > it. This obviously makes the option generic, but that's fine > because the config is already used in generic code. > > It's not obvious to me that sysrq-P actually does anything > different with this option enabled, but I erred on the side of > caution by keeping the most inclusive wording. Sorry for the delay... For powerpc: Acked-by: Benjamin Herrenschmidt Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
linux-next: build warning after merge of the powerpc tree
Hi all,

After merging the powerpc tree, today's linux-next build (powerpc allyesconfig) produced this warning:

WARNING: arch/powerpc/sysdev/built-in.o(.text+0x10eb8): Section mismatch in reference from the function .ics_rtas_init() to the function .init.text:.xics_register_ics()
The function .ics_rtas_init() references the function __init .xics_register_ics().
This is often because .ics_rtas_init lacks a __init annotation or the annotation of .xics_register_ics is wrong.

Introduced by commit 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver").

ics_rtas_init() is only called from xics_init() which is marked __init, so ics_rtas_init() should be as well.

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
Re: [PATCH] PPC_47x SMP fix
On Wed, 2011-05-18 at 11:57 +0200, Kerstin Jonsson wrote:

> commit c56e58537d504706954a06570b4034c04e5b7500 breaks SMP support in PPC_47x chip.
> secondary_ti must be set to current thread info before calling kick_cpu or else
> start_secondary_47x will jump into void when trying to return to c-code.
> In the current setup secondary_ti is initialized before the CPU idle task is started
> and only the boot core will start. I am not sure this is the correct solution, but it
> makes SMP possible in my chip.
> Note! The HOTPLUG support probably needs some fixing too. There is no trampoline code
> available in head_44x.S - start_secondary_resume?

Sending to Linus now. I've also committed a fix for the latter, moving the 32-bit definition of start_secondary_resume to misc_32.S

Thanks !

Cheers,
Ben.

> Signed-off-by: Kerstin Jonsson
> Cc: Paul Mackerras
> Cc: Michael Neuling
> Cc: Will Schmidt
> ---
>  arch/powerpc/kernel/smp.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index cbdbb14..f2dcab7 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -410,8 +410,6 @@ int __cpuinit __cpu_up(unsigned int cpu)
>  {
>  	int rc, c;
>
> -	secondary_ti = current_set[cpu];
> -
>  	if (smp_ops == NULL ||
>  	    (smp_ops->cpu_bootable && !smp_ops->cpu_bootable(cpu)))
>  		return -EINVAL;
> @@ -421,6 +419,8 @@ int __cpuinit __cpu_up(unsigned int cpu)
>  	if (rc)
>  		return rc;
>
> +	secondary_ti = current_set[cpu];
> +
>  	/* Make sure callin-map entry is 0 (can be leftover a CPU
>  	 * hotplug
>  	 */
Re: mmotm threatens ppc preemption again
On 03/31/2011 01:38 PM, Benjamin Herrenschmidt wrote: > On Thu, 2011-03-31 at 10:21 -0700, Jeremy Fitzhardinge wrote: >> No, its the same accessors for both, since the need to distinguish them >> hasn't really come up. Could you put a "if (preemptable()) return;" >> guard in your implementations? > That would be a band-aid but would probably do the trick for now > for !-rt, tho it wouldn't do the right thing for -rt... Hi Ben, Have you had a chance to look at doing a workaround/fix for these power problems? I believe that's the only holdup to putting in the batching changes. I'd like to get them in for the next window if possible, since they're a pretty significant performance win for us. Thanks, J ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 6/7] powerpc/mm: 64-bit: tlb handler micro-optimization
On Thu, 19 May 2011 07:54:48 +1000 Benjamin Herrenschmidt wrote: > On Wed, 2011-05-18 at 16:51 -0500, Scott Wood wrote: > > On Thu, 19 May 2011 07:37:47 +1000 > > Benjamin Herrenschmidt wrote: > > > > > On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote: > > > > A little more speed up measured on e5500. > > > > > > > > Setting of U0-3 is dropped as it is not used by Linux as far as I can > > > > see. > > > > > > Please keep them for now. If your core doesn't have them, make them an > > > MMU feature. > > > > We have them, it was just an attempt to clean out unused things to speed up > > the miss handler. I'll drop that part if you think we'll use it in the > > future. > > I never know for sure ... damn research people ... :-) > > I'd rather keep them for now, does it make a significant difference ? It was minor but measurable (wouldn't have been worthwhile except as part of a series of small things that add up), but upon trying again I was able to reorder slightly and fit it in without seeing an impact. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wednesday, May 18, 2011 3:30 PM, Benjamin Herrenschmidt wrote:

> You may also want to look at Milton's comments, it looks like the way
> you do init_completion followed immediately by wait_completion is racy.
>
> You should init the completion before you do the IO that will eventually
> trigger complete() to be called.

I agree. The init_completion needs to be done prior to posting the smid. I'm not sure why I did it that way.
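The ordering problem agreed on here can be shown with a tiny single-threaded model of a completion. This is a sketch, not the kernel API: the `model_*` functions are hypothetical stand-ins that only track a counter, and the "device" completes the request immediately, which is the worst case for the racy ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a completion: just a count of signalled events. */
struct model_completion {
	unsigned int done;
};

static void model_init_completion(struct model_completion *c)
{
	assert(c);
	c->done = 0;
}

static void model_complete(struct model_completion *c)
{
	c->done++;
}

/* Non-blocking stand-in for waiting: true if a waiter would proceed. */
static bool model_try_wait(struct model_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

/* Racy order: post the request, the reply arrives, and only then init. */
static bool model_racy_order(void)
{
	struct model_completion c = { 0 };

	model_complete(&c);		/* request posted, device already replied */
	model_init_completion(&c);	/* wipes the event that just arrived */
	return model_try_wait(&c);	/* false: a real waiter would hang */
}

/* Safe order: init first, then post the request that triggers complete. */
static bool model_safe_order(void)
{
	struct model_completion c;

	model_init_completion(&c);
	model_complete(&c);		/* request posted, device replied */
	return model_try_wait(&c);	/* true: the event is still recorded */
}
```

In the racy order the late init discards a completion that has already been signalled, so the waiter blocks until a timeout (if any). Initialising before the request is posted closes that window.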
Re: [PATCH 7/7] powerpc/e5500: set MMU_FTR_USE_PAIRED_MAS
On Wed, 2011-05-18 at 16:52 -0500, Scott Wood wrote: > On Thu, 19 May 2011 07:38:19 +1000 > Benjamin Herrenschmidt wrote: > > > On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote: > > > Signed-off-by: Scott Wood > > > --- > > > Is there any 64-bit book3e chip that doesn't support this? It > > > doesn't appear to be optional in the ISA. > > > > Not afaik. > > Any objection to just removing the feature bit? Nope. Wasn't it added by Kumar in the first place ? Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 6/7] powerpc/mm: 64-bit: tlb handler micro-optimization
On Wed, 2011-05-18 at 16:51 -0500, Scott Wood wrote: > On Thu, 19 May 2011 07:37:47 +1000 > Benjamin Herrenschmidt wrote: > > > On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote: > > > A little more speed up measured on e5500. > > > > > > Setting of U0-3 is dropped as it is not used by Linux as far as I can > > > see. > > > > Please keep them for now. If your core doesn't have them, make them an > > MMU feature. > > We have them, it was just an attempt to clean out unused things to speed up > the miss handler. I'll drop that part if you think we'll use it in the > future. I never know for sure ... damn research people ... :-) I'd rather keep them for now, does it make a significant difference ? Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 5/7] powerpc/mm: 64-bit: don't handle non-standard page sizes
On Wed, 2011-05-18 at 16:50 -0500, Scott Wood wrote: > On Thu, 19 May 2011 07:36:04 +1000 > Benjamin Herrenschmidt wrote: > > > On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote: > > > I don't see where any non-standard page size will be set in the > > > kernel page tables, so don't waste time checking for it. It wouldn't > > > work with TLB0 on an FSL MMU anyway, so if there's something I missed > > > (or which is out-of-tree), it's relying on implementation-specific > > > behavior. If there's an out-of-tree need for occasional 4K mappings > > > with CONFIG_PPC_64K_PAGES, perhaps this check could only be done when > > > that is defined. > > > > > > Signed-off-by: Scott Wood > > > --- > > > > Do you use that in the hugetlbfs code ? Can you publish that code ? It's > > long overdue... > > hugetlbfs entries don't get loaded by this code. It branches to a slow > path based on seeing a positive value in a pgd/pud/pmd entry. BTW. The long overdue was aimed at David to get A2 hugetlbfs out :-) Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 7/7] powerpc/e5500: set MMU_FTR_USE_PAIRED_MAS
On Thu, 19 May 2011 07:38:19 +1000 Benjamin Herrenschmidt wrote: > On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote: > > Signed-off-by: Scott Wood > > --- > > Is there any 64-bit book3e chip that doesn't support this? It > > doesn't appear to be optional in the ISA. > > Not afaik. Any objection to just removing the feature bit? -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/7] powerpc/mm: 64-bit 4k: use page-sized PMDs
> > Why do you want to create a virtual page table at the PMD level ? Also, > > you are changing the geometry of the page tables which I think we don't > > want. We chose that geometry so that the levels match the segment sizes > > on server, I think it may have an impact with the hugetlbfs code (check > > with David), it also was meant as a way to implement shared page tables > > on hash64 tho we never published that. > > The number of virtual page table misses were very high on certain loads. > Cutting back to a virtual PMD eliminates most of that for the benchmark I > tested, though it could still be painful for access patterns that are > extremely spread out through the 64-bit address space. I'll try a full > 4-level walk and see what the performance is like; I was aiming for a > compromise between random access and linear/localized access. Let's get more numbers first then :-) > Why does it need to match segment sizes on server? I'm not sure whether we have a dependency with hugetlbfs there, I need to check (remember we have one page size per segment there). For sharing page tables that came from us using the PMD pointer as a base to calculate the VSIDs. But I don't think we have plans to revive those patches in the immediate future. Cheers, Ben. > As for hugetlbfs, it merged easily enough with Becky's patches (you'll have > to ask her when they'll be published). > > -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 6/7] powerpc/mm: 64-bit: tlb handler micro-optimization
On Thu, 19 May 2011 07:37:47 +1000 Benjamin Herrenschmidt wrote:
> On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote:
> > A little more speed up measured on e5500.
> >
> > Setting of U0-3 is dropped as it is not used by Linux as far as I can
> > see.
>
> Please keep them for now. If your core doesn't have them, make them an
> MMU feature.

We have them, it was just an attempt to clean out unused things to speed
up the miss handler. I'll drop that part if you think we'll use it in
the future.

-Scott
Re: [PATCH 5/7] powerpc/mm: 64-bit: don't handle non-standard page sizes
On Thu, 19 May 2011 07:36:04 +1000 Benjamin Herrenschmidt wrote:
> On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote:
> > I don't see where any non-standard page size will be set in the
> > kernel page tables, so don't waste time checking for it. It wouldn't
> > work with TLB0 on an FSL MMU anyway, so if there's something I missed
> > (or which is out-of-tree), it's relying on implementation-specific
> > behavior. If there's an out-of-tree need for occasional 4K mappings
> > with CONFIG_PPC_64K_PAGES, perhaps this check could only be done when
> > that is defined.
> >
> > Signed-off-by: Scott Wood
> > ---
>
> Do you use that in the hugetlbfs code ? Can you publish that code ? It's
> long overdue...

hugetlbfs entries don't get loaded by this code. It branches to a slow
path based on seeing a positive value in a pgd/pud/pmd entry.

-Scott
Re: powerpc: mpc85xx regression since 2.6.39-rc2, one cpu core lame
> (I get the feeling that I am the only one testing recent kernels with
> the mpc85xx.)
>
> Anyhow, I see that this commit was one of a series. For my own use,
> can I simply revert this one commit independently?

For your own use sure :-) But I'd still like to get to the bottom of
this !

Cheers,
Ben.
Re: powerpc: mpc85xx regression since 2.6.39-rc2, one cpu core lame
On Wed, 2011-05-18 at 12:19 -0500, Milton Miller wrote:
> Does this patch help? If so please reply to that thread so patchwork
> will see it in addition to here.
>
> http://patchwork.ozlabs.org/patch/96146/

Interesting. I'll have a closer look today. Unfortunately, I don't have
any 32-bit BookE SMP at hand at the moment so I couldn't test those
configs.

Cheers,
Ben.
Re: [PATCH 1/7] powerpc/mm: 64-bit 4k: use page-sized PMDs
On Thu, 19 May 2011 07:32:41 +1000 Benjamin Herrenschmidt wrote:
> On Wed, 2011-05-18 at 16:04 -0500, Scott Wood wrote:
> > This allows a virtual page table to be used at the PMD rather than
> > the PTE level.
> >
> > Rather than adjust the constant in pgd_index() (or ignore it, as
> > too-large values don't hurt as long as overly large addresses aren't
> > passed in), go back to using PTRS_PER_PGD. The overflow comment seems to
> > apply to a very old implementation of free_pgtables that used pgd_index()
> > (unfortunately the commit message, if you seek it out in the historic
> > tree, doesn't mention any details about the overflow). The existing
> > value was numerically identical to the old 4K-page PTRS_PER_PGD, so
> > using it shouldn't produce an overflow where it's not otherwise possible.
> >
> > Also get rid of the incorrect comment at the top of pgtable-ppc64-4k.h.
>
> Why do you want to create a virtual page table at the PMD level ? Also,
> you are changing the geometry of the page tables which I think we don't
> want. We chose that geometry so that the levels match the segment sizes
> on server, I think it may have an impact with the hugetlbfs code (check
> with David), it also was meant as a way to implement shared page tables
> on hash64 tho we never published that.

The number of virtual page table misses was very high on certain loads.
Cutting back to a virtual PMD eliminates most of that for the benchmark I
tested, though it could still be painful for access patterns that are
extremely spread out through the 64-bit address space. I'll try a full
4-level walk and see what the performance is like; I was aiming for a
compromise between random access and linear/localized access.

Why does it need to match segment sizes on server?

As for hugetlbfs, it merged easily enough with Becky's patches (you'll
have to ask her when they'll be published).

-Scott
Re: [PATCH 7/7] powerpc/e5500: set MMU_FTR_USE_PAIRED_MAS
On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote:
> Signed-off-by: Scott Wood
> ---
> Is there any 64-bit book3e chip that doesn't support this? It
> doesn't appear to be optional in the ISA.

Not afaik.

Cheers,
Ben.

>  arch/powerpc/kernel/cputable.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> index 34d2722..a3b8eeb 100644
> --- a/arch/powerpc/kernel/cputable.c
> +++ b/arch/powerpc/kernel/cputable.c
> @@ -1981,7 +1981,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
>  		.cpu_features		= CPU_FTRS_E5500,
>  		.cpu_user_features	= COMMON_USER_BOOKE,
>  		.mmu_features		= MMU_FTR_TYPE_FSL_E | MMU_FTR_BIG_PHYS |
> -			MMU_FTR_USE_TLBILX,
> +			MMU_FTR_USE_TLBILX | MMU_FTR_USE_PAIRED_MAS,
>  		.icache_bsize		= 64,
>  		.dcache_bsize		= 64,
>  		.num_pmcs		= 4,
Re: [PATCH 6/7] powerpc/mm: 64-bit: tlb handler micro-optimization
On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote:
> A little more speed up measured on e5500.
>
> Setting of U0-3 is dropped as it is not used by Linux as far as I can
> see.

Please keep them for now. If your core doesn't have them, make them an
MMU feature.

Cheers,
Ben.

> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/mm/tlb_low_64e.S |   21 -
>  1 files changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
> index e782023..a94c87b 100644
> --- a/arch/powerpc/mm/tlb_low_64e.S
> +++ b/arch/powerpc/mm/tlb_low_64e.S
> @@ -47,10 +47,10 @@
>  	 * We could probably also optimize by not saving SRR0/1 in the
>  	 * linear mapping case but I'll leave that for later
>  	 */
> -	mfspr	r14,SPRN_ESR
>  	mfspr	r16,SPRN_DEAR		/* get faulting address */
>  	srdi	r15,r16,60		/* get region */
>  	cmpldi	cr0,r15,0xc		/* linear mapping ? */
> +	mfspr	r14,SPRN_ESR
>  	TLB_MISS_STATS_SAVE_INFO
>  	beq	tlb_load_linear		/* yes -> go to linear map load */
>
> @@ -62,11 +62,11 @@
>  	andi.	r10,r15,0x1
>  	bne-	virt_page_table_tlb_miss
>
> -	std	r14,EX_TLB_ESR(r12);	/* save ESR */
> -	std	r16,EX_TLB_DEAR(r12);	/* save DEAR */
> +	/* We need _PAGE_PRESENT and _PAGE_ACCESSED set */
>
> -	/* We need _PAGE_PRESENT and _PAGE_ACCESSED set */
> +	std	r14,EX_TLB_ESR(r12);	/* save ESR */
>  	li	r11,_PAGE_PRESENT
> +	std	r16,EX_TLB_DEAR(r12);	/* save DEAR */
>  	oris	r11,r11,_PAGE_ACCESSED@h
>
>  	/* We do the user/kernel test for the PID here along with the RW test
> @@ -225,21 +225,16 @@ finish_normal_tlb_miss:
>  	 *                 yet implemented for now
>  	 * MAS 2   :	Defaults not useful, need to be redone
>  	 * MAS 3+7 :	Needs to be done
> -	 *
> -	 * TODO: mix up code below for better scheduling
>  	 */
>  	clrrdi	r11,r16,12		/* Clear low crap in EA */
> +	rldicr	r15,r14,64-(PTE_RPN_SHIFT-PAGE_SHIFT),63-PAGE_SHIFT
>  	rlwimi	r11,r14,32-19,27,31	/* Insert WIMGE */
> +	clrldi	r15,r15,12		/* Clear crap at the top */
>  	mtspr	SPRN_MAS2,r11
> -
> -	/* Move RPN in position */
> -	rldicr	r11,r14,64-(PTE_RPN_SHIFT-PAGE_SHIFT),63-PAGE_SHIFT
> -	clrldi	r15,r11,12		/* Clear crap at the top */
> -	rlwimi	r15,r14,32-8,22,25	/* Move in U bits */
> +	andi.	r11,r14,_PAGE_DIRTY
>  	rlwimi	r15,r14,32-2,26,31	/* Move in BAP bits */
>
>  	/* Mask out SW and UW if !DIRTY (XXX optimize this !) */
> -	andi.	r11,r14,_PAGE_DIRTY
>  	bne	1f
>  	li	r11,MAS3_SW|MAS3_UW
>  	andc	r15,r15,r11
> @@ -483,10 +478,10 @@ virt_page_table_tlb_miss_whacko_fault:
>  	 * We could probably also optimize by not saving SRR0/1 in the
>  	 * linear mapping case but I'll leave that for later
>  	 */
> -	mfspr	r14,SPRN_ESR
>  	mfspr	r16,SPRN_DEAR		/* get faulting address */
>  	srdi	r11,r16,60		/* get region */
>  	cmpldi	cr0,r11,0xc		/* linear mapping ? */
> +	mfspr	r14,SPRN_ESR
>  	TLB_MISS_STATS_SAVE_INFO
>  	beq	tlb_load_linear		/* yes -> go to linear map load */
>
Re: [PATCH 5/7] powerpc/mm: 64-bit: don't handle non-standard page sizes
On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote:
> I don't see where any non-standard page size will be set in the
> kernel page tables, so don't waste time checking for it. It wouldn't
> work with TLB0 on an FSL MMU anyway, so if there's something I missed
> (or which is out-of-tree), it's relying on implementation-specific
> behavior. If there's an out-of-tree need for occasional 4K mappings
> with CONFIG_PPC_64K_PAGES, perhaps this check could only be done when
> that is defined.
>
> Signed-off-by: Scott Wood
> ---

Do you use that in the hugetlbfs code ? Can you publish that code ? It's
long overdue...

Cheers,
Ben.

>  arch/powerpc/mm/tlb_low_64e.S |   13 -
>  1 files changed, 0 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
> index 922fece..e782023 100644
> --- a/arch/powerpc/mm/tlb_low_64e.S
> +++ b/arch/powerpc/mm/tlb_low_64e.S
> @@ -232,19 +232,6 @@ finish_normal_tlb_miss:
>  	rlwimi	r11,r14,32-19,27,31	/* Insert WIMGE */
>  	mtspr	SPRN_MAS2,r11
>
> -	/* Check page size, if not standard, update MAS1 */
> -	rldicl	r11,r14,64-8,64-8
> -#ifdef CONFIG_PPC_64K_PAGES
> -	cmpldi	cr0,r11,BOOK3E_PAGESZ_64K
> -#else
> -	cmpldi	cr0,r11,BOOK3E_PAGESZ_4K
> -#endif
> -	beq-	1f
> -	mfspr	r11,SPRN_MAS1
> -	rlwimi	r11,r14,31,21,24
> -	rlwinm	r11,r11,0,21,19
> -	mtspr	SPRN_MAS1,r11
> -1:
>  	/* Move RPN in position */
>  	rldicr	r11,r14,64-(PTE_RPN_SHIFT-PAGE_SHIFT),63-PAGE_SHIFT
>  	clrldi	r15,r11,12		/* Clear crap at the top */
Re: [PATCH 2/7] powerpc/mm: 64-bit 4k: use a PMD-based virtual page table
On Wed, 2011-05-18 at 16:05 -0500, Scott Wood wrote: > Loads with non-linear access patterns were producing a very high > ratio of recursive pt faults to regular tlb misses. Rather than > choose between a 4-level table walk or a 1-level virtual page table > lookup, use a hybrid scheme with a virtual linear pmd, followed by a > 2-level lookup in the normal handler. > > This adds about 5 cycles (assuming no cache misses, and e5500 timing) > to a normal TLB miss, but greatly reduces the recursive fault rate > for loads which don't have locality within 2 MiB regions but do have > significant locality within 1 GiB regions. Improvements of close to 50% > were seen on such benchmarks. Can you publish benchmarks that compare these two with no virtual at all (4 full loads) ? Cheers, Ben. > Signed-off-by: Scott Wood > --- > arch/powerpc/mm/tlb_low_64e.S | 23 +++ > 1 files changed, 15 insertions(+), 8 deletions(-) > > diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S > index af08922..17726d3 100644 > --- a/arch/powerpc/mm/tlb_low_64e.S > +++ b/arch/powerpc/mm/tlb_low_64e.S > @@ -24,7 +24,7 @@ > #ifdef CONFIG_PPC_64K_PAGES > #define VPTE_PMD_SHIFT (PTE_INDEX_SIZE+1) > #else > -#define VPTE_PMD_SHIFT (PTE_INDEX_SIZE) > +#define VPTE_PMD_SHIFT 0 > #endif > #define VPTE_PUD_SHIFT (VPTE_PMD_SHIFT + PMD_INDEX_SIZE) > #define VPTE_PGD_SHIFT (VPTE_PUD_SHIFT + PUD_INDEX_SIZE) > @@ -185,7 +185,7 @@ normal_tlb_miss: > /* Insert the bottom bits in */ > rlwimi r14,r15,0,16,31 > #else > - rldicl r14,r16,64-(PAGE_SHIFT-3),PAGE_SHIFT-3+4 > + rldicl r14,r16,64-(PMD_SHIFT-3),PMD_SHIFT-3+4 > #endif > sldir15,r10,60 > clrrdi r14,r14,3 > @@ -202,6 +202,16 @@ MMU_FTR_SECTION_ELSE > ld r14,0(r10) > ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV) > > +#ifndef CONFIG_PPC_64K_PAGES > + rldicl r15,r16,64-PAGE_SHIFT+3,64-PTE_INDEX_SIZE-3 > + clrrdi r15,r15,3 > + > + cmpldi cr0,r14,0 > + beq normal_tlb_miss_access_fault > + > + ldx r14,r14,r15 > +#endif > + > 
finish_normal_tlb_miss: > /* Check if required permissions are met */ > andc. r15,r11,r14 > @@ -353,14 +363,11 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_TLBRSRV) > #ifndef CONFIG_PPC_64K_PAGES > /* Get to PUD entry */ > rldicl r11,r16,64-VPTE_PUD_SHIFT,64-PUD_INDEX_SIZE-3 > - clrrdi r10,r11,3 > - ldx r15,r10,r15 > - cmpldi cr0,r15,0 > - beq virt_page_table_tlb_miss_fault > -#endif /* CONFIG_PPC_64K_PAGES */ > - > +#else > /* Get to PMD entry */ > rldicl r11,r16,64-VPTE_PMD_SHIFT,64-PMD_INDEX_SIZE-3 > +#endif > + > clrrdi r10,r11,3 > ldx r15,r10,r15 > cmpldi cr0,r15,0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
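For anyone following the 2 MiB vs 1 GiB figures in the commit message, here is a small userspace sketch of the arithmetic. It assumes 4K pages and the index sizes from patch 1/7 of this series (PTE_INDEX_SIZE = 9, PMD_INDEX_SIZE = 9); the helper names are made up for illustration and are not kernel functions. The idea is that one 4K page of a virtual PTE array covers 2 MiB of address space, while one 4K page of a virtual PMD array covers 1 GiB, so a single virtual-table TLB entry serves a much wider region.

```c
#include <stdint.h>

/* Assumed geometry: 4K pages, index sizes as in patch 1/7 of this series. */
#define PAGE_SHIFT      12
#define PTE_INDEX_SIZE  9
#define PMD_INDEX_SIZE  9

/* Address range covered by one page worth of PTEs (512 x 4 KiB). */
static inline uint64_t pte_page_span(void)
{
	return (uint64_t)1 << (PAGE_SHIFT + PTE_INDEX_SIZE);
}

/* Address range covered by one page worth of PMD entries (512 x 2 MiB). */
static inline uint64_t pmd_page_span(void)
{
	return (uint64_t)1 << (PAGE_SHIFT + PTE_INDEX_SIZE + PMD_INDEX_SIZE);
}

/*
 * Hypothetical helper: do two addresses fall in the same span, i.e. would
 * they be served by the same page of the virtual table?
 */
static inline int same_region(uint64_t a, uint64_t b, uint64_t span)
{
	return (a / span) == (b / span);
}
```

Two accesses 3 MiB apart miss each other's PTE-level virtual table page but share a PMD-level one, which is the kind of access pattern the commit message describes.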
Re: [PATCH 1/7] powerpc/mm: 64-bit 4k: use page-sized PMDs
On Wed, 2011-05-18 at 16:04 -0500, Scott Wood wrote:
> This allows a virtual page table to be used at the PMD rather than
> the PTE level.
>
> Rather than adjust the constant in pgd_index() (or ignore it, as
> too-large values don't hurt as long as overly large addresses aren't
> passed in), go back to using PTRS_PER_PGD. The overflow comment seems to
> apply to a very old implementation of free_pgtables that used pgd_index()
> (unfortunately the commit message, if you seek it out in the historic
> tree, doesn't mention any details about the overflow). The existing
> value was numerically identical to the old 4K-page PTRS_PER_PGD, so
> using it shouldn't produce an overflow where it's not otherwise possible.
>
> Also get rid of the incorrect comment at the top of pgtable-ppc64-4k.h.

Why do you want to create a virtual page table at the PMD level ? Also,
you are changing the geometry of the page tables which I think we don't
want. We chose that geometry so that the levels match the segment sizes
on server, I think it may have an impact with the hugetlbfs code (check
with David), it also was meant as a way to implement shared page tables
on hash64 tho we never published that.

Cheers,
Ben.

> Signed-off-by: Scott Wood
> ---
>  arch/powerpc/include/asm/pgtable-ppc64-4k.h |   12
>  arch/powerpc/include/asm/pgtable-ppc64.h    |    3 +--
>  2 files changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64-4k.h b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
> index 6eefdcf..194005e 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64-4k.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
> @@ -1,14 +1,10 @@
>  #ifndef _ASM_POWERPC_PGTABLE_PPC64_4K_H
>  #define _ASM_POWERPC_PGTABLE_PPC64_4K_H
> -/*
> - * Entries per page directory level. The PTE level must use a 64b record
> - * for each page table entry. The PMD and PGD level use a 32b record for
> - * each entry by assuming that each entry is page aligned.
> - */
> +
>  #define PTE_INDEX_SIZE 9
> -#define PMD_INDEX_SIZE 7
> +#define PMD_INDEX_SIZE 9
>  #define PUD_INDEX_SIZE 7
> -#define PGD_INDEX_SIZE 9
> +#define PGD_INDEX_SIZE 7
>
>  #ifndef __ASSEMBLY__
>  #define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_INDEX_SIZE)
> @@ -19,7 +15,7 @@
>
>  #define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
>  #define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
> -#define PTRS_PER_PUD	(1 << PMD_INDEX_SIZE)
> +#define PTRS_PER_PUD	(1 << PUD_INDEX_SIZE)
>  #define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
>
>  /* PMD_SHIFT determines what a second-level page table entry can map */
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index 2b09cd5..8bd1cd9 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -181,8 +181,7 @@
>   * Find an entry in a page-table-directory. We combine the address region
>   * (the high order N bits) and the pgd portion of the address.
>   */
> -/* to avoid overflow in free_pgtables we don't use PTRS_PER_PGD here */
> -#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & 0x1ff)
> +#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
>
>  #define pgd_offset(mm, address)	 ((mm)->pgd + pgd_index(address))
>
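To make the geometry change concrete, the following is a rough userspace model of the patched pgd_index(). It assumes PAGE_SHIFT is 12 and that each level's shift is the sum of the index sizes below it, which is how I read pgtable-ppc64.h; treat it as a sketch, not a copy of the header. With the new sizes (9/9/7/7) PGDIR_SHIFT works out to 37 and the PGD index is 7 bits wide, so the total mapped width stays at 44 bits, the same as with the old 9/7/7/9 layout.

```c
#include <stdint.h>

/* Index widths from the patched pgtable-ppc64-4k.h; PAGE_SHIFT assumed 12. */
#define PAGE_SHIFT      12
#define PTE_INDEX_SIZE  9
#define PMD_INDEX_SIZE  9
#define PUD_INDEX_SIZE  7
#define PGD_INDEX_SIZE  7

/* Assumed derivation: each shift stacks on top of the level below. */
#define PMD_SHIFT       (PAGE_SHIFT + PTE_INDEX_SIZE)
#define PUD_SHIFT       (PMD_SHIFT + PMD_INDEX_SIZE)
#define PGDIR_SHIFT     (PUD_SHIFT + PUD_INDEX_SIZE)

#define PTRS_PER_PGD    ((uint64_t)1 << PGD_INDEX_SIZE)

/* Function form of the patched macro: mask with PTRS_PER_PGD - 1. */
static inline uint64_t pgd_index(uint64_t address)
{
	return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}
```

Note that an address at or above 1 << 44 simply wraps back to a low index under this mask, which is the "overly large addresses" caveat in the commit message.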
RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wed, 2011-05-18 at 09:35 -0600, Moore, Eric wrote:
> I worked the original defect a couple months ago, and Kashyap is now
> getting around to posting my patches.
>
> This original defect has nothing to do with PPC64. The original
> problem was only on x86. It only became a problem on PPC64 when I
> tried to fix the original x86 issue by copying the writeq code from
> the linux headers, then it broke PPC64. I doubt that broken patch
> was ever posted. Anyways, back to the original defect. The reason it
> became a problem for x86 is because the kernel headers had an
> implementation of writeq in the arch/x86 headers, which means our
> internal implementation of writeq is not being used. The writeq
> implementation in the kernel is totally wrong for arch/x86 because it
> does not have spin locks, and if two processors are simultaneously doing
> two separate 32bit pci writes, then what is received by controller
> firmware is out of order. This change occurred between Red Hat RHEL5
> and RHEL6. In RHEL5, this writeq was not implemented in arch/x86
> headers, and our driver internal implementation of writeq was used.

You may also want to look at Milton's comments, it looks like the way
you do init_completion followed immediately by wait_for_completion is
racy. You should init the completion before you do the IO that will
eventually trigger complete() to be called.

Cheers,
Ben.
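A minimal single-threaded model of the ordering problem, in case it helps. This is not the kernel API: the struct and functions below are userspace stand-ins that only mimic the "done" counter, and would_wake() is a made-up helper that reports whether a waiter would return instead of actually sleeping. The racy variant shows what happens if the IO completes before init_completion() runs: the reset wipes out the completion and the waiter would sleep forever.

```c
/* Userspace stand-in for the kernel's completion counter (illustration only). */
struct completion {
	unsigned int done;
};

static void init_completion(struct completion *c)
{
	c->done = 0;		/* forget any earlier complete() */
}

static void complete(struct completion *c)
{
	c->done++;		/* the "interrupt handler" side */
}

/* Hypothetical helper: 1 if a waiter would return, 0 if it would block. */
static int would_wake(const struct completion *c)
{
	return c->done > 0;
}

/* Racy order: IO issued, it completes early, and only then do we init. */
static int racy_order(void)
{
	struct completion c = { 0 };

	complete(&c);		/* IO finished before we were ready */
	init_completion(&c);	/* wipes out that completion */
	return would_wake(&c);	/* 0: the waiter would hang */
}

/* Safe order: init first, then issue the IO, then wait. */
static int safe_order(void)
{
	struct completion c;

	init_completion(&c);
	complete(&c);		/* IO finishes at any later point */
	return would_wake(&c);	/* 1: the waiter returns */
}
```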
[PATCH 7/7] [RFC] SMP support code
This patch adds the necessary core code to enable SMP support on BlueGene/P Signed-off-by: Eric Van Hensbergen --- arch/powerpc/kernel/head_44x.S | 72 + arch/powerpc/mm/fault.c| 77 arch/powerpc/platforms/Kconfig.cputype |2 +- 3 files changed, 150 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S index 1f7ae60..57d4483 100644 --- a/arch/powerpc/kernel/head_44x.S +++ b/arch/powerpc/kernel/head_44x.S @@ -1133,6 +1133,70 @@ clear_utlb_entry: #endif /* CONFIG_PPC_47x */ +#if defined(CONFIG_BGP) && defined(CONFIG_SMP) +_GLOBAL(start_secondary_bgp) + /* U2 will be enabled in TLBs. */ +lis r7,PPC44x_MMUCR_U2@h +mtspr SPRN_MMUCR,r7 +li r7,0 +mtspr SPRN_PID,r7 +sync +lis r8,KERNELBASE@h + +/* The tlb_44x_hwater global var (setup by cpu#0) reveals how many + * 256M TLBs we need to map. + */ +lis r9, tlb_44x_hwater@ha +lwz r9, tlb_44x_hwater@l(r9) + +li r5,(PPC44x_TLB_SW | PPC44x_TLB_SR | PPC44x_TLB_SX | \ + PPC44x_TLB_M|PPC44x_TLB_U2) +orisr5, r5, PPC44x_TLB_WL1@h + +/* tlb_44x_hwater is the biggest TLB slot number for regular TLBs. + TLB 63 covers kernel base mapping(256MB) and TLB 62 covers CNS. + With 768MB lowmem, it is set to 59. 
+*/ +2: +addir9, r9, 1 +cmpwi r9,62 /* Stop at entry 62 which is the fw */ +beq 3f +addis r7,r7,0x1000 /* add 256M */ +addis r8,r8,0x1000 +ori r6,r8,PPC44x_TLB_VALID | PPC44x_TLB_256M + +tlbwe r6,r9,PPC44x_TLB_PAGEID /* Load the pageid fields */ +tlbwe r7,r9,PPC44x_TLB_XLAT /* Load the translation fields */ +tlbwe r5,r9,PPC44x_TLB_ATTRIB /* Load the attrib/access fields */ +b 2b + +3: isync + +/* Setup context from global var secondary_ti */ +lis r1, secondary_ti@ha +lwz r1, secondary_ti@l(r1) +lwz r2, TI_TASK(r1) /* r2 = task_info */ + +addir3,r2,THREAD/* init task's THREAD */ +mtspr SPRN_SPRG3,r3 + +li r0,0 +stwur0,THREAD_SIZE-STACK_FRAME_OVERHEAD(r1) + +/* Let's move on */ +lis r4,start_secondary@h +ori r4,r4,start_secondary@l +lis r3,MSR_KERNEL@h +ori r3,r3,MSR_KERNEL@l +mtspr SPRN_SRR0,r4 +mtspr SPRN_SRR1,r3 +rfi /* change context and jump to start_secondary */ + +_GLOBAL(start_secondary_resume) + /* I don't think this currently happens on BGP */ + b . +#endif /* CONFIG_BGP && CONFIG_SMP */ + /* * Here we are back to code that is common between 44x and 47x * @@ -1144,6 +1208,14 @@ head_start_common: lis r4,interrupt_base@h /* IVPR only uses the high 16-bits */ mtspr SPRN_IVPR,r4 +#if defined(CONFIG_BGP) && defined(CONFIG_SMP) + /* are we an additional CPU */ + li r0, 0 + mfspr r4, SPRN_PIR + cmpwr4, r0 + bgt start_secondary_bgp +#endif /* CONFIG_BGP && CONFIG_SMP */ + addis r22,r22,KERNELBASE@h mtlrr22 isync diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 54f4fb9..0e73244 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -103,6 +103,77 @@ static int store_updates_sp(struct pt_regs *regs) return 0; } +#ifdef CONFIG_BGP +/* + * The icbi instruction does not broadcast to all cpus in the ppc450 + * processor used by Blue Gene/P. It is unlikely this problem will + * be exhibited in other processors so this remains ifdef'ed for BGP + * specifically. 
+ * + * We deal with this by marking executable pages either writable, or + * executable, but never both. The permissions will fault back and + * forth if the thread is actively writing to executable sections. + * Each time we fault to become executable we flush the dcache into + * icache on all cpus. + */ +struct bgp_fixup_parm { + struct page *page; + unsigned long address; + struct vm_area_struct *vma; +}; + +static void bgp_fixup_cache_tlb(void *parm) +{ + struct bgp_fixup_parm *p = parm; + + if (!PageHighMem(p->page)) + flush_dcache_icache_page(p->page); + local_flush_tlb_page(p->vma, p->address); +} + +static void bgp_fixup_access_perms(struct vm_area_struct *vma, + unsigned long address, + int is_write, int is_exec) +{ + struct mm_struct *mm = vma->vm_mm; + pte_t *ptep = NULL; + pmd_t *pmdp; + + if (get_pteptr(mm, address, &ptep, &pmdp)) { + spinlock_t *ptl
[PATCH 5/7] [RFC] force 32-byte aligned kmallocs
For BGP, it is convenient for 'kmalloc' to come back with 32-byte
aligned units for torus DMA

Signed-off-by: Eric Van Hensbergen
---
 arch/powerpc/include/asm/page_32.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h
index 68d73b2..fb0a7ae 100644
--- a/arch/powerpc/include/asm/page_32.h
+++ b/arch/powerpc/include/asm/page_32.h
@@ -9,7 +9,7 @@

 #define VM_DATA_DEFAULT_FLAGS	VM_DATA_DEFAULT_FLAGS32

-#ifdef CONFIG_NOT_COHERENT_CACHE
+#if defined(CONFIG_NOT_COHERENT_CACHE) || defined(CONFIG_BGP)
 #define ARCH_DMA_MINALIGN	L1_CACHE_BYTES
 #endif

--
1.7.4.1
[PATCH 6/7] [RFC] enable early TLBs for BG/P
BG/P maps firmware with an early TLB

Signed-off-by: Eric Van Hensbergen
---
 arch/powerpc/include/asm/mmu-44x.h |    6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-44x.h b/arch/powerpc/include/asm/mmu-44x.h
index ca1b90c..2807d6e 100644
--- a/arch/powerpc/include/asm/mmu-44x.h
+++ b/arch/powerpc/include/asm/mmu-44x.h
@@ -115,8 +115,12 @@ typedef struct {
 #endif /* !__ASSEMBLY__ */

 #ifndef CONFIG_PPC_EARLY_DEBUG_44x
+#ifndef CONFIG_BGP
 #define PPC44x_EARLY_TLBS	1
-#else
+#else /* CONFIG_BGP */
+#define PPC44x_EARLY_TLBS	2
+#endif /* CONFIG_BGP */
+#else /* CONFIG_PPC_EARLY_DEBUG_44x */
 #define PPC44x_EARLY_TLBS	2
 #define PPC44x_EARLY_DEBUG_VIRTADDR	(ASM_CONST(0xf000) \
 	| (ASM_CONST(CONFIG_PPC_EARLY_DEBUG_44x_PHYSLOW) & 0x))
--
1.7.4.1
[PATCH 3/7] [RFC] add support for BlueGene/P FPU
This patch adds save/restore register support for the BlueGene/P double hummer FPU. Signed-off-by: Eric Van Hensbergen --- arch/powerpc/include/asm/ppc_asm.h | 39 --- arch/powerpc/kernel/fpu.S |8 +++--- arch/powerpc/platforms/44x/Kconfig |9 3 files changed, 40 insertions(+), 16 deletions(-) diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 9821006..daa22bb 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -88,6 +88,13 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR) REST_10GPRS(22, base) #endif +#ifdef CONFIG_BGP +#define LFPDX(frt, ra, rb) .long (31<<26)|((frt)<<21)|((ra)<<16)| \ + ((rb)<<11)|(462<<1) +#define STFPDX(frt, ra, rb).long (31<<26)|((frt)<<21)|((ra)<<16)| \ + ((rb)<<11)|(974<<1) +#endif /* CONFIG_BGP */ + #define SAVE_2GPRS(n, base)SAVE_GPR(n, base); SAVE_GPR(n+1, base) #define SAVE_4GPRS(n, base)SAVE_2GPRS(n, base); SAVE_2GPRS(n+2, base) #define SAVE_8GPRS(n, base)SAVE_4GPRS(n, base); SAVE_4GPRS(n+4, base) @@ -97,18 +104,26 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR) #define REST_8GPRS(n, base)REST_4GPRS(n, base); REST_4GPRS(n+4, base) #define REST_10GPRS(n, base) REST_8GPRS(n, base); REST_2GPRS(n+8, base) -#define SAVE_FPR(n, base) stfdn,THREAD_FPR0+8*TS_FPRWIDTH*(n)(base) -#define SAVE_2FPRS(n, base)SAVE_FPR(n, base); SAVE_FPR(n+1, base) -#define SAVE_4FPRS(n, base)SAVE_2FPRS(n, base); SAVE_2FPRS(n+2, base) -#define SAVE_8FPRS(n, base)SAVE_4FPRS(n, base); SAVE_4FPRS(n+4, base) -#define SAVE_16FPRS(n, base) SAVE_8FPRS(n, base); SAVE_8FPRS(n+8, base) -#define SAVE_32FPRS(n, base) SAVE_16FPRS(n, base); SAVE_16FPRS(n+16, base) -#define REST_FPR(n, base) lfd n,THREAD_FPR0+8*TS_FPRWIDTH*(n)(base) -#define REST_2FPRS(n, base)REST_FPR(n, base); REST_FPR(n+1, base) -#define REST_4FPRS(n, base)REST_2FPRS(n, base); REST_2FPRS(n+2, base) -#define REST_8FPRS(n, base)REST_4FPRS(n, base); REST_4FPRS(n+4, base) -#define REST_16FPRS(n, base) REST_8FPRS(n, base); REST_8FPRS(n+8, base) 
-#define REST_32FPRS(n, base) REST_16FPRS(n, base); REST_16FPRS(n+16, base) +#ifdef CONFIG_BGP +#define SAVE_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); STFPDX(n, base, b) +#define REST_FPR(n, b, base) li b, THREAD_FPR0+(16*(n)); LFPDX(n, base, b) +#else /* CONFIG_BGP */ +#define SAVE_FPR(n, b, base) (stfd n, THREAD_FPR0+8*TS_FPRWIDTH*(n)(base)) +#define REST_FPR(n, b, base) (lfdn, THREAD_FPR0+8*TS_FPRWIDTH*(n)(base)) +#endif /* CONFIG_BGP */ + +#define SAVE_2FPRS(n, b, base) SAVE_FPR(n, b, base); SAVE_FPR(n+1, b, base) +#define SAVE_4FPRS(n, b, base) SAVE_2FPRS(n, b, base); SAVE_2FPRS(n+2, b, base) +#define SAVE_8FPRS(n, b, base) SAVE_4FPRS(n, b, base); SAVE_4FPRS(n+4, b, base) +#define SAVE_16FPRS(n, b, base)SAVE_8FPRS(n, b, base); SAVE_8FPRS(n+8, b, base) +#define SAVE_32FPRS(n, b, base)SAVE_16FPRS(n, b, base); \ + SAVE_16FPRS(n+16, b, base) +#define REST_2FPRS(n, b, base) REST_FPR(n, b, base); REST_FPR(n+1, b, base) +#define REST_4FPRS(n, b, base) REST_2FPRS(n, b, base); REST_2FPRS(n+2, b, base) +#define REST_8FPRS(n, b, base) REST_4FPRS(n, b, base); REST_4FPRS(n+4, b, base) +#define REST_16FPRS(n, b, base)REST_8FPRS(n, b, base); REST_8FPRS(n+8, b, base) +#define REST_32FPRS(n, b, base)REST_16FPRS(n, b, base); \ + REST_16FPRS(n+16, b, base) #define SAVE_VR(n,b,base) li b,THREAD_VR0+(16*(n)); stvx n,base,b #define SAVE_2VRS(n,b,base)SAVE_VR(n,b,base); SAVE_VR(n+1,b,base) diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S index de36955..9f11c66 100644 --- a/arch/powerpc/kernel/fpu.S +++ b/arch/powerpc/kernel/fpu.S @@ -30,7 +30,7 @@ BEGIN_FTR_SECTION \ b 2f; \ END_FTR_SECTION_IFSET(CPU_FTR_VSX);\ - REST_32FPRS(n,base);\ + REST_32FPRS(n,c,base); \ b 3f; \ 2: REST_32VSRS(n,c,base); \ 3: @@ -39,13 +39,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX); \ BEGIN_FTR_SECTION \ b 2f; \ END_FTR_SECTION_IFSET(CPU_FTR_VSX);\ - SAVE_32FPRS(n,base);\ + SAVE_32FPRS(n,c,base); \ b 3f;
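As a sanity check on the hand-assembled LFPDX/STFPDX macros in this patch, here is a small C sketch that builds the same 32-bit words from the field layout the macros use (primary opcode 31, then frt, ra, rb, and the extended opcode shifted left by one). The function names are made up for illustration; only the shifts and the 462/974 constants come from the patch.

```c
#include <stdint.h>

/*
 * Same field layout as the .long expressions in the patch:
 * (31<<26) | (frt<<21) | (ra<<16) | (rb<<11) | (xo<<1)
 */
static inline uint32_t encode_xform(uint32_t xo, uint32_t frt,
				    uint32_t ra, uint32_t rb)
{
	return (31u << 26) | (frt << 21) | (ra << 16) | (rb << 11) | (xo << 1);
}

/* Extended opcode 462, as in the patch's LFPDX macro. */
static inline uint32_t encode_lfpdx(uint32_t frt, uint32_t ra, uint32_t rb)
{
	return encode_xform(462, frt, ra, rb);
}

/* Extended opcode 974, as in the patch's STFPDX macro. */
static inline uint32_t encode_stfpdx(uint32_t frt, uint32_t ra, uint32_t rb)
{
	return encode_xform(974, frt, ra, rb);
}
```

For example, encode_lfpdx(0, 3, 4) gives 0x7C03239C, which is what the assembler should emit for LFPDX(0, 3, 4) with these macros.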
[PATCH 4/7] [RFC] enable L1_WRITETHROUGH mode for BG/P
BG/P nodes need to be configured for writethrough to work in SMP configurations. This patch adds the right hooks in the MMU code to make sure L1_WRITETHROUGH configurations are setup for BG/P. Signed-off-by: Eric Van Hensbergen --- arch/powerpc/include/asm/mmu-44x.h |2 ++ arch/powerpc/kernel/head_44x.S | 24 ++-- arch/powerpc/kernel/misc_32.S | 15 +++ arch/powerpc/lib/copy_32.S | 10 ++ arch/powerpc/mm/44x_mmu.c |7 +-- arch/powerpc/platforms/Kconfig |5 + arch/powerpc/platforms/Kconfig.cputype |4 7 files changed, 63 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/mmu-44x.h b/arch/powerpc/include/asm/mmu-44x.h index bf52d70..ca1b90c 100644 --- a/arch/powerpc/include/asm/mmu-44x.h +++ b/arch/powerpc/include/asm/mmu-44x.h @@ -8,6 +8,7 @@ #define PPC44x_MMUCR_TID 0x00ff #define PPC44x_MMUCR_STS 0x0001 +#define PPC44x_MMUCR_U20x0020 #definePPC44x_TLB_PAGEID 0 #definePPC44x_TLB_XLAT 1 @@ -32,6 +33,7 @@ /* Storage attribute and access control fields */ #define PPC44x_TLB_ATTR_MASK 0xff80 +#define PPC44x_TLB_WL1 0x0010 /* Write-through L1 */ #define PPC44x_TLB_U0 0x8000 /* User 0 */ #define PPC44x_TLB_U1 0x4000 /* User 1 */ #define PPC44x_TLB_U2 0x2000 /* User 2 */ diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S index 5e12b74..1f7ae60 100644 --- a/arch/powerpc/kernel/head_44x.S +++ b/arch/powerpc/kernel/head_44x.S @@ -429,7 +429,16 @@ finish_tlb_load_44x: andi. r10,r12,_PAGE_USER /* User page ? */ beq 1f /* nope, leave U bits empty */ rlwimi r11,r11,3,26,28 /* yes, copy S bits to U */ -1: tlbwe r11,r13,PPC44x_TLB_ATTRIB /* Write ATTRIB */ +1: +#ifdef CONFIG_L1_WRITETHROUGH + andi. r10, r11, PPC44x_TLB_I + bne 2f + orisr11,r11,PPC44x_TLB_WL1@h/* Add coherency for */ + /* non-inhibited */ + ori r11,r11,PPC44x_TLB_U2|PPC44x_TLB_M +2: +#endif /* CONFIG_L1_WRITETHROUGH */ + tlbwe r11,r13,PPC44x_TLB_ATTRIB /* Write ATTRIB */ /* Done...restore registers and get out of here. 
*/ @@ -799,7 +808,11 @@ skpinv:addir4,r4,1 /* Increment */ sync /* Initialize MMUCR */ +#ifdef CONFIG_L1_WRITETHROUGH + lis r5, PPC44x_MMUCR_U2@h +#else li r5,0 +#endif /* CONFIG_L1_WRITETHROUGH */ mtspr SPRN_MMUCR,r5 sync @@ -814,7 +827,14 @@ skpinv:addir4,r4,1 /* Increment */ /* attrib fields */ /* Added guarded bit to protect against speculative loads/stores */ li r5,0 - ori r5,r5,(PPC44x_TLB_SW | PPC44x_TLB_SR | PPC44x_TLB_SX | PPC44x_TLB_G) +#ifdef CONFIG_L1_WRITETHROUGH + ori r5,r5,(PPC44x_TLB_SW | PPC44x_TLB_SR | PPC44x_TLB_SX | \ + PPC44x_TLB_G | PPC44x_TLB_U2) + orisr5,r5,PPC44x_TLB_WL1@h +#else + ori r5,r5,(PPC44x_TLB_SW | PPC44x_TLB_SR | PPC44x_TLB_SX | \ + PPC44x_TLB_G) +#endif /* CONFIG_L1_WRITETHROUGH li r0,63/* TLB slot 63 */ diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 094bd98..d88369b 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -506,7 +506,20 @@ _GLOBAL(clear_pages) li r0,PAGE_SIZE/L1_CACHE_BYTES slw r0,r0,r4 mtctr r0 +#ifdef CONFIG_L1_WRITETHROUGH + /* assuming 32 byte cacheline */ + li r4, 0 +1: stw r4, 0(r3) + stw r4, 4(r3) + stw r4, 8(r3) + stw r4, 12(r3) + stw r4, 16(r3) + stw r4, 20(r3) + stw r4, 24(r3) + stw r4, 28(r3) +#else 1: dcbz0,r3 +#endif /* CONFIG_L1_WRITETHROUGH */ addir3,r3,L1_CACHE_BYTES bdnz1b blr @@ -550,7 +563,9 @@ _GLOBAL(copy_page) mtctr r0 1: dcbtr11,r4 +#ifndef CONFIG_L1_WRITETHROUGH dcbzr5,r3 +#endif COPY_16_BYTES #if L1_CACHE_BYTES >= 32 COPY_16_BYTES diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S index 55f19f9..98a07e3 100644 --- a/arch/powerpc/lib/copy_32.S +++ b/arch/powerpc/lib/copy_32.S @@ -98,7 +98,11 @@ _GLOBAL(cacheable_memzero) bdnz4b 3: mtctr r9 li r7,4 +#ifdef CONFIG_L1_WRITETHROUGH +10: +#else 10:dcbzr7,r6 +#endif /* CONFIG_L1_WRITETHROUGH */ addir6,r6,CACHELINE_BYTES bdnz10b clrlwi r5,r8,32-LG_CACHELINE_BYTES @@ -187,7 +191,9 @@ _GLOBAL(cacheable_me
[PATCH 2/7] [RFC] add bluegene entry to cputable
Signed-off-by: Eric Van Hensbergen
---
 arch/powerpc/kernel/cputable.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index b9602ee..0eb245e 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -1732,6 +1732,20 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.machine_check		= machine_check_440A,
 		.platform		= "ppc440",
 	},
+	{ /* Blue Gene/P */
+		.pvr_mask		= 0xfff0,
+		.pvr_value		= 0x52131880,
+		.cpu_name		= "450 Blue Gene/P",
+		.cpu_features		= CPU_FTRS_440x6,
+		.cpu_user_features	= COMMON_USER_BOOKE |
+			PPC_FEATURE_HAS_FPU,
+		.mmu_features		= MMU_FTR_TYPE_44x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.cpu_setup		= __setup_cpu_460gt,
+		.machine_check		= machine_check_440A,
+		.platform		= "ppc440",
+	},
 	{ /* 460EX */
 		.pvr_mask		= 0x0006,
 		.pvr_value		= 0x13020002,
--
1.7.4.1
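For readers unfamiliar with how a cputable entry gets selected, a rough sketch of the mask/value match follows. The struct and function names here are invented for the example; as far as I know the real lookup in cputable.c does the same (pvr & mask) == value comparison over cpu_specs[]. The mask in the archived patch looks truncated, so the 0xfffffff0 below is purely an assumption for the demo, not the value from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down, hypothetical version of a cputable entry. */
struct cpu_spec_sketch {
	uint32_t pvr_mask;
	uint32_t pvr_value;
	const char *cpu_name;
};

/*
 * ASSUMPTION: mask value chosen for illustration only; the archived
 * patch text does not show the full mask.
 */
static const struct cpu_spec_sketch demo_table[] = {
	{ 0xfffffff0u, 0x52131880u, "450 Blue Gene/P" },
};

/* Return the first entry whose masked PVR matches, or NULL if none does. */
static const struct cpu_spec_sketch *
match_pvr(const struct cpu_spec_sketch *tbl, size_t n, uint32_t pvr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if ((pvr & tbl[i].pvr_mask) == tbl[i].pvr_value)
			return &tbl[i];
	return NULL;
}
```

With this demo mask, any PVR that differs only in the low nibble still selects the Blue Gene/P entry, while a difference in any higher bit falls through.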
[PATCH 1/7] [RFC] Mainline BG/P platform support
The Linux kernel patches for the IBM BlueGene/P have been open-sourced for quite some time, but haven't been integrated into the mainline Linux kernel source tree. This is the first of several patch series in which I will attempt to clean up and mainline the already public patches. I welcome feedback as well as any help I can get.

I'm drawing on the patches available for the IBM Compute Node kernel, the ZeptoOS project and the Kittyhawk project (all available from http://wiki.bg.anl-external.org). I'll be prioritizing core patches, which are harder to keep current with mainline due to merge conflicts, and then slowly incorporating the drivers and other extensions (if acceptable after community review).

I'll be maintaining the patchset in my kernel.org repository (/pub/scm/linux/kernel/git/ericvh/bluegene.git) under the bluegene branch, with the source repos (zepto, kittyhawk, ibmcn) available in respective branches.

Ben - if you would prefer me to send pull requests once we get rolling, I can switch to that; otherwise I'll stick to just submitting patches to the list, assuming you'll pull them when they become acceptable.

Thanks for your attention reviewing these patches.

Signed-off-by: Eric Van Hensbergen
---
 MAINTAINERS |    8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 69f19f1..3ffca88 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3863,6 +3863,14 @@ S:	Maintained
 F:	arch/powerpc/platforms/40x/
 F:	arch/powerpc/platforms/44x/
 
+LINUX FOR POWERPC BLUEGENE/P
+M:	Eric Van Hensbergen
+W:	http://bg-linux.anl-external.org/wiki/index.php/Main_Page
+L:	bg-li...@lists.anl-external.org
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/bluegene.git
+S:	Maintained
+F:	arch/powerpc/platforms/44x/bgp*
+
 LINUX FOR POWERPC EMBEDDED XILINX VIRTEX
 M:	Grant Likely
 W:	http://wiki.secretlab.ca/index.php/Linux_on_Xilinx_Virtex
-- 
1.7.4.1
[PATCH 5/7] powerpc/mm: 64-bit: don't handle non-standard page sizes
I don't see where any non-standard page size will be set in the kernel page tables, so don't waste time checking for it. It wouldn't work with TLB0 on an FSL MMU anyway, so if there's something I missed (or which is out-of-tree), it's relying on implementation-specific behavior. If there's an out-of-tree need for occasional 4K mappings with CONFIG_PPC_64K_PAGES, perhaps this check could only be done when that is defined.

Signed-off-by: Scott Wood
---
 arch/powerpc/mm/tlb_low_64e.S |   13 -
 1 files changed, 0 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index 922fece..e782023 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -232,19 +232,6 @@ finish_normal_tlb_miss:
 	rlwimi	r11,r14,32-19,27,31	/* Insert WIMGE */
 	mtspr	SPRN_MAS2,r11
 
-	/* Check page size, if not standard, update MAS1 */
-	rldicl	r11,r14,64-8,64-8
-#ifdef CONFIG_PPC_64K_PAGES
-	cmpldi	cr0,r11,BOOK3E_PAGESZ_64K
-#else
-	cmpldi	cr0,r11,BOOK3E_PAGESZ_4K
-#endif
-	beq-	1f
-	mfspr	r11,SPRN_MAS1
-	rlwimi	r11,r14,31,21,24
-	rlwinm	r11,r11,0,21,19
-	mtspr	SPRN_MAS1,r11
-1:
 	/* Move RPN in position */
 	rldicr	r11,r14,64-(PTE_RPN_SHIFT-PAGE_SHIFT),63-PAGE_SHIFT
 	clrldi	r15,r11,12		/* Clear crap at the top */
-- 
1.7.4.1
[PATCH 4/7] powerpc/mm: 64-bit: Don't load PACA in normal TLB miss exceptions
Load it only when needed, in recursive/linear/indirect faults, and in the stats code. Signed-off-by: Scott Wood --- arch/powerpc/include/asm/exception-64e.h | 28 +- arch/powerpc/mm/tlb_low_64e.S| 43 + 2 files changed, 39 insertions(+), 32 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h index 6921261..9b57a27 100644 --- a/arch/powerpc/include/asm/exception-64e.h +++ b/arch/powerpc/include/asm/exception-64e.h @@ -80,9 +80,9 @@ exc_##label##_book3e: * * This prolog handles re-entrancy (up to 3 levels supported in the PACA * though we currently don't test for overflow). It provides you with a - * re-entrancy safe working space of r10...r16 and CR with r12 being used - * as the exception area pointer in the PACA for that level of re-entrancy - * and r13 containing the PACA pointer. + * re-entrancy safe working space of r10...r16 (except r13) and CR with r12 + * being used as the exception area pointer in the PACA for that level of + * re-entrancy. * * SRR0 and SRR1 are saved, but DEAR and ESR are not, since they don't apply * as-is for instruction exceptions. 
It's up to the actual exception code @@ -95,8 +95,6 @@ exc_##label##_book3e: mfcrr10;\ std r11,EX_TLB_R11(r12);\ mfspr r11,SPRN_SPRG_TLB_SCRATCH; \ - std r13,EX_TLB_R13(r12);\ - ld r13,EX_TLB_PACA(r12); \ std r14,EX_TLB_R14(r12);\ addir14,r12,EX_TLB_SIZE;\ std r15,EX_TLB_R15(r12);\ @@ -135,7 +133,6 @@ exc_##label##_book3e: mtspr SPRN_SPRG_TLB_EXFRAME,freg; \ ld r11,EX_TLB_R11(r12);\ mtcrr14;\ - ld r13,EX_TLB_R13(r12);\ ld r14,EX_TLB_R14(r12);\ mtspr SPRN_SRR0,r15; \ ld r15,EX_TLB_R15(r12);\ @@ -148,11 +145,13 @@ exc_##label##_book3e: TLB_MISS_RESTORE(r12) #define TLB_MISS_EPILOG_ERROR \ - addir12,r13,PACA_EXTLB; \ + ld r10,EX_TLB_PACA(r12); \ + addir12,r10,PACA_EXTLB; \ TLB_MISS_RESTORE(r12) #define TLB_MISS_EPILOG_ERROR_SPECIAL \ - addir11,r13,PACA_EXTLB; \ + ld r10,EX_TLB_PACA(r12); \ + addir11,r10,PACA_EXTLB; \ TLB_MISS_RESTORE(r11) #ifdef CONFIG_BOOK3E_MMU_TLB_STATS @@ -160,25 +159,26 @@ exc_##label##_book3e: mflrr10;\ std r8,EX_TLB_R8(r12); \ std r9,EX_TLB_R9(r12); \ - std r10,EX_TLB_LR(r12); + std r10,EX_TLB_LR(r12); \ + ld r9,EX_TLB_PACA(r12); #define TLB_MISS_RESTORE_STATS \ ld r16,EX_TLB_LR(r12); \ ld r9,EX_TLB_R9(r12); \ ld r8,EX_TLB_R8(r12); \ mtlrr16; #define TLB_MISS_STATS_D(name) \ - addir9,r13,MMSTAT_DSTATS+name; \ + addir9,r9,MMSTAT_DSTATS+name; \ bl .tlb_stat_inc; #define TLB_MISS_STATS_I(name) \ - addir9,r13,MMSTAT_ISTATS+name; \ + addir9,r9,MMSTAT_ISTATS+name; \ bl .tlb_stat_inc; #define TLB_MISS_STATS_X(name) \ - ld r8,PACA_EXTLB+EX_TLB_ESR(r13); \ + ld r8,PACA_EXTLB+EX_TLB_ESR(r9); \ cmpdi cr2,r8,-1; \ beq cr2,61f;\ - addi
[PATCH 7/7] powerpc/e5500: set MMU_FTR_USE_PAIRED_MAS
Signed-off-by: Scott Wood
---
Is there any 64-bit book3e chip that doesn't support this? It doesn't appear to be optional in the ISA.

 arch/powerpc/kernel/cputable.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 34d2722..a3b8eeb 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -1981,7 +1981,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.cpu_features		= CPU_FTRS_E5500,
 		.cpu_user_features	= COMMON_USER_BOOKE,
 		.mmu_features		= MMU_FTR_TYPE_FSL_E | MMU_FTR_BIG_PHYS |
-			MMU_FTR_USE_TLBILX,
+			MMU_FTR_USE_TLBILX | MMU_FTR_USE_PAIRED_MAS,
 		.icache_bsize		= 64,
 		.dcache_bsize		= 64,
 		.num_pmcs		= 4,
-- 
1.7.4.1
[PATCH 6/7] powerpc/mm: 64-bit: tlb handler micro-optimization
A little more speed up measured on e5500. Setting of U0-3 is dropped as it is not used by Linux as far as I can see. Signed-off-by: Scott Wood --- arch/powerpc/mm/tlb_low_64e.S | 21 - 1 files changed, 8 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S index e782023..a94c87b 100644 --- a/arch/powerpc/mm/tlb_low_64e.S +++ b/arch/powerpc/mm/tlb_low_64e.S @@ -47,10 +47,10 @@ * We could probably also optimize by not saving SRR0/1 in the * linear mapping case but I'll leave that for later */ - mfspr r14,SPRN_ESR mfspr r16,SPRN_DEAR /* get faulting address */ srdir15,r16,60 /* get region */ cmpldi cr0,r15,0xc /* linear mapping ? */ + mfspr r14,SPRN_ESR TLB_MISS_STATS_SAVE_INFO beq tlb_load_linear /* yes -> go to linear map load */ @@ -62,11 +62,11 @@ andi. r10,r15,0x1 bne-virt_page_table_tlb_miss - std r14,EX_TLB_ESR(r12);/* save ESR */ - std r16,EX_TLB_DEAR(r12); /* save DEAR */ + /* We need _PAGE_PRESENT and _PAGE_ACCESSED set */ -/* We need _PAGE_PRESENT and _PAGE_ACCESSED set */ + std r14,EX_TLB_ESR(r12);/* save ESR */ li r11,_PAGE_PRESENT + std r16,EX_TLB_DEAR(r12); /* save DEAR */ orisr11,r11,_PAGE_ACCESSED@h /* We do the user/kernel test for the PID here along with the RW test @@ -225,21 +225,16 @@ finish_normal_tlb_miss: * yet implemented for now * MAS 2 :Defaults not useful, need to be redone * MAS 3+7 :Needs to be done -* -* TODO: mix up code below for better scheduling */ clrrdi r11,r16,12 /* Clear low crap in EA */ + rldicr r15,r14,64-(PTE_RPN_SHIFT-PAGE_SHIFT),63-PAGE_SHIFT rlwimi r11,r14,32-19,27,31 /* Insert WIMGE */ + clrldi r15,r15,12 /* Clear crap at the top */ mtspr SPRN_MAS2,r11 - - /* Move RPN in position */ - rldicr r11,r14,64-(PTE_RPN_SHIFT-PAGE_SHIFT),63-PAGE_SHIFT - clrldi r15,r11,12 /* Clear crap at the top */ - rlwimi r15,r14,32-8,22,25 /* Move in U bits */ + andi. r11,r14,_PAGE_DIRTY rlwimi r15,r14,32-2,26,31 /* Move in BAP bits */ /* Mask out SW and UW if !DIRTY (XXX optimize this !) 
*/ - andi. r11,r14,_PAGE_DIRTY bne 1f li r11,MAS3_SW|MAS3_UW andcr15,r15,r11 @@ -483,10 +478,10 @@ virt_page_table_tlb_miss_whacko_fault: * We could probably also optimize by not saving SRR0/1 in the * linear mapping case but I'll leave that for later */ - mfspr r14,SPRN_ESR mfspr r16,SPRN_DEAR /* get faulting address */ srdir11,r16,60 /* get region */ cmpldi cr0,r11,0xc /* linear mapping ? */ + mfspr r14,SPRN_ESR TLB_MISS_STATS_SAVE_INFO beq tlb_load_linear /* yes -> go to linear map load */ -- 1.7.4.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 2/7] powerpc/mm: 64-bit 4k: use a PMD-based virtual page table
Loads with non-linear access patterns were producing a very high ratio of recursive pt faults to regular tlb misses. Rather than choose between a 4-level table walk or a 1-level virtual page table lookup, use a hybrid scheme with a virtual linear pmd, followed by a 2-level lookup in the normal handler. This adds about 5 cycles (assuming no cache misses, and e5500 timing) to a normal TLB miss, but greatly reduces the recursive fault rate for loads which don't have locality within 2 MiB regions but do have significant locality within 1 GiB regions. Improvements of close to 50% were seen on such benchmarks. Signed-off-by: Scott Wood --- arch/powerpc/mm/tlb_low_64e.S | 23 +++ 1 files changed, 15 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S index af08922..17726d3 100644 --- a/arch/powerpc/mm/tlb_low_64e.S +++ b/arch/powerpc/mm/tlb_low_64e.S @@ -24,7 +24,7 @@ #ifdef CONFIG_PPC_64K_PAGES #define VPTE_PMD_SHIFT (PTE_INDEX_SIZE+1) #else -#define VPTE_PMD_SHIFT (PTE_INDEX_SIZE) +#define VPTE_PMD_SHIFT 0 #endif #define VPTE_PUD_SHIFT (VPTE_PMD_SHIFT + PMD_INDEX_SIZE) #define VPTE_PGD_SHIFT (VPTE_PUD_SHIFT + PUD_INDEX_SIZE) @@ -185,7 +185,7 @@ normal_tlb_miss: /* Insert the bottom bits in */ rlwimi r14,r15,0,16,31 #else - rldicl r14,r16,64-(PAGE_SHIFT-3),PAGE_SHIFT-3+4 + rldicl r14,r16,64-(PMD_SHIFT-3),PMD_SHIFT-3+4 #endif sldir15,r10,60 clrrdi r14,r14,3 @@ -202,6 +202,16 @@ MMU_FTR_SECTION_ELSE ld r14,0(r10) ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV) +#ifndef CONFIG_PPC_64K_PAGES + rldicl r15,r16,64-PAGE_SHIFT+3,64-PTE_INDEX_SIZE-3 + clrrdi r15,r15,3 + + cmpldi cr0,r14,0 + beq normal_tlb_miss_access_fault + + ldx r14,r14,r15 +#endif + finish_normal_tlb_miss: /* Check if required permissions are met */ andc. 
r15,r11,r14 @@ -353,14 +363,11 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_TLBRSRV) #ifndef CONFIG_PPC_64K_PAGES /* Get to PUD entry */ rldicl r11,r16,64-VPTE_PUD_SHIFT,64-PUD_INDEX_SIZE-3 - clrrdi r10,r11,3 - ldx r15,r10,r15 - cmpldi cr0,r15,0 - beq virt_page_table_tlb_miss_fault -#endif /* CONFIG_PPC_64K_PAGES */ - +#else /* Get to PMD entry */ rldicl r11,r16,64-VPTE_PMD_SHIFT,64-PMD_INDEX_SIZE-3 +#endif + clrrdi r10,r11,3 ldx r15,r10,r15 cmpldi cr0,r15,0 -- 1.7.4.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
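The 2 MiB and 1 GiB figures in the commit message above fall out of the table geometry. Here is a rough sketch of that arithmetic, assuming 4K pages, a 9-bit PTE index and page-sized PMDs (9-bit PMD index, as in patch 1/7). The helper names `region_of()` and `count_region_changes()` are made up for illustration, and a region change is only a proxy for a possible recursive fault, not a measurement of one.

```c
#include <stdint.h>

#define PAGE_SHIFT	12	/* 4K pages (assumed) */
#define PTE_INDEX_SIZE	9	/* 512 PTEs per PTE page */
#define PMD_INDEX_SIZE	9	/* 512 entries per PMD page (page-sized PMDs) */

/* One page of the virtual linear PTE table maps 2^21 bytes = 2 MiB. */
#define VPTE_REGION_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
/* One page of the virtual linear PMD table maps 2^30 bytes = 1 GiB. */
#define VPMD_REGION_SHIFT	(VPTE_REGION_SHIFT + PMD_INDEX_SIZE)

/* Which region of size (1 << shift) an effective address falls in. */
static inline uint64_t region_of(uint64_t ea, unsigned int shift)
{
	return ea >> shift;
}

/*
 * Count how often consecutive accesses move to a different region.
 * Each change is a point where a different page of the virtual
 * table is needed, i.e. a candidate for a recursive fault.
 */
static unsigned int count_region_changes(const uint64_t *ea, unsigned int n,
					 unsigned int shift)
{
	unsigned int changes = 0;

	for (unsigned int i = 1; i < n; i++)
		if (region_of(ea[i], shift) != region_of(ea[i - 1], shift))
			changes++;
	return changes;
}
```

An access pattern striding by 4 MiB changes 2 MiB region on every access but stays inside a single 1 GiB region, which is the kind of load the hybrid scheme is meant to help.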
[PATCH 3/7] powerpc/mm: 64-bit tlb miss: get PACA from memory rather than SPR
This saves a few cycles, at least on e5500. Signed-off-by: Scott Wood --- arch/powerpc/include/asm/exception-64e.h | 16 +++- arch/powerpc/kernel/paca.c |5 + 2 files changed, 12 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h index 6d53f31..6921261 100644 --- a/arch/powerpc/include/asm/exception-64e.h +++ b/arch/powerpc/include/asm/exception-64e.h @@ -62,16 +62,14 @@ #define EX_TLB_ESR ( 9 * 8) /* Level 0 and 2 only */ #define EX_TLB_SRR0(10 * 8) #define EX_TLB_SRR1(11 * 8) -#define EX_TLB_MMUCR0 (12 * 8) /* Level 0 */ -#define EX_TLB_MAS1(12 * 8) /* Level 0 */ -#define EX_TLB_MAS2(13 * 8) /* Level 0 */ +#define EX_TLB_PACA(12 * 8) #ifdef CONFIG_BOOK3E_MMU_TLB_STATS -#define EX_TLB_R8 (14 * 8) -#define EX_TLB_R9 (15 * 8) -#define EX_TLB_LR (16 * 8) -#define EX_TLB_SIZE(17 * 8) +#define EX_TLB_R8 (13 * 8) +#define EX_TLB_R9 (14 * 8) +#define EX_TLB_LR (15 * 8) +#define EX_TLB_SIZE(16 * 8) #else -#define EX_TLB_SIZE(14 * 8) +#define EX_TLB_SIZE(13 * 8) #endif #defineSTART_EXCEPTION(label) \ @@ -98,7 +96,7 @@ exc_##label##_book3e: std r11,EX_TLB_R11(r12);\ mfspr r11,SPRN_SPRG_TLB_SCRATCH; \ std r13,EX_TLB_R13(r12);\ - mfspr r13,SPRN_SPRG_PACA; \ + ld r13,EX_TLB_PACA(r12); \ std r14,EX_TLB_R14(r12);\ addir14,r12,EX_TLB_SIZE;\ std r15,EX_TLB_R15(r12);\ diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c index 102244e..814dae2 100644 --- a/arch/powerpc/kernel/paca.c +++ b/arch/powerpc/kernel/paca.c @@ -151,6 +151,11 @@ void __init initialise_paca(struct paca_struct *new_paca, int cpu) #ifdef CONFIG_PPC_STD_MMU_64 new_paca->slb_shadow_ptr = &slb_shadow[cpu]; #endif /* CONFIG_PPC_STD_MMU_64 */ +#ifdef CONFIG_PPC_BOOK3E + new_paca->extlb[0][EX_TLB_PACA / 8] = (u64)new_paca; + new_paca->extlb[1][EX_TLB_PACA / 8] = (u64)new_paca; + new_paca->extlb[2][EX_TLB_PACA / 8] = (u64)new_paca; +#endif } /* Put the paca pointer into r13 and SPRG_PACA */ -- 1.7.4.1 ___ Linuxppc-dev mailing list 
Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
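The paca.c hunk above stores a pointer to the PACA in a fixed slot of each TLB exception frame, so the handler can reach it with a plain load relative to the frame pointer. A minimal user-space sketch of that self-pointer idea follows; `struct paca_sketch`, `init_extlb_paca()` and `paca_from_frame()` are hypothetical names, and the frame size assumes the non-stats layout (13 slots) from the patch.

```c
#include <stdint.h>

#define EX_TLB_PACA	(12 * 8)	/* byte offset of the PACA slot, as in the patch */
#define EX_TLB_SIZE	(13 * 8)	/* frame size without CONFIG_BOOK3E_MMU_TLB_STATS */
#define EX_TLB_LEVELS	3		/* the patch initialises extlb[0..2] */

/* Hypothetical stand-in for the part of the PACA that holds the frames. */
struct paca_sketch {
	uint64_t extlb[EX_TLB_LEVELS][EX_TLB_SIZE / 8];
};

/* Store a back-pointer to the owning structure in every frame. */
static void init_extlb_paca(struct paca_sketch *p)
{
	for (int level = 0; level < EX_TLB_LEVELS; level++)
		p->extlb[level][EX_TLB_PACA / 8] = (uint64_t)(uintptr_t)p;
}

/* What "ld r13,EX_TLB_PACA(r12)" does: recover the owner from a frame. */
static struct paca_sketch *paca_from_frame(const uint64_t *frame)
{
	return (struct paca_sketch *)(uintptr_t)frame[EX_TLB_PACA / 8];
}
```

The trade-off is one extra 8-byte slot per frame in exchange for replacing an SPR read with a memory load from a frame that is about to be touched anyway.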
[PATCH 1/7] powerpc/mm: 64-bit 4k: use page-sized PMDs
This allows a virtual page table to be used at the PMD rather than the PTE level.

Rather than adjust the constant in pgd_index() (or ignore it, as too-large values don't hurt as long as overly large addresses aren't passed in), go back to using PTRS_PER_PGD. The overflow comment seems to apply to a very old implementation of free_pgtables that used pgd_index() (unfortunately the commit message, if you seek it out in the historic tree, doesn't mention any details about the overflow). The existing value was numerically identical to the old 4K-page PTRS_PER_PGD, so using it shouldn't produce an overflow where it's not otherwise possible.

Also get rid of the incorrect comment at the top of pgtable-ppc64-4k.h.

Signed-off-by: Scott Wood
---
 arch/powerpc/include/asm/pgtable-ppc64-4k.h |   12 
 arch/powerpc/include/asm/pgtable-ppc64.h    |    3 +--
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64-4k.h b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
index 6eefdcf..194005e 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-4k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
@@ -1,14 +1,10 @@
 #ifndef _ASM_POWERPC_PGTABLE_PPC64_4K_H
 #define _ASM_POWERPC_PGTABLE_PPC64_4K_H
-/*
- * Entries per page directory level. The PTE level must use a 64b record
- * for each page table entry. The PMD and PGD level use a 32b record for
- * each entry by assuming that each entry is page aligned.
- */
+
 #define PTE_INDEX_SIZE 9
-#define PMD_INDEX_SIZE 7
+#define PMD_INDEX_SIZE 9
 #define PUD_INDEX_SIZE 7
-#define PGD_INDEX_SIZE 9
+#define PGD_INDEX_SIZE 7
 
 #ifndef __ASSEMBLY__
 #define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_INDEX_SIZE)
@@ -19,7 +15,7 @@
 
 #define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
 #define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
-#define PTRS_PER_PUD	(1 << PMD_INDEX_SIZE)
+#define PTRS_PER_PUD	(1 << PUD_INDEX_SIZE)
 #define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
 
 /* PMD_SHIFT determines what a second-level page table entry can map */
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 2b09cd5..8bd1cd9 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -181,8 +181,7 @@
  * Find an entry in a page-table-directory. We combine the address region
  * (the high order N bits) and the pgd portion of the address.
  */
-/* to avoid overflow in free_pgtables we don't use PTRS_PER_PGD here */
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & 0x1ff)
+#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
 
 #define pgd_offset(mm, address)	 ((mm)->pgd + pgd_index(address))
 
-- 
1.7.4.1
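To make the effect of the new index sizes concrete, here is a small sketch of the resulting pgd_index() arithmetic. It assumes 4K pages and that each level's shift is the sum of the index sizes below it; the `_sketch` and `_old_mask` function names are made up for the comparison and are not kernel symbols.

```c
#include <stdint.h>

#define PAGE_SHIFT	12	/* 4K pages (assumed) */
#define PTE_INDEX_SIZE	9
#define PMD_INDEX_SIZE	9	/* was 7 before the patch */
#define PUD_INDEX_SIZE	7
#define PGD_INDEX_SIZE	7	/* was 9 before the patch */

/* Assumed layering: each shift adds the index size of the level below. */
#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)	/* 21 */
#define PUD_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)	/* 30 */
#define PGDIR_SHIFT	(PUD_SHIFT + PUD_INDEX_SIZE)	/* 37 */

#define PTRS_PER_PGD	(1u << PGD_INDEX_SIZE)		/* 128 */

/* New form: the mask follows PGD_INDEX_SIZE automatically. */
static inline unsigned int pgd_index_sketch(uint64_t address)
{
	return (unsigned int)((address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/* Old form: hard-coded 9-bit mask, equal to the old PTRS_PER_PGD - 1. */
static inline unsigned int pgd_index_old_mask(uint64_t address)
{
	return (unsigned int)((address >> PGDIR_SHIFT) & 0x1ff);
}
```

The total number of translated bits is unchanged (12 + 9 + 9 + 7 + 7 = 44, as before with 12 + 9 + 7 + 7 + 9), so the two masks only differ for addresses beyond what the PGD can cover: with the old mask such an address would index past the 128-entry table instead of wrapping.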
RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wednesday, May 18, 2011 12:31 PM Milton Miller wrote:
> Ingo I would propose the following commits added in 2.6.29 be reverted.
> I think the current consensus is drivers must know if the writeq is
> not atomic so they can provide their own locking or other workaround.
>

Exactly.
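As an illustration of why a writeq() built from two 32-bit writes needs external locking, here is a deterministic user-space sketch of the interleaving discussed in this thread. The device model is an assumption made for the example (the device is taken to consume a descriptor whenever the high half is written); it is not a description of the mpt2sas hardware, and all names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical device: latches the low half, consumes on the high half. */
struct dev_sketch {
	uint32_t lo;		/* last low half written */
	uint64_t seen[4];	/* descriptors the device has consumed */
	int nseen;
};

static void dev_write_lo(struct dev_sketch *d, uint64_t v)
{
	d->lo = (uint32_t)v;
}

static void dev_write_hi(struct dev_sketch *d, uint64_t v)
{
	if (d->nseen < 4)
		d->seen[d->nseen++] = ((v >> 32) << 32) | d->lo;
}

/* Two CPUs using the split writeq with no lock: lo A, lo B, hi A, hi B. */
static void replay_interleaved(struct dev_sketch *d, uint64_t a, uint64_t b)
{
	dev_write_lo(d, a);	/* CPU A, first 32-bit write */
	dev_write_lo(d, b);	/* CPU B, first 32-bit write */
	dev_write_hi(d, a);	/* CPU A, second write: pairs with B's low half */
	dev_write_hi(d, b);	/* CPU B, second write */
}

/* The ordering a lock around the two halves would enforce. */
static void replay_serialized(struct dev_sketch *d, uint64_t a, uint64_t b)
{
	dev_write_lo(d, a);
	dev_write_hi(d, a);
	dev_write_lo(d, b);
	dev_write_hi(d, b);
}
```

In the interleaved replay the first descriptor the device consumes is a mix of A's high half and B's low half, and A's real descriptor is never seen. A single 64-bit store, or a lock held across both halves, avoids that.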
RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wed, 18 May 2011 about 09:35:56 -0600, Eric Moore wrote: > On Wednesday, May 18, 2011 2:24 AM, Milton Miller wrote: > > On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote: > > > (Just adding Milton to the CC list, he suspects races in the > > > driver instead). > > > > > > On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote: > > > > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote: > > > > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote: > > > > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote: > > > > > > > The following code seems to be there in > > /usr/src/linux/arch/x86/include/asm/io.h. > > > > > > > This is not going to work. > > > > > > > > > > > > > > static inline void writeq(__u64 val, volatile void __iomem *addr) > > > > > > > { > > > > > > > writel(val, addr); > > > > > > > writel(val >> 32, addr+4); > > > > > > > } > > > > > > > > > > > > > > So with this code turned on in the kernel, there is going to be > > race condition > > > > > > > where multiple cpus can be writing to the request descriptor at > > the same time. > > > > > > > > > > > > > > Meaning this could happen: > > > > > > > (A) CPU A doest 32bit write > > > > > > > (B) CPU B does 32 bit write > > > > > > > (C) CPU A does 32 bit write > > > > > > > (D) CPU B does 32 bit write > > > > > > > > > > > > > > We need the 64 bit completed in one access pci memory write, else > > spin lock is required. > > > > > > > Since it's going to be difficult to know which writeq was > > implemented in the kernel, > > > > > > > the driver is going to have to always acquire a spin lock each > > time we do 64bit write. 
> > > > > > > > > > > > > > Cc: sta...@kernle.org > > > > > > > Signed-off-by: Kashyap Desai > > > > > > > --- > > > > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c > > b/drivers/scsi/mpt2sas/mpt2sas_base.c > > > > > > > index efa0255..5778334 100644 > > > > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c > > > > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c > > > > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct > > MPT2SAS_ADAPTER *ioc, u16 smid) > > > > > > > * care of 32 bit environment where its not quarenteed to send > > the entire word > > > > > > > * in one transfer. > > > > > > > */ > > > > > > > -#ifndef writeq > > > > > > > > > > > > Why not make this #ifndef CONFIG_64BIT? You know that all 64 bit > > > > > > systems have writeq implemented correctly; you suspect 32 bit > > systems > > > > > > don't. > > > > > > > > > > > > James > > > > > > > > > > > > James, This issue was observed on PPC64 system. So what you have > > suggested will not solve this issue. > > > > > > If we are sure that writeq() is atomic across all architecture, we > > can use it safely. As we have seen issue on ppc64, we are not confident to > > use > > > > > > "writeq" call. > > > > > > > > > > So have you told the powerpc people that they have a broken writeq? > > > > > > > > I'm just in the process of finding them now on IRC so I can demand an > > > > explanation: this is a really serious API problem because writeq is > > > > supposed to be atomic on 64 bit. > > > > > > > > > And why do you obfuscate your report by talking about i386 when it's > > > > > really about powerpc64? > > > > > > > > James > > > > I checked the assembly for my complied output and it ends up with > > a single std (store doubleword aka 64 bits) instruction with offset > > 192 decimal (0xc0) from the base register obtained from the structure. > > > > An aligned doubleword store is atomic on 64 bit powerpc. 
> > > > So I would really like more details if you are blaming 64 bit > > powerpc of a non-atomic store. > > > > That said, the patch will affect the code by adding barriers. > > Specifically, while powerpc has a sync before doing the store as part > > of writeq, wrapping in a spinlock adds a sync before releasing the lock > > whenever a writeq (or writex x=b,w,d,q) was issued inside the lock. > > > > (sync orders all reads and all writes to both memory and devices from > > that cpu). > > > > But looking further at the code, I see such things as: > > > > drivers/scsi/mpt2sas/mpt2sas_base.c line 2944 > > > > mpt2sas_base_put_smid_default(ioc, smid); > > init_completion(&ioc->base_cmds.done); > > timeleft = wait_for_completion_timeout(&ioc->base_cmds.done, > > > > where mpt2sas_base_put_smid_default is a routine that has a call to > > _base_writeq. This will initiate io to the adapter, then initialize > > the completion, then hope that the timeout is long enough to let the io > > complete and be marked done but short enough to not be a problem when > > the timeout occurs because we initialized the compeltion after the irq > > came in. > > > > The code then looks at a status flag, but there is no indication how > > the access to that field is serialized between the interrupt handler > > and the submission routine. It may mostly work due to barrier
Re: powerpc: mpc85xx regression since 2.6.39-rc2, one cpu core lame
Does this patch help? If so, please reply to that thread so patchwork will see it in addition to here.

http://patchwork.ozlabs.org/patch/96146/

milton
RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wednesday, May 18, 2011 2:24 AM, Milton Miller wrote: > On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote: > > (Just adding Milton to the CC list, he suspects races in the > > driver instead). > > > > On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote: > > > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote: > > > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote: > > > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote: > > > > > > The following code seems to be there in > /usr/src/linux/arch/x86/include/asm/io.h. > > > > > > This is not going to work. > > > > > > > > > > > > static inline void writeq(__u64 val, volatile void __iomem *addr) > > > > > > { > > > > > > writel(val, addr); > > > > > > writel(val >> 32, addr+4); > > > > > > } > > > > > > > > > > > > So with this code turned on in the kernel, there is going to be > race condition > > > > > > where multiple cpus can be writing to the request descriptor at > the same time. > > > > > > > > > > > > Meaning this could happen: > > > > > > (A) CPU A doest 32bit write > > > > > > (B) CPU B does 32 bit write > > > > > > (C) CPU A does 32 bit write > > > > > > (D) CPU B does 32 bit write > > > > > > > > > > > > We need the 64 bit completed in one access pci memory write, else > spin lock is required. > > > > > > Since it's going to be difficult to know which writeq was > implemented in the kernel, > > > > > > the driver is going to have to always acquire a spin lock each > time we do 64bit write. 
> > > > > > > > > > > > Cc: sta...@kernle.org > > > > > > Signed-off-by: Kashyap Desai > > > > > > --- > > > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c > b/drivers/scsi/mpt2sas/mpt2sas_base.c > > > > > > index efa0255..5778334 100644 > > > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c > > > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c > > > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct > MPT2SAS_ADAPTER *ioc, u16 smid) > > > > > > * care of 32 bit environment where its not quarenteed to send > the entire word > > > > > > * in one transfer. > > > > > > */ > > > > > > -#ifndef writeq > > > > > > > > > > Why not make this #ifndef CONFIG_64BIT? You know that all 64 bit > > > > > systems have writeq implemented correctly; you suspect 32 bit > systems > > > > > don't. > > > > > > > > > > James > > > > > > > > > > James, This issue was observed on PPC64 system. So what you have > suggested will not solve this issue. > > > > > If we are sure that writeq() is atomic across all architecture, we > can use it safely. As we have seen issue on ppc64, we are not confident to > use > > > > > "writeq" call. > > > > > > > > So have you told the powerpc people that they have a broken writeq? > > > > > > I'm just in the process of finding them now on IRC so I can demand an > > > explanation: this is a really serious API problem because writeq is > > > supposed to be atomic on 64 bit. > > > > > > > And why do you obfuscate your report by talking about i386 when it's > > > > really about powerpc64? > > > > > > James > > I checked the assembly for my complied output and it ends up with > a single std (store doubleword aka 64 bits) instruction with offset > 192 decimal (0xc0) from the base register obtained from the structure. > > An aligned doubleword store is atomic on 64 bit powerpc. > > So I would really like more details if you are blaming 64 bit > powerpc of a non-atomic store. > > That said, the patch will affect the code by adding barriers. 
> Specifically, while powerpc has a sync before doing the store as part > of writeq, wrapping in a spinlock adds a sync before releasing the lock > whenever a writeq (or writex x=b,w,d,q) was issued inside the lock. > > (sync orders all reads and all writes to both memory and devices from > that cpu). > > But looking further at the code, I see such things as: > > drivers/scsi/mpt2sas/mpt2sas_base.c line 2944 > > mpt2sas_base_put_smid_default(ioc, smid); > init_completion(&ioc->base_cmds.done); > timeleft = wait_for_completion_timeout(&ioc->base_cmds.done, > > where mpt2sas_base_put_smid_default is a routine that has a call to > _base_writeq. This will initiate io to the adapter, then initialize > the completion, then hope that the timeout is long enough to let the io > complete and be marked done but short enough to not be a problem when > the timeout occurs because we initialized the compeltion after the irq > came in. > > The code then looks at a status flag, but there is no indication how > the access to that field is serialized between the interrupt handler > and the submission routine. It may mostly work due to barriers in > the primitives but I don't see any statement of rules. > > Also, while I see a few wmb before writel in _base_interrupt, I don't > see any rmb, which I would expect between establishing a element is > valid and reading other fields in that element. > > So I'd really
Re: Kernel cannot see PCI device
On Wed, May 18, 2011 at 4:02 AM, Prashant Bhole wrote:
> On Mon, May 2, 2011 at 10:21 AM, Prashant Bhole wrote:
>>
>> Hi,
>> I have a custom made powerpc 460EX board. On that board u-boot
>> can see a PCI device but Linux kernel cannot see it. What could be the
>> problem?
>>
>> On u-boot "pci 2" commands displays following device:
>> Scanning PCI devices on bus 2
>> BusDevFun VendorId DeviceId Device Class Sub-Class
>> _
>> 02.00.00 0x1000 0x0072 Mass storage controller 0x00
>>
>> And when the kernel is booted, there is only one pci device (bridge):
>> #ls /sys/bus/pci/devices
>> :80:00.0
>>
>
> I am still facing in this problem.
>
> a call to pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, &l) returns
> positive value in the function pci_scan_device(), which means VENDOR_ID
> reading failed. I could not find the reason. Any hints?

Hmm... probably powerpc-related, so I added linuxppc-dev.

My guess would be that Linux didn't find the host bridge to the hierarchy containing bus 2. I would guess the host bridge info is supposed to come from OF. More information, like the complete u-boot PCI scan and the kernel dmesg log, would be useful. And maybe u-boot has a way to dump the OF device tree?
[PATCH] [klibc] ppc64: Fix build failure with stricter as
From: Matthias Klose Landed in Ubuntu klibc version 1.5.20-1ubuntu3. Signed-off-by: maximilian attems --- usr/klibc/arch/ppc64/crt0.S | 17 + 1 files changed, 9 insertions(+), 8 deletions(-) diff --git a/usr/klibc/arch/ppc64/crt0.S b/usr/klibc/arch/ppc64/crt0.S index a7776a1..c976d5c 100644 --- a/usr/klibc/arch/ppc64/crt0.S +++ b/usr/klibc/arch/ppc64/crt0.S @@ -12,16 +12,17 @@ .section ".toc","aw" .LC0: .tc environ[TC],environ + .text + .align 4 + .section ".opd","aw" - .align 3 - .globl _start _start: - .quad ._start - .quad .TOC.@tocbase, 0 - - .text - .globl ._start + .quad ._start, .TOC.@tocbase, 0 + .previous + .size _start, 24 .type ._start,@function + .globl _start + .globl ._start ._start: stdu%r1,-32(%r1) addi%r3,%r1,32 @@ -29,4 +30,4 @@ _start: b .__libc_init nop - .size _start,.-_start + .size ._start,.-._start -- 1.7.4.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: powerpc: mpc85xx regression since 2.6.39-rc2, one cpu core lame
On Wed, May 18, 2011 at 07:40:16AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2011-05-17 at 18:28 +0200, Richard Cochran wrote:
> > Ben,
> >
> > Recent 2.6.39-rc kernels behave strangely on the Freescale dual core
> > mpc8572 and p2020. There is a long pause (like 2 seconds) in the boot
> > sequence after "mpic: requesting IPIs..."
> >
> > When the system comes up, only one core shows in /proc/cpuinfo. Later
> > on, lots of messages appear like the following:
> >
> >    INFO: task ksoftirqd/1:9 blocked for more than 120 seconds.
> >
> > I bisected [1] the problem to:
> >
> >    commit c56e58537d504706954a06570b4034c04e5b7500
> >    Author: Benjamin Herrenschmidt
> >    Date:   Tue Mar 8 14:40:04 2011 +1100
> >
> >    powerpc/smp: Create idle threads on demand and properly reset them
> >
> > I don't see from that commit what had gone wrong. Perhaps you can
> > help resolve this?
>
> Hrm, odd. Kumar, care to have a look ? That's what happens when you
> don't get me HW to test with :-)

(I get the feeling that I am the only one testing recent kernels with the mpc85xx.)

Anyhow, I see that this commit was one of a series. For my own use, can I simply revert this one commit independently?

Thanks,
Richard
[PATCH] PPC_47x SMP fix
commit c56e58537d504706954a06570b4034c04e5b7500 breaks SMP support on the PPC_47x chip. secondary_ti must be set to the current thread info before calling kick_cpu, or else start_secondary_47x will jump into the void when trying to return to C code. In the current setup secondary_ti is initialized before the CPU idle task is started, and only the boot core will start.

I am not sure this is the correct solution, but it makes SMP possible on my chip.

Note! The HOTPLUG support probably needs some fixing too. There is no trampoline code available in head_44x.S - start_secondary_resume?

Signed-off-by: Kerstin Jonsson
Cc: Paul Mackerras
Cc: Michael Neuling
Cc: Will Schmidt
---
 arch/powerpc/kernel/smp.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index cbdbb14..f2dcab7 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -410,8 +410,6 @@ int __cpuinit __cpu_up(unsigned int cpu)
 {
 	int rc, c;
 
-	secondary_ti = current_set[cpu];
-
 	if (smp_ops == NULL ||
 	    (smp_ops->cpu_bootable && !smp_ops->cpu_bootable(cpu)))
 		return -EINVAL;
@@ -421,6 +419,8 @@ int __cpuinit __cpu_up(unsigned int cpu)
 	if (rc)
 		return rc;
 
+	secondary_ti = current_set[cpu];
+
 	/* Make sure callin-map entry is 0 (can be leftover a CPU
 	 * hotplug
 	 */
-- 
1.7.2.3
[PATCH] PPC_47x SMP fix
commit c56e58537d504706954a06570b4034c04e5b7500 breaks SMP support on
the PPC_47x chip. secondary_ti must be set to the current thread info
before calling kick_cpu, or else start_secondary_47x will jump into the
void when trying to return to C code. In the current setup secondary_ti
is initialized before the CPU idle task is started, and only the boot
core will start. I am not sure this is the correct solution, but it
makes SMP possible on my chip.

Note! The HOTPLUG support probably needs some fixing too. There is no
trampoline code available in head_44x.S - start_secondary_resume?

Signed-off-by: Kerstin Jonsson
Cc: Paul Mackerras
Cc: Michael Neuling
Cc: Darren Hart
Cc: Will Schmidt
---
 arch/powerpc/kernel/smp.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index cbdbb14..f2dcab7 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -410,8 +410,6 @@ int __cpuinit __cpu_up(unsigned int cpu)
 {
 	int rc, c;
 
-	secondary_ti = current_set[cpu];
-
 	if (smp_ops == NULL ||
 	    (smp_ops->cpu_bootable && !smp_ops->cpu_bootable(cpu)))
 		return -EINVAL;
@@ -421,6 +419,8 @@ int __cpuinit __cpu_up(unsigned int cpu)
 	if (rc)
 		return rc;
 
+	secondary_ti = current_set[cpu];
+
 	/* Make sure callin-map entry is 0 (can be leftover a CPU
 	 * hotplug
 	 */
-- 
1.7.2.3
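The ordering problem the patch describes can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct thread_info` is reduced to one field, `create_idle()` is modelled as the point where `current_set[cpu]` becomes valid, and `reset_model()`, `cpu_up_old_order()` and `cpu_up_new_order()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the kernel's struct thread_info. */
struct thread_info { unsigned int cpu; };

struct thread_info idle_ti[NR_CPUS];
struct thread_info *current_set[NR_CPUS]; /* valid only once the idle task exists */
struct thread_info *secondary_ti;         /* read by the secondary CPU's entry code */

/* Model of on-demand idle task creation: this is where current_set[cpu] is set. */
int create_idle(unsigned int cpu)
{
	idle_ti[cpu].cpu = cpu;
	current_set[cpu] = &idle_ti[cpu];
	return 0;
}

/* Put the model back into its "nothing booted yet" state. */
void reset_model(void)
{
	for (unsigned int i = 0; i < NR_CPUS; i++)
		current_set[i] = NULL;
	secondary_ti = NULL;
}

/* Ordering before the patch: secondary_ti is sampled before the idle task exists. */
struct thread_info *cpu_up_old_order(unsigned int cpu)
{
	secondary_ti = current_set[cpu];
	if (create_idle(cpu))
		return NULL;
	return secondary_ti; /* what the kicked CPU would pick up */
}

/* Ordering after the patch: sample once the idle task is in place. */
struct thread_info *cpu_up_new_order(unsigned int cpu)
{
	if (create_idle(cpu))
		return NULL;
	secondary_ti = current_set[cpu];
	return secondary_ti;
}

/* Runs both orderings for CPU 1 and checks what the secondary would see. */
void check_secondary_ti_order(void)
{
	reset_model();
	assert(cpu_up_old_order(1) == NULL);
	reset_model();
	assert(cpu_up_new_order(1) == &idle_ti[1]);
}
```

Calling `check_secondary_ti_order()` from a one-line `main()` exercises both paths. In the model the old ordering hands the secondary CPU a NULL pointer, which is consistent with the "jump into void" symptom reported above, though the real failure mode depends on what `current_set[cpu]` held at that point.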
Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
On Wed, 18 May 2011 around 17:00:10 +1000, Benjamin Herrenschmidt wrote:
> (Just adding Milton to the CC list, he suspects races in the
> driver instead).
>
> On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> > On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > > The following code seems to be there in
> > > > > /usr/src/linux/arch/x86/include/asm/io.h.
> > > > > This is not going to work.
> > > > >
> > > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > > {
> > > > >     writel(val, addr);
> > > > >     writel(val >> 32, addr+4);
> > > > > }
> > > > >
> > > > > So with this code turned on in the kernel, there is going to be race
> > > > > condition where multiple cpus can be writing to the request
> > > > > descriptor at the same time.
> > > > >
> > > > > Meaning this could happen:
> > > > > (A) CPU A doest 32bit write
> > > > > (B) CPU B does 32 bit write
> > > > > (C) CPU A does 32 bit write
> > > > > (D) CPU B does 32 bit write
> > > > >
> > > > > We need the 64 bit completed in one access pci memory write, else
> > > > > spin lock is required.
> > > > > Since it's going to be difficult to know which writeq was implemented
> > > > > in the kernel, the driver is going to have to always acquire a spin
> > > > > lock each time we do 64bit write.
> > > > >
> > > > > Cc: sta...@kernle.org
> > > > > Signed-off-by: Kashyap Desai
> > > > > ---
> > > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > index efa0255..5778334 100644
> > > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > > > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > > > >   * in one transfer.
> > > > >   */
> > > > > -#ifndef writeq
> > > >
> > > > Why not make this #ifndef CONFIG_64BIT? You know that all 64 bit
> > > > systems have writeq implemented correctly; you suspect 32 bit systems
> > > > don't.
> > > >
> > > > James
> > > >
> > > > James, This issue was observed on PPC64 system. So what you have
> > > > suggested will not solve this issue.
> > > > If we are sure that writeq() is atomic across all architecture, we can
> > > > use it safely. As we have seen issue on ppc64, we are not confident to
> > > > use "writeq" call.
> > >
> > > So have you told the powerpc people that they have a broken writeq?
> >
> > I'm just in the process of finding them now on IRC so I can demand an
> > explanation: this is a really serious API problem because writeq is
> > supposed to be atomic on 64 bit.
> >
> > > And why do you obfuscate your report by talking about i386 when it's
> > > really about powerpc64?
> >
> > James

I checked the assembly for my compiled output and it ends up with a
single std (store doubleword aka 64 bits) instruction with offset 192
decimal (0xc0) from the base register obtained from the structure.

An aligned doubleword store is atomic on 64 bit powerpc.

So I would really like more details if you are blaming 64 bit powerpc
for a non-atomic store.

That said, the patch will affect the code by adding barriers.
Specifically, while powerpc has a sync before doing the store as part
of writeq, wrapping in a spinlock adds a sync before releasing the
lock whenever a writeq (or writex x=b,w,d,q) was issued inside the
lock. (sync orders all reads and all writes to both memory and
devices from that cpu).

But looking further at the code, I see such things as:

drivers/scsi/mpt2sas/mpt2sas_base.c line 2944

	mpt2sas_base_put_smid_default(ioc, smid);
	init_completion(&ioc->base_cmds.done);
	timeleft = wait_for_completion_timeout(&ioc->base_cmds.done,

where mpt2sas_base_put_smid_default is a routine that has a call to
_base_writeq.

This will initiate io to the adapter, then initialize the completion,
then hope that the timeout is long enough to let the io complete and
be marked done but short enough to not be a problem when the timeout
occurs because we initialized the completion after the irq came in.

The code then looks at a status flag, but there is no indication how
the access to that field is serialized between the interrupt handler
and the submission routine. It may mostly work due to barriers in
the primitives but I don't see any statement of rules.

Also, while I see a few wmb before writel in _base_interrupt, I don't
see any rmb, which I would expect between establishing that an element
is valid and reading other fields in that element.

So I'd really like to hear more about what your symptoms were and how
you determined writeq on 64 bit powerpc was not atomic.

milton
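The completion-ordering concern can be shown with a deterministic user-space model. This is a sketch only: `struct fake_completion` and the `fake_*` helpers are hypothetical stand-ins that model just a "done" counter, and the "fast interrupt" is simulated by signalling completion from inside the submit call. It is not the driver's code and does not use the kernel API.

```c
#include <assert.h>

/* User-space stand-in for a completion: only the "done" counter is modelled. */
struct fake_completion { unsigned int done; };

void fake_init_completion(struct fake_completion *c) { c->done = 0; }
void fake_complete(struct fake_completion *c)        { c->done++; }

/* Non-blocking stand-in for a timed wait: returns 1 if a completion was
 * pending, 0 where a real waiter would sleep until the timeout expires. */
int fake_wait(struct fake_completion *c)
{
	if (c->done == 0)
		return 0;
	c->done--;
	return 1;
}

/* Hypothetical submit path: the adapter is fast enough that the interrupt
 * handler signals completion before the submitter runs its next statement. */
void fake_submit_with_fast_irq(struct fake_completion *c)
{
	fake_complete(c);
}

/* Order seen in the quoted excerpt: submit, then initialise, then wait. */
int submit_then_init(void)
{
	struct fake_completion c = { 0 };

	fake_submit_with_fast_irq(&c);
	fake_init_completion(&c); /* wipes the completion that already arrived */
	return fake_wait(&c);
}

/* Initialise first, so an early interrupt is not lost. */
int init_then_submit(void)
{
	struct fake_completion c;

	fake_init_completion(&c);
	fake_submit_with_fast_irq(&c);
	return fake_wait(&c);
}

/* Checks that only the initialise-first order sees the early completion. */
void check_completion_order(void)
{
	assert(submit_then_init() == 0);
	assert(init_then_submit() == 1);
}
```

In the model, submitting before initialising loses the wakeup whenever the interrupt wins the race, so the waiter would sit out the full timeout even though the I/O finished.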
RE: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
> > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > {
> > > >     writel(val, addr);
> > > >     writel(val >> 32, addr+4);
> > > > }
...
> > > > We need the 64 bit completed in one access pci memory write, else spin lock is required.
> > > > Since it's going to be difficult to know which writeq was implemented in the kernel,
> > > > the driver is going to have to always acquire a spin lock each time we do 64bit write.
...
> I'm just in the process of finding them now on IRC so I can demand an
> explanation: this is a really serious API problem because writeq is
> supposed to be atomic on 64 bit.

Most 32 bit systems don't have atomic 64bit writes.
I'd also have thought there would be code which wouldn't mind the
write being done as two cycles.

I'm not sure that some of the ppc soc systems are capable of
doing a 64bit data pci/pcie cycle except by dma.
So your driver is probably doomed to require a lock.

	David
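To make the two-cycle hazard concrete, here is a minimal model of the (A)(B)(C)(D) interleaving quoted in this thread. It is a sketch under a loud assumption: the hypothetical `struct fake_reg` "device" consumes a descriptor whenever the high 32-bit half is written. Whether the real adapter behaves that way is not stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LATCHED 4

/* Hypothetical 64-bit request register split into two 32-bit halves.
 * Assumption: the device consumes the descriptor when the high half lands. */
struct fake_reg {
	uint32_t lo, hi;
	uint64_t latched[MAX_LATCHED]; /* descriptors as the device saw them */
	int count;
};

/* Stand-in for a 32-bit MMIO write at byte offset 0 (low) or 4 (high). */
void fake_writel(struct fake_reg *r, uint32_t val, unsigned int offset)
{
	if (offset == 0) {
		r->lo = val;
		return;
	}
	r->hi = val;
	if (r->count < MAX_LATCHED)
		r->latched[r->count++] = ((uint64_t)r->hi << 32) | r->lo;
}

/* The interleaving from the thread: A low, B low, A high, B high. */
void interleaved(struct fake_reg *r, uint64_t a, uint64_t b)
{
	fake_writel(r, (uint32_t)a, 0);
	fake_writel(r, (uint32_t)b, 0);
	fake_writel(r, (uint32_t)(a >> 32), 4);
	fake_writel(r, (uint32_t)(b >> 32), 4);
}

/* The same four writes with each pair kept together, as a lock would force. */
void serialized(struct fake_reg *r, uint64_t a, uint64_t b)
{
	fake_writel(r, (uint32_t)a, 0);
	fake_writel(r, (uint32_t)(a >> 32), 4);
	fake_writel(r, (uint32_t)b, 0);
	fake_writel(r, (uint32_t)(b >> 32), 4);
}

/* Compares what the modelled device latches under each ordering. */
void check_torn_write(void)
{
	const uint64_t a = 0xAAAAAAAA11111111ULL, b = 0xBBBBBBBB22222222ULL;
	struct fake_reg torn = { 0 }, clean = { 0 };

	interleaved(&torn, a, b);
	assert(torn.latched[0] == 0xAAAAAAAA22222222ULL); /* A's high, B's low */
	assert(torn.latched[1] == b);

	serialized(&clean, a, b);
	assert(clean.latched[0] == a);
	assert(clean.latched[1] == b);
}
```

Under that assumption the interleaved case delivers one torn descriptor and drops CPU A's request entirely, which is why a split writeq needs a lock around the pair of writes.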
Re: [PATCH 1/3] mpt2sas: remove the use of writeq, since writeq is not atomic
(Just adding Milton to the CC list, he suspects races in the
driver instead).

On Wed, 2011-05-18 at 08:23 +0400, James Bottomley wrote:
> On Tue, 2011-05-17 at 22:15 -0600, Matthew Wilcox wrote:
> > On Wed, May 18, 2011 at 09:37:08AM +0530, Desai, Kashyap wrote:
> > > On Wed, 2011-05-04 at 17:23 +0530, Kashyap, Desai wrote:
> > > > The following code seems to be there in
> > > > /usr/src/linux/arch/x86/include/asm/io.h.
> > > > This is not going to work.
> > > >
> > > > static inline void writeq(__u64 val, volatile void __iomem *addr)
> > > > {
> > > >     writel(val, addr);
> > > >     writel(val >> 32, addr+4);
> > > > }
> > > >
> > > > So with this code turned on in the kernel, there is going to be race
> > > > condition where multiple cpus can be writing to the request
> > > > descriptor at the same time.
> > > >
> > > > Meaning this could happen:
> > > > (A) CPU A doest 32bit write
> > > > (B) CPU B does 32 bit write
> > > > (C) CPU A does 32 bit write
> > > > (D) CPU B does 32 bit write
> > > >
> > > > We need the 64 bit completed in one access pci memory write, else spin
> > > > lock is required.
> > > > Since it's going to be difficult to know which writeq was implemented
> > > > in the kernel, the driver is going to have to always acquire a spin
> > > > lock each time we do 64bit write.
> > > >
> > > > Cc: sta...@kernle.org
> > > > Signed-off-by: Kashyap Desai
> > > > ---
> > > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > index efa0255..5778334 100644
> > > > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> > > > @@ -1558,7 +1558,6 @@ mpt2sas_base_free_smid(struct MPT2SAS_ADAPTER *ioc, u16 smid)
> > > >   * care of 32 bit environment where its not quarenteed to send the entire word
> > > >   * in one transfer.
> > > >   */
> > > > -#ifndef writeq
> > >
> > > Why not make this #ifndef CONFIG_64BIT? You know that all 64 bit
> > > systems have writeq implemented correctly; you suspect 32 bit systems
> > > don't.
> > >
> > > James
> > >
> > > James, This issue was observed on PPC64 system. So what you have
> > > suggested will not solve this issue.
> > > If we are sure that writeq() is atomic across all architecture, we can
> > > use it safely. As we have seen issue on ppc64, we are not confident to use
> > > "writeq" call.
> >
> > So have you told the powerpc people that they have a broken writeq?
>
> I'm just in the process of finding them now on IRC so I can demand an
> explanation: this is a really serious API problem because writeq is
> supposed to be atomic on 64 bit.
>
> > And why do you obfuscate your report by talking about i386 when it's
> > really about powerpc64?
>
> James
RE: book to learn ppc assembly and architecture
> > On Mon, 2011-05-16 at 16:37 +1000, Michael Neuling wrote:
> >> > what is the best book to learn assembly and architecture .
>
> Assuming you have a powerpc compiler available you can use the -S
> option to produce assembly listings.

With gcc add -fverbose-asm for more info.

For a general background, look at something much simpler than ppc,
even if you don't write/run any code.

	David
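As a starting point for reading compiler output, a tiny function like the one below can be fed to the compiler with the options mentioned above, for example `gcc -O2 -S -fverbose-asm store64.c`. The file name, the function names and the choice of `-O2` are illustrative assumptions, and a powerpc target would need whatever cross compiler is installed locally. No particular listing is claimed here; the point is only to have something small enough to read in full.

```c
#include <assert.h>
#include <stdint.h>

/* A single 64-bit store through a volatile pointer; look for it in the listing. */
void store64(volatile uint64_t *p, uint64_t v)
{
	*p = v;
}

/* Sanity check so the file can also be built and run as an ordinary program. */
void check_store64(void)
{
	uint64_t x = 0;

	store64(&x, 0x1122334455667788ULL);
	assert(x == 0x1122334455667788ULL);
}
```

Comparing the listing for a 32-bit and a 64-bit target is one way to see whether the store is emitted as one instruction or two, which is the question at the heart of the writeq thread earlier in this digest.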