[PATCH] powerpc/eeh: sysfs entries lost

2014-06-25 Thread Mike Qiu
The sysfs entries are lost because of commit 2213fb1 ("powerpc/eeh:
Skip eeh sysfs when eeh is disabled"). That commit added condition
to create sysfs entries with EEH_ENABLED, which isn't populated
when trying to create sysfs entries on PowerNV platform during system
boot time. The patch fixes the issue by:

   * Reoder EEH initialization functions so that they're same on
 PowerNV/pSeries.
   * Cache PE's primary bus by PowerNV platform instead of EEH core
 to avoid kernel crash caused by the function reorder. Another
 benefit with this is to avoid one eeh_probe_mode_dev() in EEH
 core.

Signed-off-by: Mike Qiu 
---
 arch/powerpc/kernel/eeh_pe.c | 11 ---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 17 -
 arch/powerpc/platforms/powernv/pci-ioda.c|  2 +-
 3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index fbd01eb..1dce071a 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -351,17 +351,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
pe->config_addr = edev->config_addr;
 
/*
-* While doing PE reset, we probably hot-reset the
-* upstream bridge. However, the PCI devices including
-* the associated EEH devices might be removed when EEH
-* core is doing recovery. So that won't safe to retrieve
-* the bridge through downstream EEH device. We have to
-* trace the parent PCI bus, then the upstream bridge.
-*/
-   if (eeh_probe_mode_dev())
-   pe->bus = eeh_dev_to_pci_dev(edev)->bus;
-
-   /*
 * Put the new EEH PE into hierarchy tree. If the parent
 * can't be found, the newly created PE will be attached
 * to PHB directly. Otherwise, we have to associate the
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c 
b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 56a206f..48eb223 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -107,6 +107,7 @@ static int powernv_eeh_dev_probe(struct pci_dev *dev, void 
*flag)
struct pnv_phb *phb = hose->private_data;
struct device_node *dn = pci_device_to_OF_node(dev);
struct eeh_dev *edev = of_node_to_eeh_dev(dn);
+   int ret;
 
/*
 * When probing the root bridge, which doesn't have any
@@ -143,7 +144,21 @@ static int powernv_eeh_dev_probe(struct pci_dev *dev, void 
*flag)
edev->pe_config_addr= phb->bdfn_to_pe(phb, dev->bus, dev->devfn & 
0xff);
 
/* Create PE */
-   eeh_add_to_parent_pe(edev);
+   ret = eeh_add_to_parent_pe(edev);
+   if (ret) {
+   pr_warn("%s: Can't add PCI dev %s to parent PE (%d)\n",
+   __func__, pci_name(dev), ret);
+   return ret;
+   }
+
+   /*
+* Cache the PE primary bus, which can't be fetched when
+* full hotplug is in progress. In that case, all child
+* PCI devices of the PE are expected to be removed prior
+* to PE reset.
+*/
+   if (!edev->pe->bus)
+   edev->pe->bus = dev->bus;
 
/*
 * Enable EEH explicitly so that we will do EEH check
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index de19ede..81f2d3a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1142,8 +1142,8 @@ static void pnv_pci_ioda_fixup(void)
 
 #ifdef CONFIG_EEH
eeh_probe_mode_set(EEH_PROBE_MODE_DEV);
-   eeh_addr_cache_build();
eeh_init();
+   eeh_addr_cache_build();
 #endif
 }
 
-- 
1.8.1.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: powerpc: crash with 3.16.0-rc2

2014-06-25 Thread Cedric Le Goater
Hi Suka,

On 06/26/2014 08:10 AM, Sukadev Bhattiprolu wrote:
> I got the following crash in Open Firmware, on two separate systems
> with recent mainline kernel (3.16.0-rc2). One was a P8 LPAR with no
> changes to kernel and another a Power7 LPAR with some kernel changes
> (24x7 perf counter patches). I will backout the patches and try but
> wanted to check if there is some config change I am missing.
> 
> I am also attaching the config file (which is based on a 3.14
> kernel that boots ok).
> 
> ---
> OF stdout device is: /vdevice/vty@3000
> Preparing to boot Linux version 3.16.0-rc2-unwind+ (root@saturn-lp2) (gcc 
> version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #5 SMP Thu Jun 26 00:01:47 
> CDT 2014
> Detected machine type: 0101
> Max number of cores passed to firmware: 256 (NR_CPUS = 1024)
> Calling ibm,client-architecture-support... done
> command line: BOOT_IMAGE=/vmlinux-3.16.0-rc2-unwind 
> root=UUID=017d60b7-9db9-4d91-9dc8-4e6f91b0ed40 ro biosdevname=0 
> vconsole.font=latarcyrheb-sun16
> memory layout at init:
>   memory_limit :  (16 MB aligned)
>   alloc_bottom : 0553
>   alloc_top: 1000
>   alloc_top_hi : 1000
>   rmo_top  : 1000
>   ram_top  : 1000
> instantiating rtas at 0x0ee9... done
> Querying for OPAL presence... not there.

A patch from Michael Ellerman was just merged : 

powerpc/powernv: Remove OPAL v1 takeover

I think it fixes the problem you are seeing.

Cheers,

C. 


> DEFAULT CATCH!, exception-handler=fff00700 
> at   %SRR0: 04133064   %SRR1: 80081002 
> Open Firmware exception handler entered from non-OF code
> 
> Client's Fix Pt Regs:
>  00 04133030 04133018 047b1b28 0002
>  04 28005024 047b1b28 1002 04132f18
>  08   04132f18 1002
>  0c a001  01a3fd20 0401bb70
>  10 0401c030 0401bd70 fffd 01a3fd20
>  14 01b49080 0ee9 0117 0ee9
>  18 0401c510 0369 0401bb28 04193da0
>  1c 01a3fd60 794927e391410070 e87f408204cc 3cc2ff8b38c629d8
> Special Regs:
> %IV: 0700 %CR: 28005022%XER:   %DSISR: 4200 
>   %SRR0: 04133064   %SRR1: 80081002 
> %LR: 04133030%CTR:  
>%DAR: 01a3fcf00020b4ac 
> Virtual PID = 0 
>  ok
> 
> 
> 
> ___
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] Bugfix: powerpc/eeh: Create eeh sysfs entry in post_init()

2014-06-25 Thread Mike Qiu

On 06/26/2014 08:12 AM, Gavin Shan wrote:

On Wed, Jun 25, 2014 at 03:27:55PM +0800, Mike Qiu wrote:

On 06/25/2014 01:33 PM, Gavin Shan wrote:

On Tue, Jun 24, 2014 at 11:32:07PM -0400, Mike Qiu wrote:

[ cc Richard ]


Eeh sysfs entry created must be after EEH_ENABLED been set
in eeh_subsystem_flags.

In PowerNV platform, it try to create sysfs entry before
EEH_ENABLED been set, when boot up. So nothing will be
created for eeh in sysfs.


Could you please make the commit log more clear? :-)

I guess the issue is introduced by commit 2213fb1 ("
powerpc/eeh: Skip eeh sysfs when eeh is disabled"). The
commit checks EEH is enabled while creating PCI device
EEH sysfs files. If not, the sysfs files won't be created.
That's to avoid warning reported during PCI hotplug.

The problem you're reporting (if I understand completely):
You don't see the sysfs files after the system boots up.
If it's the case, you probably need following changes in
arch/powerpc/platforms/powernv/pci.c::pnv_pci_ioda_fixup().
Could you have a try with it?

#ifdef CONFIG_EEH
eeh_probe_mode_set(EEH_PROBE_MODE_DEV);
-   eeh_addr_cache_build();
eeh_init();
+   eeh_addr_cache_build();
#endif

But this was not work, as I test, see boot log below:


Yeah, we can't convert eeh_dev to pci_dev that time. The
association is populated by eeh_addr_cache_build(). The
attached patch should fix your issue. I tried on P7 machine
and sysfs entries created. Could you help having a test
on your machine? :-)


I have tested, works good.

Thanks
Mike


Thanks,
Gavin


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH v2 6/6] mmc: core: add manual resume capability

2014-06-25 Thread Vincent Yang
This patch adds manual resume for some embedded platforms with rootfs
stored in SD card. It references CONFIG_MMC_BLOCK_DEFERRED_RESUME in
kernel 3.10. It lets host controller driver to manually handle resume
by itself.

[symptom]
This issue is found on mb86s7x platforms with rootfs stored in SD card.
It failed to resume form STR suspend mode because SD card cannot be ready
in time. It take longer time (e.g., 600ms) to be ready for access.
The error log looks like below:

root@localhost:~# echo mem > /sys/power/state
[   30.441974] SUSPEND

SCB Firmware : Category 01 Version 02.03 Rev. 00_
Config   : (no configuration)
root@localhost:~# [   30.702976] Buffer I/O error on device mmcblk1p2, logical 
block 31349
[   30.709678] Buffer I/O error on device mmcblk1p2, logical block 168073
[   30.716220] Buffer I/O error on device mmcblk1p2, logical block 168074
[   30.722759] Buffer I/O error on device mmcblk1p2, logical block 168075
[   30.729456] Buffer I/O error on device mmcblk1p2, logical block 31349
[   30.735916] Buffer I/O error on device mmcblk1p2, logical block 31350
[   30.742370] Buffer I/O error on device mmcblk1p2, logical block 31351
[   30.749025] Buffer I/O error on device mmcblk1p2, logical block 168075
[   30.755657] Buffer I/O error on device mmcblk1p2, logical block 31351
[   30.763130] Aborting journal on device mmcblk1p2-8.
[   30.768060] JBD2: Error -5 detected when updating journal superblock for 
mmcblk1p2-8.
[   30.776085] EXT4-fs error (device mmcblk1p2): ext4_journal_check_start:56: 
Detected aborted journal
[   30.785259] EXT4-fs (mmcblk1p2): Remounting filesystem read-only
[   31.370716] EXT4-fs error (device mmcblk1p2): ext4_find_entry:1309: inode 
#2490369: comm udevd: reading directory lblock 0
[   31.382485] EXT4-fs error (device mmcblk1p2): ext4_find_entry:1309: inode 
#1048577: comm udevd: reading directory lblock 0

[analysis]
In system resume path, mmc_sd_resume() is failed with error code -123
because at that time SD card is still not ready on mb86s7x platforms.

[solution]
In order to not blocking system resume path, this patch just sets a flag
MMC_BUSRESUME_MANUAL_RESUME when this error happened, and then host controller
driver can understand it by this flag. Then host controller driver have to
resume SD card manually and asynchronously.

Signed-off-by: Vincent Yang 
---
 drivers/mmc/core/core.c  |  4 ++
 drivers/mmc/core/sd.c|  4 ++
 drivers/mmc/host/sdhci_f_sdh30.c | 89 
 include/linux/mmc/host.h | 14 +++
 4 files changed, 111 insertions(+)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 764af63..51fce49 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -2648,6 +2648,10 @@ int mmc_pm_notify(struct notifier_block *notify_block,
case PM_POST_RESTORE:
 
spin_lock_irqsave(&host->lock, flags);
+   if (mmc_bus_manual_resume(host)) {
+   spin_unlock_irqrestore(&host->lock, flags);
+   break;
+   }
host->rescan_disable = 0;
spin_unlock_irqrestore(&host->lock, flags);
_mmc_detect_change(host, 0, false);
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index 0c44510..859390d 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -1133,6 +1133,10 @@ static int mmc_sd_resume(struct mmc_host *host)
 
if (!(host->caps & MMC_CAP_RUNTIME_RESUME)) {
err = _mmc_sd_resume(host);
+   if ((host->caps2 & MMC_CAP2_MANUAL_RESUME) && err)
+   mmc_set_bus_resume_policy(host, 1);
+   else
+   mmc_set_bus_resume_policy(host, 0);
pm_runtime_set_active(&host->card->dev);
pm_runtime_mark_last_busy(&host->card->dev);
}
diff --git a/drivers/mmc/host/sdhci_f_sdh30.c b/drivers/mmc/host/sdhci_f_sdh30.c
index 6fae509..67bcff2 100644
--- a/drivers/mmc/host/sdhci_f_sdh30.c
+++ b/drivers/mmc/host/sdhci_f_sdh30.c
@@ -30,6 +30,12 @@
 #include "../core/core.h"
 
 #define DRIVER_NAME "f_sdh30"
+#define RESUME_WAIT_COUNT  100
+#define RESUME_WAIT_TIME   50
+#define RESUME_WAIT_JIFFIESmsecs_to_jiffies(RESUME_DETECT_TIME)
+#define RESUME_DETECT_COUNT16
+#define RESUME_DETECT_TIME 50
+#define RESUME_DETECT_JIFFIES  msecs_to_jiffies(RESUME_DETECT_TIME)
 
 
 struct f_sdhost_priv {
@@ -38,8 +44,59 @@ struct f_sdhost_priv {
int gpio_select_1v8;
u32 vendor_hs200;
struct device *dev;
+   unsigned int quirks;/* Deviations from spec. */
+
+/* retry to detect mmc device when resume */
+#define F_SDH30_QUIRK_RESUME_DETECT_RETRY  (1<<0)
+
+   struct workqueue_struct *resume_detect_wq;
+   struct delayed_work resume_detect_work;
+   unsigned intresume_detect_count;
+   unsigned intresume_wait_count;
 };
 
+static void sdhci_f_sdh30

[RFC PATCH v2 5/6] mmc: core: hold SD Clock before CMD11 during Signal Voltage Switch Procedure

2014-06-25 Thread Vincent Yang
This patch is to fix an issue found on mb86s7x platforms.

[symptom]
There are some UHS-1 SD memory cards sometimes cannot be detected correctly,
e.g., Transcend 600x SDXC 64GB UHS-1 memory card.
During Signal Voltage Switch Procedure, failure to switch is indicated
by the card holding DAT[3:0] low.

[analysis]
According to SD Host Controller Simplified Specification Version 3.00
chapter 3.6.1, the Signal Voltage Switch Procedure should be:
(1) Check S18A; (2) Issue CMD11; (3) Check CMD 11 response;
(4) Stop providing SD clock; (5) Check DAT[3:0] should be b;
(6) Set 1.8V Signal Enable; (7) Wait 5ms; (8) Check 1.8V Signal Enable;
(9) Provide SD Clock; (10) Wait 1ms; (11) Check DAT[3:0] should be b;
(12) error handling

With CONFIG_MMC_CLKGATE=y, sometimes there is one more gating/un-gating
SD clock between (2) and (3). In this case, some UHS-1 SD cards will hold
DAT[3:0] b at (11) and thus fails Signal Voltage Switch Procedure.

[solution]
By mmc_host_clk_hold() before CMD11, the additional gating/un-gating
SD clock between (2) and (3) can be prevented and thus no failure at (11).
It has been verified with many UHS-1 SD cards on mb86s7x platforms and
works correctly.

Signed-off-by: Vincent Yang 
---
 drivers/mmc/core/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 7dc0c85..764af63 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1428,6 +1428,8 @@ int mmc_set_signal_voltage(struct mmc_host *host, int 
signal_voltage, u32 ocr)
pr_warning("%s: cannot verify signal voltage switch\n",
mmc_hostname(host));
 
+   mmc_host_clk_hold(host);
+
cmd.opcode = SD_SWITCH_VOLTAGE;
cmd.arg = 0;
cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
@@ -1438,8 +1440,6 @@ int mmc_set_signal_voltage(struct mmc_host *host, int 
signal_voltage, u32 ocr)
 
if (!mmc_host_is_spi(host) && (cmd.resp[0] & R1_ERROR))
return -EIO;
-
-   mmc_host_clk_hold(host);
/*
 * The card should drive cmd and dat[0:3] low immediately
 * after the response of cmd11, but wait 1 ms to be sure
-- 
1.9.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH v2 4/6] mmc: sdhci: host: add new f_sdh30

2014-06-25 Thread Vincent Yang
This patch adds new host controller driver for
Fujitsu SDHCI controller f_sdh30.

Signed-off-by: Vincent Yang 
---
 .../devicetree/bindings/mmc/sdhci-fujitsu.txt  |  35 +++
 drivers/mmc/host/Kconfig   |   7 +
 drivers/mmc/host/Makefile  |   1 +
 drivers/mmc/host/sdhci_f_sdh30.c   | 346 +
 drivers/mmc/host/sdhci_f_sdh30.h   |  40 +++
 5 files changed, 429 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mmc/sdhci-fujitsu.txt
 create mode 100644 drivers/mmc/host/sdhci_f_sdh30.c
 create mode 100644 drivers/mmc/host/sdhci_f_sdh30.h

diff --git a/Documentation/devicetree/bindings/mmc/sdhci-fujitsu.txt 
b/Documentation/devicetree/bindings/mmc/sdhci-fujitsu.txt
new file mode 100644
index 000..40add438
--- /dev/null
+++ b/Documentation/devicetree/bindings/mmc/sdhci-fujitsu.txt
@@ -0,0 +1,35 @@
+* Fujitsu SDHCI controller
+
+This file documents differences between the core properties in mmc.txt
+and the properties used by the sdhci_f_sdh30 driver.
+
+Required properties:
+- compatible: "fujitsu,f_sdh30"
+- voltage-ranges : two cells are required, first cell specifies minimum
+  slot voltage (mV), second cell specifies maximum slot voltage (mV).
+  Several ranges could be specified.
+
+Optional properties:
+- gpios: This is one optional gpio for controlling a power mux which
+  switches between two power supplies. 3.3V is selected when gpio is high,
+  and 1.8V is selected when gpio is low. This voltage is used for signal
+  level.
+- clocks: Must contain an entry for each entry in clock-names. It is a
+  list of phandles and clock-specifier pairs.
+  See ../clocks/clock-bindings.txt for details.
+- clock-names: Should contain the following two entries:
+   "sd_sd4clk" - clock primarily used for tuning process
+   "sd_bclk"   - base clock for sdhci controller
+
+Example:
+
+   sdhci1: sdio@3660 {
+   compatible = "fujitsu,f_sdh30";
+   reg = <0 0x3660 0x1000>;
+   interrupts = <0 172 0x4>,
+<0 173 0x4>;
+   voltage-ranges = <1800 1800 3300 3300>;
+   gpios = <&gpio0 7 0>;
+   clocks = <&clk_hdmi_2_0>, <&clk_hdmi_3_0>;
+   clock-names = "sd_sd4clk", "sd_bclk";
+   };
diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 7fee224..a1f3207 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -281,6 +281,13 @@ config MMC_SDHCI_BCM2835
  This selects the BCM2835 SD/MMC controller. If you have a BCM2835
  platform with SD or MMC devices, say Y or M here.
 
+config MMC_SDHCI_F_SDH30
+   tristate "SDHCI support for Fujitsu Semiconductor F_SDH30"
+   depends on MMC_SDHCI && (ARCH_MB8AC0300 || ARCH_MB86S70)
+   help
+ This selects the Secure Digital Host Controller Interface (SDHCI)
+ Needed by some Fujitsu SoC for MMC / SD / SDIO support.
+ If you have a controller with this interface, say Y or M here.
  If unsure, say N.
 
 config MMC_MOXART
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 7f81ddf..a4c89e5 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_MMC_SDHCI_PXAV3) += sdhci-pxav3.o
 obj-$(CONFIG_MMC_SDHCI_PXAV2)  += sdhci-pxav2.o
 obj-$(CONFIG_MMC_SDHCI_S3C)+= sdhci-s3c.o
 obj-$(CONFIG_MMC_SDHCI_SIRF)   += sdhci-sirf.o
+obj-$(CONFIG_MMC_SDHCI_F_SDH30)+= sdhci_f_sdh30.o
 obj-$(CONFIG_MMC_SDHCI_SPEAR)  += sdhci-spear.o
 obj-$(CONFIG_MMC_WBSD) += wbsd.o
 obj-$(CONFIG_MMC_AU1X) += au1xmmc.o
diff --git a/drivers/mmc/host/sdhci_f_sdh30.c b/drivers/mmc/host/sdhci_f_sdh30.c
new file mode 100644
index 000..6fae509
--- /dev/null
+++ b/drivers/mmc/host/sdhci_f_sdh30.c
@@ -0,0 +1,346 @@
+/*
+ * linux/drivers/mmc/host/sdhci_f_sdh30.c
+ *
+ * Copyright (C) 2013 - 2014 Fujitsu Semiconductor, Ltd
+ *  Vincent Yang 
+ * Copyright (C) 2014 Linaro Ltd  Andy Green 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, version 2 of the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "sdhci.h"
+#include "sdhci-pltfm.h"
+#include "sdhci_f_sdh30.h"
+#include "../core/core.h"
+
+#define DRIVER_NAME "f_sdh30"
+
+
+struct f_sdhost_priv {
+   struct clk *clk_sd4;
+   struct clk *clk_b;
+   int gpio_select_1v8;
+   u32 vendor_hs200;
+   struct device *dev;
+};
+
+void sdhci_f_sdh30_soft_voltage_switch(struct sdhci_host *host)
+{
+   struct f_sdhost_priv *priv = sdhci_priv(host);
+   u32 ctrl = 0;
+
+   usleep_range(2500, 3000);
+   ctrl = sdhci_readl(host, F_SDH30_IO_CONTROL2);
+  

[RFC PATCH v2 2/6] mmc: sdhci: add quirk for tuning work around

2014-06-25 Thread Vincent Yang
This patch defines a quirk for tuning work
around for some sdhci host controller. It sets
both SDHCI_CTRL_EXEC_TUNING and SDHCI_CTRL_TUNED_CLK
for tuning.
It is a preparation and will be used by Fujitsu
SDHCI controller f_sdh30 driver.

Signed-off-by: Vincent Yang 
---
 drivers/mmc/host/sdhci.c  | 2 ++
 include/linux/mmc/sdhci.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index d62262b..900b4e4 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1867,6 +1867,8 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, u32 
opcode)
 
ctrl = sdhci_readw(host, SDHCI_HOST_CONTROL2);
ctrl |= SDHCI_CTRL_EXEC_TUNING;
+   if (host->quirks2 & SDHCI_QUIRK2_TUNING_WORK_AROUND)
+   ctrl |= SDHCI_CTRL_TUNED_CLK;
sdhci_writew(host, ctrl, SDHCI_HOST_CONTROL2);
 
/*
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index 5433f04..bcbad45 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -100,6 +100,8 @@ struct sdhci_host {
 #define SDHCI_QUIRK2_BROKEN_DDR50  (1<<7)
 /* Do a callback when switching voltages so do controller-specific actions */
 #define SDHCI_QUIRK2_VOLTAGE_SWITCH(1<<8)
+/* forced tuned clock */
+#define SDHCI_QUIRK2_TUNING_WORK_AROUND(1<<9)
 
int irq;/* Device IRQ */
void __iomem *ioaddr;   /* Mapped address */
-- 
1.9.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH v2 3/6] mmc: sdhci: add quirk for single block transactions

2014-06-25 Thread Vincent Yang
This patch defines a quirk to disable the block count
for single block transactions.
It is a preparation and will be used by Fujitsu
SDHCI controller f_sdh30 driver.

Signed-off-by: Vincent Yang 
---
 drivers/mmc/host/sdhci.c  | 8 +---
 include/linux/mmc/sdhci.h | 2 ++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 900b4e4..169e17d 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -876,7 +876,7 @@ static void sdhci_prepare_data(struct sdhci_host *host, 
struct mmc_command *cmd)
 static void sdhci_set_transfer_mode(struct sdhci_host *host,
struct mmc_command *cmd)
 {
-   u16 mode;
+   u16 mode = 0;
struct mmc_data *data = cmd->data;
 
if (data == NULL) {
@@ -889,9 +889,11 @@ static void sdhci_set_transfer_mode(struct sdhci_host 
*host,
 
WARN_ON(!host->data);
 
-   mode = SDHCI_TRNS_BLK_CNT_EN;
+   if (!(host->quirks2 & SDHCI_QUIRK2_SUPPORT_SINGLE))
+   mode = SDHCI_TRNS_BLK_CNT_EN;
+
if (mmc_op_multi(cmd->opcode) || data->blocks > 1) {
-   mode |= SDHCI_TRNS_MULTI;
+   mode = SDHCI_TRNS_BLK_CNT_EN | SDHCI_TRNS_MULTI;
/*
 * If we are sending CMD23, CMD12 never gets sent
 * on successful completion (so no Auto-CMD12).
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index bcbad45..72072d1 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -102,6 +102,8 @@ struct sdhci_host {
 #define SDHCI_QUIRK2_VOLTAGE_SWITCH(1<<8)
 /* forced tuned clock */
 #define SDHCI_QUIRK2_TUNING_WORK_AROUND(1<<9)
+/* disable the block count for single block transactions */
+#define SDHCI_QUIRK2_SUPPORT_SINGLE(1<<10)
 
int irq;/* Device IRQ */
void __iomem *ioaddr;   /* Mapped address */
-- 
1.9.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH v2 1/6] mmc: sdhci: add quirk for voltage switch callback

2014-06-25 Thread Vincent Yang
This patch defines a quirk to do a callback when
switching voltages so do controller-specific
actions.
It is a preparation and will be used by Fujitsu
SDHCI controller f_sdh30 driver.

Signed-off-by: Vincent Yang 
---
 drivers/mmc/host/sdhci.c  | 5 +
 drivers/mmc/host/sdhci.h  | 1 +
 include/linux/mmc/sdhci.h | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 47055f3..d62262b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1763,6 +1763,11 @@ static int sdhci_do_start_signal_voltage_switch(struct 
sdhci_host *host,
ctrl |= SDHCI_CTRL_VDD_180;
sdhci_writew(host, ctrl, SDHCI_HOST_CONTROL2);
 
+   /* Some controller need to do more when switching */
+   if ((host->quirks2 & SDHCI_QUIRK2_VOLTAGE_SWITCH) &&
+   host->ops->voltage_switch)
+   host->ops->voltage_switch(host);
+
/* Wait for 5ms */
usleep_range(5000, 5500);
 
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 4a5cd5e..63c7a46 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -292,6 +292,7 @@ struct sdhci_ops {
void(*adma_workaround)(struct sdhci_host *host, u32 intmask);
void(*platform_init)(struct sdhci_host *host);
void(*card_event)(struct sdhci_host *host);
+   void(*voltage_switch)(struct sdhci_host *host);
 };
 
 #ifdef CONFIG_MMC_SDHCI_IO_ACCESSORS
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index 08abe99..5433f04 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -98,6 +98,8 @@ struct sdhci_host {
 #define SDHCI_QUIRK2_BROKEN_HS200  (1<<6)
 /* Controller does not support DDR50 */
 #define SDHCI_QUIRK2_BROKEN_DDR50  (1<<7)
+/* Do a callback when switching voltages so do controller-specific actions */
+#define SDHCI_QUIRK2_VOLTAGE_SWITCH(1<<8)
 
int irq;/* Device IRQ */
void __iomem *ioaddr;   /* Mapped address */
-- 
1.9.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH v2 0/6] mmc: sdhci: adding support for a new Fujitsu sdhci IP

2014-06-25 Thread Vincent Yang
Hi,
We are adding support for a new Fujitsu sdhci IP.

These patches are against v3.16-rc1 mainline since nothing in
mmc-next at this moment.

These patches are tested on 3.16-rc1 integration tree.

We welcome any comment and advice about how to make any
improvements or better align them with upstream.

Changes from v1:
- Add sufficient description in DT binding ducument
- Remove one patch "mmc: sdhci: add quirk for broken 3.0V support" and use
  voltage-ranges = <> in the device tree instead

Thanks a lot!

Best regards,
Vincent Yang


Vincent Yang (6):
  mmc: sdhci: add quirk for voltage switch callback
  mmc: sdhci: add quirk for tuning work around
  mmc: sdhci: add quirk for single block transactions
  mmc: sdhci: host: add new f_sdh30
  mmc: core: hold SD Clock before CMD11 during Signal Voltage Switch
Procedure
  mmc: core: add manual resume capability

 .../devicetree/bindings/mmc/sdhci-fujitsu.txt  |  35 ++
 drivers/mmc/core/core.c|   8 +-
 drivers/mmc/core/sd.c  |   4 +
 drivers/mmc/host/Kconfig   |   7 +
 drivers/mmc/host/Makefile  |   1 +
 drivers/mmc/host/sdhci.c   |  15 +-
 drivers/mmc/host/sdhci.h   |   1 +
 drivers/mmc/host/sdhci_f_sdh30.c   | 435 +
 drivers/mmc/host/sdhci_f_sdh30.h   |  40 ++
 include/linux/mmc/host.h   |  14 +
 include/linux/mmc/sdhci.h  |   6 +
 11 files changed, 561 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/mmc/sdhci-fujitsu.txt
 create mode 100644 drivers/mmc/host/sdhci_f_sdh30.c
 create mode 100644 drivers/mmc/host/sdhci_f_sdh30.h

-- 
1.9.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v1 2/3] powerpc/powernv: Support PCI error injection

2014-06-25 Thread Stewart Smith
Gavin Shan  writes:
> +static struct kobj_attribute errinjct_attr =
> + __ATTR(errinjct, 0600, NULL, errinjct_store);

May also be good to have a read method that either lists current
injected errors? I guess it depends on if they're one time errors or
persistent errors too.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v1 2/3] powerpc/powernv: Support PCI error injection

2014-06-25 Thread Stewart Smith
Gavin Shan  writes:
> On Mon, Jun 23, 2014 at 04:36:44PM +1000, Michael Neuling wrote:
>>On Mon, 2014-06-23 at 12:14 +1000, Gavin Shan wrote:
>>> The patch implements one OPAL firmware sysfs file to support PCI error
>>> injection: "/sys/firmware/opal/errinjct", which will be used like the
>>> way described as follows.
>>> 
>>> According to PAPR spec, there are 3 RTAS calls related to error injection:
>>> "ibm,open-errinjct": allocate token prior to doing error injection.
>>> "ibm,close-errinjct": release the token allocated from "ibm,open-errinjct".
>>> "ibm,errinjct": do error injection.
>>> 
>>> Sysfs file /sys/firmware/opal/errinjct accepts strings that have fixed
>>> format "ei_token ...". For now, we only support 32-bits and 64-bits
>>> PCI error injection and they should have following strings written to
>>> /sys/firmware/opal/errinjct as follows. We don't have corresponding
>>> sysfs files for "ibm,open-errinjct" and "ibm,close-errinjct", which
>>> means that we rely on userland to maintain the token by itself.
>>
>>This sounds cool.  
>>
>>Can you document the sysfs interface in Documentation/powerpc?
>>
>
> Yeah, Documentation/powerpc/eeh-pci-error-recovery.txt needs update
> as Ben suggested. It's something in my list :-)

It should probably also/instead be in
Documentation/ABI/(testing|stable)/sysfs-firmware-opal-errinjct  as this
seems to be where sysfs bits get documented.

Also, considering that we're specifically looking at PCI error
injection, should the sysfs name be /sys/firmware/opal/pci-error-inject
instead?

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2] spi: deal with a compile warning

2014-06-25 Thread Zhao Qiang
ret is unused when CONFIG_FSL_SOC defined,
so return ret instead of -ENOMEM when the
kzalloc fails to avoid it.

Signed-off-by: Zhao Qiang 
---
Changes for v2:
-return ret instead of -ENOMEM when the kzalloc fails 

 drivers/spi/spi-fsl-lib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-fsl-lib.c b/drivers/spi/spi-fsl-lib.c
index e5d45fc..d40378f 100644
--- a/drivers/spi/spi-fsl-lib.c
+++ b/drivers/spi/spi-fsl-lib.c
@@ -202,7 +202,7 @@ int of_mpc8xxx_spi_probe(struct platform_device *ofdev)
 
pinfo = devm_kzalloc(&ofdev->dev, sizeof(*pinfo), GFP_KERNEL);
if (!pinfo)
-   return -ENOMEM;
+   return ret;
 
pdata = &pinfo->pdata;
dev->platform_data = pdata;
-- 
1.8.5

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Guenter Roeck

On 06/25/2014 05:59 PM, Stephen Rothwell wrote:

Hi Guenter,

[I know I'm a bit late to this, but ...]

On Tue, 24 Jun 2014 18:05:29 -0700 Guenter Roeck  wrote:


diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 9d85318..e35d880 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -275,7 +275,7 @@ void store_cpu_topology(unsigned int cpuid)
cpu_topology[cpuid].socket_id, mpidr);
  }

-static inline const int cpu_corepower_flags(void)
+static inline int cpu_corepower_flags(void)
  {
return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
  }


The only reference to this function is to take its address, so "inline"
is useless, right?


diff --git a/include/linux/sched.h b/include/linux/sched.h
index 306f4f0..0376b05 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -872,21 +872,21 @@ enum cpu_idle_type {
  #define SD_NUMA   0x4000  /* cross-node balancing */

  #ifdef CONFIG_SCHED_SMT
-static inline const int cpu_smt_flags(void)
+static inline int cpu_smt_flags(void)
  {
return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
  }
  #endif

  #ifdef CONFIG_SCHED_MC
-static inline const int cpu_core_flags(void)
+static inline int cpu_core_flags(void)
  {
return SD_SHARE_PKG_RESOURCES;
  }
  #endif

  #ifdef CONFIG_NUMA
-static inline const int cpu_numa_flags(void)
+static inline int cpu_numa_flags(void)
  {
return SD_NUMA;
  }


The same is true of those three, but then they would have to be moved
into a .c file and replaced with prototypes ...



Personally I wasn't sure why it had to be functions instead of defines,
but who knows. Anyway, seems others are not happy with my proposed fix
either, and everyone seems to suggest a different solution, so I guess
it won't go anywhere.

I "solved" my immediate problem of getting a polluted build log by
filtering the warnings out, so I don't really care too much anymore ;-).

Guenter

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Stephen Rothwell
Hi Guenter,

[I know I'm a bit late to this, but ...]

On Tue, 24 Jun 2014 18:05:29 -0700 Guenter Roeck  wrote:
>
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 9d85318..e35d880 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -275,7 +275,7 @@ void store_cpu_topology(unsigned int cpuid)
>   cpu_topology[cpuid].socket_id, mpidr);
>  }
>  
> -static inline const int cpu_corepower_flags(void)
> +static inline int cpu_corepower_flags(void)
>  {
>   return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
>  }

The only reference to this function is to take its address, so "inline"
is useless, right?

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 306f4f0..0376b05 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -872,21 +872,21 @@ enum cpu_idle_type {
>  #define SD_NUMA  0x4000  /* cross-node balancing */
>  
>  #ifdef CONFIG_SCHED_SMT
> -static inline const int cpu_smt_flags(void)
> +static inline int cpu_smt_flags(void)
>  {
>   return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
>  }
>  #endif
>  
>  #ifdef CONFIG_SCHED_MC
> -static inline const int cpu_core_flags(void)
> +static inline int cpu_core_flags(void)
>  {
>   return SD_SHARE_PKG_RESOURCES;
>  }
>  #endif
>  
>  #ifdef CONFIG_NUMA
> -static inline const int cpu_numa_flags(void)
> +static inline int cpu_numa_flags(void)
>  {
>   return SD_NUMA;
>  }

The same is true of those three, but then they would have to be moved
into a .c file and replaced with prototypes ...

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au


signature.asc
Description: PGP signature
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] Bugfix: powerpc/eeh: Create eeh sysfs entry in post_init()

2014-06-25 Thread Gavin Shan
On Wed, Jun 25, 2014 at 03:27:55PM +0800, Mike Qiu wrote:
>On 06/25/2014 01:33 PM, Gavin Shan wrote:
>>On Tue, Jun 24, 2014 at 11:32:07PM -0400, Mike Qiu wrote:
>>
>>[ cc Richard ]
>>
>>>Eeh sysfs entry created must be after EEH_ENABLED been set
>>>in eeh_subsystem_flags.
>>>
>>>In PowerNV platform, it try to create sysfs entry before
>>>EEH_ENABLED been set, when boot up. So nothing will be
>>>created for eeh in sysfs.
>>>
>>Could you please make the commit log more clear? :-)
>>
>>I guess the issue is introduced by commit 2213fb1 ("
>>powerpc/eeh: Skip eeh sysfs when eeh is disabled"). The
>>commit checks EEH is enabled while creating PCI device
>>EEH sysfs files. If not, the sysfs files won't be created.
>>That's to avoid warning reported during PCI hotplug.
>>
>>The problem you're reporting (if I understand completely):
>>You don't see the sysfs files after the system boots up.
>>If it's the case, you probably need following changes in
>>arch/powerpc/platforms/powernv/pci.c::pnv_pci_ioda_fixup().
>>Could you have a try with it?
>>
>>#ifdef CONFIG_EEH
>>  eeh_probe_mode_set(EEH_PROBE_MODE_DEV);
>>- eeh_addr_cache_build();
>>  eeh_init();
>>+ eeh_addr_cache_build();
>>#endif
>
>But this was not work, as I test, see boot log below:
>

Yeah, we can't convert eeh_dev to pci_dev that time. The
association is populated by eeh_addr_cache_build(). The
attached patch should fix your issue. I tried on P7 machine
and sysfs entries created. Could you help having a test
on your machine? :-)

Thanks,
Gavin
>From 70b8ac1d64192954e04bc4a4f2736349d7df6a8a Mon Sep 17 00:00:00 2001
From: Gavin Shan 
Date: Thu, 26 Jun 2014 09:50:25 +1000
Subject: [PATCH] powerpc/eeh: sysfs entries lost

The sysfs entries are lost because of commit 2213fb1 ("powerpc/eeh:
Skip eeh sysfs when eeh is disabled"). That commit added condition
to create sysfs entries with EEH_ENABLED, which isn't populated
when trying to create sysfs entries on PowerNV platform during system
boot time. The patch fixes the issue by:

   * Reoder EEH initialization functions so that they're same on
 PowerNV/pSeries.
   * Cache PE's primary bus by PowerNV platform instead of EEH core
 to avoid kernel crash caused by the function reorder. Another
 benefit with this is to avoid one eeh_probe_mode_dev() in EEH
 core.

Reported-by: Mike Qiu 
Signed-off-by: Gavin Shan 
---
 arch/powerpc/kernel/eeh_pe.c | 11 ---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 17 -
 arch/powerpc/platforms/powernv/pci-ioda.c|  2 +-
 3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index fbd01eb..1dce071a 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -351,17 +351,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	pe->config_addr	= edev->config_addr;
 
 	/*
-	 * While doing PE reset, we probably hot-reset the
-	 * upstream bridge. However, the PCI devices including
-	 * the associated EEH devices might be removed when EEH
-	 * core is doing recovery. So that won't safe to retrieve
-	 * the bridge through downstream EEH device. We have to
-	 * trace the parent PCI bus, then the upstream bridge.
-	 */
-	if (eeh_probe_mode_dev())
-		pe->bus = eeh_dev_to_pci_dev(edev)->bus;
-
-	/*
 	 * Put the new EEH PE into hierarchy tree. If the parent
 	 * can't be found, the newly created PE will be attached
 	 * to PHB directly. Otherwise, we have to associate the
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 56a206f..48eb223 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -107,6 +107,7 @@ static int powernv_eeh_dev_probe(struct pci_dev *dev, void *flag)
 	struct pnv_phb *phb = hose->private_data;
 	struct device_node *dn = pci_device_to_OF_node(dev);
 	struct eeh_dev *edev = of_node_to_eeh_dev(dn);
+	int ret;
 
 	/*
 	 * When probing the root bridge, which doesn't have any
@@ -143,7 +144,21 @@ static int powernv_eeh_dev_probe(struct pci_dev *dev, void *flag)
 	edev->pe_config_addr	= phb->bdfn_to_pe(phb, dev->bus, dev->devfn & 0xff);
 
 	/* Create PE */
-	eeh_add_to_parent_pe(edev);
+	ret = eeh_add_to_parent_pe(edev);
+	if (ret) {
+		pr_warn("%s: Can't add PCI dev %s to parent PE (%d)\n",
+			__func__, pci_name(dev), ret);
+		return ret;
+	}
+
+	/*
+	 * Cache the PE primary bus, which can't be fetched when
+	 * full hotplug is in progress. In that case, all child
+	 * PCI devices of the PE are expected to be removed prior
+	 * to PE reset.
+	 */
+	if (!edev->pe->bus)
+		edev->pe->bus = dev->bus;
 
 	/*
 	 * Enable EEH explicitly so that we will do EEH check
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index de19ede..81f2d3a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1142,8 +1142,8

Re: [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration

2014-06-25 Thread Alexey Kardashevskiy
On 06/26/2014 07:12 AM, Alexander Graf wrote:
> 
> On 06.06.14 02:20, Alexey Kardashevskiy wrote:
>> On 06/05/2014 09:57 PM, Alexander Graf wrote:
>>> On 05.06.14 09:25, Alexey Kardashevskiy wrote:
 This reserves 2 capability numbers.

 This implements an extended version of KVM_CREATE_SPAPR_TCE_64 ioctl.

 Please advise how to proceed with these patches as I suspect that
 first two should go via Paolo's tree while the last one via Alex Graf's
 tree
 (correct?).
>>> They would just go via my tree, but only be actually allocated (read:
>>> mergable to qemu) when they hit Paolo's tree.
>>>
>>> In fact, I don't think it makes sense to split them off at all.
>>
>> So? Are these patches going anywhere? Thanks.
> 
> So? Are you going to address the comments?

Sorry, I cannot find here anything to fix. Ben asked some questions, I
answered and there were no objections. What do I miss this time?...


-- 
Alexey
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [2/3,v4] powerpc/fsl-booke: Add initial T208x QDS board support

2014-06-25 Thread Scott Wood
On Wed, 2014-06-25 at 18:23 -0500, Scott Wood wrote:
> On Wed, Jun 11, 2014 at 06:10:05PM +0800, Shengzhou Liu wrote:
> > +   flash@2 {
> > +   #address-cells = <1>;
> > +   #size-cells = <1>;
> > +   compatible = "eon,en25s64";
> > +   reg = <2>;
> > +   spi-max-frequency = <3500>;
> > +   };
> 
> I won't hold up this patch due to this, but you should send a patch to
> add eon to Documentation/devicetree/bindings/vendor-prefixes.txt.

at24, OTOH, looks wrong.  Shouldn't it be atmel?  How did "24" become
part of the vendor?  I see lots of existing uses, though...

-Scott


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [3/3,v4] powerpc/t2080rdb: Add T2080RDB board support

2014-06-25 Thread Scott Wood
On Wed, Jun 11, 2014 at 06:10:06PM +0800, Shengzhou Liu wrote:
> + i2c@0 {
> + #address-cells = <1>;
> + #size-cells = <0>;
> + reg = <0x0>;
> +
> + sfp@50 {
> + compatible = "optics,sfp";
> + reg = <0x50>;
> + };
> + };

What is "sfp"?  Please use generic node names when possible.

I'm not able to easily find what chip this is referring to by googling
"optics sfp".  I suspect this compatible is too vague -- what is the
actual part number?  Could you provide a URL to a description of the
chip?

If "optics" is the correct vendor name, it needs to go into
vendor-prefixes.txt.

-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [2/3,v4] powerpc/fsl-booke: Add initial T208x QDS board support

2014-06-25 Thread Scott Wood
On Wed, Jun 11, 2014 at 06:10:05PM +0800, Shengzhou Liu wrote:
> + flash@2 {
> + #address-cells = <1>;
> + #size-cells = <1>;
> + compatible = "eon,en25s64";
> + reg = <2>;
> + spi-max-frequency = <3500>;
> + };

I won't hold up this patch due to this, but you should send a patch to
add eon to Documentation/devicetree/bindings/vendor-prefixes.txt.

-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs

2014-06-25 Thread Russell King - ARM Linux
On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:
> + coherency_line_size: the minimum amount of data that gets 
> transferred

So, what value to do envision this taking for a CPU where the cache
line size is 32 bytes, but each cache line has two dirty bits which
allow it to only evict either the upper or lower 16 bytes depending
on which are dirty?

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration

2014-06-25 Thread Alexander Graf


On 06.06.14 02:20, Alexey Kardashevskiy wrote:

On 06/05/2014 09:57 PM, Alexander Graf wrote:

On 05.06.14 09:25, Alexey Kardashevskiy wrote:

This reserves 2 capability numbers.

This implements an extended version of KVM_CREATE_SPAPR_TCE_64 ioctl.

Please advise how to proceed with these patches as I suspect that
first two should go via Paolo's tree while the last one via Alex Graf's tree
(correct?).

They would just go via my tree, but only be actually allocated (read:
mergable to qemu) when they hit Paolo's tree.

In fact, I don't think it makes sense to split them off at all.


So? Are these patches going anywhere? Thanks.


So? Are you going to address the comments?


Alex

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: OF_DYNAMIC node lifecycle

2014-06-25 Thread Grant Likely
On Tue, 24 Jun 2014 15:10:55 -0500, Nathan Fontenot  
wrote:
> On 06/23/2014 09:48 AM, Grant Likely wrote:
> > On Thu, 19 Jun 2014 10:26:15 -0500, Nathan Fontenot  
> > wrote:
> >> On 06/18/2014 03:07 PM, Grant Likely wrote:
> >>> Hi Nathan and Tyrel,
> >>>
> >>> I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and
> >>> I'm hoping you can help me. Right now, pseries seems to be the only
> >>> user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on
> >>> the entire kernel because it requires all DT code to manage reference
> >>> counting with iterating over nodes. Most users simply get it wrong.
> >>> Pantelis did some investigation and found that the reference counts on
> >>> a running kernel are all over the place. I have my doubts that any
> >>> code really gets it right.
> >>>
> >>> The problem is that users need to know when it is appropriate to call
> >>> of_node_get()/of_node_put(). All list traversals that exit early need
> >>> an extra call to of_node_put(), and code that is searching for a node
> >>> in the tree and holding a reference to it needs to call of_node_get().
> >>>
> >>> I've got a few pseries questions:
> >>> - What are the changes being requested by pseries firmware? Is it only
> >>> CPUs and memory nodes, or does it manipulate things all over the tree?
> >>
> >> The short answer, everything.
> > 
> > :-)
> > 
> >> For pseries the two big actions that can change the device tree are
> >> adding/removing resources and partition migration.
> >>
> >> The most frequent updates to the device tree happen during resource
> >> (cpu, memory, and pci/phb) add and remove. During this process we add
> >> and remove the node and its properties from the device tree.
> >> - For memory on newer systems this just involves updating the
> >>   ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory property. Older
> >>   firmware levels add and remove the memroy@XXX nodes and their properties.
> >> - For cpus the cpus/PowerPC,POWER nodes and its properties are added
> >>   or removed
> >> - For pci/phb the pci@X nodes and properties are added/removed.
> >>
> >> The less frequent operation of live partition migration (and 
> >> suspend/resume)
> >> can update just about anything in the device tree. When this occurs and the
> >> systems starts after being migrated (or waking up after a suspend) we make
> >> a call to firmware to get updates to the device tree for the new hardware
> >> we are running on.
> >>  
> >>> - How frequent are the changes? How many changes would be likely over
> >>> the runtime of the system?
> >>
> >> This can happen frequently.
> > 
> > Thanks, that is exactly the information that I want. I'm not so much
> > concerned with the addition or removal of nodes/properties, which is
> > actually pretty easy to handle. It is the lifecycle of allocations on
> > dynamic nodes that causes heartburn.
> > 
> >>> - Are you able to verify that removed nodes are actually able to be
> >>> freed correctly? Do you have any testcases for node removal?
> >>
> >> I have always tested this by doing resource add/remove, usually cpu and 
> >> memory
> >> since it is the easiest.
> > 
> > Is that just testing the functionality, or do you have tests that check
> > if the memory gets freed?
> 
> In general it's just functionality testing.
> 
> > 
> >>> I'm thinking very seriously about changing the locking semantics of DT
> >>> code entirely so that most users never have to worry about
> >>> of_node_get/put at all. If the DT code is switched to use rcu
> >>> primitives for tree iteration (which also means making DT code use
> >>> list_head, something I'm already investigating), then instead of
> >>> trying to figure out of_node_get/put rules, callers could use
> >>> rcu_read_lock()/rcu_read_unlock() to protect the region that is
> >>> searching over nodes, and only call of_node_get() if the node pointer
> >>> is needed outside the rcu read-side lock.
> >>>
> >>
> >> This sounds good. I like just taking the rcu lock around accessing the DT.
> >> Do we have many places where DT node pointers are held that require
> >> keeping the of_node_get/put calls? If this did exist perhaps we could
> >> update those places to look up the DT node every time instead of
> >> holding on to the pointer. We could just get rid of the reference counting
> >> altogether then.
> > 
> > There are a few, but I would be happy to restrict reference counting to
> > only those locations. Most places will decode the DT data, and then
> > throw away the reference. We /might/ even be able to do rcu_lock/unlock
> > around the entire probe path which would make it transparent to all
> > device drivers.
> > 
> >>> I'd really like to be rid of the node reference counting entirely, but
> >>> I can't figure out a way of doing that safely, so I'd settle for
> >>> making it a lot easier to get correct.
> >>>
> >>
> >> heh! I have often thought about adding reference counting to device tree
> >> properties.
> > 

Re: OF_DYNAMIC node lifecycle

2014-06-25 Thread Grant Likely
On Tue, 24 Jun 2014 15:07:05 -0500, Nathan Fontenot  
wrote:
> On 06/23/2014 09:58 AM, Grant Likely wrote:
> > On Thu, 19 Jun 2014 11:33:20 +0300, Pantelis Antoniou 
> >  wrote:
> >> Hi Grant,
> >>
> >> CCing Thomas Gleixner & Steven Rostedt, since they might have a few
> >> ideas...
> >>
> >> On Jun 18, 2014, at 11:07 PM, Grant Likely wrote:
> >>
> >>> Hi Nathan and Tyrel,
> >>>
> >>> I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and
> >>> I'm hoping you can help me. Right now, pseries seems to be the only
> >>> user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on
> >>> the entire kernel because it requires all DT code to manage reference
> >>> counting with iterating over nodes. Most users simply get it wrong.
> >>> Pantelis did some investigation and found that the reference counts on
> >>> a running kernel are all over the place. I have my doubts that any
> >>> code really gets it right.
> >>>
> >>> The problem is that users need to know when it is appropriate to call
> >>> of_node_get()/of_node_put(). All list traversals that exit early need
> >>> an extra call to of_node_put(), and code that is searching for a node
> >>> in the tree and holding a reference to it needs to call of_node_get().
> >>>
> >>
> >> In hindsight it appears that drivers just can't get the lifecycle right.
> >> So we need to simplify things.
> >>
> >>> I've got a few pseries questions:
> >>> - What are the changes being requested by pseries firmware? Is it only
> >>> CPUs and memory nodes, or does it manipulate things all over the tree?
> >>> - How frequent are the changes? How many changes would be likely over
> >>> the runtime of the system?
> >>> - Are you able to verify that removed nodes are actually able to be
> >>> freed correctly? Do you have any testcases for node removal?
> >>>
> >>> I'm thinking very seriously about changing the locking semantics of DT
> >>> code entirely so that most users never have to worry about
> >>> of_node_get/put at all. If the DT code is switched to use rcu
> >>> primitives for tree iteration (which also means making DT code use
> >>> list_head, something I'm already investigating), then instead of
> >>> trying to figure out of_node_get/put rules, callers could use
> >>> rcu_read_lock()/rcu_read_unlock() to protect the region that is
> >>> searching over nodes, and only call of_node_get() if the node pointer
> >>> is needed outside the rcu read-side lock.
> >>>
> >>> I'd really like to be rid of the node reference counting entirely, but
> >>> I can't figure out a way of doing that safely, so I'd settle for
> >>> making it a lot easier to get correct.
> >>>
> >>
> >> Since we're going about changing things, how about that devtree_lock?
> > 
> > I believe rcu would pretty much eliminate the devtree_lock entirely. All
> > modifiers would need to grab a mutex to ensure there is only one writer
> > at any given time, but readers would have free reign to parse the tree
> > however they like.
> > 
> > DT writers would have to follow some strict rules about how to handle
> > nodes that are removed (ie. don't modify or of_node_put() them until
> > after rcu is syncronized), but the number of writers is very small and
> > we have control of all of them.
> > 
> >> We're using a raw_spinlock and we're always taking the lock with
> >> interrupts disabled.
> >>
> >> If we're going to make DT changes frequently during normal runtime
> >> and not only during boot time, those are bad for any kind of real-time
> >> performance.
> >>
> >> So the question is, do we really have code that access the live tree
> >> during atomic sections?  Is that something we want? Enforcing this
> >> will make our lives easier, and we'll get the change to replace
> >> that spinlock with a mutex.
> > 
> > Yes, I believe the powerpc CPU hotplug code accesses the DT in atomic
> > sections. I cannot put my finger on the exact code however. Nathan might
> > know better. But, if I'm right, the whole problem goes away with RCU.
> 
> I went back through the cpu hotplug code. we do update the DT during cpu
> hotplug but I don't see it happening during atomic sections.
> 
> The code is in arch/powerpc/platforms/pseries/dlpar.c

Great, thanks,

By the way, notifiers currently get sent before any updates are applied
to the tree. I want to change it so that the notifier gets sent
afterwards. Does that work for you? I've looked through all the users
and aside from a stupid block of code in arch/powerpc/kernel/prom.c
which does things that should be done by of_attach_node(), it looks like
everything should be fine.

g.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 -next 0/9] CMA: generalize CMA reserved area management code

2014-06-25 Thread Andrew Morton
On Wed, 25 Jun 2014 14:33:56 +0200 Marek Szyprowski  
wrote:

> > That's probably easier.  Marek, I'll merge these into -mm (and hence
> > -next and git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git)
> > and shall hold them pending you review/ack/test/etc, OK?
> 
> Ok. I've tested them and they work fine. I'm sorry that you had to wait for
> me for a few days. You can now add:
> 
> Acked-and-tested-by: Marek Szyprowski 

Thanks.

> I've also rebased my pending patches onto this set (I will send them soon).
> 
> The question is now if you want to keep the discussed patches in your 
> -mm tree,
> or should I take them to my -next branch. If you like to keep them, I assume
> you will also take the patches which depends on the discussed changes.

Yup, that works.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2 2/2] powerpc: bpf: Fix the broken LD_VLAN_TAG_PRESENT test

2014-06-25 Thread Denis Kirjanov
We have to return the boolean here if the tag presents
or not, not just ANDing the TCI with the mask which results to:

[  709.412097] test_bpf: #18 LD_VLAN_TAG_PRESENT
[  709.412245] ret 4096 != 1
[  709.412332] ret 4096 != 1
[  709.412333] FAIL (2 times)

Signed-off-by: Denis Kirjanov 
---
 arch/powerpc/net/bpf_jit_comp.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 892167b..82e82ca 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -394,10 +394,12 @@ static int bpf_jit_build_body(struct sk_filter *fp, u32 
*image,
 
PPC_LHZ_OFFS(r_A, r_skb, offsetof(struct sk_buff,
  vlan_tci));
-   if (code == (BPF_ANC | SKF_AD_VLAN_TAG))
+   if (code == (BPF_ANC | SKF_AD_VLAN_TAG)) {
PPC_ANDI(r_A, r_A, ~VLAN_TAG_PRESENT);
-   else
+   } else {
PPC_ANDI(r_A, r_A, VLAN_TAG_PRESENT);
+   PPC_SRWI(r_A, r_A, 12);
+   }
break;
case BPF_ANC | SKF_AD_QUEUE:
BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
-- 
2.0.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v2 1/2] powerpc: bpf: Use correct mask while accessing the VLAN tag

2014-06-25 Thread Denis Kirjanov
To get a full tag (and not just a VID) we should access the TCI
except the VLAN_TAG_PRESENT field (which means that 802.1q header
is present). Also ensure that the VLAN_TAG_PRESENT stay on its place

Signed-off-by: Denis Kirjanov 
---
 arch/powerpc/net/bpf_jit_comp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 6dcdade..892167b 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -390,10 +390,12 @@ static int bpf_jit_build_body(struct sk_filter *fp, u32 
*image,
case BPF_ANC | SKF_AD_VLAN_TAG:
case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 
2);
+   BUILD_BUG_ON(VLAN_TAG_PRESENT != 0x1000);
+
PPC_LHZ_OFFS(r_A, r_skb, offsetof(struct sk_buff,
  vlan_tci));
if (code == (BPF_ANC | SKF_AD_VLAN_TAG))
-   PPC_ANDI(r_A, r_A, VLAN_VID_MASK);
+   PPC_ANDI(r_A, r_A, ~VLAN_TAG_PRESENT);
else
PPC_ANDI(r_A, r_A, VLAN_TAG_PRESENT);
break;
-- 
2.0.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs

2014-06-25 Thread Sudeep Holla
From: Sudeep Holla 

This patch adds initial support for providing processor cache information
to userspace through sysfs interface. This is based on already existing
implementations(x86, ia64, s390 and powerpc) and hence the interface is
intended to be fully compatible.

The main purpose of this generic support is to avoid further code
duplication to support new architectures and also to unify all the existing
different implementations.

This implementation maintains the hierarchy of cache objects which reflects
the system's cache topology. Cache devices are instantiated as needed as
CPUs come online. The cache information is replicated per-cpu even if they are
shared. A per-cpu array of cache information maintained is used mainly for
sysfs-related book keeping.

It also implements the shared_cpu_map attribute, which is essential for
enabling both kernel and user-space to discover the system's overall cache
topology.

This patch also add the missing ABI documentation for the cacheinfo sysfs
interface already, which is well defined and widely used.

Signed-off-by: Sudeep Holla 
Cc: Greg Kroah-Hartman 
Cc: Rob Herring 
Cc: linux-...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 564 +
 include/linux/cacheinfo.h  |  56 ++
 4 files changed, 662 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu 
b/Documentation/ABI/testing/sysfs-devices-system-cpu
index acb9bfc..5827f4e 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -224,3 +224,44 @@ Description:   Parameters for the Intel P-state driver
frequency range.
 
More details can be found in 
Documentation/cpu-freq/intel-pstate.txt
+
+What:  
/sys/devices/system/cpu/cpu*/cache/index*/
+Date:  June 2014(documented, existed before August 2008)
+Contact:   Sudeep Holla 
+   Linux kernel mailing list 
+Description:   Parameters for the CPU cache attributes
+
+   attributes:
+   - writethrough: data is written to both the cache line
+   and to the block in the lower-level 
memory
+   - writeback: data is written only to the cache line and
+the modified cache line is written to main
+memory only when it is replaced
+   - writeallocate: allocate a memory location to a cache 
line
+on a cache miss because of a write
+   - readallocate: allocate a memory location to a cache 
line
+   on a cache miss because of a read
+
+   coherency_line_size: the minimum amount of data that gets 
transferred
+
+   level: the cache hierarcy in the multi-level cache configuration
+
+   number_of_sets: total number of sets in the cache, a set is a
+   collection of cache lines with the same cache 
index
+
+   physical_line_partition: number of physical cache line per 
cache tag
+
+   shared_cpu_list: the list of cpus sharing the cache
+
+   shared_cpu_map: logical cpu mask containing the list of cpus 
sharing
+   the cache
+
+   size: the total cache size in kB
+
+   type:
+   - instruction: cache that only holds instructions
+   - data: cache that only caches data
+   - unified: cache that holds both data and instructions
+
+   ways_of_associativity: degree of freedom in placing a 
particular block
+   of memory in the cache
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 04b314e..bad2ff8 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -4,7 +4,7 @@ obj-y   := component.o core.o bus.o dd.o 
syscore.o \
   driver.o class.o platform.o \
   cpu.o firmware.o init.o map.o devres.o \
   attribute_container.o transport_class.o \
-  topology.o container.o
+  topology.o container.o cacheinfo.o
 obj-$(CONFIG_DEVTMPFS) += devtmpfs.o
 obj-$(CONFIG_DMA_CMA) += dma-contiguous.o
 obj-y  += power/
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.

[PATCH 0/9] drivers: cacheinfo support

2014-06-25 Thread Sudeep Holla
From: Sudeep Holla 

This series adds a generic cacheinfo support similar to topology. The
implementation is based on x86 cacheinfo support. Currently x86, powerpc,
ia64 and s390 have their own implementations. While adding similar support
to ARM and ARM64, here is the attempt to make it generic quite similar to
topology info support. It also adds the missing ABI documentation for
the cacheinfo sysfs which is already being used.

It moves all the existing different implementations on x86, ia64, powerpc
and s390 to use the generic cacheinfo infrastructure introduced here.
These changes on non-ARM platforms are only compile tested and tested on x86.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

Since there was no objection to the idea in RFC, I am posting non-RFC
version here.

The code can be fetched from:
 git://linux-arm.org/linux-skn cacheinfo

Previous RFCs:
[1] https://lkml.org/lkml/2014/1/8/523
[2] https://lkml.org/lkml/2014/2/7/654
[3] https://lkml.org/lkml/2014/2/19/391

Cc: Greg Kroah-Hartman 
Cc: linux-i...@vger.kernel.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org

---
Sudeep Holla (9):
  drivers: base: add new class "cpu" to group cpu devices
  drivers: base: support cpu cache information interface to userspace
via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 arch/arm/include/asm/outercache.h  |  13 +
 arch/arm/kernel/Makefile   |   1 +
 arch/arm/kernel/cacheinfo.c| 249 +++
 arch/arm/mm/Kconfig|  13 +
 arch/arm/mm/cache-l2x0.c   |  10 +
 arch/arm/mm/cache-tauros2.c|  34 +
 arch/arm/mm/cache-xsc3l2.c |  15 +
 arch/arm64/kernel/Makefile |   3 +-
 arch/arm64/kernel/cacheinfo.c  | 135 
 arch/ia64/kernel/topology.c| 401 ++
 arch/powerpc/kernel/cacheinfo.c| 813 +++--
 arch/powerpc/kernel/cacheinfo.h|   8 -
 arch/powerpc/kernel/sysfs.c|  12 +-
 arch/s390/kernel/cache.c   | 388 +++---
 arch/x86/kernel/cpu/intel_cacheinfo.c  | 655 -
 drivers/base/Makefile  |   2 +-
 drivers/base/cacheinfo.c   | 564 ++
 drivers/base/core.c|  39 +-
 drivers/base/cpu.c |   7 +
 include/linux/cacheinfo.h  |  56 ++
 include/linux/cpu.h|   2 +
 22 files changed, 1590 insertions(+), 1871 deletions(-)
 create mode 100644 arch/arm/kernel/cacheinfo.c
 create mode 100644 arch/arm64/kernel/cacheinfo.c
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

-- 
1.8.3.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 6/9] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure

2014-06-25 Thread Sudeep Holla
From: Sudeep Holla 

This patch removes the redundant sysfs cacheinfo code by making use of
the newly introduced generic cacheinfo infrastructure.

Signed-off-by: Sudeep Holla 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Anshuman Khandual 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/cacheinfo.c | 813 +---
 arch/powerpc/kernel/cacheinfo.h |   8 -
 arch/powerpc/kernel/sysfs.c |  12 +-
 3 files changed, 91 insertions(+), 742 deletions(-)
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 40198d5..b871c24 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -10,38 +10,10 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include 
 #include 
-#include 
 #include 
-#include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-
-#include "cacheinfo.h"
-
-/* per-cpu object for tracking:
- * - a "cache" kobject for the top-level directory
- * - a list of "index" objects representing the cpu's local cache hierarchy
- */
-struct cache_dir {
-   struct kobject *kobj; /* bare (not embedded) kobject for cache
-  * directory */
-   struct cache_index_dir *index; /* list of index objects */
-};
-
-/* "index" object: each cpu's cache directory has an index
- * subdirectory corresponding to a cache object associated with the
- * cpu.  This object's lifetime is managed via the embedded kobject.
- */
-struct cache_index_dir {
-   struct kobject kobj;
-   struct cache_index_dir *next; /* next index in parent directory */
-   struct cache *cache;
-};
 
 /* Template for determining which OF properties to query for a given
  * cache type */
@@ -60,11 +32,6 @@ struct cache_type_info {
const char *nr_sets_prop;
 };
 
-/* These are used to index the cache_type_info array. */
-#define CACHE_TYPE_UNIFIED 0
-#define CACHE_TYPE_INSTRUCTION 1
-#define CACHE_TYPE_DATA2
-
 static const struct cache_type_info cache_type_info[] = {
{
/* PowerPC Processor binding says the [di]-cache-*
@@ -92,231 +59,83 @@ static const struct cache_type_info cache_type_info[] = {
},
 };
 
-/* Cache object: each instance of this corresponds to a distinct cache
- * in the system.  There are separate objects for Harvard caches: one
- * each for instruction and data, and each refers to the same OF node.
- * The refcount of the OF node is elevated for the lifetime of the
- * cache object.  A cache object is released when its shared_cpu_map
- * is cleared (see cache_cpu_clear).
- *
- * A cache object is on two lists: an unsorted global list
- * (cache_list) of cache objects; and a singly-linked list
- * representing the local cache hierarchy, which is ordered by level
- * (e.g. L1d -> L1i -> L2 -> L3).
- */
-struct cache {
-   struct device_node *ofnode;/* OF node for this cache, may be cpu */
-   struct cpumask shared_cpu_map; /* online CPUs using this cache */
-   int type;  /* split cache disambiguation */
-   int level; /* level not explicit in device tree */
-   struct list_head list; /* global list of cache objects */
-   struct cache *next_local;  /* next cache of >= level */
-};
-
-static DEFINE_PER_CPU(struct cache_dir *, cache_dir_pcpu);
-
-/* traversal/modification of this list occurs only at cpu hotplug time;
- * access is serialized by cpu hotplug locking
- */
-static LIST_HEAD(cache_list);
-
-static struct cache_index_dir *kobj_to_cache_index_dir(struct kobject *k)
-{
-   return container_of(k, struct cache_index_dir, kobj);
-}
-
-static const char *cache_type_string(const struct cache *cache)
+static inline int get_cacheinfo_idx(enum cache_type type)
 {
-   return cache_type_info[cache->type].name;
-}
-
-static void cache_init(struct cache *cache, int type, int level,
-  struct device_node *ofnode)
-{
-   cache->type = type;
-   cache->level = level;
-   cache->ofnode = of_node_get(ofnode);
-   INIT_LIST_HEAD(&cache->list);
-   list_add(&cache->list, &cache_list);
-}
-
-static struct cache *new_cache(int type, int level, struct device_node *ofnode)
-{
-   struct cache *cache;
-
-   cache = kzalloc(sizeof(*cache), GFP_KERNEL);
-   if (cache)
-   cache_init(cache, type, level, ofnode);
-
-   return cache;
-}
-
-static void release_cache_debugcheck(struct cache *cache)
-{
-   struct cache *iter;
-
-   list_for_each_entry(iter, &cache_list, list)
-   WARN_ONCE(iter->next_local == cache,
- "cache for %s(%s) refers to cache for %s(%s)\n",
- iter->ofnode->full_name,
- cache_type_string(iter),
- cache->ofnode->full_name,
- cache_type_string(cache));
-}
-
-static v

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Guenter Roeck

On 06/25/2014 08:52 AM, Uwe Kleine-König wrote:

Hello,

On Wed, Jun 25, 2014 at 03:40:28PM +, David Laight wrote:

From: Guenter Roeck

Actually turns out one can use __attribute_const__, and it is

static inline int __attribute_const__ cpu_corepower_flags(void)

which turns out to be widely used.

I'll change that and resubmit after testing.


You don't need to tell the compiler that for an inline function.

I didn't check for the functions in question here, but in general your
statement is wrong.

For example:

static inline unsigned int __attribute_const__ read_cpuid_id(void)
{
return readl(BASEADDR_V7M_SCB + V7M_SCB_CPUID);
}

from arch/arm/include/asm/cputype.h. The V7M_SCB_CPUID register never
changes, but there is no way gcc can deduce that.



Sigh. As I mentioned earlier, it is much easier to introduce a problem
than to fix it.

Ok, I'll leave this alone. I already spent much more time on this than
I should or have, so it is really time to move on.

Guenter

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread David Laight
From: Uwe Kleine-König 
> Hello,
> 
> On Wed, Jun 25, 2014 at 03:40:28PM +, David Laight wrote:
> > From: Guenter Roeck
> > > Actually turns out one can use __attribute_const__, and it is
> > >
> > >   static inline int __attribute_const__ cpu_corepower_flags(void)
> > >
> > > which turns out to be widely used.
> > >
> > > I'll change that and resubmit after testing.
> >
> > You don't need to tell the compiler that for an inline function.
> I didn't check for the functions in question here, but in general your
> statement is wrong.
> 
> For example:
> 
> static inline unsigned int __attribute_const__ read_cpuid_id(void)
> {
>   return readl(BASEADDR_V7M_SCB + V7M_SCB_CPUID);
> }
> 
> from arch/arm/include/asm/cputype.h. The V7M_SCB_CPUID register never
> changes, but there is no way gcc can deduce that.

Hmm... it all rather depends on the order of the optimisations and 
inlining.

I've tried to use 'restrict' on the parameters to an inline function
in an attempt to get 'noalias' - but the reverse inference never
seems to be applied.

David



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Uwe Kleine-König
Hello,

On Wed, Jun 25, 2014 at 03:40:28PM +, David Laight wrote:
> From: Guenter Roeck
> > Actually turns out one can use __attribute_const__, and it is
> > 
> > static inline int __attribute_const__ cpu_corepower_flags(void)
> > 
> > which turns out to be widely used.
> > 
> > I'll change that and resubmit after testing.
> 
> You don't need to tell the compiler that for an inline function.
I didn't check for the functions in question here, but in general your
statement is wrong.

For example:

static inline unsigned int __attribute_const__ read_cpuid_id(void)
{
return readl(BASEADDR_V7M_SCB + V7M_SCB_CPUID);
}

from arch/arm/include/asm/cputype.h. The V7M_SCB_CPUID register never
changes, but there is no way gcc can deduce that.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v5 1/1] powerpc/perf: Adjust callchain based on DWARF debug info

2014-06-25 Thread Sukadev Bhattiprolu
Jiri Olsa [jo...@redhat.com] wrote:
| 
| you could use __maybe_unused for the 'skip_idx'

Yes, here is the updated patch.
---

powerpc/perf: Adjust callchain based on DWARF debug info

When saving the callchain on Power, the kernel conservatively saves excess
entries in the callchain. A few of these entries are needed in some cases
but not others. We should use the DWARF debug information to determine
when the entries are  needed.

Eg: the value in the link register (LR) is needed only when it holds the
return address of a function. At other times it must be ignored.

If the unnecessary entries are not ignored, we end up with duplicate arcs
in the call-graphs.

Use the DWARF debug information to determine if any callchain entries
should be ignored when building call-graphs.

Callgraph before the patch:

14.67%  2234  sprintft  libc-2.18.so   [.] __random
|
--- __random
   |
   |--61.12%-- __random
   |  |
   |  |--97.15%-- rand
   |  |  do_my_sprintf
   |  |  main
   |  |  generic_start_main.isra.0
   |  |  __libc_start_main
   |  |  0x0
   |  |
   |   --2.85%-- do_my_sprintf
   | main
   | generic_start_main.isra.0
   | __libc_start_main
   | 0x0
   |
--38.88%-- rand
  |
  |--94.01%-- rand
  |  do_my_sprintf
  |  main
  |  generic_start_main.isra.0
  |  __libc_start_main
  |  0x0
  |
   --5.99%-- do_my_sprintf
 main
 generic_start_main.isra.0
 __libc_start_main
 0x0

Callgraph after the patch:

14.67%  2234  sprintft  libc-2.18.so   [.] __random
|
--- __random
   |
   |--95.93%-- rand
   |  do_my_sprintf
   |  main
   |  generic_start_main.isra.0
   |  __libc_start_main
   |  0x0
   |
--4.07%-- do_my_sprintf
  main
  generic_start_main.isra.0
  __libc_start_main
  0x0

TODO:   For split-debug info objects like glibc, we can only determine
the call-frame-address only when both .eh_frame and .debug_info
sections are available. We should be able to determin the CFA
even without the .eh_frame section.

Fix suggested by Anton Blanchard.

Thanks to valuable input on DWARF debug information from Ulrich Weigand.

Changelog[v5]
[Jiri Olsa] Avoid the new external symbol PERF_CONTEXT_IGNORE;
Revert back to previous version and to avoid impact on other
architectures, use #ifdef in machine__resolve_callchain_sample().

Changelog[v4]
Move Powerpc-specific code into a separate patch

Changelog[v3]
[Jiri Olsa] Rename function to arch_skip_callchain_idx() to be
consistent with behavior.
[Jiri Olsa] Add '__maybe_unused' tags for unused parameters.

Changelog[v2]:
Add missing dwfl_end()
Fix merge conflicts due to some unwind code

Reported-by: Maynard Johnson 
Tested-by: Maynard Johnson 
Signed-off-by: Sukadev Bhattiprolu 
---
 tools/perf/arch/powerpc/Makefile  |1 +
 tools/perf/arch/powerpc/util/skip-callchain-idx.c |  266 +
 tools/perf/config/Makefile|4 +
 tools/perf/util/callchain.h   |   13 +
 tools/perf/util/machine.c |   18 +-
 5 files changed, 300 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/skip-callchain-idx.c

diff --git a/tools/perf/arch/powerpc/Makefile b/tools/perf/arch/powerpc/Makefile
index 744e629..b92219b 100644
--- a/tools/perf/arch/powerpc/Makefile
+++ b/tools/perf/arch/powerpc/Makefile
@@ -3,3 +3,4 @@ PERF_HAVE_DWARF_REGS := 1
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/skip-callchain-idx.o
diff --git a/tools/perf/arch/powerpc/util/skip-callchain-idx.c 
b/tools/perf/arch/powerpc/util/skip-callchain-idx.c
new file mode 100644
index 000..a7c23a4
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/skip-callchain-idx.c
@@ -0,0 +1,266 @@
+/*
+ * Use DWARF

RE: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread David Laight
From: Guenter Roeck
> On 06/25/2014 07:49 AM, Uwe Kleine-Knig wrote:
> > Hello Guenter,
> >
> > On Wed, Jun 25, 2014 at 07:27:47AM -0700, Guenter Roeck wrote:
> >>> Maybe the author's intention was:
> >>>
> >>>   static inline int cpu_corepower_flags(void) __attribute__((const));
> >>>
> >>> ?
> >>> This specifies that the function has no side effects and the return value
> >>> only depends on the (here non-existing) function arguments.
> >>>
> >>
> >> Possibly, but either I am missing something or this doesn't compile.
> > You need to do a separate declaration:
> >
> > static inline int cpu_corepower_flags(void) __attribute__((const));
> > static inline int cpu_corepower_flags(void)
> > {
> > ...
> 
> Actually turns out one can use __attribute_const__, and it is
> 
>   static inline int __attribute_const__ cpu_corepower_flags(void)
> 
> which turns out to be widely used.
> 
> I'll change that and resubmit after testing.

You don't need to tell the compiler that for an inline function.

David

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Guenter Roeck

On 06/25/2014 07:49 AM, Uwe Kleine-König wrote:

Hello Guenter,

On Wed, Jun 25, 2014 at 07:27:47AM -0700, Guenter Roeck wrote:

Maybe the author's intention was:

static inline int cpu_corepower_flags(void) __attribute__((const));

?
This specifies that the function has no side effects and the return value
only depends on the (here non-existing) function arguments.



Possibly, but either I am missing something or this doesn't compile.

You need to do a separate declaration:

static inline int cpu_corepower_flags(void) __attribute__((const));
static inline int cpu_corepower_flags(void)
{
...


Actually turns out one can use __attribute_const__, and it is

static inline int __attribute_const__ cpu_corepower_flags(void)

which turns out to be widely used.

I'll change that and resubmit after testing.

Guenter

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Uwe Kleine-König
Hello Guenter,

On Wed, Jun 25, 2014 at 07:27:47AM -0700, Guenter Roeck wrote:
> >Maybe the author's intention was:
> >
> > static inline int cpu_corepower_flags(void) __attribute__((const));
> >
> >?
> >This specifies that the function has no side effects and the return value
> >only depends on the (here non-existing) function arguments.
> >
> 
> Possibly, but either I am missing something or this doesn't compile.
You need to do a separate declaration:

static inline int cpu_corepower_flags(void) __attribute__((const));
static inline int cpu_corepower_flags(void)
{
...

Does this help?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Guenter Roeck

On 06/25/2014 12:14 AM, Uwe Kleine-König wrote:

On Tue, Jun 24, 2014 at 06:05:29PM -0700, Guenter Roeck wrote:

Commit 143e1e28cb (sched: Rework sched_domain topology definition)
introduced a number of functions with a return value of 'const int'.
gcc doesn't know what to do with that and, if the kernel is compiled
with W=1, complains with the following warnings whenever sched.h
is included.

include/linux/sched.h:875:25: warning:
type qualifiers ignored on function return type
include/linux/sched.h:882:25: warning:
type qualifiers ignored on function return type
include/linux/sched.h:889:25: warning:
type qualifiers ignored on function return type
include/linux/sched.h:1002:21: warning:
type qualifiers ignored on function return type

Commits fb2aa855 (sched, ARM: Create a dedicated scheduler topology table)
and 607b45e9a (sched, powerpc: Create a dedicated topology table) introduce
the same warning in the arm and powerpc code.

Drop 'const' from the function declarations to fix the problem.

The fix for all three patches has to be applied together to avoid
compilation failures for the affected architectures.

Cc: Dietmar Eggemann 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Benjamin Herrenschmidt 
Cc: Vincent Guittot 
Signed-off-by: Guenter Roeck 
---
v2: Fix problem in all affected architectures with a single patch
 to avoid compilation errors.

  arch/arm/kernel/topology.c | 2 +-
  arch/powerpc/kernel/smp.c  | 2 +-
  include/linux/sched.h  | 8 
  3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 9d85318..e35d880 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -275,7 +275,7 @@ void store_cpu_topology(unsigned int cpuid)
cpu_topology[cpuid].socket_id, mpidr);
  }

-static inline const int cpu_corepower_flags(void)
+static inline int cpu_corepower_flags(void)
  {
return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
  }

Maybe the author's intention was:

static inline int cpu_corepower_flags(void) __attribute__((const));

?
This specifies that the function has no side effects and the return value
only depends on the (here non-existing) function arguments.



Possibly, but either I am missing something or this doesn't compile.

Guenter


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 -next 0/9] CMA: generalize CMA reserved area management code

2014-06-25 Thread Marek Szyprowski

Hello,

On 2014-06-18 22:51, Andrew Morton wrote:

On Tue, 17 Jun 2014 10:25:07 +0900 Joonsoo Kim  wrote:

v2:
   - Although this patchset looks very different with v1, the end result,
   that is, mm/cma.c is same with v1's one. So I carry Ack to patch 6-7.

This patchset is based on linux-next 20140610.

Thanks for taking care of this. I will test it with my setup and if
everything goes well, I will take it to my -next tree. If any branch
is required for anyone to continue his works on top of those patches,
let me know, I will also prepare it.

Hello,

I'm glad to hear that. :)
But, there is one concern. As you already know, I am preparing further
patches (Aggressively allocate the pages on CMA reserved memory). It
may be highly related to MM branch and also slightly depends on this CMA
changes. In this case, what is the best strategy to merge this
patchset? IMHO, Anrew's tree is more appropriate branch. If there is
no issue in this case, I am willing to develope further patches based
on your tree.

That's probably easier.  Marek, I'll merge these into -mm (and hence
-next and git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git)
and shall hold them pending you review/ack/test/etc, OK?


Ok. I've tested them and they work fine. I'm sorry that you had to wait for
me for a few days. You can now add:

Acked-and-tested-by: Marek Szyprowski 

I've also rebased my pending patches onto this set (I will send them soon).

The question is now if you want to keep the discussed patches in your 
-mm tree,

or should I take them to my -next branch. If you like to keep them, I assume
you will also take the patches which depends on the discussed changes.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V2] KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value

2014-06-25 Thread Alexander Graf


On 15.06.14 20:47, Aneesh Kumar K.V wrote:

With guests supporting Multiple page size per segment (MPSS),
hpte_page_size returns the actual page size used. Add a new function to
return base page size and use that to compare against the the page size
calculated from SLB. Without this patch a hpte lookup can fail since
we are comparing wrong page size in kvmppc_hv_find_lock_hpte.

Signed-off-by: Aneesh Kumar K.V 


Thanks, applied to for-3.16.


Alex

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v3 3/3] dmaengine: mpc512x: register for device tree channel lookup

2014-06-25 Thread Alexander Popov
Register the controller for device tree based lookup of DMA channels
(non-fatal for backwards compatibility with older device trees) and
provide the '#dma-cells' property in the shared mpc5121.dtsi file

Signed-off-by: Alexander Popov 
---
 arch/powerpc/boot/dts/mpc5121.dtsi |  1 +
 drivers/dma/mpc512x_dma.c  | 13 -
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/boot/dts/mpc5121.dtsi 
b/arch/powerpc/boot/dts/mpc5121.dtsi
index 2c0e155..7f9d14f 100644
--- a/arch/powerpc/boot/dts/mpc5121.dtsi
+++ b/arch/powerpc/boot/dts/mpc5121.dtsi
@@ -498,6 +498,7 @@
compatible = "fsl,mpc5121-dma";
reg = <0x14000 0x1800>;
interrupts = <65 0x8>;
+   #dma-cells = <1>;
};
};
 
diff --git a/drivers/dma/mpc512x_dma.c b/drivers/dma/mpc512x_dma.c
index 2ad4373..881db2b 100644
--- a/drivers/dma/mpc512x_dma.c
+++ b/drivers/dma/mpc512x_dma.c
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1036,7 +1037,15 @@ static int mpc_dma_probe(struct platform_device *op)
if (retval)
goto err_free2;
 
-   return retval;
+   /* Register with OF helpers for DMA lookups (nonfatal) */
+   if (dev->of_node) {
+   retval = of_dma_controller_register(dev->of_node,
+   of_dma_xlate_by_chan_id, mdma);
+   if (retval)
+   dev_warn(dev, "Could not register for OF lookup\n");
+   }
+
+   return 0;
 
 err_free2:
if (mdma->is_mpc8308)
@@ -1057,6 +1066,8 @@ static int mpc_dma_remove(struct platform_device *op)
struct device *dev = &op->dev;
struct mpc_dma *mdma = dev_get_drvdata(dev);
 
+   if (dev->of_node)
+   of_dma_controller_free(dev->of_node);
dma_async_device_unregister(&mdma->dma);
if (mdma->is_mpc8308) {
free_irq(mdma->irq2, mdma);
-- 
1.8.4.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v3 2/3] dmaengine: of: add common xlate function for matching by channel id

2014-06-25 Thread Alexander Popov
This patch adds a new common OF dma xlate callback function which will match a
channel by it's id. The binding expects one integer argument which it will use 
to
lookup the channel by the id.

Unlike of_dma_simple_xlate this function is able to handle a system with
multiple DMA controllers. When registering the of dma provider with
of_dma_controller_register a pointer to the dma_device struct which is
associated with the dt node needs to passed as the data parameter.
New function will use this pointer to match only channels which belong to the
specified DMA controller.

Signed-off-by: Alexander Popov 
---
 drivers/dma/of-dma.c   | 35 +++
 include/linux/of_dma.h |  4 
 2 files changed, 39 insertions(+)

diff --git a/drivers/dma/of-dma.c b/drivers/dma/of-dma.c
index e8fe9dc..d5fbeaa 100644
--- a/drivers/dma/of-dma.c
+++ b/drivers/dma/of-dma.c
@@ -218,3 +218,38 @@ struct dma_chan *of_dma_simple_xlate(struct 
of_phandle_args *dma_spec,
&dma_spec->args[0]);
 }
 EXPORT_SYMBOL_GPL(of_dma_simple_xlate);
+
+/**
+ * of_dma_xlate_by_chan_id - Translate dt property to DMA channel by channel id
+ * @dma_spec:  pointer to DMA specifier as found in the device tree
+ * @of_dma:pointer to DMA controller data
+ *
+ * This function can be used as the of xlate callback for DMA driver which 
wants
+ * to match the channel based on the channel id. When using this xlate function
+ * the #dma-cells propety of the DMA controller dt node needs to be set to 1.
+ * The data parameter of of_dma_controller_register must be a pointer to the
+ * dma_device struct the function should match upon.
+ *
+ * Returns pointer to appropriate dma channel on success or NULL on error.
+ */
+struct dma_chan *of_dma_xlate_by_chan_id(struct of_phandle_args *dma_spec,
+struct of_dma *ofdma)
+{
+   struct dma_device *dev = ofdma->of_dma_data;
+   struct dma_chan *chan, *candidate = NULL;
+
+   if (!dev || dma_spec->args_count != 1)
+   return NULL;
+
+   list_for_each_entry(chan, &dev->channels, device_node)
+   if (chan->chan_id == dma_spec->args[0]) {
+   candidate = chan;
+   break;
+   }
+
+   if (!candidate)
+   return NULL;
+
+   return dma_get_slave_channel(candidate);
+}
+EXPORT_SYMBOL_GPL(of_dma_xlate_by_chan_id);
diff --git a/include/linux/of_dma.h b/include/linux/of_dma.h
index ae36298..56bc026 100644
--- a/include/linux/of_dma.h
+++ b/include/linux/of_dma.h
@@ -41,6 +41,8 @@ extern struct dma_chan *of_dma_request_slave_channel(struct 
device_node *np,
 const char *name);
 extern struct dma_chan *of_dma_simple_xlate(struct of_phandle_args *dma_spec,
struct of_dma *ofdma);
+extern struct dma_chan *of_dma_xlate_by_chan_id(struct of_phandle_args 
*dma_spec,
+   struct of_dma *ofdma);
 #else
 static inline int of_dma_controller_register(struct device_node *np,
struct dma_chan *(*of_dma_xlate)
@@ -66,6 +68,8 @@ static inline struct dma_chan *of_dma_simple_xlate(struct 
of_phandle_args *dma_s
return NULL;
 }
 
+#define of_dma_xlate_by_chan_id NULL
+
 #endif
 
 #endif /* __LINUX_OF_DMA_H */
-- 
1.8.4.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v3 1/3] dmaengine: mpc512x: add device tree binding document

2014-06-25 Thread Alexander Popov
Introduce a device tree binding document for the MPC512x DMA controller

Signed-off-by: Alexander Popov 
---
 .../devicetree/bindings/dma/mpc512x-dma.txt| 29 ++
 1 file changed, 29 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/mpc512x-dma.txt

diff --git a/Documentation/devicetree/bindings/dma/mpc512x-dma.txt 
b/Documentation/devicetree/bindings/dma/mpc512x-dma.txt
new file mode 100644
index 000..a6511df
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/mpc512x-dma.txt
@@ -0,0 +1,29 @@
+* Freescale MPC512x and MPC8308 DMA Controller
+
+The DMA controller in Freescale MPC512x and MPC8308 SoCs can move
+blocks of memory contents between memory and peripherals or
+from memory to memory.
+
+Refer to "Generic DMA Controller and DMA request bindings" in
+the dma/dma.txt file for a more detailed description of binding.
+
+Required properties:
+- compatible: should be "fsl,mpc5121-dma" or "fsl,mpc8308-dma";
+- reg: should contain the DMA controller registers location and length;
+- interrupt for the DMA controller: syntax of interrupt client node
+   is described in interrupt-controller/interrupts.txt file.
+- #dma-cells: the length of the DMA specifier, must be <1>.
+   Each channel of this DMA controller has a peripheral request line,
+   the assignment is fixed in hardware. This one cell
+   in dmas property of a client device represents the channel number.
+
+Example:
+
+   dma0: dma@14000 {
+   compatible = "fsl,mpc5121-dma";
+   reg = <0x14000 0x1800>;
+   interrupts = <65 0x8>;
+   #dma-cells = <1>;
+   };
+
+DMA clients must use the format described in dma/dma.txt file.
-- 
1.8.4.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v3 0/3] dmaengine: mpc512x: add device tree binding document and DMA channel lookup

2014-06-25 Thread Alexander Popov
This patch series introduces a device tree binding document for
the MPC512x DMA controller and adds device tree based DMA channel lookup
for it.

This version contains the improved device tree binding document:
#dma-cells is made a required property, as it must be according
dma/dma.txt document.

Alexander Popov (3):
  dmaengine: mpc512x: add device tree binding document
  dmaengine: of: add common xlate function for matching by channel id
  dmaengine: mpc512x: register for device tree channel lookup

 .../devicetree/bindings/dma/mpc512x-dma.txt| 29 ++
 arch/powerpc/boot/dts/mpc5121.dtsi |  1 +
 drivers/dma/mpc512x_dma.c  | 13 +++-
 drivers/dma/of-dma.c   | 35 ++
 include/linux/of_dma.h |  4 +++
 5 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/dma/mpc512x-dma.txt

-- 
1.8.4.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH V3 06/17] ppc/pnv: allocate pe->iommu_table dynamically

2014-06-25 Thread Alexey Kardashevskiy
On 06/25/2014 07:20 PM, David Laight wrote:
> From: Wei Yang
>> On Wed, Jun 25, 2014 at 02:12:34PM +1000, Alexey Kardashevskiy wrote:
>>> On 06/25/2014 11:12 AM, Wei Yang wrote:
 On Tue, Jun 24, 2014 at 08:06:32PM +1000, Alexey Kardashevskiy wrote:
> On 06/10/2014 11:56 AM, Wei Yang wrote:
>> Current iommu_table of a PE is a static field. This will have a problem 
>> when
>> iommu_free_table is called.
>
> What kind of problem? This table is per PE and PE is not going anywhere.
>

 Yes, for Bus PE, they will always sit in the system. When VF PE introduced,
 they could be released on the fly. When they are released, so do the iommu
 table for the PE.
>>>
>>> iommu_table is a part of PE struct. When PE is released, iommu_table will
>>> go with it as well. Why to make is a pointer? I would understand it if you
>>> added reference counting there but no - iommu_table's lifetime is equal to
>>> PE lifetime.
>>>
>>
>> Yes, iommu_talbe's life time equals to PE lifetime, so when releasing a PE we
>> need to release the iommu table. Currently, there is one function to release
>> the iommu table, iommu_free_table() which takes a pointer of the iommu_table
>> and release it.
>>
>> If the iommu table in PE is just a part of PE, it will have some problem to
>> release it with iommu_free_table(). That's why I make it a pointer in PE
>> structure.
> 
> What are the sizes of the iommu table and the PE structure?

This is all about iommu_table struct (which is just a descriptor), not
IOMMU table per se (which may be megabytes) :)


> If the table is a round number of pages then you probably don't want to
> embed it inside the PE structure.




-- 
Alexey
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH v2] fsl-rio: add support for mapping inbound windows

2014-06-25 Thread gang....@freescale.com

> Subject: [PATCH v2] fsl-rio: add support for mapping inbound windows
> 
> From: Martijn de Gouw 
> 
> Add support for mapping and unmapping of inbound rapidio windows.
> 
> Signed-off-by: Martijn de Gouw 
> ---
>  arch/powerpc/sysdev/fsl_rio.c |   92 
> +
>  arch/powerpc/sysdev/fsl_rio.h |   12 ++
>  2 files changed, 104 insertions(+)

> + /* check for conflicting ranges */
> + for (i = 0; i < RIO_INB_ATMU_COUNT; i++) {
> + riwar = in_be32(&priv->inb_atmu_regs[i].riwar);
> + if ((riwar & RIWAR_ENABLE) == 0)
> + continue;
> + win_start = ((u64)(in_be32(&priv->inb_atmu_regs[i].riwbar) &
> RIWBAR_BADD_MASK))
> + << RIWBAR_BADD_VAL_SHIFT;
> + win_end = win_start + ((1 << ((riwar & RIWAR_SIZE_MASK) + 1)) - 
> 1);
> + if (rstart < win_end && (rstart + size) > win_start)
> + return -EINVAL;
> + }

For the inbound window, the base address must be aligned based on the
size selected in the window size bits. So I think it will be better
if you can check the alignment.


> @@ -598,6 +687,8 @@ int fsl_rio_setup(struct platform_device *dev)
>   RIO_ATMU_REGS_PORT2_OFFSET));
> 
>   priv->maint_atmu_regs = priv->atmu_regs + 1;
> + priv->inb_atmu_regs = (struct rio_inb_atmu_regs *)
> + (priv->regs_win + RIO_INB_ATMU_REGS_OFFSET);

The RIO_INB_ATMU_REGS_OFFSET is just for port 1, I think the port 2
also should be supported.

Best Regards,
Liu Gang
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH V3 06/17] ppc/pnv: allocate pe->iommu_table dynamically

2014-06-25 Thread Wei Yang
On Wed, Jun 25, 2014 at 09:20:11AM +, David Laight wrote:
>From: Wei Yang
>> On Wed, Jun 25, 2014 at 02:12:34PM +1000, Alexey Kardashevskiy wrote:
>> >On 06/25/2014 11:12 AM, Wei Yang wrote:
>> >> On Tue, Jun 24, 2014 at 08:06:32PM +1000, Alexey Kardashevskiy wrote:
>> >>> On 06/10/2014 11:56 AM, Wei Yang wrote:
>>  Current iommu_table of a PE is a static field. This will have a problem 
>>  when
>>  iommu_free_table is called.
>> >>>
>> >>> What kind of problem? This table is per PE and PE is not going anywhere.
>> >>>
>> >>
>> >> Yes, for Bus PE, they will always sit in the system. When VF PE 
>> >> introduced,
>> >> they could be released on the fly. When they are released, so do the iommu
>> >> table for the PE.
>> >
>> >iommu_table is a part of PE struct. When PE is released, iommu_table will
>> >go with it as well. Why to make is a pointer? I would understand it if you
>> >added reference counting there but no - iommu_table's lifetime is equal to
>> >PE lifetime.
>> >
>> 
>> Yes, iommu_talbe's life time equals to PE lifetime, so when releasing a PE we
>> need to release the iommu table. Currently, there is one function to release
>> the iommu table, iommu_free_table() which takes a pointer of the iommu_table
>> and release it.
>> 
>> If the iommu table in PE is just a part of PE, it will have some problem to
>> release it with iommu_free_table(). That's why I make it a pointer in PE
>> structure.
>
>What are the sizes of the iommu table and the PE structure?

I calculated it in my mind, the size of iommu_table, defined in
arch/powerpc/include/asm/iommu.h is 256 bytes.

>If the table is a round number of pages then you probably don't want to
>embed it inside the PE structure.

If my understanding is correct, the iommu table structure size is not that
big.

>
>   David
>

-- 
Richard Yang
Help you, Help me

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [RFC PATCH V3 06/17] ppc/pnv: allocate pe->iommu_table dynamically

2014-06-25 Thread David Laight
From: Wei Yang
> On Wed, Jun 25, 2014 at 02:12:34PM +1000, Alexey Kardashevskiy wrote:
> >On 06/25/2014 11:12 AM, Wei Yang wrote:
> >> On Tue, Jun 24, 2014 at 08:06:32PM +1000, Alexey Kardashevskiy wrote:
> >>> On 06/10/2014 11:56 AM, Wei Yang wrote:
>  Current iommu_table of a PE is a static field. This will have a problem 
>  when
>  iommu_free_table is called.
> >>>
> >>> What kind of problem? This table is per PE and PE is not going anywhere.
> >>>
> >>
> >> Yes, for Bus PE, they will always sit in the system. When VF PE introduced,
> >> they could be released on the fly. When they are released, so do the iommu
> >> table for the PE.
> >
> >iommu_table is a part of PE struct. When PE is released, iommu_table will
> >go with it as well. Why to make is a pointer? I would understand it if you
> >added reference counting there but no - iommu_table's lifetime is equal to
> >PE lifetime.
> >
> 
> Yes, iommu_talbe's life time equals to PE lifetime, so when releasing a PE we
> need to release the iommu table. Currently, there is one function to release
> the iommu table, iommu_free_table() which takes a pointer of the iommu_table
> and release it.
> 
> If the iommu table in PE is just a part of PE, it will have some problem to
> release it with iommu_free_table(). That's why I make it a pointer in PE
> structure.

What are the sizes of the iommu table and the PE structure?
If the table is a round number of pages then you probably don't want to
embed it inside the PE structure.

David

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH V3 06/17] ppc/pnv: allocate pe->iommu_table dynamically

2014-06-25 Thread Wei Yang
On Wed, Jun 25, 2014 at 05:56:37PM +1000, Benjamin Herrenschmidt wrote:
>On Wed, 2014-06-25 at 17:50 +1000, Alexey Kardashevskiy wrote:
>
>> > Yes, iommu_talbe's life time equals to PE lifetime, so when releasing a PE 
>> > we
>> > need to release the iommu table. Currently, there is one function to 
>> > release
>> > the iommu table, iommu_free_table() which takes a pointer of the 
>> > iommu_table
>> > and release it.
>> > 
>> > If the iommu table in PE is just a part of PE, it will have some problem to
>> > release it with iommu_free_table(). That's why I make it a pointer in PE
>> > structure.
>> 
>> So you are saying that you want to release PE by one kfree() and release
>> iommu_table by another kfree (embedded into iommu_free_table()). For me
>> that means that PE and iommu_table have different lifetime.
>> 
>> And I cannot find the exact place in this patchset where you call
>> iommu_free_table(), what do I miss?
>
>He has a point though... iommu_free_table() does a whole bunch of things
>in addition to kfree at the end.
>
>This is a discrepancy in the iommu.c code, we don't allocate the table,
>it's allocated by our callers, but we do free it in iommu_free_table().
>
>My gut feeling is that we should fix that in the core by moving the
>kfree() out of iommu_free_table() and back into vio.c and
>pseries/iommu.c, the only two callers, otherwise we can't wrap the table
>structure inside another object if we are going to ever free it.
>

Yes, this is another option. Move the kfree() outside could keep some logic in
current code, like in pnv_pci_ioda_tce_invalidate(). We could get the tbl from
a PE structure directly, instead of adding a field in tbl to point to the PE
structure.

>Cheers,
>Ben.
>
>
>

-- 
Richard Yang
Help you, Help me

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH V3 06/17] ppc/pnv: allocate pe->iommu_table dynamically

2014-06-25 Thread Wei Yang
On Wed, Jun 25, 2014 at 05:50:08PM +1000, Alexey Kardashevskiy wrote:
>On 06/25/2014 03:27 PM, Wei Yang wrote:
>> On Wed, Jun 25, 2014 at 02:12:34PM +1000, Alexey Kardashevskiy wrote:
>>> On 06/25/2014 11:12 AM, Wei Yang wrote:
 On Tue, Jun 24, 2014 at 08:06:32PM +1000, Alexey Kardashevskiy wrote:
> On 06/10/2014 11:56 AM, Wei Yang wrote:
>> Current iommu_table of a PE is a static field. This will have a problem 
>> when
>> iommu_free_table is called.
>
> What kind of problem? This table is per PE and PE is not going anywhere.
>

 Yes, for Bus PE, they will always sit in the system. When VF PE introduced,
 they could be released on the fly. When they are released, so do the iommu
 table for the PE.
>>>
>>> iommu_table is a part of PE struct. When PE is released, iommu_table will
>>> go with it as well. Why to make is a pointer? I would understand it if you
>>> added reference counting there but no - iommu_table's lifetime is equal to
>>> PE lifetime.
>>>
>> 
>> Yes, iommu_talbe's life time equals to PE lifetime, so when releasing a PE we
>> need to release the iommu table. Currently, there is one function to release
>> the iommu table, iommu_free_table() which takes a pointer of the iommu_table
>> and release it.
>> 
>> If the iommu table in PE is just a part of PE, it will have some problem to
>> release it with iommu_free_table(). That's why I make it a pointer in PE
>> structure.
>
>So you are saying that you want to release PE by one kfree() and release
>iommu_table by another kfree (embedded into iommu_free_table()). For me
>that means that PE and iommu_table have different lifetime.
>

Hmm... it is right, the lifetime of these two may have some difference.

>And I cannot find the exact place in this patchset where you call
>iommu_free_table(), what do I miss?
>

This is called in pnv_pci_release_dev_dma(), which is introduced in the commit
cd740988: powerpc/powernv: allocate VF PE

>
>
>
>-- 
>Alexey

-- 
Richard Yang
Help you, Help me

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH] spi: include "int ret" with macro

2014-06-25 Thread David Laight
From: Zhao Qiang
> ret is unused when CONFIG_FSL_SOC defined,
> so include it with "#ifndef CONFIG_FSL_SOC".
> 
> Signed-off-by: Zhao Qiang 
> ---
>  drivers/spi/spi-fsl-lib.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/spi/spi-fsl-lib.c b/drivers/spi/spi-fsl-lib.c
> index e5d45fc..44aace1 100644
> --- a/drivers/spi/spi-fsl-lib.c
> +++ b/drivers/spi/spi-fsl-lib.c
> @@ -198,8 +198,9 @@ int of_mpc8xxx_spi_probe(struct platform_device *ofdev)
>   struct mpc8xxx_spi_probe_info *pinfo;
>   struct fsl_spi_platform_data *pdata;
>   const void *prop;
> +#ifndef CONFIG_FSL_SOC
>   int ret = -ENOMEM;
> -
> +#endif

You are removing the blank line after the definition of the locals,
and the initialiser isn't needed.

>   pinfo = devm_kzalloc(&ofdev->dev, sizeof(*pinfo), GFP_KERNEL);
>   if (!pinfo)
>   return -ENOMEM;
> --

I think it might be preferable to define 'ret' inside the conditional
where it is used - which requires an extra {...} block.

A 'sneaky' way to avoid the warning is to 'return ret' when the kzalloc() fails.

David


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH V3 06/17] ppc/pnv: allocate pe->iommu_table dynamically

2014-06-25 Thread Benjamin Herrenschmidt
On Wed, 2014-06-25 at 17:50 +1000, Alexey Kardashevskiy wrote:

> > Yes, iommu_talbe's life time equals to PE lifetime, so when releasing a PE 
> > we
> > need to release the iommu table. Currently, there is one function to release
> > the iommu table, iommu_free_table() which takes a pointer of the iommu_table
> > and release it.
> > 
> > If the iommu table in PE is just a part of PE, it will have some problem to
> > release it with iommu_free_table(). That's why I make it a pointer in PE
> > structure.
> 
> So you are saying that you want to release PE by one kfree() and release
> iommu_table by another kfree (embedded into iommu_free_table()). For me
> that means that PE and iommu_table have different lifetime.
> 
> And I cannot find the exact place in this patchset where you call
> iommu_free_table(), what do I miss?

He has a point though... iommu_free_table() does a whole bunch of things
in addition to kfree at the end.

This is a discrepancy in the iommu.c code, we don't allocate the table,
it's allocated by our callers, but we do free it in iommu_free_table().

My gut feeling is that we should fix that in the core by moving the
kfree() out of iommu_free_table() and back into vio.c and
pseries/iommu.c, the only two callers, otherwise we can't wrap the table
structure inside another object if we are going to ever free it.

Cheers,
Ben.




___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc: Fix build warning

2014-06-25 Thread Geert Uytterhoeven
On Tue, Jun 24, 2014 at 8:01 AM, Guenter Roeck  wrote:
> Sigh. Much easier to break something than to fix it. That would mean to get
> approval
> from at least three maintainers, and all that to get rid of a warning. I
> don't
> really have time for that. Let's just forget about it and live with the
> warning.

So you send it to akpm. Or perhaps even trivial.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH V3 06/17] ppc/pnv: allocate pe->iommu_table dynamically

2014-06-25 Thread Alexey Kardashevskiy
On 06/25/2014 03:27 PM, Wei Yang wrote:
> On Wed, Jun 25, 2014 at 02:12:34PM +1000, Alexey Kardashevskiy wrote:
>> On 06/25/2014 11:12 AM, Wei Yang wrote:
>>> On Tue, Jun 24, 2014 at 08:06:32PM +1000, Alexey Kardashevskiy wrote:
 On 06/10/2014 11:56 AM, Wei Yang wrote:
> Current iommu_table of a PE is a static field. This will have a problem 
> when
> iommu_free_table is called.

 What kind of problem? This table is per PE and PE is not going anywhere.

>>>
>>> Yes, for Bus PE, they will always sit in the system. When VF PE introduced,
>>> they could be released on the fly. When they are released, so do the iommu
>>> table for the PE.
>>
>> iommu_table is a part of PE struct. When PE is released, iommu_table will
>> go with it as well. Why to make is a pointer? I would understand it if you
>> added reference counting there but no - iommu_table's lifetime is equal to
>> PE lifetime.
>>
> 
> Yes, iommu_talbe's life time equals to PE lifetime, so when releasing a PE we
> need to release the iommu table. Currently, there is one function to release
> the iommu table, iommu_free_table() which takes a pointer of the iommu_table
> and release it.
> 
> If the iommu table in PE is just a part of PE, it will have some problem to
> release it with iommu_free_table(). That's why I make it a pointer in PE
> structure.

So you are saying that you want to release PE by one kfree() and release
iommu_table by another kfree (embedded into iommu_free_table()). For me
that means that PE and iommu_table have different lifetime.

And I cannot find the exact place in this patchset where you call
iommu_free_table(), what do I miss?




-- 
Alexey
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v5 1/1] powerpc/perf: Adjust callchain based on DWARF debug info

2014-06-25 Thread Jiri Olsa
On Tue, Jun 24, 2014 at 05:00:52PM -0700, Sukadev Bhattiprolu wrote:
> [PATCH v5 1/1] powerpc/perf: Adjust callchain based on DWARF debug info

superflous ^^^

> 
> When saving the callchain on Power, the kernel conservatively saves excess
> entries in the callchain. A few of these entries are needed in some cases
> but not others. We should use the DWARF debug information to determine
> when the entries are  needed.

SNIP

> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 0e5fea9..6221d01 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1291,7 +1291,9 @@ static int machine__resolve_callchain_sample(struct 
> machine *machine,
>   u8 cpumode = PERF_RECORD_MISC_USER;
>   int chain_nr = min(max_stack, (int)chain->nr);
>   int i;
> + int j;
>   int err;
> + int skip_idx;
>  
>   callchain_cursor_reset(&callchain_cursor);
>  
> @@ -1300,14 +1302,26 @@ static int machine__resolve_callchain_sample(struct 
> machine *machine,
>   return 0;
>   }
>  
> + /*
> +  * Based on DWARF debug information, some architectures skip
> +  * a callchain entry saved by the kernel.
> +  */
> + skip_idx = arch_skip_callchain_idx(machine, thread, chain);
> +
>   for (i = 0; i < chain_nr; i++) {
>   u64 ip;
>   struct addr_location al;
>  
>   if (callchain_param.order == ORDER_CALLEE)
> - ip = chain->ips[i];
> + j = i;
>   else
> - ip = chain->ips[chain->nr - i - 1];
> + j = chain->nr - i - 1;
> +
> +#ifdef HAVE_SKIP_CALLCHAIN_IDX
> + if (j == skip_idx)
> + continue;
> +#endif
> + ip = chain->ips[j];
>  
>   if (ip >= PERF_CONTEXT_MAX) {
>   switch (ip) {
> -- 
> 1.7.9.5
> 

got following compile error:

  CC   util/map.o
util/machine.c: In function ‘machine__resolve_callchain_sample’:
util/machine.c:1286:6: error: variable ‘skip_idx’ set but not used 
[-Werror=unused-but-set-variable]
  int skip_idx;
  ^
cc1: all warnings being treated as errors
make[2]: *** [util/machine.o] Error 1
make[2]: *** Waiting for unfinished jobs
make[1]: *** [all] Error 2
  test: test -x ./perf
make: *** [make_pure] Error 1


you could use __maybe_unused for the 'skip_idx'

jirka
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2] sched: Fix compiler warnings

2014-06-25 Thread Uwe Kleine-König
On Tue, Jun 24, 2014 at 06:05:29PM -0700, Guenter Roeck wrote:
> Commit 143e1e28cb (sched: Rework sched_domain topology definition)
> introduced a number of functions with a return value of 'const int'.
> gcc doesn't know what to do with that and, if the kernel is compiled
> with W=1, complains with the following warnings whenever sched.h
> is included.
> 
> include/linux/sched.h:875:25: warning:
>   type qualifiers ignored on function return type
> include/linux/sched.h:882:25: warning:
>   type qualifiers ignored on function return type
> include/linux/sched.h:889:25: warning:
>   type qualifiers ignored on function return type
> include/linux/sched.h:1002:21: warning:
>   type qualifiers ignored on function return type
> 
> Commits fb2aa855 (sched, ARM: Create a dedicated scheduler topology table)
> and 607b45e9a (sched, powerpc: Create a dedicated topology table) introduce
> the same warning in the arm and powerpc code.
> 
> Drop 'const' from the function declarations to fix the problem.
> 
> The fix for all three patches has to be applied together to avoid
> compilation failures for the affected architectures.
> 
> Cc: Dietmar Eggemann 
> Cc: Peter Zijlstra 
> Cc: Ingo Molnar 
> Cc: Benjamin Herrenschmidt 
> Cc: Vincent Guittot 
> Signed-off-by: Guenter Roeck 
> ---
> v2: Fix problem in all affected architectures with a single patch
> to avoid compilation errors.
> 
>  arch/arm/kernel/topology.c | 2 +-
>  arch/powerpc/kernel/smp.c  | 2 +-
>  include/linux/sched.h  | 8 
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 9d85318..e35d880 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -275,7 +275,7 @@ void store_cpu_topology(unsigned int cpuid)
>   cpu_topology[cpuid].socket_id, mpidr);
>  }
>  
> -static inline const int cpu_corepower_flags(void)
> +static inline int cpu_corepower_flags(void)
>  {
>   return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
>  }
Maybe the author's intention was:

static inline int cpu_corepower_flags(void) __attribute__((const));

?
This specifies that the function has no side effects and the return value
only depends on the (here non-existing) function arguments.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] Bugfix: powerpc/eeh: Create eeh sysfs entry in post_init()

2014-06-25 Thread Mike Qiu

On 06/25/2014 01:33 PM, Gavin Shan wrote:

On Tue, Jun 24, 2014 at 11:32:07PM -0400, Mike Qiu wrote:

[ cc Richard ]


Eeh sysfs entry created must be after EEH_ENABLED been set
in eeh_subsystem_flags.

In PowerNV platform, it try to create sysfs entry before
EEH_ENABLED been set, when boot up. So nothing will be
created for eeh in sysfs.


Could you please make the commit log more clear? :-)

I guess the issue is introduced by commit 2213fb1 ("
powerpc/eeh: Skip eeh sysfs when eeh is disabled"). The
commit checks EEH is enabled while creating PCI device
EEH sysfs files. If not, the sysfs files won't be created.
That's to avoid warning reported during PCI hotplug.

The problem you're reporting (if I understand completely):
You don't see the sysfs files after the system boots up.
If it's the case, you probably need following changes in
arch/powerpc/platforms/powernv/pci.c::pnv_pci_ioda_fixup().
Could you have a try with it?

#ifdef CONFIG_EEH
eeh_probe_mode_set(EEH_PROBE_MODE_DEV);
-   eeh_addr_cache_build();
eeh_init();
+   eeh_addr_cache_build();
#endif


But this was not work, as I test, see boot log below:

[0.233993] Unable to handle kernel paging request for data at 
address 0x0010

[0.234086] Faulting instruction address: 0xc0036c84
[0.234144] Oops: Kernel access of bad area, sig: 11 [#1]
[0.234188] SMP NR_CPUS=1024 NUMA PowerNV
[0.234235] Modules linked in:
[0.234282] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 3.16.0-rc1+ #61
[0.234339] task: c003bfcc ti: c003bfd0 task.ti: 
c003bfd0
[0.234405] NIP: c0036c84 LR: c0036c4c CTR: 


[0.234472] REGS: c003bfd03430 TRAP: 0300   Not tainted (3.16.0-rc1+)
[0.234528] MSR: 90009032  CR: 
44008088  XER: 
[0.234686] CFAR: c0009358 DAR: 0010 DSISR: 
4000 SOFTE: 1

GPR00: c0036c4c c003bfd036b0 c1448d58 c003bce30080
GPR04:   0001 c003bce300c8
GPR08: c003bce300e8   3030f000
GPR12: 22008042 cfee1200 c0b0e1f0 
GPR16: f0019600 0008 003f c3022280
GPR20: c0b0e058 0040 0008 0007
GPR24: c3120f80 c0b0e2d0 c13bc6f0 c003bca18400
GPR28:  c301 c003bce30080 c003bb2c3b40
[0.235582] NIP [c0036c84] .eeh_add_to_parent_pe+0x164/0x340
[0.235639] LR [c0036c4c] .eeh_add_to_parent_pe+0x12c/0x340
[0.235695] Call Trace:
[0.235719] [c003bfd036b0] [c0036c4c] 
.eeh_add_to_parent_pe+0x12c/0x340 (unreliable)
[0.235810] [c003bfd03730] [c0070ee8] 
.powernv_eeh_dev_probe+0x158/0x1d0
[0.235890] [c003bfd037c0] [c048768c] 
.pci_walk_bus+0x8c/0x120

[0.235957] [c003bfd03860] [c00341c4] .eeh_init+0xf4/0x310
[0.236025] [c003bfd03900] [c006e7a8] 
.pnv_pci_ioda_fixup+0x688/0xb30
[0.236105] [c003bfd03a60] [c0c2ee90] 
.pcibios_resource_survey+0x334/0x3f4

[0.236183] [c003bfd03b50] [c0c2e65c] .pcibios_init+0xa0/0xd4
[0.236251] [c003bfd03be0] [c000bc94] 
.do_one_initcall+0x124/0x280
[0.236329] [c003bfd03cd0] [c0c24acc] 
.kernel_init_freeable+0x250/0x348

[0.236408] [c003bfd03db0] [c000c4c4] .kernel_init+0x24/0x140
[0.236475] [c003bfd03e30] [c000a45c] 
.ret_from_kernel_thread+0x58/0x7c

[0.236553] Instruction dump:
[0.236586] 815f000c 6000 e9228890 915e000c 8129 7926f7e3 
813f0008 913e0008
[0.236698] 41820018 2fbf 419e0154 e93f0088  f93e0018 
e93f0080 4834

[0.236819] ---[ end trace e78b31e354e84859 ]---
[0.236864]
[2.236933] Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x000b


This may because  edev->pdev is set in eeh_addr_cache_build(), while 
eeh_init() use that entry.


After changed the code, the call patch:

eeh_init() >
pci_walk_bus()>
powernv_eeh_dev_probe() ->
   eeh_add_to_parent_pe()
eeh_addr_cache_build()

We can see in
eeh_add_to_parent_pe() {
..
pe->bus = eeh_dev_to_pci_dev(edev)->bus;
..
}

That is sure eeh_dev_to_pci_dev(edev) will be *NULL*, because this is 
set in  eeh_addr_cache_build()



Thanks
Mike

Eventually PowerNV/pSeries have same function call sequence:

- Set EEH probe mode
- Doing probe (with device node or PCI device)
- Build address cache.


Signed-off-by: Mike Qiu 
---
arch/powerpc/platforms/powernv/eeh-ioda.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c 
b/arch/powerpc/platforms/powernv/eeh-ioda.c
index 8ad0c5b..5f95581 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/ee