[PATCH] powerpc/mpc512x: enable USB support in defconfig

2013-06-24 Thread Anatolij Gustschin
Enable USB EHCI, mass storage and USB gadget support.

Signed-off-by: Anatolij Gustschin 
---
 arch/powerpc/configs/mpc512x_defconfig | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/configs/mpc512x_defconfig b/arch/powerpc/configs/mpc512x_defconfig
index 5b8ee80..ee853a1 100644
--- a/arch/powerpc/configs/mpc512x_defconfig
+++ b/arch/powerpc/configs/mpc512x_defconfig
@@ -102,6 +102,13 @@ CONFIG_FB=y
 CONFIG_FB_FSL_DIU=y
 # CONFIG_VGA_CONSOLE is not set
 CONFIG_FRAMEBUFFER_CONSOLE=y
+CONFIG_USB=y
+CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_EHCI_FSL=y
+# CONFIG_USB_EHCI_HCD_PPC_OF is not set
+CONFIG_USB_STORAGE=y
+CONFIG_USB_GADGET=y
+CONFIG_USB_FSL_USB2=y
 CONFIG_RTC_CLASS=y
 CONFIG_RTC_DRV_M41T80=y
 CONFIG_RTC_DRV_MPC5121=y
-- 
1.7.11.7

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH] powerpc/mpc512x: update defconfig

2013-06-24 Thread Anatolij Gustschin
From: Gerhard Sittig 

This patch does not change the content; it merely re-orders
configuration items and drops explicit options which already
apply as the default.

Signed-off-by: Gerhard Sittig 
Signed-off-by: Anatolij Gustschin 
---
 arch/powerpc/configs/mpc512x_defconfig |   20 +++-
 1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/configs/mpc512x_defconfig b/arch/powerpc/configs/mpc512x_defconfig
index 0d0d981..5b8ee80 100644
--- a/arch/powerpc/configs/mpc512x_defconfig
+++ b/arch/powerpc/configs/mpc512x_defconfig
@@ -1,7 +1,6 @@
-CONFIG_EXPERIMENTAL=y
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
-CONFIG_SPARSE_IRQ=y
+CONFIG_NO_HZ=y
 CONFIG_LOG_BUF_SHIFT=16
 CONFIG_BLK_DEV_INITRD=y
 # CONFIG_COMPAT_BRK is not set
@@ -9,6 +8,7 @@ CONFIG_SLAB=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
+CONFIG_PARTITION_ADVANCED=y
 # CONFIG_IOSCHED_CFQ is not set
 # CONFIG_PPC_CHRP is not set
 CONFIG_PPC_MPC512x=y
@@ -16,9 +16,7 @@ CONFIG_MPC5121_ADS=y
 CONFIG_MPC512x_GENERIC=y
 CONFIG_PDM360NG=y
 # CONFIG_PPC_PMAC is not set
-CONFIG_NO_HZ=y
 CONFIG_HZ_1000=y
-# CONFIG_MIGRATION is not set
 # CONFIG_SECCOMP is not set
 # CONFIG_PCI is not set
 CONFIG_NET=y
@@ -33,8 +31,6 @@ CONFIG_IP_PNP=y
 # CONFIG_INET_DIAG is not set
 # CONFIG_IPV6 is not set
 CONFIG_CAN=y
-CONFIG_CAN_RAW=y
-CONFIG_CAN_BCM=y
 CONFIG_CAN_VCAN=y
 CONFIG_CAN_MSCAN=y
 CONFIG_CAN_DEBUG_DEVICES=y
@@ -46,7 +42,6 @@ CONFIG_DEVTMPFS_MOUNT=y
 # CONFIG_FIRMWARE_IN_KERNEL is not set
 CONFIG_MTD=y
 CONFIG_MTD_CMDLINE_PARTS=y
-CONFIG_MTD_CHAR=y
 CONFIG_MTD_BLOCK=y
 CONFIG_MTD_CFI=y
 CONFIG_MTD_CFI_AMDSTD=y
@@ -60,7 +55,6 @@ CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_COUNT=1
 CONFIG_BLK_DEV_RAM_SIZE=8192
 CONFIG_BLK_DEV_XIP=y
-CONFIG_MISC_DEVICES=y
 CONFIG_EEPROM_AT24=y
 CONFIG_EEPROM_AT25=y
 CONFIG_SCSI=y
@@ -68,6 +62,7 @@ CONFIG_SCSI=y
 CONFIG_BLK_DEV_SD=y
 CONFIG_CHR_DEV_SG=y
 CONFIG_NETDEVICES=y
+CONFIG_FS_ENET=y
 CONFIG_MARVELL_PHY=y
 CONFIG_DAVICOM_PHY=y
 CONFIG_QSEMI_PHY=y
@@ -83,10 +78,6 @@ CONFIG_STE10XP=y
 CONFIG_LSI_ET1011C_PHY=y
 CONFIG_FIXED_PHY=y
 CONFIG_MDIO_BITBANG=y
-CONFIG_NET_ETHERNET=y
-CONFIG_FS_ENET=y
-# CONFIG_NETDEV_1000 is not set
-# CONFIG_NETDEV_1 is not set
 # CONFIG_WLAN is not set
 # CONFIG_INPUT_MOUSEDEV_PSAUX is not set
 CONFIG_INPUT_EVDEV=y
@@ -106,10 +97,7 @@ CONFIG_GPIO_SYSFS=y
 CONFIG_GPIO_MPC8XXX=y
 # CONFIG_HWMON is not set
 CONFIG_MEDIA_SUPPORT=y
-CONFIG_VIDEO_DEV=y
 CONFIG_VIDEO_ADV_DEBUG=y
-# CONFIG_VIDEO_HELPER_CHIPS_AUTO is not set
-CONFIG_VIDEO_SAA711X=y
 CONFIG_FB=y
 CONFIG_FB_FSL_DIU=y
 # CONFIG_VGA_CONSOLE is not set
@@ -129,9 +117,7 @@ CONFIG_TMPFS=y
 CONFIG_JFFS2_FS=y
 CONFIG_UBIFS_FS=y
 CONFIG_NFS_FS=y
-CONFIG_NFS_V3=y
 CONFIG_ROOT_NFS=y
-CONFIG_PARTITION_ADVANCED=y
 CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ISO8859_1=y
 # CONFIG_ENABLE_WARN_DEPRECATED is not set
-- 
1.7.9.5



[PATCH v2 2/2] powerpc/eeh: Avoid warning on P8

2013-06-24 Thread Gavin Shan
Replace down() with down_interruptible() to avoid the following
warning:

[c0007ba7b710] [c0014410] .__switch_to+0x1b0/0x380
[c0007ba7b7c0] [c07b408c] .__schedule+0x3ec/0x970
[c0007ba7ba50] [c07b1f24] .schedule_timeout+0x1a4/0x2b0
[c0007ba7bb30] [c07b34a4] .__down+0xa4/0x104
[c0007ba7bbf0] [c00b9230] .down+0x60/0x70
[c0007ba7bc80] [c00336d0] .eeh_event_handler+0x70/0x190
[c0007ba7bd30] [c00b1a58] .kthread+0xe8/0xf0
[c0007ba7be30] [c000a05c] .ret_from_kernel_thread+0x5c/0x8

Signed-off-by: Gavin Shan 
---
 arch/powerpc/kernel/eeh_event.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index 39bcd81..d27c5af 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -55,7 +55,8 @@ static int eeh_event_handler(void * dummy)
struct eeh_pe *pe;
 
while (!kthread_should_stop()) {
-   down(&eeh_eventlist_sem);
+   if (down_interruptible(&eeh_eventlist_sem))
+   break;
 
/* Fetch EEH event from the queue */
spin_lock_irqsave(&eeh_eventlist_lock, flags);
-- 
1.7.5.4
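
The control flow of the change above can be modelled outside the kernel.
The following is only a user-space sketch: struct fake_sem and
fake_down_interruptible() are hypothetical stand-ins for the kernel
semaphore and down_interruptible(), and event handling is reduced to a
counter. It shows the loop leaving as soon as the wait reports an
interruption instead of sleeping uninterruptibly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a semaphore: replays scripted results. */
struct fake_sem {
	const int *results;	/* 0 = acquired, nonzero = interrupted */
	size_t len;
	size_t pos;
};

/* Stand-in for down_interruptible(); not the kernel function. */
static int fake_down_interruptible(struct fake_sem *sem)
{
	assert(sem != NULL);
	if (sem->pos >= sem->len)
		return -4;	/* script exhausted: behave like -EINTR */
	return sem->results[sem->pos++];
}

/* Same shape as the patched loop: break out when the wait fails.
 * Returns the number of events handled before the interruption. */
static int event_loop(struct fake_sem *sem)
{
	int handled = 0;

	for (;;) {
		if (fake_down_interruptible(sem))
			break;
		handled++;	/* placeholder for fetching one event */
	}
	return handled;
}
```

With a script of three successful waits followed by an interruption, the
loop handles three events and then returns.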



[PATCH v2 1/2] powerpc/eeh: Remove eeh_mutex

2013-06-24 Thread Gavin Shan
Originally, eeh_mutex was introduced to protect the PE hierarchy
tree and the attached EEH devices because the EEH core could be
running with multiple threads accessing the PE hierarchy tree.
However, we now have only one kthread in the EEH core, so the
eeh_mutex is no longer needed. Just remove it.

Signed-off-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h |   14 --
 arch/powerpc/kernel/eeh.c  |3 ---
 arch/powerpc/kernel/eeh_pe.c   |   30 +-
 3 files changed, 1 insertions(+), 46 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index a0b11fb..dd65e31 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -151,7 +151,6 @@ struct eeh_ops {
 
 extern struct eeh_ops *eeh_ops;
 extern int eeh_subsystem_enabled;
-extern struct mutex eeh_mutex;
 extern raw_spinlock_t confirm_error_lock;
 extern int eeh_probe_mode;
 
@@ -173,16 +172,6 @@ static inline int eeh_probe_mode_dev(void)
return (eeh_probe_mode == EEH_PROBE_MODE_DEV);
 }
 
-static inline void eeh_lock(void)
-{
-   mutex_lock(&eeh_mutex);
-}
-
-static inline void eeh_unlock(void)
-{
-   mutex_unlock(&eeh_mutex);
-}
-
 static inline void eeh_serialize_lock(unsigned long *flags)
 {
raw_spin_lock_irqsave(&confirm_error_lock, *flags);
@@ -271,9 +260,6 @@ static inline void eeh_add_sysfs_files(struct pci_bus *bus) { }
 
 static inline void eeh_remove_bus_device(struct pci_dev *dev, int purge_pe) { }
 
-static inline void eeh_lock(void) { }
-static inline void eeh_unlock(void) { }
-
 #define EEH_POSSIBLE_ERROR(val, type) (0)
 #define EEH_IO_ERROR_VALUE(size) (-1UL)
 #endif /* CONFIG_EEH */
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 7c567be..951a632 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -103,9 +103,6 @@ EXPORT_SYMBOL(eeh_subsystem_enabled);
  */
 int eeh_probe_mode;
 
-/* Global EEH mutex */
-DEFINE_MUTEX(eeh_mutex);
-
 /* Lock to avoid races due to multiple reports of an error */
 DEFINE_RAW_SPINLOCK(confirm_error_lock);
 
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index ae75722..55943fc 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -78,9 +78,7 @@ int eeh_phb_pe_create(struct pci_controller *phb)
}
 
/* Put it into the list */
-   eeh_lock();
list_add_tail(&pe->child, &eeh_phb_pe);
-   eeh_unlock();
 
pr_debug("EEH: Add PE for PHB#%d\n", phb->global_number);
 
@@ -185,21 +183,15 @@ void *eeh_pe_dev_traverse(struct eeh_pe *root,
return NULL;
}
 
-   eeh_lock();
-
/* Traverse root PE */
for (pe = root; pe; pe = eeh_pe_next(pe, root)) {
eeh_pe_for_each_dev(pe, edev) {
ret = fn(edev, flag);
-   if (ret) {
-   eeh_unlock();
+   if (ret)
return ret;
-   }
}
}
 
-   eeh_unlock();
-
return NULL;
 }
 
@@ -305,8 +297,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 {
struct eeh_pe *pe, *parent;
 
-   eeh_lock();
-
/*
 * Search the PE has been existing or not according
 * to the PE address. If that has been existing, the
@@ -316,7 +306,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
pe = eeh_pe_get(edev);
if (pe && !(pe->type & EEH_PE_INVALID)) {
if (!edev->pe_config_addr) {
-   eeh_unlock();
pr_err("%s: PE with addr 0x%x already exists\n",
__func__, edev->config_addr);
return -EEXIST;
@@ -328,7 +317,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 
/* Put the edev to PE */
list_add_tail(&edev->list, &pe->edevs);
-   eeh_unlock();
pr_debug("EEH: Add %s to Bus PE#%x\n",
edev->dn->full_name, pe->addr);
 
@@ -347,7 +335,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
parent->type &= ~EEH_PE_INVALID;
parent = parent->parent;
}
-   eeh_unlock();
pr_debug("EEH: Add %s to Device PE#%x, Parent PE#%x\n",
edev->dn->full_name, pe->addr, pe->parent->addr);
 
@@ -357,7 +344,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
/* Create a new EEH PE */
pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
if (!pe) {
-   eeh_unlock();
pr_err("%s: out of memory!\n", __func__);
return -ENOMEM;
}
@@ -385,7 +371,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
if (!parent) {
parent = eeh_phb_pe_get(edev->phb);
if (!parent) {
-   eeh_unlock();
   

Re: [PATCH 10/10] net/tg3: Avoid delay during MMIO access

2013-06-24 Thread Benjamin Herrenschmidt
On Tue, 2013-06-25 at 13:55 +0800, Gavin Shan wrote:
> When the driver is encountering EEH errors, which might be caused
> by a frozen PCI host controller, the driver needn't keep reading
> MMIO until timeout. In that case, 0xFF's should be returned from
> hardware. Otherwise, it can possibly trigger a soft-lockup. The patch
> adds a check for that with pci_channel_offline() to avoid
> the possible soft-lockup.

Can you resend this patch "standalone" (not part of a series)
to the maintainer/author of this driver and CC the netdev list on
vger.kernel.org ?

For the CC list, check the author of the original EEH support.

Also maybe improve the explanation above explaining something like:

"When the EEH error is the result of a fenced host bridge, MMIO
accesses can be very slow (milliseconds) to timeout and return all 1's,
thus causing the driver's various timeout loops to take way too long and
trigger soft-lockup warnings (in addition to taking minutes to recover).

It might be worthwhile to check if for any of these cases,  is
a valid possible value, and if not, bail early since that means the HW
is either gone or isolated.

In the meantime, checking that the PCI channel is offline will
work around the problem".

Or something like that...

> Signed-off-by: Gavin Shan 
> ---
>  drivers/net/ethernet/broadcom/tg3.c |   36 
> +++
>  1 files changed, 36 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index c777b90..a13463e 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -744,6 +744,9 @@ static int tg3_ape_lock(struct tg3 *tp, int locknum)
>   status = tg3_ape_read32(tp, gnt + off);
>   if (status == bit)
>   break;
> + if (pci_channel_offline(tp->pdev))
> + break;
> +
>   udelay(10);
>   }
>  
> @@ -1635,6 +1638,9 @@ static void tg3_wait_for_event_ack(struct tg3 *tp)
>   for (i = 0; i < delay_cnt; i++) {
>   if (!(tr32(GRC_RX_CPU_EVENT) & GRC_RX_CPU_DRIVER_EVENT))
>   break;
> + if (pci_channel_offline(tp->pdev))
> + break;
> +
>   udelay(8);
>   }
>  }
> @@ -1813,6 +1819,9 @@ static int tg3_poll_fw(struct tg3 *tp)
>   for (i = 0; i < 200; i++) {
>   if (tr32(VCPU_STATUS) & VCPU_STATUS_INIT_DONE)
>   return 0;
> + if (pci_channel_offline(tp->pdev))
> + return -ENODEV;
> +
>   udelay(100);
>   }
>   return -ENODEV;
> @@ -1823,6 +1832,15 @@ static int tg3_poll_fw(struct tg3 *tp)
>   tg3_read_mem(tp, NIC_SRAM_FIRMWARE_MBOX, &val);
>   if (val == ~NIC_SRAM_FIRMWARE_MBOX_MAGIC1)
>   break;
> + if (pci_channel_offline(tp->pdev)) {
> + if (!tg3_flag(tp, NO_FWARE_REPORTED)) {
> + tg3_flag_set(tp, NO_FWARE_REPORTED);
> + netdev_info(tp->dev, "No firmware running\n");
> + }
> +
> + break;
> + }
> +
>   udelay(10);
>   }
>  
> @@ -3520,6 +3538,8 @@ static int tg3_pause_cpu(struct tg3 *tp, u32 cpu_base)
>   tw32(cpu_base + CPU_MODE,  CPU_MODE_HALT);
>   if (tr32(cpu_base + CPU_MODE) & CPU_MODE_HALT)
>   break;
> + if (pci_channel_offline(tp->pdev))
> + return -EBUSY;
>   }
>  
>   return (i == iters) ? -EBUSY : 0;
> @@ -8589,6 +8609,14 @@ static int tg3_stop_block(struct tg3 *tp, unsigned long ofs, u32 enable_bit, boo
>   tw32_f(ofs, val);
>  
>   for (i = 0; i < MAX_WAIT_CNT; i++) {
> + if (pci_channel_offline(tp->pdev)) {
> + dev_err(&tp->pdev->dev,
> + "tg3_stop_block device offline, "
> + "ofs=%lx enable_bit=%x\n",
> + ofs, enable_bit);
> + return -ENODEV;
> + }
> +
>   udelay(100);
>   val = tr32(ofs);
>   if ((val & enable_bit) == 0)
> @@ -8612,6 +8640,13 @@ static int tg3_abort_hw(struct tg3 *tp, bool silent)
>  
>   tg3_disable_ints(tp);
>  
> + if (pci_channel_offline(tp->pdev)) {
> + tp->rx_mode &= ~(RX_MODE_ENABLE | TX_MODE_ENABLE);
> + tp->mac_mode &= ~MAC_MODE_TDE_ENABLE;
> + err = -ENODEV;
> + goto err_no_dev;
> + }
> +
>   tp->rx_mode &= ~RX_MODE_ENABLE;
>   tw32_f(MAC_RX_MODE, tp->rx_mode);
>   udelay(10);
> @@ -8660,6 +8695,7 @@ static int tg3_abort_hw(struct tg3 *tp, bool silent)
>   err |= tg3_stop_block(tp, BUFMGR_MODE, BUFMGR_MODE_ENABLE, silent);
>   err |= tg3_stop_

Re: [PATCH 04/10] powerpc/eeh: Backends to get/set settings

2013-06-24 Thread Benjamin Herrenschmidt
On Tue, 2013-06-25 at 13:55 +0800, Gavin Shan wrote:
> When the PHB gets fenced, 0xFF's are returned from PCI config space and
> MMIO space by the hardware. Writes to them should be dropped. The
> patch introduces backends that allow setting/getting flags that
> indicate that access to PCI-CFG and MMIO should be blocked.

We can't block MMIO without massive overhead. Config space can be
blocked inside the firmware, can't it ?

Cheers,
Ben.

> Signed-off-by: Gavin Shan 
> ---
>  arch/powerpc/include/asm/eeh.h   |6 +++
>  arch/powerpc/platforms/pseries/eeh_pseries.c |   44 
> ++
>  2 files changed, 50 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index dd65e31..de821c1 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -131,6 +131,10 @@ static inline struct pci_dev *eeh_dev_to_pci_dev(struct eeh_dev *edev)
>  #define EEH_LOG_TEMP 1   /* EEH temporary error log  */
>  #define EEH_LOG_PERM 2   /* EEH permanent error log  */
>  
> +/* Settings for platforms */
> +#define EEH_SETTING_BLOCK_CFG 1   /* Blocked PCI config access */
> +#define EEH_SETTING_BLOCK_IO 2   /* Blocked MMIO access  */
> +
>  struct eeh_ops {
>   char *name;
>   int (*init)(void);
> @@ -146,6 +150,8 @@ struct eeh_ops {
>   int (*configure_bridge)(struct eeh_pe *pe);
>   int (*read_config)(struct device_node *dn, int where, int size, u32 *val);
>   int (*write_config)(struct device_node *dn, int where, int size, u32 val);
> + int (*get_setting)(int option, int *value, void *data);
> + int (*set_setting)(int option, int value, void *data);
>   int (*next_error)(struct eeh_pe **pe);
>  };
>  
> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> index 62415f2..8c9509b 100644
> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> @@ -612,6 +612,48 @@ static int pseries_eeh_write_config(struct device_node *dn, int where, int size,
>   return rtas_write_config(pdn, where, size, val);
>  }
>  
> +/**
> + * pseries_eeh_get_setting - Retrieve settings that affect EEH core
> + * @option: option
> + * @value: value
> + * @data: dependent data
> + *
> + * Retrieve the settings from the platform in order to affect the
> + * behaviour of EEH core. We don't block PCI config or MMIO access
> + * on pSeries platform.
> + */
> +static int pseries_eeh_get_setting(int option, int *value, void *data)
> +{
> + int ret = 0;
> +
> + switch (option) {
> + case EEH_SETTING_BLOCK_CFG:
> + case EEH_SETTING_BLOCK_IO:
> + *value = 0;
> + break;
> + default:
> + pr_warning("%s: Unrecognized option (%d)\n",
> +__func__, option);
> + ret = -EINVAL;
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * pseries_eeh_set_setting - Configure settings to affect EEH core
> + * @option: option
> + * @value: value
> + * @data: dependent data
> + *
> + * Configure the settings for the platform in order to affect the
> + * behaviour of EEH core.
> + */
> +static int pseries_eeh_set_setting(int option, int value, void *data)
> +{
> + return 0;
> +}
> +
>  static struct eeh_ops pseries_eeh_ops = {
>   .name   = "pseries",
>   .init   = pseries_eeh_init,
> @@ -626,6 +668,8 @@ static struct eeh_ops pseries_eeh_ops = {
>   .configure_bridge   = pseries_eeh_configure_bridge,
>   .read_config= pseries_eeh_read_config,
>   .write_config   = pseries_eeh_write_config,
> + .get_setting= pseries_eeh_get_setting,
> + .set_setting= pseries_eeh_set_setting,
>   .next_error = NULL
>  };
>  
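
For readers who want to see the get/set backend shape in isolation, here
is a small user-space sketch. It is not the kernel code: struct
setting_ops, the SETTING_* constants and the noblock_* functions are
made-up names that only mirror the quoted patch, with a backend that
never blocks anything, as on pSeries.

```c
#include <assert.h>
#include <stddef.h>

#define SETTING_BLOCK_CFG 1	/* mirrors EEH_SETTING_BLOCK_CFG */
#define SETTING_BLOCK_IO  2	/* mirrors EEH_SETTING_BLOCK_IO */

/* Hypothetical ops table with only the two new callbacks. */
struct setting_ops {
	int (*get_setting)(int option, int *value, void *data);
	int (*set_setting)(int option, int value, void *data);
};

/* Backend that reports "not blocked" for both known options. */
static int noblock_get(int option, int *value, void *data)
{
	(void)data;
	assert(value != NULL);
	switch (option) {
	case SETTING_BLOCK_CFG:
	case SETTING_BLOCK_IO:
		*value = 0;
		return 0;
	default:
		return -22;	/* -EINVAL: unrecognized option */
	}
}

/* Accepts every request but keeps no state. */
static int noblock_set(int option, int value, void *data)
{
	(void)option;
	(void)value;
	(void)data;
	return 0;
}

static const struct setting_ops noblock_ops = {
	.get_setting = noblock_get,
	.set_setting = noblock_set,
};
```

A caller goes through the table, so a platform that does keep per-PHB
state could plug in different callbacks without the caller changing.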




Re: [PATCH 03/10] powerpc/eeh: Check PCIe link after reset

2013-06-24 Thread Benjamin Herrenschmidt
On Tue, 2013-06-25 at 13:55 +0800, Gavin Shan wrote:
> * don't touch the other command bits
>  */
> -   eeh_ops->read_config(dn, PCI_COMMAND, 4, &cmd);
> -   if (edev->config_space[1] & PCI_COMMAND_PARITY)
> -   cmd |= PCI_COMMAND_PARITY;
> -   else
> -   cmd &= ~PCI_COMMAND_PARITY;
> -   if (edev->config_space[1] & PCI_COMMAND_SERR)
> -   cmd |= PCI_COMMAND_SERR;
> -   else
> -   cmd &= ~PCI_COMMAND_SERR;
> -   eeh_ops->write_config(dn, PCI_COMMAND, 4, cmd);
> +   if (pdev) {
> +   eeh_ops->write_config(dn, PCI_COMMAND, 4,
> + edev->config_space[1]);
> +   } else {

That needs a much better comment. Why are you doing that instead
of what's below ? In fact there is more to restore in a bridge
right ? (windows etc...). Do you do that ? Should we just have a
different function to restore a device vs. a bridge ?

I also don't see a need to do things differently between phyp and
powernv. Bridges inside partitions would suffer the same fate in
both cases.

Ben.

> +   eeh_ops->read_config(dn, PCI_COMMAND, 4, &cmd);
> +   if (edev->config_space[1] & PCI_COMMAND_PARITY)
> +   cmd |= PCI_COMMAND_PARITY;
> +   else
> +   cmd &= ~PCI_COMMAND_PARITY;
> +   if (edev->config_space[1] & PCI_COMMAND_SERR)
> +   cmd |= PCI_COMMAND_SERR;
> +   else
> +   cmd &= ~PCI_COMMAND_SERR;
> +   eeh_ops->write_config(dn, PCI_COMMAND, 4, cmd);
> +   }
> +
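
The read-modify-write in the else branch above restores only the parity
and SERR bits from the saved command word and leaves every other bit of
the live register alone. As a hedged illustration, the same effect can
be written as a mask operation; restore_error_bits() is a hypothetical
helper name, and the two constants carry the values used by the standard
PCI command register definitions.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_COMMAND_PARITY 0x40		/* parity error response */
#define PCI_COMMAND_SERR   0x100	/* SERR# enable */

/* Copy the PARITY and SERR bits from 'saved' into 'cmd' and keep all
 * other bits of 'cmd' unchanged. */
static uint32_t restore_error_bits(uint32_t cmd, uint32_t saved)
{
	const uint32_t mask = PCI_COMMAND_PARITY | PCI_COMMAND_SERR;

	return (cmd & ~mask) | (saved & mask);
}
```

Writing the whole saved word, as the new pdev branch does, is a
different operation: it also overwrites the bits outside the mask.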




Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL

2013-06-24 Thread Joakim Tjernlund
Scott Wood  wrote on 2013/06/25 02:51:00:
> 
> On Fri, Jul 20, 2012 at 10:37:17AM +0200, Joakim Tjernlund wrote:
> > Zang Roy-R61911  wrote on 2012/07/20 10:27:52:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: linuxppc-dev-bounces+tie-fei.zang=freescale@lists.ozlabs.org
> > > > [mailto:linuxppc-dev-bounces+tie-fei.zang=freescale@lists.ozlabs.org]
> > > > On Behalf Of Joakim Tjernlund
> > > > Sent: Friday, June 01, 2012 6:36 AM
> > > > To: Wood Scott-B07421
> > > > Cc: linuxppc-...@ozlabs.org; Dan Malek; Bob Cochran; Support
> > > > Subject: Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
> > > >
> > > > It just occurred to me that you guys have this already in your Linux SDK so
> > > > it can't be that bad.
> > > No. MSR_DE is ONLY added when using CW debug in SDK.
> > > Roy
> > >
> > 
> > Yes, and I later found that user space debugging is busted if you turn on MSR_DE in
> > kernel.
> 
> So, how should we handle the CONFIG_BDI_SWITCH patch?  It seems like it
> should at least have a warning in the kconfig help text that it breaks
> userspace debugging (to the point of causing a kernel oops if it's
> tried).  Or maybe it can deselect CONFIG_PPC_ADV_DEBUG_REGS?
> 
> It'd also be nice to keep things like this, that are a consequence of how
> external debug works on e500, separate from the Abatron-specific stuff.
> 

I was hoping the kernel would grow per-context handling of MSR_DE. Then
one could have MSR_DE on in MSR_KERNEL but off in user space (unless gdb
requests it on a per-process basis).

 Jocke


[PATCH 09/10] powerpc/eeh: Fix address catch for PowerNV

2013-06-24 Thread Gavin Shan
On the PowerNV platform, the EEH address cache isn't built correctly
because EEH devices without a bound PE are skipped. The patch
fixes that.

Signed-off-by: Gavin Shan 
---
 arch/powerpc/kernel/eeh_cache.c   |2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index 1d5d9a6..858ebea 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -194,7 +194,7 @@ static void __eeh_addr_cache_insert_dev(struct pci_dev *dev)
}
 
/* Skip any devices for which EEH is not enabled. */
-   if (!edev->pe) {
+   if (!eeh_probe_mode_dev() && !edev->pe) {
 #ifdef DEBUG
pr_info("PCI: skip building address cache for=%s - %s\n",
pci_name(dev), dn->full_name);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3e5c3d5..0ff9a3a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -998,6 +998,7 @@ static void pnv_pci_ioda_fixup(void)
pnv_pci_ioda_create_dbgfs();
 
 #ifdef CONFIG_EEH
+   eeh_probe_mode_set(EEH_PROBE_MODE_DEV);
eeh_addr_cache_build();
eeh_init();
 #endif
-- 
1.7.5.4
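
The changed condition is small enough to state as a predicate. This is
only an illustrative sketch with a made-up function name:
probe_mode_dev stands in for the result of eeh_probe_mode_dev() and
has_pe for a non-NULL edev->pe.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when a device is left out of the address cache.
 * Before the patch the test was simply !has_pe. */
static bool skip_addr_cache(bool probe_mode_dev, bool has_pe)
{
	return !probe_mode_dev && !has_pe;
}
```

With the probe mode set to EEH_PROBE_MODE_DEV before
eeh_addr_cache_build() runs, a device without a PE is no longer skipped.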



[PATCH 08/10] powerpc/powernv: Hold PCI-CFG and I/O access

2013-06-24 Thread Gavin Shan
While recovering from a fenced PHB, we need to hold off PCI-CFG and
I/O access until the complete PHB reset and BAR restore are done.
The patch addresses that.

Signed-off-by: Gavin Shan 
---
 arch/powerpc/kernel/eeh_driver.c  |   11 +++
 arch/powerpc/platforms/powernv/eeh-ioda.c |   10 --
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 0974e13..944e225 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -349,12 +349,14 @@ static void *eeh_report_failure(void *data, void *userdata)
  */
 static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 {
+   struct pci_controller *hose;
struct timeval tstamp;
int cnt, rc;
 
/* pcibios will clear the counter; save the value */
cnt = pe->freeze_count;
tstamp = pe->tstamp;
+   hose = (pe->type & EEH_PE_PHB) ? pe->phb : NULL;
 
/*
 * We don't remove the corresponding PE instances because
@@ -377,6 +379,15 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
eeh_ops->configure_bridge(pe);
eeh_pe_restore_bars(pe);
 
+   /*
+* If we're recovering fenced PHB, the PCI-CFG and I/O should
+* have been blocked. We need reenable that.
+*/
+   if (hose) {
+   eeh_ops->set_setting(EEH_SETTING_BLOCK_CFG, 0, hose);
+   eeh_ops->set_setting(EEH_SETTING_BLOCK_IO,  0, hose);
+   }
+
/* Give the system 5 seconds to finish running the user-space
 * hotplug shutdown scripts, e.g. ifdown for ethernet.  Yes,
 * this is a hack, but if we don't do this, and try to bring
diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
index 64c3d1e..23c2442 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -922,7 +922,9 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
list_for_each_entry_safe(hose, tmp,
&hose_list, list_node) {
phb = hose->private_data;
-   phb->eeh_state |= PNV_EEH_STATE_REMOVED;
+   phb->eeh_state |= (PNV_EEH_STATE_REMOVED |
+  PNV_EEH_STATE_CFG_BLOCKED |
+  PNV_EEH_STATE_IO_BLOCKED);
}
 
WARN(1, "EEH: dead IOC detected\n");
@@ -939,7 +941,9 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 
WARN(1, "EEH: dead PHB#%x detected\n",
 hose->global_number);
-   phb->eeh_state |= PNV_EEH_STATE_REMOVED;
+   phb->eeh_state |= (PNV_EEH_STATE_REMOVED |
+  PNV_EEH_STATE_CFG_BLOCKED |
+  PNV_EEH_STATE_IO_BLOCKED);
ret = 3;
goto out;
} else if (severity == OPAL_EEH_SEV_PHB_FENCED) {
@@ -948,6 +952,8 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 
WARN(1, "EEH: fenced PHB#%x detected\n",
 hose->global_number);
+   phb->eeh_state |= (PNV_EEH_STATE_CFG_BLOCKED |
+  PNV_EEH_STATE_IO_BLOCKED);
ret = 2;
goto out;
} else if (severity == OPAL_EEH_SEV_INF)
-- 
1.7.5.4
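
The state handling in this patch amounts to setting two "blocked" bits
when an error is detected and clearing them once reset and BAR restore
have finished. The sketch below models only that lifecycle. The bit
values and the names on_error() and on_reset_done() are hypothetical;
the real PNV_EEH_STATE_* definitions and the PowerNV set_setting
backend are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for the sketch. */
#define STATE_REMOVED     (1u << 1)
#define STATE_CFG_BLOCKED (1u << 2)
#define STATE_IO_BLOCKED  (1u << 3)

enum severity { SEV_PHB_DEAD, SEV_PHB_FENCED };

/* Error detected: block config and I/O access; a dead PHB is also
 * marked as removed. */
static uint32_t on_error(uint32_t state, enum severity sev)
{
	state |= STATE_CFG_BLOCKED | STATE_IO_BLOCKED;
	if (sev == SEV_PHB_DEAD)
		state |= STATE_REMOVED;
	return state;
}

/* Reset and BAR restore finished: allow access again. */
static uint32_t on_reset_done(uint32_t state)
{
	return state & ~(STATE_CFG_BLOCKED | STATE_IO_BLOCKED);
}
```

Note that clearing the blocked bits does not touch the removed bit, so a
dead PHB stays marked as removed.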



[PATCH 06/10] powerpc/eeh: Support blocked IO access

2013-06-24 Thread Gavin Shan
The patch intends to support blocking IO access. Basically, if
the EEH core detects that the IO access has been blocked on one
specific PHB, we will simply return 0xFF's for reading and drop
writing.

Signed-off-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h   |  231 +-
 arch/powerpc/include/asm/io.h|   67 +---
 arch/powerpc/kernel/eeh.c|   50 --
 arch/powerpc/platforms/powernv/eeh-powernv.c |4 +-
 4 files changed, 269 insertions(+), 83 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index de821c1..a8dd983 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -211,7 +211,9 @@ void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
 int __init eeh_init(void);
 int __init eeh_ops_register(struct eeh_ops *ops);
 int __exit eeh_ops_unregister(const char *name);
-unsigned long eeh_check_failure(const volatile void __iomem *token,
+int eeh_check_blocked_io(const volatile void __iomem *token,
+void **pedev);
+unsigned long eeh_check_failure(struct eeh_dev *edev,
unsigned long val);
 int eeh_dev_check_failure(struct eeh_dev *edev);
 void __init eeh_addr_cache_build(void);
@@ -249,7 +251,13 @@ static inline void *eeh_dev_init(struct device_node *dn, void *data)
 
 static inline void eeh_dev_phb_init_dynamic(struct pci_controller *phb) { }
 
-static inline unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val)
+int eeh_check_blocked_io(const volatile void __iomem *token,
+void **pedev)
+{
+   return 0;
+}
+
+static inline unsigned long eeh_check_failure(void *data, unsigned long val)
 {
return val;
 }
@@ -276,57 +284,99 @@ static inline void eeh_remove_bus_device(struct pci_dev *dev, int purge_pe) { }
  */
 static inline u8 eeh_readb(const volatile void __iomem *addr)
 {
-   u8 val = in_8(addr);
-   if (EEH_POSSIBLE_ERROR(val, u8))
-   return eeh_check_failure(addr, val);
+   u8 val = 0xFF;
+   void *edev;
+
+   if (!eeh_check_blocked_io(addr, &edev)) {
+   val = in_8(addr);
+   if (EEH_POSSIBLE_ERROR(val, u8))
+   return eeh_check_failure(edev, val);
+   }
+
return val;
 }
 
 static inline u16 eeh_readw(const volatile void __iomem *addr)
 {
-   u16 val = in_le16(addr);
-   if (EEH_POSSIBLE_ERROR(val, u16))
-   return eeh_check_failure(addr, val);
+   u16 val = 0xFFFF;
+   void *edev;
+
+   if (!eeh_check_blocked_io(addr, &edev)) {
+   val = in_le16(addr);
+   if (EEH_POSSIBLE_ERROR(val, u16))
+   return eeh_check_failure(edev, val);
+   }
+
return val;
 }
 
 static inline u32 eeh_readl(const volatile void __iomem *addr)
 {
-   u32 val = in_le32(addr);
-   if (EEH_POSSIBLE_ERROR(val, u32))
-   return eeh_check_failure(addr, val);
+   u32 val = 0xFFFFFFFF;
+   void *edev;
+
+   if (!eeh_check_blocked_io(addr, &edev)) {
+   val = in_le32(addr);
+   if (EEH_POSSIBLE_ERROR(val, u32))
+   return eeh_check_failure(edev, val);
+   }
+
return val;
 }
 
 static inline u64 eeh_readq(const volatile void __iomem *addr)
 {
-   u64 val = in_le64(addr);
-   if (EEH_POSSIBLE_ERROR(val, u64))
-   return eeh_check_failure(addr, val);
+   u64 val = 0xFFFFFFFFFFFFFFFF;
+   void *edev;
+
+   if (!eeh_check_blocked_io(addr, &edev)) {
+   val = in_le64(addr);
+   if (EEH_POSSIBLE_ERROR(val, u64))
+   return eeh_check_failure(edev, val);
+   }
+
return val;
 }
 
 static inline u16 eeh_readw_be(const volatile void __iomem *addr)
 {
-   u16 val = in_be16(addr);
-   if (EEH_POSSIBLE_ERROR(val, u16))
-   return eeh_check_failure(addr, val);
+   u16 val = 0xFFFF;
+   void *edev;
+
+   if (!eeh_check_blocked_io(addr, &edev)) {
+   val = in_be16(addr);
+   if (EEH_POSSIBLE_ERROR(val, u16))
+   return eeh_check_failure(edev, val);
+   }
+
return val;
 }
 
 static inline u32 eeh_readl_be(const volatile void __iomem *addr)
 {
-   u32 val = in_be32(addr);
-   if (EEH_POSSIBLE_ERROR(val, u32))
-   return eeh_check_failure(addr, val);
+   u32 val = 0xFFFFFFFF;
+   void *edev;
+
+   if (!eeh_check_blocked_io(addr, &edev)) {
+   val = in_be32(addr);
+   if (EEH_POSSIBLE_ERROR(val, u32))
+   return eeh_check_failure(edev, val);
+   }
+
return val;
 }
 
 static inline u64 eeh_readq_be(const volatile void __iomem *addr)
 {
-   u64 val = in_be64(addr);
-   if (EEH_POSSIBLE_ERROR(val, u64))
-   return eeh_check_failure(addr, val);
+   

[PATCH 07/10] powerpc/powernv: Block PCI-CFG access if necessary

2013-06-24 Thread Gavin Shan
If PCI-CFG access is blocked on a specific PHB, return 0xFF's for
reads and drop writes. The patch implements that for the PowerNV
platform. The patch also removes the check on "hose == NULL"
for PCI-CFG accessors since the kernel should stop while fetching
the platform-dependent PHB (struct pnv_phb).

Signed-off-by: Gavin Shan 
---
 arch/powerpc/platforms/powernv/eeh-powernv.c |   10 ++---
 arch/powerpc/platforms/powernv/pci.c |   59 --
 arch/powerpc/platforms/powernv/pci.h |4 ++
 3 files changed, 54 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 20a7865..249798e 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -328,9 +328,9 @@ static int powernv_eeh_read_config(struct device_node *dn, int where,
 {
struct eeh_dev *edev = of_node_to_eeh_dev(dn);
struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
-   struct pci_controller *hose = edev->phb;
 
-   return hose->ops->read(dev->bus, dev->devfn, where, size, val);
+   return pnv_pci_cfg_read(dev->bus, dev->devfn,
+   where, size, val, false);
 }
 
 /**
@@ -347,11 +347,9 @@ static int powernv_eeh_write_config(struct device_node *dn, int where,
 {
struct eeh_dev *edev = of_node_to_eeh_dev(dn);
struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
-   struct pci_controller *hose = edev->phb;
 
-   hose = pci_bus_to_host(dev->bus);
-
-   return hose->ops->write(dev->bus, dev->devfn, where, size, val);
+   return pnv_pci_cfg_write(dev->bus, dev->devfn,
+where, size, val, false);
 }
 
 /**
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 1f31826..47fa921 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -255,21 +255,30 @@ static void pnv_pci_config_check_eeh(struct pnv_phb *phb, struct pci_bus *bus,
pnv_pci_handle_eeh_config(phb, pe_no);
 }
 
-static int pnv_pci_read_config(struct pci_bus *bus,
-  unsigned int devfn,
-  int where, int size, u32 *val)
+int pnv_pci_cfg_read(struct pci_bus *bus,
+unsigned int devfn,
+int where, int size,
+u32 *val, bool check)
 {
struct pci_controller *hose = pci_bus_to_host(bus);
struct pnv_phb *phb = hose->private_data;
+   u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
+   s64 rc;
 #ifdef CONFIG_EEH
struct device_node *busdn, *dn;
struct eeh_pe *phb_pe = NULL;
-#endif
-   u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
-   s64 rc;
 
-   if (hose == NULL)
+   /*
+* If PCI-CFG access has been blocked, we simply
+* return 0xFF's here.
+*/
+   if (check &&
+   (phb->eeh_state & PNV_EEH_STATE_ENABLED) &&
+   (phb->eeh_state & PNV_EEH_STATE_CFG_BLOCKED)) {
+   *val = 0xFFFFFFFF;
return PCIBIOS_DEVICE_NOT_FOUND;
+   }
+#endif
 
switch (size) {
case 1: {
@@ -329,19 +338,26 @@ static int pnv_pci_read_config(struct pci_bus *bus,
return PCIBIOS_SUCCESSFUL;
 }
 
-static int pnv_pci_write_config(struct pci_bus *bus,
-   unsigned int devfn,
-   int where, int size, u32 val)
+int pnv_pci_cfg_write(struct pci_bus *bus,
+ unsigned int devfn,
+ int where, int size,
+ u32 val, bool check)
 {
struct pci_controller *hose = pci_bus_to_host(bus);
struct pnv_phb *phb = hose->private_data;
u32 bdfn = (((uint64_t)bus->number) << 8) | devfn;
 
-   if (hose == NULL)
-   return PCIBIOS_DEVICE_NOT_FOUND;
-
cfg_dbg("pnv_pci_write_config bus: %x devfn: %x +%x/%x -> %08x\n",
bus->number, devfn, where, size, val);
+
+#ifdef CONFIG_EEH
+   /* If PCI-CFG access has been blocked, drop it */
+   if (check &&
+   (phb->eeh_state & PNV_EEH_STATE_ENABLED) &&
+   (phb->eeh_state & PNV_EEH_STATE_CFG_BLOCKED))
+   return PCIBIOS_DEVICE_NOT_FOUND;
+#endif
+
switch (size) {
case 1:
opal_pci_config_write_byte(phb->opal_id, bdfn, where, val);
@@ -367,6 +383,23 @@ static int pnv_pci_write_config(struct pci_bus *bus,
return PCIBIOS_SUCCESSFUL;
 }
 
+static int pnv_pci_read_config(struct pci_bus *bus,
+  unsigned int devfn,
+  int where, int size, u32 *val)
+{
+   return pnv_pci_cfg_read(bus, devfn, where,
+   size, val, true);
+}
+
+static int pnv_pci_write_config(struct pci_bus *bus,
+   unsigned int devfn,
+   

[PATCH 03/10] powerpc/eeh: Check PCIe link after reset

2013-06-24 Thread Gavin Shan
After a reset (e.g. a complete reset) that brings a fenced PHB
back, the PCIe link might not be ready yet. The patch makes sure
the PCIe link is ready before its subordinate PCI devices are
accessed. It also fixes the wrong values that were restored to the
PCI_COMMAND register for PCI bridges.
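
The link check boils down to polling a status bit with a bounded
timeout. A minimal userspace sketch of that pattern follows, assuming a
fake port in place of real config-space reads. The names fake_port,
fake_read_lnksta and wait_link_up are hypothetical, and LNKSTA_DLLLA is
a local stand-in for PCI_EXP_LNKSTA_DLLLA.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for PCI_EXP_LNKSTA_DLLLA (Data Link Layer Link Active). */
#define LNKSTA_DLLLA 0x2000u

/* Hypothetical fake port: the link reports active after `ready_after` polls. */
struct fake_port {
	int polls;
	int ready_after;
};

static uint16_t fake_read_lnksta(struct fake_port *p)
{
	p->polls++;
	return (p->polls >= p->ready_after) ? LNKSTA_DLLLA : 0;
}

/*
 * Poll the link status once per step_ms until the link is active or
 * timeout_ms has been used up. Returns true if the link came up.
 * A real implementation would sleep for step_ms between polls (the
 * patch uses msleep(20)); the sleep is left out so the sketch runs
 * instantly.
 */
static bool wait_link_up(struct fake_port *p, int timeout_ms, int step_ms)
{
	int waited = 0;

	while (waited < timeout_ms) {
		waited += step_ms;
		if (fake_read_lnksta(p) & LNKSTA_DLLLA)
			return true;
	}
	return false;
}
```

With a 5000 ms budget and 20 ms steps the loop gives up after 250
polls, which matches the bound used in eeh_bridge_check_link().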

Signed-off-by: Gavin Shan 
---
 arch/powerpc/kernel/eeh_pe.c |  120 ++
 1 files changed, 110 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 55943fc..db83ada 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -22,6 +22,7 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -567,6 +568,88 @@ void eeh_pe_state_clear(struct eeh_pe *pe, int state)
eeh_pe_traverse(pe, __eeh_pe_state_clear, &state);
 }
 
+/*
+ * Some PCI bridges (e.g. PLX bridges) have primary/secondary
+ * buses assigned explicitly by firmware, and we probably have
+ * lost that after reset. So we have to delay the check until
+ * the PCI-CFG registers have been restored for the parent
+ * bridge.
+ *
+ * Don't use normal PCI-CFG accessors, which probably has been
+ * blocked on normal path during the stage. So we need utilize
+ * eeh operations, which is always permitted.
+ */
+static void eeh_bridge_check_link(struct device_node *dn,
+ struct pci_dev *pdev)
+{
+   int cap;
+   uint32_t val;
+   int timeout = 0;
+
+   /*
+* We only check root port and downstream ports of
+* PCIe switches
+*/
+   if (!pci_is_pcie(pdev) ||
+   (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT &&
+pci_pcie_type(pdev) != PCI_EXP_TYPE_DOWNSTREAM))
+   return;
+
+   pr_debug("%s: Check PCIe link for %s ...\n",
+__func__, pci_name(pdev));
+
+   /* Check slot status */
+   cap = pdev->pcie_cap;
+   eeh_ops->read_config(dn, cap + PCI_EXP_SLTSTA, 2, &val);
+   if (!(val & PCI_EXP_SLTSTA_PDS)) {
+   pr_debug("  No card in the slot (0x%04x) !\n", val);
+   return;
+   }
+
+   /* Check power status if we have the capability */
+   eeh_ops->read_config(dn, cap + PCI_EXP_SLTCAP, 2, &val);
+   if (val & PCI_EXP_SLTCAP_PCP) {
+   eeh_ops->read_config(dn, cap + PCI_EXP_SLTCTL, 2, &val);
+   if (val & PCI_EXP_SLTCTL_PCC) {
+   pr_debug("  In power-off state, power it on ...\n");
+   val &= ~(PCI_EXP_SLTCTL_PCC | PCI_EXP_SLTCTL_PIC);
+   val |= (0x0100 & PCI_EXP_SLTCTL_PIC);
+   eeh_ops->write_config(dn, cap + PCI_EXP_SLTCTL, 2, val);
+   msleep(2 * 1000);
+   }
+   }
+
+   /* Enable link */
+   eeh_ops->read_config(dn, cap + PCI_EXP_LNKCTL, 2, &val);
+   val &= ~PCI_EXP_LNKCTL_LD;
+   eeh_ops->write_config(dn, cap + PCI_EXP_LNKCTL, 2, val);
+
+   /* Check link */
+   eeh_ops->read_config(dn, cap + PCI_EXP_LNKCAP, 4, &val);
+   if (!(val & PCI_EXP_LNKCAP_DLLLARC)) {
+   pr_debug("  No link reporting capability (0x%08x) \n", val);
+   msleep(1000);
+   return;
+   }
+
+   /* Wait the link is up until timeout (5s) */
+   timeout = 0;
+   while (timeout < 5000) {
+   msleep(20);
+   timeout += 20;
+
+   eeh_ops->read_config(dn, cap + PCI_EXP_LNKSTA, 2, &val);
+   if (val & PCI_EXP_LNKSTA_DLLLA)
+   break;
+   }
+
+   if (val & PCI_EXP_LNKSTA_DLLLA)
+   pr_debug("  Link up (%s)\n",
+(val & PCI_EXP_LNKSTA_CLS_2_5GB) ? "2.5GB" : "5GB");
+   else
+   pr_debug("  Link not ready (0x%04x)\n", val);
+}
+
 /**
  * eeh_restore_one_device_bars - Restore the Base Address Registers for one 
device
  * @data: EEH device
@@ -580,9 +663,17 @@ static void *eeh_restore_one_device_bars(void *data, void 
*flag)
 {
int i;
u32 cmd;
+   struct pci_dev *pdev = NULL;
struct eeh_dev *edev = (struct eeh_dev *)data;
struct device_node *dn = eeh_dev_to_of_node(edev);
 
+   /* Trace the PCI bridge */
+   if (eeh_probe_mode_dev()) {
+   pdev = eeh_dev_to_pci_dev(edev);
+   if (pdev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
+   pdev = NULL;
+   }
+
for (i = 4; i < 10; i++)
eeh_ops->write_config(dn, i*4, 4, edev->config_space[i]);
/* 12 == Expansion ROM Address */
@@ -603,16 +694,25 @@ static void *eeh_restore_one_device_bars(void *data, void 
*flag)
 * Restore PERR & SERR bits, some devices require it,
 * don't touch the other command bits
 */
-   eeh_ops->read_config(dn, PCI_COMMAND, 4, &cmd);
-   if (edev->config_space[1] & PCI_COMMA

[PATCH 05/10] powerpc/powernv: Support set/get EEH settings

2013-06-24 Thread Gavin Shan
The patch implements the PowerNV backends for the set/get settings
operations. In addition, "struct pnv_phb" no longer needs multiple
fields to track the different EEH states, so the patch merges all
EEH states into one field, "eeh_state".
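
As a rough illustration of why a single state word is convenient, the
sketch below gates a config read on two bits of one field, in the
spirit of the check in pnv_pci_cfg_read(). It is a userspace model
only: the DEMO_* bit values, struct demo_phb and demo_cfg_read are
assumptions made for the example, not the definitions from pci.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions for illustration only. */
#define DEMO_EEH_ENABLED     (1u << 0)
#define DEMO_EEH_CFG_BLOCKED (1u << 1)
#define DEMO_EEH_REMOVED     (1u << 2)

struct demo_phb {
	unsigned int eeh_state;  /* all EEH state bits live in one field */
	uint32_t cfg_value;      /* what a successful config read returns */
};

/*
 * Returns 0 on success. If checking is requested and config access is
 * blocked, hands back all-ones and returns -1 without touching the
 * (simulated) hardware.
 */
static int demo_cfg_read(const struct demo_phb *phb, uint32_t *val, bool check)
{
	if (check &&
	    (phb->eeh_state & DEMO_EEH_ENABLED) &&
	    (phb->eeh_state & DEMO_EEH_CFG_BLOCKED)) {
		*val = 0xFFFFFFFFu;
		return -1;
	}

	*val = phb->cfg_value;
	return 0;
}
```

Passing check = false models the EEH-internal accessors, which are
always allowed through.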

Signed-off-by: Gavin Shan 
---
 arch/powerpc/platforms/powernv/eeh-ioda.c|   82 -
 arch/powerpc/platforms/powernv/eeh-powernv.c |   34 +++
 arch/powerpc/platforms/powernv/pci.c |4 +-
 arch/powerpc/platforms/powernv/pci.h |   12 +++-
 4 files changed, 124 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c 
b/arch/powerpc/platforms/powernv/eeh-ioda.c
index 84f3036..64c3d1e 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -132,7 +132,7 @@ static int ioda_eeh_post_init(struct pci_controller *hose)
&ioda_eeh_dbgfs_ops);
 #endif
 
-   phb->eeh_enabled = 1;
+   phb->eeh_state |= PNV_EEH_STATE_ENABLED;
}
 
return 0;
@@ -583,6 +583,78 @@ static int ioda_eeh_configure_bridge(struct eeh_pe *pe)
return 0;
 }
 
+/**
+ * ioda_eeh_set_setting - Configure the settings to affect EEH core
+ * @option: option
+ * @value: value
+ * @data: dependent data
+ *
+ * Configure the settings to affect EEH core.
+ */
+static int ioda_eeh_set_setting(int option, int value, void *data)
+{
+   struct pci_controller *hose = (struct pci_controller *)data;
+   struct pnv_phb *phb = hose->private_data;
+   int ret = 0;
+
+   switch (option) {
+   case EEH_SETTING_BLOCK_CFG:
+   if (value)
+   phb->eeh_state |= PNV_EEH_STATE_CFG_BLOCKED;
+   else
+   phb->eeh_state &= ~PNV_EEH_STATE_CFG_BLOCKED;
+   break;
+   case EEH_SETTING_BLOCK_IO:
+   if (value)
+   phb->eeh_state |= PNV_EEH_STATE_IO_BLOCKED;
+   else
+   phb->eeh_state &= ~PNV_EEH_STATE_IO_BLOCKED;
+   break;
+   default:
+   pr_warning("%s: Unrecognized option (%d)\n",
+  __func__, option);
+   ret = -EINVAL;
+   }
+
+   return ret;
+}
+
+/**
+ * ioda_eeh_get_setting - Retrieve the settings to affect EEH core
+ * @option: option
+ * @value: value
+ * @data: dependent data
+ *
+ * EEH core retrieves the settings and utilize them.
+ */
+static int ioda_eeh_get_setting(int option, int *value, void *data)
+{
+   struct pci_controller *hose = (struct pci_controller *)data;
+   struct pnv_phb *phb = hose->private_data;
+   int ret = 0;
+
+   switch (option) {
+   case EEH_SETTING_BLOCK_CFG:
+   if (phb->eeh_state & PNV_EEH_STATE_CFG_BLOCKED)
+   *value = 1;
+   else
+   *value = 0;
+   break;
+   case EEH_SETTING_BLOCK_IO:
+   if (phb->eeh_state & PNV_EEH_STATE_IO_BLOCKED)
+   *value = 1;
+   else
+   *value = 0;
+   break;
+   default:
+   pr_warning("%s: Unrecognized option (%d)\n",
+  __func__, option);
+   ret = -EINVAL;
+   }
+
+   return ret;
+}
+
 static void ioda_eeh_hub_diag_common(struct OpalIoP7IOCErrorData *data)
 {
/* GEM */
@@ -815,7 +887,7 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 * removed, we needn't take care of it any more.
 */
phb = hose->private_data;
-   if (phb->removed)
+   if (phb->eeh_state & PNV_EEH_STATE_REMOVED)
continue;
 
rc = opal_pci_next_error(phb->opal_id,
@@ -850,7 +922,7 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
list_for_each_entry_safe(hose, tmp,
&hose_list, list_node) {
phb = hose->private_data;
-   phb->removed = 1;
+   phb->eeh_state |= PNV_EEH_STATE_REMOVED;
}
 
WARN(1, "EEH: dead IOC detected\n");
@@ -867,7 +939,7 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
 
WARN(1, "EEH: dead PHB#%x detected\n",
 hose->global_number);
-   phb->removed = 1;
+   phb->eeh_state |= PNV_EEH_STATE_REMOVED;
ret = 3;
goto out;
} else if (severity == OPAL_EEH_SEV_PHB_FENCED) {
@@ -905,5 +977,7 @@ struct pnv_eeh_ops ioda_eeh_ops = {
.reset  = ioda_eeh_reset,
.get_log= ioda_eeh_get_log,
.configu

[PATCH 10/10] net/tg3: Avoid delay during MMIO access

2013-06-24 Thread Gavin Shan
When the driver encounters EEH errors, which might be caused by a
frozen PCI host controller, it needn't keep polling MMIO until the
timeout expires. In that case 0xFF's should be returned from the
hardware, and continuing to poll can trigger a soft lockup. The
patch adds pci_channel_offline() checks to the polling loops to
avoid the possible soft lockup.
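
The shape of the change can be sketched in userspace as a polling loop
that gives up as soon as the device is known to be offline. This is an
illustration under assumptions, not tg3 code: struct demo_dev,
demo_read32 and demo_poll_bit are hypothetical, and the offline flag
stands in for pci_channel_offline().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_dev {
	bool offline;   /* stand-in for pci_channel_offline() */
	uint32_t reg;   /* value a healthy device returns */
	int reads;      /* number of simulated MMIO reads issued */
};

/* A frozen device answers every read with all-ones. */
static uint32_t demo_read32(struct demo_dev *d)
{
	d->reads++;
	return d->offline ? 0xFFFFFFFFu : d->reg;
}

/*
 * Wait for `bit` to become set. Returns 0 on success, -ENODEV as soon
 * as the device is seen offline, -EBUSY if the bit never shows up.
 */
static int demo_poll_bit(struct demo_dev *d, uint32_t bit, int max_iters)
{
	int i;

	for (i = 0; i < max_iters; i++) {
		if (d->offline)
			return -ENODEV;
		if (demo_read32(d) & bit)
			return 0;
	}
	return -EBUSY;
}
```

Checking the offline state before trusting the register value matters
because an all-ones read would otherwise look like "bit set" to a loop
like this one, and like "never clears" to a loop waiting for a bit to
drop.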

Signed-off-by: Gavin Shan 
---
 drivers/net/ethernet/broadcom/tg3.c |   36 +++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c 
b/drivers/net/ethernet/broadcom/tg3.c
index c777b90..a13463e 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -744,6 +744,9 @@ static int tg3_ape_lock(struct tg3 *tp, int locknum)
status = tg3_ape_read32(tp, gnt + off);
if (status == bit)
break;
+   if (pci_channel_offline(tp->pdev))
+   break;
+
udelay(10);
}
 
@@ -1635,6 +1638,9 @@ static void tg3_wait_for_event_ack(struct tg3 *tp)
for (i = 0; i < delay_cnt; i++) {
if (!(tr32(GRC_RX_CPU_EVENT) & GRC_RX_CPU_DRIVER_EVENT))
break;
+   if (pci_channel_offline(tp->pdev))
+   break;
+
udelay(8);
}
 }
@@ -1813,6 +1819,9 @@ static int tg3_poll_fw(struct tg3 *tp)
for (i = 0; i < 200; i++) {
if (tr32(VCPU_STATUS) & VCPU_STATUS_INIT_DONE)
return 0;
+   if (pci_channel_offline(tp->pdev))
+   return -ENODEV;
+
udelay(100);
}
return -ENODEV;
@@ -1823,6 +1832,15 @@ static int tg3_poll_fw(struct tg3 *tp)
tg3_read_mem(tp, NIC_SRAM_FIRMWARE_MBOX, &val);
if (val == ~NIC_SRAM_FIRMWARE_MBOX_MAGIC1)
break;
+   if (pci_channel_offline(tp->pdev)) {
+   if (!tg3_flag(tp, NO_FWARE_REPORTED)) {
+   tg3_flag_set(tp, NO_FWARE_REPORTED);
+   netdev_info(tp->dev, "No firmware running\n");
+   }
+
+   break;
+   }
+
udelay(10);
}
 
@@ -3520,6 +3538,8 @@ static int tg3_pause_cpu(struct tg3 *tp, u32 cpu_base)
tw32(cpu_base + CPU_MODE,  CPU_MODE_HALT);
if (tr32(cpu_base + CPU_MODE) & CPU_MODE_HALT)
break;
+   if (pci_channel_offline(tp->pdev))
+   return -EBUSY;
}
 
return (i == iters) ? -EBUSY : 0;
@@ -8589,6 +8609,14 @@ static int tg3_stop_block(struct tg3 *tp, unsigned long 
ofs, u32 enable_bit, boo
tw32_f(ofs, val);
 
for (i = 0; i < MAX_WAIT_CNT; i++) {
+   if (pci_channel_offline(tp->pdev)) {
+   dev_err(&tp->pdev->dev,
+   "tg3_stop_block device offline, "
+   "ofs=%lx enable_bit=%x\n",
+   ofs, enable_bit);
+   return -ENODEV;
+   }
+
udelay(100);
val = tr32(ofs);
if ((val & enable_bit) == 0)
@@ -8612,6 +8640,13 @@ static int tg3_abort_hw(struct tg3 *tp, bool silent)
 
tg3_disable_ints(tp);
 
+   if (pci_channel_offline(tp->pdev)) {
+   tp->rx_mode &= ~(RX_MODE_ENABLE | TX_MODE_ENABLE);
+   tp->mac_mode &= ~MAC_MODE_TDE_ENABLE;
+   err = -ENODEV;
+   goto err_no_dev;
+   }
+
tp->rx_mode &= ~RX_MODE_ENABLE;
tw32_f(MAC_RX_MODE, tp->rx_mode);
udelay(10);
@@ -8660,6 +8695,7 @@ static int tg3_abort_hw(struct tg3 *tp, bool silent)
err |= tg3_stop_block(tp, BUFMGR_MODE, BUFMGR_MODE_ENABLE, silent);
err |= tg3_stop_block(tp, MEMARB_MODE, MEMARB_MODE_ENABLE, silent);
 
+err_no_dev:
for (i = 0; i < tp->irq_cnt; i++) {
struct tg3_napi *tnapi = &tp->napi[i];
if (tnapi->hw_status)
-- 
1.7.5.4


[PATCH 02/10] powerpc/eeh: Don't collect PCI-CFG data on PHB

2013-06-24 Thread Gavin Shan
When the PHB is fenced or dead, it's pointless to collect the data
from PCI config space of subordinate PCI devices since it should
return 0xFF's. Doing so also risks incurring additional errors.
The patch avoids collecting PCI-CFG data while the PHB is in the
fenced or dead state.

Signed-off-by: Gavin Shan 
---
 arch/powerpc/kernel/eeh.c |   34 --
 1 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 951a632..65320fd 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -232,16 +232,30 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int 
severity)
 {
size_t loglen = 0;
struct eeh_dev *edev;
+   bool valid_cfg_log = true;
 
-   eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
-   eeh_ops->configure_bridge(pe);
-   eeh_pe_restore_bars(pe);
-
-   pci_regs_buf[0] = 0;
-   eeh_pe_for_each_dev(pe, edev) {
-   loglen += eeh_gather_pci_data(edev, pci_regs_buf,
-   EEH_PCI_REGS_LOG_LEN);
-}
+   /*
+* When the PHB is fenced or dead, it's pointless to collect
+* the data from PCI config space because it should return
+* 0xFF's. The potential risk of that is introducing additional
+* errors.
+*/
+   if (eeh_probe_mode_dev() &&
+   (pe->type & EEH_PE_PHB) &&
+   (pe->state & (EEH_PE_ISOLATED | EEH_PE_PHB_DEAD)))
+   valid_cfg_log = false;
+
+   if (valid_cfg_log) {
+   eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
+   eeh_ops->configure_bridge(pe);
+   eeh_pe_restore_bars(pe);
+
+   pci_regs_buf[0] = 0;
+   eeh_pe_for_each_dev(pe, edev) {
+   loglen += eeh_gather_pci_data(edev, pci_regs_buf,
+ EEH_PCI_REGS_LOG_LEN);
+   }
+   }
 
eeh_ops->get_log(pe, severity, pci_regs_buf, loglen);
 }
-- 
1.7.5.4


[PATCH 04/10] powerpc/eeh: Backends to get/set settings

2013-06-24 Thread Gavin Shan
When the PHB gets fenced, the hardware returns 0xFF's from PCI config
space and MMIO space, and writes to them should be dropped. The patch
introduces backends to set and get the flags that indicate that access
to PCI-CFG and MMIO should be blocked.
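
A minimal sketch of such a get/set backend pair, written as plain
userspace C, is shown below. The callback signatures mirror the ones
added to struct eeh_ops in this patch. Everything prefixed with demo_
or DEMO_ is hypothetical, and keeping the flags in an unsigned int
passed through "data" is an assumption made for the example.

```c
#include <errno.h>

#define DEMO_SETTING_BLOCK_CFG 1
#define DEMO_SETTING_BLOCK_IO  2

#define DEMO_STATE_CFG_BLOCKED (1u << 0)
#define DEMO_STATE_IO_BLOCKED  (1u << 1)

struct demo_eeh_ops {
	int (*get_setting)(int option, int *value, void *data);
	int (*set_setting)(int option, int value, void *data);
};

/* Map an option number to its state bit, or fail with -EINVAL. */
static int demo_option_to_flag(int option, unsigned int *flag)
{
	switch (option) {
	case DEMO_SETTING_BLOCK_CFG:
		*flag = DEMO_STATE_CFG_BLOCKED;
		return 0;
	case DEMO_SETTING_BLOCK_IO:
		*flag = DEMO_STATE_IO_BLOCKED;
		return 0;
	default:
		return -EINVAL;
	}
}

static int demo_set_setting(int option, int value, void *data)
{
	unsigned int *state = data;
	unsigned int flag;
	int ret = demo_option_to_flag(option, &flag);

	if (ret)
		return ret;
	if (value)
		*state |= flag;
	else
		*state &= ~flag;
	return 0;
}

static int demo_get_setting(int option, int *value, void *data)
{
	const unsigned int *state = data;
	unsigned int flag;
	int ret = demo_option_to_flag(option, &flag);

	if (ret)
		return ret;
	*value = (*state & flag) ? 1 : 0;
	return 0;
}

static const struct demo_eeh_ops demo_ops = {
	.get_setting = demo_get_setting,
	.set_setting = demo_set_setting,
};
```

A platform that never blocks access, as pSeries does here, can keep the
same interface and simply report 0 for every known option.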

Signed-off-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h   |6 +++
 arch/powerpc/platforms/pseries/eeh_pseries.c |   44 ++
 2 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index dd65e31..de821c1 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -131,6 +131,10 @@ static inline struct pci_dev *eeh_dev_to_pci_dev(struct 
eeh_dev *edev)
 #define EEH_LOG_TEMP   1   /* EEH temporary error log  */
 #define EEH_LOG_PERM   2   /* EEH permanent error log  */
 
+/* Settings for platforms */
+#define EEH_SETTING_BLOCK_CFG  1   /* Blocked PCI config access*/
+#define EEH_SETTING_BLOCK_IO   2   /* Blocked MMIO access  */
+
 struct eeh_ops {
char *name;
int (*init)(void);
@@ -146,6 +150,8 @@ struct eeh_ops {
int (*configure_bridge)(struct eeh_pe *pe);
int (*read_config)(struct device_node *dn, int where, int size, u32 
*val);
int (*write_config)(struct device_node *dn, int where, int size, u32 
val);
+   int (*get_setting)(int option, int *value, void *data);
+   int (*set_setting)(int option, int value, void *data);
int (*next_error)(struct eeh_pe **pe);
 };
 
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c 
b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 62415f2..8c9509b 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -612,6 +612,48 @@ static int pseries_eeh_write_config(struct device_node 
*dn, int where, int size,
return rtas_write_config(pdn, where, size, val);
 }
 
+/**
+ * pseries_eeh_get_setting - Retrieve settings that affect EEH core
+ * @option: option
+ * @value: value
+ * @data: dependent data
+ *
+ * Retrieve the settings from the platform in order to affect the
+ * behaviour of EEH core. We don't block PCI config or MMIO access
+ * on pSeries platform.
+ */
+static int pseries_eeh_get_setting(int option, int *value, void *data)
+{
+   int ret = 0;
+
+   switch (option) {
+   case EEH_SETTING_BLOCK_CFG:
+   case EEH_SETTING_BLOCK_IO:
+   *value = 0;
+   break;
+   default:
+   pr_warning("%s: Unrecognized option (%d)\n",
+  __func__, option);
+   ret = -EINVAL;
+   }
+
+   return ret;
+}
+
+/**
+ * pseries_eeh_set_setting - Configure settings to affect EEH core
+ * @option: option
+ * @value: value
+ * @data: dependent data
+ *
+ * Configure the settings for the platform in order to affect the
+ * behaviour of EEH core.
+ */
+static int pseries_eeh_set_setting(int option, int value, void *data)
+{
+   return 0;
+}
+
 static struct eeh_ops pseries_eeh_ops = {
.name   = "pseries",
.init   = pseries_eeh_init,
@@ -626,6 +668,8 @@ static struct eeh_ops pseries_eeh_ops = {
.configure_bridge   = pseries_eeh_configure_bridge,
.read_config= pseries_eeh_read_config,
.write_config   = pseries_eeh_write_config,
+   .get_setting= pseries_eeh_get_setting,
+   .set_setting= pseries_eeh_set_setting,
.next_error = NULL
 };
 
-- 
1.7.5.4


[PATCH v1 00/10] powerpc/eeh: Remove eeh_mutex

2013-06-24 Thread Gavin Shan
This series is a follow-up to make EEH work on the PowerNV platform on a
Juno-IOC-L machine. A couple of issues have been fixed with help from Ben:
- eeh_lock() and eeh_unlock() were introduced to protect the PE hierarchy
  tree. However, we already have only one kthread ("eehd"), so they are not
  necessary any more.
- When a PHB gets fenced, we need to do a complete reset of the PHB in
  order to recover. However, we never checked that the downstream PCIe
  links are ready again.
- Introduce a mechanism to block access to PCI-CFG and MMIO. The hardware
  should return 0xFF's while the PHB is fenced, so we needn't access PCI-CFG
  and MMIO during that stage (before the PHB gets a complete reset).
- The EEH address cache wasn't populated on PowerNV.
- PCI-CFG for PCI bridges (PCI_COMMAND) wasn't restored correctly.
- While the PHB is fenced, the TG3 driver still tries to access MMIO in a
  loop. That's unnecessary.

The series has been verified on a Juno-IOC-L machine:

Trigger frozen PE:

echo 0x0200 > /sys/kernel/debug/powerpc/PCI/err_injct
sleep 1
echo 0x0 > /sys/kernel/debug/powerpc/PCI/err_injct

Trigger fenced PHB:

echo 0x8000 > /sys/kernel/debug/powerpc/PCI/err_injct

---

arch/powerpc/include/asm/eeh.h   |  251 --
arch/powerpc/include/asm/io.h|   67 ---
arch/powerpc/kernel/eeh.c|   85 ++---
arch/powerpc/kernel/eeh_cache.c  |2 +-
arch/powerpc/kernel/eeh_driver.c |   11 ++
arch/powerpc/kernel/eeh_event.c  |3 +-
arch/powerpc/kernel/eeh_pe.c |  150 
arch/powerpc/platforms/powernv/eeh-ioda.c|   88 +-
arch/powerpc/platforms/powernv/eeh-powernv.c |   42 -
arch/powerpc/platforms/powernv/pci-ioda.c|1 +
arch/powerpc/platforms/powernv/pci.c |   63 +--
arch/powerpc/platforms/powernv/pci.h |   16 ++-
arch/powerpc/platforms/pseries/eeh_pseries.c |   44 +
drivers/net/ethernet/broadcom/tg3.c  |   36 
14 files changed, 685 insertions(+), 174 deletions(-)

Thanks,
Gavin


[PATCH 01/10] powerpc/eeh: Remove eeh_mutex

2013-06-24 Thread Gavin Shan
Originally, eeh_mutex was introduced to protect the PE hierarchy
tree and the attached EEH devices because the EEH core could be
running with multiple threads accessing the PE hierarchy tree.
However, we now have only one kthread in the EEH core, so eeh_mutex
is no longer needed and the patch removes it. The patch also allows
the event handler to be interrupted while waiting on the EEH event
semaphore.

Signed-off-by: Gavin Shan 
---
 arch/powerpc/include/asm/eeh.h  |   14 --
 arch/powerpc/kernel/eeh.c   |3 ---
 arch/powerpc/kernel/eeh_event.c |3 ++-
 arch/powerpc/kernel/eeh_pe.c|   30 +-
 4 files changed, 3 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index a0b11fb..dd65e31 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -151,7 +151,6 @@ struct eeh_ops {
 
 extern struct eeh_ops *eeh_ops;
 extern int eeh_subsystem_enabled;
-extern struct mutex eeh_mutex;
 extern raw_spinlock_t confirm_error_lock;
 extern int eeh_probe_mode;
 
@@ -173,16 +172,6 @@ static inline int eeh_probe_mode_dev(void)
return (eeh_probe_mode == EEH_PROBE_MODE_DEV);
 }
 
-static inline void eeh_lock(void)
-{
-   mutex_lock(&eeh_mutex);
-}
-
-static inline void eeh_unlock(void)
-{
-   mutex_unlock(&eeh_mutex);
-}
-
 static inline void eeh_serialize_lock(unsigned long *flags)
 {
raw_spin_lock_irqsave(&confirm_error_lock, *flags);
@@ -271,9 +260,6 @@ static inline void eeh_add_sysfs_files(struct pci_bus *bus) 
{ }
 
 static inline void eeh_remove_bus_device(struct pci_dev *dev, int purge_pe) { }
 
-static inline void eeh_lock(void) { }
-static inline void eeh_unlock(void) { }
-
 #define EEH_POSSIBLE_ERROR(val, type) (0)
 #define EEH_IO_ERROR_VALUE(size) (-1UL)
 #endif /* CONFIG_EEH */
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 7c567be..951a632 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -103,9 +103,6 @@ EXPORT_SYMBOL(eeh_subsystem_enabled);
  */
 int eeh_probe_mode;
 
-/* Global EEH mutex */
-DEFINE_MUTEX(eeh_mutex);
-
 /* Lock to avoid races due to multiple reports of an error */
 DEFINE_RAW_SPINLOCK(confirm_error_lock);
 
diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index 39bcd81..165943e 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -55,7 +55,8 @@ static int eeh_event_handler(void * dummy)
struct eeh_pe *pe;
 
while (!kthread_should_stop()) {
-   down(&eeh_eventlist_sem);
+if (down_interruptible(&eeh_eventlist_sem))
+   break;
 
/* Fetch EEH event from the queue */
spin_lock_irqsave(&eeh_eventlist_lock, flags);
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index ae75722..55943fc 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -78,9 +78,7 @@ int eeh_phb_pe_create(struct pci_controller *phb)
}
 
/* Put it into the list */
-   eeh_lock();
list_add_tail(&pe->child, &eeh_phb_pe);
-   eeh_unlock();
 
pr_debug("EEH: Add PE for PHB#%d\n", phb->global_number);
 
@@ -185,21 +183,15 @@ void *eeh_pe_dev_traverse(struct eeh_pe *root,
return NULL;
}
 
-   eeh_lock();
-
/* Traverse root PE */
for (pe = root; pe; pe = eeh_pe_next(pe, root)) {
eeh_pe_for_each_dev(pe, edev) {
ret = fn(edev, flag);
-   if (ret) {
-   eeh_unlock();
+   if (ret)
return ret;
-   }
}
}
 
-   eeh_unlock();
-
return NULL;
 }
 
@@ -305,8 +297,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 {
struct eeh_pe *pe, *parent;
 
-   eeh_lock();
-
/*
 * Search the PE has been existing or not according
 * to the PE address. If that has been existing, the
@@ -316,7 +306,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
pe = eeh_pe_get(edev);
if (pe && !(pe->type & EEH_PE_INVALID)) {
if (!edev->pe_config_addr) {
-   eeh_unlock();
pr_err("%s: PE with addr 0x%x already exists\n",
__func__, edev->config_addr);
return -EEXIST;
@@ -328,7 +317,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 
/* Put the edev to PE */
list_add_tail(&edev->list, &pe->edevs);
-   eeh_unlock();
pr_debug("EEH: Add %s to Bus PE#%x\n",
edev->dn->full_name, pe->addr);
 
@@ -347,7 +335,6 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
parent->type &= ~EEH_PE_INVALID;
parent = parent->parent;
}
- 

Re: [PATCH 3/3 v16] iommu/fsl: Freescale PAMU driver and iommu implementation.

2013-06-24 Thread Alex Williamson
On Thu, 2013-06-20 at 21:31 +0530, Varun Sethi wrote:

> +#define REQ_ACS_FLAGS(PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | 
> PCI_ACS_UF)
> +
> +static struct iommu_group *get_device_iommu_group(struct device *dev)
> +{
> + struct iommu_group *group;
> +
> + group = iommu_group_get(dev);
> + if (!group)
> + group = iommu_group_alloc();
> +
> + return group;
> +}
> +
[snip]
> +

This really gets parent or peer, right?

> +static struct iommu_group *get_peer_pci_device_group(struct pci_dev *pdev)
> +{
> + struct iommu_group *group = NULL;
> +
> + /* check if this is the first device on the bus*/
> + if (pdev->bus_list.next == pdev->bus_list.prev) {

It's a list_head, use list functions.  The list implementation should be
treated as opaque.

if (list_is_singular(&pdev->bus_list))

> + struct pci_bus *bus = pdev->bus->parent;
> + /* Traverese the parent bus list to get
> +  * pdev & dev for the sibling device.
> +  */
> + while (bus) {
> + if (!list_empty(&bus->devices)) {
> + pdev = container_of(bus->devices.next,
> + struct pci_dev, bus_list);

pdev = list_first_entry(&bus->devices, struct pci_dev, bus_list);

> + group = iommu_group_get(&pdev->dev);
> + break;
> + } else
> + bus = bus->parent;

Is this ever reached?  Don't you always have bus->self?

> + }
> + } else {
> + /*
> +  * Get the pdev & dev for the sibling device
> +  */
> + pdev = container_of(pdev->bus_list.prev,
> + struct pci_dev, bus_list);

How do you know if you're at the head or tail of the list?

struct pci_dev *tmp;
list_for_each_entry(tmp, &pdev->bus_list, bus_list) {
if (tmp == pdev)
continue;

group = iommu_group_get(&tmp->dev);
break;
}

> + group = iommu_group_get(&pdev->dev);
> + }
> +
> + return group;
> +}
> +
> +static struct iommu_group *get_pci_device_group(struct pci_dev *pdev)
> +{
> + struct iommu_group *group = NULL;
> + struct pci_dev *bridge, *dma_pdev = NULL;
> + struct pci_controller *pci_ctl;
> + bool pci_endpt_partioning;
> +
> + pci_ctl = pci_bus_to_host(pdev->bus);
> + pci_endpt_partioning = check_pci_ctl_endpt_part(pci_ctl);
> + /* We can partition PCIe devices so assign device group to the device */
> + if (pci_endpt_partioning) {
> + bridge = pci_find_upstream_pcie_bridge(pdev);
> + if (bridge) {
> + if (pci_is_pcie(bridge))
> + dma_pdev = pci_get_domain_bus_and_slot(
> + pci_domain_nr(pdev->bus),
> + bridge->subordinate->number, 0);
> + if (!dma_pdev)
> + dma_pdev = pci_dev_get(bridge);
> + } else
> + dma_pdev = pci_dev_get(pdev);
> +
> + /* Account for quirked devices */
> + swap_pci_ref(&dma_pdev, pci_get_dma_source(dma_pdev));
> +
> + /*
> +  * If it's a multifunction device that does not support our
> +  * required ACS flags, add to the same group as function 0.
> +  */

See c14d2690 in Joerg's next tree, using function 0 was a poor
assumption.

> + if (dma_pdev->multifunction &&
> + !pci_acs_enabled(dma_pdev, REQ_ACS_FLAGS))
> + swap_pci_ref(&dma_pdev,
> +  pci_get_slot(dma_pdev->bus,
> +   
> PCI_DEVFN(PCI_SLOT(dma_pdev->devfn),
> +   0)));
> +
> + group = get_device_iommu_group(&pdev->dev);
> + pci_dev_put(pdev);

What was the point of all the above if we use pdev here instead of
dma_pdev?  Wrong device and broken reference counting.  This also isn't
testing ACS all the way up to the root complex or controller.

> + /*
> +  * PCIe controller is not a paritionable entity
> +  * free the controller device iommu_group.
> +  */
> + if (pci_ctl->parent->iommu_group)
> + iommu_group_remove_device(pci_ctl->parent);
> + } else {
> + /*
> +  * All devices connected to the controller will share the
> +  * PCI controllers device group. If this is the first
> +  * device to be probed for the pci controller, copy the
> +  * device group information from the PCI controller device
> +  * node and remove the PCI controller iommu group.
> +  * For subsequent devices, the iommu g

Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Benjamin Herrenschmidt
On Tue, 2013-06-25 at 12:58 +1000, Michael Ellerman wrote:
> On Tue, Jun 25, 2013 at 12:13:04PM +1000, Benjamin Herrenschmidt wrote:
> > On Tue, 2013-06-25 at 12:08 +1000, Michael Ellerman wrote:
> > > We're not checking for allocation failure, which we should be.
> > > 
> > > But this code is only used on powermac and 85xx, so it should probably
> > > just be a TODO to fix this up to handle the failure.
> > 
> > And what can we do if they fail ?
> 
> Fail up the chain and not unplug the CPU presumably.

BTW. Isn't Srivatsa series removing the need to stop_machine() for
unplug ? That should mean we should be able to use GFP_KERNEL no ?

Cheers,
Ben.



[PATCH v2] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Nathan Fontenot
The topology update code that updates the cpu node registration in sysfs
should not be called while in stop_machine(). The register/unregister
calls take a lock and may sleep.

This patch moves these calls outside of the call to stop_machine().
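
The reordering can be pictured with a small userspace model: the atomic
phase only does work that cannot sleep, and the calls that may sleep
run afterwards over the same update list. Nothing below is kernel API.
The demo_* names, the in_atomic flag and the fixed-size arrays are
hypothetical and exist only to make the ordering checkable.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

struct demo_update {
	int cpu;
	int old_nid;
	int new_nid;
	struct demo_update *next;
};

static bool in_atomic;        /* true while the simulated stop_machine() runs */
static int sleep_violations;  /* may-sleep calls made while in_atomic was set */
static int node_of_cpu[DEMO_NR_CPUS];
static int registered_nid[DEMO_NR_CPUS];

/* Stand-in for a sysfs registration call that takes a lock and may sleep. */
static void demo_register_under_node(int cpu, int nid)
{
	if (in_atomic)
		sleep_violations++;
	registered_nid[cpu] = nid;
}

/* Atomic phase: only remap CPUs to nodes, nothing that can sleep. */
static int demo_atomic_phase(struct demo_update *ud)
{
	for (; ud; ud = ud->next)
		node_of_cpu[ud->cpu] = ud->new_nid;
	return 0;
}

static void demo_update_topology(struct demo_update *updates)
{
	struct demo_update *ud;

	in_atomic = true;
	demo_atomic_phase(updates);
	in_atomic = false;

	/* Sleeping work happens only after the atomic phase has finished. */
	for (ud = updates; ud; ud = ud->next)
		demo_register_under_node(ud->cpu, ud->new_nid);
}
```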

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/mm/numa.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: powerpc/arch/powerpc/mm/numa.c
===
--- powerpc.orig/arch/powerpc/mm/numa.c 2013-06-24 06:53:31.0 -0500
+++ powerpc/arch/powerpc/mm/numa.c  2013-06-24 14:28:14.0 -0500
@@ -1433,11 +1433,9 @@
if (cpu != update->cpu)
continue;
 
-   unregister_cpu_under_node(update->cpu, update->old_nid);
unmap_cpu_from_node(update->cpu);
map_cpu_to_node(update->cpu, update->new_nid);
vdso_getcpu_init();
-   register_cpu_under_node(update->cpu, update->new_nid);
}
 
return 0;
@@ -1485,6 +1483,9 @@
stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
 
for (ud = &updates[0]; ud; ud = ud->next) {
+   unregister_cpu_under_node(ud->cpu, ud->old_nid);
+   register_cpu_under_node(ud->cpu, ud->new_nid);
+
dev = get_cpu_device(ud->cpu);
if (dev)
kobject_uevent(&dev->kobj, KOBJ_CHANGE);


Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Michael Ellerman
On Tue, Jun 25, 2013 at 12:13:04PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-06-25 at 12:08 +1000, Michael Ellerman wrote:
> > We're not checking for allocation failure, which we should be.
> > 
> > But this code is only used on powermac and 85xx, so it should probably
> > just be a TODO to fix this up to handle the failure.
> 
> And what can we do if they fail ?

Fail up the chain and not unplug the CPU presumably.

cheers

Re: [PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Nathan Fontenot
On 06/24/2013 08:50 PM, Michael Ellerman wrote:
> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
>> The topology update code that updates the cpu node registration in sysfs
>> should not be called while in stop_machine(). The register/unregister
>> calls take a lock and may sleep.
>>
>> This patch moves these calls outside of the call to stop_machine().
> 
> What happens? Do we lockup or do you just get a warning?
> 
> And what commit introduced the breakage?

Guilty on all counts. Hopefully I can still get by with a stern warning.

-Nathan



Re: [PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Nathan Fontenot
On 06/24/2013 08:50 PM, Michael Ellerman wrote:
> On Mon, Jun 24, 2013 at 02:25:59PM -0500, Nathan Fontenot wrote:
>> On 06/24/2013 02:16 PM, Seth Jennings wrote:
>>> On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
>>>> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
>>>>> The topology update code that updates the cpu node registration in sysfs
>>>>> should not be called while in stop_machine(). The register/unregister
>>>>> calls take a lock and may sleep.
>>>>>
>>>>> This patch moves these calls outside of the call to stop_machine().
>>>>>
>>>>> Signed-off-by: Nathan Fontenot
>>>>
>>>> Reviewed-by: Seth Jennings
>>>
>>> Gah! I _knew_ I should have waited for my cross compiler to finish
>>> building.  This thing doesn't build:
>>>
>>>   CC  arch/powerpc/mm/numa.o
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 
>>> 'arch_update_cpu_topology':
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' 
>>> undeclared (first use in this function)
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each 
>>> undeclared identifier is reported only once
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each 
>>> function it appears in.)
>>>
>>> s/update/ud/ in the *_cpu_under_node() calls.
>>
>> Oops! Time for patch submission re-education training.
> 
> We've all done it, but yes :)
> 
> I try to stick to:
> 
>   1. write code.

I would suggest
1a. ensure you have the proper config options set

>   2. build code.
>   3. test code.
>   4. submit code.
> 
> I imagine you tested an early version of the patch, or on RHEL or
> something, but that can bite you like this. Whenever possible you should
> build & test the exact code you submit, though that can be hard when
> trees are moving quickly underneath you.

Yep, bitten by 1a. I didn't verify the config options I was building
with and had SMP disabled in the tree. This ifdef'ed out my code.
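
The trap described here, where a disabled option hides code from the compiler entirely, can be reproduced outside the kernel. The sketch below uses a made-up CONFIG_SMP_SKETCH macro and made-up function names; it is not the kernel's Kconfig machinery. Code under a preprocessor guard is never parsed when the option is off, while code under an ordinary if on a constant is still compiled and type-checked before the optimizer drops it.

```c
/* Hypothetical option, deliberately left disabled. */
#define CONFIG_SMP_SKETCH 0

struct update_sketch {
	int cpu;
	int new_nid;
};

/* Variant 1: preprocessor guard. With the option off, the compiler never
 * sees the guarded body, so the undeclared name goes unnoticed. */
static int node_of_ifdef(const struct update_sketch *ud)
{
#if CONFIG_SMP_SKETCH
	return update->new_nid;	/* wrong name, but not compiled when disabled */
#else
	(void)ud;
	return -1;
#endif
}

/* Variant 2: ordinary C condition on the constant. The body is always
 * compiled, so the same typo would be a build error in every config. */
static int node_of_checked(const struct update_sketch *ud)
{
	if (CONFIG_SMP_SKETCH)
		return ud->new_nid;
	return -1;
}
```

The kernel offers IS_ENABLED() for the second style, though it only helps where the guarded declarations exist in both configurations. Building with the config that actually enables the code remains the real check.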

-Nathan



Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Benjamin Herrenschmidt
On Tue, 2013-06-25 at 12:08 +1000, Michael Ellerman wrote:
> We're not checking for allocation failure, which we should be.
> 
> But this code is only used on powermac and 85xx, so it should probably
> just be a TODO to fix this up to handle the failure.

And what can we do if they fail ?

Cheers,
Ben.




Re: [PATCH 40/45] powerpc, irq: Use GFP_ATOMIC allocations in atomic context

2013-06-24 Thread Michael Ellerman
On Sun, Jun 23, 2013 at 07:17:00PM +0530, Srivatsa S. Bhat wrote:
> The function migrate_irqs() is called with interrupts disabled
> and hence it's not safe to do GFP_KERNEL allocations inside it,
> because they can sleep. So change the gfp mask to GFP_ATOMIC.

OK so it gets there via:
  __stop_machine()
    take_cpu_down()
      __cpu_disable()
        smp_ops->cpu_disable()
          generic_cpu_disable()
            migrate_irqs()

> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index ea185e0..ca39bac 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -412,7 +412,7 @@ void migrate_irqs(void)
>   cpumask_var_t mask;
>   const struct cpumask *map = cpu_online_mask;
>  
> - alloc_cpumask_var(&mask, GFP_KERNEL);
> + alloc_cpumask_var(&mask, GFP_ATOMIC);

We're not checking for allocation failure, which we should be.

But this code is only used on powermac and 85xx, so it should probably
just be a TODO to fix this up to handle the failure.
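
Handling the failure could look roughly like the following. This is a userspace sketch, not the kernel code: alloc_mask_sketch() stands in for alloc_cpumask_var(), fail_next_alloc is a made-up fault-injection switch, and the function is given an int return value (the real migrate_irqs() returns void) so that a caller can refuse to take the CPU down.

```c
#include <errno.h>
#include <stdlib.h>

static int fail_next_alloc;	/* made-up switch to simulate allocation failure */

/* Stand-in for alloc_cpumask_var(): returns nonzero on success. */
static int alloc_mask_sketch(unsigned long **mask)
{
	if (fail_next_alloc) {
		*mask = NULL;
		return 0;
	}
	*mask = calloc(1, sizeof(**mask));
	return *mask != NULL;
}

/* Sketch of migrate_irqs() reporting failure instead of ignoring it. */
static int migrate_irqs_sketch(void)
{
	unsigned long *mask;

	if (!alloc_mask_sketch(&mask))
		return -ENOMEM;

	/* ... the real function would walk the irqs using the mask here ... */

	free(mask);
	return 0;
}

/* Caller: pass the error up and leave the CPU online on failure. */
static int cpu_disable_sketch(int *cpu_online)
{
	int err = migrate_irqs_sketch();

	if (err)
		return err;
	*cpu_online = 0;
	return 0;
}
```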

cheers


Re: [PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Michael Ellerman
On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> The topology update code that updates the cpu node registration in sysfs
> should not be called while in stop_machine(). The register/unregister
> calls take a lock and may sleep.
> 
> This patch moves these calls outside of the call to stop_machine().

What happens? Do we lockup or do you just get a warning?

And what commit introduced the breakage?

cheers


Re: [PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Michael Ellerman
On Mon, Jun 24, 2013 at 02:25:59PM -0500, Nathan Fontenot wrote:
> On 06/24/2013 02:16 PM, Seth Jennings wrote:
> > On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
> >> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> >>> The topology update code that updates the cpu node registration in sysfs
> >>> should not be called while in stop_machine(). The register/unregister
> >>> calls take a lock and may sleep.
> >>>
> >>> This patch moves these calls outside of the call to stop_machine().
> >>>
> >>> Signed-off-by:Nathan Fontenot 
> >>
> >> Reviewed-by: Seth Jennings 
> > 
> > Gah! I _knew_ I should have waited for my cross compiler to finish
> > building.  This thing doesn't build:
> > 
> >   CC  arch/powerpc/mm/numa.o
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 
> > 'arch_update_cpu_topology':
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' 
> > undeclared (first use in this function)
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each 
> > undeclared identifier is reported only once
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each 
> > function it appears in.)
> > 
> > s/update/ud/ in the *_cpu_under_node() calls.
> 
> Oops! Time for patch submission re-education training.

We've all done it, but yes :)

I try to stick to:

  1. write code.
  2. build code.
  3. test code.
  4. submit code.

I imagine you tested an early version of the patch, or on RHEL or
something, but that can bite you like this. Whenever possible you should
build & test the exact code you submit, though that can be hard when
trees are moving quickly underneath you.

cheers


Re: [PATCH] powerpc/perf: Freeze PMC5/6 if we're not using them on Power8

2013-06-24 Thread Michael Ellerman
On Thu, Jun 13, 2013 at 12:09:47PM +0530, Anshuman Khandual wrote:
> On 06/13/2013 06:46 AM, Michael Ellerman wrote:
> > On Power8 we can freeze PMC5 and 6 if we're not using them. Normally they
> > run all the time.
> >
> > index f7d1c4f..e791c68 100644
> > --- a/arch/powerpc/perf/power8-pmu.c
> > +++ b/arch/powerpc/perf/power8-pmu.c
> > @@ -378,6 +378,10 @@ static int power8_compute_mmcr(u64 event[], int n_ev,
> > if (pmc_inuse & 0x7c)
> > mmcr[0] |= MMCR0_PMCjCE;
> > 
> > +   /* If we're not using PMC 5 or 6, freeze them */
> > +   if (!(pmc_inuse & 0x60))
> > +   mmcr[0] |= MMCR0_FC56;
> > +
> > mmcr[1] = mmcr1;
> > mmcr[2] = mmcra;
> > 
> 
> Hey Michael,
> 
> This looks good. But we need to undo these changes when we terminate the
> perf session. That way the user would be able to continue reading PMC5 and
> PMC6 through the /sys interface as before (which may not be ideal). Adding
> the following changes along with this patch would keep the status quo.

Yep.

> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 29c6482..141756a 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -881,6 +881,12 @@ static void power_pmu_disable(struct pmu *pmu)
>   }
>  
>   /*
> +  * Undo PMC5/PMC6 freeze if already applied
> +  */
> + if (mfspr(SPRN_MMCR0) & MMCR0_FC56)
> + mtspr(SPRN_MMCR0, mfspr(SPRN_MMCR0) & ~MMCR0_FC56);

The intent here is correct. But you've added two mfsprs() and an mtspr()
when the surrounding code has already read MMCR0, and will soon write
it. They may not be that expensive but still.

See my updated patch.
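
To make the point about redundant register accesses concrete, here is a small userspace model. It is not taken from the updated patch: the register is a plain variable, MMCR0_FC56_SKETCH is a made-up bit value, and the access counters exist only to show the difference between a separate test-and-clear and folding the change into a read-modify-write that the caller performs anyway.

```c
#define MMCR0_FC56_SKETCH 0x10UL	/* made-up bit value for illustration */

static unsigned long fake_mmcr0;	/* stands in for the hardware register */
static int spr_reads, spr_writes;	/* access counters */

static unsigned long read_mmcr0_sketch(void)
{
	spr_reads++;
	return fake_mmcr0;
}

static void write_mmcr0_sketch(unsigned long val)
{
	spr_writes++;
	fake_mmcr0 = val;
}

/* Separate test-and-clear: two reads plus a write, on top of whatever
 * the surrounding code already does with the register. */
static void unfreeze_separately(void)
{
	if (read_mmcr0_sketch() & MMCR0_FC56_SKETCH)
		write_mmcr0_sketch(read_mmcr0_sketch() & ~MMCR0_FC56_SKETCH);
}

/* Folded variant: the caller reads once, adjusts the local copy, and
 * writes once, so clearing the freeze bit costs no extra accesses. */
static void disable_folded(unsigned long bits_to_set)
{
	unsigned long val = read_mmcr0_sketch();

	val &= ~MMCR0_FC56_SKETCH;	/* undo the PMC5/6 freeze */
	val |= bits_to_set;		/* whatever the caller wanted to set */
	write_mmcr0_sketch(val);
}
```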

cheers


Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL

2013-06-24 Thread Scott Wood
On Fri, Jul 20, 2012 at 10:37:17AM +0200, Joakim Tjernlund wrote:
> Zang Roy-R61911  wrote on 2012/07/20 10:27:52:
> >
> >
> >
> > > -Original Message-
> > > From: linuxppc-dev-bounces+tie-fei.zang=freescale@lists.ozlabs.org
> > > [mailto:linuxppc-dev-bounces+tie-fei.zang=freescale@lists.ozlabs.org]
> > > On Behalf Of Joakim Tjernlund
> > > Sent: Friday, June 01, 2012 6:36 AM
> > > To: Wood Scott-B07421
> > > Cc: linuxppc-...@ozlabs.org; Dan Malek; Bob Cochran; Support
> > > Subject: Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
> > >
> > > It just occurred to me that you guys have this already in your Linux SDK so
> > > it can't be that bad.
> > No. MSR_DE is ONLY added when using CW debug in SDK.
> > Roy
> >
> 
> Yes, and I later found that user space debugging is busted if you turn on
> MSR_DE in kernel.

So, how should we handle the CONFIG_BDI_SWITCH patch?  It seems like it
should at least have a warning in the kconfig help text that it breaks
userspace debugging (to the point of causing a kernel oops if it's
tried).  Or maybe it can deselect CONFIG_PPC_ADV_DEBUG_REGS?

It'd also be nice to keep things like this, that are a consequence of how
external debug works on e500, separate from the Abatron-specific stuff.

-Scott



Re: [PATCH 04/45] CPU hotplug: Add infrastructure to check lacking hotplug synchronization

2013-06-24 Thread Steven Rostedt
On Sun, 2013-06-23 at 19:08 +0530, Srivatsa S. Bhat wrote:


Just to make the code a little cleaner, can you add:

> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 860f51a..e90d9d7 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -63,6 +63,72 @@ static struct {
>   .refcount = 0,
>  };
>  
> +#ifdef CONFIG_DEBUG_HOTPLUG_CPU
> +
> +static DEFINE_PER_CPU(unsigned long, atomic_reader_refcnt);
> +
> +static int current_is_hotplug_safe(const struct cpumask *mask)
> +{
> +
> + /* If we are not dealing with cpu_online_mask, don't complain. */
> + if (mask != cpu_online_mask)
> + return 1;
> +
> + /* If this is the task doing hotplug, don't complain. */
> + if (unlikely(current == cpu_hotplug.active_writer))
> + return 1;
> +
> + /* If we are in early boot, don't complain. */
> + if (system_state != SYSTEM_RUNNING)
> + return 1;
> +
> + /*
> +  * Check if the current task is in atomic context and it has
> +  * invoked get_online_cpus_atomic() to synchronize with
> +  * CPU Hotplug.
> +  */
> + if (preempt_count() || irqs_disabled())
> + return this_cpu_read(atomic_reader_refcnt);
> + else
> + return 1; /* No checks for non-atomic contexts for now */
> +}
> +
> +static inline void warn_hotplug_unsafe(void)
> +{
> + WARN_ONCE(1, "Must use get/put_online_cpus_atomic() to synchronize"
> +  " with CPU hotplug\n");
> +}
> +
> +/*
> + * Check if the task (executing in atomic context) has the required protection
> + * against CPU hotplug, while accessing the specified cpumask.
> + */
> +void check_hotplug_safe_cpumask(const struct cpumask *mask)
> +{
> + if (!current_is_hotplug_safe(mask))
> + warn_hotplug_unsafe();
> +}
> +EXPORT_SYMBOL_GPL(check_hotplug_safe_cpumask);
> +
> +/*
> + * Similar to check_hotplug_safe_cpumask(), except that we don't complain
> + * if the task (executing in atomic context) is testing whether the CPU it
> + * is executing on is online or not.
> + *
> + * (A task executing with preemption disabled on a CPU, automatically prevents
> + *  offlining that CPU, irrespective of the actual implementation of CPU
> + *  offline. So we don't enforce holding of get_online_cpus_atomic() for that
> + *  case).
> + */
> +void check_hotplug_safe_cpu(unsigned int cpu, const struct cpumask *mask)
> +{
> + if(!current_is_hotplug_safe(mask) && cpu != smp_processor_id())
> + warn_hotplug_unsafe();
> +}
> +EXPORT_SYMBOL_GPL(check_hotplug_safe_cpu);
> +

static inline void atomic_reader_refcnt_inc(void)
{
	this_cpu_inc(atomic_reader_refcnt);
}
static inline void atomic_reader_refcnt_dec(void)
{
	this_cpu_dec(atomic_reader_refcnt);
}

#else
static inline void atomic_reader_refcnt_inc(void)
{
}
static inline void atomic_reader_refcnt_dec(void)
{
}
#endif

> +#endif
> +
>  void get_online_cpus(void)
>  {
>   might_sleep();
> @@ -189,13 +255,22 @@ unsigned int get_online_cpus_atomic(void)
>* from going offline.
>*/
>   preempt_disable();
> +
> +#ifdef CONFIG_DEBUG_HOTPLUG_CPU
> + this_cpu_inc(atomic_reader_refcnt);
> +#endif

Replace the #ifdef with just:

	atomic_reader_refcnt_inc();

>   return smp_processor_id();
>  }
>  EXPORT_SYMBOL_GPL(get_online_cpus_atomic);
>  
>  void put_online_cpus_atomic(void)
>  {
> +
> +#ifdef CONFIG_DEBUG_HOTPLUG_CPU
> + this_cpu_dec(atomic_reader_refcnt);
> +#endif

And

	atomic_reader_refcnt_dec();

-- Steve
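
For anyone unfamiliar with the stub-helper pattern suggested above, here is a self-contained userspace sketch. All names are made up (DEBUG_HOTPLUG_SKETCH, reader_refcnt_*, preempt_depth_sketch), and a single global counter replaces the per-cpu one from the patch. The point is only that the #ifdef lives in one place and the call sites stay clean.

```c
/* Hypothetical debug switch; set to 0 to compile the accounting out. */
#ifndef DEBUG_HOTPLUG_SKETCH
#define DEBUG_HOTPLUG_SKETCH 1
#endif

#if DEBUG_HOTPLUG_SKETCH
static unsigned long reader_refcnt;	/* one counter; the patch uses a per-cpu one */

static inline void reader_refcnt_inc(void) { reader_refcnt++; }
static inline void reader_refcnt_dec(void) { reader_refcnt--; }
static inline unsigned long reader_refcnt_read(void) { return reader_refcnt; }
#else
/* Empty stubs: callers compile unchanged and the calls vanish. */
static inline void reader_refcnt_inc(void) { }
static inline void reader_refcnt_dec(void) { }
static inline unsigned long reader_refcnt_read(void) { return 0; }
#endif

static int preempt_depth_sketch;	/* stand-in for the preempt count */

/* Call sites carry no #ifdef at all. */
static void get_atomic_sketch(void)
{
	preempt_depth_sketch++;
	reader_refcnt_inc();
}

static void put_atomic_sketch(void)
{
	reader_refcnt_dec();
	preempt_depth_sketch--;
}
```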

>   preempt_enable();
> +
>  }
>  EXPORT_SYMBOL_GPL(put_online_cpus_atomic);
>  




Re: [PATCH 01/45] CPU hotplug: Provide APIs to prevent CPU offline from atomic context

2013-06-24 Thread Steven Rostedt
On Sun, 2013-06-23 at 19:08 +0530, Srivatsa S. Bhat wrote:
> The current CPU offline code uses stop_machine() internally. And disabling
> preemption prevents stop_machine() from taking effect, thus also preventing
> CPUs from going offline, as a side effect.
> 
> There are places where this side-effect of preempt_disable() (or equivalent)
> is used to synchronize with CPU hotplug. Typically these are in atomic
> sections of code, where they can't make use of get/put_online_cpus(), because
> the latter set of APIs can sleep.
> 
> Going forward, we want to get rid of stop_machine() from the CPU hotplug
> offline path. And then, with stop_machine() gone, disabling preemption will
> no longer prevent CPUs from going offline.
> 
> So provide a set of APIs for such atomic hotplug readers, to prevent (any)
> CPUs from going offline. For now, they will default to preempt_disable()
> and preempt_enable() itself, but this will help us do the tree-wide conversion,
> as a preparatory step to remove stop_machine() from CPU hotplug.
> 
> (Besides, it is good documentation as well, since it clearly marks places
> where we synchronize with CPU hotplug, instead of combining it subtly with
> disabling preemption).
> 
> In future, when actually removing stop_machine(), we will alter the
> implementation of these APIs to a suitable synchronization scheme.
> 
> Cc: Thomas Gleixner 
> Cc: Andrew Morton 
> Cc: Tejun Heo 
> Cc: "Rafael J. Wysocki" 
> Cc: Yasuaki Ishimatsu 

Reviewed-by: Steven Rostedt 

-- Steve
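
As a rough picture of what such a wrapper API buys, here is a userspace sketch. The *_sketch names, the depth counter and the current-cpu variable are all assumptions made for illustration. The reader only sees the get/put pair, so the body of the pair can later change to another synchronization scheme without touching callers.

```c
static int preempt_depth_sketch;		/* stand-in for the preempt count */
static unsigned int current_cpu_sketch;		/* stand-in for smp_processor_id() */

/* For now the pair is nothing more than "disable/enable preemption". */
static unsigned int get_online_cpus_atomic_sketch(void)
{
	preempt_depth_sketch++;
	return current_cpu_sketch;
}

static void put_online_cpus_atomic_sketch(void)
{
	preempt_depth_sketch--;
}

/* An atomic reader: walks a bitmask of online cpus while protected. */
static unsigned int count_online_sketch(unsigned long online_mask)
{
	unsigned int n = 0;

	get_online_cpus_atomic_sketch();
	for (; online_mask; online_mask &= online_mask - 1)
		n++;	/* each iteration clears the lowest set bit */
	put_online_cpus_atomic_sketch();

	return n;
}
```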

> Signed-off-by: Srivatsa S. Bhat 
> ---
> 
>  include/linux/cpu.h |   18 ++
>  kernel/cpu.c|   38 ++
>  2 files changed, 56 insertions(+)




Re: Pull request: scottwood/linux.git for-3.10

2013-06-24 Thread Benjamin Herrenschmidt
On Mon, 2013-06-24 at 17:02 -0500, Scott Wood wrote:
> This fixes a regression that causes 83xx to oops on boot if a
> non-express PCI bus is present.  It is the same patch as the last pull
> request, but with the changelog reworded to be clearer that this is a
> regression.

Ok, Kumar, I'll pick that up and send to Linus.

Cheers,
Ben.

> The following changes since commit 17858ca65eef148d335ffd4cfc09228a1c1cbfb5:
> 
>   Merge tag 'please-pull-fixia64' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux (2013-06-18 06:29:19 -1000)
> 
> are available in the git repository at:
> 
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git for-3.10
> 
> for you to fetch changes up to b37e161388ac3980d5dfb73050e85874b84253eb:
> 
>   powerpc/pci: Fix boot panic on mpc83xx (regression) (2013-06-24 16:54:09 -0500)
> 
> 
> Rojhalat Ibrahim (1):
>   powerpc/pci: Fix boot panic on mpc83xx (regression)
> 
>  arch/powerpc/sysdev/fsl_pci.c | 24 +---
>  1 file changed, 9 insertions(+), 15 deletions(-)




Pull request: scottwood/linux.git for-3.10

2013-06-24 Thread Scott Wood
This fixes a regression that causes 83xx to oops on boot if a
non-express PCI bus is present.  It is the same patch as the last pull
request, but with the changelog reworded to be clearer that this is a
regression.

The following changes since commit 17858ca65eef148d335ffd4cfc09228a1c1cbfb5:

  Merge tag 'please-pull-fixia64' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux (2013-06-18 06:29:19 -1000)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git for-3.10

for you to fetch changes up to b37e161388ac3980d5dfb73050e85874b84253eb:

  powerpc/pci: Fix boot panic on mpc83xx (regression) (2013-06-24 16:54:09 -0500)


Rojhalat Ibrahim (1):
  powerpc/pci: Fix boot panic on mpc83xx (regression)

 arch/powerpc/sysdev/fsl_pci.c | 24 +---
 1 file changed, 9 insertions(+), 15 deletions(-)



[PATCH-next 00/32] Delete support for __cpuinit

2013-06-24 Thread Paul Gortmaker
This is the whole patch queue for removal of __cpuinit support
against the latest linux-next tree (Jun24th).  Some of you may
have already seen chunks of it, or already read the logistics
of what is being done (and why) here:

  https://lkml.org/lkml/2013/6/20/513

I won't repeat all that here again, other than to say this send
is to ensure arch/subsystem maintainers get a 2nd chance to know
what is going on and to look at what is being proposed for their
area of code.  That, and to ensure one complete continuous copy
of it gets mailed out.  You can also see the patch queue here:

  http://git.kernel.org/cgit/linux/kernel/git/paulg/cpuinit-delete.git

If you've noticed that a chunk for MIPS isn't present here, that
is because it has already been queued in the linux-mips for-next
branch.

Thanks,
Paul.

---
Cc: Len Brown 
Cc: "Rafael J. Wysocki" 
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: Vineet Gupta 
Cc: Russell King 
Cc: Will Deacon 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Mike Frysinger 
Cc: Bob Liu 
Cc: Sonic Zhang 
Cc: Jens Axboe 
Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: "Rafael J. Wysocki" 
Cc: Viresh Kumar 
Cc: Mikael Starvik 
Cc: Jesper Nilsson 
Cc: Greg Kroah-Hartman 
Cc: David Howells 
Cc: Richard Kuo 
Cc: Fenghua Yu 
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Hirokazu Takata 
Cc: James Hogan 
Cc: Arnd Bergmann 
Cc: Rusty Russell 
Cc: "David S. Miller" 
Cc: Jonas Bonn 
Cc: Helge Deller 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Josh Boyer 
Cc: Matt Porter 
Cc: Kumar Gala 
Cc: "Paul E. McKenney" 
Cc: Josh Triplett 
Cc: Dipankar Sarma 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Chen Liqin 
Cc: Lennox Wu 
Cc: Paul Mundt 
Cc: "David S. Miller" 
Cc: Chris Metcalf 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Chris Zankel 
Cc: Max Filippov 
Cc: linux-a...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: uclinux-dist-de...@blackfin.uclinux.org
Cc: cpuf...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: linux-cris-ker...@axis.com
Cc: linux-hexa...@vger.kernel.org
Cc: lm-sens...@lm-sensors.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@ml.linux-m32r.org
Cc: linux-m32r...@ml.linux-m32r.org
Cc: net...@vger.kernel.org
Cc: li...@lists.openrisc.net
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@de.ibm.com
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-xte...@linux-xtensa.org

Paul Gortmaker (32):
  init.h: remove __cpuinit sections from the kernel
  modpost: remove all traces of cpuinit/cpuexit sections
  alpha: delete __cpuinit usage from all users
  powerpc: delete __cpuinit usage from all users
  parisc: delete __cpuinit usage from all users
  ia64: delete __cpuinit usage from all ia64 users
  arm: delete __cpuinit/__CPUINIT usage from all ARM users
  sparc: delete __cpuinit/__CPUINIT usage from all users
  arm64: delete __cpuinit usage from all users
  arc: delete __cpuinit usage from all arc files
  blackfin: delete __cpuinit usage from all blackfin files
  s390: delete __cpuinit usage from all s390 files
  sh: delete __cpuinit usage from all sh files
  tile: delete __cpuinit usage from all tile files
  metag: delete __cpuinit usage from all metag files
  cris: delete __cpuinit usage from all cris files
  frv: delete __cpuinit usage from all frv files
  hexagon: delete __cpuinit usage from all hexagon files
  m32r: delete __cpuinit usage from all m32r files
  openrisc: delete __cpuinit usage from all openrisc files
  xtensa: delete __cpuinit usage from all xtensa files
  score: delete __cpuinit usage from all score files
  x86: delete __cpuinit usage from all x86 files
  clocksource+irqchip: delete __cpuinit usage from all related files
  cpufreq: delete __cpuinit usage from all cpufreq files
  hwmon: delete __cpuinit usage from all hwmon files
  acpi: delete __cpuinit usage from all acpi files
  net: delete __cpuinit usage from all net files
  rcu: delete __cpuinit usage from all rcu files
  kernel: delete __cpuinit usage from all core kernel files
  drivers: delete __cpuinit usage from all remaining drivers files
  block: delete __cpuinit usage from all block files

 Documentation/cpu-hotplug.txt |  6 +--
 arch/alpha/kernel/smp.c   | 10 ++---
 arch/alpha/kernel/traps.c |  4 +-
 arch/arc/include/asm/irq.h|  2 +-
 arch/arc/kernel/irq.c |  2 +-
 arch/arc/kernel/setup.c   | 10 ++---
 arch/arc/kernel/smp.c |  4 +-
 arch/arc/kernel/time.c|  6 +--
 arch/arc/mm/cache_arc700.c|  4 +-
 arch/arc/mm/tlb.c |  4 +-
 arch/arm/common/mcpm_platsmp.c|  4 +-
 arch/arm/include/asm/arch_timer.h |  2 +-
 arch/arm/kernel/head-common.S |  1 -
 arch/arm/kernel/head.S|  1 -
 arch

[PATCH-next 00/32] Delete support for __cpuinit

2013-06-24 Thread Paul Gortmaker
[Resending with only lists on Cc: -- previous mail header on the 00/32
 was too long; failed to get past vger's crap filters.]

On 13-06-24 03:30 PM, Paul Gortmaker wrote:
> This is the whole patch queue for removal of __cpuinit support
> against the latest linux-next tree (Jun24th).  Some of you may
> have already seen chunks of it, or already read the logistics
> of what is being done (and why) here:
> 
>   https://lkml.org/lkml/2013/6/20/513
> 
> I won't repeat all that here again, other than to say this send
> is to ensure arch/subsystem maintainers get a 2nd chance to know
> what is going on and to look at what is being proposed for their
> area of code.  That, and to ensure one complete continuous copy
> of it gets mailed out.  You can also see the patch queue here:
> 
>   http://git.kernel.org/cgit/linux/kernel/git/paulg/cpuinit-delete.git
> 
> If you've noticed that a chunk for MIPS isn't present here, that
> is because it has already been queued in the linux-mips for-next
> branch.
> 
> Thanks,
> Paul.
> 
> ---
> Cc: Len Brown 
> Cc: "Rafael J. Wysocki" 
> Cc: Richard Henderson 
> Cc: Ivan Kokshaysky 
> Cc: Matt Turner 
> Cc: Vineet Gupta 
> Cc: Russell King 
> Cc: Will Deacon 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Mike Frysinger 
> Cc: Bob Liu 
> Cc: Sonic Zhang 
> Cc: Jens Axboe 
> Cc: John Stultz 
> Cc: Thomas Gleixner 
> Cc: "Rafael J. Wysocki" 
> Cc: Viresh Kumar 
> Cc: Mikael Starvik 
> Cc: Jesper Nilsson 
> Cc: Greg Kroah-Hartman 
> Cc: David Howells 
> Cc: Richard Kuo 
> Cc: Fenghua Yu 
> Cc: Tony Luck 
> Cc: Fenghua Yu 
> Cc: Hirokazu Takata 
> Cc: James Hogan 
> Cc: Arnd Bergmann 
> Cc: Rusty Russell 
> Cc: "David S. Miller" 
> Cc: Jonas Bonn 
> Cc: Helge Deller 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Josh Boyer 
> Cc: Matt Porter 
> Cc: Kumar Gala 
> Cc: "Paul E. McKenney" 
> Cc: Josh Triplett 
> Cc: Dipankar Sarma 
> Cc: Martin Schwidefsky 
> Cc: Heiko Carstens 
> Cc: Chen Liqin 
> Cc: Lennox Wu 
> Cc: Paul Mundt 
> Cc: "David S. Miller" 
> Cc: Chris Metcalf 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Chris Zankel 
> Cc: Max Filippov 
> Cc: linux-a...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: uclinux-dist-de...@blackfin.uclinux.org
> Cc: cpuf...@vger.kernel.org
> Cc: linux...@vger.kernel.org
> Cc: linux-cris-ker...@axis.com
> Cc: linux-hexa...@vger.kernel.org
> Cc: lm-sens...@lm-sensors.org
> Cc: linux-i...@vger.kernel.org
> Cc: linux-m...@ml.linux-m32r.org
> Cc: linux-m32r...@ml.linux-m32r.org
> Cc: net...@vger.kernel.org
> Cc: li...@lists.openrisc.net
> Cc: linux-par...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux...@de.ibm.com
> Cc: linux-s...@vger.kernel.org
> Cc: linux...@vger.kernel.org
> Cc: sparcli...@vger.kernel.org
> Cc: x...@kernel.org
> Cc: linux-xte...@linux-xtensa.org
> 
> Paul Gortmaker (32):
>   init.h: remove __cpuinit sections from the kernel
>   modpost: remove all traces of cpuinit/cpuexit sections
>   alpha: delete __cpuinit usage from all users
>   powerpc: delete __cpuinit usage from all users
>   parisc: delete __cpuinit usage from all users
>   ia64: delete __cpuinit usage from all ia64 users
>   arm: delete __cpuinit/__CPUINIT usage from all ARM users
>   sparc: delete __cpuinit/__CPUINIT usage from all users
>   arm64: delete __cpuinit usage from all users
>   arc: delete __cpuinit usage from all arc files
>   blackfin: delete __cpuinit usage from all blackfin files
>   s390: delete __cpuinit usage from all s390 files
>   sh: delete __cpuinit usage from all sh files
>   tile: delete __cpuinit usage from all tile files
>   metag: delete __cpuinit usage from all metag files
>   cris: delete __cpuinit usage from all cris files
>   frv: delete __cpuinit usage from all frv files
>   hexagon: delete __cpuinit usage from all hexagon files
>   m32r: delete __cpuinit usage from all m32r files
>   openrisc: delete __cpuinit usage from all openrisc files
>   xtensa: delete __cpuinit usage from all xtensa files
>   score: delete __cpuinit usage from all score files
>   x86: delete __cpuinit usage from all x86 files
>   clocksource+irqchip: delete __cpuinit usage from all related files
>   cpufreq: delete __cpuinit usage from all cpufreq files
>   hwmon: delete __cpuinit usage from all hwmon files
>   acpi: delete __cpuinit usage from all acpi files
>   net: delete __cpuinit usage from all net files
>   rcu: delete __cpuinit usage from all rcu files
>   kernel: delete __cpuinit usage from all core kernel files
>   drivers: delete __cpuinit usage from all remaining drivers files
>   block: delete __cpuinit usage from all block files
> 
>  Documentation/cpu-hotplug.txt |  6 +--
>  arch/alpha/kernel/smp.c   | 10 ++---
>  arch/alpha/kernel/traps.c |  4 +-
>  arch/arc/include/asm/irq.h|  2 +-
>  arch/arc/kernel/irq.c |  2 +-
>  arch/arc/kernel/setup.c   | 10 

[PATCH 04/32] powerpc: delete __cpuinit usage from all users

2013-06-24 Thread Paul Gortmaker
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the powerpc uses of the __cpuinit macros.  There
are no __CPUINIT users in assembly files in powerpc.
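
For readers who have not met these annotations: a section marker of this kind is a macro that expands to a compiler attribute, which is why deleting it from a declaration does not change what the function computes. The sketch below is a userspace approximation with made-up names (__cpuinit_sketch, .text.cpuinit_sketch, bringup_*); the real definitions live in linux/init.h and differ in detail.

```c
/* Made-up marker: place the function in a named section on GCC/Clang ELF
 * targets, expand to nothing elsewhere. */
#if defined(__GNUC__) && defined(__ELF__)
#define __cpuinit_sketch __attribute__((section(".text.cpuinit_sketch")))
#else
#define __cpuinit_sketch
#endif

/* Before: annotated, so the code is emitted into the separate section. */
static int __cpuinit_sketch bringup_annotated(int cpu)
{
	return cpu + 1;
}

/* After: the same function without the marker, in ordinary text. */
static int bringup_plain(int cpu)
{
	return cpu + 1;
}
```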

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Josh Boyer 
Cc: Matt Porter 
Cc: Kumar Gala 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker 
---

[This commit is part of the __cpuinit removal work.  If you don't see
 any problems with it, then you don't have to do anything; it will be
 submitted with all the rest of the __cpuinit removal work.  On the
 other hand, if you want to carry this patch in with your other pending
 changes so as to handle conflicts with other pending work yourself, then
 that is fine too, as the commits can largely be treated independently.
 For more information, please see: https://lkml.org/lkml/2013/6/20/513 ]

 arch/powerpc/include/asm/rtas.h|  4 ++--
 arch/powerpc/include/asm/vdso.h|  2 +-
 arch/powerpc/kernel/cacheinfo.c| 36 --
 arch/powerpc/kernel/rtas.c |  4 ++--
 arch/powerpc/kernel/smp.c  |  4 ++--
 arch/powerpc/kernel/sysfs.c|  6 +++---
 arch/powerpc/kernel/time.c |  1 -
 arch/powerpc/kernel/vdso.c |  2 +-
 arch/powerpc/mm/44x_mmu.c  |  6 +++---
 arch/powerpc/mm/hash_utils_64.c|  2 +-
 arch/powerpc/mm/mmu_context_nohash.c   |  6 +++---
 arch/powerpc/mm/numa.c |  7 +++
 arch/powerpc/mm/tlb_nohash.c   |  2 +-
 arch/powerpc/perf/core-book3s.c|  4 ++--
 arch/powerpc/platforms/44x/currituck.c |  4 ++--
 arch/powerpc/platforms/44x/iss4xx.c|  4 ++--
 arch/powerpc/platforms/85xx/smp.c  |  6 +++---
 arch/powerpc/platforms/powermac/smp.c  |  2 +-
 arch/powerpc/platforms/powernv/smp.c   |  2 +-
 19 files changed, 54 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 34fd704..c7a8bfc 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -350,8 +350,8 @@ static inline u32 rtas_config_addr(int busno, int devfn, int reg)
(devfn << 8) | (reg & 0xff);
 }
 
-extern void __cpuinit rtas_give_timebase(void);
-extern void __cpuinit rtas_take_timebase(void);
+extern void rtas_give_timebase(void);
+extern void rtas_take_timebase(void);
 
 #ifdef CONFIG_PPC_RTAS
 static inline int page_is_rtas_user_buf(unsigned long pfn)
diff --git a/arch/powerpc/include/asm/vdso.h b/arch/powerpc/include/asm/vdso.h
index 50f261b..0d9cecd 100644
--- a/arch/powerpc/include/asm/vdso.h
+++ b/arch/powerpc/include/asm/vdso.h
@@ -22,7 +22,7 @@ extern unsigned long vdso64_rt_sigtramp;
 extern unsigned long vdso32_sigtramp;
 extern unsigned long vdso32_rt_sigtramp;
 
-int __cpuinit vdso_getcpu_init(void);
+int vdso_getcpu_init(void);
 
 #else /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 92c6b00..9262cf2 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -131,7 +131,8 @@ static const char *cache_type_string(const struct cache *cache)
 	return cache_type_info[cache->type].name;
 }
 
-static void __cpuinit cache_init(struct cache *cache, int type, int level, struct device_node *ofnode)
+static void cache_init(struct cache *cache, int type, int level,
+		       struct device_node *ofnode)
 {
 	cache->type = type;
 	cache->level = level;
@@ -140,7 +141,7 @@ static void __cpuinit cache_init(struct cache *cache, int type, int level, struc
 	list_add(&cache->list, &cache_list);
 }
 
-static struct cache *__cpuinit new_cache(int type, int level, struct device_node *ofnode)
+static struct cache *new_cache(int type, int level, struct device_node *ofnode)
 {
 	struct cache *cache;
 
@@ -324,7 +325,8 @@ static bool cache_node_is_unified(const struct device_node *np)
 	return of_get_property(np, "cache-unified", NULL);
 }
 
-static struct cache *__cpuinit cache_do_one_devnode_unified(struct device_node *node, int level)
+static struct cache *cache_do_one_devnode_unified(struct device_node *node,
+						  int level)
 {
 	struct cache *cache;
 
@@ -335,7 +337,8 @@ static struct cache *__cpuinit cache_do_one_devnode_unified(struct device_node *
return cache;
 }
 
-static struct cache *__cpu

Re: [PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Nathan Fontenot
On 06/24/2013 02:16 PM, Seth Jennings wrote:
> On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
>> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
>>> The topology update code that updates the cpu node registration in sysfs
>>> should not be called while in stop_machine(). The register/unregister
>>> calls take a lock and may sleep.
>>>
>>> This patch moves these calls outside of the call to stop_machine().
>>>
>>> Signed-off-by:Nathan Fontenot 
>>
>> Reviewed-by: Seth Jennings 
> 
> Gah! I _knew_ I should have waited for my cross compiler to finish
> building.  This thing doesn't build:
> 
>   CC  arch/powerpc/mm/numa.o
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 
> 'arch_update_cpu_topology':
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' 
> undeclared (first use in this function)
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each 
> undeclared identifier is reported only once
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each 
> function it appears in.)
> 
> s/update/ud/ in the *_cpu_under_node() calls.

Oops! Time for patch submission re-education training.

New, and correct, patch coming soon.

-Nathan

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Seth Jennings
On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> > The topology update code that updates the cpu node registration in sysfs
> > should not be called while in stop_machine(). The register/unregister
> > calls take a lock and may sleep.
> > 
> > This patch moves these calls outside of the call to stop_machine().
> > 
> > Signed-off-by:Nathan Fontenot 
> 
> Reviewed-by: Seth Jennings 

Gah! I _knew_ I should have waited for my cross compiler to finish
building.  This thing doesn't build:

  CC  arch/powerpc/mm/numa.o
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 
'arch_update_cpu_topology':
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' 
undeclared (first use in this function)
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each undeclared 
identifier is reported only once
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each function 
it appears in.)

s/update/ud/ in the *_cpu_under_node() calls.

Seth
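
The suggested s/update/ud/ fix can be sketched in userspace as follows. This is only an illustrative model: the struct layout is simplified, and the two *_cpu_under_node() functions here are hypothetical counting stubs, not the kernel's sysfs helpers. The point it shows is that the loop running after stop_machine() must dereference its own iterator, ud, since no variable named update exists in that function.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's topology update record. */
struct topology_update_data {
	struct topology_update_data *next;
	unsigned int cpu;
	int old_nid;
	int new_nid;
};

/* Hypothetical stubs: they only count calls. The real helpers may sleep. */
static int unregister_calls;
static int register_calls;

static void unregister_cpu_under_node(unsigned int cpu, int nid)
{
	(void)cpu;
	assert(nid >= 0);
	unregister_calls++;
}

static void register_cpu_under_node(unsigned int cpu, int nid)
{
	(void)cpu;
	assert(nid >= 0);
	register_calls++;
}

/*
 * Models the loop that runs once stop_machine() has returned, where
 * sleeping is allowed. Returns the number of list entries processed.
 */
static int reregister_updated_cpus(struct topology_update_data *updates)
{
	struct topology_update_data *ud;
	int n = 0;

	for (ud = updates; ud; ud = ud->next) {
		/* Use the loop iterator 'ud', not 'update'. */
		unregister_cpu_under_node(ud->cpu, ud->old_nid);
		register_cpu_under_node(ud->cpu, ud->new_nid);
		n++;
	}
	return n;
}
```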



Re: Pull request: scottwood/linux.git for-3.10

2013-06-24 Thread Scott Wood

On 06/19/2013 11:50:30 AM, Scott Wood wrote:
> On 06/19/2013 10:06:38 AM, Kumar Gala wrote:
>> On Jun 18, 2013, at 3:14 PM, Scott Wood wrote:
>>
>>> This fixes a regression that causes 83xx to oops on boot if a
>>> non-express PCI bus is present.
>>>
>>> The following changes since commit 17858ca65eef148d335ffd4cfc09228a1c1cbfb5:
>>>
>>>  Merge tag 'please-pull-fixia64' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux (2013-06-18 06:29:19 -1000)
>>>
>>> are available in the git repository at:
>>>
>>>  git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git for-3.10
>>>
>>> for you to fetch changes up to 2383ea94854bcf5a0df3c6803b980868cef95418:
>>>
>>>  powerpc/pci: Fix setup of Freescale PCI / PCIe controllers (2013-06-18 14:44:57 -0500)
>>>
>>> Rojhalat Ibrahim (1):
>>>  powerpc/pci: Fix setup of Freescale PCI / PCIe controllers
>>>
>>> arch/powerpc/sysdev/fsl_pci.c |   24 +---
>>> 1 file changed, 9 insertions(+), 15 deletions(-)
>>
>> What about Rohit's patch: powerpc/pci: Fix setup of Freescale PCI /
>> PCIe controllers?  Seems like also a fix for 3.10
>
> That's the patch I'm asking you to pull. :-P

Ping

-Scott


Re: [PATCH 25/45] staging/octeon: Use get/put_online_cpus_atomic() to prevent CPU offline

2013-06-24 Thread David Daney

On 06/23/2013 11:55 AM, Srivatsa S. Bhat wrote:
> On 06/23/2013 11:47 PM, Greg Kroah-Hartman wrote:
>> On Sun, Jun 23, 2013 at 07:13:33PM +0530, Srivatsa S. Bhat wrote:
>>> Once stop_machine() is gone from the CPU offline path, we won't be able
>>> to depend on disabling preemption to prevent CPUs from going offline
>>> from under us.
>>>
>>> Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going
>>> offline, while invoking from atomic context.
>>>
>>> Cc: Greg Kroah-Hartman 
>>> Cc: de...@driverdev.osuosl.org
>>> Signed-off-by: Srivatsa S. Bhat 
>>> ---
>>>
>>>   drivers/staging/octeon/ethernet-rx.c |3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
>>> index 34afc16..8588b4d 100644
>>> --- a/drivers/staging/octeon/ethernet-rx.c
>>> +++ b/drivers/staging/octeon/ethernet-rx.c
>>> @@ -36,6 +36,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>   #include 
>>>   #include 
>>>   #ifdef CONFIG_XFRM
>>> @@ -97,6 +98,7 @@ static void cvm_oct_enable_one_cpu(void)
>>> return;
>>>
>>> /* ... if a CPU is available, Turn on NAPI polling for that CPU.  */
>>> +   get_online_cpus_atomic();
>>> for_each_online_cpu(cpu) {
>>> if (!cpu_test_and_set(cpu, core_state.cpu_state)) {
>>> v = smp_call_function_single(cpu, cvm_oct_enable_napi,
>>> @@ -106,6 +108,7 @@ static void cvm_oct_enable_one_cpu(void)
>>> break;
>>> }
>>> }
>>> +   put_online_cpus_atomic();
>>
>> Does this driver really need to be doing this in the first place?  If
>> so, why?  The majority of network drivers don't, why is this one
>> "special"?

It depends on your definition of "need".

The current driver receives packets from *all* network ports into a 
single queue (in OCTEON speak this queue is called a POW group).  Under 
high packet rates, the CPU time required to process the packets may 
exceed the capabilities of a single CPU.


In order to increase throughput beyond the single CPU limited rate, we 
bring more than one CPU into play for NAPI receive.  The code being 
patched here is part of the logic that controls which CPUs are used for 
NAPI receive.


Just for the record:  Yes I know that doing this may lead to packet 
reordering when doing forwarding.


A further question that wasn't asked is: Will the code work at all if a 
CPU is taken offline, even if the race the patch eliminates is avoided?


I doubt it.

As far as the patch goes:

Acked-by: David Daney 

David Daney





> Honestly, I don't know. Let's CC the author of that code (David Daney).
> I wonder why get_maintainer.pl didn't generate his name for this file,
> even though the entire file is almost made up of his commits alone!
>
> Regards,
> Srivatsa S. Bhat






Re: [PATCH 22/45] percpu_counter: Use get/put_online_cpus_atomic() to prevent CPU offline

2013-06-24 Thread Srivatsa S. Bhat
On 06/24/2013 11:36 PM, Tejun Heo wrote:
> On Mon, Jun 24, 2013 at 10:55:35AM -0700, Tejun Heo wrote:
>>> @@ -105,6 +106,7 @@ s64 __percpu_counter_sum(struct percpu_counter *fbc)
>>> ret += *pcount;
>>> }
>>> raw_spin_unlock(&fbc->lock);
>>> +   put_online_cpus_atomic();
>>
>> I don't think this is necessary.  CPU on/offlining is explicitly
>> handled via the hotplug callback which synchronizes through fbc->lock.
>> __percpu_counter_sum() racing with actual on/offlining doesn't affect
>> correctness and adding superfluous get_online_cpus_atomic() around it
>> can be misleading.
> 
> Ah, okay, so you added a debug feature which triggers warning if
> online mask is accessed without synchronization.

Exactly!

>  Yeah, that makes
> sense and while the above is not strictly necessary, it probably is
> better to just add it rather than suppressing the warning in a
> different way.

Yeah, I was beginning to scratch my head as to how to suppress the
warning after I read your explanation as to why the calls to
get/put_online_cpus_atomic() would be superfluous in this case...

But as you said, simply invoking those functions is much simpler ;-)

>  Can you please at least add a comment explaining that?
> 

Sure, will do. Thanks a lot Tejun!
 
Regards,
Srivatsa S. Bhat



Re: [PATCH 22/45] percpu_counter: Use get/put_online_cpus_atomic() to prevent CPU offline

2013-06-24 Thread Tejun Heo
On Mon, Jun 24, 2013 at 10:55:35AM -0700, Tejun Heo wrote:
> > @@ -105,6 +106,7 @@ s64 __percpu_counter_sum(struct percpu_counter *fbc)
> > ret += *pcount;
> > }
> > raw_spin_unlock(&fbc->lock);
> > +   put_online_cpus_atomic();
> 
> I don't think this is necessary.  CPU on/offlining is explicitly
> handled via the hotplug callback which synchronizes through fbc->lock.
> __percpu_counter_sum() racing with actual on/offlining doesn't affect
> correctness and adding superfluous get_online_cpus_atomic() around it
> can be misleading.

Ah, okay, so you added a debug feature which triggers warning if
online mask is accessed without synchronization.  Yeah, that makes
sense and while the above is not strictly necessary, it probably is
better to just add it rather than suppressing the warning in a
different way.  Can you please at least add a comment explaining that?

Thanks.

-- 
tejun
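
The kind of comment requested above might look like the following userspace sketch. Everything here is a simplified model under stated assumptions: the *_model names, the online_mask bitmask, and the reader counter are hypothetical stand-ins, and fbc->lock is left out entirely. It only illustrates documenting that the get/put pair is there for the debug check rather than for correctness.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4

/* Simplified model of a per-CPU counter: a base count plus per-CPU deltas. */
struct percpu_counter_model {
	int64_t count;
	int32_t pcpu[NR_CPUS_MODEL];
};

/* Bit n set means CPU n is online. Hypothetical stand-in for the online mask. */
static unsigned int online_mask = 0xf;

/* Hypothetical stand-ins for get/put_online_cpus_atomic(): track nesting only. */
static int hotplug_readers;

static void get_online_cpus_atomic_model(void)
{
	hotplug_readers++;
}

static void put_online_cpus_atomic_model(void)
{
	assert(hotplug_readers > 0);
	hotplug_readers--;
}

static int64_t percpu_counter_sum_model(const struct percpu_counter_model *fbc)
{
	int64_t ret;
	int cpu;

	/*
	 * Not strictly needed for correctness: CPU on/offlining is handled
	 * by the hotplug callback, which synchronizes through the counter's
	 * lock (omitted in this model). The get/put pair is here only so
	 * that the online mask is read under hotplug protection, which
	 * keeps the debug check for unsynchronized mask access quiet.
	 */
	get_online_cpus_atomic_model();
	ret = fbc->count;
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (online_mask & (1u << cpu))
			ret += fbc->pcpu[cpu];
	}
	put_online_cpus_atomic_model();

	return ret;
}
```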


Re: [PATCH 3/3] powerpc/pseries: Support compression of oops text via pstore

2013-06-24 Thread Kees Cook
On Sun, Jun 23, 2013 at 11:23 PM, Aruna Balakrishnaiah
 wrote:
> The patch set supports compression of oops messages while writing to NVRAM,
> this helps in capturing more of oops data to lnx,oops-log. The pstore file
> for oops messages will be in decompressed format making it readable.
>
> In case compression fails, the patch takes care of copying the header added
> by pstore and last oops_data_sz bytes of big_oops_buf to NVRAM so that we
> have recent oops messages in lnx,oops-log.
>
> In case decompression fails, it will result in absence of oops file but still
> have files (in /dev/pstore) for other partitions.
>
> Signed-off-by: Aruna Balakrishnaiah 
> ---
>  arch/powerpc/platforms/pseries/nvram.c |  132 
> +---
>  1 file changed, 118 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/nvram.c 
> b/arch/powerpc/platforms/pseries/nvram.c
> index 0159d74..b5ba5e2 100644
> --- a/arch/powerpc/platforms/pseries/nvram.c
> +++ b/arch/powerpc/platforms/pseries/nvram.c
> @@ -539,6 +539,65 @@ static int zip_oops(size_t text_len)
>  }
>
>  #ifdef CONFIG_PSTORE
> +/* Derived from logfs_uncompress */
> +int nvram_decompress(void *in, void *out, size_t inlen, size_t outlen)
> +{
> +   int err, ret;
> +
> +   ret = -EIO;
> +   err = zlib_inflateInit(&stream);
> +   if (err != Z_OK)
> +   goto error;
> +
> +   stream.next_in = in;
> +   stream.avail_in = inlen;
> +   stream.total_in = 0;
> +   stream.next_out = out;
> +   stream.avail_out = outlen;
> +   stream.total_out = 0;
> +
> +   err = zlib_inflate(&stream, Z_FINISH);
> +   if (err != Z_STREAM_END)
> +   goto error;
> +
> +   err = zlib_inflateEnd(&stream);
> +   if (err != Z_OK)
> +   goto error;
> +
> +   ret = stream.total_out;
> +error:
> +   return ret;
> +}
> +
> +static int unzip_oops(char *oops_buf, char *big_buf)
> +{
> +   struct oops_log_info *oops_hdr = (struct oops_log_info *)oops_buf;
> +   u64 timestamp = oops_hdr->timestamp;
> +   char *big_oops_data = NULL;
> +   char *oops_data_buf = NULL;
> +   size_t big_oops_data_sz;
> +   int unzipped_len;
> +
> +   big_oops_data = big_buf + sizeof(struct oops_log_info);
> +   big_oops_data_sz = big_oops_buf_sz - sizeof(struct oops_log_info);
> +   oops_data_buf = oops_buf + sizeof(struct oops_log_info);
> +
> +   unzipped_len = nvram_decompress(oops_data_buf, big_oops_data,
> +   oops_hdr->report_length,
> +   big_oops_data_sz);
> +
> +   if (unzipped_len < 0) {
> +   pr_err("nvram: decompression failed; returned %d\n",
> +   unzipped_len);
> +   return -1;
> +   }
> +   oops_hdr = (struct oops_log_info *)big_buf;
> +   oops_hdr->version = OOPS_HDR_VERSION;
> +   oops_hdr->report_length = (u16) unzipped_len;
> +   oops_hdr->timestamp = timestamp;
> +   return 0;
> +}
> +
>  static int nvram_pstore_open(struct pstore_info *psi)
>  {
> /* Reset the iterator to start reading partitions again */
> @@ -567,6 +626,7 @@ static int nvram_pstore_write(enum pstore_type_id type,
> size_t size, struct pstore_info *psi)
>  {
> int rc;
> +   unsigned int err_type = ERR_TYPE_KERNEL_PANIC;
> struct oops_log_info *oops_hdr = (struct oops_log_info *) oops_buf;
>
> /* part 1 has the recent messages from printk buffer */
> @@ -577,8 +637,31 @@ static int nvram_pstore_write(enum pstore_type_id type,
> oops_hdr->version = OOPS_HDR_VERSION;
> oops_hdr->report_length = (u16) size;
> oops_hdr->timestamp = get_seconds();
> +
> +   if (big_oops_buf) {
> +   rc = zip_oops(size);
> +   /*
> +* If compression fails copy recent log messages from
> +* big_oops_buf to oops_data.
> +*/
> +   if (rc != 0) {
> +   int hsize = pstore_get_header_size();

I think I would rather see the API to pstore_write() changed to
include explicit details about header sizes. Making hsize a global
seems unwise, since it's not strictly going to be a constant value. It
could change between calls to the writer, for example.

Beyond that, this all seems sensible, though it would be kind of cool
to move this compression logic into the pstore core so it would get
used by default (or through a module parameter).

-Kees

> +   size_t diff = size - oops_data_sz + hsize;
> +
> +   if (size > oops_data_sz) {
> +   memcpy(oops_data, big_oops_buf, hsize);
> +   memcpy(oops_data + hsize, big_oops_buf + diff,
> +   oops_data_sz - hsize);
> +
> + 
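
The fallback copy in the quoted hunk, combined with the suggestion to pass the header size explicitly, can be sketched in userspace roughly as below. The function name copy_header_and_tail() and its signature are assumptions made for illustration; they are not part of the patch or of pstore. The offset arithmetic mirrors the quoted diff = size - oops_data_sz + hsize.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a record of 'size' bytes (a header of 'hsize' bytes followed by
 * text) into a smaller buffer of 'dst_sz' bytes. If it does not fit,
 * keep the header and the most recent (last) bytes of text.
 * The header size is an explicit parameter rather than a global.
 * Returns the number of bytes written to dst.
 */
static size_t copy_header_and_tail(char *dst, size_t dst_sz,
				   const char *src, size_t size,
				   size_t hsize)
{
	size_t diff;

	assert(hsize <= dst_sz && hsize <= size);

	if (size <= dst_sz) {
		/* Everything fits: plain copy. */
		memcpy(dst, src, size);
		return size;
	}

	/* Offset in src where the retained tail begins. */
	diff = size - dst_sz + hsize;

	memcpy(dst, src, hsize);
	memcpy(dst + hsize, src + diff, dst_sz - hsize);
	return dst_sz;
}
```

For example, with a 3-byte header, a 13-byte record, and an 8-byte destination, the header is kept and the last 5 bytes of text follow it.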

Re: [PATCH 22/45] percpu_counter: Use get/put_online_cpus_atomic() to prevent CPU offline

2013-06-24 Thread Tejun Heo
On Sun, Jun 23, 2013 at 07:12:59PM +0530, Srivatsa S. Bhat wrote:
> Once stop_machine() is gone from the CPU offline path, we won't be able
> to depend on disabling preemption to prevent CPUs from going offline
> from under us.
> 
> Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going
> offline, while invoking from atomic context.
> 
> Cc: Al Viro 
> Signed-off-by: Srivatsa S. Bhat 
...
> @@ -98,6 +98,7 @@ s64 __percpu_counter_sum(struct percpu_counter *fbc)
>   s64 ret;
>   int cpu;
>  
> + get_online_cpus_atomic();
>   raw_spin_lock(&fbc->lock);
>   ret = fbc->count;
>   for_each_online_cpu(cpu) {
> @@ -105,6 +106,7 @@ s64 __percpu_counter_sum(struct percpu_counter *fbc)
>   ret += *pcount;
>   }
>   raw_spin_unlock(&fbc->lock);
> + put_online_cpus_atomic();

I don't think this is necessary.  CPU on/offlining is explicitly
handled via the hotplug callback which synchronizes through fbc->lock.
__percpu_counter_sum() racing with actual on/offlining doesn't affect
correctness and adding superfluous get_online_cpus_atomic() around it
can be misleading.

Thanks.

-- 
tejun


Re: [PATCH 25/45] staging/octeon: Use get/put_online_cpus_atomic() to prevent CPU offline

2013-06-24 Thread Srivatsa S. Bhat
On 06/24/2013 12:47 AM, Joe Perches wrote:
> On Mon, 2013-06-24 at 00:25 +0530, Srivatsa S. Bhat wrote:
>> On 06/23/2013 11:47 PM, Greg Kroah-Hartman wrote:
>>> On Sun, Jun 23, 2013 at 07:13:33PM +0530, Srivatsa S. Bhat wrote:
> []
>>>> diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c
> []
>> Honestly, I don't know. Let's CC the author of that code (David Daney).
>> I wonder why get_maintainer.pl didn't generate his name for this file,
>> even though the entire file is almost made up of his commits alone!
> 
> Because by default, get_maintainer looks for a matching
> file entry in MAINTAINERS.  Failing that, it looks at
> one year of git history.  In this case, no work has been
> done on the file for quite awhile.
> 
> --git-blame can be added to the get_maintainer.pl command
> line to look for % of authorship by line and commit count.
> 
> Adding --git-blame can take a long time to run, that's why
> it's not on by default.  Also, very old history can give
> invalid email addresses as people move around and email
> addresses decay.
> 
> If you always want to find original authors, you could
> use a .get_maintainer.conf file with --git-blame in it.
> 
> $ time ./scripts/get_maintainer.pl --git-blame -f drivers/staging/octeon/ethernet-tx.c
> Greg Kroah-Hartman  (supporter:STAGING SUBSYSTEM,commits:4/16=25%)
> David Daney  (authored lines:711/725=98%,commits:13/16=81%)
> Ralf Baechle  (commits:11/16=69%)
> Eric Dumazet  (commits:2/16=12%)
> Andrew Morton  (commits:1/16=6%)
> de...@driverdev.osuosl.org (open list:STAGING SUBSYSTEM)
> linux-ker...@vger.kernel.org (open list)
> 
> real  0m16.853s
> user  0m16.088s
> sys   0m0.444s
> 
> 

Oh, ok.. Thanks for the explanation and the tip!

Regards,
Srivatsa S. Bhat



Re: [PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Seth Jennings
On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> The topology update code that updates the cpu node registration in sysfs
> should not be called while in stop_machine(). The register/unregister
> calls take a lock and may sleep.
> 
> This patch moves these calls outside of the call to stop_machine().
> 
> Signed-off-by:Nathan Fontenot 

Reviewed-by: Seth Jennings 

> ---
>  arch/powerpc/mm/numa.c |5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Index: powerpc/arch/powerpc/mm/numa.c
> ===
> --- powerpc.orig/arch/powerpc/mm/numa.c   2013-06-24 06:53:31.0 
> -0500
> +++ powerpc/arch/powerpc/mm/numa.c2013-06-24 06:56:30.0 -0500
> @@ -1433,11 +1433,9 @@
>   if (cpu != update->cpu)
>   continue;
> 
> - unregister_cpu_under_node(update->cpu, update->old_nid);
>   unmap_cpu_from_node(update->cpu);
>   map_cpu_to_node(update->cpu, update->new_nid);
>   vdso_getcpu_init();
> - register_cpu_under_node(update->cpu, update->new_nid);
>   }
> 
>   return 0;
> @@ -1485,6 +1483,9 @@
>   stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
> 
>   for (ud = &updates[0]; ud; ud = ud->next) {
> + unregister_cpu_under_node(update->cpu, update->old_nid);
> + register_cpu_under_node(update->cpu, update->new_nid);
> +
>   dev = get_cpu_device(ud->cpu);
>   if (dev)
>   kobject_uevent(&dev->kobj, KOBJ_CHANGE);
> 



RE: [PATCH] Correct build warnings with CONFIG_TRANSPARENT_HUGEPAGE disabled

2013-06-24 Thread David Laight
> failure:
> __
> static inline __attribute__((always_inline)) 
> __attribute__((no_instrument_function)) int
> __hash_page_thp(unsigned long ea, unsigned long access,
>   unsigned long vsid, pmd_t *pmdp,
>   unsigned long trap, int local,
>   int ssize, unsigned int psize)
> {
>  do { __asm__ __volatile__( "1: twi 31,0,0\n" ".section __bug_table,\"a\"\n" 
> "2:\t" ".llong" " " "1b,
> %0\n" "\t.short %1, %2\n" ".org 2b+%3\n" ".previous\n" : : "i"
> ("/home/nfont/src/powerpc/arch/powerpc/include/asm/mmu-hash64.h"), "i" (353), 
> "i" (0), "i"
> (sizeof(struct bug_entry))); do { } while (1); } while (0);
> }

Why isn't the "do { } while (1);" enough to stop the compiler
expecting the above to return?
I know I've added "for (;;);" in some code before now.
Disabling optimisations would be enough - but unlikely to be true.

David



Re: [PATCH] Correct build warnings with CONFIG_TRANSPARENT_HUGEPAGE disabled

2013-06-24 Thread Aneesh Kumar K.V
"Aneesh Kumar K.V"  writes:

> Nathan Fontenot  writes:
>
>> Building with CONFIG_TRANSPARENT_HUGEPAGE disabled causes the following
>> build warnings:
>>
>> powerpc/arch/powerpc/include/asm/mmu-hash64.h: In function ‘__hash_page_thp’:
>> powerpc/arch/powerpc/include/asm/mmu-hash64.h:354: warning: no return 
>> statement in function returning non-void
>>
>> This patch adds a return -1 to the static inline for __hash_page_thp()
>> to correct the warnings.
>>
>> Signed-off-by: Nathan Fontenot 
>
> Reviewed-by: Aneesh Kumar K.V 
>
> Wondering why i am not finding this
>
> [root@llmp24l02 thp]# make arch/powerpc/mm/hash_utils_64.o
> 
> .
>   CC  arch/powerpc/mm/hash_utils_64.o
> [root@llmp24l02 thp]# grep TRANSPARENT_HUGEPAGE .config
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> # CONFIG_TRANSPARENT_HUGEPAGE is not set
> [root@llmp24l02 thp]# 
> [root@llmp24l02 thp]# gcc --version
> gcc (GCC) 4.7.2 20121109 (Red Hat 4.7.2-8)
> Copyright (C) 2012 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
>

With newer compilers, BUG() ends in __builtin_unreachable(). That is why
the warning didn't trigger for me.

new compiler:
_
static inline __attribute__((always_inline)) 
__attribute__((no_instrument_function)) int __hash_page_thp(unsigned long ea, 
unsigned long access,
  unsigned long vsid, pmd_t *pmdp,
  unsigned long trap, int local,
  int ssize, unsigned int psize)
{
 do { __asm__ __volatile__( "1: twi 31,0,0\n" ".section __bug_table,\"a\"\n" 
"2:\t" ".llong" " " "1b, %0\n" "\t.short %1, %2\n" ".org 2b+%3\n" ".previous\n" 
: : "i" 
("/home/opensource/sources/kernels/linux-powerpc/arch/powerpc/include/asm/mmu-hash64.h"),
 "i" (353), "i" (0), "i" (sizeof(struct bug_entry))); __builtin_unreachable(); 
} while (0);
}

failure:
__
static inline __attribute__((always_inline)) 
__attribute__((no_instrument_function)) int __hash_page_thp(unsigned long ea, 
unsigned long access,
  unsigned long vsid, pmd_t *pmdp,
  unsigned long trap, int local,
  int ssize, unsigned int psize)
{
 do { __asm__ __volatile__( "1: twi 31,0,0\n" ".section __bug_table,\"a\"\n" 
"2:\t" ".llong" " " "1b, %0\n" "\t.short %1, %2\n" ".org 2b+%3\n" ".previous\n" 
: : "i" ("/home/nfont/src/powerpc/arch/powerpc/include/asm/mmu-hash64.h"), "i" 
(353), "i" (0), "i" (sizeof(struct bug_entry))); do { } while (1); } while (0);
}


-aneesh
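
The difference between the two expansions above can be modelled in userspace along these lines. This is a hedged sketch, not the kernel's BUG(): trap_model(), BUG_MODEL(), and hash_page_thp_model() are hypothetical names, the thp_enabled parameter exists only so the function has a testable non-trapping path, and the sketch assumes a GCC-compatible compiler new enough to provide __builtin_unreachable(). The trailing return -1 corresponds to what the patch adds so that compilers lacking the builtin still see a return statement.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the trap inside BUG(). It is deliberately
 * not declared noreturn, so the compiler has to be told separately
 * that control never comes back from the macro below.
 */
static void trap_model(const char *file, int line)
{
	assert(file != NULL);
	fprintf(stderr, "BUG at %s:%d\n", file, line);
	abort();
}

#if defined(__GNUC__)
/* Newer-compiler form: mark the point after the trap as unreachable. */
#define BUG_MODEL() \
	do { trap_model(__FILE__, __LINE__); __builtin_unreachable(); } while (0)
#else
/* Fallback form: spin forever after the trap. */
#define BUG_MODEL() \
	do { trap_model(__FILE__, __LINE__); for (;;) ; } while (0)
#endif

/* Model of a stub that must never be called when the feature is off. */
static inline int hash_page_thp_model(int thp_enabled)
{
	if (thp_enabled)
		return 0;	/* placeholder for the real implementation */

	BUG_MODEL();
	return -1;		/* never reached; present for older compilers */
}
```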


Re: [PATCH] Correct build warnings with CONFIG_TRANSPARENT_HUGEPAGE disabled

2013-06-24 Thread Aneesh Kumar K.V
Nathan Fontenot  writes:

> Building with CONFIG_TRANSPARENT_HUGEPAGE disabled causes the following
> build warnings:
>
> powerpc/arch/powerpc/include/asm/mmu-hash64.h: In function ‘__hash_page_thp’:
> powerpc/arch/powerpc/include/asm/mmu-hash64.h:354: warning: no return 
> statement in function returning non-void
>
> This patch adds a return -1 to the static inline for __hash_page_thp()
> to correct the warnings.
>
> Signed-off-by: Nathan Fontenot 

Reviewed-by: Aneesh Kumar K.V 

Wondering why I am not seeing this:

[root@llmp24l02 thp]# make arch/powerpc/mm/hash_utils_64.o

.
  CC  arch/powerpc/mm/hash_utils_64.o
[root@llmp24l02 thp]# grep TRANSPARENT_HUGEPAGE .config
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE is not set
[root@llmp24l02 thp]# 
[root@llmp24l02 thp]# gcc --version
gcc (GCC) 4.7.2 20121109 (Red Hat 4.7.2-8)
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


> ---
>  arch/powerpc/include/asm/mmu-hash64.h |1 +
>  1 file changed, 1 insertion(+)
>
> Index: powerpc/arch/powerpc/include/asm/mmu-hash64.h
> ===
> --- powerpc.orig/arch/powerpc/include/asm/mmu-hash64.h2013-06-24 
> 07:54:08.0 -0500
> +++ powerpc/arch/powerpc/include/asm/mmu-hash64.h 2013-06-24 
> 08:07:56.0 -0500
> @@ -351,6 +351,7 @@
> int ssize, unsigned int psize)
>  {
>   BUG();
> + return -1;
>  }
>  #endif
>  extern void hash_failure_debug(unsigned long ea, unsigned long access,


[PATCH] Correct build warnings with CONFIG_TRANSPARENT_HUGEPAGE disabled

2013-06-24 Thread Nathan Fontenot
Building with CONFIG_TRANSPARENT_HUGEPAGE disabled causes the following
build warnings:

powerpc/arch/powerpc/include/asm/mmu-hash64.h: In function ‘__hash_page_thp’:
powerpc/arch/powerpc/include/asm/mmu-hash64.h:354: warning: no return statement 
in function returning non-void

This patch adds a return -1 to the static inline for __hash_page_thp()
to correct the warnings.

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/include/asm/mmu-hash64.h |1 +
 1 file changed, 1 insertion(+)

Index: powerpc/arch/powerpc/include/asm/mmu-hash64.h
===
--- powerpc.orig/arch/powerpc/include/asm/mmu-hash64.h  2013-06-24 
07:54:08.0 -0500
+++ powerpc/arch/powerpc/include/asm/mmu-hash64.h   2013-06-24 
08:07:56.0 -0500
@@ -351,6 +351,7 @@
  int ssize, unsigned int psize)
 {
BUG();
+   return -1;
 }
 #endif
 extern void hash_failure_debug(unsigned long ea, unsigned long access,



[PATCH] Do not update sysfs cpu registration from invalid context

2013-06-24 Thread Nathan Fontenot
The topology update code that updates the cpu node registration in sysfs
should not be called while in stop_machine(). The register/unregister
calls take a lock and may sleep.

This patch moves these calls outside of the call to stop_machine().

Signed-off-by: Nathan Fontenot
---
 arch/powerpc/mm/numa.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: powerpc/arch/powerpc/mm/numa.c
===
--- powerpc.orig/arch/powerpc/mm/numa.c 2013-06-24 06:53:31.0 -0500
+++ powerpc/arch/powerpc/mm/numa.c  2013-06-24 06:56:30.0 -0500
@@ -1433,11 +1433,9 @@
if (cpu != update->cpu)
continue;
 
-   unregister_cpu_under_node(update->cpu, update->old_nid);
unmap_cpu_from_node(update->cpu);
map_cpu_to_node(update->cpu, update->new_nid);
vdso_getcpu_init();
-   register_cpu_under_node(update->cpu, update->new_nid);
}
 
return 0;
@@ -1485,6 +1483,9 @@
stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
 
for (ud = &updates[0]; ud; ud = ud->next) {
+   unregister_cpu_under_node(update->cpu, update->old_nid);
+   register_cpu_under_node(update->cpu, update->new_nid);
+
dev = get_cpu_device(ud->cpu);
if (dev)
kobject_uevent(&dev->kobj, KOBJ_CHANGE);



[PATCH 7/8] powerpc/perf: Core EBB support for 64-bit book3s

2013-06-24 Thread Michael Ellerman
Add support for EBB (Event Based Branches) on 64-bit book3s. See the
included documentation for more details.

EBBs are a feature which allows the hardware to branch directly to a
specified user space address when a PMU event overflows. This can be
used by programs for self-monitoring with no kernel involvement in the
inner loop.

Most of the logic is in the generic book3s code, primarily to avoid a
proliferation of PMU callbacks.

Signed-off-by: Michael Ellerman 
---
 Documentation/powerpc/00-INDEX   |2 +
 Documentation/powerpc/pmu-ebb.txt|  122 +++
 arch/powerpc/include/asm/perf_event_server.h |6 +
 arch/powerpc/include/asm/processor.h |3 +-
 arch/powerpc/include/asm/reg.h   |8 ++
 arch/powerpc/include/asm/switch_to.h |   14 +++
 arch/powerpc/kernel/process.c|4 +
 arch/powerpc/perf/core-book3s.c  |  161 +++---
 8 files changed, 306 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/powerpc/pmu-ebb.txt

diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX
index dd9e928..05026ce 100644
--- a/Documentation/powerpc/00-INDEX
+++ b/Documentation/powerpc/00-INDEX
@@ -14,6 +14,8 @@ hvcs.txt
- IBM "Hypervisor Virtual Console Server" Installation Guide
 mpc52xx.txt
- Linux 2.6.x on MPC52xx family
+pmu-ebb.txt
+   - Description of the API for using the PMU with Event Based Branches.
 qe_firmware.txt
- describes the layout of firmware binaries for the Freescale QUICC
  Engine and the code that parses and uploads the microcode therein.
diff --git a/Documentation/powerpc/pmu-ebb.txt 
b/Documentation/powerpc/pmu-ebb.txt
new file mode 100644
index 000..65b6989
--- /dev/null
+++ b/Documentation/powerpc/pmu-ebb.txt
@@ -0,0 +1,122 @@
+PMU Event Based Branches
+
+
+Event Based Branches (EBBs) are a feature which allows the hardware to
+branch directly to a specified user space address when certain events occur.
+
+The full specification is available in Power ISA v2.07:
+
+  https://www.power.org/documentation/power-isa-version-2-07/
+
+One type of event for which EBBs can be configured is PMU exceptions. This
+document describes the API for configuring the Power PMU to generate EBBs,
+using the Linux perf_events API.
+
+
+Terminology
+---
+
+Throughout this document we will refer to an "EBB event" or "EBB events". This
+just refers to a struct perf_event which has set the "EBB" flag in its
+attr.config. All events which can be configured on the hardware PMU are
+possible "EBB events".
+
+
+Background
+--
+
+When a PMU EBB occurs it is delivered to the currently running process. As such
+EBBs can only sensibly be used by programs for self-monitoring.
+
+It is a feature of the perf_events API that events can be created on other
+processes, subject to standard permission checks. This is also true of EBB
+events, however unless the target process enables EBBs (via mtspr(BESCR)) no
+EBBs will ever be delivered.
+
+This makes it possible for a process to enable EBBs for itself, but not
+actually configure any events. At a later time another process can come along
+and attach an EBB event to the process, which will then cause EBBs to be
+delivered to the first process. It's not clear if this is actually useful.
+
+
+When the PMU is configured for EBBs, all PMU interrupts are delivered to the
+user process. This means once an EBB event is scheduled on the PMU, no non-EBB
+events can be configured. This means that EBB events can not be run
+concurrently with regular 'perf' commands.
+
+It is however safe to run 'perf' commands on a process which is using EBBs. In
+general the EBB event will take priority, though it depends on the exact
+options used on the perf_event_open() and the timing.
+
+
+Creating an EBB event
+-
+
+To request that an event is counted using EBB, the event code should have bit
+63 set.
+
+EBB events must be created with a particular, and restrictive, set of
+attributes - this is so that they interoperate correctly with the rest of the
+perf_events subsystem.
+
+An EBB event must be created with the "pinned" and "exclusive" attributes set.
+Note that if you are creating a group of EBB events, only the leader can have
+these attributes set.
+
+An EBB event must NOT set any of the "inherit", "sample_period", "freq" or
+"enable_on_exec" attributes.
+
+An EBB event must be attached to a task. This is specified to perf_event_open()
+by passing a pid value, typically 0 indicating the current task.
+
+All events in a group must agree on whether they want EBB. That is, either all
+events must request EBB, or none may request EBB.
+
+
+Enabling an EBB event
+---------------------
+
+Once an EBB event has been successfully opened, it must be enabled with the
+perf_events API. This can be achieved either via the ioctl() interface, or the
+prctl() interface.
+
+H
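
The attribute rules from the "Creating an EBB event" section above can be
sketched in C. This is only an illustration under assumptions: the helper name
init_ebb_attr() is hypothetical, PERF_TYPE_RAW is assumed as the event type,
and the raw event code is whatever the caller supplies. The sketch only fills
in struct perf_event_attr; the event would still have to be opened with
perf_event_open() against a task (typically pid 0) on hardware that supports
EBB.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/* Bit 63 of the event code requests EBB, per the text above. */
#define EBB_FLAG (1ULL << 63)

/*
 * Hypothetical helper: fill in attr for an EBB event.
 * is_leader is non-zero for a lone event or a group leader; only the
 * leader may set "pinned" and "exclusive".
 */
static inline void init_ebb_attr(struct perf_event_attr *attr,
				 uint64_t raw_code, int is_leader)
{
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_RAW;	/* assumption: raw PMU event code */
	attr->size = sizeof(*attr);
	attr->config = raw_code | EBB_FLAG;
	attr->pinned = is_leader ? 1 : 0;
	attr->exclusive = is_leader ? 1 : 0;
	/*
	 * inherit, sample_period, freq and enable_on_exec stay zero from
	 * the memset; an EBB event must not set any of them.
	 */
}
```

Zeroing the whole structure first is a deliberate choice here: it guarantees
that none of the forbidden attributes is set by accident.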

[PATCH 8/8] powerpc/perf: Add power8 EBB support

2013-06-24 Thread Michael Ellerman
Add logic to the power8 PMU code to support EBB. Future processors would
also be expected to implement similar constraints. At that time we could
possibly factor these out into common code.

Finally mark the power8 PMU as supporting EBB, which is the actual
enable switch which allows EBBs to be configured.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/perf/power8-pmu.c |   44 +---
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index d59f5b2..c7f8ccc 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -31,9 +31,9 @@
  *
  *        60        56        52        48        44        40        36        32
  * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
- *                                     [      thresh_cmp     ]   [  thresh_ctl   ]
- *                                                                       |
- *                                       thresh start/stop OR FAB match -*
+ *   |                                 [      thresh_cmp     ]   [  thresh_ctl   ]
+ *   |                                                                   |
+ *   *- EBB (Linux)                      thresh start/stop OR FAB match -*
  *
  *        28        24        20        16        12         8         4         0
  * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
@@ -117,6 +117,7 @@
 (EVENT_UNIT_MASK  << EVENT_UNIT_SHIFT) |   \
 (EVENT_COMBINE_MASK   << EVENT_COMBINE_SHIFT)  |   \
 (EVENT_MARKED_MASK<< EVENT_MARKED_SHIFT)   |   \
+(1ull << EVENT_CONFIG_EBB_SHIFT)   |   \
  EVENT_PSEL_MASK)
 
 /* MMCRA IFM bits - POWER8 */
@@ -140,10 +141,10 @@
  *
  *        28        24        20        16        12         8         4         0
  * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
- *                       [ ]   [  sample ]   [     ]   [6] [5]   [4] [3]   [2] [1]
- *                        |                     |
- *      L1 I/D qualifier -*                     |      Count of events for each PMC.
- *                                              |        p1, p2, p3, p4, p5, p6.
+ *                   |   [ ]   [  sample ]   [     ]   [6] [5]   [4] [3]   [2] [1]
+ *              EBB -*    |                     |
+ *                        |                     |      Count of events for each PMC.
+ *      L1 I/D qualifier -*                     |        p1, p2, p3, p4, p5, p6.
  *                     nc - number of counters -*
  *
  * The PMC fields P1..P6, and NC, are adder fields. As we accumulate constraints
@@ -159,6 +160,9 @@
 #define CNST_THRESH_VAL(v)     (((v) & EVENT_THRESH_MASK) << 32)
 #define CNST_THRESH_MASK       CNST_THRESH_VAL(EVENT_THRESH_MASK)
 
+#define CNST_EBB_VAL(v)        (((v) & 1) << 24)
+#define CNST_EBB_MASK          CNST_EBB_VAL(1)
+
 #define CNST_L1_QUAL_VAL(v)    (((v) & 3) << 22)
 #define CNST_L1_QUAL_MASK      CNST_L1_QUAL_VAL(3)
 
@@ -217,7 +221,7 @@ static inline bool event_is_fab_match(u64 event)
 
 static int power8_get_constraint(u64 event, unsigned long *maskp, unsigned 
long *valp)
 {
-   unsigned int unit, pmc, cache;
+   unsigned int unit, pmc, cache, ebb;
unsigned long mask, value;
 
mask = value = 0;
@@ -225,9 +229,13 @@ static int power8_get_constraint(u64 event, unsigned long 
*maskp, unsigned long
if (event & ~EVENT_VALID_MASK)
return -1;
 
-   pmc   = (event >> EVENT_PMC_SHIFT)   & EVENT_PMC_MASK;
-   unit  = (event >> EVENT_UNIT_SHIFT)  & EVENT_UNIT_MASK;
-   cache = (event >> EVENT_CACHE_SEL_SHIFT) & EVENT_CACHE_SEL_MASK;
+   pmc   = (event >> EVENT_PMC_SHIFT)& EVENT_PMC_MASK;
+   unit  = (event >> EVENT_UNIT_SHIFT)   & EVENT_UNIT_MASK;
+   cache = (event >> EVENT_CACHE_SEL_SHIFT)  & EVENT_CACHE_SEL_MASK;
+   ebb   = (event >> EVENT_CONFIG_EBB_SHIFT) & 1;
+
+   /* Clear the EBB bit in the event, so event checks work below */
+   event &= ~(1ull << EVENT_CONFIG_EBB_SHIFT);
 
if (pmc) {
if (pmc > 6)
@@ -297,6 +305,18 @@ static int power8_get_constraint(u64 event, unsigned long 
*maskp, unsigned long
value |= CNST_THRESH_VAL(event >> EVENT_THRESH_SHIFT);
}
 
+   if (!pmc && ebb)
+   /* EBB events must specify the PMC */
+   return -1;
+
+   /*
+* All events must agree on EBB, either all request it or none.
+* EBB events are pinned & exclusive, so this should never actually
+* hit, but we leave it as a fallback in case.
+*/
+   mask  |= CNST_EBB_VAL(ebb);
+   value |= CNST_EBB_MASK;
+
*maskp = mask;
*valp = value;
 
@@ -591,7 +611,7 @@ static struct powe
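
The EBB handling at the top of power8_get_constraint() in the patch above can
be shown in isolation. This is a minimal sketch, not the kernel code: the
helper names split_ebb() and check_ebb_pmc() are hypothetical, and the shift of
63 is an assumption taken from the documentation patch, which says the EBB flag
is bit 63 of the event code.

```c
#include <stdint.h>

/* Assumption: the EBB flag is bit 63 of the event code. */
#define EVENT_CONFIG_EBB_SHIFT 63

/*
 * Return the EBB flag of *event and clear it in place, so that later
 * field checks only see the hardware event bits.
 */
static inline unsigned int split_ebb(uint64_t *event)
{
	unsigned int ebb = (*event >> EVENT_CONFIG_EBB_SHIFT) & 1;

	*event &= ~(1ULL << EVENT_CONFIG_EBB_SHIFT);
	return ebb;
}

/* An EBB event must name a PMC: return -1 if it does not, else 0. */
static inline int check_ebb_pmc(unsigned int pmc, unsigned int ebb)
{
	return (!pmc && ebb) ? -1 : 0;
}
```

Clearing the flag before the other checks is what lets the existing
comparisons against raw event codes keep working unchanged.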

[PATCH 5/8] powerpc/perf: Don't enable if we have zero events

2013-06-24 Thread Michael Ellerman
In power_pmu_enable() we still enable the PMU even if we have zero
events. This should have no effect but doesn't make much sense. Instead
just return after telling the hypervisor that we are not using the PMCs.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/perf/core-book3s.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index af4b4b1..d3ee2e5 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -926,6 +926,11 @@ static void power_pmu_enable(struct pmu *pmu)
if (!cpuhw->disabled)
goto out;
 
+   if (cpuhw->n_events == 0) {
+   ppc_set_pmu_inuse(0);
+   goto out;
+   }
+
cpuhw->disabled = 0;
 
/*
@@ -937,8 +942,6 @@ static void power_pmu_enable(struct pmu *pmu)
if (!cpuhw->n_added) {
mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
-   if (cpuhw->n_events == 0)
-   ppc_set_pmu_inuse(0);
goto out_enable;
}
 
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 6/8] powerpc/perf: Drop MMCRA from thread_struct

2013-06-24 Thread Michael Ellerman
In commit 59affcd "Context switch more PMU related SPRs" I added more
PMU SPRs to thread_struct, later modified in commit b11ae95. To add
insult to injury it turns out we don't need to switch MMCRA as it's
only user readable, and the value is recomputed by the PMU code.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/processor.h |1 -
 arch/powerpc/kernel/asm-offsets.c|1 -
 2 files changed, 2 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 14a6583..48af5d7 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -289,7 +289,6 @@ struct thread_struct {
unsigned long   sier;
unsigned long   mmcr0;
unsigned long   mmcr2;
-   unsigned long   mmcra;
 #endif
 };
 
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 6f16ffa..2f066ef 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -132,7 +132,6 @@ int main(void)
DEFINE(THREAD_SIER, offsetof(struct thread_struct, sier));
DEFINE(THREAD_MMCR0, offsetof(struct thread_struct, mmcr0));
DEFINE(THREAD_MMCR2, offsetof(struct thread_struct, mmcr2));
-   DEFINE(THREAD_MMCRA, offsetof(struct thread_struct, mmcra));
 #endif
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
DEFINE(PACATMSCRATCH, offsetof(struct paca_struct, tm_scratch));
-- 
1.7.10.4



[PATCH 4/8] powerpc/perf: Use existing out label in power_pmu_enable()

2013-06-24 Thread Michael Ellerman
In power_pmu_enable() we can use the existing out label to reduce the
number of return paths.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/perf/core-book3s.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 3d566ee..af4b4b1 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -919,12 +919,13 @@ static void power_pmu_enable(struct pmu *pmu)
 
if (!ppmu)
return;
+
local_irq_save(flags);
+
cpuhw = &__get_cpu_var(cpu_hw_events);
-   if (!cpuhw->disabled) {
-   local_irq_restore(flags);
-   return;
-   }
+   if (!cpuhw->disabled)
+   goto out;
+
cpuhw->disabled = 0;
 
/*
-- 
1.7.10.4



[PATCH 3/8] powerpc/perf: Freeze PMC5/6 if we're not using them

2013-06-24 Thread Michael Ellerman
On Power8 we can freeze PMC5 and 6 if we're not using them. Normally they
run all the time.

As noticed by Anshuman, we should unfreeze them when we disable the PMU
as there are legacy tools which expect them to run all the time.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/reg.h  |1 +
 arch/powerpc/perf/core-book3s.c |5 +++--
 arch/powerpc/perf/power8-pmu.c  |4 
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 4a9e408..362142b 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -626,6 +626,7 @@
 #define   MMCR0_TRIGGER        0x00002000UL /* TRIGGER enable */
 #define   MMCR0_PMAO   0x00000080UL /* performance monitor alert has occurred, set to 0 after handling exception */
 #define   MMCR0_SHRFC  0x00000040UL /* SHRre freeze conditions between threads */
+#define   MMCR0_FC56   0x00000010UL /* freeze counters 5 and 6 */
 #define   MMCR0_FCTI   0x00000008UL /* freeze counters in tags inactive mode */
 #define   MMCR0_FCTA   0x00000004UL /* freeze counters in tags active mode */
 #define   MMCR0_FCWAIT 0x00000002UL /* freeze counter in WAIT state */
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 1ab3068..3d566ee 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -75,6 +75,7 @@ static unsigned int freeze_events_kernel = MMCR0_FCS;
 
 #define MMCR0_FCHV 0
 #define MMCR0_PMCjCE   MMCR0_PMCnCE
+#define MMCR0_FC56 0
 #define MMCR0_PMAO 0
 
 #define SPRN_MMCRA SPRN_MMCR2
@@ -870,11 +871,11 @@ static void power_pmu_disable(struct pmu *pmu)
}
 
/*
-* Set the 'freeze counters' bit, clear PMAO.
+* Set the 'freeze counters' bit, clear PMAO/FC56.
 */
val  = mfspr(SPRN_MMCR0);
val |= MMCR0_FC;
-   val &= ~MMCR0_PMAO;
+   val &= ~(MMCR0_PMAO | MMCR0_FC56);
 
/*
 * The barrier is to make sure the mtspr has been
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 84cdc6d..d59f5b2 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -391,6 +391,10 @@ static int power8_compute_mmcr(u64 event[], int n_ev,
if (pmc_inuse & 0x7c)
mmcr[0] |= MMCR0_PMCjCE;
 
+   /* If we're not using PMC 5 or 6, freeze them */
+   if (!(pmc_inuse & 0x60))
+   mmcr[0] |= MMCR0_FC56;
+
mmcr[1] = mmcr1;
mmcr[2] = mmcra;
 
-- 
1.7.10.4
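
The freeze decision added to power8_compute_mmcr() in the patch above comes
down to one mask test. A small sketch follows; the helper name fc56_bits() is
hypothetical, and it assumes that pmc_inuse has bit n set when PMC n is in
use, so that 0x60 covers PMC5 and PMC6.

```c
/* Value of the new MMCR0 bit, as added to reg.h by the patch. */
#define MMCR0_FC56 0x00000010UL

/*
 * Return the MMCR0 bits to OR in for a given set of PMCs in use.
 * Assumption: bit n of pmc_inuse is set when PMC n is in use.
 */
static inline unsigned long fc56_bits(unsigned int pmc_inuse)
{
	/* Freeze PMC5/6 only when neither of them is being used. */
	return (pmc_inuse & 0x60) ? 0 : MMCR0_FC56;
}
```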



[PATCH 2/8] powerpc/perf: Rework disable logic in pmu_disable()

2013-06-24 Thread Michael Ellerman
In pmu_disable() we disable the PMU by setting the FC (Freeze Counters)
bit in MMCR0. In order to do this we have to read/modify/write MMCR0.

It's possible that we read a value from MMCR0 which has PMAO (PMU Alert
Occurred) set. When we write that value back it will cause an interrupt
to occur. We will then end up in the PMU interrupt handler even though
we are supposed to have just disabled the PMU.

We can avoid this by making sure we never write PMAO back. We should not
lose interrupts because when the PMU is re-enabled the overflowed values
will cause another interrupt.

We also reorder the clearing of SAMPLE_ENABLE so that is done after the
PMU is frozen. Otherwise there is a small window between the clearing of
SAMPLE_ENABLE and the setting of FC where we could take an interrupt and
incorrectly see SAMPLE_ENABLE not set. This would for example change the
logic in perf_read_regs().

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/perf/core-book3s.c |   31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 29c6482..1ab3068 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -75,6 +75,7 @@ static unsigned int freeze_events_kernel = MMCR0_FCS;
 
 #define MMCR0_FCHV 0
 #define MMCR0_PMCjCE   MMCR0_PMCnCE
+#define MMCR0_PMAO 0
 
 #define SPRN_MMCRA SPRN_MMCR2
 #define MMCRA_SAMPLE_ENABLE0
@@ -852,7 +853,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw, 
unsigned long mmcr0)
 static void power_pmu_disable(struct pmu *pmu)
 {
struct cpu_hw_events *cpuhw;
-   unsigned long flags;
+   unsigned long flags, val;
 
if (!ppmu)
return;
@@ -860,9 +861,6 @@ static void power_pmu_disable(struct pmu *pmu)
cpuhw = &__get_cpu_var(cpu_hw_events);
 
if (!cpuhw->disabled) {
-   cpuhw->disabled = 1;
-   cpuhw->n_added = 0;
-
/*
 * Check if we ever enabled the PMU on this cpu.
 */
@@ -872,6 +870,21 @@ static void power_pmu_disable(struct pmu *pmu)
}
 
/*
+* Set the 'freeze counters' bit, clear PMAO.
+*/
+   val  = mfspr(SPRN_MMCR0);
+   val |= MMCR0_FC;
+   val &= ~MMCR0_PMAO;
+
+   /*
+* The barrier is to make sure the mtspr has been
+* executed and the PMU has frozen the events etc.
+* before we return.
+*/
+   write_mmcr0(cpuhw, val);
+   mb();
+
+   /*
 * Disable instruction sampling if it was enabled
 */
if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
@@ -880,14 +893,8 @@ static void power_pmu_disable(struct pmu *pmu)
mb();
}
 
-   /*
-* Set the 'freeze counters' bit.
-* The barrier is to make sure the mtspr has been
-* executed and the PMU has frozen the events
-* before we return.
-*/
-   write_mmcr0(cpuhw, mfspr(SPRN_MMCR0) | MMCR0_FC);
-   mb();
+   cpuhw->disabled = 1;
+   cpuhw->n_added = 0;
}
local_irq_restore(flags);
 }
-- 
1.7.10.4
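
The read/modify/write described in the commit message above can be reduced to
a pure function on the MMCR0 value. This is a sketch for illustration only:
the helper name mmcr0_disable_value() is hypothetical, and the two bit values
are restated here as assumptions rather than taken from this patch.

```c
/* Assumed bit values for the 64-bit Book3S MMCR0 layout. */
#define MMCR0_FC   0x80000000UL	/* freeze counters */
#define MMCR0_PMAO 0x00000080UL	/* performance monitor alert has occurred */

/*
 * Compute the value to write back to MMCR0 when disabling the PMU:
 * set FC, and never write PMAO back, since writing it as set would
 * raise an interrupt just after the PMU was meant to be disabled.
 */
static inline unsigned long mmcr0_disable_value(unsigned long mmcr0)
{
	mmcr0 |= MMCR0_FC;
	mmcr0 &= ~MMCR0_PMAO;
	return mmcr0;
}
```

In the kernel the result is written with write_mmcr0() followed by mb(), as in
the diff; the sketch only covers the bit manipulation.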



[PATCH 1/8] powerpc/perf: Check that events only include valid bits on Power8

2013-06-24 Thread Michael Ellerman
A mistake we have made in the past is that we pull out the fields we
need from the event code, but don't check that there are no unknown bits
set. This means that we can't ever assign meaning to those unknown bits
in future.

Although we have once again failed to do this at release, it is still
early days for Power8 so I think we can still slip this in and get away
with it.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/perf/power8-pmu.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index f7d1c4f..84cdc6d 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -109,6 +109,16 @@
 #define EVENT_IS_MARKED        (EVENT_MARKED_MASK << EVENT_MARKED_SHIFT)
 #define EVENT_PSEL_MASK        0xff    /* PMCxSEL value */
 
+#define EVENT_VALID_MASK   \
+   ((EVENT_THRESH_MASK<< EVENT_THRESH_SHIFT)   |   \
+(EVENT_SAMPLE_MASK<< EVENT_SAMPLE_SHIFT)   |   \
+(EVENT_CACHE_SEL_MASK << EVENT_CACHE_SEL_SHIFT)|   \
+(EVENT_PMC_MASK   << EVENT_PMC_SHIFT)  |   \
+(EVENT_UNIT_MASK  << EVENT_UNIT_SHIFT) |   \
+(EVENT_COMBINE_MASK   << EVENT_COMBINE_SHIFT)  |   \
+(EVENT_MARKED_MASK<< EVENT_MARKED_SHIFT)   |   \
+ EVENT_PSEL_MASK)
+
 /* MMCRA IFM bits - POWER8 */
 #define        POWER8_MMCRA_IFM1               0x0000000040000000UL
 #define        POWER8_MMCRA_IFM2               0x0000000080000000UL
@@ -212,6 +222,9 @@ static int power8_get_constraint(u64 event, unsigned long 
*maskp, unsigned long
 
mask = value = 0;
 
+   if (event & ~EVENT_VALID_MASK)
+   return -1;
+
pmc   = (event >> EVENT_PMC_SHIFT)   & EVENT_PMC_MASK;
unit  = (event >> EVENT_UNIT_SHIFT)  & EVENT_UNIT_MASK;
cache = (event >> EVENT_CACHE_SEL_SHIFT) & EVENT_CACHE_SEL_MASK;
-- 
1.7.10.4
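
The idea behind EVENT_VALID_MASK in the patch above is to reject any event code
with bits set outside the known fields, so those bits can be given a meaning
later. The sketch below shows the technique on a deliberately small,
hypothetical two-field layout; the EV_* names and values are made up for
illustration and are not the POWER8 encoding.

```c
#include <stdint.h>

/* Hypothetical two-field layout, for illustration only. */
#define EV_PMC_SHIFT	16
#define EV_PMC_MASK	0xfULL
#define EV_PSEL_MASK	0xffULL

/* OR together every known field, each shifted into position. */
#define EV_VALID_MASK	((EV_PMC_MASK << EV_PMC_SHIFT) | EV_PSEL_MASK)

/* Return 1 if the event uses only known bits, 0 otherwise. */
static inline int event_is_valid(uint64_t event)
{
	return (event & ~EV_VALID_MASK) == 0;
}
```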



Re: [PATCH 3/3 v16] iommu/fsl: Freescale PAMU driver and iommu implementation.

2013-06-24 Thread Joerg Roedel
On Thu, Jun 20, 2013 at 09:31:28PM +0530, Varun Sethi wrote:
> This patch provides the PAMU driver (fsl_pamu.c) and the corresponding IOMMU
> API implementation (fsl_pamu_domain.c). The PAMU hardware driver (fsl_pamu.c)
> has been derived from the work done by Ashish Kalra and Timur Tabi.

AlexW,

can you please have a look at the group-code again and ack the patch if
it looks right to you?

Thanks,

Joerg




Re: Build regressions/improvements in v3.10-rc7

2013-06-24 Thread Geert Uytterhoeven
On Mon, 24 Jun 2013, Geert Uytterhoeven wrote:
> JFYI, when comparing v3.10-rc7 to v3.10-rc6[3], the summaries are:
>   - build errors: +51/-11

After filtering out false-positives (v3.10-rc6 had only 72% build coverage):

  + arch/powerpc/kernel/fadump.c: error: 'KEXEC_CORE_NOTE_NAME' undeclared 
(first use in this function):  => 521:36
  + arch/powerpc/kernel/fadump.c: error: 'crashing_cpu' undeclared (first use 
in this function):  => 408:2
  + arch/powerpc/kernel/fadump.c: error: 'note_buf_t' undeclared (first use in 
this function):  => 624:49
  + arch/powerpc/kernel/fadump.c: error: 'vmcoreinfo_max_size' undeclared 
(first use in this function):  => 874:18
  + arch/powerpc/kernel/fadump.c: error: implicit declaration of function 
'crash_save_vmcoreinfo' [-Werror=implicit-function-declaration]:  => 410:2
  + arch/powerpc/kernel/fadump.c: error: implicit declaration of function 
'elf_core_copy_kernel_regs' [-Werror=implicit-function-declaration]:  => 520:2
  + arch/powerpc/kernel/fadump.c: error: implicit declaration of function 
'paddr_vmcoreinfo_note' [-Werror=implicit-function-declaration]:  => 872:2
  + arch/powerpc/kernel/fadump.c: error: storage size of 'prstatus' isn't 
known:  => 513:22
  + arch/powerpc/kernel/fadump.c: error: unused variable 'prstatus' 
[-Werror=unused-variable]:  => 513:22
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_ANDCOND' undeclared (first use 
in this function):  => 161:17, 86:16
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_AVPN' undeclared (first use in 
this function):  => 160:17, 85:16, 192:16
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_BULK_REMOVE' undeclared (first 
use in this function):  => 246:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_CEDE' undeclared (first use in 
this function):  => 250:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_CPPR' undeclared (first use in 
this function):  => 257:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_ENTER' undeclared (first use 
in this function):  => 240:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_EOI' undeclared (first use in 
this function):  => 258:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_EXACT' undeclared (first use 
in this function):  => 50:6
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_IPI' undeclared (first use in 
this function):  => 259:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_IPOLL' undeclared (first use 
in this function):  => 260:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_NOT_FOUND' undeclared (first 
use in this function):  => 193:27, 87:27
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_PARAMETER' undeclared (first 
use in this function):  => 138:10
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_PROTECT' undeclared (first use 
in this function):  => 244:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_PTEG_FULL' undeclared (first 
use in this function):  => 54:12
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_PUT_TCE' undeclared (first use 
in this function):  => 248:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_REMOVE' undeclared (first use 
in this function):  => 242:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_RTAS' undeclared (first use in 
this function):  => 265:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_SUCCESS' undeclared (first use 
in this function):  => 67:26, 96:26, 211:26, 125:12
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_TOO_HARD' undeclared (first 
use in this function):  => 224:12
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_XIRR' undeclared (first use in 
this function):  => 256:7
  + arch/powerpc/kvm/book3s_pr_papr.c: error: 'H_XIRR_X' undeclared (first use 
in this function):  => 261:7

All from the powerpc-randconfig police.

> [1] http://kisskb.ellerman.id.au/kisskb/head/6349/ (all 120 configs)
> [3] http://kisskb.ellerman.id.au/kisskb/head/6325/ (86 out of 120 configs)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[RFC PATCH 2/3] powerpc: Contiguous memory allocator based hash page allocation

2013-06-24 Thread Aneesh Kumar K.V
From: "Aneesh Kumar K.V" 

Use CMA to allocate the guest hash page table.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/kvm_book3s_64.h |   1 -
 arch/powerpc/include/asm/kvm_host.h  |   2 +-
 arch/powerpc/include/asm/kvm_ppc.h   |   8 +-
 arch/powerpc/kernel/setup_64.c   |   2 +
 arch/powerpc/kvm/Kconfig |   1 +
 arch/powerpc/kvm/Makefile|   3 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |  36 +++--
 arch/powerpc/kvm/book3s_hv_builtin.c |  79 +++
 arch/powerpc/kvm/book3s_hv_cma.c | 220 +++
 arch/powerpc/kvm/book3s_hv_cma.h |  21 +++
 10 files changed, 322 insertions(+), 51 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_hv_cma.c
 create mode 100644 arch/powerpc/kvm/book3s_hv_cma.h

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 9c1ff33..f8355a9 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -37,7 +37,6 @@ static inline void svcpu_put(struct kvmppc_book3s_shadow_vcpu 
*svcpu)
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 #define KVM_DEFAULT_HPT_ORDER  24  /* 16MB HPT by default */
-extern int kvm_hpt_order;  /* order of preallocated HPTs */
 #endif
 
 #define VRMA_VSID  0x1ffffffUL /* 1TB VSID reserved for VRMA */
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index af326cd..0097dab 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -259,7 +259,7 @@ struct kvm_arch {
spinlock_t slot_phys_lock;
cpumask_t need_tlb_flush;
struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
-   struct kvmppc_linear_info *hpt_li;
+   int hpt_cma_alloc;
 #endif /* CONFIG_KVM_BOOK3S_64_HV */
 #ifdef CONFIG_PPC_BOOK3S_64
struct list_head spapr_tce_tables;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index a5287fe..058ac93 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -139,8 +139,8 @@ extern long kvm_vm_ioctl_allocate_rma(struct kvm *kvm,
struct kvm_allocate_rma *rma);
 extern struct kvmppc_linear_info *kvm_alloc_rma(void);
 extern void kvm_release_rma(struct kvmppc_linear_info *ri);
-extern struct kvmppc_linear_info *kvm_alloc_hpt(void);
-extern void kvm_release_hpt(struct kvmppc_linear_info *li);
+extern struct page *kvm_alloc_hpt(int nr_pages);
+extern void kvm_release_hpt(struct page *page, int nr_pages);
 extern int kvmppc_core_init_vm(struct kvm *kvm);
 extern void kvmppc_core_destroy_vm(struct kvm *kvm);
 extern void kvmppc_core_free_memslot(struct kvm_memory_slot *free,
@@ -261,6 +261,7 @@ void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 struct openpic;
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
+extern void kvm_cma_reserve(void) __init;
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {
paca[cpu].kvm_hstate.xics_phys = addr;
@@ -284,6 +285,9 @@ extern void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu);
 extern void kvm_linear_init(void);
 
 #else
+static inline void __init kvm_cma_reserve(void)
+{}
+
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {}
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index e379d3f..ee28d1f 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -229,6 +229,8 @@ void __init early_setup(unsigned long dt_ptr)
/* Initialize the hash table or TLB handling */
early_init_mmu();
 
+   kvm_cma_reserve();
+
/*
 * Reserve any gigantic pages requested on the command line.
 * memblock needs to have been initialized by the time this is
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index eb643f8..ffaef2c 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -72,6 +72,7 @@ config KVM_BOOK3S_64_HV
bool "KVM support for POWER7 and PPC970 using hypervisor mode in host"
depends on KVM_BOOK3S_64
select MMU_NOTIFIER
+   select CMA
---help---
  Support running unmodified book3s_64 guest kernels in
  virtual machines on POWER7 and PPC970 processors that have
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 422de3f..dc155de 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -74,12 +74,15 @@ kvm-book3s_64-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
book3s_64_mmu_hv.o
 kvm-book3s_64-builtin-xics-objs-$(CONFIG_KVM_XICS) := \
book3s_hv_rm_xics.o
+kvm-book3s_64-builtin-cma-objs-$(CONFIG_CMA) := \
+   book3s_hv_cma.o
 kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
book3s_hv_rmhandlers.o \
book3s_hv_rm_mmu.o \
book3s_64_vio_hv.o \
book3s_hv_ras.o \
book3s_hv_builtin.o \
+   $(kvm-book3s_64-builti

[RFC PATCH 3/3] powerpc: Contiguous memory allocator based RMA allocation

2013-06-24 Thread Aneesh Kumar K.V
From: "Aneesh Kumar K.V" 

Use CMA to allocate the RMA region for the guest. Also remove the linear
allocator, now that it is no longer used.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/kvm_book3s_64.h |   1 +
 arch/powerpc/include/asm/kvm_host.h  |  12 +--
 arch/powerpc/include/asm/kvm_ppc.h   |   8 +-
 arch/powerpc/kernel/setup_64.c   |   2 -
 arch/powerpc/kvm/book3s_hv.c |  44 ++---
 arch/powerpc/kvm/book3s_hv_builtin.c | 164 +++
 6 files changed, 71 insertions(+), 160 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index f8355a9..76ff0b5 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -37,6 +37,7 @@ static inline void svcpu_put(struct kvmppc_book3s_shadow_vcpu 
*svcpu)
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 #define KVM_DEFAULT_HPT_ORDER  24  /* 16MB HPT by default */
+extern unsigned long kvm_rma_pages;
 #endif
 
 #define VRMA_VSID  0x1ffffffUL /* 1TB VSID reserved for VRMA */
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 0097dab..525684c 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -183,13 +183,9 @@ struct kvmppc_spapr_tce_table {
struct page *pages[0];
 };
 
-struct kvmppc_linear_info {
-   void*base_virt;
-   unsigned longbase_pfn;
-   unsigned longnpages;
-   struct list_head list;
-   atomic_t use_count;
-   int  type;
+struct kvm_rma_info {
+   atomic_t use_count;
+   unsigned long base_pfn;
 };
 
 /* XICS components, defined in book3s_xics.c */
@@ -246,7 +242,7 @@ struct kvm_arch {
int tlbie_lock;
unsigned long lpcr;
unsigned long rmor;
-   struct kvmppc_linear_info *rma;
+   struct kvm_rma_info *ri;
unsigned long vrma_slb_v;
int rma_setup_done;
int using_mmu_notifiers;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 058ac93..7a09cf5 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -137,8 +137,8 @@ extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, 
unsigned long liobn,
 unsigned long ioba, unsigned long tce);
 extern long kvm_vm_ioctl_allocate_rma(struct kvm *kvm,
struct kvm_allocate_rma *rma);
-extern struct kvmppc_linear_info *kvm_alloc_rma(void);
-extern void kvm_release_rma(struct kvmppc_linear_info *ri);
+extern struct kvm_rma_info *kvm_alloc_rma(void);
+extern void kvm_release_rma(struct kvm_rma_info *ri);
 extern struct page *kvm_alloc_hpt(int nr_pages);
 extern void kvm_release_hpt(struct page *page, int nr_pages);
 extern int kvmppc_core_init_vm(struct kvm *kvm);
@@ -282,7 +282,6 @@ static inline void kvmppc_set_host_ipi(int cpu, u8 host_ipi)
 }
 
 extern void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu);
-extern void kvm_linear_init(void);
 
 #else
 static inline void __init kvm_cma_reserve(void)
@@ -291,9 +290,6 @@ static inline void __init kvm_cma_reserve(void)
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {}
 
-static inline void kvm_linear_init(void)
-{}
-
 static inline u32 kvmppc_get_xics_latch(void)
 {
return 0;
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index ee28d1f..8a022f5 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -611,8 +611,6 @@ void __init setup_arch(char **cmdline_p)
/* Initialize the MMU context management stuff */
mmu_context_init();
 
-   kvm_linear_init();
-
/* Interrupt code needs to be 64K-aligned */
if ((unsigned long)_stext & 0xffff)
panic("Kernelbase not 64K-aligned (0x%lx)!\n",
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 550f592..0f0d05e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1511,10 +1511,10 @@ static inline int lpcr_rmls(unsigned long rma_size)
 
 static int kvm_rma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
-   struct kvmppc_linear_info *ri = vma->vm_file->private_data;
struct page *page;
+   struct kvm_rma_info *ri = vma->vm_file->private_data;
 
-   if (vmf->pgoff >= ri->npages)
+   if (vmf->pgoff >= kvm_rma_pages)
return VM_FAULT_SIGBUS;
 
page = pfn_to_page(ri->base_pfn + vmf->pgoff);
@@ -1536,7 +1536,7 @@ static int kvm_rma_mmap(struct file *file, struct 
vm_area_struct *vma)
 
 static int kvm_rma_release(struct inode *inode, struct file *filp)
 {
-   struct kvmppc_linear_info *ri = filp->private_data;
+   struct kvm_rma_info *ri = filp->private_data;
 
kvm_release_rma(ri);
return 0;
@@ -1549,8 +1549,24 @@ static const struct file_operations kvm_rma_fops = {
 
 lon

[RFC PATCH 1/3] mm/cma: Move dma contiguous changes into a seperate config

2013-06-24 Thread Aneesh Kumar K.V
From: "Aneesh Kumar K.V" 

We want to use CMA to allocate the hash page table and the real mode area on
PPC64. Hence move the DMA-contiguous changes into a separate config option
so that PPC64 can enable CMA without requiring DMA contiguous support.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/arm/configs/omap2plus_defconfig  |  2 +-
 arch/arm/configs/tegra_defconfig  |  2 +-
 arch/arm/include/asm/dma-contiguous.h |  2 +-
 arch/arm/mm/dma-mapping.c |  6 +++---
 drivers/base/Kconfig  | 21 +
 drivers/base/Makefile |  2 +-
 include/linux/dma-contiguous.h|  2 +-
 mm/Kconfig| 20 
 8 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/arch/arm/configs/omap2plus_defconfig 
b/arch/arm/configs/omap2plus_defconfig
index abbe319..098268f 100644
--- a/arch/arm/configs/omap2plus_defconfig
+++ b/arch/arm/configs/omap2plus_defconfig
@@ -71,7 +71,7 @@ CONFIG_MAC80211=m
 CONFIG_MAC80211_RC_PID=y
 CONFIG_MAC80211_RC_DEFAULT_PID=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
-CONFIG_CMA=y
+CONFIG_DMA_CMA=y
 CONFIG_CONNECTOR=y
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y
diff --git a/arch/arm/configs/tegra_defconfig b/arch/arm/configs/tegra_defconfig
index f7ba3161..34ae8f2 100644
--- a/arch/arm/configs/tegra_defconfig
+++ b/arch/arm/configs/tegra_defconfig
@@ -79,7 +79,7 @@ CONFIG_RFKILL_GPIO=y
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y
 # CONFIG_FIRMWARE_IN_KERNEL is not set
-CONFIG_CMA=y
+CONFIG_DMA_CMA=y
 CONFIG_MTD=y
 CONFIG_MTD_CHAR=y
 CONFIG_MTD_M25P80=y
diff --git a/arch/arm/include/asm/dma-contiguous.h 
b/arch/arm/include/asm/dma-contiguous.h
index 3ed37b4..e072bb2 100644
--- a/arch/arm/include/asm/dma-contiguous.h
+++ b/arch/arm/include/asm/dma-contiguous.h
@@ -2,7 +2,7 @@
 #define ASMARM_DMA_CONTIGUOUS_H
 
 #ifdef __KERNEL__
-#ifdef CONFIG_CMA
+#ifdef CONFIG_DMA_CMA
 
 #include 
 #include 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ef3e0f3..1fb40dc 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -358,7 +358,7 @@ static int __init atomic_pool_init(void)
if (!pages)
goto no_pages;
 
-   if (IS_ENABLED(CONFIG_CMA))
+   if (IS_ENABLED(CONFIG_DMA_CMA))
ptr = __alloc_from_contiguous(NULL, pool->size, prot, &page,
  atomic_pool_init);
else
@@ -670,7 +670,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
addr = __alloc_simple_buffer(dev, size, gfp, &page);
else if (!(gfp & __GFP_WAIT))
addr = __alloc_from_pool(size, &page);
-   else if (!IS_ENABLED(CONFIG_CMA))
+   else if (!IS_ENABLED(CONFIG_DMA_CMA))
addr = __alloc_remap_buffer(dev, size, gfp, prot, &page, 
caller);
else
addr = __alloc_from_contiguous(dev, size, prot, &page, caller);
@@ -759,7 +759,7 @@ static void __arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
__dma_free_buffer(page, size);
} else if (__free_from_pool(cpu_addr, size)) {
return;
-   } else if (!IS_ENABLED(CONFIG_CMA)) {
+   } else if (!IS_ENABLED(CONFIG_DMA_CMA)) {
__dma_free_remap(cpu_addr, size);
__dma_free_buffer(page, size);
} else {
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 07abd9d..74b7c98 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -202,11 +202,10 @@ config DMA_SHARED_BUFFER
  APIs extension; the file's descriptor can then be passed on to other
  driver.
 
-config CMA
-   bool "Contiguous Memory Allocator"
-   depends on HAVE_DMA_CONTIGUOUS && HAVE_MEMBLOCK
-   select MIGRATION
-   select MEMORY_ISOLATION
+config DMA_CMA
+   bool "DMA Contiguous Memory Allocator"
+   depends on HAVE_DMA_CONTIGUOUS
+   select CMA
help
  This enables the Contiguous Memory Allocator which allows drivers
  to allocate big physically-contiguous blocks of memory for use with
@@ -215,17 +214,7 @@ config CMA
  For more information see .
  If unsure, say "n".
 
-if CMA
-
-config CMA_DEBUG
-   bool "CMA debug messages (DEVELOPMENT)"
-   depends on DEBUG_KERNEL
-   help
- Turns on debug messages in CMA.  This produces KERN_DEBUG
- messages for every CMA call as well as various messages while
- processing calls such as dma_alloc_from_contiguous().
- This option does not affect warning and error messages.
-
+if  DMA_CMA
 comment "Default contiguous memory area size:"
 
 config CMA_SIZE_MBYTES
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 4e22ce3..5d93bb5 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -6,7 +6,7 @@ obj-y   := core.o bus.o dd.o syscore.o \
   attribute_container.o transport_class

Re: [PATCH V3 2/2] powerpc, perf: BHRB filter configuration should follow the task

2013-06-24 Thread Michael Neuling
Anshuman Khandual  wrote:

> When the task moves around the system, the corresponding cpuhw
> per cpu strcuture should be popullated with the BHRB filter
> request value so that PMU could be configured appropriately with
> that during the next call into power_pmu_enable().
> 
> Signed-off-by: Anshuman Khandual 

benh you might want to fix the spelling mistakes above 

  strcuture -> structure
  popullated -> populated

Otherwise:

Acked-by: Michael Neuling 


> ---
>  arch/powerpc/perf/core-book3s.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 426180b..48c68a8 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1122,8 +1122,11 @@ nocheck:
>  
>   ret = 0;
>   out:
> - if (has_branch_stack(event))
> + if (has_branch_stack(event)) {
>   power_pmu_bhrb_enable(event);
> + cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
> + event->attr.branch_sample_type);
> + }
>  
>   perf_pmu_enable(event->pmu);
>   local_irq_restore(flags);
> -- 
> 1.7.11.7
> 


Re: [PATCH V3 1/2] powerpc, perf: Ignore separate BHRB privilege state filter request

2013-06-24 Thread Michael Neuling
Anshuman Khandual  wrote:

> Completely ignore BHRB privilege state filter request as we are
> already configuring that with privilege state filtering attribute
> for the accompanying PMU event. This would help achieve cleaner
> user space interaction for BHRB.
> 
> This patch fixes a situation like this
> 
> Before patch:-
> 
> ./perf record -j any -e branch-misses:k ls
> Error:
> The sys_perf_event_open() syscall returned with 95 (Operation not
> supported) for event (branch-misses:k).
> /bin/dmesg may provide additional information.
> No CONFIG_PERF_EVENTS=y kernel support configured?
> 
> Here 'perf record' actually copies over ':k' filter request into BHRB
> privilege state filter config and our previous check in kernel would
> fail that.
> 
> After patch:-
> -
> ./perf record -j any -e branch-misses:k ls
> perf  perf.data  perf.data.old  test-mmap-ring
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.002 MB perf.data (~102 samples)]
> 
> Signed-off-by: Anshuman Khandual 

Acked-by: Michael Neuling 


> ---
>  arch/powerpc/perf/power8-pmu.c | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
> index f7d1c4f..371c6e7 100644
> --- a/arch/powerpc/perf/power8-pmu.c
> +++ b/arch/powerpc/perf/power8-pmu.c
> @@ -523,18 +523,13 @@ static int power8_generic_events[] = {
>  static u64 power8_bhrb_filter_map(u64 branch_sample_type)
>  {
>   u64 pmu_bhrb_filter = 0;
> - u64 br_privilege = branch_sample_type & ONLY_PLM;
>  
> - /* BHRB and regular PMU events share the same prvillege state
> + /* BHRB and regular PMU events share the same privilege state
>* filter configuration. BHRB is always recorded along with a
> -  * regular PMU event. So privilege state filter criteria for BHRB
> -  * and the companion PMU events has to be the same. As a default
> -  * "perf record" tool sets all privillege bits ON when no filter
> -  * criteria is provided in the command line. So as along as all
> -  * privillege bits are ON or they are OFF, we are good to go.
> +  * regular PMU event. As the privilege state filter is handled
> +  * in the basic PMC configuration of the accompanying regular
> +  * PMU event, we ignore any separate BHRB specific request.
>*/
> - if ((br_privilege != 7) && (br_privilege != 0))
> - return -1;
>  
>   /* No branch filter requested */
>   if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
> -- 
> 1.7.11.7
> 
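
The behaviour change in the quoted patch can be sketched in userspace C as follows. This is a simplified illustration, not the kernel function: the DEMO_* bit values are local copies assumed to match the perf branch-sample bits, DEMO_ONLY_PLM is assumed to be the OR of the three privilege bits, the return type is reduced to int, and only the PERF_SAMPLE_BRANCH_ANY case is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the branch-sample bits (assumed values). */
#define DEMO_BRANCH_USER   (1U << 0)
#define DEMO_BRANCH_KERNEL (1U << 1)
#define DEMO_BRANCH_HV     (1U << 2)
#define DEMO_BRANCH_ANY    (1U << 3)
#define DEMO_ONLY_PLM \
	(DEMO_BRANCH_USER | DEMO_BRANCH_KERNEL | DEMO_BRANCH_HV)

/* Before the patch: a partial privilege mask is rejected outright. */
int demo_filter_before(uint64_t branch_sample_type)
{
	uint64_t br_privilege = branch_sample_type & DEMO_ONLY_PLM;

	if (br_privilege != 7 && br_privilege != 0)
		return -1;
	if (branch_sample_type & DEMO_BRANCH_ANY)
		return 0;	/* no branch filter requested */
	return -1;		/* other filters are not modelled here */
}

/* After the patch: the privilege bits are ignored entirely. */
int demo_filter_after(uint64_t branch_sample_type)
{
	if (branch_sample_type & DEMO_BRANCH_ANY)
		return 0;	/* no branch filter requested */
	return -1;		/* other filters are not modelled here */
}

/* Models "perf record -j any -e branch-misses:k": ANY plus kernel only. */
void demo_perf_record_case(void)
{
	uint64_t type = DEMO_BRANCH_ANY | DEMO_BRANCH_KERNEL;

	assert(demo_filter_before(type) == -1);
	assert(demo_filter_after(type) == 0);
}
```

The kernel-only mask copied over from ':k' is neither 0 nor 7, so the old check fails the request, while the new code leaves privilege filtering to the accompanying PMU event.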


Re: [PATCH v3] powerpc/pseries: Enable PSTORE in pseries_defconfig

2013-06-24 Thread Michael Neuling
Aruna Balakrishnaiah  wrote:

> Since now we have pstore support for nvram in pseries, enable it
> in the default config. With this config option enabled, pstore
> infra-structure will be used to read/write the messages from/to nvram.
> 
> Signed-off-by: Aruna Balakrishnaiah 

Acked-by: Michael Neuling 

> ---
>  v3:
>   Move pstore config to right place
>  v2:
>   Change patch description
> 
>  arch/powerpc/configs/pseries_defconfig |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
> index c4dfbaf..bea8587 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -296,6 +296,7 @@ CONFIG_SQUASHFS=m
>  CONFIG_SQUASHFS_XATTR=y
>  CONFIG_SQUASHFS_LZO=y
>  CONFIG_SQUASHFS_XZ=y
> +CONFIG_PSTORE=y
>  CONFIG_NFS_FS=y
>  CONFIG_NFS_V3_ACL=y
>  CONFIG_NFS_V4=y
> 