Re: [PATCH V2] QorIQ/TMU: add thermal management support based on TMU

2015-07-29 Thread Eduardo Valentin
On Wed, Jul 29, 2015 at 02:19:39PM +0800, Jia Hongtao wrote:
> It supports one critical trip point and one passive trip point.
> The cpufreq is used as the cooling device to throttle CPUs when
> the passive trip is crossed.
> 
> Signed-off-by: Jia Hongtao 
> ---
> This patch is based on:
> http://patchwork.ozlabs.org/patch/482987/
> 
> Changes for V2:
> * Add tmu-range parse.
> * Use default trend hook.
> * Using latest thermal_zone_bind_cooling_device API.
> * Add calibration check during initialization.
> * Disable/enable device on suspend/resume.
> 
>  drivers/thermal/Kconfig |  11 ++
>  drivers/thermal/Makefile|   1 +
>  drivers/thermal/qoriq_thermal.c | 406 
> 
>  3 files changed, 418 insertions(+)
>  create mode 100644 drivers/thermal/qoriq_thermal.c
> 
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 118938e..a200745 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -180,6 +180,17 @@ config IMX_THERMAL
> cpufreq is used as the cooling device to throttle CPUs when the
> passive trip is crossed.
>  
> +config QORIQ_THERMAL
> + tristate "Freescale QorIQ Thermal Monitoring Unit"
> + depends on CPU_THERMAL
> + depends on OF
> + default n
> + help
> +   Enable thermal management based on Freescale QorIQ Thermal Monitoring
> +   Unit (TMU). It supports one critical trip point and one passive trip
> +   point. The cpufreq is used as the cooling device to throttle CPUs when
> +   the passive trip is crossed.
> +
>  config SPEAR_THERMAL
>   bool "SPEAr thermal sensor driver"
>   depends on PLAT_SPEAR
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 535dfee..8c25859 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -33,6 +33,7 @@ obj-$(CONFIG_DOVE_THERMAL)  += dove_thermal.o
>  obj-$(CONFIG_DB8500_THERMAL) += db8500_thermal.o
>  obj-$(CONFIG_ARMADA_THERMAL) += armada_thermal.o
>  obj-$(CONFIG_IMX_THERMAL)+= imx_thermal.o
> +obj-$(CONFIG_QORIQ_THERMAL)  += qoriq_thermal.o
>  obj-$(CONFIG_DB8500_CPUFREQ_COOLING) += db8500_cpufreq_cooling.o
>  obj-$(CONFIG_INTEL_POWERCLAMP)   += intel_powerclamp.o
>  obj-$(CONFIG_X86_PKG_TEMP_THERMAL)   += x86_pkg_temp_thermal.o
> diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
> new file mode 100644
> index 000..0694f42
> --- /dev/null
> +++ b/drivers/thermal/qoriq_thermal.c
> @@ -0,0 +1,406 @@
> +/*
> + * Copyright 2015 Freescale Semiconductor, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +/*
> + * Based on Freescale QorIQ Thermal Monitoring Unit (TMU)
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define SITES_MAX         16
> +
> +#define TMU_TEMP_PASSIVE  85000
> +#define TMU_TEMP_CRITICAL 95000
> +
> +#define TMU_PASSIVE_DELAY 1000 /* Milliseconds */
> +#define TMU_POLLING_DELAY 5000
> +
> +/* The driver supports 1 passive trip point and 1 critical trip point */
> +enum tmu_thermal_trip {
> + TMU_TRIP_PASSIVE,
> + TMU_TRIP_CRITICAL,
> + TMU_TRIP_NUM,
> +};
> +
> +/*
> + * QorIQ TMU Registers
> + */
> +struct qoriq_tmu_site_regs {
> + __be32 tritsr;  /* Immediate Temperature Site Register */
> + __be32 tratsr;  /* Average Temperature Site Register */
> + u8 res0[0x8];
> +} __packed;
> +
> +struct qoriq_tmu_regs {
> + __be32 tmr; /* Mode Register */
> +#define TMR_DISABLE  0x0
> +#define TMR_ME   0x8000
> +#define TMR_ALPF 0x0c00
> +#define TMR_MSITE    0x8000
> +#define TMR_ALL  (TMR_ME | TMR_ALPF | TMR_MSITE)
> + __be32 tsr; /* Status Register */
> + __be32 tmtmir;  /* Temperature measurement interval Register */
> +#define TMTMIR_DEFAULT   0x0007
> + u8 res0[0x14];
> + __be32 tier;    /* Interrupt Enable Register */
> +#define TIER_DISABLE 0x0
> + __be32 tidr;    /* Interrupt Detect Register */
> + __be32 tiscr;   /* Interrupt Site Capture Register */
> + __be32 ticscr;  /* Interrupt Critical Site Capture Register */
> + u8 res1[0x10];
> + __be32 tmhtcrh; /* High Temperature Capture Register */
> + __be32 tmhtcrl; /* Low Temperature Capture Register */
> + u8 res2[0x8];
> + __be32 tmhtitr; /* High Temperature Immediate Threshold */
> +
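
For readers of the archive, here is a minimal userspace sketch of how the two trip temperatures defined in the quoted patch partition a sensor reading. It assumes the values are in millidegrees Celsius and that a trip counts as crossed once the reading reaches it. The enum and the helper tmu_classify() are hypothetical names used only for illustration; they are not part of the patch.

```c
/* Trip temperatures as defined in the quoted patch (millidegrees Celsius). */
#define TMU_TEMP_PASSIVE  85000
#define TMU_TEMP_CRITICAL 95000

/* Hypothetical states, for illustration only. */
enum tmu_zone_state {
	TMU_STATE_NORMAL,   /* below the passive trip */
	TMU_STATE_PASSIVE,  /* passive trip crossed: cpufreq throttling applies */
	TMU_STATE_CRITICAL, /* critical trip crossed */
};

/* Classify a temperature reading against the two trip points. */
enum tmu_zone_state tmu_classify(int temp_mc)
{
	if (temp_mc >= TMU_TEMP_CRITICAL)
		return TMU_STATE_CRITICAL;
	if (temp_mc >= TMU_TEMP_PASSIVE)
		return TMU_STATE_PASSIVE;
	return TMU_STATE_NORMAL;
}
```

In the driver itself the comparison is made by the thermal core, not by a helper like this one.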

[PATCH] powerpc/eeh-powernv: Fix unbalanced IRQ warning

2015-07-29 Thread Alistair Popple
pnv_eeh_next_error() re-enables the eeh opal event interrupt but it
gets called from a loop if there are more outstanding events to
process, resulting in a warning due to enabling an already enabled
interrupt. Instead the interrupt should only be re-enabled once the
last outstanding event has been processed.

Tested-by: Daniel Axtens 
Reported-by: Daniel Axtens 
Signed-off-by: Alistair Popple 
Acked-by: Gavin Shan 
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index ca825ec..ff41c03 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1478,7 +1478,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
}
 
/* Unmask the event */
-   if (eeh_enabled())
+   if (ret == EEH_NEXT_ERR_NONE && eeh_enabled())
enable_irq(eeh_event_irq);
 
return ret;
-- 
1.8.3.2

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] powerpc/powernv: Fix unbalanced IRQ warning in eeh-powernv.c

2015-07-29 Thread Michael Ellerman
To avoid EEH getting invoked repeatedly for the same error, the OPAL
interrupt that invokes EEH is masked at the start of the process.

Currently, pnv_eeh_next_error() re-enables the interrupt but it gets
called from a loop if there are more outstanding events to process.
This causes an unbalanced enable warning.

Check that there are no more errors before enabling interrupts.
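
The imbalance described above can be reproduced with a small userspace model. This is only a sketch under simplifying assumptions: the struct, the model_* functions and the use of 0 to stand for EEH_NEXT_ERR_NONE are all hypothetical, and the real kernel tracks the disable depth per IRQ descriptor. The model masks the interrupt once, then calls a stand-in for pnv_eeh_next_error() in a loop until it reports that no errors remain.

```c
#include <stdbool.h>

/* Userspace model of the nested disable depth kept for one IRQ. */
struct irq_model {
	int depth;      /* 0 means the interrupt is enabled */
	int unbalanced; /* how many times an already enabled IRQ was enabled */
};

void model_disable_irq(struct irq_model *irq)
{
	irq->depth++;
}

void model_enable_irq(struct irq_model *irq)
{
	if (irq->depth == 0) {
		irq->unbalanced++; /* the kernel would warn here */
		return;
	}
	irq->depth--;
}

/*
 * Stand-in for pnv_eeh_next_error(): returns 1 while an event is
 * pending and 0 (playing the role of EEH_NEXT_ERR_NONE) otherwise.
 * With fixed == false it unmasks on every call, as before the patch.
 */
int model_next_error(struct irq_model *irq, int *pending, bool fixed)
{
	int ret = (*pending > 0) ? 1 : 0;

	if (*pending > 0)
		(*pending)--;
	if (!fixed || ret == 0)
		model_enable_irq(irq);
	return ret;
}

/* Mask once, drain all events, and report the number of warnings. */
int model_handle_events(int pending, bool fixed)
{
	struct irq_model irq = { 0, 0 };

	model_disable_irq(&irq);
	while (model_next_error(&irq, &pending, fixed) != 0)
		;
	return irq.unbalanced;
}
```

In this model, every outstanding event beyond the single mask produces one extra enable in the unpatched flow, while the patched flow enables exactly once.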

Fixed-by: Alistair Popple 
Tested-by: Daniel Axtens  [and changelog]
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 5cf5e6ea213b..7cf0df859d05 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1478,7 +1478,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
}
 
/* Unmask the event */
-   if (eeh_enabled())
+   if (ret == EEH_NEXT_ERR_NONE && eeh_enabled())
enable_irq(eeh_event_irq);
 
return ret;
-- 
2.1.4


[PATCH v2] powerpc/kernel: Enable seccomp filter

2015-07-29 Thread Michael Ellerman
This commit enables seccomp filter on powerpc, now that we have all the
necessary pieces in place.

To support seccomp's desire to modify the syscall return value under
some circumstances, we use a different ABI to the ptrace ABI. That is we
use r3 as the syscall return value, and orig_gpr3 is the first syscall
parameter.

This means the seccomp code, or a ptracer via SECCOMP_RET_TRACE, will
see -ENOSYS preloaded in r3. This is identical to the behaviour on x86,
and allows seccomp or the ptracer to either leave the -ENOSYS or change
it to something else, as well as rejecting or not the syscall by
modifying r0.

If seccomp does not reject the syscall, we restore the register state to
match what ptrace and audit expect, ie. r3 is the first syscall
parameter again. We do this restore using orig_gpr3, which may have been
modified by seccomp, which allows seccomp to modify the first syscall
parameter and allow the syscall to proceed.
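
The register handling described above can be modelled in userspace as follows. This is a sketch, not the kernel code: struct model_regs, the filter callback type and the two example filters are hypothetical stand-ins for pt_regs, __secure_computing() and a seccomp policy.

```c
#include <errno.h>

/* Minimal stand-in for the two pt_regs fields discussed above. */
struct model_regs {
	long gpr3;      /* r3 */
	long orig_gpr3; /* first syscall parameter as saved on entry */
};

/* Hypothetical filter callback: nonzero means "reject the syscall". */
typedef int (*model_filter_fn)(struct model_regs *regs);

int model_do_seccomp(struct model_regs *regs, model_filter_fn filter)
{
	/* The filter sees r3 preloaded with -ENOSYS as the return value. */
	regs->gpr3 = -ENOSYS;

	if (filter(regs))
		return -1; /* rejected: r3 holds -ENOSYS or the filter's value */

	/* Allowed: restore r3 from orig_gpr3, which the filter may have changed. */
	regs->gpr3 = regs->orig_gpr3;
	return 0;
}

/* Example: allow the syscall but rewrite its first parameter. */
int model_filter_allow_rewrite(struct model_regs *regs)
{
	regs->orig_gpr3 = 42;
	return 0;
}

/* Example: reject the syscall and choose the error it returns. */
int model_filter_reject_eperm(struct model_regs *regs)
{
	regs->gpr3 = -EPERM;
	return 1;
}
```

The point of the split is that a rejecting filter controls the return value through r3, while an allowing filter controls the first argument through orig_gpr3.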

We need to #ifdef the additional handling of r3 for seccomp, so move
it all out of line.

Signed-off-by: Michael Ellerman 
Reviewed-by: Kees Cook 
---
 arch/powerpc/Kconfig |  1 +
 arch/powerpc/kernel/ptrace.c | 41 -
 2 files changed, 41 insertions(+), 1 deletion(-)


v2: The previous version didn't compile for CONFIG_SECCOMP=n. To fix it up I
moved the logic out of line and #ifdef'ed that. It gets inlined by the
compiler so the end result is the same.

Kees I kept your Reviewed-by on the basis that the interesting logic is the
same, hope that's OK by you.

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index fe2f2c595fd9..4139644030fb 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -159,6 +159,7 @@ config PPC
select EDAC_SUPPORT
select EDAC_ATOMIC_SCRUB
select ARCH_HAS_DMA_SET_COHERENT_MASK
+   select HAVE_ARCH_SECCOMP_FILTER
 
 config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 7484221bb3f8..737c0d0b53ac 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -1762,6 +1762,44 @@ long arch_ptrace(struct task_struct *child, long request,
return ret;
 }
 
+#ifdef CONFIG_SECCOMP
+static int do_seccomp(struct pt_regs *regs)
+{
+   if (!test_thread_flag(TIF_SECCOMP))
+   return 0;
+
+   /*
+* The ABI we present to seccomp tracers is that r3 contains
+* the syscall return value and orig_gpr3 contains the first
+* syscall parameter. This is different to the ptrace ABI where
+* both r3 and orig_gpr3 contain the first syscall parameter.
+*/
+   regs->gpr[3] = -ENOSYS;
+
+   /*
+* We use the __ version here because we have already checked
+* TIF_SECCOMP. If this fails, there is nothing left to do, we
+* have already loaded -ENOSYS into r3, or seccomp has put
+* something else in r3 (via SECCOMP_RET_ERRNO/TRACE).
+*/
+   if (__secure_computing())
+   return -1;
+
+   /*
+* The syscall was allowed by seccomp, restore the register
+* state to what ptrace and audit expect.
+* Note that we use orig_gpr3, which means a seccomp tracer can
+* modify the first syscall parameter (in orig_gpr3) and also
+* allow the syscall to proceed.
+*/
+   regs->gpr[3] = regs->orig_gpr3;
+
+   return 0;
+}
+#else
+static inline int do_seccomp(struct pt_regs *regs) { return 0; }
+#endif /* CONFIG_SECCOMP */
+
 /**
  * do_syscall_trace_enter() - Do syscall tracing on kernel entry.
  * @regs: the pt_regs of the task to trace (current)
@@ -1787,7 +1825,8 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 
user_exit();
 
-   secure_computing_strict(regs->gpr[0]);
+   if (do_seccomp(regs))
+   return -1;
 
if (test_thread_flag(TIF_SYSCALL_TRACE)) {
/*
-- 
2.1.4


Re: [PATCH] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR

2015-07-29 Thread Wei Yang
On Thu, Jul 30, 2015 at 11:15:01AM +1000, Gavin Shan wrote:
>On Wed, Jul 29, 2015 at 03:22:07PM +0800, Wei Yang wrote:
>>In current implementation, when VF BAR is bigger than 64MB, it uses 4 M64
>>BAR in Single PE mode to cover the number of VFs required to be enabled.
>>By doing so, several VFs would be in one VF Group and leads to interference
>>between VFs in the same group.
>>
>>This patch changes the design by using one M64 BAR in Single PE mode for
>>one VF BAR. This gives absolute isolation for VFs.
>>
>>Signed-off-by: Wei Yang 
>>---
>> arch/powerpc/include/asm/pci-bridge.h |5 +-
>> arch/powerpc/platforms/powernv/pci-ioda.c |  104 
>> +
>> 2 files changed, 18 insertions(+), 91 deletions(-)
>>
>
>questions regarding this:
>
>(1) When M64 BAR is running in single-PE-mode for VFs, the alignment for one
>particular IOV BAR still have to be (IOV_BAR_size * max_vf_number), or
>M64 segment size of last BAR (0x1000) is fine? If the later one is fine,
>more M64 space would be saved. On the other hand, if the IOV BAR size
>(for all VFs) is less than 256MB, will the allocated resource conflict
>with the M64 segments in last BAR?

It does not need to be IOV BAR size aligned; being aligned to the individual
VF BAR size is fine.

IOV BAR size = VF BAR size * expanded_num_vfs

>(2) When M64 BAR is in single-PE-mode, the PE numbers allocated for VFs need
>continuous or not.

No, they do not need to be.

>(3) Each PF could have 6 IOV BARs and there're 15 available M64 BAR. It means
>only two VFs can be enabled in the extreme case. Would it be a problem?
>

Yes, you are right.

Based on Alexey's mail, full isolation is more important than more VFs.
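
As a worked check of the numbers in this exchange, the sketch below computes the IOV BAR size formula given above and the VF limit from question (3). The function names are hypothetical, and the second function assumes the extreme case where every IOV BAR of the PF needs its own M64 BAR per VF.

```c
#include <stdint.h>

/* IOV BAR size = VF BAR size * number of VFs the IOV BAR was expanded to. */
uint64_t iov_bar_size(uint64_t vf_bar_size, unsigned int vfs_expanded)
{
	return vf_bar_size * vfs_expanded;
}

/*
 * With one M64 BAR per VF BAR, each enabled VF consumes one M64 BAR
 * for every IOV BAR the PF implements.
 */
unsigned int max_vfs_single_pe(unsigned int m64_bars_available,
			       unsigned int iov_bars_per_pf)
{
	if (iov_bars_per_pf == 0)
		return 0;
	return m64_bars_available / iov_bars_per_pf;
}
```

With 15 available M64 BARs and 6 IOV BARs, max_vfs_single_pe(15, 6) gives 2, which matches the "only two VFs" figure quoted above.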

>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>index 712add5..1997e5d 100644
>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>@@ -214,10 +214,9 @@ struct pci_dn {
>>  u16 vfs_expanded;   /* number of VFs IOV BAR expanded */
>>  u16 num_vfs;/* number of VFs enabled*/
>>  int offset; /* PE# for the first VF PE */
>>-#define M64_PER_IOV 4
>>- int m64_per_iov;
>>+#define MAX_M64_WINDOW  16
>> #define IODA_INVALID_M64(-1)
>>- int m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
>>+ int m64_wins[PCI_SRIOV_NUM_BARS][MAX_M64_WINDOW];
>> #endif /* CONFIG_PCI_IOV */
>> #endif
>
>The "m64_wins" would be renamed to "m64_map". Also, it would have dynamic size:
>
>- When the IOV BAR is extended to 256 segments, its size is sizeof(int) * 
>PCI_SRIOV_NUM_BARS;
>- When the IOV BAR is extended to max_vf_num, its size is sizeof(int) * 
>max_vf_num;
>
>>  struct list_head child_list;
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index 5738d31..b3e7909 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -1168,7 +1168,7 @@ static int pnv_pci_vf_release_m64(struct pci_dev *pdev)
>>  pdn = pci_get_pdn(pdev);
>>
>>  for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
>>- for (j = 0; j < M64_PER_IOV; j++) {
>>+ for (j = 0; j < MAX_M64_WINDOW; j++) {
>>  if (pdn->m64_wins[i][j] == IODA_INVALID_M64)
>>  continue;
>>  opal_pci_phb_mmio_enable(phb->opal_id,
>>@@ -1193,8 +1193,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
>>u16 num_vfs)
>>  inttotal_vfs;
>>  resource_size_tsize, start;
>>  intpe_num;
>>- intvf_groups;
>>- intvf_per_group;
>>+ intm64s;
>
>"m64s" could have better name. For example, "vfs_per_m64_bar"...
>

m64s is used to represent number of M64 BARs necessary to enable num_vfs.
vfs_per_m64_bar may be misleading.

How about "m64_bars" ?

>>
>>  bus = pdev->bus;
>>  hose = pci_bus_to_host(bus);
>>@@ -1204,17 +1203,13 @@ static int pnv_pci_vf_assign_m64(struct pci_dev 
>>*pdev, u16 num_vfs)
>>
>>  /* Initialize the m64_wins to IODA_INVALID_M64 */
>>  for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
>>- for (j = 0; j < M64_PER_IOV; j++)
>>+ for (j = 0; j < MAX_M64_WINDOW; j++)
>>  pdn->m64_wins[i][j] = IODA_INVALID_M64;
>>
>>- if (pdn->m64_per_iov == M64_PER_IOV) {
>>- vf_groups = (num_vfs <= M64_PER_IOV) ? num_vfs: M64_PER_IOV;
>>- vf_per_group = (num_vfs <= M64_PER_IOV)? 1:
>>- roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
>>- } else {
>>- vf_groups = 1;
>>- vf_per_group = 1;
>>- }
>>+ if (pdn->vfs_expanded != phb->ioda.total_pe)
>>+ m64s = num_vfs;
>>+ else
>>+ m64s = 1;
>
>The condition (pdn->vfs_expanded != phb->ioda.total_pe) isn't precise enough 

RE: [PATCH][v2] powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB board support

2015-07-29 Thread Priyanka Jain


> -Original Message-
> From: Wood Scott-B07421
> Sent: Thursday, July 30, 2015 3:45 AM
> To: Jain Priyanka-B32167
> Cc: linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH][v2] powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB
> board support
> 
> On Wed, 2015-07-29 at 04:07 -0500, Jain Priyanka-B32167 wrote:
> >
> > > -Original Message-
> > > From: Wood Scott-B07421
> > > Sent: Friday, July 24, 2015 8:58 PM
> > > To: Jain Priyanka-B32167
> > > Cc: linuxppc-dev@lists.ozlabs.org
> > > Subject: Re: [PATCH][v2] powerpc/fsl-booke: Add
> > > T1040D4RDB/T1042D4RDB board support
> > >
> > > OK, so you're saying the i2c devices are pluggable (and I'm assuming
> > > by "PEX slots" you just mean that the physical slot is repurposed,
> > > not that the PCI express protocol is involved)?  Making a
> > > non-runtime-enumerable bus be pluggable seems like a bad idea, but
> > > if that's really what has been done, there needs to be a device tree
> > > that represents the entire system, not just the motherboard.  This
> > > could be done either via a dts file that /include/s the motherboard
> > > dts, or via firmware dtb edits.  The dts for the motherboard should
> > > include the mux node with a comment explaining what the situation
> > > is.
> > >
> > [Jain Priyanka-B32167] Is the below comment looks OK?
> > "Output I2C data, clock lines (SDO/SC0,SD1/SC1 , SD2/SC2, SD3/SC3) are
> > going mini PCI connector slot1, mini PCI connector slot2, HDMI
> > connector, PEX slot respectively  The sub-nodes will depend upon the
> > device that will be connected on these slots"
> 
> How about:
> 
> "Child nodes depend on which i2c devices are connected via the mini PCI
> connector slot1, the mini PCI connector slot2, the HDMI connector, and the
> PEX slot.  Systems with such devices attached should provide a wrapper .dts
> file that includes this one, and adds those nodes."
> 
[Jain Priyanka-B32167] Thanks, this looks good. I will send the updated patch.
> -Scott


[PATCH][v3] powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB board support

2015-07-29 Thread Priyanka Jain
T1040D4RDB/T1042D4RDB are Freescale Reference Design Boards
which support the T1040/T1042 QorIQ Power
Architecture™ processors, respectively.

T1040D4RDB/T1042D4RDB board Overview
------------------------------------
- SERDES Connections, 8 lanes supporting:
    - PCI
    - SGMII
    - SATA 2.0
    - QSGMII (only for T1040D4RDB)
- DDR Controller
    - Supports rates of up to 1600 MHz data-rate
    - Supports one DDR4 UDIMM
- IFC/Local Bus
    - NAND flash: 1GB 8-bit NAND flash
    - NOR: 128MB 16-bit NOR Flash
- Ethernet
    - Two on-board RGMII 10/100/1G ethernet ports.
    - PHY #0 remains powered up during deep-sleep
- CPLD
- Clocks
    - System and DDR clock (SYSCLK, DDRCLK)
    - SERDES clocks
- Power Supplies
- USB
    - Supports two USB 2.0 ports with integrated PHYs
    - Two type A ports with 5V@1.5A per port.
- SDHC
    - SDHC/SDXC connector
- SPI
    - On-board 64MB SPI flash
- I2C
    - Devices connected: EEPROM, thermal monitor, VID controller
- Other IO
    - Two Serial ports
    - ProfiBus port

Add support for T1040D4RDB/T1042D4RDB boards:
- Add device tree
- Add entry in corenet_generic.c

Signed-off-by: Priyanka Jain 
---
 Changes for v3:
Add comment for mux node to tell about child node representation

 Changes for v2:
Incorporated Scott's comments on device tree

 arch/powerpc/boot/dts/t1040d4rdb.dts  |   46 ++
 arch/powerpc/boot/dts/t1042d4rdb.dts  |   53 +++
 arch/powerpc/boot/dts/t104xd4rdb.dtsi |  205 +
 arch/powerpc/platforms/85xx/corenet_generic.c |2 +
 4 files changed, 306 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/t1040d4rdb.dts
 create mode 100644 arch/powerpc/boot/dts/t1042d4rdb.dts
 create mode 100644 arch/powerpc/boot/dts/t104xd4rdb.dtsi

diff --git a/arch/powerpc/boot/dts/t1040d4rdb.dts b/arch/powerpc/boot/dts/t1040d4rdb.dts
new file mode 100644
index 000..2d1315a
--- /dev/null
+++ b/arch/powerpc/boot/dts/t1040d4rdb.dts
@@ -0,0 +1,46 @@
+/*
+ * T1040D4RDB Device Tree Source
+ *
+ * Copyright 2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor "AS IS" AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/include/ "fsl/t104xsi-pre.dtsi"
+/include/ "t104xd4rdb.dtsi"
+
+/ {
+   model = "fsl,T1040D4RDB";
+   compatible = "fsl,T1040D4RDB";
+   #address-cells = <2>;
+   #size-cells = <2>;
+   interrupt-parent = <&mpic>;
+};
+
+/include/ "fsl/t1040si-post.dtsi"
diff --git a/arch/powerpc/boot/dts/t1042d4rdb.dts b/arch/powerpc/boot/dts/t1042d4rdb.dts
new file mode 100644
index 000..846f8c8
--- /dev/null
+++ b/arch/powerpc/boot/dts/t1042d4rdb.dts
@@ -0,0 +1,53 @@
+/*
+ * T1042D4RDB Device Tree Source
+ *
+ * Copyright 2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the 

RE: [RFC v2] genalloc:add an gen_pool_first_fit_align algo to genalloc

2015-07-29 Thread Zhao Qiang
On Thu, 2015-07-30 at 5:21, Scott Wood wrote:
> -Original Message-
> From: Wood Scott-B07421
> Sent: Thursday, July 30, 2015 12:19 AM
> To: Zhao Qiang-B45475
> Cc: lau...@codeaurora.org; linux-ker...@vger.kernel.org; linuxppc-
> d...@lists.ozlabs.org; a...@linux-foundation.org; o...@lixom.net;
> catalin.mari...@arm.com; Xie Xiaobo-R63061
> Subject: Re: [RFC v2] genalloc:add an gen_pool_first_fit_align algo to
> genalloc
> 
> On Tue, 2015-07-28 at 00:32 -0500, Zhao Qiang-B45475 wrote:
> > On Tue, 2015-07-28 at 5:21, Scott Wood wrote:
> > > -Original Message-
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, July 28, 2015 5:21 AM
> > > To: Zhao Qiang-B45475
> > > Cc: lau...@codeaurora.org; linux-ker...@vger.kernel.org; linuxppc-
> > > d...@lists.ozlabs.org; a...@linux-foundation.org; o...@lixom.net;
> > > catalin.mari...@arm.com; Xie Xiaobo-R63061
> > > Subject: Re: [RFC v2] genalloc:add an gen_pool_first_fit_align algo
> > > to genalloc
> > >
> > > On Mon, 2015-07-27 at 17:57 +0800, Zhao Qiang wrote:
> > >
> > > Where's the part that adds the ability to pass in data to each
> > > allocation call, as per the previous discussion?
> >
> > You means to use gen_pool_alloc_data()?
> 
> Yes.
> 
> > Previously you said that the format of data is algorithm-specific, So
> > I think it is better to handle data in algorithm function.
> 
> It is a channel for communication from the API caller to the algorithm.
> 
> > If you still prefer gen_pool_alloc_data(), I will modify it.
> > But there still are details I want to confirm with you.
> > 1. If use gen_pool_alloc_data(), should I pass data as a parameter?
> 
> Yes.
> 
> > 2. Should I count align_mask in gen_pool_alloc_data(), meanwhile, add
> >a align_mask to data as a member?
> 
> gen_pool_alloc_data() should just pass data to the algorithm.  The
> algorithm should calculate align_mask based on align.  I don't think
> exposing align_mask to API users would be very friendly.

If align_mask is calculated in the algorithm, I need to get
pool->min_alloc_order in the algorithm, like:
   order = data->pool->min_alloc_order;
   align_mask = ((data->align + (1UL << order) - 1) >> order) - 1;
so I would add pool to the data structure as a member. Is there a better idea?
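
To make the formula above concrete, here is a small standalone sketch. The function name model_align_mask() is hypothetical; align is the requested alignment in bytes and order is the pool's min_alloc_order, so the result is a mask expressed in allocation units of (1 << order) bytes.

```c
/*
 * Compute the alignment mask in units of the pool's minimum
 * allocation size, using the expression from the mail above.
 */
unsigned long model_align_mask(unsigned long align, int order)
{
	return ((align + (1UL << order) - 1) >> order) - 1;
}
```

For example, with a 32-byte allocation unit (order 5), a 64-byte alignment yields a mask of 1 and a 128-byte alignment yields a mask of 3. Any alignment at or below the unit size yields 0, meaning every unit boundary already satisfies it.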

> 
> > 3. where to define the data, in genalloc.h or caller layer?
> 
> Same place as where the algorithm function is declared.
> 
> -Scott
> 
-Zhao Qiang

Re: [PATCH] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR

2015-07-29 Thread Gavin Shan
On Wed, Jul 29, 2015 at 03:22:07PM +0800, Wei Yang wrote:
>In current implementation, when VF BAR is bigger than 64MB, it uses 4 M64
>BAR in Single PE mode to cover the number of VFs required to be enabled.
>By doing so, several VFs would be in one VF Group and leads to interference
>between VFs in the same group.
>
>This patch changes the design by using one M64 BAR in Single PE mode for
>one VF BAR. This gives absolute isolation for VFs.
>
>Signed-off-by: Wei Yang 
>---
> arch/powerpc/include/asm/pci-bridge.h |5 +-
> arch/powerpc/platforms/powernv/pci-ioda.c |  104 +
> 2 files changed, 18 insertions(+), 91 deletions(-)
>

Some questions regarding this:

(1) When an M64 BAR is running in single-PE mode for VFs, does the alignment
    for one particular IOV BAR still have to be (IOV_BAR_size * max_vf_number),
    or is the M64 segment size of the last BAR (0x1000) fine? If the latter is
    fine, more M64 space would be saved. On the other hand, if the IOV BAR size
    (for all VFs) is less than 256MB, will the allocated resource conflict
    with the M64 segments in the last BAR?
(2) When an M64 BAR is in single-PE mode, do the PE numbers allocated for VFs
    need to be contiguous or not?
(3) Each PF could have 6 IOV BARs and there are 15 available M64 BARs. It means
    only two VFs can be enabled in the extreme case. Would that be a problem?

>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>index 712add5..1997e5d 100644
>--- a/arch/powerpc/include/asm/pci-bridge.h
>+++ b/arch/powerpc/include/asm/pci-bridge.h
>@@ -214,10 +214,9 @@ struct pci_dn {
>   u16 vfs_expanded;   /* number of VFs IOV BAR expanded */
>   u16 num_vfs;/* number of VFs enabled*/
>   int offset; /* PE# for the first VF PE */
>-#define M64_PER_IOV 4
>-  int m64_per_iov;
>+#define MAX_M64_WINDOW  16
> #define IODA_INVALID_M64(-1)
>-  int m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
>+  int m64_wins[PCI_SRIOV_NUM_BARS][MAX_M64_WINDOW];
> #endif /* CONFIG_PCI_IOV */
> #endif

The "m64_wins" could be renamed to "m64_map". Also, it could have a dynamic size:

- When the IOV BAR is extended to 256 segments, its size is
  sizeof(int) * PCI_SRIOV_NUM_BARS;
- When the IOV BAR is extended to max_vf_num, its size is
  sizeof(int) * max_vf_num;

>   struct list_head child_list;
>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>index 5738d31..b3e7909 100644
>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>@@ -1168,7 +1168,7 @@ static int pnv_pci_vf_release_m64(struct pci_dev *pdev)
>   pdn = pci_get_pdn(pdev);
>
>   for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
>-  for (j = 0; j < M64_PER_IOV; j++) {
>+  for (j = 0; j < MAX_M64_WINDOW; j++) {
>   if (pdn->m64_wins[i][j] == IODA_INVALID_M64)
>   continue;
>   opal_pci_phb_mmio_enable(phb->opal_id,
>@@ -1193,8 +1193,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>   inttotal_vfs;
>   resource_size_tsize, start;
>   intpe_num;
>-  intvf_groups;
>-  intvf_per_group;
>+  intm64s;

"m64s" could have a better name. For example, "vfs_per_m64_bar"...

>
>   bus = pdev->bus;
>   hose = pci_bus_to_host(bus);
>@@ -1204,17 +1203,13 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>
>   /* Initialize the m64_wins to IODA_INVALID_M64 */
>   for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
>-  for (j = 0; j < M64_PER_IOV; j++)
>+  for (j = 0; j < MAX_M64_WINDOW; j++)
>   pdn->m64_wins[i][j] = IODA_INVALID_M64;
>
>-  if (pdn->m64_per_iov == M64_PER_IOV) {
>-  vf_groups = (num_vfs <= M64_PER_IOV) ? num_vfs: M64_PER_IOV;
>-  vf_per_group = (num_vfs <= M64_PER_IOV)? 1:
>-  roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
>-  } else {
>-  vf_groups = 1;
>-  vf_per_group = 1;
>-  }
>+  if (pdn->vfs_expanded != phb->ioda.total_pe)
>+  m64s = num_vfs;
>+  else
>+  m64s = 1;

The condition (pdn->vfs_expanded != phb->ioda.total_pe) isn't precise enough as
explained below.
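
For comparison, the sketch below puts the removed grouping arithmetic next to the new "m64s" selection from the hunk above. It is a userspace model only: the old_* functions cover just the branch where pdn->m64_per_iov == M64_PER_IOV, model_roundup_pow_of_two() stands in for the kernel helper, and the single_pe_mode flag is a hypothetical substitute for the (pdn->vfs_expanded != phb->ioda.total_pe) test.

```c
#define M64_PER_IOV 4 /* value removed by the patch */

/* Userspace stand-in for the kernel's roundup_pow_of_two(). */
unsigned int model_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Old scheme: at most M64_PER_IOV M64 BARs per IOV BAR. */
unsigned int old_vf_groups(unsigned int num_vfs)
{
	return (num_vfs <= M64_PER_IOV) ? num_vfs : M64_PER_IOV;
}

/* Old scheme: VFs sharing one M64 BAR, and therefore one group. */
unsigned int old_vf_per_group(unsigned int num_vfs)
{
	return (num_vfs <= M64_PER_IOV) ?
		1 : model_roundup_pow_of_two(num_vfs) / M64_PER_IOV;
}

/* New scheme: one M64 BAR per VF when in single-PE mode, else one BAR. */
unsigned int new_m64s(unsigned int num_vfs, int single_pe_mode)
{
	return single_pe_mode ? num_vfs : 1;
}
```

With 16 VFs, the old arithmetic gives 4 groups of 4 VFs each, which is the sharing the patch description calls interference, while the new selection uses 16 M64 BARs.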

>
>   for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
>   res = &pdev->resource[i + PCI_IOV_RESOURCES];
>@@ -1224,7 +1219,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
>   if (!pnv_pci_is_mem_pref_64(res->flags))
>   continue;
>
>-  for (j = 0; j < vf_groups; j++) {
>+  for (j = 0; j < m64s; j++) {
>   do {
>   win = 
> find_next_zero_bit(&ph

[PATCH] powerpc/eeh: Disable automatically blocked PCI config

2015-07-29 Thread Gavin Shan
pcibios_set_pcie_reset_state() can be called to complete a reset
request when passing through a PCI device. In that path, the flag
EEH_PE_ISOLATED is set before saving the PCI config space.
On some Broadcom adapters, EEH_PE_CFG_BLOCKED is automatically
set when the flag EEH_PE_ISOLATED is marked. This causes bogus
data to be saved from the PCI config space, which is then restored
to the PCI adapter after the reset. Eventually, the hardware
can't work with the corrupted data in its PCI config space.

The patch fixes the issue with eeh_pe_state_mark_with_cfg(), which
leaves EEH_PE_CFG_BLOCKED clear after marking EEH_PE_ISOLATED on the
PE, in order to avoid saving bogus data and restoring it to the PCI
config space.
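
The flag interaction can be sketched in userspace as follows. All names here are hypothetical: the MODEL_* bit values are illustrative and not taken from the kernel headers, and the auto_block_cfg field is an assumed stand-in for "adapters whose config space is blocked automatically on isolation".

```c
/* Illustrative flag bits; not the kernel's definitions. */
#define MODEL_PE_ISOLATED    (1 << 0)
#define MODEL_PE_CFG_BLOCKED (1 << 1)

struct model_pe {
	int state;
	int auto_block_cfg; /* nonzero: isolation also blocks config space */
};

/* Models the plain mark: some PEs get their config space blocked too. */
void model_state_mark(struct model_pe *pe, int state)
{
	pe->state |= state;
	if ((state & MODEL_PE_ISOLATED) && pe->auto_block_cfg)
		pe->state |= MODEL_PE_CFG_BLOCKED;
}

void model_state_clear(struct model_pe *pe, int state)
{
	pe->state &= ~state;
}

/* Same shape as the new helper: mark, then undo the automatic block. */
void model_state_mark_with_cfg(struct model_pe *pe, int state)
{
	model_state_mark(pe, state);
	if (!(state & MODEL_PE_ISOLATED))
		return;

	/* Config space must stay readable so its state can be saved. */
	model_state_clear(pe, MODEL_PE_CFG_BLOCKED);
}
```

In the reset path the config space is blocked explicitly only after the device state has been saved, which is why the automatic block has to be undone first.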

Reported-by: Rajanikanth H. Adaveeshaiah 
Signed-off-by: Gavin Shan 
---
 arch/powerpc/include/asm/ppc-pci.h |  1 +
 arch/powerpc/kernel/eeh.c  |  4 ++--
 arch/powerpc/kernel/eeh_pe.c   | 22 ++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 4122a86..ca0c5bf 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -61,6 +61,7 @@ int rtas_write_config(struct pci_dn *, int where, int size, u32 val);
 int rtas_read_config(struct pci_dn *, int where, int size, u32 *val);
 void eeh_pe_state_mark(struct eeh_pe *pe, int state);
 void eeh_pe_state_clear(struct eeh_pe *pe, int state);
+void eeh_pe_state_mark_with_cfg(struct eeh_pe *pe, int state);
 void eeh_pe_dev_mode_mark(struct eeh_pe *pe, int mode);
 
 void eeh_sysfs_add_device(struct pci_dev *pdev);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index af9b597..8050f5e 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -750,14 +750,14 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat
eeh_pe_state_clear(pe, EEH_PE_ISOLATED);
break;
case pcie_hot_reset:
-   eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+   eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED);
eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
eeh_ops->reset(pe, EEH_RESET_HOT);
break;
case pcie_warm_reset:
-   eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+   eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED);
eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED);
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 35f0b62..8654cb1 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -657,6 +657,28 @@ void eeh_pe_state_clear(struct eeh_pe *pe, int state)
eeh_pe_traverse(pe, __eeh_pe_state_clear, &state);
 }
 
+/**
+ * eeh_pe_state_mark_with_cfg - Mark PE state with unblocked config space
+ * @pe: PE
+ * @state: PE state to be set
+ *
+ * Set specified flag to PE and its child PEs. The PCI config space
+ * of some PEs is blocked automatically when EEH_PE_ISOLATED is set,
+ * which isn't needed in some situations. The function allows to set
+ * the specified flag to indicated PEs without blocking their PCI
+ * config space.
+ */
+void eeh_pe_state_mark_with_cfg(struct eeh_pe *pe, int state)
+{
+   eeh_pe_traverse(pe, __eeh_pe_state_mark, &state);
+   if (!(state & EEH_PE_ISOLATED))
+   return;
+
+   /* Clear EEH_PE_CFG_BLOCKED, which might be set just now */
+   state = EEH_PE_CFG_BLOCKED;
+   eeh_pe_traverse(pe, __eeh_pe_state_clear, &state);
+}
+
 /*
  * Some PCI bridges (e.g. PLX bridges) have primary/secondary
  * buses assigned explicitly by firmware, and we probably have
-- 
2.1.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF

2015-07-29 Thread Gavin Shan
On Fri, Jul 17, 2015 at 10:14:41AM +1000, Gavin Shan wrote:
>When the VF BAR size is equal to 128MB or bigger than that, the IOV BAR
>is extended to cover the maximum number of VFs supported by the PF, not 256.
>Also, one PHB's M64 BAR is picked to cover VF BARs for 4 continuous VFs,
>but the PHB's M64 BAR is configured as being owned by single PE. Eventually,
>those 4 VFs have 4 separate PEs from the perspective of PCI config or DMA,
>but single shared PE from MMIO's perspective. Once we have compound PE, all
>those 4 VFs included in the compound PE can't be passed to separate guests
>with VFIO infrastructure.
>
>The above gate (128MB) was chosen based on the assumption: one IOV BAR can
>consume 1/4 of PHB's M64 window, which is 16GB. However, it can consume as
>much as half of that (32GB) when the PF sits behind the root port.
>Accordingly, the gate can be doubled to be 256MB in order to avoid compound
>PE as far as we can.
>
>

Please ignore those two patches as Richard already sent one patch fixing it
in a better way. Sorry for the noise!

Thanks,
Gavin

>Gavin Shan (2):
>  powerpc/powernv: Fix alignment for IOV BAR
>  powerpc/powernv: Double VF BAR size for compound PE
>
> arch/powerpc/platforms/powernv/pci-ioda.c | 56 +--
> 1 file changed, 45 insertions(+), 11 deletions(-)
>
>-- 
>2.1.0
>

Re: [PATCH v5 1/2] perf,kvm/ppc: Add kvm_perf.h for powerpc

2015-07-29 Thread Scott Wood
On Wed, 2015-07-29 at 16:07 +0530, Hemant Kumar wrote:
> Hi Scott,
> 
> On 07/17/2015 01:40 AM, Scott Wood wrote:
> > On Thu, 2015-07-16 at 21:18 +0530, Hemant Kumar wrote:
> > > To analyze the exit events with perf, we need kvm_perf.h to be added in
> > > the arch/powerpc directory, where the kvm tracepoints needed to trace
> > > the KVM exit events are defined.
> > > 
> > > This patch adds "kvm_perf_book3s.h" to indicate that the tracepoints are
> > > book3s specific. Generic "kvm_perf.h" then can just include
> > > "kvm_perf_book3s.h".
> > > 
> > > Signed-off-by: Hemant Kumar 
> > > ---
> > > Changes:
> > > > - Not exporting the exit reasons compared to previous patchset
> > > >   (suggested by Paul)
> > > 
> > >   arch/powerpc/include/uapi/asm/kvm_perf.h|  6 ++
> > >   arch/powerpc/include/uapi/asm/kvm_perf_book3s.h | 14 ++
> > >   2 files changed, 20 insertions(+)
> > >   create mode 100644 arch/powerpc/include/uapi/asm/kvm_perf.h
> > >   create mode 100644 arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > 
> > > diff --git a/arch/powerpc/include/uapi/asm/kvm_perf.h
> > > b/arch/powerpc/include/uapi/asm/kvm_perf.h
> > > new file mode 100644
> > > index 000..5ed2ff3
> > > --- /dev/null
> > > +++ b/arch/powerpc/include/uapi/asm/kvm_perf.h
> > > @@ -0,0 +1,6 @@
> > > +#ifndef _ASM_POWERPC_KVM_PERF_H
> > > +#define _ASM_POWERPC_KVM_PERF_H
> > > +
> > > +#include 
> > > +
> > > +#endif
> > > diff --git a/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > b/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > new file mode 100644
> > > index 000..8c8d8c2
> > > --- /dev/null
> > > +++ b/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > @@ -0,0 +1,14 @@
> > > +#ifndef _ASM_POWERPC_KVM_PERF_BOOK3S_H
> > > +#define _ASM_POWERPC_KVM_PERF_BOOK3S_H
> > > +
> > > +#include 
> > > +
> > > +#define DECODE_STR_LEN 20
> > > +
> > > +#define VCPU_ID "vcpu_id"
> > > +
> > > +#define KVM_ENTRY_TRACE "kvm_hv:kvm_guest_enter"
> > > +#define KVM_EXIT_TRACE "kvm_hv:kvm_guest_exit"
> > > +#define KVM_EXIT_REASON "trap"
> > > +
> > > +#endif /* _ASM_POWERPC_KVM_PERF_BOOK3S_H */
> > Again, why is book3s stuff being presented via uapi as generic
> >  with generic symbol names?
> > 
> > -Scott
> 
> Ok.
> 
> We can change the KVM_ENTRY_TRACE macro to something like
> KVM_BOOK3S_ENTRY_TRACE and likewise for KVM_EXIT_TRACE
> and KVM_EXIT_REASON

What about DECODE_STR_LEN and VCPU_ID?

Where is this API documented?

>  and then, to resolve the issue of generic
> macro names in the userspace side, we can handle it using __weak
> modifier.

Does userspace get built differently for book3s versus book3e?  For now it'd
be fine for userspace to check for book3s and not use the feature if it's
book3e.  If and when book3e gains this feature, then userspace can be changed.

> What would you suggest?

Another option would be to explain this interface so that we can figure out 
if book3e would even want different values for these, and if not, move it to 
asm/kvm.h.

-Scott


Re: [PATCH][v2] powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB board support

2015-07-29 Thread Scott Wood
On Wed, 2015-07-29 at 04:07 -0500, Jain Priyanka-B32167 wrote:
> 
> > -Original Message-
> > From: Wood Scott-B07421
> > Sent: Friday, July 24, 2015 8:58 PM
> > To: Jain Priyanka-B32167
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Subject: Re: [PATCH][v2] powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB
> > board support
> > 
> > OK, so you're saying the i2c devices are pluggable (and I'm assuming by
> > "PEX slots" you just mean that the physical slot is repurposed, not that
> > the PCI express protocol is involved)?  Making a non-runtime-enumerable
> > bus be pluggable seems like a bad idea, but if that's really what has
> > been done, there needs to be a device tree that represents the entire
> > system, not just the motherboard.  This could be done either via a dts
> > file that /include/s the motherboard dts, or via firmware dtb edits.  The
> > dts for the motherboard should include the mux node with a comment
> > explaining what the situation is.
> > 
> [Jain Priyanka-B32167] Does the below comment look OK?
> "Output I2C data, clock lines (SD0/SC0, SD1/SC1, SD2/SC2, SD3/SC3) are
> going to mini PCI connector slot1, mini PCI connector slot2, HDMI
> connector, PEX slot respectively.
> The sub-nodes will depend upon the device that will be connected on these
> slots"

How about:

"Child nodes depend on which i2c devices are connected via the mini PCI
connector slot1, the mini PCI connector slot2, the HDMI connector, and the
PEX slot.  Systems with such devices attached should provide a wrapper .dts
file that includes this one, and adds those nodes."

-Scott

Re: [PATCH] gianfar: Fix warnings when built on 64-bit

2015-07-29 Thread Arnd Bergmann
On Wednesday 29 July 2015 11:02:41 Scott Wood wrote:
> On Wed, 2015-07-29 at 10:02 +0200, Arnd Bergmann wrote:
> > On Wednesday 29 July 2015 00:24:37 Scott Wood wrote:
> > > +#ifdef CONFIG_PM
> > >  static void lock_tx_qs(struct gfar_private *priv)
> > >  {
> > > int i;
> > > @@ -580,6 +581,7 @@ static void unlock_tx_qs(struct gfar_private *priv)
> > > for (i = 0; i < priv->num_tx_queues; i++)
> > > spin_unlock(&priv->tx_queue[i]->txlock);
> > >  }
> > > +#endif
> > >  
> > 
> > This seems unrelated and should probably be a separate fix.
> 
> It's related in that it fixes a warning -- the 64-bit build didn't have 
> CONFIG_PM -- though I should have been clearer about that in the changelog.

Yes, that's what I meant: you can easily have a 32-bit build without
CONFIG_PM of course, and that would have the same problem.

> > 
> > You are fixing two problems here: the warning about a size cast, and
> > the fact that the driver is using the wrong pointer. I'd suggest
> > explaining it in the changelog.
> > 
> > Note that we normally rely on void pointer arithmetic in the kernel, so
> > I'd write it without the uintptr_t casts as 
> > 
> >   bdp_dma = lower_32_bits(rx_queue->rx_bd_dma_base + (base - bdp));
> 
> But those aren't void pointers, and rx_bd_dma_base isn't a pointer, so you'd 
> get the wrong answer doing that.

Ah, right.

Arnd

Re: [RFC v2] genalloc:add an gen_pool_first_fit_align algo to genalloc

2015-07-29 Thread Scott Wood
On Tue, 2015-07-28 at 00:32 -0500, Zhao Qiang-B45475 wrote:
> On Tue, 2015-07-28 at 5:21, Scott Wood wrote:
> > -Original Message-
> > From: Wood Scott-B07421
> > Sent: Tuesday, July 28, 2015 5:21 AM
> > To: Zhao Qiang-B45475
> > Cc: lau...@codeaurora.org; linux-ker...@vger.kernel.org; linuxppc-
> > d...@lists.ozlabs.org; a...@linux-foundation.org; o...@lixom.net;
> > catalin.mari...@arm.com; Xie Xiaobo-R63061
> > Subject: Re: [RFC v2] genalloc:add an gen_pool_first_fit_align algo to
> > genalloc
> > 
> > On Mon, 2015-07-27 at 17:57 +0800, Zhao Qiang wrote:
> > 
> > Where's the part that adds the ability to pass in data to each allocation
> > call, as per the previous discussion?
> 
> You mean to use gen_pool_alloc_data()?

Yes.

> Previously you said that the format of data is algorithm-specific,
> So I think it is better to handle data in algorithm function.

It is a channel for communication from the API caller to the algorithm.

> If you still prefer gen_pool_alloc_data(), I will modify it.
> But there still are details I want to confirm with you.
> 1. If use gen_pool_alloc_data(), should I pass data as a parameter?

Yes.

> 2. Should I compute align_mask in gen_pool_alloc_data() and also add
>    an align_mask to data as a member?

gen_pool_alloc_data() should just pass data to the algorithm.  The algorithm 
should calculate align_mask based on align.  I don't think exposing 
align_mask to API users would be very friendly.

> 3. Where to define the data, in genalloc.h or caller layer?

Same place as where the algorithm function is declared.
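
The split described above (the caller passes only "align", the algorithm
derives the mask) can be sketched in user space as follows. This is a
minimal illustration, not the genalloc implementation: struct align_data,
align_to_mask() and first_fit_align() are hypothetical names, the pool is
modelled as one byte per allocation unit, and the pool base is assumed to
be aligned already.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-call data handed from the API caller to the algorithm. */
struct align_data {
	unsigned long align;	/* requested alignment in bytes, power of two */
};

/*
 * Convert a byte alignment into a mask over allocation units, where one
 * unit is (1 << order) bytes.  Done inside the algorithm so that callers
 * only ever deal with "align".
 */
static unsigned long align_to_mask(unsigned long align, unsigned int order)
{
	unsigned long unit = 1UL << order;

	assert(align != 0 && (align & (align - 1)) == 0);
	return ((align + unit - 1) >> order) - 1;
}

/*
 * First-fit search over an occupancy map (one byte per unit, 1 = used).
 * Returns the index of the first free run of nr units whose start index
 * is a multiple of (mask + 1), or size if no such run exists.
 */
static size_t first_fit_align(const unsigned char *map, size_t size,
			      size_t nr, const struct align_data *data,
			      unsigned int order)
{
	size_t step = align_to_mask(data->align, order) + 1;
	size_t start, i;

	for (start = 0; start + nr <= size; start += step) {
		for (i = 0; i < nr; i++)
			if (map[start + i])
				break;
		if (i == nr)
			return start;
	}
	return size;
}
```

With 16-byte units (order 4), a 64-byte alignment gives a mask of 3, so
only every fourth unit is considered as a start position.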

-Scott


Re: [PATCH] gianfar: Fix warnings when built on 64-bit

2015-07-29 Thread Scott Wood
On Wed, 2015-07-29 at 10:02 +0200, Arnd Bergmann wrote:
> On Wednesday 29 July 2015 00:24:37 Scott Wood wrote:
> > +#ifdef CONFIG_PM
> >  static void lock_tx_qs(struct gfar_private *priv)
> >  {
> > int i;
> > @@ -580,6 +581,7 @@ static void unlock_tx_qs(struct gfar_private *priv)
> > for (i = 0; i < priv->num_tx_queues; i++)
> > spin_unlock(&priv->tx_queue[i]->txlock);
> >  }
> > +#endif
> >  
> 
> This seems unrelated and should probably be a separate fix.

It's related in that it fixes a warning -- the 64-bit build didn't have 
CONFIG_PM -- though I should have been clearer about that in the changelog.

> > @@ -2964,8 +2967,13 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q 
> > *rx_queue, int rx_work_limit)
> > gfar_init_rxbdp(rx_queue, bdp, bufaddr);
> >  
> > /* Update Last Free RxBD pointer for LFC */
> > -   if (unlikely(rx_queue->rfbptr && priv->tx_actual_en))
> > -   gfar_write(rx_queue->rfbptr, (u32)bdp);
> > +   if (unlikely(rx_queue->rfbptr && priv->tx_actual_en)) {
> > +   u32 bdp_dma;
> > +
> > +   bdp_dma = lower_32_bits(rx_queue->rx_bd_dma_base);
> > +   bdp_dma += (uintptr_t)bdp - (uintptr_t)base;
> > +   gfar_write(rx_queue->rfbptr, bdp_dma);
> > +   }
> >  
> > /* Update to the next pointer */
> > bdp = next_bd(bdp, base, rx_queue->rx_ring_size);
> 
> You are fixing two problems here: the warning about a size cast, and
> the fact that the driver is using the wrong pointer. I'd suggest
> explaining it in the changelog.
> 
> Note that we normally rely on void pointer arithmetic in the kernel, so
> I'd write it without the uintptr_t casts as 
> 
>   bdp_dma = lower_32_bits(rx_queue->rx_bd_dma_base + (base - bdp));

But those aren't void pointers, and rx_bd_dma_base isn't a pointer, so you'd 
get the wrong answer doing that.
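
A small user-space sketch of the difference, assuming an 8-byte
descriptor as a stand-in for the driver's RxBD (struct rxbd and the
helper names here are illustrative, not taken from gianfar): subtracting
typed pointers counts descriptors, while the value written to the
register needs a byte offset added to the ring's DMA base. Note also
that base - bdp has the opposite sign from bdp - base.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for an 8-byte receive buffer descriptor. */
struct rxbd {
	uint16_t status;
	uint16_t length;
	uint32_t bufptr;
};

/* Typed pointer subtraction yields a count of descriptors, not bytes. */
static uint32_t bd_index(const struct rxbd *base, const struct rxbd *bdp)
{
	assert(bdp >= base);
	return (uint32_t)(bdp - base);
}

/* Casting through uintptr_t yields the byte offset within the ring. */
static uint32_t bd_byte_offset(const struct rxbd *base, const struct rxbd *bdp)
{
	assert(bdp >= base);
	return (uint32_t)((uintptr_t)bdp - (uintptr_t)base);
}

/* Low 32 bits of the ring's DMA base plus the byte offset of bdp. */
static uint32_t bd_dma_addr(uint64_t dma_base, const struct rxbd *base,
			    const struct rxbd *bdp)
{
	return (uint32_t)dma_base + bd_byte_offset(base, bdp);
}
```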

-Scott

[PATCH V6 4/6] mm: mlock: Add mlock flags to enable VM_LOCKONFAULT usage

2015-07-29 Thread Eric B Munson
The previous patch introduced a flag that specified pages in a VMA
should be placed on the unevictable LRU, but they should not be made
present when the area is created.  This patch adds the ability to set
this state via the new mlock system calls.

We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall.
MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED.
MCL_ONFAULT should be used as a modifier to the two other mlockall
flags.  When used with MCL_CURRENT, all current mappings will be marked
with VM_LOCKED | VM_LOCKONFAULT.  When used with MCL_FUTURE, the
mm->def_flags will be marked with VM_LOCKED | VM_LOCKONFAULT.  When used
with both MCL_CURRENT and MCL_FUTURE, all current mappings and
mm->def_flags will be marked with VM_LOCKED | VM_LOCKONFAULT.

Prior to this patch, mlockall() will unconditionally clear the
mm->def_flags any time it is called without MCL_FUTURE.  This behavior
is maintained after adding MCL_ONFAULT.  If a call to
mlockall(MCL_FUTURE) is followed by mlockall(MCL_CURRENT), the
mm->def_flags will be cleared and new VMAs will be unlocked.  This
remains true with or without MCL_ONFAULT in either mlockall()
invocation.

munlock() will unconditionally clear both VMA flags.  munlockall()
unconditionally clears both VMA flags on all VMAs and in the
mm->def_flags field.
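
The flag semantics described above can be modelled in a few lines of
user-space C. This is only a sketch of the rules as stated in this
changelog, not the code in mm/mlock.c: the DEMO_* constants have
illustrative values, struct mm_model collapses all VMAs into one flags
word, and rejecting DEMO_MCL_ONFAULT on its own is an assumption.

```c
#include <assert.h>

/* Illustrative values only; the real ones are per-arch uapi constants. */
#define DEMO_MCL_CURRENT	1
#define DEMO_MCL_FUTURE		2
#define DEMO_MCL_ONFAULT	4

#define DEMO_VM_LOCKED		0x1u
#define DEMO_VM_LOCKONFAULT	0x2u

struct mm_model {
	unsigned int vma_flags;	/* lock flags on all current mappings */
	unsigned int def_flags;	/* lock flags applied to future mappings */
};

static int model_mlockall(struct mm_model *mm, int flags)
{
	const int all = DEMO_MCL_CURRENT | DEMO_MCL_FUTURE | DEMO_MCL_ONFAULT;
	unsigned int lock = DEMO_VM_LOCKED;

	assert(mm);
	if (!flags || (flags & ~all))
		return -1;
	/* Assumption: the modifier needs CURRENT and/or FUTURE with it. */
	if (flags == DEMO_MCL_ONFAULT)
		return -1;

	if (flags & DEMO_MCL_ONFAULT)
		lock |= DEMO_VM_LOCKONFAULT;

	/* def_flags is cleared by any call made without MCL_FUTURE. */
	mm->def_flags = (flags & DEMO_MCL_FUTURE) ? lock : 0;
	if (flags & DEMO_MCL_CURRENT)
		mm->vma_flags = lock;
	return 0;
}

/* munlockall() clears both flags on the VMAs and in def_flags. */
static void model_munlockall(struct mm_model *mm)
{
	assert(mm);
	mm->vma_flags = 0;
	mm->def_flags = 0;
}
```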

Signed-off-by: Eric B Munson 
Cc: Michal Hocko 
Cc: Vlastimil Babka 
Cc: Jonathan Corbet 
Cc: "Kirill A. Shutemov" 
Cc: linux-al...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-m...@linux-mips.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparcli...@vger.kernel.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux...@kvack.org
---
 arch/alpha/include/uapi/asm/mman.h |  3 ++
 arch/mips/include/uapi/asm/mman.h  |  6 
 arch/parisc/include/uapi/asm/mman.h|  3 ++
 arch/powerpc/include/uapi/asm/mman.h   |  1 +
 arch/sparc/include/uapi/asm/mman.h |  1 +
 arch/tile/include/uapi/asm/mman.h  |  1 +
 arch/xtensa/include/uapi/asm/mman.h|  6 
 include/uapi/asm-generic/mman-common.h |  5 
 include/uapi/asm-generic/mman.h|  1 +
 mm/mlock.c | 55 ++
 10 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h 
b/arch/alpha/include/uapi/asm/mman.h
index 0086b47..f2f9496 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -37,6 +37,9 @@
 
 #define MCL_CURRENT 8192   /* lock all currently mapped pages */
 #define MCL_FUTURE 16384   /* lock all additions to address space 
*/
+#define MCL_ONFAULT32768   /* lock all pages that are faulted in */
+
+#define MLOCK_ONFAULT  0x01/* Lock pages in range after they are 
faulted in, do not prefault */
 
 #define MADV_NORMAL0   /* no further special treatment */
 #define MADV_RANDOM1   /* expect random page references */
diff --git a/arch/mips/include/uapi/asm/mman.h 
b/arch/mips/include/uapi/asm/mman.h
index cfcb876..97c03f4 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -61,6 +61,12 @@
  */
 #define MCL_CURRENT1   /* lock all current mappings */
 #define MCL_FUTURE 2   /* lock all future mappings */
+#define MCL_ONFAULT4   /* lock all pages that are faulted in */
+
+/*
+ * Flags for mlock
+ */
+#define MLOCK_ONFAULT  0x01/* Lock pages in range after they are 
faulted in, do not prefault */
 
 #define MADV_NORMAL0   /* no further special treatment */
 #define MADV_RANDOM1   /* expect random page references */
diff --git a/arch/parisc/include/uapi/asm/mman.h 
b/arch/parisc/include/uapi/asm/mman.h
index 294d251..ecc3ae1 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -31,6 +31,9 @@
 
 #define MCL_CURRENT1   /* lock all current mappings */
 #define MCL_FUTURE 2   /* lock all future mappings */
+#define MCL_ONFAULT4   /* lock all pages that are faulted in */
+
+#define MLOCK_ONFAULT  0x01/* Lock pages in range after they are 
faulted in, do not prefault */
 
 #define MADV_NORMAL 0   /* no further special treatment */
 #define MADV_RANDOM 1   /* expect random page references */
diff --git a/arch/powerpc/include/uapi/asm/mman.h 
b/arch/powerpc/include/uapi/asm/mman.h
index 6ea26df..03c06ba 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -22,6 +22,7 @@
 
 #define MCL_CURRENT 0x2000  /* lock all currently mapped pages */
 #define MCL_FUTURE  0x4000  /* lock all additions to address space 
*/
+#define MCL_ONFAULT0x8000  /* lock all pages that are faulted in */
 
 #define MAP_POPULATE   0x8000 

[PATCH V6 0/6] Allow user to request memory to be locked on page fault

2015-07-29 Thread Eric B Munson
mlock() allows a user to control page out of program memory, but this
comes at the cost of faulting in the entire mapping when it is
allocated.  For large mappings where the entire area is not necessary
this is not ideal.  Instead of forcing all locked pages to be present
when they are allocated, this set creates a middle ground.  Pages are
marked to be placed on the unevictable LRU (locked) when they are first
used, but they are not faulted in by the mlock call.

This series introduces a new mlock() system call that takes a flags
argument along with the start address and size.  This flags argument
gives the caller the ability to request memory be locked in the
traditional way, or to be locked after the page is faulted in.  A new
MCL flag is added to mirror the lock on fault behavior from mlock() in
mlockall().

There are two main use cases that this set covers.  The first is the
security focussed mlock case.  A buffer is needed that cannot be written
to swap.  The maximum size is known, but on average the memory used is
significantly less than this maximum.  With lock on fault, the buffer
is guaranteed to never be paged out without consuming the maximum size
every time such a buffer is created.

The second use case is focussed on performance.  Portions of a large
file are needed and we want to keep the used portions in memory once
accessed.  This is the case for large graphical models where the path
through the graph is not known until run time.  The entire graph is
unlikely to be used in a given invocation, but once a node has been
used it needs to stay resident for further processing.  Given these
constraints we have a number of options.  We can potentially waste a
large amount of memory by mlocking the entire region (this can also
cause a significant stall at startup as the entire file is read in).
We can mlock every page as we access them without tracking if the page
is already resident but this introduces large overhead for each access.
The third option is mapping the entire region with PROT_NONE and using
a signal handler for SIGSEGV to mprotect(PROT_READ) and mlock() the
needed page.  Doing this page at a time adds a significant performance
penalty.  Batching can be used to mitigate this overhead, but in order
to safely avoid trying to mprotect pages outside of the mapping, the
boundaries of each mapping to be used in this way must be tracked and
available to the signal handler.  This is precisely what the mm system
in the kernel should already be doing.

For mlock(MLOCK_ONFAULT) the user is charged against RLIMIT_MEMLOCK as
if mlock(MLOCK_LOCKED) or mmap(MAP_LOCKED) was used, that is, when the
VMA is created, not when the pages are faulted in.  For
mlockall(MCL_ONFAULT) the user is charged as if MCL_FUTURE was used.
This decision was made to keep the accounting checks out of the page
fault path.

To illustrate the benefit of this set I wrote a test program that mmaps
a 5 GB file filled with random data and then makes 15,000,000 accesses
to random addresses in that mapping.  The test program was run 20 times
for each setup.  Results are reported for two program portions, setup
and execution.  The setup phase is calling mmap and optionally mlock on
the entire region.  For most experiments this is trivial, but it
highlights the cost of faulting in the entire region.  Results are
averages across the 20 runs in milliseconds.

mmap with mlock(MLOCK_LOCKED) on entire range:
Setup avg:  8228.666
Processing avg: 8274.257

mmap with mlock(MLOCK_LOCKED) before each access:
Setup avg:  0.113
Processing avg: 90993.552

mmap with PROT_NONE and signal handler and batch size of 1 page:
With the default value in max_map_count, this gets ENOMEM as I attempt
to change the permissions; after raising the sysctl significantly I get:
Setup avg:  0.058
Processing avg: 69488.073

mmap with PROT_NONE and signal handler and batch size of 8 pages:
Setup avg:  0.068
Processing avg: 38204.116

mmap with PROT_NONE and signal handler and batch size of 16 pages:
Setup avg:  0.044
Processing avg: 29671.180

mmap with mlock(MLOCK_ONFAULT) on entire range:
Setup avg:  0.189
Processing avg: 17904.899

The signal handler in the batch cases faulted in memory in two steps to
avoid having to know the start and end of the faulting mapping.  The
first step covers the page that caused the fault as we know that it will
be possible to lock.  The second step speculatively tries to mlock and
mprotect the batch size - 1 pages that follow.  There may be a clever
way to avoid this without having the program track each mapping to be
covered by this handler in a globally accessible structure, but I could
not find it.  It should be noted that with a large enough batch size
this two step fault handler can still cause the program to crash if it
reaches far beyond the end of the mapping.
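
The address arithmetic behind the two step handler can be sketched as
below. This is not the test program itself: struct fault_batch and
plan_fault_batch() are hypothetical names, and the sketch only computes
the two ranges. A handler along these lines would mprotect() and mlock()
the first range and expect success, then try the same on the second
range and tolerate failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fault_batch {
	uintptr_t first;	/* start of the page that took the fault */
	uintptr_t rest;		/* start of the speculative remainder */
	size_t rest_len;	/* remainder length: (batch - 1) pages */
};

/*
 * Step 1 covers exactly the faulting page, which is known to be inside
 * the mapping.  Step 2 covers the batch - 1 pages that follow and may
 * run past the end of the mapping.
 */
static struct fault_batch plan_fault_batch(uintptr_t addr, size_t page_size,
					   size_t batch)
{
	struct fault_batch b;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	assert(batch >= 1);

	b.first = addr & ~((uintptr_t)page_size - 1);
	b.rest = b.first + page_size;
	b.rest_len = (batch - 1) * page_size;
	return b;
}
```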

These results show that if the developer knows that a majority of the
mapping will be used, it is better to try and fault it in at once,
otherwise mlo

[PATCH V6 2/6] mm: mlock: Add new mlock system call

2015-07-29 Thread Eric B Munson
With the refactored mlock code, introduce a new system call for mlock.
The new call will allow the user to specify what lock states are being
added.  mlock2 is trivial at the moment, but a follow on patch will add
a new mlock state making it useful.

Signed-off-by: Eric B Munson 
Cc: Michal Hocko 
Cc: Vlastimil Babka 
Cc: Heiko Carstens 
Cc: Geert Uytterhoeven 
Cc: Catalin Marinas 
Cc: Stephen Rothwell 
Cc: Guenter Roeck 
Cc: linux-al...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: adi-buildroot-de...@lists.sourceforge.net
Cc: linux-cris-ker...@axis.com
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-am33-l...@redhat.com
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux-...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux...@kvack.org
---
 arch/x86/entry/syscalls/syscall_32.tbl | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl | 1 +
 include/linux/syscalls.h   | 2 ++
 include/uapi/asm-generic/unistd.h  | 4 +++-
 kernel/sys_ni.c| 1 +
 mm/mlock.c | 9 +
 6 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl 
b/arch/x86/entry/syscalls/syscall_32.tbl
index ef8187f..839d5df 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -365,3 +365,4 @@
 356	i386	memfd_create		sys_memfd_create
 357	i386	bpf			sys_bpf
 358	i386	execveat		sys_execveat			stub32_execveat
+359	i386	mlock2			sys_mlock2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl 
b/arch/x86/entry/syscalls/syscall_64.tbl
index 9ef32d5..ad36769 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -329,6 +329,7 @@
 320	common	kexec_file_load		sys_kexec_file_load
 321	common	bpf			sys_bpf
 322	64	execveat		stub_execveat
+323	common	mlock2			sys_mlock2
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b45c45b..56a3d59 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -884,4 +884,6 @@ asmlinkage long sys_execveat(int dfd, const char __user 
*filename,
const char __user *const __user *argv,
const char __user *const __user *envp, int flags);
 
+asmlinkage long sys_mlock2(unsigned long start, size_t len, int flags);
+
 #endif
diff --git a/include/uapi/asm-generic/unistd.h 
b/include/uapi/asm-generic/unistd.h
index e016bd9..14a6013 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -709,9 +709,11 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
 __SYSCALL(__NR_bpf, sys_bpf)
 #define __NR_execveat 281
 __SC_COMP(__NR_execveat, sys_execveat, compat_sys_execveat)
+#define __NR_mlock2 282
+__SYSCALL(__NR_mlock2, sys_mlock2)
 
 #undef __NR_syscalls
-#define __NR_syscalls 282
+#define __NR_syscalls 283
 
 /*
  * All syscalls below here should go away really,
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 7995ef5..4818b71 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -193,6 +193,7 @@ cond_syscall(sys_mlock);
 cond_syscall(sys_munlock);
 cond_syscall(sys_mlockall);
 cond_syscall(sys_munlockall);
+cond_syscall(sys_mlock2);
 cond_syscall(sys_mincore);
 cond_syscall(sys_madvise);
 cond_syscall(sys_mremap);
diff --git a/mm/mlock.c b/mm/mlock.c
index 1585cca..807f986 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -642,6 +642,15 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
return do_mlock(start, len, VM_LOCKED);
 }
 
+SYSCALL_DEFINE3(mlock2, unsigned long, start, size_t, len, int, flags)
+{
+   vm_flags_t vm_flags = VM_LOCKED;
+   if (flags)
+   return -EINVAL;
+
+   return do_mlock(start, len, vm_flags);
+}
+
 SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len)
 {
int ret;
-- 
1.9.1

Re: [PATCH v4 04/46] staging: emxx_udc: add ep capabilities support

2015-07-29 Thread Felipe Balbi
On Mon, Jul 27, 2015 at 11:16:14AM +0200, Robert Baldyga wrote:
> Convert endpoint configuration to new capabilities model.
> 
> Fixed typo in "epc-nulk" to "epc-bulk".
> 
> Signed-off-by: Robert Baldyga 
> ---
>  drivers/staging/emxx_udc/emxx_udc.c | 60 
> ++---
>  1 file changed, 29 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/staging/emxx_udc/emxx_udc.c 
> b/drivers/staging/emxx_udc/emxx_udc.c
> index 3b7aa36..0d64bee 100644
> --- a/drivers/staging/emxx_udc/emxx_udc.c
> +++ b/drivers/staging/emxx_udc/emxx_udc.c
> @@ -3153,36 +3153,33 @@ static const struct usb_gadget_ops nbu2ss_gadget_ops 
> = {
>   .ioctl  = nbu2ss_gad_ioctl,
>  };
>  
> -static const char g_ep0_name[] = "ep0";
> -static const char g_ep1_name[] = "ep1-bulk";
> -static const char g_ep2_name[] = "ep2-bulk";
> -static const char g_ep3_name[] = "ep3in-int";
> -static const char g_ep4_name[] = "ep4-iso";
> -static const char g_ep5_name[] = "ep5-iso";
> -static const char g_ep6_name[] = "ep6-bulk";
> -static const char g_ep7_name[] = "ep7-bulk";
> -static const char g_ep8_name[] = "ep8in-int";
> -static const char g_ep9_name[] = "ep9-iso";
> -static const char g_epa_name[] = "epa-iso";
> -static const char g_epb_name[] = "epb-bulk";
> -static const char g_epc_name[] = "epc-nulk";
> -static const char g_epd_name[] = "epdin-int";
> -
> -static const char *gp_ep_name[NUM_ENDPOINTS] = {
> - g_ep0_name,
> - g_ep1_name,
> - g_ep2_name,
> - g_ep3_name,
> - g_ep4_name,
> - g_ep5_name,
> - g_ep6_name,
> - g_ep7_name,
> - g_ep8_name,
> - g_ep9_name,
> - g_epa_name,
> - g_epb_name,
> - g_epc_name,
> - g_epd_name,
> +static const struct {
> + const char *name;
> + const struct usb_ep_caps caps;
> +} ep_info[NUM_ENDPOINTS] = {
> +#define EP_INFO(_name, _type, _dir) \
> + { \
> + .name = _name, \
> + .caps = USB_EP_CAPS(USB_EP_CAPS_TYPE_ ## _type, \
> + USB_EP_CAPS_DIR_ ## _dir), \
> + }
> +
> + EP_INFO("ep0",  CONTROL, ALL),
> + EP_INFO("ep1-bulk", BULK,   ALL),
> + EP_INFO("ep2-bulk", BULK,   ALL),
> + EP_INFO("ep3in-int",INT,IN),
> + EP_INFO("ep4-iso",  ISO,ALL),
> + EP_INFO("ep5-iso",  ISO,ALL),
> + EP_INFO("ep6-bulk", BULK,   ALL),
> + EP_INFO("ep7-bulk", BULK,   ALL),
> + EP_INFO("ep8in-int",INT,IN),
> + EP_INFO("ep9-iso",  ISO,ALL),
> + EP_INFO("epa-iso",  ISO,ALL),
> + EP_INFO("epb-bulk", BULK,   ALL),
> + EP_INFO("epc-bulk", BULK,   ALL),
> + EP_INFO("epdin-int",INT,IN),

IMO, this is pointless obfuscation. It just makes it a pain to grep
source around. Why don't you have UDC drivers initialize the 1-bit flags
directly ?
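
For comparison, initializing the 1-bit flags directly might look like
the sketch below. It is a user-space mock: struct ep_caps and its field
names are assumptions modelled on the capabilities flags discussed in
this series, and demo_ep_info lists only three of the endpoints.

```c
#include <assert.h>

/* Mock capabilities struct made of 1-bit flags (field names assumed). */
struct ep_caps {
	unsigned type_control:1;
	unsigned type_iso:1;
	unsigned type_bulk:1;
	unsigned type_int:1;
	unsigned dir_in:1;
	unsigned dir_out:1;
};

struct ep_info {
	const char *name;
	struct ep_caps caps;
};

/* Each flag is spelled out, so grep for "type_bulk" finds every user. */
static const struct ep_info demo_ep_info[] = {
	{ .name = "ep0",
	  .caps = { .type_control = 1, .dir_in = 1, .dir_out = 1 } },
	{ .name = "ep1-bulk",
	  .caps = { .type_bulk = 1, .dir_in = 1, .dir_out = 1 } },
	{ .name = "ep3in-int",
	  .caps = { .type_int = 1, .dir_in = 1 } },
};

/* Nonzero if the endpoint can carry bulk transfers in the IN direction. */
static int ep_supports_bulk_in(const struct ep_info *ep)
{
	assert(ep);
	return ep->caps.type_bulk && ep->caps.dir_in;
}
```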

-- 
balbi


Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Boris Ostrovsky

On 07/29/2015 10:23 AM, Julien Grall wrote:
> On 29/07/15 15:14, Boris Ostrovsky wrote:
>>> static inline unsigned long pfn_to_gfn(unsigned long pfn)
>>> {
>>>      if (xen_feature(XENFEAT_autotranslated_physmap))
>>>          return pfn;
>>>      else
>>>          return pfn_to_mfn(pfn);
>>> }
>>
>> But you'd still say 'op.arg1.mfn = pfn_to_gfn(pfn);' in xen_do_pin()
>> i.e. assign GFN to MFN, right? That's what I was referring to.
>
> Well no. I would use op.arg1.mfn = pfn_to_mfn(pfn) given that the code,
> if I'm right, is only executed for PV.
>
> mfn = pfn_to_gfn(...) was valid too because on PV a GFN is always an MFN.
> The suggestion of pfn_to_mfn was just for readability.

Right, and my comments were also not about correctness.

>> (In general, I am not sure a guest should ever use 'mfn' as it is purely
>> a hypervisor construct. Including p2m, which I think should really be
>> p2g as this is what we use to figure out what to stick into page tables)
>
> I think avoiding mfn in the hypervisor interface is out of scope for
> this series. If we ever want to modify the Xen API in Linux, we should
> do it in sync with Xen to avoid inconsistent naming.
>
> Anyway, the oddity of mfn = pfn_to_gfn(...) is mostly contained in the
> x86 specific code. I don't mind either adding pfn_to_mfn and using it or
> adding a comment /* PV-specific so mfn == gfn */ for every use of mfn =
> pfn_to_gfn(...).

I think the former is better (even though it adds a test)

-boris

Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Julien Grall
On 29/07/15 15:14, Boris Ostrovsky wrote:
>> static inline unsigned long pfn_to_gfn(unsigned long pfn)
>> {
>> if (xen_feature(XENFEAT_autotranslated_physmap))
>> return pfn;
>> else
>> return pfn_to_mfn(pfn);
>> }
> 
> 
> But you'd still say 'op.arg1.mfn = pfn_to_gfn(pfn);' in xen_do_pin()
> i.e. assign GFN to MFN, right? That's what I was referring to.

Well no. I would use op.arg1.mfn = pfn_to_mfn(pfn) given that the code,
if I'm right, is only executed for PV.

mfn = pfn_to_gfn(...) was valid too because on PV a GFN is always an MFN.
The suggestion of pfn_to_mfn was just for readability.

> (In general, I am not sure a guest should ever use 'mfn' as it is purely
> a hypervisor construct. Including p2m, which I think should really be
> p2g as this is what we use to figure out what to stick into page tables)

I think avoiding mfn in the hypervisor interface is out of scope for
this series. If we ever want to modify the Xen API in Linux, we should
do it in sync with Xen to avoid naming inconsistencies.

Anyway, the oddity of mfn = pfn_to_gfn(...) is mostly contained in the
x86-specific code. I don't mind either adding pfn_to_mfn and using it, or
adding a comment /* PV-specific so mfn == gfn */ to every use of mfn =
pfn_to_gfn(...).

Regards,

-- 
Julien Grall

Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Boris Ostrovsky

On 07/29/2015 07:25 AM, Julien Grall wrote:

Hi Boris,

On 28/07/15 20:12, Boris Ostrovsky wrote:

On 07/28/2015 11:02 AM, Julien Grall wrote:

Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant, I suspect this is because the first support for Xen was for
PV. This brough some misimplementation of helpers on ARM and make the
developper confused the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
Although, if we look at the implementation on x86, it's returning a GFN.

For clarity and avoid new confusion, replace any reference of mfn into
gnf in any helpers used by PV drivers.






@@ -730,7 +730,7 @@ static void xen_do_pin(unsigned level, unsigned
long pfn)
   struct mmuext_op op;

   op.cmd = level;
-op.arg1.mfn = pfn_to_mfn(pfn);
+op.arg1.mfn = pfn_to_gfn(pfn);



This looks slightly odd. It is correct but given that purpose of this
series is to make things more clear perhaps we can add another union
member (gfn) to mmuext_op.arg1?

(Of course, the hypervisor will continue referring to mfn which could
still be confusing)


This operation is only used for PV guests, right?

IMHO re-introducing pfn_to_mfn for PV guests only (i.e. with a BUG_ON to
ensure it is never used for auto-translated guests) would be the best
solution. It would avoid using a different name from the hypervisor's in
the hypercall interface. It would also make clear that virt_to_machine &
co. are PV-specific.

I thought about doing this, but I preferred to defer to the x86 experts,
as my knowledge of x86 Xen is very limited. I don't know where it's more
suitable to use MFN or GFN. I guess this file (mmu.c) is mostly PV-specific?

Would something like the below be fine for you?

static inline unsigned long pfn_to_mfn(unsigned long pfn)
{
	unsigned long mfn;

	BUG_ON(xen_feature(XENFEAT_auto_translated_physmap));

	mfn = __pfn_to_mfn(pfn);
	if (mfn != INVALID_P2M_ENTRY)
		mfn &= ~(FOREIGN_FRAME_BIT | IDENTITY_FRAME_BIT);

	return mfn;
}

static inline unsigned long pfn_to_gfn(unsigned long pfn)
{
	if (xen_feature(XENFEAT_auto_translated_physmap))
		return pfn;
	else
		return pfn_to_mfn(pfn);
}



But you'd still say 'op.arg1.mfn = pfn_to_gfn(pfn);' in xen_do_pin() 
i.e. assign GFN to MFN, right? That's what I was referring to.


(In general, I am not sure a guest should ever use 'mfn' as it is purely 
a hypervisor construct. Including p2m, which I think should really be 
p2g as this is what we use to figure out what to stick into page tables)


-boris




Similar splitting would be done for gfn_to_pfn and mfn_to_pfn.

Regards,




Re: [PATCH 02/10] dpaa_eth: add support for DPAA Ethernet

2015-07-29 Thread Joakim Tjernlund
On Wed, 2015-07-22 at 19:16 +0300, Madalin Bucur wrote:
> This introduces the Freescale Data Path Acceleration Architecture
> (DPAA) Ethernet driver (dpaa_eth) that builds upon the DPAA QMan,
> BMan, PAMU and FMan drivers to deliver Ethernet connectivity on
> the Freescale DPAA QorIQ platforms.
> 
> Signed-off-by: Madalin Bucur 
> ---
>  drivers/net/ethernet/freescale/Kconfig |2 +
>  drivers/net/ethernet/freescale/Makefile|1 +
>  drivers/net/ethernet/freescale/dpaa/Kconfig|   46 +
>  drivers/net/ethernet/freescale/dpaa/Makefile   |   13 +
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c |  827 +
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.h |  447 +++
>  .../net/ethernet/freescale/dpaa/dpaa_eth_common.c  | 1254 
> 
>  .../net/ethernet/freescale/dpaa/dpaa_eth_common.h  |  119 ++
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth_sg.c  |  406 +++
>  9 files changed, 3115 insertions(+)
>  create mode 100644 drivers/net/ethernet/freescale/dpaa/Kconfig
>  create mode 100644 drivers/net/ethernet/freescale/dpaa/Makefile
>  create mode 100644 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
>  create mode 100644 drivers/net/ethernet/freescale/dpaa/dpaa_eth.h
>  create mode 100644 drivers/net/ethernet/freescale/dpaa/dpaa_eth_common.c
>  create mode 100644 drivers/net/ethernet/freescale/dpaa/dpaa_eth_common.h
>  create mode 100644 drivers/net/ethernet/freescale/dpaa/dpaa_eth_sg.c
> 
> diff --git a/drivers/net/ethernet/freescale/Kconfig 
> b/drivers/net/ethernet/freescale/Kconfig
> index f3f89cc..92198be 100644
> --- a/drivers/net/ethernet/freescale/Kconfig
> +++ b/drivers/net/ethernet/freescale/Kconfig
> @@ -92,4 +92,6 @@ config GIANFAR
> and MPC86xx family of chips, the eTSEC on LS1021A and the FEC
> on the 8540.
>  
> +source "drivers/net/ethernet/freescale/dpaa/Kconfig"
> +
>  endif # NET_VENDOR_FREESCALE
> diff --git a/drivers/net/ethernet/freescale/Makefile 
> b/drivers/net/ethernet/freescale/Makefile
> index 4097c58..ae13dc5 100644
> --- a/drivers/net/ethernet/freescale/Makefile
> +++ b/drivers/net/ethernet/freescale/Makefile
> @@ -12,6 +12,7 @@ obj-$(CONFIG_FS_ENET) += fs_enet/
>  obj-$(CONFIG_FSL_PQ_MDIO) += fsl_pq_mdio.o
>  obj-$(CONFIG_FSL_XGMAC_MDIO) += xgmac_mdio.o
>  obj-$(CONFIG_GIANFAR) += gianfar_driver.o
> +obj-$(CONFIG_FSL_DPAA_ETH) += dpaa/
>  obj-$(CONFIG_PTP_1588_CLOCK_GIANFAR) += gianfar_ptp.o
>  gianfar_driver-objs := gianfar.o \
>   gianfar_ethtool.o
> diff --git a/drivers/net/ethernet/freescale/dpaa/Kconfig 
> b/drivers/net/ethernet/freescale/dpaa/Kconfig
> new file mode 100644
> index 000..1f3a203
> --- /dev/null
> +++ b/drivers/net/ethernet/freescale/dpaa/Kconfig
> @@ -0,0 +1,46 @@
> +menuconfig FSL_DPAA_ETH
> + tristate "DPAA Ethernet"
> + depends on FSL_SOC && FSL_BMAN && FSL_QMAN && FSL_FMAN
> + select PHYLIB
> + select FSL_FMAN_MAC
> + ---help---
> +   Data Path Acceleration Architecture Ethernet driver,
> +   supporting the Freescale QorIQ chips.
> +   Depends on Freescale Buffer Manager and Queue Manager
> +   driver and Frame Manager Driver.
> +
> +if FSL_DPAA_ETH
> +
> +config FSL_DPAA_CS_THRESHOLD_1G
> + hex "Egress congestion threshold on 1G ports"
> + range 0x1000 0x1000
> + default "0x0600"
> + ---help---
> +   The size in bytes of the egress Congestion State notification 
> threshold on 1G ports.
> +   The 1G dTSECs can quite easily be flooded by cores doing Tx in a 
> tight loop
> +   (e.g. by sending UDP datagrams at "while(1) speed"),
> +   and the larger the frame size, the more acute the problem.
> +   So we have to find a balance between these factors:
> +- avoiding the device staying congested for a prolonged time 
> (risking
> + the netdev watchdog to fire - see also the tx_timeout 
> module param);
> +   - affecting performance of protocols such as TCP, which 
> otherwise
> +  behave well under the congestion notification mechanism;
> +- preventing the Tx cores from tightly-looping (as if the 
> congestion
> +  threshold was too low to be effective);
> +- running out of memory if the CS threshold is set too high.
> +
> +config FSL_DPAA_CS_THRESHOLD_10G
> + hex "Egress congestion threshold on 10G ports"
> + range 0x1000 0x2000
> + default "0x1000"
> + ---help ---
> +   The size in bytes of the egress Congestion State notification 
> threshold on 10G ports.
> +
> +config FSL_DPAA_INGRESS_CS_THRESHOLD
> + hex "Ingress congestion threshold on FMan ports"
> + default "0x1000"
> + ---help---
> +   The size in bytes of the ingress tail-drop threshold on FMan ports.
> +   Traffic piling up above this value will be rejected by QMan and 
> discarded by FMan.
> +
> +endif # FSL_DPAA_ETH
> diff --git a/drivers/net/ethernet/freescale/dpaa/Make

Re: [Xen-devel] [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Wei Liu
On Wed, Jul 29, 2015 at 12:35:54PM +0100, Julien Grall wrote:
> Hi Wei,
> 
> On 29/07/15 11:13, Wei Liu wrote:
> > On Tue, Jul 28, 2015 at 04:02:45PM +0100, Julien Grall wrote:
> > [...]
> >> diff --git a/drivers/net/xen-netback/netback.c 
> >> b/drivers/net/xen-netback/netback.c
> >> index 7d50711..3b7b7c3 100644
> >> --- a/drivers/net/xen-netback/netback.c
> >> +++ b/drivers/net/xen-netback/netback.c
> >> @@ -314,7 +314,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
> >> *queue, struct sk_buff *skb
> >>} else {
> >>copy_gop->source.domid = DOMID_SELF;
> >>copy_gop->source.u.gmfn =
> >> -  virt_to_mfn(page_address(page));
> >> +  virt_to_gfn(page_address(page));
> >>}
> >>copy_gop->source.offset = offset;
> >>  
> >> @@ -1284,7 +1284,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
> >> *queue,
> >>queue->tx_copy_ops[*copy_ops].source.offset = txreq.offset;
> >>  
> >>queue->tx_copy_ops[*copy_ops].dest.u.gmfn =
> >> -  virt_to_mfn(skb->data);
> >> +  virt_to_gfn(skb->data);
> >>queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
> >>queue->tx_copy_ops[*copy_ops].dest.offset =
> >>offset_in_page(skb->data);
> > 
> > Reviewed-by: Wei Liu 
> > 
> > One possible improvement is to change gmfn in copy_gop to gfn as well.
> > But that's outside of netback code.
> 
> The structure gnttab_copy is part of the hypervisor interface. Is it
> fine to differ on the naming between Xen and Linux?
> 
> Or maybe we could do the change in the public headers in Xen repo too.
> Is it fine to do field renaming in public headers?
> 

Oh well. Never mind then. I mistook that structure for one internal to Linux.

Wei.

> Regards,
> 
> -- 
> Julien Grall

Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Wei Liu
On Tue, Jul 28, 2015 at 04:02:45PM +0100, Julien Grall wrote:
[...]
> diff --git a/drivers/net/xen-netback/netback.c 
> b/drivers/net/xen-netback/netback.c
> index 7d50711..3b7b7c3 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -314,7 +314,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
> *queue, struct sk_buff *skb
>   } else {
>   copy_gop->source.domid = DOMID_SELF;
>   copy_gop->source.u.gmfn =
> - virt_to_mfn(page_address(page));
> + virt_to_gfn(page_address(page));
>   }
>   copy_gop->source.offset = offset;
>  
> @@ -1284,7 +1284,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
> *queue,
>   queue->tx_copy_ops[*copy_ops].source.offset = txreq.offset;
>  
>   queue->tx_copy_ops[*copy_ops].dest.u.gmfn =
> - virt_to_mfn(skb->data);
> + virt_to_gfn(skb->data);
>   queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>   queue->tx_copy_ops[*copy_ops].dest.offset =
>   offset_in_page(skb->data);

Reviewed-by: Wei Liu 

One possible improvement is to change gmfn in copy_gop to gfn as well.
But that's outside of netback code.

Wei.

Re: [PATCH] powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers

2015-07-29 Thread Thomas Huth
- Original Message -
> The EPOW interrupt handler uses rtas_get_sensor(), which in turn
> uses rtas_busy_delay() to wait for RTAS becoming ready in case it
> is necessary. But rtas_busy_delay() is annotated with might_sleep()
> and thus may not be used by interrupts handlers like the EPOW handler!
> This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is
> enabled:
> 
>  BUG: sleeping function called from invalid context at
>  arch/powerpc/kernel/rtas.c:496
>  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6
>  Call Trace:
>  [c0007ffe7b90] [c0807670] dump_stack+0xa0/0xdc (unreliable)
>  [c0007ffe7bc0] [c00e1f14] ___might_sleep+0x134/0x180
>  [c0007ffe7c20] [c002aec0] rtas_busy_delay+0x30/0xd0
>  [c0007ffe7c50] [c002bde4] rtas_get_sensor+0x74/0xe0
>  [c0007ffe7ce0] [c0083264] ras_epow_interrupt+0x44/0x450
>  [c0007ffe7d90] [c0120260] handle_irq_event_percpu+0xa0/0x300
>  [c0007ffe7e70] [c0120524] handle_irq_event+0x64/0xc0
>  [c0007ffe7eb0] [c0124dbc] handle_fasteoi_irq+0xec/0x260
>  [c0007ffe7ef0] [c011f4f0] generic_handle_irq+0x50/0x80
>  [c0007ffe7f20] [c0010f3c] __do_irq+0x8c/0x200
>  [c0007ffe7f90] [c00236cc] call_do_irq+0x14/0x24
>  [c0007e6f39e0] [c0011144] do_IRQ+0x94/0x110
>  [c0007e6f3a30] [c0002594] hardware_interrupt_common+0x114/0x180
> 
> Fix this issue by introducing a new rtas_get_sensor_fast() function
> that does not use rtas_busy_delay() - and thus can only be used for
> sensors that do not cause a BUSY condition (which should be the case
> for the sensor that is queried by the EPOW IRQ handler).
> 
> Signed-off-by: Thomas Huth 
> ---
>  arch/powerpc/include/asm/rtas.h  |  1 +
>  arch/powerpc/kernel/rtas.c   | 17 +
>  arch/powerpc/platforms/pseries/ras.c |  3 ++-
>  3 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/rtas.h
> b/arch/powerpc/include/asm/rtas.h
> index 7a4ede1..b77ef36 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -343,6 +343,7 @@ extern void rtas_power_off(void);
>  extern void rtas_halt(void);
>  extern void rtas_os_term(char *str);
>  extern int rtas_get_sensor(int sensor, int index, int *state);
> +extern int rtas_get_sensor_fast(int sensor, int index, int *state);
>  extern int rtas_get_power_level(int powerdomain, int *level);
>  extern int rtas_set_power_level(int powerdomain, int level, int *setlevel);
>  extern bool rtas_indicator_present(int token, int *maxindex);
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 7a488c1..caffb10 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -584,6 +584,23 @@ int rtas_get_sensor(int sensor, int index, int *state)
>  }
>  EXPORT_SYMBOL(rtas_get_sensor);
>  
> +int rtas_get_sensor_fast(int sensor, int index, int *state)
> +{
> + int token = rtas_token("get-sensor-state");
> + int rc;
> +
> + if (token == RTAS_UNKNOWN_SERVICE)
> + return -ENOENT;
> +
> + rc = rtas_call(token, 2, 2, state, sensor, index);
> + WARN_ON(rc == RTAS_BUSY || (rc >= RTAS_EXTENDED_DELAY_MIN &&
> + rc <= RTAS_EXTENDED_DELAY_MAX));
> +
> + if (rc < 0)
> + return rtas_error_rc(rc);
> + return rc;
> +}
> +
>  bool rtas_indicator_present(int token, int *maxindex)
>  {
>   int proplen, count, i;
> diff --git a/arch/powerpc/platforms/pseries/ras.c
> b/arch/powerpc/platforms/pseries/ras.c
> index 02e4a17..3b6647e 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -189,7 +189,8 @@ static irqreturn_t ras_epow_interrupt(int irq, void
> *dev_id)
>   int state;
>   int critical;
>  
> - status = rtas_get_sensor(EPOW_SENSOR_TOKEN, EPOW_SENSOR_INDEX, &state);
> + status = rtas_get_sensor_fast(EPOW_SENSOR_TOKEN, EPOW_SENSOR_INDEX,
> +   &state);
>  
>   if (state > 3)
>   critical = 1;   /* Time Critical */
> --
> 1.8.3.1

*ping*

Michael, do you think this patch is OK for fixing this problem?
Or shall I rather send a patch to simply revert 587f83e8dd50d instead?
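
For readers following the thread, the condition that the WARN_ON in the
quoted patch checks can be modelled in isolation. This is only a userspace
sketch: the MODEL_* names are hypothetical, and the numeric values are
assumed to mirror RTAS_BUSY and the extended-delay range in
arch/powerpc/include/asm/rtas.h.

```c
#include <stdbool.h>

/* Assumed to mirror the constants in arch/powerpc/include/asm/rtas.h. */
#define MODEL_RTAS_BUSY			(-2)
#define MODEL_RTAS_EXTENDED_DELAY_MIN	9900
#define MODEL_RTAS_EXTENDED_DELAY_MAX	9905

/*
 * True when an RTAS return code asks the caller to wait and retry.
 * A caller in interrupt context cannot sleep, so it can only flag
 * this condition rather than act on it.
 */
static bool model_rtas_is_busy(int rc)
{
	return rc == MODEL_RTAS_BUSY ||
	       (rc >= MODEL_RTAS_EXTENDED_DELAY_MIN &&
		rc <= MODEL_RTAS_EXTENDED_DELAY_MAX);
}
```

Any other return code (success or a genuine error) is passed through
unchanged, which is why the "fast" variant is only suitable for sensors
that are not expected to report a busy status.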

 Thomas

Re: [Xen-devel] [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread David Vrabel
On 29/07/15 12:35, Julien Grall wrote:
> Hi Wei,
> 
> On 29/07/15 11:13, Wei Liu wrote:
>> On Tue, Jul 28, 2015 at 04:02:45PM +0100, Julien Grall wrote:
>> [...]
>>> diff --git a/drivers/net/xen-netback/netback.c 
>>> b/drivers/net/xen-netback/netback.c
>>> index 7d50711..3b7b7c3 100644
>>> --- a/drivers/net/xen-netback/netback.c
>>> +++ b/drivers/net/xen-netback/netback.c
>>> @@ -314,7 +314,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
>>> *queue, struct sk_buff *skb
>>> } else {
>>> copy_gop->source.domid = DOMID_SELF;
>>> copy_gop->source.u.gmfn =
>>> -   virt_to_mfn(page_address(page));
>>> +   virt_to_gfn(page_address(page));
>>> }
>>> copy_gop->source.offset = offset;
>>>  
>>> @@ -1284,7 +1284,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
>>> *queue,
>>> queue->tx_copy_ops[*copy_ops].source.offset = txreq.offset;
>>>  
>>> queue->tx_copy_ops[*copy_ops].dest.u.gmfn =
>>> -   virt_to_mfn(skb->data);
>>> +   virt_to_gfn(skb->data);
>>> queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>>> queue->tx_copy_ops[*copy_ops].dest.offset =
>>> offset_in_page(skb->data);
>>
>> Reviewed-by: Wei Liu 
>>
>> One possible improvement is to change gmfn in copy_gop to gfn as well.
>> But that's outside of netback code.
> 
> The structure gnttab_copy is part of the hypervisor interface. Is it
> fine to differ on the naming between Xen and Linux?
> 
> Or maybe we could do the change in the public headers in Xen repo too.
> Is it fine to do field renaming in public headers?

I think this series should not alter the Xen API.

David

Re: [Xen-devel] [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Julien Grall
Hi Wei,

On 29/07/15 11:13, Wei Liu wrote:
> On Tue, Jul 28, 2015 at 04:02:45PM +0100, Julien Grall wrote:
> [...]
>> diff --git a/drivers/net/xen-netback/netback.c 
>> b/drivers/net/xen-netback/netback.c
>> index 7d50711..3b7b7c3 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -314,7 +314,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
>> *queue, struct sk_buff *skb
>>  } else {
>>  copy_gop->source.domid = DOMID_SELF;
>>  copy_gop->source.u.gmfn =
>> -virt_to_mfn(page_address(page));
>> +virt_to_gfn(page_address(page));
>>  }
>>  copy_gop->source.offset = offset;
>>  
>> @@ -1284,7 +1284,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
>> *queue,
>>  queue->tx_copy_ops[*copy_ops].source.offset = txreq.offset;
>>  
>>  queue->tx_copy_ops[*copy_ops].dest.u.gmfn =
>> -virt_to_mfn(skb->data);
>> +virt_to_gfn(skb->data);
>>  queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>>  queue->tx_copy_ops[*copy_ops].dest.offset =
>>  offset_in_page(skb->data);
> 
> Reviewed-by: Wei Liu 
> 
> One possible improvement is to change gmfn in copy_gop to gfn as well.
> But that's outside of netback code.

The structure gnttab_copy is part of the hypervisor interface. Is it
fine to differ on the naming between Xen and Linux?

Or maybe we could do the change in the public headers in Xen repo too.
Is it fine to do field renaming in public headers?

Regards,

-- 
Julien Grall

Re: [Xen-devel] [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Julien Grall
Hi Chris,

On 28/07/15 20:39, Chris (Christopher) Brand wrote:
>> Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is 
>> meant,
>> I suspect this is because the first support for Xen was for PV. This brough 
>> some
> Typo : "brought"
> Perhaps "resulted in" would be better ?
> 
>> misimplementation of helpers on ARM and make the developper confused the 
>> expected behavior.
> Typo: "developer".
> I'd also suggest "...and confused developers about the...".
> 
> [...]
> 
>> For clarity and avoid new confusion, replace any reference of mfn into gnf 
>> in any helpers used by PV drivers.
> Typo : "gfn"
> I'd suggest "...replace any reference to mfn with gfn..."
> 
> [...]

Thanks for pointing out the typos. I will fix them in the next version of
this series.

Regards,

-- 
Julien Grall

Re: [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Julien Grall
Hi Boris,

On 28/07/15 20:12, Boris Ostrovsky wrote:
> On 07/28/2015 11:02 AM, Julien Grall wrote:
>> Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
>> is meant, I suspect this is because the first support for Xen was for
>> PV. This brough some misimplementation of helpers on ARM and make the
>> developper confused the expected behavior.
>>
>> For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
>> Although, if we look at the implementation on x86, it's returning a GFN.
>>
>> For clarity and avoid new confusion, replace any reference of mfn into
>> gnf in any helpers used by PV drivers.
> 
> 
> 
> 
>> @@ -730,7 +730,7 @@ static void xen_do_pin(unsigned level, unsigned
>> long pfn)
>>   struct mmuext_op op;
>>
>>   op.cmd = level;
>> -op.arg1.mfn = pfn_to_mfn(pfn);
>> +op.arg1.mfn = pfn_to_gfn(pfn);
> 
> 
> This looks slightly odd. It is correct but given that purpose of this
> series is to make things more clear perhaps we can add another union
> member (gfn) to mmuext_op.arg1?
> 
> (Of course, the hypervisor will continue referring to mfn which could
> still be confusing)

This operation is only used for PV guests, right?

IMHO re-introducing pfn_to_mfn for PV guests only (i.e. with a BUG_ON to
ensure it is never used for auto-translated guests) would be the best
solution. It would avoid using a different name from the hypervisor's in
the hypercall interface. It would also make clear that virt_to_machine &
co. are PV-specific.

I thought about doing this, but I preferred to defer to the x86 experts,
as my knowledge of x86 Xen is very limited. I don't know where it's more
suitable to use MFN or GFN. I guess this file (mmu.c) is mostly PV-specific?

Would something like the below be fine for you?

static inline unsigned long pfn_to_mfn(unsigned long pfn)
{
	unsigned long mfn;

	BUG_ON(xen_feature(XENFEAT_auto_translated_physmap));

	mfn = __pfn_to_mfn(pfn);
	if (mfn != INVALID_P2M_ENTRY)
		mfn &= ~(FOREIGN_FRAME_BIT | IDENTITY_FRAME_BIT);

	return mfn;
}

static inline unsigned long pfn_to_gfn(unsigned long pfn)
{
	if (xen_feature(XENFEAT_auto_translated_physmap))
		return pfn;
	else
		return pfn_to_mfn(pfn);
}

Similar splitting would be done for gfn_to_pfn and mfn_to_pfn.
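
To illustrate what the reverse direction might look like, here is a rough
userspace model rather than kernel code. Everything prefixed with model_
or MODEL_ is hypothetical: the flag stands in for the
xen_feature(XENFEAT_auto_translated_physmap) check, the small array stands
in for the p2m, and assert() stands in for BUG_ON.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are the kernel's. */
#define MODEL_INVALID_ENTRY	(~0UL)
#define MODEL_NR_PAGES		4UL

/* Stand-in for xen_feature(XENFEAT_auto_translated_physmap). */
static bool model_auto_translated;

/* Toy p2m table: pfn -> mfn, only meaningful for PV guests. */
static const unsigned long model_p2m[MODEL_NR_PAGES] = {
	0x100, 0x2a0, 0x101, 0x3ff
};

/* PV-only reverse lookup; asserts where the kernel would BUG_ON. */
static unsigned long model_mfn_to_pfn(unsigned long mfn)
{
	unsigned long pfn;

	assert(!model_auto_translated);

	for (pfn = 0; pfn < MODEL_NR_PAGES; pfn++)
		if (model_p2m[pfn] == mfn)
			return pfn;

	return MODEL_INVALID_ENTRY;
}

/* Usable by any guest type: identity when auto-translated, PV lookup otherwise. */
static unsigned long model_gfn_to_pfn(unsigned long gfn)
{
	if (model_auto_translated)
		return gfn;

	return model_mfn_to_pfn(gfn);
}
```

The point of the split is the same as for pfn_to_gfn: code shared by all
guest types only ever calls the gfn variant, while the mfn variant traps
any use outside PV.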

Regards,

-- 
Julien Grall

RE: [PATCH] gianfar: Fix warnings when built on 64-bit

2015-07-29 Thread Manoil Claudiu
> -Original Message-
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Sent: Wednesday, July 29, 2015 11:02 AM
> To: linuxppc-dev@lists.ozlabs.org; net...@vger.kernel.org; Manoil Claudiu-
> B08782; Wood Scott-B07421
> Subject: Re: [PATCH] gianfar: Fix warnings when built on 64-bit
> 
> On Wednesday 29 July 2015 00:24:37 Scott Wood wrote:
> 
> > Alternatively, if there's a desire to not mess with this code (I don't
> > know how to trigger this code path to test it), this driver should be
> > given dependencies that ensure that it only builds on 32-bit.
> 
[...]

> > @@ -2964,8 +2967,13 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q
> *rx_queue, int rx_work_limit)
> > gfar_init_rxbdp(rx_queue, bdp, bufaddr);
> >
> > /* Update Last Free RxBD pointer for LFC */
> > -   if (unlikely(rx_queue->rfbptr && priv->tx_actual_en))
> > -   gfar_write(rx_queue->rfbptr, (u32)bdp);
> > +   if (unlikely(rx_queue->rfbptr && priv->tx_actual_en)) {
> > +   u32 bdp_dma;
> > +
> > +   bdp_dma = lower_32_bits(rx_queue-
> >rx_bd_dma_base);
> > +   bdp_dma += (uintptr_t)bdp - (uintptr_t)base;
> > +   gfar_write(rx_queue->rfbptr, bdp_dma);
> > +   }
> >
> > /* Update to the next pointer */
> > bdp = next_bd(bdp, base, rx_queue->rx_ring_size);
> 
> You are fixing two problems here: the warning about a size cast, and
> the fact that the driver is using the wrong pointer. I'd suggest
> explaining it in the changelog.
> 

What would be the wrong pointer here? "base"?
"base" is "rx_queue->rx_bd_base".

> Note that we normally rely on void pointer arithmetic in the kernel, so
> I'd write it without the uintptr_t casts as
> 
>   bdp_dma = lower_32_bits(rx_queue->rx_bd_dma_base + (base -
> bdp));

I think you mean: (bdp-base)
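
To make the direction of the subtraction concrete, here is a small
standalone sketch, not the driver's code. The struct and function names
are hypothetical, and the casts to char * are only there to make the byte
offset explicit in portable C, where the kernel would rely on void pointer
arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte descriptor, standing in for the driver's RxBD. */
struct model_bd {
	uint32_t status_len;
	uint32_t buf_ptr;
};

/*
 * Bus address of descriptor 'bdp', given the ring's CPU address 'base'
 * and its DMA address 'dma_base'. The offset is taken in bytes as
 * bdp - base (not base - bdp), and the result is truncated to the low
 * 32 bits, as a 32-bit register write would require.
 */
static uint32_t model_bd_dma_addr(uint64_t dma_base,
				  const struct model_bd *base,
				  const struct model_bd *bdp)
{
	ptrdiff_t byte_off = (const char *)bdp - (const char *)base;

	return (uint32_t)(dma_base + (uint64_t)byte_off);
}
```

With base - bdp the offset would come out negative for every descriptor
after the first, so the value written would point below the start of the
ring.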

Claudiu

Re: [Xen-devel] [PATCH 4/8] xen: Use the correctly the Xen memory terminologies

2015-07-29 Thread Julien Grall
On 28/07/15 18:16, David Vrabel wrote:
> On 28/07/15 16:02, Julien Grall wrote:
>> Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
>> is meant, I suspect this is because the first support for Xen was for
>> PV. This brough some misimplementation of helpers on ARM and make the
>> developper confused the expected behavior.
> 
> For the benefit of other subsystem maintainers, this is a purely
> mechanical change in Xen-specific terminology.  It doesn't need reviews
> or acks from non-Xen people (IMO).
> 
>> For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
>> Although, if we look at the implementation on x86, it's returning a GFN.
>>
>> For clarity and avoid new confusion, replace any reference of mfn into
>> gnf in any helpers used by PV drivers.
>>
>> Take also the opportunity to simplify simple construction such
>> as pfn_to_mfn(page_to_pfn(page)) into page_to_gfn. More complex clean up
>> will come in follow-up patches.
>>
>> I think it may be possible to do further clean up in the x86 code to
>> ensure that helpers returning machine address (such as virt_address) is
>> not used by no auto-translated guests. I will let x86 xen expert doing
>> it.
> 
> Reviewed-by: David Vrabel 
> 
> It looks a bit odd to use GFN in some of the PV code where the
> hypervisor API uses MFN but overall I think using the correct
> terminology where possible is best.  But I'd like to have Boris's or
> Konrad's opinion on this.

I was thinking of introducing mfn_to_pfn & co., which would be used only
for PV guests (with a BUG_ON to enforce it) and for hypercall-related code.

I didn't do it as I don't have much knowledge of x86 Xen and wasn't able to
decide where pfn_to_mfn has to be used.

Regards,

-- 
Julien Grall

Re: [PATCH 0/8] Use correctly the Xen memory terminologies in Linux

2015-07-29 Thread Julien Grall
On 28/07/15 16:02, Julien Grall wrote:
> Hi all,
> 
> This patch series aims to use the memory terminologies described in
> include/linux/mm.h [1] for Linux xen code.

I mistakenly wrote the wrong include here. It should be include/xen/mm.h
from the Xen tree:

http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb

Regards,

-- 
Julien Grall

Re: [PATCH V5 0/7] Allow user to request memory to be locked on page fault

2015-07-29 Thread Vlastimil Babka
On 07/29/2015 12:45 PM, Michal Hocko wrote:
>> In a much less
>> likely corner case, it is not possible in the current setup to request
>> all current VMAs be VM_LOCKONFAULT and all future be VM_LOCKED.
> 
> Vlastimil has already pointed that out. MCL_FUTURE doesn't clear
> MCL_CURRENT. I was quite surprised in the beginning but it makes a
> perfect sense. mlockall call shouldn't lead into munlocking, that would
> be just weird. Clearing MCL_FUTURE on MCL_CURRENT makes sense on the
> other hand because the request is explicit about _current_ memory and it
> doesn't lead to any munlocking.

Yeah after more thinking it does make some sense despite the perceived
inconsistency, but it's definitely worth documenting properly. It also already
covers the usecase for munlockall2(MCL_FUTURE) which IIRC you had in the earlier
revisions...

Re: [PATCH V5 0/7] Allow user to request memory to be locked on page fault

2015-07-29 Thread Michal Hocko
On Tue 28-07-15 09:49:42, Eric B Munson wrote:
> On Tue, 28 Jul 2015, Michal Hocko wrote:
> 
> > [I am sorry but I didn't get to this sooner.]
> > 
> > On Mon 27-07-15 10:54:09, Eric B Munson wrote:
> > > Now that VM_LOCKONFAULT is a modifier to VM_LOCKED and
> > > cannot be specified independentally, it might make more sense to mirror
> > > that relationship to userspace.  Which would lead to soemthing like the
> > > following:
> > 
> > A modifier makes more sense.
> >  
> > > To lock and populate a region:
> > > mlock2(start, len, 0);
> > > 
> > > To lock on fault a region:
> > > mlock2(start, len, MLOCK_ONFAULT);
> > > 
> > > If LOCKONFAULT is seen as a modifier to mlock, then having the flags
> > > argument as 0 mean do mlock classic makes more sense to me.
> > > 
> > > To mlock current on fault only:
> > > mlockall(MCL_CURRENT | MCL_ONFAULT);
> > > 
> > > To mlock future on fault only:
> > > mlockall(MCL_FUTURE | MCL_ONFAULT);
> > > 
> > > To lock everything on fault:
> > > mlockall(MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT);
> > 
> > Makes sense to me. The only remaining and still tricky part would be
> > the munlock{all}(flags) behavior. What should munlock(MLOCK_ONFAULT)
> > do? Keep locked and poppulate the range or simply ignore the flag an
> > just unlock?
> > 
> > I can see some sense to allow munlockall(MCL_FUTURE[|MLOCK_ONFAULT]),
> > munlockall(MCL_CURRENT) resp. munlockall(MCL_CURRENT|MCL_FUTURE) but
> > other combinations sound weird to me.
> > 
> > Anyway munlock with flags opens new doors of trickiness.
> 
> In the current revision there are no new munlock[all] system calls
> introduced.  munlockall() unconditionally cleared both MCL_CURRENT and
> MCL_FUTURE before the set and now unconditionally clears all three.
> munlock() does the same for VM_LOCK and VM_LOCKONFAULT. 

OK if new munlock{all}(flags) is not introduced then this is much saner
IMO.

> If the user
> wants to adjust mlockall flags today, they need to call mlockall a
> second time with the new flags, this remains true for mlockall after
> this set and the same behavior is mirrored in mlock2. 

OK, this makes sense to me.

> The only
> remaining question I have is should we have 2 new mlockall flags so that
> the caller can explicitly set VM_LOCKONFAULT in the mm->def_flags vs
> locking all current VMAs on fault.  I ask because if the user wants to
> lock all current VMAs the old way, but all future VMAs on fault they
> have to call mlockall() twice:
> 
>   mlockall(MCL_CURRENT);
>   mlockall(MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT);
> 
> This has the side effect of converting all the current VMAs to
> VM_LOCKONFAULT, but because they were all made present and locked in the
> first call, this should not matter in most cases. 

I think this is OK (worth documenting though) considering that ONFAULT
is just a modifier for the current mlock* operation. The memory is locked
the same way in both cases: once the memory is present, you cannot tell
whether it was locked during the mlock call or later during the fault.
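
From userspace, the two-call sequence discussed above might look like the
sketch below. The function name is hypothetical, and the code assumes the
MCL_ONFAULT flag from this series is exposed by the installed headers;
where it is not, the sketch just reports ENOSYS instead of guessing a
value, since the flag values differ between architectures.

```c
#include <errno.h>
#include <sys/mman.h>

/*
 * Lock everything mapped right now eagerly, then switch the policy so
 * that future mappings are locked on fault. Returns 0 on success or -1
 * with errno set, like mlockall() itself.
 */
static int lock_now_then_on_fault(void)
{
#ifdef MCL_ONFAULT
	/* First call: populate and lock all current mappings. */
	if (mlockall(MCL_CURRENT) != 0)
		return -1;

	/*
	 * Second call: current VMAs are marked lock-on-fault too, but
	 * their pages are already present and locked from the first call.
	 */
	return mlockall(MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT);
#else
	/* Assumption: headers without the flag cannot express the request. */
	errno = ENOSYS;
	return -1;
#endif
}
```

As noted in the thread, a return of 0 does not by itself prove that every
current page was populated, so callers that depend on residency may want
to verify it separately.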

> The catch is that,
> like mmap(MAP_LOCKED), mlockall() does not communicate if mm_populate()
> fails.  This has been true of mlockall() from the beginning so I don't
> know if it needs more than an entry in the man page to clarify (which I
> will add when I add documentation for MCL_ONFAULT).

Yes, this is true, but unlike mmap it seems fixable, I guess. We do not have
to unmap, and we can downgrade mmap_sem to read and then fault, so nobody
can race with a concurrent mlock.

> In a much less
> likely corner case, it is not possible in the current setup to request
> all current VMAs be VM_LOCKONFAULT and all future be VM_LOCKED.

Vlastimil has already pointed that out. MCL_FUTURE doesn't clear
MCL_CURRENT. I was quite surprised in the beginning but it makes
perfect sense. An mlockall call shouldn't lead to munlocking, that would
be just weird. Clearing MCL_FUTURE on MCL_CURRENT makes sense on the
other hand because the request is explicit about _current_ memory and it
doesn't lead to any munlocking.

-- 
Michal Hocko
SUSE Labs
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v5 1/2] perf,kvm/ppc: Add kvm_perf.h for powerpc

2015-07-29 Thread Hemant Kumar

Hi Scott,

On 07/17/2015 01:40 AM, Scott Wood wrote:

On Thu, 2015-07-16 at 21:18 +0530, Hemant Kumar wrote:

To analyze the exit events with perf, we need kvm_perf.h to be added in
the arch/powerpc directory, where the kvm tracepoints needed to trace
the KVM exit events are defined.

This patch adds "kvm_perf_book3s.h" to indicate that the tracepoints are
book3s specific. Generic "kvm_perf.h" then can just include
"kvm_perf_book3s.h".

Signed-off-by: Hemant Kumar 
---
Changes:
- Not exporting the exit reasons compared to previous patchset (suggested
by Paul)

  arch/powerpc/include/uapi/asm/kvm_perf.h|  6 ++
  arch/powerpc/include/uapi/asm/kvm_perf_book3s.h | 14 ++
  2 files changed, 20 insertions(+)
  create mode 100644 arch/powerpc/include/uapi/asm/kvm_perf.h
  create mode 100644 arch/powerpc/include/uapi/asm/kvm_perf_book3s.h

diff --git a/arch/powerpc/include/uapi/asm/kvm_perf.h
b/arch/powerpc/include/uapi/asm/kvm_perf.h
new file mode 100644
index 000..5ed2ff3
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/kvm_perf.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_POWERPC_KVM_PERF_H
+#define _ASM_POWERPC_KVM_PERF_H
+
+#include 
+
+#endif
diff --git a/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
b/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
new file mode 100644
index 000..8c8d8c2
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
@@ -0,0 +1,14 @@
+#ifndef _ASM_POWERPC_KVM_PERF_BOOK3S_H
+#define _ASM_POWERPC_KVM_PERF_BOOK3S_H
+
+#include 
+
+#define DECODE_STR_LEN 20
+
+#define VCPU_ID "vcpu_id"
+
+#define KVM_ENTRY_TRACE "kvm_hv:kvm_guest_enter"
+#define KVM_EXIT_TRACE "kvm_hv:kvm_guest_exit"
+#define KVM_EXIT_REASON "trap"
+
+#endif /* _ASM_POWERPC_KVM_PERF_BOOK3S_H */

Again, why is book3s stuff being presented via uapi as generic
 with generic symbol names?

-Scott


Ok.

We can rename the KVM_ENTRY_TRACE macro to something like
KVM_BOOK3S_ENTRY_TRACE, and likewise for KVM_EXIT_TRACE
and KVM_EXIT_REASON. Then, to resolve the issue of generic
macro names on the userspace side, we can handle it using the __weak
modifier.

What would you suggest?
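
One way the weak-symbol idea could look on the userspace side is sketched
below. This is only an illustration under assumptions: the accessor name
kvm_exit_reason_field(), the WEAK_SYM macro and the generic fallback string
"exit_reason" are all made up for the example, and it relies on GCC/Clang
style weak symbols.

```c
#include <string.h>

/* Assumption: GCC/Clang-style weak symbols are available. */
#define WEAK_SYM __attribute__((weak))

/*
 * Generic fallback, named here purely for illustration. An architecture
 * that needs something else (for book3s: "trap") would provide a strong
 * definition of the same function in its own object file, and the
 * linker would pick that one instead of this weak default.
 */
WEAK_SYM const char *kvm_exit_reason_field(void)
{
	return "exit_reason";
}

/* Callers only see the accessor, never an arch-specific macro name. */
int is_exit_reason_field(const char *name)
{
	return strcmp(name, kvm_exit_reason_field()) == 0;
}
```

With this shape the book3s-specific names can stay prefixed in the header,
while generic code asks the accessor for the field name at run time.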




--
Thanks,
Hemant Kumar


RE: [PATCH][v2] powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB board support

2015-07-29 Thread Priyanka Jain


> -Original Message-
> From: Wood Scott-B07421
> Sent: Friday, July 24, 2015 8:58 PM
> To: Jain Priyanka-B32167
> Cc: linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH][v2] powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB
> board support
> 
> On Wed, 2015-07-22 at 05:49 -0500, Jain Priyanka-B32167 wrote:
> >
> > > -Original Message-
> > > From: Wood Scott-B07421
> > > Sent: Friday, July 17, 2015 10:37 PM
> > > To: Jain Priyanka-B32167
> > > Cc: linuxppc-dev@lists.ozlabs.org
> > > Subject: Re: [PATCH][v2] powerpc/fsl-booke: Add
> > > T1040D4RDB/T1042D4RDB board support
> > >
> > > On Fri, 2015-07-17 at 01:17 -0500, Jain Priyanka-B32167 wrote:
> > > >
> > > > > -Original Message-
> > > > > From: Wood Scott-B07421
> > > > > Sent: Friday, July 17, 2015 1:06 AM
> > > > > To: Jain Priyanka-B32167
> > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > Subject: Re: [PATCH][v2] powerpc/fsl-booke: Add
> > > > > T1040D4RDB/T1042D4RDB board support
> > > > >
> > > > > > > +i2c@118100{
> > > > > > > +  mux@77{
> > > > > > > + compatible = "nxp,pca9546";
> > > > > > > + reg = <0x77>;
> > > > > > > + #address-cells = <1>;
> > > > > > > + #size-cells = <0>;
> > > > > > > + };
> > > > > > > + };
> > > > > >
> > > > > > A mux with no nodes under it (and yet it has
> > > > > > #address-cells/#size- cells)?
> > > > > > What is it multiplexing?
> > > > > > [Priyanka]: PCA9546 is i2c mux device , to which other i2c
> > > > > > devices (up-to 8
> > > > > > ) can be further connected on output channels On T104xD4RDB,
> > > > > > channel 0, 1, 3 line are connected to PEX device, Channel 2 to
> > > > > > hdmi interface (initialization is done in u-boot only), other
> > > > > > channels are
> > > grounded.
> > > > > > So, as such Linux is not using the second level I2C devices
> > > > > > connected on this MUX device. So, I have not shown next level
> > > hierarchy.
> > > > > > Should I replace 'mux' with some other name? . Please suggest.
> > > > >
> > > > > The device tree describes the hardware, not just what Linux uses...
> > > > > but what I don't understand is why you describe the mux at all
> > > > > if you're not going to describe what goes underneath it.
> > > > >
> > > > [Jain Priyanka-B32167] : Is below looks OK?
> > > > i2c@118100{
> > > >  +  i2c@77{
> > > >  + compatible = "nxp,pca9546";
> > > >  + reg = <0x77>;
> > > >  + #address-cells = <1>;
> > > >  + #size-cells = <0>;
> > > >  + };
> > > >  + };
> > >
> > > Where in my above comment did it appear that I was complaining about
> > > the node name?
> > >
> > [Jain Priyanka-B32167]
> > From what I understand:
> > PCA9546 is a mux device and it would be good if we were able to
> > present the I2C devices on output lines as subnodes like in case of
> > B4qds board and then 'mux' name would have make more sense.
> 
> The name "mux" makes more sense regardless.
> 
> > But in case of T1040D4RDB board, output i2c lines are going to PEX
> > slots, PCI connector. I am not aware of how to represents them as sub-
> nodes in dts.
> 
> OK, so you're saying the i2c devices are pluggable (and I'm assuming by "PEX
> slots" you just mean that the physical slot is repurposed, not that the PCI
> express protocol is involved)?  Making a non-runtime-enumerable bus be
> pluggable seems like a bad idea, but if that's really what has been done,
> there needs to be a device tree that represents the entire system, not just
> the motherboard.  This could be done either via a dts file that /include/s the
> motherboard dts, or via firmware dtb edits.  The dts for the motherboard
> should include the mux node with a comment explaining what the situation
> is.
> 
[Jain Priyanka-B32167] Does the comment below look OK?
"Output I2C data/clock lines (SD0/SC0, SD1/SC1, SD2/SC2, SD3/SC3) go to
mini PCI connector slot1, mini PCI connector slot2, HDMI connector and PEX
slot respectively. The sub-nodes will depend on the devices connected to
these slots."

> -Scott


RE: [PATCH] gianfar: Fix warnings when built on 64-bit

2015-07-29 Thread Manoil Claudiu
> -Original Message-
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Sent: Wednesday, July 29, 2015 11:02 AM
> To: linuxppc-dev@lists.ozlabs.org; net...@vger.kernel.org; Manoil Claudiu-
> B08782; Wood Scott-B07421
> Subject: Re: [PATCH] gianfar: Fix warnings when built on 64-bit
> 
> On Wednesday 29 July 2015 00:24:37 Scott Wood wrote:
> 
> > Alternatively, if there's a desire to not mess with this code (I don't
> > know how to trigger this code path to test it), this driver should be
> > given dependencies that ensure that it only builds on 32-bit.
> 
> These are obvious fixes, they should definitely go in.

This patch conflicts with the rx s/g patch series:
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=9061cb023567abf081569d6851b0815dd18437e6

So if applied as is on top of net.git, it will cause a headache when
net-next.git is merged into net.git (or vice versa).

Since there are no 64-bit systems with gianfar/eTSEC, I think that this patch
should target net-next.git (reworked to be applicable on net-next.git) to avoid
the conflict. I could do this rework and resend it on top of net-next.git.

> 
> >  drivers/net/ethernet/freescale/gianfar.c | 22 ++
> >  1 file changed, 18 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/gianfar.c
> b/drivers/net/ethernet/freescale/gianfar.c
> > index ff87502..7c682ac 100644
> > --- a/drivers/net/ethernet/freescale/gianfar.c
> > +++ b/drivers/net/ethernet/freescale/gianfar.c
> > @@ -565,6 +565,7 @@ static void gfar_ints_enable(struct gfar_private
> *priv)
> > }
> >  }
> >
> > +#ifdef CONFIG_PM
> >  static void lock_tx_qs(struct gfar_private *priv)
> >  {
> > int i;
> > @@ -580,6 +581,7 @@ static void unlock_tx_qs(struct gfar_private *priv)
> > for (i = 0; i < priv->num_tx_queues; i++)
> > spin_unlock(&priv->tx_queue[i]->txlock);
> >  }
> > +#endif
> >
> 
> This seems unrelated and should probably be a separate fix.
> 

I'm working on a patch set to revive and clean up the power management code,
and lock_tx_qs() is planned to be removed (it can be shown that it's not
needed). So this change can be removed from this patch.

Thanks,
Claudiu

[...]

Re: [4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read

2015-07-29 Thread Anshuman Khandual
On 07/29/2015 08:55 AM, Michael Ellerman wrote:
> On Tue, 2015-30-06 at 08:20:30 UTC, Anshuman Khandual wrote:
>> > Branch record attributes 'mispred' and 'predicted' are single bit
>> > fields as defined in the perf ABI. Hence the data type of the field
>> > 'pred' used during BHRB processing should be changed from integer
>> > to bool. This patch also changes the name of the variable from 'pred'
>> > to 'mispred' making the logical inversion process more meaningful
>> > and readable.
> This whole function is a mess.
> 
> There's no good reason why we're doing the assignment to pred/mispred in two
> places to begin with, so if that was eliminated we wouldn't need a local for
> mispred to begin with.

Not sure whether I got this right. We are assigning mispred once with
the value (val & BHRB_PREDICTION) and then assigning mispred and its
inversion to two different fields of the branch entry as required.

> 
> Then there's the type juggling, all of which probably works but is fishy and
> horrible.

With this patch and one more (2nd patch of the BHRB SW filter series)
patch, we are trying to make it better.

> 
> You take a u64, bitwise and it with a mask, assign that to a boolean, then take

So that any residual positive value after the "AND" operation will
become logical TRUE for the boolean. We don't use any shifting here
as BHRB_PREDICTION checks for the rightmost (least significant) bit
in the sequence.

> the boolean, *bitwise* negate that and assign the result to a single bit
> bitfield.

This is getting fixed with a subsequent patch (2nd patch of the BHRB
SW filter series) in a new function called insert_branch.

+static inline void insert_branch(struct cpu_hw_events *cpuhw,
+   int index, u64 from, u64 to, bool mispred)
+{
+   cpuhw->bhrb_entries[index].from = from;
+   cpuhw->bhrb_entries[index].to = to;
+   cpuhw->bhrb_entries[index].mispred = mispred;
+   cpuhw->bhrb_entries[index].predicted = !mispred;
+}
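
To make the type juggling concrete, here is a small stand-alone sketch of both
forms. It is not kernel code: PREDICTION_BIT and struct branch_flags are
stand-ins chosen for the example, assuming (as described above) that the
prediction flag is the least significant bit and that the two record fields
are single-bit bitfields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: the prediction flag is the lowest bit. */
#define PREDICTION_BIT UINT64_C(0x1)

/* Stand-in for the two single-bit fields of a branch record. */
struct branch_flags {
	unsigned int mispred   : 1;
	unsigned int predicted : 1;
};

/* Logical negation: the form used by insert_branch(). */
struct branch_flags flags_logical(uint64_t val)
{
	bool mispred = val & PREDICTION_BIT;	/* any non-zero -> true */
	struct branch_flags f = { .mispred = mispred, .predicted = !mispred };

	return f;
}

/*
 * Bitwise negation: the bool is promoted to int (written out here as a
 * cast), so ~ yields -1 or -2. Only the truncation to the 1-bit field
 * makes the stored value come out as the inverse.
 */
struct branch_flags flags_bitwise(uint64_t val)
{
	bool mispred = val & PREDICTION_BIT;
	struct branch_flags f = { .mispred = mispred };

	f.predicted = ~(int)mispred;
	return f;
}
```

Both helpers store the same bits, which is why the old code worked, but the
logical form states the intent directly and does not depend on truncation.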


Re: [PATCH] gianfar: Fix warnings when built on 64-bit

2015-07-29 Thread Arnd Bergmann
On Wednesday 29 July 2015 00:24:37 Scott Wood wrote:

> Alternatively, if there's a desire to not mess with this code (I don't
> know how to trigger this code path to test it), this driver should be
> given dependencies that ensure that it only builds on 32-bit.

These are obvious fixes, they should definitely go in.

>  drivers/net/ethernet/freescale/gianfar.c | 22 ++
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/gianfar.c 
> b/drivers/net/ethernet/freescale/gianfar.c
> index ff87502..7c682ac 100644
> --- a/drivers/net/ethernet/freescale/gianfar.c
> +++ b/drivers/net/ethernet/freescale/gianfar.c
> @@ -565,6 +565,7 @@ static void gfar_ints_enable(struct gfar_private *priv)
>   }
>  }
>  
> +#ifdef CONFIG_PM
>  static void lock_tx_qs(struct gfar_private *priv)
>  {
>   int i;
> @@ -580,6 +581,7 @@ static void unlock_tx_qs(struct gfar_private *priv)
>   for (i = 0; i < priv->num_tx_queues; i++)
>   spin_unlock(&priv->tx_queue[i]->txlock);
>  }
> +#endif
>  

This seems unrelated and should probably be a separate fix.

> @@ -2964,8 +2967,13 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q 
> *rx_queue, int rx_work_limit)
>   gfar_init_rxbdp(rx_queue, bdp, bufaddr);
>  
>   /* Update Last Free RxBD pointer for LFC */
> - if (unlikely(rx_queue->rfbptr && priv->tx_actual_en))
> - gfar_write(rx_queue->rfbptr, (u32)bdp);
> + if (unlikely(rx_queue->rfbptr && priv->tx_actual_en)) {
> + u32 bdp_dma;
> +
> + bdp_dma = lower_32_bits(rx_queue->rx_bd_dma_base);
> + bdp_dma += (uintptr_t)bdp - (uintptr_t)base;
> + gfar_write(rx_queue->rfbptr, bdp_dma);
> + }
>  
>   /* Update to the next pointer */
>   bdp = next_bd(bdp, base, rx_queue->rx_ring_size);

You are fixing two problems here: the warning about a size cast, and
the fact that the driver is using the wrong pointer. I'd suggest
explaining it in the changelog.

Note that we normally rely on void pointer arithmetic in the kernel, so
I'd write it without the uintptr_t casts as 

bdp_dma = lower_32_bits(rx_queue->rx_bd_dma_base + (base - bdp));
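
As a stand-alone illustration of the address calculation, the sketch below
derives the low 32 bits of a descriptor's bus address from the ring's DMA
base plus the descriptor's byte offset (bdp minus base, as in the hunk quoted
above). struct rx_desc and desc_dma_lo32() are made-up names for the example,
not gianfar code. One assumption worth spelling out: subtracting typed
pointers counts descriptors, so the byte offset needs the multiplication by
the descriptor size unless char or void pointer arithmetic is used.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an 8-byte buffer descriptor; field names are illustrative. */
struct rx_desc {
	uint16_t status;
	uint16_t length;
	uint32_t buf_ptr;
};

/*
 * Low 32 bits of the bus address of descriptor 'bdp', given the CPU
 * pointer to the start of the ring and the ring's DMA base address.
 * 'bdp' must point into the ring that starts at 'base'.
 */
uint32_t desc_dma_lo32(const struct rx_desc *base,
		       const struct rx_desc *bdp,
		       uint64_t ring_dma_base)
{
	/* (bdp - base) is an element count, so scale it to bytes. */
	uint64_t byte_off = (uint64_t)(bdp - base) * sizeof(*bdp);

	return (uint32_t)(ring_dma_base + byte_off);
}
```

The point of the calculation is that the value written to the register is a
bus address, not the kernel virtual address that a plain cast of bdp would
give.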

Arnd

[PATCH v3] powerpc/dts: Add and fix 1588 timer node for eTSEC

2015-07-29 Thread Yangbo Lu
Add 1588 timer node in files:
arch/powerpc/boot/dts/bsc9131rdb.dtsi
arch/powerpc/boot/dts/bsc9132qds.dtsi
arch/powerpc/boot/dts/p1010rdb.dtsi
arch/powerpc/boot/dts/p1020rdb-pd.dts
arch/powerpc/boot/dts/p1021rdb-pc.dtsi
arch/powerpc/boot/dts/p1022ds.dtsi
arch/powerpc/boot/dts/p1025twr.dtsi
For P2020RDB-PC, registers' values should be calculated
based on default 1588 reference clock(300MHz) not 250MHz,
and fix this in file:
arch/powerpc/boot/dts/p2020rdb-pc.dtsi

Signed-off-by: Yangbo Lu 
---
Changes for v3:
- Changed 'tmr-add' to hex value
- Modified commit message
Changes for v2:
- Changed hex value to decimal value in dts
- Modified commit message
- Modified 1588 node in p2020rdb-pc.dtsi
---
 arch/powerpc/boot/dts/bsc9131rdb.dtsi  | 12 
 arch/powerpc/boot/dts/bsc9132qds.dtsi  | 12 
 arch/powerpc/boot/dts/p1010rdb.dtsi| 12 
 arch/powerpc/boot/dts/p1020rdb-pd.dts  | 12 
 arch/powerpc/boot/dts/p1021rdb-pc.dtsi | 12 
 arch/powerpc/boot/dts/p1022ds.dtsi | 12 
 arch/powerpc/boot/dts/p1025twr.dtsi| 12 
 arch/powerpc/boot/dts/p2020rdb-pc.dtsi | 12 ++--
 8 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/boot/dts/bsc9131rdb.dtsi 
b/arch/powerpc/boot/dts/bsc9131rdb.dtsi
index 45efcba..f4d96d2 100644
--- a/arch/powerpc/boot/dts/bsc9131rdb.dtsi
+++ b/arch/powerpc/boot/dts/bsc9131rdb.dtsi
@@ -80,6 +80,18 @@
status = "disabled";
};
 
+   ptp_clock@b0e00 {
+   compatible = "fsl,etsec-ptp";
+   reg = <0xb0e00 0xb0>;
+   interrupts = <68 2 0 0 69 2 0 0>;
+   fsl,tclk-period = <5>;
+   fsl,tmr-prsc= <2>;
+   fsl,tmr-add = <0xcccd>;
+   fsl,tmr-fiper1  = <5>;
+   fsl,tmr-fiper2  = <0>;
+   fsl,max-adj = <24999>;
+   };
+
enet0: ethernet@b {
phy-handle = <&phy0>;
phy-connection-type = "rgmii-id";
diff --git a/arch/powerpc/boot/dts/bsc9132qds.dtsi 
b/arch/powerpc/boot/dts/bsc9132qds.dtsi
index af8e888..7a13bf2 100644
--- a/arch/powerpc/boot/dts/bsc9132qds.dtsi
+++ b/arch/powerpc/boot/dts/bsc9132qds.dtsi
@@ -87,6 +87,18 @@
};
};
 
+   ptp_clock@b0e00 {
+   compatible = "fsl,etsec-ptp";
+   reg = <0xb0e00 0xb0>;
+   interrupts = <68 2 0 0 69 2 0 0>;
+   fsl,tclk-period = <5>;
+   fsl,tmr-prsc= <2>;
+   fsl,tmr-add = <0xcccd>;
+   fsl,tmr-fiper1  = <5>;
+   fsl,tmr-fiper2  = <0>;
+   fsl,max-adj = <24999>;
+   };
+
enet0: ethernet@b {
phy-handle = <&phy0>;
tbi-handle = <&tbi0>;
diff --git a/arch/powerpc/boot/dts/p1010rdb.dtsi 
b/arch/powerpc/boot/dts/p1010rdb.dtsi
index ea534ef..0f0ced6 100644
--- a/arch/powerpc/boot/dts/p1010rdb.dtsi
+++ b/arch/powerpc/boot/dts/p1010rdb.dtsi
@@ -186,6 +186,18 @@
};
};
 
+   ptp_clock@b0e00 {
+   compatible = "fsl,etsec-ptp";
+   reg = <0xb0e00 0xb0>;
+   interrupts = <68 2 0 0 69 2 0 0>;
+   fsl,tclk-period = <10>;
+   fsl,tmr-prsc= <2>;
+   fsl,tmr-add = <0x8016>;
+   fsl,tmr-fiper1  = <0>;
+   fsl,tmr-fiper2  = <0>;
+   fsl,max-adj = <1>;
+   };
+
enet0: ethernet@b {
phy-handle = <&phy0>;
phy-connection-type = "rgmii-id";
diff --git a/arch/powerpc/boot/dts/p1020rdb-pd.dts 
b/arch/powerpc/boot/dts/p1020rdb-pd.dts
index 987017e..c7c6416 100644
--- a/arch/powerpc/boot/dts/p1020rdb-pd.dts
+++ b/arch/powerpc/boot/dts/p1020rdb-pd.dts
@@ -225,6 +225,18 @@
};
};
 
+   ptp_clock@b0e00 {
+   compatible = "fsl,etsec-ptp";
+   reg = <0xb0e00 0xb0>;
+   interrupts = <68 2 0 0 69 2 0 0>;
+   fsl,tclk-period = <10>;
+   fsl,tmr-prsc= <2>;
+   fsl,tmr-add = <0x8016>;
+   fsl,tmr-fiper1  = <0>;
+   fsl,tmr-fiper2  = <0>;
+   fsl,max-adj = <1>;
+   };
+
enet0: ethernet@b {
fixed-link = <1 1 1000 0 0>;
phy-connection-type = "rgmii-id";
diff --git a/arch/powerpc/boot/dts/p1021rdb-pc.dtsi 
b/arch/powerpc/boot/dts/p1021rdb-pc.dtsi
index d6274c5..e8a0f95 100644
--- a/arch/powerpc/boot/dts/p1021rdb-pc.dtsi
+++ b/arch/powerpc/boot/dts/p1021rdb-pc.dtsi
@@ -224,6 +224,18 @@
};
};
 
+   ptp_clock@b0e00 {
+   compatible = "fsl,etsec-p

RE: [PATCH v2] powerpc/dts: Add and fix 1588 timer node for eTSEC

2015-07-29 Thread Lu Y . B .
Hi Scott,

I submitted a v3 patch for this.
Sorry for my misunderstanding and thanks a lot.



Best regards,
Yangbo Lu

> -Original Message-
> From: Wood Scott-B07421
> Sent: Friday, July 24, 2015 11:08 PM
> To: Lu Yangbo-B47093
> Cc: linuxppc-dev@lists.ozlabs.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH v2] powerpc/dts: Add and fix 1588 timer node for
> eTSEC
> 
> On Mon, 2015-07-20 at 01:33 -0500, Lu Yangbo-B47093 wrote:
> > > On Wed, 2015-07-15 at 21:37 -0500, Lu Yangbo-B47093 wrote:
> > > > Any comments?
> > > > Thanks.
> > >
> > > Sorry, I must have missed this on my last time through the patch
> queue.
> > > I see you've decimalized the fiper and max-adj properties, which is
> > > good... but does it really make sense for tmr-add?  I'm not familiar
> > > with what this value represents, but the numbers look more natural as
> hex (e.g.
> > > 0xaaaaaaab versus 2863311531).
> >
> > Yes, the fiper value would be writed into fiper registers. And max-adj
> > value would be used in ptp driver in driver/ptp/.
> > But you insisted that values should be in decimalism in the v1
> > patch... :)
> >
> > See the history :)
> 
> I didn't insist on decimals for *everything*, just where it makes sense,
> and that "it goes in a register" doesn't *automatically* mean that it
> doesn't make sense.
> 
> # history ##
> > > > +  ptp_clock@b0e00{
> > > > + compatible = "fsl,etsec-ptp";
> > > > + reg = <0xb0e00 0xb0>;
> > > > + interrupts = <68 2 0 0 69 2 0 0>;
> > > > + fsl,tclk-period = <5>;
> > > > + fsl,tmr-prsc= <2>;
> > > > + fsl,tmr-add = <0xcccd>;
> > > > + fsl,tmr-fiper1  = <0x3b9ac9fb>;
> > > > + fsl,tmr-fiper2  = <0x00018696>;
> > > > + fsl,max-adj = <24999>;
> > >
> > > Please don't use hex for numbers that make more sense as decimal.
> > > [Lu Yangbo-B47093] The hex value is register value, I think it's
> > > better to use hex.
> >
> > Whether it goes into a register doesn't matter.  Hex values are useful
> > for values which are subdivided into various bitfields, or whose hex
> > representation is simpler than decimal.  I'm not familiar with the
> > details of this hardware, but I doubt the former is the case for
> > 0x3b9ac9fb ==
> > 999999995 or 0x18696 == 99990.
> >
> > -Scott
> > ##
> >
> >
> > >
> > > > > diff --git a/arch/powerpc/boot/dts/p2020rdb-pc.dtsi
> > > > > b/arch/powerpc/boot/dts/p2020rdb-pc.dtsi
> > > > > index c21d1c7..363172d 100644
> > > > > --- a/arch/powerpc/boot/dts/p2020rdb-pc.dtsi
> > > > > +++ b/arch/powerpc/boot/dts/p2020rdb-pc.dtsi
> > > > > @@ -215,12 +215,12 @@
> > > > > };
> > > > >
> > > > >  ptp_clock@24e00{
> > > > > -   fsl,tclk-period = <5>;
> > > > > -   fsl,tmr-prsc = <200>;
> > > > > -   fsl,tmr-add = <0xCCCD>;
> > > > > -   fsl,tmr-fiper1 = <0x3B9AC9FB>;
> > > > > -   fsl,tmr-fiper2 = <0x0001869B>;
> > > > > -   fsl,max-adj = <24999>;
> > > > > +   fsl,tclk-period = <5>;
> > > > > +   fsl,tmr-prsc= <2>;
> > > > > +   fsl,tmr-add = <2863311531>;
> > > > > +   fsl,tmr-fiper1  = <5>;
> > > > > +   fsl,tmr-fiper2  = <0>;
> > > > > +   fsl,max-adj = <2>;
> > > > > };
> > >
> > > And here, you're changing the value of fsl,tmr-add and fsl,max-adj.
> Why?
> >
> > The old values maybe not calculated base on the default eTSEC system
> > clock value.
> > 1588 timer couldn’t be adjusted correctly by old values.
> 
> Explain in the changelog what was wrong with the old values (don't just
> say "Fix 1588 timer node in file").
> 
> -Scott
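
For readers wondering where such values come from, the sketch below shows how
tmr-add and max-adj could be derived from the reference clock. It assumes the
formulas from the kernel's device tree binding notes for this PTP clock, which
are not quoted in this thread: tmr_add = ceil(2^32 / FreqDivRatio) and
max_adj = 10^9 * (FreqDivRatio - 1) - 1, with FreqDivRatio = TimerOsc /
NominalFreq and NominalFreq = 1000 / tclk_period in MHz. The function names
are made up for the example.

```c
#include <stdint.h>

/*
 * tmr_add = ceil(2^32 / (timer_osc / nominal_freq)), done in integers.
 * Frequencies are in MHz; nominal_mhz is 1000 / tclk_period (in ns).
 * Requires timer_osc_mhz > nominal_mhz, i.e. a ratio above 1.0.
 */
uint32_t ptp_tmr_add(uint32_t timer_osc_mhz, uint32_t nominal_mhz)
{
	uint64_t num = (UINT64_C(1) << 32) * nominal_mhz;

	/* Round up so the result is the ceiling of the division. */
	return (uint32_t)((num + timer_osc_mhz - 1) / timer_osc_mhz);
}

/* max_adj = 1e9 * (timer_osc / nominal_freq - 1) - 1, in ppb. */
uint32_t ptp_max_adj(uint32_t timer_osc_mhz, uint32_t nominal_mhz)
{
	uint64_t ppb = UINT64_C(1000000000) * (timer_osc_mhz - nominal_mhz);

	return (uint32_t)(ppb / nominal_mhz - 1);
}
```

Under these assumptions, a 300 MHz reference clock with a 5 ns period
(200 MHz nominal) gives a tmr-add of 2863311531, the decimal value that
appears earlier in this thread.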


Re: [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL

2015-07-29 Thread Stewart Smith
Benjamin Herrenschmidt  writes:
> On Tue, 2015-07-28 at 16:13 +1000, Samuel Mendoza-Jonas wrote:
>
>> "It sounds reasonable" was more or less the inspiration :)
>> While I was going over some of the code relating to the previous kexec
>> fix with Ben he pointed this out and suggested there wasn't
>> much of a reason to differentiate between a crashing/non-crashing
>> cpu as far as the timeout goes - if we're not 'crashing' we still
>> don't want to spin forever.
>> 
>> I'll let Ben comment on whether 1s per cpu is enough.
>
> Well, if the scheduler doesn't give us the CPU at the point of kexec
> within a second, I think we are in pretty bad shape already, don't you
> think ?

Quite likely, I think my dislike of magic timeouts just kicked in :)
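
The shape of wait being discussed, where nobody spins forever, can be sketched
generically as below. This is not the kexec code: wait_bounded() and
countdown_done() are hypothetical names, and the delay callback stands in for
something like a one-millisecond kernel delay, so that 1000 polls would
correspond to roughly the one second in the subject line.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Poll 'done' up to max_polls times, calling 'delay' (if any) between
 * attempts. Returns true if the condition was met, false on timeout.
 */
bool wait_bounded(bool (*done)(void *), void *arg,
		  void (*delay)(void), unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (done(arg))
			return true;
		if (delay)
			delay();
	}
	/* One last look so a late arrival is not reported as a timeout. */
	return done(arg);
}

/*
 * Example predicate for the sketch: reports success once the countdown
 * pointed to by arg has reached zero, and decrements it otherwise.
 */
bool countdown_done(void *arg)
{
	unsigned int *left = arg;

	if (*left == 0)
		return true;
	(*left)--;
	return false;
}
```

The caller decides what a timeout means. The argument in this thread is that
the bound should apply whether or not the CPU is crashing.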


[PATCH] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR

2015-07-29 Thread Wei Yang
In the current implementation, when a VF BAR is bigger than 64MB, it uses 4 M64
BARs in Single PE mode to cover the number of VFs required to be enabled.
By doing so, several VFs end up in one VF Group, which leads to interference
between VFs in the same group.

This patch changes the design by using one M64 BAR in Single PE mode for
one VF BAR. This gives absolute isolation for VFs.

Signed-off-by: Wei Yang 
---
 arch/powerpc/include/asm/pci-bridge.h |5 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |  104 +
 2 files changed, 18 insertions(+), 91 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index 712add5..1997e5d 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -214,10 +214,9 @@ struct pci_dn {
u16 vfs_expanded;   /* number of VFs IOV BAR expanded */
u16 num_vfs;/* number of VFs enabled*/
int offset; /* PE# for the first VF PE */
-#define M64_PER_IOV 4
-   int m64_per_iov;
+#define MAX_M64_WINDOW  16
 #define IODA_INVALID_M64(-1)
-   int m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
+   int m64_wins[PCI_SRIOV_NUM_BARS][MAX_M64_WINDOW];
 #endif /* CONFIG_PCI_IOV */
 #endif
struct list_head child_list;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5738d31..b3e7909 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1168,7 +1168,7 @@ static int pnv_pci_vf_release_m64(struct pci_dev *pdev)
pdn = pci_get_pdn(pdev);
 
for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-   for (j = 0; j < M64_PER_IOV; j++) {
+   for (j = 0; j < MAX_M64_WINDOW; j++) {
if (pdn->m64_wins[i][j] == IODA_INVALID_M64)
continue;
opal_pci_phb_mmio_enable(phb->opal_id,
@@ -1193,8 +1193,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
inttotal_vfs;
resource_size_tsize, start;
intpe_num;
-   intvf_groups;
-   intvf_per_group;
+   intm64s;
 
bus = pdev->bus;
hose = pci_bus_to_host(bus);
@@ -1204,17 +1203,13 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
 
/* Initialize the m64_wins to IODA_INVALID_M64 */
for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-   for (j = 0; j < M64_PER_IOV; j++)
+   for (j = 0; j < MAX_M64_WINDOW; j++)
pdn->m64_wins[i][j] = IODA_INVALID_M64;
 
-   if (pdn->m64_per_iov == M64_PER_IOV) {
-   vf_groups = (num_vfs <= M64_PER_IOV) ? num_vfs: M64_PER_IOV;
-   vf_per_group = (num_vfs <= M64_PER_IOV)? 1:
-   roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
-   } else {
-   vf_groups = 1;
-   vf_per_group = 1;
-   }
+   if (pdn->vfs_expanded != phb->ioda.total_pe)
+   m64s = num_vfs;
+   else
+   m64s = 1;
 
for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
res = &pdev->resource[i + PCI_IOV_RESOURCES];
@@ -1224,7 +1219,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
if (!pnv_pci_is_mem_pref_64(res->flags))
continue;
 
-   for (j = 0; j < vf_groups; j++) {
+   for (j = 0; j < m64s; j++) {
do {
win = 
find_next_zero_bit(&phb->ioda.m64_bar_alloc,
phb->ioda.m64_bar_idx + 1, 0);
@@ -1235,10 +1230,9 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
 
pdn->m64_wins[i][j] = win;
 
-   if (pdn->m64_per_iov == M64_PER_IOV) {
+   if (pdn->vfs_expanded != phb->ioda.total_pe) {
size = pci_iov_resource_size(pdev,
PCI_IOV_RESOURCES + i);
-   size = size * vf_per_group;
start = res->start + size * j;
} else {
size = resource_size(res);
@@ -1246,7 +1240,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
}
 
/* Map the M64 here */
-   if (pdn->m64_per_iov == M64_PER_IOV) {
+   if (pdn->vfs_expanded != phb->ioda.total_pe) {
pe_num = pdn->offset + j;
rc = opal_pci_map_pe_mmio_window(phb->opal_id,

Re: [PATCH] ipmi/powernv: Fix potential invalid pointer dereference

2015-07-29 Thread Neelesh Gupta

Hi Alistair,

Thanks for the review.

On 07/28/2015 11:21 PM, Alistair Popple wrote:

Hi Neelesh,

This fix looks reasonable to me, although Jeremy would be the best person to
comment if he has time. I wonder why we bother polling at all given that our
event interface should call opal_ipmi_recv() whenever a message is ready?


Agree. I thought about it and didn't find any reason to have it as we have
an event mechanism, but I didn't think of changing it as it is not causing
any issue.



Also the firmware fix you refer to and this fix are independent of each other
so there's no ordering issues there.


Correct. Though there is no relation, I figured out the skiboot issue after
this change. Yes, they are independent. Please find time to review the
skiboot patch.

Corey,

Please queue this patch for upstream if you are OK with it.

Thanks,
Neelesh.



Reviewed-By: Alistair Popple 

On Tue, 28 Jul 2015 13:20:07 Neelesh Gupta wrote:

On 07/17/2015 02:12 PM, Neelesh Gupta wrote:

Hi Corey,

On 07/16/2015 08:31 PM, Corey Minyard wrote:

Ok, this looks fine.  A couple of question...

Do I need to send this upstream right now?  How well has this been tested?

I would want either Jeremy or Alistair to review this patch before you
send this upstream. There is also a firmware piece
http://patchwork.ozlabs.org/patch/496645/
awaiting review.

On the testing front, I manually made the opal_ipmi_recv() function
fail to test the error path and check that the driver recovers from it and
that subsequent ipmi commands work fine.

Hi Jeremy/Alistair,

Could you please review it and the corresponding skiboot patch...

Thanks,
Neelesh.


Do you want this backported to 4.0 stable?

Yes, I want this to be backported to 4.0 stable.

Thanks,
Neelesh.


-corey

On 07/16/2015 06:16 AM, Neelesh Gupta wrote:

If the OPAL call to receive the ipmi message fails, then we free up the
smi message and return. But the driver still holds a reference to the
old smi message in 'cur_msg', which can potentially be accessed later
and freed again, leading to a kernel oops.

To fix this, the kernel driver should reset 'cur_msg' and send a reply to
the user in addition to freeing the message.

Signed-off-by: Neelesh Gupta
---
   drivers/char/ipmi/ipmi_powernv.c |   13 ++---
   1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_powernv.c b/drivers/char/ipmi/ipmi_powernv.c

index 9b409c0..637486d 100644
--- a/drivers/char/ipmi/ipmi_powernv.c
+++ b/drivers/char/ipmi/ipmi_powernv.c
@@ -143,9 +143,16 @@ static int ipmi_powernv_recv(struct ipmi_smi_powernv *smi)
pr_devel("%s:   -> %d (size %lld)\n", __func__,
rc, rc == 0 ? size : 0);
if (rc) {
-   spin_unlock_irqrestore(&smi->msg_lock, flags);
-   ipmi_free_smi_msg(msg);
-   return 0;
+   /* If came via the poll, and response was not yet ready */
+   if (rc == OPAL_EMPTY) {
+   spin_unlock_irqrestore(&smi->msg_lock, flags);
+   return 0;
+   } else {
+   smi->cur_msg = NULL;
+   spin_unlock_irqrestore(&smi->msg_lock, flags);
+   send_error_reply(smi, msg, IPMI_ERR_UNSPECIFIED);
+   return 0;
+   }
}

if (size < sizeof(*opal_msg)) {







[PATCH 2/8] powerpc/slb: Rename all the 'slot' occurrences to 'entry'

2015-07-29 Thread Anshuman Khandual
What we are dealing with in these functions are essentially individual SLB
slots holding entries. Using both 'entry' and 'slot' as synonyms gets really
confusing at times. This patch makes the naming uniform across the file by
replacing every 'slot' with 'entry'.

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/mm/slb.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 62fafb3..faf9f0c 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -41,9 +41,9 @@ static void slb_allocate(unsigned long ea)
(((ssize) == MMU_SEGSIZE_256M)? ESID_MASK: ESID_MASK_1T)
 
 static inline unsigned long mk_esid_data(unsigned long ea, int ssize,
-unsigned long slot)
+unsigned long entry)
 {
-   return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | slot;
+   return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | entry;
 }
 
 static inline unsigned long mk_vsid_data(unsigned long ea, int ssize,
@@ -308,12 +308,11 @@ void slb_initialize(void)
lflags = SLB_VSID_KERNEL | linear_llp;
vflags = SLB_VSID_KERNEL | vmalloc_llp;
 
-   /* Invalidate the entire SLB (even slot 0) & all the ERATS */
+   /* Invalidate the entire SLB (even entry 0) & all the ERATS */
asm volatile("isync":::"memory");
asm volatile("slbmte  %0,%0"::"r" (0) : "memory");
asm volatile("isync; slbia; isync":::"memory");
create_shadowed_slbe(PAGE_OFFSET, mmu_kernel_ssize, lflags, 0);
-
create_shadowed_slbe(VMALLOC_START, mmu_kernel_ssize, vflags, 1);
 
/* For the boot cpu, we're running on the stack in init_thread_union,
-- 
2.1.0


[PATCH 8/8] powerpc/xmon: Add some more elements to the existing PACA dump list

2015-07-29 Thread Anshuman Khandual
This patch adds a set of new elements, listed below, to the existing PACA
dump list inside an xmon session, improving the overall xmon debug
support.

(1) hmi_event_available
(2) dscr_default
(3) vmalloc_sllp
(4) slb_cache_ptr
(5) sprg_vdso
(6) tm_scratch
(7) core_idle_state_ptr
(8) thread_idle_state
(9) thread_mask
(10) slb_shadow
(11) pgd
(12) kernel_pgd
(13) tcd_ptr
(14) mc_kstack
(15) crit_kstack
(16) dbg_kstack
(17) user_time
(18) system_time
(19) user_time_scaled
(20) starttime
(21) starttime_user
(22) startspurr
(23) utime_sspurr
(24) stolen_time

With this patch, a typical xmon PACA dump looks something like this.

paca for cpu 0x0 @ cfdc:
 possible = yes
 present  = yes
 online   = yes
 lock_token   = 0x8000  (0x8)
 paca_index   = 0x0 (0xa)
 kernel_toc   = 0xc0e79300  (0x10)
 kernelbase   = 0xc000  (0x18)
 kernel_msr   = 0xb0001032  (0x20)
 emergency_sp = 0xc0003fff  (0x28)
 mc_emergency_sp  = 0xc0003ffec000  (0x2e0)
 in_mce   = 0x0 (0x2e8)
 hmi_event_available  = 0x0 (0x2ea)
 data_offset  = 0xfa9f  (0x30)
 hw_cpu_id= 0x0 (0x38)
 cpu_start= 0x1 (0x3a)
 kexec_state  = 0x0 (0x3b)
 slb_shadow[0]:   = 0xc800 0x40016e7779000510
 slb_shadow[1]:   = 0xd801 0x400142add1000510
 dscr_default = 0x0 (0x58)
 vmalloc_sllp = 0x510   (0x1b8)
 slb_cache_ptr= 0x3 (0x1ba)
 slb_cache[0]:= 0x3f000
 slb_cache[1]:= 0x1
 slb_cache[2]:= 0x1000
 __current= 0xc000a7406b70  (0x290)
 kstack   = 0xc000a750fe30  (0x298)
 stab_rr  = 0x11(0x2a0)
 saved_r1 = 0xc000a750f360  (0x2a8)
 trap_save= 0x0 (0x2b8)
 soft_enabled = 0x0 (0x2ba)
 irq_happened = 0x1 (0x2bb)
 io_sync  = 0x0 (0x2bc)
 irq_work_pending = 0x0 (0x2bd)
 nap_state_lost   = 0x0 (0x2be)
 sprg_vdso= 0x0 (0x2c0)
 tm_scratch   = 0x80010280f032  (0x2c8)
 core_idle_state_ptr  = (null)  (0x2d0)
 thread_idle_state= 0x0 (0x2d8)
 thread_mask  = 0x0 (0x2d9)
 subcore_sibling_mask = 0x0 (0x2da)
 user_time= 0x18895 (0x2f0)
 system_time  = 0x11dc2 (0x2f8)
 user_time_scaled = 0x0 (0x300)
 starttime= 0xe64688b4688a  (0x308)
 starttime_user   = 0xe64688b466d1  (0x310)
 startspurr   = 0x1a79afea8 (0x318)
 utime_sspurr = 0x0 (0x320)
 stolen_time  = 0x0 (0x328)
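
Each line of the dump above is produced by a DUMP() macro that prints the
field name, its value and its byte offset inside the structure. A minimal
user-space sketch of that formatting follows. It assumes a hypothetical
'struct demo_paca' (not the real paca_struct layout) and writes into a
buffer with snprintf() rather than using xmon's printf.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct paca_struct; the real layout differs. */
struct demo_paca {
	unsigned int lock_token;
	unsigned short paca_index;
	unsigned long kernel_toc;
};

/*
 * Same shape as the DUMP() macro in the patch: the stringified field name is
 * left-aligned in a 20-character column, the value in an 18-character column,
 * and the field's offset within the structure follows in parentheses.
 */
#define DUMP(buf, len, paca, name, format) \
	snprintf(buf, len, " %-*s = %#-*" format "\t(0x%lx)\n", \
		 20, #name, 18, (paca)->name, \
		 (unsigned long)offsetof(struct demo_paca, name))

/* Format the lock_token field of *p into buf; returns snprintf()'s result. */
static int dump_lock_token(char *buf, size_t len, const struct demo_paca *p)
{
	return DUMP(buf, len, p, lock_token, "x");
}
```

Widening the name column from 16 to 20, as the patch does, is what keeps
longer names such as 'subcore_sibling_mask' aligned with the others.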

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/xmon/xmon.c | 57 
 1 file changed, 53 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index bc1b066a..1e67c8b 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2073,6 +2073,9 @@ static void xmon_rawdump (unsigned long adrs, long ndump)
 static void dump_one_paca(int cpu)
 {
struct paca_struct *p;
+#ifdef CONFIG_PPC_STD_MMU_64
+   int i = 0;
+#endif
 
if (setjmp(bus_error_jmp) != 0) {
printf("*** Error dumping paca for cpu 0x%x!\n", cpu);
@@ -2086,12 +2089,12 @@ static void dump_one_paca(int cpu)
 
printf("paca for cpu 0x%x @ %p:\n", cpu, p);
 
-   printf(" %-*s = %s\n", 16, "possible", cpu_possible(cpu) ? "yes" : "no");
-   printf(" %-*s = %s\n", 16, "present", cpu_present(cpu) ? "yes" : "no");
-   printf(" %-*s = %s\n", 16, "online", cpu_online(cpu) ? "yes" : "no");
+   printf(" %-*s = %s\n", 20, "possible", cpu_possible(cpu) ? "yes" : "no");
+   printf(" %-*s = %s\n", 20, "present", cpu_present(cpu) ? "yes" : "no");
+   printf(" %-*s = %s\n", 20, "online", cpu_online(cpu) ? "yes" : "no");
 
 #define DUMP(paca, name, format) \
-   printf(" %-*s = %#-*"format"\t(0x%lx)\n", 16, #name, 18, paca->name, \
+   printf(" %-*s = %#-*"format"\t(0x%lx)\n", 20, #name, 18, paca->name, \
offsetof(struct paca_struct, name));
 
DUMP(p, lock_token, "x");
@@ -2103,11 +2106,37 @@ static void dump_one_paca(int cpu)
 #ifdef CONFIG_PPC_BOOK3S_64
DUMP(p, mc_emergency_sp, "p");
DUMP(p, in_mce, "x");
+   DUMP(p, hmi_event_available, "x");
 #endif
DUMP(p, data_offset, "lx");
DUMP(p, hw

[PATCH 1/8] powerpc/slb: Remove a duplicate extern variable

2015-07-29 Thread Anshuman Khandual
This patch removes a duplicate extern declaration of the variable
'slb_compare_rr_to_size'. It does not change any functionality.

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/mm/slb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 6e450ca..62fafb3 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -253,7 +253,6 @@ static inline void patch_slb_encoding(unsigned int *insn_addr,
patch_instruction(insn_addr, insn);
 }
 
-extern u32 slb_compare_rr_to_size[];
 extern u32 slb_miss_kernel_load_linear[];
 extern u32 slb_miss_kernel_load_io[];
 extern u32 slb_compare_rr_to_size[];
-- 
2.1.0


[PATCH 3/8] powerpc/slb: Define macros for the bolted slots

2015-07-29 Thread Anshuman Khandual
This patch defines named constants (an enum) for the three bolted SLB
slots. It also renames the 'create_shadowed_slbe' function to
'new_shadowed_slbe'.
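
To illustrate why named slots read better than bare 0/1/2, here is a small
user-space sketch of how a slot index ends up in the ESID word. The
DEMO_* constants are assumptions meant to mirror the 256M-segment ESID mask
and the entry-valid bit from the powerpc headers, and demo_mk_esid_data()
is a hypothetical stand-in for mk_esid_data(), so check them against your
tree before relying on the values.

```c
#include <stdint.h>

/* Assumed values mirroring the powerpc MMU headers (256M segments only). */
#define DEMO_ESID_MASK  0xfffffffff0000000ULL /* segment part of the address */
#define DEMO_SLB_ESID_V 0x0000000008000000ULL /* entry-valid bit */

/* Same names as the enum added by the patch. */
enum slb_slots {
	LINEAR_SLOT  = 0, /* Kernel linear map */
	VMALLOC_SLOT = 1, /* Kernel virtual map */
	KSTACK_SLOT  = 2, /* Kernel stack map */
};

/*
 * Build the ESID word: keep the segment bits of the effective address,
 * mark the entry valid and place the slot index in the low bits.
 */
static inline uint64_t demo_mk_esid_data(uint64_t ea, enum slb_slots slot)
{
	return (ea & DEMO_ESID_MASK) | DEMO_SLB_ESID_V | (uint64_t)slot;
}
```

A call such as demo_mk_esid_data(ea, KSTACK_SLOT) states which bolted entry
is being written without the reader having to remember what '2' means.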

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/mm/slb.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index faf9f0c..701a57f 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -25,6 +25,11 @@
 #include 
 #include 
 
+enum slb_slots {
+   LINEAR_SLOT = 0, /* Kernel linear map  (0xc000) */
+   VMALLOC_SLOT= 1, /* Kernel virtual map (0xd000) */
+   KSTACK_SLOT = 2, /* Kernel stack map */
+};
 
 extern void slb_allocate_realmode(unsigned long ea);
 extern void slb_allocate_user(unsigned long ea);
@@ -74,7 +79,7 @@ static inline void slb_shadow_clear(unsigned long entry)
get_slb_shadow()->save_area[entry].esid = 0;
 }
 
-static inline void create_shadowed_slbe(unsigned long ea, int ssize,
+static inline void new_shadowed_slbe(unsigned long ea, int ssize,
unsigned long flags,
unsigned long entry)
 {
@@ -103,16 +108,16 @@ static void __slb_flush_and_rebolt(void)
lflags = SLB_VSID_KERNEL | linear_llp;
vflags = SLB_VSID_KERNEL | vmalloc_llp;
 
-   ksp_esid_data = mk_esid_data(get_paca()->kstack, mmu_kernel_ssize, 2);
+   ksp_esid_data = mk_esid_data(get_paca()->kstack, mmu_kernel_ssize, KSTACK_SLOT);
if ((ksp_esid_data & ~0xfffUL) <= PAGE_OFFSET) {
ksp_esid_data &= ~SLB_ESID_V;
ksp_vsid_data = 0;
-   slb_shadow_clear(2);
+   slb_shadow_clear(KSTACK_SLOT);
} else {
/* Update stack entry; others don't change */
-   slb_shadow_update(get_paca()->kstack, mmu_kernel_ssize, lflags, 2);
+   slb_shadow_update(get_paca()->kstack, mmu_kernel_ssize, lflags, KSTACK_SLOT);
ksp_vsid_data =
-   be64_to_cpu(get_slb_shadow()->save_area[2].vsid);
+   be64_to_cpu(get_slb_shadow()->save_area[KSTACK_SLOT].vsid);
}
 
/* We need to do this all in asm, so we're sure we don't touch
@@ -125,7 +130,7 @@ static void __slb_flush_and_rebolt(void)
 "slbmte%2,%3\n"
 "isync"
 :: "r"(mk_vsid_data(VMALLOC_START, mmu_kernel_ssize, vflags)),
-   "r"(mk_esid_data(VMALLOC_START, mmu_kernel_ssize, 1)),
+   "r"(mk_esid_data(VMALLOC_START, mmu_kernel_ssize, VMALLOC_SLOT)),
"r"(ksp_vsid_data),
"r"(ksp_esid_data)
 : "memory");
@@ -151,7 +156,7 @@ void slb_vmalloc_update(void)
unsigned long vflags;
 
vflags = SLB_VSID_KERNEL | mmu_psize_defs[mmu_vmalloc_psize].sllp;
-   slb_shadow_update(VMALLOC_START, mmu_kernel_ssize, vflags, 1);
+   slb_shadow_update(VMALLOC_START, mmu_kernel_ssize, vflags, VMALLOC_SLOT);
slb_flush_and_rebolt();
 }
 
@@ -312,19 +317,19 @@ void slb_initialize(void)
asm volatile("isync":::"memory");
asm volatile("slbmte  %0,%0"::"r" (0) : "memory");
asm volatile("isync; slbia; isync":::"memory");
-   create_shadowed_slbe(PAGE_OFFSET, mmu_kernel_ssize, lflags, 0);
-   create_shadowed_slbe(VMALLOC_START, mmu_kernel_ssize, vflags, 1);
+   new_shadowed_slbe(PAGE_OFFSET, mmu_kernel_ssize, lflags, LINEAR_SLOT);
+   new_shadowed_slbe(VMALLOC_START, mmu_kernel_ssize, vflags, VMALLOC_SLOT);
 
/* For the boot cpu, we're running on the stack in init_thread_union,
 * which is in the first segment of the linear mapping, and also
 * get_paca()->kstack hasn't been initialized yet.
 * For secondary cpus, we need to bolt the kernel stack entry now.
 */
-   slb_shadow_clear(2);
+   slb_shadow_clear(KSTACK_SLOT);
if (raw_smp_processor_id() != boot_cpuid &&
(get_paca()->kstack & slb_esid_mask(mmu_kernel_ssize)) > PAGE_OFFSET)
-   create_shadowed_slbe(get_paca()->kstack,
-mmu_kernel_ssize, lflags, 2);
+   new_shadowed_slbe(get_paca()->kstack,
+mmu_kernel_ssize, lflags, KSTACK_SLOT);
 
asm volatile("isync":::"memory");
 }
-- 
2.1.0


[PATCH 4/8] powerpc/slb: Add some helper functions to improve modularization

2015-07-29 Thread Anshuman Khandual
This patch adds the following six helper functions to help improve
modularization and readability of the code.

(1) slb_invalidate_all: Invalidates the entire SLB
(2) slb_invalidate: Invalidates SLB entries present in PACA
(3) mmu_linear_vsid_flags:  VSID flags for kernel linear mapping
(4) mmu_vmalloc_vsid_flags: VSID flags for kernel virtual mapping
(5) mmu_vmemmap_vsid_flags: VSID flags for kernel vmem mapping
(6) mmu_io_vsid_flags:  VSID flags for kernel I/O mapping

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/mm/slb.c | 92 ++-
 1 file changed, 61 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 701a57f..c87d5de 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -96,18 +96,37 @@ static inline void new_shadowed_slbe(unsigned long ea, int ssize,
 : "memory" );
 }
 
+static inline unsigned long mmu_linear_vsid_flags(void)
+{
+   return SLB_VSID_KERNEL | mmu_psize_defs[mmu_linear_psize].sllp;
+}
+
+static inline unsigned long mmu_vmalloc_vsid_flags(void)
+{
+   return SLB_VSID_KERNEL | mmu_psize_defs[mmu_vmalloc_psize].sllp;
+}
+
+static inline unsigned long mmu_io_vsid_flags(void)
+{
+   return SLB_VSID_KERNEL | mmu_psize_defs[mmu_io_psize].sllp;
+}
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+static inline unsigned long mmu_vmemmap_vsid_flags(void)
+{
+   return SLB_VSID_KERNEL | mmu_psize_defs[mmu_vmemmap_psize].sllp;
+}
+#endif
+
 static void __slb_flush_and_rebolt(void)
 {
/* If you change this make sure you change SLB_NUM_BOLTED
 * and PR KVM appropriately too. */
-   unsigned long linear_llp, vmalloc_llp, lflags, vflags;
+   unsigned long lflags, vflags;
unsigned long ksp_esid_data, ksp_vsid_data;
 
-   linear_llp = mmu_psize_defs[mmu_linear_psize].sllp;
-   vmalloc_llp = mmu_psize_defs[mmu_vmalloc_psize].sllp;
-   lflags = SLB_VSID_KERNEL | linear_llp;
-   vflags = SLB_VSID_KERNEL | vmalloc_llp;
-
+   lflags = mmu_linear_vsid_flags();
+   vflags = mmu_vmalloc_vsid_flags();
ksp_esid_data = mk_esid_data(get_paca()->kstack, mmu_kernel_ssize, KSTACK_SLOT);
if ((ksp_esid_data & ~0xfffUL) <= PAGE_OFFSET) {
ksp_esid_data &= ~SLB_ESID_V;
@@ -155,7 +174,7 @@ void slb_vmalloc_update(void)
 {
unsigned long vflags;
 
-   vflags = SLB_VSID_KERNEL | mmu_psize_defs[mmu_vmalloc_psize].sllp;
+   vflags = mmu_vmalloc_vsid_flags();
slb_shadow_update(VMALLOC_START, mmu_kernel_ssize, vflags, VMALLOC_SLOT);
slb_flush_and_rebolt();
 }
@@ -189,26 +208,15 @@ static inline int esids_match(unsigned long addr1, unsigned long addr2)
return (GET_ESID_1T(addr1) == GET_ESID_1T(addr2));
 }
 
-/* Flush all user entries from the segment table of the current processor. */
-void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
+static void slb_invalidate(void)
 {
-   unsigned long offset;
unsigned long slbie_data = 0;
-   unsigned long pc = KSTK_EIP(tsk);
-   unsigned long stack = KSTK_ESP(tsk);
-   unsigned long exec_base;
+   unsigned long offset;
+   int i;
 
-   /*
-* We need interrupts hard-disabled here, not just soft-disabled,
-* so that a PMU interrupt can't occur, which might try to access
-* user memory (to get a stack trace) and possible cause an SLB miss
-* which would update the slb_cache/slb_cache_ptr fields in the PACA.
-*/
-   hard_irq_disable();
offset = get_paca()->slb_cache_ptr;
if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
offset <= SLB_CACHE_ENTRIES) {
-   int i;
asm volatile("isync" : : : "memory");
for (i = 0; i < offset; i++) {
slbie_data = (unsigned long)get_paca()->slb_cache[i]
@@ -226,6 +234,23 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
/* Workaround POWER5 < DD2.1 issue */
if (offset == 1 || offset > SLB_CACHE_ENTRIES)
asm volatile("slbie %0" : : "r" (slbie_data));
+}
+
+/* Flush all user entries from the segment table of the current processor. */
+void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
+{
+   unsigned long pc = KSTK_EIP(tsk);
+   unsigned long stack = KSTK_ESP(tsk);
+   unsigned long exec_base;
+
+   /*
+* We need interrupts hard-disabled here, not just soft-disabled,
+* so that a PMU interrupt can't occur, which might try to access
+* user memory (to get a stack trace) and possible cause an SLB miss
+* which would update the slb_cache/slb_cache_ptr fields in the PACA.
+*/
+   hard_irq_disable();
+   slb_invalidate();
 
get_paca()->slb_cache_ptr = 0;
get_paca()->context = mm->context;
@@ -258,6 +283,14 @@ static inline void patch_slb_encoding(unsigned in

[PATCH 5/8] powerpc/slb: Add documentation to runtime patching of SLB encoding

2015-07-29 Thread Anshuman Khandual
This patch adds documentation to the 'patch_slb_encoding' function
explaining how it clears the existing immediate value in the given
instruction and inserts a new one.
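
The operation being documented can be tried in user space. The sketch below
is an illustration rather than the kernel function: demo_patch_immediate()
is a hypothetical name, and the mask 0xffff0000 follows from the
description in the new comment (keep the upper 16 bits of the 32-bit
instruction word, replace the lower 16). The real code then writes the
result back with patch_instruction(), which is not modelled here.

```c
#include <stdint.h>

/*
 * Replace the 16-bit immediate field of a D-form PowerPC instruction such as
 * li (addi) or cmpldi. The opcode and register fields live in the upper
 * 16 bits and are preserved; the lower 16 bits are cleared and then filled
 * with the new immediate.
 */
static inline uint32_t demo_patch_immediate(uint32_t insn, uint16_t immed)
{
	return (insn & 0xffff0000u) | immed;
}
```

For example, 0x38600005 encodes 'li r3,5'; patching it with an immediate of
0x20 yields 0x38600020, which encodes 'li r3,32'.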

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/mm/slb.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index c87d5de..1962357 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -279,7 +279,18 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 static inline void patch_slb_encoding(unsigned int *insn_addr,
  unsigned int immed)
 {
-   int insn = (*insn_addr & 0xffff0000) | immed;
+
+   /*
+* This function patches either an li or a cmpldi instruction with
+* a new immediate value. This relies on the fact that both li
+* (which is actually addi) and cmpldi both take a 16-bit immediate
+* value, and it is situated in the same location in the instruction,
+* ie. bits 16-31 (Big endian bit order) or the lower 16 bits.
+* To patch the value we read the existing instruction, clear the
+* immediate value, and or in our new value, then write the instruction
+* back.
+*/
+   unsigned int insn = (*insn_addr & 0xffff0000) | immed;
patch_instruction(insn_addr, insn);
 }
 
-- 
2.1.0


[PATCH 6/8] powerpc/prom: Simplify the logic while fetching SLB size

2015-07-29 Thread Anshuman Khandual
This patch just simplifies the existing code logic while fetching
the SLB size property from the device tree. This also changes the
function name from check_cpu_slb_size to init_mmu_slb_size as
it just initializes the mmu_slb_size value.
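
The simplification relies on the GNU C conditional with an omitted middle
operand ('a ? : b'), which evaluates 'a' once and falls back to 'b' only
when 'a' is NULL or zero. A minimal user-space sketch of the same lookup
pattern follows. It assumes a hypothetical property table in place of the
flattened device tree, and it leaves out the big-endian conversion that
be32_to_cpup() performs in the real code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical property table standing in for the flattened device tree. */
struct demo_prop {
	const char *name;
	unsigned int value;
};

/* Return a pointer to the named property's value, or NULL if it is absent. */
static const unsigned int *demo_get_prop(const struct demo_prop *props,
					 size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(props[i].name, name) == 0)
			return &props[i].value;
	return NULL;
}

/*
 * Prefer "slb-size" and fall back to "ibm,slb-size". The '? :' form is a
 * GNU C extension accepted by gcc and clang. If neither property exists,
 * the caller-supplied default is returned unchanged.
 */
static unsigned int demo_slb_size(const struct demo_prop *props, size_t n,
				  unsigned int fallback)
{
	const unsigned int *ptr;

	ptr = demo_get_prop(props, n, "slb-size") ? :
	      demo_get_prop(props, n, "ibm,slb-size");

	return ptr ? *ptr : fallback;
}
```

As in the patch, the second lookup only runs when the first one fails, so
the lookup order of the original two-branch code is preserved.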

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/kernel/prom.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 8b888b1..4bb43c0 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -218,22 +218,18 @@ static void __init check_cpu_pa_features(unsigned long node)
 }
 
 #ifdef CONFIG_PPC_STD_MMU_64
-static void __init check_cpu_slb_size(unsigned long node)
+static void __init init_mmu_slb_size(unsigned long node)
 {
const __be32 *slb_size_ptr;
 
-   slb_size_ptr = of_get_flat_dt_prop(node, "slb-size", NULL);
-   if (slb_size_ptr != NULL) {
-   mmu_slb_size = be32_to_cpup(slb_size_ptr);
-   return;
-   }
-   slb_size_ptr = of_get_flat_dt_prop(node, "ibm,slb-size", NULL);
-   if (slb_size_ptr != NULL) {
+   slb_size_ptr = of_get_flat_dt_prop(node, "slb-size", NULL) ? :
+   of_get_flat_dt_prop(node, "ibm,slb-size", NULL);
+
+   if (slb_size_ptr)
mmu_slb_size = be32_to_cpup(slb_size_ptr);
-   }
 }
 #else
-#define check_cpu_slb_size(node) do { } while(0)
+#define init_mmu_slb_size(node) do { } while(0)
 #endif
 
 static struct feature_property {
@@ -380,7 +376,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
 
check_cpu_feature_properties(node);
check_cpu_pa_features(node);
-   check_cpu_slb_size(node);
+   init_mmu_slb_size(node);
 
 #ifdef CONFIG_PPC64
if (nthreads > 1)
-- 
2.1.0


[PATCH 7/8] powerpc/xmon: Drop the 'valid' variable completely in 'dump_segments'

2015-07-29 Thread Anshuman Khandual
The value of the 'valid' variable is zero when 'esid' is zero, and it
makes no difference to the outer check when 'esid' is non-zero. The
variable can therefore be dropped from 'dump_segments' by testing the
valid bit of 'esid' directly inside the nested code block. This patch
makes that change.
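
The reasoning above can be checked mechanically. The sketch below compares
the old and new forms of the outer condition in user space. The function
names are hypothetical and DEMO_SLB_ESID_V is an assumed value for the
valid bit; the argument only needs 'valid' to be a subset of the bits of
'esid', which holds for any mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SLB_ESID_V 0x0000000008000000ULL /* assumed valid-bit value */

/* Old form: a separate 'valid' variable is ORed into the test. */
static bool demo_old_should_print(uint64_t esid, uint64_t vsid)
{
	uint64_t valid = esid & DEMO_SLB_ESID_V;

	return (valid | esid | vsid) != 0;
}

/*
 * New form: 'valid' can only have bits that are already set in 'esid',
 * so ORing it in never changes whether the result is zero.
 */
static bool demo_new_should_print(uint64_t esid, uint64_t vsid)
{
	return esid || vsid;
}
```

The inner check is unaffected as well, because 'if (valid)' and
'if (esid & SLB_ESID_V)' test the same expression.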

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/xmon/xmon.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index e599259..bc1b066a 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2731,7 +2731,7 @@ static void xmon_print_symbol(unsigned long address, const char *mid,
 void dump_segments(void)
 {
int i;
-   unsigned long esid,vsid,valid;
+   unsigned long esid,vsid;
unsigned long llp;
 
printf("SLB contents of cpu 0x%x\n", smp_processor_id());
@@ -2739,10 +2739,9 @@ void dump_segments(void)
for (i = 0; i < mmu_slb_size; i++) {
asm volatile("slbmfee  %0,%1" : "=r" (esid) : "r" (i));
asm volatile("slbmfev  %0,%1" : "=r" (vsid) : "r" (i));
-   valid = (esid & SLB_ESID_V);
-   if (valid | esid | vsid) {
+   if (esid || vsid) {
printf("%02d %016lx %016lx", i, esid, vsid);
-   if (valid) {
+   if (esid & SLB_ESID_V) {
llp = vsid & SLB_VSID_LLP;
if (vsid & SLB_VSID_B_1T) {
printf("  1T  ESID=%9lx  VSID=%13lx LLP:%3lx \n",
-- 
2.1.0
