Re: p4080ds IRQ vector number for the External IRQ4 and IRQ5

2016-01-11 Thread Scott Wood
On Wed, 2016-01-06 at 08:40 +, Lakshmi wrote:
> I have been trying to figure out what is the vector number used for external
> IRQ4 and IRQ5 in P4080ds.
> 
> According to board document xpedite5470-p4080 
> IRQ4: VPX GP Input 0 (GPI0)
> IRQ5 VPX GP Input 1 (GPI1)
> 
> In the P4080 user guide's OpenPIC interrupt connections it's mentioned as
> 
> IRQ4_B: SLOT4 Sideband connector (SGMII riser does not connect, XAUI riser
> can use or inband)
> 
> IRQ5_B: MIC2076 USB Power FLAG for over current at USB connector
> 
> But in the code I am unable to find any info related to IRQ4 and IRQ5.
> 
> How can I find it?

Are you asking what to put in the device tree for these interrupts?  External
IRQ4 is <4 ls 0 0> and IRQ5 is <5 ls 0 0>, with ls being the level/sense info.

See Documentation/devicetree/bindings/powerpc/fsl/mpic.txt for details.
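For illustration only, a consumer node using those specifiers might look like the sketch below. The node name and compatible string are placeholders, and the choice of 1 (active-low level, per that binding's sense encoding) for the second cell is an assumption, not taken from the board's actual wiring:

```dts
/* Hypothetical consumer of MPIC external IRQ4/IRQ5; replace the sense
 * value (second cell) with whatever matches the board's wiring. */
vpx-gpi@0 {
	compatible = "vendor,vpx-gpi";	/* placeholder */
	interrupt-parent = <&mpic>;
	interrupts = <4 1 0 0>,		/* external IRQ4, active-low level */
		     <5 1 0 0>;		/* external IRQ5, active-low level */
};
```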

-Scott


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Mark Brown
On Mon, Jan 11, 2016 at 07:23:54PM -0600, Timur Tabi wrote:
> Mark Brown wrote:

> >regcache handles this fine, it's perfectly happy to just go and allocate
> >the cache as registers get used (this is why the code that's doing the
> >allocation exists...).  What is causing problems here is that the first
> >access to the register is happening in interrupt context so we can't do
> >a GFP_KERNEL allocation for it.

> Considering how small and not-sparse the SSI register space is, would using
> REGCACHE_FLAT be appropriate?

Quite possibly (it'll be more efficient, and it's intended for such use
cases), but as I said in my other reply, that then has the issue that it
implicitly gives default values to all the registers, so I'd expect we'd
still need to handle the cache initialisation explicitly (or,
alternatively, sync the hardware with the cache on startup).



Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Timur Tabi

Mark Brown wrote:

regcache handles this fine, it's perfectly happy to just go and allocate
the cache as registers get used (this is why the code that's doing the
allocation exists...).  What is causing problems here is that the first
access to the register is happening in interrupt context so we can't do
a GFP_KERNEL allocation for it.


Considering how small and not-sparse the SSI register space is, would
using REGCACHE_FLAT be appropriate?


Re: [PATCH v4 0/3] checkpatch: handling of memory barriers

2016-01-11 Thread Joe Perches
On Mon, 2016-01-11 at 13:00 +0200, Michael S. Tsirkin wrote:
> As part of memory barrier cleanup, this patchset
> extends checkpatch to make it easier to stop
> incorrect memory barrier usage.

Thanks Michael.

Acked-by: Joe Perches 


Re: simple_alloc space tramples initrd

2016-01-11 Thread dwalker
On Tue, Jan 12, 2016 at 09:17:53AM +1100, Michael Ellerman wrote:
> On Mon, 2016-01-11 at 08:49 -0800, dwal...@fifo99.com wrote:
> > On Mon, Jan 11, 2016 at 02:09:34PM +1100, Michael Ellerman wrote:
> > > On Fri, 2016-01-08 at 09:45 -0800, dwal...@fifo99.com wrote:
> > > > Hi,
> > > > 
> > A powerpc machine I'm working on has this problem where the
> > simple_alloc_init() area is trampling the initrd. The two are placed
> > fairly close together.
> > > 
> > > Which machine / platform?
> > 
> It's not upstream yet. I'm still putting the patches together; that's when
> this issue came up. I can send an RFC if you want to look at the patches.
> 
> OK. Thanks but I don't need more patches to look at :)
> 
> I was just trying to narrow down which code you were talking about.

It's coming eventually anyways ;) ..

> > > I don't really know that code very well. But ideally either the boot
> > > loader gives you space, or the platform boot code is smart enough to
> > > detect that there is insufficient room and puts the heap somewhere else.
> > 
> > It seems like the kernel should be able to handle it. I believe the
> > bootloader passes the initrd location, but I don't think it's evaluated
> > till later in the boot up. For simple_alloc_init() it seems all platforms
> > just assume the space is empty without checking.
> 
> Yeah that's what I see too, which seems like it's liable to break, but
> obviously hasn't for anyone else yet.
> 
> The bootloader must pass the initrd location, otherwise the kernel can't use
> it, so it seems like the kernel should be able to notice when they are too
> close. But it may be complicated by the sequencing of the code.


I found a similar one:

arch/powerpc/boot/ps3.c:platform_init()

I realized that in platform_init() you're discovering the initrd location, so
you do have access to the values. In ps3 you can see how, if the initrd is
placed in the 16 MB after the kernel image, the simple_alloc code could
corrupt it.

I think it would be appropriate to check the initrd location in that function
(since it's available) and make a choice to put the simple_alloc area after
the initrd if the areas overlap. Does that make sense?

Daniel

Re: simple_alloc space tramples initrd

2016-01-11 Thread Michael Ellerman
On Mon, 2016-01-11 at 08:49 -0800, dwal...@fifo99.com wrote:
> On Mon, Jan 11, 2016 at 02:09:34PM +1100, Michael Ellerman wrote:
> > On Fri, 2016-01-08 at 09:45 -0800, dwal...@fifo99.com wrote:
> > > Hi,
> > > 
> > > A powerpc machine I'm working on has this problem where the
> > > simple_alloc_init() area is trampling the initrd. The two are placed
> > > fairly close together.
> > 
> > Which machine / platform?
> 
> It's not upstream yet. I'm still putting the patches together; that's when
> this issue came up. I can send an RFC if you want to look at the patches.

OK. Thanks but I don't need more patches to look at :)

I was just trying to narrow down which code you were talking about.

> > I don't really know that code very well. But ideally either the boot loader
> > gives you space, or the platform boot code is smart enough to detect that
> > there is insufficient room and puts the heap somewhere else.
> 
> It seems like the kernel should be able to handle it. I believe the
> bootloader passes the initrd location, but I don't think it's evaluated till
> later in the boot up. For simple_alloc_init() it seems all platforms just
> assume the space is empty without checking.

Yeah that's what I see too, which seems like it's liable to break, but
obviously hasn't for anyone else yet.

The bootloader must pass the initrd location, otherwise the kernel can't use
it, so it seems like the kernel should be able to notice when they are too
close. But it may be complicated by the sequencing of the code.

cheers


Re: [PATCH v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification

2016-01-11 Thread Greg KH
On Mon, Jan 11, 2016 at 02:54:36PM -0600, Shilpasri G Bhat wrote:
> In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
> CPU when the chip crosses its thermal and power limits. Currently,
> powernv-cpufreq driver detects and reports this event as a console
> message. Some machines may not sustain the max turbo frequency in all
> conditions and can be throttled frequently. This can lead to the
> flooding of console with throttle messages. So this patchset aims to
> redesign the presentation of this event via sysfs counters and
> tracepoints. 
> 
> Patches [2] to [4] will add a perf trace point "power:powernv_throttle" and
> sysfs throttle counter stats in /sys/devices/system/cpu/cpufreq/chipN.
> Patch [1] fixes a bug in powernv_cpufreq_throttle_check(), which calls
> cpu_to_chip_id() in the hot path; that function reads the device tree every
> time to find the chip id.



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.



[PATCH 1/2] powerpc/perf: Remove PME_ prefix for power7 events

2016-01-11 Thread Sukadev Bhattiprolu
We used the PME_ prefix earlier to avoid some macro/variable name
collisions.  We have since changed the way we define/use the event
macros so we no longer need the prefix.

By dropping the prefix, we keep the event macros consistent with
their official names.

Reported-by: Michael Ellerman 
Signed-off-by: Sukadev Bhattiprolu 
---
 arch/powerpc/include/asm/perf_event_server.h |  2 +-
 arch/powerpc/perf/power7-pmu.c   | 18 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h 
b/arch/powerpc/include/asm/perf_event_server.h
index 8146221..0691087 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -141,7 +141,7 @@ extern ssize_t power_events_sysfs_show(struct device *dev,
 #defineEVENT_PTR(_id, _suffix) _VAR(_id, 
_suffix).attr.attr
 
 #defineEVENT_ATTR(_name, _id, _suffix) 
\
-   PMU_EVENT_ATTR(_name, EVENT_VAR(_id, _suffix), PME_##_id,   \
+   PMU_EVENT_ATTR(_name, EVENT_VAR(_id, _suffix), _id, \
power_events_sysfs_show)
 
 #defineGENERIC_EVENT_ATTR(_name, _id)  EVENT_ATTR(_name, _id, _g)
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index 5b62f238..a383c23 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -54,7 +54,7 @@
  * Power7 event codes.
  */
 #define EVENT(_name, _code) \
-   PME_##_name = _code,
+   _name = _code,
 
 enum {
 #include "power7-events-list.h"
@@ -318,14 +318,14 @@ static void power7_disable_pmc(unsigned int pmc, unsigned 
long mmcr[])
 }
 
 static int power7_generic_events[] = {
-   [PERF_COUNT_HW_CPU_CYCLES] =PME_PM_CYC,
-   [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =   PME_PM_GCT_NOSLOT_CYC,
-   [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =PME_PM_CMPLU_STALL,
-   [PERF_COUNT_HW_INSTRUCTIONS] =  PME_PM_INST_CMPL,
-   [PERF_COUNT_HW_CACHE_REFERENCES] =  PME_PM_LD_REF_L1,
-   [PERF_COUNT_HW_CACHE_MISSES] =  PME_PM_LD_MISS_L1,
-   [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] =   PME_PM_BRU_FIN,
-   [PERF_COUNT_HW_BRANCH_MISSES] = PME_PM_BR_MPRED,
+   [PERF_COUNT_HW_CPU_CYCLES] =PM_CYC,
+   [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =   PM_GCT_NOSLOT_CYC,
+   [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =PM_CMPLU_STALL,
+   [PERF_COUNT_HW_INSTRUCTIONS] =  PM_INST_CMPL,
+   [PERF_COUNT_HW_CACHE_REFERENCES] =  PM_LD_REF_L1,
+   [PERF_COUNT_HW_CACHE_MISSES] =  PM_LD_MISS_L1,
+   [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] =   PM_BRU_FIN,
+   [PERF_COUNT_HW_BRANCH_MISSES] = PM_BR_MPRED,
 };
 
 #define C(x)   PERF_COUNT_HW_CACHE_##x
-- 
2.5.3


Re: [PATCH V2] mm/powerpc: Fix _PAGE_PTE breaking swapoff

2016-01-11 Thread Michael Ellerman
On Mon, 2016-01-11 at 21:19 +0530, Aneesh Kumar K.V wrote:

> The core kernel expects swp_entry_t to consist of only the swap type and
> swap offset. We should not leak pte bits into swp_entry_t. Doing so breaks
> swapoff, which uses the swap type and offset to build a swp_entry_t and
> later compares that to the swp_entry_t obtained from the linux page table
> pte. Leaking pte bits into swp_entry_t breaks that comparison and results
> in us looping in try_to_unuse.
> 
> The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
> since swapoff is circling around and around that function, reading from
> each used swap block into a page, then trying to find where that page
> belongs, looking at every non-file pte of every mm that ever swapped.
> 
> Reported-by: Hugh Dickins 
> Suggested-by: Hugh Dickins 
> Signed-off-by: Aneesh Kumar K.V 

Thanks. I slightly edited the wording in the change log and added:

Fixes: 6a119eae942c ("powerpc/mm: Add a _PAGE_PTE bit")

cheers


Re: simple_alloc space tramples initrd

2016-01-11 Thread Michael Ellerman
On Mon, 2016-01-11 at 15:07 -0800, dwal...@fifo99.com wrote:
> On Tue, Jan 12, 2016 at 09:17:53AM +1100, Michael Ellerman wrote:
> > On Mon, 2016-01-11 at 08:49 -0800, dwal...@fifo99.com wrote:
> > > On Mon, Jan 11, 2016 at 02:09:34PM +1100, Michael Ellerman wrote:
> > > > On Fri, 2016-01-08 at 09:45 -0800, dwal...@fifo99.com wrote:
> > > > > A powerpc machine I'm working on has this problem where the
> > > > > simple_alloc_init() area is trampling the initrd. The two are placed
> > > > > fairly close together.
> > > > 
> > > > Which machine / platform?
> > > 
> > > It's not upstream yet. I'm still putting the patches together; that's
> > > when this issue came up. I can send an RFC if you want to look at the
> > > patches.
> > 
> > OK. Thanks but I don't need more patches to look at :)
> > 
> > I was just trying to narrow down which code you were talking about.
> 
> It's coming eventually anyways ;) ..

Hah, yeah I know :)

> > > > I don't really know that code very well. But ideally either the boot
> > > > loader gives you space, or the platform boot code is smart enough to
> > > > detect that there is insufficient room and puts the heap somewhere else.
> > > 
> > > It seems like the kernel should be able to handle it. I believe the
> > > bootloader passes the initrd location, but I don't think it's evaluated
> > > till later in the boot up. For simple_alloc_init() it seems all platforms
> > > just assume the space is empty without checking.
> > 
> > Yeah that's what I see too, which seems like it's liable to break, but
> > obviously hasn't for anyone else yet.
> > 
> > The bootloader must pass the initrd location, otherwise the kernel can't use
> > it, so it seems like the kernel should be able to notice when they are too
> > close. But it may be complicated by the sequencing of the code.
> 
> I found a similar one,
> 
> arch/powerpc/boot/ps3.c:platform_init()
> 
> I realized that in platform_init() you're discovering the initrd location,
> so you do have access to the values. In ps3 you can see how, if the initrd
> is placed in the 16 MB after the kernel image, the simple_alloc code could
> corrupt it.
> 
> I think it would be appropriate to check the initrd location in that function
> (since it's available) and make a choice to put the simple_alloc area after
> the initrd if the areas overlap. Does that make sense ?

Hmm, maybe. I think the ps3 code knows at link time where the initrd is, so
that's kind of cheating.

But if on your platform you can find out the initrd location early enough, then
yes ideally you take that into account when initialising the allocator.

cheers


Re: [PATCH v4 0/3] checkpatch: handling of memory barriers

2016-01-11 Thread Michael S. Tsirkin
On Mon, Jan 11, 2016 at 12:59:25PM +0200, Michael S. Tsirkin wrote:
> As part of memory barrier cleanup, this patchset
> extends checkpatch to make it easier to stop
> incorrect memory barrier usage.
> 
> This replaces the checkpatch patches in my series
>   arch: barrier cleanup + barriers for virt
> and will be included in the pull request including
> the series.
> 
> changes from v3:
>   rename smp_barrier_stems to barrier_stems
>   as suggested by Julian Calaby.

In fact it was Joe Perches that suggested it.
Sorry about the confusion.

>   add (?: ... ) around a variable in regexp,
>   in case we change the value later so that it matters.
> changes from v2:
>   address comments by Joe Perches:
>   use (?: ... ) to avoid unnecessary capture groups
>   rename smp_barriers to smp_barrier_stems for clarity
>   add barriers before/after atomic
> Changes from v1:
>   catch optional\s* before () in barriers
>   rewrite using qr{} instead of map
> 
> Michael S. Tsirkin (3):
>   checkpatch.pl: add missing memory barriers
>   checkpatch: check for __smp outside barrier.h
>   checkpatch: add virt barriers
> 
>  scripts/checkpatch.pl | 33 -
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> -- 
> MST

Re: [PATCH v4 0/3] checkpatch: handling of memory barriers

2016-01-11 Thread Julian Calaby
Hi Michael,

On Mon, Jan 11, 2016 at 10:04 PM, Michael S. Tsirkin  wrote:
> On Mon, Jan 11, 2016 at 12:59:25PM +0200, Michael S. Tsirkin wrote:
>> As part of memory barrier cleanup, this patchset
>> extends checkpatch to make it easier to stop
>> incorrect memory barrier usage.
>>
>> This replaces the checkpatch patches in my series
>>   arch: barrier cleanup + barriers for virt
>> and will be included in the pull request including
>> the series.
>>
>> changes from v3:
>>   rename smp_barrier_stems to barrier_stems
>>   as suggested by Julian Calaby.
>
> In fact it was Joe Perches that suggested it.
> Sorry about the confusion.

I was about to point that out.

FWIW this entire series is:

Acked-by: Julian Calaby 

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/

RE: [PATCH 3/6] QE: Add uqe_serial document to bindings

2016-01-11 Thread Qiang Zhao
On Fri, Jan 09, 2016 at 04:12AM, Rob Herring  wrote:
> On Fri, Jan 08, 2016 at 10:18:11AM +0800, Zhao Qiang wrote:
> > Add uqe_serial document to
> > Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt
> >
> > Signed-off-by: Zhao Qiang 
> > ---
> >  .../bindings/powerpc/fsl/cpm_qe/uqe_serial.txt   | 20
> 
> >  1 file changed, 20 insertions(+)
> >  create mode 100644
> > Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt
> >
> > diff --git
> > a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt
> > b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.txt
> > new file mode 100644
> > index 000..e677599
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/uqe_serial.
> > +++ txt
> > @@ -0,0 +1,20 @@
> > +* Serial
> > +
> > +Currently defined compatibles:
> > +- ucc_uart
> > +
> > +Properties for ucc_uart:
> > +device_type : which type the device is
> 
> Drop this please.

Yes, I will drop it in next version.

> 
> > +port-number : port number of UCC-UART
> 
> Use aliases instead.

I don't understand, can you explain more?

> 
> > +rx-clock-name : which clock QE use for RX tx-clock-name : which clock
> > +QE use for TX
> 
> These should use the clock binding.

This property indicates which clock source the UCC uses; the QE just uses
this property to route the UCC clock to that source. The clock source may be
either internal or external (from a clock input pin), so the clock binding
does not apply in this case.

> 
> > +
> > +Example:
> > +
> > +   serial: ucc@2200 {
> > +   device_type = "serial";
> > +   compatible = "ucc_uart";
> > +   port-number = <1>;
> > +   rx-clock-name = "brg2";
> > +   tx-clock-name = "brg2";
> > +   };
> > --
> > 2.1.0.27.g96db324
> >
Best Regards
Zhao Qiang

Re: powerpc/opal: fix minor off-by-one error in opal_mce_check_early_recovery()

2016-01-11 Thread Michael Ellerman
On Mon, 2015-12-21 at 07:28:37 UTC, Andrew Donnellan wrote:
> Fix off-by-one error in opal_mce_check_early_recovery() when checking
> whether the NIP falls within OPAL space.
> 
> Signed-off-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/dc3799bb9ab2666fa19081121f

cheers

[PATCH v10 3/4] tools/perf: Map the ID values with register names

2016-01-11 Thread Anju T
Map ID values with corresponding register names. These names are then
displayed when the user issues perf record with the -I option,
followed by perf report/script with the -D option.

To test this patchset,
Eg:

$ perf record -I ls   # record machine state at interrupt
$ perf script -D  # read the perf.data file

Sample output obtained for this patch looks as follows:

178329381464 0x138 [0x180]: PERF_RECORD_SAMPLE(IP, 0x1): 7803/7803: 
0xc000fd9c period: 1 addr: 0
... intr regs: mask 0x3ff ABI 64-bit
 gpr0  0xc01a6420
 gpr1  0xc01e4df039b0
 gpr2  0xc0cdd100
 gpr3  0x1
 gpr4  0xc01e4a96d000
 gpr5  0x29854255ba
 gpr6  0xc00ffa3050b8
 gpr7  0x0
 gpr8  0x0
 gpr9  0x0
 gpr10 0x0
 gpr11 0x0
 gpr12 0x24022822
 gpr13 0xcfe03000
 gpr14 0x0
 gpr15 0xc0d763f8
 gpr16 0x0
 gpr17 0xc01e4ddcf000
 gpr18 0x0
 gpr19 0xc00ffa305000
 gpr20 0xc01e4df038c0
 gpr21 0xc01e40ed7a00
 gpr22 0xc00aa28c
 gpr23 0xc0cdd100
 gpr24 0x0
 gpr25 0xc0cdd100
 gpr26 0xc01e4df038b0
 gpr27 0xfeae
 gpr28 0xc01e4df03880
 gpr29 0xc0dce900
 gpr30 0xc01e4df03890
 gpr31 0xc01e355c7a30
 nip   0xc01a62d8
 msr   0x90009032
 orig_r3 0xc01a6320
 ctr   0xc00a7be0
 link   0xc01a6428
 xer   0x0
 ccr   0x24022888
 trap  0xf01
 dar   0xc01e40ed7a00
 dsisr 0x3000c006004
 ... thread: :7803:7803
 .. dso: /root/.debug/.build-id/d0/eb47b06c0d294143af13c50616f638c2d88658
   :7803  7803   178.329381:  1 cycles:  c000fd9c 
.arch_local_irq_restore (/boot/vmlinux)


Signed-off-by: Anju T 
Reviewed-by: Madhavan Srinivasan 
---
 tools/perf/arch/powerpc/include/perf_regs.h | 64 +
 tools/perf/config/Makefile  |  5 +++
 2 files changed, 69 insertions(+)
 create mode 100644 tools/perf/arch/powerpc/include/perf_regs.h

diff --git a/tools/perf/arch/powerpc/include/perf_regs.h 
b/tools/perf/arch/powerpc/include/perf_regs.h
new file mode 100644
index 000..93080f5
--- /dev/null
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -0,0 +1,64 @@
+#ifndef ARCH_PERF_REGS_H
+#define ARCH_PERF_REGS_H
+
+#include 
+#include 
+#include 
+
+#define PERF_REGS_MASK  ((1ULL << PERF_REG_POWERPC_MAX) - 1)
+#define PERF_REGS_MAX   PERF_REG_POWERPC_MAX
+#define PERF_SAMPLE_REGS_ABI   PERF_SAMPLE_REGS_ABI_64
+
+#define PERF_REG_IP PERF_REG_POWERPC_NIP
+#define PERF_REG_SP PERF_REG_POWERPC_GPR1
+
+static const char *reg_names[] = {
+   [PERF_REG_POWERPC_GPR0] = "gpr0",
+   [PERF_REG_POWERPC_GPR1] = "gpr1",
+   [PERF_REG_POWERPC_GPR2] = "gpr2",
+   [PERF_REG_POWERPC_GPR3] = "gpr3",
+   [PERF_REG_POWERPC_GPR4] = "gpr4",
+   [PERF_REG_POWERPC_GPR5] = "gpr5",
+   [PERF_REG_POWERPC_GPR6] = "gpr6",
+   [PERF_REG_POWERPC_GPR7] = "gpr7",
+   [PERF_REG_POWERPC_GPR8] = "gpr8",
+   [PERF_REG_POWERPC_GPR9] = "gpr9",
+   [PERF_REG_POWERPC_GPR10] = "gpr10",
+   [PERF_REG_POWERPC_GPR11] = "gpr11",
+   [PERF_REG_POWERPC_GPR12] = "gpr12",
+   [PERF_REG_POWERPC_GPR13] = "gpr13",
+   [PERF_REG_POWERPC_GPR14] = "gpr14",
+   [PERF_REG_POWERPC_GPR15] = "gpr15",
+   [PERF_REG_POWERPC_GPR16] = "gpr16",
+   [PERF_REG_POWERPC_GPR17] = "gpr17",
+   [PERF_REG_POWERPC_GPR18] = "gpr18",
+   [PERF_REG_POWERPC_GPR19] = "gpr19",
+   [PERF_REG_POWERPC_GPR20] = "gpr20",
+   [PERF_REG_POWERPC_GPR21] = "gpr21",
+   [PERF_REG_POWERPC_GPR22] = "gpr22",
+   [PERF_REG_POWERPC_GPR23] = "gpr23",
+   [PERF_REG_POWERPC_GPR24] = "gpr24",
+   [PERF_REG_POWERPC_GPR25] = "gpr25",
+   [PERF_REG_POWERPC_GPR26] = "gpr26",
+   [PERF_REG_POWERPC_GPR27] = "gpr27",
+   [PERF_REG_POWERPC_GPR28] = "gpr28",
+   [PERF_REG_POWERPC_GPR29] = "gpr29",
+   [PERF_REG_POWERPC_GPR30] = "gpr30",
+   [PERF_REG_POWERPC_GPR31] = "gpr31",
+   [PERF_REG_POWERPC_NIP] = "nip",
+   [PERF_REG_POWERPC_MSR] = "msr",
+   [PERF_REG_POWERPC_ORIG_R3] = "orig_r3",
+   [PERF_REG_POWERPC_CTR] = "ctr",
+   [PERF_REG_POWERPC_LNK] = "link",
+   [PERF_REG_POWERPC_XER] = "xer",
+   [PERF_REG_POWERPC_CCR] = "ccr",
+   [PERF_REG_POWERPC_TRAP] = "trap",
+   [PERF_REG_POWERPC_DAR] = "dar",
+   [PERF_REG_POWERPC_DSISR] = "dsisr"
+};
+
+static inline const char *perf_reg_name(int id)
+{
+   return reg_names[id];
+}
+#endif /* ARCH_PERF_REGS_H */
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 38a0853..62a2f2d 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -23,6 +23,11 @@ $(call detected_var,ARCH)
 
 NO_PERF_REGS := 1
 
+# Additional ARCH settings for ppc64
+ifeq ($(ARCH),powerpc)
+  

Re: [PATCH v3 3/3] checkpatch: add virt barriers

2016-01-11 Thread Julian Calaby
Hi Michael,

On Mon, Jan 11, 2016 at 9:35 PM, Michael S. Tsirkin  wrote:
> On Sun, Jan 10, 2016 at 02:52:16PM -0800, Joe Perches wrote:
>> On Mon, 2016-01-11 at 09:13 +1100, Julian Calaby wrote:
>> > On Mon, Jan 11, 2016 at 6:31 AM, Michael S. Tsirkin  
>> > wrote:
>> > > Add virt_ barriers to list of barriers to check for
>> > > presence of a comment.
>> []
>> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>> []
>> > > @@ -5133,7 +5133,8 @@ sub process {
>> > > }x;
>> > > my $all_barriers = qr{
>> > > $barriers|
>> > > -   smp_(?:$smp_barrier_stems)
>> > > +   smp_(?:$smp_barrier_stems)|
>> > > +   virt_(?:$smp_barrier_stems)
>> >
>> > Sorry I'm late to the party here, but would it make sense to write this as:
>> >
>> > (?:smp|virt)_(?:$smp_barrier_stems)
>>
>> Yes.  Perhaps the name might be better as barrier_stems.
>>
>> Also, ideally this would be longest match first or use \b
>> after the matches so that $all_barriers could work
>> successfully without a following \s*\(
>>
>> my $all_barriers = qr{
>>   (?:smp|virt)_(?:barrier_stems)|
>>   $barriers)
>> }x;
>>
>> or maybe add separate $smp_barriers and $virt_barriers
>>
>>   it doesn't matter much in any case
>
> OK just to clarify - are you OK with merging the patch as is?
> Refactorings can come as patches on top if required.

I don't really care either way, I was just asking if it was possible.
If you don't see any value in that change, then don't make it.

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/

Re: [PATCH v3 3/3] checkpatch: add virt barriers

2016-01-11 Thread Michael S. Tsirkin
On Mon, Jan 11, 2016 at 09:40:18PM +1100, Julian Calaby wrote:
> Hi Michael,
> 
> On Mon, Jan 11, 2016 at 9:35 PM, Michael S. Tsirkin  wrote:
> > On Sun, Jan 10, 2016 at 02:52:16PM -0800, Joe Perches wrote:
> >> On Mon, 2016-01-11 at 09:13 +1100, Julian Calaby wrote:
> >> > On Mon, Jan 11, 2016 at 6:31 AM, Michael S. Tsirkin  
> >> > wrote:
> >> > > Add virt_ barriers to list of barriers to check for
> >> > > presence of a comment.
> >> []
> >> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> >> []
> >> > > @@ -5133,7 +5133,8 @@ sub process {
> >> > > }x;
> >> > > my $all_barriers = qr{
> >> > > $barriers|
> >> > > -   smp_(?:$smp_barrier_stems)
> >> > > +   smp_(?:$smp_barrier_stems)|
> >> > > +   virt_(?:$smp_barrier_stems)
> >> >
> >> > Sorry I'm late to the party here, but would it make sense to write this 
> >> > as:
> >> >
> >> > (?:smp|virt)_(?:$smp_barrier_stems)
> >>
> >> Yes.  Perhaps the name might be better as barrier_stems.
> >>
> >> Also, ideally this would be longest match first or use \b
> >> after the matches so that $all_barriers could work
> >> successfully without a following \s*\(
> >>
> >> my $all_barriers = qr{
> >>   (?:smp|virt)_(?:barrier_stems)|
> >>   $barriers)
> >> }x;
> >>
> >> or maybe add separate $smp_barriers and $virt_barriers
> >>
> >>   it doesn't matter much in any case
> >
> > OK just to clarify - are you OK with merging the patch as is?
> > Refactorings can come as patches on top if required.
> 
> I don't really care either way, I was just asking if it was possible.
> If you don't see any value in that change, then don't make it.
> 
> Thanks,
> 
> -- 
> Julian Calaby
> 
> Email: julian.cal...@gmail.com
> Profile: http://www.google.com/profiles/julian.calaby/

OK, got it, thanks.

I will rename smp_barrier_stems to barrier_stems since
this doesn't need too much testing.

I'd rather keep the regex code as is since changing it requires
testing.  I might play with it some more in the future
but I'd like to merge it in the current form to help make
sure __smp barriers are not misused.

I'll post v4 now - an ack will be appreciated.
-- 
MST

Re: powerpc/powernv: Only delay opal_rtc_read() retry when necessary

2016-01-11 Thread Michael Ellerman
On Fri, 2015-12-18 at 10:46:04 UTC, Michael Neuling wrote:
> Only delay opal_rtc_read() when busy and are going to retry.
> 
> This has the advantage of possibly saving a massive 10ms off booting!
> 
> Kudos to Stewart for noticing.
> 
> Signed-off-by: Michael Neuling 
> Reviewed-by: Stewart Smith 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/57a9039052aadf5833c40ab494

cheers

[PATCH V10 0/4] perf/powerpc: Add ability to sample intr machine state in powerpc

2016-01-11 Thread Anju T
This short patch series adds the ability to sample the interrupted
machine state for each hardware sample.

To test this patchset,
Eg:

$ perf record -I?   # list supported registers

output:

available registers: gpr0 gpr1 gpr2 gpr3 gpr4 gpr5 gpr6 gpr7 gpr8 gpr9 gpr10 
gpr11 gpr12 gpr13 gpr14 gpr15 gpr16 gpr17 gpr18 gpr19 gpr20 gpr21 gpr22 gpr23 
gpr24 gpr25 gpr26 gpr27 gpr28 gpr29 gpr30 gpr31 nip msr orig_r3 ctr link xer 
ccr trap dar dsisr
usage: perf record [] []
or: perf record [] --  []
 -I, --intr-regs[=]
sample selected machine registers on interrupt, use -I ? to list register names


$ perf record -I ls   # record machine state at interrupt
$ perf script -D  # read the perf.data file

Sample output obtained for this patchset looks as follows:

178329381464 0x138 [0x180]: PERF_RECORD_SAMPLE(IP, 0x1): 7803/7803: 
0xc000fd9c period: 1 addr: 0
... intr regs: mask 0x3ff ABI 64-bit
 gpr0  0xc01a6420
 gpr1  0xc01e4df039b0
 gpr2  0xc0cdd100
 gpr3  0x1
 gpr4  0xc01e4a96d000
 gpr5  0x29854255ba
 gpr6  0xc00ffa3050b8
 gpr7  0x0
 gpr8  0x0
 gpr9  0x0
 gpr10 0x0
 gpr11 0x0
 gpr12 0x24022822
 gpr13 0xcfe03000
 gpr14 0x0
 gpr15 0xc0d763f8
 gpr16 0x0
 gpr17 0xc01e4ddcf000
 gpr18 0x0
 gpr19 0xc00ffa305000
 gpr20 0xc01e4df038c0
 gpr21 0xc01e40ed7a00
 gpr22 0xc00aa28c
 gpr23 0xc0cdd100
 gpr24 0x0
 gpr25 0xc0cdd100
 gpr26 0xc01e4df038b0
 gpr27 0xfeae
 gpr28 0xc01e4df03880
 gpr29 0xc0dce900
 gpr30 0xc01e4df03890
 gpr31 0xc01e355c7a30
 nip   0xc01a62d8
 msr   0x90009032
 orig_r3 0xc01a6320
 ctr   0xc00a7be0
 link   0xc01a6428
 xer   0x0
 ccr   0x24022888
 trap  0xf01
 dar   0xc01e40ed7a00
 dsisr 0x3000c006004
 ... thread: :7803:7803
 .. dso: /root/.debug/.build-id/d0/eb47b06c0d294143af13c50616f638c2d88658
   :7803  7803   178.329381:  1 cycles:  c000fd9c 
.arch_local_irq_restore (/boot/vmlinux)

Changes from V9:

- Changed the name displayed for link register from "lnk" to "link" in 
  tools/perf/arch/powerpc/include/perf_regs.h

changes from V8:

- Corrected the indentation issue in the Makefile mentioned in 3rd patch

Changes from V7:

- Addressed the new line issue in 3rd patch.

Changes from V6:

- Corrected the typo in patch "tools/perf: Map the ID values with register
  names", i.e. #define PERF_REG_SP PERF_REG_POWERPC_R1 should be
  #define PERF_REG_SP PERF_REG_POWERPC_GPR1


Changes from V5:

- Enabled perf_sample_regs_user also in this patch set. Functions added in
  arch/powerpc/perf/perf_regs.c
- Added Maddy's patch to this patchset for enabling -I? option which will
  list the supported register names.


Changes from V4:

- Removed the softe and MQ from all patches
- Switch case is replaced with an array in the 3rd patch

Changes from V3:

- Addressed the comments by Sukadev regarding the nits in the descriptions.
- Modified the subject of first patch.
- Included the sample output in the 3rd patch also.

Changes from V2:

- tools/perf/config/Makefile is moved to the patch tools/perf.
- The patchset is reordered.
- perf_regs_load() is used for the dwarf unwind test. Since it is not required
  here, it is removed from tools/perf/arch/powerpc/include/perf_regs.h
- PERF_REGS_POWERPC_RESULT is removed.

Changes from V1:

- Solved the name mismatch issue in the From and Signed-off-by fields of the
  patch series.
- Added necessary comments in the 3rd patch, i.e. perf/powerpc, as suggested
  by Maddy.



Anju T (3):
  perf/powerpc: assign an id to each powerpc register
  perf/powerpc: add support for sampling intr machine state
  tools/perf: Map the ID values with register names

Madhavan Srinivasan (1):
  tool/perf: Add sample_reg_mask to include all perf_regs regs

 arch/powerpc/Kconfig|  1 +
 arch/powerpc/include/uapi/asm/perf_regs.h   | 49 +
 arch/powerpc/perf/Makefile  |  1 +
 arch/powerpc/perf/perf_regs.c   | 85 +
 tools/perf/arch/powerpc/include/perf_regs.h | 64 ++
 tools/perf/arch/powerpc/util/Build  |  1 +
 tools/perf/arch/powerpc/util/perf_regs.c| 48 
 tools/perf/config/Makefile  |  5 ++
 8 files changed, 254 insertions(+)
 create mode 100644 arch/powerpc/include/uapi/asm/perf_regs.h
 create mode 100644 arch/powerpc/perf/perf_regs.c
 create mode 100644 tools/perf/arch/powerpc/include/perf_regs.h
 create mode 100644 tools/perf/arch/powerpc/util/perf_regs.c

-- 
2.1.0


[PATCH v4 2/3] checkpatch: check for __smp outside barrier.h

2016-01-11 Thread Michael S. Tsirkin
Introduction of __smp barriers cleans up a bunch of duplicate code, but
it gives people an additional handle onto a "new" set of barriers - just
because they're prefixed with __* unfortunately doesn't stop anyone from
using it (as happened with other arch stuff before.)

Add a checkpatch test so it will trigger a warning.

Reported-by: Russell King 
Signed-off-by: Michael S. Tsirkin 
---
 scripts/checkpatch.pl | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 94b4e33..25476c2 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5143,6 +5143,16 @@ sub process {
}
}
 
+   my $underscore_smp_barriers = qr{__smp_(?:$barrier_stems)}x;
+
+   if ($realfile !~ m@^include/asm-generic/@ &&
+   $realfile !~ m@/barrier\.h$@ &&
+   $line =~ m/\b(?:$underscore_smp_barriers)\s*\(/ &&
+   $line !~ 
m/^.\s*\#\s*define\s+(?:$underscore_smp_barriers)\s*\(/) {
+   WARN("MEMORY_BARRIER",
+"__smp memory barriers shouldn't be used outside 
barrier.h and asm-generic\n" . $herecurr);
+   }
+
 # check for waitqueue_active without a comment.
if ($line =~ /\bwaitqueue_active\s*\(/) {
if (!ctx_has_comment($first_line, $linenr)) {
-- 
MST


Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-11 Thread Michael Ellerman
On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
> On BMC machines, console output is controlled by the OPAL firmware and is
> only flushed when its pollers are called.  When the kernel is in a panic
> state, it no longer calls these pollers and thus console output does not
> completely flush, causing some output from the panic to be lost.
> 
> Output is only actually lost when the kernel is configured to not power off
> or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
> flushes the console buffer as part of its power down routines.  Before this
> patch, however, only partial output would be printed during the timeout wait.
> 
> This patch adds a new kmsg_dumper which gets called at panic time to ensure
> panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
> in the OPAL API, and if that is not available, the pollers are called enough
> times to (hopefully) completely flush the buffer.
> 
> The flushing mechanism will only affect output printed at and before the
> kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
> message may still be truncated as follows:
> 
> >Call Trace:
> >[c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
> >[c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
> >[c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
> >[c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
> >[c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
> >[c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
> >[c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
> >---[ end Kernel panic - not
> 
> This functionality is implemented as a kmsg_dumper as it seems to be the
> most sensible way to introduce platform-specific functionality to the
> panic function.
> 
> Signed-off-by: Russell Currey 
> Reviewed-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f

cheers

[PATCH] mm/powerpc: Fix _PAGE_PTE breaking swapoff

2016-01-11 Thread Aneesh Kumar K.V
When converting a swp_entry_t to a pte, we need to add _PAGE_PTE,
because we later compare the pte with linux page table entries to
find a matching pte. We set _PAGE_PTE on pte entries in the linux page
table even if the entry is a swap entry, so add it when converting
swp_entry_t to pte_t.

The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
since swapoff is circling around and around that function, reading from
each used swap block into a page, then trying to find where that page
belongs, looking at every non-file pte of every mm that ever swapped.

Reported-by: Hugh Dickins 
Suggested-by: Hugh Dickins 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 03c1a5a21c0c..48edcd8fbc4f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -158,9 +158,14 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
 #define __swp_entry(type, offset)  ((swp_entry_t) { \
((type) << _PAGE_BIT_SWAP_TYPE) \
| ((offset) << PTE_RPN_SHIFT) })
-
-#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val((pte)) 
})
-#define __swp_entry_to_pte(x)  __pte((x).val)
+/*
+ * swp_entry_t should be arch independent. We build a swp_entry_t from
+ * swap type and offset we get from swap and convert that to pte to
+ * find a matching pte in linux page table.
+ * Clear bits not found in swap entries here
+ */
+#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val((pte)) & 
~_PAGE_PTE })
+#define __swp_entry_to_pte(x)  __pte((x).val | _PAGE_PTE)
 
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 #define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
-- 
2.5.0


[PATCH v4 1/3] checkpatch.pl: add missing memory barriers

2016-01-11 Thread Michael S. Tsirkin
SMP-only barriers were missing in checkpatch.pl

Refactor code slightly to make adding more variants easier.

Signed-off-by: Michael S. Tsirkin 
---
 scripts/checkpatch.pl | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 2b3c228..94b4e33 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5116,7 +5116,27 @@ sub process {
}
}
 # check for memory barriers without a comment.
-   if ($line =~ 
/\b(mb|rmb|wmb|read_barrier_depends|smp_mb|smp_rmb|smp_wmb|smp_read_barrier_depends)\(/)
 {
+
+   my $barriers = qr{
+   mb|
+   rmb|
+   wmb|
+   read_barrier_depends
+   }x;
+   my $barrier_stems = qr{
+   mb__before_atomic|
+   mb__after_atomic|
+   store_release|
+   load_acquire|
+   store_mb|
+   (?:$barriers)
+   }x;
+   my $all_barriers = qr{
+   (?:$barriers)|
+   smp_(?:$barrier_stems)
+   }x;
+
+   if ($line =~ /\b(?:$all_barriers)\s*\(/) {
if (!ctx_has_comment($first_line, $linenr)) {
WARN("MEMORY_BARRIER",
 "memory barrier without comment\n" . 
$herecurr);
-- 
MST


[PATCH V10 1/4] perf/powerpc: assign an id to each powerpc register

2016-01-11 Thread Anju T
The enum definition assigns an 'id' to each register in "struct pt_regs"
of arch/powerpc. The order of these values in the enum definition is
based on the corresponding macros in arch/powerpc/include/uapi/asm/ptrace.h.

Signed-off-by: Anju T 
Reviewed-by: Madhavan Srinivasan 
---
 arch/powerpc/include/uapi/asm/perf_regs.h | 49 +++
 1 file changed, 49 insertions(+)
 create mode 100644 arch/powerpc/include/uapi/asm/perf_regs.h

diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h 
b/arch/powerpc/include/uapi/asm/perf_regs.h
new file mode 100644
index 000..cfbd068
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -0,0 +1,49 @@
+#ifndef _ASM_POWERPC_PERF_REGS_H
+#define _ASM_POWERPC_PERF_REGS_H
+
+enum perf_event_powerpc_regs {
+   PERF_REG_POWERPC_GPR0,
+   PERF_REG_POWERPC_GPR1,
+   PERF_REG_POWERPC_GPR2,
+   PERF_REG_POWERPC_GPR3,
+   PERF_REG_POWERPC_GPR4,
+   PERF_REG_POWERPC_GPR5,
+   PERF_REG_POWERPC_GPR6,
+   PERF_REG_POWERPC_GPR7,
+   PERF_REG_POWERPC_GPR8,
+   PERF_REG_POWERPC_GPR9,
+   PERF_REG_POWERPC_GPR10,
+   PERF_REG_POWERPC_GPR11,
+   PERF_REG_POWERPC_GPR12,
+   PERF_REG_POWERPC_GPR13,
+   PERF_REG_POWERPC_GPR14,
+   PERF_REG_POWERPC_GPR15,
+   PERF_REG_POWERPC_GPR16,
+   PERF_REG_POWERPC_GPR17,
+   PERF_REG_POWERPC_GPR18,
+   PERF_REG_POWERPC_GPR19,
+   PERF_REG_POWERPC_GPR20,
+   PERF_REG_POWERPC_GPR21,
+   PERF_REG_POWERPC_GPR22,
+   PERF_REG_POWERPC_GPR23,
+   PERF_REG_POWERPC_GPR24,
+   PERF_REG_POWERPC_GPR25,
+   PERF_REG_POWERPC_GPR26,
+   PERF_REG_POWERPC_GPR27,
+   PERF_REG_POWERPC_GPR28,
+   PERF_REG_POWERPC_GPR29,
+   PERF_REG_POWERPC_GPR30,
+   PERF_REG_POWERPC_GPR31,
+   PERF_REG_POWERPC_NIP,
+   PERF_REG_POWERPC_MSR,
+   PERF_REG_POWERPC_ORIG_R3,
+   PERF_REG_POWERPC_CTR,
+   PERF_REG_POWERPC_LNK,
+   PERF_REG_POWERPC_XER,
+   PERF_REG_POWERPC_CCR,
+   PERF_REG_POWERPC_TRAP,
+   PERF_REG_POWERPC_DAR,
+   PERF_REG_POWERPC_DSISR,
+   PERF_REG_POWERPC_MAX,
+};
+#endif /* _ASM_POWERPC_PERF_REGS_H */
-- 
2.1.0


Re: [PATCH v3 3/3] checkpatch: add virt barriers

2016-01-11 Thread Michael S. Tsirkin
On Sun, Jan 10, 2016 at 02:52:16PM -0800, Joe Perches wrote:
> On Mon, 2016-01-11 at 09:13 +1100, Julian Calaby wrote:
> > On Mon, Jan 11, 2016 at 6:31 AM, Michael S. Tsirkin  wrote:
> > > Add virt_ barriers to list of barriers to check for
> > > presence of a comment.
> []
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
> > > @@ -5133,7 +5133,8 @@ sub process {
> > > }x;
> > > my $all_barriers = qr{
> > > $barriers|
> > > -   smp_(?:$smp_barrier_stems)
> > > +   smp_(?:$smp_barrier_stems)|
> > > +   virt_(?:$smp_barrier_stems)
> > 
> > Sorry I'm late to the party here, but would it make sense to write this as:
> > 
> > (?:smp|virt)_(?:$smp_barrier_stems)
> 
> Yes.  Perhaps the name might be better as barrier_stems.
> 
> Also, ideally this would be longest match first or use \b
> after the matches so that $all_barriers could work
> successfully without a following \s*\(
> 
> my $all_barriers = qr{
>   (?:smp|virt)_(?:barrier_stems)|
>   $barriers)
> }x;
> 
> or maybe add separate $smp_barriers and $virt_barriers
> 
>   it doesn't matter much in any case

OK just to clarify - are you OK with merging the patch as is?
Refactorings can come as patches on top if required.

-- 
MST

RE: [v2] powerpc/fsl: Update fman dt binding with pcs-phy and tbi-phy

2016-01-11 Thread Igal Liberman
Hi Rob,

> -Original Message-
> From: Rob Herring [mailto:r...@kernel.org]
> Sent: Wednesday, December 30, 2015 6:28 PM
> To: Igal Liberman 
> Cc: devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; Scott Wood
> ; Madalin-Cristian Bucur
> ; shaohui@freescale.com
> Subject: Re: [v2] powerpc/fsl: Update fman dt binding with pcs-phy and tbi-
> phy
> 
> On Thu, Dec 24, 2015 at 03:42:11AM +0200, igal.liber...@freescale.com
> wrote:
> > From: Igal Liberman 
> >
> > The FMan contains internal PHY devices used for SGMII connections to
> > external PHYs. When these PHYs are in use a reference is needed for
> > both the external PHY and the internal one. For the external PHY
> > phy-handle provides the reference. For the internal PHY a new handle
> > is required.
> > In dTSEC, the internal PHY is a TBI (Ten Bit Interface) PHY, the
> > handle used will be tbi-handle.
> > In mEMAC, the internal PHY is a PCS (Physical Coding Sublayer) PHY,
> > the handle used will be pcsphy-handle.
> 
> This is fairly commom for 10G eth I think. Can't you use the common PHY
> binding here in the case without internal MDIO bus? Just because you use it
> that doesn't mean you have to use the generic phy subsystem in the kernel.
> 

mEMAC and dTSEC always have internal MDIO bus.
I was requested by netdev to use the generic PHY API for internal PHY
configuration; this part was accepted. 

> Perhaps phy-handle should be deprecated in favor of doing something like
> this if you need a phandle to both:
> 
> phys = <>, <>;
> 

I think that pcsphy-handle and tbi-handle represent the hardware in a good way.
This is the actual name of the hardware.

> > diff --git a/Documentation/devicetree/bindings/powerpc/fsl/fman.txt
> > b/Documentation/devicetree/bindings/powerpc/fsl/fman.txt
> > index 1fc5328..55c2c03 100644
> > --- a/Documentation/devicetree/bindings/powerpc/fsl/fman.txt
> > +++ b/Documentation/devicetree/bindings/powerpc/fsl/fman.txt
> > @@ -315,6 +315,16 @@ PROPERTIES
> > Value type: 
> > Definition: A phandle for IEEE1588 timer.
> >
> > +- pcsphy-handle
> > +   Usage required for "fsl,fman-memac" MACs
> > +   Value type: 
> > +   Definition: A phandle for pcsphy.
> > +
> > +- tbi-handle
> > +   Usage required for "fsl,fman-dtsec" MACs
> > +   Value type: 
> > +   Definition: A phandle for tbiphy.
> > +
> >  EXAMPLE
> >
> >  fman1_tx28: port@a8000 {
> > @@ -340,6 +350,7 @@ ethernet@e {
> > reg = <0xe 0x1000>;
> > fsl,fman-ports = <_rx8 _tx28>;
> > ptp-timer = <>;
> > +   tbi-handle = <>;
> 
> What does the tbi0 node contain? It should be present in the example.
> 

There is an example under the MDIO section:

mdio@e3120 {
compatible = "fsl,fman-mdio";
reg = <0xe3120 0xee0>;
fsl,fman-internal-mdio;

tbi1: tbi-phy@8 {
reg = <0x8>;
device_type = "tbi-phy";
};
};

I used different indexes to make sure that it's aligned to the other examples 
in the binding document.

> Rob

Thank you for your feedback, 
Igal

Re: powerpc: fix style of self-test config prompts

2016-01-11 Thread Michael Ellerman
On Mon, 2015-21-12 at 06:38:41 UTC, Andrew Donnellan wrote:
> A few of the config prompts for powerpc self-tests have periods at the
> end, which is inconsistent with the rest of the prompts. Remove the
> periods.
> 
> Signed-off-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/759fb100b22473bebc46a0f10c

cheers

Re: powerpc: add va_end()

2016-01-11 Thread Michael Ellerman
On Thu, 2015-17-12 at 08:41:00 UTC, Daniel Axtens wrote:
> cppcheck picked up that there were a couple of missing va_end()
> calls in functions using va_start().
> 
> Signed-off-by: Daniel Axtens 
> Reviewed-by: Russell Currey 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1b855e167b90fcb353977c0893

cheers

[PATCH v4 0/3] checkpatch: handling of memory barriers

2016-01-11 Thread Michael S. Tsirkin
As part of memory barrier cleanup, this patchset
extends checkpatch to make it easier to stop
incorrect memory barrier usage.

This replaces the checkpatch patches in my series
arch: barrier cleanup + barriers for virt
and will be included in the pull request including
the series.

changes from v3:
rename smp_barrier_stems to barrier_stems
as suggested by Julian Calaby.
add (?: ... ) around a variable in regexp,
in case we change the value later so that it matters.
changes from v2:
address comments by Joe Perches:
use (?: ... ) to avoid unnecessary capture groups
rename smp_barriers to smp_barrier_stems for clarity
add barriers before/after atomic
Changes from v1:
catch optional\s* before () in barriers
rewrite using qr{} instead of map

Michael S. Tsirkin (3):
  checkpatch.pl: add missing memory barriers
  checkpatch: check for __smp outside barrier.h
  checkpatch: add virt barriers

Michael S. Tsirkin (3):
  checkpatch.pl: add missing memory barriers
  checkpatch: check for __smp outside barrier.h
  checkpatch: add virt barriers

 scripts/checkpatch.pl | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

-- 
MST


[PATCH v4 3/3] checkpatch: add virt barriers

2016-01-11 Thread Michael S. Tsirkin
Add virt_ barriers to list of barriers to check for
presence of a comment.

Signed-off-by: Michael S. Tsirkin 
---
 scripts/checkpatch.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 25476c2..c7bf1aa 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5133,7 +5133,8 @@ sub process {
}x;
my $all_barriers = qr{
(?:$barriers)|
-   smp_(?:$barrier_stems)
+   smp_(?:$barrier_stems)|
+   virt_(?:$barrier_stems)
}x;
 
if ($line =~ /\b(?:$all_barriers)\s*\(/) {
-- 
MST


Re: cxl: Fix DSI misses when the context owning task exits

2016-01-11 Thread Michael Ellerman
On Tue, 2015-24-11 at 10:56:18 UTC, Vaibhav Jain wrote:
> Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we
> store the pid of the current task_struct and use it to get pointer to
> the mm_struct of the process, while processing page or segment faults
> from the capi card. However this causes issues when the thread that had
> originally issued the start-work ioctl exits in which case the stored
> pid is no more valid and the cxl driver is unable to handle faults as
> the mm_struct corresponding to process is no more accessible.
> 
> This patch fixes this issue by using the mm_struct of the next alive
> task in the thread group. This is done by iterating over all the tasks
> in the thread group starting from thread group leader and calling
> get_task_mm on each one of them. When a valid mm_struct is obtained the
> pid of the associated task is stored in the context replacing the
> exiting one for handling future faults.
> 
> The patch introduces a new function named get_mem_context that checks if
> the current task pointed to by ctx->pid is dead; if yes, it performs the
> steps described above. Also a new variable cxl_context.glpid is
> introduced which stores the pid of the thread group leader associated
> with the context owning task.
> 
> Reported-by: Matthew R. Ochs 
> Reported-by: Frank Haverkamp 
> Suggested-by: Ian Munsie 
> Signed-off-by: Vaibhav Jain 
> Acked-by: Ian Munsie 
> Reviewed-by: Frederic Barrat 
> Reviewed-by: Matthew R. Ochs 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/7b8ad495d59280b634a7b546f4

cheers

[PATCH V1 4/4] tool/perf: Add sample_reg_mask to include all perf_regs regs

2016-01-11 Thread Anju T
From: Madhavan Srinivasan 

Add sample_reg_mask array with pt_regs registers.
This is needed for printing supported regs ( -I? option).

Signed-off-by: Madhavan Srinivasan 
---
 tools/perf/arch/powerpc/util/Build   |  1 +
 tools/perf/arch/powerpc/util/perf_regs.c | 48 
 2 files changed, 49 insertions(+)
 create mode 100644 tools/perf/arch/powerpc/util/perf_regs.c

diff --git a/tools/perf/arch/powerpc/util/Build 
b/tools/perf/arch/powerpc/util/Build
index 7b8b0d1..3deb1bc 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -1,5 +1,6 @@
 libperf-y += header.o
 libperf-y += sym-handling.o
+libperf-y += perf_regs.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c 
b/tools/perf/arch/powerpc/util/perf_regs.c
new file mode 100644
index 000..0b0ec65
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -0,0 +1,48 @@
+#include "../../perf.h"
+#include "../../util/perf_regs.h"
+
+const struct sample_reg sample_reg_masks[] = {
+   SMPL_REG(gpr0, PERF_REG_POWERPC_GPR0),
+   SMPL_REG(gpr1, PERF_REG_POWERPC_GPR1),
+   SMPL_REG(gpr2, PERF_REG_POWERPC_GPR2),
+   SMPL_REG(gpr3, PERF_REG_POWERPC_GPR3),
+   SMPL_REG(gpr4, PERF_REG_POWERPC_GPR4),
+   SMPL_REG(gpr5, PERF_REG_POWERPC_GPR5),
+   SMPL_REG(gpr6, PERF_REG_POWERPC_GPR6),
+   SMPL_REG(gpr7, PERF_REG_POWERPC_GPR7),
+   SMPL_REG(gpr8, PERF_REG_POWERPC_GPR8),
+   SMPL_REG(gpr9, PERF_REG_POWERPC_GPR9),
+   SMPL_REG(gpr10, PERF_REG_POWERPC_GPR10),
+   SMPL_REG(gpr11, PERF_REG_POWERPC_GPR11),
+   SMPL_REG(gpr12, PERF_REG_POWERPC_GPR12),
+   SMPL_REG(gpr13, PERF_REG_POWERPC_GPR13),
+   SMPL_REG(gpr14, PERF_REG_POWERPC_GPR14),
+   SMPL_REG(gpr15, PERF_REG_POWERPC_GPR15),
+   SMPL_REG(gpr16, PERF_REG_POWERPC_GPR16),
+   SMPL_REG(gpr17, PERF_REG_POWERPC_GPR17),
+   SMPL_REG(gpr18, PERF_REG_POWERPC_GPR18),
+   SMPL_REG(gpr19, PERF_REG_POWERPC_GPR19),
+   SMPL_REG(gpr20, PERF_REG_POWERPC_GPR20),
+   SMPL_REG(gpr21, PERF_REG_POWERPC_GPR21),
+   SMPL_REG(gpr22, PERF_REG_POWERPC_GPR22),
+   SMPL_REG(gpr23, PERF_REG_POWERPC_GPR23),
+   SMPL_REG(gpr24, PERF_REG_POWERPC_GPR24),
+   SMPL_REG(gpr25, PERF_REG_POWERPC_GPR25),
+   SMPL_REG(gpr26, PERF_REG_POWERPC_GPR26),
+   SMPL_REG(gpr27, PERF_REG_POWERPC_GPR27),
+   SMPL_REG(gpr28, PERF_REG_POWERPC_GPR28),
+   SMPL_REG(gpr29, PERF_REG_POWERPC_GPR29),
+   SMPL_REG(gpr30, PERF_REG_POWERPC_GPR30),
+   SMPL_REG(gpr31, PERF_REG_POWERPC_GPR31),
+   SMPL_REG(nip, PERF_REG_POWERPC_NIP),
+   SMPL_REG(msr, PERF_REG_POWERPC_MSR),
+   SMPL_REG(orig_r3, PERF_REG_POWERPC_ORIG_R3),
+   SMPL_REG(ctr, PERF_REG_POWERPC_CTR),
+   SMPL_REG(link, PERF_REG_POWERPC_LNK),
+   SMPL_REG(xer, PERF_REG_POWERPC_XER),
+   SMPL_REG(ccr, PERF_REG_POWERPC_CCR),
+   SMPL_REG(trap, PERF_REG_POWERPC_TRAP),
+   SMPL_REG(dar, PERF_REG_POWERPC_DAR),
+   SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
+   SMPL_REG_END
+};
-- 
2.1.0


Re: [v3, 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca

2016-01-11 Thread Michael Ellerman
On Thu, 2015-10-12 at 22:34:42 UTC, Michael Neuling wrote:
> Currently we copy the whole mm_context_t to the paca but only access a
> few bits of it.  This is wasteful of paca space and also takes quite
> some time in the hot path of context switching.
> 
> This patch pulls in only the required bits from the mm_context_t to
> the paca and on context switch, copies only those.
> 
> Benchmarking this (On top of Anton's recent MSR context switching
> changes [1]) using processes and yield shows an improvement of almost
> 3% on POWER8:
> 
>   http://ozlabs.org/~anton/junkcode/context_switch2.c
>   ./context_switch2 --test=yield --process 0 0
> 
> 1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html
> 
> Signed-off-by: Michael Neuling 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2fc251a8dda56b71ec491bee4c

cheers

Re: [1/2] powerpc: Add function to copy mm_context_t to the paca

2016-01-11 Thread Michael Ellerman
On Wed, 2015-28-10 at 04:54:06 UTC, Michael Neuling wrote:
> This adds a function to copy the mm->context to the paca.  This is
> only a basic conversion for now but will be used more extensively in
> the next patch.
> 
> This also adds #ifdef CONFIG_PPC_BOOK3S around this code since it's
> not used elsewhere.
> 
> Signed-off-by: Michael Neuling 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c395465da68bfc3a238d5bc15f

cheers

[PATCH V10 2/4] perf/powerpc: add support for sampling intr machine state

2016-01-11 Thread Anju T
The perf infrastructure uses a bit mask to find out valid
registers to display. Define a register mask for supported
registers defined in asm/perf_regs.h. The bit positions also
correspond to register IDs which is used by perf infrastructure
to fetch the register values. CONFIG_HAVE_PERF_REGS enables
sampling of the interrupted machine state.

Signed-off-by: Anju T 
Reviewed-by: Madhavan Srinivasan 
---
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/perf/Makefile|  1 +
 arch/powerpc/perf/perf_regs.c | 85 +++
 3 files changed, 87 insertions(+)
 create mode 100644 arch/powerpc/perf/perf_regs.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9a7057e..c4ce60d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -119,6 +119,7 @@ config PPC
select GENERIC_ATOMIC64 if PPC32
select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
select HAVE_PERF_EVENTS
+   select HAVE_PERF_REGS
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_HW_BREAKPOINT if PERF_EVENTS && PPC_BOOK3S_64
select ARCH_WANT_IPC_PARSE_VERSION
diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index f9c083a..2f2d3d2 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -8,6 +8,7 @@ obj64-$(CONFIG_PPC_PERF_CTRS)   += power4-pmu.o ppc970-pmu.o 
power5-pmu.o \
   power8-pmu.o
 obj32-$(CONFIG_PPC_PERF_CTRS)  += mpc7450-pmu.o
 
+obj-$(CONFIG_PERF_EVENTS)  += perf_regs.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT) += core-fsl-emb.o
 obj-$(CONFIG_FSL_EMB_PERF_EVENT_E500) += e500-pmu.o e6500-pmu.o
 
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
new file mode 100644
index 000..d32581763
--- /dev/null
+++ b/arch/powerpc/perf/perf_regs.c
@@ -0,0 +1,85 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PT_REGS_OFFSET(id, r) [id] = offsetof(struct pt_regs, r)
+
+#define REG_RESERVED (~((1ULL << PERF_REG_POWERPC_MAX) - 1))
+
+static unsigned int pt_regs_offset[PERF_REG_POWERPC_MAX] = {
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR0, gpr[0]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR1, gpr[1]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR2, gpr[2]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR3, gpr[3]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR4, gpr[4]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR5, gpr[5]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR6, gpr[6]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR7, gpr[7]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR8, gpr[8]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR9, gpr[9]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR10, gpr[10]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR11, gpr[11]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR12, gpr[12]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR13, gpr[13]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR14, gpr[14]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR15, gpr[15]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR16, gpr[16]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR17, gpr[17]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR18, gpr[18]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR19, gpr[19]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR20, gpr[20]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR21, gpr[21]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR22, gpr[22]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR23, gpr[23]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR24, gpr[24]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR25, gpr[25]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR26, gpr[26]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR27, gpr[27]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR28, gpr[28]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR29, gpr[29]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR30, gpr[30]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_GPR31, gpr[31]),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_NIP, nip),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_MSR, msr),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_ORIG_R3, orig_gpr3),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_CTR, ctr),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_LNK, link),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_XER, xer),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_CCR, ccr),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_TRAP, trap),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_DAR, dar),
+   PT_REGS_OFFSET(PERF_REG_POWERPC_DSISR, dsisr),
+};
+
+u64 perf_reg_value(struct pt_regs *regs, int idx)
+{
+   if (WARN_ON_ONCE(idx >= PERF_REG_POWERPC_MAX))
+   return 0;
+
+   return regs_get_register(regs, pt_regs_offset[idx]);
+}
+
+int perf_reg_validate(u64 mask)
+{
+   if (!mask || mask & REG_RESERVED)
+   return -EINVAL;
+   return 0;
+}
+
+u64 perf_reg_abi(struct task_struct *task)
+{
+   return PERF_SAMPLE_REGS_ABI_64;
+}
+
+void perf_get_regs_user(struct perf_regs *regs_user,
+   

Re: [PATCH v2 20/32] metag: define __smp_xxx

2016-01-11 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 12:09:30AM +, James Hogan wrote:
> Hi Michael,
> 
> On Thu, Dec 31, 2015 at 09:08:22PM +0200, Michael S. Tsirkin wrote:
> > This defines __smp_xxx barriers for metag,
> > for use by virtualization.
> > 
> > smp_xxx barriers are removed as they are
> > defined correctly by asm-generic/barriers.h
> > 
> > Note: as __smp_XX macros should not depend on CONFIG_SMP, they can not
> > use the existing fence() macro since that is defined differently between
> > SMP and !SMP.  For this reason, this patch introduces a wrapper
> > metag_fence() that doesn't depend on CONFIG_SMP.
> > fence() is then defined using that, depending on CONFIG_SMP.
> 
> I'm not a fan of the inconsistent commit message wrapping. I wrap to 72
> columns (although I now notice SubmittingPatches says to use 75...).
> 
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > Acked-by: Arnd Bergmann 
> > ---
> >  arch/metag/include/asm/barrier.h | 32 +++-
> >  1 file changed, 15 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arch/metag/include/asm/barrier.h 
> > b/arch/metag/include/asm/barrier.h
> > index b5b778b..84880c9 100644
> > --- a/arch/metag/include/asm/barrier.h
> > +++ b/arch/metag/include/asm/barrier.h
> > @@ -44,13 +44,6 @@ static inline void wr_fence(void)
> >  #define rmb()  barrier()
> >  #define wmb()  mb()
> >  
> > -#ifndef CONFIG_SMP
> > -#define fence()do { } while (0)
> > -#define smp_mb()barrier()
> > -#define smp_rmb()   barrier()
> > -#define smp_wmb()   barrier()
> > -#else
> 
> !SMP kernel text differs, but only because of new presence of unused
> metag_fence() inline function. If I #if 0 that out, then it matches, so
> thats fine.
> 
> > -
> >  #ifdef CONFIG_METAG_SMP_WRITE_REORDERING
> >  /*
> >   * Write to the atomic memory unlock system event register (command 0). 
> > This is
> > @@ -60,26 +53,31 @@ static inline void wr_fence(void)
> >   * incoherence). It is therefore ineffective if used after and on the same
> >   * thread as a write.
> >   */
> > -static inline void fence(void)
> > +static inline void metag_fence(void)
> >  {
> > volatile int *flushptr = (volatile int *) LINSYSEVENT_WR_ATOMIC_UNLOCK;
> > barrier();
> > *flushptr = 0;
> > barrier();
> >  }
> > -#define smp_mb()fence()
> > -#define smp_rmb()   fence()
> > -#define smp_wmb()   barrier()
> > +#define __smp_mb()metag_fence()
> > +#define __smp_rmb()   metag_fence()
> > +#define __smp_wmb()   barrier()
> >  #else
> > -#define fence()do { } while (0)
> > -#define smp_mb()barrier()
> > -#define smp_rmb()   barrier()
> > -#define smp_wmb()   barrier()
> > +#define metag_fence()  do { } while (0)
> > +#define __smp_mb()barrier()
> > +#define __smp_rmb()   barrier()
> > +#define __smp_wmb()   barrier()
> 
> Whitespace is now messed up. Admitedly its already inconsistent
> tabs/spaces, but it'd be nice if the definitions at least still all
> lined up. You're touching all the definitions which use spaces anyway,
> so feel free to convert them to tabs while you're at it.
> 
> Other than those niggles, it looks sensible to me:
> Acked-by: James Hogan 
> 
> Cheers
> James

Thanks!

I did this in my tree (replaced spaces with tabs in the new
definitions); not reposting just because of this change.

> >  #endif
> > +
> > +#ifdef CONFIG_SMP
> > +#define fence() metag_fence()
> > +#else
> > +#define fence()do { } while (0)
> >  #endif
> >  
> > -#define smp_mb__before_atomic()barrier()
> > -#define smp_mb__after_atomic() barrier()
> > +#define __smp_mb__before_atomic()  barrier()
> > +#define __smp_mb__after_atomic()   barrier()
> >  
> >  #include 
> >  
> > -- 
> > MST
> > 


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH kernel] powerpc/ioda: Set "read" permission when "write" is set

2016-01-11 Thread Alexey Kardashevskiy
Quite often drivers set only the "write" permission, assuming that it
includes the "read" permission as well, and this works on plenty of
platforms. However, IODA2 is strict about this and produces an EEH error
when the "read" permission is not set and a read happens.

This adds a workaround in IODA code to always add the "read" bit when
the "write" bit is set.

Cc: Benjamin Herrenschmidt 
Signed-off-by: Alexey Kardashevskiy 
---


Ben, what was the driver which did not set "read" and caused EEH?


---
 arch/powerpc/platforms/powernv/pci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci.c 
b/arch/powerpc/platforms/powernv/pci.c
index f2dd772..c7dcae5 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tbl, long index, long 
npages,
u64 rpn = __pa(uaddr) >> tbl->it_page_shift;
long i;
 
+   if (proto_tce & TCE_PCI_WRITE)
+   proto_tce |= TCE_PCI_READ;
+
for (i = 0; i < npages; i++) {
unsigned long newtce = proto_tce |
((rpn + i) << tbl->it_page_shift);
@@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long index,
 
BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl));
 
+   if (newtce & TCE_PCI_WRITE)
+   newtce |= TCE_PCI_READ;
+
oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
*hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ | TCE_PCI_WRITE);
*direction = iommu_tce_direction(oldtce);
-- 
2.5.0.rc3


[RFC PATCH V1 19/33] powerpc/mm: Rename hash specific page table bits (_PAGE* -> H_PAGE*)

2016-01-11 Thread Aneesh Kumar K.V
This patch renames _PAGE* -> H_PAGE*. This enables us to support
different page table formats in the same kernel.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |  60 ++--
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 111 
 arch/powerpc/include/asm/book3s/64/hash.h | 320 +++---
 arch/powerpc/include/asm/book3s/64/pgalloc-hash.h |  16 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  67 -
 arch/powerpc/include/asm/kvm_book3s_64.h  |  10 +-
 arch/powerpc/include/asm/mmu-hash64.h |   4 +-
 arch/powerpc/include/asm/page_64.h|   2 +-
 arch/powerpc/include/asm/pte-common.h |   3 +
 arch/powerpc/kernel/asm-offsets.c |   9 +-
 arch/powerpc/kernel/pci_64.c  |   3 +-
 arch/powerpc/kvm/book3s_64_mmu_host.c |   2 +-
 arch/powerpc/mm/copro_fault.c |   8 +-
 arch/powerpc/mm/hash64_4k.c   |  25 +-
 arch/powerpc/mm/hash64_64k.c  |  61 +++--
 arch/powerpc/mm/hash_native_64.c  |  10 +-
 arch/powerpc/mm/hash_utils_64.c   |  93 ---
 arch/powerpc/mm/hugepage-hash64.c |  22 +-
 arch/powerpc/mm/hugetlbpage-hash64.c  |  46 ++--
 arch/powerpc/mm/mmu_context_hash64.c  |   4 +-
 arch/powerpc/mm/pgtable-hash64.c  |  42 +--
 arch/powerpc/mm/pgtable_64.c  |  86 --
 arch/powerpc/mm/slb.c |   8 +-
 arch/powerpc/mm/slb_low.S |   4 +-
 arch/powerpc/mm/slice.c   |   2 +-
 arch/powerpc/mm/tlb_hash64.c  |   8 +-
 arch/powerpc/platforms/cell/spu_base.c|   6 +-
 arch/powerpc/platforms/cell/spufs/fault.c |   4 +-
 arch/powerpc/platforms/ps3/spu.c  |   2 +-
 arch/powerpc/platforms/pseries/lpar.c |  12 +-
 drivers/misc/cxl/fault.c  |   6 +-
 31 files changed, 598 insertions(+), 458 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index c78f5928001b..1ef4b39f96fd 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -5,56 +5,56 @@
  * for each page table entry.  The PMD and PGD level use a 32b record for
  * each entry by assuming that each entry is page aligned.
  */
-#define PTE_INDEX_SIZE  9
-#define PMD_INDEX_SIZE  7
-#define PUD_INDEX_SIZE  9
-#define PGD_INDEX_SIZE  9
+#define H_PTE_INDEX_SIZE  9
+#define H_PMD_INDEX_SIZE  7
+#define H_PUD_INDEX_SIZE  9
+#define H_PGD_INDEX_SIZE  9
 
 #ifndef __ASSEMBLY__
-#define PTE_TABLE_SIZE (sizeof(pte_t) << PTE_INDEX_SIZE)
-#define PMD_TABLE_SIZE (sizeof(pmd_t) << PMD_INDEX_SIZE)
-#define PUD_TABLE_SIZE (sizeof(pud_t) << PUD_INDEX_SIZE)
-#define PGD_TABLE_SIZE (sizeof(pgd_t) << PGD_INDEX_SIZE)
+#define H_PTE_TABLE_SIZE   (sizeof(pte_t) << H_PTE_INDEX_SIZE)
+#define H_PMD_TABLE_SIZE   (sizeof(pmd_t) << H_PMD_INDEX_SIZE)
+#define H_PUD_TABLE_SIZE   (sizeof(pud_t) << H_PUD_INDEX_SIZE)
+#define H_PGD_TABLE_SIZE   (sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 #endif /* __ASSEMBLY__ */
 
-#define PTRS_PER_PTE   (1 << PTE_INDEX_SIZE)
-#define PTRS_PER_PMD   (1 << PMD_INDEX_SIZE)
-#define PTRS_PER_PUD   (1 << PUD_INDEX_SIZE)
-#define PTRS_PER_PGD   (1 << PGD_INDEX_SIZE)
+#define H_PTRS_PER_PTE (1 << H_PTE_INDEX_SIZE)
+#define H_PTRS_PER_PMD (1 << H_PMD_INDEX_SIZE)
+#define H_PTRS_PER_PUD (1 << H_PUD_INDEX_SIZE)
+#define H_PTRS_PER_PGD (1 << H_PGD_INDEX_SIZE)
 
 /* PMD_SHIFT determines what a second-level page table entry can map */
-#define PMD_SHIFT  (PAGE_SHIFT + PTE_INDEX_SIZE)
-#define PMD_SIZE   (1UL << PMD_SHIFT)
-#define PMD_MASK   (~(PMD_SIZE-1))
+#define H_PMD_SHIFT(PAGE_SHIFT + H_PTE_INDEX_SIZE)
+#define H_PMD_SIZE (1UL << H_PMD_SHIFT)
+#define H_PMD_MASK (~(H_PMD_SIZE-1))
 
 /* With 4k base page size, hugepage PTEs go at the PMD level */
-#define MIN_HUGEPTE_SHIFT  PMD_SHIFT
+#define MIN_HUGEPTE_SHIFT  H_PMD_SHIFT
 
 /* PUD_SHIFT determines what a third-level page table entry can map */
-#define PUD_SHIFT  (PMD_SHIFT + PMD_INDEX_SIZE)
-#define PUD_SIZE   (1UL << PUD_SHIFT)
-#define PUD_MASK   (~(PUD_SIZE-1))
+#define H_PUD_SHIFT(H_PMD_SHIFT + H_PMD_INDEX_SIZE)
+#define H_PUD_SIZE (1UL << H_PUD_SHIFT)
+#define H_PUD_MASK (~(H_PUD_SIZE-1))
 
 /* PGDIR_SHIFT determines what a fourth-level page table entry can map */
-#define PGDIR_SHIFT(PUD_SHIFT + PUD_INDEX_SIZE)
-#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
-#define PGDIR_MASK (~(PGDIR_SIZE-1))
+#define H_PGDIR_SHIFT  (H_PUD_SHIFT + H_PUD_INDEX_SIZE)
+#define H_PGDIR_SIZE   (1UL << H_PGDIR_SHIFT)
+#define H_PGDIR_MASK   (~(H_PGDIR_SIZE-1))
 
 /* Bits to mask out from a PMD to get to the PTE page */

[RFC PATCH V1 00/33] Book3s abstraction in preparation for new MMU model

2016-01-11 Thread Aneesh Kumar K.V


Hello,

This is a large series, mostly consisting of code movement. No new features
are added in this series. The changes are made to accommodate the upcoming
new memory model in future powerpc chips. The details of the new MMU model
can be found at http://ibm.biz/power-isa3 (needs registration). I am
including a summary of the changes below.

ISA 3.0 adds support for the radix tree style of MMU with full
virtualization and related control mechanisms that manage its
coexistence with the HPT. Radix-using operating systems will
manage their own translation tables instead of relying on hcalls.

The radix-style MMU model requires us to use a 4-level page table with both
64K and 4K page sizes. The table index sizes for each page size are listed
below:

PGD -> 13 bits
PUD -> 9 (1G hugepage)
PMD -> 9 (2M huge page)
PTE -> 5 (for 64k), 9 (for 4k)

We also require the page table to be in big endian format.

The changes proposed in this series enables us to support both
hash page table and radix tree style MMU using a single kernel
with limited impact. The idea is to change core page table
accessors to static inline functions and later hotpatch them
to switch to hash or radix tree functions. For ex:

static inline int pte_write(pte_t pte)
{
	if (radix_enabled())
		return rpte_write(pte);
	return hlpte_write(pte);
}

On boot we will hotpatch the code so as to avoid conditional operation.

The other two major changes proposed in this series are to switch the hash
linux page table to a 4-level table and to store it in big endian format.
This is done so that functions like pte_val() and pud_populate() don't need
hotpatching, which helps limit the runtime impact of the changes.

I didn't include the radix-related changes in this series. You can
find them at https://github.com/kvaneesh/linux/commits/radix-mmu-v1

Aneesh Kumar K.V (33):
  powerpc/mm: add _PAGE_HASHPTE similar to 4K hash
  powerpc/mm: Split pgtable types to separate header
  powerpc/mm: Switch book3s 64 with 64K page size to 4 level page table
  powerpc/mm: Copy pgalloc (part 1)
  powerpc/mm: Copy pgalloc (part 2)
  powerpc/mm: Copy pgalloc (part 3)
  mm: arch hook for vm_get_page_prot
  mm: Some arch may want to use HPAGE_PMD related values as variables
  powerpc/mm: Hugetlbfs is book3s_64 and fsl_book3e (32 or 64)
  powerpc/mm: free_hugepd_range split to hash and nonhash
  powerpc/mm: Use helper instead of opencoding
  powerpc/mm: Move hash64 specific defintions to seperate header
  powerpc/mm: Move swap related definition ot hash64 header
  powerpc/mm: Use helper for finding pte bits mapping I/O area
  powerpc/mm: Use helper for finding pte filter mask for gup
  powerpc/mm: Move hash page table related functions to pgtable-hash64.c
  mm: Change pmd_huge_pte type in mm_struct
  powerpc/mm: Add helper for update page flags during ioremap
  powerpc/mm: Rename hash specific page table bits (_PAGE* -> H_PAGE*)
  powerpc/mm: Use flush_tlb_page in ptep_clear_flush_young
  powerpc/mm: THP is only available on hash64 as of now
  powerpc/mm: Use generic version of pmdp_clear_flush_young
  powerpc/mm: Create a new headers for tlbflush for hash64
  powerpc/mm: Hash linux abstraction for page table accessors
  powerpc/mm: Hash linux abstraction for functions in pgtable-hash.c
  powerpc/mm: Hash linux abstraction for mmu context handling code
  powerpc/mm: Move hash related mmu-*.h headers to book3s/
  powerpc/mm: Hash linux abstractions for early init routines
  powerpc/mm: Hash linux abstraction for THP
  powerpc/mm: Hash linux abstraction for HugeTLB
  powerpc/mm: Hash linux abstraction for page table allocator
  powerpc/mm: Hash linux abstraction for tlbflush routines
  powerpc/mm: Hash linux abstraction for pte swap encoding

 arch/arm/include/asm/pgtable-3level.h  |   8 +
 arch/arm64/include/asm/pgtable.h   |   7 +
 arch/mips/include/asm/pgtable.h|   8 +
 arch/powerpc/Kconfig   |   1 +
 .../asm/{mmu-hash32.h => book3s/32/mmu-hash.h} |   6 +-
 arch/powerpc/include/asm/book3s/32/pgalloc.h   | 109 
 arch/powerpc/include/asm/book3s/32/pgtable.h   |  39 ++
 arch/powerpc/include/asm/book3s/64/hash-4k.h   | 103 ++-
 arch/powerpc/include/asm/book3s/64/hash-64k.h  | 165 ++---
 arch/powerpc/include/asm/book3s/64/hash.h  | 524 +---
 .../asm/{mmu-hash64.h => book3s/64/mmu-hash.h} |  67 +-
 arch/powerpc/include/asm/book3s/64/mmu.h   |  93 +++
 .../include/asm/book3s/64/pgalloc-hash-4k.h|  92 +++
 .../include/asm/book3s/64/pgalloc-hash-64k.h   |  48 ++
 arch/powerpc/include/asm/book3s/64/pgalloc-hash.h  |  82 +++
 arch/powerpc/include/asm/book3s/64/pgalloc.h   | 158 +
 arch/powerpc/include/asm/book3s/64/pgtable.h   | 693 +++--
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h |  96 +++
 arch/powerpc/include/asm/book3s/64/tlbflush.h  |  56 ++
 arch/powerpc/include/asm/book3s/pgalloc.h  |  19 

[RFC PATCH V1 06/33] powerpc/mm: Copy pgalloc (part 3)

2016-01-11 Thread Aneesh Kumar K.V
64-bit book3s now always has a 4-level page table irrespective of the linux
page size. Move the related code out of the #ifdef.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgalloc.h | 55 +---
 1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index 5bb6852fa771..f06ad7354d68 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -51,7 +51,6 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
 }
 
-#ifndef CONFIG_PPC_64K_PAGES
 static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
 {
pgd_set(pgd, (unsigned long)pud);
@@ -79,6 +78,14 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, 
pmd_t *pmd,
pmd_set(pmd, (unsigned long)pte);
 }
 
+/*
+ * FIXME!!
+ * Between 4K and 64K pages, we differ in what is stored in pmd. ie.
+ * typedef pte_t *pgtable_t; -> 64K
+ * typedef struct page *pgtable_t; -> 4k
+ */
+#ifndef CONFIG_PPC_64K_PAGES
+
 static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
pgtable_t pte_page)
 {
@@ -176,36 +183,6 @@ extern void pgtable_free_tlb(struct mmu_gather *tlb, void 
*table, int shift);
 extern void __tlb_remove_table(void *_table);
 #endif
 
-#ifndef __PAGETABLE_PUD_FOLDED
-/* book3s 64 is 4 level page table */
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
-{
-   pgd_set(pgd, (unsigned long)pud);
-}
-
-static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
-{
-   return kmem_cache_alloc(PGT_CACHE(PUD_INDEX_SIZE),
-   GFP_KERNEL|__GFP_REPEAT);
-}
-
-static inline void pud_free(struct mm_struct *mm, pud_t *pud)
-{
-   kmem_cache_free(PGT_CACHE(PUD_INDEX_SIZE), pud);
-}
-#endif
-
-static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
-{
-   pud_set(pud, (unsigned long)pmd);
-}
-
-static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
-  pte_t *pte)
-{
-   pmd_set(pmd, (unsigned long)pte);
-}
-
 static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
pgtable_t pte_page)
 {
@@ -258,13 +235,17 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t 
*pmd)
kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), pmd);
 }
 
-#define __pmd_free_tlb(tlb, pmd, addr)   \
-   pgtable_free_tlb(tlb, pmd, PMD_CACHE_INDEX)
-#ifndef __PAGETABLE_PUD_FOLDED
-#define __pud_free_tlb(tlb, pud, addr)   \
-   pgtable_free_tlb(tlb, pud, PUD_INDEX_SIZE)
+static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
+  unsigned long address)
+{
+return pgtable_free_tlb(tlb, pmd, PMD_CACHE_INDEX);
+}
 
-#endif /* __PAGETABLE_PUD_FOLDED */
+static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
+  unsigned long address)
+{
+pgtable_free_tlb(tlb, pud, PUD_INDEX_SIZE);
+}
 
 #define check_pgt_cache()  do { } while (0)
 
-- 
2.5.0


[RFC PATCH V1 02/33] powerpc/mm: Split pgtable types to separate header

2016-01-11 Thread Aneesh Kumar K.V
No code changes. We will later add a radix variant that is big endian.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/page.h  | 104 +--
 arch/powerpc/include/asm/pgtable-types.h | 100 +
 2 files changed, 101 insertions(+), 103 deletions(-)
 create mode 100644 arch/powerpc/include/asm/pgtable-types.h

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index e34124f6fbf2..3a3f073f7222 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -281,109 +281,7 @@ extern long long virt_phys_offset;
 
 #ifndef __ASSEMBLY__
 
-#ifdef CONFIG_STRICT_MM_TYPECHECKS
-/* These are used to make use of C type-checking. */
-
-/* PTE level */
-typedef struct { pte_basic_t pte; } pte_t;
-#define __pte(x)   ((pte_t) { (x) })
-static inline pte_basic_t pte_val(pte_t x)
-{
-   return x.pte;
-}
-
-/* 64k pages additionally define a bigger "real PTE" type that gathers
- * the "second half" part of the PTE for pseudo 64k pages
- */
-#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64)
-typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
-#else
-typedef struct { pte_t pte; } real_pte_t;
-#endif
-
-/* PMD level */
-#ifdef CONFIG_PPC64
-typedef struct { unsigned long pmd; } pmd_t;
-#define __pmd(x)   ((pmd_t) { (x) })
-static inline unsigned long pmd_val(pmd_t x)
-{
-   return x.pmd;
-}
-
-/* PUD level exusts only on 4k pages */
-#ifndef CONFIG_PPC_64K_PAGES
-typedef struct { unsigned long pud; } pud_t;
-#define __pud(x)   ((pud_t) { (x) })
-static inline unsigned long pud_val(pud_t x)
-{
-   return x.pud;
-}
-#endif /* !CONFIG_PPC_64K_PAGES */
-#endif /* CONFIG_PPC64 */
-
-/* PGD level */
-typedef struct { unsigned long pgd; } pgd_t;
-#define __pgd(x)   ((pgd_t) { (x) })
-static inline unsigned long pgd_val(pgd_t x)
-{
-   return x.pgd;
-}
-
-/* Page protection bits */
-typedef struct { unsigned long pgprot; } pgprot_t;
-#define pgprot_val(x)  ((x).pgprot)
-#define __pgprot(x)((pgprot_t) { (x) })
-
-#else
-
-/*
- * .. while these make it easier on the compiler
- */
-
-typedef pte_basic_t pte_t;
-#define __pte(x)   (x)
-static inline pte_basic_t pte_val(pte_t pte)
-{
-   return pte;
-}
-
-#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64)
-typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
-#else
-typedef pte_t real_pte_t;
-#endif
-
-
-#ifdef CONFIG_PPC64
-typedef unsigned long pmd_t;
-#define __pmd(x)   (x)
-static inline unsigned long pmd_val(pmd_t pmd)
-{
-   return pmd;
-}
-
-#ifndef CONFIG_PPC_64K_PAGES
-typedef unsigned long pud_t;
-#define __pud(x)   (x)
-static inline unsigned long pud_val(pud_t pud)
-{
-   return pud;
-}
-#endif /* !CONFIG_PPC_64K_PAGES */
-#endif /* CONFIG_PPC64 */
-
-typedef unsigned long pgd_t;
-#define __pgd(x)   (x)
-static inline unsigned long pgd_val(pgd_t pgd)
-{
-   return pgd;
-}
-
-typedef unsigned long pgprot_t;
-#define pgprot_val(x)  (x)
-#define __pgprot(x)(x)
-
-#endif
+#include 
 
 typedef struct { signed long pd; } hugepd_t;
 
diff --git a/arch/powerpc/include/asm/pgtable-types.h 
b/arch/powerpc/include/asm/pgtable-types.h
new file mode 100644
index ..71487e1ca638
--- /dev/null
+++ b/arch/powerpc/include/asm/pgtable-types.h
@@ -0,0 +1,100 @@
+#ifndef _ASM_POWERPC_PGTABLE_TYPES_H
+#define _ASM_POWERPC_PGTABLE_TYPES_H
+
+#ifdef CONFIG_STRICT_MM_TYPECHECKS
+/* These are used to make use of C type-checking. */
+
+/* PTE level */
+typedef struct { pte_basic_t pte; } pte_t;
+#define __pte(x)   ((pte_t) { (x) })
+static inline pte_basic_t pte_val(pte_t x)
+{
+   return x.pte;
+}
+
+/* PMD level */
+#ifdef CONFIG_PPC64
+typedef struct { unsigned long pmd; } pmd_t;
+#define __pmd(x)   ((pmd_t) { (x) })
+static inline unsigned long pmd_val(pmd_t x)
+{
+   return x.pmd;
+}
+
+/* PUD level exusts only on 4k pages */
+#ifndef CONFIG_PPC_64K_PAGES
+typedef struct { unsigned long pud; } pud_t;
+#define __pud(x)   ((pud_t) { (x) })
+static inline unsigned long pud_val(pud_t x)
+{
+   return x.pud;
+}
+#endif /* !CONFIG_PPC_64K_PAGES */
+#endif /* CONFIG_PPC64 */
+
+/* PGD level */
+typedef struct { unsigned long pgd; } pgd_t;
+#define __pgd(x)   ((pgd_t) { (x) })
+static inline unsigned long pgd_val(pgd_t x)
+{
+   return x.pgd;
+}
+
+/* Page protection bits */
+typedef struct { unsigned long pgprot; } pgprot_t;
+#define pgprot_val(x)  ((x).pgprot)
+#define __pgprot(x)((pgprot_t) { (x) })
+
+#else
+
+/*
+ * .. while these make it easier on the compiler
+ */
+
+typedef pte_basic_t pte_t;
+#define __pte(x)   (x)
+static inline pte_basic_t pte_val(pte_t pte)
+{
+   return pte;
+}
+
+#ifdef CONFIG_PPC64
+typedef unsigned long pmd_t;
+#define __pmd(x)   (x)
+static inline unsigned long pmd_val(pmd_t pmd)
+{
+   return pmd;
+}
+
+#ifndef CONFIG_PPC_64K_PAGES

[RFC PATCH V1 32/33] powerpc/mm: Hash linux abstraction for tlbflush routines

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 28 ++-
 arch/powerpc/include/asm/book3s/64/tlbflush.h  | 56 ++
 arch/powerpc/include/asm/tlbflush.h|  2 +-
 arch/powerpc/mm/tlb_hash64.c   |  2 +-
 4 files changed, 73 insertions(+), 15 deletions(-)
 create mode 100644 arch/powerpc/include/asm/book3s/64/tlbflush.h

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
index 1b753f96b374..ddce8477fe0c 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
@@ -52,40 +52,42 @@ extern void flush_hash_range(unsigned long number, int 
local);
 extern void flush_hash_hugepage(unsigned long vsid, unsigned long addr,
pmd_t *pmdp, unsigned int psize, int ssize,
unsigned long flags);
-
-static inline void local_flush_tlb_mm(struct mm_struct *mm)
+static inline void local_flush_hltlb_mm(struct mm_struct *mm)
 {
 }
 
-static inline void flush_tlb_mm(struct mm_struct *mm)
+static inline void flush_hltlb_mm(struct mm_struct *mm)
 {
 }
 
-static inline void local_flush_tlb_page(struct vm_area_struct *vma,
-   unsigned long vmaddr)
+static inline void local_flush_hltlb_page(struct vm_area_struct *vma,
+ unsigned long vmaddr)
 {
 }
 
-static inline void flush_tlb_page(struct vm_area_struct *vma,
- unsigned long vmaddr)
+static inline void flush_hltlb_page(struct vm_area_struct *vma,
+   unsigned long vmaddr)
 {
 }
 
-static inline void flush_tlb_page_nohash(struct vm_area_struct *vma,
-unsigned long vmaddr)
+static inline void flush_hltlb_page_nohash(struct vm_area_struct *vma,
+  unsigned long vmaddr)
 {
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
-  unsigned long start, unsigned long end)
+static inline void flush_hltlb_range(struct vm_area_struct *vma,
+unsigned long start, unsigned long end)
 {
 }
 
-static inline void flush_tlb_kernel_range(unsigned long start,
- unsigned long end)
+static inline void flush_hltlb_kernel_range(unsigned long start,
+   unsigned long end)
 {
 }
 
+
+struct mmu_gather;
+extern void hltlb_flush(struct mmu_gather *tlb);
 /* Private function for use by PCI IO mapping code */
 extern void __flush_hash_table_range(struct mm_struct *mm, unsigned long start,
 unsigned long end);
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
new file mode 100644
index ..dd8830ea7143
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -0,0 +1,56 @@
+#ifndef _ASM_POWERPC_BOOK3S_64_TLBFLUSH_H
+#define _ASM_POWERPC_BOOK3S_64_TLBFLUSH_H
+
+#include 
+
+static inline void flush_tlb_range(struct vm_area_struct *vma,
+  unsigned long start, unsigned long end)
+{
+   return flush_hltlb_range(vma, start, end);
+}
+
+static inline void flush_tlb_kernel_range(unsigned long start,
+ unsigned long end)
+{
+   return flush_hltlb_kernel_range(start, end);
+}
+
+static inline void local_flush_tlb_mm(struct mm_struct *mm)
+{
+   return local_flush_hltlb_mm(mm);
+}
+
+static inline void local_flush_tlb_page(struct vm_area_struct *vma,
+   unsigned long vmaddr)
+{
+   return local_flush_hltlb_page(vma, vmaddr);
+}
+
+static inline void flush_tlb_page_nohash(struct vm_area_struct *vma,
+unsigned long vmaddr)
+{
+   return flush_hltlb_page_nohash(vma, vmaddr);
+}
+
+static inline void tlb_flush(struct mmu_gather *tlb)
+{
+   return hltlb_flush(tlb);
+}
+
+#ifdef CONFIG_SMP
+static inline void flush_tlb_mm(struct mm_struct *mm)
+{
+   return flush_hltlb_mm(mm);
+}
+
+static inline void flush_tlb_page(struct vm_area_struct *vma,
+ unsigned long vmaddr)
+{
+   return flush_hltlb_page(vma, vmaddr);
+}
+#else
+#define flush_tlb_mm(mm)   local_flush_tlb_mm(mm)
+#define flush_tlb_page(vma,addr)   local_flush_tlb_page(vma,addr)
+#endif /* CONFIG_SMP */
+
+#endif /*  _ASM_POWERPC_BOOK3S_64_TLBFLUSH_H */
diff --git a/arch/powerpc/include/asm/tlbflush.h 
b/arch/powerpc/include/asm/tlbflush.h
index 9f77f85e3e99..2fc4331c5bc5 100644
--- a/arch/powerpc/include/asm/tlbflush.h
+++ b/arch/powerpc/include/asm/tlbflush.h
@@ -78,7 +78,7 @@ static inline void local_flush_tlb_mm(struct 

[RFC PATCH V1 11/33] powerpc/mm: Use helper instead of opencoding

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgalloc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index f06ad7354d68..23b0dd07f9ae 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -191,7 +191,7 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t 
*pmd,
 
 static inline pgtable_t pmd_pgtable(pmd_t pmd)
 {
-   return (pgtable_t)(pmd_val(pmd) & ~PMD_MASKED_BITS);
+   return (pgtable_t)pmd_page_vaddr(pmd);
 }
 
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
-- 
2.5.0


[RFC PATCH V1 31/33] powerpc/mm: Hash linux abstraction for page table allocator

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 .../include/asm/book3s/64/pgalloc-hash-4k.h|  26 ++---
 .../include/asm/book3s/64/pgalloc-hash-64k.h   |  23 ++--
 arch/powerpc/include/asm/book3s/64/pgalloc-hash.h  |  36 +--
 arch/powerpc/include/asm/book3s/64/pgalloc.h   | 118 +
 4 files changed, 148 insertions(+), 55 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h
index d1d67e585ad4..ae6480e2111b 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h
@@ -1,30 +1,30 @@
 #ifndef _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_4K_H
 #define _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_4K_H
 
-static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
+static inline void hlpmd_populate(struct mm_struct *mm, pmd_t *pmd,
pgtable_t pte_page)
 {
pmd_set(pmd, (unsigned long)page_address(pte_page));
 }
 
-static inline pgtable_t pmd_pgtable(pmd_t pmd)
+static inline pgtable_t hlpmd_pgtable(pmd_t pmd)
 {
return pmd_page(pmd);
 }
 
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
- unsigned long address)
+static inline pte_t *hlpte_alloc_one_kernel(struct mm_struct *mm,
+   unsigned long address)
 {
return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
 }
 
-static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
- unsigned long address)
+static inline pgtable_t hlpte_alloc_one(struct mm_struct *mm,
+   unsigned long address)
 {
struct page *page;
pte_t *pte;
 
-   pte = pte_alloc_one_kernel(mm, address);
+   pte = hlpte_alloc_one_kernel(mm, address);
if (!pte)
return NULL;
page = virt_to_page(pte);
@@ -35,12 +35,12 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
return page;
 }
 
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+static inline void hlpte_free_kernel(struct mm_struct *mm, pte_t *pte)
 {
free_page((unsigned long)pte);
 }
 
-static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+static inline void hlpte_free(struct mm_struct *mm, pgtable_t ptepage)
 {
pgtable_page_dtor(ptepage);
__free_page(ptepage);
@@ -58,7 +58,7 @@ static inline void pgtable_free(void *table, unsigned 
index_size)
 
 #ifdef CONFIG_SMP
 static inline void pgtable_free_tlb(struct mmu_gather *tlb,
-   void *table, int shift)
+ void *table, int shift)
 {
unsigned long pgf = (unsigned long)table;
BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
@@ -75,14 +75,14 @@ static inline void __tlb_remove_table(void *_table)
 }
 #else /* !CONFIG_SMP */
 static inline void pgtable_free_tlb(struct mmu_gather *tlb,
-   void *table, int shift)
+ void *table, int shift)
 {
pgtable_free(table, shift);
 }
 #endif /* CONFIG_SMP */
 
-static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
- unsigned long address)
+static inline void __hlpte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+   unsigned long address)
 {
tlb_flush_pgtable(tlb, address);
pgtable_page_dtor(table);
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h
index e2dab4f64316..cb382773397f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h
@@ -4,45 +4,42 @@
 extern pte_t *page_table_alloc(struct mm_struct *, unsigned long, int);
 extern void page_table_free(struct mm_struct *, unsigned long *, int);
 extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
-#ifdef CONFIG_SMP
-extern void __tlb_remove_table(void *_table);
-#endif
 
-static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
-   pgtable_t pte_page)
+static inline void hlpmd_populate(struct mm_struct *mm, pmd_t *pmd,
+ pgtable_t pte_page)
 {
pmd_set(pmd, (unsigned long)pte_page);
 }
 
-static inline pgtable_t pmd_pgtable(pmd_t pmd)
+static inline pgtable_t hlpmd_pgtable(pmd_t pmd)
 {
return (pgtable_t)pmd_page_vaddr(pmd);
 }
 
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
- unsigned long address)
+static inline pte_t *hlpte_alloc_one_kernel(struct mm_struct *mm,
+   unsigned long address)
 {
return (pte_t *)page_table_alloc(mm, address, 1);
 

[RFC PATCH V1 16/33] powerpc/mm: Move hash page table related functions to pgtable-hash64.c

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h|   1 +
 arch/powerpc/include/asm/nohash/64/pgtable.h |   2 +
 arch/powerpc/mm/Makefile |   3 +-
 arch/powerpc/mm/init_64.c| 114 +
 arch/powerpc/mm/mem.c|  29 +---
 arch/powerpc/mm/mmu_decl.h   |   4 -
 arch/powerpc/mm/pgtable-book3e.c | 163 ++
 arch/powerpc/mm/pgtable-hash64.c | 246 +++
 arch/powerpc/mm/pgtable.c|   9 +
 arch/powerpc/mm/pgtable_64.c |  88 --
 arch/powerpc/mm/ppc_mmu_32.c |  30 
 11 files changed, 461 insertions(+), 228 deletions(-)
 create mode 100644 arch/powerpc/mm/pgtable-book3e.c
 create mode 100644 arch/powerpc/mm/pgtable-hash64.c

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index ee8dd7e561b0..d51709dad729 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -604,6 +604,7 @@ static inline void hpte_do_hugepage_flush(struct mm_struct 
*mm,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+extern int map_kernel_page(unsigned long ea, unsigned long pa, int flags);
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index b9f734dd5b81..a68e809d7739 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -359,6 +359,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, 
pte_t entry)
 
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
+extern int map_kernel_page(unsigned long ea, unsigned long pa, int flags);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_H */
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 1ffeda85c086..6b5cc805c7ba 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -13,7 +13,8 @@ obj-$(CONFIG_PPC_MMU_NOHASH)  += mmu_context_nohash.o 
tlb_nohash.o \
   tlb_nohash_low.o
 obj-$(CONFIG_PPC_BOOK3E)   += tlb_low_$(CONFIG_WORD_SIZE)e.o
 hash64-$(CONFIG_PPC_NATIVE):= hash_native_64.o
-obj-$(CONFIG_PPC_STD_MMU_64)   += hash_utils_64.o slb_low.o slb.o $(hash64-y)
+obj-$(CONFIG_PPC_BOOK3E_64)   += pgtable-book3e.o
+obj-$(CONFIG_PPC_STD_MMU_64)   += pgtable-hash64.o hash_utils_64.o slb_low.o 
slb.o $(hash64-y)
 obj-$(CONFIG_PPC_STD_MMU_32)   += ppc_mmu_32.o hash_low_32.o
 obj-$(CONFIG_PPC_STD_MMU)  += tlb_hash$(CONFIG_WORD_SIZE).o \
   mmu_context_hash$(CONFIG_WORD_SIZE).o
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 8ce1ec24d573..05b025a0efe6 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -65,38 +65,10 @@
 
 #include "mmu_decl.h"
 
-#ifdef CONFIG_PPC_STD_MMU_64
-#if PGTABLE_RANGE > USER_VSID_RANGE
-#warning Limited user VSID range means pagetable space is wasted
-#endif
-
-#if (TASK_SIZE_USER64 < PGTABLE_RANGE) && (TASK_SIZE_USER64 < USER_VSID_RANGE)
-#warning TASK_SIZE is smaller than it needs to be.
-#endif
-#endif /* CONFIG_PPC_STD_MMU_64 */
-
 phys_addr_t memstart_addr = ~0;
 EXPORT_SYMBOL_GPL(memstart_addr);
 phys_addr_t kernstart_addr;
 EXPORT_SYMBOL_GPL(kernstart_addr);
-
-static void pgd_ctor(void *addr)
-{
-   memset(addr, 0, PGD_TABLE_SIZE);
-}
-
-static void pud_ctor(void *addr)
-{
-   memset(addr, 0, PUD_TABLE_SIZE);
-}
-
-static void pmd_ctor(void *addr)
-{
-   memset(addr, 0, PMD_TABLE_SIZE);
-}
-
-struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE];
-
 /*
  * Create a kmem_cache() for pagetables.  This is not used for PTE
  * pages - they're linked to struct page, come from the normal free
@@ -104,6 +76,7 @@ struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE];
  * everything else.  Caches created by this function are used for all
  * the higher level pagetables, and for hugepage pagetables.
  */
+struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE];
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
 {
char *name;
@@ -138,25 +111,6 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void 
*))
pr_debug("Allocated pgtable cache for order %d\n", shift);
 }
 
-
-void pgtable_cache_init(void)
-{
-   pgtable_cache_add(PGD_INDEX_SIZE, pgd_ctor);
-   pgtable_cache_add(PMD_CACHE_INDEX, pmd_ctor);
-   /*
-* In all current configs, when the PUD index exists it's the
-* same size as either the pgd or pmd index except with THP enabled
-* on book3s 64
-*/
-   if (PUD_INDEX_SIZE && !PGT_CACHE(PUD_INDEX_SIZE))
-   pgtable_cache_add(PUD_INDEX_SIZE, pud_ctor);
-
-   if 

[RFC PATCH V1 14/33] powerpc/mm: Use helper for finding pte bits mapping I/O area

2016-01-11 Thread Aneesh Kumar K.V
We will have different values for hash and radix, so we cannot use
#define constants. Add a helper.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 5 +
 arch/powerpc/include/asm/book3s/64/hash.h| 5 +
 arch/powerpc/include/asm/nohash/pgtable.h| 5 +
 arch/powerpc/kernel/isa-bridge.c | 4 ++--
 arch/powerpc/kernel/pci_64.c | 2 +-
 arch/powerpc/mm/pgtable_64.c | 2 +-
 6 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 3ed3303c1295..77adada2f3b4 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -478,6 +478,11 @@ static inline pgprot_t pgprot_writecombine(pgprot_t prot)
return pgprot_noncached_wc(prot);
 }
 
+static inline unsigned long pte_io_cache_bits(void)
+{
+   return _PAGE_NO_CACHE | _PAGE_GUARDED;
+}
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /*  _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index ced3aed63af2..1b27c0c8effa 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -578,6 +578,11 @@ static inline pgprot_t pgprot_writecombine(pgprot_t prot)
 extern pgprot_t vm_get_page_prot(unsigned long vm_flags);
 #define vm_get_page_prot vm_get_page_prot
 
+static inline unsigned long pte_io_cache_bits(void)
+{
+   return _PAGE_NO_CACHE | _PAGE_GUARDED;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
   pmd_t *pmdp, unsigned long old_pmd);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 11e3767216c0..8c4bb8fda0de 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -224,6 +224,11 @@ extern pgprot_t phys_mem_access_prot(struct file *file, 
unsigned long pfn,
 unsigned long size, pgprot_t vma_prot);
 #define __HAVE_PHYS_MEM_ACCESS_PROT
 
+static inline unsigned long pte_io_cache_bits(void)
+{
+   return _PAGE_NO_CACHE | _PAGE_GUARDED;
+}
+
 #ifdef CONFIG_HUGETLB_PAGE
 static inline int hugepd_ok(hugepd_t hpd)
 {
diff --git a/arch/powerpc/kernel/isa-bridge.c b/arch/powerpc/kernel/isa-bridge.c
index 0f1997097960..d81185f025fa 100644
--- a/arch/powerpc/kernel/isa-bridge.c
+++ b/arch/powerpc/kernel/isa-bridge.c
@@ -109,14 +109,14 @@ static void pci_process_ISA_OF_ranges(struct device_node 
*isa_node,
size = 0x1;
 
__ioremap_at(phb_io_base_phys, (void *)ISA_IO_BASE,
-size, _PAGE_NO_CACHE|_PAGE_GUARDED);
+size, pte_io_cache_bits());
return;
 
 inval_range:
printk(KERN_ERR "no ISA IO ranges or unexpected isa range, "
   "mapping 64k\n");
__ioremap_at(phb_io_base_phys, (void *)ISA_IO_BASE,
-0x1, _PAGE_NO_CACHE|_PAGE_GUARDED);
+0x1, pte_io_cache_bits());
 }
 
 
diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
index 60bb187cb46a..7fe1dfd214a1 100644
--- a/arch/powerpc/kernel/pci_64.c
+++ b/arch/powerpc/kernel/pci_64.c
@@ -159,7 +159,7 @@ static int pcibios_map_phb_io_space(struct pci_controller 
*hose)
 
/* Establish the mapping */
if (__ioremap_at(phys_page, area->addr, size_page,
-_PAGE_NO_CACHE | _PAGE_GUARDED) == NULL)
+pte_io_cache_bits()) == NULL)
return -ENOMEM;
 
/* Fixup hose IO resource */
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e5f600d19326..6d161cec2e32 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -253,7 +253,7 @@ void __iomem * __ioremap(phys_addr_t addr, unsigned long 
size,
 
 void __iomem * ioremap(phys_addr_t addr, unsigned long size)
 {
-   unsigned long flags = _PAGE_NO_CACHE | _PAGE_GUARDED;
+   unsigned long flags = pte_io_cache_bits();
void *caller = __builtin_return_address(0);
 
if (ppc_md.ioremap)
-- 
2.5.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH V1 30/33] powerpc/mm: Hash linux abstraction for HugeTLB

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 10 
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 14 +--
 arch/powerpc/include/asm/book3s/64/pgalloc-hash.h |  7 ++
 arch/powerpc/include/asm/book3s/64/pgalloc.h  |  9 +++
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 30 +++
 arch/powerpc/include/asm/hugetlb.h|  4 ---
 arch/powerpc/include/asm/nohash/pgalloc.h |  7 ++
 arch/powerpc/mm/hugetlbpage-hash64.c  | 11 -
 arch/powerpc/mm/hugetlbpage.c | 16 
 9 files changed, 86 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 1ef4b39f96fd..5fc9e4e1db5f 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -66,23 +66,23 @@
 /*
  * For 4k page size, we support explicit hugepage via hugepd
  */
-static inline int pmd_huge(pmd_t pmd)
+static inline int hlpmd_huge(pmd_t pmd)
 {
return 0;
 }
 
-static inline int pud_huge(pud_t pud)
+static inline int hlpud_huge(pud_t pud)
 {
return 0;
 }
 
-static inline int pgd_huge(pgd_t pgd)
+static inline int hlpgd_huge(pgd_t pgd)
 {
return 0;
 }
 #define pgd_huge pgd_huge
 
-static inline int hugepd_ok(hugepd_t hpd)
+static inline int hlhugepd_ok(hugepd_t hpd)
 {
/*
 * if it is not a pte and have hugepd shift mask
@@ -93,7 +93,7 @@ static inline int hugepd_ok(hugepd_t hpd)
return true;
return false;
 }
-#define is_hugepd(hpd) (hugepd_ok(hpd))
+#define is_hlhugepd(hpd)   (hlhugepd_ok(hpd))
 #endif
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index e697fc528c0a..4fff8b12ba0f 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -146,7 +146,7 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long 
index);
  * Defined in such a way that we can optimize away code block at build time
  * if CONFIG_HUGETLB_PAGE=n.
  */
-static inline int pmd_huge(pmd_t pmd)
+static inline int hlpmd_huge(pmd_t pmd)
 {
/*
 * leaf pte for huge page
@@ -154,7 +154,7 @@ static inline int pmd_huge(pmd_t pmd)
return !!(pmd_val(pmd) & H_PAGE_PTE);
 }
 
-static inline int pud_huge(pud_t pud)
+static inline int hlpud_huge(pud_t pud)
 {
/*
 * leaf pte for huge page
@@ -162,7 +162,7 @@ static inline int pud_huge(pud_t pud)
return !!(pud_val(pud) & H_PAGE_PTE);
 }
 
-static inline int pgd_huge(pgd_t pgd)
+static inline int hlpgd_huge(pgd_t pgd)
 {
/*
 * leaf pte for huge page
@@ -172,19 +172,19 @@ static inline int pgd_huge(pgd_t pgd)
 #define pgd_huge pgd_huge
 
 #ifdef CONFIG_DEBUG_VM
-extern int hugepd_ok(hugepd_t hpd);
-#define is_hugepd(hpd)   (hugepd_ok(hpd))
+extern int hlhugepd_ok(hugepd_t hpd);
+#define is_hlhugepd(hpd)   (hlhugepd_ok(hpd))
 #else
 /*
  * With 64k page size, we have hugepage ptes in the pgd and pmd entries. We 
don't
  * need to setup hugepage directory for them. Our pte and page directory format
  * enable us to have this enabled.
  */
-static inline int hugepd_ok(hugepd_t hpd)
+static inline int hlhugepd_ok(hugepd_t hpd)
 {
return 0;
 }
-#define is_hugepd(pdep)0
+#define is_hlhugepd(pdep)  0
 #endif /* CONFIG_DEBUG_VM */
 
 #endif /* CONFIG_HUGETLB_PAGE */
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc-hash.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc-hash.h
index dbf680970c12..1dcfe7b75f06 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc-hash.h
@@ -56,4 +56,11 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, 
pud_t *pud,
 {
pgtable_free_tlb(tlb, pud, H_PUD_INDEX_SIZE);
 }
+
+extern pte_t *huge_hlpte_alloc(struct mm_struct *mm, unsigned long addr,
+  unsigned long sz);
+extern void hugetlb_free_hlpgd_range(struct mmu_gather *tlb, unsigned long 
addr,
+unsigned long end, unsigned long floor,
+unsigned long ceiling);
+
 #endif /* _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index ff3c0e36fe3d..fa2ddda14b3d 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -66,4 +66,13 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, 
pmd_t *pmd,
 #include 
 #endif
 
+#ifdef CONFIG_HUGETLB_PAGE
+static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned 
long addr,
+

[RFC PATCH V1 12/33] powerpc/mm: Move hash64-specific definitions to separate header

2016-01-11 Thread Aneesh Kumar K.V
Also split pgalloc 64k and 4k headers

Signed-off-by: Aneesh Kumar K.V 
---
 .../include/asm/book3s/64/pgalloc-hash-4k.h|  92 ++
 .../include/asm/book3s/64/pgalloc-hash-64k.h   |  51 ++
 arch/powerpc/include/asm/book3s/64/pgalloc-hash.h  |  59 ++
 arch/powerpc/include/asm/book3s/64/pgalloc.h   | 199 +
 4 files changed, 210 insertions(+), 191 deletions(-)
 create mode 100644 arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/pgalloc-hash.h

diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h
new file mode 100644
index ..d1d67e585ad4
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-4k.h
@@ -0,0 +1,92 @@
+#ifndef _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_4K_H
+#define _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_4K_H
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
+   pgtable_t pte_page)
+{
+   pmd_set(pmd, (unsigned long)page_address(pte_page));
+}
+
+static inline pgtable_t pmd_pgtable(pmd_t pmd)
+{
+   return pmd_page(pmd);
+}
+
+static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
+ unsigned long address)
+{
+   return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
+}
+
+static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
+ unsigned long address)
+{
+   struct page *page;
+   pte_t *pte;
+
+   pte = pte_alloc_one_kernel(mm, address);
+   if (!pte)
+   return NULL;
+   page = virt_to_page(pte);
+   if (!pgtable_page_ctor(page)) {
+   __free_page(page);
+   return NULL;
+   }
+   return page;
+}
+
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+   free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+   pgtable_page_dtor(ptepage);
+   __free_page(ptepage);
+}
+
+static inline void pgtable_free(void *table, unsigned index_size)
+{
+   if (!index_size)
+   free_page((unsigned long)table);
+   else {
+   BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE);
+   kmem_cache_free(PGT_CACHE(index_size), table);
+   }
+}
+
+#ifdef CONFIG_SMP
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+   void *table, int shift)
+{
+   unsigned long pgf = (unsigned long)table;
+   BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+   pgf |= shift;
+   tlb_remove_table(tlb, (void *)pgf);
+}
+
+static inline void __tlb_remove_table(void *_table)
+{
+   void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+   unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+   pgtable_free(table, shift);
+}
+#else /* !CONFIG_SMP */
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+   void *table, int shift)
+{
+   pgtable_free(table, shift);
+}
+#endif /* CONFIG_SMP */
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+   tlb_flush_pgtable(tlb, address);
+   pgtable_page_dtor(table);
+   pgtable_free_tlb(tlb, page_address(table), 0);
+}
+
+#endif /* _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_4K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h
new file mode 100644
index ..e2dab4f64316
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc-hash-64k.h
@@ -0,0 +1,51 @@
+#ifndef _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_64K_H
+#define _ASM_POWERPC_BOOK3S_64_PGALLOC_HASH_64K_H
+
+extern pte_t *page_table_alloc(struct mm_struct *, unsigned long, int);
+extern void page_table_free(struct mm_struct *, unsigned long *, int);
+extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
+#ifdef CONFIG_SMP
+extern void __tlb_remove_table(void *_table);
+#endif
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
+   pgtable_t pte_page)
+{
+   pmd_set(pmd, (unsigned long)pte_page);
+}
+
+static inline pgtable_t pmd_pgtable(pmd_t pmd)
+{
+   return (pgtable_t)pmd_page_vaddr(pmd);
+}
+
+static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
+ unsigned long address)
+{
+   return (pte_t *)page_table_alloc(mm, address, 1);
+}
+
+static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
+   unsigned long address)
+{
+   return (pgtable_t)page_table_alloc(mm, address, 0);
+}
+
+static inline void 

[RFC PATCH V1 33/33] powerpc/mm: Hash linux abstraction for pte swap encoding

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h| 44 +
 arch/powerpc/include/asm/book3s/64/pgtable.h | 49 
 arch/powerpc/mm/slb.c|  1 -
 3 files changed, 64 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index f43b26c4d319..13926dbfb687 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -41,6 +41,7 @@
  */
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
+#define H_PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + H_PAGE_BIT_SWAP_TYPE))
 /*
  * set of bits not changed in pmd_modify.
  */
@@ -230,46 +231,31 @@
 #define hlpmd_index(address) (((address) >> (H_PMD_SHIFT)) & (H_PTRS_PER_PMD - 
1))
 #define hlpte_index(address) (((address) >> (PAGE_SHIFT)) & (H_PTRS_PER_PTE - 
1))
 
-/* Encode and de-code a swap entry */
-#define MAX_SWAPFILES_CHECK() do { \
-   BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS); \
-   /*  \
-* Don't have overlapping bits with _PAGE_HPTEFLAGS \
-* We filter HPTEFLAGS on set_pte.  \
-*/ \
-   BUILD_BUG_ON(H_PAGE_HPTEFLAGS & (0x1f << H_PAGE_BIT_SWAP_TYPE)); \
-   BUILD_BUG_ON(H_PAGE_HPTEFLAGS & H_PAGE_SWP_SOFT_DIRTY); \
-   } while (0)
 /*
  * on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
+ * We encode swap type in the lower part of pte, skipping the lowest two bits.
+ * Offset is encoded as pfn.
  */
-#define SWP_TYPE_BITS 5
-#define __swp_type(x)  (((x).val >> H_PAGE_BIT_SWAP_TYPE) \
-   & ((1UL << SWP_TYPE_BITS) - 1))
-#define __swp_offset(x)((x).val >> H_PTE_RPN_SHIFT)
-#define __swp_entry(type, offset)  ((swp_entry_t) { \
-   ((type) << H_PAGE_BIT_SWAP_TYPE) \
-   | ((offset) << H_PTE_RPN_SHIFT) })
-
-#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val((pte)) 
})
-#define __swp_entry_to_pte(x)  __pte((x).val)
+#define hl_swp_type(x) (((x).val >> H_PAGE_BIT_SWAP_TYPE)  \
+& ((1UL << SWP_TYPE_BITS) - 1))
+#define hl_swp_offset(x)   ((x).val >> H_PTE_RPN_SHIFT)
+#define hl_swp_entry(type, offset) ((swp_entry_t) {\
+   ((type) << H_PAGE_BIT_SWAP_TYPE)\
+   | ((offset) << H_PTE_RPN_SHIFT) })
 
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
-#define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
-static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
+static inline pte_t hl_pte_swp_mksoft_dirty(pte_t pte)
 {
-   return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
+   return __pte(pte_val(pte) | H_PAGE_SWP_SOFT_DIRTY);
 }
-static inline bool pte_swp_soft_dirty(pte_t pte)
+static inline bool hl_pte_swp_soft_dirty(pte_t pte)
 {
-   return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY);
+   return !!(pte_val(pte) & H_PAGE_SWP_SOFT_DIRTY);
 }
-static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
+static inline pte_t hl_pte_swp_clear_soft_dirty(pte_t pte)
 {
-   return __pte(pte_val(pte) & ~_PAGE_SWP_SOFT_DIRTY);
+   return __pte(pte_val(pte) & ~H_PAGE_SWP_SOFT_DIRTY);
 }
-#else
-#define _PAGE_SWP_SOFT_DIRTY   0
 #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
 
 extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index ff7dda649ee3..bf5598628e34 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -5,6 +5,7 @@
  * the ppc64 hashed page table.
  */
 
+#define SWP_TYPE_BITS 5
 #include 
 #include 
 
@@ -322,6 +323,54 @@ static inline void set_pte_at(struct mm_struct *mm, 
unsigned long addr,
 {
return set_hlpte_at(mm, addr, ptep, pte);
 }
+/*
+ * Swap definitions
+ */
+
+/* Encode and de-code a swap entry */
+#define MAX_SWAPFILES_CHECK() do { \
+   BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS);  \
+   /*  \
+* Don't have overlapping bits with _PAGE_HPTEFLAGS \
+* We filter HPTEFLAGS on set_pte.  \
+*/ \
+   BUILD_BUG_ON(H_PAGE_HPTEFLAGS & (0x1f << 
H_PAGE_BIT_SWAP_TYPE)); \
+   BUILD_BUG_ON(H_PAGE_HPTEFLAGS & H_PAGE_SWP_SOFT_DIRTY); \
+   } while (0)
+/*
+ * on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
+ */
+#define __pte_to_swp_entry(pte)((swp_entry_t) 

[RFC PATCH V1 08/33] mm: Some arch may want to use HPAGE_PMD related values as variables

2016-01-11 Thread Aneesh Kumar K.V
Architectures supporting multiple page table formats have the
hugepage-related values as variables, so we can't use them in #define
constants.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/arm/include/asm/pgtable-3level.h |  8 
 arch/arm64/include/asm/pgtable.h  |  7 +++
 arch/mips/include/asm/pgtable.h   |  8 
 arch/powerpc/mm/pgtable_64.c  |  7 +++
 arch/s390/include/asm/pgtable.h   |  8 
 arch/sparc/include/asm/pgtable_64.h   |  7 +++
 arch/tile/include/asm/pgtable.h   |  9 +
 arch/x86/include/asm/pgtable.h|  7 +++
 include/linux/huge_mm.h   |  3 ---
 mm/huge_memory.c  | 11 +++
 10 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-3level.h 
b/arch/arm/include/asm/pgtable-3level.h
index dc46398bc3a5..4b934de4d088 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -281,6 +281,14 @@ static inline void set_pmd_at(struct mm_struct *mm, 
unsigned long addr,
flush_pmd_entry(pmdp);
 }
 
+#if HPAGE_PMD_ORDER >= MAX_ORDER
+#error "hugepages can't be allocated by the buddy allocator"
+#endif
+
+#if HPAGE_PMD_ORDER < 2
+#error "We need more than 2 pages to do deferred thp split"
+#endif
+
 static inline int has_transparent_hugepage(void)
 {
return 1;
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4be63692f275..99a2ccc4e7d4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -379,6 +379,13 @@ static inline pgprot_t mk_sect_prot(pgprot_t prot)
 
 #define set_pmd_at(mm, addr, pmdp, pmd)set_pte_at(mm, addr, (pte_t 
*)pmdp, pmd_pte(pmd))
 
+#if HPAGE_PMD_ORDER >= MAX_ORDER
+#error "hugepages can't be allocated by the buddy allocator"
+#endif
+
+#if HPAGE_PMD_ORDER < 2
+#error "We need more than 2 pages to do deferred thp split"
+#endif
 static inline int has_transparent_hugepage(void)
 {
return 1;
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 6995b4a02e23..93810618c302 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -468,6 +468,14 @@ static inline int io_remap_pfn_range(struct vm_area_struct 
*vma,
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
+#if HPAGE_PMD_ORDER >= MAX_ORDER
+#error "hugepages can't be allocated by the buddy allocator"
+#endif
+
+#if HPAGE_PMD_ORDER < 2
+#error "We need more than 2 pages to do deferred thp split"
+#endif
+
 extern int has_transparent_hugepage(void);
 
 static inline int pmd_trans_huge(pmd_t pmd)
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 3124a20d0fab..e5f600d19326 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -785,6 +785,13 @@ pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 
 int has_transparent_hugepage(void)
 {
+
+   BUILD_BUG_ON_MSG((PMD_SHIFT - PAGE_SHIFT) >= MAX_ORDER,
+   "hugepages can't be allocated by the buddy allocator");
+
+   BUILD_BUG_ON_MSG((PMD_SHIFT - PAGE_SHIFT) < 2,
+"We need more than 2 pages to do deferred thp split");
+
if (!mmu_has_feature(MMU_FTR_16M_PAGE))
return 0;
/*
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 64ead8091248..79e7ea6e272c 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1617,6 +1617,14 @@ static inline int pmd_trans_huge(pmd_t pmd)
return pmd_val(pmd) & _SEGMENT_ENTRY_LARGE;
 }
 
+#if HPAGE_PMD_ORDER >= MAX_ORDER
+#error "hugepages can't be allocated by the buddy allocator"
+#endif
+
+#if HPAGE_PMD_ORDER < 2
+#error "We need more than 2 pages to do deferred thp split"
+#endif
+
 static inline int has_transparent_hugepage(void)
 {
return MACHINE_HAS_HPAGE ? 1 : 0;
diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index bf13625f8f90..1f62c5447513 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -683,6 +683,13 @@ static inline unsigned long pmd_trans_huge(pmd_t pmd)
return pte_val(pte) & _PAGE_PMD_HUGE;
 }
 
+#if HPAGE_PMD_ORDER >= MAX_ORDER
+#error "hugepages can't be allocated by the buddy allocator"
+#endif
+
+#if HPAGE_PMD_ORDER < 2
+#error "We need more than 2 pages to do deferred thp split"
+#endif
 #define has_transparent_hugepage() 1
 
 static inline pmd_t pmd_mkold(pmd_t pmd)
diff --git a/arch/tile/include/asm/pgtable.h b/arch/tile/include/asm/pgtable.h
index 983f1ed37d62..d8c1306e3a2f 100644
--- a/arch/tile/include/asm/pgtable.h
+++ b/arch/tile/include/asm/pgtable.h
@@ -488,6 +488,15 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+
+#if HPAGE_PMD_ORDER >= MAX_ORDER
+#error "hugepages can't be allocated by the buddy allocator"
+#endif
+
+#if 

[RFC PATCH V1 15/33] powerpc/mm: Use helper for finding pte filter mask for gup

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 8 
 arch/powerpc/include/asm/book3s/64/hash.h| 9 +
 arch/powerpc/include/asm/nohash/pgtable.h| 9 +
 arch/powerpc/mm/hugetlbpage.c| 5 +
 4 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 77adada2f3b4..c0898e26ed4a 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -483,6 +483,14 @@ static inline unsigned long pte_io_cache_bits(void)
return _PAGE_NO_CACHE | _PAGE_GUARDED;
 }
 
+static inline unsigned long gup_pte_filter(int write)
+{
+   unsigned long mask;
+   mask = _PAGE_PRESENT | _PAGE_USER;
+   if (write)
+   mask |= _PAGE_RW;
+   return mask;
+}
 #endif /* !__ASSEMBLY__ */
 
 #endif /*  _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 1b27c0c8effa..ee8dd7e561b0 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -583,6 +583,15 @@ static inline unsigned long pte_io_cache_bits(void)
return _PAGE_NO_CACHE | _PAGE_GUARDED;
 }
 
+static inline unsigned long gup_pte_filter(int write)
+{
+   unsigned long mask;
+   mask = _PAGE_PRESENT | _PAGE_USER;
+   if (write)
+   mask |= _PAGE_RW;
+   return mask;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
   pmd_t *pmdp, unsigned long old_pmd);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 8c4bb8fda0de..e4173cb06e5b 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -229,6 +229,15 @@ static inline unsigned long pte_io_cache_bits(void)
return _PAGE_NO_CACHE | _PAGE_GUARDED;
 }
 
+static inline unsigned long gup_pte_filter(int write)
+{
+   unsigned long mask;
+   mask = _PAGE_PRESENT | _PAGE_USER;
+   if (write)
+   mask |= _PAGE_RW;
+   return mask;
+}
+
 #ifdef CONFIG_HUGETLB_PAGE
 static inline int hugepd_ok(hugepd_t hpd)
 {
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 26fb814f289f..4e970f58f8d9 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -417,10 +417,7 @@ int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned 
long addr,
end = pte_end;
 
pte = READ_ONCE(*ptep);
-   mask = _PAGE_PRESENT | _PAGE_USER;
-   if (write)
-   mask |= _PAGE_RW;
-
+   mask = gup_pte_filter(write);
if ((pte_val(pte) & mask) != mask)
return 0;
 
-- 
2.5.0


[RFC PATCH V1 03/33] powerpc/mm: Switch book3s 64 with 64K page size to 4 level page table

2016-01-11 Thread Aneesh Kumar K.V
This is needed so that we can support both hash and radix page tables
in a single kernel. A radix kernel uses a 4-level table.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 33 +--
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 20 +---
 arch/powerpc/include/asm/book3s/64/hash.h |  8 +++
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 25 +++-
 arch/powerpc/include/asm/pgalloc-64.h | 24 ---
 arch/powerpc/include/asm/pgtable-types.h  | 13 +++
 arch/powerpc/mm/init_64.c | 21 -
 8 files changed, 90 insertions(+), 55 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 378f1127ca98..618afea4c9fc 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -303,6 +303,7 @@ config ZONE_DMA32
 config PGTABLE_LEVELS
int
default 2 if !PPC64
+   default 4 if PPC_BOOK3S_64
default 3 if PPC_64K_PAGES
default 4
 
diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index ea0414d6659e..c78f5928001b 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -57,39 +57,8 @@
 #define _PAGE_4K_PFN   0
 #ifndef __ASSEMBLY__
 /*
- * 4-level page tables related bits
+ * On all 4K setups, remap_4k_pfn() equates to remap_pfn_range()
  */
-
-#define pgd_none(pgd)  (!pgd_val(pgd))
-#define pgd_bad(pgd)   (pgd_val(pgd) == 0)
-#define pgd_present(pgd)   (pgd_val(pgd) != 0)
-#define pgd_page_vaddr(pgd)(pgd_val(pgd) & ~PGD_MASKED_BITS)
-
-static inline void pgd_clear(pgd_t *pgdp)
-{
-   *pgdp = __pgd(0);
-}
-
-static inline pte_t pgd_pte(pgd_t pgd)
-{
-   return __pte(pgd_val(pgd));
-}
-
-static inline pgd_t pte_pgd(pte_t pte)
-{
-   return __pgd(pte_val(pte));
-}
-extern struct page *pgd_page(pgd_t pgd);
-
-#define pud_offset(pgdp, addr) \
-  (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
-(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
-
-#define pud_ERROR(e) \
-   pr_err("%s:%d: bad pud %08lx.\n", __FILE__, __LINE__, pud_val(e))
-
-/*
- * On all 4K setups, remap_4k_pfn() equates to remap_pfn_range() */
 #define remap_4k_pfn(vma, addr, pfn, prot) \
remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
 
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 849bbec80f7b..5c9392b71a6b 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -1,15 +1,14 @@
 #ifndef _ASM_POWERPC_BOOK3S_64_HASH_64K_H
 #define _ASM_POWERPC_BOOK3S_64_HASH_64K_H
 
-#include 
-
 #define PTE_INDEX_SIZE  8
-#define PMD_INDEX_SIZE  10
-#define PUD_INDEX_SIZE 0
+#define PMD_INDEX_SIZE  5
+#define PUD_INDEX_SIZE 5
 #define PGD_INDEX_SIZE  12
 
 #define PTRS_PER_PTE   (1 << PTE_INDEX_SIZE)
 #define PTRS_PER_PMD   (1 << PMD_INDEX_SIZE)
+#define PTRS_PER_PUD   (1 << PUD_INDEX_SIZE)
 #define PTRS_PER_PGD   (1 << PGD_INDEX_SIZE)
 
 /* With 4k base page size, hugepage PTEs go at the PMD level */
@@ -20,8 +19,13 @@
 #define PMD_SIZE   (1UL << PMD_SHIFT)
 #define PMD_MASK   (~(PMD_SIZE-1))
 
+/* PUD_SHIFT determines what a third-level page table entry can map */
+#define PUD_SHIFT  (PMD_SHIFT + PMD_INDEX_SIZE)
+#define PUD_SIZE   (1UL << PUD_SHIFT)
+#define PUD_MASK   (~(PUD_SIZE-1))
+
 /* PGDIR_SHIFT determines what a third-level page table entry can map */
-#define PGDIR_SHIFT(PMD_SHIFT + PMD_INDEX_SIZE)
+#define PGDIR_SHIFT(PUD_SHIFT + PUD_INDEX_SIZE)
 #define PGDIR_SIZE (1UL << PGDIR_SHIFT)
 #define PGDIR_MASK (~(PGDIR_SIZE-1))
 
@@ -61,6 +65,8 @@
 #define PMD_MASKED_BITS(PTE_FRAG_SIZE - 1)
 /* Bits to mask out from a PGD/PUD to get to the PMD page */
 #define PUD_MASKED_BITS0x1ff
+/* FIXME!! check this */
+#define PGD_MASKED_BITS0
 
 #ifndef __ASSEMBLY__
 
@@ -130,11 +136,9 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 #else
 #define PMD_TABLE_SIZE (sizeof(pmd_t) << PMD_INDEX_SIZE)
 #endif
+#define PUD_TABLE_SIZE (sizeof(pud_t) << PUD_INDEX_SIZE)
 #define PGD_TABLE_SIZE (sizeof(pgd_t) << PGD_INDEX_SIZE)
 
-#define pgd_pte(pgd)   (pud_pte(((pud_t){ pgd })))
-#define pte_pgd(pte)   ((pgd_t)pte_pud(pte))
-
 #ifdef CONFIG_HUGETLB_PAGE
 /*
 * We have PGD_INDEX_SIZE = 12 and PTE_INDEX_SIZE = 8, so that we can have
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index f46974d0134a..9ff1e056acef 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -226,6 +226,7 @@
 #define pud_page_vaddr(pud)(pud_val(pud) & ~PUD_MASKED_BITS)
 
 #define pgd_index(address) (((address) 

[RFC PATCH V1 13/33] powerpc/mm: Move swap related definitions to hash64 header

2016-01-11 Thread Aneesh Kumar K.V
These definitions depend on hash pte bits, so move them to the hash64 header.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h| 42 
 arch/powerpc/include/asm/book3s/64/pgtable.h | 42 
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 37a152428c99..ced3aed63af2 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -230,6 +230,48 @@
 #define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
 #define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
 
+/* Encode and de-code a swap entry */
+#define MAX_SWAPFILES_CHECK() do { \
+   BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS); \
+   /*  \
+* Don't have overlapping bits with _PAGE_HPTEFLAGS \
+* We filter HPTEFLAGS on set_pte.  \
+*/ \
+   BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
+   BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY);   \
+   } while (0)
+/*
+ * on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
+ */
+#define SWP_TYPE_BITS 5
+#define __swp_type(x)  (((x).val >> _PAGE_BIT_SWAP_TYPE) \
+   & ((1UL << SWP_TYPE_BITS) - 1))
+#define __swp_offset(x)((x).val >> PTE_RPN_SHIFT)
+#define __swp_entry(type, offset)  ((swp_entry_t) { \
+   ((type) << _PAGE_BIT_SWAP_TYPE) \
+   | ((offset) << PTE_RPN_SHIFT) })
+
+#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val((pte)) })
+#define __swp_entry_to_pte(x)  __pte((x).val)
+
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+#define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
+static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
+}
+static inline bool pte_swp_soft_dirty(pte_t pte)
+{
+   return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY);
+}
+static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_SWP_SOFT_DIRTY);
+}
+#else
+#define _PAGE_SWP_SOFT_DIRTY   0
+#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
+
 extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, unsigned long pte, int huge);
 extern unsigned long htab_convert_pte_flags(unsigned long pteflags);
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 8f639401c7ba..2613b3b436c9 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -161,48 +161,6 @@ extern struct page *pgd_page(pgd_t pgd);
 #define pgd_ERROR(e) \
pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 
-/* Encode and de-code a swap entry */
-#define MAX_SWAPFILES_CHECK() do { \
-   BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS); \
-   /*  \
-* Don't have overlapping bits with _PAGE_HPTEFLAGS \
-* We filter HPTEFLAGS on set_pte.  \
-*/ \
-   BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
-   BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY);   \
-   } while (0)
-/*
- * on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
- */
-#define SWP_TYPE_BITS 5
-#define __swp_type(x)  (((x).val >> _PAGE_BIT_SWAP_TYPE) \
-   & ((1UL << SWP_TYPE_BITS) - 1))
-#define __swp_offset(x)((x).val >> PTE_RPN_SHIFT)
-#define __swp_entry(type, offset)  ((swp_entry_t) { \
-   ((type) << _PAGE_BIT_SWAP_TYPE) \
-   | ((offset) << PTE_RPN_SHIFT) })
-
-#define __pte_to_swp_entry(pte)((swp_entry_t) { pte_val((pte)) })
-#define __swp_entry_to_pte(x)  __pte((x).val)
-
-#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
-#define _PAGE_SWP_SOFT_DIRTY   (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
-static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
-}
-static inline bool pte_swp_soft_dirty(pte_t pte)
-{
-   return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY);
-}
-static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
-{
-   return __pte(pte_val(pte) & ~_PAGE_SWP_SOFT_DIRTY);
-}
-#else
-#define _PAGE_SWP_SOFT_DIRTY   0
-#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
-
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void 

[RFC PATCH V1 01/33] powerpc/mm: add _PAGE_HASHPTE similar to 4K hash

2016-01-11 Thread Aneesh Kumar K.V
Not really needed, but this brings the code back to how it was before.

See commit:
41743a4e34f0777f51c1cf0675b91508ba143050

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/hash64_64k.c | 4 ++--
 arch/powerpc/mm/hugepage-hash64.c| 2 +-
 arch/powerpc/mm/hugetlbpage-hash64.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 0762c1e08c88..3c417f9099f9 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -76,7 +76,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 * a write access. Since this is 4K insert of 64K page size
 * also add _PAGE_COMBO
 */
-   new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_COMBO;
+   new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_COMBO | _PAGE_HASHPTE;
if (access & _PAGE_RW)
new_pte |= _PAGE_DIRTY;
} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
@@ -246,7 +246,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 * a write access. Since this is 4K insert of 64K page size
 * also add _PAGE_COMBO
 */
-   new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED;
+   new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_HASHPTE;
if (access & _PAGE_RW)
new_pte |= _PAGE_DIRTY;
} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 49b152b0f926..3c4bd4c0ade9 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -46,7 +46,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 * Try to lock the PTE, add ACCESSED and DIRTY if it was
 * a write access
 */
-   new_pmd = old_pmd | _PAGE_BUSY | _PAGE_ACCESSED;
+   new_pmd = old_pmd | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_HASHPTE;
if (access & _PAGE_RW)
new_pmd |= _PAGE_DIRTY;
} while (old_pmd != __cmpxchg_u64((unsigned long *)pmdp,
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index e2138c7ae70f..9c224b012d62 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -54,7 +54,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
return 1;
/* Try to lock the PTE, add ACCESSED and DIRTY if it was
 * a write access */
-   new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED;
+   new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_HASHPTE;
if (access & _PAGE_RW)
new_pte |= _PAGE_DIRTY;
} while(old_pte != __cmpxchg_u64((unsigned long *)ptep,
-- 
2.5.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH V1 07/33] mm: arch hook for vm_get_page_prot

2016-01-11 Thread Aneesh Kumar K.V
With radix, we will have to dynamically switch between different
protection maps. Hence override vm_get_page_prot instead of using
arch_vm_get_page_prot. We could also drop arch_vm_get_page_prot, since
only powerpc defines it, but then the matching arch_calc_vm_prot_bits
would also need to be changed. So for now keep it.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h |  3 +++
 arch/powerpc/include/asm/mman.h   |  6 --
 arch/powerpc/mm/hash_utils_64.c   | 19 +++
 mm/mmap.c |  5 +
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 9ff1e056acef..37a152428c99 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -533,6 +533,9 @@ static inline pgprot_t pgprot_writecombine(pgprot_t prot)
return pgprot_noncached_wc(prot);
 }
 
+extern pgprot_t vm_get_page_prot(unsigned long vm_flags);
+#define vm_get_page_prot vm_get_page_prot
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
   pmd_t *pmdp, unsigned long old_pmd);
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 8565c254151a..9f48698af024 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -24,12 +24,6 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot)
 }
 #define arch_calc_vm_prot_bits(prot) arch_calc_vm_prot_bits(prot)
 
-static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
-{
-   return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
-}
-#define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
-
 static inline int arch_validate_prot(unsigned long prot)
 {
if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO))
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 4233dcccbaf7..6ad84eb5fe56 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1563,3 +1563,22 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
/* Finally limit subsequent allocations */
memblock_set_current_limit(ppc64_rma_size);
 }
+
+static pgprot_t hash_protection_map[16] = {
+   __P000, __P001, __P010, __P011, __P100,
+   __P101, __P110, __P111, __S000, __S001,
+   __S010, __S011, __S100, __S101, __S110, __S111
+};
+
+pgprot_t vm_get_page_prot(unsigned long vm_flags)
+{
+   pgprot_t prot_soa = __pgprot(0);
+
+   if (vm_flags & VM_SAO)
+   prot_soa = __pgprot(_PAGE_SAO);
+
+   return __pgprot(pgprot_val(hash_protection_map[vm_flags &
+   (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
+   pgprot_val(prot_soa));
+}
+EXPORT_SYMBOL(vm_get_page_prot);
diff --git a/mm/mmap.c b/mm/mmap.c
index f32b84ad621a..bdde0252ba0c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -89,6 +89,10 @@ static void unmap_region(struct mm_struct *mm,
  * x: (no) no  x: (no) yes x: (no) yes x: (yes) yes
  *
  */
+/*
+ * Give the arch an option to override the below in a dynamic manner
+ */
+#ifndef vm_get_page_prot
 pgprot_t protection_map[16] = {
__P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
__S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
@@ -101,6 +105,7 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
pgprot_val(arch_vm_get_page_prot(vm_flags)));
 }
 EXPORT_SYMBOL(vm_get_page_prot);
+#endif
 
 static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags)
 {
-- 
2.5.0


[RFC PATCH V1 27/33] powerpc/mm: Move hash related mmu-*.h headers to book3s/

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/{mmu-hash32.h => book3s/32/mmu-hash.h} | 0
 arch/powerpc/include/asm/{mmu-hash64.h => book3s/64/mmu-hash.h} | 0
 arch/powerpc/include/asm/mmu.h  | 4 ++--
 arch/powerpc/kernel/idle_power7.S   | 2 +-
 arch/powerpc/kvm/book3s_32_mmu_host.c   | 2 +-
 arch/powerpc/kvm/book3s_64_mmu.c| 2 +-
 arch/powerpc/kvm/book3s_64_mmu_host.c   | 2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 2 +-
 arch/powerpc/kvm/book3s_64_vio_hv.c | 2 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 2 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
 12 files changed, 11 insertions(+), 11 deletions(-)
 rename arch/powerpc/include/asm/{mmu-hash32.h => book3s/32/mmu-hash.h} (100%)
 rename arch/powerpc/include/asm/{mmu-hash64.h => book3s/64/mmu-hash.h} (100%)

diff --git a/arch/powerpc/include/asm/mmu-hash32.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
similarity index 100%
rename from arch/powerpc/include/asm/mmu-hash32.h
rename to arch/powerpc/include/asm/book3s/32/mmu-hash.h
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
similarity index 100%
rename from arch/powerpc/include/asm/mmu-hash64.h
rename to arch/powerpc/include/asm/book3s/64/mmu-hash.h
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 3d5abfe6ba67..18a1b7dbf5fb 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -182,10 +182,10 @@ static inline void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
 
 #if defined(CONFIG_PPC_STD_MMU_64)
 /* 64-bit classic hash table MMU */
-#  include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 #elif defined(CONFIG_PPC_STD_MMU_32)
 /* 32-bit classic hash table MMU */
-#  include <asm/mmu-hash32.h>
+#include <asm/book3s/32/mmu-hash.h>
 #elif defined(CONFIG_40x)
 /* 40x-style software loaded TLB */
#  include <asm/mmu-40x.h>
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index cf4fb5429cf1..470ceebd2d23 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 
 #undef DEBUG
 
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 55c4d51ea3e2..999106991a76 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -22,7 +22,7 @@
 
 #include 
 #include 
-#include <asm/mmu-hash32.h>
+#include <asm/book3s/32/mmu-hash.h>
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index 9bf7031a67ff..b9131aa1aedf 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -26,7 +26,7 @@
 #include 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 
 /* #define DEBUG_MMU */
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 30fc2d83dffa..d7959b2a8b32 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -23,7 +23,7 @@
 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index fb37290a57b4..c7b78d8336b2 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -32,7 +32,7 @@
 #include 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 54cf9bc94dad..9c3b76bb69d9 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -30,7 +30,7 @@
 #include 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 89e96b3e0039..039028d3ccb5 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -29,7 +29,7 @@
 #include 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 91700518bbf3..4cb8db05f3e5 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -17,7 +17,7 @@
 #include 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6ee26de9a1de..c613fee0b9f7 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -27,7 +27,7 @@
 #include 
 #include 
 #include 
-#include <asm/mmu-hash64.h>
+#include <asm/book3s/64/mmu-hash.h>
 #include 
 
 #define VCPU_GPRS_TM(reg) 

[RFC PATCH V1 04/33] powerpc/mm: Copy pgalloc (part 1)

2016-01-11 Thread Aneesh Kumar K.V
cp pgalloc-32.h book3s/32/pgalloc.h
cp pgalloc-64.h book3s/64/pgalloc.h

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/pgalloc.h | 109 +++
 arch/powerpc/include/asm/book3s/64/pgalloc.h | 262 +++
 2 files changed, 371 insertions(+)
 create mode 100644 arch/powerpc/include/asm/book3s/32/pgalloc.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/pgalloc.h

diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h
new file mode 100644
index ..76d6b9e0c8a9
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h
@@ -0,0 +1,109 @@
+#ifndef _ASM_POWERPC_PGALLOC_32_H
+#define _ASM_POWERPC_PGALLOC_32_H
+
+#include <linux/threads.h>
+
+/* For 32-bit, all levels of page tables are just drawn from get_free_page() */
+#define MAX_PGTABLE_INDEX_SIZE 0
+
+extern void __bad_pte(pmd_t *pmd);
+
+extern pgd_t *pgd_alloc(struct mm_struct *mm);
+extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
+
+/*
+ * We don't have any real pmd's, and this code never triggers because
+ * the pgd will always be present..
+ */
+/* #define pmd_alloc_one(mm,address)   ({ BUG(); ((pmd_t *)2); }) */
+#define pmd_free(mm, x)do { } while (0)
+#define __pmd_free_tlb(tlb,x,a)do { } while (0)
+/* #define pgd_populate(mm, pmd, pte)  BUG() */
+
+#ifndef CONFIG_BOOKE
+
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp,
+  pte_t *pte)
+{
+   *pmdp = __pmd(__pa(pte) | _PMD_PRESENT);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
+   pgtable_t pte_page)
+{
+   *pmdp = __pmd((page_to_pfn(pte_page) << PAGE_SHIFT) | _PMD_PRESENT);
+}
+
+#define pmd_pgtable(pmd) pmd_page(pmd)
+#else
+
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp,
+  pte_t *pte)
+{
+   *pmdp = __pmd((unsigned long)pte | _PMD_PRESENT);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
+   pgtable_t pte_page)
+{
+   *pmdp = __pmd((unsigned long)lowmem_page_address(pte_page) | _PMD_PRESENT);
+}
+
+#define pmd_pgtable(pmd) pmd_page(pmd)
+#endif
+
+extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr);
+extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr);
+
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+   free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+   pgtable_page_dtor(ptepage);
+   __free_page(ptepage);
+}
+
+static inline void pgtable_free(void *table, unsigned index_size)
+{
+   BUG_ON(index_size); /* 32-bit doesn't use this */
+   free_page((unsigned long)table);
+}
+
+#define check_pgt_cache()  do { } while (0)
+
+#ifdef CONFIG_SMP
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+   void *table, int shift)
+{
+   unsigned long pgf = (unsigned long)table;
+   BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+   pgf |= shift;
+   tlb_remove_table(tlb, (void *)pgf);
+}
+
+static inline void __tlb_remove_table(void *_table)
+{
+   void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+   unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+   pgtable_free(table, shift);
+}
+#else
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+   void *table, int shift)
+{
+   pgtable_free(table, shift);
+}
+#endif
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+   tlb_flush_pgtable(tlb, address);
+   pgtable_page_dtor(table);
+   pgtable_free_tlb(tlb, page_address(table), 0);
+}
+#endif /* _ASM_POWERPC_PGALLOC_32_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
new file mode 100644
index ..014489a619d0
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -0,0 +1,262 @@
+#ifndef _ASM_POWERPC_PGALLOC_64_H
+#define _ASM_POWERPC_PGALLOC_64_H
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/slab.h>
+#include <linux/cpumask.h>
+#include <linux/percpu.h>
+
+struct vmemmap_backing {
+   struct vmemmap_backing *list;
+   unsigned long phys;
+   unsigned long virt_addr;
+};
+extern struct vmemmap_backing *vmemmap_list;
+
+/*
+ * Functions that deal with pagetables that could be at any level of
+ * the table need to be passed an "index_size" so they know how to
+ * handle allocation.  For PTE pages 

[RFC PATCH V1 05/33] powerpc/mm: Copy pgalloc (part 2)

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/pgalloc.h   |  6 +++---
 arch/powerpc/include/asm/book3s/64/pgalloc.h   | 23 +++---
 arch/powerpc/include/asm/book3s/pgalloc.h  | 19 ++
 .../asm/{pgalloc-32.h => nohash/32/pgalloc.h}  |  0
 .../asm/{pgalloc-64.h => nohash/64/pgalloc.h}  |  0
 arch/powerpc/include/asm/nohash/pgalloc.h  | 23 ++
 arch/powerpc/include/asm/pgalloc.h | 19 +++---
 7 files changed, 64 insertions(+), 26 deletions(-)
 create mode 100644 arch/powerpc/include/asm/book3s/pgalloc.h
 rename arch/powerpc/include/asm/{pgalloc-32.h => nohash/32/pgalloc.h} (100%)
 rename arch/powerpc/include/asm/{pgalloc-64.h => nohash/64/pgalloc.h} (100%)
 create mode 100644 arch/powerpc/include/asm/nohash/pgalloc.h

diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h
index 76d6b9e0c8a9..a2350194fc76 100644
--- a/arch/powerpc/include/asm/book3s/32/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGALLOC_32_H
-#define _ASM_POWERPC_PGALLOC_32_H
+#ifndef _ASM_POWERPC_BOOK3S_32_PGALLOC_H
+#define _ASM_POWERPC_BOOK3S_32_PGALLOC_H
 
 #include <linux/threads.h>
 
@@ -106,4 +106,4 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
pgtable_page_dtor(table);
pgtable_free_tlb(tlb, page_address(table), 0);
 }
-#endif /* _ASM_POWERPC_PGALLOC_32_H */
+#endif /* _ASM_POWERPC_BOOK3S_32_PGALLOC_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index 014489a619d0..5bb6852fa771 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGALLOC_64_H
-#define _ASM_POWERPC_PGALLOC_64_H
+#ifndef _ASM_POWERPC_BOOK3S_64_PGALLOC_H
+#define _ASM_POWERPC_BOOK3S_64_PGALLOC_H
 /*
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
@@ -52,8 +52,10 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 }
 
 #ifndef CONFIG_PPC_64K_PAGES
-
-#define pgd_populate(MM, PGD, PUD) pgd_set(PGD, (unsigned long)PUD)
+static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+{
+   pgd_set(pgd, (unsigned long)pud);
+}
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
@@ -83,7 +85,10 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
pmd_set(pmd, (unsigned long)page_address(pte_page));
 }
 
-#define pmd_pgtable(pmd) pmd_page(pmd)
+static inline pgtable_t pmd_pgtable(pmd_t pmd)
+{
+   return pmd_page(pmd);
+}
 
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
  unsigned long address)
@@ -173,7 +178,11 @@ extern void __tlb_remove_table(void *_table);
 
 #ifndef __PAGETABLE_PUD_FOLDED
 /* book3s 64 is 4 level page table */
-#define pgd_populate(MM, PGD, PUD) pgd_set(PGD, PUD)
+static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+{
+   pgd_set(pgd, (unsigned long)pud);
+}
+
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
return kmem_cache_alloc(PGT_CACHE(PUD_INDEX_SIZE),
@@ -259,4 +268,4 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 
 #define check_pgt_cache()  do { } while (0)
 
-#endif /* _ASM_POWERPC_PGALLOC_64_H */
+#endif /* _ASM_POWERPC_BOOK3S_64_PGALLOC_H */
diff --git a/arch/powerpc/include/asm/book3s/pgalloc.h b/arch/powerpc/include/asm/book3s/pgalloc.h
new file mode 100644
index ..54f591e9572e
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/pgalloc.h
@@ -0,0 +1,19 @@
+#ifndef _ASM_POWERPC_BOOK3S_PGALLOC_H
+#define _ASM_POWERPC_BOOK3S_PGALLOC_H
+
+#include <linux/mm.h>
+
+extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
+static inline void tlb_flush_pgtable(struct mmu_gather *tlb,
+unsigned long address)
+{
+
+}
+
+#ifdef CONFIG_PPC64
+#include <asm/book3s/64/pgalloc.h>
+#else
+#include <asm/book3s/32/pgalloc.h>
+#endif
+
+#endif /* _ASM_POWERPC_BOOK3S_PGALLOC_H */
diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/nohash/32/pgalloc.h
similarity index 100%
rename from arch/powerpc/include/asm/pgalloc-32.h
rename to arch/powerpc/include/asm/nohash/32/pgalloc.h
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
similarity index 100%
rename from arch/powerpc/include/asm/pgalloc-64.h
rename to arch/powerpc/include/asm/nohash/64/pgalloc.h
diff --git a/arch/powerpc/include/asm/nohash/pgalloc.h b/arch/powerpc/include/asm/nohash/pgalloc.h
new file mode 100644
index ..b39ec956d71e
--- /dev/null
+++ b/arch/powerpc/include/asm/nohash/pgalloc.h
@@ -0,0 +1,23 @@
+#ifndef 

[RFC PATCH V1 23/33] powerpc/mm: Create a new headers for tlbflush for hash64

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 94 ++
 arch/powerpc/include/asm/tlbflush.h| 92 +
 2 files changed, 95 insertions(+), 91 deletions(-)
 create mode 100644 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
new file mode 100644
index ..1b753f96b374
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
@@ -0,0 +1,94 @@
+#ifndef _ASM_POWERPC_BOOK3S_64_TLBFLUSH_HASH_H
+#define _ASM_POWERPC_BOOK3S_64_TLBFLUSH_HASH_H
+
+#define MMU_NO_CONTEXT 0
+
+/*
+ * TLB flushing for 64-bit hash-MMU CPUs
+ */
+
+#include <linux/percpu.h>
+#include <asm/page.h>
+
+#define PPC64_TLB_BATCH_NR 192
+
+struct ppc64_tlb_batch {
+   int active;
+   unsigned long   index;
+   struct mm_struct*mm;
+   real_pte_t  pte[PPC64_TLB_BATCH_NR];
+   unsigned long   vpn[PPC64_TLB_BATCH_NR];
+   unsigned intpsize;
+   int ssize;
+};
+DECLARE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch);
+
+extern void __flush_tlb_pending(struct ppc64_tlb_batch *batch);
+
+#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
+
+static inline void arch_enter_lazy_mmu_mode(void)
+{
+   struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
+
+   batch->active = 1;
+}
+
+static inline void arch_leave_lazy_mmu_mode(void)
+{
+   struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
+
+   if (batch->index)
+   __flush_tlb_pending(batch);
+   batch->active = 0;
+}
+
+#define arch_flush_lazy_mmu_mode()  do {} while (0)
+
+
+extern void flush_hash_page(unsigned long vpn, real_pte_t pte, int psize,
+   int ssize, unsigned long flags);
+extern void flush_hash_range(unsigned long number, int local);
+extern void flush_hash_hugepage(unsigned long vsid, unsigned long addr,
+   pmd_t *pmdp, unsigned int psize, int ssize,
+   unsigned long flags);
+
+static inline void local_flush_tlb_mm(struct mm_struct *mm)
+{
+}
+
+static inline void flush_tlb_mm(struct mm_struct *mm)
+{
+}
+
+static inline void local_flush_tlb_page(struct vm_area_struct *vma,
+   unsigned long vmaddr)
+{
+}
+
+static inline void flush_tlb_page(struct vm_area_struct *vma,
+ unsigned long vmaddr)
+{
+}
+
+static inline void flush_tlb_page_nohash(struct vm_area_struct *vma,
+unsigned long vmaddr)
+{
+}
+
+static inline void flush_tlb_range(struct vm_area_struct *vma,
+  unsigned long start, unsigned long end)
+{
+}
+
+static inline void flush_tlb_kernel_range(unsigned long start,
+ unsigned long end)
+{
+}
+
+/* Private function for use by PCI IO mapping code */
+extern void __flush_hash_table_range(struct mm_struct *mm, unsigned long start,
+unsigned long end);
+extern void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd,
+   unsigned long addr);
+#endif /*  _ASM_POWERPC_BOOK3S_64_TLBFLUSH_HASH_H */
diff --git a/arch/powerpc/include/asm/tlbflush.h b/arch/powerpc/include/asm/tlbflush.h
index 23d351ca0303..9f77f85e3e99 100644
--- a/arch/powerpc/include/asm/tlbflush.h
+++ b/arch/powerpc/include/asm/tlbflush.h
@@ -78,97 +78,7 @@ static inline void local_flush_tlb_mm(struct mm_struct *mm)
 }
 
 #elif defined(CONFIG_PPC_STD_MMU_64)
-
-#define MMU_NO_CONTEXT 0
-
-/*
- * TLB flushing for 64-bit hash-MMU CPUs
- */
-
-#include <linux/percpu.h>
-#include <asm/page.h>
-
-#define PPC64_TLB_BATCH_NR 192
-
-struct ppc64_tlb_batch {
-   int active;
-   unsigned long   index;
-   struct mm_struct*mm;
-   real_pte_t  pte[PPC64_TLB_BATCH_NR];
-   unsigned long   vpn[PPC64_TLB_BATCH_NR];
-   unsigned intpsize;
-   int ssize;
-};
-DECLARE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch);
-
-extern void __flush_tlb_pending(struct ppc64_tlb_batch *batch);
-
-#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
-
-static inline void arch_enter_lazy_mmu_mode(void)
-{
-   struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
-
-   batch->active = 1;
-}
-
-static inline void arch_leave_lazy_mmu_mode(void)
-{
-   struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
-
-   if (batch->index)
-   __flush_tlb_pending(batch);
-   batch->active = 0;
-}
-
-#define arch_flush_lazy_mmu_mode()  do {} while (0)
-
-
-extern void flush_hash_page(unsigned long vpn, real_pte_t pte, int psize,
-   int ssize, unsigned long flags);
-extern void 

[RFC PATCH V1 17/33] mm: Change pmd_huge_pte type in mm_struct

2016-01-11 Thread Aneesh Kumar K.V
We need this to be the pte page not the pgtable_t

Signed-off-by: Aneesh Kumar K.V 
---
 include/linux/mm_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c67ea476991e..c9a1ebec07c4 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -474,7 +474,7 @@ struct mm_struct {
struct mmu_notifier_mm *mmu_notifier_mm;
 #endif
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
-   pgtable_t pmd_huge_pte; /* protected by page_table_lock */
+   struct page *pmd_huge_pte; /* protected by page_table_lock */
 #endif
 #ifdef CONFIG_CPUMASK_OFFSTACK
struct cpumask cpumask_allocation;
-- 
2.5.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH V1 28/33] powerpc/mm: Hash linux abstractions for early init routines

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/mmu-hash.h |  6 +-
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 61 +-
 arch/powerpc/include/asm/book3s/64/mmu.h  | 93 +++
 arch/powerpc/include/asm/mmu.h| 25 +++
 arch/powerpc/mm/hash_utils_64.c   |  6 +-
 5 files changed, 116 insertions(+), 75 deletions(-)
 create mode 100644 arch/powerpc/include/asm/book3s/64/mmu.h

diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
index 16f513e5cbd7..b82e063494dd 100644
--- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_MMU_HASH32_H_
-#define _ASM_POWERPC_MMU_HASH32_H_
+#ifndef _ASM_POWERPC_BOOK3S_32_MMU_HASH_H_
+#define _ASM_POWERPC_BOOK3S_32_MMU_HASH_H_
 /*
  * 32-bit hash table MMU support
  */
@@ -90,4 +90,4 @@ typedef struct {
 #define mmu_virtual_psize  MMU_PAGE_4K
 #define mmu_linear_psize   MMU_PAGE_256M
 
-#endif /* _ASM_POWERPC_MMU_HASH32_H_ */
+#endif /* _ASM_POWERPC_BOOK3S_32_MMU_HASH_H_ */
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 95ee27564804..ee929cb1a150 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_MMU_HASH64_H_
-#define _ASM_POWERPC_MMU_HASH64_H_
+#ifndef _ASM_POWERPC_BOOK3S_64_MMU_HASH_H_
+#define _ASM_POWERPC_BOOK3S_64_MMU_HASH_H_
 /*
  * PowerPC64 memory management structures
  *
@@ -126,24 +126,6 @@ extern struct hash_pte *htab_address;
 extern unsigned long htab_size_bytes;
 extern unsigned long htab_hash_mask;
 
-/*
- * Page size definition
- *
- *shift : is the "PAGE_SHIFT" value for that page size
- *sllp  : is a bit mask with the value of SLB L || LP to be or'ed
- *directly to a slbmte "vsid" value
- *penc  : is the HPTE encoding mask for the "LP" field:
- *
- */
-struct mmu_psize_def
-{
-   unsigned intshift;  /* number of bits */
-   int penc[MMU_PAGE_COUNT];   /* HPTE encoding */
-   unsigned inttlbiel; /* tlbiel supported for that page size */
-   unsigned long   avpnm;  /* bits to mask out in AVPN in the HPTE */
-   unsigned long   sllp;   /* SLB L||LP (exact mask to use in slbmte) */
-};
-extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 
 static inline int shift_to_mmu_psize(unsigned int shift)
 {
@@ -209,11 +191,6 @@ static inline int segment_shift(int ssize)
 /*
  * The current system page and segment sizes
  */
-extern int mmu_linear_psize;
-extern int mmu_virtual_psize;
-extern int mmu_vmalloc_psize;
-extern int mmu_vmemmap_psize;
-extern int mmu_io_psize;
 extern int mmu_kernel_ssize;
 extern int mmu_highuser_ssize;
 extern u16 mmu_slb_size;
@@ -511,38 +488,6 @@ static inline void subpage_prot_free(struct mm_struct *mm) 
{}
 static inline void subpage_prot_init_new_context(struct mm_struct *mm) { }
 #endif /* CONFIG_PPC_SUBPAGE_PROT */
 
-typedef unsigned long mm_context_id_t;
-struct spinlock;
-
-typedef struct {
-   mm_context_id_t id;
-   u16 user_psize; /* page size index */
-
-#ifdef CONFIG_PPC_MM_SLICES
-   u64 low_slices_psize;   /* SLB page size encodings */
-   unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
-#else
-   u16 sllp;   /* SLB page size encoding */
-#endif
-   unsigned long vdso_base;
-#ifdef CONFIG_PPC_SUBPAGE_PROT
-   struct subpage_prot_table spt;
-#endif /* CONFIG_PPC_SUBPAGE_PROT */
-#ifdef CONFIG_PPC_ICSWX
-   struct spinlock *cop_lockp; /* guard acop and cop_pid */
-   unsigned long acop; /* mask of enabled coprocessor types */
-   unsigned int cop_pid;   /* pid value used with coprocessors */
-#endif /* CONFIG_PPC_ICSWX */
-#ifdef CONFIG_PPC_64K_PAGES
-   /* for 4K PTE fragment support */
-   void *pte_frag;
-#endif
-#ifdef CONFIG_SPAPR_TCE_IOMMU
-   struct list_head iommu_group_mem_list;
-#endif
-} mm_context_t;
-
-
 #if 0
 /*
  * The code below is equivalent to this function for arguments
@@ -609,4 +554,4 @@ static inline unsigned long get_kernel_vsid(unsigned long 
ea, int ssize)
 }
 #endif /* __ASSEMBLY__ */
 
-#endif /* _ASM_POWERPC_MMU_HASH64_H_ */
+#endif /* _ASM_POWERPC_BOOK3S_64_MMU_HASH_H_ */
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
new file mode 100644
index ..a2274ad0afee
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -0,0 +1,93 @@
+#ifndef _ASM_POWERPC_BOOK3S_64_MMU_H_
+#define _ASM_POWERPC_BOOK3S_64_MMU_H_
+
+#ifndef __ASSEMBLY__
+/*
+ * Page size definition
+ *
+ *shift : is the "PAGE_SHIFT" value for that page size
+ *sllp  : is a bit mask with the value of SLB L || LP to be or'ed
+ *directly to a slbmte "vsid" value
+ *

[RFC PATCH V1 24/33] powerpc/mm: Hash linux abstraction for page table accessors

2016-01-11 Thread Aneesh Kumar K.V
We will later make the generic functions do conditional radix or hash
page table access. This patch doesn't update the hugepage API yet.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h| 138 +++---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 262 ++-
 arch/powerpc/mm/hash_utils_64.c  |   4 +-
 3 files changed, 336 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index f6d27579607f..5d333400c87d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -217,18 +217,18 @@
 #define H_PUD_BAD_BITS (H_PMD_TABLE_SIZE-1)
 
 #ifndef __ASSEMBLY__
-#definepmd_bad(pmd)(!is_kernel_addr(pmd_val(pmd)) \
+#definehlpmd_bad(pmd)  (!is_kernel_addr(pmd_val(pmd))  
\
 || (pmd_val(pmd) & H_PMD_BAD_BITS))
-#define pmd_page_vaddr(pmd)(pmd_val(pmd) & ~H_PMD_MASKED_BITS)
+#define hlpmd_page_vaddr(pmd)  (pmd_val(pmd) & ~H_PMD_MASKED_BITS)
 
-#definepud_bad(pud)(!is_kernel_addr(pud_val(pud)) \
+#definehlpud_bad(pud)  (!is_kernel_addr(pud_val(pud))  
\
 || (pud_val(pud) & H_PUD_BAD_BITS))
-#define pud_page_vaddr(pud)(pud_val(pud) & ~H_PUD_MASKED_BITS)
+#define hlpud_page_vaddr(pud)  (pud_val(pud) & ~H_PUD_MASKED_BITS)
 
-#define pgd_index(address) (((address) >> (H_PGDIR_SHIFT)) & (H_PTRS_PER_PGD - 
1))
-#define pud_index(address) (((address) >> (H_PUD_SHIFT)) & (H_PTRS_PER_PUD - 
1))
-#define pmd_index(address) (((address) >> (H_PMD_SHIFT)) & (H_PTRS_PER_PMD - 
1))
-#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (H_PTRS_PER_PTE - 1))
+#define hlpgd_index(address) (((address) >> (H_PGDIR_SHIFT)) & (H_PTRS_PER_PGD 
- 1))
+#define hlpud_index(address) (((address) >> (H_PUD_SHIFT)) & (H_PTRS_PER_PUD - 
1))
+#define hlpmd_index(address) (((address) >> (H_PMD_SHIFT)) & (H_PTRS_PER_PMD - 
1))
+#define hlpte_index(address) (((address) >> (PAGE_SHIFT)) & (H_PTRS_PER_PTE - 
1))
 
 /* Encode and de-code a swap entry */
 #define MAX_SWAPFILES_CHECK() do { \
@@ -276,11 +276,11 @@ extern void hpte_need_flush(struct mm_struct *mm, 
unsigned long addr,
pte_t *ptep, unsigned long pte, int huge);
 extern unsigned long htab_convert_pte_flags(unsigned long pteflags);
 /* Atomic PTE updates */
-static inline unsigned long pte_update(struct mm_struct *mm,
-  unsigned long addr,
-  pte_t *ptep, unsigned long clr,
-  unsigned long set,
-  int huge)
+static inline unsigned long hlpte_update(struct mm_struct *mm,
+unsigned long addr,
+pte_t *ptep, unsigned long clr,
+unsigned long set,
+int huge)
 {
unsigned long old, tmp;
 
@@ -313,42 +313,41 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
  * We should be more intelligent about this but for the moment we override
  * these functions and force a tlb flush unconditionally
  */
-static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
+static inline int __hlptep_test_and_clear_young(struct mm_struct *mm,
  unsigned long addr, pte_t *ptep)
 {
unsigned long old;
 
if ((pte_val(*ptep) & (H_PAGE_ACCESSED | H_PAGE_HASHPTE)) == 0)
return 0;
-   old = pte_update(mm, addr, ptep, H_PAGE_ACCESSED, 0, 0);
+   old = hlpte_update(mm, addr, ptep, H_PAGE_ACCESSED, 0, 0);
return (old & H_PAGE_ACCESSED) != 0;
 }
 
-#define __HAVE_ARCH_PTEP_SET_WRPROTECT
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+static inline void hlptep_set_wrprotect(struct mm_struct *mm, unsigned long 
addr,
  pte_t *ptep)
 {
 
if ((pte_val(*ptep) & H_PAGE_RW) == 0)
return;
 
-   pte_update(mm, addr, ptep, H_PAGE_RW, 0, 0);
+   hlpte_update(mm, addr, ptep, H_PAGE_RW, 0, 0);
 }
 
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+static inline void huge_hlptep_set_wrprotect(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
 {
if ((pte_val(*ptep) & H_PAGE_RW) == 0)
return;
 
-   pte_update(mm, addr, ptep, H_PAGE_RW, 0, 1);
+   hlpte_update(mm, addr, ptep, H_PAGE_RW, 0, 1);
 }
 
 
 /* Set the dirty and/or accessed bits atomically in a linux PTE, this
  * function doesn't need to flush the hash entry
  */
-static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
+static inline void 

Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-11 Thread Stewart Smith
Michael Ellerman  writes:
> On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
>> On BMC machines, console output is controlled by the OPAL firmware and is
>> only flushed when its pollers are called.  When the kernel is in a panic
>> state, it no longer calls these pollers and thus console output does not
>> completely flush, causing some output from the panic to be lost.
>> 
>> Output is only actually lost when the kernel is configured to not power off
>> or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
>> flushes the console buffer as part of its power down routines.  Before this
>> patch, however, only partial output would be printed during the timeout wait.
>> 
>> This patch adds a new kmsg_dumper which gets called at panic time to ensure
>> panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
>> in the OPAL API, and if that is not available, the pollers are called enough
>> times to (hopefully) completely flush the buffer.
>> 
>> The flushing mechanism will only affect output printed at and before the
>> kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
>> message may still be truncated as follows:
>> 
>> >Call Trace:
>> >[c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
>> >[c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
>> >[c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
>> >[c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
>> >[c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
>> >[c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
>> >[c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
>> >---[ end Kernel panic - not
>> 
>> This functionality is implemented as a kmsg_dumper as it seems to be the
>> most sensible way to introduce platform-specific functionality to the
>> panic function.
>> 
>> Signed-off-by: Russell Currey 
>> Reviewed-by: Andrew Donnellan 
>
> Applied to powerpc next, thanks.
>
> https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f

The firmware interface changed slightly since this kernel patch[1]: it
added a parameter to OPAL_CONSOLE_FLUSH which accepts the terminal
number to flush, theoretically allowing this to be plumbed into the TTY
layer or something too.

So, we'll either have to update this patch or replace it with an updated
one.

[1] I'm pushing the accepted skiboot patch now.

-- 
Stewart Smith
OPAL Architect, IBM.


[RFC PATCH V1 26/33] powerpc/mm: Hash linux abstraction for mmu context handling code

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/mmu_context.h | 63 +++---
 arch/powerpc/kernel/swsusp.c   |  2 +-
 arch/powerpc/mm/mmu_context_hash64.c   | 16 -
 arch/powerpc/mm/mmu_context_nohash.c   |  3 +-
 drivers/cpufreq/pmac32-cpufreq.c   |  2 +-
 drivers/macintosh/via-pmu.c|  4 +--
 6 files changed, 57 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 878c27771717..5124b721da6e 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -10,11 +10,6 @@
 #include 
 #include 
 
-/*
- * Most if the context management is out of line
- */
-extern int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
-extern void destroy_context(struct mm_struct *mm);
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 struct mm_iommu_table_group_mem_t;
 
@@ -33,16 +28,50 @@ extern long mm_iommu_ua_to_hpa(struct 
mm_iommu_table_group_mem_t *mem,
 extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
 extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
 #endif
+/*
+ * Most of the context management is out of line
+ */
+#ifdef CONFIG_PPC_BOOK3S_64
+extern int hlinit_new_context(struct task_struct *tsk, struct mm_struct *mm);
+static inline int init_new_context(struct task_struct *tsk, struct mm_struct 
*mm)
+{
+   return hlinit_new_context(tsk, mm);
+}
+
+extern void hldestroy_context(struct mm_struct *mm);
+static inline void destroy_context(struct mm_struct *mm)
+{
+   return hldestroy_context(mm);
+}
 
-extern void switch_mmu_context(struct mm_struct *prev, struct mm_struct *next);
 extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
-extern void set_context(unsigned long id, pgd_t *pgd);
+static inline void switch_mmu_context(struct mm_struct *prev,
+ struct mm_struct *next,
+ struct task_struct *tsk)
+{
+   return switch_slb(tsk, next);
+}
 
-#ifdef CONFIG_PPC_BOOK3S_64
-extern int __init_new_context(void);
-extern void __destroy_context(int context_id);
+extern void set_context(unsigned long id, pgd_t *pgd);
+extern int __hlinit_new_context(void);
+static inline int __init_new_context(void)
+{
+   return __hlinit_new_context();
+}
+extern void __hldestroy_context(int context_id);
+static inline void __destroy_context(int context_id)
+{
+   return __hldestroy_context(context_id);
+}
 static inline void mmu_context_init(void) { }
 #else
+extern int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
+extern void destroy_context(struct mm_struct *mm);
+
+extern void switch_mmu_context(struct mm_struct *prev, struct mm_struct *next,
+  struct task_struct *tsk);
+extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
+extern void set_context(unsigned long id, pgd_t *pgd);
 extern unsigned long __init_new_context(void);
 extern void __destroy_context(unsigned long context_id);
 extern void mmu_context_init(void);
@@ -88,17 +117,11 @@ static inline void switch_mm(struct mm_struct *prev, 
struct mm_struct *next,
if (cpu_has_feature(CPU_FTR_ALTIVEC))
asm volatile ("dssall");
 #endif /* CONFIG_ALTIVEC */
-
-   /* The actual HW switching method differs between the various
-* sub architectures.
+   /*
+* The actual HW switching method differs between the various
+* sub architectures. Out of line for now
 */
-#ifdef CONFIG_PPC_STD_MMU_64
-   switch_slb(tsk, next);
-#else
-   /* Out of line for now */
-   switch_mmu_context(prev, next);
-#endif
-
+   switch_mmu_context(prev, next, tsk);
 }
 
 #define deactivate_mm(tsk,mm)  do { } while (0)
diff --git a/arch/powerpc/kernel/swsusp.c b/arch/powerpc/kernel/swsusp.c
index 6669b1752512..6ae9bd5086a4 100644
--- a/arch/powerpc/kernel/swsusp.c
+++ b/arch/powerpc/kernel/swsusp.c
@@ -31,6 +31,6 @@ void save_processor_state(void)
 void restore_processor_state(void)
 {
 #ifdef CONFIG_PPC32
-   switch_mmu_context(current->active_mm, current->active_mm);
+   switch_mmu_context(current->active_mm, current->active_mm, NULL);
 #endif
 }
diff --git a/arch/powerpc/mm/mmu_context_hash64.c 
b/arch/powerpc/mm/mmu_context_hash64.c
index ff9baa5d2944..9c147d800760 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -30,7 +30,7 @@
 static DEFINE_SPINLOCK(mmu_context_lock);
 static DEFINE_IDA(mmu_context_ida);
 
-int __init_new_context(void)
+int __hlinit_new_context(void)
 {
int index;
int err;
@@ -59,11 +59,11 @@ again:
 }
 EXPORT_SYMBOL_GPL(__init_new_context);
 
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int hlinit_new_context(struct task_struct *tsk, struct mm_struct *mm)
 {
int index;
 
-   

[RFC PATCH V1 10/33] powerpc/mm: free_hugepd_range split to hash and nonhash

2016-01-11 Thread Aneesh Kumar K.V
We strictly don't need to do this, but it enables us to avoid depending
on pgtable_free_tlb for radix.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/hugetlbpage-book3e.c | 187 ++
 arch/powerpc/mm/hugetlbpage-hash64.c | 150 
 arch/powerpc/mm/hugetlbpage.c| 188 ---
 3 files changed, 337 insertions(+), 188 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage-book3e.c 
b/arch/powerpc/mm/hugetlbpage-book3e.c
index e6339ac45f0f..94be03c58c60 100644
--- a/arch/powerpc/mm/hugetlbpage-book3e.c
+++ b/arch/powerpc/mm/hugetlbpage-book3e.c
@@ -265,6 +265,193 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long 
addr, unsigned long sz
return hugepte_offset(*hpdp, addr, pdshift);
 }
 
+extern void hugepd_free(struct mmu_gather *tlb, void *hugepte);
+static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int 
pdshift,
+ unsigned long start, unsigned long end,
+ unsigned long floor, unsigned long ceiling)
+{
+   pte_t *hugepte = hugepd_page(*hpdp);
+   int i;
+
+   unsigned long pdmask = ~((1UL << pdshift) - 1);
+   unsigned int num_hugepd = 1;
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+   /* Note: On fsl the hpdp may be the first of several */
+   num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
+#else
+   unsigned int shift = hugepd_shift(*hpdp);
+#endif
+
+   start &= pdmask;
+   if (start < floor)
+   return;
+   if (ceiling) {
+   ceiling &= pdmask;
+   if (! ceiling)
+   return;
+   }
+   if (end - 1 > ceiling - 1)
+   return;
+
+   for (i = 0; i < num_hugepd; i++, hpdp++)
+   hpdp->pd = 0;
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+   hugepd_free(tlb, hugepte);
+#else
+   pgtable_free_tlb(tlb, hugepte, pdshift - shift);
+#endif
+}
+
+static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
+  unsigned long addr, unsigned long end,
+  unsigned long floor, unsigned long ceiling)
+{
+   pmd_t *pmd;
+   unsigned long next;
+   unsigned long start;
+
+   start = addr;
+   do {
+   pmd = pmd_offset(pud, addr);
+   next = pmd_addr_end(addr, end);
+   if (!is_hugepd(__hugepd(pmd_val(*pmd {
+   /*
+* if it is not hugepd pointer, we should already find
+* it cleared.
+*/
+   WARN_ON(!pmd_none_or_clear_bad(pmd));
+   continue;
+   }
+#ifdef CONFIG_PPC_FSL_BOOK3E
+   /*
+* Increment next by the size of the huge mapping since
+* there may be more than one entry at this level for a
+* single hugepage, but all of them point to
+* the same kmem cache that holds the hugepte.
+*/
+   next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
+#endif
+   free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
+ addr, next, floor, ceiling);
+   } while (addr = next, addr != end);
+
+   start &= PUD_MASK;
+   if (start < floor)
+   return;
+   if (ceiling) {
+   ceiling &= PUD_MASK;
+   if (!ceiling)
+   return;
+   }
+   if (end - 1 > ceiling - 1)
+   return;
+
+   pmd = pmd_offset(pud, start);
+   pud_clear(pud);
+   pmd_free_tlb(tlb, pmd, start);
+   mm_dec_nr_pmds(tlb->mm);
+}
+
+static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
+  unsigned long addr, unsigned long end,
+  unsigned long floor, unsigned long ceiling)
+{
+   pud_t *pud;
+   unsigned long next;
+   unsigned long start;
+
+   start = addr;
+   do {
+   pud = pud_offset(pgd, addr);
+   next = pud_addr_end(addr, end);
+   if (!is_hugepd(__hugepd(pud_val(*pud {
+   if (pud_none_or_clear_bad(pud))
+   continue;
+   hugetlb_free_pmd_range(tlb, pud, addr, next, floor,
+  ceiling);
+   } else {
+#ifdef CONFIG_PPC_FSL_BOOK3E
+   /*
+* Increment next by the size of the huge mapping since
+* there may be more than one entry at this level for a
+* single hugepage, but all of them point to
+* the same kmem cache that holds the hugepte.
+*/
+   next = addr + (1 << hugepd_shift(*(hugepd_t *)pud));
+#endif

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-11 Thread Leonid Yegoshin

On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote:

On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
smp_read_barrier_depends, smp_store_release and smp_load_acquire  match
the asm-generic variants exactly. Drop the local definitions and pull in
asm-generic/barrier.h instead.

This statement doesn't fit the MIPS barrier variations. Moreover, there is 
a reason to make these even more specific, at least for 
smp_store_release and smp_load_acquire; look into


http://patchwork.linux-mips.org/patch/10506/

- Leonid.


[RFC PATCH V1 20/33] powerpc/mm: Use flush_tlb_page in ptep_clear_flush_young

2016-01-11 Thread Aneesh Kumar K.V
This should not have any impact on the hash Linux implementation, but
radix would require us to flush the TLB after clearing the accessed bit.
Also move code that is not dependent on PTE bits to the generic header.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h| 45 +---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 38 +++
 arch/powerpc/include/asm/mmu-hash64.h|  2 +-
 3 files changed, 47 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 212037f5c0af..f6d27579607f 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -305,6 +305,14 @@ static inline unsigned long pte_update(struct mm_struct 
*mm,
return old;
 }
 
+/*
+ * We currently remove entries from the hashtable regardless of whether
+ * the entry was young or dirty. The generic routines only flush if the
+ * entry was young or dirty which is not good enough.
+ *
+ * We should be more intelligent about this but for the moment we override
+ * these functions and force a tlb flush unconditionally
+ */
 static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
  unsigned long addr, pte_t *ptep)
 {
@@ -315,13 +323,6 @@ static inline int __ptep_test_and_clear_young(struct 
mm_struct *mm,
old = pte_update(mm, addr, ptep, H_PAGE_ACCESSED, 0, 0);
return (old & H_PAGE_ACCESSED) != 0;
 }
-#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-#define ptep_test_and_clear_young(__vma, __addr, __ptep)  \
-({\
-   int __r;   \
-   __r = __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep); \
-   __r;   \
-})
 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
@@ -343,36 +344,6 @@ static inline void huge_ptep_set_wrprotect(struct 
mm_struct *mm,
pte_update(mm, addr, ptep, H_PAGE_RW, 0, 1);
 }
 
-/*
- * We currently remove entries from the hashtable regardless of whether
- * the entry was young or dirty. The generic routines only flush if the
- * entry was young or dirty which is not good enough.
- *
- * We should be more intelligent about this but for the moment we override
- * these functions and force a tlb flush unconditionally
- */
-#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
-#define ptep_clear_flush_young(__vma, __address, __ptep)   \
-({ \
-   int __young = __ptep_test_and_clear_young((__vma)->vm_mm, __address, \
- __ptep);  \
-   __young;\
-})
-
-#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
-  unsigned long addr, pte_t *ptep)
-{
-   unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
-   return __pte(old);
-}
-
-static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
-pte_t * ptep)
-{
-   pte_update(mm, addr, ptep, ~0UL, 0, 0);
-}
-
 
 /* Set the dirty and/or accessed bits atomically in a linux PTE, this
  * function doesn't need to flush the hash entry
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 3df6684c5948..90ec5b8b02c1 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -8,6 +8,10 @@
 #include 
 #include 
 
+#ifndef __ASSEMBLY__
+#include 
+#include 
+#endif
 /*
  * The second half of the kernel virtual space is used for IO mappings,
  * it's itself carved into the PIO region (ISA and PHB IO space) and
@@ -126,6 +130,40 @@ extern unsigned long ioremap_bot;
 
 #endif /* __real_pte */
 
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
+   unsigned long address,
+   pte_t *ptep)
+{
+   return  __ptep_test_and_clear_young(vma->vm_mm, address, ptep);
+}
+
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
+unsigned long address, pte_t *ptep)
+{
+   int young;
+
+   young = __ptep_test_and_clear_young(vma->vm_mm, address, ptep);
+   if (young)
+   flush_tlb_page(vma, address);
+   return young;
+}
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct 

Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

2016-01-11 Thread Russell Currey
On Tue, 2016-01-12 at 14:44 +1100, Stewart Smith wrote:
> Michael Ellerman  writes:
> > On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
> > > On BMC machines, console output is controlled by the OPAL firmware and is
> > > only flushed when its pollers are called.  When the kernel is in a panic
> > > state, it no longer calls these pollers and thus console output does not
> > > completely flush, causing some output from the panic to be lost.
> > > 
> > > Output is only actually lost when the kernel is configured to not power
> > > off
> > > or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
> > > flushes the console buffer as part of its power down routines.  Before
> > > this
> > > patch, however, only partial output would be printed during the timeout
> > > wait.
> > > 
> > > This patch adds a new kmsg_dumper which gets called at panic time to
> > > ensure
> > > panic output is not lost.  It accomplishes this by calling
> > > OPAL_CONSOLE_FLUSH
> > > in the OPAL API, and if that is not available, the pollers are called
> > > enough
> > > times to (hopefully) completely flush the buffer.
> > > 
> > > The flushing mechanism will only affect output printed at and before the
> > > kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel
> > > panic"
> > > message may still be truncated as follows:
> > > 
> > > > Call Trace:
> > > > [c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
> > > > [c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
> > > > [c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
> > > > [c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
> > > > [c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
> > > > [c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
> > > > [c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
> > > > ---[ end Kernel panic - not
> > > 
> > > This functionality is implemented as a kmsg_dumper as it seems to be the
> > > most sensible way to introduce platform-specific functionality to the
> > > panic function.
> > > 
> > > Signed-off-by: Russell Currey 
> > > Reviewed-by: Andrew Donnellan 
> > 
> > Applied to powerpc next, thanks.
> > 
> > https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f
> 
> The firmware interface changed slightly since this kernel patch[1], it
> added a parameter to OPAL_CONSOLE_FLUSH which accepted the terminal
> number to flush, theoretically allowing this to be plumbed into TTY
> layer or something too.
> 
> So, we'll either have to update this patch or replace it with an updated
> one.
> 
> [1] i'm pushing the accepted skiboot patch now.
> 
I'm working on an updated kernel patch to use the new parameter and additional
return values, so I suppose it's up to mpe whether or not this patch gets
merged now and another gets sent later to amend it, or if this patch gets
reverted in next and I can send a V4 adding the new stuff.

[RFC PATCH V1 09/33] powerpc/mm: Hugetlbfs is book3s_64 and fsl_book3e (32 or 64)

2016-01-11 Thread Aneesh Kumar K.V
We move a large part of the FSL-related code to hugetlbpage-book3e.c.
This is only code movement, and it also avoids #ifdefs in the code.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/hugetlb.h   |   1 +
 arch/powerpc/mm/hugetlbpage-book3e.c | 293 +
 arch/powerpc/mm/hugetlbpage-hash64.c | 121 +++
 arch/powerpc/mm/hugetlbpage.c| 401 +--
 4 files changed, 416 insertions(+), 400 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h 
b/arch/powerpc/include/asm/hugetlb.h
index 7eac89b9f02e..0525f1c29afb 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -47,6 +47,7 @@ static inline unsigned int hugepd_shift(hugepd_t hpd)
 
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
+#define hugepd_none(hpd)   ((hpd).pd == 0)
 
 static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr,
unsigned pdshift)
diff --git a/arch/powerpc/mm/hugetlbpage-book3e.c 
b/arch/powerpc/mm/hugetlbpage-book3e.c
index ba47aaf33a4b..e6339ac45f0f 100644
--- a/arch/powerpc/mm/hugetlbpage-book3e.c
+++ b/arch/powerpc/mm/hugetlbpage-book3e.c
@@ -7,6 +7,39 @@
  */
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Tracks gpages after the device tree is scanned and before the
+ * huge_boot_pages list is ready.  On non-Freescale implementations, this is
+ * just used to track 16G pages and so is a single array.  FSL-based
+ * implementations may have more than one gpage size, so we need multiple
+ * arrays
+ */
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define MAX_NUMBER_GPAGES  128
+struct psize_gpages {
+   u64 gpage_list[MAX_NUMBER_GPAGES];
+   unsigned int nr_gpages;
+};
+static struct psize_gpages gpage_freearray[MMU_PAGE_COUNT];
+#endif
+
+/*
+ * These macros define how to determine which level of the page table holds
+ * the hpdp.
+ */
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define HUGEPD_PGD_SHIFT PGDIR_SHIFT
+#define HUGEPD_PUD_SHIFT PUD_SHIFT
+#else
+#define HUGEPD_PGD_SHIFT PUD_SHIFT
+#define HUGEPD_PUD_SHIFT PMD_SHIFT
+#endif
 
 #ifdef CONFIG_PPC_FSL_BOOK3E
 #ifdef CONFIG_PPC64
@@ -151,3 +184,263 @@ void flush_hugetlb_page(struct vm_area_struct *vma, 
unsigned long vmaddr)
 
__flush_tlb_page(vma->vm_mm, vmaddr, tsize, 0);
 }
+
+static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
+  unsigned long address, unsigned pdshift, unsigned 
pshift)
+{
+   struct kmem_cache *cachep;
+   pte_t *new;
+
+   int i;
+   int num_hugepd = 1 << (pshift - pdshift);
+   cachep = hugepte_cache;
+
+   new = kmem_cache_zalloc(cachep, GFP_KERNEL|__GFP_REPEAT);
+
+   BUG_ON(pshift > HUGEPD_SHIFT_MASK);
+   BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK);
+
+   if (! new)
+   return -ENOMEM;
+
+   spin_lock(>page_table_lock);
+   /*
+* We have multiple higher-level entries that point to the same
+* actual pte location.  Fill in each as we go and backtrack on error.
+* We need all of these so the DTLB pgtable walk code can find the
+* right higher-level entry without knowing if it's a hugepage or not.
+*/
+   for (i = 0; i < num_hugepd; i++, hpdp++) {
+   if (unlikely(!hugepd_none(*hpdp)))
+   break;
+   else
+   /* We use the old format for PPC_FSL_BOOK3E */
+   hpdp->pd = ((unsigned long)new & ~PD_HUGE) | pshift;
+   }
+   /* If we bailed from the for loop early, an error occurred, clean up */
+   if (i < num_hugepd) {
+   for (i = i - 1 ; i >= 0; i--, hpdp--)
+   hpdp->pd = 0;
+   kmem_cache_free(cachep, new);
+   }
+   spin_unlock(&mm->page_table_lock);
+   return 0;
+}
+
+pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long 
sz)
+{
+   pgd_t *pg;
+   pud_t *pu;
+   pmd_t *pm;
+   hugepd_t *hpdp = NULL;
+   unsigned pshift = __ffs(sz);
+   unsigned pdshift = PGDIR_SHIFT;
+
+   addr &= ~(sz-1);
+
+   pg = pgd_offset(mm, addr);
+
+   if (pshift >= HUGEPD_PGD_SHIFT) {
+   hpdp = (hugepd_t *)pg;
+   } else {
+   pdshift = PUD_SHIFT;
+   pu = pud_alloc(mm, pg, addr);
+   if (pshift >= HUGEPD_PUD_SHIFT) {
+   hpdp = (hugepd_t *)pu;
+   } else {
+   pdshift = PMD_SHIFT;
+   pm = pmd_alloc(mm, pu, addr);
+   hpdp = (hugepd_t *)pm;
+   }
+   }
+
+   if (!hpdp)
+   return NULL;
+
+   BUG_ON(!hugepd_none(*hpdp) && !hugepd_ok(*hpdp));
+
+   if (hugepd_none(*hpdp) && __hugepte_alloc(mm, hpdp, addr, pdshift, 
pshift))
+   return NULL;
+
+   return hugepte_offset(*hpdp, addr, pdshift);
+}
+
+#ifdef 

[RFC PATCH V1 25/33] powerpc/mm: Hash linux abstraction for functions in pgtable-hash.c

2016-01-11 Thread Aneesh Kumar K.V
We will later make the generic functions do conditional radix or hash
page table access. This patch doesn't do hugepage api update yet.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 15 +
 arch/powerpc/include/asm/book3s/64/hash.h| 12 ++-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 47 +++-
 arch/powerpc/include/asm/book3s/pgtable.h|  4 ---
 arch/powerpc/include/asm/nohash/64/pgtable.h |  4 ++-
 arch/powerpc/include/asm/nohash/pgtable.h| 11 +++
 arch/powerpc/include/asm/pgtable.h   | 13 
 arch/powerpc/mm/init_64.c|  3 --
 arch/powerpc/mm/pgtable-hash64.c | 34 ++--
 9 files changed, 103 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index b53d7504d6f6..6b1859c78719 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -102,6 +102,9 @@ extern unsigned long ioremap_bot;
 #define pte_clear(mm, addr, ptep) \
do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
 
+extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+  pte_t pte);
+
 #define pmd_none(pmd)  (!pmd_val(pmd))
 #definepmd_bad(pmd)(pmd_val(pmd) & _PMD_BAD)
 #definepmd_present(pmd)(pmd_val(pmd) & _PMD_PRESENT_MASK)
@@ -502,6 +505,18 @@ static inline unsigned long ioremap_prot_flags(unsigned 
long flags)
flags &= ~(_PAGE_USER | _PAGE_EXEC);
return flags;
 }
+
+/*
+ * This gets called at the end of handling a page fault, when
+ * the kernel has put a new PTE into the page table for the process.
+ * We use it to ensure coherency between the i-cache and d-cache
+ * for the page which has just been mapped in.
+ * On machines which use an MMU hash table, we use this to put a
+ * corresponding HPTE into the hash table ahead of time, instead of
+ * waiting for the inevitable extra hash-table miss exception.
+ */
+extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t *);
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /*  _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 5d333400c87d..20bb9da200c6 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -600,7 +600,17 @@ static inline void hpte_do_hugepage_flush(struct mm_struct 
*mm,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-extern int map_kernel_page(unsigned long ea, unsigned long pa, int flags);
+extern int hlmap_kernel_page(unsigned long ea, unsigned long pa, int flags);
+extern void hlpgtable_cache_init(void);
+extern void __meminit hlvmemmap_create_mapping(unsigned long start,
+  unsigned long page_size,
+  unsigned long phys);
+extern void hlvmemmap_remove_mapping(unsigned long start,
+unsigned long page_size);
+extern void set_hlpte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+pte_t pte);
+extern void hlupdate_mmu_cache(struct vm_area_struct *vma, unsigned long 
address,
+  pte_t *ptep);
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index ca2f4364fac2..213cc7b8dac2 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -317,6 +317,12 @@ static inline int pte_present(pte_t pte)
return hlpte_present(pte);
 }
 
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pte)
+{
+   return set_hlpte_at(mm, addr, ptep, pte);
+}
+
 static inline void pmd_set(pmd_t *pmdp, unsigned long val)
 {
*pmdp = __pmd(val);
@@ -459,7 +465,46 @@ extern struct page *pgd_page(pgd_t pgd);
pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
-void pgtable_cache_init(void);
+static inline void pgtable_cache_init(void)
+{
+   return hlpgtable_cache_init();
+}
+
+static inline int map_kernel_page(unsigned long ea, unsigned long pa,
+ unsigned long flags)
+{
+   return hlmap_kernel_page(ea, pa, flags);
+}
+
+static inline void __meminit vmemmap_create_mapping(unsigned long start,
+   unsigned long page_size,
+   unsigned long phys)
+{
+   return hlvmemmap_create_mapping(start, page_size, phys);
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG

[RFC PATCH V1 22/33] powerpc/mm: Use generic version of pmdp_clear_flush_young

2016-01-11 Thread Aneesh Kumar K.V
The radix variant is going to require a flush_tlb_range. We can't then
have this as a static inline because of the usage of HPAGE_PMD_SIZE. So
we are forced to make it a function, in which case we can use the generic
version.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |  3 ---
 arch/powerpc/mm/pgtable-hash64.c | 10 ++
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 90ec5b8b02c1..4dbd5eab2521 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -330,9 +330,6 @@ extern int pmdp_set_access_flags(struct vm_area_struct *vma,
 #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
 extern int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 unsigned long address, pmd_t *pmdp);
-#define __HAVE_ARCH_PMDP_CLEAR_YOUNG_FLUSH
-extern int pmdp_clear_flush_young(struct vm_area_struct *vma,
- unsigned long address, pmd_t *pmdp);
 
 #define __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR
 extern pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
index 6c9c16b37033..2f9348e24633 100644
--- a/arch/powerpc/mm/pgtable-hash64.c
+++ b/arch/powerpc/mm/pgtable-hash64.c
@@ -349,12 +349,6 @@ pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, 
unsigned long address,
return pmd;
 }
 
-int pmdp_test_and_clear_young(struct vm_area_struct *vma,
- unsigned long address, pmd_t *pmdp)
-{
-   return __pmdp_test_and_clear_young(vma->vm_mm, address, pmdp);
-}
-
 /*
  * We currently remove entries from the hashtable regardless of whether
  * the entry was young or dirty. The generic routines only flush if the
@@ -363,8 +357,8 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma,
  * We should be more intelligent about this but for the moment we override
  * these functions and force a tlb flush unconditionally
  */
-int pmdp_clear_flush_young(struct vm_area_struct *vma,
- unsigned long address, pmd_t *pmdp)
+int pmdp_test_and_clear_young(struct vm_area_struct *vma,
+ unsigned long address, pmd_t *pmdp)
 {
return __pmdp_test_and_clear_young(vma->vm_mm, address, pmdp);
 }
-- 
2.5.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Timur Tabi

Mark Brown wrote:

Quite possibly (it'll be more efficient and it's intended for such use
cases) but as I said in my other reply that then has the issue that it
implicitly gives default values to all the registers so I'd expect we
still need to handle the cache initialisation explicitly (or
alternatively the hardware sync with the cache on startup).


Why does REGCACHE_FLAT assume that all registers have a default value of 
0?  Shouldn't it have the same behavior w.r.t. cache values as 
REGCACHE_RBTREE?

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification

2016-01-11 Thread Viresh Kumar
On 11-01-16, 14:23, Greg KH wrote:
> On Mon, Jan 11, 2016 at 02:54:36PM -0600, Shilpasri G Bhat wrote:
> > In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
> > CPU when the chip crosses its thermal and power limits. Currently,
> > powernv-cpufreq driver detects and reports this event as a console
> > message. Some machines may not sustain the max turbo frequency in all
> > conditions and can be throttled frequently. This can lead to the
> > flooding of console with throttle messages. So this patchset aims to
> > redesign the presentation of this event via sysfs counters and
> > tracepoints. 
> > 
> > Patches [2] to [4] will add a perf trace point "power:powernv_throttle" and
> > sysfs throttle counter stats in /sys/devices/system/cpu/cpufreq/chipN.
> Patch [1] solves a bug in powernv_cpufreq_throttle_check(), which calls
> in to cpu_to_chip_id() in hot path which reads DT every time to find the
> chip id.
> 
> 
> 
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
> for how to do this properly.
> 
> 

Also you shouldn't use --in-reply-to for the new versions of a
multiple patch series. Just use a new thread.

-- 
viresh
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[RFC PATCH V1 21/33] powerpc/mm: THP is only available on hash64 as of now

2016-01-11 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/pgtable-hash64.c | 341 +++
 arch/powerpc/mm/pgtable_64.c | 341 ---
 2 files changed, 341 insertions(+), 341 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
index 35127acdb8c6..6c9c16b37033 100644
--- a/arch/powerpc/mm/pgtable-hash64.c
+++ b/arch/powerpc/mm/pgtable-hash64.c
@@ -21,6 +21,9 @@
 
 #include "mmu_decl.h"
 
+#define CREATE_TRACE_POINTS
+#include 
+
 #if H_PGTABLE_RANGE > USER_VSID_RANGE
 #warning Limited user VSID range means pagetable space is wasted
 #endif
@@ -244,3 +247,341 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, 
pte_t *ptep,
/* Perform the setting of the PTE */
__set_pte_at(mm, addr, ptep, pte, 0);
 }
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+
+/*
+ * This is called when relaxing access to a hugepage. It's also called in the 
page
+ * fault path when we don't hit any of the major fault cases, ie, a minor
+ * update of _PAGE_ACCESSED, _PAGE_DIRTY, etc... The generic code will have
+ * handled those two for us, we additionally deal with missing execute
+ * permission here on some processors
+ */
+int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
+ pmd_t *pmdp, pmd_t entry, int dirty)
+{
+   int changed;
+#ifdef CONFIG_DEBUG_VM
+   WARN_ON(!pmd_trans_huge(*pmdp));
+   assert_spin_locked(&vma->vm_mm->page_table_lock);
+#endif
+   changed = !pmd_same(*(pmdp), entry);
+   if (changed) {
+   __ptep_set_access_flags(pmdp_ptep(pmdp), pmd_pte(entry));
+   /*
+* Since we are not supporting SW TLB systems, we don't
+* have any thing similar to flush_tlb_page_nohash()
+*/
+   }
+   return changed;
+}
+
+unsigned long pmd_hugepage_update(struct mm_struct *mm, unsigned long addr,
+ pmd_t *pmdp, unsigned long clr,
+ unsigned long set)
+{
+
+   unsigned long old, tmp;
+
+#ifdef CONFIG_DEBUG_VM
+   WARN_ON(!pmd_trans_huge(*pmdp));
+   assert_spin_locked(&mm->page_table_lock);
+#endif
+
+#ifdef PTE_ATOMIC_UPDATES
+   __asm__ __volatile__(
+   "1: ldarx   %0,0,%3\n\
+   andi.   %1,%0,%6\n\
+   bne-1b \n\
+   andc%1,%0,%4 \n\
+   or  %1,%1,%7\n\
+   stdcx.  %1,0,%3 \n\
+   bne-1b"
+   : "=&r" (old), "=&r" (tmp), "=m" (*pmdp)
+   : "r" (pmdp), "r" (clr), "m" (*pmdp), "i" (H_PAGE_BUSY), "r" (set)
+   : "cc" );
+#else
+   old = pmd_val(*pmdp);
+   *pmdp = __pmd((old & ~clr) | set);
+#endif
+   trace_hugepage_update(addr, old, clr, set);
+   if (old & H_PAGE_HASHPTE)
+   hpte_do_hugepage_flush(mm, addr, pmdp, old);
+   return old;
+}
+
+pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address,
+ pmd_t *pmdp)
+{
+   pmd_t pmd;
+
+   VM_BUG_ON(address & ~HPAGE_PMD_MASK);
+   VM_BUG_ON(pmd_trans_huge(*pmdp));
+
+   pmd = *pmdp;
+   pmd_clear(pmdp);
+   /*
+* Wait for all pending hash_page to finish. This is needed
+* in case of subpage collapse. When we collapse normal pages
+* to hugepage, we first clear the pmd, then invalidate all
+* the PTE entries. The assumption here is that any low level
+* page fault will see a none pmd and take the slow path that
+* will wait on mmap_sem. But we could very well be in a
+* hash_page with local ptep pointer value. Such a hash page
+* can result in adding new HPTE entries for normal subpages.
+* That means we could be modifying the page content as we
+* copy them to a huge page. So wait for parallel hash_page
+* to finish before invalidating HPTE entries. We can do this
+* by sending an IPI to all the cpus and executing a dummy
+* function there.
+*/
+   kick_all_cpus_sync();
+   /*
+* Now invalidate the hpte entries in the range
+* covered by pmd. This make sure we take a
+* fault and will find the pmd as none, which will
+* result in a major fault which takes mmap_sem and
+* hence wait for collapse to complete. Without this
+* the __collapse_huge_page_copy can result in copying
+* the old content.
+*/
+   flush_tlb_pmd_range(vma->vm_mm, &pmd, address);
+   return pmd;
+}
+
+int pmdp_test_and_clear_young(struct vm_area_struct *vma,
+ unsigned long address, pmd_t *pmdp)
+{
+   return __pmdp_test_and_clear_young(vma->vm_mm, address, pmdp);
+}
+
+/*
+ * We currently remove entries from the hashtable regardless of whether
+ * the entry was young or dirty. The generic routines only flush if the
+ * entry was 

[RFC PATCH V1 18/33] powerpc/mm: Add helper for update page flags during ioremap

2016-01-11 Thread Aneesh Kumar K.V
They differ between radix and hash. Hence we need a helper.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 11 +++
 arch/powerpc/include/asm/book3s/64/hash.h| 11 +++
 arch/powerpc/include/asm/nohash/pgtable.h| 20 
 arch/powerpc/mm/pgtable_64.c | 16 +---
 4 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index c0898e26ed4a..b53d7504d6f6 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -491,6 +491,17 @@ static inline unsigned long gup_pte_filter(int write)
mask |= _PAGE_RW;
return mask;
 }
+
+static inline unsigned long ioremap_prot_flags(unsigned long flags)
+{
+   /* writeable implies dirty for kernel addresses */
+   if (flags & _PAGE_RW)
+   flags |= _PAGE_DIRTY;
+
+   /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
+   flags &= ~(_PAGE_USER | _PAGE_EXEC);
+   return flags;
+}
 #endif /* !__ASSEMBLY__ */
 
 #endif /*  _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index d51709dad729..4f0fdb9a5d19 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -592,6 +592,17 @@ static inline unsigned long gup_pte_filter(int write)
return mask;
 }
 
+static inline unsigned long ioremap_prot_flags(unsigned long flags)
+{
+   /* writeable implies dirty for kernel addresses */
+   if (flags & _PAGE_RW)
+   flags |= _PAGE_DIRTY;
+
+   /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
+   flags &= ~(_PAGE_USER | _PAGE_EXEC);
+   return flags;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
   pmd_t *pmdp, unsigned long old_pmd);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index e4173cb06e5b..8861ec146985 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -238,6 +238,26 @@ static inline unsigned long gup_pte_filter(int write)
return mask;
 }
 
+static inline unsigned long ioremap_prot_flags(unsigned long flags)
+{
+   /* writeable implies dirty for kernel addresses */
+   if (flags & _PAGE_RW)
+   flags |= _PAGE_DIRTY;
+
+   /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
+   flags &= ~(_PAGE_USER | _PAGE_EXEC);
+
+#ifdef _PAGE_BAP_SR
+   /* _PAGE_USER contains _PAGE_BAP_SR on BookE using the new PTE format
+* which means that we just cleared supervisor access... oops ;-) This
+* restores it
+*/
+   flags |= _PAGE_BAP_SR;
+#endif
+
+   return flags;
+}
+
 #ifdef CONFIG_HUGETLB_PAGE
 static inline int hugepd_ok(hugepd_t hpd)
 {
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 21a9a171c267..aa8ff4c74563 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -188,21 +188,7 @@ void __iomem * ioremap_prot(phys_addr_t addr, unsigned 
long size,
 {
void *caller = __builtin_return_address(0);
 
-   /* writeable implies dirty for kernel addresses */
-   if (flags & _PAGE_RW)
-   flags |= _PAGE_DIRTY;
-
-   /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
-   flags &= ~(_PAGE_USER | _PAGE_EXEC);
-
-#ifdef _PAGE_BAP_SR
-   /* _PAGE_USER contains _PAGE_BAP_SR on BookE using the new PTE format
-* which means that we just cleared supervisor access... oops ;-) This
-* restores it
-*/
-   flags |= _PAGE_BAP_SR;
-#endif
-
+   flags = ioremap_prot_flags(flags);
if (ppc_md.ioremap)
return ppc_md.ioremap(addr, size, flags, caller);
return __ioremap_caller(addr, size, flags, caller);
-- 
2.5.0

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Reply: [PATCH V3] cpufreq: qoriq: Register cooling device based on device tree

2016-01-11 Thread Arnd Bergmann
On Monday 11 January 2016 17:34:52 Scott Wood wrote:
> >>
> >> I think you need a 'depends on THERMAL' to prevent the driver from being
> >> built-in when THERMAL=m.
> >>
> >> Arnd
> > 
> > Correct. I need to add following lines to the Kconfig file:
> > depends on !CPU_THERMAL || THERMAL=y
> > 
> > Hi Rafael,
> > Should I send a new patch include this fix or send a fix patch?
> 
> Why THERMAL=y and not just THERMAL, which would allow building this
> driver as a module?

Right, that would be better, and it is what all other drivers do.

For some reason, some drivers depend on !CPU_THERMAL and others
depend on !THERMAL_OF here, and I think the result is the same, but
we are a bit inconsistent here. CPU_THERMAL cannot be set if THERMAL_OF
is disabled, and the header file only uses the 'extern' declaration
if both are set.

Arnd
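
A sketch of what the resulting Kconfig entry could look like (option name
taken from the build error above; the surrounding dependencies and help
text are assumptions, and the THERMAL line is the one under discussion --
plain `THERMAL` rather than `THERMAL=y` so the driver can still be built
as a module):

```
config QORIQ_CPUFREQ
	tristate "CPU frequency scaling driver for Freescale QorIQ SoCs"
	depends on OF && COMMON_CLK
	# keep a built-in driver from referencing a modular THERMAL
	depends on !CPU_THERMAL || THERMAL
	help
	  This adds the CPUFreq driver support for Freescale QorIQ SoCs.
```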
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] Add hwcap2 bits for POWER9

2016-01-11 Thread Carlos O'Donell
On 01/11/2016 02:55 PM, Tulio Magno Quites Machado Filho wrote:
> "Carlos O'Donell"  writes:
> 
>> On 01/11/2016 10:16 AM, Tulio Magno Quites Machado Filho wrote:
>>> Adhemerval Zanella  writes:
>>>
 On 08-01-2016 13:36, Peter Bergner wrote:
> On Fri, 2016-01-08 at 11:25 -0200, Tulio Magno Quites Machado Filho wrote:
>> Peter, this solves the issue you reported previously [1].
>>
>> [1] https://sourceware.org/ml/libc-alpha/2015-12/msg00522.html
>
> Agreed, thanks.  I'll also add the POWER9 support to the GCC side
> of the patch now that the glibc code is upstream.

 I do not see these bits being added in kernel side yet and GLIBC usual
 only sync these kind of bits *after* they are included in kernel side.
 So I would advise to either get these pieces (kernel support and hwcap
 advertise) in kernel before 2.23 release, otherwise revert the patches.
>>>
>>> Ack.
>>> It has just been sent to the correspondent Linux mailing list:
>>> https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/137763.html
>>
>> Please revert the changes from glibc until you checkin support to linux
>> kernel mainline.
>>
>> Leaving these bits in increases the risk that someone uses to deploy a glibc
>> that then may have the wrong value.
> 
> Could you clarify this statement, please?
> I fail to see how they could have the wrong value.

Until it is checked into the mainline kernel it is not canonical.

That's the rule. There are no other discussions to be had.

The single rule avoids discussions like "it can never be wrong because that's
what our ABI says it is."
 
> However, I do agree with the concerns raised by Peter and Adhemerval: glibc
> should be in sync with the kernel by the time of the release in order to
> guarantee both bits are reserved for the exact same goal and we should have
> both AT_HWCAP and AT_PLATFORM supporting the new processor.
> With that said, I was planning to revert both commits d2de9ef7 and b1f19b8e
> if we don't get the kernel patch accepted into the powerpc tree in time for
> the release 2.23.

Exactly. That's perfect. We can backport them to 2.23.1 if you get in later.

Cheers,
Carlos.
 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Reply: [PATCH V3] cpufreq: qoriq: Register cooling device based on device tree

2016-01-11 Thread Hongtao Jia
Sorry for the late response. I had to undergo knee surgery.
See comments at the end.

> -Original Message-
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Sent: Saturday, December 19, 2015 6:33 AM
> To: Rafael J. Wysocki 
> Cc: Jia Hongtao ; edubez...@gmail.com;
> viresh.ku...@linaro.org; linux...@vger.kernel.org; linuxppc-
> d...@lists.ozlabs.org; devicet...@vger.kernel.org; Scott Wood
> 
> Subject: Re: [PATCH V3] cpufreq: qoriq: Register cooling device based on device
> tree
> 
> On Tuesday 15 December 2015 00:58:26 Rafael J. Wysocki wrote:
> > On Thursday, November 26, 2015 05:21:11 PM Jia Hongtao wrote:
> > > Register the qoriq cpufreq driver as a cooling device, based on the
> > > thermal device tree framework. When temperature crosses the passive
> > > trip point cpufreq is used to throttle CPUs.
> > >
> > > Signed-off-by: Jia Hongtao 
> > > Reviewed-by: Viresh Kumar 
> >
> > Applied, thanks!
> >
> 
> I got a randconfig build error today:
> 
> drivers/built-in.o: In function `qoriq_cpufreq_ready':
> debugfs.c:(.text+0x1f4688): undefined reference to
> `of_cpufreq_cooling_register'
> 
> CONFIG_OF=y
> CONFIG_QORIQ_CPUFREQ=y
> CONFIG_THERMAL=m
> CONFIG_THERMAL_OF=y
> 
> I think you need a 'depends on THERMAL' to prevent the driver from being
> built-in when THERMAL=m.
> 
>   Arnd

Correct. I need to add following lines to the Kconfig file:
depends on !CPU_THERMAL || THERMAL=y

Hi Rafael,
Should I send a new patch include this fix or send a fix patch?

Thanks.
-Hongtao.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path

2016-01-11 Thread Shilpasri G Bhat
cpu_to_chip_id() does a DT walk through to find out the chip id by taking a
contended device tree lock. This adds an unnecessary overhead in a hot-path.
So instead of cpu_to_chip_id() use PIR of the cpu to find the chip id.

Reported-by: Anton Blanchard 
Signed-off-by: Shilpasri G Bhat 
---
 drivers/cpufreq/powernv-cpufreq.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c 
b/drivers/cpufreq/powernv-cpufreq.c
index cb50138..597a084 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,7 @@
 #define PMSR_PSAFE_ENABLE  (1UL << 30)
 #define PMSR_SPR_EM_DISABLE(1UL << 31)
 #define PMSR_MAX(x)((x >> 32) & 0xFF)
+#define pir_to_chip_id(pir)(((pir) >> 7) & 0x3f)
 
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static bool rebooting, throttled, occ_reset;
@@ -312,13 +313,14 @@ static inline unsigned int get_nominal_index(void)
 static void powernv_cpufreq_throttle_check(void *data)
 {
unsigned int cpu = smp_processor_id();
+   unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
unsigned long pmsr;
int pmsr_pmax, i;
 
pmsr = get_pmspr(SPRN_PMSR);
 
for (i = 0; i < nr_chips; i++)
-   if (chips[i].id == cpu_to_chip_id(cpu))
+   if (chips[i].id == chip_id)
break;
 
/* Check for Pmax Capping */
@@ -558,7 +560,8 @@ static int init_chip_info(void)
unsigned int prev_chip_id = UINT_MAX;
 
for_each_possible_cpu(cpu) {
-   unsigned int id = cpu_to_chip_id(cpu);
+   unsigned int id =
+   pir_to_chip_id(get_hard_smp_processor_id(cpu));
 
if (prev_chip_id != id) {
prev_chip_id = id;
-- 
1.9.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint

2016-01-11 Thread Shilpasri G Bhat
This patch adds the powernv_throttle tracepoint to trace the CPU
frequency throttling event, which is used by the powernv-cpufreq
driver in POWER8.

Signed-off-by: Shilpasri G Bhat 
CC: Ingo Molnar 
CC: Steven Rostedt 
---
No changes from v2 and v3.

 include/trace/events/power.h | 22 ++
 kernel/trace/power-traces.c  |  1 +
 2 files changed, 23 insertions(+)

diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 284244e..19e5030 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -38,6 +38,28 @@ DEFINE_EVENT(cpu, cpu_idle,
TP_ARGS(state, cpu_id)
 );
 
+TRACE_EVENT(powernv_throttle,
+
+   TP_PROTO(int chip_id, const char *reason, int pmax),
+
+   TP_ARGS(chip_id, reason, pmax),
+
+   TP_STRUCT__entry(
+   __field(int, chip_id)
+   __string(reason, reason)
+   __field(int, pmax)
+   ),
+
+   TP_fast_assign(
+   __entry->chip_id = chip_id;
+   __assign_str(reason, reason);
+   __entry->pmax = pmax;
+   ),
+
+   TP_printk("Chip %d Pmax %d %s", __entry->chip_id,
+ __entry->pmax, __get_str(reason))
+);
+
 TRACE_EVENT(pstate_sample,
 
TP_PROTO(u32 core_busy,
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index eb4220a..81b8745 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -15,4 +15,5 @@
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(suspend_resume);
 EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
+EXPORT_TRACEPOINT_SYMBOL_GPL(powernv_throttle);
 
-- 
1.9.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v4 3/4] cpufreq: powernv: Add a trace print for the throttle event

2016-01-11 Thread Shilpasri G Bhat
Record the throttle event with a trace print replacing the printk,
except for events like throttling below nominal and OCC reset,
which still print a warning message.

Signed-off-by: Shilpasri G Bhat 
---
Changes from v3:
- Separate this patch to contain trace_point changes
- Move struct chip member 'restore' of type bool above 'mask' to reduce
  structure padding.

No changes from v2.

Changes from v1:
- As suggested by Paul Clarke replaced char * throttle_reason[][30] by 
  const char * const throttle_reason[].

 drivers/cpufreq/powernv-cpufreq.c | 95 ---
 1 file changed, 49 insertions(+), 46 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c 
b/drivers/cpufreq/powernv-cpufreq.c
index 597a084..c98a6e7 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -44,12 +45,22 @@
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static bool rebooting, throttled, occ_reset;
 
+static const char * const throttle_reason[] = {
+   "No throttling",
+   "Power Cap",
+   "Processor Over Temperature",
+   "Power Supply Failure",
+   "Over Current",
+   "OCC Reset"
+};
+
 static struct chip {
unsigned int id;
bool throttled;
+   bool restore;
+   u8 throt_reason;
cpumask_t mask;
struct work_struct throttle;
-   bool restore;
 } *chips;
 
 static int nr_chips;
@@ -310,41 +321,49 @@ static inline unsigned int get_nominal_index(void)
return powernv_pstate_info.max - powernv_pstate_info.nominal;
 }
 
-static void powernv_cpufreq_throttle_check(void *data)
+static void powernv_cpufreq_check_pmax(void)
 {
unsigned int cpu = smp_processor_id();
unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
-   unsigned long pmsr;
int pmsr_pmax, i;
 
-   pmsr = get_pmspr(SPRN_PMSR);
+   pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR));
 
for (i = 0; i < nr_chips; i++)
if (chips[i].id == chip_id)
break;
 
-   /* Check for Pmax Capping */
-   pmsr_pmax = (s8)PMSR_MAX(pmsr);
if (pmsr_pmax != powernv_pstate_info.max) {
if (chips[i].throttled)
-   goto next;
+   return;
+
chips[i].throttled = true;
if (pmsr_pmax < powernv_pstate_info.nominal)
-   pr_crit("CPU %d on Chip %u has Pmax reduced below 
nominal frequency (%d < %d)\n",
-   cpu, chips[i].id, pmsr_pmax,
-   powernv_pstate_info.nominal);
-   else
-   pr_info("CPU %d on Chip %u has Pmax reduced below turbo 
frequency (%d < %d)\n",
-   cpu, chips[i].id, pmsr_pmax,
-   powernv_pstate_info.max);
+   pr_warn_once("CPU %d on Chip %u has Pmax reduced below 
nominal frequency (%d < %d)\n",
+cpu, chips[i].id, pmsr_pmax,
+powernv_pstate_info.nominal);
+
+   trace_powernv_throttle(chips[i].id,
+  throttle_reason[chips[i].throt_reason],
+  pmsr_pmax);
} else if (chips[i].throttled) {
chips[i].throttled = false;
-   pr_info("CPU %d on Chip %u has Pmax restored to %d\n", cpu,
-   chips[i].id, pmsr_pmax);
+   trace_powernv_throttle(chips[i].id,
+  throttle_reason[chips[i].throt_reason],
+  pmsr_pmax);
}
+}
+
+static void powernv_cpufreq_throttle_check(void *data)
+{
+   unsigned long pmsr;
+
+   pmsr = get_pmspr(SPRN_PMSR);
+
+   /* Check for Pmax Capping */
+   powernv_cpufreq_check_pmax();
 
/* Check if Psafe_mode_active is set in PMSR. */
-next:
if (pmsr & PMSR_PSAFE_ENABLE) {
throttled = true;
pr_info("Pstate set to safe frequency\n");
@@ -358,7 +377,7 @@ next:
 
if (throttled) {
pr_info("PMSR = %16lx\n", pmsr);
-   pr_crit("CPU Frequency could be throttled\n");
+   pr_warn("CPU Frequency could be throttled\n");
}
 }
 
@@ -449,15 +468,6 @@ void powernv_cpufreq_work_fn(struct work_struct *work)
}
 }
 
-static char throttle_reason[][30] = {
-   "No throttling",
-   "Power Cap",
-   "Processor Over Temperature",
-   "Power Supply Failure",
-   "Over Current",
-   "OCC Reset"
-

[PATCH v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats

2016-01-11 Thread Shilpasri G Bhat
Create sysfs attributes to export throttle information in
/sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as
follows:

1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
  This gives the throttle stats for each of the available frequencies.
  The throttle stat of a frequency is the total number of times the max
  frequency is reduced to that frequency.
  # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
  4023000 0
  3990000 0
  3956000 1
  3923000 0
  3890000 0
  3857000 2
  3823000 0
  3790000 0
  3757000 2
  3724000 1
  3690000 1
  ...

2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
  This directory contains throttle reason files. Each file gives the
  total number of times the max frequency is throttled, except for
  'throttle_reset', which gives the total number of times the max
  frequency is unthrottled after being throttled.
  # cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
  # cat cpu_over_temperature
  7
  # cat occ_reset
  0
  # cat over_current
  0
  # cat power_cap
  0
  # cat power_supply_failure
  0
  # cat throttle_reset
  7

3)/sys/devices/system/cpu/cpufreq/chip0/throttle_stat
  This gives the total number of events of max frequency throttling to
  lower frequencies in the turbo range of frequencies and the sub-turbo(at
  and below nominal) range of frequencies.
  # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_stat
  turbo 7
  sub-turbo 0

Signed-off-by: Shilpasri G Bhat 
---
Changes from v3:
- Separate the patch to contain only the throttle sysfs attribute changes.
- Add helper inline function get_chip_index()

Changes from v2:
- Fixed kbuild test warning.
drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
value of 'kstrtoint', declared with attribute warn_unused_result
[-Wunused-result]

Changes from v1:
- Added a kobject to struct chip
- Grouped the throttle reasons under a separate attribute_group and
  exported each reason as an individual file.
- Moved the sysfs files from /sys/devices/system/node/nodeN to
  /sys/devices/system/cpu/cpufreq/chipN
- As suggested by Paul Clarke replaced 'Nominal' with 'sub-turbo'.
- Modified the commit message.

 drivers/cpufreq/powernv-cpufreq.c | 177 +-
 1 file changed, 173 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c 
b/drivers/cpufreq/powernv-cpufreq.c
index c98a6e7..40ccd9d 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -54,6 +54,16 @@ static const char * const throttle_reason[] = {
"OCC Reset"
 };
 
+enum throt_reason_type {
+   NO_THROTTLE = 0,
+   POWERCAP,
+   CPU_OVERTEMP,
+   POWER_SUPPLY_FAILURE,
+   OVERCURRENT,
+   OCC_RESET_THROTTLE,
+   OCC_MAX_REASON
+};
+
 static struct chip {
unsigned int id;
bool throttled;
@@ -61,6 +71,11 @@ static struct chip {
u8 throt_reason;
cpumask_t mask;
struct work_struct throttle;
+   int throt_turbo;
+   int throt_nominal;
+   int reason[OCC_MAX_REASON];
+   int *pstate_stat;
+   struct kobject *kobj;
 } *chips;
 
 static int nr_chips;
@@ -195,6 +210,113 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
NULL,
 };
 
+static inline int get_chip_index(struct kobject *kobj)
+{
+   int i, id;
+
+   i = kstrtoint(kobj->name + 4, 0, &id);
+   if (i)
+   return i;
+
+   for (i = 0; i < nr_chips; i++)
+   if (chips[i].id == id)
+   return i;
+   return -EINVAL;
+}
+
+static ssize_t throttle_freq_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+   int i, count = 0, id;
+
+   id = get_chip_index(kobj);
+   if (id < 0)
+   return id;
+
+   for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
+   count += sprintf(&buf[count], "%d %d\n",
+  powernv_freqs[i].frequency,
+  chips[id].pstate_stat[i]);
+
+   return count;
+}
+
+static struct kobj_attribute attr_throttle_frequencies =
+__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);
+
+static ssize_t throttle_stat_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+   int id, count = 0;
+
+   id = get_chip_index(kobj);
+   if (id < 0)
+   return id;
+
+   count += sprintf(&buf[count], "turbo %d\n", chips[id].throt_turbo);
+   count += sprintf(&buf[count], "sub-turbo %d\n",
+   chips[id].throt_nominal);
+
+   return count;
+}
+
+static struct kobj_attribute attr_throttle_stat =
+__ATTR(throttle_stat, 0444, throttle_stat_show, NULL);
+
+#define define_throttle_reason_attr(attr_name, val)  \
+static ssize_t attr_name##_show(struct kobject *kobj,\
+  

Re: Reply: [PATCH V3] cpufreq: qoriq: Register cooling device based on device tree

2016-01-11 Thread Scott Wood
On 01/11/2016 08:54 AM, Hongtao Jia wrote:
> Sorry for the late response. I got a knee surgery to do.
> See comments at the end.
> 
>> -----Original Message-----
>> From: Arnd Bergmann [mailto:a...@arndb.de]
>> Sent: Saturday, December 19, 2015 6:33 AM
>> To: Rafael J. Wysocki 
>> Cc: Jia Hongtao ; edubez...@gmail.com;
>> viresh.ku...@linaro.org; linux...@vger.kernel.org; linuxppc-
>> d...@lists.ozlabs.org; devicet...@vger.kernel.org; Scott Wood
>> 
>> Subject: Re: [PATCH V3] cpufreq: qoriq: Register cooling device based on device
>> tree
>>
>> On Tuesday 15 December 2015 00:58:26 Rafael J. Wysocki wrote:
>>> On Thursday, November 26, 2015 05:21:11 PM Jia Hongtao wrote:
 Register the qoriq cpufreq driver as a cooling device, based on the
 thermal device tree framework. When temperature crosses the passive
 trip point cpufreq is used to throttle CPUs.

 Signed-off-by: Jia Hongtao 
 Reviewed-by: Viresh Kumar 
>>>
>>> Applied, thanks!
>>>
>>
>> I got a randconfig build error today:
>>
>> drivers/built-in.o: In function `qoriq_cpufreq_ready':
>> debugfs.c:(.text+0x1f4688): undefined reference to
>> `of_cpufreq_cooling_register'
>>
>> CONFIG_OF=y
>> CONFIG_QORIQ_CPUFREQ=y
>> CONFIG_THERMAL=m
>> CONFIG_THERMAL_OF=y
>>
>> I think you need a 'depends on THERMAL' to prevent the driver from being
>> built-in when THERMAL=m.
>>
>> Arnd
> 
> Correct. I need to add following lines to the Kconfig file:
> depends on !CPU_THERMAL || THERMAL=y
> 
> Hi Rafael,
> Should I send a new patch include this fix or send a fix patch?

Why THERMAL=y and not just THERMAL, which would allow building this
driver as a module?

-Scott
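
As a sketch (the prompt text below is illustrative; only the `depends on` line
comes from the thread), the fix under discussion would land in
drivers/cpufreq/Kconfig roughly like this. Hongtao proposed
`depends on !CPU_THERMAL || THERMAL=y`; Scott's variant drops the `=y` so the
dependency can also be satisfied with THERMAL=m, which limits the driver to a
module in that configuration instead of forbidding it outright:

```
config QORIQ_CPUFREQ
	tristate "CPU frequency scaling driver for Freescale QorIQ SoCs"
	# Hongtao's proposal:  depends on !CPU_THERMAL || THERMAL=y
	# Scott's suggestion, which also permits THERMAL=m:
	depends on !CPU_THERMAL || THERMAL
```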

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification

2016-01-11 Thread Shilpasri G Bhat
In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
CPU when the chip crosses its thermal and power limits. Currently,
powernv-cpufreq driver detects and reports this event as a console
message. Some machines may not sustain the max turbo frequency in all
conditions and can be throttled frequently. This can lead to the
flooding of console with throttle messages. So this patchset aims to
redesign the presentation of this event via sysfs counters and
tracepoints. 

Patches [2] to [4] will add a perf trace point "power:powernv_throttle" and
sysfs throttle counter stats in /sys/devices/system/cpu/cpufreq/chipN.
Patch [1] solves a bug in powernv_cpufreq_throttle_check(), which calls in to
cpu_to_chip_id() in hot path which reads DT every time to find the chip id.

Changes from v3:
- Add a fix to replace cpu_to_chip_id() with simpler PIR shift to obtain the
  chip id.
- Break patch2 in to two patches separating the tracepoint and sysfs attribute
  changes.

Changes from v2:
- Fixed kbuild test warning.
drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
value of 'kstrtoint', declared with attribute warn_unused_result
[-Wunused-result]

Shilpasri G Bhat (4):
  cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
  cpufreq: powernv/tracing: Add powernv_throttle tracepoint
  cpufreq: powernv: Add a trace print for the throttle event
  cpufreq: powernv: Add sysfs attributes to show throttle stats

 drivers/cpufreq/powernv-cpufreq.c | 279 +++---
 include/trace/events/power.h  |  22 +++
 kernel/trace/power-traces.c   |   1 +
 3 files changed, 250 insertions(+), 52 deletions(-)

-- 
1.9.1


Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Maciej S. Szmigiero
Hi Fabio,

Thanks for testing.

On 11.01.2016 13:10, Fabio Estevam wrote:
> On Mon, Jan 11, 2016 at 10:04 AM, Fabio Estevam <feste...@gmail.com> wrote:
> 
>> This patch causes the following issue in linux-next:
>>
>> [2.526984] [ cut here ]
>> [2.531632] WARNING: CPU: 1 PID: 1 at kernel/locking/lockdep.c:2755
>> lockdep_trace_alloc+0xf4/0x124()
>> [2.540771] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
>> [2.546175] Modules linked in:
>> [2.549447] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
>> 4.4.0-rc8-next-20160111 #204
>> [2.557021] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
>> [2.563553] Backtrace:
>> [2.566040] [] (dump_backtrace) from []
>> (show_stack+0x18/0x1c)
>> [2.573615]  r6:0ac3 r5: r4: r3:
>> [2.579362] [] (show_stack) from []
>> (dump_stack+0x88/0xa4)
>> [2.586607] [] (dump_stack) from []
>> (warn_slowpath_common+0x80/0xbc)
>> [2.594702]  r5:c0071ed0 r4:ef055b90
>> [2.598326] [] (warn_slowpath_common) from []
>> (warn_slowpath_fmt+0x38/0x40)
>> [2.607028]  r8:0004 r7:0004 r6:024080c0 r5:024080c0 r4:6093
>> [2.613829] [] (warn_slowpath_fmt) from []
>> (lockdep_trace_alloc+0xf4/0x124)
>> [2.622532]  r3:c09a1634 r2:c099dc0c
>> [2.626161] [] (lockdep_trace_alloc) from []
>> (kmem_cache_alloc+0x30/0x174)
>> [2.634778]  r4:ef001f00 r3:c0b02a88
>> [2.638407] [] (kmem_cache_alloc) from []
>> (regcache_rbtree_write+0x150/0x724)
>> [2.647283]  r10: r9:0010 r8:0004 r7:0004
>> r6:002c r5:
>> [2.655203]  r4:
>> [2.657767] [] (regcache_rbtree_write) from []
>> (regcache_write+0x5c/0x64)
> 
> This fixes the warning:
> 
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -180,7 +180,6 @@ static const struct regmap_config fsl_ssi_regconfig = {
> .volatile_reg = fsl_ssi_volatile_reg,
> .precious_reg = fsl_ssi_precious_reg,
> .writeable_reg = fsl_ssi_writeable_reg,
> -   .cache_type = REGCACHE_RBTREE,
>  };
> 
> Is this the correct fix?
> 

This will disable register cache so it isn't right.
Could you try REGCACHE_FLAT instead, please?

Looks like the problem here is that the rbtree cache does some non-atomic
allocations in the read/write path when not supplied with default register
values.

Best regards,
Maciej Szmigiero


Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Fabio Estevam
Hi Maciej,

On Sun, Dec 20, 2015 at 6:33 PM, Maciej S. Szmigiero
<m...@maciej.szmigiero.name> wrote:
> There is no guarantee that on fsl_ssi module load
> SSI registers will have their power-on-reset values.
>
> In fact, if the driver is reloaded the values in
> registers will be whatever they were set to previously.
>
> This fixes hard lockup on fsl_ssi module reload,
> at least in AC'97 mode.
>
> Fixes: 05cf237972fe ("ASoC: fsl_ssi: Add driver suspend and resume to support 
> MEGA Fast")
>
> Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>

This patch causes the following issue in linux-next:

[2.526984] [ cut here ]
[2.531632] WARNING: CPU: 1 PID: 1 at kernel/locking/lockdep.c:2755
lockdep_trace_alloc+0xf4/0x124()
[2.540771] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[2.546175] Modules linked in:
[2.549447] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
4.4.0-rc8-next-20160111 #204
[2.557021] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[2.563553] Backtrace:
[2.566040] [] (dump_backtrace) from []
(show_stack+0x18/0x1c)
[2.573615]  r6:0ac3 r5: r4: r3:
[2.579362] [] (show_stack) from []
(dump_stack+0x88/0xa4)
[2.586607] [] (dump_stack) from []
(warn_slowpath_common+0x80/0xbc)
[2.594702]  r5:c0071ed0 r4:ef055b90
[2.598326] [] (warn_slowpath_common) from []
(warn_slowpath_fmt+0x38/0x40)
[2.607028]  r8:0004 r7:0004 r6:024080c0 r5:024080c0 r4:6093
[2.613829] [] (warn_slowpath_fmt) from []
(lockdep_trace_alloc+0xf4/0x124)
[2.622532]  r3:c09a1634 r2:c099dc0c
[2.626161] [] (lockdep_trace_alloc) from []
(kmem_cache_alloc+0x30/0x174)
[2.634778]  r4:ef001f00 r3:c0b02a88
[2.638407] [] (kmem_cache_alloc) from []
(regcache_rbtree_write+0x150/0x724)
[2.647283]  r10: r9:0010 r8:0004 r7:0004
r6:002c r5:
[2.655203]  r4:
[2.657767] [] (regcache_rbtree_write) from []
(regcache_write+0x5c/0x64)
[2.666295]  r10:ef0f1c80 r9:ef0f6500 r8:0001 r7:ef055cbc
r6:0010 r5:
[2.674215]  r4:eeaaf000
[2.676778] [] (regcache_write) from []
(_regmap_read+0xa8/0xc0)
[2.684526]  r6:
[2.685780] mmc2: new DDR MMC card at address 0001
[2.686495] mmcblk1: mmc2:0001 SEM08G 7.40 GiB
[2.686792] mmcblk1boot0: mmc2:0001 SEM08G partition 1 2.00 MiB
[2.687092] mmcblk1boot1: mmc2:0001 SEM08G partition 2 2.00 MiB
[2.687388] mmcblk1rpmb: mmc2:0001 SEM08G partition 3 128 KiB
[2.713792]  r5:0010 r4:eeaaf000 r3:
[2.718660] [] (_regmap_read) from []
(regmap_read+0x44/0x64)
[2.726147]  r7:1000 r6:ef055cbc r5:0010 r4:eeaaf000
[2.731890] [] (regmap_read) from []
(_fsl_ssi_set_dai_fmt+0xa4/0x440)
[2.740159]  r6:1001 r5:eeaaf000 r4:eeaaee10 r3:01400454
[2.745897] [] (_fsl_ssi_set_dai_fmt) from []
(fsl_ssi_set_dai_fmt+0x1c/0x20)
[2.754773]  r10:ef0f1c80 r8:ef118800 r7:1001 r6:
r5:0001 r4:ef0f1700
[2.762706] [] (fsl_ssi_set_dai_fmt) from []
(snd_soc_runtime_set_dai_fmt+0x104/0x154)
[2.772375] [] (snd_soc_runtime_set_dai_fmt) from
[] (snd_soc_register_card+0xc14/0xdd0)
[2.782206]  r10:eeaa889c r9:0002 r8:eeaa8810 r7:ef118800
r6:0001 r5:
[2.790126]  r4:ef0f1c80 r3:
[2.793750] [] (snd_soc_register_card) from []
(devm_snd_soc_register_card+0x38/0x78)
[2.803321]  r10:ef20b420 r9: r8:eeaa8810 r7:ef1cf410
r6:eeaa889c r5:ef0f6490
[2.811240]  r4:eeaa889c
[2.813806] [] (devm_snd_soc_register_card) from
[] (imx_wm8962_probe+0x2fc/0x3a0)
[2.823116]  r7:ef1cf410 r6:eeaa889c r5:c085ad98 r4:ef1cf400
[2.828858] [] (imx_wm8962_probe) from []
(platform_drv_probe+0x58/0xb4)
[2.837300]  r10: r9:00de r8:c0b645f4 r7:c0b645f4
r6:fdfb r5:ef1cf410
[2.845220]  r4:fffe
[2.847788] [] (platform_drv_probe) from []
(driver_probe_device+0x1f8/0x2b4)
[2.856665]  r7: r6:c1379780 r5:c1379778 r4:ef1cf410
[2.862405] [] (driver_probe_device) from []
(__driver_attach+0x9c/0xa0)
[2.870846]  r10: r8:c0ad7b30 r7: r6:ef1cf444
r5:c0b645f4 r4:ef1cf410
[2.878776] [] (__driver_attach) from []
(bus_for_each_dev+0x5c/0x90)
[2.886956]  r6:c03d27b8 r5:c0b645f4 r4: r3:ef1b695c
[2.892695] [] (bus_for_each_dev) from []
(driver_attach+0x20/0x28)
[2.900703]  r6:c0b2f9f0 r5:ef0f1400 r4:c0b645f4
[2.905383] [] (driver_attach) from []
(bus_add_driver+0xec/0x1fc)
[2.913313] [] (bus_add_driver) from []
(driver_register+0x80/0xfc)
[2.921320]  r7:c0ae684c r6:ef0f63c0 r5:c0b06188 r4:c0b645f4
[2.927059] [] (driver_register) from []
(__platform_driver_register+0x38/0x4c)
[2.936109]  r5:c0b06188 r4:c0b06188
[2.939733] [] (__platform_driver_register) from
[] (imx_wm8962_driver_init+0x18/0x20)
[2.949398] [] (imx_wm8962_driver_init) from []
(d

Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-11 Thread Raghavendra K T

On 01/10/2016 04:33 AM, Jan Stancek wrote:

Hi,

I'm seeing bare metal ppc64le system crashing early during boot
with latest upstream kernel (4.4.0-rc8):



Jan,
Do you mind sharing the .config you used for the kernel.
Not able to reproduce with the one that I have :(


Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Maciej S. Szmigiero
On 11.01.2016 15:00, Mark Brown wrote:
> On Mon, Jan 11, 2016 at 10:10:56AM -0200, Fabio Estevam wrote:
>> On Mon, Jan 11, 2016 at 10:04 AM, Fabio Estevam  wrote:
> 
>>> [2.526984] [ cut here ]
>>> [2.531632] WARNING: CPU: 1 PID: 1 at kernel/locking/lockdep.c:2755
>>> lockdep_trace_alloc+0xf4/0x124()
> 
>> This fixes the warning:
> 
>> --- a/sound/soc/fsl/fsl_ssi.c
>> +++ b/sound/soc/fsl/fsl_ssi.c
>> @@ -180,7 +180,6 @@ static const struct regmap_config fsl_ssi_regconfig = {
>> .volatile_reg = fsl_ssi_volatile_reg,
>> .precious_reg = fsl_ssi_precious_reg,
>> .writeable_reg = fsl_ssi_writeable_reg,
>> -   .cache_type = REGCACHE_RBTREE,
>>  };
> 
>> Is this the correct fix?
> 
> I suspect not, it looks like the driver is using the cache for
> suspend/resume handling.  I've dropped the patch for now.  Either the
> driver should explicitly write to the relevant registers outside of
> interrupt context to ensure the cache entry exists or it should keep the
> defaults and explicitly write them to hardware at startup to ensure
> sync (the former is more likely to be safe).

Is it acceptable to switch it to flat cache instead to not keep the register
defaults in driver?

Maciej
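
For reference, the REGCACHE_FLAT switch being proposed would be a one-line
change to the regmap_config quoted earlier in the thread (an untested sketch,
reconstructed from that hunk):

```
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ static const struct regmap_config fsl_ssi_regconfig = {
 	.volatile_reg = fsl_ssi_volatile_reg,
 	.precious_reg = fsl_ssi_precious_reg,
 	.writeable_reg = fsl_ssi_writeable_reg,
-	.cache_type = REGCACHE_RBTREE,
+	.cache_type = REGCACHE_FLAT,
 };
```

Note Mark's caveat later in the thread: a flat cache is allocated whole up
front (so no atomic-context allocation), but it also implicitly gives every
register a default of zero, which matters when the hardware's reset values
differ.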


Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Fabio Estevam
On Mon, Jan 11, 2016 at 10:04 AM, Fabio Estevam <feste...@gmail.com> wrote:

> This patch causes the following issue in linux-next:
>
> [2.526984] [ cut here ]
> [2.531632] WARNING: CPU: 1 PID: 1 at kernel/locking/lockdep.c:2755
> lockdep_trace_alloc+0xf4/0x124()
> [2.540771] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
> [2.546175] Modules linked in:
> [2.549447] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
> 4.4.0-rc8-next-20160111 #204
> [2.557021] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [2.563553] Backtrace:
> [2.566040] [] (dump_backtrace) from []
> (show_stack+0x18/0x1c)
> [2.573615]  r6:0ac3 r5: r4: r3:
> [2.579362] [] (show_stack) from []
> (dump_stack+0x88/0xa4)
> [2.586607] [] (dump_stack) from []
> (warn_slowpath_common+0x80/0xbc)
> [2.594702]  r5:c0071ed0 r4:ef055b90
> [2.598326] [] (warn_slowpath_common) from []
> (warn_slowpath_fmt+0x38/0x40)
> [2.607028]  r8:0004 r7:0004 r6:024080c0 r5:024080c0 r4:6093
> [2.613829] [] (warn_slowpath_fmt) from []
> (lockdep_trace_alloc+0xf4/0x124)
> [2.622532]  r3:c09a1634 r2:c099dc0c
> [2.626161] [] (lockdep_trace_alloc) from []
> (kmem_cache_alloc+0x30/0x174)
> [2.634778]  r4:ef001f00 r3:c0b02a88
> [2.638407] [] (kmem_cache_alloc) from []
> (regcache_rbtree_write+0x150/0x724)
> [2.647283]  r10: r9:0010 r8:0004 r7:0004
> r6:002c r5:
> [2.655203]  r4:
> [2.657767] [] (regcache_rbtree_write) from []
> (regcache_write+0x5c/0x64)

This fixes the warning:

--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -180,7 +180,6 @@ static const struct regmap_config fsl_ssi_regconfig = {
.volatile_reg = fsl_ssi_volatile_reg,
.precious_reg = fsl_ssi_precious_reg,
.writeable_reg = fsl_ssi_writeable_reg,
-   .cache_type = REGCACHE_RBTREE,
 };

Is this the correct fix?

Re: [BUG] PowerNV crash with 4.4.0-rc8 at sched_init_numa (related to commit c118baf80256)

2016-01-11 Thread Raghavendra K T

On 01/11/2016 05:22 PM, Raghavendra K T wrote:

On 01/10/2016 04:33 AM, Jan Stancek wrote:

Hi,

I'm seeing bare metal ppc64le system crashing early during boot
with latest upstream kernel (4.4.0-rc8):



Jan,
Do you mind sharing the .config you used for the kernel.
Not able to reproduce with the one that I have :(



Never mind.. I enabled DEBUG_PER_CPU_MAPS.. to hit that..
/me goes back..


Re: [PATCH v3 39/41] xen/events: use virt_xxx barriers

2016-01-11 Thread David Vrabel
On 10/01/16 14:21, Michael S. Tsirkin wrote:
> drivers/xen/events/events_fifo.c uses rmb() to communicate with the
> other side.
> 
> For guests compiled with CONFIG_SMP, smp_rmb would be sufficient, so
> rmb() here is only needed if a non-SMP guest runs on an SMP host.
> 
> Switch to the virt_rmb barrier which serves this exact purpose.
> 
> Pull in asm/barrier.h here to make sure the file is self-contained.
> 
> Suggested-by: David Vrabel 
> Signed-off-by: Michael S. Tsirkin 

Acked-by: David Vrabel 

David

Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Fabio Estevam
Hi Maciej,

On Mon, Jan 11, 2016 at 11:57 AM, Maciej S. Szmigiero
 wrote:
> Hi Fabio,

> This will disable register cache so it isn't right.
> Could you try REGCACHE_FLAT instead, please?

Yes, with REGCACHE_FLAT I don't get the warning.

Regards,

Fabio Estevam

[PATCH RESEND] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing

2016-01-11 Thread Nicholas Krause
Fix mpic_set_default_irq_routing() to return the result of
kvm_set_irq_routing() rather than always returning zero:
kvm_set_irq_routing() can fail, and the caller of
mpic_set_default_irq_routing() needs to see that failure instead
of a bogus success.

Signed-off-by: Nicholas Krause 
---
 arch/powerpc/kvm/mpic.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 6249cdc..b14b85a 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -1641,16 +1641,17 @@ static void mpic_destroy(struct kvm_device *dev)
 static int mpic_set_default_irq_routing(struct openpic *opp)
 {
struct kvm_irq_routing_entry *routing;
+   int ret;
 
/* Create a nop default map, so that dereferencing it still works */
routing = kzalloc((sizeof(*routing)), GFP_KERNEL);
if (!routing)
return -ENOMEM;
 
-   kvm_set_irq_routing(opp->kvm, routing, 0, 0);
+   ret = kvm_set_irq_routing(opp->kvm, routing, 0, 0);
 
kfree(routing);
-   return 0;
+   return ret;
 }
 
 static int mpic_create(struct kvm_device *dev, u32 type)
-- 
2.1.4


Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Mark Brown
On Mon, Jan 11, 2016 at 09:45:37AM -0600, Timur Tabi wrote:

> Ok, I'm confused.  Granted, all of this regcache stuff was added after I
> stopped working on this driver, so I'm out of the loop.  But it appears that
> the regcache cannot properly handle an uninitialized cache.  I would expect
> it to know to perform hard reads of any registers that are uninitialized.

regcache handles this fine, it's perfectly happy to just go and allocate
the cache as registers get used (this is why the code that's doing the
allocation exists...).  What is causing problems here is that the first
access to the register is happening in interrupt context so we can't do
a GFP_KERNEL allocation for it.  Most users don't do anything at all in
interrupt context so it's not an issue for them, drivers that want to
use regmap in interrupt context need to handle this.

We can't rely on knowing which registers are valid and which registers
can be read without side effects, it's optional for drivers to provide
that information.  Even with that information it's not always clear that
we want to stop and read every single value when we are initialising the
device, that might be excessively slow (remember a lot of regmap devices
are I2C or SPI connected, some with large register maps).  We should
have a helper to do that though for drivers where it does make sense.
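
A toy userspace model of the allocation pattern at issue (this is not the
regmap API; every name below is invented for illustration): a flat-style cache
does one sleeping allocation up front, so its later read/write fast path
allocates nothing and is safe to hit from atomic context, whereas an rbtree
cache without defaults allocates nodes lazily on first touch, which is what
trips lockdep here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy "flat" register cache: one slot per register, allocated once. */
struct flat_cache {
	unsigned int num_regs;	/* registers in the window */
	uint32_t *vals;		/* zero-initialised backing store */
};

/* Called once at "probe" time, where a sleeping allocation is fine. */
static int flat_cache_init(struct flat_cache *c, unsigned int num_regs)
{
	c->num_regs = num_regs;
	c->vals = calloc(num_regs, sizeof(*c->vals));
	return c->vals ? 0 : -1;
}

/* Allocation-free, so callable from "interrupt context" in this model. */
static int flat_cache_write(struct flat_cache *c, unsigned int reg,
			    uint32_t val)
{
	if (reg >= c->num_regs)
		return -1;
	c->vals[reg] = val;
	return 0;
}

static int flat_cache_read(const struct flat_cache *c, unsigned int reg,
			   uint32_t *val)
{
	if (reg >= c->num_regs)
		return -1;
	*val = c->vals[reg];
	return 0;
}
```

In this model every register implicitly starts at zero, which mirrors the
defaults question the rest of the thread debates.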



Re: simple_alloc space tramples initrd

2016-01-11 Thread dwalker
On Mon, Jan 11, 2016 at 02:09:34PM +1100, Michael Ellerman wrote:
> On Fri, 2016-01-08 at 09:45 -0800, dwal...@fifo99.com wrote:
> > Hi,
> > 
> > A powerpc machine I'm working on has this problem where the
> > simple_alloc_init() area is trampling the initrd. The two are placed fairly
> > close together.
> 
> Which machine / platform?

It's not upstream yet. I'm still putting the patches together, that's when this
issue came up. I can send an RFC if you want to look at the patches.
 
> > I have a fix for this proposed to add a section into
> > arch/powerpc/boot/zImage.lds.S called "mallocarea" to account for this 
> > space,
> > but not all powerpc platforms use simple_alloc_init(). So for those 
> > platforms
> > it's a waste.
> 
> Yeah I don't really like the sound of that. We could do it if it was behind a
> CONFIG option, but hopefully there is a better solution.
> 
> > Another alternative is to alter the bootloader to place more space between
> > the kernel image and initrd image.
> >
> > I wanted to get some feedback on the right way to fix this. It seems like it
> > could be a generic issue on powerpc, or it's possibly already fixed 
> > someplace
> > and I just haven't noticed.
> 
> I don't really know that code very well. But ideally either the boot loader
> gives you space, or the platform boot code is smart enough to detect that 
> there
> is insufficient room and puts the heap somewhere else.

It seems like the kernel should be able to handle it. I believe the bootloader
passes the initrd location, but I don't think it's evaluated till later in the
boot up. For simple_alloc_init() it seems all platforms just assume the space
is empty without checking.

Daniel

Re: [RFC][PATCH] ppc: Implement save_stack_trace_regs()

2016-01-11 Thread Steven Rostedt
On Mon, 11 Jan 2016 14:30:31 +1100
Michael Ellerman  wrote:

 
> Sorry, yep I'll take it.
> 
> I trimmed the change log a bit, final version below.
> 
> cheers
>

Thanks, appreciate it!

-- Steve

Re: [PATCH] Add hwcap2 bits for POWER9

2016-01-11 Thread Tulio Magno Quites Machado Filho
Adhemerval Zanella  writes:

> On 08-01-2016 13:36, Peter Bergner wrote:
>> On Fri, 2016-01-08 at 11:25 -0200, Tulio Magno Quites Machado Filho wrote:
>>> Peter, this solves the issue you reported previously [1].
>>>
>>> [1] https://sourceware.org/ml/libc-alpha/2015-12/msg00522.html
>> 
>> Agreed, thanks.  I'll also add the POWER9 support to the GCC side
>> of the patch now that the glibc code is upstream.
>
> I do not see these bits being added in kernel side yet and GLIBC usual
> only sync these kind of bits *after* they are included in kernel side.
> So I would advise to either get these pieces (kernel support and hwcap
> advertise) in kernel before 2.23 release, otherwise revert the patches.

Ack.
It has just been sent to the correspondent Linux mailing list:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/137763.html

-- 
Tulio Magno


Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Mark Brown
On Mon, Jan 11, 2016 at 10:10:56AM -0200, Fabio Estevam wrote:
> On Mon, Jan 11, 2016 at 10:04 AM, Fabio Estevam  wrote:

> > [2.526984] [ cut here ]
> > [2.531632] WARNING: CPU: 1 PID: 1 at kernel/locking/lockdep.c:2755
> > lockdep_trace_alloc+0xf4/0x124()

> This fixes the warning:

> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -180,7 +180,6 @@ static const struct regmap_config fsl_ssi_regconfig = {
> .volatile_reg = fsl_ssi_volatile_reg,
> .precious_reg = fsl_ssi_precious_reg,
> .writeable_reg = fsl_ssi_writeable_reg,
> -   .cache_type = REGCACHE_RBTREE,
>  };

> Is this the correct fix?

I suspect not, it looks like the driver is using the cache for
suspend/resume handling.  I've dropped the patch for now.  Either the
driver should explicitly write to the relevant registers outside of
interrupt context to ensure the cache entry exists or it should keep the
defaults and explicitly write them to hardware at startup to ensure
sync (the former is more likely to be safe).



Re: [PATCH 1/3] ASoC: fsl_ssi: mark SACNT register volatile

2016-01-11 Thread Maciej S. Szmigiero
Hi Timur,

Thanks for review.

On 10.01.2016 22:33, Timur Tabi wrote:
> Maciej S. Szmigiero wrote:
>> +regmap_write(regs, CCSR_SSI_SACNT,
>> +ssi_private->regcache_sacnt);
> 
> So I'm not familiar with all of the regcache features, but I understand this 
> patch.
> I was wondering if it makes sense to write the same exact value that was read 
> previously.
> Isn't it possible for the WR or RD bits to change between fsl_ssi_suspend() 
> and fsl_ssi_resume()?

These bits are only set in fsl_ssi_ac97_{read,write} which then wait 100usecs
before returning. This should be enough for SSI core to finish the relevant
operation and clear the bits again, so theoretically they shouldn't be set
outside these functions.

However, if AC'97 register access is done concurrently with suspend or resume
the read / written reg data might be corrupted.

It looks to me this is indeed possible since SSI PM callbacks are set in its
platform driver struct but ASoC core only calls PM callbacks in 
snd_soc_dai_driver
(which SSI driver don't set).

If I am correct with this reasoning then these callbacks need to be added to
snd_soc_dai_driver but platform driver ones should still be provided in case
the driver is loaded but the sound card is not yet registered.

I've CCed Zidan since he originally added PM support to this driver.

> That is, should we be doing this instead?
> 
> u32 temp;
> regmap_read(regs, CCSR_SSI_SACNT, &temp);
> temp &= 0x18; // preserve WR and RD
> regmap_write(regs, CCSR_SSI_SACNT, (ssi_private->regcache_sacnt & ~0x18) | 
> temp);
> 

Maciej


Re: [PATCH 3/3] ASoC: fsl_ssi: remove register defaults

2016-01-11 Thread Mark Brown
On Mon, Jan 11, 2016 at 03:10:20PM +0100, Maciej S. Szmigiero wrote:
> On 11.01.2016 15:00, Mark Brown wrote:

> > I suspect not, it looks like the driver is using the cache for
> > suspend/resume handling.  I've dropped the patch for now.  Either the
> > driver should explicitly write to the relevant registers outside of
> > interrupt context to ensure the cache entry exists or it should keep the
> > defaults and explicitly write them to hardware at startup to ensure
> > sync (the former is more likely to be safe).

> Is it acceptable to switch it to flat cache instead to not keep the register
> defaults in driver?

That's possibly problematic because the flat cache will of necessity end
up with defaults (of 0 from the kzalloc()) for all the registers.
You'll still have default values in the cache, though some of the
behaviour around optimising syncs does change without them explicitly
given.  It does deal with the allocation issue but given that the issue
was incorrect defaults I'd be a bit concerned.


