Re: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Benjamin Herrenschmidt
On Sun, 2011-10-09 at 12:30 +0200, Eli Cohen wrote:

  Ideally you want to avoid that swapping altogether and use the right
  accessor that indicates that your register is BE to start with. IE.
  remove the swab32 completely and then use something like 
  iowrite32be() instead of writel().
 I agree, this looks better but does it work on memory mapped io or
 only on PCI IO space? All our registers are memory mapped...

The iomap functions work on both.

  Basically, the problem you have is that writel() has an implicit write
  to LE register semantic. Your register is BE. the iomap variants
  provide you with more fine grained be variants to use in that case.
  There's also writel_be() but that one doesn't exist on every
  architecture afaik.
 So writel_be is the function I should use for memory mapped io? If it
 does not exist for all platforms it's a pity :-(

Just use the iomap variant. Usually you also use pci_iomap() instead of
ioremap() but afaik, for straight MMIO, it works with normal ioremap as
well.

  Now, once the mmio problem is out of the way, let's look back at how you
  then use that qpn.
  
  With the current code, you've generated something in memory which is
  byte reversed, so essentially LE on ppc and BE on x86.
  
  Then, this statement:
  
  *(u32 *) (&tx_desc->ctrl.vlan_tag) |= ring->doorbell_qpn;
  
  Will essentially write it out as-is in memory for use by the chip. The chip,
  from what you say, expects BE, so this will be broken on PPC.
 I see. So this field is laid out in LE for ppc and the rest of the
 descriptor is BE. So I assume that __iowrite64_copy() does not swap
 anything but we still have tx_desc->ctrl.vlan_tag in the wrong
 endianness.

Yes because you had swapped it initially. IE your original swab32 is
what busts it for you on ppc.

  Here too, the right solution is to instead not do that swab32 to begin
  with (ring-doorbell_qpn remains a native endian value) and instead do,
  in addition to the above mentioned change to the writel:
  
  *(u32 *) (&tx_desc->ctrl.vlan_tag) |= cpu_to_be32(ring->doorbell_qpn);
  
  (Also get rid of that cast and define vlan_tag as a __be32 to start
  with).
  
  Cheers,
  Ben.
 
 Thanks for your review. I will send another patch which should fix the
 deficiencies.

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


RE: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread David Laight
 
 Then, this statement:
 
 *(u32 *) (&tx_desc->ctrl.vlan_tag) |= ring->doorbell_qpn;

...
 instead do ... :

   *(u32 *) (&tx_desc->ctrl.vlan_tag) |= cpu_to_be32(ring->doorbell_qpn);
 
 (Also get rid of that cast and define vlan_tag as a __be32 to start
 with).

Agreed, casts that change the type of memory - *(foo *)xxx - are
generally bad news unless you are casting a generic 'buffer' to
a specific structure.
I've seen far too much code that ends up depending on the
endianness and system word size.

For the above I'd actually suggest making 'doorbell_qpn' have the
correct endianness in order to avoid the (potential) swap every
time it is set.

You also need to treble-check the required endianness for the
'vlan_tag' in the tx descriptor - i.e. what would be needed if the
MAC PCI slave were on an x86 (LE) system.

David




RE: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Benjamin Herrenschmidt
On Mon, 2011-10-10 at 09:20 +0100, David Laight wrote:
 
 For the above I'd actually suggest making 'doorbell_qpn' have the
 correct endianness in order to avoid the (potential) swap every
 time it is set.

Well, the problem is that either you'll end up swapping on x86 or you'll
end up swapping on ppc; there is no native MMIO accessor that allows
you to do a no-swap access whatever the arch you are on. Or rather,
there is the __raw_ one but you shouldn't use it for most things :-)
(Because it also doesn't have the right memory barriers).

So I'd rather they do it right using the simpler method, the cost of
swap is going to be negligible, probably not even measurable, and if and
only if they think they can improve on that in a second step, then
consider doing otherwise with appropriate measurements showing a
significant difference.

 You also need to treble-check the required endianness for the
 'vlan_tag' in the tx descriptor. What would be needed if the
 MAC PCI slave were on an x86 (LE) system.

Cheers,
Ben.




RE: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread David Laight
 
 What is this __iowrite64_copy... oh I see
 
 Nice, somebody _AGAIN_ added a bunch of generic IO 
 accessors that are utterly wrong on all archs except
 x86 (ok, -almost-).
 There isn't a single bloody memory barrier in there !

Actually memory barriers shouldn't really be added to
any of these 'accessor' functions.
(Or, at least, ones without barriers should be provided.)

The driver may want to do a series of writes, then a
single barrier, before a final write of a command (etc).

in_le32() from io.h is especially horrid!

David




Re: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Eli Cohen
On Mon, Oct 10, 2011 at 09:40:17AM +0100, David Laight wrote:
 
 Actually memory barriers shouldn't really be added to
 any of these 'accessor' functions.
 (Or, at least, ones without barriers should be provided.)
 
 The driver may want to do a series of writes, then a
 single barrier, before a final write of a command (etc).
 
 in_le32() from io.h is especially horrid!
 
   David
 
The driver would like to control if and when we want to put a memory
barrier. We really don't want it to be done under the hood. In this
respect we prefer raw functions which are still available to all
platforms.


RE: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Benjamin Herrenschmidt
On Mon, 2011-10-10 at 09:40 +0100, David Laight wrote:
  What is this __iowrite64_copy... oh I see
  
  Nice, somebody _AGAIN_ added a bunch of generic IO 
  accessors that are utterly wrong on all archs except
  x86 (ok, -almost-).
  There isn't a single bloody memory barrier in there !
 
 Actually memory barriers shouldn't really be added to
 any of these 'accessor' functions.
 (Or, at least, ones without barriers should be provided.)

As long as they are documented to provide no guarantee of ordering
between the stores... and as long as x86 driver writers have a clue that
they will not be ordered vs. surrounding accesses.

 The driver may want to do a series of writes, then a
 single barrier, before a final write of a command (etc).
 
 in_le32() from io.h is especially horrid!

The reason for that is that drivers expect fully ordered writel() vs
everything (including DMA).

Unfortunately, this is how Linux defines those semantics. I would much
prefer to require barriers explicitly but the decision was made back
then simply because the vast majority of driver writers do not
understand weakly ordered memory models and everything should be made
to look like x86.

It would be great to come up with a set of more relaxed accessors along
with the appropriate barrier to use for drivers who know better but so
far all attempts at doing so have failed due to the inability to agree
on their precise semantics. Though that was a while ago, we should probably
give it a new shot.

Cheers,
Ben. 



Re: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Benjamin Herrenschmidt
On Mon, 2011-10-10 at 10:47 +0200, Eli Cohen wrote:
 On Mon, Oct 10, 2011 at 09:40:17AM +0100, David Laight wrote:
  
  Actually memory barriers shouldn't really be added to
  any of these 'accessor' functions.
  (Or, at least, ones without barriers should be provided.)
  
  The driver may want to do a series of writes, then a
  single barrier, before a final write of a command (etc).
  
  in_le32() from io.h is especially horrid!
  
  David
  
 The driver would like to control if and when we want to put a memory
 barrier. We really don't want it to be done under the hood. In this
 respect we prefer raw functions which are still available to all
 platforms.

 ... but not necessarily the corresponding barriers.

That's why on powerpc we had to make all rmb, wmb and mb the same, aka a
full sync, because our weaker barriers don't order cachable vs.
non-cachable.

In any case, the raw functions are a bit nasty to use because they both
don't have barriers -and- don't handle endianness. So you have to be
extra careful.

In 90% of the cases, the barriers are what you want anyway. For example
in the else case of the driver, the doorbell MMIO typically wants it, so
using writel() is fine (or iowrite32be) and will have the necessary
barriers.

The case where things get a bit more nasty is when you try to use MMIO
for low latency small-data type transfers instead of DMA, in which case
you do want the ability for the chipset to write-combine and control the
barriers more precisely.

However, this is hard and Linux doesn't provide very good accessors to
do so, thus you need to be extra careful (see my example about wmb()).
In the case of the iomap copy operations, my problem is that they
don't properly advertise their lack of ordering since normal iomap does
have full ordering.

I believe they should provide ordering with a barrier before & a barrier
after, possibly with _relaxed variants or _raw variants for those who
know what they are doing.

Maybe it's time for us to revive those discussions about providing a
good set of relaxed MMIO accessors with explicit barriers :-)

Cheers,
Ben.
 



Re: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Eli Cohen
On Mon, Oct 10, 2011 at 11:01:24AM +0200, Benjamin Herrenschmidt wrote:
 
 The case where things get a bit more nasty is when you try to use MMIO
 for low latency small-data type transfers instead of DMA, in which case
 you do want the ability for the chipset to write-combine and control the
 barriers more precisely.
 
 However, this is hard and Linux doesn't provide very good accessors to
 do so, thus you need to be extra careful (see my example about wmb()).
 
 In the case of the iomap copy operations, my problem is that they
 don't properly advertise their lack of ordering since normal iomap does
 have full ordering.
 
 I believe they should provide ordering with a barrier before & a barrier
 after, possibly with _relaxed variants or _raw variants for those who
 know what they are doing.

Until then I think we need to have the logic working right on ppc and
measure if blue flame buys us any improvement in ppc. If that's not
the case (e.g. because write combining is not working), then maybe we
should avoid using blueflame in ppc.
Could any of the guys from IBM check this and give us feedback?
 
 Maybe it's time for us to revive those discussions about providing a
 good set of relaxed MMIO accessors with explicit barriers :-)
 
 Cheers,
 Ben.
  


Re: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Benjamin Herrenschmidt
On Mon, 2011-10-10 at 11:16 +0200, Eli Cohen wrote:

 Until then I think we need to have the logic working right on ppc and
 measure if blue flame buys us any improvement in ppc. If that's not
 the case (e.g because write combining is not working), then maybe we
 should avoid using blueflame in ppc.
 Could any of the guys from IBM check this and give us feedback?

I don't have the necessary hardware myself to test that but maybe Thadeu
can.

Note that for WC to work, things must be mapped non-guarded. You can do
that by using ioremap_prot() with pgprot_noncached_wc(PAGE_KERNEL) or
ioremap_wc() (dunno how generic the latter is).

From there, you should get write combining provided that you don't have
barriers between every access (ie those copy operations in their current
form should do the trick).

Cheers,
Ben.

  Maybe it's time for us to revive those discussions about providing a
  good set of relaxed MMIO accessors with explicit barriers :-)
  
  Cheers,
  Ben.
   




Re: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Eli Cohen
On Mon, Oct 10, 2011 at 11:24:05AM +0200, Benjamin Herrenschmidt wrote:
 On Mon, 2011-10-10 at 11:16 +0200, Eli Cohen wrote:
 
  Until then I think we need to have the logic working right on ppc and
  measure if blue flame buys us any improvement in ppc. If that's not
  the case (e.g because write combining is not working), then maybe we
  should avoid using blueflame in ppc.
  Could any of the guys from IBM check this and give us feedback?
 
 I don't have the necessary hardware myself to test that but maybe Thadeu
 can.
 
 Note that for WC to work, things must be mapped non-guarded. You can do
 that by using ioremap_prot() with pgprot_noncached_wc(PAGE_KERNEL) or
 ioremap_wc() (dunno how generic the latter is).

I use the io mapping API:

at driver start:

priv->bf_mapping = io_mapping_create_wc(bf_start, bf_len);
if (!priv->bf_mapping)
        err = -ENOMEM;

and then:

uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);


Will this work on ppc?

 
 From there, you should get write combining provided that you don't have
 barriers between every access (ie those copy operations in their current
 form should do the trick).
 
 Cheers,
 Ben.
 
   Maybe it's time for us to revive those discussions about providing a
   good set of relaxed MMIO accessors with explicit barriers :-)
   
   Cheers,
   Ben.

 


[PATCH 0/3] Kdump support for PPC440x

2011-10-10 Thread Suzuki K. Poulose
The following series implements CRASH_DUMP support for PPC440x. The
patches apply on top of power-next tree. This set also adds support
for CONFIG_RELOCATABLE on 44x.

I have tested the patches on Ebony and Virtex (QEMU emulated). Testing
these patches requires the latest snapshot of the kexec-tools git tree
and (preferably) the following patch for kexec-tools:

http://lists.infradead.org/pipermail/kexec/2011-October/005552.html

---

Suzuki K. Poulose (3):
  [44x] Enable CRASH_DUMP for 440x
  [44x] Enable CONFIG_RELOCATABLE for PPC44x
  [powerpc32] Process dynamic relocations for kernel


 arch/powerpc/Kconfig  |   10 +-
 arch/powerpc/Makefile |1 
 arch/powerpc/include/asm/page.h   |   84 
 arch/powerpc/kernel/Makefile  |2 
 arch/powerpc/kernel/head_44x.S|  111 ++---
 arch/powerpc/kernel/reloc_32.S|  194 +
 arch/powerpc/kernel/vmlinux.lds.S |8 +-
 arch/powerpc/mm/init_32.c |7 +
 8 files changed, 396 insertions(+), 21 deletions(-)
 create mode 100644 arch/powerpc/kernel/reloc_32.S

-- 
Thanks
Suzuki


[PATCH 1/3] [powerpc32] Process dynamic relocations for kernel

2011-10-10 Thread Suzuki K. Poulose
The following patch implements the dynamic relocation processing for
the PPC32 kernel. relocate() accepts the target virtual address and
relocates the kernel image to that address.

Currently the following relocation types are handled :

R_PPC_RELATIVE
R_PPC_ADDR16_LO
R_PPC_ADDR16_HI
R_PPC_ADDR16_HA

The last 3 relocation types in the above list depend on the value of a
symbol, whose index is encoded in the relocation entry. Hence we need
the symbol table for processing such relocations.

Note: The GNU ld for ppc32 produces buggy relocations for relocation types
that depend on symbols. The value of the symbols with STB_LOCAL scope
should be assumed to be zero. - Alan Modra

Signed-off-by: Suzuki K. Poulose suz...@in.ibm.com
Cc: Paul Mackerras pau...@samba.org
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Alan Modra amo...@au1.ibm.com
Cc: Kumar Gala ga...@kernel.crashing.org
Cc: Josh Boyer jwbo...@gmail.com
Cc: linuxppc-dev linuxppc-dev@lists.ozlabs.org
---

 arch/powerpc/Kconfig  |4 +
 arch/powerpc/kernel/Makefile  |2 
 arch/powerpc/kernel/reloc_32.S|  194 +
 arch/powerpc/kernel/vmlinux.lds.S |8 +-
 4 files changed, 207 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/kernel/reloc_32.S

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8523bd1..9eb2e60 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -859,6 +859,10 @@ config RELOCATABLE
  setting can still be useful to bootwrappers that need to know the
  load location of the kernel (eg. u-boot/mkimage).
 
+config RELOCATABLE_PPC32
+   def_bool y
+   depends on PPC32 && RELOCATABLE
+
 config PAGE_OFFSET_BOOL
bool Set custom page offset address
depends on ADVANCED_OPTIONS
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index ce4f7f1..ee728e4 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -85,6 +85,8 @@ extra-$(CONFIG_FSL_BOOKE) := head_fsl_booke.o
 extra-$(CONFIG_8xx)            := head_8xx.o
 extra-y                        += vmlinux.lds
 
+obj-$(CONFIG_RELOCATABLE_PPC32)        += reloc_32.o
+
 obj-$(CONFIG_PPC32)            += entry_32.o setup_32.o
 obj-$(CONFIG_PPC64)            += dma-iommu.o iommu.o
 obj-$(CONFIG_KGDB)             += kgdb.o
diff --git a/arch/powerpc/kernel/reloc_32.S b/arch/powerpc/kernel/reloc_32.S
new file mode 100644
index 000..045d61e
--- /dev/null
+++ b/arch/powerpc/kernel/reloc_32.S
@@ -0,0 +1,194 @@
+/*
+ * Code to process dynamic relocations for PPC32.
+ *
+ * Copyrights (C) IBM Corporation, 2011.
+ * Author: Suzuki Poulose suz...@in.ibm.com
+ *
+ *  - Based on ppc64 code - reloc_64.S
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ */
+
+#include <asm/ppc_asm.h>
+
+/* Dynamic section table entry tags */
+DT_RELA = 7    /* Tag for Elf32_Rela section */
+DT_RELASZ = 8  /* Size of the Rela relocs */
+DT_RELAENT = 9 /* Size of one Rela reloc entry */
+
+STN_UNDEF = 0  /* Undefined symbol index */
+STB_LOCAL = 0  /* Local binding for the symbol */
+
+R_PPC_ADDR16_LO = 4    /* Lower half of (S+A) */
+R_PPC_ADDR16_HI = 5    /* Upper half of (S+A) */
+R_PPC_ADDR16_HA = 6    /* High Adjusted (S+A) */
+R_PPC_RELATIVE = 22
+
+/*
+ * r3 = desired final address
+ */
+
+_GLOBAL(relocate)
+
+   mflr    r0
+   bl      0f  /* Find our current runtime address */
+0: mflr    r12 /* Make it accessible */
+   mtlr    r0
+
+   lwz r11, (p_dyn - 0b)(r12)
+   add r11, r11, r12   /* runtime address of .dynamic section */
+   lwz r9, (p_rela - 0b)(r12)
+   add r9, r9, r12 /* runtime address of .rela.dyn section */
+   lwz r10, (p_st - 0b)(r12)
+   add r10, r10, r12   /* runtime address of _stext section */
+   lwz r13, (p_sym - 0b)(r12)
+   add r13, r13, r12   /* runtime address of .dynsym section */
+
+   /*
+* Scan the dynamic section for RELA, RELASZ entries
+*/
+   li  r6, 0
+   li  r7, 0
+   li  r8, 0
+1: lwz r5, 0(r11)  /* ELF_Dyn.d_tag */
+   cmpwi   r5, 0   /* End of ELF_Dyn[] */
+   beq eodyn
+   cmpwi   r5, DT_RELA
+   bne relasz
+   lwz r7, 4(r11)  /* r7 = rela.link */
+   b   skip
+relasz:
+   cmpwi   r5, DT_RELASZ
+   bne relaent
+   lwz r8, 4(r11)  /* r8 = Total Rela relocs size */
+   b   skip
+relaent:
+   cmpwi   r5, DT_RELAENT
+   bne skip
+   lwz r6, 4(r11)  /* r6 = Size of 

[PATCH 2/3] [44x] Enable CONFIG_RELOCATABLE for PPC44x

2011-10-10 Thread Suzuki K. Poulose
The following patch adds relocatable support for PPC44x kernel.

We find the runtime address of _stext and relocate ourselves based
on the following calculation.

virtual_base = ALIGN(KERNELBASE,256M) +
MODULO(_stext.run,256M)

relocate() is called with the Effective Virtual Base Address (as
shown below)

| Phys. Addr| Virt. Addr |
Page (256M) ||
Boundary|   ||
|   ||
|   ||
Kernel Load |___|_ __ _ _ _ _|- Effective
Addr(_stext)|   |  ^ |Virt. Base Addr
|   |  | |
|   |  | |
|   |reloc_offset|
|   |  | |
|   |  | |
|   |__v_|-(KERNELBASE)%256M
|   ||
|   ||
|   ||
Page(256M)  |---||
Boundary|   ||


On BookE, we need __va() & __pa() early in the boot process to access
the device tree.

Currently this has been defined as :

#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) -
PHYSICAL_START + KERNELBASE))
where:
 PHYSICAL_START is kernstart_addr - a variable updated at runtime.
 KERNELBASE is the compile time Virtual base address of kernel.

This won't work for us, as kernstart_addr is dynamic and will yield
different results for __va()/__pa() for the same mapping.

e.g.,

Let the kernel be loaded at 64MB and KERNELBASE be 0xc0000000 (same as
PAGE_OFFSET).

In this case, we would be mapping 0 to 0xc0000000, and kernstart_addr = 64M

Now __va(1MB) = (0x100000) - (0x4000000) + 0xc0000000
= 0xbc100000 , which is wrong.

it should be : 0xc0000000 + 0x100000 = 0xc0100000

On PPC_47x (which is based on 44x), the kernel could be loaded at highmem.
Hence we cannot always depend on the compile time constants for mapping.

Here are the possible solutions:

1) Update kernstart_addr (PHYSICAL_START) to match the physical address of
the compile time KERNELBASE value, instead of the actual Physical_Address(_stext).

The disadvantage is that we may break other users of PHYSICAL_START. They
could be replaced with __pa(_stext).

2) Redefine __va() & __pa() with relocation offset


#if defined(CONFIG_RELOCATABLE) && defined(CONFIG_44x)
#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) - PHYSICAL_START + (KERNELBASE + RELOC_OFFSET)))
#define __pa(x) ((unsigned long)(x) + PHYSICAL_START - (KERNELBASE + RELOC_OFFSET))
#endif

where, RELOC_OFFSET could be

  a) A variable, say relocation_offset (like kernstart_addr), updated
 at boot time. This impacts performance, as we have to load an additional
 variable from memory.

OR

  b) #define RELOC_OFFSET ((PHYSICAL_START & PPC_PIN_SIZE_OFFSET_MASK) - \
  (KERNELBASE & PPC_PIN_SIZE_OFFSET_MASK))

   This introduces more calculations for doing the translation.

3) Redefine __va()  __pa() with a new variable

i.e,

#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) + VIRT_PHYS_OFFSET))

where VIRT_PHYS_OFFSET :

#ifdef CONFIG_44x
#define VIRT_PHYS_OFFSET virt_phys_offset
#else
#define VIRT_PHYS_OFFSET (KERNELBASE - PHYSICAL_START)
#endif /* 44x */

where virt_phys_offset is updated at runtime to :

Effective KERNELBASE - kernstart_addr.

Taking our example, above:

virt_phys_offset = effective_kernelstart_vaddr - kernstart_addr
                 = 0xc0400000 - 0x400000
                 = 0xc0000000
and

__va(0x100000) = 0xc0000000 + 0x100000 = 0xc0100000
 which is what we want.

I have implemented (3) in the following patch which has same cost of
operation as the existing one.

I have tested the patches on 440x platforms only. However this should
work fine for PPC_47x also, as we only depend on the runtime address
and the current TLB XLAT entry for the startup code, which is available
in r25. I don't have access to a 47x board yet. So, it would be great if
somebody could test this on 47x.

Signed-off-by: Suzuki K. Poulose suz...@in.ibm.com
Cc: Paul Mackerras pau...@samba.org
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Kumar Gala ga...@kernel.crashing.org
Cc: Tony Breeds t...@bakeyournoodle.com
Cc: Josh Boyer jwbo...@gmail.com
Cc: linuxppc-dev linuxppc-dev@lists.ozlabs.org
---

 arch/powerpc/Kconfig|2 -
 arch/powerpc/Makefile   |1 
 arch/powerpc/include/asm/page.h |   84 +-
 arch/powerpc/kernel/head_44x.S  |  111 ++-
 arch/powerpc/mm/init_32.c   |7 ++
 5 files changed, 187 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9eb2e60..99558d6 100644
--- a/arch/powerpc/Kconfig
+++ 

[PATCH 3/3] [44x] Enable CRASH_DUMP for 440x

2011-10-10 Thread Suzuki K. Poulose
Now that we have a relocatable kernel, supporting CRASH_DUMP only requires
turning the switches on for UP machines.

We don't have kexec support on 47x yet. Enabling SMP support would be done
as part of enabling the PPC_47x support.


Signed-off-by: Suzuki K. Poulose suz...@in.ibm.com
Cc: Josh Boyer jwbo...@gmail.com
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: linuxppc-dev linuxppc-dev@lists.ozlabs.org
---

 arch/powerpc/Kconfig |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 99558d6..fc41ce5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -362,8 +362,8 @@ config KEXEC
 
 config CRASH_DUMP
bool Build a kdump crash kernel
-   depends on PPC64 || 6xx || FSL_BOOKE
-   select RELOCATABLE if PPC64 || FSL_BOOKE
+   depends on PPC64 || 6xx || FSL_BOOKE || (44x && !SMP)
+   select RELOCATABLE if PPC64 || FSL_BOOKE || 44x
help
  Build a kernel suitable for use as a kdump capture kernel.
  The same kernel binary can be used as production kernel and dump



Re: [PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

2011-10-10 Thread Benjamin Herrenschmidt
On Mon, 2011-10-10 at 11:29 +0200, Eli Cohen wrote:
 On Mon, Oct 10, 2011 at 11:24:05AM +0200, Benjamin Herrenschmidt wrote:
  On Mon, 2011-10-10 at 11:16 +0200, Eli Cohen wrote:
  
   Until then I think we need to have the logic working right on ppc and
   measure if blue flame buys us any improvement in ppc. If that's not
   the case (e.g because write combining is not working), then maybe we
   should avoid using blueflame in ppc.
   Could any of the guys from IBM check this and give us feedback?
  
  I don't have the necessary hardware myself to test that but maybe Thadeu
  can.
  
  Note that for WC to work, things must be mapped non-guarded. You can do
  that by using ioremap_prot() with pgprot_noncached_wc(PAGE_KERNEL) or
   ioremap_wc() (dunno how generic the latter is).
 
 I use the io mapping API:
 
 at driver start:
 priv->bf_mapping = io_mapping_create_wc(bf_start, bf_len);
 if (!priv->bf_mapping)
         err = -ENOMEM;
 
 and then:
 uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);
 
 
 Will this work on ppc?

That API has never been tested on ppc I suspect. We don't have
CONFIG_HAVE_ATOMIC_IOMAP (mostly because we never needed it; it
was designed and only ever used for Intel graphics before), so
it will fall back to:

static inline struct io_mapping *
io_mapping_create_wc(resource_size_t base, unsigned long size)
{
return (struct io_mapping __force *) ioremap_wc(base, size);
}

Which should work (hopefully :-)

Cheers,
Ben.




[PATCH 00/14] Backport 8xx TLB to 2.4

2011-10-10 Thread Joakim Tjernlund
This is a backport from 2.6 which I did to overcome 8xx CPU
bugs. The 8xx does not update the DAR register when taking a TLB
error caused by dcbX and icbi insns, which makes it very
tricky to use these insns. Also, dcbst wrongly sets the
store bit when faulting into DTLB error.
A few more bugs were found during development.

I know 2.4 is in strict maintenance mode and 8xx is obsolete
but as it is still in use I wanted 8xx to age with grace.

Addendum:
I have now ported our 8xx custom board to 2.4.37.11 and
tested these patches there.

V2:
 - Remove mandatory pinning of kernel ITLB. It is not
   needed in 2.4

8 MB Large page support will follow.

Joakim Tjernlund (14):
  8xx: Use a macro to simplify CPU6 errata code.
  8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  8xx: invalidate non present TLBs
  8xx: Fix CONFIG_PIN_TLB
  8xx: Update TLB asm so it behaves as linux mm expects.
  8xx: Fixup DAR from buggy dcbX instructions.
  8xx: CPU6 errata make DTLB error too big to fit.
  8xx: Add missing Guarded setting in DTLB Error.
  8xx: Restore _PAGE_WRITETHRU
  8xx: Set correct HW pte flags in DTLB Error too
  8xx: start using dcbX instructions in various copy routines
  8xx: Use symbolic constants in TLB asm
  8xx: Optimize TLB Miss handlers
  8xx: The TLB miss handler manages ACCESSED correctly.

 arch/ppc/kernel/head_8xx.S |  367 ++-
 arch/ppc/kernel/misc.S |   18 --
 arch/ppc/lib/string.S  |   17 --
 include/asm-ppc/pgtable.h  |   26 +--
 4 files changed, 264 insertions(+), 164 deletions(-)

-- 
1.7.3.4



[PATCH 02/14] 8xx: Tag DAR with 0x00f0 to catch buggy instructions.

2011-10-10 Thread Joakim Tjernlund
dcbz, dcbf, dcbi, dcbst and icbi do not set DAR when they
cause a DTLB Error. Detect this by tagging DAR with 0x00f0
at every exception exit that modifies DAR.
This also fixes MachineCheck to pass DAR and DSISR as well.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   18 +-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index ba05a57..57858ce 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -197,7 +197,17 @@ label: \
STD_EXCEPTION(0x100, Reset, UnknownException)
 
 /* Machine check */
-   STD_EXCEPTION(0x200, MachineCheck, MachineCheckException)
+   . = 0x200
+MachineCheck:
+   EXCEPTION_PROLOG
+   mfspr   r20,DSISR
+   stw r20,_DSISR(r21)
+   mfspr   r20,DAR
+   stw r20,_DAR(r21)
+   li  r20,0x00f0
+   mtspr   DAR,r20 /* Tag DAR */
+   addi    r3,r1,STACK_FRAME_OVERHEAD
+   FINISH_EXCEPTION(MachineCheckException)
 
 /* Data access exception.
  * This is never generated by the MPC8xx.  We jump to it for other
@@ -211,6 +221,8 @@ DataAccess:
mr  r5,r20
mfspr   r4,DAR
stw r4,_DAR(r21)
+   li  r20,0x00f0
+   mtspr   DAR,r20 /* Tag DAR */
addi    r3,r1,STACK_FRAME_OVERHEAD
li  r20,MSR_KERNEL
rlwimi  r20,r23,0,16,16 /* copy EE bit from saved MSR */
@@ -249,6 +261,8 @@ Alignment:
EXCEPTION_PROLOG
mfspr   r4,DAR
stw r4,_DAR(r21)
+   li  r20,0x00f0
+   mtspr   DAR,r20 /* Tag DAR */
mfspr   r5,DSISR
stw r5,_DSISR(r21)
addi    r3,r1,STACK_FRAME_OVERHEAD
@@ -433,6 +447,7 @@ DataStoreTLBMiss:
 * of the MMU.
 */
 2: li  r21, 0x00f0
+   mtspr   DAR, r21/* Tag DAR */
rlwimi  r20, r21, 0, 24, 28 /* Set 24-27, clear 28 */
DO_8xx_CPU6(0x3d80, r3)
mtspr   MD_RPN, r20 /* Update TLB entry */
@@ -543,6 +558,7 @@ DataTLBError:
 * of the MMU.
 */
li  r21, 0x00f0
+   mtspr   DAR, r21/* Tag DAR */
rlwimi  r20, r21, 0, 24, 28 /* Set 24-27, clear 28 */
DO_8xx_CPU6(0x3d80, r3)
mtspr   MD_RPN, r20 /* Update TLB entry */
-- 
1.7.3.4



[PATCH 03/14] 8xx: invalidate non present TLBs

2011-10-10 Thread Joakim Tjernlund
8xx sometimes needs to load invalid/non-present TLBs in
its DTLB asm handler.
These must be invalidated separately, as the 8xx MMU doesn't do it.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 57858ce..b3aff21 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -221,7 +221,11 @@ DataAccess:
mr  r5,r20
mfspr   r4,DAR
stw r4,_DAR(r21)
-   li  r20,0x00f0
+   /* invalidate ~PRESENT TLBs, 8xx MMU don't do this */
+   andis.  r20,r5,0x4000
+   beq+1f
+   tlbie   r4
+1: li  r20,0x00f0
mtspr   DAR,r20 /* Tag DAR */
addi    r3,r1,STACK_FRAME_OVERHEAD
li  r20,MSR_KERNEL
@@ -238,7 +242,11 @@ InstructionAccess:
addi    r3,r1,STACK_FRAME_OVERHEAD
mr  r4,r22
mr  r5,r23
-   li  r20,MSR_KERNEL
+   /* invalidate ~PRESENT TLBs, 8xx MMU don't do this */
+   andis.  r20,r5,0x4000
+   beq+1f
+   tlbie   r4
+1: li  r20,MSR_KERNEL
rlwimi  r20,r23,0,16,16 /* copy EE bit from saved MSR */
FINISH_EXCEPTION(do_page_fault)
 
-- 
1.7.3.4
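As a sanity check on the new guard, here is a rough Python model of the added test, assuming (as the commit message implies) that the andis. mask 0x4000 targets the "translation not valid" DSISR bit:

```python
# Rough model of the added guard, assuming the andis. mask 0x4000
# targets the "translation not valid" DSISR bit (0x40000000) as the
# commit message implies.
DSISR_NOTRANS = 0x4000 << 16   # andis. compares against the upper half

def needs_tlbie(dsisr: int) -> bool:
    """True when the faulting entry is non-present and must be
    invalidated by hand, since the 8xx MMU won't do it."""
    return bool(dsisr & DSISR_NOTRANS)

assert needs_tlbie(0x40000000)
assert not needs_tlbie(0x08000000)   # e.g. a protection fault
```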



[PATCH 04/14] 8xx: Fix CONFIG_PIN_TLB

2011-10-10 Thread Joakim Tjernlund
The wrong register was loaded into MD_RPN.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index b3aff21..9d8a1b5 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -848,13 +848,13 @@ initial_mmu:
mtspr   MD_TWC, r9
li  r11, MI_BOOTINIT/* Create RPN for address 0 */
addis   r11, r11, 0x0080/* Add 8M */
-   mtspr   MD_RPN, r8
+   mtspr   MD_RPN, r11
 
addis   r8, r8, 0x0080  /* Add 8M */
mtspr   MD_EPN, r8
mtspr   MD_TWC, r9
addis   r11, r11, 0x0080/* Add 8M */
-   mtspr   MD_RPN, r8
+   mtspr   MD_RPN, r11
 #endif
 
/* Since the cache is enabled according to the information we
-- 
1.7.3.4



[PATCH 06/14] 8xx: Fixup DAR from buggy dcbX instructions.

2011-10-10 Thread Joakim Tjernlund
This is an assembler version of the fixup for DAR not being set
by the dcbX and icbi instructions. There are two variants: one
uses self-modifying code, the other (the default) uses a jump
table but is much bigger.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |  149 +++-
 1 files changed, 146 insertions(+), 3 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index c9770b6..0891b96 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -511,8 +511,17 @@ DataTLBError:
stw r20, 0(r0)
stw r21, 4(r0)
 
-   mfspr   r20, DSISR
-   andis.  r21, r20, 0x4800/* !translation or protection */
+   mfspr   r20, DAR
+   cmpwi   cr0, r20, 0x00f0
+   beq-FixupDAR/* must be a buggy dcbX, icbi insn. */
+DARFixed:
+   /* As the DAR fixup may clear store we may have all 3 states zero.
+* Make sure only 0x0200(store) falls down into DIRTY handling
+*/
+   mfspr   r21, DSISR
+   andis.  r21, r21, 0x4a00/* !translation, protection or store */
+   srwi    r21, r21, 16
+   cmpwi   cr0, r21, 0x0200/* just store ? */
bne-2f
/* Only Change bit left now, do it here as it is faster
 * than trapping to the C fault handler.
@@ -534,7 +543,7 @@ DataTLBError:
 * are initialized in mapin_ram().  This will avoid the problem,
 * assuming we only use the dcbi instruction on kernel addresses.
 */
-   mfspr   r20, DAR
+   /* DAR is in r20 already */
rlwinm  r21, r20, 0, 0, 19
ori r21, r21, MD_EVALID
mfspr   r20, M_CASID
@@ -618,6 +627,140 @@ DataTLBError:
STD_EXCEPTION(0x1f00, Trap_1f, UnknownException)
 
. = 0x2000
+/* This is the procedure to calculate the data EA for buggy dcbx,dcbi instructions
+ * by decoding the registers used by the dcbx instruction and adding them.
+ * DAR is set to the calculated address and r10 also holds the EA on exit.
+ */
+ /* define if you don't want to use self modifying code */
+#define NO_SELF_MODIFYING_CODE
+FixupDAR:/* Entry point for dcbx workaround. */
+   /* fetch instruction from memory. */
+   mfspr   r20, SRR0
+   andis.  r21, r20, 0x8000    /* Address >= 0x80000000 */
+   DO_8xx_CPU6(0x3780, r3)
+   mtspr   MD_EPN, r20
+   mfspr   r21, M_TWB  /* Get level 1 table entry address */
+   beq-3f  /* Branch if user space */
+   lis r21, (swapper_pg_dir-PAGE_OFFSET)@h
+   ori r21, r21, (swapper_pg_dir-PAGE_OFFSET)@l
+   rlwimi  r21, r20, 32-20, 0xffc  /* r21 = (r21 & ~0xffc) | ((r20 >> 20) & 0xffc) */
+3: lwz r21, 0(r21) /* Get the level 1 entry */
+   tophys  (r21, r21)
+   DO_8xx_CPU6(0x3b80, r3)
+   mtspr   MD_TWC, r21 /* Load pte table base address */
+   mfspr   r21, MD_TWC /* and get the pte address */
+   lwz r21, 0(r21) /* Get the pte */
+   /* concat physical page address(r21) and page offset(r20) */
+   rlwimi  r21, r20, 0, 20, 31
+   lwz r21,0(r21)
+/* Check if it really is a dcbx instruction. */
+/* dcbt and dcbtst does not generate DTLB Misses/Errors,
+ * no need to include them here */
+   srwi    r20, r21, 26    /* check if major OP code is 31 */
+   cmpwi   cr0, r20, 31
+   bne-141f
+   rlwinm  r20, r21, 0, 21, 30
+   cmpwi   cr0, r20, 2028  /* Is dcbz? */
+   beq+142f
+   cmpwi   cr0, r20, 940   /* Is dcbi? */
+   beq+142f
+   cmpwi   cr0, r20, 108   /* Is dcbst? */
+   beq+144f/* Fix up store bit! */
+   cmpwi   cr0, r20, 172   /* Is dcbf? */
+   beq+142f
+   cmpwi   cr0, r20, 1964  /* Is icbi? */
+   beq+142f
+141:   mfspr   r20, DAR/* r20 must hold DAR at exit */
+   b   DARFixed/* Nope, go back to normal TLB processing */
+
+144:   mfspr   r20, DSISR
+   rlwinm  r20, r20,0,7,5  /* Clear store bit for buggy dcbst insn */
+   mtspr   DSISR, r20
+142:   /* continue, it was a dcbx, dcbi instruction. */
+#ifdef CONFIG_8xx_CPU6
+   lwz r3, 8(r0)   /* restore r3 from memory */
+#endif
+#ifndef NO_SELF_MODIFYING_CODE
+   andis.  r20,r21,0x1f/* test if reg RA is r0 */
+   li  r20,modified_instr@l
+   dcbtst  r0,r20  /* touch for store */
+   rlwinm  r21,r21,0,0,20  /* Zero lower 10 bits */
+   oris    r21,r21,640 /* Transform instr. to a add r20,RA,RB */
+   ori r21,r21,532
+   stw r21,0(r20)  /* store add/and instruction */
+   dcbf    0,r20   /* flush new instr. to memory. */
+   icbi    0,r20   /* invalidate instr. cache line */
+   lwz r21, 4(r0)  /* restore r21 from memory */
+   mfspr   r20, M_TW   /* restore r20 from M_TW */
+   isync   /* Wait until new instr is loaded from 
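The opcode checks in FixupDAR above can be modeled in Python. A hypothetical sketch follows (the standard X-form field layout is assumed, and `gpr` is a stand-in for the saved register file):

```python
# Hypothetical model of the FixupDAR decode: keep instruction bits
# 21..30 (rlwinm rX,insn,0,21,30 == mask 0x7fe, the X-form extended
# opcode shifted left by one) and compare against the dcbX/icbi
# values listed in the handler.
DCBX_XO = {2028: "dcbz", 940: "dcbi", 108: "dcbst",
           172: "dcbf", 1964: "icbi"}

def classify(insn: int):
    """Return the cache-op mnemonic, or None for other insns."""
    if (insn >> 26) != 31:        # major opcode must be 31
        return None
    return DCBX_XO.get(insn & 0x7fe)

def dcbx_ea(insn: int, gpr) -> int:
    """EA the 8xx failed to latch in DAR: (RA|0) + (RB)."""
    ra = (insn >> 16) & 0x1f
    rb = (insn >> 11) & 0x1f
    return ((gpr[ra] if ra else 0) + gpr[rb]) & 0xffffffff

# dcbz r3,r4 with r3=0x1000, r4=0x20 faults at EA 0x1020
insn = (31 << 26) | (3 << 16) | (4 << 11) | (1014 << 1)
assert classify(insn) == "dcbz"
assert dcbx_ea(insn, {3: 0x1000, 4: 0x20}) == 0x1020
```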

[PATCH 05/14] 8xx: Update TLB asm so it behaves as linux mm expects.

2011-10-10 Thread Joakim Tjernlund
Update the TLB asm to make proper use of _PAGE_DIRTY and _PAGE_ACCESSED.
Get rid of _PAGE_HWWRITE too.
Pros:
  - PRESENT is copied to ACCESSED, fixing accounting
  - DIRTY is mapped to 0x100, the changed bit, and is set directly
when a page has been made dirty.
  - Proper RO/RW mapping of user space.
  - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
  - kernel RO/user NA support. Not sure this is really needed, would save
a few insn if not required.
Cons:
  - A few more instructions in the DTLB Miss routine.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   53 ++-
 include/asm-ppc/pgtable.h  |   15 +--
 2 files changed, 39 insertions(+), 29 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 9d8a1b5..c9770b6 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -369,25 +369,27 @@ InstructionTLBMiss:
 */
tophys(r21,r21)
ori r21,r21,1   /* Set valid bit */
-   beq-2f  /* If zero, don't try to find a pte */
DO_8xx_CPU6(0x2b80, r3)
mtspr   MI_TWC, r21 /* Set segment attributes */
+   beq-2f  /* If zero, don't try to find a pte */
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21 /* Load pte table base address */
mfspr   r21, MD_TWC /* and get the pte address */
lwz r20, 0(r21) /* Get the pte */
 
-   ori r20, r20, _PAGE_ACCESSED
-   stw r20, 0(r21)
-
+#if 1
+   /* if !swap, you can delete this */
+   rlwimi  r20, r20, 5, _PAGE_PRESENT<<5   /* Copy PRESENT to ACCESSED */
+   stw r20, 0(r21) /* Update pte */
+#endif
/* The Linux PTE won't go exactly into the MMU TLB.
-* Software indicator bits 21, 22 and 28 must be clear.
+* Software indicator bits 21 and 28 must be clear.
 * Software indicator bits 24, 25, 26, and 27 must be
 * set.  All other Linux PTE bits control the behavior
 * of the MMU.
 */
 2: li  r21, 0x00f0
-   rlwimi  r20, r21, 0, 24, 28 /* Set 24-27, clear 28 */
+   rlwimi  r20, r21, 0, 0x07f8 /* Set 24-27, clear 21-23,28 */
DO_8xx_CPU6(0x2d80, r3)
mtspr   MI_RPN, r20 /* Update TLB entry */
 
@@ -444,12 +446,25 @@ DataStoreTLBMiss:
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21
 
-   mfspr   r21, MD_TWC /* get the pte address again */
-   ori r20, r20, _PAGE_ACCESSED
-   stw r20, 0(r21)
+#if 1
+   /* if !swap, you can delete this */
+   mfspr   r21, MD_TWC /* get the pte address */
+   rlwimi  r20, r20, 5, _PAGE_PRESENT<<5   /* Copy PRESENT to ACCESSED */
+   stw r20, 0(r21) /* Update pte */
+#endif
+
+   /* Honour kernel RO, User NA */
+   /* 0x200 == Extended encoding, bit 22 */
+   /* r20 |= (r20 & _PAGE_USER) >> 2 */
+   rlwimi  r20, r20, 32-2, 0x200
+   /* r21 = (r20 & _PAGE_RW) >> 1 */
+   rlwinm  r21, r20, 32-1, 0x200
+   or  r20, r21, r20
+   /* invert RW and 0x200 bits */
+   xorir20, r20, _PAGE_RW | 0x200
 
/* The Linux PTE won't go exactly into the MMU TLB.
-* Software indicator bits 21, 22 and 28 must be clear.
+* Software indicator bits 22 and 28 must be clear.
 * Software indicator bits 24, 25, 26, and 27 must be
 * set.  All other Linux PTE bits control the behavior
 * of the MMU.
@@ -496,11 +511,12 @@ DataTLBError:
stw r20, 0(r0)
stw r21, 4(r0)
 
-   /* First, make sure this was a store operation.
-   */
mfspr   r20, DSISR
-   andis.  r21, r20, 0x0200/* If set, indicates store op */
-   beq 2f
+   andis.  r21, r20, 0x4800/* !translation or protection */
+   bne-2f
+   /* Only Change bit left now, do it here as it is faster
+* than trapping to the C fault handler.
+*/
 
/* The EA of a data TLB miss is automatically stored in the MD_EPN
 * register.  The EA of a data TLB error is automatically stored in
@@ -550,17 +566,12 @@ DataTLBError:
mfspr   r21, MD_TWC /* and get the pte address */
lwz r20, 0(r21) /* Get the pte */
 
-   andi.   r21, r20, _PAGE_RW  /* Is it writeable? */
-   beq 2f  /* Bail out if not */
-
-   /* Update 'changed', among others.
-   */
ori r20, r20, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
-   mfspr   r21, MD_TWC /* Get pte address again */
stw r20, 0(r21) /* and update pte in table */
+   xorir20, r20, _PAGE_RW  /* RW bit is inverted */
 
/* The Linux PTE won't go exactly into the MMU TLB.
-* Software indicator bits 21, 22 and 28 must be clear.
+* Software 
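The "Honour kernel RO, User NA" transform introduced in this patch is compact, so a sketch in Python may help. It assumes _PAGE_USER = 0x800 and _PAGE_RW = 0x400, values implied by the rotate amounts 32-2 and 32-1 landing in mask 0x200:

```python
# Sketch of the "Honour kernel RO, User NA" transform in
# DataStoreTLBMiss, assuming _PAGE_USER = 0x800 and _PAGE_RW = 0x400
# (implied by the rotate amounts 32-2 and 32-1 landing in mask 0x200).
_PAGE_USER = 0x800   # assumed PTE bit
_PAGE_RW = 0x400     # assumed PTE bit
EXT_ENC = 0x200      # "Extended encoding, bit 22" per the comment

def hw_protection_bits(pte: int) -> int:
    pte |= (pte & _PAGE_USER) >> 2        # rlwimi r20,r20,32-2,0x200
    pte |= (pte & _PAGE_RW) >> 1          # rlwinm r21,r20,32-1,0x200; or
    return pte ^ (_PAGE_RW | EXT_ENC)     # xori: RW and 0x200 are inverted

# A pte with neither USER nor RW set ends up with both inverted bits
# set; a user read-write pte keeps only _PAGE_USER.
assert hw_protection_bits(0) == 0x600
assert hw_protection_bits(_PAGE_USER | _PAGE_RW) == 0x800
```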

[PATCH 07/14] 8xx: CPU6 errata make DTLB error too big to fit.

2011-10-10 Thread Joakim Tjernlund
Branch to common code in DTLB Miss instead.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   23 ++-
 1 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 0891b96..367fec0 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -469,6 +469,7 @@ DataStoreTLBMiss:
 * set.  All other Linux PTE bits control the behavior
 * of the MMU.
 */
+finish_DTLB:
 2: li  r21, 0x00f0
mtspr   DAR, r21/* Tag DAR */
rlwimi  r20, r21, 0, 24, 28 /* Set 24-27, clear 28 */
@@ -578,27 +579,7 @@ DARFixed:
ori r20, r20, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
stw r20, 0(r21) /* and update pte in table */
xorir20, r20, _PAGE_RW  /* RW bit is inverted */
-
-   /* The Linux PTE won't go exactly into the MMU TLB.
-* Software indicator bits 22 and 28 must be clear.
-* Software indicator bits 24, 25, 26, and 27 must be
-* set.  All other Linux PTE bits control the behavior
-* of the MMU.
-*/
-   li  r21, 0x00f0
-   mtspr   DAR, r21/* Tag DAR */
-   rlwimi  r20, r21, 0, 24, 28 /* Set 24-27, clear 28 */
-   DO_8xx_CPU6(0x3d80, r3)
-   mtspr   MD_RPN, r20 /* Update TLB entry */
-
-   mfspr   r20, M_TW   /* Restore registers */
-   lwz r21, 0(r0)
-   mtcrr21
-   lwz r21, 4(r0)
-#ifdef CONFIG_8xx_CPU6
-   lwz r3, 8(r0)
-#endif
-   rfi
+   b   finish_DTLB
 2:
mfspr   r20, M_TW   /* Restore registers */
lwz r21, 0(r0)
-- 
1.7.3.4



[PATCH 08/14] 8xx: Add missing Guarded setting in DTLB Error.

2011-10-10 Thread Joakim Tjernlund
Only DTLB Miss set this bit; DTLB Error needs to set it too,
otherwise the setting is lost when the page becomes dirty.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   12 +---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 367fec0..86bc727 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -573,9 +573,15 @@ DARFixed:
ori r21, r21, 1 /* Set valid bit in physical L2 page */
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21 /* Load pte table base address */
-   mfspr   r21, MD_TWC /* and get the pte address */
-   lwz r20, 0(r21) /* Get the pte */
-
+   mfspr   r20, MD_TWC /* and get the pte address */
+   lwz r20, 0(r20) /* Get the pte */
+   /* Insert the Guarded flag into the TWC from the Linux PTE.
+* It is bit 27 of both the Linux PTE and the TWC
+*/
+   rlwimi  r21, r20, 0, 27, 27
+   DO_8xx_CPU6(0x3b80, r3)
+   mtspr   MD_TWC, r21
+   mfspr   r21, MD_TWC /* get the pte address again */
ori r20, r20, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
stw r20, 0(r21) /* and update pte in table */
xorir20, r20, _PAGE_RW  /* RW bit is inverted */
-- 
1.7.3.4
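The Guarded-flag insertion ("rlwimi r21, r20, 0, 27, 27") can be sketched in Python. Bit 27 is mask 0x10, which is _PAGE_GUARDED in the Linux PTE and, per the comment in the patch, the same bit position in the TWC:

```python
# Model of "rlwimi r21, r20, 0, 27, 27": bit 27 is mask 0x10, which
# is _PAGE_GUARDED in the Linux PTE and, per the comment, the same
# bit position in the TWC, so it is copied with no rotation.
_PAGE_GUARDED = 0x10

def insert_guarded(twc: int, pte: int) -> int:
    return (twc & ~_PAGE_GUARDED) | (pte & _PAGE_GUARDED)

assert insert_guarded(0, _PAGE_GUARDED) == _PAGE_GUARDED
assert insert_guarded(_PAGE_GUARDED, 0) == 0
```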



[PATCH 09/14] 8xx: Restore _PAGE_WRITETHRU

2011-10-10 Thread Joakim Tjernlund
8xx has not had WRITETHRU due to a lack of bits in the pte.
After the recent rewrite of the 8xx TLB code, there are
two bits left. Use one of them for WRITETHRU.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |8 
 include/asm-ppc/pgtable.h  |5 +++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 86bc727..402158d 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -443,6 +443,10 @@ DataStoreTLBMiss:
 * above.
 */
rlwimi  r21, r20, 0, 27, 27
+   /* Insert the WriteThru flag into the TWC from the Linux PTE.
+* It is bit 25 in the Linux PTE and bit 30 in the TWC
+*/
+   rlwimi  r21, r20, 32-5, 30, 30
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21
 
@@ -579,6 +583,10 @@ DARFixed:
 * It is bit 27 of both the Linux PTE and the TWC
 */
rlwimi  r21, r20, 0, 27, 27
+   /* Insert the WriteThru flag into the TWC from the Linux PTE.
+* It is bit 25 in the Linux PTE and bit 30 in the TWC
+*/
+   rlwimi  r21, r20, 32-5, 30, 30
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21
mfspr   r21, MD_TWC /* get the pte address again */
diff --git a/include/asm-ppc/pgtable.h b/include/asm-ppc/pgtable.h
index 2ba37d3..6cfc5fc 100644
--- a/include/asm-ppc/pgtable.h
+++ b/include/asm-ppc/pgtable.h
@@ -298,12 +298,13 @@ extern unsigned long vmalloc_start;
 #define _PAGE_NO_CACHE 0x0002  /* I: cache inhibit */
 #define _PAGE_SHARED   0x0004  /* No ASID (context) compare */
 
-/* These three software bits must be masked out when the entry is loaded
- * into the TLB, 2 SW bits free.
+/* These four software bits must be masked out when the entry is loaded
+ * into the TLB, 1 SW bits left(0x0080).
  */
 #define _PAGE_EXEC 0x0008  /* software: i-cache coherency required */
 #define _PAGE_GUARDED  0x0010  /* software: guarded access */
 #define _PAGE_ACCESSED 0x0020  /* software: page referenced */
+#define _PAGE_WRITETHRU0x0040  /* software: caching is write through */
 
 /* Setting any bits in the nibble with the follow two controls will
  * require a TLB exception handler change.  It is assumed unused bits
-- 
1.7.3.4
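The WriteThru insertion ("rlwimi r21, r20, 32-5, 30, 30") rotates the Linux PTE right by five bits before depositing. A minimal Python model, assuming TWC bit 30 (mask 0x2) is the hardware write-through bit as the comment states:

```python
# Model of "rlwimi r21, r20, 32-5, 30, 30": the Linux PTE's
# _PAGE_WRITETHRU (bit 25, value 0x40) is rotated right by 5 into
# TWC bit 30 (mask 0x2), assumed to be the hardware write-through bit.
_PAGE_WRITETHRU = 0x0040
TWC_WT = 0x0002   # assumed W bit position in MD_TWC

def insert_writethru(twc: int, pte: int) -> int:
    return (twc & ~TWC_WT) | ((pte >> 5) & TWC_WT)

assert insert_writethru(0, _PAGE_WRITETHRU) == TWC_WT
assert insert_writethru(TWC_WT, 0) == 0
```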



[PATCH 10/14] 8xx: Set correct HW pte flags in DTLB Error too

2011-10-10 Thread Joakim Tjernlund
DTLB Error needs to adjust the HW PTE bits as DTLB Miss
does.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 402158d..4bcd9b3 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -592,7 +592,12 @@ DARFixed:
mfspr   r21, MD_TWC /* get the pte address again */
ori r20, r20, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
stw r20, 0(r21) /* and update pte in table */
-   xorir20, r20, _PAGE_RW  /* RW bit is inverted */
+   rlwimi  r20, r20, 32-2, _PAGE_USER>>2 /* Copy USER to Encoding */
+   /* r21 = (r20 & _PAGE_RW) >> 1 */
+   rlwinm  r21, r20, 32-1, _PAGE_RW>>1
+   or  r20, r21, r20
+   /* invert RW and 0x200 bits */
+   xorir20, r20, _PAGE_RW | 0x200
b   finish_DTLB
 2:
mfspr   r20, M_TW   /* Restore registers */
-- 
1.7.3.4



[PATCH 12/14] 8xx: Use symbolic constants in TLB asm

2011-10-10 Thread Joakim Tjernlund
Use the PTE #defines where possible instead of
hardcoded constants.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 4bcd9b3..0f2101d 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -442,11 +442,11 @@ DataStoreTLBMiss:
 * this into the Linux pgd/pmd and load it in the operation
 * above.
 */
-   rlwimi  r21, r20, 0, 27, 27
+   rlwimi  r21, r20, 0, _PAGE_GUARDED
/* Insert the WriteThru flag into the TWC from the Linux PTE.
 * It is bit 25 in the Linux PTE and bit 30 in the TWC
 */
-   rlwimi  r21, r20, 32-5, 30, 30
+   rlwimi  r21, r20, 32-5, _PAGE_WRITETHRU>>5
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21
 
@@ -460,9 +460,9 @@ DataStoreTLBMiss:
/* Honour kernel RO, User NA */
/* 0x200 == Extended encoding, bit 22 */
/* r20 |= (r20 & _PAGE_USER) >> 2 */
-   rlwimi  r20, r20, 32-2, 0x200
+   rlwimi  r20, r20, 32-2, _PAGE_USER>>2 /* Copy USER to Encoding */
/* r21 = (r20 & _PAGE_RW) >> 1 */
-   rlwinm  r21, r20, 32-1, 0x200
+   rlwinm  r21, r20, 32-1, _PAGE_RW>>1
or  r20, r21, r20
/* invert RW and 0x200 bits */
xorir20, r20, _PAGE_RW | 0x200
@@ -582,11 +582,11 @@ DARFixed:
/* Insert the Guarded flag into the TWC from the Linux PTE.
 * It is bit 27 of both the Linux PTE and the TWC
 */
-   rlwimi  r21, r20, 0, 27, 27
+   rlwimi  r21, r20, 0, _PAGE_GUARDED
/* Insert the WriteThru flag into the TWC from the Linux PTE.
 * It is bit 25 in the Linux PTE and bit 30 in the TWC
 */
-   rlwimi  r21, r20, 32-5, 30, 30
+   rlwimi  r21, r20, 32-5, _PAGE_WRITETHRU>>5
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21
mfspr   r21, MD_TWC /* get the pte address again */
-- 
1.7.3.4



[PATCH 11/14] 8xx: start using dcbX instructions in various copy routines

2011-10-10 Thread Joakim Tjernlund
Now that 8xx can fix up dcbX instructions, start using them
where possible, as every other PowerPC arch does.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/misc.S |   18 --
 arch/ppc/lib/string.S  |   17 -
 2 files changed, 0 insertions(+), 35 deletions(-)

diff --git a/arch/ppc/kernel/misc.S b/arch/ppc/kernel/misc.S
index c616098..c291005 100644
--- a/arch/ppc/kernel/misc.S
+++ b/arch/ppc/kernel/misc.S
@@ -662,15 +662,7 @@ _GLOBAL(__flush_dcache_icache)
 _GLOBAL(clear_page)
li  r0,4096/L1_CACHE_LINE_SIZE
mtctr   r0
-#ifdef CONFIG_8xx
-   li  r4, 0
-1: stw r4, 0(r3)
-   stw r4, 4(r3)
-   stw r4, 8(r3)
-   stw r4, 12(r3)
-#else
1: dcbz    0,r3
-#endif
addi    r3,r3,L1_CACHE_LINE_SIZE
bdnz    1b
blr
@@ -695,15 +687,6 @@ _GLOBAL(copy_page)
addi    r3,r3,-4
addi    r4,r4,-4
 
-#ifdef CONFIG_8xx
-   /* don't use prefetch on 8xx */
-   li  r0,4096/L1_CACHE_LINE_SIZE
-   mtctr   r0
-1: COPY_16_BYTES
-   bdnz    1b
-   blr
-
-#else  /* not 8xx, we can prefetch */
li  r5,4
 
#if MAX_COPY_PREFETCH > 1
@@ -744,7 +727,6 @@ _GLOBAL(copy_page)
li  r0,MAX_COPY_PREFETCH
li  r11,4
b   2b
-#endif /* CONFIG_8xx */
 
 /*
  * Atomic [testset] exchange
diff --git a/arch/ppc/lib/string.S b/arch/ppc/lib/string.S
index 6ca54b4..b6ea44b 100644
--- a/arch/ppc/lib/string.S
+++ b/arch/ppc/lib/string.S
@@ -159,14 +159,7 @@ _GLOBAL(cacheable_memzero)
bdnz4b
 3: mtctr   r9
li  r7,4
-#if !defined(CONFIG_8xx)
10: dcbz    r7,r6
-#else
-10:stw r4, 4(r6)
-   stw r4, 8(r6)
-   stw r4, 12(r6)
-   stw r4, 16(r6)
-#endif
addi    r6,r6,CACHELINE_BYTES
bdnz    10b
clrlwi  r5,r8,32-LG_CACHELINE_BYTES
@@ -261,9 +254,7 @@ _GLOBAL(cacheable_memcpy)
mtctr   r0
beq 63f
 53:
-#if !defined(CONFIG_8xx)
dcbz    r11,r6
-#endif
COPY_16_BYTES
#if L1_CACHE_LINE_SIZE >= 32
COPY_16_BYTES
@@ -443,13 +434,6 @@ _GLOBAL(__copy_tofrom_user)
li  r11,4
beq 63f
 
-#ifdef CONFIG_8xx
-   /* Don't use prefetch on 8xx */
-   mtctr   r0
-53:   COPY_16_BYTES_WITHEX(0)
-   bdnz    53b
-
-#else /* not CONFIG_8xx */
/* Here we decide how far ahead to prefetch the source */
li  r3,4
cmpwi   r0,1
@@ -502,7 +486,6 @@ _GLOBAL(__copy_tofrom_user)
li  r3,4
li  r7,0
bne 114b
-#endif /* CONFIG_8xx */
 
 63:srwi.   r0,r5,2
mtctr   r0
-- 
1.7.3.4



[PATCH 13/14] 8xx: Optimize TLB Miss handlers

2011-10-10 Thread Joakim Tjernlund
Only update the pte w.r.t. ACCESSED if it isn't already set.
Wrap the ACCESSED handling in #ifndef NO_SWAP to ease optimization too.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 0f2101d..36089cc 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -377,10 +377,14 @@ InstructionTLBMiss:
mfspr   r21, MD_TWC /* and get the pte address */
lwz r20, 0(r21) /* Get the pte */
 
-#if 1
+#ifndef NO_SWAP
/* if !swap, you can delete this */
+   andi.   r21, r20, _PAGE_ACCESSED/* test ACCESSED bit */
+   bne+4f  /* Branch if set */
+   mfspr   r21, MD_TWC /* get the pte address */
rlwimi  r20, r20, 5, _PAGE_PRESENT<<5   /* Copy PRESENT to ACCESSED */
stw r20, 0(r21) /* Update pte */
+4:
 #endif
/* The Linux PTE won't go exactly into the MMU TLB.
 * Software indicator bits 21 and 28 must be clear.
@@ -450,11 +454,14 @@ DataStoreTLBMiss:
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21
 
-#if 1
+#ifndef NO_SWAP
/* if !swap, you can delete this */
+   andi.   r21, r20, _PAGE_ACCESSED/* test ACCESSED bit */
+   bne+4f  /* Branch if set */
mfspr   r21, MD_TWC /* get the pte address */
rlwimi  r20, r20, 5, _PAGE_PRESENT<<5   /* Copy PRESENT to ACCESSED */
stw r20, 0(r21) /* Update pte */
+4:
 #endif
 
/* Honour kernel RO, User NA */
-- 
1.7.3.4
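The optimization above can be sketched in Python: the pte is only written back when ACCESSED is not yet set, saving a store on the hot TLB-miss path. Bit values are taken from include/asm-ppc/pgtable.h as shown earlier in this series:

```python
# Sketch of the optimization: the pte is only written back when
# ACCESSED is not yet set, saving a store on the hot TLB-miss path.
# Values taken from include/asm-ppc/pgtable.h in this series.
_PAGE_PRESENT = 0x0001
_PAGE_ACCESSED = 0x0020

def update_accessed(pte: int):
    """Return (new_pte, wrote_back), mirroring the andi./bne+ guard."""
    if pte & _PAGE_ACCESSED:
        return pte, False                      # bne+ 4f: skip the store
    # rlwimi r20,r20,5,...: copy PRESENT into ACCESSED (0x1 << 5)
    return pte | ((pte & _PAGE_PRESENT) << 5), True

assert update_accessed(0x0021) == (0x0021, False)
assert update_accessed(0x0001) == (0x0021, True)
```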



[PATCH 14/14] 8xx: The TLB miss handler manages ACCESSED correctly.

2011-10-10 Thread Joakim Tjernlund
The new MMU/TLB code no longer sets ACCESSED unconditionally
so remove the exception.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 include/asm-ppc/pgtable.h |   10 --
 1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/include/asm-ppc/pgtable.h b/include/asm-ppc/pgtable.h
index 6cfc5fc..b94e8a8 100644
--- a/include/asm-ppc/pgtable.h
+++ b/include/asm-ppc/pgtable.h
@@ -318,16 +318,6 @@ extern unsigned long vmalloc_start;
 #define _PMD_PAGE_MASK 0x000c
 #define _PMD_PAGE_8M   0x000c
 
-/*
- * The 8xx TLB miss handler allegedly sets _PAGE_ACCESSED in the PTE
- * for an address even if _PAGE_PRESENT is not set, as a performance
- * optimization.  This is a bug if you ever want to use swap unless
- * _PAGE_ACCESSED is 2, which it isn't, or unless you have 8xx-specific
- * definitions for __swp_entry etc. below, which would be gross.
- *  -- paulus
- */
-#define _PTE_NONE_MASK _PAGE_ACCESSED
-
 #else /* CONFIG_6xx */
 /* Definitions for 60x, 740/750, etc. */
 #define _PAGE_PRESENT  0x001   /* software: pte contains a translation */
-- 
1.7.3.4



[PATCH 0/3] 8xx: Large page(8MB) support for 2.4

2011-10-10 Thread Joakim Tjernlund
This adds Large page support for 8xx and uses it
for all kernel RAM.

Further usage is possible, IMAP_ADDR and on board
flash comes to mind.

There is one bit free in the pte which could be used for
selecting different large page sizes, but that is for another
day.

- Dan, what do you think :)

Joakim Tjernlund (3):
  8xx: replace _PAGE_EXEC with _PAGE_PSE
  8xx: Support LARGE pages in TLB code.
  8xx: Use LARGE pages for kernel RAM.

 arch/ppc/kernel/head_8xx.S |   30 +++---
 arch/ppc/mm/pgtable.c  |4 +++-
 include/asm-ppc/pgtable.h  |6 +-
 3 files changed, 27 insertions(+), 13 deletions(-)

-- 
1.7.3.4



[PATCH 1/3] 8xx: replace _PAGE_EXEC with _PAGE_PSE

2011-10-10 Thread Joakim Tjernlund
We need this bit for large pages (8MB). Adjust the TLB code
to not clear bit 28 in Mx_RPN.

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |8 
 include/asm-ppc/pgtable.h  |6 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 36089cc..8e3fe40 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -387,13 +387,13 @@ InstructionTLBMiss:
 4:
 #endif
/* The Linux PTE won't go exactly into the MMU TLB.
-* Software indicator bits 21 and 28 must be clear.
+* Software indicator bit 21 must be clear.
 * Software indicator bits 24, 25, 26, and 27 must be
 * set.  All other Linux PTE bits control the behavior
 * of the MMU.
 */
 2: li  r21, 0x00f0
-   rlwimi  r20, r21, 0, 0x07f8 /* Set 24-27, clear 21-23,28 */
+   rlwimi  r20, r21, 0, 0x07f0 /* Set 24-27, clear 21-23 */
DO_8xx_CPU6(0x2d80, r3)
mtspr   MI_RPN, r20 /* Update TLB entry */
 
@@ -475,7 +475,7 @@ DataStoreTLBMiss:
xorir20, r20, _PAGE_RW | 0x200
 
/* The Linux PTE won't go exactly into the MMU TLB.
-* Software indicator bits 22 and 28 must be clear.
+* Software indicator bit 22 must be clear.
 * Software indicator bits 24, 25, 26, and 27 must be
 * set.  All other Linux PTE bits control the behavior
 * of the MMU.
@@ -483,7 +483,7 @@ DataStoreTLBMiss:
 finish_DTLB:
 2: li  r21, 0x00f0
mtspr   DAR, r21/* Tag DAR */
-   rlwimi  r20, r21, 0, 24, 28 /* Set 24-27, clear 28 */
+   rlwimi  r20, r21, 0, 0x00f0 /* Set 24-27 */
DO_8xx_CPU6(0x3d80, r3)
mtspr   MD_RPN, r20 /* Update TLB entry */
 
diff --git a/include/asm-ppc/pgtable.h b/include/asm-ppc/pgtable.h
index b94e8a8..1a0ca7b 100644
--- a/include/asm-ppc/pgtable.h
+++ b/include/asm-ppc/pgtable.h
@@ -297,11 +297,11 @@ extern unsigned long vmalloc_start;
 #define _PAGE_PRESENT  0x0001  /* Page is valid */
 #define _PAGE_NO_CACHE 0x0002  /* I: cache inhibit */
 #define _PAGE_SHARED   0x0004  /* No ASID (context) compare */
+#define _PAGE_PSE  0x0008  /* Large Page, 8MB */
 
 /* These four software bits must be masked out when the entry is loaded
  * into the TLB, 1 SW bits left(0x0080).
  */
-#define _PAGE_EXEC 0x0008  /* software: i-cache coherency required */
 #define _PAGE_GUARDED  0x0010  /* software: guarded access */
 #define _PAGE_ACCESSED 0x0020  /* software: page referenced */
 #define _PAGE_WRITETHRU0x0040  /* software: caching is write through */
@@ -359,6 +359,10 @@ extern unsigned long vmalloc_start;
 #define _PAGE_EXEC 0
 #endif
 
+#ifndef _PAGE_PSE
+#define _PAGE_PSE  0
+#endif
+
 #define _PAGE_CHG_MASK (PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY)
 
 /*
-- 
1.7.3.4



[PATCH 2/3] 8xx: Support LARGE pages in TLB code.

2011-10-10 Thread Joakim Tjernlund

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/kernel/head_8xx.S |   22 +++---
 1 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/ppc/kernel/head_8xx.S b/arch/ppc/kernel/head_8xx.S
index 8e3fe40..439e7f2 100644
--- a/arch/ppc/kernel/head_8xx.S
+++ b/arch/ppc/kernel/head_8xx.S
@@ -368,15 +368,19 @@ InstructionTLBMiss:
 * for this segment.
 */
tophys(r21,r21)
-   ori r21,r21,1   /* Set valid bit */
-   DO_8xx_CPU6(0x2b80, r3)
-   mtspr   MI_TWC, r21 /* Set segment attributes */
beq-2f  /* If zero, don't try to find a pte */
DO_8xx_CPU6(0x3b80, r3)
mtspr   MD_TWC, r21 /* Load pte table base address */
-   mfspr   r21, MD_TWC /* and get the pte address */
-   lwz r20, 0(r21) /* Get the pte */
+   mfspr   r20, MD_TWC /* and get the pte address */
+   lwz r20, 0(r20) /* Get the pte */
+
+   ori r21, r21, MI_SVALID /* Set valid bit */
+   /* Copy PSE to PS bits(8MB) */
+   rlwimi  r21, r20, 0, _PAGE_PSE
+   rlwimi  r21, r20, 32-1, _PAGE_PSE>>1
 
+   DO_8xx_CPU6(0x2b80, r3)
+   mtspr   MI_TWC, r21 /* Set segment attributes */
 #ifndef NO_SWAP
/* if !swap, you can delete this */
andi.   r21, r20, _PAGE_ACCESSED/* test ACCESSED bit */
@@ -446,7 +450,9 @@ DataStoreTLBMiss:
 * this into the Linux pgd/pmd and load it in the operation
 * above.
 */
-   rlwimi  r21, r20, 0, _PAGE_GUARDED
+   rlwimi  r21, r20, 0, _PAGE_GUARDED | _PAGE_PSE
+   /* Copy PSE to PS bits(8MB), combine with GUARDED above */
+   rlwimi  r21, r20, 32-1, _PAGE_PSE>>1
/* Insert the WriteThru flag into the TWC from the Linux PTE.
 * It is bit 25 in the Linux PTE and bit 30 in the TWC
 */
@@ -589,7 +595,9 @@ DARFixed:
/* Insert the Guarded flag into the TWC from the Linux PTE.
 * It is bit 27 of both the Linux PTE and the TWC
 */
-   rlwimi  r21, r20, 0, _PAGE_GUARDED
+   rlwimi  r21, r20, 0, _PAGE_GUARDED | _PAGE_PSE
+   /* Copy PSE to PS bits(8MB), combine with GUARDED above */
+   rlwimi  r21, r20, 32-1, _PAGE_PSE>>1
/* Insert the WriteThru flag into the TWC from the Linux PTE.
 * It is bit 25 in the Linux PTE and bit 30 in the TWC
 */
-- 
1.7.3.4
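The PSE copy in this patch deposits _PAGE_PSE into the TWC twice, once in place and once shifted right by one. A hedged Python sketch, assuming the two adjacent PS bits read 0b11 to select 8 MB pages:

```python
# Sketch of the PSE copy: _PAGE_PSE (0x8) is deposited into the TWC
# both at its own position and shifted right by one, so that when PSE
# is set the two PS bits read 0b11, assumed to select 8 MB pages.
_PAGE_PSE = 0x8

def insert_ps(twc: int, pte: int) -> int:
    twc |= pte & _PAGE_PSE            # rlwimi r21, r20, 0, _PAGE_PSE
    twc |= (pte & _PAGE_PSE) >> 1     # rlwimi r21, r20, 32-1, _PAGE_PSE>>1
    return twc

assert insert_ps(0, _PAGE_PSE) == 0xC
assert insert_ps(0, 0) == 0
```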



[PATCH 3/3] 8xx: Use LARGE pages for kernel RAM.

2011-10-10 Thread Joakim Tjernlund
Use the new _PAGE_PSE to map all kernel RAM with 8 MB TLBs

Signed-off-by: Joakim Tjernlund joakim.tjernl...@transmode.se
---
 arch/ppc/mm/pgtable.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/ppc/mm/pgtable.c b/arch/ppc/mm/pgtable.c
index 866ae43..56e847e 100644
--- a/arch/ppc/mm/pgtable.c
+++ b/arch/ppc/mm/pgtable.c
@@ -298,7 +298,9 @@ void __init mapin_ram(void)
/* On the MPC8xx, we want the page shared so we
 * don't get ASID compares on kernel space.
 */
-   f = _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_SHARED | _PAGE_HWEXEC;
+   f = _PAGE_PSE | _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_SHARED | _PAGE_HWEXEC;
+   if (_PAGE_PSE)
+   f |= _PAGE_WRENABLE;
 #if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH)
/* Allows stub to set breakpoints everywhere */
f |= _PAGE_WRENABLE;
-- 
1.7.3.4



Re: [PATCH 00/14] Backport 8xx TLB to 2.4

2011-10-10 Thread Willy Tarreau
Hi Joakim,

On Mon, Oct 10, 2011 at 01:30:06PM +0200, Joakim Tjernlund wrote:
 This is a backport from 2.6 which I did to overcome 8xx CPU
 bugs. 8xx does not update the DAR register when taking a TLB
 error caused by dcbX and icbi insns which makes it very
 tricky to use these insns. Also the dcbst insn wrongly sets
 the store bit when faulting into DTLB error.
 A few more bugs were found during development.
 
 I know 2.4 is in strict maintenance mode and 8xx is obsolete
 but as it is still in use I wanted 8xx to age with grace.

Thank you. I must admit I was hoping those patches would come in
for a last release before the end of the year :-)

Unless there is any objection from anyone, I'll merge them when
kernel.org is back online.

Cheers,
Willy



Re: [PATCH 1/3] [powerpc32] Process dynamic relocations for kernel

2011-10-10 Thread Scott Wood
On 10/10/2011 04:55 AM, Suzuki K. Poulose wrote:
 The following patch implements the dynamic relocation processing for
 PPC32 kernel. relocate() accepts the target virtual address and relocates
  the kernel image to the same.

How much overhead is involved in a true relocatable kernel?  Is it worth
preserving the old relocatable booke behavior under a different name?

-Scott



Re: [PATCH 0/3] 8xx: Large page(8MB) support for 2.4

2011-10-10 Thread Willy Tarreau
Hi Dan,

On Mon, Oct 10, 2011 at 09:22:09AM -0700, Dan Malek wrote:
 
 Hi Joakim.
 
 On Oct 10, 2011, at 4:38 AM, Joakim Tjernlund wrote:
 
 This adds Large page support for 8xx and uses it
 for all kernel RAM
 
 - Dan, what do you think :)
 
 Since you asked, yes it looks great :-)  Now, can we
 get this into a more contemporary kernel?  I'm
 actually working on an 8xx project that may have
 a few years of life left.

At the pace of current 2.4, I'm sure the code won't have changed
much a few years from now :-) It would be nice to know by now if
the current longterm branches work OK or not though.

Cheers,
Willy



Re: [PATCH 0/3] 8xx: Large page(8MB) support for 2.4

2011-10-10 Thread Dan Malek


Hi Joakim.

On Oct 10, 2011, at 4:38 AM, Joakim Tjernlund wrote:


This adds Large page support for 8xx and uses it
for all kernel RAM



- Dan, what do you think :)


Since you asked, yes it looks great :-)  Now, can we
get this into a more contemporary kernel?  I'm
actually working on an 8xx project that may have
a few years of life left.

Thanks.

-- Dan



[PATCH] mlx4_en: fix endianness with blue frame support

2011-10-10 Thread Thadeu Lima de Souza Cascardo
The doorbell register was being unconditionally swapped. In x86, that
meant it was being swapped to BE and written to the descriptor and to
memory, depending on the case of blue frame support or writing to
doorbell register. On PPC, this meant it was being swapped to LE and
then swapped back to BE while writing to the register. But in the blue
frame case, it was being written as LE to the descriptor.

The fix is not to swap doorbell unconditionally, write it to the
register as BE and convert it to BE when writing it to the descriptor.

Signed-off-by: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
Reported-by: Richard Hendrickson richh...@us.ibm.com
Cc: Eli Cohen e...@dev.mellanox.co.il
Cc: Yevgeny Petrilin yevge...@mellanox.co.il
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 drivers/net/mlx4/en_tx.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx4/en_tx.c b/drivers/net/mlx4/en_tx.c
index 6e03de0..f76ab6b 100644
--- a/drivers/net/mlx4/en_tx.c
+++ b/drivers/net/mlx4/en_tx.c
@@ -172,7 +172,7 @@ int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
 	memset(ring->buf, 0, ring->buf_size);
 
 	ring->qp_state = MLX4_QP_STATE_RST;
-	ring->doorbell_qpn = swab32(ring->qp.qpn << 8);
+	ring->doorbell_qpn = ring->qp.qpn << 8;
 
 	mlx4_en_fill_qp_context(priv, ring->size, ring->stride, 1, 0, ring->qpn,
 				ring->cqn, &ring->context);
@@ -791,7 +791,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_orphan(skb);
 
 	if (ring->bf_enabled && desc_size <= MAX_BF && !bounce && !vlan_tag) {
-		*(u32 *) (tx_desc->ctrl.vlan_tag) |= ring->doorbell_qpn;
+		*(__be32 *) (tx_desc->ctrl.vlan_tag) |= cpu_to_be32(ring->doorbell_qpn);
 		op_own |= htonl((bf_index & 0xffff) << 8);
 		/* Ensure new descirptor hits memory
 		 * before setting ownership of this descriptor to HW */
@@ -812,7 +812,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		wmb();
 		tx_desc->ctrl.owner_opcode = op_own;
 		wmb();
-		writel(ring->doorbell_qpn, ring->bf.uar->map + MLX4_SEND_DOORBELL);
+		iowrite32be(ring->doorbell_qpn, ring->bf.uar->map + MLX4_SEND_DOORBELL);
 	}
 
 	/* Poll CQ here */
-- 
1.7.4.4



Re: [PATCH] mlx4_en: fix endianness with blue frame support

2011-10-10 Thread Thadeu Lima de Souza Cascardo
On Mon, Oct 10, 2011 at 01:42:23PM -0300, Thadeu Lima de Souza Cascardo wrote:
 The doorbell register was being unconditionally swapped. In x86, that
 meant it was being swapped to BE and written to the descriptor and to
 memory, depending on the case of blue frame support or writing to
 doorbell register. On PPC, this meant it was being swapped to LE and
 then swapped back to BE while writing to the register. But in the blue
 frame case, it was being written as LE to the descriptor.
 
 The fix is not to swap doorbell unconditionally, write it to the
 register as BE and convert it to BE when writing it to the descriptor.
 
 Signed-off-by: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
 Reported-by: Richard Hendrickson richh...@us.ibm.com
 Cc: Eli Cohen e...@dev.mellanox.co.il
 Cc: Yevgeny Petrilin yevge...@mellanox.co.il
 Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
 ---

So I tested this patch and it works for me. Thanks Ben and Eli for
finding out the problem with doorbell in the descriptor.

Regards,
Cascardo.

  drivers/net/mlx4/en_tx.c |6 +++---
  1 files changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/net/mlx4/en_tx.c b/drivers/net/mlx4/en_tx.c
 index 6e03de0..f76ab6b 100644
 --- a/drivers/net/mlx4/en_tx.c
 +++ b/drivers/net/mlx4/en_tx.c
 @@ -172,7 +172,7 @@ int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
  	memset(ring->buf, 0, ring->buf_size);
 
  	ring->qp_state = MLX4_QP_STATE_RST;
 -	ring->doorbell_qpn = swab32(ring->qp.qpn << 8);
 +	ring->doorbell_qpn = ring->qp.qpn << 8;
 
  	mlx4_en_fill_qp_context(priv, ring->size, ring->stride, 1, 0, ring->qpn,
  			ring->cqn, &ring->context);
 @@ -791,7 +791,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
  	skb_orphan(skb);
 
  	if (ring->bf_enabled && desc_size <= MAX_BF && !bounce && !vlan_tag) {
 -		*(u32 *) (tx_desc->ctrl.vlan_tag) |= ring->doorbell_qpn;
 +		*(__be32 *) (tx_desc->ctrl.vlan_tag) |= cpu_to_be32(ring->doorbell_qpn);
  		op_own |= htonl((bf_index & 0xffff) << 8);
  		/* Ensure new descirptor hits memory
  		 * before setting ownership of this descriptor to HW */
 @@ -812,7 +812,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
  	wmb();
  	tx_desc->ctrl.owner_opcode = op_own;
  	wmb();
 -	writel(ring->doorbell_qpn, ring->bf.uar->map + MLX4_SEND_DOORBELL);
 +	iowrite32be(ring->doorbell_qpn, ring->bf.uar->map + MLX4_SEND_DOORBELL);
  	}
 
  	/* Poll CQ here */
 -- 
 1.7.4.4
 



Re: [PATCH 0/3] 8xx: Large page(8MB) support for 2.4

2011-10-10 Thread Joakim Tjernlund
Dan Malek ppc6...@digitaldans.com wrote on 2011/10/10 18:22:09:


 Hi Joakim.

 On Oct 10, 2011, at 4:38 AM, Joakim Tjernlund wrote:

  This adds Large page support for 8xx and uses it
  for all kernel RAM

  - Dan, what do you think :)

 Since you asked, yes it looks great :-)  Now, can we
 get this into a more contemporary kernel?  I'm
 actually working on an 8xx project that may have
 a few years of life left.

That is an easy port but I will have to do that blind. Would you
mind taking this for a spin on 2.4 first?

The more interesting part is whether one should use other sizes (16K or
512K) of large pages too. Those should be useful for user space, but it
is a lot of work. I haven't checked what large page support for user
space there is in 2.6 for ppc, though.

 Jocke



Re: [PATCH 1/3] [powerpc32] Process dynamic relocations for kernel

2011-10-10 Thread Suzuki Poulose

On 10/10/11 20:45, Scott Wood wrote:

On 10/10/2011 04:55 AM, Suzuki K. Poulose wrote:

The following patch implements the dynamic relocation processing for
PPC32 kernel. relocate() accepts the target virtual address and relocates
  the kernel image to the same.


How much overhead is involved in a true relocatable kernel?  Is it worth
preserving the old relocatable booke behavior under a different name?


There are 75782 relocation entries in an ebony kernel with a minimal
config, so that's a pretty big number for small embedded chips. I guess
preserving the 'old relocatable' (page-aligned) approach would be a good
idea for the architectures which can afford it, e.g. places where the
TLB size is 64M or less.

Thanks
Suzuki


Re: [PATCH 1/3] [powerpc32] Process dynamic relocations for kernel

2011-10-10 Thread Scott Wood
On 10/10/2011 12:17 PM, Suzuki Poulose wrote:
 On 10/10/11 20:45, Scott Wood wrote:
 On 10/10/2011 04:55 AM, Suzuki K. Poulose wrote:
 The following patch implements the dynamic relocation processing for
 PPC32 kernel. relocate() accepts the target virtual address and
 relocates
   the kernel image to the same.

 How much overhead is involved in a true relocatable kernel?  Is it worth
 preserving the old relocatable booke behavior under a different name?
 
 There are '75782' on an ebony kernel with minimal config. So thats a
 pretty big
 number for small embedded chips. I guess, preserving the 'old
 relocatable' (page
 aligned approach) would be a good idea for the architectures which can
 afford it.
 e.g, places where TLB size is 64M or less.

The systems we've been using this option on aren't *that* small -- I was
thinking more about runtime overhead (beyond the time taken at boot to
process relocations).

-Scott



Re: [PATCH 2/3] [44x] Enable CONFIG_RELOCATABLE for PPC44x

2011-10-10 Thread Scott Wood
On 10/10/2011 04:56 AM, Suzuki K. Poulose wrote:
 #if defined(CONFIG_RELOCATABLE) && defined(CONFIG_44x)
 #define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) - PHYSICAL_START + (KERNELBASE + RELOC_OFFSET)))
 #define __pa(x) ((unsigned long)(x) + PHYSICAL_START - (KERNELBASE + RELOC_OFFSET))
 #endif

Why is this 44x-specific?

-Scott



Re: [PATCH 0/3] 8xx: Large page(8MB) support for 2.4

2011-10-10 Thread Dan Malek


On Oct 10, 2011, at 9:45 AM, Joakim Tjernlund wrote:


That is an easy port but I will have to do that blind. Would you
mind take this for a spin on 2.4 first?


My current system is running 2.6, so I don't have much
interest in testing 2.4.

The more interesting part is if one should use other sized(16K or  
512K) large pages too?


My thought long ago was most of the 8xx systems have rather small
real memories, so the larger pages, especially 512K may be too wasteful.
I've always been a fan of keeping the TLB handlers tiny and simple,
rather than spending the instructions doing complex replacements.
Remember, this also affects the I- and D-cache, so a more frequent
and trivial PTE update could very well gain larger system performance
than the management of larger pages with more complex code.
With all of the bug fix code in the handlers, maybe a larger page would
be better.

Those should be useful for user space but it is a lot of work. I  
haven't checked

what large page support for user space is in 2.6 for ppc though.


The 2.6/3.0 kernel supports different, but fixed, page sizes.  IIRC,
anything over 64K may require distribution rebuilding to realign
code/data sections to more restrictive boundaries.  Maybe a 16K page
would show some benefit.

I'll try to make some time to play with it.

Thanks.

-- Dan



Re: [PATCH] mlx4_en: fix endianness with blue frame support

2011-10-10 Thread David Miller
From: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
Date: Mon, 10 Oct 2011 13:46:54 -0300

 On Mon, Oct 10, 2011 at 01:42:23PM -0300, Thadeu Lima de Souza Cascardo wrote:
 The doorbell register was being unconditionally swapped. In x86, that
 meant it was being swapped to BE and written to the descriptor and to
 memory, depending on the case of blue frame support or writing to
 doorbell register. On PPC, this meant it was being swapped to LE and
 then swapped back to BE while writing to the register. But in the blue
 frame case, it was being written as LE to the descriptor.
 
 The fix is not to swap doorbell unconditionally, write it to the
 register as BE and convert it to BE when writing it to the descriptor.
 
 Signed-off-by: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
 Reported-by: Richard Hendrickson richh...@us.ibm.com
 Cc: Eli Cohen e...@dev.mellanox.co.il
 Cc: Yevgeny Petrilin yevge...@mellanox.co.il
 Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
 ---
 
 So I tested this patch and it works for me. Thanks Ben and Eli for
 finding out the problem with doorbell in the descriptor.

Applied, thanks everyone.


[PATCH 0/13] Hugetlb for 64-bit Freescale Book3E

2011-10-10 Thread Becky Bruce
This series of patches contains mostly cleanup code that allows
the enablement of hugetlb for 64-bit Freescale BookE processors.
There are also some bits that I dropped from the 32-bit release
that are added back, as they are needed by other implementations.
Otherwise, it's mostly a bunch of code rearrangement, changes
in #include protections, and Kconfig changes.

Cheers,
Becky

 arch/powerpc/configs/corenet32_smp_defconfig |9 +--
 arch/powerpc/configs/corenet64_smp_defconfig |6 +-
 arch/powerpc/configs/mpc85xx_defconfig   |6 +-
 arch/powerpc/configs/mpc85xx_smp_defconfig   |7 +-
 arch/powerpc/include/asm/hugetlb.h   |   36 ++--
 arch/powerpc/include/asm/page_64.h   |2 +
 arch/powerpc/kernel/setup_64.c   |   10 ++
 arch/powerpc/mm/hugetlbpage-book3e.c |   15 ++--
 arch/powerpc/mm/hugetlbpage.c|  116 --
 arch/powerpc/mm/tlb_low_64e.S|   36 -
 arch/powerpc/mm/tlb_nohash.c |2 +-
 arch/powerpc/platforms/Kconfig.cputype   |4 +-
 12 files changed, 143 insertions(+), 106 deletions(-)





[PATCH 01/13] powerpc: Only define HAVE_ARCH_HUGETLB_UNMAPPED_AREA if PPC_MM_SLICES

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

If we don't have slices, we should be able to use the generic
hugetlb_get_unmapped_area() code.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/include/asm/page_64.h |2 ++
 arch/powerpc/mm/hugetlbpage.c  |6 ++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index fb40ede..fed85e6 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -130,7 +130,9 @@ do {\
 
 #ifdef CONFIG_HUGETLB_PAGE
 
+#ifdef CONFIG_PPC_MM_SLICES
 #define HAVE_ARCH_HUGETLB_UNMAPPED_AREA
+#endif
 
 #endif /* !CONFIG_HUGETLB_PAGE */
 
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 48b65be..71c6533 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -686,19 +686,17 @@ int gup_hugepd(hugepd_t *hugepd, unsigned pdshift,
return 1;
 }
 
+#ifdef CONFIG_PPC_MM_SLICES
 unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
unsigned long len, unsigned long pgoff,
unsigned long flags)
 {
-#ifdef CONFIG_PPC_MM_SLICES
struct hstate *hstate = hstate_file(file);
int mmu_psize = shift_to_mmu_psize(huge_page_shift(hstate));
 
return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1, 0);
-#else
-   return get_unmapped_area(file, addr, len, pgoff, flags);
-#endif
 }
+#endif
 
 unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
 {
-- 
1.5.6.5



[PATCH 02/13] powerpc: hugetlb: fix huge_ptep_set_access_flags return value

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

There was an unconditional return of 1 in the original code
from David Gibson, and I dropped it because it wasn't needed
for FSL BOOKE 32-bit.  However, not all systems (including 64-bit
FSL BOOKE) do loading of the hpte from the fault handler asm
and depend on this function returning 1, which causes a call
to update_mmu_cache() that writes an entry into the tlb.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 arch/powerpc/include/asm/hugetlb.h |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 8600493..70f9885 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -124,7 +124,18 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 					     unsigned long addr, pte_t *ptep,
 					     pte_t pte, int dirty)
 {
+#if defined(CONFIG_PPC_MMU_NOHASH) && \
+	!(defined(CONFIG_PPC_FSL_BOOK3E) && defined(CONFIG_PPC32))
+   /*
+* The return 1 forces a call of update_mmu_cache, which will write a
+* TLB entry.  Without this, platforms that don't do a write of the TLB
+* entry in the TLB miss handler asm will fault ad infinitum.
+*/
+   ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+   return 1;
+#else
return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+#endif
 }
 
 static inline pte_t huge_ptep_get(pte_t *ptep)
-- 
1.5.6.5



[PATCH 03/13] powerpc: Fix booke hugetlb preload code for PPC_MM_SLICES and 64-bit

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

This patch does 2 things: It corrects the code that determines the
size to write into MAS1 for the PPC_MM_SLICES case (this originally
came from David Gibson and I had incorrectly altered it), and it
changes the methodolody used to calculate the size for !PPC_MM_SLICES
to work for 64-bit as well as 32-bit.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 arch/powerpc/mm/hugetlbpage-book3e.c |   15 ++-
 1 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage-book3e.c b/arch/powerpc/mm/hugetlbpage-book3e.c
index 343ad0b..4d6d849 100644
--- a/arch/powerpc/mm/hugetlbpage-book3e.c
+++ b/arch/powerpc/mm/hugetlbpage-book3e.c
@@ -45,23 +45,20 @@ void book3e_hugetlb_preload(struct mm_struct *mm, unsigned long ea, pte_t pte)
unsigned long flags;
 
 #ifdef CONFIG_PPC_FSL_BOOK3E
-   int index, lz, ncams;
-   struct vm_area_struct *vma;
+   int index, ncams;
 #endif
 
if (unlikely(is_kernel_addr(ea)))
return;
 
 #ifdef CONFIG_PPC_MM_SLICES
-   psize = mmu_get_tsize(get_slice_psize(mm, ea));
-   tsize = mmu_get_psize(psize);
+   psize = get_slice_psize(mm, ea);
+   tsize = mmu_get_tsize(psize);
shift = mmu_psize_defs[psize].shift;
 #else
-   vma = find_vma(mm, ea);
-   psize = vma_mmu_pagesize(vma);  /* returns actual size in bytes */
-	asm (PPC_CNTLZL "%0,%1" : "=r" (lz) : "r" (psize));
-   shift = 31 - lz;
-   tsize = 21 - lz;
+   psize = vma_mmu_pagesize(find_vma(mm, ea));
+   shift = __ilog2(psize);
+   tsize = shift - 10;
 #endif
 
/*
-- 
1.5.6.5



[PATCH 04/13] powerpc: Update hugetlb huge_pte_alloc and tablewalk code for FSL BOOKE

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

This updates the hugetlb page table code to handle 64-bit FSL_BOOKE.
The previous 32-bit work counted on the inner levels of the page table
collapsing.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/mm/hugetlbpage.c |   48 +++-
 1 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 71c6533..b4a4884 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -155,11 +155,28 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 		hpdp->pd = 0;
 		kmem_cache_free(cachep, new);
 	}
+#else
+	if (!hugepd_none(*hpdp))
+		kmem_cache_free(cachep, new);
+	else
+		hpdp->pd = ((unsigned long)new & ~PD_HUGE) | pshift;
 #endif
 	spin_unlock(&mm->page_table_lock);
 	return 0;
 }
 
+/*
+ * These macros define how to determine which level of the page table holds
+ * the hpdp.
+ */
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define HUGEPD_PGD_SHIFT PGDIR_SHIFT
+#define HUGEPD_PUD_SHIFT PUD_SHIFT
+#else
+#define HUGEPD_PGD_SHIFT PUD_SHIFT
+#define HUGEPD_PUD_SHIFT PMD_SHIFT
+#endif
+
 pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
 {
 	pgd_t *pg;
@@ -172,12 +189,13 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 	addr &= ~(sz-1);
 
 	pg = pgd_offset(mm, addr);
-	if (pshift >= PUD_SHIFT) {
+
+	if (pshift >= HUGEPD_PGD_SHIFT) {
 		hpdp = (hugepd_t *)pg;
 	} else {
 		pdshift = PUD_SHIFT;
 		pu = pud_alloc(mm, pg, addr);
-		if (pshift >= PMD_SHIFT) {
+		if (pshift >= HUGEPD_PUD_SHIFT) {
 			hpdp = (hugepd_t *)pu;
 		} else {
 			pdshift = PMD_SHIFT;
@@ -453,14 +471,23 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
 	unsigned long start;
 
 	start = addr;
-	pmd = pmd_offset(pud, addr);
 	do {
+		pmd = pmd_offset(pud, addr);
 		next = pmd_addr_end(addr, end);
 		if (pmd_none(*pmd))
 			continue;
+#ifdef CONFIG_PPC_FSL_BOOK3E
+		/*
+		 * Increment next by the size of the huge mapping since
+		 * there may be more than one entry at this level for a
+		 * single hugepage, but all of them point to
+		 * the same kmem cache that holds the hugepte.
+		 */
+		next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
+#endif
 		free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
 				  addr, next, floor, ceiling);
-	} while (pmd++, addr = next, addr != end);
+	} while (addr = next, addr != end);
 
 	start &= PUD_MASK;
 	if (start < floor)
@@ -487,8 +514,8 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
 	unsigned long start;
 
 	start = addr;
-	pud = pud_offset(pgd, addr);
 	do {
+		pud = pud_offset(pgd, addr);
 		next = pud_addr_end(addr, end);
 		if (!is_hugepd(pud)) {
 			if (pud_none_or_clear_bad(pud))
@@ -496,10 +523,19 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
 			hugetlb_free_pmd_range(tlb, pud, addr, next, floor,
 					       ceiling);
 		} else {
+#ifdef CONFIG_PPC_FSL_BOOK3E
+			/*
+			 * Increment next by the size of the huge mapping since
+			 * there may be more than one entry at this level for a
+			 * single hugepage, but all of them point to
+			 * the same kmem cache that holds the hugepte.
+			 */
+			next = addr + (1 << hugepd_shift(*(hugepd_t *)pud));
+#endif
 			free_hugepd_range(tlb, (hugepd_t *)pud, PUD_SHIFT,
 					  addr, next, floor, ceiling);
 		}
-	} while (pud++, addr = next, addr != end);
+	} while (addr = next, addr != end);
 
 	start &= PGDIR_MASK;
 	if (start < floor)
-- 
1.5.6.5



[PATCH 05/13] powerpc: hugetlb: modify include usage for FSL BookE code

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

The original 32-bit hugetlb implementation used PPC64 vs PPC32 to
determine which code path to take.  However, the final hugetlb
implementation for 64-bit FSL ended up shared with the FSL
32-bit code so the actual check needs to be FSL_BOOK3E vs
everything else.  This patch changes the include protections to
reflect this.

There are also a couple of related comment fixes.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/include/asm/hugetlb.h |6 ++--
 arch/powerpc/mm/hugetlbpage.c  |   54 ---
 arch/powerpc/mm/tlb_nohash.c   |2 +-
 3 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 70f9885..273acfa 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -22,14 +22,14 @@ static inline pte_t *hugepte_offset(hugepd_t *hpdp, unsigned long addr,
 				    unsigned pdshift)
 {
 	/*
-	 * On 32-bit, we have multiple higher-level table entries that point to
-	 * the same hugepte.  Just use the first one since they're all
+	 * On FSL BookE, we have multiple higher-level table entries that
+	 * point to the same hugepte.  Just use the first one since they're all
 	 * identical.  So for that case, idx=0.
 	 */
 	unsigned long idx = 0;
 
 	pte_t *dir = hugepd_page(*hpdp);
-#ifdef CONFIG_PPC64
+#ifndef CONFIG_PPC_FSL_BOOK3E
 	idx = (addr & ((1UL << pdshift) - 1)) >> hugepd_shift(*hpdp);
 #endif
 
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index b4a4884..9a34606 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -33,17 +33,17 @@ unsigned int HPAGE_SHIFT;
  * implementations may have more than one gpage size due to limitations
  * of the memory allocators, so we need multiple arrays
  */
-#ifdef CONFIG_PPC64
-#define MAX_NUMBER_GPAGES  1024
-static u64 gpage_freearray[MAX_NUMBER_GPAGES];
-static unsigned nr_gpages;
-#else
+#ifdef CONFIG_PPC_FSL_BOOK3E
 #define MAX_NUMBER_GPAGES  128
 struct psize_gpages {
u64 gpage_list[MAX_NUMBER_GPAGES];
unsigned int nr_gpages;
 };
 static struct psize_gpages gpage_freearray[MMU_PAGE_COUNT];
+#else
+#define MAX_NUMBER_GPAGES  1024
+static u64 gpage_freearray[MAX_NUMBER_GPAGES];
+static unsigned nr_gpages;
 #endif
 
 static inline int shift_to_mmu_psize(unsigned int shift)
@@ -114,12 +114,12 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 	struct kmem_cache *cachep;
 	pte_t *new;
 
-#ifdef CONFIG_PPC64
-	cachep = PGT_CACHE(pdshift - pshift);
-#else
+#ifdef CONFIG_PPC_FSL_BOOK3E
 	int i;
 	int num_hugepd = 1 << (pshift - pdshift);
 	cachep = hugepte_cache;
+#else
+	cachep = PGT_CACHE(pdshift - pshift);
 #endif
 
new = kmem_cache_zalloc(cachep, GFP_KERNEL|__GFP_REPEAT);
@@ -131,12 +131,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 		return -ENOMEM;
 
 	spin_lock(&mm->page_table_lock);
-#ifdef CONFIG_PPC64
-	if (!hugepd_none(*hpdp))
-		kmem_cache_free(cachep, new);
-	else
-		hpdp->pd = ((unsigned long)new & ~PD_HUGE) | pshift;
-#else
+#ifdef CONFIG_PPC_FSL_BOOK3E
/*
 * We have multiple higher-level entries that point to the same
 * actual pte location.  Fill in each as we go and backtrack on error.
@@ -215,7 +210,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 	return hugepte_offset(hpdp, addr, pdshift);
 }
 }
 
-#ifdef CONFIG_PPC32
+#ifdef CONFIG_PPC_FSL_BOOK3E
 /* Build list of addresses of gigantic pages.  This function is used in early
  * boot before the buddy or bootmem allocator is setup.
  */
@@ -335,7 +330,7 @@ void __init reserve_hugetlb_gpages(void)
}
 }
 
-#else /* PPC64 */
+#else /* !PPC_FSL_BOOK3E */
 
 /* Build list of addresses of gigantic pages.  This function is used in early
  * boot before the buddy or bootmem allocator is setup.
@@ -373,7 +368,7 @@ int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
return 0;
 }
 
-#ifdef CONFIG_PPC32
+#ifdef CONFIG_PPC_FSL_BOOK3E
 #define HUGEPD_FREELIST_SIZE \
((PAGE_SIZE - sizeof(struct hugepd_freelist)) / sizeof(pte_t))
 
@@ -433,11 +428,11 @@ static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshif
 	unsigned long pdmask = ~((1UL << pdshift) - 1);
 	unsigned int num_hugepd = 1;
 
-#ifdef CONFIG_PPC64
-	unsigned int shift = hugepd_shift(*hpdp);
-#else
-	/* Note: On 32-bit the hpdp may be the first of several */
+#ifdef CONFIG_PPC_FSL_BOOK3E
+	/* Note: On fsl the hpdp may be the first of several */
 	num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
+#else
+   unsigned int shift = 

[PATCH 06/13] powerpc: Whitespace/comment changes to tlb_low_64e.S

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

I happened to comment this code while I was digging through it;
we might as well commit that.  I also made some whitespace
changes - the existing code had a lot of unnecessary newlines
that I found annoying when I was working on my tiny laptop.

No functional changes.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/mm/tlb_low_64e.S |   28 +++-
 1 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index dc4a5f3..71d5d9a 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -94,11 +94,11 @@
 
 	srdi	r15,r16,60		/* get region */
 	rldicl.	r10,r16,64-PGTABLE_EADDR_SIZE,PGTABLE_EADDR_SIZE+4
-	bne-	dtlb_miss_fault_bolted
+	bne-	dtlb_miss_fault_bolted	/* Bail if fault addr is invalid */
 
 	rlwinm	r10,r11,32-19,27,27
 	rlwimi	r10,r11,32-16,19,19
-	cmpwi	r15,0
+	cmpwi	r15,0			/* user vs kernel check */
 	ori	r10,r10,_PAGE_PRESENT
 	oris	r11,r10,_PAGE_ACCESSED@h
 
@@ -120,44 +120,38 @@ tlb_miss_common_bolted:
rldicl  r15,r16,64-PGDIR_SHIFT+3,64-PGD_INDEX_SIZE-3
cmpldi  cr0,r14,0
clrrdi  r15,r15,3
-   beq tlb_miss_fault_bolted
+   beq tlb_miss_fault_bolted   /* No PGDIR, bail */
 
 BEGIN_MMU_FTR_SECTION
/* Set the TLB reservation and search for existing entry. Then load
 * the entry.
 */
PPC_TLBSRX_DOT(0,r16)
-   ldx r14,r14,r15
-   beq normal_tlb_miss_done
+   ldx r14,r14,r15 /* grab pgd entry */
+   beq normal_tlb_miss_done/* tlb exists already, bail */
 MMU_FTR_SECTION_ELSE
-   ldx r14,r14,r15
+   ldx r14,r14,r15 /* grab pgd entry */
 ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
 
 #ifndef CONFIG_PPC_64K_PAGES
rldicl  r15,r16,64-PUD_SHIFT+3,64-PUD_INDEX_SIZE-3
clrrdi  r15,r15,3
-
-   cmpldi  cr0,r14,0
-   beq tlb_miss_fault_bolted
-
-   ldx r14,r14,r15
+   cmlpdi  cr0,r14,0
+   beq tlb_miss_fault_bolted   /* Bad pgd entry */
+   ldx r14,r14,r15 /* grab pud entry */
 #endif /* CONFIG_PPC_64K_PAGES */
 
rldicl  r15,r16,64-PMD_SHIFT+3,64-PMD_INDEX_SIZE-3
clrrdi  r15,r15,3
-
cmpldi  cr0,r14,0
beq tlb_miss_fault_bolted
-
-   ldx r14,r14,r15
+   ldx r14,r14,r15 /* Grab pmd entry */
 
rldicl  r15,r16,64-PAGE_SHIFT+3,64-PTE_INDEX_SIZE-3
clrrdi  r15,r15,3
-
cmpldi  cr0,r14,0
beq tlb_miss_fault_bolted
-
-   ldx r14,r14,r15
+   ldx r14,r14,r15 /* Grab PTE */
 
/* Check if required permissions are met */
andc.   r15,r11,r14
-- 
1.5.6.5



[PATCH 07/13] powerpc: Add hugepage support to 64-bit tablewalk code for FSL_BOOK3E

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

Before hugetlb, at each level of the table, we test for
!0 to determine if we have a valid table entry.  With hugetlb, this
compare becomes:
< 0 is a normal entry
  0 is an invalid entry
> 0 is huge

This works because the hugepage code pulls the top bit off the entry
(which for non-huge entries always has the top bit set) as an
indicator that we have a hugepage.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/mm/tlb_low_64e.S |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index 71d5d9a..ff672bd 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -136,22 +136,22 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
 #ifndef CONFIG_PPC_64K_PAGES
rldicl  r15,r16,64-PUD_SHIFT+3,64-PUD_INDEX_SIZE-3
clrrdi  r15,r15,3
-   cmlpdi  cr0,r14,0
-   beq tlb_miss_fault_bolted   /* Bad pgd entry */
+   cmpdi   cr0,r14,0
+   bge tlb_miss_fault_bolted   /* Bad pgd entry or hugepage; bail */
ldx r14,r14,r15 /* grab pud entry */
 #endif /* CONFIG_PPC_64K_PAGES */
 
rldicl  r15,r16,64-PMD_SHIFT+3,64-PMD_INDEX_SIZE-3
clrrdi  r15,r15,3
-   cmpldi  cr0,r14,0
-   beq tlb_miss_fault_bolted
+   cmpdi   cr0,r14,0
+   bge tlb_miss_fault_bolted
ldx r14,r14,r15 /* Grab pmd entry */
 
rldicl  r15,r16,64-PAGE_SHIFT+3,64-PTE_INDEX_SIZE-3
clrrdi  r15,r15,3
-   cmpldi  cr0,r14,0
-   beq tlb_miss_fault_bolted
-   ldx r14,r14,r15 /* Grab PTE */
+   cmpdi   cr0,r14,0
+   bge tlb_miss_fault_bolted
+   ldx r14,r14,r15 /* Grab PTE, normal (!huge) page */
 
/* Check if required permissions are met */
andc.   r15,r11,r14
-- 
1.5.6.5



[PATCH 08/13] powerpc: Add gpages reservation code for 64-bit FSL BOOKE

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

For 64-bit FSL_BOOKE implementations, gigantic pages need to be
reserved at boot time by the memblock code based on the command line.
This adds the call that handles the reservation, and fixes some code
comments.

It also removes the pr_err previously issued when reserve_hugetlb_gpages
was called on a system without hugetlb enabled: the way the code is
structured, the call is unconditional, so the resulting error message
was spurious and confusing.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/include/asm/hugetlb.h |   19 ++-
 arch/powerpc/kernel/setup_64.c |   10 ++
 arch/powerpc/mm/hugetlbpage.c  |8 
 3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 273acfa..555044c 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -5,7 +5,6 @@
 #include <asm/page.h>
 
 extern struct kmem_cache *hugepte_cache;
-extern void __init reserve_hugetlb_gpages(void);
 
 static inline pte_t *hugepd_page(hugepd_t hpd)
 {
@@ -153,14 +152,24 @@ static inline void arch_release_hugepage(struct page *page)
 }
 
 #else /* ! CONFIG_HUGETLB_PAGE */
-static inline void reserve_hugetlb_gpages(void)
-{
-   pr_err("Cannot reserve gpages without hugetlb enabled\n");
-}
 static inline void flush_hugetlb_page(struct vm_area_struct *vma,
  unsigned long vmaddr)
 {
 }
+#endif /* CONFIG_HUGETLB_PAGE */
+
+
+/*
+ * FSL Book3E platforms require special gpage handling - the gpages
+ * are reserved early in the boot process by memblock instead of via
+ * the .dts as on IBM platforms.
+ */
+#if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_PPC_FSL_BOOK3E)
+extern void __init reserve_hugetlb_gpages(void);
+#else
+static inline void reserve_hugetlb_gpages(void)
+{
+}
 #endif
 
 #endif /* _ASM_POWERPC_HUGETLB_H */
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index d4168c9..2e334d4 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -35,6 +35,8 @@
 #include <linux/pci.h>
 #include <linux/lockdep.h>
 #include <linux/memblock.h>
+#include <linux/hugetlb.h>
+
 #include <asm/io.h>
 #include <asm/kdump.h>
 #include <asm/prom.h>
@@ -64,6 +66,7 @@
 #include <asm/mmu_context.h>
 #include <asm/code-patching.h>
 #include <asm/kvm_ppc.h>
+#include <asm/hugetlb.h>
 
 #include "setup.h"
 
@@ -217,6 +220,13 @@ void __init early_setup(unsigned long dt_ptr)
/* Initialize the hash table or TLB handling */
early_init_mmu();
 
+   /*
+* Reserve any gigantic pages requested on the command line.
+* memblock needs to have been initialized by the time this is
+* called since this will reserve memory.
+*/
+   reserve_hugetlb_gpages();
+
	DBG(" <- early_setup()\n");
 }
 
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 9a34606..51855a0 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -28,10 +28,10 @@ unsigned int HPAGE_SHIFT;
 
 /*
  * Tracks gpages after the device tree is scanned and before the
- * huge_boot_pages list is ready.  On 64-bit implementations, this is
- * just used to track 16G pages and so is a single array.  32-bit
- * implementations may have more than one gpage size due to limitations
- * of the memory allocators, so we need multiple arrays
+ * huge_boot_pages list is ready.  On non-Freescale implementations, this is
+ * just used to track 16G pages and so is a single array.  FSL-based
+ * implementations may have more than one gpage size, so we need multiple
+ * arrays
  */
 #ifdef CONFIG_PPC_FSL_BOOK3E
 #define MAX_NUMBER_GPAGES  128
-- 
1.5.6.5



[PATCH 09/13] powerpc: Kconfig updates for FSL BookE HUGETLB 64-bit

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

Allow hugetlb to be enabled on 64-bit FSL_BOOK3E.  No platforms enable
it by default yet.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/platforms/Kconfig.cputype |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index a85990c..7e47fd4 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -174,7 +174,6 @@ config BOOKE
 config FSL_BOOKE
bool
	depends on (E200 || E500) && PPC32
-   select SYS_SUPPORTS_HUGETLBFS if PHYS_64BIT
default y
 
 # this is for common code between PPC32 & PPC64 FSL BOOKE
@@ -182,6 +181,7 @@ config PPC_FSL_BOOK3E
bool
select FSL_EMB_PERFMON
select PPC_SMP_MUXED_IPI
+   select SYS_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
default y if FSL_BOOKE
 
 config PTE_64BIT
@@ -298,7 +298,7 @@ config PPC_BOOK3E_MMU
 
 config PPC_MM_SLICES
bool
-   default y if (PPC64 && HUGETLB_PAGE) || (PPC_STD_MMU_64 && PPC_64K_PAGES)
+   default y if (!PPC_FSL_BOOK3E && PPC64 && HUGETLB_PAGE) || (PPC_STD_MMU_64 && PPC_64K_PAGES)
default n
 
 config VIRT_CPU_ACCOUNTING
-- 
1.5.6.5



[PATCH 10/13] powerpc: Update mpc85xx/corenet 32-bit defconfigs

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

Results from updates via make savedefconfig.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/configs/corenet32_smp_defconfig |8 
 arch/powerpc/configs/mpc85xx_defconfig   |5 +
 arch/powerpc/configs/mpc85xx_smp_defconfig   |6 +-
 3 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/configs/corenet32_smp_defconfig b/arch/powerpc/configs/corenet32_smp_defconfig
index 4311d02..ab4db40 100644
--- a/arch/powerpc/configs/corenet32_smp_defconfig
+++ b/arch/powerpc/configs/corenet32_smp_defconfig
@@ -12,9 +12,7 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=14
 CONFIG_BLK_DEV_INITRD=y
-# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
 CONFIG_KALLSYMS_ALL=y
-CONFIG_KALLSYMS_EXTRA_PASS=y
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y
 CONFIG_SLAB=y
@@ -69,7 +67,6 @@ CONFIG_IPV6=y
 CONFIG_IP_SCTP=m
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_MTD=y
-CONFIG_MTD_PARTITIONS=y
 CONFIG_MTD_CMDLINE_PARTS=y
 CONFIG_MTD_CHAR=y
 CONFIG_MTD_BLOCK=y
@@ -107,7 +104,6 @@ CONFIG_FSL_PQ_MDIO=y
 # CONFIG_INPUT_MOUSE is not set
 CONFIG_SERIO_LIBPS2=y
 # CONFIG_LEGACY_PTYS is not set
-CONFIG_PPC_EPAPR_HV_BYTECHAN=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_EXTENDED=y
@@ -136,8 +132,6 @@ CONFIG_USB_OHCI_HCD_PPC_OF_LE=y
 CONFIG_USB_STORAGE=y
 CONFIG_MMC=y
 CONFIG_MMC_SDHCI=y
-CONFIG_MMC_SDHCI_OF=y
-CONFIG_MMC_SDHCI_OF_ESDHC=y
 CONFIG_EDAC=y
 CONFIG_EDAC_MM_EDAC=y
 CONFIG_EDAC_MPC85XX=y
@@ -146,7 +140,6 @@ CONFIG_RTC_DRV_DS3232=y
 CONFIG_RTC_DRV_CMOS=y
 CONFIG_UIO=y
 CONFIG_STAGING=y
-# CONFIG_STAGING_EXCLUDE_BUILD is not set
 CONFIG_VIRT_DRIVERS=y
 CONFIG_FSL_HV_MANAGER=y
 CONFIG_EXT2_FS=y
@@ -173,7 +166,6 @@ CONFIG_MAC_PARTITION=y
 CONFIG_NLS_ISO8859_1=y
 CONFIG_NLS_UTF8=m
 CONFIG_MAGIC_SYSRQ=y
-CONFIG_DEBUG_KERNEL=y
 CONFIG_DEBUG_SHIRQ=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_DEBUG_INFO=y
diff --git a/arch/powerpc/configs/mpc85xx_defconfig b/arch/powerpc/configs/mpc85xx_defconfig
index 2500912..a1e5a17 100644
--- a/arch/powerpc/configs/mpc85xx_defconfig
+++ b/arch/powerpc/configs/mpc85xx_defconfig
@@ -10,10 +10,8 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=14
 CONFIG_BLK_DEV_INITRD=y
-# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
 CONFIG_EXPERT=y
 CONFIG_KALLSYMS_ALL=y
-CONFIG_KALLSYMS_EXTRA_PASS=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODULE_FORCE_UNLOAD=y
@@ -41,7 +39,6 @@ CONFIG_TQM8560=y
 CONFIG_SBC8548=y
 CONFIG_QUICC_ENGINE=y
 CONFIG_QE_GPIO=y
-CONFIG_GPIO_MPC8XXX=y
 CONFIG_HIGHMEM=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
@@ -123,6 +120,7 @@ CONFIG_NVRAM=y
 CONFIG_I2C=y
 CONFIG_I2C_CPM=m
 CONFIG_I2C_MPC=y
+CONFIG_GPIO_MPC8XXX=y
 # CONFIG_HWMON is not set
 CONFIG_VIDEO_OUTPUT_CONTROL=y
 CONFIG_FB=y
@@ -206,7 +204,6 @@ CONFIG_PARTITION_ADVANCED=y
 CONFIG_MAC_PARTITION=y
 CONFIG_CRC_T10DIF=y
 CONFIG_DEBUG_FS=y
-CONFIG_DEBUG_KERNEL=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_DEBUG_INFO=y
 CONFIG_SYSCTL_SYSCALL_CHECK=y
diff --git a/arch/powerpc/configs/mpc85xx_smp_defconfig b/arch/powerpc/configs/mpc85xx_smp_defconfig
index a4ba13b..dd1e413 100644
--- a/arch/powerpc/configs/mpc85xx_smp_defconfig
+++ b/arch/powerpc/configs/mpc85xx_smp_defconfig
@@ -12,10 +12,8 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=14
 CONFIG_BLK_DEV_INITRD=y
-# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
 CONFIG_EXPERT=y
 CONFIG_KALLSYMS_ALL=y
-CONFIG_KALLSYMS_EXTRA_PASS=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODULE_FORCE_UNLOAD=y
@@ -42,7 +40,6 @@ CONFIG_TQM8560=y
 CONFIG_SBC8548=y
 CONFIG_QUICC_ENGINE=y
 CONFIG_QE_GPIO=y
-CONFIG_GPIO_MPC8XXX=y
 CONFIG_HIGHMEM=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
@@ -124,6 +121,7 @@ CONFIG_NVRAM=y
 CONFIG_I2C=y
 CONFIG_I2C_CPM=m
 CONFIG_I2C_MPC=y
+CONFIG_GPIO_MPC8XXX=y
 # CONFIG_HWMON is not set
 CONFIG_VIDEO_OUTPUT_CONTROL=y
 CONFIG_FB=y
@@ -207,10 +205,8 @@ CONFIG_PARTITION_ADVANCED=y
 CONFIG_MAC_PARTITION=y
 CONFIG_CRC_T10DIF=y
 CONFIG_DEBUG_FS=y
-CONFIG_DEBUG_KERNEL=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_DEBUG_INFO=y
-# CONFIG_RCU_CPU_STALL_DETECTOR is not set
 CONFIG_SYSCTL_SYSCALL_CHECK=y
 CONFIG_VIRQ_DEBUG=y
 CONFIG_CRYPTO_PCBC=m
-- 
1.5.6.5



[PATCH 11/13] powerpc: Enable Hugetlb by default for 32-bit 85xx/corenet

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/configs/corenet32_smp_defconfig |1 +
 arch/powerpc/configs/mpc85xx_defconfig   |1 +
 arch/powerpc/configs/mpc85xx_smp_defconfig   |1 +
 3 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/configs/corenet32_smp_defconfig b/arch/powerpc/configs/corenet32_smp_defconfig
index ab4db40..1c328da 100644
--- a/arch/powerpc/configs/corenet32_smp_defconfig
+++ b/arch/powerpc/configs/corenet32_smp_defconfig
@@ -154,6 +154,7 @@ CONFIG_VFAT_FS=y
 CONFIG_NTFS_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_HUGETLBFS=y
 CONFIG_JFFS2_FS=y
 CONFIG_CRAMFS=y
 CONFIG_NFS_FS=y
diff --git a/arch/powerpc/configs/mpc85xx_defconfig b/arch/powerpc/configs/mpc85xx_defconfig
index a1e5a17..542eaa1 100644
--- a/arch/powerpc/configs/mpc85xx_defconfig
+++ b/arch/powerpc/configs/mpc85xx_defconfig
@@ -182,6 +182,7 @@ CONFIG_VFAT_FS=y
 CONFIG_NTFS_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_HUGETLBFS=y
 CONFIG_ADFS_FS=m
 CONFIG_AFFS_FS=m
 CONFIG_HFS_FS=m
diff --git a/arch/powerpc/configs/mpc85xx_smp_defconfig b/arch/powerpc/configs/mpc85xx_smp_defconfig
index dd1e413..c0a9574 100644
--- a/arch/powerpc/configs/mpc85xx_smp_defconfig
+++ b/arch/powerpc/configs/mpc85xx_smp_defconfig
@@ -183,6 +183,7 @@ CONFIG_VFAT_FS=y
 CONFIG_NTFS_FS=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_HUGETLBFS=y
 CONFIG_ADFS_FS=m
 CONFIG_AFFS_FS=m
 CONFIG_HFS_FS=m
-- 
1.5.6.5



[PATCH 12/13] powerpc: Update corenet64_smp_defconfig

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

Updates from make savedefconfig.

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/configs/corenet64_smp_defconfig |5 -
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/configs/corenet64_smp_defconfig b/arch/powerpc/configs/corenet64_smp_defconfig
index c92c204..782822c 100644
--- a/arch/powerpc/configs/corenet64_smp_defconfig
+++ b/arch/powerpc/configs/corenet64_smp_defconfig
@@ -11,10 +11,8 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=14
 CONFIG_BLK_DEV_INITRD=y
-# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
 CONFIG_EXPERT=y
 CONFIG_KALLSYMS_ALL=y
-CONFIG_KALLSYMS_EXTRA_PASS=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODULE_FORCE_UNLOAD=y
@@ -25,7 +23,6 @@ CONFIG_P5020_DS=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_BINFMT_MISC=m
-# CONFIG_PCI is not set
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
@@ -93,10 +90,8 @@ CONFIG_CRC_T10DIF=y
 CONFIG_CRC_ITU_T=m
 CONFIG_FRAME_WARN=1024
 CONFIG_DEBUG_FS=y
-CONFIG_DEBUG_KERNEL=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_DEBUG_INFO=y
-# CONFIG_RCU_CPU_STALL_DETECTOR is not set
 CONFIG_SYSCTL_SYSCALL_CHECK=y
 CONFIG_VIRQ_DEBUG=y
 CONFIG_CRYPTO_PCBC=m
-- 
1.5.6.5



[PATCH 13/13] powerpc: Enable hugetlb by default for corenet64 platforms

2011-10-10 Thread Becky Bruce
From: Becky Bruce bec...@kernel.crashing.org

Signed-off-by: Becky Bruce bec...@kernel.crashing.org
---
 arch/powerpc/configs/corenet64_smp_defconfig |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/configs/corenet64_smp_defconfig b/arch/powerpc/configs/corenet64_smp_defconfig
index 782822c..53741f4 100644
--- a/arch/powerpc/configs/corenet64_smp_defconfig
+++ b/arch/powerpc/configs/corenet64_smp_defconfig
@@ -81,6 +81,7 @@ CONFIG_EXT3_FS=y
 # CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
+CONFIG_HUGETLBFS=y
 # CONFIG_MISC_FILESYSTEMS is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_MAC_PARTITION=y
-- 
1.5.6.5
