Re: [PATCH] ndfc driver

2008-12-09 Thread Mitch Bradley
One address/size cell isn't enough for the next generation of NAND FLASH 
chips.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: Add SMP support to no-hash TLB handling v3

2008-12-09 Thread Kumar Gala
+void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)

+{
+   unsigned int pid;
+
+   preempt_disable();
+   pid = vma ? vma->vm_mm->context.id : 0;
+   if (pid != MMU_NO_CONTEXT)
+   _tlbil_va(vmaddr, pid);
+   preempt_enable();
+}
+EXPORT_SYMBOL(local_flush_tlb_page);


We are using this in highmem.h for kmap_atomic, so you need to fix
that call site.


- k



Re: [PATCH] ndfc driver

2008-12-09 Thread Josh Boyer
On Tue, 9 Dec 2008 07:10:27 +0100
Stefan Roese [EMAIL PROTECTED] wrote:

 On Tuesday 09 December 2008, Sean MacLennan wrote:
  On Thu, 4 Dec 2008 09:01:07 -0500
 
  Josh Boyer [EMAIL PROTECTED] wrote:
   In addition to an example DTS patch (probably to warp itself), could
   you briefly write up a binding and put it in
   Documentation/powerpc/dts-bindings/amcc (or similar)?  Also please CC
   the devicetree-discuss list on that part.
 
  Here is a start at the doc. I have sent it as a patch, but could just
  as easily send raw text.
 
 The example comes from the warp dts, just with fewer partitions, so I
  have not included a warp dts patch here.
 
  Cheers,
 Sean
 
  diff --git a/Documentation/powerpc/dts-bindings/amcc/ndfc.txt
  b/Documentation/powerpc/dts-bindings/amcc/ndfc.txt new file mode 100644
  index 000..668f4a9
  --- /dev/null
  +++ b/Documentation/powerpc/dts-bindings/amcc/ndfc.txt
  @@ -0,0 +1,31 @@
  +AMCC NDFC (NanD Flash Controller)
  +
  +Required properties:
  +- compatible : "amcc,ndfc".
 
 The 4xx NAND controller was first implemented on the 440EP, IIRC. So I'm 
 pretty sure that this controller is an IBM core and not an AMCC core. So this 
 should be "ibm,ndfc".

That is true.  It's an IBM blue logic core.
 
 And with this change it makes no sense to put this file ndfc.txt into the 
 amcc directory.
 
 Josh, where should this go then?

I declare it to be: dts-bindings/4xx/

mostly because I don't want the bindings scattered across two
directories simply because of the timeframe they showed up in the
marketplace.

If there are better ideas, I'm all ears.

josh


Re: [PATCH 0/9] powerpc: Preliminary work to enable SMP BookE

2008-12-09 Thread Kumar Gala


On Dec 7, 2008, at 11:39 PM, Benjamin Herrenschmidt wrote:


This series of patches is aimed at supporting SMP on non-hash
based processors. It consists of a rework of the MMU context
management and TLB management, clearly splitting hash32, hash64
and nohash in both cases, adding SMP safe context handling and
some basic SMP TLB management.

There is room for improvements, such as implementing lazy TLB
flushing on processors without invalidate-by-PID support HW,
some better IPI mechanism, support for variable sizes PID,
lock less fast path in the MMU context switch, etc...
but it should basically work.

There are some seemingly unrelated patches in the pile as they
are dependencies of the main ones, so I'm including them.


You'll be happy to know these patches at least boot on real 85xx SMP HW.

- k


Re: [PATCH 1/5] powerpc: booke: Don't hard-code size of struct tlbcam

2008-12-09 Thread Josh Boyer
On Mon,  8 Dec 2008 19:34:55 -0800
Trent Piepho [EMAIL PROTECTED] wrote:

 Some assembly code in head_fsl_booke.S hard-coded the size of struct tlbcam
 to 20 when it indexed the TLBCAM table.  Anyone changing the size of struct
 tlbcam would not know to expect that.
 
 The kernel already has a system to get the size of C structures into
 assembly language files, asm-offsets, so let's use it.
 
 The definition of the struct gets moved to a header, so that asm-offsets.c
 can include it.

I don't mean to be overly picky, but your patch subjects and changelog
descriptions are a bit wrong.  This series pertains to FSL BookE chips,
not BookE in general.  There are other variants of BookE, such as 4xx.

If you could keep that in mind for future revisions, I'd appreciate
it.  Something like:

[PATCH] powerpc/fsl-booke: 

or something similar would be a bit more correct.  Unless you really
are changing something global to all BookE processors (which is sort of
rare at the moment).

josh


[PATCH] Fix corruption error in rh_alloc_fixed()

2008-12-09 Thread Guillaume Knispel
There is an error in rh_alloc_fixed() of the Remote Heap code:
If there is at least one free block blk won't be NULL at the end of the
search loop, so -ENOMEM won't be returned and the else branch of
if (bs == s || be == e) will be taken, corrupting the management
structures.

Signed-off-by: Guillaume Knispel [EMAIL PROTECTED]
---
Fix an error in rh_alloc_fixed() that made allocations succeed when
they should fail, and corrupted management structures.

diff --git a/arch/powerpc/lib/rheap.c b/arch/powerpc/lib/rheap.c
index 29b2941..45907c1 100644
--- a/arch/powerpc/lib/rheap.c
+++ b/arch/powerpc/lib/rheap.c
@@ -556,6 +556,7 @@ unsigned long rh_alloc_fixed(rh_info_t * info, unsigned 
long start, int size, co
	be = blk->start + blk->size;
	if (s >= bs && e <= be)
break;
+   blk = NULL;
}
 
if (blk == NULL)


Re: [PATCH] Fix corruption error in rh_alloc_fixed()

2008-12-09 Thread Timur Tabi
Guillaume Knispel wrote:
 There is an error in rh_alloc_fixed() of the Remote Heap code:
 If there is at least one free block blk won't be NULL at the end of the
 search loop, so -ENOMEM won't be returned and the else branch of
 if (bs == s || be == e) will be taken, corrupting the management
 structures.
 
 Signed-off-by: Guillaume Knispel [EMAIL PROTECTED]
 ---
 Fix an error in rh_alloc_fixed() that made allocations succeed when
 they should fail, and corrupted management structures.
 
 diff --git a/arch/powerpc/lib/rheap.c b/arch/powerpc/lib/rheap.c
 index 29b2941..45907c1 100644
 --- a/arch/powerpc/lib/rheap.c
 +++ b/arch/powerpc/lib/rheap.c
 @@ -556,6 +556,7 @@ unsigned long rh_alloc_fixed(rh_info_t * info, unsigned 
 long start, int size, co
  	be = blk->start + blk->size;
  	if (s >= bs && e <= be)
   break;
 + blk = NULL;
   }
  
   if (blk == NULL)

This is a good catch, however, wouldn't it be better to do this:

	list_for_each(l, &info->free_list) {
		blk = list_entry(l, rh_block_t, list);
		/* The range must lie entirely inside one free block */
		bs = blk->start;
		be = blk->start + blk->size;
		if (s >= bs && e <= be)
			break;
	}

-	if (blk == NULL)
+	if (blk == &info->free_list)
return (unsigned long) -ENOMEM;

I haven't tested this, but the if-statement at the end of the loop is meant to
check whether the list_for_each() loop got to the end or not.

What do you think?

-- 
Timur Tabi
Linux kernel developer at Freescale


Re: [PATCH] Fix corruption error in rh_alloc_fixed()

2008-12-09 Thread Guillaume Knispel
On Tue, 09 Dec 2008 09:03:19 -0600
Timur Tabi [EMAIL PROTECTED] wrote:

 Guillaume Knispel wrote:
  There is an error in rh_alloc_fixed() of the Remote Heap code:
  If there is at least one free block blk won't be NULL at the end of the
  search loop, so -ENOMEM won't be returned and the else branch of
  if (bs == s || be == e) will be taken, corrupting the management
  structures.
  
  Signed-off-by: Guillaume Knispel [EMAIL PROTECTED]
  ---
  Fix an error in rh_alloc_fixed() that made allocations succeed when
  they should fail, and corrupted management structures.
  
  diff --git a/arch/powerpc/lib/rheap.c b/arch/powerpc/lib/rheap.c
  index 29b2941..45907c1 100644
  --- a/arch/powerpc/lib/rheap.c
  +++ b/arch/powerpc/lib/rheap.c
  @@ -556,6 +556,7 @@ unsigned long rh_alloc_fixed(rh_info_t * info, unsigned 
  long start, int size, co
   	be = blk->start + blk->size;
   	if (s >= bs && e <= be)
  break;
  +   blk = NULL;
  }
   
  if (blk == NULL)
 
 This is a good catch, however, wouldn't it be better to do this:
 
  	list_for_each(l, &info->free_list) {
  		blk = list_entry(l, rh_block_t, list);
  		/* The range must lie entirely inside one free block */
  		bs = blk->start;
  		be = blk->start + blk->size;
  		if (s >= bs && e <= be)
  			break;
  	}
 
  -	if (blk == NULL)
  +	if (blk == &info->free_list)
   return (unsigned long) -ENOMEM;
 
 I haven't tested this, but the if-statement at the end of the loop is meant to
 check whether the list_for_each() loop got to the end or not.
 
 What do you think?

blk = NULL; at the end of the loop is what is done in the more used
rh_alloc_align(), so for consistency either we change both or we use
the same construction here.
I also think that testing for info->free_list is harder to understand
because you must have the linked list implementation in your head
(which a kernel developer should anyway, so this is not so important).

Guillaume Knispel


Re: [PATCH] Fix corruption error in rh_alloc_fixed()

2008-12-09 Thread Timur Tabi
Guillaume Knispel wrote:

 blk = NULL; at the end of the loop is what is done in the more used
 rh_alloc_align(), so for consistency either we change both or we use
 the same construction here.
 I also think that testing for info->free_list is harder to understand
 because you must have the linked list implementation in your head
 (which a kernel developer should anyway, so this is not so important).

Fair enough.

Acked-by: Timur Tabi [EMAIL PROTECTED]

-- 
Timur Tabi
Linux kernel developer at Freescale


[PATCH] fork_init: fix division by zero

2008-12-09 Thread Yuri Tikhonov

The following patch fixes divide-by-zero error for the
cases of really big PAGE_SIZEs (e.g. 256KB on ppc44x).
Support for such big page sizes on 44x is not present in the
current kernel yet, but coming soon.

Also, this patch fixes the comment for the max_threads
setting, as it didn't match what the code actually does.

Signed-off-by: Yuri Tikhonov [EMAIL PROTECTED]
Signed-off-by: Ilya Yanok [EMAIL PROTECTED]
---
 kernel/fork.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 2a372a0..b0ac2fb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -181,10 +181,14 @@ void __init fork_init(unsigned long mempages)
 
/*
 * The default maximum number of threads is set to a safe
-* value: the thread structures can take up at most half
-* of memory.
+* value: the thread structures can take up at most
+* (1/8) part of memory.
 */
+#if (8 * THREAD_SIZE) > PAGE_SIZE
max_threads = mempages / (8 * THREAD_SIZE / PAGE_SIZE);
+#else
+   max_threads = mempages * PAGE_SIZE / (8 * THREAD_SIZE);
+#endif
 
/*
 * we need to allow at least 20 threads to boot a system
-- 
1.5.6.1


[PATCH 0/8] Fix a bug and cleanup NUMA boot-time code

2008-12-09 Thread Dave Hansen
The first patch in this series is a genuine bug fix.  The rest
are really just an RFC.

Jon introduced a bug a while ago.  I introduced another when
trying to fix Jon's bug.  I refuse to accept personal blame for
this and, instead, blame the code. :)

The rest of the series is cleanups that I think will help
clarify the code in numa.c and ensure that the next
bonehead like me is not able to muck up the code as easily. :)

The cleanups increase in impact and intrusiveness as the series
goes along, so please consider them an RFC.  But, what I really
want to figure out is a safer way to initialize NODE_DATA() and
start using it as we bring up bootmem on all the nodes.


[PATCH 2/8] Add better comment on careful_allocation()

2008-12-09 Thread Dave Hansen

The behavior in careful_allocation() really confused me
at first.  Add a comment to hopefully make it easier
on the next doofus that looks at it.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation4 
arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation4
2008-12-09 10:16:05.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:05.0 
-0800
@@ -840,8 +840,16 @@ static void __init *careful_allocation(i
  size, nid);
 
/*
-* If the memory came from a previously allocated node, we must
-* retry with the bootmem allocator.
+* We initialize the nodes in numeric order: 0, 1, 2...
+* and hand over control from the LMB allocator to the
+* bootmem allocator.  If this function is called for
+* node 5, then we know that all nodes < 5 are using the
+* bootmem allocator instead of the LMB allocator.
+*
+* So, check the nid from which this allocation came
+* and double check to see if we need to use bootmem
+* instead of the LMB.  We don't free the LMB memory
+* since it would be useless.
 */
	new_nid = early_pfn_to_nid(ret >> PAGE_SHIFT);
	if (new_nid < nid) {
_


[PATCH 6/8] cleanup do_init_bootmem()

2008-12-09 Thread Dave Hansen

I'm debating whether this is worth it. It makes this a bit cleaner
looking, but doesn't seriously enhance readability.  But, I do think
it helps a bit.

Thoughts?

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |  104 +++---
 1 file changed, 55 insertions(+), 49 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation3 
arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation3
2008-12-09 10:16:07.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:07.0 
-0800
@@ -938,6 +938,59 @@ static void mark_reserved_regions_for_ni
}
 }
 
+void do_init_bootmem_node(int node)
+{
+   unsigned long start_pfn, end_pfn;
+   void *bootmem_vaddr;
+   unsigned long bootmap_pages;
+
+	dbg("node %d is online\n", nid);
+	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
+
+   /*
+* Allocate the node structure node local if possible
+*
+* Be careful moving this around, as it relies on all
+* previous nodes' bootmem to be initialized and have
+* all reserved areas marked.
+*/
+   NODE_DATA(nid) = careful_zallocation(nid,
+   sizeof(struct pglist_data),
+   SMP_CACHE_BYTES, end_pfn);
+
+	dbg("node %d\n", nid);
+	dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
+
+	NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
+	NODE_DATA(nid)->node_start_pfn = start_pfn;
+	NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
+
+	if (NODE_DATA(nid)->node_spanned_pages == 0)
+   return;
+
+	dbg("start_paddr = %lx\n", start_pfn << PAGE_SHIFT);
+	dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
+
+	bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
+	bootmem_vaddr = careful_zallocation(nid,
+			bootmap_pages << PAGE_SHIFT,
+			PAGE_SIZE, end_pfn);
+
+	dbg("bootmap_vaddr = %p\n", bootmem_vaddr);
+
+	init_bootmem_node(NODE_DATA(nid),
+			  __pa(bootmem_vaddr) >> PAGE_SHIFT,
+ start_pfn, end_pfn);
+
+   free_bootmem_with_active_regions(nid, end_pfn);
+   /*
+* Be very careful about moving this around.  Future
+* calls to careful_zallocation() depend on this getting
+* done correctly.
+*/
+   mark_reserved_regions_for_nid(nid);
+   sparse_memory_present_with_active_regions(nid);
+}
 
 void __init do_init_bootmem(void)
 {
@@ -958,55 +1011,8 @@ void __init do_init_bootmem(void)
  (void *)(unsigned long)boot_cpuid);
 
for_each_online_node(nid) {
-   unsigned long start_pfn, end_pfn;
-   void *bootmem_vaddr;
-   unsigned long bootmap_pages;
-
-	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
-
-   /*
-* Allocate the node structure node local if possible
-*
-* Be careful moving this around, as it relies on all
-* previous nodes' bootmem to be initialized and have
-* all reserved areas marked.
-*/
-   NODE_DATA(nid) = careful_zallocation(nid,
-   sizeof(struct pglist_data),
-   SMP_CACHE_BYTES, end_pfn);
-
-	dbg("node %d\n", nid);
-	dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
-
-	NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
-	NODE_DATA(nid)->node_start_pfn = start_pfn;
-	NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
-
-	if (NODE_DATA(nid)->node_spanned_pages == 0)
-   continue;
-
-	dbg("start_paddr = %lx\n", start_pfn << PAGE_SHIFT);
-	dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
-
-	bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-	bootmem_vaddr = careful_zallocation(nid,
-			bootmap_pages << PAGE_SHIFT,
-			PAGE_SIZE, end_pfn);
-
-	dbg("bootmap_vaddr = %p\n", bootmem_vaddr);
-
-	init_bootmem_node(NODE_DATA(nid),
-			  __pa(bootmem_vaddr) >> PAGE_SHIFT,
- start_pfn, end_pfn);
-
-   free_bootmem_with_active_regions(nid, end_pfn);
-   /*
-* Be very careful about moving this around.  Future
-* calls to careful_zallocation() depend on this getting
-* done correctly.
-*/
-   mark_reserved_regions_for_nid(nid);
-   sparse_memory_present_with_active_regions(nid);
+	dbg("node %d: marked online, initializing bootmem\n", nid);
+   

[PATCH 4/8] make careful_allocation() return vaddrs

2008-12-09 Thread Dave Hansen

Since we memset() the result in both of the uses here,
just make careful_alloc() return a virtual address.
Also, add a separate variable to store the physical
address that comes back from the lmb_alloc() functions.
This makes it less likely that someone will screw it up
forgetting to convert before returning since the vaddr
is always in a void* and the paddr is always in an
unsigned long.

I admit this is arbitrary since one of its users needs
a paddr and one a vaddr, but it does remove a good
number of casts.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   37 --
 1 file changed, 20 insertions(+), 17 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation1 
arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation1
2008-12-09 10:16:06.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:06.0 
-0800
@@ -822,23 +822,28 @@ static void __init dump_numa_memory_topo
  * required. nid is the preferred node and end is the physical address of
  * the highest address in the node.
  *
- * Returns the physical address of the memory.
+ * Returns the virtual address of the memory.
  */
 static void __init *careful_allocation(int nid, unsigned long size,
   unsigned long align,
   unsigned long end_pfn)
 {
+   void *ret;
int new_nid;
-	unsigned long ret = __lmb_alloc_base(size, align, end_pfn << PAGE_SHIFT);
+   unsigned long ret_paddr;
+
+	ret_paddr = __lmb_alloc_base(size, align, end_pfn << PAGE_SHIFT);
 
/* retry over all memory */
-   if (!ret)
-   ret = __lmb_alloc_base(size, align, lmb_end_of_DRAM());
+   if (!ret_paddr)
+   ret_paddr = __lmb_alloc_base(size, align, lmb_end_of_DRAM());
 
-   if (!ret)
+   if (!ret_paddr)
		panic("numa.c: cannot allocate %lu bytes for node %d",
  size, nid);
 
+   ret = __va(ret_paddr);
+
/*
 * We initialize the nodes in numeric order: 0, 1, 2...
 * and hand over control from the LMB allocator to the
@@ -851,17 +856,15 @@ static void __init *careful_allocation(i
 * instead of the LMB.  We don't free the LMB memory
 * since it would be useless.
 */
-	new_nid = early_pfn_to_nid(ret >> PAGE_SHIFT);
+	new_nid = early_pfn_to_nid(ret_paddr >> PAGE_SHIFT);
	if (new_nid < nid) {
-   ret = (unsigned long)__alloc_bootmem_node(NODE_DATA(new_nid),
+   ret = __alloc_bootmem_node(NODE_DATA(new_nid),
size, align, 0);
 
-   ret = __pa(ret);
-
-		dbg("alloc_bootmem %lx %lx\n", ret, size);
+		dbg("alloc_bootmem %p %lx\n", ret, size);
}
 
-   return (void *)ret;
+   return ret;
 }
 
 static struct notifier_block __cpuinitdata ppc64_numa_nb = {
@@ -955,7 +958,7 @@ void __init do_init_bootmem(void)
 
for_each_online_node(nid) {
unsigned long start_pfn, end_pfn;
-   unsigned long bootmem_paddr;
+   void *bootmem_vaddr;
unsigned long bootmap_pages;
 
	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
@@ -970,7 +973,6 @@ void __init do_init_bootmem(void)
NODE_DATA(nid) = careful_allocation(nid,
sizeof(struct pglist_data),
SMP_CACHE_BYTES, end_pfn);
-   NODE_DATA(nid) = __va(NODE_DATA(nid));
memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
 
	dbg("node %d\n", nid);
@@ -987,14 +989,15 @@ void __init do_init_bootmem(void)
	dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
 
	bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-	bootmem_paddr = (unsigned long)careful_allocation(nid,
+	bootmem_vaddr = careful_allocation(nid,
			bootmap_pages << PAGE_SHIFT,
			PAGE_SIZE, end_pfn);
-	memset(__va(bootmem_paddr), 0, bootmap_pages << PAGE_SHIFT);
+	memset(bootmem_vaddr, 0, bootmap_pages << PAGE_SHIFT);
 
-	dbg("bootmap_paddr = %lx\n", bootmem_paddr);
+	dbg("bootmap_vaddr = %p\n", bootmem_vaddr);
 
-	init_bootmem_node(NODE_DATA(nid), bootmem_paddr >> PAGE_SHIFT,
+	init_bootmem_node(NODE_DATA(nid),
+			  __pa(bootmem_vaddr) >> PAGE_SHIFT,
  start_pfn, end_pfn);
 
free_bootmem_with_active_regions(nid, end_pfn);
_


[PATCH 5/8] cleanup careful_allocation(): consolidate memset()

2008-12-09 Thread Dave Hansen

Both users of careful_allocation() immediately memset() the
result.  So, just do it in one place.

Also give careful_allocation() a 'z' prefix to bring it in
line with kzalloc() and friends.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation2 
arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation2
2008-12-09 10:16:06.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:06.0 
-0800
@@ -824,7 +824,7 @@ static void __init dump_numa_memory_topo
  *
  * Returns the virtual address of the memory.
  */
-static void __init *careful_allocation(int nid, unsigned long size,
+static void __init *careful_zallocation(int nid, unsigned long size,
   unsigned long align,
   unsigned long end_pfn)
 {
@@ -864,6 +864,7 @@ static void __init *careful_allocation(i
		dbg("alloc_bootmem %p %lx\n", ret, size);
}
 
+   memset(ret, 0, size);
return ret;
 }
 
@@ -970,10 +971,9 @@ void __init do_init_bootmem(void)
 * previous nodes' bootmem to be initialized and have
 * all reserved areas marked.
 */
-   NODE_DATA(nid) = careful_allocation(nid,
+   NODE_DATA(nid) = careful_zallocation(nid,
sizeof(struct pglist_data),
SMP_CACHE_BYTES, end_pfn);
-   memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
 
	dbg("node %d\n", nid);
	dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
@@ -989,10 +989,9 @@ void __init do_init_bootmem(void)
	dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
 
bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-   bootmem_vaddr = careful_allocation(nid,
+   bootmem_vaddr = careful_zallocation(nid,
			bootmap_pages << PAGE_SHIFT,
PAGE_SIZE, end_pfn);
-   memset(bootmem_vaddr, 0, bootmap_pages  PAGE_SHIFT);
 
dbg(bootmap_vaddr = %p\n, bootmem_vaddr);
 
@@ -1003,7 +1002,7 @@ void __init do_init_bootmem(void)
free_bootmem_with_active_regions(nid, end_pfn);
/*
 * Be very careful about moving this around.  Future
-* calls to careful_allocation() depend on this getting
+* calls to careful_zallocation() depend on this getting
 * done correctly.
 */
mark_reserved_regions_for_nid(nid);
_


[PATCH 1/8] fix bootmem reservation on uninitialized node

2008-12-09 Thread Dave Hansen

careful_allocation() was calling into the bootmem allocator for
nodes which had not been fully initialized and caused a previous
bug.  http://patchwork.ozlabs.org/patch/10528/  So, I merged a
few broken-out loops in do_init_bootmem() to fix it.  That changed
the code ordering.

I think this bug is triggered by having reserved areas for a node
which are spanned by another node's contents.  In the
mark_reserved_regions_for_nid() code, we attempt to reserve the
area for a node before we have allocated the NODE_DATA() for that
nid.  We do this since I reordered that loop.  I suck.

This may only present on some systems that have 16GB pages
reserved.  But, it can probably happen on any system that is
trying to reserve large swaths of memory that happen to span other
nodes' contents.

This patch ensures that we do not touch bootmem for any node which
has not been initialized.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff -puN arch/powerpc/mm/numa.c~fix-bad-node-reserve arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~fix-bad-node-reserve   2008-12-09 
10:16:04.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:04.0 
-0800
@@ -870,6 +870,7 @@ static void mark_reserved_regions_for_ni
struct pglist_data *node = NODE_DATA(nid);
int i;
 
+	dbg("mark_reserved_regions_for_nid(%d) NODE_DATA: %p\n", nid, node);
	for (i = 0; i < lmb.reserved.cnt; i++) {
unsigned long physbase = lmb.reserved.region[i].base;
unsigned long size = lmb.reserved.region[i].size;
@@ -901,10 +902,14 @@ static void mark_reserved_regions_for_ni
		if (end_pfn > node_ar.end_pfn)
			reserve_size = (node_ar.end_pfn << PAGE_SHIFT)
				- (start_pfn << PAGE_SHIFT);
-		dbg("reserve_bootmem %lx %lx nid=%d\n", physbase,
-   reserve_size, node_ar.nid);
-   reserve_bootmem_node(NODE_DATA(node_ar.nid), physbase,
-   reserve_size, BOOTMEM_DEFAULT);
+   /*
+* Only worry about *this* node, others may not
+* yet have valid NODE_DATA().
+*/
+   if (node_ar.nid == nid)
+   reserve_bootmem_node(NODE_DATA(node_ar.nid),
+   physbase, reserve_size,
+   BOOTMEM_DEFAULT);
/*
 * if reserved region is contained in the active region
 * then done.
_


[PATCH 7/8] less use of NODE_DATA()

2008-12-09 Thread Dave Hansen

The use of NODE_DATA() in the ppc init code is fragile.  We use
it for some nodes as we are initializing others.  As the loop
initializing them has gotten more complex and broken out into
several functions it gets harder and harder to remember how
this goes.

This was recently the cause of a bug

http://patchwork.ozlabs.org/patch/10528/

in which I also created a new regression for machines with large
memory reservations in the LMB structures (most likely 16GB
pages).

This patch reduces the references to NODE_DATA() and also keeps
it uninitialized for as long as possible.  Hopefully, the delay
in initialization will keep its use from spreading too much,
reducing the chances for future bugs.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   63 ++
 1 file changed, 31 insertions(+), 32 deletions(-)

diff -puN arch/powerpc/mm/numa.c~less-use-of-NODE_DATA arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~less-use-of-NODE_DATA  2008-12-09 
10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:08.0 
-0800
@@ -847,17 +847,16 @@ static void __init *careful_zallocation(
/*
 * We initialize the nodes in numeric order: 0, 1, 2...
 * and hand over control from the LMB allocator to the
-* bootmem allocator.  If this function is called for
-* node 5, then we know that all nodes < 5 are using the
-* bootmem allocator instead of the LMB allocator.
+* bootmem allocator.
 *
-* So, check the nid from which this allocation came
-* and double check to see if we need to use bootmem
-* instead of the LMB.  We don't free the LMB memory
-* since it would be useless.
+* We must not call into the bootmem allocator for any node
+* which has not had bootmem initialized and had all of the
+* reserved areas set up.  In do_init_bootmem_node(), we do
+* not set NODE_DATA(nid) up until that is done.  Use that
+* property here.
 */
	new_nid = early_pfn_to_nid(ret_paddr >> PAGE_SHIFT);
-	if (new_nid < nid) {
+   if (NODE_DATA(new_nid)) {
ret = __alloc_bootmem_node(NODE_DATA(new_nid),
size, align, 0);
 
@@ -873,12 +872,12 @@ static struct notifier_block __cpuinitda
.priority = 1 /* Must run before sched domains notifier. */
 };
 
-static void mark_reserved_regions_for_nid(int nid)
+static void mark_reserved_regions_for_node(struct pglist_data *node)
 {
-   struct pglist_data *node = NODE_DATA(nid);
+	int nid = node->node_id;
int i;
 
-	dbg("mark_reserved_regions_for_nid(%d) NODE_DATA: %p\n", nid, node);
+	dbg("%s(%d) NODE_DATA: %p\n", __func__, nid, node);
	for (i = 0; i < lmb.reserved.cnt; i++) {
unsigned long physbase = lmb.reserved.region[i].base;
unsigned long size = lmb.reserved.region[i].size;
@@ -915,9 +914,8 @@ static void mark_reserved_regions_for_ni
 * yet have valid NODE_DATA().
 */
if (node_ar.nid == nid)
-   reserve_bootmem_node(NODE_DATA(node_ar.nid),
-   physbase, reserve_size,
-   BOOTMEM_DEFAULT);
+   reserve_bootmem_node(node, physbase,
+   reserve_size, BOOTMEM_DEFAULT);
/*
 * if reserved region is contained in the active region
 * then done.
@@ -938,8 +936,9 @@ static void mark_reserved_regions_for_ni
}
 }
 
-void do_init_bootmem_node(int node)
+void do_init_bootmem_node(int nid)
 {
+   struct pglist_data *node;
unsigned long start_pfn, end_pfn;
void *bootmem_vaddr;
unsigned long bootmap_pages;
@@ -954,18 +953,16 @@ void do_init_bootmem_node(int node)
 * previous nodes' bootmem to be initialized and have
 * all reserved areas marked.
 */
-   NODE_DATA(nid) = careful_zallocation(nid,
-   sizeof(struct pglist_data),
-   SMP_CACHE_BYTES, end_pfn);
-
-	dbg("node %d\n", nid);
-	dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
-
-	NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
-	NODE_DATA(nid)->node_start_pfn = start_pfn;
-	NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
+	node = careful_zallocation(nid, sizeof(struct pglist_data),
+				   SMP_CACHE_BYTES, end_pfn);
 
-	if (NODE_DATA(nid)->node_spanned_pages == 0)
+	dbg("node %d pglist_data: %p\n", nid, node);
+
+	node->bdata = &bootmem_node_data[nid];
+	node->node_start_pfn = start_pfn;
+	node->node_spanned_pages = end_pfn - start_pfn;
+
+ 

[PATCH 8/8] make free_bootmem_with_active_regions() take pgdat

2008-12-09 Thread Dave Hansen

As I said earlier, I'm trying to restrict the use of NODE_DATA()
since it can easily be referenced too early otherwise.

free_bootmem_with_active_regions() does not in practice need to
deal with multiple nodes.  I already audited all of its callers.

This patch makes it take a pgdat instead of doing the NODE_DATA()
lookup internally.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/mips/sgi-ip27/ip27-memory.c |2 +-
 linux-2.6.git-dave/arch/powerpc/mm/mem.c|5 +++--
 linux-2.6.git-dave/arch/powerpc/mm/numa.c   |3 +--
 linux-2.6.git-dave/arch/s390/kernel/setup.c |2 +-
 linux-2.6.git-dave/arch/sh/mm/numa.c|2 +-
 linux-2.6.git-dave/arch/sparc64/mm/init.c   |6 +++---
 linux-2.6.git-dave/arch/x86/mm/init_32.c|2 +-
 linux-2.6.git-dave/arch/x86/mm/init_64.c|2 +-
 linux-2.6.git-dave/arch/x86/mm/numa_64.c|2 +-
 linux-2.6.git-dave/include/linux/mm.h   |2 +-
 linux-2.6.git-dave/mm/page_alloc.c  |8 
 11 files changed, 18 insertions(+), 18 deletions(-)

diff -puN 
arch/mips/sgi-ip27/ip27-memory.c~make-free_bootmem_with_active_regions-take-pgdat
 arch/mips/sgi-ip27/ip27-memory.c
--- 
linux-2.6.git/arch/mips/sgi-ip27/ip27-memory.c~make-free_bootmem_with_active_regions-take-pgdat
 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/mips/sgi-ip27/ip27-memory.c 2008-12-09 
10:16:08.0 -0800
@@ -412,7 +412,7 @@ static void __init node_mem_init(cnodeid
 
bootmap_size = init_bootmem_node(NODE_DATA(node), slot_freepfn,
start_pfn, end_pfn);
-   free_bootmem_with_active_regions(node, end_pfn);
+   free_bootmem_with_active_regions(NODE_DATA(node), end_pfn);
reserve_bootmem_node(NODE_DATA(node), slot_firstpfn << PAGE_SHIFT,
((slot_freepfn - slot_firstpfn) << PAGE_SHIFT) + bootmap_size,
BOOTMEM_DEFAULT);
diff -puN 
arch/powerpc/mm/mem.c~make-free_bootmem_with_active_regions-take-pgdat 
arch/powerpc/mm/mem.c
--- 
linux-2.6.git/arch/powerpc/mm/mem.c~make-free_bootmem_with_active_regions-take-pgdat
2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/mem.c2008-12-09 10:16:08.0 
-0800
@@ -212,7 +212,8 @@ void __init do_init_bootmem(void)
 * present.
 */
 #ifdef CONFIG_HIGHMEM
-   free_bootmem_with_active_regions(0, lowmem_end_addr >> PAGE_SHIFT);
+   free_bootmem_with_active_regions(NODE_DATA(0),
+lowmem_end_addr >> PAGE_SHIFT);
 
/* reserve the sections we're already using */
for (i = 0; i < lmb.reserved.cnt; i++) {
@@ -230,7 +231,7 @@ void __init do_init_bootmem(void)
}
}
 #else
-   free_bootmem_with_active_regions(0, max_pfn);
+   free_bootmem_with_active_regions(NODE_DATA(0), max_pfn);
 
/* reserve the sections we're already using */
for (i = 0; i < lmb.reserved.cnt; i++)
diff -puN 
arch/powerpc/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat 
arch/powerpc/mm/numa.c
--- 
linux-2.6.git/arch/powerpc/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat
   2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:08.0 
-0800
@@ -978,8 +978,6 @@ void do_init_bootmem_node(int nid)
init_bootmem_node(node, __pa(bootmem_vaddr) >> PAGE_SHIFT,
  start_pfn, end_pfn);
 
-   NODE_DATA(nid) = node;
-   /* this call needs NODE_DATA(), so initialize it above */
free_bootmem_with_active_regions(nid, end_pfn);
mark_reserved_regions_for_node(node);
/*
@@ -988,6 +986,7 @@ void do_init_bootmem_node(int nid)
 * careful_zallocation() depends on this getting set
 * now to tell from which nodes it must use bootmem.
 */
+   NODE_DATA(nid) = node;
sparse_memory_present_with_active_regions(nid);
 }
 
diff -puN 
arch/s390/kernel/setup.c~make-free_bootmem_with_active_regions-take-pgdat 
arch/s390/kernel/setup.c
--- 
linux-2.6.git/arch/s390/kernel/setup.c~make-free_bootmem_with_active_regions-take-pgdat
 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/s390/kernel/setup.c 2008-12-09 10:16:08.0 
-0800
@@ -616,7 +616,7 @@ setup_memory(void)
 
psw_set_key(PAGE_DEFAULT_KEY);
 
-   free_bootmem_with_active_regions(0, max_pfn);
+   free_bootmem_with_active_regions(NODE_DATA(0), max_pfn);
 
/*
 * Reserve memory used for lowcore/command line/kernel image.
diff -puN arch/sh/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat 
arch/sh/mm/numa.c
--- 
linux-2.6.git/arch/sh/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat
2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/sh/mm/numa.c2008-12-09 10:16:08.0 
-0800
@@ -75,7 +75,7 @@ void 

[PATCH 3/8] cleanup careful_allocation(): bootmem already panics

2008-12-09 Thread Dave Hansen

If we fail a bootmem allocation, the bootmem code itself
panics.  No need to redo it here.

Also change the wording of the other panic.  We don't
strictly have to allocate memory on the specified node.
It is just a hint and that node may not even *have* any
memory on it.  In that case we can and do fall back to
other nodes.


Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation 
arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation 
2008-12-09 10:16:05.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c   2008-12-09 10:16:05.0 
-0800
@@ -836,7 +836,7 @@ static void __init *careful_allocation(i
ret = __lmb_alloc_base(size, align, lmb_end_of_DRAM());
 
if (!ret)
-   panic("numa.c: cannot allocate %lu bytes on node %d",
+   panic("numa.c: cannot allocate %lu bytes for node %d",
  size, nid);
 
/*
@@ -856,10 +856,6 @@ static void __init *careful_allocation(i
ret = (unsigned long)__alloc_bootmem_node(NODE_DATA(new_nid),
size, align, 0);
 
-   if (!ret)
-   panic("numa.c: cannot allocate %lu bytes on node %d",
- size, new_nid);
-
ret = __pa(ret);
 
dbg("alloc_bootmem %lx %lx\n", ret, size);
_
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[RFC/PATCH] powerpc: consistent memory mapping.

2008-12-09 Thread Ilya Yanok
 Defining the start virtual address of the consistent memory area
in the configs can lead to the consistent area overlapping other
virtual regions (fixmap, pkmap, vmalloc). The current kernel defaults
just place the consistent memory area somewhere high in the vmalloc
area, and then you have to pray that vmalloc allocations never grow
far enough to overlap it.

 So, this patch makes the virtual address of the consistent memory
be assigned dynamically, at the end of the virtual address space.
The fixmap area is now shifted to lower addresses and ends before
the start of the consistent virtual addresses. The user is now only
allowed to configure the size of the consistent memory area.

 An exception has been made for the 8xx archs, where the start
of the consistent memory is still configurable: this is to avoid
overlapping with the IMM space of 8xx. Actually this is wrong: we
can overlap not only the consistent memory but the IMM space too.
We don't have much expertise in 8xx, though, so we are looking
forward to some advice here.

 The following items remain to be done to fully support the
consistent memory:

a) we are missing one (the last) page of addresses at the end of the
consistent memory area;

b) if CONFIG_CONSISTENT_SIZE is such that we cover more address
regions than are served by one pgd level, then mapping pages into
those additional areas won't work (this 'feature' isn't introduced
by this patch but is a consequence of the current consistent memory
support code, where consistent_pte is set in dma_alloc_init()
according to the pgd of the CONSISTENT_BASE address).

Signed-off-by: Ilya Yanok [EMAIL PROTECTED]
Signed-off-by: Yuri Tikhonov [EMAIL PROTECTED]
---
 arch/powerpc/Kconfig   |7 ---
 arch/powerpc/lib/dma-noncoherent.c |5 +
 arch/powerpc/mm/pgtable_32.c   |2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index aa2eb46..4d62446 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -809,7 +809,7 @@ config TASK_SIZE
 
 config CONSISTENT_START_BOOL
	bool "Set custom consistent memory pool address"
-	depends on ADVANCED_OPTIONS && NOT_COHERENT_CACHE
+	depends on ADVANCED_OPTIONS && NOT_COHERENT_CACHE && 8xx
	help
	  This option allows you to set the base virtual address
	  of the consistent memory pool.  This pool of virtual
@@ -817,8 +817,8 @@ config CONSISTENT_START_BOOL
 
 config CONSISTENT_START
	hex "Base virtual address of consistent memory pool" if CONSISTENT_START_BOOL
-	default "0xfd000000" if (NOT_COHERENT_CACHE && 8xx)
-	default "0xff100000" if NOT_COHERENT_CACHE
+	depends on 8xx
+	default "0xfd000000" if NOT_COHERENT_CACHE
 
 config CONSISTENT_SIZE_BOOL
	bool "Set custom consistent memory pool size"
@@ -831,6 +831,7 @@ config CONSISTENT_SIZE_BOOL
 config CONSISTENT_SIZE
	hex "Size of consistent memory pool" if CONSISTENT_SIZE_BOOL
	default "0x00200000" if NOT_COHERENT_CACHE
+	default "0x00000000" if !NOT_COHERENT_CACHE
 
 config PIN_TLB
bool Pinned Kernel TLBs (860 ONLY)
diff --git a/arch/powerpc/lib/dma-noncoherent.c 
b/arch/powerpc/lib/dma-noncoherent.c
index 31734c0..3c12577 100644
--- a/arch/powerpc/lib/dma-noncoherent.c
+++ b/arch/powerpc/lib/dma-noncoherent.c
@@ -38,8 +38,13 @@
  * can be further configured for specific applications under
  * the Advanced Setup menu. -Matt
  */
+#ifdef CONFIG_CONSISTENT_START
 #define CONSISTENT_BASE	(CONFIG_CONSISTENT_START)
 #define CONSISTENT_END	(CONFIG_CONSISTENT_START + CONFIG_CONSISTENT_SIZE)
+#else
+#define CONSISTENT_BASE	((unsigned long)(-CONFIG_CONSISTENT_SIZE))
+#define CONSISTENT_END	((unsigned long)(-PAGE_SIZE))
+#endif /* CONFIG_CONSISTENT_START */
 #define CONSISTENT_OFFSET(x)	(((unsigned long)(x) - CONSISTENT_BASE) >> PAGE_SHIFT)
 
 /*
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 10d21c3..fda24c7 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -395,7 +395,7 @@ void kernel_map_pages(struct page *page, int numpages, int 
enable)
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
 static int fixmaps;
-unsigned long FIXADDR_TOP = (-PAGE_SIZE);
+unsigned long FIXADDR_TOP = (-PAGE_SIZE-CONFIG_CONSISTENT_SIZE);
 EXPORT_SYMBOL(FIXADDR_TOP);
 
 void __set_fixmap (enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags)
-- 
1.5.6.1



[PATCH] powerpc: Remove `have_of' global variable

2008-12-09 Thread Anton Vorontsov
The `have_of' variable is a relic from the arch/ppc time, it isn't
useful nowadays.

Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
---
 arch/powerpc/include/asm/processor.h |2 --
 arch/powerpc/kernel/pci-common.c |2 --
 arch/powerpc/kernel/pci_32.c |7 +--
 arch/powerpc/kernel/setup_32.c   |2 --
 arch/powerpc/kernel/setup_64.c   |1 -
 fs/proc/proc_devtree.c   |3 +--
 6 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index cd7a478..d346649 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -69,8 +69,6 @@ extern int _prep_type;
 
 #ifdef __KERNEL__
 
-extern int have_of;
-
 struct task_struct;
 void start_thread(struct pt_regs *regs, unsigned long fdptr, unsigned long sp);
 void release_thread(struct task_struct *);
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 91c3f52..1a32db3 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -160,8 +160,6 @@ EXPORT_SYMBOL(pci_domain_nr);
  */
 struct pci_controller* pci_find_hose_for_OF_device(struct device_node* node)
 {
-   if (!have_of)
-   return NULL;
while(node) {
struct pci_controller *hose, *tmp;
list_for_each_entry_safe(hose, tmp, hose_list, list_node)
diff --git a/arch/powerpc/kernel/pci_32.c b/arch/powerpc/kernel/pci_32.c
index 7ad11e5..132cd80 100644
--- a/arch/powerpc/kernel/pci_32.c
+++ b/arch/powerpc/kernel/pci_32.c
@@ -266,9 +266,6 @@ pci_busdev_to_OF_node(struct pci_bus *bus, int devfn)
 {
struct device_node *parent, *np;
 
-   if (!have_of)
-   return NULL;
-
	pr_debug("pci_busdev_to_OF_node(%d,0x%x)\n", bus->number, devfn);
parent = scan_OF_for_pci_bus(bus);
if (parent == NULL)
@@ -309,8 +306,6 @@ pci_device_from_OF_node(struct device_node* node, u8* bus, 
u8* devfn)
struct pci_controller* hose;
struct pci_dev* dev = NULL;

-   if (!have_of)
-   return -ENODEV;
/* Make sure it's really a PCI device */
hose = pci_find_hose_for_OF_device(node);
	if (!hose || !hose->dn)
@@ -431,7 +426,7 @@ static int __init pcibios_init(void)
 * numbers vs. kernel bus numbers since we may have to
 * remap them.
 */
-   if (pci_assign_all_buses && have_of)
+   if (pci_assign_all_buses)
pcibios_make_OF_bus_map();
 
/* Call common code to handle resource allocation */
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index c1a2762..cc4679e 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -53,8 +53,6 @@ unsigned long ISA_DMA_THRESHOLD;
 unsigned int DMA_MODE_READ;
 unsigned int DMA_MODE_WRITE;
 
-int have_of = 1;
-
 #ifdef CONFIG_VGA_CONSOLE
 unsigned long vgacon_remap_base;
 EXPORT_SYMBOL(vgacon_remap_base);
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 93c875a..ce48f5c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -70,7 +70,6 @@
 #define DBG(fmt...)
 #endif
 
-int have_of = 1;
 int boot_cpuid = 0;
 u64 ppc64_pft_size;
 
diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c
index d89..de2bba5 100644
--- a/fs/proc/proc_devtree.c
+++ b/fs/proc/proc_devtree.c
@@ -218,8 +218,7 @@ void proc_device_tree_add_node(struct device_node *np,
 void __init proc_device_tree_init(void)
 {
struct device_node *root;
-   if ( !have_of )
-   return;
+
	proc_device_tree = proc_mkdir("device-tree", NULL);
if (proc_device_tree == 0)
return;
-- 
1.5.6.5


Re: [PATCH] powerpc: Add SMP support to no-hash TLB handling v3

2008-12-09 Thread Benjamin Herrenschmidt
On Tue, 2008-12-09 at 07:10 -0600, Kumar Gala wrote:
  +void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long  
  vmaddr)
  +{
  +   unsigned int pid;
  +
  +   preempt_disable();
  +   pid = vma ? vma->vm_mm->context.id : 0;
  +   if (pid != MMU_NO_CONTEXT)
  +   _tlbil_va(vmaddr, pid);
  +   preempt_enable();
  +}
  +EXPORT_SYMBOL(local_flush_tlb_page);
 
 We are using this in highmem.h for kmap_atomic.. So you need to fix  
 that call site.

Ah yes, I forgot, will fix, thanks.

Cheers,
Ben.




Re: [PATCH 0/9] powerpc: Preliminary work to enable SMP BookE

2008-12-09 Thread Benjamin Herrenschmidt
On Tue, 2008-12-09 at 07:17 -0600, Kumar Gala wrote:

  There are some semingly unrelated patches in the pile as they
  are dependencies of the main ones so I'm including them in.
 
 You'll be happy to know these patches at least boot on real 85xx SMP HW.

Ah excellent !

Now time for you to torture test them :-) BTW, don't you guys support
larger than 8-bit PIDs on some E500 cores? The latest patch I posted
yesterday should make it easy to slip that in too.

Cheers,
Ben.



Re: [PATCH] powerpc: add 16K/64K pages support for the 44x PPC32 architectures.

2008-12-09 Thread Benjamin Herrenschmidt
Hi Ilya !

Looks good overall. A few minor comments.

 +config PPC_4K_PAGES
 +	bool "4k page size"
 +
 +config PPC_16K_PAGES
 +	bool "16k page size" if 44x
 +
 +config PPC_64K_PAGES
 +	bool "64k page size" if 44x || PPC64
 +	select PPC_HAS_HASH_64K if PPC64

I'd rather the PPC64 references were instead PPC_STD_MMU_64 (which
may or may not be defined in Kconfig depending on what you are based on,
but is trivial to add).

I want to clearly differentiate what is MMU from what is CPU architecture,
and there may (will ... ahem) at some point be 64-bit BookE.

In the same vein, we should probably rework some of the above so that
the CPU/MMU type actually defines what page sizes are allowed
(PPC_CAN_16K, PPC_CAN_64K, ...) but let's keep that for a later patch.

  config PPC_SUBPAGE_PROT
   bool "Support setting protections for 4k subpages"
 - depends on PPC_64K_PAGES
 + depends on PPC64 && PPC_64K_PAGES
   help
 This option adds support for a system call to allow user programs
 to set access permissions (read/write, readonly, or no access)

Same comment here.

 diff --git a/arch/powerpc/include/asm/highmem.h 
 b/arch/powerpc/include/asm/highmem.h
 index 91c5895..9875540 100644
 --- a/arch/powerpc/include/asm/highmem.h
 +++ b/arch/powerpc/include/asm/highmem.h
 @@ -38,9 +38,20 @@ extern pte_t *pkmap_page_table;
   * easily, subsequent pte tables have to be allocated in one physical
   * chunk of RAM.
   */
 -#define LAST_PKMAP   (1 << PTE_SHIFT)
 -#define LAST_PKMAP_MASK (LAST_PKMAP-1)
 +/*
 + * We use one full pte table with 4K pages. And with 16K/64K pages pte
 + * table covers enough memory (32MB and 512MB resp.) that both FIXMAP
 + * and PKMAP can be placed in single pte table. We use 1024 pages for
 + * PKMAP in case of 16K/64K pages.
 + */
 +#define PKMAP_ORDER  min(PTE_SHIFT, 10)
 +#define LAST_PKMAP   (1 << PKMAP_ORDER)
 +#if !defined(CONFIG_PPC_4K_PAGES)
 +#define PKMAP_BASE   (FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1))
 +#else
 #define PKMAP_BASE   ((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK)
 +#endif

I'm not sure about the above & PMD_MASK. Shouldn't we instead make it
not build if (PKMAP_BASE & PMD_MASK) != 0 ? IE, somebody set
FIXADDR_START to something wrong... and avoid the ifdef altogether ? Or
am I missing something ? (it's early morning and I may not have all my
wits with me right now !)

 -#ifdef CONFIG_PPC_64K_PAGES
 +#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64)
  typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
  #else

Same comment about using PPC_STD_MMU_64, it's going to make my life
easier later on :-) And in various other places, I won't quote them all.

 diff --git a/arch/powerpc/include/asm/page_32.h 
 b/arch/powerpc/include/asm/page_32.h
 index d77072a..74b097b 100644
 --- a/arch/powerpc/include/asm/page_32.h
 +++ b/arch/powerpc/include/asm/page_32.h
 @@ -19,6 +19,7 @@
  #define PTE_FLAGS_OFFSET 0
  #endif
  
 +#define PTE_SHIFT	(PAGE_SHIFT - PTE_T_LOG2)	/* full page */
  #ifndef __ASSEMBLY__

Stick a blank line between the two above statements.

  /*
   * The basic type of a PTE - 64 bits for those CPUs with  32 bit
 @@ -26,10 +27,8 @@
   */
  #ifdef CONFIG_PTE_64BIT
  typedef unsigned long long pte_basic_t;
 -#define PTE_SHIFT	(PAGE_SHIFT - 3)	/* 512 ptes per page */
  #else
  typedef unsigned long pte_basic_t;
 -#define PTE_SHIFT	(PAGE_SHIFT - 2)	/* 1024 ptes per page */
  #endif
  
  struct page;
 diff --git a/arch/powerpc/include/asm/pgtable.h 
 b/arch/powerpc/include/asm/pgtable.h
 index dbb8ca1..a202043 100644
 --- a/arch/powerpc/include/asm/pgtable.h
 +++ b/arch/powerpc/include/asm/pgtable.h
 @@ -39,6 +39,8 @@ extern void paging_init(void);
  
  #include asm-generic/pgtable.h
  
 +#define PGD_T_LOG2   (__builtin_ffs(sizeof(pgd_t)) - 1)
 +#define PTE_T_LOG2   (__builtin_ffs(sizeof(pte_t)) - 1)

I'm surprised the above actually works :-) Why not have these next to the
definition of pte_t in page_32.h ?

Also, you end up having to do an asm-offset trick to get those to asm, I
wonder if it's worth it or if we aren't better off just #defining the sizes
with actual numbers next to the type definitions. No big deal either way.
  /*
   * To support 32-bit physical addresses, we use an 8KB pgdir.
 diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
 index bdc8b0e..42f99d2 100644
 --- a/arch/powerpc/kernel/misc_32.S
 +++ b/arch/powerpc/kernel/misc_32.S
 @@ -647,8 +647,8 @@ _GLOBAL(__flush_dcache_icache)
  BEGIN_FTR_SECTION
   blr
  END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 - rlwinm  r3,r3,0,0,19/* Get page base address */
 - li  r4,4096/L1_CACHE_BYTES  /* Number of lines in a page */
 + rlwinm  r3,r3,0,0,PPC44x_RPN_MASK   /* Get page base address */
 + li  r4,PAGE_SIZE/L1_CACHE_BYTES /* Number of lines in a page */

Now, the problem here is the name of the constant. IE. This is more or
less generic 

Re: [PATCH v5] spi: Add PPC4xx SPI driver

2008-12-09 Thread Steven A. Falco
Stefan Roese wrote:
 This adds a SPI driver for the SPI controller found in the IBM/AMCC
 4xx PowerPC's.
 
 Signed-off-by: Stefan Roese [EMAIL PROTECTED]
 Signed-off-by: Wolfgang Ocker [EMAIL PROTECTED]
 Acked-by: Josh Boyer [EMAIL PROTECTED]
 ---

I have a question as to how to use this driver.  of_num_gpios() starts
testing for GPIOs at num = 0, and stops at the first invalid one.
However, GPIO numbers are apparently allocated dynamically from 255 down,
meaning that there probably is no gpio 0.

For example, on my Sequoia board I have gpiochip176, gpiochip192,
and gpiochip224.  So, of_num_gpios() returns zero, even though there
are 72 gpio's on my board.

This gets back to an earlier discussion about setting the gpio index
of each controller, which was rejected, IIRC.  If we could set the
base gpio of each chip, we could start at zero and use consecutive
numbers.  Failing that, it seems that Stefan's SPI driver needs to
probe the entire 0-255 gpio space.

How is this intended to work?  An example .dts would be greatly
appreciated.

Steve


Re: [PATCH 6/8] cleanup do_init_bootmem()

2008-12-09 Thread Serge E. Hallyn
Quoting Dave Hansen ([EMAIL PROTECTED]):
 
 I'm debating whether this is worth it. It makes this a bit more clean
 looking, but doesn't seriously enhance readability.  But, I do think
 it helps a bit.
 
 Thoughts?

Absolutely.  do_init_bootmem_node() is *still* a bit largish,
but far better broken out.


Re: [RFC/PATCH] powerpc: consistent memory mapping.

2008-12-09 Thread Benjamin Herrenschmidt
On Tue, 2008-12-09 at 21:23 +0300, Ilya Yanok wrote:
 Defining the start virtual address of the consistent memory
 in configs leads to overlapping of the consistent area with
 the other virtual regions (fixmap, pkmap, vmalloc). Defaults from
 current kernel just set consistent memory area to be somewhere
 high in the vmalloc area and then you need to pray there will be
 not enough vmalloc allocations to overlap.

 .../...

What about just ripping that consistent memory implementation out
completely and using the normal vmalloc/ioremap allocator instead ?

Any reason not to ?

Cheers,
Ben.



Re: [PATCH v5] spi: Add PPC4xx SPI driver

2008-12-09 Thread Steven A. Falco
Steven A. Falco wrote:
 Stefan Roese wrote:
 This adds a SPI driver for the SPI controller found in the IBM/AMCC
 4xx PowerPC's.

 Signed-off-by: Stefan Roese [EMAIL PROTECTED]
 Signed-off-by: Wolfgang Ocker [EMAIL PROTECTED]
 Acked-by: Josh Boyer [EMAIL PROTECTED]
 ---

 How is this intended to work?  An example .dts would be greatly
 appreciated.

Answered my own question.  The gpios must be directly under the spi
node rather than elsewhere in the tree.  This works:

SPI0: [EMAIL PROTECTED] {
	compatible = "ibm,ppc4xx-spi";
	reg = <ef600900 7>;
	interrupts = <8 4>;
	interrupt-parent = <&UIC0>;

	gpios = <&GPIO1 14 0>;
};


Re: [PATCH] powerpc: Remove `have_of' global variable

2008-12-09 Thread Benjamin Herrenschmidt
On Tue, 2008-12-09 at 22:47 +0300, Anton Vorontsov wrote:
 The `have_of' variable is a relic from the arch/ppc time, it isn't
 useful nowadays.
 
 Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]

Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED]

 ---
  arch/powerpc/include/asm/processor.h |2 --
  arch/powerpc/kernel/pci-common.c |2 --
  arch/powerpc/kernel/pci_32.c |7 +--
  arch/powerpc/kernel/setup_32.c   |2 --
  arch/powerpc/kernel/setup_64.c   |1 -
  fs/proc/proc_devtree.c   |3 +--
  6 files changed, 2 insertions(+), 15 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/processor.h 
 b/arch/powerpc/include/asm/processor.h
 index cd7a478..d346649 100644
 --- a/arch/powerpc/include/asm/processor.h
 +++ b/arch/powerpc/include/asm/processor.h
 @@ -69,8 +69,6 @@ extern int _prep_type;
  
  #ifdef __KERNEL__
  
 -extern int have_of;
 -
  struct task_struct;
  void start_thread(struct pt_regs *regs, unsigned long fdptr, unsigned long 
 sp);
  void release_thread(struct task_struct *);
 diff --git a/arch/powerpc/kernel/pci-common.c 
 b/arch/powerpc/kernel/pci-common.c
 index 91c3f52..1a32db3 100644
 --- a/arch/powerpc/kernel/pci-common.c
 +++ b/arch/powerpc/kernel/pci-common.c
 @@ -160,8 +160,6 @@ EXPORT_SYMBOL(pci_domain_nr);
   */
  struct pci_controller* pci_find_hose_for_OF_device(struct device_node* node)
  {
 - if (!have_of)
 - return NULL;
   while(node) {
   struct pci_controller *hose, *tmp;
   list_for_each_entry_safe(hose, tmp, hose_list, list_node)
 diff --git a/arch/powerpc/kernel/pci_32.c b/arch/powerpc/kernel/pci_32.c
 index 7ad11e5..132cd80 100644
 --- a/arch/powerpc/kernel/pci_32.c
 +++ b/arch/powerpc/kernel/pci_32.c
 @@ -266,9 +266,6 @@ pci_busdev_to_OF_node(struct pci_bus *bus, int devfn)
  {
   struct device_node *parent, *np;
  
 - if (!have_of)
 - return NULL;
 -
   pr_debug("pci_busdev_to_OF_node(%d,0x%x)\n", bus->number, devfn);
   parent = scan_OF_for_pci_bus(bus);
   if (parent == NULL)
 @@ -309,8 +306,6 @@ pci_device_from_OF_node(struct device_node* node, u8* 
 bus, u8* devfn)
   struct pci_controller* hose;
   struct pci_dev* dev = NULL;
   
 - if (!have_of)
 - return -ENODEV;
   /* Make sure it's really a PCI device */
   hose = pci_find_hose_for_OF_device(node);
   if (!hose || !hose->dn)
 @@ -431,7 +426,7 @@ static int __init pcibios_init(void)
* numbers vs. kernel bus numbers since we may have to
* remap them.
*/
 - if (pci_assign_all_buses && have_of)
 + if (pci_assign_all_buses)
   pcibios_make_OF_bus_map();
  
   /* Call common code to handle resource allocation */
 diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
 index c1a2762..cc4679e 100644
 --- a/arch/powerpc/kernel/setup_32.c
 +++ b/arch/powerpc/kernel/setup_32.c
 @@ -53,8 +53,6 @@ unsigned long ISA_DMA_THRESHOLD;
  unsigned int DMA_MODE_READ;
  unsigned int DMA_MODE_WRITE;
  
 -int have_of = 1;
 -
  #ifdef CONFIG_VGA_CONSOLE
  unsigned long vgacon_remap_base;
  EXPORT_SYMBOL(vgacon_remap_base);
 diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
 index 93c875a..ce48f5c 100644
 --- a/arch/powerpc/kernel/setup_64.c
 +++ b/arch/powerpc/kernel/setup_64.c
 @@ -70,7 +70,6 @@
  #define DBG(fmt...)
  #endif
  
 -int have_of = 1;
  int boot_cpuid = 0;
  u64 ppc64_pft_size;
  
 diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c
 index d89..de2bba5 100644
 --- a/fs/proc/proc_devtree.c
 +++ b/fs/proc/proc_devtree.c
 @@ -218,8 +218,7 @@ void proc_device_tree_add_node(struct device_node *np,
  void __init proc_device_tree_init(void)
  {
   struct device_node *root;
 - if ( !have_of )
 - return;
 +
   proc_device_tree = proc_mkdir("device-tree", NULL);
   if (proc_device_tree == 0)
   return;



Re: Re[2]: [PATCH 01/11] async_tx: don't use src_list argument of async_xor() for dma addresses

2008-12-09 Thread Dan Williams
On Mon, Dec 8, 2008 at 5:41 PM, Yuri Tikhonov [EMAIL PROTECTED] wrote:
 On Tuesday, December 9, 2008 you wrote:

 On Mon, Dec 8, 2008 at 2:55 PM, Yuri Tikhonov [EMAIL PROTECTED] wrote:
 Using src_list argument of async_xor() as a storage for dma addresses
 implies sizeof(dma_addr_t) = sizeof(struct page *) restriction which is
 not always true (e.g. ppc440spe).


 ppc440spe runs with CONFIG_PHYS_64BIT?

  Yep. It uses 36-bit addressing, so this CONFIG is turned on.

 If we do this then we need to also change md to limit the number of
 allowed disks based on the kernel stack size.  Because with 256 disks
 a 4K stack can be consumed by one call to async_pq ((256 sources in
 raid5.c + 256 sources async_pq.c) * 8 bytes per source on 64-bit).

  On ppc440spe we have 8KB stack, so the things are not worse than on
 32-bit archs with 4KB stack. Thus, I guess no changes to md are
 required because of this patch. Right?

8K stacks do make this less of an issue *provided* handle_stripe()
remains only called from raid5d.  We used to share some stripe
handling work with the requester's process context where the stack is
much more crowded.  So, we would now be more strongly tied to the
raid5d-only approach... maybe that is not enough to deny this change.
Neil, what do you think of the async_{xor,pq,etc} APIs allocating
'src_cnt' sized arrays on the stack?

Thanks,
Dan


Re: [PATCH] ndfc driver

2008-12-09 Thread Sean MacLennan
On Mon, 08 Dec 2008 21:57:12 -1000
Mitch Bradley [EMAIL PROTECTED] wrote:

 One address/size cell isn't enough for the next generation of NAND
 FLASH chips.
 

I am no dts expert, but I thought I could put:

nand {
	#address-cells = <1>;
	#size-cells = <1>;

in my dts and you could put:

nand {
	#address-cells = <2>;
	#size-cells = <2>;

and, assuming we specified the reg entry right, everything would just
work. Is that assumption wrong?

And if the assumption is true, should I make a note in the doc that you
can make the address and size bigger?

Cheers,
   Sean


Re: [PATCH 4/4] leds: Let GPIO LEDs keep their current state

2008-12-09 Thread Trent Piepho
On Wed, 3 Dec 2008, Richard Purdie wrote:
 On Sun, 2008-11-23 at 13:31 +0100, Pavel Machek wrote:
 On Thu 2008-11-20 17:05:56, Trent Piepho wrote:
 I thought of that, but it ends up being more complex.  Instead of just
 using:
 static const struct gpio_led myled = {
 .name = something,
 .keep_state = 1,
 }

 You'd do something like this:
 .default_state = LEDS_GPIO_DEFSTATE_KEEP,

 Is that better?

 Yes.

 Yes, agreed, much better.

Oh very well, I'll change it.  But I reserve the right to make a sarcastic
commit message.


[RFC/PATCH 1/2] powerpc: Rework usage of _PAGE_COHERENT/NO_CACHE/GUARDED

2008-12-09 Thread Benjamin Herrenschmidt
Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in
in the hash code based on some CPU feature bit. We also manipulate
_PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.

This changes the logic so that instead, the PTE now contains
_PAGE_COHERENT for all normal RAM pages that have I = 0. The hash
code clears it if the feature bit is not set.

It also adds some clean accessors to setup various valid combinations
of access flags and change various bits of code to use them instead.

This should help having the PTE actually containing the bit
combinations that we really want.

I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead
set it explicitly from the TLB miss. I will ultimately remove it
completely as it appears that it might not be needed after all
but in the meantime, having it in the TLB miss makes things a
lot easier.

! DO NOT MERGE YET !

I haven't touched the FSL BookE code yet. It may need to selectively
clear M in the TLB miss handler ... or not, depending on what the impact
of M is on non-SMP E5xx setups. I also didn't bother to clear it on 440
because it just has no effect (ie, it won't slow things down).

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

 arch/powerpc/include/asm/pgtable-ppc32.h |   42 +++
 arch/powerpc/include/asm/pgtable-ppc64.h |   13 -
 arch/powerpc/include/asm/pgtable.h   |   26 +++
 arch/powerpc/kernel/head_44x.S   |1 
 arch/powerpc/kernel/pci-common.c |   24 ++---
 arch/powerpc/mm/hash_low_32.S|4 +-
 arch/powerpc/mm/mem.c|4 +-
 arch/powerpc/platforms/cell/spufs/file.c |   27 ++-
 drivers/video/controlfb.c|4 +-
 9 files changed, 66 insertions(+), 79 deletions(-)

--- linux-work.orig/arch/powerpc/include/asm/pgtable-ppc32.h2008-12-10 
10:48:07.0 +1100
+++ linux-work/arch/powerpc/include/asm/pgtable-ppc32.h 2008-12-10 
16:37:01.0 +1100
@@ -228,9 +228,10 @@ extern int icache_44x_need_flush;
  *   - FILE *must* be in the bottom three bits because swap cache
  * entries use the top 29 bits for TLB2.
  *
- *   - CACHE COHERENT bit (M) has no effect on PPC440 core, because it
- * doesn't support SMP. So we can use this as software bit, like
- * DIRTY.
+ *   - CACHE COHERENT bit (M) has no effect on original PPC440 cores,
+ * because it doesn't support SMP. However, some later 460 variants
+ * have -some- form of SMP support and so I keep the bit there for
+ * future use
  *
  * With the PPC 44x Linux implementation, the 0-11th LSBs of the PTE are used
  * for memory protection related functions (see PTE structure in
@@ -436,20 +437,19 @@ extern int icache_44x_need_flush;
 _PAGE_USER | _PAGE_ACCESSED | \
 _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | \
 _PAGE_EXEC | _PAGE_HWEXEC)
+
 /*
- * Note: the _PAGE_COHERENT bit automatically gets set in the hardware
- * PTE if CONFIG_SMP is defined (hash_page does this); there is no need
- * to have it in the Linux PTE, and in fact the bit could be reused for
- * another purpose.  -- paulus.
+ * We define 2 sets of base prot bits, one for basic pages (ie,
+ * cacheable kernel and user pages) and one for non cacheable
+ * pages. We always set _PAGE_COHERENT (when it exists), it will
+ * be explicitly cleared whenever it may prove beneficial
  */
+#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_COHERENT)
+#define _PAGE_BASE_NC  (_PAGE_PRESENT | _PAGE_ACCESSED)
 
-#ifdef CONFIG_44x
-#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
-#else
-#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED)
-#endif
 #define _PAGE_WRENABLE (_PAGE_RW | _PAGE_DIRTY | _PAGE_HWWRITE)
 #define _PAGE_KERNEL   (_PAGE_BASE | _PAGE_SHARED | _PAGE_WRENABLE)
+#define _PAGE_KERNEL_NC	(_PAGE_BASE_NC | _PAGE_SHARED | _PAGE_WRENABLE | _PAGE_NO_CACHE)
 
 #ifdef CONFIG_PPC_STD_MMU
 /* On standard PPC MMU, no user access implies kernel read/write access,
@@ -459,7 +459,7 @@ extern int icache_44x_need_flush;
 #define _PAGE_KERNEL_RO(_PAGE_BASE | _PAGE_SHARED)
 #endif
 
-#define _PAGE_IO   (_PAGE_KERNEL | _PAGE_NO_CACHE | _PAGE_GUARDED)
+#define _PAGE_IO   (_PAGE_KERNEL_NC | _PAGE_GUARDED)
 #define _PAGE_RAM  (_PAGE_KERNEL | _PAGE_HWEXEC)
 
#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) || \
@@ -552,9 +552,6 @@ static inline int pte_young(pte_t pte)
static inline int pte_file(pte_t pte)	{ return pte_val(pte) & _PAGE_FILE; }
static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
 
-static inline void pte_uncache(pte_t pte)	{ pte_val(pte) |= _PAGE_NO_CACHE; }
-static inline void pte_cache(pte_t pte)	{ pte_val(pte) &= ~_PAGE_NO_CACHE; }
-
 static inline pte_t pte_wrprotect(pte_t pte) {

[RFC/PATCH 2/2] powerpc: 44x doesn't need G set everywhere

2008-12-09 Thread Benjamin Herrenschmidt
After discussing with chip designers, it appears that it's not
necessary to set G everywhere on 440 cores. The various core
errata related to prefetch should be sorted out by firmware by
disabling icache prefetching in CCR0. We add the workaround to
the kernel anyway, just in case old firmwares don't do it.

This is valid for -all- 4xx core variants. Later ones hard wire
the absence of prefetch but it doesn't harm to clear the bits
in CCR0 (they should already be cleared anyway).

We still leave G=1 on the linear mapping for now; we need to
stop over-mapping RAM to be able to remove it.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

 arch/powerpc/kernel/head_44x.S |   12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- linux-work.orig/arch/powerpc/kernel/head_44x.S	2008-12-10 16:11:35.0 +1100
+++ linux-work/arch/powerpc/kernel/head_44x.S	2008-12-10 16:29:08.0 +1100
@@ -69,6 +69,17 @@ _ENTRY(_start);
li  r24,0   /* CPU number */
 
 /*
+ * In case the firmware didn't do it, we apply some workarounds
+ * that are good for all 440 core variants here
+ */
+   mfspr   r3,SPRN_CCR0
+   rlwinm  r3,r3,0,0,27/* disable icache prefetch */
+   isync
+   mtspr   SPRN_CCR0,r3
+   isync
+   sync
+
+/*
  * Set up the initial MMU state
  *
  * We are still executing code at the virtual address
@@ -570,7 +581,6 @@ finish_tlb_load:
rlwimi  r10,r12,29,30,30/* DIRTY - SW position */
and r11,r12,r10 /* Mask PTE bits to keep */
andi.   r10,r12,_PAGE_USER  /* User page ? */
-   ori r11,r11,_PAGE_GUARDED   /* 440 errata, needs G set */
beq 1f  /* nope, leave U bits empty */
rlwimi  r11,r11,3,26,28 /* yes, copy S bits to U */
 1: tlbwe   r11,r13,PPC44x_TLB_ATTRIB   /* Write ATTRIB */


[PATCH] powerpc: Remove flush_HPTE()

2008-12-09 Thread Benjamin Herrenschmidt
The function flush_HPTE() is used in only one place, the implementation
of DEBUG_PAGEALLOC on ppc32.

It's actually a dup of flush_tlb_page() though it's -slightly- more
efficient on hash based processors. We remove it and replace it by
a direct call to the hash flush code on those processors and to
flush_tlb_page() for everybody else.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

 arch/powerpc/mm/mmu_decl.h   |   17 -
 arch/powerpc/mm/pgtable_32.c |6 +-
 2 files changed, 5 insertions(+), 18 deletions(-)

--- linux-work.orig/arch/powerpc/mm/mmu_decl.h	2008-12-10 17:01:18.0 +1100
+++ linux-work/arch/powerpc/mm/mmu_decl.h	2008-12-10 17:01:35.0 +1100
@@ -58,17 +58,14 @@ extern phys_addr_t lowmem_end_addr;
  * architectures.  -- Dan
  */
 #if defined(CONFIG_8xx)
-#define flush_HPTE(X, va, pg)  _tlbie(va, 0 /* 8xx doesn't care about PID */)
 #define MMU_init_hw()  do { } while(0)
 #define mmu_mapin_ram()(0UL)
 
 #elif defined(CONFIG_4xx)
-#define flush_HPTE(pid, va, pg)_tlbie(va, pid)
 extern void MMU_init_hw(void);
 extern unsigned long mmu_mapin_ram(void);
 
 #elif defined(CONFIG_FSL_BOOKE)
-#define flush_HPTE(pid, va, pg)_tlbie(va, pid)
 extern void MMU_init_hw(void);
 extern unsigned long mmu_mapin_ram(void);
 extern void adjust_total_lowmem(void);
@@ -77,18 +74,4 @@ extern void adjust_total_lowmem(void);
 /* anything 32-bit except 4xx or 8xx */
 extern void MMU_init_hw(void);
 extern unsigned long mmu_mapin_ram(void);
-
-/* Be careful... this needs to be updated if we ever encounter 603 SMPs,
- * which includes all new 82xx processors.  We need tlbie/tlbsync here
- * in that case (I think). -- Dan.
- */
-static inline void flush_HPTE(unsigned context, unsigned long va,
- unsigned long pdval)
-{
-	if ((Hash != 0) &&
-	    mmu_has_feature(MMU_FTR_HPTE_TABLE))
-   flush_hash_pages(0, va, pdval, 1);
-   else
-   _tlbie(va);
-}
 #endif
Index: linux-work/arch/powerpc/mm/pgtable_32.c
===
--- linux-work.orig/arch/powerpc/mm/pgtable_32.c	2008-12-10 17:01:49.0 +1100
+++ linux-work/arch/powerpc/mm/pgtable_32.c	2008-12-10 17:04:36.0 +1100
@@ -342,7 +342,11 @@ static int __change_page_attr(struct pag
return -EINVAL;
set_pte_at(init_mm, address, kpte, mk_pte(page, prot));
wmb();
-   flush_HPTE(0, address, pmd_val(*kpmd));
+#ifdef CONFIG_PPC_STD_MMU
+   flush_hash_pages(0, address, pmd_val(*kpmd), 1);
+#else
+   flush_tlb_page(NULL, address);
+#endif
pte_unmap(kpte);
 
return 0;


Indirect DCR Access

2008-12-09 Thread Grant Erickson
Josh:

In working through the PPC4xx memory controller (ibm,sdram-4xx-ddr2)
adapter driver for the EDAC MC framework, there are a substantial number of
indirect DCR accesses.

Ideally, I would use the address and data DCRs implied from the SDRAM0
dcr-reg device tree property; however, the mtdcri and mfdcri are
mnemonic-only at present. Consequently, I've done:

#define DCRN_SDRAM0_BASE0x010
#define DCRN_SDRAM0_CONFIG_ADDR (DCRN_SDRAM0_BASE+0x0)
#define DCRN_SDRAM0_CONFIG_DATA (DCRN_SDRAM0_BASE+0x1)

#define mfsdram(reg)mfdcri(SDRAM0, SDRAM_ ## reg)
#define mtsdram(reg, value) mtdcri(SDRAM0, SDRAM_ ## reg, value)

for the short-term.

Is there a long-term strategy or set of options under discussion about
expanding the DCR accessors in dcr.h to include indirect access from a
device tree property as in the case above?

It appears that the processors that use this memory controller core all have
the same DCR address and data registers, so this isn't a huge portability
issue for the immediate future; however, I endeavor to get things as close
to best practices as possible up front.

Regards,

Grant

