Re: [PATCH] ndfc driver
One address/size cell isn't enough for the next generation of NAND FLASH chips. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] powerpc: Add SMP support to no-hash TLB handling v3
+void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
+{
+	unsigned int pid;
+
+	preempt_disable();
+	pid = vma ? vma->vm_mm->context.id : 0;
+	if (pid != MMU_NO_CONTEXT)
+		_tlbil_va(vmaddr, pid);
+	preempt_enable();
+}
+EXPORT_SYMBOL(local_flush_tlb_page);

We are using this in highmem.h for kmap_atomic, so you need to fix that call site.

- k
Re: [PATCH] ndfc driver
On Tue, 9 Dec 2008 07:10:27 +0100 Stefan Roese [EMAIL PROTECTED] wrote:

On Tuesday 09 December 2008, Sean MacLennan wrote:

On Thu, 4 Dec 2008 09:01:07 -0500 Josh Boyer [EMAIL PROTECTED] wrote:

In addition to an example DTS patch (probably to warp itself), could you briefly write up a binding and put it in Documentation/powerpc/dts-bindings/amcc (or similar)? Also please CC the devicetree-discuss list on that part.

Here is a start at the doc. I have sent it as a patch, but could just as easily send raw text. The example comes from the warp dts, just with fewer partitions, so I have not included a warp dts patch here.

Cheers,
Sean

diff --git a/Documentation/powerpc/dts-bindings/amcc/ndfc.txt b/Documentation/powerpc/dts-bindings/amcc/ndfc.txt
new file mode 100644
index 000..668f4a9
--- /dev/null
+++ b/Documentation/powerpc/dts-bindings/amcc/ndfc.txt
@@ -0,0 +1,31 @@
+AMCC NDFC (NanD Flash Controller)
+
+Required properties:
+- compatible : amcc,ndfc.

The 4xx NAND controller was first implemented on the 440EP, IIRC. So I'm pretty sure that this controller is an IBM core and not an AMCC core. So this should be ibm,ndfc.

That is true. It's an IBM blue logic core. And with this change it makes no sense to put this file ndfc.txt into the amcc directory. Josh, where should this go then?

I declare it to be:

dts-bindings/4xx/

mostly because I don't want the bindings scattered across two directories simply because of the timeframe they showed up in the marketplace. If there are better ideas, I'm all ears.

josh
Re: [PATCH 0/9] powerpc: Preliminary work to enable SMP BookE
On Dec 7, 2008, at 11:39 PM, Benjamin Herrenschmidt wrote:

This series of patches is aimed at supporting SMP on non-hash based processors. It consists of a rework of the MMU context management and TLB management, clearly splitting hash32, hash64 and nohash in both cases, adding SMP safe context handling and some basic SMP TLB management.

There is room for improvement, such as implementing lazy TLB flushing on processors without HW support for invalidate-by-PID, a better IPI mechanism, support for variable-size PIDs, a lockless fast path in the MMU context switch, etc... but it should basically work.

There are some seemingly unrelated patches in the pile; they are dependencies of the main ones, so I'm including them.

You'll be happy to know these patches at least boot on real 85xx SMP HW.

- k
Re: [PATCH 1/5] powerpc: booke: Don't hard-code size of struct tlbcam
On Mon, 8 Dec 2008 19:34:55 -0800 Trent Piepho [EMAIL PROTECTED] wrote:

Some assembly code in head_fsl_booke.S hard-coded the size of struct tlbcam to 20 when it indexed the TLBCAM table. Anyone changing the size of struct tlbcam would not know to expect that.

The kernel already has a system to get the size of C structures into assembly language files, asm-offsets, so let's use it.

The definition of the struct gets moved to a header, so that asm-offsets.c can include it.

I don't mean to be overly picky, but your patch subjects and changelog descriptions are a bit wrong. This series pertains to FSL BookE chips, not BookE in general. There are other variants of BookE, such as 4xx. If you could keep that in mind for future revisions, I'd appreciate it.

Something like:

[PATCH] powerpc/fsl-booke:

or something similar would be a bit more correct. Unless you really are changing something global to all BookE processors (which is sort of rare at the moment).

josh
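To illustrate the asm-offsets idea from the quoted changelog: the structure size is computed by the C compiler and exported as a named constant, instead of being typed into assembly by hand. What follows is only a user-space sketch. struct tlbcam_sketch, its five 32-bit fields, the TLBCAM_SIZE name and emit_define() are assumptions made for illustration; they are not taken from the patch or from the kernel's real asm-offsets machinery.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct tlbcam: five 32-bit register images. */
struct tlbcam_sketch {
	uint32_t mas0;
	uint32_t mas1;
	uint32_t mas2;
	uint32_t mas3;
	uint32_t mas7;
};

/* The size comes from the compiler, so it follows any change to the struct. */
static unsigned long tlbcam_size(void)
{
	return sizeof(struct tlbcam_sketch);
}

/*
 * Format one line of a generated header, in the spirit of what an
 * asm-offsets style generator produces for assembly files to include.
 * Returns the number of characters written (as snprintf() does).
 */
static int emit_define(char *buf, size_t len, const char *sym,
		       unsigned long val)
{
	assert(buf != NULL && sym != NULL);
	return snprintf(buf, len, "#define %s %lu", sym, val);
}
```

Calling emit_define(buf, sizeof(buf), "TLBCAM_SIZE", tlbcam_size()) fills buf with a #define line that assembly code could use in place of a literal 20. If a field were added to the struct, the generated value would change with it.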
[PATCH] Fix corruption error in rh_alloc_fixed()
There is an error in rh_alloc_fixed() of the Remote Heap code: If there is at least one free block, blk won't be NULL at the end of the search loop, so -ENOMEM won't be returned and the else branch of if (bs == s || be == e) will be taken, corrupting the management structures.

Signed-off-by: Guillaume Knispel [EMAIL PROTECTED]
---
Fix an error in rh_alloc_fixed() that made allocations succeed when they should fail, and corrupted management structures.

diff --git a/arch/powerpc/lib/rheap.c b/arch/powerpc/lib/rheap.c
index 29b2941..45907c1 100644
--- a/arch/powerpc/lib/rheap.c
+++ b/arch/powerpc/lib/rheap.c
@@ -556,6 +556,7 @@ unsigned long rh_alloc_fixed(rh_info_t * info, unsigned long start, int size, co
 		be = blk->start + blk->size;
 		if (s >= bs && e <= be)
 			break;
+		blk = NULL;
 	}
 
 	if (blk == NULL)
Re: [PATCH] Fix corruption error in rh_alloc_fixed()
Guillaume Knispel wrote:

There is an error in rh_alloc_fixed() of the Remote Heap code: If there is at least one free block, blk won't be NULL at the end of the search loop, so -ENOMEM won't be returned and the else branch of if (bs == s || be == e) will be taken, corrupting the management structures.

Signed-off-by: Guillaume Knispel [EMAIL PROTECTED]
---
Fix an error in rh_alloc_fixed() that made allocations succeed when they should fail, and corrupted management structures.

diff --git a/arch/powerpc/lib/rheap.c b/arch/powerpc/lib/rheap.c
index 29b2941..45907c1 100644
--- a/arch/powerpc/lib/rheap.c
+++ b/arch/powerpc/lib/rheap.c
@@ -556,6 +556,7 @@ unsigned long rh_alloc_fixed(rh_info_t * info, unsigned long start, int size, co
 		be = blk->start + blk->size;
 		if (s >= bs && e <= be)
 			break;
+		blk = NULL;
 	}
 
 	if (blk == NULL)

This is a good catch, however, wouldn't it be better to do this:

 	list_for_each(l, &info->free_list) {
 		blk = list_entry(l, rh_block_t, list);
 		/* The range must lie entirely inside one free block */
 		bs = blk->start;
 		be = blk->start + blk->size;
 		if (s >= bs && e <= be)
 			break;
 	}
 
-	if (blk == NULL)
+	if (blk == &info->free_list)
 		return (unsigned long) -ENOMEM;

I haven't tested this, but the if-statement at the end of the loop is meant to check whether the list_for_each() loop got to the end or not.

What do you think?

-- 
Timur Tabi
Linux kernel developer at Freescale
Re: [PATCH] Fix corruption error in rh_alloc_fixed()
On Tue, 09 Dec 2008 09:03:19 -0600 Timur Tabi [EMAIL PROTECTED] wrote:

Guillaume Knispel wrote:

There is an error in rh_alloc_fixed() of the Remote Heap code: If there is at least one free block, blk won't be NULL at the end of the search loop, so -ENOMEM won't be returned and the else branch of if (bs == s || be == e) will be taken, corrupting the management structures.

Signed-off-by: Guillaume Knispel [EMAIL PROTECTED]
---
Fix an error in rh_alloc_fixed() that made allocations succeed when they should fail, and corrupted management structures.

diff --git a/arch/powerpc/lib/rheap.c b/arch/powerpc/lib/rheap.c
index 29b2941..45907c1 100644
--- a/arch/powerpc/lib/rheap.c
+++ b/arch/powerpc/lib/rheap.c
@@ -556,6 +556,7 @@ unsigned long rh_alloc_fixed(rh_info_t * info, unsigned long start, int size, co
 		be = blk->start + blk->size;
 		if (s >= bs && e <= be)
 			break;
+		blk = NULL;
 	}
 
 	if (blk == NULL)

This is a good catch, however, wouldn't it be better to do this:

 	list_for_each(l, &info->free_list) {
 		blk = list_entry(l, rh_block_t, list);
 		/* The range must lie entirely inside one free block */
 		bs = blk->start;
 		be = blk->start + blk->size;
 		if (s >= bs && e <= be)
 			break;
 	}
 
-	if (blk == NULL)
+	if (blk == &info->free_list)
 		return (unsigned long) -ENOMEM;

I haven't tested this, but the if-statement at the end of the loop is meant to check whether the list_for_each() loop got to the end or not.

What do you think?

blk = NULL; at the end of the loop is what is done in the more used rh_alloc_align(), so for consistency either we change both or we use the same construction here.

I also think that testing for info->free_list is harder to understand because you must have the linked list implementation in your head (which a kernel developer should anyway so this is not so important)

Guillaume Knispel
Re: [PATCH] Fix corruption error in rh_alloc_fixed()
Guillaume Knispel wrote:

blk = NULL; at the end of the loop is what is done in the more used rh_alloc_align(), so for consistency either we change both or we use the same construction here.

I also think that testing for info->free_list is harder to understand because you must have the linked list implementation in your head (which a kernel developer should anyway so this is not so important)

Fair enough.

Acked-by: Timur Tabi [EMAIL PROTECTED]

-- 
Timur Tabi
Linux kernel developer at Freescale
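For readers following the rh_alloc_fixed() thread, the failure mode can be reproduced outside the kernel. The sketch below is an illustration only: it uses a plain NULL-terminated singly linked list rather than the kernel's list_head, and struct block, find_block_buggy() and find_block_fixed() are hypothetical names, not code from rheap.c. It shows why the cursor must be reset to NULL at the bottom of each non-matching iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-block record; a simple NULL-terminated list. */
struct block {
	unsigned long start;
	unsigned long size;
	struct block *next;
};

/*
 * Buggy search: blk is assigned on every iteration, so when no block
 * contains [s, e) it still points at the last block visited and the
 * caller cannot tell that the search failed.
 */
static struct block *find_block_buggy(struct block *head,
				      unsigned long s, unsigned long e)
{
	struct block *blk = NULL;
	struct block *it;

	assert(s <= e);
	for (it = head; it != NULL; it = it->next) {
		blk = it;
		if (s >= blk->start && e <= blk->start + blk->size)
			break;
	}
	return blk;
}

/*
 * Fixed search: reset blk to NULL after a failed test, so NULL is
 * returned exactly when the loop ran off the end without a match.
 */
static struct block *find_block_fixed(struct block *head,
				      unsigned long s, unsigned long e)
{
	struct block *blk = NULL;
	struct block *it;

	assert(s <= e);
	for (it = head; it != NULL; it = it->next) {
		blk = it;
		if (s >= blk->start && e <= blk->start + blk->size)
			break;
		blk = NULL;
	}
	return blk;
}
```

With two free blocks covering [0, 16) and [32, 48), a request for [20, 24) fits in neither. The buggy version still hands back the second block, which is the situation the patch description says corrupts the management structures; the fixed version returns NULL so the caller can report -ENOMEM.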
[PATCH] fork_init: fix division by zero
The following patch fixes a divide-by-zero error for the case of really big PAGE_SIZEs (e.g. 256KB on ppc44x). Support for such big page sizes on 44x is not present in the current kernel yet, but is coming soon.

This patch also fixes the comment for the max_threads setting, as it didn't match what the code actually does.

Signed-off-by: Yuri Tikhonov [EMAIL PROTECTED]
Signed-off-by: Ilya Yanok [EMAIL PROTECTED]
---
 kernel/fork.c |    8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 2a372a0..b0ac2fb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -181,10 +181,14 @@ void __init fork_init(unsigned long mempages)
 
 	/*
 	 * The default maximum number of threads is set to a safe
-	 * value: the thread structures can take up at most half
-	 * of memory.
+	 * value: the thread structures can take up at most
+	 * (1/8) part of memory.
 	 */
+#if (8 * THREAD_SIZE) > PAGE_SIZE
 	max_threads = mempages / (8 * THREAD_SIZE / PAGE_SIZE);
+#else
+	max_threads = mempages * PAGE_SIZE / (8 * THREAD_SIZE);
+#endif
 
 	/*
 	 * we need to allow at least 20 threads to boot a system
-- 
1.5.6.1
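The arithmetic behind the fork_init() fix can be checked with a small user-space sketch. It is an illustration under stated assumptions: the functions take the sizes as run-time parameters (the patch selects the formula with the preprocessor instead), and the names old_divisor() and max_threads_sketch() and the 8KB thread size used in the note below are hypothetical.

```c
#include <assert.h>

/*
 * Divisor used by the old formula. Integer division truncates, so it
 * becomes 0 as soon as page_size exceeds 8 * thread_size.
 */
static unsigned long old_divisor(unsigned long thread_size,
				 unsigned long page_size)
{
	assert(page_size > 0);
	return 8 * thread_size / page_size;
}

/*
 * Run-time version of the patched computation: pick the formula whose
 * divisor cannot truncate to zero.
 */
static unsigned long max_threads_sketch(unsigned long mempages,
					unsigned long thread_size,
					unsigned long page_size)
{
	assert(thread_size > 0 && page_size > 0);
	if (8 * thread_size > page_size)
		return mempages / (8 * thread_size / page_size);
	return mempages * page_size / (8 * thread_size);
}
```

Assuming an 8KB thread size, 4KB pages give a divisor of 16, while 256KB pages give 65536 / 262144, which truncates to 0 and is the division by zero the patch avoids. With the second formula, 2048 pages of 256KB yield 2048 * 262144 / 65536 = 8192 threads.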
[PATCH 0/8] Fix a bug and cleanup NUMA boot-time code
The first patch in this series is a genuine bug fix. The rest are really just an RFC.

Jon introduced a bug a while ago. I introduced another when trying to fix Jon's bug. I refuse to accept personal blame for this and, instead, blame the code. :)

The rest of the series are cleanups that I think will help clarify the code in numa.c and work to ensure that the next bonehead like me is not as able to easily muck up the code. :)

The cleanups increase in impact and intrusiveness as the series goes along, so please consider them an RFC. But, what I really want to figure out is a safer way to initialize NODE_DATA() and start using it as we bring up bootmem on all the nodes.
[PATCH 2/8] Add better comment on careful_allocation()
The behavior in careful_allocation() really confused me at first. Add a comment to hopefully make it easier on the next doofus that looks at it.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation4 arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation4 2008-12-09 10:16:05.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:05.0 -0800
@@ -840,8 +840,16 @@ static void __init *careful_allocation(i
 				size, nid);
 
 	/*
-	 * If the memory came from a previously allocated node, we must
-	 * retry with the bootmem allocator.
+	 * We initialize the nodes in numeric order: 0, 1, 2...
+	 * and hand over control from the LMB allocator to the
+	 * bootmem allocator.  If this function is called for
+	 * node 5, then we know that all nodes <5 are using the
+	 * bootmem allocator instead of the LMB allocator.
+	 *
+	 * So, check the nid from which this allocation came
+	 * and double check to see if we need to use bootmem
+	 * instead of the LMB.  We don't free the LMB memory
+	 * since it would be useless.
 	 */
 	new_nid = early_pfn_to_nid(ret >> PAGE_SHIFT);
 	if (new_nid < nid) {
_
[PATCH 6/8] cleanup do_init_bootmem()
I'm debating whether this is worth it. It makes this a bit more clean looking, but doesn't seriously enhance readability. But, I do think it helps a bit.

Thoughts?

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |  104 +++---
 1 file changed, 55 insertions(+), 49 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation3 arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation3 2008-12-09 10:16:07.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:07.0 -0800
@@ -938,6 +938,59 @@ static void mark_reserved_regions_for_ni
 	}
 }
 
+void do_init_bootmem_node(int node)
+{
+	unsigned long start_pfn, end_pfn;
+	void *bootmem_vaddr;
+	unsigned long bootmap_pages;
+
+	dbg("node %d is online\n", nid);
+	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
+
+	/*
+	 * Allocate the node structure node local if possible
+	 *
+	 * Be careful moving this around, as it relies on all
+	 * previous nodes' bootmem to be initialized and have
+	 * all reserved areas marked.
+	 */
+	NODE_DATA(nid) = careful_zallocation(nid,
+			sizeof(struct pglist_data),
+			SMP_CACHE_BYTES, end_pfn);
+
+	dbg("node %d\n", nid);
+	dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
+
+	NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
+	NODE_DATA(nid)->node_start_pfn = start_pfn;
+	NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
+
+	if (NODE_DATA(nid)->node_spanned_pages == 0)
+		return;
+
+	dbg("start_paddr = %lx\n", start_pfn << PAGE_SHIFT);
+	dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
+
+	bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
+	bootmem_vaddr = careful_zallocation(nid,
+			bootmap_pages << PAGE_SHIFT,
+			PAGE_SIZE, end_pfn);
+
+	dbg("bootmap_vaddr = %p\n", bootmem_vaddr);
+
+	init_bootmem_node(NODE_DATA(nid),
+			  __pa(bootmem_vaddr) >> PAGE_SHIFT,
+			  start_pfn, end_pfn);
+
+	free_bootmem_with_active_regions(nid, end_pfn);
+	/*
+	 * Be very careful about moving this around.  Future
+	 * calls to careful_zallocation() depend on this getting
+	 * done correctly.
+	 */
+	mark_reserved_regions_for_nid(nid);
+	sparse_memory_present_with_active_regions(nid);
+}
 
 void __init do_init_bootmem(void)
 {
@@ -958,55 +1011,8 @@ void __init do_init_bootmem(void)
 			  (void *)(unsigned long)boot_cpuid);
 
 	for_each_online_node(nid) {
-		unsigned long start_pfn, end_pfn;
-		void *bootmem_vaddr;
-		unsigned long bootmap_pages;
-
-		get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
-
-		/*
-		 * Allocate the node structure node local if possible
-		 *
-		 * Be careful moving this around, as it relies on all
-		 * previous nodes' bootmem to be initialized and have
-		 * all reserved areas marked.
-		 */
-		NODE_DATA(nid) = careful_zallocation(nid,
-					sizeof(struct pglist_data),
-					SMP_CACHE_BYTES, end_pfn);
-
-		dbg("node %d\n", nid);
-		dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
-
-		NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
-		NODE_DATA(nid)->node_start_pfn = start_pfn;
-		NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
-
-		if (NODE_DATA(nid)->node_spanned_pages == 0)
-			continue;
-
-		dbg("start_paddr = %lx\n", start_pfn << PAGE_SHIFT);
-		dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
-
-		bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-		bootmem_vaddr = careful_zallocation(nid,
-					bootmap_pages << PAGE_SHIFT,
-					PAGE_SIZE, end_pfn);
-
-		dbg("bootmap_vaddr = %p\n", bootmem_vaddr);
-
-		init_bootmem_node(NODE_DATA(nid),
-				  __pa(bootmem_vaddr) >> PAGE_SHIFT,
-				  start_pfn, end_pfn);
-
-		free_bootmem_with_active_regions(nid, end_pfn);
-		/*
-		 * Be very careful about moving this around.  Future
-		 * calls to careful_zallocation() depend on this getting
-		 * done correctly.
-		 */
-		mark_reserved_regions_for_nid(nid);
-		sparse_memory_present_with_active_regions(nid);
+		dbg("node %d: marked online, initializing bootmem\n", nid);
+
[PATCH 4/8] make careful_allocation() return vaddrs
Since we memset() the result in both of the uses here, just make careful_allocation() return a virtual address. Also, add a separate variable to store the physical address that comes back from the lmb_alloc() functions. This makes it less likely that someone will screw it up by forgetting to convert before returning, since the vaddr is always in a void* and the paddr is always in an unsigned long.

I admit this is arbitrary since one of its users needs a paddr and one a vaddr, but it does remove a good number of casts.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   37 --
 1 file changed, 20 insertions(+), 17 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation1 arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation1 2008-12-09 10:16:06.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:06.0 -0800
@@ -822,23 +822,28 @@ static void __init dump_numa_memory_topo
  * required. nid is the preferred node and end is the physical address of
  * the highest address in the node.
  *
- * Returns the physical address of the memory.
+ * Returns the virtual address of the memory.
  */
 static void __init *careful_allocation(int nid, unsigned long size,
 				       unsigned long align,
 				       unsigned long end_pfn)
 {
+	void *ret;
 	int new_nid;
-	unsigned long ret = __lmb_alloc_base(size, align, end_pfn << PAGE_SHIFT);
+	unsigned long ret_paddr;
+
+	ret_paddr = __lmb_alloc_base(size, align, end_pfn << PAGE_SHIFT);
 
 	/* retry over all memory */
-	if (!ret)
-		ret = __lmb_alloc_base(size, align, lmb_end_of_DRAM());
+	if (!ret_paddr)
+		ret_paddr = __lmb_alloc_base(size, align, lmb_end_of_DRAM());
 
-	if (!ret)
+	if (!ret_paddr)
 		panic("numa.c: cannot allocate %lu bytes for node %d",
 				size, nid);
 
+	ret = __va(ret_paddr);
+
 	/*
 	 * We initialize the nodes in numeric order: 0, 1, 2...
 	 * and hand over control from the LMB allocator to the
@@ -851,17 +856,15 @@ static void __init *careful_allocation(i
 	 * instead of the LMB.  We don't free the LMB memory
 	 * since it would be useless.
 	 */
-	new_nid = early_pfn_to_nid(ret >> PAGE_SHIFT);
+	new_nid = early_pfn_to_nid(ret_paddr >> PAGE_SHIFT);
 	if (new_nid < nid) {
-		ret = (unsigned long)__alloc_bootmem_node(NODE_DATA(new_nid),
+		ret = __alloc_bootmem_node(NODE_DATA(new_nid),
 				size, align, 0);
 
-		ret = __pa(ret);
-
-		dbg("alloc_bootmem %lx %lx\n", ret, size);
+		dbg("alloc_bootmem %p %lx\n", ret, size);
 	}
 
-	return (void *)ret;
+	return ret;
 }
 
 static struct notifier_block __cpuinitdata ppc64_numa_nb = {
@@ -955,7 +958,7 @@ void __init do_init_bootmem(void)
 
 	for_each_online_node(nid) {
 		unsigned long start_pfn, end_pfn;
-		unsigned long bootmem_paddr;
+		void *bootmem_vaddr;
 		unsigned long bootmap_pages;
 
 		get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
@@ -970,7 +973,6 @@ void __init do_init_bootmem(void)
 		NODE_DATA(nid) = careful_allocation(nid,
 					sizeof(struct pglist_data),
 					SMP_CACHE_BYTES, end_pfn);
-		NODE_DATA(nid) = __va(NODE_DATA(nid));
 		memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
 
 		dbg("node %d\n", nid);
@@ -987,14 +989,15 @@ void __init do_init_bootmem(void)
 		dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
 
 		bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-		bootmem_paddr = (unsigned long)careful_allocation(nid,
+		bootmem_vaddr = careful_allocation(nid,
 					bootmap_pages << PAGE_SHIFT,
 					PAGE_SIZE, end_pfn);
-		memset(__va(bootmem_paddr), 0, bootmap_pages << PAGE_SHIFT);
+		memset(bootmem_vaddr, 0, bootmap_pages << PAGE_SHIFT);
 
-		dbg("bootmap_paddr = %lx\n", bootmem_paddr);
+		dbg("bootmap_vaddr = %p\n", bootmem_vaddr);
 
-		init_bootmem_node(NODE_DATA(nid), bootmem_paddr >> PAGE_SHIFT,
+		init_bootmem_node(NODE_DATA(nid),
+				  __pa(bootmem_vaddr) >> PAGE_SHIFT,
 				  start_pfn, end_pfn);
 
 		free_bootmem_with_active_regions(nid, end_pfn);
_
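The convention described in this patch (virtual addresses always travel as void *, physical addresses always as unsigned long) can be tried out in user space. The sketch below rests on assumptions: SKETCH_PAGE_OFFSET is a made-up linear-map offset, and sketch_va(), sketch_pa() and sketch_vaddr_to_pfn() are hypothetical stand-ins for the kernel's __va()/__pa() conversions, not their real definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical linear-map offset; real kernels define their own value. */
#define SKETCH_PAGE_OFFSET 0xc0000000UL

/* Physical -> virtual: the result is a pointer, never an integer. */
static void *sketch_va(unsigned long paddr)
{
	return (void *)(uintptr_t)(paddr + SKETCH_PAGE_OFFSET);
}

/* Virtual -> physical: the result is an integer, never a pointer. */
static unsigned long sketch_pa(const void *vaddr)
{
	assert((uintptr_t)vaddr >= SKETCH_PAGE_OFFSET);
	return (unsigned long)((uintptr_t)vaddr - SKETCH_PAGE_OFFSET);
}

/*
 * Mirrors the call-site pattern in the patch: keep the pointer around
 * and derive the page frame number only where one is needed.
 */
static unsigned long sketch_vaddr_to_pfn(const void *vaddr,
					 unsigned int page_shift)
{
	return sketch_pa(vaddr) >> page_shift;
}
```

Because the two kinds of address have different C types, passing one where the other is expected needs an explicit cast, so a forgotten conversion tends to show up as a compiler diagnostic rather than at boot.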
[PATCH 5/8] cleanup careful_allocation(): consolidate memset()
Both users of careful_allocation() immediately memset() the result. So, just do it in one place.

Also give careful_allocation() a 'z' in its name to bring it in line with kzalloc() and friends.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation2 arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation2 2008-12-09 10:16:06.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:06.0 -0800
@@ -824,7 +824,7 @@ static void __init dump_numa_memory_topo
  *
  * Returns the virtual address of the memory.
  */
-static void __init *careful_allocation(int nid, unsigned long size,
+static void __init *careful_zallocation(int nid, unsigned long size,
 				       unsigned long align,
 				       unsigned long end_pfn)
 {
@@ -864,6 +864,7 @@ static void __init *careful_allocation(i
 		dbg("alloc_bootmem %p %lx\n", ret, size);
 	}
 
+	memset(ret, 0, size);
 	return ret;
 }
 
@@ -970,10 +971,9 @@ void __init do_init_bootmem(void)
 		 * previous nodes' bootmem to be initialized and have
 		 * all reserved areas marked.
 		 */
-		NODE_DATA(nid) = careful_allocation(nid,
+		NODE_DATA(nid) = careful_zallocation(nid,
 					sizeof(struct pglist_data),
 					SMP_CACHE_BYTES, end_pfn);
-		memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
 
 		dbg("node %d\n", nid);
 		dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
@@ -989,10 +989,9 @@ void __init do_init_bootmem(void)
 		dbg("end_paddr = %lx\n", end_pfn << PAGE_SHIFT);
 
 		bootmap_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-		bootmem_vaddr = careful_allocation(nid,
+		bootmem_vaddr = careful_zallocation(nid,
 					bootmap_pages << PAGE_SHIFT,
 					PAGE_SIZE, end_pfn);
-		memset(bootmem_vaddr, 0, bootmap_pages << PAGE_SHIFT);
 
 		dbg("bootmap_vaddr = %p\n", bootmem_vaddr);
 
@@ -1003,7 +1002,7 @@ void __init do_init_bootmem(void)
 		free_bootmem_with_active_regions(nid, end_pfn);
 		/*
 		 * Be very careful about moving this around.  Future
-		 * calls to careful_allocation() depend on this getting
+		 * calls to careful_zallocation() depend on this getting
 		 * done correctly.
 		 */
 		mark_reserved_regions_for_nid(nid);
_
[PATCH 1/8] fix bootmem reservation on uninitialized node
careful_allocation() was calling into the bootmem allocator for nodes which had not been fully initialized and caused a previous bug.

http://patchwork.ozlabs.org/patch/10528/

So, I merged a few broken out loops in do_init_bootmem() to fix it. That changed the code ordering.

I think this bug is triggered by having reserved areas for a node which are spanned by another node's contents. In the mark_reserved_regions_for_nid() code, we attempt to reserve the area for a node before we have allocated the NODE_DATA() for that nid. We do this since I reordered that loop. I suck.

This may only present on some systems that have 16GB pages reserved. But, it can probably happen on any system that is trying to reserve large swaths of memory that happen to span other nodes' contents.

This patch ensures that we do not touch bootmem for any node which has not been initialized.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff -puN arch/powerpc/mm/numa.c~fix-bad-node-reserve arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~fix-bad-node-reserve 2008-12-09 10:16:04.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:04.0 -0800
@@ -870,6 +870,7 @@ static void mark_reserved_regions_for_ni
 	struct pglist_data *node = NODE_DATA(nid);
 	int i;
 
+	dbg("mark_reserved_regions_for_nid(%d) NODE_DATA: %p\n", nid, node);
 	for (i = 0; i < lmb.reserved.cnt; i++) {
 		unsigned long physbase = lmb.reserved.region[i].base;
 		unsigned long size = lmb.reserved.region[i].size;
@@ -901,10 +902,14 @@ static void mark_reserved_regions_for_ni
 			if (end_pfn > node_ar.end_pfn)
 				reserve_size = (node_ar.end_pfn << PAGE_SHIFT)
 					- (start_pfn << PAGE_SHIFT);
-			dbg("reserve_bootmem %lx %lx nid=%d\n", physbase,
-				reserve_size, node_ar.nid);
-			reserve_bootmem_node(NODE_DATA(node_ar.nid), physbase,
-						reserve_size, BOOTMEM_DEFAULT);
+			/*
+			 * Only worry about *this* node, others may not
+			 * yet have valid NODE_DATA().
+			 */
+			if (node_ar.nid == nid)
+				reserve_bootmem_node(NODE_DATA(node_ar.nid),
+						physbase, reserve_size,
+						BOOTMEM_DEFAULT);
 			/*
 			 * if reserved region is contained in the active region
 			 * then done.
_
[PATCH 7/8] less use of NODE_DATA()
The use of NODE_DATA() in the ppc init code is fragile. We use it for some nodes as we are initializing others. As the loop initializing them has gotten more complex and broken out into several functions it gets harder and harder to remember how this goes.

This was recently the cause of a bug

http://patchwork.ozlabs.org/patch/10528/

in which I also created a new regression for machines with large memory reservations in the LMB structures (most likely 16GB pages).

This patch reduces the references to NODE_DATA() and also keeps it uninitialized for as long as possible. Hopefully, the delay in initialization will help keep its use from spreading too much, reducing the chances for future bugs.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |   63 ++
 1 file changed, 31 insertions(+), 32 deletions(-)

diff -puN arch/powerpc/mm/numa.c~less-use-of-NODE_DATA arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~less-use-of-NODE_DATA 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:08.0 -0800
@@ -847,17 +847,16 @@ static void __init *careful_zallocation(
 	/*
 	 * We initialize the nodes in numeric order: 0, 1, 2...
 	 * and hand over control from the LMB allocator to the
-	 * bootmem allocator.  If this function is called for
-	 * node 5, then we know that all nodes <5 are using the
-	 * bootmem allocator instead of the LMB allocator.
+	 * bootmem allocator.
 	 *
-	 * So, check the nid from which this allocation came
-	 * and double check to see if we need to use bootmem
-	 * instead of the LMB.  We don't free the LMB memory
-	 * since it would be useless.
+	 * We must not call into the bootmem allocator for any node
+	 * which has not had bootmem initialized and had all of the
+	 * reserved areas set up.  In do_init_bootmem_node(), we do
+	 * not set NODE_DATA(nid) up until that is done.  Use that
+	 * property here.
 	 */
 	new_nid = early_pfn_to_nid(ret_paddr >> PAGE_SHIFT);
-	if (new_nid < nid) {
+	if (NODE_DATA(new_nid)) {
 		ret = __alloc_bootmem_node(NODE_DATA(new_nid),
 				size, align, 0);
 
@@ -873,12 +872,12 @@ static struct notifier_block __cpuinitda
 	.priority = 1 /* Must run before sched domains notifier. */
 };
 
-static void mark_reserved_regions_for_nid(int nid)
+static void mark_reserved_regions_for_node(struct pglist_data *node)
 {
-	struct pglist_data *node = NODE_DATA(nid);
+	int nid = node->node_id;
 	int i;
 
-	dbg("mark_reserved_regions_for_nid(%d) NODE_DATA: %p\n", nid, node);
+	dbg("%s(%d) NODE_DATA: %p\n", __func__, nid, node);
 	for (i = 0; i < lmb.reserved.cnt; i++) {
 		unsigned long physbase = lmb.reserved.region[i].base;
 		unsigned long size = lmb.reserved.region[i].size;
@@ -915,9 +914,8 @@ static void mark_reserved_regions_for_ni
 			 * yet have valid NODE_DATA().
 			 */
 			if (node_ar.nid == nid)
-				reserve_bootmem_node(NODE_DATA(node_ar.nid),
-						physbase, reserve_size,
-						BOOTMEM_DEFAULT);
+				reserve_bootmem_node(node, physbase,
+						reserve_size, BOOTMEM_DEFAULT);
 			/*
 			 * if reserved region is contained in the active region
 			 * then done.
@@ -938,8 +936,9 @@ static void mark_reserved_regions_for_ni
 	}
 }
 
-void do_init_bootmem_node(int node)
+void do_init_bootmem_node(int nid)
 {
+	struct pglist_data *node;
 	unsigned long start_pfn, end_pfn;
 	void *bootmem_vaddr;
 	unsigned long bootmap_pages;
@@ -954,18 +953,16 @@ void do_init_bootmem_node(int node)
 	 * previous nodes' bootmem to be initialized and have
 	 * all reserved areas marked.
 	 */
-	NODE_DATA(nid) = careful_zallocation(nid,
-			sizeof(struct pglist_data),
-			SMP_CACHE_BYTES, end_pfn);
-
-	dbg("node %d\n", nid);
-	dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
-
-	NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
-	NODE_DATA(nid)->node_start_pfn = start_pfn;
-	NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
+	node = careful_zallocation(nid, sizeof(struct pglist_data),
+				   SMP_CACHE_BYTES, end_pfn);
 
-	if (NODE_DATA(nid)->node_spanned_pages == 0)
+	dbg("node %d pgkist_data: %p\n", nid, node);
+
+	node->bdata = &bootmem_node_data[nid];
+	node->node_start_pfn = start_pfn;
+	node->node_spanned_pages = end_pfn - start_pfn;
+
+
[PATCH 8/8] make free_bootmem_with_active_regions() take pgdat
As I said earlier, I'm trying to restrict the use of NODE_DATA() since it can easily be referenced too early otherwise.

free_bootmem_with_active_regions() does not in practice need to deal with multiple nodes. I already audited all of its callers.

This patch makes it take a pgdat instead of doing the NODE_DATA() lookup internally.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/mips/sgi-ip27/ip27-memory.c |    2 +-
 linux-2.6.git-dave/arch/powerpc/mm/mem.c            |    5 +++--
 linux-2.6.git-dave/arch/powerpc/mm/numa.c           |    3 +--
 linux-2.6.git-dave/arch/s390/kernel/setup.c         |    2 +-
 linux-2.6.git-dave/arch/sh/mm/numa.c                |    2 +-
 linux-2.6.git-dave/arch/sparc64/mm/init.c           |    6 +++---
 linux-2.6.git-dave/arch/x86/mm/init_32.c            |    2 +-
 linux-2.6.git-dave/arch/x86/mm/init_64.c            |    2 +-
 linux-2.6.git-dave/arch/x86/mm/numa_64.c            |    2 +-
 linux-2.6.git-dave/include/linux/mm.h               |    2 +-
 linux-2.6.git-dave/mm/page_alloc.c                  |    8
 11 files changed, 18 insertions(+), 18 deletions(-)

diff -puN arch/mips/sgi-ip27/ip27-memory.c~make-free_bootmem_with_active_regions-take-pgdat arch/mips/sgi-ip27/ip27-memory.c
--- linux-2.6.git/arch/mips/sgi-ip27/ip27-memory.c~make-free_bootmem_with_active_regions-take-pgdat 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/mips/sgi-ip27/ip27-memory.c 2008-12-09 10:16:08.0 -0800
@@ -412,7 +412,7 @@ static void __init node_mem_init(cnodeid
 	bootmap_size = init_bootmem_node(NODE_DATA(node), slot_freepfn,
 					start_pfn, end_pfn);
-	free_bootmem_with_active_regions(node, end_pfn);
+	free_bootmem_with_active_regions(NODE_DATA(node), end_pfn);
 	reserve_bootmem_node(NODE_DATA(node), slot_firstpfn << PAGE_SHIFT,
 			((slot_freepfn - slot_firstpfn) << PAGE_SHIFT) + bootmap_size,
 			BOOTMEM_DEFAULT);
diff -puN arch/powerpc/mm/mem.c~make-free_bootmem_with_active_regions-take-pgdat arch/powerpc/mm/mem.c
--- linux-2.6.git/arch/powerpc/mm/mem.c~make-free_bootmem_with_active_regions-take-pgdat 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/mem.c 2008-12-09 10:16:08.0 -0800
@@ -212,7 +212,8 @@ void __init do_init_bootmem(void)
 	 * present.
 	 */
 #ifdef CONFIG_HIGHMEM
-	free_bootmem_with_active_regions(0, lowmem_end_addr >> PAGE_SHIFT);
+	free_bootmem_with_active_regions(NODE_DATA(0),
+					 lowmem_end_addr >> PAGE_SHIFT);
 
 	/* reserve the sections we're already using */
 	for (i = 0; i < lmb.reserved.cnt; i++) {
@@ -230,7 +231,7 @@ void __init do_init_bootmem(void)
 		}
 	}
 #else
-	free_bootmem_with_active_regions(0, max_pfn);
+	free_bootmem_with_active_regions(NODE_DATA(0), max_pfn);
 
 	/* reserve the sections we're already using */
 	for (i = 0; i < lmb.reserved.cnt; i++)
diff -puN arch/powerpc/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:08.0 -0800
@@ -978,8 +978,6 @@ void do_init_bootmem_node(int nid)
 	init_bootmem_node(node, __pa(bootmem_vaddr) >> PAGE_SHIFT,
 			  start_pfn, end_pfn);
 
-	NODE_DATA(nid) = node;
-	/* this call needs NODE_DATA(), so initialize it above */
 	free_bootmem_with_active_regions(nid, end_pfn);
 	mark_reserved_regions_for_node(node);
 	/*
@@ -988,6 +986,7 @@ void do_init_bootmem_node(int nid)
 	 * careful_zallocation() depends on this getting set
 	 * now to tell from which nodes it must use bootmem.
 	 */
+	NODE_DATA(nid) = node;
 	sparse_memory_present_with_active_regions(nid);
 }
 
diff -puN arch/s390/kernel/setup.c~make-free_bootmem_with_active_regions-take-pgdat arch/s390/kernel/setup.c
--- linux-2.6.git/arch/s390/kernel/setup.c~make-free_bootmem_with_active_regions-take-pgdat 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/s390/kernel/setup.c 2008-12-09 10:16:08.0 -0800
@@ -616,7 +616,7 @@ setup_memory(void)
 
 	psw_set_key(PAGE_DEFAULT_KEY);
 
-	free_bootmem_with_active_regions(0, max_pfn);
+	free_bootmem_with_active_regions(NODE_DATA(0), max_pfn);
 
 	/*
 	 * Reserve memory used for lowcore/command line/kernel image.
diff -puN arch/sh/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat arch/sh/mm/numa.c
--- linux-2.6.git/arch/sh/mm/numa.c~make-free_bootmem_with_active_regions-take-pgdat 2008-12-09 10:16:08.0 -0800
+++ linux-2.6.git-dave/arch/sh/mm/numa.c 2008-12-09 10:16:08.0 -0800
@@ -75,7 +75,7 @@ void
[PATCH 3/8] cleanup careful_allocation(): bootmem already panics
If we fail a bootmem allocation, the bootmem code itself panics. No need to redo it here.

Also change the wording of the other panic. We don't strictly have to allocate memory on the specified node. It is just a hint and that node may not even *have* any memory on it. In that case we can and do fall back to other nodes.

Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---

 linux-2.6.git-dave/arch/powerpc/mm/numa.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff -puN arch/powerpc/mm/numa.c~cleanup-careful_allocation arch/powerpc/mm/numa.c
--- linux-2.6.git/arch/powerpc/mm/numa.c~cleanup-careful_allocation 2008-12-09 10:16:05.0 -0800
+++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2008-12-09 10:16:05.0 -0800
@@ -836,7 +836,7 @@ static void __init *careful_allocation(i
 		ret = __lmb_alloc_base(size, align, lmb_end_of_DRAM());
 
 	if (!ret)
-		panic("numa.c: cannot allocate %lu bytes on node %d",
+		panic("numa.c: cannot allocate %lu bytes for node %d",
 		      size, nid);
 
 	/*
@@ -856,10 +856,6 @@ static void __init *careful_allocation(i
 		ret = (unsigned long)__alloc_bootmem_node(NODE_DATA(new_nid),
 				size, align, 0);
 
-		if (!ret)
-			panic("numa.c: cannot allocate %lu bytes on node %d",
-			      size, new_nid);
-
 		ret = __pa(ret);
 
 		dbg("alloc_bootmem %lx %lx\n", ret, size);
_
___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[RFC/PATCH] powerpc: consistent memory mapping.
Defining the start virtual address of the consistent memory in the config leads to overlap of the consistent area with the other virtual regions (fixmap, pkmap, vmalloc). The defaults in the current kernel just place the consistent memory area somewhere high in the vmalloc area, and then you need to pray that there will not be enough vmalloc allocations to overlap it.

So this patch makes the virtual address of the consistent memory be assigned dynamically, at the end of the virtual address space. The fixmap area is now shifted to lower addresses and ends before the start of the consistent virtual addresses. The user is now allowed to configure only the size of the consistent memory area.

An exception has been made for 8xx, where the start of the consistent memory is still configurable: this is to avoid overlapping with the IMM space of 8xx. Actually this is wrong. We may get an overlap not only for consistent memory but for the IMM space too. But we don't have much expertise in 8xx, so we are looking forward to some advice here.

The following items remain to be done to support consistent memory fully:

a) we are missing 1 (the last) page of addresses at the end of the consistent memory area;

b) if CONFIG_CONSISTENT_SIZE is such that we cover more address regions than are served by 1 pgd level, then mapping of the pages to these additional areas won't work (this 'feature' isn't introduced by this patch, but is a consequence of the current consistent memory support code, where consistent_pte is set in dma_alloc_init() in accordance with the pgd of the CONSISTENT_BASE address).
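For illustration only, the address arithmetic behind the dynamically placed area can be sketched in plain C. This mirrors the CONSISTENT_BASE, CONSISTENT_END and FIXADDR_TOP expressions from the diff in this message; the 4K page size and 2 MB pool size are assumed values, and every ex_/EX_ name is hypothetical rather than a kernel symbol.

```c
#include <stdint.h>

/* Assumed example values, not taken from any particular board config. */
#define EX_PAGE_SIZE        0x00001000u  /* 4K pages */
#define EX_CONSISTENT_SIZE  0x00200000u  /* 2 MB consistent pool */

/* Mirrors CONSISTENT_BASE = (unsigned long)(-CONFIG_CONSISTENT_SIZE) on 32-bit. */
static uint32_t ex_consistent_base(void)
{
	return 0u - EX_CONSISTENT_SIZE;
}

/* Mirrors CONSISTENT_END = (unsigned long)(-PAGE_SIZE). */
static uint32_t ex_consistent_end(void)
{
	return 0u - EX_PAGE_SIZE;
}

/* Mirrors FIXADDR_TOP = (-PAGE_SIZE - CONFIG_CONSISTENT_SIZE). */
static uint32_t ex_fixaddr_top(void)
{
	return 0u - EX_PAGE_SIZE - EX_CONSISTENT_SIZE;
}

/* Bytes actually usable between base and end. */
static uint32_t ex_consistent_usable(void)
{
	return ex_consistent_end() - ex_consistent_base();
}
```

With these assumed values the usable span comes out one page short of the configured size, which is the gap item a) refers to, and the fixmap top sits one page below the start of the consistent area.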
Signed-off-by: Ilya Yanok [EMAIL PROTECTED] Signed-off-by: Yuri Tikhonov [EMAIL PROTECTED] --- arch/powerpc/Kconfig |7 --- arch/powerpc/lib/dma-noncoherent.c |5 + arch/powerpc/mm/pgtable_32.c |2 +- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index aa2eb46..4d62446 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -809,7 +809,7 @@ config TASK_SIZE config CONSISTENT_START_BOOL bool Set custom consistent memory pool address - depends on ADVANCED_OPTIONS NOT_COHERENT_CACHE + depends on ADVANCED_OPTIONS NOT_COHERENT_CACHE 8xx help This option allows you to set the base virtual address of the consistent memory pool. This pool of virtual @@ -817,8 +817,8 @@ config CONSISTENT_START_BOOL config CONSISTENT_START hex Base virtual address of consistent memory pool if CONSISTENT_START_BOOL - default 0xfd00 if (NOT_COHERENT_CACHE 8xx) - default 0xff10 if NOT_COHERENT_CACHE + depends on 8xx + default 0xfd00 if NOT_COHERENT_CACHE config CONSISTENT_SIZE_BOOL bool Set custom consistent memory pool size @@ -831,6 +831,7 @@ config CONSISTENT_SIZE_BOOL config CONSISTENT_SIZE hex Size of consistent memory pool if CONSISTENT_SIZE_BOOL default 0x0020 if NOT_COHERENT_CACHE + default 0x if !NOT_COHERENT_CACHE config PIN_TLB bool Pinned Kernel TLBs (860 ONLY) diff --git a/arch/powerpc/lib/dma-noncoherent.c b/arch/powerpc/lib/dma-noncoherent.c index 31734c0..3c12577 100644 --- a/arch/powerpc/lib/dma-noncoherent.c +++ b/arch/powerpc/lib/dma-noncoherent.c @@ -38,8 +38,13 @@ * can be further configured for specific applications under * the Advanced Setup menu. 
-Matt */ +#ifdef CONFIG_CONSISTENT_START #define CONSISTENT_BASE(CONFIG_CONSISTENT_START) #define CONSISTENT_END (CONFIG_CONSISTENT_START + CONFIG_CONSISTENT_SIZE) +#else +#define CONSISTENT_BASE((unsigned long)(-CONFIG_CONSISTENT_SIZE)) +#define CONSISTENT_END ((unsigned long)(-PAGE_SIZE)) +#endif /* CONFIG_CONSISTENT_START */ #define CONSISTENT_OFFSET(x) (((unsigned long)(x) - CONSISTENT_BASE) PAGE_SHIFT) /* diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 10d21c3..fda24c7 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -395,7 +395,7 @@ void kernel_map_pages(struct page *page, int numpages, int enable) #endif /* CONFIG_DEBUG_PAGEALLOC */ static int fixmaps; -unsigned long FIXADDR_TOP = (-PAGE_SIZE); +unsigned long FIXADDR_TOP = (-PAGE_SIZE-CONFIG_CONSISTENT_SIZE); EXPORT_SYMBOL(FIXADDR_TOP); void __set_fixmap (enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags) -- 1.5.6.1 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[PATCH] powerpc: Remove `have_of' global variable
The `have_of' variable is a relic from the arch/ppc time, it isn't useful nowadays. Signed-off-by: Anton Vorontsov [EMAIL PROTECTED] --- arch/powerpc/include/asm/processor.h |2 -- arch/powerpc/kernel/pci-common.c |2 -- arch/powerpc/kernel/pci_32.c |7 +-- arch/powerpc/kernel/setup_32.c |2 -- arch/powerpc/kernel/setup_64.c |1 - fs/proc/proc_devtree.c |3 +-- 6 files changed, 2 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index cd7a478..d346649 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -69,8 +69,6 @@ extern int _prep_type; #ifdef __KERNEL__ -extern int have_of; - struct task_struct; void start_thread(struct pt_regs *regs, unsigned long fdptr, unsigned long sp); void release_thread(struct task_struct *); diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index 91c3f52..1a32db3 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -160,8 +160,6 @@ EXPORT_SYMBOL(pci_domain_nr); */ struct pci_controller* pci_find_hose_for_OF_device(struct device_node* node) { - if (!have_of) - return NULL; while(node) { struct pci_controller *hose, *tmp; list_for_each_entry_safe(hose, tmp, hose_list, list_node) diff --git a/arch/powerpc/kernel/pci_32.c b/arch/powerpc/kernel/pci_32.c index 7ad11e5..132cd80 100644 --- a/arch/powerpc/kernel/pci_32.c +++ b/arch/powerpc/kernel/pci_32.c @@ -266,9 +266,6 @@ pci_busdev_to_OF_node(struct pci_bus *bus, int devfn) { struct device_node *parent, *np; - if (!have_of) - return NULL; - pr_debug(pci_busdev_to_OF_node(%d,0x%x)\n, bus-number, devfn); parent = scan_OF_for_pci_bus(bus); if (parent == NULL) @@ -309,8 +306,6 @@ pci_device_from_OF_node(struct device_node* node, u8* bus, u8* devfn) struct pci_controller* hose; struct pci_dev* dev = NULL; - if (!have_of) - return -ENODEV; /* Make sure it's really a PCI device */ hose = pci_find_hose_for_OF_device(node); if 
(!hose || !hose-dn) @@ -431,7 +426,7 @@ static int __init pcibios_init(void) * numbers vs. kernel bus numbers since we may have to * remap them. */ - if (pci_assign_all_buses have_of) + if (pci_assign_all_buses) pcibios_make_OF_bus_map(); /* Call common code to handle resource allocation */ diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c index c1a2762..cc4679e 100644 --- a/arch/powerpc/kernel/setup_32.c +++ b/arch/powerpc/kernel/setup_32.c @@ -53,8 +53,6 @@ unsigned long ISA_DMA_THRESHOLD; unsigned int DMA_MODE_READ; unsigned int DMA_MODE_WRITE; -int have_of = 1; - #ifdef CONFIG_VGA_CONSOLE unsigned long vgacon_remap_base; EXPORT_SYMBOL(vgacon_remap_base); diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 93c875a..ce48f5c 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -70,7 +70,6 @@ #define DBG(fmt...) #endif -int have_of = 1; int boot_cpuid = 0; u64 ppc64_pft_size; diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c index d89..de2bba5 100644 --- a/fs/proc/proc_devtree.c +++ b/fs/proc/proc_devtree.c @@ -218,8 +218,7 @@ void proc_device_tree_add_node(struct device_node *np, void __init proc_device_tree_init(void) { struct device_node *root; - if ( !have_of ) - return; + proc_device_tree = proc_mkdir(device-tree, NULL); if (proc_device_tree == 0) return; -- 1.5.6.5 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] powerpc: Add SMP support to no-hash TLB handling v3
On Tue, 2008-12-09 at 07:10 -0600, Kumar Gala wrote:
> > +void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
> > +{
> > +	unsigned int pid;
> > +
> > +	preempt_disable();
> > +	pid = vma ? vma->vm_mm->context.id : 0;
> > +	if (pid != MMU_NO_CONTEXT)
> > +		_tlbil_va(vmaddr, pid);
> > +	preempt_enable();
> > +}
> > +EXPORT_SYMBOL(local_flush_tlb_page);
>
> We are using this in highmem.h for kmap_atomic.. So you need to fix that call site.

Ah yes, I forgot, will fix, thanks.

Cheers,
Ben.
Re: [PATCH 0/9] powerpc: Preliminary work to enable SMP BookE
On Tue, 2008-12-09 at 07:17 -0600, Kumar Gala wrote:
> > There are some seemingly unrelated patches in the pile as they are dependencies of the main ones so I'm including them in.
>
> You'll be happy to know these patches at least boot on real 85xx SMP HW.

Ah excellent! Now time for you to torture test them :-)

BTW, don't you guys support larger than 8-bit PIDs on some E500 cores? The latest patch I posted yesterday should allow slipping that in easily too.

Cheers,
Ben.
Re: [PATCH] powerpc: add 16K/64K pages support for the 44x PPC32 architectures.
Hi Ilya ! Looks good overall. A few minor comments. +config PPC_4K_PAGES + bool 4k page size + +config PPC_16K_PAGES + bool 16k page size if 44x + +config PPC_64K_PAGES + bool 64k page size if 44x || PPC64 + select PPC_HAS_HASH_64K if PPC64 I'd rather if the PPC64 references were instead PPC_STD_MMU_64 (which may or may not be defined in Kconfig depending on what you are based on, but is trivial to add. I want to clearly differenciate what is MMU from what CPU architecture and there may (will ... ahem) at some point be 64-bit BookE. In the same vein, we should probably rework some of the above so that the CPU/MMU type actually defines what page sizes are allowed (PPC_CAN_16K, PPC_CAN_64K, ...) but let's keep that for a later patch. config PPC_SUBPAGE_PROT bool Support setting protections for 4k subpages - depends on PPC_64K_PAGES + depends on PPC64 PPC_64K_PAGES help This option adds support for a system call to allow user programs to set access permissions (read/write, readonly, or no access) Same comment here. diff --git a/arch/powerpc/include/asm/highmem.h b/arch/powerpc/include/asm/highmem.h index 91c5895..9875540 100644 --- a/arch/powerpc/include/asm/highmem.h +++ b/arch/powerpc/include/asm/highmem.h @@ -38,9 +38,20 @@ extern pte_t *pkmap_page_table; * easily, subsequent pte tables have to be allocated in one physical * chunk of RAM. */ -#define LAST_PKMAP (1 PTE_SHIFT) -#define LAST_PKMAP_MASK (LAST_PKMAP-1) +/* + * We use one full pte table with 4K pages. And with 16K/64K pages pte + * table covers enough memory (32MB and 512MB resp.) that both FIXMAP + * and PKMAP can be placed in single pte table. We use 1024 pages for + * PKMAP in case of 16K/64K pages. 
+ */ +#define PKMAP_ORDER min(PTE_SHIFT, 10) +#define LAST_PKMAP (1 PKMAP_ORDER) +#if !defined(CONFIG_PPC_4K_PAGES) +#define PKMAP_BASE (FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) +#else #define PKMAP_BASE ((FIXADDR_START - PAGE_SIZE*(LAST_PKMAP + 1)) PMD_MASK) +#endif I'm not sure about the above PMD_MASK. Shouldn't we instead make it not build if (PKMAP_BASE PMD_MASK) != 0 ? IE, somebody set FIXADDR_START to something wrong... and avoid the ifdef alltogether ? Or am I missing something ? (it's early morning and I may not have all my wits with me right now !) -#ifdef CONFIG_PPC_64K_PAGES +#if defined(CONFIG_PPC_64K_PAGES) defined(CONFIG_PPC64) typedef struct { pte_t pte; unsigned long hidx; } real_pte_t; #else Same comment about using PPC_STD_MMU_64, it's going to make my life easier later on :-) And in various other places, I won't quote them all. diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h index d77072a..74b097b 100644 --- a/arch/powerpc/include/asm/page_32.h +++ b/arch/powerpc/include/asm/page_32.h @@ -19,6 +19,7 @@ #define PTE_FLAGS_OFFSET 0 #endif +#define PTE_SHIFT(PAGE_SHIFT - PTE_T_LOG2) /* full page */ #ifndef __ASSEMBLY__ Stick a blank line between the two above statements. 
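As a rough cross-check of the numbers in the quoted comment (32MB and 512MB covered by one pte table, 1024 PKMAP pages), here is a small C sketch of the arithmetic behind PTE_SHIFT and PKMAP_ORDER. It assumes 8-byte PTEs (as with CONFIG_PTE_64BIT) and a GCC/Clang compiler for __builtin_ffs; the ex_ helpers are hypothetical names and not part of the patch.

```c
#include <stddef.h>

/* log2 of a power-of-two size, using the same __builtin_ffs trick as the patch. */
static int ex_log2(size_t n)
{
	return __builtin_ffs((int)n) - 1;
}

/* PTE_SHIFT = PAGE_SHIFT - log2(sizeof(pte_t)): log2 of PTEs per full page. */
static int ex_pte_shift(int page_shift, size_t pte_size)
{
	return page_shift - ex_log2(pte_size);
}

/* PKMAP_ORDER = min(PTE_SHIFT, 10), i.e. at most 1024 PKMAP pages. */
static int ex_pkmap_order(int pte_shift)
{
	return pte_shift < 10 ? pte_shift : 10;
}

/* Bytes of virtual space covered by one full pte table. */
static unsigned long long ex_pte_table_span(int page_shift, size_t pte_size)
{
	return 1ULL << (ex_pte_shift(page_shift, pte_size) + page_shift);
}
```

Under the 8-byte assumption, 16K pages give a span of 32MB and 64K pages give 512MB, matching the comment, while 4K pages give a PKMAP order of 9, which is below the cap of 10 that the min() applies to the two larger page sizes.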
/* * The basic type of a PTE - 64 bits for those CPUs with 32 bit @@ -26,10 +27,8 @@ */ #ifdef CONFIG_PTE_64BIT typedef unsigned long long pte_basic_t; -#define PTE_SHIFT(PAGE_SHIFT - 3)/* 512 ptes per page */ #else typedef unsigned long pte_basic_t; -#define PTE_SHIFT(PAGE_SHIFT - 2)/* 1024 ptes per page */ #endif struct page; diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index dbb8ca1..a202043 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -39,6 +39,8 @@ extern void paging_init(void); #include asm-generic/pgtable.h +#define PGD_T_LOG2 (__builtin_ffs(sizeof(pgd_t)) - 1) +#define PTE_T_LOG2 (__builtin_ffs(sizeof(pte_t)) - 1) I'm surprised the above actually work :-) Why not having these next to the definition of pte_t in page_32.h ? Also, you end up having to do an asm-offset trick to get those to asm, I wonder if it's worth it or if we aren't better off just #defining the sizes with actual numbers next to the type definitions. No big deal either way. /* * To support 32-bit physical addresses, we use an 8KB pgdir. diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index bdc8b0e..42f99d2 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -647,8 +647,8 @@ _GLOBAL(__flush_dcache_icache) BEGIN_FTR_SECTION blr END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) - rlwinm r3,r3,0,0,19/* Get page base address */ - li r4,4096/L1_CACHE_BYTES /* Number of lines in a page */ + rlwinm r3,r3,0,0,PPC44x_RPN_MASK /* Get page base address */ + li r4,PAGE_SIZE/L1_CACHE_BYTES /* Number of lines in a page */ Now, the problem here is the name of the constant. IE. This is more or less generic
Re: [PATCH v5] spi: Add PPC4xx SPI driver
Stefan Roese wrote:
> This adds a SPI driver for the SPI controller found in the IBM/AMCC 4xx PowerPC's.
>
> Signed-off-by: Stefan Roese [EMAIL PROTECTED]
> Signed-off-by: Wolfgang Ocker [EMAIL PROTECTED]
> Acked-by: Josh Boyer [EMAIL PROTECTED]
> ---

I have a question as to how to use this driver. of_num_gpios() starts testing for gpio's at num = 0, and stops at the first invalid one. However, gpio numbers are apparently allocated dynamically from 255 down, meaning that there probably is no gpio-0.

For example, on my Sequoia board I have gpiochip176, gpiochip192, and gpiochip224. So, of_num_gpios() returns zero, even though there are 72 gpio's on my board.

This gets back to an earlier discussion about setting the gpio index of each controller, which was rejected, IIRC. If we could set the base gpio of each chip, we could start at zero and use consecutive numbers. Failing that, it seems that Stefan's SPI driver needs to probe the entire 0-255 gpio space.

How is this intended to work? An example .dts would be greatly appreciated.

Steve
Re: [PATCH 6/8] cleanup do_init_bootmem()
Quoting Dave Hansen ([EMAIL PROTECTED]):
> I'm debating whether this is worth it. It makes this a bit more clean looking, but doesn't seriously enhance readability. But, I do think it helps a bit. Thoughts?

Absolutely. do_init_bootmem_node() is *still* a bit largish, but far better broken out.
Re: [RFC/PATCH] powerpc: consistent memory mapping.
On Tue, 2008-12-09 at 21:23 +0300, Ilya Yanok wrote:
> Defining the start virtual address of the consistent memory in configs leads to overlapping of the consistent area with the other virtual regions (fixmap, pkmap, vmalloc). Defaults from current kernel just set consistent memory area to be somewhere high in the vmalloc area and then you need to pray there will be not enough vmalloc allocations to overlap.

.../...

What about just ripping that consistent memory implementation out completely and using the normal vmalloc/ioremap allocator instead? Any reason not to?

Cheers,
Ben.
Re: [PATCH v5] spi: Add PPC4xx SPI driver
Steven A. Falco wrote:
> Stefan Roese wrote:
> > This adds a SPI driver for the SPI controller found in the IBM/AMCC 4xx PowerPC's.
> >
> > Signed-off-by: Stefan Roese [EMAIL PROTECTED]
> > Signed-off-by: Wolfgang Ocker [EMAIL PROTECTED]
> > Acked-by: Josh Boyer [EMAIL PROTECTED]
> > ---
>
> How is this intended to work? An example .dts would be greatly appreciated.

Answered my own question. The gpios must be directly under the spi node rather than elsewhere in the tree. This works:

	SPI0: [EMAIL PROTECTED] {
		compatible = "ibm,ppc4xx-spi";
		reg = <ef600900 7>;
		interrupts = <8 4>;
		interrupt-parent = <&UIC0>;
		gpios = <&GPIO1 14 0>;
	};
Re: [PATCH] powerpc: Remove `have_of' global variable
On Tue, 2008-12-09 at 22:47 +0300, Anton Vorontsov wrote:
> The `have_of' variable is a relic from the arch/ppc time, it isn't useful nowadays.
>
> Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]

Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
Re: Re[2]: [PATCH 01/11] async_tx: don't use src_list argument of async_xor() for dma addresses
On Mon, Dec 8, 2008 at 5:41 PM, Yuri Tikhonov [EMAIL PROTECTED] wrote:
> On Tuesday, December 9, 2008 you wrote:
> > On Mon, Dec 8, 2008 at 2:55 PM, Yuri Tikhonov [EMAIL PROTECTED] wrote:
> > > Using src_list argument of async_xor() as a storage for dma addresses implies sizeof(dma_addr_t) <= sizeof(struct page *) restriction which is not always true (e.g. ppc440spe).
> >
> > ppc440spe runs with CONFIG_PHYS_64BIT?
>
> Yep. It uses 36-bit addressing, so this CONFIG is turned on.
>
> > If we do this then we need to also change md to limit the number of allowed disks based on the kernel stack size. Because with 256 disks a 4K stack can be consumed by one call to async_pq ((256 sources in raid5.c + 256 sources async_pq.c) * 8 bytes per source on 64-bit).
>
> On ppc440spe we have 8KB stack, so the things are not worse than on 32-bit archs with 4KB stack. Thus, I guess no changes to md are required because of this patch. Right?

8K stacks do make this less of an issue *provided* handle_stripe() remains only called from raid5d. We used to share some stripe handling work with the requester's process context, where the stack is much more crowded. So, we would now be more strongly tied to the raid5d-only approach... maybe that is not enough to deny this change.

Neil, what do you think of the async_{xor,pq,etc} apis allocating 'src_cnt' sized arrays on the stack?

Thanks,
Dan
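The stack estimate quoted above is plain arithmetic and can be written out as a tiny C helper. This is only a sketch of the calculation from the thread; ex_src_array_bytes and its parameters are hypothetical names, and the figures plugged in are the ones the message itself gives (256 sources in each of two arrays, 8 bytes per source).

```c
#include <stddef.h>

/*
 * Bytes of stack taken by two on-stack source arrays, one in the caller
 * (raid5.c in the thread) and one in the callee (async_pq.c in the thread).
 */
static size_t ex_src_array_bytes(size_t caller_srcs, size_t callee_srcs,
				 size_t bytes_per_src)
{
	return (caller_srcs + callee_srcs) * bytes_per_src;
}
```

Calling ex_src_array_bytes(256, 256, 8) gives 4096, which is the whole of a 4K stack and half of an 8K one; that is the margin being debated here.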
Re: [PATCH] ndfc driver
On Mon, 08 Dec 2008 21:57:12 -1000 Mitch Bradley [EMAIL PROTECTED] wrote:
> One address/size cell isn't enough for the next generation of NAND FLASH chips.

I am no dts expert, but I thought I could put:

	nand {
		#address-cells = <1>;
		#size-cells = <1>;

in my dts and you could put:

	nand {
		#address-cells = <2>;
		#size-cells = <2>;

and, assuming we specified the reg entry right, everything would just work.

Is that assumption wrong? And if the assumption is true, should I make a note in the doc that you can make the address and size bigger?

Cheers,
Sean
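To make the question concrete, here is a sketch of how a consumer that honours #address-cells and #size-cells could decode one reg entry of either width. Whether the ndfc and partition-parsing code actually do this is exactly what is being asked above, so treat it as an illustration only. All ex_ names are hypothetical, and the cells are assumed to be already converted to host byte order.

```c
#include <stdint.h>

/* One decoded (address, size) pair from a reg property. */
struct ex_reg {
	uint64_t addr;
	uint64_t size;
};

/* Combine 'ncells' 32-bit cells, most significant first, into one value. */
static uint64_t ex_read_cells(const uint32_t *cells, int ncells)
{
	uint64_t v = 0;

	while (ncells-- > 0)
		v = (v << 32) | *cells++;
	return v;
}

/* Decode one reg entry given the parent's #address-cells / #size-cells. */
static struct ex_reg ex_decode_reg(const uint32_t *cells, int na, int ns)
{
	struct ex_reg r;

	r.addr = ex_read_cells(cells, na);
	r.size = ex_read_cells(cells + na, ns);
	return r;
}
```

With na = ns = 1 an entry occupies two cells; with na = ns = 2 the same logical entry occupies four cells and can describe offsets and sizes beyond 4 GB.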
Re: [PATCH 4/4] leds: Let GPIO LEDs keep their current state
On Wed, 3 Dec 2008, Richard Purdie wrote:
> On Sun, 2008-11-23 at 13:31 +0100, Pavel Machek wrote:
> > On Thu 2008-11-20 17:05:56, Trent Piepho wrote:
> > > I thought of that, but it ends up being more complex. Instead of just using:
> > >
> > > static const struct gpio_led myled = {
> > > 	.name = "something",
> > > 	.keep_state = 1,
> > > }
> > >
> > > You'd do something like this:
> > >
> > > 	.default_state = LEDS_GPIO_DEFSTATE_KEEP,
> > >
> > > Is that better?
> >
> > Yes.
>
> Yes, agreed, much better.

Oh very well, I'll change it. But I reserve the right to make a sarcastic commit message.
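A minimal sketch of the difference between the two styles discussed above: a tri-state default_state field can express "off", "on" and "keep" in one place, where a separate keep_state flag would sit beside an on/off default. The struct, enum and function below are hypothetical stand-ins written for this example, not the leds-gpio definitions.

```c
/* Hypothetical tri-state default, modelled on the .default_state idea. */
enum ex_defstate {
	EX_DEFSTATE_OFF,
	EX_DEFSTATE_ON,
	EX_DEFSTATE_KEEP,
};

/* Hypothetical, cut-down LED description. */
struct ex_gpio_led {
	const char *name;
	enum ex_defstate default_state;
};

/*
 * Level to drive at probe time. 'hw_level' is whatever the pin currently
 * reads (for example as left by the firmware); it is only used for KEEP.
 */
static int ex_initial_level(const struct ex_gpio_led *led, int hw_level)
{
	switch (led->default_state) {
	case EX_DEFSTATE_ON:
		return 1;
	case EX_DEFSTATE_KEEP:
		return hw_level ? 1 : 0;
	case EX_DEFSTATE_OFF:
	default:
		return 0;
	}
}
```

An LED declared with .default_state = EX_DEFSTATE_KEEP then comes up in whatever state the hardware was left in, while the other two values ignore hw_level.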
[RFC/PATCH 1/2] powerpc: Rework usage of _PAGE_COHERENT/NO_CACHE/GUARDED
Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in in the hash code based on some CPU feature bit. We also manipulate _PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.

This changes the logic so that instead, the PTE now contains _PAGE_COHERENT for all normal RAM pages that have I = 0. The hash code clears it if the feature bit is not set.

It also adds some clean accessors to set up various valid combinations of access flags and changes various bits of code to use them instead. This should help having the PTE actually contain the bit combinations that we really want.

I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead set it explicitly from the TLB miss. I will ultimately remove it completely as it appears that it might not be needed after all, but in the meantime, having it in the TLB miss makes things a lot easier.

! DO NOT MERGE YET !

I haven't touched the FSL BookE code yet. It may need to selectively clear M in the TLB miss handler ... or not. Depends what the impact of M on non-SMP E5xx setup is. I also didn't bother to clear it on 440 because it just has no effect (ie, it won't slow things down).

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

 arch/powerpc/include/asm/pgtable-ppc32.h | 42 +++
 arch/powerpc/include/asm/pgtable-ppc64.h | 13 -
 arch/powerpc/include/asm/pgtable.h | 26 +++
 arch/powerpc/kernel/head_44x.S |1
 arch/powerpc/kernel/pci-common.c | 24 ++---
 arch/powerpc/mm/hash_low_32.S|4 +-
 arch/powerpc/mm/mem.c|4 +-
 arch/powerpc/platforms/cell/spufs/file.c | 27 ++-
 drivers/video/controlfb.c|4 +-
 9 files changed, 66 insertions(+), 79 deletions(-)

--- linux-work.orig/arch/powerpc/include/asm/pgtable-ppc32.h 2008-12-10 10:48:07.0 +1100
+++ linux-work/arch/powerpc/include/asm/pgtable-ppc32.h 2008-12-10 16:37:01.0 +1100
@@ -228,9 +228,10 @@ extern int icache_44x_need_flush;
  * - FILE *must* be in the bottom three bits because swap cache
  * entries use the top 29 bits for TLB2.
* - * - CACHE COHERENT bit (M) has no effect on PPC440 core, because it - * doesn't support SMP. So we can use this as software bit, like - * DIRTY. + * - CACHE COHERENT bit (M) has no effect on original PPC440 cores, + * because it doesn't support SMP. However, some later 460 variants + * have -some- form of SMP support and so I keep the bit there for + * future use * * With the PPC 44x Linux implementation, the 0-11th LSBs of the PTE are used * for memory protection related functions (see PTE structure in @@ -436,20 +437,19 @@ extern int icache_44x_need_flush; _PAGE_USER | _PAGE_ACCESSED | \ _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | \ _PAGE_EXEC | _PAGE_HWEXEC) + /* - * Note: the _PAGE_COHERENT bit automatically gets set in the hardware - * PTE if CONFIG_SMP is defined (hash_page does this); there is no need - * to have it in the Linux PTE, and in fact the bit could be reused for - * another purpose. -- paulus. + * We define 2 sets of base prot bits, one for basic pages (ie, + * cacheable kernel and user pages) and one for non cacheable + * pages. 
We always set _PAGE_COHERENT (when it exists), it will + * be explicitely cleared whenever it may prove beneficial */ +#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_COHERENT) +#define _PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED) -#ifdef CONFIG_44x -#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED) -#else -#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED) -#endif #define _PAGE_WRENABLE (_PAGE_RW | _PAGE_DIRTY | _PAGE_HWWRITE) #define _PAGE_KERNEL (_PAGE_BASE | _PAGE_SHARED | _PAGE_WRENABLE) +#define _PAGE_KERNEL_NC(_PAGE_BASE_NC | _PAGE_SHARED | _PAGE_WRENABLE | _PAGE_NO_CACHE) #ifdef CONFIG_PPC_STD_MMU /* On standard PPC MMU, no user access implies kernel read/write access, @@ -459,7 +459,7 @@ extern int icache_44x_need_flush; #define _PAGE_KERNEL_RO(_PAGE_BASE | _PAGE_SHARED) #endif -#define _PAGE_IO (_PAGE_KERNEL | _PAGE_NO_CACHE | _PAGE_GUARDED) +#define _PAGE_IO (_PAGE_KERNEL_NC | _PAGE_GUARDED) #define _PAGE_RAM (_PAGE_KERNEL | _PAGE_HWEXEC) #if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\ @@ -552,9 +552,6 @@ static inline int pte_young(pte_t pte) static inline int pte_file(pte_t pte) { return pte_val(pte) _PAGE_FILE; } static inline int pte_special(pte_t pte) { return pte_val(pte) _PAGE_SPECIAL; } -static inline void pte_uncache(pte_t pte) { pte_val(pte) |= _PAGE_NO_CACHE; } -static inline void pte_cache(pte_t pte) { pte_val(pte) = ~_PAGE_NO_CACHE; } - static inline pte_t pte_wrprotect(pte_t pte) {
[RFC/PATCH 2/2] powerpc: 44x doesn't need G set everywhere
After discussing with chip designers, it appears that it's not necessary to set G everywhere on 440 cores. The various core errata related to prefetch should be sorted out by firmware by disabling icache prefetching in CCR0. We add the workaround to the kernel however, just in case old firmwares don't do it.

This is valid for -all- 4xx core variants. Later ones hard wire the absence of prefetch but it doesn't harm to clear the bits in CCR0 (they should already be cleared anyway).

We still leave G=1 on the linear mapping for now; we need to stop over-mapping RAM to be able to remove it.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

 arch/powerpc/kernel/head_44x.S | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- linux-work.orig/arch/powerpc/kernel/head_44x.S 2008-12-10 16:11:35.0 +1100
+++ linux-work/arch/powerpc/kernel/head_44x.S 2008-12-10 16:29:08.0 +1100
@@ -69,6 +69,17 @@ _ENTRY(_start);
 	li	r24,0			/* CPU number */
 
 /*
+ * In case the firmware didn't do it, we apply some workarounds
+ * that are good for all 440 core variants here
+ */
+	mfspr	r3,SPRN_CCR0
+	rlwinm	r3,r3,0,0,27		/* disable icache prefetch */
+	isync
+	mtspr	SPRN_CCR0,r3
+	isync
+	sync
+
+/*
  * Set up the initial MMU state
  *
  * We are still executing code at the virtual address
@@ -570,7 +581,6 @@ finish_tlb_load:
 	rlwimi	r10,r12,29,30,30	/* DIRTY -> SW position */
 	and	r11,r12,r10		/* Mask PTE bits to keep */
 	andi.	r10,r12,_PAGE_USER	/* User page ? */
-	ori	r11,r11,_PAGE_GUARDED	/* 440 errata, needs G set */
 	beq	1f			/* nope, leave U bits empty */
 	rlwimi	r11,r11,3,26,28		/* yes, copy S bits to U */
 1:	tlbwe	r11,r13,PPC44x_TLB_ATTRIB	/* Write ATTRIB */
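For readers less used to PowerPC rotate-and-mask instructions, the effect of the added `rlwinm r3,r3,0,0,27` can be modelled in portable C. This is a sketch of the instruction's documented semantics (rotate left by SH, then AND with a mask of ones from bit MB through bit ME, where bit 0 is the most significant bit); the ex_ function names are hypothetical, and nothing here says which CCR0 fields the cleared bits correspond to beyond the patch's own comment.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by sh bits. */
static uint32_t ex_rotl32(uint32_t x, unsigned sh)
{
	sh &= 31;
	return sh ? (x << sh) | (x >> (32 - sh)) : x;
}

/*
 * Mask with ones from bit mb through bit me in IBM numbering (bit 0 is
 * the MSB). When mb > me the mask wraps around, as the ISA specifies.
 */
static uint32_t ex_mask(unsigned mb, unsigned me)
{
	uint32_t from_mb = 0xFFFFFFFFu >> mb;
	uint32_t to_me = 0xFFFFFFFFu << (31 - me);

	return mb <= me ? (from_mb & to_me) : (from_mb | to_me);
}

/* Model of: rlwinm rA,rS,SH,MB,ME */
static uint32_t ex_rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
	return ex_rotl32(rs, sh) & ex_mask(mb, me);
}
```

With SH = 0, MB = 0, ME = 27 the mask is 0xFFFFFFF0, so the instruction keeps the upper 28 bits of CCR0 and clears the four least significant ones, which is what the patch relies on to disable icache prefetch.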
[PATCH] powerpc: Remove flush_HPTE()
The function flush_HPTE() is used in only one place, the implementation of DEBUG_PAGEALLOC on ppc32. It's actually a duplicate of flush_tlb_page(), though it's -slightly- more efficient on hash based processors.

We remove it and replace it by a direct call to the hash flush code on those processors and to flush_tlb_page() for everybody else.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---
 arch/powerpc/mm/mmu_decl.h   | 17 -
 arch/powerpc/mm/pgtable_32.c |  6 +-
 2 files changed, 5 insertions(+), 18 deletions(-)

--- linux-work.orig/arch/powerpc/mm/mmu_decl.h	2008-12-10 17:01:18.0 +1100
+++ linux-work/arch/powerpc/mm/mmu_decl.h	2008-12-10 17:01:35.0 +1100
@@ -58,17 +58,14 @@ extern phys_addr_t lowmem_end_addr;
  * architectures. -- Dan
  */
 #if defined(CONFIG_8xx)
-#define flush_HPTE(X, va, pg)	_tlbie(va, 0 /* 8xx doesn't care about PID */)
 #define MMU_init_hw()		do { } while(0)
 #define mmu_mapin_ram()		(0UL)
 
 #elif defined(CONFIG_4xx)
-#define flush_HPTE(pid, va, pg)	_tlbie(va, pid)
 extern void MMU_init_hw(void);
 extern unsigned long mmu_mapin_ram(void);
 
 #elif defined(CONFIG_FSL_BOOKE)
-#define flush_HPTE(pid, va, pg)	_tlbie(va, pid)
 extern void MMU_init_hw(void);
 extern unsigned long mmu_mapin_ram(void);
 extern void adjust_total_lowmem(void);
@@ -77,18 +74,4 @@ extern void adjust_total_lowmem(void);
 /* anything 32-bit except 4xx or 8xx */
 extern void MMU_init_hw(void);
 extern unsigned long mmu_mapin_ram(void);
-
-/* Be careful....this needs to be updated if we ever encounter 603 SMPs,
- * which includes all new 82xx processors. We need tlbie/tlbsync here
- * in that case (I think). -- Dan.
- */
-static inline void flush_HPTE(unsigned context, unsigned long va,
-			      unsigned long pdval)
-{
-	if ((Hash != 0) &&
-	    mmu_has_feature(MMU_FTR_HPTE_TABLE))
-		flush_hash_pages(0, va, pdval, 1);
-	else
-		_tlbie(va);
-}
 #endif
Index: linux-work/arch/powerpc/mm/pgtable_32.c
===
--- linux-work.orig/arch/powerpc/mm/pgtable_32.c	2008-12-10 17:01:49.0 +1100
+++ linux-work/arch/powerpc/mm/pgtable_32.c	2008-12-10 17:04:36.0 +1100
@@ -342,7 +342,11 @@ static int __change_page_attr(struct pag
 		return -EINVAL;
 	set_pte_at(&init_mm, address, kpte, mk_pte(page, prot));
 	wmb();
-	flush_HPTE(0, address, pmd_val(*kpmd));
+#ifdef CONFIG_PPC_STD_MMU
+	flush_hash_pages(0, address, pmd_val(*kpmd), 1);
+#else
+	flush_tlb_page(NULL, address);
+#endif
 	pte_unmap(kpte);
 
 	return 0;
Indirect DCR Access
Josh:

In working through the PPC4XX memory-controller,ibm,sdram-4xx-ddr2 adapter driver for the EDAC MC driver, I have found a substantial number of indirect DCR accesses.

Ideally, I would use the address and data DCRs implied by the SDRAM0 dcr-reg device tree property; however, mtdcri and mfdcri are mnemonic-only at present. Consequently, for the short term I've done:

#define DCRN_SDRAM0_BASE		0x010
#define DCRN_SDRAM0_CONFIG_ADDR	(DCRN_SDRAM0_BASE+0x0)
#define DCRN_SDRAM0_CONFIG_DATA	(DCRN_SDRAM0_BASE+0x1)

#define mfsdram(reg)		mfdcri(SDRAM0, SDRAM_ ## reg)
#define mtsdram(reg, value)	mtdcri(SDRAM0, SDRAM_ ## reg, value)

Is there a long-term strategy or set of options under discussion about expanding the DCR accessors in dcr.h to include indirect access from a device tree property, as in the case above?

It appears that the processors that use this memory controller core all have the same DCR address and data registers, so this isn't a huge portability issue for the immediate future; however, I endeavor to get things as close to best practices as possible up front.

Regards,

Grant
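As an illustration of what is being asked for, here is a rough C sketch of indirect accessors that take the DCR base as a run-time value (for example, one read from a dcr-reg property) instead of a pasted mnemonic. Everything in it is an assumption made for the example: dcr_read(), dcr_write(), dcr_read_indirect() and dcr_write_indirect() are hypothetical names rather than existing dcr.h interfaces, the DCR space is simulated with plain variables, and the address register is assumed to sit at base+0 with the data register at base+1, as in the #defines above.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_SDRAM0_BASE	0x010u	/* same value as DCRN_SDRAM0_BASE above */
#define SIM_NREGS	256u	/* size of the simulated indirect space */

/* Simulated hardware: an address latch plus the registers behind it. */
static uint32_t sim_addr_latch;
static uint32_t sim_regs[SIM_NREGS];

/* Stand-ins for direct DCR access (mtdcr/mfdcr on real hardware). */
static void dcr_write(unsigned int dcrn, uint32_t val)
{
	if (dcrn == SIM_SDRAM0_BASE + 0)
		sim_addr_latch = val;
	else if (dcrn == SIM_SDRAM0_BASE + 1)
		sim_regs[sim_addr_latch % SIM_NREGS] = val;
	/* other DCRs are not modelled in this sketch */
}

static uint32_t dcr_read(unsigned int dcrn)
{
	if (dcrn == SIM_SDRAM0_BASE + 0)
		return sim_addr_latch;
	if (dcrn == SIM_SDRAM0_BASE + 1)
		return sim_regs[sim_addr_latch % SIM_NREGS];
	return 0;
}

/* Hypothetical indirect accessors: select the register through the
 * address DCR at base+0, then move the value through the data DCR at
 * base+1. A real driver would presumably have to serialize the pair. */
static uint32_t dcr_read_indirect(unsigned int base, uint32_t reg)
{
	assert(reg < SIM_NREGS);
	dcr_write(base + 0, reg);
	return dcr_read(base + 1);
}

static void dcr_write_indirect(unsigned int base, uint32_t reg, uint32_t val)
{
	assert(reg < SIM_NREGS);
	dcr_write(base + 0, reg);
	dcr_write(base + 1, val);
}
```

With accessors of this shape, the mfsdram()/mtsdram() wrappers could pass a base obtained at probe time rather than relying on DCRN_SDRAM0_BASE being identical on every processor.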