Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Nick Piggin
On Thursday 31 July 2008 03:34, Andrew Morton wrote:
 On Wed, 30 Jul 2008 18:23:18 +0100 Mel Gorman [EMAIL PROTECTED] wrote:
  On (30/07/08 01:43), Andrew Morton didst pronounce:
   On Mon, 28 Jul 2008 12:17:10 -0700 Eric Munson [EMAIL PROTECTED] 
wrote:
Certain workloads benefit if their data or text segments are backed
by huge pages.
  
   oh.  As this is a performance patch, it would be much better if its
   description contained some performance measurement results!  Please.
 
  I ran these patches through STREAM (http://www.cs.virginia.edu/stream/).
  STREAM itself was patched to allocate data from the stack instead of
  statically for the test. They completed without any problem on x86,
  x86_64 and PPC64 and each test showed a performance gain from using
  hugepages.  I can post the raw figures but they are not currently in an
  eye-friendly format. Here are some plots of the data though;
 
  x86:
  http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/x86-stream-stac
 k.ps x86_64:
  http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/x86_64-stream-s
 tack.ps ppc64-small:
  http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/ppc64-small-str
 eam-stack.ps ppc64-large:
  http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/ppc64-large-str
 eam-stack.ps
 
  The test was to run STREAM with different array sizes (plotted on X-axis)
  and measure the average throughput (y-axis). In each case, backing the
  stack with large pages with a performance gain.

 So about a 10% speedup on x86 for most STREAM configurations.  Handy -
 that's somewhat larger than most hugepage-conversions, iirc.

Although it might be a bit unusual to have codes doing huge streaming
memory operations on stack memory...

We can see why IBM is so keen on their hugepages though :)


 Do we expect that this change will be replicated in other
 memory-intensive apps?  (I do).

Such as what? It would be nice to see some numbers with some HPC or java
or DBMS workload using this. Not that I dispute it will help some cases,
but 10% (or 20% for ppc) I guess is getting toward the best case, short
of a specifically written TLB thrasher.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Andrew Morton
On Thu, 31 Jul 2008 16:04:14 +1000 Nick Piggin [EMAIL PROTECTED] wrote:

  Do we expect that this change will be replicated in other
  memory-intensive apps?  (I do).
 
 Such as what? It would be nice to see some numbers with some HPC or java
 or DBMS workload using this. Not that I dispute it will help some cases,
 but 10% (or 20% for ppc) I guess is getting toward the best case, short
 of a specifically written TLB thrasher.

I didn't realise the STREAM is using vast amounts of automatic memory. 
I'd assumed that it was using sane amounts of stack, but the stack TLB
slots were getting zapped by all the heap-memory activity.  Oh well.

I guess that effect is still there, but smaller.

I agree that few real-world apps are likely to see gains of this
order.  More benchmarks, please :)
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI

2008-07-31 Thread Andrew Morton

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 30 Jul 2008 23:18:04 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=11185
 
Summary: Device/host RESET in SCSI
Product: Platform Specific/Hardware
Version: 2.5
  KernelVersion: 2.6.26
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: blocking
   Priority: P1
  Component: PPC-64
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Latest working kernel version: 2.6.25, 2.6.18??? both tested are Debian
 distribution kernels
 Earliest failing kernel version: unknown
 Distribution: Debian stable
 Hardware Environment: IBM H70, PPC64 kernel
 Software Environment: Debian stable, 2.6.26 self compiled

Why do you describe this regression as a powerpc problem rather than a
scsi one?

(It could be either or both, I'm just wondering...)

 Problem Description:
 
 [3.881326] sym53c8xx :00:0c.0: enabling device (0140 - 0143)
 [3.959117] sym0: 875 rev 0x4 at pci :00:0c.0 irq 17
 [4.029503] sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
 [4.108967] sym0: SCSI BUS has been reset.
 [4.160753] scsi0 : sym-2.2.3
 [4.200066] sym53c8xx :00:11.0: enabling device (0140 - 0143)
 [4.278375] sym1: 895 rev 0x1 at pci :00:11.0 irq 19
 [4.349340] sym1: No NVRAM, ID 7, Fast-40, SE, parity checking
 [4.429359] sym1: SCSI BUS has been reset.
 [4.481660] scsi1 : sym-2.2.3
 [4.521351] sym53c8xx 0001:40:0c.0: enabling device (0140 - 0143)
 [4.600250] sym2: 875 rev 0x3 at pci 0001:40:0c.0 irq 29
 [4.756252] sym2: No NVRAM, ID 7, Fast-20, SE, parity checking
 [4.836739] sym2: SCSI BUS has been reset.
 [4.889450] scsi2 : sym-2.2.3
 [4.929845] st: Version 20080224, fixed bufsize 32768, s/g segs 256
 [5.008868] Driver 'st' needs updating - please use bus_type methods
 [5.089184] Driver 'sd' needs updating - please use bus_type methods
 [5.169000] Driver 'sr' needs updating - please use bus_type methods
 [5.248969] SCSI Media Changer driver v0.25
 [5.303686] Driver 'ch' needs updating - please use bus_type methods
 [5.385159] mice: PS/2 mouse device common for all mice
 [5.454519] TCP cubic registered
 [5.496517] NET: Registered protocol family 17
 [5.553945] registered taskstats version 1
 [5.606960] scsi: waiting for bus probes to complete ...
 [   12.689883] scsi 0:0:0:0: ABORT operation started
 [   13.057829] scsi 1:0:0:0: ABORT operation started
 [   13.401828] scsi 2:0:0:0: ABORT operation started
 [   17.745837] scsi 0:0:0:0: ABORT operation timed-out.
 [   17.808370] scsi 0:0:0:0: DEVICE RESET operation started
 [   18.113823] scsi 1:0:0:0: ABORT operation timed-out.
 [   18.176365] scsi 1:0:0:0: DEVICE RESET operation started
 [   18.457824] scsi 2:0:0:0: ABORT operation timed-out.
 [   18.520280] scsi 2:0:0:0: DEVICE RESET operation started
 [   22.873822] scsi 0:0:0:0: DEVICE RESET operation timed-out.
 [   22.943527] scsi 0:0:0:0: BUS RESET operation started
 [   23.241823] scsi 1:0:0:0: DEVICE RESET operation timed-out.
 [   23.311387] scsi 1:0:0:0: BUS RESET operation started
 [   23.585824] scsi 2:0:0:0: DEVICE RESET operation timed-out.
 [   23.655371] scsi 2:0:0:0: BUS RESET operation started
 [   28.005857] scsi 0:0:0:0: BUS RESET operation timed-out.
 [   28.072268] scsi 0:0:0:0: HOST RESET operation started
 [   28.143373] sym0: SCSI BUS has been reset.
 [   28.373822] scsi 1:0:0:0: BUS RESET operation timed-out.
 [   28.440183] scsi 1:0:0:0: HOST RESET operation started
 [   28.511103] sym1: SCSI BUS has been reset.
 [   28.717826] scsi 2:0:0:0: BUS RESET operation timed-out.
 [   28.784138] scsi 2:0:0:0: HOST RESET operation started
 [   28.854961] sym2: SCSI BUS has been reset.
 [   33.193826] scsi 0:0:0:0: HOST RESET operation timed-out.
 [   33.261164] scsi 0:0:0:0: Device offlined - not ready after error recovery
 [   33.561823] scsi 1:0:0:0: HOST RESET operation timed-out.
 [   33.629409] scsi 1:0:0:0: Device offlined - not ready after error recovery
 [   33.905841] scsi 2:0:0:0: HOST RESET operation timed-out.
 [   33.973557] scsi 2:0:0:0: Device offlined - not ready after error recovery
 [   38.845823] scsi 0:0:1:0: ABORT operation started
 [   39.213828] scsi 1:0:1:0: ABORT operation started
 [   39.469825] scsi 2:0:1:0: ABORT operation started
 [   43.901857] scsi 0:0:1:0: ABORT operation timed-out.
 [   43.964323] scsi 0:0:1:0: DEVICE RESET operation started
 [   44.269822] scsi 1:0:1:0: ABORT operation timed-out.
 [   44.332252] scsi 1:0:1:0: DEVICE RESET operation started
 [   44.525821] scsi 2:0:1:0: ABORT operation timed-out.
 [   44.588275] scsi 2:0:1:0: DEVICE RESET operation started
 [   49.029823] scsi 0:0:1:0: DEVICE RESET operation timed-out.
 [   49.099525] scsi 0:0:1:0: BUS 

Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Nick Piggin
On Thursday 31 July 2008 16:14, Andrew Morton wrote:
 On Thu, 31 Jul 2008 16:04:14 +1000 Nick Piggin [EMAIL PROTECTED] 
wrote:
   Do we expect that this change will be replicated in other
   memory-intensive apps?  (I do).
 
  Such as what? It would be nice to see some numbers with some HPC or java
  or DBMS workload using this. Not that I dispute it will help some cases,
  but 10% (or 20% for ppc) I guess is getting toward the best case, short
  of a specifically written TLB thrasher.

 I didn't realise the STREAM is using vast amounts of automatic memory.
 I'd assumed that it was using sane amounts of stack, but the stack TLB
 slots were getting zapped by all the heap-memory activity.  Oh well.

An easy mistake to make because that's probabably how STREAM would normally
work. I think what Mel had done is to modify the stream kernel so as to
have it operate on arrays of stack memory.


 I guess that effect is still there, but smaller.

I imagine it should be, unless you're using a CPU with seperate TLBs for
small and huge pages, and your large data set is mapped with huge pages,
in which case you might now introduce *new* TLB contention between the
stack and the dataset :)

Also, interestingly I have actually seen some CPUs whos memory operations
get significantly slower when operating on large pages than small (in the
case when there is full TLB coverage for both sizes). This would make
sense if the CPU only implements a fast L1 TLB for small pages.

So for the vast majority of workloads, where stacks are relatively small
(or slowly changing), and relatively hot, I suspect this could easily have
no benefit at best and slowdowns at worst.

But I'm not saying that as a reason not to merge it -- this is no
different from any other hugepage allocations and as usual they have to be
used selectively where they help I just wonder exactly where huge
stacks will help.


 I agree that few real-world apps are likely to see gains of this
 order.  More benchmarks, please :)

Would be nice, if just out of morbid curiosity :)
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI

2008-07-31 Thread Michael Ellerman
On Wed, 2008-07-30 at 23:24 -0700, Andrew Morton wrote:
 (switched to email.  Please respond via emailed reply-to-all, not via the
 bugzilla web interface).
 
 On Wed, 30 Jul 2008 23:18:04 -0700 (PDT) [EMAIL PROTECTED] wrote:
 
  http://bugzilla.kernel.org/show_bug.cgi?id=11185
  
 Summary: Device/host RESET in SCSI
 Product: Platform Specific/Hardware
 Version: 2.5
   KernelVersion: 2.6.26
Platform: All
  OS/Version: Linux
Tree: Mainline
  Status: NEW
Severity: blocking
Priority: P1
   Component: PPC-64
  AssignedTo: [EMAIL PROTECTED]
  ReportedBy: [EMAIL PROTECTED]
  
  
  Latest working kernel version: 2.6.25, 2.6.18??? both tested are Debian
  distribution kernels
  Earliest failing kernel version: unknown
  Distribution: Debian stable
  Hardware Environment: IBM H70, PPC64 kernel
  Software Environment: Debian stable, 2.6.26 self compiled
 
 Why do you describe this regression as a powerpc problem rather than a
 scsi one?
 
 (It could be either or both, I'm just wondering...)
 
  Problem Description:
  
  [3.881326] sym53c8xx :00:0c.0: enabling device (0140 - 0143)
  [3.959117] sym0: 875 rev 0x4 at pci :00:0c.0 irq 17
  [4.029503] sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
  [4.108967] sym0: SCSI BUS has been reset.
  [4.160753] scsi0 : sym-2.2.3
  [4.200066] sym53c8xx :00:11.0: enabling device (0140 - 0143)
  [4.278375] sym1: 895 rev 0x1 at pci :00:11.0 irq 19
  [4.349340] sym1: No NVRAM, ID 7, Fast-40, SE, parity checking
  [4.429359] sym1: SCSI BUS has been reset.
  [4.481660] scsi1 : sym-2.2.3
  [4.521351] sym53c8xx 0001:40:0c.0: enabling device (0140 - 0143)
  [4.600250] sym2: 875 rev 0x3 at pci 0001:40:0c.0 irq 29
  [4.756252] sym2: No NVRAM, ID 7, Fast-20, SE, parity checking
  [4.836739] sym2: SCSI BUS has been reset.
  [4.889450] scsi2 : sym-2.2.3


I don't know much about scsi, but I have a 44P (POWER3) which boots fine:

Linux version 2.6.27-rc1 ([EMAIL PROTECTED]) (gcc version 4.2.3 (Ubuntu 
4.2.3-2ubuntu8
...
sym53c8xx :00:0c.0: enabling device (0140 - 0143)
sym0: 896 rev 0x7 at pci :00:0c.0 irq 17
sym0: No NVRAM, ID 7, Fast-40, SE, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
scsi 0:0:1:0: CD-ROMIBM  CDRM002031_05 PQ: 0 ANSI: 2
 target0:0:1: Beginning Domain Validation
 target0:0:1: asynchronous
 target0:0:1: wide asynchronous
 target0:0:1: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 15)
 target0:0:1: Domain Validation skipping write tests
 target0:0:1: Ending Domain Validation
 target0:0:4: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
scsi 0:0:4:0: Direct-Access IBM  DDYS-T09170N S96F PQ: 0 ANSI: 3
 target0:0:4: tagged command queuing enabled, command queue depth 16.
 target0:0:4: Beginning Domain Validation
 target0:0:4: asynchronous
 target0:0:4: wide asynchronous
 target0:0:4: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
 target0:0:4: Domain Validation skipping write tests
 target0:0:4: Ending Domain Validation
 target0:0:5: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
scsi 0:0:5:0: Direct-Access IBM  DDYS-T09170N S96F PQ: 0 ANSI: 3
 target0:0:5: tagged command queuing enabled, command queue depth 16.
 target0:0:5: Beginning Domain Validation
 target0:0:5: asynchronous
 target0:0:5: wide asynchronous
 target0:0:5: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31)
 target0:0:5: Domain Validation skipping write tests
 target0:0:5: Ending Domain Validation
sym53c8xx :00:0c.1: enabling device (0140 - 0143)
sym1: 896 rev 0x7 at pci :00:0c.1 irq 18
sym1: No NVRAM, ID 7, Fast-40, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi1 : sym-2.2.3
ipr: IBM Power RAID SCSI Device Driver version: 2.4.1 (April 24, 2007)
st: Version 20080504, fixed bufsize 32768, s/g segs 256
Driver 'st' needs updating - please use bus_type methods
Driver 'sd' needs updating - please use bus_type methods
sd 0:0:4:0: [sda] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:4:0: [sda] Write Protect is off
sd 0:0:4:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DA
sd 0:0:4:0: [sda] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:4:0: [sda] Write Protect is off
sd 0:0:4:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DA
 sda: sda1
sd 0:0:4:0: [sda] Attached SCSI disk
sd 0:0:5:0: [sdb] Spinning up disk..ready
sd 0:0:5:0: [sdb] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:5:0: [sdb] Write Protect is off
sd 0:0:5:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DA
sd 0:0:5:0: [sdb] 17774160 512-byte hardware sectors (9100 MB)
sd 0:0:5:0: [sdb] Write Protect is off
sd 0:0:5:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DA
 sdb: sdb1 sdb2 sdb3  sdb5 sdb6 
sd 0:0:5:0: [sdb] Attached SCSI disk

[PATCH] powerpc/kdump: Fix /dev/oldmem interface

2008-07-31 Thread Sachin P. Sant

This patch fixes the /dev/oldmem interface for kdump on ppc64.
The patch originally came from Michael Ellerman hence have
retained the signed-off line. I just rediffed/tested against
latest git. Michael has ack'ed this patch.

Ben this is not a must for 2.6.27 but would be good if it's 
included.


Thanks
-Sachin

Signed-off-by : Michael Ellerman [EMAIL PROTECTED]
Acked-by : Michael Ellerman [EMAIL PROTECTED]

---

Fix /dev/oldmem for kdump

A change to __ioremap() broke reading /dev/oldmem because we're no
longer able to ioremap pfn 0 (d177c207ba16b1db31283e2d1fee7ad4a863584b).

We actually don't need to ioremap for anything that's part of the linear
mapping, so just read it directly.

Also make sure we're only reading one page or less at a time.

Signed-off-by : Michael Ellerman [EMAIL PROTECTED]
---

diff -Naurp 1/arch/powerpc/kernel/crash_dump.c 2/arch/powerpc/kernel/crash_dump.c
--- 1/arch/powerpc/kernel/crash_dump.c	2008-07-31 11:55:39.0 +0530
+++ 2/arch/powerpc/kernel/crash_dump.c	2008-07-31 12:09:49.0 +0530
@@ -86,6 +86,19 @@ static int __init parse_savemaxmem(char 
 }
 __setup(savemaxmem=, parse_savemaxmem);
 
+
+static size_t copy_oldmem_vaddr(void *vaddr, char *buf, size_t csize,
+   unsigned long offset, int userbuf)
+{
+	if (userbuf) {
+		if (copy_to_user((char __user *)buf, (vaddr + offset), csize))
+			return -EFAULT;
+	} else
+		memcpy(buf, (vaddr + offset), csize);
+
+	return csize;
+}
+
 /**
  * copy_oldmem_page - copy one page from oldmem
  * @pfn: page frame number to be copied
@@ -107,16 +120,16 @@ ssize_t copy_oldmem_page(unsigned long p
 	if (!csize)
 		return 0;
 
-	vaddr = __ioremap(pfn  PAGE_SHIFT, PAGE_SIZE, 0);
+	csize = min(csize, PAGE_SIZE);
 
-	if (userbuf) {
-		if (copy_to_user((char __user *)buf, (vaddr + offset), csize)) {
-			iounmap(vaddr);
-			return -EFAULT;
-		}
-	} else
-		memcpy(buf, (vaddr + offset), csize);
+	if (pfn  max_pfn) {
+		vaddr = __va(pfn  PAGE_SHIFT);
+		csize = copy_oldmem_vaddr(vaddr, buf, csize, offset, userbuf);
+	} else {
+		vaddr = __ioremap(pfn  PAGE_SHIFT, PAGE_SIZE, 0);
+		csize = copy_oldmem_vaddr(vaddr, buf, csize, offset, userbuf);
+		iounmap(vaddr);
+	}
 
-	iounmap(vaddr);
 	return csize;
 }
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH 5/8] Silence warning in arch/powerpc/mm/ppc_mmu_32.c

2008-07-31 Thread Milton Miller

On Thu Jul 31 at 15:21:25 EST in 2008, Stephen Rothwell wrote:
On Thu, 31 Jul 2008 13:51:43 +1000 (EST) Tony Breeds tony at 
bakeyournoodle.com wrote:


total_memory is a 'phys_addr_t', cast to unsigned long to silence
warning.

diff --git a/arch/powerpc/mm/ppc_mmu_32.c 
b/arch/powerpc/mm/ppc_mmu_32.c

index c53145f..9c19655 100644
--- a/arch/powerpc/mm/ppc_mmu_32.c
+++ b/arch/powerpc/mm/ppc_mmu_32.c
@@ -237,7 +237,7 @@ void __init MMU_init_hw(void)
  Hash_end = (struct hash_pte *) ((unsigned long)Hash + 
Hash_size);


  printk(Total memory = %ldMB; using %ldkB for hash table (at 
%p)\n,

-total_memory  20, Hash_size  10, Hash);
+(unsigned long)total_memory  20, Hash_size  10, 
Hash);


Will this ever be built with CONFIG_PHYS_64BIT?


I think that is how warning originates.

But please, cast the result of the shift.  Otherwise it will print 0MB 
instead of 4096MB.


The patches for 4G ram are in on one platform and in progress on a 
second.


milton

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[PATCH v2] Guard print_device_node_tree() if #if 0.

2008-07-31 Thread Tony Breeds
Currently print_device_node_tree() isn't called but it can be usful for
debuging.  Leave the function there but hide it behind '#if 0' to save
it being rewritten.  If you want to call it you're already editing this
file anyway ;P

Signed-off-by: Tony Breeds [EMAIL PROTECTED]
---
Changes since v1:
 - No longer breaks CONFIG_PPC_PSERIES_DEBUG.

 arch/powerpc/platforms/pseries/eeh_driver.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c 
b/arch/powerpc/platforms/pseries/eeh_driver.c
index 8c1ca47..0ad56ff 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -41,7 +41,7 @@ static inline const char * pcid_name (struct pci_dev *pdev)
return ;
 }
 
-#ifdef DEBUG
+#if 0
 static void print_device_node_tree(struct pci_dn *pdn, int dent)
 {
int i;
-- 
1.5.6.3

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[PATCH v2] Force printing of 'total_memory' to unsigned long long in ppc_mmu_32.c

2008-07-31 Thread Tony Breeds
total_memory is a 'phys_addr_t', Which can be either 64 or 32 bits.
Force printing as unsigned long long to silence the warning.

Signed-off-by: Tony Breeds [EMAIL PROTECTED]
---
Changes since v1:
 - correctly use 64bit type as phys_addr_t wont always be 32bits.  Thanks to 
sfr for showing me the error of my ways ;P

 arch/powerpc/mm/ppc_mmu_32.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c
index c53145f..07473e0 100644
--- a/arch/powerpc/mm/ppc_mmu_32.c
+++ b/arch/powerpc/mm/ppc_mmu_32.c
@@ -236,8 +236,8 @@ void __init MMU_init_hw(void)
 
Hash_end = (struct hash_pte *) ((unsigned long)Hash + Hash_size);
 
-   printk(Total memory = %ldMB; using %ldkB for hash table (at %p)\n,
-  total_memory  20, Hash_size  10, Hash);
+   printk(Total memory = %lldMB; using %ldkB for hash table (at %p)\n,
+  (unsigned long long)total_memory  20, Hash_size  10, Hash);
 
 
/*
-- 
1.5.6.3

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 5/8] Silence warning in arch/powerpc/mm/ppc_mmu_32.c

2008-07-31 Thread Tony Breeds
On Thu, Jul 31, 2008 at 03:21:25PM +1000, Stephen Rothwell wrote:
 Hi Tony,

Rusty has forever ruined 'Hi $name' ;P

snip
 
 Will this ever be built with CONFIG_PHYS_64BIT?

Updated patch follows.

Yours Tony

  linux.conf.auhttp://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc/kdump: Fix /dev/oldmem interface

2008-07-31 Thread Michael Ellerman
On Thu, 2008-07-31 at 12:24 +0530, Sachin P. Sant wrote:
 This patch fixes the /dev/oldmem interface for kdump on ppc64.
 The patch originally came from Michael Ellerman hence have
 retained the signed-off line. I just rediffed/tested against
 latest git. Michael has ack'ed this patch.
 
 Ben this is not a must for 2.6.27 but would be good if it's 
 included.

It's been broken long enough, I think it's a must for 27.


 Signed-off-by : Michael Ellerman [EMAIL PROTECTED]
 Acked-by : Michael Ellerman [EMAIL PROTECTED]

Heh, do I get a prize for being neurotic? ;)  .. Or just stupid, ahem.

I think you should also sign off on it Sachin, see clause (c) of the
developers cert (http://lwn.net/Articles/139918/)

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI

2008-07-31 Thread Matthew Wilcox
On Wed, Jul 30, 2008 at 11:24:09PM -0700, Andrew Morton wrote:
 Why do you describe this regression as a powerpc problem rather than a
 scsi one?
 
 (It could be either or both, I'm just wondering...)

This seems quite astute of the reporter.  The error messages from sym2
are consistent with an interrupt routing problem.  I have an idea for
reporting this more effectively (because this comes up every 3-6 months
or so) but testing that patch will have to wait until I'm back home.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Bugme-new] [Bug 11185] New: Device/host RESET in SCSI

2008-07-31 Thread Michael Ellerman
On Thu, 2008-07-31 at 01:21 -0600, Matthew Wilcox wrote:
 On Wed, Jul 30, 2008 at 11:24:09PM -0700, Andrew Morton wrote:
  Why do you describe this regression as a powerpc problem rather than a
  scsi one?
  
  (It could be either or both, I'm just wondering...)
 
 This seems quite astute of the reporter.  The error messages from sym2
 are consistent with an interrupt routing problem. 

Hmm I suppose.

In that case can we see the full dmesg and a tarball
of /proc/device-tree from a working kernel, Cijoml?

Which begs the question what was the latest working kernel version?

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: ide pmac breakage

2008-07-31 Thread Alan Cox
 There seems to be some confusion between warm-plugging of IDE devices
 and hot-plugging of IDE devices.
 
  not a single piece of HW to exercise those code path ? I don't ask you
  to get a powermac with a media-bay, but ide_cs seems to be a pretty
  important one that's part of what the ide maintainer should take care
  of... And I suspect it's going to exercise the same code path as
  mediabay.

That confuses people sometimes - ide_cs is a controller hotplug not a
device hotplug ...

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: ide pmac breakage

2008-07-31 Thread Benjamin Herrenschmidt
On Thu, 2008-07-31 at 09:49 +0100, Alan Cox wrote:
  There seems to be some confusion between warm-plugging of IDE devices
  and hot-plugging of IDE devices.
  
   not a single piece of HW to exercise those code path ? I don't ask you
   to get a powermac with a media-bay, but ide_cs seems to be a pretty
   important one that's part of what the ide maintainer should take care
   of... And I suspect it's going to exercise the same code path as
   mediabay.
 
 That confuses people sometimes - ide_cs is a controller hotplug not a
 device hotplug ...

I could make the media-bay look like a controller hotplug if it was
going to make things easier...

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc/kdump: Fix /dev/oldmem interface

2008-07-31 Thread Sachin P. Sant

Michael Ellerman wrote:

Heh, do I get a prize for being neurotic? ;)  .. Or just stupid, ahem.

I think you should also sign off on it Sachin, see clause (c) of the
developers cert (http://lwn.net/Articles/139918/)

  

Here is a signed-off from me. Let me know if i need to resubmit
the patch with all signed-off-by lines.

Signed-off-by : Sachin Sant [EMAIL PROTECTED]


--
Thanks
-Sachin

-
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
-

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: ide pmac breakage

2008-07-31 Thread Alan Cox
 I could make the media-bay look like a controller hotplug if it was
 going to make things easier...

I'm not sure it will. It may do nowdays, but the older IDE code
historically was fairly broken for both cases except in 2.4. Also faking
it as controller hotplug is the wrong path for libata which does real
drive hot plug.


Alan
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Sebastien Dugue
  The radix tree used for fast irq reverse mapping by the XICS is initialized
late in the boot process, after the first interrupt (IPI) gets registered
and after the first IPI is received.

  This patch moves the initialization of the XICS radix tree earlier into
the boot process in smp_xics_probe() (the mm is already up but no interrupts
have been registered at that point) to avoid having to insert a mapping into
the tree in interrupt context. This will help in simplifying the locking
constraints and move to a lockless radix tree in subsequent patches.

  As a nice side effect, there is no need any longer to check for
(host-revmap_data.tree.gfp_mask != 0) to know if the tree have been
initialized.


Signed-off-by: Sebastien Dugue [EMAIL PROTECTED]
Cc: Benjamin Herrenschmidt [EMAIL PROTECTED]
Cc: Paul Mackerras [EMAIL PROTECTED]
---
 arch/powerpc/kernel/irq.c|   44 +
 arch/powerpc/platforms/pseries/smp.c |1 +
 include/asm-powerpc/irq.h|5 
 3 files changed, 18 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6ac8612..0a1445c 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -840,9 +840,6 @@ void irq_dispose_mapping(unsigned int virq)
host-revmap_data.linear.revmap[hwirq] = NO_IRQ;
break;
case IRQ_HOST_MAP_TREE:
-   /* Check if radix tree allocated yet */
-   if (host-revmap_data.tree.gfp_mask == 0)
-   break;
irq_radix_wrlock(flags);
radix_tree_delete(host-revmap_data.tree, hwirq);
irq_radix_wrunlock(flags);
@@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
 }
 EXPORT_SYMBOL_GPL(irq_find_mapping);
 
+void __init irq_radix_revmap_init(void)
+{
+   struct irq_host *h;
+
+   list_for_each_entry(h, irq_hosts, link) {
+   if (h-revmap_type == IRQ_HOST_MAP_TREE)
+   INIT_RADIX_TREE(h-revmap_data.tree, GFP_ATOMIC);
+   }
+}
 
 unsigned int irq_radix_revmap(struct irq_host *host,
  irq_hw_number_t hwirq)
 {
-   struct radix_tree_root *tree;
struct irq_map_entry *ptr;
unsigned int virq;
unsigned long flags;
 
WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE);
 
-   /* Check if the radix tree exist yet. We test the value of
-* the gfp_mask for that. Sneaky but saves another int in the
-* structure. If not, we fallback to slow mode
-*/
-   tree = host-revmap_data.tree;
-   if (tree-gfp_mask == 0)
-   return irq_find_mapping(host, hwirq);
-
-   /* Now try to resolve */
+   /* Try to resolve */
irq_radix_rdlock(flags);
-   ptr = radix_tree_lookup(tree, hwirq);
+   ptr = radix_tree_lookup(host-revmap_data.tree, hwirq);
irq_radix_rdunlock(flags);
 
/* Found it, return */
@@ -927,7 +924,7 @@ unsigned int irq_radix_revmap(struct irq_host *host,
virq = irq_find_mapping(host, hwirq);
if (virq != NO_IRQ) {
irq_radix_wrlock(flags);
-   radix_tree_insert(tree, hwirq, irq_map[virq]);
+   radix_tree_insert(host-revmap_data.tree, hwirq, 
irq_map[virq]);
irq_radix_wrunlock(flags);
}
return virq;
@@ -1035,23 +1032,6 @@ void irq_early_init(void)
get_irq_desc(i)-status |= IRQ_NOREQUEST;
 }
 
-/* We need to create the radix trees late */
-static int irq_late_init(void)
-{
-   struct irq_host *h;
-   unsigned long flags;
-
-   irq_radix_wrlock(flags);
-   list_for_each_entry(h, irq_hosts, link) {
-   if (h-revmap_type == IRQ_HOST_MAP_TREE)
-   INIT_RADIX_TREE(h-revmap_data.tree, GFP_ATOMIC);
-   }
-   irq_radix_wrunlock(flags);
-
-   return 0;
-}
-arch_initcall(irq_late_init);
-
 #ifdef CONFIG_VIRQ_DEBUG
 static int virq_debug_show(struct seq_file *m, void *private)
 {
diff --git a/arch/powerpc/platforms/pseries/smp.c 
b/arch/powerpc/platforms/pseries/smp.c
index 9d8f8c8..b143fe7 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)
 
 static int __init smp_xics_probe(void)
 {
+   irq_radix_revmap_init();
xics_request_IPIs();
 
return cpus_weight(cpu_possible_map);
diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
index 1ef8e30..47b8119 100644
--- a/include/asm-powerpc/irq.h
+++ b/include/asm-powerpc/irq.h
@@ -237,6 +237,11 @@ extern unsigned int irq_find_mapping(struct irq_host *host,
  */
 extern unsigned int irq_create_direct_mapping(struct irq_host *host);
 
+/*
+ * Initialize the radix tree used by some irq controllers
+ */
+extern void __init irq_radix_revmap_init(void);
+
 /**
  * irq_radix_revmap - Find a linux virq from a 

[PATCH] powerpc - Separate the irq radix tree insertion and lookup

2008-07-31 Thread Sebastien Dugue
  irq_radix_revmap() currently serves 2 purposes, irq mapping lookup
and insertion which happen in interrupt and process context respectively.

  Separate the function into its 2 components, one for lookup only and one
for insertion only.


Signed-off-by: Sebastien Dugue [EMAIL PROTECTED]
Cc: Benjamin Herrenschmidt [EMAIL PROTECTED]
Cc: Paul Mackerras [EMAIL PROTECTED]
---
 arch/powerpc/kernel/irq.c |   25 ++---
 arch/powerpc/platforms/pseries/xics.c |   11 ---
 include/asm-powerpc/irq.h |   18 +++---
 3 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 0a1445c..083b181 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -900,34 +900,37 @@ void __init irq_radix_revmap_init(void)
}
 }
 
-unsigned int irq_radix_revmap(struct irq_host *host,
- irq_hw_number_t hwirq)
+unsigned int irq_radix_revmap_lookup(struct irq_host *host,
+irq_hw_number_t hwirq)
 {
struct irq_map_entry *ptr;
-   unsigned int virq;
+   unsigned int virq = NO_IRQ;
unsigned long flags;
 
WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE);
 
-   /* Try to resolve */
irq_radix_rdlock(flags);
ptr = radix_tree_lookup(host-revmap_data.tree, hwirq);
irq_radix_rdunlock(flags);
 
-   /* Found it, return */
-   if (ptr) {
+   if (ptr)
virq = ptr - irq_map;
-   return virq;
-   }
 
-   /* If not there, try to insert it */
-   virq = irq_find_mapping(host, hwirq);
+   return virq;
+}
+
+void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
+irq_hw_number_t hwirq)
+{
+   unsigned long flags;
+
+   WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE);
+
if (virq != NO_IRQ) {
irq_radix_wrlock(flags);
radix_tree_insert(host-revmap_data.tree, hwirq, 
irq_map[virq]);
irq_radix_wrunlock(flags);
}
-   return virq;
 }
 
 unsigned int irq_linear_revmap(struct irq_host *host,
diff --git a/arch/powerpc/platforms/pseries/xics.c 
b/arch/powerpc/platforms/pseries/xics.c
index 0fc830f..6b1a005 100644
--- a/arch/powerpc/platforms/pseries/xics.c
+++ b/arch/powerpc/platforms/pseries/xics.c
@@ -310,12 +310,6 @@ static void xics_mask_irq(unsigned int virq)
 
 static unsigned int xics_startup(unsigned int virq)
 {
-   unsigned int irq;
-
-   /* force a reverse mapping of the interrupt so it gets in the cache */
-   irq = (unsigned int)irq_map[virq].hwirq;
-   irq_radix_revmap(xics_host, irq);
-
/* unmask it */
xics_unmask_irq(virq);
return 0;
@@ -346,7 +340,7 @@ static inline unsigned int xics_remap_irq(unsigned int vec)
 
if (vec == XICS_IRQ_SPURIOUS)
return NO_IRQ;
-   irq = irq_radix_revmap(xics_host, vec);
+   irq = irq_radix_revmap_lookup(xics_host, vec);
if (likely(irq != NO_IRQ))
return irq;
 
@@ -530,6 +524,9 @@ static int xics_host_map(struct irq_host *h, unsigned int 
virq,
 {
pr_debug(xics: map virq %d, hwirq 0x%lx\n, virq, hw);
 
+   /* Insert the interrupt mapping into the radix tree for fast lookup */
+   irq_radix_revmap_insert(xics_host, virq, hw);
+
get_irq_desc(virq)-status |= IRQ_LEVEL;
set_irq_chip_and_handler(virq, xics_irq_chip, handle_fasteoi_irq);
return 0;
diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
index 47b8119..5c88acf 100644
--- a/include/asm-powerpc/irq.h
+++ b/include/asm-powerpc/irq.h
@@ -243,15 +243,27 @@ extern unsigned int irq_create_direct_mapping(struct 
irq_host *host);
 extern void __init irq_radix_revmap_init(void);
 
 /**
- * irq_radix_revmap - Find a linux virq from a hw irq number.
+ * irq_radix_revmap_insert - Insert a hw irq to linux virq number mapping.
+ * @host: host owning this hardware interrupt
+ * @virq: linux irq number
+ * @hwirq: hardware irq number in that host space
+ *
+ * This is for use by irq controllers that use a radix tree reverse
+ * mapping for fast lookup.
+ */
+extern void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
+   irq_hw_number_t hwirq);
+
+/**
+ * irq_radix_revmap_lookup - Find a linux virq from a hw irq number.
  * @host: host owning this hardware interrupt
  * @hwirq: hardware irq number in that host space
  *
  * This is a fast path, for use by irq controller code that uses radix tree
  * revmaps
  */
-extern unsigned int irq_radix_revmap(struct irq_host *host,
-irq_hw_number_t hwirq);
+extern unsigned int irq_radix_revmap_lookup(struct irq_host *host,
+   irq_hw_number_t hwirq);
 
 /**
  * irq_linear_revmap - Find a linux virq from a hw irq number.
-- 
1.5.5.1


[PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless

2008-07-31 Thread Sebastien Dugue
  Hi ,

  here is a respin of the patches I posted last week for the RT kernel now 
targeted
for mainline (http://lkml.org/lkml/2008/7/24/98). Thomas, steven, a note for you
at the end.

  The goal of this patchset is to simplify the locking constraints on the radix
tree used for IRQ reverse mapping on the pSeries machines and provide lockless
access to this tree.

  This also solves the following BUG under preempt-rt:

BUG: sleeping function called from invalid context swapper(1) at 
kernel/rtmutex.c:739
in_atomic():1 [0002], irqs_disabled():1
Call Trace:
[c001e20f3340] [c0010370] .show_stack+0x70/0x1bc (unreliable)
[c001e20f33f0] [c0049380] .__might_sleep+0x11c/0x138
[c001e20f3470] [c02a2f64] .__rt_spin_lock+0x3c/0x98
[c001e20f34f0] [c00c3f20] .kmem_cache_alloc+0x68/0x184
[c001e20f3590] [c0193f3c] .radix_tree_node_alloc+0xf0/0x144
[c001e20f3630] [c0195190] .radix_tree_insert+0x18c/0x2fc
[c001e20f36f0] [c000c710] .irq_radix_revmap+0x1a4/0x1e4
[c001e20f37b0] [c003b3f0] .xics_startup+0x30/0x54
[c001e20f3840] [c008b864] .setup_irq+0x26c/0x370
[c001e20f38f0] [c008ba68] .request_irq+0x100/0x158
[c001e20f39a0] [c01ee9c0] .hvc_open+0xb4/0x148
[c001e20f3a40] [c01d72ec] .tty_open+0x200/0x368
[c001e20f3af0] [c00ce928] .chrdev_open+0x1f4/0x25c
[c001e20f3ba0] [c00c8bf0] .__dentry_open+0x188/0x2c8
[c001e20f3c50] [c00c8dec] .do_filp_open+0x50/0x70
[c001e20f3d70] [c00c8e8c] .do_sys_open+0x80/0x148
[c001e20f3e20] [c000928c] .init_post+0x4c/0x100
[c001e20f3ea0] [c03c0e0c] .kernel_init+0x428/0x478
[c001e20f3f90] [c0027448] .kernel_thread+0x4c/0x68

  The root cause of this bug lies in the fact that the XICS interrupt controller
uses a radix tree for its reverse irq mapping and that we cannot allocate the 
tree
nodes (even GFP_ATOMIC) with preemption disabled.

  In fact, we have 2 nested preemption disabling when we want to allocate
a new node:

  - setup_irq() does a spin_lock_irqsave() before calling xics_startup() which
then calls irq_radix_revmap() to insert a new node in the tree

  - irq_radix_revmap() also does a spin_lock_irqsave() (in irq_radix_wrlock())
before the radix_tree_insert()

  Also, if an IRQ gets registered before the tree is initialized (namely the
IPI), it will be inserted into the tree in interrupt context once the tree
have been initialized, hence the need for a spin_lock_irqsave() in the insertion
path.

  This serie is split into 3 patches:

  - The first patch moves the initialization of the radix tree earlier in the
boot process before any IRQ gets registered, but after the mm is up.

  - The second patch splits irq_radix_revmap() into its 2 components: one
for lookup and one for insertion into the radix tree.

  - And finally, the third patch makes the radix tree fully lockless on the 
lookup side.


  Here is the diffstat for the whole patchset:

 arch/powerpc/kernel/irq.c |  134 -
 arch/powerpc/platforms/pseries/smp.c  |1 +
 arch/powerpc/platforms/pseries/xics.c |   11 +--
 include/asm-powerpc/irq.h |   24 +-
 4 files changed, 58 insertions(+), 112 deletions(-)


  Thomas, Steven, the first 2 patches can be applied seamlessly to 2.6.26-rt1
with offsets, the third patch has a trivial to fix reject in
arch/powerpc/kernel/irq.c because the irq_big_lock is changed to a raw spinlock
in preempt-rt. If you want those patches for RT, just flag me, I have those
sitting on my test box.



  Thanks,

  Sebastien.



___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[PATCH] powerpc - Make the irq reverse mapping radix tree lockless

2008-07-31 Thread Sebastien Dugue
  The radix trees used by interrupt controllers for their irq reverse mapping
(currently only the XICS found on pSeries) have a complex locking scheme
dating back to before the advent of the lockless radix tree.

  Take advantage of this and of the fact that the items of the tree are
pointers to a static array (irq_map) elements which can never go under us
to simplify the locking.

  Concurrency between readers and writers is handled by the intrinsic
properties of the lockless radix tree. Concurrency between writers is handled
with a spinlock added to the irq_host structure.


Signed-off-by: Sebastien Dugue [EMAIL PROTECTED]
Cc: Benjamin Herrenschmidt [EMAIL PROTECTED]
Cc: Paul Mackerras [EMAIL PROTECTED]
---
 arch/powerpc/kernel/irq.c |   75 ++--
 include/asm-powerpc/irq.h |1 +
 2 files changed, 12 insertions(+), 64 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 083b181..3aa683b 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -458,8 +458,6 @@ void do_softirq(void)
 
 static LIST_HEAD(irq_hosts);
 static DEFINE_SPINLOCK(irq_big_lock);
-static DEFINE_PER_CPU(unsigned int, irq_radix_reader);
-static unsigned int irq_radix_writer;
 struct irq_map_entry irq_map[NR_IRQS];
 static unsigned int irq_virq_count = NR_IRQS;
 static struct irq_host *irq_default_host;
@@ -602,57 +600,6 @@ void irq_set_virq_count(unsigned int count)
irq_virq_count = count;
 }
 
-/* radix tree not lockless safe ! we use a brlock-type mecanism
- * for now, until we can use a lockless radix tree
- */
-static void irq_radix_wrlock(unsigned long *flags)
-{
-   unsigned int cpu, ok;
-
-   spin_lock_irqsave(irq_big_lock, *flags);
-   irq_radix_writer = 1;
-   smp_mb();
-   do {
-   barrier();
-   ok = 1;
-   for_each_possible_cpu(cpu) {
-   if (per_cpu(irq_radix_reader, cpu)) {
-   ok = 0;
-   break;
-   }
-   }
-   if (!ok)
-   cpu_relax();
-   } while(!ok);
-}
-
-static void irq_radix_wrunlock(unsigned long flags)
-{
-   smp_wmb();
-   irq_radix_writer = 0;
-   spin_unlock_irqrestore(irq_big_lock, flags);
-}
-
-static void irq_radix_rdlock(unsigned long *flags)
-{
-   local_irq_save(*flags);
-   __get_cpu_var(irq_radix_reader) = 1;
-   smp_mb();
-   if (likely(irq_radix_writer == 0))
-   return;
-   __get_cpu_var(irq_radix_reader) = 0;
-   smp_wmb();
-   spin_lock(irq_big_lock);
-   __get_cpu_var(irq_radix_reader) = 1;
-   spin_unlock(irq_big_lock);
-}
-
-static void irq_radix_rdunlock(unsigned long flags)
-{
-   __get_cpu_var(irq_radix_reader) = 0;
-   local_irq_restore(flags);
-}
-
 static int irq_setup_virq(struct irq_host *host, unsigned int virq,
irq_hw_number_t hwirq)
 {
@@ -807,7 +754,6 @@ void irq_dispose_mapping(unsigned int virq)
 {
struct irq_host *host;
irq_hw_number_t hwirq;
-   unsigned long flags;
 
if (virq == NO_IRQ)
return;
@@ -840,9 +786,9 @@ void irq_dispose_mapping(unsigned int virq)
host-revmap_data.linear.revmap[hwirq] = NO_IRQ;
break;
case IRQ_HOST_MAP_TREE:
-   irq_radix_wrlock(flags);
+   spin_lock(host-tree_lock);
radix_tree_delete(host-revmap_data.tree, hwirq);
-   irq_radix_wrunlock(flags);
+   spin_unlock(host-tree_lock);
break;
}
 
@@ -895,8 +841,10 @@ void __init irq_radix_revmap_init(void)
struct irq_host *h;
 
list_for_each_entry(h, irq_hosts, link) {
-   if (h-revmap_type == IRQ_HOST_MAP_TREE)
+   if (h-revmap_type == IRQ_HOST_MAP_TREE) {
INIT_RADIX_TREE(h-revmap_data.tree, GFP_ATOMIC);
+   spin_lock_init(h-tree_lock);
+   }
}
 }
 
@@ -905,13 +853,14 @@ unsigned int irq_radix_revmap_lookup(struct irq_host 
*host,
 {
struct irq_map_entry *ptr;
unsigned int virq = NO_IRQ;
-   unsigned long flags;
 
WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE);
 
-   irq_radix_rdlock(flags);
+   /*
+* No rcu_read_lock(ing) needed, the ptr returned can't go under us
+* as it's referencing an entry in the static irq_map table.
+*/
ptr = radix_tree_lookup(host-revmap_data.tree, hwirq);
-   irq_radix_rdunlock(flags);
 
if (ptr)
virq = ptr - irq_map;
@@ -922,14 +871,12 @@ unsigned int irq_radix_revmap_lookup(struct irq_host 
*host,
 void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
 irq_hw_number_t hwirq)
 {
-   unsigned long flags;
-
WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE);
 

Re: ide pmac breakage

2008-07-31 Thread Benjamin Herrenschmidt
On Thu, 2008-07-31 at 10:13 +0100, Alan Cox wrote:
  I could make the media-bay look like a controller hotplug if it was
  going to make things easier...
 
 I'm not sure it will. It may do nowdays, but the older IDE code
 historically was fairly broken for both cases except in 2.4. Also faking
 it as controller hotplug is the wrong path for libata which does real
 drive hot plug.

Yeah, that was my line of thinking initially, also the fact that it has
the nice side effect of keeping the minor number stable.

Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless

2008-07-31 Thread Sebastien Dugue

  OK, I goofed up with git-format-patch, forgot the --numbered option.

  The patches subjects should read:

[PATCH 1/3] powerpc - Initialize the irq radix tree earlier
[PATCH 2/3] powerpc - Separate the irq radix tree insertion and lookup
[PATCH 3/3] powerpc - Make the irq reverse mapping radix tree lockless

  Sorry for that.

  Sebastien.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Mel Gorman
On (30/07/08 13:07), Andrew Morton didst pronounce:
 On Wed, 30 Jul 2008 20:30:10 +0100
 Mel Gorman [EMAIL PROTECTED] wrote:
 
  With Erics patch and libhugetlbfs, we can automatically back text/data[1],
  malloc[2] and stacks without source modification. Fairly soon, libhugetlbfs
  will also be able to override shmget() to add SHM_HUGETLB. That should cover
  a lot of the memory-intensive apps without source modification.
 
 The weak link in all of this still might be the need to reserve
 hugepages and the unreliability of dynamically allocating them.
 
 The dynamic allocation should be better nowadays, but I've lost track
 of how reliable it really is.  What's our status there?
 

We are a lot more reliable than we were although exact quantification is
difficult because it's workload dependent. For a long time, I've been able
to test bits and pieces with hugepages by allocating the pool at the time
I needed it even after days of uptime. Previously this required a reboot.

I've also been able to use the dynamic hugepage pool resizing effectively
and we track how much it is succeeding and failing in /proc/vmstat (see the
htlb fields) to watch for problems. Between that and /proc/pagetypeinfo, I am
expecting to be able to identify availablilty problems. As an administrator
can now set a minimum pool size and the maximum size of the pool (nr_hugepages
and nr_overcommit_hugepages), the configuration difficulties should be relaxed.

If it is found that anti-fragmentation can be broken down and pool
resizing starts failing after X amount of time on Y workloads, there is
still the option of using movablecore=BiggestPoolSizeIWillEverNeed
and writing 1 to /proc/sys/vm/hugepages_treat_as_movable so the hugepage
pool can grow/shrink reliably there.

Overall, it's in pretty good shape.

To be fair, one snag is that that swap is almost required for pool
resizing to work as I never pushed to complete memory compaction
(http://lwn.net/Articles/238837/).  Hence, we depend on the workload to
have lots of filesystem-backed data for lumpy-reclaim to do its job, for
pool resizing to take place between batch jobs or for swap to be configured
even if it's just for the duration of a pool resize.

-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: ide pmac breakage

2008-07-31 Thread Bartlomiej Zolnierkiewicz
On Thursday 31 July 2008, Benjamin Herrenschmidt wrote:
 
  Is it actually caused by additional reference counting on drive-gendev?
  IOW if you reverse the patch below instead of applying the previous fix
  do things work OK again?
  
   Note that there shouldn't be anything fundamentally different from
   ide-pmac here vs. something like pcmcia IDE cards... do you have one of
   these to test with ?
  
  Nope and I really don't intend to have one.  I count on other people
  to take some care of support for host drivers that they maintain/use. ;)
 
 Reverting the patch below does the job. Thanks.

Thanks, this narrows down the problem pretty nicely.

[ Unfortunately we cannot revert the whole patch as it would break
  unloading of IDE host drivers modules so I still need you help on
  fixing this one. ]

Lets get back to the oops:

Vector: 300 (Data Access) at [c59dfdc0]
    pc: c0211f78: ide_device_put+0x18/0x58
    lr: c0223c34: ide_cd_put+0x40/0x5c
    sp: c59dfe70
   msr: 9032
   dar: 10
 dsisr: 4000
  current = 0xc58a9880
    pid   = 843, comm = media-bay
enter ? for help
[c59dfe80] c0223c34 ide_cd_put+0x40/0x5c
[c59dfea0] c02114d4 generic_ide_remove+0x28/0x3c
[c59dfeb0] c01ea108 __device_release_driver+0x78/0xb4
[c59dfec0] c01ea218 device_release_driver+0x28/0x44
[c59dfee0] c01e9350 bus_remove_device+0xac/0xd8
[c59dff00] c01e77f8 device_del+0x104/0x198
[c59dff20] c01e78a4 device_unregister+0x18/0x30
[c59dff40] c0212598 __ide_port_unregister_devices+0x6c/0x88
[c59dff60] c021276c ide_port_unregister_devices+0x38/0x80
[c59dff80] c0209078 media_bay_step+0x1cc/0x5c0
[c59dffb0] c02094f8 media_bay_task+0x8c/0xcc
[c59dffd0] c0048640 kthread+0x48/0x84
[c59dfff0] c0011b38 kernel_thread+0x44/0x60

On a fresh look at ide_device_put(), ide_host_alloc() and pmac.c
it may be that the above oops is actually media-bay specific.

ide_device_put():
...
struct device *host_dev = drive-hwif-host-dev[0];
struct module *module = host_dev ? host_dev-driver-owner : NULL;
...

ide_host_alloc():
...
   if (hws[0])
host-dev[0] = hws[0]-dev;
...

pmac.c:
...
pmac_ide_macio_attach(struct macio_dev *mdev, const struct of_device_id *match)
...
dev_set_drvdata(mdev-ofdev.dev, pmif);

memset(hw, 0, sizeof(hw));
pmac_ide_init_ports(hw, pmif-regbase);
hw.irq = irq;
hw.dev = mdev-bus-pdev-dev;
hw.parent = mdev-ofdev.dev;
...

pmac macio is unique in using different devices for hwif-dev / host-dev
(hw.dev) and hwif-gendev.parent / dev_set_drvdata() (hw.parent)

[ I has been actually wondering why is it so for some time...? ]

Thus we may be hitting oops in ide_device_put() on host_dev-driver
because hw.dev is used as host-dev for pmac macio in ide_device_put()
while we really want to use hw.parent.

Fix should be as simple as:

-host-dev[0] = hws[0]-dev;
+host-dev[0] = hws[0]-parent ? hws[0]-parent : hws[0]-dev;

Could you please try it together with my previous patch for
ide_device_{get,put}()?

Bart
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Mel Gorman
On (31/07/08 16:26), Nick Piggin didst pronounce:
 On Thursday 31 July 2008 16:14, Andrew Morton wrote:
  On Thu, 31 Jul 2008 16:04:14 +1000 Nick Piggin [EMAIL PROTECTED] 
 wrote:
Do we expect that this change will be replicated in other
memory-intensive apps?  (I do).
  
   Such as what? It would be nice to see some numbers with some HPC or java
   or DBMS workload using this. Not that I dispute it will help some cases,
   but 10% (or 20% for ppc) I guess is getting toward the best case, short
   of a specifically written TLB thrasher.
 
  I didn't realise the STREAM is using vast amounts of automatic memory.
  I'd assumed that it was using sane amounts of stack, but the stack TLB
  slots were getting zapped by all the heap-memory activity.  Oh well.
 
 An easy mistake to make because that's probabably how STREAM would normally
 work. I think what Mel had done is to modify the stream kernel so as to
 have it operate on arrays of stack memory.
 

Yes, I mentioned in the mail that STREAM was patched to use stack for
its data. It was as much to show the patches were working as advertised
even though it is an extreme case obviously.

I had seen stack-hugepage-backing as something that would improve performance
in addition to something else as opposed to having to stand entirely on its
own. For example, I would expect many memory-intensive applications to gain
by just having malloc and stack backed more than backing either in isolation.

  I guess that effect is still there, but smaller.
 
 I imagine it should be, unless you're using a CPU with seperate TLBs for
 small and huge pages, and your large data set is mapped with huge pages,
 in which case you might now introduce *new* TLB contention between the
 stack and the dataset :)

Yes, this can happen particularly on older CPUs. For example, on my
crash-test laptop the Pentium III there reports

TLB and cache info:
01: Instruction TLB: 4KB pages, 4-way set assoc, 32 entries
02: Instruction TLB: 4MB pages, 4-way set assoc, 2 entries

so a workload that sparsely addressed memory (i.e. = 4MB strides on each
reference) might suffer more TLB misses with large pages than with small.
It's hardly new that there are is uncertainity around when and if hugepages
are of benefit and where.

 Also, interestingly I have actually seen some CPUs whos memory operations
 get significantly slower when operating on large pages than small (in the
 case when there is full TLB coverage for both sizes). This would make
 sense if the CPU only implements a fast L1 TLB for small pages.
 

It's also possible there is a micro-TLB involved that only support small
pages. It's been the case for a while that what wins on one machine type
may lose on another.

 So for the vast majority of workloads, where stacks are relatively small
 (or slowly changing), and relatively hot, I suspect this could easily have
 no benefit at best and slowdowns at worst.
 

I wouldn't expect an application with small stacks to request its stack
to be backed by hugepages either. Ideally, it would be enabled because a
large enough number of DTLB misses were found to be in the stack
although catching this sort of data is tricky. 

 But I'm not saying that as a reason not to merge it -- this is no
 different from any other hugepage allocations and as usual they have to be
 used selectively where they help I just wonder exactly where huge
 stacks will help.
 

Benchmark wise, SPECcpu and SPEComp have stack-dependent benchmarks.
Computations that partition problems with recursion I would expect to benefit
as well as some JVMs that heavily use the stack (see how many docs suggest
setting ulimit -s unlimited). Bit out there, but stack-based languages would
stand to gain by this. The potential gap is for threaded apps as there will
be stacks that are not the main stack.  Backing those with hugepages depends
on how they are allocated (malloc, it's easy, MAP_ANONYMOUS not so much).

  I agree that few real-world apps are likely to see gains of this
  order.  More benchmarks, please :)
 
 Would be nice, if just out of morbid curiosity :)
 

Benchmarks will happen, they just take time, you know the way. The STREAM one
in the meantime is a this works and has an effect. I'm hoping Andrew Hastings
will have figures at hand and I cc'd him elsewhere in the thread for comment.

Thanks

-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Michael Ellerman
On Thu, 2008-07-31 at 11:40 +0200, Sebastien Dugue wrote:
 The radix tree used for fast irq reverse mapping by the XICS is initialized
 late in the boot process, after the first interrupt (IPI) gets registered
 and after the first IPI is received.
 
   This patch moves the initialization of the XICS radix tree earlier into
 the boot process in smp_xics_probe() (the mm is already up but no interrupts
 have been registered at that point) to avoid having to insert a mapping into
 the tree in interrupt context. This will help in simplifying the locking
 constraints and move to a lockless radix tree in subsequent patches.
 
   As a nice side effect, there is no need any longer to check for
 (host-revmap_data.tree.gfp_mask != 0) to know if the tree have been
 initialized.

Hi Sebastien,

This is a nice cleanup, I think :)

 diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
 index 6ac8612..0a1445c 100644
 --- a/arch/powerpc/kernel/irq.c
 +++ b/arch/powerpc/kernel/irq.c
 @@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
  }
  EXPORT_SYMBOL_GPL(irq_find_mapping);
  
 +void __init irq_radix_revmap_init(void)
 +{
 + struct irq_host *h;
 +
 + list_for_each_entry(h, irq_hosts, link) {
 + if (h-revmap_type == IRQ_HOST_MAP_TREE)
 + INIT_RADIX_TREE(h-revmap_data.tree, GFP_ATOMIC);
 + }
 +}

Note irq_radix_revmap_init() loops over all irq_hosts ...

 diff --git a/arch/powerpc/platforms/pseries/smp.c 
 b/arch/powerpc/platforms/pseries/smp.c
 index 9d8f8c8..b143fe7 100644
 --- a/arch/powerpc/platforms/pseries/smp.c
 +++ b/arch/powerpc/platforms/pseries/smp.c
 @@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)
  
  static int __init smp_xics_probe(void)
  {
 + irq_radix_revmap_init();
   xics_request_IPIs();

But now it's only called from the xics setup code.

Which seems a bit ugly. In practice it doesn't matter because at the
moment xics is the only user of the radix revmap. But if we're going to
switch to this sort of initialisation I think xics should only be
init'ing the revmap for itself.


This boot ordering stuff is pretty hairy, so I might have missed
something, but this is how the code is ordered AFAICT:

start_kernel()
init_IRQ()
...
local_irq_enable()
...
rest_init()
kernel_thread()
kernel_init()
smp_prepare_cpus()
smp_xics_probe()(via 
smp_ops-probe())


What's stopping us from taking an irq between local_irq_enable() and
smp_xics_probe() ?  Is it just that no one's request_irq()'ed them yet?


cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Nick Piggin
On Thursday 31 July 2008 21:27, Mel Gorman wrote:
 On (31/07/08 16:26), Nick Piggin didst pronounce:

  I imagine it should be, unless you're using a CPU with seperate TLBs for
  small and huge pages, and your large data set is mapped with huge pages,
  in which case you might now introduce *new* TLB contention between the
  stack and the dataset :)

 Yes, this can happen particularly on older CPUs. For example, on my
 crash-test laptop the Pentium III there reports

 TLB and cache info:
 01: Instruction TLB: 4KB pages, 4-way set assoc, 32 entries
 02: Instruction TLB: 4MB pages, 4-way set assoc, 2 entries

Oh? Newer CPUs tend to have unified TLBs?


  Also, interestingly I have actually seen some CPUs whos memory operations
  get significantly slower when operating on large pages than small (in the
  case when there is full TLB coverage for both sizes). This would make
  sense if the CPU only implements a fast L1 TLB for small pages.

 It's also possible there is a micro-TLB involved that only support small
 pages.

That is the case on a couple of contemporary CPUs I've tested with
(although granted they are engineering samples, but I don't expect
that to be the cause)


  So for the vast majority of workloads, where stacks are relatively small
  (or slowly changing), and relatively hot, I suspect this could easily
  have no benefit at best and slowdowns at worst.

 I wouldn't expect an application with small stacks to request its stack
 to be backed by hugepages either. Ideally, it would be enabled because a
 large enough number of DTLB misses were found to be in the stack
 although catching this sort of data is tricky.

Sure, as I said, I have nothing against this functionality just because
it has the possibility to cause a regression. I was just pointing out
there are a few possibilities there, so it will take a particular type
of app to take advantage of it. Ie. it is not something you would ever
just enable just in case the stack starts thrashing the TLB.


  But I'm not saying that as a reason not to merge it -- this is no
  different from any other hugepage allocations and as usual they have to
  be used selectively where they help I just wonder exactly where huge
  stacks will help.

 Benchmark wise, SPECcpu and SPEComp have stack-dependent benchmarks.
 Computations that partition problems with recursion I would expect to
 benefit as well as some JVMs that heavily use the stack (see how many docs
 suggest setting ulimit -s unlimited). Bit out there, but stack-based
 languages would stand to gain by this. The potential gap is for threaded
 apps as there will be stacks that are not the main stack.  Backing those
 with hugepages depends on how they are allocated (malloc, it's easy,
 MAP_ANONYMOUS not so much).

Oh good, then there should be lots of possibilities to demonstrate it.

Thanks,
Nick
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Sebastien Dugue

  Hi Michael,

On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman [EMAIL PROTECTED] wrote:

 On Thu, 2008-07-31 at 11:40 +0200, Sebastien Dugue wrote:
  The radix tree used for fast irq reverse mapping by the XICS is initialized
  late in the boot process, after the first interrupt (IPI) gets registered
  and after the first IPI is received.
  
This patch moves the initialization of the XICS radix tree earlier into
  the boot process in smp_xics_probe() (the mm is already up but no interrupts
  have been registered at that point) to avoid having to insert a mapping into
  the tree in interrupt context. This will help in simplifying the locking
  constraints and move to a lockless radix tree in subsequent patches.
  
As a nice side effect, there is no need any longer to check for
  (host-revmap_data.tree.gfp_mask != 0) to know if the tree have been
  initialized.
 
 Hi Sebastien,
 
 This is a nice cleanup, I think :)

  Thanks.

 
  diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
  index 6ac8612..0a1445c 100644
  --- a/arch/powerpc/kernel/irq.c
  +++ b/arch/powerpc/kernel/irq.c
  @@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
   }
   EXPORT_SYMBOL_GPL(irq_find_mapping);
   
  +void __init irq_radix_revmap_init(void)
  +{
  +   struct irq_host *h;
  +
  +   list_for_each_entry(h, irq_hosts, link) {
  +   if (h-revmap_type == IRQ_HOST_MAP_TREE)
  +   INIT_RADIX_TREE(h-revmap_data.tree, GFP_ATOMIC);
  +   }
  +}
 
 Note irq_radix_revmap_init() loops over all irq_hosts ...

  Yep, but there's only one host (xics)

 
  diff --git a/arch/powerpc/platforms/pseries/smp.c 
  b/arch/powerpc/platforms/pseries/smp.c
  index 9d8f8c8..b143fe7 100644
  --- a/arch/powerpc/platforms/pseries/smp.c
  +++ b/arch/powerpc/platforms/pseries/smp.c
  @@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)
   
   static int __init smp_xics_probe(void)
   {
  +   irq_radix_revmap_init();
  xics_request_IPIs();
 
 But now it's only called from the xics setup code.
 
 Which seems a bit ugly. In practice it doesn't matter because at the
 moment xics is the only user of the radix revmap. But if we're going to
 switch to this sort of initialisation I think xics should only be
 init'ing the revmap for itself.

  You're right, that's what I intended to do from the beginning but
stumbled upon ... hmm, can't remember what, that made me change
my mind. But I agree, I'm not particularly proud of that. Will look
again into that.

 
 
 This boot ordering stuff is pretty hairy, so I might have missed
 something, but this is how the code is ordered AFAICT:
 
 start_kernel()
   init_IRQ()
   ...
   local_irq_enable()
   ...
   rest_init()
   kernel_thread()
   kernel_init()
   smp_prepare_cpus()
   smp_xics_probe()(via 
 smp_ops-probe())
 
 
 What's stopping us from taking an irq between local_irq_enable() and
 smp_xics_probe() ?  Is it just that no one's request_irq()'ed them yet?

  It's hairy, I agree, but as you've mentioned no one has done a request_irq()
at that point. The first one to do it is smp_xics_probe() for the IPI.

  Thanks for your comments.

  Sebastien.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Sebastien Dugue
On Thu, 31 Jul 2008 14:00:02 +0200 Sebastien Dugue [EMAIL PROTECTED] wrote:

 
   Hi Michael,
 
 On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman [EMAIL PROTECTED] wrote:
 
  On Thu, 2008-07-31 at 11:40 +0200, Sebastien Dugue wrote:
   The radix tree used for fast irq reverse mapping by the XICS is 
   initialized
   late in the boot process, after the first interrupt (IPI) gets registered
   and after the first IPI is received.
   
 This patch moves the initialization of the XICS radix tree earlier into
   the boot process in smp_xics_probe() (the mm is already up but no 
   interrupts
   have been registered at that point) to avoid having to insert a mapping 
   into
   the tree in interrupt context. This will help in simplifying the locking
   constraints and move to a lockless radix tree in subsequent patches.
   
 As a nice side effect, there is no need any longer to check for
   (host-revmap_data.tree.gfp_mask != 0) to know if the tree have been
   initialized.
  
  Hi Sebastien,
  
  This is a nice cleanup, I think :)
 
   Thanks.
 
  
   diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
   index 6ac8612..0a1445c 100644
   --- a/arch/powerpc/kernel/irq.c
   +++ b/arch/powerpc/kernel/irq.c
   @@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
}
EXPORT_SYMBOL_GPL(irq_find_mapping);

   +void __init irq_radix_revmap_init(void)
   +{
   + struct irq_host *h;
   +
   + list_for_each_entry(h, irq_hosts, link) {
   + if (h-revmap_type == IRQ_HOST_MAP_TREE)
   + INIT_RADIX_TREE(h-revmap_data.tree, GFP_ATOMIC);
   + }
   +}
  
  Note irq_radix_revmap_init() loops over all irq_hosts ...
 
   Yep, but there's only one host (xics)
 
  
   diff --git a/arch/powerpc/platforms/pseries/smp.c 
   b/arch/powerpc/platforms/pseries/smp.c
   index 9d8f8c8..b143fe7 100644
   --- a/arch/powerpc/platforms/pseries/smp.c
   +++ b/arch/powerpc/platforms/pseries/smp.c
   @@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)

static int __init smp_xics_probe(void)
{
   + irq_radix_revmap_init();
 xics_request_IPIs();
  
  But now it's only called from the xics setup code.
  
  Which seems a bit ugly. In practice it doesn't matter because at the
  moment xics is the only user of the radix revmap. But if we're going to
  switch to this sort of initialisation I think xics should only be
  init'ing the revmap for itself.
 
   You're right, that's what I intended to do from the beginning but
 stumbled upon ... hmm, can't remember what, that made me change
 my mind.

  Ah yes, I wanted to do it from xics_init_host() but backed off
because at that point the mm is not up. But it does not make a difference
as the first request_irq() happens after the mm is up. A bit shaky I
concede.

 But I agree, I'm not particularly proud of that. Will look
 again into that.
 
  
  
  This boot ordering stuff is pretty hairy, so I might have missed
  something, but this is how the code is ordered AFAICT:
  
  start_kernel()
  init_IRQ()
  ...
  local_irq_enable()
  ...
  rest_init()
  kernel_thread()
  kernel_init()
  smp_prepare_cpus()
  smp_xics_probe()(via 
  smp_ops-probe())
  
  
  What's stopping us from taking an irq between local_irq_enable() and
  smp_xics_probe() ?  Is it just that no one's request_irq()'ed them yet?
 
   It's hairy, I agree, but as you've mentioned no one has done a request_irq()
 at that point. The first one to do it is smp_xics_probe() for the IPI.
 
   Thanks for your comments.
 
   Sebastien.
 ___
 Linuxppc-dev mailing list
 Linuxppc-dev@ozlabs.org
 https://ozlabs.org/mailman/listinfo/linuxppc-dev
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Michael Ellerman
On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
 On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman [EMAIL PROTECTED] wrote:
  
  This boot ordering stuff is pretty hairy, so I might have missed
  something, but this is how the code is ordered AFAICT:
  
  start_kernel()
  init_IRQ()
  ...
  local_irq_enable()
  ...
  rest_init()
  kernel_thread()
  kernel_init()
  smp_prepare_cpus()
  smp_xics_probe()(via 
  smp_ops-probe())
  
  
  What's stopping us from taking an irq between local_irq_enable() and
  smp_xics_probe() ?  Is it just that no one's request_irq()'ed them yet?
 
   It's hairy, I agree, but as you've mentioned no one has done a request_irq()
 at that point. The first one to do it is smp_xics_probe() for the IPI.

Hmm, I don't think that's strong enough. I can trivially cause irqs to
fire during a kexec reboot just by mashing the keyboard.

And during a kdump boot all sorts of stuff could be firing. Even during
a clean boot, from firmware, I don't think we can guarantee that
nothing's going to fire.

.. after a bit of testing ..

It seems it actually works (sort of). 

xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:

ptr = radix_tree_lookup(host-revmap_data.tree, hwirq);

And because host-revmap_data.tree was zalloc'ed we trip on the first
check here:



cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

ePAPR 1.0 is published

2008-07-31 Thread Yoder Stuart

The ePAPR 1.0 spec is officially published and available
on the power.org website.

http://www.power.org/resources/downloads/Power_ePAPR_APPROVED_v1.0.pdf

One purpose of the spec is to more formally specify in one place the
flat device tree concept defined booting_without_of.txt plus the
relevant parts of 1275 and misc bindings created by the OF working
group.

It also contains a specification for how multicore boot works in
an ePAPR-compliant system.

Stuart Yoder
Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Michael Ellerman
On Thu, 2008-07-31 at 22:58 +1000, Michael Ellerman wrote:
 On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
  On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman [EMAIL PROTECTED] 
  wrote:
   
   This boot ordering stuff is pretty hairy, so I might have missed
   something, but this is how the code is ordered AFAICT:
   
   start_kernel()
 init_IRQ()
 ...
 local_irq_enable()
 ...
 rest_init()
 kernel_thread()
 kernel_init()
 smp_prepare_cpus()
 smp_xics_probe()(via 
   smp_ops-probe())
   
   
   What's stopping us from taking an irq between local_irq_enable() and
   smp_xics_probe() ?  Is it just that no one's request_irq()'ed them yet?
  
It's hairy, I agree, but as you've mentioned no one has done a 
  request_irq()
  at that point. The first one to do it is smp_xics_probe() for the IPI.
 
 Hmm, I don't think that's strong enough. I can trivially cause irqs to
 fire during a kexec reboot just by mashing the keyboard.
 
 And during a kdump boot all sorts of stuff could be firing. Even during
 a clean boot, from firmware, I don't think we can guarantee that
 nothing's going to fire.
 
 .. after a bit of testing ..
 
 It seems it actually works (sort of). 
 
 xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:
 
 ptr = radix_tree_lookup(host-revmap_data.tree, hwirq);
 
 And because host-revmap_data.tree was zalloc'ed we trip on the first
 check here:

@#$% ctrl-enter == send!

Continuing ...

void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
{
unsigned int height, shift;
struct radix_tree_node *node, **slot;

node = rcu_dereference(root-rnode);
if (node == NULL)
return NULL;

Which means irq_radix_revmap_lookup() will return NO_IRQ, which is cool.


So I think it can fly, as long as we're happy that we can't reverse map
anything until smp_xics_probe() - and I think that's true, as any irq we
take will be invalid.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Compact Flash on 8349mITX

2008-07-31 Thread Sparks, Sam
Hello All,

I'm using the latest 2.2.26 kernel image from
linux/kernel/git/benh/powerpc.git on a mpc8349mITX, and trying to get
Compact Flash to work. The kernel booted fine, but I was surprised to
see Compact Flash IRQ was not being handled. Does irq polling need to be
used for compact flash to work with this version of the kernel on this
board?

Here's snip of the unhandled IRQ, for your edification.

irq 23: nobody cared (try booting with the irqpoll option)
Call Trace:
[cf829a30] [c00087d4] 0xc00087d4 (unreliable)
[cf829a60] [c0052844] 0xc0052844
[cf829a80] [c0052ac4] 0xc0052ac4
[cf829ac0] [c0053520] 0xc0053520
[cf829ad0] [c000581c] 0xc000581c
[cf829ae0] [c0012adc] 0xc0012adc
--- Exception: 501 at 0xc002f1e0
LR = 0xc00058c8
[cf829ba0] [c02d4ef8] 0xc02d4ef8 (unreliable)
[cf829bd0] [c00058c8] 0xc00058c8
[cf829be0] [c002f394] 0xc002f394
[cf829bf0] [c0005820] 0xc0005820
[cf829c00] [c0012adc] 0xc0012adc
--- Exception: 501 at 0xc0052450
LR = 0xc0052574
[cf829cf0] [c00527e0] 0xc00527e0
[cf829d20] [c0053a04] 0xc0053a04
[cf829d40] [c01a7d88] 0xc01a7d88
[cf829d70] [c029a5bc] 0xc029a5bc
[cf829db0] [c029a8b4] 0xc029a8b4
[cf829e40] [c0202d10] 0xc0202d10
[cf829e60] [c017ab60] 0xc017ab60
[cf829e80] [c017ad6c] 0xc017ad6c
[cf829ea0] [c0179758] 0xc0179758
[cf829ed0] [c017adc4] 0xc017adc4
[cf829ee0] [c017a180] 0xc017a180
[cf829f10] [c017b39c] 0xc017b39c
[cf829f30] [c0202f80] 0xc0202f80
[cf829f40] [c0320750] 0xc0320750
[cf829f50] [c03007ac] 0xc03007ac
[cf829fd0] [c0300994] 0xc0300994
[cf829ff0] [c0012268] 0xc0012268
handlers:
[c01b35c0]
Disabling IRQ #23


Thanks for the help,
Sam

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Sebastien Dugue
On Thu, 31 Jul 2008 23:01:39 +1000 Michael Ellerman [EMAIL PROTECTED] wrote:

 On Thu, 2008-07-31 at 22:58 +1000, Michael Ellerman wrote:
  On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
   On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman [EMAIL PROTECTED] 
   wrote:

This boot ordering stuff is pretty hairy, so I might have missed
something, but this is how the code is ordered AFAICT:

start_kernel()
init_IRQ()
...
local_irq_enable()
...
rest_init()
kernel_thread()
kernel_init()
smp_prepare_cpus()
smp_xics_probe()(via 
smp_ops-probe())


What's stopping us from taking an irq between local_irq_enable() and
smp_xics_probe() ?  Is it just that no one's request_irq()'ed them yet?
   
 It's hairy, I agree, but as you've mentioned no one has done a 
   request_irq()
   at that point. The first one to do it is smp_xics_probe() for the IPI.
  
  Hmm, I don't think that's strong enough. I can trivially cause irqs to
  fire during a kexec reboot just by mashing the keyboard.
  
  And during a kdump boot all sorts of stuff could be firing. Even during
  a clean boot, from firmware, I don't think we can guarantee that
  nothing's going to fire.
  
  .. after a bit of testing ..
  
  It seems it actually works (sort of). 
  
  xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:
  
  ptr = radix_tree_lookup(host-revmap_data.tree, hwirq);
  
  And because host-revmap_data.tree was zalloc'ed we trip on the first
  check here:
 
 @#$% ctrl-enter == send!
 
 Continuing ...
 
 void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
 {
 unsigned int height, shift;
 struct radix_tree_node *node, **slot;
 
 node = rcu_dereference(root-rnode);
 if (node == NULL)
 return NULL;
 
 Which means irq_radix_revmap_lookup() will return NO_IRQ, which is cool.

  Which is what I intended so that as long as no IRQ is registered we
return NO_IRQ.

 
 
 So I think it can fly, as long as we're happy that we can't reverse map
 anything until smp_xics_probe() - and I think that's true, as any irq we
 take will be invalid.

  That's true as no IRQs are registered before smp_xics_probe() and for any
interrupt we might get before that, irq_radix_revmap_lookup() will return
NO_IRQ.

  Thanks,

  Sebastien.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Grant Likely wrote:

On Fri, Jul 25, 2008 at 11:19:41AM -0500, Timur Tabi wrote:

Wolfgang Grandegger wrote:


I know but we still need an algorithm for MPC52xx and MPC82xx as well.

That's true, but I still think hard-coding values of DFSR and FDR in the device
tree is not a good way to do this.


I agree, it should encode real frequencies, not raw register values.


Digging deeper I'm frightened by plenty of platform specific code. We 
would need:


- one table of divider,fdr,dfsr values for the MPC82/3/5/6xx processors
  (already available from Timur's U-Boot implementation)

- one table of divider,fdr values for the MPC5200 rev A.

- one table of divider,fdr values for the MPC5200 rev B.
  (the Rev. B has two more pre-scaler bits).

- furthermore, there are various mpc-specific I2C clock sources:

  MPC82xx : fsl_get_sys_freq()
  MPC5200 : IPB
  MPC83xx : fsl_get_sys_freq()
  MPC8540/41/60/55,MPC8610: fsl_get_sys_freq()
  MPC8543/45/47/48/68, MPC8641: fsl_get_sys_freq()/2
  MPC8544 : fsl_get_sys_freq()/2 or /3

  It would make sense to hand-over the I2C frequency from U-Boot to
  Linux.

Wolfgang.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Mel Gorman
On (31/07/08 21:51), Nick Piggin didst pronounce:
 On Thursday 31 July 2008 21:27, Mel Gorman wrote:
  On (31/07/08 16:26), Nick Piggin didst pronounce:
 
   I imagine it should be, unless you're using a CPU with seperate TLBs for
   small and huge pages, and your large data set is mapped with huge pages,
   in which case you might now introduce *new* TLB contention between the
   stack and the dataset :)
 
  Yes, this can happen particularly on older CPUs. For example, on my
  crash-test laptop the Pentium III there reports
 
  TLB and cache info:
  01: Instruction TLB: 4KB pages, 4-way set assoc, 32 entries
  02: Instruction TLB: 4MB pages, 4-way set assoc, 2 entries
 
 Oh? Newer CPUs tend to have unified TLBs?
 

I've seen more unified DTLBs (ITLB tends to be split) than not but it could
just be where I'm looking. For example, on the machine I'm writing this
(Core Duo), it's

TLB and cache info:
51: Instruction TLB: 4KB and 2MB or 4MB pages, 128 entries
5b: Data TLB: 4KB and 4MB pages, 64 entries

DTLB is unified there but on my T60p laptop where I guess they want the CPU
to be using less power and be cheaper, it's

TLB info
 Instruction TLB: 4K pages, 4-way associative, 128 entries.
 Instruction TLB: 4MB pages, fully associative, 2 entries
 Data TLB: 4K pages, 4-way associative, 128 entries.
 Data TLB: 4MB pages, 4-way associative, 8 entries

So I would expect huge pages to be slower there than in other cases.
On one Xeon, I see 32 entries for huge pages and 256 for small pages so
it's not straight-forward to predict. On another Xeon, I see the DLB is 64
entries unified.

To make all this more complex, huge pages can be a win because less L2 cache
is consumed on page table information. The gains are due to fewer access to
main memory and less to do with TLB misses. So lets say we do have a TLB
that is set-associative with very few large page entries, it could still
end up winning because the increased usage of L2 offset the increased TLB
misses. Predicting when huge pages are a win and when they are a loss is
just not particularly straight-forward.

 
   Also, interestingly I have actually seen some CPUs whos memory operations
   get significantly slower when operating on large pages than small (in the
   case when there is full TLB coverage for both sizes). This would make
   sense if the CPU only implements a fast L1 TLB for small pages.
 
  It's also possible there is a micro-TLB involved that only support small
  pages.
 
 That is the case on a couple of contemporary CPUs I've tested with
 (although granted they are engineering samples, but I don't expect
 that to be the cause)
 

I found it hard to determine if the CPU I was using at a uTLB or not. The
manuals didn't cover the subject but it was a theory as to why large pages
might be slower on a particular CPU. Whatever the reason, I'm ok
admitting that large pages can be slower on smaller data sets and in
other situations for whatever reason. It's not a major surprise.

 
   So for the vast majority of workloads, where stacks are relatively small
   (or slowly changing), and relatively hot, I suspect this could easily
   have no benefit at best and slowdowns at worst.
 
  I wouldn't expect an application with small stacks to request its stack
  to be backed by hugepages either. Ideally, it would be enabled because a
  large enough number of DTLB misses were found to be in the stack
  although catching this sort of data is tricky.
 
 Sure, as I said, I have nothing against this functionality just because
 it has the possibility to cause a regression. I was just pointing out
 there are a few possibilities there, so it will take a particular type
 of app to take advantage of it. Ie. it is not something you would ever
 just enable just in case the stack starts thrashing the TLB.
 

No, it's something you'd enable because you know your app is using a lot
of stack. If you are lazy, you might do a test run of the app with it
enabled for the sake of curiousity and take the option that's faster :)

 
   But I'm not saying that as a reason not to merge it -- this is no
   different from any other hugepage allocations and as usual they have to
   be used selectively where they help I just wonder exactly where huge
   stacks will help.
 
  Benchmark wise, SPECcpu and SPEComp have stack-dependent benchmarks.
  Computations that partition problems with recursion I would expect to
  benefit as well as some JVMs that heavily use the stack (see how many docs
  suggest setting ulimit -s unlimited). Bit out there, but stack-based
  languages would stand to gain by this. The potential gap is for threaded
  apps as there will be stacks that are not the main stack.  Backing those
  with hugepages depends on how they are allocated (malloc, it's easy,
  MAP_ANONYMOUS not so much).
 
 Oh good, then there should be lots of possibilities to demonstrate it.
 

There should :)

-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center

Re: [PATCH] powerpc - Initialize the irq radix tree earlier

2008-07-31 Thread Sebastien Dugue
On Thu, 31 Jul 2008 23:39:26 +1000 Michael Ellerman [EMAIL PROTECTED] wrote:

 On Thu, 2008-07-31 at 15:26 +0200, Sebastien Dugue wrote:
  On Thu, 31 Jul 2008 23:01:39 +1000 Michael Ellerman [EMAIL PROTECTED] 
  wrote:
  
   On Thu, 2008-07-31 at 22:58 +1000, Michael Ellerman wrote:
On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
 On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman [EMAIL 
 PROTECTED] wrote:
  
  This boot ordering stuff is pretty hairy, so I might have missed
  something, but this is how the code is ordered AFAICT:
  
  start_kernel()
  init_IRQ()
  ...
  local_irq_enable()
  ...
  rest_init()
  kernel_thread()
  kernel_init()
  smp_prepare_cpus()
  smp_xics_probe()(via 
  smp_ops-probe())
  
  
  What's stopping us from taking an irq between local_irq_enable() and
  smp_xics_probe() ?  Is it just that no one's request_irq()'ed them 
  yet?
 
   It's hairy, I agree, but as you've mentioned no one has done a 
 request_irq()
 at that point. The first one to do it is smp_xics_probe() for the IPI.

Hmm, I don't think that's strong enough. I can trivially cause irqs to
fire during a kexec reboot just by mashing the keyboard.

And during a kdump boot all sorts of stuff could be firing. Even during
a clean boot, from firmware, I don't think we can guarantee that
nothing's going to fire.

.. after a bit of testing ..

It seems it actually works (sort of). 

xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:

ptr = radix_tree_lookup(host-revmap_data.tree, hwirq);

And because host-revmap_data.tree was zalloc'ed we trip on the first
check here:
   
   @#$% ctrl-enter == send!
   
   Continuing ...
   
   void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
   {
   unsigned int height, shift;
   struct radix_tree_node *node, **slot;
   
   node = rcu_dereference(root-rnode);
   if (node == NULL)
   return NULL;
   
   Which means irq_radix_revmap_lookup() will return NO_IRQ, which is cool.
  
Which is what I intended so that as long as no IRQ is registered we
  return NO_IRQ.
  
   
   
   So I think it can fly, as long as we're happy that we can't reverse map
   anything until smp_xics_probe() - and I think that's true, as any irq we
   take will be invalid.
  
That's true as no IRQs are registered before smp_xics_probe() and for any
  interrupt we might get before that, irq_radix_revmap_lookup() will return
  NO_IRQ.
 
 Cool, we agree :) 
 
 My only worry is that we might be relying on on the particular radix
 tree implementation a bit too much.

  Well maybe we could revert back to testing a flag just like we
do for host-revmap_data.tree.gfp_mask != 0. Dunno.

 Is it documented somewhere that
 the /very/ first check is for root-rnode != NULL, and the rest of the
 root may be unintialised?

  Not in anything I could read except in looking at the code.

 
 And I think it needs a big fat comment in the irq code saying that it's
 safe because revmap_data is zalloc'ed, and that means the radix lookup
 will fail (safely).

  Yep, right. Will advertise this properly for the next round if this
remains the prefered solution.

  Thanks,

  Sebastien.



___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

RE: Compact Flash on 8349mITX

2008-07-31 Thread Sparks, Sam
From: Sparks, Sam 
Sent: Thursday, July 31, 2008 8:15 AM
Does irq polling need to be used for compact flash to work 
with this version of the kernel on this board?

When I remove the interrupt definition from the DTS (to cause
irqpolling), the ata driver gets stuck in a loop displaying the
following. Does anyone have any insight?
ata1.00: status: { DRDY }
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: hard resetting link
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/100
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x20 (host bus
error)

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks

2008-07-31 Thread Michael Ellerman
On Thu, 2008-07-31 at 14:50 +0100, Mel Gorman wrote:
 On (31/07/08 21:51), Nick Piggin didst pronounce:
  On Thursday 31 July 2008 21:27, Mel Gorman wrote:
   On (31/07/08 16:26), Nick Piggin didst pronounce:
  
I imagine it should be, unless you're using a CPU with seperate TLBs for
small and huge pages, and your large data set is mapped with huge pages,
in which case you might now introduce *new* TLB contention between the
stack and the dataset :)
  
   Yes, this can happen particularly on older CPUs. For example, on my
   crash-test laptop the Pentium III there reports
  
   TLB and cache info:
   01: Instruction TLB: 4KB pages, 4-way set assoc, 32 entries
   02: Instruction TLB: 4MB pages, 4-way set assoc, 2 entries
  
  Oh? Newer CPUs tend to have unified TLBs?
  
 
 I've seen more unified DTLBs (ITLB tends to be split) than not but it could
 just be where I'm looking. For example, on the machine I'm writing this
 (Core Duo), it's
 
 TLB and cache info:
 51: Instruction TLB: 4KB and 2MB or 4MB pages, 128 entries
 5b: Data TLB: 4KB and 4MB pages, 64 entries
 
 DTLB is unified there but on my T60p laptop where I guess they want the CPU
 to be using less power and be cheaper, it's
 
 TLB info
  Instruction TLB: 4K pages, 4-way associative, 128 entries.
  Instruction TLB: 4MB pages, fully associative, 2 entries
  Data TLB: 4K pages, 4-way associative, 128 entries.
  Data TLB: 4MB pages, 4-way associative, 8 entries

Clearly I've been living under a rock, but I didn't know one could get
such nicely formatted info.

In case I'm not the only one, a bit of googling turned up x86info,
courtesy of davej - apt-get'able and presumably yum'able too.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH 2/5] x86: Define elfcorehdr_addr in arch dependent section

2008-07-31 Thread Ingo Molnar

* Vivek Goyal [EMAIL PROTECTED] wrote:

 o Move elfcorehdr_addr definition in arch dependent crash dump file. This is
   equivalent to defining elfcorehdr_addr under CONFIG_CRASH_DUMP instead of
   CONFIG_PROC_VMCORE. This is needed by is_kdump_kernel() which can be
   used irrespective of the fact whether CONFIG_PROC_VMCORE is enabled or
   not.
 
 Signed-off-by: Vivek Goyal [EMAIL PROTECTED]
 ---
 
  arch/x86/kernel/crash_dump_32.c |3 +++
  arch/x86/kernel/crash_dump_64.c |3 +++
  arch/x86/kernel/setup.c |8 +++-
  3 files changed, 13 insertions(+), 1 deletion(-)

the x86 bits look fine to me.

Acked-by: Ingo Molnar [EMAIL PROTECTED]

Ingo
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: libfdt: Forgot one function when cleaning the namespace

2008-07-31 Thread Jon Loeliger
 In commit b6d80a20fc293f3b995c3ce1a6744a5574192125, we renamed all
 libfdt functions to be prefixed with fdt_ or _fdt_ to minimise the
 chance of collisions with things from whatever package libfdt is
 embedded in, pulled into the libfdt build via that environment's
 libfdt_env.h.
 
 Except... I missed one.  This patch applies the same treatment to
 _stringlist_contains().  While we're at it, also make it static since
 it's only used in the same file.
 
 Signed-off-by: David Gibson [EMAIL PROTECTED]

Applied.

jdl

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: dtc: Remove unused lexer function

2008-07-31 Thread Jon Loeliger
 dtc does not use the input() function in flex.  Apparently on some gcc
 versions the unused function will cause warnings.  Therefore, this
 patch removes the function by using the 'noinput' option to flex.
 
 Signed-off-by: David Gibson [EMAIL PROTECTED]

Applied.

jdl
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] dtc: give advance warning that -S is going away.

2008-07-31 Thread Jon Loeliger
 The -S option allowed the specification of a minimum size for
 the blob, however the main reason for caring about the size is
 so there is enough padding to add a chosen node by u-boot or
 whoever.  In which case, folks don't really care about the absolute
 size, but rather the size of the padding added for this -- which
 is what the -p option does.  Having the -S just confuses people.
 
 Signed-off-by: Paul Gortmaker [EMAIL PROTECTED]

Applied.

jdl
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Wolfgang Grandegger [EMAIL PROTECTED] wrote:
 Grant Likely wrote:

  On Fri, Jul 25, 2008 at 11:19:41AM -0500, Timur Tabi wrote:
 
   Wolfgang Grandegger wrote:
  
  
I know but we still need an algorithm for MPC52xx and MPC82xx as well.
   
   That's true, but I still think hard-coding values of DFSR and FDR in the
 device
   tree is not a good way to do this.
  
 
  I agree, it should encode real frequencies, not raw register values.
 

  Digging deeper I'm frightened by plenty of platform specific code. We would
 need:

  - one table of divider,fdr,dfsr values for the MPC82/3/5/6xx processors
   (already available from Timur's U-Boot implementation)

  - one table of divider,fdr values for the MPC5200 rev A.

  - one table of divider,fdr values for the MPC5200 rev B.
   (the Rev. B has two more pre-scaler bits).

Aren't the tables in the manual there just to make it easy for a human
to pick out the line they want? For a computer you'd program the
formula that was used to create the tables.

I agree that it took me half an hour to figure out the formula that
was needed to compute the i2s clocks, but once you figure out the
formula it solves all of the cases and no one needs to read the manual
any more. The i2c formula may even need a small loop which compares
different solutions looking for the smallest error term. But it's a
small space to search.

These device tree flags should be removed, the driver can ask the
platform code what CPU it is running on.

if (of_get_property(op-node, dfsrr, NULL))
i2c-flags |= FSL_I2C_DEV_SEPARATE_DFSRR;

if (of_device_is_compatible(op-node, fsl,mpc5200-i2c) ||
of_device_is_compatible(op-node, mpc5200-i2c))
i2c-flags |= FSL_I2C_DEV_CLOCK_5200;

static void mpc_i2c_setclock(struct mpc_i2c *i2c)
{
/* Set clock and filters */
if (i2c-flags  FSL_I2C_DEV_SEPARATE_DFSRR) {
writeb(0x31, i2c-base + MPC_I2C_FDR);
writeb(0x10, i2c-base + MPC_I2C_DFSRR);
} else if (i2c-flags  FSL_I2C_DEV_CLOCK_5200)
writeb(0x3f, i2c-base + MPC_I2C_FDR);
else
writel(0x1031, i2c-base + MPC_I2C_FDR);
}

These defines shouldn't be here, they should get the offset from the
right header file for the CPU. But it appears that structures for the
i2c memory map haven't been done for the various CPUs.

#define MPC_I2C_FDR 0x04
#define MPC_I2C_CR  0x08
#define MPC_I2C_SR  0x0c
#define MPC_I2C_DR  0x10
#define MPC_I2C_DFSRR 0x14

There appears to be one for i2x8xxx but not the other CPUs.

/* I2C
*/
typedef struct i2c {
u_char  i2c_i2mod;
charres1[3];
u_char  i2c_i2add;
charres2[3];
u_char  i2c_i2brg;
charres3[3];
u_char  i2c_i2com;
charres4[3];
u_char  i2c_i2cer;
charres5[3];
u_char  i2c_i2cmr;
charres6[0x8b];
} i2c8xx_t;




  - furthermore, there are various mpc-specific I2C clock sources:

   MPC82xx : fsl_get_sys_freq()
   MPC5200 : IPB
   MPC83xx : fsl_get_sys_freq()
   MPC8540/41/60/55,MPC8610: fsl_get_sys_freq()
   MPC8543/45/47/48/68, MPC8641: fsl_get_sys_freq()/2
   MPC8544 : fsl_get_sys_freq()/2 or /3

   It would make sense to hand-over the I2C frequency from U-Boot to
   Linux.

  Wolfgang.



  ___
  Linuxppc-dev mailing list
  Linuxppc-dev@ozlabs.org
  https://ozlabs.org/mailman/listinfo/linuxppc-dev



-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Jon Smirl wrote:

 Aren't the tables in the manual there just to make it easy for a human
 to pick out the line they want? For a computer you'd program the
 formula that was used to create the tables.

Actually, the tables in the manuals are just an example of what can be
programmed.  They don't cover all cases.  The tables assume a specific value of
DFSR.

For 8xxx, you want to use AN2919 to figure out how to really program the device.

The table I generated for U-Boot is based on the calculations outlined in
AN2919.  I debated trying to implement the actual algorithm, but decided that
pre-calculating the values was better.  The algorithm is very convoluted.

 I agree that it took me half an hour to figure out the formula that
 was needed to compute the i2s clocks, but once you figure out the
 formula it solves all of the cases and no one needs to read the manual
 any more. The i2c formula may even need a small loop which compares
 different solutions looking for the smallest error term. But it's a
 small space to search.

My understanding is that the algorithm itself is different on 8xxx vs. 52xx.  It
might be possible to combine 5200A and 5200B into one table/algorithm, but I
don't think you can combine it with the 8xxx table.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[2.6 patch] Add cuImage.mpc866ads to the bootwrapper as a cuboot-8xx target

2008-07-31 Thread Adrian Bunk
From: Scott Wood [EMAIL PROTECTED]

This patch fixes the following build error with mpc866_ads_defconfig:

--  snip  --

...
  WRAParch/powerpc/boot/cuImage.mpc866ads
powerpc64-linux-ld: arch/powerpc/boot/cuboot-mpc866ads.o: No such file: No such 
file or directory
make[2]: *** [arch/powerpc/boot/cuImage.mpc866ads] Error 1

--  snip  --

Reported-by: Adrian Bunk [EMAIL PROTECTED]
Signed-off-by: Scott Wood [EMAIL PROTECTED]
Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

This patch was sent by Scott Wood on:
- 21 May 2008

 arch/powerpc/boot/wrapper |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index d6c96d9..3b59e33 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -159,7 +159,7 @@ cuboot*)
 binary=y
 gzip=
 case $platform in
-*-mpc885ads|*-adder875*|*-ep88xc)
+*-mpc866ads|*-mpc885ads|*-adder875*|*-ep88xc)
 platformo=$object/cuboot-8xx.o
 ;;
 *5200*|*-motionpro)
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 5:51 AM, Wolfgang Grandegger [EMAIL PROTECTED] wrote:
 Grant Likely wrote:

 On Fri, Jul 25, 2008 at 11:19:41AM -0500, Timur Tabi wrote:

 Wolfgang Grandegger wrote:

 I know but we still need an algorithm for MPC52xx and MPC82xx as well.

 That's true, but I still think hard-coding values of DFSR and FDR in the
 device
 tree is not a good way to do this.

 I agree, it should encode real frequencies, not raw register values.

 Digging deeper I'm frightened by plenty of platform specific code. We would
 need:

 - one table of divider,fdr,dfsr values for the MPC82/3/5/6xx processors
  (already available from Timur's U-Boot implementation)

 - one table of divider,fdr values for the MPC5200 rev A.

 - one table of divider,fdr values for the MPC5200 rev B.
  (the Rev. B has two more pre-scaler bits).

 - furthermore, there are various mpc-specific I2C clock sources:

  MPC82xx : fsl_get_sys_freq()
  MPC5200 : IPB
  MPC83xx : fsl_get_sys_freq()
  MPC8540/41/60/55,MPC8610: fsl_get_sys_freq()
  MPC8543/45/47/48/68, MPC8641: fsl_get_sys_freq()/2
  MPC8544 : fsl_get_sys_freq()/2 or /3

  It would make sense to hand-over the I2C frequency from U-Boot to
  Linux.

U-Boot isn't always available and there are plenty of 5200 and 8xxx
boards out there which will never have U-Boot reflashed to provide
this data.  Also, there are boards that don't even use U-Boot.  I
don't want to go down this path.  It is the drivers *job* to
understand how to set these registers.

If you're careful, the table doesn't need to be huge.  It can be
marked as initdata and conditionally compiled depending on which
architectures are compiled in.  You should use .data in the driver's
of_device_id table to provide machine specific ops for setting
clocking to avoid a maze of if/else statements.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
  If you're careful, the table doesn't need to be huge.  It can be
  marked as initdata and conditionally compiled depending on which
  architectures are compiled in.  You should use .data in the driver's
  of_device_id table to provide machine specific ops for setting
  clocking to avoid a maze of if/else statements.

Does this look ok for the mpc5200 i2c struct?

/* I2C Registers */
struct mpc52xx_i2c {
u8 madr;/* I2C + 0x00 */
u8 reserved1[3];/* I2C + 0x01 */
u8 mfdr;/* I2C + 0x04 */
u8 reserved2[3];/* I2C + 0x05 */
u8 mcr; /* I2C + 0x08 */
u8 reserved3[3];/* I2C + 0x09 */
u8 msr; /* I2C + 0x0c */
u8 reserved4[3];/* I2C + 0x0d */
u8 mdr; /* I2C + 0x10 */
u8 reserved5[15];   /* I2C + 0x11 */
u8 interrupt;   /* I2C + 0x20 */
u8 reserved6[3];/* I2C + 0x21 */
u8 mifr;/* I2C + 0x24 */
};



-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Jon Smirl wrote:

On 7/31/08, Wolfgang Grandegger [EMAIL PROTECTED] wrote:

Grant Likely wrote:


On Fri, Jul 25, 2008 at 11:19:41AM -0500, Timur Tabi wrote:


Wolfgang Grandegger wrote:



I know but we still need an algorithm for MPC52xx and MPC82xx as well.


That's true, but I still think hard-coding values of DFSR and FDR in the

device

tree is not a good way to do this.


I agree, it should encode real frequencies, not raw register values.


 Digging deeper I'm frightened by plenty of platform specific code. We would
need:

 - one table of divider,fdr,dfsr values for the MPC82/3/5/6xx processors
  (already available from Timur's U-Boot implementation)

 - one table of divider,fdr values for the MPC5200 rev A.

 - one table of divider,fdr values for the MPC5200 rev B.
  (the Rev. B has two more pre-scaler bits).


Aren't the tables in the manual there just to make it easy for a human
to pick out the line they want? For a computer you'd program the
formula that was used to create the tables.


I have the formulas to create the tables, also for the MPC5200 Rev. A 
and B. That was not my point. I'm worried about arch specific code in 
i2c-mpc.c. It should go somewhere to arch/powerpc.



I agree that it took me half an hour to figure out the formula that
was needed to compute the i2s clocks, but once you figure out the
formula it solves all of the cases and no one needs to read the manual
any more. The i2c formula may even need a small loop which compares
different solutions looking for the smallest error term. But it's a
small space to search.

These device tree flags should be removed, the driver can ask the
platform code what CPU it is running on.

if (of_get_property(op-node, dfsrr, NULL))
i2c-flags |= FSL_I2C_DEV_SEPARATE_DFSRR;

if (of_device_is_compatible(op-node, fsl,mpc5200-i2c) ||
of_device_is_compatible(op-node, mpc5200-i2c))
i2c-flags |= FSL_I2C_DEV_CLOCK_5200;

static void mpc_i2c_setclock(struct mpc_i2c *i2c)
{
/* Set clock and filters */
if (i2c-flags  FSL_I2C_DEV_SEPARATE_DFSRR) {
writeb(0x31, i2c-base + MPC_I2C_FDR);
writeb(0x10, i2c-base + MPC_I2C_DFSRR);
} else if (i2c-flags  FSL_I2C_DEV_CLOCK_5200)
writeb(0x3f, i2c-base + MPC_I2C_FDR);
else
writel(0x1031, i2c-base + MPC_I2C_FDR);
}

These defines shouldn't be here, they should get the offset from the
right header file for the CPU. But it appears that structures for the
i2c memory map haven't been done for the various CPUs.

#define MPC_I2C_FDR 0x04
#define MPC_I2C_CR  0x08
#define MPC_I2C_SR  0x0c
#define MPC_I2C_DR  0x10
#define MPC_I2C_DFSRR 0x14

There appears to be one for i2x8xxx but not the other CPUs.

/* I2C
*/
typedef struct i2c {
u_char  i2c_i2mod;
charres1[3];
u_char  i2c_i2add;
charres2[3];
u_char  i2c_i2brg;
charres3[3];
u_char  i2c_i2com;
charres4[3];
u_char  i2c_i2cer;
charres5[3];
u_char  i2c_i2cmr;
charres6[0x8b];
} i2c8xx_t;


The I2C interface for the MPC5200 is not compatible with the one for the 
 MPC83/4/5/6xx, AFAIK.


Wolfgang.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Grant Likely wrote:

On Thu, Jul 31, 2008 at 5:51 AM, Wolfgang Grandegger [EMAIL PROTECTED] wrote:

Grant Likely wrote:

On Fri, Jul 25, 2008 at 11:19:41AM -0500, Timur Tabi wrote:

Wolfgang Grandegger wrote:


I know but we still need an algorithm for MPC52xx and MPC82xx as well.

That's true, but I still think hard-coding values of DFSR and FDR in the
device
tree is not a good way to do this.

I agree, it should encode real frequencies, not raw register values.

Digging deeper I'm frightened by plenty of platform specific code. We would
need:

- one table of divider,fdr,dfsr values for the MPC82/3/5/6xx processors
 (already available from Timur's U-Boot implementation)

- one table of divider,fdr values for the MPC5200 rev A.

- one table of divider,fdr values for the MPC5200 rev B.
 (the Rev. B has two more pre-scaler bits).

- furthermore, there are various mpc-specific I2C clock sources:

 MPC82xx : fsl_get_sys_freq()
 MPC5200 : IPB
 MPC83xx : fsl_get_sys_freq()
 MPC8540/41/60/55,MPC8610: fsl_get_sys_freq()
 MPC8543/45/47/48/68, MPC8641: fsl_get_sys_freq()/2
 MPC8544 : fsl_get_sys_freq()/2 or /3

 It would make sense to hand-over the I2C frequency from U-Boot to
 Linux.


U-Boot isn't always available and there are plenty of 5200 and 8xxx
boards out there which will never have U-Boot reflashed to provide
this data.  Also, there are boards that don't even use U-Boot.  I
don't want to go down this path.  It is the drivers *job* to
understand how to set these registers.

If you're careful, the table doesn't need to be huge.  It can be
marked as initdata and conditionally compiled depending on which
architectures are compiled in.  You should use .data in the driver's
of_device_id table to provide machine specific ops for setting
clocking to avoid a maze of if/else statements.


Yep, that makes sense.

Wolfgang.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 11:22 AM, Wolfgang Grandegger [EMAIL PROTECTED] wrote:
 Jon Smirl wrote:

 On 7/31/08, Wolfgang Grandegger [EMAIL PROTECTED] wrote:

 Grant Likely wrote:

 On Fri, Jul 25, 2008 at 11:19:41AM -0500, Timur Tabi wrote:

 Wolfgang Grandegger wrote:


 I know but we still need an algorithm for MPC52xx and MPC82xx as well.

 That's true, but I still think hard-coding values of DFSR and FDR in
 the

 device

 tree is not a good way to do this.

 I agree, it should encode real frequencies, not raw register values.

  Digging deeper I'm frightened by plenty of platform specific code. We
 would
 need:

  - one table of divider,fdr,dfsr values for the MPC82/3/5/6xx processors
  (already available from Timur's U-Boot implementation)

  - one table of divider,fdr values for the MPC5200 rev A.

  - one table of divider,fdr values for the MPC5200 rev B.
  (the Rev. B has two more pre-scaler bits).

 Aren't the tables in the manual there just to make it easy for a human
 to pick out the line they want? For a computer you'd program the
 formula that was used to create the tables.

 I have the formulas to create the tables, also for the MPC5200 Rev. A and B.

Oh, hey, even better.

 That was not my point. I'm worried about arch specific code in i2c-mpc.c. It
 should go somewhere to arch/powerpc.

i2c-mpc *is* arch specific.  I really don't think you need to worry
about adding a block of code for each supported SoC family.  Just
change the of_match table to look something like this:

static const struct of_device_id mpc_i2c_of_match[] = {
{.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200b_set_freq, },
{.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200_set_freq, },
{.compatible = fsl,mpc8260-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8349-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8540-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8543-i2c, .data = 
fsl_i2c_mpc8xxx_div2_set_freq, },
{.compatible = fsl,mpc8544-i2c, .data = 
fsl_i2c_mpc8xxx_div3_set_freq, },

/* keep this only for older device trees with some support
code to figure out
   what .data should have pointed to. */
{.compatible = fsl-i2c, },
{},
};
MODULE_DEVICE_TABLE(of, mpc_i2c_of_match);

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Wolfgang Grandegger [EMAIL PROTECTED] wrote:
 Jon Smirl wrote:
  There appears to be one for i2x8xxx but not the other CPUs.
 
  /* I2C
  */
  typedef struct i2c {
 u_char  i2c_i2mod;
 charres1[3];
 u_char  i2c_i2add;
 charres2[3];
 u_char  i2c_i2brg;
 charres3[3];
 u_char  i2c_i2com;
 charres4[3];
 u_char  i2c_i2cer;
 charres5[3];
 u_char  i2c_i2cmr;
 charres6[0x8b];
  } i2c8xx_t;
 

  The I2C interface for the MPC5200 is not compatible with the one for the
 MPC83/4/5/6xx, AFAIK.

Ignore that table, I mistook MPC8xx for MPC8xxx. That is a MPC8xx table.


  Wolfgang.



-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
 On Thu, Jul 31, 2008 at 11:06 AM, Jon Smirl [EMAIL PROTECTED] wrote:
   On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
If you're careful, the table doesn't need to be huge.  It can be
marked as initdata and conditionally compiled depending on which
architectures are compiled in.  You should use .data in the driver's
of_device_id table to provide machine specific ops for setting
clocking to avoid a maze of if/else statements.
  
   Does this look ok for the mpc5200 i2c struct?
  
   /* I2C Registers */
   struct mpc52xx_i2c {
  u8 madr;/* I2C + 0x00 */
  u8 reserved1[3];/* I2C + 0x01 */
  u8 mfdr;/* I2C + 0x04 */
  u8 reserved2[3];/* I2C + 0x05 */
  u8 mcr; /* I2C + 0x08 */
  u8 reserved3[3];/* I2C + 0x09 */
  u8 msr; /* I2C + 0x0c */
  u8 reserved4[3];/* I2C + 0x0d */
  u8 mdr; /* I2C + 0x10 */
  u8 reserved5[15];   /* I2C + 0x11 */
  u8 interrupt;   /* I2C + 0x20 */
  u8 reserved6[3];/* I2C + 0x21 */
  u8 mifr;/* I2C + 0x24 */
   };


 Ugh.  I hate all the registers defined in structures thing done for
  5200, but I guess it is better to stick with established convention
  than do it differently.

Isn't it better than adding random numbers to a pointer and having to
worry about what the pointer is pointing at so that it will multiply
by the right word size? That's a mess when the registers are different
lengths.

A common i2c struct might be a better idea that adds in the dfsrr
register might be better.

You can set the flags for dfsrr use in these set_freq routines too
since they select on CPU type.
   {.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200b_set_freq, },
   {.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200_set_freq, },
   {.compatible = fsl,mpc8260-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
   {.compatible = fsl,mpc8349-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
   {.compatible = fsl,mpc8540-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
   {.compatible = fsl,mpc8543-i2c, .data =
fsl_i2c_mpc8xxx_div2_set_freq, },
   {.compatible = fsl,mpc8544-i2c, .data =
fsl_i2c_mpc8xxx_div3_set_freq, },




  Yes, I think this is okay (but I haven't double checked the values; I
  trust you).


  g.


  --
  Grant Likely, B.Sc., P.Eng.
  Secret Lab Technologies Ltd.



-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Grant Likely wrote:

On Thu, Jul 31, 2008 at 11:22 AM, Wolfgang Grandegger [EMAIL PROTECTED] wrote:

Jon Smirl wrote:

On 7/31/08, Wolfgang Grandegger [EMAIL PROTECTED] wrote:

Grant Likely wrote:


On Fri, Jul 25, 2008 at 11:19:41AM -0500, Timur Tabi wrote:


Wolfgang Grandegger wrote:



I know but we still need an algorithm for MPC52xx and MPC82xx as well.


That's true, but I still think hard-coding values of DFSR and FDR in
the

device

tree is not a good way to do this.


I agree, it should encode real frequencies, not raw register values.


 Digging deeper I'm frightened by plenty of platform specific code. We
would
need:

 - one table of divider,fdr,dfsr values for the MPC82/3/5/6xx processors
 (already available from Timur's U-Boot implementation)

 - one table of divider,fdr values for the MPC5200 rev A.

 - one table of divider,fdr values for the MPC5200 rev B.
 (the Rev. B has two more pre-scaler bits).

Aren't the tables in the manual there just to make it easy for a human
to pick out the line they want? For a computer you'd program the
formula that was used to create the tables.

I have the formulas to create the tables, also for the MPC5200 Rev. A and B.


Oh, hey, even better.


That was not my point. I'm worried about arch specific code in i2c-mpc.c. It
should go somewhere to arch/powerpc.


i2c-mpc *is* arch specific.  I really don't think you need to worry
about adding a block of code for each supported SoC family.  Just
change the of_match table to look something like this:

static const struct of_device_id mpc_i2c_of_match[] = {
{.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200b_set_freq, },
{.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200_set_freq, },
{.compatible = fsl,mpc8260-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8349-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8540-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8543-i2c, .data = 
fsl_i2c_mpc8xxx_div2_set_freq, },
{.compatible = fsl,mpc8544-i2c, .data = 
fsl_i2c_mpc8xxx_div3_set_freq, },

/* keep this only for older device trees with some support
code to figure out
   what .data should have pointed to. */
{.compatible = fsl-i2c, },
{},
};
MODULE_DEVICE_TABLE(of, mpc_i2c_of_match);


Cool, this would also make the dfsrr property obsolete. Just the 
MPC8544 needs more attention because the I2C clock can be programmed to 
 be freq/2 or freq/3.


Wolfgang.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Grant Likely wrote:

 static const struct of_device_id mpc_i2c_of_match[] = {
   {.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200b_set_freq, },
   {.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200_set_freq, },
   {.compatible = fsl,mpc8260-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
   {.compatible = fsl,mpc8349-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
   {.compatible = fsl,mpc8540-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
   {.compatible = fsl,mpc8543-i2c, .data = 
 fsl_i2c_mpc8xxx_div2_set_freq, },
   {.compatible = fsl,mpc8544-i2c, .data = 
 fsl_i2c_mpc8xxx_div3_set_freq, },

So we need to update this table every time there's a new SOC?  All 83xx, 85xx,
and 86xx SOCs use the same table.  I'd prefer an implementation that does need a
specific entry for each variant of 8[356]xx.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Timur Tabi wrote:

Grant Likely wrote:


static const struct of_device_id mpc_i2c_of_match[] = {
{.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200b_set_freq, },
{.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200_set_freq, },
{.compatible = fsl,mpc8260-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8349-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8540-i2c, .data = fsl_i2c_mpc8xxx_set_freq, },
{.compatible = fsl,mpc8543-i2c, .data = 
fsl_i2c_mpc8xxx_div2_set_freq, },
{.compatible = fsl,mpc8544-i2c, .data = 
fsl_i2c_mpc8xxx_div3_set_freq, },


So we need to update this table every time there's a new SOC?  All 83xx, 85xx,
and 86xx SOCs use the same table.  I'd prefer an implementation that does need a
specific entry for each variant of 8[356]xx.


We could add a property source-clock-divider = 2/3 if it's needed!?

Wolfgang.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 12:07 PM, Wolfgang Grandegger [EMAIL PROTECTED] wrote:

 We could add a property source-clock-divider = 2/3 if it's needed!?

fsl,source-clock-divider

But, yes, this is a good solution (assuming that it is a board or SoC
characteristic, and not just a choice that the driver has the option
of making on it's own).

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Wolfgang Grandegger wrote:

 We could add a property source-clock-divider = 2/3 if it's needed!?

How about we just get U-Boot to put the core frequency in the device tree?

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Grant Likely wrote:
 On Thu, Jul 31, 2008 at 12:07 PM, Wolfgang Grandegger [EMAIL PROTECTED] 
 wrote:
 We could add a property source-clock-divider = 2/3 if it's needed!?
 
 fsl,source-clock-divider
 
 But, yes, this is a good solution (assuming that it is a board or SoC
 characteristic, and not just a choice that the driver has the option
 of making on it's own).

I was going to suggest the actual clock frequency of the I2C device.  It would
be value of gd-i2c1_clk or gd-i2c2_clk from U-Boot.  The actual divider value
is meaningless.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Grant Likely wrote:

 This is a solved problem.  The device tree simple claims compatibility
 with an older part that has the identical register-level interface.

That would assume that the clock frequency is the only thing that decides
compatibility, which may technically be true now, but I don't think it's a good
idea.

I don't understand what's wrong with simply specifying the actual clock
frequency that the device uses?  It varies from SOC to SOC, but U-Boot
calculates today already:

#if defined(CONFIG_MPC8540) || defined(CONFIG_MPC8541) || \
defined(CONFIG_MPC8560) || defined(CONFIG_MPC8555)
gd-i2c1_clk = sys_info.freqSystemBus;
#elif defined(CONFIG_MPC8544)
/*
 * On the 8544, the I2C clock is the same as the SEC clock.  This can be
 * either CCB/2 or CCB/3, depending on the value of cfg_sec_freq. See
 * 4.4.3.3 of the 8544 RM.  Note that this might actually work for all
 * 85xx, but only the 8544 has cfg_sec_freq, so it's unknown if the
 * PORDEVSR2_SEC_CFG bit is 0 on all 85xx boards that are not an 8544.
 */
if (gur-pordevsr2  MPC85xx_PORDEVSR2_SEC_CFG)
gd-i2c1_clk = sys_info.freqSystemBus / 3;
else
gd-i2c1_clk = sys_info.freqSystemBus / 2;
#else
/* Most 85xx SOCs use CCB/2, so this is the default behavior. */
gd-i2c1_clk = sys_info.freqSystemBus / 2;
#endif
gd-i2c2_clk = gd-i2c1_clk;

We need this ugliness in U-Boot.  Let's take advantage of this and do something
clean and elegant in the device tree.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 01:10:30PM -0500, Timur Tabi wrote:
 Grant Likely wrote:
  On Thu, Jul 31, 2008 at 12:07 PM, Wolfgang Grandegger [EMAIL PROTECTED] 
  wrote:
  We could add a property source-clock-divider = 2/3 if it's needed!?
  
  fsl,source-clock-divider
  
  But, yes, this is a good solution (assuming that it is a board or SoC
  characteristic, and not just a choice that the driver has the option
  of making on it's own).
 
 I was going to suggest the actual clock frequency of the I2C device.  It would
 be value of gd-i2c1_clk or gd-i2c2_clk from U-Boot.  The actual divider 
 value
 is meaningless.

I'm cool with that too, as long as it gets documented

g.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 01:13:10PM -0500, Timur Tabi wrote:
 Grant Likely wrote:
 
  This is a solved problem.  The device tree simple claims compatibility
  with an older part that has the identical register-level interface.
 
 That would assume that the clock frequency is the only thing that decides
 compatibility, which may technically be true now, but I don't think it's a 
 good
 idea.

No it doesn't, it depends on the register interface to decide
compatibility.  Clock interface is part of that.  I suggested encoding
the clock divider directly in compatible (implicit in the SoC version),
but it doesn't have to be that way.  If clock freq is obtained from
another property or some other method then that is okay too.

What is important is that if compatibility is claimed, then it must
really be compatible!  If some new part appears in the future that
breaks our assumptions, then we'll need to rework the driver anyway and
the new part will *not* claim compatibility with the old.

 I don't understand what's wrong with simply specifying the actual clock
 frequency that the device uses?  It varies from SOC to SOC, but U-Boot
 calculates today already:

There is nothing wrong with it (as long as we agree and it gets
documented).  I certainly don't have a problem with doing it this way.

g.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[PATCH] powerpc/mm: Implement _PAGE_SPECIAL pte_special() for 32-bit

2008-07-31 Thread Kumar Gala
Implement _PAGE_SPECIAL and pte_special() for 32-bit powerpc. This bit will
be used by the fast get_user_pages() to differenciate PTEs that correspond
to a valid struct page from special mappings that don't such as IO mappings
obtained via io_remap_pfn_ranges().

We currently only implement this on sub-arch that support SMP or will so
in the future (6xx, 44x, FSL-BookE) and not (8xx, 40x).

Signed-off-by: Kumar Gala [EMAIL PROTECTED]
---
Nick, do you forsee using _PAGE_SPECIAL for other applications that would
be of interested to non-SMP hw?

We can look at adding it into 8xx and 40x, but was being lazy as I assume
there is no point.

- k

 include/asm-powerpc/pgtable-ppc32.h |   15 +--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/asm-powerpc/pgtable-ppc32.h 
b/include/asm-powerpc/pgtable-ppc32.h
index 6fe39e3..58afaee 100644
--- a/include/asm-powerpc/pgtable-ppc32.h
+++ b/include/asm-powerpc/pgtable-ppc32.h
@@ -261,6 +261,7 @@ extern int icache_44x_need_flush;
 #define _PAGE_HWEXEC   0x0004  /* H: Execute permission */
 #define _PAGE_ACCESSED 0x0008  /* S: Page referenced */
 #define _PAGE_DIRTY0x0010  /* S: Page dirty */
+#define _PAGE_SPECIAL  0x0020  /* S: Special page */
 #define _PAGE_USER 0x0040  /* S: User page */
 #define _PAGE_ENDIAN   0x0080  /* H: E bit */
 #define _PAGE_GUARDED  0x0100  /* H: G bit */
@@ -276,6 +277,7 @@ extern int icache_44x_need_flush;
 /* ERPN in a PTE never gets cleared, ignore it */
 #define _PTE_NONE_MASK 0xULL

+#define __HAVE_ARCH_PTE_SPECIAL

 #elif defined(CONFIG_FSL_BOOKE)
 /*
@@ -305,6 +307,7 @@ extern int icache_44x_need_flush;
 #define _PAGE_COHERENT 0x00100 /* H: M bit */
 #define _PAGE_NO_CACHE 0x00200 /* H: I bit */
 #define _PAGE_WRITETHRU0x00400 /* H: W bit */
+#define _PAGE_SPECIAL  0x00800 /* S: Special page */

 #ifdef CONFIG_PTE_64BIT
 /* ERPN in a PTE never gets cleared, ignore it */
@@ -315,6 +318,8 @@ extern int icache_44x_need_flush;
 #define _PMD_PRESENT_MASK (PAGE_MASK)
 #define _PMD_BAD   (~PAGE_MASK)

+#define __HAVE_ARCH_PTE_SPECIAL
+
 #elif defined(CONFIG_8xx)
 /* Definitions for 8xx embedded chips. */
 #define _PAGE_PRESENT  0x0001  /* Page is valid */
@@ -362,6 +367,7 @@ extern int icache_44x_need_flush;
 #define _PAGE_ACCESSED 0x100   /* R: page referenced */
 #define _PAGE_EXEC 0x200   /* software: i-cache coherency required */
 #define _PAGE_RW   0x400   /* software: user write access allowed */
+#define _PAGE_SPECIAL  0x800   /* software: Special page */

 #define _PTE_NONE_MASK _PAGE_HASHPTE

@@ -372,6 +378,8 @@ extern int icache_44x_need_flush;
 /* Hash table based platforms need atomic updates of the linux PTE */
 #define PTE_ATOMIC_UPDATES 1

+#define __HAVE_ARCH_PTE_SPECIAL
+
 #endif

 /*
@@ -404,6 +412,9 @@ extern int icache_44x_need_flush;
 #ifndef _PAGE_WRITETHRU
 #define _PAGE_WRITETHRU0
 #endif
+#ifndef _PAGE_SPECIAL
+#define _PAGE_SPECIAL  0
+#endif
 #ifndef _PMD_PRESENT_MASK
 #define _PMD_PRESENT_MASK  _PMD_PRESENT
 #endif
@@ -533,7 +544,7 @@ static inline int pte_write(pte_t pte)  { 
return pte_val(pte)  _PAGE_RW; }
 static inline int pte_dirty(pte_t pte) { return pte_val(pte)  
_PAGE_DIRTY; }
 static inline int pte_young(pte_t pte) { return pte_val(pte)  
_PAGE_ACCESSED; }
 static inline int pte_file(pte_t pte)  { return pte_val(pte)  
_PAGE_FILE; }
-static inline int pte_special(pte_t pte)   { return 0; }
+static inline int pte_special(pte_t pte)   { return pte_val(pte)  
_PAGE_SPECIAL; }

 static inline void pte_uncache(pte_t pte)   { pte_val(pte) |= 
_PAGE_NO_CACHE; }
 static inline void pte_cache(pte_t pte) { pte_val(pte) = 
~_PAGE_NO_CACHE; }
@@ -552,7 +563,7 @@ static inline pte_t pte_mkdirty(pte_t pte) {
 static inline pte_t pte_mkyoung(pte_t pte) {
pte_val(pte) |= _PAGE_ACCESSED; return pte; }
 static inline pte_t pte_mkspecial(pte_t pte) {
-   return pte; }
+   pte_val(pte) |= _PAGE_SPECIAL; return pte; }
 static inline unsigned long pte_pgprot(pte_t pte)
 {
return __pgprot(pte_val(pte))  PAGE_PROT_BITS;
-- 
1.5.5.1

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


libata badness

2008-07-31 Thread Kumar Gala
I'm getting the following badness with top of tree on a embedded  
PowerPC w/a ULI 1575 bridge (M5229 IDE):


02:1f.0 IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev c8)

If you need more info let me know.

- k

ahci :02:1f.1: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl  
SATA mode

ahci :02:1f.1: flags: ncq sntf ilck pm led clo pmp pio slum part
[ cut here ]
Badness at c021e700 [verbose debug info unavailable]
NIP: c021e700 LR: c021e6e8 CTR: c022ce90
REGS: ef82bca0 TRAP: 0700   Not tainted  (2.6.27-rc1-00495-g33bd9fe- 
dirty)

MSR: 00029032 EE,ME,IR,DR  CR: 22044022  XER: 2000
TASK = ef83[1] 'swapper' THREAD: ef82a000 CPU: 1
GPR00: 0001 ef82bd50 ef83  9032  ef0b64fc  

GPR08: ef937c10 c022f93f efbfc8e0 ef900b60 22044028 fffdd3ff 0ffe8700  
c044c17c
GPR16: c04d52ec ef82f458 ef82f4f4 ef82f4f4 c044c234 c044c5d8 c044c5d8  
c044c220
GPR24: c044c218 c04d52f0 0080 c022f940  c044c244 efbec490  


NIP [c021e700] ata_host_activate+0x40/0x10c
LR [c021e6e8] ata_host_activate+0x28/0x10c
Call Trace:
[ef82bd50] [c021e6e8] ata_host_activate+0x28/0x10c (unreliable)
[ef82bd80] [c022f48c] ahci_init_one+0x8b4/0xd68
[ef82be30] [c01b69c0] pci_device_probe+0x84/0xa8
[ef82be50] [c01e5438] driver_probe_device+0xb4/0x1e8
[ef82be70] [c01e55f0] __driver_attach+0x84/0x88
[ef82be90] [c01e4ba0] bus_for_each_dev+0x5c/0xa4
[ef82bec0] [c01e5254] driver_attach+0x24/0x34
[ef82bed0] [c01e44b8] bus_add_driver+0x1d8/0x24c
[ef82bef0] [c01e5810] driver_register+0x70/0x160
[ef82bf10] [c01b6c54] __pci_register_driver+0x64/0xc4
[ef82bf30] [c04aaa60] ahci_init+0x28/0x38
[ef82bf40] [c048717c] do_one_initcall+0x38/0x1ac
[ef82bfb0] [c04874e0] kernel_init+0x1f0/0x1fc
[ef82bff0] [c0013b04] kernel_thread+0x44/0x60
Instruction dump:
7cbb2b78 90010034 7cda3378 7cf93b78 7c7e1b78 4bff9dbd 7c7f1b79 408200c8
2f9c 409e002c 313b 7c09d910 0f00 80010034 7fc3f378  
7f24cb78

scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
ata1: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006100
ata2: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006180
ata3: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006200
ata4: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006280
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
scsi4 : pata_ali
scsi5 : pata_ali
ata5: PATA max UDMA/133 cmd 0x1200 ctl 0x1208 bmdma 0x1220
ata6: PATA max UDMA/133 cmd 0x1210 ctl 0x1218 bmdma 0x1228
ata5.00: ATAPI: TSSTcorp CDW/DVD SH-M522C, TS06, max UDMA/33
ata5.00: WARNING: ATAPI DMA disabled for reliablity issues.  It can be  
enabled
ata5.00: WARNING: via pata_ali.atapi_dma modparam or corresponding  
sysfs node.

ata5.00: configured for UDMA/33
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
 cdb 12 00 00 00 24 00 00 00  00 00 00 00 00 00 00 00
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5: soft resetting link
ata5.00: configured for UDMA/33
ata5: EH complete
ata5.00: limiting speed to UDMA/25:PIO4
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
 cdb 12 00 00 00 24 00 00 00  00 00 00 00 00 00 00 00
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5: soft resetting link
ata5.00: configured for UDMA/25
ata5: EH complete
ata5.00: limiting speed to PIO4
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
 cdb 12 00 00 00 24 00 00 00  00 00 00 00 00 00 00 00
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5: soft resetting link
ata5.00: configured for PIO4
ata5: EH complete
ata5.00: limiting speed to PIO3
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
 cdb 12 00 00 00 24 00 00 00  00 00 00 00 00 00 00 00
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: status: { DRDY }
ata5: soft resetting link
ata5.00: configured for PIO3
ata5: EH complete
ata5.00: limiting speed to PIO0
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 

Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Timur Tabi [EMAIL PROTECTED] wrote:
 Grant Likely wrote:

   No it doesn't, it depends on the register interface to decide
   compatibility.  Clock interface is part of that.


 I don't think so.  The interface for programming the clock registers is
  identical on all 8[356]xx parts.  The only thing that matters is what 
 specific
  values to put in the FDR and DFSR registers to get a desired I2C bus speed.
  That answer is dependent on the actual clock input to the device, which is
  external to the device.  I wouldn't call the input frequency a property of 
 the
  I2C device.


   I suggested encoding
   the clock divider directly in compatible (implicit in the SoC version),
   but it doesn't have to be that way.  If clock freq is obtained from
   another property or some other method then that is okay too.


 I think we agree on that.


   There is nothing wrong with it (as long as we agree and it gets
   documented).  I certainly don't have a problem with doing it this way.


 I propose the property clock-frequency, like this:

 [EMAIL PROTECTED] {
 #address-cells = 1;
 #size-cells = 0;
 cell-index = 0;
 compatible = fsl-i2c;
 reg = 0x3000 0x100;
 interrupts = 14 0x8;
 interrupt-parent = ipic;
 dfsrr;

The existence of the dfsrr and fsl,mpc5200-i2c attributes imply to me
that the compatible strings were not done correctly. If these devices
really were compatible we wouldn't need extra attributes to tell them
apart.

So I'm with Grant on adding compatible strings sufficient to remove
the need for dfsrr and fsl,mpc5200-i2c.

Something like...
static const struct of_device_id mpc_i2c_of_match[] = {
   {.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200, },
   {.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200, },
   {.compatible = fsl,mpc8xxx-i2c, .data = fsl_i2c_dfsrr, },


As for the source clock, how about creating a new global like
ppc_proc_freq called ppc_ipb_freq. The platform code can then set the
right clock value into the variable. For mpc8 get it from uboot.
mpc5200 can easily compute it from ppc_proc_freq and checking how the
ipb divider is set. That will move the clock problem out of the i2c
driver.

I'd like to move towards a more generic uboot that gets the processor
minimally running and then use the device tree to customize as many
things as possible. But that's just my opinion.


-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Scott Wood

Timur Tabi wrote:

Grant Likely wrote:


No it doesn't, it depends on the register interface to decide
compatibility.  Clock interface is part of that. 


I don't think so.  The interface for programming the clock registers is
identical on all 8[356]xx parts.  The only thing that matters is what specific
values to put in the FDR and DFSR registers to get a desired I2C bus speed.


If it affects the values you need to write to the registers to achieve a 
given result, how is it not a difference in the register interface?



I propose the property clock-frequency, like this:

[EMAIL PROTECTED] {
#address-cells = 1;
#size-cells = 0;
cell-index = 0;
compatible = fsl-i2c;
reg = 0x3000 0x100;
interrupts = 14 0x8;
interrupt-parent = ipic;
dfsrr;
clock-frequency = 0xblablabla;  -- added by U-Boot
};


A clock-frequency property is OK, and is in line with what we do in 
other types of nodes.  However, in the long run it might be nice to 
introduce some sort of clock binding where, for example, the i2c node 
can point to a clock elsewhere in the device tree as an input clock.


That way, less knowledge is required by the firmware to poke values all 
over the place, and it also allows one to describe situations where the 
frequency of the input clock can change (such as in low-power modes).


-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Jon Smirl wrote:

 The existence of the dfsrr and fsl,mpc5200-i2c attributes imply to me
 that the compatible strings were not done correctly. If these devices
 really were compatible we wouldn't need extra attributes to tell them
 apart.

I agree with that.

 So I'm with Grant on adding compatible strings sufficient to remove
 the need for dfsrr and fsl,mpc5200-i2c.

Let's just make sure we don't break backwards compatibility.

 Something like...
 static const struct of_device_id mpc_i2c_of_match[] = {
{.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200, },
{.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200, },
{.compatible = fsl,mpc8xxx-i2c, .data = fsl_i2c_dfsrr, },

That seems ok.

 As for the source clock, how about creating a new global like
 ppc_proc_freq called ppc_ipb_freq. The platform code can then set the
 right clock value into the variable. For mpc8 get it from uboot.
 mpc5200 can easily compute it from ppc_proc_freq and checking how the
 ipb divider is set. That will move the clock problem out of the i2c
 driver.

I don't want to differentiate between 52xx and 8xxx any more than we have to.
If 8xxx has the clock frequency in the device tree, then 52xx should have it
there, too.

For backwards compatibility purposes, we can have the driver provide a
hard-coded value of some kind if the property does not exist.

 I'd like to move towards a more generic uboot that gets the processor
 minimally running and then use the device tree to customize as many
 things as possible. But that's just my opinion.

U-Boot needs to configure the I2C bus speed because U-Boot uses the I2C.  We
should capitalize on that by taking the information that U-Boot has calculated
and re-use it.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Scott Wood [EMAIL PROTECTED] wrote:
 Timur Tabi wrote:

  Grant Likely wrote:
 
 
   No it doesn't, it depends on the register interface to decide
   compatibility.  Clock interface is part of that.
  
 
  I don't think so.  The interface for programming the clock registers is
  identical on all 8[356]xx parts.  The only thing that matters is what
 specific
  values to put in the FDR and DFSR registers to get a desired I2C bus
 speed.
 

  If it affects the values you need to write to the registers to achieve a
 given result, how is it not a difference in the register interface?


  I propose the property clock-frequency, like this:
 
 [EMAIL PROTECTED] {
 #address-cells = 1;
 #size-cells = 0;
 cell-index = 0;
 compatible = fsl-i2c;
 reg = 0x3000 0x100;
 interrupts = 14 0x8;
 interrupt-parent = ipic;
 dfsrr;
 clock-frequency = 0xblablabla;  -- added by
 U-Boot
 };
 

  A clock-frequency property is OK, and is in line with what we do in other
 types of nodes.  However, in the long run it might be nice to introduce some
 sort of clock binding where, for example, the i2c node can point to a clock
 elsewhere in the device tree as an input clock.

  That way, less knowledge is required by the firmware to poke values all
 over the place, and it also allows one to describe situations where the
 frequency of the input clock can change (such as in low-power modes).

PowerPC,[EMAIL PROTECTED] {
timebase-frequency = 0;   /* From Bootloader  */
bus-frequency = 0;/* From Bootloader  */
clock-frequency = 0;  /* From Bootloader  */
};

The mpc5200 code already has mpc52xx_find_ipb_freq() to get it.

I should grep before suggesting things.

-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 01:35:51PM -0500, Timur Tabi wrote:
 Grant Likely wrote:
 
  No it doesn't, it depends on the register interface to decide
  compatibility.  Clock interface is part of that. 
 
 I don't think so.  The interface for programming the clock registers is
 identical on all 8[356]xx parts.  The only thing that matters is what specific
 values to put in the FDR and DFSR registers to get a desired I2C bus speed.
 That answer is dependent on the actual clock input to the device, which is
 external to the device.  I wouldn't call the input frequency a property of the
 I2C device.

Okay, I accept that argument.  Make input frequency a property and be
done with it.

  I suggested encoding
  the clock divider directly in compatible (implicit in the SoC version),
  but it doesn't have to be that way.  If clock freq is obtained from
  another property or some other method then that is okay too.
 
 I think we agree on that.

indeed.

  There is nothing wrong with it (as long as we agree and it gets
  documented).  I certainly don't have a problem with doing it this way.
 
 I propose the property clock-frequency, like this:
 
   [EMAIL PROTECTED] {
   #address-cells = 1;
   #size-cells = 0;
   cell-index = 0;
   compatible = fsl-i2c;
   reg = 0x3000 0x100;
   interrupts = 14 0x8;
   interrupt-parent = ipic;
   dfsrr;
   clock-frequency = 0xblablabla;  -- added by U-Boot
   };

I'm okay with this.

 Note that the dfsrr property already differentiates between 8xxx and 52xx, so
 maybe we don't need any other device tree changes.

The whole dfsrr property is a really bad idea, and it just replaces the
equally bad idea of an mpc5200-clocking property which is currently in
use.  This bit should definitely be determined via compatible.

g.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Scott Wood wrote:
 Timur Tabi wrote:
 Grant Likely wrote:

 No it doesn't, it depends on the register interface to decide
 compatibility.  Clock interface is part of that. 
 I don't think so.  The interface for programming the clock registers is
 identical on all 8[356]xx parts.  The only thing that matters is what 
 specific
 values to put in the FDR and DFSR registers to get a desired I2C bus speed.
 
 If it affects the values you need to write to the registers to achieve a 
 given result, how is it not a difference in the register interface?

I think we're splitting hairs, here.  The code for actually programming the
registers is the same, regardless of the input frequency.  It's just that the
input frequency is a value needed to calculate the right value.  But like I
said, I don't think this distinction is that relevant.

 A clock-frequency property is OK, and is in line with what we do in 
 other types of nodes.  However, in the long run it might be nice to 
 introduce some sort of clock binding where, for example, the i2c node 
 can point to a clock elsewhere in the device tree as an input clock.

The only problem with that is that the actual input clock to the I2C device is
not the same as any other device.  It's a unique clock.  Look at the code I had
to write to figure out this clock just on 85xx:

/*
 * The base clock for I2C depends on the actual SOC.  Unfortunately,
 * there is no pattern that can be used to determine the frequency, so
 * the only choice is to look up the actual SOC number and use the value
 * for that SOC. This information is taken from application note
 * AN2919.
 */
#if defined(CONFIG_MPC8540) || defined(CONFIG_MPC8541) || \
defined(CONFIG_MPC8560) || defined(CONFIG_MPC8555)
gd-i2c1_clk = sys_info.freqSystemBus;
#elif defined(CONFIG_MPC8544)
/*
 * On the 8544, the I2C clock is the same as the SEC clock.  This can be
 * either CCB/2 or CCB/3, depending on the value of cfg_sec_freq. See
 * 4.4.3.3 of the 8544 RM.  Note that this might actually work for all
 * 85xx, but only the 8544 has cfg_sec_freq, so it's unknown if the
 * PORDEVSR2_SEC_CFG bit is 0 on all 85xx boards that are not an 8544.
 */
if (gur-pordevsr2  MPC85xx_PORDEVSR2_SEC_CFG)
gd-i2c1_clk = sys_info.freqSystemBus / 3;
else
gd-i2c1_clk = sys_info.freqSystemBus / 2;
#else
/* Most 85xx SOCs use CCB/2, so this is the default behavior. */
gd-i2c1_clk = sys_info.freqSystemBus / 2;
#endif

This is just for 85xx, and it only applies to the I2C devices.

 That way, less knowledge is required by the firmware to poke values all 
 over the place, and it also allows one to describe situations where the 
 frequency of the input clock can change (such as in low-power modes).

I don't think it's possible to do what you want to do.  I2C clocking is
completely screwed up, and that's just the way the hardware was designed.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Scott Wood

Timur Tabi wrote:

Scott Wood wrote:
A clock-frequency property is OK, and is in line with what we do in 
other types of nodes.  However, in the long run it might be nice to 
introduce some sort of clock binding where, for example, the i2c node 
can point to a clock elsewhere in the device tree as an input clock.


The only problem with that is that the actual input clock to the I2C device is
not the same as any other device.  It's a unique clock.  Look at the code I had
to write to figure out this clock just on 85xx:


IIRC, only the divider is unique, and the divider that is applied to the 
input clock can be specified in the i2c node (either implicitly in 
compatible, or explicitly via a property).


-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 02:57:25PM -0400, Jon Smirl wrote:
 On 7/31/08, Timur Tabi [EMAIL PROTECTED] wrote:
  Grant Likely wrote:
  I propose the property clock-frequency, like this:
 
  [EMAIL PROTECTED] {
  #address-cells = 1;
  #size-cells = 0;
  cell-index = 0;
  compatible = fsl-i2c;
  reg = 0x3000 0x100;
  interrupts = 14 0x8;
  interrupt-parent = ipic;
  dfsrr;
 
 The existence of the dfsrr and fsl,mpc5200-i2c attributes imply to me
 that the compatible strings were not done correctly. If these devices
 really were compatible we wouldn't need extra attributes to tell them
 apart.

Yes, exactly right.

 So I'm with Grant on adding compatible strings sufficient to remove
 the need for dfsrr and fsl,mpc5200-i2c.
 
 Something like...
 static const struct of_device_id mpc_i2c_of_match[] = {
{.compatible = fsl,mpc5200b-i2c, .data = fsl_i2c_mpc5200, },
{.compatible = fsl,mpc5200-i2c, .data = fsl_i2c_mpc5200, },
{.compatible = fsl,mpc8xxx-i2c, .data = fsl_i2c_dfsrr, },

Yes, but I don't agree with the last entry in this list.
fsl,mpc8xxx-i2c is not a good value.  Use an actual part number
instead and have newer devices claim compatibility with the old.

The problem with 8xxx is that you don't know if/when another 8xxx
will arrive on the scene which does not conform to already made
assumptions.  If you specify an exact SoC part, then the problem goes
away without any downside.

 As for the source clock, how about creating a new global like
 ppc_proc_freq called ppc_ipb_freq. The platform code can then set the
 right clock value into the variable. For mpc8 get it from uboot.
 mpc5200 can easily compute it from ppc_proc_freq and checking how the
 ipb divider is set. That will move the clock problem out of the i2c
 driver.

In general, I don't like adding additional global or machine vars to
Linux.  I just don't think it in necessary and it can quickly get out of
control.  Instead, For 5200 I've exported a mpc52xx specific function
that returns the clock frequency.  The function can be adapted over time
to handle new cases on the 52xx platform and it get compiled out if 5200
is not in use.

As an alternative, clock-frequency could be made an optional property.
If it isn't specified, then get the bus-frequency property from the
parent node instead.

g.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Wolfgang Grandegger wrote:

 No, the source clock is not identical for all 8[356]xx. Some use half or 
 even a third of the SOC clock frequency. 

The platform clock divided by 2 or 3 *is* the source clock to the I2C.  That's
what I'm talking about.

 Linux must determine the real
 source clock frequency somehow. We may introduce the SOC property 
 i2c-clock-frequency, which could be fixed up by U-Boot or a pre-loader 
 (in case U-Boot is not used). Like for other frequency properties as well.

That's what I'm proposing, but drop the i2c-.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: libata badness

2008-07-31 Thread Ben Dooks
On Thu, Jul 31, 2008 at 01:53:45PM -0500, Kumar Gala wrote:
 I'm getting the following badness with top of tree on a embedded  
 PowerPC w/a ULI 1575 bridge (M5229 IDE):
 
 02:1f.0 IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev c8)
 
 If you need more info let me know.
 
 - k
 
 ahci :02:1f.1: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl  
 SATA mode
 ahci :02:1f.1: flags: ncq sntf ilck pm led clo pmp pio slum part
 [ cut here ]
 Badness at c021e700 [verbose debug info unavailable]

it would be helpful to compile a kernel with verbose fault information
turned on.

 NIP: c021e700 LR: c021e6e8 CTR: c022ce90
 REGS: ef82bca0 TRAP: 0700   Not tainted  (2.6.27-rc1-00495-g33bd9fe- 
 dirty)
 MSR: 00029032 EE,ME,IR,DR  CR: 22044022  XER: 2000
 TASK = ef83[1] 'swapper' THREAD: ef82a000 CPU: 1
 GPR00: 0001 ef82bd50 ef83  9032  ef0b64fc  
 
 GPR08: ef937c10 c022f93f efbfc8e0 ef900b60 22044028 fffdd3ff 0ffe8700  
 c044c17c
 GPR16: c04d52ec ef82f458 ef82f4f4 ef82f4f4 c044c234 c044c5d8 c044c5d8  
 c044c220
 GPR24: c044c218 c04d52f0 0080 c022f940  c044c244 efbec490  
 
 NIP [c021e700] ata_host_activate+0x40/0x10c
 LR [c021e6e8] ata_host_activate+0x28/0x10c
 Call Trace:
 [ef82bd50] [c021e6e8] ata_host_activate+0x28/0x10c (unreliable)
 [ef82bd80] [c022f48c] ahci_init_one+0x8b4/0xd68
 [ef82be30] [c01b69c0] pci_device_probe+0x84/0xa8
 [ef82be50] [c01e5438] driver_probe_device+0xb4/0x1e8
 [ef82be70] [c01e55f0] __driver_attach+0x84/0x88
 [ef82be90] [c01e4ba0] bus_for_each_dev+0x5c/0xa4
 [ef82bec0] [c01e5254] driver_attach+0x24/0x34
 [ef82bed0] [c01e44b8] bus_add_driver+0x1d8/0x24c
 [ef82bef0] [c01e5810] driver_register+0x70/0x160
 [ef82bf10] [c01b6c54] __pci_register_driver+0x64/0xc4
 [ef82bf30] [c04aaa60] ahci_init+0x28/0x38
 [ef82bf40] [c048717c] do_one_initcall+0x38/0x1ac
 [ef82bfb0] [c04874e0] kernel_init+0x1f0/0x1fc
 [ef82bff0] [c0013b04] kernel_thread+0x44/0x60
 Instruction dump:
 7cbb2b78 90010034 7cda3378 7cf93b78 7c7e1b78 4bff9dbd 7c7f1b79 408200c8
 2f9c 409e002c 313b 7c09d910 0f00 80010034 7fc3f378  
 7f24cb78
 scsi0 : ahci
 scsi1 : ahci
 scsi2 : ahci
 scsi3 : ahci
 ata1: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006100
 ata2: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006180
 ata3: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006200
 ata4: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006280
 ata1: SATA link down (SStatus 0 SControl 300)
 ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
 ata2.00: qc timeout (cmd 0xec)
 ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
 ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
 ata2.00: qc timeout (cmd 0xec)
 ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
 ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
 ata2.00: qc timeout (cmd 0xec)
 ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
 ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
 ata3: SATA link down (SStatus 0 SControl 300)
 ata4: SATA link down (SStatus 0 SControl 300)
 scsi4 : pata_ali
 scsi5 : pata_ali
 ata5: PATA max UDMA/133 cmd 0x1200 ctl 0x1208 bmdma 0x1220
 ata6: PATA max UDMA/133 cmd 0x1210 ctl 0x1218 bmdma 0x1228

Looks like you're running into the same problem as I did, with the
fact that the M5529 is in native mode, but doesn't have any IRQ
available. Do you know if the bridge chip in it is in routes the
IRQs internally?

 ata5.00: ATAPI: TSSTcorp CDW/DVD SH-M522C, TS06, max UDMA/33
 ata5.00: WARNING: ATAPI DMA disabled for reliablity issues.  It can be  
 enabled
 ata5.00: WARNING: via pata_ali.atapi_dma modparam or corresponding  
 sysfs node.
 ata5.00: configured for UDMA/33
 ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
 ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
  cdb 12 00 00 00 24 00 00 00  00 00 00 00 00 00 00 00
  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
 ata5.00: status: { DRDY }
 ata5: soft resetting link
 ata5.00: configured for UDMA/33
 ata5: EH complete
 ata5.00: limiting speed to UDMA/25:PIO4
 ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
 ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
  cdb 12 00 00 00 24 00 00 00  00 00 00 00 00 00 00 00
  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
 ata5.00: status: { DRDY }
 ata5: soft resetting link
 ata5.00: configured for UDMA/25
 ata5: EH complete
 ata5.00: limiting speed to PIO4
 ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
 ata5.00: cmd a0/00:00:00:24:00/00:00:00:00:00/a0 tag 0 pio 36 in
  cdb 12 00 00 00 24 00 00 00  00 00 00 00 00 00 00 00
  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
 ata5.00: status: { DRDY }
 ata5: soft resetting link
 ata5.00: configured for PIO4
 ata5: EH complete
 ata5.00: limiting speed to PIO3
 ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 

Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 09:54:48PM +0200, Wolfgang Grandegger wrote:
 Thinking more about it, it would be best to add the property  
 i2c-clock-divider to the soc node and implement fsl_get_i2c_freq() in  
 a similar way:

 [EMAIL PROTECTED] {
 #address-cells = 1;
 #size-cells = 1;
 device_type = soc;
 ranges = 0x0 0xe000 0x10;
 reg = 0xe000 0x1000;  // CCSRBAR 1M
 bus-frequency = 0;
 i2c-clock-divider = 2;

 U-Boot could then fixup that value like bus-frequency() and the i2c-mpc  
 driver simply calls fsl_get_i2c_freq().

Ugh.  This is specifically related to the i2c device, so please place
the property in the i2c device.  Remember, device tree design is not
about what will make the implementation simplest, but rather about what
describes the hardware in the best way.

Now, if you can argue that i2c-clock-frequency is actually a separate
clock domain defined at the SoC level, not the i2c device level, then I
will change my opinion.

g.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Wolfgang Grandegger wrote:

 But clock-frequency, aka bus-frequency, is already used by 
 fsl_get_sys_freq():
 
 http://lxr.linux.no/linux+v2.6.26/arch/powerpc/sysdev/fsl_soc.c#L80

So?  clock-frequency is a per-node property.  I use it in the codec node on the
8610 (mpc8610_hpcd.dts).  It does not mean platform clock frequency.

 U-Boot could then fixup that value like bus-frequency() and the i2c-mpc 
 driver simply calls fsl_get_i2c_freq().

This is just more complicated than it needs to be.  Why should the I2C driver
fetch the platform clock and the divider from the parent node, and then do
additional math, when it could just get the value it needs right from the node
it's probing?

Besides, U-Boot does not currently store the divider value.  Look at the code
I've posted twice already - it stores the frequency in i2c1_clk.  So now I would
need to create another variable in the gd_t to store the divider?  No thanks.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Timur Tabi wrote:

Wolfgang Grandegger wrote:

But clock-frequency, aka bus-frequency, is already used by 
fsl_get_sys_freq():


http://lxr.linux.no/linux+v2.6.26/arch/powerpc/sysdev/fsl_soc.c#L80


So?  clock-frequency is a per-node property.  I use it in the codec node on the
8610 (mpc8610_hpcd.dts).  It does not mean platform clock frequency.

U-Boot could then fixup that value like bus-frequency() and the i2c-mpc 
driver simply calls fsl_get_i2c_freq().


This is just more complicated than it needs to be.  Why should the I2C driver
fetch the platform clock and the divider from the parent node, and then do
additional math, when it could just get the value it needs right from the node
it's probing?


I'm a bit confused. The frequency of the I2C source clock and the real 
I2C clock frequency are two different things. The first one is common 
for all I2C devices, the second can be different. What properties would 
you like to use for defining both?



Besides, U-Boot does not currently store the divider value.  Look at the code
I've posted twice already - it stores the frequency in i2c1_clk.  So now I would
need to create another variable in the gd_t to store the divider?  No thanks.


OK, that's an argument but it's biased by U-Boot.

Wolfgang.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Grant Likely wrote:

On Thu, Jul 31, 2008 at 09:54:48PM +0200, Wolfgang Grandegger wrote:
Thinking more about it, it would be best to add the property  
i2c-clock-divider to the soc node and implement fsl_get_i2c_freq() in  
a similar way:


[EMAIL PROTECTED] {
#address-cells = 1;
#size-cells = 1;
device_type = soc;
ranges = 0x0 0xe000 0x10;
reg = 0xe000 0x1000;  // CCSRBAR 1M
bus-frequency = 0;
i2c-clock-divider = 2;

U-Boot could then fixup that value like bus-frequency() and the i2c-mpc  
driver simply calls fsl_get_i2c_freq().


Ugh.  This is specifically related to the i2c device, so please place
the property in the i2c device.  Remember, device tree design is not
about what will make the implementation simplest, but rather about what
describes the hardware in the best way.

Now, if you can argue that i2c-clock-frequency is actually a separate
clock domain defined at the SoC level, not the i2c device level, then I
will change my opinion.


Is it not exactly that? For me it's not a per I2C device property, at least.

Wolfgang.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Board level compatibility matching

2008-07-31 Thread Grant Likely
This topic keeps coming up, so it is probably time to address it once
and for all.

When it comes to machine level support in arch/powerpc, there seems to
me that there are two levels or machine support.

Level 1 is the board specific stuff.  Board X has a, b, and c things
that need to be done for Linux to work correctly, but the fixups are
entirely board specific and will only ever be used on a single board
port.  The lite5200 support in arch/powerpc/platforms/52xx is an
example of this kind of board support.  It binds on a value in the top
level compatible property.

Level 2 is kind of the generic catch-all machine support for systems
that are unremarkable and don't require any special code to be run.
In most cases, new boards can be supported by this generic code
without any changes to the Linux kernel.
arch/powerpc/platforms/52xx/mpc5200_simple.c is an example here.
mpc5200_simple maintains a list of boards that are known to work with
it.

At the moment, every new board port forces a linux kernel source tree
change, even if it is just adding a single string to the match table.
I'm willing to wager that 99 times out of 100, boards based on the
mpc5200 SoC will want to use the common board support code and that
maintaining an explicit list of supported boards is completely
unnecessary.  I expect that the exact same is true for 8xxx and 4xx
SoCs.  So, rather than continuing to need to maintain explicit lists,
I propose the following:

- Add a property to the device tree that explicitly specifies the SoC
that the board is based on.  Something like 'soc-model =
fsl,mpc5200b' would be appropriate.  This in and of itself does not
change the usage conventions, it just provides more information about
the hardware.  (Another idea is to add a string to the top level
compatible property, but there are still arguments about what
compatible really means in the root node.)

- Prioritize board ports in the arch/powerpc/platforms directory to
identify level-1 machines support from the level-2 ones.  Make sure
that level-1 stuff always gets probed before level-2 stuff within each
SoC family.  In all likelyhood, this would probably just involve
making sure that board specific machines get linked in before the
catchall machine.

- Change level-2 machine support to bind on soc-model instead of an
explicit compatible list.

Doing so should simplify adding new board ports.  In many cases it
would just involve dropping in a new .dts file.  However, it retains
the flexability of overriding generic code with platform specific
fixups as the need arises.  I know we've been cautious about adding
catch-all bindings to the device tree, and it is a big deal to avoid
adding wildcards to compatible values.  However, this solution should
be workable because it doesn't involve stating something that is not
true in the device tree and it maintains the ability to override
cleanly when new bugs are discovered.  It also doesn't try to define
wildcard values in compatible or other device tree properties.

Thoughts?
g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: libata badness

2008-07-31 Thread Kumar Gala


On Jul 31, 2008, at 2:02 PM, Ben Dooks wrote:


On Thu, Jul 31, 2008 at 01:53:45PM -0500, Kumar Gala wrote:

I'm getting the following badness with top of tree on a embedded
PowerPC w/a ULI 1575 bridge (M5229 IDE):

02:1f.0 IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev  
c8)


If you need more info let me know.

- k

ahci :02:1f.1: AHCI 0001. 32 slots 4 ports 3 Gbps 0xf impl
SATA mode
ahci :02:1f.1: flags: ncq sntf ilck pm led clo pmp pio slum part
[ cut here ]
Badness at c021e700 [verbose debug info unavailable]


it would be helpful to compile a kernel with verbose fault information
turned on.


Doesn't seem to provide too much more info on ppc.  just says the file/ 
line number of the badness.



NIP: c021e700 LR: c021e6e8 CTR: c022ce90
REGS: ef82bca0 TRAP: 0700   Not tainted  (2.6.27-rc1-00495-g33bd9fe-
dirty)
MSR: 00029032 EE,ME,IR,DR  CR: 22044022  XER: 2000
TASK = ef83[1] 'swapper' THREAD: ef82a000 CPU: 1
GPR00: 0001 ef82bd50 ef83  9032  ef0b64fc

GPR08: ef937c10 c022f93f efbfc8e0 ef900b60 22044028 fffdd3ff 0ffe8700
c044c17c
GPR16: c04d52ec ef82f458 ef82f4f4 ef82f4f4 c044c234 c044c5d8 c044c5d8
c044c220
GPR24: c044c218 c04d52f0 0080 c022f940  c044c244 efbec490

NIP [c021e700] ata_host_activate+0x40/0x10c
LR [c021e6e8] ata_host_activate+0x28/0x10c
Call Trace:
[ef82bd50] [c021e6e8] ata_host_activate+0x28/0x10c (unreliable)
[ef82bd80] [c022f48c] ahci_init_one+0x8b4/0xd68
[ef82be30] [c01b69c0] pci_device_probe+0x84/0xa8
[ef82be50] [c01e5438] driver_probe_device+0xb4/0x1e8
[ef82be70] [c01e55f0] __driver_attach+0x84/0x88
[ef82be90] [c01e4ba0] bus_for_each_dev+0x5c/0xa4
[ef82bec0] [c01e5254] driver_attach+0x24/0x34
[ef82bed0] [c01e44b8] bus_add_driver+0x1d8/0x24c
[ef82bef0] [c01e5810] driver_register+0x70/0x160
[ef82bf10] [c01b6c54] __pci_register_driver+0x64/0xc4
[ef82bf30] [c04aaa60] ahci_init+0x28/0x38
[ef82bf40] [c048717c] do_one_initcall+0x38/0x1ac
[ef82bfb0] [c04874e0] kernel_init+0x1f0/0x1fc
[ef82bff0] [c0013b04] kernel_thread+0x44/0x60
Instruction dump:
7cbb2b78 90010034 7cda3378 7cf93b78 7c7e1b78 4bff9dbd 7c7f1b79  
408200c8

2f9c 409e002c 313b 7c09d910 0f00 80010034 7fc3f378
7f24cb78
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
ata1: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006100
ata2: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006180
ata3: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006200
ata4: SATA max UDMA/133 abar [EMAIL PROTECTED] port 0x80006280
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
scsi4 : pata_ali
scsi5 : pata_ali
ata5: PATA max UDMA/133 cmd 0x1200 ctl 0x1208 bmdma 0x1220
ata6: PATA max UDMA/133 cmd 0x1210 ctl 0x1218 bmdma 0x1228


Looks like you're running into the same problem as I did, with the
fact that the M5529 is in native mode, but doesn't have any IRQ
available. Do you know if the bridge chip in it is in routes the
IRQs internally?


They are routed via an i8259 in the bridge.

- k

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Wolfgang Grandegger

Timur Tabi wrote:

Wolfgang Grandegger wrote:

I'm a bit confused. The frequency of the I2C source clock and the real 
I2C clock frequency are two different things. 


There are two frequencies:

1) The frequency of the input clock to the I2C device, after it has gone through
a divider.  This is what I call the I2C clock frequency and what I think
belongs in the clock-frequency property.  This is usually the platform clock
divided by 1, 2, or 3.


OK.


2) The speed of the I2C bus, as seen by devices on that bus.  This is usually
400KHz.


Which should be defined with the property current-speed, right?

Wolfgang.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Wolfgang Grandegger wrote:

 I'm a bit confused. The frequency of the I2C source clock and the real 
 I2C clock frequency are two different things. 

There are two frequencies:

1) The frequency of the input clock to the I2C device, after it has gone through
a divider.  This is what I call the I2C clock frequency and what I think
belongs in the clock-frequency property.  This is usually the platform clock
divided by 1, 2, or 3.

2) The speed of the I2C bus, as seen by devices on that bus.  This is usually
400KHz.

 The first one is common 
 for all I2C devices, the second can be different. What properties would 
 you like to use for defining both?

The platform clock has no value to the I2C hardware, so I don't care anything
about it.

 Besides, U-Boot does not currently store the divider value.  Look at the code
 I've posted twice already - it stores the frequency in i2c1_clk.  So now I 
 would
 need to create another variable in the gd_t to store the divider?  No thanks.
 
 OK, that's an argument but it's biased by U-Boot.

As long as a method that is favorable to U-Boot does not put any undo hardship
on non-U-Boot methods, I would say that it is the preferred method.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Wolfgang Grandegger wrote:

 Is it not exactly that? For me it's not a per I2C device property, at least.

Of course it's a per-I2C device property.  The input frequency to the I2C device
is only seen by the I2C device, and no other device.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 2:19 PM, Timur Tabi [EMAIL PROTECTED] wrote:
 Wolfgang Grandegger wrote:

 I'm a bit confused. The frequency of the I2C source clock and the real
 I2C clock frequency are two different things.

 There are two frequencies:

 1) The frequency of the input clock to the I2C device, after it has gone 
 through
 a divider.  This is what I call the I2C clock frequency and what I think
 belongs in the clock-frequency property.  This is usually the platform clock
 divided by 1, 2, or 3.

 2) The speed of the I2C bus, as seen by devices on that bus.  This is usually
 400KHz.

analogous example: serial nodes use two properties; clock-frequency
and current-speed, one for clock frequency routed into the device
and one for the current baud rate.  It is completely relevant to put
the effective clock frequency into the i2c device node.

g.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Wolfgang Grandegger [EMAIL PROTECTED] wrote:
 Timur Tabi wrote:

  Wolfgang Grandegger wrote:
 
 
   But clock-frequency, aka bus-frequency, is already used by
 fsl_get_sys_freq():
  
  
 http://lxr.linux.no/linux+v2.6.26/arch/powerpc/sysdev/fsl_soc.c#L80
  
 
  So?  clock-frequency is a per-node property.  I use it in the codec node
 on the
  8610 (mpc8610_hpcd.dts).  It does not mean platform clock frequency.
 
 
   U-Boot could then fixup that value like bus-frequency() and the i2c-mpc
 driver simply calls fsl_get_i2c_freq().
  
 
  This is just more complicated than it needs to be.  Why should the I2C
 driver
  fetch the platform clock and the divider from the parent node, and then do
  additional math, when it could just get the value it needs right from the
 node
  it's probing?
 

  I'm a bit confused. The frequency of the I2C source clock and the real I2C
 clock frequency are two different things. The first one is common for all
 I2C devices, the second can be different. What properties would you like to
 use for defining both?

For mpc5200 it is easy, mpc52xx_find_ipb_freq()  already exists to get
the source clock for the i2c devices. Each i2c node in the device tree
can then specify clock-frequency = 40; or let it default. You
have the input clock and the desired clock, do some math and you can
set FDR.

For the 8xxx chips there is no simple solution for obtain the input
clock for the i2c section. Maybe create something like i2c-frequency
and set it from uboot. The make another accessor function like
mpc8xxx_find_i2c_freq().

PowerPC,[EMAIL PROTECTED] {
timebase-frequency = 0;   /* From Bootloader  */
bus-frequency = 0;/* From Bootloader  */
clock-frequency = 0;  /* From Bootloader  */
i2c-frequency = 0;/* From Bootloader  */
};

Instead of creating i2c-frequency in the device tree, you could put
this piece of code in the the mpc8xxx platform driver and use it to
implement mpc8xxx_find_i2c_freq(). Read the PORDEVSR2_SEC_CFG bit back
from whatever uboot set it too.

#if defined(CONFIG_MPC8540) || defined(CONFIG_MPC8541) || \
   defined(CONFIG_MPC8560) || defined(CONFIG_MPC8555)
   gd-i2c1_clk = sys_info.freqSystemBus;
#elif defined(CONFIG_MPC8544)
   /*
* On the 8544, the I2C clock is the same as the SEC clock.  This can be
* either CCB/2 or CCB/3, depending on the value of cfg_sec_freq. See
* 4.4.3.3 of the 8544 RM.  Note that this might actually work for all
* 85xx, but only the 8544 has cfg_sec_freq, so it's unknown if the
* PORDEVSR2_SEC_CFG bit is 0 on all 85xx boards that are not an 8544.
*/
   if (gur-pordevsr2  MPC85xx_PORDEVSR2_SEC_CFG)
   gd-i2c1_clk = sys_info.freqSystemBus / 3;
   else
   gd-i2c1_clk = sys_info.freqSystemBus / 2;
#else
   /* Most 85xx SOCs use CCB/2, so this is the default behavior. */
   gd-i2c1_clk = sys_info.freqSystemBus / 2;
#endif
   gd-i2c2_clk = gd-i2c1_clk;


-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 2:32 PM, Jon Smirl [EMAIL PROTECTED] wrote:
 On 7/31/08, Wolfgang Grandegger [EMAIL PROTECTED] wrote:
 Timur Tabi wrote:

  Wolfgang Grandegger wrote:
 
 
   But clock-frequency, aka bus-frequency, is already used by
 fsl_get_sys_freq():
  
  
 http://lxr.linux.no/linux+v2.6.26/arch/powerpc/sysdev/fsl_soc.c#L80
  
 
  So?  clock-frequency is a per-node property.  I use it in the codec node
 on the
  8610 (mpc8610_hpcd.dts).  It does not mean platform clock frequency.
 
 
   U-Boot could then fixup that value like bus-frequency() and the i2c-mpc
 driver simply calls fsl_get_i2c_freq().
  
 
  This is just more complicated than it needs to be.  Why should the I2C
 driver
  fetch the platform clock and the divider from the parent node, and then do
  additional math, when it could just get the value it needs right from the
 node
  it's probing?
 

  I'm a bit confused. The frequency of the I2C source clock and the real I2C
 clock frequency are two different things. The first one is common for all
 I2C devices, the second can be different. What properties would you like to
 use for defining both?

How is the divider controlled?  Is it a fixed property of the SoC? a
shared register setting? or a register setting within the i2c device?
(My answer depends on the layout)

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Wolfgang Grandegger wrote:

 2) The speed of the I2C bus, as seen by devices on that bus.  This is usually
 400KHz.
 
 Which should be defined with the property current-speed, right?

I would use something like bus-speed, but yes.  The word current shouldn't
be in a property string.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: libata badness

2008-07-31 Thread Scott Wood
On Thu, Jul 31, 2008 at 03:24:31PM -0500, Kumar Gala wrote:
 it would be helpful to compile a kernel with verbose fault information
 turned on.
 
 Doesn't seem to provide too much more info on ppc.

I'm pretty sure it does the same thing on ppc as on all the other
architectures.

 just says the file/ line number of the badness.

...which is a lot more useful than an address in someone else's kernel
image.

-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: Board level compatibility matching

2008-07-31 Thread Chris Friesen

Grant Likely wrote:


Doing so should simplify adding new board ports.  In many cases it
would just involve dropping in a new .dts file.  However, it retains
the flexability of overriding generic code with platform specific
fixups as the need arises.


If it makes it easier for board vendors to do a clean port, you get a 
vote from me.


I've seen a large amount of vendor code from that needed a lot of 
massaging before it would play nice with support for similar boards from 
other vendors--not to mention how ugly it looked with all the 
board-specific ifdefs.


Chris
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Jon Smirl wrote:

 For mpc5200 it is easy, mpc52xx_find_ipb_freq()  already exists to get
 the source clock for the i2c devices. Each i2c node in the device tree
 can then specify clock-frequency = 40; or let it default. You

400KHz is the *speed* of the I2C bus, so let's be sure to use speed in the
property name.  Reserve the word bus for the input clock to the I2C device.

 #if defined(CONFIG_MPC8540) || defined(CONFIG_MPC8541) || \
defined(CONFIG_MPC8560) || defined(CONFIG_MPC8555)
gd-i2c1_clk = sys_info.freqSystemBus;
 #elif defined(CONFIG_MPC8544)
/*
 * On the 8544, the I2C clock is the same as the SEC clock.  This can 
 be
 * either CCB/2 or CCB/3, depending on the value of cfg_sec_freq. See
 * 4.4.3.3 of the 8544 RM.  Note that this might actually work for all
 * 85xx, but only the 8544 has cfg_sec_freq, so it's unknown if the
 * PORDEVSR2_SEC_CFG bit is 0 on all 85xx boards that are not an 8544.
 */
if (gur-pordevsr2  MPC85xx_PORDEVSR2_SEC_CFG)
gd-i2c1_clk = sys_info.freqSystemBus / 3;
else
gd-i2c1_clk = sys_info.freqSystemBus / 2;
 #else
/* Most 85xx SOCs use CCB/2, so this is the default behavior. */
gd-i2c1_clk = sys_info.freqSystemBus / 2;
 #endif
gd-i2c2_clk = gd-i2c1_clk;

I think the whole point is to eliminate duplicating this code in the Linux
driver.  It's hideously ugly, so it should be limited as much as possible.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Timur Tabi [EMAIL PROTECTED] wrote:
 Jon Smirl wrote:

   For mpc5200 it is easy, mpc52xx_find_ipb_freq()  already exists to get
   the source clock for the i2c devices. Each i2c node in the device tree
   can then specify clock-frequency = 40; or let it default. You


 400KHz is the *speed* of the I2C bus, so let's be sure to use speed in the
  property name.  Reserve the word bus for the input clock to the I2C device.


   #if defined(CONFIG_MPC8540) || defined(CONFIG_MPC8541) || \
  defined(CONFIG_MPC8560) || defined(CONFIG_MPC8555)
  gd-i2c1_clk = sys_info.freqSystemBus;
   #elif defined(CONFIG_MPC8544)
  /*
   * On the 8544, the I2C clock is the same as the SEC clock.  This 
 can be
   * either CCB/2 or CCB/3, depending on the value of cfg_sec_freq. 
 See
   * 4.4.3.3 of the 8544 RM.  Note that this might actually work for 
 all
   * 85xx, but only the 8544 has cfg_sec_freq, so it's unknown if the
   * PORDEVSR2_SEC_CFG bit is 0 on all 85xx boards that are not an 
 8544.
   */
  if (gur-pordevsr2  MPC85xx_PORDEVSR2_SEC_CFG)
  gd-i2c1_clk = sys_info.freqSystemBus / 3;
  else
  gd-i2c1_clk = sys_info.freqSystemBus / 2;
   #else
  /* Most 85xx SOCs use CCB/2, so this is the default behavior. */
  gd-i2c1_clk = sys_info.freqSystemBus / 2;
   #endif
  gd-i2c2_clk = gd-i2c1_clk;


 I think the whole point is to eliminate duplicating this code in the Linux
  driver.  It's hideously ugly, so it should be limited as much as possible.

It wouldn't go into the i2c driver, it would go into the mpc8xxx
platform driver. Why is it bad to put it into the mpc8xxx platform
driver? It is an accurate description of the mpc8xxx platform isn't
it?


-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Grant Likely wrote:

 How is the divider controlled?  Is it a fixed property of the SoC? 

Yes.  The divider is either 1, 2, or 3, and the only way to know which one it is
is to look up the specific SOC model number.  And depending on the SOC model,
there may also be a register that needs to be looked up.

 a
 shared register setting? or a register setting within the i2c device?

The I2C device itself has no idea what the divider is.  It only sees the result
of the divider.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 03:37:07PM -0500, Timur Tabi wrote:
 Grant Likely wrote:
 
  How is the divider controlled?  Is it a fixed property of the SoC? 
 
 Yes.  The divider is either 1, 2, or 3, and the only way to know which one
 it is is to look up the specific SOC model number.  And depending on the
 SOC model, there may also be a register that needs to be looked up.
 
  a
  shared register setting? or a register setting within the i2c device?
 
 The I2C device itself has no idea what the divider is.  It only sees the
 result of the divider.

Then that absolutely suggests to me that either the final clock or the
divider should be encoded in the i2c node; not in the soc node.

g.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: Board level compatibility matching

2008-07-31 Thread Jon Smirl
On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
 This topic keeps coming up, so it is probably time to address it once
  and for all.

  When it comes to machine level support in arch/powerpc, there seems to
  me that there are two levels or machine support.

..

  Thoughts?
  g.


As part of this, how can we going to solve the problem with triggering
the load of a board specific machine/fabric driver in a generic way?

-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Timur Tabi
Jon Smirl wrote:

 It wouldn't go into the i2c driver, it would go into the mpc8xxx
 platform driver. Why is it bad to put it into the mpc8xxx platform
 driver? It is an accurate description of the mpc8xxx platform isn't
 it?

There's no need to put that code in the platform driver because U-Boot will put
the final result in the device tree!  That way, we won't need to update the
platform driver *and* U-Boot for any new SOC.

-- 
Timur Tabi
Linux kernel developer at Freescale
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: Board level compatibility matching

2008-07-31 Thread Grant Likely
On Thu, Jul 31, 2008 at 04:49:49PM -0400, Jon Smirl wrote:
 On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
  This topic keeps coming up, so it is probably time to address it once
   and for all.
 
   When it comes to machine level support in arch/powerpc, there seems to
   me that there are two levels or machine support.
 
 ..
 
   Thoughts?
   g.
 
 
 As part of this, how can we going to solve the problem with triggering
 the load of a board specific machine/fabric driver in a generic way?

That really is a separate problem.  We *could* do this with a board
specific powerpc machine driver, but I don't think it is the best
solution.

I'm still thinking that the drivers module_init() function could check
the top level board model property and decide whether or not to load
based on that.

g.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] powerpc: i2c-mpc: make speed registers configurable via FDT

2008-07-31 Thread Jon Smirl
On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
 On Thu, Jul 31, 2008 at 03:37:07PM -0500, Timur Tabi wrote:
   Grant Likely wrote:
  
How is the divider controlled?  Is it a fixed property of the SoC?
  
   Yes.  The divider is either 1, 2, or 3, and the only way to know which one
   it is is to look up the specific SOC model number.  And depending on the
   SOC model, there may also be a register that needs to be looked up.
  
a
shared register setting? or a register setting within the i2c device?
  
   The I2C device itself has no idea what the divider is.  It only sees the
   result of the divider.


 Then that absolutely suggests to me that either the final clock or the
  divider should be encoded in the i2c node; not in the soc node.

Isn't there a single global divider that generates all the i2c source
clocks? You don't want to copy a global value into each i2c node.
Aren't we talking about the /2 or /3 or /1 divider that appears to be
randomly implemented on various members of the mpc8xxx family?



  g.



-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: Board level compatibility matching

2008-07-31 Thread Jon Smirl
On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
 On Thu, Jul 31, 2008 at 04:49:49PM -0400, Jon Smirl wrote:
   On 7/31/08, Grant Likely [EMAIL PROTECTED] wrote:
This topic keeps coming up, so it is probably time to address it once
 and for all.
   
 When it comes to machine level support in arch/powerpc, there seems to
 me that there are two levels or machine support.
   
   ..
   
 Thoughts?
 g.
  
  
   As part of this, how can we going to solve the problem with triggering
   the load of a board specific machine/fabric driver in a generic way?


 That really is a separate problem.  We *could* do this with a board
  specific powerpc machine driver, but I don't think it is the best
  solution.

  I'm still thinking that the drivers module_init() function could check
  the top level board model property and decide whether or not to load
  based on that.

You're assuming the driver is compiled in.

If the drivers are on initrd selection has to happen via the normal
device/driver matching process. Search for a device in the alias table
of the drive file.

-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: Board level compatibility matching

2008-07-31 Thread Scott Wood
On Thu, Jul 31, 2008 at 02:19:57PM -0600, Grant Likely wrote:
 - Add a property to the device tree that explicitly specifies the SoC
 that the board is based on.  Something like 'soc-model =
 fsl,mpc5200b' would be appropriate.

Shouldn't that go in the compatible property of the soc node?

 - Prioritize board ports in the arch/powerpc/platforms directory to
 identify level-1 machines support from the level-2 ones.  Make sure
 that level-1 stuff always gets probed before level-2 stuff within each
 SoC family.  In all likelyhood, this would probably just involve
 making sure that board specific machines get linked in before the
 catchall machine.

I don't think we're too far away from being able to have a catch-all that
isn't even soc-type-specific -- the main things that come to mind are
instantiating PCI controllers (should be easily fixed) and finding the
root IRQ controller (I suppose we could pick a random interrupt
controller and follow the interrupt tree to its root, though it'd be more
robust if it were explicitly identified in the device tree).

-Scott
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


  1   2   >