RE: [RFC PATCH 2/2 v2] powerpc/83xx: mpc836x_mds: add support for USBHost
> -----Original Message-----
> From: Anton Vorontsov [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 01, 2008 9:35 PM
> To: Kumar Gala
> Cc: linuxppc-dev@ozlabs.org; Li Yang-R58472
> Subject: [RFC PATCH 2/2 v2] powerpc/83xx: mpc836x_mds: add support for USBHost
>
> Various changes to support QE USB Host on a MPC8360E-MDS board:
>
> - Update the device tree per QE USB bindings;
> - Configure QE Par IO;
> - Set up BCSR for both USB Host and Peripheral modes;
> - Add timer (GTM) node;
> - Add gpio-controller node for BCSR13 bank;
> - Select FSL_GTM, QE_GPIO and OF_SIMPLE_GPIO.
>
> The work is loosely based on Li Yang's patch[1], which is used to
> support peripheral mode only.
>
> [1] http://ozlabs.org/pipermail/linuxppc-dev/2008-August/061357.html
>
> The s-o-b line of the original patch preserved here.
>
> Signed-off-by: Li Yang [EMAIL PROTECTED]
> Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]

{snip}

> @@ -297,11 +328,20 @@
>      };
>
>      [EMAIL PROTECTED] {
> -        compatible = qe_udc;
> +        compatible = fsl,mpc8360-qe-usb,
> +                     fsl,mpc8323-qe-usb;
>          reg = 0x6c0 0x40 0x8b00 0x100;
>          interrupts = 11;
>          interrupt-parent = qeic;
> -        mode = slave;
> +        fsl,fullspeed-clock = clk21;
> +        fsl,lowspeed-clock = brg9;
> +        gpios = qe_pio_b 2 0 /* USBOE */
> +                qe_pio_b 3 0 /* USBTP */
> +                qe_pio_b 8 0 /* USBTN */
> +                qe_pio_b 9 0 /* USBRP */
> +                qe_pio_b 11 0 /* USBRN */
> +                bcsr135 0 /* SPEED */
> +                bcsr134 1; /* POWER */

Nothing against this node. But I don't think gpio nodes can replace
par_io nodes. GPIOs are for pins that are directly manipulated by the
core, whereas par_io nodes are for pins used internally by the SoC.

- Leo
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
Porting linux on PPC405 , XUPv2p board
Hi all,

I am trying to port Linux 2.6 to a PPC405 processor. I have:

- EDK 10.1
- Xilinx University Program board (with a Virtex II Pro device)
- Linux kernel from the Xilinx git tree (git.xilinx.com)
- Cross compiler (Crosstool)
- fdt.dts file generated by EDK with the help of the Xilinx fdt patch
  (gen-device-tree-mhs file)

I could compile the kernel and generate the zImage. I downloaded the
hardware bit file and then the zImage to the DDR SDRAM on the board
using the Parallel 4 cable. When I give the RUN command, this is what I
get:

booting virtex
memstart=0x0
memsize=0x1000
zImage starting: loaded at 0x0040 (sp: 0x00578f1c)
Allocating 0x326564 bytes for kernel ...
gunzipping (0x - 0x0040c000:0x005772bb)...done 0x3015c8 bytes
Linux/PowerPC load: console=ttyS0 ip=on root=/dev/ram
Finalizing device tree... flat tree at 0x40ad68

The boot process seems to stop after this.

The boot arguments I have given in the dts file are:

bootargs = console=ttyS0 ip=off root=/dev/ram;
linux,stdout-path=/[EMAIL PROTECTED]/[EMAIL PROTECTED];

Please let me know if I have missed something here, and if you have any
clue as to why the boot process seems to hang at this point.

Thanks in advance,
Vineeth
Re: [PATCH 2/2] powerpc - Make the irq reverse mapping radix tree lockless
On Thu, 04 Sep 2008 12:52:19 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> On Wed, 2008-09-03 at 15:41 +0200, Sebastien Dugue wrote:
> > On Wed, 20 Aug 2008 15:23:01 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
> >
> > > BTW. It would be good to try to turn the GFP_ATOMIC into GFP_KERNEL,
> >
> > That would be nice indeed
> >
> > > maybe using a semaphore instead of a lock to protect insertion vs.
> > > initialisation.
> >
> > a semaphore? are you meaning a mutex? If not, I fail to understand
> > what you're implying.
>
> Right, a mutex, bad habit calling those semaphores from the old days :-)

OK, then we're on the same line ;-)

> > Right, that's the problem with this new scheme and I'm still trying
> > to find a way to handle memory allocation failures be it for
> > GFP_ATOMIC or GFP_KERNEL. I could not think of anything simple so far
> > and I'm open for suggestions.
>
> GFP_KERNEL should not fail, it will just block no ?

No it won't block and will fail (returns NULL).

> If it fails, it's probably catastrophic enough not to care.

Yep, I'd tend to agree with that.

> You can always fallback to linear lookup.

I will have to add that back as there is no more fallback.

> I don't know if it's worth trying to fire off a new allocation attempt
> later, probably not.

I've been pondering with this lately, but I think that adding a linear
lookup fallback should be OK.

Thanks,

Sebastien.

> Ben.
Re: [PATCH 2/2] powerpc - Make the irq reverse mapping radix tree lockless
There's nothing to 'de-initialize' here, or am I missing something? radix_tree_insert() will return ENOMEM and won't insert anything. Forget my comment, just fall back. Or you can fall back if you don't find the entry, which is just as easy, and probably easier since it shouldn't happen in practice. That's what I had in mind. Thanks for doing that work!

Cheers,
Ben.
Re: [PATCH 2/2] powerpc - Make the irq reverse mapping radix tree lockless
On Thu, 04 Sep 2008 17:58:56 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

There's nothing to 'de-initialize' here, or am I missing something? radix_tree_insert() will return ENOMEM and won't insert anything. Forget my comment, just fall back. Or you can fall back if you don't find the entry, which is just as easy, and probably easier since it shouldn't happen in practice. That's what I had in mind. Thanks for doing that work!

Will do it that way.

Thanks,

Sebastien.
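As an illustration of the fallback agreed on in this subthread, here is a minimal userspace sketch in C. It is not the kernel code: `fast_map`, `toy_revmap_insert`, `toy_find_mapping` and `toy_revmap_lookup` are made-up names, and a tiny fixed-size table stands in for the radix tree. The assumption is simply that an insert into the fast structure may fail (in the kernel, radix_tree_insert() returning ENOMEM; here, the table being full), that the failure can be ignored, and that the lookup falls back to a linear scan of the static table whenever the fast path misses.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS    16  /* size of the static virq table (toy value) */
#define FAST_SLOTS  4  /* deliberately tiny so that inserts can fail */
#define NO_IRQ      0

/* Static table indexed by virq, loosely modelled on irq_map[]. */
struct irq_map_entry {
	unsigned long hwirq;
	int in_use;
};

static struct irq_map_entry irq_map[NR_IRQS];

/* Hypothetical stand-in for the radix tree: hwirq -> entry pointer. */
struct fast_slot {
	unsigned long hwirq;
	struct irq_map_entry *ptr;
};

static struct fast_slot fast_map[FAST_SLOTS];

/* Returns 0 on success, -1 when the fast map has no room left. */
static int toy_revmap_insert(unsigned int virq, unsigned long hwirq)
{
	assert(virq != NO_IRQ && virq < NR_IRQS);

	/* The static table is always updated, whatever happens below. */
	irq_map[virq].hwirq = hwirq;
	irq_map[virq].in_use = 1;

	for (size_t i = 0; i < FAST_SLOTS; i++) {
		if (fast_map[i].ptr == NULL) {
			fast_map[i].hwirq = hwirq;
			fast_map[i].ptr = &irq_map[virq];
			return 0;
		}
	}
	return -1; /* safe to ignore: the lookup falls back */
}

/* Slow path: linear scan of the static table. */
static unsigned int toy_find_mapping(unsigned long hwirq)
{
	for (unsigned int virq = 1; virq < NR_IRQS; virq++)
		if (irq_map[virq].in_use && irq_map[virq].hwirq == hwirq)
			return virq;
	return NO_IRQ;
}

/* Fast path first, linear fallback on a miss. */
static unsigned int toy_revmap_lookup(unsigned long hwirq)
{
	for (size_t i = 0; i < FAST_SLOTS; i++)
		if (fast_map[i].ptr != NULL && fast_map[i].hwirq == hwirq)
			return (unsigned int)(fast_map[i].ptr - irq_map);
	return toy_find_mapping(hwirq);
}
```

Falling back on a miss rather than on the failed insert keeps the error handling out of the writer path: a mapping whose fast-path insert failed is still found, only more slowly.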
Re: [PATCH v4 1/2] powerpc: Board support for GE Fanuc SBC610
On Tue, 2 Sep 2008 13:27:09 -0500 Scott Wood [EMAIL PROTECTED] wrote:

> On Mon, Sep 01, 2008 at 11:27:59AM +0100, Martyn Welch wrote:
> > +static __initdata struct of_device_id of_bus_ids[] = {
> > +	{ .compatible = simple-bus, },
> > +	{ .type = serial, },
> > +	{ .type = soc, },
> > +	{ .type = i2c, },
>
> i2c and serial don't belong here, and soc is redundant with simple-bus.

Hmm, I had to put those in to get the relevant parts to work, so
something has obviously changed. I'll remove them and see what happens.

Thanks again for the feedback,

Martyn

> -Scott

-- 
Martyn Welch MEng MPhil MIET (Principal Software Engineer)   T:+44(0)1327322748
GE Fanuc Intelligent Platforms Ltd,        |Registered in England and Wales
Tove Valley Business Park, Towcester,      |(3828642) at 100 Barbirolli Square,
Northants, NN12 6PF, UK T:+44(0)1327359444 |Manchester,M2 3AB  VAT:GB 729849476
Re: irq
I read the booting-without-of.txt document and the Interrupt Mapping
document from http://playground.sun.com/1275, but I don't understand all
the parameters. Can somebody help me write the interrupt part of my
device tree?

I have an interrupt controller at address 0x20006000. The irq_id range
is 1 to 63. I would like to try the UART interrupts, which have these
ids:

- 0x18 (transmission FIFO empty)
- 0x19 (reception FIFO full)
- 0x1a (reception error)
- 0x1b (break emission)

What other information is needed? Nothing is cascaded.

Thanks

2008/9/4, Benjamin Herrenschmidt [EMAIL PROTECTED]:

> On Wed, 2008-09-03 at 23:02 +0200, Sébastien Chrétien wrote:
> > irq_of_parse_and_map is equivalent to ioremap in the MMU case ?
>
> On the powerpc architecture, we use virtualized IRQ numbers in order to
> deal with the wide range of interrupt controllers around and multiple
> of them cascaded.
>
> The base function to map a physical interrupt to a virtual interrupt is
> irq_create_mapping(). It takes an irq_host argument which represents
> the IRQ domain (typically an irq controller) off which the interrupt
> you are trying to map hangs. If you pass NULL, it will use the default
> controller, which doesn't always exist; it depends on the platform.
> Usually, platforms set that to the toplevel PIC.
>
> However, normally, that function shouldn't be used directly. Instead,
> you should create a representation of your device in the device-tree
> along with the appropriate interrupt mapping, and then use the
> irq_of_parse_and_map() function to obtain a mapped virtual irq based on
> the device-tree information. This will take care of finding the right
> irq_host but will also properly set up the polarity of the interrupt
> etc...
>
> Now, as to how you should represent the interrupt in the device-tree,
> this should be explained in Documentation/booting-without-of.txt
>
> Cheers,
> Ben.
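To make the idea of virtualized IRQ numbers above more concrete, here is a small userspace model in C. It is only a sketch: every name in it (`toy_irq_host`, `toy_virq_table`, `toy_irq_find_mapping`, `toy_irq_create_mapping`) is hypothetical and it is not the kernel's irq_create_mapping(). It assumes nothing more than what the explanation states: a hardware interrupt is identified by the pair (controller, hardware number), and each pair is bound to one system-wide virtual number. Unlike the real function, this toy does not accept a NULL host as "default controller".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_VIRQS 32
#define TOY_NO_IRQ    0

/* One interrupt domain, typically one interrupt controller. */
struct toy_irq_host {
	const char *name;
};

/* virq -> (host, hwirq); slot 0 is reserved to mean "no interrupt". */
struct toy_virq_entry {
	const struct toy_irq_host *host;
	unsigned long hwirq;
};

static struct toy_virq_entry toy_virq_table[TOY_NR_VIRQS];

/* Return the virq already bound to (host, hwirq), or TOY_NO_IRQ. */
static unsigned int toy_irq_find_mapping(const struct toy_irq_host *host,
					 unsigned long hwirq)
{
	for (unsigned int virq = 1; virq < TOY_NR_VIRQS; virq++)
		if (toy_virq_table[virq].host == host &&
		    toy_virq_table[virq].hwirq == hwirq)
			return virq;
	return TOY_NO_IRQ;
}

/* Bind (host, hwirq) to a virq, reusing an existing binding if any. */
static unsigned int toy_irq_create_mapping(const struct toy_irq_host *host,
					   unsigned long hwirq)
{
	unsigned int virq;

	assert(host != NULL); /* toy restriction, see the note above */

	virq = toy_irq_find_mapping(host, hwirq);
	if (virq != TOY_NO_IRQ)
		return virq;

	for (virq = 1; virq < TOY_NR_VIRQS; virq++) {
		if (toy_virq_table[virq].host == NULL) {
			toy_virq_table[virq].host = host;
			toy_virq_table[virq].hwirq = hwirq;
			return virq;
		}
	}
	return TOY_NO_IRQ; /* table full */
}
```

With two controllers that both use hardware number 0x18, the two calls return different virtual numbers, which is the reason a driver should not assume that a hardware id can be passed to request_irq() unchanged.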
[RFC] adding hwmon support to sequoia device tree
Hi,

I added support for the Sequoia on-board I2C temperature sensor to the
device tree. I am not sure if there is any node naming convention for
such devices. The needed I2C driver can be found under drivers/hwmon in
the kernel sources, so I found hwmon suitable for the node.

Do we want to add this to the sequoia dts file? If this is how it should
be done, I will commit a proper patch.

diff --git a/arch/powerpc/boot/dts/sequoia.dts b/arch/powerpc/boot/dts/sequoia.dts
index 72d15f0..82fdfdf 100644
--- a/arch/powerpc/boot/dts/sequoia.dts
+++ b/arch/powerpc/boot/dts/sequoia.dts
@@ -246,13 +246,23 @@
     };
 
     IIC0: [EMAIL PROTECTED] {
+        #address-cells = 1;
+        #size-cells = 0;
         compatible = ibm,iic-440epx, ibm,iic;
         reg = 0xef600700 0x0014;
         interrupt-parent = UIC0;
         interrupts = 0x2 0x4;
+
+        [EMAIL PROTECTED] {
+            device_type = hwmon;
+            compatible = analog,ad7414;
+            reg = 0x48;
+        };
     };
 
     IIC1: [EMAIL PROTECTED] {
+        #address-cells = 1;
+        #size-cells = 0;

(prepare for adding devices on the 2nd I2C also)

         compatible = ibm,iic-440epx, ibm,iic;
         reg = 0xef600800 0x0014;
         interrupt-parent = UIC0;

Matthias
Re: Efficient memcpy()/memmove() for G2/G3 cores...
On Thursday 04 September 2008 04:04:58 Paul Mackerras wrote:

> prodyut hazarika writes:
> > glibc memxxx for powerpc are horribly inefficient. For optimal
> > performance, we should use the dcbt instruction to establish the
> > source address in cache, and dcbz to establish the destination
> > address in cache. We should do dcbt and dcbz such that the touches
> > happen a line ahead of the actual copy.
> >
> > The problem which I see is that dcbt and dcbz instructions don't work
> > on non-cacheable memory (obviously!). But memxxx functions are used
> > for both cached and non-cached memory. Thus this optimized memcpy
> > should be smart enough to figure out that both source and destination
> > addresses fall in cacheable space, and only then use the optimized
> > dcbt/dcbz instructions.
>
> I would be careful about adding overhead to memcpy. I found that in the
> kernel, almost all calls to memcpy are for less than 128 bytes (1 cache
> line on most 64-bit machines). So, adding a lot of code to detect
> cacheability and do prefetching is just going to slow down the common
> case, which is short copies. I don't have statistics for glibc but I
> wouldn't be surprised if most copies were short there also.

Then please explain the following. This is a memcpy() speed test for
different sized blocks on a MPC5121e (DIU is turned on).
The first case is glibc code without optimizations, and the second case
is 16-register strides with dcbt/dcbz instructions, written in assembly
language (see attachment)

$ ./memcpyspeed
Fully aligned:
10 chunks of 5 bytes       : 3.48 Mbyte/s ( throughput: 6.96 Mbytes/s)
5 chunks of 16 bytes       : 14.3 Mbyte/s ( throughput: 28.6 Mbytes/s)
1 chunks of 100 bytes      : 14.4 Mbyte/s ( throughput: 28.8 Mbytes/s)
5000 chunks of 256 bytes   : 14.4 Mbyte/s ( throughput: 28.7 Mbytes/s)
1000 chunks of 1000 bytes  : 14.4 Mbyte/s ( throughput: 28.7 Mbytes/s)
50 chunks of 16384 bytes   : 14.2 Mbyte/s ( throughput: 28.4 Mbytes/s)
1 chunks of 1048576 bytes  : 14.4 Mbyte/s ( throughput: 28.8 Mbytes/s)

$ LD_PRELOAD=./libmemcpye300dj.so ./memcpyspeed
Fully aligned:
10 chunks of 5 bytes       : 7.44 Mbyte/s ( throughput: 14.9 Mbytes/s)
5 chunks of 16 bytes       : 13.1 Mbyte/s ( throughput: 26.2 Mbytes/s)
1 chunks of 100 bytes      : 29.4 Mbyte/s ( throughput: 58.8 Mbytes/s)
5000 chunks of 256 bytes   : 90.2 Mbyte/s ( throughput: 180 Mbytes/s)
1000 chunks of 1000 bytes  :   77 Mbyte/s ( throughput: 154 Mbytes/s)
50 chunks of 16384 bytes   : 96.8 Mbyte/s ( throughput: 194 Mbytes/s)
1 chunks of 1048576 bytes  : 97.6 Mbyte/s ( throughput: 195 Mbytes/s)

(I have edited the output of this tool to fit into an e-mail without
wrapping lines for readability).

Please tell me how on earth there can be such a big difference???
Note that on a MPC5200B this is TOTALLY different, and both processors
have an e300 core (different versions of it though).

> The other thing that I have found is that code that is optimal for
> cache-cold copies is usually significantly slower than optimal for
> cache-hot copies, because the cache management instructions consume
> cycles and don't help in the cache-hot case.
>
> In other words, I don't think we should be tuning the glibc memcpy
> based on tests of how fast it copies multiple megabytes.

I don't just copy multiple megabytes! See above example.
Also I do constant performance testing of different applications using
LD_PRELOAD, to see the impact. Recently I even tried prboom (a free doom
port), to remember the good old days of PC benchmarking ;-)
I have yet to come across a test that has lower performance with this
optimization (on an MPC5121e that is).

> Still, for 6xx/e300 cores, we probably do want to use dcbt/dcbz for
> larger copies. We don't want to use dcbt/dcbz on the larger 64-bit

At least for MPC5121e you really, really need it!!

> processors (POWER4/5/6) because the hardware prefetching and
> write-combining mean that dcbt/dcbz don't help and just slow things
> down.

That's explainable.
What's not explainable are the results I am getting on the MPC5121e.
Please, could someone tell me what I am doing wrong? (I must be doing
something wrong, I'm almost sure).

One thing that I realize is not quite right with memcpyspeed.c is the
fact that it copies consecutive blocks of memory. That should have an
impact on the 5-byte and 16-byte copy results I guess (a cache line for
the following block may already be fetched), but not anymore for
100-byte blocks and bigger (with 32-byte cache lines). In fact, 16 bytes
seems to be the only size where the additional overhead has some impact
(which is negligible).

Another thing is that performance probably matters most to the end-user
when applications need to copy big amounts of data (e.g. video frames or
bitmap data), which is most probably done using big blocks of memcpy(),
so eventually hurting performance for small copies probably has less
weight on overall experience.

Best regards,

-- 
David Jander

/* Optimized
Re: [RFC PATCH 2/2 v2] powerpc/83xx: mpc836x_mds: add support for USBHost
On Thu, Sep 04, 2008 at 02:45:05PM +0800, Li Yang-R58472 wrote:

> > -----Original Message-----
> > From: Anton Vorontsov [mailto:[EMAIL PROTECTED]
> > Sent: Monday, September 01, 2008 9:35 PM
> > To: Kumar Gala
> > Cc: linuxppc-dev@ozlabs.org; Li Yang-R58472
> > Subject: [RFC PATCH 2/2 v2] powerpc/83xx: mpc836x_mds: add support for USBHost
> >
> > Various changes to support QE USB Host on a MPC8360E-MDS board:
> >
> > - Update the device tree per QE USB bindings;
> > - Configure QE Par IO;
> > - Set up BCSR for both USB Host and Peripheral modes;
> > - Add timer (GTM) node;
> > - Add gpio-controller node for BCSR13 bank;
> > - Select FSL_GTM, QE_GPIO and OF_SIMPLE_GPIO.
> >
> > The work is loosely based on Li Yang's patch[1], which is used to
> > support peripheral mode only.
> >
> > [1] http://ozlabs.org/pipermail/linuxppc-dev/2008-August/061357.html
> >
> > The s-o-b line of the original patch preserved here.
> >
> > Signed-off-by: Li Yang [EMAIL PROTECTED]
> > Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
>
> {snip}
>
> > @@ -297,11 +328,20 @@
> >      };
> >
> >      [EMAIL PROTECTED] {
> > -        compatible = qe_udc;
> > +        compatible = fsl,mpc8360-qe-usb,
> > +                     fsl,mpc8323-qe-usb;
> >          reg = 0x6c0 0x40 0x8b00 0x100;
> >          interrupts = 11;
> >          interrupt-parent = qeic;
> > -        mode = slave;
> > +        fsl,fullspeed-clock = clk21;
> > +        fsl,lowspeed-clock = brg9;
> > +        gpios = qe_pio_b 2 0 /* USBOE */
> > +                qe_pio_b 3 0 /* USBTP */
> > +                qe_pio_b 8 0 /* USBTN */
> > +                qe_pio_b 9 0 /* USBRP */
> > +                qe_pio_b 11 0 /* USBRN */
> > +                bcsr135 0 /* SPEED */
> > +                bcsr134 1; /* POWER */
>
> Nothing against this node. But I don't think gpio nodes can replace
> par_io nodes.

Yes, they can't, and the gpios = property is not meant to be a
replacement for par_io nodes. The gpios are used by the host driver; the
driver really needs the gpios = entries as GPIOs.

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2
Re: [RFC] adding hwmon support to sequoia device tree
On Thu, Sep 04, 2008 at 11:55:02AM +0200, Matthias Fuchs wrote:

> Hi,
>
> I added support for the Sequoia on-board I2C temperature sensor to the
> device tree. I am not sure if there is any node naming convention for
> such devices. The needed I2C driver can be found under drivers/hwmon in
> the kernel sources, so I found hwmon suitable for the node.
>
> Do we want to add this to the sequoia dts file? If this is how it
> should be done, I will commit a proper patch.

See comments below. Out of curiosity, do you have a working driver and
setup with this?

> diff --git a/arch/powerpc/boot/dts/sequoia.dts b/arch/powerpc/boot/dts/sequoia.dts
> index 72d15f0..82fdfdf 100644
> --- a/arch/powerpc/boot/dts/sequoia.dts
> +++ b/arch/powerpc/boot/dts/sequoia.dts
> @@ -246,13 +246,23 @@
>      };
>
>      IIC0: [EMAIL PROTECTED] {
> +        #address-cells = 1;
> +        #size-cells = 0;
>          compatible = ibm,iic-440epx, ibm,iic;
>          reg = 0xef600700 0x0014;
>          interrupt-parent = UIC0;
>          interrupts = 0x2 0x4;
> +
> +        [EMAIL PROTECTED] {
> +            device_type = hwmon;

We don't need device_type. Particularly not a new one like this.

> +            compatible = analog,ad7414;

Perhaps 'compatible = analog,ad7417, amcc,hwmon-440epx;'

josh
[PATCH 2/2] powerpc - Make the irq reverse mapping radix tree lockless
The radix trees used by interrupt controllers for their irq reverse mapping (currently only the XICS found on pSeries) have a complex locking scheme dating back to before the advent of the lockless radix tree. Take advantage of this and of the fact that the items of the tree are pointers to a static array (irq_map) elements which can never go under us to simplify the locking. Concurrency between readers and writers is handled by the intrinsic properties of the lockless radix tree. Concurrency between writers is handled with a global mutex. Signed-off-by: Sebastien Dugue [EMAIL PROTECTED] Cc: Paul Mackerras [EMAIL PROTECTED] Cc: Benjamin Herrenschmidt [EMAIL PROTECTED] Cc: Michael Ellerman [EMAIL PROTECTED] --- arch/powerpc/kernel/irq.c | 76 ++-- 1 files changed, 11 insertions(+), 65 deletions(-) diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 2656924..ac222d0 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -439,9 +439,8 @@ void do_softirq(void) static LIST_HEAD(irq_hosts); static DEFINE_SPINLOCK(irq_big_lock); -static DEFINE_PER_CPU(unsigned int, irq_radix_reader); -static unsigned int irq_radix_writer; static unsigned int revmap_trees_allocated; +static DEFINE_MUTEX(revmap_trees_mutex); struct irq_map_entry irq_map[NR_IRQS]; static unsigned int irq_virq_count = NR_IRQS; static struct irq_host *irq_default_host; @@ -584,57 +583,6 @@ void irq_set_virq_count(unsigned int count) irq_virq_count = count; } -/* radix tree not lockless safe ! 
we use a brlock-type mecanism - * for now, until we can use a lockless radix tree - */ -static void irq_radix_wrlock(unsigned long *flags) -{ - unsigned int cpu, ok; - - spin_lock_irqsave(irq_big_lock, *flags); - irq_radix_writer = 1; - smp_mb(); - do { - barrier(); - ok = 1; - for_each_possible_cpu(cpu) { - if (per_cpu(irq_radix_reader, cpu)) { - ok = 0; - break; - } - } - if (!ok) - cpu_relax(); - } while(!ok); -} - -static void irq_radix_wrunlock(unsigned long flags) -{ - smp_wmb(); - irq_radix_writer = 0; - spin_unlock_irqrestore(irq_big_lock, flags); -} - -static void irq_radix_rdlock(unsigned long *flags) -{ - local_irq_save(*flags); - __get_cpu_var(irq_radix_reader) = 1; - smp_mb(); - if (likely(irq_radix_writer == 0)) - return; - __get_cpu_var(irq_radix_reader) = 0; - smp_wmb(); - spin_lock(irq_big_lock); - __get_cpu_var(irq_radix_reader) = 1; - spin_unlock(irq_big_lock); -} - -static void irq_radix_rdunlock(unsigned long flags) -{ - __get_cpu_var(irq_radix_reader) = 0; - local_irq_restore(flags); -} - static int irq_setup_virq(struct irq_host *host, unsigned int virq, irq_hw_number_t hwirq) { @@ -789,7 +737,6 @@ void irq_dispose_mapping(unsigned int virq) { struct irq_host *host; irq_hw_number_t hwirq; - unsigned long flags; if (virq == NO_IRQ) return; @@ -829,9 +776,9 @@ void irq_dispose_mapping(unsigned int virq) smp_rmb(); if (revmap_trees_allocated 1) break; - irq_radix_wrlock(flags); + mutex_lock(revmap_trees_mutex); radix_tree_delete(host-revmap_data.tree, hwirq); - irq_radix_wrunlock(flags); + mutex_unlock(revmap_trees_mutex); break; } @@ -885,7 +832,6 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host, { struct irq_map_entry *ptr; unsigned int virq; - unsigned long flags; WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE); @@ -897,9 +843,11 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host, return irq_find_mapping(host, hwirq); /* Now try to resolve */ - irq_radix_rdlock(flags); + /* +* No rcu_read_lock(ing) needed, the ptr 
returned can't go under us +* as it's referencing an entry in the static irq_map table. +*/ ptr = radix_tree_lookup(host-revmap_data.tree, hwirq); - irq_radix_rdunlock(flags); /* * If found in radix tree, then fine. @@ -917,7 +865,6 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host, void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq, irq_hw_number_t hwirq) { - unsigned long flags; WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE); @@ -931,10 +878,10 @@ void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq, return; if (virq != NO_IRQ) { - irq_radix_wrlock(flags); +
[PATCH 0/2 V4] powerpc - Make the irq reverse mapping tree lockless
Hi,

here is V4 of the powerpc IRQ radix tree reverse mapping rework.

Big thanks to Benjamin Herrenschmidt for his most useful comments.

V3 -> V4: from comments by Benjamin Herrenschmidt

- Dump the use of a global atomic variable for synchronization between
  the radix tree initialization path and the insert/remove/lookup path.
  Instead use a state variable to distinguish between the phases of tree
  initialization:
    - 0 tree not allocated
    - 1 tree allocated but not initialized
    - 2 tree allocated and initialized
- Turn the radix tree nodes GFP_ATOMIC allocations into GFP_KERNEL
  allocations.
- Use a global mutex to handle writers concurrency instead of a spinlock
  embedded into the irq_host struct.

V2 -> V3: from comments by Benjamin Herrenschmidt and Daniel Walker

- Move the initialization of the radix tree back into irq_late_init()
  and insert pre-existing irqs into the tree at that time.
- One whitespace cleanup.

V1 -> V2: from comments by Michael Ellerman

- Initialize the XICS radix tree in xics code and only for that irq_host
  rather than doing it for all the hosts in the powerpc irq generic code
  (although the hosts list only contains one entry at the moment).
- Add a comment in irq_radix_revmap_lookup() stating why it is safe to
  perform a lookup even if the radix tree has not been initialized yet.

The goal of this patchset is to simplify the locking constraints on the
radix tree used for IRQ reverse mapping on the pSeries machines and
provide lockless access to this tree.
This also solves the following BUG under preempt-rt:

BUG: sleeping function called from invalid context swapper(1) at kernel/rtmutex.c:739
in_atomic():1 [0002], irqs_disabled():1
Call Trace:
[c001e20f3340] [c0010370] .show_stack+0x70/0x1bc (unreliable)
[c001e20f33f0] [c0049380] .__might_sleep+0x11c/0x138
[c001e20f3470] [c02a2f64] .__rt_spin_lock+0x3c/0x98
[c001e20f34f0] [c00c3f20] .kmem_cache_alloc+0x68/0x184
[c001e20f3590] [c0193f3c] .radix_tree_node_alloc+0xf0/0x144
[c001e20f3630] [c0195190] .radix_tree_insert+0x18c/0x2fc
[c001e20f36f0] [c000c710] .irq_radix_revmap+0x1a4/0x1e4
[c001e20f37b0] [c003b3f0] .xics_startup+0x30/0x54
[c001e20f3840] [c008b864] .setup_irq+0x26c/0x370
[c001e20f38f0] [c008ba68] .request_irq+0x100/0x158
[c001e20f39a0] [c01ee9c0] .hvc_open+0xb4/0x148
[c001e20f3a40] [c01d72ec] .tty_open+0x200/0x368
[c001e20f3af0] [c00ce928] .chrdev_open+0x1f4/0x25c
[c001e20f3ba0] [c00c8bf0] .__dentry_open+0x188/0x2c8
[c001e20f3c50] [c00c8dec] .do_filp_open+0x50/0x70
[c001e20f3d70] [c00c8e8c] .do_sys_open+0x80/0x148
[c001e20f3e20] [c000928c] .init_post+0x4c/0x100
[c001e20f3ea0] [c03c0e0c] .kernel_init+0x428/0x478
[c001e20f3f90] [c0027448] .kernel_thread+0x4c/0x68

The root cause of this bug lies in the fact that the XICS interrupt
controller uses a radix tree for its reverse irq mapping and that we
cannot allocate the tree nodes (even GFP_ATOMIC) with preemption
disabled.

In fact, we have 2 nested preemption disabling when we want to allocate
a new node:

- setup_irq() does a spin_lock_irqsave() before calling xics_startup()
  which then calls irq_radix_revmap() to insert a new node in the tree
- irq_radix_revmap() also does a spin_lock_irqsave() (in
  irq_radix_wrlock()) before the radix_tree_insert()

Also, if an IRQ gets registered before the tree is initialized (namely
the IPI), it will be inserted into the tree in interrupt context once
the tree has been initialized, hence the need for a spin_lock_irqsave()
in the insertion path.
This series is split into 2 patches:

- The first patch splits irq_radix_revmap() into its 2 components: one
  for lookup and one for insertion into the radix tree and moves the
  insertion of pre-existing irq into the tree at irq_late_init() time.
- The second patch makes the radix tree fully lockless on the lookup
  side.

And the diffstat for the whole patchset:

 arch/powerpc/include/asm/irq.h        |  18 +++-
 arch/powerpc/kernel/irq.c             | 169 +
 arch/powerpc/platforms/pseries/xics.c |  11 +--
 3 files changed, 104 insertions(+), 94 deletions(-)

Thanks,

Sebastien.
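The three-phase state variable listed in the V4 changelog can be illustrated with a small userspace sketch in C11. This is a toy under stated assumptions, not the patch itself: all names (`toy_tree_state`, `toy_state_insert`, `toy_lookup_may_use_tree`, `toy_late_init`) are invented here, an array stands in for the radix tree, and a separate list of early requests stands in for the pre-existing irqs that the real init path inserts. The toy is single-threaded, so it leaves out the mutex that serializes writers in the real series.

```c
#include <assert.h>
#include <stdatomic.h>

/* Phases as listed in the cover letter. */
enum { TREE_NOT_ALLOCATED = 0, TREE_ALLOCATED = 1, TREE_INITIALIZED = 2 };

#define TOY_MAX_IRQS 8

static atomic_int toy_tree_state;             /* starts at 0 */
static unsigned long toy_early[TOY_MAX_IRQS]; /* requested before init */
static int toy_nearly;
static unsigned long toy_tree[TOY_MAX_IRQS];  /* stand-in for the tree */
static int toy_ntree;

static void toy_tree_add(unsigned long hwirq)
{
	assert(toy_ntree < TOY_MAX_IRQS);
	toy_tree[toy_ntree++] = hwirq;
}

/* Writer path: returns 1 if inserted now, 0 if left for late init. */
static int toy_state_insert(unsigned long hwirq)
{
	if (atomic_load_explicit(&toy_tree_state, memory_order_acquire)
	    < TREE_ALLOCATED) {
		assert(toy_nearly < TOY_MAX_IRQS);
		toy_early[toy_nearly++] = hwirq;
		return 0;
	}
	toy_tree_add(hwirq);
	return 1;
}

/* Reader path: nonzero only once the tree is fully populated. */
static int toy_lookup_may_use_tree(void)
{
	return atomic_load_explicit(&toy_tree_state, memory_order_acquire)
	       == TREE_INITIALIZED;
}

/* Init path: allocate, replay the early requests, then publish. */
static void toy_late_init(void)
{
	atomic_store_explicit(&toy_tree_state, TREE_ALLOCATED,
			      memory_order_release);
	for (int i = 0; i < toy_nearly; i++)
		toy_tree_add(toy_early[i]);
	toy_nearly = 0;
	atomic_store_explicit(&toy_tree_state, TREE_INITIALIZED,
			      memory_order_release);
}
```

The point of having two thresholds is that writers may start using the tree as soon as it exists (state 1), while readers keep taking the slow path until every pre-existing mapping has been replayed into it (state 2).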
[PATCH 1/2] powerpc - Separate the irq radix tree insertion and lookup
irq_radix_revmap() currently serves 2 purposes, irq mapping lookup and insertion which happen in interrupt and process context respectively. Separate the function into its 2 components, one for lookup only and one for insertion only. Fix the only user of the revmap tree (XICS) to use the new functions. Also, move the insertion into the radix tree of those irqs that were requested before it was initialized at said tree initialization. Mutual exclusion between the tree initialization and readers/writers is handled via a state variable (revmap_trees_allocated) set to 1 when the tree has been initialized and set to 2 after the already requested irqs have been inserted in the tree by the init path. This state is checked before any reader or writer access just like we used to check for tree.gfp_mask != 0 before. Finally, now that we're not any longer inserting nodes into the radix-tree in interrupt context, turn the GFP_ATOMIC allocations into GFP_KERNEL ones. Signed-off-by: Sebastien Dugue [EMAIL PROTECTED] Cc: Paul Mackerras [EMAIL PROTECTED] Cc: Benjamin Herrenschmidt [EMAIL PROTECTED] Cc: Michael Ellerman [EMAIL PROTECTED] --- arch/powerpc/include/asm/irq.h| 18 +- arch/powerpc/kernel/irq.c | 97 ++--- arch/powerpc/platforms/pseries/xics.c | 11 ++--- 3 files changed, 95 insertions(+), 31 deletions(-) diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h index a372f76..0a51376 100644 --- a/arch/powerpc/include/asm/irq.h +++ b/arch/powerpc/include/asm/irq.h @@ -236,15 +236,27 @@ extern unsigned int irq_find_mapping(struct irq_host *host, extern unsigned int irq_create_direct_mapping(struct irq_host *host); /** - * irq_radix_revmap - Find a linux virq from a hw irq number. + * irq_radix_revmap_insert - Insert a hw irq to linux virq number mapping. 
+ * @host: host owning this hardware interrupt + * @virq: linux irq number + * @hwirq: hardware irq number in that host space + * + * This is for use by irq controllers that use a radix tree reverse + * mapping for fast lookup. + */ +extern void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq, + irq_hw_number_t hwirq); + +/** + * irq_radix_revmap_lookup - Find a linux virq from a hw irq number. * @host: host owning this hardware interrupt * @hwirq: hardware irq number in that host space * * This is a fast path, for use by irq controller code that uses radix tree * revmaps */ -extern unsigned int irq_radix_revmap(struct irq_host *host, -irq_hw_number_t hwirq); +extern unsigned int irq_radix_revmap_lookup(struct irq_host *host, + irq_hw_number_t hwirq); /** * irq_linear_revmap - Find a linux virq from a hw irq number. diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index d972dec..2656924 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -441,6 +441,7 @@ static LIST_HEAD(irq_hosts); static DEFINE_SPINLOCK(irq_big_lock); static DEFINE_PER_CPU(unsigned int, irq_radix_reader); static unsigned int irq_radix_writer; +static unsigned int revmap_trees_allocated; struct irq_map_entry irq_map[NR_IRQS]; static unsigned int irq_virq_count = NR_IRQS; static struct irq_host *irq_default_host; @@ -821,8 +822,12 @@ void irq_dispose_mapping(unsigned int virq) host-revmap_data.linear.revmap[hwirq] = NO_IRQ; break; case IRQ_HOST_MAP_TREE: - /* Check if radix tree allocated yet */ - if (host-revmap_data.tree.gfp_mask == 0) + /* +* Check if radix tree allocated yet, if not then nothing to +* remove. 
+*/ + smp_rmb(); + if (revmap_trees_allocated 1) break; irq_radix_wrlock(flags); radix_tree_delete(host-revmap_data.tree, hwirq); @@ -875,43 +880,62 @@ unsigned int irq_find_mapping(struct irq_host *host, EXPORT_SYMBOL_GPL(irq_find_mapping); -unsigned int irq_radix_revmap(struct irq_host *host, - irq_hw_number_t hwirq) +unsigned int irq_radix_revmap_lookup(struct irq_host *host, +irq_hw_number_t hwirq) { - struct radix_tree_root *tree; struct irq_map_entry *ptr; unsigned int virq; unsigned long flags; WARN_ON(host-revmap_type != IRQ_HOST_MAP_TREE); - /* Check if the radix tree exist yet. We test the value of -* the gfp_mask for that. Sneaky but saves another int in the -* structure. If not, we fallback to slow mode + /* +* Check if the radix tree exists and has bee initialized. +* If not, we fallback to slow mode */ - tree = host-revmap_data.tree; - if (tree-gfp_mask == 0) + if (revmap_trees_allocated
Re: [RFC] adding hwmon support to sequoia device tree
On Thursday 04 September 2008, Josh Boyer wrote:

> > Do we want to add this to the sequoia dts file? If this is how it
> > should be done, I will commit a proper patch.
>
> See comments below. Out of curiosity, do you have a working driver and
> setup with this?

Sure, it did hit mainline with 2.6.27:

drivers/hwmon/ad7414.c

> > diff --git a/arch/powerpc/boot/dts/sequoia.dts b/arch/powerpc/boot/dts/sequoia.dts
> > index 72d15f0..82fdfdf 100644
> > --- a/arch/powerpc/boot/dts/sequoia.dts
> > +++ b/arch/powerpc/boot/dts/sequoia.dts
> > @@ -246,13 +246,23 @@
> >      };
> >
> >      IIC0: [EMAIL PROTECTED] {
> > +        #address-cells = 1;
> > +        #size-cells = 0;
> >          compatible = ibm,iic-440epx, ibm,iic;
> >          reg = 0xef600700 0x0014;
> >          interrupt-parent = UIC0;
> >          interrupts = 0x2 0x4;
> > +
> > +        [EMAIL PROTECTED] {
> > +            device_type = hwmon;
>
> We don't need device_type. Particularly not a new one like this.
>
> > +            compatible = analog,ad7414;
>
> Perhaps 'compatible = analog,ad7417, amcc,hwmon-440epx;'

I don't think we need any amcc and/or 440epx here. It's a common chip
from Analog.

Best regards,
Stefan
Re: Efficient memcpy()/memmove() for G2/G3 cores...
On Thursday 04 September 2008 14:19:26 Josh Boyer wrote:

[...]

$ ./memcpyspeed
Fully aligned:
10 chunks of 5 bytes       : 3.48 Mbyte/s ( throughput: 6.96 Mbytes/s)
5 chunks of 16 bytes       : 14.3 Mbyte/s ( throughput: 28.6 Mbytes/s)
1 chunks of 100 bytes      : 14.4 Mbyte/s ( throughput: 28.8 Mbytes/s)
5000 chunks of 256 bytes   : 14.4 Mbyte/s ( throughput: 28.7 Mbytes/s)
1000 chunks of 1000 bytes  : 14.4 Mbyte/s ( throughput: 28.7 Mbytes/s)
50 chunks of 16384 bytes   : 14.2 Mbyte/s ( throughput: 28.4 Mbytes/s)
1 chunks of 1048576 bytes  : 14.4 Mbyte/s ( throughput: 28.8 Mbytes/s)

$ LD_PRELOAD=./libmemcpye300dj.so ./memcpyspeed
Fully aligned:
10 chunks of 5 bytes       : 7.44 Mbyte/s ( throughput: 14.9 Mbytes/s)
5 chunks of 16 bytes       : 13.1 Mbyte/s ( throughput: 26.2 Mbytes/s)
1 chunks of 100 bytes      : 29.4 Mbyte/s ( throughput: 58.8 Mbytes/s)
5000 chunks of 256 bytes   : 90.2 Mbyte/s ( throughput: 180 Mbytes/s)
1000 chunks of 1000 bytes  : 77 Mbyte/s ( throughput: 154 Mbytes/s)
50 chunks of 16384 bytes   : 96.8 Mbyte/s ( throughput: 194 Mbytes/s)
1 chunks of 1048576 bytes  : 97.6 Mbyte/s ( throughput: 195 Mbytes/s)

(I have edited the output of this tool to fit into an e-mail without wrapping lines for readability.)

Please tell me how on earth there can be such a big difference??? Note that on a MPC5200B this is TOTALLY different, and both processors have an e300 core (different versions of it though). How can there be such a big difference in throughput?

Well, your algorithm seems better optimized than the glibc one for your testcase :).

Yes, I admit my testcase is focussing on optimizing memcpy() of uncached data. That interest stems from the fact that I was testing X11 performance (using xorg kdrive and xorg-server) and wondering why this processor wasn't able to get more FPS when moving frames on screen or scrolling, when in theory the on-board RAM should have enough bandwidth to get a smooth image.

What I mean is that I have a hard time believing that this processor core is so dependent on tweaks in order to get some decent memory throughput. The MPC5200B does get higher throughput with much less effort, and the two cores should be fairly identical (besides the MPC5200B having less cache memory and some other details).

[...]

I don't think you're doing anything wrong exactly. But it seems that your testcase sits there and just copies data with memcpy in varying sizes and amounts. That's not exactly a real-world usecase is it?

No, of course it's not. I made this program to test the performance difference of different tweaks quickly. Once I found something that worked, I started LD_PRELOADing it into other programs (among others the kdrive Xserver, mplayer, and x11perf) to see its impact on the performance of some real-life apps. There the difference in performance is not so impressive of course, but it is still there (almost always either noticeably in favor of the tweaked version of memcpy(), or with a negligible or no difference).

I have not studied the different applications' uses of memcpy(), and have only done empirical tests so far.

I think what Paul was saying is that during the course of runtime for a normal program (the kernel or userspace), most memcpy operations will be of a small order of magnitude. They will also be scattered among code that does _other_ stuff than just memcpy. So he's concerned about the overhead of an implementation that sets up the cache to do a single 32 byte memcpy.

I understand. I also have this concern, especially for other processors, such as the MPC5200B, where there doesn't seem to be so much to gain anyway.

Of course, I could be totally wrong. I haven't had my coffee yet this morning after all.

You're doing quite well regardless of your lack of caffeine ;-)

Greetings,
--
David Jander
Re: irq
I am trying to write the IRQ part of a device tree:

IT_controller: [EMAIL PROTECTED] {
	clock-frequency = <0>;
	interrupt-controller;
	#address-cells = <0>;
	reg = <0x20006000 0x100>;
	compatible = "it";
	device_type = "it";
	big-endian;
};

[EMAIL PROTECTED] {
	device_type = "uart";
	compatible = "uart";
	interrupts = <0x18 0>;
	interrupt-parent = <&IT_controller>;
};

uart_irq = of_find_node_by_type(NULL, "uart");
if (uart_irq == NULL)
	printk("%s: No uart node found !\n", __func__);
virt = irq_of_parse_and_map(uart_irq, 0);
printk("Virtual irq : %d \n", virt);

When I boot Linux, virt=0. What is wrong?

2008/9/4, Sébastien Chrétien [EMAIL PROTECTED]:

I read the booting_without_of.txt document and the Interrupt Mapping document from http://playground.sun.com/1275, but I don't understand all the parameters. Can somebody help me create the interrupt part of my device tree?

I have an interrupt controller at address 0x20006000. The irq_id range is 1 to 63. I would like to try the UART interrupts, which have these ids: 0x18 (transmission FIFO empty), 0x19 (reception FIFO full), 0x1a (reception error), 0x1b (break emission). What other information is needed? Nothing is cascaded.

Thanks

2008/9/4, Benjamin Herrenschmidt [EMAIL PROTECTED]:

On Wed, 2008-09-03 at 23:02 +0200, Sébastien Chrétien wrote:

irq_of_parse_and_map is equivalent to ioremap in the MMU case ?

On the powerpc architecture, we use virtualized IRQ numbers in order to deal with the wide range of interrupt controllers around and multiple of them cascaded.

The base function to map a physical interrupt to a virtual interrupt is irq_create_mapping(). It takes an irq_host argument which represents the IRQ domain (typically the irq controller) off which the interrupt you are trying to map hangs. If you pass NULL, it will use the default controller, which doesn't always exist; it depends on the platform. Usually, platforms set that to the toplevel PIC.

However, normally, that function shouldn't be used directly. Instead, you should create a representation of your device in the device-tree along with the appropriate interrupt mapping, and then use the irq_of_parse_and_map() function to obtain a mapped virtual irq based on the device-tree information. This will take care of finding the right irq_host but will also properly set up the polarity of the interrupt etc...

Now, as to how you should represent the interrupt in the device-tree, this should be explained in Documentation/booting-without-of.txt

Cheers,
Ben.
[PATCH 1/2] powerpc: add driver for simple GPIO banks
The driver supports very simple GPIO controllers, that is, when a controller provides just a 'data' register. Such controllers may be found in various BCSRs (Board's FPGAs used to control board's switches, LEDs, chip-selects, Ethernet/USB PHY power, etc). So far we support only 1-byte GPIO banks. Support for other widths may be implemented when/if needed. Signed-off-by: Anton Vorontsov [EMAIL PROTECTED] --- The driver is user-selectable now. arch/powerpc/sysdev/Kconfig | 11 +++ arch/powerpc/sysdev/Makefile |2 + arch/powerpc/sysdev/of_simple_gpio.c | 154 ++ 3 files changed, 167 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/sysdev/of_simple_gpio.c diff --git a/arch/powerpc/sysdev/Kconfig b/arch/powerpc/sysdev/Kconfig index 72fb35b..30d6e4d 100644 --- a/arch/powerpc/sysdev/Kconfig +++ b/arch/powerpc/sysdev/Kconfig @@ -6,3 +6,14 @@ config PPC4xx_PCI_EXPRESS bool depends on PCI 4xx default n + +config OF_SIMPLE_GPIO + bool Support for simple, memory-mapped GPIO controllers + depends on PPC + select GENERIC_GPIO + select ARCH_REQUIRE_GPIOLIB + help + Say Y here to support simple, memory-mapped GPIO controllers. + These are usually BCSRs used to control board's switches, LEDs, + chip-selects, Ethernet/USB PHY's power and various other small + on-board peripherals. diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile index a90054b..239d7e8 100644 --- a/arch/powerpc/sysdev/Makefile +++ b/arch/powerpc/sysdev/Makefile @@ -36,6 +36,8 @@ ifeq ($(CONFIG_PCI),y) obj-$(CONFIG_4xx) += ppc4xx_pci.o endif +obj-$(CONFIG_OF_SIMPLE_GPIO) += of_simple_gpio.o + # Temporary hack until we have migrated to asm-powerpc ifeq ($(ARCH),powerpc) obj-$(CONFIG_CPM) += cpm_common.o diff --git a/arch/powerpc/sysdev/of_simple_gpio.c b/arch/powerpc/sysdev/of_simple_gpio.c new file mode 100644 index 000..536c0c2 --- /dev/null +++ b/arch/powerpc/sysdev/of_simple_gpio.c @@ -0,0 +1,154 @@ +/* + * Simple Memory-Mapped GPIOs + * + * Copyright (c) MontaVista Software, Inc. 
2008. + * + * Author: Anton Vorontsov [EMAIL PROTECTED] + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + */ + +#include linux/init.h +#include linux/kernel.h +#include linux/spinlock.h +#include linux/types.h +#include linux/ioport.h +#include linux/io.h +#include linux/of.h +#include linux/of_gpio.h +#include linux/gpio.h + +struct u8_gpio_chip { + struct of_mm_gpio_chip mm_gc; + spinlock_t lock; + + /* shadowed data register to clear/set bits safely */ + u8 data; +}; + +static struct u8_gpio_chip *to_u8_gpio_chip(struct of_mm_gpio_chip *mm_gc) +{ + return container_of(mm_gc, struct u8_gpio_chip, mm_gc); +} + +static u8 u8_pin2mask(unsigned int pin) +{ + return 1 (8 - 1 - pin); +} + +static int u8_gpio_get(struct gpio_chip *gc, unsigned int gpio) +{ + struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc); + + return in_8(mm_gc-regs) u8_pin2mask(gpio); +} + +static void u8_gpio_set(struct gpio_chip *gc, unsigned int gpio, int val) +{ + struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc); + struct u8_gpio_chip *u8_gc = to_u8_gpio_chip(mm_gc); + unsigned long flags; + + spin_lock_irqsave(u8_gc-lock, flags); + + if (val) + u8_gc-data |= u8_pin2mask(gpio); + else + u8_gc-data = ~u8_pin2mask(gpio); + + out_8(mm_gc-regs, u8_gc-data); + + spin_unlock_irqrestore(u8_gc-lock, flags); +} + +static int u8_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio) +{ + return 0; +} + +static int u8_gpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val) +{ + u8_gpio_set(gc, gpio, val); + return 0; +} + +static void u8_gpio_save_regs(struct of_mm_gpio_chip *mm_gc) +{ + struct u8_gpio_chip *u8_gc = to_u8_gpio_chip(mm_gc); + + u8_gc-data = in_8(mm_gc-regs); +} + +static int __init u8_simple_gpiochip_add(struct device_node *np) +{ + int ret; + struct u8_gpio_chip 
*u8_gc; + struct of_mm_gpio_chip *mm_gc; + struct of_gpio_chip *of_gc; + struct gpio_chip *gc; + + u8_gc = kzalloc(sizeof(*u8_gc), GFP_KERNEL); + if (!u8_gc) + return -ENOMEM; + + spin_lock_init(u8_gc-lock); + + mm_gc = u8_gc-mm_gc; + of_gc = mm_gc-of_gc; + gc = of_gc-gc; + + mm_gc-save_regs = u8_gpio_save_regs; + of_gc-gpio_cells = 2; + gc-ngpio = 8; + gc-direction_input = u8_gpio_dir_in; + gc-direction_output = u8_gpio_dir_out; + gc-get = u8_gpio_get; + gc-set = u8_gpio_set; + + ret = of_mm_gpiochip_add(np,
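The pin numbering and the shadowed data register used by this driver can be tried out in user space. What follows is only a minimal sketch, assuming MSB-first numbering (pin 0 is the most significant bit of the byte) as in u8_pin2mask(); struct fake_bank, bank_set() and bank_get() are hypothetical names, and a plain variable stands in for the memory-mapped register.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 8-bit bank: 'reg' models the hardware
 * data register, 'shadow' the driver's cached copy of it. */
struct fake_bank {
	uint8_t reg;
	uint8_t shadow;
};

/* Pin 0 maps to the most significant bit, pin 7 to the least.
 * The caller must pass a pin in the range 0..7. */
static uint8_t pin2mask(unsigned int pin)
{
	return (uint8_t)(1u << (8 - 1 - pin));
}

/* Change one bit in the cached value, then write the whole byte out,
 * so the other seven bits are taken from the cache, not from a read. */
static void bank_set(struct fake_bank *b, unsigned int pin, int val)
{
	if (val)
		b->shadow |= pin2mask(pin);
	else
		b->shadow &= (uint8_t)~pin2mask(pin);
	b->reg = b->shadow;
}

/* Returns 1 when the pin reads high, 0 otherwise. */
static int bank_get(const struct fake_bank *b, unsigned int pin)
{
	return (b->reg & pin2mask(pin)) != 0;
}
```

Setting pin 5 and then pin 0 on an all-zero bank leaves 0x84 in the register; clearing pin 5 again leaves 0x80.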
[PATCH 2/2] powerpc/83xx: mpc836x_mds: add support for USB Host/Gadget
Various changes to support QE USB Host and Gadget on MPC8360E-MDS boards: - Update the device tree per QE USB bindings; - Configure QE Par IO; - Set up BCSR for both USB Host and Peripheral modes; - Add timer (GTM) node; - Add gpio-controller node for BCSR13 bank; - Select FSL_GTM and QE_USB. The work is loosely based on Li Yang's patch[1], which is used to support peripheral mode only. [1] http://ozlabs.org/pipermail/linuxppc-dev/2008-August/061357.html The s-o-b line of the original patch preserved here. Signed-off-by: Li Yang [EMAIL PROTECTED] Signed-off-by: Anton Vorontsov [EMAIL PROTECTED] --- Added qe_usb_clock_set() for peripheral case, since the udc driver does not manage the clocks. Also, simple of gpio driver is user-selectable now, so we don't need to select it explicitly. arch/powerpc/boot/dts/mpc836x_mds.dts | 44 +++- arch/powerpc/platforms/83xx/Kconfig |2 + arch/powerpc/platforms/83xx/mpc836x_mds.c | 45 - 3 files changed, 88 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/boot/dts/mpc836x_mds.dts b/arch/powerpc/boot/dts/mpc836x_mds.dts index ada8446..0be98f3 100644 --- a/arch/powerpc/boot/dts/mpc836x_mds.dts +++ b/arch/powerpc/boot/dts/mpc836x_mds.dts @@ -69,8 +69,19 @@ }; [EMAIL PROTECTED],0 { + #address-cells = 1; + #size-cells = 1; device_type = board-control; reg = 1 0 0x8000; + ranges = 0 1 0 0x8000; + + bcsr13: [EMAIL PROTECTED] { + #gpio-cells = 2; + compatible = fsl,mpc8360mds-bcsr-gpio, +simple-gpio-bank; + reg = 0xd 1; + gpio-controller; + }; }; }; @@ -191,10 +202,21 @@ }; [EMAIL PROTECTED] { + #address-cells = 1; + #size-cells = 1; reg = 0x1400 0x100; + ranges = 0 0x1400 0x100; device_type = par_io; num-ports = 7; + qe_pio_b: [EMAIL PROTECTED] { + #gpio-cells = 2; + compatible = fsl,mpc8360-qe-pario-bank, +fsl,mpc8323-qe-pario-bank; + reg = 0x18 0x18; + gpio-controller; + }; + pio1: [EMAIL PROTECTED] { pio-map = /* port pin dir open_drain assignment has_irq */ @@ -278,6 +300,15 @@ }; }; + [EMAIL PROTECTED] { + compatible = 
fsl,mpc8360-qe-gtm, +fsl,qe-gtm, fsl,gtm; + reg = 0x440 0x40; + clock-frequency = 13200; + interrupts = 12 13 14 15; + interrupt-parent = qeic; + }; + [EMAIL PROTECTED] { cell-index = 0; compatible = fsl,spi; @@ -297,11 +328,20 @@ }; [EMAIL PROTECTED] { - compatible = qe_udc; + compatible = fsl,mpc8360-qe-usb, +fsl,mpc8323-qe-usb; reg = 0x6c0 0x40 0x8b00 0x100; interrupts = 11; interrupt-parent = qeic; - mode = slave; + fsl,fullspeed-clock = clk21; + fsl,lowspeed-clock = brg9; + gpios = qe_pio_b 2 0 /* USBOE */ +qe_pio_b 3 0 /* USBTP */ +qe_pio_b 8 0 /* USBTN */ +qe_pio_b 9 0 /* USBRP */ +qe_pio_b 11 0 /* USBRN */ +bcsr135 0 /* SPEED */ +bcsr134 1; /* POWER */ }; enet0: [EMAIL PROTECTED] { diff --git a/arch/powerpc/platforms/83xx/Kconfig b/arch/powerpc/platforms/83xx/Kconfig index 6159c5d..b4ee8f1 100644 --- a/arch/powerpc/platforms/83xx/Kconfig +++ b/arch/powerpc/platforms/83xx/Kconfig @@ -58,6 +58,8 @@ config MPC836x_MDS bool Freescale MPC836x MDS select DEFAULT_UIMAGE select QUICC_ENGINE + select QE_USB + select FSL_GTM help This option enables support for the MPC836x MDS Processor Board. diff --git a/arch/powerpc/platforms/83xx/mpc836x_mds.c b/arch/powerpc/platforms/83xx/mpc836x_mds.c
Re: [PATCH, RFC] mv643xx_eth: move sram window setting code into the driver
I was thinking about this actually, similar to the Efika Device Tree Supplement for MPC5200B board (which adds a lot of device tree nodes not present on the production model and fixes some things), we have a Pegasos one too which changes some very minor stuff. This could be in the kernel (chrp fixups) or in a pre-boot Forth script, but in either case, wouldn't a real sram node in the device tree be the proper solution here? Hardcoding addresses for devices is rather arch/ppc behaviour and the driver is one of the few cases that never got reworked to fit. I don't have a Pegasos running right now to test but I will, as soon as possible, make sure this works first. -- Matt Sealey [EMAIL PROTECTED] Genesi, Manager, Developer Relations Lennert Buytenhek wrote: This gets rid of a big mv643xx_eth annoyance of mine where the driver exports the offsets of some of its internal registers via a header file, and the Pegasos platform code ioremaps the peripheral directly and pokes into peripheral registers directly without involving the driver at all. I don't have the hardware, though, so I'd appreciate it if someone with a Pegasos board could test this (and possibly try some followup patches if it doesn't work). (You might need to apply some of the other mv643xx_eth patches that I just sent to netdev@ -- I'll happily provide a rolled-up patch on request.) 
Signed-off-by: Lennert Buytenhek [EMAIL PROTECTED] --- arch/powerpc/platforms/chrp/pegasos_eth.c | 48 +++-- drivers/net/mv643xx_eth.c | 56 +++-- include/linux/mv643xx_eth.h | 21 +++ 3 files changed, 80 insertions(+), 45 deletions(-) diff --git a/arch/powerpc/platforms/chrp/pegasos_eth.c b/arch/powerpc/platforms/chrp/pegasos_eth.c index 130ff72..1adec8e 100644 --- a/arch/powerpc/platforms/chrp/pegasos_eth.c +++ b/arch/powerpc/platforms/chrp/pegasos_eth.c @@ -21,8 +21,8 @@ #define PEGASOS2_SRAM_BASE (0xf200) #define PEGASOS2_SRAM_SIZE (256*1024) -#define PEGASOS2_SRAM_BASE_ETH0 (PEGASOS2_SRAM_BASE) -#define PEGASOS2_SRAM_BASE_ETH1 (PEGASOS2_SRAM_BASE_ETH0 + (PEGASOS2_SRAM_SIZE / 2) ) +#define PEGASOS2_SRAM_OFF_ETH0 (0) +#define PEGASOS2_SRAM_OFF_ETH1 (PEGASOS2_SRAM_SIZE / 2) #define PEGASOS2_SRAM_RXRING_SIZE (PEGASOS2_SRAM_SIZE/4) @@ -40,6 +40,14 @@ static struct resource mv643xx_eth_shared_resources[] = { }, }; +static struct mv643xx_eth_shared_platform_data mv643xx_eth_shared_pd = { + .sram_mbus_target_id= 0x02, + .sram_mbus_target_attr = 0x00, + .sram_mbus_addr = PEGASOS2_SRAM_BASE, + .sram_size = PEGASOS2_SRAM_SIZE, + .sram_cpu_phys_addr = PEGASOS2_SRAM_BASE, +}; + static struct platform_device mv643xx_eth_shared_device = { .name = MV643XX_ETH_SHARED_NAME, .id = 0, @@ -61,12 +69,12 @@ static struct mv643xx_eth_platform_data eth0_pd = { .shared = mv643xx_eth_shared_device, .port_number= 0, - .tx_sram_addr = PEGASOS2_SRAM_BASE_ETH0, - .tx_sram_size = PEGASOS2_SRAM_TXRING_SIZE, + .sram_tx_offset = PEGASOS2_SRAM_OFF_ETH0, + .sram_tx_size = PEGASOS2_SRAM_TXRING_SIZE, .tx_queue_size = PEGASOS2_SRAM_TXRING_SIZE/16, - .rx_sram_addr = PEGASOS2_SRAM_BASE_ETH0 + PEGASOS2_SRAM_TXRING_SIZE, - .rx_sram_size = PEGASOS2_SRAM_RXRING_SIZE, + .sram_rx_offset = PEGASOS2_SRAM_OFF_ETH0 + PEGASOS2_SRAM_TXRING_SIZE, + .sram_rx_size = PEGASOS2_SRAM_RXRING_SIZE, .rx_queue_size = PEGASOS2_SRAM_RXRING_SIZE/16, }; @@ -93,12 +101,12 @@ static struct mv643xx_eth_platform_data eth1_pd = { 
.shared = mv643xx_eth_shared_device, .port_number= 1, - .tx_sram_addr = PEGASOS2_SRAM_BASE_ETH1, - .tx_sram_size = PEGASOS2_SRAM_TXRING_SIZE, + .sram_tx_offset = PEGASOS2_SRAM_OFF_ETH1, + .sram_tx_size = PEGASOS2_SRAM_TXRING_SIZE, .tx_queue_size = PEGASOS2_SRAM_TXRING_SIZE/16, - .rx_sram_addr = PEGASOS2_SRAM_BASE_ETH1 + PEGASOS2_SRAM_TXRING_SIZE, - .rx_sram_size = PEGASOS2_SRAM_RXRING_SIZE, + .sram_rx_offset = PEGASOS2_SRAM_OFF_ETH1 + PEGASOS2_SRAM_TXRING_SIZE, + .sram_rx_size = PEGASOS2_SRAM_RXRING_SIZE, .rx_queue_size = PEGASOS2_SRAM_RXRING_SIZE/16, }; @@ -123,16 +131,13 @@ static struct platform_device *mv643xx_eth_pd_devs[] __initdata = { #define MV_READ(offset,val){ val = readl(mv643xx_reg_base + offset); } #define MV_WRITE(offset,data) writel(data, mv643xx_reg_base + offset) -static void __iomem *mv643xx_reg_base; - static int Enable_SRAM(void) { + void __iomem *mv643xx_reg_base; u32 ALong; - if (mv643xx_reg_base == NULL) - mv643xx_reg_base = ioremap(PEGASOS2_MARVELL_REGBASE, - PEGASOS2_MARVELL_REGSIZE); - +
[PATCH] spi_mpc83xx: fix clockrate calculation for low speed
Commit a61f5345 (spi_mpc83xx clockrate fixes) broke clockrate calculation for low speeds. SPMODE_DIV16 should be set if the divider is higher than 64, not only if the divider gets clipped to 1024.

Furthermore, the clipping check was off by a factor 16 as well.

Signed-off-by: Peter Korsgaard [EMAIL PROTECTED]
---
 drivers/spi/spi_mpc83xx.c |   13 +++++--------
 1 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/spi/spi_mpc83xx.c b/drivers/spi/spi_mpc83xx.c
index 070c621..ab7ee44 100644
--- a/drivers/spi/spi_mpc83xx.c
+++ b/drivers/spi/spi_mpc83xx.c
@@ -267,16 +267,13 @@ int mpc83xx_spi_setup_transfer(struct spi_device *spi, struct spi_transfer *t)
 	cs->hw_mode |= SPMODE_LEN(bits_per_word);

 	if ((mpc83xx_spi->spibrg / hz) > 64) {
+		cs->hw_mode |= SPMODE_DIV16;
 		pm = mpc83xx_spi->spibrg / (hz * 64);
 		if (pm > 16) {
-			cs->hw_mode |= SPMODE_DIV16;
-			pm /= 16;
-			if (pm > 16) {
-				dev_err(&spi->dev, "Requested speed is too "
-					"low: %d Hz. Will use %d Hz instead.\n",
-					hz, mpc83xx_spi->spibrg / 1024);
-				pm = 16;
-			}
+			dev_err(&spi->dev, "Requested speed is too "
+				"low: %d Hz. Will use %d Hz instead.\n",
+				hz, mpc83xx_spi->spibrg / 1024);
+			pm = 16;
 		}
 	} else
 		pm = mpc83xx_spi->spibrg / (hz * 4);
--
1.5.6.3
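The divider selection as it reads after this fix can be checked with a plain user-space function. This is only a minimal sketch: struct spi_div and spi_divider() are hypothetical names, the 64 MHz input clock in the note below is just an example value, and whatever the driver does with pm afterwards is left out.

```c
#include <stdint.h>

/* Hypothetical result record for one divider calculation. */
struct spi_div {
	int div16;        /* 1 when the extra divide-by-16 prescaler is on */
	unsigned int pm;  /* divider value computed for the request */
	int clipped;      /* 1 when the requested rate was too low */
};

/* Mirrors the post-fix logic: use DIV16 whenever spibrg / hz exceeds 64,
 * and clip pm to 16 if the request is still too slow.
 * hz must be non-zero. */
static struct spi_div spi_divider(uint32_t spibrg, uint32_t hz)
{
	struct spi_div d = {0, 0, 0};

	if ((spibrg / hz) > 64) {
		d.div16 = 1;
		d.pm = spibrg / (hz * 64);
		if (d.pm > 16) {
			d.pm = 16;
			d.clipped = 1;
		}
	} else {
		d.pm = spibrg / (hz * 4);
	}
	return d;
}
```

With spibrg = 64000000, a 1 MHz request gives pm = 16 without DIV16, 500 kHz gives pm = 2 with DIV16, and 50 kHz is clipped to pm = 16 with DIV16, which corresponds to spibrg / 1024 = 62500 Hz.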
Re: irq
On Thu, Sep 04, 2008 at 03:23:21PM +0200, Sébastien Chrétien wrote:

I am trying to write the IRQ part of a device tree:

IT_controller: [EMAIL PROTECTED] {
	clock-frequency = <0>;
	interrupt-controller;
	#address-cells = <0>;
	reg = <0x20006000 0x100>;
	compatible = "it";
	device_type = "it";
	big-endian;
};

[EMAIL PROTECTED] {
	device_type = "uart";
	compatible = "uart";
	interrupts = <0x18 0>;
	interrupt-parent = <&IT_controller>;
};

uart_irq = of_find_node_by_type(NULL, "uart");
if (uart_irq == NULL)
	printk("%s: No uart node found !\n", __func__);
virt = irq_of_parse_and_map(uart_irq, 0);
printk("Virtual irq : %d \n", virt);

When I boot Linux, virt=0. What is wrong?

You're missing #interrupt-cells, for one.

Also, you need your interrupt controller driver to register with the IRQ subsystem properly (see other chained IRQ drivers in the tree).

You should also get rid of device_type, big-endian, and clock-frequency, and use better compatible names.

-Scott
Re: Efficient memcpy()/memmove() for G2/G3 cores...
On Thu, 2008-09-04 at 14:59 +0200, David Jander wrote: On Thursday 04 September 2008 14:19:26 Josh Boyer wrote: [...] (I have edited the output of this tool to fit into an e-mail without wrapping lines for readability). Please tell me how on earth there can be such a big difference??? Note that on a MPC5200B this is TOTALLY different, and both processors have an e300 core (different versions of it though). How can there be such a big difference in throughput? Well, your algorithm seems better optimized than the glibc one for your testcase :). Yes, I admit my testcase is focussing on optimizing memcpy() of uncached data, and that interest stems from the fact that I was testing X11 performance (using xorg kdrive and xorg-server), and wondering why this processor wasn't able to get more FPS when moving frames on screen or scrolling, when in theory the on-board RAM should have bandwidth enough to get a smooth image. What I mean is that I have a hard time believing that this processor core is so dependent of tweaks in order to get some decent memory throughput. The MPC5200B does get higher througput with much less effort, and the two cores should be fairly identical (besides the MPC5200B having less cache memory and some other details). I have personally optimized memcpy for power4/5/6 and they are all different. There are dozens of different PPC implementations from different manufacturers and design, every one is different! With painful negotiation I was able to get the --with-cpu= framework added to glibc but not all distro use it. You can thank me later ... MPC5200B? never heard of it, don't care. I am busy with power7. So don't assume we are stupid because we have not dropped everything to optimize memcpy for YOUR processor and YOUR specific case. You care, your are a programmer? write code! If you care about the community then fit your optimization into the framework provided for CPU specific optimization and submit it so others can benefit. [...] 
I don't think you're doing anything wrong exactly. But it seems that your testcase sits there and just copies data with memcpy in varying sizes and amounts. That's not exactly a real-world usecase is it? No, of course it's not. I made this program to test the performance difference of different tweaks quickly. Once I found something that worked, I started LD_PRELOADing it to different other programs (among others the kdrive Xserver, mplayer, and x11perf) to see its impact on performance of some real-life apps. There the difference in performance is not so impressive of course, but it is still there (almost always either noticeably in favor of the tweaked version of memcpy(), or with a negligible or no difference). The trick is that the code built into glibc has to be optimal for the average case (4-256, average 12 bytes). Actually most memcpy implementations are a series of special cases for length and alignment. You can always do better if you know exactly what processor you are on and what specific sizes and alignment your application uses. I have not studied the different application's uses of memcpy(), and only done empirical tests so far. I think what Paul was saying is that during the course of runtime for a normal program (the kernel or userspace), most memcpy operations will be of a small order of magnitude. They will also be scattered among code that does _other_ stuff than just memcpy. So he's concerned about the overhead of an implementation that sets up the cache to do a single 32 byte memcpy. I understand. I also have this concern, especially for other processors, as the MPC5200B, where there doesn't seem to be so much to gain anyway. Of course, I could be totally wrong. I haven't had my coffee yet this morning after all. You're doing quite good regardless of your lack of caffeine ;-) Greetings, ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
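The remark that most memcpy implementations are a series of special cases for length and alignment can be illustrated with a toy routine. This is only a sketch under stated assumptions, not what glibc or the e300 library discussed here does: sketch_memcpy(), the 16-byte threshold and the PATH_* names are all made up for illustration, and the may_alias typedef is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: allows memory to be accessed as 32-bit words
 * without breaking the strict-aliasing rules. */
typedef uint32_t __attribute__((__may_alias__)) word_t;

/* Reports which special case handled the bulk of the copy. */
enum copy_path { PATH_BYTES, PATH_WORDS };

static enum copy_path sketch_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	enum copy_path path = PATH_BYTES;

	/* Special case: long enough to be worth it (threshold is arbitrary)
	 * and both pointers 4-byte aligned, so copy a word at a time. */
	if (n >= 16 && (((uintptr_t)d | (uintptr_t)s) & 3) == 0) {
		word_t *dw = (word_t *)d;
		const word_t *sw = (const word_t *)s;

		path = PATH_WORDS;
		while (n >= 4) {
			*dw++ = *sw++;
			n -= 4;
		}
		d = (unsigned char *)dw;
		s = (const unsigned char *)sw;
	}

	/* Short copies, unaligned copies and any leftover tail end up here. */
	while (n--)
		*d++ = *s++;

	return path;
}
```

Even in this toy form, a 5-byte copy pays for the length and alignment checks before it reaches the byte loop, which is the kind of overhead that matters when most calls are small.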
Re: [RFC] adding hwmon support to sequoia device tree
Hi, On Thursday 04 September 2008 14:25, Josh Boyer wrote: On Thu, Sep 04, 2008 at 11:55:02AM +0200, Matthias Fuchs wrote: Hi, I added support for the Sequoia on-board I2C temperature sensor to the device. I am not sure if there is any node naming convention for such devices. The needed I2C driver can be found under drivers/hwmon in the kernel sources, so I found hwmon suitable for the node. Do we want to add this to the sequoia dts file? If this is how it should be done, I will commit a proper patch. See comments below. Out of curiosity, do you have a working driver and setup with this? Of course. You need CONFIG_HWMON=y CONFIG_SENSORS_AD7414=y Via sysfs you get: sequoia:~# cat /sys/bus/i2c/devices/0-0048/temp1_input 34750 sequoia:~# cat /sys/class/hwmon/hwmon0/device/temp1_input 34750 sequoia:~# 34750 = 34.75°C diff --git a/arch/powerpc/boot/dts/sequoia.dts b/arch/powerpc/boot/dts/sequoia.dts index 72d15f0..82fdfdf 100644 --- a/arch/powerpc/boot/dts/sequoia.dts +++ b/arch/powerpc/boot/dts/sequoia.dts @@ -246,13 +246,23 @@ }; IIC0: [EMAIL PROTECTED] { + #address-cells = 1; + #size-cells = 0; compatible = ibm,iic-440epx, ibm,iic; reg = 0xef600700 0x0014; interrupt-parent = UIC0; interrupts = 0x2 0x4; + + [EMAIL PROTECTED] { + device_type = hwmon; We don't need device_type. Particularly not a new one like this. I will remove that. + compatible = analog,ad7414; Perhaps 'compatible = analog,ad7417, amcc,hwmon-440epx;' It has nothing to do with either AMCC or 440EPx. Patch is on the way. Matthias ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: Hooking an IRQ on a modified mpc8349emitx board
On Wed, Sep 03, 2008 at 03:55:41PM -0700, Oscar Takeshita wrote:

I've been trying to hook an IRQ on a modified mpc8349emitx board without success. The IRQ is hooked physically to IRQ1/GPIO2[13] on the mpc8349e. No other devices are tied to this pin. I'm using uboot 1.2.0 and kernel 2.6.22.19.

Do I need to have a dts entry for this interrupt in order to make request_irq() succeed? How can I find the IRQ number? I tried probe_irq_on/off; unfortunately, it did not work. Would it be MPC83xx_IRQ_EXT1 in arch/powerpc/include/asm/mpc83xx.h?

I'm new to kernel work. Any hints appreciated.

You need to describe the IRQ in a device tree node and use irq_of_parse_and_map(). request_irq() takes virtual IRQ numbers.

Maybe we should put together an arch/powerpc FAQ...

-Scott
[PATCH] powerpc/44x: Add hwmon support to Sequoia device tree
This patch adds support for the AD7414 temperature sensor on Sequoia PPC440EPx board.

Signed-off-by: Matthias Fuchs [EMAIL PROTECTED]
---
 arch/powerpc/boot/dts/sequoia.dts |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/boot/dts/sequoia.dts b/arch/powerpc/boot/dts/sequoia.dts
index 72d15f0..9ba5def 100644
--- a/arch/powerpc/boot/dts/sequoia.dts
+++ b/arch/powerpc/boot/dts/sequoia.dts
@@ -246,13 +246,22 @@
 		};

 		IIC0: [EMAIL PROTECTED] {
+			#address-cells = <1>;
+			#size-cells = <0>;
 			compatible = "ibm,iic-440epx", "ibm,iic";
 			reg = <0xef600700 0x0014>;
 			interrupt-parent = <&UIC0>;
 			interrupts = <0x2 0x4>;
+
+			[EMAIL PROTECTED] {
+				compatible = "analog,ad7414";
+				reg = <0x48>;
+			};
 		};

 		IIC1: [EMAIL PROTECTED] {
+			#address-cells = <1>;
+			#size-cells = <0>;
 			compatible = "ibm,iic-440epx", "ibm,iic";
 			reg = <0xef600800 0x0014>;
 			interrupt-parent = <&UIC0>;
--
1.5.3
[PATCH 2/3] ibm_newemac: Introduce mal_has_feature
There are some PowerPC SoCs that do odd things with the MAL handling. In order to accommodate them, we need to introduce a feature mechanism that is similar to the existing emac_has_feature function.

This adds a feature variable to the mal_instance structure, and adds a mal_has_feature function with some feature definitions. These are guarded by Kconfig options that are selected by the affected platforms.

Signed-off-by: Josh Boyer [EMAIL PROTECTED]
---
 drivers/net/ibm_newemac/Kconfig |    8 ++++++++
 drivers/net/ibm_newemac/mal.h   |   37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ibm_newemac/Kconfig b/drivers/net/ibm_newemac/Kconfig
index dfb6547..44e5a0e 100644
--- a/drivers/net/ibm_newemac/Kconfig
+++ b/drivers/net/ibm_newemac/Kconfig
@@ -66,3 +66,11 @@ config IBM_NEW_EMAC_EMAC4
 config IBM_NEW_EMAC_NO_FLOW_CTRL
 	bool
 	default n
+
+config IBM_NEW_EMAC_MAL_CLR_ICINTSTAT
+	bool
+	default n
+
+config IBM_NEW_EMAC_MAL_COMMON_ERR
+	bool
+	default n
diff --git a/drivers/net/ibm_newemac/mal.h b/drivers/net/ibm_newemac/mal.h
index eaa7262..50c5c65 100644
--- a/drivers/net/ibm_newemac/mal.h
+++ b/drivers/net/ibm_newemac/mal.h
@@ -213,6 +213,8 @@ struct mal_instance {
 	struct of_device	*ofdev;
 	int			index;
 	spinlock_t		lock;
+
+	unsigned int features;
 };

 static inline u32 get_mal_dcrn(struct mal_instance *mal, int reg)
@@ -225,6 +227,41 @@ static inline void set_mal_dcrn(struct mal_instance *mal, int reg, u32 val)
 	dcr_write(mal->dcr_host, reg, val);
 }

+/* Features of various MAL implementations */
+
+/* Dummy feature bit so the enum works properly */
+#define MAL_FTR_DUMMY		0x0001
+
+/* Set if you have interrupt coalescing and you have to clear the SDR
+ * register for TXEOB and RXEOB interrupts to work
+ */
+#define MAL_FTR_CLEAR_ICINTSTAT	0x0002
+
+/* Set if your MAL has SERR, TXDE, and RXDE OR'd into a single UIC
+ * interrupt
+ */
+#define MAL_FTR_COMMON_ERR_INT	0x0004
+
+enum {
+	MAL_FTRS_ALWAYS = 0,
+
+	MAL_FTRS_POSSIBLE =
+#ifdef CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT
+		MAL_FTR_CLEAR_ICINTSTAT |
+#endif
+#ifdef CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR
+		MAL_FTR_COMMON_ERR_INT |
+#endif
+		MAL_FTR_DUMMY,
+};
+
+static inline int mal_has_feature(struct mal_instance *dev,
+		unsigned long feature)
+{
+	return (MAL_FTRS_ALWAYS & feature) ||
+		(MAL_FTRS_POSSIBLE & dev->features & feature);
+}
+
 /* Register MAL devices */
 int mal_init(void);
 void mal_exit(void);
--
1.5.5.1
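The way the compile-time mask and the per-instance feature word combine can be modelled in user space. This is a minimal sketch, not the driver code: the FTR_* values, struct fake_mal, has_feature() and the HAVE_* macros (standing in for the Kconfig symbols) are all hypothetical. Here only the "clear ICINTSTAT" feature is compiled in.

```c
#define FTR_DUMMY            0x1u
#define FTR_CLEAR_ICINTSTAT  0x2u
#define FTR_COMMON_ERR_INT   0x4u

/* Stand-ins for the Kconfig symbols: only the first one is "selected". */
#define HAVE_CLR_ICINTSTAT 1
/* #define HAVE_COMMON_ERR 1 */

enum {
	FTRS_ALWAYS = 0,

	/* Every feature the build could possibly support. The dummy bit
	 * terminates the '|' chain whichever #ifdefs are active. */
	FTRS_POSSIBLE =
#ifdef HAVE_CLR_ICINTSTAT
		FTR_CLEAR_ICINTSTAT |
#endif
#ifdef HAVE_COMMON_ERR
		FTR_COMMON_ERR_INT |
#endif
		FTR_DUMMY,
};

/* Hypothetical instance carrying the features detected at probe time. */
struct fake_mal {
	unsigned int features;
};

/* A feature counts only if it is compiled in AND set on the instance. */
static int has_feature(const struct fake_mal *m, unsigned int feature)
{
	return (FTRS_ALWAYS & feature) ||
	       (FTRS_POSSIBLE & m->features & feature);
}
```

Because FTRS_POSSIBLE is a constant, a test for a feature that was not compiled in always evaluates to 0 regardless of the instance's feature word, so the compiler is free to drop the guarded code on platforms that never select the option.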
[PATCH 3/3] ibm_newemac: MAL support for PowerPC 405EZ
The PowerPC 405EZ SoC has some differences in the interrupt layout and
handling for the MAL.  The SERR, TXDE, and RXDE interrupts are OR'd into
a single interrupt.  Also, due to the possibility for interrupt coalescing,
the TXEOB and RXEOB interrupts require an interrupt bit to be cleared in the
ICINTSTAT SDR.

This sets the proper MAL feature bits for 405EZ boards, and adds a common
shared handler for SERR, TXDE, and RXDE.

This has been adapted from code originally written by Stefan Roese.

Signed-off-by: Josh Boyer [EMAIL PROTECTED]

---
 drivers/net/ibm_newemac/mal.c |   98
 1 files changed, 78 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ibm_newemac/mal.c b/drivers/net/ibm_newemac/mal.c
index 10c267b..3cef534 100644
--- a/drivers/net/ibm_newemac/mal.c
+++ b/drivers/net/ibm_newemac/mal.c
@@ -28,6 +28,7 @@
 #include <linux/delay.h>
 
 #include "core.h"
+#include <asm/dcr-regs.h>
 
 static int mal_count;
 
@@ -279,6 +280,9 @@ static irqreturn_t mal_txeob(int irq, void *dev_instance)
 	mal_schedule_poll(mal);
 	set_mal_dcrn(mal, MAL_TXEOBISR, r);
 
+	if (mal_has_feature(mal, MAL_FTR_CLEAR_ICINTSTAT))
+		mtdcri(SDR0, 0x4510, (mfdcri(SDR0, 0x4510) | 0x6000));
+
 	return IRQ_HANDLED;
 }
 
@@ -293,6 +297,9 @@ static irqreturn_t mal_rxeob(int irq, void *dev_instance)
 	mal_schedule_poll(mal);
 	set_mal_dcrn(mal, MAL_RXEOBISR, r);
 
+	if (mal_has_feature(mal, MAL_FTR_CLEAR_ICINTSTAT))
+		mtdcri(SDR0, 0x4510, (mfdcri(SDR0, 0x4510) | 0x8000));
+
 	return IRQ_HANDLED;
 }
 
@@ -336,6 +343,25 @@ static irqreturn_t mal_rxde(int irq, void *dev_instance)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t mal_int(int irq, void *dev_instance)
+{
+	struct mal_instance *mal = dev_instance;
+	u32 esr = get_mal_dcrn(mal, MAL_ESR);
+
+	if (esr & MAL_ESR_EVB) {
+		/* descriptor error */
+		if (esr & MAL_ESR_DE) {
+			if (esr & MAL_ESR_CIDT)
+				return (mal_rxde(irq, dev_instance));
+			else
+				return (mal_txde(irq, dev_instance));
+		} else { /* SERR */
+			return (mal_serr(irq, dev_instance));
+		}
+	}
+	return IRQ_HANDLED;
+}
+
 void mal_poll_disable(struct mal_instance *mal, struct mal_commac *commac)
 {
 	/* Spinlock-type semantics: only one caller disable poll at a time */
@@ -542,11 +568,22 @@ static int __devinit mal_probe(struct of_device *ofdev,
 		goto fail;
 	}
 
-	mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
-	mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
-	mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
-	mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
-	mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
+	if (of_device_is_compatible(ofdev->node, "ibm,mcmal-405ez"))
+		mal->features |= (MAL_FTR_CLEAR_ICINTSTAT | MAL_FTR_COMMON_ERR_INT);
+
+	if (mal_has_feature(mal, MAL_FTR_COMMON_ERR_INT)) {
+		mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
+		mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
+		mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
+		mal->txde_irq = mal->rxde_irq = mal->serr_irq;
+	} else {
+		mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
+		mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
+		mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
+		mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
+		mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
+	}
+
 	if (mal->txeob_irq == NO_IRQ || mal->rxeob_irq == NO_IRQ ||
 	    mal->serr_irq == NO_IRQ || mal->txde_irq == NO_IRQ ||
 	    mal->rxde_irq == NO_IRQ) {
@@ -608,21 +645,42 @@ static int __devinit mal_probe(struct of_device *ofdev,
 			      sizeof(struct mal_descriptor) *
 			      mal_rx_bd_offset(mal, i));
 
-	err = request_irq(mal->serr_irq, mal_serr, 0, "MAL SERR", mal);
-	if (err)
-		goto fail2;
-	err = request_irq(mal->txde_irq, mal_txde, 0, "MAL TX DE", mal);
-	if (err)
-		goto fail3;
-	err = request_irq(mal->txeob_irq, mal_txeob, 0, "MAL TX EOB", mal);
-	if (err)
-		goto fail4;
-	err = request_irq(mal->rxde_irq, mal_rxde, 0, "MAL RX DE", mal);
-	if (err)
-		goto fail5;
-	err = request_irq(mal->rxeob_irq, mal_rxeob, 0, "MAL RX EOB", mal);
-	if (err)
-		goto fail6;
+	if (mal_has_feature(mal, MAL_FTR_COMMON_ERR_INT)) {
+		err = request_irq(mal->serr_irq, mal_int, IRQF_SHARED,
+				  "MAL SERR", mal);
+		if (err)
+			goto fail2;
+
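The decision tree in the patch's mal_int() can be tried out in user space. The following is only a minimal sketch of that dispatch logic: the SKETCH_ESR_* bit values, the enum, and classify_esr() are hypothetical names invented for illustration, and the real MAL_ESR_* definitions in the driver may use different bit positions.

```c
#include <stdint.h>

/*
 * Hypothetical bit positions, chosen only for this sketch. The real
 * MAL_ESR_* values are defined by the driver and may differ.
 */
#define SKETCH_ESR_EVB	0x80000000u	/* error valid */
#define SKETCH_ESR_CIDT	0x40000000u	/* channel ID type: set means RX */
#define SKETCH_ESR_DE	0x00100000u	/* descriptor error */

enum sketch_mal_err {
	SKETCH_MAL_NONE,	/* ESR does not report a valid error */
	SKETCH_MAL_SERR,	/* system error */
	SKETCH_MAL_TXDE,	/* TX descriptor error */
	SKETCH_MAL_RXDE,	/* RX descriptor error */
};

/*
 * One shared interrupt line, three possible causes: the ESR tells them
 * apart in the same order the patch's mal_int() tests the bits.
 */
static enum sketch_mal_err classify_esr(uint32_t esr)
{
	if (!(esr & SKETCH_ESR_EVB))
		return SKETCH_MAL_NONE;
	if (!(esr & SKETCH_ESR_DE))
		return SKETCH_MAL_SERR;
	return (esr & SKETCH_ESR_CIDT) ? SKETCH_MAL_RXDE : SKETCH_MAL_TXDE;
}
```

In the driver the three outcomes correspond to calling mal_serr(), mal_txde(), or mal_rxde(); when EVB is clear the handler just returns IRQ_HANDLED.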
Re: [PATCH v4 1/2] powerpc: Board support for GE Fanuc SBC610
Martyn Welch wrote:
> On Tue, 2 Sep 2008 13:27:09 -0500 Scott Wood [EMAIL PROTECTED] wrote:
> > On Mon, Sep 01, 2008 at 11:27:59AM +0100, Martyn Welch wrote:
> > > +static __initdata struct of_device_id of_bus_ids[] = {
> > > +	{ .compatible = "simple-bus", },
> > > +	{ .type = "serial", },
> > > +	{ .type = "soc", },
> > > +	{ .type = "i2c", },
> >
> > i2c and serial don't belong here, and soc is redundant with simple-bus.
>
> I agree about the i2c and serial, although I need the soc entry for the i2c devices to be found during the platform scan.

Not if the soc node has simple-bus in its compatible, which I thought I remembered yours did.

-Scott
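Scott's point is that a bus table containing only simple-bus is enough once the soc node lists simple-bus among its compatible strings. A rough user-space sketch of that matching rule follows, assuming a toy node structure; struct sketch_node, sketch_node_is_bus(), and the "fsl,mpc8641-soc" string are illustrative inventions, not the kernel's of_device_id machinery, and matching on device_type is left out.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_COMPAT 4

/* Toy stand-in for a device tree node: "compatible" is an ordered string list. */
struct sketch_node {
	const char *name;
	const char *compatible[SKETCH_MAX_COMPAT];	/* unused slots stay NULL */
};

/*
 * Return 1 if any string in the node's compatible list equals any entry
 * of the NULL-terminated bus_ids table, 0 otherwise.
 */
static int sketch_node_is_bus(const struct sketch_node *np,
			      const char *const *bus_ids)
{
	for (size_t i = 0; i < SKETCH_MAX_COMPAT && np->compatible[i]; i++)
		for (size_t j = 0; bus_ids[j] != NULL; j++)
			if (strcmp(np->compatible[i], bus_ids[j]) == 0)
				return 1;
	return 0;
}
```

With this rule a soc node whose compatible list ends in simple-bus is scanned as a bus without any separate soc entry in the table, which is the redundancy Scott describes.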
Maple D info
Hi,

Sorry to bother you. I'm searching for information and guides related to the Maple D board (XSC-100). I've downloaded everything I could get from IBM, but I'm still missing the docs about board jumpers, settings, etc. All official sources of information have been dead for a long time.

So, if anyone has any technical docs regarding the XSC-100 board, could you please contact me?

Thanks in advance.

-- 
With best wishes
Dmitry
Re: Efficient memcpy()/memmove() for G2/G3 cores...
Steve,

I think we should be grateful for people being interested in improving performance for PPC, and we should not bash them. The proposal to optimize the memcopy for the 5200 is good.

Steve, you said that you've heard about the 5200.. Maybe I can refresh your memory: I sent you an optimized 32bit memcopy version for the 5200 about half a year ago, with the kind request for inclusion. As you might recall, the optimized 5200 memcopy version that I sent you improved the performance by 50%.

Kind regards
Gunnar

On Thu, Sep 4, 2008 at 4:31 PM, Steven Munroe [EMAIL PROTECTED] wrote:
> On Thu, 2008-09-04 at 14:59 +0200, David Jander wrote:
> > On Thursday 04 September 2008 14:19:26 Josh Boyer wrote:
> > > [...]
> > > > (I have edited the output of this tool to fit into an e-mail without wrapping lines for readability).
> > > > Please tell me how on earth there can be such a big difference??? Note that on a MPC5200B this is TOTALLY different, and both processors have an e300 core (different versions of it though). How can there be such a big difference in throughput?
> > >
> > > Well, your algorithm seems better optimized than the glibc one for your testcase :).
> >
> > Yes, I admit my testcase is focussing on optimizing memcpy() of uncached data, and that interest stems from the fact that I was testing X11 performance (using xorg kdrive and xorg-server), and wondering why this processor wasn't able to get more FPS when moving frames on screen or scrolling, when in theory the on-board RAM should have bandwidth enough to get a smooth image.
> > What I mean is that I have a hard time believing that this processor core is so dependent on tweaks in order to get some decent memory throughput. The MPC5200B does get higher throughput with much less effort, and the two cores should be fairly identical (besides the MPC5200B having less cache memory and some other details).
>
> I have personally optimized memcpy for power4/5/6 and they are all different. There are dozens of different PPC implementations from different manufacturers and design, every one is different! With painful negotiation I was able to get the --with-cpu= framework added to glibc but not all distros use it. You can thank me later ...
>
> MPC5200B? never heard of it, don't care. I am busy with power7.
>
> So don't assume we are stupid because we have not dropped everything to optimize memcpy for YOUR processor and YOUR specific case.
>
> You care, you are a programmer? Write code! If you care about the community then fit your optimization into the framework provided for CPU specific optimization and submit it so others can benefit.
>
> > [...]
> > > I don't think you're doing anything wrong exactly. But it seems that your testcase sits there and just copies data with memcpy in varying sizes and amounts. That's not exactly a real-world usecase is it?
> >
> > No, of course it's not. I made this program to test the performance difference of different tweaks quickly. Once I found something that worked, I started LD_PRELOADing it to different other programs (among others the kdrive Xserver, mplayer, and x11perf) to see its impact on performance of some real-life apps. There the difference in performance is not so impressive of course, but it is still there (almost always either noticeably in favor of the tweaked version of memcpy(), or with a negligible or no difference).
>
> The trick is that the code built into glibc has to be optimal for the average case (4-256, average 12 bytes). Actually most memcpy implementations are a series of special cases for length and alignment.
>
> You can always do better if you know exactly what processor you are on and what specific sizes and alignment your application uses.
>
> > I have not studied the different application's uses of memcpy(), and only done empirical tests so far.
> >
> > > I think what Paul was saying is that during the course of runtime for a normal program (the kernel or userspace), most memcpy operations will be of a small order of magnitude. They will also be scattered among code that does _other_ stuff than just memcpy. So he's concerned about the overhead of an implementation that sets up the cache to do a single 32 byte memcpy.
> >
> > I understand. I also have this concern, especially for other processors, as the MPC5200B, where there doesn't seem to be so much to gain anyway.
> >
> > > Of course, I could be totally wrong. I haven't had my coffee yet this morning after all.
> >
> > You're doing quite good regardless of your lack of caffeine ;-)
> >
> > Greetings,
Re: Efficient memcpy()/memmove() for G2/G3 cores...
Hi David,

Regarding your testcase: I think we all agree with you that improving the performance for PPC is a noble quest, and we should all try to improve the performance where possible.

Regarding the 5200B and 5121 CPUs: as we all know, the 5200B is a G2 PowerPC from Freescale. Two things determine the memory performance of this PPC:
A) This CPU has ZERO 2nd level cache.
B) This CPU can remember exactly one prefetched memory line.

This means the normal memcopy routines that prefetch several cache lines ahead DO NOT WORK! To get good/best performance you need to prefetch EXACTLY ONE cache line ahead. Altering the Linux Kernel or glibc memcopy routines for the G2/PPC core to work like this is actually very simple, and doing so will increase performance by 100%.

Regarding the 5121: David, you did create a very special memcopy for the 5121e CPU. Your test showed us that the normal glibc memcopy is about 10 times slower than expected on the 5121. I really wonder why this is the case. I would have expected the 5121 to perform just like the 5200B. What we saw is that switching from READ to WRITE and back is very costly on the 5121.

There seems to be a huge difference between the 5200 and its successor, the 5121. Is this performance difference caused by the CPU or by the board/memory?

Cheers
Gunnar
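The "prefetch exactly one cache line ahead" idea can be sketched in portable C. This is only an illustration under stated assumptions, not the routine Gunnar refers to: the 32-byte line size is assumed, copy_prefetch_one_line() is a made-up name, alignment handling is omitted, and __builtin_prefetch() is the GCC/Clang builtin, which is only a hint that the compiler may or may not turn into a cache-touch instruction on a given target.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_LINE 32u	/* assumed cache line size in bytes */

/*
 * Copy n bytes from src to dst, one cache line at a time. Before each
 * line is copied, the line immediately after it is hinted to the CPU,
 * so at most one line is ever prefetched ahead of the copy position.
 */
static void *copy_prefetch_one_line(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n >= SKETCH_LINE) {
		/* s + SKETCH_LINE is at most one past the end of src here. */
		__builtin_prefetch(s + SKETCH_LINE, 0, 0);
		memcpy(d, s, SKETCH_LINE);
		d += SKETCH_LINE;
		s += SKETCH_LINE;
		n -= SKETCH_LINE;
	}
	if (n)
		memcpy(d, s, n);	/* tail shorter than one line */
	return dst;
}
```

A tuned version would also align the source to a line boundary first, so that each hint covers exactly the next line to be read.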
Re: Efficient memcpy()/memmove() for G2/G3 cores...
Hi Steve,

> I have personally optimized memcpy for power4/5/6 and they are all different. There are dozens of different PPC implementations from different manufacturers and design, every one is different! With painful negotiation I was able to get the --with-cpu= framework added to glibc but not all distro use it. You can thank me later

Steve, you make it sound like there are very many different PowerPC chips: you said you did the Power 4, Power 5, Power 6 and now Power 7 routines. And there are the 970 and the Cell.

While this sounds like 7 different PPC chips, aren't these actually only 2 main families? Wouldn't it be possible to create two main routines to cover all? One type that performs well on the family of Power4/5 and 7, and one that performs well on the family of P6 and Cell?

How are the Linux hackers handling this? Maybe there is room for consolidating?

Cheers
Gunnar
Re: [PATCH, RFC] mv643xx_eth: move sram window setting code into the driver
On Thu, Sep 04, 2008 at 08:44:31AM -0500, Matt Sealey wrote:
> I was thinking about this actually, similar to the Efika Device Tree Supplement for MPC5200B board (which adds a lot of device tree nodes not present on the production model and fixes some things), we have a Pegasos one too which changes some very minor stuff. This could be in the kernel (chrp fixups) or in a pre-boot Forth script, but in either case, wouldn't a real sram node in the device tree be the proper solution here? Hardcoding addresses for devices is rather arch/ppc behaviour and the driver is one of the few cases that never got reworked to fit.

Probably. I guess you don't want to pass that info _directly_ to the mv643xx_eth driver, though -- since the on-chip SRAM can be used for many things, and you're not necessarily sure that the user wants to use it for descriptors. (Or how much of it they want to use for descriptors.) (Or for the descriptors of which of the 8 possible transmit and receive queues, considering the 2.6.27 driver supports multiple queues.)

Well, I'm not sure what the best solution is. :-)

> I don't have a Pegasos running right now to test but I will, as soon as possible, make sure this works first.

Cool, thanks. It would be nice if you could give the driver in 2.6.27-rc5 a spin, it has seen a _lot_ of changes since 2.6.25 and I'd really like to make sure it still works on Pegasos.
Re: Hooking an IRQ on a modified mpc8349emitx board
Scott Wood wrote:
> On Wed, Sep 03, 2008 at 03:55:41PM -0700, Oscar Takeshita wrote:
> > I've been trying to hook an IRQ on a modified mpc8349emitx board without success. The IRQ is hooked physically to IRQ1/GPIO2[13] on the mpc8349e. No other devices are tied to this pin. I'm using uboot 1.2.0 and kernel 2.6.22.19.
> >
> > Do I need to have a dts entry for this interrupt in order to make request_irq() succeed? How can I find the IRQ number? I tried probe_irq_on/off unfortunately it did not work. Would it be MPC83xx_IRQ_EXT1 in arch/powerpc/include/asm/mpc83xx.h ?
> >
> > I'm new doing kernel work. Any hints appreciated.
>
> You need to describe the IRQ in a device tree node and use irq_of_parse_and_map(). request_irq() takes virtual IRQ numbers.
>
> Maybe we should put together an arch/powerpc FAQ...

That would be wonderful :-) This could help many [of us] over these initial speed-bumps.

-- 
Gary Thomas    | Consulting for the
MLB Associates | Embedded world
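Why a raw hardware interrupt number is rejected while a mapped one is accepted can be shown with a toy user-space model. Everything below is hypothetical and is not kernel API: sketch_map_hwirq() only plays the role that irq_of_parse_and_map() plays in the kernel, sketch_request_irq() only plays the role of request_irq(), and hardware line 18 is an arbitrary example value.

```c
#define SKETCH_NR_VIRQ	16
#define SKETCH_NO_IRQ	0u	/* in this sketch, 0 means "no mapping" */

/* virq -> hwirq table; slot 0 is reserved so that 0 can mean "unmapped". */
static int sketch_virq_to_hw[SKETCH_NR_VIRQ];
static int sketch_virq_used[SKETCH_NR_VIRQ];

/* Toy mapping step: hand out (or look up) a virtual number for a hardware line. */
static unsigned int sketch_map_hwirq(int hwirq)
{
	unsigned int v;

	for (v = 1; v < SKETCH_NR_VIRQ; v++)
		if (sketch_virq_used[v] && sketch_virq_to_hw[v] == hwirq)
			return v;	/* already mapped, reuse it */

	for (v = 1; v < SKETCH_NR_VIRQ; v++) {
		if (!sketch_virq_used[v]) {
			sketch_virq_used[v] = 1;
			sketch_virq_to_hw[v] = hwirq;
			return v;
		}
	}
	return SKETCH_NO_IRQ;	/* table full */
}

/* Toy request step: only virtual numbers that were mapped are accepted. */
static int sketch_request_irq(unsigned int virq)
{
	if (virq == SKETCH_NO_IRQ || virq >= SKETCH_NR_VIRQ ||
	    !sketch_virq_used[virq])
		return -1;
	return 0;
}
```

In a real driver the mapping step takes the device's node and an index into its interrupts property rather than a bare number, which is why the interrupt has to be described in the device tree first.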
Re: [PATCH v4 1/2] powerpc: Board support for GE Fanuc SBC610
On Thu, 04 Sep 2008 10:04:04 -0500 Scott Wood [EMAIL PROTECTED] wrote:
> Martyn Welch wrote:
> > On Tue, 2 Sep 2008 13:27:09 -0500 Scott Wood [EMAIL PROTECTED] wrote:
> > > On Mon, Sep 01, 2008 at 11:27:59AM +0100, Martyn Welch wrote:
> > > > +static __initdata struct of_device_id of_bus_ids[] = {
> > > > +	{ .compatible = "simple-bus", },
> > > > +	{ .type = "serial", },
> > > > +	{ .type = "soc", },
> > > > +	{ .type = "i2c", },
> > >
> > > i2c and serial don't belong here, and soc is redundant with simple-bus.
> >
> > I agree about the i2c and serial, although I need the soc entry for the i2c devices to be found during the platform scan.
>
> Not if the soc node has simple-bus in its compatible, which I thought I remembered yours did.

Um, it will have in the next revision of the patch set ;-)

Martyn

-- 
Martyn Welch MEng MPhil MIET (Principal Software Engineer)   T:+44(0)1327322748
GE Fanuc Intelligent Platforms Ltd,        |Registered in England and Wales
Tove Valley Business Park, Towcester,      |(3828642) at 100 Barbirolli Square,
Northants, NN12 6PF, UK T:+44(0)1327359444 |Manchester,M2 3AB  VAT:GB 729849476
Re: [PATCH v3 1/2] powerpc: Board support for GE Fanuc SBC610
On Wed, 03 Sep 2008 19:26:35 -0400 Jerry Van Baren [EMAIL PROTECTED] wrote:
> Martyn Welch wrote:
> > On Fri, 29 Aug 2008 07:04:18 -0500 Kumar Gala [EMAIL PROTECTED] wrote:
> > > what u-boot version are you using/shipping with these boards?
> >
> > U-boot 1.2.0
>
> 1.2.0 predates the libfdt support. You probably don't have any fdt support in your u-boot (if you do, it is old and crude and probably doesn't do any of the fixups that Kumar refers to).
>
> I would strongly recommend you upgrade to the tip o' the tree or at least the latest release (1.3.4). FDT support will be *much* better - we now have generic utility routines that fix up lots of stuff for you rather than the crufty by-hand fixups (if any) from the 1.2.0 timeframe.

Unfortunately that's not something I have the power to do. I'm stuck with 1.2.0.

Martyn

-- 
Martyn Welch MEng MPhil MIET (Principal Software Engineer)   T:+44(0)1327322748
GE Fanuc Intelligent Platforms Ltd,        |Registered in England and Wales
Tove Valley Business Park, Towcester,      |(3828642) at 100 Barbirolli Square,
Northants, NN12 6PF, UK T:+44(0)1327359444 |Manchester,M2 3AB  VAT:GB 729849476
[RFC] [PATCH 0/4] Initial PowerPC 405EZ Acadia support
This adds initial support for the AMCC PowerPC 405EZ board. Mostly sending these out for comments as I clean them up. The cuboot wrapper needs particular cleaning, as the clocking function has a ridiculous number of variables. But hey, it works.

These patches depend on the EMAC series I sent out earlier today.

Comments welcome.

josh

Josh Boyer (4):
  powerpc/40x: AMCC PowerPC 405EZ Acadia DTS
  powerpc/40x: Add support for PowerPC 405EZ Acadia board
  powerpc/40x: Add cuboot wrapper for Acadia board
  powerpc/40x: Add PowerPC 405EZ Acadia defconfig

 arch/powerpc/boot/Makefile                |    4 +-
 arch/powerpc/boot/cuboot-acadia.c         |  198 +++
 arch/powerpc/boot/dts/acadia.dts          |  224 +++
 arch/powerpc/configs/40x/acadia_defconfig |  917 +
 arch/powerpc/platforms/40x/Kconfig        |   14 +
 arch/powerpc/platforms/40x/Makefile       |    1 +
 arch/powerpc/platforms/40x/acadia.c       |   56 ++
 7 files changed, 1413 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/boot/cuboot-acadia.c
 create mode 100644 arch/powerpc/boot/dts/acadia.dts
 create mode 100644 arch/powerpc/configs/40x/acadia_defconfig
 create mode 100644 arch/powerpc/platforms/40x/acadia.c
[PATCH 1/4] powerpc/40x: AMCC PowerPC 405EZ Acadia DTS
Add the DTS for the AMCC PowerPC 405EZ Acadia evaluation board Signed-off-by: Josh Boyer [EMAIL PROTECTED] --- arch/powerpc/boot/dts/acadia.dts | 224 ++ 1 files changed, 224 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/boot/dts/acadia.dts diff --git a/arch/powerpc/boot/dts/acadia.dts b/arch/powerpc/boot/dts/acadia.dts new file mode 100644 index 000..57291f6 --- /dev/null +++ b/arch/powerpc/boot/dts/acadia.dts @@ -0,0 +1,224 @@ +/* + * Device Tree Source for AMCC Acadia (405EZ) + * + * Copyright IBM Corp. 2008 + * + * This file is licensed under the terms of the GNU General Public License + * version 2. This program is licensed as is without any warranty of any + * kind, whether express or implied. + */ + +/dts-v1/; + +/ { + #address-cells = 1; + #size-cells = 1; + model = amcc,acadia; + compatible = amcc,acadia; + dcr-parent = {/cpus/[EMAIL PROTECTED]; + + aliases { + ethernet0 = EMAC0; + serial0 = UART0; + serial1 = UART1; + }; + + cpus { + #address-cells = 1; + #size-cells = 0; + + [EMAIL PROTECTED] { + device_type = cpu; + model = PowerPC,405EZ; + reg = 0x0; + clock-frequency = 0; /* Filled in by wrapper */ + timebase-frequency = 0; /* Filled in by wrapper */ + i-cache-line-size = 32; + d-cache-line-size = 32; + i-cache-size = 16384; + d-cache-size = 16384; + dcr-controller; + dcr-access-method = native; + }; + }; + + memory { + device_type = memory; + reg = 0x0 0x0; /* Filled in by wrapper */ + }; + + UIC0: interrupt-controller { + compatible = ibm,uic-405ez, ibm,uic; + interrupt-controller; + dcr-reg = 0x0c0 0x009; + cell-index = 0; + #address-cells = 0; + #size-cells = 0; + #interrupt-cells = 2; + }; + + plb { + compatible = ibm,plb-405ez, ibm,plb3; + #address-cells = 1; + #size-cells = 1; + ranges; + clock-frequency = 0; /* Filled in by wrapper */ + + MAL0: mcmal { + compatible = ibm,mcmal-405ez, ibm,mcmal; + dcr-reg = 0x380 0x62; + num-tx-chans = 1; + num-rx-chans = 1; + interrupt-parent = UIC0; + /* 405EZ has only 3 interrupts to the UIC, 
as +* SERR, TXDE, and RXDE are or'd together into +* one UIC bit +*/ + interrupts = + 0x13 0x4 /* TXEOB */ + 0x15 0x4 /* RXEOB */ + 0x12 0x4 /* SERR, TXDE, RXDE */; + }; + + POB0: opb { + compatible = ibm,opb-405ez, ibm,opb; + #address-cells = 1; + #size-cells = 1; + ranges; + dcr-reg = 0x0a 0x05; + clock-frequency = 0; /* Filled in by wrapper */ + + UART0: [EMAIL PROTECTED] { + device_type = serial; + compatible = ns16550; + reg = 0xef600300 0x8; + virtual-reg = 0xef600300; + clock-frequency = 0; /* Filled in by wrapper */ + current-speed = 115200; + interrupt-parent = UIC0; + interrupts = 0x5 0x4; + }; + + UART1: [EMAIL PROTECTED] { + device_type = serial; + compatible = ns16550; + reg = 0xef600400 0x8; + clock-frequency = 0; /* Filled in by wrapper */ + current-speed = 115200; + interrupt-parent = UIC0; + interrupts = 0x6 0x4; + }; + + IIC: [EMAIL PROTECTED] { + compatible = ibm,iic-405ez, ibm,iic; + reg = 0xef600500 0x11; + interrupt-parent = UIC0; + interrupts = 0xa 0x4; + }; + + GPIO0: [EMAIL PROTECTED] { + compatible = ibm,gpio-405ez; +
[PATCH 2/4] powerpc/40x: Add support for PowerPC 405EZ Acadia board
Add base support for the AMCC PowerPC 405EZ Acadia evaluation board.
In addition to some of the normal PPC 40x peripherals, the Acadia board has:

 - 64 MiB PSRAM
 - NOR and NAND flash
 - Two USB 1.1 host ports
 - Two CAN 2.0 ports
 - ADC and DAC connectors
 - LCD display

This adds the basic platform support to build from.

Signed-off-by: Josh Boyer [EMAIL PROTECTED]

---
 arch/powerpc/platforms/40x/Kconfig  |   14 +
 arch/powerpc/platforms/40x/Makefile |    1 +
 arch/powerpc/platforms/40x/acadia.c |   56 +++
 3 files changed, 71 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/platforms/40x/acadia.c

diff --git a/arch/powerpc/platforms/40x/Kconfig b/arch/powerpc/platforms/40x/Kconfig
index a9260e2..fe59060 100644
--- a/arch/powerpc/platforms/40x/Kconfig
+++ b/arch/powerpc/platforms/40x/Kconfig
@@ -14,6 +14,14 @@
 #	help
 #	  This option enables support for the CPCI405 board.
 
+config ACADIA
+	bool "Acadia"
+	depends on 40x
+	default n
+	select 405EZ
+	help
+	  This option enables support for the AMCC 405EZ Acadia evaluation board.
+
 config EP405
 	bool "EP405/EP405PC"
 	depends on 40x
@@ -118,6 +126,12 @@ config 405EX
 	select IBM_NEW_EMAC_EMAC4
 	select IBM_NEW_EMAC_RGMII
 
+config 405EZ
+	bool
+	select IBM_NEW_EMAC_NO_FLOW_CTRL
+	select IBM_NEW_EMAC_MAL_CLR_ICINTSTAT
+	select IBM_NEW_EMAC_MAL_COMMON_ERR
+
 config 405GPR
 	bool
 
diff --git a/arch/powerpc/platforms/40x/Makefile b/arch/powerpc/platforms/40x/Makefile
index 5533a5c..ff483c0 100644
--- a/arch/powerpc/platforms/40x/Makefile
+++ b/arch/powerpc/platforms/40x/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_MAKALU)				+= makalu.o
 obj-$(CONFIG_WALNUT)				+= walnut.o
 obj-$(CONFIG_XILINX_VIRTEX_GENERIC_BOARD)	+= virtex.o
 obj-$(CONFIG_EP405)				+= ep405.o
+obj-$(CONFIG_ACADIA)				+= acadia.o
diff --git a/arch/powerpc/platforms/40x/acadia.c b/arch/powerpc/platforms/40x/acadia.c
new file mode 100644
index 000..9a4419b
--- /dev/null
+++ b/arch/powerpc/platforms/40x/acadia.c
@@ -0,0 +1,56 @@
+/*
+ * Acadia board support
+ *
+ * Copyright 2008 IBM Corporation
+ * Based on the Walnut code
+ * Josh Boyer [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+#include <linux/init.h>
+#include <linux/of_platform.h>
+
+#include <asm/machdep.h>
+#include <asm/prom.h>
+#include <asm/udbg.h>
+#include <asm/time.h>
+#include <asm/uic.h>
+#include <asm/ppc4xx.h>
+
+static __initdata struct of_device_id acadia_of_bus[] = {
+	{ .compatible = "ibm,plb3", },
+	{ .compatible = "ibm,opb", },
+	{ .compatible = "ibm,ebc", },
+	{},
+};
+
+static int __init acadia_device_probe(void)
+{
+	of_platform_bus_probe(NULL, acadia_of_bus, NULL);
+
+	return 0;
+}
+machine_device_initcall(acadia, acadia_device_probe);
+
+static int __init acadia_probe(void)
+{
+	unsigned long root = of_get_flat_dt_root();
+
+	if (!of_flat_dt_is_compatible(root, "amcc,acadia"))
+		return 0;
+
+	return 1;
+}
+
+define_machine(acadia) {
+	.name			= "Acadia",
+	.probe			= acadia_probe,
+	.progress		= udbg_progress,
+	.init_IRQ		= uic_init_tree,
+	.get_irq		= uic_get_irq,
+	.restart		= ppc4xx_reset_system,
+	.calibrate_decr		= generic_calibrate_decr,
+};
-- 
1.5.5.1
[PATCH 3/4] powerpc/40x: Add cuboot wrapper for Acadia board
This adds a cuboot wrapper for the AMCC PowerPC 405EZ Acadia board. The clocking code is derived from U-Boot, originally written by Stefan Roese. Signed-off-by: Josh Boyer [EMAIL PROTECTED] --- arch/powerpc/boot/Makefile|4 +- arch/powerpc/boot/cuboot-acadia.c | 198 + 2 files changed, 201 insertions(+), 1 deletions(-) create mode 100644 arch/powerpc/boot/cuboot-acadia.c diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index 14174aa..fb712a4 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -68,7 +68,8 @@ src-plat := of.c cuboot-52xx.c cuboot-824x.c cuboot-83xx.c cuboot-85xx.c holly.c fixed-head.S ep88xc.c ep405.c cuboot-c2k.c \ cuboot-katmai.c cuboot-rainier.c redboot-8xx.c ep8248e.c \ cuboot-warp.c cuboot-85xx-cpm2.c cuboot-yosemite.c simpleboot.c \ - virtex405-head.S virtex.c redboot-83xx.c cuboot-sam440ep.c + virtex405-head.S virtex.c redboot-83xx.c cuboot-sam440ep.c \ + cuboot-acadia.c src-boot := $(src-wlib) $(src-plat) empty.c src-boot := $(addprefix $(obj)/, $(src-boot)) @@ -211,6 +212,7 @@ image-$(CONFIG_DEFAULT_UIMAGE) += uImage # Board ports in arch/powerpc/platform/40x/Kconfig image-$(CONFIG_EP405) += dtbImage.ep405 image-$(CONFIG_WALNUT) += treeImage.walnut +image-$(CONFIG_ACADIA) += cuImage.acadia # Board ports in arch/powerpc/platform/44x/Kconfig image-$(CONFIG_EBONY) += treeImage.ebony cuImage.ebony diff --git a/arch/powerpc/boot/cuboot-acadia.c b/arch/powerpc/boot/cuboot-acadia.c new file mode 100644 index 000..3818d47 --- /dev/null +++ b/arch/powerpc/boot/cuboot-acadia.c @@ -0,0 +1,198 @@ +/* + * Old U-boot compatibility for Acadia + * + * Author: Josh Boyer [EMAIL PROTECTED] + * + * Copyright 2008 IBM Corporation + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. 
+ */ + +#include ops.h +#include io.h +#include dcr.h +#include stdio.h +#include 4xx.h +#include 44x.h +#include cuboot.h + +#define TARGET_4xx +#include ppcboot.h + +static bd_t bd; + +#define CPR_PERD0_SPIDV_MASK 0x000F /* SPI Clock Divider */ + +#define PLLC_SRC_MASK 0x2000 /* PLL feedback source */ + +#define PLLD_FBDV_MASK0x1F00 /* PLL feedback divider value */ +#define PLLD_FWDVA_MASK0x000F /* PLL forward divider A value */ +#define PLLD_FWDVB_MASK0x0700 /* PLL forward divider B value */ + +#define PRIMAD_CPUDV_MASK 0x0F00 /* CPU Clock Divisor Mask */ +#define PRIMAD_PLBDV_MASK 0x000F /* PLB Clock Divisor Mask */ +#define PRIMAD_OPBDV_MASK 0x0F00 /* OPB Clock Divisor Mask */ +#define PRIMAD_EBCDV_MASK 0x000F /* EBC Clock Divisor Mask */ + +#define PERD0_PWMDV_MASK 0xFF00 /* PWM Divider Mask */ +#define PERD0_SPIDV_MASK 0x000F /* SPI Divider Mask */ +#define PERD0_U0DV_MASK0xFF00 /* UART 0 Divider Mask */ +#define PERD0_U1DV_MASK0x00FF /* UART 1 Divider Mask */ + +static void get_clocks(void) +{ + unsigned long sysclk; + unsigned long cpr_plld; + unsigned long cpr_pllc; + unsigned long cpr_primad; + unsigned long primad_cpudv; + unsigned long m; + unsigned long pllFwdDiv, pllFwdDivB, pllFbkDiv, pllPlbDiv, pllExtBusDiv; + unsigned long pllOpbDiv, freqVCOHz, freqProcessor, freqPLB, freqEBC, freqUART, freqOPB; + unsigned long div; /* total divisor udiv * bdiv */ + unsigned long umin; /* minimum udiv */ + unsigned short diff;/* smallest diff */ + unsigned long udiv; /* best udiv */ + unsigned short idiff; /* current diff */ + unsigned short ibdiv; /* current bdiv */ + unsigned long i; + unsigned long est; /* current estimate */ + unsigned long plloutb; + + /* read the sysclk value from the CPLD */ + sysclk = (in_8(0x8000) == 0xc) ? 
: 3000; + + /* +* Read PLL Mode registers +*/ + cpr_plld = CPR0_READ(DCRN_CPR0_PLLD); + cpr_pllc = CPR0_READ(DCRN_CPR0_PLLC); + + /* +* Determine forward divider A +*/ + pllFwdDiv = ((cpr_plld PLLD_FWDVA_MASK) 16); + + /* +* Determine forward divider B +*/ + pllFwdDivB = ((cpr_plld PLLD_FWDVB_MASK) 8); + if (pllFwdDivB == 0) + pllFwdDivB = 8; + + /* +* Determine FBK_DIV. +*/ + pllFbkDiv = ((cpr_plld PLLD_FBDV_MASK) 24); + if (pllFbkDiv == 0) + pllFbkDiv = 256; + + /* +* Read CPR_PRIMAD register +*/ +
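The divider decoding in get_clocks() follows one pattern throughout: mask a field out of the PLL register, shift it down, and treat a raw value of zero as the field's maximum divisor. Below is a minimal sketch of that pattern. The shifts (16, 8, 24) and the zero substitutions (8 and 256) are taken from the patch text; the SKETCH_*_MASK values and all function names are assumptions made for illustration, since the mask constants in the archived copy are damaged.

```c
#include <stdint.h>

/*
 * Assumed field masks, picked only so that they agree with the shifts
 * used in the patch. They are NOT copied from the patch.
 */
#define SKETCH_FWDVA_MASK	0x000F0000u
#define SKETCH_FWDVB_MASK	0x00000700u
#define SKETCH_FBDV_MASK	0x1F000000u

/* Extract a field and substitute zero_means when the raw field is 0. */
static uint32_t sketch_field(uint32_t reg, uint32_t mask, unsigned int shift,
			     uint32_t zero_means)
{
	uint32_t v = (reg & mask) >> shift;

	return v ? v : zero_means;
}

/* Forward divider A: used as read, with no zero substitution. */
static uint32_t sketch_fwdva(uint32_t plld)
{
	return sketch_field(plld, SKETCH_FWDVA_MASK, 16, 0);
}

/* Forward divider B: a raw 0 stands for a divisor of 8. */
static uint32_t sketch_fwdvb(uint32_t plld)
{
	return sketch_field(plld, SKETCH_FWDVB_MASK, 8, 8);
}

/* Feedback divider: a raw 0 stands for a divisor of 256. */
static uint32_t sketch_fbdv(uint32_t plld)
{
	return sketch_field(plld, SKETCH_FBDV_MASK, 24, 256);
}
```

Folding the zero case into one helper would be one way to trim the long list of local variables in the clocking function.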
[PATCH 4/4] powerpc/40x: Add PowerPC 405EZ Acadia defconfig
Add simple defconfig for the AMCC PowerPC 405EZ Acadia evaluation board Signed-off-by: Josh Boyer [EMAIL PROTECTED] --- arch/powerpc/configs/40x/acadia_defconfig | 917 + 1 files changed, 917 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/configs/40x/acadia_defconfig diff --git a/arch/powerpc/configs/40x/acadia_defconfig b/arch/powerpc/configs/40x/acadia_defconfig new file mode 100644 index 000..99a0e2a --- /dev/null +++ b/arch/powerpc/configs/40x/acadia_defconfig @@ -0,0 +1,917 @@ +# +# Automatically generated make config: don't edit +# Linux kernel version: 2.6.27-rc3 +# Wed Sep 3 14:47:45 2008 +# +# CONFIG_PPC64 is not set + +# +# Processor support +# +# CONFIG_6xx is not set +# CONFIG_PPC_85xx is not set +# CONFIG_PPC_8xx is not set +CONFIG_40x=y +# CONFIG_44x is not set +# CONFIG_E200 is not set +CONFIG_4xx=y +# CONFIG_PPC_MM_SLICES is not set +CONFIG_NOT_COHERENT_CACHE=y +CONFIG_PPC32=y +CONFIG_WORD_SIZE=32 +CONFIG_PPC_MERGE=y +CONFIG_MMU=y +CONFIG_GENERIC_CMOS_UPDATE=y +CONFIG_GENERIC_TIME=y +CONFIG_GENERIC_TIME_VSYSCALL=y +CONFIG_GENERIC_CLOCKEVENTS=y +CONFIG_GENERIC_HARDIRQS=y +# CONFIG_HAVE_SETUP_PER_CPU_AREA is not set +CONFIG_IRQ_PER_CPU=y +CONFIG_STACKTRACE_SUPPORT=y +CONFIG_HAVE_LATENCYTOP_SUPPORT=y +CONFIG_LOCKDEP_SUPPORT=y +CONFIG_RWSEM_XCHGADD_ALGORITHM=y +CONFIG_ARCH_HAS_ILOG2_U32=y +CONFIG_GENERIC_HWEIGHT=y +CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_GENERIC_FIND_NEXT_BIT=y +# CONFIG_ARCH_NO_VIRT_TO_BUS is not set +CONFIG_PPC=y +CONFIG_EARLY_PRINTK=y +CONFIG_GENERIC_NVRAM=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y +CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_PPC_OF=y +CONFIG_OF=y +CONFIG_PPC_UDBG_16550=y +# CONFIG_GENERIC_TBSYNC is not set +CONFIG_AUDIT_ARCH=y +CONFIG_GENERIC_BUG=y +# CONFIG_DEFAULT_UIMAGE is not set +CONFIG_PPC_DCR_NATIVE=y +# CONFIG_PPC_DCR_MMIO is not set +CONFIG_PPC_DCR=y +CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config + +# +# General setup +# +CONFIG_EXPERIMENTAL=y +CONFIG_BROKEN_ON_SMP=y 
+CONFIG_INIT_ENV_ARG_LIMIT=32 +CONFIG_LOCALVERSION= +CONFIG_LOCALVERSION_AUTO=y +CONFIG_SWAP=y +CONFIG_SYSVIPC=y +CONFIG_SYSVIPC_SYSCTL=y +CONFIG_POSIX_MQUEUE=y +# CONFIG_BSD_PROCESS_ACCT is not set +# CONFIG_TASKSTATS is not set +# CONFIG_AUDIT is not set +# CONFIG_IKCONFIG is not set +CONFIG_LOG_BUF_SHIFT=14 +# CONFIG_CGROUPS is not set +CONFIG_GROUP_SCHED=y +# CONFIG_FAIR_GROUP_SCHED is not set +# CONFIG_RT_GROUP_SCHED is not set +CONFIG_USER_SCHED=y +# CONFIG_CGROUP_SCHED is not set +CONFIG_SYSFS_DEPRECATED=y +CONFIG_SYSFS_DEPRECATED_V2=y +# CONFIG_RELAY is not set +# CONFIG_NAMESPACES is not set +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE= +# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set +CONFIG_SYSCTL=y +CONFIG_EMBEDDED=y +CONFIG_SYSCTL_SYSCALL=y +CONFIG_KALLSYMS=y +CONFIG_KALLSYMS_ALL=y +CONFIG_KALLSYMS_EXTRA_PASS=y +CONFIG_HOTPLUG=y +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_ELF_CORE=y +CONFIG_COMPAT_BRK=y +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_ANON_INODES=y +CONFIG_EPOLL=y +CONFIG_SIGNALFD=y +CONFIG_TIMERFD=y +CONFIG_EVENTFD=y +CONFIG_SHMEM=y +CONFIG_VM_EVENT_COUNTERS=y +CONFIG_SLUB_DEBUG=y +# CONFIG_SLAB is not set +CONFIG_SLUB=y +# CONFIG_SLOB is not set +# CONFIG_PROFILING is not set +# CONFIG_MARKERS is not set +CONFIG_HAVE_OPROFILE=y +# CONFIG_KPROBES is not set +CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y +CONFIG_HAVE_IOREMAP_PROT=y +CONFIG_HAVE_KPROBES=y +CONFIG_HAVE_KRETPROBES=y +CONFIG_HAVE_ARCH_TRACEHOOK=y +# CONFIG_HAVE_DMA_ATTRS is not set +# CONFIG_USE_GENERIC_SMP_HELPERS is not set +# CONFIG_HAVE_CLK is not set +CONFIG_PROC_PAGE_MONITOR=y +# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set +CONFIG_SLABINFO=y +CONFIG_RT_MUTEXES=y +# CONFIG_TINY_SHMEM is not set +CONFIG_BASE_SMALL=0 +CONFIG_MODULES=y +# CONFIG_MODULE_FORCE_LOAD is not set +CONFIG_MODULE_UNLOAD=y +# CONFIG_MODULE_FORCE_UNLOAD is not set +# CONFIG_MODVERSIONS is not set +# CONFIG_MODULE_SRCVERSION_ALL is not set +CONFIG_KMOD=y +CONFIG_BLOCK=y +CONFIG_LBD=y +# CONFIG_BLK_DEV_IO_TRACE is 
not set +# CONFIG_LSF is not set +# CONFIG_BLK_DEV_BSG is not set +# CONFIG_BLK_DEV_INTEGRITY is not set + +# +# IO Schedulers +# +CONFIG_IOSCHED_NOOP=y +CONFIG_IOSCHED_AS=y +CONFIG_IOSCHED_DEADLINE=y +CONFIG_IOSCHED_CFQ=y +CONFIG_DEFAULT_AS=y +# CONFIG_DEFAULT_DEADLINE is not set +# CONFIG_DEFAULT_CFQ is not set +# CONFIG_DEFAULT_NOOP is not set +CONFIG_DEFAULT_IOSCHED=anticipatory +CONFIG_CLASSIC_RCU=y +# CONFIG_PPC4xx_PCI_EXPRESS is not set + +# +# Platform support +# +# CONFIG_PPC_CELL is not set +# CONFIG_PPC_CELL_NATIVE is not set +# CONFIG_PQ2ADS is not set +CONFIG_ACADIA=y +# CONFIG_EP405 is not set +# CONFIG_KILAUEA is not set +# CONFIG_MAKALU is not set +# CONFIG_WALNUT is not set +# CONFIG_XILINX_VIRTEX_GENERIC_BOARD is not set +CONFIG_405EZ=y +# CONFIG_IPIC is not set +# CONFIG_MPIC is not set +# CONFIG_MPIC_WEIRD is not set +# CONFIG_PPC_I8259 is not set +# CONFIG_PPC_RTAS is not set +# CONFIG_MMIO_NVRAM is not set +# CONFIG_PPC_MPC106 is not set +#
[PATCH v5 0/2] powerpc: Board support for GE Fanuc SBC610
The following series implements basic board support for GE Fanuc's SBC610, a 6U single board computer based on Freescale's MPC8641D.

This series provides basic functionality:
- The board can boot with a serial console.
- Ethernet works, though the PHYs are polled.
- The PCI bus is scanned and SATA functions.

Signed-off-by: Martyn Welch [EMAIL PROTECTED]
---

Scott: Thanks for your feedback. I have removed unneeded entries from of_bus_ids[] and added a simple-bus compatible node to soc in the DTS.

Martyn

 arch/powerpc/boot/dts/gef_sbc610.dts | 260 ++
 arch/powerpc/platforms/86xx/Kconfig |9 +
 arch/powerpc/platforms/86xx/Makefile |1
 arch/powerpc/platforms/86xx/gef_sbc610.c | 149 +
 4 files changed, 418 insertions(+), 1 deletions(-)
[PATCH v5 1/2] powerpc: Board support for GE Fanuc SBC610
Support for the SBC610 VPX Single Board Computer from GE Fanuc (PowerPC MPC8641D). This is the basic board support for GE Fanuc's SBC610, a 6U single board computer, based on Freescale's MPC8641D. Signed-off-by: Martyn Welch [EMAIL PROTECTED] --- arch/powerpc/boot/dts/gef_sbc610.dts | 260 ++ arch/powerpc/platforms/86xx/Kconfig |9 + arch/powerpc/platforms/86xx/Makefile |1 arch/powerpc/platforms/86xx/gef_sbc610.c | 149 + 4 files changed, 418 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/boot/dts/gef_sbc610.dts b/arch/powerpc/boot/dts/gef_sbc610.dts new file mode 100644 index 000..b80c581 --- /dev/null +++ b/arch/powerpc/boot/dts/gef_sbc610.dts @@ -0,0 +1,260 @@ +/* + * GE Fanuc SBC610 Device Tree Source + * + * Copyright 2008 GE Fanuc Intelligent Platforms Embedded Systems, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * Based on: SBS CM6 Device Tree Source + * Copyright 2007 SBS Technologies GmbH Co. KG + * And: mpc8641_hpcn.dts (MPC8641 HPCN Device Tree Source) + * Copyright 2006 Freescale Semiconductor Inc. 
+ */ + +/* + * Compiled with dtc -I dts -O dtb -o gef_sbc610.dtb gef_sbc610.dts + */ + +/dts-v1/; + +/ { + model = GEF_SBC610; + compatible = gef,sbc610; + #address-cells = 1; + #size-cells = 1; + + aliases { + ethernet0 = enet0; + ethernet1 = enet1; + serial0 = serial0; + serial1 = serial1; + pci0 = pci0; + }; + + cpus { + #address-cells = 1; + #size-cells = 0; + + PowerPC,[EMAIL PROTECTED] { + device_type = cpu; + reg = 0; + d-cache-line-size = 32; // 32 bytes + i-cache-line-size = 32; // 32 bytes + d-cache-size = 32768; // L1, 32K + i-cache-size = 32768; // L1, 32K + timebase-frequency = 0; // From uboot + bus-frequency = 0;// From uboot + clock-frequency = 0; // From uboot + }; + PowerPC,[EMAIL PROTECTED] { + device_type = cpu; + reg = 1; + d-cache-line-size = 32; // 32 bytes + i-cache-line-size = 32; // 32 bytes + d-cache-size = 32768; // L1, 32K + i-cache-size = 32768; // L1, 32K + timebase-frequency = 0; // From uboot + bus-frequency = 0;// From uboot + clock-frequency = 0; // From uboot + }; + }; + + memory { + device_type = memory; + reg = 0x0 0x4000; // set by uboot + }; + + [EMAIL PROTECTED] { + #address-cells = 1; + #size-cells = 1; + #interrupt-cells = 2; + device_type = soc; + compatible = simple-bus; + ranges = 0x0 0xfef0 0x0010; + reg = 0xfef0 0x10;// CCSRBAR 1M + bus-frequency = 0; + + i2c1: [EMAIL PROTECTED] { + #address-cells = 1; + #size-cells = 0; + compatible = fsl-i2c; + reg = 0x3000 0x100; + interrupts = 0x2b 0x2; + interrupt-parent = mpic; + dfsrr; + + [EMAIL PROTECTED] { + compatible = dallas,ds1682; + reg = 0x6b; + }; + }; + + i2c2: [EMAIL PROTECTED] { + #address-cells = 1; + #size-cells = 0; + compatible = fsl-i2c; + reg = 0x3100 0x100; + interrupts = 0x2b 0x2; + interrupt-parent = mpic; + dfsrr; + }; + + [EMAIL PROTECTED] { + #address-cells = 1; + #size-cells = 1; + compatible = fsl,mpc8641-dma, fsl,eloplus-dma; + reg = 0x21300 0x4; + ranges = 0x0 0x21100 0x200; + cell-index = 0; + [EMAIL PROTECTED] { + compatible = 
fsl,mpc8641-dma-channel, + fsl,eloplus-dma-channel; +
Re: Efficient memcpy()/memmove() for G2/G3 cores...
Hi Steven,

On Thursday 04 September 2008 16:31:13 Steven Munroe wrote:
[...]
>> Yes, I admit my testcase is focussing on optimizing memcpy() of uncached data, and that interest stems from the fact that I was testing X11 performance (using xorg kdrive and xorg-server), and wondering why this processor wasn't able to get more FPS when moving frames on screen or scrolling, when in theory the on-board RAM should have bandwidth enough to get a smooth image.
>> What I mean is that I have a hard time believing that this processor core is so dependent on tweaks in order to get some decent memory throughput. The MPC5200B does get higher throughput with much less effort, and the two cores should be fairly identical (besides the MPC5200B having less cache memory and some other details).
>
> I have personally optimized memcpy for power4/5/6 and they are all different. There are dozens of different PPC implementations from different manufacturers and design, every one is different! With painful negotiation I was able to get the --with-cpu= framework added to glibc but not all distro use it. You can thank me later ...

Well, thank you ;-)

> MPC5200B? never heard of it, don't care. I am busy with power7.

Ok, keep up your work with power7, it's great you care about that one ;-)

> So don't assume we are stupid because we have not dropped everything to optimize memcpy for YOUR processor and YOUR specific case.

Ho! I never, ever assumed that anyone (on this list) is stupid. I think you got me totally wrong (and _that_ may be my fault). I was asking for other users' experience. You make it appear as if I was complaining about your optimizations for Power4/5/6/970/Cell, but in fact, if you read correctly, I haven't even touched them... they are useless to me, since this is an e300 core. My comparisons are all against vanilla glibc _without_ any optimized code... that is (most probably) simple loops with char copy, or at most 32-bit word copies.

What I want to know is why this processor (MPC5121e, not the MPC5200B) is so terribly inefficient at this without optimizations, and whether someone has done something about it before me (I am doing it right now). I have never stated that specifically _you_ did a bad job or something, so why are you reacting like that?? In fact, your framework for specific optimizations in glibc will most probably come in VERY handy once I have sorted out the root of the problem with my specific case, so thanks a lot for your valuable work... yes, I mean it.

> You care, you are a programmer? write code! If you care about the community then fit your optimization into the framework provided for CPU specific optimization and submit it so others can benefit.

I _am_ writing code, and Gunnar is helping me find an explanation for the bizarre behaviour of this particular chip. If the result is usable to others, I _will_ fit it into your framework for optimizations.

[...]
>>> I don't think you're doing anything wrong exactly. But it seems that your testcase sits there and just copies data with memcpy in varying sizes and amounts. That's not exactly a real-world usecase is it?
>>
>> No, of course it's not. I made this program to test the performance difference of different tweaks quickly. Once I found something that worked, I started LD_PRELOADing it to different other programs (among others the kdrive Xserver, mplayer, and x11perf) to see its impact on performance of some real-life apps. There the difference in performance is not so impressive of course, but it is still there (almost always either noticeably in favor of the tweaked version of memcpy(), or with a negligible or no difference).
>
> The trick is that the code built into glibc has to be optimal for the average case (4-256, average 12 bytes). Actually most memcpy implementations are a series of special cases for length and alignment.
>
> You can always do better if you know exactly what processor you are on and what specific sizes and alignment your application uses.

Yes, I know that's a problem. Thanks for the information on the average size; I don't know where it comes from, but I'll take your word.

I am trying to be as polite and friendly as I can, so if you think I am not, please tell me where and when... I'll try to improve my social skills for the next time ;-)

Greetings,

-- 
David Jander
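The remark above that most memcpy implementations are "a series of special cases for length and alignment" can be illustrated with a small portable sketch. This is not glibc's code and is not tuned for the e300 or any other core; the function name copy_by_cases, the 16-byte cutoff and the 32-bit word size are assumptions chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD_BYTES sizeof(uint32_t)

/*
 * Hypothetical illustration of "special cases for length and alignment".
 * Short copies, and copies whose source and destination are misaligned
 * relative to each other, fall through to the plain byte loop.
 */
void *copy_by_cases(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Word path: only for longer copies where both pointers share the
     * same offset within a word, so they become aligned together. */
    if (n >= 16 && (((uintptr_t)d ^ (uintptr_t)s) & (WORD_BYTES - 1)) == 0) {
        /* Head: byte copies until the destination is word aligned. */
        while (((uintptr_t)d & (WORD_BYTES - 1)) != 0) {
            *d++ = *s++;
            n--;
        }
        /* Body: one word at a time. The fixed-size memcpy calls avoid
         * aliasing problems; compilers normally emit a single load and
         * a single store for them. */
        while (n >= WORD_BYTES) {
            uint32_t w;
            memcpy(&w, s, WORD_BYTES);
            memcpy(d, &w, WORD_BYTES);
            d += WORD_BYTES;
            s += WORD_BYTES;
            n -= WORD_BYTES;
        }
    }

    /* Tail, short copies, and mutually misaligned copies. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

A real implementation would add further cases (for example shifting and merging words when the two pointers are misaligned relative to each other), which is exactly where per-processor tuning comes in.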
Re: Efficient memcpy()/memmove() for G2/G3 cores...
On Thursday 04 September 2008 17:01:21 Gunnar Von Boehn wrote:
[...]
> Regarding the 5121.
> David, you did create a very special memcopy for the 5121e CPU. Your test showed us that the normal glibc memcopy is about 10 times slower than expected on the 5121.
>
> I really wonder why this is the case. I would have expected the 5121 to perform just like the 5200B. What we saw is that switching from READ to WRITE and back is very costly on 5121.
>
> There seems to be a huge difference between the 5200 and its successor the 5121. Is this performance difference caused by the CPU or by the board/memory?

I have some new insight now, and I will look more closely at the workings of the DRAM controller... there has to be something wrong somewhere, and I am going to find it... whether it is some strange bug in my u-boot code (initializing the DRAM controller and prio-manager, for example) or a silicon erratum (John?)

Thanks a lot for your help so far.

-- 
David Jander
Re: Efficient memcpy()/memmove() for G2/G3 cores...
> I would be careful about adding overhead to memcpy. I found that in the kernel, almost all calls to memcpy are for less than 128 bytes (1 cache line on most 64-bit machines). So, adding a lot of code to detect cacheability and do prefetching is just going to slow down the common case, which is short copies. I don't have statistics for glibc but I wouldn't be surprised if most copies were short there also.

You are right. For small copies, it is not advisable. What I did was put a small check at the beginning of memcpy: if the copy is less than 5 cache lines, I don't do dcbt/dcbz. Thus we see a big jump for copies of more than 5 cache lines. The overhead is only 2 assembly instructions (a compare of the number of bytes followed by a jump).

One question: how can we quickly determine whether both the source and destination address ranges fall in a cacheable range? The user can mmap a region of memory as non-cacheable, but then call memcpy with that address. The optimized version must quickly determine that dcbt/dcbz must not be used in this case. I don't know what would be a good way to achieve that.

Regards,
Prodyut Hazarika
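The length check described above can be sketched in portable C. This is only an illustration under stated assumptions: the names choose_copy_path and copy_with_threshold are hypothetical, the 32-byte cache line is an assumption for e300-class cores, and the GCC/Clang __builtin_prefetch hint merely stands in for dcbt. There is no portable equivalent of dcbz, so that part is not shown, and the open question about non-cacheable mappings is not addressed here.

```c
#include <stddef.h>
#include <string.h>

#define CACHE_LINE_BYTES   32  /* assumption: 32-byte lines (e300-class) */
#define PREFETCH_MIN_LINES 5   /* threshold quoted in the message above */

enum copy_path { COPY_PLAIN, COPY_PREFETCH };

/* The whole cost for a short copy is this one comparison and branch. */
enum copy_path choose_copy_path(size_t n)
{
    return n < (size_t)PREFETCH_MIN_LINES * CACHE_LINE_BYTES
               ? COPY_PLAIN : COPY_PREFETCH;
}

void *copy_with_threshold(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Short copies skip the cache-hint path entirely. */
    if (choose_copy_path(n) == COPY_PLAIN)
        return memcpy(dst, src, n);

    /* Long copies: one cache line per iteration, hinting the next
     * source line. A PowerPC version would use dcbt (and dcbz on the
     * destination) here instead. */
    while (n >= CACHE_LINE_BYTES) {
#if defined(__GNUC__)
        __builtin_prefetch(s + CACHE_LINE_BYTES);
#endif
        memcpy(d, s, CACHE_LINE_BYTES);
        d += CACHE_LINE_BYTES;
        s += CACHE_LINE_BYTES;
        n -= CACHE_LINE_BYTES;
    }
    memcpy(d, s, n);  /* remaining partial line, possibly zero bytes */
    return dst;
}
```

With these assumed constants the switch-over happens at 160 bytes; below that, the caller pays only for the comparison in choose_copy_path().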
Re: Hooking an IRQ on a modified mpc8349emitx board
Thanks Scott,

It is hooked and alive! Your couple of lines guided me through. Here's the outline of what I did after looking at other powerpc drivers and the ePAPR document from Power.org:

1. Assumed that MPC83xx_IRQ_EXT1 (0x11 without offset) corresponds to IRQ1.

2. Created a node in my dts that looks like:

    [EMAIL PROTECTED] {
        device_type = "network";
        interrupts = <11 8>;
        interrupt-parent = < &ipic >;
    };

3. In my kernel device driver I have the following outline:

    struct device_node *np;
    int irq;

    np = of_find_node_by_name(NULL, "mydevice");
    if (np == NULL) {
        printk(KERN_ERR "Can't find my device node\n");
        irq = -1;
    } else {
        /* Map the first interrupt specifier to a virtual IRQ number. */
        irq = irq_of_parse_and_map(np, 0);
        /* Drop the reference taken by of_find_node_by_name(). */
        of_node_put(np);
    }

4. Ready to use request_irq() on irq!

Thanks,

Oscar

Scott Wood wrote:
> On Wed, Sep 03, 2008 at 03:55:41PM -0700, Oscar Takeshita wrote:
>> I've been trying to hook an IRQ on a modified mpc8349emitx board without success. The IRQ is hooked physically to IRQ1/GPIO2[13] on the mpc8349e. No other devices are tied to this pin. I'm using uboot 1.2.0 and kernel 2.6.22.19.
>>
>> Do I need to have a dts entry for this interrupt in order to make request_irq() succeed? How can I find the IRQ number? I tried probe_irq_on/off; unfortunately it did not work. Would it be MPC83xx_IRQ_EXT1 in arch/powerpc/include/asm/mpc83xx.h ?
>>
>> I'm new to kernel work. Any hints appreciated.
>
> You need to describe the IRQ in a device tree node and use irq_of_parse_and_map(). request_irq() takes virtual IRQ numbers.
>
> Maybe we should put together an arch/powerpc FAQ...
>
> -Scott
Re: [PATCH v3 1/4] powerpc: Introduce local (non-broadcast) forms of tlb invalidates
Kumar Gala wrote:
> [...] The intent of this change is to handle SMP based invalidates via IPIs instead of broadcasts as the mechanism scales better for larger number of cores.

Hi Kumar,

How is the inter-IPI deadlock avoidance designed in this new approach? I don't know the close details of low-level Book-E VM design in Linux, but am thinking about a scenario when we have two TLB misses hitting almost immediately on two different cores and they both want to send a TLB invalidate IPI to each other. How do you manage this?

The reason I ask is we had similar considerations (and problems) when bringing SMP to the dual core e500 on FreeBSD and ended up not using IPIs, at least for now, because of such concerns (and actual problems of this kind).

Rafal
Re: [PATCH] powerpc: add SSI-to-DMA properties to Freescale MPC8610 HPCD device tree
On Wed, Aug 6, 2008 at 11:48 AM, Timur Tabi [EMAIL PROTECTED] wrote:
> Add the fsl,playback-dma and fsl,capture-dma properties to the Freescale MPC8610 HPCD device tree. These properties connect the SSI nodes to the DMA nodes for the DMA channels that the SSI should use. Also update the ssi.txt documentation.
>
> These properties will be needed when the ASoC V2 version of the Freescale MPC8610 device drivers are merged into the mainline.
>
> Signed-off-by: Timur Tabi [EMAIL PROTECTED]
> ---

Kumar, any thoughts on this patch?

-- 
Timur Tabi
Linux kernel developer at Freescale
Re: Trace
I followed your advice: I used the UART layer. It is easier.

Thanks

Benjamin Herrenschmidt wrote:
> On Tue, 2008-09-02 at 09:34 +0200, Sébastien Chrétien wrote:
>> OK, is it a bad way to write a tty driver? Why?
>
> Not necessarily, it's just overkill. The UART layer handles a lot of stuff for you. The tty layer is tricky to get right.
>
> Cheers,
> Ben.
Re: [RFC][USB] powerpc: Workaround for the PPC440EPX USBH_23 errata
On Fri, 29 Aug 2008 17:30:57 -0400 (EDT), Alan Stern [EMAIL PROTECTED] wrote:

> On Fri, 29 Aug 2008, Vitaly Bordug wrote:
>
>> But even assuming PM is set, a common use case for embedded systems is to have stuff on the USB bus that is never unplugged; and in the case of compiled-in ehci and ohci there is just
>>
>> Initializing USB Mass Storage driver...
>> usb 1-1: new high speed USB device using ppc-of-ehci and address 2
>> usb 1-1: device descriptor read/all, error -110
>> hub 1-0:1.0: unable to enumerate USB device on port 1
>>
>> (not touching PM here, to be clear) even a 1-second delay would be enough for the root hub to be hosed...
>>
>> So, is the patch (with all the issues addressed) acceptable for mainline?
>
> Try doing this instead. First, make sure CONFIG_PM is set! :-) Then replace your patch with something that adds a 2-second delay to the end of ehci_start() when running on a 440EPx system. That should give the OHCI controller time to suspend before the EHCI root hub is registered.

I've started looking this way. Sorry, but this approach is a little bit fragile and unreliable.
First, it just did not work in the case of a USB 2.0 hub with a memory stick and keyboard plugged in: OHCI did not suspend, EHCI is hosed, and we're happily using a hi-speed device at full speed:

ppc-of-ehci e300.ehci: OF EHCI
ppc-of-ehci e300.ehci: new USB bus registered, assigned bus number 1
ppc-of-ehci e300.ehci: irq 32, io mem 0xe300
ppc-of-ehci e300.ehci: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: OF EHCI
usb usb1: Manufacturer: Linux 2.6.26-00016-g8914f6a-dirty ehci_hcd
usb usb1: SerialNumber: PPC-OF USB
ppc-of-ohci e400.usb: OF OHCI
ppc-of-ohci e400.usb: new USB bus registered, assigned bus number 2
ppc-of-ohci e400.usb: irq 33, io mem 0xe400
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 1 port detected
usb usb2: New USB device found, idVendor=1d6b, idProduct=0001
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: OF OHCI
usb usb2: Manufacturer: Linux 2.6.26-00016-g8914f6a-dirty ohci_hcd
usb usb2: SerialNumber: PPC-OF USB
Initializing USB Mass Storage driver...
usb 1-1: new high speed USB device using ppc-of-ehci and address 2
usb 1-1: device not accepting address 2, error -110
usb 1-1: new high speed USB device using ppc-of-ehci and address 3
usb 1-1: device not accepting address 3, error -110
usb 1-1: new high speed USB device using ppc-of-ehci and address 4
usb 1-1: device not accepting address 4, error -110
usb 1-1: new high speed USB device using ppc-of-ehci and address 5
usb 1-1: device not accepting address 5, error -110
hub 1-0:1.0: unable to enumerate USB device on port 1
usb 2-1: new full speed USB device using ppc-of-ohci and address 2

Secondly, we may *not* rely on OHCI always having the same suspend policy.
Even if we get the code into a shape where it automagically suspends with the right timing to work around the HW issue, we cannot be sure that this will persist across the kernel lifecycle and won't require concerned people to kick the suspend timings back into a working state with each rc release.

Thirdly, PM is explicitly disabled by Kconfig for 44x. The reasoning is not clear at the moment, but I believe it isn't there just in case.

> A little more tweaking will be needed to handle system sleeps. But this should be a good start.

Getting 44x into proper PM land is a good direction, but right now there is a situation on a certain platform where USB is trimmed down to full-speed mode in the evaluation design and in many designs derived from it. The only way to have all the benefits was to maintain some version of the AMCC workaround internally.

> What to do when CONFIG_PM is off is a separate matter. Let's not worry about it for now -- especially since, as Matthias suggested, you can use a USB 2.0 hub.

Not every hub will work (none of the available ones did so far), and it is often not an option for an embedded device without rewiring.

As this touches powerpc stuff only, are there any objections to letting the powerpc people consider whether the approach suggested earlier is applicable or not?

Thanks,
-Vitaly
Re: [Libhugetlbfs-devel] Buglet in 16G page handling
Jon Tollefson wrote:
> David Gibson wrote:
>> On Tue, Sep 02, 2008 at 12:12:27PM -0500, Jon Tollefson wrote:
>>> David Gibson wrote:
>>>> When BenH and I were looking at the new code for handling 16G pages, we noticed a small bug. It doesn't actually break anything user visible, but it's certainly not the way things are supposed to be. The 16G patches didn't update the huge_pte_offset() and huge_pte_alloc() functions, which means that the hugepte tables for 16G pages will be allocated much further down the page table tree than they should be - allocating several levels of page table with a single entry in them along the way.
>>>>
>>>> The patch below is supposed to fix this, cleaning up the existing handling of 64k vs 16M pages while it's at it. However, it needs some testing.
>>>>
>>>> I've checked that it doesn't break existing 16M support, either with 4k or 64k base pages. I haven't figured out how to test with 64k pages yet, at least until the multisize support goes into libhugetlbfs. For 16G pages, I just don't have access to a machine with enough memory to test. Jon, presumably you must have found such a machine when you did the 16G page support in the first place. Do you still have access, and can you test this patch?
>>>
>>> I do have access to a machine to test it. I applied the patch to -rc4 and used a pseries_defconfig. I boot with default_hugepagesz=16G... in order to test huge page sizes other than 16M at this point.
>>>
>>> Running the libhugetlbfs test suite it gets as far as Readback (64): PASS before it hits the following program check.
>>
>> Ah, yes, oops, forgot to fix up the pagetable freeing path in line with the other changes. Try the revised version below.
>
> I have run through the tests twice now with this new patch using a 4k base page size (and 16G huge page size) and there are no program checks or spin lock issues. So looking good.
>
> I will run it next a couple of times with 64K base pages.
I have run through the libhugetest suite 3 times each now with both combinations(4k and 64K base page) and have not seen the spin lock problem or any other problems. Acked-by: Jon Tollefson [EMAIL PROTECTED] Jon Index: working-2.6/arch/powerpc/mm/hugetlbpage.c === --- working-2.6.orig/arch/powerpc/mm/hugetlbpage.c 2008-09-02 11:50:12.0 +1000 +++ working-2.6/arch/powerpc/mm/hugetlbpage.c2008-09-03 10:10:54.0 +1000 @@ -128,29 +128,37 @@ static int __hugepte_alloc(struct mm_str return 0; } -/* Base page size affects how we walk hugetlb page tables */ -#ifdef CONFIG_PPC_64K_PAGES -#define hpmd_offset(pud, addr, h) pmd_offset(pud, addr) -#define hpmd_alloc(mm, pud, addr, h)pmd_alloc(mm, pud, addr) -#else -static inline -pmd_t *hpmd_offset(pud_t *pud, unsigned long addr, struct hstate *hstate) + +static pud_t *hpud_offset(pgd_t *pgd, unsigned long addr, struct hstate *hstate) +{ +if (huge_page_shift(hstate) PUD_SHIFT) +return pud_offset(pgd, addr); +else +return (pud_t *) pgd; +} +static pud_t *hpud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long addr, + struct hstate *hstate) { -if (huge_page_shift(hstate) == PAGE_SHIFT_64K) +if (huge_page_shift(hstate) PUD_SHIFT) +return pud_alloc(mm, pgd, addr); +else +return (pud_t *) pgd; +} +static pmd_t *hpmd_offset(pud_t *pud, unsigned long addr, struct hstate *hstate) +{ +if (huge_page_shift(hstate) PMD_SHIFT) return pmd_offset(pud, addr); else return (pmd_t *) pud; } -static inline -pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr, - struct hstate *hstate) +static pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr, + struct hstate *hstate) { -if (huge_page_shift(hstate) == PAGE_SHIFT_64K) +if (huge_page_shift(hstate) PMD_SHIFT) return pmd_alloc(mm, pud, addr); else return (pmd_t *) pud; } -#endif /* Build list of addresses of gigantic pages. This function is used in early * boot before the buddy or bootmem allocator is setup. 
@@ -204,7 +212,7 @@ pte_t *huge_pte_offset(struct mm_struct pg = pgd_offset(mm, addr); if (!pgd_none(*pg)) { -pu = pud_offset(pg, addr); +pu = hpud_offset(pg, addr, hstate); if (!pud_none(*pu)) { pm = hpmd_offset(pu, addr, hstate); if (!pmd_none(*pm)) @@ -233,7 +241,7 @@ pte_t *huge_pte_alloc(struct mm_struct * addr = hstate-mask; pg = pgd_offset(mm, addr); -pu = pud_alloc(mm, pg, addr); +pu = hpud_alloc(mm, pg, addr, hstate); if (pu) { pm = hpmd_alloc(mm, pu, addr, hstate); @@ -316,13 +324,7
Re: [PATCH v3 1/4] powerpc: Introduce local (non-broadcast) forms of tlb invalidates
On Thu, 2008-09-04 at 21:17 +0200, Rafal Jaworowski wrote:
> Kumar Gala wrote:
>> [...] The intent of this change is to handle SMP based invalidates via IPIs instead of broadcasts as the mechanism scales better for larger number of cores.
>
> Hi Kumar,
>
> How is the inter-IPI deadlock avoidance designed in this new approach? I don't know the close details of low-level Book-E VM design in Linux, but am thinking about a scenario when we have two TLB misses hitting almost immediately on two different cores and they both want to send a TLB invalidate IPI to each other. How do you manage this?

Simple.. Just look at how it's done on x86 :-)

The flush_tlb_* operations happen with interrupts enabled and no major lock held. They shouldn't deadlock.

> The reason I ask is we had similar considerations (and problems) when bringing SMP to the dual core e500 on FreeBSD and ended up not using IPIs, at least for now, because of such concerns (and actual problems of this kind).

Well, it depends how your VM is designed. The Linux one has been doing IPIs for invalidations forever on x86, so it's not hard to adapt.

Ben.
Re: [RFC][USB] powerpc: Workaround for the PPC440EPX USBH_23 errata
On Fri, 5 Sep 2008, Vitaly Bordug wrote:

> I've started looking this way. Sorry, but this approach is a little bit fragile, and unreliable.
>
> First, it just did not work out in case of usb2 hub with memory stick and keyboard plugged - OHCI did not suspend, ehci is hosed and we're happily using hi-speed device on full-speed:
>
> Secondly, we may *not* rely on the fact, that OHCI will always have the same suspend policy. Even kicking the code up to the shape when it will automagically suspend in proper timing to get the HW issue around, we cannot be sure that it will persist along kernel lifecycle, and won't require concerned people to kick suspend timings back to the working state subsequently each rc release.
>
> Thirdly, PM is disabled by Kconfig explicitly in case of 44x. Reasoning is not clear at the moment, but I believe that isn't there just in case.

I assume that's the reason the suggested approach failed.

>> What to do when CONFIG_PM is off is a separate matter. Let's not worry about it for now -- especially since, as Matthias suggested, you can use a USB 2.0 hub.
>
> Not every hub will work (none of available did so far), and it is often not an option for embedded device without rewiring.

It's odd that your hubs don't work. What's wrong with them?

> As this touches powerpc stuff only, are there any objections to let powerpc people consider if approach suggested earlier is applicable or not?

I don't mind doing that, provided the changes are cleaned up so that they don't affect people who aren't building kernels for 44x systems.

Alan Stern
Re: [PATCH] Correct printk %pF to work on all architectures
> Make dereference_function_descriptor() more accommodating by allowing architecture overrides. I put the three overrides (for parisc64, ppc64 and ia64) in arch/kernel/module.c because that's where the kernel internal linker which knows how to deal with function descriptors sits.
>
> Signed-off-by: James Bottomley [EMAIL PROTECTED]

Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
Re: [PATCH] POWERPC: Rename PTE_SIZE to HPTE_SIZE
On Wed, 2008-09-03 at 10:37 -0500, Becky Bruce wrote: It's the size of the hardware PTE; make that clear in the name. Signed-off-by: Becky Bruce [EMAIL PROTECTED] Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- --- arch/powerpc/mm/hash_low_32.S | 36 ++-- 1 files changed, 18 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/mm/hash_low_32.S b/arch/powerpc/mm/hash_low_32.S index b9ba7d9..c41d658 100644 --- a/arch/powerpc/mm/hash_low_32.S +++ b/arch/powerpc/mm/hash_low_32.S @@ -285,7 +285,7 @@ Hash_bits = 12/* e.g. 256kB hash table */ Hash_msk = (((1 Hash_bits) - 1) * 64) /* defines for the PTE format for 32-bit PPCs */ -#define PTE_SIZE 8 +#define HPTE_SIZE8 #define PTEG_SIZE64 #define LG_PTEG_SIZE 6 #define LDPTEu lwzu @@ -342,8 +342,8 @@ _GLOBAL(hash_page_patch_A) /* Search the primary PTEG for a PTE whose 1st (d)word matches r5 */ mtctr r0 - addir4,r3,-PTE_SIZE -1: LDPTEu r6,PTE_SIZE(r4) /* get next PTE */ + addir4,r3,-HPTE_SIZE +1: LDPTEu r6,HPTE_SIZE(r4)/* get next PTE */ CMPPTE 0,r6,r5 bdnzf 2,1b/* loop while ctr != 0 !cr0.eq */ beq+found_slot @@ -353,9 +353,9 @@ _GLOBAL(hash_page_patch_A) _GLOBAL(hash_page_patch_B) xoris r4,r3,Hash_msk16 /* compute secondary hash */ xorir4,r4,(-PTEG_SIZE 0x) - addir4,r4,-PTE_SIZE + addir4,r4,-HPTE_SIZE mtctr r0 -2: LDPTEu r6,PTE_SIZE(r4) +2: LDPTEu r6,HPTE_SIZE(r4) CMPPTE 0,r6,r5 bdnzf 2,2b beq+found_slot @@ -363,8 +363,8 @@ _GLOBAL(hash_page_patch_B) /* Search the primary PTEG for an empty slot */ 10: mtctr r0 - addir4,r3,-PTE_SIZE /* search primary PTEG */ -1: LDPTEu r6,PTE_SIZE(r4) /* get next PTE */ + addir4,r3,-HPTE_SIZE/* search primary PTEG */ +1: LDPTEu r6,HPTE_SIZE(r4)/* get next PTE */ TST_V(r6) /* test valid bit */ bdnzf 2,1b/* loop while ctr != 0 !cr0.eq */ beq+found_empty @@ -380,9 +380,9 @@ _GLOBAL(hash_page_patch_B) _GLOBAL(hash_page_patch_C) xoris r4,r3,Hash_msk16 /* compute secondary hash */ xorir4,r4,(-PTEG_SIZE 0x) - addir4,r4,-PTE_SIZE + addir4,r4,-HPTE_SIZE mtctr r0 -2: LDPTEu r6,PTE_SIZE(r4) 
+2: LDPTEu r6,HPTE_SIZE(r4) TST_V(r6) bdnzf 2,2b beq+found_empty @@ -409,11 +409,11 @@ _GLOBAL(hash_page_patch_C) 1: addis r4,r7,[EMAIL PROTECTED] /* get next evict slot */ lwz r6,[EMAIL PROTECTED](r4) - addir6,r6,PTE_SIZE /* search for candidate */ - andi. r6,r6,7*PTE_SIZE + addir6,r6,HPTE_SIZE /* search for candidate */ + andi. r6,r6,7*HPTE_SIZE stw r6,[EMAIL PROTECTED](r4) add r4,r3,r6 - LDPTE r0,PTE_SIZE/2(r4) /* get PTE second word */ + LDPTE r0,HPTE_SIZE/2(r4) /* get PTE second word */ clrrwi r0,r0,12 lis r6,[EMAIL PROTECTED] ori r6,r6,[EMAIL PROTECTED] /* get etext */ @@ -426,7 +426,7 @@ _GLOBAL(hash_page_patch_C) found_empty: STPTE r5,0(r4) found_slot: - STPTE r8,PTE_SIZE/2(r4) + STPTE r8,HPTE_SIZE/2(r4) #else /* CONFIG_SMP */ /* @@ -452,7 +452,7 @@ found_slot: STPTE r5,0(r4) sync TLBSYNC - STPTE r8,PTE_SIZE/2(r4) /* put in correct RPN, WIMG, PP bits */ + STPTE r8,HPTE_SIZE/2(r4) /* put in correct RPN, WIMG, PP bits */ sync SET_V(r5) STPTE r5,0(r4)/* finally set V bit in PTE */ @@ -562,8 +562,8 @@ _GLOBAL(flush_hash_patch_A) /* Search the primary PTEG for a PTE whose 1st (d)word matches r5 */ li r0,8/* PTEs/group */ mtctr r0 - addir12,r8,-PTE_SIZE -1: LDPTEu r0,PTE_SIZE(r12)/* get next PTE */ + addir12,r8,-HPTE_SIZE +1: LDPTEu r0,HPTE_SIZE(r12) /* get next PTE */ CMPPTE 0,r0,r11 bdnzf 2,1b/* loop while ctr != 0 !cr0.eq */ beq+3f @@ -574,9 +574,9 @@ _GLOBAL(flush_hash_patch_A) _GLOBAL(flush_hash_patch_B) xoris r12,r8,Hash_msk16 /* compute secondary hash */ xorir12,r12,(-PTEG_SIZE 0x) - addir12,r12,-PTE_SIZE + addir12,r12,-HPTE_SIZE mtctr r0 -2: LDPTEu r0,PTE_SIZE(r12) +2: LDPTEu r0,HPTE_SIZE(r12) CMPPTE 0,r0,r11 bdnzf 2,2b xorir11,r11,PTE_H /* clear H again */ ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
RE: [PATCH] Correct printk %pF to work on all architectures
>> Make dereference_function_descriptor() more accommodating by allowing architecture overrides. I put the three overrides (for parisc64, ppc64 and ia64) in arch/kernel/module.c because that's where the kernel internal linker which knows how to deal with function descriptors sits.
>>
>> Signed-off-by: James Bottomley [EMAIL PROTECTED]
>
> Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED]

ia64 bits still build, boot and work too.

Acked-by: Tony Luck [EMAIL PROTECTED]

-Tony
RE: Kernel crash while initialising PCI
-----Original Message-----
From: Scott Wood [mailto:[EMAIL PROTECTED]
Sent: Thursday, 4 September 2008 3:15 AM
To: Janaka Subhawickrama
Cc: 'linuxppc-dev@ozlabs.org'
Subject: Re: Kernel crash while initialising PCI

On Wed, Sep 03, 2008 at 05:50:10PM +1000, Janaka Subhawickrama wrote:
> My DTS file entries for PCI are as follows:
>
> [EMAIL PROTECTED] {
>     interrupt-map-mask =
>         // Mask the child UnitIrqSpec
>         //ff00 0 0 7
>         ff ff f ;
>     interrupt-map =
>         // Child UnitIrqSpec, Parent PIC handle, Parent UnitIrqSpec
>         // PCI unit address ( 0 0), Interrupt specifier IDSEL 0x0D (CRX board hard wired to IDSEL 13),
>         // parent Phandle, IRQ7=0x17 (jumper based) level sensitive(8)
>         //0d00 0 0 1 ipic 17 8
>         6800 0 0 1 ipic 17 8 ;
>     interrupt-parent = ipic ;
>     interrupts = 42 8;      //PCI1 interrupt, level sensitive
>     bus-range = 0 0;        //Bus number and largest bus under this
>     /*
>     struct ranges_pci {
>         unsigned int pci_space; //Prefetchable/relocatable IEEE1275
>         u64 pci_addr;
>         phys_addr_t phys_addr;
>         u64 size;
>     };
>     */
>     ranges = 0200 0 A000 A000 0 2000
>              0100 0 D800 D800 0 0100 ;
>     clock-frequency = 1FCA055;  //Hz
>     #interrupt-cells = 1;   //PCI apparently uses 1
>     #size-cells = 2;        //Max of 2 ints
>     #address-cells = 3;     //int size
>     reg = 8500 100;         //??
>     compatible = fsl,mpc8349-pci;
>     device_type = pci;
> };
>
> What am I doing wrong ?

The PCI node needs to go under the root node, rather than under the soc node. Otherwise, the kernel will try to translate ranges through the IMMR window. In older kernels, there was a bug in the PCI code that let you get away with this.

See current in-kernel device trees for examples.

-Scott

Yep. That fixed my immediate problem. Thank you very much.

The kernel boots now and I can run lspci and see the basic information for the graphics card. Now I have to trace through why I get

PCI: Calling quirk c01a9db8 for :00:0d.0.

Any pointers are greatly appreciated.

Cheers
Janaka
Re: [Libhugetlbfs-devel] Buglet in 16G page handling
On Wed, Sep 03, 2008 at 05:19:27PM -0500, Jon Tollefson wrote:
> David Gibson wrote:
> > On Tue, Sep 02, 2008 at 12:12:27PM -0500, Jon Tollefson wrote:
> > > David Gibson wrote:
> > > > When BenH and I were looking at the new code for handling 16G pages, we noticed a small bug. It doesn't actually break anything user visible, but it's certainly not the way things are supposed to be. The 16G patches didn't update the huge_pte_offset() and huge_pte_alloc() functions, which means that the hugepte tables for 16G pages will be allocated much further down the page table tree than they should be - allocating several levels of page table with a single entry in them along the way.
> > > >
> > > > The patch below is supposed to fix this, cleaning up the existing handling of 64k vs 16M pages while it's at it. However, it needs some testing.
> > > >
> > > > I've checked that it doesn't break existing 16M support, either with 4k or 64k base pages. I haven't figured out how to test with 64k pages yet, at least until the multisize support goes into libhugetlbfs. For 16G pages, I just don't have access to a machine with enough memory to test. Jon, presumably you must have found such a machine when you did the 16G page support in the first place. Do you still have access, and can you test this patch?
> > >
> > > I do have access to a machine to test it. I applied the patch to -rc4 and used a pseries_defconfig. I boot with default_hugepagesz=16G... in order to test huge page sizes other than 16M at this point.
> > >
> > > Running the libhugetlbfs test suite it gets as far as Readback (64): PASS before it hits the following program check.
> >
> > Ah, yes, oops, forgot to fix up the pagetable freeing path in line with the other changes. Try the revised version below.
>
> I have run through the tests twice now with this new patch using a 4k base page size (and 16G huge page size) and there are no program checks or spin lock issues. So looking good.
>
> I will run it next a couple of times with 64K base pages.

Ok, and I've now run it with 64k hugepage size, so assuming this last test of yours goes ok, I'll push the patch out.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
build failure with current linus tree
The current Linus tree fails to build for me for a 64-bit powerpc config with:

WARNING: modpost: Found 13 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
mm/built-in.o: In function `.strndup_user':
(.text+0x1401a): undefined reference to `.LCTOC0'
mm/built-in.o: In function `.walk_page_range':
(.text+0x269ae): undefined reference to `.LCTOC0'
fs/built-in.o: In function `.posix_acl_alloc':
(.text+0x5f62a): undefined reference to `.LCTOC0'
fs/built-in.o: In function `.posix_acl_from_xattr':
(.text+0x5feda): undefined reference to `.LCTOC0'
fs/built-in.o: In function `.ext3fs_dirhash':
(.text+0x8e17e): undefined reference to `.LCTOC0'
fs/built-in.o:(.text+0xe844e): more undefined references to `.LCTOC0' follow
make[2]: *** [.tmp_vmlinux1] Error 1
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2

The config is at http://verein.lst.de/~hch/.config, I'm using gcc 3.4.6.
Re: [RFC][USB] powerpc: Workaround for the PPC440EPX USBH_23 errata
I assume that's the reason the suggested approach failed.

I think (hope) that Vitaly actually fixed that before trying it :-)

I don't mind doing that, provided the changes are cleaned up so that they don't affect people who aren't building kernels for 44x systems.

I also find the CONFIG_PM approach not terribly reliable... too timing sensitive, and it forces boards to add CONFIG_PM which is not necessarily the nicest thing to do in embedded space.

Sounds better to me to clean up the initial patch to ensure it's not going to introduce bloat for other platforms and use that, ie just consider it as yet another HW errata.

Cheers,
Ben.
Re: [Libhugetlbfs-devel] Buglet in 16G page handling
On Thu, Sep 04, 2008 at 04:08:30PM -0500, Jon Tollefson wrote:
> Jon Tollefson wrote:
> > David Gibson wrote:
> > > On Tue, Sep 02, 2008 at 12:12:27PM -0500, Jon Tollefson wrote:
> [snip]
> > I have run through the tests twice now with this new patch using a 4k base page size (and 16G huge page size) and there are no program checks or spin lock issues. So looking good.
> >
> > I will run it next a couple of times with 64K base pages.
>
> I have run through the libhugetest suite 3 times each now with both combinations (4k and 64K base page) and have not seen the spin lock problem or any other problems.

Excellent. I'll push the patch.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Cleanup hugepage pagetable allocation for powerpc with 16G pages
There is a small bug in the handling of 16G hugepages recently added to the kernel. This doesn't cause a crash or other user-visible problems, but it does mean that more levels of pagetable are allocated than makes sense for 16G pages. The hugepage pagetables for the 16G pages are allocated much lower in the pagetable tree than they should be, with the intervening levels allocated with full pmd and pud pages which will only ever have one entry filled in.

This patch corrects this problem, at the same time cleaning up the handling of which level 64k versus 16M hugepage pagetables are allocated at. The new way of formatting the tests should be more robust against changes in pagetable structure, or any newly added hugepage sizes.

Signed-off-by: David Gibson [EMAIL PROTECTED]
---
Paul, please apply for 2.6.28

Index: working-2.6/arch/powerpc/mm/hugetlbpage.c
===
--- working-2.6.orig/arch/powerpc/mm/hugetlbpage.c	2008-09-02 11:50:12.0 +1000
+++ working-2.6/arch/powerpc/mm/hugetlbpage.c	2008-09-04 15:36:01.0 +1000
@@ -128,29 +128,37 @@ static int __hugepte_alloc(struct mm_str
 	return 0;
 }
 
-/* Base page size affects how we walk hugetlb page tables */
-#ifdef CONFIG_PPC_64K_PAGES
-#define hpmd_offset(pud, addr, h)	pmd_offset(pud, addr)
-#define hpmd_alloc(mm, pud, addr, h)	pmd_alloc(mm, pud, addr)
-#else
-static inline
-pmd_t *hpmd_offset(pud_t *pud, unsigned long addr, struct hstate *hstate)
+
+static pud_t *hpud_offset(pgd_t *pgd, unsigned long addr, struct hstate *hstate)
+{
+	if (huge_page_shift(hstate) < PUD_SHIFT)
+		return pud_offset(pgd, addr);
+	else
+		return (pud_t *) pgd;
+}
+static pud_t *hpud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long addr,
+			 struct hstate *hstate)
 {
-	if (huge_page_shift(hstate) == PAGE_SHIFT_64K)
+	if (huge_page_shift(hstate) < PUD_SHIFT)
+		return pud_alloc(mm, pgd, addr);
+	else
+		return (pud_t *) pgd;
+}
+static pmd_t *hpmd_offset(pud_t *pud, unsigned long addr, struct hstate *hstate)
+{
+	if (huge_page_shift(hstate) < PMD_SHIFT)
 		return pmd_offset(pud, addr);
 	else
 		return (pmd_t *) pud;
 }
-static inline
-pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr,
-		  struct hstate *hstate)
+static pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr,
+			 struct hstate *hstate)
 {
-	if (huge_page_shift(hstate) == PAGE_SHIFT_64K)
+	if (huge_page_shift(hstate) < PMD_SHIFT)
 		return pmd_alloc(mm, pud, addr);
 	else
 		return (pmd_t *) pud;
 }
-#endif
 
 /* Build list of addresses of gigantic pages.  This function is used in early
  * boot before the buddy or bootmem allocator is setup.
@@ -204,7 +212,7 @@ pte_t *huge_pte_offset(struct mm_struct
 
 	pg = pgd_offset(mm, addr);
 	if (!pgd_none(*pg)) {
-		pu = pud_offset(pg, addr);
+		pu = hpud_offset(pg, addr, hstate);
 		if (!pud_none(*pu)) {
 			pm = hpmd_offset(pu, addr, hstate);
 			if (!pmd_none(*pm))
@@ -233,7 +241,7 @@ pte_t *huge_pte_alloc(struct mm_struct *
 	addr &= hstate->mask;
 
 	pg = pgd_offset(mm, addr);
-	pu = pud_alloc(mm, pg, addr);
+	pu = hpud_alloc(mm, pg, addr, hstate);
 
 	if (pu) {
 		pm = hpmd_alloc(mm, pu, addr, hstate);
@@ -316,13 +324,7 @@ static void hugetlb_free_pud_range(struc
 	pud = pud_offset(pgd, addr);
 	do {
 		next = pud_addr_end(addr, end);
-#ifdef CONFIG_PPC_64K_PAGES
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		hugetlb_free_pmd_range(tlb, pud, addr, next, floor, ceiling,
-				       psize);
-#else
-		if (shift == PAGE_SHIFT_64K) {
+		if (shift < PMD_SHIFT) {
 			if (pud_none_or_clear_bad(pud))
 				continue;
 			hugetlb_free_pmd_range(tlb, pud, addr, next, floor,
@@ -332,7 +334,6 @@ static void hugetlb_free_pud_range(struc
 				continue;
 			free_hugepte_range(tlb, (hugepd_t *)pud, psize);
 		}
-#endif
 	} while (pud++, addr = next, addr != end);
 
 	start &= PGDIR_MASK;
@@ -422,9 +423,15 @@ void hugetlb_free_pgd_range(struct mmu_g
 		psize = get_slice_psize(tlb->mm, addr);
 		BUG_ON(!mmu_huge_psizes[psize]);
 		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling);
+		if (mmu_psize_to_shift(psize) < PUD_SHIFT) {
+			if
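As a rough illustration of the shift comparisons in the patch above, the standalone sketch below models which page-table level ends up holding the hugepte table for a given huge page size. The two shift constants are assumptions chosen for illustration (loosely modelled on a 4k base page layout), and the `demo_` names are hypothetical rather than kernel identifiers.

```c
/* Illustrative shift values only; the real ones depend on the kernel config. */
#define DEMO_PMD_SHIFT 21 /* assumed: one pmd entry maps 2M   */
#define DEMO_PUD_SHIFT 28 /* assumed: one pud entry maps 256M */

/* Level whose entry holds the pointer to the hugepte table. */
enum demo_level { DEMO_AT_PGD, DEMO_AT_PUD, DEMO_AT_PMD };

/*
 * Same idea as the hpud_offset()/hpmd_offset() tests in the patch:
 * only descend a level when the huge page is smaller than the range
 * that a single entry at that level maps.
 */
static enum demo_level demo_hugepd_level(unsigned int huge_shift)
{
	if (huge_shift >= DEMO_PUD_SHIFT)
		return DEMO_AT_PGD;	/* hpud_offset() hands back the pgd */
	if (huge_shift >= DEMO_PMD_SHIFT)
		return DEMO_AT_PUD;	/* hpmd_offset() hands back the pud */
	return DEMO_AT_PMD;		/* walk all the way down to the pmd */
}
```

With these assumed values, a 64k page (shift 16) lands at the pmd, a 16M page (shift 24) at the pud, and a 16G page (shift 34) at the pgd, which is the behaviour the commit message describes. Because the test is a comparison against the level shifts rather than an equality test on one page size, a newly added hugepage size would fall into the right bucket without further changes.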
Re: [PATCH 3/3] ibm_newemac: MAL support for PowerPC 405EZ
On Thu, Sep 04, 2008 at 11:02:16AM -0400, Josh Boyer wrote:
> The PowerPC 405EZ SoC has some differences in the interrupt layout and handling for the MAL. The SERR, TXDE, and RXDE interrupts are OR'd into a single interrupt. Also, due to the possibility for interrupt coalescing, the TXEOB and RXEOB interrupts require an interrupt bit to be cleared in the ICINTSTAT SDR.
>
> This sets the proper MAL feature bits for 405EZ boards, and adds a common shared handler for SERR, TXDE, and RXDE. This has been adapted from code originally written by Stefan Roese.
>
> Signed-off-by: Josh Boyer [EMAIL PROTECTED]
> ---
>  drivers/net/ibm_newemac/mal.c |   98
>  1 files changed, 78 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/net/ibm_newemac/mal.c b/drivers/net/ibm_newemac/mal.c
> index 10c267b..3cef534 100644
> --- a/drivers/net/ibm_newemac/mal.c
> +++ b/drivers/net/ibm_newemac/mal.c
> @@ -28,6 +28,7 @@
>  #include <linux/delay.h>
>  
>  #include "core.h"
> +#include <asm/dcr-regs.h>
>  
>  static int mal_count;
>  
> @@ -279,6 +280,9 @@ static irqreturn_t mal_txeob(int irq, void *dev_instance)
>  	mal_schedule_poll(mal);
>  	set_mal_dcrn(mal, MAL_TXEOBISR, r);
>  
> +	if (mal_has_feature(mal, MAL_FTR_CLEAR_ICINTSTAT))
> +		mtdcri(SDR0, 0x4510, (mfdcri(SDR0, 0x4510) | 0x6000));
> +
>  	return IRQ_HANDLED;
>  }
>  
> @@ -293,6 +297,9 @@ static irqreturn_t mal_rxeob(int irq, void *dev_instance)
>  	mal_schedule_poll(mal);
>  	set_mal_dcrn(mal, MAL_RXEOBISR, r);
>  
> +	if (mal_has_feature(mal, MAL_FTR_CLEAR_ICINTSTAT))
> +		mtdcri(SDR0, 0x4510, (mfdcri(SDR0, 0x4510) | 0x8000));
> +
>  	return IRQ_HANDLED;
>  }
>  
> @@ -336,6 +343,25 @@ static irqreturn_t mal_rxde(int irq, void *dev_instance)
>  	return IRQ_HANDLED;
>  }
>  
> +static irqreturn_t mal_int(int irq, void *dev_instance)
> +{
> +	struct mal_instance *mal = dev_instance;
> +	u32 esr = get_mal_dcrn(mal, MAL_ESR);
> +
> +	if (esr & MAL_ESR_EVB) {
> +		/* descriptor error */
> +		if (esr & MAL_ESR_DE) {
> +			if (esr & MAL_ESR_CIDT)
> +				return (mal_rxde(irq, dev_instance));

Return statements shouldn't be enclosed in brackets according to checkpatch.pl. Also in a few other places.

> +			else
> +				return (mal_txde(irq, dev_instance));
> +		} else { /* SERR */
> +			return (mal_serr(irq, dev_instance));
> +		}
> +	}
> +	return IRQ_HANDLED;
> +}
> +
>  void mal_poll_disable(struct mal_instance *mal, struct mal_commac *commac)
>  {
>  	/* Spinlock-type semantics: only one caller disable poll at a time */
> @@ -542,11 +568,22 @@ static int __devinit mal_probe(struct of_device *ofdev,
>  		goto fail;
>  	}
>  
> -	mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
> -	mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
> -	mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
> -	mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
> -	mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
> +	if (of_device_is_compatible(ofdev->node, "ibm,mcmal-405ez"))
> +		mal->features |= (MAL_FTR_CLEAR_ICINTSTAT | MAL_FTR_COMMON_ERR_INT);

The above line is 80 characters wide. But I'm not sure that anyone cares.

> +
> +	if (mal_has_feature(mal, MAL_FTR_COMMON_ERR_INT)) {
> +		mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
> +		mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
> +		mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
> +		mal->txde_irq = mal->rxde_irq = mal->serr_irq;
> +	} else {
> +		mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
> +		mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
> +		mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
> +		mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
> +		mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
> +	}

It seems that the first three calls to irq_of_parse_and_map() could be moved outside of the if/else clause.

	mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
	mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
	mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
	if (mal_has_feature(mal, MAL_FTR_COMMON_ERR_INT)) {
		mal->txde_irq = mal->rxde_irq = mal->serr_irq;
	} else {
		mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
		mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
	}

> +
>  	if (mal->txeob_irq == NO_IRQ || mal->rxeob_irq == NO_IRQ ||
>  	    mal->serr_irq == NO_IRQ || mal->txde_irq == NO_IRQ ||
>  	    mal->rxde_irq == NO_IRQ) {
> @@ -608,21 +645,42 @@ static int __devinit mal_probe(struct of_device *ofdev,
>  		sizeof(struct mal_descriptor) * mal_rx_bd_offset(mal, i));
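The refactoring suggested in this review can be tried outside the kernel with a small model. In the sketch below, `demo_map_irq()` is a hypothetical stand-in for irq_of_parse_and_map() that just returns a fake number per index, and the struct, flag value and function names are likewise made up for illustration.

```c
/* Hypothetical feature bit; the value is an assumption for this sketch. */
#define DEMO_FTR_COMMON_ERR_INT 0x2u

struct demo_mal {
	unsigned int features;
	int txeob_irq, rxeob_irq, serr_irq, txde_irq, rxde_irq;
};

/* Stand-in for irq_of_parse_and_map(): index n maps to fake irq 100 + n. */
static int demo_map_irq(int index)
{
	return 100 + index;
}

static void demo_map_all(struct demo_mal *mal)
{
	/* These three lookups are identical in both branches, so do them once. */
	mal->txeob_irq = demo_map_irq(0);
	mal->rxeob_irq = demo_map_irq(1);
	mal->serr_irq = demo_map_irq(2);

	if (mal->features & DEMO_FTR_COMMON_ERR_INT) {
		/* SERR, TXDE and RXDE are OR'd into a single line. */
		mal->txde_irq = mal->rxde_irq = mal->serr_irq;
	} else {
		mal->txde_irq = demo_map_irq(3);
		mal->rxde_irq = demo_map_irq(4);
	}
}
```

Only the two descriptor-error interrupts differ between the branches, so the hoisted form makes that difference the only thing the conditional expresses.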
2.6.27-rc5-mm1 mystery trace
Booting the putative 2.6.27-rc5-mm1 lineup on the g5 I see:

io scheduler cfq registered
proc_dir_entry '00' already registered
Call Trace:
[c0017a0fbae0] [c0012540] unrecov_restore+0x98d8/0x1220c (unreliable)
[c0017a0fbb90] [c0149798] dst_error+0x117d3c/0x42ba04
[c0017a0fbc40] [c0149c14] dst_error+0x1181b8/0x42ba04
[c0017a0fbcc0] [c01f9260] dst_error+0x1c7804/0x42ba04
[c0017a0fbd60] [c05ccc8c] _sinittext+0x25c8c/0x371d4
[c0017a0fbde0] [c0009230] unrecov_restore+0x5c8/0x1220c
[c0017a0fbee0] [c05a7d90] _sinittext+0xd90/0x371d4
[c0017a0fbf90] [c0026db4] fpdisable+0xa6f4/0xa884
nvidiafb: Device ID: 10de0141
nvidiafb: CRTC0 analog not found
nvidiafb: CRTC1 analog found

from which I conclude

a) someone broke powerpc's kallsyms processing and
b) someone screwed up their procfs handling.

The output of `nm -n vmlinux' is at http://userweb.kernel.org/~akpm/nm-n.txt. You'll see that c0012540 is actually somewhere inside proc_register().

This is stupid:

g5:/usr/src/25> gdb vmlinux
GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ppc-yellowdog-linux-gnu"..."/usr/src/25/vmlinux": not in executable format: File format not recognized

probably there's some command line magic to make it work, but there shouldn't be.
Re: [PATCH 2/3] ibm_newemac: Introduce mal_has_feature
On Thu, 2008-09-04 at 11:02 -0400, Josh Boyer wrote:
> There are some PowerPC SoCs that do odd things with the MAL handling. In order to accommodate them, we need to introduce a feature mechanism that is similar to the existing emac_has_feature function.
>
> This adds a feature variable to the mal_instance structure, and adds a mal_has_feature function with some feature definitions. These are guarded by Kconfig options that are selected by the affected platforms.
>
> Signed-off-by: Josh Boyer [EMAIL PROTECTED]

You also add an actual feature (CLR_ICINSTAT). You should document that or move it to a separate patch.

> +/* Features of various MAL implementations */
> +
> +/* Dummy feature bit so the enum works properly */
> +#define MAL_FTR_DUMMY	0x0001

Nah. Just stick an | 0 in the enum to make it happy.

Cheers,
Ben.
Re: [PATCH 2/3] ibm_newemac: Introduce mal_has_feature
On Fri, 05 Sep 2008 13:19:23 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> On Thu, 2008-09-04 at 11:02 -0400, Josh Boyer wrote:
> > There are some PowerPC SoCs that do odd things with the MAL handling. In order to accommodate them, we need to introduce a feature mechanism that is similar to the existing emac_has_feature function.
> >
> > This adds a feature variable to the mal_instance structure, and adds a mal_has_feature function with some feature definitions. These are guarded by Kconfig options that are selected by the affected platforms.
> >
> > Signed-off-by: Josh Boyer [EMAIL PROTECTED]
>
> You also add an actual feature (CLR_ICINSTAT). You should document that or move it to a separate patch.

I add two features. And I say so in the commit log, though I didn't actually say what those do. I can fix that up.

> > +/* Features of various MAL implementations */
> > +
> > +/* Dummy feature bit so the enum works properly */
> > +#define MAL_FTR_DUMMY	0x0001
>
> Nah. Just stick an | 0 in the enum to make it happy.

OK. I did that originally, but for some reason I wanted to avoid having FTRS_ALWAYS and FTRS_POSSIBLE be equal if none of the Kconfig options were set. I can't remember why though.

josh
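For readers unfamiliar with the pattern being discussed, here is a minimal standalone sketch of a compile-time feature mask terminated with `| 0`, in the style of emac_has_feature. The bit values, the `CONFIG_DEMO_*` option and every `demo_`/`DEMO_` name are assumptions made up for this example; they are not taken from the patch.

```c
/* Hypothetical feature bits. */
#define DEMO_MAL_FTR_CLEAR_ICINTSTAT 0x00000001u
#define DEMO_MAL_FTR_COMMON_ERR_INT  0x00000002u

/* Pretend exactly one Kconfig option was selected for this build. */
#define CONFIG_DEMO_MAL_CLEAR_ICINTSTAT 1

enum {
	DEMO_MAL_FTRS_ALWAYS = 0,

	/*
	 * Each enabled option contributes "BIT |"; the trailing 0 closes
	 * the expression even when no option is enabled, so no dummy
	 * feature bit is needed.
	 */
	DEMO_MAL_FTRS_POSSIBLE =
#ifdef CONFIG_DEMO_MAL_CLEAR_ICINTSTAT
		DEMO_MAL_FTR_CLEAR_ICINTSTAT |
#endif
#ifdef CONFIG_DEMO_MAL_COMMON_ERR_INT
		DEMO_MAL_FTR_COMMON_ERR_INT |
#endif
		0,
};

struct demo_mal_instance {
	unsigned int features;
};

/* True if the feature is always on, or compiled in and set at runtime. */
static inline int demo_mal_has_feature(const struct demo_mal_instance *mal,
				       unsigned long feature)
{
	return (DEMO_MAL_FTRS_ALWAYS & feature) ||
	       (DEMO_MAL_FTRS_POSSIBLE & mal->features & feature);
}
```

A feature whose option is not compiled in is masked out by `DEMO_MAL_FTRS_POSSIBLE`, so the check is a compile-time constant and the compiler can drop the guarded code. With no options set, both enum values are 0, which is the situation Josh mentions wanting to avoid.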
Re: [PATCH 3/3] ibm_newemac: MAL support for PowerPC 405EZ
On Fri, 5 Sep 2008 12:10:37 +1000 Simon Horman [EMAIL PROTECTED] wrote:

> > +static irqreturn_t mal_int(int irq, void *dev_instance)
> > +{
> > +	struct mal_instance *mal = dev_instance;
> > +	u32 esr = get_mal_dcrn(mal, MAL_ESR);
> > +
> > +	if (esr & MAL_ESR_EVB) {
> > +		/* descriptor error */
> > +		if (esr & MAL_ESR_DE) {
> > +			if (esr & MAL_ESR_CIDT)
> > +				return (mal_rxde(irq, dev_instance));
>
> Return statements shouldn't be enclosed in brackets according to checkpatch.pl. Also in a few other places.

I hate checkpatch, but that's easy enough to fix. Though I don't see what other places I add with that mistake.

> > +			else
> > +				return (mal_txde(irq, dev_instance));
> > +		} else { /* SERR */
> > +			return (mal_serr(irq, dev_instance));
> > +		}
> > +	}
> > +	return IRQ_HANDLED;
> > +}
> > +
> >  void mal_poll_disable(struct mal_instance *mal, struct mal_commac *commac)
> >  {
> >  	/* Spinlock-type semantics: only one caller disable poll at a time */
> > @@ -542,11 +568,22 @@ static int __devinit mal_probe(struct of_device *ofdev,
> >  		goto fail;
> >  	}
> >  
> > -	mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
> > -	mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
> > -	mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
> > -	mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
> > -	mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
> > +	if (of_device_is_compatible(ofdev->node, "ibm,mcmal-405ez"))
> > +		mal->features |= (MAL_FTR_CLEAR_ICINTSTAT | MAL_FTR_COMMON_ERR_INT);
>
> The above line is 80 characters wide. But I'm not sure that anyone cares.

I don't. If Ben complains I'll change it.

> > +
> > +	if (mal_has_feature(mal, MAL_FTR_COMMON_ERR_INT)) {
> > +		mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
> > +		mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
> > +		mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
> > +		mal->txde_irq = mal->rxde_irq = mal->serr_irq;
> > +	} else {
> > +		mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
> > +		mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
> > +		mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
> > +		mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
> > +		mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
> > +	}
>
> It seems that the first three calls to irq_of_parse_and_map() could be moved outside of the if/else clause.
>
> 	mal->txeob_irq = irq_of_parse_and_map(ofdev->node, 0);
> 	mal->rxeob_irq = irq_of_parse_and_map(ofdev->node, 1);
> 	mal->serr_irq = irq_of_parse_and_map(ofdev->node, 2);
> 	if (mal_has_feature(mal, MAL_FTR_COMMON_ERR_INT)) {
> 		mal->txde_irq = mal->rxde_irq = mal->serr_irq;
> 	} else {
> 		mal->txde_irq = irq_of_parse_and_map(ofdev->node, 3);
> 		mal->rxde_irq = irq_of_parse_and_map(ofdev->node, 4);
> 	}

Indeed they could. Good catch.

> > +
> >  	if (mal->txeob_irq == NO_IRQ || mal->rxeob_irq == NO_IRQ ||
> >  	    mal->serr_irq == NO_IRQ || mal->txde_irq == NO_IRQ ||
> >  	    mal->rxde_irq == NO_IRQ) {
> > @@ -608,21 +645,42 @@ static int __devinit mal_probe(struct of_device *ofdev,
> >  		sizeof(struct mal_descriptor) * mal_rx_bd_offset(mal, i));
> >  
> > -	err = request_irq(mal->serr_irq, mal_serr, 0, "MAL SERR", mal);
> > -	if (err)
> > -		goto fail2;
> > -	err = request_irq(mal->txde_irq, mal_txde, 0, "MAL TX DE", mal);
> > -	if (err)
> > -		goto fail3;
> > -	err = request_irq(mal->txeob_irq, mal_txeob, 0, "MAL TX EOB", mal);
> > -	if (err)
> > -		goto fail4;
> > -	err = request_irq(mal->rxde_irq, mal_rxde, 0, "MAL RX DE", mal);
> > -	if (err)
> > -		goto fail5;
> > -	err = request_irq(mal->rxeob_irq, mal_rxeob, 0, "MAL RX EOB", mal);
> > -	if (err)
> > -		goto fail6;
> > +	if (mal_has_feature(mal, MAL_FTR_COMMON_ERR_INT)) {
> > +		err = request_irq(mal->serr_irq, mal_int, IRQF_SHARED,
> > +				  "MAL SERR", mal);
> > +		if (err)
> > +			goto fail2;
> > +		err = request_irq(mal->txde_irq, mal_int, IRQF_SHARED,
> > +				  "MAL TX DE", mal);
> > +		if (err)
> > +			goto fail3;
> > +		err = request_irq(mal->txeob_irq, mal_txeob, 0, "MAL TX EOB", mal);
> > +		if (err)
> > +			goto fail4;
> > +		err = request_irq(mal->rxde_irq, mal_int, IRQF_SHARED,
> > +				  "MAL RX DE", mal);
> > +		if (err)
> > +			goto fail5;
> > +		err = request_irq(mal->rxeob_irq, mal_rxeob, 0, "MAL RX EOB", mal);
> > +		if (err)
> > +			goto fail6;
> > +	} else {
> > +		err = request_irq(mal->serr_irq, mal_serr, 0, "MAL SERR", mal);
> > +		if (err)
> > +			goto fail2;
> > +		err =
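The shared handler mal_int() quoted in this thread decides which of the three error paths to take from the bits of the MAL error status register. The decoding can be sketched on its own as below. The bit values are assumptions for illustration (the authoritative definitions live in the driver's mal.h), and the `demo_`/`DEMO_` names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define DEMO_ESR_EVB  0x80000000u /* error valid */
#define DEMO_ESR_CIDT 0x40000000u /* channel type; set takes the RX path */
#define DEMO_ESR_DE   0x00100000u /* descriptor error */

enum demo_err { DEMO_NONE, DEMO_SERR, DEMO_TXDE, DEMO_RXDE };

/* Mirrors the nesting of the if/else chain in mal_int(). */
static enum demo_err demo_classify(uint32_t esr)
{
	if (!(esr & DEMO_ESR_EVB))
		return DEMO_NONE;	/* nothing latched; just report handled */
	if (!(esr & DEMO_ESR_DE))
		return DEMO_SERR;	/* valid error that is not a descriptor error */
	return (esr & DEMO_ESR_CIDT) ? DEMO_RXDE : DEMO_TXDE;
}
```

Because SERR, TXDE and RXDE arrive on one line when the common-error feature is set, the status register is the only way to tell them apart, which is why the same handler is registered with IRQF_SHARED for all three.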
Re: 2.6.27-rc5-mm1 mystery trace
> This is stupid:
>
> g5:/usr/src/25> gdb vmlinux
> GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "ppc-yellowdog-linux-gnu"..."/usr/src/25/vmlinux": not in executable format: File format not recognized
>
> probably there's some command line magic to make it work, but there shouldn't be.

Is this a 64 bits gdb ?

Ben.
Re: 2.6.27-rc5-mm1 mystery trace
On Fri, 05 Sep 2008 13:42:44 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> > This is stupid:
> >
> > g5:/usr/src/25> gdb vmlinux
> > GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > This GDB was configured as "ppc-yellowdog-linux-gnu"..."/usr/src/25/vmlinux": not in executable format: File format not recognized
> >
> > probably there's some command line magic to make it work, but there shouldn't be.
>
> Is this a 64 bits gdb ?

I don't think so - it's whatever ydl4.1 gave me. A 64-bit binary on a 64-bit machine Should Just Work. Maybe that's me being simplistic.
Re: 2.6.27-rc5-mm1 mystery trace
> a) someone broke powerpc's kallsyms processing and
> b) someone screwed up their procfs handling.

I think the proc bug is old, might have to do with /proc/bus/pci... I have that on my G5 too with 2.6.26, I think I just forgot about it..

Care to give that untested patch a try ?

powerpc: Always enable pci domains in /proc for 64 bits powerpc

This patch always enables the use of the pci domain number in /proc on 64 bits ppc machines instead of the old pseries specific hack of testing for the buid, which is meaningless on most machines. We also enable the compat code that makes domain 0 use the old format for the sake of the X server.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

Index: linux-work/arch/powerpc/kernel/pci-common.c
===
--- linux-work.orig/arch/powerpc/kernel/pci-common.c	2008-09-05 13:46:09.0 +1000
+++ linux-work/arch/powerpc/kernel/pci-common.c	2008-09-05 13:49:46.0 +1000
@@ -53,8 +53,9 @@ static int global_phb_number;		/* Global
 /* ISA Memory physical address */
 resource_size_t isa_mem_base;
 
-/* Default PCI flags is 0 */
-unsigned int ppc_pci_flags;
+/* Default PCI flags is 0 on ppc32, modified at boot on ppc64 */
+unsigned int ppc_pci_flags = 0;
+
 
 struct pci_controller *pcibios_alloc_controller(struct device_node *dev)
 {
@@ -669,15 +670,12 @@ void __devinit pci_process_bridge_OF_ran
 int pci_proc_domain(struct pci_bus *bus)
 {
 	struct pci_controller *hose = pci_bus_to_host(bus);
-#ifdef CONFIG_PPC64
-	return hose->buid != 0;
-#else
+
 	if (!(ppc_pci_flags & PPC_PCI_ENABLE_PROC_DOMAINS))
 		return 0;
 	if (ppc_pci_flags & PPC_PCI_COMPAT_DOMAIN_0)
 		return hose->global_number != 0;
 	return 1;
-#endif
 }
 
 void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region,
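The flag logic that the patch above leaves in pci_proc_domain() can be exercised in isolation. The sketch below is a userspace model under stated assumptions: the two flag values and the `demo_` names are hypothetical, and only the decision structure follows the patch.

```c
/* Hypothetical flag values; the real PPC_PCI_* constants may differ. */
#define DEMO_PCI_ENABLE_PROC_DOMAINS 0x1u
#define DEMO_PCI_COMPAT_DOMAIN_0     0x2u

struct demo_hose {
	int global_number;	/* PHB number, 0 for the first host bridge */
};

/* Non-zero means the /proc path should carry the domain number. */
static int demo_pci_proc_domain(unsigned int flags,
				const struct demo_hose *hose)
{
	if (!(flags & DEMO_PCI_ENABLE_PROC_DOMAINS))
		return 0;			/* domains switched off entirely */
	if (flags & DEMO_PCI_COMPAT_DOMAIN_0)
		return hose->global_number != 0; /* domain 0 keeps old format */
	return 1;
}
```

With both flags set, only host bridges other than the first get the domain prefix, which is the compatibility behaviour the commit message describes for the X server.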
Re: 2.6.27-rc5-mm1 mystery trace
On Fri, 05 Sep 2008 13:54:22 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> > a) someone broke powerpc's kallsyms processing and
> > b) someone screwed up their procfs handling.
>
> I think the proc bug is old, might have to do with /proc/bus/pci... I have that on my G5 too with 2.6.26, I think I just forgot about it..
>
> Care to give that untested patch a try ?

uh, soon, hopefully. The wife pinched the monitor.

I'd be more worried about the kallsyms non-workage?
Re: 2.6.27-rc5-mm1 mystery trace
On Thu, 2008-09-04 at 20:49 -0700, Andrew Morton wrote:
> On Fri, 05 Sep 2008 13:42:44 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
>
> > > This is stupid:
> > >
> > > g5:/usr/src/25> gdb vmlinux
> > > GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > > This GDB was configured as "ppc-yellowdog-linux-gnu"..."/usr/src/25/vmlinux": not in executable format: File format not recognized
> > >
> > > probably there's some command line magic to make it work, but there shouldn't be.
> >
> > Is this a 64 bits gdb ?
>
> I don't think so - it's whatever ydl4.1 gave me. A 64-bit binary on a 64-bit machine Should Just Work. Maybe that's me being simplistic.

But is it a 64 bits binary ?

Ben.
Re: 2.6.27-rc5-mm1 mystery trace
On Thu, 2008-09-04 at 21:04 -0700, Andrew Morton wrote:
> On Fri, 05 Sep 2008 13:54:22 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
>
> > > a) someone broke powerpc's kallsyms processing and
> > > b) someone screwed up their procfs handling.
> >
> > I think the proc bug is old, might have to do with /proc/bus/pci... I have that on my G5 too with 2.6.26, I think I just forgot about it..
> >
> > Care to give that untested patch a try ?
>
> uh, soon, hopefully. The wife pinched the monitor.
>
> I'd be more worried about the kallsyms non-workage?

yes, that's the harder part :-) somebody will have to dig. I don't have the cycles today but next, if nobody raised a hand, I can have a look.

Ben.
Re: 2.6.27-rc5-mm1 mystery trace
On Fri, 05 Sep 2008 14:36:41 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

> On Thu, 2008-09-04 at 20:49 -0700, Andrew Morton wrote:
> > On Fri, 05 Sep 2008 13:42:44 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
> >
> > > > This is stupid:
> > > >
> > > > g5:/usr/src/25> gdb vmlinux
> > > > GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
> > > > Copyright 2004 Free Software Foundation, Inc.
> > > > GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
> > > > Type "show copying" to see the conditions.
> > > > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > > > This GDB was configured as "ppc-yellowdog-linux-gnu"..."/usr/src/25/vmlinux": not in executable format: File format not recognized
> > > >
> > > > probably there's some command line magic to make it work, but there shouldn't be.
> > >
> > > Is this a 64 bits gdb ?
> >
> > I don't think so - it's whatever ydl4.1 gave me. A 64-bit binary on a 64-bit machine Should Just Work. Maybe that's me being simplistic.
>
> But is it a 64 bits binary ?

vmlinux? Yup. According to file(1).