Re: [Linux-fbdev-devel] [PATCH 1/2] fb: add support for foreign endianness
On Sun, 17 Feb 2008 10:44:32 +0100 (CET) Geert Uytterhoeven [EMAIL PROTECTED] wrote: On Fri, 15 Feb 2008, Anton Vorontsov wrote: On Thu, Feb 14, 2008 at 10:49:42PM -0800, Andrew Morton wrote: On Tue, 5 Feb 2008 18:44:32 +0300 Anton Vorontsov [EMAIL PROTECTED] wrote: Actually... should CONFIG_FB_FOREIGN_ENDIAN exist, or should this feature be permanently enabled? (...) The notion of `FOREIGN_ENDIAN' is relative, as it depends on the architecture you're compiling for. Suppose you have a PCI graphics card with a frame buffer that's always big endian. When compiling for a big endian platform, the driver won't depend on FB_FOREIGN_ENDIAN. When compiling for a little endian platform, it will. Shouldn't we add LITTLE_ENDIAN and BIG_ENDIAN Kconfig vars first, just like we have 64BIT? I disagree here. The FOREIGN_ENDIAN is enough. It is determined only by graphics chip endianess and CPU (arch) endianess. I know two fb drivers which use endianess information (pm2fb and s3c2410fb). Both resolve endianess at driver level. Actually, both handle it by setting special bits so the graphics chip itself reorder bytes to transform foreign endianess. I understand that this patch is for chips which cannot reorder bytes by themselves. So the FOREIGN_ENDIANESS flag should be set by the driver if it is needed (if the graphics chip is BE and CPU is LE a simple #ifdef will add the flag). I'd like to handle this in Kconfig (cfr. above). Again, it is possible. It is enough to put one rule which enables the FOREIGN_ENDIAN if the architecture endianess is foreign for the driver. The advantage here is that it can be set only for drivers which need it (as some driver can handle it without this code). It should be hidden option set only internally if needed (no user selectable). I tested this patch on the s3c2410fb with disabled byte order corrections by the graphics chip itself. It worked for 8-bit depth but not for 16-bit depth (pixel position seemed ok but wrong tux' colors). I will investigate. The s3c2410fb is BE and the kernel was arm LE. I would like to extend this patch to fb depths below 8-bit. The s3c2410fb cannot handle this correctly with LE kernel. Kind regards, Krzysztof -- Masz ostatnia szanse ! Sprawdz http://link.interia.pl/f1d02 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
make ARCH=ppc defconfig fails looking for common_defconfig
(AIUI, the entire ppc/ architecture is going away, yes? which means i probably shouldn't care about any errors within. is that correct? even so, build errors should probably still be avoided for now.) $ make ARCH=ppc defconfig HOSTCC scripts/basic/fixdep HOSTCC scripts/basic/docproc HOSTCC scripts/kconfig/conf.o HOSTCC scripts/kconfig/kxgettext.o SHIPPED scripts/kconfig/zconf.tab.c SHIPPED scripts/kconfig/lex.zconf.c SHIPPED scripts/kconfig/zconf.hash.c HOSTCC scripts/kconfig/zconf.tab.o HOSTLD scripts/kconfig/conf *** Default configuration is based on 'common_defconfig' *** *** Can't find default configuration arch/ppc/configs/common_defconfig! *** make[1]: *** [defconfig] Error 1 make: *** [defconfig] Error 2 rday -- Robert P. J. Day Linux Consulting, Training and Annoying Kernel Pedantry: Have classroom, will lecture. http://crashcourse.ca Waterloo, Ontario, CANADA ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] drivers/base: export gpl (un)register_memory_notifier
switching to proper mail client... Dave Hansen [EMAIL PROTECTED] wrote on 15.02.2008 17:55:38: I've been thinking about that, and I don't think you really *need* to keep a comprehensive map like that. When the memory is in a particular configuration (range of memory present along with unique set of holes) you get a unique ehea_bmap configuration. That layout is completely predictable. So, if at any time you want to figure out what the ehea_bmap address for a particular *Linux* virtual address is, you just need to pretend that you're creating the entire ehea_bmap, use the same algorithm and figure out host you would have placed things, and use that result. Now, that's going to be a slow, crappy linear search (but maybe not as slow as recreating the silly thing). So, you might eventually run into some scalability problems with a lot of packets going around. But, I'd be curious if you do in practice. Up to 14 addresses translation per packet (sg_list) might be required on the transmit side. On receive side it is only 1. Most packets require only very few translations (1 or sometimes more) translations. However, with more then 700.000 packets per second this approach does not seem reasonable from performance perspective when memory is fragmented as you described. The other idea is that you create a mapping that is precisely 1:1 with kernel memory. Let's say you have two sections present, 0 and 100. You have a high_section_index of 100, and you vmalloc() a 100 entry array. You need to create a *CONTIGUOUS* ehea map? Create one like this: EHEA_VADDR-Linux Section 0-0 1-0 2-0 3-0 ... 100-100 It's contiguous. Each area points to a valid Linux memory address. It's also discernable in O(1) to what EHEA address a given Linux address is mapped. You just have a couple of duplicate entries. This has a serious issues with constraint I mentions in the previous mail: - MRs can have a maximum size of the memory available under linux The requirement is not met that the memory region must not be larger then the available memory for that partition. The create MR H_CALL will fails (we tried this and discussed with FW development) Regards, Jan-Bernd Christoph ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] drivers/base: export gpl (un)register_memory_notifier
Dave Hansen [EMAIL PROTECTED] wrote on 15.02.2008 17:55:38: I've been thinking about that, and I don't think you really *need* to keep a comprehensive map like that. When the memory is in a particular configuration (range of memory present along with unique set of holes) you get a unique ehea_bmap configuration. That layout is completely predictable. So, if at any time you want to figure out what the ehea_bmap address for a particular *Linux* virtual address is, you just need to pretend that you're creating the entire ehea_bmap, use the same algorithm and figure out host you would have placed things, and use that result. Now, that's going to be a slow, crappy linear search (but maybe not as slow as recreating the silly thing). So, you might eventually run into some scalability problems with a lot of packets going around. But, I'd be curious if you do in practice. Up to 14 addresses translation per packet (sg_list) might be required on the transmit side. On receive side it is only 1. Most packets require only very few translations (1 or sometimes more) translations. However, with more then 700.000 packets per second this approach does not seem reasonable from performance perspective when memory is fragmented as you described. The other idea is that you create a mapping that is precisely 1:1 with kernel memory. Let's say you have two sections present, 0 and 100. You have a high_section_index of 100, and you vmalloc() a 100 entry array. You need to create a *CONTIGUOUS* ehea map? Create one like this: EHEA_VADDR-Linux Section 0-0 1-0 2-0 3-0 ... 100-100 It's contiguous. Each area points to a valid Linux memory address. It's also discernable in O(1) to what EHEA address a given Linux address is mapped. You just have a couple of duplicate entries. This has a serious issues with constraint I mentions in the previous mail: - MRs can have a maximum size of the memory available under linux The requirement is not met that the memory region must not be larger then the available memory for that partition. The create MR H_CALL will fails (we tried this and discussed with FW development) Regards, Jan-Bernd Christoph___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
mm section mismatch warnings for ps3
When building current mainline for ps3_defconfig, I get the following mm-related section mismatches: | WARNING: mm/built-in.o(.text+0x25094): Section mismatch in reference from the function .sparse_add_one_section() to the function .meminit.text:.sparse_index_init() | The function .sparse_add_one_section() references | the function __meminit .sparse_index_init(). | This is often because .sparse_add_one_section lacks a __meminit | annotation or the annotation of .sparse_index_init is wrong. | | WARNING: mm/built-in.o(.text+0x25128): Section mismatch in reference from the function .sparse_add_one_section() to the function .meminit.text:.sparse_init_one_section() | The function .sparse_add_one_section() references | the function __meminit .sparse_init_one_section(). | This is often because .sparse_add_one_section lacks a __meminit | annotation or the annotation of .sparse_init_one_section is wrong. | | WARNING: mm/built-in.o(.text+0x2c5fc): Section mismatch in reference from the function .__add_zone() to the function .meminit.text:.init_currently_empty_zone() | The function .__add_zone() references | the function __meminit .init_currently_empty_zone(). | This is often because .__add_zone lacks a __meminit | annotation or the annotation of .init_currently_empty_zone is wrong. | | WARNING: mm/built-in.o(.text+0x2c628): Section mismatch in reference from the function .__add_zone() to the function .meminit.text:.memmap_init_zone() | The function .__add_zone() references | the function __meminit .memmap_init_zone(). | This is often because .__add_zone lacks a __meminit | annotation or the annotation of .memmap_init_zone is wrong. | | WARNING: vmlinux.o(.text+0x7360): Section mismatch in reference from the function .start_secondary_prolog() to the function .devinit.text:.start_secondary() | The function .start_secondary_prolog() references | the function __devinit .start_secondary(). | This is often because .start_secondary_prolog lacks a __devinit | annotation or the annotation of .start_secondary is wrong. | | WARNING: vmlinux.o(.text+0x1dea0): Section mismatch in reference from the function .move_device_tree() to the function .init.text:.lmb_alloc_base() | The function .move_device_tree() references | the function __init .lmb_alloc_base(). | This is often because .move_device_tree lacks a __init | annotation or the annotation of .lmb_alloc_base is wrong. | | WARNING: vmlinux.o(.text+0xabd7c): Section mismatch in reference from the function .sparse_add_one_section() to the function .meminit.text:.sparse_index_init() | The function .sparse_add_one_section() references | the function __meminit .sparse_index_init(). | This is often because .sparse_add_one_section lacks a __meminit | annotation or the annotation of .sparse_index_init is wrong. | | WARNING: vmlinux.o(.text+0xabe10): Section mismatch in reference from the function .sparse_add_one_section() to the function .meminit.text:.sparse_init_one_section() | The function .sparse_add_one_section() references | the function __meminit .sparse_init_one_section(). | This is often because .sparse_add_one_section lacks a __meminit | annotation or the annotation of .sparse_init_one_section is wrong. | | WARNING: vmlinux.o(.text+0xb3198): Section mismatch in reference from the function .add_memory() to the function .devinit.text:.arch_add_memory() | The function .add_memory() references | the function __devinit .arch_add_memory(). | This is often because .add_memory lacks a __devinit | annotation or the annotation of .arch_add_memory is wrong. | | WARNING: vmlinux.o(.text+0xb32e4): Section mismatch in reference from the function .__add_zone() to the function .meminit.text:.init_currently_empty_zone() | The function .__add_zone() references | the function __meminit .init_currently_empty_zone(). | This is often because .__add_zone lacks a __meminit | annotation or the annotation of .init_currently_empty_zone is wrong. | | WARNING: vmlinux.o(.text+0xb3310): Section mismatch in reference from the function .__add_zone() to the function .meminit.text:.memmap_init_zone() | The function .__add_zone() references | the function __meminit .memmap_init_zone(). | This is often because .__add_zone lacks a __meminit | annotation or the annotation of .memmap_init_zone is wrong. (there are a few more regarding ps3_register_repository_device(), but these cannot happen as only storage devices can be added later) With kind regards, Geert Uytterhoeven Software Architect Sony Network and Software Technology Center Europe The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium Phone:+32 (0)2 700 8453 Fax: +32 (0)2 700 8622 E-mail: [EMAIL PROTECTED] Internet: http://www.sony-europe.com/ Sony Network and Software Technology Center Europe A division of Sony Service Centre (Europe) N.V. Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium VAT BE 0413.825.160 · RPR Brussels Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN
How to describe FPGA-based devices in the device tree ?
Hi everybody, I'm (at last) trying to move from ARCH=ppc to ARCH=powerpc. After reading Documentation/powerpc/booting-without-of.txt and various device trees in arch/powerpc/boot/dts, I still don't know how to express some devices in the device tree. The target board has several devices on the processor local bus, as described in the following device tree fragment. [EMAIL PROTECTED] { compatible = fsl,mpc8260-localbus, fsl,pq2-localbus; #address-cells = 2; #size-cells = 1; reg = f0010100 60; ranges = 0 0 4000 0100 2 0 f200 0010 3 0 f300 0010 4 0 f400 0010; [EMAIL PROTECTED],0 { compatible = cfi-flash; reg = 0 0 0100; bank-width = 2; }; [EMAIL PROTECTED],0 { compatible = mtd-ram; reg = 2 0 0010; bank-width = 2; }; [EMAIL PROTECTED],0 { device_type = board-control; reg = 3 0 0020; }; [EMAIL PROTECTED],0 { reg = 4 0 0001; }; }; The fourth device is a FPGA that contains several IP cores such as an interrupt controller and a SD/MMC host controller. If I understand things correctly, each IP core should have its own node in the device tree to allow proper binding with device drivers. As booting-without-of.txt describes the localbus node ranges as corresponding to a single chipselect and covering the entire chipselect access window, I can't have nodes for each IP core as children of the localbus node. Should I put IP core nodes as children of the FPGA node ? If so, how do I map addresses at the FPGA level ? A ranges property in the FPGA node would let me map addresses in the FPGA scope to the localbus scope. However, as the localbus scope use the chipselect number as its first address cell and 0 as its second address cell, I don't see how I could translate offsets in the FPGA into an address at the localbus scope. Could anyone advice me regarding how to properly describe my hardware in the device tree ? Best regards, -- Laurent Pinchart CSE Semaphore Belgium Chaussée de Bruxelles, 732A B-1410 Waterloo Belgium T +32 (2) 387 42 59 F +32 (2) 387 42 75 pgpFjgc9ybEq2.pgp Description: PGP signature ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: make ARCH=ppc defconfig fails looking for common_defconfig
On Mon, 18 Feb 2008 04:32:24 -0500 (EST) Robert P. J. Day [EMAIL PROTECTED] wrote: (AIUI, the entire ppc/ architecture is going away, yes? which means i probably shouldn't care about any errors within. is that correct? even so, build errors should probably still be avoided for now.) Yes, it's going away. There's really no longer support for anything common in it now anyway. CHRP, POWER3 and POWER4 have been removed. So have 83xx and 85xx. Just let it go. josh ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Sun, Feb 17, 2008 at 10:50:03PM +1100, Michael Ellerman wrote: On Sat, 2008-02-16 at 10:39 -0800, Arjan van de Ven wrote: ... for mordern (last 10 years) x86 cpus... the cpu branchpredictor is better than the coder in general. Same for most other architectures. unlikely() creates bigger code as well. Now... we're talking about your super duper hotpath function here right? One where you care about 0.5 cycle speed improvement? (less on modern systems ;) The first patch was to platforms/ps3 code, which runs on the Cell, in particular the PPE ... which is not an x86 :) eg: [EMAIL PROTECTED] ~]$ cat branch.c #include stdio.h int main(void) { int i, j; for (i = 0, j = 0; i 10; i++) if (i % 4 == 0) j++; printf(j = %d\n, j); return 0; } [EMAIL PROTECTED] ~]$ ppu-gcc -Wall -O3 -o branch branch.c [EMAIL PROTECTED] ~]$ time ./branch real0m5.172s [EMAIL PROTECTED] ~]$ cat branch.c .. for (i = 0, j = 0; i 10; i++) if (__builtin_expect(i % 4 == 0, 0)) j++; .. [EMAIL PROTECTED] ~]$ ppu-gcc -Wall -O3 -o branch branch.c [EMAIL PROTECTED] ~]$ time ./branch real0m3.762s Which looks as though unlikely() is helping. Admittedly we don't have a lot of kernel code that looks like that, but at least unlikely is doing what we want it to. This means it generates faster code with a current gcc for your platform. But a future gcc might e.g. replace the whole loop with a division (gcc SVN head (that will soon become gcc 4.3) already does transformations like replacing loops with divisions [1]). And your __builtin_expect() then might have unwanted effects on gcc. Or the kernel code changes much but the likely/unlikely stays unchanged although it becomes wrong. If it is a real hotpath in the kernel where you have _measurable_ performance advantages from using likely/unlikely it's usage might be justified, but otherwise it shouldn't be used. cheers cu Adrian [1] e.g. the while() loop in timespec_add_ns() in include/linux/time.h gets replaced by a division and a modulo (whether this transformation is correct in this specific case is a different question, but that's the level of code transformation gcc already does today) -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Mon, 18 Feb 2008, Adrian Bunk wrote: On Sun, Feb 17, 2008 at 10:50:03PM +1100, Michael Ellerman wrote: On Sat, 2008-02-16 at 10:39 -0800, Arjan van de Ven wrote: ... for mordern (last 10 years) x86 cpus... the cpu branchpredictor is better than the coder in general. Same for most other architectures. unlikely() creates bigger code as well. Now... we're talking about your super duper hotpath function here right? One where you care about 0.5 cycle speed improvement? (less on modern systems ;) The first patch was to platforms/ps3 code, which runs on the Cell, in particular the PPE ... which is not an x86 :) eg: [EMAIL PROTECTED] ~]$ cat branch.c #include stdio.h int main(void) { int i, j; for (i = 0, j = 0; i 10; i++) if (i % 4 == 0) j++; printf(j = %d\n, j); return 0; } [EMAIL PROTECTED] ~]$ ppu-gcc -Wall -O3 -o branch branch.c [EMAIL PROTECTED] ~]$ time ./branch real0m5.172s [EMAIL PROTECTED] ~]$ cat branch.c .. for (i = 0, j = 0; i 10; i++) if (__builtin_expect(i % 4 == 0, 0)) j++; .. [EMAIL PROTECTED] ~]$ ppu-gcc -Wall -O3 -o branch branch.c [EMAIL PROTECTED] ~]$ time ./branch real0m3.762s Which looks as though unlikely() is helping. Admittedly we don't have a lot of kernel code that looks like that, but at least unlikely is doing what we want it to. This means it generates faster code with a current gcc for your platform. But a future gcc might e.g. replace the whole loop with a division (gcc SVN head (that will soon become gcc 4.3) already does transformations like replacing loops with divisions [1]). Hence shouldn't we ask the gcc people what's the purpose of __builtin_expect(), if it doesn't live up to its promise? With kind regards, Geert Uytterhoeven Software Architect Sony Network and Software Technology Center Europe The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium Phone:+32 (0)2 700 8453 Fax: +32 (0)2 700 8622 E-mail: [EMAIL PROTECTED] Internet: http://www.sony-europe.com/ Sony Network and Software Technology Center Europe A division of Sony Service Centre (Europe) N.V. Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium VAT BE 0413.825.160 · RPR Brussels Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN BE39001382358619___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[PATCH] next-20080218 build failure at pmac_ide_macio_attach ()
Hi, The next-20080218 kernel build fails on the powerpc(s) drivers/ide/ppc/pmac.c: In function ‘pmac_ide_macio_attach’: drivers/ide/ppc/pmac.c:1094: error: conversion to non-scalar type requested drivers/ide/ppc/pmac.c: In function ‘pmac_ide_pci_attach’: drivers/ide/ppc/pmac.c:1232: error: conversion to non-scalar type requested make[3]: *** [drivers/ide/ppc/pmac.o] Error 1 make[2]: *** [drivers/ide/ppc] Error 2 make[1]: *** [drivers/ide] Error 2 make: *** [drivers] Error 2 I Have tested this patch for build failure only. Signed-off-by: Kamalesh Babulal [EMAIL PROTECTED] --- --- linux-2.6.25-rc1/drivers/ide/ppc/pmac.c 2008-02-18 18:41:48.0 +0530 +++ linux-2.6.25-rc1/drivers/ide/ppc/~pmac.c2008-02-18 19:20:37.0 +0530 @@ -1091,7 +1091,7 @@ pmac_ide_macio_attach(struct macio_dev * int irq, rc; hw_regs_t hw; - pmif = (struct pmac_ide_hwif)kzalloc(sizeof(*pmif), GFP_KERNEL); + pmif = (struct pmac_ide_hwif*)kzalloc(sizeof(*pmif), GFP_KERNEL); if (pmif == NULL) return -ENOMEM; @@ -1229,7 +1229,7 @@ pmac_ide_pci_attach(struct pci_dev *pdev return -ENODEV; } - pmif = (struct pmac_ide_hwif)kzalloc(sizeof(*pmif), GFP_KERNEL); + pmif = (struct pmac_ide_hwif*)kzalloc(sizeof(*pmif), GFP_KERNEL); if (pmif == NULL) return -ENOMEM; -- Thanks Regards, Kamalesh Babulal, Linux Technology Center, IBM, ISTL. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Mon, Feb 18, 2008 at 03:01:35PM +0100, Geert Uytterhoeven wrote: On Mon, 18 Feb 2008, Adrian Bunk wrote: This means it generates faster code with a current gcc for your platform. But a future gcc might e.g. replace the whole loop with a division (gcc SVN head (that will soon become gcc 4.3) already does transformations like replacing loops with divisions [1]). Hence shouldn't we ask the gcc people what's the purpose of __builtin_expect(), if it doesn't live up to its promise? That's a different issue. My point here is that we do not know how the latest gcc available in the year 2010 might transform this code, and how a likely/unlikely placed there might influence gcc's optimizations then. If this is in hotpath code with a measurable speedup when using likely/unlikely with a current gcc then it's worth it. But otherwise it brings no real advantage today and the longterm effects are not predictable. With kind regards, Geert Uytterhoeven cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] next-20080218 build failure at pmac_ide_macio_attach ()
Kamalesh Babulal [EMAIL PROTECTED] writes: --- linux-2.6.25-rc1/drivers/ide/ppc/pmac.c 2008-02-18 18:41:48.0 +0530 +++ linux-2.6.25-rc1/drivers/ide/ppc/~pmac.c 2008-02-18 19:20:37.0 +0530 @@ -1091,7 +1091,7 @@ pmac_ide_macio_attach(struct macio_dev * int irq, rc; hw_regs_t hw; - pmif = (struct pmac_ide_hwif)kzalloc(sizeof(*pmif), GFP_KERNEL); + pmif = (struct pmac_ide_hwif*)kzalloc(sizeof(*pmif), GFP_KERNEL); Just remove the cast. Andreas. -- Andreas Schwab, SuSE Labs, [EMAIL PROTECTED] SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
linux-next: ide pmac.c fix
Hi Bart, the current linux-next produces this error for a pmac32 powerpc build: drivers/ide/ppc/pmac.c:1094: error: conversion to non-scalar type requested drivers/ide/ppc/pmac.c:1232: error: conversion to non-scalar type requested -- Cheers, Stephen Rothwell[EMAIL PROTECTED] pgp5KFCbeHY8H.pgp Description: PGP signature ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
Geert Uytterhoeven [EMAIL PROTECTED] wrote: Hence shouldn't we ask the gcc people what's the purpose of __builtin_expect(), if it doesn't live up to its promise? __builtin_expect() is useful on FRV where you _have_ to give each branch and conditional branch instruction a measure of probability whether the branch will be taken. David ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
Arjan van de Ven [EMAIL PROTECTED] writes: you have more faith in the authors knowledge of how his code actually behaves than I think is warranted :) iirc there was a mm patch some time ago to keep track of the actual unlikely values at runtime and it showed indeed some wrong ones. But the far majority of them are probably correct. Or faith in that he knows what unlikely means. I should write docs about this; but unlikely() means: 1) It happens less than 0.01% of the cases. 2) The compiler couldn't have figured this out by itself (NULL pointer checks are compiler done already, same for some other conditions) 3) It's a hot codepath where shaving 0.5 cycles (less even on x86) matters (and the author is ok with taking a 500 cycles hit if he's wrong) One more thing unlikely() does is to move the unlikely code out of line. So it should conserve some icache in critical functions, which might well be worth some more cycles (don't have numbers though). But overall I agree with you that unlikely is in most cases a bad idea (and I submitted the original patch introducing it originally @). That is because it is often used in situations where gcc's default branch prediction heuristics do would make exactly the same decision if (unlikely(x == NULL)) is simply totally useless because gcc already assumes all x == NULL tests are unlikely. I appended some of the builtin heuristics from a recent gcc source so people can see them. Note in particular the last predictors; assuming branch ending with goto, including call, causing early function return or returning negative constant are not taken. Just these alone are likely 95+% of the unlikelies in the kernel. -Andi /* Use number of loop iterations determined by # of iterations analysis to set probability. We don't want to use Dempster-Shaffer theory here, as the predictions is exact. */ DEF_PREDICTOR (PRED_LOOP_ITERATIONS, loop iterations, PROB_ALWAYS, PRED_FLAG_FIRST_MATCH) /* Hints dropped by user via __builtin_expect feature. */ DEF_PREDICTOR (PRED_BUILTIN_EXPECT, __builtin_expect, PROB_VERY_LIKELY, PRED_FLAG_FIRST_MATCH) /* Use number of loop iterations guessed by the contents of the loop. */ DEF_PREDICTOR (PRED_LOOP_ITERATIONS_GUESSED, guessed loop iterations, PROB_ALWAYS, PRED_FLAG_FIRST_MATCH) /* Branch containing goto is probably not taken. */ DEF_PREDICTOR (PRED_CONTINUE, continue, HITRATE (56), 0) /* Branch to basic block containing call marked by noreturn attribute. */ DEF_PREDICTOR (PRED_NORETURN, noreturn call, HITRATE (99), PRED_FLAG_FIRST_MATCH) /* Branch to basic block containing call marked by cold function attribute. */ DEF_PREDICTOR (PRED_COLD_FUNCTION, cold function call, HITRATE (99), PRED_FLAG_FIRST_MATCH) /* Loopback edge is taken. */ DEF_PREDICTOR (PRED_LOOP_BRANCH, loop branch, HITRATE (86), PRED_FLAG_FIRST_MATCH) /* Edge causing loop to terminate is probably not taken. */ DEF_PREDICTOR (PRED_LOOP_EXIT, loop exit, HITRATE (91), PRED_FLAG_FIRST_MATCH) /* Pointers are usually not NULL. */ DEF_PREDICTOR (PRED_POINTER, pointer, HITRATE (85), 0) DEF_PREDICTOR (PRED_TREE_POINTER, pointer (on trees), HITRATE (85), 0) /* NE is probable, EQ not etc... */ DEF_PREDICTOR (PRED_OPCODE_POSITIVE, opcode values positive, HITRATE (79), 0) DEF_PREDICTOR (PRED_OPCODE_NONEQUAL, opcode values nonequal, HITRATE (71), 0) DEF_PREDICTOR (PRED_FPOPCODE, fp_opcode, HITRATE (90), 0) DEF_PREDICTOR (PRED_TREE_OPCODE_POSITIVE, opcode values positive (on trees), HITRATE (70), 0) DEF_PREDICTOR (PRED_TREE_OPCODE_NONEQUAL, opcode values nonequal (on trees), HITRATE (69), 0) DEF_PREDICTOR (PRED_TREE_FPOPCODE, fp_opcode (on trees), HITRATE (90), 0) /* Branch guarding call is probably taken. */ DEF_PREDICTOR (PRED_CALL, call, HITRATE (69), 0) /* Branch causing function to terminate is probably not taken. */ DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, early return (on trees), HITRATE (54), 0) /* Branch containing goto is probably not taken. */ DEF_PREDICTOR (PRED_GOTO, goto, HITRATE (70), 0) /* Branch guarding call is probably taken. */ DEF_PREDICTOR (PRED_CALL, call, HITRATE (69), 0) /* Branch causing function to terminate is probably not taken. */ DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, early return (on trees), HITRATE (54), 0) /* Branch containing goto is probably not taken. */ DEF_PREDICTOR (PRED_GOTO, goto, HITRATE (70), 0) /* Branch ending with return constant is probably not taken. */ DEF_PREDICTOR (PRED_CONST_RETURN, const return, HITRATE (67), 0) /* Branch ending with return negative constant is probably not taken. */ DEF_PREDICTOR (PRED_NEGATIVE_RETURN, negative return, HITRATE (96), 0) /* Branch ending with return; is probably not taken */ DEF_PREDICTOR (PRED_NULL_RETURN, null return, HITRATE (96), 0) /* Branches to a mudflap bounds check are extremely unlikely. */ DEF_PREDICTOR
Re: libfdt: More tests of NOP handling behaviour
So, like, the other day David Gibson mumbled: In light of the recently discovered bug with NOP handling, this adds some more testcases for NOP handling. Specifically, it adds a helper program which will add a NOP tag after every existing tag in a dtb, and runs the standard battery of tests over trees mangled in this way. For now, this does not add a NOP at the very beginning of the structure block. This causes problems for libfdt at present, because we assume in many places that the root node's BEGIN_NODE tag is at offset 0. I'm still contemplating what to do about this (with one option being simply to declare such dtbs invalid). Signed-off-by: David Gibson [EMAIL PROTECTED] Applied. BTW, declaring DTBs with BEGIN_NODES not at offset 0 as invalid seems like a fine choice to me. jdl ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: libfdt: Remove no longer used code from fdt_node_offset_by_compatible()
So, like, the other day David Gibson mumbled: Since fdt_node_offset_by_compatible() was converted to the new fdt_next_node() iterator, a chunk of initialization code became redundant, but was not removed by oversight. This patch cleans it up. Signed-off-by: David Gibson [EMAIL PROTECTED] Applied. jdl ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: libfdt: Trivial cleanup for CHECK_HEADER)
So, like, the other day David Gibson mumbled: Currently the CHECK_HEADER() macro is defined local to fdt_ro.c. However, there are a handful of functions (fdt_move, rw_check_header, fdt_open_into) from other files which could also use it (currently they open-code something more-or-less identical). Therefore, this patch moves CHECK_HEADER() to libfdt_internal.h and uses it in those places. Signed-off-by: David Gibson [EMAIL PROTECTED] Applied. jdl ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
David Howells wrote: Geert Uytterhoeven [EMAIL PROTECTED] wrote: Hence shouldn't we ask the gcc people what's the purpose of __builtin_expect(), if it doesn't live up to its promise? __builtin_expect() is useful on FRV where you _have_ to give each branch and conditional branch instruction a measure of probability whether the branch will be taken. David I was wondering whether some of the uses of likely illustrated below are incorrect or not useful. x = likely(X) || Y for ( ... ; likely(...); ... ) while ( likely(X) ) if ( unlikely(X) /|| likely(Y) ) if ( X /|| unlikely(Y) ) return likely(X); return likely(X) ? a : b; ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[PATCH-RESEND] next-20080218 build failure at pmac_ide_macio_attach ()
On Mon, Feb 18, 2008 at 03:17:49PM +0100, Andreas Schwab wrote: Kamalesh Babulal [EMAIL PROTECTED] writes: . . snip Just remove the cast. Andreas. Resending the patch after making the changes Andreas said. Signed-off-by: Kamalesh Babulal [EMAIL PROTECTED] Cc: Andreas Schwab [EMAIL PROTECTED] -- --- linux-2.6.25-rc1/drivers/ide/ppc/pmac.c 2008-02-18 22:24:49.0 +0530 +++ linux-2.6.25-rc1/drivers/ide/ppc/~pmac.c2008-02-18 22:25:10.0 +0530 @@ -1091,7 +1091,7 @@ pmac_ide_macio_attach(struct macio_dev * int irq, rc; hw_regs_t hw; - pmif = (struct pmac_ide_hwif)kzalloc(sizeof(*pmif), GFP_KERNEL); + pmif = kzalloc(sizeof(*pmif), GFP_KERNEL); if (pmif == NULL) return -ENOMEM; @@ -1229,7 +1229,7 @@ pmac_ide_pci_attach(struct pci_dev *pdev return -ENODEV; } - pmif = (struct pmac_ide_hwif)kzalloc(sizeof(*pmif), GFP_KERNEL); + pmif = kzalloc(sizeof(*pmif), GFP_KERNEL); if (pmif == NULL) return -ENOMEM; -- Thanks Regards, Kamalesh Babulal, Linux Technology Center, IBM, ISTL. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Linux-fbdev-devel] [PATCH 1/2] fb: add support for foreign endianness
On Mon, Feb 18, 2008 at 12:30:11PM -0500, [EMAIL PROTECTED] wrote: On Mon, 18 Feb 2008 08:18:47 +0100, Krzysztof Helt said: I know two fb drivers which use endianess information (pm2fb and s3c2410fb). Both resolve endianess at driver level. Actually, both handle it by setting special bits so the graphics chip itself reorder bytes to transform foreign endianess. I understand that this patch is for chips which cannot reorder bytes by themselves. Does anybody know of such a chip that's actually available in the wild? LE Fujitsu mb86277 (MINT) on the BE MPC8360E. -- Anton Vorontsov email: [EMAIL PROTECTED] backup email: [EMAIL PROTECTED] irc://irc.freenode.net/bd2 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Mon, 18 Feb 2008 14:27:10 GMT, David Howells said: __builtin_expect() is useful on FRV where you _have_ to give each branch and conditional branch instruction a measure of probability whether the branch will be taken. What does gcc do the 99.998% of the time we don't have likely/unlikely coded? pgpYerVU1HBt7.pgp Description: PGP signature ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
arch_initcall time
I need to call i2c_register_board_info for the new i2c style ad7414 driver. This needs to be called at arch initcall time. Currently I just do this: static int __init warp_arch_init(void) { i2c_register_board_info(0, warp_i2c_info, ARRAY_SIZE(warp_i2c_info)); return 0; } arch_initcall(warp_arch_init); It works, but is there a better place to put this? None of the other powerpc platforms make this call and I want to get it right, so that others don't blindly follow my example ;) I kept the name vague rather than specific in case more drivers need to be setup this way in the future. Cheers, Sean ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: arch_initcall time
On Feb 18, 2008 11:28 AM, Sean MacLennan [EMAIL PROTECTED] wrote: I need to call i2c_register_board_info for the new i2c style ad7414 driver. This needs to be called at arch initcall time. Currently I just do this: static int __init warp_arch_init(void) { i2c_register_board_info(0, warp_i2c_info, ARRAY_SIZE(warp_i2c_info)); return 0; } arch_initcall(warp_arch_init); Yes, this is the right thing to do, but use machine_arch_initcall() instead so that it doesn't get called if it is not your board. Cheers, g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Linux-fbdev-devel] [PATCH 1/2] fb: add support for foreign endianness
On Mon, 18 Feb 2008 08:18:47 +0100, Krzysztof Helt said: I know two fb drivers which use endianess information (pm2fb and s3c2410fb). Both resolve endianess at driver level. Actually, both handle it by setting special bits so the graphics chip itself reorder bytes to transform foreign endianess. I understand that this patch is for chips which cannot reorder bytes by themselves. Does anybody know of such a chip that's actually available in the wild? Or are we writing drivers for speculative possible chips? pgpfU4AE2m8IS.pgp Description: PGP signature ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: arch_initcall time
On Feb 18, 2008 11:31 AM, Grant Likely [EMAIL PROTECTED] wrote: On Feb 18, 2008 11:28 AM, Sean MacLennan [EMAIL PROTECTED] wrote: I need to call i2c_register_board_info for the new i2c style ad7414 driver. This needs to be called at arch initcall time. Currently I just do this: static int __init warp_arch_init(void) { i2c_register_board_info(0, warp_i2c_info, ARRAY_SIZE(warp_i2c_info)); return 0; } arch_initcall(warp_arch_init); Yes, this is the right thing to do, but use machine_arch_initcall() instead so that it doesn't get called if it is not your board. That being said, I believe there is infrastructure to handle the creation of your i2c board info from the device tree. Your i2c board info should not be hard coded. Cheers, g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: arch_initcall time
On Mon, 18 Feb 2008 12:42:40 -0600 Olof Johansson [EMAIL PROTECTED] wrote: On Mon, Feb 18, 2008 at 11:32:14AM -0700, Grant Likely wrote: On Feb 18, 2008 11:31 AM, Grant Likely [EMAIL PROTECTED] wrote: On Feb 18, 2008 11:28 AM, Sean MacLennan [EMAIL PROTECTED] wrote: I need to call i2c_register_board_info for the new i2c style ad7414 driver. This needs to be called at arch initcall time. Currently I just do this: static int __init warp_arch_init(void) { i2c_register_board_info(0, warp_i2c_info, ARRAY_SIZE(warp_i2c_info)); return 0; } arch_initcall(warp_arch_init); Yes, this is the right thing to do, but use machine_arch_initcall() instead so that it doesn't get called if it is not your board. That being said, I believe there is infrastructure to handle the creation of your i2c board info from the device tree. Your i2c board info should not be hard coded. Jon Smirl's patches? Not yet, unfortunately. It didn't make .25, but maybe for .26. (I will need to do it specifically on my platform, like fsl_soc already does, as a stopgap until then). That, and Sean is still working on getting the iic device-tree-compliant driver through as well :) josh ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: arch_initcall time
On Mon, Feb 18, 2008 at 11:32:14AM -0700, Grant Likely wrote: On Feb 18, 2008 11:31 AM, Grant Likely [EMAIL PROTECTED] wrote: On Feb 18, 2008 11:28 AM, Sean MacLennan [EMAIL PROTECTED] wrote: I need to call i2c_register_board_info for the new i2c style ad7414 driver. This needs to be called at arch initcall time. Currently I just do this: static int __init warp_arch_init(void) { i2c_register_board_info(0, warp_i2c_info, ARRAY_SIZE(warp_i2c_info)); return 0; } arch_initcall(warp_arch_init); Yes, this is the right thing to do, but use machine_arch_initcall() instead so that it doesn't get called if it is not your board. That being said, I believe there is infrastructure to handle the creation of your i2c board info from the device tree. Your i2c board info should not be hard coded. Jon Smirl's patches? Not yet, unfortunately. It didn't make .25, but maybe for .26. (I will need to do it specifically on my platform, like fsl_soc already does, as a stopgap until then). -Olof ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Mon, 18 Feb 2008 13:11:06 -0500 [EMAIL PROTECTED] wrote: On Mon, 18 Feb 2008 14:27:10 GMT, David Howells said: __builtin_expect() is useful on FRV where you _have_ to give each branch and conditional branch instruction a measure of probability whether the branch will be taken. What does gcc do the 99.998% of the time we don't have likely/unlikely coded? see Andi's email. It gets the exact same hints that 95%+ of the kernels unlikely/likely get you, because the heuristics in it are usually the same as the kernel programmers heuristics. -- If you want to reach me at my work email, use [EMAIL PROTECTED] For development, discussion and tips for power savings, visit http://www.lesswatts.org ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: How to describe FPGA-based devices in the device tree ?
On Feb 18, 2008 10:47 AM, Scott Wood [EMAIL PROTECTED] wrote: On Mon, Feb 18, 2008 at 01:43:52PM +0100, Laurent Pinchart wrote: Should I put IP core nodes as children of the FPGA node ? You could do that as well. I'd recommend doing that, then your subnodes are isolated from changes to the bus attachment (chipselect). (really an insignificant point, but I think it is a more logical layout). So, something like this: [EMAIL PROTECTED],0 { #address-cells = 1; #size-cells = 1; ranges = 0 4 0 0010; /* breakdown of 'ranges' fields: */ /* 0: start address of internal range */ /* 4 0: start address of external range (chip select 4, address 0) */ /* 0010: size of range */ [EMAIL PROTECTED] { compatible = foo,bar; reg = 0 0001; }; [EMAIL PROTECTED] { compatible = foo,bar; reg = 1 0001; }; [EMAIL PROTECTED] { compatible = foo,bar; reg = 2 0001; }; }; Cheers, g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Cbe-oss-dev] [PATCH 1/3] Fix Unlikely(x) == y
On Feb 18, 2008 6:01 AM, Geert Uytterhoeven [EMAIL PROTECTED] wrote: This means it generates faster code with a current gcc for your platform. But a future gcc might e.g. replace the whole loop with a division (gcc SVN head (that will soon become gcc 4.3) already does transformations like replacing loops with divisions [1]). Yes but the issue is one optimization inside GCC does not take into account the probability in one case. And really there is a bug in the linux kernel for not implementing the long long divide function (or really using libgcc) but that is a different story and is part of the issue there anyways. -- Pinski ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] booting-without-of: add Xilinx uart 16550.
Hello. Stephen Neuendorffer wrote: Instead of attempting to come up with a generic description of this, I recommend just naming it after the actual device instance; something like compatible=xlnx,opb-uart16550; Well, that means that we'll need a to add a code which glues the chip to 8250.c driver... well, of_serial.c could be that glue layer if we add to it the ability to recognize Xilinx UART... well, legacy_serial.c could be taught that trick too... Well, we could also add the new compatible, but still claim ns16550 compatibility... This actually makes more sense to me... I'd rather have the code set the reg-shift than have it explicitly set in the device tree anyway. The compatibility set should include (at the least): opb_uart16550_v1_00_c opb_uart16550_v1_00_d opb_uart16550_v1_00_e plb_uart16550_v1_00_c xps_uart16550_v1_00_a Sounds like too much? Couldn't this be handled via the model prop? I think this is somewhat independent of Sergei's arguments that generic ns16550 devices should allow having a reg-shift set Steve WBR, Sergei ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Patch 0/2] powerpc: avoid userspace poking to legacy ioports
Hi Ben, On Thu, 14 Feb 2008 07:42:54 +1100, Benjamin Herrenschmidt wrote: On Wed, 2008-02-13 at 18:35 +0100, Christian Krafft wrote: sensors_detect crashes kernel on PowerPC, as it pokes directly to memory. For the records, sensors-detect accesses I/O ports, not memory. This patch adds a check_legacy_ioports to read_port and write_port. It will now return ENXIO, instead of oopsing. Signed-off-by: Christian Krafft [EMAIL PROTECTED] The problem is that this prevents using /proc/ioports to access PCI IO space, which might be useful. Maybe Christian's patch can be improved to not do the check on these? As long as /dev/port exists, it seems reasonable that the kernel should behave, no matter what I/O ports are accessed from user-space. I hate that sensors_detect.. or for that matter any other userland code that pokes random ports like that. It should die. What do you propose as a replacement? And how is userland code poking at random ports different from kernel code poking at random ports? We could move sensors-detect inside the kernel (and I have some plan to do that) but I fail to see how this would solve this particular problem. Index: linux.git/drivers/char/mem.c === --- linux.git.orig/drivers/char/mem.c +++ linux.git/drivers/char/mem.c @@ -566,8 +566,13 @@ static ssize_t read_port(struct file * f char __user *tmp = buf; if (!access_ok(VERIFY_WRITE, buf, count)) - return -EFAULT; + return -EFAULT; + while (count-- 0 i 65536) { +#ifdef CONFIG_PPC_MERGE + if (check_legacy_ioport(i)) + return -ENXIO; +#endif if (__put_user(inb(i),tmp) 0) return -EFAULT; i++; @@ -585,6 +590,7 @@ static ssize_t write_port(struct file * if (!access_ok(VERIFY_READ,buf,count)) return -EFAULT; + while (count-- 0 i 65536) { char c; if (__get_user(c, tmp)) { @@ -592,6 +598,10 @@ static ssize_t write_port(struct file * break; return -EFAULT; } +#ifdef CONFIG_PPC_MERGE + if (check_legacy_ioport(i)) + return -ENXIO; +#endif outb(c,i); i++; tmp++; -- Jean Delvare ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Patch 0/2] powerpc: avoid userspace poking to legacy ioports
Maybe Christian's patch can be improved to not do the check on these? As long as /dev/port exists, it seems reasonable that the kernel should behave, no matter what I/O ports are accessed from user-space. nonsense. /dev/mem exists for example, but you are still not supposed to go bang all over the place in it. I hate that sensors_detect.. or for that matter any other userland code that pokes random ports like that. It should die. What do you propose as a replacement? Dunno, something less scary, like knowing where your sensors are on a given machine... honestly, it's just scary the risk you guys are taking by banging random IO ports. At the very least, that shouldn't be done on non-x86. And how is userland code poking at random ports different from kernel code poking at random ports? We could move sensors-detect inside the kernel (and I have some plan to do that) but I fail to see how this would solve this particular problem. It wouldn't, but at least I could NAK it or make it CONFIG_X86 :-) Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Patch 0/2] powerpc: avoid userspace poking to legacy ioports
* Super-I/O chips at 0x2e/0x2f and 0x4e/0x4f. * Legacy PC hardware monitoring chips at 0x290-0x297. * IPMI interface at 0x0ca3 and 0x0cab (read-only). Please tell me which ones should be skipped on PowerPC. Skip the whole thing. I consider that on a powerpc linux port, the platform is responsible for telling drivers where things are (via the device tree generally) Christian, can you tell me which of these probes caused trouble for you? And how is userland code poking at random ports different from kernel code poking at random ports? We could move sensors-detect inside the kernel (and I have some plan to do that) but I fail to see how this would solve this particular problem. It wouldn't, but at least I could NAK it or make it CONFIG_X86 :-) The same could be done for user-space (or at the /dev/port level.) Well, there are -other- legit usages of /dev/port... Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Patch 0/2] powerpc: avoid userspace poking to legacy ioports
On Mon, 18 Feb 2008 21:58:42 +0100 Jean Delvare [EMAIL PROTECTED] wrote: On Tue, 19 Feb 2008 07:42:03 +1100, Benjamin Herrenschmidt wrote: Maybe Christian's patch can be improved to not do the check on these? As long as /dev/port exists, it seems reasonable that the kernel should behave, no matter what I/O ports are accessed from user-space. nonsense. /dev/mem exists for example, but you are still not supposed to go bang all over the place in it. You should at least be able to read from it without crashing the machine. Of course writing is a different story. keep dreaming. This is not how /dev/mem works today, not on x86 and very likely not on ppc either. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Patch 0/2] powerpc: avoid userspace poking to legacy ioports
On Tue, 19 Feb 2008 07:42:03 +1100, Benjamin Herrenschmidt wrote: Maybe Christian's patch can be improved to not do the check on these? As long as /dev/port exists, it seems reasonable that the kernel should behave, no matter what I/O ports are accessed from user-space. nonsense. /dev/mem exists for example, but you are still not supposed to go bang all over the place in it. You should at least be able to read from it without crashing the machine. Of course writing is a different story. I hate that sensors_detect.. or for that matter any other userland code that pokes random ports like that. It should die. What do you propose as a replacement? Dunno, something less scary, like knowing where your sensors are on a given machine... You mean, having a complete database for the, what, 4000 PC motherboards out there? And maintaining it day after day? _This_ sounds much scarier to me than the current situation. honestly, it's just scary the risk you guys are taking by banging random IO ports. I don't remember anyone reporting problems with this in the past 3 or 4 years, so it doesn't seem to be a big problem in practice. At the very least, that shouldn't be done on non-x86. I am surprised that anyone would actually run sensors-detect on non-x86... Non-PC hardware usually doesn't have sensors driven by hwmon drivers anyway, or people know what they have and do not need detection. But I would be totally fine with updating sensors-detect to skip some of the probes on non-x86 hardware. There are basically 3 /dev/port probes that are done currently: * Super-I/O chips at 0x2e/0x2f and 0x4e/0x4f. * Legacy PC hardware monitoring chips at 0x290-0x297. * IPMI interface at 0x0ca3 and 0x0cab (read-only). Please tell me which ones should be skipped on PowerPC. Christian, can you tell me which of these probes caused trouble for you? And how is userland code poking at random ports different from kernel code poking at random ports? We could move sensors-detect inside the kernel (and I have some plan to do that) but I fail to see how this would solve this particular problem. It wouldn't, but at least I could NAK it or make it CONFIG_X86 :-) The same could be done for user-space (or at the /dev/port level.) -- Jean Delvare ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Mon, 2008-02-18 at 16:13 +0200, Adrian Bunk wrote: On Mon, Feb 18, 2008 at 03:01:35PM +0100, Geert Uytterhoeven wrote: On Mon, 18 Feb 2008, Adrian Bunk wrote: This means it generates faster code with a current gcc for your platform. But a future gcc might e.g. replace the whole loop with a division (gcc SVN head (that will soon become gcc 4.3) already does transformations like replacing loops with divisions [1]). Hence shouldn't we ask the gcc people what's the purpose of __builtin_expect(), if it doesn't live up to its promise? That's a different issue. My point here is that we do not know how the latest gcc available in the year 2010 might transform this code, and how a likely/unlikely placed there might influence gcc's optimizations then. You're right, we don't know. But if giving the compiler _more_ information causes it to produce vastly inferior code then we should be filing gcc bugs. After all the unlikely/likely is just a hint, if gcc knows better it can always ignore it. cheers -- Michael Ellerman OzLabs, IBM Australia Development Lab wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person signature.asc Description: This is a digitally signed message part ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: Build failure on treeboot-walnut.c
Kumar Gala wrote: $ make V=1 make ARCH=ppc64 -f scripts/Makefile.build obj=arch/powerpc/boot arch/powerpc/boot/uImage powerpc-unknown-linux-gnu-gcc -m32 - Wp,-MD,arch/powerpc/boot/.treeboot-walnut.o.d -Wall -Wundef - Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -Os -msoft-float -pipe - fomit-frame-pointer -fno-builtin -fPIC -nostdinc -isystem /_TOOLS_/.dist0/gnu- gcc-4.0.2-binutils-2.16.1-glibc-2.3.6-e300c2-powerpc-unknown-linux-gnu/i686- pc-linux2.4/bin/../lib/gcc/powerpc-unknown-linux-gnu/4.0.2/include - Iarch/powerpc/boot -I/temp/kumar.484/arch/powerpc/boot -mcpu=405 -c -o arch/powerpc/boot/treeboot-walnut.o arch/powerpc/boot/treeboot-walnut.c Assembler messages: Error: Internal assembler error for instruction icbt Internal error, aborting at /tmp/crosstool/crosstool-0.42/build/powerpc-unknown-linux-gnu/gcc- 4.0.2_e300-enabled-glibc-2.3.6/binutils-2.16.1-complete/gas/config/tc-ppc.c line 1314 in ppc_setup_opcodes Please report this bug. make[1]: *** [arch/powerpc/boot/treeboot-walnut.o] Error 2 make: *** [uImage] Error 2 ARCH=ppc64? WTF? I specifically said ARCH=powerpc on the make command line. I have got the exact same problem. Has the problem been sovled ? I am trying to compile a 2.6.24 kernel (vanilla + some board specific stuff) with a vanilla gcc-4.1.2 with the flag -msoft-float. cheers, Maxime -- Maxime Louvel 0044 7964 80 43 Allen road Whitemore reans WV60AW Wolverhampton United Kingdom ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Linux-fbdev-devel] [PATCH 1/2] fb: add support for foreign endianness
[EMAIL PROTECTED] schrieb: On Mon, 18 Feb 2008 08:18:47 +0100, Krzysztof Helt said: I know two fb drivers which use endianess information (pm2fb and s3c2410fb). Both resolve endianess at driver level. Actually, both handle it by setting special bits so the graphics chip itself reorder bytes to transform foreign endianess. I understand that this patch is for chips which cannot reorder bytes by themselves. Does anybody know of such a chip that's actually available in the wild? Or are we writing drivers for speculative possible chips? I had troubles with the Silicon Motion SM501/SM502 endianess on PowerPC PCI vs. LocalBus. The chip also has a register to swap endianess, but that seems to only affect some LocalBus modes. The current fb and X drivers are working, but when it comes to font aliasing and hw-acceleration, the problems start to rise again... Regards, Clemens ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [Linux-fbdev-devel] [PATCH 1/2] fb: add support for foreign endianness
On Tue, 2008-02-19 at 00:35 +0100, Clemens Koller wrote: [EMAIL PROTECTED] schrieb: On Mon, 18 Feb 2008 08:18:47 +0100, Krzysztof Helt said: I know two fb drivers which use endianess information (pm2fb and s3c2410fb). Both resolve endianess at driver level. Actually, both handle it by setting special bits so the graphics chip itself reorder bytes to transform foreign endianess. I understand that this patch is for chips which cannot reorder bytes by themselves. Does anybody know of such a chip that's actually available in the wild? Or are we writing drivers for speculative possible chips? I had troubles with the Silicon Motion SM501/SM502 endianess on PowerPC PCI vs. LocalBus. The chip also has a register to swap endianess, but that seems to only affect some LocalBus modes. The current fb and X drivers are working, but when it comes to font aliasing and hw-acceleration, the problems start to rise again... Most sane gfx chips nowadays provide configurable surfaces that allow to perform the swap when writing/reading from regions of the framebuffer, with the ability to set a different swapper setting (based on bit depth) per region. Then there is also the risk that your PCI-Localbus has been wired improperly :-) Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] next-20080218 build failure at pmac_ide_macio_attach ()
On Monday 18 February 2008, Kamalesh Babulal wrote: Hi, The next-20080218 kernel build fails on the powerpc(s) drivers/ide/ppc/pmac.c: In function ‘pmac_ide_macio_attach’: drivers/ide/ppc/pmac.c:1094: error: conversion to non-scalar type requested drivers/ide/ppc/pmac.c: In function ‘pmac_ide_pci_attach’: drivers/ide/ppc/pmac.c:1232: error: conversion to non-scalar type requested make[3]: *** [drivers/ide/ppc/pmac.o] Error 1 make[2]: *** [drivers/ide/ppc] Error 2 make[1]: *** [drivers/ide] Error 2 make: *** [drivers] Error 2 I Have tested this patch for build failure only. Signed-off-by: Kamalesh Babulal [EMAIL PROTECTED] --- --- linux-2.6.25-rc1/drivers/ide/ppc/pmac.c 2008-02-18 18:41:48.0 +0530 +++ linux-2.6.25-rc1/drivers/ide/ppc/~pmac.c 2008-02-18 19:20:37.0 +0530 @@ -1091,7 +1091,7 @@ pmac_ide_macio_attach(struct macio_dev * int irq, rc; hw_regs_t hw; - pmif = (struct pmac_ide_hwif)kzalloc(sizeof(*pmif), GFP_KERNEL); + pmif = (struct pmac_ide_hwif*)kzalloc(sizeof(*pmif), GFP_KERNEL); if (pmif == NULL) return -ENOMEM; @@ -1229,7 +1229,7 @@ pmac_ide_pci_attach(struct pci_dev *pdev return -ENODEV; } - pmif = (struct pmac_ide_hwif)kzalloc(sizeof(*pmif), GFP_KERNEL); + pmif = (struct pmac_ide_hwif*)kzalloc(sizeof(*pmif), GFP_KERNEL); if (pmif == NULL) return -ENOMEM; Thanks, I integrated it with the guilty patch to preserve bisectability. From: Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] Subject: [PATCH] ide-pmac: dynamically allocate struct pmac_ide_hwif instances (take 2) * Dynamically allocate struct pmac_ide_hwif instances in pmac_ide_macio_attach() and pmac_ide_pci_attach(), then remove no longer needed pmac_ide[]. v2: * Build fix from Kamalesh Babulal. Cc: Benjamin Herrenschmidt [EMAIL PROTECTED] Cc: Kamalesh Babulal [EMAIL PROTECTED] Signed-off-by: Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] --- drivers/ide/ppc/pmac.c | 49 + 1 file changed, 33 insertions(+), 16 deletions(-) Index: b/drivers/ide/ppc/pmac.c === --- a/drivers/ide/ppc/pmac.c +++ b/drivers/ide/ppc/pmac.c @@ -79,8 +79,6 @@ typedef struct pmac_ide_hwif { } pmac_ide_hwif_t; -static pmac_ide_hwif_t pmac_ide[MAX_HWIFS]; - enum { controller_ohare, /* OHare based */ controller_heathrow,/* Heathrow/Paddington */ @@ -1094,29 +1092,34 @@ pmac_ide_macio_attach(struct macio_dev * int i, rc; hw_regs_t hw; + pmif = kzalloc(sizeof(*pmif), GFP_KERNEL); + if (pmif == NULL) + return -ENOMEM; + i = 0; - while (i MAX_HWIFS (ide_hwifs[i].io_ports[IDE_DATA_OFFSET] != 0 - || pmac_ide[i].node != NULL)) + while (i MAX_HWIFS (ide_hwifs[i].io_ports[IDE_DATA_OFFSET] != 0)) ++i; if (i = MAX_HWIFS) { printk(KERN_ERR ide-pmac: MacIO interface attach with no slot\n); printk(KERN_ERR %s\n, mdev-ofdev.node-full_name); - return -ENODEV; + rc = -ENODEV; + goto out_free_pmif; } - pmif = pmac_ide[i]; hwif = ide_hwifs[i]; if (macio_resource_count(mdev) == 0) { printk(KERN_WARNING ide%d: no address for %s\n, i, mdev-ofdev.node-full_name); - return -ENXIO; + rc = -ENXIO; + goto out_free_pmif; } /* Request memory resource for IO ports */ if (macio_request_resource(mdev, 0, ide-pmac (ports))) { printk(KERN_ERR ide%d: can't request mmio resource !\n, i); - return -EBUSY; + rc = -EBUSY; + goto out_free_pmif; } /* XXX This is bogus. Should be fixed in the registry by checking @@ -1166,11 +1169,15 @@ pmac_ide_macio_attach(struct macio_dev * iounmap(pmif-dma_regs); macio_release_resource(mdev, 1); } - memset(pmif, 0, sizeof(*pmif)); macio_release_resource(mdev, 0); + kfree(pmif); } return rc; + +out_free_pmif: + kfree(pmif); + return rc; } static int @@ -1223,30 +1230,36 @@ pmac_ide_pci_attach(struct pci_dev *pdev printk(KERN_ERR ide-pmac: cannot find MacIO node for Kauai ATA interface\n); return -ENODEV; } + + pmif = kzalloc(sizeof(*pmif), GFP_KERNEL); + if (pmif == NULL) + return -ENOMEM; + i = 0; - while (i MAX_HWIFS (ide_hwifs[i].io_ports[IDE_DATA_OFFSET] != 0 - || pmac_ide[i].node != NULL)) + while (i MAX_HWIFS (ide_hwifs[i].io_ports[IDE_DATA_OFFSET] != 0)) ++i; if (i = MAX_HWIFS
Re: [PATCH 2/2] i2c-ibm_iic driver
An updated version of the patch. This one should answer all of Jean's concerns. Cheers, Sean Signed-off-by: Sean MacLennan [EMAIL PROTECTED] --- --- for-2.6.25/drivers/i2c/busses/orig-i2c-ibm_iic.c2008-02-18 16:36:30.0 -0500 +++ for-2.6.25/drivers/i2c/busses/i2c-ibm_iic.c 2008-02-18 17:39:53.0 -0500 @@ -6,6 +6,9 @@ * Copyright (c) 2003, 2004 Zultys Technologies. * Eugene Surovegin [EMAIL PROTECTED] or [EMAIL PROTECTED] * + * Copyright (c) 2008 PIKA Technologies + * Sean MacLennan [EMAIL PROTECTED] + * * Based on original work by * Ian DaSilva [EMAIL PROTECTED] * Armin Kuster [EMAIL PROTECTED] @@ -39,12 +42,17 @@ #include asm/io.h #include linux/i2c.h #include linux/i2c-id.h + +#ifdef CONFIG_IBM_OCP #include asm/ocp.h #include asm/ibm4xx.h +#else +#include linux/of_platform.h +#endif #include i2c-ibm_iic.h -#define DRIVER_VERSION 2.1 +#define DRIVER_VERSION 2.2 MODULE_DESCRIPTION(IBM IIC driver v DRIVER_VERSION); MODULE_LICENSE(GPL); @@ -657,6 +665,7 @@ return (u8)((opb + 9) / 10 - 1); } +#ifdef CONFIG_IBM_OCP /* * Register single IIC interface */ @@ -828,5 +837,182 @@ ocp_unregister_driver(ibm_iic_driver); } +#else /* !CONFIG_IBM_OCP */ + +static int __devinit iic_request_irq(struct of_device *ofdev, +struct ibm_iic_private *dev) +{ + struct device_node *np = ofdev-node; + int irq; + + if (iic_force_poll) + return NO_IRQ; + + irq = irq_of_parse_and_map(np, 0); + if (irq == NO_IRQ) { + dev_err(ofdev-dev, irq_of_parse_and_map failed\n); + return NO_IRQ; + } + + /* Disable interrupts until we finish initialization, assumes +* level-sensitive IRQ setup... +*/ + iic_interrupt_mode(dev, 0); + if (request_irq(irq, iic_handler, 0, IBM IIC, dev)) { + dev_err(ofdev-dev, request_irq %d failed\n, irq); + /* Fallback to the polling mode */ + return NO_IRQ; + } + + return irq; +} + +/* + * Register single IIC interface + */ +static int __devinit iic_probe(struct of_device *ofdev, + const struct of_device_id *match) +{ + struct device_node *np = ofdev-node; + struct ibm_iic_private *dev; + struct i2c_adapter *adap; + const u32 *indexp, *freq; + int ret; + + + dev = kzalloc(sizeof(*dev), GFP_KERNEL); + if (!dev) { + dev_err(ofdev-dev, failed to allocate device data\n); + return -ENOMEM; + } + + dev_set_drvdata(ofdev-dev, dev); + + indexp = of_get_property(np, index, NULL); + if (!indexp) { + dev_err(ofdev-dev, no index specified.\n); + ret = -EINVAL; + goto error_cleanup; + } + dev-idx = *indexp; + + dev-vaddr = of_iomap(np, 0); + if (dev-vaddr == NULL) { + dev_err(ofdev-dev, failed to iomap device\n); + ret = -ENXIO; + goto error_cleanup; + } + + init_waitqueue_head(dev-wq); + + dev-irq = iic_request_irq(ofdev, dev); + if (dev-irq == NO_IRQ) + dev_warn(ofdev-dev, using polling mode\n); + + /* Board specific settings */ + if (iic_force_fast || of_get_property(np, fast-mode, NULL)) + dev-fast_mode = 1; + + freq = of_get_property(np, clock-frequency, NULL); + if (freq == NULL) { + freq = of_get_property(np-parent, clock-frequency, NULL); + if (freq == NULL) { + dev_err(ofdev-dev, Unable to get bus frequency\n); + ret = -EBUSY; + goto error_cleanup; + } + } + + dev-clckdiv = iic_clckdiv(*freq); + dev_dbg(ofdev-dev, clckdiv = %d\n, dev-clckdiv); + + /* Initialize IIC interface */ + iic_dev_init(dev); + + /* Register it with i2c layer */ + adap = dev-adap; + adap-dev.parent = ofdev-dev; + strlcpy(adap-name, IBM IIC, sizeof(adap-name)); + i2c_set_adapdata(adap, dev); + adap-id = I2C_HW_OCP; + adap-class = I2C_CLASS_HWMON; + adap-algo = iic_algo; + adap-timeout = 1; + adap-nr = dev-idx; + + ret = i2c_add_numbered_adapter(adap); + if (ret 0) { + dev_err(ofdev-dev, failed to register i2c adapter\n); + goto error_cleanup; + } + + dev_dbg(ofdev-dev, using %s mode\n, + dev-fast_mode ? fast (400 kHz) : standard (100 kHz)); + + return 0; + +error_cleanup: + if (dev-irq != NO_IRQ) { + iic_interrupt_mode(dev, 0); + free_irq(dev-irq, dev); + } + + if (dev-vaddr) + iounmap(dev-vaddr); + + dev_set_drvdata(ofdev-dev, NULL); + kfree(dev); + return ret; +} + +/* + * Cleanup initialized IIC interface + */
Re: [PATCH] i2c-ibm_iic driver bonus patch
Here is an optional bonus patch that cleans up most of the checkpatch warnings in the common code. I left in the volatiles, since I don't understand why they where needed. The memory always seems to be access with in_8 and out_8, which are declared volatile. But they could be there to fix a very specific bug. Cheers, Sean Signed-off-by: Sean MacLennan [EMAIL PROTECTED] --- --- for-2.6.25/drivers/i2c/busses/submitted-i2c-ibm_iic.c 2008-02-18 20:44:06.0 -0500 +++ for-2.6.25/drivers/i2c/busses/i2c-ibm_iic.c 2008-02-18 20:53:53.0 -0500 @@ -76,17 +76,17 @@ #endif #if DBG_LEVEL 0 -# define DBG(f,x...) printk(KERN_DEBUG ibm-iic f, ##x) +# define DBG(f, x...) printk(KERN_DEBUG ibm-iic f, ##x) #else -# define DBG(f,x...) ((void)0) +# define DBG(f, x...) ((void)0) #endif #if DBG_LEVEL 1 -# define DBG2(f,x...) DBG(f, ##x) +# define DBG2(f, x...)DBG(f, ##x) #else -# define DBG2(f,x...) ((void)0) +# define DBG2(f, x...)((void)0) #endif #if DBG_LEVEL 2 -static void dump_iic_regs(const char* header, struct ibm_iic_private* dev) +static void dump_iic_regs(const char *header, struct ibm_iic_private *dev) { volatile struct iic_regs __iomem *iic = dev-vaddr; printk(KERN_DEBUG ibm-iic%d: %s\n, dev-idx, header); @@ -98,9 +98,9 @@ in_8(iic-extsts), in_8(iic-clkdiv), in_8(iic-xfrcnt), in_8(iic-xtcntlss), in_8(iic-directcntl)); } -# define DUMP_REGS(h,dev) dump_iic_regs((h),(dev)) +# define DUMP_REGS(h, dev)dump_iic_regs((h), (dev)) #else -# define DUMP_REGS(h,dev) ((void)0) +# define DUMP_REGS(h, dev)((void)0) #endif /* Bus timings (in ns) for bit-banging */ @@ -111,25 +111,26 @@ unsigned int high; unsigned int buf; } timings [] = { -/* Standard mode (100 KHz) */ -{ - .hd_sta = 4000, - .su_sto = 4000, - .low= 4700, - .high = 4000, - .buf= 4700, -}, -/* Fast mode (400 KHz) */ -{ - .hd_sta = 600, - .su_sto = 600, - .low= 1300, - .high = 600, - .buf= 1300, -}}; + /* Standard mode (100 KHz) */ + { + .hd_sta = 4000, + .su_sto = 4000, + .low= 4700, + .high = 4000, + .buf= 4700, + }, + /* Fast mode (400 KHz) */ + { + .hd_sta = 600, + .su_sto = 600, + .low= 1300, + .high = 600, + .buf= 1300, + } +}; /* Enable/disable interrupt generation */ -static inline void iic_interrupt_mode(struct ibm_iic_private* dev, int enable) +static inline void iic_interrupt_mode(struct ibm_iic_private *dev, int enable) { out_8(dev-vaddr-intmsk, enable ? INTRMSK_EIMTC : 0); } @@ -137,7 +138,7 @@ /* * Initialize IIC interface. */ -static void iic_dev_init(struct ibm_iic_private* dev) +static void iic_dev_init(struct ibm_iic_private *dev) { volatile struct iic_regs __iomem *iic = dev-vaddr; @@ -182,7 +183,7 @@ /* * Reset IIC interface */ -static void iic_dev_reset(struct ibm_iic_private* dev) +static void iic_dev_reset(struct ibm_iic_private *dev) { volatile struct iic_regs __iomem *iic = dev-vaddr; int i; @@ -191,19 +192,19 @@ DBG(%d: soft reset\n, dev-idx); DUMP_REGS(reset, dev); - /* Place chip in the reset state */ + /* Place chip in the reset state */ out_8(iic-xtcntlss, XTCNTLSS_SRST); /* Check if bus is free */ dc = in_8(iic-directcntl); - if (!DIRCTNL_FREE(dc)){ + if (!DIRCTNL_FREE(dc)) { DBG(%d: trying to regain bus control\n, dev-idx); /* Try to set bus free state */ out_8(iic-directcntl, DIRCNTL_SDAC | DIRCNTL_SCC); /* Wait until we regain bus control */ - for (i = 0; i 100; ++i){ + for (i = 0; i 100; ++i) { dc = in_8(iic-directcntl); if (DIRCTNL_FREE(dc)) break; @@ -235,7 +236,7 @@ static int iic_dc_wait(volatile struct iic_regs __iomem *iic, u8 mask) { unsigned long x = jiffies + HZ / 28 + 2; - while ((in_8(iic-directcntl) mask) != mask){ + while ((in_8(iic-directcntl) mask) != mask) { if (unlikely(time_after(jiffies, x))) return -1; cond_resched(); @@ -243,15 +244,15 @@ return 0; } -static int iic_smbus_quick(struct ibm_iic_private* dev, const struct i2c_msg* p) +static int iic_smbus_quick(struct ibm_iic_private *dev, const struct i2c_msg *p) { volatile struct iic_regs __iomem *iic = dev-vaddr; - const struct i2c_timings* t = timings[dev-fast_mode ? 1 : 0]; + const struct i2c_timings *t = timings[dev-fast_mode ? 1 : 0]; u8 mask, v, sda; int i, res; /* Only 7-bit
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Tue, 19 Feb 2008 13:33:53 +1100 Nick Piggin [EMAIL PROTECTED] wrote: Actually one thing I don't like about gcc is that I think it still emits cmovs for likely/unlikely branches, which is silly (the gcc developers seem to be in love with that instruction). If that goes away, then branch hints may be even better. only for -Os and only if the result is smaller afaik. (cmov tends to be a performance loss most of the time so for -O2 and such it isn't used as far as I know.. it does make for nice small code however ;-) -- If you want to reach me at my work email, use [EMAIL PROTECTED] For development, discussion and tips for power savings, visit http://www.lesswatts.org ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: compile quirk linux-2.6.24 (with workaround)
On Tue, 5 Feb 2008 09:38:20 -0600 Josh Boyer [EMAIL PROTECTED] wrote: On Tue, 5 Feb 2008 08:24:38 -0700 Grant Likely [EMAIL PROTECTED] wrote: On 2/5/08, Josh Boyer [EMAIL PROTECTED] wrote: I mean, if you have not included 4xx support in the kernel, as is the case here, it does not make sense to add the 4xx bootwrapper code, no ? It does, in a manner. There are both generic and platform specific pieces to the bootwrapper. Having everything always built helps keep the generic bits from breaking, which is important as they're often tightly coupled. That's at least the reason I can think of. The powerpc maintainers have been over this quite a bit and I don't see it changing anytime soon. That would mean we're dropping support for compilers which can't build 405/440 specific wrapper bits (or other core specific quirks that need to go in the wrapper) That doesn't sound appropriate to me. No it doesn't. At least not yet. I said I'd try to come up with a patch soon-ish. We haven't failed!(yet) Also, this isn't a core-specific quirk. It's an architected instruction of Book III-E in the PowerPC ISA. I can't help it if other chips don't implement this wonderful control mechanism ;) Taking a step back though, there will always be odd cases like this as we move forward. Toolchain XXX will eventually not support instruction which will eventually be used, etc. I'll try to make this specific case work because it's scope is quite limited. But this problem as a whole will still remain. My apologies for taking so long on this. Digging through gcc history isn't exactly fun :) Ok, so it seems -mcpu=440 was added in gcc 3.4. The -mcpu=405 option has been around since 2001. Seeing as how there really isn't anything 440 specific in the files effected, we should be able to pass -mcpu=405 for everything and have it still work. Bernhard, can you try the patch below? I've compile test it, and booted it on an Ebony board (PowerPC 440GP). If anyone else cares to test that would also be welcome. josh [POWERPC] Fix bootwrapper builds with older gcc versions GCC versions before 3.4 did not support the -mcpu=440 option. Use -mcpu=405 for the 4xx specific bootwrapper files, as that has been around for much longer. Signed-off-by: Josh Boyer [EMAIL PROTECTED] diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index 63d07cc..e3993a6 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -35,10 +35,10 @@ endif BOOTCFLAGS += -I$(obj) -I$(srctree)/$(obj) -I$(srctree)/$(src)/libfdt -$(obj)/4xx.o: BOOTCFLAGS += -mcpu=440 -$(obj)/ebony.o: BOOTCFLAGS += -mcpu=440 -$(obj)/cuboot-taishan.o: BOOTCFLAGS += -mcpu=440 -$(obj)/cuboot-katmai.o: BOOTCFLAGS += -mcpu=440 +$(obj)/4xx.o: BOOTCFLAGS += -mcpu=405 +$(obj)/ebony.o: BOOTCFLAGS += -mcpu=405 +$(obj)/cuboot-taishan.o: BOOTCFLAGS += -mcpu=405 +$(obj)/cuboot-katmai.o: BOOTCFLAGS += -mcpu=405 $(obj)/treeboot-walnut.o: BOOTCFLAGS += -mcpu=405 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] i2c-ibm_iic driver bonus patch
On Tuesday 19 February 2008, Sean MacLennan wrote: I left in the volatiles, since I don't understand why they where needed. The memory always seems to be access with in_8 and out_8, which are declared volatile. But they could be there to fix a very specific bug. It's very unlikely that they were really needed, and you certainly shouldn't mark data as volatile in new code. It's very common to mark I/O data structures as volatile when they should be __iomem, because that's what people learn at university, but that is never the right solution, even if it can hide other bugs in your code. Of course, unlike the other changes in your patch, it does impact code generation, so if you want to change it, that should be a separate patch. Arnd ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Tuesday 19 February 2008 01:39, Andi Kleen wrote: Arjan van de Ven [EMAIL PROTECTED] writes: you have more faith in the authors knowledge of how his code actually behaves than I think is warranted :) iirc there was a mm patch some time ago to keep track of the actual unlikely values at runtime and it showed indeed some wrong ones. But the far majority of them are probably correct. Or faith in that he knows what unlikely means. I should write docs about this; but unlikely() means: 1) It happens less than 0.01% of the cases. 2) The compiler couldn't have figured this out by itself (NULL pointer checks are compiler done already, same for some other conditions) 3) It's a hot codepath where shaving 0.5 cycles (less even on x86) matters (and the author is ok with taking a 500 cycles hit if he's wrong) One more thing unlikely() does is to move the unlikely code out of line. So it should conserve some icache in critical functions, which might well be worth some more cycles (don't have numbers though). I actually once measured context switching performance in the scheduler, and removing the unlikely hint for testing RT tasks IIRC gave about 5% performance drop. This was on a P4 which is very different from more modern CPUs both in terms of branch performance characteristics, and icache characteristics. However, the P4's branch predictor is pretty good, and it should easily be able to correctly predict the rt_task check if it has enough entries. So I think much of the savings came from code transformation and movement. Anyway, it is definitely worthwhile if used correctly. Actually one thing I don't like about gcc is that I think it still emits cmovs for likely/unlikely branches, which is silly (the gcc developers seem to be in love with that instruction). If that goes away, then branch hints may be even better. But overall I agree with you that unlikely is in most cases a bad idea (and I submitted the original patch introducing it originally @). That is because it is often used in situations where gcc's default branch prediction heuristics do would make exactly the same decision if (unlikely(x == NULL)) is simply totally useless because gcc already assumes all x == NULL tests are unlikely. I appended some of the builtin heuristics from a recent gcc source so people can see them. Note in particular the last predictors; assuming branch ending with goto, including call, causing early function return or returning negative constant are not taken. Just these alone are likely 95+% of the unlikelies in the kernel. Yes, gcc should be able to do pretty good heuristics, considering the quite good numbers that cold CPU predictors can attain. However for really performance critical code (or really never executed code), then I think it is OK to have the hints and not have to rely on gcc heuristics. -Andi [snip] Interesting, thanks! ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Tuesday 19 February 2008 13:40, Arjan van de Ven wrote: On Tue, 19 Feb 2008 13:33:53 +1100 Nick Piggin [EMAIL PROTECTED] wrote: Actually one thing I don't like about gcc is that I think it still emits cmovs for likely/unlikely branches, which is silly (the gcc developers seem to be in love with that instruction). If that goes away, then branch hints may be even better. only for -Os and only if the result is smaller afaik. What is your evidence for saying this? Because here, with the latest kernel and recent gcc-4.3 snapshot, it spits out cmov like crazy even when compiled with -O2. [EMAIL PROTECTED]:~/usr/src/linux-2.6$ grep cmov kernel/sched.s | wc -l 45 And yes it even does for hinted branches and even at -O2/3 [EMAIL PROTECTED]:~/tests$ cat cmov.c int test(int a, int b) { if (__builtin_expect(a b, 0)) return a; else return b; } [EMAIL PROTECTED]:~/tests$ gcc-4.3 -S -O2 cmov.c [EMAIL PROTECTED]:~/tests$ head -13 cmov.s .file cmov.c .text .p2align 4,,15 ..globl test .type test, @function test: ..LFB2: cmpl%edi, %esi cmovle %esi, %edi movl%edi, %eax ret ..LFE2: .size test, .-test This definitely should be a branch, IMO. (cmov tends to be a performance loss most of the time so for -O2 and such it isn't used as far as I know.. it does make for nice small code however ;-) It shouldn't be hard to work out the cutover point based on how expensive cmov is, how expensive branch and branch mispredicts are, and how often the branch is likely to be mispredicted. For an unpredictable branch, cmov is normally quite a good win even on modern CPUs. But gcc overuses it I think. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[patch 0/4] RFC: PCI: consolidate several pcibios_enable_resources() implementations
There are many implementations of pcibios_enable_resources() that differ in minor ways that look more like bugs than architectural differences. This patch series consolidates most of them to use the x86 version. This series is for discussion only at this point. I'm interested in feedback about whether any of the differences are real and need to be preserved. ARM and PA-RISC, in particular, have interesting differences: - ARM always enables bridge devices, which no other arch does - PA-RISC always turns on SERR and PARITY, which no other arch does Should other arches do the same thing, or are these somehow related to ARM and PA-RISC architecture? Bjorn -- ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[patch 1/4] PCI: split pcibios_enable_resources() out of pcibios_enable_device()
On x86, pcibios_enable_device() is factored into pcibios_enable_resources() and pcibios_enable_irq(). On several other architectures, the functional equivalent of pcibios_enable_resources() is expanded directly inside pcibios_enable_device(). This splits these pcibios_enable_device() implementations to make them more similar to the x86 implementation. There should be no functional change from this patch. Signed-off-by: Bjorn Helgaas [EMAIL PROTECTED] --- arch/alpha/kernel/pci.c |8 +++- arch/arm/kernel/bios32.c |9 +++-- arch/parisc/kernel/pci.c |6 +- arch/powerpc/kernel/pci-common.c | 14 +- arch/sh/drivers/pci/pci.c|7 ++- arch/sparc64/kernel/pci.c|7 ++- arch/v850/kernel/rte_mb_a_pci.c |7 ++- 7 files changed, 46 insertions(+), 12 deletions(-) Index: work6/arch/alpha/kernel/pci.c === --- work6.orig/arch/alpha/kernel/pci.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/alpha/kernel/pci.c 2008-02-18 10:45:14.0 -0700 @@ -370,7 +370,7 @@ #endif int -pcibios_enable_device(struct pci_dev *dev, int mask) +pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, oldcmd; int i; @@ -396,6 +396,12 @@ return 0; } +int +pcibios_enable_device(struct pci_dev *dev, int mask) +{ + return pcibios_enable_resources(dev, mask); +} + /* * If we set up a device for bus mastering, we need to check the latency * timer as certain firmware forgets to set it properly, as seen Index: work6/arch/arm/kernel/bios32.c === --- work6.orig/arch/arm/kernel/bios32.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/arm/kernel/bios32.c 2008-02-18 10:45:14.0 -0700 @@ -655,10 +655,10 @@ } /** - * pcibios_enable_device - Enable I/O and memory. + * pcibios_enable_resources - Enable I/O and memory. * @dev: PCI device to be enabled */ -int pcibios_enable_device(struct pci_dev *dev, int mask) +int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, old_cmd; int idx; @@ -697,6 +697,11 @@ return 0; } +int pcibios_enable_device(struct pci_dev *dev, int mask) +{ + return pcibios_enable_resources(dev, mask); +} + int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, enum pci_mmap_state mmap_state, int write_combine) { Index: work6/arch/parisc/kernel/pci.c === --- work6.orig/arch/parisc/kernel/pci.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/parisc/kernel/pci.c 2008-02-18 10:45:14.0 -0700 @@ -285,7 +285,7 @@ * Drivers that do not need parity (eg graphics and possibly networking) * can clear these bits if they want. */ -int pcibios_enable_device(struct pci_dev *dev, int mask) +int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd; int idx; @@ -317,6 +317,10 @@ return 0; } +int pcibios_enable_device(struct pci_dev *dev, int mask) +{ + return pcibios_enable_resources(dev, mask); +} /* PA-RISC specific */ void pcibios_register_hba(struct pci_hba_data *hba) Index: work6/arch/powerpc/kernel/pci-common.c === --- work6.orig/arch/powerpc/kernel/pci-common.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/powerpc/kernel/pci-common.c 2008-02-18 10:45:14.0 -0700 @@ -1153,16 +1153,12 @@ EXPORT_SYMBOL_GPL(pcibios_claim_one_bus); #endif /* CONFIG_HOTPLUG */ -int pcibios_enable_device(struct pci_dev *dev, int mask) +int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, old_cmd; int idx; struct resource *r; - if (ppc_md.pcibios_enable_device_hook) - if (ppc_md.pcibios_enable_device_hook(dev)) - return -EINVAL; - pci_read_config_word(dev, PCI_COMMAND, cmd); old_cmd = cmd; for (idx = 0; idx PCI_NUM_RESOURCES; idx++) { @@ -1193,3 +1189,11 @@ return 0; } +int pcibios_enable_device(struct pci_dev *dev, int mask) +{ + if (ppc_md.pcibios_enable_device_hook) + if (ppc_md.pcibios_enable_device_hook(dev)) + return -EINVAL; + + return pcibios_enable_resources(dev, mask); +} Index: work6/arch/sh/drivers/pci/pci.c === --- work6.orig/arch/sh/drivers/pci/pci.c2008-02-18 10:43:50.0 -0700 +++ work6/arch/sh/drivers/pci/pci.c 2008-02-18 10:45:14.0 -0700 @@ -131,7 +131,7 @@ } } -int pcibios_enable_device(struct pci_dev *dev, int mask) +int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, old_cmd; int idx; @@ -163,6 +163,11 @@
[patch 3/4] xtensa: make pcibios_enable_device() use pcibios_enable_resources()
pcibios_enable_device() has an almost verbatim copy of pcibios_enable_resources(), (the only difference is that pcibios_enable_resources() turns on PCI_COMMAND_MEMORY if there's a ROM resource). The duplication might be intentional, but I don't see any callers of pcibios_enable_resources() on xtensa, so I think it's more likely a historical accident copied from ppc. This patch removes the duplication, making pcibios_enable_device() simply call pcibios_enable_resources() as x86 does. Signed-off-by: Bjorn Helgaas [EMAIL PROTECTED] Index: work6/arch/xtensa/kernel/pci.c === --- work6.orig/arch/xtensa/kernel/pci.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/xtensa/kernel/pci.c 2008-02-18 11:32:12.0 -0700 @@ -238,31 +238,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask) { - u16 cmd, old_cmd; - int idx; - struct resource *r; - - pci_read_config_word(dev, PCI_COMMAND, cmd); - old_cmd = cmd; - for (idx=0; idx6; idx++) { - r = dev-resource[idx]; - if (!r-start r-end) { - printk(KERN_ERR PCI: Device %s not available because - of resource collisions\n, pci_name(dev)); - return -EINVAL; - } - if (r-flags IORESOURCE_IO) - cmd |= PCI_COMMAND_IO; - if (r-flags IORESOURCE_MEM) - cmd |= PCI_COMMAND_MEMORY; - } - if (cmd != old_cmd) { - printk(PCI: Enabling device %s (%04x - %04x)\n, - pci_name(dev), old_cmd, cmd); - pci_write_config_word(dev, PCI_COMMAND, cmd); - } - - return 0; + return pcibios_enable_resources(dev, mask); } #ifdef CONFIG_PROC_FS -- ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[patch 4/4] RFC: PCI: consolidate several pcibios_enable_resources() implementations
There are many implementations of pcibios_enable_resources() that differ in minor ways that look more like bugs than architectural differences. This patch consolidates most of them to use the x86 version. This patch is for discussion only at this point. I'm interested in feedback about whether any of the differences are real and need to be preserved. ARM and PA-RISC, in particular, have interesting differences: - ARM always enables bridge devices, which no other arch does - PA-RISC always turns on SERR and PARITY, which no other arch does Should other arches do the same thing, or are these somehow related to ARM and PA-RISC architecture? Here's the x86 version, which seems most complete and up-to-date: int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, old_cmd; int i; struct resource *r; (0) pci_read_config_word(dev, PCI_COMMAND, cmd); old_cmd = cmd; (1) for (i = 0; i PCI_NUM_RESOURCES; i++) { (2) if (!(mask (1 i))) continue; r = dev-resource[i]; (3) if (!(r-flags (IORESOURCE_IO | IORESOURCE_MEM))) continue; (4) if ((i == PCI_ROM_RESOURCE) (!(r-flags IORESOURCE_ROM_ENABLE))) continue; (5) if (!r-start r-end) { dev_err(dev-dev, device not available because of resource %d collisions\n, i); return -EINVAL; } if (r-flags IORESOURCE_IO) cmd |= PCI_COMMAND_IO; if (r-flags IORESOURCE_MEM) cmd |= PCI_COMMAND_MEMORY; } (6) (7) if (cmd != old_cmd) { dev_info(dev-dev, enabling device (%04x - %04x)\n, old_cmd, cmd); pci_write_config_word(dev, PCI_COMMAND, cmd); } return 0; } Compared with the x86 version, other architectures have the following functional differences: alpha: ignores mask at (2), has no PCI_ROM_RESOURCE check at (4), has no collision check at (5) arm: checks only 6 resources at (1), has no PCI_ROM_RESOURCE check at (4), always fully enables bridges at (6) cris: checks only 6 resources at (1), has a different ROM resource check at (4) and (6) that ignores IORESOURCE_ROM_ENABLE frv: checks only 6 resources at (1), has a different ROM resource check at (4) and (6) that ignores IORESOURCE_ROM_ENABLE ia64: checks for NULL dev at (0) mips: has no IORESOURCE_{IO,MEM} check at (3), has a different ROM resource check at (4) and (6) that ignores IORESOURCE_ROM_ENABLE mn10300: checks only 6 resources at (1), has no IORESOURCE_{IO,MEM} check at (3), has a different ROM resource check at (4) and (6) that ignores IORESOURCE_ROM_ENABLE parisc: checks DEVICE_COUNT_RESOURCE (12) instead of PCI_NUM_RESOURCES (11) resources at (1), has no IORESOURCE_{IO,MEM} check at (3), has no PCI_ROM_RESOURCE check at (4), has no collision check at (5) always turns on PCI_COMMAND_SERR | PCI_COMMAND_PARITY at (6), writes cmd even if unchanged at (7) powerpc: has a different collision check at (5) ppc: checks only 6 resources at (1), has no IORESOURCE_{IO,MEM} check at (3), has a different ROM resource check at (4) and (6) that ignores IORESOURCE_ROM_ENABLE, has a different collision check using IORESOURCE_UNSET at (5) sh: checks only 6 resources at (1), has no IORESOURCE_{IO,MEM} check at (3), has a different ROM resource check at (4) and (6) that ignores IORESOURCE_ROM_ENABLE sparc64: has no IORESOURCE_{IO,MEM} check at (3), has no PCI_ROM_RESOURCE check at (4) v850: checks only 6 resources at (1), has no IORESOURCE_{IO,MEM} check at (3), has no PCI_ROM_RESOURCE check at (4) xtensa: checks only 6 resources at (1), has no IORESOURCE_{IO,MEM} check at (3), has a different ROM resource check at (4) and (6) that ignores IORESOURCE_ROM_ENABLE The mips/pmc-sierra implementation of pcibios_enable_resources() is cluttered with a bunch of titan stuff, so I can't immediately consolidate it with the others. So I made the generic version weak so pmc-sierra can override it. Not-Yet-Signed-off-by: Bjorn Helgaas [EMAIL PROTECTED] --- arch/alpha/kernel/pci.c | 27 - arch/arm/kernel/bios32.c| 43 - arch/cris/arch-v32/drivers/pci/bios.c | 32 arch/frv/mb93090-mb00/pci-frv.c | 32 arch/ia64/pci/pci.c | 42 - arch/mips/pci/pci.c | 32 arch/mn10300/unit-asb2305/pci-asb2305.c | 39
[patch 2/4] ppc: make pcibios_enable_device() use pcibios_enable_resources()
pcibios_enable_device() has an almost verbatim copy of pcibios_enable_resources(), (the only difference is that pcibios_enable_resources() turns on PCI_COMMAND_MEMORY if there's a ROM resource). The duplication might be intentional, but I don't see any callers of pcibios_enable_resources() on ppc, so I think it's more likely a historical accident. This patch removes the duplication, making pcibios_enable_device() simply call pcibios_enable_resources() as x86 does. Signed-off-by: Bjorn Helgaas [EMAIL PROTECTED] Index: work6/arch/ppc/kernel/pci.c === --- work6.orig/arch/ppc/kernel/pci.c2008-02-18 10:43:50.0 -0700 +++ work6/arch/ppc/kernel/pci.c 2008-02-18 11:31:23.0 -0700 @@ -785,33 +785,11 @@ int pcibios_enable_device(struct pci_dev *dev, int mask) { - u16 cmd, old_cmd; - int idx; - struct resource *r; - if (ppc_md.pcibios_enable_device_hook) if (ppc_md.pcibios_enable_device_hook(dev, 0)) return -EINVAL; - - pci_read_config_word(dev, PCI_COMMAND, cmd); - old_cmd = cmd; - for (idx=0; idx6; idx++) { - r = dev-resource[idx]; - if (r-flags IORESOURCE_UNSET) { - printk(KERN_ERR PCI: Device %s not available because of resource collisions\n, pci_name(dev)); - return -EINVAL; - } - if (r-flags IORESOURCE_IO) - cmd |= PCI_COMMAND_IO; - if (r-flags IORESOURCE_MEM) - cmd |= PCI_COMMAND_MEMORY; - } - if (cmd != old_cmd) { - printk(PCI: Enabling device %s (%04x - %04x)\n, - pci_name(dev), old_cmd, cmd); - pci_write_config_word(dev, PCI_COMMAND, cmd); - } - return 0; + + return pcibios_enable_resources(dev, mask); } struct pci_controller* -- ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [patch 0/4] RFC: PCI: consolidate several pcibios_enable_resources() implementations
On Mon, Feb 18, 2008 at 09:39:52PM -0700, Bjorn Helgaas wrote: - PA-RISC always turns on SERR and PARITY, which no other arch does I suspect this is because we set the host bus adapters to hard fail (HPMC) on detecting an error, since we don't want to 1) return possibly bogus (-1) data 2) write code to use the (undocumented) error detection More to the point, I suspect it's extra paranoia because firmware has set it up this way. I put in a quick hack to test whether those bits were set, and they came enabled that way by firmware on all the boxes I tested. (Disclaimer, I didn't have any easily accessible boxes with add-on cards installed, so firmware might just set it up for core devices, and we're making sure its set everywhere.) cheers, Kyle ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Tue, Feb 19, 2008 at 01:33:53PM +1100, Nick Piggin wrote: Note in particular the last predictors; assuming branch ending with goto, including call, causing early function return or returning negative constant are not taken. Just these alone are likely 95+% of the unlikelies in the kernel. Yes, gcc should be able to do pretty good heuristics, considering the quite good numbers that cold CPU predictors can attain. However for really performance critical code (or really never executed code), then I think it is OK to have the hints and not have to rely on gcc heuristics. in my experience, the real problem is that gcc does what *it* wants and not what *you* want. I've been annoyed a lot by the way it coded some loops that could really be blazingly fast, but which resulted in a ton of branches due to its predictors. And using unlikely() there was a real mess, because instead of just hinting the compiler with probabilities to write some linear code for the *most* common case, it ended up with awful branches everywhere with code sent far away and even duplicated for some branches. Sometimes, for performance critical paths, I would like gcc to be dumb and follow *my* code and not its hard-coded probabilities. For instance, in a tree traversal, you really know how you want to build your loop. And these days, it seems like the single method of getting it your way is doing asm, which obviously is not portable :-( Maybe one thing we would need would be the ability to assign probabilities to each branch based on what we expect, so that gcc could build a better tree keeping most frequently used code tight. Hmm I've just noticed -fno-guess-branch-probability in the man, I never tried it. regards, Willy ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [patch 4/4] RFC: PCI: consolidate several pcibios_enable_resources() implementations
On Mon, 2008-02-18 at 21:39 -0700, Bjorn Helgaas wrote: powerpc: has a different collision check at (5) I've always found the collision check dodgy. I tend to want to keep the way powerpc does it here. pci_enable_device() should only enable resources that have successfully been added to the resource tree (that have passed all the collision check etc...). There is a simple clear indication of that: res-parent is non-NULL. I think that is a better check than the test x86 does on start and end. That is, whatever the arch code decides to use to decide whether resources are assigned by firmware or by the first pass assignment code or not and collide or not, once that phase is finished (which is the case when calling pcibios_enable_device(), having the resource in the resource-tree or not is, I believe, the proper way to test whether it's a useable resource. Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [patch 4/4] RFC: PCI: consolidate several pcibios_enable_resources() implementations
Index: work6/drivers/pci/Makefile === --- work6.orig/drivers/pci/Makefile 2008-02-18 21:16:36.0 -0700 +++ work6/drivers/pci/Makefile2008-02-18 21:16:38.0 -0700 @@ -2,7 +2,7 @@ # Makefile for the PCI bus specific drivers. # -obj-y+= access.o bus.o probe.o remove.o pci.o quirks.o \ +obj-y+= access.o bios.o bus.o probe.o remove.o pci.o quirks.o \ pci-driver.o search.o pci-sysfs.o rom.o setup-res.o obj-$(CONFIG_PROC_FS) += proc.o Index: work6/drivers/pci/bios.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ work6/drivers/pci/bios.c 2008-02-18 21:16:38.0 -070 ^^ Yuck :-) Please, don't call this bios ... whatever is in this file really has nothing to do with a BIOS in any shape or form :-) I know we used to call those things pcibios_* but that's really historical... If you want to make clear it's for utilities that can be overriden by the arch, maybe call it utils.c, or just stick the function in pci.c, or setup-res.c Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Tuesday 19 February 2008 16:58, Willy Tarreau wrote: On Tue, Feb 19, 2008 at 01:33:53PM +1100, Nick Piggin wrote: Note in particular the last predictors; assuming branch ending with goto, including call, causing early function return or returning negative constant are not taken. Just these alone are likely 95+% of the unlikelies in the kernel. Yes, gcc should be able to do pretty good heuristics, considering the quite good numbers that cold CPU predictors can attain. However for really performance critical code (or really never executed code), then I think it is OK to have the hints and not have to rely on gcc heuristics. in my experience, the real problem is that gcc does what *it* wants and not what *you* want. I've been annoyed a lot by the way it coded some loops that could really be blazingly fast, but which resulted in a ton of branches due to its predictors. And using unlikely() there was a real mess, because instead of just hinting the compiler with probabilities to write some linear code for the *most* common case, it ended up with awful branches everywhere with code sent far away and even duplicated for some branches. Sometimes, for performance critical paths, I would like gcc to be dumb and follow *my* code and not its hard-coded probabilities. For instance, in a tree traversal, you really know how you want to build your loop. And these days, it seems like the single method of getting it your way is doing asm, which obviously is not portable :-( Probably all true. Maybe one thing we would need would be the ability to assign probabilities to each branch based on what we expect, so that gcc could build a better tree keeping most frequently used code tight. I don't know if that would *directly* lead to gcc being smarter. I think perhaps they probably don't benchmark on code bases that have much explicit annotation (I'm sure they wouldn't seriously benchmark any parts of Linux as part of daily development). I think the key is to continue to use annotations _properly_, and eventually gcc should go in the right direction if enough code uses it. And if you have really good examples like it sounds like above, then I guess that should be reported to gcc? ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [patch 2/4] ppc: make pcibios_enable_device() use pcibios_enable_resources()
On Mon, 2008-02-18 at 21:39 -0700, Bjorn Helgaas wrote: plain text document attachment (ppc-pcibios_enable_resources) pcibios_enable_device() has an almost verbatim copy of pcibios_enable_resources(), (the only difference is that pcibios_enable_resources() turns on PCI_COMMAND_MEMORY if there's a ROM resource). The duplication might be intentional, but I don't see any callers of pcibios_enable_resources() on ppc, so I think it's more likely a historical accident. This patch removes the duplication, making pcibios_enable_device() simply call pcibios_enable_resources() as x86 does. Signed-off-by: Bjorn Helgaas [EMAIL PROTECTED] Ack. arch/ppc is being phased out soon anyway. Ben. Index: work6/arch/ppc/kernel/pci.c === --- work6.orig/arch/ppc/kernel/pci.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/ppc/kernel/pci.c 2008-02-18 11:31:23.0 -0700 @@ -785,33 +785,11 @@ int pcibios_enable_device(struct pci_dev *dev, int mask) { - u16 cmd, old_cmd; - int idx; - struct resource *r; - if (ppc_md.pcibios_enable_device_hook) if (ppc_md.pcibios_enable_device_hook(dev, 0)) return -EINVAL; - - pci_read_config_word(dev, PCI_COMMAND, cmd); - old_cmd = cmd; - for (idx=0; idx6; idx++) { - r = dev-resource[idx]; - if (r-flags IORESOURCE_UNSET) { - printk(KERN_ERR PCI: Device %s not available because of resource collisions\n, pci_name(dev)); - return -EINVAL; - } - if (r-flags IORESOURCE_IO) - cmd |= PCI_COMMAND_IO; - if (r-flags IORESOURCE_MEM) - cmd |= PCI_COMMAND_MEMORY; - } - if (cmd != old_cmd) { - printk(PCI: Enabling device %s (%04x - %04x)\n, -pci_name(dev), old_cmd, cmd); - pci_write_config_word(dev, PCI_COMMAND, cmd); - } - return 0; + + return pcibios_enable_resources(dev, mask); } struct pci_controller* ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [patch 1/4] PCI: split pcibios_enable_resources() out of pcibios_enable_device()
On Mon, 2008-02-18 at 21:39 -0700, Bjorn Helgaas wrote: plain text document attachment (make-pcibios_enable_resources) On x86, pcibios_enable_device() is factored into pcibios_enable_resources() and pcibios_enable_irq(). On several other architectures, the functional equivalent of pcibios_enable_resources() is expanded directly inside pcibios_enable_device(). This splits these pcibios_enable_device() implementations to make them more similar to the x86 implementation. There should be no functional change from this patch. Signed-off-by: Bjorn Helgaas [EMAIL PROTECTED] Acked-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- arch/alpha/kernel/pci.c |8 +++- arch/arm/kernel/bios32.c |9 +++-- arch/parisc/kernel/pci.c |6 +- arch/powerpc/kernel/pci-common.c | 14 +- arch/sh/drivers/pci/pci.c|7 ++- arch/sparc64/kernel/pci.c|7 ++- arch/v850/kernel/rte_mb_a_pci.c |7 ++- 7 files changed, 46 insertions(+), 12 deletions(-) Index: work6/arch/alpha/kernel/pci.c === --- work6.orig/arch/alpha/kernel/pci.c2008-02-18 10:43:50.0 -0700 +++ work6/arch/alpha/kernel/pci.c 2008-02-18 10:45:14.0 -0700 @@ -370,7 +370,7 @@ #endif int -pcibios_enable_device(struct pci_dev *dev, int mask) +pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, oldcmd; int i; @@ -396,6 +396,12 @@ return 0; } +int +pcibios_enable_device(struct pci_dev *dev, int mask) +{ + return pcibios_enable_resources(dev, mask); +} + /* * If we set up a device for bus mastering, we need to check the latency * timer as certain firmware forgets to set it properly, as seen Index: work6/arch/arm/kernel/bios32.c === --- work6.orig/arch/arm/kernel/bios32.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/arm/kernel/bios32.c2008-02-18 10:45:14.0 -0700 @@ -655,10 +655,10 @@ } /** - * pcibios_enable_device - Enable I/O and memory. + * pcibios_enable_resources - Enable I/O and memory. * @dev: PCI device to be enabled */ -int pcibios_enable_device(struct pci_dev *dev, int mask) +int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, old_cmd; int idx; @@ -697,6 +697,11 @@ return 0; } +int pcibios_enable_device(struct pci_dev *dev, int mask) +{ + return pcibios_enable_resources(dev, mask); +} + int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, enum pci_mmap_state mmap_state, int write_combine) { Index: work6/arch/parisc/kernel/pci.c === --- work6.orig/arch/parisc/kernel/pci.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/parisc/kernel/pci.c2008-02-18 10:45:14.0 -0700 @@ -285,7 +285,7 @@ * Drivers that do not need parity (eg graphics and possibly networking) * can clear these bits if they want. */ -int pcibios_enable_device(struct pci_dev *dev, int mask) +int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd; int idx; @@ -317,6 +317,10 @@ return 0; } +int pcibios_enable_device(struct pci_dev *dev, int mask) +{ + return pcibios_enable_resources(dev, mask); +} /* PA-RISC specific */ void pcibios_register_hba(struct pci_hba_data *hba) Index: work6/arch/powerpc/kernel/pci-common.c === --- work6.orig/arch/powerpc/kernel/pci-common.c 2008-02-18 10:43:50.0 -0700 +++ work6/arch/powerpc/kernel/pci-common.c2008-02-18 10:45:14.0 -0700 @@ -1153,16 +1153,12 @@ EXPORT_SYMBOL_GPL(pcibios_claim_one_bus); #endif /* CONFIG_HOTPLUG */ -int pcibios_enable_device(struct pci_dev *dev, int mask) +int pcibios_enable_resources(struct pci_dev *dev, int mask) { u16 cmd, old_cmd; int idx; struct resource *r; - if (ppc_md.pcibios_enable_device_hook) - if (ppc_md.pcibios_enable_device_hook(dev)) - return -EINVAL; - pci_read_config_word(dev, PCI_COMMAND, cmd); old_cmd = cmd; for (idx = 0; idx PCI_NUM_RESOURCES; idx++) { @@ -1193,3 +1189,11 @@ return 0; } +int pcibios_enable_device(struct pci_dev *dev, int mask) +{ + if (ppc_md.pcibios_enable_device_hook) + if (ppc_md.pcibios_enable_device_hook(dev)) + return -EINVAL; + + return pcibios_enable_resources(dev, mask); +} Index: work6/arch/sh/drivers/pci/pci.c === --- work6.orig/arch/sh/drivers/pci/pci.c 2008-02-18 10:43:50.0 -0700 +++
Re: [PATCH 1/3] Fix Unlikely(x) == y
On Tue, Feb 19, 2008 at 08:46:03AM +1100, Michael Ellerman wrote: On Mon, 2008-02-18 at 16:13 +0200, Adrian Bunk wrote: On Mon, Feb 18, 2008 at 03:01:35PM +0100, Geert Uytterhoeven wrote: On Mon, 18 Feb 2008, Adrian Bunk wrote: This means it generates faster code with a current gcc for your platform. But a future gcc might e.g. replace the whole loop with a division (gcc SVN head (that will soon become gcc 4.3) already does transformations like replacing loops with divisions [1]). Hence shouldn't we ask the gcc people what's the purpose of __builtin_expect(), if it doesn't live up to its promise? That's a different issue. My point here is that we do not know how the latest gcc available in the year 2010 might transform this code, and how a likely/unlikely placed there might influence gcc's optimizations then. You're right, we don't know. But if giving the compiler _more_ information causes it to produce vastly inferior code then we should be filing gcc bugs. After all the unlikely/likely is just a hint, if gcc knows better it can always ignore it. It's the other way round, gcc assumes that you know better than gcc when you give it a __builtin_expect(). The example you gave had only a 1:3 ratio, which is far outside of the ratios where __builtin_expect() should be used. What if you gave this annotation for the 1:3 case and gcc generates code that performs better for ratios 1:1000 but much worse for a 1:3 ratio since your hint did override a better estimate of gcc? And I'm sure that 90% of all kernel developers (including me) are worse in such respects than the gcc heuristics. I'm a firm believer in the following: - it's the programmer's job to write clean and efficient C code - it's the compiler's job to convert C code into efficient assembler code The stable interface between the programmer and the compiler is C, and when the programmer starts manually messing with internals of the compiler that's a layering violation that requires a _good_ justification. With a good justification not consisting of some microbenchmark but of measurements of the actual annotations in the kernel code. cheers cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [patch 4/4] RFC: PCI: consolidate several pcibios_enable_resources() implementations
On Mon, Feb 18, 2008 at 09:39:56PM -0700, Bjorn Helgaas wrote: parisc: checks DEVICE_COUNT_RESOURCE (12) instead of PCI_NUM_RESOURCES (11) resources at (1), Good catch. has no IORESOURCE_{IO,MEM} check at (3), What else can it be? has no PCI_ROM_RESOURCE check at (4), has no collision check at (5) always turns on PCI_COMMAND_SERR | PCI_COMMAND_PARITY at (6), writes cmd even if unchanged at (7) I'll have to check into 4 5, there might be a good reason for it. For instance on a four port HP ethernet card (pci-pci bridge + 4 tulips) all 4 of the rom resources are mapped to the same address, which afaict, is allowed but breaks things in mysterious and subtle ways. That said, the parisc pci code is a rats nest... cheers, Kyle ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev