Re: [PATCH -mm] Blackfin: blackfin i2c driver
On Wed, 07 Mar 2007 15:39:27 +0800 "Wu, Bryan" <[EMAIL PROTECTED]> wrote:

> Thanks a lot, could you please give me a script just to kill this
> whitespace? So I can do it before sending you patches.

Is pretty simple:

#!/bin/sh
#
# Strip any trailing whitespace which a unified diff adds.
#

strip1()
{
	TMP=$(mktemp /tmp/XX)
	cp $1 $TMP
	sed -e '/^+/s/[ ]*$//' < $TMP > $1
	rm $TMP
}

for i in $*
do
	strip1 $i
done

that'll be in
http://www.zip.com.au/~akpm/linux/patches/patch-scripts-0.20/patch-scripts-0.20.tar.gz
too
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
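For anyone who would rather apply the same rule from C than from sed, here is a minimal sketch of what the substitution in the script does to a single line of a patch: only lines that start with '+' are touched, and everything else passes through unchanged. The function name strip_added_line is hypothetical, and treating tabs as well as spaces as trailing whitespace is an assumption on my part.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of patch-scripts): strip trailing blanks
 * from one line of a unified diff, but only if it is an added line,
 * i.e. it starts with '+'. The line is edited in place and a trailing
 * newline, if present, is preserved.
 * Assumption: both ' ' and '\t' count as trailing whitespace.
 */
static char *strip_added_line(char *line)
{
	size_t len = strlen(line);
	int had_newline = 0;

	if (line[0] != '+')
		return line;	/* context and removed lines stay untouched */

	if (len > 0 && line[len - 1] == '\n') {
		had_newline = 1;
		len--;
	}

	/* Walk back over blanks; the leading '+' is never a blank. */
	while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
		len--;

	if (had_newline)
		line[len++] = '\n';
	line[len] = '\0';
	return line;
}
```

Feeding every line of a patch file through this helper should give the same result as the script, under the assumptions stated above.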
Re: + stupid-hack-to-make-mainline-build.patch added to -mm tree
On Tue, 2007-03-06 at 17:44 -0800, Dan Hecht wrote:
> >>> 2) As I said above. The time accounting for virtualization needs to be
> >>> fixed in a generic way.
> >>>
> >>> I'm not going to accept some weird hackery for virtualization, which is
> >>> of exactly ZERO value for the kernel itself. Quite the contrary it will
> >>> make the cleanup harder and introduce another hard to remove thing,
> >>> which will in the worst case last for ever.
> >>>
> >> Okay, to confirm I'm on the same page as you, you want to move process
> >> time accounting from being periodic sampled based to being trace based?
> >> i.e. at the system-call/interrupt boundaries, read clocksource and
> >> compute directly the amount of system/user/process time?
> >
> > At least for the paravirt guests this is the correct approach. Once the
> > CPU vendors come up with a sane solution for a reliable and fast clock
> > source we might use that on real hardware as well.
> >
>
> I thought your preference was to not do things differently from real
> hardware? I guess this case you are okay with since you'd like to see
> the real hardware case follow eventually?

Real hardware _IS_ broken and slow. If we add the facilities for
virtualization we want it in a way, which is usable by real hardware as
well.

> > Yes, with todays hardware it is simply a PITA. PowerPC has some basic
> > support for this though, IIRC.
> >
>
> I think S390 maybe too.

One more reason to make it a generic solution rather than some extra
hackery.

	tglx
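To make the "trace based" idea from this thread concrete, here is a minimal userspace sketch: instead of charging a whole tick to whatever mode the CPU is in when the timer fires, the time since the last boundary is charged exactly at each user/system transition. All names here (task_times, account_transition, the nanosecond timestamps) are hypothetical illustrations, not kernel interfaces, and reading the clocksource is assumed to be done by the caller.

```c
#include <stdint.h>

/* Which side of the system-call/interrupt boundary the task is on. */
enum cpu_mode { MODE_USER, MODE_SYSTEM };

/* Hypothetical per-task accumulators, in nanoseconds. */
struct task_times {
	uint64_t user_ns;
	uint64_t system_ns;
	uint64_t last_stamp_ns;	/* clock value at the previous boundary */
	enum cpu_mode mode;	/* mode the task has been in since then */
};

static void task_times_init(struct task_times *t, uint64_t now_ns)
{
	t->user_ns = 0;
	t->system_ns = 0;
	t->last_stamp_ns = now_ns;
	t->mode = MODE_USER;
}

/*
 * Called at every boundary with a fresh clock reading: charge the elapsed
 * interval to the mode that was active, then switch to the new mode.
 */
static void account_transition(struct task_times *t, uint64_t now_ns,
			       enum cpu_mode new_mode)
{
	uint64_t delta = now_ns - t->last_stamp_ns;

	if (t->mode == MODE_USER)
		t->user_ns += delta;
	else
		t->system_ns += delta;

	t->last_stamp_ns = now_ns;
	t->mode = new_mode;
}
```

The cost of this scheme is one clock read per boundary, which is why the thread ties it to the availability of a fast and reliable clock source.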
Re: [PATCH -mm] Blackfin: blackfin i2c driver
On Tue, 2007-03-06 at 23:14 -0800, Andrew Morton wrote:
> On Wed, 7 Mar 2007 07:58:22 +0100 Jean Delvare <[EMAIL PROTECTED]> wrote:
>
> > > +config BFIN_SDA
> >
> > I2C_BLACKFIN_SDA
>
> The blackfin architecture uses "bfin" pretty much universally, so this
> usage is consistent.
>
> box:/usr/src/25> grep -i blackfin patches/blackfin*|wc -l
> 1608
> box:/usr/src/25> grep -i bfin patches/blackfin*|wc -l
> 6198
>

Thanks for your understanding, but we now want to move to the
CONFIG_BLACKFIN options. There is a new task in our development plan to
change things to CONFIG_BLACKFIN. At this moment we provide both
CONFIG_BFIN and CONFIG_BLACKFIN. Once everything that relies on
CONFIG_BFIN/bfin has been changed to CONFIG_BLACKFIN/blackfin, CONFIG_BFIN
will be removed. So here I will follow Jean's comments.

> Let's just hope nobody makes a bluefin.

So it is ok for both blackfin and bluefin. But I think Black is cooler
than Blue. -:)

>
> > > + range 0 15 if (BF533 || BF532 || BF531)
> >
> > Trailing whitespace.
>
> I always remove that when merging a patch.

Thanks a lot, could you please give me a script just to kill this
whitespace? So I can do it before sending you patches.

Thanks Jean and Andrew.
-Bryan
Re: [Bug 8136] 2.6.21-rc2-mm2 won't boot
On Tuesday 06 March 2007 at 16:15 -0800, Andrew Morton wrote:
> So rc2-mm2 panics due to "MP-BIOS bug: 8254 timer not connected to IO-APIC"
> and rc2-mm1 does not.
>
> Could be ACPI, could be x86_64 timer changes, could be something else.
>
> Would you have time to bisect it?
> http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> explains how.
>
> If so, I'd suggest you drill in on the patches between
> x86_64-mm-defconfig-update.patch and
> optimize-and-simplify-get_cycles_sync.patch: the x86 changes.

I may have some more debug time this evening (CET), probably not enough
for a full bisection. I'd really love to have timer/clock problems nailed
once and for all on this box (MP BIOS, RTC, HPET, whatever)

-- 
Nicolas Mailhot
[PATCH 8/20] x86_64: 64bit PIC SMP trampoline
This modifies the SMP trampoline and all of the associated code so
it can jump to a 64bit kernel loaded at an arbitrary address.

The dependencies on having an identity mapped page in the kernel
page tables for SMP bootup have all been removed.

In addition the trampoline has been modified to verify
that long mode is supported. Asking if long mode is implemented is
downright silly but we have traditionally had some of these checks,
and they can't hurt anything. So when the totally ludicrous happens
we just might handle it correctly.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S       |    1
 arch/x86_64/kernel/setup.c      |    9 --
 arch/x86_64/kernel/trampoline.S |  168
 3 files changed, 156 insertions(+), 22 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-64bit-PIC-SMP-trampoline arch/x86_64/kernel/head.S
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/head.S~x86_64-64bit-PIC-SMP-trampoline	2007-03-07 01:25:32.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/head.S	2007-03-07 01:25:32.0 +0530
@@ -101,6 +101,7 @@ startup_32:
 	.org 0x100
 	.globl startup_64
 startup_64:
+ENTRY(secondary_startup_64)
 	/* We come here either from startup_32
 	 * or directly from a 64bit bootloader.
 	 * Since we may have come directly from a bootloader we
diff -puN arch/x86_64/kernel/setup.c~x86_64-64bit-PIC-SMP-trampoline arch/x86_64/kernel/setup.c
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/setup.c~x86_64-64bit-PIC-SMP-trampoline	2007-03-07 01:25:32.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/setup.c	2007-03-07 01:25:32.0 +0530
@@ -329,15 +329,8 @@ void __init setup_arch(char **cmdline_p)
 #endif

 #ifdef CONFIG_SMP
-	/*
-	 * But first pinch a few for the stack/trampoline stuff
-	 * FIXME: Don't need the extra page at 4K, but need to fix
-	 * trampoline before removing it.
(see the GDT stuff) -*/ - reserve_bootmem_generic(PAGE_SIZE, PAGE_SIZE); - /* Reserve SMP trampoline */ - reserve_bootmem_generic(SMP_TRAMPOLINE_BASE, PAGE_SIZE); + reserve_bootmem_generic(SMP_TRAMPOLINE_BASE, 2*PAGE_SIZE); #endif #ifdef CONFIG_ACPI_SLEEP diff -puN arch/x86_64/kernel/trampoline.S~x86_64-64bit-PIC-SMP-trampoline arch/x86_64/kernel/trampoline.S --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/trampoline.S~x86_64-64bit-PIC-SMP-trampoline 2007-03-07 01:25:32.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/trampoline.S 2007-03-07 01:25:32.0 +0530 @@ -3,6 +3,7 @@ * Trampoline.SDerived from Setup.S by Linus Torvalds * * 4 Jan 1997 Michael Chastain: changed to gnu as. + * 15 Sept 2005 Eric Biederman: 64bit PIC support * * Entry: CS:IP point to the start of our code, we are * in real mode with no stack, but the rest of the @@ -17,15 +18,20 @@ * and IP is zero. Thus, data addresses need to be absolute * (no relocation) and are taken with regard to r_base. * + * With the addition of trampoline_level4_pgt this code can + * now enter a 64bit kernel that lives at arbitrary 64bit + * physical addresses. + * * If you work on this file, check the object module with objdump * --full-contents --reloc to make sure there are no relocation - * entries. For the GDT entry we do hand relocation in smpboot.c - * because of 64bit linker limitations. + * entries. */ #include -#include +#include #include +#include +#include .data @@ -33,15 +39,31 @@ ENTRY(trampoline_data) r_base = . 
+ cli # We should be safe anyway wbinvd mov %cs, %ax# Code and data in the same place mov %ax, %ds + mov %ax, %es + mov %ax, %ss - cli # We should be safe anyway movl$0xA5A5A5A5, trampoline_data - r_base # write marker for master knows we're running + # Setup stack + movw$(trampoline_stack_end - r_base), %sp + + callverify_cpu # Verify the cpu supports long mode + + mov %cs, %ax + movzx %ax, %esi # Find the 32bit trampoline location + shll$4, %esi + + # Fixup the vectors + addl%esi, startup_32_vector - r_base + addl%esi, startup_64_vector - r_base + addl%esi, tgdt + 2 - r_base # Fixup the gdt pointer + /* * GDT tables in non default location kernel can be beyond 16MB and * lgdt will not be able to load the address as in real mode default @@ -49,23 +71,141 @@ r_base = . * to 32 bit. */ - lidtl idt_48 - r_base # load idt with 0, 0 -
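The fixup sequence in the trampoline above (mov %cs, shll $4, then addl %esi into each vector) is ordinary real-mode address arithmetic. A minimal C sketch of the same computation follows; the struct layout and the function names are hypothetical stand-ins for illustration only, since the actual vector definitions are cut off in this posting.

```c
#include <stdint.h>

/*
 * In real mode the linear address of a segment is the 16-bit segment
 * value shifted left by four bits. This mirrors the
 * "movzx %ax, %esi; shll $4, %esi" pair in the patch.
 */
static uint32_t real_mode_base(uint16_t cs)
{
	return (uint32_t)cs << 4;
}

/*
 * Hypothetical stand-in for a far-jump vector whose offset was assembled
 * relative to the start of the trampoline (r_base).
 */
struct far_vector32 {
	uint32_t offset;
	uint16_t selector;
};

/*
 * Turn a trampoline-relative offset into an absolute 32-bit address once
 * the runtime location of the trampoline is known. This is what
 * "addl %esi, startup_32_vector - r_base" does to the vector in place.
 */
static void fixup_vector(struct far_vector32 *v, uint32_t base)
{
	v->offset += base;
}
```

Because the base is computed from %cs at run time, the trampoline needs no link-time relocation entries, which is the property the header comment asks people to check with objdump.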
[PATCH 17/20] x86_64: __pa and __pa_symbol address space separation
Currently __pa_symbol is for use with symbols in the kernel address
map and __pa is for use with pointers into the physical memory map.
But the code is implemented so you can usually interchange the two.

__pa, which is much more common, can be implemented much more cheaply
if it doesn't have to worry about any other kernel address
spaces. This is especially true with a relocatable kernel as
__pa_symbol needs to perform an extra variable read to resolve
the address.

There is a third macro that is added for the vsyscall data,
__pa_vsymbol, for finding the physical addresses of vsyscall pages.

Most of this patch is simply sorting through the references to
__pa or __pa_symbol and using the proper one. A little of
it is continuing to use a physical address when we have it
instead of recalculating it several times.

swapper_pgd is now NULL. leave_mm now uses init_mm.pgd
and init_mm.pgd is initialized at boot (instead of compile time)
to the physmem virtual mapping of init_level4_pgd. The
physical address changed.

Except for the EMPTY_ZERO page all of the remaining references
to __pa_symbol appear to be during kernel initialization. So this
should reduce the cost of __pa in the common case, even on a relocated
kernel.

As this is technically a semantic change we need to be on the lookout
for anything I missed. But it works for me (tm).

Signed-off-by: Eric W.
Biederman <[EMAIL PROTECTED]> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> --- arch/i386/kernel/alternative.c |8 arch/i386/mm/init.c| 15 --- arch/x86_64/kernel/machine_kexec.c | 14 +++--- arch/x86_64/kernel/setup.c |9 + arch/x86_64/kernel/smp.c |2 +- arch/x86_64/kernel/vsyscall.c |9 +++-- arch/x86_64/mm/init.c | 21 +++-- arch/x86_64/mm/pageattr.c | 16 include/asm-x86_64/page.h |6 ++ include/asm-x86_64/pgtable.h |4 ++-- 10 files changed, 55 insertions(+), 49 deletions(-) diff -puN arch/i386/kernel/alternative.c~x86_64-__pa-and-__pa_symbol-address-space-separation arch/i386/kernel/alternative.c --- linux-2.6.21-rc2-reloc/arch/i386/kernel/alternative.c~x86_64-__pa-and-__pa_symbol-address-space-separation 2007-03-07 01:31:03.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/i386/kernel/alternative.c 2007-03-07 01:31:03.0 +0530 @@ -389,8 +389,8 @@ void __init alternative_instructions(voi if (no_replacement) { printk(KERN_INFO "(SMP-)alternatives turned off\n"); free_init_pages("SMP alternatives", - (unsigned long)__smp_alt_begin, - (unsigned long)__smp_alt_end); + __pa_symbol(&__smp_alt_begin), + __pa_symbol(&__smp_alt_end)); return; } @@ -419,8 +419,8 @@ void __init alternative_instructions(voi _text, _etext); } free_init_pages("SMP alternatives", - (unsigned long)__smp_alt_begin, - (unsigned long)__smp_alt_end); + __pa_symbol(&__smp_alt_begin), + __pa_symbol(&__smp_alt_end)); } else { alternatives_smp_save(__smp_alt_instructions, __smp_alt_instructions_end); diff -puN arch/i386/mm/init.c~x86_64-__pa-and-__pa_symbol-address-space-separation arch/i386/mm/init.c --- linux-2.6.21-rc2-reloc/arch/i386/mm/init.c~x86_64-__pa-and-__pa_symbol-address-space-separation 2007-03-07 01:31:03.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/i386/mm/init.c 2007-03-07 01:31:03.0 +0530 @@ -774,10 +774,11 @@ void free_init_pages(char *what, unsigne unsigned long addr; for (addr = begin; addr < end; addr += PAGE_SIZE) { - ClearPageReserved(virt_to_page(addr)); - 
init_page_count(virt_to_page(addr)); - memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE); - free_page(addr); + struct page *page = pfn_to_page(addr >> PAGE_SHIFT); + ClearPageReserved(page); + init_page_count(page); + memset(page_address(page), POISON_FREE_INITMEM, PAGE_SIZE); + __free_page(page); totalram_pages++; } printk(KERN_INFO "Freeing %s: %ldk freed\n", what, (end - begin) >> 10); @@ -786,14 +787,14 @@ void free_init_pages(char *what, unsigne void free_initmem(void) { free_init_pages("unused kernel memory", - (unsigned long)(&__init_begin), - (unsigned long)(&__init_end)); + __pa_symbol(&__init_begin), + __pa_symbol(&__init_end)); } #ifdef CO
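The split this patch describes can be pictured as two separate virtual-to-physical translations. The sketch below is illustration only: the base addresses, the phys_base variable and the function names are assumptions chosen for the example, not the definitions from include/asm-x86_64/page.h.

```c
#include <stdint.h>

/* Assumed base of the direct mapping of physical memory. */
#define DIRECT_MAP_BASE 0xffff810000000000ULL
/* Assumed base of the mapping that holds kernel text and data symbols. */
#define KERNEL_MAP_BASE 0xffffffff80000000ULL

/*
 * Assumed runtime variable: physical offset of the kernel image relative
 * to where it was linked. Zero when the kernel has not been relocated.
 */
static uint64_t phys_base;

/* __pa-style translation: a single subtraction, no memory read. */
static uint64_t pa_direct(uint64_t vaddr)
{
	return vaddr - DIRECT_MAP_BASE;
}

/*
 * __pa_symbol-style translation: needs the extra variable read that the
 * changelog mentions, because the image may have been relocated.
 */
static uint64_t pa_symbol(uint64_t vaddr)
{
	return vaddr - KERNEL_MAP_BASE + phys_base;
}
```

Keeping the common case free of the phys_base read is the cost saving the changelog is after; it also explains why mixing the two up stops working once they are separated.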
Re: [BUGFIX][PATCH] fix NULL pointer in ia64/irq_chip-mask/unmask function
On Tue, 6 Mar 2007 22:57:10 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 7 Mar 2007 15:23:17 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:
>
> > This patch fixes boot failure because irq_desc->mask() is NULL.
> >
> > - Added mask/unmask functions to ia64's irq desc function table.
> >   But I'm not sure this fix is correct or not. please review.
> >
> > - rename hw_interrupt_type to irq_chip. hw_interrupt_type is old name.
>
> Thanks.
>
> This bug is present in mainline too, isn't it?
>
Yes, I confirmed rc3 has this bug.

-Kame
[PATCH 19/20] x86_64: Extend bzImage protocol for relocatable bzImage
o Extend the bzImage protocol (same as i386) to allow bzImage loaders to
  load the protected mode kernel at non-1MB address. Now protected mode
  component is relocatable and can be loaded at non-1MB addresses.

o As of today kdump uses it to run a second kernel from a reserved memory
  area.

Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/boot/setup.S |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff -puN arch/x86_64/boot/setup.S~x86_64-extend-bzImage-protocol-for-relocatable-bzImage arch/x86_64/boot/setup.S
--- linux-2.6.21-rc2-reloc/arch/x86_64/boot/setup.S~x86_64-extend-bzImage-protocol-for-relocatable-bzImage	2007-03-07 01:32:01.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/boot/setup.S	2007-03-07 01:32:01.0 +0530
@@ -80,7 +80,7 @@ start:
 # This is the setup header, and it must start at %cs:2 (old 0x9020:2)

 		.ascii	"HdrS"		# header signature
-		.word	0x0204		# header version number (>= 0x0105)
+		.word	0x0205		# header version number (>= 0x0105)
 					# or else old loadlin-1.5 will fail)
 realmode_swtch:	.word	0, 0		# default_switch, SETUPSEG
 start_sys_seg:	.word	SYSSEG
@@ -155,7 +155,16 @@ cmd_line_ptr:	.long 0		# (Header versio
 					# low memory 0x1 or higher.

 ramdisk_max:	.long 0x
-
+kernel_alignment:	.long 0x20	# physical addr alignment required for
+					# protected mode relocatable kernel
+#ifdef CONFIG_RELOCATABLE
+relocatable_kernel:	.byte 1
+#else
+relocatable_kernel:	.byte 0
+#endif
+pad2:			.byte 0
+pad3:			.word 0
+
 trampoline:	call	start_of_setup
 		.align 16
 					# The offset at this point is 0x240
_
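From the loader's side, the two new header fields would be consumed roughly as follows. This is a sketch under stated assumptions, not code from any real bootloader: the struct is a hypothetical in-memory view holding only the fields this patch touches, the function name is made up, and the alignment is assumed to be a power of two.

```c
#include <stdint.h>

/* Traditional load address of the protected mode kernel (1MB). */
#define LEGACY_LOAD_ADDR 0x100000u

/* Hypothetical view of the setup header fields relevant here. */
struct setup_header_view {
	uint16_t version;		/* header version, 0x0205 with this patch */
	uint32_t kernel_alignment;	/* required physical alignment */
	uint8_t  relocatable_kernel;	/* non-zero if the kernel may be moved */
};

/*
 * Pick the physical address to load the protected mode kernel at.
 * Loaders fall back to 1MB unless the header is new enough and says the
 * kernel is relocatable; otherwise the wanted address is rounded up to
 * the advertised alignment (assumed to be a power of two).
 */
static uint32_t choose_load_addr(const struct setup_header_view *h,
				 uint32_t wanted)
{
	uint32_t align = h->kernel_alignment;

	if (h->version < 0x0205 || !h->relocatable_kernel || align == 0)
		return LEGACY_LOAD_ADDR;

	return (wanted + align - 1) & ~(align - 1);
}
```

A kdump-style loader would pass the start of its reserved memory region as the wanted address; an older loader that never looks at these fields keeps loading at 1MB as before.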
[PATCH 7/20] x86_64: Add EFER to the register set saved by save_processor_state
EFER varies like %cr4 depending on the cpu capabilities, and which cpu
capabilities we want to make use of. So save/restore it to make certain
we have the same EFER value when we are done.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/suspend.c |    3 ++-
 include/asm-x86_64/suspend.h |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff -puN arch/x86_64/kernel/suspend.c~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state arch/x86_64/kernel/suspend.c
--- linux-2.6.19-rc6-reloc/arch/x86_64/kernel/suspend.c~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state	2006-11-17 00:08:16.0 -0500
+++ linux-2.6.19-rc6-reloc-root/arch/x86_64/kernel/suspend.c	2006-11-17 00:08:16.0 -0500
@@ -33,7 +33,6 @@ void __save_processor_state(struct saved
 	asm volatile ("str %0"  : "=m" (ctxt->tr));

 	/* XMM0..XMM15 should be handled by kernel_fpu_begin(). */
-	/* EFER should be constant for kernel version, no need to handle it. */
 	/*
 	 * segment registers
 	 */
@@ -50,6 +49,7 @@ void __save_processor_state(struct saved
 	/*
 	 * control registers
 	 */
+	rdmsrl(MSR_EFER, ctxt->efer);
 	asm volatile ("movq %%cr0, %0" : "=r" (ctxt->cr0));
 	asm volatile ("movq %%cr2, %0" : "=r" (ctxt->cr2));
 	asm volatile ("movq %%cr3, %0" : "=r" (ctxt->cr3));
@@ -75,6 +75,7 @@ void __restore_processor_state(struct sa
 	/*
 	 * control registers
 	 */
+	wrmsrl(MSR_EFER, ctxt->efer);
 	asm volatile ("movq %0, %%cr8" :: "r" (ctxt->cr8));
 	asm volatile ("movq %0, %%cr4" :: "r" (ctxt->cr4));
 	asm volatile ("movq %0, %%cr3" :: "r" (ctxt->cr3));
diff -puN include/asm-x86_64/suspend.h~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state include/asm-x86_64/suspend.h
--- linux-2.6.19-rc6-reloc/include/asm-x86_64/suspend.h~x86_64-Add-EFER-to-the-set-registers-saved-by-save_processor_state	2006-11-17 00:08:16.0 -0500
+++ linux-2.6.19-rc6-reloc-root/include/asm-x86_64/suspend.h	2006-11-17 00:08:16.0 -0500
@@ -17,6 +17,7 @@ struct saved_context {
 	u16 ds, es, fs, gs, ss;
 	unsigned long gs_base, gs_kernel_base, fs_base;
 	unsigned long cr0, cr2, cr3, cr4, cr8;
+	unsigned long efer;
 	u16 gdt_pad;
 	u16 gdt_limit;
 	unsigned long gdt_base;
_
[PATCH 13/20] x86_64: Modify discover_ebda to use virtual addresses
Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/setup.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN arch/x86_64/kernel/setup.c~x86_64-Modify-discover_ebda-to-use-virtual-addresses arch/x86_64/kernel/setup.c
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/setup.c~x86_64-Modify-discover_ebda-to-use-virtual-addresses	2007-03-07 01:28:51.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/setup.c	2007-03-07 01:28:51.0 +0530
@@ -205,10 +205,10 @@ static void discover_ebda(void)
 	 * there is a real-mode segmented pointer pointing to the
 	 * 4K EBDA area at 0x40E
 	 */
-	ebda_addr = *(unsigned short *)EBDA_ADDR_POINTER;
+	ebda_addr = *(unsigned short *)__va(EBDA_ADDR_POINTER);
 	ebda_addr <<= 4;

-	ebda_size = *(unsigned short *)(unsigned long)ebda_addr;
+	ebda_size = *(unsigned short *)__va(ebda_addr);

 	/* Round EBDA up to pages */
 	if (ebda_size == 0)
_
[PATCH 16/20] swsusp: do not use virt_to_page on kernel data address
o virt_to_page() call should be used on kernel linear addresses and not
  on kernel text and data addresses. Swsusp code uses it on kernel data
  (statically allocated swsusp_header).

o Allocate swsusp_header dynamically so that virt_to_page() can be used
  safely.

o I am changing this because in next few patches, __pa() on x86_64 will
  no longer support kernel text and data addresses and hibernation breaks.

Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 kernel/power/swap.c |   42 +++---
 1 file changed, 27 insertions(+), 15 deletions(-)

diff -puN kernel/power/swap.c~swsusp-do-not-use-virt_to_page-on-kernel-data-addr kernel/power/swap.c
--- linux-2.6.21-rc2-reloc/kernel/power/swap.c~swsusp-do-not-use-virt_to_page-on-kernel-data-addr	2007-03-07 01:30:43.0 +0530
+++ linux-2.6.21-rc2-reloc-root/kernel/power/swap.c	2007-03-07 01:30:43.0 +0530
@@ -33,12 +33,14 @@ extern char resume_file[];

 #define SWSUSP_SIG	"S1SUSPEND"

-static struct swsusp_header {
+struct swsusp_header {
 	char reserved[PAGE_SIZE - 20 - sizeof(sector_t)];
 	sector_t image;
 	char	orig_sig[10];
 	char	sig[10];
-} __attribute__((packed, aligned(PAGE_SIZE))) swsusp_header;
+} __attribute__((packed));
+
+static struct swsusp_header *swsusp_header;

 /*
  * General things
@@ -141,14 +143,14 @@ static int mark_swapfiles(sector_t start
 {
 	int error;

-	bio_read_page(swsusp_resume_block, &swsusp_header, NULL);
-	if (!memcmp("SWAP-SPACE",swsusp_header.sig, 10) ||
-	    !memcmp("SWAPSPACE2",swsusp_header.sig, 10)) {
-		memcpy(swsusp_header.orig_sig,swsusp_header.sig, 10);
-		memcpy(swsusp_header.sig,SWSUSP_SIG, 10);
-		swsusp_header.image = start;
+	bio_read_page(swsusp_resume_block, swsusp_header, NULL);
+	if (!memcmp("SWAP-SPACE",swsusp_header->sig, 10) ||
+	    !memcmp("SWAPSPACE2",swsusp_header->sig, 10)) {
+		memcpy(swsusp_header->orig_sig,swsusp_header->sig, 10);
+		memcpy(swsusp_header->sig,SWSUSP_SIG, 10);
+		swsusp_header->image = start;
 		error = bio_write_page(swsusp_resume_block,
-					&swsusp_header, NULL);
+					swsusp_header, NULL);
 	} else {
 		printk(KERN_ERR "swsusp: Swap header not found!\n");
 		error = -ENODEV;
@@ -564,7 +566,7 @@ int swsusp_read(void)
 	if (error < PAGE_SIZE)
 		return error < 0 ? error : -EFAULT;
 	header = (struct swsusp_info *)data_of(snapshot);
-	error = get_swap_reader(&handle, swsusp_header.image);
+	error = get_swap_reader(&handle, swsusp_header->image);
 	if (!error)
 		error = swap_read_page(&handle, header, NULL);
 	if (!error)
@@ -591,17 +593,17 @@ int swsusp_check(void)
 	resume_bdev = open_by_devnum(swsusp_resume_device, FMODE_READ);
 	if (!IS_ERR(resume_bdev)) {
 		set_blocksize(resume_bdev, PAGE_SIZE);
-		memset(&swsusp_header, 0, sizeof(swsusp_header));
+		memset(swsusp_header, 0, PAGE_SIZE);
 		error = bio_read_page(swsusp_resume_block,
-					&swsusp_header, NULL);
+					swsusp_header, NULL);
 		if (error)
 			return error;

-		if (!memcmp(SWSUSP_SIG, swsusp_header.sig, 10)) {
-			memcpy(swsusp_header.sig, swsusp_header.orig_sig, 10);
+		if (!memcmp(SWSUSP_SIG, swsusp_header->sig, 10)) {
+			memcpy(swsusp_header->sig, swsusp_header->orig_sig, 10);
 			/* Reset swap signature now */
 			error = bio_write_page(swsusp_resume_block,
-						&swsusp_header, NULL);
+						swsusp_header, NULL);
 		} else {
 			return -EINVAL;
 		}
@@ -632,3 +634,13 @@ void swsusp_close(void)

 	blkdev_put(resume_bdev);
 }
+
+static int swsusp_header_init(void)
+{
+	swsusp_header = (struct swsusp_header*) __get_free_page(GFP_KERNEL);
+	if (!swsusp_header)
+		panic("Could not allocate memory for swsusp_header\n");
+	return 0;
+}
+
+core_initcall(swsusp_header_init);
_
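The signature handling that this patch converts from a static struct to a pointer is easier to follow in isolation. Below is a minimal userspace sketch of the two steps, mark_swapfiles() replacing the swap signature at suspend and swsusp_check() putting it back at resume. The struct here is a hypothetical cut-down version that keeps only the two signature fields, the function names are made up, and the block I/O is left out entirely.

```c
#include <string.h>

#define SIG_LEN		10
#define SWSUSP_SIG	"S1SUSPEND"	/* 9 characters plus NUL = 10 bytes */

/* Hypothetical reduced header: only the two signature slots. */
struct sig_area {
	char orig_sig[SIG_LEN];
	char sig[SIG_LEN];
};

/*
 * Suspend side: if the area carries a valid swap signature, remember it
 * in orig_sig and overwrite it with the suspend marker.
 * Returns 0 on success, -1 if no swap signature was found.
 */
static int mark_suspended(struct sig_area *h)
{
	if (memcmp("SWAP-SPACE", h->sig, SIG_LEN) &&
	    memcmp("SWAPSPACE2", h->sig, SIG_LEN))
		return -1;

	memcpy(h->orig_sig, h->sig, SIG_LEN);
	memcpy(h->sig, SWSUSP_SIG, SIG_LEN);
	return 0;
}

/*
 * Resume side: if the suspend marker is present, restore the original
 * swap signature. Returns 0 if an image was marked, -1 otherwise.
 */
static int check_and_restore(struct sig_area *h)
{
	if (memcmp(SWSUSP_SIG, h->sig, SIG_LEN))
		return -1;

	memcpy(h->sig, h->orig_sig, SIG_LEN);
	return 0;
}
```

In the kernel the same structure sits at the end of a page-sized buffer that is read from and written back to the resume device, which is why the patch has to care about where that buffer lives.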
[PATCH 20/20] x86_64: Move cpu verification code to common file
o This patch moves the code to verify long mode and SSE to a common file.
  This code is now shared by trampoline.S, wakeup.S, boot/setup.S and
  boot/compressed/head.S

o So far we used to do very limited check in trampoline.S, wakeup.S and
  in 32bit entry point. Now all the entry paths are forced to do the
  exhaustive check, including SSE because verify_cpu is shared.

o I am keeping this patch as last in the x86 relocatable series because
  previous patches have got quite some amount of testing done and I don't
  want to disturb that. So that if there is problem introduced by this
  patch, at least it can be easily isolated.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/boot/compressed/head.S |   19 ++
 arch/x86_64/boot/setup.S           |   65 ++---
 arch/x86_64/kernel/acpi/wakeup.S   |   30 +-
 arch/x86_64/kernel/trampoline.S    |   51 +
 arch/x86_64/kernel/verify_cpu.S    |  110 +
 5 files changed, 152 insertions(+), 123 deletions(-)

diff -puN arch/x86_64/boot/compressed/head.S~x86_64-move-cpu-verfication-code-to-common-file arch/x86_64/boot/compressed/head.S
--- linux-2.6.21-rc2-reloc/arch/x86_64/boot/compressed/head.S~x86_64-move-cpu-verfication-code-to-common-file	2007-03-07 01:32:27.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/boot/compressed/head.S	2007-03-07 01:32:27.0 +0530
@@ -54,6 +54,15 @@ startup_32:
 1:	popl	%ebp
 	subl	$1b, %ebp

+/* setup a stack and make sure cpu supports long mode. */
+	movl	$user_stack_end, %eax
+	addl	%ebp, %eax
+	movl	%eax, %esp
+
+	call	verify_cpu
+	testl	%eax, %eax
+	jnz	no_longmode
+
 /* Compute the delta between where we were compiled to run at
  * and where the code will actually run at.
  */
@@ -159,13 +168,21 @@ startup_32:
 	/* Jump from 32bit compatibility mode into 64bit mode. */
 	lret

+no_longmode:
+	/* This isn't an x86-64 CPU so hang */
+1:
+	hlt
+	jmp	1b
+
+#include "../../kernel/verify_cpu.S"
+
 	/* Be careful here startup_64 needs to be at a predictable
 	 * address so I can export it in an ELF header.
Bootloaders * should look at the ELF header to find this address, as * it may change in the future. */ .code64 - .org 0x100 + .org 0x200 ENTRY(startup_64) /* We come here either from startup_32 or directly from a * 64bit bootloader. If we come here from a bootloader we depend on diff -puN arch/x86_64/boot/setup.S~x86_64-move-cpu-verfication-code-to-common-file arch/x86_64/boot/setup.S --- linux-2.6.21-rc2-reloc/arch/x86_64/boot/setup.S~x86_64-move-cpu-verfication-code-to-common-file 2007-03-07 01:32:27.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/boot/setup.S2007-03-07 01:32:27.0 +0530 @@ -299,64 +299,10 @@ loader_ok: movw%cs,%ax movw%ax,%ds - /* minimum CPUID flags for x86-64 */ - /* see http://www.x86-64.org/lists/discuss/msg02971.html */ -#define SSE_MASK ((1<<25)|(1<<26)) -#define REQUIRED_MASK1 ((1<<0)|(1<<3)|(1<<4)|(1<<5)|(1<<6)|(1<<8)|\ - (1<<13)|(1<<15)|(1<<24)) -#define REQUIRED_MASK2 (1<<29) - - pushfl /* standard way to check for cpuid */ - popl%eax - movl%eax,%ebx - xorl$0x20,%eax - pushl %eax - popfl - pushfl - popl%eax - cmpl%eax,%ebx - jz no_longmode /* cpu has no cpuid */ - movl$0x0,%eax - cpuid - cmpl$0x1,%eax - jb no_longmode /* no cpuid 1 */ - xor %di,%di - cmpl$0x68747541,%ebx/* AuthenticAMD */ - jnz noamd - cmpl$0x69746e65,%edx - jnz noamd - cmpl$0x444d4163,%ecx - jnz noamd - mov $1,%di /* cpu is from AMD */ -noamd: - movl$0x1,%eax - cpuid - andl$REQUIRED_MASK1,%edx - xorl$REQUIRED_MASK1,%edx - jnz no_longmode - movl$0x8000,%eax - cpuid - cmpl$0x8001,%eax - jb no_longmode /* no extended cpuid */ - movl$0x8001,%eax - cpuid - andl$REQUIRED_MASK2,%edx - xorl$REQUIRED_MASK2,%edx - jnz no_longmode -sse_test: - movl$1,%eax - cpuid - andl$SSE_MASK,%edx - cmpl$SSE_MASK,%edx - je sse_ok - test%di,%di - jz no_longmode /* only try to force SSE on AMD */ - movl$0xc0010015,%ecx/* HWCR */ - rdmsr - btr $15,%eax/* enable SSE */ - wrmsr - xor %di,%di /*
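The feature test that moves into verify_cpu.S boils down to three mask comparisons on CPUID output. The sketch below uses the SSE_MASK, REQUIRED_MASK1 and REQUIRED_MASK2 values exactly as they appear in the removed setup.S code; the function name and the idea of passing the register values in as plain integers are assumptions made so that it can run in userspace. It deliberately leaves out the AMD-only step where the assembly tries to enable SSE through the HWCR MSR before giving up.

```c
#include <stdint.h>

/* Masks copied from the setup.S code removed by this patch. */
#define SSE_MASK	((1u << 25) | (1u << 26))
#define REQUIRED_MASK1	((1u << 0) | (1u << 3) | (1u << 4) | (1u << 5) | \
			 (1u << 6) | (1u << 8) | (1u << 13) | (1u << 15) | \
			 (1u << 24))
#define REQUIRED_MASK2	(1u << 29)

/*
 * std_edx: EDX returned by the standard feature leaf of CPUID.
 * ext_edx: EDX returned by the extended feature leaf of CPUID.
 * Returns 1 if every required bit is set, 0 otherwise.
 * Simplification: no attempt is made to switch SSE on for AMD parts.
 */
static int cpu_meets_requirements(uint32_t std_edx, uint32_t ext_edx)
{
	if ((std_edx & REQUIRED_MASK1) != REQUIRED_MASK1)
		return 0;
	if ((ext_edx & REQUIRED_MASK2) != REQUIRED_MASK2)
		return 0;
	return (std_edx & SSE_MASK) == SSE_MASK;
}
```

Sharing one routine means every entry path (boot, SMP trampoline, ACPI wakeup) now applies all three comparisons instead of each doing its own partial check.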
[PATCH 2/20] x86_64: Kill temp boot pmds
Early in the boot process we need the ability to set up temporary
mappings, before our normal mechanisms are initialized. Currently this is
used to map pages that are part of the page tables we are building and
pages during the dmi scan.

The core problem is that we are using the user portion of the page tables
to implement this. Which means that while this mechanism is active we
cannot catch NULL pointer dereferences and we deviate from the normal ways
of handling things.

In this patch I modify early_ioremap to map pages into the kernel portion
of address space, roughly where we will later put modules, and I make the
discovery of which addresses we can use dynamic, which removes all kinds
of static limits and removes the dependencies on implementation details
between different parts of the code.

Now alloc_low_page() and unmap_low_page() use early_ioremap() and
early_iounmap() to allocate/map and unmap a page.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S |    3 -
 arch/x86_64/mm/init.c     |  100 --
 2 files changed, 45 insertions(+), 58 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-Kill-temp_boot_pmds arch/x86_64/kernel/head.S
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/head.S~x86_64-Kill-temp_boot_pmds	2007-03-07 01:21:26.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/head.S	2007-03-07 01:21:26.0 +0530
@@ -288,9 +288,6 @@ NEXT_PAGE(level2_ident_pgt)
 	.quad	i << 21 | 0x083
 	i = i + 1
 	.endr
-	/* Temporary mappings for the super early allocator in arch/x86_64/mm/init.c */
-	.globl temp_boot_pmds
-temp_boot_pmds:
 	.fill	492,8,0

 NEXT_PAGE(level2_kernel_pgt)
diff -puN arch/x86_64/mm/init.c~x86_64-Kill-temp_boot_pmds arch/x86_64/mm/init.c
--- linux-2.6.21-rc2-reloc/arch/x86_64/mm/init.c~x86_64-Kill-temp_boot_pmds	2007-03-07 01:21:26.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/mm/init.c	2007-03-07 01:21:26.0 +0530
@@ -167,23 +167,9 @@ __set_fixmap (enum fixed_addresses idx,
unsigned long __initdata table_start, table_end; -extern pmd_t temp_boot_pmds[]; - -static struct temp_map { - pmd_t *pmd; - void *address; - intallocated; -} temp_mappings[] __initdata = { - { &temp_boot_pmds[0], (void *)(40UL * 1024 * 1024) }, - { &temp_boot_pmds[1], (void *)(42UL * 1024 * 1024) }, - {} -}; - -static __meminit void *alloc_low_page(int *index, unsigned long *phys) +static __meminit void *alloc_low_page(unsigned long *phys) { - struct temp_map *ti; - int i; - unsigned long pfn = table_end++, paddr; + unsigned long pfn = table_end++; void *adr; if (after_bootmem) { @@ -194,57 +180,63 @@ static __meminit void *alloc_low_page(in if (pfn >= end_pfn) panic("alloc_low_page: ran out of memory"); - for (i = 0; temp_mappings[i].allocated; i++) { - if (!temp_mappings[i].pmd) - panic("alloc_low_page: ran out of temp mappings"); - } - ti = &temp_mappings[i]; - paddr = (pfn << PAGE_SHIFT) & PMD_MASK; - set_pmd(ti->pmd, __pmd(paddr | _KERNPG_TABLE | _PAGE_PSE)); - ti->allocated = 1; - __flush_tlb(); - adr = ti->address + ((pfn << PAGE_SHIFT) & ~PMD_MASK); + + adr = early_ioremap(pfn * PAGE_SIZE, PAGE_SIZE); memset(adr, 0, PAGE_SIZE); - *index = i; - *phys = pfn * PAGE_SIZE; - return adr; -} + *phys = pfn * PAGE_SIZE; + return adr; +} -static __meminit void unmap_low_page(int i) +static __meminit void unmap_low_page(void *adr) { - struct temp_map *ti; if (after_bootmem) return; - ti = &temp_mappings[i]; - set_pmd(ti->pmd, __pmd(0)); - ti->allocated = 0; + early_iounmap(adr, PAGE_SIZE); } /* Must run before zap_low_mappings */ __init void *early_ioremap(unsigned long addr, unsigned long size) { - unsigned long map = round_down(addr, LARGE_PAGE_SIZE); - - /* actually usually some more */ - if (size >= LARGE_PAGE_SIZE) { - return NULL; + unsigned long vaddr; + pmd_t *pmd, *last_pmd; + int i, pmds; + + pmds = ((addr & ~PMD_MASK) + size + ~PMD_MASK) / PMD_SIZE; + vaddr = __START_KERNEL_map; + pmd = level2_kernel_pgt; + last_pmd = level2_kernel_pgt + PTRS_PER_PMD - 1; 
+ for (; pmd <= last_pmd; pmd++, vaddr += PMD_SIZE) { + for (i = 0; i < pmds; i++) { + if (pmd_present(pmd[i])) + goto next; + } + vaddr += addr & ~PMD_MASK; + addr &= PMD_MASK; + for (i = 0; i < pmds; i++, addr += PMD_SIZE) + set_pmd(pmd + i,__pmd
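The one line in the new early_ioremap() that is easy to misread is the computation of pmds, the number of 2MB page-directory entries needed to cover the request. Here is a small worked sketch of that expression on its own. The 21-bit shift matches the "i << 21" used for the large-page entries in head.S above; the constant and function names are redefined locally for the example and are not taken from the kernel headers.

```c
#include <stdint.h>

#define PMD_SHIFT	21
#define PMD_SIZE	(1ULL << PMD_SHIFT)	/* 2MB covered by one entry */
#define PMD_MASK	(~(PMD_SIZE - 1))

/*
 * Number of PMD entries needed to map [addr, addr + size).
 * (addr & ~PMD_MASK) is the offset of addr inside its 2MB page; adding
 * the size and then ~PMD_MASK (PMD_SIZE - 1) rounds the total up to a
 * whole number of entries.
 */
static uint64_t pmds_needed(uint64_t addr, uint64_t size)
{
	return ((addr & ~PMD_MASK) + size + ~PMD_MASK) / PMD_SIZE;
}
```

So a small mapping that sits inside one large page needs a single entry, while the same size placed across a 2MB boundary needs two, which is the case the old "size >= LARGE_PAGE_SIZE" check could not handle.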
[PATCH 0/20] x86_64 Relocatable bzImage support (V4)
Hi,

Here is another attempt on x86_64 relocatable bzImage patches (V4). This
patchset makes a bzImage relocatable, so the same kernel binary can be
loaded and run from different physical addresses.

As of now, this mainly helps distros who have to ship an extra kernel
compiled for a different physical address to capture the kernel crash
dump. This patchset will allow distros and kdump users to use the
production kernel itself as dump capture kernel and there is no need to
ship/build an extra kernel. I am hopeful people will find other
interesting usages down the line.

Eric has done all the heavy lifting required to make this patchset work.

Last time I posted this patchset (V3), there were minor comments which I
have taken care of. Following are the changes since V3.

- Reduced the usage of _AC() macro to only shift operations, as per Andi's
  comment.
- Restored the CONFIG_PHYSICAL_START option.
- Fixed few bugs with suspend to disk code path.

It would be good if these patches get into -mm so that they can undergo
more testing. I have been testing them and these just work fine for me.

Thanks
Vivek
[PATCH 10/20] x86_64: wakeup.S rename registers to reflect right names
o Use appropriate names for 64bit registers.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/acpi/wakeup.S |   36 ++--
 include/asm-x86_64/suspend.h     |   12 ++--
 2 files changed, 24 insertions(+), 24 deletions(-)

diff -puN arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-rename-registers-to-reflect-right-names arch/x86_64/kernel/acpi/wakeup.S
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-rename-registers-to-reflect-right-names	2007-03-07 01:26:55.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/acpi/wakeup.S	2007-03-07 01:26:55.0 +0530
@@ -211,16 +211,16 @@ wakeup_long64:
 	movw	%ax, %es
 	movw	%ax, %fs
 	movw	%ax, %gs
-	movq	saved_esp, %rsp
+	movq	saved_rsp, %rsp
 
 	movw	$0x0e00 + 'x', %ds:(0xb8018)
-	movq	saved_ebx, %rbx
-	movq	saved_edi, %rdi
-	movq	saved_esi, %rsi
-	movq	saved_ebp, %rbp
+	movq	saved_rbx, %rbx
+	movq	saved_rdi, %rdi
+	movq	saved_rsi, %rsi
+	movq	saved_rbp, %rbp
 
 	movw	$0x0e00 + '!', %ds:(0xb801a)
-	movq	saved_eip, %rax
+	movq	saved_rip, %rax
 	jmp	*%rax
 
 .code32
@@ -408,13 +408,13 @@ do_suspend_lowlevel:
 	movq	%r15, saved_context_r15(%rip)
 	pushfq ; popq saved_context_eflags(%rip)
 
-	movq	$.L97, saved_eip(%rip)
+	movq	$.L97, saved_rip(%rip)
 
-	movq %rsp,saved_esp
-	movq %rbp,saved_ebp
-	movq %rbx,saved_ebx
-	movq %rdi,saved_edi
-	movq %rsi,saved_esi
+	movq %rsp,saved_rsp
+	movq %rbp,saved_rbp
+	movq %rbx,saved_rbx
+	movq %rdi,saved_rdi
+	movq %rsi,saved_rsi
 
 	addq	$8, %rsp
 	movl	$3, %edi
@@ -461,12 +461,12 @@ do_suspend_lowlevel:
 .data
 ALIGN
-ENTRY(saved_ebp)	.quad	0
-ENTRY(saved_esi)	.quad	0
-ENTRY(saved_edi)	.quad	0
-ENTRY(saved_ebx)	.quad	0
+ENTRY(saved_rbp)	.quad	0
+ENTRY(saved_rsi)	.quad	0
+ENTRY(saved_rdi)	.quad	0
+ENTRY(saved_rbx)	.quad	0
 
-ENTRY(saved_eip)	.quad	0
-ENTRY(saved_esp)	.quad	0
+ENTRY(saved_rip)	.quad	0
+ENTRY(saved_rsp)	.quad	0
 
 ENTRY(saved_magic)	.quad	0
diff -puN include/asm-x86_64/suspend.h~x86_64-wakeup.S-rename-registers-to-reflect-right-names include/asm-x86_64/suspend.h
--- linux-2.6.21-rc2-reloc/include/asm-x86_64/suspend.h~x86_64-wakeup.S-rename-registers-to-reflect-right-names	2007-03-07 01:26:55.0 +0530
+++ linux-2.6.21-rc2-reloc-root/include/asm-x86_64/suspend.h	2007-03-07 01:26:55.0 +0530
@@ -45,12 +45,12 @@ extern unsigned long saved_context_eflag
 extern void fix_processor_context(void);
 
 #ifdef CONFIG_ACPI_SLEEP
-extern unsigned long saved_eip;
-extern unsigned long saved_esp;
-extern unsigned long saved_ebp;
-extern unsigned long saved_ebx;
-extern unsigned long saved_esi;
-extern unsigned long saved_edi;
+extern unsigned long saved_rip;
+extern unsigned long saved_rsp;
+extern unsigned long saved_rbp;
+extern unsigned long saved_rbx;
+extern unsigned long saved_rsi;
+extern unsigned long saved_rdi;
 
 /* routines for saving/restoring kernel state */
 extern int acpi_save_state_mem(void);
_
Re: [RFC][PATCH 0/7] Resource controllers based on process containers
Balbir Singh wrote:
> Pavel Emelianov wrote:
>> This patchset adds RSS, accounting and control and
>> limiting the number of tasks and files within container.
>>
>> Based on top of Paul Menage's container subsystem v7
>>
>> RSS controller includes per-container RSS accounter,
>> reclamation and OOM killer. It behaves like standalone
>> machine - when container runs out of resources it tries
>> to reclaim some pages and if it doesn't succeed in it
>> kills some task which mm_struct belongs to container in
>> question.
>>
>> Num tasks and files containers are very simple and
>> self-descriptive from code.
>>
>> As discussed before when a task moves from one container
>> to another no resources follow it - they keep holding the
>> container they were allocated in.
>>
>
> I have one problem with the patchset, I cannot compile
> the patches individually and some of the code is hard
> to read as it depends on functions from future patches.
> Patch 2, 3 and 4 fail to compile without patch 5 applied.
>
> Patch 1 failed to apply with a reject in kernel/Makefile
> I applied it on top of 2.6.20 with all of Paul Menage's
> patches (all 7).

This sounds weird to me :( I've taken a stock 2.6.20 and applied
Paul's patches. That is the tree this patchset applies to.
[PATCH 4/20] x86_64: Fix early printk to use standard ISA mapping
Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/early_printk.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff -puN arch/x86_64/kernel/early_printk.c~x86_64-fix-early_printk-to-use-the-standard-ISA-mapping arch/x86_64/kernel/early_printk.c
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/early_printk.c~x86_64-fix-early_printk-to-use-the-standard-ISA-mapping	2007-03-07 01:22:33.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/early_printk.c	2007-03-07 01:22:33.0 +0530
@@ -11,11 +11,10 @@
 
 #ifdef __i386__
 #include
-#define VGABASE		(__ISA_IO_base + 0xb8000)
 #else
 #include
-#define VGABASE		((void __iomem *)0x800b8000UL)
 #endif
+#define VGABASE		(__ISA_IO_base + 0xb8000)
 
 static int max_ypos = 25, max_xpos = 80;
 static int current_ypos = 25, current_xpos = 0;
_
[PATCH 1/20] x86_64: Assembly safe page.h and pgtable.h
This patch makes pgtable.h and page.h safe to include in assembly files like head.S, allowing us to use symbolic constants instead of hard-coded numbers when referring to the page tables.

This patch copies asm-sparc64/const.h to asm-x86_64 to get a definition of _AC(), a very convenient macro that allows us to force the type when we are compiling the code in C and to drop all of the type information when we are using the constant in assembly. Previously this was done with multiple definitions of the same constant. const.h was modified slightly so that it works when given CONFIG options as arguments.

This patch adds #ifndef __ASSEMBLY__ ... #endif and _AC(1,UL) where appropriate so the assembler won't choke on the header files. Otherwise nothing should have changed.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 include/asm-x86_64/const.h   |   20
 include/asm-x86_64/page.h    |   28 ++--
 include/asm-x86_64/pgtable.h |   33 +
 3 files changed, 51 insertions(+), 30 deletions(-)

diff -puN /dev/null include/asm-x86_64/const.h
--- /dev/null	2007-03-07 00:46:17.354096448 +0530
+++ linux-2.6.21-rc2-reloc-root/include/asm-x86_64/const.h	2007-03-07 01:20:54.0 +0530
@@ -0,0 +1,20 @@
+/* const.h: Macros for dealing with constants. */
+
+#ifndef _X86_64_CONST_H
+#define _X86_64_CONST_H
+
+/* Some constant macros are used in both assembler and
+ * C code. Therefore we cannot annotate them always with
+ * 'UL' and other type specifiers unilaterally. We
+ * use the following macros to deal with this.
+ */ + +#ifdef __ASSEMBLY__ +#define _AC(X,Y) X +#else +#define __AC(X,Y) (X##Y) +#define _AC(X,Y) __AC(X,Y) +#endif + + +#endif /* !(_X86_64_CONST_H) */ diff -puN include/asm-x86_64/page.h~x86_64-Assembly-safe-page.h-and-pgtable.h include/asm-x86_64/page.h --- linux-2.6.21-rc2-reloc/include/asm-x86_64/page.h~x86_64-Assembly-safe-page.h-and-pgtable.h 2007-03-07 01:20:54.0 +0530 +++ linux-2.6.21-rc2-reloc-root/include/asm-x86_64/page.h 2007-03-07 01:20:54.0 +0530 @@ -1,14 +1,11 @@ #ifndef _X86_64_PAGE_H #define _X86_64_PAGE_H +#include /* PAGE_SHIFT determines the page size */ #define PAGE_SHIFT 12 -#ifdef __ASSEMBLY__ -#define PAGE_SIZE (0x1 << PAGE_SHIFT) -#else -#define PAGE_SIZE (1UL << PAGE_SHIFT) -#endif +#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) #define PHYSICAL_PAGE_MASK (~(PAGE_SIZE-1) & __PHYSICAL_MASK) @@ -33,10 +30,10 @@ #define N_EXCEPTION_STACKS 5 /* hw limit: 7 */ #define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1)) -#define LARGE_PAGE_SIZE (1UL << PMD_SHIFT) +#define LARGE_PAGE_SIZE (_AC(1,UL) << PMD_SHIFT) #define HPAGE_SHIFT PMD_SHIFT -#define HPAGE_SIZE ((1UL) << HPAGE_SHIFT) +#define HPAGE_SIZE (_AC(1,UL) << HPAGE_SHIFT) #define HPAGE_MASK (~(HPAGE_SIZE - 1)) #define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT) @@ -76,29 +73,24 @@ typedef struct { unsigned long pgprot; } #define __pgd(x) ((pgd_t) { (x) } ) #define __pgprot(x)((pgprot_t) { (x) } ) -#define __PHYSICAL_START ((unsigned long)CONFIG_PHYSICAL_START) -#define __START_KERNEL (__START_KERNEL_map + __PHYSICAL_START) -#define __START_KERNEL_map 0x8000UL -#define __PAGE_OFFSET 0x8100UL +#endif /* !__ASSEMBLY__ */ -#else #define __PHYSICAL_START CONFIG_PHYSICAL_START #define __START_KERNEL (__START_KERNEL_map + __PHYSICAL_START) #define __START_KERNEL_map 0x8000 #define __PAGE_OFFSET 0x8100 -#endif /* !__ASSEMBLY__ */ /* to align the pointer to the (next) page boundary */ #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) /* See 
Documentation/x86_64/mm.txt for a description of the memory map. */ #define __PHYSICAL_MASK_SHIFT 46 -#define __PHYSICAL_MASK((1UL << __PHYSICAL_MASK_SHIFT) - 1) +#define __PHYSICAL_MASK((_AC(1,UL) << __PHYSICAL_MASK_SHIFT) - 1) #define __VIRTUAL_MASK_SHIFT 48 -#define __VIRTUAL_MASK ((1UL << __VIRTUAL_MASK_SHIFT) - 1) +#define __VIRTUAL_MASK ((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - 1) -#define KERNEL_TEXT_SIZE (40UL*1024*1024) -#define KERNEL_TEXT_START 0x8000UL +#define KERNEL_TEXT_SIZE (40*1024*1024) +#define KERNEL_TEXT_START 0x8000 #ifndef __ASSEMBLY__ @@ -106,7 +98,7 @@ typedef struct { unsigned long pgprot; } #endif /* __ASSEMBLY__ */ -#define PAGE_OFFSET((unsigned long)__PAGE_OFFSET) +#define PAGE_OFFSET__PAGE_OFFSET /* Note: __pa(&symbol_visible_to_c) should be always replaced with __pa_symbol. Otherwise you risk miscompilation. */ diff -puN include/asm-x86_64/pgtable.h~x86_64-Assembly-safe-page.h-and-pgtable.h include/asm-x86_64/pgtable.h --- linux-2.6.21-r
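To make the _AC() trick concrete, here is a small standalone sketch of how the macro behaves on the C side. The macro bodies are the ones from the const.h hunk above. CONFIG_EXAMPLE_START, EXAMPLE_START and page_align() are invented names for illustration only. The two-level definition matters: the outer _AC() lets a macro argument such as a CONFIG option expand first, and only then does __AC() paste the suffix onto the resulting number.

```c
#include <assert.h>

/* Same shape as the const.h in the patch: paste a type suffix onto a
 * constant when compiling C, drop it entirely when assembling. */
#ifdef __ASSEMBLY__
#define _AC(X,Y)	X
#else
#define __AC(X,Y)	(X##Y)
#define _AC(X,Y)	__AC(X,Y)
#endif

#define PAGE_SHIFT	12
#define PAGE_SIZE	(_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE-1))

/* Hypothetical stand-in for a CONFIG option passed as an argument.
 * A single-level macro would paste CONFIG_EXAMPLE_START##UL instead
 * of expanding the option to its value first. */
#define CONFIG_EXAMPLE_START	0x200000
#define EXAMPLE_START		_AC(CONFIG_EXAMPLE_START, UL)

/* Hypothetical helper: round an address up to the next page boundary. */
static inline unsigned long page_align(unsigned long addr)
{
	return (addr + PAGE_SIZE - 1) & PAGE_MASK;
}
```

In C, PAGE_SIZE here has type unsigned long. Built with -D__ASSEMBLY__ the same definition collapses to a plain 1 << 12 that the assembler accepts.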
[PATCH 18/20] x86_64: Relocatable Kernel Support
This patch modifies the x86_64 kernel so that it can be loaded and run at any 2M aligned address, below 512G. The technique used is to compile the decompressor with -fPIC and modify it so the decompressor is fully relocatable. For the main kernel the page tables are modified so the kernel remains at the same virtual address. In addition a variable phys_base is kept that holds the physical address the kernel is loaded at. __pa_symbol is modified to add that when we take the address of a kernel symbol. When loaded with a normal bootloader the decompressor will decompress the kernel to 2M and it will run there. This both ensures the relocation code is always working, and makes it easier to use 2M pages for the kernel and the cpu. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> --- arch/x86_64/Kconfig | 50 arch/x86_64/boot/compressed/Makefile| 12 - arch/x86_64/boot/compressed/head.S | 322 +++- arch/x86_64/boot/compressed/misc.c | 251 +--- arch/x86_64/boot/compressed/vmlinux.lds | 44 arch/x86_64/boot/compressed/vmlinux.scr |9 arch/x86_64/kernel/head.S | 225 -- arch/x86_64/kernel/suspend_asm.S|7 include/asm-x86_64/page.h |6 9 files changed, 596 insertions(+), 330 deletions(-) diff -puN arch/x86_64/boot/compressed/head.S~x86_64-Relocatable-kernel-support arch/x86_64/boot/compressed/head.S --- linux-2.6.21-rc2-reloc/arch/x86_64/boot/compressed/head.S~x86_64-Relocatable-kernel-support 2007-03-07 01:31:35.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/boot/compressed/head.S 2007-03-07 01:31:35.0 +0530 @@ -26,116 +26,262 @@ #include #include +#include #include +#include +.section ".text.head" .code32 .globl startup_32 - + startup_32: cld cli - movl $(__KERNEL_DS),%eax - movl %eax,%ds - movl %eax,%es - movl %eax,%fs - movl %eax,%gs - - lss stack_start,%esp - xorl %eax,%eax -1: incl %eax # check that A20 really IS enabled - movl %eax,0x00 # loop forever if it isn't - cmpl %eax,0x10 - je 1b + movl$(__KERNEL_DS), %eax + 
movl%eax, %ds + movl%eax, %es + movl%eax, %ss + +/* Calculate the delta between where we were compiled to run + * at and where we were actually loaded at. This can only be done + * with a short local call on x86. Nothing else will tell us what + * address we are running at. The reserved chunk of the real-mode + * data at 0x34-0x3f are used as the stack for this calculation. + * Only 4 bytes are needed. + */ + leal0x40(%esi), %esp + call1f +1: popl%ebp + subl$1b, %ebp + +/* Compute the delta between where we were compiled to run at + * and where the code will actually run at. + */ +/* %ebp contains the address we are loaded at by the boot loader and %ebx + * contains the address where we should move the kernel image temporarily + * for safe in-place decompression. + */ + +#ifdef CONFIG_RELOCATABLE + movl%ebp, %ebx + addl$(LARGE_PAGE_SIZE -1), %ebx + andl$LARGE_PAGE_MASK, %ebx +#else + movl$CONFIG_PHYSICAL_START, %ebx +#endif + + /* Replace the compressed data size with the uncompressed size */ + sublinput_len(%ebp), %ebx + movloutput_len(%ebp), %eax + addl%eax, %ebx + /* Add 8 bytes for every 32K input block */ + shrl$12, %eax + addl%eax, %ebx + /* Add 32K + 18 bytes of extra slack and align on a 4K boundary */ + addl$(32768 + 18 + 4095), %ebx + andl$~4095, %ebx /* - * Initialize eflags. Some BIOS's leave bits like NT set. This would - * confuse the debugger if this code is traced. - * XXX - best to initialize before switching to protected mode. 
+ * Prepare for entering 64 bit mode */ - pushl $0 - popfl + + /* Load new GDT with the 64bit segments using 32bit descriptor */ + lealgdt(%ebp), %eax + movl%eax, gdt+2(%ebp) + lgdtgdt(%ebp) + + /* Enable PAE mode */ + xorl%eax, %eax + orl $(1 << 5), %eax + movl%eax, %cr4 + + /* + * Build early 4G boot pagetable + */ + /* Initialize Page tables to 0*/ + lealpgtable(%ebx), %edi + xorl%eax, %eax + movl$((4096*6)/4), %ecx + rep stosl + + /* Build Level 4 */ + lealpgtable + 0(%ebx), %edi + leal0x1007 (%edi), %eax + movl%eax, 0(%edi) + + /* Build Level 3 */ + lealpgtable + 0x1000(%ebx), %edi + leal0x1007(%edi), %eax + movl$4, %ecx +1: movl%eax, 0x00(%edi) + addl$0x1000, %eax + addl$8, %edi + decl%ecx +
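The %ebx arithmetic in startup_32 is easier to check when written out in C. The sketch below follows the instruction sequence quoted above step by step for the CONFIG_RELOCATABLE case. It is only an illustration: decompress_move_target() and pa_symbol() are made-up names, the 2M LARGE_PAGE_SIZE is an assumption, and the __START_KERNEL_map value is my assumption of the usual x86_64 constant (the archive copy of the patch shows it truncated). pa_symbol() simply restates the description's "add phys_base" rule.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 2M large pages. */
#define LARGE_PAGE_SIZE	(UINT32_C(1) << 21)
#define LARGE_PAGE_MASK	(~(LARGE_PAGE_SIZE - 1))

/* Assumption: the usual x86_64 kernel text mapping base. */
#define START_KERNEL_MAP	UINT64_C(0xffffffff80000000)

/*
 * Hypothetical helper: where the compressed image is moved before
 * in-place decompression, following the assembly above.
 */
static inline uint32_t decompress_move_target(uint32_t load_addr,
					       uint32_t input_len,
					       uint32_t output_len)
{
	/* Round the load address up to a 2M boundary. */
	uint32_t ebx = (load_addr + LARGE_PAGE_SIZE - 1) & LARGE_PAGE_MASK;

	/* Replace the compressed data size with the uncompressed size. */
	ebx -= input_len;
	ebx += output_len;
	/* 8 bytes for every 32K block, i.e. output_len / 4096. */
	ebx += output_len >> 12;
	/* 32K + 18 bytes of slack, then align up to a 4K boundary. */
	ebx += 32768 + 18 + 4095;
	ebx &= ~UINT32_C(4095);
	return ebx;
}

/*
 * Hypothetical helper: physical address of a kernel symbol when the
 * kernel keeps its link-time virtual address and phys_base holds the
 * relocation described in the patch.
 */
static inline uint64_t pa_symbol(uint64_t vaddr, uint64_t phys_base)
{
	return (vaddr - START_KERNEL_MAP) + phys_base;
}
```

The shift by 12 matches the comment in the assembly: 8 bytes per 32K is the same ratio as 1 byte per 4K.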
[PATCH 6/20] x86_64: cleanup segments
Move __KERNEL32_CS up into the unused gdt entry. __KERNEL32_CS is used when entering the kernel, so putting it first is useful when trying to keep boot gdt sizes to a minimum.

Set the accessed bit on all gdt entries. We don't care about the bit, so there is no need for the cpu to burn extra cycles setting it, and having it pre-set potentially allows the gdt pages to be immutable. Plus it is confusing when you are debugging and your gdt entries mysteriously change.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S    |   12 ++--
 include/asm-x86_64/segment.h |    2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-cleanup-segments arch/x86_64/kernel/head.S
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/head.S~x86_64-cleanup-segments	2007-03-07 01:24:53.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/head.S	2007-03-07 01:24:53.0 +0530
@@ -362,13 +362,13 @@ gdt:
 ENTRY(cpu_gdt_table)
 	.quad	0x	/* NULL descriptor */
+	.quad	0x00cf9b00	/* __KERNEL32_CS */
+	.quad	0x00af9b00	/* __KERNEL_CS */
+	.quad	0x00cf9300	/* __KERNEL_DS */
+	.quad	0x00cffb00	/* __USER32_CS */
+	.quad	0x00cff300	/* __USER_DS, __USER32_DS */
+	.quad	0x00affb00	/* __USER_CS */
 	.quad	0x0		/* unused */
-	.quad	0x00af9a00	/* __KERNEL_CS */
-	.quad	0x00cf9200	/* __KERNEL_DS */
-	.quad	0x00cffa00	/* __USER32_CS */
-	.quad	0x00cff200	/* __USER_DS, __USER32_DS */
-	.quad	0x00affa00	/* __USER_CS */
-	.quad	0x00cf9a00	/* __KERNEL32_CS */
 	.quad	0,0			/* TSS */
 	.quad	0,0			/* LDT */
 	.quad	0,0,0			/* three TLS descriptors */
diff -puN include/asm-x86_64/segment.h~x86_64-cleanup-segments include/asm-x86_64/segment.h
--- linux-2.6.21-rc2-reloc/include/asm-x86_64/segment.h~x86_64-cleanup-segments	2007-03-07 01:24:53.0 +0530
+++ linux-2.6.21-rc2-reloc-root/include/asm-x86_64/segment.h	2007-03-07 01:24:53.0 +0530
@@ -6,7 +6,7 @@
 #define __KERNEL_CS	0x10
 #define __KERNEL_DS	0x18
 
-#define __KERNEL32_CS	0x38
+#define __KERNEL32_CS	0x08
 
 /*
  * we cannot use the same code segment descriptor for user and kernel
_
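Two bits of arithmetic sit behind this patch: a selector's table slot is the selector shifted right by 3, and the accessed flag is the lowest bit of the descriptor's access byte (bits 40-47 of the 8-byte entry), which is why every 0x..a/0x..2 access byte becomes 0x..b/0x..3. A small sketch, with helper names invented for illustration. The full 64-bit descriptor value in the usage note is illustrative only, since the archive copy shows the .quad values truncated.

```c
#include <assert.h>
#include <stdint.h>

/* The access byte occupies bits 40-47 of a segment descriptor; its
 * lowest bit is the "accessed" flag the CPU otherwise sets on first use. */
#define GDT_ACCESS_SHIFT	40
#define GDT_ACCESSED		(UINT64_C(1) << GDT_ACCESS_SHIFT)

/* Hypothetical helper: GDT slot selected by a segment selector
 * (the low 3 bits hold the RPL and the table indicator). */
static inline unsigned gdt_index(uint16_t selector)
{
	return selector >> 3;
}

/* Hypothetical helper: extract the access byte of a descriptor. */
static inline uint8_t gdt_access_byte(uint64_t desc)
{
	return (uint8_t)(desc >> GDT_ACCESS_SHIFT);
}

/* Hypothetical helper: pre-set the accessed flag, as the patch does
 * by hand for every entry in cpu_gdt_table. */
static inline uint64_t gdt_set_accessed(uint64_t desc)
{
	return desc | GDT_ACCESSED;
}
```

With these, gdt_index(0x38) is 7 (the old __KERNEL32_CS slot) and gdt_index(0x08) is 1 (the formerly unused slot right after the NULL descriptor), while __KERNEL_CS at 0x10 and __KERNEL_DS at 0x18 keep slots 2 and 3.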
[PATCH 9/20] x86_64: Get rid of dead code in suspend resume
o Get rid of dead code in wakeup.S o We never restore from saved_gdt, saved_idt, saved_ltd, saved_tss, saved_cr3, saved_cr4, saved_cr0, real_save_gdt, saved_efer, saved_efer2. Get rid of of associated code. o Get rid of bogus_magic, bogus_31_magic and bogus_magic2. No longer being used. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> --- arch/x86_64/kernel/acpi/wakeup.S | 57 --- 1 file changed, 1 insertion(+), 56 deletions(-) diff -puN arch/x86_64/kernel/acpi/wakeup.S~x86_64-get-rid-of-dead-code-in-suspend-resume arch/x86_64/kernel/acpi/wakeup.S --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-get-rid-of-dead-code-in-suspend-resume 2007-03-07 01:26:21.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/acpi/wakeup.S 2007-03-07 01:26:21.0 +0530 @@ -258,8 +258,6 @@ gdt_48a: .word 0, 0# gdt base (filled in later) -real_save_gdt: .word 0 - .quad 0 real_magic:.quad 0 video_mode:.quad 0 video_flags: .quad 0 @@ -272,10 +270,6 @@ bogus_32_magic: movb$0xb3,%al ; outb %al,$0x80 jmp bogus_32_magic -bogus_31_magic: - movb$0xb1,%al ; outb %al,$0x80 - jmp bogus_31_magic - bogus_cpu: movb$0xbc,%al ; outb %al,$0x80 jmp bogus_cpu @@ -346,16 +340,6 @@ check_vesaa: _setbada: jmp setbada - .code64 -bogus_magic: - movw$0x0e00 + 'B', %ds:(0xb8018) - jmp bogus_magic - -bogus_magic2: - movw$0x0e00 + '2', %ds:(0xb8018) - jmp bogus_magic2 - - wakeup_stack_begin:# Stack grows down .org 0xff0 @@ -373,28 +357,11 @@ ENTRY(wakeup_end) # # Returned address is location of code in low memory (past data and stack) # + .code64 ENTRY(acpi_copy_wakeup_routine) pushq %rax - pushq %rcx pushq %rdx - sgdtsaved_gdt - sidtsaved_idt - sldtsaved_ldt - str saved_tss - - movq%cr3, %rdx - movq%rdx, saved_cr3 - movq%cr4, %rdx - movq%rdx, saved_cr4 - movq%cr0, %rdx - movq%rdx, saved_cr0 - sgdtreal_save_gdt - wakeup_start (,%rdi) - movl$MSR_EFER, %ecx - rdmsr - movl%eax, saved_efer - movl%edx, saved_efer2 - movlsaved_video_mode, %edx 
movl%edx, video_mode - wakeup_start (,%rdi) movlacpi_video_flags, %edx @@ -407,17 +374,8 @@ ENTRY(acpi_copy_wakeup_routine) cmpl$0x9abcdef0, %eax jne bogus_32_magic - # make sure %cr4 is set correctly (features, etc) - movlsaved_cr4 - __START_KERNEL_map, %eax - movq%rax, %cr4 - - movlsaved_cr0 - __START_KERNEL_map, %eax - movq%rax, %cr0 - jmp 1f # Flush pipelines -1: # restore the regs we used popq%rdx - popq%rcx popq%rax ENTRY(do_suspend_lowlevel_s4bios) ret @@ -512,16 +470,3 @@ ENTRY(saved_eip) .quad 0 ENTRY(saved_esp) .quad 0 ENTRY(saved_magic) .quad 0 - -ALIGN -# saved registers -saved_gdt: .quad 0,0 -saved_idt: .quad 0,0 -saved_ldt: .quad 0 -saved_tss: .quad 0 - -saved_cr0: .quad 0 -saved_cr3: .quad 0 -saved_cr4: .quad 0 -saved_efer:.quad 0 -saved_efer2: .quad 0 _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 11/20] x86_64: wakeup.S misc cleanups
o Various cleanups. One of the main purpose of cleanups is that make wakeup.S as close as possible to trampoline.S. o Following are the changes - Indentations for comments. - Changed the gdt table to compact form and to resemble the one in trampoline.S - Take the jump to 32bit from real mode using ljmpl. Makes code more readable. - After enabling long mode, directly take a long jump for 64bit mode. No need to take an extra jump to "reach_comaptibility_mode" - Stack is not used after real mode. So don't load stack in 32 bit mode. - No need to enable PGE here. - No need to do extra EFER read, anyway we trash the read contents. - No need to enable system call (EFER_SCE). Anyway it will be enabled when original EFER is restored. - No need to set MP, ET, NE, WP, AM bits in cr0. Very soon we will reload the original cr0 while restroing the processor state. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> --- arch/x86_64/kernel/acpi/wakeup.S | 112 +-- 1 file changed, 40 insertions(+), 72 deletions(-) diff -puN arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-misc-cleanups arch/x86_64/kernel/acpi/wakeup.S --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-wakeup.S-misc-cleanups 2007-03-07 01:27:32.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/acpi/wakeup.S 2007-03-07 01:27:32.0 +0530 @@ -30,11 +30,12 @@ wakeup_code: cld # setup data segment movw%cs, %ax - movw%ax, %ds# Make ds:0 point to wakeup_start + movw%ax, %ds# Make ds:0 point to wakeup_start movw%ax, %ss - mov $(wakeup_stack - wakeup_code), %sp # Private stack is needed for ASUS board + # Private stack is needed for ASUS board + mov $(wakeup_stack - wakeup_code), %sp - pushl $0 # Kill any dangerous flags + pushl $0 # Kill any dangerous flags popfl movlreal_magic - wakeup_code, %eax @@ -45,7 +46,7 @@ wakeup_code: jz 1f lcall $0xc000,$3 movw%cs, %ax - movw%ax, %ds# Bios might have played with that + movw%ax, %ds# Bios might have played with 
that movw%ax, %ss 1: @@ -75,9 +76,12 @@ wakeup_code: jmp 1f 1: - .byte 0x66, 0xea# prefix + jmpi-opcode - .long wakeup_32 - __START_KERNEL_map - .word __KERNEL_CS + ljmpl *(wakeup_32_vector - wakeup_code) + + .balign 4 +wakeup_32_vector: + .long wakeup_32 - __START_KERNEL_map + .word __KERNEL32_CS, 0 .code32 wakeup_32: @@ -96,65 +100,50 @@ wakeup_32: jnc bogus_cpu movl%edx,%edi - movw$__KERNEL_DS, %ax - movw%ax, %ds - movw%ax, %es - movw%ax, %fs - movw%ax, %gs + movl$__KERNEL_DS, %eax + movl%eax, %ds - movw$__KERNEL_DS, %ax - movw%ax, %ss - - mov $(wakeup_stack - __START_KERNEL_map), %esp movlsaved_magic - __START_KERNEL_map, %eax cmpl$0x9abcdef0, %eax jne bogus_32_magic + movw$0x0e00 + 'i', %ds:(0xb8012) + movb$0xa8, %al ; outb %al, $0x80; + /* * Prepare for entering 64bits mode */ - /* Enable PAE mode and PGE */ + /* Enable PAE */ xorl%eax, %eax btsl$5, %eax - btsl$7, %eax movl%eax, %cr4 /* Setup early boot stage 4 level pagetables */ movl$(wakeup_level4_pgt - __START_KERNEL_map), %eax movl%eax, %cr3 - /* Setup EFER (Extended Feature Enable Register) */ - movl$MSR_EFER, %ecx - rdmsr - /* Fool rdmsr and reset %eax to avoid dependences */ - xorl%eax, %eax /* Enable Long Mode */ + xorl%eax, %eax btsl$_EFER_LME, %eax - /* Enable System Call */ - btsl$_EFER_SCE, %eax - /* No Execute supported? */ + /* No Execute supported? */ btl $20,%edi jnc 1f btsl$_EFER_NX, %eax -1: /* Make changes effective */ +1: movl$MSR_EFER, %ecx + xorl%edx, %edx wrmsr - wbinvd xorl%eax, %eax btsl$31, %eax /* Enable paging and in turn activate Long Mode */ btsl$0, %eax/* Enable protected mode */ - btsl$1, %eax
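As a cross-check of the cleaned-up wakeup path, the register values it ends up loading can be written down in C. This is only a sketch under stated assumptions: the bit positions are the architectural ones (CR4.PAE is bit 5, EFER.LME is bit 8, EFER.NX is bit 11, CR0.PE is bit 0, CR0.PG is bit 31, and NX support is reported in bit 20 of the extended CPUID feature word held in %edi), the helper names are invented, and the CR0 value assumes the part of the hunk cut off in this archive drops the remaining bits as the description says.

```c
#include <assert.h>
#include <stdint.h>

#define BIT32(n)		(UINT32_C(1) << (n))

#define CR4_PAE_BIT		5
#define EFER_LME_BIT		8
#define EFER_NX_BIT		11
#define CPUID_EXT_NX_BIT	20
#define CR0_PE_BIT		0
#define CR0_PG_BIT		31

/* Hypothetical helper: CR4 after the patch, PAE only (PGE no longer set). */
static inline uint32_t wakeup_cr4(void)
{
	return BIT32(CR4_PAE_BIT);
}

/* Hypothetical helper: low half of EFER written by wrmsr. The patch
 * zeroes %edx, so the high half is 0. SCE is no longer set here. */
static inline uint32_t wakeup_efer_low(uint32_t cpuid_ext_edx)
{
	uint32_t eax = BIT32(EFER_LME_BIT);

	if (cpuid_ext_edx & BIT32(CPUID_EXT_NX_BIT))
		eax |= BIT32(EFER_NX_BIT);
	return eax;
}

/* Hypothetical helper: CR0 with only paging and protected mode enabled. */
static inline uint32_t wakeup_cr0(void)
{
	return BIT32(CR0_PG_BIT) | BIT32(CR0_PE_BIT);
}
```

The point of the cleanup is visible in how little is left: everything else (SCE, PGE, the extra CR0 bits) comes back when the saved processor state is restored later.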
[PATCH 15/20] Move swsusp __pa() dependent code to arch portion
o __pa() should be used only on kernel linearly mapped virtual addresses and not on kernel text and data addresses. o Hibernation code needs to determine the physical address associated with kernel symbol to mark a section boundary which contains pages which don't have to be saved and restored during hibernate/resume operation. o Move this piece of code in arch dependent section. So that architectures which don't have kernel text/data mapped into kernel linearly mapped region can come up with their own ways of determining physical addresses associated with a kernel text. Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> --- arch/i386/power/suspend.c | 14 ++ arch/powerpc/kernel/Makefile |1 + arch/powerpc/kernel/suspend.c | 24 arch/x86_64/kernel/suspend.c | 14 ++ kernel/power/power.h |5 ++--- kernel/power/snapshot.c | 11 --- 6 files changed, 55 insertions(+), 14 deletions(-) diff -puN arch/i386/power/suspend.c~move-swsusp-__pa-dependent-code-to-arch-portion arch/i386/power/suspend.c --- linux-2.6.21-rc2-reloc/arch/i386/power/suspend.c~move-swsusp-__pa-dependent-code-to-arch-portion 2007-03-07 01:30:18.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/i386/power/suspend.c 2007-03-07 01:30:18.0 +0530 @@ -16,6 +16,9 @@ /* Defined in arch/i386/power/swsusp.S */ extern int restore_image(void); +/* References to section boundaries */ +extern const void __nosave_begin, __nosave_end; + /* Pointer to the temporary resume page tables */ pgd_t *resume_pg_dir; @@ -156,3 +159,14 @@ int swsusp_arch_resume(void) restore_image(); return 0; } + +/* + * pfn_is_nosave - check if given pfn is in the 'nosave' section + */ + +int pfn_is_nosave(unsigned long pfn) +{ + unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; + unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; + return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); +} diff -puN arch/powerpc/kernel/Makefile~move-swsusp-__pa-dependent-code-to-arch-portion 
arch/powerpc/kernel/Makefile --- linux-2.6.21-rc2-reloc/arch/powerpc/kernel/Makefile~move-swsusp-__pa-dependent-code-to-arch-portion 2007-03-07 01:30:18.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/powerpc/kernel/Makefile2007-03-07 01:30:18.0 +0530 @@ -37,6 +37,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o obj-$(CONFIG_6xx) += idle_6xx.o l2cr_6xx.o cpu_setup_6xx.o obj-$(CONFIG_TAU) += tau_6xx.o obj32-$(CONFIG_SOFTWARE_SUSPEND) += swsusp_32.o +obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend.o obj32-$(CONFIG_MODULES)+= module_32.o ifeq ($(CONFIG_PPC_MERGE),y) diff -puN /dev/null arch/powerpc/kernel/suspend.c --- /dev/null 2007-03-07 00:46:17.354096448 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/powerpc/kernel/suspend.c 2007-03-07 01:30:18.0 +0530 @@ -0,0 +1,24 @@ +/* + * Suspend support specific for power. + * + * Distribute under GPLv2 + * + * Copyright (c) 2002 Pavel Machek <[EMAIL PROTECTED]> + * Copyright (c) 2001 Patrick Mochel <[EMAIL PROTECTED]> + */ + +#include + +/* References to section boundaries */ +extern const void __nosave_begin, __nosave_end; + +/* + * pfn_is_nosave - check if given pfn is in the 'nosave' section + */ + +int pfn_is_nosave(unsigned long pfn) +{ + unsigned long nosave_begin_pfn = __pa(&__nosave_begin) >> PAGE_SHIFT; + unsigned long nosave_end_pfn = PAGE_ALIGN(__pa(&__nosave_end)) >> PAGE_SHIFT; + return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); +} diff -puN arch/x86_64/kernel/suspend.c~move-swsusp-__pa-dependent-code-to-arch-portion arch/x86_64/kernel/suspend.c --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/suspend.c~move-swsusp-__pa-dependent-code-to-arch-portion 2007-03-07 01:30:18.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/suspend.c2007-03-07 01:30:18.0 +0530 @@ -13,6 +13,9 @@ #include #include +/* References to section boundaries */ +extern const void __nosave_begin, __nosave_end; + struct saved_context saved_context; unsigned long saved_context_eax, saved_context_ebx, saved_context_ecx, saved_context_edx; @@ 
-220,4 +223,15 @@ int swsusp_arch_resume(void) restore_image(); return 0; } + +/* + * pfn_is_nosave - check if given pfn is in the 'nosave' section + */ + +int pfn_is_nosave(unsigned long pfn) +{ + unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; + unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; + return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); +} #endif /* CONFIG_SOFTWARE_SUSPEND */ diff -puN kernel/power/power.h~move-swsusp-__pa-dependent-code-to-arch-portion kernel/power/power.h --- linux-2.6.21-rc2-reloc/kernel/power/power.h~move-swsusp-__pa-
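The three pfn_is_nosave() copies above differ only in how they turn the section symbols into physical addresses (__pa_symbol() on i386 and x86_64, __pa() on powerpc). The range check itself can be tried out in userspace by passing the two physical addresses in directly. A minimal sketch, assuming 4K pages. pfn_in_nosave() is an invented name that takes the addresses as parameters instead of reading __nosave_begin and __nosave_end.

```c
#include <assert.h>

/* Assumption: 4K pages. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE-1))
#define PAGE_ALIGN(addr)	(((addr)+PAGE_SIZE-1)&PAGE_MASK)

/*
 * Hypothetical userspace variant of pfn_is_nosave(): begin_pa and
 * end_pa stand in for the physical addresses of __nosave_begin and
 * __nosave_end. The end is rounded up so a partially used last page
 * still counts as nosave.
 */
static inline int pfn_in_nosave(unsigned long pfn,
				unsigned long begin_pa,
				unsigned long end_pa)
{
	unsigned long nosave_begin_pfn = begin_pa >> PAGE_SHIFT;
	unsigned long nosave_end_pfn = PAGE_ALIGN(end_pa) >> PAGE_SHIFT;

	return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn);
}
```

For a section running from 0x5000 to 0x6800, pfns 5 and 6 are nosave and pfns 4 and 7 are not.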
[PATCH 3/20] x86_64: Clean up the early boot page table
- Merge physmem_pgt and ident_pgt, removing physmem_pgt. The merge is broken as soon as mm/init.c:init_memory_mapping is run.
- As physmem_pgt is gone don't export it in pgtable.h.
- Use defines from pgtable.h for page permissions.
- Fix the physical memory identity mapping so it is at the correct address.
- Remove the physical memory mapping from wakeup_level4_pgt. It is at the wrong address, so we can't possibly be using it.
- Simplify NEXT_PAGE. The work to calculate the phys_ alias of the labels was very cool. Unfortunately it was a brittle special purpose hack that makes maintenance more difficult. Instead just use label - __START_KERNEL_map like we do everywhere else in assembly.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head.S    |   61 +++
 include/asm-x86_64/pgtable.h |    1
 2 files changed, 28 insertions(+), 34 deletions(-)

diff -puN arch/x86_64/kernel/head.S~x86_64-Cleanup-the-early-boot-page-table arch/x86_64/kernel/head.S
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/head.S~x86_64-Cleanup-the-early-boot-page-table	2007-03-07 01:22:07.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/head.S	2007-03-07 01:22:07.0 +0530
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -260,52 +261,48 @@ ljumpvector:
 ENTRY(stext)
 ENTRY(_stext)
 
-	$page = 0
 #define NEXT_PAGE(name) \
-	$page = $page + 1; \
-	.org $page * 0x1000; \
-	phys_/**/name = $page * 0x1000 + __PHYSICAL_START; \
+	.balign	PAGE_SIZE; \
 ENTRY(name)
 
+/* Automate the creation of 1 to 1 mapping pmd entries */
+#define PMDS(START, PERM, COUNT)		\
+	i = 0 ;					\
+	.rept (COUNT) ;				\
+	.quad	(START) + (i << 21) + (PERM) ;	\
+	i = i + 1 ;				\
+	.endr
+
 NEXT_PAGE(init_level4_pgt)
 	/* This gets initialized in x86_64_start_kernel */
 	.fill	512,8,0
 
 NEXT_PAGE(level3_ident_pgt)
-	.quad	phys_level2_ident_pgt | 0x007
+	.quad	level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
 	.fill	511,8,0
NEXT_PAGE(level3_kernel_pgt) .fill 510,8,0 /* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */ - .quad phys_level2_kernel_pgt | 0x007 + .quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE .fill 1,8,0 NEXT_PAGE(level2_ident_pgt) - /* 40MB for bootup. */ - i = 0 - .rept 20 - .quad i << 21 | 0x083 - i = i + 1 - .endr - .fill 492,8,0 + /* Since I easily can, map the first 1G. +* Don't set NX because code runs from these pages. +*/ + PMDS(0x, __PAGE_KERNEL_LARGE_EXEC, PTRS_PER_PMD) NEXT_PAGE(level2_kernel_pgt) /* 40MB kernel mapping. The kernel code cannot be bigger than that. When you change this change KERNEL_TEXT_SIZE in page.h too. */ /* (2^48-(2*1024*1024*1024)-((2^39)*511)-((2^30)*510)) = 0 */ - i = 0 - .rept 20 - .quad i << 21 | 0x183 - i = i + 1 - .endr + PMDS(0x, __PAGE_KERNEL_LARGE_EXEC|_PAGE_GLOBAL, + KERNEL_TEXT_SIZE/PMD_SIZE) /* Module mapping starts here */ - .fill 492,8,0 - -NEXT_PAGE(level3_physmem_pgt) - .quad phys_level2_kernel_pgt | 0x007 /* so that __va works even before pagetable_init */ - .fill 511,8,0 + .fill (PTRS_PER_PMD - (KERNEL_TEXT_SIZE/PMD_SIZE)),8,0 +#undef PMDS #undef NEXT_PAGE .data @@ -313,12 +310,10 @@ NEXT_PAGE(level3_physmem_pgt) #ifdef CONFIG_ACPI_SLEEP .align PAGE_SIZE ENTRY(wakeup_level4_pgt) - .quad phys_level3_ident_pgt | 0x007 - .fill 255,8,0 - .quad phys_level3_physmem_pgt | 0x007 - .fill 254,8,0 + .quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE + .fill 510,8,0 /* (2^48-(2*1024*1024*1024))/(2^39) = 511 */ - .quad phys_level3_kernel_pgt | 0x007 + .quad level3_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE #endif #ifndef CONFIG_HOTPLUG_CPU @@ -332,12 +327,12 @@ ENTRY(wakeup_level4_pgt) */ .align PAGE_SIZE ENTRY(boot_level4_pgt) - .quad phys_level3_ident_pgt | 0x007 - .fill 255,8,0 - .quad phys_level3_physmem_pgt | 0x007 - .fill 254,8,0 + .quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE + .fill 257,8,0 + .quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE + .fill 252,8,0 /* 
(2^48-(2*1024*1024*1024))/(2^39) = 511 */ - .quad phys_level3_kernel_pgt | 0x007 + .quad level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE .data diff -puN include/asm-x86_64/pgtable.h~x86_64-Cl
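For readers following the PMDS() macro introduced in the patch above, here is a minimal user-space C sketch of what it generates: COUNT consecutive 2MB entries, each equal to START + (i << 21) + PERM. The function name fill_pmds() is made up for illustration, and the flag value 0x083 is simply the one from the removed `.quad i << 21 | 0x083` lines; the patch itself passes __PAGE_KERNEL_LARGE_EXEC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PMD_SHIFT 21	/* each PMD entry maps a 2MB large page */

/*
 * Hypothetical helper mirroring the PMDS(START, PERM, COUNT) assembler
 * macro: entry i maps physical address start + i * 2MB with flags perm.
 */
static void fill_pmds(uint64_t *pmd, uint64_t start, uint64_t perm,
		      size_t count)
{
	size_t i;

	assert(pmd != NULL);
	for (i = 0; i < count; i++)
		pmd[i] = start + ((uint64_t)i << SKETCH_PMD_SHIFT) + perm;
}
```

With count = 512 (PTRS_PER_PMD on x86_64) and start = 0 this covers exactly the first 1G, which is what the new level2_ident_pgt comment describes.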
Re: [RFC][PATCH 0/7] Resource controllers based on process containers
Paul Menage wrote:
> On 3/6/07, Pavel Emelianov <[EMAIL PROTECTED]> wrote:
>> 2. Extended containers may register themselves too late.
>>    Kernel threads/helpers start forking, opening files
>>    and touching pages much earlier. This patchset
>>    workarounds this in not-so-cute manner and I'm waiting
>>    for Paul's comments on this issue.
>>
>
> Can we not make sure that each subsystem registers itself before any
> of its resources become usable? So the file counting subsystem should

Actually, all the subsystems I've sent become usable very early, much
earlier than the initcalls start. I haven't found exactly where yet,
but I can track it down if we really need it.

> register at some point before filp_open() becomes usable, and the
> process counting subsystem should register before it's possible to
> fork, etc.
>
> Paul
>
[PATCH 5/20] x86_64: modify copy_bootdata to use virtual addresses
Use virtual addresses instead of physical addresses in copy_bootdata.

In addition fix the implementation of the old bootloader convention.
Everything is at real_mode_data always. It is just that sometimes
real_mode_data was relocated by setup.S to not sit at 0x90000.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/head64.c |   17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff -puN arch/x86_64/kernel/head64.c~x86_64-modify-copy_bootdata-to-use-virtual-addresses arch/x86_64/kernel/head64.c
--- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/head64.c~x86_64-modify-copy_bootdata-to-use-virtual-addresses	2007-03-07 01:23:55.0 +0530
+++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/head64.c	2007-03-07 01:23:55.0 +0530
@@ -29,25 +29,24 @@ static void __init clear_bss(void)
 }
 
 #define NEW_CL_POINTER		0x228	/* Relative to real mode data */
-#define OLD_CL_MAGIC_ADDR	0x90020
+#define OLD_CL_MAGIC_ADDR	0x20
 #define OLD_CL_MAGIC		0xA33F
-#define OLD_CL_BASE_ADDR	0x90000
-#define OLD_CL_OFFSET		0x90022
+#define OLD_CL_OFFSET		0x22
 
 static void __init copy_bootdata(char *real_mode_data)
 {
-	int new_data;
+	unsigned long new_data;
 	char * command_line;
 
 	memcpy(x86_boot_params, real_mode_data, BOOT_PARAM_SIZE);
-	new_data = *(int *) (x86_boot_params + NEW_CL_POINTER);
+	new_data = *(u32 *) (x86_boot_params + NEW_CL_POINTER);
 	if (!new_data) {
-		if (OLD_CL_MAGIC != * (u16 *) OLD_CL_MAGIC_ADDR) {
+		if (OLD_CL_MAGIC != *(u16 *)(real_mode_data + OLD_CL_MAGIC_ADDR)) {
 			return;
 		}
-		new_data = OLD_CL_BASE_ADDR + * (u16 *) OLD_CL_OFFSET;
+		new_data = __pa(real_mode_data) + *(u16 *)(real_mode_data + OLD_CL_OFFSET);
 	}
-	command_line = (char *) ((u64)(new_data));
+	command_line = __va(new_data);
 	memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
 }
 
@@ -74,7 +73,7 @@ void __init x86_64_start_kernel(char * r
 		cpu_pda(i) = &boot_cpu_pda[i];
 
 	pda_init(0);
-	copy_bootdata(real_mode_data);
+	copy_bootdata(__va(real_mode_data));
 #ifdef CONFIG_SMP
 	cpu_set(0, cpu_online_map);
 #endif
_
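As a rough illustration of the old bootloader convention the patch fixes, the lookup can be sketched in user-space C as below. This is only a sketch under stated assumptions: the buffer stands in for real_mode_data, the helper names read_le16() and old_command_line() are invented here, and the kernel's __pa()/__va() round trip is replaced by plain pointer arithmetic because a user-space buffer has no separate physical address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets relative to real_mode_data, as in the patched head64.c. */
#define OLD_CL_MAGIC_ADDR	0x20
#define OLD_CL_MAGIC		0xA33F
#define OLD_CL_OFFSET		0x22

/* Read a little-endian 16-bit value regardless of host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: return the old-style command line inside the
 * real mode data block, or NULL if the magic value is absent.
 */
static const char *old_command_line(const unsigned char *real_mode_data)
{
	assert(real_mode_data != NULL);
	if (read_le16(real_mode_data + OLD_CL_MAGIC_ADDR) != OLD_CL_MAGIC)
		return NULL;
	return (const char *)real_mode_data +
		read_le16(real_mode_data + OLD_CL_OFFSET);
}
```

The point it tries to show is the one in the changelog: both the magic and the offset are found relative to wherever real_mode_data actually is, not at fixed absolute addresses.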
[PATCH 12/20] x86_64: 64bit ACPI wakeup trampoline
o Moved wakeup_level4_pgt into the wakeup routine so we can run the kernel above 4G. o Now we first go to 64bit mode and continue to run from trampoline and then then start accessing kernel symbols and restore processor context. This enables us to resume even in relocatable kernel context when kernel might not be loaded at physical addr it has been compiled for. o Removed the need for modifying any existing kernel page table. o Increased the size of the wakeup routine to 8K. This is required as wake page tables are on trampoline itself and they got to be at 4K boundary, hence one page is not sufficient. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> --- arch/x86_64/kernel/acpi/sleep.c | 22 ++ arch/x86_64/kernel/acpi/wakeup.S | 59 --- arch/x86_64/kernel/head.S|9 - 3 files changed, 41 insertions(+), 49 deletions(-) diff -puN arch/x86_64/kernel/acpi/sleep.c~x86_64-64bit-ACPI-wakeup-trampoline arch/x86_64/kernel/acpi/sleep.c --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/acpi/sleep.c~x86_64-64bit-ACPI-wakeup-trampoline 2007-03-07 01:28:11.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/acpi/sleep.c 2007-03-07 01:28:11.0 +0530 @@ -60,17 +60,6 @@ extern char wakeup_start, wakeup_end; extern unsigned long acpi_copy_wakeup_routine(unsigned long); -static pgd_t low_ptr; - -static void init_low_mapping(void) -{ - pgd_t *slot0 = pgd_offset(current->mm, 0UL); - low_ptr = *slot0; - set_pgd(slot0, *pgd_offset(current->mm, PAGE_OFFSET)); - WARN_ON(num_online_cpus() != 1); - local_flush_tlb(); -} - /** * acpi_save_state_mem - save kernel state * @@ -79,8 +68,6 @@ static void init_low_mapping(void) */ int acpi_save_state_mem(void) { - init_low_mapping(); - memcpy((void *)acpi_wakeup_address, &wakeup_start, &wakeup_end - &wakeup_start); acpi_copy_wakeup_routine(acpi_wakeup_address); @@ -93,8 +80,6 @@ int acpi_save_state_mem(void) */ void acpi_restore_state_mem(void) { - set_pgd(pgd_offset(current->mm, 0UL), low_ptr); - 
local_flush_tlb(); } /** @@ -107,10 +92,11 @@ void acpi_restore_state_mem(void) */ void __init acpi_reserve_bootmem(void) { - acpi_wakeup_address = (unsigned long)alloc_bootmem_low(PAGE_SIZE); - if ((&wakeup_end - &wakeup_start) > PAGE_SIZE) + acpi_wakeup_address = (unsigned long)alloc_bootmem_low(PAGE_SIZE*2); + if ((&wakeup_end - &wakeup_start) > (PAGE_SIZE*2)) printk(KERN_CRIT - "ACPI: Wakeup code way too big, will crash on attempt to suspend\n"); + "ACPI: Wakeup code way too big, will crash on attempt" + " to suspend\n"); } static int __init acpi_sleep_setup(char *str) diff -puN arch/x86_64/kernel/acpi/wakeup.S~x86_64-64bit-ACPI-wakeup-trampoline arch/x86_64/kernel/acpi/wakeup.S --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/acpi/wakeup.S~x86_64-64bit-ACPI-wakeup-trampoline 2007-03-07 01:28:11.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/acpi/wakeup.S 2007-03-07 01:28:11.0 +0530 @@ -1,6 +1,7 @@ .text #include #include +#include #include #include @@ -62,12 +63,15 @@ wakeup_code: movb$0xa2, %al ; outb %al, $0x80 - lidt%ds:idt_48a - wakeup_code - xorl%eax, %eax - movw%ds, %ax# (Convert %ds:gdt to a linear ptr) - shll$4, %eax - addl$(gdta - wakeup_code), %eax - movl%eax, gdt_48a +2 - wakeup_code + mov %ds, %ax# Find 32bit wakeup_code addr + movzx %ax, %esi # (Convert %ds:gdt to a liner ptr) + shll$4, %esi + # Fix up the vectors + addl%esi, wakeup_32_vector - wakeup_code + addl%esi, wakeup_long64_vector - wakeup_code + addl%esi, gdt_48a + 2 - wakeup_code # Fixup the gdt pointer + + lidtl %ds:idt_48a - wakeup_code lgdtl %ds:gdt_48a - wakeup_code # load gdt with whatever is # appropriate @@ -80,7 +84,7 @@ wakeup_code: .balign 4 wakeup_32_vector: - .long wakeup_32 - __START_KERNEL_map + .long wakeup_32 - wakeup_code .word __KERNEL32_CS, 0 .code32 @@ -103,10 +107,6 @@ wakeup_32: movl$__KERNEL_DS, %eax movl%eax, %ds - movlsaved_magic - __START_KERNEL_map, %eax - cmpl$0x9abcdef0, %eax - jne bogus_32_magic - movw$0x0e00 + 'i', %ds:(0xb8012) movb$0xa8, %al ; 
outb %al, $0x80; @@ -120,7 +120,7 @@ wakeup_32: movl%eax, %cr4 /* Setup early boot stage 4 level pagetables */ - movl$(wakeup_level
[PATCH 14/20] x86_64: Remove the identity mapping as early as possible
With the rewrite of the SMP trampoline and the early page allocator there is nothing that needs identity mapped pages, once we start executing C code. So add zap_identity_mappings into head64.c and remove zap_low_mappings() from much later in the code. The functions are subtly different thus the name change. This also kills boot_level4_pgt which was from an earlier attempt to move the identity mappings as early as possible, and is now no longer needed. Essentially I have replaced boot_level4_pgt with trampoline_level4_pgt in trampoline.S Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> --- arch/x86_64/kernel/head.S| 39 ++- arch/x86_64/kernel/head64.c | 17 +++-- arch/x86_64/kernel/setup.c |2 -- arch/x86_64/kernel/setup64.c |1 - arch/x86_64/mm/init.c| 24 include/asm-x86_64/pgtable.h |1 - include/asm-x86_64/proto.h |2 -- 7 files changed, 25 insertions(+), 61 deletions(-) diff -puN arch/x86_64/kernel/head64.c~x86_64-Remove-the-identity-mapping-as-early-as-possible arch/x86_64/kernel/head64.c --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/head64.c~x86_64-Remove-the-identity-mapping-as-early-as-possible 2007-03-07 01:29:50.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/head64.c 2007-03-07 01:29:50.0 +0530 @@ -18,8 +18,16 @@ #include #include #include +#include #include +static void __init zap_identity_mappings(void) +{ + pgd_t *pgd = pgd_offset_k(0UL); + pgd_clear(pgd); + __flush_tlb(); +} + /* Don't add a printk in there. printk relies on the PDA which is not initialized yet. 
*/ static void __init clear_bss(void) @@ -57,18 +65,15 @@ void __init x86_64_start_kernel(char * r /* clear bss before set_intr_gate with early_idt_handler */ clear_bss(); + /* Make NULL pointers segfault */ + zap_identity_mappings(); + for (i = 0; i < IDT_ENTRIES; i++) set_intr_gate(i, early_idt_handler); asm volatile("lidt %0" :: "m" (idt_descr)); early_printk("Kernel alive\n"); - /* -* switch to init_level4_pgt from boot_level4_pgt -*/ - memcpy(init_level4_pgt, boot_level4_pgt, PTRS_PER_PGD*sizeof(pgd_t)); - asm volatile("movq %0,%%cr3" :: "r" (__pa_symbol(&init_level4_pgt))); - for (i = 0; i < NR_CPUS; i++) cpu_pda(i) = &boot_cpu_pda[i]; diff -puN arch/x86_64/kernel/head.S~x86_64-Remove-the-identity-mapping-as-early-as-possible arch/x86_64/kernel/head.S --- linux-2.6.21-rc2-reloc/arch/x86_64/kernel/head.S~x86_64-Remove-the-identity-mapping-as-early-as-possible 2007-03-07 01:29:50.0 +0530 +++ linux-2.6.21-rc2-reloc-root/arch/x86_64/kernel/head.S 2007-03-07 01:29:50.0 +0530 @@ -71,7 +71,7 @@ startup_32: movl%eax, %cr4 /* Setup early boot stage 4 level pagetables */ - movl$(boot_level4_pgt - __START_KERNEL_map), %eax + movl$(init_level4_pgt - __START_KERNEL_map), %eax movl%eax, %cr3 /* Setup EFER (Extended Feature Enable Register) */ @@ -115,7 +115,7 @@ ENTRY(secondary_startup_64) movq%rax, %cr4 /* Setup early boot stage 4 level pagetables. */ - movq$(boot_level4_pgt - __START_KERNEL_map), %rax + movq$(init_level4_pgt - __START_KERNEL_map), %rax movq%rax, %cr3 /* Check if nx is implemented */ @@ -274,9 +274,19 @@ ENTRY(name) i = i + 1 ; \ .endr + /* +* This default setting generates an ident mapping at address 0x10 +* and a mapping for the kernel that precisely maps virtual address +* 0x8000 to physical address 0x00. 
(always using +* 2Mbyte large pages provided by PAE mode) +*/ NEXT_PAGE(init_level4_pgt) - /* This gets initialized in x86_64_start_kernel */ - .fill 512,8,0 + .quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE + .fill 257,8,0 + .quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE + .fill 252,8,0 + /* (2^48-(2*1024*1024*1024))/(2^39) = 511 */ + .quad level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE NEXT_PAGE(level3_ident_pgt) .quad level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE @@ -307,27 +317,6 @@ NEXT_PAGE(level2_kernel_pgt) #undef NEXT_PAGE .data - -#ifndef CONFIG_HOTPLUG_CPU - __INITDATA -#endif - /* -* This default setting generates an ident mapping at address 0x10 -* and a mapping for the kernel that precisely maps virtual address -* 0x8000 to physical address 0x00. (always using -* 2Mbyte large pages provided by PAE mode) -*/ - .align PAGE_SIZE -ENTRY(boot_level4_pgt) - .quad le
Re: [patch 3/6] mm: fix fault vs invalidate race for linear mappings
On Tue, Mar 06, 2007 at 11:08:41PM -0800, Andrew Morton wrote:
> On Wed, 7 Mar 2007 07:57:27 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote:
>
> > > Why was truncate_inode_pages_range() altered to unmap the page if it got
> > > mapped again?
> > >
> > > Oh. Because the unmap_mapping_range() call got removed from vmtruncate().
> > > Why? (Please send suitable updates to the changelog).
> >
> > We have to ensure it is unmapped, and be prepared to unmap it while under
> > the page lock.
>
> But vmtruncate() dropped i_size, so nobody will map this page into
> pagetables from then on.

But there could be a fault in progress... the only way to know is
locking the page.

> > > I guess truncate of a mmapped area isn't sufficiently common to worry about
> > > the inefficiency of this change.
> >
> > Yeah, and it should be more efficient for files that aren't mmapped,
> > because we don't have to take i_mmap_lock for them.
> >
> > > Lots of memory barriers got removed in memory.c, unchangeloggedly.
> >
> > Yeah they were all for the lockless truncate_count checks. Now that
> > we use the page lock, we don't need barriers.
> >
> > > Gratuitous renaming of locals in do_no_page() makes the change hard to
> > > review. Should have been a separate patch.
> > >
> > > In fact, the patch would have been heaps clearer if that renaming had been
> > > a separate patch.
> >
> > Shall I?
>
> If you don't have anything better to do, yes please ;)

OK.
Re: [RFC][PATCH 2/7] RSS controller core
Balbir Singh wrote: > Pavel Emelianov wrote: >> This includes setup of RSS container within generic >> process containers, all the declarations used in RSS >> accounting, and core code responsible for accounting. >> >> >> >> >> diff -upr linux-2.6.20.orig/include/linux/rss_container.h >> linux-2.6.20-0/include/linux/rss_container.h >> --- linux-2.6.20.orig/include/linux/rss_container.h2007-03-06 >> 13:39:17.0 +0300 >> +++ linux-2.6.20-0/include/linux/rss_container.h2007-03-06 >> 13:33:28.0 +0300 >> @@ -0,0 +1,68 @@ >> +#ifndef __RSS_CONTAINER_H__ >> +#define __RSS_CONTAINER_H__ >> +/* >> + * RSS container >> + * >> + * Copyright 2007 OpenVZ SWsoft Inc >> + * >> + * Author: Pavel Emelianov <[EMAIL PROTECTED]> >> + * >> + */ >> + >> +struct page_container; >> +struct rss_container; >> + >> +#ifdef CONFIG_RSS_CONTAINER >> +int container_rss_prepare(struct page *, struct vm_area_struct *vma, >> +struct page_container **); >> + >> +void container_rss_add(struct page_container *); >> +void container_rss_del(struct page_container *); >> +void container_rss_release(struct page_container *); >> + >> +int mm_init_container(struct mm_struct *mm, struct task_struct *tsk); >> +void mm_free_container(struct mm_struct *mm); >> + >> +unsigned long container_isolate_pages(unsigned long nr_to_scan, >> +struct rss_container *rss, struct list_head *dst, >> +int active, unsigned long *scanned); >> +unsigned long container_nr_physpages(struct rss_container *rss); >> + >> +unsigned long container_try_to_free_pages(struct rss_container *); >> +void container_out_of_memory(struct rss_container *); >> + >> +void container_rss_init_early(void); >> +#else >> +static inline int container_rss_prepare(struct page *pg, >> +struct vm_area_struct *vma, struct page_container **pc) >> +{ >> +*pc = NULL; /* to make gcc happy */ >> +return 0; >> +} >> + >> +static inline void container_rss_add(struct page_container *pc) >> +{ >> +} >> + >> +static inline void container_rss_del(struct page_container 
*pc) >> +{ >> +} >> + >> +static inline void container_rss_release(struct page_container *pc) >> +{ >> +} >> + >> +static inline int mm_init_container(struct mm_struct *mm, struct >> task_struct *t) >> +{ >> +return 0; >> +} >> + >> +static inline void mm_free_container(struct mm_struct *mm) >> +{ >> +} >> + >> +static inline void container_rss_init_early(void) >> +{ >> +} >> +#endif >> +#endif >> diff -upr linux-2.6.20.orig/init/Kconfig linux-2.6.20-0/init/Kconfig >> --- linux-2.6.20.orig/init/Kconfig2007-03-06 13:33:28.0 +0300 >> +++ linux-2.6.20-0/init/Kconfig2007-03-06 13:33:28.0 +0300 >> @@ -265,6 +265,13 @@ config CPUSETS >> bool >> select CONTAINERS >> >> +config RSS_CONTAINER >> +bool "RSS accounting container" >> +select RESOURCE_COUNTERS >> +help >> + Provides a simple Resource Controller for monitoring and >> + controlling the total Resident Set Size of the tasks in a >> container >> + > > The wording looks very familiar :-). It would be useful to add > "The reclaim logic is now container aware, when the container goes > overlimit > the page reclaimer reclaims pages belonging to this container. If we are > unable to reclaim enough pages to satisfy the request, the process is > killed with an out of memory warning" OK. Thanks. 
> >> config SYSFS_DEPRECATED >> bool "Create deprecated sysfs files" >> default y >> diff -upr linux-2.6.20.orig/mm/Makefile linux-2.6.20-0/mm/Makefile >> --- linux-2.6.20.orig/mm/Makefile2007-02-04 21:44:54.0 +0300 >> +++ linux-2.6.20-0/mm/Makefile2007-03-06 13:33:28.0 +0300 >> @@ -29,3 +29,5 @@ obj-$(CONFIG_MEMORY_HOTPLUG) += memory_h >> obj-$(CONFIG_FS_XIP) += filemap_xip.o >> obj-$(CONFIG_MIGRATION) += migrate.o >> obj-$(CONFIG_SMP) += allocpercpu.o >> + >> +obj-$(CONFIG_RSS_CONTAINER) += rss_container.o >> diff -upr linux-2.6.20.orig/mm/rss_container.c >> linux-2.6.20-0/mm/rss_container.c >> --- linux-2.6.20.orig/mm/rss_container.c2007-03-06 >> 13:39:17.0 +0300 >> +++ linux-2.6.20-0/mm/rss_container.c2007-03-06 13:33:28.0 >> +0300 >> @@ -0,0 +1,307 @@ >> +/* >> + * RSS accounting container >> + * >> + * Copyright 2007 OpenVZ SWsoft Inc >> + * >> + * Author: Pavel Emelianov <[EMAIL PROTECTED]> >> + * >> + */ >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +static struct container_subsys rss_subsys; >> + >> +struct rss_container { >> +struct res_counter res; >> +struct list_head page_list; >> +struct container_subsys_state css; >> +}; >> + >> +struct page_container { >> +struct page *page; >> +struct rss_container *cnt; >> +struct list_head list; >> +}; >> + > > Yes, this is what I was planning to get to -- a per container LRU list. > But you have just one list, don't you need active and inactive
Re: [patch 4/6] mm: merge populate and nopage into fault (fixes nonlinear)
On Tue, Mar 06, 2007 at 10:51:01PM -0800, Andrew Morton wrote:
> Does anybody really pass a NULL `type' arg into filemap_nopage()?

The major vs. minor fault accounting patch that introduced the argument
didn't make non-NULL type arguments a requirement. It's essentially an
optional second return value and the NULL pointer represents the caller
choosing to ignore it. I'm not sure I actually liked that aspect of it,
but that's how it ended up going in. I think it had something to do with
driver churn clashing with the sweep at the time of the merge. I'd rather
the argument be mandatory and defaulted to VM_FAULT_MINOR.

It's something of a non-answer, though, since it only discusses a
convention as opposed to reviewing specific callers of filemap_nopage().
NULL type arguments to ->nopage() are rare at most, and could be easily
eliminated, at least for in-tree drivers.

	egrep -nr 'nopage.*NULL' . 2>/dev/null | grep -v '^Bin'

on a current git tree yields zero matches.

-- wli
Re: [patch] epoll use a single inode ...
Eric Dumazet wrote:
> I would definitly *love* saving dentries for pipes (and sockets too), but
> how are you going to get the inode ?
>
> pipes()/sockets() can use read()/write()/rw_verify_area() and thus need
> file->f_path.dentry->d_inode (so each pipe needs a separate dentry)
>
> Are you suggesting adding a new "struct file_operations" member to get
> the inode ?
>
> Or re-intoducing an inode pointer in struct file ?

Crazy ideas : (some readers are going to kill me)

1) Use the low order bit of f_path.dentry to say : this pointer is not a
   pointer to a dentry but the inode pointer (with the low order bit set
   to 1)

OR

2) file->f_path.dentry set to NULL for these special files (so that we
   don't need to dput() and cache line ping pong the common dentry each
   time we __fput() a pipe/socket. The same trick could be used for
   file->f_path.mnt, because there is a big SMP cache line ping/pong to
   maintain a mnt_count on the pipe/sockets mountpoint while these file
   systems cannot be un-mounted)

   If dentry is NULL, we get the inode pointer from an overlay of

	struct file_ra_state f_ra;

   (because for these special files readahead is unused)

This adds some conditional branches of course, but being able to save ram
and better use cpu caches might be worth them.
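To make crazy idea 1) concrete, here is a minimal user-space C sketch of low-order-bit pointer tagging. It is not kernel code: struct sketch_inode, struct sketch_dentry and the three helpers are hypothetical names, and it assumes both structures are at least 2-byte aligned so that bit 0 of a valid pointer is always clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct inode and struct dentry. */
struct sketch_inode  { unsigned long i_ino; };
struct sketch_dentry { struct sketch_inode *d_inode; };

/* Store an inode pointer in a "dentry" slot, marked by bit 0. */
static void *tag_inode(struct sketch_inode *inode)
{
	assert(((uintptr_t)inode & 1) == 0);	/* alignment leaves bit 0 free */
	return (void *)((uintptr_t)inode | 1);
}

/* Nonzero if the slot holds a tagged inode rather than a dentry. */
static int slot_is_inode(const void *slot)
{
	return (int)((uintptr_t)slot & 1);
}

/* Get the inode either directly (tagged) or through the dentry. */
static struct sketch_inode *slot_to_inode(void *slot)
{
	if (slot_is_inode(slot))
		return (struct sketch_inode *)((uintptr_t)slot & ~(uintptr_t)1);
	return ((struct sketch_dentry *)slot)->d_inode;
}
```

The extra test in slot_to_inode() is the "conditional branch" cost mentioned above; every user of the slot would have to go through such an accessor instead of dereferencing the pointer directly.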
Re: [RFC][PATCH 1/7] Resource counters
Balbir Singh wrote: > Pavel Emelianov wrote: >> Introduce generic structures and routines for >> resource accounting. >> >> Each resource accounting container is supposed to >> aggregate it, container_subsystem_state and its >> resource-specific members within. >> >> >> >> >> diff -upr linux-2.6.20.orig/include/linux/res_counter.h >> linux-2.6.20-0/include/linux/res_counter.h >> --- linux-2.6.20.orig/include/linux/res_counter.h2007-03-06 >> 13:39:17.0 +0300 >> +++ linux-2.6.20-0/include/linux/res_counter.h2007-03-06 >> 13:33:28.0 +0300 >> @@ -0,0 +1,83 @@ >> +#ifndef __RES_COUNTER_H__ >> +#define __RES_COUNTER_H__ >> +/* >> + * resource counters >> + * >> + * Copyright 2007 OpenVZ SWsoft Inc >> + * >> + * Author: Pavel Emelianov <[EMAIL PROTECTED]> >> + * >> + */ >> + >> +#include >> + >> +struct res_counter { >> +unsigned long usage; >> +unsigned long limit; >> +unsigned long failcnt; >> +spinlock_t lock; >> +}; >> + >> +enum { >> +RES_USAGE, >> +RES_LIMIT, >> +RES_FAILCNT, >> +}; >> + >> +ssize_t res_counter_read(struct res_counter *cnt, int member, >> +const char __user *buf, size_t nbytes, loff_t *pos); >> +ssize_t res_counter_write(struct res_counter *cnt, int member, >> +const char __user *buf, size_t nbytes, loff_t *pos); >> + >> +static inline void res_counter_init(struct res_counter *cnt) >> +{ >> +spin_lock_init(&cnt->lock); >> +cnt->limit = (unsigned long)LONG_MAX; >> +} >> + > > Is there any way to indicate that there are no limits on this container. Yes - LONG_MAX is essentially a "no limit" value as no container will ever have such many files :) > LONG_MAX is quite huge, but still when the administrator wants to > configure a container to *un-limited usage*, it becomes hard for > the administrator. 
> >> +static inline int res_counter_charge_locked(struct res_counter *cnt, >> +unsigned long val) >> +{ >> +if (cnt->usage <= cnt->limit - val) { >> +cnt->usage += val; >> +return 0; >> +} >> + >> +cnt->failcnt++; >> +return -ENOMEM; >> +} >> + >> +static inline int res_counter_charge(struct res_counter *cnt, >> +unsigned long val) >> +{ >> +int ret; >> +unsigned long flags; >> + >> +spin_lock_irqsave(&cnt->lock, flags); >> +ret = res_counter_charge_locked(cnt, val); >> +spin_unlock_irqrestore(&cnt->lock, flags); >> +return ret; >> +} >> + > > Will atomic counters help here. I'm afraid no. We have to atomically check for limit and alter one of usage or failcnt depending on the checking result. Making this with atomic_xxx ops will require at least two ops. If we'll remove failcnt this would look like while (atomic_cmpxchg(...)) which is also not that good. Moreover - in RSS accounting patches I perform page list manipulations under this lock, so this also saves one atomic op. >> +static inline void res_counter_uncharge_locked(struct res_counter *cnt, >> +unsigned long val) >> +{ >> +if (unlikely(cnt->usage < val)) { >> +WARN_ON(1); >> +val = cnt->usage; >> +} >> + >> +cnt->usage -= val; >> +} >> + >> +static inline void res_counter_uncharge(struct res_counter *cnt, >> +unsigned long val) >> +{ >> +unsigned long flags; >> + >> +spin_lock_irqsave(&cnt->lock, flags); >> +res_counter_uncharge_locked(cnt, val); >> +spin_unlock_irqrestore(&cnt->lock, flags); >> +} >> + >> +#endif >> diff -upr linux-2.6.20.orig/init/Kconfig linux-2.6.20-0/init/Kconfig >> --- linux-2.6.20.orig/init/Kconfig2007-03-06 13:33:28.0 +0300 >> +++ linux-2.6.20-0/init/Kconfig2007-03-06 13:33:28.0 +0300 >> @@ -265,6 +265,10 @@ config CPUSETS >> >>Say N if unsure. 
>> >> +config RESOURCE_COUNTERS >> +bool >> +select CONTAINERS >> + >> config SYSFS_DEPRECATED >> bool "Create deprecated sysfs files" >> default y >> diff -upr linux-2.6.20.orig/kernel/Makefile >> linux-2.6.20-0/kernel/Makefile >> --- linux-2.6.20.orig/kernel/Makefile2007-03-06 13:33:28.0 >> +0300 >> +++ linux-2.6.20-0/kernel/Makefile2007-03-06 13:33:28.0 +0300 >> @@ -51,6 +51,7 @@ obj-$(CONFIG_RELAY) += relay.o >> obj-$(CONFIG_UTS_NS) += utsname.o >> obj-$(CONFIG_TASK_DELAY_ACCT) += delayacct.o >> obj-$(CONFIG_TASKSTATS) += taskstats.o tsacct.o >> +obj-$(CONFIG_RESOURCE_COUNTERS) += res_counter.o >> >> ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y) >> # According to Alan Modra <[EMAIL PROTECTED]>, the >> -fno-omit-frame-pointer is >> diff -upr linux-2.6.20.orig/kernel/res_counter.c >> linux-2.6.20-0/kernel/res_counter.c >> --- linux-2.6.20.orig/kernel/res_counter.c2007-03-06 >> 13:39:17.0 +0300 >> +++ linux-2.6.20-0/kernel/res_counter.c2007-03-06 >> 13:33:28.0 +0300 >> @@ -0,0 +1,72 @@ >> +/* >> + * resource containers >> + * >> + * Copyright 2007 OpenVZ SWsoft Inc >> + * >> + *
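For comparison with the spinlock version quoted above, the `while (atomic_cmpxchg(...))` alternative mentioned in the reply could look roughly like the following user-space sketch. It uses C11 <stdatomic.h> rather than the kernel's atomic_t API, and struct res_counter_sketch and charge_cas() are hypothetical names, not part of the posted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lock-free variant of struct res_counter. */
struct res_counter_sketch {
	atomic_ulong usage;
	unsigned long limit;
	atomic_ulong failcnt;
};

/* Charge val against the limit using a compare-and-swap retry loop. */
static int charge_cas(struct res_counter_sketch *cnt, unsigned long val)
{
	unsigned long old;

	assert(cnt != NULL);
	old = atomic_load(&cnt->usage);
	do {
		/* Over the limit: this path needs a second atomic op. */
		if (val > cnt->limit || old > cnt->limit - val) {
			atomic_fetch_add(&cnt->failcnt, 1);
			return -ENOMEM;
		}
		/* On failure 'old' is reloaded and the check is redone. */
	} while (!atomic_compare_exchange_weak(&cnt->usage, &old, old + val));

	return 0;
}
```

This shows the trade-off raised in the thread: the check and the update of usage can be made atomic with a retry loop, but failcnt needs its own atomic operation, and any list manipulation done under the same lock (as in the RSS patches) would still need separate protection.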
Re: [patch] epoll use a single inode ...
On Wed, 7 Mar 2007, Eric Dumazet wrote:

> I would definitly *love* saving dentries for pipes (and sockets too), but how
> are you going to get the inode ?

I was not planning to touch anything but epoll, signalfd and timerfd
files.

> pipes()/sockets() can use read()/write()/rw_verify_area() and thus need
> file->f_path.dentry->d_inode (so each pipe needs a separate dentry)

Currently, they use a single inode, and multiple dentries (to give the
name of the class). But this could be changed to a single dentry like
Linus was suggesting. I'll wait for Al's reply before doing anything.
The memory saving can be worthwhile, on top of the already big win of
avoiding code duplication.

- Davide
Re: [patch v2] epoll use a single inode ...
Eric Dumazet wrote:
> Linus Torvalds wrote:
>> On Tue, 6 Mar 2007, Eric Dumazet wrote:
>>> I did a user space program, attached to this mail.
>>>
>>> I rewrote the reciprocal_div() for i386 so that one multiply is used.
>>
>> Ok, this is definitely faster on Core 2 as well, so "numbers talk,
>> bullshit walks". No more objections.
>
> And the numbers were ? :)
>
>> (That said, I bet you could do even better for octal and hex numbers, so
>> if you *really* want to speed things up, you should just make a
>> special-case routine for each base (there's just three of them), and you
>> can then also optimize the base-10 thing much better (you can do two
>> digits at a time by dividing by 100, etc)
>
> Well, given that sprintf() is frequently called only for pipe/sockets
> creation, we probably better :
>
> 1) wait a very clever idea to suppress individual dentry per pipe/sockets
>    (no more sprintf() at pipe/socket setup)
>
> 2) delay the sprintf() only if needed as you mentioned in a previous mail
>    (when someone wants ls -l /proc/pid/fd/), since their dentries are not
>    anymore inserted in the global dcache hash, they could stay with a
>    (nul) dname.

Yes, the right thing to do is probably to only generate these strings
when someone tries to list them, not on every socket/pipe/epoll creation.
One can assign a counter and keep it as a binary value at the start, but
create the strings when necessary.

	-hpa
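The "two digits at a time by dividing by 100" remark quoted above can be sketched in plain C as follows. This is an illustration only, not the kernel's number formatting code; u32_to_dec() is a hypothetical name, and a tuned version would more likely use a 100-entry lookup table for the digit pairs instead of the small divisions shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert value to decimal text in buf (at least 11 bytes) and return
 * the number of digits written. Each loop iteration does one division
 * of the full-width value by 100 and emits two digits.
 */
static size_t u32_to_dec(uint32_t value, char *buf)
{
	char tmp[10];		/* digits, least significant first */
	size_t n = 0, i;

	assert(buf != NULL);
	while (value >= 100) {
		uint32_t pair = value % 100;

		value /= 100;
		tmp[n++] = (char)('0' + pair % 10);
		tmp[n++] = (char)('0' + pair / 10);
	}
	/* One or two leading digits remain. */
	tmp[n++] = (char)('0' + value % 10);
	if (value >= 10)
		tmp[n++] = (char)('0' + value / 10);

	/* Reverse into the caller's buffer and terminate. */
	for (i = 0; i < n; i++)
		buf[i] = tmp[n - 1 - i];
	buf[n] = '\0';
	return n;
}
```

Compared with a digit-at-a-time loop, this halves the number of divisions performed on the full-width value, which is where the reciprocal-multiply trick discussed in the thread would also apply.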
Re: [PATCH -mm] Blackfin: blackfin i2c driver
On Wed, 7 Mar 2007 07:58:22 +0100 Jean Delvare <[EMAIL PROTECTED]> wrote:

> > +config BFIN_SDA
>
> I2C_BLACKFIN_SDA

The blackfin architecture uses "bfin" pretty much universally, so this
usage is consistent.

box:/usr/src/25> grep -i blackfin patches/blackfin*|wc -l
1608
box:/usr/src/25> grep -i bfin patches/blackfin*|wc -l
6198

Let's just hope nobody makes a bluefin.

> > +	range 0 15 if (BF533 || BF532 || BF531)
>
> Trailing whitespace.

I always remove that when merging a patch.
Re: [RFC] ARP notify option
On Tue, 6 Mar 2007, Chris Friesen wrote:
> Stephen Hemminger wrote:
>
>> +arp_notify - BOOLEAN
>> +	Define mode for notification of address and device changes.
>> +	0 - (default): do nothing
>> +	1 - Generate gratuitous arp replies when device is brought up
>> +	    or hardware address changes.
>
> Did you consider using gratuitous arp requests instead? I remember
> reading about some hardware that updated its arp cache on gratuitous
> requests but not gratuitous replies.

You might be interested in taking a look at:
http://tools.ietf.org/id/draft-cheshire-ipv4-acd

There has been some follow-up discussion on this in the thread starting
at:
http://www1.ietf.org/mail-archive/web/int-area/current/msg00611.html

In particular, you may be interested in this comment about ARP request
and ARP reply for gratuitous ARP:
http://www1.ietf.org/mail-archive/web/int-area/current/msg00669.html

-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
Re: [patch 1/4] signalfd v1 - signalfd core ...
On Wed, 7 Mar 2007, Stephen Rothwell wrote:

> On Tue, 6 Mar 2007 17:36:56 -0800 (PST) Davide Libenzi wrote:
> >
> > The read(2) call will read u32 signal numbers that landed over the
> > signalfd. It returns the size of the data copied, or zero if the sighand
> > we are attached to, has been detached.
>
> So what about signals that the user asked for a siginfo_t to be returned
> with?

O-Ren: "You didn't think it was gonna be that easy, did you?"
B-Kiddo: "You know, for a second there, yeah, I kinda did."

:)

I could do that, since where I placed the signalfd_notify() I have the
siginfo. But that is going to make the code a little more complex, since
the simple bitmask needs to become a queue.

- Davide
Re: [2.6.21-rc3] cpufreq: p4-clockmod.c compilation error
On Wed, 7 Mar 2007, Dave Jones wrote:

> diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> index d155e81..74747d9 100644
> --- a/drivers/cpufreq/Kconfig
> +++ b/drivers/cpufreq/Kconfig
> @@ -16,7 +16,7 @@ config CPU_FREQ
>  if CPU_FREQ
>
>  config CPU_FREQ_TABLE
> -	tristate
> +	bool
>
>  config CPU_FREQ_DEBUG
>  	bool "Enable CPUfreq debugging"
>

That did the trick, thanks.

Acked-by: David Rientjes <[EMAIL PROTECTED]>
Re: [RFC][PATCH 6/7] Account for the number of tasks within container
Paul Menage wrote:
> Hi Pavel,
>
> On 3/6/07, Pavel Emelianov <[EMAIL PROTECTED]> wrote:
>> diff -upr linux-2.6.20.orig/include/linux/sched.h
>> linux-2.6.20-0/include/linux/sched.h
>> --- linux-2.6.20.orig/include/linux/sched.h 2007-03-06
>> 13:33:28.0 +0300
>> +++ linux-2.6.20-0/include/linux/sched.h 2007-03-06
>> 13:33:28.0 +0300
>> @@ -1052,6 +1055,9 @@ struct task_struct {
>>  #ifdef CONFIG_FAULT_INJECTION
>>  	int make_it_fail;
>>  #endif
>> +#ifdef CONFIG_PROCESS_CONTAINER
>> +	struct numproc_container *numproc_cnt;
>> +#endif
>>  };
>
> Why do you need a pointer added to task_struct? One of the main points
> of the generic containers is to avoid every different subsystem and
> resource controller having to add new pointers there.
>
>> +
>> +	rcu_read_lock();
>> +	np = numproc_from_cont(task_container(current, &numproc_subsys));
>> +	css_get_current(&np->css);
>
> There's no need to hold a reference here - by definition, the task's
> container can't go away while the task is in it.
>
> Also, shouldn't you have an attach() method to move the count from one
> container to another when a task moves?

The idea is this: a task may be both "the entity that allocates resources" and "the entity that is an allocated resource". When a task acts as the former, it may move across containers (that is implemented in your patches). When a task is a resource, it shouldn't move across containers, just as files and pages don't. More generally, allocated resources hold a reference to their original container until they die. No resource migration is performed.

Did I express my idea clearly?

> Paul
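To make the "no resource migration" rule concrete, here is a small userspace model. It is only a sketch of the accounting idea as described above, not code from the patchset: `struct container`, `struct resource`, `resource_charge()`, `resource_release()` and `task_move()` are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct container {
    int refs;     /* references held by live resources */
    int usage;    /* resources currently charged to this container */
};

struct task {
    struct container *cont;   /* container the task currently allocates from */
};

struct resource {
    struct container *owner;  /* fixed at allocation time, never migrated */
};

/* Charge a new resource to whatever container the task is in right now. */
void resource_charge(struct resource *r, const struct task *t)
{
    assert(r && t && t->cont);
    r->owner = t->cont;
    r->owner->refs++;
    r->owner->usage++;
}

/* Uncharge the container the resource was originally charged to. */
void resource_release(struct resource *r)
{
    assert(r && r->owner);
    r->owner->usage--;
    r->owner->refs--;
    r->owner = NULL;
}

/* Moving the task only changes where future allocations are charged. */
void task_move(struct task *t, struct container *to)
{
    assert(t && to);
    t->cont = to;
}
```

In this model a resource allocated before `task_move()` stays charged to the old container until `resource_release()`, so the old container cannot be considered empty while such resources are still alive.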
Re: [patch 3/6] mm: fix fault vs invalidate race for linear mappings
On Wed, 7 Mar 2007 07:57:27 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote:

> > Why was truncate_inode_pages_range() altered to unmap the page if it got
> > mapped again?
> >
> > Oh. Because the unmap_mapping_range() call got removed from vmtruncate().
> > Why? (Please send suitable updates to the changelog).
>
> We have to ensure it is unmapped, and be prepared to unmap it while under
> the page lock.

But vmtruncate() dropped i_size, so nobody will map this page into pagetables from then on.

> > I guess truncate of a mmapped area isn't sufficiently common to worry about
> > the inefficiency of this change.
>
> Yeah, and it should be more efficient for files that aren't mmapped,
> because we don't have to take i_mmap_lock for them.
>
> > Lots of memory barriers got removed in memory.c, unchangeloggedly.
>
> Yeah they were all for the lockless truncate_count checks. Now that
> we use the page lock, we don't need barriers.
>
> > Gratuitous renaming of locals in do_no_page() makes the change hard to
> > review. Should have been a separate patch.
> >
> > In fact, the patch would have been heaps clearer if that renaming had been
> > a separate patch.
>
> Shall I?

If you don't have anything better to do, yes please ;)
Re: [patch 4/6] mm: merge populate and nopage into fault (fixes nonlinear)
On Tue, Mar 06, 2007 at 10:51:01PM -0800, Andrew Morton wrote: > On Wed, 21 Feb 2007 05:50:17 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> > wrote: > > > Nonlinear mappings are (AFAIKS) simply a virtual memory concept that > > encodes the virtual address -> file offset differently from linear > > mappings. > > > > I can't see why the filesystem/pagecache code should need to know anything > > about it, except for the fact that the ->nopage handler didn't quite pass > > down enough information (ie. pgoff). But it is more logical to pass pgoff > > rather than have the ->nopage function calculate it itself anyway. And > > having the nopage handler install the pte itself is sort of nasty. > > > > This patch introduces a new fault handler that replaces ->nopage and > > ->populate and (later) ->nopfn. Most of the old mechanism is still in place > > so there is a lot of duplication and nice cleanups that can be removed if > > everyone switches over. > > > > The rationale for doing this in the first place is that nonlinear mappings > > are subject to the pagefault vs invalidate/truncate race too, and it seemed > > stupid to duplicate the synchronisation logic rather than just consolidate > > the two. > > > > It's awkward to layer a largely do-nothing patch like this on top of a > significant functional change. Makes it harder to isolate the source of > regressions, harder to revert the do-something patch. > > > After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in > > pagecache. Seems like a fringe functionality anyway. > > Does Ingo agree? I cc'ed him when first posting it. He didn't disagree. > > NOPAGE_REFAULT is removed. This should be implemented with ->fault, and > > no users have hit mainline yet. > > Did benh agree with that? Yes. > The patch unchangeloggedly adds a basic new structure to core mm > (fault_data). Would be nice to document its fields, especially `flags'. OK. This is actually something that I would like more people to review. 
Do we need any different fields? Should it be passed as arguments instead of a structure? > Please add less pointless blank lines. > > > How well has this been tested? The ocfs2 changes? gfs2? We should at > least give those guys a heads-up. Yes we should. Not all those filesystem changes have been tested. > Does anybody really pass a NULL `type' arg into filemap_nopage()? Dunno, it's exported. I remove that completely in a subsequent patch anyway. > This patch seems to churn things around an awful lot for minimal benefit. Well it fixes the whole design of the nonlinear fault path.
Re: [PATCH -mm] Blackfin: blackfin i2c driver
On Wed, 07 Mar 2007 13:57:58 +0800 "Wu, Bryan" <[EMAIL PROTECTED]> wrote: > Here is the updated blackfin i2c driver. > > [PATCH] Blackfin: blackfin i2c driver > > The i2c linux driver for blackfin architecture which supports both GPIO > i2c operation and blackfin on-chip TWI controller i2c operation. > > Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> > --- > drivers/i2c/busses/Kconfig | 47 > drivers/i2c/busses/i2c-bfin-gpio.c | 98 + > drivers/i2c/busses/i2c-bfin-twi.c | 589 > > 3 files changed, 734 insertions(+) > > Index: linux-2.6/drivers/i2c/busses/Kconfig > === > --- linux-2.6.orig/drivers/i2c/busses/Kconfig 2007-03-07 13:32:02.0 > +0800 > +++ linux-2.6/drivers/i2c/busses/Kconfig 2007-03-07 13:44:19.0 > +0800 > @@ -5,6 +5,53 @@ > menu "I2C Hardware Bus support" > depends on I2C > > +config I2C_BFIN_GPIO > + tristate "Generic Blackfin and HHBF533/561 development board I2C > support" > + depends on I2C && EXPERIMENTAL > + select I2C_ALGOBIT > + help > + -- > + > +menu "BFIN I2C SDA/SCL Selection" > + depends on I2C_BFIN_GPIO > +config BFIN_SDA > + int "SDA is GPIO Number" > + range 0 15 if (BF533 || BF532 || BF531) > + range 0 47 if (BF534 || BF536 || BF537) > + range 0 47 if BF561 > + default 2 if (BF533 || BF532 || BF531) > + > +config BFIN_SCL > + int "SCL is GPIO Number" > + range 0 15 if (BF533 || BF532 || BF531) > + range 0 47 if (BF534 || BF536 || BF537) > + range 0 47 if BF561 > + default 3 > +endmenu > + > +config I2C_BFIN_GPIO_CYCLE_DELAY > + int "Cycle Delay in usec" > + depends on I2C_BFIN_GPIO > + range 1 100 > + default 40 > + > +config I2C_BFIN_TWI > + tristate "Blackfin TWI I2C support" > + depends on I2C && (BF534 || BF536 || BF537) > + help > + This the TWI I2C device driver for Blackfin 534/536/537. > + > + This driver can also be built as a module. If so, the module > + will be called i2c-bfin-twi. > + > +config TWICLK_KHZ > + int "TWI clock (kHZ)" > + depends on I2C_BFIN_TWI > + default 50 > + help > + The unit of the TWI clock is kilo HZ. 
Please divide the clock > + by 1024 if you count it in HZ. The value should be less than 400. > + Well that's cute. This patch causes an i386 `make allmodconfig' to spew these: SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) SDA is GPIO Number (BFIN_SDA) [] (NEW) out at about 1,000,000/sec, infinitely. I'll put a `depends on BFIN' in there to shut it up, but I think you've tickled a Kconfig bug. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Unused floppy1 going bonkers in 2.6.20-rds
Greetings; Kernel 2.6.20-rds (Cons patch), biostar mobo. 1GB of memory. I have an elderly 5.25" floppy drive mounted in this box, something I use for sneakernet duties to get to a 'legacy' machine occasionally. It hasn't been used for anything in about a month or more. No disk in it now. Uptime 0d:15:47 The messages file is being cluttered with this: Mar 7 01:17:22 coyote kernel: Mar 7 01:17:22 coyote kernel: floppy driver state Mar 7 01:17:22 coyote kernel: --- Mar 7 01:17:22 coyote kernel: now=54902555 last interrupt=54899556 diff=2999 last called handler=f89f833f Mar 7 01:17:22 coyote kernel: timeout_message=floppy start Mar 7 01:17:22 coyote kernel: last output bytes: Mar 7 01:17:22 coyote kernel: 0 90 54899555 Mar 7 01:17:22 coyote kernel: 13 90 54899555 Mar 7 01:17:22 coyote kernel: 0 90 54899555 Mar 7 01:17:22 coyote kernel: 1a 90 54899555 Mar 7 01:17:22 coyote kernel: 0 90 54899555 Mar 7 01:17:22 coyote kernel: 3 90 54899555 Mar 7 01:17:22 coyote kernel: c1 90 54899555 Mar 7 01:17:22 coyote kernel: 8 90 54899555 Mar 7 01:17:22 coyote kernel: 7 80 54899555 Mar 7 01:17:22 coyote kernel: 1 90 54899556 Mar 7 01:17:22 coyote kernel: 8 82 54899556 Mar 7 01:17:22 coyote kernel: e6 80 54899556 Mar 7 01:17:22 coyote kernel: 1 90 54899556 Mar 7 01:17:22 coyote kernel: 0 90 54899556 Mar 7 01:17:22 coyote kernel: 0 90 54899556 Mar 7 01:17:22 coyote kernel: 1 90 54899556 Mar 7 01:17:22 coyote kernel: 2 90 54899556 Mar 7 01:17:22 coyote kernel: 9 90 54899556 Mar 7 01:17:22 coyote kernel: 2a 90 54899556 Mar 7 01:17:22 coyote kernel: ff 90 54899556 Mar 7 01:17:22 coyote kernel: last result at 54899556 Mar 7 01:17:22 coyote kernel: last redo_fd_request at 54899555 Mar 7 01:17:22 coyote kernel: 21 0 Mar 7 01:17:22 coyote kernel: status=50 Mar 7 01:17:22 coyote kernel: fdc_busy=1 Mar 7 01:17:22 coyote kernel: do_floppy=f89f3323 Mar 7 01:17:22 coyote kernel: fd_timer.function=f89f51b2 Mar 7 01:17:22 coyote kernel: cont=f89fc5ec Mar 7 01:17:22 coyote kernel: 
current_req=e8eaa9c8 Mar 7 01:17:22 coyote kernel: command_status=-1 Mar 7 01:17:22 coyote kernel: Mar 7 01:17:22 coyote kernel: floppy1: floppy timeout called Mar 7 01:17:22 coyote kernel: end_request: I/O error, dev fd1, sector 0 Mar 7 01:17:22 coyote kernel: Buffer I/O error on device fd1, logical block 0

This has happened about 8 or 9 times since the reboot 16 hours ago, and very intermittently: it might skip 3 hours, then show two of these stanzas in 3 seconds.

The drive type is properly set in the BIOS, if that actually means anything. The only mention of floppy in /var/log/dmesg is the ide-floppy driver signing in during boot, because I've got one of those 100MB floppies that I occasionally use too. It is not installed at the moment.

Has anyone else seen this in a box with two floppy drives in it?

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author)
Function reject.
Re: [PATCH -mm] Blackfin: blackfin i2c driver
Hi Bryan,

On Wed, 07 Mar 2007 13:57:58 +0800, Wu, Bryan wrote:
> Here is the updated blackfin i2c driver.
>
> [PATCH] Blackfin: blackfin i2c driver
>
> The i2c linux driver for blackfin architecture which supports both GPIO
> i2c operation and blackfin on-chip TWI controller i2c operation.
>
> Signed-off-by: Bryan Wu <[EMAIL PROTECTED]>
> ---
>  drivers/i2c/busses/Kconfig         |  47
>  drivers/i2c/busses/i2c-bfin-gpio.c |  98 +
>  drivers/i2c/busses/i2c-bfin-twi.c  | 589

I'd prefer i2c-blackfin-gpio and i2c-blackfin-twi. Abbreviations tend to confuse newcomers.

>  3 files changed, 734 insertions(+)
>
> Index: linux-2.6/drivers/i2c/busses/Kconfig
> ===
> --- linux-2.6.orig/drivers/i2c/busses/Kconfig 2007-03-07 13:32:02.0
> +0800
> +++ linux-2.6/drivers/i2c/busses/Kconfig 2007-03-07 13:44:19.0
> +0800
> @@ -5,6 +5,53 @@
>  menu "I2C Hardware Bus support"
>  	depends on I2C
>
> +config I2C_BFIN_GPIO

I2C_BLACKFIN_GPIO

Please move the entries to the right location. The list is sorted alphabetically, in case you didn't notice.

> +	tristate "Generic Blackfin and HHBF533/561 development board I2C
> support"

You can drop the trailing "I2C support"; the user is in a menu named "I2C hardware bus support", so it's pretty clear what we're talking about.

> +	depends on I2C && EXPERIMENTAL
> +	select I2C_ALGOBIT
> +	help
> +	  --
> +
> +menu "BFIN I2C SDA/SCL Selection"
> +	depends on I2C_BFIN_GPIO
> +config BFIN_SDA

I2C_BLACKFIN_SDA

> +	int "SDA is GPIO Number"

"SDA GPIO pin number"

> +	range 0 15 if (BF533 || BF532 || BF531)

Trailing whitespace.

> +	range 0 47 if (BF534 || BF536 || BF537)
> +	range 0 47 if BF561
> +	default 2 if (BF533 || BF532 || BF531)

Trailing whitespace. No default for the other cases?

> +
> +config BFIN_SCL

I2C_BLACKFIN_SCL

Etc etc, all the options should start with I2C_BLACKFIN.

> +	int "SCL is GPIO Number"

"SCL GPIO pin number"

> +	range 0 15 if (BF533 || BF532 || BF531)

Trailing whitespace, and many more after that. Please fix them all!

> +	range 0 47 if (BF534 || BF536 || BF537)
> +	range 0 47 if BF561
> +	default 3
> +endmenu
> +
> +config I2C_BFIN_GPIO_CYCLE_DELAY
> +	int "Cycle Delay in usec"
> +	depends on I2C_BFIN_GPIO
> +	range 1 100
> +	default 40

This should really not be a kernel configuration option. Please turn it into a kernel module parameter or a sysfs attribute if you really need it. Also note that we already have an interface to change this value from user-space (using an ioctl on /dev/i2c-N) and that might be sufficient for your needs.

And allowing a 1 usec delay is probably not a good idea; I don't recommend values below 6 usec with i2c-algo-bit.

> +
> +config I2C_BFIN_TWI
> +	tristate "Blackfin TWI I2C support"
> +	depends on I2C && (BF534 || BF536 || BF537)
> +	help
> +	  This the TWI I2C device driver for Blackfin 534/536/537.
> +
> +	  This driver can also be built as a module. If so, the module
> +	  will be called i2c-bfin-twi.
> +
> +config TWICLK_KHZ
> +	int "TWI clock (kHZ)"

kHz

> +	depends on I2C_BFIN_TWI
> +	default 50
> +	help
> +	  The unit of the TWI clock is kilo HZ. Please divide the clock
> +	  by 1024 if you count it in HZ. The value should be less than 400.

Why don't you use "range" here too, to ensure that the value is actually less than 400? Either way, same as above, IMHO this should not be a compilation-time decision.

A kHz is really 1000 Hz, not 1024. And everybody skilled enough to configure a kernel should know that; I doubt it's worth a reminder.

> +
>  config I2C_ALI1535
>  	tristate "ALI 1535"
>  	depends on I2C && PCI

All these options won't work really well until you also change drivers/i2c/busses/Makefile to make something useful with them...

-- 
Jean Delvare
Re: [patch 3/6] mm: fix fault vs invalidate race for linear mappings
On Tue, Mar 06, 2007 at 10:36:41PM -0800, Andrew Morton wrote: > On Wed, 21 Feb 2007 05:50:05 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> > wrote: > > > Fix the race between invalidate_inode_pages and do_no_page. > > > > Andrea Arcangeli identified a subtle race between invalidation of > > pages from pagecache with userspace mappings, and do_no_page. > > > > The issue is that invalidation has to shoot down all mappings to the > > page, before it can be discarded from the pagecache. Between shooting > > down ptes to a particular page, and actually dropping the struct page > > from the pagecache, do_no_page from any process might fault on that > > page and establish a new mapping to the page just before it gets > > discarded from the pagecache. > > > > The most common case where such invalidation is used is in file > > truncation. This case was catered for by doing a sort of open-coded > > seqlock between the file's i_size, and its truncate_count. > > > > Truncation will decrease i_size, then increment truncate_count before > > unmapping userspace pages; do_no_page will read truncate_count, then > > find the page if it is within i_size, and then check truncate_count > > under the page table lock and back out and retry if it had > > subsequently been changed (ptl will serialise against unmapping, and > > ensure a potentially updated truncate_count is actually visible). > > > > Complexity and documentation issues aside, the locking protocol fails > > in the case where we would like to invalidate pagecache inside i_size. > > do_no_page can come in anytime and filemap_nopage is not aware of the > > invalidation in progress (as it is when it is outside i_size). The > > end result is that dangling (->mapping == NULL) pages that appear to > > be from a particular file may be mapped into userspace with nonsense > > data. Valid mappings to the same place will see a different page. 
> > > > Andrea implemented two working fixes, one using a real seqlock, > > another using a page->flags bit. He also proposed using the page lock > > in do_no_page, but that was initially considered too heavyweight. > > However, it is not a global or per-file lock, and the page cacheline > > is modified in do_no_page to increment _count and _mapcount anyway, so > > a further modification should not be a large performance hit. > > Scalability is not an issue. > > > > This patch implements this latter approach. ->nopage implementations > > return with the page locked if it is possible for their underlying > > file to be invalidated (in that case, they must set a special vm_flags > > bit to indicate so). do_no_page only unlocks the page after setting > > up the mapping completely. invalidation is excluded because it holds > > the page lock during invalidation of each page (and ensures that the > > page is not mapped while holding the lock). > > > > This also allows significant simplifications in do_no_page, because > > we have the page locked in the right place in the pagecache from the > > start. > > > > Why was truncate_inode_pages_range() altered to unmap the page if it got > mapped again? > > Oh. Because the unmap_mapping_range() call got removed from vmtruncate(). > Why? (Please send suitable updates to the changelog). We have to ensure it is unmapped, and be prepared to unmap it while under the page lock. > I guess truncate of a mmapped area isn't sufficiently common to worry about > the inefficiency of this change. Yeah, and it should be more efficient for files that aren't mmapped, because we don't have to take i_mmap_lock for them. > Lots of memory barriers got removed in memory.c, unchangeloggedly. Yeah they were all for the lockless truncate_count checks. Now that we use the page lock, we don't need barriers. > Gratuitous renaming of locals in do_no_page() makes the change hard to > review. Should have been a separate patch. 
> > In fact, the patch would have been heaps clearer if that renaming had been > a separate patch. Shall I? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUGFIX][PATCH] fix NULL pointer in ia64/irq_chip-mask/unmask function
On Wed, 7 Mar 2007 15:23:17 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:

> This patch fixes boot failure because irq_desc->mask() is NULL.
>
> - Added mask/unmask functions to ia64's irq desc function table.
>   But I'm not sure this fix is correct or not. please review.
>
> - rename hw_interrupt_type to irq_chip. hw_interrupt_type is old name.

Thanks. This bug is present in mainline too, isn't it?
Re: [patch v2] epoll use a single inode ...
Linus Torvalds wrote:
> On Tue, 6 Mar 2007, Eric Dumazet wrote:
>> I did a user space program, attached to this mail.
>>
>> I rewrote the reciprocal_div() for i386 so that one multiply is used.
>
> Ok, this is definitely faster on Core 2 as well, so "numbers talk,
> bullshit walks". No more objections.

And the numbers were ? :)

> (That said, I bet you could do even better for octal and hex numbers, so
> if you *really* want to speed things up, you should just make a
> special-case routine for each base (there's just three of them), and you
> can then also optimize the base-10 thing much better (you can do two
> digits at a time by dividing by 100, etc)

Well, given that the only frequent callers of sprintf() are pipe/socket creation, we had probably better either:

1) wait for a very clever idea that removes the individual dentry per pipe/socket (no more sprintf() at pipe/socket setup), or

2) delay the sprintf() until it is actually needed, as you mentioned in a previous mail (when someone wants ls -l /proc/pid/fd/). Since these dentries are no longer inserted in the global dcache hash, they could stay with a (nul) dname.
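The per-base special-casing suggested in the quoted text can be sketched in userspace as follows. This is only an illustration of "two digits at a time by dividing by 100" and of a shift-based hex routine; it is not the kernel's number formatting code, and the names `u32_to_dec()`, `u32_to_hex()` and `two_digits` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "00".."99" packed back to back: value n lives at offset 2 * n. */
static const char two_digits[] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

/* Writes v in decimal plus a terminating NUL into buf (needs at least
 * 11 bytes) and returns the number of digits written. */
size_t u32_to_dec(uint32_t v, char *buf)
{
    char tmp[10];                 /* UINT32_MAX has 10 digits */
    size_t pos = sizeof(tmp);
    size_t len;

    assert(buf != NULL);
    while (v >= 100) {
        uint32_t rem = v % 100;   /* one division yields two digits */
        v /= 100;
        pos -= 2;
        memcpy(tmp + pos, two_digits + 2 * rem, 2);
    }
    if (v >= 10) {
        pos -= 2;
        memcpy(tmp + pos, two_digits + 2 * v, 2);
    } else {
        tmp[--pos] = (char)('0' + v);
    }
    len = sizeof(tmp) - pos;
    memcpy(buf, tmp + pos, len);
    buf[len] = '\0';
    return len;
}

/* Hex needs no division at all: mask and shift four bits per digit.
 * buf needs at least 9 bytes. Returns the number of digits written. */
size_t u32_to_hex(uint32_t v, char *buf)
{
    static const char hex[] = "0123456789abcdef";
    char tmp[8];
    size_t pos = sizeof(tmp);
    size_t len;

    assert(buf != NULL);
    do {
        tmp[--pos] = hex[v & 0xf];
        v >>= 4;
    } while (v != 0);
    len = sizeof(tmp) - pos;
    memcpy(buf, tmp + pos, len);
    buf[len] = '\0';
    return len;
}
```

The decimal routine halves the number of divisions compared with a digit-at-a-time loop, at the cost of a 200-byte table; whether that pays off in the kernel would have to be measured, as the thread itself points out.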
Re: [2.6.21-rc3] cpufreq: p4-clockmod.c compilation error
On Tue, Mar 06, 2007 at 10:33:05PM -0800, David Rientjes wrote: > arch/x86_64/kernel/built-in.o: In function > `cpufreq_p4_verify':p4-clockmod.c:(.text.cpufreq_p4_verify+0x8): undefined > reference to `cpufreq_frequency_table_verify' > arch/x86_64/kernel/built-in.o: In function > `cpufreq_p4_cpu_exit':p4-clockmod.c:(.text.cpufreq_p4_cpu_exit+0x8): > undefined reference to `cpufreq_frequency_table_put_attr' > arch/x86_64/kernel/built-in.o: In function > `cpufreq_p4_cpu_init':p4-clockmod.c:(.text.cpufreq_p4_cpu_init+0x13b): > undefined reference to `cpufreq_frequency_table_get_attr' > :p4-clockmod.c:(.text.cpufreq_p4_cpu_init+0x163): undefined reference to > `cpufreq_frequency_table_cpuinfo' > arch/x86_64/kernel/built-in.o: In function > `cpufreq_p4_target':p4-clockmod.c:(.text.cpufreq_p4_target+0x21): undefined > reference to `cpufreq_frequency_table_target' > arch/x86_64/kernel/built-in.o: In function > `k8nops':alternative.c:(.data+0x2b70): undefined reference to > `cpufreq_freq_attr_scaling_available_freqs' > CONFIG_CPU_FREQ=y > CONFIG_CPU_FREQ_TABLE=m > CONFIG_X86_P4_CLOCKMOD=y So P4_CLOCKMOD does a 'select CPU_FREQ_TABLE', but for some reason, that makes it =m, not the same as whatever the option that is doing the 'select' is set to (which is what I thought it did). Given the cpufreq table code is tiny anyway, I'm wondering if its worth the pain of having it be modular, instead just making it be built-in to cpufreq. Give the diff below a shot? 
Dave

diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index d155e81..74747d9 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -16,7 +16,7 @@ config CPU_FREQ
 if CPU_FREQ
 
 config CPU_FREQ_TABLE
-	tristate
+	bool
 
 config CPU_FREQ_DEBUG
 	bool "Enable CPUfreq debugging"

-- 
http://www.codemonkey.org.uk
Re: [PATCH] Blackfin: blackfin on-chip SPI controller driver
Hi all,

Could you please give some feedback about this patch? I noticed some coding style issues and will update the patch according to your review.

Thanks
-Bryan Wu
Re: [patch] epoll use a single inode ...
Linus Torvalds wrote:
> I assume that the *only* reason for having multiple dentries is really
> just the output in /proc/<pid>/fd/, right? Or is there any other reason
> to have separate dentries for these pseudo-files?
>
> It's a bit sad to waste that much memory (and time) on something like
> that. I bet that the dentry setup is a noticeable part of the whole
> sigfd()/timerfd() setup. It's likely also a big part of any memory
> footprint if you have lots of them.
>
> So how about just doing:
>  - do a single dentry
>  - make a "struct file_operations" member function that prints out the
>    name of the thing in /proc/<pid>/fd/, and which *defaults* to just
>    doing the d_path() on the dentry, but special filesystems like this
>    could do something else (like print out a fake inode number from the
>    "file->f_private_data" information)
>
> There seems to really be no downsides to that approach. No existing
> filesystem will even notice (they'll all have NULL in the new f_op
> member), and it would allow pipes etc to be sped up and use less memory.

I would definitely *love* saving dentries for pipes (and sockets too), but how are you going to get the inode ?

pipes()/sockets() can use read()/write()/rw_verify_area() and thus need file->f_path.dentry->d_inode (so each pipe needs a separate dentry)

Are you suggesting adding a new "struct file_operations" member to get the inode ?

Or re-introducing an inode pointer in struct file ?
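The "optional member with a default" pattern from the quoted proposal can be modelled in plain C as below. This is a userspace sketch only: `struct file_ops`, `show_name`, `file_show_name()` and the `pipe:[N]` output format are hypothetical stand-ins chosen for the example, not existing kernel interfaces, and the `path` field merely stands in for whatever d_path() would produce.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct file;

struct file_ops {
    /* Optional hook: NULL means "fall back to the default path lookup". */
    int (*show_name)(const struct file *f, char *buf, size_t len);
};

struct file {
    const struct file_ops *ops;
    const char *path;         /* stand-in for the d_path() result */
    unsigned long fake_ino;   /* per-file identity kept outside any dentry */
};

/* A pseudo-file type that synthesizes its name from private data. */
static int pipe_show_name(const struct file *f, char *buf, size_t len)
{
    return snprintf(buf, len, "pipe:[%lu]", f->fake_ino);
}

const struct file_ops pipe_ops  = { .show_name = pipe_show_name };
const struct file_ops plain_ops = { .show_name = NULL };

/* What a /proc/<pid>/fd/ style listing would call for each open file. */
int file_show_name(const struct file *f, char *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    if (f->ops != NULL && f->ops->show_name != NULL)
        return f->ops->show_name(f, buf, len);
    return snprintf(buf, len, "%s", f->path);   /* default behaviour */
}
```

Existing file types leave the hook NULL and keep their current behaviour. Note that this only covers the naming side; it does not by itself answer the question raised above about where read()/write() would find the inode.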
Re: [RFC][PATCH 0/7] Resource controllers based on process containers
Pavel Emelianov wrote:
> This patchset adds RSS, accounting and control and limiting the number
> of tasks and files within container.
>
> Based on top of Paul Menage's container subsystem v7
>
> RSS controller includes per-container RSS accounter, reclamation and OOM
> killer. It behaves like standalone machine - when container runs out of
> resources it tries to reclaim some pages and if it doesn't succeed in it
> kills some task which mm_struct belongs to container in question.
>
> Num tasks and files containers are very simple and self-descriptive from
> code.
>
> As discussed before when a task moves from one container to another no
> resources follow it - they keep holding the container they were
> allocated in.

I have one problem with the patchset: I cannot compile the patches individually, and some of the code is hard to read because it depends on functions from later patches. Patches 2, 3 and 4 fail to compile without patch 5 applied. Patch 1 failed to apply, with a reject in kernel/Makefile. I applied it on top of 2.6.20 with all of Paul Menage's patches (all 7).

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
Re: [patch 4/6] mm: merge populate and nopage into fault (fixes nonlinear)
On Wed, 21 Feb 2007 05:50:17 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> wrote: > Nonlinear mappings are (AFAIKS) simply a virtual memory concept that > encodes the virtual address -> file offset differently from linear > mappings. > > I can't see why the filesystem/pagecache code should need to know anything > about it, except for the fact that the ->nopage handler didn't quite pass > down enough information (ie. pgoff). But it is more logical to pass pgoff > rather than have the ->nopage function calculate it itself anyway. And > having the nopage handler install the pte itself is sort of nasty. > > This patch introduces a new fault handler that replaces ->nopage and > ->populate and (later) ->nopfn. Most of the old mechanism is still in place > so there is a lot of duplication and nice cleanups that can be removed if > everyone switches over. > > The rationale for doing this in the first place is that nonlinear mappings > are subject to the pagefault vs invalidate/truncate race too, and it seemed > stupid to duplicate the synchronisation logic rather than just consolidate > the two. > It's awkward to layer a largely do-nothing patch like this on top of a significant functional change. Makes it harder to isolate the source of regressions, harder to revert the do-something patch. > After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in > pagecache. Seems like a fringe functionality anyway. Does Ingo agree? > NOPAGE_REFAULT is removed. This should be implemented with ->fault, and > no users have hit mainline yet. Did benh agree with that? The patch unchangeloggedly adds a basic new structure to core mm (fault_data). Would be nice to document its fields, especially `flags'. Please add less pointless blank lines. How well has this been tested? The ocfs2 changes? gfs2? We should at least give those guys a heads-up. Does anybody really pass a NULL `type' arg into filemap_nopage()? 
This patch seems to churn things around an awful lot for minimal benefit.
Re: Wanted: simple, safe x86 stack overflow detection
At some point in the past, I wrote:
>> I'm certainly in favor of the move; IRQ stacks could be made
>> rather deep and cheaply at that. I may get around to writing it this
>> week if no one else does it first.

On Tue, Mar 06, 2007 at 08:28:35PM -0800, Arjan van de Ven wrote:
> the irq stacks aren't the problem; RH at some point accidentally shipped
> a kernel with 4k *shared* irq/user context stack and even that gave
> almost no issues.
> irq's really shouldn't actually nest; it's bad for just about everything
> to do that (but that's another story, I would *love* to get rid of the
> "enable irqs" thing in the x86 irq path, it hurts just about anything in
> reality)

What do you see as the obstacle to eliminating nested IRQ's? It doesn't seem so far out to test for being on the interrupt stack and defer the call to do_IRQ() until after the currently-running instance of do_IRQ() has returned, or to move to per-irq stacks modulo special arrangements for the per-cpu IRQ's. Or did you have other methods in mind?

-- wli
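The first idea, deferring a nested call until the running instance has returned, can be modelled on a single CPU in userspace roughly as follows. This is a toy sketch under loud assumptions: the flag `on_irq_stack` stands in for the "am I on the interrupt stack" test, `raise_during[]` is a test hook that simulates an interrupt arriving in the middle of a handler, and all names are hypothetical. Real code would have to update the flag and the pending mask with interrupts masked and deal with hardware acknowledgement, none of which is shown.

```c
#include <assert.h>
#include <stdint.h>

#define NR_IRQS 32

int on_irq_stack;            /* set while the handler loop is running */
uint32_t deferred;           /* one bit per IRQ that arrived meanwhile */

int raise_during[NR_IRQS];   /* test hook: raise_during[n] = m + 1 makes
                                the handler for IRQ n raise IRQ m */
int order[NR_IRQS];          /* IRQ numbers in the order handlers ran */
int order_len;
int depth, max_depth;        /* observed handler nesting depth */

void irq_entry(unsigned irq);

static void run_handler(unsigned irq)
{
    if (++depth > max_depth)
        max_depth = depth;
    if (order_len < NR_IRQS)
        order[order_len++] = (int)irq;
    if (raise_during[irq])   /* simulate an interrupt arriving mid-handler */
        irq_entry((unsigned)raise_during[irq] - 1);
    depth--;
}

void irq_entry(unsigned irq)
{
    assert(irq < NR_IRQS);
    if (on_irq_stack) {      /* already handling one: note it and return */
        deferred |= UINT32_C(1) << irq;
        return;
    }
    on_irq_stack = 1;
    run_handler(irq);
    while (deferred) {       /* drain whatever arrived in the meantime */
        unsigned next = 0;
        while (!(deferred & (UINT32_C(1) << next)))
            next++;
        deferred &= ~(UINT32_C(1) << next);
        run_handler(next);
    }
    on_irq_stack = 0;
}
```

With this structure handlers never nest, so the stack depth is bounded by the deepest single handler; the price is added latency for the deferred interrupt, which is one of the trade-offs the question above is probing.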
Re: [PATCH -mm] Blackfin: blackfin i2c driver
> > > > OK, I change it into yield(). So, current process will be move to the > > tail of the run queue. Is that OK with you? > > Nope, yield is terribly bad when there are busy processes running: it can > stall for a very long time indeed, > > Is this hardware not capable of generating an interrupt when BUSBUSY gets > negated? > > I guess not, in which case you're stuck with having to poll it - probably > use a cond_resched() in the loop, and an angry comment. Thanks, we fix it. please pick up this one. [PATCH] Blackfin: blackfin i2c driver The i2c linux driver for blackfin architecture which supports both GPIO i2c operation and blackfin on-chip TWI controller i2c operation. Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> --- drivers/i2c/busses/Kconfig | 47 drivers/i2c/busses/i2c-bfin-gpio.c | 98 + drivers/i2c/busses/i2c-bfin-twi.c | 589 3 files changed, 734 insertions(+) Index: linux-2.6/drivers/i2c/busses/Kconfig === --- linux-2.6.orig/drivers/i2c/busses/Kconfig 2007-03-07 13:32:02.0 +0800 +++ linux-2.6/drivers/i2c/busses/Kconfig2007-03-07 13:44:19.0 +0800 @@ -5,6 +5,53 @@ menu "I2C Hardware Bus support" depends on I2C +config I2C_BFIN_GPIO + tristate "Generic Blackfin and HHBF533/561 development board I2C support" + depends on I2C && EXPERIMENTAL + select I2C_ALGOBIT + help + -- + +menu "BFIN I2C SDA/SCL Selection" + depends on I2C_BFIN_GPIO +config BFIN_SDA + int "SDA is GPIO Number" + range 0 15 if (BF533 || BF532 || BF531) + range 0 47 if (BF534 || BF536 || BF537) + range 0 47 if BF561 + default 2 if (BF533 || BF532 || BF531) + +config BFIN_SCL + int "SCL is GPIO Number" + range 0 15 if (BF533 || BF532 || BF531) + range 0 47 if (BF534 || BF536 || BF537) + range 0 47 if BF561 + default 3 +endmenu + +config I2C_BFIN_GPIO_CYCLE_DELAY + int "Cycle Delay in usec" + depends on I2C_BFIN_GPIO + range 1 100 + default 40 + +config I2C_BFIN_TWI + tristate "Blackfin TWI I2C support" + depends on I2C && (BF534 || BF536 || BF537) + help + This the TWI I2C device driver for 
Blackfin 534/536/537. + + This driver can also be built as a module. If so, the module + will be called i2c-bfin-twi. + +config TWICLK_KHZ + int "TWI clock (kHZ)" + depends on I2C_BFIN_TWI + default 50 + help + The unit of the TWI clock is kilo HZ. Please divide the clock + by 1024 if you count it in HZ. The value should be less than 400. + config I2C_ALI1535 tristate "ALI 1535" depends on I2C && PCI Index: linux-2.6/drivers/i2c/busses/i2c-bfin-gpio.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6/drivers/i2c/busses/i2c-bfin-gpio.c2007-03-07 13:44:19.0 +0800 @@ -0,0 +1,98 @@ +/ + * Description: * + * * + * Maintainer: Meihui Fan <[EMAIL PROTECTED]> * + * * + * CopyRight (c) 2004 HHTech * + * www.hhcn.com, www.hhcn.org * + * All rights reserved. * + * * + * This file is free software; * + * you are free to modify and/or redistribute it * + * under the terms of the GNU General Public Licence (GPL). * + * * + / + +#include +#include +#include +#include +#include +#include + +#include +#include + +#defineI2C_HW_B_HHBF 0x13 + +static void hhbf_setsda(void *data, int state) +{ + if (state) { + gpio_direction_input(CONFIG_BFIN_SDA); + + } else { + gpio_direction_output(CONFIG_BFIN_SDA); + gpio_set_value(CONFIG_BFIN_SDA, 0); + } +} + +static void hhbf_setscl(void *data, int state) +{ + gpio_set_value(CONFIG_BFIN_SCL, state); +} + +static int hhbf_getsda(void *data) +{ + return (gpio_get_value(CONFIG_BFIN_SDA) != 0); +} + + +static struct i2c_algo_bit_data bit_hhbf_data = { + .setsda = hhbf_setsda, + .setscl = hhbf_setscl, + .getsda = hhbf_getsda, + .udelay = CONFIG_I2C_BFIN_GPIO_CYCLE_DELAY, + .timeout = HZ +}; + +static struct i2c_adapter hhbf_ops = { + .owner = THIS_MODULE, + .id = I2C_HW_B_HHBF
Re: [patch 3/6] mm: fix fault vs invalidate race for linear mappings
On Wed, 21 Feb 2007 05:50:05 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> wrote:

> Fix the race between invalidate_inode_pages and do_no_page.
>
> Andrea Arcangeli identified a subtle race between invalidation of
> pages from pagecache with userspace mappings, and do_no_page.
>
> The issue is that invalidation has to shoot down all mappings to the
> page, before it can be discarded from the pagecache. Between shooting
> down ptes to a particular page, and actually dropping the struct page
> from the pagecache, do_no_page from any process might fault on that
> page and establish a new mapping to the page just before it gets
> discarded from the pagecache.
>
> The most common case where such invalidation is used is in file
> truncation. This case was catered for by doing a sort of open-coded
> seqlock between the file's i_size, and its truncate_count.
>
> Truncation will decrease i_size, then increment truncate_count before
> unmapping userspace pages; do_no_page will read truncate_count, then
> find the page if it is within i_size, and then check truncate_count
> under the page table lock and back out and retry if it had
> subsequently been changed (ptl will serialise against unmapping, and
> ensure a potentially updated truncate_count is actually visible).
>
> Complexity and documentation issues aside, the locking protocol fails
> in the case where we would like to invalidate pagecache inside i_size.
> do_no_page can come in anytime and filemap_nopage is not aware of the
> invalidation in progress (as it is when it is outside i_size). The
> end result is that dangling (->mapping == NULL) pages that appear to
> be from a particular file may be mapped into userspace with nonsense
> data. Valid mappings to the same place will see a different page.
>
> Andrea implemented two working fixes, one using a real seqlock,
> another using a page->flags bit. He also proposed using the page lock
> in do_no_page, but that was initially considered too heavyweight.
> However, it is not a global or per-file lock, and the page cacheline
> is modified in do_no_page to increment _count and _mapcount anyway, so
> a further modification should not be a large performance hit.
> Scalability is not an issue.
>
> This patch implements this latter approach. ->nopage implementations
> return with the page locked if it is possible for their underlying
> file to be invalidated (in that case, they must set a special vm_flags
> bit to indicate so). do_no_page only unlocks the page after setting
> up the mapping completely. invalidation is excluded because it holds
> the page lock during invalidation of each page (and ensures that the
> page is not mapped while holding the lock).
>
> This also allows significant simplifications in do_no_page, because
> we have the page locked in the right place in the pagecache from the
> start.
>

Why was truncate_inode_pages_range() altered to unmap the page if it got
mapped again?

Oh. Because the unmap_mapping_range() call got removed from vmtruncate().
Why? (Please send suitable updates to the changelog).

I guess truncate of a mmapped area isn't sufficiently common to worry
about the inefficiency of this change.

Lots of memory barriers got removed in memory.c, unchangeloggedly.

Gratuitous renaming of locals in do_no_page() makes the change hard to
review. Should have been a separate patch. In fact, the patch would have
been heaps clearer if that renaming had been a separate patch.
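The "open-coded seqlock" between i_size and truncate_count that the quoted
changelog describes can be illustrated with a small single-threaded model.
This is a sketch under stated assumptions, not the kernel's do_no_page():
struct file_state, truncate_to(), fault_in() and the race callback are
hypothetical names, the callback merely stands in for a truncate running
on another CPU, and all real locking and memory ordering is left out.

```c
#include <stddef.h>

/* Hypothetical model of the two fields the protocol relies on. */
struct file_state {
    unsigned long i_size;         /* file size, counted in pages here */
    unsigned int truncate_count;  /* bumped by every truncate */
};

/* Writer side: shrink i_size first, then bump the counter. */
static void truncate_to(struct file_state *f, unsigned long new_size)
{
    f->i_size = new_size;
    f->truncate_count++;
    /* the real code would unmap userspace pages after this point */
}

/* Demo stand-in for a truncate that races with a page fault. */
static void racing_truncate_to_4(struct file_state *f)
{
    truncate_to(f, 4);
}

/*
 * Reader side: sample the counter, look the page up, then recheck the
 * counter and retry if a truncate slipped in between.
 * Returns 1 if a mapping would be installed, 0 if pgoff is beyond EOF.
 */
static int fault_in(struct file_state *f, unsigned long pgoff,
                    void (*race)(struct file_state *), int *retries)
{
    for (;;) {
        unsigned int seq = f->truncate_count;  /* sample before lookup */

        if (pgoff >= f->i_size)
            return 0;                          /* outside the file */

        /* (page lookup would happen here) */
        if (race) {
            race(f);                           /* simulated concurrent truncate */
            race = NULL;                       /* fire only once */
        }

        /* In the kernel this recheck is done under the page table lock. */
        if (seq == f->truncate_count)
            return 1;                          /* nothing changed: install pte */

        (*retries)++;                          /* back out and retry */
    }
}
```

In this model a fault at page 5 that races with a truncate to 4 pages backs
out once and then fails the i_size check. The changelog explains why the
scheme is not enough for invalidation inside i_size; the sketch does not
model that case.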
[2.6.21-rc3] cpufreq: p4-clockmod.c compilation error
arch/x86_64/kernel/built-in.o: In function `cpufreq_p4_verify':p4-clockmod.c:(.text.cpufreq_p4_verify+0x8): undefined reference to `cpufreq_frequency_table_verify' arch/x86_64/kernel/built-in.o: In function `cpufreq_p4_cpu_exit':p4-clockmod.c:(.text.cpufreq_p4_cpu_exit+0x8): undefined reference to `cpufreq_frequency_table_put_attr' arch/x86_64/kernel/built-in.o: In function `cpufreq_p4_cpu_init':p4-clockmod.c:(.text.cpufreq_p4_cpu_init+0x13b): undefined reference to `cpufreq_frequency_table_get_attr' :p4-clockmod.c:(.text.cpufreq_p4_cpu_init+0x163): undefined reference to `cpufreq_frequency_table_cpuinfo' arch/x86_64/kernel/built-in.o: In function `cpufreq_p4_target':p4-clockmod.c:(.text.cpufreq_p4_target+0x21): undefined reference to `cpufreq_frequency_table_target' arch/x86_64/kernel/built-in.o: In function `k8nops':alternative.c:(.data+0x2b70): undefined reference to `cpufreq_freq_attr_scaling_available_freqs' # # Automatically generated make config: don't edit # Linux kernel version: 2.6.21-rc3 # Tue Mar 6 21:53:36 2007 # CONFIG_X86_64=y CONFIG_64BIT=y CONFIG_X86=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_ZONE_DMA32=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_RWSEM_GENERIC_SPINLOCK=y CONFIG_GENERIC_HWEIGHT=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_CMPXCHG=y CONFIG_EARLY_PRINTK=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_DMI=y CONFIG_AUDIT_ARCH=y CONFIG_GENERIC_BUG=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # Code maturity level options # # CONFIG_EXPERIMENTAL is not set CONFIG_BROKEN_ON_SMP=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup # CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y # CONFIG_SYSVIPC is not set CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y 
CONFIG_TASKSTATS=y # CONFIG_TASK_DELAY_ACCT is not set # CONFIG_TASK_XACCT is not set # CONFIG_UTS_NS is not set # CONFIG_AUDIT is not set # CONFIG_IKCONFIG is not set # CONFIG_SYSFS_DEPRECATED is not set # CONFIG_RELAY is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_EMBEDDED=y # CONFIG_SYSCTL_SYSCALL is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_EXTRA_PASS is not set # CONFIG_HOTPLUG is not set # CONFIG_PRINTK is not set CONFIG_BUG=y # CONFIG_ELF_CORE is not set CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SHMEM=y CONFIG_SLAB=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 # CONFIG_SLOB is not set # # Loadable module support # CONFIG_MODULES=y # CONFIG_MODULE_UNLOAD is not set CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_KMOD=y # # Block layer # # CONFIG_BLOCK is not set # # Processor type and features # CONFIG_X86_PC=y # CONFIG_X86_VSMP is not set # CONFIG_MK8 is not set CONFIG_MPSC=y # CONFIG_MCORE2 is not set # CONFIG_GENERIC_CPU is not set CONFIG_X86_L1_CACHE_BYTES=128 CONFIG_X86_L1_CACHE_SHIFT=7 CONFIG_X86_INTERNODE_CACHE_BYTES=128 CONFIG_X86_TSC=y CONFIG_X86_GOOD_APIC=y CONFIG_MICROCODE=y CONFIG_MICROCODE_OLD_INTERFACE=y # CONFIG_X86_MSR is not set # CONFIG_X86_CPUID is not set CONFIG_X86_IO_APIC=y CONFIG_X86_LOCAL_APIC=y # CONFIG_MTRR is not set # CONFIG_SMP is not set # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y # CONFIG_PREEMPT_BKL is not set CONFIG_ARCH_FLATMEM_ENABLE=y CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y # CONFIG_SPARSEMEM_STATIC is not set CONFIG_SPLIT_PTLOCK_CPUS=4 CONFIG_RESOURCES_64BIT=y CONFIG_ZONE_DMA_FLAG=1 CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y CONFIG_HPET_TIMER=y # CONFIG_IOMMU is not set # CONFIG_X86_MCE is not set CONFIG_KEXEC=y CONFIG_PHYSICAL_START=0x20 # CONFIG_HZ_100 is not set # CONFIG_HZ_250 is not set CONFIG_HZ_300=y # CONFIG_HZ_1000 is not set CONFIG_HZ=300 CONFIG_REORDER=y CONFIG_K8_NB=y 
CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_ISA_DMA_API=y # # Power management options # CONFIG_PM=y # CONFIG_PM_LEGACY is not set # CONFIG_PM_DEBUG is not set # # ACPI (Advanced Configuration and Power Interface) Support # CONFIG_ACPI=y # CONFIG_ACPI_SLEEP is not set CONFIG_ACPI_PROCFS=y CONFIG_ACPI_AC=y # CONFIG_ACPI_BATTERY is not set # CONFIG_ACPI_BUTTON is not set CONFIG_ACPI_VIDEO=m CONFIG_ACPI_FAN=m CONFIG_ACPI_PROCESSOR=y CONFIG_ACPI_THERMAL=y CONFIG_ACPI_ASUS=m # CONFIG_ACPI_IBM is not set CONFIG_ACPI_TOSHIBA=m CONFIG_ACPI_BLACKLIST_YEAR=0 # CONFIG_ACPI_DEBUG is not set CONFIG_ACPI_EC=y CONFIG_ACPI_POWER=y CONFIG_ACPI_SYSTEM=y # CONFIG_X86_PM_TIMER is not set # # CPU Frequency scaling # CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=m # CONFIG_CPU_FREQ_DEBUG is not set # CONFIG_CPU_FREQ_STAT is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y # CONFIG_CPU_FREQ_GOV_PERFORMANCE is not set CONFIG
Re: [2.6.22 patch] the scheduled removal of OBSOLETE_OSS options
On Tue, Mar 06, 2007 at 06:55:04PM +0100, Adrian Bunk wrote:
> On Tue, Mar 06, 2007 at 12:46:22PM -0500, Bill Davidsen wrote:
> > Adrian Bunk wrote:
> > >This patch contains the scheduled removal of the OBSOLETE_OSS options
> > >for 2.6.22.
> > >
> > If these are drivers for which there are thought to be useful ALSA
> > drivers, would it be reasonable to leave a stub for a help file naming
> > the driver which claims to support the hardware?
> >
> > I'm not objection to the removal of the drivers, just noting that
> > identifying the new drivers can be made easier.
>
> People compiling their own kernels aren't completely dumb - if you know
> about people having problems finding the right ALSA driver for their
> hardware, please name the concrete problems so that we can improve the
> description and/or help text of these ALSA options.

Real problem is that we can expect several "sound does not work anymore"
because people doing "make oldconfig" will get no warning at all about
the removed options. Remember people complaining about keyboard not
working ?

Perhaps the real problem is more Kconfig than OSS, but it would be fine
if we found a solution to enumerate the list of options which have been
removed when they do their make oldconfig.

Regards,
Willy
Re: 2.6.21-rc2-mm2
Hi J.A.,

On Tue, 6 Mar 2007 16:46:09 +0100, J.A. Magallón wrote:
> On Tue, 6 Mar 2007 00:44:08 -0800, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > Temporarily at
> >
> > http://userweb.kernel.org/~akpm/2.6.21-rc2-mm2/
>
> I have another question about i2c...
>
> The 'sensors' program gives me a stange message:
>
> w83627thf-i2c-9191-290
> Can't get adapter name for bus 9191<-
> VCore: +1.49 V (min = +1.94 V, max = +1.94 V) ALARM
> +12.0V: +11.86 V (min = +10.82 V, max = +13.19 V)
> + 3.3V:+3.30 V (min = +3.14 V, max = +3.47 V)
> ...
>
> And gnome-sensors-applet can't read the sensors. If using libsensors, no

As a side note, this is bad design from gnome-sensors-applet. They should
definitely not plain stop just because they failed to retrieve the
i2c_adapter name, when all the monitored values are otherwise available.

> value is displayed (I suppose an applicacion bug). And the access to sensors
> directly through i2c-dev gives an error like this:
>
> Error opening sensor device file:
> /sys/devices/platform/i2c-9191/9191-0290/
>
> In fact, the real path is
>
> /sys/devices/platform/i2c-adapter:i2c-9191/9191-0290/
>
> I supposed it was a kernel change not tracked by userspace, but the strange
> thing is that looking at the code the sensors applet lists the sensors
> reding directories and files in /sys (AFAICS in the code).
> So perhaps there is a little inconsistency, /sys says in some place the
> sensor is at x, when it really is at y.
> Or the 'i2c-adapter:' is a bug and should be 'i2c-adapter/'.
>
> ???

See:
http://bugzilla.kernel.org/show_bug.cgi?id=8115

lm-sensors SVN should work fine:
http://dl.lm-sensors.org/lm-sensors/snapshots/lm-sensors-r4338-20070305.tar.bz2

If not, please report. We will release it as lm-sensors 2.10.3 soon.

--
Jean Delvare
Re: BUG() during suspend to disk (2.6.21-rc2, x86_64)
On Tue, Mar 06, 2007 at 09:56:46PM +0100, Rafael J. Wysocki wrote:
> Hi,
>
> On Tuesday, 6 March 2007 11:32, Vivek Goyal wrote:
> > Hi,
> >
> > I see following BUG() on serial console while hibernating on a x86_64
> > machine. I am using 2.6.21-rc2 kernel.
>
> I see it too.
>
> > BUG: at arch/x86_64/kernel/acpi/sleep.c:70 init_low_mapping()
> >
> > Call Trace:
> > [] acpi_save_state_mem+0x70/0xd6
> > [] acpi_pm_enter+0x23/0xc1
> > [] pm_suspend_disk+0x1ac/0x228
> > [] enter_state+0x50/0x1e6
> > [] acpi_system_write_sleep+0x5c/0x79
> > [] vfs_write+0xad/0x136
> > [] sys_write+0x45/0x6e
> > [] system_call+0x7e/0x83
>
> Hm, it doesn't like the fact that nonboot CPUs are online at that point, but
> we don't do anything to disable them. Should we?
>
> I think the appended patch might work.
>

Yes. It does work for me.

Thanks
Vivek
[BUGFIX][PATCH] fix NULL pointer in ia64/irq_chip-mask/unmask function
This patch fixes boot failure because irq_desc->mask() is NULL. - Added mask/unmask functions to ia64's irq desc function table. But I'm not sure this fix is correct or not. please review. - rename hw_interrupt_type to irq_chip. hw_interrupt_type is old name. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Original BUG I met (following) was caused at irq 57:uhci_hcd:usb3. xxBUG DESCRIPTIONxx Unable to handle kernel NULL pointer dereference (address ) yum-updatesd[3461]: Oops 11012296146944 [1] Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 vfat fat dm_mirror dm_mod button parport_pc lp parport sg tg3 e100 shpchp mii usb_storage lpfc scsi_transport_fc mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd Pid: 3461, CPU 5, comm: yum-updatesd psr : 121008022018 ifs : 8286 ip : []Not tainted ip is at move_native_irq+0x91/0x140 unat: pfs : 0205 rsc : 0003 rnat: bsps: pr : 005a56a9 ldrs: ccv : fpsr: 0009804c0270033f csd : ssd : b0 : a00100050430 b6 : a0021b49d0c0 b7 : a00100050400 f6 : 1003e0de0 f7 : 1003e0060 f8 : 1003e0025 f9 : 1003e22e0 f10 : 1003e0060 f11 : 1003e005d r1 : a00100d8d230 r2 : a001009f8c30 r3 : 5580 r8 : a00100b8e178 r9 : 00ab r10 : 0039 r11 : 0020 r12 : e1404b6f7e30 r13 : e1404b6f r14 : 0020 r15 : a001009f8c00 r16 : a00100b5b160 r17 : dead4ead r18 : a001009f8c4c r19 : r20 : a00100b5b110 r21 : a00100b5b140 r22 : a00100b5b110 r23 : 005020050874 r24 : a00100ba98c0 r25 : a0021b53d768 r26 : e0018007e030 r27 : a00100786238 r28 : e00040004ae0 r29 : a001009f8c50 r30 : 0005 r31 : a001009f8c58 Call Trace: [] show_stack+0x40/0xa0 sp=e1404b6f79c0 bsp=e1404b6f1030 [] show_regs+0x840/0x880 sp=e1404b6f7b90 bsp=e1404b6f0fd0 [] die+0x1c0/0x2a0 sp=e1404b6f7b90 bsp=e1404b6f0f88 [] ia64_do_page_fault+0x8d0/0xa00 sp=e1404b6f7bb0 bsp=e1404b6f0f38 [] ia64_leave_kernel+0x0/0x270 sp=e1404b6f7c60 bsp=e1404b6f0f38 [] move_native_irq+0x90/0x140 sp=e1404b6f7e30 bsp=e1404b6f0f08 [] iosapic_end_level_irq+0x30/0xe0 
sp=e1404b6f7e30 bsp=e1404b6f0ee8 [] __do_IRQ+0x390/0x3c0 sp=e1404b6f7e30 bsp=e1404b6f0ea8 [] ia64_handle_irq+0x1e0/0x2e0 sp=e1404b6f7e30 bsp=e1404b6f0e78 [] ia64_leave_kernel+0x0/0x270 sp=e1404b6f7e30 bsp=e1404b6f0e78 Kernel panic - not syncing: Aiee, killing interrupt handler! --- arch/ia64/kernel/iosapic.c |8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) Index: devel-tree/arch/ia64/kernel/iosapic.c === --- devel-tree.orig/arch/ia64/kernel/iosapic.c +++ devel-tree/arch/ia64/kernel/iosapic.c @@ -446,7 +446,7 @@ iosapic_end_level_irq (unsigned int irq) #define iosapic_disable_level_irq mask_irq #define iosapic_ack_level_irq nop -struct hw_interrupt_type irq_type_iosapic_level = { +struct irq_chip irq_type_iosapic_level = { .name = "IO-SAPIC-level", .startup = iosapic_startup_level_irq, .shutdown = iosapic_shutdown_level_irq, @@ -454,6 +454,8 @@ struct hw_interrupt_type irq_type_iosapi .disable = iosapic_disable_level_irq, .ack = iosapic_ack_level_irq, .end = iosapic_end_level_irq, + .mask = mask_irq, + .unmask = unmask_irq, .set_affinity = iosapic_set_affinity }; @@ -493,7 +495,7 @@ iosapic_ack_edge_irq (unsigned int irq) #define iosapic_disable_edge_irq nop #define iosapic_end_edge_irq nop -struct hw_interrupt_type irq_type_iosapic_edge = { +struct irq_chip irq_type_iosapic_edge = { .name = "IO-SAPIC-edge", .startup = iosapic_startup_edge_irq, .shutdown = iosapic_disable_edge_irq, @@ -501,6 +503,8 @@ struct hw_interrupt_type irq_type_iosapi .disable = iosapic_disable_edge_irq, .ack = iosapic_ack_edge_irq, .end = iosapic_end_edge_irq, + .mask = mask_irq, + .unmask = unmask_irq, .set_affinity = iosapic_set_affinity }; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a me
Update to cube root benchmark code
Hi Stephen, Thanks for this code, it's easy to experiment with it. Let me propose this simple update with a variation on your ncubic() function. I noticed that all intermediate results were far below 32 bits, so I did a new version which is 30% faster on my athlon with the same results. This is because we only use x and a/x^2 in the function, with x very close to cbrt(a). So a/x^2 is very close to cbrt(a) which is at most 22 bits. So we only use the 32 lower bits of the result of div64_64(), and all intermediate computations can be done on 32 bits (including multiplies and divides). [EMAIL PROTECTED]:~$ ./bictcp Calibrating Function clocks mean(us) max(us) std(us) Avg error bictcp 1085 0.7028.19 2.30 0.172% ocubic 869 0.5622.76 1.23 0.274% ncubic 637 0.4116.29 1.41 0.247% ncubic32435 0.2811.18 1.03 0.247% acbrt 824 0.5321.03 0.85 0.275% hcbrt 547 0.3513.96 0.42 1.580% I also tried to improve a bit by checking for early convergence and returning before last divide, but it is worthless because it almost never happens so it does not make the code any faster. Here's the code. I think that it would be fine if we merged this version since it's supposed to behave better on most 32 bits machines. Best regards, Willy /* Here is a better version of the benchmark code. It has the original code used in 2.4 version of Cubic for comparison --- */ /* Test and measure perf of cube root algorithms. */ #include #include #include #include #include #ifdef __x86_64 #define rdtscll(val) do { \ unsigned int __a,__d; \ asm volatile("rdtsc" : "=a" (__a), "=d" (__d)); \ (val) = ((unsigned long)__a) | (((unsigned long)__d)<<32); \ } while(0) # define do_div(n,base) ({ \ uint32_t __base = (base); \ uint32_t __rem; \ __rem = ((uint64_t)(n)) % __base; \ (n) = ((uint64_t)(n)) / __base; \ __rem; \ }) /** * __ffs - find first bit in word. * @word: The word to search * * Undefined if no bit exists, so code should check against 0 first. 
*/ static __inline__ unsigned long __ffs(unsigned long word) { __asm__("bsfq %1,%0" :"=r" (word) :"rm" (word)); return word; } /* * __fls: find last bit set. * @word: The word to search * * Undefined if no zero exists, so code should check against ~0UL first. */ static inline unsigned long __fls(unsigned long word) { __asm__("bsrq %1,%0" :"=r" (word) :"rm" (word)); return word; } /** * ffs - find first bit set * @x: the word to search * * This is defined the same way as * the libc and compiler builtin ffs routines, therefore * differs in spirit from the above ffz (man ffs). */ static __inline__ int ffs(int x) { int r; __asm__("bsfl %1,%0\n\t" "cmovzl %2,%0" : "=r" (r) : "rm" (x), "r" (-1)); return r+1; } /** * fls - find last bit set * @x: the word to search * * This is defined the same way as ffs. */ static inline int fls(int x) { int r; __asm__("bsrl %1,%0\n\t" "cmovzl %2,%0" : "=&r" (r) : "rm" (x), "rm" (-1)); return r+1; } /** * fls64 - find last bit set in 64 bit word * @x: the word to search * * This is defined the same way as fls. */ static inline int fls64(uint64_t x) { if (x == 0) return 0; return __fls(x) + 1; } static inline uint64_t div64_64(uint64_t dividend, uint64_t divisor) { return dividend / divisor; } #elif __i386 #define rdtscll(val) \ __asm__ __volatile__("rdtsc" : "=A" (val)) /** * ffs - find first bit set * @x: the word to search * * This is defined the same way as * the libc and compiler builtin ffs routines, therefore * differs in spirit from the above ffz() (man ffs). */ static inline int ffs(int x) { int r; __asm__("bsfl %1,%0\n\t" "jnz 1f\n\t" "movl $-1,%0\n" "1:" : "=r" (r) : "rm" (x)); return r+1; } /** * fls - find last bit set * @x: the word to search * * This is defined the same way as ffs(). 
*/ static inline int fls(int x) { int r; __asm__("bsrl %1,%0\n\t" "jnz 1f\n\t" "movl $-1,%0\n" "1:" : "=r" (r) : "rm" (x)); return r+1; } static inline int fls64(uint64_t x) { uint32_t h = x >> 32; if (h) return fls(h) + 32; return fls(x); } #define do_div(n,base) ({ \ unsigned long __upper, __low, __high, __mod, __base; \ __base =
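The observation in the cube root message above (the iterate x and the
quotient a/x^2 both stay close to cbrt(a), so they fit comfortably in 32
bits) can be illustrated with a short integer Newton-Raphson routine. This
is a minimal sketch, not the ncubic32() from the posted benchmark, whose
source is cut off above: cbrt_floor() and bit_length() are hypothetical
names, and the starting estimate is a plain power of two rather than
whatever the original code uses.

```c
#include <stdint.h>

/* Number of significant bits in v (0 for v == 0). */
static int bit_length(uint64_t v)
{
    int n = 0;

    while (v) {
        n++;
        v >>= 1;
    }
    return n;
}

/*
 * Integer cube root, rounded down, via x' = (2*x + a/x^2) / 3.
 * Only the division a / x^2 needs 64 bits; x, the quotient and the
 * update are all held in 32-bit variables.
 */
static uint32_t cbrt_floor(uint64_t a)
{
    uint32_t x, q, next;

    if (a == 0)
        return 0;

    /* Start at 2^ceil(bits/3), which is never below cbrt(a). */
    x = UINT32_C(1) << ((bit_length(a) + 2) / 3);

    for (;;) {
        /* x <= 2^22, so x * x fits easily in 64 bits. */
        q = (uint32_t)(a / ((uint64_t)x * x));
        next = (2 * x + q) / 3;
        if (next >= x)          /* no longer decreasing: x is the answer */
            return x;
        x = next;
    }
}
```

Starting from an estimate that is at least cbrt(a), the iterates decrease
until they reach the floor of the cube root, so the loop needs no iteration
counter. For example, cbrt_floor(27) is 3 and cbrt_floor(26) is 2.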
Re: [PATCH -mm] Blackfin: blackfin i2c driver
On Wed, 7 Mar 2007 13:17:57 +0800 "Sonic Zhang" <[EMAIL PROTECTED]> wrote:

> On 3/6/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Tue, 06 Mar 2007 14:54:18 +0800 "Wu, Bryan" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi folks,
> > >
> > > [PATCH] Blackfin: blackfin i2c driver
> >
> >
> > > + struct i2c_msg *pmsg;
> > > + int i, ret;
> > > + int rc = 0;
> > > +
> > > + if (!(bfin_read_TWI_CONTROL() & TWI_ENA))
> > > + return -ENXIO;
> > > +
> > > + down(&iface->twi_lock);
> > > +
> > > + while (bfin_read_TWI_MASTER_STAT() & BUSBUSY) {
> > > + up(&iface->twi_lock);
> > > + schedule();
> > > + down(&iface->twi_lock);
> > > + }
> >
> > That's a busy loop until this task's timeslice has expired. It'll work,
> > but it'll suck a bit. (Repeated in several places)
> >
>
> OK, I change it into yield(). So, current process will be move to the
> tail of the run queue. Is that OK with you?

Nope, yield is terribly bad when there are busy processes running: it can
stall for a very long time indeed,

Is this hardware not capable of generating an interrupt when BUSBUSY gets
negated?

I guess not, in which case you're stuck with having to poll it - probably
use a cond_resched() in the loop, and an angry comment.
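The polling loop suggested in the message above might look roughly like the
following. This is a user-space sketch under stated assumptions, not the
driver's code: read_master_stat() and cond_resched_stub() are stand-ins for
bfin_read_TWI_MASTER_STAT() and the kernel's cond_resched(), the BUSBUSY
bit value is a placeholder, the twi_lock handling from the patch is left
out, and the poll limit is an addition that the thread does not ask for.

```c
#include <errno.h>

#define BUSBUSY 0x0100u   /* placeholder bit value, for illustration only */

static unsigned int fake_busy_reads;  /* reads left before the bus goes idle */
static unsigned int resched_calls;    /* how often we offered to reschedule */

/* Stand-in for bfin_read_TWI_MASTER_STAT(): busy for a while, then idle. */
static unsigned int read_master_stat(void)
{
    if (fake_busy_reads) {
        fake_busy_reads--;
        return BUSBUSY;
    }
    return 0;
}

/* Stand-in for the kernel's cond_resched(); here it only counts calls. */
static void cond_resched_stub(void)
{
    resched_calls++;
}

/*
 * Wait for the bus to go idle.  Assumption: the hardware raises no
 * interrupt when BUSBUSY clears, so there is nothing to sleep on and we
 * must poll.  Give other tasks a chance on every pass, and give up after
 * max_polls passes instead of spinning forever on a wedged bus.
 */
static int wait_bus_idle(unsigned int max_polls)
{
    while (read_master_stat() & BUSBUSY) {
        if (max_polls-- == 0)
            return -ETIMEDOUT;
        cond_resched_stub();
    }
    return 0;
}
```

With the fake bus busy for three reads, wait_bus_idle(10) returns 0 after
three reschedule points; with a bus that stays busy, wait_bus_idle(5)
returns -ETIMEDOUT.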
Re: [PATCH -mm] Blackfin: blackfin i2c driver
Dear Andrew and Alexey: Thanks a lot for the review. Here is the updated blackfin i2c driver. [PATCH] Blackfin: blackfin i2c driver The i2c linux driver for blackfin architecture which supports both GPIO i2c operation and blackfin on-chip TWI controller i2c operation. Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> --- drivers/i2c/busses/Kconfig | 47 drivers/i2c/busses/i2c-bfin-gpio.c | 98 + drivers/i2c/busses/i2c-bfin-twi.c | 589 3 files changed, 734 insertions(+) Index: linux-2.6/drivers/i2c/busses/Kconfig === --- linux-2.6.orig/drivers/i2c/busses/Kconfig 2007-03-07 13:32:02.0 +0800 +++ linux-2.6/drivers/i2c/busses/Kconfig2007-03-07 13:44:19.0 +0800 @@ -5,6 +5,53 @@ menu "I2C Hardware Bus support" depends on I2C +config I2C_BFIN_GPIO + tristate "Generic Blackfin and HHBF533/561 development board I2C support" + depends on I2C && EXPERIMENTAL + select I2C_ALGOBIT + help + -- + +menu "BFIN I2C SDA/SCL Selection" + depends on I2C_BFIN_GPIO +config BFIN_SDA + int "SDA is GPIO Number" + range 0 15 if (BF533 || BF532 || BF531) + range 0 47 if (BF534 || BF536 || BF537) + range 0 47 if BF561 + default 2 if (BF533 || BF532 || BF531) + +config BFIN_SCL + int "SCL is GPIO Number" + range 0 15 if (BF533 || BF532 || BF531) + range 0 47 if (BF534 || BF536 || BF537) + range 0 47 if BF561 + default 3 +endmenu + +config I2C_BFIN_GPIO_CYCLE_DELAY + int "Cycle Delay in usec" + depends on I2C_BFIN_GPIO + range 1 100 + default 40 + +config I2C_BFIN_TWI + tristate "Blackfin TWI I2C support" + depends on I2C && (BF534 || BF536 || BF537) + help + This the TWI I2C device driver for Blackfin 534/536/537. + + This driver can also be built as a module. If so, the module + will be called i2c-bfin-twi. + +config TWICLK_KHZ + int "TWI clock (kHZ)" + depends on I2C_BFIN_TWI + default 50 + help + The unit of the TWI clock is kilo HZ. Please divide the clock + by 1024 if you count it in HZ. The value should be less than 400. 
+ config I2C_ALI1535 tristate "ALI 1535" depends on I2C && PCI Index: linux-2.6/drivers/i2c/busses/i2c-bfin-gpio.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-2.6/drivers/i2c/busses/i2c-bfin-gpio.c2007-03-07 13:44:19.0 +0800 @@ -0,0 +1,98 @@ +/ + * Description: * + * * + * Maintainer: Meihui Fan <[EMAIL PROTECTED]> * + * * + * CopyRight (c) 2004 HHTech * + * www.hhcn.com, www.hhcn.org * + * All rights reserved. * + * * + * This file is free software; * + * you are free to modify and/or redistribute it * + * under the terms of the GNU General Public Licence (GPL). * + * * + / + +#include +#include +#include +#include +#include +#include + +#include +#include + +#defineI2C_HW_B_HHBF 0x13 + +static void hhbf_setsda(void *data, int state) +{ + if (state) { + gpio_direction_input(CONFIG_BFIN_SDA); + + } else { + gpio_direction_output(CONFIG_BFIN_SDA); + gpio_set_value(CONFIG_BFIN_SDA, 0); + } +} + +static void hhbf_setscl(void *data, int state) +{ + gpio_set_value(CONFIG_BFIN_SCL, state); +} + +static int hhbf_getsda(void *data) +{ + return (gpio_get_value(CONFIG_BFIN_SDA) != 0); +} + + +static struct i2c_algo_bit_data bit_hhbf_data = { + .setsda = hhbf_setsda, + .setscl = hhbf_setscl, + .getsda = hhbf_getsda, + .udelay = CONFIG_I2C_BFIN_GPIO_CYCLE_DELAY, + .timeout = HZ +}; + +static struct i2c_adapter hhbf_ops = { + .owner = THIS_MODULE, + .id = I2C_HW_B_HHBF, + .algo_data = &bit_hhbf_data, + .name = "HHBF I2C driver", +}; + +static int __init i2c_hhbf_init(void) +{ + + if (gpio_request(CONFIG_BFIN_SCL, NULL)) { + printk(KERN_ERR "%s: gpio_request GPIO %d failed \n",__func__, CONFIG_BFIN_SCL); + return -1; + } + + if (gpio_request(CONFIG_BFIN_SDA, NULL)) { + printk(KERN_ERR "%s: gpio_request
Re: kernel-headers
On Wed, 2007-03-07 at 13:14 +0800, zhangxiliang wrote:
> hello,
> do you know where some problems about kernel-headers-*.rpm are discussed?

Hi,

the answer to your question depends on which distro you are using.

If it's a distro that gets the headers from the kernel's "make
headers_install" (Fedora at least), I suspect this mailing list is the
right place. If not, you should probably use a mailing list for your
distribution.

Greetings,
Arjan van de Ven
Re: [patch v2] epoll use a single inode ...
Linus Torvalds wrote:
> On Tue, 6 Mar 2007, Eric Dumazet wrote:
>> I did a user space program, attached to this mail.
>>
>> I rewrote the reciprocal_div() for i386 so that one multiply is used.
>
> Ok, this is definitely faster on Core 2 as well, so "numbers talk,
> bullshit walks". No more objections.
>
> (That said, I bet you could do even better for octal and hex numbers, so
> if you *really* want to speed things up, you should just make a
> special-case routine for each base (there's just three of them), and you
> can then also optimize the base-10 thing much better (you can do two
> digits at a time by dividing by 100, etc)

Of course you can do better for octal and hex -- it's just shift and
mask. Decimal is trickier; however, at least on i386 it might make sense
to divide by 100 and then use the AAM instruction, or a table lookup, to
split it into individual digits.

-hpa
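The two ideas in the message above (shift and mask for power-of-two bases,
and dividing by 100 with a table lookup for decimal) can be sketched in
portable C. This is an illustration under stated assumptions, not the code
from the thread: fmt_pow2(), fmt_dec() and pair_table are hypothetical
names, the routines write backwards from the end of a caller-supplied
buffer, and the AAM variant is not shown.

```c
#include <stdint.h>

/*
 * Format v in base 2^shift (shift = 3 for octal, 4 for hex).
 * 'end' points one past the end of the buffer; the string is built
 * backwards and a pointer to its first character is returned.
 */
static char *fmt_pow2(char *end, uint64_t v, unsigned shift)
{
    static const char digits[] = "0123456789abcdef";
    unsigned mask = (1u << shift) - 1;

    *--end = '\0';
    do {
        *--end = digits[v & mask];  /* no division needed */
        v >>= shift;
    } while (v);
    return end;
}

/* Lookup table: pair_table[n] holds the two ASCII digits of n (00..99). */
static char pair_table[100][2];
static int pair_table_ready;

static void init_pair_table(void)
{
    int i;

    for (i = 0; i < 100; i++) {
        pair_table[i][0] = (char)('0' + i / 10);
        pair_table[i][1] = (char)('0' + i % 10);
    }
    pair_table_ready = 1;
}

/* Format v in decimal, producing two digits per division by 100. */
static char *fmt_dec(char *end, uint64_t v)
{
    if (!pair_table_ready)
        init_pair_table();

    *--end = '\0';
    while (v >= 100) {
        unsigned r = (unsigned)(v % 100);

        v /= 100;
        *--end = pair_table[r][1];
        *--end = pair_table[r][0];
    }
    if (v >= 10) {
        /* two leading digits left */
        *--end = pair_table[v][1];
        *--end = pair_table[v][0];
    } else {
        /* a single leading digit (or the value 0) */
        *--end = (char)('0' + v);
    }
    return end;
}
```

A caller would pass a buffer of at least 23 bytes (22 octal digits of a
64-bit value plus the terminator), for example
`char buf[32]; char *s = fmt_dec(buf + sizeof buf, 1234567);`.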
Re: kref refcounting breakage in mainline
On Tue, 2007-03-06 at 13:04 -0800, Greg KH wrote: > On Tue, Mar 06, 2007 at 06:43:22AM +0100, Mike Galbraith wrote: > > On Mon, 2007-03-05 at 16:25 -0800, Greg KH wrote: > > > > > Mike, I've reverted this patch, and I don't see any references leaking. > > > And, as your patch released the reference on the driver, and the > > > module_add_driver() call would not grab a reference to the driver, only > > > the module kobject, I don't see what you were trying to fix with this > > > patch. > > > > > > Do you have a test case that this fixes? > > > > What it fixed for me was the hard hang reported below. > > > > http://lkml.org/lkml/2007/2/16/96 > > What specific module are you trying to unload that causes the hang? I > think it might just be a problem with that module, and not with all > others. It's ipmi_si that's hanging, waits for completion that never comes. > So, I'm going to revert your patch and work to try to find the real > cause of this problem. Yeah, my stab at it seems busted. I'll take another poke at it to see if I can find out why (post 725522b5453dd680412f2b6463a988e4fd148757) I'm left with a reference. -Mike
Re: [RFC][PATCH 2/7] RSS controller core
Pavel Emelianov wrote: This includes setup of RSS container within generic process containers, all the declarations used in RSS accounting, and core code responsible for accounting. diff -upr linux-2.6.20.orig/include/linux/rss_container.h linux-2.6.20-0/include/linux/rss_container.h --- linux-2.6.20.orig/include/linux/rss_container.h 2007-03-06 13:39:17.0 +0300 +++ linux-2.6.20-0/include/linux/rss_container.h2007-03-06 13:33:28.0 +0300 @@ -0,0 +1,68 @@ +#ifndef __RSS_CONTAINER_H__ +#define __RSS_CONTAINER_H__ +/* + * RSS container + * + * Copyright 2007 OpenVZ SWsoft Inc + * + * Author: Pavel Emelianov <[EMAIL PROTECTED]> + * + */ + +struct page_container; +struct rss_container; + +#ifdef CONFIG_RSS_CONTAINER +int container_rss_prepare(struct page *, struct vm_area_struct *vma, + struct page_container **); + +void container_rss_add(struct page_container *); +void container_rss_del(struct page_container *); +void container_rss_release(struct page_container *); + +int mm_init_container(struct mm_struct *mm, struct task_struct *tsk); +void mm_free_container(struct mm_struct *mm); + +unsigned long container_isolate_pages(unsigned long nr_to_scan, + struct rss_container *rss, struct list_head *dst, + int active, unsigned long *scanned); +unsigned long container_nr_physpages(struct rss_container *rss); + +unsigned long container_try_to_free_pages(struct rss_container *); +void container_out_of_memory(struct rss_container *); + +void container_rss_init_early(void); +#else +static inline int container_rss_prepare(struct page *pg, + struct vm_area_struct *vma, struct page_container **pc) +{ + *pc = NULL; /* to make gcc happy */ + return 0; +} + +static inline void container_rss_add(struct page_container *pc) +{ +} + +static inline void container_rss_del(struct page_container *pc) +{ +} + +static inline void container_rss_release(struct page_container *pc) +{ +} + +static inline int mm_init_container(struct mm_struct *mm, struct task_struct *t) +{ + return 0; +} + +static 
inline void mm_free_container(struct mm_struct *mm) +{ +} + +static inline void container_rss_init_early(void) +{ +} +#endif +#endif diff -upr linux-2.6.20.orig/init/Kconfig linux-2.6.20-0/init/Kconfig --- linux-2.6.20.orig/init/Kconfig 2007-03-06 13:33:28.0 +0300 +++ linux-2.6.20-0/init/Kconfig 2007-03-06 13:33:28.0 +0300 @@ -265,6 +265,13 @@ config CPUSETS bool select CONTAINERS +config RSS_CONTAINER + bool "RSS accounting container" + select RESOURCE_COUNTERS + help + Provides a simple Resource Controller for monitoring and + controlling the total Resident Set Size of the tasks in a container + The wording looks very familiar :-). It would be useful to add "The reclaim logic is now container aware, when the container goes overlimit the page reclaimer reclaims pages belonging to this container. If we are unable to reclaim enough pages to satisfy the request, the process is killed with an out of memory warning" config SYSFS_DEPRECATED bool "Create deprecated sysfs files" default y diff -upr linux-2.6.20.orig/mm/Makefile linux-2.6.20-0/mm/Makefile --- linux-2.6.20.orig/mm/Makefile 2007-02-04 21:44:54.0 +0300 +++ linux-2.6.20-0/mm/Makefile 2007-03-06 13:33:28.0 +0300 @@ -29,3 +29,5 @@ obj-$(CONFIG_MEMORY_HOTPLUG) += memory_h obj-$(CONFIG_FS_XIP) += filemap_xip.o obj-$(CONFIG_MIGRATION) += migrate.o obj-$(CONFIG_SMP) += allocpercpu.o + +obj-$(CONFIG_RSS_CONTAINER) += rss_container.o diff -upr linux-2.6.20.orig/mm/rss_container.c linux-2.6.20-0/mm/rss_container.c --- linux-2.6.20.orig/mm/rss_container.c2007-03-06 13:39:17.0 +0300 +++ linux-2.6.20-0/mm/rss_container.c 2007-03-06 13:33:28.0 +0300 @@ -0,0 +1,307 @@ +/* + * RSS accounting container + * + * Copyright 2007 OpenVZ SWsoft Inc + * + * Author: Pavel Emelianov <[EMAIL PROTECTED]> + * + */ + +#include +#include +#include +#include +#include + +static struct container_subsys rss_subsys; + +struct rss_container { + struct res_counter res; + struct list_head page_list; + struct container_subsys_state css; +}; + 
+struct page_container { + struct page *page; + struct rss_container *cnt; + struct list_head list; +}; + Yes, this is what I was planning to get to -- a per container LRU list. But you have just one list, don't you need active and inactive lists? When the global LRU is manipulated, shouldn't this list be updated as well, so that reclaim will pick the best pages. +static inline struct rss_container *rss_from_cont(struct container *cnt) +{ + return container_of(container_subsys_state(cnt, &rss_subsys), + struct rss_container, css); +} + +int mm_init_container(struct mm_struct *mm, struct task_stru
Re: [PATCH] INPUT/keyboard: PXA27x keyboard support
Hi Rodolfo, On Friday 02 March 2007 11:05, Rodolfo Giometti wrote: > Hello, here my last patch for the PXA27x keyboard support updated to > linux-2.6.21-rc2. > > I added power management support (suspend/resume code). The patch has a bunch of issues that are hard to list because it was sent as an attachment... Examples are: REL_WHEEL does not belong to evbit, using input_free_device() is not allowed after input_unregister_device(), etc. I tried to fix everything I noticed; if you could try the patch below and verify that it still works I will apply it to the input tree. Thanks. -- Dmitry From: Rodolfo Giometti <[EMAIL PROTECTED]> Input: add support for PXA27x keyboard controller Signed-off-by: Rodolfo Giometti <[EMAIL PROTECTED]> Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]> --- drivers/input/keyboard/Kconfig |9 + drivers/input/keyboard/Makefile|1 drivers/input/keyboard/pxa27x_keyboard.c | 258 + include/asm-arm/arch-pxa/pxa27x_keyboard.h | 13 + 4 files changed, 281 insertions(+) Index: work/drivers/input/keyboard/Kconfig === --- work.orig/drivers/input/keyboard/Kconfig +++ work/drivers/input/keyboard/Kconfig @@ -203,6 +203,15 @@ config KEYBOARD_OMAP To compile this driver as a module, choose M here: the module will be called omap-keypad. +config KEYBOARD_PXA27x + tristate "PXA27x keyboard support" + depends on PXA27x + help + Enable support for PXA27x matrix keyboard controller + + To compile this driver as a module, choose M here: the + module will be called pxa27x_keyboard. 
+ config KEYBOARD_AAED2000 tristate "AAED-2000 keyboard" depends on MACH_AAED2000 Index: work/drivers/input/keyboard/Makefile === --- work.orig/drivers/input/keyboard/Makefile +++ work/drivers/input/keyboard/Makefile @@ -17,6 +17,7 @@ obj-$(CONFIG_KEYBOARD_SPITZ) += spitzkb obj-$(CONFIG_KEYBOARD_HIL) += hil_kbd.o obj-$(CONFIG_KEYBOARD_HIL_OLD) += hilkbd.o obj-$(CONFIG_KEYBOARD_OMAP)+= omap-keypad.o +obj-$(CONFIG_KEYBOARD_PXA27x) += pxa27x_keyboard.o obj-$(CONFIG_KEYBOARD_AAED2000)+= aaed2000_kbd.o obj-$(CONFIG_KEYBOARD_GPIO)+= gpio_keys.o Index: work/drivers/input/keyboard/pxa27x_keyboard.c === --- /dev/null +++ work/drivers/input/keyboard/pxa27x_keyboard.c @@ -0,0 +1,258 @@ +/* + * linux/drivers/input/keyboard/pxa27x_keyboard.c + * + * Driver for the pxa27x matrix keyboard controller. + * + * Created:Feb 22, 2007 + * Author: Rodolfo Giometti <[EMAIL PROTECTED]> + * + * Based on a previous implementations by Kevin O'Connor + * and Alex Osborne <[EMAIL PROTECTED]> and + * on some suggestions by Nicolas Pitre <[EMAIL PROTECTED]>. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include +#include +#include +#include + +#define DRIVER_NAME"pxa27x-keyboard" + +#define KPASMKP(col) (col/2 == 0 ? KPASMKP0 : \ +col/2 == 1 ? KPASMKP1 : \ +col/2 == 2 ? 
KPASMKP2 : KPASMKP3) +#define KPASMKPx_MKC(row, col) (1 << (row + 16 * (col % 2))) + +static irqreturn_t pxakbd_irq_handler(int irq, void *dev_id) +{ + struct platform_device *pdev = dev_id; + struct pxa27x_keyboard_platform_data *pdata = pdev->dev.platform_data; + struct input_dev *input_dev = platform_get_drvdata(pdev); + unsigned long kpc = KPC; + int p, row, col, rel; + + if (kpc & KPC_DI) { + unsigned long kpdk = KPDK; + + if (!(kpdk & KPDK_DKP)) { + /* better luck next time */ + } else if (kpc & KPC_REE0) { + unsigned long kprec = KPREC; + KPREC = 0x7f; + + if (kprec & KPREC_OF0) + rel = (kprec & 0xff) + 0x7f; + else if (kprec & KPREC_UF0) + rel = (kprec & 0xff) - 0x7f - 0xff; + else + rel = (kprec & 0xff) - 0x7f; + + if (rel) { + input_report_rel(input_dev, REL_WHEEL, rel); + input_sync(input_dev); + } + } + } + + if (kpc & KPC_MI) { + /* report the status of every button */ + for (row = 0; row < pdata->nr_rows; row++) { + for
Sleeping thread not receive signal until it wakes up
Hi all, I am having this problem. I have a process with 2 threads created. One of the threads keeps calling IOCTL to get information from the kernel and blocks if there is no new information. If information is returned, the thread checks whether any error has happened and triggers an action. Since we have no way to know if the error is gone (the hardware provides no signal), what we do when we trigger an action for the error is set a timer using alarm() and register a SIGALRM handler in the thread using sigaction. After setting the alarm, the thread loops back and calls IOCTL again, which can put it to sleep. The problem is that the SIGALRM handler does not run while the thread is blocked in IOCTL. If we generate some event so that the IOCTL returns with new information, the SIGALRM handler is invoked right away. However, as I read the manual, a thread/process should be woken up, even while it sleeps, if a signal is delivered to it. Am I right? One thing I don't know whether it matters or not is that I am not using sigwait to block the process and wait for the signal, because the thread needs to go back to the IOCTL call and sleep there. So I used sigaction to register the signal handler, in the hope that the kernel will invoke this handler when a SIGALRM is delivered to the thread. Could anyone tell me if I did something wrong and what the correct way to achieve this is? I tried to avoid creating another thread that calls sigwait and blocks until the IOCTL thread explicitly sends it a signal, because I want to use a timer. Thank you in advance, LNgo
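For reference, the behaviour the manual describes can be checked in isolation with a small single-threaded sketch like the one below. It is not a diagnosis of the report above: it uses read() on an empty pipe as a stand-in for the blocking IOCTL, and the function name blocking_read_is_interrupted is made up for illustration. Things it deliberately leaves out, any of which could matter in the reported setup: a handler installed with SA_RESTART makes many system calls restart instead of returning EINTR, a driver that sleeps uninterruptibly inside its ioctl will not be woken by a signal at all, and alarm() raises a process-directed SIGALRM, which the kernel may deliver to any thread that does not block it.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired;

static void on_alarm(int signo)
{
	(void)signo;
	alarm_fired = 1;
}

/*
 * Block in read() on an empty pipe while a one-shot timer is armed.
 * timeout_ms must be greater than zero.
 * Returns 1 if read() failed with EINTR after the handler ran,
 * 0 if it did not, and -1 if the setup itself failed.
 */
int blocking_read_is_interrupted(int timeout_ms)
{
	int fds[2];
	char byte;
	struct sigaction sa, old;
	struct itimerval timer;
	ssize_t n;
	int interrupted;

	if (pipe(fds) < 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;		/* deliberately no SA_RESTART */
	if (sigaction(SIGALRM, &sa, &old) < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	alarm_fired = 0;
	memset(&timer, 0, sizeof(timer));
	timer.it_value.tv_sec = timeout_ms / 1000;
	timer.it_value.tv_usec = (timeout_ms % 1000) * 1000;
	if (setitimer(ITIMER_REAL, &timer, NULL) < 0) {
		sigaction(SIGALRM, &old, NULL);
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	/* Nothing is ever written, so this sleeps until SIGALRM arrives. */
	n = read(fds[0], &byte, 1);
	interrupted = (n < 0 && errno == EINTR && alarm_fired);

	sigaction(SIGALRM, &old, NULL);
	close(fds[0]);
	close(fds[1]);
	return interrupted;
}
```

In a two-thread program, one way to make sure the handler runs in the intended thread is to block SIGALRM with pthread_sigmask() in every other thread.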
Re: 2.6.21-rc2 : Oops in rtc_cmos...
On Tuesday 06 March 2007 8:42 pm, Paul Rolland wrote: > It seems to me that the DRV_RTC_CMOS and the "standard" CONFIG_RTC > shouldn't be used at the same time... Am I correct on that ? Yes. I recall not forcing that because I couldn't be sure the new code was functionally identical to the legacy driver ... and in areas like HPET, knew it was not. Of course, since then I see that someone has kicked in a patch making Linux stop using HPET in legacy-replacement mode, so that particular issue now seems moot. Another area it's not functionally identical is CONFIG_SND_RTCTIMER, where the ALSA code doesn't know how to use the new RTC framework. (Or, probably, cope with the fact that not all RTCs can give periodic IRQs at the rates it wants ...) So it wasn't clear to me that distros might not need to have both options, to help cope with strange hardware. Phasing out legacy code tends to be done a bit cautiously. On the other hand, the distro vendors have been slow to look at this issue, and haven't even upgraded their copies of "hwclock" to be able to recognize /dev/rtc0, so I'm not holding my breath there. > Wouldn't it be better to have this dependency enforced ? Feel free to submit a patch updating the Kconfig for both drivers. Merging it might complicate distro efforts to move away from that legacy driver (by preventing systems that can run with either one), but I don't see that happening very quickly anyway. - Dave
Re: [PATCH 2/3] msi: Fixup the msi enable/disable logic
Michael Ellerman <[EMAIL PROTECTED]> writes: > > Hi Eric, comments below .. > > > I get the reasoning for disabling MSI before we start writing back the > config space, but don't we want to re-enable MSI on the way out? We are restoring the entire msi flags register, which includes the enable bit; setting it a second time is gratuitous. In addition, if we are restoring the register when the enable bit is not set (because we don't have a mask bit), enabling the msi state is actually the wrong thing to do. But I admit that case can only happen after the additions in my last patch. Eric
Re: [PATCH -mm] Blackfin: blackfin i2c driver
On 3/6/07, Andrew Morton <[EMAIL PROTECTED]> wrote: On Tue, 06 Mar 2007 14:54:18 +0800 "Wu, Bryan" <[EMAIL PROTECTED]> wrote: > Hi folks, > > [PATCH] Blackfin: blackfin i2c driver > > + struct i2c_msg *pmsg; > + int i, ret; > + int rc = 0; > + > + if (!(bfin_read_TWI_CONTROL() & TWI_ENA)) > + return -ENXIO; > + > + down(&iface->twi_lock); > + > + while (bfin_read_TWI_MASTER_STAT() & BUSBUSY) { > + up(&iface->twi_lock); > + schedule(); > + down(&iface->twi_lock); > + } That's a busy loop until this task's timeslice has expired. It'll work, but it'll suck a bit. (Repeated in several places) OK, I'll change it to yield(), so the current process will be moved to the tail of the run queue. Is that OK with you?
Re: should RTS init in serial core be tied to CRTSCTS
> shouldnt TIOCM_RTS be passed down only when the 'r' is appended to the > boot cmdline ? How would it be useful? CRTSCTS is for CTS only (i.e., the transmission is paused when CTS is inactive), not for RTS. DTR and RTS should be active when the port is open even without CRTSCTS (= without handshaking), it's used for various purposes such as providing +12V to the device (and two pins can supply more power than one - sure, it isn't the best idea). The name of the option is not CCTS, but CRTSCTS, isn't it? So, you may not only want to pause own transmission when CTS is inactive, but to control the transmission flow from the remote side. Why should RTS be active when the port is open even without CRTSCTS? You may still assert RTS manually if it is used to provide +12V to the device. But as I understand it is not common use of this pin, isn't it? And a question is not only about supporting legacy equipment but also about embedded hardware where RTS/CTS handshaking is handshaking, not something else... -Oleksiy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + stupid-hack-to-make-mainline-build.patch added to -mm tree
Thomas Gleixner wrote: > Ooops. I completely forgot, that you get the absolute expiry time > already in ktime_t format (nanoseconds) when dev->set_next_event() is > called. > > dev->next_event = expires; > > is done right before the call. > > So it's already there for free. > OK, but a trap for young players (ie, me): the absolute time is in ns since kernel boot, but the hypervisor wants an absolute time in ns since system boot. Everything works reasonably well for the first guest started early, so be sure to take a snapshot of hypervisor time early in order to get the correction... J
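The correction described above amounts to recording, once, the difference between the two time bases and adding it to every expiry handed to the hypervisor. A minimal sketch of that arithmetic follows; struct timebase and both function names are hypothetical and do not come from any hypervisor interface.

```c
#include <stdint.h>

/* Hypothetical holder for the offset between the two clocks. */
struct timebase {
	int64_t offset_ns;	/* hypervisor time minus kernel time */
};

/* Take one early snapshot pairing the two clocks. */
struct timebase timebase_snapshot(uint64_t kernel_now_ns, uint64_t hv_now_ns)
{
	struct timebase tb;

	tb.offset_ns = (int64_t)(hv_now_ns - kernel_now_ns);
	return tb;
}

/* Convert an absolute kernel expiry into the hypervisor's time base. */
uint64_t kernel_to_hv_ns(const struct timebase *tb, uint64_t kernel_ns)
{
	return kernel_ns + (uint64_t)tb->offset_ns;
}
```

For a guest started right at system boot the offset is close to zero, which would explain why the missing correction goes unnoticed there.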
Linux v2.6.21-rc3
We've finally hopefully started to put a dent in the regressions, especially the suspend/resume problems introduced since 2.6.20. So 2.6.21-rc3 is out there now, and there's some hope that it will work more widely than -rc1 and -rc2 did. Please do give it a good testing, and update Adrian and the mailing list (and me) about any regressions (hopefully many more of the "it's fixed now" than other kinds, but all regressions are interesting). The appended shortlog gives a reasonable overview. In general we're definitely calming down, and most of the changes are fairly small and obvious fixes. Let's keep the fixes to a minimum, especially since I'm planning on biting peoples heads off if I get any more pull requests for things that aren't real and obvious fixes. Linus --- Adam Litke (1): Fix get_unmapped_area and fsync for hugetlb shm segments Adrian Bunk (8): HID: hid-debug.c should #include arch/arm26/kernel/entry.S: remove dead code make ipc/shm.c:shm_nopage() static mm/{,tiny-}shmem.c cleanups drivers/video/sm501fb.c: make 4 functions static fix the SYSCTL=n compilation arch/i386/kernel/vmi.c must #include remove arch/i386/kernel/tsc.c:custom_sched_clock Ahmed S. Darwish (1): KVM: Use ARRAY_SIZE macro instead of manual calculation. 
Akira Iguchi (1): scc_pata: bugfix for checking DMA IRQ status Alan Cox (4): libata-core: Fix simplex handling pata_qdi: Fix initialisation siimage: DRAC4 note ide: remove a ton of pointless #undef REALLY_SLOW_IO Alexandr Andreev (1): [IA64] sync compat getdents Alexey Dobriyan (1): geode-aes: use unsigned long for spin_lock_irqsave Allan Graves (1): uml: enable RAW Andres Salomon (3): i386: make x86_64 tsc header require i386 rather than vice-versa hrtimers: fix HRTIMER_CB_IRQSAFE_NO_SOFTIRQ description hrtimers: hrtimer_clock_base description typo Andrew Morton (7): throttle_vm_writeout(): don't loop on GFP_NOFS and GFP_NOIO allocations ide: fix pmac breakage KVM: Move kvmfs magic number to cyclades: return closing_wait revert "drivers/net/tulip/dmfe: support basic carrier detection" sis900 warning fixes fix build with CONFIG_NO_IDLE_HZ=n Andrzej Zaborowski (1): ARM: OMAP: correct misc 15xx and non-15xx platform code Antonino A. Daplas (2): MAINTAINERS: Update email address atyfb: Fix kconfig error Aristeu Sergio Rozanski Filho (1): tty_io: fix race in master pty close/slave pty close path Arnaldo Carvalho de Melo (1): [TCP]: Fix minisock tcp_create_openreq_child() typo. 
Arnaud Patard (1): ARM: OMAP: board-nokia770: correct lcd name Atsushi Nemoto (4): [MIPS] jmr3927: build fix [MIPS] Convert to RTC-class ds1742 driver [MIPS] No need to write c0_compare in plat_timer_setup [MIPS] TX39: Remove redundant tx39_blast_icache() calls Avi Kivity (13): KVM: mmu: add missing dirty page tracking cases KVM: Cosmetics KVM: Add hypercall host support for svm KVM: Wire up hypercall handlers to a central arch-independent location KVM: svm: init cr0 with the wp bit set KVM: More 0 -> NULL conversions KVM: Add internal filesystem for generating inodes KVM: Create an inode per virtual machine KVM: Rename some kvm_dev_ioctl_*() functions to kvm_vm_ioctl_*() KVM: Move kvm_vm_ioctl_create_vcpu() around KVM: Per-vcpu inodes KVM: Bump API version KVM: Fix bogus failure in kvm.ko module initialization Bartlomiej Zolnierkiewicz (3): ide: remove some obsoleted kernel params (v2) ide: make legacy IDE VLB modules check for the "probe" kernel params (v2) pata_pdc202xx_old: fix data corruption and other problems Ben Dooks (2): [ARM] 4238/1: S3C24XX: docs: update suspend and resume [ARM] 4239/1: S3C24XX: Update kconfig entries for PM Brice Goglin (1): myri10ge: fix copyright and license Catalin Marinas (1): [ARM] 4241/1: Define mb() as compiler barrier on a uniprocessor system Christian Krafft (1): ipmi: check, if default ports are accessible on PPC Christoph Lameter (1): Page migration: Fix vma flag checking Con Kolivas (1): sched: remove SMT nice Cornelia Huck (3): [S390] cio: Fix locking when calling notify function. [S390] cio: Use path verification to check for path state. [S390] cio: Call cancel_halt_clear even when actl == 0. Dale Farnsworth (2): mv643xx_eth: move mac_addr inside mv643xx_eth_platform_data mv643xx_eth: Place explicit port number in mv643xx_eth_platform_data Dan Aloni (1): [VLAN]: Avoid a 4-order allocation. 
Daniel Walker (2): update timekeeping_is_continuous comment fix vsyscall settimeofday Dave Johnson (1): [MIPS] Fix __raw_read_trylock() to allow multiple readers Dave Jones (2): Fix mv643xx_eth compi
Re: [patch 1/4] signalfd v1 - signalfd core ...
On Tue, 6 Mar 2007 17:36:56 -0800 (PST) Davide Libenzi wrote: > > The read(2) call will read u32 signal numbers that landed over the > signalfd. It returns the size of the data copied, or zero if the sighand > we are attached to, has been detached. So what about signals that the user asked for a siginfo_t to be returned with? -- Cheers, Stephen Rothwell [EMAIL PROTECTED] http://www.canb.auug.org.au/~sfr/
Re: PROBLEM: 2.6.20-1 not working on ibook g4 (BUG/Oops)
David Woodhouse <[EMAIL PROTECTED]> writes: > On Tue, 2007-03-06 at 14:53 +1300, Paul Collins wrote: >> In case it's of interest, 2.6.20 has been running fine on my >> PowerBook5,4. > > How much memory? What if you boot with mem=512M or mem=256M? 1GB. Also works fine when booted with those options. -- Paul Collins Wellington, New Zealand Dag vijandelijk luchtschip de huismeester is dood
RE: 2.6.21-rc2 : Oops in rtc_cmos...
Hello, > > Yes, it does, so it's a Good One (tm), > > And points out that $SUBJECT is misleading; the root cause of > the oops isn't rtc_cmos. Workaround, don't enable the legacy > driver for this hardware. Well, sorry for that, but my point was that without enabling CONFIG_DRV_RTC_CMOS and only using CONFIG_RTC, my dmesg says : drivers/rtc/hctosys.c: unable to open rtc device (rtc0) Having seen that, I went through all the options, trying to find what I could have forgotten, and added the RTC_CMOS one, which resulted in an Oops... > One of the good things about getting rtc-cmos merged: it > exposes this new RTC framework to new mistakes, which helps > fix some of the remaining rough spots. Good ;) > > pnp: Device 00:03 does not support disabling. > > Blame the PNP stack for that particular useless message. > I'll send a fix for that one too. OK, ready to test ! > > drivers/rtc/hctosys.c: unable to open rtc device (rtc0) > Because probing 00:03 failed, was never fully usable. > So then rtc0 couldn't be found. You'd get the same > message if, say, the RTC was loaded as a module. It seems to me that the DRV_RTC_CMOS and the "standard" CONFIG_RTC shouldn't be used at the same time... Am I correct on that ? Wouldn't it be better to have this dependency enforced ? Regards, Paul
[kj]Patch8:replace pci_find_device in drivers/telephony/ixj.c
Hi, Cleaning up of pci_find_device in drivers/telephony/ixj.c. Applies and compiles clean on Linus tree. No hardware hence not tested!! Unable to find a suitable Maintainer for the current subsection in the Maintainers file. I am not sure whether this is orphaned or maintained. Can somebody help me identify the actual maintainer. thank you. Signed-off-by: Surya Prabhakar <[EMAIL PROTECTED]> --- diff --git a/drivers/telephony/ixj.c b/drivers/telephony/ixj.c index 71cb64e..c7b0a35 100644 --- a/drivers/telephony/ixj.c +++ b/drivers/telephony/ixj.c @@ -7692,7 +7692,7 @@ static int __init ixj_probe_pci(int *cnt IXJ *j = NULL; for (i = 0; i < IXJMAX - *cnt; i++) { - pci = pci_find_device(PCI_VENDOR_ID_QUICKNET, + pci = pci_get_device(PCI_VENDOR_ID_QUICKNET, PCI_DEVICE_ID_QUICKNET_XJ, pci); if (!pci) break; @@ -7712,6 +7712,7 @@ static int __init ixj_probe_pci(int *cnt printk(KERN_INFO "ixj: found Internet PhoneJACK PCI at 0x%x\n", j->DSPbase); ++*cnt; } + pci_dev_put(pci); return probe; } -- surya.
Re: qla2xxx BUG: workqueue leaked lock or atomic
On Wed, 28 Feb 2007 16:37:22 +0100 Andre Noll <[EMAIL PROTECTED]> wrote: > On 16:18, Andre Noll wrote: > > > With 2.6.21-rc2 I am unable to reproduce this BUG message. However, > > writing to both raid systems at the same time via lvm still locks up > > the system within minutes. > > Screenshot of the resulting kernel panic: > > http://systemlinux.org/~maan/shots/kernel-panic-21-rc2-huangho2.png > It died in CFQ. Please try a different IO scheduler. Use something like echo deadline > /sys/block/sda/queue/scheduler This could still be the old qla2xxx bug, or it could be a new qla2xxx bug, or it could be a block bug, or it could be an LVM bug. Adrian, can we please track this as a regression?
Re: [PATCH] mm: don't use ZONE_DMA unless CONFIG_ZONE_DMA is set in setup.c
Dave Jones wrote: > On Tue, Mar 06, 2007 at 05:52:46PM -0800, Andrew Morton wrote: > > On Tue, 06 Mar 2007 18:52:59 -0500 > > Andres Salomon <[EMAIL PROTECTED]> wrote: > > > > > If CONFIG_ZONE_DMA is ever undefined, ZONE_DMA will also not be defined, > > > and setup.c won't compile. This wraps it with an #ifdef. > > > > > > > I guess if anyone tries to disable ZONE_DMA on i386 they'll pretty quickly > > discover that. But I don't think we need to "fix" it yet? Oh, it's certainly not urgent. I sent it simply for correctness reasons. It would've been nice to see the ZONE_DMA removal patches just #define ZONE_DMA regardless, and include less #ifdefs scattered about; but at this point, I'd just as soon prefer to see a proper way to allocate things based on address constraints (as discussed in http://www.gelato.unsw.edu.au/archives/linux-ia64/0609/19036.html). > > CONFIG_ZONE_DMA isn't even optional on i386, so I'm curious how > you could hit this compile failure. > Why, with custom code of course ;)
Re: Wanted: simple, safe x86 stack overflow detection
> I'm certainly in favor of the move; IRQ stacks could be made > rather deep and cheaply at that. I may get around to writing it this > week if no one else does it first. the irq stacks aren't the problem; RH at some point accidentally shipped a kernel with 4k *shared* irq/user context stack and even that gave almost no issues. irq's really shouldn't actually nest; it's bad for just about everything to do that (but that's another story, I would *love* to get rid of the "enable irqs" thing in the x86 irq path, it hurts just about anything in reality) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] proc: maps protection
> [Adding Cc:lkml] > How about using a reduced check, as is done for fd and environ? This > would allow root-running system monitors to still do their job. > Effectively, this changes the test from "is ptracing" to just "can > ptrace". > > If this still isn't considered safe, I'll add the maps_protect file... btw I consider it an information leak that any user can see which files/libraries any other user and root has mmap'd. (and with glibc's stdio mmap feature that goes even beyond direct mmap to fopen()'d). If root or some other user wants to watch hillary-vs-obama-in-the-mud.avi, no other user has ANY business even seeing that. So at minimum it's a privacy issue showing the filenames... - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp_cubic: faster cube root
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Tue, 6 Mar 2007 14:47:06 -0800 > The Newton-Raphson method is quadratically convergent so > only a small fixed number of steps are necessary. > Therefore it is faster to unroll the loop. Since div64_64 is no longer > inline it won't cause code explosion. > > Also fixes a bug that can occur if x^2 was bigger than 32 bits. > > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> Applied, thanks Stephen.
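For readers who have not seen it, the Newton-Raphson step for a cube root is x' = (2x + a/x^2) / 3. The user-space sketch below is not the tcp_cubic code and does not unroll a fixed number of steps as the patch does; instead it starts from an estimate that is known to be too large and iterates until the estimate stops decreasing, which yields floor(cbrt(a)). The name cube_root_u64 is made up. Keeping x and x*x in 64 bits avoids the kind of overflow the changelog mentions.

```c
#include <stdint.h>

/* Integer cube root: largest r such that r*r*r <= a. */
uint32_t cube_root_u64(uint64_t a)
{
	unsigned int bits = 0;
	uint64_t x, y, t;

	if (a == 0)
		return 0;

	/* Bit length of a, so that a < 2^bits. */
	for (t = a; t != 0; t >>= 1)
		bits++;

	/* 2^ceil(bits/3) cubed is at least 2^bits, so x starts above the root. */
	x = (uint64_t)1 << ((bits + 2) / 3);

	for (;;) {
		/* Newton-Raphson step; x <= 2^22, so x * x fits easily. */
		y = (2 * x + a / (x * x)) / 3;
		if (y >= x)
			break;		/* no longer decreasing: x is the root */
		x = y;
	}
	return (uint32_t)x;
}
```

Because the starting point is within a factor of two of the true root and convergence is quadratic, only a handful of iterations are needed in practice, which is what makes unrolling a fixed count attractive.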
Re: SATA resume slowness, e1000 MSI warning
"Kok, Auke" <[EMAIL PROTECTED]> writes: > Ingo Molnar wrote: >> * Kok, Auke <[EMAIL PROTECTED]> wrote: >> > BUG: at drivers/pci/msi.c:611 pci_enable_msi() >> I would poke Eric Biederman(sp?) about this one. Maybe it's even solved by the MSI-enable-related patch he posted in the past 24-48 hours. >>> I tried the 3-patch series "[PATCH 0/3] Basic msi bug fixes.." and they fix >>> this problem for me. Were you expecting the OOPS in the first place? [...] >> >> the bug was the warning message (a WARN_ON()) above - not an oops. So that >> warning message is gone in your testing? > > yes. Sorry for the slow reply. I was out of town for my brother's wedding the last few days. I wasn't exactly expecting the WARN_ON to trigger. What I fixed was an inconsistency in handling our state bits. Fixing that inconsistency appears to have fixed the e1000 usage scenario mostly by accident. The basic issue is that pci_save_state saves the current msi state along with other registers, and then the e1000 driver goes and disables the msi irq after we have saved the irq state as on. My code notices that the msi irq was disabled before restore time, so it skips the restore. However we now have a leak of the msi saved cap because we are not freeing it. This leaves us with some basic questions. - Does it make sense for suspend/resume methods to request/free irqs? - Does it make sense for suspend/resume methods to allocate/free msi irqs? - Do we want pci_save/restore_cap to save/restore msi state? The path of least resistance is to just free the extra state and we are good. I'm just not quite certain that is sane and it has been a long day. Eric
Re: [PATCH] mm: don't use ZONE_DMA unless CONFIG_ZONE_DMA is set in setup.c
On Tue, Mar 06, 2007 at 05:52:46PM -0800, Andrew Morton wrote:
> On Tue, 06 Mar 2007 18:52:59 -0500
> Andres Salomon <[EMAIL PROTECTED]> wrote:
>
> > If CONFIG_ZONE_DMA is ever undefined, ZONE_DMA will also not be defined,
> > and setup.c won't compile. This wraps it with an #ifdef.
> >
>
> I guess if anyone tries to disable ZONE_DMA on i386 they'll pretty quickly
> discover that. But I don't think we need to "fix" it yet?

CONFIG_ZONE_DMA isn't even optional on i386, so I'm curious how you could hit this compile failure.

Dave

--
http://www.codemonkey.org.uk
Re: [RFC][PATCH 1/7] Resource counters
Pavel Emelianov wrote:

Introduce generic structures and routines for resource accounting.

Each resource accounting container is supposed to aggregate it, container_subsystem_state and its resource-specific members within.

diff -upr linux-2.6.20.orig/include/linux/res_counter.h linux-2.6.20-0/include/linux/res_counter.h
--- linux-2.6.20.orig/include/linux/res_counter.h	2007-03-06 13:39:17.0 +0300
+++ linux-2.6.20-0/include/linux/res_counter.h	2007-03-06 13:33:28.0 +0300
@@ -0,0 +1,83 @@
+#ifndef __RES_COUNTER_H__
+#define __RES_COUNTER_H__
+/*
+ * resource counters
+ *
+ * Copyright 2007 OpenVZ SWsoft Inc
+ *
+ * Author: Pavel Emelianov <[EMAIL PROTECTED]>
+ *
+ */
+
+#include
+
+struct res_counter {
+	unsigned long usage;
+	unsigned long limit;
+	unsigned long failcnt;
+	spinlock_t lock;
+};
+
+enum {
+	RES_USAGE,
+	RES_LIMIT,
+	RES_FAILCNT,
+};
+
+ssize_t res_counter_read(struct res_counter *cnt, int member,
+		const char __user *buf, size_t nbytes, loff_t *pos);
+ssize_t res_counter_write(struct res_counter *cnt, int member,
+		const char __user *buf, size_t nbytes, loff_t *pos);
+
+static inline void res_counter_init(struct res_counter *cnt)
+{
+	spin_lock_init(&cnt->lock);
+	cnt->limit = (unsigned long)LONG_MAX;
+}
+

Is there any way to indicate that there are no limits on this container? LONG_MAX is quite huge, but still, when the administrator wants to configure a container for *unlimited usage*, it becomes hard for the administrator.

+static inline int res_counter_charge_locked(struct res_counter *cnt,
+		unsigned long val)
+{
+	if (cnt->usage <= cnt->limit - val) {
+		cnt->usage += val;
+		return 0;
+	}
+
+	cnt->failcnt++;
+	return -ENOMEM;
+}
+
+static inline int res_counter_charge(struct res_counter *cnt,
+		unsigned long val)
+{
+	int ret;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	ret = res_counter_charge_locked(cnt, val);
+	spin_unlock_irqrestore(&cnt->lock, flags);
+	return ret;
+}
+

Will atomic counters help here?
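To make the atomic-counter question concrete, here is a minimal user-space sketch using C11 `<stdatomic.h>`, not the kernel's atomic primitives. The `res_counter_sketch` type and its functions are hypothetical names for illustration; the sketch also assumes `limit` is not changed concurrently. It shows that "check against the limit, then add" needs a compare-and-swap loop rather than a plain atomic add. It also guards the case `val > limit`, where the unsigned subtraction `limit - val` in the check above would wrap around.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical lock-free variant of the counter, for illustration only. */
struct res_counter_sketch {
	atomic_ulong usage;
	atomic_ulong failcnt;
	unsigned long limit;	/* assumed not to change concurrently */
};

static void res_counter_sketch_init(struct res_counter_sketch *cnt,
		unsigned long limit)
{
	atomic_init(&cnt->usage, 0);
	atomic_init(&cnt->failcnt, 0);
	cnt->limit = limit;
}

/* Returns 0 on success, -ENOMEM if the charge would exceed the limit. */
static int res_counter_sketch_charge(struct res_counter_sketch *cnt,
		unsigned long val)
{
	unsigned long old = atomic_load(&cnt->usage);

	do {
		/* "val > limit" avoids wrap-around in "limit - val". */
		if (val > cnt->limit || old > cnt->limit - val) {
			atomic_fetch_add(&cnt->failcnt, 1);
			return -ENOMEM;
		}
		/* On failure the CAS reloads "old" and the check is redone. */
	} while (!atomic_compare_exchange_weak(&cnt->usage, &old, old + val));

	return 0;
}

/* Subtracts val, clamping at zero instead of underflowing. */
static void res_counter_sketch_uncharge(struct res_counter_sketch *cnt,
		unsigned long val)
{
	unsigned long old = atomic_load(&cnt->usage);
	unsigned long dec;

	do {
		dec = val > old ? old : val;
	} while (!atomic_compare_exchange_weak(&cnt->usage, &old, old - dec));
}
```

Whether this beats the spinlock depends on contention and on whether usage and limit ever need to be updated together, which the sketch does not attempt.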
+static inline void res_counter_uncharge_locked(struct res_counter *cnt,
+		unsigned long val)
+{
+	if (unlikely(cnt->usage < val)) {
+		WARN_ON(1);
+		val = cnt->usage;
+	}
+
+	cnt->usage -= val;
+}
+
+static inline void res_counter_uncharge(struct res_counter *cnt,
+		unsigned long val)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	res_counter_uncharge_locked(cnt, val);
+	spin_unlock_irqrestore(&cnt->lock, flags);
+}
+
+#endif
diff -upr linux-2.6.20.orig/init/Kconfig linux-2.6.20-0/init/Kconfig
--- linux-2.6.20.orig/init/Kconfig	2007-03-06 13:33:28.0 +0300
+++ linux-2.6.20-0/init/Kconfig	2007-03-06 13:33:28.0 +0300
@@ -265,6 +265,10 @@ config CPUSETS
 
 	  Say N if unsure.
 
+config RESOURCE_COUNTERS
+	bool
+	select CONTAINERS
+
 config SYSFS_DEPRECATED
 	bool "Create deprecated sysfs files"
 	default y
diff -upr linux-2.6.20.orig/kernel/Makefile linux-2.6.20-0/kernel/Makefile
--- linux-2.6.20.orig/kernel/Makefile	2007-03-06 13:33:28.0 +0300
+++ linux-2.6.20-0/kernel/Makefile	2007-03-06 13:33:28.0 +0300
@@ -51,6 +51,7 @@ obj-$(CONFIG_RELAY) += relay.o
 obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_TASK_DELAY_ACCT) += delayacct.o
 obj-$(CONFIG_TASKSTATS) += taskstats.o tsacct.o
+obj-$(CONFIG_RESOURCE_COUNTERS) += res_counter.o
 
 ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 # According to Alan Modra <[EMAIL PROTECTED]>, the -fno-omit-frame-pointer is
diff -upr linux-2.6.20.orig/kernel/res_counter.c linux-2.6.20-0/kernel/res_counter.c
--- linux-2.6.20.orig/kernel/res_counter.c	2007-03-06 13:39:17.0 +0300
+++ linux-2.6.20-0/kernel/res_counter.c	2007-03-06 13:33:28.0 +0300
@@ -0,0 +1,72 @@
+/*
+ * resource containers
+ *
+ * Copyright 2007 OpenVZ SWsoft Inc
+ *
+ * Author: Pavel Emelianov <[EMAIL PROTECTED]>
+ *
+ */
+
+#include
+#include
+#include
+#include
+
+static inline unsigned long *res_counter_member(struct res_counter *cnt, int member)
+{
+	switch (member) {
+	case RES_USAGE:
+		return &cnt->usage;
+	case RES_LIMIT:
+		return &cnt->limit;
+	case RES_FAILCNT:
+		return &cnt->failcnt;
+	};
+
+	BUG();
+	return NULL;
+}
+
+ssize_t res_counter_read(struct res_counter *cnt, int member,
+		const char __user *userbuf, size_t nbytes, loff_t *pos)
+{
+	unsigned long *val;
+	char buf[64], *s;
+
+	s = buf;
+	val = res_counter_member(cnt, member);
+	s += sprintf(s, "%lu\n", *val);
+	return simple_read_from_buffer((void __user
Re: passing function pointers through platform devices?
NZG wrote:

I'm developing an SPI- bus >MMC/SD block driver translation layer. As part of this layer the write protect and card detect lines need to be read. The method for determining the state of these lines will be board specific. Is it appropriate to pass a function pointer through a platform device (declared in the mach initialization) to implement card_available and write_protect function calls? Or is there a cleaner way to do it?

Once the generic GPIO framework migrates upstream from -mm you should just pass the GPIO token from board-specific code and gpio_get_value() it.

--Ben.
Re: [RFC] hwbkpt: Hardware breakpoints (was Kwatch)
> > Yeah, I guess that's right. It should still return NOTIFY_STOP when
> > args->err has no other bits set, so notifiers aren't called with zero.
>
> In practice that might not work. On my machine, at least, reads of DR6
> return ones in all the reserved bit positions.

Does that mean

	asm("mov %1,%%dr6; mov %%dr6,%0" : "=r" (mask) : "r" (0));

puts in mask the set of reserved bits? We could collect that value at CPU startup and mask it off args->err, then OR it back into vdr6.

Thanks,
Roland
Re: [patch v2] epoll use a single inode ...
On Tue, 6 Mar 2007, Eric Dumazet wrote:
>
> I did a user space program, attached to this mail.
>
> I rewrote the reciprocal_div() for i386 so that one multiply is used.

Ok, this is definitely faster on Core 2 as well, so "numbers talk, bullshit walks". No more objections.

(That said, I bet you could do even better for octal and hex numbers, so if you *really* want to speed things up, you should just make a special-case routine for each base (there's just three of them), and you can then also optimize the base-10 thing much better (you can do two digits at a time by dividing by 100, etc))

		Linus
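The "two digits at a time" idea for base 10 can be sketched as follows. This is a plain user-space illustration, not the kernel's number formatting code; `digit_pairs` and `u64_to_dec` are hypothetical names. Each loop iteration does one division by 100 and looks both digits up in a 200-byte table, roughly halving the number of divisions compared with dividing by 10 per digit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "00", "01", ..., "99" packed back to back: entry r starts at index 2 * r. */
static const char digit_pairs[] =
	"00010203040506070809"
	"10111213141516171819"
	"20212223242526272829"
	"30313233343536373839"
	"40414243444546474849"
	"50515253545556575859"
	"60616263646566676869"
	"70717273747576777879"
	"80818283848586878889"
	"90919293949596979899";

/*
 * Hypothetical helper: writes v in decimal, NUL-terminated, into out
 * (which needs room for 20 digits plus the terminator) and returns the
 * number of digits written.
 */
static size_t u64_to_dec(uint64_t v, char out[21])
{
	char tmp[20];
	size_t pos = sizeof(tmp);
	size_t len;

	/* Peel off two digits per division, filling tmp from the end. */
	while (v >= 100) {
		unsigned int r = (unsigned int)(v % 100);

		v /= 100;
		tmp[--pos] = digit_pairs[2 * r + 1];
		tmp[--pos] = digit_pairs[2 * r];
	}

	/* One or two leading digits remain. */
	if (v >= 10) {
		tmp[--pos] = digit_pairs[2 * v + 1];
		tmp[--pos] = digit_pairs[2 * v];
	} else {
		tmp[--pos] = (char)('0' + v);
	}

	len = sizeof(tmp) - pos;
	memcpy(out, tmp + pos, len);
	out[len] = '\0';
	return len;
}
```

Octal and hex need no division at all, since each digit is a shift and a mask, which is presumably why a special-case routine per base is suggested.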
Re: [PATCH] x86_64 irq: keep consistent for changing IRQ0_VECTOR from 0x20 to 0x30
On Mon, 5 Mar 2007, Yinghai Lu wrote:
>
> please check the patch

Hmm.. It doesn't look *wrong*, but could you please
 - split it up a bit (some of it is 100% obvious, ie the comment fixes)
 - write an explanation for the individually split up patches
 - not use attachments, but just make it inline. It's practically impossible to reply and quote part of the patch now.

Eric/Ingo - did you go through and check the patch?

Thanks,
		Linus
Re: [PATCH -mm] utrace: nommu fixup support utrace
That old ptrace check seems pretty questionable to me. I think what you want is for the nommu world's get_user_pages/access_process_vm when called with force=1,write=1 on a read-only MAP_PRIVATE page to do something more morally similar to the mmu world's COW than it does now.

Thanks,
Roland
Re: [PATCH] proc: maps protection
On Tue, Mar 06, 2007 at 06:59:42PM -0800, Andrew Morton wrote:
> On Tue, 6 Mar 2007 18:13:35 -0800 Kees Cook <[EMAIL PROTECTED]> wrote:
>
> > On Tue, Mar 06, 2007 at 05:56:09PM -0800, Andrew Morton wrote:
> > > On Tue, 6 Mar 2007 17:22:34 -0800
> > > Kees Cook <[EMAIL PROTECTED]> wrote:
> > >
> > > > This is a continuation of a much earlier discussion[1]. As I
> > > > understand, the problem is:
> > >
> > > This sounds like a really good way of breaking lots and lots of people's
> > > expensively-developed stuff. In ways which we won't discover until a year
> > > after we shipped it.
> > >
> > > So nope, sorry. Need to find a compatible way of doing this. Perhaps a
> > > kernel boot parameter or a /proc knob.
> >
> > Do you have examples of things in the kernel that I can use as a
> > starting point?
>
> No, I don't think this has precedent.
>
> > Would something like /proc/sys/kernel/maps_protect be
> > reasonable?
>
> Yes, that sounds reasonable.
>
> An alternative is to do it with elf headers, perhaps - let the process
> specify what protections it wants in some manner.
>
> > If an acceptable toggle is made, would you consider it being enabled by
> > default (i.e. "tighter security by default")?
>
> Again, that sounds risky.

[Adding Cc:lkml]

How about using a reduced check, as is done for fd and environ? This would allow root-running system monitors to still do their job. Effectively, this changes the test from "is ptracing" to just "can ptrace". If this still isn't considered safe, I'll add the maps_protect file...
---
 task_mmu.c   | 16 +++-
 task_nommu.c |  6 ++
 2 files changed, 21 insertions(+), 1 deletion(-)
---
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 7445980..7c9aad3 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -134,6 +134,9 @@ static int show_map_internal(struct seq_file *m, void *v, struct mem_size_stats
 	dev_t dev = 0;
 	int len;
 
+	if (!ptrace_may_attach(task))
+		return -EACCES;
+
 	if (file) {
 		struct inode *inode = vma->vm_file->f_path.dentry->d_inode;
 		dev = inode->i_sb->s_dev;
@@ -444,11 +447,22 @@ const struct file_operations proc_maps_operations = {
 #ifdef CONFIG_NUMA
 extern int show_numa_map(struct seq_file *m, void *v);
 
+static int show_numa_map_checked(struct seq_file *m, void *v)
+{
+	struct proc_maps_private *priv = m->private;
+	struct task_struct *task = priv->task;
+
+	if (!ptrace_may_attach(task))
+		return -EACCES;
+
+	return show_numa_map(m, v);
+}
+
 static struct seq_operations proc_pid_numa_maps_op = {
 	.start	= m_start,
 	.next	= m_next,
 	.stop	= m_stop,
-	.show	= show_numa_map
+	.show	= show_numa_map_checked
 };
 
 static int numa_maps_open(struct inode *inode, struct file *file)
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 7cddf6b..c5783b7 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -143,6 +143,12 @@ out:
 static int show_map(struct seq_file *m, void *_vml)
 {
 	struct vm_list_struct *vml = _vml;
+	struct proc_maps_private *priv = m->private;
+	struct task_struct *task = priv->task;
+
+	if (!ptrace_may_attach(task))
+		return -EACCES;
+
 	return nommu_vma_show(m, vml->vma);
 }
 
--
Kees Cook@outflux.net
Re: [PATCH] Fix get_order()
On Tue, 6 Mar 2007, David Howells wrote:
> /**
> + * ilog2_up - rounded up log of base 2 of 32-bit or a 64-bit unsigned value
> + * @n - parameter
> + *
> + * constant-capable log of base 2 calculation
> + * - this can be used to initialise global variables from constant data, hence
> + *   the massive ternary operator construction
> + * - the result is rounded up
> + * - the result is undefined when n < 1
> + *
> + * selects the appropriately-sized optimised version depending on sizeof(n)
> + */
> +#define ilog2_up(n) ((n) == 1 ? 0 : ilog2((n) - 1) + 1)

This is wrong. It uses "n" twice, which makes it unsafe as a macro.

It would need to be an inline function, but then the global initializer comment is wrong. Or it could use a "__builtin_constant_p()" (which gcc defines to not have side effects) to allow the multiple use for constant data.

Or we could require that "ilog2(0)" returns -1, and then we could just say

	#define ilog2_up(n) (ilog2((n)-1)+1)

Or.. ?

The whole "get_order()" macro also has some serious lack of parentheses. In general, commit 39d61db0edb34d60b83c5e0d62d0e906578cc707 just was pretty damn bad!

I'm becoming a bit disgruntled about this whole thing, I have to admit. I'm just not sure the bugs here are worth it. Especially considering that __get_order() has apparently never even been tested to begin with, since nobody but FRV has ever #defined the ARCH_HAS_ILOG2_U?? macros.

The whole *reason* for that mess seems to be bogus too, since at least ia64 still has its own inline "get_order()", which means that nobody can use get_order() for constant initializers *anyway*, quite unlike the comments say.

So the whole thing is:
 - buggy
 - untested
 - has untrue comments
 - makes no real sense

and I'm inclined to just revert 39d61db0 instead of adding more and more breakage to it, since it's simply not going to help with the fundamental problems!

		Linus
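The double-evaluation problem and the "ilog2(0) returns -1" alternative can be illustrated in user space. The sketch below does not use the kernel's `ilog2()`; `ilog2_sketch`, `ilog2_up_sketch` and `ILOG2_UP_MACRO` are stand-in names, and the loop-based log is only there to keep the example self-contained.

```c
/* Stand-in for ilog2(): floor(log2(n)), with -1 for n == 0, which is
 * the convention proposed in the message above. */
static inline int ilog2_sketch(unsigned long n)
{
	int log = -1;

	while (n != 0) {
		n >>= 1;
		log++;
	}
	return log;
}

/* Rounded-up log2 for n >= 1. As a function, n is evaluated exactly
 * once, and no special case for n == 1 is needed. */
static inline int ilog2_up_sketch(unsigned long n)
{
	return ilog2_sketch(n - 1) + 1;
}

/* Same shape as the macro under review: "n" appears twice, so an
 * argument with side effects is evaluated twice. */
#define ILOG2_UP_MACRO(n) ((n) == 1 ? 0 : ilog2_sketch((n) - 1) + 1)
```

With `unsigned long i = 4;`, the expression `ILOG2_UP_MACRO(i++)` increments `i` twice and computes the log of the already incremented value, while `ilog2_up_sketch(i++)` behaves as expected. The trade-off noted in the message still applies: an inline function cannot be used in a constant initializer for a global.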
Re: [SLUB 2/3] Large kmalloc pass through. Removal of large general slabs
On Tue, 6 Mar 2007, Matt Mackall wrote:

> I've been meaning to do this in SLOB as well. Perhaps it warrants
> doing in stock kmalloc? I've got a grand total of 18 of these objects
> here.

The number increases with the number of NUMA nodes. We have had trouble with the maximum kmalloc size before and this will get rid of it for good.
Re: [patch] disable NMI watchdog by default
> --- linux.orig/include/asm-x86_64/nmi.h
> +++ linux/include/asm-x86_64/nmi.h
> @@ -63,7 +63,7 @@ extern int setup_nmi_watchdog(char *);
>
>  extern atomic_t nmi_active;
>  extern unsigned int nmi_watchdog;
> -#define NMI_DEFAULT -1
> +#define NMI_DEFAULT 0

Maybe I'm missing something obvious, but this patch doesn't seem correct to me. The sentiment of disabling the NMI watchdog by default is fine, and I agree with it, but I don't think this patch does what it says.

First of all, I have a system running a kernel with this patch applied (v2.6.21-rc2-gc3442e2), and I see NMIs in /proc/interrupts and "testing NMI watchdog ... OK." in the log.

And second, looking at the NMI code, it seems that this change actually makes it impossible to turn off the NMI watchdog! In arch/x86_64/kernel/nmi.c, we have:

	void nmi_watchdog_default(void)
	{
		if (nmi_watchdog != NMI_DEFAULT)
			return;
		if (nmi_known_cpu())
			nmi_watchdog = NMI_LOCAL_APIC;
		else
			nmi_watchdog = NMI_IO_APIC;
	}

so it seems changing the value of NMI_DEFAULT has no effect on this logic, really: if nmi_watchdog is left at the default, then the kernel chooses LAPIC or IO-APIC. And if someone passes "nmi_watchdog=0" on the command line, nmi_watchdog is still NMI_DEFAULT and so the same logic triggers.

Ingo, I assume you tested this, so what am I missing?

 - R.
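The argument can be checked with a small user-space model of the quoted function. This is a sketch, not kernel code: `resolve_nmi_watchdog` is a hypothetical pure-function rewrite of `nmi_watchdog_default()`, the sentinel is passed in as a parameter so both values of NMI_DEFAULT can be compared, and the numeric values of the mode constants are assumptions made for the example.

```c
/* Mode values are assumed for illustration. */
enum {
	NMI_NONE	= 0,
	NMI_IO_APIC	= 1,
	NMI_LOCAL_APIC	= 2,
};

/*
 * Model of nmi_watchdog_default(): returns the mode that ends up in
 * effect, given the current setting, the value used as the "unset"
 * sentinel, and whether the CPU is one the local APIC watchdog knows.
 */
static int resolve_nmi_watchdog(int nmi_watchdog, int nmi_default,
		int known_cpu)
{
	if (nmi_watchdog != nmi_default)
		return nmi_watchdog;	/* user made an explicit choice */

	return known_cpu ? NMI_LOCAL_APIC : NMI_IO_APIC;
}
```

With the old sentinel of -1, an explicit "nmi_watchdog=0" is different from the sentinel and stays off. With a sentinel of 0, the same explicit 0 is indistinguishable from "unset" and the function picks a watchdog anyway, which matches the behaviour reported above.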
Re: Xen & VMI?
Rusty Russell wrote:

On Tue, 2007-03-06 at 21:37 +0100, Ingo Molnar wrote:

maybe i shouldnt call it 'VMI' but 'the paravirt ABI'. I dont mind if it's the Xen ABI or the VMWare ABI or a mesh of the two - everyone can map their own internals to that /one/ ABI.

I think it's an excellent aim, but it's *HARD*. I rejected this approach earlier because I'm just not smart enough. (Yet?)

With VMI, I think we came within 90% of getting a cross vendor paravirt-ABI that satisfied everyone's needs. Nobody is smart enough to figure out the last 10% - it needs cooperation, trial, error, and experience dealing with each other's hypervisors.

The Linux side is fairly stable. The hardware side is changing, and the hypervisor side is changing. This means the ABI will churn fairly fast. The hypervisors are very different, which means the ABI will be very wide. We could start with VMI and try to support Xen, KVM and lguest. It would at least give us a better idea of the scope of the problem. But IMHO it's a *huge* job.

Surely, given time, the technical issues can be worked out. In the meantime, the hardware has evolved, and many of the points that are now important have changed - and new issues have come into play that we can't anticipate yet. At some point, we will hopefully converge, but we might not, and it is a huge job. UDI had similarly lofty goals. It was started in 1998. Where is it today?

But this isn't the problem. The problem is that nobody wants a single ABI. Just like no hardware vendors want a fixed ABI for their hardware. They need to innovate independently, and time to market and features are more important than being binary compatible with a bunch of competing vendors. They want to differentiate, and break away from an ABI, and as history repeats, again and again, this happens eventually with every ABI.

So once the ivory tower is built, and you let all the kids in to play, they are going to have a party and you are going to start noticing chips and eventually cracks, and eventually the tower will go into disrepair and fall because somebody else has built a new and better one further down the road. Why go through that exercise if nobody sees any tangible benefit from it today?

Paravirt-ops avoids this because it is an API, because it is flexible, because it can change with the kernel, and because it doesn't lock you into a legacy way of doing things. It allows you to fork and adapt and push legacy and future compatibility issues into the vendor backend modules, like VMI, where they belong.

Zach
Re: [SLUB 2/3] Large kmalloc pass through. Removal of large general slabs
On Tue, Mar 06, 2007 at 06:35:16PM -0800, Christoph Lameter wrote:
> Unlimited kmalloc size and removal of general caches >=4.
>
> We can directly use the page allocator for all allocations 4K and larger. This
> means that no general slabs are necessary and the size of the allocation passed
> to kmalloc() can be arbitrarily large. Remove the useless general caches over
> 4k.

I've been meaning to do this in SLOB as well. Perhaps it warrants doing in stock kmalloc? I've got a grand total of 18 of these objects here.

The downside is this makes them suddenly disappear off the slabinfo radar.

--
Mathematics is the supreme nostalgia of our time.
[SLUB 3/3] Guarantee minimum number of objects in a slab
Guarantee a minimum number of objects per slab

The number of objects per slab is important for SLUB because it determines the number of allocations that can be performed without having to consult per node slab lists. Add another boot option "slub_min_objects=xx" that allows the configuration of the objects per slab. This is similar to SLAB's queue configurations. Set the default of objects to 4. This will increase the page order for certain slab objects.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc2-mm1/mm/slub.c
===
--- linux-2.6.21-rc2-mm1.orig/mm/slub.c	2007-03-06 17:57:11.0 -0800
+++ linux-2.6.21-rc2-mm1/mm/slub.c	2007-03-06 17:57:15.0 -0800
@@ -1201,6 +1201,12 @@ static __always_inline struct page *get_
 static int slub_min_order = 0;
 
 /*
+ * Minimum number of objects per slab. This is necessary in order to
+ * reduce locking overhead. Similar to the queue size in SLAB.
+ */
+static int slub_min_objects = 4;
+
+/*
  * Merge control. If this is set then no merging of slab caches will occur.
  */
 static int slub_nomerge = 0;
@@ -1232,7 +1238,7 @@ static int calculate_order(int size)
 			order < MAX_ORDER; order++) {
 		unsigned long slab_size = PAGE_SIZE << order;
 
-		if (slab_size < size)
+		if (slab_size < slub_min_objects * size)
 			continue;
 
 		rem = slab_size % size;
@@ -1624,6 +1630,15 @@ static int __init setup_slub_min_order(c
 
 __setup("slub_min_order=", setup_slub_min_order);
 
+static int __init setup_slub_min_objects(char *str)
+{
+	get_option(&str, &slub_min_objects);
+
+	return 1;
+}
+
+__setup("slub_min_objects=", setup_slub_min_objects);
+
 static int __init setup_slub_nomerge(char *str)
 {
 	slub_nomerge = 1;
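The effect of the changed comparison on the chosen page order can be sketched in user space. The sketch models only the minimum-objects condition from the second hunk; the remainder and waste handling that follows `rem = slab_size % size` in the real `calculate_order()` is not shown in the patch context and is deliberately left out. `SKETCH_PAGE_SIZE`, `SKETCH_MAX_ORDER` and `min_order_for` are assumed names and values for illustration.

```c
/* Assumed values for the sketch; the kernel's depend on architecture
 * and configuration. */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_MAX_ORDER	11

/*
 * Smallest page order whose slab holds at least min_objects objects of
 * the given size, or -1 if no order below SKETCH_MAX_ORDER is large
 * enough.
 */
static int min_order_for(unsigned long size, unsigned long min_objects)
{
	int order;

	for (order = 0; order < SKETCH_MAX_ORDER; order++) {
		unsigned long slab_size = SKETCH_PAGE_SIZE << order;

		if (slab_size < min_objects * size)
			continue;

		return order;
	}

	return -1;
}
```

For example, with 4K pages a 2048-byte object needed only order 0 under the old check (`min_objects` of 1 in this model) but needs order 1 with the default of 4, which is the "increase the page order for certain slab objects" mentioned in the description.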