Re: [discuss] [PATCH] allow CONFIG_FRAME_POINTER for x86-64
On Fri, Sep 09, 2005 at 12:58:12PM +0200 Andi Kleen wrote:
> On Friday 09 September 2005 12:45, Hugh Dickins wrote:
> > On Fri, 9 Sep 2005, Jan Beulich wrote:
> > > > But why would anyone want frame pointers on x86-64?
> > >
> > > I'd put the question differently: Why should x86-64 not allow what
> > > other architectures do?
> > >
> > > But of course, I'm not insisting on this patch to get in, it just
> > > seemed an obvious inconsistency...
> >
> > I'm with Jan on this. I use a similar patch for frame pointers on
> > x86_64 most of the time, in the hope of getting more accurate backtraces.
>
> It won't give more accurate backtraces, not even on i386 because show_stack
> doesn't have any code to follow frame pointers.
>

Huh? print_context_stack, which is called from show_stack, follows frame
pointers.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Strange LVM2/DM data corruption with 2.6.11.12
On Thu, Sep 08, 2005 at 11:58:54AM +0200 Ludovic Drolez wrote:
> Hi !
>
> We are developing (GPLed) disk cloning software similar to partimage: it's
> an intelligent 'dd' which backups only used sectors.
>
> Recently I added LVM1/2 support to it, and sometimes we saw LVM
> restorations failing randomly (Disk images are not corrupted, but the
> result of the restoration can be lead to a corrupted filesystem). If a
> restoration fails, just try another one and it will work...
>

Please upgrade to 2.6.12.6 (I don't remember exactly which 2.6.12.x it
went into); it contains a bugfix that should fix what you are seeing.
2.6.13 also has it.
Some debugging patches on top of -mm
These are debugging patches on top of -mm that make it possible for those
arches that want to, to save caller traces of who allocates pages and slab
objects.

Any arch that wants to use this could make a next_stack_func function that
goes through the stack starting at *prev_addr and finds the next function
return address. 'count' is for when we can use the frame pointer
(CONFIG_FRAME_POINTER) to get accurate backtraces.

For x86 it goes like:

unsigned long *next_stack_func(unsigned long *prev_addr, int count)
{
	struct thread_info *tinfo = current_thread_info();

	if (!prev_addr)
		return NULL;

#ifdef CONFIG_FRAME_POINTER
	/* In this case 'prev_addr' is a pointer to the last return
	 * function found on the stack */
	if (count == 0) {
		unsigned long ebp;
		unsigned long *func_ptr;

		asm ("movl %%ebp, %0" : "=r" (ebp) : );
		/* We don't want the obvious caller to show up */
		ebp = *(unsigned long *) ebp;
		func_ptr = (unsigned long *)(ebp + 4);
		if (valid_stack_ptr(tinfo, func_ptr))
			return func_ptr;
	} else {
		unsigned long *func_ptr;
		unsigned long ebp = (unsigned long) prev_addr;

		ebp -= 4;

		ebp = *(unsigned long *) ebp;
		func_ptr = (unsigned long *) ((unsigned long)ebp + 4);
		if (valid_stack_ptr(tinfo, func_ptr))
			return func_ptr;
	}
#else
	while (prev_addr++) {
		if (!valid_stack_ptr(tinfo, prev_addr))
			break;
		if (__kernel_text_address(*prev_addr))
			return prev_addr;
	}
#endif
	return NULL;
}

1) A "generic" next_stack_func() for arches that want to have these
   debugging facilities

2) Saving more slab object call traces via DBG_DEBUGWORDS. Now uses
   next_stack_func(). This still prints to the console, oh well...
   (I have not made SLAB_DEBUG conditional on x86 so it won't compile
   on non-x86 arches with these patches currently...)

3) Simplification of the page-owner-leak-detector to use next_stack_func()
   so that any arch that wants it can use it.
Index: mm/arch/i386/kernel/traps.c
===
--- mm.orig/arch/i386/kernel/traps.c	2005-09-03 11:22:39.0 +0200
+++ mm/arch/i386/kernel/traps.c	2005-09-03 18:17:00.0 +0200
@@ -148,6 +148,48 @@
 		p < (void *)tinfo + THREAD_SIZE - 3;
 }
 
+unsigned long *next_stack_func(unsigned long *prev_addr, int count)
+{
+	struct thread_info *tinfo = current_thread_info();
+
+	if (!prev_addr)
+		return NULL;
+
+#ifdef CONFIG_FRAME_POINTER
+	/* In this case 'prev_addr' is a pointer to the last return
+	 * function found on the stack */
+	if (count == 0) {
+		unsigned long ebp;
+		unsigned long *func_ptr;
+
+		asm ("movl %%ebp, %0" : "=r" (ebp) : );
+		/* We don't want the obvious caller to show up */
+		ebp = *(unsigned long *) ebp;
+		func_ptr = (unsigned long *)(ebp + 4);
+		if (valid_stack_ptr(tinfo, func_ptr))
+			return func_ptr;
+	} else {
+		unsigned long *func_ptr;
+		unsigned long ebp = (unsigned long) prev_addr;
+
+		ebp -= 4;
+
+		ebp = *(unsigned long *) ebp;
+		func_ptr = (unsigned long *) ((unsigned long)ebp + 4);
+		if (valid_stack_ptr(tinfo, func_ptr))
+			return func_ptr;
+	}
+#else
+	while (prev_addr++) {
+		if (!valid_stack_ptr(tinfo, prev_addr))
+			break;
+		if (__kernel_text_address(*prev_addr))
+			return prev_addr;
+	}
+#endif
+	return NULL;
+}
+
 static inline unsigned long print_context_stack(struct thread_info *tinfo,
 				unsigned long *stack, unsigned long ebp)
 {
Index: mm/include/linux/sched.h
===
--- mm.orig/include/linux/sched.h	2005-09-03 11:22:51.0 +0200
+++ mm/include/linux/sched.h	2005-09-03 15:52:20.0 +0200
@@ -171,6 +171,7 @@
  * trace (or NULL if the entire call-chain of the task should be shown).
  */
 extern void show_stack(struct task_struct *task, unsigned long *sp);
+extern unsigned long *next_stack_func(unsigned long *prev_addr, int count);
 
 void io_schedule(void);
 long io_schedule_timeout(long timeout);
Index: mm/arch/x86_64/kernel/traps.c
===
--- mm.orig/arch/x86_64/kernel/traps.c	2005-09-03 17:59:16.0 +0200
+++ mm/arch/x86_64/kernel/traps.c	2005-09-03 19:00:48.0 +0200
@@ -154,6 +154,54 @@
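
For anyone who wants to play with the frame-pointer idea outside the kernel, here is a minimal user-space sketch of it. It is not the patch's code and never reads %ebp: struct frame, valid_frame() and walk_frames() are hypothetical names made up for illustration. Each saved frame is modelled as a (saved frame pointer, return address) pair, and every hop is bounds-checked before it is dereferenced, roughly the role valid_stack_ptr() plays above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one saved frame: the caller's frame pointer,
 * with the return address stored right next to it. */
struct frame {
	struct frame *prev;	/* caller's frame, or NULL at the outermost frame */
	unsigned long ret_addr;	/* return address into the caller */
};

/* Rough analogue of valid_stack_ptr(): does *p lie fully inside the
 * n-element region starting at base? */
static int valid_frame(const struct frame *base, size_t n, const struct frame *p)
{
	uintptr_t lo = (uintptr_t)base;
	uintptr_t hi = (uintptr_t)(base + n);
	uintptr_t v = (uintptr_t)p;

	return v >= lo && v + sizeof(*p) <= hi;
}

/* Follow the prev links from start and store up to max return
 * addresses in out. Returns how many were stored. The walk stops at a
 * NULL link, at a pointer outside the region, or when out is full. */
static size_t walk_frames(const struct frame *base, size_t n,
			  const struct frame *start,
			  unsigned long *out, size_t max)
{
	size_t count = 0;
	const struct frame *f = start;

	while (count < max && f && valid_frame(base, n, f)) {
		out[count++] = f->ret_addr;
		f = f->prev;
	}
	return count;
}
```

Called on a small array of linked frames, walk_frames() returns the return addresses innermost first. The max bound also keeps a corrupted, cyclic chain from looping forever.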
Re: 2.6.13-mm1
On Thu, Sep 01, 2005 at 03:55:42AM -0700 Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/
>

I got:
<7>Dead loop on netdevice eth0, fix it urgently!

When using netconsole and printing out some information from kernel to
console. The box uses:
[EMAIL PROTECTED]/eth0,[EMAIL PROTECTED]/

:00:0f.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)

Relevant config:
CONFIG_NET_TULIP=y
# CONFIG_DE2104X is not set
CONFIG_TULIP=y
CONFIG_TULIP_MWI=y
# CONFIG_TULIP_MMIO is not set
CONFIG_TULIP_NAPI=y

Matt, on another box I got some irq off hangs that went away when
removing netconsole from the .config on a box with 3c59x. Is this known?
The problem is getting backtraces when netconsole is active, but the
last thing I see before the box goes is that some carrier is up...
Re: gcc coredump with 2.6.12+ kernels
On Sat, Sep 03, 2005 at 10:25:37AM -0700 Johnny Stenback wrote:
> Hey all,
>
> I just attempted to upgrade my kernel to 2.6.13. The kernel appears to
> boot and run just fine, but when I try to build any larger projects like
> Mozilla or the Linux kernel I constantly get segfaults from gcc. All
> other apps *seem* to work fine. I remember seeing this with 2.6.12 too
> when I tried to upgrade to it too but I didn't have the time to
> investigate at all then, but now I see the same problem with 2.6.13. The
> last version I've used that didn't show this problem is 2.6.11.3, and
> that's running with no problems here.
>
> When gcc segfaults I get the following messages in the messages log:
>
> cc1[16775]: segfault at rip 0036f2b0119e rsp 7faaf0a0 error 4
> cc1[17086]: segfault at rip 0036f2b0119e rsp 7fc4dfc0 error 4
> cc1[17788]: segfault at rip 0036f2b0119e rsp 7fd777e0 error 4
> cc1[17823]: segfault at rip 0036f2b0119e rsp 7fc4d630 error 4
> cc1[17895]: segfault at rip 0036f2b0119e rsp 7ffd2330 error 4
>
> I'm on a dual AMD Opteron system, running x86_64 code. Using Fedora Core
> 2 (yeah, old, I know...) and gcc 3.3.3 20040412.

Does it still happen if you run:
echo 0 > /proc/sys/kernel/randomize_va_space
Re: 2.6.13-mm1
On Thu, Sep 01, 2005 at 03:55:42AM -0700 Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/
>

i386-boottime-for_each_cpu-broken.patch
i386-boottime-for_each_cpu-broken-fix.patch

The SMP version of __alloc_percpu checks the cpu_possible_map
before allocating memory for a certain cpu. With the above patches
the BSP cpuid is never set in cpu_possible_map, which breaks CONFIG_SMP
on uniprocessor machines (as soon as someone tries to dereference
something allocated via __alloc_percpu, which in fact is never allocated
since the cpu is not set in cpu_possible_map).

The below fixes this. I'm not entirely sure about the voyager part:
should the cpu_possible_map really be CPU_MASK_ALL to begin with there,
Zwane?

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: mm/arch/i386/kernel/smpboot.c
===
--- mm.orig/arch/i386/kernel/smpboot.c	2005-09-02 15:28:20.0 +0200
+++ mm/arch/i386/kernel/smpboot.c	2005-09-02 16:16:46.0 +0200
@@ -1265,6 +1265,7 @@
 	cpu_set(smp_processor_id(), cpu_online_map);
 	cpu_set(smp_processor_id(), cpu_callout_map);
 	cpu_set(smp_processor_id(), cpu_present_map);
+	cpu_set(smp_processor_id(), cpu_possible_map);
 	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
 }
 
Index: mm/arch/i386/mach-voyager/voyager_smp.c
===
--- mm.orig/arch/i386/mach-voyager/voyager_smp.c	2005-09-02 15:28:20.0 +0200
+++ mm/arch/i386/mach-voyager/voyager_smp.c	2005-09-02 16:17:29.0 +0200
@@ -1910,6 +1910,7 @@
 {
 	cpu_set(smp_processor_id(), cpu_online_map);
 	cpu_set(smp_processor_id(), cpu_callout_map);
+	cpu_set(smp_processor_id(), cpu_possible_map);
 }
 
 int __devinit
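
The failure mode described above can be reproduced in a few lines of user-space C. This is only a sketch under stated assumptions, not the kernel's __alloc_percpu: cpumask_sketch_t, struct percpu_sketch, alloc_percpu_sketch() and the four-CPU limit are all hypothetical stand-ins. The point is that a CPU missing from the "possible" mask gets no backing memory, so any later dereference of its slot hits NULL.

```c
#include <stdlib.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU limit for the sketch */

/* Hypothetical stand-in for cpumask_t: one bit per CPU. */
typedef unsigned long cpumask_sketch_t;

/* One pointer per CPU, loosely like the per-cpu pointer table. */
struct percpu_sketch {
	void *ptrs[NR_CPUS_SKETCH];
};

static int cpu_isset_sketch(int cpu, cpumask_sketch_t mask)
{
	return (int)((mask >> cpu) & 1UL);
}

static void free_percpu_sketch(struct percpu_sketch *p)
{
	int cpu;

	if (!p)
		return;
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		free(p->ptrs[cpu]);	/* free(NULL) is a no-op */
	free(p);
}

/* Allocate a zeroed buffer for every CPU set in 'possible'.
 * Slots for CPUs not in the mask are left NULL. */
static struct percpu_sketch *alloc_percpu_sketch(size_t size,
						 cpumask_sketch_t possible)
{
	struct percpu_sketch *p = calloc(1, sizeof(*p));
	int cpu;

	if (!p)
		return NULL;
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!cpu_isset_sketch(cpu, possible))
			continue;
		p->ptrs[cpu] = calloc(1, size);
		if (!p->ptrs[cpu]) {
			free_percpu_sketch(p);
			return NULL;
		}
	}
	return p;
}
```

With an empty mask, slot 0 stays NULL, which models the boot CPU never being marked possible. With bit 0 set, as the one-line fix arranges, the slot is backed by memory.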
Re: 2.6.13-rcX really this bad ?
On Sun, Aug 14, 2005 at 10:10:18AM + Danny ter Haar wrote:
> I've posted a couple of times than my newsserver is not stable
> with any 2.6.13-rcX kernels.
> Last kernel that survived is 2.6.12-mm1 (18+days)
> Of course i can just stick with that kernel, but i thought it would
> be wise to live on the edge and run a reasonable loaded server with
> the latest/greatest. This ends in disaster though...
>
> Since i got no feedback on my previous posts, i either bring it
> the wrong way, or people don't care and i ought to shut up.
> I think however that just before releasing a new stable kernel these
> kind of feedback could be healthy to ironout some bugs.
>

Is the machine running X? We need some output from it so we can debug
what's going on, the info should be printed to the console. It would be
great if you could run the latest kernel and see if you get any output.
Also add nmi_watchdog=2 to the boot command line.

You can also set up a serial console or netconsole to capture the output
from the server with the help of another machine, described in
Documentation/serial-console.txt
Documentation/networking/netconsole.txt
Re: [SLAB] __builtin_return_address use without FRAME_POINTER causes boot failure
On Mon, Aug 08, 2005 at 11:37:18PM +0200 Manfred Spraul wrote:
> Christoph Lameter wrote:
>
> >I kept getting boot failures in the slab allocator. The failure goes
> >away if one is setting CONFIG_FRAME_POINTER. Seems that
> >CONFIG_DEBUG_SLAB implies the use of __buildin_return_address() which
> >needs the framepointer.
> >
> >
> >
> Very odd. __builtin_return_address(1) needs frame pointers, but slab
> only uses __builtin_return_addresse(0), which should always work.
>

My fault, I introduced a debugging patch (i think i cc'ed you on it)
which used __builtin_return_address([12]) to save traces of who the caller
of an object is.
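
For reference, the safe level-0 case looks roughly like the sketch below. struct obj_debug and record_caller() are hypothetical names, not the slab code. The GCC documentation warns that calling __builtin_return_address() with a nonzero argument can have unpredictable effects, including crashing the calling program, which matches the boot failure discussed here. Level 0 only needs the current function's own return address.

```c
#include <stddef.h>

/* Hypothetical debug record: who asked for this object? */
struct obj_debug {
	void *caller;	/* return address into the immediate caller */
};

/* noinline so that "the caller" is a real call site rather than
 * whatever function this got inlined into. */
__attribute__((noinline))
static void record_caller(struct obj_debug *dbg)
{
	/* Level 0 is the only level that does not depend on walking
	 * frame pointers up the stack. */
	dbg->caller = __builtin_return_address(0);
}
```

After record_caller(&dbg), dbg.caller holds an address inside the calling function, which a symbolizer can map back to a name.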
Re: [PATCH] CHECK_IRQ_PER_CPU() to avoid dead code in __do_IRQ()
>
> IRQ_PER_CPU is not used by all architectures.
> This patch introduces the macros
> ARCH_HAS_IRQ_PER_CPU and CHECK_IRQ_PER_CPU() to avoid the generation of
> dead code in __do_IRQ().
>
> ARCH_HAS_IRQ_PER_CPU is defined by architectures using
> IRQ_PER_CPU in their
> include/asm_ARCH/irq.h
> file.
>
> Through grepping the tree I found the following
> architectures currently use IRQ_PER_CPU:
>
> cris, ia64, ppc, ppc64 and parisc.
>

There are many places where one could replace run-time tests with
#ifdef's but it makes reading more difficult (and in the longer term,
maintenance). Have you benchmarked any workload that benefits from this?

>
> diff -upr linux-2.6.13-rc6/include/asm-cris/irq.h linux-2.6.13/include/asm-cris/irq.h
> --- linux-2.6.13-rc6/include/asm-cris/irq.h	2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/asm-cris/irq.h	2005-08-08 11:41:12.0 +0200
> @@ -1,6 +1,11 @@
>  #ifndef _ASM_IRQ_H
>  #define _ASM_IRQ_H
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #include <asm/arch/irq.h>
>
>  extern __inline__ int irq_canonicalize(int irq)
> diff -upr linux-2.6.13-rc6/include/asm-ia64/irq.h linux-2.6.13/include/asm-ia64/irq.h
> --- linux-2.6.13-rc6/include/asm-ia64/irq.h	2005-03-02 08:38:33.0 +0100
> +++ linux-2.6.13/include/asm-ia64/irq.h	2005-08-06 18:06:53.0 +0200
> @@ -14,6 +14,11 @@
>  #define NR_IRQS		256
>  #define NR_IRQ_VECTORS	NR_IRQS
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  static __inline__ int
>  irq_canonicalize (int irq)
>  {
> diff -upr linux-2.6.13-rc6/include/asm-parisc/irq.h linux-2.6.13/include/asm-parisc/irq.h
> --- linux-2.6.13-rc6/include/asm-parisc/irq.h	2005-08-08 11:45:26.0 +0200
> +++ linux-2.6.13/include/asm-parisc/irq.h	2005-08-06 18:05:22.0 +0200
> @@ -26,6 +26,11 @@
>
>  #define NR_IRQS		(CPU_IRQ_MAX + 1)
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  static __inline__ int irq_canonicalize(int irq)
>  {
>  	return (irq == 2) ? 9 : irq;
> diff -upr linux-2.6.13-rc6/include/asm-ppc/irq.h linux-2.6.13/include/asm-ppc/irq.h
> --- linux-2.6.13-rc6/include/asm-ppc/irq.h	2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/asm-ppc/irq.h	2005-08-08 11:41:14.0 +0200
> @@ -19,6 +19,11 @@
>  #define IRQ_POLARITY_POSITIVE	0x2	/* high level or low->high edge */
>  #define IRQ_POLARITY_NEGATIVE	0x0	/* low level or high->low edge */
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #if defined(CONFIG_40x)
>  #include <asm/ibm4xx.h>
>
> diff -upr linux-2.6.13-rc6/include/asm-ppc64/irq.h linux-2.6.13/include/asm-ppc64/irq.h
> --- linux-2.6.13-rc6/include/asm-ppc64/irq.h	2005-03-02 08:38:33.0 +0100
> +++ linux-2.6.13/include/asm-ppc64/irq.h	2005-08-06 18:06:58.0 +0200
> @@ -33,6 +33,11 @@
>  #define IRQ_POLARITY_POSITIVE	0x2	/* high level or low->high edge */
>  #define IRQ_POLARITY_NEGATIVE	0x0	/* low level or high->low edge */
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #define get_irq_desc(irq) (&irq_desc[(irq)])
>
>  /* Define a way to iterate across irqs. */
> diff -upr linux-2.6.13-rc6/include/linux/irq.h linux-2.6.13/include/linux/irq.h
> --- linux-2.6.13-rc6/include/linux/irq.h	2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/linux/irq.h	2005-08-08 11:55:11.0 +0200
> @@ -32,7 +32,12 @@
>  #define IRQ_WAITING	32	/* IRQ not yet seen - for autodetection */
>  #define IRQ_LEVEL	64	/* IRQ level triggered */
>  #define IRQ_MASKED	128	/* IRQ masked - shouldn't be seen again */
> -#define IRQ_PER_CPU	256	/* IRQ is per CPU */
> +#if defined(ARCH_HAS_IRQ_PER_CPU)
> +# define IRQ_PER_CPU	256	/* IRQ is per CPU */
> +# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
> +#else
> +# define CHECK_IRQ_PER_CPU(var) 0
> +#endif
>
>  /*
>   * Interrupt controller descriptor. This is all we need
> diff -upr linux-2.6.13-rc6/kernel/irq/handle.c linux-2.6.13/kernel/irq/handle.c
> --- linux-2.6.13-rc6/kernel/irq/handle.c	2005-08-08 11:46:11.0 +0200
> +++ linux-2.6.13/kernel/irq/handle.c	2005-08-08 11:53:00.0 +0200
> @@ -111,7 +111,7 @@ fastcall unsigned int __do_IRQ(unsigned
>  	unsigned int status;
>
>  	kstat_this_cpu.irqs[irq]++;
> -	if (desc->status & IRQ_PER_CPU) {
> +	if (CHECK_IRQ_PER_CPU(desc->status)) {
>  		irqreturn_t action_ret;
>
>  		/*
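
To make the trade-off in the quoted patch concrete, here is a stand-alone sketch of the macro pair. The two macros are copied from the patch, but irq_path() is a hypothetical function invented for the example and is not kernel code. With ARCH_HAS_IRQ_PER_CPU defined the test is an ordinary run-time mask check. Without it the condition is the constant 0, so the compiler can drop the branch entirely.

```c
/* Remove this define to model an arch that never uses IRQ_PER_CPU. */
#define ARCH_HAS_IRQ_PER_CPU

#if defined(ARCH_HAS_IRQ_PER_CPU)
# define IRQ_PER_CPU	256	/* IRQ is per CPU */
# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
#else
# define CHECK_IRQ_PER_CPU(var) 0
#endif

/* Hypothetical stand-in for the branch in __do_IRQ():
 * returns 1 for the per-CPU path, 0 for the generic path. */
static int irq_path(unsigned int status)
{
	if (CHECK_IRQ_PER_CPU(status))
		return 1;	/* per-CPU handling */
	return 0;		/* generic handling */
}
```

Whether removing one predictable branch is measurable is the open question asked above. The sketch only shows what the macro does, not that it pays off.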
Re: 2.6.13-rc5-mm1: oops when starting nscd on AMD64
> > > I don't think it was supposed to do that.
> > >
> > > Quite possibly it's something to do with the new debugging code - could you
> > > please take a copy of the offending config, send it over and then try
> > > removing debug options, see if the crash goes away? CONFIG_DEBUG_PREEMPT
> > > would be the first to try..
> >
> > The (offending) .config is attached and here's what happens without
> > CONFIG_DEBUG_PREEMPT
> > (the other debug options being unchanged):
>
> Yes, my emt64 machine keels over with your .config too. Maybe it's due to
> CONFIG_SMP=n, not sure.
>
> Bisection searching shows that the bug was introduced by
> slab-leak-detector-give-longer-traces.patch.
>

I was afraid it was when I first saw it but I couldn't reproduce (and
still can't).

> Call Trace:801a17bb{sys_epoll_create+568}
>        8018b1f7{vfs_readdir+167}
>        80231000{add_preempt_count+93}
>        8010e8fa{system_call+126}
>

For some reason your compilers inline heavier than mine do, which makes this:

kmem_cache_alloc
sys_epoll_create	(__builtin_return_address(0))
system_call		(__builtin_return_address(1))
			(__builtin_return_address(2))

and off the stack we go...

I guess it was naive to even try to use this for more than the first
caller, sorry. Please throw that thing away and I'll do some backtracing
similar to CONFIG_PAGE_OWNER
Re: [PATCH] CHECK_IRQ_PER_CPU() to avoid dead code in __do_IRQ()
IRQ_PER_CPU is not used by all architectures.
This patch introduces the macros ARCH_HAS_IRQ_PER_CPU and
CHECK_IRQ_PER_CPU() to avoid the generation of dead code in __do_IRQ().

ARCH_HAS_IRQ_PER_CPU is defined by architectures using IRQ_PER_CPU in their
include/asm_ARCH/irq.h file.

Through grepping the tree I found the following architectures currently use
IRQ_PER_CPU: cris, ia64, ppc, ppc64 and parisc.

There are many places where one could replace run-time tests with
#ifdef's but it makes reading more difficult (and in the longer term
maintenance). Have you benchmarked any workload that benefits from this?

diff -upr linux-2.6.13-rc6/include/asm-cris/irq.h linux-2.6.13/include/asm-cris/irq.h
--- linux-2.6.13-rc6/include/asm-cris/irq.h	2005-08-08 11:46:10.0 +0200
+++ linux-2.6.13/include/asm-cris/irq.h	2005-08-08 11:41:12.0 +0200
@@ -1,6 +1,11 @@
 #ifndef _ASM_IRQ_H
 #define _ASM_IRQ_H
 
+/*
+ * IRQ line status macro IRQ_PER_CPU is used
+ */
+#define ARCH_HAS_IRQ_PER_CPU
+
 #include <asm/arch/irq.h>
 
 extern __inline__ int irq_canonicalize(int irq)
diff -upr linux-2.6.13-rc6/include/asm-ia64/irq.h linux-2.6.13/include/asm-ia64/irq.h
--- linux-2.6.13-rc6/include/asm-ia64/irq.h	2005-03-02 08:38:33.0 +0100
+++ linux-2.6.13/include/asm-ia64/irq.h	2005-08-06 18:06:53.0 +0200
@@ -14,6 +14,11 @@
 #define NR_IRQS		256
 #define NR_IRQ_VECTORS	NR_IRQS
 
+/*
+ * IRQ line status macro IRQ_PER_CPU is used
+ */
+#define ARCH_HAS_IRQ_PER_CPU
+
 static __inline__ int
 irq_canonicalize (int irq)
 {
diff -upr linux-2.6.13-rc6/include/asm-parisc/irq.h linux-2.6.13/include/asm-parisc/irq.h
--- linux-2.6.13-rc6/include/asm-parisc/irq.h	2005-08-08 11:45:26.0 +0200
+++ linux-2.6.13/include/asm-parisc/irq.h	2005-08-06 18:05:22.0 +0200
@@ -26,6 +26,11 @@
 
 #define NR_IRQS		(CPU_IRQ_MAX + 1)
 
+/*
+ * IRQ line status macro IRQ_PER_CPU is used
+ */
+#define ARCH_HAS_IRQ_PER_CPU
+
 static __inline__ int irq_canonicalize(int irq)
 {
 	return (irq == 2) ? 9 : irq;
diff -upr linux-2.6.13-rc6/include/asm-ppc/irq.h linux-2.6.13/include/asm-ppc/irq.h
--- linux-2.6.13-rc6/include/asm-ppc/irq.h	2005-08-08 11:46:10.0 +0200
+++ linux-2.6.13/include/asm-ppc/irq.h	2005-08-08 11:41:14.0 +0200
@@ -19,6 +19,11 @@
 #define IRQ_POLARITY_POSITIVE	0x2	/* high level or low->high edge */
 #define IRQ_POLARITY_NEGATIVE	0x0	/* low level or high->low edge */
 
+/*
+ * IRQ line status macro IRQ_PER_CPU is used
+ */
+#define ARCH_HAS_IRQ_PER_CPU
+
 #if defined(CONFIG_40x)
 #include <asm/ibm4xx.h>
 
diff -upr linux-2.6.13-rc6/include/asm-ppc64/irq.h linux-2.6.13/include/asm-ppc64/irq.h
--- linux-2.6.13-rc6/include/asm-ppc64/irq.h	2005-03-02 08:38:33.0 +0100
+++ linux-2.6.13/include/asm-ppc64/irq.h	2005-08-06 18:06:58.0 +0200
@@ -33,6 +33,11 @@
 #define IRQ_POLARITY_POSITIVE	0x2	/* high level or low->high edge */
 #define IRQ_POLARITY_NEGATIVE	0x0	/* low level or high->low edge */
 
+/*
+ * IRQ line status macro IRQ_PER_CPU is used
+ */
+#define ARCH_HAS_IRQ_PER_CPU
+
 #define get_irq_desc(irq) (&irq_desc[(irq)])
 
 /* Define a way to iterate across irqs. */
diff -upr linux-2.6.13-rc6/include/linux/irq.h linux-2.6.13/include/linux/irq.h
--- linux-2.6.13-rc6/include/linux/irq.h	2005-08-08 11:46:10.0 +0200
+++ linux-2.6.13/include/linux/irq.h	2005-08-08 11:55:11.0 +0200
@@ -32,7 +32,12 @@
 #define IRQ_WAITING	32	/* IRQ not yet seen - for autodetection */
 #define IRQ_LEVEL	64	/* IRQ level triggered */
 #define IRQ_MASKED	128	/* IRQ masked - shouldn't be seen again */
-#define IRQ_PER_CPU	256	/* IRQ is per CPU */
+#if defined(ARCH_HAS_IRQ_PER_CPU)
+# define IRQ_PER_CPU	256	/* IRQ is per CPU */
+# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
+#else
+# define CHECK_IRQ_PER_CPU(var) 0
+#endif
 
 /*
  * Interrupt controller descriptor. This is all we need
diff -upr linux-2.6.13-rc6/kernel/irq/handle.c linux-2.6.13/kernel/irq/handle.c
--- linux-2.6.13-rc6/kernel/irq/handle.c	2005-08-08 11:46:11.0 +0200
+++ linux-2.6.13/kernel/irq/handle.c	2005-08-08 11:53:00.0 +0200
@@ -111,7 +111,7 @@ fastcall unsigned int __do_IRQ(unsigned
 	unsigned int status;
 
 	kstat_this_cpu.irqs[irq]++;
-	if (desc->status & IRQ_PER_CPU) {
+	if (CHECK_IRQ_PER_CPU(desc->status)) {
 		irqreturn_t action_ret;
 
 		/*
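For readers skimming the thread, here is a rough user-space sketch of what the proposed macro buys. It only models the idea: classify_irq() is a made-up stand-in for the test at the top of __do_IRQ(), and ARCH_HAS_IRQ_PER_CPU is deliberately left undefined to play the part of an architecture that never sets the flag. In that configuration the condition is the constant 0, so a compiler is free to drop the whole branch.

```c
/* Assumption: ARCH_HAS_IRQ_PER_CPU is left undefined here, modelling an
 * architecture that never uses the per-CPU flag. Define it before this
 * point to get the run-time bit test instead. */
#if defined(ARCH_HAS_IRQ_PER_CPU)
# define IRQ_PER_CPU 256
# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
#else
# define CHECK_IRQ_PER_CPU(var) 0
#endif

/* Hypothetical stand-in for the branch in __do_IRQ(): returns 1 when the
 * per-CPU path would be taken and 0 for the normal path. */
static int classify_irq(unsigned int status)
{
	if (CHECK_IRQ_PER_CPU(status))
		return 1;	/* dead code when the macro expands to 0 */
	return 0;
}
```

Whether removing one predictable branch is measurable is exactly the benchmarking question raised above; the sketch shows the mechanism and claims no speedup.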
Re: [SLAB] __builtin_return_address use without FRAME_POINTER causes boot failure
On Mon, Aug 08, 2005 at 11:37:18PM +0200 Manfred Spraul wrote:

> Christoph Lameter wrote:
>
> > I kept getting boot failures in the slab allocator. The failure goes
> > away if one is setting CONFIG_FRAME_POINTER. Seems that CONFIG_DEBUG_SLAB
> > implies the use of __builtin_return_address() which needs the framepointer.
>
> Very odd. __builtin_return_address(1) needs frame pointers, but slab only
> uses __builtin_return_address(0), which should always work.

My fault, I introduced a debugging patch (I think I cc'ed you on it) which
used __builtin_return_address([12]) to save traces of who the caller of an
object is.
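A small user-space sketch of the distinction discussed here, assuming GCC or a compatible compiler. The function names are invented for illustration and have nothing to do with the slab code. Level 0 asks only for the return address of the current function; nonzero levels have to walk up the call stack, and the GCC manual warns that calling the builtin with a nonzero argument can have unpredictable effects, including crashing the program.

```c
#include <stddef.h>

/* Hypothetical helper: records who called it, roughly the way an
 * allocator debug hook might. noinline keeps the frame it reports
 * about from being merged into its caller. */
__attribute__((noinline)) static void *record_caller(void)
{
	/* Level 0: return address of this very function. Levels 1 and up
	 * are intentionally not used in this sketch, since they depend on
	 * being able to unwind through the callers' frames. */
	return __builtin_return_address(0);
}

/* Hypothetical call site standing in for an allocation path. */
__attribute__((noinline)) static void *alloc_site(void)
{
	return record_caller();
}
```

Note that inlining changes what "the caller" is, which matches the observation elsewhere in these threads that a compiler that inlines more aggressively shifts every level up by one.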
Re: oops with 2.6.13-rc5 on webserver with raid
On Fri, Aug 05, 2005 at 11:52:15AM +0200 Martin Braun wrote:

> Hi,
>
> I've been trying to upgrade kernel to 2.6.13-rc5. The server boots
> normally w/o errors, but after while (from 5 minutes up to 2 hours) the
> Kernel hangs (no keyboard input possible). As I am a newbie I cannot
> figure out who will be concerned with this error.

Please don't run ksymoops on 2.6 kernels, it makes the output look weird
and isn't necessary anymore.

>
> >>EIP; c0324afd <tcp_tso_should_defer+fd/110> <=
>

Should be fixed in 2.6.13-rc6, if problem persists please report back.
Re: sluggish/very slow usb mouse on hp nx6110 notebook => acpi problem
On Fri, Aug 05, 2005 at 08:56:51PM +0200 JG wrote:

> hm, i currently have "acpi=off noacpi noapic reboot=b" as kernel
> parameter.
>
> if i remove the acpi stuff and enable acpi, the usb mouse works fine..
> but after some time (5-10min) the kacpid process goes havoc and eats
> all cpu and the whole system is unresponsive- that's the reason i added
> those acpi=off parameters the first time when installing gentoo..
>
> i tested with gentoo-2.6.12-r7 and vanilla-2.6.13rc5
>

Indicates a bug in kacpid or similar. Could you make sure you compile in
"Magic SysRq key" under "Kernel Hacking" and boot the vanilla-2.6.13-rc6
(some recent acpi changes have gone in) and then wait for kacpid to go nuts
and do Alt+Sysrq+t 4 times and then run 'dmesg -s 10 > logfile' and send
logfile over here so that we can see what kacpid is up to.

If the box becomes so unresponsive you can't extract the log information
it would be good if you could use either network console
Documentation/networking/netconsole.txt
or serial console at Documentation/serial-console.txt, both require an extra
computer though...
Re: Oops in 2.6.13-rc5-git-current (0d317fb72fe3cf0f611608cf3a3015bbe6cd2a66)
> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> c0188d15
> *pde =
> Oops: [#1]
> PREEMPT
> CPU:    0
> EIP:    0060:[inotify_inode_queue_event+85/336]    Not tainted VLI
> EFLAGS: 00010206   (2.6.13-rc5-g0d317fb7)
> EIP is at inotify_inode_queue_event+0x55/0x150
> eax: 6b6b6b6b   ebx: 6b6b6b63   ecx:   edx: 0066
> esi: c3effe34   edi: ce8c76ac   ebp: d4bb864c   esp: d8655eb0
> ds: 007b   es: 007b   ss: 0068
> Process nfsd (pid: 3750, threadinfo=d8654000 task=d6155020)
> Stack: 0286 0286 0400 d4bb8760 d4bb8768 c3effe34
>        ce8c76ac d4bb864c c0170626 c3effe34 d6608ad4 db74b17c c3effe34
>        e0cfe9a4 0013 e0d01b34 c0dd91b4 ce8c76ac c000 d66092dc d66093c4
> Call Trace:
>  [vfs_unlink+358/560] vfs_unlink+0x166/0x230
>  [pg0+544348580/1067586560] nfsd_unlink+0x104/0x230 [nfsd]
>  [pg0+544361268/1067586560] nfsd_cache_lookup+0x1c4/0x3c0 [nfsd]
>  [pg0+544371728/1067586560] nfsd3_proc_remove+0x80/0xc0 [nfsd]
>  [pg0+544381018/1067586560] nfs3svc_decode_diropargs+0x8a/0x100 [nfsd]
>  [pg0+544380880/1067586560] nfs3svc_decode_diropargs+0x0/0x100 [nfsd]
>  [pg0+544321698/1067586560] nfsd_dispatch+0x82/0x1f0 [nfsd]
>  [svc_authenticate+112/336] svc_authenticate+0x70/0x150
>  [svc_process+960/1648] svc_process+0x3c0/0x670
>  [pg0+544323105/1067586560] nfsd+0x1a1/0x350 [nfsd]
>  [ret_from_fork+6/20] ret_from_fork+0x6/0x14
>  [pg0+544322688/1067586560] nfsd+0x0/0x350 [nfsd]
>  [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10

(akpm: a fix for this needs to go into 2.6.13, inotify + nfs
trivially oopses otherwise, even if inotify isn't actively used)

It looks like the following two calls are made in the wrong order.
When vfs_unlink() is called from sys_unlink(), sys_unlink() has taken
a ref on the inode and does the last iput() itself. When called from
other callsites, vfs_unlink() might do the last iput() and free the
inode, so inotify_inode_queue_event() will receive and dereference an
already freed object.

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: mm/fs/namei.c
===================================================================
--- mm.orig/fs/namei.c	2005-08-07 12:06:16.0 +0200
+++ mm/fs/namei.c	2005-08-07 18:17:20.0 +0200
@@ -1869,8 +1869,8 @@
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
 	if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
 		struct inode *inode = dentry->d_inode;
-		d_delete(dentry);
 		fsnotify_unlink(dentry, inode, dir);
+		d_delete(dentry);
 	}
 
 	return error;
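The ordering problem can be shown with a tiny user-space model. This is only a sketch: struct obj, notify() and put() are invented stand-ins for the inode, fsnotify_unlink() and the reference drop that d_delete() may perform, and the refcounting is not thread-safe. The point is just that the notification has to run while the caller still owns a reference.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an inode with one reference holder. */
struct obj {
	int refcount;
	int watchers;
};

static int events_seen;

/* Stand-in for the notification hook: it dereferences the object, so the
 * object must still be alive when this runs. */
static void notify(struct obj *o)
{
	events_seen += o->watchers;
}

/* Stand-in for dropping a reference; the last put frees the object. */
static void put(struct obj *o)
{
	if (--o->refcount == 0)
		free(o);
}

/* Fixed ordering: notify first, then drop what may be the last reference.
 * Swapping the two calls would hand freed memory to notify(). */
static int unlink_fixed(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return -1;
	o->refcount = 1;
	o->watchers = 1;

	notify(o);
	put(o);
	return events_seen;
}
```

The 6b6b6b6b values in the oops are what slab poisoning leaves in freed memory, which is why the swapped order showed up so clearly with debugging enabled.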
Re: Oops in 2.6.13-rc5-git-current (0d317fb72fe3cf0f611608cf3a3015bbe6cd2a66)
On Sat, Aug 06, 2005 at 11:56:30PM -0400 Ryan Anderson wrote:

>
> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> c0188d15
> *pde =
> Oops: [#1]
> PREEMPT
> Modules linked in: ppp_deflate bsd_comp ppp_async ppp_generic slhc radeon
> esp6 ah6 wp512 tgr192 tea khazad michael_mic cast6 cast5 arc4 anubis nfsd
> exportfs lp binfmt_misc ipv6 tsdev evdev analog parport_pc parport 8250_pnp
> 8250 serial_core via_agp serpent aes_i586 crypto_null snd_via82xx gameport
> snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc
> snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore uhci_hcd via_ircc
> irda dm_mod r8169 raid5 xor tulip via drm agpgart cpuid smbfs usbkbd usbcore
> trm290 triflex sc1200 ns87415 it821x cy82c693 cs5530 cs5520 atiixp raid1
> md_mod
> CPU:    0
> EIP:    0060:[inotify_inode_queue_event+85/336]    Not tainted VLI
> EFLAGS: 00010206   (2.6.13-rc5-g0d317fb7)
> EIP is at inotify_inode_queue_event+0x55/0x150
> eax: 6b6b6b6b   ebx: 6b6b6b63   ecx:   edx: 0066
> esi: c3effe34   edi: ce8c76ac   ebp: d4bb864c   esp: d8655eb0
> ds: 007b   es: 007b   ss: 0068
> Process nfsd (pid: 3750, threadinfo=d8654000 task=d6155020)
> Stack: 0286 0286 0400 d4bb8760 d4bb8768 c3effe34
>        ce8c76ac d4bb864c c0170626 c3effe34 d6608ad4 db74b17c c3effe34
>        e0cfe9a4 0013 e0d01b34 c0dd91b4 ce8c76ac c000 d66092dc d66093c4
> Call Trace:
>  [vfs_unlink+358/560] vfs_unlink+0x166/0x230
>  [pg0+544348580/1067586560] nfsd_unlink+0x104/0x230 [nfsd]
>  [pg0+544361268/1067586560] nfsd_cache_lookup+0x1c4/0x3c0 [nfsd]
>  [pg0+544371728/1067586560] nfsd3_proc_remove+0x80/0xc0 [nfsd]
>  [pg0+544381018/1067586560] nfs3svc_decode_diropargs+0x8a/0x100 [nfsd]
>  [pg0+544380880/1067586560] nfs3svc_decode_diropargs+0x0/0x100 [nfsd]
>  [pg0+544321698/1067586560] nfsd_dispatch+0x82/0x1f0 [nfsd]
>  [svc_authenticate+112/336] svc_authenticate+0x70/0x150
>  [svc_process+960/1648] svc_process+0x3c0/0x670
>  [pg0+544323105/1067586560] nfsd+0x1a1/0x350 [nfsd]
>  [ret_from_fork+6/20] ret_from_fork+0x6/0x14
>  [pg0+544322688/1067586560] nfsd+0x0/0x350 [nfsd]
>  [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10

(the long-aged vfs veteran steps into the picture...)

It looks like the following sequence is done in the wrong order.
When vfs_unlink() is called from sys_unlink() it has taken a ref on
the inode and sys_unlink() does the last iput() but when called from
other callsites vfs_unlink() might do the last iput()

Can you reproduce with this patch? It should happen with some nfs
activity, I'll try to set up a scenario myself.

Index: mm/fs/namei.c
===================================================================
--- mm.orig/fs/namei.c	2005-08-07 12:06:16.0 +0200
+++ mm/fs/namei.c	2005-08-07 18:17:20.0 +0200
@@ -1869,8 +1869,8 @@
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
 	if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
 		struct inode *inode = dentry->d_inode;
-		d_delete(dentry);
 		fsnotify_unlink(dentry, inode, dir);
+		d_delete(dentry);
 	}
 
 	return error;
Re: [BUG] module ns558
On Fri, Aug 05, 2005 at 08:52:41PM +0200 Michael Stenzel wrote:

> Hello dear Kernel People,
>
> I have a problem with my gameport, it uses the ns558 driver, the module gets
> loaded via hotplug/udev at boot, but the gameport gets deactivated somehow.
> I have this Problem for a long time now, and my solution always was rmmod the
> module and load it again after that the gameport is working.
> But now i have 2.6.13-rc5 with debug stuff turned on and noticed that:
>

Please take this up with the input guys, I'm guessing it shouldn't happen
in the first place, but regarding this bug look at the bottom.

> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> e0afc4ab
> *pde =
> Oops: [#1]
> PREEMPT
> Modules linked in: snd_seq_midi snd_seq_midi_event snd_seq video_buf_dvb
> video_buf w83627hf w83781d i2c_sensor i2c_isa snd_pcm_oss snd_mixer_oss
> ipt_MASQUERADE ipt_state iptable_mangle iptable_nat iptable_filter
> ip_conntrack_ftp ip_conntrack_irc ip_conntrack ip_tables rtc joydev analog
> ns558 budget s5h1420 l64781 ves1820 budget_core saa7146 ttpci_eeprom stv0299
> tda8083 ves1x93 dvb_core 8139too snd_via82xx gameport snd_mpu401_uart
> snd_rawmidi snd_seq_device via_rhine crc32 ide_scsi
> CPU:    0
> EIP:    0060:[<e0afc4ab>]    Not tainted VLI
> EFLAGS: 00010282   (2.6.13-rc5-debug)
> EIP is at ns558_exit+0x4b/0x79 [ns558]
> eax: 6b6b6b57   ebx: 6b6b6b57   ecx:   edx: 6b6b6b6b
> esi:   edi: 0002   ebp: d7cfdf60   esp: d7cfdf5c
> ds: 007b   es: 007b   ss: 0068
> Process rmmod (pid: 3267, threadinfo=d7cfc000 task=dfc94080)
> Stack: e0afd140 d7cfdfb4 c0146b4d 3535736e d7cf0038 c0169941 b7f43000
>        b7f42000 d7cfdfa4 c0169de5 b7f42000 b7f43000 df6a6f44 df6a61fc df17d3a4
>        df17d3d4 00cfdfb4 c0169e6a bf856ae0 b7f2917c d7cfc000 c0103889
> Call Trace:
>  [<c010483a>] show_stack+0x7a/0x90
>  [<c01049c6>] show_registers+0x156/0x1c0
>  [<c0104c1c>] die+0x14c/0x2c0
>  [<c0118093>] do_page_fault+0x343/0x655
>  [<c010430f>] error_code+0x4f/0x54
>  [<c0146b4d>] sys_delete_module+0x14d/0x190
>  [<c0103889>] syscall_call+0x7/0xb
> Code: 8b 43 10 e8 98 65 de ff 8b 4b 08 b8 a0 2f 46 c0 89 ca f7 da 23 53 04 e8
> 64 c7 62 df 89 d8 e8 5d 01 66 df 8b 53 14 8d 42 ec 89 c3 <8b> 40 14 0f 18 00
> 90 81 fa 20 cf af e0 75 c6 8b 1d c0 d2 af e0
>

Please try this:

Index: linux-2.6/drivers/input/gameport/ns558.c
===================================================================
--- linux-2.6.orig/drivers/input/gameport/ns558.c	2005-07-31 18:10:26.0 +0200
+++ linux-2.6/drivers/input/gameport/ns558.c	2005-08-05 21:20:59.0 +0200
@@ -275,9 +275,9 @@
 
 static void __exit ns558_exit(void)
 {
-	struct ns558 *ns558;
+	struct ns558 *ns558, *safe;
 
-	list_for_each_entry(ns558, &ns558_list, node) {
+	list_for_each_entry_safe(ns558, safe, &ns558_list, node) {
 		gameport_unregister_port(ns558->gameport);
 		release_region(ns558->io & ~(ns558->size - 1), ns558->size);
 		kfree(ns558);
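Why the _safe variant matters, as a user-space sketch. This is not the kernel's list.h: struct node, push() and free_all() are invented for illustration on a plain singly linked list. The plain iterator reads the next pointer out of the entry after the loop body has run, which is too late once the body has freed that entry; the safe form fetches the successor first.

```c
#include <stdlib.h>

/* Hypothetical list node, standing in for a driver's per-port structure. */
struct node {
	struct node *next;
	int id;
};

/* Prepend a node; returns the new head, or the old head if malloc fails. */
static struct node *push(struct node *head, int id)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->id = id;
	n->next = head;
	return n;
}

/* Free every node and return how many were freed.
 *
 * The broken shape would be:
 *	for (n = head; n; n = n->next) free(n);
 * which reads n->next from memory that was just freed. */
static int free_all(struct node *head)
{
	struct node *n, *safe;
	int freed = 0;

	for (n = head; n; n = safe) {
		safe = n->next;	/* fetch the successor before n goes away */
		free(n);
		freed++;
	}
	return freed;
}
```

With slab poisoning enabled the stale read returns 6b6b6b6b, which is the pattern visible in the edx register of the oops above.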
Re: x86_64 access of some bad address
On Thu, Aug 04, 2005 at 01:15:12PM -0700 Andrew Morton wrote:

> Alexander Nyberg <[EMAIL PROTECTED]> wrote:
> >
> > As I only have one x86_64 which is my main workstation it's far too
> > tedious to do binary searching (this doesn't happen on x86).
> >
> > Happens with both latest -git and 2.6.12-mm1
> > The tools to reproduce this is at: http://serkiaden.mine.nu/kp2.tar
> >
> > Just do:
> > gdb lyze
> > run
> >
> > and it crashes here giving:
> >
> > --- [cut here ] - [please bite here ] -
> > Kernel BUG at "mm/memory.c":911
>
> So I think Hugh's patch this morning should fix this up. Please retest
> -rc6 when it's out?

Maybe I forgot to say so, but I've already tested it and it works fine.
Re: [patch 2.6.13-rc4] fix get_user_pages bug
> >
> >x86_64 had hardcoded the VM_ numbers so it broke down when the numbers
> >were changed.
> >
>
> Ugh, sorry I should have audited this but I really wasn't expecting
> it (famous last words). Hasn't been a good week for me.

Hardcoding is evil so it's good it gets cleaned up anyway.

> parisc, cris, m68k, frv, sh64, arm26 are also broken.
> Would you mind resending a patch that fixes them all?
>

Remove the hardcoding in return value checking of handle_mm_fault()

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

 arm26/mm/fault.c  |    6 +++---
 cris/mm/fault.c   |    6 +++---
 frv/mm/fault.c    |    6 +++---
 m68k/mm/fault.c   |    6 +++---
 parisc/mm/fault.c |    6 +++---
 sh64/mm/fault.c   |    6 +++---
 x86_64/mm/fault.c |    6 +++---
 7 files changed, 21 insertions(+), 21 deletions(-)

Index: linux-2.6/arch/x86_64/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/x86_64/mm/fault.c	2005-07-31 18:10:20.0 +0200
+++ linux-2.6/arch/x86_64/mm/fault.c	2005-08-04 16:04:59.0 +0200
@@ -439,13 +439,13 @@
 	 * the fault.
 	 */
 	switch (handle_mm_fault(mm, vma, address, write)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto do_sigbus;
 	default:
 		goto out_of_memory;
Index: linux-2.6/arch/cris/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/cris/mm/fault.c	2005-07-31 18:10:02.0 +0200
+++ linux-2.6/arch/cris/mm/fault.c	2005-08-04 16:40:56.0 +0200
@@ -284,13 +284,13 @@
 	 */
 
 	switch (handle_mm_fault(mm, vma, address, writeaccess & 1)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto do_sigbus;
 	default:
 		goto out_of_memory;
Index: linux-2.6/arch/m68k/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/m68k/mm/fault.c	2005-07-31 18:10:05.0 +0200
+++ linux-2.6/arch/m68k/mm/fault.c	2005-08-04 16:42:05.0 +0200
@@ -160,13 +160,13 @@
 	printk("handle_mm_fault returns %d\n",fault);
 #endif
 	switch (fault) {
-	case 1:
+	case VM_FAULT_MINOR:
 		current->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		current->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto bus_err;
 	default:
 		goto out_of_memory;
Index: linux-2.6/arch/parisc/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/parisc/mm/fault.c	2005-07-31 18:10:11.0 +0200
+++ linux-2.6/arch/parisc/mm/fault.c	2005-08-04 16:41:18.0 +0200
@@ -178,13 +178,13 @@
 	 */
 
 	switch (handle_mm_fault(mm, vma, address, (acc_type & VM_WRITE) != 0)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		++current->min_flt;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		++current->maj_flt;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		/*
 		 * We ran out of memory, or some other thing happened
 		 * to us that made us unable to handle the page fault
Index: linux-2.6/arch/arm26/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/arm26/mm/fault.c	2005-07-31 18:10:00.0 +0200
+++ linux-2.6/arch/arm26/mm/fault.c	2005-08-04 16:46:18.0 +0200
@@ -176,12 +176,12 @@
 	 * Handle the "normal" cases first - successful and sigbus
 	 */
 	switch (fault) {
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		return fault;
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		return fault;
 	}
 
Index: linux-2.6/arch/frv/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/frv/mm/fault.c	2005-07-31 18:10:03.0 +0200
+++ linux-2.6/arch/frv/mm/fault.c	2005-08-04 16:44:02.0 +0200
@@ -163,13 +163,13 @@
 	 * the fault.
 	 */
 	switch (handle_mm_fault(mm, vma, ear0, write)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		current->mi
Re: [patch 2.6.13-rc4] fix get_user_pages bug
On Wed, Aug 03, 2005 at 09:12:37AM -0700 Linus Torvalds wrote:

>
>
> On Wed, 3 Aug 2005, Nick Piggin wrote:
> >
> > Oh, it gets rid of the -1 for VM_FAULT_OOM. Doesn't seem like there
> > is a good reason for it, but might that break out of tree drivers?
>
> Ok, I applied this because it was reasonably pretty and I liked the
> approach. It seems buggy, though, since it was using "switch ()" to test
> the bits (wrongly, afaik), and I'm going to apply the appended on top of
> it. Holler quickly if you disagree..
>

x86_64 had hardcoded the VM_ numbers so it broke down when the numbers
were changed.

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: linux-2.6/arch/x86_64/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/x86_64/mm/fault.c	2005-07-31 18:10:20.0 +0200
+++ linux-2.6/arch/x86_64/mm/fault.c	2005-08-04 16:04:59.0 +0200
@@ -439,13 +439,13 @@
 	 * the fault.
 	 */
 	switch (handle_mm_fault(mm, vma, address, write)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto do_sigbus;
 	default:
 		goto out_of_memory;
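To make the failure mode concrete, a self-contained sketch of the same switch written against named results. Everything here is illustrative: the FAULT_* names, their numeric values and struct task_stats are made up for the example and are not the kernel's definitions. A caller that matches on the names keeps working when the numbers behind them are renumbered, while one that matches on literal 0, 1 and 2 silently starts taking the wrong branch.

```c
/* Illustrative result codes; the values are assumptions for this sketch
 * and could be renumbered without touching account_fault(). */
enum fault_result {
	FAULT_OOM = 0,
	FAULT_SIGBUS = 1,
	FAULT_MINOR = 2,
	FAULT_MAJOR = 3
};

/* Hypothetical per-task fault counters. */
struct task_stats {
	unsigned long min_flt;
	unsigned long maj_flt;
};

/* Mirrors the shape of the arch fault handlers: count minor and major
 * faults, report SIGBUS as -1 and anything else as out of memory (-2). */
static int account_fault(enum fault_result r, struct task_stats *t)
{
	switch (r) {
	case FAULT_MINOR:
		t->min_flt++;
		return 0;
	case FAULT_MAJOR:
		t->maj_flt++;
		return 0;
	case FAULT_SIGBUS:
		return -1;
	default:
		return -2;
	}
}
```

With the values chosen above, a handler still hardcoding "case 1" for a minor fault would treat every SIGBUS result as a minor fault, which is the kind of breakage the patch removes.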
Re: x86_64 access of some bad address
On Thu, Aug 04, 2005 at 01:15:12PM -0700 Andrew Morton wrote:
> Alexander Nyberg <[EMAIL PROTECTED]> wrote:
> >
> > As I only have one x86_64 which is my main workstation it's far too
> > tedious to do binary searching (this doesn't happen on x86).
> > Happens with both latest -git and 2.6.12-mm1
> >
> > The tools to reproduce this is at:
> > http://serkiaden.mine.nu/kp2.tar
> > Just do:
> > gdb lyze
> > run
> > and it crashes here giving:
> >
> > --- [cut here ] - [please bite here ] -
> > Kernel BUG at mm/memory.c:911
>
> So I think Hugh's patch this morning should fix this up. Please retest -rc6
> when it's out?

Maybe I forgot to tell but I've already tested and it works fine.
Re: Simple question re: oops
On Sat, Jul 30, 2005 at 07:48:11PM -0400 Lee Revell wrote:
> I have a machine here that oopses reliably when I start X, but the
> interesting stuff scrolls away too fast, and a bunch more Oopses get
> printed ending with "Aieee, killing interrupt handler".
>
> How do I get the output to stop after the first Oops?
>

set /proc/sys/kernel/panic_on_oops to 1

What version of the kernel is that? It shouldn't do recursive oopses
(of the same task) any more.
Re: Making it easier to find which change introduced a bug
>
> > We need a super-easy way for people to do bisection searching.
>
> First step would be to make interdiffs available as quilt patchsets.
>
> If we had this for e.g. 2.6.13-rc3 -> rc4 it would make tracking down
> those new bugs much easier.
>
> (Yes I know git does bisection but Andrew said it should be easy.)
> __

Yeah I agree, it would be extremely useful and simplify for people who
don't have git installed.

Linus, do you think we could have something like
patch-2.6.13-rc4-incremental-broken-out.tar.bz2
that could like Andrew's be placed into patches/ in a tree?

So for example, have a tree with 2.6.13-rc3, download
patch-2.6.13-rc4-incremental-broken-out.tar.bz2, place it in patches/
and be able to do quilt push / quilt pop easily.

As it stands today it's easier for us who don't know git to just find
out in which mainline kernel it works and which -mm it doesn't work in,
get the broken-out and start push/pop. And I know I'm not the only one
who has noticed this.

Thanks
Alexander
Re: [patch 2/6] mm: micro-optimise rmap
[Nick, your mail bounced while sending this privately so reply-all this time]

> Index: linux-2.6/mm/rmap.c
> ===
> --- linux-2.6.orig/mm/rmap.c
> +++ linux-2.6/mm/rmap.c
> @@ -442,22 +442,23 @@ int page_referenced(struct page *page, i
>  void page_add_anon_rmap(struct page *page,
>  	struct vm_area_struct *vma, unsigned long address)
>  {
> -	struct anon_vma *anon_vma = vma->anon_vma;
> -	pgoff_t index;
> -
>  	BUG_ON(PageReserved(page));
> -	BUG_ON(!anon_vma);
>
>  	inc_mm_counter(vma->vm_mm, anon_rss);
>
> -	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
> -	index = (address - vma->vm_start) >> PAGE_SHIFT;
> -	index += vma->vm_pgoff;
> -	index >>= PAGE_CACHE_SHIFT - PAGE_SHIFT;
> -
>  	if (atomic_inc_and_test(&page->_mapcount)) {
> -		page->index = index;
> +		struct anon_vma *anon_vma = vma->anon_vma;
> +		pgoff_t index;
> +
> +		BUG_ON(!anon_vma);
> +		anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
>  		page->mapping = (struct address_space *) anon_vma;
> +
> +		index = (address - vma->vm_start) >> PAGE_SHIFT;
> +		index += vma->vm_pgoff;
> +		index >>= PAGE_CACHE_SHIFT - PAGE_SHIFT;
> +		page->index = index;
> +

linear_page_index() here too?
Re: files_lock deadlock?
On Tue, 2005-07-19 at 18:45 +0200, Martin Wilck wrote:
> Hello,
>
> I apologize in advance if this is a dummy question. My web search turned
> up nothing, so I'm trying it here.
>
> We came across the following error message:
>
> Kernelpanic - not syncing: fs/proc/
> Generic.c:521: spin_lock(fs/file_table.c:80420280)
> Already locked by fs/file_table.c/204
>
> This shows a locking problem with the files_lock on a UP kernel with
> spinlock debugging enabled.
>
> I noticed that files_lock is only protected with spin_lock()
> (file_list_lock(), include/linux/fs.h). Is it possible that this should
> be changed to spin_lock_irq() or spin_lock_irqsave()? Or am I missing
> something obvious?

spin_lock_irqsave is only needed when a lock is taken both in normal
context and in interrupt context. Clearly this lock is not intended to
be taken in interrupt context.

I'll take a look, that spinlock debugging information unfortunately
doesn't give too much info :|
Re: Kernel Bug Report
> > 	It looks like it panics during a mem_cpy but I know it's
> > difficult to tell just by the output.
> >
> > 	I get a code: f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90 66
> >
> > 	The problem appears very reproducible so I can provide more
> > information upon request.
>
> What does the rest of the panic say? There should be text above this
> that tells where the panic occurred and why. Can you please send that
> here?

Ok, could you please try this patch, I'll attach it as well:

From: Andreas Steinmetz <[EMAIL PROTECTED]>

from include/linux/kernel.h:

#define ALIGN(x,a) (((x)+(a)-1)&~((a)-1))

from crypto/cipher.c:

unsigned int alignmask = ...
u8 *src = (u8 *)ALIGN((unsigned long)buffer, alignmask + 1);
...
unsigned int alignmask = ...
u8 *tmp = (u8 *)ALIGN((unsigned long)buffer, alignmask + 1);
...
unsigned int align;
addr = ALIGN(addr, align);
addr += ALIGN(tfm->__crt_alg->cra_ctxsize, align);

The compiler first does ~((a)-1)) and then expands the unsigned int to
unsigned long for the & operation. So we end up with only the lower 32
bits of the address. Who did smoke what to do this? Patch attached.
--
Andreas Steinmetz SPAMmers use [EMAIL PROTECTED]

--- linux.orig/crypto/cipher.c	2005-07-17 13:35:15.0 +0200
+++ linux/crypto/cipher.c	2005-07-17 14:04:00.0 +0200
@@ -41,7 +41,7 @@
 			       struct scatter_walk *in,
 			       struct scatter_walk *out, unsigned int bsize)
 {
-	unsigned int alignmask = crypto_tfm_alg_alignmask(desc->tfm);
+	unsigned long alignmask = crypto_tfm_alg_alignmask(desc->tfm);
 	u8 buffer[bsize * 2 + alignmask];
 	u8 *src = (u8 *)ALIGN((unsigned long)buffer, alignmask + 1);
 	u8 *dst = src + bsize;
@@ -160,7 +160,7 @@
 			    unsigned int nbytes)
 {
 	struct crypto_tfm *tfm = desc->tfm;
-	unsigned int alignmask = crypto_tfm_alg_alignmask(tfm);
+	unsigned long alignmask = crypto_tfm_alg_alignmask(tfm);
 	u8 *iv = desc->info;

 	if (unlikely(((unsigned long)iv & alignmask))) {
@@ -424,7 +424,7 @@
 	}

 	if (ops->cit_mode == CRYPTO_TFM_MODE_CBC) {
-		unsigned int align;
+		unsigned long align;
 		unsigned long addr;

 		switch (crypto_tfm_alg_blocksize(tfm)) {
Re: Kernel Bug Report
On Thu, 2005-07-14 at 10:10 -0700, Paul Vander Griend wrote:
> System:
> 	Motherboard = Tyan K8WE
> 	Processor = 2x Opteron 250
> 	Memory = 8GB ECC Registered
>
> 	On all of the recent release candidates except for
> 2.6.13-rc2-git2 the kernel panics while booting. These
> versions include 2.6.13-rc2-git* (* != 2 ) and 2.6.13-rc3.
>
> 	I also want to mention that I am using gcc 3.3.5 on debian and
> that during compilation there are 3 messages at the end that
> say an assertion has failed IE (LD: assertion failed).

Those are harmless

> 	It looks like it panics during a mem_cpy but I know it's
> difficult to tell just by the output.
>
> 	I get a code: f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90 66
>
> 	The problem appears very reproducible so I can provide more
> information upon request.

What does the rest of the panic say? There should be text above this
that tells where the panic occurred and why. Can you please send that
here?
Re: Patch for slab leak debugging
> >Yeah I knew there was one, but I thought that was a standalone patch
> >(the one turning all bufctl to unsigned long, turning off irqs and
> >printing all slabs_full to console), my intention with this was a
> >proper /proc entry, something that could be a simple config option.
> >
> No, I never wrote a proper /proc interface. But I think the bufctl
> approach is the better solution than storing the first 5 entries in the
> slab structure:
> What if there is a leak on a cache with more than 5 entries per slab?

As slab leaks usually go out of control I think it will be enough to
show what is leaking anyway, but you're right on the bufctl approach I
think. I may have misunderstood the bufctl thing a bit before doing
this.
Re: Patch for slab leak debugging
On Fri, 2005-07-08 at 16:55 -0700, Andrew Morton wrote:
> Alexander Nyberg <[EMAIL PROTECTED]> wrote:
> >
> > I think we really need an option in the kernel to help users in tracking
> > slab leaks so that they can be brought down easier.
>
> Well we already have slab-leak-detector.patch, which I appear to have been
> sitting on since 2.6.0-test8. It fell out of -mm after 2.6.12-rc5-mm2 due
> to various ravaging of slab.c, but could be brought back.
>
> pc/2.6.12-rc5-mm2-series:slab-leak-detector.patch
> pc/2.6.12-rc5-mm2-series:slab-leak-detector-warning-fixes.patch

Yeah I knew there was one, but I thought that was a standalone patch
(the one turning all bufctl to unsigned long, turning off irqs and
printing all slabs_full to console), my intention with this was a
proper /proc entry, something that could be a simple config option.

But if something like this already exists, would you please send me what
you have and I'll fix the numa changes
Re: 2.6.12.2 -- time passes faster; related to the acpi_register_gsi() call
On Fri, 2005-07-08 at 23:12 +0200, Rudo Thomas wrote:
> Hello, guys.
>
> Time started to pass faster with 2.6.12.2 (actually, it was 2.6.12-ck3
> which is based on it). I have isolated the cause of the problem:

I bet you this fixes it (already in mainline)

tree e6a38b3d6bf434f08054562113bb660c4227769f
parent 4a89a04f1ee21a7c1f4413f1ad7dcfac50ff9b63
author Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700
committer Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700

    If ACPI doesn't find an irq listed, don't accept 0 as a valid PCI irq.

    That zero just means that nothing else found any irq information either.

 drivers/acpi/pci_irq.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -433,7 +433,7 @@ acpi_pci_irq_enable (
 		printk(KERN_WARNING PREFIX "PCI Interrupt %s[%c]: no GSI",
 			pci_name(dev), ('A' + pin));
 		/* Interrupt Line values above 0xF are forbidden */
-		if (dev->irq >= 0 && (dev->irq <= 0xF)) {
+		if (dev->irq > 0 && (dev->irq <= 0xF)) {
 			printk(" - using IRQ %d\n", dev->irq);
 			acpi_register_gsi(dev->irq, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW);
 			return_VALUE(0);
Patch for slab leak debugging
I think we really need a kernel option that helps users track down slab
leaks more easily.

This patch tracks the caller of the first five objects to be created within
a slab. This is not much, but as slab leaks are normally quite obvious, with
the exception that we don't know who the caller is, I think this approach
will do fine.

No NUMA handling; it only looks at nodelists[0] at the moment.

list_ff() is distasteful, but I've yet to come up with a better approach
that does not screw up the slab core too much. (I've not seen too big
latencies even with 7M size-32 objects; with that size it took around
1 minute to cat /proc/slab_owner > meepmeep.txt on a 1.2 GHz Athlon. We
could even limit the size of the output as it'll be pretty repetitive
anyway.)

To use it, look at /proc/slabinfo to identify the cache that looks to have
leaking callers. Then

	echo cachename > /proc/slab_owner; cat /proc/slab_owner > unsorted_slab_owner

Although glancing at this file will likely reveal the leaking caller,
there's a user-space program called slab_owner.c in Documentation/ to help
sort the output in the same manner as page_owner.

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: akpm/lib/Kconfig.debug
===
--- akpm.orig/lib/Kconfig.debug	2005-07-08 22:49:18.0 +0200
+++ akpm/lib/Kconfig.debug	2005-07-08 22:49:27.0 +0200
@@ -85,6 +85,14 @@
 	  allocation as well as poisoning memory on free to catch use of freed
 	  memory. This can make kmalloc/kfree-intensive workloads much slower.
 
+config SLAB_OWNER
+	bool "Track owner of slab objects"
+	depends on DEBUG_KERNEL && DEBUG_SLAB
+	help
+	  Say Y here to make the kernel keep track of some of the functions
+	  allocating slab objects. Expensive, should only be used to track
+	  down slab leaks.
+
 config DEBUG_PREEMPT
 	bool "Debug preemptible kernel"
 	depends on DEBUG_KERNEL && PREEMPT
Index: akpm/mm/slab.c
===
--- akpm.orig/mm/slab.c	2005-07-08 22:49:18.0 +0200
+++ akpm/mm/slab.c	2005-07-08 22:49:27.0 +0200
@@ -222,6 +222,10 @@
 	unsigned int		inuse;		/* num of objs active in slab */
 	kmem_bufctl_t		free;
 	unsigned short		nodeid;
+#ifdef CONFIG_SLAB_OWNER
+	short			owner_idx;
+	unsigned long		owner[5];
+#endif
 };
 
 /*
@@ -2062,7 +2066,9 @@
 	slabp->inuse = 0;
 	slabp->colouroff = colour_off;
 	slabp->s_mem = objp+colour_off;
-
+#ifdef CONFIG_SLAB_OWNER
+	slabp->owner_idx = 0;
+#endif
 	return slabp;
 }
 
@@ -2502,6 +2508,13 @@
 
 		cachep->ctor(objp, cachep, ctor_flags);
 	}
+#ifdef CONFIG_SLAB_OWNER
+	{
+		struct slab *slabp = GET_PAGE_SLAB(virt_to_page(objp));
+		if (slabp->owner_idx < 5)
+			slabp->owner[slabp->owner_idx++] = (unsigned long) caller;
+	}
+#endif
 	return objp;
 }
 #else
@@ -3604,3 +3617,131 @@
 	return buf;
 }
 EXPORT_SYMBOL(kstrdup);
+
+#ifdef CONFIG_SLAB_OWNER
+/* The slab_owner mechanism doesn't aim to be accurate, merely to give
+ * a (big) hint as to what caller is allocating objects but not releasing
+ * them. In almost every case this will be quite obvious even with only
+ * 5 caller addresses per slab saved.
+ */
+static char slab_owner_name[32];
+static unsigned long saved_addr[40];
+
+/* list fast forward 'n' elements */
+static struct list_head *list_ff(struct list_head *start, int n)
+{
+	int i;
+	struct list_head *list = start->next;
+
+	for (i = 0; i < n; i++) {
+		list = list->next;
+		if (list == start)
+			return NULL;
+	}
+
+	return list;
+}
+
+static ssize_t
+read_slab_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
+{
+	char *modname;
+	int ret = 0, x, hit = 0;
+	char namebuf[KSYM_NAME_LEN];
+	unsigned long offset = 0, symsize;
+	kmem_cache_t *kcache;
+	struct list_head *start;
+	struct kmem_list3 *rl3;
+	char *page = NULL;
+
+	down(&cache_chain_sem);
+	list_for_each_entry(kcache, &cache_cache.next, next) {
+		if (!strcmp(kcache->name, slab_owner_name)) {
+			/* This way we'll just have to look at one element */
+			list_move(&kcache->next, &cache_cache.next);
+			hit = 1;
+			break;
+		}
+	}
+
+	if (!hit) {
+		ret = -ENOENT;
+		goto out_sem;
+	}
+
+	page = (char *) __get_free_page(GFP_KERNEL);
+	if (!page) {
+		ret = -ENOMEM;
+		goto out_sem;
+	}
+
+	rl3 = kcache->nodelists[0];
+	spin_lock_irq(&rl3->list_lock);
+	start
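The bookkeeping the patch adds to each slab is small enough to sketch on its own. The following is a user-space illustration, assuming nothing beyond what the hunks above show; struct slab_owner_info, owner_init() and owner_record() are hypothetical names, not part of the patch.

```c
#define OWNER_SLOTS 5	/* the patch keeps five caller addresses per slab */

/* Hypothetical stand-in for the two fields added to struct slab. */
struct slab_owner_info {
	short owner_idx;			/* next free slot */
	unsigned long owner[OWNER_SLOTS];	/* recorded caller addresses */
};

/* Mirrors "slabp->owner_idx = 0" done when a slab is set up. */
static void owner_init(struct slab_owner_info *s)
{
	s->owner_idx = 0;
}

/* Record the caller while a slot is free; later allocations are ignored. */
static void owner_record(struct slab_owner_info *s, unsigned long caller)
{
	if (s->owner_idx < OWNER_SLOTS)
		s->owner[s->owner_idx++] = caller;
}
```

Only the first five allocations from a slab leave a trace, which is why the patch description calls the result a hint rather than an exact accounting.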
Re: How to debug the kernel for X86_64 SMP?
On Tue, 2005-07-05 at 22:58 +0800, Neo Jia wrote:
> All,
>
> These days, I am trying to debug the kernel (2.6.9) on x86_64 SMP. But
> the Kprobes and UML cannot work probably for my case, due to the patch
> file for x86_64 arch.
>
> Is there anyone who is working on the same topic? Any hint and help
> would be appreciated!

You should explain more carefully what you are trying to debug.
Re: If ACPI doesn't find an irq listed, don't accept 0 as a valid PCI irq.
> tree e6a38b3d6bf434f08054562113bb660c4227769f
> parent 4a89a04f1ee21a7c1f4413f1ad7dcfac50ff9b63
> author Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700
> committer Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700
>
> If ACPI doesn't find an irq listed, don't accept 0 as a valid PCI irq.
>
> That zero just means that nothing else found any irq information either.
>
>  drivers/acpi/pci_irq.c |    2 +-
>  1 files changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -433,7 +433,7 @@ acpi_pci_irq_enable (
>  		printk(KERN_WARNING PREFIX "PCI Interrupt %s[%c]: no GSI",
>  			pci_name(dev), ('A' + pin));
>  		/* Interrupt Line values above 0xF are forbidden */
> -		if (dev->irq >= 0 && (dev->irq <= 0xF)) {
> +		if (dev->irq > 0 && (dev->irq <= 0xF)) {
>  			printk(" - using IRQ %d\n", dev->irq);
>  			acpi_register_gsi(dev->irq, ACPI_LEVEL_SENSITIVE,
>  ACPI_ACTIVE_LOW);
>  			return_VALUE(0);

Could this go into -stable, please? I have confirmation that it fixes:
http://bugme.osdl.org/show_bug.cgi?id=4824

That bug was introduced in -stable 2.6.12.2.
Re: x86_64: Bug in new out of line put_user()
Brian, thanks for seeing this. (me goes hiding...)

The labels after the last put_user patch were misplaced so exceptions on
the real mov instructions would not be handled.

Index: test/arch/x86_64/lib/putuser.S
===
--- test.orig/arch/x86_64/lib/putuser.S	2005-04-22 10:04:25.0 +0200
+++ test/arch/x86_64/lib/putuser.S	2005-04-22 10:06:29.0 +0200
@@ -49,8 +49,8 @@
 	jc 20f
 	cmpq threadinfo_addr_limit(%r8),%rcx
 	jae 20f
-2:	decq %rcx
-	movw %dx,(%rcx)
+	decq %rcx
+2:	movw %dx,(%rcx)
 	xorl %eax,%eax
 	ret
 20:	decq %rcx
@@ -64,8 +64,8 @@
 	jc 30f
 	cmpq threadinfo_addr_limit(%r8),%rcx
 	jae 30f
-3:	subq $3,%rcx
-	movl %edx,(%rcx)
+	subq $3,%rcx
+3:	movl %edx,(%rcx)
 	xorl %eax,%eax
 	ret
 30:	subq $3,%rcx
@@ -79,8 +79,8 @@
 	jc 40f
 	cmpq threadinfo_addr_limit(%r8),%rcx
 	jae 40f
-4:	subq $7,%rcx
-	movq %rdx,(%rcx)
+	subq $7,%rcx
+4:	movq %rdx,(%rcx)
 	xorl %eax,%eax
 	ret
 40:	subq $7,%rcx
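Why the label position matters can be illustrated with a toy lookup, assuming (as is usual for these routines, though the hunks above do not show it) that the numeric labels are what the exception table refers to. Everything below is a hypothetical user-space model: struct ex_entry and find_fixup() are made-up names and the addresses are arbitrary.

```c
#include <stddef.h>

/* Hypothetical model of an exception table entry: the address of the
 * instruction that may fault and the address to resume at if it does. */
struct ex_entry {
	unsigned long insn;
	unsigned long fixup;
};

/* Return the fixup for a faulting instruction, or 0 if none is registered. */
static unsigned long find_fixup(const struct ex_entry *tbl, size_t n,
				unsigned long fault_ip)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == fault_ip)
			return tbl[i].fixup;
	return 0;
}
```

If the label sits on the decq/subq, the table entry names that instruction, and a fault on the following mov finds no match. Moving the label onto the mov makes the lookup succeed.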
[PATCH] x86_64: i8259.c trivial iso99 structure initialization
Trivial iso99 structure initialization

Index: test/arch/x86_64/kernel/i8259.c
===
--- test.orig/arch/x86_64/kernel/i8259.c	2005-04-20 22:29:02.0 +0200
+++ test/arch/x86_64/kernel/i8259.c	2005-04-22 00:16:22.0 +0200
@@ -158,14 +158,13 @@
 }
 
 static struct hw_interrupt_type i8259A_irq_type = {
-	"XT-PIC",
-	startup_8259A_irq,
-	shutdown_8259A_irq,
-	enable_8259A_irq,
-	disable_8259A_irq,
-	mask_and_ack_8259A,
-	end_8259A_irq,
-	NULL
+	.typename = "XT-PIC",
+	.startup = startup_8259A_irq,
+	.shutdown = shutdown_8259A_irq,
+	.enable = enable_8259A_irq,
+	.disable = disable_8259A_irq,
+	.ack = mask_and_ack_8259A,
+	.end = end_8259A_irq,
 };
 
 /*
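As a side note on what the conversion buys, here is a small stand-alone comparison of positional and C99 designated initialization. The struct is a hypothetical, cut-down stand-in for struct hw_interrupt_type (field names and the demo_* functions are invented for the example).

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct hw_interrupt_type. */
struct irq_type {
	const char *name;
	void (*enable)(unsigned int irq);
	void (*disable)(unsigned int irq);
	void (*set_affinity)(unsigned int irq);	/* optional hook */
};

static void demo_enable(unsigned int irq)  { (void)irq; }
static void demo_disable(unsigned int irq) { (void)irq; }

/* Positional form: order must match the struct, trailing NULL spelled out. */
static const struct irq_type positional = {
	"XT-PIC", demo_enable, demo_disable, NULL
};

/* Designated form: members are named, unnamed ones are zero-initialised. */
static const struct irq_type designated = {
	.name    = "XT-PIC",
	.enable  = demo_enable,
	.disable = demo_disable,
};
```

Both objects end up with the same contents. The designated form keeps working if members are reordered or added, which is presumably why the explicit trailing NULL could be dropped in the patch.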
x86_64: Bug in new out of line put_user()
The new out of line put_user() assembly on x86_64 changes %rcx without
telling GCC about it causing things like:

http://bugme.osdl.org/show_bug.cgi?id=4515

See to it that %rcx is not changed (made it consistent with get_user()).

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: test/arch/x86_64/lib/getuser.S
===
--- test.orig/arch/x86_64/lib/getuser.S	2005-04-20 23:55:35.0 +0200
+++ test/arch/x86_64/lib/getuser.S	2005-04-21 00:54:16.0 +0200
@@ -78,9 +78,9 @@
 __get_user_8:
 	GET_THREAD_INFO(%r8)
 	addq $7,%rcx
-	jc bad_get_user
+	jc 40f
 	cmpq threadinfo_addr_limit(%r8),%rcx
-	jae bad_get_user
+	jae 40f
 	subq	$7,%rcx
 4:	movq (%rcx),%rdx
 	xorl %eax,%eax
Index: test/arch/x86_64/lib/putuser.S
===
--- test.orig/arch/x86_64/lib/putuser.S	2005-04-21 00:50:24.0 +0200
+++ test/arch/x86_64/lib/putuser.S	2005-04-21 01:02:15.0 +0200
@@ -46,36 +46,45 @@
 __put_user_2:
 	GET_THREAD_INFO(%r8)
 	addq $1,%rcx
-	jc bad_put_user
+	jc 20f
 	cmpq threadinfo_addr_limit(%r8),%rcx
-	jae bad_put_user
-2:	movw %dx,-1(%rcx)
+	jae 20f
+2:	decq %rcx
+	movw %dx,(%rcx)
 	xorl %eax,%eax
 	ret
+20:	decq %rcx
+	jmp bad_put_user
 
 	.p2align 4
 .globl __put_user_4
 __put_user_4:
 	GET_THREAD_INFO(%r8)
 	addq $3,%rcx
-	jc bad_put_user
+	jc 30f
 	cmpq threadinfo_addr_limit(%r8),%rcx
-	jae bad_put_user
-3:	movl %edx,-3(%rcx)
+	jae 30f
+3:	subq $3,%rcx
+	movl %edx,(%rcx)
 	xorl %eax,%eax
 	ret
+30:	subq $3,%rcx
+	jmp bad_put_user
 
 	.p2align 4
 .globl __put_user_8
 __put_user_8:
 	GET_THREAD_INFO(%r8)
 	addq $7,%rcx
-	jc bad_put_user
+	jc 40f
 	cmpq threadinfo_addr_limit(%r8),%rcx
-	jae bad_put_user
-4:	movq %rdx,-7(%rcx)
+	jae 40f
+4:	subq $7,%rcx
+	movq %rdx,(%rcx)
 	xorl %eax,%eax
 	ret
+40:	subq $7,%rcx
+	jmp bad_put_user
 
 bad_put_user:
 	movq $(-EFAULT),%rax
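The range check that each __put_user_N performs (add size-1, bail out on carry, compare against the address limit) can be written in C as below. This is an illustrative sketch only; user_range_ok() is a hypothetical helper, and because it works on a copy of the address it has no equivalent of the %rcx clobber that the patch repairs by undoing the add on every path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: may `size` bytes (size >= 1) starting at `addr` be
 * accessed, given that valid addresses lie strictly below `limit`? */
static bool user_range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	uint64_t last = addr + (size - 1);	/* like "addq $7,%rcx" for size 8 */

	if (last < addr)	/* wrapped around: the "jc" case */
		return false;
	if (last >= limit)	/* at or above the limit: the "jae" case */
		return false;
	return true;		/* the caller's addr was never modified */
}
```

The assembly does the same arithmetic in place, which is why the patch follows each successful or failed check with a matching decq/subq before %rcx is used or the routine jumps to bad_put_user.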
Re: 2.6.12-rc2-mm3 regression - certain applications get SIGSEGV but are fine with 2.6.12-rc2-mm2
On Tue, 2005-04-19 at 11:33 +0200, Jesper Juhl wrote:
> Everything is fine with 2.6.12-rc2, 2.6.12-rc2-mm1, 2.6.12-rc2-mm2 &
> earlier kernels as well, but 2.6.12-rc2-mm3 seems to have a problem.
> I don't know what's causing this, all I can do at the moment is describe
> the symptoms.
>
> Certain applications (krootimage and ksplash from KDE 3.4 are 100%
> reproducible test cases) that used to run fine have started crashing with
> SIGSEGV on 2.6.12-rc2-mm3. I see nothing suspicious in dmesg.
> I'm including dmesg output as well as strace output from krootimage and
> ksplash below.
> If someone could give me a hint as to what the cause of this could be or
> what to try in order to track it down I'd appreciate it.
> This is 100% reproducible.

Try backing out
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/broken-out/sched-unlocked-context-switches.patch
Re: 2.6.12-rc2-mm3
On Mon, 2005-04-18 at 13:14 +0200, Arjan van de Ven wrote:
> On Mon, 2005-04-18 at 13:05 +0200, Alexander Nyberg wrote:
> > [Proper patch now that goes all the way, sorry for spamming]
> >
> > Patch below uses RETIRED_UOPS for a more constant rate of NMI sending.
> > This makes x64 deliver NMI interrupts every fourth second at a constant
> > rate when going through the local apic. Makes both cpus on my box to get
> > NMIs at constant rate that it previously did not, there could be long
> > delays when a CPU was idle.
>
>
> isn't this dangerous in the light of the mobile cpus that either scale
> back or stop entirely in idle or lower load situations ?
>

I don't see any real problem: at each nmi_watchdog_tick() the next NMI is
calculated from cpu_khz, so the NMIs might not come at a constant rate
during frequency scaling, but over time there will still be one every
fourth second.

And if the CPU stops entirely, as you say, are any uops run at all? Even if
so, the watchdog as it currently stands would also have a few events
accounted to it and could fire an NMI as well.
Re: Need some help to debug a freeze on 2.6.11
> > > > Sounds like a job for Documentation/networking/netconsole.txt
> > > >
> > > or Documentation/serial-console.txt
> > >
> > Console on line printer would also be an option.
>
> I don't have any printer port cables, so I guess I prefer to try netconsole.
>
> I'm using wireless lan (Intel's ipw2100), would netconsole work on
> wlan interface?

Not sure, can't comment on it...

> As an alternative, can I configure netconsole for my ethernet port and
> only really connect it, after I get the freeze?

Yep, this will work well.
Re: 2.6.12-rc2-mm3
[Proper patch now that goes all the way, sorry for spamming]

Patch below uses RETIRED_UOPS for a more constant rate of NMI sending.
This makes x64 deliver NMI interrupts every fourth second at a constant
rate when going through the local apic. Both CPUs on my box now get NMIs
at a constant rate, which they previously did not; there could be long
delays when a CPU was idle.

This fixes the misdetection in check_nmi_watchdog(), which thought the NMI
sending was stuck although it was not, because the perfctr did not
generate enough events with the previous mask. The 10-second
check_nmi_watchdog() delay is down to 10 msec now.

Tested on Opteron SMP.

Index: x64_mm/arch/x86_64/kernel/nmi.c
===
--- x64_mm.orig/arch/x86_64/kernel/nmi.c	2005-04-18 12:56:05.0 +0200
+++ x64_mm/arch/x86_64/kernel/nmi.c	2005-04-18 14:47:14.0 +0200
@@ -59,16 +59,14 @@
 
 unsigned int nmi_watchdog = NMI_DEFAULT;
 static unsigned int nmi_hz = HZ;
+static int nmi_mult = 1;	/* nmi multiplier for longer intervals */
 unsigned int nmi_perfctr_msr;	/* the MSR to reset in NMI handler */
 
-/* Note that these events don't tick when the CPU idles. This means
-   the frequency varies with CPU load. */
-
 #define K7_EVNTSEL_ENABLE	(1 << 22)
 #define K7_EVNTSEL_INT		(1 << 20)
 #define K7_EVNTSEL_OS		(1 << 17)
 #define K7_EVNTSEL_USR		(1 << 16)
-#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
+#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0xC1 /* Retired uops */
 #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING
 
 #define P6_EVNTSEL0_ENABLE	(1 << 22)
@@ -78,6 +76,11 @@
 #define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
 #define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED
 
+static inline unsigned long nmi_interval(void)
+{
+	return ((unsigned long)cpu_khz * 1000 * nmi_mult) / nmi_hz;
+}
+
 /* Run after command line and cpu_init init, but before all other checks */
 void __init nmi_watchdog_default(void)
 {
@@ -146,8 +149,10 @@
 
 	/* now that we know it works we can reduce NMI frequency to
 	   something more reasonable; makes a difference in some configs */
-	if (nmi_watchdog == NMI_LOCAL_APIC)
+	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		nmi_hz = 1;
+		nmi_mult = 8;
+	}
 
 	return 0;
 }
@@ -305,9 +310,6 @@
 	int i;
 	unsigned int evntsel;
 
-	/* No check, so can start with slow frequency */
-	nmi_hz = 1;
-
 	/* XXX should check these in EFER */
 
 	nmi_perfctr_msr = MSR_K7_PERFCTR0;
@@ -325,7 +327,7 @@
 		| K7_NMI_EVENT;
 
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
-	wrmsrl(MSR_K7_PERFCTR0, -((u64)cpu_khz*1000) / nmi_hz);
+	wrmsrl(MSR_K7_PERFCTR0, -nmi_interval());
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= K7_EVNTSEL_ENABLE;
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
@@ -393,10 +395,10 @@
 	if (last_irq_sums[cpu] == sum) {
 		/*
 		 * Ayiee, looks like this CPU is stuck ...
-		 * wait a few IRQs (5 seconds) before doing the oops ...
+		 * wait a few NMIs before doing the oops ...
 		 */
 		alert_counter[cpu]++;
-		if (alert_counter[cpu] == 5*nmi_hz) {
+		if (alert_counter[cpu] == 3*nmi_hz) {
 			if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
 							== NOTIFY_STOP) {
 				alert_counter[cpu] = 0;
@@ -409,7 +411,7 @@
 		alert_counter[cpu] = 0;
 	}
 	if (nmi_perfctr_msr)
-		wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1);
+		wrmsr(nmi_perfctr_msr, -nmi_interval(), -1);
 }
 
 static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
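To make the arithmetic in nmi_interval() concrete, here is a user-space rendering of the new helper next to the old handler expression it replaces. The functions take the kernel variables as plain arguments, old_reload() is a made-up name for the removed expression, and the figures in the note below assume a hypothetical 2 GHz CPU (cpu_khz = 2000000).

```c
#include <stdint.h>

/* Number of counted events before the next NMI, as computed by
 * nmi_interval() in the patch (arguments stand in for the globals). */
static uint64_t nmi_interval(uint64_t cpu_khz, unsigned int nmi_mult,
			     unsigned int nmi_hz)
{
	return (cpu_khz * 1000 * nmi_mult) / nmi_hz;
}

/* The expression the NMI handler used before: cpu_khz/nmi_hz*1000. */
static uint64_t old_reload(uint64_t cpu_khz, unsigned int nmi_hz)
{
	return cpu_khz / nmi_hz * 1000;
}
```

With nmi_hz = 1 and nmi_mult = 8 the reload value for the assumed CPU is 16,000,000,000 events, against 2,000,000,000 from the old expression. How many seconds that corresponds to depends on how fast the selected event ticks, which the thread itself treats as an open question.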
Re: Need some help to debug a freeze on 2.6.11
> I'm running Linux on my laptop and it sometimes freezes (about once a
> week). The only thing which seems to work when it's stuck is SysRq (I
> can reboot with SysRq+O), however, I'm in X and I don't have a serial
> port on my laptop so I can't see any of the outputs of the SysRq
> options.
>
> After a reboot I don't see anything in my logs about the crash.
>
> Can anyone suggest how to get some information about my freeze?

Sounds like a job for Documentation/networking/netconsole.txt

started by Ingo Molnar <[EMAIL PROTECTED]>, 2001.09.17
2.6 port and netpoll api by Matt Mackall <[EMAIL PROTECTED]>, Sep 9 2003

Please send bug reports to Matt Mackall <[EMAIL PROTECTED]>

This module logs kernel printk messages over UDP allowing debugging of
problem where disk logging fails and serial consoles are impractical.

It can be used either built-in or as a module. As a built-in,
netconsole initializes immediately after NIC cards and will bring up
the specified interface as soon as possible. While this doesn't allow
capture of early kernel panics, it does capture most of the boot
process.

It takes a string configuration parameter "netconsole" in the
following format:

 [EMAIL PROTECTED]/[],[tgt-port]@/[tgt-macaddr]

   where
        src-port      source for UDP packets (defaults to 6665)
        src-ip        source IP to use (interface address)
        dev           network interface (eth0)
        tgt-port      port for logging agent ()
        tgt-ip        IP address for logging agent
        tgt-macaddr   ethernet MAC address for logging agent (broadcast)

Examples:

 linux [EMAIL PROTECTED]/eth1,[EMAIL PROTECTED]/12:34:56:78:9a:bc

  or

 insmod netconsole netconsole=@/,@10.0.0.2/

Built-in netconsole starts immediately after the TCP stack is
initialized and attempts to bring up the supplied dev at the supplied
address.

The remote host can run either 'netcat -u -l -p ' or syslogd.

WARNING: the default target ethernet setting uses the broadcast
ethernet address to send packets, which can cause increased load on
other systems on the same ethernet segment.

NOTE: the network device (eth1 in the above case) can run any kind
of other network traffic, netconsole is not intrusive. Netconsole
might cause slight delays in other traffic if the volume of kernel
messages is high, but should have no other impact.

Netconsole was designed to be as instantaneous as possible, to
enable the logging of even the most critical kernel bugs. It works
from IRQ contexts as well, and does not enable interrupts while
sending packets. Due to these unique needs, configuration can not
be more automatic, and some fundamental limitations will remain:
only IP networks, UDP packets and ethernet devices are supported.
Re: 2.6.12-rc2-mm3
> >This patch fixes the NMI checking problems in -mm x64 for me. It
>
> What problems?
>

Sorry, in -mm on x64 check_nmi_watchdog() has started to be run as a late_initcall(). Currently it reports the NMIs as stuck on a few systems although they are not; both of mine are reported as stuck. This appears to be because the event selected by the current event mask doesn't tick much while running mdelay() on Opteron (in my case).

Also, in -mm, because nmi_hz is set to 1 in setup_k7_watchdog(), the NMI watchdog check takes 10 seconds, a bit much.

The patch below uses RETIRED_UOPS for a more constant rate of NMI sending; this works well for me. However, I'd like NMIs to fire maybe every fourth second or so. Using nmi_mult to multiply nmi_interval() by 4 doesn't seem to make it fire every fourth second, though, more like every 1.5 seconds. I'm puzzled about this...

Index: x64_mm/arch/x86_64/kernel/nmi.c
===
--- x64_mm.orig/arch/x86_64/kernel/nmi.c	2005-04-18 12:56:05.0 +0200
+++ x64_mm/arch/x86_64/kernel/nmi.c	2005-04-18 13:34:37.0 +0200
@@ -59,6 +59,7 @@

 unsigned int nmi_watchdog = NMI_DEFAULT;
 static unsigned int nmi_hz = HZ;
+static int nmi_mult = 1; /* nmi multiplier, how many seconds inbetween */
 unsigned int nmi_perfctr_msr;	/* the MSR to reset in NMI handler */

 /* Note that these events don't tick when the CPU idles. This means
@@ -68,7 +69,7 @@
 #define K7_EVNTSEL_INT		(1 << 20)
 #define K7_EVNTSEL_OS		(1 << 17)
 #define K7_EVNTSEL_USR		(1 << 16)
-#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
+#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0xC1 /* Retired uops */
 #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING

 #define P6_EVNTSEL0_ENABLE	(1 << 22)
@@ -78,6 +79,11 @@
 #define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
 #define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED

+static inline unsigned long nmi_interval(void)
+{
+	return ((unsigned long)cpu_khz * 1000 * nmi_mult) / nmi_hz;
+}
+
 /* Run after command line and cpu_init init, but before all other checks */
 void __init nmi_watchdog_default(void)
 {
@@ -146,8 +152,10 @@

 	/* now that we know it works we can reduce NMI frequency to
 	   something more reasonable; makes a difference in some configs */
-	if (nmi_watchdog == NMI_LOCAL_APIC)
+	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		nmi_hz = 1;
+		nmi_mult = 4;
+	}

 	return 0;
 }
@@ -305,9 +313,6 @@
 	int i;
 	unsigned int evntsel;

-	/* No check, so can start with slow frequency */
-	nmi_hz = 1;
-
 	/* XXX should check these in EFER */

 	nmi_perfctr_msr = MSR_K7_PERFCTR0;
@@ -325,7 +330,7 @@
 		| K7_NMI_EVENT;

 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
-	wrmsrl(MSR_K7_PERFCTR0, -((u64)cpu_khz*1000) / nmi_hz);
+	wrmsrl(MSR_K7_PERFCTR0, -nmi_interval());
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= K7_EVNTSEL_ENABLE;
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
@@ -409,7 +414,7 @@
 		alert_counter[cpu] = 0;
 	}
 	if (nmi_perfctr_msr)
-		wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1);
+		wrmsr(nmi_perfctr_msr, -nmi_interval(), -1);
 }

 static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
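
One way to sanity-check the interval arithmetic in this patch is to model it in user space. The sketch below is not kernel code: nmi_interval_model() and events_until_overflow_32() are hypothetical helpers, and the 2 GHz clock (cpu_khz = 2000000) is an assumed value. It relies on the 2.6-era wrmsr(msr, low, high) macro taking two 32-bit halves, so a reload count larger than 2^32 no longer fits in the low half. Whether that truncation is what shortens the observed period is an assumption, not something the thread confirms, and retired uops do not accumulate at exactly the clock rate either.

```c
#include <stdint.h>

/* Hypothetical user-space model of the patch's nmi_interval():
 * number of counted events between two NMIs. */
static uint64_t nmi_interval_model(uint64_t cpu_khz, uint64_t nmi_mult,
				   uint64_t nmi_hz)
{
	return cpu_khz * 1000 * nmi_mult / nmi_hz;
}

/* Events until the counter overflows when only the low 32 bits of
 * -interval are loaded and every higher bit is set to one, which models
 * a call of the form wrmsr(msr, -interval, -1). */
static uint64_t events_until_overflow_32(uint64_t interval)
{
	uint32_t low = (uint32_t)(0 - interval);

	return (UINT64_C(1) << 32) - low;
}
```

With the assumed 2 GHz clock, nmi_mult = 1 asks for 2000000000 events and survives the 32-bit load unchanged, while nmi_mult = 4 asks for 8000000000 events but the truncated load overflows after 3705032704.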
Re: Need some help to debug a freeze on 2.6.11
I'm running Linux on my laptop and it sometimes freezes (about once a week). The only thing which seems to work when it's stuck is SysRq (I can reboot with SysRq+O), however, I'm in X and I don't have a serial port on my laptop so I can't see any of the outputs of the SysRq options. After a reboot I don't see anything in my logs about the crash. Can anyone suggest how to get some information about my freeze?

Sounds like a job for Documentation/networking/netconsole.txt

started by Ingo Molnar [EMAIL PROTECTED], 2001.09.17
2.6 port and netpoll api by Matt Mackall [EMAIL PROTECTED], Sep 9 2003

Please send bug reports to Matt Mackall [EMAIL PROTECTED]

This module logs kernel printk messages over UDP allowing debugging of problem where disk logging fails and serial consoles are impractical.

It can be used either built-in or as a module. As a built-in, netconsole initializes immediately after NIC cards and will bring up the specified interface as soon as possible. While this doesn't allow capture of early kernel panics, it does capture most of the boot process.

It takes a string configuration parameter netconsole in the following format:

 [EMAIL PROTECTED]/[dev],[tgt-port]@tgt-ip/[tgt-macaddr]

   where
        src-port      source for UDP packets (defaults to 6665)
        src-ip        source IP to use (interface address)
        dev           network interface (eth0)
        tgt-port      port for logging agent ()
        tgt-ip        IP address for logging agent
        tgt-macaddr   ethernet MAC address for logging agent (broadcast)

Examples:

 linux [EMAIL PROTECTED]/eth1,[EMAIL PROTECTED]/12:34:56:78:9a:bc

  or

 insmod netconsole netconsole=@/,@10.0.0.2/

Built-in netconsole starts immediately after the TCP stack is initialized and attempts to bring up the supplied dev at the supplied address.

The remote host can run either 'netcat -u -l -p port' or syslogd.

WARNING: the default target ethernet setting uses the broadcast ethernet address to send packets, which can cause increased load on other systems on the same ethernet segment.

NOTE: the network device (eth1 in the above case) can run any kind of other network traffic, netconsole is not intrusive. Netconsole might cause slight delays in other traffic if the volume of kernel messages is high, but should have no other impact.

Netconsole was designed to be as instantaneous as possible, to enable the logging of even the most critical kernel bugs. It works from IRQ contexts as well, and does not enable interrupts while sending packets. Due to these unique needs, configuration can not be more automatic, and some fundamental limitations will remain: only IP networks, UDP packets and ethernet devices are supported.
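
If neither netcat nor syslogd is at hand on the receiving machine, a few lines of C can stand in for the 'netcat -u -l -p port' listener mentioned above. This is only a sketch under stated assumptions: open_log_socket() and read_log_message() are hypothetical names, the port is whatever was given as tgt-port, and nothing here is part of netconsole itself. It uses plain POSIX UDP sockets.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Bind a UDP socket to `port` on all local addresses.
 * Returns the socket descriptor, or -1 on failure. */
static int open_log_socket(unsigned short port)
{
	struct sockaddr_in addr;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Wait for one datagram and store it in buf as a NUL-terminated string.
 * Returns the number of bytes stored, or -1 on error. */
static ssize_t read_log_message(int fd, char *buf, size_t size)
{
	ssize_t n;

	if (size == 0)
		return -1;

	n = recvfrom(fd, buf, size - 1, 0, NULL, NULL);
	if (n < 0)
		return -1;

	buf[n] = '\0';
	return n;
}
```

A caller would open the socket once and then loop over read_log_message(), writing each message to a file or to stdout. Because the receiver is passive, it can be started before the machine under test ever freezes.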
Re: 2.6.12-rc2-mm3
[Proper patch now that goes all the way, sorry for spamming]

The patch below uses RETIRED_UOPS for a more constant rate of NMI sending. This makes x64 deliver NMI interrupts every fourth second at a constant rate when going through the local APIC. Both CPUs on my box now get NMIs at a constant rate, which they previously did not; there could be long delays when a CPU was idle.

This fixes the misdetection in check_nmi_watchdog(), which thought NMI sending was stuck although it was not, because the perfctr did not generate enough events with the previous mask.

The 10-second check_nmi_watchdog() delay is down to 10 msec now. Tested on Opteron SMP.

Index: x64_mm/arch/x86_64/kernel/nmi.c
===
--- x64_mm.orig/arch/x86_64/kernel/nmi.c	2005-04-18 12:56:05.0 +0200
+++ x64_mm/arch/x86_64/kernel/nmi.c	2005-04-18 14:47:14.0 +0200
@@ -59,16 +59,14 @@

 unsigned int nmi_watchdog = NMI_DEFAULT;
 static unsigned int nmi_hz = HZ;
+static int nmi_mult = 1; /* nmi multiplier for longer intervals */
 unsigned int nmi_perfctr_msr;	/* the MSR to reset in NMI handler */

-/* Note that these events don't tick when the CPU idles. This means
-   the frequency varies with CPU load. */
-
 #define K7_EVNTSEL_ENABLE	(1 << 22)
 #define K7_EVNTSEL_INT		(1 << 20)
 #define K7_EVNTSEL_OS		(1 << 17)
 #define K7_EVNTSEL_USR		(1 << 16)
-#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
+#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0xC1 /* Retired uops */
 #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING

 #define P6_EVNTSEL0_ENABLE	(1 << 22)
@@ -78,6 +76,11 @@
 #define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
 #define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED

+static inline unsigned long nmi_interval(void)
+{
+	return ((unsigned long)cpu_khz * 1000 * nmi_mult) / nmi_hz;
+}
+
 /* Run after command line and cpu_init init, but before all other checks */
 void __init nmi_watchdog_default(void)
 {
@@ -146,8 +149,10 @@

 	/* now that we know it works we can reduce NMI frequency to
 	   something more reasonable; makes a difference in some configs */
-	if (nmi_watchdog == NMI_LOCAL_APIC)
+	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		nmi_hz = 1;
+		nmi_mult = 8;
+	}

 	return 0;
 }
@@ -305,9 +310,6 @@
 	int i;
 	unsigned int evntsel;

-	/* No check, so can start with slow frequency */
-	nmi_hz = 1;
-
 	/* XXX should check these in EFER */

 	nmi_perfctr_msr = MSR_K7_PERFCTR0;
@@ -325,7 +327,7 @@
 		| K7_NMI_EVENT;

 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
-	wrmsrl(MSR_K7_PERFCTR0, -((u64)cpu_khz*1000) / nmi_hz);
+	wrmsrl(MSR_K7_PERFCTR0, -nmi_interval());
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= K7_EVNTSEL_ENABLE;
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
@@ -393,10 +395,10 @@
 		if (last_irq_sums[cpu] == sum) {
 			/*
 			 * Ayiee, looks like this CPU is stuck ...
-			 * wait a few IRQs (5 seconds) before doing the oops ...
+			 * wait a few NMIs before doing the oops ...
 			 */
 			alert_counter[cpu]++;
-			if (alert_counter[cpu] == 5*nmi_hz) {
+			if (alert_counter[cpu] == 3*nmi_hz) {
 				if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
 							== NOTIFY_STOP) {
 					alert_counter[cpu] = 0;
@@ -409,7 +411,7 @@
 		alert_counter[cpu] = 0;
 	}
 	if (nmi_perfctr_msr)
-		wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1);
+		wrmsr(nmi_perfctr_msr, -nmi_interval(), -1);
 }

 static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
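
The drop from 10 seconds to 10 msec follows from the wait in check_nmi_watchdog(), which appears elsewhere in this thread as mdelay((10*1000)/nmi_hz). The helper below is a hypothetical user-space restatement of that one expression, not kernel code, and it assumes HZ is 1000, which is what the 10 msec figure implies.

```c
/* Milliseconds that check_nmi_watchdog() busy-waits, mirroring
 * the expression mdelay((10*1000)/nmi_hz): ten NMI periods. */
static unsigned int nmi_check_delay_ms(unsigned int nmi_hz)
{
	return (10 * 1000) / nmi_hz;
}
```

With nmi_hz forced to 1 before the check, as setup_k7_watchdog() used to do, the wait is 10000 ms. Leaving nmi_hz at the assumed HZ of 1000 until the check has passed gives 10 ms.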
Re: Need some help to debug a freeze on 2.6.11
Sounds like a job for Documentation/networking/netconsole.txt

or Documentation/serial-console.txt

Console on line printer would also be an option.

I don't have any printer port cables, so I guess I prefer to try netconsole. I'm using wireless lan (Intel's ipw2100), would netconsole work on wlan interface?

Not sure, can't comment on it...

As an alternative, can I configure netconsole for my ethernet port and only really connect it, after I get the freeze?

Yep, this will work well.
Re: 2.6.12-rc2-mm3
On Mon, 2005-04-18 at 13:14 +0200, Arjan van de Ven wrote:
> On Mon, 2005-04-18 at 13:05 +0200, Alexander Nyberg wrote:
> > [Proper patch now that goes all the way, sorry for spamming]
> >
> > Patch below uses RETIRED_UOPS for a more constant rate of NMI sending.
> > This makes x64 deliver NMI interrupts every fourth second at a constant
> > rate when going through the local apic. Makes both cpus on my box to get
> > NMIs at constant rate that it previously did not, there could be long
> > delays when a CPU was idle.
>
> isn't this dangerous in the light of the mobile cpus that either scale
> back or stop entirely in idle or lower load situations ?

I don't see any real problem: at each nmi_watchdog_tick() the next NMI is calculated from cpu_khz, so the NMIs might not come at a constant rate during frequency scaling, but over time there will still be one every fourth second.

And if the CPU stops entirely, as you say, are any uops even executed? Even if so, the watchdog as it is today would also have a few events accounted to it and could fire an NMI as well.
Re: 2.6.12-rc2-mm3
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/
>
>

[Mikael Pettersson on CC, would like your advice]

This patch fixes the NMI checking problems in -mm x64 for me. It changes the perfctr selection to use RETIRED_UOPS instead (which makes both processors tick, even on my box). This makes the NMI tick once per second while running, which is quite a lot. I'd like to get it down to every fourth second, and herein lies the problem: multiplying nmi_interval() in the patch below by 4 does not help, it still ticks at about the same pace. I'm puzzled...

Index: x64_mm/arch/x86_64/kernel/nmi.c
===
--- x64_mm.orig/arch/x86_64/kernel/nmi.c	2005-04-17 14:34:09.0 +0200
+++ x64_mm/arch/x86_64/kernel/nmi.c	2005-04-18 02:11:37.0 +0200
@@ -58,7 +58,7 @@
 int panic_on_timeout;

 unsigned int nmi_watchdog = NMI_DEFAULT;
-static unsigned int nmi_hz = HZ;
+static unsigned long nmi_hz = HZ;
 unsigned int nmi_perfctr_msr;	/* the MSR to reset in NMI handler */

 /* Note that these events don't tick when the CPU idles. This means
@@ -70,6 +70,7 @@
 #define K7_EVNTSEL_USR		(1 << 16)
 #define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
 #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING
+#define K7_RETIRED_UOPS		0xC1 /* always running */

 #define P6_EVNTSEL0_ENABLE	(1 << 22)
 #define P6_EVNTSEL_INT		(1 << 20)
@@ -78,6 +79,11 @@
 #define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
 #define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED

+static inline unsigned long nmi_interval(void)
+{
+	return (((unsigned long)cpu_khz * 1000UL) / nmi_hz);
+}
+
 /* Run after command line and cpu_init init, but before all other checks */
 void __init nmi_watchdog_default(void)
 {
@@ -129,8 +135,8 @@
 	for (cpu = 0; cpu < NR_CPUS; cpu++)
 		counts[cpu] = cpu_pda[cpu].__nmi_count;

-	local_irq_enable();
-	mdelay((10*1000)/nmi_hz); // wait 10 ticks
+
+	mdelay((10*1000) / nmi_hz); /* wait 10 NMI ticks */

 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
 		if (cpu_pda[cpu].__nmi_count - counts[cpu] <= 5) {
@@ -305,9 +311,6 @@
 	int i;
 	unsigned int evntsel;

-	/* No check, so can start with slow frequency */
-	nmi_hz = 1;
-
 	/* XXX should check these in EFER */

 	nmi_perfctr_msr = MSR_K7_PERFCTR0;
@@ -322,10 +325,10 @@
 	evntsel = K7_EVNTSEL_INT
 		| K7_EVNTSEL_OS
 		| K7_EVNTSEL_USR
-		| K7_NMI_EVENT;
+		| K7_RETIRED_UOPS;

 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
-	wrmsrl(MSR_K7_PERFCTR0, -((u64)cpu_khz*1000) / nmi_hz);
+	wrmsrl(MSR_K7_PERFCTR0, -nmi_interval());
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= K7_EVNTSEL_ENABLE;
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
@@ -409,7 +412,7 @@
 		alert_counter[cpu] = 0;
 	}
 	if (nmi_perfctr_msr)
-		wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1);
+		wrmsr(nmi_perfctr_msr, -nmi_interval(), -1);
 }

 static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
Re: 2.6.12-rc2-mm3
On Mon, 2005-04-11 at 01:25 -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/

I tried to kexec on my x64 box and it hangs in calibrate_delay() because the PIT never fires any interrupts, so jiffies is never updated. Has kexec been tested on x64, and should it be working? I want to know whether I should start looking for weirdness in my hardware or whether it is like this on all x64 boxes.

Also, the patch at the bottom is needed to compile kexec on x64 without ia32 emulation support (the includes are not used at the moment).

  CC      arch/x86_64/kernel/crash.o
In file included from arch/x86_64/kernel/crash.c:18:
include/linux/elfcore.h: In function `elf_core_copy_regs':
include/linux/elfcore.h:92: error: dereferencing pointer to incomplete type
include/linux/elfcore.h:92: error: dereferencing pointer to incomplete type

Index: x64_mm/arch/x86_64/kernel/crash.c
===
--- x64_mm.orig/arch/x86_64/kernel/crash.c	2005-04-16 19:23:58.0 +0200
+++ x64_mm/arch/x86_64/kernel/crash.c	2005-04-16 19:47:56.0 +0200
@@ -14,8 +14,6 @@
 #include <linux/irq.h>
 #include <linux/reboot.h>
 #include <linux/kexec.h>
-#include <linux/elf.h>
-#include <linux/elfcore.h>

 #include <asm/processor.h>
 #include <asm/hardirq.h>
Re: [PATCH] Fix reproducible SMP crash in security/keys/key.c
On Tue, 2005-04-12 at 21:58 +0300, Jani Jaakkola wrote:
> SMP race handling is broken in key_user_lookup() in security/keys/key.c
> (if CONFIG_KEYS is set to 'y'). This came up on our Samba servers, but is
> not restricted to samba, though samba is probably the only software which
> is likely to trigger this repeatedly (and it did happen already four
> times here in University of Helsinki, CS department).
>
> However, it only takes two setreuid() calls at the same instant, so this
> may be responsible for some other mysterious random crashes.
>
> This is the same bug which was previously reported to LKML here (found by
> google):
> http://www.ussg.iu.edu/hypermail/linux/kernel/0502.2/0521.html
>
> Here is a small test program, which can be used to trigger the bug and
> crash the machine where it is run. It might take a few seconds:
>
> #include <unistd.h>
> #include <stdio.h>
> int main() {
>     int i;
>     fork();
>     while(1) {
>         for(i=0;i<6;i++) { setreuid(i,0); }
>         putchar('.'); fflush(stdout);
>     };
> }
>
> The (rather obvious) problem is that key_user_lookup() does not properly
> re-initialize the user lookup if there was a race.
>
> This patch applies to vanilla 2.6.11.7 and latest fedora kernel
> 2.6.11-1.14_FC3. When applied, the test program runs just fine (and does
> nothing useful).

A fix for this went into mainline two months ago (post 2.6.11), but I probably should have sent it to -stable as well.

For your own sake, always use the latest kernel when looking at problems/fixes; things move fast around here :)
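
The class of bug described here, a lookup that is not restarted after the lock was dropped to allocate, can be illustrated in user space. The sketch below is not the kernel's key_user_lookup() and not the fix that went into mainline: struct user_rec, user_lookup() and the pthread mutex are all hypothetical stand-ins, and a simple linked list replaces whatever structure the kernel uses. The point it shows is that every pass searches again from the head of the list after re-taking the lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-user record; not the kernel's struct key_user. */
struct user_rec {
	unsigned int uid;
	int usage;
	struct user_rec *next;
};

static struct user_rec *user_list;
static pthread_mutex_t user_lock = PTHREAD_MUTEX_INITIALIZER;

/* Search the list; the caller must hold user_lock. */
static struct user_rec *find_locked(unsigned int uid)
{
	struct user_rec *u;

	for (u = user_list; u; u = u->next)
		if (u->uid == uid)
			return u;
	return NULL;
}

/* Find the record for uid, creating it if needed. Allocation happens
 * with the lock dropped, so the search is redone from scratch on every
 * pass: another thread may have inserted the record in the meantime. */
struct user_rec *user_lookup(unsigned int uid)
{
	struct user_rec *u, *candidate = NULL;

	for (;;) {
		pthread_mutex_lock(&user_lock);
		u = find_locked(uid);
		if (u) {
			u->usage++;		/* existing record wins */
			break;
		}
		if (candidate) {
			candidate->next = user_list;
			user_list = candidate;
			u = candidate;
			candidate = NULL;	/* now owned by the list */
			break;
		}
		pthread_mutex_unlock(&user_lock);

		candidate = malloc(sizeof(*candidate));
		if (!candidate)
			return NULL;
		candidate->uid = uid;
		candidate->usage = 1;
		candidate->next = NULL;
	}
	pthread_mutex_unlock(&user_lock);

	free(candidate);	/* lost the race: discard the spare record */
	return u;
}
```

Two callers racing for the same uid end up sharing one record with a usage count of two; the loser's spare allocation is simply freed.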
Re: 2.6.12-rc2-mm3: 10 seconds of nothingness
> [   19.617890] Testing NMI watchdog ... <6>ACPI: No ACPI bus support for 2-2
> [   19.705673] ACPI: No ACPI bus support for 2-2:1.0
> [   20.002417] usb 3-2: new full speed USB device using uhci_hcd and address 2
> [   20.121763] ACPI: No ACPI bus support for 3-2
> [   20.156293] ACPI: No ACPI bus support for 3-2:1.0
> [   29.539613] OK.
>
> I also had this "problem" with mm1. mm2 wouldn't compile, so I didn't
> test that. IIRC it also happened with the rc1-mm's. Is this supposed to
> happen?

It's a fairly new thing on x64 and should be fixed soon. If it disturbs you too much, back out
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/broken-out/rfc-check-nmi-watchdog-is-broken.patch
Re: bkbits.net is down
On Tue, 2005-04-12 at 13:10 +0200, Marcin Dalecki wrote:
> On 2005-04-12, at 04:17, Larry McVoy wrote whatever...
>
> Excuse me, but: who gives a damn shit?
>

Anyone who wants to have access to the history or any other functioning of the repository. Please don't pollute this list or Larry with such comments.
Re: [PATCH 1/3] cifs: md5 cleanup - functions
> Function names and return types on same line - conform to established
> fs/cifs/ style.
>
> -void
> -MD5Init(struct MD5Context *ctx)
> +void MD5Init(struct MD5Context *ctx)
>  {
>  	ctx->buf[0] = 0x67452301;
>  	ctx->buf[1] = 0xefcdab89;
> @@ -60,8 +58,7 @@ MD5Init(struct MD5Context *ctx)
>   * Update context to reflect the concatenation of another buffer full
>   * of bytes.
>   */
> -void
> -MD5Update(struct MD5Context *ctx, unsigned char const *buf, unsigned len)
> +void MD5Update(struct MD5Context *ctx, unsigned char const *buf, unsigned len)
>  {

Can anyone enlighten me why CIFS is not using crypto/md5? Same question about md4.
Re: 2.6.12-rc2-mm2
> - Largeish x86_64 update

Hi Pavel

I'm playing a bit with suspend on smp, we need something like this:

As the cpu-mask is set to only this cpu, _smp_processor_id() is safe.

Index: linux-2.6.11/kernel/power/smp.c
===
--- linux-2.6.11.orig/kernel/power/smp.c	2005-04-10 09:43:13.0 +0200
+++ linux-2.6.11/kernel/power/smp.c	2005-04-10 15:23:36.0 +0200
@@ -46,13 +46,13 @@

 void disable_nonboot_cpus(void)
 {
-	printk("Freezing CPUs (at %d)", smp_processor_id());
 	oldmask = current->cpus_allowed;
 	set_cpus_allowed(current, cpumask_of_cpu(0));
+	printk("Freezing CPUs (at %d)", _smp_processor_id());
 	current->state = TASK_INTERRUPTIBLE;
 	schedule_timeout(HZ);
 	printk("...");
-	BUG_ON(smp_processor_id() != 0);
+	BUG_ON(_smp_processor_id() != 0);

 	/* FIXME: for this to work, all the CPUs must be running
 	 * "idle" thread (or we deadlock). Is that guaranteed? */
Re: 2.6.12-rc2-mm2
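> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm2/
>
> Changes since 2.6.12-rc2-mm1:
>
> bk-acpi.patch

[acpi-devel up on cc]

One of my boxen takes about 5 minutes to reboot now, hitting sysrq-p a few
times shows it mostly sits in acpi_ut_find_allocation+0x2b/0x37.
Reverting bk-acpi.patch makes it reboot like normal, config attached.

(gdb) disass acpi_ut_find_allocation
Dump of assembler code for function acpi_ut_find_allocation:
0xc01daa2d <acpi_ut_find_allocation+0>:   push   %ebp
0xc01daa2e <acpi_ut_find_allocation+1>:   mov    %esp,%ebp
0xc01daa30 <acpi_ut_find_allocation+3>:   push   %esi
0xc01daa31 <acpi_ut_find_allocation+4>:   mov    %edx,%esi
0xc01daa33 <acpi_ut_find_allocation+6>:   push   %ebx
0xc01daa34 <acpi_ut_find_allocation+7>:   mov    %eax,%ebx
0xc01daa36 <acpi_ut_find_allocation+9>:   call   0xc01dae66 <acpi_ut_track_stack_ptr>
0xc01daa3b <acpi_ut_find_allocation+14>:  xor    %edx,%edx
0xc01daa3d <acpi_ut_find_allocation+16>:  cmp    $0x6,%ebx
0xc01daa40 <acpi_ut_find_allocation+19>:  ja     0xc01daa5e <acpi_ut_find_allocation+49>
0xc01daa42 <acpi_ut_find_allocation+21>:  imul   $0x24,%ebx,%eax
0xc01daa45 <acpi_ut_find_allocation+24>:  mov    0xc03c1040(%eax),%eax
0xc01daa4b <acpi_ut_find_allocation+30>:  test   %eax,%eax
0xc01daa4d <acpi_ut_find_allocation+32>:  je     0xc01daa5c <acpi_ut_find_allocation+47>
0xc01daa4f <acpi_ut_find_allocation+34>:  cmp    %esi,%eax
0xc01daa51 <acpi_ut_find_allocation+36>:  mov    %eax,%edx
0xc01daa53 <acpi_ut_find_allocation+38>:  je     0xc01daa5e <acpi_ut_find_allocation+49>
0xc01daa55 <acpi_ut_find_allocation+40>:  mov    0x4(%eax),%eax
0xc01daa58 <acpi_ut_find_allocation+43>:  test   %eax,%eax
0xc01daa5a <acpi_ut_find_allocation+45>:  jne    0xc01daa4f <acpi_ut_find_allocation+34>
0xc01daa5c <acpi_ut_find_allocation+47>:  xor    %edx,%edx
0xc01daa5e <acpi_ut_find_allocation+49>:  pop    %ebx
0xc01daa5f <acpi_ut_find_allocation+50>:  pop    %esi
0xc01daa60 <acpi_ut_find_allocation+51>:  mov    %edx,%eax
0xc01daa62 <acpi_ut_find_allocation+53>:  leave
0xc01daa63 <acpi_ut_find_allocation+54>:  ret
End of assembler dump.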
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.12-rc2-mm2
# Sat Apr 9 13:27:39 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
# CONFIG_HOTPLUG is not set
# CONFIG_KOBJECT_UEVENT is not set
# CONFIG_IKCONFIG is not set
CONFIG_EMBEDDED=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
# CONFIG_MODULE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_KMOD is not set

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_HPET_TIMER=y
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
CONFIG_FLATMEM=y
# CONFIG_DISCONTIGMEM is not set
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_HAVE_DEC_LOCK=y
CONFIG_REGPARM=y
CONFIG_SECCOMP=y

#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
CONFIG_PHYSICAL_START=0x10
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
CONFIG_SOFTWARE_SUSPEND=y
CONFIG_PM_STD_PARTITION=""
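
[Archive note] The gdb dump of acpi_ut_find_allocation in the report above reads as a linear walk of a singly linked list, which would be consistent with sysrq-p repeatedly landing at +0x2b (inside the loop) if the list has grown long. The following is only a C reading of those instructions, not the ACPI CA source. The names tracked_block, tracked_list, lists and find_allocation are hypothetical, and the field layout is inferred solely from the 0x4(%eax) load and the 0x24 stride.

```c
#include <stddef.h>

/* Inferred from "cmp $0x6,%ebx; ja": list ids 0..6 are accepted. */
#define NUM_LISTS 7

/* Hypothetical layout: on i386 the walk loads the next pointer from
 * offset 4, i.e. the second pointer-sized field of each element. */
struct tracked_block {
    struct tracked_block *previous;
    struct tracked_block *next;
};

/* Hypothetical stand-in for the 0x24-byte table entries at 0xc03c1040;
 * only the list head is modelled here. */
struct tracked_list {
    struct tracked_block *head;
};

static struct tracked_list lists[NUM_LISTS];

/* Return the element equal to `address`, or NULL if it is not on the list. */
struct tracked_block *find_allocation(unsigned int list_id,
                                      const struct tracked_block *address)
{
    struct tracked_block *element;

    if (list_id >= NUM_LISTS)      /* +19: ja to the exit path, result 0 */
        return NULL;

    /* +34..+45: compare, then follow ->next until it is NULL.
     * Cost is proportional to the number of tracked elements. */
    for (element = lists[list_id].head; element != NULL;
         element = element->next)
        if (element == address)
            return element;

    return NULL;                   /* +47: xor %edx,%edx */
}
```

If this reading is right, every lookup costs time proportional to the number of elements on the list, so a large number of lookups over a long list would add up quickly. Whether that is what actually happens during reboot on the reporter's machine is not established by the dump alone.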
Re: Timestamp of file modified through mmap are not changed in 2.6
> Timestamp of file modified through mmap are not changed in 2.6 (even
> after msync()). Observations on 2.4 and 2.6 kernels:
> - on 2.4, timestamps are altered a few seconds after the program exits.
> - on 2.6, timestamps are never altered.
>
> Is this behaviour a normal behaviour ?
>
> Program example to reproduce the bug (you need to create a "test" file
> in the current directory first):

Yeah, there's been at least one bug open on bugzilla for this, and I recall
the POSIX specification saying the times on files shall be updated on mmap
file changes (which makes sense too).

Doing it at msync is easy; keeping track of memory-mapped data etc. is more
cumbersome. I sent a patch doing this a while ago (doesn't work now due to
the msync rework, think it was the 4-level changes) that worked well for me
but nobody seemed to be overwhelmed by it :-)

http://lkml.org/lkml/diff/2004/12/5/95/1
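
[Archive note] The reproduction program mentioned in the quoted report is not preserved in this copy. What follows is an independent minimal sketch of such a check, not the reporter's program: it writes one byte through a shared mapping, calls msync(), and compares st_mtime before and after. The function name mtime_advances_after_mmap_write and the 1.1 second pause (so that a one-second timestamp granularity can show a difference) are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define MAP_LEN 4096

/* Returns 1 if st_mtime changed after a write through a shared mapping
 * followed by msync(MS_SYNC), 0 if it did not, and -1 on any error. */
int mtime_advances_after_mmap_write(const char *path)
{
    struct stat before, after;
    struct timespec pause = { 1, 100000000L };   /* 1.1 s */
    unsigned char *map;
    int fd, rc = -1;

    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Make sure the file is large enough, then record the baseline. */
    if (ftruncate(fd, MAP_LEN) < 0 || fstat(fd, &before) < 0)
        goto out_close;

    map = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    nanosleep(&pause, NULL);     /* let the clock move past the baseline */
    map[0]++;                    /* modify the file through the mapping only */

    if (msync(map, MAP_LEN, MS_SYNC) == 0 && fstat(fd, &after) == 0)
        rc = (after.st_mtime != before.st_mtime);

    munmap(map, MAP_LEN);
out_close:
    close(fd);
    return rc;
}
```

Calling it from a small main() and printing the result is enough to compare kernels. A return of 0 would match the 2.6 behaviour described in the report; the result may also depend on the filesystem, so no particular output is claimed here.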
Re: 2.6.12-rc2-mm2
On Fri, 2005-04-08 at 03:08 -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm2/
>

I got this running ./runltp -x 2, can't recall this happening before. It
bothers me a bit as it's a GFP_KERNEL allocation and there's lots of swap
available. I don't think I've learned to fully decipher these oom-dumps
yet (especially the active/inactive stat) but this looks fishy to me.

Run with /proc/sys/vm/swappiness=1

After the killing /proc/meminfo reports lots of MemFree.

oom-killer: gfp_mask=0x80d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages: 8220kB (0kB HighMem)
Active:240716 inactive:2631 dirty:0 writeback:1162 unstable:0 free:2055 slab:5340 mapped:242023 pagetables:1441
DMA free:4100kB min:60kB low:72kB high:88kB active:172kB inactive:7956kB present:16384kB pages_scanned:363 all_unreclaimable? no
lowmem_reserve[]: 0 1007 1007
Normal free:4120kB min:4028kB low:5032kB high:6040kB active:962692kB inactive:2568kB present:1032128kB pages_scanned:6421 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4100kB
Normal: 32*4kB 1*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4120kB
HighMem: empty
Swap cache: add 921, delete 257, find 0/0, race 0+0
Free swap = 4880036kB
Total swap = 4883720kB
Out of Memory: Killed process 2327 (firefox-bin).

oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages: 8112kB (0kB HighMem)
Active:224482 inactive:19068 dirty:0 writeback:285 unstable:0 free:2028 slab:5089 mapped:243275 pagetables:1407
DMA free:4088kB min:60kB low:72kB high:88kB active:6432kB inactive:1744kB present:16384kB pages_scanned:10438 all_unreclaimable? yes
lowmem_reserve[]: 0 1007 1007
Normal free:4024kB min:4028kB low:5032kB high:6040kB active:891496kB inactive:74528kB present:1032128kB pages_scanned:686 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4088kB
Normal: 6*4kB 4*8kB 0*16kB 4*32kB 2*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4024kB
HighMem: empty
Swap cache: add 2900, delete 2557, find 0/0, race 0+0
Free swap = 4872120kB
Total swap = 4883720kB
Out of Memory: Killed process 2305 (evolution).

oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages: 8096kB (0kB HighMem)
Active:229514 inactive:14617 dirty:0 writeback:0 unstable:0 free:2024 slab:4233 mapped:244389 pagetables:1568
DMA free:4088kB min:60kB low:72kB high:88kB active:8228kB inactive:0kB present:16384kB pages_scanned:9771 all_unreclaimable? yes
lowmem_reserve[]: 0 1007 1007
Normal free:4008kB min:4028kB low:5032kB high:6040kB active:909060kB inactive:59364kB present:1032128kB pages_scanned:1623064 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4088kB
Normal: 0*4kB 1*8kB 2*16kB 0*32kB 4*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4008kB
HighMem: empty
Swap cache: add 125467, delete 125298, find 70/196, race 0+0
Free swap = 4383340kB
Total swap = 4883720kB
Out of Memory: Killed process 2330 (gnome-pty-helpe).

oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
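
[Archive note] As an aid to reading the dumps above: each per-zone line of the form "DMA: 1*4kB 2*8kB ... = 4100kB" lists, for every block size, how many free blocks of that size the zone holds, and the figure after "=" is the sum of count times size. The sketch below recomputes that sum from such a line so it can be checked against the "free:" value reported for the same zone. The function name buddy_line_total_kb is hypothetical, and the parser assumes the exact "count*sizekB" token shape seen in this report.

```c
#include <stdio.h>
#include <string.h>

/* Sum every "count*sizekB" term found in one free-list line.
 * The trailing "= NNNNkB" total has no '*' and is therefore ignored. */
long buddy_line_total_kb(const char *line)
{
    long total = 0;
    const char *star = line;

    while ((star = strchr(star, '*')) != NULL) {
        const char *start = star;
        long count, size_kb;

        /* Back up over the digits of the count that precedes the '*'. */
        while (start > line && start[-1] >= '0' && start[-1] <= '9')
            start--;

        if (sscanf(start, "%ld*%ldkB", &count, &size_kb) == 2)
            total += count * size_kb;

        star++;   /* continue the search after this '*' */
    }
    return total;
}
```

Applied to the first dump, the DMA line sums to 4100 kB and the Normal line to 4120 kB, matching the "free:" figures for those zones. In that dump the Normal zone's 4120 kB of free memory sits between its min (4028 kB) and low (5032 kB) watermarks.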
Re: [PATCH] Dynamic Tick version 050406-1
> > > Here's an updated dyn-tick patch. Some minor fixes:
> >
> > Doesn't look so good here. I get this with 2.6.12-rc2 (plus a few other
> > patches).
> > Disabling Dynamic Tick makes everything happy again (it boots).
> >
> > [4294688.655000] Unable to handle kernel NULL pointer dereference at
> > virtual address
>
> Thanks for trying it out. What kind of hardware do you have? Does it
> have HPET? It looks like no suitable timer for dyn-tick is found...
> Maybe the following patch helps?

= arch/i386/kernel/Makefile 1.67 vs edited =
--- 1.67/arch/i386/kernel/Makefile	2005-01-26 06:21:13 +01:00
+++ edited/arch/i386/kernel/Makefile	2005-04-07 11:21:19 +02:00
@@ -32,6 +32,7 @@
 obj-$(CONFIG_ACPI_SRAT)	+= srat.o
 obj-$(CONFIG_HPET_TIMER)	+= time_hpet.o
 obj-$(CONFIG_EFI)	+= efi.o efi_stub.o
 obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
+obj-$(CONFIG_NO_IDLE_HZ)	+= dyn-tick.o

 EXTRA_AFLAGS := -traditional