from:"Alexander Nyberg"

Re: [discuss] [PATCH] allow CONFIG_FRAME_POINTER for x86-64

2005-09-09 Thread Alexander Nyberg

On Fri, Sep 09, 2005 at 12:58:12PM +0200 Andi Kleen wrote:

> On Friday 09 September 2005 12:45, Hugh Dickins wrote:
> > On Fri, 9 Sep 2005, Jan Beulich wrote:
> > > > But why would anyone want frame pointers on x86-64?
> > >
> > > I'd put the question differently: Why should x86-64 not allow what
> > > other architectures do?
> > >
> > > But of course, I'm not insisting on this patch to get in, it just
> > > seemed an obvious inconsistency...
> >
> > I'm with Jan on this.  I use a similar patch for frame pointers on
> > x86_64 most of the time, in the hope of getting more accurate backtraces.
> 
> It won't give more accurate backtraces, not even on i386 because show_stack
> doesn't have any code to follow frame pointers.
> 

Huh? print_context_stack, which is called from show_stack, follows
frame pointers.
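
For readers who want to see what "following frame pointers" amounts to, here is a minimal userspace sketch on a simulated stack. It is not the kernel's print_context_stack: the function name walk_frames, the use of array indices in place of addresses, and index 0 as the end-of-chain marker are all assumptions made for illustration. Each frame stores the caller's frame index, with the return address in the word right after it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simulated stack: stack[fp] holds the caller's frame "pointer" (an index
 * into the same array) and stack[fp + 1] holds the return address.
 * A frame index of 0 terminates the chain.
 * Returns the number of return addresses stored in 'out'.
 */
size_t walk_frames(const unsigned long *stack, size_t nwords, size_t fp,
                   unsigned long *out, size_t max)
{
    size_t n = 0;

    assert(stack != NULL && out != NULL);
    while (n < max && fp != 0 && fp + 1 < nwords) {
        size_t next = (size_t)stack[fp];

        out[n++] = stack[fp + 1];   /* return address sits above saved fp */
        if (next <= fp)             /* chain must move toward older frames */
            break;
        fp = next;
    }
    return n;
}
```

Words that merely look like addresses but are not linked into the chain are never visited, which is why a frame-pointer walk reports only real callers.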
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Strange LVM2/DM data corruption with 2.6.11.12

2005-09-08 Thread Alexander Nyberg

On Thu, Sep 08, 2005 at 11:58:54AM +0200 Ludovic Drolez wrote:

> Hi !
> 
> We are developing (GPLed) disk cloning software similar to partimage: it's 
> an intelligent 'dd' which backs up only used sectors.
> 
> Recently I added LVM1/2 support to it, and sometimes we saw LVM 
> restorations failing randomly (disk images are not corrupted, but the 
> restoration can lead to a corrupted filesystem). If a 
> restoration fails, just try another one and it will work...
> 

Please upgrade to 2.6.12.6 (I don't remember exactly in which
2.6.12.x it went in), it contains a bugfix that should fix what
you are seeing. 2.6.13 also has this.

Some debugging patches on top of -mm

2005-09-05 Thread Alexander Nyberg

These are debugging patches on top of -mm that make it possible
for arches that want it to save caller traces
of who allocates pages and slab objects.

Any arch that wants to use this could make a next_stack_func function
that goes through the stack starting at *prev_addr and finds the next
function return address. 'count' is for when we can use the frame
pointer (CONFIG_FRAME_POINTER) to get accurate backtraces.

For x86 it goes like:

unsigned long *next_stack_func(unsigned long *prev_addr, int count)
{
	struct thread_info *tinfo = current_thread_info();

	if (!prev_addr)
		return NULL;

#ifdef CONFIG_FRAME_POINTER
	/* In this case 'prev_addr' is a pointer to the last return
	 * function found on the stack */
	if (count == 0) {
		unsigned long ebp;
		unsigned long *func_ptr;

		asm ("movl %%ebp, %0" : "=r" (ebp) : );
		/* We don't want the obvious caller to show up */
		ebp = *(unsigned long *) ebp;
		func_ptr = (unsigned long *)(ebp + 4);
		if (valid_stack_ptr(tinfo, func_ptr))
			return func_ptr;
	} else {
		unsigned long *func_ptr;
		unsigned long ebp = (unsigned long) prev_addr;

		ebp -= 4;

		ebp = *(unsigned long *) ebp;
		func_ptr = (unsigned long *) ((unsigned long)ebp + 4);
		if (valid_stack_ptr(tinfo, func_ptr))
			return func_ptr;
	}
#else
	while (prev_addr++) {
		if (!valid_stack_ptr(tinfo, prev_addr))
			break;
		if (__kernel_text_address(*prev_addr))
			return prev_addr;
	}
#endif
	return NULL;
}


1) A "generic" next_stack_func() for arches that want to have these
debugging facilities

2) Saving more slab object call traces via DBG_DEBUGWORDS. Now uses
next_stack_func(). This still prints to the console, oh well...
(I have not made SLAB_DEBUG conditional on x86 so it won't compile on
non-x86 arches with these patches currently...)

3) Simplification of the page-owner-leak-detector to use next_stack_func()
so that any arch that wants it can use it.
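
The non-frame-pointer path and the way a caller might drive such an iterator can be sketched in userspace as follows. This is an illustration under stated assumptions, not code from the patches: next_candidate, save_trace and NOT_FOUND are made-up names, the stack is a plain array, and a simple address-range check stands in for __kernel_text_address().

```c
#include <assert.h>
#include <stddef.h>

#define NOT_FOUND ((size_t)-1)

/* Stand-in for __kernel_text_address(): a plain range check. */
static int looks_like_text(unsigned long v, unsigned long lo, unsigned long hi)
{
    return v >= lo && v < hi;
}

/*
 * Return the index of the first word after 'prev' whose value falls inside
 * [lo, hi), or NOT_FOUND when the end of the simulated stack is reached.
 * Pass NOT_FOUND as 'prev' to start from the first word.
 */
size_t next_candidate(const unsigned long *stack, size_t nwords, size_t prev,
                      unsigned long lo, unsigned long hi)
{
    size_t i;

    assert(stack != NULL);
    for (i = prev + 1; i < nwords; i++)   /* NOT_FOUND + 1 wraps to 0 */
        if (looks_like_text(stack[i], lo, hi))
            return i;
    return NOT_FOUND;
}

/* Collect up to 'max' candidate return addresses, innermost first. */
size_t save_trace(const unsigned long *stack, size_t nwords,
                  unsigned long lo, unsigned long hi,
                  unsigned long *out, size_t max)
{
    size_t n = 0, i = NOT_FOUND;

    while (n < max &&
           (i = next_candidate(stack, nwords, i, lo, hi)) != NOT_FOUND)
        out[n++] = stack[i];
    return n;
}
```

Because this scan accepts any word that falls in the text range, stale values left on the stack can show up as callers; that is the inaccuracy the frame-pointer path avoids.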


Index: mm/arch/i386/kernel/traps.c
===
--- mm.orig/arch/i386/kernel/traps.c  2005-09-03 11:22:39.0 +0200
+++ mm/arch/i386/kernel/traps.c 2005-09-03 18:17:00.0 +0200
@@ -148,6 +148,48 @@
p < (void *)tinfo + THREAD_SIZE - 3;
 }
 
+unsigned long *next_stack_func(unsigned long *prev_addr, int count)
+{
+	struct thread_info *tinfo = current_thread_info();
+
+	if (!prev_addr)
+		return NULL;
+
+#ifdef CONFIG_FRAME_POINTER
+	/* In this case 'prev_addr' is a pointer to the last return
+	 * function found on the stack */
+	if (count == 0) {
+		unsigned long ebp;
+		unsigned long *func_ptr;
+
+		asm ("movl %%ebp, %0" : "=r" (ebp) : );
+		/* We don't want the obvious caller to show up */
+		ebp = *(unsigned long *) ebp;
+		func_ptr = (unsigned long *)(ebp + 4);
+		if (valid_stack_ptr(tinfo, func_ptr))
+			return func_ptr;
+	} else {
+		unsigned long *func_ptr;
+		unsigned long ebp = (unsigned long) prev_addr;
+
+		ebp -= 4;
+
+		ebp = *(unsigned long *) ebp;
+		func_ptr = (unsigned long *) ((unsigned long)ebp + 4);
+		if (valid_stack_ptr(tinfo, func_ptr))
+			return func_ptr;
+	}
+#else
+	while (prev_addr++) {
+		if (!valid_stack_ptr(tinfo, prev_addr))
+			break;
+		if (__kernel_text_address(*prev_addr))
+			return prev_addr;
+	}
+#endif
+	return NULL;
+}
+
+
 static inline unsigned long print_context_stack(struct thread_info *tinfo,
unsigned long *stack, unsigned long ebp)
 {
Index: mm/include/linux/sched.h
===
--- mm.orig/include/linux/sched.h   2005-09-03 11:22:51.0 +0200
+++ mm/include/linux/sched.h   2005-09-03 15:52:20.0 +0200
@@ -171,6 +171,7 @@
  * trace (or NULL if the entire call-chain of the task should be shown).
  */
 extern void show_stack(struct task_struct *task, unsigned long *sp);
+extern unsigned long *next_stack_func(unsigned long *prev_addr, int count);
 
 void io_schedule(void);
 long io_schedule_timeout(long timeout);
Index: mm/arch/x86_64/kernel/traps.c
===
--- mm.orig/arch/x86_64/kernel/traps.c  2005-09-03 17:59:16.0 +0200
+++ mm/arch/x86_64/kernel/traps.c   2005-09-03 19:00:48.0 +0200
@@ -154,6 +154,54 @@

Re: 2.6.13-mm1

2005-09-04 Thread Alexander Nyberg

On Thu, Sep 01, 2005 at 03:55:42AM -0700 Andrew Morton wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/
> 

I got:
<7>Dead loop on netdevice eth0, fix it urgently!

When using netconsole and printing out some information from kernel to
console.

The box uses:
[EMAIL PROTECTED]/eth0,[EMAIL PROTECTED]/

:00:0f.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
Ethernet 10/100 (rev 11)

Relevant config:
CONFIG_NET_TULIP=y
# CONFIG_DE2104X is not set
CONFIG_TULIP=y
CONFIG_TULIP_MWI=y
# CONFIG_TULIP_MMIO is not set
CONFIG_TULIP_NAPI=y

Matt, on another box I got some irq off hangs that went away when removing
netconsole from the .config on a box with 3c59x. Is this known? The
problem is getting backtraces when netconsole is active, but the last
thing I see before the box goes is that some carrier is up...

Re: gcc coredump with 2.6.12+ kernels

2005-09-03 Thread Alexander Nyberg

On Sat, Sep 03, 2005 at 10:25:37AM -0700 Johnny Stenback wrote:

> Hey all,
> 
> I just attempted to upgrade my kernel to 2.6.13. The kernel appears to 
> boot and run just fine, but when I try to build any larger projects like 
> Mozilla or the Linux kernel I constantly get segfaults from gcc. All 
> other apps *seem* to work fine. I remember seeing this with 2.6.12 too 
> when I tried to upgrade to it too but I didn't have the time to 
> investigate at all then, but now I see the same problem with 2.6.13. The 
> last version I've used that didn't show this problem is 2.6.11.3, and 
> that's running with no problems here.
> 
> When gcc segfaults I get the following messages in the messages log:
> 
> cc1[16775]: segfault at  rip 0036f2b0119e rsp 7faaf0a0 error 4
> cc1[17086]: segfault at  rip 0036f2b0119e rsp 7fc4dfc0 error 4
> cc1[17788]: segfault at  rip 0036f2b0119e rsp 7fd777e0 error 4
> cc1[17823]: segfault at  rip 0036f2b0119e rsp 7fc4d630 error 4
> cc1[17895]: segfault at  rip 0036f2b0119e rsp 7ffd2330 error 4
> 
> I'm on a dual AMD Opteron system, running x86_64 code. Using Fedora Core 
> 2 (yeah, old, I know...) and gcc 3.3.3 20040412.

Does it still happen if you run:

echo 0 > /proc/sys/kernel/randomize_va_space
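
Before changing the setting it can help to record its current value. A small sketch for doing that from C follows; read_randomize_va_space is a hypothetical helper name, and the assumption is that the sysctl file holds a single small integer (0 meaning address space randomization is off).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Returns the current value of the sysctl read from 'path'
 * (0 = randomization off, non-zero = on), or -1 if the file cannot
 * be read (for example on a kernel without this sysctl).
 */
int read_randomize_va_space(const char *path)
{
    FILE *f;
    int value = -1;

    assert(path != NULL);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Calling it with "/proc/sys/kernel/randomize_va_space" before and after the echo shows whether the change took effect.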

Re: 2.6.13-mm1

2005-09-02 Thread Alexander Nyberg

On Thu, Sep 01, 2005 at 03:55:42AM -0700 Andrew Morton wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/
> 

i386-boottime-for_each_cpu-broken.patch
i386-boottime-for_each_cpu-broken-fix.patch

The SMP version of __alloc_percpu checks the cpu_possible_map
before allocating memory for a certain cpu. With the above patches
the BSP cpuid is never set in cpu_possible_map which breaks CONFIG_SMP
on uniprocessor machines (as soon as someone tries to dereference
something allocated via __alloc_percpu, which in fact is never allocated
since the cpu is not set in cpu_possible_map).
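
The failure mode can be modelled in userspace. The sketch below is not the kernel's __alloc_percpu; the names alloc_percpu_sketch and free_percpu_sketch, the NR_CPUS value and the bitmask argument are assumptions chosen to show why a CPU missing from the "possible" mask ends up with no memory behind its slot.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 4   /* illustrative; not the kernel's value */

struct percpu_sketch {
    void *ptrs[NR_CPUS];    /* one slot per CPU, NULL if never allocated */
};

void free_percpu_sketch(struct percpu_sketch *pd)
{
    int cpu;

    if (!pd)
        return;
    for (cpu = 0; cpu < NR_CPUS; cpu++)
        free(pd->ptrs[cpu]);
    free(pd);
}

/* Allocate one object per CPU whose bit is set in 'possible_mask'. */
struct percpu_sketch *alloc_percpu_sketch(size_t size,
                                          unsigned long possible_mask)
{
    struct percpu_sketch *pd;
    int cpu;

    assert(size > 0);
    pd = calloc(1, sizeof(*pd));
    if (!pd)
        return NULL;
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(possible_mask & (1UL << cpu)))
            continue;           /* CPU not "possible": slot stays NULL */
        pd->ptrs[cpu] = calloc(1, size);
        if (!pd->ptrs[cpu]) {
            free_percpu_sketch(pd);
            return NULL;
        }
    }
    return pd;
}
```

With an empty mask the allocation itself succeeds, but the slot for CPU 0 is NULL, so the first dereference on the boot CPU faults. Setting the boot CPU's bit before anything allocates per-cpu data avoids that.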

The below fixes this, I'm not entirely sure about the voyager
part, should the cpu_possible_map really be CPU_MASK_ALL to begin
with there, Zwane?

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: mm/arch/i386/kernel/smpboot.c
===
--- mm.orig/arch/i386/kernel/smpboot.c  2005-09-02 15:28:20.0 +0200
+++ mm/arch/i386/kernel/smpboot.c   2005-09-02 16:16:46.0 +0200
@@ -1265,6 +1265,7 @@
cpu_set(smp_processor_id(), cpu_online_map);
cpu_set(smp_processor_id(), cpu_callout_map);
cpu_set(smp_processor_id(), cpu_present_map);
+   cpu_set(smp_processor_id(), cpu_possible_map);
per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
 }
 
Index: mm/arch/i386/mach-voyager/voyager_smp.c
===
--- mm.orig/arch/i386/mach-voyager/voyager_smp.c  2005-09-02 15:28:20.0 +0200
+++ mm/arch/i386/mach-voyager/voyager_smp.c 2005-09-02 16:17:29.0 +0200
@@ -1910,6 +1910,7 @@
 {
cpu_set(smp_processor_id(), cpu_online_map);
cpu_set(smp_processor_id(), cpu_callout_map);
+   cpu_set(smp_processor_id(), cpu_possible_map);
 }
 
 int __devinit

Re: 2.6.13-rcX really this bad ?

2005-08-14 Thread Alexander Nyberg

On Sun, Aug 14, 2005 at 10:10:18AM + Danny ter Haar wrote:

> I've posted a couple of times that my newsserver is not stable
> with any 2.6.13-rcX kernels.
> Last kernel that survived is 2.6.12-mm1 (18+days)
> Of course i can just stick with that kernel, but i thought it would
> be wise to live on the edge and run a reasonably loaded server with
> the latest/greatest. This ends in disaster though...
>
> Since i got no feedback on my previous posts, i either bring it 
> the wrong way, or people don't care and i ought to shut up.
> I think however that just before releasing a new stable kernel this
> kind of feedback could be healthy to iron out some bugs.
> 

Is the machine running X? We need some output from it so we can debug
what's going on, the info should be printed to the console. It would
be great if you could run the latest kernel and see if you get any
output. Also add nmi_watchdog=2 to the boot command line.

You can also set up a serial console or netconsole to capture the output
from the server with the help of another machine, described in
Documentation/serial-console.txt
Documentation/networking/netconsole.txt

Re: [SLAB] __builtin_return_address use without FRAME_POINTER causes boot failure

2005-08-08 Thread Alexander Nyberg

On Mon, Aug 08, 2005 at 11:37:18PM +0200 Manfred Spraul wrote:

> Christoph Lameter wrote:
> 
> >I kept getting boot failures in the slab allocator. The failure goes 
> >away if one is setting CONFIG_FRAME_POINTER. Seems that 
> >CONFIG_DEBUG_SLAB implies the use of __builtin_return_address() which 
> >needs the framepointer.
> >
> Very odd. __builtin_return_address(1) needs frame pointers, but slab 
> only uses __builtin_return_address(0), which should always work.
> 

My fault, I introduced a debugging patch (i think i cc'ed you on it)
which used __builtin_return_address(1) and __builtin_return_address(2)
to save a trace of an object's callers.
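
For reference, the safe level-0 form looks like this in plain C with GCC or Clang. The struct and function names are hypothetical and only illustrate recording the immediate caller of an initialisation routine; GCC's documentation warns that non-zero levels can have unpredictable effects, including crashing the program.

```c
#include <assert.h>
#include <stddef.h>

/* noinline so the function keeps its own frame and a real caller. */
__attribute__((noinline)) void *who_called_me(void)
{
    /* Level 0: the return address of the current function. */
    return __builtin_return_address(0);
}

struct tracked_obj {
    void *caller;   /* hypothetical debug field: who initialised the object */
    int payload;
};

__attribute__((noinline)) void tracked_init(struct tracked_obj *obj, int payload)
{
    assert(obj != NULL);
    obj->caller = __builtin_return_address(0);
    obj->payload = payload;
}
```

Recording anything deeper than the immediate caller needs an explicit, bounds-checked stack walk rather than higher builtin levels.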

Re: [PATCH] CHECK_IRQ_PER_CPU() to avoid dead code in __do_IRQ()

2005-08-08 Thread Alexander Nyberg

> 
> IRQ_PER_CPU is not used by all architectures.
> This patch introduces the macros
> ARCH_HAS_IRQ_PER_CPU and CHECK_IRQ_PER_CPU() to avoid the generation of
> dead code in __do_IRQ().
> 
> ARCH_HAS_IRQ_PER_CPU is defined by architectures using
> IRQ_PER_CPU in their
> include/asm_ARCH/irq.h
> file.
> 
> Through grepping the tree I found the following
> architectures currently use IRQ_PER_CPU:
> 
> cris, ia64, ppc, ppc64 and parisc. 
> 

There are many places where one could replace run-time tests with 
#ifdef's but it makes the code harder to read (and, in the longer term,
harder to maintain). Have you benchmarked any workload that benefits 
from this?

> 
> diff -upr linux-2.6.13-rc6/include/asm-cris/irq.h linux-2.6.13/include/asm-cris/irq.h
> --- linux-2.6.13-rc6/include/asm-cris/irq.h   2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/asm-cris/irq.h   2005-08-08 11:41:12.0 +0200
> @@ -1,6 +1,11 @@
>  #ifndef _ASM_IRQ_H
>  #define _ASM_IRQ_H
>  
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #include 
>  
>  extern __inline__ int irq_canonicalize(int irq)
> diff -upr linux-2.6.13-rc6/include/asm-ia64/irq.h linux-2.6.13/include/asm-ia64/irq.h
> --- linux-2.6.13-rc6/include/asm-ia64/irq.h   2005-03-02 08:38:33.0 +0100
> +++ linux-2.6.13/include/asm-ia64/irq.h   2005-08-06 18:06:53.0 +0200
> @@ -14,6 +14,11 @@
>  #define NR_IRQS  256
>  #define NR_IRQ_VECTORS   NR_IRQS
>  
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  static __inline__ int
>  irq_canonicalize (int irq)
>  {
> diff -upr linux-2.6.13-rc6/include/asm-parisc/irq.h linux-2.6.13/include/asm-parisc/irq.h
> --- linux-2.6.13-rc6/include/asm-parisc/irq.h 2005-08-08 11:45:26.0 +0200
> +++ linux-2.6.13/include/asm-parisc/irq.h 2005-08-06 18:05:22.0 +0200
> @@ -26,6 +26,11 @@
>  
>  #define NR_IRQS  (CPU_IRQ_MAX + 1)
>  
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  static __inline__ int irq_canonicalize(int irq)
>  {
>   return (irq == 2) ? 9 : irq;
> diff -upr linux-2.6.13-rc6/include/asm-ppc/irq.h linux-2.6.13/include/asm-ppc/irq.h
> --- linux-2.6.13-rc6/include/asm-ppc/irq.h   2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/asm-ppc/irq.h   2005-08-08 11:41:14.0 +0200
> @@ -19,6 +19,11 @@
>  #define IRQ_POLARITY_POSITIVE 0x2 /* high level or low->high edge */
>  #define IRQ_POLARITY_NEGATIVE 0x0 /* low level or high->low edge */
>  
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #if defined(CONFIG_40x)
>  #include 
>  
> diff -upr linux-2.6.13-rc6/include/asm-ppc64/irq.h linux-2.6.13/include/asm-ppc64/irq.h
> --- linux-2.6.13-rc6/include/asm-ppc64/irq.h  2005-03-02 08:38:33.0 +0100
> +++ linux-2.6.13/include/asm-ppc64/irq.h  2005-08-06 18:06:58.0 +0200
> @@ -33,6 +33,11 @@
>  #define IRQ_POLARITY_POSITIVE 0x2 /* high level or low->high edge */
>  #define IRQ_POLARITY_NEGATIVE 0x0 /* low level or high->low edge */
>  
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #define get_irq_desc(irq) (&irq_desc[(irq)])
>  
>  /* Define a way to iterate across irqs. */
> diff -upr linux-2.6.13-rc6/include/linux/irq.h linux-2.6.13/include/linux/irq.h
> --- linux-2.6.13-rc6/include/linux/irq.h  2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/linux/irq.h  2005-08-08 11:55:11.0 +0200
> @@ -32,7 +32,12 @@
>  #define IRQ_WAITING  32  /* IRQ not yet seen - for autodetection */
>  #define IRQ_LEVEL    64  /* IRQ level triggered */
>  #define IRQ_MASKED   128 /* IRQ masked - shouldn't be seen again */
> -#define IRQ_PER_CPU  256 /* IRQ is per CPU */
> +#if defined(ARCH_HAS_IRQ_PER_CPU)
> +# define IRQ_PER_CPU 256 /* IRQ is per CPU */
> +# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
> +#else
> +# define CHECK_IRQ_PER_CPU(var) 0
> +#endif
>  
>  /*
>   * Interrupt controller descriptor. This is all we need
> diff -upr linux-2.6.13-rc6/kernel/irq/handle.c linux-2.6.13/kernel/irq/handle.c
> --- linux-2.6.13-rc6/kernel/irq/handle.c  2005-08-08 11:46:11.0 +0200
> +++ linux-2.6.13/kernel/irq/handle.c  2005-08-08 11:53:00.0 +0200
> @@ -111,7 +111,7 @@ fastcall unsigned int __do_IRQ(unsigned 
>   unsigned int status;
>  
>   kstat_this_cpu.irqs[irq]++;
> - if (desc->status & IRQ_PER_CPU) {
> + if (CHECK_IRQ_PER_CPU(desc->status)) {
>   irqreturn_t action_ret;
>  
>   /*

Re: 2.6.13-rc5-mm1: oops when starting nscd on AMD64

2005-08-08 Thread Alexander Nyberg

> > > I don't think it was supposed to do that.
> >  > 
> >  > Quite possibly it's something to do with the new debugging code - could you
> >  > please take a copy of the offending config, send it over and then try
> >  > removing debug options, see if the crash goes away?  CONFIG_DEBUG_PREEMPT
> >  > would be the first to try..
> > 
> >  The (offending) .config is attached and here's what happens without CONFIG_DEBUG_PREEMPT
> >  (the other debug options being unchanged):
> 
> Yes, my emt64 machine keels over with your .config too.  Maybe it's due to
> CONFIG_SMP=n, not sure.
> 
> Bisection searching shows that the bug was introduced by
> slab-leak-detector-give-longer-traces.patch.
> 

I was afraid it was when I first saw it but I couldn't reproduce (and
still can't).

> Call Trace:801a17bb{sys_epoll_create+568} 8018b1f7{vfs_readdir+167}
>        80231000{add_preempt_count+93} 8010e8fa{system_call+126}
> 

For some reason your compiler inlines more aggressively than mine does,
which makes this:

kmem_cache_alloc
sys_epoll_create(__builtin_return_address(0))
system_call (__builtin_return_address(1))
(__builtin_return_address(2))

and off the stack we go...

I guess it was naive to even try to use this for more than the first
caller, sorry. Please throw that thing away and I'll do some backtracing
similar to CONFIG_PAGE_OWNER
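
The difference between a blind __builtin_return_address(n) and a checked walk can be shown on a simulated stack. This is only a sketch: checked_return_address is a made-up name, array indices stand in for addresses, and the layout (saved frame index followed by return address) is an assumption that mirrors the frame-pointer convention.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simulated stack: stack[fp] is the caller's frame index, stack[fp + 1]
 * the return address. Returns 0 and stores the address for the requested
 * level, or -1 if the chain leaves the stack before reaching that level.
 */
int checked_return_address(const unsigned long *stack, size_t nwords,
                           size_t fp, unsigned int level, unsigned long *out)
{
    assert(stack != NULL && out != NULL);
    for (;;) {
        if (nwords < 2 || fp > nwords - 2)  /* frame must fit in the stack */
            return -1;
        if (level == 0) {
            *out = stack[fp + 1];
            return 0;
        }
        if (stack[fp] <= fp)                /* must move to an older frame */
            return -1;
        fp = (size_t)stack[fp];
        level--;
    }
}
```

With only two real frames on the stack, asking for level 2 returns -1 here, whereas the builtin would simply dereference whatever the last saved frame pointer happened to contain.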

Re: [PATCH] CHECK_IRQ_PER_CPU() to avoid dead code in __do_IRQ()

2005-08-08 Thread Alexander Nyberg

 
> IRQ_PER_CPU is not used by all architectures.
> This patch introduces the macros
> ARCH_HAS_IRQ_PER_CPU and CHECK_IRQ_PER_CPU() to avoid the generation of
> dead code in __do_IRQ().
>
> ARCH_HAS_IRQ_PER_CPU is defined by architectures using
> IRQ_PER_CPU in their
> include/asm_ARCH/irq.h
> file.
>
> Through grepping the tree I found the following
> architectures currently use IRQ_PER_CPU:
>
> cris, ia64, ppc, ppc64 and parisc.
 

There are many places where one could replace run-time tests with
#ifdefs, but that makes the code harder to read (and, in the long term,
harder to maintain). Have you benchmarked any workload that benefits
from this?

 
> diff -upr linux-2.6.13-rc6/include/asm-cris/irq.h linux-2.6.13/include/asm-cris/irq.h
> --- linux-2.6.13-rc6/include/asm-cris/irq.h   2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/asm-cris/irq.h   2005-08-08 11:41:12.0 +0200
> @@ -1,6 +1,11 @@
>  #ifndef _ASM_IRQ_H
>  #define _ASM_IRQ_H
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #include <asm/arch/irq.h>
>
>  extern __inline__ int irq_canonicalize(int irq)
> diff -upr linux-2.6.13-rc6/include/asm-ia64/irq.h linux-2.6.13/include/asm-ia64/irq.h
> --- linux-2.6.13-rc6/include/asm-ia64/irq.h   2005-03-02 08:38:33.0 +0100
> +++ linux-2.6.13/include/asm-ia64/irq.h   2005-08-06 18:06:53.0 +0200
> @@ -14,6 +14,11 @@
>  #define NR_IRQS  256
>  #define NR_IRQ_VECTORS   NR_IRQS
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  static __inline__ int
>  irq_canonicalize (int irq)
>  {
> diff -upr linux-2.6.13-rc6/include/asm-parisc/irq.h linux-2.6.13/include/asm-parisc/irq.h
> --- linux-2.6.13-rc6/include/asm-parisc/irq.h 2005-08-08 11:45:26.0 +0200
> +++ linux-2.6.13/include/asm-parisc/irq.h 2005-08-06 18:05:22.0 +0200
> @@ -26,6 +26,11 @@
>
>  #define NR_IRQS  (CPU_IRQ_MAX + 1)
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  static __inline__ int irq_canonicalize(int irq)
>  {
>  	return (irq == 2) ? 9 : irq;
> diff -upr linux-2.6.13-rc6/include/asm-ppc/irq.h linux-2.6.13/include/asm-ppc/irq.h
> --- linux-2.6.13-rc6/include/asm-ppc/irq.h    2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/asm-ppc/irq.h    2005-08-08 11:41:14.0 +0200
> @@ -19,6 +19,11 @@
>  #define IRQ_POLARITY_POSITIVE 0x2 /* high level or low-high edge */
>  #define IRQ_POLARITY_NEGATIVE 0x0 /* low level or high-low edge */
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #if defined(CONFIG_40x)
>  #include <asm/ibm4xx.h>
>
> diff -upr linux-2.6.13-rc6/include/asm-ppc64/irq.h linux-2.6.13/include/asm-ppc64/irq.h
> --- linux-2.6.13-rc6/include/asm-ppc64/irq.h  2005-03-02 08:38:33.0 +0100
> +++ linux-2.6.13/include/asm-ppc64/irq.h  2005-08-06 18:06:58.0 +0200
> @@ -33,6 +33,11 @@
>  #define IRQ_POLARITY_POSITIVE 0x2 /* high level or low-high edge */
>  #define IRQ_POLARITY_NEGATIVE 0x0 /* low level or high-low edge */
>
> +/*
> + * IRQ line status macro IRQ_PER_CPU is used
> + */
> +#define ARCH_HAS_IRQ_PER_CPU
> +
>  #define get_irq_desc(irq) (irq_desc[(irq)])
>
>  /* Define a way to iterate across irqs. */
> diff -upr linux-2.6.13-rc6/include/linux/irq.h linux-2.6.13/include/linux/irq.h
> --- linux-2.6.13-rc6/include/linux/irq.h  2005-08-08 11:46:10.0 +0200
> +++ linux-2.6.13/include/linux/irq.h  2005-08-08 11:55:11.0 +0200
> @@ -32,7 +32,12 @@
>  #define IRQ_WAITING  32  /* IRQ not yet seen - for autodetection */
>  #define IRQ_LEVEL    64  /* IRQ level triggered */
>  #define IRQ_MASKED   128 /* IRQ masked - shouldn't be seen again */
> -#define IRQ_PER_CPU  256 /* IRQ is per CPU */
> +#if defined(ARCH_HAS_IRQ_PER_CPU)
> +# define IRQ_PER_CPU 256 /* IRQ is per CPU */
> +# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
> +#else
> +# define CHECK_IRQ_PER_CPU(var) 0
> +#endif
>
>  /*
>   * Interrupt controller descriptor. This is all we need
> diff -upr linux-2.6.13-rc6/kernel/irq/handle.c linux-2.6.13/kernel/irq/handle.c
> --- linux-2.6.13-rc6/kernel/irq/handle.c  2005-08-08 11:46:11.0 +0200
> +++ linux-2.6.13/kernel/irq/handle.c  2005-08-08 11:53:00.0 +0200
> @@ -111,7 +111,7 @@ fastcall unsigned int __do_IRQ(unsigned
>  	unsigned int status;
>
>  	kstat_this_cpu.irqs[irq]++;
> -	if (desc->status & IRQ_PER_CPU) {
> +	if (CHECK_IRQ_PER_CPU(desc->status)) {
>  		irqreturn_t action_ret;
>
>  		/*
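
For reference, a user-space sketch of the trade-off discussed above. It
mirrors the shape of the quoted macros, but every DEMO_ name and
demo_dispatch() are invented for illustration. When the "architecture"
macro is undefined, the check collapses to a constant 0 and the compiler
is free to discard the guarded branch; the cost is one more #if for the
reader to keep in mind. The `(void)(var)` in the fallback is an addition
of this sketch to avoid unused-parameter warnings and is not in the patch.

```c
#include <assert.h>

/* Define DEMO_ARCH_HAS_IRQ_PER_CPU to mimic an architecture using the flag. */
#if defined(DEMO_ARCH_HAS_IRQ_PER_CPU)
# define DEMO_IRQ_PER_CPU 256
# define DEMO_CHECK_IRQ_PER_CPU(var) ((var) & DEMO_IRQ_PER_CPU)
#else
/* Constant result: the branch guarded by this check becomes dead code. */
# define DEMO_CHECK_IRQ_PER_CPU(var) ((void)(var), 0)
#endif

/* Returns 1 if the per-CPU path was taken, 0 for the generic path. */
int demo_dispatch(unsigned int status)
{
	if (DEMO_CHECK_IRQ_PER_CPU(status))
		return 1;
	return 0;
}

/* Usage: the expected path depends on how the file was compiled. */
void demo_selfcheck(void)
{
#if defined(DEMO_ARCH_HAS_IRQ_PER_CPU)
	assert(demo_dispatch(256) == 1);
#else
	assert(demo_dispatch(256) == 0);
#endif
	assert(demo_dispatch(0) == 0);
}
```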

Re: [SLAB] __builtin_return_address use without FRAME_POINTER causes boot failure

2005-08-08 Thread Alexander Nyberg

On Mon, Aug 08, 2005 at 11:37:18PM +0200 Manfred Spraul wrote:

> Christoph Lameter wrote:
>
> > I kept getting boot failures in the slab allocator. The failure goes
> > away if one is setting CONFIG_FRAME_POINTER. Seems that
> > CONFIG_DEBUG_SLAB implies the use of __builtin_return_address() which
> > needs the framepointer.
>
> Very odd. __builtin_return_address(1) needs frame pointers, but slab
> only uses __builtin_return_address(0), which should always work.
 

My fault, I introduced a debugging patch (I think I cc'ed you on it)
which used __builtin_return_address([12]), i.e. levels 1 and 2, to save
traces of who the callers of an object are.

Re: oops with 2.6.13-rc5 on webserver with raid

2005-08-07 Thread Alexander Nyberg

On Fri, Aug 05, 2005 at 11:52:15AM +0200 Martin Braun wrote:

> Hi,
> 
> I've been trying to upgrade kernel to 2.6.13-rc5. The server boots
> normally w/o errors, but after while (from 5 minutes up to 2 hours) the
> Kernel hangs (no keyboard input possible). As I am a newbie I cannot
> figure out who will be concerned with this error.

Please don't run ksymoops on 2.6 kernels, it makes the output look
weird and isn't necessary anymore.

> 
> >>EIP; c0324afd <tcp_tso_should_defer+fd/110>   <=
>

Should be fixed in 2.6.13-rc6; if the problem persists, please report back.

Re: sluggish/very slow usb mouse on hp nx6110 notebook => acpi problem

2005-08-07 Thread Alexander Nyberg

On Fri, Aug 05, 2005 at 08:56:51PM +0200 JG wrote:

> hm, i currently have "acpi=off noacpi noapic reboot=b" as kernel
> parameter.
> 
> if i remove the acpi stuff and enable acpi, the usb mouse works fine..
> but after some time (5-10min) the kacpid process goes havoc and eats
> all cpu and the whole system is unresponsive- that's the reason i added
> those acpi=off parameters the first time when installing gentoo..
> 
> i tested with gentoo-2.6.12-r7 and vanilla-2.6.13rc5
> 

That indicates a bug in kacpid or something similar. Could you make sure
you compile in "Magic SysRq key" under "Kernel Hacking", boot vanilla
2.6.13-rc6 (some recent acpi changes have gone in), wait for kacpid to
go nuts and then do

Alt+SysRq+t 4 times, run 'dmesg -s 10 > logfile' and send logfile over
here so that we can see what kacpid is up to.

If the box becomes so unresponsive that you can't extract the log
information, it would be good if you could use either the network console
(Documentation/networking/netconsole.txt) or a serial console
(Documentation/serial-console.txt); both require an extra computer,
though...

Re: Oops in 2.6.13-rc5-git-current (0d317fb72fe3cf0f611608cf3a3015bbe6cd2a66)

2005-08-07 Thread Alexander Nyberg

> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> c0188d15
> *pde = 
> Oops:  [#1]
> PREEMPT 
> CPU:0
> EIP:0060:[inotify_inode_queue_event+85/336]Not tainted VLI
> EFLAGS: 00010206   (2.6.13-rc5-g0d317fb7) 
> EIP is at inotify_inode_queue_event+0x55/0x150
> eax: 6b6b6b6b   ebx: 6b6b6b63   ecx:    edx: 0066
> esi: c3effe34   edi: ce8c76ac   ebp: d4bb864c   esp: d8655eb0
> ds: 007b   es: 007b   ss: 0068
> Process nfsd (pid: 3750, threadinfo=d8654000 task=d6155020)
> Stack: 0286 0286  0400 d4bb8760 d4bb8768  
> c3effe34 
>ce8c76ac d4bb864c c0170626  c3effe34 d6608ad4 db74b17c 
> c3effe34 
>e0cfe9a4 0013 e0d01b34 c0dd91b4 ce8c76ac c000 d66092dc 
> d66093c4 
> Call Trace:
>  [vfs_unlink+358/560] vfs_unlink+0x166/0x230
>  [pg0+544348580/1067586560] nfsd_unlink+0x104/0x230 [nfsd]
>  [pg0+544361268/1067586560] nfsd_cache_lookup+0x1c4/0x3c0 [nfsd]
>  [pg0+544371728/1067586560] nfsd3_proc_remove+0x80/0xc0 [nfsd]
>  [pg0+544381018/1067586560] nfs3svc_decode_diropargs+0x8a/0x100 [nfsd]
>  [pg0+544380880/1067586560] nfs3svc_decode_diropargs+0x0/0x100 [nfsd]
>  [pg0+544321698/1067586560] nfsd_dispatch+0x82/0x1f0 [nfsd]
>  [svc_authenticate+112/336] svc_authenticate+0x70/0x150
>  [svc_process+960/1648] svc_process+0x3c0/0x670
>  [pg0+544323105/1067586560] nfsd+0x1a1/0x350 [nfsd]
>  [ret_from_fork+6/20] ret_from_fork+0x6/0x14
>  [pg0+544322688/1067586560] nfsd+0x0/0x350 [nfsd]
>  [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10

(akpm: a fix for this needs to go into 2.6.13, inotify + nfs 
trivially oopses otherwise, even if inotify isn't actively used)

It looks like the following sequence is done in the wrong order.
When vfs_unlink() is called from sys_unlink(), the caller has already
taken a ref on the inode and sys_unlink() does the last iput(). When it
is called from other callsites, however, vfs_unlink() might do the last
iput() itself (through d_delete()) and free the inode, so
inotify_inode_queue_event() receives and dereferences an already freed
object.

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: mm/fs/namei.c
===
--- mm.orig/fs/namei.c  2005-08-07 12:06:16.0 +0200
+++ mm/fs/namei.c   2005-08-07 18:17:20.0 +0200
@@ -1869,8 +1869,8 @@
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
 	if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
 		struct inode *inode = dentry->d_inode;
-		d_delete(dentry);
 		fsnotify_unlink(dentry, inode, dir);
+		d_delete(dentry);
 	}
 
 	return error;
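
The ordering rule behind this patch can be shown with a toy refcounted
object in user space. This is only a sketch: struct demo_inode, demo_put(),
demo_notify() and demo_unlink() are invented stand-ins, not the kernel's
inode, iput(), fsnotify_unlink() or vfs_unlink(). The point is that anything
which touches the object has to run before the call that may drop the last
reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an inode with a reference count; not the kernel struct. */
struct demo_inode {
	int refcount;
	int events;	/* number of notifications recorded against it */
};

int demo_freed;	/* counts how many objects demo_put() has released */

struct demo_inode *demo_new(void)
{
	struct demo_inode *inode = calloc(1, sizeof(*inode));

	assert(inode != NULL);
	inode->refcount = 1;
	return inode;
}

/* Drop one reference; the last put frees the object. */
void demo_put(struct demo_inode *inode)
{
	if (--inode->refcount == 0) {
		free(inode);
		demo_freed++;
	}
}

/* Touches the object, so it must run while a reference is still held. */
void demo_notify(struct demo_inode *inode)
{
	inode->events++;
}

/* Correct order: notify first, then drop what may be the last reference. */
int demo_unlink(struct demo_inode *inode)
{
	int events;

	demo_notify(inode);
	events = inode->events;
	demo_put(inode);	/* inode must not be used after this line */
	return events;
}
```

Swapping the demo_notify() and demo_put() calls reproduces the bug class:
the notification would then write into freed memory, which with slab
poisoning shows up as the 6b6b6b6b pattern seen in the oops.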

Re: Oops in 2.6.13-rc5-git-current (0d317fb72fe3cf0f611608cf3a3015bbe6cd2a66)

2005-08-07 Thread Alexander Nyberg

On Sat, Aug 06, 2005 at 11:56:30PM -0400 Ryan Anderson wrote:

> 
> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> c0188d15
> *pde = 
> Oops:  [#1]
> PREEMPT 
> Modules linked in: ppp_deflate bsd_comp ppp_async ppp_generic slhc radeon 
> esp6 ah6 wp512 tgr192 tea khazad michael_mic cast6 cast5 arc4 anubis nfsd 
> exportfs lp binfmt_misc ipv6 tsdev evdev analog parport_pc parport 8250_pnp 
> 8250 serial_core via_agp serpent aes_i586 crypto_null snd_via82xx gameport 
> snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc 
> snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore uhci_hcd via_ircc 
> irda dm_mod r8169 raid5 xor tulip via drm agpgart cpuid smbfs usbkbd usbcore 
> trm290 triflex sc1200 ns87415 it821x cy82c693 cs5530 cs5520 atiixp raid1 
> md_mod
> CPU:0
> EIP:0060:[inotify_inode_queue_event+85/336]Not tainted VLI
> EFLAGS: 00010206   (2.6.13-rc5-g0d317fb7) 
> EIP is at inotify_inode_queue_event+0x55/0x150
> eax: 6b6b6b6b   ebx: 6b6b6b63   ecx:    edx: 0066
> esi: c3effe34   edi: ce8c76ac   ebp: d4bb864c   esp: d8655eb0
> ds: 007b   es: 007b   ss: 0068
> Process nfsd (pid: 3750, threadinfo=d8654000 task=d6155020)
> Stack: 0286 0286  0400 d4bb8760 d4bb8768  
> c3effe34 
>ce8c76ac d4bb864c c0170626  c3effe34 d6608ad4 db74b17c 
> c3effe34 
>e0cfe9a4 0013 e0d01b34 c0dd91b4 ce8c76ac c000 d66092dc 
> d66093c4 
> Call Trace:
>  [vfs_unlink+358/560] vfs_unlink+0x166/0x230
>  [pg0+544348580/1067586560] nfsd_unlink+0x104/0x230 [nfsd]
>  [pg0+544361268/1067586560] nfsd_cache_lookup+0x1c4/0x3c0 [nfsd]
>  [pg0+544371728/1067586560] nfsd3_proc_remove+0x80/0xc0 [nfsd]
>  [pg0+544381018/1067586560] nfs3svc_decode_diropargs+0x8a/0x100 [nfsd]
>  [pg0+544380880/1067586560] nfs3svc_decode_diropargs+0x0/0x100 [nfsd]
>  [pg0+544321698/1067586560] nfsd_dispatch+0x82/0x1f0 [nfsd]
>  [svc_authenticate+112/336] svc_authenticate+0x70/0x150
>  [svc_process+960/1648] svc_process+0x3c0/0x670
>  [pg0+544323105/1067586560] nfsd+0x1a1/0x350 [nfsd]
>  [ret_from_fork+6/20] ret_from_fork+0x6/0x14
>  [pg0+544322688/1067586560] nfsd+0x0/0x350 [nfsd]
>  [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10

(the long-aged vfs veteran steps into the picture...)

It looks like the following sequence is done in the wrong order.
When vfs_unlink() is called from sys_unlink(), the caller has already
taken a ref on the inode and sys_unlink() does the last iput(), but when
called from other callsites vfs_unlink() might do the last iput() itself.

Can you still reproduce with this patch applied? The oops should be
triggered by some nfs activity; I'll try to set up a scenario myself.

Index: mm/fs/namei.c
===
--- mm.orig/fs/namei.c  2005-08-07 12:06:16.0 +0200
+++ mm/fs/namei.c   2005-08-07 18:17:20.0 +0200
@@ -1869,8 +1869,8 @@
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
 	if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
 		struct inode *inode = dentry->d_inode;
-		d_delete(dentry);
 		fsnotify_unlink(dentry, inode, dir);
+		d_delete(dentry);
 	}
 
 	return error;

Re: [BUG] module ns558

2005-08-05 Thread Alexander Nyberg

On Fri, Aug 05, 2005 at 08:52:41PM +0200 Michael Stenzel wrote:

> Hello dear Kernel People,
> 
> I have a problem with my gameport, it uses the ns558 driver, the module gets 
> loaded via hotplug/udev at boot, but the gameport gets deactivated somehow.
> I have this Problem for a long time now, and my solution always was rmmod the 
> module and load it again after that the gameport is working.
> But now i have 2.6.13-rc5 with debug stuff turned on and noticed that:
>

Please take the gameport deactivation up with the input guys; I'm
guessing it shouldn't happen in the first place. As for the oops itself,
look at the bottom.

> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> e0afc4ab
> *pde = 
> Oops:  [#1]
> PREEMPT
> Modules linked in: snd_seq_midi snd_seq_midi_event snd_seq video_buf_dvb 
> video_buf w83627hf w83781d i2c_sensor i2c_isa snd_pcm_oss snd_mixer_oss 
> ipt_MASQUERADE ipt_state iptable_mangle iptable_nat iptable_filter 
> ip_conntrack_ftp ip_conntrack_irc ip_conntrack ip_tables rtc joydev analog 
> ns558 budget s5h1420 l64781 ves1820 budget_core saa7146 ttpci_eeprom stv0299 
> tda8083 ves1x93 dvb_core 8139too snd_via82xx gameport snd_mpu401_uart 
> snd_rawmidi snd_seq_device via_rhine crc32 ide_scsi
> CPU:0
> EIP:0060:[<e0afc4ab>]Not tainted VLI
> EFLAGS: 00010282   (2.6.13-rc5-debug)
> EIP is at ns558_exit+0x4b/0x79 [ns558]
> eax: 6b6b6b57   ebx: 6b6b6b57   ecx:    edx: 6b6b6b6b
> esi:    edi: 0002   ebp: d7cfdf60   esp: d7cfdf5c
> ds: 007b   es: 007b   ss: 0068
> Process rmmod (pid: 3267, threadinfo=d7cfc000 task=dfc94080)
> Stack: e0afd140 d7cfdfb4 c0146b4d  3535736e d7cf0038 c0169941 b7f43000
>b7f42000 d7cfdfa4 c0169de5 b7f42000 b7f43000 df6a6f44 df6a61fc df17d3a4
>df17d3d4  00cfdfb4 c0169e6a bf856ae0 b7f2917c d7cfc000 c0103889
> Call Trace:
>  [<c010483a>] show_stack+0x7a/0x90
>  [<c01049c6>] show_registers+0x156/0x1c0
>  [<c0104c1c>] die+0x14c/0x2c0
>  [<c0118093>] do_page_fault+0x343/0x655
>  [<c010430f>] error_code+0x4f/0x54
>  [<c0146b4d>] sys_delete_module+0x14d/0x190
>  [<c0103889>] syscall_call+0x7/0xb
> Code: 8b 43 10 e8 98 65 de ff 8b 4b 08 b8 a0 2f 46 c0 89 ca f7 da 23 53 04 e8 
> 64 c7 62 df 89 d8 e8 5d 01 66 df 8b 53 14 8d 42 ec 89 c3 <8b> 40 14 0f 18 00 
> 90 81 fa 20 cf af e0 75 c6 8b 1d c0 d2 af e0
> 

Please try this:

Index: linux-2.6/drivers/input/gameport/ns558.c
===
--- linux-2.6.orig/drivers/input/gameport/ns558.c   2005-07-31 18:10:26.0 +0200
+++ linux-2.6/drivers/input/gameport/ns558.c   2005-08-05 21:20:59.0 +0200
@@ -275,9 +275,9 @@
 
 static void __exit ns558_exit(void)
 {
-	struct ns558 *ns558;
+	struct ns558 *ns558, *safe;
 
-	list_for_each_entry(ns558, &ns558_list, node) {
+	list_for_each_entry_safe(ns558, safe, &ns558_list, node) {
 		gameport_unregister_port(ns558->gameport);
 		release_region(ns558->io & ~(ns558->size - 1), ns558->size);
 		kfree(ns558);
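
Why the _safe variant matters can be sketched in plain C with a toy singly
linked list. struct demo_node, demo_push() and demo_free_all() are
illustrative names only and do not use the kernel's list_head. The plain
iterator reads the next pointer out of the node after the loop body has
already freed it; the safe form saves the next pointer first.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal singly linked list; a stand-in for the kernel's list_head. */
struct demo_node {
	struct demo_node *next;
	int id;
};

/* Push a new node at the head; returns the new head. */
struct demo_node *demo_push(struct demo_node *head, int id)
{
	struct demo_node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->id = id;
	n->next = head;
	return n;
}

/*
 * Free every node and return how many were freed. The next pointer is
 * saved in 'safe' before free(), because reading n->next after free(n)
 * would be a use-after-free.
 */
int demo_free_all(struct demo_node *head)
{
	struct demo_node *n, *safe;
	int freed = 0;

	for (n = head; n != NULL; n = safe) {
		safe = n->next;
		free(n);
		freed++;
	}
	return freed;
}
```

The 6b6b6b6b fault address in the oops fits this pattern: with slab
debugging the freed node is poisoned, and the loop then follows a pointer
read from the poisoned memory.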

Re: x86_64 access of some bad address

2005-08-04 Thread Alexander Nyberg

On Thu, Aug 04, 2005 at 01:15:12PM -0700 Andrew Morton wrote:

> Alexander Nyberg <[EMAIL PROTECTED]> wrote:
> >
> > As I only have one x86_64 which is my main workstation it's far too
> > tedious to do binary searching (this doesn't happen on x86).
> > 
> > Happens with both latest -git and 2.6.12-mm1
> > The tools to reproduce this is at: http://serkiaden.mine.nu/kp2.tar
> > 
> > Just do:
> > gdb lyze
> > run
> > 
> > and it crashes here giving:
> > 
> > --- [cut here ] - [please bite here ] -
> > Kernel BUG at "mm/memory.c":911
> 
> So I think Hugh's patch this morning should fix this up.  Please retest
> -rc6 when it's out?

Maybe I forgot to mention it, but I've already tested it and it works fine.

Re: [patch 2.6.13-rc4] fix get_user_pages bug

2005-08-04 Thread Alexander Nyberg

> >
> >x86_64 had hardcoded the VM_ numbers so it broke down when the numbers
> >were changed.
> >
> 
> Ugh, sorry I should have audited this but I really wasn't expecting
> it (famous last words). Hasn't been a good week for me.

Hardcoding is evil so it's good it gets cleaned up anyway.
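
As a small illustration of why the named constants are more robust, here is
a user-space sketch. The DEMO_FAULT_* values, struct demo_task and
demo_account_fault() are made up for this example; the real VM_FAULT_*
numbers are whatever the kernel headers define. Because the switch matches
on names, renumbering the enum does not change which branch is taken, which
is exactly what broke the hardcoded 0/1/2 cases.

```c
#include <assert.h>

/* Illustrative values only; not the kernel's VM_FAULT_* definitions. */
enum demo_fault {
	DEMO_FAULT_OOM,
	DEMO_FAULT_SIGBUS,
	DEMO_FAULT_MINOR,
	DEMO_FAULT_MAJOR
};

/* Toy per-task fault counters. */
struct demo_task {
	unsigned long min_flt;
	unsigned long maj_flt;
};

/* Returns 0 if the fault was accounted, -1 for SIGBUS, -2 for OOM. */
int demo_account_fault(struct demo_task *tsk, enum demo_fault fault)
{
	assert(tsk);

	switch (fault) {
	case DEMO_FAULT_MINOR:
		tsk->min_flt++;
		return 0;
	case DEMO_FAULT_MAJOR:
		tsk->maj_flt++;
		return 0;
	case DEMO_FAULT_SIGBUS:
		return -1;
	default:
		return -2;
	}
}
```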

> parisc, cris, m68k, frv, sh64, arm26 are also broken.
> Would you mind resending a patch that fixes them all?
> 

Remove the hardcoding in return value checking of handle_mm_fault()

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

 arm26/mm/fault.c  |    6 +++---
 cris/mm/fault.c   |    6 +++---
 frv/mm/fault.c    |    6 +++---
 m68k/mm/fault.c   |    6 +++---
 parisc/mm/fault.c |    6 +++---
 sh64/mm/fault.c   |    6 +++---
 x86_64/mm/fault.c |    6 +++---
 7 files changed, 21 insertions(+), 21 deletions(-)

Index: linux-2.6/arch/x86_64/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/x86_64/mm/fault.c	2005-07-31 18:10:20.0 +0200
+++ linux-2.6/arch/x86_64/mm/fault.c	2005-08-04 16:04:59.0 +0200
@@ -439,13 +439,13 @@
 	 * the fault.
 	 */
 	switch (handle_mm_fault(mm, vma, address, write)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto do_sigbus;
 	default:
 		goto out_of_memory;
Index: linux-2.6/arch/cris/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/cris/mm/fault.c	2005-07-31 18:10:02.0 +0200
+++ linux-2.6/arch/cris/mm/fault.c	2005-08-04 16:40:56.0 +0200
@@ -284,13 +284,13 @@
 	 */
 
 	switch (handle_mm_fault(mm, vma, address, writeaccess & 1)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto do_sigbus;
 	default:
 		goto out_of_memory;
Index: linux-2.6/arch/m68k/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/m68k/mm/fault.c	2005-07-31 18:10:05.0 +0200
+++ linux-2.6/arch/m68k/mm/fault.c	2005-08-04 16:42:05.0 +0200
@@ -160,13 +160,13 @@
 	printk("handle_mm_fault returns %d\n",fault);
 #endif
 	switch (fault) {
-	case 1:
+	case VM_FAULT_MINOR:
 		current->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		current->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto bus_err;
 	default:
 		goto out_of_memory;
Index: linux-2.6/arch/parisc/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/parisc/mm/fault.c	2005-07-31 18:10:11.0 +0200
+++ linux-2.6/arch/parisc/mm/fault.c	2005-08-04 16:41:18.0 +0200
@@ -178,13 +178,13 @@
 	 */
 
 	switch (handle_mm_fault(mm, vma, address, (acc_type & VM_WRITE) != 0)) {
-	      case 1:
+	      case VM_FAULT_MINOR:
 		++current->min_flt;
 		break;
-	      case 2:
+	      case VM_FAULT_MAJOR:
 		++current->maj_flt;
 		break;
-	      case 0:
+	      case VM_FAULT_SIGBUS:
 		/*
 		 * We ran out of memory, or some other thing happened
 		 * to us that made us unable to handle the page fault
Index: linux-2.6/arch/arm26/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/arm26/mm/fault.c	2005-07-31 18:10:00.0 +0200
+++ linux-2.6/arch/arm26/mm/fault.c	2005-08-04 16:46:18.0 +0200
@@ -176,12 +176,12 @@
 	 * Handle the "normal" cases first - successful and sigbus
 	 */
 	switch (fault) {
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		return fault;
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		return fault;
 	}
 
Index: linux-2.6/arch/frv/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/frv/mm/fault.c	2005-07-31 18:10:03.0 +0200
+++ linux-2.6/arch/frv/mm/fault.c	2005-08-04 16:44:02.0 +0200
@@ -163,13 +163,13 @@
 	 * the fault.
 	 */
 	switch (handle_mm_fault(mm, vma, ear0, write)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		current->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		current->maj_flt++;
 		break;
-	case 0

Re: [patch 2.6.13-rc4] fix get_user_pages bug

2005-08-04 Thread Alexander Nyberg

On Wed, Aug 03, 2005 at 09:12:37AM -0700 Linus Torvalds wrote:

> 
> 
> On Wed, 3 Aug 2005, Nick Piggin wrote:
> > 
> > Oh, it gets rid of the -1 for VM_FAULT_OOM. Doesn't seem like there
> > is a good reason for it, but might that break out of tree drivers?
> 
> Ok, I applied this because it was reasonably pretty and I liked the 
> approach. It seems buggy, though, since it was using "switch ()" to test 
> the bits (wrongly, afaik), and I'm going to apply the appended on top of 
> it. Holler quickly if you disagree..
> 

x86_64 had hardcoded the VM_FAULT_* return values of handle_mm_fault(),
so it broke when the values were renumbered.

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: linux-2.6/arch/x86_64/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/x86_64/mm/fault.c	2005-07-31 18:10:20.0 +0200
+++ linux-2.6/arch/x86_64/mm/fault.c	2005-08-04 16:04:59.0 +0200
@@ -439,13 +439,13 @@
 	 * the fault.
 	 */
 	switch (handle_mm_fault(mm, vma, address, write)) {
-	case 1:
+	case VM_FAULT_MINOR:
 		tsk->min_flt++;
 		break;
-	case 2:
+	case VM_FAULT_MAJOR:
 		tsk->maj_flt++;
 		break;
-	case 0:
+	case VM_FAULT_SIGBUS:
 		goto do_sigbus;
 	default:
 		goto out_of_memory;
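
The failure mode this patch fixes can be reproduced outside the kernel. The following is only a minimal sketch: the enum values are illustrative assumptions standing in for a renumbered set of fault codes (the real definitions are in the kernel headers and are not reproduced here), and classify_hardcoded() and classify_named() are hypothetical names.

```c
/* Hypothetical "new" numbering; the values are illustrative only. */
enum fault_code {
	FAULT_OOM    = 0,
	FAULT_SIGBUS = 1,
	FAULT_MINOR  = 2,
	FAULT_MAJOR  = 3,
};

/*
 * Caller written against an assumed older numbering
 * (0 = sigbus, 1 = minor, 2 = major, anything else = oom).
 */
const char *classify_hardcoded(int ret)
{
	switch (ret) {
	case 1:
		return "minor";
	case 2:
		return "major";
	case 0:
		return "sigbus";
	default:
		return "oom";
	}
}

/* Caller that uses the symbolic names and so follows any renumbering. */
const char *classify_named(int ret)
{
	switch (ret) {
	case FAULT_MINOR:
		return "minor";
	case FAULT_MAJOR:
		return "major";
	case FAULT_SIGBUS:
		return "sigbus";
	default:
		return "oom";
	}
}
```

With these assumed values, classify_hardcoded(FAULT_MINOR) reports a major fault and classify_hardcoded(FAULT_SIGBUS) reports a minor one, while classify_named() stays correct. That is the kind of silent misclassification the symbolic case labels avoid.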

Re: Simple question re: oops

2005-07-30 Thread Alexander Nyberg

On Sat, Jul 30, 2005 at 07:48:11PM -0400 Lee Revell wrote:

> I have a machine here that oopses reliably when I start X, but the
> interesting stuff scrolls away too fast, and a bunch more Oopses get
> printed ending with "Aieee, killing interrupt handler".
> 
> How do I get the output to stop after the first Oops?
> 

Set /proc/sys/kernel/panic_on_oops to 1.

What version of the kernel is that? It shouldn't do recursive oopses
(of the same task) any more.
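
For completeness, the setting above can also be applied from a small program rather than from a shell. This is a minimal sketch, not part of the original reply: write_sysctl() is a hypothetical helper name, and writing under /proc/sys requires root.

```c
#include <stdio.h>

/*
 * Write a short string to a sysctl-style file.
 * Returns 0 on success and -1 on any failure.
 */
int write_sysctl(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(value, f) == EOF) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes; a failed flush means the write did not land. */
	return fclose(f) == 0 ? 0 : -1;
}
```

A call such as write_sysctl("/proc/sys/kernel/panic_on_oops", "1\n") should then make the kernel panic at the first oops, so the first trace is the last thing printed.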

Re: Making it easier to find which change introduced a bug

2005-07-30 Thread Alexander Nyberg

> 
> > We need a super-easy way for people to do bisection searching.
> 
>  First step would be to make interdiffs available as quilt patchsets.
> 
>  If we had this for e.g. 2.6.13-rc3 -> rc4 it would make tracking down
> those new bugs much easier.
> 
> (Yes I know git does bisection but Andrew said it should be easy.)
> __

Yeah, I agree. It would be extremely useful and would simplify things for
people who don't have git installed.

Linus, do you think we could have something like
patch-2.6.13-rc4-incremental-broken-out.tar.bz2 that could, like Andrew's
broken-out patches, be placed into patches/ in a tree?

So for example, have a tree with 2.6.13-rc3, download
patch-2.6.13-rc4-incremental-broken-out.tar.bz2, place it in patches/ and 
be able to do quilt push / quilt pop easily.

As it stands today it's easier for us who don't know git to just find
out in which mainline kernel it works and which -mm it doesn't work in,
get the broken-out and start push/pop. And I know I'm not the only one
who has noticed this.

Thanks
Alexander
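
The push/pop search over a broken-out series described above is a plain binary search. The sketch below assumes the tree is good with no patches applied and stays bad once the offending patch is in; first_bad_patch() and bad_after() are hypothetical names, and bad_after() is only a stand-in for "apply the first k patches, build, boot and test".

```c
#include <stddef.h>

/*
 * Return the zero-based index of the first patch that makes the tree bad.
 * is_bad(k, ctx) reports the state of the tree with the first k patches
 * applied (0 <= k <= n). Returns n if the fully patched tree is good.
 */
size_t first_bad_patch(size_t n, int (*is_bad)(size_t applied, void *ctx),
		       void *ctx)
{
	size_t lo = 0;	/* tree is good with lo patches applied */
	size_t hi = n;	/* tree is bad with hi patches applied */

	if (n == 0 || !is_bad(n, ctx))
		return n;

	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (is_bad(mid, ctx))
			hi = mid;
		else
			lo = mid;
	}
	return hi - 1;
}

/* Stand-in test: ctx points at the index of the offending patch. */
int bad_after(size_t applied, void *ctx)
{
	return applied > *(size_t *)ctx;
}
```

For a series of n patches this needs about log2(n) build-and-test rounds, each of which corresponds to a few quilt push or quilt pop steps to reach the midpoint.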

Re: [patch 2/6] mm: micro-optimise rmap

2005-07-27 Thread Alexander Nyberg

[Nick, my mail to you bounced when I sent this privately, so I'm replying
to all this time]

> Index: linux-2.6/mm/rmap.c
> ===
> --- linux-2.6.orig/mm/rmap.c
> +++ linux-2.6/mm/rmap.c
> @@ -442,22 +442,23 @@ int page_referenced(struct page *page, i
>  void page_add_anon_rmap(struct page *page,
>   struct vm_area_struct *vma, unsigned long address)
>  {
> - struct anon_vma *anon_vma = vma->anon_vma;
> - pgoff_t index;
> -
>   BUG_ON(PageReserved(page));
> - BUG_ON(!anon_vma);
>  
>   inc_mm_counter(vma->vm_mm, anon_rss);
>  
> - anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
> - index = (address - vma->vm_start) >> PAGE_SHIFT;
> - index += vma->vm_pgoff;
> - index >>= PAGE_CACHE_SHIFT - PAGE_SHIFT;
> -
>   if (atomic_inc_and_test(&page->_mapcount)) {
> - page->index = index;
> + struct anon_vma *anon_vma = vma->anon_vma;
> + pgoff_t index;
> +
> + BUG_ON(!anon_vma);
> + anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
>   page->mapping = (struct address_space *) anon_vma;
> +
> + index = (address - vma->vm_start) >> PAGE_SHIFT;
> + index += vma->vm_pgoff;
> + index >>= PAGE_CACHE_SHIFT - PAGE_SHIFT;
> + page->index = index;
> +

linear_page_index() here too?
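
To make the suggestion concrete, here is a user-space model of the arithmetic that the three open-coded "index" lines in the quoted hunk perform. It is a sketch under assumptions: struct vma_model and linear_index() are hypothetical stand-ins, the shift values are illustrative, and whether the in-kernel helper computes exactly this should be checked against include/linux/pagemap.h.

```c
/* Illustrative values; in the kernel these come from architecture headers. */
#define PAGE_SHIFT		12
#define PAGE_CACHE_SHIFT	PAGE_SHIFT

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct vma_model {
	unsigned long vm_start;	/* first address of the mapping */
	unsigned long vm_pgoff;	/* offset of the mapping, in pages */
};

/* Same three steps as the open-coded lines in the quoted hunk. */
unsigned long linear_index(const struct vma_model *vma, unsigned long address)
{
	unsigned long pgoff = (address - vma->vm_start) >> PAGE_SHIFT;

	pgoff += vma->vm_pgoff;
	return pgoff >> (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}
```

If a helper already does this, calling it from page_add_anon_rmap() would replace the three open-coded lines with one.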


Re: files_lock deadlock?

2005-07-20 Thread Alexander Nyberg

On Tue, 2005-07-19 at 18:45 +0200, Martin Wilck wrote:
> Hello,
> 
> I apologize in advance if this is a dummy question. My web search turned 
> up nothing, so I'm trying it here.
> 
> We came across the following error message:
> 
> Kernelpanic - not syncing: fs/proc/
> Generic.c:521: spin_lock(fs/file_table.c:80420280)
> Already locked by fs/file_table.c/204
> 
> This shows a locking problem with the files_lock on a UP kernel with 
> spinlock debugging enabled.
> 
> I noticed that files_lock is only protected with spin_lock() 
> (file_list_lock(), include/linux/fs.h). Is it possible that this should 
> be changed to spin_lock_irq() or spin_lock_irqsave()? Or am I missing 
> something obvious?

spin_lock_irqsave is only needed when a lock is taken both in normal
context and in interrupt context. Clearly this lock is not intended to
be taken in interrupt context. 
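
The rule above can be shown with a toy single-CPU model. This is only a user-space sketch with hypothetical names (struct cpu_model and friends), not kernel code: it models one non-recursive lock, one interrupt flag, and an interrupt handler that wants the same lock as process context.

```c
#include <stdbool.h>

/* Toy single-CPU model: one non-recursive lock and one interrupt flag. */
struct cpu_model {
	bool lock_held;		/* the spinlock */
	bool irqs_enabled;	/* local interrupt flag */
	bool deadlocked;	/* handler found the lock held by its own CPU */
};

/* Interrupt handler that takes the same lock as process context. */
static void irq_handler(struct cpu_model *cpu)
{
	if (cpu->lock_held) {
		cpu->deadlocked = true;	/* would spin forever on one CPU */
		return;
	}
	cpu->lock_held = true;
	cpu->lock_held = false;
}

/* Deliver an interrupt only if interrupts are currently enabled. */
static void raise_irq(struct cpu_model *cpu)
{
	if (cpu->irqs_enabled)
		irq_handler(cpu);
}

/* Process context with plain locking: interrupts stay enabled. */
bool critical_section_plain(struct cpu_model *cpu)
{
	cpu->lock_held = true;
	raise_irq(cpu);		/* arrives while the lock is held */
	cpu->lock_held = false;
	return cpu->deadlocked;
}

/* Process context with irqsave-style locking: save, disable, restore. */
bool critical_section_irqsave(struct cpu_model *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	raise_irq(cpu);		/* masked, so the handler does not run */
	cpu->lock_held = false;
	cpu->irqs_enabled = flags;
	return cpu->deadlocked;
}
```

In this model the plain variant deadlocks only because the handler takes the lock. If nothing in interrupt context ever takes it, as is intended for files_lock, plain spin_lock() is sufficient and the reported double lock must come from somewhere else.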

I'll take a look, that spinlock debugging information unfortunately
doesn't give too much info :|


Re: Kernel Bug Report

2005-07-19 Thread Alexander Nyberg

> > It looks like it panics during a memcpy, but I know it's
> > difficult to tell just from the output.
> > 
> > I get a code: f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90 66
> > 
> > The problem appears very reproducible so I can provide more
> > information upon request.
> 
> What does the rest of the panic say? There should be text above this
> that tells where the panic occurred and why. Can you please send that
> here?

OK, could you please try this patch? I'll attach it as well:

From:   Andreas Steinmetz <[EMAIL PROTECTED]>

from include/linux/kernel.h:

#define ALIGN(x,a) (((x)+(a)-1)&~((a)-1))

from crypto/cipher.c:

unsigned int alignmask = ...
u8 *src = (u8 *)ALIGN((unsigned long)buffer, alignmask + 1);
...
unsigned int alignmask = ...
u8 *tmp = (u8 *)ALIGN((unsigned long)buffer, alignmask + 1);
...
unsigned int align;
addr = ALIGN(addr, align);
addr += ALIGN(tfm->__crt_alg->cra_ctxsize, align);

The compiler first does ~((a)-1)) and then expands the unsigned int to
unsigned long for the & operation. So we end up with only the lower 32
bits of the address. Who did smoke what to do this? Patch attached.
-- 
Andreas Steinmetz   SPAMmers use [EMAIL PROTECTED]

--- linux.orig/crypto/cipher.c	2005-07-17 13:35:15.0 +0200
+++ linux/crypto/cipher.c	2005-07-17 14:04:00.0 +0200
@@ -41,7 +41,7 @@
 			       struct scatter_walk *in,
 			       struct scatter_walk *out, unsigned int bsize)
 {
-	unsigned int alignmask = crypto_tfm_alg_alignmask(desc->tfm);
+	unsigned long alignmask = crypto_tfm_alg_alignmask(desc->tfm);
 	u8 buffer[bsize * 2 + alignmask];
 	u8 *src = (u8 *)ALIGN((unsigned long)buffer, alignmask + 1);
 	u8 *dst = src + bsize;
@@ -160,7 +160,7 @@
 		     unsigned int nbytes)
 {
 	struct crypto_tfm *tfm = desc->tfm;
-	unsigned int alignmask = crypto_tfm_alg_alignmask(tfm);
+	unsigned long alignmask = crypto_tfm_alg_alignmask(tfm);
 	u8 *iv = desc->info;
 
 	if (unlikely(((unsigned long)iv & alignmask))) {
@@ -424,7 +424,7 @@
 	}
 
 	if (ops->cit_mode == CRYPTO_TFM_MODE_CBC) {
-		unsigned int align;
+		unsigned long align;
 		unsigned long addr;
 
 		switch (crypto_tfm_alg_blocksize(tfm)) {
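
The truncation Andreas describes can be demonstrated in user space. This is a minimal sketch using fixed-width types to stand in for unsigned int and unsigned long on x86-64; align_with_u32() and align_with_u64() are hypothetical names, and the address in the usage note is made up.

```c
#include <stdint.h>

/* The macro as quoted from include/linux/kernel.h. */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/*
 * Alignment kept in a 32-bit variable, as before the patch.
 * ~((a) - 1) is computed in 32 bits and then zero-extended for the
 * AND, which clears the upper 32 bits of the address.
 */
uint64_t align_with_u32(uint64_t addr, uint32_t alignmask)
{
	return ALIGN(addr, alignmask + 1);
}

/* Alignment kept in a 64-bit variable, as after the patch. */
uint64_t align_with_u64(uint64_t addr, uint64_t alignmask)
{
	return ALIGN(addr, alignmask + 1);
}
```

With an alignmask of 15 and the made-up address 0xffff810012345678, the 32-bit variant returns 0x12345680 and the 64-bit variant returns 0xffff810012345680. Addresses below 4GB come out the same either way, which would explain why only 64-bit machines hit the problem.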

Re: Kernel Bug Report

2005-07-14 Thread Alexander Nyberg

On Thu, 2005-07-14 at 10:10 -0700, Paul Vander Griend wrote:
> System:
> Motherboard = Tyan K8WE
> Processor = 2x Opteron 250
> Memory = 8GB ECC Registered
> 
> On all of the recent release candidates except for
> 2.6.13-rc2-git2 the kernel panics while booting. These
> versions include 2.6.13-rc2-git* (* != 2 ) and 2.6.13-rc3.
> 
> I also want to mention that I am using gcc 3.3.5 on Debian and
> that during compilation there are 3 messages at the end that
> say an assertion has failed, i.e. (LD: assertion failed).

Those are harmless.

> It looks like it panics during a memcpy, but I know it's
> difficult to tell just from the output.
> 
> I get a code: f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90 66
> 
> The problem appears very reproducible so I can provide more
> information upon request.

What does the rest of the panic say? There should be text above this
that tells where the panic occurred and why. Can you please send that
here?

Re: Patch for slab leak debugging

2005-07-09 Thread Alexander Nyberg

> >Yeah I knew there was one, but I thought that was a standalone patch
> >(the one turning all bufctl to unsigned long, turning off irqs and
> >printing all slabs_full to console), my intention with this was a
> >proper /proc entry, something that could be a simple config option.
> >
> >  
> >
> No, I never wrote a proper /proc interface. But I think the bufctl 
> approach is the better solution than storing the first 5 entries in the 
> slab structure:
> What if there is a leak on a cache with more than 5 entries per slab?

As slab leaks usually grow out of control, I think it will be enough to
show what is leaking anyway, but I think you're right about the bufctl
approach. I may have misunderstood the bufctl thing a bit before doing this.

Re: Patch for slab leak debugging

2005-07-09 Thread Alexander Nyberg

On Fri, 2005-07-08 at 16:55 -0700, Andrew Morton wrote:
> Alexander Nyberg <[EMAIL PROTECTED]> wrote:
> >
> > I think we really need an option in the kernel to help users in tracking
> > slab leaks so that they can be brought down easier.
> 
> Well we already have slab-leak-detector.patch, which I appear to have been
> sitting on since 2.6.0-test8.  It fell out of -mm after 2.6.12-rc5-mm2 due
> to various ravaging of slab.c, but could be brought back.
> 
> pc/2.6.12-rc5-mm2-series:slab-leak-detector.patch
> pc/2.6.12-rc5-mm2-series:slab-leak-detector-warning-fixes.patch

Yeah, I knew there was one, but I thought it was a standalone patch
(the one turning all bufctl to unsigned long, turning off irqs and
printing all slabs_full to the console). My intention with this was a
proper /proc entry, something that could be a simple config option.

But if something like this already exists, would you please send me what
you have and I'll fix it up for the NUMA changes.


Re: 2.6.12.2 -- time passes faster; related to the acpi_register_gsi() call

2005-07-08 Thread Alexander Nyberg

On Fri, 2005-07-08 at 23:12 +0200, Rudo Thomas wrote:
> Hello, guys.
> 
> Time started to pass faster with 2.6.12.2 (actually, it was 2.6.12-ck3
> which is based on it). I have isolated the cause of the problem:

I bet you this fixes it (already in mainline)

tree e6a38b3d6bf434f08054562113bb660c4227769f
parent 4a89a04f1ee21a7c1f4413f1ad7dcfac50ff9b63
author Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700
committer Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700

If ACPI doesn't find an irq listed, don't accept 0 as a valid PCI irq.

That zero just means that nothing else found any irq information either.

 drivers/acpi/pci_irq.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -433,7 +433,7 @@ acpi_pci_irq_enable (
printk(KERN_WARNING PREFIX "PCI Interrupt %s[%c]: no GSI",
pci_name(dev), ('A' + pin));
/* Interrupt Line values above 0xF are forbidden */
-   if (dev->irq >= 0 && (dev->irq <= 0xF)) {
+   if (dev->irq > 0 && (dev->irq <= 0xF)) {
printk(" - using IRQ %d\n", dev->irq);
acpi_register_gsi(dev->irq, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW);
return_VALUE(0);
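
The effect of the one-character change can be illustrated outside the kernel. This is only a sketch: legacy_irq_usable_old() and legacy_irq_usable_new() are hypothetical helper names wrapping the two conditions from the diff, and the irq value is assumed to be an unsigned int.

```c
/*
 * Old condition from the diff. With an unsigned irq (an assumption
 * of this sketch) the ">= 0" half is always true, so IRQ 0 passes.
 */
static int legacy_irq_usable_old(unsigned int irq)
{
    return irq >= 0 && irq <= 0xF;
}

/* New condition: 0 means "no irq information found" and is rejected. */
static int legacy_irq_usable_new(unsigned int irq)
{
    return irq > 0 && irq <= 0xF;
}
```

With the old check an unset interrupt line (0) would be registered as a GSI; with the new one only values 1 through 0xF are accepted.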



Patch for slab leak debugging

2005-07-08 Thread Alexander Nyberg

I think we really need an option in the kernel to help users in tracking
slab leaks so that they can be brought down easier. This patch tracks
the caller of the first five objects to be created within a slab. This
is not much but as slab leaks normally are quite obvious with the
exception that we don't know who the caller is, I think this approach
will do fine.
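
The bookkeeping described above can be sketched in user space as follows. This is an illustration only, not the kernel code: struct slab_sketch, slab_sketch_init(), record_owner() and OWNER_SLOTS are hypothetical names, and the caller address is passed in as a plain integer.

```c
#define OWNER_SLOTS 5   /* the patch keeps five caller addresses per slab */

/* Hypothetical stand-in for the fields the patch adds to struct slab. */
struct slab_sketch {
    short owner_idx;                    /* next free slot in owner[] */
    unsigned long owner[OWNER_SLOTS];   /* callers of the first allocations */
};

/* Reset the tracking state when a slab is set up. */
static void slab_sketch_init(struct slab_sketch *slabp)
{
    slabp->owner_idx = 0;
}

/*
 * Remember the caller of an allocation, but only until the slots are
 * full; later allocations from the same slab are not recorded.
 * Returns 1 if the caller was stored, 0 if it was dropped.
 */
static int record_owner(struct slab_sketch *slabp, unsigned long caller)
{
    if (slabp->owner_idx >= OWNER_SLOTS)
        return 0;
    slabp->owner[slabp->owner_idx++] = caller;
    return 1;
}
```

Because only the first five callers per slab are kept, the output is a sample rather than a full record, which matches the stated goal of giving a hint rather than exact accounting.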

No NUMA handling, only looks at nodelists[0] at the moment

list_ff() is distasteful, but I've yet to come up with a better approach
that doesn't screw up the slab core too much. I've not seen overly large
latencies even with 7M size-32 objects; with that many objects it took
around 1 minute to cat /proc/slab_owner > meepmeep.txt on a 1.2GHz
Athlon. We could even limit the size of the output as it'll be pretty
repetitive anyway.
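
For reference, the fast-forward helper from the patch behaves like this on a circular doubly linked list. The sketch is user-space C with a hypothetical minimal struct node in place of the kernel's struct list_head; node_init() and node_add_tail() are illustrative helpers, and only the body of list_ff() mirrors the patch.

```c
#include <stddef.h>

/* Minimal circular list node; a stand-in for the kernel's struct list_head. */
struct node {
    struct node *next, *prev;
};

/* Make 'head' an empty circular list. */
static void node_init(struct node *head)
{
    head->next = head->prev = head;
}

/* Append 'item' just before 'head', i.e. at the tail of the list. */
static void node_add_tail(struct node *item, struct node *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/*
 * Skip 'n' elements, starting from the first entry after 'start'.
 * Returns NULL when the walk wraps around to the list head, which
 * tells the caller that there are no more entries.
 */
static struct node *list_ff(struct node *start, int n)
{
    struct node *pos = start->next;
    int i;

    for (i = 0; i < n; i++) {
        pos = pos->next;
        if (pos == start)
            return NULL;
    }
    return pos;
}
```

The walk always restarts from the list head, so skipping to entry n costs n pointer hops each time it is called.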

To use it, look at /proc/slabinfo to identify the cache that looks to
have leaking callers. Then run:

  echo cachename > /proc/slab_owner
  cat /proc/slab_owner > unsorted_slab_owner

Although glancing at this file will likely reveal the leaking caller,
there's a user-space program called slab_owner.c in Documentation/
to help sort the output in the same manner as page_owner

Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: akpm/lib/Kconfig.debug
===
--- akpm.orig/lib/Kconfig.debug 2005-07-08 22:49:18.0 +0200
+++ akpm/lib/Kconfig.debug  2005-07-08 22:49:27.0 +0200
@@ -85,6 +85,14 @@
  allocation as well as poisoning memory on free to catch use of freed
  memory. This can make kmalloc/kfree-intensive workloads much slower.
 
+config SLAB_OWNER
+   bool "Track owner of slab objects"
+   depends on DEBUG_KERNEL && DEBUG_SLAB
+   help
+ Say Y here to make the kernel keep track of some of the functions 
+ allocating slab objects. Expensive, should only be used to track 
+ down slab leaks.
+
 config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
Index: akpm/mm/slab.c
===
--- akpm.orig/mm/slab.c 2005-07-08 22:49:18.0 +0200
+++ akpm/mm/slab.c  2005-07-08 22:49:27.0 +0200
@@ -222,6 +222,10 @@
unsigned int inuse;  /* num of objs active in slab */
kmem_bufctl_t   free;
unsigned short  nodeid;
+#ifdef CONFIG_SLAB_OWNER
+   short   owner_idx;
+   unsigned long   owner[5];
+#endif
 };
 
 /*
@@ -2062,7 +2066,9 @@
slabp->inuse = 0;
slabp->colouroff = colour_off;
slabp->s_mem = objp+colour_off;
-
+#ifdef CONFIG_SLAB_OWNER
+   slabp->owner_idx = 0;
+#endif
return slabp;
 }
 
@@ -2502,6 +2508,13 @@
 
cachep->ctor(objp, cachep, ctor_flags);
}   
+#ifdef CONFIG_SLAB_OWNER
+   {
+   struct slab *slabp = GET_PAGE_SLAB(virt_to_page(objp));
+   if (slabp->owner_idx < 5)
+   slabp->owner[slabp->owner_idx++] = (unsigned long) caller;
+   }
+#endif
return objp;
 }
 #else
@@ -3604,3 +3617,131 @@
return buf;
 }
 EXPORT_SYMBOL(kstrdup);
+
+#ifdef CONFIG_SLAB_OWNER
+/* The slab_owner mechanism doesn't aim to be accurate, merely to give
+ * a (big) hint as to what caller is allocating objects but not releasing
+ * them. In almost every case this will be quite obvious even with only
+ * 5 caller addresses per slab saved.
+ */
+static char slab_owner_name[32];
+static unsigned long saved_addr[40];
+
+/* list fast forward 'n' elements */
+static struct list_head *list_ff(struct list_head *start, int n)
+{
+   int i;
+   struct list_head *list = start->next;
+   
+   for (i = 0; i < n; i++) {
+   list = list->next;
+   if (list == start)
+   return NULL;
+   }
+
+   return list;
+}
+
+static ssize_t
+read_slab_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
+{
+   char *modname;
+   int ret = 0, x, hit = 0;
+   char namebuf[KSYM_NAME_LEN];
+   unsigned long offset = 0, symsize;
+   kmem_cache_t *kcache;
+   struct list_head *start;
+   struct kmem_list3 *rl3;
+   char *page = NULL;
+
+   down(&cache_chain_sem);
+   list_for_each_entry(kcache, &cache_cache.next, next) {
+   if (!strcmp(kcache->name, slab_owner_name)) {
+   /* This way we'll just have to look at one element */
+   list_move(&kcache->next, &cache_cache.next);
+   hit = 1;
+   break;
+   }
+   }
+
+   if (!hit) {
+   ret = -ENOENT;
+   goto out_sem;
+   }
+
+   page = (char *) __get_free_page(GFP_KERNEL);
+   if (!page) {
+   ret = -ENOMEM;
+   goto out_sem;
+   }
+
+   rl3 = kcache->nodelists[0];
+   spin_lock_irq(&rl3->list_lock);
+   start


Re: How to debug the kernel for X86_64 SMP?

2005-07-06 Thread Alexander Nyberg

On Tue, 2005-07-05 at 22:58 +0800, Neo Jia wrote:
> All,
> 
> These days, I am trying to debug the kernel (2.6.9) on x86_64 SMP. But 
> Kprobes and UML do not work properly in my case, due to the patch 
> file for the x86_64 arch.
> 
> Is there anyone who is working on the same topic? Any hint and help 
> would be appreciated!

You should explain more carefully what you are trying to debug


Re: If ACPI doesn't find an irq listed, don't accept 0 as a valid PCI irq.

2005-07-04 Thread Alexander Nyberg

> tree e6a38b3d6bf434f08054562113bb660c4227769f
> parent 4a89a04f1ee21a7c1f4413f1ad7dcfac50ff9b63
> author Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700
> committer Linus Torvalds <[EMAIL PROTECTED]> Sun, 03 Jul 2005 00:35:33 -0700
> 
> If ACPI doesn't find an irq listed, don't accept 0 as a valid PCI irq.
> 
> That zero just means that nothing else found any irq information either.
> 
>  drivers/acpi/pci_irq.c |2 +-
>  1 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -433,7 +433,7 @@ acpi_pci_irq_enable (
>   printk(KERN_WARNING PREFIX "PCI Interrupt %s[%c]: no GSI",
>   pci_name(dev), ('A' + pin));
>   /* Interrupt Line values above 0xF are forbidden */
> - if (dev->irq >= 0 && (dev->irq <= 0xF)) {
> + if (dev->irq > 0 && (dev->irq <= 0xF)) {
>   printk(" - using IRQ %d\n", dev->irq);
>   acpi_register_gsi(dev->irq, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW);
>   return_VALUE(0);

Could this go into -stable, please? I've had it confirmed that it fixes:
http://bugme.osdl.org/show_bug.cgi?id=4824

which was introduced in -stable 2.6.12.2.


Re: x86_64: Bug in new out of line put_user()

2005-04-22 Thread Alexander Nyberg

Brian, thanks for seeing this. (me goes hiding...)

After the last put_user patch the labels were misplaced, so
exceptions on the actual mov instructions would not be handled.

Index: test/arch/x86_64/lib/putuser.S
===
--- test.orig/arch/x86_64/lib/putuser.S 2005-04-22 10:04:25.0 +0200
+++ test/arch/x86_64/lib/putuser.S  2005-04-22 10:06:29.0 +0200
@@ -49,8 +49,8 @@
jc 20f
cmpq threadinfo_addr_limit(%r8),%rcx
jae 20f
-2: decq %rcx
-   movw %dx,(%rcx)
+   decq %rcx
+2: movw %dx,(%rcx)
xorl %eax,%eax
ret
 20:decq %rcx
@@ -64,8 +64,8 @@
jc 30f
cmpq threadinfo_addr_limit(%r8),%rcx
jae 30f
-3: subq $3,%rcx
-   movl %edx,(%rcx)
+   subq $3,%rcx
+3: movl %edx,(%rcx)
xorl %eax,%eax
ret
 30:subq $3,%rcx
@@ -79,8 +79,8 @@
jc 40f
cmpq threadinfo_addr_limit(%r8),%rcx
jae 40f
-4: subq $7,%rcx
-   movq %rdx,(%rcx)
+   subq $7,%rcx
+4: movq %rdx,(%rcx)
xorl %eax,%eax
ret
 40:subq $7,%rcx
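
Why the label position matters: the fixup lookup is keyed on the address of the faulting instruction, so a label that sits one instruction too early never matches. A rough user-space model is sketched below. The names struct ex_entry and find_fixup() are hypothetical, the addresses used with it are made up, and the linear search is a simplification.

```c
#include <stddef.h>

/* One (faulting instruction, fixup handler) pair, as in an exception table. */
struct ex_entry {
    unsigned long insn;   /* address of the instruction allowed to fault */
    unsigned long fixup;  /* where to continue if it does */
};

/* Return the fixup address for 'fault_ip', or 0 if no entry matches it. */
static unsigned long find_fixup(const struct ex_entry *tbl, size_t n,
                                unsigned long fault_ip)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tbl[i].insn == fault_ip)
            return tbl[i].fixup;
    return 0;
}
```

If the table entry points at the decq (say address 0x100) while the fault happens on the following movw (say 0x103), find_fixup() returns 0 and the fault goes unhandled; moving the label onto the mov makes the two addresses agree.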



[PATCH] x86_64: i8259.c trivial iso99 structure initialization

2005-04-21 Thread Alexander Nyberg

Trivial conversion to ISO C99 designated structure initializers.


Index: test/arch/x86_64/kernel/i8259.c
===
--- test.orig/arch/x86_64/kernel/i8259.c 2005-04-20 22:29:02.0 +0200
+++ test/arch/x86_64/kernel/i8259.c 2005-04-22 00:16:22.0 +0200
@@ -158,14 +158,13 @@
 }
 
 static struct hw_interrupt_type i8259A_irq_type = {
-   "XT-PIC",
-   startup_8259A_irq,
-   shutdown_8259A_irq,
-   enable_8259A_irq,
-   disable_8259A_irq,
-   mask_and_ack_8259A,
-   end_8259A_irq,
-   NULL
+   .typename = "XT-PIC",
+   .startup = startup_8259A_irq,
+   .shutdown = shutdown_8259A_irq,
+   .enable = enable_8259A_irq,
+   .disable = disable_8259A_irq,
+   .ack = mask_and_ack_8259A,
+   .end = end_8259A_irq,
 };
 
 /*
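
As a small illustration of what the conversion buys, here is the same idea on a made-up structure. The names struct irq_ops_sketch, demo_startup() and demo_ack() are hypothetical and are not the kernel's hw_interrupt_type.

```c
#include <stddef.h>

/* Hypothetical table of callbacks, loosely modelled on an irq type. */
struct irq_ops_sketch {
    const char *name;
    unsigned int (*startup)(unsigned int irq);
    void (*ack)(unsigned int irq);
    void (*set_affinity)(unsigned int irq, unsigned long mask);
};

static unsigned int demo_startup(unsigned int irq) { return irq; }
static void demo_ack(unsigned int irq) { (void)irq; }

/* Positional form: depends on member order and needs an explicit NULL. */
static const struct irq_ops_sketch positional = {
    "demo",
    demo_startup,
    demo_ack,
    NULL
};

/* Designated form: order-independent; unnamed members are zeroed. */
static const struct irq_ops_sketch designated = {
    .name = "demo",
    .startup = demo_startup,
    .ack = demo_ack,
};
```

Both objects end up with identical contents, but the designated form keeps working if members are later added to or reordered in the structure.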



x86_64: Bug in new out of line put_user()

2005-04-20 Thread Alexander Nyberg

The new out-of-line put_user() assembly on x86_64 changes %rcx without
telling GCC about it, causing things like:

http://bugme.osdl.org/show_bug.cgi?id=4515

Make sure that %rcx is left unchanged (this also makes it consistent with
get_user()).


Signed-off-by: Alexander Nyberg <[EMAIL PROTECTED]>

Index: test/arch/x86_64/lib/getuser.S
===
--- test.orig/arch/x86_64/lib/getuser.S 2005-04-20 23:55:35.0 +0200
+++ test/arch/x86_64/lib/getuser.S  2005-04-21 00:54:16.0 +0200
@@ -78,9 +78,9 @@
 __get_user_8:
GET_THREAD_INFO(%r8)
addq $7,%rcx
-   jc bad_get_user
+   jc 40f
cmpq threadinfo_addr_limit(%r8),%rcx
-   jae bad_get_user
+   jae 40f
subq$7,%rcx
 4: movq (%rcx),%rdx
xorl %eax,%eax
Index: test/arch/x86_64/lib/putuser.S
===
--- test.orig/arch/x86_64/lib/putuser.S 2005-04-21 00:50:24.0 +0200
+++ test/arch/x86_64/lib/putuser.S  2005-04-21 01:02:15.0 +0200
@@ -46,36 +46,45 @@
 __put_user_2:
GET_THREAD_INFO(%r8)
addq $1,%rcx
-   jc bad_put_user
+   jc 20f
cmpq threadinfo_addr_limit(%r8),%rcx
-   jae  bad_put_user
-2: movw %dx,-1(%rcx)
+   jae 20f
+2: decq %rcx
+   movw %dx,(%rcx)
xorl %eax,%eax
ret
+20:decq %rcx
+   jmp bad_put_user
 
.p2align 4
 .globl __put_user_4
 __put_user_4:
GET_THREAD_INFO(%r8)
addq $3,%rcx
-   jc bad_put_user
+   jc 30f
cmpq threadinfo_addr_limit(%r8),%rcx
-   jae bad_put_user
-3: movl %edx,-3(%rcx)
+   jae 30f
+3: subq $3,%rcx
+   movl %edx,(%rcx)
xorl %eax,%eax
ret
+30:subq $3,%rcx
+   jmp bad_put_user
 
.p2align 4
 .globl __put_user_8
 __put_user_8:
GET_THREAD_INFO(%r8)
addq $7,%rcx
-   jc bad_put_user
+   jc 40f
cmpq threadinfo_addr_limit(%r8),%rcx
-   jae bad_put_user
-4: movq %rdx,-7(%rcx)
+   jae 40f
+4: subq $7,%rcx
+   movq %rdx,(%rcx)
xorl %eax,%eax
ret
+40:subq $7,%rcx
+   jmp bad_put_user
 
 bad_put_user:
movq $(-EFAULT),%rax
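
The check that each __put_user_N entry point performs (add size-1, test for carry, compare against the address limit) can be written in C roughly as below. This is a sketch with a hypothetical function name, user_range_ok(), and it assumes size is at least 1; it is not how the kernel expresses the check.

```c
#include <stdint.h>

/*
 * Return 1 if the 'size' bytes starting at 'addr' lie entirely below
 * 'limit', 0 otherwise. 'addr' itself is never modified, which is the
 * property the patch restores for %rcx.
 */
static int user_range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
    uint64_t last = addr + (size - 1);   /* address of the last byte */

    if (last < addr)        /* wrapped around: the "jc" case */
        return 0;
    if (last >= limit)      /* at or above the limit: the "jae" case */
        return 0;
    return 1;
}
```

Working on a copy of the address sidesteps the problem the patch fixes in assembly by undoing the addition on every path.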



Re: 2.6.12-rc2-mm3 regression - certain applications get SIGSEGV but are fine with 2.6.12-rc2-mm2

2005-04-19 Thread Alexander Nyberg

On Tue, 2005-04-19 at 11:33 +0200, Jesper Juhl wrote:
> Everything is fine with 2.6.12-rc2, 2.6.12-rc2-mm1, 2.6.12-rc2-mm2 & 
> earlier kernels as well, but 2.6.12-rc2-mm3 seems to have a problem.
> I don't know what's causing this, all I can do at the moment is describe 
> the symptoms.
> 
> Certain applications (krootimage and ksplash from KDE 3.4 are 100% 
> reproducible test cases) that used to run fine have started crashing with 
> SIGSEGV on 2.6.12-rc2-mm3. I see nothing suspicious in dmesg.
> I'm including dmesg output as well as strace output from krootimage and 
> ksplash below.
> If someone could give me a hint as to what the cause of this could be or 
> what to try in order to track it down I'd appreciate it.
> This is 100% reproducible.

Try backing out
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/broken-out/sched-unlocked-context-switches.patch


Re: 2.6.12-rc2-mm3

2005-04-18 Thread Alexander Nyberg

On Mon, 2005-04-18 at 13:14 +0200, Arjan van de Ven wrote:
> On Mon, 2005-04-18 at 13:05 +0200, Alexander Nyberg wrote:
> > [Proper patch now that goes all the way, sorry for spamming]
> > 
> > Patch below uses RETIRED_UOPS for a more constant rate of NMI sending.
> > This makes x64 deliver NMI interrupts every fourth second at a constant
> > rate when going through the local apic. Makes both cpus on my box to get
> > NMIs at constant rate that it previously did not, there could be long
> > delays when a CPU was idle.
> 
> 
> isn't this dangerous in the light of the mobile cpus that either scale
> back or stop entirely in idle or lower load situations ?
> 

I don't see any real problem. At each nmi_watchdog_tick() the next NMI
is calculated taking cpu_khz into account, so the NMIs might not come at
a constant rate while the frequency is being scaled, but over time there
will still be one every fourth second.

And if the CPU stops entirely, as you say, are any uops even run? Even
if so, the current watchdog would also have a few events accounted to it
and could fire an NMI as well.


Re: Need some help to debug a freeze on 2.6.11

2005-04-18 Thread Alexander Nyberg

> > > > Sounds like a job for Documentation/networking/netconsole.txt
> > > >
> > > or Documentation/serial-console.txt
> > >
> > Console on line printer would also be an option.
> 
> I don't have any printer port cables, so I guess I prefer to try netconsole.
> 
> I'm using wireless lan (Intel's ipw2100), would netconsole work on
> wlan interface?

Not sure, can't comment on it...

> As an alternative, can I configure netconsole for my ethernet port and
> only really connect it, after I get the freeze?

Yep, this will work well.


Re: 2.6.12-rc2-mm3

2005-04-18 Thread Alexander Nyberg

[Proper patch now that goes all the way, sorry for spamming]

Patch below uses RETIRED_UOPS for a more constant rate of NMI sending.
This makes x64 deliver NMI interrupts every fourth second at a constant
rate when going through the local APIC. Both CPUs on my box now get NMIs
at a constant rate, which they previously did not; there could be long
delays when a CPU was idle.

This fixes the misdetection in check_nmi_watchdog(), which thought the
NMI sending was stuck although it was not, because the perfctr did not
generate enough events with the previous mask. The 10-second
check_nmi_watchdog() delay is down to 10 msec now.

Tested on opteron SMP.


Index: x64_mm/arch/x86_64/kernel/nmi.c
===================================================================
--- x64_mm.orig/arch/x86_64/kernel/nmi.c	2005-04-18 12:56:05.0 +0200
+++ x64_mm/arch/x86_64/kernel/nmi.c	2005-04-18 14:47:14.0 +0200
@@ -59,16 +59,14 @@
 
 unsigned int nmi_watchdog = NMI_DEFAULT;
 static unsigned int nmi_hz = HZ;
+static int nmi_mult = 1;	/* nmi multiplier for longer intervals */
 unsigned int nmi_perfctr_msr;	/* the MSR to reset in NMI handler */
 
-/* Note that these events don't tick when the CPU idles. This means
-   the frequency varies with CPU load. */
-
 #define K7_EVNTSEL_ENABLE	(1 << 22)
 #define K7_EVNTSEL_INT		(1 << 20)
 #define K7_EVNTSEL_OS		(1 << 17)
 #define K7_EVNTSEL_USR		(1 << 16)
-#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
+#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0xC1 /* Retired uops */
 #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING
 
 #define P6_EVNTSEL0_ENABLE	(1 << 22)
@@ -78,6 +76,11 @@
 #define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
 #define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED
 
+static inline unsigned long nmi_interval(void)
+{
+	return ((unsigned long)cpu_khz * 1000 * nmi_mult) / nmi_hz;
+}
+
 /* Run after command line and cpu_init init, but before all other checks */
 void __init nmi_watchdog_default(void)
 {
@@ -146,8 +149,10 @@
 
 	/* now that we know it works we can reduce NMI frequency to
 	   something more reasonable; makes a difference in some configs */
-	if (nmi_watchdog == NMI_LOCAL_APIC)
+	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		nmi_hz = 1;
+		nmi_mult = 8;
+	}
 
 	return 0;
 }
@@ -305,9 +310,6 @@
 	int i;
 	unsigned int evntsel;
 
-	/* No check, so can start with slow frequency */
-	nmi_hz = 1; 
-
 	/* XXX should check these in EFER */
 
 	nmi_perfctr_msr = MSR_K7_PERFCTR0;
@@ -325,7 +327,7 @@
 		| K7_NMI_EVENT;
 
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
-	wrmsrl(MSR_K7_PERFCTR0, -((u64)cpu_khz*1000) / nmi_hz);
+	wrmsrl(MSR_K7_PERFCTR0, -nmi_interval());
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= K7_EVNTSEL_ENABLE;
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
@@ -393,10 +395,10 @@
 	if (last_irq_sums[cpu] == sum) {
 		/*
 		 * Ayiee, looks like this CPU is stuck ...
-		 * wait a few IRQs (5 seconds) before doing the oops ...
+		 * wait a few NMIs before doing the oops ...
 		 */
 		alert_counter[cpu]++;
-		if (alert_counter[cpu] == 5*nmi_hz) {
+		if (alert_counter[cpu] == 3*nmi_hz) {
 			if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
 							== NOTIFY_STOP) {
 				alert_counter[cpu] = 0; 
@@ -409,7 +411,7 @@
 		alert_counter[cpu] = 0;
 	}
 	if (nmi_perfctr_msr)
-		wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1);
+		wrmsr(nmi_perfctr_msr, -nmi_interval(), -1);
 }
 
 static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
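
The drop from 10 seconds to 10 msec mentioned in the description follows
from the delay expression mdelay((10*1000)/nmi_hz) used by
check_nmi_watchdog(). A minimal sketch of that arithmetic, assuming
HZ is 1000 (an assumption, though it is the value that matches the
10 msec figure); check_delay_ms() is a hypothetical name used only here:

```c
/*
 * Milliseconds check_nmi_watchdog() waits: ten NMI periods at nmi_hz
 * NMIs per second, i.e. the argument passed to mdelay().
 */
static unsigned int check_delay_ms(unsigned int nmi_hz)
{
	return (10 * 1000) / nmi_hz;
}
```

With nmi_hz forced to 1 before the check this gives 10000 ms. With nmi_hz
left at an assumed HZ of 1000 until the check has passed, it gives 10 ms.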



Re: Need some help to debug a freeze on 2.6.11

2005-04-18 Thread Alexander Nyberg

> I'm running Linux on my laptop and it sometimes freezes (about once a
> week). The only thing which seems to work when it's stuck is SysRq (I
> can reboot with SysRq+O), however, I'm in X and I don't have a serial
> port on my laptop so I can't see any of the outputs of the SysRq
> options.
> 
> After a reboot I don't see anything in my logs about the crash.
> 
> Can anyone suggest how to get some information about my freeze?

Sounds like a job for Documentation/networking/netconsole.txt



started by Ingo Molnar <[EMAIL PROTECTED]>, 2001.09.17
2.6 port and netpoll api by Matt Mackall <[EMAIL PROTECTED]>, Sep 9 2003

Please send bug reports to Matt Mackall <[EMAIL PROTECTED]>

This module logs kernel printk messages over UDP allowing debugging of
problem where disk logging fails and serial consoles are impractical.

It can be used either built-in or as a module. As a built-in,
netconsole initializes immediately after NIC cards and will bring up
the specified interface as soon as possible. While this doesn't allow
capture of early kernel panics, it does capture most of the boot
process.

It takes a string configuration parameter "netconsole" in the
following format:


 netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]

   where
        src-port      source for UDP packets (defaults to 6665)
        src-ip        source IP to use (interface address)
        dev           network interface (eth0)
        tgt-port      port for logging agent ()
        tgt-ip        IP address for logging agent
        tgt-macaddr   ethernet MAC address for logging agent (broadcast)

Examples:

 linux [EMAIL PROTECTED]/eth1,[EMAIL PROTECTED]/12:34:56:78:9a:bc

  or

 insmod netconsole netconsole=@/,@10.0.0.2/

Built-in netconsole starts immediately after the TCP stack is
initialized and attempts to bring up the supplied dev at the supplied
address.

The remote host can run either 'netcat -u -l -p <port>' or syslogd.

WARNING: the default target ethernet setting uses the broadcast
ethernet address to send packets, which can cause increased load on
other systems on the same ethernet segment.

NOTE: the network device (eth1 in the above case) can run any kind
of other network traffic, netconsole is not intrusive. Netconsole
might cause slight delays in other traffic if the volume of kernel
messages is high, but should have no other impact.

Netconsole was designed to be as instantaneous as possible, to
enable the logging of even the most critical kernel bugs. It works
from IRQ contexts as well, and does not enable interrupts while
sending packets. Due to these unique needs, configuration can not
be more automatic, and some fundamental limitations will remain:
only IP networks, UDP packets and ethernet devices are supported.
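
As a small illustration of the parameter format described above, the
sketch below assembles a netconsole= string from its fields. It assumes
the layout netconsole=src-port@src-ip/dev,tgt-port@tgt-ip/tgt-macaddr,
with every field except tgt-ip optional and left empty when omitted, as
in the insmod example. The function names are hypothetical and exist only
for this example; nothing here is part of the kernel.

```c
#include <stdio.h>

/* Treat a missing (NULL) optional field as an empty string. */
static const char *or_empty(const char *s)
{
	return s ? s : "";
}

/*
 * Hypothetical helper: write the netconsole= parameter into buf.
 * Returns snprintf()'s result, i.e. the length the full string needs.
 */
static int netconsole_param(char *buf, size_t len,
			    const char *src_port, const char *src_ip,
			    const char *dev, const char *tgt_port,
			    const char *tgt_ip, const char *tgt_mac)
{
	return snprintf(buf, len, "netconsole=%s@%s/%s,%s@%s/%s",
			or_empty(src_port), or_empty(src_ip), or_empty(dev),
			or_empty(tgt_port), or_empty(tgt_ip),
			or_empty(tgt_mac));
}
```

Passing only "10.0.0.2" as tgt_ip and NULL for everything else yields the
same string as the insmod example above.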



Re: 2.6.12-rc2-mm3

2005-04-18 Thread Alexander Nyberg

> >This patch fixes the NMI checking problems in -mm x64 for me. It 
> 
> What problems?
> 

Sorry. In -mm on x64, check_nmi_watchdog() is now run as a
late_initcall(). It currently reports the NMIs as stuck on a few systems
although they are not; both of mine are reported as stuck. This appears
to be because the event selected by the current event mask doesn't tick
much while running mdelay() on Opteron (in my case). Also, because nmi_hz
is set to 1 in setup_k7_watchdog() in -mm, the NMI watchdog check takes
10 seconds, which is a bit much.

The patch below uses RETIRED_UOPS for a more constant rate of NMI
sending; this works well for me. However, I'd like the NMIs to fire every
fourth second or so. Using nmi_mult to multiply nmi_interval() by 4
doesn't make them arrive every fourth second, though; it is more like
every 1.5 seconds. I'm puzzled about this...


Index: x64_mm/arch/x86_64/kernel/nmi.c
===================================================================
--- x64_mm.orig/arch/x86_64/kernel/nmi.c	2005-04-18 12:56:05.0 +0200
+++ x64_mm/arch/x86_64/kernel/nmi.c	2005-04-18 13:34:37.0 +0200
@@ -59,6 +59,7 @@
 
 unsigned int nmi_watchdog = NMI_DEFAULT;
 static unsigned int nmi_hz = HZ;
+static int nmi_mult = 1;	/* nmi multiplier, how many seconds inbetween */
 unsigned int nmi_perfctr_msr;	/* the MSR to reset in NMI handler */
 
 /* Note that these events don't tick when the CPU idles. This means
@@ -68,7 +69,7 @@
 #define K7_EVNTSEL_INT		(1 << 20)
 #define K7_EVNTSEL_OS		(1 << 17)
 #define K7_EVNTSEL_USR		(1 << 16)
-#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
+#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0xC1 /* Retired uops */
 #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING
 
 #define P6_EVNTSEL0_ENABLE	(1 << 22)
@@ -78,6 +79,11 @@
 #define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
 #define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED
 
+static inline unsigned long nmi_interval(void)
+{
+	return ((unsigned long)cpu_khz * 1000 * nmi_mult) / nmi_hz;
+}
+
 /* Run after command line and cpu_init init, but before all other checks */
 void __init nmi_watchdog_default(void)
 {
@@ -146,8 +152,10 @@
 
 	/* now that we know it works we can reduce NMI frequency to
 	   something more reasonable; makes a difference in some configs */
-	if (nmi_watchdog == NMI_LOCAL_APIC)
+	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		nmi_hz = 1;
+		nmi_mult = 4;
+	}
 
 	return 0;
 }
@@ -305,9 +313,6 @@
 	int i;
 	unsigned int evntsel;
 
-	/* No check, so can start with slow frequency */
-	nmi_hz = 1; 
-
 	/* XXX should check these in EFER */
 
 	nmi_perfctr_msr = MSR_K7_PERFCTR0;
@@ -325,7 +330,7 @@
 		| K7_NMI_EVENT;
 
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
-	wrmsrl(MSR_K7_PERFCTR0, -((u64)cpu_khz*1000) / nmi_hz);
+	wrmsrl(MSR_K7_PERFCTR0, -nmi_interval());
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= K7_EVNTSEL_ENABLE;
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
@@ -409,7 +414,7 @@
 		alert_counter[cpu] = 0;
 	}
 	if (nmi_perfctr_msr)
-		wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1);
+		wrmsr(nmi_perfctr_msr, -nmi_interval(), -1);
 }
 
 static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
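
One thing that might be worth checking for the "every 1.5 seconds"
puzzle (this is a guess, nothing in the thread confirms it): the reload
in nmi_watchdog_tick() goes through wrmsr(), which takes the low and high
32-bit halves as separate arguments, so only the low 32 bits of
-nmi_interval() reach the counter. The user-space sketch below shows what
that would do to the count. The 2 GHz clock is an assumed example value
and events_after_split_write() is a hypothetical helper, not kernel code.

```c
#include <stdint.h>

/* Events to count before the next NMI; mirrors nmi_interval() in the patch. */
static uint64_t nmi_interval(uint64_t cpu_khz, unsigned int nmi_mult,
			     unsigned int nmi_hz)
{
	return cpu_khz * 1000 * nmi_mult / nmi_hz;
}

/*
 * Hypothetical helper: events counted before overflow when the counter
 * is loaded the way wrmsr(msr, -interval, -1) would load it, i.e. the
 * low half truncated to 32 bits and the high half all ones. The real
 * counter is narrower than 64 bits; that does not change the result.
 */
static uint64_t events_after_split_write(uint64_t interval)
{
	uint32_t lo = (uint32_t)(0 - interval);
	uint32_t hi = 0xffffffffu;
	uint64_t counter = ((uint64_t)hi << 32) | lo;

	return 0 - counter;
}
```

For an assumed cpu_khz of 2000000, a multiplier of 1 gives 2000000000
events either way, but a multiplier of 4 asks for 8000000000 events and
the split write would count only 3705032704 of them, so the period would
grow by less than a factor of two instead of four.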



Re: 2.6.12-rc2-mm3

2005-04-17 Thread Alexander Nyberg

On Mon 2005-04-11 at 01:25 -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/
> 

I tried to kexec on my x64 box and it hangs in calibrate_delay() because
the PIT never fires any interrupts, so jiffies is never updated. Has
kexec been tested on x64, and should it be working? I want to know if I
should start looking at weirdness with my hardware or if it is like this
on all x64 boxes.

Also, the patch at the bottom is needed to compile kexec on x64 without
ia32 emulation support (the includes are not used at the moment).

  CC  arch/x86_64/kernel/crash.o
In file included from arch/x86_64/kernel/crash.c:18:
include/linux/elfcore.h: In function `elf_core_copy_regs':
include/linux/elfcore.h:92: error: dereferencing pointer to incomplete type
include/linux/elfcore.h:92: error: dereferencing pointer to incomplete type


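Index: x64_mm/arch/x86_64/kernel/crash.c
===================================================================
--- x64_mm.orig/arch/x86_64/kernel/crash.c	2005-04-16 19:23:58.0 +0200
+++ x64_mm/arch/x86_64/kernel/crash.c	2005-04-16 19:47:56.0 +0200
@@ -14,8 +14,6 @@
 #include <linux/irq.h>
 #include <linux/reboot.h>
 #include <linux/kexec.h>
-#include <linux/elf.h>
-#include <linux/elfcore.h>
 
 #include <asm/processor.h>
 #include <asm/hardirq.h>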



Re: 2.6.12-rc2-mm3

2005-04-17 Thread Alexander Nyberg

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/
> 
> 

[Mikael Pettersson on CC, would like your advice]

This patch fixes the NMI checking problems in -mm x64 for me. It
changes the perfctr selection to use RETIRED_UOPS instead
(which makes both processors tick, even on my box).

This makes the NMI tick once per second while running, which is quite
often. I'd like to get it down to every fourth second, and herein lies
the problem: multiplying nmi_interval() in the patch below by 4 does not
help, it still ticks at about the same pace. I'm puzzled...


Index: x64_mm/arch/x86_64/kernel/nmi.c
===================================================================
--- x64_mm.orig/arch/x86_64/kernel/nmi.c	2005-04-17 14:34:09.0 +0200
+++ x64_mm/arch/x86_64/kernel/nmi.c	2005-04-18 02:11:37.0 +0200
@@ -58,7 +58,7 @@
 int panic_on_timeout;
 
 unsigned int nmi_watchdog = NMI_DEFAULT;
-static unsigned int nmi_hz = HZ;
+static unsigned long nmi_hz = HZ;
 unsigned int nmi_perfctr_msr;	/* the MSR to reset in NMI handler */
 
 /* Note that these events don't tick when the CPU idles. This means
@@ -70,6 +70,7 @@
 #define K7_EVNTSEL_USR		(1 << 16)
 #define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
 #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING
+#define K7_RETIRED_UOPS		0xC1 /* always running */
 
 #define P6_EVNTSEL0_ENABLE	(1 << 22)
 #define P6_EVNTSEL_INT		(1 << 20)
@@ -78,6 +79,11 @@
 #define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
 #define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED
 
+static inline unsigned long nmi_interval(void)
+{
+	return (((unsigned long)cpu_khz * 1000UL) / nmi_hz);
+}
+
 /* Run after command line and cpu_init init, but before all other checks */
 void __init nmi_watchdog_default(void)
 {
@@ -129,8 +135,8 @@
 
 	for (cpu = 0; cpu < NR_CPUS; cpu++)
 		counts[cpu] = cpu_pda[cpu].__nmi_count; 
-	local_irq_enable();
-	mdelay((10*1000)/nmi_hz); // wait 10 ticks
+
+	mdelay((10*1000) / nmi_hz); /* wait 10 NMI ticks */
 
 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
 		if (cpu_pda[cpu].__nmi_count - counts[cpu] <= 5) {
@@ -305,9 +311,6 @@
 	int i;
 	unsigned int evntsel;
 
-	/* No check, so can start with slow frequency */
-	nmi_hz = 1; 
-
 	/* XXX should check these in EFER */
 
 	nmi_perfctr_msr = MSR_K7_PERFCTR0;
@@ -322,10 +325,10 @@
 	evntsel = K7_EVNTSEL_INT
 		| K7_EVNTSEL_OS
 		| K7_EVNTSEL_USR
-		| K7_NMI_EVENT;
+		| K7_RETIRED_UOPS;
 
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
-	wrmsrl(MSR_K7_PERFCTR0, -((u64)cpu_khz*1000) / nmi_hz);
+	wrmsrl(MSR_K7_PERFCTR0, -nmi_interval());
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= K7_EVNTSEL_ENABLE;
 	wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
@@ -409,7 +412,7 @@
 		alert_counter[cpu] = 0;
 	}
 	if (nmi_perfctr_msr)
-		wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1);
+		wrmsr(nmi_perfctr_msr, -nmi_interval(), -1);
 }
 
 static int dummy_nmi_callback(struct pt_regs * regs, int cpu)



Re: [PATCH] Fix reproducible SMP crash in security/keys/key.c

2005-04-13 Thread Alexander Nyberg

On Tue 2005-04-12 at 21:58 +0300, Jani Jaakkola wrote:
> SMP race handling is broken in key_user_lookup() in security/keys/key.c
> (if CONFIG_KEYS is set to 'y'). This came up on our Samba servers, but is
> not restricted to samba, though samba is probably the only software which
> is likely to trigger this repeatedly (and it did happen allready four 
> times here in University of Helsinki, CS department).
> 
> However, it only takes two setreuid() calls at the same instant, so this
> may be responsible for some other mysterious random crashes.
> 
> This is the same bug which was previously reported to LKML here (found by
> google):
> http://www.ussg.iu.edu/hypermail/linux/kernel/0502.2/0521.html
> 
> Here is a small test program, which can be used to trigger the bug and 
> crash the machine where it is run. It might take a few seconds:
> 
> #include <unistd.h>
> #include <stdio.h>
> int main() {
>         int i;
>         fork();
>         while(1) {
>                 for(i=0;i<6;i++) { setreuid(i,0); }
>                 putchar('.'); fflush(stdout);
>         };
> }
> 
> The (rather obvious) problem is that key_user_lookup() does not properly 
> re-initialize the user lookup if there was a race.
> 
> This patch applies to vanilla 2.6.11.7 and latest fedora kernel
> 2.6.11-1.14_FC3. When applied, the test program runs just fine (and does
> nothing useful).

A fix for this went into mainline two months ago (post 2.6.11), but I
probably should have sent it to -stable as well.

For your own sake, always use the latest kernel when looking at
problems/fixes; things move fast around here :)
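
The race described in the report follows a common pattern: look up under a
lock, drop the lock to allocate, retake the lock, and then continue from
stale search state instead of starting the lookup over. Below is a
user-space sketch of the safe form of that pattern, using a pthread mutex
and a linked list. All names (user_rec, user_lookup, find_locked) are made
up for illustration; this is neither the code in security/keys/key.c nor
the fix that went into mainline.

```c
#include <pthread.h>
#include <stdlib.h>

struct user_rec {
	unsigned int uid;
	int usage;
	struct user_rec *next;
};

static struct user_rec *users;
static pthread_mutex_t users_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold users_lock. Always searches from the list head. */
static struct user_rec *find_locked(unsigned int uid)
{
	for (struct user_rec *u = users; u != NULL; u = u->next)
		if (u->uid == uid)
			return u;
	return NULL;
}

/* Find or create the record for uid; returns NULL if allocation fails. */
struct user_rec *user_lookup(unsigned int uid)
{
	struct user_rec *candidate = NULL, *found;

	for (;;) {
		pthread_mutex_lock(&users_lock);
		/* The search is redone from scratch on every pass, because the
		 * list may have changed while the lock was dropped. */
		found = find_locked(uid);
		if (found != NULL) {
			found->usage++;
			break;
		}
		if (candidate != NULL) {
			candidate->next = users;
			users = candidate;
			found = candidate;
			candidate = NULL;
			break;
		}
		/* Not found and nothing to insert yet: allocate unlocked. */
		pthread_mutex_unlock(&users_lock);
		candidate = malloc(sizeof(*candidate));
		if (candidate == NULL)
			return NULL;
		candidate->uid = uid;
		candidate->usage = 1;
		candidate->next = NULL;
	}
	pthread_mutex_unlock(&users_lock);
	free(candidate);	/* non-NULL only if another thread won the race */
	return found;
}
```

Two calls to user_lookup(5) return the same record with usage == 2. The
point of the sketch is only that the search restarts after the lock is
retaken, so a record inserted by a racing thread is found and not
duplicated.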


Re: 2.6.12-rc2-mm3: 10 seconds of nothingness

2005-04-12 Thread Alexander Nyberg

> [   19.617890] Testing NMI watchdog ... <6>ACPI: No ACPI bus support for 2-2
> [   19.705673] ACPI: No ACPI bus support for 2-2:1.0
> [   20.002417] usb 3-2: new full speed USB device using uhci_hcd and address 2
> [   20.121763] ACPI: No ACPI bus support for 3-2
> [   20.156293] ACPI: No ACPI bus support for 3-2:1.0
> [   29.539613] OK.
> 
> I also had this "problem" with mm1. mm2 wouldn't compile, so I didn't
> test that. IIRC it also happened with the rc1-mm's. Is this supposed to
> happen?

It's a fairly new thing on x64 and should be fixed soon. If it bothers you
too much, back out:

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/broken-out/rfc-check-nmi-watchdog-is-broken.patch


Re: bkbits.net is down

2005-04-12 Thread Alexander Nyberg

On Tue, 2005-04-12 at 13:10 +0200, Marcin Dalecki wrote:
> On 2005-04-12, at 04:17, Larry McVoy wrote whatever...
> 
> Excuse me, but: who gives a damn shit?
> 

Anyone who wants to have access to the history or any other functioning
of the repository.

Please don't pollute this list nor Larry with such comments.


Re: [PATCH 1/3] cifs: md5 cleanup - functions

2005-04-11 Thread Alexander Nyberg

> Function names and return types on same line - conform to established 
> fs/cifs/ style.
> 
> -void
> -MD5Init(struct MD5Context *ctx)
> +void MD5Init(struct MD5Context *ctx)
>  {
>   ctx->buf[0] = 0x67452301;
>   ctx->buf[1] = 0xefcdab89;
> @@ -60,8 +58,7 @@ MD5Init(struct MD5Context *ctx)
>   * Update context to reflect the concatenation of another buffer full
>   * of bytes.
>   */
> -void
> -MD5Update(struct MD5Context *ctx, unsigned char const *buf, unsigned len)
> +void MD5Update(struct MD5Context *ctx, unsigned char const *buf, unsigned len)
>  {

Can anyone enlighten me as to why CIFS is not using crypto/md5?
Same question about md4.


Re: 2.6.12-rc2-mm2

2005-04-10 Thread Alexander Nyberg

> - Largeish x86_64 update

Hi Pavel

I'm playing a bit with suspend on SMP, and we need something like the patch
below. Once the cpu mask is restricted to a single CPU the task cannot
migrate, so _smp_processor_id() is safe after that point.

Index: linux-2.6.11/kernel/power/smp.c
===
--- linux-2.6.11.orig/kernel/power/smp.c	2005-04-10 09:43:13.0 +0200
+++ linux-2.6.11/kernel/power/smp.c	2005-04-10 15:23:36.0 +0200
@@ -46,13 +46,13 @@
 
 void disable_nonboot_cpus(void)
 {
-	printk("Freezing CPUs (at %d)", smp_processor_id());
 	oldmask = current->cpus_allowed;
 	set_cpus_allowed(current, cpumask_of_cpu(0));
+	printk("Freezing CPUs (at %d)", _smp_processor_id());
 	current->state = TASK_INTERRUPTIBLE;
 	schedule_timeout(HZ);
 	printk("...");
-	BUG_ON(_smp_processor_id() != 0);
+	BUG_ON(_smp_processor_id() != 0);
 
 	/* FIXME: for this to work, all the CPUs must be running
 	 * "idle" thread (or we deadlock). Is that guaranteed? */
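
The reasoning behind the patch (a CPU number is only stable once the task
can no longer migrate) has a rough user-space analogue. The sketch below is
Linux/glibc specific and is not kernel code: it pins the calling thread to
one CPU with sched_setaffinity(), after which sched_getcpu() keeps
returning that CPU. The helper name pin_to_first_allowed_cpu is
hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Restrict the calling thread to the first CPU in its current affinity
 * mask. Returns that CPU number, or -1 on failure.
 */
static int pin_to_first_allowed_cpu(void)
{
	cpu_set_t allowed, one;

	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &allowed))
			continue;
		CPU_ZERO(&one);
		CPU_SET(cpu, &one);
		if (sched_setaffinity(0, sizeof(one), &one) != 0)
			return -1;
		return cpu;
	}
	return -1;
}
```

Before the call, the value returned by sched_getcpu() can be stale by the
time it is used; after it, the value stays valid until the mask is widened
again. That is the same ordering the patch establishes by moving the
printk() below set_cpus_allowed().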



Re: 2.6.12-rc2-mm2

2005-04-09 Thread Alexander Nyberg

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm2/
>
> Changes since 2.6.12-rc2-mm1:
> 
> 
>  bk-acpi.patch

[acpi-devel added to cc]


One of my boxen now takes about 5 minutes to reboot; hitting sysrq-p a
few times shows it mostly sits in acpi_ut_find_allocation+0x2b/0x37.
Reverting bk-acpi.patch makes it reboot normally. Config attached.


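(gdb) disass acpi_ut_find_allocation
Dump of assembler code for function acpi_ut_find_allocation:
0xc01daa2d <acpi_ut_find_allocation+0>:  push   %ebp
0xc01daa2e <acpi_ut_find_allocation+1>:  mov    %esp,%ebp
0xc01daa30 <acpi_ut_find_allocation+3>:  push   %esi
0xc01daa31 <acpi_ut_find_allocation+4>:  mov    %edx,%esi
0xc01daa33 <acpi_ut_find_allocation+6>:  push   %ebx
0xc01daa34 <acpi_ut_find_allocation+7>:  mov    %eax,%ebx
0xc01daa36 <acpi_ut_find_allocation+9>:  call   0xc01dae66 <acpi_ut_track_stack_ptr>
0xc01daa3b <acpi_ut_find_allocation+14>: xor    %edx,%edx
0xc01daa3d <acpi_ut_find_allocation+16>: cmp    $0x6,%ebx
0xc01daa40 <acpi_ut_find_allocation+19>: ja     0xc01daa5e <acpi_ut_find_allocation+49>
0xc01daa42 <acpi_ut_find_allocation+21>: imul   $0x24,%ebx,%eax
0xc01daa45 <acpi_ut_find_allocation+24>: mov    0xc03c1040(%eax),%eax
0xc01daa4b <acpi_ut_find_allocation+30>: test   %eax,%eax
0xc01daa4d <acpi_ut_find_allocation+32>: je     0xc01daa5c <acpi_ut_find_allocation+47>
0xc01daa4f <acpi_ut_find_allocation+34>: cmp    %esi,%eax
0xc01daa51 <acpi_ut_find_allocation+36>: mov    %eax,%edx
0xc01daa53 <acpi_ut_find_allocation+38>: je     0xc01daa5e <acpi_ut_find_allocation+49>
0xc01daa55 <acpi_ut_find_allocation+40>: mov    0x4(%eax),%eax
0xc01daa58 <acpi_ut_find_allocation+43>: test   %eax,%eax
0xc01daa5a <acpi_ut_find_allocation+45>: jne    0xc01daa4f <acpi_ut_find_allocation+34>
0xc01daa5c <acpi_ut_find_allocation+47>: xor    %edx,%edx
0xc01daa5e <acpi_ut_find_allocation+49>: pop    %ebx
0xc01daa5f <acpi_ut_find_allocation+50>: pop    %esi
0xc01daa60 <acpi_ut_find_allocation+51>: mov    %edx,%eax
0xc01daa62 <acpi_ut_find_allocation+53>: leave
0xc01daa63 <acpi_ut_find_allocation+54>: ret
End of assembler dump.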
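
For readers following the dump: acpi_ut_find_allocation+0x2b is address
0xc01daa58, the `test %eax,%eax` inside the short loop that spans
0xc01daa4f to 0xc01daa5a. Read by hand, that loop appears to walk a singly
linked list whose head comes from a table of 0x24-byte entries and whose
next pointer sits at offset 4 of each element. Below is a rough C rendering
of that reading. The struct and field names are guesses made for
illustration, not the ACPI CA definitions.

```c
#include <stddef.h>

#define NUM_LISTS 7	/* cmp $0x6 / ja: list ids above 6 are rejected */

struct alloc_block {
	void *unknown0;			/* offset 0: not touched by this function */
	struct alloc_block *next;	/* offset 4 on i386: mov 0x4(%eax),%eax */
};

struct list_info {
	struct alloc_block *head;	/* loaded via 0xc03c1040 + id * 0x24 */
	/* the rest of the 0x24-byte table entry is not modelled here */
};

static struct list_info lists[NUM_LISTS];

/* Return the list element located at `address`, or NULL if absent. */
static struct alloc_block *find_allocation(unsigned int list_id,
					   const void *address)
{
	struct alloc_block *element;

	if (list_id >= NUM_LISTS)
		return NULL;

	for (element = lists[list_id].head; element != NULL;
	     element = element->next)
		if ((const void *)element == address)
			return element;	/* cmp %esi,%eax ; je */

	return NULL;
}
```

Each call is linear in the length of the list. One plausible reading (a
guess that the report itself does not confirm) is that the list is very
long at reboot time and is searched once per freed object, which would add
up to the minutes observed.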


#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.12-rc2-mm2
# Sat Apr  9 13:27:39 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
# CONFIG_HOTPLUG is not set
# CONFIG_KOBJECT_UEVENT is not set
# CONFIG_IKCONFIG is not set
CONFIG_EMBEDDED=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
# CONFIG_MODULE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_KMOD is not set

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_HPET_TIMER=y
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
CONFIG_FLATMEM=y
# CONFIG_DISCONTIGMEM is not set
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_HAVE_DEC_LOCK=y
CONFIG_REGPARM=y
CONFIG_SECCOMP=y

#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
CONFIG_PHYSICAL_START=0x10
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
CONFIG_SOFTWARE_SUSPEND=y
CONFIG_PM_STD_PARTITION=""

Re: Timestamp of file modified through mmap are not changed in 2.6

2005-04-08 Thread Alexander Nyberg

> Timestamp of file modified through mmap are not changed in 2.6 (even 
> after msync()). Observations on 2.4 and 2.6 kernels:
> - on 2.4, timestamps are altered a few seconds after the program exits.
> - on 2.6, timestamps are never altered.
> 
> Is this behaviour a normal behaviour ?
> 
> Program example to reproduce the bug (you need to create a "test" file 
> in the current directory first):

Yeah, there's been at least one bug open on bugzilla for this, and I
recall the POSIX specification saying that the timestamps of a file shall
be updated when it is changed through an mmap mapping (which makes sense too).

Doing it at msync time is easy; keeping track of memory-mapped data etc. is
more cumbersome. I sent a patch doing this a while ago (it doesn't apply now
because of the msync rework, I think it was the 4-level changes) that worked
well for me, but nobody seemed to be overwhelmed by it :-)

http://lkml.org/lkml/diff/2004/12/5/95/1
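
The reporter's test program is not included in this message. As a
stand-in, here is a minimal sketch of how the behaviour can be observed
from user space: modify a file through a shared mapping, msync() it, and
compare st_mtime before and after. The helper name touch_via_mmap is
hypothetical, and the sketch makes no claim about what any particular
kernel will print.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Flip one bit in the first byte of `path` through a shared mapping,
 * msync() it, and store the file's mtime before and after the change.
 * Returns 0 on success, -1 on any failure (including an empty file).
 */
static int touch_via_mmap(const char *path, time_t *before, time_t *after)
{
	struct stat st;
	char *map;
	int fd, ret = -1;

	assert(path != NULL && before != NULL && after != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size == 0)
		goto out_close;
	*before = st.st_mtime;

	map = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	map[0] ^= 1;	/* dirty the page through the mapping */
	if (msync(map, st.st_size, MS_SYNC) == 0 && fstat(fd, &st) == 0) {
		*after = st.st_mtime;
		ret = 0;
	}
	munmap(map, st.st_size);
out_close:
	close(fd);
	return ret;
}
```

Since st_mtime has one-second resolution, the file should be a few seconds
old before the call. If the timestamps are not being updated, as described
in the report, *after will equal *before.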


Re: 2.6.12-rc2-mm2

2005-04-08 Thread Alexander Nyberg

On Fri, 2005-04-08 at 03:08 -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm2/
> 

I got this while running ./runltp -x 2; I can't recall it happening before. It
bothers me a bit as it's a GFP_KERNEL allocation and there's lots of
swap available. I don't think I've learned to fully decipher these
oom-dumps yet (especially the active/inactive stats), but this looks
fishy to me.

Run with /proc/sys/vm/swappiness=1.

After the kills, /proc/meminfo reports lots of MemFree.
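
As an aid to reading the dumps that follow: the per-zone buddy lines list,
for each block size, how many free blocks of that size exist, and the sum
is the zone's free memory. For example, the first DMA line
(1*4kB 2*8kB 1*16kB ... 1*2048kB 0*4096kB) adds up to the 4100kB shown, and
"free:2055" pages times 4kB gives the 8220kB on the "Free pages" line. A
small sketch of that arithmetic (the function name buddy_free_kb is made up
for this note):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_KB 4	/* page size in kB, as used in the dumps */

/*
 * counts[i] is the number of free blocks of order i, that is, blocks of
 * (PAGE_KB << i) kB each. Returns the total free memory in kB.
 */
static unsigned long buddy_free_kb(const unsigned int *counts, size_t orders)
{
	unsigned long total = 0;

	assert(counts != NULL);
	for (size_t i = 0; i < orders; i++)
		total += (unsigned long)counts[i] * (PAGE_KB << i);
	return total;
}
```

Feeding it the first Normal line (32, 1, 1, 2, 1, 0, 1, 1, 1, 1, 0) gives
4120kB, only slightly above that zone's min watermark of 4028kB, while
almost all of the swap is still free.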

oom-killer: gfp_mask=0x80d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:8220kB (0kB HighMem)
Active:240716 inactive:2631 dirty:0 writeback:1162 unstable:0 free:2055 
slab:5340 mapped:242023 pagetables:1441
DMA free:4100kB min:60kB low:72kB high:88kB active:172kB inactive:7956kB 
present:16384kB pages_scanned:363 all_unreclaimable? no
lowmem_reserve[]: 0 1007 1007
Normal free:4120kB min:4028kB low:5032kB high:6040kB active:962692kB 
inactive:2568kB present:1032128kB pages_scanned:6421 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 
0*4096kB = 4100kB
Normal: 32*4kB 1*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 
1*2048kB 0*4096kB = 4120kB
HighMem: empty
Swap cache: add 921, delete 257, find 0/0, race 0+0
Free swap  = 4880036kB
Total swap = 4883720kB
Out of Memory: Killed process 2327 (firefox-bin).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:8112kB (0kB HighMem)
Active:224482 inactive:19068 dirty:0 writeback:285 unstable:0 free:2028 
slab:5089 mapped:243275 pagetables:1407
DMA free:4088kB min:60kB low:72kB high:88kB active:6432kB inactive:1744kB 
present:16384kB pages_scanned:10438 all_unreclaimable? yes
lowmem_reserve[]: 0 1007 1007
Normal free:4024kB min:4028kB low:5032kB high:6040kB active:891496kB 
inactive:74528kB present:1032128kB pages_scanned:686 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 
0*4096kB = 4088kB
Normal: 6*4kB 4*8kB 0*16kB 4*32kB 2*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 
1*2048kB 0*4096kB = 4024kB
HighMem: empty
Swap cache: add 2900, delete 2557, find 0/0, race 0+0
Free swap  = 4872120kB
Total swap = 4883720kB
Out of Memory: Killed process 2305 (evolution).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:8096kB (0kB HighMem)
Active:229514 inactive:14617 dirty:0 writeback:0 unstable:0 free:2024 slab:4233 
mapped:244389 pagetables:1568
DMA free:4088kB min:60kB low:72kB high:88kB active:8228kB inactive:0kB 
present:16384kB pages_scanned:9771 all_unreclaimable? yes
lowmem_reserve[]: 0 1007 1007
Normal free:4008kB min:4028kB low:5032kB high:6040kB active:909060kB 
inactive:59364kB present:1032128kB pages_scanned:1623064 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 
0*4096kB = 4088kB
Normal: 0*4kB 1*8kB 2*16kB 0*32kB 4*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 
1*2048kB 0*4096kB = 4008kB
HighMem: empty
Swap cache: add 125467, delete 125298, find 70/196, race 0+0
Free swap  = 4383340kB
Total swap = 4883720kB
Out of Memory: Killed process 2330 (gnome-pty-helpe).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:

Re: 2.6.12-rc2-mm2

2005-04-08 Thread Alexander Nyberg

fre 2005-04-08 klockan 03:08 -0700 skrev Andrew Morton:
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm2/
 

I got this running ./runltp -x 2, can't recall this happening before. It
bothers me a bit as it's a GFP_KERNEL allocation and there's lots of
swap available. I don't think I've learned to fully decipher these
oom-dumps fully yet (especially the active/inactive stat) but this looks
fishy to me.

Run with /proc/sys/vm/swappiness=1

After the killing /proc/meminfo reports lots of MemFree.

oom-killer: gfp_mask=0x80d2
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:8220kB (0kB HighMem)
Active:240716 inactive:2631 dirty:0 writeback:1162 unstable:0 free:2055 
slab:5340 mapped:242023 pagetables:1441
DMA free:4100kB min:60kB low:72kB high:88kB active:172kB inactive:7956kB 
present:16384kB pages_scanned:363 all_unreclaimable? no
lowmem_reserve[]: 0 1007 1007
Normal free:4120kB min:4028kB low:5032kB high:6040kB active:962692kB 
inactive:2568kB present:1032128kB pages_scanned:6421 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 
0*4096kB = 4100kB
Normal: 32*4kB 1*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 
1*2048kB 0*4096kB = 4120kB
HighMem: empty
Swap cache: add 921, delete 257, find 0/0, race 0+0
Free swap  = 4880036kB
Total swap = 4883720kB
Out of Memory: Killed process 2327 (firefox-bin).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:8112kB (0kB HighMem)
Active:224482 inactive:19068 dirty:0 writeback:285 unstable:0 free:2028 
slab:5089 mapped:243275 pagetables:1407
DMA free:4088kB min:60kB low:72kB high:88kB active:6432kB inactive:1744kB 
present:16384kB pages_scanned:10438 all_unreclaimable? yes
lowmem_reserve[]: 0 1007 1007
Normal free:4024kB min:4028kB low:5032kB high:6040kB active:891496kB 
inactive:74528kB present:1032128kB pages_scanned:686 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 
0*4096kB = 4088kB
Normal: 6*4kB 4*8kB 0*16kB 4*32kB 2*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 
1*2048kB 0*4096kB = 4024kB
HighMem: empty
Swap cache: add 2900, delete 2557, find 0/0, race 0+0
Free swap  = 4872120kB
Total swap = 4883720kB
Out of Memory: Killed process 2305 (evolution).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages:8096kB (0kB HighMem)
Active:229514 inactive:14617 dirty:0 writeback:0 unstable:0 free:2024 slab:4233 
mapped:244389 pagetables:1568
DMA free:4088kB min:60kB low:72kB high:88kB active:8228kB inactive:0kB 
present:16384kB pages_scanned:9771 all_unreclaimable? yes
lowmem_reserve[]: 0 1007 1007
Normal free:4008kB min:4028kB low:5032kB high:6040kB active:909060kB 
inactive:59364kB present:1032128kB pages_scanned:1623064 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB 
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 
0*4096kB = 4088kB
Normal: 0*4kB 1*8kB 2*16kB 0*32kB 4*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 
1*2048kB 0*4096kB = 4008kB
HighMem: empty
Swap cache: add 125467, delete 125298, find 70/196, race 0+0
Free swap  = 4383340kB
Total swap = 4883720kB
Out of Memory: Killed process 2330 (gnome-pty-helpe).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:

Re: Timestamp of file modified through mmap are not changed in 2.6

2005-04-08 Thread Alexander Nyberg

> Timestamp of file modified through mmap are not changed in 2.6 (even
> after msync()). Observations on 2.4 and 2.6 kernels:
> - on 2.4, timestamps are altered a few seconds after the program exits.
> - on 2.6, timestamps are never altered.
>
> Is this behaviour a normal behaviour ?
>
> Program example to reproduce the bug (you need to create a test file
> in the current directory first):

Yeah, there's been at least one bug open on bugzilla for this, and I
recall the POSIX specification saying that the timestamps on a file shall
be updated when it is changed through mmap (which makes sense too).

Doing it at msync is easy; keeping track of memory-mapped data etc. is
more cumbersome. I sent a patch doing this a while ago (it doesn't work
now due to the msync rework, I think it was the 4-level changes) that
worked well for me, but nobody seemed to be overwhelmed by it :-)

http://lkml.org/lkml/diff/2004/12/5/95/1
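
For anyone who wants to check their own kernel, here is a minimal sketch
of a test along the lines described above. It is not the original
poster's program (that one is not shown here). It is written in Python
for brevity, where mmap.flush() calls msync() on Unix. The function name
and the delay parameter are made up for illustration, and whether the
mtime actually changes depends on the kernel you run it on.

```python
import mmap
import os
import tempfile
import time


def mtime_after_mmap_write(path, delay=1.1):
    """Modify `path` through a shared mapping, msync it, and return
    (mtime_ns before, mtime_ns after). Hypothetical helper for illustration."""
    before = os.stat(path).st_mtime_ns
    # Wait so that a timestamp update is visible even at 1 s granularity.
    time.sleep(delay)
    with open(path, "r+b") as f:
        # ACCESS_WRITE gives a shared, writable mapping of the whole file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            m[0:1] = b"X"   # dirty the first page through the mapping
            m.flush()       # msync() on Unix
    after = os.stat(path).st_mtime_ns
    return before, after


if __name__ == "__main__":
    # Create a small scratch file instead of requiring one to exist.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello mmap\n")
    try:
        before, after = mtime_after_mmap_write(tmp.name)
        print("mtime changed" if after > before else "mtime unchanged")
    finally:
        os.unlink(tmp.name)
```

If it prints "mtime unchanged" even though the data reached the file,
that matches the 2.6 behaviour reported above.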

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Dynamic Tick version 050406-1

2005-04-07 Thread Alexander Nyberg

> > > Here's an updated dyn-tick patch. Some minor fixes:
> > 
> > Doesn't look so good here.  I get this with 2.6.12-rc2 (plus a few other 
> > patches).
> > Disabling Dynamic Tick makes everything happy again (it boots).
> > 
> > [4294688.655000] Unable to handle kernel NULL pointer dereference at 
> > virtual address 
> 
> Thanks for trying it out. What kind of hardware do you have? Does it
> have HPET? It looks like no suitable timer for dyn-tick is found...
> Maybe the following patch helps?


= arch/i386/kernel/Makefile 1.67 vs edited =
--- 1.67/arch/i386/kernel/Makefile  2005-01-26 06:21:13 +01:00
+++ edited/arch/i386/kernel/Makefile  2005-04-07 11:21:19 +02:00
@@ -32,6 +32,7 @@ obj-$(CONFIG_ACPI_SRAT)   += srat.o
 obj-$(CONFIG_HPET_TIMER)   += time_hpet.o
 obj-$(CONFIG_EFI)  += efi.o efi_stub.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
+obj-$(CONFIG_NO_IDLE_HZ)   += dyn-tick.o
 
 EXTRA_AFLAGS   := -traditional
 



