Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-08 Thread Henry
> No, his oops was a bad inode state while trying to
> release unused NFS client inodes. Different bug :)
New development. No oops, but apache eventually crashed with the same error message 'semget - no space left on device'. So,... either this was a coincidence (ie, with the kernel issue)
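
A note on the error text quoted above: semget(2) fails with ENOSPC, which the C library prints as "No space left on device", when creating a new semaphore set would exceed the system-wide limit on semaphore sets (SEMMNI) or on semaphores (SEMMNS). It does not refer to disk space. Whether a leaked or exhausted semaphore table is what actually happened on these servers is not established in the thread. The sketch below, assuming a Linux system that exposes /proc/sys/kernel/sem, reads the four limits; the helper names `parse_sem_limits` and `read_sem_limits` are hypothetical and chosen only for illustration.

```python
import errno
import os

# Order of the four integers in /proc/sys/kernel/sem.
SEM_FIELDS = ("SEMMSL", "SEMMNS", "SEMOPM", "SEMMNI")


def parse_sem_limits(text):
    """Turn the contents of /proc/sys/kernel/sem into a dict of limits."""
    values = [int(tok) for tok in text.split()]
    if len(values) != len(SEM_FIELDS):
        raise ValueError("expected %d integers, got %d"
                         % (len(SEM_FIELDS), len(values)))
    return dict(zip(SEM_FIELDS, values))


def read_sem_limits(path="/proc/sys/kernel/sem"):
    """Read the live limits (the path is an assumption about the host)."""
    with open(path) as fh:
        return parse_sem_limits(fh.read())


if __name__ == "__main__":
    # The message text that a failing semget() with ENOSPC would produce.
    print(os.strerror(errno.ENOSPC))
    try:
        print(read_sem_limits())
    except (OSError, ValueError) as exc:
        print("could not read semaphore limits:", exc)
```

Comparing these limits against the sets currently allocated (for example with `ipcs -s`) would show whether the table is full.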

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-07 Thread Andrew Morton
Henry wrote:
> > I wonder why it only affects you. Is the drive which holds
> > your swap partition running in PIO mode? `hdparm' will tell
> > you. If it is, then that could easily cause the page to come
> > unlocked before brw_page() has finished touching the buffer
> > ring. Then all it takes is

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-07 Thread Henry
> I wonder why it only affects you. Is the drive which holds
> your swap partition running in PIO mode? `hdparm' will tell
> you. If it is, then that could easily cause the page to come
> unlocked before brw_page() has finished touching the buffer
> ring. Then all it takes is a parallel

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-07 Thread Andrew Morton
Henry wrote:
> ...
> So far, so good. There has not been a single oops on the two principal
> servers I patched.
>
> uptime1: 8:04am up 18:22, 1 user, load average: 0.09, 0.15, 0.11
> uptime2: 8:04am up 18:25, 1 user, load average: 0.15, 0.20, 0.15
OK, that

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-06 Thread Henry
On Fri, 06 Jul 2001, Andrew Morton wrote:
> Henry wrote:
> > ...
> > Dual-cpu pentium 233 (intel) with 128MB RAM and more than double that swap.
> > ...
> > Unable to handle kernel NULL pointer dereference at virtual address 0008
> > c01b4227
> > *pde =
> > Oops:
> > CPU:0
> > EIP:

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-06 Thread Henry
> There does appear to be an SMP race in brw_page() which can cause
> this - end_buffer_io_async() unlocks the page, try_to_free_buffers()
> zaps the buffer_head ring and brw_page() gets a null pointer. But
> gee, it's unlikely unless you have super-fast disks and/or something
> which has a
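
The interleaving described in the quoted text can be walked through with a small deterministic toy model. This is not kernel code: the function names are borrowed from the message only as labels, the `Page` and `BufferHead` classes are hypothetical stand-ins, and the "other CPU" is simulated by calling `try_to_free_buffers()` inline right after the last I/O completes.

```python
class BufferHead:
    def __init__(self, name):
        self.name = name
        self.b_this_page = None  # next buffer in the page's circular ring


class Page:
    def __init__(self, nbuffers):
        heads = [BufferHead("bh%d" % i) for i in range(nbuffers)]
        for i, bh in enumerate(heads):
            bh.b_this_page = heads[(i + 1) % nbuffers]
        self.buffers = heads[0]
        self.locked = True       # page stays locked while I/O is pending
        self.pending = nbuffers


def end_buffer_io_async(page):
    """I/O completion: the last buffer to finish unlocks the page."""
    page.pending -= 1
    if page.pending == 0:
        page.locked = False


def try_to_free_buffers(page):
    """Tear down the ring, but only if the page is no longer locked."""
    if page.locked or page.buffers is None:
        return False
    head = page.buffers
    bh = head
    while True:
        nxt = bh.b_this_page
        bh.b_this_page = None
        if nxt is head:
            break
        bh = nxt
    page.buffers = None
    return True


def brw_page(page, io_completes_early):
    """Submit each buffer, then follow the ring pointer to the next one."""
    head = page.buffers
    bh = head
    while True:
        # "Submit" bh. With very fast I/O the completion runs before we
        # get to read bh.b_this_page below.
        if io_completes_early:
            end_buffer_io_async(page)
            try_to_free_buffers(page)  # stands in for a parallel CPU
        bh = bh.b_this_page
        if bh is None:
            return "null pointer"      # the ring was zapped under us
        if bh is head:
            return "ok"


if __name__ == "__main__":
    print(brw_page(Page(2), io_completes_early=True))
    print(brw_page(Page(2), io_completes_early=False))
```

In this model the failure only appears when the final completion and the freeing both land between submitting the last buffer and reading its ring pointer, which matches the quoted remark that the race is unlikely without very fast completion.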

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-06 Thread Andrew Morton
Henry wrote:
> ...
> Dual-cpu pentium 233 (intel) with 128MB RAM and more than double that swap.
> ...
> Unable to handle kernel NULL pointer dereference at virtual address 0008
> c01b4227
> *pde =
> Oops:
> CPU:0
> EIP:0010:[c01b4227]
> Using defaults from ksymoops -t

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-05 Thread Henry
> > FYI, I see a similar problem under 2.4.5, also SMP, although only
> > intermittently. Two oopses are below, from two different, although
> > similarly configured, machines.
>
> [snip]
>
> Sounds very similar. Our servers are all identical (except for RAM).
>
> What's unusual is that the machines

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-05 Thread Henry
On Thu, 05 Jul 2001, Wayne Whitney wrote:
> In mailing-lists.linux-kernel, you wrote:
>
> > We've noticed the following kernel error since 2.4 (2.4.1-2.4.6).
> > It appears to be swap (kswapd thread specific?) related. The same
> > error is reported on several SMP machines after only a short
> > period

Re: OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-05 Thread Wayne Whitney
In mailing-lists.linux-kernel, you wrote:
> We've noticed the following kernel error since 2.4 (2.4.1-2.4.6).
> It appears to be swap (kswapd thread specific?) related. The same
> error is reported on several SMP machines after only a short period
> (an hour or less).
FYI, I see a similar problem

OOPS (kswapd) in 2.4.5 and 2.4.6

2001-07-05 Thread Henry
Hello Presumably this has already been mentioned, but since it seems like an ongoing thing (I've seen a similar topic discussed at http://kt.zork.net/kernel-traffic/) I thought it wouldn't hurt to provide more info. We've noticed the following kernel error since 2.4 (2.4.1-2.4.6). It appears
