Re: [expert] Do these messages explain why my system hung?

2003-10-07 Thread James Sparenberg
On Mon, 2003-10-06 at 21:20, Wolfgang Bornath wrote:
 James Sparenberg wrote on Mon, 06 Oct 2003 15:58:36 -0700:
 
  Wobo,
  
 If you do this successfully care to put in the twiki how you did
 it? 
  James
 
 Will do but maybe not before end of month. Too busy now with 9.2 manuals
 on that very machine.  
 
 wobo

No problem.  This might be something worth putting into one of those
manuals as well.   Just a thought.
 
 
 __
 Want to buy your Pack or Services from MandrakeSoft? 
 Go to http://www.mandrakestore.com




[expert] Do these messages explain why my system hung?

2003-10-06 Thread Brian Parish
This is a 9.1 machine with all updates installed.  Has been running just
fine for a long time now.  No recent changes apart from security updates
and none of those for a week or more.

Everything ground to a halt.  Managed to kill X after a long wait for a
response to Ctrl-Alt-Backspace.  Seemed OK, but then stopped again 10
minutes later.  Had to reset at that point.

Here are the messages from the system log that seem perhaps relevant. 
Do they mean anything to anyone?

TIA
Brian

Oct  6 16:02:28 daw kernel: __alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
Oct  6 16:02:56 daw kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Oct  6 16:03:01 daw kernel: mkia_add_page_ref: couldn't make b1f31000 present
Oct  6 16:03:01 daw kernel: Merge: _M_map_winpg: mapping of a14b6000 to b1f31000 failed, prot = 6, err = -14
Oct  6 16:03:01 daw kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
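
[Editor's sketch] When a log contains many of these messages, it can help to tally the allocation failures by order and gfp mask before reading them one by one. The following is a minimal sketch in Python, not something posted in the thread; the function name `tally_alloc_failures` and the `SAMPLE` constant are made up for illustration, and the regular expression only assumes the message format shown in the excerpt above.

```python
import re
from collections import Counter

# Matches messages of the form seen in the excerpt, e.g.
# "__alloc_pages: 0-order allocation failed (gfp=0x1f0/0)"
ALLOC_FAIL = re.compile(
    r"__alloc_pages: (\d+)-order allocation failed\s*"
    r"\(gfp=(0x[0-9a-fA-F]+)/\d+\)"
)


def tally_alloc_failures(log_text):
    """Count allocation failures per (order, gfp mask) in a syslog excerpt."""
    # Mail clients may wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return Counter((int(order), gfp) for order, gfp in ALLOC_FAIL.findall(flat))


# The messages from the post, wrapped the way the mailer wrapped them.
SAMPLE = """\
Oct  6 16:02:28 daw kernel: __alloc_pages: 0-order allocation failed
(gfp=0x1f0/0)
Oct  6 16:02:56 daw kernel: __alloc_pages: 0-order allocation failed
(gfp=0x1d2/0)
Oct  6 16:03:01 daw kernel: mkia_add_page_ref: couldn't make b1f31000
present
Oct  6 16:03:01 daw kernel: __alloc_pages: 0-order allocation failed
(gfp=0xf0/0)
Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed
(gfp=0x1d2/0)
Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed
(gfp=0x1d2/0)
"""

for (order, gfp), count in sorted(tally_alloc_failures(SAMPLE).items()):
    print("order=%d gfp=%s count=%d" % (order, gfp, count))
```

The tally says nothing about the cause; it only shows how often the kernel failed to get a single page and with which allocation flags.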




Re: [expert] Do these messages explain why my system hung?

2003-10-06 Thread James Sparenberg
On Mon, 2003-10-06 at 00:27, Brian Parish wrote:
 This is a 9.1 machine with all updates installed.  Has been running just
 fine for a long time now.  No recent changes apart from security updates
 and none of those for a week or more.
 
 Everything ground to a halt.  Managed to kill X after a long wait for a
 response to Ctrl-Alt-Backspace.  Seemed OK, but then stopped again 10
 minutes later.  Had to reset at that point.
 
 Here are the messages from the system log that seem perhaps relevant. 
 Do they mean anything to anyone?
 
 TIA
 Brian
 
 Oct  6 16:02:28 daw kernel: __alloc_pages: 0-order allocation failed
 (gfp=0x1f0/0)
 Oct  6 16:02:56 daw kernel: __alloc_pages: 0-order allocation failed
 (gfp=0x1d2/0)
 Oct  6 16:03:01 daw kernel: mkia_add_page_ref: couldn't make b1f31000
 present
 Oct  6 16:03:01 daw kernel: Merge: _M_map_winpg: mapping of a14b6000 to
 b1f31000 failed, prot = 6, err
 = -14
 Oct  6 16:03:01 daw kernel: __alloc_pages: 0-order allocation failed
 (gfp=0xf0/0)
 Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed
 (gfp=0x1d2/0)
 Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed
 (gfp=0x1d2/0)

Brian,

   A little googling around suggests it is memory related: either the
box is running out of memory and can't swap, so the VM is trying to kill
off processes to stay alive, or you have a memory stick going bad.  Run
memtest86 or a similar memory testing program overnight if you can.  It
boots on its own from a floppy, or you can do a urpmi memtest86 and it
will give you another boot option in lilo (not sure about grub) that you
can boot into.
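
[Editor's sketch] The "running out of memory and can't swap" possibility can be checked roughly from /proc/meminfo while the machine is still responsive. The following is a minimal sketch in Python, not something posted in the thread: the helper names `parse_meminfo` and `low_memory`, the 16 MB threshold, and the sample values are all made up for illustration. It assumes only that the file contains lines of the form `Name:  12345 kB`, including `MemFree` and `SwapFree`.

```python
def parse_meminfo(text):
    """Return {field: kilobytes} for 'Name:  12345 kB' lines; skip the rest."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines with exactly one number followed by the unit "kB".
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def low_memory(fields, threshold_kb=16 * 1024):
    """True if free RAM plus free swap is below an arbitrary threshold."""
    free_total = fields.get("MemFree", 0) + fields.get("SwapFree", 0)
    return free_total < threshold_kb


# Made-up sample values, not taken from the machine in this thread.
SAMPLE = """\
Mem:  261644288 258465792 3178496 0 4096000 20480000
MemTotal:       255512 kB
MemFree:          3104 kB
SwapTotal:      514040 kB
SwapFree:            0 kB
"""

info = parse_meminfo(SAMPLE)
print("MemFree=%d kB SwapFree=%d kB low=%s"
      % (info["MemFree"], info["SwapFree"], low_memory(info)))
```

On a live system the text would come from reading `/proc/meminfo` instead of `SAMPLE`. The check is crude, since buffers and cache can usually be reclaimed, so a low `MemFree` on its own is not alarming; free swap at or near zero alongside it is the more telling sign.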

James





Re: [expert] Do these messages explain why my system hung?

2003-10-06 Thread diego
Hmm... and if that's the case (a memory module is faulty), you can have a
look at badmem:
it locks out the bytes that memtest86 reports as bad, so you can still
spend $0 and keep using most of that module ;-))


On Mon, 06-10-2003 at 10:30, James Sparenberg wrote:
 On Mon, 2003-10-06 at 00:27, Brian Parish wrote:
  This is a 9.1 machine with all updates installed.  Has been running just
  fine for a long time now.  No recent changes apart from security updates
  and none of those for a week or more.
  
  Everything ground to a halt.  Managed to kill X after a long wait for a
  response to Ctrl-Alt-Backspace.  Seemed OK, but then stopped again 10
  minutes later.  Had to reset at that point.
  
  Here are the messages from the system log that seem perhaps relevant. 
  Do they mean anything to anyone?
  
  TIA
  Brian
  
  Oct  6 16:02:28 daw kernel: __alloc_pages: 0-order allocation failed
  (gfp=0x1f0/0)
  Oct  6 16:02:56 daw kernel: __alloc_pages: 0-order allocation failed
  (gfp=0x1d2/0)
  Oct  6 16:03:01 daw kernel: mkia_add_page_ref: couldn't make b1f31000
  present
  Oct  6 16:03:01 daw kernel: Merge: _M_map_winpg: mapping of a14b6000 to
  b1f31000 failed, prot = 6, err
  = -14
  Oct  6 16:03:01 daw kernel: __alloc_pages: 0-order allocation failed
  (gfp=0xf0/0)
  Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed
  (gfp=0x1d2/0)
  Oct  6 16:03:27 daw kernel: __alloc_pages: 0-order allocation failed
  (gfp=0x1d2/0)
 
 Brian,
 
A little googling around and I found that it is memory related.  In
 that either the box is running out of memory and can't swap so the VM is
 trying to kill off processes to stay alive or, you have a memory stick
 going bad.  Run memtest86 or a similar memory testing program overnight
 if you can. It boots on its own from a floppy or you can do a urpmi
 memtest86 and it will give you another boot option in lilo (grub not
 sure) and you boot to that.  
 
 James
 
 
 
 
 

-- 
Diego Dominguez
Andalucia, Spain




Re: [expert] Do these messages explain why my system hung?

2003-10-06 Thread Wolfgang Bornath
diego wrote on 06 Oct 2003 19:02:06 +0200:

 Hu... and if that's the case (a module is corrupted), you can have
 a look at badmem:
   it will lock bytes reported as wrong by memtest86 so you would
   be able to still spend 0$ and use much of that module ;-))

Now this IS great advice! I have a module here in my desk that memtest
reported as having some bad bytes (only a few). If I could get it to
work with the bad bytes locked out, it would bring my desktop's memory
up to 1 GByte!

If this works, it may be one of the most important pieces of advice
I've received this year.

wobo



RE: [expert] Do these messages explain why my system hung?

2003-10-06 Thread Lawson, Jim
I agree it really was great info.

-----Original Message-----
From: Wolfgang Bornath [mailto:[EMAIL PROTECTED]
Sent: Monday, October 06, 2003 1:36 PM
To: [EMAIL PROTECTED]
Subject: Re: [expert] Do these messages explain why my system hung?


diego wrote on 06 Oct 2003 19:02:06 +0200:

 Hu... and if that's the case (a module is corrupted), you can have
 a look at badmem:
   it will lock bytes reported as wrong by memtest86 so you would
   be able to still spend 0$ and use much of that module ;-))

Now this IS a great advice! I have a module here in my desk that memtest
reported has some bad bytes (only a few). If I could get that to work
with locked bad bytes it would increase my desktop's memory up to
1GByte!

If this works then this advice may be one of the most important advices
I received this year.

wobo




Re: [expert] Do these messages explain why my system hung?

2003-10-06 Thread Thomas Backlund
Wolfgang Bornath wrote in his message (sent Monday 06 October 2003 20:35):
 diego wrote on 06 Oct 2003 19:02:06 +0200:
  Hu... and if that's the case (a module is corrupted), you can have
  a look at badmem:
  it will lock bytes reported as wrong by memtest86 so you would
  be able to still spend 0$ and use much of that module ;-))

 Now this IS a great advice! I have a module here in my desk that memtest
 reported has some bad bytes (only a few). If I could get that to work
 with locked bad bytes it would increase my desktop's memory up to
 1GByte!

 If this works then this advice may be one of the most important advices
 I received this year.

 wobo

Since the upcoming 9.2 includes the badram patches (added in kernel
2.4.22-0.7mdk), here is how to do it:

Run memtest and choose the following:
(c)onfiguration
(6) Error Report Mode
(2) BadRam Patterns
(8) Restart Test

Once the test has run and found some bad RAM, it will show a line like this:

badram=0x09c17d18,0xfffc

The reported line is exactly what you need to add to the append line in your 
lilo configuration; the kernel will then mark that memory area as forbidden, 
so that no program will use it, not even the kernel...

Thanks to the badram patch I didn't have to discard a 512MB module due to a 
4kB bad area in that module (that's what the above badram=... example fixes)
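
[Editor's sketch] Before pasting the reported line into lilo.conf, it can be sanity-checked and turned into an `append=` entry. The following is a minimal sketch in Python, not part of the original post: the helper names (`parse_badram`, `page_range`, `lilo_append`) are made up for illustration, the reading of the values as address,mask pairs follows the BadRAM patch documentation as far as I know, and the 4 kB page size is an assumption that holds on 32-bit x86.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 kB pages, as on 32-bit x86

# "badram=" followed by one or more comma-separated hexadecimal values.
_BADRAM = re.compile(r"^badram=(0x[0-9a-fA-F]+(?:,0x[0-9a-fA-F]+)*)$")


def parse_badram(line):
    """Split a 'badram=addr,mask[,addr,mask...]' line into (addr, mask) pairs."""
    match = _BADRAM.match(line.strip())
    if not match:
        raise ValueError("not a badram= line: %r" % line)
    values = [int(value, 16) for value in match.group(1).split(",")]
    if len(values) % 2:
        raise ValueError("expected address,mask pairs")
    return list(zip(values[0::2], values[1::2]))


def page_range(address):
    """First and last byte address of the page that contains `address`."""
    start = address - (address % PAGE_SIZE)
    return start, start + PAGE_SIZE - 1


def lilo_append(line):
    """Format a validated badram= line as a lilo.conf append entry."""
    parse_badram(line)  # validation only; raises ValueError on bad input
    return 'append="%s"' % line.strip()


reported = "badram=0x09c17d18,0xfffc"  # the example line from this post
for addr, mask in parse_badram(reported):
    first, last = page_range(addr)
    print("address 0x%08x lies in page 0x%08x-0x%08x" % (addr, first, last))
print(lilo_append(reported))
```

If lilo.conf already has an append line for that kernel image, the badram= option would be added inside the existing quotes, separated from the other options by a space, rather than as a second append line. As with any lilo.conf change, /sbin/lilo has to be run again before rebooting.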


-- 
Regards

Thomas




RE: [expert] Do these messages explain why my system hung?

2003-10-06 Thread diego
I'm glad it might be useful to someone (I have used it successfully in
the past). Have a look at (googling around will lead you here):
http://badmem.sourceforge.net/docu/BadMEM-HOWTO.html

It does not waste resources: all it does is install itself as a
resident kernel module (not swappable) at those specific locations, so
they are locked and will never be used. The only thing you need to do is
try to put the good module(s) first, to guarantee that lilo, the loader,
etc. run properly right up to the point where the module loads.





On Mon, 06-10-2003 at 19:42, Lawson, Jim wrote:
 I agree it really was great info.
 
 -Original Message-
 From: Wolfgang Bornath [mailto:[EMAIL PROTECTED]
 Sent: Monday, October 06, 2003 1:36 PM
 To: [EMAIL PROTECTED]
 Subject: Re: [expert] Do these messages explain why my system hung?
 
 
 diego wrote on 06 Oct 2003 19:02:06 +0200:
 
  Hu... and if that's the case (a module is corrupted), you can have
  a look at badmem:
  it will lock bytes reported as wrong by memtest86 so you would
  be able to still spend 0$ and use much of that module ;-))
 
 Now this IS a great advice! I have a module here in my desk that memtest
 reported has some bad bytes (only a few). If I could get that to work
 with locked bad bytes it would increase my desktop's memory up to
 1GByte!
 
 If this works then this advice may be one of the most important advices
 I received this year.
 
 wobo
 
 
 
 

-- 
Diego Dominguez
Andalucia, Spain





Re: [expert] Do these messages explain why my system hung?

2003-10-06 Thread Wolfgang Bornath
Thomas Backlund wrote on Tue, 7 Oct 2003 00:09:48 +0300:

 As we have the badram patches in upcoming 9.2, here is how to do it:
 (it was added in 2.4.22-0.7mdk)
 
 run memtest, choose the following:
 (c)onfiguration
 (6) Error Report Mode
 (2) BadRam Patterns
 (8) Restart Test
 
 When the test have run, and found some bad ram, it will show a line
 like this:
 
 badram=0x09c17d18,0xfffc
 
 this line that is reported is exactly what you need to add to your
 lilo append line, and the kernel will map that memory area as
 forbidden area that no program will use, not even the kernel...
 
 Thanks to the badram patch I didn't have to discard a 512MB module due
 to a 4kB bad area in that module (that's what the above badram=...
 example fixes)

Thanks, Thomas, this one will make it into the Tips&Tricks section of
the upcoming website for German Mandrake users.

wobo



Re: [expert] Do these messages explain why my system hung?

2003-10-06 Thread James Sparenberg
On Mon, 2003-10-06 at 10:35, Wolfgang Bornath wrote:
 diego wrote on 06 Oct 2003 19:02:06 +0200:
 
  Hu... and if that's the case (a module is corrupted), you can have
  a look at badmem:
  it will lock bytes reported as wrong by memtest86 so you would
  be able to still spend 0$ and use much of that module ;-))
 
 Now this IS a great advice! I have a module here in my desk that memtest
 reported has some bad bytes (only a few). If I could get that to work
 with locked bad bytes it would increase my desktop's memory up to
 1GByte!
 
 If this works then this advice may be one of the most important advices
 I received this year.
 
 wobo
 
Wobo,

   If you do this successfully care to put in the twiki how you did it? 
James





Re: [expert] Do these messages explain why my system hung?

2003-10-06 Thread Wolfgang Bornath
James Sparenberg wrote on Mon, 06 Oct 2003 15:58:36 -0700:

 Wobo,
 
If you do this successfully care to put in the twiki how you did
it? 
 James

Will do but maybe not before end of month. Too busy now with 9.2 manuals
on that very machine.

wobo
