Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-02 Thread Mick
2011/3/1 Peter Humphrey pe...@humphrey.ukfsn.org:
 On Tuesday 01 March 2011 23:14:12 Mick wrote:

 Ha! I remember an old machine that would rarely if ever crash in WinXP,
 while in Gentoo it would crash every time.

 My machine is only about a year old, built by a specialist builder of high-
 performance systems, so it shouldn't be experiencing hardware failures.

 Different OSes use memory differently.

 Indeed they do. My experience is the converse of yours: Gentoo does not
 hang, while Fedora and Mandriva do. It's not a problem with a particular
 area of the disks, as I've installed them both in different partitions and
 got the same result. I assume that some kernel options don't suit my
 motherboard. Don't all laugh, but it's an Asus P7P55D.

 After a year or so though the WinXP installation eventually corrupted
 itself irreparably, while Gentoo (on reiserfs) soldiered on. Eventually, I
 bought new memory modules and there were no more crashes.

 Maybe I need to replace the memory. That's a bit drastic though when I
 haven't actually proved it faulty.

Yes, I tend to agree.  You could end up replacing the memory only to
find out that the crashes persist.


 memtest 86+ showed no errors, so I didn't know what to blame for all
 these crashes.

 It's well known that test programs can't stress a computer the way real life
 does. It was true of Ferranti Argus 500 systems in 1974, and I'm sure it's
 still true today.

I remember using a script which put the system (memory modules and
swap) through its paces.  That did show me some errors which made me
replace the memory.  I can't recall where I found that script, but
remember it being aired in this mailing list.


 After close observation I discovered that the machine would crash the
 moment it tried to start swapping.

 Interesting. As far as I can tell though this box doesn't swap often - it
 can go weeks without doing so. As I said the other day, my 4GB is enough to
 contain the work I usually do.

 This would typically happen in the middle of an emerge, which was rather
 annoying, and/or when updatedb was running.

 At least you could re-run an aborted emerge; when my box hangs it just stops
 responding to keyboard and mouse, and the network interface stops receiving
 packets so I can't ssh in from another box to shut it down neatly. It's BRS
 time.

No, I couldn't.  :-(

The whole system would freeze up, no keyboard, no network, no nothing.
I had to pull the plug every time.
-- 
Regards,
Mick



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-02 Thread Mick
On 2 March 2011 16:29, Neil Bothwick n...@digimed.co.uk wrote:
 On Wed, 2 Mar 2011 15:51:54 +, Mick wrote:

  This would typically happen in the middle of an emerge, which was
  rather annoying, and/or when updatedb was running.
 
  At least you could re-run an aborted emerge; when my box hangs it
  just stops responding to keyboard and mouse, and the network
  interface stops receiving packets so I can't ssh in from another box
  to shut it down neatly. It's BRS time.

 No I couldn't.  :-(

 The whole system would freeze up, no keyboard, no network, no nothing.
  I had to pull the plug every time.

 Not even Alt-SysRq? That's a serious lockup.

Yep, when that box locked up, it didn't do it by half.


 You can still resume a merge after a power down, with
 ebuild /path/to/ebuild merge.

I see ... by path you mean /var/tmp/portage/...  ?
-- 
Regards,
Mick



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-02 Thread Neil Bothwick
On Wed, 2 Mar 2011 16:37:11 +, Mick wrote:

  You can still resume a merge after a power down, with
  ebuild /path/to/ebuild merge.  
 
 I see ... by path you mean /var/tmp/portage/...  ?

The path to the actual ebuild: /usr/portage/cat/pkg/pkg-ver.ebuild
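
As a rough sketch of how that path is put together: the helper name
ebuild_path and the package app-misc/foo-1.0 below are made up for
illustration, and /usr/portage is simply the tree location used in this
thread, so it may differ on other setups. The script only prints the
resulting command; it does not run it.

```sh
#!/bin/sh
# Hypothetical helper: build the ebuild path from category, package and
# version, following the /usr/portage/cat/pkg/pkg-ver.ebuild layout.
ebuild_path() {
    tree=/usr/portage   # assumed tree location, as used in this thread
    printf '%s/%s/%s/%s-%s.ebuild\n' "$tree" "$1" "$2" "$2" "$3"
}

# Made-up package; print the command that would resume the merge.
echo "ebuild $(ebuild_path app-misc foo 1.0) merge"
```

Printing the command first makes it easy to check the path before handing
it to ebuild.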


-- 
Neil Bothwick

The best things in life are free, but the
expensive ones are still worth a look.




Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-02 Thread Alex Schuster
Mick writes:

 On 2 March 2011 16:29, Neil Bothwick n...@digimed.co.uk wrote:

  You can still resume a merge after a power down, with
  ebuild /path/to/ebuild merge.
 
 I see ... by path you mean /var/tmp/portage/...  ?

No, /usr/portage/category/package.

Alternatively, you can use FEATURES=keepwork emerge package, or even 
simpler with FEATURES=keepwork emerge --resume.

Wonko



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-02 Thread Neil Bothwick
On Wed, 2 Mar 2011 15:51:54 +, Mick wrote:

  This would typically happen in the middle of an emerge, which was
  rather annoying, and/or when updatedb was running.  
 
  At least you could re-run an aborted emerge; when my box hangs it
  just stops responding to keyboard and mouse, and the network
  interface stops receiving packets so I can't ssh in from another box
  to shut it down neatly. It's BRS time.  
 
 No I couldn't.  :-(
 
 The whole system would freeze up, no keyboard, no network, no nothing.
  I had to pull the plug every time.

Not even Alt-SysRq? That's a serious lockup.

You can still resume a merge after a power down, with
ebuild /path/to/ebuild merge.


-- 
Neil Bothwick

Drop your carrier .. we have you surrounded




Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-02 Thread Peter Humphrey
On Wednesday 02 March 2011 16:37:11 Mick wrote:
 On 2 March 2011 16:29, Neil Bothwick n...@digimed.co.uk wrote:
  You can still resume a merge after a power down, with
  ebuild /path/to/ebuild merge.
 
 I see ... by path you mean /var/tmp/portage/...  ?

No, I think he means something like:

ebuild `equery w atom` merge

-- 
Rgds
Peter



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-01 Thread Mick
On Sunday 27 February 2011 23:34:09 Peter Humphrey wrote:
 On Sunday 27 February 2011 19:43:10 Mick wrote:
  [...] when I had a failing memory module I would often end up with
  corrupted files all over the place.  Think about it, when the memory
  gave up some write on disk function was invariably foo-barred.
 
 What, though, if you get hang-ups in some OSs but not in others, and never
 a sign of file corruption?

Ha! I remember an old machine that would rarely if ever crash in WinXP,
while in Gentoo it would crash every time.  Different OSes use memory differently.

After a year or so though the WinXP installation eventually corrupted itself 
irreparably, while Gentoo (on reiserfs) soldiered on.  Eventually, I bought 
new memory modules and there were no more crashes.

memtest 86+ showed no errors, so I didn't know what to blame for all these 
crashes.  After close observation I discovered that the machine would crash 
the moment it tried to start swapping.  This would typically happen in the 
middle of an emerge, which was rather annoying, and/or when updatedb was 
running.  The particular MoBo/memory controller had a dislike for memory 
modules which were not identical.  With new identical modules it never crashed 
again.
-- 
Regards,
Mick




Re: [gentoo-user] Re: Random reboots. Where to start?

2011-03-01 Thread Peter Humphrey
On Tuesday 01 March 2011 23:14:12 Mick wrote:

 Ha! I remember an old machine that would rarely if ever crash in WinXP,
 while in Gentoo it would crash every time.

My machine is only about a year old, built by a specialist builder of high-
performance systems, so it shouldn't be experiencing hardware failures.

 Different OSes use memory differently.

Indeed they do. My experience is the converse of yours: Gentoo does not 
hang, while Fedora and Mandriva do. It's not a problem with a particular 
area of the disks, as I've installed them both in different partitions and 
got the same result. I assume that some kernel options don't suit my 
motherboard. Don't all laugh, but it's an Asus P7P55D.

 After a year or so though the WinXP installation eventually corrupted
 itself irreparably, while Gentoo (on reiserfs) soldiered on. Eventually, I
 bought new memory modules and there were no more crashes.

Maybe I need to replace the memory. That's a bit drastic though when I 
haven't actually proved it faulty.

 memtest 86+ showed no errors, so I didn't know what to blame for all
 these crashes.

It's well known that test programs can't stress a computer the way real life 
does. It was true of Ferranti Argus 500 systems in 1974, and I'm sure it's 
still true today.

 After close observation I discovered that the machine would crash the
 moment it tried to start swapping.

Interesting. As far as I can tell though this box doesn't swap often - it 
can go weeks without doing so. As I said the other day, my 4GB is enough to 
contain the work I usually do.

 This would typically happen in the middle of an emerge, which was rather
 annoying, and/or when updatedb was running.

At least you could re-run an aborted emerge; when my box hangs it just stops 
responding to keyboard and mouse, and the network interface stops receiving 
packets so I can't ssh in from another box to shut it down neatly. It's BRS 
time.

-- 
Rgds
Peter



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-02-27 Thread Mick
On Sunday 27 February 2011 17:15:40 Grant Edwards wrote:
 On 2011-02-26, Dale rdalek1...@gmail.com wrote:
  Mick wrote:
  Before you start tweaking voltages and replacing PSUs you better test
  your *new* memory modules thoroughly, even if that means that you will
  be using your old machine for a day or so.
  
  Personally I usually remove all memory modules and then test one at a
  time overnight with memtest 86+.  If it gives any errors at all I would
  send it back to the shop.
  
  If they all pass, then voltage and PSU issues will need to be looked at.
  
  Good luck.
  
  This appears to be a corrupt file somewhere.
 
 In my experience, failing RAM often appears as a corrupt file
 somewhere.

Yep, when I had a failing memory module I would often end up with corrupted 
files all over the place.  Think about it, when the memory gave up some write 
on disk function was invariably foo-barred.
-- 
Regards,
Mick




Re: [gentoo-user] Re: Random reboots. Where to start?

2011-02-27 Thread Dale

Mick wrote:
 On Sunday 27 February 2011 17:15:40 Grant Edwards wrote:
  On 2011-02-26, Dale rdalek1...@gmail.com wrote:
   This appears to be a corrupt file somewhere.

  In my experience, failing RAM often appears as a corrupt file
  somewhere.

 Yep, when I had a failing memory module I would often end up with corrupted
 files all over the place.  Think about it, when the memory gave up some write
 on disk function was invariably foo-barred.


This was my logic tho.  Reboots when using the OS on the hard drive.  
Runs fine when booted from something else, memtest, system rescue or 
even Knoppix.  If it was memory, then it should fail on everything at 
some point.  Since it only failed when booted from the hard drive, I was 
looking for issues with it.


What you are saying is completely correct tho.  If I load a file into 
ram that is bad, then it gets written back to the drive, that file is 
broke.  That will cause problems eventually and who knows what sort of 
flakey issue that will be.


Anyway, recompiling everything gives me this:

root@fireball / # uptime
 14:05:46 up 1 day,  5:29,  4 users,  load average: 0.43, 0.24, 0.23
root@fireball / #

I think it is going to be OK now.  Some file was having a bad hair day.  
lol


Dale

:-)  :-)



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-02-27 Thread Peter Humphrey
On Sunday 27 February 2011 19:43:10 Mick wrote:

 [...] when I had a failing memory module I would often end up with
 corrupted files all over the place.  Think about it, when the memory
 gave up some write on disk function was invariably foo-barred.

What, though, if you get hang-ups in some OSs but not in others, and never a 
sign of file corruption?

-- 
Rgds
Peter



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-02-26 Thread Mark Knecht
On Sat, Feb 26, 2011 at 2:20 PM, walt w41...@gmail.com wrote:
 On 02/25/2011 03:10 PM, Dale wrote:

  I got a good power supply but it could still be that. Even the best and
  most expensive break from time to time. I think I could swap mine out
  from my old rig if needed. This new rig doesn't pull near as much as my
  old one.

 How can you tell how much power the machine is using?


Kill-a-Watt



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-02-26 Thread Dale

Mark Knecht wrote:
 On Sat, Feb 26, 2011 at 2:20 PM, walt w41...@gmail.com wrote:
  On 02/25/2011 03:10 PM, Dale wrote:
   I got a good power supply but it could still be that. Even the best and
   most expensive break from time to time. I think I could swap mine out
   from my old rig if needed. This new rig doesn't pull near as much as my
   old one.

  How can you tell how much power the machine is using?

 Kill-a-Watt


Nope, current meter and a calculator.  My computer has a line that is
for that plug only.  I just clamp my meter on and measure how much
current it is pulling.  Multiply that times the voltage and there you go.
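
For illustration, the arithmetic looks roughly like this. The readings
(120 V mains, 1.5 A on the clamp meter) and the helper name apparent_power
are made up. Note that volts times amps gives apparent power in VA; the
real draw in watts is lower whenever the power factor is below 1.

```sh
#!/bin/sh
# Hypothetical helper: apparent power (VA) from mains voltage and the
# current read off a clamp meter.
apparent_power() {
    awk -v v="$1" -v a="$2" 'BEGIN { printf "%.1f\n", v * a }'
}

# Made-up readings: 120 V mains and 1.5 A measured.
echo "$(apparent_power 120 1.5) VA"
```

With those readings the script prints 180.0 VA.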


The Kill-a-Watt is next on my list tho.  I do want one of those things.
Newegg has them too. ;-)


Dale

:-)  :-)



Re: [gentoo-user] Re: Random reboots. Where to start?

2011-02-25 Thread Dale

Grant Edwards wrote:
 On 2011-02-25, Dale rdalek1...@gmail.com wrote:
  Well, I think my machine is possessed or something.  I'm getting random
  reboots here.  When it does this, it is like hitting the reset button.
  It is sitting on the grub screen when it does this.  I noticed the first
  time the other day and this was before adding the extra memory.  I
  seemed to be stable at 4GB but I seem to be rebooting at random.  I ran
  memtest yesterday, it checked fine.

 By memtest I assume you mean memtest86?

 In my experience, you should let it run multiple passes (I'd recommend
 at least 4 -- I would imagine it'll take a couple of days).  I've
 seen situations where it was OK on the initial pass, and then failed
 later.

 The other likely suspect is probably the power supply.


Correct.  To sort of help rule out the OS on the hard drive, I ran
memtest from a USB stick.  It made it through 2 full passes with no
errors.  Since this is my main rig, I can't go too long without it.  I
get to shaking from withdrawal and such as that.  :-(


I got a good power supply but it could still be that.  Even the best and 
most expensive break from time to time.  I think I could swap mine out 
from my old rig if needed.  This new rig doesn't pull near as much as my 
old one.


Thanks.

Dale

:-)  :-)