[gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Nikos Chantziaras

Andrey Falko wrote:
On 11/20/08, Nikos Chantziaras wrote:


I upgraded to kernel 2.6.27 (gentoo-sources-2.6.27-r3) yesterday.
Today I experienced random segfaults during an emerge (twice while
emerging mozilla-thunderbird: the first time as, the assembler, segfaulted;
on the second try python segfaulted at the end of the emerge).

Anyone noticing something similar?  I'm reverting back to 2.6.26 for
now.



You do not see these segfaults on 2.6.26, right? 


Correct.


When I had the symptoms 
you described, it turned out that my RAM voltage needed to be raised in 
the BIOS.


Hmm.  My system *is* overclocked (I'm one of those enthusiast guys). 
I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good 
aftermarket cooler (temps never go above 48C at full load).  The CPU is 
overclocked and *undervoltaged* (1.29V from its 1.35V stock).  The RAM 
is both underclocked (to get an FSB:DRAM ratio of 1:1) and 
undervoltaged.  The system has been confirmed stable though; 8 hours 
Prime95 stress test with no errors, which is much more of a stress test 
than any real application can pull off.  It also passes memtest.
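
The overclock figures above can be put into numbers. This is a rough sketch only: the clock speeds and voltages come from the message, but the 9x CPU multiplier is an assumption (the usual stock value for an E6600) that the thread does not state, and the helper names are made up for illustration.

```python
# Figures quoted in the message.
STOCK_GHZ, OC_GHZ = 2.4, 3.0
STOCK_VOLTS, OC_VOLTS = 1.35, 1.29

# Assumption, not stated in the thread: the multiplier stays at 9x.
ASSUMED_MULTIPLIER = 9


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def fsb_mhz(core_ghz, multiplier=ASSUMED_MULTIPLIER):
    """Front-side bus clock implied by a core clock and multiplier."""
    return core_ghz * 1000.0 / multiplier


def dram_mhz(fsb, ratio=(1, 1)):
    """DRAM clock for a given FSB:DRAM ratio, e.g. (1, 1)."""
    fsb_part, dram_part = ratio
    return fsb * dram_part / fsb_part
```

Under that assumption, percent_change(STOCK_GHZ, OC_GHZ) gives the size of the overclock, percent_change(STOCK_VOLTS, OC_VOLTS) the size of the undervolt, and dram_mhz(fsb_mhz(OC_GHZ)) the memory clock at a 1:1 ratio.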



Do you run the proprietary nvidia-drivers? If you don't run 
any software that taints the kernel, I'd file a bug with the upstream kernel 
people: http://bugzilla.kernel.org/
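
Whether a running kernel is tainted can be read from /proc/sys/kernel/tainted, which holds a bitmask. The following is a minimal sketch for decoding it. The function names are hypothetical, only three bits are listed, and the bit meanings follow current kernel documentation; older kernels such as 2.6.27 define fewer bits.

```python
# A subset of the taint bits; the table is deliberately incomplete.
TAINT_FLAGS = {
    0: "proprietary module loaded",
    1: "module force-loaded",
    12: "out-of-tree module loaded",
}


def decode_taint(value):
    """Return descriptions of the known taint bits set in value."""
    return [desc for bit, desc in sorted(TAINT_FLAGS.items())
            if value & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the taint bitmask of the running kernel (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

On a Linux machine, decode_taint(read_taint()) returns an empty list for an untainted kernel. A binary graphics driver would normally set bit 0.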


As it happens, upgrading to kernel 2.6.27 was not the only change; I 
switched from xf86-video-radeonhd to the proprietary ATI Catalyst 
drivers.  Didn't think that this has anything to do with it though.


Can you recommend a Linux program that does a stress test like Prime95 
on Windows?





Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Dale
Nikos Chantziaras wrote:
 Andrey Falko wrote:

 When I had the symptoms you described, it turned out that my RAM
 voltage needed to be raised in the BIOS.

 Hmm.  My system *is* overclocked (I'm one of those enthusiast guys).
 I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good
 aftermarket cooler (temps never go above 48C at full load).  The CPU
 is overclocked and *undervoltaged* (1.29V from its 1.35V stock).  The
 RAM is both underclocked (to get an FSB:DRAM ratio of 1:1) and
 undervoltaged.  The system has been confirmed stable though; 8 hours
 Prime95 stress test with no errors, which is much more of a stress
 test than any real application can pull off.  It also passes memtest.


I used to overclock some too until I started running folding.  Folding
just doesn't like overclocking.  I'm not sure what prime95 does but it
could still be a problem even though it passes.  Just a thought. 

Your Mileage May Vary.

Dale

:-)  :-) 



Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Andrey Falko
On 11/20/08, Nikos Chantziaras wrote:

 As it happens, upgrading to kernel 2.6.27 was not the only change; I
 switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers.
  Didn't think that this has anything to do with it though.


I have a hunch that this is the problem. Have you tried 2.6.27 without the
proprietary ATI drivers? This is the first thing I would suspect because ATI
drivers are not reputed --- or so I believe --- for being the best quality
software out there. Either way, we need to eliminate as many factors as
possible and one of these factors is your ATI driver change. I don't know
much about Prime95, but it won't simulate a subtlety in your video
drivers, especially how proprietary ATI drivers will play with kernel
2.6.27.


[gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Nikos Chantziaras

Dale wrote:

Nikos Chantziaras wrote:

Andrey Falko wrote:


When I had the symptoms you described, it turned out that my RAM
voltage needed to be raised in the BIOS.

Hmm.  My system *is* overclocked (I'm one of those enthusiast guys).
I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good
aftermarket cooler (temps never go above 48C at full load).  The CPU
is overclocked and *undervoltaged* (1.29V from its 1.35V stock).  The
RAM is both underclocked (to get an FSB:DRAM ratio of 1:1) and
undervoltaged.  The system has been confirmed stable though; 8 hours
Prime95 stress test with no errors, which is much more of a stress
test than any real application can pull off.  It also passes memtest.



I used to overclock some too until I started running folding.  Folding
just doesn't like overclocking.  I'm not sure what prime95 does but it
could still be a problem even though it passes.  Just a thought. 


I was running folding too (up to the point where it affected the 
electricity bill :P).  It ran with no errors.  For the record, Prime puts 
more stress on the system than folding; it really brings the system (CPU, 
northbridge and RAM, the GPU isn't affected) to its knees.





[gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Nikos Chantziaras

Andrey Falko wrote:
On 11/20/08, Nikos Chantziaras wrote:


As it happens, upgrading to kernel 2.6.27 was not the only change; I
switched from xf86-video-radeonhd to the proprietary ATI Catalyst
drivers.  Didn't think that this has anything to do with it though.


I have a hunch that this is the problem. Have you tried 2.6.27 without 
the proprietary ATI drivers?


I'll be running tests on this with various combinations 
(2.6.26/2.6.27/Catalyst/open source radeonhd) and see what happens. 
Re-emerging a bunch of stuff should do it.
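
The kernel/driver combinations to test can be laid out as a small matrix. Below is a sketch under stated assumptions: the function names are hypothetical, and the results are meant to be filled in by hand after re-emerging under each combination. It only flags a factor when every run with it failed and every run without it passed.

```python
from itertools import product

KERNELS = ["2.6.26", "2.6.27"]
DRIVERS = ["catalyst", "radeonhd"]


def test_matrix(kernels=KERNELS, drivers=DRIVERS):
    """Every (kernel, driver) combination that needs a test run."""
    return list(product(kernels, drivers))


def blame(results):
    """Find factors that fully explain the failures.

    results maps (kernel, driver) -> True if segfaults were seen.
    A factor is returned only if all runs with it failed and all
    runs without it passed.
    """
    culprits = []
    for index, name in enumerate(("kernel", "driver")):
        for value in sorted({combo[index] for combo in results}):
            with_value = [failed for combo, failed in results.items()
                          if combo[index] == value]
            without = [failed for combo, failed in results.items()
                       if combo[index] != value]
            if all(with_value) and not any(without):
                culprits.append((name, value))
    return culprits
```

If the failures do not line up with a single factor, blame() returns an empty list, which suggests either an interaction between the two or an unrelated cause such as hardware.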



This is the first thing I would suspect 
because ATI drivers are not reputed --- or so I believe --- for being 
the best quality software out there.


They are reputed to be one of the most broken (from packaging, install 
scripts up to the actual binary blobs) ever produced ;P  Too bad they're 
the *only* choice on modern cards.  No 3D (or even accelerated 2D) 
support with any open source driver for HD4xxx series cards.  It's one 
of those things you hate to the bone but can't do without due to AMD's 
refusal to open source them so we can fix them.



Either way, we need to eliminate as 
many factors as possible and one of these factors is your ATI driver 
change. I don't know much about Prime95, but it won't simulate a 
subtlety in your video drivers, especially how proprietary ATI drivers 
will play with kernel 2.6.27.


Yes, Prime doesn't touch the GPU.  It stresses CPU, northbridge and RAM 
only.





[gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Nikos Chantziaras

Nikos Chantziaras wrote:

Andrey Falko wrote:
On 11/20/08, Nikos Chantziaras wrote:


As it happens, upgrading to kernel 2.6.27 was not the only change; I
switched from xf86-video-radeonhd to the proprietary ATI Catalyst
drivers.  Didn't think that this has anything to do with it though.


I have a hunch that this is the problem. Have you tried 2.6.27 without 
the proprietary ATI drivers?


I'll be running tests on this with various combinations 
(2.6.26/2.6.27/Catalyst/open source radeonhd) and see what happens. 
Re-emerging a bunch of stuff should do it.


That was easier than expected.  I got segfaults every single time I 
emerged mozilla-thunderbird, with kernel 2.6.26 as well as 2.6.27.


Reverting to the open source radeonhd driver made this stop.  This 
is a serious issue because you'll get segfaults only if you're lucky. 
If you're not lucky, bad machine code could be produced silently, which 
on a source-based distribution like Gentoo is the worst scenario imaginable.
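
One way to look for silently corrupted output is to build the same package twice and compare checksums of the resulting files. The sketch below assumes the two build trees have been copied into separate directories; the function names are hypothetical. Builds are not always bit-for-bit reproducible (embedded timestamps, for instance), so a difference is a hint to investigate, not proof of corruption.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    root = Path(root)
    return {str(p.relative_to(root)): file_digest(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def differing_files(build_a, build_b):
    """Relative paths whose contents differ or that exist on one side only."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    return sorted(name for name in a.keys() | b.keys()
                  if a.get(name) != b.get(name))
```

An empty result from differing_files() for two builds of the same source under the same settings is reassuring. A non-empty one narrows down which objects to inspect.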





Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Mike

Nikos Chantziaras wrote:

Reverting back to the open source radeonhd driver made this stop.  This 
is a serious issue because you'll get segfaults only if you're lucky. If 
not lucky, bad machine code could be produced which on Gentoo being 
source-based is the worst scenario imaginable.


I have been getting random segfaults too using kernels 2.6.26 AND 2.6.27 
using the nvidia driver. I never had problems like this before.





Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Iain Buchanan

Andrey Falko wrote:

On 11/20/08, Nikos Chantziaras wrote:

As it happens, upgrading to kernel 2.6.27 was not the only change; I
switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers.
  Didn't think that this has anything to do with it though.



I have a hunch that this is the problem. Have you tried 2.6.27 without the
proprietary ATI drivers? This is the first thing I would suspect because ATI
drivers are not reputed --- or so I believe --- for being the best quality
software out there. Either way, we need to eliminate as many factors as
possible and one of these factors is your ATI driver change. I don't know
much about Prime95, but it won't simulate a subtlety in your video
drivers, especially how proprietary ATI drivers will play with kernel
2.6.27.


I would definitely say this is worth looking into - not necessarily 
because the ati-drivers are bad software but because they have always 
stressed my hardware more than any open source drivers (and yielded 
better performance too).


FWIW I've never known any stress test to be as good as compiling 
openoffice :)
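
Using a big compile as a stress test comes down to running the same command repeatedly and watching for abnormal exits. Here is a minimal sketch, assuming a POSIX system, where Python reports a process killed by a signal as a negative return code. The function name is hypothetical, and the command to run is left to the reader.

```python
import signal
import subprocess


def run_repeatedly(cmd, runs=5):
    """Run cmd several times and collect the abnormal exits.

    Returns a list of (run_index, description) tuples. An empty
    list means every run exited with status 0.
    """
    failures = []
    for index in range(runs):
        code = subprocess.run(cmd).returncode
        if code < 0:
            # On POSIX, a negative code means "killed by signal -code".
            failures.append((index, "killed by " + signal.Signals(-code).name))
        elif code != 0:
            failures.append((index, "exit status %d" % code))
    return failures
```

Note that when a compiler segfaults deep inside a build, the top-level build tool usually reports it as an ordinary nonzero exit status, so both kinds of failure are worth recording.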


cya,
--
Iain Buchanan iaindb at netspace dot net dot au

QOTD:
There may be no excuse for laziness, but I'm sure looking.



[gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Nikos Chantziaras

Iain Buchanan wrote:

Andrey Falko wrote:

On 11/20/08, Nikos Chantziaras wrote:

As it happens, upgrading to kernel 2.6.27 was not the only change; I
switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers.
Didn't think that this has anything to do with it though.



I have a hunch that this is the problem. Have you tried 2.6.27 without the
proprietary ATI drivers? This is the first thing I would suspect because ATI
drivers are not reputed --- or so I believe --- for being the best quality
software out there. Either way, we need to eliminate as many factors as
possible and one of these factors is your ATI driver change. I don't know
much about Prime95, but it won't simulate a subtlety in your video
drivers, especially how proprietary ATI drivers will play with kernel
2.6.27.


I would definitely say this is worth looking into - not necessarily 
because the ati-drivers are bad software but because they have always 
stressed my hardware more than any open source drivers (and yielded 
better performance too).


I have opened a bug for it, but it got closed immediately because I have 
no evidence to support it. lol :P


http://bugs.gentoo.org/show_bug.cgi?id=247860




Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Iain Buchanan

Nikos Chantziaras wrote:

Iain Buchanan wrote:



I would definitely say this is worth looking into - not necessarily
because the ati-drivers are bad software but because they have always
stressed my hardware more than any open source drivers (and yielded
better performance too).


I have opened a bug for it, but it got closed immediately because I have
no evidence to support it. lol :P

http://bugs.gentoo.org/show_bug.cgi?id=247860


AFAIR Jer is only a wrangler (ok, "only" is not the right word, but you 
know what I mean) so he's likely not going to get into too much detail 
about the specifics of the bug wrt fglrx, ati, etc.  You'd have better 
luck talking to a dev who deals with fglrx or the kernel, and convincing 
them to look at it.


I know it can be frustrating to have bugs marked as INVALID, but saying 
"your justification for marking this as INVALID is actually the invalid 
thing here ... not my bug report" is probably going to get you ignored!


I'm not disagreeing with you here, but I do think you will have a hard 
time getting hard evidence it's specifically the fglrx driver, and not 
some effect the fglrx driver is having on your hw...


--
Iain Buchanan iaindb at netspace dot net dot au

If Chuck Norris round-house kicks you, you will die. If Chuck Norris' 
misses you with the round-house kick, the wind behind the kick will tear 
out your pancreas.




[gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Nikos Chantziaras

Iain Buchanan wrote:

Nikos Chantziaras wrote:

Iain Buchanan wrote:



I would definitely say this is worth looking into - not necessarily
because the ati-drivers are bad software but because they have always
stressed my hardware more than any open source drivers (and yielded
better performance too).


I have opened a bug for it, but it got closed immediately because I have
no evidence to support it. lol :P

http://bugs.gentoo.org/show_bug.cgi?id=247860


AFAIR Jer is only a wrangler (ok, "only" is not the right word, but you 
know what I mean) so he's likely not going to get into too much detail 
about the specifics of the bug wrt fglrx, ati, etc.  You'd have better 
luck talking to a dev who deals with fglrx or the kernel, and convincing 
them to look at it.


I know it can be frustrating to have bugs marked as INVALID, but saying 
"your justification for marking this as INVALID is actually the invalid 
thing here ... not my bug report" is probably going to get you ignored!


I'm not disagreeing with you here, but I do think you will have a hard 
time getting hard evidence it's specifically the fglrx driver, and not 
some effect the fglrx driver is having on your hw...


I don't know what I can do.  I'm not willing to set up a hacker kernel 
and start debugging ATI's module.  I posted that bug and also posted on 
Phoronix to warn people to check if they're affected by this.  At least, 
that's enough for a "told you so" reply later ;D  (Though I hope it's 
just my PC that's acting up here, I don't wish anyone any kind of data 
loss or anything.)


Other than that, I'll wait for Catalyst 8.12 and see what happens.




Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Andrey Falko
On Thu, Nov 20, 2008 at 9:03 PM, Iain Buchanan wrote:

 Nikos Chantziaras wrote:

 I have opened a bug for it, but it got closed immediately because I have
 no evidence to support it. lol :P

 http://bugs.gentoo.org/show_bug.cgi?id=247860


 AFAIR Jer is only a wrangler (ok, "only" is not the right word, but you
 know what I mean) so he's likely not going to get into too much detail about
 the specifics of the bug wrt fglrx, ati, etc.  You'd have better luck
 talking to a dev who deals with fglrx or the kernel, and convincing them to
 look at it.

 I know it can be frustrating to have bugs marked as INVALID, but saying
 "your justification for marking this as INVALID is actually the invalid
 thing here ... not my bug report" is probably going to get you ignored!

 I'm not disagreeing with you here, but I do think you will have a hard time
 getting hard evidence it's specifically the fglrx driver, and not some
 effect the fglrx driver is having on your hw...


I agree. Even if this is a bug, no open source developer is in a position
to fix it because fglrx is closed source. If you can, report this to ATI;
otherwise, you will have to live with the problem or use open source
drivers.


Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Dale
Nikos Chantziaras wrote:
 Dale wrote:
 Nikos Chantziaras wrote:
 Andrey Falko wrote:

 When I had the symptoms you described, it turned out that my RAM
 voltage needed to be raised in the BIOS.
 Hmm.  My system *is* overclocked (I'm one of those enthusiast guys).
 I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good
 aftermarket cooler (temps never go above 48C at full load).  The CPU
 is overclocked and *undervoltaged* (1.29V from its 1.35V stock).  The
 RAM is both underclocked (to get an FSB:DRAM ratio of 1:1) and
 undervoltaged.  The system has been confirmed stable though; 8 hours
 Prime95 stress test with no errors, which is much more of a stress
 test than any real application can pull off.  It also passes memtest.


 I used to overclock some too until I started running folding.  Folding
 just doesn't like overclocking.  I'm not sure what prime95 does but it
 could still be a problem even though it passes.  Just a thought. 

 I was running folding too (up to the point where it affected the
 electricity bill :P).  It ran with no errors.  For the record, Prime
 puts more stress on the system than folding; it really brings the system
 (CPU, northbridge and RAM, the GPU isn't affected) to its knees.




If you have run folding and it works, then I would think you are ready
for liftoff.  From what I have read, folding is real touchy about
overclocking.  To think I bought a mobile AMD CPU and can't overclock
because of folding.  *sighs*

Dale

:-)  :-)



[gentoo-user] Re: Random segfaults with kernel 2.6.27

2008-11-20 Thread Nikos Chantziaras

Andrey Falko wrote:

[...]
 I agree. Even if this is a bug, no open source developer could be 
qualified to fix it because fglrx is closed source. If you can, report 
this to ATI, otherwise, you will have to live with the problem or use 
open source drivers.


That is not true.  If it's in portage, you have to accept bugs against 
it.  If a bug is serious, then that means the package has to stay masked 
if there's no workaround for it in form of a patch.  And sometimes there 
are patches for non-OSS software too.  (This bug is not serious because 
I'm the only one who can confirm it.  But you're talking generally, so 
this specific bug is not the point of this reply.)


ATI doesn't support Gentoo, btw.  Only Ubuntu, openSUSE and some other 
high-profile distros are "supported".  In quotes because it's broken 
even on those ;)