[gentoo-user] Re: Random segfaults with kernel 2.6.27
Andrey Falko wrote:
> On 11/20/08, Nikos Chantziaras [EMAIL PROTECTED] wrote:
>> I upgraded to kernel 2.6.27 (gentoo-sources-2.6.27-r3) yesterday. Today I experienced random segfaults during an emerge (twice while emerging mozilla-thunderbird; the first time as (the assembler) segfaulted, on the second try python segfaulted at the end of the emerge). Anyone noticing something similar? I'm reverting back to 2.6.26 for now.
>
> You do not see these segfaults on 2.6.26, right?

Correct.

> When I had the symptoms you described, it turned out that my RAM voltage needed to be raised in the BIOS.

Hmm. My system *is* overclocked (I'm one of those enthusiast guys). I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good aftermarket cooler (temps never go above 48C at full load). The CPU is overclocked and *undervoltaged* (1.29V from its 1.35V stock). The RAM is both underclocked (to get an FSB:DRAM ratio of 1:1) and undervoltaged. The system has been confirmed stable though; 8 hours Prime95 stress test with no errors, which is much more of a stress test than any real application can pull off. It also passes memtest.

> Do you run the proprietary nvidia-drivers? If you don't run any software that taints the kernel, I'd file a bug with upstream kernel people: http://bugzilla.kernel.org/

As it happens, upgrading to kernel 2.6.27 was not the only change; I switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers. Didn't think that this has anything to do with it though.

Can you recommend a Linux program that does a stress test like Prime95 on Windows?
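The question of whether anything taints the kernel can be checked directly rather than from memory. The following is a minimal sketch, assuming a Linux system that exposes the taint mask at /proc/sys/kernel/tainted, where bit 0 is set once a proprietary module has been loaded. The function names are made up for illustration and are not part of any tool mentioned in the thread.

```python
from pathlib import Path
from typing import Optional

# Bit 0 of the taint mask is set once a proprietary module has been loaded.
PROPRIETARY_MODULE_BIT = 0x1


def read_taint_mask(path: str = "/proc/sys/kernel/tainted") -> Optional[int]:
    """Return the kernel taint mask, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def has_proprietary_taint(mask: int) -> bool:
    """True if the mask says a proprietary module was loaded."""
    return bool(mask & PROPRIETARY_MODULE_BIT)


if __name__ == "__main__":
    mask = read_taint_mask()
    if mask is None:
        print("taint mask not available on this system")
    elif mask == 0:
        print("kernel is not tainted")
    else:
        print(f"taint mask {mask}; proprietary module loaded: "
              f"{has_proprietary_taint(mask)}")
```

A non-zero mask does not by itself say which module is at fault; it only tells you whether upstream kernel developers are likely to ask you to reproduce the problem without the proprietary driver first.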
Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27
Nikos Chantziaras wrote:
> Andrey Falko wrote:
>> When I had the symptoms you described, it turned out that my RAM voltage needed to be raised in the BIOS.
>
> Hmm. My system *is* overclocked (I'm one of those enthusiast guys). I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good aftermarket cooler (temps never go above 48C at full load). The CPU is overclocked and *undervoltaged* (1.29V from its 1.35V stock). The RAM is both underclocked (to get an FSB:DRAM ratio of 1:1) and undervoltaged. The system has been confirmed stable though; 8 hours Prime95 stress test with no errors, which is much more of a stress test than any real application can pull off. It also passes memtest.

I used to overclock some too until I started running folding. Folding just doesn't like overclocking. I'm not sure what prime95 does but it could still be a problem even though it passes. Just a thought. Your Mileage May Vary.

Dale

:-) :-)
Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27
On 11/20/08, Nikos Chantziaras [EMAIL PROTECTED] wrote:
> As it happens, upgrading to kernel 2.6.27 was not the only change; I switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers. Didn't think that this has anything to do with it though.

I have a hunch that this is the problem. Have you tried 2.6.27 without the proprietary ATI drivers?

This is the first thing I would suspect because ATI drivers are not reputed --- or so I believe --- for being the best quality software out there.

Either way, we need to eliminate as many factors as possible and one of these factors is your ATI driver change. I don't know much about Prime95, but it won't simulate a subtlety in your video drivers, especially how the proprietary ATI drivers will play with kernel 2.6.27.
[gentoo-user] Re: Random segfaults with kernel 2.6.27
Dale wrote:
> Nikos Chantziaras wrote:
>> Andrey Falko wrote:
>>> When I had the symptoms you described, it turned out that my RAM voltage needed to be raised in the BIOS.
>>
>> Hmm. My system *is* overclocked (I'm one of those enthusiast guys). I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good aftermarket cooler (temps never go above 48C at full load). The CPU is overclocked and *undervoltaged* (1.29V from its 1.35V stock). The RAM is both underclocked (to get an FSB:DRAM ratio of 1:1) and undervoltaged. The system has been confirmed stable though; 8 hours Prime95 stress test with no errors, which is much more of a stress test than any real application can pull off. It also passes memtest.
>
> I used to overclock some too until I started running folding. Folding just doesn't like overclocking. I'm not sure what prime95 does but it could still be a problem even though it passes. Just a thought.

I was running folding too (up to the point where it affected the electricity bill :P). It ran with no errors. For the record, Prime puts more stress on the system than folding; it really brings the system (CPU, northbridge and RAM; the GPU isn't affected) to its knees.
[gentoo-user] Re: Random segfaults with kernel 2.6.27
Andrey Falko wrote:
> On 11/20/08, Nikos Chantziaras [EMAIL PROTECTED] wrote:
>> As it happens, upgrading to kernel 2.6.27 was not the only change; I switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers. Didn't think that this has anything to do with it though.
>
> I have a hunch that this is the problem. Have you tried 2.6.27 without the proprietary ATI drivers?

I'll be running tests on this with various combinations (2.6.26/2.6.27/Catalyst/open source radeonhd) and see what happens. Re-emerging a bunch of stuff should do it.

> This is the first thing I would suspect because ATI drivers are not reputed --- or so I believe --- for being the best quality software out there.

They are reputed to be one of the most broken (from packaging, install scripts up to the actual binary blobs) ever produced ;P Too bad they're the *only* choice on modern cards. No 3D (or even accelerated 2D) support with any open source driver for HD4xxx series cards. It's one of those things you hate to the bone but can't do without due to AMD's refusal to open source them so we can fix them.

> Either way, we need to eliminate as many factors as possible and one of these factors is your ATI driver change. I don't know much about Prime95, but it won't simulate a subtlety in your video drivers, especially how the proprietary ATI drivers will play with kernel 2.6.27.

Yes, Prime doesn't touch the GPU. It stresses CPU, northbridge and RAM only.
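The plan of testing every kernel/driver combination amounts to a small two-factor test matrix. The sketch below only illustrates that bookkeeping: the names `build_matrix` and `isolate` are hypothetical, and the sample outcome in the usage part is invented to show how the function behaves, not a result reported at this point in the thread.

```python
from itertools import product

KERNELS = ["2.6.26", "2.6.27"]
DRIVERS = ["radeonhd", "catalyst"]


def build_matrix(kernels, drivers):
    """Every (kernel, driver) pair that has to be booted and tested."""
    return list(product(kernels, drivers))


def isolate(results):
    """Find factor values that explain all failures.

    `results` maps (kernel, driver) -> True if the emerge segfaulted.
    A value is reported only if every run using it failed and every
    run not using it passed.
    """
    culprits = []
    for index, name in enumerate(("kernel", "driver")):
        values = sorted({combo[index] for combo in results})
        for value in values:
            with_value = [failed for combo, failed in results.items()
                          if combo[index] == value]
            without = [failed for combo, failed in results.items()
                       if combo[index] != value]
            if all(with_value) and not any(without):
                culprits.append((name, value))
    return culprits


if __name__ == "__main__":
    for kernel, driver in build_matrix(KERNELS, DRIVERS):
        print(f"to test: kernel {kernel} with {driver}")

    # Invented outcome, for illustration only.
    example = {
        ("2.6.26", "radeonhd"): False,
        ("2.6.26", "catalyst"): True,
        ("2.6.27", "radeonhd"): False,
        ("2.6.27", "catalyst"): True,
    }
    print(isolate(example))
```

With only two factors the table is easy to keep in your head, but writing it down makes it obvious when a combination was skipped, which is exactly when a wrong conclusion about the kernel or the driver tends to slip in.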
[gentoo-user] Re: Random segfaults with kernel 2.6.27
Nikos Chantziaras wrote:
> Andrey Falko wrote:
>> On 11/20/08, Nikos Chantziaras [EMAIL PROTECTED] wrote:
>>> As it happens, upgrading to kernel 2.6.27 was not the only change; I switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers. Didn't think that this has anything to do with it though.
>>
>> I have a hunch that this is the problem. Have you tried 2.6.27 without the proprietary ATI drivers?
>
> I'll be running tests on this with various combinations (2.6.26/2.6.27/Catalyst/open source radeonhd) and see what happens. Re-emerging a bunch of stuff should do it.

That was easier than expected. I've got segfaults every single time I emerged mozilla-thunderbird. Both with kernel 2.6.26 as well as 2.6.27. Reverting back to the open source radeonhd driver made this stop.

This is a serious issue because you'll get segfaults only if you're lucky. If not lucky, bad machine code could be produced which on Gentoo being source-based is the worst scenario imaginable.
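The worry about silent corruption suggests a simple check: run a deterministic workload several times and compare the results, since identical input must give identical output on a healthy system. The following is a rough sketch of that idea using only Python's standard hashlib; the function names and the workload are assumptions chosen for illustration, and this is far lighter than Prime95, memtest or a full re-emerge.

```python
import hashlib


def workload_digest(rounds: int = 2000) -> str:
    """Hash a fixed seed repeatedly and return the final digest as hex.

    The chain is fully deterministic, so any two runs with the same
    `rounds` must agree.
    """
    data = b"gentoo-user"
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data.hex()


def count_mismatches(repeats: int = 5, rounds: int = 2000) -> int:
    """Repeat the workload and count runs that differ from the first."""
    reference = workload_digest(rounds)
    return sum(1 for _ in range(repeats)
               if workload_digest(rounds) != reference)


if __name__ == "__main__":
    bad = count_mismatches(repeats=20, rounds=200000)
    print(f"{bad} mismatching runs out of 20")
```

Any non-zero count points at the machine rather than at the software being built. A zero count proves little, though: the same comparison applied to real build output (building a package twice and comparing checksums of the results) is closer to the failure described here, provided the build itself is reproducible.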
Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27
Nikos Chantziaras wrote:
> Reverting back to the open source radeonhd driver made this stop. This is a serious issue because you'll get segfaults only if you're lucky. If not lucky, bad machine code could be produced which on Gentoo being source-based is the worst scenario imaginable.

I have been getting random segfaults too using kernels 2.6.26 AND 2.6.27 using the nvidia driver. I never had problems like this before.
Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27
Andrey Falko wrote:
> On 11/20/08, Nikos Chantziaras [EMAIL PROTECTED] wrote:
>> As it happens, upgrading to kernel 2.6.27 was not the only change; I switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers. Didn't think that this has anything to do with it though.
>
> I have a hunch that this is the problem. Have you tried 2.6.27 without the proprietary ATI drivers?
>
> This is the first thing I would suspect because ATI drivers are not reputed --- or so I believe --- for being the best quality software out there.
>
> Either way, we need to eliminate as many factors as possible and one of these factors is your ATI driver change. I don't know much about Prime95, but it won't simulate a subtlety in your video drivers, especially how the proprietary ATI drivers will play with kernel 2.6.27.

I would definitely say this is worth looking into - not necessarily because the ati-drivers are bad software but because they have always stressed my hardware more than any open source drivers (and yielded better performance too).

FWIW I've never known any stress test to be as good as compiling openoffice :)

cya,
--
Iain Buchanan iaindb at netspace dot net dot au

QOTD: There may be no excuse for laziness, but I'm sure looking.
[gentoo-user] Re: Random segfaults with kernel 2.6.27
Iain Buchanan wrote:
> Andrey Falko wrote:
>> On 11/20/08, Nikos Chantziaras [EMAIL PROTECTED] wrote:
>>> As it happens, upgrading to kernel 2.6.27 was not the only change; I switched from xf86-video-radeonhd to the proprietary ATI Catalyst drivers. Didn't think that this has anything to do with it though.
>>
>> I have a hunch that this is the problem. Have you tried 2.6.27 without the proprietary ATI drivers?
>>
>> This is the first thing I would suspect because ATI drivers are not reputed --- or so I believe --- for being the best quality software out there.
>>
>> Either way, we need to eliminate as many factors as possible and one of these factors is your ATI driver change. I don't know much about Prime95, but it won't simulate a subtlety in your video drivers, especially how the proprietary ATI drivers will play with kernel 2.6.27.
>
> I would definitely say this is worth looking into - not necessarily because the ati-drivers are bad software but because they have always stressed my hardware more than any open source drivers (and yielded better performance too).

I have opened a bug for it, but it got closed immediately because I have no evidence to support it. lol :P

http://bugs.gentoo.org/show_bug.cgi?id=247860
Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27
Nikos Chantziaras wrote:
> Iain Buchanan wrote:
>> I would definitely say this is worth looking into - not necessarily because the ati-drivers are bad software but because they have always stressed my hardware more than any open source drivers (and yielded better performance too).
>
> I have opened a bug for it, but it got closed immediately because I have no evidence to support it. lol :P
>
> http://bugs.gentoo.org/show_bug.cgi?id=247860

AFAIR Jer is only a wrangler (ok, "only" is not the right word, but you know what I mean) so he's likely not going to get into too much detail about the specifics of the bug wrt fglrx, ati, etc. You'd have better luck talking to a dev who deals with fglrx or the kernel, and convincing them to look at it.

I know it can be frustrating to have bugs marked as INVALID, but saying "your justification for marking this as INVALID is actually the invalid thing here ... not my bug report" is probably going to get you ignored!

I'm not disagreeing with you here, but I do think you will have a hard time getting hard evidence it's specifically the fglrx driver, and not some effect the fglrx driver is having on your hw...

--
Iain Buchanan iaindb at netspace dot net dot au

If Chuck Norris round-house kicks you, you will die. If Chuck Norris misses you with the round-house kick, the wind behind the kick will tear out your pancreas.
[gentoo-user] Re: Random segfaults with kernel 2.6.27
Iain Buchanan wrote:
> Nikos Chantziaras wrote:
>> Iain Buchanan wrote:
>>> I would definitely say this is worth looking into - not necessarily because the ati-drivers are bad software but because they have always stressed my hardware more than any open source drivers (and yielded better performance too).
>>
>> I have opened a bug for it, but it got closed immediately because I have no evidence to support it. lol :P
>>
>> http://bugs.gentoo.org/show_bug.cgi?id=247860
>
> AFAIR Jer is only a wrangler (ok, "only" is not the right word, but you know what I mean) so he's likely not going to get into too much detail about the specifics of the bug wrt fglrx, ati, etc. You'd have better luck talking to a dev who deals with fglrx or the kernel, and convincing them to look at it.
>
> I know it can be frustrating to have bugs marked as INVALID, but saying "your justification for marking this as INVALID is actually the invalid thing here ... not my bug report" is probably going to get you ignored!
>
> I'm not disagreeing with you here, but I do think you will have a hard time getting hard evidence it's specifically the fglrx driver, and not some effect the fglrx driver is having on your hw...

I don't know what I can do. I'm not willing to set up a hacker kernel and start debugging ATI's module. I posted that bug and also posted on Phoronix to warn people to check if they're affected by this. At least, that's enough for a "told you so" reply later ;D (Though I hope it's just my PC that's acting up here, I don't wish anyone any kind of data loss or anything.)

Other than that, I'll wait for Catalyst 8.12 and see what happens.
Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27
On Thu, Nov 20, 2008 at 9:03 PM, Iain Buchanan [EMAIL PROTECTED] wrote:
> Nikos Chantziaras wrote:
>> I have opened a bug for it, but it got closed immediately because I have no evidence to support it. lol :P
>>
>> http://bugs.gentoo.org/show_bug.cgi?id=247860
>
> AFAIR Jer is only a wrangler (ok, "only" is not the right word, but you know what I mean) so he's likely not going to get into too much detail about the specifics of the bug wrt fglrx, ati, etc. You'd have better luck talking to a dev who deals with fglrx or the kernel, and convincing them to look at it.
>
> I know it can be frustrating to have bugs marked as INVALID, but saying "your justification for marking this as INVALID is actually the invalid thing here ... not my bug report" is probably going to get you ignored!
>
> I'm not disagreeing with you here, but I do think you will have a hard time getting hard evidence it's specifically the fglrx driver, and not some effect the fglrx driver is having on your hw...

I agree. Even if this is a bug, no open source developer could be qualified to fix it because fglrx is closed source. If you can, report this to ATI; otherwise, you will have to live with the problem or use open source drivers.
Re: [gentoo-user] Re: Random segfaults with kernel 2.6.27
Nikos Chantziaras wrote:
> Dale wrote:
>> Nikos Chantziaras wrote:
>>> Andrey Falko wrote:
>>>> When I had the symptoms you described, it turned out that my RAM voltage needed to be raised in the BIOS.
>>>
>>> Hmm. My system *is* overclocked (I'm one of those enthusiast guys). I'm running an E6600 that runs 2.4GHz stock at 3GHz with a good aftermarket cooler (temps never go above 48C at full load). The CPU is overclocked and *undervoltaged* (1.29V from its 1.35V stock). The RAM is both underclocked (to get an FSB:DRAM ratio of 1:1) and undervoltaged. The system has been confirmed stable though; 8 hours Prime95 stress test with no errors, which is much more of a stress test than any real application can pull off. It also passes memtest.
>>
>> I used to overclock some too until I started running folding. Folding just doesn't like overclocking. I'm not sure what prime95 does but it could still be a problem even though it passes. Just a thought.
>
> I was running folding too (up to the point where it affected the electricity bill :P). It ran with no errors. For the record, Prime puts more stress on the system than folding; it really brings the system (CPU, northbridge and RAM; the GPU isn't affected) to its knees.

If you have run folding and it works, then I would think you are ready for liftoff. From what I have read, folding is real touchy on overclocking. To think I bought a mobile AMD CPU and can't overclock because of folding. *sighs*

Dale

:-) :-)
[gentoo-user] Re: Random segfaults with kernel 2.6.27
Andrey Falko wrote:
> [...]
> I agree. Even if this is a bug, no open source developer could be qualified to fix it because fglrx is closed source. If you can, report this to ATI; otherwise, you will have to live with the problem or use open source drivers.

That is not true. If it's in portage, you have to accept bugs against it. If a bug is serious, then that means the package has to stay masked if there's no workaround for it in the form of a patch. And sometimes there are patches for non-OSS software too. (This bug is not serious because I'm the only one who can confirm it. But you're talking generally, so this specific bug is not the point of this reply.)

ATI doesn't support Gentoo, btw. Only Ubuntu, openSUSE and some other high-profile distros are "supported". In quotes because it's broken even on those ;)