RE: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-18 Thread Deucher, Alexander
> -----Original Message-----
> From: Matthias Graf [mailto:matthias.g...@st.ovgu.de]
> Sent: Friday, April 18, 2014 7:46 AM
> To: Borislav Petkov
> Cc: linux-kernel@vger.kernel.org; Tony Luck; Deucher, Alexander
> Subject: Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-18 Thread Borislav Petkov
On Fri, Apr 18, 2014 at 01:45:42PM +0200, Matthias Graf wrote:
> I applied your patch to Linus' current master (3.15.0-rc1+) and indeed
> it does solve the issue for me!
>
> Thanks for your help.
>
> I would appreciate it if you keep me posted on updates.

Ok, goodie, so this one really causes

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-18 Thread Matthias Graf
I applied your patch to Linus' current master (3.15.0-rc1+) and indeed it does solve the issue for me!
Thanks for your help.
I would appreciate it if you keep me posted on updates.
Best, Matthias

On 18.04.2014 at 11:45, Borislav Petkov wrote:
> On Fri, Apr 18, 2014 at 11:17:34AM +0200, Matthias

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-18 Thread Borislav Petkov
On Fri, Apr 18, 2014 at 11:17:34AM +0200, Matthias Graf wrote:
> Fine-grained bisection result:
>
> ab70b1dde73ff4525c3cd51090c233482c50f217 is the first bad commit
> commit ab70b1dde73ff4525c3cd51090c233482c50f217
> Author: Alex Deucher
> Date: Fri Nov 1 15:16:02 2013 -0400
>

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-18 Thread Matthias Graf
Fine-grained bisection result:

ab70b1dde73ff4525c3cd51090c233482c50f217 is the first bad commit
commit ab70b1dde73ff4525c3cd51090c233482c50f217
Author: Alex Deucher
Date: Fri Nov 1 15:16:02 2013 -0400

    drm/radeon: enable DPM by default on r7xx asics

    Seems to be stable on them.
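As an aside for readers of this thread: the bisection reported above narrows a known-good/known-bad range down to the first bad commit by repeatedly testing the midpoint. The following is only a minimal sketch of that idea, not the procedure actually run in the thread. The commit names `c1` to `c4`, the helper `boots_into_mce`, and the choice of `c3` as the culprit are hypothetical, and the sketch assumes that every commit after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]


# Hypothetical history between the last working and first failing release.
history = ["v3.12", "c1", "c2", "c3", "c4", "v3.13"]
culprit = "c3"  # assumption for illustration only
tested = []


def boots_into_mce(commit: str) -> bool:
    """Stand-in for building and booting a kernel at this commit."""
    tested.append(commit)
    return history.index(commit) >= history.index(culprit)


result = first_bad(history, boots_into_mce)
print(result, tested)
```

Each test halves the remaining range, which is why a bisection between two releases needs only about log2(N) build-and-boot cycles for N commits.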

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-17 Thread Borislav Petkov
On Thu, Apr 17, 2014 at 08:25:58AM +0200, Matthias Graf wrote:
> Ok. I tried:
>
> 3.15-rc1 (16. April)
> failed.
>
> Bisecting turned out:
> last working: 3.12.17
> first failing: 3.13

Ok, next steps would then be:

* test stock 3.12.
  -> if it works, bisect between 3.12 and 3.13.
  -> if not,

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-17 Thread Matthias Graf
Ok. I tried:

3.15-rc1 (16. April)
failed.

Bisecting turned out:
last working: 3.12.17
first failing: 3.13

On 16.04.2014 at 16:22, Borislav Petkov wrote:
> On Wed, Apr 02, 2014 at 04:14:31PM +0200, Matthias Graf wrote:
>> I now tried booting with a different graphics card (on the same
>>

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-16 Thread Borislav Petkov
On Wed, Apr 02, 2014 at 04:14:31PM +0200, Matthias Graf wrote:
> I now tried booting with a different graphics card (on the same
> machine), and it resolved the problem. Therefore, it definitely has
> something to do with graphics.
>
> It is a Sapphire ATI Radeon HD 4830 (RV770 chip).

As Tony

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-04-02 Thread Matthias Graf
I now tried booting with a different graphics card (on the same machine), and it resolved the problem. Therefore, it definitely has something to do with graphics.
It is a Sapphire ATI Radeon HD 4830 (RV770 chip).
Kind Regards, Matthias

On 24.03.2014 at 18:22, Matthias Graf wrote:
> Yes it also

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-03-21 Thread Tony Luck
On Fri, Mar 21, 2014 at 1:13 PM, Borislav Petkov wrote:
> Provided the decode is correct and I'm reading it right, this looks
> like the cores get to livelock for some reason without any forward
> progress. The MCEs signal that there hasn't been any instruction retired
> in relatively long time,

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-03-21 Thread Borislav Petkov
+ Tony.

Provided the decode is correct and I'm reading it right, this looks like the cores get to livelock for some reason without any forward progress. The MCEs signal that there hasn't been any instruction retired in relatively long time, thus a stall.
You say, this happens when gnome starts.

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-03-21 Thread Matthias Graf
(Please CC me on all replies)

mcelog output for all mces:

Hardware event. This is not a software error.
CPU 3 BANK 0
MCG status:RIPV MCIP
MCi status:
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access

Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64

2014-03-21 Thread Borislav Petkov
On Fri, Mar 21, 2014 at 06:10:23PM +0100, Matthias Graf wrote:
> Please CC me on replies.
>
> [1.] Kernel panic: Fatal Machine Check after booting >=
> 3.13.5-101.fc19.x86_64; 3.12.11-201.fc19.x86_64 works fine!
> [2.] Screen freezes a few seconds after Gnome appears. The error message
> (see
