[Kernel-packages] [Bug 1677673] [NEW] System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-03-30 Thread bp
Public bug reported: I just upgraded to 17.04. It appears that leaving the system locked for long enough now makes it unable to respond to any input but SysRq. This appears to be the only relevant bit in `journalctl --boot -1`: BUG: unable to handle kernel NULL pointer dereference at

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-03-30 Thread bp
** Attachment added: "PulseList.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1677673/+attachment/4850655/+files/PulseList.txt -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-03-30 Thread bp
** Attachment added: "WikiSyslog.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1677673/+attachment/4850653/+files/WikiSyslog.txt

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-03-30 Thread bp
Scrubbed PII from attachments. ** Attachment added: "JournalErrors.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1677673/+attachment/4850652/+files/JournalErrors.txt

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-03-30 Thread bp
** Attachment added: "CurrentDmesg.txt" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1677673/+attachment/4850654/+files/CurrentDmesg.txt

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-03-30 Thread bp
https://bugs.launchpad.net/bugs/1677673 Title: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0018 in journalctl Status in

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-04-04 Thread bp
Unfortunately I was still able to reproduce. Here's the current trace, this time without arbitrary line cutoffs. While this report, too, has chrome as the "comm" I've also seen compiz being the "culprit." IP: gen8_ppgtt_alloc_page_directories.isra.40+0x115/0x250 [i915] PGD 0 Oops: 0002 [#1]

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl

2017-04-04 Thread bp
Yes, the bug is indeed the same as that one. dd19674bacba227ae5d3ce680cbc5668198894dc doesn't seem like it's made it to the main kernel repository yet.

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl at gen8_ppgtt_alloc_page_directories.isra.38+0x115/0x25

2017-04-04 Thread bp
** Summary changed: - System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0018 in journalctl + System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0018 in journalctl at

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl at gen8_ppgtt_alloc_page_directories.isra.38+0x115/0x25

2017-04-24 Thread bp
My VERY shitty workaround has been to add the yakkety repositories back to my machine with apt edit-sources, then install and use the 4.8.x kernel from there. :/

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl at gen8_ppgtt_alloc_page_directories.isra.38+0x115/0x25

2017-05-19 Thread bp
*** This bug is a duplicate of bug 1680904 *** https://bugs.launchpad.net/bugs/1680904 bleh, whatever gets this bug fixed. ** This bug has been marked a duplicate of bug 1680904 zesty unable to handle kernel NULL pointer dereference

[Kernel-packages] [Bug 1680904] Re: zesty unable to handle kernel NULL pointer dereference

2017-06-02 Thread bp
> I just tried and my Dell XPS 13 9343 does not boot (hangs with a blank screen > right after kernel selection in grub) with generic variants from: > > http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12-rc3 I can replicate this. Dell Latitude E7250.

[Kernel-packages] [Bug 1677673] Re: System soft-freezes, BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 in journalctl at gen8_ppgtt_alloc_page_directories.isra.38+0x115/0x25

2017-05-05 Thread bp
With that comment do you mean to say you're positive that 4.12 will have a fix, or do you mean to say you're positive that 4.11 will NOT have a fix?

[Kernel-packages] [Bug 1690085]

2019-01-19 Thread bp
(In reply to Aaron Muir Hamilton from comment #474) > I just experienced this issue with a Threadripper 2950X on kernel 4.20.3 Anything in dmesg? Please upload full dmesg, /proc/cpuinfo and kernel .config. Anything particular you did to reproduce it? Any correlation between what the box does

[Kernel-packages] [Bug 1690085]

2019-01-21 Thread bp
(In reply to Aaron Muir Hamilton from comment #476) > For the time being, I've now added idle=nomwait to my kernel params, as it > seems at least one of these issues is related to MWAIT (according to AMD's > errata). Ok, that's good. Please give your box hell to check whether that really is

[Kernel-packages] [Bug 1690085]

2019-01-22 Thread bp
(In reply to Aaron Muir Hamilton from comment #480) > So my box was mostly idle for the last day or so, and had locked up while > the monitor was asleep, so I had to reset it. Let me make sure I understand it correctly: you had "idle=nomwait" on the kernel command line and it became unresponsive?
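
One way to answer this kind of question with certainty is to read the running kernel's command line from /proc/cmdline and look for an idle= option. The following is only a minimal sketch: the helper name idle_params and the sample command line are hypothetical and do not come from the thread.

```python
from typing import List


def idle_params(cmdline: str) -> List[str]:
    """Return the values of every idle= option found on a kernel command line."""
    prefix = "idle="
    # Kernel parameters are whitespace-separated tokens.
    return [tok[len(prefix):] for tok in cmdline.split() if tok.startswith(prefix)]


# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet idle=nomwait"
print(idle_params(sample))
```

On a live system the input would be the contents of /proc/cmdline, for example idle_params(open("/proc/cmdline").read()). An empty list means no idle= option was passed at boot.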

[Kernel-packages] [Bug 1690085]

2019-01-22 Thread bp
(In reply to Aaron Muir Hamilton from comment #483) > That is maybe an option, for now I disabled C6 from my board firmware (in > ASRock's case, by selecting "Typical current idle" [sometimes called "Common > current idle" I think, on some Ryzen boards] in Advanced > AMD CBS > Zen > Common Options

[Kernel-packages] [Bug 1690085]

2019-01-17 Thread bp
Sorry but this bugzilla entry is humongous with 110 people on CC and all reporting some different aspects of what they're seeing and what they're trying and and... In order to debug this properly, I'd like for someone to test the latest upstream kernel 4.20 and try to reproduce the issue there.

[Kernel-packages] [Bug 1690085]

2019-01-23 Thread bp
(In reply to Trevor Cordes from comment #485) > Also, we were able to get a stack trace / panic output that was on > the frozen screen in a phone capture jpg. If anyone wants that, I can > attach it. Please do. > We did the "idle: typical" bios tweak What is that? > and the idle=nomwait tweak

[Kernel-packages] [Bug 1690085]

2019-01-23 Thread bp
(In reply to Trevor Cordes from comment #488) > Sorry the pic is so horrible in attachment 280661 [details], but at least > you can see most of the stack traces. I have a wider pic of it I can use to > transcribe the missing right-hand-side bits if needed. Thanks, that's good enough. Btw, is

[Kernel-packages] [Bug 1690085]

2019-01-26 Thread bp
Can people confirm that idle=halt fixes the issue, like Aaron says in comment #506? https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected

[Kernel-packages] [Bug 1690085]

2019-01-26 Thread bp
(In reply to T X from comment #502) > Jan 25 09:38:33 kernel: ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored > Jan 25 09:38:33 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported > by HW (0x0) Yah, even with the new BIOS that's still there. Doesn't look like it has been fixed. If

[Kernel-packages] [Bug 1690085]

2019-01-26 Thread bp
(In reply to Maxim Bakulin from comment #503) > I tried and it didn't help. After freezes started happening at Xubuntu > 18.04LTS with stock 4.15, I compiled latest 4.20.3 kernel and added > CONFIG_RCU_NOCB_CPU=y, as it was suggested here, but freezes didn't stop. > There's my dmesg at comment 492

[Kernel-packages] [Bug 1690085]

2019-01-26 Thread bp
(In reply to Maxim Bakulin from comment #500) > I found out that openSUSE Tumbleweed with stock 4.20.0 kernel doesn't appear > to freeze. I tried copying suse's kernel and /lib/modules to Xubuntu > 18.04LTS, but it didn't seem to help, xubuntu still freezes. Maybe, > xubuntu's firmware packages

[Kernel-packages] [Bug 1690085]

2019-01-26 Thread bp
(In reply to T X from comment #498) > Jan 24 17:59:39 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported > by HW (0x0) So this looks like a broken BIOS. > Jan 24 17:21:00 kernel: Hardware name: To Be Filled By O.E.M. To Be > Filled By O.E.M./AB350 Gaming-ITX/ac, BIOS P4.60 04/19/2018

[Kernel-packages] [Bug 1690085]

2019-01-28 Thread bp
(In reply to T X from comment #509) > I haven't built a kernel in about decade, but if you provide step-by-step > instructions for how to do so, I'll attempt to build the patched kernel and > then post the dmesg. Ok, here we go: 1. Clone stable kernel from here if you haven't done so yet: $ git

[Kernel-packages] [Bug 1690085]

2019-01-28 Thread bp
Created attachment 280825 acpi-dump-cstates.diff https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks Status in

[Kernel-packages] [Bug 1690085]

2019-03-24 Thread bp
(In reply to JerryD from comment #578) > $ sudo rdmsr -a 0xc0011020 > 68000 > > This is on: > > cpu family: 23 > model : 17 > model name: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx > stepping : 0 > microcode : 0x8101007 You're showing me the MSR output for

[Kernel-packages] [Bug 1690085]

2019-03-22 Thread bp
(In reply to Dennis Schridde from comment #573) > 2068100 If this is correct: http://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%205%202400G.html then you should have in /proc/cpuinfo something like this: cpu family: 23 model: 17 stepping: 0 and if so, not affected.

[Kernel-packages] [Bug 1690085]

2019-03-22 Thread bp
(In reply to Lars Viklund from comment #570) > rdmsr yields 68010, which has bit 4 set. Looks like your BIOS applies the fix. Now, does the patch in comment #526 fix your freezes?
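
The "bit 4 set" reading can be checked by hand: 68010 in hexadecimal ends in 0x10, which is exactly bit 4, whereas a value of 68000 has that bit clear. A minimal sketch of the same check follows. It assumes the rdmsr output is hexadecimal without a 0x prefix, as in the values quoted in this thread, and the helper name msr_bit_is_set is hypothetical.

```python
def msr_bit_is_set(rdmsr_output: str, bit: int) -> bool:
    """Return True if `bit` is set in a hexadecimal value printed by rdmsr."""
    value = int(rdmsr_output.strip(), 16)  # assumption: output is hex
    return (value >> bit) & 1 == 1


# Values of MSR 0xc0011020 quoted in the thread.
for reading in ("68010", "68000"):
    print(reading, msr_bit_is_set(reading, 4))
```

This only interprets a value that has already been read; it does not read or write the MSR itself.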

[Kernel-packages] [Bug 1690085]

2019-03-21 Thread bp
(In reply to Tolga Cakir from comment #568) > @Borislav has the fix for erratum 1033 "A Lock Operation May Cause the > System to Hang" been applied so far? The suggested workaround was "Program > MSRC001_1020[4] to 1b", but I couldn't find anything about it in master > branch. According to the

[Kernel-packages] [Bug 1690085]

2019-03-21 Thread bp
There's a patch in comment #526 for people to test before we include it so that there's at least *some* fix in the kernel, going forward...

[Kernel-packages] [Bug 1690085]

2019-03-21 Thread bp
(In reply to alfie from comment #566) > But I really want C6/P6 and ignoring the BIOS seems to get me there... Why do you think the patch I pointed to won't give you C6?

[Kernel-packages] [Bug 1690085]

2019-02-05 Thread bp
(In reply to Klaus Mueller from comment #530) > Hmm, if rev. 1 doesn't support MWAIT - why can it be a problem anyway > at the same time which must be fixed by disabling the usage of MWAIT? > I seem to miss something? That's a good question but, frankly, I don't have a very exact answer to it

[Kernel-packages] [Bug 1690085]

2019-02-05 Thread bp
Ok, here's a test patch on top of 4.20-stable. It should practically make idle=halt the default on revisions before B2. You check which revision you have by doing $ grep stepping /proc/cpuinfo The number must be < 2. For the folks with B2 machines we need to keep debugging. Thx.
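
The stepping check described above can also be scripted. The sketch below parses /proc/cpuinfo-style text and applies the "< 2" rule from the comment; the function names and the sample text are hypothetical, and the only criterion used is the stepping number, exactly as stated above.

```python
from typing import Set


def steppings(cpuinfo_text: str) -> Set[int]:
    """Collect the distinct 'stepping' values from /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "stepping":
            found.add(int(value.strip()))
    return found


def before_b2(stepping: int) -> bool:
    """Apply the rule quoted above: the stepping number must be < 2."""
    return stepping < 2


# Hypothetical excerpt, for illustration only.
sample = "processor\t: 0\ncpu family\t: 23\nmodel\t\t: 1\nstepping\t: 1\n"
for s in sorted(steppings(sample)):
    print(s, before_b2(s))
```

On a real machine the input would be open("/proc/cpuinfo").read().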

[Kernel-packages] [Bug 1690085]

2019-02-05 Thread bp
(In reply to Klaus Mueller from comment #528) > What's the downside of generally disabling mwait? So in your case, you can't do MWAIT to enter C1 anyway because your revision doesn't support it. This is why you're seeing those firmware messages. > I'm using a Ryzen 7 1700X and don't have any

[Kernel-packages] [Bug 1690085]

2019-02-05 Thread bp
Created attachment 280961 Don't do mwait on B1 and earlier https://bugs.launchpad.net/bugs/1690085 Title: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks

[Kernel-packages] [Bug 1690085]

2019-01-25 Thread bp
Folks, just a quick thing: please check whether you have the latest BIOS and if not, do upgrade it and check if it makes any difference. Thx.

[Kernel-packages] [Bug 1690085]

2019-02-01 Thread bp
(In reply to Philip Rosvall from comment #518) > I can confirm that "idle=halt" makes both my Ryzen desktop and Ryzen > notebook more stable than anything else I've tried. I have not experienced > freezes for some days now. Can you upload dmesg from the working and non-working kernels pls? Thx.

[Kernel-packages] [Bug 1690085]

2019-02-02 Thread bp
(In reply to Philip Rosvall from comment #522) > I have only used the workaround since kernel 4.20.4, so I just booted the > kernel now without and with the parameter. As can be seen in line 704 in the > log without "idle=halt", we see the error "[Firmware Bug]: ACPI MWAIT > C-state 0x0 not

[Kernel-packages] [Bug 1690085]

2019-05-16 Thread bp
(In reply to Philip Rosvall from comment #593) > Since 5.0.10, and this commit ...: > > (https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.0.10) > "commit 205c53cbe553c9e5a9fe93f63e398da7e59124b6 > Author: Thomas Gleixner > Date: Sun Apr 14 19:51:06 2019 +0200 > > x86/speculation:

[Kernel-packages] [Bug 1690085]

2019-09-11 Thread bp
> Sep 10 21:13:56 BelliashPC kernel: Workqueue: 0x0 (events) > Sep 10 21:13:56 BelliashPC kernel: RIP: 0010:worker_thread+0xfc/0x3b0 > Sep 10 21:13:56 BelliashPC kernel: Code: 0f 84 8e 02 00 00 48 8b 4c 24 18 49 > 39 4f 38 0f 85 a9 02 00 00 83 e0 fb 41 89 47 60 ff 4a 34 49 8b 17 49 8b 47 > 08 48

[Kernel-packages] [Bug 1690085]

2019-11-19 Thread bp
Folks, this bugzilla entry, with the amount of different issues reported and the amount of FUD all collected in one place, is prohibitively hard for a debugger to handle. So, I'd suggest if you still would like your issue looked at, to open a separate bug. And please refrain from commenting on a