[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145

Michael Ellerman (mich...@ellerman.id.au) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #26 from Michael Ellerman (mich...@ellerman.id.au) ---
Now merged, thanks all. If the fix isn't working please reopen.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145

Michael Ellerman (mich...@ellerman.id.au) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |mich...@ellerman.id.au
         Resolution|---                         |CODE_FIX

--- Comment #25 from Michael Ellerman (mich...@ellerman.id.au) ---
This is in my fixes branch which I'll send to Linus later this week:
https://git.kernel.org/powerpc/c/b4fc36e60f25cf22bf8b7b015a701015740c3743
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #24 from Shawn Anastasio (sh...@anastas.io) ---
Done. I've also updated the product/component to Platform Specific/PPC-64 since this wasn't an issue with amdgpu after all.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145

Shawn Anastasio (sh...@anastas.io) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Video(DRI - non Intel)      |PPC-64
     Kernel Version|5.1.15                      |5.2.1
            Product|Drivers                     |Platform Specific/Hardware
         Regression|No                          |Yes
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #23 from Timothy Pearson (tpear...@raptorengineering.com) ---
A nit, but might want to mark this bug as a regression and update the kernel version to 5.2.1?
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #22 from Shawn Anastasio (sh...@anastas.io) ---
Great! I've posted the patch to the linuxppc-dev mailing list here: https://patchwork.ozlabs.org/patch/1133466/.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145

Robert Bridge (rob...@robbieab.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rob...@robbieab.com

--- Comment #20 from Robert Bridge (rob...@robbieab.com) ---
I was encountering a bug showing similar symptoms with a different trigger: for me, any attempt to play sound consistently and immediately crashed my system. This was not the case with the 4.20 kernel, but was confirmed to happen with the 5.1 kernel. git bisection identified the same patch Timothy identified as the one introducing the issue for me.

I can confirm that the patch provided by Shawn appears to fix the issue. Building a kernel with that patch applied to head (commit 22051d9c4a57d3b4a8b5a7407efc80c71c7bfb16) from linux.git provides me with a kernel that no longer crashes when I attempt to play sound.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #19 from Shawn Anastasio (sh...@anastas.io) ---
Created attachment 283803
  --> https://bugzilla.kernel.org/attachment.cgi?id=283803=edit
test patch #3

Oops, missed a couple of includes and made a typo. Fixed those.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #18 from Shawn Anastasio (sh...@anastas.io) ---
Created attachment 283801
  --> https://bugzilla.kernel.org/attachment.cgi?id=283801=edit
test patch #2

Here's the new patch that should restore the previous behavior correctly.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #17 from Shawn Anastasio (sh...@anastas.io) ---
On second glance, it seems I got it backwards. pgprot_noncached /is/ actually being set via the default implementation of arch_dma_mmap_pgprot, but this creates the opposite issue. In the coherent case, the vma is now marked as noncached, but in the previous implementation it was not.

I'll post a new patch to solve this by providing a powerpc implementation of arch_dma_mmap_pgprot that only sets noncached in the !coherent case to match the previous behavior.
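The difference described in comment #17 can be sketched in plain userspace C. This is only a model under stated assumptions: the `model_*` types, the `MODEL_PROT_NONCACHED` bit, and the function names are hypothetical stand-ins, not the kernel's `pgprot_t`, `struct device`, or the actual patch. It shows the two policies side by side: one that marks every DMA mmap as noncached, and one that does so only for non-coherent devices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
typedef uint64_t model_pgprot_t;
#define MODEL_PROT_NONCACHED (UINT64_C(1) << 0) /* made-up "cache-inhibited" bit */

struct model_device {
    bool dma_coherent; /* true if the device's DMA is cache coherent */
};

/* Models the effect of marking a mapping as noncached. */
static inline model_pgprot_t model_pgprot_noncached(model_pgprot_t prot)
{
    return prot | MODEL_PROT_NONCACHED;
}

/*
 * Policy described as the regression: the mapping is marked noncached
 * regardless of whether the device is coherent.
 */
static inline model_pgprot_t
model_default_mmap_pgprot(const struct model_device *dev, model_pgprot_t prot)
{
    (void)dev; /* coherence is ignored by this policy */
    return model_pgprot_noncached(prot);
}

/*
 * Policy proposed in comment #17: only the !coherent case is marked
 * noncached; coherent devices keep the protection bits they were given.
 */
static inline model_pgprot_t
model_powerpc_mmap_pgprot(const struct model_device *dev, model_pgprot_t prot)
{
    assert(dev != NULL);
    if (!dev->dma_coherent)
        return model_pgprot_noncached(prot);
    return prot;
}
```

For a coherent device the two policies disagree, and that is the case the thread identifies as the source of the checkstop. For a non-coherent device they return the same result.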
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #16 from Shawn Anastasio (sh...@anastas.io) ---
Created attachment 283799
  --> https://bugzilla.kernel.org/attachment.cgi?id=283799=edit
test patch #1

Though I'm not familiar with this code, a quick spot check shows what I believe to be an inconsistency with the commit's claim of functional identicality. Namely, the previous caller of __dma_get_coherent_pfn (now arch_dma_coherent_to_pfn) would explicitly modify the vm_area to mark it as uncacheable in the !coherent case. It seems the new caller (dma_common_mmap) does not do this.

I have written a small patch to restore the previous behavior (I think). Note that this probably isn't upstreamable since this fix should probably go somewhere in the powerpc arch code rather than the dma core.

Tim, since you're the only one who can easily reproduce this, would you mind giving this patch a shot?
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #15 from Timothy Pearson (tpear...@raptorengineering.com) ---
Bisect shows that the failing commit is:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.1.y=cc17d7802b7dcbb073e7be1eee2cf6fa64d9

This is on a WX7100 GPU, the lockup is 100% repeatable after that patch goes in. Unfortunately it does not cleanly reverse on the 5.1 kernel so I can't verify this is the only problem with 5.1, but it's a start.

Any IBMers have ideas why this patch went in and why it would be causing "Cache line inhibited hit cacheable space" faults?
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #14 from Timothy Pearson (tpear...@raptorengineering.com) ---
Looks like this is a case where it's fairly critical to have a known 100% reliable way to reproduce a bug like this. I'm bisecting further, so we should be able to get to the bottom of this soon.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #13 from Daniel Kolesa (li...@octaforge.org) ---
Well, previously I had X11 running with 5.1.9 for ~2 weeks with no lockups; then I rebooted into 5.1.17 and suddenly it locked up the second day with a checkstop. Now back in 5.1.9, and nothing for almost 3 days already.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #12 from Timothy Pearson (tpear...@raptorengineering.com) ---
I just double checked -- 5.0.21 works, 5.1.0 does not. Daniel, didn't you say your lockup was somewhat random? I can trigger this lockup every single time with a specific action, so is it possible 5.1.9 for you just hasn't hit the right combination of factors to trigger the lockup on your system?
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #10 from Shawn Anastasio (sh...@anastas.io) ---
I actually haven't tested 5.1.9, but I can confirm that 5.0.9 works fine.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #11 from Daniel Kolesa (li...@octaforge.org) ---
I see, I misread then, because the working kernel I have here is 5.1.9... I did have it hang on 5.1.17, though.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #9 from Daniel Kolesa (li...@octaforge.org) ---
That's strange. Both Shawn and I had 5.1.9 as a definitely working version.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #8 from Timothy Pearson (tpear...@raptorengineering.com) ---
I'm seeing this on 5.1.0 and up. 5.0.0+ was the last working version for me, I'm continuing the bisect.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #7 from Daniel Kolesa (li...@octaforge.org) ---
Maybe narrow it down first (i.e. find the last release that was good, by testing 5.1.14 first), and then bisect only the history between the last good and first bad release. We know 5.1.9 is good but we don't know which release between 5.1.9 and 5.1.15 introduced the problem.
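The narrowing step suggested in comment #7 is a binary search between a known-good and a known-bad point. The sketch below is a rough illustration and not how `git bisect` is implemented, since git searches a commit graph rather than a flat list. It assumes an ordered list of releases addressed by index, and that every release from the first bad one onward is bad. The names `first_bad_index`, `is_bad_fn`, `threshold`, and `bad_from_threshold` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if the release at `index` shows the bug. */
typedef bool (*is_bad_fn)(size_t index, void *ctx);

/*
 * Binary search for the first bad release.
 * Precondition: release `good` is known good, release `bad` is known bad,
 * and good < bad. Each loop iteration tests one release in between.
 */
static inline size_t
first_bad_index(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;  /* the bug was introduced at or before mid */
        else
            good = mid; /* the bug was introduced after mid */
    }
    return bad;
}

/* Toy predicate for illustration: everything from `first_bad` onward fails. */
struct threshold {
    size_t first_bad; /* index of the first failing release */
    size_t tests_run; /* how many releases had to be booted and tested */
};

static inline bool bad_from_threshold(size_t index, void *ctx)
{
    struct threshold *t = ctx;
    t->tests_run++;
    return index >= t->first_bad;
}
```

If indices 0 through 6 stand for 5.1.9 through 5.1.15, calling `first_bad_index(0, 6, ...)` needs at most three test boots to find the first bad release. That matters when each reproduction is as slow as the thread describes.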
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145

Timothy Pearson (tpear...@raptorengineering.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tpearson@raptorengineering.com

--- Comment #6 from Timothy Pearson (tpear...@raptorengineering.com) ---
I just tested and confirmed this bug is still present on the latest 5.2.0+ GIT HEAD. I can reliably reproduce this bug at will, but it is a somewhat involved process to get the machine up and running to the point where I can trigger it each time so bisect will be slow.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #5 from Daniel Kolesa (li...@octaforge.org) ---
Never mind. Seems like I was able to hit the same problem today. It did not happen on video playback though, just randomly, after about a day or two of uptime.
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145

Daniel Kolesa (li...@octaforge.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |li...@octaforge.org

--- Comment #4 from Daniel Kolesa (li...@octaforge.org) ---
Can't reproduce this on 5.1.17 with Polaris WX5100 and 18-core POWER9. Since both you and Timothy have dual-CPU systems with the GPU on the second CPU's PCIe, this could indicate that the problem is only affecting dual processor systems possibly in that specific configuration (alternatively, the problem could be fixed in 5.1.17 already...)
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145

Alex Deucher (alexdeuc...@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeuc...@gmail.com

--- Comment #3 from Alex Deucher (alexdeuc...@gmail.com) ---
can you bisect?
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #2 from Shawn Anastasio (sh...@anastas.io) ---
Created attachment 283639
  --> https://bugzilla.kernel.org/attachment.cgi?id=283639=edit
lspci -vv
[Bug 204145] amdgpu video playback causes host to hard reset (checkstop) on POWER9 with RX 580
https://bugzilla.kernel.org/show_bug.cgi?id=204145
--- Comment #1 from Shawn Anastasio (sh...@anastas.io) ---
Created attachment 283637
  --> https://bugzilla.kernel.org/attachment.cgi?id=283637=edit
/proc/cpuinfo