https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #19 from Dan Horák ---
They (=
https://src.fedoraproject.org/fork/sharkcz/rpms/kernel/blob/talos/f/ppc64-talos-amdgpu-reset.patch)
can go upstream.
I have a 4.20 kernel on the host with recent firmware for polaris11, the
skiroot
https://bugs.freedesktop.org/show_bug.cgi?id=108585
Alex Deucher changed:
What|Removed |Added
See Also||https://bugs.freedesktop.or
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #18 from Alex Deucher ---
(In reply to Dan Horák from comment #17)
> Fedora/ppc64le users can find a pre-built kernel with the patchset at
> https://copr.fedorainfracloud.org/coprs/sharkcz/talos-kernel/build/817728/
Should these
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #17 from Dan Horák ---
Fedora/ppc64le users can find a pre-built kernel with the patchset at
https://copr.fedorainfracloud.org/coprs/sharkcz/talos-kernel/build/817728/
--
You are receiving this mail because:
You are the assignee
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #16 from Dan Horák ---
Reset on init sounds better to me as the loader kernel (in kexec case) is more
difficult to update than the host kernel.
And for the record - after updating the skiroot kernel firmware version to the
latest
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #15 from Alex Deucher ---
Created attachment 142316
--> https://bugs.freedesktop.org/attachment.cgi?id=142316&action=edit
more involved fix
These patches attempt to reset the GPU on init if the GPU was already running
from a previous
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #14 from Alex Deucher ---
Created attachment 142303
--> https://bugs.freedesktop.org/attachment.cgi?id=142303&action=edit
possible fix
(In reply to Benjamin Herrenschmidt from comment #12)
> We have no control on what firmware is loaded
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #13 from Christian König ---
(In reply to Benjamin Herrenschmidt from comment #12)
> We'll probably need to add something to the amdgpu shutdown() path to force
> an adapter reset.
If that would be possible we would have already
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #12 from Benjamin Herrenschmidt ---
We have no control on what firmware is loaded by the target distro, so the right
thing is going to be to reset the adapter.
We'll probably need to add something to the amdgpu shutdown() path to force an
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #11 from Dan Horák ---
Thanks for the info, I've documented that in the Talos wiki under
https://wiki.raptorcs.com/wiki/Troubleshooting/GPU#AMDGPU_driver_crashes_after_firmware_update
https://bugs.freedesktop.org/show_bug.cgi?id=108585
Christian König changed:
What|Removed |Added
CC||joel.s...@gmail.com
--- Comment #10
https://bugs.freedesktop.org/show_bug.cgi?id=108585
Christian König changed:
What|Removed |Added
Resolution|--- |NOTABUG
Status|NEW
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #8 from Dan Horák ---
I should have mentioned I'm kexec-ing too. It's from 4.15.9 (in skiroot) to
Fedora kernels 4.16, 4.17, 4.18 and now 4.19 over time. It worked fine
until the recent amdgpu firmware update. The skiroot
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #7 from Joel ---
(In reply to Benjamin Herrenschmidt from comment #6)
> Dan... did you do some firmware changes here ? Could it have to do with the
> versions differences between petitboot and the final kernel ?
FWIW, Talos II
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #6 from Benjamin Herrenschmidt ---
They may or may not be related ... Alex, kexec is how we boot these machines,
there's a Linux kernel in flash that runs a Linux-based bootloader.
Until recently however, that didn't have an amdgpu
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #5 from Alex Deucher ---
(In reply to Joel from comment #4)
> I see a similar backtrace on 4.19.0-11706-g11743c56785c (Linus' tree
> mid-merge window).
>
> My system has a "fiji" card. The first kernel is 4.19 (upstream release),
>
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #4 from Joel ---
I see a similar backtrace on 4.19.0-11706-g11743c56785c (Linus' tree mid-merge
window).
My system has a "fiji" card. The first kernel is 4.19 (upstream release), and
the second kernel where the backtrace occurs is
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #3 from Dan Horák ---
Ha, so it's the firmware stored in the initrds that differs (lsinitrd
lied). And the latest polaris11* ones provoke the crash. When I manually
replaced them with the ones from the rc8 initrd, I've
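
For readers who want to repeat the check described in Comment #3, here is a
minimal sketch, not the method used in the bug report. It assumes both initrds
have already been unpacked into plain directories and that the firmware blobs
are stored uncompressed. The function names and the "polaris11*" default
pattern are illustrative assumptions; only the polaris11 name comes from the
comment above.

```python
import hashlib
from pathlib import Path


def firmware_digests(root, pattern="polaris11*"):
    """Map each matching file (path relative to root) to its SHA-256 digest.

    `root` is assumed to be a directory holding an already-unpacked initrd.
    """
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob(pattern))
        if p.is_file()
    }


def differing_firmware(old_root, new_root, pattern="polaris11*"):
    """Return the relative paths whose contents differ between the two trees.

    A file present in only one tree is reported as differing as well.
    """
    old = firmware_digests(old_root, pattern)
    new = firmware_digests(new_root, pattern)
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

Calling differing_firmware() with the two unpacked directories lists the blobs
that changed, which is a content-level comparison rather than a listing of file
names alone.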
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #2 from Dan Horák ---
(In reply to Michel Dänzer from comment #1)
> There were no amdgpu driver changes between rc8 and final... Are you sure
> this is 100% reproducible with the latter and not reproducible with the
> former? If so,
https://bugs.freedesktop.org/show_bug.cgi?id=108585
--- Comment #1 from Michel Dänzer ---
There were no amdgpu driver changes between rc8 and final... Are you sure this
is 100% reproducible with the latter and not reproducible with the former? If
so, can you bisect?
https://bugs.freedesktop.org/show_bug.cgi?id=108585
Bug ID: 108585
Summary: *ERROR* hw_init of IP block failed -22
Product: DRI
Version: unspecified
Hardware: PowerPC
OS: Linux (All)
Status: NEW