Hi Jörg,

the "AMD-Vi: Completion-Wait loop timed out" message is at [65499.964105], but the amdgpu error is at [   52.772273], hence much earlier.
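
The ordering of the two messages can be checked with a short script. This is only a minimal sketch and not something taken from the thread: it assumes the log lines carry the usual "[ seconds.microseconds] message" prefix, and the function name `first_timestamps` and the search strings passed to it are illustrative choices.

```python
import re

# Assumed dmesg line shape: "[ seconds.microseconds] message text".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def first_timestamps(lines, needles):
    """Return {needle: timestamp of the first line whose message contains it}."""
    found = {}
    for line in lines:
        match = TIMESTAMP.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines without a timestamp prefix
        stamp, message = float(match.group(1)), match.group(2)
        for needle in needles:
            # Record only the earliest occurrence of each search string.
            if needle in message and needle not in found:
                found[needle] = stamp
    return found
```

Called with the lines of the attached dmesg.log and the two strings "amdgpu" and "AMD-Vi: Completion-Wait loop timed out", it would return the first timestamp seen for each, which can then be compared directly.
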
I have not tried an upstream kernel yet. Which one would you recommend?

As far as inconsistencies in the PCI setup are concerned, the only thing I know of right now is that we haven't entered a PCI subsystem vendor and device ID yet. It is still "Advanced Micro Devices". We will change that soon to "General Electric" or "Emerson".

Best regards,
Edgar

-----Original Message-----
From: [email protected] <[email protected]>
Sent: Wednesday, 4 November 2020 09:53
To: Merger, Edgar [AUTOSOL/MAS/AUGS] <[email protected]>
Cc: [email protected]
Subject: [EXTERNAL] Re: amdgpu error whenever IOMMU is enabled

Hi Edgar,

On Fri, Oct 30, 2020 at 02:26:23PM +0000, Merger, Edgar [AUTOSOL/MAS/AUGS] wrote:
> With one board we have a boot-problem that is reproducible at every ~50 boot.
> The system is accessible via ssh and works fine except for the
> Graphics. The graphics is off. We don't see a screen. Please see
> attached “dmesg.log”. From [52.772273] onwards the kernel reports
> drm/amdgpu errors. It even tries to reset the GPU but that fails too.
> I tried to reset amdgpu also by command “sudo cat
> /sys/kernel/debug/dri/N/amdgpu_gpu_recover”. That did not help either.

Can you reproduce the problem with an upstream kernel too?

These messages in dmesg indicate some problem in the platform setup:

	AMD-Vi: Completion-Wait loop timed out

Might there be some inconsistencies in the PCI setup between the bridges and the endpoints or something?

Regards,

	Joerg

_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu
