[Bug 103370] `vblank_mode=0 DRI_PRIME=1 glxgears` will introduce GPU lock up on Intel Graphics [8086:5917] + AMD Graphics [1002:6665] (rev c3)
https://bugs.freedesktop.org/show_bug.cgi?id=103370

Shih-Yuan Lee changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #46 from Shih-Yuan Lee ---
The Linux kernel in comment 45 is 4.15.0-10.11 from Ubuntu 18.04.
With the later version 4.15.0-12.13, I cannot reproduce this issue on Ubuntu 18.04.
4.15.0-12.13 contains the following commit.

commit 239b5f64e12b1f09f506c164dff0374924782979
Author: Alex Deucher
Date:   Tue Nov 21 12:09:38 2017 -0500

    drm/radeon: Add dpm quirk for Jet PRO (v2)

    Fixes stability issues.

    v2: clamp sclk to 600 Mhz

    Bug: https://bugs.freedesktop.org/show_bug.cgi?id=103370
    Acked-by: Christian König
    Signed-off-by: Alex Deucher
    Cc: sta...@vger.kernel.org

diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index ee3e742..97a0a63 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -2984,6 +2984,11 @@ static void si_apply_state_adjust_rules(struct radeon_device *rdev,
 		    (rdev->pdev->device == 0x6667)) {
 			max_sclk = 75000;
 		}
+		if ((rdev->pdev->revision == 0xC3) ||
+		    (rdev->pdev->device == 0x6665)) {
+			max_sclk = 60000;
+			max_mclk = 80000;
+		}
 	} else if (rdev->family == CHIP_OLAND) {
 		if ((rdev->pdev->revision == 0xC7) ||
 		    (rdev->pdev->revision == 0x80) ||

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

--- Comment #45 from Shih-Yuan Lee ---
I can still reproduce this issue on Ubuntu 18.04 with
`seq 100 | while read i; do echo Loop $i; DRI_PRIME=1 glxgears -info|head -n2; done`.

--- Comment #44 from Shih-Yuan Lee ---
I tried max_sclk = 50000 and max_mclk = 60000 on Ubuntu-4.4.0-112.135, but I can still reproduce the GPU lock up issue.

The first run of
`seq 100 | while read i; do echo Loop $i; DRI_PRIME=1 glxgears -info|head -n 3; done`
passes, but a second run of the same loop fails.

--- Comment #43 from Shih-Yuan Lee ---
I can still reproduce the issue after setting max_sclk to 60000 and max_mclk to 80000.

--- Comment #42 from Michel Dänzer ---
(In reply to Robert Liu from comment #41)
> Another issue found is when removing the adapter, the system goes to
> suspend.

That's not directly related to graphics drivers.

--- Comment #41 from Robert Liu ---
So far, with max_sclk set to 60000 and max_mclk set to 80000, the system has passed a 24-hour burn-in test (vblank_mode=0 DRI_PRIME=1 glmark2 --run-forever).

Another issue found is that when the adapter is removed, the system goes to suspend. After I wake it up, it continues running the benchmark.

--- Comment #40 from Shih-Yuan Lee ---
Created attachment 135662
  --> https://bugs.freedesktop.org/attachment.cgi?id=135662&action=edit
dmesg

(In reply to Alex Deucher from comment #38)
> Created attachment 135647 [details] [review]
> workaround for radeon
>
> workarounds for radeon and amdgpu to fix the issue.

I applied this patch on top of the Ubuntu-4.4.0-101.124 Linux kernel. It seems to fix the issue at first, but problems appear later on.

$ seq 20 | while read i; do echo Loop $i; DRI_PRIME=1 glxgears -info|head -n 5; done
Loop 1
radeon: Failed to allocate virtual address for buffer:
radeon:size : 65536 bytes
radeon:alignment : 4096 bytes
radeon:domains : 4
radeon:va: 0x0080
radeon: Failed to deallocate virtual address for buffer:
radeon:size : 65536 bytes
radeon:va: 0x80
radeon: Failed to allocate virtual address for buffer:
radeon:size : 65536 bytes
radeon:alignment : 4096 bytes
radeon:domains : 4
radeon:va: 0x0080
radeon: Failed to deallocate virtual address for buffer:
radeon:size : 65536 bytes
radeon:va: 0x80
radeonsi: Failed to create a context.
Loop 2
...

--- Comment #39 from Alex Deucher ---
Created attachment 135648
  --> https://bugs.freedesktop.org/attachment.cgi?id=135648&action=edit
workaround for amdgpu

--- Comment #38 from Alex Deucher ---
Created attachment 135647
  --> https://bugs.freedesktop.org/attachment.cgi?id=135647&action=edit
workaround for radeon

Workarounds for radeon and amdgpu to fix the issue.

--- Comment #37 from Robert Liu ---
(In reply to Robert Liu from comment #36)
> BTW, with 4.13.0-16-generic, I change the max_sclk in drm/radeon/si_dpm.c
> (what we did with Ubuntu kernel 4.4.0-101-generic) from 75000 to 65000, but
> still met the hang issue.

With max_sclk restricted to 65000 and max_mclk restricted to 80000, neither radeon nor amdgpu shows the issue.

--- Comment #36 from Robert Liu ---
(In reply to Alex Deucher from comment #33)
> I think Sonny fixed this. It was due to using the wrong firmware.
> [1.827060] [drm] initializing kernel modesetting (HAINAN 0x1002:0x6665
> 0x1028:0x0844 0xC3). This chip should be using radeon/banks_k_2_smc.bin smc
> firmware. Is that available on the test system and kernel?

The firmware radeon/banks_k_2_smc.bin is on the test system.
With Ubuntu kernel 4.4.0-101-generic, I am not sure whether the radeon driver is using this firmware.
With Ubuntu kernel 4.13.0-16-generic, I tried both the amdgpu and radeon drivers, but the system still hung. As soon as the system hangs, amdgpu_pm_info shows 'invalid dpm profile 15'.

(In reply to Alex Deucher from comment #34)
> The following commits are relevant:
> abb2e3c1ce64c8bba678973800c34ea1dc97c42c
> 6458bd4dfd9414cba5804eb9907fe2a824278c34
> ef736d394e85b1bf1fd65ba5e5257b85f6c82325
> 4e6e98b1e48c9474aed7ce03025ec319b941e26e

These commits should already be included in Ubuntu kernel 4.13.0-16-generic.

(In reply to Alex Deucher from comment #35)
> Does reverting a628392cf03e0eef21b345afbb192cbade041741 fix the issue?

Reverting this commit does not fix the issue.

BTW, with 4.13.0-16-generic, I changed max_sclk in drm/radeon/si_dpm.c (as we did with Ubuntu kernel 4.4.0-101-generic) from 75000 to 65000, but still hit the hang.

--- Comment #35 from Alex Deucher ---
Does reverting a628392cf03e0eef21b345afbb192cbade041741 fix the issue?

--- Comment #34 from Alex Deucher ---
The following commits are relevant:
abb2e3c1ce64c8bba678973800c34ea1dc97c42c
6458bd4dfd9414cba5804eb9907fe2a824278c34
ef736d394e85b1bf1fd65ba5e5257b85f6c82325
4e6e98b1e48c9474aed7ce03025ec319b941e26e

--- Comment #33 from Alex Deucher ---
(In reply to Michel Dänzer from comment #27)
> Thanks for bisecting, but I don't think that commit can be directly
> responsible for a GPU hang. Before that commit, the DRI3 code in Mesa would
> only use one back buffer for glxgears, which means that the GPU could only
> start rendering a new frame after the previous one had finished presenting.
> Maybe that somehow prevented the hang.

That commit "fixed" a performance regression at the time because it ended up causing enough of a delay that the clocks didn't ramp up. So it probably exposed a kernel dpm issue. Without it, the clocks never ramped up enough to cause an issue. With it, they did.

(In reply to Timo Aaltonen from comment #32)
> forwarding a comment from an engineer:
>
> "During viewing the source code of radeon module, I found there is a bug [1]
> related to the dpm and clocks. So I decided to do some experiments.
> Tried to set different max_sclk and max_mclk to see if the issue is gone.
> 1. max_sclk: 70000, max_mclk: 75000 --> have the same issue
> 2. max_sclk: 50000, max_mclk: 60000 --> pass multi-run test (more than 50
> runs)
>
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=76490
> "

I think Sonny fixed this. It was due to using the wrong firmware.
[1.827060] [drm] initializing kernel modesetting (HAINAN 0x1002:0x6665 0x1028:0x0844 0xC3).
This chip should be using radeon/banks_k_2_smc.bin smc firmware. Is that available on the test system and kernel?

--- Comment #32 from Timo Aaltonen ---
Forwarding a comment from an engineer:

"While reading the source code of the radeon module, I found there is a bug [1] related to the dpm and clocks. So I decided to do some experiments.
I tried setting different max_sclk and max_mclk values to see if the issue goes away.
1. max_sclk: 70000, max_mclk: 75000 --> have the same issue
2. max_sclk: 50000, max_mclk: 60000 --> pass multi-run test (more than 50 runs)

[1] https://bugs.freedesktop.org/show_bug.cgi?id=76490
"

--- Comment #31 from Michel Dänzer ---
With vblank_mode=0, the only thing that can prevent tearing is luck.

--- Comment #30 from Shih-Yuan Lee ---
Tearing does not happen on battery power; it only happens when the system is plugged into AC power. Is this behavior also expected?

--- Comment #29 from Michel Dänzer ---
Tearing is expected with vblank_mode=0.

Shih-Yuan Lee changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|`DRI_PRIME=1 glxgears       |`vblank_mode=0 DRI_PRIME=1
                   |-info` halts the system     |glxgears` will introduce
                   |with Intel Graphics         |GPU lock up on Intel
                   |[8086:5917] + AMD Graphics  |Graphics [8086:5917] + AMD
                   |[1002:6665] (rev c3)        |Graphics [1002:6665] (rev
                   |                            |c3)