Re: [Intel-gfx] Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 05 Daniel Vetter wrote:
> Hm, if it's watermarks then testing with latest drm-intel-nightly would be
> interesting. We finally managed to land atomic watermark updates (should
> all be there in 4.7 too):
>
> https://cgit.freedesktop.org/drm-intel

I will see if I can test this sometime soon.
-- 
Stefan Richter
-==- -=-= --==-
http://arcgraph.de/sr/
Re: [Intel-gfx] Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On Thu, May 05, 2016 at 06:50:14PM +, Zanoni, Paulo R wrote:
> On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > On Apr 30 Stefan Richter wrote:
> > > On Apr 29 Stefan Richter wrote:
> > > > On Apr 26 Stefan Richter wrote:
> > > > > v4.6-rc solidly hangs after a short while after boot, login to X11, and
> > > > > doing nothing much remarkable on the just brought up X desktop.
> > > > >
> > > > > Hardware: x86-64, E3-1245 v3 (Haswell),
> > > > >           mainboard Supermicro X10SAE,
> > > > >           using integrated Intel graphics (HD P4600, i915 driver),
> > > > >           C226 PCH's AHCI and USB 2/3, ASMedia ASM1062 AHCI,
> > > > >           Intel LAN (i217, igb driver),
> > > > >           several IEEE 1394 controllers, some of them behind
> > > > >           PCIe bridges (IDT, PLX) or PCIe-to-PCI bridges (TI, Tundra)
> > > > >           and one PCI-to-CardBus bridge (Ricoh)
> > > > >
> > > > > kernel.org kernel, Gentoo Linux userland
> > > > >
> > > > > 1. known good: v4.5-rc5 (gcc 4.9.3)
> > > > >    known bad:  v4.6-rc2 (gcc 4.9.3), only tried one time
> > > > >
> > > > > 2. known good: v4.5.2 (gcc 5.2.0)
> > > > >    known bad:  v4.6-rc5 (gcc 5.2.0), only tried one time
> > > > >
> > > > > I will send my linux-4.6-rc5/.config in a follow-up message.
> > > .config: http://www.spinics.net/lists/kernel/msg2243444.html
> > > lspci: http://www.spinics.net/lists/kernel/msg2243447.html
> > >
> > > Some userland package versions, in case these have any bearing:
> > > x11-base/xorg-drivers-1.17
> > > x11-base/xorg-server-1.17.4
> > > x11-base/xorg-x11-7.4-r2
> > Furthermore, there is a single display hooked up via DisplayPort.
> >
> > > > After it proved impossible to capture an oops through netconsole, I
> > > > started git bisect. This will apparently take almost a week, as git
> > > > estimated 13 bisection steps and I will be allowing about 12 hours of
> > > > uptime as a sign for a good kernel. (In my four or five tests of bad
> > > > kernels before I started bisection, they hung after 3 minutes...5.5 hours
> > > > uptime, with no discernible difference in workload. Maybe 12 h cutoff is
> > > > even too short...)
> > I took at least 18 hours uptime (usually 24 hours) as a sign for good
> > kernels. During the bisection, bad kernels hung after 3 h, 2 h, 9 min,
> > 45 min, and 4 min uptime. Thus I arrived at a98ee79317b4 "drm/i915/fbc:
> > enable FBC by default on HSW and BDW" as the point where the hangs are
> > introduced.
> >
> > Quoting the changelog of the commit:
>
> Thanks for following the instructions on the commit message! :)
>
> >     Oh, and in case you - the person reading this commit message - found
> >     this commit through git bisect, please do the following:
> >      - Check your dmesg and see if there are error messages mentioning
> >        underruns around the time your problem started happening.
> >
> > Well, I always had the following lines in dmesg:
> > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
>
> Oh, well... I had a patch that would just disable FBC in case we saw a
> FIFO underrun, but it was rejected. Maybe this is the time to think
> about it again? Otherwise, I can't think of much besides disabling FBC
> on HSW until all the underruns and watermarks regressions are fixed
> forever.

Hm, if it's watermarks then testing with latest drm-intel-nightly would be
interesting. We finally managed to land atomic watermark updates (should
all be there in 4.7 too):

https://cgit.freedesktop.org/drm-intel

Cheers, Daniel

> > I always got these when I switch on the DisplayPort attached monitor.
> > Recently I changed userland from kdm to sddm and noticed that I
> > apparently get these when sddm shuts down. I am not aware of whether
> > or not this also already happened with kdm.
> >
> > However, "around the time your problem started happening" there is
> > nothing in dmesg, because "your problem" is a complete hang without
> > possibility of disk IO and without netconsole output.
> >
> >      - Download intel-gpu-tools, compile it, and run:
> >        $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> >        Then send us the fbc.txt file, especially if you get a failure.
> >        This will really maximize your chances of getting the bug fixed
> >        quickly.
> >
> > Do you need this while FBC is enabled, or can I run it while FBC is
> > disabled?
>
> FBC enabled. Considering your description, my hope is that maybe some
> specific subtest will be able to hang your machine, so testing this
> again will require only running the specific subtest instead of waiting
> 18 hours.
>
> >      - Try