Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 06 Daniel Vetter wrote: > On Thu, May 05, 2016 at 10:45:31PM +0200, Stefan Richter wrote: [...] > > Subtest fbc-1p-primscrn-spr-indfb-fullscreen: FAIL (5.876s) > > This one failed in both runs. Can you please retest with just that using > > # kms_frontbuffer_tracking --run-subtest fbc-1p-primscrn-spr-indfb-fullscreen > > Also please boot with drm.debug=0xe and grab the full dmesg of just that > single subtest. There's definitely something going wrong here. I performed this test with - plain v4.6-rc6, - v4.6-rc6 patched with drm-intel-nightly (2016y-05m-06d-14h-29m-58s). On v4.6-rc6, the test failed thus: 8< Subtest fbc-1p-primscrn-spr-indfb-fullscreen failed. DEBUG (kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: fbc.can_test (kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600) (kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=6, pitch=10240) (kms_frontbuffer_tracking:1914) DEBUG: Blue CRC: pipe:[2ca73d01 ] sink:[unsupported!] 
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600) (kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=6, pitch=10240) (kms_frontbuffer_tracking:1914) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5 (kms_frontbuffer_tracking:1914) DEBUG: Rect 0 CRC: pipe:[febb8b20 ] sink:[unsupported!] (kms_frontbuffer_tracking:1914) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5 (kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01 ] sink:[unsupported!] (kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: !fbc_not_enough_stolen() (kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01 ] sink:[unsupported!] (kms_frontbuffer_tracking:1914) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5 (kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01 ] sink:[unsupported!] (kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: !fbc_not_enough_stolen() (kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01 ] sink:[unsupported!] 
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600) (kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd) (kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=6, pitch=10240) (kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[febb8b20 ] sink:[unsupported!] (kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: !fbc_not_enough_stolen() (kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[febb8b20 ] sink:[unsupported!] (kms_frontbuffer_tracking:1914) igt-core-INFO: Timed out: CRC reading END IGT-Version: 1.14-gc03a8ae6bf2f (x86_64) (Linux: 4.6.0-rc6 x86_64) Primary screen: DP 2560x1440, crtc 26 FBC last action not supported Can't test PSR: no usable eDP screen. Sink CRC not supported: primary screen is not eDP Timed out: CRC reading Subtest fbc-1p-primscrn-spr-indfb-fullscreen: FAIL (5.806s) >8 On v4.6-rc6 plus drm-intel-nightly, the test apparently passed: 8< IGT-Version: 1.14-gc03a8ae6bf2f (x86_64) (Linux: 4.6.0-rc6+intel-drm-nightly x86_64) Primary screen: DP 2560x1440, crtc 26 FBC last action not supported Can't test PSR: no usable eDP screen. Sin
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 08 Stefan Richter wrote:
> On May 05 Zanoni, Paulo R wrote:
> > If you don't want to keep carrying a manual revert, you can just boot
> > with i915.enable_fbc=0 for now (or write a /etc/modprobe.d file). Also,
> > it would be good to know in case you still somehow see the machine
> > hangs even with FBC disabled.
>
> As expected, i915.enable_fbc=0 works fine.
> No freeze within 2.5 days uptime; tested on v4.6-rc6.

Furthermore, I checked out drm-intel.git (v4.6-rc6-962-g91567024d358
"drm-intel-nightly: 2016y-05m-06d-14h-29m-58s UTC integration manifest")
and applied "git diff v4.6-rc6..." on top of v4.6-rc6. I booted the result
once with default i915.enable_fbc, i.e. FBC enabled, performed the test
which Daniel asked for (I will post the results in another message), then
started X11.

- The good news: I was able to switch back and forth between the sddm
  greeter screen on tty7, the text consoles at tty1...6, and the logger
  at tty12 --- without getting any FIFO underrun messages and without
  getting stuck with a blank screen.

- The bad news: Less than a minute after logging in through sddm, just
  after having started openbox + lxpanel + konsole, the kernel froze
  again without netconsole output.

I am now on 4.6.0-rc6+intel-drm-nightly with i915.enable_fbc=0. This is
running fine so far. (Uptime is just 30 minutes now though, so that doesn't
say a lot.) Again, switching between ttys works without FIFO underruns,
unlike plain v4.6-rc6. Not sure if it is coincidence or if this is because
somebody fixed something.

Like v4.6-rc6 and older, 4.6.0-rc6+intel-drm-nightly still exhibits the
following behaviour: If I switch the DisplayPort-connected monitor off and
on again, the following messages are logged when the monitor comes on:

[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun

Other than these messages, there is nothing extraordinary going on.
-- Stefan Richter -==- -=-= -=--- http://arcgraph.de/sr/
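The two underrun messages quoted above can be counted per pipe in a saved dmesg capture. The following is a minimal sketch, not part of the original thread; the function name `underruns_by_pipe` is hypothetical, and the patterns are taken only from the two message formats shown in this thread. Since the driver stops reporting underruns after the first one (as Paulo notes elsewhere in the thread), the counts are a lower bound at best.

```python
import re
from collections import Counter

# Patterns copied from the two i915 messages quoted in this thread.
# Other kernel versions may word these messages differently (assumption).
PATTERNS = (
    re.compile(r"\*ERROR\* uncleared fifo underrun on pipe (\w)"),
    re.compile(r"\*ERROR\* CPU pipe (\w) FIFO underrun"),
)


def underruns_by_pipe(dmesg_text):
    """Return a Counter mapping pipe letter to number of underrun messages."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                counts[match.group(1)] += 1
                break  # count each line at most once
    return counts


if __name__ == "__main__":
    import sys

    # Usage sketch: dmesg | python3 this_script.py
    for pipe, count in sorted(underruns_by_pipe(sys.stdin.read()).items()):
        print(f"pipe {pipe}: {count} underrun message(s)")
```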
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 05 Zanoni, Paulo R wrote:
> If you don't want to keep carrying a manual revert, you can just boot
> with i915.enable_fbc=0 for now (or write a /etc/modprobe.d file). Also,
> it would be good to know in case you still somehow see the machine
> hangs even with FBC disabled.

As expected, i915.enable_fbc=0 works fine.
No freeze within 2.5 days uptime; tested on v4.6-rc6.
-- Stefan Richter -==- -=-= -=--- http://arcgraph.de/sr/
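For the "/etc/modprobe.d file" variant of the workaround, the file needs a single `options` line for the i915 module. A minimal sketch that renders such a line follows; it is not from the original thread, the helper name `modprobe_option_line` and the file name `i915-fbc.conf` are hypothetical, and the sketch only prints the line instead of writing to /etc.

```python
def modprobe_option_line(module, **params):
    """Render one 'options' line in modprobe.d syntax for the given module."""
    rendered = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {rendered}"


# Hypothetical file name; modprobe reads *.conf files under /etc/modprobe.d.
CONF_PATH = "/etc/modprobe.d/i915-fbc.conf"

if __name__ == "__main__":
    print(f"# contents for {CONF_PATH}")
    print(modprobe_option_line("i915", enable_fbc=0))
```

If i915 is loaded from an initramfs, the initramfs would presumably have to be regenerated for the option to take effect at early boot; the kernel command line parameter i915.enable_fbc=0 avoids that step.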
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On Thu, May 05, 2016 at 10:45:31PM +0200, Stefan Richter wrote: > On May 05 Stefan Richter wrote: > > Quoting the changelog of the commit: > [...] > > - Download intel-gpu-tools, compile it, and run: > >$ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 > > | tee fbc.txt > >Then send us the fbc.txt file, especially if you get a failure. > > Attached are results of kms_frontbuffer_tracking from current > intel-gpu-tools.git (intel-gpu-tools-1.14-273-gb4b2ac346c92), taken on > kernel v4.5.2 and on v4.6-rc5. > Subtest fbc-1p-primscrn-spr-indfb-fullscreen failed. > DEBUG > (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: fbc.can_test > (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: > is_i915_device(fd) && has_known_intel_chipset(fd) > (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: > igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, > tiling=0x101, size=14745600) > (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: > is_i915_device(fd) && has_known_intel_chipset(fd) > (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: > is_i915_device(fd) && has_known_intel_chipset(fd) > (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: > igt_create_fb_with_bo_size(handle=7, pitch=10240) > (kms_frontbuffer_tracking:2266) igt-draw-DEBUG: Test requirement passed: > intel_gen(intel_get_drm_devid(fd)) >= 5 > (kms_frontbuffer_tracking:2266) DEBUG: Rect 0 CRC: pipe:[febb8b20 > ] sink:[unsupported!] > (kms_frontbuffer_tracking:2266) igt-draw-DEBUG: Test requirement passed: > intel_gen(intel_get_drm_devid(fd)) >= 5 > (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01 > ] sink:[unsupported!] > (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: > !fbc_not_enough_stolen() > (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01 > ] sink:[unsupported!] 
> (kms_frontbuffer_tracking:2266) igt-draw-DEBUG: Test requirement passed: > intel_gen(intel_get_drm_devid(fd)) >= 5 > (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01 > ] sink:[unsupported!] > (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: > !fbc_not_enough_stolen() > (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01 > ] sink:[unsupported!] > (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: > is_i915_device(fd) && has_known_intel_chipset(fd) > (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: > igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, > tiling=0x101, size=14745600) > (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: > is_i915_device(fd) && has_known_intel_chipset(fd) > (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: > is_i915_device(fd) && has_known_intel_chipset(fd) > (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: > igt_create_fb_with_bo_size(handle=7, pitch=10240) > (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[febb8b20 > ] sink:[unsupported!] > (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: > !fbc_not_enough_stolen() > (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[febb8b20 > ] sink:[unsupported!] > (kms_frontbuffer_tracking:2266) igt-core-INFO: Timed out: CRC reading > END > Timed out: CRC reading > Subtest fbc-1p-primscrn-spr-indfb-fullscreen: FAIL (5.876s) This one failed in both runs. Can you please retest with just that using # kms_frontbuffer_tracking --run-subtest fbc-1p-primscrn-spr-indfb-fullscreen Also please boot with drm.debug=0xe and grab the full dmesg of just that single subtest. There's definitely something going wrong here. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
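To compare runs across kernels, the per-subtest result lines in a saved fbc.txt can be tallied and the failing subtests listed. The following is a minimal sketch, not part of the original thread; the function names are hypothetical, and the pattern is derived only from the "Subtest NAME: STATUS (N.NNNs)" lines shown in this thread.

```python
import re
from collections import Counter

# Matches result lines such as:
#   Subtest fbc-1p-rte: SUCCESS (2.754s)
SUBTEST_RE = re.compile(r"Subtest (\S+): (SUCCESS|FAIL|SKIP) \((\d+\.\d+)s\)")


def parse_subtests(text):
    """Return a list of (name, status, seconds) tuples found in the text."""
    return [
        (name, status, float(seconds))
        for name, status, seconds in SUBTEST_RE.findall(text)
    ]


def summarize(text):
    """Return (Counter of statuses, list of failed subtest names)."""
    results = parse_subtests(text)
    counts = Counter(status for _, status, _ in results)
    failed = [name for name, status, _ in results if status == "FAIL"]
    return counts, failed


if __name__ == "__main__":
    import sys

    # Usage sketch: python3 this_script.py < fbc.txt
    counts, failed = summarize(sys.stdin.read())
    print(dict(counts))
    for name in failed:
        print("FAILED:", name)
```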
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On Fri, 2016-05-06 at 00:54 +0200, Stefan Richter wrote:
> On May 05 Zanoni, Paulo R wrote:
> >
> > On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > >
> > > Oh, and in case you - the person reading this commit message - found
> > > this commit through git bisect, please do the following:
> > > - Check your dmesg and see if there are error messages mentioning
> > > underruns around the time your problem started happening.
> > >
> > > Well, I always had the followings lines in dmesg:
> > > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
> > Oh, well... I had a patch that would just disable FBC in case we saw a
> > FIFO underrun, but it was rejected. Maybe this is the time to think
> > about it again? Otherwise, I can't think of much besides disabling FBC
> > on HSW until all the underruns and watermarks regressions are fixed
> > forever.
> Just to be clear though, I know that these messages are emitted when the
> monitor is switched on, and when sddm is being shut down --- but I do not
> know whether there is any sort of underrun when I get the FBC related
> freeze (since I just don't get any kernel messages at that point).

The fact that underruns have occurred earlier is enough to know that
something is wrong (most probably, bad watermarks): we stop reporting
underruns once we get the first one. In addition, we already know that FBC
has the tendency to amplify apparently-harmless FIFO underruns into black
screens, and I wouldn't be surprised to learn that it could also cause full
machine lockups.

>
> Is there a chance that a serial console would fare better than
> netconsole? This board and another PC in its vicinity have got onboard
> serial ports but I don't have cables at the moment.

In the past, for some specific cases not related to FBC, I had more luck
with serial console than with netconsole. But if this is really caused by
FBC and watermarks, I don't think you'll be able to grab any specific
message at the time of the machine hang. OTOH, if something actually shows
up, it could help invalidate our current assumption of the relationship
between the problem and FBC and underruns.

> > >
> > > - Download intel-gpu-tools, compile it, and run:
> > > $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> > > Then send us the fbc.txt file, especially if you get a failure.
> > > This will really maximize your chances of getting the bug fixed
> > > quickly.
> > >
> > > Do you need this while FBC is enabled, or can I run it while FBC is
> > > disabled?
> > FBC enabled. Considering your description, my hope is that maybe some
> > specific subtest will be able to hang your machine, so testing this
> > again will require only running the specific subtest instead of waiting
> > 18 hours.
> The kms_frontbuffer_tracking runs from which I posted output two hours
> ago did not trigger a lockup.
>
> (I ran them while X11 was shut down because otherwise
> kms_frontbuffer_tracking would skip all tests with "Can't become DRM
> master, please check if no other DRM client is running.")

Yes, this is the correct way.

> > >
> > > PS:
> > > I am mentioning the following just in case that it has any relationship
> > > with the FBC related kernel freezes. Maybe it doesn't... There is
> > > another recent regression on this PC, but I have not yet figured out
> > > whether it was introduced by any particular kernel version. The
> > > regression is: When switching from X11 to text console by
> > > [Ctrl][Alt][Fx] or by shutting down sddm, I often only get a blank
> > > screen. I suspect that this regression was introduced when I replaced
> > > kdm by sddm, but I am not sure about that.
> > Maybe there is some relationship, since this operation involves a mode
> > change. You can also try checking dmesg to see if there are underruns
> > right when you do the change.
> Yes, this is accompanied by
> [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
Re: [Intel-gfx] Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 05 Daniel Vetter wrote:
> Hm, if it's watermarks then testing with latest drm-intel-nightly would be
> interesting. We finally managed to land atomic watermark updates (should
> all be there in 4.7 too):
>
> https://cgit.freedesktop.org/drm-intel

I will see if I can test this sometime soon.
-- Stefan Richter -==- -=-= --==- http://arcgraph.de/sr/
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 05 Zanoni, Paulo R wrote:
> On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > Oh, and in case you - the person reading this commit message - found
> > this commit through git bisect, please do the following:
> > - Check your dmesg and see if there are error messages mentioning
> > underruns around the time your problem started happening.
> >
> > Well, I always had the followings lines in dmesg:
> > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
>
> Oh, well... I had a patch that would just disable FBC in case we saw a
> FIFO underrun, but it was rejected. Maybe this is the time to think
> about it again? Otherwise, I can't think of much besides disabling FBC
> on HSW until all the underruns and watermarks regressions are fixed
> forever.

Just to be clear though, I know that these messages are emitted when the
monitor is switched on, and when sddm is being shut down --- but I do not
know whether there is any sort of underrun when I get the FBC related
freeze (since I just don't get any kernel messages at that point).

Is there a chance that a serial console would fare better than netconsole?
This board and another PC in its vicinity have got onboard serial ports
but I don't have cables at the moment.

> > - Download intel-gpu-tools, compile it, and run:
> > $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> > Then send us the fbc.txt file, especially if you get a failure.
> > This will really maximize your chances of getting the bug fixed
> > quickly.
> >
> > Do you need this while FBC is enabled, or can I run it while FBC is
> > disabled?
>
> FBC enabled. Considering your description, my hope is that maybe some
> specific subtest will be able to hang your machine, so testing this
> again will require only running the specific subtest instead of waiting
> 18 hours.

The kms_frontbuffer_tracking runs from which I posted output two hours ago
did not trigger a lockup.

(I ran them while X11 was shut down because otherwise
kms_frontbuffer_tracking would skip all tests with "Can't become DRM
master, please check if no other DRM client is running.")

> > PS:
> > I am mentioning the following just in case that it has any relationship
> > with the FBC related kernel freezes. Maybe it doesn't... There is
> > another recent regression on this PC, but I have not yet figured out
> > whether it was introduced by any particular kernel version. The
> > regression is: When switching from X11 to text console by [Ctrl][Alt][Fx]
> > or by shutting down sddm, I often only get a blank screen. I suspect
> > that this regression was introduced when I replaced kdm by sddm, but
> > I am not sure about that.
>
> Maybe there is some relationship, since this operation involves a mode
> change. You can also try checking dmesg to see if there are underruns
> right when you do the change.

Yes, this is accompanied by
[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
-- Stefan Richter -==- -=-= --=-= http://arcgraph.de/sr/
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 05 Stefan Richter wrote: > Quoting the changelog of the commit: [...] > - Download intel-gpu-tools, compile it, and run: >$ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | > tee fbc.txt >Then send us the fbc.txt file, especially if you get a failure. Attached are results of kms_frontbuffer_tracking from current intel-gpu-tools.git (intel-gpu-tools-1.14-273-gb4b2ac346c92), taken on kernel v4.5.2 and on v4.6-rc5. -- Stefan Richter -==- -=-= --=-= http://arcgraph.de/sr/ IGT-Version: 1.14-g99e61ed66f65 (x86_64) (Linux: 4.5.2 x86_64) Primary screen: DP 2560x1440, crtc 21 FBC last action not supported Can't test PSR: no usable eDP screen. Sink CRC not supported: primary screen is not eDP Subtest fbc-1p-rte: SUCCESS (2.754s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scnd_mode_params.connector_id Can't test dual pipes with the current outputs Subtest fbc-2p-rte: SKIP (0.000s) Subtest fbc-1p-primscrn-pri-indfb-draw-mmap-cpu: SUCCESS (1.468s) Subtest fbc-1p-primscrn-pri-indfb-draw-mmap-gtt: SUCCESS (0.787s) Subtest fbc-1p-primscrn-pri-indfb-draw-mmap-wc: SUCCESS (0.774s) Subtest fbc-1p-primscrn-pri-indfb-draw-pwrite: SUCCESS (0.958s) Subtest fbc-1p-primscrn-pri-indfb-draw-blt: SUCCESS (0.990s) Subtest fbc-1p-primscrn-pri-indfb-draw-render: SUCCESS (0.984s) Subtest fbc-1p-primscrn-pri-shrfb-draw-mmap-cpu: SUCCESS (0.948s) Subtest fbc-1p-primscrn-pri-shrfb-draw-mmap-gtt: SUCCESS (0.793s) Subtest fbc-1p-primscrn-pri-shrfb-draw-mmap-wc: SUCCESS (0.836s) Subtest fbc-1p-primscrn-pri-shrfb-draw-pwrite: SUCCESS (1.171s) Subtest fbc-1p-primscrn-pri-shrfb-draw-blt: SUCCESS (0.995s) Subtest fbc-1p-primscrn-pri-shrfb-draw-render: SUCCESS (0.977s) Subtest fbc-1p-primscrn-cur-indfb-draw-mmap-cpu: SUCCESS (1.403s) Subtest fbc-1p-primscrn-cur-indfb-draw-mmap-gtt: SUCCESS (0.964s) Subtest fbc-1p-primscrn-cur-indfb-draw-mmap-wc: SUCCESS (0.939s) Subtest fbc-1p-primscrn-cur-indfb-draw-pwrite: 
SUCCESS (0.965s) Subtest fbc-1p-primscrn-cur-indfb-draw-blt: SUCCESS (0.939s) Subtest fbc-1p-primscrn-cur-indfb-draw-render: SUCCESS (0.941s) Subtest fbc-1p-primscrn-spr-indfb-draw-mmap-cpu: SUCCESS (1.010s) Subtest fbc-1p-primscrn-spr-indfb-draw-mmap-gtt: SUCCESS (0.971s) Subtest fbc-1p-primscrn-spr-indfb-draw-mmap-wc: SUCCESS (0.961s) Subtest fbc-1p-primscrn-spr-indfb-draw-pwrite: SUCCESS (1.010s) Subtest fbc-1p-primscrn-spr-indfb-draw-blt: SUCCESS (0.956s) Subtest fbc-1p-primscrn-spr-indfb-draw-render: SUCCESS (1.010s) Subtest fbc-1p-offscren-pri-indfb-draw-mmap-cpu: SUCCESS (0.502s) Subtest fbc-1p-offscren-pri-indfb-draw-mmap-gtt: SUCCESS (0.500s) Subtest fbc-1p-offscren-pri-indfb-draw-mmap-wc: SUCCESS (0.493s) Subtest fbc-1p-offscren-pri-indfb-draw-pwrite: SUCCESS (0.507s) Subtest fbc-1p-offscren-pri-indfb-draw-blt: SUCCESS (0.498s) Subtest fbc-1p-offscren-pri-indfb-draw-render: SUCCESS (0.486s) Subtest fbc-1p-offscren-pri-shrfb-draw-mmap-cpu: SUCCESS (0.578s) Subtest fbc-1p-offscren-pri-shrfb-draw-mmap-gtt: SUCCESS (0.459s) Subtest fbc-1p-offscren-pri-shrfb-draw-mmap-wc: SUCCESS (0.530s) Subtest fbc-1p-offscren-pri-shrfb-draw-pwrite: SUCCESS (1.096s) Subtest fbc-1p-offscren-pri-shrfb-draw-blt: SUCCESS (0.660s) Subtest fbc-1p-offscren-pri-shrfb-draw-render: SUCCESS (0.661s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scnd_mode_params.connector_id Can't test dual pipes with the current outputs Subtest fbc-2p-primscrn-pri-indfb-draw-mmap-cpu: SKIP (0.000s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scnd_mode_params.connector_id Can't test dual pipes with the current outputs Subtest fbc-2p-primscrn-pri-indfb-draw-mmap-gtt: SKIP (0.000s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scnd_mode_params.connector_id Can't test dual pipes with the 
current outputs Subtest fbc-2p-primscrn-pri-indfb-draw-mmap-wc: SKIP (0.000s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scnd_mode_params.connector_id Can't test dual pipes with the current outputs Subtest fbc-2p-primscrn-pri-indfb-draw-pwrite: SKIP (0.000s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scnd_mode_params.connector_id Can't test dual pipes with the current outputs Subtest fbc-2p-primscrn-pri-indfb-draw-blt: SKIP (0.000s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scnd_mode_params.connector_id Can't test dual pipes with the current outputs Subtest fbc-2p-primscrn-pri-indfb-draw-render: SKIP (0.000s) Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816: Test requirement: scn
Re: [Intel-gfx] Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On Thu, May 05, 2016 at 06:50:14PM +, Zanoni, Paulo R wrote:
> On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > On Apr 30 Stefan Richter wrote:
> > > On Apr 29 Stefan Richter wrote:
> > > > On Apr 26 Stefan Richter wrote:
> > > > > v4.6-rc solidly hangs after a short while after boot, login to
> > > > > X11, and doing nothing much remarkable on the just brought up
> > > > > X desktop.
> > > > >
> > > > > Hardware: x86-64, E3-1245 v3 (Haswell),
> > > > > mainboard Supermicro X10SAE,
> > > > > using integrated Intel graphics (HD P4600, i915 driver),
> > > > > C226 PCH's AHCI and USB 2/3, ASMedia ASM1062 AHCI,
> > > > > Intel LAN (i217, igb driver),
> > > > > several IEEE 1394 controllers, some of them behind
> > > > > PCIe bridges (IDT, PLX) or PCIe-to-PCI bridges (TI, Tundra)
> > > > > and one PCI-to-CardBus bridge (Ricoh)
> > > > >
> > > > > kernel.org kernel, Gentoo Linux userland
> > > > >
> > > > > 1. known good: v4.5-rc5 (gcc 4.9.3)
> > > > > known bad: v4.6-rc2 (gcc 4.9.3), only tried one time
> > > > >
> > > > > 2. known good: v4.5.2 (gcc 5.2.0)
> > > > > known bad: v4.6-rc5 (gcc 5.2.0), only tried one time
> > > > >
> > > > > I will send my linux-4.6-rc5/.config in a follow-up message.
> > > .config: http://www.spinics.net/lists/kernel/msg2243444.html
> > > lspci: http://www.spinics.net/lists/kernel/msg2243447.html
> > >
> > > Some userland package versions, in case these have any bearing:
> > > x11-base/xorg-drivers-1.17
> > > x11-base/xorg-server-1.17.4
> > > x11-bas/xorg-x11-7.4-r2
> > Furthermore, there is a single display hooked up via DisplayPort.
> >
> > > > After it proved impossible to capture an oops through netconsole, I
> > > > started git bisect. This will apparently take almost a week, as git
> > > > estimated 13 bisection steps and I will be allowing about 12 hours of
> > > > uptime as a sign for a good kernel. (In my four or five tests of bad
> > > > kernels before I started bisection, they hung after 3 minutes...5.5
> > > > hours uptime, with no discernible difference in workload. Maybe 12 h
> > > > cutoff is even too short...)
> > I took at least 18 hours uptime (usually 24 hours) as a sign for good
> > kernels. During the bisection, bad kernels hung after 3 h, 2 h, 9 min,
> > 45 min, and 4 min uptime. Thus I arrived at a98ee79317b4 "drm/i915/fbc:
> > enable FBC by default on HSW and BDW" as the point where the hangs are
> > introduced.
> >
> > Quoting the changelog of the commit:
>
> Thanks for following the instructions on the commit message! :)
>
> > Oh, and in case you - the person reading this commit message - found
> > this commit through git bisect, please do the following:
> > - Check your dmesg and see if there are error messages mentioning
> > underruns around the time your problem started happening.
> >
> > Well, I always had the followings lines in dmesg:
> > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
>
> Oh, well... I had a patch that would just disable FBC in case we saw a
> FIFO underrun, but it was rejected. Maybe this is the time to think
> about it again? Otherwise, I can't think of much besides disabling FBC
> on HSW until all the underruns and watermarks regressions are fixed
> forever.

Hm, if it's watermarks then testing with latest drm-intel-nightly would be
interesting. We finally managed to land atomic watermark updates (should
all be there in 4.7 too):

https://cgit.freedesktop.org/drm-intel

Cheers, Daniel

> > I always got these when I switch on the DisplayPort attached monitor.
> > Recently I changed userland from kdm to sddm and noticed that I
> > apparently get these when sddm shuts down. I am not aware of whether
> > or not this also already happened with kdm.
> >
> > However, "around the time your problem started happening" there is
> > nothing in dmesg, because "your problem" is a complete hang without
> > possibility of disk IO and without netconsole output.
> >
> > - Download intel-gpu-tools, compile it, and run:
> > $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> > Then send us the fbc.txt file, especially if you get a failure.
> > This will really maximize your chances of getting the bug fixed
> > quickly.
> >
> > Do you need this while FBC is enabled, or can I run it while FBC is
> > disabled?
>
> FBC enabled. Considering your description, my hope is that maybe some
> specific subtest will be able to hang your machine, so testing this
> again will require only running the specific subtest instead of waiting
> 18 hours.
>
> > - Try to
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On May 05 Stefan Richter wrote:
> Quoting the changelog of the commit:
[...]
>  - Boot with drm.debug=0xe, reproduce the problem, then send us the
>    dmesg file.
>
> I can try this, but I am skeptical about getting any useful kernel
> messages from before the hang.

I booted 4.6-rc5 with drm.debug=0xe. It hung after about 80 minutes
uptime, and just like at all previous hangs, netconsole did not capture
anything at the time when it froze.

Here is "dmesg | grep -e :00:02.0 -e i915 -e drm" from that session.

[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.6.0-rc5 root=/dev/sda4 ro rootflags=subvol=@ drm.debug=0xe
[0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.6.0-rc5 root=/dev/sda4 ro rootflags=subvol=@ drm.debug=0xe
[0.673659] pci :00:02.0: [8086:041a] type 00 class 0x03
[0.673666] pci :00:02.0: reg 0x10: [mem 0xf580-0xf5bf 64bit]
[0.673670] pci :00:02.0: reg 0x18: [mem 0xe000-0xefff 64bit pref]
[0.673673] pci :00:02.0: reg 0x20: [io 0xf000-0xf03f]
[0.705036] vgaarb: setting as boot device: PCI::00:02.0
[0.705113] vgaarb: device added: PCI::00:02.0,decodes=io+mem,owns=io+mem,locks=none
[0.705300] vgaarb: bridge control possible :00:02.0
[0.727542] pci :00:02.0: Video device with shadowed ROM at [mem 0x000c-0x000d]
[0.766034] [drm] Initialized drm 1.1.0 20060810
[0.766222] [drm:i915_dump_device_info] i915 device info: gen=7, pciid=0x041a rev=0x06 flags=need_gfx_hws,is_haswell,has_fbc,has_hotplug,has_llc,has_ddi,has_fpga_dbg,
[0.766229] [drm:intel_detect_pch] Found LynxPoint PCH
[0.766320] [drm:i915_gem_init_stolen] Memory reserved for graphics device: 32768K, usable: 31744K
[0.766321] [drm] Memory usable by graphics device = 2048M
[0.766398] [drm:i915_gem_gtt_init] GMADR size = 256M
[0.766399] [drm:i915_gem_gtt_init] GTT stolen size = 32M
[0.766399] [drm:i915_gem_gtt_init] ppgtt mode: 1
[0.766400] [drm] Replacing VGA console driver
[0.767158] [drm:intel_opregion_setup] graphic opregion physical addr: 0xd9509018
[0.767161] [drm:intel_opregion_setup] Public ACPI methods supported
[0.767162] [drm:intel_opregion_setup] SWSCI supported
[0.772643] [drm:swsci_setup] SWSCI GBDA callbacks 0cb3, SBCB callbacks 00300483
[0.772646] [drm:intel_opregion_setup] ASLE supported
[0.772646] [drm:intel_opregion_setup] ASLE extension supported
[0.772648] [drm:intel_opregion_setup] Found valid VBT in ACPI OpRegion (Mailbox #4)
[0.772717] [drm:intel_device_info_runtime_init] slice total: 0
[0.772717] [drm:intel_device_info_runtime_init] subslice total: 0
[0.772718] [drm:intel_device_info_runtime_init] subslice per slice: 0
[0.772719] [drm:intel_device_info_runtime_init] EU total: 0
[0.772720] [drm:intel_device_info_runtime_init] EU per subslice: 0
[0.772720] [drm:intel_device_info_runtime_init] has slice power gating: n
[0.772721] [drm:intel_device_info_runtime_init] has subslice power gating: n
[0.772722] [drm:intel_device_info_runtime_init] has EU power gating: n
[0.772722] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[0.772725] [drm] Driver supports precise vblank timestamp query.
[0.772727] [drm:init_vbt_defaults] Set default to SSC at 12 kHz
[0.772728] [drm:intel_bios_init] VBT signature "$VBT HASWELL", BDB version 170
[0.772730] [drm:parse_general_features] BDB_GENERAL_FEATURES int_tv_support 0 int_crt_support 1 lvds_use_ssc 0 lvds_ssc_freq 12 display_clock_mode 0 fdi_rx_polarity_inverted 0
[0.772731] [drm:parse_general_definitions] crt_ddc_bus_pin: 2
[0.772732] [drm:parse_lfp_panel_data] DRRS supported mode is static
[0.772734] [drm:parse_lfp_panel_data] Found panel mode in BIOS VBT tables:
[0.772735] [drm:drm_mode_debug_printmodeline] Modeline 0:"1024x768" 0 65000 1024 1048 1184 1344 768 771 777 806 0x8 0xa
[0.772736] [drm:parse_lfp_panel_data] VBT initial LVDS value 300
[0.772738] [drm:parse_lfp_backlight] VBT backlight PWM modulation frequency 200 Hz, active high, min brightness 0, level 255
[0.772739] [drm:parse_sdvo_panel_data] Found SDVO panel mode in BIOS VBT tables:
[0.772740] [drm:drm_mode_debug_printmodeline] Modeline 0:"1600x1200" 0 162000 1600 1664 1856 2160 1200 1201 1204 1250 0x8 0xa
[0.772741] [drm:parse_sdvo_device_mapping] No SDVO device info is found in VBT
[0.772742] [drm:parse_driver_features] DRRS State Enabled:1
[0.772743] [drm:parse_ddi_port] Port B VBT info: DP:1 HDMI:1 DVI:1 EDP:0 CRT:0
[0.772745] [drm:parse_ddi_port] VBT HDMI level shift for port B: 6
[0.772745] [drm:parse_ddi_port] Port C VBT info: DP:0 HDMI:1 DVI:1 EDP:0 CRT:0
[0.772746] [drm:parse_ddi_port] VBT HDMI level shift for port C: 6
[0.772747] [drm:parse_ddi_port] Port D VBT info: DP:1 HDMI:1 DVI:1 EDP:0 CRT:0
[0.772748] [drm:parse_ddi_port] VBT HDMI level shift for port D: 6
[
Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> On Apr 30 Stefan Richter wrote:
> > On Apr 29 Stefan Richter wrote:
> > > On Apr 26 Stefan Richter wrote:
> > > > v4.6-rc solidly hangs after a short while after boot, login to X11, and
> > > > doing nothing much remarkable on the just brought up X desktop.
> > > >
> > > > Hardware: x86-64, E3-1245 v3 (Haswell),
> > > >     mainboard Supermicro X10SAE,
> > > >     using integrated Intel graphics (HD P4600, i915 driver),
> > > >     C226 PCH's AHCI and USB 2/3, ASMedia ASM1062 AHCI,
> > > >     Intel LAN (i217, igb driver),
> > > >     several IEEE 1394 controllers, some of them behind
> > > >     PCIe bridges (IDT, PLX) or PCIe-to-PCI bridges (TI, Tundra)
> > > >     and one PCI-to-CardBus bridge (Ricoh)
> > > >
> > > > kernel.org kernel, Gentoo Linux userland
> > > >
> > > > 1. known good: v4.5-rc5 (gcc 4.9.3)
> > > >    known bad: v4.6-rc2 (gcc 4.9.3), only tried one time
> > > >
> > > > 2. known good: v4.5.2 (gcc 5.2.0)
> > > >    known bad: v4.6-rc5 (gcc 5.2.0), only tried one time
> > > >
> > > > I will send my linux-4.6-rc5/.config in a follow-up message.
> >
> > .config: http://www.spinics.net/lists/kernel/msg2243444.html
> > lspci: http://www.spinics.net/lists/kernel/msg2243447.html
> >
> > Some userland package versions, in case these have any bearing:
> > x11-base/xorg-drivers-1.17
> > x11-base/xorg-server-1.17.4
> > x11-base/xorg-x11-7.4-r2
>
> Furthermore, there is a single display hooked up via DisplayPort.
>
> > > After it proved impossible to capture an oops through netconsole, I
> > > started git bisect. This will apparently take almost a week, as git
> > > estimated 13 bisection steps and I will be allowing about 12 hours of
> > > uptime as a sign for a good kernel. (In my four or five tests of bad
> > > kernels before I started bisection, they hung after 3 minutes...5.5 hours
> > > uptime, with no discernible difference in workload. Maybe 12 h cutoff is
> > > even too short...)
>
> I took at least 18 hours uptime (usually 24 hours) as a sign for good
> kernels. During the bisection, bad kernels hung after 3 h, 2 h, 9 min,
> 45 min, and 4 min uptime. Thus I arrived at a98ee79317b4 "drm/i915/fbc:
> enable FBC by default on HSW and BDW" as the point where the hangs are
> introduced.
>
> Quoting the changelog of the commit:

Thanks for following the instructions on the commit message! :)

>     Oh, and in case you - the person reading this commit message - found
>     this commit through git bisect, please do the following:
>      - Check your dmesg and see if there are error messages mentioning
>        underruns around the time your problem started happening.
>
> Well, I always had the following lines in dmesg:
> [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun

Oh, well... I had a patch that would just disable FBC in case we saw a
FIFO underrun, but it was rejected. Maybe this is the time to think about
it again? Otherwise, I can't think of much besides disabling FBC on HSW
until all the underruns and watermarks regressions are fixed forever.

> I always got these when I switch on the DisplayPort attached monitor.
> Recently I changed userland from kdm to sddm and noticed that I
> apparently get these when sddm shuts down. I am not aware of whether
> or not this also already happened with kdm.
>
> However, "around the time your problem started happening" there is
> nothing in dmesg, because "your problem" is a complete hang without
> possibility of disk IO and without netconsole output.
>
>      - Download intel-gpu-tools, compile it, and run:
>        $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
>        Then send us the fbc.txt file, especially if you get a failure.
>        This will really maximize your chances of getting the bug fixed
>        quickly.
>
> Do you need this while FBC is enabled, or can I run it while FBC is
> disabled?

FBC enabled. Considering your description, my hope is that maybe some
specific subtest will be able to hang your machine, so testing this again
will require only running the specific subtest instead of waiting 18
hours.

>      - Try to find a reliable way to reproduce the problem, and tell us.
>
> The reliable way is to just wait for the kernel to hang after about
> 3 minutes to 5.5 hours. I have not identified any special activity
> which would trigger the hang.
>
>      - Boot with drm.debug=0xe, reproduce the problem, then send us the
>        dmesg file.
>
> I can try this, but I am skeptical about getting any useful kernel
> messages from before the hang.

Agree.

> PS:
> I am mentioning the following just in case that it has any
> relation
Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
On Apr 30 Stefan Richter wrote:
> On Apr 29 Stefan Richter wrote:
> > On Apr 26 Stefan Richter wrote:
> > > v4.6-rc solidly hangs after a short while after boot, login to X11, and
> > > doing nothing much remarkable on the just brought up X desktop.
> > >
> > > Hardware: x86-64, E3-1245 v3 (Haswell),
> > >     mainboard Supermicro X10SAE,
> > >     using integrated Intel graphics (HD P4600, i915 driver),
> > >     C226 PCH's AHCI and USB 2/3, ASMedia ASM1062 AHCI,
> > >     Intel LAN (i217, igb driver),
> > >     several IEEE 1394 controllers, some of them behind
> > >     PCIe bridges (IDT, PLX) or PCIe-to-PCI bridges (TI, Tundra)
> > >     and one PCI-to-CardBus bridge (Ricoh)
> > >
> > > kernel.org kernel, Gentoo Linux userland
> > >
> > > 1. known good: v4.5-rc5 (gcc 4.9.3)
> > >    known bad: v4.6-rc2 (gcc 4.9.3), only tried one time
> > >
> > > 2. known good: v4.5.2 (gcc 5.2.0)
> > >    known bad: v4.6-rc5 (gcc 5.2.0), only tried one time
> > >
> > > I will send my linux-4.6-rc5/.config in a follow-up message.
>
> .config: http://www.spinics.net/lists/kernel/msg2243444.html
> lspci: http://www.spinics.net/lists/kernel/msg2243447.html
>
> Some userland package versions, in case these have any bearing:
> x11-base/xorg-drivers-1.17
> x11-base/xorg-server-1.17.4
> x11-base/xorg-x11-7.4-r2

Furthermore, there is a single display hooked up via DisplayPort.

> > After it proved impossible to capture an oops through netconsole, I
> > started git bisect. This will apparently take almost a week, as git
> > estimated 13 bisection steps and I will be allowing about 12 hours of
> > uptime as a sign for a good kernel. (In my four or five tests of bad
> > kernels before I started bisection, they hung after 3 minutes...5.5 hours
> > uptime, with no discernible difference in workload. Maybe 12 h cutoff is
> > even too short...)

I took at least 18 hours uptime (usually 24 hours) as a sign for good
kernels. During the bisection, bad kernels hung after 3 h, 2 h, 9 min,
45 min, and 4 min uptime. Thus I arrived at a98ee79317b4 "drm/i915/fbc:
enable FBC by default on HSW and BDW" as the point where the hangs are
introduced.

Quoting the changelog of the commit:

    Oh, and in case you - the person reading this commit message - found
    this commit through git bisect, please do the following:
     - Check your dmesg and see if there are error messages mentioning
       underruns around the time your problem started happening.

Well, I always had the following lines in dmesg:
[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun

I always got these when I switch on the DisplayPort attached monitor.
Recently I changed userland from kdm to sddm and noticed that I
apparently get these when sddm shuts down. I am not aware of whether
or not this also already happened with kdm.

However, "around the time your problem started happening" there is
nothing in dmesg, because "your problem" is a complete hang without
possibility of disk IO and without netconsole output.

     - Download intel-gpu-tools, compile it, and run:
       $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
       Then send us the fbc.txt file, especially if you get a failure.
       This will really maximize your chances of getting the bug fixed
       quickly.

Do you need this while FBC is enabled, or can I run it while FBC is
disabled?

     - Try to find a reliable way to reproduce the problem, and tell us.

The reliable way is to just wait for the kernel to hang after about
3 minutes to 5.5 hours. I have not identified any special activity
which would trigger the hang.

     - Boot with drm.debug=0xe, reproduce the problem, then send us the
       dmesg file.

I can try this, but I am skeptical about getting any useful kernel
messages from before the hang.

PS:
I am mentioning the following just in case that it has any relationship
with the FBC related kernel freezes. Maybe it doesn't...

There is another recent regression on this PC, but I have not yet
figured out whether it was introduced by any particular kernel version.
The regression is: When switching from X11 to text console by
[Ctrl][Alt][Fx] or by shutting down sddm, I often only get a blank
screen. I suspect that this regression was introduced when I replaced
kdm by sddm, but I am not sure about that.
-- 
Stefan Richter
-==- -=-= --=-=
http://arcgraph.de/sr/
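
The first request in the changelog (check dmesg for error messages mentioning underruns) could be scripted roughly as follows. This is only a sketch, assuming the dmesg output has been saved as plain text; the helper name find_underruns is made up for illustration and is not part of intel-gpu-tools or the kernel, and the pattern is modelled only on the two *ERROR* lines quoted in this message.

```python
import re

# Hypothetical helper, not part of intel-gpu-tools or the kernel.
# The pattern is modelled on the two i915 messages quoted above:
#   "*ERROR* uncleared fifo underrun on pipe A"
#   "*ERROR* CPU pipe A FIFO underrun"
UNDERRUN_RE = re.compile(r"\*ERROR\*.*fifo underrun", re.IGNORECASE)


def find_underruns(dmesg_text):
    """Return the dmesg lines that report a FIFO underrun error."""
    return [line for line in dmesg_text.splitlines()
            if UNDERRUN_RE.search(line)]


if __name__ == "__main__":
    sample = "\n".join([
        "[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A",
        "[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun",
        "[drm:intel_detect_pch] Found LynxPoint PCH",
    ])
    for line in find_underruns(sample):
        print(line)
```

As noted above, this only helps for underruns that actually reach the log; the hang itself left nothing in dmesg or on netconsole.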