Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-08 Thread Stefan Richter
On May 06 Daniel Vetter wrote:
> On Thu, May 05, 2016 at 10:45:31PM +0200, Stefan Richter wrote:
[...]
> > Subtest fbc-1p-primscrn-spr-indfb-fullscreen: FAIL (5.876s)  
> 
> This one failed in both runs. Can you please retest with just that using
> 
> # kms_frontbuffer_tracking --run-subtest fbc-1p-primscrn-spr-indfb-fullscreen
> 
> Also please boot with drm.debug=0xe and grab the full dmesg of just that
> single subtest. There's definitely something going wrong here.

I performed this test with
  - plain v4.6-rc6,
  - v4.6-rc6 patched with drm-intel-nightly (2016y-05m-06d-14h-29m-58s).

On v4.6-rc6, the test failed thus:

 8< 
Subtest fbc-1p-primscrn-spr-indfb-fullscreen failed.
 DEBUG 
(kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: fbc.can_test
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600)
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=6, pitch=10240)
(kms_frontbuffer_tracking:1914) DEBUG: Blue CRC:   pipe:[2ca73d01    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600)
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=6, pitch=10240)
(kms_frontbuffer_tracking:1914) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5
(kms_frontbuffer_tracking:1914) DEBUG: Rect 0 CRC: pipe:[febb8b20    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5
(kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: !fbc_not_enough_stolen()
(kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5
(kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: !fbc_not_enough_stolen()
(kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600)
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
(kms_frontbuffer_tracking:1914) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=6, pitch=10240)
(kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[febb8b20    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) DEBUG: Test requirement passed: !fbc_not_enough_stolen()
(kms_frontbuffer_tracking:1914) DEBUG: Calculated CRC: pipe:[febb8b20    ] sink:[unsupported!]
(kms_frontbuffer_tracking:1914) igt-core-INFO: Timed out: CRC reading
  END  
IGT-Version: 1.14-gc03a8ae6bf2f (x86_64) (Linux: 4.6.0-rc6 x86_64)
Primary screen: DP 2560x1440, crtc 26
FBC last action not supported
Can't test PSR: no usable eDP screen.
Sink CRC not supported: primary screen is not eDP
Timed out: CRC reading
Subtest fbc-1p-primscrn-spr-indfb-fullscreen: FAIL (5.806s)
 >8 

On v4.6-rc6 plus drm-intel-nightly, the test apparently passed:

 8< 
IGT-Version: 1.14-gc03a8ae6bf2f (x86_64) (Linux: 4.6.0-rc6+intel-drm-nightly x86_64)
Primary screen: DP 2560x1440, crtc 26
FBC last action not supported
Can't test PSR: no usable eDP screen.
Sin

Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-08 Thread Stefan Richter
On May 08 Stefan Richter wrote:
> On May 05 Zanoni, Paulo R wrote:
> > If you don't want to keep carrying a manual revert, you can just boot
> > with i915.enable_fbc=0 for now (or write a /etc/modprobe.d file). Also,
> > it would be good to know in case you still somehow see the machine
> > hangs even with FBC disabled.  
> 
> As expected, i915.enable_fbc=0 works fine.
> No freeze within 2.5 days uptime; tested on v4.6-rc6.

Furthermore, I checked out drm-intel.git (v4.6-rc6-962-g91567024d358
"drm-intel-nightly: 2016y-05m-06d-14h-29m-58s UTC integration manifest")
and applied "git diff v4.6-rc6..." on top of v4.6-rc6.

I booted the result once with default i915.enable_fbc, i.e. FBC enabled,
performed the test which Daniel asked for (I will post the results in
another message), then started X11.
  - The good news:  I was able to switch back and forth between the sddm
greeter screen on tty7, the text consoles at tty1...6, and the logger
at tty12 --- without getting any FIFO underrun messages and without
getting stuck with a blank screen.
  - The bad news:  Less than a minute after login into sddm, just after
having started openbox + lxpanel + konsole, the kernel froze again
without netconsole output.

I am now on 4.6.0-rc6+intel-drm-nightly with i915.enable_fbc=0.  This is
running fine so far.  (uptime is just 30 minutes now though, so that
doesn't say a lot.)  Again, switching between ttys works without FIFO
underruns, unlike plain v4.6-rc6.  Not sure if it is coincidence or if
this is because somebody fixed something.

Like v4.6-rc6 and older, 4.6.0-rc6+intel-drm-nightly still exhibits the
following behaviour:  If I switch the displayport connected monitor off
and on again, the following messages are logged when the monitor comes on:
[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
Other than these messages, there is nothing extraordinary going on.
-- 
Stefan Richter
-==- -=-= -=---
http://arcgraph.de/sr/


Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-08 Thread Stefan Richter
On May 05 Zanoni, Paulo R wrote:
> If you don't want to keep carrying a manual revert, you can just boot
> with i915.enable_fbc=0 for now (or write a /etc/modprobe.d file). Also,
> it would be good to know in case you still somehow see the machine
> hangs even with FBC disabled.

As expected, i915.enable_fbc=0 works fine.
No freeze within 2.5 days uptime; tested on v4.6-rc6.
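
A minimal sketch of the /etc/modprobe.d variant mentioned above, assuming i915 is built as a module and that module options are read from that directory. The file name i915-fbc.conf and the helper write_fbc_off are made up for this example:

```shell
# Sketch: persist i915.enable_fbc=0 through a modprobe.d drop-in.
# i915-fbc.conf is an arbitrary file name chosen for illustration.
write_fbc_off() {
    # Target directory; defaults to the system-wide modprobe.d.
    dir=${1:-/etc/modprobe.d}
    printf 'options i915 enable_fbc=0\n' > "$dir/i915-fbc.conf"
}
```

Call write_fbc_off as root without arguments, reboot, and read /sys/module/i915/parameters/enable_fbc as root to confirm the value in use. If i915 is built into the kernel or loaded from an initramfs, putting i915.enable_fbc=0 on the kernel command line is probably the more dependable route.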
-- 
Stefan Richter
-==- -=-= -=---
http://arcgraph.de/sr/


Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Daniel Vetter
On Thu, May 05, 2016 at 10:45:31PM +0200, Stefan Richter wrote:
> On May 05 Stefan Richter wrote:
> > Quoting the changelog of the commit:
> [...]
> >  - Download intel-gpu-tools, compile it, and run:
> >    $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> >    Then send us the fbc.txt file, especially if you get a failure.
> 
> Attached are results of kms_frontbuffer_tracking from current
> intel-gpu-tools.git (intel-gpu-tools-1.14-273-gb4b2ac346c92), taken on
> kernel v4.5.2 and on v4.6-rc5.

> Subtest fbc-1p-primscrn-spr-indfb-fullscreen failed.
>  DEBUG 
> (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: fbc.can_test
> (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
> (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600)
> (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
> (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
> (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=7, pitch=10240)
> (kms_frontbuffer_tracking:2266) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5
> (kms_frontbuffer_tracking:2266) DEBUG: Rect 0 CRC: pipe:[febb8b20    ] sink:[unsupported!]
> (kms_frontbuffer_tracking:2266) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5
> (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
> (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: !fbc_not_enough_stolen()
> (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
> (kms_frontbuffer_tracking:2266) igt-draw-DEBUG: Test requirement passed: intel_gen(intel_get_drm_devid(fd)) >= 5
> (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
> (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: !fbc_not_enough_stolen()
> (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[2ca73d01    ] sink:[unsupported!]
> (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
> (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: igt_create_fb_with_bo_size(width=2560, height=1440, format=0x34325258, tiling=0x101, size=14745600)
> (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
> (kms_frontbuffer_tracking:2266) drmtest-DEBUG: Test requirement passed: is_i915_device(fd) && has_known_intel_chipset(fd)
> (kms_frontbuffer_tracking:2266) igt-fb-DEBUG: igt_create_fb_with_bo_size(handle=7, pitch=10240)
> (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[febb8b20    ] sink:[unsupported!]
> (kms_frontbuffer_tracking:2266) DEBUG: Test requirement passed: !fbc_not_enough_stolen()
> (kms_frontbuffer_tracking:2266) DEBUG: Calculated CRC: pipe:[febb8b20    ] sink:[unsupported!]
> (kms_frontbuffer_tracking:2266) igt-core-INFO: Timed out: CRC reading
>   END  
> Timed out: CRC reading
> Subtest fbc-1p-primscrn-spr-indfb-fullscreen: FAIL (5.876s)

This one failed in both runs. Can you please retest with just that using

# kms_frontbuffer_tracking --run-subtest fbc-1p-primscrn-spr-indfb-fullscreen

Also please boot with drm.debug=0xe and grab the full dmesg of just that
single subtest. There's definitely something going wrong here.
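
One way to get a dmesg that covers only that subtest, sketched under the assumptions that the commands run as root from the intel-gpu-tools build directory, that the machine was booted with drm.debug=0xe, and that util-linux dmesg with its -C (clear) option is available. capture_subtest and IGT_BIN are names invented for this example:

```shell
# Sketch: capture the kernel log for a single IGT subtest.
# IGT_BIN is a hypothetical override variable, not something IGT defines.
IGT_BIN=${IGT_BIN:-./tests/kms_frontbuffer_tracking}

capture_subtest() {
    subtest=$1
    out=${2:-dmesg-$subtest.txt}
    # Clear the kernel ring buffer so only this run is recorded.
    dmesg -C
    # Run just the one subtest and keep its own output as well.
    "$IGT_BIN" --run-subtest "$subtest" 2>&1 | tee "igt-$subtest.txt"
    # Save everything the kernel logged during the run.
    dmesg > "$out"
}
```

For the failing case here that would be capture_subtest fbc-1p-primscrn-spr-indfb-fullscreen, with X11 shut down so the test can become DRM master.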
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Zanoni, Paulo R
On Fri, 2016-05-06 at 00:54 +0200, Stefan Richter wrote:
> On May 05 Zanoni, Paulo R wrote:
> > On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > > Oh, and in case you - the person reading this commit message - found
> > > this commit through git bisect, please do the following:
> > >  - Check your dmesg and see if there are error messages mentioning
> > >    underruns around the time your problem started happening.
> > >
> > > Well, I always had the following lines in dmesg:
> > > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
> >
> > Oh, well... I had a patch that would just disable FBC in case we saw a
> > FIFO underrun, but it was rejected. Maybe this is the time to think
> > about it again? Otherwise, I can't think of much besides disabling FBC
> > on HSW until all the underruns and watermarks regressions are fixed
> > forever.
>
> Just to be clear though, I know that these messages are emitted when the
> monitor is switched on, and when sddm is being shut down --- but I do not
> know whether there is any sort of underrun when I get the FBC related
> freeze (since I just don't get any kernel messages at that point).

The fact that underruns have occurred earlier is enough to know that
something is wrong (most probably, bad watermarks): we stop reporting
underruns once we get the first one. In addition, we already know that
FBC has the tendency to amplify apparently-harmless FIFO underruns into
black screens, and I wouldn't be surprised to learn that it could also
cause full machine lockups.

> Is there a chance that a serial console would fare better than
> netconsole?  This board and another PC in its vicinity have got onboard
> serial ports but I don't have cables at the moment.

In the past, for some specific cases not related to FBC, I had more
luck with serial console than with netconsole. But if this is really
caused by FBC and watermarks, I don't think you'll be able to grab any
specific message at the time of the machine hang. OTOH, if something
actually shows up, it could help invalidate our current assumption of
the relationship between the problem and FBC and underruns.

> > >  - Download intel-gpu-tools, compile it, and run:
> > >    $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> > >    Then send us the fbc.txt file, especially if you get a failure.
> > >    This will really maximize your chances of getting the bug fixed
> > >    quickly.
> > >
> > > Do you need this while FBC is enabled, or can I run it while FBC is
> > > disabled?
> >
> > FBC enabled. Considering your description, my hope is that maybe some
> > specific subtest will be able to hang your machine, so testing this
> > again will require only running the specific subtest instead of waiting
> > 18 hours.
>
> The kms_frontbuffer_tracking runs from which I posted output two hours
> ago did not trigger a lockup.
>
> (I ran them while X11 was shut down because otherwise
> kms_frontbuffer_tracking would skip all tests with "Can't become DRM
> master, please check if no other DRM client is running.")

Yes, this is the correct way.

> > > PS:
> > > I am mentioning the following just in case that it has any relationship
> > > with the FBC related kernel freezes.  Maybe it doesn't...  There is
> > > another recent regression on this PC, but I have not yet figured out
> > > whether it was introduced by any particular kernel version.  The
> > > regression is:  When switching from X11 to text console by [Ctrl][Alt][Fx]
> > > or by shutting down sddm, I often only get a blank screen.  I suspect
> > > that this regression was introduced when I replaced kdm by sddm, but
> > > I am not sure about that.
> >
> > Maybe there is some relationship, since this operation involves a mode
> > change. You can also try checking dmesg to see if there are underruns
> > right when you do the change.
>
> Yes, this is accompanied by
> [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun

Re: [Intel-gfx] Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Stefan Richter
On May 05 Daniel Vetter wrote:
> Hm, if it's watermarks then testing with latest drm-intel-nightly would be
> interesting. We finally managed to land atomic watermark updates (should
> all be there in 4.7 too):
> 
> https://cgit.freedesktop.org/drm-intel

I will see if I can test this sometime soon.
-- 
Stefan Richter
-==- -=-= --==-
http://arcgraph.de/sr/


Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Stefan Richter
On May 05 Zanoni, Paulo R wrote:
> On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > Oh, and in case you - the person reading this commit message - found
> > this commit through git bisect, please do the following:
> >  - Check your dmesg and see if there are error messages mentioning
> >    underruns around the time your problem started happening.
> > 
> > Well, I always had the following lines in dmesg:
> > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
> 
> Oh, well... I had a patch that would just disable FBC in case we saw a
> FIFO underrun, but it was rejected. Maybe this is the time to think
> about it again? Otherwise, I can't think of much besides disabling FBC
> on HSW until all the underruns and watermarks regressions are fixed
> forever.

Just to be clear though, I know that these messages are emitted when the
monitor is switched on, and when sddm is being shut down --- but I do not
know whether there is any sort of underrun when I get the FBC related
freeze (since I just don't get any kernel messages at that point).

Is there a chance that a serial console would fare better than
netconsole?  This board and another PC in its vicinity have got onboard
serial ports but I don't have cables at the moment.

> >  - Download intel-gpu-tools, compile it, and run:
> >    $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> >    Then send us the fbc.txt file, especially if you get a failure.
> >    This will really maximize your chances of getting the bug fixed
> >    quickly.
> > 
> > Do you need this while FBC is enabled, or can I run it while FBC is
> > disabled?  
> 
> FBC enabled. Considering your description, my hope is that maybe some
> specific subtest will be able to hang your machine, so testing this
> again will require only running the specific subtest instead of waiting
> 18 hours.

The kms_frontbuffer_tracking runs from which I posted output two hours
ago did not trigger a lockup.

(I ran them while X11 was shut down because otherwise
kms_frontbuffer_tracking would skip all tests with "Can't become DRM
master, please check if no other DRM client is running.")

> > PS:
> > I am mentioning the following just in case that it has any relationship
> > with the FBC related kernel freezes.  Maybe it doesn't...  There is
> > another recent regression on this PC, but I have not yet figured out
> > whether it was introduced by any particular kernel version.  The
> > regression is:  When switching from X11 to text console by [Ctrl][Alt][Fx]
> > or by shutting down sddm, I often only get a blank screen.  I suspect
> > that this regression was introduced when I replaced kdm by sddm, but
> > I am not sure about that.  
> 
> Maybe there is some relationship, since this operation involves a mode
> change. You can also try checking dmesg to see if there are underruns
> right when you do the change.

Yes, this is accompanied by
[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun  
-- 
Stefan Richter
-==- -=-= --=-=
http://arcgraph.de/sr/


Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Stefan Richter
On May 05 Stefan Richter wrote:
> Quoting the changelog of the commit:
[...]
>  - Download intel-gpu-tools, compile it, and run:
>    $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
>    Then send us the fbc.txt file, especially if you get a failure.

Attached are results of kms_frontbuffer_tracking from current
intel-gpu-tools.git (intel-gpu-tools-1.14-273-gb4b2ac346c92), taken on
kernel v4.5.2 and on v4.6-rc5.
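
For comparing the two attached runs, the per-subtest summary lines can be reduced with standard tools. A rough sketch, assuming the fbc.txt format seen in this thread ("Subtest NAME: RESULT (time)"); list_failures and summarize are illustrative helper names, not part of intel-gpu-tools:

```shell
# Print the names of subtests whose summary line reports FAIL.
list_failures() {
    sed -n 's/^Subtest \(.*\): FAIL (.*)[[:space:]]*$/\1/p' "$1"
}

# Print one "RESULT COUNT" line for each result kind seen in this thread.
summarize() {
    for r in SUCCESS FAIL SKIP; do
        printf '%s %s\n' "$r" "$(grep -c "^Subtest .*: $r (" "$1")"
    done
}
```

Running list_failures on the result file of each kernel and diffing the two lists shows at a glance whether a failure is new in v4.6-rc5 or already present on v4.5.2.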
-- 
Stefan Richter
-==- -=-= --=-=
http://arcgraph.de/sr/
IGT-Version: 1.14-g99e61ed66f65 (x86_64) (Linux: 4.5.2 x86_64)
Primary screen: DP 2560x1440, crtc 21
FBC last action not supported
Can't test PSR: no usable eDP screen.
Sink CRC not supported: primary screen is not eDP
Subtest fbc-1p-rte: SUCCESS (2.754s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scnd_mode_params.connector_id
Can't test dual pipes with the current outputs
Subtest fbc-2p-rte: SKIP (0.000s)
Subtest fbc-1p-primscrn-pri-indfb-draw-mmap-cpu: SUCCESS (1.468s)
Subtest fbc-1p-primscrn-pri-indfb-draw-mmap-gtt: SUCCESS (0.787s)
Subtest fbc-1p-primscrn-pri-indfb-draw-mmap-wc: SUCCESS (0.774s)
Subtest fbc-1p-primscrn-pri-indfb-draw-pwrite: SUCCESS (0.958s)
Subtest fbc-1p-primscrn-pri-indfb-draw-blt: SUCCESS (0.990s)
Subtest fbc-1p-primscrn-pri-indfb-draw-render: SUCCESS (0.984s)
Subtest fbc-1p-primscrn-pri-shrfb-draw-mmap-cpu: SUCCESS (0.948s)
Subtest fbc-1p-primscrn-pri-shrfb-draw-mmap-gtt: SUCCESS (0.793s)
Subtest fbc-1p-primscrn-pri-shrfb-draw-mmap-wc: SUCCESS (0.836s)
Subtest fbc-1p-primscrn-pri-shrfb-draw-pwrite: SUCCESS (1.171s)
Subtest fbc-1p-primscrn-pri-shrfb-draw-blt: SUCCESS (0.995s)
Subtest fbc-1p-primscrn-pri-shrfb-draw-render: SUCCESS (0.977s)
Subtest fbc-1p-primscrn-cur-indfb-draw-mmap-cpu: SUCCESS (1.403s)
Subtest fbc-1p-primscrn-cur-indfb-draw-mmap-gtt: SUCCESS (0.964s)
Subtest fbc-1p-primscrn-cur-indfb-draw-mmap-wc: SUCCESS (0.939s)
Subtest fbc-1p-primscrn-cur-indfb-draw-pwrite: SUCCESS (0.965s)
Subtest fbc-1p-primscrn-cur-indfb-draw-blt: SUCCESS (0.939s)
Subtest fbc-1p-primscrn-cur-indfb-draw-render: SUCCESS (0.941s)
Subtest fbc-1p-primscrn-spr-indfb-draw-mmap-cpu: SUCCESS (1.010s)
Subtest fbc-1p-primscrn-spr-indfb-draw-mmap-gtt: SUCCESS (0.971s)
Subtest fbc-1p-primscrn-spr-indfb-draw-mmap-wc: SUCCESS (0.961s)
Subtest fbc-1p-primscrn-spr-indfb-draw-pwrite: SUCCESS (1.010s)
Subtest fbc-1p-primscrn-spr-indfb-draw-blt: SUCCESS (0.956s)
Subtest fbc-1p-primscrn-spr-indfb-draw-render: SUCCESS (1.010s)
Subtest fbc-1p-offscren-pri-indfb-draw-mmap-cpu: SUCCESS (0.502s)
Subtest fbc-1p-offscren-pri-indfb-draw-mmap-gtt: SUCCESS (0.500s)
Subtest fbc-1p-offscren-pri-indfb-draw-mmap-wc: SUCCESS (0.493s)
Subtest fbc-1p-offscren-pri-indfb-draw-pwrite: SUCCESS (0.507s)
Subtest fbc-1p-offscren-pri-indfb-draw-blt: SUCCESS (0.498s)
Subtest fbc-1p-offscren-pri-indfb-draw-render: SUCCESS (0.486s)
Subtest fbc-1p-offscren-pri-shrfb-draw-mmap-cpu: SUCCESS (0.578s)
Subtest fbc-1p-offscren-pri-shrfb-draw-mmap-gtt: SUCCESS (0.459s)
Subtest fbc-1p-offscren-pri-shrfb-draw-mmap-wc: SUCCESS (0.530s)
Subtest fbc-1p-offscren-pri-shrfb-draw-pwrite: SUCCESS (1.096s)
Subtest fbc-1p-offscren-pri-shrfb-draw-blt: SUCCESS (0.660s)
Subtest fbc-1p-offscren-pri-shrfb-draw-render: SUCCESS (0.661s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scnd_mode_params.connector_id
Can't test dual pipes with the current outputs
Subtest fbc-2p-primscrn-pri-indfb-draw-mmap-cpu: SKIP (0.000s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scnd_mode_params.connector_id
Can't test dual pipes with the current outputs
Subtest fbc-2p-primscrn-pri-indfb-draw-mmap-gtt: SKIP (0.000s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scnd_mode_params.connector_id
Can't test dual pipes with the current outputs
Subtest fbc-2p-primscrn-pri-indfb-draw-mmap-wc: SKIP (0.000s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scnd_mode_params.connector_id
Can't test dual pipes with the current outputs
Subtest fbc-2p-primscrn-pri-indfb-draw-pwrite: SKIP (0.000s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scnd_mode_params.connector_id
Can't test dual pipes with the current outputs
Subtest fbc-2p-primscrn-pri-indfb-draw-blt: SKIP (0.000s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scnd_mode_params.connector_id
Can't test dual pipes with the current outputs
Subtest fbc-2p-primscrn-pri-indfb-draw-render: SKIP (0.000s)
Test requirement not met in function check_test_requirements, file kms_frontbuffer_tracking.c:1816:
Test requirement: scn

Re: [Intel-gfx] Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Daniel Vetter
On Thu, May 05, 2016 at 06:50:14PM +0000, Zanoni, Paulo R wrote:
> On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > On Apr 30 Stefan Richter wrote:
> > > On Apr 29 Stefan Richter wrote:
> > > > On Apr 26 Stefan Richter wrote:
> > > > > v4.6-rc solidly hangs after a short while after boot, login to X11, and
> > > > > doing nothing much remarkable on the just brought up X desktop.
> > > > >
> > > > > Hardware: x86-64, E3-1245 v3 (Haswell),
> > > > >   mainboard Supermicro X10SAE,
> > > > >   using integrated Intel graphics (HD P4600, i915 driver),
> > > > >   C226 PCH's AHCI and USB 2/3, ASMedia ASM1062 AHCI,
> > > > >   Intel LAN (i217, igb driver),
> > > > >   several IEEE 1394 controllers, some of them behind
> > > > >   PCIe bridges (IDT, PLX) or PCIe-to-PCI bridges (TI, Tundra)
> > > > >   and one PCI-to-CardBus bridge (Ricoh)
> > > > >
> > > > > kernel.org kernel, Gentoo Linux userland
> > > > >
> > > > > 1. known good:  v4.5-rc5 (gcc 4.9.3)
> > > > >    known bad:   v4.6-rc2 (gcc 4.9.3), only tried one time
> > > > >
> > > > > 2. known good:  v4.5.2 (gcc 5.2.0)
> > > > >    known bad:   v4.6-rc5 (gcc 5.2.0), only tried one time
> > > > >
> > > > > I will send my linux-4.6-rc5/.config in a follow-up message.
> > >  .config: http://www.spinics.net/lists/kernel/msg2243444.html
> > >    lspci: http://www.spinics.net/lists/kernel/msg2243447.html
> > >
> > > Some userland package versions, in case these have any bearing:
> > > x11-base/xorg-drivers-1.17
> > > x11-base/xorg-server-1.17.4
> > > x11-base/xorg-x11-7.4-r2
> >
> > Furthermore, there is a single display hooked up via DisplayPort.
> >
> > > > After it proved impossible to capture an oops through netconsole, I
> > > > started git bisect.  This will apparently take almost a week, as git
> > > > estimated 13 bisection steps and I will be allowing about 12 hours of
> > > > uptime as a sign for a good kernel.  (In my four or five tests of bad
> > > > kernels before I started bisection, they hung after 3 minutes...5.5 hours
> > > > uptime, with no discernible difference in workload.  Maybe 12 h cutoff is
> > > > even too short...)
> >
> > I took at least 18 hours uptime (usually 24 hours) as a sign for good
> > kernels.  During the bisection, bad kernels hung after 3 h, 2 h, 9 min,
> > 45 min, and 4 min uptime.  Thus I arrived at a98ee79317b4 "drm/i915/fbc:
> > enable FBC by default on HSW and BDW" as the point where the hangs are
> > introduced.
> >
> > Quoting the changelog of the commit:
>
> Thanks for following the instructions on the commit message! :)
>
> > Oh, and in case you - the person reading this commit message - found
> > this commit through git bisect, please do the following:
> >  - Check your dmesg and see if there are error messages mentioning
> >    underruns around the time your problem started happening.
> >
> > Well, I always had the following lines in dmesg:
> > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
> 
> Oh, well... I had a patch that would just disable FBC in case we saw a
> FIFO underrun, but it was rejected. Maybe this is the time to think
> about it again? Otherwise, I can't think of much besides disabling FBC
> on HSW until all the underruns and watermarks regressions are fixed
> forever.

Hm, if it's watermarks then testing with latest drm-intel-nightly would be
interesting. We finally managed to land atomic watermark updates (should
all be there in 4.7 too):

https://cgit.freedesktop.org/drm-intel

Cheers, Daniel

> 
> > 
> > I always got these when I switch on the DisplayPort attached monitor.
> > Recently I changed userland from kdm to sddm and noticed that I
> > apparently get these when sddm shuts down.  I am not aware of whether
> > or not this also already happened with kdm.
> > 
> > However, "around the time your problem started happening" there is
> > nothing in dmesg, because "your problem" is a complete hang without
> > possibility of disk IO and without netconsole output.
> > 
> >  - Download intel-gpu-tools, compile it, and run:
> >    $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> >    Then send us the fbc.txt file, especially if you get a failure.
> >    This will really maximize your chances of getting the bug fixed
> >    quickly.
> > 
> > Do you need this while FBC is enabled, or can I run it while FBC is
> > disabled?
> 
> FBC enabled. Considering your description, my hope is that maybe some
> specific subtest will be able to hang your machine, so testing this
> again will require only running the specific subtest instead of waiting
> 18 hours.
> 
> > 
> >  - Try to

Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Stefan Richter
On May 05 Stefan Richter wrote:
> Quoting the changelog of the commit:
[...]
>  - Boot with drm.debug=0xe, reproduce the problem, then send us the
>    dmesg file.
> 
> I can try this, but I am skeptical about getting any useful kernel
> messages from before the hang.

I booted 4.6-rc5 with drm.debug=0xe.  It hung after about 80 minutes
uptime, and just like at all previous hangs, netconsole did not capture
anything at the time when it froze.
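
For reference, drm.debug is a bitmask. The sketch below decodes a value using the category bits commonly documented for kernels of this era (CORE 0x01, DRIVER 0x02, KMS 0x04, PRIME 0x08, ATOMIC 0x10, VBL 0x20). That table is an assumption here and should be checked against the kernel tree in use; decode_drm_debug is a made-up helper:

```shell
# Sketch: list the DRM debug categories enabled by a drm.debug value.
# The bit-to-name table is an assumption, see the note above.
decode_drm_debug() {
    mask=$(( $1 ))
    out=""
    for pair in 1:CORE 2:DRIVER 4:KMS 8:PRIME 16:ATOMIC 32:VBL; do
        bit=${pair%%:*}
        name=${pair#*:}
        # Keep the category name if its bit is set in the mask.
        if [ $(( mask & bit )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    echo "${out# }"
}
```

Under that table, decode_drm_debug 0xe reports the driver, KMS and PRIME categories, which fits the [drm:...] debug lines in the log below.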

Here is "dmesg | grep -e :00:02.0 -e i915 -e drm" from that session.

[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.6.0-rc5 root=/dev/sda4 ro rootflags=subvol=@ drm.debug=0xe
[0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.6.0-rc5 root=/dev/sda4 ro rootflags=subvol=@ drm.debug=0xe
[0.673659] pci :00:02.0: [8086:041a] type 00 class 0x03
[0.673666] pci :00:02.0: reg 0x10: [mem 0xf580-0xf5bf 64bit]
[0.673670] pci :00:02.0: reg 0x18: [mem 0xe000-0xefff 64bit pref]
[0.673673] pci :00:02.0: reg 0x20: [io  0xf000-0xf03f]
[0.705036] vgaarb: setting as boot device: PCI::00:02.0
[0.705113] vgaarb: device added: PCI::00:02.0,decodes=io+mem,owns=io+mem,locks=none
[0.705300] vgaarb: bridge control possible :00:02.0
[0.727542] pci :00:02.0: Video device with shadowed ROM at [mem 0x000c-0x000d]
[0.766034] [drm] Initialized drm 1.1.0 20060810
[0.766222] [drm:i915_dump_device_info] i915 device info: gen=7, pciid=0x041a rev=0x06 flags=need_gfx_hws,is_haswell,has_fbc,has_hotplug,has_llc,has_ddi,has_fpga_dbg,
[0.766229] [drm:intel_detect_pch] Found LynxPoint PCH
[0.766320] [drm:i915_gem_init_stolen] Memory reserved for graphics device: 32768K, usable: 31744K
[0.766321] [drm] Memory usable by graphics device = 2048M
[0.766398] [drm:i915_gem_gtt_init] GMADR size = 256M
[0.766399] [drm:i915_gem_gtt_init] GTT stolen size = 32M
[0.766399] [drm:i915_gem_gtt_init] ppgtt mode: 1
[0.766400] [drm] Replacing VGA console driver
[0.767158] [drm:intel_opregion_setup] graphic opregion physical addr: 
0xd9509018
[0.767161] [drm:intel_opregion_setup] Public ACPI methods supported
[0.767162] [drm:intel_opregion_setup] SWSCI supported
[0.772643] [drm:swsci_setup] SWSCI GBDA callbacks 0cb3, SBCB callbacks 
00300483
[0.772646] [drm:intel_opregion_setup] ASLE supported
[0.772646] [drm:intel_opregion_setup] ASLE extension supported
[0.772648] [drm:intel_opregion_setup] Found valid VBT in ACPI OpRegion 
(Mailbox #4)
[0.772717] [drm:intel_device_info_runtime_init] slice total: 0
[0.772717] [drm:intel_device_info_runtime_init] subslice total: 0
[0.772718] [drm:intel_device_info_runtime_init] subslice per slice: 0
[0.772719] [drm:intel_device_info_runtime_init] EU total: 0
[0.772720] [drm:intel_device_info_runtime_init] EU per subslice: 0
[0.772720] [drm:intel_device_info_runtime_init] has slice power gating: n
[0.772721] [drm:intel_device_info_runtime_init] has subslice power gating: n
[0.772722] [drm:intel_device_info_runtime_init] has EU power gating: n
[0.772722] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[0.772725] [drm] Driver supports precise vblank timestamp query.
[0.772727] [drm:init_vbt_defaults] Set default to SSC at 12 kHz
[0.772728] [drm:intel_bios_init] VBT signature "$VBT HASWELL", BDB 
version 170
[0.772730] [drm:parse_general_features] BDB_GENERAL_FEATURES int_tv_support 
0 int_crt_support 1 lvds_use_ssc 0 lvds_ssc_freq 12 display_clock_mode 0 
fdi_rx_polarity_inverted 0
[0.772731] [drm:parse_general_definitions] crt_ddc_bus_pin: 2
[0.772732] [drm:parse_lfp_panel_data] DRRS supported mode is static
[0.772734] [drm:parse_lfp_panel_data] Found panel mode in BIOS VBT tables:
[0.772735] [drm:drm_mode_debug_printmodeline] Modeline 0:"1024x768" 0 65000 
1024 1048 1184 1344 768 771 777 806 0x8 0xa
[0.772736] [drm:parse_lfp_panel_data] VBT initial LVDS value 300
[0.772738] [drm:parse_lfp_backlight] VBT backlight PWM modulation frequency 
200 Hz, active high, min brightness 0, level 255
[0.772739] [drm:parse_sdvo_panel_data] Found SDVO panel mode in BIOS VBT 
tables:
[0.772740] [drm:drm_mode_debug_printmodeline] Modeline 0:"1600x1200" 0 
162000 1600 1664 1856 2160 1200 1201 1204 1250 0x8 0xa
[0.772741] [drm:parse_sdvo_device_mapping] No SDVO device info is found in 
VBT
[0.772742] [drm:parse_driver_features] DRRS State Enabled:1
[0.772743] [drm:parse_ddi_port] Port B VBT info: DP:1 HDMI:1 DVI:1 EDP:0 
CRT:0
[0.772745] [drm:parse_ddi_port] VBT HDMI level shift for port B: 6
[0.772745] [drm:parse_ddi_port] Port C VBT info: DP:0 HDMI:1 DVI:1 EDP:0 
CRT:0
[0.772746] [drm:parse_ddi_port] VBT HDMI level shift for port C: 6
[0.772747] [drm:parse_ddi_port] Port D VBT info: DP:1 HDMI:1 DVI:1 EDP:0 
CRT:0
[0.772748] [drm:parse_ddi_port] VBT HDMI level shift for port D: 6
[

Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Zanoni, Paulo R
On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> On Apr 30 Stefan Richter wrote:
> > 
> > On Apr 29 Stefan Richter wrote:
> > > 
> > > On Apr 26 Stefan Richter wrote:  
> > > > 
> > > > v4.6-rc solidly hangs after a short while after boot, login to X11, and
> > > > doing nothing much remarkable on the just brought up X desktop.
> > > > 
> > > > Hardware: x86-64, E3-1245 v3 (Haswell),
> > > >   mainboard Supermicro X10SAE,
> > > >   using integrated Intel graphics (HD P4600, i915 driver),
> > > >   C226 PCH's AHCI and USB 2/3, ASMedia ASM1062 AHCI,
> > > >   Intel LAN (i217, igb driver),
> > > >   several IEEE 1394 controllers, some of them behind
> > > >   PCIe bridges (IDT, PLX) or PCIe-to-PCI bridges (TI, Tundra)
> > > >   and one PCI-to-CardBus bridge (Ricoh)
> > > > 
> > > > kernel.org kernel, Gentoo Linux userland
> > > > 
> > > > 1. known good:  v4.5-rc5 (gcc 4.9.3)
> > > >    known bad:   v4.6-rc2 (gcc 4.9.3), only tried one time
> > > > 
> > > > 2. known good:  v4.5.2 (gcc 5.2.0)
> > > >    known bad:   v4.6-rc5 (gcc 5.2.0), only tried one time
> > > > 
> > > > I will send my linux-4.6-rc5/.config in a follow-up message.  
> >  .config: http://www.spinics.net/lists/kernel/msg2243444.html
> >    lspci: http://www.spinics.net/lists/kernel/msg2243447.html
> > 
> > Some userland package versions, in case these have any bearing:
> > x11-base/xorg-drivers-1.17
> > x11-base/xorg-server-1.17.4
> > x11-base/xorg-x11-7.4-r2
> Furthermore, there is a single display hooked up via DisplayPort.
> 
> > 
> > > 
> > > After it proved impossible to capture an oops through netconsole, I
> > > started git bisect.  This will apparently take almost a week, as git
> > > estimated 13 bisection steps and I will be allowing about 12 hours of
> > > uptime as a sign for a good kernel.  (In my four or five tests of bad
> > > kernels before I started bisection, they hung after 3 minutes...5.5 hours
> > > uptime, with no discernible difference in workload.  Maybe 12 h cutoff is
> > > even too short...)
> I took at least 18 hours uptime (usually 24 hours) as a sign for good
> kernels.  During the bisection, bad kernels hung after 3 h, 2 h, 9 min,
> 45 min, and 4 min uptime.  Thus I arrived at a98ee79317b4 "drm/i915/fbc:
> enable FBC by default on HSW and BDW" as the point where the hangs are
> introduced.
> 
> Quoting the changelog of the commit:

Thanks for following the instructions in the commit message! :)

> 
> Oh, and in case you - the person reading this commit message - found
> this commit through git bisect, please do the following:
>  - Check your dmesg and see if there are error messages mentioning
>    underruns around the time your problem started happening.
> 
> Well, I always had the following lines in dmesg:
> [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun

Oh, well... I had a patch that would just disable FBC in case we saw a
FIFO underrun, but it was rejected. Maybe this is the time to think
about it again? Otherwise, I can't think of much besides disabling FBC
on HSW until all the underrun and watermark regressions are fixed
for good.

> 
> I always get these when I switch on the DisplayPort attached monitor.
> Recently I changed userland from kdm to sddm and noticed that I
> apparently get these when sddm shuts down.  I am not aware of whether
> or not this also already happened with kdm.
> 
> However, "around the time your problem started happening" there is
> nothing in dmesg, because "your problem" is a complete hang without
> possibility of disk IO and without netconsole output.
> 
>  - Download intel-gpu-tools, compile it, and run:
>    $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
>    Then send us the fbc.txt file, especially if you get a failure.
>    This will really maximize your chances of getting the bug fixed
>    quickly.
> 
> Do you need this while FBC is enabled, or can I run it while FBC is
> disabled?

FBC enabled. Considering your description, my hope is that maybe some
specific subtest will be able to hang your machine, so testing this
again will require only running the specific subtest instead of waiting
18 hours.
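
Since the hang leaves nothing behind, one way to find the culprit subtest
is to run the FBC subtests one at a time and write each name to disk
before starting it.  The following is only a sketch of that idea:
run_fbc_subtests, IGT_BIN and LOG are names made up here, and it assumes
the test binary accepts --list-subtests in addition to --run-subtest.

```shell
# Run every FBC subtest on its own and record its name on disk *before*
# starting it, so the last "starting" line in the log names the subtest
# that was running when the machine froze.
# IGT_BIN and LOG are placeholders; adjust them to your build tree.
IGT_BIN=${IGT_BIN:-./tests/kms_frontbuffer_tracking}
LOG=${LOG:-fbc-progress.log}

run_fbc_subtests() {
    "$IGT_BIN" --list-subtests | grep 'fbc-' | while read -r subtest; do
        echo "starting $subtest" >> "$LOG"
        sync    # flush the log now; nothing reaches the disk after a hard hang
        status=0
        # stdin comes from /dev/null so the test cannot eat the subtest list
        "$IGT_BIN" --run-subtest "$subtest" < /dev/null >> "$LOG" 2>&1 || status=$?
        echo "finished $subtest (exit status $status)" >> "$LOG"
    done
}
```

Run run_fbc_subtests as root with FBC enabled.  After a hang and reboot,
a "starting" line without a matching "finished" line points at the
subtest to retest in isolation.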

> 
>  - Try to find a reliable way to reproduce the problem, and tell us.
> 
> The reliable way is to just wait for the kernel to hang after about
> 3 minutes to 5.5 hours.  I have not identified any special activity
> which would trigger the hang.
> 
>  - Boot with drm.debug=0xe, reproduce the problem, then send us the
>    dmesg file.
> 
> I can try this, but I am skeptical about getting any useful kernel
> messages from before the hang.

Agreed.

> 
> PS:
> I am mentioning the following just in case that it has any
> relation

Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"

2016-05-05 Thread Stefan Richter
On Apr 30 Stefan Richter wrote:
> On Apr 29 Stefan Richter wrote:
> > On Apr 26 Stefan Richter wrote:  
> > > v4.6-rc solidly hangs after a short while after boot, login to X11, and
> > > doing nothing much remarkable on the just brought up X desktop.
> > > 
> > > Hardware: x86-64, E3-1245 v3 (Haswell),
> > >   mainboard Supermicro X10SAE,
> > >   using integrated Intel graphics (HD P4600, i915 driver),
> > >   C226 PCH's AHCI and USB 2/3, ASMedia ASM1062 AHCI,
> > >   Intel LAN (i217, igb driver),
> > >   several IEEE 1394 controllers, some of them behind
> > >   PCIe bridges (IDT, PLX) or PCIe-to-PCI bridges (TI, Tundra)
> > >   and one PCI-to-CardBus bridge (Ricoh)
> > > 
> > > kernel.org kernel, Gentoo Linux userland
> > > 
> > > 1. known good:  v4.5-rc5 (gcc 4.9.3)
> > >    known bad:   v4.6-rc2 (gcc 4.9.3), only tried one time
> > > 
> > > 2. known good:  v4.5.2 (gcc 5.2.0)
> > >    known bad:   v4.6-rc5 (gcc 5.2.0), only tried one time
> > > 
> > > I will send my linux-4.6-rc5/.config in a follow-up message.  
> 
>  .config: http://www.spinics.net/lists/kernel/msg2243444.html
>    lspci: http://www.spinics.net/lists/kernel/msg2243447.html
> 
> Some userland package versions, in case these have any bearing:
> x11-base/xorg-drivers-1.17
> x11-base/xorg-server-1.17.4
> x11-base/xorg-x11-7.4-r2

Furthermore, there is a single display hooked up via DisplayPort.

> > After it proved impossible to capture an oops through netconsole, I
> > started git bisect.  This will apparently take almost a week, as git
> > estimated 13 bisection steps and I will be allowing about 12 hours of
> > uptime as a sign for a good kernel.  (In my four or five tests of bad
> > kernels before I started bisection, they hung after 3 minutes...5.5 hours
> > uptime, with no discernible difference in workload.  Maybe 12 h cutoff is
> > even too short...)  

I took at least 18 hours uptime (usually 24 hours) as a sign for good
kernels.  During the bisection, bad kernels hung after 3 h, 2 h, 9 min,
45 min, and 4 min uptime.  Thus I arrived at a98ee79317b4 "drm/i915/fbc:
enable FBC by default on HSW and BDW" as the point where the hangs are
introduced.
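
To put the cutoffs into perspective, here is the worst-case arithmetic as
a small sketch.  It assumes every one of the 13 estimated steps runs for
the full cutoff, which overstates the real time because bad kernels hang
much sooner.  bisect_hours is a name made up for this illustration.

```shell
# Upper bound on bisection wall-clock time: number of steps times the
# uptime cutoff that is taken as proof of a good kernel.
bisect_hours() {
    steps=$1
    cutoff_hours=$2
    echo $(( steps * cutoff_hours ))
}

for cutoff in 12 18 24; do
    total=$(bisect_hours 13 "$cutoff")
    # Integer division, so the day count is rounded down.
    echo "cutoff ${cutoff} h: at most ${total} h ($(( total / 24 )) full days)"
done
```

With the originally planned 12 h cutoff this gives 13 x 12 = 156 hours,
about 6.5 days, which is the "almost a week" estimated above.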

Quoting the changelog of the commit:

Oh, and in case you - the person reading this commit message - found
this commit through git bisect, please do the following:
 - Check your dmesg and see if there are error messages mentioning
   underruns around the time your problem started happening.

Well, I always had the following lines in dmesg:
[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun

I always get these when I switch on the DisplayPort attached monitor.
Recently I changed userland from kdm to sddm and noticed that I
apparently get these when sddm shuts down.  I am not aware of whether
or not this also already happened with kdm.

However, "around the time your problem started happening" there is
nothing in dmesg, because "your problem" is a complete hang without
possibility of disk IO and without netconsole output.

 - Download intel-gpu-tools, compile it, and run:
   $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
   Then send us the fbc.txt file, especially if you get a failure.
   This will really maximize your chances of getting the bug fixed
   quickly.

Do you need this while FBC is enabled, or can I run it while FBC is
disabled?

 - Try to find a reliable way to reproduce the problem, and tell us.

The reliable way is to just wait for the kernel to hang after about
3 minutes to 5.5 hours.  I have not identified any special activity
which would trigger the hang.

 - Boot with drm.debug=0xe, reproduce the problem, then send us the
   dmesg file.

I can try this, but I am skeptical about getting any useful kernel
messages from before the hang.

PS:
I am mentioning the following just in case it has any relationship
with the FBC related kernel freezes.  Maybe it doesn't...  There is
another recent regression on this PC, but I have not yet figured out
whether it was introduced by any particular kernel version.  The
regression is:  When switching from X11 to text console by [Ctrl][Alt][Fx]
or by shutting down sddm, I often only get a blank screen.  I suspect
that this regression was introduced when I replaced kdm by sddm, but
I am not sure about that.
-- 
Stefan Richter
-==- -=-= --=-=
http://arcgraph.de/sr/