Re: [Intel-gfx] [PATCH 0/2] Nuke PAGE_KERNEL_IO

2021-11-12 Thread Andy Lutomirski

On 10/21/21 11:15, Lucas De Marchi wrote:

Last user of PAGE_KERNEL_IO is the i915 driver. While removing it from
there as we seek to bring the driver to other architectures, Daniel
suggested that we could finish the cleanup and remove it altogether,
through the tip tree. So here I'm sending both commits needed for that.

Lucas De Marchi (2):
   drm/i915/gem: stop using PAGE_KERNEL_IO
   x86/mm: nuke PAGE_KERNEL_IO

  arch/x86/include/asm/fixmap.h             | 2 +-
  arch/x86/include/asm/pgtable_types.h      | 7 ---
  arch/x86/mm/ioremap.c                     | 2 +-
  arch/x86/xen/setup.c                      | 2 +-
  drivers/gpu/drm/i915/gem/i915_gem_pages.c | 4 ++--
  include/asm-generic/fixmap.h              | 2 +-
  6 files changed, 6 insertions(+), 13 deletions(-)



Acked-by: Andy Lutomirski 


Re: [Intel-gfx] [PATCH 1/5] drm/i915: Improve PSR activation timing

2018-02-27 Thread Andy Lutomirski
On Wed, Feb 28, 2018 at 12:26 AM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> Quoting Andy Lutomirski (2018-02-24 00:07:23)
>> On Tue, Feb 13, 2018 at 11:26 PM, Rodrigo Vivi <rodrigo.v...@intel.com> 
>> wrote:
>> > From: Andy Lutomirski <l...@kernel.org>
>> >
>> > +
>> > +   dev_priv->psr.activate_timer.expires = jiffies - 1;
>>
>> That can't possibly be okay.
>
> As an initialisation value, set to the previous jiffie? You can set it
> to the current jiffie, but then you have the issue of not noticing the
> update to the current jiffie.
>
> So how is this any more incorrect?

I don't think you can just write to fields in struct timer_list like that.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/3] drm/i915: Improve PSR activation timing

2018-02-27 Thread Andy Lutomirski
On Wed, Feb 28, 2018 at 12:22 AM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> Quoting Rodrigo Vivi (2018-02-28 00:14:08)
>> From: Andy Lutomirski <l...@kernel.org>
>>
>> The current PSR code has two call sites that each schedule delayed
>> work to activate PSR.  As far as I can tell, each call site intends
>> to keep PSR inactive for the given amount of time and then allow it
>> to be activated.
>>
>> The call sites are:
>>
>>  - intel_psr_enable(), which explicitly states in a comment that
>>    it's trying to keep PSR off a short time after the display is
>>    initialized as a workaround.
>>
>>  - intel_psr_flush().  There isn't an explicit explanation, but the
>>    intent is presumably to keep PSR off until the display has been
>>    idle for 100ms.
>>
>> The current code doesn't actually accomplish either of these goals.
>> Rather than keeping PSR inactive for the given amount of time, it
>> will schedule PSR for activation after the given time, with the
>> earliest target time in such a request winning.
>>
>> In other words, if intel_psr_enable() is immediately followed by
>> intel_psr_flush(), then PSR will be activated after 100ms even if
>> intel_psr_enable() wanted a longer delay.  And, if the screen is
>> being constantly updated so that intel_psr_flush() is called once
>> per frame at 60Hz, PSR will still be activated once every 100ms.
>>
>> Rewrite the code so that it does what was intended.  This adds
>> a new function intel_psr_schedule(), which will enable PSR after
>> the requested time but no sooner.
>>
>> v3: (by Rodrigo): Rebased on top of recent drm-tip without any
>> modification from the original.
>>
>> Cc: Dhinakaran Pandiyan <dhinakaran.pandi...@intel.com>
>> Signed-off-by: Andy Lutomirski <l...@kernel.org>
>> Signed-off-by: Rodrigo Vivi <rodrigo.v...@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_debugfs.c |  9 +++--
>>  drivers/gpu/drm/i915/i915_drv.h     |  4 ++-
>>  drivers/gpu/drm/i915/intel_psr.c    | 69 -
>>  3 files changed, 71 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 33fbf3965309..1ac942d1742e 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2572,8 +2572,13 @@ static int i915_edp_psr_status(struct seq_file *m, void *data)
>> seq_printf(m, "Active: %s\n", yesno(dev_priv->psr.active));
>> seq_printf(m, "Busy frontbuffer bits: 0x%03x\n",
>>dev_priv->psr.busy_frontbuffer_bits);
>> -   seq_printf(m, "Re-enable work scheduled: %s\n",
>> -  yesno(work_busy(&dev_priv->psr.work.work)));
>> +
>> +   if (timer_pending(&dev_priv->psr.activate_timer))
>> +   seq_printf(m, "Activate scheduled: yes, in %ldms\n",
>> +  (long)(dev_priv->psr.earliest_activate - jiffies) *
>
> jiffies_to_msecs()
>
>> +  1000 / HZ);
>> +   else
>> +   seq_printf(m, "Re-enable scheduled: no\n");
>>
>> if (HAS_DDI(dev_priv)) {
>> if (dev_priv->psr.psr2_support)
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 7bbec5546d12..6e6cf2ce3749 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -764,7 +764,9 @@ struct i915_psr {
>> bool sink_support;
>> struct intel_dp *enabled;
>> bool active;
>> -   struct delayed_work work;
>> +   struct timer_list activate_timer;
>> +   struct work_struct activate_work;
>> +   unsigned long earliest_activate;
>
> Incorporated into struct timer_list, so this is redundant.

This way gives a clean way to say "don't do the work before
such-and-such time".  I don't think we can do it with mod_timer()
since the timer might already have started firing, and we can't
del_timer_sync() because there would be a lock inversion.
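
To make the "don't do the work before such-and-such time" idea concrete,
here is a minimal userspace sketch, not the driver code.  The names
tick_after(), struct psr_model, model_schedule() and model_work() are made
up for illustration; tick_after() mirrors the kernel's wraparound-safe
time_after() comparison, and the recheck in the work handler is an
assumption about how such a gate would be used:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Wraparound-safe "a is after b", same idea as the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for the PSR state; not the real struct i915_psr. */
struct psr_model {
	unsigned long earliest_activate; /* do not activate before this tick */
	bool active;
};

/* Ask that activation happen no sooner than now + min_wait ticks. */
void model_schedule(struct psr_model *psr, unsigned long now,
		    unsigned long min_wait)
{
	unsigned long next = now + min_wait;

	assert(psr != NULL);

	/* Only ever move the deadline later, never earlier. */
	if (tick_after(next, psr->earliest_activate))
		psr->earliest_activate = next;
}

/*
 * Work handler.  It may have been woken by an older timer, so it
 * rechecks the deadline instead of trusting whatever woke it.
 * Returns true if it activated, false if it has to be re-armed.
 */
bool model_work(struct psr_model *psr, unsigned long now)
{
	assert(psr != NULL);

	if (tick_after(psr->earliest_activate, now))
		return false;

	psr->active = true;
	return true;
}
```

Because the handler rechecks the deadline, a handler that was already
woken for an older expiry simply declines to activate and can be re-armed
for the later time.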


Re: [Intel-gfx] [PATCH 1/5] drm/i915: Improve PSR activation timing

2018-02-23 Thread Andy Lutomirski
On Tue, Feb 13, 2018 at 11:26 PM, Rodrigo Vivi <rodrigo.v...@intel.com> wrote:
> From: Andy Lutomirski <l...@kernel.org>
>
> +
> +   dev_priv->psr.activate_timer.expires = jiffies - 1;

That can't possibly be okay.


Re: [Intel-gfx] [PATCH] drm/i915: Improve PSR activation timing

2018-02-09 Thread Andy Lutomirski
On Fri, Feb 9, 2018 at 7:39 AM, Rodrigo Vivi <rodrigo.v...@intel.com> wrote:
> Rodrigo Vivi <rodrigo.v...@intel.com> writes:
>
>> "Pandiyan, Dhinakaran" <dhinakaran.pandi...@intel.com> writes:
>>
>>> On Thu, 2018-02-08 at 14:48 -0800, Rodrigo Vivi wrote:
>>>> Hi Andy,
>>>>
>>>> thanks for getting involved with PSR and sorry for not replying sooner.
>>>>
>>>> I first saw this patch on that bugzilla entry, but only now have I
>>>> stopped to really think about why I wrote the code that way.
>>>>
>>>> So some clarity below.
>>>>
>>>> On Mon, Feb 05, 2018 at 10:07:09PM +0000, Andy Lutomirski wrote:
>>>> > The current PSR code has two call sites that each schedule delayed
>>>> > work to activate PSR.  As far as I can tell, each call site intends
>>>> > to keep PSR inactive for the given amount of time and then allow it
>>>> > to be activated.
>>>> >
>>>> > The call sites are:
>>>> >
>>>> >  - intel_psr_enable(), which explicitly states in a comment that
>>>> >    it's trying to keep PSR off a short time after the display is
>>>> >    initialized as a workaround.
>>>>
>>>> First of all I really want to kill this call here and remove the
>>>> FIXME. It was an ugly hack that I added to solve a corner case
>>>> that was leaving me with blank screens when activating too soon.
>>>>
>>>> >
>>>> >  - intel_psr_flush().  There isn't an explicit explanation, but the
>>>> >    intent is presumably to keep PSR off until the display has been
>>>> >    idle for 100ms.
>>>>
>>>> The reason for 100 is a kind of ugly, nonsensical, empirical value
>>>> that I arrived at from VLV/CHV experience.
>>>> On platforms with HW tracking, the HW waits for a few identical frames
>>>> before really activating PSR. VLV/CHV activation is immediate.
>>>> But the HW is also different, and there it seemed that the HW needed
>>>> a bit more time before starting the transitions.
>>>> Furthermore, I didn't want to do that so quickly because I didn't
>>>> want to risk killing the battery by doing such quick transitions
>>>> with software tracking.
>>>>
>>>> >
>>>> > The current code doesn't actually accomplish either of these goals.
>>>> > Rather than keeping PSR inactive for the given amount of time, it
>>>> > will schedule PSR for activation after the given time, with the
>>>> > earliest target time in such a request winning.
>>>>
>>>> Put that way, I was asking myself how that hack had ever fixed
>>>> my issue, because from your explanation it seems obvious that it
>>>> wouldn't ever fix my bug or any other.
>>>>
>>>> So I applied your patch and it made even more sense (without considering
>>>> the fact I want to kill the first call anyway).
>>>>
>>>> So I came back, removed your patch and tried to understand how
>>>> it ever worked.
>>>>
>>>> So, the thing is that intel_psr_flush will never really be executed
>>>> if intel_psr_enable wasn't executed. That is guaranteed by:
>>>>
>>>> mutex_lock(&dev_priv->psr.lock);
>>>> if (!dev_priv->psr.enabled) {
>>>>
>>>> So intel_psr_enable will for sure be the first one to schedule the
>>>> delayed work, with the ugly higher delay.
>>>>
>>>> >
>>>> > In other words, if intel_psr_enable() is immediately followed by
>>>> > intel_psr_flush(), then PSR will be activated after 100ms even if
>>>> > intel_psr_enable() wanted a longer delay.  And, if the screen is
>>>> > being constantly updated so that intel_psr_flush() is called once
>>>> > per frame at 60Hz, PSR will still be activated once every 100ms.
>>>>
>>>> During this time you are right, many calls of intel_psr_exit
>>>> coming from flush functions can happen... But none of
>>>> them will schedule the work with the 100 delay.
>>>>
>>>> They will skip on
>>>> if (!work_busy(&dev_priv->psr.work.work))
>>>
>>> Wouldn't work_busy() return false until the work is actually queued
>>> which is 100ms after calling schedule_delayed_work()?
>>
>> That's not my understanding

Re: [Intel-gfx] [PATCH] drm/i915: Improve PSR activation timing

2018-02-08 Thread Andy Lutomirski



> On Feb 8, 2018, at 4:39 PM, Pandiyan, Dhinakaran 
> <dhinakaran.pandi...@intel.com> wrote:
> 
> 
>> On Thu, 2018-02-08 at 14:48 -0800, Rodrigo Vivi wrote:
>> Hi Andy,
>> 
>> thanks for getting involved with PSR and sorry for not replying sooner.
>> 
>> I first saw this patch on that bugzilla entry, but only now have I
>> stopped to really think about why I wrote the code that way.
>> 
>> So some clarity below.
>> 
>>> On Mon, Feb 05, 2018 at 10:07:09PM +0000, Andy Lutomirski wrote:
>>> The current PSR code has two call sites that each schedule delayed
>>> work to activate PSR.  As far as I can tell, each call site intends
>>> to keep PSR inactive for the given amount of time and then allow it
>>> to be activated.
>>> 
>>> The call sites are:
>>> 
>>> - intel_psr_enable(), which explicitly states in a comment that
>>>   it's trying to keep PSR off a short time after the display is
>>>   initialized as a workaround.
>> 
>> First of all I really want to kill this call here and remove the
>> FIXME. It was an ugly hack that I added to solve a corner case
>> that was leaving me with blank screens when activating too soon.
>> 
>>> 
>>> - intel_psr_flush().  There isn't an explicit explanation, but the
>>>   intent is presumably to keep PSR off until the display has been
>>>   idle for 100ms.
>> 
>> The reason for 100 is a kind of ugly, nonsensical, empirical value
>> that I arrived at from VLV/CHV experience.
>> On platforms with HW tracking, the HW waits for a few identical frames
>> before really activating PSR. VLV/CHV activation is immediate.
>> But the HW is also different, and there it seemed that the HW needed
>> a bit more time before starting the transitions.
>> Furthermore, I didn't want to do that so quickly because I didn't
>> want to risk killing the battery by doing such quick transitions
>> with software tracking.
>> 
>>> 
>>> The current code doesn't actually accomplish either of these goals.
>>> Rather than keeping PSR inactive for the given amount of time, it
>>> will schedule PSR for activation after the given time, with the
>>> earliest target time in such a request winning.
>> 
>> Put that way, I was asking myself how that hack had ever fixed
>> my issue, because from your explanation it seems obvious that it
>> wouldn't ever fix my bug or any other.
>> 
>> So I applied your patch and it made even more sense (without considering
>> the fact I want to kill the first call anyway).
>> 
>> So I came back, removed your patch and tried to understand how
>> it ever worked.
>> 
>> So, the thing is that intel_psr_flush will never really be executed
>> if intel_psr_enable wasn't executed. That is guaranteed by:
>> 
>> mutex_lock(&dev_priv->psr.lock);
>> if (!dev_priv->psr.enabled) {
>> 
>> So intel_psr_enable will for sure be the first one to schedule the
>> delayed work, with the ugly higher delay.
>> 
>>> 
>>> In other words, if intel_psr_enable() is immediately followed by
>>> intel_psr_flush(), then PSR will be activated after 100ms even if
>>> intel_psr_enable() wanted a longer delay.  And, if the screen is
>>> being constantly updated so that intel_psr_flush() is called once
>>> per frame at 60Hz, PSR will still be activated once every 100ms.
>> 
>> During this time you are right, many calls of intel_psr_exit
>> coming from flush functions can happen... But none of
>> them will schedule the work with the 100 delay.
>> 
>> They will skip on
>> if (!work_busy(&dev_priv->psr.work.work))

As below, the first call will.  Then, 100ms later, the work will fire.  Then 
the next flush will schedule it again, etc.

> 
> Wouldn't work_busy() return false until the work is actually queued
> which is 100ms after calling schedule_delayed_work()?
> 
> For e.g, flushes at 0, 16, 32...96 will have work_busy() returning false
> until 100ms.
> 
> The first psr_work will end up getting scheduled at 100ms, which I
> believe is not what we want. 

Indeed.  I stuck some printks in and this seems to be what happens.
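
A toy userspace model of that pattern, as a sketch only: the names and the
1ms stepping are invented for illustration, and the only behavior assumed
from the old code is that a request made while one is already pending does
not move the deadline.

```c
#include <assert.h>
#include <stdbool.h>

#define FLUSH_PERIOD_MS 16   /* roughly one frame at 60Hz */
#define IDLE_DELAY_MS   100  /* delay requested by each flush */

/* Hypothetical stand-in for a delayed work item or timer. */
struct sim_timer {
	bool pending;
	long fire_at_ms;
};

/* Old pattern: a request made while one is already pending is ignored. */
void request_keep_earliest(struct sim_timer *t, long now_ms)
{
	if (t->pending)
		return;
	t->pending = true;
	t->fire_at_ms = now_ms + IDLE_DELAY_MS;
}

/* Intended pattern: every request pushes the deadline further out. */
void request_extend(struct sim_timer *t, long now_ms)
{
	t->pending = true;
	t->fire_at_ms = now_ms + IDLE_DELAY_MS;
}

/*
 * Step time in 1ms ticks for duration_ms, issuing a flush every
 * FLUSH_PERIOD_MS, and count how many times the timer fires.
 */
int count_activations(void (*request)(struct sim_timer *, long),
		      long duration_ms)
{
	struct sim_timer t = { .pending = false, .fire_at_ms = 0 };
	int fired = 0;

	assert(duration_ms >= 0);

	for (long now = 0; now < duration_ms; now++) {
		/* Expire the timer first, then handle this tick's flush. */
		if (t.pending && now >= t.fire_at_ms) {
			t.pending = false;
			fired++;
		}
		if (now % FLUSH_PERIOD_MS == 0)
			request(&t, now);
	}
	return fired;
}
```

With flushes every 16ms for one second, the keep-earliest policy fires nine
times (at 100ms, 212ms, and so on), while the extend policy never fires as
long as the flushes continue.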

> 
> 
> However, I think 
> 
>if (dev_priv->psr.busy_frontbuffer_bits)
>goto unlock;
> 
>intel_psr_activate(intel_dp);
> 
> in psr_work might prevent activate being called at 100ms if an
> invalidate happened to be called before that.
> 

On my system, invalidate is never called.  Even if it were called, that check 
would only help if we got lucky and t

Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-05 Thread Andy Lutomirski


> On Feb 5, 2018, at 2:50 PM, Rodrigo Vivi <rodrigo.v...@intel.com> wrote:
> 
>> On Sat, Feb 03, 2018 at 05:33:08PM +0000, Andy Lutomirski wrote:
>>> On Fri, Feb 2, 2018 at 7:18 PM, Andy Lutomirski <l...@kernel.org> wrote:
>>>> On Fri, Feb 2, 2018 at 1:24 AM, Andy Lutomirski <l...@kernel.org> wrote:
>>>>> On Thu, Feb 1, 2018 at 9:20 PM, Chris Wilson <ch...@chris-wilson.co.uk> 
>>>>> wrote:
>>>>> Quoting Andy Lutomirski (2018-02-01 21:04:30)
>>>>>> I got this after a recent suspend/resume:
>>>>>> 
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Lid closed.
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: device-enumerator: scan all 
>>>>>> dirs
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>>>>> scanning /sys/bus
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>>>>> scanning /sys/class
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Failed to open
>>>>>> configuration file '/etc/systemd/sleep.conf': No such file or
>>>>>> directory
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Suspending...
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>>>>> sender=n/a destination=n/a object=/org/freedesktop/login1
>>>>>> interface=org.freedesktop.login1.Manager member=PrepareForSleep
>>>>>> cookie=570 reply
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Got message
>>>>>> type=method_call sender=:1.46 destination=:1.1
>>>>>> object=/org/freedesktop/login1/session/_32
>>>>>> interface=org.freedesktop.login1.Session member=ReleaseDevice
>>>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>>>>> sender=n/a destination=:1.46
>>>>>> object=/org/freedesktop/login1/session/_32
>>>>>> interface=org.freedesktop.login1.Session member=PauseDevice cookie
>>>>>> Feb 01 09:44:34 laptop gnome-shell[2630]: Failed to apply DRM plane
>>>>>> transform 0: Permission denied
>>>>>> Feb 01 09:44:34 laptop gnome-shell[2630]: drmModeSetCursor2 failed
>>>>>> with (Permission denied), drawing cursor with OpenGL from now on
>>>>>> 
>>>>>> But I don't see the word "cursor" in my system logs before the first
>>>>>> suspend.  What am I looking for?  This is Fedora 27 running a Gnome
>>>>>> Wayland session, but it hasn't been reinstalled in some time, so it's
>>>>>> possible that there are some weird settings sitting around.  But I did
>>>>>> check and I have no weird i915 parameters.
>>>>> 
>>>>> You are using gnome-shell as the display server. From that it appears to
>>>>> have started off with a HW cursor and switched to a SW cursor after
>>>>> suspend. Did you notice a change in behaviour? After rebooting or just
>>>>> restarting gnome-shell?
>>>> 
>>>> I think it's less consistently bad after a reboot before suspending.
>>>> 
>>>>> 
>>>>>> Also, are these things potentially related:
>>>>>> 
>>>>>> [ 3067.702527] [drm:intel_pipe_update_start [i915]] *ERROR* Potential
>>>>>> atomic update failure on pipe A
>>>>> 
>>>>> They are just "missed the immediate vblank for the screen update"
>>>>> messages. Should not be related to PSR, but may cause jitter by delaying
>>>>> the odd screen update.
>>>> 
>>>> I just got this one, and the timestamp is at least reasonably close to
>>>> a giant latency spike:
>>>> 
>>>> [  288.799654] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
>>>> update failure on pipe A (start=31 end=32) time 15 us, min 1073, max
>>>> 1079, scanline start 1087, end 1088
>>>> 
>>>>> 
>>>>>> As I'm typing this, I've seen a couple instances of what seems like a
>>>>>> full *second* of cursor latency, but I've only gotten the potential
>>>>>> atomic update failure once.
>>>>>> 
>>>>>> And is there any straightforward tracing to do to distinguish between
>>>>>> PSR exit latency and other potential s

[Intel-gfx] [PATCH] drm/i915: Improve PSR activation timing

2018-02-05 Thread Andy Lutomirski
The current PSR code has two call sites that each schedule delayed
work to activate PSR.  As far as I can tell, each call site intends
to keep PSR inactive for the given amount of time and then allow it
to be activated.

The call sites are:

 - intel_psr_enable(), which explicitly states in a comment that
   it's trying to keep PSR off a short time after the display is
   initialized as a workaround.

 - intel_psr_flush().  There isn't an explicit explanation, but the
   intent is presumably to keep PSR off until the display has been
   idle for 100ms.

The current code doesn't actually accomplish either of these goals.
Rather than keeping PSR inactive for the given amount of time, it
will schedule PSR for activation after the given time, with the
earliest target time in such a request winning.

In other words, if intel_psr_enable() is immediately followed by
intel_psr_flush(), then PSR will be activated after 100ms even if
intel_psr_enable() wanted a longer delay.  And, if the screen is
being constantly updated so that intel_psr_flush() is called once
per frame at 60Hz, PSR will still be activated once every 100ms.

Rewrite the code so that it does what was intended.  This adds
a new function intel_psr_schedule(), which will enable PSR after
the requested time but no sooner.

Signed-off-by: Andy Lutomirski <l...@kernel.org>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  9 +++--
 drivers/gpu/drm/i915/i915_drv.h     |  4 ++-
 drivers/gpu/drm/i915/intel_psr.c    | 69 -
 3 files changed, 71 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c65e381b85f3..b67db93f905d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2663,8 +2663,13 @@ static int i915_edp_psr_status(struct seq_file *m, void *data)
seq_printf(m, "Active: %s\n", yesno(dev_priv->psr.active));
seq_printf(m, "Busy frontbuffer bits: 0x%03x\n",
   dev_priv->psr.busy_frontbuffer_bits);
-   seq_printf(m, "Re-enable work scheduled: %s\n",
-  yesno(work_busy(&dev_priv->psr.work.work)));
+
+   if (timer_pending(&dev_priv->psr.activate_timer))
+   seq_printf(m, "Activate scheduled: yes, in %ldms\n",
+  (long)(dev_priv->psr.earliest_activate - jiffies) *
+  1000 / HZ);
+   else
+   seq_printf(m, "Re-enable scheduled: no\n");
 
if (HAS_DDI(dev_priv)) {
if (dev_priv->psr.psr2_support)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 46eb729b367d..c0fb7d65cda6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1192,7 +1192,9 @@ struct i915_psr {
bool source_ok;
struct intel_dp *enabled;
bool active;
-   struct delayed_work work;
+   struct timer_list activate_timer;
+   struct work_struct activate_work;
+   unsigned long earliest_activate;
unsigned busy_frontbuffer_bits;
bool psr2_support;
bool aux_frame_sync;
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 55ea5eb3b7df..333d90d4e5af 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -461,6 +461,30 @@ static void intel_psr_activate(struct intel_dp *intel_dp)
dev_priv->psr.active = true;
 }
 
+static void intel_psr_schedule(struct drm_i915_private *dev_priv,
+  unsigned long min_wait_ms)
+{
+   unsigned long next;
+
+   lockdep_assert_held(&dev_priv->psr.lock);
+
+   /*
+* We update earliest_activate *and* call mod_timer() because it's
+* possible that intel_psr_work() has already been called and is
+* waiting for psr.lock.  If that's the case, we don't want it
+* to immediately enable PSR.
+*
+* We also need to make sure that PSR is never activated earlier
+* than requested to avoid breaking intel_psr_enable()'s workaround
+* for pre-gen9 hardware.
+*/
+   next = jiffies + msecs_to_jiffies(min_wait_ms);
+   if (time_after(next, dev_priv->psr.earliest_activate)) {
+   dev_priv->psr.earliest_activate = next;
+   mod_timer(&dev_priv->psr.activate_timer, next);
+   }
+}
+
 static void hsw_psr_enable_source(struct intel_dp *intel_dp,
  const struct intel_crtc_state *crtc_state)
 {
@@ -544,8 +568,7 @@ void intel_psr_enable(struct intel_dp *intel_dp,
 * - On HSW/BDW we get a recoverable frozen screen until
 *   next exit-activate sequence.
 */
-   schedule_delayed_work(&dev_priv->psr.work,
- 
msecs_to_jiffies(intel_dp->panel_power_cycle_d

Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-05 Thread Andy Lutomirski
On Mon, Feb 5, 2018 at 9:17 PM, Pandiyan, Dhinakaran
<dhinakaran.pandi...@intel.com> wrote:
>
> On Mon, 2018-02-05 at 20:35 +0000, Andy Lutomirski wrote:
>> On Mon, Feb 5, 2018 at 6:53 PM, Pandiyan, Dhinakaran
>> <dhinakaran.pandi...@intel.com> wrote:
>> >
>> >
>> >
>> > On Sun, 2018-02-04 at 21:50 +0000, Andy Lutomirski wrote:
>> >> On Sat, Feb 3, 2018 at 5:08 PM, Andy Lutomirski <l...@kernel.org> wrote:
>> >> > On Sat, Feb 3, 2018 at 5:20 AM, Pandiyan, Dhinakaran
>> >> > <dhinakaran.pandi...@intel.com> wrote:
>> >> >>
>> >> >> On Fri, 2018-02-02 at 19:18 +0000, Andy Lutomirski wrote:
>> >> >>> I updated to 4.15, and the situation is much worse.  With
>> >> >>> enable_psr=1, the system survives for several seconds and then the
>> >> >>> screen stops updating entirely.  If I boot with i915.enable_psr=1, I
>> >> >>> get to the Fedora login screen and then the system dies.  If I set
>> >> >>> enable_psr=1 using sysfs, it dies a bit after the next resume.  It
>> >> >>> seems like it also sometimes hangs even worse a bit after the screen
>> >> >>> stops updating, but it's hard to tell.
>> >> >>
>> >> >> The login screen freeze sounds like what I have. Does this system have
>> >> >> DMC firmware? If yes, can you try this series
>> >> >> https://patchwork.freedesktop.org/series/37598/. You'll only need
>> >> >> patches 1,8,9 and 10.
>> >> >
>> >> > That fixes the hang.  Feel free to add:
>> >> >
>> >> > Tested-by: Andy Lutomirski <l...@kernel.org>
>> >> >
>> >> > to the i915 parts.  Also, any chance of getting it into the 4.15 stable 
>> >> > kernels?
>> >>
>> >> Correction: I'm still getting a second or two of complete screen
>> >> freezing every now and then.  The kernel says:
>> > Thanks a lot for testing. How do you trigger this freeze? Moving the
>> > cursor? Did you apply these patches on top of drm-tip or was it
>> > mainline?
>> >
>> > I also have another patch here that addresses screen freezes in console
>> > mode with PSR - https://patchwork.freedesktop.org/patch/201144/ in case
>> > that is what you are interested in.
>> >>
>> >> [69400.016524] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
>> >> update failure on pipe A (start=19 end=20) time 198 us, min 1073, max
>> >> 1079, scanline start 1068, end 1082
>> >>
>> >> So something might still be a bit buggy.
>> >
>> > This series fixes only the long freezes due to frame counter resets, I
>> > am sure there are still other issues with PSR.
>> >
>> > BTW does your patch on top of these patches help with the cursor lag?
>>
>> Maybe, but I'm not 100% sure.  I'm not currently seeing the lag with
>> or without the patch.  I also think my distro fixed the cursor in the
>> meantime so that it uses the HW cursor even after suspend/resume.
>>
>> A couple of questions, though:
>>
>> 1. Does moving the HW cursor cause the hardware to automatically turn off PSR?
>>
> That is correct.
>
>> 2. When something enables vblank interrupts (using drm_*_vblank_get(),
>> for example), are vblank interrupts generated even if PSR is on?
>
> Enabling vblank interrupts deactivates PSR (except on Braswell afaik)
>
>>   And
>> is the scanline, as returned by intel_get_crtc_scanline(), updated?
>
> I don't think so, I have not really checked but there are no frames
> generated, so the timing related registers will not get updated. This is
> the case with the frame counter register.
>

I bet that's the cause of some of the glitches I'm seeing.  I
instrumented intel_pipe_update_start() like this:

diff --git a/drivers/gpu/drm/i915/intel_sprite.c
b/drivers/gpu/drm/i915/intel_sprite.c
index 4a8a5d918a83..6ce0a35187fb 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -97,6 +97,7 @@ void intel_pipe_update_start(const struct
intel_crtc_state *new_crtc_state)
 bool need_vlv_dsi_wa = (IS_VALLEYVIEW(dev_priv) ||
IS_CHERRYVIEW(dev_priv)) &&
 intel_crtc_has_type(new_crtc_state, INTEL_OUTPUT_DSI);
 DEFINE_WAIT(wait);
+int first_scanline = -1;

 vblank_start = adjusted_mode->crtc_vblank_start;
 if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
@@ -131,9 +132,12 @@ void intel

Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-05 Thread Andy Lutomirski
On Mon, Feb 5, 2018 at 6:53 PM, Pandiyan, Dhinakaran
<dhinakaran.pandi...@intel.com> wrote:
>
>
>
> On Sun, 2018-02-04 at 21:50 +0000, Andy Lutomirski wrote:
>> On Sat, Feb 3, 2018 at 5:08 PM, Andy Lutomirski <l...@kernel.org> wrote:
>> > On Sat, Feb 3, 2018 at 5:20 AM, Pandiyan, Dhinakaran
>> > <dhinakaran.pandi...@intel.com> wrote:
>> >>
>> >> On Fri, 2018-02-02 at 19:18 +0000, Andy Lutomirski wrote:
>> >>> I updated to 4.15, and the situation is much worse.  With
>> >>> enable_psr=1, the system survives for several seconds and then the
>> >>> screen stops updating entirely.  If I boot with i915.enable_psr=1, I
>> >>> get to the Fedora login screen and then the system dies.  If I set
>> >>> enable_psr=1 using sysfs, it dies a bit after the next resume.  It
>> >>> seems like it also sometimes hangs even worse a bit after the screen
>> >>> stops updating, but it's hard to tell.
>> >>
>> >> The login screen freeze sounds like what I have. Does this system have
>> >> DMC firmware? If yes, can you try this series
>> >> https://patchwork.freedesktop.org/series/37598/. You'll only need
>> >> patches 1,8,9 and 10.
>> >
>> > That fixes the hang.  Feel free to add:
>> >
>> > Tested-by: Andy Lutomirski <l...@kernel.org>
>> >
>> > to the i915 parts.  Also, any chance of getting it into the 4.15 stable 
>> > kernels?
>>
>> Correction: I'm still getting a second or two of complete screen
>> freezing every now and then.  The kernel says:
> Thanks a lot for testing. How do you trigger this freeze? Moving the
> cursor? Did you apply these patches on top of drm-tip or was it
> mainline?
>
> I also have another patch here that addresses screen freezes in console
> mode with PSR - https://patchwork.freedesktop.org/patch/201144/ in case
> that is what you are interested in.
>>
>> [69400.016524] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
>> update failure on pipe A (start=19 end=20) time 198 us, min 1073, max
>> 1079, scanline start 1068, end 1082
>>
>> So something might still be a bit buggy.
>
> This series fixes only the long freezes due to frame counter resets, I
> am sure there are still other issues with PSR.
>
> BTW does your patch on top of these patches help with the cursor lag?

Maybe, but I'm not 100% sure.  I'm not currently seeing the lag with
or without the patch.  I also think my distro fixed the cursor in the
meantime so that it uses the HW cursor even after suspend/resume.

A couple of questions, though:

1. Does moving the HW cursor cause the hardware to automatically turn off PSR?

2. When something enables vblank interrupts (using drm_*_vblank_get(),
for example), are vblank interrupts generated even if PSR is on?  And
is the scanline, as returned by intel_get_crtc_scanline(), updated?


Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-04 Thread Andy Lutomirski
On Sat, Feb 3, 2018 at 5:08 PM, Andy Lutomirski <l...@kernel.org> wrote:
> On Sat, Feb 3, 2018 at 5:20 AM, Pandiyan, Dhinakaran
> <dhinakaran.pandi...@intel.com> wrote:
>>
>> On Fri, 2018-02-02 at 19:18 +0000, Andy Lutomirski wrote:
>>> I updated to 4.15, and the situation is much worse.  With
>>> enable_psr=1, the system survives for several seconds and then the
>>> screen stops updating entirely.  If I boot with i915.enable_psr=1, I
>>> get to the Fedora login screen and then the system dies.  If I set
>>> enable_psr=1 using sysfs, it dies a bit after the next resume.  It
>>> seems like it also sometimes hangs even worse a bit after the screen
>>> stops updating, but it's hard to tell.
>>
>> The login screen freeze sounds like what I have. Does this system have
>> DMC firmware? If yes, can you try this series
>> https://patchwork.freedesktop.org/series/37598/. You'll only need
>> patches 1,8,9 and 10.
>
> That fixes the hang.  Feel free to add:
>
> Tested-by: Andy Lutomirski <l...@kernel.org>
>
> to the i915 parts.  Also, any chance of getting it into the 4.15 stable 
> kernels?

Correction: I'm still getting a second or two of complete screen
freezing every now and then.  The kernel says:

[69400.016524] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
update failure on pipe A (start=19 end=20) time 198 us, min 1073, max
1079, scanline start 1068, end 1082

So something might still be a bit buggy.


Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-03 Thread Andy Lutomirski
On Fri, Feb 2, 2018 at 7:18 PM, Andy Lutomirski <l...@kernel.org> wrote:
> On Fri, Feb 2, 2018 at 1:24 AM, Andy Lutomirski <l...@kernel.org> wrote:
>> On Thu, Feb 1, 2018 at 9:20 PM, Chris Wilson <ch...@chris-wilson.co.uk> 
>> wrote:
>>> Quoting Andy Lutomirski (2018-02-01 21:04:30)
>>>> I got this after a recent suspend/resume:
>>>>
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Lid closed.
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: device-enumerator: scan all 
>>>> dirs
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>>> scanning /sys/bus
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>>> scanning /sys/class
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Failed to open
>>>> configuration file '/etc/systemd/sleep.conf': No such file or
>>>> directory
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Suspending...
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>>> sender=n/a destination=n/a object=/org/freedesktop/login1
>>>> interface=org.freedesktop.login1.Manager member=PrepareForSleep
>>>> cookie=570 reply
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Got message
>>>> type=method_call sender=:1.46 destination=:1.1
>>>> object=/org/freedesktop/login1/session/_32
>>>> interface=org.freedesktop.login1.Session member=ReleaseDevice
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>>> sender=n/a destination=:1.46
>>>> object=/org/freedesktop/login1/session/_32
>>>> interface=org.freedesktop.login1.Session member=PauseDevice cookie
>>>> Feb 01 09:44:34 laptop gnome-shell[2630]: Failed to apply DRM plane
>>>> transform 0: Permission denied
>>>> Feb 01 09:44:34 laptop gnome-shell[2630]: drmModeSetCursor2 failed
>>>> with (Permission denied), drawing cursor with OpenGL from now on
>>>>
>>>> But I don't see the word "cursor" in my system logs before the first
>>>> suspend.  What am I looking for?  This is Fedora 27 running a Gnome
>>>> Wayland session, but it hasn't been reinstalled in some time, so it's
>>>> possible that there are some weird settings sitting around.  But I did
>>>> check and I have no weird i915 parameters.
>>>
>>> You are using gnome-shell as the display server. From that it appears to
>>> have started off with a HW cursor and switched to a SW cursor after
>>> suspend. Did you notice a change in behaviour? After rebooting or just
>>> restarting gnome-shell?
>>
>> I think it's less consistently bad after a reboot before suspending.
>>
>>>
>>>> Also, are these things potentially related:
>>>>
>>>> [ 3067.702527] [drm:intel_pipe_update_start [i915]] *ERROR* Potential
>>>> atomic update failure on pipe A
>>>
>>> They are just "missed the immediate vblank for the screen update"
>>> messages. Should not be related to PSR, but may cause jitter by delaying
>>> the odd screen update.
>>
>> I just got this one, and the timestamp is at least reasonably close to
>> a giant latency spike:
>>
>> [  288.799654] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
>> update failure on pipe A (start=31 end=32) time 15 us, min 1073, max
>> 1079, scanline start 1087, end 1088
>>
>>>
>>>> As I'm typing this, I've seen a couple instances of what seems like a
>>>> full *second* of cursor latency, but I've only gotten the potential
>>>> atomic update failure once.
>>>>
>>>> And is there any straightforward tracing to do to distinguish between
>>>> PSR exit latency and other potential sources of latency?
>>>
>>> It looks plausible that we could at least report how long it takes the
>>> registers to reflect the change in state (but we don't). The best source
>>> of information atm is /sys/kernel/debug/dri/0/i915_edp_psr_status.
>>
>> Hmm.
>>
>> I went and looked at the code, and I noticed what could be bugs or
>> could (more likely) be my confusion since I don't know this code at
>> all:
>>
>> intel_single_frame_update() does something inscrutable to me, but I
>> imagine it does something that causes the next page flip to get
>> noticed by the panel even with PSR on.  But how does the code that
>> calls it know that anything happened?  (Looking at 

Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-03 Thread Andy Lutomirski
On Sat, Feb 3, 2018 at 5:20 AM, Pandiyan, Dhinakaran
<dhinakaran.pandi...@intel.com> wrote:
>
> On Fri, 2018-02-02 at 19:18 +, Andy Lutomirski wrote:
>> I updated to 4.15, and the situation is much worse.  With
>> enable_psr=1, the system survives for several seconds and then the
>> screen stops updating entirely.  If I boot with i915.enable_psr=1, I
>> get to the Fedora login screen and then the system dies.  If I set
>> enable_psr=1 using sysfs, it does a bit after the next resume.  It
>> seems like it also sometimes hangs even worse a bit after the screen
>> stops updating, but it's hard to tell.
>
> The login screen freeze sounds like what I have. Does this system have
> DMC firmware? If yes, can you try this series
> https://patchwork.freedesktop.org/series/37598/. You'll only need
> patches 1,8,9 and 10.

That fixes the hang.  Feel free to add:

Tested-by: Andy Lutomirski <l...@kernel.org>

to the i915 parts.  Also, any chance of getting it into the 4.15 stable kernels?

--Andy


Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-02 Thread Andy Lutomirski
On Fri, Feb 2, 2018 at 7:18 PM, Andy Lutomirski <l...@kernel.org> wrote:
> On Fri, Feb 2, 2018 at 1:24 AM, Andy Lutomirski <l...@kernel.org> wrote:
>> On Thu, Feb 1, 2018 at 9:20 PM, Chris Wilson <ch...@chris-wilson.co.uk> 
>> wrote:
>>> Quoting Andy Lutomirski (2018-02-01 21:04:30)
>>>> I got this after a recent suspend/resume:
>>>>
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Lid closed.
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: device-enumerator: scan all 
>>>> dirs
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>>> scanning /sys/bus
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>>> scanning /sys/class
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Failed to open
>>>> configuration file '/etc/systemd/sleep.conf': No such file or
>>>> directory
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Suspending...
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>>> sender=n/a destination=n/a object=/org/freedesktop/login1
>>>> interface=org.freedesktop.login1.Manager member=PrepareForSleep
>>>> cookie=570 reply
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Got message
>>>> type=method_call sender=:1.46 destination=:1.1
>>>> object=/org/freedesktop/login1/session/_32
>>>> interface=org.freedesktop.login1.Session member=ReleaseDevice
>>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>>> sender=n/a destination=:1.46
>>>> object=/org/freedesktop/login1/session/_32
>>>> interface=org.freedesktop.login1.Session member=PauseDevice cookie
>>>> Feb 01 09:44:34 laptop gnome-shell[2630]: Failed to apply DRM plane
>>>> transform 0: Permission denied
>>>> Feb 01 09:44:34 laptop gnome-shell[2630]: drmModeSetCursor2 failed
>>>> with (Permission denied), drawing cursor with OpenGL from now on
>>>>
>>>> But I don't see the word "cursor" in my system logs before the first
>>>> suspend.  What am I looking for?  This is Fedora 27 running a Gnome
>>>> Wayland session, but it hasn't been reinstalled in some time, so it's
>>>> possible that there are some weird settings sitting around.  But I did
>>>> check and I have no weird i915 parameters.
>>>
>>> You are using gnome-shell as the display server. From that it appears to
>>> have started off with a HW cursor and switched to a SW cursor after
>>> suspend. Did you notice a change in behaviour? After rebooting or just
>>> restarting gnome-shell?
>>
>> I think it's less consistently bad after a reboot before suspending.
>>
>>>
>>>> Also, are these things potentially related:
>>>>
>>>> [ 3067.702527] [drm:intel_pipe_update_start [i915]] *ERROR* Potential
>>>> atomic update failure on pipe A
>>>
>>> They are just "missed the immediate vblank for the screen update"
>>> messages. Should not be related to PSR, but may cause jitter by delaying
>>> the odd screen update.
>>
>> I just got this one, and the timestamp is at least reasonably close to
>> a giant latency spike:
>>
>> [  288.799654] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
>> update failure on pipe A (start=31 end=32) time 15 us, min 1073, max
>> 1079, scanline start 1087, end 1088
>>
>>>
>>>> As I'm typing this, I've seen a couple instances of what seems like a
>>>> full *second* of cursor latency, but I've only gotten the potential
>>>> atomic update failure once.
>>>>
>>>> And is there any straightforward tracing to do to distinguish between
>>>> PSR exit latency and other potential sources of latency?
>>>
>>> It looks plausible that we could at least report how long it takes the
>>> registers to reflect the change in state (but we don't). The best source
>>> of information atm is /sys/kernel/debug/dri/0/i915_edp_psr_status.
>>
>> Hmm.
>>
>> I went and looked at the code, and I noticed what could be bugs or
>> could (more likely) be my confusion since I don't know this code at
>> all:
>>
>> intel_single_frame_update() does something inscrutable to me, but I
>> imagine it does something that causes the next page flip to get
>> noticed by the panel even with PSR on.  But how does the code that
>> calls it know that anything happened?  (Looking at 

Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-02 Thread Andy Lutomirski
On Fri, Feb 2, 2018 at 1:24 AM, Andy Lutomirski <l...@kernel.org> wrote:
> On Thu, Feb 1, 2018 at 9:20 PM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
>> Quoting Andy Lutomirski (2018-02-01 21:04:30)
>>> I got this after a recent suspend/resume:
>>>
>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Lid closed.
>>> Feb 01 09:44:34 laptop systemd-logind[2412]: device-enumerator: scan all 
>>> dirs
>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>> scanning /sys/bus
>>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>>> scanning /sys/class
>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Failed to open
>>> configuration file '/etc/systemd/sleep.conf': No such file or
>>> directory
>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Suspending...
>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>> sender=n/a destination=n/a object=/org/freedesktop/login1
>>> interface=org.freedesktop.login1.Manager member=PrepareForSleep
>>> cookie=570 reply
>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Got message
>>> type=method_call sender=:1.46 destination=:1.1
>>> object=/org/freedesktop/login1/session/_32
>>> interface=org.freedesktop.login1.Session member=ReleaseDevice
>>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>>> sender=n/a destination=:1.46
>>> object=/org/freedesktop/login1/session/_32
>>> interface=org.freedesktop.login1.Session member=PauseDevice cookie
>>> Feb 01 09:44:34 laptop gnome-shell[2630]: Failed to apply DRM plane
>>> transform 0: Permission denied
>>> Feb 01 09:44:34 laptop gnome-shell[2630]: drmModeSetCursor2 failed
>>> with (Permission denied), drawing cursor with OpenGL from now on
>>>
>>> But I don't see the word "cursor" in my system logs before the first
>>> suspend.  What am I looking for?  This is Fedora 27 running a Gnome
>>> Wayland session, but it hasn't been reinstalled in some time, so it's
>>> possible that there are some weird settings sitting around.  But I did
>>> check and I have no weird i915 parameters.
>>
>> You are using gnome-shell as the display server. From that it appears to
>> have started off with a HW cursor and switched to a SW cursor after
>> suspend. Did you notice a change in behaviour? After rebooting or just
>> restarting gnome-shell?
>
> I think it's less consistently bad after a reboot before suspending.
>
>>
>>> Also, are these things potentially related:
>>>
>>> [ 3067.702527] [drm:intel_pipe_update_start [i915]] *ERROR* Potential
>>> atomic update failure on pipe A
>>
>> They are just "missed the immediate vblank for the screen update"
>> messages. Should not be related to PSR, but may cause jitter by delaying
>> the odd screen update.
>
> I just got this one, and the timestamp is at least reasonably close to
> a giant latency spike:
>
> [  288.799654] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
> update failure on pipe A (start=31 end=32) time 15 us, min 1073, max
> 1079, scanline start 1087, end 1088
>
>>
>>> As I'm typing this, I've seen a couple instances of what seems like a
>>> full *second* of cursor latency, but I've only gotten the potential
>>> atomic update failure once.
>>>
>>> And is there any straightforward tracing to do to distinguish between
>>> PSR exit latency and other potential sources of latency?
>>
>> It looks plausible that we could at least report how long it takes the
>> registers to reflect the change in state (but we don't). The best source
>> of information atm is /sys/kernel/debug/dri/0/i915_edp_psr_status.
>
> Hmm.
>
> I went and looked at the code, and I noticed what could be bugs or
> could (more likely) be my confusion since I don't know this code at
> all:
>
> intel_single_frame_update() does something inscrutable to me, but I
> imagine it does something that causes the next page flip to get
> noticed by the panel even with PSR on.  But how does the code that
> calls it know that anything happened?  (Looking at the commit history,
> maybe this is something special that's only needed on some platforms
> but doesn't replace the normal PSR exit sequence.)
>
> Perhaps more interestingly, intel_psr_flush() does this:
>
> /* By definition flush = invalidate + flush */
> if (frontbuffer_bits)
> intel_psr_exit(dev_priv);
>
> if (!dev_priv->psr.active && !dev_priv->psr.busy_f

Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-01 Thread Andy Lutomirski
On Thu, Feb 1, 2018 at 9:20 PM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> Quoting Andy Lutomirski (2018-02-01 21:04:30)
>> I got this after a recent suspend/resume:
>>
>> Feb 01 09:44:34 laptop systemd-logind[2412]: Lid closed.
>> Feb 01 09:44:34 laptop systemd-logind[2412]: device-enumerator: scan all dirs
>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>> scanning /sys/bus
>> Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
>> scanning /sys/class
>> Feb 01 09:44:34 laptop systemd-logind[2412]: Failed to open
>> configuration file '/etc/systemd/sleep.conf': No such file or
>> directory
>> Feb 01 09:44:34 laptop systemd-logind[2412]: Suspending...
>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>> sender=n/a destination=n/a object=/org/freedesktop/login1
>> interface=org.freedesktop.login1.Manager member=PrepareForSleep
>> cookie=570 reply
>> Feb 01 09:44:34 laptop systemd-logind[2412]: Got message
>> type=method_call sender=:1.46 destination=:1.1
>> object=/org/freedesktop/login1/session/_32
>> interface=org.freedesktop.login1.Session member=ReleaseDevice
>> Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
>> sender=n/a destination=:1.46
>> object=/org/freedesktop/login1/session/_32
>> interface=org.freedesktop.login1.Session member=PauseDevice cookie
>> Feb 01 09:44:34 laptop gnome-shell[2630]: Failed to apply DRM plane
>> transform 0: Permission denied
>> Feb 01 09:44:34 laptop gnome-shell[2630]: drmModeSetCursor2 failed
>> with (Permission denied), drawing cursor with OpenGL from now on
>>
>> But I don't see the word "cursor" in my system logs before the first
>> suspend.  What am I looking for?  This is Fedora 27 running a Gnome
>> Wayland session, but it hasn't been reinstalled in some time, so it's
>> possible that there are some weird settings sitting around.  But I did
>> check and I have no weird i915 parameters.
>
> You are using gnome-shell as the display server. From that it appears to
> have started off with a HW cursor and switched to a SW cursor after
> suspend. Did you notice a change in behaviour? After rebooting or just
> restarting gnome-shell?

I think it's less consistently bad after a reboot before suspending.

>
>> Also, are these things potentially related:
>>
>> [ 3067.702527] [drm:intel_pipe_update_start [i915]] *ERROR* Potential
>> atomic update failure on pipe A
>
> They are just "missed the immediate vblank for the screen update"
> messages. Should not be related to PSR, but may cause jitter by delaying
> the odd screen update.

I just got this one, and the timestamp is at least reasonably close to
a giant latency spike:

[  288.799654] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic
update failure on pipe A (start=31 end=32) time 15 us, min 1073, max
1079, scanline start 1087, end 1088

>
>> As I'm typing this, I've seen a couple instances of what seems like a
>> full *second* of cursor latency, but I've only gotten the potential
>> atomic update failure once.
>>
>> And is there any straightforward tracing to do to distinguish between
>> PSR exit latency and other potential sources of latency?
>
> It looks plausible that we could at least report how long it takes the
> registers to reflect the change in state (but we don't). The best source
> of information atm is /sys/kernel/debug/dri/0/i915_edp_psr_status.
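
A rough way to watch that debugfs file is to poll it and timestamp every change. The sketch below is only an illustration under stated assumptions: it treats the file contents as opaque text (no particular format is assumed), the function name watch and its parameters are hypothetical, polling from userspace is far too coarse to measure the actual PSR exit latency, and reading debugfs normally requires root.

```python
import time
from pathlib import Path

# Path as given in the message above.
PSR_STATUS = Path("/sys/kernel/debug/dri/0/i915_edp_psr_status")


def watch(path=PSR_STATUS, interval=0.01, duration=5.0):
    """Poll `path` and return a list of (monotonic_time, contents) changes.

    Hypothetical helper. If the file cannot be read at all, nothing is
    recorded and an empty list is returned.
    """
    changes = []
    last = None
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        try:
            current = path.read_text()
        except OSError:
            current = None
        if current != last:
            # Record when the contents were first seen to differ.
            changes.append((time.monotonic(), current))
            last = current
        time.sleep(interval)
    return changes


if __name__ == "__main__":
    for stamp, contents in watch():
        print(f"--- {stamp:.3f}")
        print(contents)
```

Moving the cursor while this runs and comparing the change timestamps against the perceived lag could give a first hint, though not a measurement, of whether the delay lines up with a PSR state change.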

Hmm.

I went and looked at the code, and I noticed what could be bugs or
could (more likely) be my confusion since I don't know this code at
all:

intel_single_frame_update() does something inscrutable to me, but I
imagine it does something that causes the next page flip to get
noticed by the panel even with PSR on.  But how does the code that
calls it know that anything happened?  (Looking at the commit history,
maybe this is something special that's only needed on some platforms
but doesn't replace the normal PSR exit sequence.)

Perhaps more interestingly, intel_psr_flush() does this:

    /* By definition flush = invalidate + flush */
    if (frontbuffer_bits)
        intel_psr_exit(dev_priv);

    if (!dev_priv->psr.active && !dev_priv->psr.busy_frontbuffer_bits)
        if (!work_busy(&dev_priv->psr.work.work))
            schedule_delayed_work(&dev_priv->psr.work,
                                  msecs_to_jiffies(100));

I'm guessing that the idea is that we're turning off PSR because we
want the panel to update and we expect that, in 100ms, the update will
have hit the panel and we'll have been idle long enough for it to make
sense to re-enter PSR.  IOW, the code wants PSR to be off for at least
100ms and then to turn back on.  But th

Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-01 Thread Andy Lutomirski
On Thu, Feb 1, 2018 at 9:53 AM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> Quoting Andy Lutomirski (2018-02-01 17:40:22)
>> *However*, I do see one unfortunate side effect of turning on PSR.  It
>> seems that, when I move my cursor a little bit after a few seconds of
>> doing nothing, there seems to be a little bit of lag, as if either a
>> few frames are dropped at the beginning of the motion or maybe the
>> entire motion is delayed a bit.  I don't notice a similar delay when
>> typing, so I'm wondering if maybe there's a minor driver bug in which
>> the driver doesn't kick the panel out of PSR quite as quickly when the
>> cursor is updated as it does when the framebuffer is updated.
>
> One thing that's important to know regarding the cursor is whether the
> display server is using a HW cursor or SW cursor. Could you please attach
> the log from the display server (or if you are using a stock
> distribution that's probably enough to work out what it is using)?
> -Chris

Looking at the logs, I see a few things.  First, I have a few of these:

Feb 01 09:24:24 laptop kernel: [drm:intel_pipe_update_start [i915]]
*ERROR* Potential atomic update failure on pipe A
Feb 01 09:24:48 laptop org.gnome.Shell.desktop[3261]: libinput error:
event15 - libinput error: DLL0704:01 06CB:76AE Touchpad: libinput
error: kernel bug: Touch jump detected and discarded.
Feb 01 09:24:48 laptop org.gnome.Shell.desktop[3261]: See
https://wayland.freedesktop.org/libinput/doc/1.9.3/touchpad_jumping_cursor.html
for details
Feb 01 09:24:50 laptop org.gnome.Shell.desktop[3261]: libinput error:
event15 - libinput error: DLL0704:01 06CB:76AE Touchpad: libinput
error: kernel bug: Touch jump detected and discarded.
Feb 01 09:24:50 laptop org.gnome.Shell.desktop[3261]: See
https://wayland.freedesktop.org/libinput/doc/1.9.3/touchpad_jumping_cursor.html
for details

(Hi, Peter!)

So it's entirely possible that what I'm seeing is actually an input
issue that's exacerbated by PSR for some bizarre reason.

I got this after a recent suspend/resume:

Feb 01 09:44:34 laptop systemd-logind[2412]: Lid closed.
Feb 01 09:44:34 laptop systemd-logind[2412]: device-enumerator: scan all dirs
Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
scanning /sys/bus
Feb 01 09:44:34 laptop systemd-logind[2412]:   device-enumerator:
scanning /sys/class
Feb 01 09:44:34 laptop systemd-logind[2412]: Failed to open
configuration file '/etc/systemd/sleep.conf': No such file or
directory
Feb 01 09:44:34 laptop systemd-logind[2412]: Suspending...
Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
sender=n/a destination=n/a object=/org/freedesktop/login1
interface=org.freedesktop.login1.Manager member=PrepareForSleep
cookie=570 reply
Feb 01 09:44:34 laptop systemd-logind[2412]: Got message
type=method_call sender=:1.46 destination=:1.1
object=/org/freedesktop/login1/session/_32
interface=org.freedesktop.login1.Session member=ReleaseDevice
Feb 01 09:44:34 laptop systemd-logind[2412]: Sent message type=signal
sender=n/a destination=:1.46
object=/org/freedesktop/login1/session/_32
interface=org.freedesktop.login1.Session member=PauseDevice cookie
Feb 01 09:44:34 laptop gnome-shell[2630]: Failed to apply DRM plane
transform 0: Permission denied
Feb 01 09:44:34 laptop gnome-shell[2630]: drmModeSetCursor2 failed
with (Permission denied), drawing cursor with OpenGL from now on

But I don't see the word "cursor" in my system logs before the first
suspend.  What am I looking for?  This is Fedora 27 running a Gnome
Wayland session, but it hasn't been reinstalled in some time, so it's
possible that there are some weird settings sitting around.  But I did
check and I have no weird i915 parameters.

Also, are these things potentially related:

[ 3067.702527] [drm:intel_pipe_update_start [i915]] *ERROR* Potential
atomic update failure on pipe A

As I'm typing this, I've seen a couple instances of what seems like a
full *second* of cursor latency, but I've only gotten the potential
atomic update failure once.

And is there any straightforward tracing to do to distinguish between
PSR exit latency and other potential sources of latency?


Re: [Intel-gfx] i915 PSR test results and cursor lag

2018-02-01 Thread Andy Lutomirski
On Thu, Feb 1, 2018 at 9:40 AM, Andy Lutomirski <l...@kernel.org> wrote:
> Hi-
>
> As requested in your blog post, I tested PSR.  I see something like
> 2.69W with PSR off and 2.17W with PSR on.  Screen blanking,
> suspend/resume, and the contents of the screen all seem okay.  This is
> a Dell XPS 13 9350, i.e.:
>
> System Information
> Manufacturer: Dell Inc.
> Product Name: XPS 13 9350
>
> EDID is attached.
>
> *However*, I do see one unfortunate side effect of turning on PSR.  It
> seems that, when I move my cursor a little bit after a few seconds of
> doing nothing, there seems to be a little bit of lag, as if either a
> few frames are dropped at the beginning of the motion or maybe the
> entire motion is delayed a bit.  I don't notice a similar delay when
> typing, so I'm wondering if maybe there's a minor driver bug in which
> the driver doesn't kick the panel out of PSR quite as quickly when the
> cursor is updated as it does when the framebuffer is updated.
>

I'm also getting occasional messages like:

[ 2675.574486] [drm:intel_pipe_update_start [i915]] *ERROR* Potential
atomic update failure on pipe A

with PSR on.  But there is nowhere near one of these messages per tiny
lag incident.


[Intel-gfx] i915 PSR test results and cursor lag

2018-02-01 Thread Andy Lutomirski
Hi-

As requested in your blog post, I tested PSR.  I see something like
2.69W with PSR off and 2.17W with PSR on.  Screen blanking,
suspend/resume, and the contents of the screen all seem okay.  This is
a Dell XPS 13 9350, i.e.:

System Information
Manufacturer: Dell Inc.
Product Name: XPS 13 9350

EDID is attached.

*However*, I do see one unfortunate side effect of turning on PSR.  It
seems that, when I move my cursor a little bit after a few seconds of
doing nothing, there seems to be a little bit of lag, as if either a
few frames are dropped at the beginning of the motion or maybe the
entire motion is delayed a bit.  I don't notice a similar delay when
typing, so I'm wondering if maybe there's a minor driver bug in which
the driver doesn't kick the panel out of PSR quite as quickly when the
cursor is updated as it does when the framebuffer is updated.

(A couple of lists are cc'd.)

BTW, switching PSR on and off using
/sys/module/i915/parameters/enable_psr seems to work fine, although it
seems like I may need to suspend/resume to get it to kick in.  But, if
there's really going to be a blacklist or whitelist of panels in
userspace, shouldn't there be an option in sysfs in
/sys/class/drm/card0-eDP-1/ or similar?
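
For reference, toggling the parameter can be scripted. This is a minimal Python sketch assuming the sysfs path quoted above; the helper names read_param and write_param are made up for illustration, writing needs root, and as noted a suspend/resume cycle may be needed before the change takes effect.

```python
from pathlib import Path

# Path taken from the message above.
ENABLE_PSR = Path("/sys/module/i915/parameters/enable_psr")


def read_param(path=ENABLE_PSR):
    """Return the parameter's current value as a string, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def write_param(value, path=ENABLE_PSR):
    """Write a new value to the parameter file (needs root on a real system)."""
    path.write_text(f"{value}\n")


if __name__ == "__main__":
    print("enable_psr =", read_param())
```

Both helpers take the path as an argument so they can be pointed at a scratch file for testing instead of the live parameter.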


--Andy


panel-edid
Description: Binary data


Re: [Intel-gfx] REGRESSION in c5552fde102f ("nvme: Enable autonomous power state transitions")

2018-01-24 Thread Andy Lutomirski
On Wed, Jan 24, 2018 at 5:35 AM, Ville Syrjälä
 wrote:
> On Wed, Jan 24, 2018 at 01:42:08PM +0200, Jani Nikula wrote:
>>
>> Hi Andy, all -
>>
>> So this is an odd one.
>>
>> I'm getting display FIFO underruns in a very specific setting: Laptop
>> display switched off, and an external display connected. Other
>> combinations work fine.
>>
>> I've bisected this to c5552fde102f ("nvme: Enable autonomous power state
>> transitions"), and, being baffled by the result, carefully checked
>> this. There are no problems when running c5552fde102f^, with
>> nvme_core.default_ps_max_latency_us=0, or after 'echo 0 >
>> pm_qos_latency_tolerance_us'. With the last one, restoring the original
>> value of 10 brings the underruns back.
>>
>> I have no idea what the root cause mechanism here is, but the bisect is
>> correct. Perhaps something to do with timing. I'd be happy to provide
>> further details.
>>
>> I see that you have quirked one Samsung device. Incidentally, this
>> Lenovo Yoga 910 (Kabylake, SunrisePoint LP PCH) also has a Samsung NVMe
>> device, just a different one. Details below. I don't know what the
>> failure mode in the quirked one is, so I don't know if this could be the
>> same issue.
>
> My first gut feeling would be that by allowing the nvme to go to sleep
> we're getting into some deeper power saving state, which then causes
> display underruns. How does the package c-state residency look
> before/after the commit?

I know approximately nothing about how package C-states work and what
exactly triggers ASPM low-power state entry, but I've seen reports
that APST is required to get ASPM L1 and that ASPM L1 is needed to get
to the deep PC states.  And deep PC states can surely trigger i915
issues...

--Andy


[Intel-gfx] Skylake underruns on 4.8-rc4

2016-08-29 Thread Andy Lutomirski
My Dell XPS 13 9350 laptop just got a buffer underrun:

[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A
FIFO underrun

I'm seeing this very occasionally, and they don't come in groups -- I
seem to get one underrun with a black flash and that's it.  This is
with just the laptop screen -- nothing at all is plugged in to the
USB-C port.

4.8-rc4 has the latest round of fixes applied, so
i915/skl_dmc_ver1_26.bin loaded successfully and the SAGV fix is
there.

I had the same problem on 4.8-rc3.  4.7 seemed okay.

I have:

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 520 (rev 07)

--Andy


[Intel-gfx] DP link training and performance issues with HDMI USB-C dongle and Skylake

2016-06-22 Thread Andy Lutomirski
I have a Dell XPS 13 9350 (Skylake) and a Dell DA200 adapter.  The
latter is a Thunderbolt device that includes an HDMI port and connects
over USB Type C.  I believe that it's internally using DP Alternate
Mode.

When I plug it in on 4.7-rc4, I get spew like this:

[   90.718106] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   91.077604] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   91.437059] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   91.796479] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   92.156101] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   92.515647] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   92.875184] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   93.234735] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   93.594294] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   93.953812] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   94.313390] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   94.673043] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   95.032890] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   95.393016] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   95.752879] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   96.113074] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   96.473068] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   96.833185] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   97.193233] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   97.553138] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   97.913526] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   98.273525] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   98.634178] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   98.993859] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   99.354484] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[   99.714669] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  100.077412] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  100.432684] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  100.792499] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  101.152378] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  101.512265] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  101.872466] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  102.232284] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  102.592251] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  103.111283] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  103.466511] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  103.826082] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  104.191906] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  104.547038] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  104.911264] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  105.270679] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  105.625774] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  105.986064] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  106.350045] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  106.705325] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  107.064897] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  107.431263] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  107.790793] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  108.146016] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  108.506093] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  108.865924] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting
[  109.225629] [drm:intel_dp_start_link_train [i915]] *ERROR* failed
to train DP, aborting

Re: [Intel-gfx] Possible 4.5 i915 Skylake regression

2016-03-13 Thread Andy Lutomirski
On Wed, Feb 17, 2016 at 8:18 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
> On Tue, Feb 16, 2016 at 09:26:35AM -0800, Andy Lutomirski wrote:
>> On Tue, Feb 16, 2016 at 9:12 AM, Andy Lutomirski <l...@amacapital.net> wrote:
>> > On Tue, Feb 16, 2016 at 8:12 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
>> >> On Mon, Feb 15, 2016 at 06:58:33AM -0800, Andy Lutomirski wrote:
>> >>> On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <l...@kernel.org> wrote:
>> >>> > Hi-
>> >>> >
>> >>> > On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
>> >>> > model), shortly after resume, I saw a single black flash on the
>> >>> > screen.  The log said:
>> >>> >
>> >>> > [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
>> >>> > CPU pipe A FIFO underrun
>> >>> >
>> >>> > I haven't seen this on 4.4.
>> >>> >
>> >>> > I'd be happy to dig up debugging info, but I don't know what would be
>> >>> > useful.  I have no i915 module options set.
>> >>>
>> >>> It's flashing quite frequently now, although I seem to get the
>> >>> underrun warning only once per resume.
>> >>
>> >> We shut up the warning irq source to avoid hijacking an entire cpu core
>> >> ;-)
>> >>
>> >> There's a fix from Matt right after 4.5-rc4 in Linus' branch. I'm hoping
>> >> that should help.
>> >
>> > Do you mean:
>> >
>> > commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
>> > Author: Matt Roper <matthew.d.ro...@intel.com>
>> > Date:   Mon Feb 8 11:05:28 2016 -0800
>> >
>> > drm/i915: Pretend cursor is always on for ILK-style WM calculations 
>> > (v2)
>> >
>> > If so, it didn't help.  I'm currently doing a full rebuild just in
>> > case I messed something up, though.
>> >
>>
>> Definitely not fixed.  It seems to be okay after a reboot until the
>> first suspend/resume.
>>
>> This happened after resuming.  Five cents says it's the root cause.
>
> That's interesting, but doesn't ring a bell unfortunately. Can you try to
> attempt a bisect?
>

I'm giving up on my attempt to bisect for now.  After a bunch of false
starts to avoid this crap, I'm stuck at
651174a4a0ccaf41e14fadc4bc525d61ae7f7b18, which is based on 4.3-rc3
and doesn't merge cleanly up to 4.4.  It's also annoying because it
reproduces reasonably quickly but not instantaneously, and I can never
reproduce it before a suspend/resume, so my bisection attempts are
full of errors.

--Andy

> Thanks, Daniel
>
>>
>> [  160.361200] WARNING: CPU: 2 PID: 2512 at
>> drivers/gpu/drm/i915/intel_uncore.c:599
>> hsw_unclaimed_reg_debug+0x69/0x90 [i915]()
>> [  160.361209] Unclaimed register detected before writing to register 0x20a8
>> [  160.361213] Modules linked in: rfcomm fuse ccm cmac xt_CHECKSUM
>> ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns
>> nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
>> xt_conntrack ebtable_filter ebtable_nat ebtable_broute bridge stp llc
>> ebtables ip6table_raw ip6table_mangle ip6table_security ip6table_nat
>> nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter
>> ip6_tables iptable_raw iptable_mangle iptable_security iptable_nat
>> nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bnep
>> arc4 iwlmvm mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek
>> hid_multitouch snd_hda_codec_generic iwlwifi snd_hda_intel intel_rapl
>> snd_hda_codec x86_pkg_temp_thermal coretemp kvm_intel snd_hwdep
>> cfg80211 snd_hda_core kvm snd_seq uvcvideo snd_seq_device
>> i2c_designware_platform
>> [  160.361385]  i2c_designware_core btusb snd_pcm videobuf2_vmalloc
>> wmi_mof vfat dell_wmi fat videobuf2_memops btrtl btbcm btintel
>> bluetooth dell_laptop dell_smbios dcdbas videobuf2_v4l2 snd_timer
>> videobuf2_core rtsx_pci_ms snd irqbypass videodev memstick
>> ghash_clmulni_intel joydev mei_me efi_pstore mei i2c_i801 soundcore
>> efivars pcspkr idma64 shpchp virt_dma media rfkill intel_lpss_pci
>> processor_thermal_device intel_soc_dts_iosf wmi acpi_als kfifo_buf
>> int3403_thermal tpm_tis industrialio pinctrl_sunrisepoint tpm
>> intel_hid int3400_thermal pinctrl_intel intel_lpss_acpi sparse_keymap
>> int340x_thermal_zone acpi_thermal_rel intel_lpss nfsd acpi_pad
>> auth_rpcgss nfs_acl lockd binfmt_misc grace sunrpc dm_crypt i915
>> i2c_algo_bit drm_kms

Re: [Intel-gfx] Possible 4.5 i915 Skylake regression

2016-03-11 Thread Andy Lutomirski
On Mon, Feb 22, 2016 at 7:13 PM, Andy Lutomirski <l...@amacapital.net> wrote:
> On Wed, Feb 17, 2016 at 5:36 PM, Andy Lutomirski <l...@amacapital.net> wrote:
>> On Wed, Feb 17, 2016 at 8:18 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
>>> On Tue, Feb 16, 2016 at 09:26:35AM -0800, Andy Lutomirski wrote:
>>>> On Tue, Feb 16, 2016 at 9:12 AM, Andy Lutomirski <l...@amacapital.net> 
>>>> wrote:
>>>> > On Tue, Feb 16, 2016 at 8:12 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
>>>> >> On Mon, Feb 15, 2016 at 06:58:33AM -0800, Andy Lutomirski wrote:
>>>> >>> On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <l...@kernel.org> 
>>>> >>> wrote:
>>>> >>> > Hi-
>>>> >>> >
>>>> >>> > On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
>>>> >>> > model), shortly after resume, I saw a single black flash on the
>>>> >>> > screen.  The log said:
>>>> >>> >
>>>> >>> > [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] 
>>>> >>> > *ERROR*
>>>> >>> > CPU pipe A FIFO underrun
>>>> >>> >
>>>> >>> > I haven't seen this on 4.4.
>>>> >>> >
>>>> >>> > I'd be happy to dig up debugging info, but I don't know what would be
>>>> >>> > useful.  I have no i915 module options set.
>>>> >>>
>>>> >>> It's flashing quite frequently now, although I seem to get the
>>>> >>> underrun warning only once per resume.
>>>> >>
>>>> >> We shut up the warning irq source to avoid hijacking an entire cpu core
>>>> >> ;-)
>>>> >>
>>>> >> There's a fix from Matt right after 4.5-rc4 in Linus' branch. I'm hoping
>>>> >> that should help.
>>>> >
>>>> > Do you mean:
>>>> >
>>>> > commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
>>>> > Author: Matt Roper <matthew.d.ro...@intel.com>
>>>> > Date:   Mon Feb 8 11:05:28 2016 -0800
>>>> >
>>>> > drm/i915: Pretend cursor is always on for ILK-style WM calculations 
>>>> > (v2)
>>>> >
>>>> > If so, it didn't help.  I'm currently doing a full rebuild just in
>>>> > case I messed something up, though.
>>>> >
>>>>
>>>> Definitely not fixed.  It seems to be okay after a reboot until the
>>>> first suspend/resume.
>>>>
>>>> This happened after resuming.  Five cents says it's the root cause.
>>>
>>> That's interesting, but doesn't ring a bell unfortunately. Can you try to
>>> attempt a bisect?
>>
>> I probably can, but it's very slow.  Is there a reasonably
>> straightforward way to instrument the watermark computation to see
>> what's going wrong?  I'm reasonably confident that the bug is in the
>> resume code or in something that only happens on resume, since I still
>> haven't seen underruns after rebooting before suspending.
>>
>
> With some instrumentation applied, I got this:
>
> [  369.471064] skl_update_wm(crtc-0): computed update
> [  369.471072] skl_update_other_pipe_wm(crtc-0): no change
> [  369.471075] skl_write_wm_values...
> [  369.471078]  CRTC crtc-0 pipe A
> [  369.471083]   wm_linetime = 121
> [  369.471086]   plane_wm level 0 plane 0 = 2147500036
> [  369.471090]   plane_wm level 0 plane 1 = 0
> [  369.471094]   plane_wm level 0 cursor = 2147500036
> [  369.471097]   plane_wm level 1 plane 0 = 2147516439
> [  369.471101]   plane_wm level 1 plane 1 = 0
> [  369.471104]   plane_wm level 1 cursor = 2147516439
> [  369.471108]   plane_wm level 2 plane 0 = 2147516448
> [  369.47]   plane_wm level 2 plane 1 = 0
> [  369.471115]   plane_wm level 2 cursor = 0
> [  369.471118]   plane_wm level 3 plane 0 = 2147532837
> [  369.471121]   plane_wm level 3 plane 1 = 0
> [  369.471125]   plane_wm level 3 cursor = 0
> [  369.471128]   plane_wm level 4 plane 0 = 2147565639
> [  369.471131]   plane_wm level 4 plane 1 = 0
> [  369.471135]   plane_wm level 4 cursor = 0
> [  369.471138]   plane_wm level 5 plane 0 = 2147582038
> [  369.471141]   plane_wm level 5 plane 1 = 0
> [  369.471145]   plane_wm level 5 cursor = 0
> [  369.471148]   plane_wm level 6 plane 0 = 2147582044
> [  369.471151]   plane_wm level 6 plane 1 = 0
> [  36

Re: [Intel-gfx] [PATCH v1 00/12] PCI: Rework shadow ROM handling

2016-03-11 Thread Andy Lutomirski
On Fri, Mar 11, 2016 at 3:29 PM, Bjorn Helgaas <helg...@kernel.org> wrote:
> On Fri, Mar 11, 2016 at 01:16:09PM -0800, Andy Lutomirski wrote:
>> On Tue, Mar 8, 2016 at 9:45 AM, Bjorn Helgaas <helg...@kernel.org> wrote:
>> > On Thu, Mar 03, 2016 at 10:53:50AM -0600, Bjorn Helgaas wrote:
>> >> The purpose of this series is to:
>> >>
>> >>   - Fix the "BAR 6: [??? 0x flags 0x2] has bogus alignment"
>> >> messages reported by Linus [1], Andy [2], and others.
>> >>
>> >>   - Move arch-specific shadow ROM location knowledge, e.g.,
>> >> 0xC-0xD, from PCI core to arch code.
>> >>
>> >>   - Fix the ia64 and MIPS Loongson 3 oddity of keeping virtual
>> >> addresses in shadow ROM struct resource (resources should always
>> >> contain *physical* addresses).
>> >>
>> >>   - Remove now-unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
>> >> flags.
>> >>
>> >> This series is based on v4.5-rc1, and it's available on my
>> >> pci/resource git branch (along with a couple tiny unrelated patches)
>> >> at [3].
>> >>
>> >> Bjorn
>> >>
>> >>
>> >> [1] 
>> >> http://lkml.kernel.org/r/ca+55afyvmftbb0oz_yx8+eqoejnzgtcsysj9quhepdz9bhd...@mail.gmail.com
>> >> [2] 
>> >> http://lkml.kernel.org/r/calcetrv+rwnpzxyl8uvnsragu-6cczd_cc9pfjt2nctjplz...@mail.gmail.com
>> >> [3] 
>> >> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/resource
>> >>
>> >>
>> >> ---
>> >>
>> >> Bjorn Helgaas (12):
>> >>   PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
>> >>   PCI: Don't assign or reassign immutable resources
>> >>   PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
>> >>   PCI: Set ROM shadow location in arch code, not in PCI core
>> >>   PCI: Clean up pci_map_rom() whitespace
>> >>   ia64/PCI: Use temporary struct resource * to avoid repetition
>> >>   ia64/PCI: Use ioremap() instead of open-coded equivalent
>> >>   ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM 
>> >> resource
>> >>   MIPS: Loongson 3: Use temporary struct resource * to avoid 
>> >> repetition
>> >>   MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in 
>> >> shadow ROM resource
>> >>   PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
>> >>   PCI: Simplify sysfs ROM cleanup
>> >>
>> >>
>> >>  arch/ia64/pci/fixup.c  |   21 +++--
>> >>  arch/ia64/sn/kernel/io_acpi_init.c |   22 ++
>> >>  arch/ia64/sn/kernel/io_init.c  |   51 --
>> >>  arch/mips/pci/fixup-loongson3.c|   19 +---
>> >>  arch/x86/pci/fixup.c   |   21 +++--
>> >>  drivers/pci/pci-sysfs.c|   13 +-
>> >>  drivers/pci/remove.c   |1
>> >>  drivers/pci/rom.c  |   83 
>> >> +++-
>> >>  drivers/pci/setup-res.c|6 +++
>> >>  include/linux/ioport.h |4 --
>> >>  10 files changed, 111 insertions(+), 130 deletions(-)
>> >
>> > I applied this series to pci/resource for v4.6.
>>
>> This gets rid of all the warnings for me until I try to read my i915
>> device's rom using sysfs.  Then I get:
>>
>> i915 :00:02.0: Invalid PCI ROM header signature: expecting 0xaa55,
>> got 0x
>>
>> So I suspect that something is still subtly wrong -- I'd imagine that
>> this should either work or the initialization code should detect that
>> there is no usable ROM and not expose it.
>>
>> (To be clear, there's no regression here.)
>
> Hmmm.  Thanks for testing this.  As you say, I think this is the way
> it's always been, but it does seem non-intuitive.
>
> That "Invalid PCI ROM header signature" warning comes from
> pci_get_rom_size().  We don't call that at enumeration-time; we only
> call it later when somebody tries to read the ROM via sysfs:
>
>   pci_bus_add_device
> pci_fixup_device(pci_fixup_final)
>   pci_fixup_video # final fixup
> res->flags = MEM | SHADOW | PCI_FIXED
> pci_create_sysfs_dev_files
>   if

Re: [Intel-gfx] [PATCH v1 00/12] PCI: Rework shadow ROM handling

2016-03-11 Thread Andy Lutomirski
On Tue, Mar 8, 2016 at 9:45 AM, Bjorn Helgaas  wrote:
> On Thu, Mar 03, 2016 at 10:53:50AM -0600, Bjorn Helgaas wrote:
>> The purpose of this series is to:
>>
>>   - Fix the "BAR 6: [??? 0x flags 0x2] has bogus alignment"
>> messages reported by Linus [1], Andy [2], and others.
>>
>>   - Move arch-specific shadow ROM location knowledge, e.g.,
>> 0xC-0xD, from PCI core to arch code.
>>
>>   - Fix the ia64 and MIPS Loongson 3 oddity of keeping virtual
>> addresses in shadow ROM struct resource (resources should always
>> contain *physical* addresses).
>>
>>   - Remove now-unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
>> flags.
>>
>> This series is based on v4.5-rc1, and it's available on my
>> pci/resource git branch (along with a couple tiny unrelated patches)
>> at [3].
>>
>> Bjorn
>>
>>
>> [1] 
>> http://lkml.kernel.org/r/ca+55afyvmftbb0oz_yx8+eqoejnzgtcsysj9quhepdz9bhd...@mail.gmail.com
>> [2] 
>> http://lkml.kernel.org/r/calcetrv+rwnpzxyl8uvnsragu-6cczd_cc9pfjt2nctjplz...@mail.gmail.com
>> [3] 
>> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/resource
>>
>>
>> ---
>>
>> Bjorn Helgaas (12):
>>   PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
>>   PCI: Don't assign or reassign immutable resources
>>   PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
>>   PCI: Set ROM shadow location in arch code, not in PCI core
>>   PCI: Clean up pci_map_rom() whitespace
>>   ia64/PCI: Use temporary struct resource * to avoid repetition
>>   ia64/PCI: Use ioremap() instead of open-coded equivalent
>>   ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM 
>> resource
>>   MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
>>   MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow 
>> ROM resource
>>   PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
>>   PCI: Simplify sysfs ROM cleanup
>>
>>
>>  arch/ia64/pci/fixup.c  |   21 +++--
>>  arch/ia64/sn/kernel/io_acpi_init.c |   22 ++
>>  arch/ia64/sn/kernel/io_init.c  |   51 --
>>  arch/mips/pci/fixup-loongson3.c|   19 +---
>>  arch/x86/pci/fixup.c   |   21 +++--
>>  drivers/pci/pci-sysfs.c|   13 +-
>>  drivers/pci/remove.c   |1
>>  drivers/pci/rom.c  |   83 
>> +++-
>>  drivers/pci/setup-res.c|6 +++
>>  include/linux/ioport.h |4 --
>>  10 files changed, 111 insertions(+), 130 deletions(-)
>
> I applied this series to pci/resource for v4.6.

This gets rid of all the warnings for me until I try to read my i915
device's rom using sysfs.  Then I get:

i915 :00:02.0: Invalid PCI ROM header signature: expecting 0xaa55,
got 0x

So I suspect that something is still subtly wrong -- I'd imagine that
this should either work or the initialization code should detect that
there is no usable ROM and not expose it.

(To be clear, there's no regression here.)


Re: [Intel-gfx] i915 Skylake: "Invalid ROM contents"

2016-02-29 Thread Andy Lutomirski
On Sun, Jan 10, 2016 at 11:12 AM, Andy Lutomirski <l...@amacapital.net> wrote:
> On Sun, Jan 10, 2016 at 10:41 AM, Andy Lutomirski <l...@amacapital.net> wrote:
>> On Wed, Nov 18, 2015 at 8:12 AM, Daniel Stone <dan...@fooishbar.org> wrote:
>>> Hi,
>>>
>>> On 18 November 2015 at 15:59, Andy Lutomirski <l...@amacapital.net> wrote:
>>>> On Wed, Nov 18, 2015 at 2:59 AM, Ville Syrjälä
>>>> <ville.syrj...@linux.intel.com> wrote:
>>>>> On Tue, Nov 17, 2015 at 11:43:25AM -0800, Andy Lutomirski wrote:
>>>>>> Typing:
>>>>>>
>>>>>> # cat /sys/devices/pci:00/:00:02.0/rom
>>>>>>
>>>>>> Provokes:
>>>>>>
>>>>>> i915 :00:02.0: Invalid ROM contents
>>>>>
>>>>> Hmm. So there's no PCI option ROM there. I wonder what is there. I
>>>>> get the same on my Braswell BTW. I tried to look through the UEFI
>>>>> spec a bit, and it seems to say that even for non-legacy option ROMs
>>>>> the 0x55aa signature should be there.
>>>>>
>>>>> But this being the GPU means we may be using the shadow ROM stuff,
>>>>> which IIRC assumes that the shadow is at 0xc000. I'm not sure that
>>>>> holds anymore with UEFI, and maybe we should be using some UEFI
>>>>> trick instead to find out where it actually lives?
>>>>>
>>>>> BTW what does 'lspci -vv -s 00:02.0' say on your machine?
>>>>>
>>>>
>>>> 00:02.0 VGA compatible controller: Intel Corporation Sky Lake
>>>> Integrated Graphics (rev 07) (prog-if 00 [VGA controller])
>>>> DeviceName:  Onboard IGD
>>>> Subsystem: Dell Device 0704
>>>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>>>> SERR-
>>>> Latency: 0
>>>> Interrupt: pin A routed to IRQ 128
>>>> Region 0: Memory at db00 (64-bit, non-prefetchable) [size=16M]
>>>> Region 2: Memory at 9000 (64-bit, prefetchable) [size=256M]
>>>> Region 4: I/O ports at f000 [size=64]
>>>> Expansion ROM at  [disabled]
>>>
>>> UEFI has an option to enable option ROMs, which is disabled by
>>> default; I wonder if having it disabled prevents all access to the
>>> ROM.
>>>
>>> Mind you, it doesn't seem to be fatal; I've not had any issues with
>>> the same machine that I can pin down to lack of ROM.
>>>
>>
>> FWIW, my logs also get spammed with:
>>
>> [  127.101881] i915 :00:02.0: BAR 6: [??? 0x flags 0x2]
>> has bogus alignment
>>
>> I suspect that the PCI core is just failing to recognize that the ROM
>> is disabled.
>>
>
> A bit more info:
>
> I think I only get this error when suspending for the second time
> after boot.  No clue why.
>
> I instrumented the code a bit.  At the time of that error, res->flags
> == 0x2.  It's probably not a coincidence that:
>
> #define IORESOURCE_ROM_SHADOW (1<<1) /* ROM is copy at C000:0 */
>
> Should pci_fixup_video check that the resource exists in the first
> place before setting flags on it?

*ping*

Hi, PCI people.

--Andy
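
For reference, the check suggested in the quoted message can be sketched in
plain C. This is a userspace illustration only, not the kernel's
pci_fixup_video. The helper names are hypothetical, and the
IORESOURCE_TYPE_BITS and IORESOURCE_MEM values are assumed to match
include/linux/ioport.h of that era. The idea is that a flags word of 0x2 has
the shadow bit set but no type bits, which would be consistent with the
"[??? ...]" in the BAR 6 message.

```c
#include <stdbool.h>

/* Assumed to match include/linux/ioport.h; verify before relying on them. */
#define IORESOURCE_TYPE_BITS  0x00001f00UL /* resource type field */
#define IORESOURCE_MEM        0x00000200UL
#define IORESOURCE_ROM_SHADOW (1UL << 1)   /* ROM is copy at C000:0 */

/* Hypothetical helper: does this flags word describe a typed resource? */
static bool resource_has_type(unsigned long flags)
{
	return (flags & IORESOURCE_TYPE_BITS) != 0;
}

/* Hypothetical helper: only mark the ROM as shadowed if it already exists. */
static unsigned long mark_rom_shadow_if_present(unsigned long flags)
{
	if (!resource_has_type(flags))
		return flags;	/* leave an empty resource untouched */
	return flags | IORESOURCE_ROM_SHADOW;
}
```

With these definitions, resource_has_type(0x2) is false, so a resource in the
state observed above would be left alone rather than flagged.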


Re: [Intel-gfx] Possible 4.5 i915 Skylake regression

2016-02-22 Thread Andy Lutomirski
On Wed, Feb 17, 2016 at 5:36 PM, Andy Lutomirski <l...@amacapital.net> wrote:
> On Wed, Feb 17, 2016 at 8:18 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
>> On Tue, Feb 16, 2016 at 09:26:35AM -0800, Andy Lutomirski wrote:
>>> On Tue, Feb 16, 2016 at 9:12 AM, Andy Lutomirski <l...@amacapital.net> 
>>> wrote:
>>> > On Tue, Feb 16, 2016 at 8:12 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
>>> >> On Mon, Feb 15, 2016 at 06:58:33AM -0800, Andy Lutomirski wrote:
>>> >>> On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <l...@kernel.org> 
>>> >>> wrote:
>>> >>> > Hi-
>>> >>> >
>>> >>> > On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
>>> >>> > model), shortly after resume, I saw a single black flash on the
>>> >>> > screen.  The log said:
>>> >>> >
>>> >>> > [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
>>> >>> > CPU pipe A FIFO underrun
>>> >>> >
>>> >>> > I haven't seen this on 4.4.
>>> >>> >
>>> >>> > I'd be happy to dig up debugging info, but I don't know what would be
>>> >>> > useful.  I have no i915 module options set.
>>> >>>
>>> >>> It's flashing quite frequently now, although I seem to get the
>>> >>> underrun warning only once per resume.
>>> >>
>>> >> We shut up the warning irq source to avoid hijacking an entire cpu core
>>> >> ;-)
>>> >>
>>> >> There's a fix from Matt right after 4.5-rc4 in Linus' branch. I'm hoping
>>> >> that should help.
>>> >
>>> > Do you mean:
>>> >
>>> > commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
>>> > Author: Matt Roper <matthew.d.ro...@intel.com>
>>> > Date:   Mon Feb 8 11:05:28 2016 -0800
>>> >
>>> > drm/i915: Pretend cursor is always on for ILK-style WM calculations 
>>> > (v2)
>>> >
>>> > If so, it didn't help.  I'm currently doing a full rebuild just in
>>> > case I messed something up, though.
>>> >
>>>
>>> Definitely not fixed.  It seems to be okay after a reboot until the
>>> first suspend/resume.
>>>
>>> This happened after resuming.  Five cents says it's the root cause.
>>
>> That's interesting, but doesn't ring a bell unfortunately. Can you try to
>> attempt a bisect?
>
> I probably can, but it's very slow.  Is there a reasonably
> straightforward way to instrument the watermark computation to see
> what's going wrong?  I'm reasonably confident that the bug is in the
> resume code or in something that only happens on resume, since I still
> haven't seen underruns after rebooting before suspending.
>

With some instrumentation applied, I got this:

[  369.471064] skl_update_wm(crtc-0): computed update
[  369.471072] skl_update_other_pipe_wm(crtc-0): no change
[  369.471075] skl_write_wm_values...
[  369.471078]  CRTC crtc-0 pipe A
[  369.471083]   wm_linetime = 121
[  369.471086]   plane_wm level 0 plane 0 = 2147500036
[  369.471090]   plane_wm level 0 plane 1 = 0
[  369.471094]   plane_wm level 0 cursor = 2147500036
[  369.471097]   plane_wm level 1 plane 0 = 2147516439
[  369.471101]   plane_wm level 1 plane 1 = 0
[  369.471104]   plane_wm level 1 cursor = 2147516439
[  369.471108]   plane_wm level 2 plane 0 = 2147516448
[  369.47]   plane_wm level 2 plane 1 = 0
[  369.471115]   plane_wm level 2 cursor = 0
[  369.471118]   plane_wm level 3 plane 0 = 2147532837
[  369.471121]   plane_wm level 3 plane 1 = 0
[  369.471125]   plane_wm level 3 cursor = 0
[  369.471128]   plane_wm level 4 plane 0 = 2147565639
[  369.471131]   plane_wm level 4 plane 1 = 0
[  369.471135]   plane_wm level 4 cursor = 0
[  369.471138]   plane_wm level 5 plane 0 = 2147582038
[  369.471141]   plane_wm level 5 plane 1 = 0
[  369.471145]   plane_wm level 5 cursor = 0
[  369.471148]   plane_wm level 6 plane 0 = 2147582044
[  369.471151]   plane_wm level 6 plane 1 = 0
[  369.471155]   plane_wm level 6 cursor = 0
[  369.471158]   plane_wm level 7 plane 0 = 2147598443
[  369.471161]   plane_wm level 7 plane 1 = 0
[  369.471164]   plane_wm level 7 cursor = 0
[  369.471168]   wm_trans plane 0 = 0
[  369.471171]   wm_trans plane 1 = 0
[  369.471174]   wm_trans cursor = 0
[  369.471182]  CRTC crtc-1 pipe B
[  369.471184]   clean
[  369.471186]  CRTC crtc-2 pipe C
[  369.471189]   clean
[  369.471226] skl_update_wm(crtc-0): no update
[  372.068755] [drm:intel_cpu_fifo_underrun_irq_h
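
The plane_wm values in the dump above are printed as raw decimal register
words, which makes them hard to read. A minimal sketch of how they might be
split into fields follows. It assumes the Skylake PLANE_WM layout is an enable
bit at bit 31, a line count in bits 18:14, and a block count in bits 9:0; that
layout and all names below are assumptions to be checked against i915_reg.h,
not something stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PLANE_WM field layout; check i915_reg.h before relying on it. */
#define WM_EN          (1u << 31)
#define WM_LINES_SHIFT 14
#define WM_LINES_MASK  0x1fu
#define WM_BLOCKS_MASK 0x3ffu

/* Hypothetical container for the decoded fields. */
struct wm_fields {
	bool enabled;
	unsigned int lines;
	unsigned int blocks;
};

/* Split one raw watermark word into its (assumed) fields. */
static struct wm_fields decode_plane_wm(uint32_t raw)
{
	struct wm_fields f;

	f.enabled = (raw & WM_EN) != 0;
	f.lines = (raw >> WM_LINES_SHIFT) & WM_LINES_MASK;
	f.blocks = raw & WM_BLOCKS_MASK;
	return f;
}
```

Under that assumed layout, 2147500036 (0x80004004) decodes to enabled, 1 line,
4 blocks, and 2147598443 decodes to enabled, 7 lines, 107 blocks; a value of 0
means the level is disabled.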

Re: [Intel-gfx] Possible 4.5 i915 Skylake regression

2016-02-17 Thread Andy Lutomirski
On Wed, Feb 17, 2016 at 8:18 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
> On Tue, Feb 16, 2016 at 09:26:35AM -0800, Andy Lutomirski wrote:
>> On Tue, Feb 16, 2016 at 9:12 AM, Andy Lutomirski <l...@amacapital.net> wrote:
>> > On Tue, Feb 16, 2016 at 8:12 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
>> >> On Mon, Feb 15, 2016 at 06:58:33AM -0800, Andy Lutomirski wrote:
>> >>> On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <l...@kernel.org> wrote:
>> >>> > Hi-
>> >>> >
>> >>> > On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
>> >>> > model), shortly after resume, I saw a single black flash on the
>> >>> > screen.  The log said:
>> >>> >
>> >>> > [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
>> >>> > CPU pipe A FIFO underrun
>> >>> >
>> >>> > I haven't seen this on 4.4.
>> >>> >
>> >>> > I'd be happy to dig up debugging info, but I don't know what would be
>> >>> > useful.  I have no i915 module options set.
>> >>>
>> >>> It's flashing quite frequently now, although I seem to get the
>> >>> underrun warning only once per resume.
>> >>
>> >> We shut up the warning irq source to avoid hijacking an entire cpu core
>> >> ;-)
>> >>
>> >> There's a fix from Matt right after 4.5-rc4 in Linus' branch. I'm hoping
>> >> that should help.
>> >
>> > Do you mean:
>> >
>> > commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
>> > Author: Matt Roper <matthew.d.ro...@intel.com>
>> > Date:   Mon Feb 8 11:05:28 2016 -0800
>> >
>> > drm/i915: Pretend cursor is always on for ILK-style WM calculations 
>> > (v2)
>> >
>> > If so, it didn't help.  I'm currently doing a full rebuild just in
>> > case I messed something up, though.
>> >
>>
>> Definitely not fixed.  It seems to be okay after a reboot until the
>> first suspend/resume.
>>
>> This happened after resuming.  Five cents says it's the root cause.
>
> That's interesting, but doesn't ring a bell unfortunately. Can you try to
> attempt a bisect?

I probably can, but it's very slow.  Is there a reasonably
straightforward way to instrument the watermark computation to see
what's going wrong?  I'm reasonably confident that the bug is in the
resume code or in something that only happens on resume, since I still
haven't seen underruns after rebooting before suspending.

--Andy


Re: [Intel-gfx] Possible 4.5 i915 Skylake regression

2016-02-16 Thread Andy Lutomirski
On Tue, Feb 16, 2016 at 9:12 AM, Andy Lutomirski <l...@amacapital.net> wrote:
> On Tue, Feb 16, 2016 at 8:12 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
>> On Mon, Feb 15, 2016 at 06:58:33AM -0800, Andy Lutomirski wrote:
>>> On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <l...@kernel.org> wrote:
>>> > Hi-
>>> >
>>> > On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
>>> > model), shortly after resume, I saw a single black flash on the
>>> > screen.  The log said:
>>> >
>>> > [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
>>> > CPU pipe A FIFO underrun
>>> >
>>> > I haven't seen this on 4.4.
>>> >
>>> > I'd be happy to dig up debugging info, but I don't know what would be
>>> > useful.  I have no i915 module options set.
>>>
>>> It's flashing quite frequently now, although I seem to get the
>>> underrun warning only once per resume.
>>
>> We shut up the warning irq source to avoid hijacking an entire cpu core
>> ;-)
>>
>> There's a fix from Matt right after 4.5-rc4 in Linus' branch. I'm hoping
>> that should help.
>
> Do you mean:
>
> commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
> Author: Matt Roper <matthew.d.ro...@intel.com>
> Date:   Mon Feb 8 11:05:28 2016 -0800
>
> drm/i915: Pretend cursor is always on for ILK-style WM calculations (v2)
>
> If so, it didn't help.  I'm currently doing a full rebuild just in
> case I messed something up, though.
>

Definitely not fixed.  It seems to be okay after a reboot until the
first suspend/resume.

This happened after resuming.  Five cents says it's the root cause.

[  160.361200] WARNING: CPU: 2 PID: 2512 at
drivers/gpu/drm/i915/intel_uncore.c:599
hsw_unclaimed_reg_debug+0x69/0x90 [i915]()
[  160.361209] Unclaimed register detected before writing to register 0x20a8
[  160.361213] Modules linked in: rfcomm fuse ccm cmac xt_CHECKSUM
ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns
nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
xt_conntrack ebtable_filter ebtable_nat ebtable_broute bridge stp llc
ebtables ip6table_raw ip6table_mangle ip6table_security ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter
ip6_tables iptable_raw iptable_mangle iptable_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bnep
arc4 iwlmvm mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek
hid_multitouch snd_hda_codec_generic iwlwifi snd_hda_intel intel_rapl
snd_hda_codec x86_pkg_temp_thermal coretemp kvm_intel snd_hwdep
cfg80211 snd_hda_core kvm snd_seq uvcvideo snd_seq_device
i2c_designware_platform
[  160.361385]  i2c_designware_core btusb snd_pcm videobuf2_vmalloc
wmi_mof vfat dell_wmi fat videobuf2_memops btrtl btbcm btintel
bluetooth dell_laptop dell_smbios dcdbas videobuf2_v4l2 snd_timer
videobuf2_core rtsx_pci_ms snd irqbypass videodev memstick
ghash_clmulni_intel joydev mei_me efi_pstore mei i2c_i801 soundcore
efivars pcspkr idma64 shpchp virt_dma media rfkill intel_lpss_pci
processor_thermal_device intel_soc_dts_iosf wmi acpi_als kfifo_buf
int3403_thermal tpm_tis industrialio pinctrl_sunrisepoint tpm
intel_hid int3400_thermal pinctrl_intel intel_lpss_acpi sparse_keymap
int340x_thermal_zone acpi_thermal_rel intel_lpss nfsd acpi_pad
auth_rpcgss nfs_acl lockd binfmt_misc grace sunrpc dm_crypt i915
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops drm rtsx_pci_sdmmc
[  160.361548]  mmc_core crct10dif_pclmul crc32_pclmul crc32c_intel
rtsx_pci serio_raw i2c_hid video
[  160.361575] CPU: 2 PID: 2512 Comm: gnome-shell Not tainted
4.5.0-rc4-acpi+ #59
[  160.361581] Hardware name: Dell Inc. XPS 13 9350/07TYC2, BIOS 1.1.9
12/18/2015
[  160.361588]  0086 604232f7 88024d55ba60
81449d83
[  160.361601]  88024d55baa8 a01e15e8 88024d55ba98
81094252
[  160.361612]  88026f4d 20a8 88026f4d
fefe
[  160.361624] Call Trace:
[  160.361644]  [] dump_stack+0x65/0x92
[  160.361660]  [] warn_slowpath_common+0x82/0xc0
[  160.361671]  [] warn_slowpath_fmt+0x5c/0x80
[  160.361764]  [] hsw_unclaimed_reg_debug+0x69/0x90 [i915]
[  160.361844]  [] gen9_write32+0x6e/0x390 [i915]
[  160.361855]  [] ? preempt_count_add+0x85/0xd0
[  160.361939]  [] gen8_logical_ring_get_irq+0x95/0xe0 [i915]
[  160.362017]  [] __i915_wait_request+0x58b/0x650 [i915]
[  160.362028]  [] ? wake_atomic_t_function+0x70/0x70
[  160.362113]  []
i915_gem_object_wait_rendering__nonblocking+0x16e/0x2c0 [i915]
[  160.362200]  [] ? i915_gem_pwrite_ioctl+0xe4/0x9b0 [i915]
[  160.362211]  [] ? preempt_count_add+0x85/0xd0
[  160.362225]  [] ? _raw_write_unlock+0x16/0x30
[  160.362312]  [] i915_gem_set_domain_

Re: [Intel-gfx] Possible 4.5 i915 Skylake regression

2016-02-16 Thread Andy Lutomirski
On Tue, Feb 16, 2016 at 8:12 AM, Daniel Vetter <dan...@ffwll.ch> wrote:
> On Mon, Feb 15, 2016 at 06:58:33AM -0800, Andy Lutomirski wrote:
>> On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <l...@kernel.org> wrote:
>> > Hi-
>> >
>> > On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
>> > model), shortly after resume, I saw a single black flash on the
>> > screen.  The log said:
>> >
>> > [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
>> > CPU pipe A FIFO underrun
>> >
>> > I haven't seen this on 4.4.
>> >
>> > I'd be happy to dig up debugging info, but I don't know what would be
>> > useful.  I have no i915 module options set.
>>
>> It's flashing quite frequently now, although I seem to get the
>> underrun warning only once per resume.
>
> We shut up the warning irq source to avoid hijacking an entire cpu core
> ;-)
>
> There's a fix from Matt right after 4.5-rc4 in Linus' branch. I'm hoping
> that should help.

Do you mean:

commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
Author: Matt Roper <matthew.d.ro...@intel.com>
Date:   Mon Feb 8 11:05:28 2016 -0800

drm/i915: Pretend cursor is always on for ILK-style WM calculations (v2)

If so, it didn't help.  I'm currently doing a full rebuild just in
case I messed something up, though.

--Andy
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Possible 4.5 i915 Skylake regression

2016-02-16 Thread Andy Lutomirski
On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <l...@kernel.org> wrote:
> Hi-
>
> On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
> model), shortly after resume, I saw a single black flash on the
> screen.  The log said:
>
> [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR*
> CPU pipe A FIFO underrun
>
> I haven't seen this on 4.4.
>
> I'd be happy to dig up debugging info, but I don't know what would be
> useful.  I have no i915 module options set.

It's flashing quite frequently now, although I seem to get the
underrun warning only once per resume.

intel_reg_dumper says:

   DCC: 0x (g/dri/0/i915_forcewake_user)
 CHDECMISC: 0x (none, ch2 enh disabled,
ch1 enh disabled, ch0 enh disabled, flex disabled, ep not present)
C0DRB0: 0x (0x)
C0DRB1: 0x (0x)
C0DRB2: 0x (0x)
C0DRB3: 0x (0x)
C1DRB0: 0x (0x)
C1DRB1: 0x (0x)
C1DRB2: 0x (0x)
C1DRB3: 0x (0x)
   C0DRA01: 0x (0x)
   C0DRA23: 0x (0x)
   C1DRA01: 0x (0x)
   C1DRA23: 0x (0x)
PGETBL_CTL: 0x
 VCLK_DIVISOR_VGA0: 0x0080 (n = 0, m1 = 0, m2 = 0)
 VCLK_DIVISOR_VGA1: 0x0080 (n = 0, m1 = 0, m2 = 0)
 VCLK_POST_DIV: 0x0080 (vga0 p1 = 2, p2 = 2, vga1
p1 = 2, p2 = 2)
 DPLL_TEST: 0x (, DPLLA input buffer
disabled, DPLLB input buffer disabled)
  CACHE_MODE_0: 0x
   D_STATE: 0x
 DSPCLK_GATE_D: 0x0080 (clock gates disabled: TVRUNIT)
RENCLK_GATE_D1: 0x0080
RENCLK_GATE_D2: 0x
 SDVOB: 0x (disabled, pipe A, stall
disabled, not detected)
 SDVOC: 0x (disabled, pipe A, stall
disabled, not detected)
   SDVOUDI: 0x
DSPARB: 0x
FW_BLC: 0x
   FW_BLC2: 0x
   FW_BLC_SELF: 0x
DSPFW1: 0x
DSPFW2: 0x
DSPFW3: 0x
  ADPA: 0x (disabled, transcoder A,
-hsync, -vsync)
  LVDS: 0x (disabled, pipe A, 18 bit, 1 channel)
  DVOA: 0x (disabled, pipe A, no
stall, -hsync, -vsync)
  DVOB: 0x (disabled, pipe A, no
stall, -hsync, -vsync)
  DVOC: 0x (disabled, pipe A, no
stall, -hsync, -vsync)
   DVOA_SRCDIM: 0x
   DVOB_SRCDIM: 0x
   DVOC_SRCDIM: 0x
   BLC_PWM_CTL: 0x
  BLC_PWM_CTL2: 0x
PP_CONTROL: 0x (power target: off)
 PP_STATUS: 0x (off, not ready, sequencing idle)
  PP_ON_DELAYS: 0x
 PP_OFF_DELAYS: 0x
PP_DIVISOR: 0x
  PFIT_CONTROL: 0x
   PFIT_PGM_RATIOS: 0x
   PORT_HOTPLUG_EN: 0x
 PORT_HOTPLUG_STAT: 0x
  DSPACNTR: 0xc4802400 (enabled)
DSPASTRIDE: 0x000f (15 bytes)
   DSPAPOS: 0x (0, 0)
  DSPASIZE: 0x0437077f (1920, 1080)
  DSPABASE: 0x
  DSPASURF: 0x0330
   DSPATILEOFF: 0x
 PIPEACONF: 0x (disabled, inactive, pf-pd,
rotate 0, 8bpc)
  PIPEASRC: 0x077f0437 (1920, 1080)
 PIPEASTAT: 0x (status:)
 PIPEA_GMCH_DATA_M: 0x
 PIPEA_GMCH_DATA_N: 0x
   PIPEA_DP_LINK_M: 0x
   PIPEA_DP_LINK_N: 0x
 CURSOR_A_BASE: 0x
  CURSOR_A_CONTROL: 0x
 CURSOR_A_POSITION: 0x
  FPA0: 0x0080 (n = 0, m1 = 0, m2 = 0)
  FPA1: 0x0080 (n = 0, m1 = 0, m2 = 0)
DPLL_A: 0x0080 (disabled, non-dvo, VGA,
default clock, unknown mode, p1 = 8, p2 = 0)
 DPLL_A_MD: 0x
  HTOTAL_A: 0x (1 active, 1 total)
  HBLANK_A: 0x (1 start, 1 end)
 

Re: [Intel-gfx] i915 Skylake: "Invalid ROM contents"

2016-01-10 Thread Andy Lutomirski
On Sun, Jan 10, 2016 at 10:41 AM, Andy Lutomirski <l...@amacapital.net> wrote:
> On Wed, Nov 18, 2015 at 8:12 AM, Daniel Stone <dan...@fooishbar.org> wrote:
>> Hi,
>>
>> On 18 November 2015 at 15:59, Andy Lutomirski <l...@amacapital.net> wrote:
>>> On Wed, Nov 18, 2015 at 2:59 AM, Ville Syrjälä
>>> <ville.syrj...@linux.intel.com> wrote:
>>>> On Tue, Nov 17, 2015 at 11:43:25AM -0800, Andy Lutomirski wrote:
>>>>> Typing:
>>>>>
>>>>> # cat /sys/devices/pci:00/:00:02.0/rom
>>>>>
>>>>> Provokes:
>>>>>
>>>>> i915 :00:02.0: Invalid ROM contents
>>>>
>>>> Hmm. So there's no PCI option ROM there. I wonder what is there. I
>>>> get the same on my Braswell BTW. I tried to look through the UEFI
>>>> spec a bit, and it seems to say that even for non-legacy option ROMs
>>>> the 0x55aa signature should be there.
>>>>
>>>> But this being the GPU means we may be using the shadow ROM stuff,
>>>> which IIRC assumes that the shadow is at 0xc000. I'm not sure that
>>>> holds anymore with UEFI, and maybe we should be using some UEFI
>>>> trick instead to find out where it actually lives?
>>>>
>>>> BTW what does 'lspci -vv -s 00:02.0' say on your machine?
>>>>
>>>
>>> 00:02.0 VGA compatible controller: Intel Corporation Sky Lake
>>> Integrated Graphics (rev 07) (prog-if 00 [VGA controller])
>>> DeviceName:  Onboard IGD
>>> Subsystem: Dell Device 0704
>>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>>> SERR-
>>> Latency: 0
>>> Interrupt: pin A routed to IRQ 128
>>> Region 0: Memory at db00 (64-bit, non-prefetchable) [size=16M]
>>> Region 2: Memory at 9000 (64-bit, prefetchable) [size=256M]
>>> Region 4: I/O ports at f000 [size=64]
>>> Expansion ROM at  [disabled]
>>
>> UEFI has an option to enable option ROMs, which is disabled by
>> default; I wonder if having it disabled prevents all access to the
>> ROM.
>>
>> Mind you, it doesn't seem to be fatal; I've not had any issues with
>> the same machine that I can pin down to lack of ROM.
>>
>
> FWIW, my logs also get spammed with:
>
> [  127.101881] i915 :00:02.0: BAR 6: [??? 0x flags 0x2]
> has bogus alignment
>
> I suspect that the PCI core is just failing to recognize that the ROM
> is disabled.
>

A bit more info:

I think I only get this error when suspending for the second time
after boot.  No clue why.

I instrumented the code a bit.  At the time of that error, res->flags
== 0x2.  It's probably not a coincidence that:

#define IORESOURCE_ROM_SHADOW	(1<<1)	/* ROM is copy at C000:0 */

Should pci_fixup_video check that the resource exists in the first
place before setting flags on it?
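
The guard asked about above could look roughly like the following. This is a minimal userspace sketch, not kernel code: struct res_model, res_exists() and mark_rom_shadow() are hypothetical names, and the "exists" criterion (some flag already set, or a nonzero range) is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define IORESOURCE_ROM_SHADOW (1u << 1) /* ROM is copy at C000:0 */

/* Hypothetical stand-in for the kernel's struct resource. */
struct res_model {
    uint64_t start;
    uint64_t end;
    unsigned long flags;
};

/* Assumed criterion: the resource exists if any flag is already set
 * or if it spans a nonzero range. */
static bool res_exists(const struct res_model *r)
{
    return r->flags != 0 || r->end > r->start;
}

/* Set the shadow flag only on a resource that exists.
 * Returns true if the flag was set. */
static bool mark_rom_shadow(struct res_model *r)
{
    if (!res_exists(r))
        return false;
    r->flags |= IORESOURCE_ROM_SHADOW;
    return true;
}
```

With a guard of this kind, an empty ROM resource would keep flags == 0 instead of ending up with flags == 0x2 as observed.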

--Andy


Re: [Intel-gfx] i915 Skylake: "Invalid ROM contents"

2016-01-10 Thread Andy Lutomirski
On Wed, Nov 18, 2015 at 8:12 AM, Daniel Stone <dan...@fooishbar.org> wrote:
> Hi,
>
> On 18 November 2015 at 15:59, Andy Lutomirski <l...@amacapital.net> wrote:
>> On Wed, Nov 18, 2015 at 2:59 AM, Ville Syrjälä
>> <ville.syrj...@linux.intel.com> wrote:
>>> On Tue, Nov 17, 2015 at 11:43:25AM -0800, Andy Lutomirski wrote:
>>>> Typing:
>>>>
>>>> # cat /sys/devices/pci:00/:00:02.0/rom
>>>>
>>>> Provokes:
>>>>
>>>> i915 :00:02.0: Invalid ROM contents
>>>
>>> Hmm. So there's no PCI option ROM there. I wonder what is there. I
>>> get the same on my Braswell BTW. I tried to look through the UEFI
>>> spec a bit, and it seems to say that even for non-legacy option ROMs
>>> the 0x55aa signature should be there.
>>>
>>> But this being the GPU means we may be using the shadow ROM stuff,
>>> which IIRC assumes that the shadow is at 0xc000. I'm not sure that
>>> holds anymore with UEFI, and maybe we should be using some UEFI
>>> trick instead to find out where it actually lives?
>>>
>>> BTW what does 'lspci -vv -s 00:02.0' say on your machine?
>>>
>>
>> 00:02.0 VGA compatible controller: Intel Corporation Sky Lake
>> Integrated Graphics (rev 07) (prog-if 00 [VGA controller])
>> DeviceName:  Onboard IGD
>> Subsystem: Dell Device 0704
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> SERR-
>> Latency: 0
>> Interrupt: pin A routed to IRQ 128
>> Region 0: Memory at db00 (64-bit, non-prefetchable) [size=16M]
>> Region 2: Memory at 9000 (64-bit, prefetchable) [size=256M]
>> Region 4: I/O ports at f000 [size=64]
>> Expansion ROM at  [disabled]
>
> UEFI has an option to enable option ROMs, which is disabled by
> default; I wonder if having it disabled prevents all access to the
> ROM.
>
> Mind you, it doesn't seem to be fatal; I've not had any issues with
> the same machine that I can pin down to lack of ROM.
>

FWIW, my logs also get spammed with:

[  127.101881] i915 :00:02.0: BAR 6: [??? 0x flags 0x2]
has bogus alignment

I suspect that the PCI core is just failing to recognize that the ROM
is disabled.

--Andy


[Intel-gfx] i915 Skylake crash on 4.4-rc3

2015-12-07 Thread Andy Lutomirski
[53834.386369] traps: gnome-session-b[2308] general protection
ip:7f10efc1fc2b sp:7ffdfde31880 error:0 in
libc-2.22.so[7f10efba1000+1b7000]
[53834.687584] [ cut here ]
[53834.687607] WARNING: CPU: 0 PID: 23730 at
drivers/gpu/drm/i915/i915_gem_context.c:144
i915_gem_context_free+0x196/0x1c0 [i915]()
[53834.687609] WARN_ON(!list_empty(>base.active_list))
[53834.687610] Modules linked in:
[53834.687612]  wmi_mof dell_wmi wmi(E) rfcomm fuse ccm cmac
xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun
nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter
ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_broute bridge stp llc
ebtable_nat ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw
ip6table_security ip6table_filter ip6_tables iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_mangle iptable_raw iptable_security arc4 bnep iwlmvm mac80211
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic
snd_hda_intel iwlwifi snd_hda_codec intel_rapl x86_pkg_temp_thermal
snd_hwdep coretemp snd_hda_core kvm_intel btusb snd_seq btrtl kvm
uvcvideo btbcm cfg80211 btintel bluetooth snd_seq_device
[53834.687656]  videobuf2_vmalloc snd_pcm videobuf2_memops
videobuf2_v4l2 hid_multitouch videobuf2_core sparse_keymap v4l2_common
videodev i2c_designware_platform i2c_designware_core vfat snd_timer
fat irqbypass dell_laptop snd dcdbas efi_pstore pcspkr joydev efivars
media rtsx_pci_ms rfkill soundcore i2c_i801 memstick
pinctrl_sunrisepoint pinctrl_intel intel_lpss_acpi int3400_thermal
int3403_thermal acpi_thermal_rel acpi_pad mei_me tpm_tis mei tpm
idma64 shpchp virt_dma acpi_als kfifo_buf processor_thermal_device
nfsd industrialio intel_soc_dts_iosf iosf_mbi intel_lpss_pci
int340x_thermal_zone intel_lpss auth_rpcgss nfs_acl lockd grace sunrpc
dm_crypt i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
sysimgblt fb_sys_fops drm rtsx_pci_sdmmc mmc_core crct10dif_pclmul
crc32_pclmul crc32c_intel
[53834.687700]  serio_raw rtsx_pci i2c_hid video [last unloaded: wmi]
[53834.687706] CPU: 0 PID: 23730 Comm: kworker/u8:4 Tainted: G
W   E   4.4.0-rc3+ #19
[53834.687708] Hardware name: Dell Inc. XPS 13 9350/07TYC2, BIOS 1.0.4
10/19/2015
[53834.687726] Workqueue: i915 i915_gem_retire_work_handler [i915]
[53834.687728]   c62a1d15 88017ffb7c70
8142510c
[53834.687731]  88017ffb7cb8 88017ffb7ca8 81092122
8802adfd1240
[53834.687734]  8802845dd800 88028468 88017ffb7d70
8802adfd12b8
[53834.687738] Call Trace:
[53834.687743]  [] dump_stack+0x4e/0x82
[53834.687747]  [] warn_slowpath_common+0x82/0xc0
[53834.687750]  [] warn_slowpath_fmt+0x5c/0x80
[53834.687764]  [] i915_gem_context_free+0x196/0x1c0 [i915]
[53834.68]  [] i915_gem_request_free+0x9f/0xb0 [i915]
[53834.687792]  []
intel_execlists_retire_requests+0x138/0x190 [i915]
[53834.687806]  [] i915_gem_retire_requests+0xd1/0xe0 [i915]
[53834.687827]  []
i915_gem_retire_work_handler+0x58/0x70 [i915]
[53834.687831]  [] process_one_work+0x152/0x400
[53834.687834]  [] worker_thread+0x4b/0x440
[53834.687837]  [] ? process_one_work+0x400/0x400
[53834.687839]  [] ? process_one_work+0x400/0x400
[53834.687842]  [] kthread+0xd8/0xf0
[53834.687845]  [] ? kthread_worker_fn+0x150/0x150
[53834.687849]  [] ret_from_fork+0x3f/0x70
[53834.687851]  [] ? kthread_worker_fn+0x150/0x150
[53834.690419] ---[ end trace 1802637761c0942d ]---

I think this happened when I started emacs.

My user session crashed, but the system is still usable after logging back in.

--Andy


Re: [Intel-gfx] i915 Skylake: "Invalid ROM contents"

2015-11-18 Thread Andy Lutomirski
[adding linux-pci]

On Wed, Nov 18, 2015 at 2:59 AM, Ville Syrjälä
<ville.syrj...@linux.intel.com> wrote:
> On Tue, Nov 17, 2015 at 11:43:25AM -0800, Andy Lutomirski wrote:
>> Typing:
>>
>> # cat /sys/devices/pci:00/:00:02.0/rom
>>
>> Provokes:
>>
>> i915 :00:02.0: Invalid ROM contents
>
> Hmm. So there's no PCI option ROM there. I wonder what is there. I
> get the same on my Braswell BTW. I tried to look through the UEFI
> spec a bit, and it seems to say that even for non-legacy option ROMs
> the 0x55aa signature should be there.
>
> But this being the GPU means we may be using the shadow ROM stuff,
> which IIRC assumes that the shadow is at 0xc000. I'm not sure that
> holds anymore with UEFI, and maybe we should be using some UEFI
> trick instead to find out where it actually lives?
>
> BTW what does 'lspci -vv -s 00:02.0' say on your machine?
>

00:02.0 VGA compatible controller: Intel Corporation Sky Lake
Integrated Graphics (rev 07) (prog-if 00 [VGA controller])
DeviceName:  Onboard IGD
Subsystem: Dell Device 0704
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
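SERR-
Latency: 0
Interrupt: pin A routed to IRQ 128
Region 0: Memory at db00 (64-bit, non-prefetchable) [size=16M]
Region 2: Memory at 9000 (64-bit, prefetchable) [size=256M]
Region 4: I/O ports at f000 [size=64]
Expansion ROM at  [disabled]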
Capabilities: [40] Vendor Specific Information: Len=0c 
Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap:MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl:Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta:CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-,
OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-,
OBFF Disabled
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee00018  Data: 
Capabilities: [d0] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [100 v1] #1b
Capabilities: [200 v1] Address Translation Service (ATS)
ATSCap:Invalidate Queue Depth: 00
ATSCtl:Enable-, Smallest Translation Unit: 00
Capabilities: [300 v1] #13
Kernel driver in use: i915
Kernel modules: i915

--Andy

>>
>> This is on a Dell XPS 13 9350 (Skylake).  This is 4.3.0 plus some
>> wireless-next bits.
>>
>> --Andy
>>
>> --
>> Andy Lutomirski
>> AMA Capital Management, LLC
>> ___
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Ville Syrjälä
> Intel OTC



-- 
Andy Lutomirski
AMA Capital Management, LLC


[Intel-gfx] i915 Skylake: "Invalid ROM contents"

2015-11-17 Thread Andy Lutomirski
Typing:

# cat /sys/devices/pci:00/:00:02.0/rom

Provokes:

i915 :00:02.0: Invalid ROM contents

This is on a Dell XPS 13 9350 (Skylake).  This is 4.3.0 plus some
wireless-next bits.
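
For context, PCI expansion ROM images are expected to begin with the signature bytes 0x55 0xAA, so a read that returns anything else would plausibly be reported as invalid. A minimal sketch of such a check on a plain byte buffer follows; rom_has_signature() is a hypothetical helper for illustration, not a kernel function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* PCI expansion ROM images start with the bytes 0x55, 0xAA. */
#define ROM_SIG_BYTE0 0x55
#define ROM_SIG_BYTE1 0xAA

/* Hypothetical helper: true if buf begins with the ROM signature. */
static bool rom_has_signature(const uint8_t *buf, size_t len)
{
    if (buf == NULL || len < 2)
        return false;
    return buf[0] == ROM_SIG_BYTE0 && buf[1] == ROM_SIG_BYTE1;
}
```

A buffer read from the sysfs rom file could be passed to this helper to see whether the first two bytes look like a ROM header at all.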

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: [Intel-gfx] Regression: audit: x86: drop arch from __audit_syscall_entry() interface

2014-10-22 Thread Andy Lutomirski
On 10/22/2014 11:23 AM, Eric Paris wrote:
> That's really serious.  Looking now.
>
> On Wed, 2014-10-22 at 16:08 -0200, Paulo Zanoni wrote:
>> Hi
>>
>> (Cc'ing everybody mentioned in the original patch)
>>
>> I work for Intel, on our Linux Graphics driver - aka i915.ko - and our
>> QA team recently reported a regression on:
>>
>> commit b4f0d3755c5e9cc86292d5fd78261903b4f23d4a
>> Author: Richard Guy Briggs
>> Date:   Tue Mar 4 10:38:06 2014 -0500
>> audit: x86: drop arch from __audit_syscall_entry() interface
>>
>> According to our QA, their i386 machine doesn't boot anymore. I tried
>> to write my own revert for the patch, asked QA to test, and they
>> confirmed it solves the problem.
>>
>> Here are the details of QA's bug report:
>> https://bugs.freedesktop.org/show_bug.cgi?id=85277 .
>>
>> The trees our QA tests are the development trees from i915.ko:
>> http://cgit.freedesktop.org/drm-intel?h=drm-intel-fixes .
>>
>> I tried searching for other bug reports on the same patch, but
>> couldn't find any. Forgive me if this bug was already reported.
>>
>> Feel free to continue this discussion on the bugzilla report if you want.

This piece:

    movl %esi,4(%esp)   /* 5th arg: 4th syscall arg */
    movl %edx,(%esp)    /* 4th arg: 3rd syscall arg */

looks like it's overwriting syscall arguments.

This is clearly fixable, but an even better fix would be to drop the asm
entirely and switch to two-phase tracing.  Want to do it?  I can test
the seccomp bits if you switch over the asm :)

--Andy


Re: [Intel-gfx] Regression: audit: x86: drop arch from __audit_syscall_entry() interface

2014-10-22 Thread Andy Lutomirski
On Wed, Oct 22, 2014 at 12:16 PM, Richard Guy Briggs <r...@redhat.com> wrote:
> On 14/10/22, Andy Lutomirski wrote:
>> On 10/22/2014 11:23 AM, Eric Paris wrote:
>> > That's really serious.  Looking now.
>> >
>> > On Wed, 2014-10-22 at 16:08 -0200, Paulo Zanoni wrote:
>> >> Hi
>> >>
>> >> (Cc'ing everybody mentioned in the original patch)
>> >>
>> >> I work for Intel, on our Linux Graphics driver - aka i915.ko - and our
>> >> QA team recently reported a regression on:
>> >>
>> >> commit b4f0d3755c5e9cc86292d5fd78261903b4f23d4a
>> >> Author: Richard Guy Briggs
>> >> Date:   Tue Mar 4 10:38:06 2014 -0500
>> >> audit: x86: drop arch from __audit_syscall_entry() interface
>> >>
>> >> According to our QA, their i386 machine doesn't boot anymore. I tried
>> >> to write my own revert for the patch, asked QA to test, and they
>> >> confirmed it solves the problem.
>> >>
>> >> Here are the details of QA's bug report:
>> >> https://bugs.freedesktop.org/show_bug.cgi?id=85277 .
>> >>
>> >> The trees our QA tests are the development trees from i915.ko:
>> >> http://cgit.freedesktop.org/drm-intel?h=drm-intel-fixes .
>> >>
>> >> I tried searching for other bug reports on the same patch, but
>> >> couldn't find any. Forgive me if this bug was already reported.
>> >>
>> >> Feel free to continue this discussion on the bugzilla report if you want.
>>
>> This piece:
>>
>>     movl %esi,4(%esp)   /* 5th arg: 4th syscall arg */
>>     movl %edx,(%esp)    /* 4th arg: 3rd syscall arg */
>>
>> looks like it's overwriting syscall arguments.
>>
>> This is clearly fixable, but an even better fix would be to drop the asm
>> entirely and switch to two-phase tracing.  Want to do it?  I can test
>> the seccomp bits if you switch over the asm :)
>
> Like what you did for x86_64.  That sounds worth investigating.
>
> I'll have a look at the asm, but I'm being distracted by a gunman loose
> 2km from me and my wife and kids under lockdown in two different
> locations on the other side of the shooting site.  Had to cancel lunch
> today with two work colleagues 1/2km away from that site.  ...not been a
> productive day.


That's putting it mildly.  Stay safe.

--Andy


Re: [Intel-gfx] [PATCH 1/2] drm/i915: Use Write-Through cacheing for the display plane on Iris

2013-08-02 Thread Andy Lutomirski
On 08/01/2013 10:39 AM, Chris Wilson wrote:
> Haswell GT3e has the unique feature of supporting Write-Through cacheing
> of objects within the eLLC/LLC. The purpose of this is to enable the display
> plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
> that we, in theory, get the best of both worlds, perfect display and fast
> access.
>
> However, we still need to be careful as the CPU does not see the WT when
> accessing the cache. In particular, this means that we need to flush the
> cache lines after writing to an object through the CPU, and on
> transitioning from a cached state to WT.
>

I'm planning on adding ioremap_wt, etc sometime soon (for an unrelated
reason).  Would this be useful here?

If so, do you need it for real RAM (i.e. pages that the kernel considers
to be direct-mappable RAM) or just for MMIO space?

--Andy


Re: [Intel-gfx] [PATCH 1/2] drm/i915: Use Write-Through cacheing for the display plane on Iris

2013-08-02 Thread Andy Lutomirski
On Fri, Aug 2, 2013 at 12:21 PM, Ben Widawsky <b...@bwidawsk.net> wrote:
> On Fri, Aug 02, 2013 at 11:45:22AM -0700, Andy Lutomirski wrote:
>> On 08/01/2013 10:39 AM, Chris Wilson wrote:
>> > Haswell GT3e has the unique feature of supporting Write-Through cacheing
>> > of objects within the eLLC/LLC. The purpose of this is to enable the display
>> > plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
>> > that we, in theory, get the best of both worlds, perfect display and fast
>> > access.
>> >
>> > However, we still need to be careful as the CPU does not see the WT when
>> > accessing the cache. In particular, this means that we need to flush the
>> > cache lines after writing to an object through the CPU, and on
>> > transitioning from a cached state to WT.
>> >
>>
>> I'm planning on adding ioremap_wt, etc sometime soon (for an unrelated
>> reason).  Would this be useful here?
>
> I don't think so. We should never be ioremapping the buffers with these
> mappings.
>
>>
>> If so, do you need it for real RAM (i.e. pages that the kernel considers
>> to be direct-mappable RAM) or just for MMIO space?
>>
>> --Andy
>
> It is for real RAM, but again, not something we should ever be
> ioremapping.

I asked the question poorly.  The change I'm planning to make would be
to add WT mappings on a roughly equal footing with WC.  You could use
mmap protection bits (assuming there's a free bit in there),
set_memory_wt, or whatever the appropriate API is here.

(I know approximately nothing about what's going on in i915 -- I just
noticed someone else mentioning WT mappings.  For my use, I can get
away with skipping this on real ram, although it would be nice.  The
tricky part about using real ram is finding some space in struct
page.)

--Andy


Re: [Intel-gfx] [PULL] drm-intel-next

2011-08-10 Thread Andy Lutomirski

On 08/03/2011 11:14 PM, Keith Packard wrote:


> Here's a pile of fixes on top of the stuff already in drm-core-next.
>
>   * Pile of mode setting fixes which eliminate a selection of bugs and
>     other annoyances. Eliminates the 'stripey' effect when going from
>     two to one monitor, makes hot-plug work after suspend/resume, turns
>     off the pipe/plane in DPMS off.


Can you ack at least this one:


   Revert and fix drm/i915/dp: remove DPMS mode tracking from DP

(i.e. d2b996ac698aebb28557355857927b8b934bb4f9)

for -stable?  It fixes an annoying regression in 3.0.

Thanks,
Andy


[Intel-gfx] [ANCIENT PATCH] Enable 30-bit depth

2011-04-19 Thread Andy Lutomirski
Signed-off-by: Andy Lutomirski <l...@mit.edu>
---

This patch is over a year old, caused problems, and probably doesn't even apply 
anymore.  It worked at least a little bit, though.  There's a lot more that 
needs doing, especially in relation to DirectColor mode.

 src/intel_driver.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/intel_driver.c b/src/intel_driver.c
index 1ef16ed..99d32b8 100644
--- a/src/intel_driver.c
+++ b/src/intel_driver.c
@@ -372,10 +372,10 @@ static void intel_check_dri_option(ScrnInfoPtr scrn)
 	if (!xf86ReturnOptValBool(intel->Options, OPTION_DRI, TRUE))
 		intel->directRenderingType = DRI_DISABLED;
 
-	if (scrn->depth != 16 && scrn->depth != 24) {
+	if (scrn->depth != 16 && scrn->depth != 24 && scrn->depth != 30) {
 		xf86DrvMsg(scrn->scrnIndex, X_CONFIG,
 			   "DRI is disabled because it "
-			   "runs only at depths 16 and 24.\n");
+			   "runs only at depths 16, 24, and 30.\n");
 		intel->directRenderingType = DRI_DISABLED;
 	}
 }
@@ -570,6 +570,7 @@ static Bool I830PreInit(ScrnInfoPtr scrn, int flags)
 	case 15:
 	case 16:
 	case 24:
+	case 30:
 		break;
 	default:
 		xf86DrvMsg(scrn->scrnIndex, X_ERROR,
-- 
1.7.4.2



[Intel-gfx] [PATCH] drm: Aggressively disable vblanks

2010-12-20 Thread Andy Lutomirski
Enabling and disabling the vblank interrupt (on devices that
support it) is cheap.  So disable it quickly after each
interrupt.

To avoid constantly enabling and disabling vblank when
animations are running, after a predefined number (3) of
consecutive enabled vblanks that someone cared about, just
leave the interrupt on until an interrupt that no one cares
about.

The full heuristic is:

There's a per-CRTC counter called vblank_consecutive.  It
starts at zero and counts consecutive useful vblanks.  A
vblank is useful if the refcount is nonzero when the interrupt
comes in.

Whenever drm_vblank_get enables the interrupt, it stays on at
least until the next vblank (*).  If drm_vblank_put is called
and vblank_consecutive is less than a threshold (currently 3),
then the interrupt is disabled.  If a vblank interrupt happens
with refcount == 0, then the interrupt is disabled and
vblank_consecutive is reset to 0.  If vblank_get is called
with the interrupt disabled and no interrupts were missed,
then vblank_consecutive is incremented.

(*) I tried letting it turn off before the next interrupt, but
compiz on my laptop seems to call drm_wait_vblank twice with
relative waits of 0 (!) before actually waiting.
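
The heuristic described above can be modelled as a small state machine. The sketch below is a userspace illustration only, not the patch itself: struct vblank_model, the model_* functions and the irq_since_enable field are hypothetical, the `missed` argument stands in for however missed interrupts are detected, and resetting the streak when interrupts were missed is an assumption the description does not spell out.

```c
#include <stdbool.h>

#define VBLANK_CONSEC_THRESHOLD 3

/* Hypothetical per-CRTC state for the model. */
struct vblank_model {
    int refcount;          /* callers currently holding a vblank reference */
    bool enabled;          /* is the vblank interrupt on? */
    bool irq_since_enable; /* has a vblank fired since the last enable? */
    int consecutive;       /* consecutive useful vblanks */
};

/* Turn the interrupt off if nobody holds a reference and either the
 * streak is below the threshold or the caller forces it. */
static void model_disable(struct vblank_model *v, bool force)
{
    if (v->refcount != 0 || !v->enabled)
        return;
    if (v->consecutive < VBLANK_CONSEC_THRESHOLD || force) {
        v->enabled = false;
        if (force)
            v->consecutive = 0;
    }
}

/* missed: whether vblanks went by while the interrupt was off. */
static void model_get(struct vblank_model *v, bool missed)
{
    v->refcount++;
    if (!v->enabled) {
        v->enabled = true;
        v->irq_since_enable = false;
        if (missed)
            v->consecutive = 0; /* assumption: a gap breaks the streak */
        else
            v->consecutive++;
    }
}

static void model_put(struct vblank_model *v)
{
    v->refcount--;
    /* Rule (*): stay on at least until the next vblank after enabling. */
    if (v->irq_since_enable)
        model_disable(v, false);
}

static void model_irq(struct vblank_model *v)
{
    if (!v->enabled)
        return;
    v->irq_since_enable = true;
    if (v->refcount == 0)
        model_disable(v, true); /* nobody cared: turn off, reset streak */
}
```

In this model, two get/irq/put cycles each end with the interrupt off; after the third the streak reaches the threshold and the interrupt stays on until a vblank arrives with no reference held.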

Signed-off-by: Andy Lutomirski <l...@mit.edu>
---

Jesse, you asked for the deletion of the timer to be separate
from reducing the timeout, but that seemed silly because I'm ripping
out the entire old mechanism.  If you're worried about the added time
spent in the interrupt handler, I could move it to a tasklet.  That
being said, disable_vblank should be very fast (it's at most a couple
of register accesses in all in-tree drivers).

I've tested this on i915, where it works nicely and reduces my wakeups
with a second hand showing on the clock but an otherwise idle system.

This changes the requirements on enable_vblank, disable_vblank and
get_vblank_counter: they can now be called from an IRQ.  They already
had to work with IRQs off and a spinlock held, but now a driver has to
watch out for recursive calls from drm_handle_vblank.  The vbl_lock is
still held.

I've audited the in-tree drivers:

mga, r128: get_vblank_counter just reads an atomic_t.  enable_vblank
just pokes registers without a lock.  disable_vblank does nothing, so
turning off vblanks is pointless.

via: get_vblank_counter just returns a counter.  enable_vblank and
disable_vblank just poke registers without locks.  (This looks wrong:
get_vblank_count does the wrong thing and will confuse my heuristic,
but it shouldn't be any worse than it already is.  I can comment out
enable_vblank if anyone prefers that approach.)

vmwgfx: get_vblank_counter does nothing and the other hooks aren't
implemented.

radeon: Everything looks safe.

i915: Looks good and tested!

nouveau: Not implemented at all.  I'm not sure why either the old code
or my code doesn't try to call a null pointer, but it doesn't.  That
being said, sync-to-vblank doesn't work on nouveau for me (glxgears
gets over 600fps while claiming to be synced to vblank).

As a future improvement, all of the vblank-disabling code could be
skipped if there is no disable_vblank callback.

 drivers/gpu/drm/drm_irq.c |  103 +---
 include/drm/drmP.h|5 ++-
 2 files changed, 81 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 9d3a503..2f107c5 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -77,45 +77,59 @@ int drm_irq_by_busid(struct drm_device *dev, void *data,
 	return 0;
 }
 
-static void vblank_disable_fn(unsigned long arg)
+/* After VBLANK_CONSEC_THRESHOLD consecutive non-ignored vblank interrupts,
+ * vblanks will be left on. */
+#define VBLANK_CONSEC_THRESHOLD 3
+
+static void __vblank_disable_now(struct drm_device *dev, int crtc, int force)
+{
+	if (!dev->vblank_disable_allowed)
+		return;
+
+	if (atomic_read(&dev->vblank_refcount[crtc]) == 0 && dev->vblank_enabled[crtc] &&
+	    (dev->vblank_consecutive[crtc] < VBLANK_CONSEC_THRESHOLD || force))
+	{
+		DRM_DEBUG("disabling vblank on crtc %d\n", crtc);
+		dev->last_vblank[crtc] =
+			dev->driver->get_vblank_counter(dev, crtc);
+		dev->driver->disable_vblank(dev, crtc);
+		dev->vblank_enabled[crtc] = 0;
+		if (force)
+			dev->vblank_consecutive[crtc] = 0;
+	}
+}
+
+static void vblank_disable_now(struct drm_device *dev, int crtc, int force)
 {
-	struct drm_device *dev = (struct drm_device *)arg;
 	unsigned long irqflags;
-	int i;
 
 	if (!dev->vblank_disable_allowed)
 		return;
 
-	for (i = 0; i < dev->num_crtcs; i++) {
-		spin_lock_irqsave(&dev->vbl_lock, irqflags);
-		if (atomic_read(&dev->vblank_refcount[i]) == 0 &&
-		    dev->vblank_enabled[i]) {
-			DRM_DEBUG("disabling vblank on crtc %d\n", i