Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-06 Thread Keith Packard
On Thu, 05 Jan 2012 21:41:34 -0800, Keith Packard kei...@keithp.com wrote:

 I talked with Eric about this and we decided that the whole splitting
 out of the i/o functions just doesn't make any sense. That makes this
 series very similar to Daniel's patches, so I'll rebase my bug fixes on
 top of those changes and send out a (shorter) series tomorrow.

And here's an updated version of my patches built on top of Daniel's:

The following changes since commit d06d2756a21a0c666f167ae9e4f13ef5f07f67d9:

  acpi/video: Don't restore backlight to 0 at boot time (2012-01-06 11:10:25 
-0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux forcewake-spinlock

Daniel Vetter (1):
  drm/i915: protect force_wake_(get|put) with the gt_lock

Daniel's patch (v3)


Keith Packard (3):
  drm/i915: Move reset forcewake processing to gen6_do_reset

This moves the forcewake code inside gen6_do_reset, at
the same time it changes from unconditionally calling
__gen6_gt_force_wake_get to using
dev_priv-display.force_wake_get. That could be broken
out as a separate patch -- it's just a bug.

  drm/i915: Hold gt_lock during reset
  drm/i915: Hold gt_lock across forcewake register reads

These two patches eliminate a race between chip reset
and other read operations. By holding the gt_lock during
all read operations, as well as across reset, we can
ensure that forcewake is active for all register
reads. Otherwise, right after chip reset, forcewake can
be inactive, but the internal forcewake_count may be
non-zero.

As a nice side-effect, this eliminates taking the
gt_lock twice during all register reads.

Please take a look and see if these are all reasonable additions to the
original patch and when it's ready, I'll push the whole sequence to
drm-intel-fixes.

-- 
keith.pack...@intel.com


pgpAuvzRPjx4m.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-05 Thread Daniel Vetter
On Wed, Jan 04, 2012 at 06:22:41PM -0800, Keith Packard wrote:
 On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter dan...@ffwll.ch wrote:
 
  The Correct? was just to check my understanding of your concern, I still
  think its invalid. You've cut away the second part of my mail where I
  explain why and I don't see you responding to that here. Also
  micro-optimizing the gpu reset code sounds a bit strange.
 
 Sorry, I didn't explain things very well.
 
 Right now, our register access looks like:
 
 get(struct_mutex);
 if (++forcewake_count == 1)
 force_wake_get()
 
 value = read32(reg) or  write32(reg, val)
 
 if (--forcewake_count == 0)
 force_wake_put();
 
 /* more register accesses may follow ... */
 put(struct_mutex);
 
 All very sensible, the whole register sequence is covered by
 struct_mutex, which ensures that the forcewake is set across the
 register access.
 
 The patch does:
 
 get(spin_lock)
 if (++forcewake_count == 1)
 force_wake_get()
 put(spin_lock)
 value = read32(reg) or write32(reg, val)
 get(spin_lock)
 if (--forcewake_count == 0)
 force_wake_put()
 put(spin_lock)
 
 I realize that all current users hold the struct_mutex across this whole
 sequence, aside from the reset path, but we're removing that requirement
 explicitly (the patch removes the WARN_ON calls).
 
 Without a lock held across the whole sequence, it's easy to see how a
 race could occur. We're also dropping and re-acquiring a spinlock with a
 single instruction between, which seems wasteful. Instead, we should be
 doing:
 
 get(spin_lock)
 force_wake_get();
 value = read32(reg) or  write32(reg,val)
 force_wake_put();
 put(spin_lock);
 
 No need here to deal with the wake lock at reset time; the whole
 operation is atomic wrt interrupts. It's more efficient *and* correct,
 without depending on the old struct_mutex locking.
 
 If you want to continue to expose the user-mode wake lock stuff, you
 could add:
 
 get(spin_lock);
 if (!forcewake_count)
 force_wake_get();
 value = read32(reg) or  write32(reg,val)
 if (!forcewake_count)
 force_wake_put();
 put(spin_lock);
 
 This would require that the reset code also deal with the
 forcewake_count to restore the user-mode force wake.
 
 A further optimization would hold the force_wake active for 'a while' to
 avoid the extra bus cycles required, but we'd want to see a performance
 problem from this before doing that.

I think you've lost me ... A few random comments first, it looks like I
neeed more coffee:

- The reset code (running from a workqueue) does hold sturct mutex. It's
  the hangcheck and error state capture code running from softirq/timer
  context causing issues.

- Forcewake works like a reference counted resources. As long as all
  _get/_put calls are properly balanced, I don't see how somebody could
  sneak in in between (when we don't hold the spinlock) and cause havoc.
  For paranaoia we might want to drop a WARN_ON in the _put call to check
  whether it ever drops below 0. But all current users are clearly
  balanced, so I didn't bother with it.

- My missed IRQ w/a actually relies on this by grabbing a forcewake ref in
  get_irq and dropping it again in put_irq. In between there's usually a
  schedule().

- I've pondered with Chris whether we should do your proposed optimization
  but we then noticed that the gem code doesn't actually read from any
  forcewake protected registers in normal execution (I don't consider
  running the hangcheck timer normal ;-). With my missed irq w/a that now
  changes, so we might need to reconsider this. But imo that's material
  for a separate patch.

Yours, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-05 Thread Keith Packard
On Thu, 5 Jan 2012 12:29:08 +0100, Daniel Vetter dan...@ffwll.ch wrote:

 - The reset code (running from a workqueue) does hold sturct mutex. It's
   the hangcheck and error state capture code running from softirq/timer
   context causing issues.

Right, I mis-wrote; I meant the hangcheck timer (which I always think of
as part of the reset code).

 - Forcewake works like a reference counted resources. As long as all
   _get/_put calls are properly balanced, I don't see how somebody could
   sneak in in between (when we don't hold the spinlock) and cause havoc.
   For paranaoia we might want to drop a WARN_ON in the _put call to check
   whether it ever drops below 0. But all current users are clearly
   balanced, so I didn't bother with it.

Right, I was just confused somehow. Still seems weird to me to drop a
spinlock, execute a single instruction, and then immediately re-acquire
it, along with bumping forcewake_count twice.

 - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in
   get_irq and dropping it again in put_irq. In between there's usually a
   schedule().

This is essentially the same as the user-level forcewake and would be
handled in the same way -- keep forcewake_count, but use it only for
long-term values.

 - I've pondered with Chris whether we should do your proposed optimization
   but we then noticed that the gem code doesn't actually read from any
   forcewake protected registers in normal execution (I don't consider
   running the hangcheck timer normal ;-). With my missed irq w/a that now
   changes, so we might need to reconsider this. But imo that's material
   for a separate patch.

Yeah, all sounds reasonable. That separate patch can actually use
per-chip functions to read/write from the chip so we can also avoid
checking the forcewake stuff on register reads for older generation
hardware.

Make it work, then make it work faster.

-- 
keith.pack...@intel.com


pgpIwsiibUbiB.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-05 Thread Daniel Vetter
Looks like we managed to clear up our mutual confusion here ;-)

On Thu, Jan 05, 2012 at 07:49:12AM -0800, Keith Packard wrote:
 On Thu, 5 Jan 2012 12:29:08 +0100, Daniel Vetter dan...@ffwll.ch wrote:
 
  - The reset code (running from a workqueue) does hold sturct mutex. It's
the hangcheck and error state capture code running from softirq/timer
context causing issues.
 
 Right, I mis-wrote; I meant the hangcheck timer (which I always think of
 as part of the reset code).
 
  - Forcewake works like a reference counted resources. As long as all
_get/_put calls are properly balanced, I don't see how somebody could
sneak in in between (when we don't hold the spinlock) and cause havoc.
For paranaoia we might want to drop a WARN_ON in the _put call to check
whether it ever drops below 0. But all current users are clearly
balanced, so I didn't bother with it.
 
 Right, I was just confused somehow. Still seems weird to me to drop a
 spinlock, execute a single instruction, and then immediately re-acquire
 it, along with bumping forcewake_count twice.

Absolutely agreed, it's really ugly. But especially for locking changes
I'd like a patch to do one thing, and one thing only. And I didn't see the
upside of a separate patch to fix things up, also because the current
I915_WRTE|READ macro maze is a bit hellish.

  - My missed IRQ w/a actually relies on this by grabbing a forcewake ref in
get_irq and dropping it again in put_irq. In between there's usually a
schedule().
 
 This is essentially the same as the user-level forcewake and would be
 handled in the same way -- keep forcewake_count, but use it only for
 long-term values.
 
  - I've pondered with Chris whether we should do your proposed optimization
but we then noticed that the gem code doesn't actually read from any
forcewake protected registers in normal execution (I don't consider
running the hangcheck timer normal ;-). With my missed irq w/a that now
changes, so we might need to reconsider this. But imo that's material
for a separate patch.
 
 Yeah, all sounds reasonable. That separate patch can actually use
 per-chip functions to read/write from the chip so we can also avoid
 checking the forcewake stuff on register reads for older generation
 hardware.
 
 Make it work, then make it work faster.

Absolutely agreed, maybe with the adadendum to only try to make things
faster if it's actually a problem and shows up in a fast-path we care
about.

Cheers, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-05 Thread Keith Packard
On Thu, 5 Jan 2012 17:59:47 +0100, Daniel Vetter dan...@ffwll.ch wrote:

 Absolutely agreed, maybe with the adadendum to only try to make things
 faster if it's actually a problem and shows up in a fast-path we care
 about.

Here's a longer series that does a bunch of cleanup before trying to fix
things. Patches marked with '***' fix bugs. The patch marked with '...'
is the optimization to inline the spinlocks.

The following changes since commit d8e70a254d8f2da141006e496a51502b79115e80:

  drm/i915: only set the intel_crtc DPMS mode to on if the mode set succeeded 
(2012-01-03 14:55:52 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux forcewake-spinlock

Keith Packard (9):

  drm/i915: Split register access functions out from display functions

The forcewake functions are invoked unconditionally on = gen6
hardware from the register read/write functions. Having these
initialized as a side-effect of display initialization seems
wrong to me. I've moved the functions out of the display
structure and into a separate structure, and moved the
initialization to driver load time.

  drm/i915: Access registers through function pointers

This makes register access go through function pointers,
following similar changes in many other parts of the driver.

  drm/i915: Split out reg read/write for pre/post gen6 hardware

Taking advantage of the previous indirection, this actually
creates separate register read/write functions for pre-gen6 and
post-gen6 hardware.

  drm/i915: Move forcewake_count to reg_access structure

Just moves the count into the new structure to keep things
together.

  drm/i915: hide forcewake_count behind i915_forcewake_count

Create a function to hide getting the forcewake_count value

  drm/i915: Switch forcewake from atomic to using a spinlock

This changes the type from atomic to u32 and wraps all users in
a new spinlock. The spinlock is held across calls to
-force_wake_put and -force_wake_get.

***   drm/i915: Hold forcewake spinlock across reset process

Changes the reset process to hold the spinlock -- this will
ensure that all register operations will be correct wrt the
spinlock, even if the hardware gets reset.

***   drm/i915: Hold forcewake spinlock during register write operations

This protects the gt_fifo_count value under the spinlock and
keeps modifications to that tied to the actual register write.

...   drm/i915: inline spin_lock usage in register read macros

Here's the optimization I mentioned -- inlines the spinlocks
inside the register read operations.

-- 
keith.pack...@intel.com


pgpO1LijxUca4.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-05 Thread Keith Packard
On Thu, 05 Jan 2012 16:29:43 -0800, Keith Packard kei...@keithp.com wrote:

 Here's a longer series that does a bunch of cleanup before trying to fix
 things. Patches marked with '***' fix bugs. The patch marked with '...'
 is the optimization to inline the spinlocks.

I talked with Eric about this and we decided that the whole splitting
out of the i/o functions just doesn't make any sense. That makes this
series very similar to Daniel's patches, so I'll rebase my bug fixes on
top of those changes and send out a (shorter) series tomorrow.

-- 
keith.pack...@intel.com


pgp2ElK9xiAiW.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-04 Thread Daniel Vetter
On Wed, Jan 4, 2012 at 00:33, Keith Packard kei...@keithp.com wrote:
 On Tue, 3 Jan 2012 22:49:52 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
 wrote:

 Nope, current hangcheck blows up, and we have an i-g-t testcase for it
 (which the commit msg clearly states). There are also numerous bug
 reports where a dying gpu results in tons of
 WARN_ON(!mutex_locked(dev-struct_mutex)) noise in dmesg (which drowns
 out the gpu hang warning). The locking change fixes this.

 Ah, ok, that makes sense. Of course, hangcheck *could* have just taken
 struct_mutex were it run in a suitable context.

Nope, we cannot move the hangcheck into process context by using a
delayed work item and then grabbing struct_mutex. If the gpu is dead,
we usually have a task stuck waiting for it and already holding
struct_mutex. It is *absolutely* imperial that the hangcheck and error
state capture code do not block on anything that the i915 gem code
might hold onto.

 The patch adds the required locking to i915_reset.

 No, the spinlock protects the forcewake_count access and not the actual
 register access, which leaves all kinds of potential for races in
 threads not also holding struct_mutex while accessing registers.

Ah, I think I see you're concern: Between the time we reset the gpu
and the time we fix up the forcewake state somebody might sneak in and
see an inconstency between our tracking and the actual hw state, hence
reading garbage. Correct?

 If you want a spinlock to protect the register access, it must surround
 the whole operation.

Between the time the hangcheck declares the gpu dead and the time we
deem it officially resurrected at the end of i915_reset there's no
issue with returning garbage from register writes - after all, the gpu
just went down.

The only thing we have to take care of is that we don't leave behind
an inconsistent state after i915_reset, which the current locking in
my patch takes care of.

Hence I think that no further protection is required.
-Daniel
-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-04 Thread Daniel Vetter
On Wed, Jan 04, 2012 at 09:54:08AM -0800, Keith Packard wrote:
 On Wed, 4 Jan 2012 18:11:18 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
 wrote:
 
  Ah, I think I see you're concern: Between the time we reset the gpu
  and the time we fix up the forcewake state somebody might sneak in and
  see an inconstency between our tracking and the actual hw state, hence
  reading garbage. Correct?
 
 Indeed. Plus, holding the spinlock across the whole operation also means
 only taking it once, rather than twice. Spinlocks aren't free.
 
 If we change the locking from struct_mutex to the spinlock, we should
 actually make it work, independent of what access we have today.

The Correct? was just to check my understanding of your concern, I still
think its invalid. You've cut away the second part of my mail where I
explain why and I don't see you responding to that here. Also
micro-optimizing the gpu reset code sounds a bit strange.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-04 Thread Keith Packard
On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter dan...@ffwll.ch wrote:

 The Correct? was just to check my understanding of your concern, I still
 think its invalid. You've cut away the second part of my mail where I
 explain why and I don't see you responding to that here. Also
 micro-optimizing the gpu reset code sounds a bit strange.

Sorry, I didn't explain things very well.

Right now, our register access looks like:

get(struct_mutex);
if (++forcewake_count == 1)
force_wake_get()

value = read32(reg) or  write32(reg, val)

if (--forcewake_count == 0)
force_wake_put();

/* more register accesses may follow ... */
put(struct_mutex);

All very sensible, the whole register sequence is covered by
struct_mutex, which ensures that the forcewake is set across the
register access.

The patch does:

get(spin_lock)
if (++forcewake_count == 1)
force_wake_get()
put(spin_lock)
value = read32(reg) or write32(reg, val)
get(spin_lock)
if (--forcewake_count == 0)
force_wake_put()
put(spin_lock)

I realize that all current users hold the struct_mutex across this whole
sequence, aside from the reset path, but we're removing that requirement
explicitly (the patch removes the WARN_ON calls).

Without a lock held across the whole sequence, it's easy to see how a
race could occur. We're also dropping and re-acquiring a spinlock with a
single instruction between, which seems wasteful. Instead, we should be
doing:

get(spin_lock)
force_wake_get();
value = read32(reg) or  write32(reg,val)
force_wake_put();
put(spin_lock);

No need here to deal with the wake lock at reset time; the whole
operation is atomic wrt interrupts. It's more efficient *and* correct,
without depending on the old struct_mutex locking.

If you want to continue to expose the user-mode wake lock stuff, you
could add:

get(spin_lock);
if (!forcewake_count)
force_wake_get();
value = read32(reg) or  write32(reg,val)
if (!forcewake_count)
force_wake_put();
put(spin_lock);

This would require that the reset code also deal with the
forcewake_count to restore the user-mode force wake.

A further optimization would hold the force_wake active for 'a while' to
avoid the extra bus cycles required, but we'd want to see a performance
problem from this before doing that.


-- 
keith.pack...@intel.com


pgphx3pZKZy5W.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-03 Thread Keith Packard
On Wed, 14 Dec 2011 13:57:03 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
wrote:

 The problem this patch solves is that the forcewake accounting
 necessary for register reads is protected by dev-struct_mutex. But the
 hangcheck and error_capture code need to access registers without
 grabbing this mutex because we hold it while waiting for the gpu.
 So a new lock is required. Because currently the error_state capture
 is called from the error irq handler and the hangcheck code runs from
 a timer, it needs to be an irqsafe spinlock (note that the registers
 used by the irq handler (neglecting the error handling part) only uses
 registers that don't need the forcewake dance).

I think this description is wrong -- the only difference between using
atomic objects and using a spinlock is that with the spinlock the call
to -force_wake_get is correctly serialized so that no register access
can occur without the chip being awoken. Without a spinlock, a second
thread can pass right through gen6_gt_force_wake_get and then go touch
registers while the first thread is busy waking the chip up.

-- 
keith.pack...@intel.com


pgpuWLYZdZV8S.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-03 Thread Daniel Vetter
On Tue, Jan 3, 2012 at 19:51, Keith Packard kei...@keithp.com wrote:
 On Wed, 14 Dec 2011 13:57:03 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
 wrote:

 The problem this patch solves is that the forcewake accounting
 necessary for register reads is protected by dev-struct_mutex. But the
 hangcheck and error_capture code need to access registers without
 grabbing this mutex because we hold it while waiting for the gpu.
 So a new lock is required. Because currently the error_state capture
 is called from the error irq handler and the hangcheck code runs from
 a timer, it needs to be an irqsafe spinlock (note that the registers
 used by the irq handler (neglecting the error handling part) only uses
 registers that don't need the forcewake dance).

 I think this description is wrong -- the only difference between using
 atomic objects and using a spinlock is that with the spinlock the call
 to -force_wake_get is correctly serialized so that no register access
 can occur without the chip being awoken. Without a spinlock, a second
 thread can pass right through gen6_gt_force_wake_get and then go touch
 registers while the first thread is busy waking the chip up.

I'm a bit confused by this. With the current code forcewake is
protected by dev-struct_mutex. Which doesn't work out because we need
to access registers that require the forcewake dance from non-process
context.

Afaik the atomic ops stuff is just ducttape for paranoia reasons.
-Daniel
-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-03 Thread Keith Packard
On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 I'm a bit confused by this. With the current code forcewake is
 protected by dev-struct_mutex. Which doesn't work out because we need
 to access registers that require the forcewake dance from non-process
 context.

Right, I like adding a spinlock around this to allow it to be called
without needing to be able to lock the struct_mutex. (I remember
suggesting that a spinlock would be necessary when the force wake code
first showed up...)

However, the commit message talks about the error capture and
hang check code, but doesn't appear to change them at all.

I think all this patch does is replace the locking for forcewake_count
From struct_mutex to a new irq-safe spinlock, the commit message makes
it sound like it's actually fixing stuff, which it isn't; it just
enables fixing stuff in future patches, right?

Reading through this a bit more, I think your patch opens up a hole in
i915_reset. i915_reset takes struct_mutex, then resets the chip and
restores the forcewake status. If we aren't relying on struct_mutex to
protect the forcewake bits, then there's nothing preventing a thread
From accessing the registers with the chip sleeping between the reset
and the force wake reset.

 Afaik the atomic ops stuff is just ducttape for paranoia reasons.

The atomic ops stuff would allow reading of the value without holding
struct_mutex, if that were actually useful.

-- 
keith.pack...@intel.com


pgpXJ9jHvesmI.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-03 Thread Ben Widawsky
On 01/03/2012 01:13 PM, Keith Packard wrote:
 On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
 wrote:
 
 I'm a bit confused by this. With the current code forcewake is
 protected by dev-struct_mutex. Which doesn't work out because we need
 to access registers that require the forcewake dance from non-process
 context.
 
 Right, I like adding a spinlock around this to allow it to be called
 without needing to be able to lock the struct_mutex. (I remember
 suggesting that a spinlock would be necessary when the force wake code
 first showed up...)
 
 However, the commit message talks about the error capture and
 hang check code, but doesn't appear to change them at all.
 
 I think all this patch does is replace the locking for forcewake_count
 From struct_mutex to a new irq-safe spinlock, the commit message makes
 it sound like it's actually fixing stuff, which it isn't; it just
 enables fixing stuff in future patches, right?

As Daniel mentioned in the commit message, it fixes existing bugs simply
by using a spinlock. In the timer, we do not grab struct_mutex and there
is currently a race there (which we've known about since day 1).

 Afaik the atomic ops stuff is just ducttape for paranoia reasons.
 
 The atomic ops stuff would allow reading of the value without holding
 struct_mutex, if that were actually useful.

The atomic ops stuff was simply there to help reduce the races (even if
we don't have the lock, we can still safely increment the variable). It
should be safe to get rid of with the spinlock in place.

My only gripe here is Chris shot down my earlier version of this patch
many moons ago :(

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-03 Thread Daniel Vetter
On Tue, Jan 3, 2012 at 22:13, Keith Packard kei...@keithp.com wrote:
 On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
 wrote:

 I'm a bit confused by this. With the current code forcewake is
 protected by dev-struct_mutex. Which doesn't work out because we need
 to access registers that require the forcewake dance from non-process
 context.

 Right, I like adding a spinlock around this to allow it to be called
 without needing to be able to lock the struct_mutex. (I remember
 suggesting that a spinlock would be necessary when the force wake code
 first showed up...)

 However, the commit message talks about the error capture and
 hang check code, but doesn't appear to change them at all.

 I think all this patch does is replace the locking for forcewake_count
 From struct_mutex to a new irq-safe spinlock, the commit message makes
 it sound like it's actually fixing stuff, which it isn't; it just
 enables fixing stuff in future patches, right?

Nope, current hangcheck blows up, and we have an i-g-t testcase for it
(which the commit msg clearly states). There are also numerous bug
reports where a dying gpu results in tons of
WARN_ON(!mutex_locked(dev-struct_mutex)) noise in dmesg (which drowns
out the gpu hang warning). The locking change fixes this.

 Reading through this a bit more, I think your patch opens up a hole in
 i915_reset. i915_reset takes struct_mutex, then resets the chip and
 restores the forcewake status. If we aren't relying on struct_mutex to
 protect the forcewake bits, then there's nothing preventing a thread
 From accessing the registers with the chip sleeping between the reset
 and the force wake reset.

The patch adds the required locking to i915_reset.

 Afaik the atomic ops stuff is just ducttape for paranoia reasons.

 The atomic ops stuff would allow reading of the value without holding
 struct_mutex, if that were actually useful.

... but is currently unused and inherently racy. Which is why the
patch drops it.
-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-03 Thread Chris Wilson
On Tue, 03 Jan 2012 13:49:36 -0800, Ben Widawsky b...@bwidawsk.net wrote:
 The atomic ops stuff was simply there to help reduce the races (even if
 we don't have the lock, we can still safely increment the variable). It
 should be safe to get rid of with the spinlock in place.
 
 My only gripe here is Chris shot down my earlier version of this patch
 many moons ago :(

The other way of tackling it would be not to take the forcewake during
hangcheck at all, and engineer the hangcheck not to rely on the
ring reads. For example, use seqno as the primary activity monitor,
which only leaves the case of trying not to fire spuriously during a
long batchbuffer. To counter that, you could optimistically do a raw
read of ACTHD or simply rely on long timeouts. Any error recover should
be moved to the error handling workqueue, so that we never attempt to
write a register or modify the stuct without the struct_mutex.

Reducing the granularity of struct_mutex and solving the contention
with mode_config.lock over register access is the ultimate goal when
reviewing the locking mess.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2012-01-03 Thread Keith Packard
On Tue, 3 Jan 2012 22:49:52 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 Nope, current hangcheck blows up, and we have an i-g-t testcase for it
 (which the commit msg clearly states). There are also numerous bug
 reports where a dying gpu results in tons of
 WARN_ON(!mutex_locked(dev-struct_mutex)) noise in dmesg (which drowns
 out the gpu hang warning). The locking change fixes this.

Ah, ok, that makes sense. Of course, hangcheck *could* have just taken
struct_mutex were it run in a suitable context.

 The patch adds the required locking to i915_reset.

No, the spinlock protects the forcewake_count access and not the actual
register access, which leaves all kinds of potential for races in
threads not also holding struct_mutex while accessing registers.

If you want a spinlock to protect the register access, it must surround
the whole operation.

-- 
keith.pack...@intel.com


pgpgrBG3ePrUe.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

2011-12-14 Thread Daniel Vetter
The problem this patch solves is that the forcewake accounting
necessary for register reads is protected by dev-struct_mutex. But the
hangcheck and error_capture code need to access registers without
grabbing this mutex because we hold it while waiting for the gpu.
So a new lock is required. Because currently the error_state capture
is called from the error irq handler and the hangcheck code runs from
a timer, it needs to be an irqsafe spinlock (note that the registers
used by the irq handler (neglecting the error handling part) only uses
registers that don't need the forcewake dance).

We could tune this down to a normal spinlock when we rework the
error_state capture and hangcheck code to run from a workqueue.  But
we don't have any read in a fastpath that needs forcewake, so I've
decided to not care much about overhead.

This prevents tests/gem_hangcheck_forcewake from i-g-t from killing my
snb on recent kernels - something must have slightly changed the
timings. On previous kernels it only trigger a WARN about the broken
locking.

v2: Drop the previous patch for the register writes.

v3: Improve the commit message per Chris Wilson's suggestions.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk
Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com
---
 drivers/gpu/drm/i915/i915_debugfs.c |8 ++--
 drivers/gpu/drm/i915/i915_dma.c |1 +
 drivers/gpu/drm/i915/i915_drv.c |   18 --
 drivers/gpu/drm/i915/i915_drv.h |   10 +++---
 4 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 60e8092..c130c5d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1321,9 +1321,13 @@ static int i915_gen6_forcewake_count_info(struct 
seq_file *m, void *data)
struct drm_info_node *node = (struct drm_info_node *) m-private;
struct drm_device *dev = node-minor-dev;
struct drm_i915_private *dev_priv = dev-dev_private;
+   unsigned forcewake_count;
 
-   seq_printf(m, forcewake count = %d\n,
-  atomic_read(dev_priv-forcewake_count));
+   spin_lock_irq(dev_priv-gt_lock);
+   forcewake_count = dev_priv-forcewake_count;
+   spin_unlock_irq(dev_priv-gt_lock);
+
+   seq_printf(m, forcewake count = %u\n, forcewake_count);
 
return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a9533c5..448d5b1 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2032,6 +2032,7 @@ int i915_driver_load(struct drm_device *dev, unsigned 
long flags)
if (!IS_I945G(dev)  !IS_I945GM(dev))
pci_enable_msi(dev-pdev);
 
+   spin_lock_init(dev_priv-gt_lock);
spin_lock_init(dev_priv-irq_lock);
spin_lock_init(dev_priv-error_lock);
spin_lock_init(dev_priv-rps_lock);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 28836fe..34f5115 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -368,11 +368,12 @@ void __gen6_gt_force_wake_mt_get(struct drm_i915_private 
*dev_priv)
  */
 void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv)
 {
-   WARN_ON(!mutex_is_locked(dev_priv-dev-struct_mutex));
+   unsigned long irqflags;
 
-   /* Forcewake is atomic in case we get in here without the lock */
-   if (atomic_add_return(1, dev_priv-forcewake_count) == 1)
+   spin_lock_irqsave(dev_priv-gt_lock, irqflags);
+   if (dev_priv-forcewake_count++ == 0)
dev_priv-display.force_wake_get(dev_priv);
+   spin_unlock_irqrestore(dev_priv-gt_lock, irqflags);
 }
 
 void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
@@ -392,10 +393,12 @@ void __gen6_gt_force_wake_mt_put(struct drm_i915_private 
*dev_priv)
  */
 void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 {
-   WARN_ON(!mutex_is_locked(dev_priv-dev-struct_mutex));
+   unsigned long irqflags;
 
-   if (atomic_dec_and_test(dev_priv-forcewake_count))
+   spin_lock_irqsave(dev_priv-gt_lock, irqflags);
+   if (--dev_priv-forcewake_count == 0)
dev_priv-display.force_wake_put(dev_priv);
+   spin_unlock_irqrestore(dev_priv-gt_lock, irqflags);
 }
 
 void __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
@@ -626,6 +629,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
 * need to
 */
bool need_display = true;
+   unsigned long irqflags;
int ret;
 
if (!i915_try_reset)
@@ -644,8 +648,10 @@ int i915_reset(struct drm_device *dev, u8 flags)
case 6:
ret = gen6_do_reset(dev, flags);
/* If reset with a user forcewake, try to restore */
-   if (atomic_read(dev_priv-forcewake_count))
+   spin_lock_irqsave(dev_priv-gt_lock,