Re: [Intel-gfx] [PATCH 4/5] drm/i915: reorganize the unclaimed register detection code

2014-07-15 Thread Rodrigo Vivi
On Mon, Jul 7, 2014 at 2:34 PM, Daniel Vetter dan...@ffwll.ch wrote:

 On Fri, Jul 04, 2014 at 11:50:32AM -0300, Paulo Zanoni wrote:
  From: Paulo Zanoni paulo.r.zan...@intel.com
 
  The current code only runs when we do an I915_WRITE operation. It
  checks if the unclaimed register flag is set before we do the
  operation, and then it checks it again after we do the operation. This
  double check allows us to find out if the I915_WRITE operation in
  question is the bad one, or if some previous code is the bad one. When
  it finds a problem, our code uses DRM_ERROR to signal it.
 
  The good thing about the current code is that it detects the problem,
  so at least we can know we did something wrong. The problem is that
  even though we find the problem, we don't really have much information
  to actually debug it. So whenever I see one of these DRM_ERROR
  messages on my systems, the first thing I do is apply a patch to
  change the DRM_ERROR to a WARN and also check for unclaimed registers
  on I915_READ operations. This local patch makes things even slower,
  but it usually helps a lot in finding the bad code.
 
  The first point here is that since the current code is only useful to
  detect whether we have a problem or not, but it is not really good to
  find the cause of the problem, I don't think we should be checking
  both before and after every I915_WRITE operation: just doing the check
  once should be enough for us to quickly detect problems. With this
  change, the code that runs by default for every single user will only
  do 1 read operation for every single I915_WRITE, instead of 2. This
  patch does this change.
 
  The second point is that the local patch I have should be upstream,
  but since it makes things slower it should be disabled by default. So
  I added the i915.mmio_debug option to enable it.
 
  So after this patch, this is what will happen:
   - By default, we will try to detect unclaimed registers once after
 every I915_WRITE operation. Previously we tried twice for every
 I915_WRITE.
   - When we find an unclaimed register we will still print a DRM_ERROR
 message, but we will now tell the user to try again with
 i915.mmio_debug=1.
   - When we use i915.mmio_debug=1 we will try to find unclaimed
 registers both before and after every I915_READ and I915_WRITE
 operation, and we will print stack traces in case we find them.
 This should really help locating the exact point of the bad code
 (or at least finding out that i915.ko is not the problem).
 
  This commit also opens space for really-slow register debugging
  operations on other platforms. In theory we can now add lots and lots
  of debug code behind i915.mmio_debug, enable this option on our tests,
  and catch more problems.
 
  Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com
  ---
   drivers/gpu/drm/i915/i915_drv.h |  1 +
   drivers/gpu/drm/i915/i915_params.c  |  6 
   drivers/gpu/drm/i915/intel_uncore.c | 68
 +
   3 files changed, 68 insertions(+), 7 deletions(-)
 
  diff --git a/drivers/gpu/drm/i915/i915_drv.h
 b/drivers/gpu/drm/i915/i915_drv.h
  index ac06c0f..51d867f 100644
  --- a/drivers/gpu/drm/i915/i915_drv.h
  +++ b/drivers/gpu/drm/i915/i915_drv.h
  @@ -2092,6 +2092,7 @@ struct i915_params {
bool disable_display;
bool disable_vtd_wa;
int use_mmio_flip;
  + bool mmio_debug;
   };
   extern struct i915_params i915 __read_mostly;
 
  diff --git a/drivers/gpu/drm/i915/i915_params.c
 b/drivers/gpu/drm/i915/i915_params.c
  index 8145729..7977872 100644
  --- a/drivers/gpu/drm/i915/i915_params.c
  +++ b/drivers/gpu/drm/i915/i915_params.c
  @@ -49,6 +49,7 @@ struct i915_params i915 __read_mostly = {
.enable_cmd_parser = 1,
.disable_vtd_wa = 0,
.use_mmio_flip = 0,
  + .mmio_debug = 0,
   };
 
   module_param_named(modeset, i915.modeset, int, 0400);
  @@ -161,3 +162,8 @@ MODULE_PARM_DESC(enable_cmd_parser,
   module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
   MODULE_PARM_DESC(use_mmio_flip,
 use MMIO flips (-1=never, 0=driver discretion [default],
 1=always));
  +
  +module_param_named(mmio_debug, i915.mmio_debug, bool, 0600);
  +MODULE_PARM_DESC(disable_vtd_wa,


wrong parameter here!


  + Enable the MMIO debug code (default: false). This may negatively 
  + affect performance.);
  diff --git a/drivers/gpu/drm/i915/intel_uncore.c
 b/drivers/gpu/drm/i915/intel_uncore.c
  index 29145df..de5402f 100644
  --- a/drivers/gpu/drm/i915/intel_uncore.c
  +++ b/drivers/gpu/drm/i915/intel_uncore.c
  @@ -513,21 +513,72 @@ ilk_dummy_write(struct drm_i915_private *dev_priv)
__raw_i915_write32(dev_priv, MI_MODE, 0);
   }
 
  +/**
  + * hsw_unclaimed_reg_debug - tries to find unclaimed registers
  + * @dev_priv: device private data
  + * @reg: register we're about to touch or just have touched
  + * @read: are we reading or writing a register now?
  

Re: [Intel-gfx] [PATCH 4/5] drm/i915: reorganize the unclaimed register detection code

2014-07-07 Thread Daniel Vetter
On Fri, Jul 04, 2014 at 11:50:32AM -0300, Paulo Zanoni wrote:
 From: Paulo Zanoni paulo.r.zan...@intel.com
 
 The current code only runs when we do an I915_WRITE operation. It
 checks if the unclaimed register flag is set before we do the
 operation, and then it checks it again after we do the operation. This
 double check allows us to find out if the I915_WRITE operation in
 question is the bad one, or if some previous code is the bad one. When
 it finds a problem, our code uses DRM_ERROR to signal it.
 
 The good thing about the current code is that it detects the problem,
 so at least we can know we did something wrong. The problem is that
 even though we find the problem, we don't really have much information
 to actually debug it. So whenever I see one of these DRM_ERROR
 messages on my systems, the first thing I do is apply a patch to
 change the DRM_ERROR to a WARN and also check for unclaimed registers
 on I915_READ operations. This local patch makes things even slower,
 but it usually helps a lot in finding the bad code.
 
 The first point here is that since the current code is only useful to
 detect whether we have a problem or not, but it is not really good to
 find the cause of the problem, I don't think we should be checking
 both before and after every I915_WRITE operation: just doing the check
 once should be enough for us to quickly detect problems. With this
 change, the code that runs by default for every single user will only
 do 1 read operation for every single I915_WRITE, instead of 2. This
 patch does this change.
 
 The second point is that the local patch I have should be upstream,
 but since it makes things slower it should be disabled by default. So
 I added the i915.mmio_debug option to enable it.
 
 So after this patch, this is what will happen:
  - By default, we will try to detect unclaimed registers once after
every I915_WRITE operation. Previously we tried twice for every
I915_WRITE.
  - When we find an unclaimed register we will still print a DRM_ERROR
message, but we will now tell the user to try again with
i915.mmio_debug=1.
  - When we use i915.mmio_debug=1 we will try to find unclaimed
registers both before and after every I915_READ and I915_WRITE
operation, and we will print stack traces in case we find them.
This should really help locating the exact point of the bad code
(or at least finding out that i915.ko is not the problem).
 
 This commit also opens space for really-slow register debugging
 operations on other platforms. In theory we can now add lots and lots
 of debug code behind i915.mmio_debug, enable this option on our tests,
 and catch more problems.
 
 Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com
 ---
  drivers/gpu/drm/i915/i915_drv.h |  1 +
  drivers/gpu/drm/i915/i915_params.c  |  6 
  drivers/gpu/drm/i915/intel_uncore.c | 68 
 +
  3 files changed, 68 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index ac06c0f..51d867f 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -2092,6 +2092,7 @@ struct i915_params {
   bool disable_display;
   bool disable_vtd_wa;
   int use_mmio_flip;
 + bool mmio_debug;
  };
  extern struct i915_params i915 __read_mostly;
  
 diff --git a/drivers/gpu/drm/i915/i915_params.c 
 b/drivers/gpu/drm/i915/i915_params.c
 index 8145729..7977872 100644
 --- a/drivers/gpu/drm/i915/i915_params.c
 +++ b/drivers/gpu/drm/i915/i915_params.c
 @@ -49,6 +49,7 @@ struct i915_params i915 __read_mostly = {
   .enable_cmd_parser = 1,
   .disable_vtd_wa = 0,
   .use_mmio_flip = 0,
 + .mmio_debug = 0,
  };
  
  module_param_named(modeset, i915.modeset, int, 0400);
 @@ -161,3 +162,8 @@ MODULE_PARM_DESC(enable_cmd_parser,
  module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
  MODULE_PARM_DESC(use_mmio_flip,
use MMIO flips (-1=never, 0=driver discretion [default], 
 1=always));
 +
 +module_param_named(mmio_debug, i915.mmio_debug, bool, 0600);
 +MODULE_PARM_DESC(disable_vtd_wa,
 + Enable the MMIO debug code (default: false). This may negatively 
 + affect performance.);
 diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
 b/drivers/gpu/drm/i915/intel_uncore.c
 index 29145df..de5402f 100644
 --- a/drivers/gpu/drm/i915/intel_uncore.c
 +++ b/drivers/gpu/drm/i915/intel_uncore.c
 @@ -513,21 +513,72 @@ ilk_dummy_write(struct drm_i915_private *dev_priv)
   __raw_i915_write32(dev_priv, MI_MODE, 0);
  }
  
 +/**
 + * hsw_unclaimed_reg_debug - tries to find unclaimed registers
 + * @dev_priv: device private data
 + * @reg: register we're about to touch or just have touched
 + * @read: are we reading or writing a register now?
 + * @before: are we running this before or after we touch the register?
 + *
 + * This function tries to detect unclaimed registers and provide as much
 + * information as possible.