Re: [Intel-gfx] [PATCH v11 2/3] drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads

2018-04-23 Thread Zhang, Yunwei
Sorry, I added a debug patch when submitting to trybot and forgot to 
remove it from my local branch. I will resubmit this as a new series.


Yunwei


On 4/23/2018 12:55 PM, Rodrigo Vivi wrote:

On Mon, Apr 23, 2018 at 09:12:46AM -0700, Yunwei Zhang wrote:

L3Banks can be fused off in hardware for debug purposes, and it is
possible for a subslice to be enabled while its corresponding L3Bank pair
is disabled. In that case, if the MCR packet control register (0xFDC) is
programmed to point to a disabled bank pair, an MMIO read into the L3Bank
range returns 0 instead of the correct value.

However, this is not going to be the case in any production silicon.
Therefore, we only check at initialization and issue a warning should
this really happen.

References: HSDES#1405586840

v2:
  - use fls instead of find_last_bit (Chris)
  - use is_power_of_2() instead of counting bit set (Chris)
v3:
  - rebase on latest tip
v5:
  - Added references (Mika)
  - Move local variable into scope where they are used (Ursulin)
  - use a new local variable to reduce long line of code (Ursulin)
v6:
  - Some coding style fixes and use more local variables for clearer
    logic (Ursulin)
v7:
  - Rebased.
v8:
  - Reviewed by Oscar.
v9:
  - Fixed label location. (Oscar)
v10:
  - Improved comments and replaced magic number. (Oscar)

Cc: Oscar Mateo 
Cc: Michel Thierry 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
Signed-off-by: Yunwei Zhang 
Reviewed-by: Oscar Mateo 

I confess that I got lost on this thread, so please
accept my apologies in advance if I'm missing something here.

But I don't know anymore:

- if this series has 2 or 3 patches.
- which of the Reviewed-by tags from Oscar are still valid
- if they are passing cleanly on CI.

So, my suggestion is to start a new series from scratch.
(resend all without any in-reply-to)

But please double check with Oscar whether his Reviewed-by should stay
on the new series.

Thanks,
Rodrigo.



---
  drivers/gpu/drm/i915/i915_reg.h  |  4 
  drivers/gpu/drm/i915/intel_device_info.c | 34 
  2 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index fb10602..6c9c01b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2709,6 +2709,10 @@ enum i915_power_well_id {
  #define   GEN10_F2_SS_DIS_SHIFT   18
  #define   GEN10_F2_SS_DIS_MASK    (0xf << GEN10_F2_SS_DIS_SHIFT)
  
+#define	GEN10_MIRROR_FUSE3		_MMIO(0x9118)
+#define GEN10_L3BANK_PAIR_COUNT 4
+#define GEN10_L3BANK_MASK   0x0F
+
  #define GEN8_EU_DISABLE0  _MMIO(0x9134)
  #define   GEN8_EU_DIS0_S0_MASK    0xff
  #define   GEN8_EU_DIS0_S1_SHIFT   24
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index d917c9b..44ca90a 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -729,6 +729,40 @@ static void sanitize_mcr(struct intel_device_info *info)
 	u32 slice = fls(info->sseu.slice_mask);
 	u32 subslice = fls(info->sseu.subslice_mask[slice]);
 
+	/*
+	 * WaProgramMgsrForL3BankSpecificMmioReads: cnl,icl
+	 * L3Banks could be fused off in single slice scenario. If that is
+	 * the case, we might need to program MCR select to a valid L3Bank
+	 * by default, to make sure we correctly read certain registers
+	 * later on (in the range 0xB100 - 0xB3FF).
+	 * This might be incompatible with
+	 * WaProgramMgsrForCorrectSliceSpecificMmioReads.
+	 * Fortunately, this should not happen in production hardware, so
+	 * we only assert that this is the case (instead of implementing
+	 * something more complex that requires checking the range of every
+	 * MMIO read).
+	 */
+	if (INTEL_GEN(dev_priv) >= 10 &&
+	    is_power_of_2(info->sseu.slice_mask)) {
+		/*
+		 * read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
+		 * enabled subslice, no need to redirect MCR packet
+		 */
+		u32 fuse3 = I915_READ(GEN10_MIRROR_FUSE3);
+		u8 ss_mask = info->sseu.subslice_mask[slice];
+
+		u8 enabled_mask = (ss_mask | ss_mask >>
+				   GEN10_L3BANK_PAIR_COUNT) &
+				  GEN10_L3BANK_MASK;
+		u8 disabled_mask = fuse3 & GEN10_L3BANK_MASK;
+
+		/*
+		 * Production silicon should have matched L3Bank and
+		 * subslice enabled
+		 */
+		WARN_ON((enabled_mask & disabled_mask) != enabled_mask);
+	}
+
 	if (INTEL_GEN(dev_priv) >= 11) {

Re: [Intel-gfx] [PATCH v7 1/2] drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads

2018-04-17 Thread Zhang, Yunwei



On 4/16/2018 3:09 PM, Oscar Mateo wrote:



On 04/16/2018 02:22 PM, Yunwei Zhang wrote:
WaProgramMgsrForCorrectSliceSpecificMmioReads dictates that before any MMIO
read into Slice/Subslice specific registers, the MCR packet control
register (0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, an incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination, as reads are unicast. This is OK since
slice/subslice specific register values are consistent across
slice/subslice in almost all cases. On rare occasions, such as INSTDONE,
the value depends on the slice/subslice combination; in such cases we need
to program 0xFDC and restore it afterwards. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

References: HSD#1405586840, BSID#0575

v2:
  - use fls() instead of find_last_bit() (Chris)
  - added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
  - rebase on latest tip
v5:
  - Added references (Mika)
  - Change the ordered of passing arguments and etc. (Ursulin)
v7:
  - Rebased.

Cc: Oscar Mateo 
Cc: Michel Thierry 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
Signed-off-by: Yunwei Zhang 
---
  drivers/gpu/drm/i915/i915_drv.h  |  2 ++
  drivers/gpu/drm/i915/intel_engine_cs.c   | 30 +++---
  drivers/gpu/drm/i915/intel_workarounds.c | 12 
  3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8e8667d..43498a47 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2725,6 +2725,8 @@ int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
  int intel_engines_init_mmio(struct drm_i915_private *dev_priv);
  int intel_engines_init(struct drm_i915_private *dev_priv);
  +u32 calculate_mcr(struct drm_i915_private *dev_priv, u32 mcr);
+


As a global function, this could use a better prefix (intel_something_)

Or, alternatively, make it local and store the calculation somewhere.

Good suggestion. Do you think intel_device_info would be a good place to 
store it, since it is deduced from that structure after all? Or should I 
put it in drm_i915_private?



  /* intel_hotplug.c */
  void intel_hpd_irq_handler(struct drm_i915_private *dev_priv,
 u32 pin_mask, u32 long_mask);
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 1a83707..3b6bc5e 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -799,6 +799,18 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
  }
  }
  +u32 calculate_mcr(struct drm_i915_private *dev_priv, u32 mcr)
+{
+    const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
+    u32 slice = fls(sseu->slice_mask);
+    u32 subslice = fls(sseu->subslice_mask[slice]);
+
+    mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+    mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+
+    return mcr;
+}
+
  static inline uint32_t
  read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
    int subslice, i915_reg_t reg)
@@ -831,18 +843,30 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
  intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
    mcr = I915_READ_FW(GEN8_MCR_SELECTOR);
+
  /*
   * The HW expects the slice and sublice selectors to be reset to 0
- * after reading out the registers.
+ * before GEN10 or to a enabled s/ss post GEN10 after reading out the
+ * registers.
   */
-    WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+    WARN_ON_ONCE(INTEL_GEN(dev_priv) < 10 &&
+ (mcr & mcr_slice_subslice_mask));


Advantage of storing the calculation: you can assert here for the 
expected value, independently of the platform.



  mcr &= ~mcr_slice_subslice_mask;
  mcr |= mcr_slice_subslice_select;
  I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
    ret = I915_READ_FW(reg);
  -    mcr &= ~mcr_slice_subslice_mask;
+    /*
+ * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl
+ * expects mcr to be programed to a enabled slice/subslice pair
+ * before any MMIO read into slice/subslice register
+ */
+    if (INTEL_GEN(dev_priv) < 10)
+    mcr &= ~mcr_slice_subslice_mask;
+    else
+    mcr = calculate_mcr(dev_priv, mcr);


Another advantage: no branching here either.


+
  I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
    intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c 
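
The suggestion made twice in the review above (compute the selector once, store it, then assert against it and restore it unconditionally) can be modelled in isolation. What follows is only a minimal sketch under stated assumptions: struct mcr_state, its fields, MCR_SELECT_MASK and all helper names are hypothetical and do not appear in the patch, and a plain variable stands in for the 0xFDC register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the slice/subslice select bits (hypothetical). */
#define MCR_SELECT_MASK	0x0F000000u

/* Hypothetical state: "reg" stands in for 0xFDC, "default_select" is the
 * value computed once at init (0 before Gen10, an enabled pair after). */
struct mcr_state {
	uint32_t reg;
	uint32_t default_select;
};

/* Platform-independent check: are the select bits at the stored default? */
static bool mcr_is_default(const struct mcr_state *st)
{
	return (st->reg & MCR_SELECT_MASK) == st->default_select;
}

/* Fake steered register: reads back the currently programmed select bits. */
static uint32_t fake_steered_read(const struct mcr_state *st)
{
	return st->reg & MCR_SELECT_MASK;
}

/* Steer to "select", read, then restore the stored default without
 * branching on the platform generation. */
static uint32_t read_with_select(struct mcr_state *st, uint32_t select)
{
	uint32_t val;

	st->reg = (st->reg & ~MCR_SELECT_MASK) | (select & MCR_SELECT_MASK);
	val = fake_steered_read(st);
	st->reg = (st->reg & ~MCR_SELECT_MASK) | st->default_select;

	return val;
}
```

With a stored default, the restore path is the same on every platform, and the WARN_ON at the top of the read path could compare against the stored value instead of being skipped on Gen10 and later.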

Re: [Intel-gfx] [PATCH v6 1/2] drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads

2018-04-10 Thread Zhang, Yunwei

Hi All,

I see the latest BAT test failed, but the only thing I changed in this 
new patchset is a comment. It should be a false alarm; I am not sure 
whether it is holding up further review. Please let me know if the code 
needs more changes.


Thanks,

Yunwei


On 3/29/2018 8:44 AM, Yunwei Zhang wrote:

WaProgramMgsrForCorrectSliceSpecificMmioReads dictates that before any MMIO
read into Slice/Subslice specific registers, the MCR packet control
register (0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, an incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination, as reads are unicast. This is OK since
slice/subslice specific register values are consistent across
slice/subslice in almost all cases. On rare occasions, such as INSTDONE,
the value depends on the slice/subslice combination; in such cases we need
to program 0xFDC and restore it afterwards. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

References: HSD#1405586840, BSID#0575

v2:
  - use fls() instead of find_last_bit() (Chris)
  - added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
  - rebase on latest tip
v5:
  - Added references (Mika)
  - Change the ordered of passing arguments and etc. (Ursulin)
v6:
  - Updated the comment that conflict with the patch. (Chris)

Cc: Oscar Mateo 
Cc: Michel Thierry 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
Signed-off-by: Yunwei Zhang 
---
  drivers/gpu/drm/i915/intel_engine_cs.c | 42 +++---
  1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 12486d8..4c50bee 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -796,6 +796,27 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
}
  }
  
+static u32 calculate_mcr(struct drm_i915_private *dev_priv, u32 mcr)
+{
+   const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
+   u32 slice = fls(sseu->slice_mask);
+   u32 subslice = fls(sseu->subslice_mask[slice]);
+
+   mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+   mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+
+   return mcr;
+}
+
+static void wa_init_mcr(struct drm_i915_private *dev_priv)
+{
+   u32 mcr;
+
+   mcr = I915_READ(GEN8_MCR_SELECTOR);
+   mcr = calculate_mcr(dev_priv, mcr);
+   I915_WRITE(GEN8_MCR_SELECTOR, mcr);
+}
+
  static inline uint32_t
  read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
  int subslice, i915_reg_t reg)
@@ -828,18 +849,30 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
  
  	mcr = I915_READ_FW(GEN8_MCR_SELECTOR);

+
/*
 * The HW expects the slice and sublice selectors to be reset to 0
-* after reading out the registers.
+* before GEN10 or to a enabled s/ss post GEN10 after reading out the
+* registers.
 */
-   WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+   WARN_ON_ONCE(INTEL_GEN(dev_priv) < 10 &&
+(mcr & mcr_slice_subslice_mask));
mcr &= ~mcr_slice_subslice_mask;
mcr |= mcr_slice_subslice_select;
I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
  
  	ret = I915_READ_FW(reg);
  
-	mcr &= ~mcr_slice_subslice_mask;

+   /*
+* WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl
+* expects mcr to be programed to a enabled slice/subslice pair
+* before any MMIO read into slice/subslice register
+*/
+   if (INTEL_GEN(dev_priv) < 10)
+   mcr &= ~mcr_slice_subslice_mask;
+   else
+   mcr = calculate_mcr(dev_priv, mcr);
+
I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
  
  	intel_uncore_forcewake_put__locked(dev_priv, fw_domains);

@@ -1307,6 +1340,9 @@ static int cnl_init_workarounds(struct intel_engine_cs *engine)
struct drm_i915_private *dev_priv = engine->i915;
int ret;
  
+	/* WaProgramMgsrForCorrectSliceSpecificMmioReads: cnl */
+   wa_init_mcr(dev_priv);
+
/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
I915_WRITE(GAMT_CHKN_BIT_REG,
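
For readers following calculate_mcr() in the patch above, here is a small standalone sketch of its two ingredients: the semantics of fls() and the replacement of the select fields. The assumptions are explicit: fls_sketch() is a portable stand-in for the kernel helper, and the MCR_* field positions (slice select in bits 27:26, subslice select in bits 25:24) are assumed for illustration and are not quoted anywhere in this thread. Note that fls() is 1-based, so fls(0x1) is 1; a zero-based index of the highest set bit would be fls(mask) - 1. Which of the two the selector fields expect is not settled in this thread.

```c
#include <stdint.h>

/* Portable stand-in for the kernel's fls(): 1-based index of the most
 * significant set bit, or 0 when no bit is set. */
static int fls_sketch(uint32_t x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Assumed field layout (hypothetical, for illustration only). */
#define MCR_SLICE(s)		(((uint32_t)(s) & 3u) << 26)
#define MCR_SLICE_MASK		MCR_SLICE(3)
#define MCR_SUBSLICE(ss)	(((uint32_t)(ss) & 3u) << 24)
#define MCR_SUBSLICE_MASK	MCR_SUBSLICE(3)

/* Replace only the two select fields and keep every other bit. */
static uint32_t mcr_select(uint32_t mcr, uint32_t slice, uint32_t subslice)
{
	mcr &= ~(MCR_SLICE_MASK | MCR_SUBSLICE_MASK);
	mcr |= MCR_SLICE(slice) | MCR_SUBSLICE(subslice);
	return mcr;
}
```

The read-modify-write shape matters because the selector register carries bits other than the two select fields, and those must survive the update.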


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5 2/2] drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads

2018-03-28 Thread Zhang, Yunwei



On 3/28/2018 2:39 AM, Tvrtko Ursulin wrote:


On 27/03/2018 23:14, Yunwei Zhang wrote:

L3Banks can be fused off in hardware for debug purposes, and it is
possible for a subslice to be enabled while its corresponding L3Bank pair
is disabled. In that case, if the MCR packet control register (0xFDC) is
programmed to point to a disabled bank pair, an MMIO read into the L3Bank
range returns 0 instead of the correct value.

However, this is not going to be the case in any production silicon.
Therefore, we only check at initialization and issue a warning should
this really happen.

References: HSDES#1405586840

v2:
  - use fls instead of find_last_bit (Chris)
  - use is_power_of_2() instead of counting bit set (Chris)
v3:
  - rebase on latest tip
v5:
  - Added references (Mika)
  - Move local variable into scope where they are used (Ursulin)
  - use a new local variable to reduce long line of code (Ursulin)

Cc: Oscar Mateo 
Cc: Michel Thierry 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
Signed-off-by: Yunwei Zhang 
---
  drivers/gpu/drm/i915/i915_reg.h    |  4 
  drivers/gpu/drm/i915/intel_engine_cs.c | 20 
  2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1bca695f..4f2f5e1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2691,6 +2691,10 @@ enum i915_power_well_id {
  #define   GEN10_F2_SS_DIS_SHIFT    18
  #define   GEN10_F2_SS_DIS_MASK    (0xf << GEN10_F2_SS_DIS_SHIFT)
  +#define    GEN10_MIRROR_FUSE3    _MMIO(0x9118)
+#define GEN10_L3BANK_PAIR_COUNT 4
+#define GEN10_L3BANK_MASK   0x0F
+
  #define GEN8_EU_DISABLE0    _MMIO(0x9134)
  #define   GEN8_EU_DIS0_S0_MASK    0xff
  #define   GEN8_EU_DIS0_S1_SHIFT    24
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 4c78d1e..7be7a75 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -811,6 +811,26 @@ static u32 calculate_mcr(struct drm_i915_private *dev_priv, u32 mcr)
  static void wa_init_mcr(struct drm_i915_private *dev_priv)
  {
  u32 mcr;
+    const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);


Another style nitpick - sorry I did not notice it before - we 
typically order assignments from function arguments to locals first, 
then pure locals. Also we typically try to make the declaration block 
start wide and then narrow.



+
+    /* If more than one slice are enabled, L3Banks should be all enabled */

L3Banks should be all enabled, or enabled for all enabled slices? 
(The comment below says "should have _matched_".)

See comment below.



+    if (is_power_of_2(sseu->slice_mask)) {
+    /*
+ * WaProgramMgsrForL3BankSpecificMmioReads:
+ * read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
+ * enabled subslice, no need to redirect MCR packet
+ */


This comment implies there will be some action taken depending on this 
conditional relating to the MCR, but there is nothing there?


It is not clear to me what should happen here, and whether perhaps this 
comment should be pulled up and merged with the one above the conditional.

This WA (for L3Banks, not the slice/subslice one) is not meant to exist 
on production silicon, so I am not sure whether we should 
implement/upstream it at all. We posted this partly to solicit suggestions. 
That is why, if L3Banks somehow do get fused off, we issue a warning 
instead of programming the MCR register.



+    u32 slice = fls(sseu->slice_mask);
+    u32 fuse3 = I915_READ(GEN10_MIRROR_FUSE3);
+    u8 ss_mask = sseu->subslice_mask[slice];


Insert blank line after declarations.

Also, is it correct to only consider the subslice mask of the last 
slice for this check?

This case only exists in the single enabled slice scenario; if two or 
more slices are enabled, no subslice will be fused off.



+    /*
+ * Production silicon should have matched L3Bank and
+ * subslice enabled
+ */
+    WARN_ON(!((fuse3 & GEN10_L3BANK_MASK) &
+  ((ss_mask | ss_mask >> GEN10_L3BANK_PAIR_COUNT) &
+  GEN10_L3BANK_MASK)));


Mask in fuse3 is the disabled mask right, since BSpec calls them "L3 
Bank Disable Select"?


Should you not be checking that none of the enabled slices have L3Bank 
disabled, while the above looks like it can miss a partial mismatch? 
Something like this:


u8 enabled_mask = (ss_mask | ss_mask >> 4) & 0xf;
u8 disabled_mask = fuse3 & 0xf;

WARN_ON((enabled_mask & disabled_mask) != enabled_mask);


+    }
    mcr = I915_READ(GEN8_MCR_SELECTOR);
  mcr = calculate_mcr(dev_priv, mcr);



Regards,

Tvrtko

Thanks,
Yunwei
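
The difference between the WARN_ON condition in the v5 patch and the form suggested in the review can be checked outside the driver. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the two constants mirror the values defined in the patch, and the variable names enabled_mask and disabled_mask are taken from the suggestion without asserting anything about the actual polarity of the FUSE3 bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define L3BANK_PAIR_COUNT	4	/* same value as GEN10_L3BANK_PAIR_COUNT */
#define L3BANK_MASK		0x0Fu	/* same value as GEN10_L3BANK_MASK */

/* Fold the subslice mask onto the four L3 bank pair bits. */
static uint8_t bank_pairs_for(uint8_t ss_mask)
{
	return (ss_mask | ss_mask >> L3BANK_PAIR_COUNT) & L3BANK_MASK;
}

/* v5 form: warns only when the two masks share no bit at all. */
static bool warn_v5(uint32_t fuse3, uint8_t ss_mask)
{
	return !((fuse3 & L3BANK_MASK) & bank_pairs_for(ss_mask));
}

/* Suggested form: warns unless every folded bit is also set in fuse3. */
static bool warn_suggested(uint32_t fuse3, uint8_t ss_mask)
{
	uint8_t enabled_mask = bank_pairs_for(ss_mask);
	uint8_t disabled_mask = fuse3 & L3BANK_MASK;

	return (enabled_mask & disabled_mask) != enabled_mask;
}
```

For example, with fuse3 = 0x1 and ss_mask = 0x3 the v5 condition stays silent because one bit overlaps, while the suggested condition fires because the overlap is only partial. That is the "partial mismatch" case raised in the review.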

Re: [Intel-gfx] [PATCH v5 1/2] drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads

2018-03-28 Thread Zhang, Yunwei


On 3/28/2018 9:03 AM, Chris Wilson wrote:

Quoting Zhang, Yunwei (2018-03-28 16:54:26)


On 3/27/2018 4:13 PM, Chris Wilson wrote:

Quoting Zhang, Yunwei (2018-03-27 23:49:27)

On 3/27/2018 3:27 PM, Chris Wilson wrote:

Quoting Yunwei Zhang (2018-03-27 23:14:16)

WaProgramMgsrForCorrectSliceSpecificMmioReads dictate that before any MMIO
read into Slice/Subslice specific registers, MCR packet control
register(0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination as read is unicast. This is OK since
slice/subslice specific register values are consistent in almost all cases
across slice/subslice. There are rare occasions such as INSTDONE that this
value will be dependent on slice/subslice combo, in such cases, we need to
program 0xFDC and recover this after. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

References: HSD#1405586840, BSID#0575

v2:
- use fls() instead of find_last_bit() (Chris)
- added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
- rebase on latest tip
v5:
- Added references (Mika)
- Change the ordered of passing arguments and etc. (Ursulin)

Cc: Oscar Mateo <oscar.ma...@intel.com>
Cc: Michel Thierry <michel.thie...@intel.com>
Cc: Joonas Lahtinen <joonas.lahti...@linux.intel.com>
Cc: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuopp...@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursu...@linux.intel.com>
Signed-off-by: Yunwei Zhang <yunwei.zh...@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 39 
--
1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index de09fa4..4c78d1e 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -796,6 +796,27 @@ const char *i915_cache_level_str(struct drm_i915_private 
*i915, int type)
   }
}

+static u32 calculate_mcr(struct drm_i915_private *dev_priv, u32 mcr)

+{
+   const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
+   u32 slice = fls(sseu->slice_mask);
+   u32 subslice = fls(sseu->subslice_mask[slice]);
+
+   mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+   mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+
+   return mcr;
+}
+
+static void wa_init_mcr(struct drm_i915_private *dev_priv)
+{
+   u32 mcr;
+
+   mcr = I915_READ(GEN8_MCR_SELECTOR);
+   mcr = calculate_mcr(dev_priv, mcr);
+   I915_WRITE(GEN8_MCR_SELECTOR, mcr);
+}
+
static inline uint32_t
read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
 int subslice, i915_reg_t reg)
@@ -828,18 +849,29 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int 
slice,
   intel_uncore_forcewake_get__locked(dev_priv, fw_domains);

   mcr = I915_READ_FW(GEN8_MCR_SELECTOR);

+
   /*
* The HW expects the slice and sublice selectors to be reset to 0
* after reading out the registers.
*/
-   WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+   WARN_ON_ONCE(INTEL_GEN(dev_priv) < 10 &&
+(mcr & mcr_slice_subslice_mask));
   mcr &= ~mcr_slice_subslice_mask;
   mcr |= mcr_slice_subslice_select;
   I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);

   ret = I915_READ_FW(reg);

-   mcr &= ~mcr_slice_subslice_mask;

+   /*
+* WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl
+* expects mcr to be programed to a enabled slice/subslice pair
+* before any MMIO read into slice/subslice register
+*/

So the read was above, where we did set the subslice_select
appropriately. Here we are resetting back to 0 *after* the read, as the
comment before indicates.

So what are you trying to accomplish with this patch? Other than leaving
the code in conflict with itself.
-Chris

Hi Chris,

The comment saying 0xFDC needs to be reset to 0 predates this WA. On
later HW, this WA requires 0xFDC to be programmed to an enabled
slice/subslice.

What this patch does is initialize 0xFDC once at initialization (it is
also called after engine reset/TDR/coming out of c6) and make sure that
every time it is changed, it is reprogrammed to an enabled
slice/subslice so that an MMIO read will get the correct value.
read_subslice_reg changes the 0xFDC value, and if it is set to 0, MMIO
reads of s/ss specific registers will return invalid values.

What mmio read? The only accessor should be this function.

And still the two comments are in direct conflict with each

Re: [Intel-gfx] [PATCH v5 1/2] drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads

2018-03-28 Thread Zhang, Yunwei



On 3/27/2018 4:13 PM, Chris Wilson wrote:

Quoting Zhang, Yunwei (2018-03-27 23:49:27)


On 3/27/2018 3:27 PM, Chris Wilson wrote:

Quoting Yunwei Zhang (2018-03-27 23:14:16)

WaProgramMgsrForCorrectSliceSpecificMmioReads dictate that before any MMIO
read into Slice/Subslice specific registers, MCR packet control
register(0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination as read is unicast. This is OK since
slice/subslice specific register values are consistent in almost all cases
across slice/subslice. There are rare occasions such as INSTDONE that this
value will be dependent on slice/subslice combo, in such cases, we need to
program 0xFDC and recover this after. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

References: HSD#1405586840, BSID#0575

v2:
   - use fls() instead of find_last_bit() (Chris)
   - added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
   - rebase on latest tip
v5:
   - Added references (Mika)
   - Change the ordered of passing arguments and etc. (Ursulin)

Cc: Oscar Mateo <oscar.ma...@intel.com>
Cc: Michel Thierry <michel.thie...@intel.com>
Cc: Joonas Lahtinen <joonas.lahti...@linux.intel.com>
Cc: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuopp...@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursu...@linux.intel.com>
Signed-off-by: Yunwei Zhang <yunwei.zh...@intel.com>
---
   drivers/gpu/drm/i915/intel_engine_cs.c | 39 
--
   1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index de09fa4..4c78d1e 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -796,6 +796,27 @@ const char *i915_cache_level_str(struct drm_i915_private 
*i915, int type)
  }
   }
   
+static u32 calculate_mcr(struct drm_i915_private *dev_priv, u32 mcr)

+{
+   const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
+   u32 slice = fls(sseu->slice_mask);
+   u32 subslice = fls(sseu->subslice_mask[slice]);
+
+   mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+   mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+
+   return mcr;
+}
+
+static void wa_init_mcr(struct drm_i915_private *dev_priv)
+{
+   u32 mcr;
+
+   mcr = I915_READ(GEN8_MCR_SELECTOR);
+   mcr = calculate_mcr(dev_priv, mcr);
+   I915_WRITE(GEN8_MCR_SELECTOR, mcr);
+}
+
   static inline uint32_t
   read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
int subslice, i915_reg_t reg)
@@ -828,18 +849,29 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int 
slice,
  intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
   
  mcr = I915_READ_FW(GEN8_MCR_SELECTOR);

+
  /*
   * The HW expects the slice and sublice selectors to be reset to 0
   * after reading out the registers.
   */
-   WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+   WARN_ON_ONCE(INTEL_GEN(dev_priv) < 10 &&
+(mcr & mcr_slice_subslice_mask));
  mcr &= ~mcr_slice_subslice_mask;
  mcr |= mcr_slice_subslice_select;
  I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
   
  ret = I915_READ_FW(reg);
   
-   mcr &= ~mcr_slice_subslice_mask;

+   /*
+* WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl
+* expects mcr to be programed to a enabled slice/subslice pair
+* before any MMIO read into slice/subslice register
+*/

So the read was above, where we did set the subslice_select
appropriately. Here we are resetting back to 0 *after* the read, as the
comment before indicates.

So what are you trying to accomplish with this patch? Other than leaving
the code in conflict with itself.
-Chris

Hi Chris,

The comment saying 0xFDC needs to be reset to 0 predates this WA. On
later HW, this WA requires 0xFDC to be programmed to an enabled
slice/subslice.

What this patch does is initialize 0xFDC once at initialization (it is
also called after engine reset/TDR/coming out of c6) and make sure that
every time it is changed, it is reprogrammed to an enabled
slice/subslice so that an MMIO read will get the correct value.
read_subslice_reg changes the 0xFDC value, and if it is set to 0, MMIO
reads of s/ss specific registers will return invalid values.

What mmio read? The only accessor should be this function.

And still the two comments are in direct conflict with each other.
-Chris
This function is only used in INST_DONE case which you need to iterate 
through each slice/subsli

Re: [Intel-gfx] [PATCH v5 1/2] drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads

2018-03-27 Thread Zhang, Yunwei



On 3/27/2018 3:27 PM, Chris Wilson wrote:

Quoting Yunwei Zhang (2018-03-27 23:14:16)

WaProgramMgsrForCorrectSliceSpecificMmioReads dictate that before any MMIO
read into Slice/Subslice specific registers, MCR packet control
register(0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination as read is unicast. This is OK since
slice/subslice specific register values are consistent in almost all cases
across slice/subslice. There are rare occasions such as INSTDONE that this
value will be dependent on slice/subslice combo, in such cases, we need to
program 0xFDC and recover this after. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

References: HSD#1405586840, BSID#0575

v2:
  - use fls() instead of find_last_bit() (Chris)
  - added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
  - rebase on latest tip
v5:
  - Added references (Mika)
  - Change the ordered of passing arguments and etc. (Ursulin)

Cc: Oscar Mateo 
Cc: Michel Thierry 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
Signed-off-by: Yunwei Zhang 
---
  drivers/gpu/drm/i915/intel_engine_cs.c | 39 --
  1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index de09fa4..4c78d1e 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -796,6 +796,27 @@ const char *i915_cache_level_str(struct drm_i915_private 
*i915, int type)
 }
  }
  
+static u32 calculate_mcr(struct drm_i915_private *dev_priv, u32 mcr)

+{
+   const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
+   u32 slice = fls(sseu->slice_mask);
+   u32 subslice = fls(sseu->subslice_mask[slice]);
+
+   mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+   mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+
+   return mcr;
+}
+
+static void wa_init_mcr(struct drm_i915_private *dev_priv)
+{
+   u32 mcr;
+
+   mcr = I915_READ(GEN8_MCR_SELECTOR);
+   mcr = calculate_mcr(dev_priv, mcr);
+   I915_WRITE(GEN8_MCR_SELECTOR, mcr);
+}
+
  static inline uint32_t
  read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
   int subslice, i915_reg_t reg)
@@ -828,18 +849,29 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int 
slice,
 intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
  
 mcr = I915_READ_FW(GEN8_MCR_SELECTOR);

+
 /*
  * The HW expects the slice and sublice selectors to be reset to 0
  * after reading out the registers.
  */
-   WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+   WARN_ON_ONCE(INTEL_GEN(dev_priv) < 10 &&
+(mcr & mcr_slice_subslice_mask));
 mcr &= ~mcr_slice_subslice_mask;
 mcr |= mcr_slice_subslice_select;
 I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
  
 ret = I915_READ_FW(reg);
  
-   mcr &= ~mcr_slice_subslice_mask;

+   /*
+* WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl
+* expects mcr to be programed to a enabled slice/subslice pair
+* before any MMIO read into slice/subslice register
+*/

So the read was above, where we did set the subslice_select
appropriately. Here we are resetting back to 0 *after* the read, as the
comment before indicates.

So what are you trying to accomplish with this patch? Other than leaving
the code in conflict with itself.
-Chris

Hi Chris,

The comment saying that 0xFDC needs to be reset to 0 predates this WA. On
later HW, this WA requires 0xFDC to be programmed to an enabled
slice/subslice pair.

What this patch does is initialize 0xFDC once at initialization (it will
also be called after engine reset/TDR/coming out of c6) and make sure that
every time it is changed, it is reprogrammed to an enabled slice/subslice
pair so that an MMIO read will get the correct value. read_subslice_reg
changes the 0xFDC value, and if it is set to 0, MMIO reads will return
invalid values for s/ss specific registers.
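
To make the difference between the two restore policies concrete, here is a
minimal user-space sketch. It is not driver code: struct fake_gpu,
steered_read(), read_pair() and highest_bit() are hypothetical stand-ins, the
field layout is only assumed to resemble the GEN8_MCR_* macros, and the model
uses one subslice mask for every slice. Note that highest_bit() is 0-based,
whereas the kernel's fls() is 1-based, so it is not a drop-in equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed selector layout, modeled loosely on the GEN8_MCR_* macros. */
#define MCR_SLICE(s)      (((uint32_t)(s) & 3u) << 26)
#define MCR_SUBSLICE(ss)  (((uint32_t)(ss) & 3u) << 24)
#define MCR_FIELDS        (MCR_SLICE(3) | MCR_SUBSLICE(3))

/* Hypothetical model of the hardware state relevant to steering. */
struct fake_gpu {
	uint32_t mcr;           /* stand-in for the selector at 0xFDC */
	uint32_t slice_mask;    /* bit n set: slice n is enabled */
	uint32_t subslice_mask; /* bit n set: subslice n is enabled */
};

/* Replace the slice/subslice fields of a selector value. */
static uint32_t steer(uint32_t mcr, uint32_t slice, uint32_t subslice)
{
	mcr &= ~MCR_FIELDS;
	return mcr | MCR_SLICE(slice) | MCR_SUBSLICE(subslice);
}

/* Unicast read: yields 0 when the selector points at a disabled pair. */
static uint32_t steered_read(const struct fake_gpu *gpu)
{
	uint32_t slice = (gpu->mcr >> 26) & 3u;
	uint32_t subslice = (gpu->mcr >> 24) & 3u;

	if (!(gpu->slice_mask & (1u << slice)) ||
	    !(gpu->subslice_mask & (1u << subslice)))
		return 0;
	return 0x1000u | (slice << 4) | subslice;
}

/* 0-based index of the highest set bit; the mask must be non-zero. */
static uint32_t highest_bit(uint32_t mask)
{
	uint32_t idx = 0;

	assert(mask != 0);
	while (mask >>= 1)
		idx++;
	return idx;
}

/*
 * Steer to (slice, subslice), read, then restore the selector either to
 * an enabled pair (the behaviour the WA asks for) or to 0 (the older one).
 */
static uint32_t read_pair(struct fake_gpu *gpu, uint32_t slice,
			  uint32_t subslice, int restore_enabled)
{
	uint32_t val;

	gpu->mcr = steer(gpu->mcr, slice, subslice);
	val = steered_read(gpu);

	if (restore_enabled)
		gpu->mcr = steer(gpu->mcr, highest_bit(gpu->slice_mask),
				 highest_bit(gpu->subslice_mask));
	else
		gpu->mcr &= ~MCR_FIELDS;

	return val;
}
```

In this model, with only slice 1/subslice 2 enabled, a later plain
steered_read() still returns a non-zero value after read_pair(..., 1), but
returns 0 after read_pair(..., 0), because the selector then points at a
disabled pair.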

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 1/2] drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads

2018-03-27 Thread Zhang, Yunwei



On 3/26/2018 9:57 AM, Tvrtko Ursulin wrote:


On 26/03/2018 17:12, Yunwei Zhang wrote:
WaProgramMgsrForCorrectSliceSpecificMmioReads dictates that before any MMIO
read into Slice/Subslice specific registers, MCR packet control
register(0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, an incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination as read is unicast. This is OK since
slice/subslice specific register values are consistent in almost all cases
across slice/subslice. There are rare occasions, such as INSTDONE, where
this value depends on the slice/subslice combo. In such cases, we need to
program 0xFDC and recover it afterwards. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

Reference: HSD#1405586840 BSID#0575

v2:
  - use fls() instead of find_last_bit() (Chris)
  - added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
  - rebase on latest tip
v4:
  - Added references (Mika)

Signed-off-by: Yunwei Zhang 
Cc: Oscar Mateo 
Cc: Michel Thierry 
Cc: Joonas Lahtinen 
Cc: Chris Wilson 
Cc: Mika Kuoppala 
---
  drivers/gpu/drm/i915/i915_drv.h    |  1 +
  drivers/gpu/drm/i915/intel_engine_cs.c | 39 
--

  2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 800230b..2db2a04 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2297,6 +2297,7 @@ intel_info(const struct drm_i915_private *dev_priv)
    #define INTEL_GEN(dev_priv)    ((dev_priv)->info.gen)
  #define INTEL_DEVID(dev_priv) ((dev_priv)->info.device_id)
+#define INTEL_SSEU(dev_priv)    ((dev_priv)->info.sseu)


If we add this someone gets the job of converting existing users?

This is a suggestion from Chris Wilson. I am new here, but I guess if I am
going to do that, it is better done in a separate patch. I will remove this
in the next patchset and submit a new patch later.

    #define REVID_FOREVER    0xff
  #define INTEL_REVID(dev_priv) ((dev_priv)->drm.pdev->revision)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index de09fa4..cc19e0a 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -796,6 +796,27 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
  }
  }
  +static u32 calculate_mcr(u32 mcr, struct drm_i915_private *dev_priv)


dev_priv first would be more typical function argument order.


+{
+    const struct sseu_dev_info *sseu = &(INTEL_SSEU(dev_priv));
+    u32 slice = fls(sseu->slice_mask);
+    u32 subslice = fls(sseu->subslice_mask[slice]);
+
+    mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+    mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+
+    return mcr;
+}
+
+static void wa_init_mcr(struct drm_i915_private *dev_priv)
+{
+    u32 mcr;
+
+    mcr = I915_READ(GEN8_MCR_SELECTOR);
+    mcr = calculate_mcr(mcr, dev_priv);
+    I915_WRITE(GEN8_MCR_SELECTOR, mcr);
+}
+
  static inline uint32_t
  read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
    int subslice, i915_reg_t reg)
@@ -828,18 +849,29 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
  intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
    mcr = I915_READ_FW(GEN8_MCR_SELECTOR);
+
  /*
   * The HW expects the slice and sublice selectors to be reset to 0
   * after reading out the registers.
   */
-    WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+    if (INTEL_GEN(dev_priv) < 10)
+    WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);


Can squash to single line WARN_ON_ONCE(INTEL_GEN() < 10 && (mcr & 
...)), if it fits.



  mcr &= ~mcr_slice_subslice_mask;
  mcr |= mcr_slice_subslice_select;
  I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
    ret = I915_READ_FW(reg);
  -    mcr &= ~mcr_slice_subslice_mask;
+    /*
+ * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl
+ * expects mcr to be programed to a enabled slice/subslice pair
+ * before any MMIO read into slice/subslice register
+ */
+    if (INTEL_GEN(dev_priv) < 10)
+    mcr &= ~mcr_slice_subslice_mask;
+    else
+    mcr = calculate_mcr(mcr, dev_priv);


Does it make sense to move the conditional and comment to
calculate_mcr - so here only a single call to it remains?

I am thinking maybe it is better to save the jump/return for GENs that
don't have the WA.
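
For comparison, the single-call variant being discussed could look roughly
like the sketch below. It is only an illustration in plain user-space C:
restore_mcr() is a hypothetical name, gen is passed in directly instead of
being read via INTEL_GEN(), and the field layout is an assumption modeled on
the GEN8_MCR_* macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed selector layout, modeled loosely on the GEN8_MCR_* macros. */
#define MCR_SLICE(s)      (((uint32_t)(s) & 3u) << 26)
#define MCR_SUBSLICE(ss)  (((uint32_t)(ss) & 3u) << 24)
#define MCR_FIELDS        (MCR_SLICE(3) | MCR_SUBSLICE(3))

/*
 * Hypothetical helper: the gen check lives here, so a caller makes one
 * unconditional call after the steered read.
 */
static uint32_t restore_mcr(int gen, uint32_t mcr,
			    uint32_t slice, uint32_t subslice)
{
	mcr &= ~MCR_FIELDS;

	/* Before gen 10: selectors simply go back to 0. */
	if (gen < 10)
		return mcr;

	/* Gen 10 and later: point back at an enabled slice/subslice pair. */
	return mcr | MCR_SLICE(slice) | MCR_SUBSLICE(subslice);
}
```

The trade-off raised in the reply still applies: every caller on hardware
without the WA pays for the extra call unless the compiler inlines it.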

+
  I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
    intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
@@ -1307,6 +1339,9 @@ static int cnl_init_workarounds(struct 
intel_engine_cs