Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Chris Wilson
Quoting Tvrtko Ursulin (2018-07-20 16:13:14)
> 
> On 20/07/2018 11:19, Chris Wilson wrote:
> > +/*
> > + * Once upon a time we supposed that writes through the GGTT would be
> > + * immediately in physical memory (once flushed out of the CPU path). 
> > However,
> > + * on a few different processors and chipsets, this is not necessarily the 
> > case
> > + * as the writes appear to be buffered internally. Thus a read of the 
> > backing
> > + * storage (physical memory) via a different path (with different physical 
> > tags
> > + * to the indirect write via the GGTT) will see stale values from before
> > + * the GGTT write. Inside the kernel, we can for the most part keep track 
> > of
> > + * the different read/write domains in use (e.g. set-domain), but the 
> > assumption
> > + * of coherency is baked into the ABI, hence reporting its true state in 
> > this
> > + * parameter.
> > + *
> > + * If set to true, writes via mmap_gtt are immediately visible following an
> > + * lfence to flush the WCB.
> > + *
> > + * If set to false, writes via mmap_gtt are indeterminatnly delayed in an 
> > in
> > + * internal buffer and are _not_ immediately visible to third parties 
> > accessing
> > + * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC
> > + * communications channel when set to false is strongly disadvised.
> 
> I would perhaps change the language from "set to true/false" to 
> "reported as true/false".

s/set/reported/ and pushed. Thanks for the reviews,
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Chris Wilson
Quoting Tvrtko Ursulin (2018-07-20 16:27:20)
> 
> On 20/07/2018 16:21, Chris Wilson wrote:
> > My only plan is for igt to know when the tests are expected to fail.
> > Real userspace should not be using GTT, it is slow (even slower than
> > intended ;) and constrained, so subject to aperture thrashing, fencing
> > is only usable for two out of the many tiling modes, etc, etc, etc.
> 
> I'll take the view that get_param is tiny enough for this to be fine.
> 
> Reviewed-by: Tvrtko Ursulin 

Also take a peek over at
https://patchwork.freedesktop.org/patch/239949/
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Tvrtko Ursulin


On 20/07/2018 16:21, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2018-07-20 16:13:14)


On 20/07/2018 11:19, Chris Wilson wrote:

Not all chipsets have an internal buffer delaying the visibility of
writes via the GGTT being visible by other physical paths, but we use a
very heavy workaround for all. We only need to apply that workarounds to
the chipsets we know suffer from the delay and the resulting coherency
issue.

Similarly, the same inconsistent coherency fouls up our ABI promise that
a write into a mmap_gtt is immediately visible to others. Since the HW
has made that a lie, let userspace know when that contract is broken.
(Not that userspace would want to use mmap_gtt on those chipsets for
other performance reasons...)

Testcase: igt/drv_selftest/live_coherency
Testcase: igt/gem_mmap_gtt/coherency


The list of platforms to mark with false was compiled from the test results?


Yes. Coverage of older chipsets is less than ideal, as we have fewer of
them being tested and the gmch wasn't tied to any processor. So whereas
my PIIIm + i915gm passes, CI's P4 + i915g fails. That is odd.


Hope the oddity is not hiding something but we'll see.


But then before this patch workaround was applied everywhere - so if the
test was failing even then - that means workaround wasn't sufficient?


The test is being run in userspace; bypassing any domain control in the
kernel, assuming the coherency model built into the ABI.


Ah yes.


+/*
+ * Once upon a time we supposed that writes through the GGTT would be
+ * immediately in physical memory (once flushed out of the CPU path). However,
+ * on a few different processors and chipsets, this is not necessarily the case
+ * as the writes appear to be buffered internally. Thus a read of the backing
+ * storage (physical memory) via a different path (with different physical tags
+ * to the indirect write via the GGTT) will see stale values from before
+ * the GGTT write. Inside the kernel, we can for the most part keep track of
+ * the different read/write domains in use (e.g. set-domain), but the 
assumption
+ * of coherency is baked into the ABI, hence reporting its true state in this
+ * parameter.
+ *
+ * If set to true, writes via mmap_gtt are immediately visible following an
+ * lfence to flush the WCB.
+ *
+ * If set to false, writes via mmap_gtt are indeterminatnly delayed in an in
+ * internal buffer and are _not_ immediately visible to third parties accessing
+ * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC
+ * communications channel when set to false is strongly disadvised.


I would perhaps change the language from "set to true/false" to
"reported as true/false".


+ */
+#define I915_PARAM_MMAP_GTT_COHERENT 52
+
   typedef struct drm_i915_getparam {
   __s32 param;
   /*



Looks fine to me in principle. Is there some userspace ready to start
consuming it?


My only plan is for igt to know when the tests are expected to fail.
Real userspace should not be using GTT, it is slow (even slower than
intended ;) and constrained, so subject to aperture thrashing, fencing
is only usable for two out of the many tiling modes, etc, etc, etc.


I'll take the view that get_param is tiny enough for this to be fine.

Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Chris Wilson
Quoting Tvrtko Ursulin (2018-07-20 16:13:14)
> 
> On 20/07/2018 11:19, Chris Wilson wrote:
> > Not all chipsets have an internal buffer delaying the visibility of
> > writes via the GGTT being visible by other physical paths, but we use a
> > very heavy workaround for all. We only need to apply that workarounds to
> > the chipsets we know suffer from the delay and the resulting coherency
> > issue.
> > 
> > Similarly, the same inconsistent coherency fouls up our ABI promise that
> > a write into a mmap_gtt is immediately visible to others. Since the HW
> > has made that a lie, let userspace know when that contract is broken.
> > (Not that userspace would want to use mmap_gtt on those chipsets for
> > other performance reasons...)
> > 
> > Testcase: igt/drv_selftest/live_coherency
> > Testcase: igt/gem_mmap_gtt/coherency
> 
> The list of platforms to mark with false was compiled from the test results?

Yes. Coverage of older chipsets is less than ideal, as we have fewer of
them being tested and the gmch wasn't tied to any processor. So whereas
my PIIIm + i915gm passes, CI's P4 + i915g fails. That is odd.
 
> But then before this patch workaround was applied everywhere - so if the 
> test was failing even then - that means workaround wasn't sufficient?

The test is being run in userspace; bypassing any domain control in the
kernel, assuming the coherency model built into the ABI.

> > +/*
> > + * Once upon a time we supposed that writes through the GGTT would be
> > + * immediately in physical memory (once flushed out of the CPU path). 
> > However,
> > + * on a few different processors and chipsets, this is not necessarily the 
> > case
> > + * as the writes appear to be buffered internally. Thus a read of the 
> > backing
> > + * storage (physical memory) via a different path (with different physical 
> > tags
> > + * to the indirect write via the GGTT) will see stale values from before
> > + * the GGTT write. Inside the kernel, we can for the most part keep track 
> > of
> > + * the different read/write domains in use (e.g. set-domain), but the 
> > assumption
> > + * of coherency is baked into the ABI, hence reporting its true state in 
> > this
> > + * parameter.
> > + *
> > + * If set to true, writes via mmap_gtt are immediately visible following an
> > + * lfence to flush the WCB.
> > + *
> > + * If set to false, writes via mmap_gtt are indeterminatnly delayed in an 
> > in
> > + * internal buffer and are _not_ immediately visible to third parties 
> > accessing
> > + * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC
> > + * communications channel when set to false is strongly disadvised.
> 
> I would perhaps change the language from "set to true/false" to 
> "reported as true/false".
> 
> > + */
> > +#define I915_PARAM_MMAP_GTT_COHERENT 52
> > +
> >   typedef struct drm_i915_getparam {
> >   __s32 param;
> >   /*
> > 
> 
> Looks fine to me in principle. Is there some userspace ready to start 
> consuming it?

My only plan is for igt to know when the tests are expected to fail.
Real userspace should not be using GTT, it is slow (even slower than
intended ;) and constrained, so subject to aperture thrashing, fencing
is only usable for two out of the many tiling modes, etc, etc, etc.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Tvrtko Ursulin


On 20/07/2018 11:19, Chris Wilson wrote:

Not all chipsets have an internal buffer delaying the visibility of
writes via the GGTT being visible by other physical paths, but we use a
very heavy workaround for all. We only need to apply that workarounds to
the chipsets we know suffer from the delay and the resulting coherency
issue.

Similarly, the same inconsistent coherency fouls up our ABI promise that
a write into a mmap_gtt is immediately visible to others. Since the HW
has made that a lie, let userspace know when that contract is broken.
(Not that userspace would want to use mmap_gtt on those chipsets for
other performance reasons...)

Testcase: igt/drv_selftest/live_coherency
Testcase: igt/gem_mmap_gtt/coherency


The list of platforms to mark with false was compiled from the test results?

But then before this patch workaround was applied everywhere - so if the 
test was failing even then - that means workaround wasn't sufficient?



Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
---
Resend with --function-context
-Chris
---
  drivers/gpu/drm/i915/i915_drv.c  |  3 +++
  drivers/gpu/drm/i915/i915_gem.c  |  5 +
  drivers/gpu/drm/i915/i915_pci.c  | 10 ++
  drivers/gpu/drm/i915/intel_device_info.h |  1 +
  include/uapi/drm/i915_drm.h  | 22 ++
  5 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f8cfd16be534..18a45e7a3d7c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -302,152 +302,155 @@ static void intel_detect_pch(struct drm_i915_private 
*dev_priv)
  static int i915_getparam_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_priv)
  {
struct drm_i915_private *dev_priv = to_i915(dev);
struct pci_dev *pdev = dev_priv->drm.pdev;
drm_i915_getparam_t *param = data;
int value;
  
  	switch (param->param) {

case I915_PARAM_IRQ_ACTIVE:
case I915_PARAM_ALLOW_BATCHBUFFER:
case I915_PARAM_LAST_DISPATCH:
case I915_PARAM_HAS_EXEC_CONSTANTS:
/* Reject all old ums/dri params. */
return -ENODEV;
case I915_PARAM_CHIPSET_ID:
value = pdev->device;
break;
case I915_PARAM_REVISION:
value = pdev->revision;
break;
case I915_PARAM_NUM_FENCES_AVAIL:
value = dev_priv->num_fence_regs;
break;
case I915_PARAM_HAS_OVERLAY:
value = dev_priv->overlay ? 1 : 0;
break;
case I915_PARAM_HAS_BSD:
value = !!dev_priv->engine[VCS];
break;
case I915_PARAM_HAS_BLT:
value = !!dev_priv->engine[BCS];
break;
case I915_PARAM_HAS_VEBOX:
value = !!dev_priv->engine[VECS];
break;
case I915_PARAM_HAS_BSD2:
value = !!dev_priv->engine[VCS2];
break;
case I915_PARAM_HAS_LLC:
value = HAS_LLC(dev_priv);
break;
case I915_PARAM_HAS_WT:
value = HAS_WT(dev_priv);
break;
case I915_PARAM_HAS_ALIASING_PPGTT:
value = USES_PPGTT(dev_priv);
break;
case I915_PARAM_HAS_SEMAPHORES:
value = HAS_LEGACY_SEMAPHORES(dev_priv);
break;
case I915_PARAM_HAS_SECURE_BATCHES:
value = capable(CAP_SYS_ADMIN);
break;
case I915_PARAM_CMD_PARSER_VERSION:
value = i915_cmd_parser_get_version(dev_priv);
break;
case I915_PARAM_SUBSLICE_TOTAL:
value = sseu_subslice_total(_INFO(dev_priv)->sseu);
if (!value)
return -ENODEV;
break;
case I915_PARAM_EU_TOTAL:
value = INTEL_INFO(dev_priv)->sseu.eu_total;
if (!value)
return -ENODEV;
break;
case I915_PARAM_HAS_GPU_RESET:
value = i915_modparams.enable_hangcheck &&
intel_has_gpu_reset(dev_priv);
if (value && intel_has_reset_engine(dev_priv))
value = 2;
break;
case I915_PARAM_HAS_RESOURCE_STREAMER:
value = HAS_RESOURCE_STREAMER(dev_priv);
break;
case I915_PARAM_HAS_POOLED_EU:
value = HAS_POOLED_EU(dev_priv);
break;
case I915_PARAM_MIN_EU_IN_POOL:
value = INTEL_INFO(dev_priv)->sseu.min_eu_in_pool;
break;
case I915_PARAM_HUC_STATUS:
value = intel_huc_check_status(_priv->huc);
if (value < 0)

Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Chris Wilson
Quoting Lis, Tomasz (2018-07-20 15:41:33)
> 
> 
> On 2018-07-20 12:19, Chris Wilson wrote:
> > Not all chipsets have an internal buffer delaying the visibility of
> > writes via the GGTT being visible by other physical paths, but we use a
> > very heavy workaround for all. We only need to apply that workarounds to
> > the chipsets we know suffer from the delay and the resulting coherency
> > issue.
> >
> > Similarly, the same inconsistent coherency fouls up our ABI promise that
> > a write into a mmap_gtt is immediately visible to others. Since the HW
> > has made that a lie, let userspace know when that contract is broken.
> > (Not that userspace would want to use mmap_gtt on those chipsets for
> > other performance reasons...)
> >
> > Testcase: igt/drv_selftest/live_coherency
> > Testcase: igt/gem_mmap_gtt/coherency
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
> > Signed-off-by: Chris Wilson 
> > Cc: Joonas Lahtinen 
> Is there any mention of this bug/feature in bspec? Would be nice to have 
> a reference.

I know that some HW bod has come across it, as they recommended the same
for ordering mem access between us using the GTT for iomaps and GuC
submission. All I know is the background chatter... We don't seem to
have a named w/a; iirc it was just concluded that was the way it was
meant to work ;)
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Lis, Tomasz



On 2018-07-20 12:19, Chris Wilson wrote:

Not all chipsets have an internal buffer delaying the visibility of
writes via the GGTT being visible by other physical paths, but we use a
very heavy workaround for all. We only need to apply that workarounds to
the chipsets we know suffer from the delay and the resulting coherency
issue.

Similarly, the same inconsistent coherency fouls up our ABI promise that
a write into a mmap_gtt is immediately visible to others. Since the HW
has made that a lie, let userspace know when that contract is broken.
(Not that userspace would want to use mmap_gtt on those chipsets for
other performance reasons...)

Testcase: igt/drv_selftest/live_coherency
Testcase: igt/gem_mmap_gtt/coherency
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Is there any mention of this bug/feature in bspec? Would be nice to have 
a reference.


Reviewed-by: Tomasz Lis 

---
Resend with --function-context
-Chris
---
  drivers/gpu/drm/i915/i915_drv.c  |  3 +++
  drivers/gpu/drm/i915/i915_gem.c  |  5 +
  drivers/gpu/drm/i915/i915_pci.c  | 10 ++
  drivers/gpu/drm/i915/intel_device_info.h |  1 +
  include/uapi/drm/i915_drm.h  | 22 ++
  5 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f8cfd16be534..18a45e7a3d7c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -302,152 +302,155 @@ static void intel_detect_pch(struct drm_i915_private 
*dev_priv)
  static int i915_getparam_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_priv)
  {
struct drm_i915_private *dev_priv = to_i915(dev);
struct pci_dev *pdev = dev_priv->drm.pdev;
drm_i915_getparam_t *param = data;
int value;
  
  	switch (param->param) {

case I915_PARAM_IRQ_ACTIVE:
case I915_PARAM_ALLOW_BATCHBUFFER:
case I915_PARAM_LAST_DISPATCH:
case I915_PARAM_HAS_EXEC_CONSTANTS:
/* Reject all old ums/dri params. */
return -ENODEV;
case I915_PARAM_CHIPSET_ID:
value = pdev->device;
break;
case I915_PARAM_REVISION:
value = pdev->revision;
break;
case I915_PARAM_NUM_FENCES_AVAIL:
value = dev_priv->num_fence_regs;
break;
case I915_PARAM_HAS_OVERLAY:
value = dev_priv->overlay ? 1 : 0;
break;
case I915_PARAM_HAS_BSD:
value = !!dev_priv->engine[VCS];
break;
case I915_PARAM_HAS_BLT:
value = !!dev_priv->engine[BCS];
break;
case I915_PARAM_HAS_VEBOX:
value = !!dev_priv->engine[VECS];
break;
case I915_PARAM_HAS_BSD2:
value = !!dev_priv->engine[VCS2];
break;
case I915_PARAM_HAS_LLC:
value = HAS_LLC(dev_priv);
break;
case I915_PARAM_HAS_WT:
value = HAS_WT(dev_priv);
break;
case I915_PARAM_HAS_ALIASING_PPGTT:
value = USES_PPGTT(dev_priv);
break;
case I915_PARAM_HAS_SEMAPHORES:
value = HAS_LEGACY_SEMAPHORES(dev_priv);
break;
case I915_PARAM_HAS_SECURE_BATCHES:
value = capable(CAP_SYS_ADMIN);
break;
case I915_PARAM_CMD_PARSER_VERSION:
value = i915_cmd_parser_get_version(dev_priv);
break;
case I915_PARAM_SUBSLICE_TOTAL:
value = sseu_subslice_total(_INFO(dev_priv)->sseu);
if (!value)
return -ENODEV;
break;
case I915_PARAM_EU_TOTAL:
value = INTEL_INFO(dev_priv)->sseu.eu_total;
if (!value)
return -ENODEV;
break;
case I915_PARAM_HAS_GPU_RESET:
value = i915_modparams.enable_hangcheck &&
intel_has_gpu_reset(dev_priv);
if (value && intel_has_reset_engine(dev_priv))
value = 2;
break;
case I915_PARAM_HAS_RESOURCE_STREAMER:
value = HAS_RESOURCE_STREAMER(dev_priv);
break;
case I915_PARAM_HAS_POOLED_EU:
value = HAS_POOLED_EU(dev_priv);
break;
case I915_PARAM_MIN_EU_IN_POOL:
value = INTEL_INFO(dev_priv)->sseu.min_eu_in_pool;
break;
case I915_PARAM_HUC_STATUS:
value = intel_huc_check_status(_priv->huc);
if (value < 0)
return value;
break;
case I915_PARAM_MMAP_GTT_VERSION:
/* Though we've 

[Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Chris Wilson
Not all chipsets have an internal buffer delaying the visibility of
writes via the GGTT being visible by other physical paths, but we use a
very heavy workaround for all. We only need to apply that workarounds to
the chipsets we know suffer from the delay and the resulting coherency
issue.

Similarly, the same inconsistent coherency fouls up our ABI promise that
a write into a mmap_gtt is immediately visible to others. Since the HW
has made that a lie, let userspace know when that contract is broken.
(Not that userspace would want to use mmap_gtt on those chipsets for
other performance reasons...)

Testcase: igt/drv_selftest/live_coherency
Testcase: igt/gem_mmap_gtt/coherency
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
---
Resend with --function-context
-Chris
---
 drivers/gpu/drm/i915/i915_drv.c  |  3 +++
 drivers/gpu/drm/i915/i915_gem.c  |  5 +
 drivers/gpu/drm/i915/i915_pci.c  | 10 ++
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 include/uapi/drm/i915_drm.h  | 22 ++
 5 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f8cfd16be534..18a45e7a3d7c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -302,152 +302,155 @@ static void intel_detect_pch(struct drm_i915_private 
*dev_priv)
 static int i915_getparam_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_priv)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct pci_dev *pdev = dev_priv->drm.pdev;
drm_i915_getparam_t *param = data;
int value;
 
switch (param->param) {
case I915_PARAM_IRQ_ACTIVE:
case I915_PARAM_ALLOW_BATCHBUFFER:
case I915_PARAM_LAST_DISPATCH:
case I915_PARAM_HAS_EXEC_CONSTANTS:
/* Reject all old ums/dri params. */
return -ENODEV;
case I915_PARAM_CHIPSET_ID:
value = pdev->device;
break;
case I915_PARAM_REVISION:
value = pdev->revision;
break;
case I915_PARAM_NUM_FENCES_AVAIL:
value = dev_priv->num_fence_regs;
break;
case I915_PARAM_HAS_OVERLAY:
value = dev_priv->overlay ? 1 : 0;
break;
case I915_PARAM_HAS_BSD:
value = !!dev_priv->engine[VCS];
break;
case I915_PARAM_HAS_BLT:
value = !!dev_priv->engine[BCS];
break;
case I915_PARAM_HAS_VEBOX:
value = !!dev_priv->engine[VECS];
break;
case I915_PARAM_HAS_BSD2:
value = !!dev_priv->engine[VCS2];
break;
case I915_PARAM_HAS_LLC:
value = HAS_LLC(dev_priv);
break;
case I915_PARAM_HAS_WT:
value = HAS_WT(dev_priv);
break;
case I915_PARAM_HAS_ALIASING_PPGTT:
value = USES_PPGTT(dev_priv);
break;
case I915_PARAM_HAS_SEMAPHORES:
value = HAS_LEGACY_SEMAPHORES(dev_priv);
break;
case I915_PARAM_HAS_SECURE_BATCHES:
value = capable(CAP_SYS_ADMIN);
break;
case I915_PARAM_CMD_PARSER_VERSION:
value = i915_cmd_parser_get_version(dev_priv);
break;
case I915_PARAM_SUBSLICE_TOTAL:
value = sseu_subslice_total(_INFO(dev_priv)->sseu);
if (!value)
return -ENODEV;
break;
case I915_PARAM_EU_TOTAL:
value = INTEL_INFO(dev_priv)->sseu.eu_total;
if (!value)
return -ENODEV;
break;
case I915_PARAM_HAS_GPU_RESET:
value = i915_modparams.enable_hangcheck &&
intel_has_gpu_reset(dev_priv);
if (value && intel_has_reset_engine(dev_priv))
value = 2;
break;
case I915_PARAM_HAS_RESOURCE_STREAMER:
value = HAS_RESOURCE_STREAMER(dev_priv);
break;
case I915_PARAM_HAS_POOLED_EU:
value = HAS_POOLED_EU(dev_priv);
break;
case I915_PARAM_MIN_EU_IN_POOL:
value = INTEL_INFO(dev_priv)->sseu.min_eu_in_pool;
break;
case I915_PARAM_HUC_STATUS:
value = intel_huc_check_status(_priv->huc);
if (value < 0)
return value;
break;
case I915_PARAM_MMAP_GTT_VERSION:
/* Though we've started our numbering from 1, and so class all
 * earlier versions as 0, in effect their value is undefined as
 * the ioctl will 

Re: [Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Ville Syrjälä
On Fri, Jul 20, 2018 at 08:59:31AM +0100, Chris Wilson wrote:
> Not all chipsets have an internal buffer delaying the visibility of
> writes via the GGTT being visible by other physical paths, but we use a
> very heavy workaround for all. We only need to apply that workarounds to
> the chipsets we know suffer from the delay and the resulting coherency
> issue.
> 
> Similarly, the same inconsistent coherency fouls up our ABI promise that
> a write into a mmap_gtt is immediately visible to others. Since the HW
> has made that a lie, let userspace know when that contract is broken.
> (Not that userspace would want to use mmap_gtt on those chipsets for
> other performance reasons...)
> 
> Testcase: igt/drv_selftest/live_coherency
> Testcase: igt/gem_mmap_gtt/coherency
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
> Signed-off-by: Chris Wilson 
> Cc: Joonas Lahtinen 
> ---
>  drivers/gpu/drm/i915/i915_drv.c  |  3 +++
>  drivers/gpu/drm/i915/i915_gem.c  |  5 +
>  drivers/gpu/drm/i915/i915_pci.c  | 10 ++
>  drivers/gpu/drm/i915/intel_device_info.h |  1 +
>  include/uapi/drm/i915_drm.h  | 22 ++
>  5 files changed, 41 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index f8cfd16be534..18a45e7a3d7c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -441,6 +441,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, 
> void *data,
>   case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
>   value = 1000 * INTEL_INFO(dev_priv)->cs_timestamp_frequency_khz;
>   break;
> + case I915_PARAM_MMAP_GTT_COHERENT:
> + value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
> + break;
>   default:
>   DRM_DEBUG("Unknown parameter %d\n", param->param);
>   return -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fcc73a6ab503..8b52cb768a67 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -802,6 +802,11 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private 
> *dev_priv)
>* that was!).
>*/
>  
> + wmb();
> +
> + if (INTEL_INFO(dev_priv)->has_coherent_ggtt)
> + return;
> +
>   i915_gem_chipset_flush(dev_priv);
>  
>   intel_runtime_pm_get(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 6a4d1388ad2d..e443fe44da3a 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -74,6 +74,7 @@

? These macros make me sad sometimes. Increase the context size a bit
for this one?

>   .unfenced_needs_alignment = 1, \
>   .ring_mask = RENDER_RING, \
>   .has_snoop = true, \
> + .has_coherent_ggtt = false, \
>   GEN_DEFAULT_PIPEOFFSETS, \
>   GEN_DEFAULT_PAGE_SIZES, \
>   CURSOR_OFFSETS
> @@ -110,6 +111,7 @@ static const struct intel_device_info intel_i865g_info = {
>   .has_gmch_display = 1, \
>   .ring_mask = RENDER_RING, \
>   .has_snoop = true, \
> + .has_coherent_ggtt = true, \
>   GEN_DEFAULT_PIPEOFFSETS, \
>   GEN_DEFAULT_PAGE_SIZES, \
>   CURSOR_OFFSETS
> @@ -117,6 +119,7 @@ static const struct intel_device_info intel_i865g_info = {
>  static const struct intel_device_info intel_i915g_info = {
>   GEN3_FEATURES,
>   PLATFORM(INTEL_I915G),
> + .has_coherent_ggtt = false,
>   .cursor_needs_physical = 1,
>   .has_overlay = 1, .overlay_needs_physical = 1,
>   .hws_needs_physical = 1,
> @@ -178,6 +181,7 @@ static const struct intel_device_info intel_pineview_info 
> = {
>   .has_gmch_display = 1, \
>   .ring_mask = RENDER_RING, \
>   .has_snoop = true, \
> + .has_coherent_ggtt = true, \
>   GEN_DEFAULT_PIPEOFFSETS, \
>   GEN_DEFAULT_PAGE_SIZES, \
>   CURSOR_OFFSETS
> @@ -220,6 +224,7 @@ static const struct intel_device_info intel_gm45_info = {
>   .has_hotplug = 1, \
>   .ring_mask = RENDER_RING | BSD_RING, \
>   .has_snoop = true, \
> + .has_coherent_ggtt = true, \
>   /* ilk does support rc6, but we do not implement [power] contexts */ \
>   .has_rc6 = 0, \
>   GEN_DEFAULT_PIPEOFFSETS, \
> @@ -243,6 +248,7 @@ static const struct intel_device_info 
> intel_ironlake_m_info = {
>   .has_hotplug = 1, \
>   .has_fbc = 1, \
>   .ring_mask = RENDER_RING | BSD_RING | BLT_RING, \
> + .has_coherent_ggtt = true, \
>   .has_llc = 1, \
>   .has_rc6 = 1, \
>   .has_rc6p = 1, \
> @@ -287,6 +293,7 @@ static const struct intel_device_info 
> intel_sandybridge_m_gt2_info = {
>   .has_hotplug = 1, \
>   .has_fbc = 1, \
>   .ring_mask = RENDER_RING | BSD_RING | BLT_RING, \
> + .has_coherent_ggtt = true, \
>   .has_llc = 1, \
>   .has_rc6 = 1, \
>   .has_rc6p = 1, \
> @@ -347,6 +354,7 @@ static const struct 

[Intel-gfx] [PATCH] drm/i915: Only force GGTT coherency w/a on required chipsets

2018-07-20 Thread Chris Wilson
Not all chipsets have an internal buffer delaying the visibility of
writes via the GGTT being visible by other physical paths, but we use a
very heavy workaround for all. We only need to apply that workarounds to
the chipsets we know suffer from the delay and the resulting coherency
issue.

Similarly, the same inconsistent coherency fouls up our ABI promise that
a write into a mmap_gtt is immediately visible to others. Since the HW
has made that a lie, let userspace know when that contract is broken.
(Not that userspace would want to use mmap_gtt on those chipsets for
other performance reasons...)

Testcase: igt/drv_selftest/live_coherency
Testcase: igt/gem_mmap_gtt/coherency
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.c  |  3 +++
 drivers/gpu/drm/i915/i915_gem.c  |  5 +
 drivers/gpu/drm/i915/i915_pci.c  | 10 ++
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 include/uapi/drm/i915_drm.h  | 22 ++
 5 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f8cfd16be534..18a45e7a3d7c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -441,6 +441,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, void 
*data,
case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
value = 1000 * INTEL_INFO(dev_priv)->cs_timestamp_frequency_khz;
break;
+   case I915_PARAM_MMAP_GTT_COHERENT:
+   value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fcc73a6ab503..8b52cb768a67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -802,6 +802,11 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private 
*dev_priv)
 * that was!).
 */
 
+   wmb();
+
+   if (INTEL_INFO(dev_priv)->has_coherent_ggtt)
+   return;
+
i915_gem_chipset_flush(dev_priv);
 
intel_runtime_pm_get(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 6a4d1388ad2d..e443fe44da3a 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -74,6 +74,7 @@
.unfenced_needs_alignment = 1, \
.ring_mask = RENDER_RING, \
.has_snoop = true, \
+   .has_coherent_ggtt = false, \
GEN_DEFAULT_PIPEOFFSETS, \
GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
@@ -110,6 +111,7 @@ static const struct intel_device_info intel_i865g_info = {
.has_gmch_display = 1, \
.ring_mask = RENDER_RING, \
.has_snoop = true, \
+   .has_coherent_ggtt = true, \
GEN_DEFAULT_PIPEOFFSETS, \
GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
@@ -117,6 +119,7 @@ static const struct intel_device_info intel_i865g_info = {
 static const struct intel_device_info intel_i915g_info = {
GEN3_FEATURES,
PLATFORM(INTEL_I915G),
+   .has_coherent_ggtt = false,
.cursor_needs_physical = 1,
.has_overlay = 1, .overlay_needs_physical = 1,
.hws_needs_physical = 1,
@@ -178,6 +181,7 @@ static const struct intel_device_info intel_pineview_info = 
{
.has_gmch_display = 1, \
.ring_mask = RENDER_RING, \
.has_snoop = true, \
+   .has_coherent_ggtt = true, \
GEN_DEFAULT_PIPEOFFSETS, \
GEN_DEFAULT_PAGE_SIZES, \
CURSOR_OFFSETS
@@ -220,6 +224,7 @@ static const struct intel_device_info intel_gm45_info = {
.has_hotplug = 1, \
.ring_mask = RENDER_RING | BSD_RING, \
.has_snoop = true, \
+   .has_coherent_ggtt = true, \
/* ilk does support rc6, but we do not implement [power] contexts */ \
.has_rc6 = 0, \
GEN_DEFAULT_PIPEOFFSETS, \
@@ -243,6 +248,7 @@ static const struct intel_device_info intel_ironlake_m_info 
= {
.has_hotplug = 1, \
.has_fbc = 1, \
.ring_mask = RENDER_RING | BSD_RING | BLT_RING, \
+   .has_coherent_ggtt = true, \
.has_llc = 1, \
.has_rc6 = 1, \
.has_rc6p = 1, \
@@ -287,6 +293,7 @@ static const struct intel_device_info 
intel_sandybridge_m_gt2_info = {
.has_hotplug = 1, \
.has_fbc = 1, \
.ring_mask = RENDER_RING | BSD_RING | BLT_RING, \
+   .has_coherent_ggtt = true, \
.has_llc = 1, \
.has_rc6 = 1, \
.has_rc6p = 1, \
@@ -347,6 +354,7 @@ static const struct intel_device_info intel_valleyview_info 
= {
.has_aliasing_ppgtt = 1,
.has_full_ppgtt = 1,
.has_snoop = true,
+   .has_coherent_ggtt = false,
.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
.display_mmio_offset =