Re: [RFC PATCH] dma-buf: Fix dma reservation with zero fences

2023-12-15 Thread Mika Kuoppala
Christian König  writes:

> Am 14.12.23 um 13:08 schrieb Mika Kuoppala:
>> Driver can initialize without any fences. If so
>> roundup_power_of_two will overflow as it will try to
>> subtract one from initial value before shift,
>> (1 << fls_long(-1)).
>
> Ah, yes that reminds me that I wanted to take care of this as well.
>
> But solving it like this is the wrong approach. A couple of driver 
> calculate the number of fences needed based on userspace input. If that 
> results in zero then you certainly have a bug in your driver.
>
> Since calling dma_resv_reserve_fences() with num_fences==0 does make 
> much sense we should really just warn about it and just return early 
> from the function.

Sounds like a plan. I will fix our driver to just omit the call
if no fences (yet).

Thanks,
-Mika

> Regards,
> Christian.
>
>
>>
>> Fix this using default (4) if num_fences is zero.
>>
>> Another more radical option would be to return error on
>> zero but that would need a callsite comb.
>>
>> Caught-by: UBSAN
>> Cc: Christian König 
>> Cc: Thomas Hellström 
>> Signed-off-by: Mika Kuoppala 
>> ---
>>   drivers/dma-buf/dma-resv.c | 5 -
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
>> index 38b4110378de..f5ad3ecd0d4f 100644
>> --- a/drivers/dma-buf/dma-resv.c
>> +++ b/drivers/dma-buf/dma-resv.c
>> @@ -192,7 +192,10 @@ int dma_resv_reserve_fences(struct dma_resv *obj, 
>> unsigned int num_fences)
>>  return 0;
>>  max = max(old->num_fences + num_fences, old->max_fences * 2);
>>  } else {
>> -max = max(4ul, roundup_pow_of_two(num_fences));
>> +if (num_fences)
>> +max = max(4ul, roundup_pow_of_two(num_fences));
>> +else
>> +max = 4ul;
>>  }
>>   
>>  new = dma_resv_list_alloc(max);


[RFC PATCH] dma-buf: Fix dma reservation with zero fences

2023-12-14 Thread Mika Kuoppala
Driver can initialize without any fences. If so
roundup_power_of_two will overflow as it will try to
subtract one from initial value before shift,
(1 << fls_long(-1)).

Fix this using default (4) if num_fences is zero.

Another more radical option would be to return error on
zero but that would need a callsite comb.

Caught-by: UBSAN
Cc: Christian König 
Cc: Thomas Hellström 
Signed-off-by: Mika Kuoppala 
---
 drivers/dma-buf/dma-resv.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 38b4110378de..f5ad3ecd0d4f 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -192,7 +192,10 @@ int dma_resv_reserve_fences(struct dma_resv *obj, unsigned 
int num_fences)
return 0;
max = max(old->num_fences + num_fences, old->max_fences * 2);
} else {
-   max = max(4ul, roundup_pow_of_two(num_fences));
+   if (num_fences)
+   max = max(4ul, roundup_pow_of_two(num_fences));
+   else
+   max = 4ul;
}
 
new = dma_resv_list_alloc(max);
-- 
2.34.1



Re: [Intel-gfx] [PATCH] drm/i915/mtl: Don't set PIPE_CONTROL_FLUSH_L3

2023-10-17 Thread Mika Kuoppala
Vinay Belgaumkar  writes:

> This bit does not cause an explicit L3 flush. We already use
> PIPE_CONTROL_DC_FLUSH_ENABLE for that purpose.
>
> Cc: Nirmoy Das 
> Cc: Mikka Kuoppala 
s/kk/k

> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index ba4c2422b340..abbc02f3e66e 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -247,6 +247,7 @@ static int mtl_dummy_pipe_control(struct i915_request *rq)
>  int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  {
>   struct intel_engine_cs *engine = rq->engine;
> + struct intel_gt *gt = rq->engine->gt;
>  
>   /*
>* On Aux CCS platforms the invalidation of the Aux
> @@ -278,7 +279,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
> mode)
>* deals with Protected Memory which is not needed for
>* AUX CCS invalidation and lead to unwanted side effects.
>*/
> - if (mode & EMIT_FLUSH)
> + if ((mode & EMIT_FLUSH) &&
> + !(IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))))
>   bit_group_1 |= PIPE_CONTROL_FLUSH_L3;

Yes its best to apply for MTL first.

Acked-by: Mika Kuoppala   
>   bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> @@ -812,12 +814,14 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request 
> *rq, u32 *cs)
>   u32 flags = (PIPE_CONTROL_CS_STALL |
>PIPE_CONTROL_TLB_INVALIDATE |
>PIPE_CONTROL_TILE_CACHE_FLUSH |
> -  PIPE_CONTROL_FLUSH_L3 |
>PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
>PIPE_CONTROL_DEPTH_CACHE_FLUSH |
>PIPE_CONTROL_DC_FLUSH_ENABLE |
>PIPE_CONTROL_FLUSH_ENABLE);
>  
> + if (!(IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71
> + flags |= PIPE_CONTROL_FLUSH_L3;
> +
>   /* Wa_14016712196 */
>   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)) || 
> IS_DG2(i915))
>   /* dummy PIPE_CONTROL + depth flush */
> -- 
> 2.38.1


Re: [Intel-gfx] [PATCH] drm/i915: Simplify internal helper function signature

2022-11-10 Thread Mika Kuoppala
Tvrtko Ursulin  writes:

> From: Tvrtko Ursulin 
>
> Since we are now storing the GT backpointer in the wa list we can drop the
> explicit struct intel_gt * argument to wa_list_apply.

There is room for more dropping, all the platform lists inits.
>
> Signed-off-by: Tvrtko Ursulin 
> Cc: Andrzej Hajda 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 07bf115029a0..4db04761d5ea 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1717,9 +1717,9 @@ wa_verify(struct intel_gt *gt, const struct i915_wa 
> *wa, u32 cur,
>   return true;
>  }
>  
> -static void
> -wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
> +static void wa_list_apply(const struct i915_wa_list *wal)
>  {
> + struct intel_gt *gt = wal->gt;
>   struct intel_uncore *uncore = gt->uncore;
>   enum forcewake_domains fw;
>   unsigned long flags;
> @@ -1755,7 +1755,7 @@ wa_list_apply(struct intel_gt *gt, const struct 
> i915_wa_list *wal)
>   intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
>   intel_uncore_read_fw(uncore, wa->reg);
>  
> - wa_verify(wal->gt, wa, val, wal->name, "application");
> + wa_verify(gt, wa, val, wal->name, "application");
>   }
>   }
>  
> @@ -1765,7 +1765,7 @@ wa_list_apply(struct intel_gt *gt, const struct 
> i915_wa_list *wal)
>  
>  void intel_gt_apply_workarounds(struct intel_gt *gt)
>  {
> - wa_list_apply(gt, >wa_list);
> + wa_list_apply(>wa_list);
>  }
>  
>  static bool wa_list_verify(struct intel_gt *gt,
> @@ -3025,7 +3025,7 @@ void intel_engine_init_workarounds(struct 
> intel_engine_cs *engine)
>  
>  void intel_engine_apply_workarounds(struct intel_engine_cs *engine)
>  {
> - wa_list_apply(engine->gt, >wa_list);
> + wa_list_apply(>wa_list);
>  }
>  
>  static const struct i915_range mcr_ranges_gen8[] = {
> -- 
> 2.34.1


Re: [PATCH 2/2] drm/i915: Don't use DRM_DEBUG_WARN_ON for ring unexpectedly not idle

2022-05-05 Thread Mika Kuoppala
Tvrtko Ursulin  writes:

> From: Tvrtko Ursulin 
>
> DRM_DEBUG_WARN_ON should only be used when we are certain CI is guaranteed
> to exercise a certain code path, so in case of values coming from MMIO
> reads we cannot be sure CI will have all the possible SKUs and parts, or
> that it will catch all possible error conditions. Use drm_warn instead.
>
> Signed-off-by: Tvrtko Ursulin 
> Cc: Mika Kuoppala 
> Cc: Jani Nikula 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/gt/intel_ring_submission.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c 
> b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> index 5423bfd301ad..f8f279a195c0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> @@ -117,7 +117,9 @@ static void flush_cs_tlb(struct intel_engine_cs *engine)
>   return;
>  
>   /* ring should be idle before issuing a sync flush*/
> - GEM_DEBUG_WARN_ON((ENGINE_READ(engine, RING_MI_MODE) & MODE_IDLE) == 0);
> + if ((ENGINE_READ(engine, RING_MI_MODE) & MODE_IDLE) == 0)
> + drm_warn(>i915->drm, "%s not idle before sync flush!\n",
> +  engine->name);
>  
>   ENGINE_WRITE_FW(engine, RING_INSTPM,
>   _MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
> -- 
> 2.32.0


Re: [PATCH] drm/i915/gtt: add some flushing for the 64K GTT path

2021-09-03 Thread Mika Kuoppala
Matthew Auld  writes:

> If we need to mark the PDE as operating in 64K GTT mode, we should be
> paranoid and flush the extra writes, like we already do for the PTEs. On
> some platforms the clflush can apparently add the just the right amount
> of magical delay to force the GPU to see the updated entry.
>
> Signed-off-by: Matthew Auld 
> Cc: Mika Kuoppala 

Makes sense to follow the same pattern as the other writes.

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 6e0e52eeb87a..6a5af995f5b1 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -548,6 +548,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
> I915_GTT_PAGE_SIZE_2M {
>   vaddr = px_vaddr(pd);
>   vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
> + clflush_cache_range(vaddr, PAGE_SIZE);
>   page_size = I915_GTT_PAGE_SIZE_64K;
>  
>   /*
> @@ -568,6 +569,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   for (i = 1; i < index; i += 16)
>   memset64(vaddr + i, encode, 15);
>  
> + clflush_cache_range(vaddr, PAGE_SIZE);
>   }
>   }
>  
> -- 
> 2.26.3


Re: [PATCH v2 1/2] drm/i915: document caching related bits

2021-08-02 Thread Mika Kuoppala
Matthew Auld  writes:

> Try to document the object caching related bits, like cache_coherent and
> cache_dirty.
>
> v2(Ville):
>  - As pointed out by Ville, fix the completely incorrect assumptions
>about the "partial" coherency on shared LLC platforms.
>
> Suggested-by: Daniel Vetter 
> Signed-off-by: Matthew Auld 
> Cc: Ville Syrjälä 
> Cc: Mika Kuoppala 
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 173 +-
>  drivers/gpu/drm/i915/i915_drv.h   |   9 -
>  2 files changed, 169 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index ef3de2ae9723..a809424bc8c1 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -92,6 +92,76 @@ struct drm_i915_gem_object_ops {
>   const char *name; /* friendly name for debug, e.g. lockdep classes */
>  };
>  
> +/**
> + * enum i915_cache_level - The supported GTT caching values for system memory
> + * pages.
> + *
> + * These translate to some special GTT PTE bits when binding pages into some
> + * address space. It also determines whether an object, or rather its pages 
> are
> + * coherent with the GPU, when also reading or writing through the CPU cache
> + * with those pages.
> + *
> + * Userspace can also control this through struct drm_i915_gem_caching.
> + */
> +enum i915_cache_level {
> + /**
> +  * @I915_CACHE_NONE:
> +  *
> +  * Not coherent with the CPU cache. If the cache is dirty and we need
> +  * the underlying pages to be coherent with some later GPU access then
> +  * we need to manually flush the pages.
> +  *
> +  * Note that on shared LLC platforms reads and writes through the CPU
> +  * cache are still coherent even with this setting. See also
> +  * _i915_gem_object.cache_coherent for more details.
> +  *
> +  * Note that on platforms with a shared LLC this should ideally only be
> +  * used for scanout surfaces, otherwise we end up over-flushing in some
> +  * places.
> +  */
> + I915_CACHE_NONE = 0,
> + /**
> +  * @I915_CACHE_LLC:
> +  *
> +  * Coherent with the CPU cache. If the cache is dirty, then the GPU will
> +  * ensure that access remains coherent, when both reading and writing
> +  * through the CPU cache.
> +  *
> +  * Not used for scanout surfaces.
> +  *
> +  * Applies to both platforms with shared LLC(HAS_LLC), and snooping
> +  * based platforms(HAS_SNOOP).
> +  *
> +  * This should be the default for platforms which share the LLC with the
> +  * CPU. The only exception is scanout objects, where the display engine
> +  * is not coherent with the LLC. For such objects I915_CACHE_NONE or
> +  * I915_CACHE_WT should be used.
> +  */
> + I915_CACHE_LLC,
> + /**
> +  * @I915_CACHE_L3_LLC:
> +  *
> +  * Explicitly enable the Gfx L3 cache, with snooped LLC.
> +  *
> +  * The Gfx L3 sits between the domain specific caches, e.g
> +  * sampler/render caches, and the larger LLC. LLC is coherent with the
> +  * GPU, but L3 is only visible to the GPU, so likely needs to be flushed
> +  * when the workload completes.
> +  *
> +  * Not used for scanout surfaces.
> +  *
> +  * Only exposed on some gen7 + GGTT. More recent hardware has dropped
> +  * this.
> +  */

This is stellar. Thanks!
-Mika

> + I915_CACHE_L3_LLC,
> + /**
> +  * @I915_CACHE_WT:
> +  *
> +  * hsw:gt3e Write-through for scanout buffers.
> +  */
> + I915_CACHE_WT,
> +};
> +
>  enum i915_map_type {
>   I915_MAP_WB = 0,
>   I915_MAP_WC,
> @@ -228,14 +298,109 @@ struct drm_i915_gem_object {
>   unsigned int mem_flags;
>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
>  #define I915_BO_FLAG_IOMEM   BIT(1) /* Object backed by IO memory */
> - /*
> -  * Is the object to be mapped as read-only to the GPU
> -  * Only honoured if hardware has relevant pte bit
> + /**
> +  * @cache_level: The desired GTT caching level.
> +  *
> +  * See enum i915_cache_level for possible values, along with what
> +  * each does.
>*/
>   unsigned int cache_level:3;
> - unsigned int cache_coherent:2;
> + /**
> +  * @cache_coherent:
> +  *
> +  * Track whether the pages are coherent with the GPU if reading or
> +  * writing through the CPU caches. The largely depends on the
> +  * @cache_level set

Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Mika Kuoppala
Matthew Auld  writes:

> Try to document the object caching related bits, like cache_coherent and
> cache_dirty.
>
> Suggested-by: Daniel Vetter 
> Signed-off-by: Matthew Auld 
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 135 +-
>  drivers/gpu/drm/i915/i915_drv.h   |   9 --
>  2 files changed, 131 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index ef3de2ae9723..02c3529b774c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -92,6 +92,57 @@ struct drm_i915_gem_object_ops {
>   const char *name; /* friendly name for debug, e.g. lockdep classes */
>  };
>  
> +/**
> + * enum i915_cache_level - The supported GTT caching values for system memory
> + * pages.
> + *
> + * These translate to some special GTT PTE bits when binding pages into some
> + * address space. It also determines whether an object, or rather its pages 
> are
> + * coherent with the GPU, when also reading or writing through the CPU cache
> + * with those pages.
> + *
> + * Userspace can also control this through struct drm_i915_gem_caching.
> + */
> +enum i915_cache_level {
> + /**
> +  * @I915_CACHE_NONE:
> +  *
> +  * Not coherent with the CPU cache. If the cache is dirty and we need
> +  * the underlying pages to be coherent with some later GPU access then
> +  * we need to manually flush the pages.
> +  *
> +  * Note that on shared-LLC platforms reads through the CPU cache are
> +  * still coherent even with this setting. See also
> +  * I915_BO_CACHE_COHERENT_FOR_READ for more details.
> +  */
> + I915_CACHE_NONE = 0,
> + /**
> +  * @I915_CACHE_LLC:
> +  *
> +  * Coherent with the CPU cache. If the cache is dirty, then the GPU will
> +  * ensure that access remains coherent, when both reading and writing
> +  * through the CPU cache.
> +  *
> +  * Applies to both platforms with shared-LLC(HAS_LLC), and snooping
> +  * based platforms(HAS_SNOOP).
> +  */
> + I915_CACHE_LLC,
> + /**
> +  * @I915_CACHE_L3_LLC:
> +  *
> +  * gen7+, L3 sits between the domain specifc caches, eg sampler/render

typo: specifc

> +  * caches, and the large Last-Level-Cache. LLC is coherent with the CPU,
> +  * but L3 is only visible to the GPU.
> +  */

I dont get the difference between this and I915_CACHE_LLC.
Could the diff between LLC and L3_LLC be described here with example?

Thanks,
-Mika

> + I915_CACHE_L3_LLC,
> + /**
> +  * @I915_CACHE_WT:
> +  *
> +  * hsw:gt3e Write-through for scanout buffers.
> +  */
> + I915_CACHE_WT,
> +};
> +
>  enum i915_map_type {
>   I915_MAP_WB = 0,
>   I915_MAP_WC,
> @@ -228,14 +279,90 @@ struct drm_i915_gem_object {
>   unsigned int mem_flags;
>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
>  #define I915_BO_FLAG_IOMEM   BIT(1) /* Object backed by IO memory */
> - /*
> -  * Is the object to be mapped as read-only to the GPU
> -  * Only honoured if hardware has relevant pte bit
> + /**
> +  * @cache_level: The desired GTT caching level.
> +  *
> +  * See enum i915_cache_level for possible values, along with what
> +  * each does.
>*/
>   unsigned int cache_level:3;
> - unsigned int cache_coherent:2;
> + /**
> +  * @cache_coherent:
> +  *
> +  * Track whether the pages are coherent with the GPU if reading or
> +  * writing through the CPU cache.
> +  *
> +  * This largely depends on the @cache_level, for example if the object
> +  * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> +  * reads and writes through the CPU cache.
> +  *
> +  * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> +  * the CPU cache are always coherent, regardless of the @cache_level. On
> +  * snooping based platforms this is not the case, unless the full
> +  * I915_CACHE_LLC or similar setting is used.
> +  *
> +  * As a result of this we need to track coherency separately for reads
> +  * and writes, in order to avoid superfluous flushing on shared-LLC
> +  * platforms, for reads.
> +  *
> +  * I915_BO_CACHE_COHERENT_FOR_READ:
> +  *
> +  * When reading through the CPU cache, the GPU is still coherent. Note
> +  * that no data has actually been modified here, so it might seem
> +  * strange that we care about this.
> +  *
> +  * As an example, if some object is mapped on the CPU with write-back
> +  * caching, and we read some page, then the cache likely now contains
> +  * the data from that read. At this point the cache and main memory
> +  * match up, so all good. But next the GPU needs to write some data to
> +  * that same 

Re: [Intel-gfx] [PATCH] drm/i915: Take request reference before arming the watchdog timer

2021-03-26 Thread Mika Kuoppala
Tvrtko Ursulin  writes:

> From: Tvrtko Ursulin 
>
> Reference needs to be taken before arming the timer. Luckily, given the
> default timer period of 20s, the potential to hit the race is extremely
> unlikely.
>
> Signed-off-by: Tvrtko Ursulin 
> Fixes: 9b4d0598ee94 ("drm/i915: Request watchdog infrastructure")
> Cc: Daniel Vetter 

Reviewed-by: Mika Kuoppala 

> ---
> Test-with: 20210318162400.2065097-1-tvrtko.ursu...@linux.intel.com
> ---
>  drivers/gpu/drm/i915/i915_request.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c 
> b/drivers/gpu/drm/i915/i915_request.c
> index 9165971c3c47..bec9c3652188 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -350,6 +350,8 @@ static void __rq_arm_watchdog(struct i915_request *rq)
>   if (!ce->watchdog.timeout_us)
>   return;
>  
> + i915_request_get(rq);
> +
>   hrtimer_init(>timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>   wdg->timer.function = __rq_watchdog_expired;
>   hrtimer_start_range_ns(>timer,
> @@ -357,7 +359,6 @@ static void __rq_arm_watchdog(struct i915_request *rq)
>  NSEC_PER_USEC),
>  NSEC_PER_MSEC,
>  HRTIMER_MODE_REL);
> - i915_request_get(rq);
>  }
>  
>  static void __rq_cancel_watchdog(struct i915_request *rq)
> -- 
> 2.27.0
>
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/i915/selftests: Fix wrong return value of perf_series_engines()

2020-11-17 Thread Mika Kuoppala
Zhang Xiaoxu  writes:

> If intel context create failed, the perf_series_engines() will return 0
> rather than error, because we doesn't initialize the return value.
>
> Fixes: cbfd3a0c5a55 ("drm/i915/selftests: Add request throughput measurement 
> to perf")
> Reported-by: Hulk Robot 
> Signed-off-by: Zhang Xiaoxu 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/selftests/i915_request.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
> b/drivers/gpu/drm/i915/selftests/i915_request.c
> index 6b512fc13ca7..e424a6d1a68c 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -2469,8 +2469,10 @@ static int perf_series_engines(void *arg)
>   struct intel_context *ce;
>  
>   ce = intel_context_create(engine);
> - if (IS_ERR(ce))
> + if (IS_ERR(ce)) {
> + err = PTR_ERR(ce);
>   goto out;
> + }
>  
>   err = intel_context_pin(ce);
>   if (err) {
> -- 
> 2.25.4
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/i915/selftests: Fix wrong return value of perf_request_latency()

2020-11-17 Thread Mika Kuoppala
Zhang Xiaoxu  writes:

> If intel context create failed, the perf_request_latency() will return 0
> rather than error, because we doesn't initialize the return value.
>
> Fixes: 25c26f18ea79 ("drm/i915/selftests: Measure dispatch latency")
> Reported-by: Hulk Robot 
> Signed-off-by: Zhang Xiaoxu 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/selftests/i915_request.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
> b/drivers/gpu/drm/i915/selftests/i915_request.c
> index 64bbb8288249..6b512fc13ca7 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -2293,8 +2293,10 @@ static int perf_request_latency(void *arg)
>   struct intel_context *ce;
>  
>   ce = intel_context_create(engine);
> - if (IS_ERR(ce))
> + if (IS_ERR(ce)) {
> + err = PTR_ERR(ce);
>   goto out;
> + }
>  
>   err = intel_context_pin(ce);
>   if (err) {
> -- 
> 2.25.4
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/2] drm/i915: Add mechanism to submit a context WA on ring submission

2020-01-17 Thread Mika Kuoppala
This patch adds framework to submit an arbitrary batchbuffer on each
context switch to clear residual state for render engine on Gen7/7.5
devices.

The idea of always emitting the context and vm setup around each request
is primary to make reset recovery easy, and not require rewriting the
ringbuffer. As each request would set up its own context, leaving it to
the HW to notice and elide no-op context switches, we could restart the
ring at any point, and reorder the requests freely.

However, to avoid emitting clear_residuals() between consecutive requests
in the ringbuffer of the same context, we do want to track the current
context in the ring. In doing so, we need to be careful to only record a
context switch when we are sure the next request will be emitted.

v2: elide optimization patch squashed, courtesy of Chris Wilson

Signed-off-by: Mika Kuoppala 
Signed-off-by: Akeem G Abodunrin 
Cc: Kumar Valsan Prathap 
Cc: Chris Wilson 
Cc: Balestrieri Francesco 
Cc: Bloomfield Jon 
Cc: Dutt Sudeep 
---
 .../gpu/drm/i915/gt/intel_ring_submission.c   | 132 +-
 1 file changed, 129 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c 
b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index bc44fe8e5ffa..58500032c993 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -1384,7 +1384,9 @@ static int load_pd_dir(struct i915_request *rq,
return rq->engine->emit_flush(rq, EMIT_FLUSH);
 }
 
-static inline int mi_set_context(struct i915_request *rq, u32 flags)
+static inline int mi_set_context(struct i915_request *rq,
+struct intel_context *ce,
+u32 flags)
 {
struct drm_i915_private *i915 = rq->i915;
struct intel_engine_cs *engine = rq->engine;
@@ -1459,7 +1461,7 @@ static inline int mi_set_context(struct i915_request *rq, 
u32 flags)
 
*cs++ = MI_NOOP;
*cs++ = MI_SET_CONTEXT;
-   *cs++ = i915_ggtt_offset(rq->context->state) | flags;
+   *cs++ = i915_ggtt_offset(ce->state) | flags;
/*
 * w/a: MI_SET_CONTEXT must always be followed by MI_NOOP
 * WaMiSetContext_Hang:snb,ivb,vlv
@@ -1574,13 +1576,56 @@ static int switch_mm(struct i915_request *rq, struct 
i915_address_space *vm)
return rq->engine->emit_flush(rq, EMIT_INVALIDATE);
 }
 
+static int clear_residuals(struct i915_request *rq)
+{
+   struct intel_engine_cs *engine = rq->engine;
+   int ret;
+
+   GEM_BUG_ON(!engine->kernel_context->state);
+
+   ret = switch_mm(rq, vm_alias(engine->kernel_context));
+   if (ret)
+   return ret;
+
+   ret = mi_set_context(rq,
+engine->kernel_context,
+MI_MM_SPACE_GTT | MI_RESTORE_INHIBIT);
+   if (ret)
+   return ret;
+
+   ret = engine->emit_bb_start(rq,
+   engine->wa_ctx.vma->node.start, 0,
+   0);
+   if (ret)
+   return ret;
+
+   ret = engine->emit_flush(rq, EMIT_FLUSH);
+   if (ret)
+   return ret;
+
+   /* Always invalidate before the next switch_mm() */
+   return engine->emit_flush(rq, EMIT_INVALIDATE);
+}
+
 static int switch_context(struct i915_request *rq)
 {
+   struct intel_engine_cs *engine = rq->engine;
struct intel_context *ce = rq->context;
+   void **residuals = NULL;
int ret;
 
GEM_BUG_ON(HAS_EXECLISTS(rq->i915));
 
+   if (engine->wa_ctx.vma && ce != engine->kernel_context) {
+   if (engine->wa_ctx.vma->private != ce) {
+   ret = clear_residuals(rq);
+   if (ret)
+   return ret;
+
+   residuals = >wa_ctx.vma->private;
+   }
+   }
+
ret = switch_mm(rq, vm_alias(ce));
if (ret)
return ret;
@@ -1600,7 +1645,7 @@ static int switch_context(struct i915_request *rq)
else
flags |= MI_RESTORE_INHIBIT;
 
-   ret = mi_set_context(rq, flags);
+   ret = mi_set_context(rq, ce, flags);
if (ret)
return ret;
}
@@ -1609,6 +1654,20 @@ static int switch_context(struct i915_request *rq)
if (ret)
return ret;
 
+   /*
+* Now past the point of no return, this request _will_ be emitted.
+*
+* Or at least this preamble will be emitted, the request may be
+* interrupted prior to submitting the user payload. If so, we
+* still submit the "empty" request in order to preserve global
+* state tracking such as this, our tracking of the current
+* dirty context.
+*/
+   if (residua

[PATCH 1/2] drm/i915: Add mechanism to submit a context WA on ring submission

2020-01-17 Thread Mika Kuoppala
This patch adds framework to submit an arbitrary batchbuffer on each
context switch to clear residual state for render engine on Gen7/7.5
devices.

The idea of always emitting the context and vm setup around each request
is primary to make reset recovery easy, and not require rewriting the
ringbuffer. As each request would set up its own context, leaving it to
the HW to notice and elide no-op context switches, we could restart the
ring at any point, and reorder the requests freely.

However, to avoid emitting clear_residuals() between consecutive requests
in the ringbuffer of the same context, we do want to track the current
context in the ring. In doing so, we need to be careful to only record a
context switch when we are sure the next request will be emitted.

v2: elide optimization patch squashed, courtesy of Chris Wilson

Signed-off-by: Mika Kuoppala 
Signed-off-by: Akeem G Abodunrin 
Cc: Kumar Valsan Prathap 
Cc: Chris Wilson 
Cc: Balestrieri Francesco 
Cc: Bloomfield Jon 
Cc: Dutt Sudeep 
---
 .../gpu/drm/i915/gt/intel_ring_submission.c   | 132 +-
 1 file changed, 129 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c 
b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index bc44fe8e5ffa..58500032c993 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -1384,7 +1384,9 @@ static int load_pd_dir(struct i915_request *rq,
return rq->engine->emit_flush(rq, EMIT_FLUSH);
 }
 
-static inline int mi_set_context(struct i915_request *rq, u32 flags)
+static inline int mi_set_context(struct i915_request *rq,
+struct intel_context *ce,
+u32 flags)
 {
struct drm_i915_private *i915 = rq->i915;
struct intel_engine_cs *engine = rq->engine;
@@ -1459,7 +1461,7 @@ static inline int mi_set_context(struct i915_request *rq, 
u32 flags)
 
*cs++ = MI_NOOP;
*cs++ = MI_SET_CONTEXT;
-   *cs++ = i915_ggtt_offset(rq->context->state) | flags;
+   *cs++ = i915_ggtt_offset(ce->state) | flags;
/*
 * w/a: MI_SET_CONTEXT must always be followed by MI_NOOP
 * WaMiSetContext_Hang:snb,ivb,vlv
@@ -1574,13 +1576,56 @@ static int switch_mm(struct i915_request *rq, struct 
i915_address_space *vm)
return rq->engine->emit_flush(rq, EMIT_INVALIDATE);
 }
 
+static int clear_residuals(struct i915_request *rq)
+{
+   struct intel_engine_cs *engine = rq->engine;
+   int ret;
+
+   GEM_BUG_ON(!engine->kernel_context->state);
+
+   ret = switch_mm(rq, vm_alias(engine->kernel_context));
+   if (ret)
+   return ret;
+
+   ret = mi_set_context(rq,
+engine->kernel_context,
+MI_MM_SPACE_GTT | MI_RESTORE_INHIBIT);
+   if (ret)
+   return ret;
+
+   ret = engine->emit_bb_start(rq,
+   engine->wa_ctx.vma->node.start, 0,
+   0);
+   if (ret)
+   return ret;
+
+   ret = engine->emit_flush(rq, EMIT_FLUSH);
+   if (ret)
+   return ret;
+
+   /* Always invalidate before the next switch_mm() */
+   return engine->emit_flush(rq, EMIT_INVALIDATE);
+}
+
 static int switch_context(struct i915_request *rq)
 {
+   struct intel_engine_cs *engine = rq->engine;
struct intel_context *ce = rq->context;
+   void **residuals = NULL;
int ret;
 
GEM_BUG_ON(HAS_EXECLISTS(rq->i915));
 
+   if (engine->wa_ctx.vma && ce != engine->kernel_context) {
+   if (engine->wa_ctx.vma->private != ce) {
+   ret = clear_residuals(rq);
+   if (ret)
+   return ret;
+
+   residuals = >wa_ctx.vma->private;
+   }
+   }
+
ret = switch_mm(rq, vm_alias(ce));
if (ret)
return ret;
@@ -1600,7 +1645,7 @@ static int switch_context(struct i915_request *rq)
else
flags |= MI_RESTORE_INHIBIT;
 
-   ret = mi_set_context(rq, flags);
+   ret = mi_set_context(rq, ce, flags);
if (ret)
return ret;
}
@@ -1609,6 +1654,20 @@ static int switch_context(struct i915_request *rq)
if (ret)
return ret;
 
+   /*
+* Now past the point of no return, this request _will_ be emitted.
+*
+* Or at least this preamble will be emitted, the request may be
+* interrupted prior to submitting the user payload. If so, we
+* still submit the "empty" request in order to preserve global
+* state tracking such as this, our tracking of the current
+* dirty context.
+*/
+   if (residua

Re: [PATCH 1/2] drm/i915: Add mechanism to submit a context WA on ring submission

2020-01-17 Thread Mika Kuoppala
Mika Kuoppala  writes:

>Subject: Re: [PATCH 1/2] drm/i915: Add mechanism to submit a context WA
>on ring submission

I forgot to add RFC into patch subject. This should carry
the RFC status as it is v2 on a RFC patch.

This patch squashes Chris Wilson's ctx switch optimization
and the development is still continuing.

-Mika
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/i915/selftests: Fix an IS_ERR() vs NULL check

2019-03-27 Thread Mika Kuoppala
Dan Carpenter  writes:

> The live_context() function returns error pointers.  It never returns
> NULL.
>
> Fixes: 9c1477e83e62 ("drm/i915/selftests: Exercise adding requests to a full 
> GGTT")
> Signed-off-by: Dan Carpenter 

Reviewed-by: Mika Kuoppala 

i915_request.c has another :)

-Mika

> ---
>  drivers/gpu/drm/i915/selftests/i915_gem_evict.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c 
> b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> index 9a9451846b33..89766688e420 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> @@ -455,7 +455,7 @@ static int igt_evict_contexts(void *arg)
>   struct i915_gem_context *ctx;
>  
>   ctx = live_context(i915, file);
> - if (!ctx)
> + if (IS_ERR(ctx))
>   break;
>  
>   /* We will need some GGTT space for the rq's context */
> -- 
> 2.17.1
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [Intel-gfx] [PATCH 3/7] drm/leases: Don't init to 0 in drm_master_create

2019-03-03 Thread Mika Kuoppala
Daniel Vetter  writes:

> We kzalloc.
>
> Cc: Keith Packard 
> Signed-off-by: Daniel Vetter 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/drm_auth.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
> index 1669c42c40ed..bcf0a5a1018f 100644
> --- a/drivers/gpu/drm/drm_auth.c
> +++ b/drivers/gpu/drm/drm_auth.c
> @@ -109,8 +109,6 @@ struct drm_master *drm_master_create(struct drm_device 
> *dev)
>   master->dev = dev;
>  
>   /* initialize the tree of output resource lessees */
> - master->lessor = NULL;
> - master->lessee_id = 0;
>   INIT_LIST_HEAD(>lessees);
>   INIT_LIST_HEAD(>lessee_list);
>   idr_init(>leases);
> -- 
> 2.14.4
>
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Intel-gfx] [PATCH 06/64] drm: Restore double clflush on the last partial cacheline

2016-07-13 Thread Mika Kuoppala
Daniel Vetter  writes:

> On Thu, Jul 07, 2016 at 09:41:12AM +0100, Chris Wilson wrote:
>> This effectively reverts
>> 
>> commit afcd950cafea6e27b739fe7772cbbeed37d05b8b
>> Author: Chris Wilson 
>> Date:   Wed Jun 10 15:58:01 2015 +0100
>> 
>> drm: Avoid the double clflush on the last cache line in 
>> drm_clflush_virt_range()
>> 
>> as we have observed issues with serialisation of the clflush operations
>> on Baytrail+ Atoms with partial updates. Applying the double flush on the
>> last cacheline forces that clflush to be ordered with respect to the
>> previous clflush, and the mfence then protects against prefetches crossing
>> the clflush boundary.
>> 
>> The same issue can be demonstrated in userspace with igt/gem_exec_flush.
>> 
>> Fixes: afcd950cafea6 (drm: Avoid the double clflush on the last cache...)
>> Testcase: igt/gem_concurrent_blit
>> Testcase: igt/gem_partial_pread_pwrite
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92845
>> Signed-off-by: Chris Wilson 
>> Cc: dri-devel at lists.freedesktop.org
>> Cc: Akash Goel 
>> Cc: Imre Deak 
>> Cc: Daniel Vetter 
>> Cc: Jason Ekstrand 
>> Cc: stable at vger.kernel.org
>> Reviewed-by: Mika Kuoppala 
>
> It's duct-tape, but oh well. Applied to drm-misc.

We were confused how many layers we need.

-Mika

> -Daniel
>
>> ---
>>  drivers/gpu/drm/drm_cache.c | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
>> index 059f7c39c582..a7916e5f8864 100644
>> --- a/drivers/gpu/drm/drm_cache.c
>> +++ b/drivers/gpu/drm/drm_cache.c
>> @@ -136,6 +136,7 @@ drm_clflush_virt_range(void *addr, unsigned long length)
>>  mb();
>>  for (; addr < end; addr += size)
>>  clflushopt(addr);
>> +clflushopt(end - 1); /* force serialisation */
>>  mb();
>>  return;
>>  }
>> -- 
>> 2.8.1
>> 
>
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
> ___
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[patch] drm/i915/mocs: || vs | typo in get_mocs_settings()

2016-06-13 Thread Mika Kuoppala
Dan Carpenter  writes:

> It seems pretty clear that bitwise OR was intended here and not logical
> OR.
>
> Fixes: 6fc29133eafb ('drm/i915/gen9: Add WaDisableSkipCaching')
> Signed-off-by: Dan Carpenter 

Pushed to drm-intel-next-queued. Thanks for patch.
-Mika


>
> diff --git a/drivers/gpu/drm/i915/intel_mocs.c 
> b/drivers/gpu/drm/i915/intel_mocs.c
> index 8f96c40..3c1482b 100644
> --- a/drivers/gpu/drm/i915/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/intel_mocs.c
> @@ -162,7 +162,7 @@ static bool get_mocs_settings(struct drm_i915_private 
> *dev_priv,
>  
>   for (i = 0; i < table->size; i++)
>   if (WARN_ON(table->table[i].l3cc_value &
> - (L3_ESC(1) || L3_SCC(0x7
> + (L3_ESC(1) | L3_SCC(0x7
>   return false;
>   }
>  


[patch] drm/i915/mocs: || vs | typo in get_mocs_settings()

2016-06-13 Thread Mika Kuoppala
Dan Carpenter  writes:

> It seems pretty clear that bitwise OR was intended here and not logical
> OR.
>
> Fixes: 6fc29133eafb ('drm/i915/gen9: Add WaDisableSkipCaching')
> Signed-off-by: Dan Carpenter 

Reviewed-by: Mika Kuoppala 

>
> diff --git a/drivers/gpu/drm/i915/intel_mocs.c 
> b/drivers/gpu/drm/i915/intel_mocs.c
> index 8f96c40..3c1482b 100644
> --- a/drivers/gpu/drm/i915/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/intel_mocs.c
> @@ -162,7 +162,7 @@ static bool get_mocs_settings(struct drm_i915_private 
> *dev_priv,
>  
>   for (i = 0; i < table->size; i++)
>   if (WARN_ON(table->table[i].l3cc_value &
> - (L3_ESC(1) || L3_SCC(0x7
> + (L3_ESC(1) | L3_SCC(0x7
>   return false;
>   }
>  


[Intel-gfx] [PATCH] drm: Restore double clflush on the last partial cacheline

2016-05-02 Thread Mika Kuoppala
Chris Wilson  writes:

> [ text/plain ]
> This effectively reverts
>
> commit afcd950cafea6e27b739fe7772cbbeed37d05b8b
> Author: Chris Wilson 
> Date:   Wed Jun 10 15:58:01 2015 +0100
>
> drm: Avoid the double clflush on the last cache line in 
> drm_clflush_virt_range()
>
> as we have observed issues with serialisation of the clflush operations
> on Baytrail+ Atoms with partial updates. Applying the double flush on the
> last cacheline forces that clflush to be ordered with respect to the
> previous clflush, and the mfence then protects against prefetches crossing
> the clflush boundary.
>
> The same issue can be demonstrated in userspace with igt/gem_exec_flush.
>
> Fixes: afcd950cafea6 (drm: Avoid the double clflush on the last cache...)
> Testcase: igt/gem_concurrent_blit
> Testcase: igt/gem_partial_pread_pwrite
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92845
> Signed-off-by: Chris Wilson 
> Cc: dri-devel at lists.freedesktop.org
> Cc: Akash Goel 
> Cc: Imre Deak 
> Cc: Daniel Vetter 
> Cc: Jason Ekstrand 
> Cc: stable at vger.kernel.org

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/drm_cache.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> index 6743ff7dccfa..7f4a6c550319 100644
> --- a/drivers/gpu/drm/drm_cache.c
> +++ b/drivers/gpu/drm/drm_cache.c
> @@ -136,6 +136,7 @@ drm_clflush_virt_range(void *addr, unsigned long length)
>   mb();
>   for (; addr < end; addr += size)
>   clflushopt(addr);
> + clflushopt(end - 1); /* force serialisation */
>   mb();
>   return;
>   }
> -- 
> 2.8.1
>
> ___
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[PATCH] fbcon: when font is freed, clear also vc_font.data

2013-04-22 Thread Mika Kuoppala
commit ae1287865f5361fa138d4d3b1b6277908b54eac9
Author: Dave Airlie 
Date:   Thu Jan 24 16:12:41 2013 +1000

fbcon: don't lose the console font across generic->chip driver switch

uses a pointer in vc->vc_font.data to load font into the new driver.
However if the font is actually freed, we need to clear the data
so that we don't reload font from dangling pointer.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=892340
Signed-off-by: Mika Kuoppala 
---
 drivers/video/console/fbcon.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 3cd6759..a92783e 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -1228,6 +1228,8 @@ static void fbcon_deinit(struct vc_data *vc)
 finished:

fbcon_free_font(p, free_font);
+   if (free_font)
+   vc->vc_font.data = NULL;

if (!con_is_bound(_con))
fbcon_exit();
-- 
1.7.9.5



[PATCH] fbcon: when font is freed, clear also vc_font.data

2013-04-22 Thread Mika Kuoppala
commit ae1287865f5361fa138d4d3b1b6277908b54eac9
Author: Dave Airlie airl...@redhat.com
Date:   Thu Jan 24 16:12:41 2013 +1000

fbcon: don't lose the console font across generic-chip driver switch

uses a pointer in vc-vc_font.data to load font into the new driver.
However if the font is actually freed, we need to clear the data
so that we don't reload font from dangling pointer.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=892340
Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com
---
 drivers/video/console/fbcon.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 3cd6759..a92783e 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -1228,6 +1228,8 @@ static void fbcon_deinit(struct vc_data *vc)
 finished:
 
fbcon_free_font(p, free_font);
+   if (free_font)
+   vc-vc_font.data = NULL;
 
if (!con_is_bound(fb_con))
fbcon_exit();
-- 
1.7.9.5

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v2] Documentation: kernel-parameters.txt Add drm_kms_helper.poll

2012-10-09 Thread mika . kuoppala
From: Mika Kuoppala mika.kuopp...@intel.com

Possibility to disable polling to find connected displays was
introduced by commit [1]:

commit e58f637bb96d5a0ae0919b9998b891d1ba7e47c9
Author: Chris Wilson ch...@chris-wilson.co.uk
Date:   Fri Aug 20 09:13:36 2010 +0100

drm/kms: Add a module parameter to disable polling

Add description for this parameter to kernel-parameters.txt

[1] 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=e58f637bb96d5a0ae0919b9998b891d1ba7e47c9

Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com
---
 Documentation/kernel-parameters.txt |   16 
 1 file changed, 16 insertions(+)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index f777fa9..0f0f071 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -741,6 +741,22 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
if its name and a colon are prepended to the EDID
name.
 
+   drm_kms_helper.poll=
+   Polling for connected display devices on older systems
+   without Hot Plug Detect (HPD) support, can be quite
+   expensive. Poll will happen periodically and can in
+   worst cases take several hundred milliseconds,
+   depending on the hardware. This will cause visible
+   stalls, for example in video playback. These stalls
+   might happen even when your video is on HDP output
+   but you have other non HDP outputs in your hw
+   configuration. If you experience stalls in display
+   output occurring every 10 seconds, disabling polling
+   might help. With polling disabled you need to manually
+   use xrandr to trigger detection.
+   Format: bool  (1/Y/y=enable, 0/N/n=disable)
+   default: enabled
+
dscc4.setup=[NET]
 
dyndbg[=val]  [KNL,DYNAMIC_DEBUG]
-- 
1.7.9.5

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] drm: Fix kdoc comment for drm_crtc_convert_umode

2012-10-09 Thread mika . kuoppala
From: Mika Kuoppala mika.kuopp...@intel.com

Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com
---
 drivers/gpu/drm/drm_crtc.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index ef1b221..bb4c686 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -1148,9 +1148,9 @@ static void drm_crtc_convert_to_umode(struct 
drm_mode_modeinfo *out,
 }
 
 /**
- * drm_crtc_convert_to_umode - convert a modeinfo into a drm_display_mode
- * @out: drm_display_mode to return to the user
- * @in: drm_mode_modeinfo to use
+ * drm_crtc_convert_umode - convert a modeinfo into a drm_display_mode
+ * @out: drm_display_mode to return to the caller
+ * @in: drm_mode_modeinfo from user
  *
  * LOCKING:
  * None.
-- 
1.7.9.5

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] Documentation: kernel-parameters.txt Add drm_kms_helper.poll

2012-10-04 Thread mika . kuoppala
From: Mika Kuoppala mika.kuopp...@intel.com

Add description for drm_kms_helper.poll module parameter introduced
by commit: e58f637bb96d5a0ae0919b9998b891d1ba7e47c9

Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com
---
 Documentation/kernel-parameters.txt |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index f777fa9..ad97749 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -741,6 +741,17 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
if its name and a colon are prepended to the EDID
name.
 
+   drm_kms_helper.poll=
+   Polling for connected VGA devices on older systems
+   can be quite expensive, causing latencies of
+   several hundred milliseconds. This will cause visible
+   stalls, for example in video playback. If you
+   experience stalls in display output occurring every
+   10 seconds (DRM_OUTPUT_POLL_PERIOD=(10*HZ)), disabling
+   polling might help.
+   Format: bool  (1/Y/y=enable, 0/N/n=disable)
+   default: enabled
+
dscc4.setup=[NET]
 
dyndbg[=val]  [KNL,DYNAMIC_DEBUG]
-- 
1.7.9.5

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel