[Intel-gfx] [PULL] drm-intel-gt-next

2022-03-02 Thread Joonas Lahtinen
Hi Dave & Daniel,

Here is the last feature PR for v5.18.

For new platforms we have got more DG2 enabling: small BAR foundations,
64K page support and accelerated migration. For XeHP SDV we've got flat
CCS detection and compute command streamer being added.

Disabling i915 build on PREEMPT_RT for now due to deadlocks and
warnings. Fixes to GuC data structure accesses on ARM platforms.
A couple of other GuC init and SLPC fixes.

Then the usual bits of cleanup and smaller fixes.

Regards, Joonas

***

drm-intel-gt-next-2022-03-03:

Cross-subsystem Changes:

- drm-next backmerge for buddy allocator changes

Driver Changes:

- Skip i915_perf init for DG2 as it is not yet enabled (Ram)
- Add missing workarounds for DG2 (Clint)
- Add 64K page/align support for platforms like DG2 that require it (Matt A,
  Ram, Bob)
- Add accelerated migration support for DG2 (Matt A)
- Add flat CCS support for XeHP SDV (Abdiel, Ram)
- Add Compute Command Streamer (CCS) engine support for XeHP SDV (Michel,
  Daniele, Aravind, Matt R)
- Don't support parallel submission on compute / render (Matt B, Matt R)

- Disable i915 build on PREEMPT_RT until RT behaviour fixed (Sebastian)
- Remove RPS interrupt support for TGL+ (Jose)
- Fix S/R with PM_EARLY for non-GTT mappable objects on DG2 (Matt, Lucas)
- Skip stolen memory init if it is fully reserved (Jose)
- Use iosys_map for GuC data structures that may be in LMEM BAR or SMEM (Lucas)
- Do not complain about stale GuC reset notifications for banned contexts (John)

- Move context descriptor fields to intel_lrc.h (Matt R)
- Start adding support for small BAR (Matt A)
- Clarify vma lifetime (Thomas)
- Simplify subplatform detection on TGL (Jose)
- Correct the param count for unset GuC SLPC param (Vinay, Umesh)
- Read RP_STATE_CAP correctly on Gen12 with GuC SLPC (Vinay)
- Initialize GuC submission locks and queues early (Daniele)
- Fix GuC flag query helper function to not modify state (John)

- Drop fake lmem support now we have real hardware available (Lucas)
- Move misplaced W/A to their correct locations (Srinivasan)
- Use get_reset_domain() helper (Tejas)
- Selftest improvements (Matt A)

The following changes since commit 54f43c17d681f6d9523fcfaeefc9df77993802e1:

  Merge tag 'drm-misc-next-2022-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-next (2022-02-25 05:50:18 +1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-gt-next-2022-03-03

for you to fetch changes up to b2006061ae28fe7e84af6c9757ee89c4e505e92b:

  drm/i915/xehpsdv: Move render/compute engine reset domains related workarounds (2022-03-02 06:52:42 -0800)


Abdiel Janulgue (1):
  drm/i915/lmem: Enable lmem for platforms with Flat CCS

CQ Tang (1):
  drm/i915/xehpsdv: Add has_flat_ccs to device info

Clint Taylor (1):
  drm/i915/dg2: add Wa_14014947963

Daniele Ceraolo Spurio (4):
  drm/i915/guc: Initialize GuC submission locks and queues early
  drm/i915/xehp: compute engine pipe_control
  drm/i915/xehp/guc: enable compute engine inside GuC
  drm/i915/xehp: handle fused off CCS engines

John Harrison 

Re: [Intel-gfx] [PATCH v2 1/4] drm/i915/gt: Clear compress metadata for Xe_HP platforms

2022-03-02 Thread Hellstrom, Thomas
On Wed, 2022-03-02 at 03:23 +0530, Ramalingam C wrote:
> From: Ayaz A Siddiqui 
> 
> Xe-HP and latest devices support Flat CCS which reserved a portion of
> the device memory to store compression metadata, during the clearing of
> device memory buffer object we also need to clear the associated
> CCS buffer.
> 
> Flat CCS memory can not be directly accessed by S/W.
> Address of CCS buffer associated main BO is automatically calculated
> by device itself. KMD/UMD can only access this buffer indirectly using
> XY_CTRL_SURF_COPY_BLT cmd via the address of device memory buffer.
> 
> v2: Fixed issues with platform naming [Lucas]
> v3: Rebased [Ram]
>     Used the round_up funcs [Bob]
> v4: Fixed ccs blk calculation [Ram]
>     Added Kdoc on flat-ccs.
> v5: GENMASK is used [Matt]
>     mocs fix [Matt]
>     Comments Fix [Matt]
>     Flush address programming [Ram]
> v6: FLUSH_DW is fixed
>     Few coding style fix
> 
> Signed-off-by: Ayaz A Siddiqui 
> Signed-off-by: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  15 ++
>  drivers/gpu/drm/i915/gt/intel_migrate.c  | 143 ++-
>  2 files changed, 154 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index f8253012d166..237c1baccc64 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -203,6 +203,21 @@
>  #define GFX_OP_DRAWRECT_INFO ((0x3<<29)|(0x1d<<24)|(0x80<<16)|(0x3))
>  #define GFX_OP_DRAWRECT_INFO_I965  ((0x7900<<16)|0x2)
>  
> +#define XY_CTRL_SURF_INSTR_SIZE    5
> +#define MI_FLUSH_DW_SIZE   3
> +#define XY_CTRL_SURF_COPY_BLT  ((2 << 29) | (0x48 << 22) | 3)
> +#define   SRC_ACCESS_TYPE_SHIFT    21
> +#define   DST_ACCESS_TYPE_SHIFT    20
> +#define   CCS_SIZE_MASK    GENMASK(17, 8)
> +#define   XY_CTRL_SURF_MOCS_MASK   GENMASK(31, 25)
> +#define   NUM_CCS_BYTES_PER_BLOCK  256
> +#define   NUM_BYTES_PER_CCS_BYTE   256
> +#define   NUM_CCS_BLKS_PER_XFER    1024
> +#define   INDIRECT_ACCESS  0
> +#define   DIRECT_ACCESS    1
> +#define  MI_FLUSH_LLC  BIT(9)
> +#define  MI_FLUSH_CCS  BIT(16)
> +
>  #define COLOR_BLT_CMD  (2 << 29 | 0x40 << 22 | (5 - 2))
>  #define XY_COLOR_BLT_CMD   (2 << 29 | 0x50 << 22)
>  #define SRC_COPY_BLT_CMD   (2 << 29 | 0x43 << 22)
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 20444d6ceb3c..330fcdc3e0cf 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -16,6 +16,8 @@ struct insert_pte_data {
>  };
>  
>  #define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
> +#define GET_CCS_BYTES(i915, size)  (HAS_FLAT_CCS(i915) ? \
> +    DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE) : 0)
>  
>  static bool engine_supports_migration(struct intel_engine_cs *engine)
>  {
> @@ -467,6 +469,110 @@ static bool wa_1209644611_applies(int ver, u32 size)
> return height % 4 == 3 && height <= 8;
>  }
>  
> +/**
> + * DOC: Flat-CCS - Memory compression for Local memory
> + *
> + * On Xe-HP and later devices, we use dedicated compression control state (CCS)
> + * stored in local memory for each surface, to support the 3D and media
> + * compression formats.
> + *
> + * The memory required for the CCS of the entire local memory is 1/256 of the
> + * local memory size. So before the kernel boot, the required memory is reserved
> + * for the CCS data and a secure register will be programmed with the CCS base
> + * address.
> + *
> + * Flat CCS data needs to be cleared when a lmem object is allocated.
> + * And CCS data can be copied in and out of CCS region through
> + * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data directly.
> + *
> + * When we exhaust the lmem, if the object's placements support smem, then we can
> + * directly decompress the compressed lmem object into smem and start using it
> + * from smem itself.
> + *
> + * But when we need to swapout the compressed lmem object into a smem region
> + * though objects' placement doesn't support smem, then we copy the lmem content
> + * as it is into smem region along with ccs data (using XY_CTRL_SURF_COPY_BLT).
> + * When the object is referred, lmem content will be swapped in along with
> + * restoration of the CCS data (using XY_CTRL_SURF_COPY_BLT) at corresponding
> + * location.
> + */
> +
> +static inline u32 *i915_flush_dw(u32 *cmd, u32 flags)
> +{
> +   *cmd++ = MI_FLUSH_DW | flags;
> +   *cmd++ = 0;
> +   *cmd++ = 0;
> +
> +   return cmd;
> +}
> +
> +static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915, int size)
> +{
> +   u32 num_cmds, num_blks, 
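
The 1/256 ratio from the kernel-doc quoted above can be illustrated with a
small user-space sketch of the GET_CCS_BYTES() arithmetic. This is only an
illustration under stated assumptions: DIV_ROUND_UP() is re-defined locally,
the has_flat_ccs parameter stands in for HAS_FLAT_CCS(i915), and the helper
name ccs_bytes() is hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value from the quoted patch: one CCS byte covers 256 bytes of main surface. */
#define NUM_BYTES_PER_CCS_BYTE 256u

/* Local stand-in for the kernel's DIV_ROUND_UP() helper. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper mirroring GET_CCS_BYTES(): has_flat_ccs replaces
 * the HAS_FLAT_CCS(i915) platform check used by the real macro.
 */
static uint64_t ccs_bytes(bool has_flat_ccs, uint64_t size)
{
	return has_flat_ccs ?
		DIV_ROUND_UP(size, (uint64_t)NUM_BYTES_PER_CCS_BYTE) : 0;
}
```

With these definitions an 8 MiB chunk (the CHUNK_SZ used by the migration
code) maps to 32768 bytes of CCS data, and a size that is not a multiple of
256 is rounded up to the next CCS byte.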

[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gem: missing boundary check in vm_access leads to OOB read/write (rev2)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/gem: missing boundary check in vm_access leads to OOB read/write (rev2)
URL   : https://patchwork.freedesktop.org/series/100932/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11316 -> Patchwork_22468


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22468 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22468, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/index.html

Participating hosts (50 -> 41)
--

  Additional (1): bat-adlp-4 
  Missing    (10): fi-kbl-soraka shard-tglu bat-dg1-5 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 shard-rkl shard-dg1 bat-jsl-2 fi-bdw-samus 

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_22468:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_suspend@basic-s0@smem:
- fi-kbl-7567u:   [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-kbl-7567u/igt@gem_exec_suspend@basic...@smem.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/fi-kbl-7567u/igt@gem_exec_suspend@basic...@smem.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
- fi-skl-6600u:   [PASS][3] -> [FAIL][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-skl-6600u/igt@i915_pm_...@basic-pci-d3-state.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/fi-skl-6600u/igt@i915_pm_...@basic-pci-d3-state.html

  
Known issues


  Here are the changes found in Patchwork_22468 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_lmem_swapping@basic:
- bat-adlp-4: NOTRUN -> [SKIP][5] ([i915#4613]) +3 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@gem_lmem_swapp...@basic.html

  * igt@gem_tiled_pread_basic:
- bat-adlp-4: NOTRUN -> [SKIP][6] ([i915#3282])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@gem_tiled_pread_basic.html

  * igt@kms_busy@basic@modeset:
- bat-adlp-4: NOTRUN -> [DMESG-WARN][7] ([i915#3576])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@kms_busy@ba...@modeset.html

  * igt@kms_chamelium@vga-hpd-fast:
- bat-adlp-4: NOTRUN -> [SKIP][8] ([fdo#111827]) +8 similar issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@kms_chamel...@vga-hpd-fast.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
- bat-adlp-4: NOTRUN -> [SKIP][9] ([i915#4103]) +1 similar issue
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_force_connector_basic@force-load-detect:
- bat-adlp-4: NOTRUN -> [SKIP][10] ([fdo#109285])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@kms_force_connector_ba...@force-load-detect.html

  * igt@prime_vgem@basic-fence-read:
- bat-adlp-4: NOTRUN -> [SKIP][11] ([i915#3291] / [i915#3708]) +2 
similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@prime_v...@basic-fence-read.html

  * igt@prime_vgem@basic-userptr:
- bat-adlp-4: NOTRUN -> [SKIP][12] ([i915#3301] / [i915#3708])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/bat-adlp-4/igt@prime_v...@basic-userptr.html

  * igt@runner@aborted:
- fi-kbl-7567u:   NOTRUN -> [FAIL][13] ([i915#4312])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/fi-kbl-7567u/igt@run...@aborted.html
- fi-bdw-5557u:   NOTRUN -> [FAIL][14] ([i915#2426] / [i915#4312])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/fi-bdw-5557u/igt@run...@aborted.html

  
 Possible fixes 

  * igt@gem_exec_suspend@basic-s3@smem:
- fi-bdw-5557u:   [INCOMPLETE][15] ([i915#146]) -> [PASS][16]
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-bdw-5557u/igt@gem_exec_suspend@basic...@smem.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/fi-bdw-5557u/igt@gem_exec_suspend@basic...@smem.html

  * igt@i915_pm_rpm@module-reload:
- fi-icl-u2:  [FAIL][17] ([i915#3049]) -> [PASS][18]
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-icl-u2/igt@i915_pm_...@module-reload.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22468/fi-icl-u2/igt@i915_pm_...@module-reload.html

  * igt@i915_selftest@live@evict:
- {bat-rpls-2}:   [DMESG-WARN][19] ([i915#4391]) -> 

Re: [Intel-gfx] [PATCH v2 2/3] drm/i915: Remove the vma refcount

2022-03-02 Thread Thomas Hellström
On Wed, 2022-03-02 at 14:01 -0800, Niranjana Vishwanathapura wrote:
> On Wed, Mar 02, 2022 at 11:21:59AM +0100, Thomas Hellström wrote:
> > Now that i915_vma_parked() is taking the object lock on vma destruction,
> > and the only user of the vma refcount, i915_gem_object_unbind()
> > also takes the object lock, remove the vma refcount.
> > 
> > Signed-off-by: Thomas Hellström 
> > ---
> > drivers/gpu/drm/i915/i915_gem.c   | 17 +
> > drivers/gpu/drm/i915/i915_vma.c   | 14 +++---
> > drivers/gpu/drm/i915/i915_vma.h   | 14 --
> > drivers/gpu/drm/i915/i915_vma_types.h |  1 -
> > 4 files changed, 16 insertions(+), 30 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index dd84ebabb50f..c26110abcc0b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -151,14 +151,25 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > break;
> > }
> > 
> > +   /*
> > +    * Requiring the vm destructor to take the object lock
> > +    * before destroying a vma would help us eliminate the
> > +    * i915_vm_tryget() here, AND thus also the barrier stuff
> > +    * at the end. That's an easy fix, but sleeping locks in
> > +    * a kthread should generally be avoided.
> > +    */
> > ret = -EAGAIN;
> > if (!i915_vm_tryget(vma->vm))
> > break;
> > 
> > -   /* Prevent vma being freed by i915_vma_parked as we unbind */
> > -   vma = __i915_vma_get(vma);
> > spin_unlock(&obj->vma.lock);
> > 
> > +   /*
> > +    * Since i915_vma_parked() takes the object lock
> > +    * before vma destruction, it won't race us here,
> > +    * and destroy the vma from under us.
> > +    */
> > +
> > if (vma) {
> > bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
> > ret = -EBUSY;
> > @@ -180,8 +191,6 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> > ret = i915_vma_unbind(vma);
> > }
> > }
> > -
> > -   __i915_vma_put(vma);
> > }
> > 
> > i915_vm_put(vma->vm);
> 
> One issue still left in i915_gem_object_unbind is that it temporarily removes
> vmas from the obj->vma.list and adds back later as vma needs to be unbound
> outside the obj->vma.lock spinlock. This is an issue for other places where
> we iterate over the obj->vma.list. i915_debugfs_describe_obj is one such case
> (upcoming vm_bind will be another) that iterates over this list.
> What is the plan here? Do we need to take object lock while iterating over
> the list?

Yeah, I guess that's an option if that's at all possible (we might need
to iterate over the list in the mmu notifier, for example).

The other option is to
*) get rid of the GGTT / PPGTT sorting of vmas in the list,
*) being able to determine per vma *before we unlock* if we need to
unlock the list spinlock to take action,
*) re-add all vmas we've previously iterated over at the *tail* of the
list before unlocking the list lock.

Then a termination criterion for iterating would be that we reached the
end of the list without unlocking. Otherwise we need to restart
iteration after unlocking.

This would typically give us O(2N) complexity for the iteration. If we
re-add at the *head* of the list, we'd see O(N²), but to be able to
re-add previous vmas to the tail of the list requires us to get rid of
the sorting.
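
As a rough illustration of the tail re-add scheme described above, here is a
user-space sketch. It is not i915 code: struct vma_node, struct vma_list and
process_all() are hypothetical stand-ins, the list lock is only indicated by
comments, and "taking action" is modelled by clearing a flag. The function
returns the number of node visits so the roughly 2N bound can be checked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma linked on obj->vma.list. */
struct vma_node {
	struct vma_node *prev, *next;
	bool needs_action;	/* acting on it requires dropping the list lock */
};

/* Circular list with a sentinel head, similar to struct list_head. */
struct vma_list {
	struct vma_node head;
};

static void vma_list_init(struct vma_list *l)
{
	l->head.prev = l->head.next = &l->head;
	l->head.needs_action = false;
}

static void vma_list_add_tail(struct vma_list *l, struct vma_node *n)
{
	n->prev = l->head.prev;
	n->next = &l->head;
	l->head.prev->next = n;
	l->head.prev = n;
}

/* Move the run [first entry .. n] to the tail, preserving its order. */
static void rotate_to_tail(struct vma_list *l, struct vma_node *n)
{
	struct vma_node *head = &l->head;
	struct vma_node *first = head->next, *last = head->prev;
	struct vma_node *after = n->next;

	if (after == head)	/* n is already the last entry */
		return;

	head->next = after;
	after->prev = head;
	last->next = first;
	first->prev = last;
	n->next = head;
	head->prev = n;
}

/* Returns the number of node visits needed to process the whole list. */
static unsigned int process_all(struct vma_list *l)
{
	unsigned int visits = 0;
	struct vma_node *n;

restart:
	/* the list lock would be (re)taken here */
	for (n = l->head.next; n != &l->head; n = n->next) {
		visits++;
		if (!n->needs_action)
			continue;

		/* Re-add everything seen so far at the tail before unlocking. */
		rotate_to_tail(l, n);
		/* unlock, act on n, relock: modelled by clearing the flag */
		n->needs_action = false;
		goto restart;
	}
	/* Reached the end without unlocking: iteration is complete. */
	return visits;
}
```

In this model each entry is visited once before it is moved to the tail and
at most once more in the final pass that runs to the end without unlocking,
which is where the 2N figure comes from.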

> 
> But this is just something I noticed and not related to this patch.
> This patch looks good to me.
> Reviewed-by: Niranjana Vishwanathapura
> 
>  

Thanks for reviewing. I noticed there is some documentation needing
updating as well, so I'll send out a v3 without functional changes.

/Thomas


> 
> > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> > index 91538bc38110..6fd25b39748f 100644
> > --- a/drivers/gpu/drm/i915/i915_vma.c
> > +++ b/drivers/gpu/drm/i915/i915_vma.c
> > @@ -122,7 +122,6 @@ vma_create(struct drm_i915_gem_object *obj,
> > if (vma == NULL)
> > return ERR_PTR(-ENOMEM);
> > 
> > -   kref_init(&vma->ref);
> > vma->ops = &vm->vma_ops;
> > vma->obj = obj;
> > vma->size = obj->base.size;
> > @@ -1628,15 +1627,6 @@ void i915_vma_reopen(struct i915_vma *vma)
> > __i915_vma_remove_closed(vma);
> > }
> > 
> > -void i915_vma_release(struct kref *ref)
> > -{
> > -   struct i915_vma *vma = container_of(ref, typeof(*vma), ref);
> > -
> > -   i915_active_fini(&vma->active);
> > -   

[Intel-gfx] [v2] drm/i915/gem: missing boundary check in vm_access leads to OOB read/write

2022-03-02 Thread Mastan Katragadda
Intel ID: PSIRT-PTK0002429

A missing bounds check in vm_access() can lead to an out-of-bounds read or
write in the adjacent memory area. The len attribute is not validated before
the memcpy at [1] or [2] occurs.

[  183.637831] BUG: unable to handle page fault for address: c9c86000
[  183.637934] #PF: supervisor read access in kernel mode
[  183.637997] #PF: error_code(0x) - not-present page
[  183.638059] PGD 10067 P4D 10067 PUD 100258067 PMD 106341067 PTE 0
[  183.638144] Oops:  [#2] PREEMPT SMP NOPTI
[  183.638201] CPU: 3 PID: 1790 Comm: poc Tainted: G  D   5.17.0-rc6-ci-drm-11296+ #1
[  183.638298] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X208.B00.1905301319 05/30/2019
[  183.638430] RIP: 0010:memcpy_erms+0x6/0x10
[  183.640213] RSP: 0018:c90001763d48 EFLAGS: 00010246
[  183.641117] RAX: 888109c14000 RBX: 888111bece40 RCX: 0ffc
[  183.642029] RDX: 1000 RSI: c9c86000 RDI: 888109c14004
[  183.642946] RBP: 0ffc R08: 816b R09: 
[  183.643848] R10: c9c85000 R11: 0048 R12: 1000
[  183.644742] R13: 888111bed190 R14: 888109c14000 R15: 1000
[  183.645653] FS:  7fe5ef807540() GS:88845b38() knlGS:
[  183.646570] CS:  0010 DS:  ES:  CR0: 80050033
[  183.647481] CR2: c9c86000 CR3: 00010ff02006 CR4: 003706e0
[  183.648384] DR0:  DR1:  DR2: 
[  183.649271] DR3:  DR6: fffe0ff0 DR7: 0400
[  183.650142] Call Trace:
[  183.650988]  
[  183.651793]  vm_access+0x1f0/0x2a0 [i915]
[  183.652726]  __access_remote_vm+0x224/0x380
[  183.653561]  mem_rw.isra.0+0xf9/0x190
[  183.654402]  vfs_read+0x9d/0x1b0
[  183.655238]  ksys_read+0x63/0xe0
[  183.656065]  do_syscall_64+0x38/0xc0
[  183.656882]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  183.657663] RIP: 0033:0x7fe5ef725142
[  183.659351] RSP: 002b:7ffe1e81c7e8 EFLAGS: 0246 ORIG_RAX: 

[  183.660227] RAX: ffda RBX: 557055dfb780 RCX: 7fe5ef725142
[  183.661104] RDX: 1000 RSI: 7ffe1e81d880 RDI: 0005
[  183.661972] RBP: 7ffe1e81e890 R08: 0030 R09: 0046
[  183.662832] R10: 557055dfc2e0 R11: 0246 R12: 557055dfb1c0
[  183.663691] R13: 7ffe1e81e980 R14:  R15: 
[  183.664566]  

Changes since v1:
 - Updated if condition with range_overflows_t [Chris Wilson]
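
For readers unfamiliar with the helper, the difference between the old and
new condition can be sketched in user space as follows. This assumes that
range_overflows_t(u64, start, size, max) is true when start lies at or beyond
max, or when size exceeds the space remaining after start; the function name
range_overflows_u64() is a local stand-in and the values in the usage note
are purely illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Local stand-in for range_overflows_t(u64, start, size, max).
 * The subtraction form avoids the wrap-around that a naive
 * "start + size > max" comparison could hit for very large sizes.
 */
static bool range_overflows_u64(uint64_t start, uint64_t size, uint64_t max)
{
	return start >= max || size > max - start;
}

/* The pre-patch check: only the start offset is validated. */
static bool old_check_rejects(uint64_t addr, uint64_t obj_size)
{
	return addr >= obj_size;
}
```

For a 0x1000-byte object, an access at offset 0xffc with len 0x1000 passes
the old check but is rejected by the range check, while a 4-byte access at
the same offset is still allowed.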

Signed-off-by: Mastan Katragadda 
Suggested-by: Adam Zabrocki 
Reported-by: Jackson Cody 
Cc: Chris Wilson 
Cc: Bloomfield Jon 
Cc: Dutt Sudeep 
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index efe69d6b86f4..c3ea243d414d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -455,7 +455,7 @@ vm_access(struct vm_area_struct *area, unsigned long addr,
return -EACCES;
 
addr -= area->vm_start;
-   if (addr >= obj->base.size)
+   if (range_overflows_t(u64, addr, len, obj->base.size))
return -EINVAL;
 
i915_gem_ww_ctx_init(&ww, true);
-- 
2.25.1



[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: avoid concurrent writes to aux_inv (rev3)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915: avoid concurrent writes to aux_inv (rev3)
URL   : https://patchwork.freedesktop.org/series/100772/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11313_full -> Patchwork_22465_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22465_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22465_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_22465_full:

### IGT changes ###

 Possible regressions 

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-edp:
- shard-tglb: [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-tglb3/igt@i915_pm_lpsp@kms-l...@kms-lpsp-edp.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-tglb2/igt@i915_pm_lpsp@kms-l...@kms-lpsp-edp.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * 
{igt@kms_plane_scaling@planes-upscale-20x20-downscale-factor-0-5@pipe-a-edp-1-planes-upscale-downscale}:
- {shard-rkl}:NOTRUN -> [SKIP][3] +1 similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-rkl-6/igt@kms_plane_scaling@planes-upscale-20x20-downscale-factor-...@pipe-a-edp-1-planes-upscale-downscale.html

  
New tests
-

  New tests have been introduced between CI_DRM_11313_full and Patchwork_22465_full:

### New IGT tests (1) ###

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  

Known issues


  Here are the changes found in Patchwork_22465_full that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][4] ([i915#4991])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-apl8/igt@gem_cre...@create-massive.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][5] -> [FAIL][6] ([i915#232])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-tglb6/igt@gem_...@unwedge-stress.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-tglb5/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_balancer@parallel-out-fence:
- shard-kbl:  NOTRUN -> [DMESG-WARN][7] ([i915#5076])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-kbl3/igt@gem_exec_balan...@parallel-out-fence.html

  * igt@gem_exec_capture@pi@vecs0:
- shard-skl:  NOTRUN -> [INCOMPLETE][8] ([i915#4547])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-skl4/igt@gem_exec_capture@p...@vecs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [PASS][9] -> [FAIL][10] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-iclb7/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-iclb1/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none-vip@rcs0:
- shard-skl:  NOTRUN -> [SKIP][11] ([fdo#109271]) +9 similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-skl6/igt@gem_exec_fair@basic-none-...@rcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-tglb3/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-tglb2/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
- shard-glk:  [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-glk7/igt@gem_exec_fair@basic-pace-s...@rcs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-glk2/igt@gem_exec_fair@basic-pace-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@bcs0:
- shard-apl:  NOTRUN -> [SKIP][16] ([fdo#109271]) +46 similar issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-apl1/igt@gem_exec_fair@basic-p...@bcs0.html
- shard-kbl:  NOTRUN -> [SKIP][17] ([fdo#109271])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/shard-kbl3/igt@gem_exec_fair@basic-p...@bcs0.html

  * igt@gem_exec_whisper@basic-contexts-forked-all:

Re: [Intel-gfx] [PATCH 15/15] drm/i915/gt: Clear compress metadata for Xe_HP platforms

2022-03-02 Thread Matt Roper
On Sun, Feb 27, 2022 at 10:22:20PM +0530, Ramalingam C wrote:
> Matt,
> 
> Thanks for the review.
> 
> On 2022-02-18 at 17:47:22 -0800, Matt Roper wrote:
> > On Sat, Feb 19, 2022 at 12:17:52AM +0530, Ramalingam C wrote:
> > > From: Ayaz A Siddiqui 
> > > 
> > > Xe-HP and latest devices support Flat CCS which reserved a portion of
> > > the device memory to store compression metadata, during the clearing of
> > > device memory buffer object we also need to clear the associated
> > > CCS buffer.
> > > 
> > > Flat CCS memory can not be directly accessed by S/W.
> > > Address of CCS buffer associated main BO is automatically calculated
> > > by device itself. KMD/UMD can only access this buffer indirectly using
> > > XY_CTRL_SURF_COPY_BLT cmd via the address of device memory buffer.
> > > 
> > > v2: Fixed issues with platform naming [Lucas]
> > > v3: Rebased [Ram]
> > > Used the round_up funcs [Bob]
> > > v4: Fixed ccs blk calculation [Ram]
> > > Added Kdoc on flat-ccs.
> > > 
> > > Cc: CQ Tang 
> > > Signed-off-by: Ayaz A Siddiqui 
> > > Signed-off-by: Ramalingam C 
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  15 ++
> > >  drivers/gpu/drm/i915/gt/intel_migrate.c  | 145 ++-
> > >  2 files changed, 156 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> > > index f8253012d166..166de5436c4a 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> > > @@ -203,6 +203,21 @@
> > >  #define GFX_OP_DRAWRECT_INFO ((0x3<<29)|(0x1d<<24)|(0x80<<16)|(0x3))
> > >  #define GFX_OP_DRAWRECT_INFO_I965  ((0x7900<<16)|0x2)
> > >  
> > > +#define XY_CTRL_SURF_INSTR_SIZE  5
> > > +#define MI_FLUSH_DW_SIZE 3
> > > +#define XY_CTRL_SURF_COPY_BLT    ((2 << 29) | (0x48 << 22) | 3)
> > > +#define   SRC_ACCESS_TYPE_SHIFT  21
> > > +#define   DST_ACCESS_TYPE_SHIFT  20
> > > +#define   CCS_SIZE_SHIFT 8
> > 
> > Rather than using a shift, it might be better to just define the
> > bitfield.  E.g.,
> > 
> > #define CCS_SIZE        GENMASK(17, 8)
> > 
> > and then later
> > 
> > FIELD_PREP(CCS_SIZE, i - 1)
> > 
> > to refer to the proper value.
> > 
> > > +#define   XY_CTRL_SURF_MOCS_SHIFT        25
> > 
> > Same here; we can use GENMASK(31, 25) to define the field.
> 
> Adapting to the GENMASK and FIELD_PREP for these two macros
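
A minimal user-space sketch of the mask-based style suggested above, for
illustration only. GENMASK_U32() and field_prep_u32() are local stand-ins
written for this example (the kernel provides GENMASK() and FIELD_PREP()
itself), and the CCS_SIZE and XY_CTRL_SURF_MOCS names simply follow the
field positions discussed in this thread.

```c
#include <stdint.h>

/* Local stand-in for GENMASK(): bits h..l set, with 0 <= l <= h <= 31. */
#define GENMASK_U32(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

#define CCS_SIZE		GENMASK_U32(17, 8)
#define XY_CTRL_SURF_MOCS	GENMASK_U32(31, 25)

/*
 * Local stand-in for FIELD_PREP(): shift val to the lowest set bit of
 * mask and drop anything that does not fit. mask must be non-zero.
 */
static uint32_t field_prep_u32(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	while (!((mask >> shift) & 1))
		shift++;

	return (val << shift) & mask;
}
```

With a single mask definition per field, a call such as
field_prep_u32(CCS_SIZE, i - 1) replaces the open-coded
(i - 1) << CCS_SIZE_SHIFT, and the field width is documented in one place.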
> > 
> > > +#define   NUM_CCS_BYTES_PER_BLOCK        256
> > > +#define   NUM_BYTES_PER_CCS_BYTE 256
> > > +#define   NUM_CCS_BLKS_PER_XFER  1024
> > > +#define   INDIRECT_ACCESS        0
> > > +#define   DIRECT_ACCESS  1
> > > +#define  MI_FLUSH_LLC    BIT(9)
> > > +#define  MI_FLUSH_CCS    BIT(16)
> > > +
> > >  #define COLOR_BLT_CMD    (2 << 29 | 0x40 << 22 | (5 - 2))
> > >  #define XY_COLOR_BLT_CMD (2 << 29 | 0x50 << 22)
> > >  #define SRC_COPY_BLT_CMD (2 << 29 | 0x43 << 22)
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > > index 20444d6ceb3c..9f9cd2649377 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > > @@ -16,6 +16,8 @@ struct insert_pte_data {
> > >  };
> > >  
> > >  #define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
> > > +#define GET_CCS_BYTES(i915, size)        (HAS_FLAT_CCS(i915) ? \
> > > +  DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE) : 0)
> > >  
> > >  static bool engine_supports_migration(struct intel_engine_cs *engine)
> > >  {
> > > @@ -467,6 +469,113 @@ static bool wa_1209644611_applies(int ver, u32 size)
> > >   return height % 4 == 3 && height <= 8;
> > >  }
> > >  
> > > +/**
> > > + * DOC: Flat-CCS - Memory compression for Local memory
> > > + *
> > > + * On Xe-HP and later devices, we use dedicated compression control state (CCS)
> > > + * stored in local memory for each surface, to support the 3D and media
> > > + * compression formats.
> > > + *
> > > + * The memory required for the CCS of the entire local memory is 1/256 of the
> > > + * local memory size. So before the kernel boot, the required memory is reserved
> > > + * for the CCS data and a secure register will be programmed with the CCS base
> > > + * address.
> > > + *
> > > + * Flat CCS data needs to be cleared when a lmem object is allocated.
> > > + * And CCS data can be copied in and out of CCS region through
> > > + * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data directly.
> > > + *
> > > + * When we exaust the lmem, if the object's placements support smem, then we can
> > 
> > Typo: exhaust
> > 
> > > + * directly decompress the compressed lmem object into smem and start using it
> > > + * from smem itself.
> > > + *
> > > + * But 

Re: [Intel-gfx] [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread David Laight
From: Xiaomeng Tong
> Sent: 03 March 2022 02:27
> 
> On Wed, 2 Mar 2022 14:04:06 +, David Laight
>  wrote:
> > I think that it would be better to make any alternate loop macro
> > just set the variable to NULL on the loop exit.
> > That is easier to code for and the compiler might be persuaded to
> > not redo the test.
> 
> No, that would lead to a NULL dereference.

Why? It would make it the same as the 'easy to use':
for (item = head; item; item = item->next) {
...
if (...)
break;
...
}
if (!item)
return;
 
> The problem is the mis-use of the iterator outside the loop on exit, and
> the iterator will be the HEAD's container_of pointer, which points
> to a type-confused struct. Sidenote: the *mis-use* here refers to
> mistakenly accessing other members of the struct, instead of the
> list_head member which actually is the valid HEAD.

The problem is that the HEAD's container_of pointer should never
be calculated at all.
This is what is fundamentally broken about the current definition.

> IOW, you would dereference a (NULL + offset_of_member) address here.

Where?

> Please remind me if i missed something, thanks.
>
> Can you share your "alternative definitions" details? thanks!

The loop should probably use an extra variable that points
to the 'list node' in the next structure.
Something like:
for (xxx *iter = head->next;
     iter ==  ? ((item = NULL),0) : ((item = list_item(iter),1));
     iter = item->member->next) {
   ...
With a bit of casting you can use 'item' to hold 'iter'.
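
As a rough userspace sketch of the idea above (not an existing kernel macro): the walk keeps a separate 'list node' pointer, only converts it to an entry for real nodes, and leaves 'item' as NULL when the loop runs to completion. The `sketch_` names, the extra `type` argument and the minimal list helpers are all assumptions made for this illustration.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel's list primitives. */
struct list_head {
	struct list_head *next, *prev;
};

#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical iterator: 'item' is NULL once 'iter' is back at 'head'. */
#define sketch_for_each_entry_or_null(item, iter, head, type, member) \
	for ((iter) = (head)->next; \
	     (iter) == (head) \
		? ((item) = NULL, 0) \
		: ((item) = sketch_container_of(iter, type, member), 1); \
	     (iter) = (item)->member.next)

struct sketch_entry {
	int key;
	struct list_head node;
};

static void sketch_list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Search: returns the match, or NULL if the whole list was walked. */
static struct sketch_entry *sketch_find(struct list_head *head, int key)
{
	struct sketch_entry *item;
	struct list_head *iter;

	sketch_for_each_entry_or_null(item, iter, head, struct sketch_entry, node) {
		if (item->key == key)
			break;
	}

	/* Never the head's container_of pointer: either a real entry or NULL. */
	return item;
}
```

The container_of of the list head is never computed here, so a caller can test 'item' against NULL after the loop, as in the simple `for (item = head; item; item = item->next)` case.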

> 
> > OTOH there may be alternative definitions that can be used to get
> > the compiler (or other compiler-like tools) to detect broken code.
> > Even if the definition can't possibly generate a working kernel.
> 
> The "list_for_each_entry_inside(pos, type, head, member)" way makes
> the iterator invisible outside the loop, and would be caught by
> the compiler if a use-after-loop happened.

It is also a complete PITA for anything doing a search.

David




[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/gt: Handle errors for i915_gem_object_trylock

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/gt: Handle errors for i915_gem_object_trylock
URL   : https://patchwork.freedesktop.org/series/100951/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11313_full -> Patchwork_22464_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22464_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22464_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_22464_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-b-edp-1-scaler-with-clipping-clamping:
- shard-iclb: [PASS][1] -> [SKIP][2] +1 similar issue
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-iclb1/igt@kms_plane_scaling@scaler-with-clipping-clamp...@pipe-b-edp-1-scaler-with-clipping-clamping.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-iclb3/igt@kms_plane_scaling@scaler-with-clipping-clamp...@pipe-b-edp-1-scaler-with-clipping-clamping.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_exec_schedule@smoketest@rcs0:
- {shard-rkl}:NOTRUN -> [INCOMPLETE][3]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-rkl-5/igt@gem_exec_schedule@smoket...@rcs0.html

  * {igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25}:
- {shard-rkl}:NOTRUN -> [SKIP][4]
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-rkl-5/igt@kms_plane_scal...@downscale-with-pixel-format-factor-0-25.html

  * {igt@kms_plane_scaling@scaler-with-rotation-unity-scaling@pipe-d-hdmi-a-3-scaler-with-rotation}:
- {shard-dg1}:NOTRUN -> [SKIP][5] +3 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-dg1-18/igt@kms_plane_scaling@scaler-with-rotation-unity-scal...@pipe-d-hdmi-a-3-scaler-with-rotation.html

  
New tests
-

  New tests have been introduced between CI_DRM_11313_full and Patchwork_22464_full:

### New IGT tests (1) ###

  * igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.27] s

  

Known issues


  Here are the changes found in Patchwork_22464_full that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][6] ([i915#4991])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-apl1/igt@gem_cre...@create-massive.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][7] -> [FAIL][8] ([i915#232])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-tglb6/igt@gem_...@unwedge-stress.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-tglb3/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_balancer@parallel-out-fence:
- shard-kbl:  NOTRUN -> [DMESG-WARN][9] ([i915#5076])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-kbl4/igt@gem_exec_balan...@parallel-out-fence.html

  * igt@gem_exec_endless@dispatch@bcs0:
- shard-iclb: [PASS][10] -> [INCOMPLETE][11] ([i915#3778])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-iclb7/igt@gem_exec_endless@dispa...@bcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-iclb6/igt@gem_exec_endless@dispa...@bcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-iclb7/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-iclb8/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none-vip@rcs0:
- shard-skl:  NOTRUN -> [SKIP][14] ([fdo#109271]) +9 similar issues
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-skl8/igt@gem_exec_fair@basic-none-...@rcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][15] -> [FAIL][16] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/shard-tglb3/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/shard-tglb5/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * 

Re: [Intel-gfx] [intel-gfx] [PATCH] drm/i915: avoid concurrent writes to aux_inv (rev3)

2022-03-02 Thread Yang, Fei
Hi Chris, for some reason I didn't receive the review email, so I copied your 
comments from patchwork and faked this email.

>>  static void execlists_dequeue(struct intel_engine_cs *engine)
>>  {
>> struct intel_engine_execlists * const execlists = &engine->execlists;
>> @@ -1538,6 +1566,16 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>> }
>>
>> if (__i915_request_submit(rq)) {
>> +   /* hsdes: 1809175790 */
>> +   if ((GRAPHICS_VER(engine->i915) == 12) &&
>> +   rq->vd_ve_aux_inv &&
>> +   (engine->class == VIDEO_DECODE_CLASS ||
>> +engine->class == VIDEO_ENHANCEMENT_CLASS)) {

> We do not need the extra checks here; we just do as we are told. We only
> tell ourselves to apply the fixup when required.

Without checking GRAPHICS_VER, I'm seeing a lot of regressions on older
platforms in the CI results. This workaround was only implemented for gen12
(gen12_emit_flush_xcs). Without checking engine->class, I'm seeing boot issues
due to GEM_BUG_ON() in aux_inv_reg(). Every rq goes through this code
regardless of engine class and gen version, so the checks seem to be necessary.

>> +   *rq->vd_ve_aux_inv = 
>> i915_mmio_reg_offset

> Likewise, vd_ve is overspecific, aux_inv_fixup or aux_inv_wa (or
> wa_aux_iv, fixup_aux_inv).

I wanted to be specific because the workaround was only implemented for vd/ve
engines. But I'm ok with your proposal.

>> +   (aux_inv_reg(engine));
>> +   rq->vd_ve_aux_inv = NULL;

> Move this to i915_request initialisation so that we only set aux_inv
> when required, which probably explains the extra defence.

The pointer is currently initialized with 0x5a5a. I set it to NULL in
gen12_emit_flush_xcs; otherwise the rq would enter that if-statement with an
invalid pointer. I'm not familiar with the code, and there seem to be multiple
functions allocating the structure. I agree it's better to set it to NULL at
initialization, but I need some guidance on the best place to do so.

>> +   rq->execution_mask = engine->mask;
>> +   }
>> if (!merge) {
>> *port++ = i915_request_get(last);
>> last = NULL;
>> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
>> index 28b1f9db5487..69de32e5e15d 100644
>> --- a/drivers/gpu/drm/i915/i915_request.h
>> +++ b/drivers/gpu/drm/i915/i915_request.h
>> @@ -350,6 +350,8 @@ struct i915_request {
>> struct list_head link;
>> unsigned long delay;
>> } mock;)
>> +
>> +   u32 *vd_ve_aux_inv;

> Not at the end of the struct; that's where we put things in the dungeon.
> The selftest struct should be last; I do hope no one has been putting
> things at random places in the struct without considering the layout and
> semantics. From the flow, this is akin to batch, capture_list; before
> emitted_jiffies would be a good spot.

Got it, will change. I thought adding at the end would be safer; thanks for the
explanation.

> -Chris



[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/cdclk: Add cdclk check to atomic check (rev2)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/cdclk: Add cdclk check to atomic check (rev2)
URL   : https://patchwork.freedesktop.org/series/100671/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11316 -> Patchwork_22467


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22467 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22467, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/index.html

Participating hosts (48 -> 33)
--

  Additional (1): bat-adls-5 
  Missing(16): fi-bdw-5557u shard-tglu fi-bsw-n3050 fi-hsw-4200u 
fi-bsw-cyan fi-snb-2520m fi-ilk-650 fi-ctg-p8600 fi-hsw-4770 fi-kbl-8809g 
fi-ivb-3770 fi-elk-e7500 fi-bsw-kefka fi-blb-e6850 fi-bdw-samus fi-snb-2600 

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_22467:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_suspend@basic-s0@smem:
- fi-bsw-nick:[PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-bsw-nick/igt@gem_exec_suspend@basic...@smem.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-bsw-nick/igt@gem_exec_suspend@basic...@smem.html

  * igt@runner@aborted:
- bat-dg1-5:  NOTRUN -> [FAIL][3]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/bat-dg1-5/igt@run...@aborted.html
- fi-bwr-2160:NOTRUN -> [FAIL][4]
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-bwr-2160/igt@run...@aborted.html
- bat-dg1-6:  NOTRUN -> [FAIL][5]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/bat-dg1-6/igt@run...@aborted.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@runner@aborted:
- {fi-rkl-11600}: NOTRUN -> [FAIL][6]
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-rkl-11600/igt@run...@aborted.html
- {bat-adlp-6}:   NOTRUN -> [FAIL][7]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/bat-adlp-6/igt@run...@aborted.html
- {bat-jsl-2}:NOTRUN -> [FAIL][8]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/bat-jsl-2/igt@run...@aborted.html
- {fi-jsl-1}: NOTRUN -> [FAIL][9]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-jsl-1/igt@run...@aborted.html
- {bat-jsl-1}:NOTRUN -> [FAIL][10]
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/bat-jsl-1/igt@run...@aborted.html
- {fi-ehl-2}: NOTRUN -> [FAIL][11]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-ehl-2/igt@run...@aborted.html
- {bat-dg2-9}:[FAIL][12] ([i915#4312]) -> [FAIL][13]
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/bat-dg2-9/igt@run...@aborted.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/bat-dg2-9/igt@run...@aborted.html

  
Known issues


  Here are the changes found in Patchwork_22467 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@runner@aborted:
- fi-kbl-x1275:   NOTRUN -> [FAIL][14] ([i915#2426])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-kbl-x1275/igt@run...@aborted.html
- fi-cfl-8700k:   NOTRUN -> [FAIL][15] ([i915#2426])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-cfl-8700k/igt@run...@aborted.html
- fi-skl-6600u:   NOTRUN -> [FAIL][16] ([i915#2426])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-skl-6600u/igt@run...@aborted.html
- fi-cfl-8109u:   NOTRUN -> [FAIL][17] ([i915#2426])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-cfl-8109u/igt@run...@aborted.html
- fi-icl-u2:  NOTRUN -> [FAIL][18] ([i915#2426] / [i915#3690])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-icl-u2/igt@run...@aborted.html
- fi-glk-dsi: NOTRUN -> [FAIL][19] ([i915#2426] / [k.org#202321])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-glk-dsi/igt@run...@aborted.html
- fi-kbl-soraka:  NOTRUN -> [FAIL][20] ([i915#2426])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-kbl-soraka/igt@run...@aborted.html
- fi-kbl-7500u:   NOTRUN -> [FAIL][21] ([i915#2426])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22467/fi-kbl-7500u/igt@run...@aborted.html
- fi-kbl-guc: NOTRUN -> [FAIL][22] ([i915#2426])
   [22]: 

[Intel-gfx] ✗ Fi.CI.DOCS: warning for drm/i915/cdclk: Add cdclk check to atomic check (rev2)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/cdclk: Add cdclk check to atomic check (rev2)
URL   : https://patchwork.freedesktop.org/series/100671/
State : warning

== Summary ==

$ make htmldocs 2>&1 > /dev/null | grep i915
./drivers/gpu/drm/i915/display/intel_cdclk.c:2034: warning: Function parameter or member 'i915' not described in 'intel_cdclk_needs_modeset'
./drivers/gpu/drm/i915/display/intel_cdclk.c:2102: warning: Function parameter or member 'i915' not described in 'intel_cdclk_changed'
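
kernel-doc emits this warning when a function's comment block has no `@i915:` line for a parameter of that name. A sketch of the usual shape of the fix follows. The function, its body and the config struct are hypothetical stand-ins, since the real signatures in intel_cdclk.c are not shown in this report.

```c
#include <stdbool.h>
#include <stddef.h>

struct drm_i915_private;	/* opaque here; only the pointer type is needed */

/* Hypothetical stand-in for the driver's CDCLK configuration struct. */
struct sketch_cdclk_config {
	unsigned int cdclk;
	unsigned int vco;
};

/**
 * sketch_cdclk_changed - Check whether two CDCLK configurations differ
 * @i915: i915 device (the kind of line whose absence triggers the warning)
 * @a: first CDCLK configuration
 * @b: second CDCLK configuration
 *
 * Returns:
 * True if the configurations differ, false otherwise.
 */
static bool sketch_cdclk_changed(struct drm_i915_private *i915,
				 const struct sketch_cdclk_config *a,
				 const struct sketch_cdclk_config *b)
{
	(void)i915;	/* unused in this sketch */

	return a->cdclk != b->cdclk || a->vco != b->vco;
}
```

Every parameter in the prototype gets one `@name:` line in the comment, in the same order, which is what the htmldocs check looks for.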




[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/cdclk: Add cdclk check to atomic check (rev2)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/cdclk: Add cdclk check to atomic check (rev2)
URL   : https://patchwork.freedesktop.org/series/100671/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.




[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/display/adlp: Remove code related to underrun recovery

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/display/adlp: Remove code related to underrun recovery
URL   : https://patchwork.freedesktop.org/series/100965/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11316 -> Patchwork_22466


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/index.html

Participating hosts (48 -> 44)
--

  Additional (1): fi-pnv-d510 
  Missing(5): shard-tglu fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-bdw-samus 

Known issues


  Here are the changes found in Patchwork_22466 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
- fi-snb-2600:NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-snb-2600/igt@amdgpu/amd_cs_...@sync-fork-compute0.html

  * igt@gem_exec_suspend@basic-s3@smem:
- fi-skl-6600u:   [PASS][2] -> [INCOMPLETE][3] ([i915#4547])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-skl-6600u/igt@gem_exec_suspend@basic...@smem.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-skl-6600u/igt@gem_exec_suspend@basic...@smem.html

  * igt@gem_huc_copy@huc-copy:
- fi-pnv-d510:NOTRUN -> [SKIP][4] ([fdo#109271]) +57 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-pnv-d510/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@live@execlists:
- fi-bsw-kefka:   [PASS][5] -> [INCOMPLETE][6] ([i915#2940])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  * igt@runner@aborted:
- fi-bsw-kefka:   NOTRUN -> [FAIL][7] ([fdo#109271] / [i915#1436] / 
[i915#2722] / [i915#3428] / [i915#4312])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-bsw-kefka/igt@run...@aborted.html

  
 Possible fixes 

  * igt@i915_pm_rpm@module-reload:
- fi-icl-u2:  [FAIL][8] ([i915#3049]) -> [PASS][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-icl-u2/igt@i915_pm_...@module-reload.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-icl-u2/igt@i915_pm_...@module-reload.html

  * igt@i915_selftest@live@evict:
- {bat-rpls-2}:   [DMESG-WARN][10] ([i915#4391]) -> [PASS][11] +1 
similar issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/bat-rpls-2/igt@i915_selftest@l...@evict.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/bat-rpls-2/igt@i915_selftest@l...@evict.html

  * igt@i915_selftest@live@hangcheck:
- fi-icl-u2:  [DMESG-WARN][12] ([i915#2867]) -> [PASS][13] +7 
similar issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html
- fi-snb-2600:[INCOMPLETE][14] ([i915#3921]) -> [PASS][15]
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html

  * igt@i915_selftest@live@workarounds:
- {bat-adlp-6}:   [DMESG-WARN][16] ([i915#5068]) -> [PASS][17]
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/bat-adlp-6/igt@i915_selftest@l...@workarounds.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/bat-adlp-6/igt@i915_selftest@l...@workarounds.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@a-edp1:
- {bat-adlp-6}:   [DMESG-WARN][18] ([i915#3576]) -> [PASS][19]
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11316/bat-adlp-6/igt@kms_flip@basic-flip-vs-wf_vbl...@a-edp1.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22466/bat-adlp-6/igt@kms_flip@basic-flip-vs-wf_vbl...@a-edp1.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3049]: https://gitlab.freedesktop.org/drm/intel/issues/3049
  [i915#3428]: https://gitlab.freedesktop.org/drm/intel/issues/3428
  [i915#3576]: https://gitlab.freedesktop.org/drm/intel/issues/3576
  [i915#3921]: https://gitlab.freedesktop.org/drm/intel/issues/3921
  

[Intel-gfx] [PATCH] drm/i915/cdclk: Add cdclk check to atomic check

2022-03-02 Thread Anusha Srivatsa
Check cdclk conditions during atomic check and prepare for the
commit phase, so that the atomic commit can be as simple as
possible. Add the specific steps to be taken during cdclk
changes, and prepare for squashing, crawling and modeset
scenarios.

Rename functions intel_cdclk_can_squash() and
intel_cdclk_can_crawl() since they no longer simply check
if squashing and crawling can be performed.
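
The split described above, deciding the steps at check time and only walking them at commit time, could look roughly like the following userspace sketch. The `sketch_` names, the array size and the selection policy are assumptions for illustration and are not taken from the patch; only the idea of an action enum plus a per-step cdclk value mirrors the diff below.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_cdclk_action {
	SKETCH_CDCLK_NOOP = 0,
	SKETCH_CDCLK_MODESET,
	SKETCH_CDCLK_CRAWL,
	SKETCH_CDCLK_SQUASH,
};

#define SKETCH_CDCLK_ACTIONS 2	/* assumed upper bound on steps per change */

struct sketch_cdclk_step {
	enum sketch_cdclk_action action;
	unsigned int cdclk;	/* target frequency for this step */
};

/* "Check" phase: record what to do. The policy here is invented. */
static void sketch_plan_steps(struct sketch_cdclk_step *steps,
			      bool can_squash, bool can_crawl,
			      unsigned int cdclk)
{
	size_t i;

	for (i = 0; i < SKETCH_CDCLK_ACTIONS; i++) {
		steps[i].action = SKETCH_CDCLK_NOOP;
		steps[i].cdclk = 0;
	}

	if (can_squash)
		steps[0].action = SKETCH_CDCLK_SQUASH;
	else if (can_crawl)
		steps[0].action = SKETCH_CDCLK_CRAWL;
	else
		steps[0].action = SKETCH_CDCLK_MODESET;
	steps[0].cdclk = cdclk;
}

/* "Commit" phase: no decisions left, just execute the recorded steps. */
static unsigned int sketch_commit_steps(const struct sketch_cdclk_step *steps,
					unsigned int current_cdclk)
{
	size_t i;

	for (i = 0; i < SKETCH_CDCLK_ACTIONS; i++) {
		switch (steps[i].action) {
		case SKETCH_CDCLK_MODESET:
		case SKETCH_CDCLK_CRAWL:
		case SKETCH_CDCLK_SQUASH:
			/* Real code would program the hardware here. */
			current_cdclk = steps[i].cdclk;
			break;
		case SKETCH_CDCLK_NOOP:
			break;
		}
	}

	return current_cdclk;
}
```

Keeping the decision logic in the check phase means the commit loop cannot fail on a policy question, which is the simplification the commit message is after.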

Cc: Stanislav Lisovskiy 
Cc: Matt Roper 
Signed-off-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/display/intel_cdclk.c| 169 +++---
 drivers/gpu/drm/i915/display/intel_cdclk.h|  16 +-
 .../drm/i915/display/intel_display_power.c|   2 +-
 3 files changed, 123 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index fda8b701..04f3f77ef0a8 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -1700,12 +1700,23 @@ static void bxt_set_cdclk(struct drm_i915_private *dev_priv,
  const struct intel_cdclk_config *cdclk_config,
  enum pipe pipe)
 {
+   struct intel_atomic_state *state;
+   struct intel_cdclk_state *new_cdclk_state;
+   struct cdclk_steps *cdclk_steps;
+   struct intel_cdclk_state *cdclk_state;
int cdclk = cdclk_config->cdclk;
int vco = cdclk_config->vco;
+   u32 squash_ctl = 0;
u32 val;
u16 waveform;
int clock;
int ret;
+   int i;
+
+   cdclk_state =  to_intel_cdclk_state(dev_priv->cdclk.obj.state);
+   state = cdclk_state->base.state;
+   new_cdclk_state = intel_atomic_get_new_cdclk_state(state);
+   cdclk_steps = new_cdclk_state->steps;
 
/* Inform power controller of upcoming frequency change. */
if (DISPLAY_VER(dev_priv) >= 11)
@@ -1728,40 +1739,43 @@ static void bxt_set_cdclk(struct drm_i915_private *dev_priv,
return;
}
 
-   if (HAS_CDCLK_CRAWL(dev_priv) && dev_priv->cdclk.hw.vco > 0 && vco > 0) {
-   if (dev_priv->cdclk.hw.vco != vco)
+   for (i = 0; i < CDCLK_ACTIONS; i++) {
+   switch (cdclk_steps[i].action) {
+   case CDCLK_MODESET:
+   if (DISPLAY_VER(dev_priv) >= 11) {
+   if (dev_priv->cdclk.hw.vco != 0 &&
+   dev_priv->cdclk.hw.vco != vco)
+   icl_cdclk_pll_disable(dev_priv);
+
+   if (dev_priv->cdclk.hw.vco != vco)
+   icl_cdclk_pll_enable(dev_priv, vco);
+   } else {
+   if (dev_priv->cdclk.hw.vco != 0 &&
+   dev_priv->cdclk.hw.vco != vco)
+   bxt_de_pll_disable(dev_priv);
+
+   if (dev_priv->cdclk.hw.vco != vco)
+   bxt_de_pll_enable(dev_priv, vco);
+   }
+   clock = cdclk;
+   break;
+   case CDCLK_CRAWL:
adlp_cdclk_pll_crawl(dev_priv, vco);
-   } else if (DISPLAY_VER(dev_priv) >= 11) {
-   if (dev_priv->cdclk.hw.vco != 0 &&
-   dev_priv->cdclk.hw.vco != vco)
-   icl_cdclk_pll_disable(dev_priv);
-
-   if (dev_priv->cdclk.hw.vco != vco)
-   icl_cdclk_pll_enable(dev_priv, vco);
-   } else {
-   if (dev_priv->cdclk.hw.vco != 0 &&
-   dev_priv->cdclk.hw.vco != vco)
-   bxt_de_pll_disable(dev_priv);
-
-   if (dev_priv->cdclk.hw.vco != vco)
-   bxt_de_pll_enable(dev_priv, vco);
-   }
-
-   waveform = cdclk_squash_waveform(dev_priv, cdclk);
-
-   if (waveform)
-   clock = vco / 2;
-   else
-   clock = cdclk;
-
-   if (has_cdclk_squasher(dev_priv)) {
-   u32 squash_ctl = 0;
-
-   if (waveform)
+   clock = cdclk;
+   break;
+   case CDCLK_SQUASH:
+   waveform =  cdclk_squash_waveform(dev_priv, cdclk_steps[i].cdclk);
+   clock = vco / 2;
squash_ctl = CDCLK_SQUASH_ENABLE |
CDCLK_SQUASH_WINDOW_SIZE(0xf) | waveform;
-
-   intel_de_write(dev_priv, CDCLK_SQUASH_CTL, squash_ctl);
+   intel_de_write(dev_priv, CDCLK_SQUASH_CTL, squash_ctl);
+   break;
+   case CDCLK_NOOP:
+   break;
+   default:
+   MISSING_CASE(cdclk_steps[i].action);
+   break;
+   }
}
 
val = bxt_cdclk_cd2x_div_sel(dev_priv, clock, vco) |
@@ -1951,11 +1965,12 @@ void 

[Intel-gfx] [PATCH] drm/i915/display/adlp: Remove code related to underrun recovery

2022-03-02 Thread Swathi Dhanavanthri
This is not supported for ADLP and is not needed.

Signed-off-by: Swathi Dhanavanthri 
---
 drivers/gpu/drm/i915/display/intel_display.c | 21 
 1 file changed, 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 6cae58f921a5..541797a2ff9e 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -3595,12 +3595,8 @@ static void hsw_set_transconf(const struct intel_crtc_state *crtc_state)
 static void bdw_set_pipemisc(const struct intel_crtc_state *crtc_state)
 {
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
-   const struct intel_crtc_scaler_state *scaler_state =
-   &crtc_state->scaler_state;
-
struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
u32 val = 0;
-   int i;
 
switch (crtc_state->pipe_bpp) {
case 18:
@@ -3639,23 +3635,6 @@ static void bdw_set_pipemisc(const struct intel_crtc_state *crtc_state)
if (DISPLAY_VER(dev_priv) >= 12)
val |= PIPEMISC_PIXEL_ROUNDING_TRUNC;
 
-   if (IS_ALDERLAKE_P(dev_priv)) {
-   bool scaler_in_use = false;
-
-   for (i = 0; i < crtc->num_scalers; i++) {
-   if (!scaler_state->scalers[i].in_use)
-   continue;
-
-   scaler_in_use = true;
-   break;
-   }
-
-   intel_de_rmw(dev_priv, PIPE_MISC2(crtc->pipe),
-PIPE_MISC2_BUBBLE_COUNTER_MASK,
-scaler_in_use ? PIPE_MISC2_BUBBLE_COUNTER_SCALER_EN :
-PIPE_MISC2_BUBBLE_COUNTER_SCALER_DIS);
-   }
-
intel_de_write(dev_priv, PIPEMISC(crtc->pipe), val);
 }
 
-- 
2.20.1



Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)

2022-03-02 Thread Vivi, Rodrigo
Thank you all.

I had reviewed this patch already in the iommu list.
Now pushed.

Thanks,
Rodrigo.

On Thu, 2022-03-03 at 00:38 +0800, Vudum, Lakshminarayana wrote:
> Filed this issue and reported.
> https://gitlab.freedesktop.org/drm/intel/-/issues/5239
>  
> Lakshmi.
>  
> From: Surendrakumar Upadhyay, TejaskumarX
> 
> Sent: Wednesday, March 2, 2022 5:20 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Vivi, Rodrigo ; Meena, Mahesh
> ; Vudum, Lakshminarayana
> 
> Subject: RE: ✗ Fi.CI.IGT: failure for iommu/vt-d: Add RPLS to quirk
> list to skip TE disabling (rev3)
>  
> Regression is not related to the patch. Please mark it pass and
> requesting to merge.
>  
> Thanks,
> Tejas
>  
> From: Patchwork 
> Sent: 02 March 2022 17:56
> To: Surendrakumar Upadhyay, TejaskumarX
> 
> Cc: intel-gfx@lists.freedesktop.org
> Subject: ✗ Fi.CI.IGT: failure for iommu/vt-d: Add RPLS to quirk list
> to skip TE disabling (rev3)
>  
> Patch Details
> Series: iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)
> URL: https://patchwork.freedesktop.org/series/100165/
> State: failure
> Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/index.html
>
> CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22458_full
>
> Summary
>
> FAILURE
>
> Serious unknown changes coming with Patchwork_22458_full absolutely need to be
> verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_22458_full, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
>
> Participating hosts (13 -> 13)
>
> No changes in participating hosts
>
> Possible new issues
>
> Here are the unknown changes that may have been introduced in
> Patchwork_22458_full:
>
> IGT changes
>
> Possible regressions
>
>  * igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic:
> - shard-tglb: PASS -> INCOMPLETE
>
> Suppressed
>
> The following results come from untrusted machines, tests, or statuses.
> They do not affect the overall result.
>
>  * {igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5}:
> - {shard-rkl}: NOTRUN -> SKIP +1 similar issue
>  * {igt@kms_plane_scaling@scaler-with-rotation-unity-scaling@pipe-d-hdmi-a-3-scaler-with-rotation}:
> - {shard-dg1}: NOTRUN -> SKIP +3 similar issues
>
> New tests
>
> New tests have been introduced between CI_DRM_11308_full and
> Patchwork_22458_full:
>
> New IGT tests (1)
>
>  * igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
> - Statuses : 1 pass(s)
> - Exec time: [1.28] s
>
> Known issues
>
> Here are the changes found in Patchwork_22458_full that come from known issues:
>
> CI changes
>
> Issues hit
>
>  * boot:
> - shard-skl: (PASS,PASS, PASS, PASS, PASS, PASS, PASS, PASS,
> PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS,
> PASS, PASS, PASS, PASS) -> (PASS,PASS, PASS, PASS, PASS, PASS, PASS,
> PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, FAIL, PASS,
> PASS, PASS, PASS, PASS, PASS, PASS) ([i915#5032])
>
> IGT changes
>
> Issues hit
>
>  * igt@gem_ctx_isolation@preservation-s3@vcs0:
> - shard-kbl: PASS -> DMESG-WARN ([i915#180]) +4 similar issues
>  * igt@gem_exec_balancer@parallel-bb-first:
> - shard-tglb: NOTRUN -> DMESG-WARN ([i915#5076])
>  * igt@gem_exec_fair@basic-deadline:
> - shard-skl: NOTRUN -> FAIL ([i915#2846])
>  * igt@gem_exec_fair@basic-none-share@rcs0:
> - shard-iclb: PASS -> FAIL ([i915#2842])
> - shard-glk: PASS -> FAIL ([i915#2842])
>  * igt@gem_exec_fair@basic-pace-solo@rcs0:
> - shard-kbl: PASS -> FAIL ([i915#2842])
>  * igt@gem_exec_whisper@basic-contexts-forked-all:
> - shard-glk: PASS -> DMESG-WARN ([i915#118])
>  * igt@gem_exec_whisper@basic-queues-forked-all:
> - shard-iclb: PASS -> INCOMPLETE ([i915#1895])
>  * igt@gem_lmem_swapping@heavy-verify-random:
> - shard-skl: NOTRUN -> SKIP ([fdo#109271] / [i915#4613]) +1
> similar issue
>  * igt@gem_lmem_swapping@smem-oom:
> - shard-apl: NOTRUN -> SKIP ([fdo#109271] / [i915#4613])
>  * igt@gem_pread@exhaustion:
> - shard-skl: NOTRUN -> WARN ([i915#2658])
>  * igt@gem_pxp@create-regular-context-1:
> - shard-iclb: NOTRUN -> SKIP ([i915#4270]) +1 similar issue
>  * igt@gem_pxp@create-regular-context-2:
> - shard-tglb: NOTRUN -> SKIP ([i915#4270])
>  * igt@gem_render_copy@y-tiled-mc-ccs-to-y-tiled-ccs:
> - shard-iclb: NOTRUN -> SKIP ([i915#768])
>  * igt@gem_userptr_blits@create-destroy-unsync:
> - shard-iclb: NOTRUN -> SKIP ([i915#3297])
>  * igt@gem_userptr_blits@dmabuf-sync:
> - shard-skl: NOTRUN -> SKIP ([fdo#109271] / [i915#3323])
>  * igt@gem_userptr_blits@input-checking:
> - shard-apl: NOTRUN -> DMESG-WARN ([i915#4991])
> - shard-kbl: NOTRUN -> DMESG-WARN ([i915#4991])
>  * igt@gen7_exec_parse@batch-without-end:
> - shard-iclb: NOTRUN -> SKIP ([fdo#109289]) +1 similar issue
>  * igt@gen9_exec_parse@allowed-all:
> - shard-apl: PASS -> DMESG-WARN ([i915#1436] / 

Re: [Intel-gfx] [PATCH v2 2/3] drm/i915: Remove the vma refcount

2022-03-02 Thread Niranjana Vishwanathapura

On Wed, Mar 02, 2022 at 11:21:59AM +0100, Thomas Hellström wrote:

Now that i915_vma_parked() is taking the object lock on vma destruction,
and the only user of the vma refcount, i915_gem_object_unbind()
also takes the object lock, remove the vma refcount.

Signed-off-by: Thomas Hellström 
---
drivers/gpu/drm/i915/i915_gem.c   | 17 +
drivers/gpu/drm/i915/i915_vma.c   | 14 +++---
drivers/gpu/drm/i915/i915_vma.h   | 14 --
drivers/gpu/drm/i915/i915_vma_types.h |  1 -
4 files changed, 16 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dd84ebabb50f..c26110abcc0b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -151,14 +151,25 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
break;
}

+   /*
+* Requiring the vm destructor to take the object lock
+* before destroying a vma would help us eliminate the
+* i915_vm_tryget() here, AND thus also the barrier stuff
+* at the end. That's an easy fix, but sleeping locks in
+* a kthread should generally be avoided.
+*/
ret = -EAGAIN;
if (!i915_vm_tryget(vma->vm))
break;

-   /* Prevent vma being freed by i915_vma_parked as we unbind */
-   vma = __i915_vma_get(vma);
spin_unlock(&obj->vma.lock);

+   /*
+* Since i915_vma_parked() takes the object lock
+* before vma destruction, it won't race us here,
+* and destroy the vma from under us.
+*/
+
if (vma) {
bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
ret = -EBUSY;
@@ -180,8 +191,6 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
ret = i915_vma_unbind(vma);
}
}
-
-   __i915_vma_put(vma);
}

i915_vm_put(vma->vm);


One issue still left in i915_gem_object_unbind() is that it temporarily removes
vmas from obj->vma.list and adds them back later, as a vma needs to be unbound
outside the obj->vma.lock spinlock. This is an issue for other places where we
iterate over obj->vma.list. i915_debugfs_describe_obj() is one such case (the
upcoming vm_bind will be another) that iterates over this list.
What is the plan here? Do we need to take the object lock while iterating over
the list?

But this is just something I noticed and is not related to this patch.
This patch looks good to me.
Reviewed-by: Niranjana Vishwanathapura 
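
For context, the __i915_vma_get() helper removed by this patch follows the
kernel's "get unless zero" reference pattern. Below is a minimal userspace
sketch of that pattern using C11 atomics. struct obj and the obj_*() names are
hypothetical and are not part of i915 or of the kernel kref API; the sketch
also assumes something else (such as a lock) keeps the memory valid while a
tryget is attempted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace object; this is not the i915 vma. */
struct obj {
	atomic_int ref;
	int payload;
};

static struct obj *obj_create(int payload)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->ref, 1);	/* the creator holds the first reference */
	o->payload = payload;
	return o;
}

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->ref);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&o->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->ref, 1) == 1;
}

/* Walks through get/put and checks that a tryget at zero fails. */
static int refcount_demo(void)
{
	struct obj *o = obj_create(42);

	if (!o)
		return -1;
	if (!obj_get_unless_zero(o))	/* count: 1 -> 2 */
		return -1;
	if (obj_put(o))			/* count: 2 -> 1, not the last */
		return -1;
	if (!obj_put(o))		/* count: 1 -> 0, the last */
		return -1;
	if (obj_get_unless_zero(o))	/* must fail once the count is zero */
		return -1;
	free(o);
	return 0;
}
```

Once every path that can destroy the object holds the same lock, as the commit
message argues for the object lock here, the tryget step becomes redundant.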



diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 91538bc38110..6fd25b39748f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -122,7 +122,6 @@ vma_create(struct drm_i915_gem_object *obj,
if (vma == NULL)
return ERR_PTR(-ENOMEM);

-   kref_init(&vma->ref);
vma->ops = &vm->vma_ops;
vma->obj = obj;
vma->size = obj->base.size;
@@ -1628,15 +1627,6 @@ void i915_vma_reopen(struct i915_vma *vma)
__i915_vma_remove_closed(vma);
}

-void i915_vma_release(struct kref *ref)
-{
-   struct i915_vma *vma = container_of(ref, typeof(*vma), ref);
-
-   i915_active_fini(&vma->active);
-   GEM_WARN_ON(vma->resource);
-   i915_vma_free(vma);
-}
-
static void force_unbind(struct i915_vma *vma)
{
if (!drm_mm_node_allocated(&vma->node))
@@ -1665,7 +1655,9 @@ static void release_references(struct i915_vma *vma, bool vm_ddestroy)
if (vm_ddestroy)
i915_vm_resv_put(vma->vm);

-   __i915_vma_put(vma);
+   i915_active_fini(&vma->active);
+   GEM_WARN_ON(vma->resource);
+   i915_vma_free(vma);
}

/**
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 67ae7341c7e0..6034991d89fe 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -222,20 +222,6 @@ void i915_vma_unlink_ctx(struct i915_vma *vma);
void i915_vma_close(struct i915_vma *vma);
void i915_vma_reopen(struct i915_vma *vma);

-static inline struct i915_vma *__i915_vma_get(struct i915_vma *vma)
-{
-   if (kref_get_unless_zero(&vma->ref))
-   return vma;
-
-   return NULL;
-}
-
-void i915_vma_release(struct kref *ref);
-static inline void __i915_vma_put(struct i915_vma *vma)
-{
-   kref_put(&vma->ref, i915_vma_release);
-}
-
void i915_vma_destroy_locked(struct i915_vma *vma);
void i915_vma_destroy(struct i915_vma *vma);

diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index eac36be184e5..be6e028c3b57 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h

Re: [Intel-gfx] [PATCH] drm/i915: avoid concurrent writes to aux_inv

2022-03-02 Thread Chris Wilson
Quoting fei.y...@intel.com (2022-03-02 18:26:57)
> From: Fei Yang 
> 
> GPU hangs have been observed when multiple engines write to the
> same aux_inv register at the same time. To avoid this each engine
> should only invalidate its own auxiliary table. The function
> gen12_emit_flush_xcs() currently invalidates the auxiliary table for
> all engines because the rq->engine is not necessarily the engine
> eventually carrying out the request, and potentially the engine
> could even be a virtual one (with engine->instance being -1).
> With this patch, auxiliary table invalidation is done only for the
> engine executing the request. And the mmio address for the aux_inv
> register is set after the engine instance becomes certain.
> 
> Signed-off-by: Chris Wilson 
> Signed-off-by: Fei Yang 
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c  | 41 ---
>  .../drm/i915/gt/intel_execlists_submission.c  | 38 +
>  drivers/gpu/drm/i915/i915_request.h   |  2 +
>  3 files changed, 47 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index b1b9c3fd7bf9..af62e2bc2c9b 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -165,30 +165,6 @@ static u32 preparser_disable(bool state)
> return MI_ARB_CHECK | 1 << 8 | state;
>  }
>  
> -static i915_reg_t aux_inv_reg(const struct intel_engine_cs *engine)
> -{
> -   static const i915_reg_t vd[] = {
> -   GEN12_VD0_AUX_NV,
> -   GEN12_VD1_AUX_NV,
> -   GEN12_VD2_AUX_NV,
> -   GEN12_VD3_AUX_NV,
> -   };
> -
> -   static const i915_reg_t ve[] = {
> -   GEN12_VE0_AUX_NV,
> -   GEN12_VE1_AUX_NV,
> -   };
> -
> -   if (engine->class == VIDEO_DECODE_CLASS)
> -   return vd[engine->instance];
> -
> -   if (engine->class == VIDEO_ENHANCEMENT_CLASS)
> -   return ve[engine->instance];
> -
> -   GEM_BUG_ON("unknown aux_inv reg\n");
> -   return INVALID_MMIO_REG;
> -}
> -
>  static u32 *gen12_emit_aux_table_inv(const i915_reg_t inv_reg, u32 *cs)
>  {
> *cs++ = MI_LOAD_REGISTER_IMM(1);
> @@ -288,7 +264,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
> if (mode & EMIT_INVALIDATE)
> aux_inv = rq->engine->mask & ~BIT(BCS0);
> if (aux_inv)
> -   cmd += 2 * hweight32(aux_inv) + 2;
> +   cmd += 4;
>  
> cs = intel_ring_begin(rq, cmd);
> if (IS_ERR(cs))
> @@ -319,16 +295,13 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
> *cs++ = 0; /* value */
>  
> if (aux_inv) { /* hsdes: 1809175790 */
> -   struct intel_engine_cs *engine;
> -   unsigned int tmp;
> -
> -   *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
> -   for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) {
> -   *cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
> -   *cs++ = AUX_INV;
> -   }
> +   *cs++ = MI_LOAD_REGISTER_IMM(1);
> +   rq->vd_ve_aux_inv = cs;
> +   *cs++ = 0; /* address to be set at submission to HW */
> +   *cs++ = AUX_INV;
> *cs++ = MI_NOOP;
> -   }
> +   } else
> +   rq->vd_ve_aux_inv = NULL;
>  
> if (mode & EMIT_INVALIDATE)
> *cs++ = preparser_disable(false);
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 1c602d4ae297..a018de6dcac5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -1258,6 +1258,34 @@ static bool completed(const struct i915_request *rq)
> return __i915_request_is_complete(rq);
>  }
>  
> +static i915_reg_t aux_inv_reg(const struct intel_engine_cs *engine)
> +{
> +   static const i915_reg_t vd[] = {
> +   GEN12_VD0_AUX_NV,
> +   GEN12_VD1_AUX_NV,
> +   GEN12_VD2_AUX_NV,
> +   GEN12_VD3_AUX_NV,
> +   };
> +
> +   static const i915_reg_t ve[] = {
> +   GEN12_VE0_AUX_NV,
> +   GEN12_VE1_AUX_NV,
> +   };
> +
> +   if (engine->class == VIDEO_DECODE_CLASS) {
> +   GEM_BUG_ON(engine->instance >= ARRAY_SIZE(vd));
> +   return vd[engine->instance];
> +   }
> +
> +   if (engine->class == VIDEO_ENHANCEMENT_CLASS) {
> +   GEM_BUG_ON(engine->instance >= ARRAY_SIZE(ve));
> +   return ve[engine->instance];
> +   }
> +
> +   GEM_BUG_ON("unknown aux_inv reg\n");
> +   return INVALID_MMIO_REG;
> +}
> +
>  static void execlists_dequeue(struct intel_engine_cs *engine)
>  {
> struct intel_engine_execlists * const execlists = 

Re: [Intel-gfx] [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Kees Cook
On Wed, Mar 02, 2022 at 12:18:45PM -0800, Linus Torvalds wrote:
> On Wed, Mar 2, 2022 at 12:07 PM Kees Cook  wrote:
> >
> > I've long wanted to change kfree() to explicitly set pointers to NULL on
> > free. https://github.com/KSPP/linux/issues/87
> 
> We've had this discussion with the gcc people in the past, and gcc
> actually has some support for it, but it's sadly tied to the actual
> function name (ie gcc has some special-casing for "free()")
> 
> See
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94527
> 
> for some of that discussion.
> 
> Oh, and I see some patch actually got merged since I looked there last
> so that you can mark "deallocator" functions, but I think it's only
> for the context matching, not for actually killing accesses to the
> pointer afterwards.

Ah! I missed that getting added in GCC 11. But yes, there it is:

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute

Hah, now we may need to split __malloc from __alloc_size. ;)

I'd still like the NULL assignment behavior, though, since some things
can easily avoid static analysis.
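
A minimal sketch of the GCC 11 attribute form referred to above, which pairs an
allocator with its deallocator. struct widget, widget_new(), widget_free() and
the WIDGET_ALLOC macro are hypothetical names used only for illustration; the
fallback branches assume that older GCC and Clang accept only the argument-less
malloc attribute.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type and functions, for illustration only. */
struct widget {
	int id;
};

void widget_free(struct widget *w);

/*
 * GCC 11 and later accept malloc(deallocator, ptr-index). Older GCC and
 * Clang are assumed to know only the plain form, so fall back to that.
 */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
#define WIDGET_ALLOC __attribute__((malloc, malloc(widget_free, 1)))
#elif defined(__GNUC__)
#define WIDGET_ALLOC __attribute__((malloc))
#else
#define WIDGET_ALLOC
#endif

WIDGET_ALLOC struct widget *widget_new(int id);

struct widget *widget_new(int id)
{
	struct widget *w = malloc(sizeof(*w));

	if (w)
		w->id = id;
	return w;
}

void widget_free(struct widget *w)
{
	free(w);
}

/* Allocates and releases a widget through the paired functions. */
static int widget_demo(void)
{
	struct widget *w = widget_new(7);
	int id;

	if (!w)
		return -1;
	id = w->id;
	widget_free(w);	/* the deallocator named in the attribute */
	return id;
}
```

With GCC 11 or later, passing the result of widget_new() to plain free()
should be flagged by -Wmismatched-dealloc. As noted in the thread, this only
checks allocator/deallocator pairing; it does not clear the pointer.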

-- 
Kees Cook


Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Remove the vm open count

2022-03-02 Thread Niranjana Vishwanathapura

On Wed, Mar 02, 2022 at 11:21:58AM +0100, Thomas Hellström wrote:

vms are not getting properly closed. Rather than fixing that,
remove the vm open count and instead rely on the vm refcount.

The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.

Unfortunately if the vm destructor and the object destructor both
want to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.

v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack and
 should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.

Co-developed-by: Niranjana Vishwanathapura 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Thomas Hellström 
---
drivers/gpu/drm/i915/display/intel_dpt.c  |  2 +-
drivers/gpu/drm/i915/gem/i915_gem_context.c   | 29 ++-
.../gpu/drm/i915/gem/i915_gem_execbuffer.c|  6 ++
.../gpu/drm/i915/gem/selftests/mock_context.c |  5 +-
drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |  2 +-
drivers/gpu/drm/i915/gt/intel_ggtt.c  | 30 +++
drivers/gpu/drm/i915/gt/intel_gtt.c   | 54 
drivers/gpu/drm/i915/gt/intel_gtt.h   | 56 
drivers/gpu/drm/i915/gt/selftest_execlists.c  | 86 +--
drivers/gpu/drm/i915/i915_gem.c   |  6 +-
drivers/gpu/drm/i915/i915_vma.c   | 55 
drivers/gpu/drm/i915/i915_vma_resource.c  |  2 +-
drivers/gpu/drm/i915/i915_vma_resource.h  |  6 ++
drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  4 +-
15 files changed, 186 insertions(+), 164 deletions(-)



Looks good to me.
Reviewed-by: Niranjana Vishwanathapura 



Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/gem: Remove some unnecessary code

2022-03-02 Thread Niranjana Vishwanathapura

On Wed, Mar 02, 2022 at 11:22:00AM +0100, Thomas Hellström wrote:

The test for vma should always return true, and when assigning -EBUSY
to ret, the variable should already have that value.

Signed-off-by: Thomas Hellström 
---
drivers/gpu/drm/i915/i915_gem.c | 32 ++--
1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c26110abcc0b..9747924cc57b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -118,6 +118,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
   unsigned long flags)
{
struct intel_runtime_pm *rpm = &to_i915(obj->base.dev)->runtime_pm;
+   bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
LIST_HEAD(still_in_list);
intel_wakeref_t wakeref;
struct i915_vma *vma;
@@ -170,26 +171,21 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 * and destroy the vma from under us.
 */

-   if (vma) {
-   bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
-   ret = -EBUSY;
-   if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) {
-   assert_object_held(vma->obj);
-   ret = i915_vma_unbind_async(vma, vm_trylock);
-   }
+   ret = -EBUSY;
+   if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) {
+   assert_object_held(vma->obj);
+   ret = i915_vma_unbind_async(vma, vm_trylock);
+   }

-   if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
- !i915_vma_is_active(vma))) {
-   if (vm_trylock) {
-   if (mutex_trylock(&vma->vm->mutex)) {
-   ret = __i915_vma_unbind(vma);
-   mutex_unlock(&vma->vm->mutex);
-   } else {
-   ret = -EBUSY;
-   }
-   } else {
-   ret = i915_vma_unbind(vma);
+   if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
+ !i915_vma_is_active(vma))) {
+   if (vm_trylock) {
+   if (mutex_trylock(&vma->vm->mutex)) {
+   ret = __i915_vma_unbind(vma);
+   mutex_unlock(&vma->vm->mutex);
}
+   } else {
+   ret = i915_vma_unbind(vma);
}
}


Looks good to me.
Reviewed-by: Niranjana Vishwanathapura 



--
2.34.1



Re: [Intel-gfx] [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Linus Torvalds
On Wed, Mar 2, 2022 at 12:07 PM Kees Cook  wrote:
>
> I've long wanted to change kfree() to explicitly set pointers to NULL on
> free. https://github.com/KSPP/linux/issues/87

We've had this discussion with the gcc people in the past, and gcc
actually has some support for it, but it's sadly tied to the actual
function name (ie gcc has some special-casing for "free()")

See

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94527

for some of that discussion.

Oh, and I see some patch actually got merged since I looked there last
so that you can mark "deallocator" functions, but I think it's only
for the context matching, not for actually killing accesses to the
pointer afterwards.

   Linus


Re: [Intel-gfx] [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Kees Cook
On Wed, Mar 02, 2022 at 10:29:31AM +0100, Rasmus Villemoes wrote:
> This won't help the current issue (because it doesn't exist and might
> never), but just in case some compiler people are listening, I'd like to
> have some sort of way to tell the compiler "treat this variable as
> uninitialized from here on". So one could do
> 
> #define kfree(p) do { __kfree(p); __magic_uninit(p); } while (0)
> 
> with __magic_uninit being a magic no-op that doesn't affect the
> semantics of the code, but could be used by the compiler's "[is/may be]
> used uninitialized" machinery to flag e.g. double frees on some odd
> error path etc. It would probably only work for local automatic
> variables, but it should be possible to just ignore the hint if p is
> some expression like foo->bar or has side effects. If we had that, the
> end-of-loop test could include that to "uninitialize" the iterator.

I've long wanted to change kfree() to explicitly set pointers to NULL on
free. https://github.com/KSPP/linux/issues/87

The thing stopping a trivial transformation of kfree() is:

kfree(get_some_pointer());

I would argue, though, that the above is poor form: the thing holding
the pointer should be the thing freeing it, so these cases should be
refactored and kfree() could do the NULLing by default.

Quoting myself in the above issue:


Without doing massive tree-wide changes, I think we need compiler
support. If we had something like __builtin_is_lvalue(), we could
distinguish function returns from lvalues. For example, right now a
common case are things like:

kfree(get_some_ptr());

But if we could at least gain coverage of the lvalue cases, and detect
them statically at compile-time, we could do:

#define __kfree_and_null(x) do { __kfree(*x); *x = NULL; } while (0)
#define kfree(x) __builtin_choose_expr(__builtin_is_lvalue(x), \
	__kfree_and_null(&(x)), __kfree(x))

Alternatively, we could do a tree-wide change of the former case (findable
with Coccinelle) and change them into something like kfree_no_null()
and redefine kfree() itself:

#define kfree_no_null(x) do { void *__ptr = (x); __kfree(__ptr); } while (0)
#define kfree(x) do { __kfree(x); x = NULL; } while (0)
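
The second alternative can be tried out in userspace. The sketch below uses
free() in place of __kfree(); free_and_null(), free_no_null() and make_buffer()
are hypothetical stand-ins, not proposed kernel names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace analogues of the two macros above. */
#define free_and_null(p) do { free(p); (p) = NULL; } while (0)
#define free_no_null(expr) do { void *ptr_ = (expr); free(ptr_); } while (0)

/* Stand-in for a function that returns an owning pointer. */
static void *make_buffer(void)
{
	return malloc(16);
}

static int free_demo(void)
{
	char *buf = malloc(32);

	free_and_null(buf);	/* buf is an lvalue: freed, then cleared */
	if (buf != NULL)
		return -1;
	free_and_null(buf);	/* free(NULL) is a no-op, so this is harmless */

	free_no_null(make_buffer());	/* not an lvalue: nothing to clear */
	return 0;
}
```

Note that free_and_null() evaluates its argument twice, so an argument with
side effects would misbehave; that is one more reason the function-return case
needs its own macro.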

-- 
Kees Cook


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: avoid concurrent writes to aux_inv (rev3)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915: avoid concurrent writes to aux_inv (rev3)
URL   : https://patchwork.freedesktop.org/series/100772/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11313 -> Patchwork_22465


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/index.html

Participating hosts (48 -> 41)
--

  Missing(7): shard-tglu fi-hsw-4200u fi-icl-u2 fi-bsw-cyan fi-ctg-p8600 
bat-jsl-2 fi-bdw-samus 

Known issues


  Here are the changes found in Patchwork_22465 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@fork-compute0:
- fi-rkl-guc: NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-rkl-guc/igt@amdgpu/amd_cs_...@fork-compute0.html

  * igt@i915_selftest@live@hangcheck:
- fi-hsw-4770:[PASS][2] -> [INCOMPLETE][3] ([i915#4785])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-hsw-4770/igt@i915_selftest@l...@hangcheck.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-hsw-4770/igt@i915_selftest@l...@hangcheck.html

  * igt@i915_selftest@live@requests:
- fi-blb-e6850:   [PASS][4] -> [DMESG-FAIL][5] ([i915#4528] / 
[i915#5026])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-blb-e6850/igt@i915_selftest@l...@requests.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-blb-e6850/igt@i915_selftest@l...@requests.html

  * igt@runner@aborted:
- fi-hsw-4770:NOTRUN -> [FAIL][6] ([fdo#109271] / [i915#1436] / 
[i915#4312])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-hsw-4770/igt@run...@aborted.html
- fi-blb-e6850:   NOTRUN -> [FAIL][7] ([fdo#109271] / [i915#2403] / 
[i915#2426] / [i915#4312])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-blb-e6850/igt@run...@aborted.html

  
 Possible fixes 

  * igt@i915_pm_rps@basic-api:
- bat-dg1-5:  [FAIL][8] ([i915#4032]) -> [PASS][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/bat-dg1-5/igt@i915_pm_...@basic-api.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/bat-dg1-5/igt@i915_pm_...@basic-api.html

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [INCOMPLETE][10] -> [PASS][11]
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  * igt@i915_selftest@live@gtt:
- fi-bdw-5557u:   [DMESG-FAIL][12] ([i915#3674]) -> [PASS][13]
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-bdw-5557u/igt@i915_selftest@l...@gtt.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-bdw-5557u/igt@i915_selftest@l...@gtt.html

  * igt@i915_selftest@live@hangcheck:
- {fi-jsl-1}: [INCOMPLETE][14] -> [PASS][15]
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-jsl-1/igt@i915_selftest@l...@hangcheck.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/fi-jsl-1/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_busy@basic@modeset:
- {bat-adlp-6}:   [DMESG-WARN][16] ([i915#3576]) -> [PASS][17]
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/bat-adlp-6/igt@kms_busy@ba...@modeset.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/bat-adlp-6/igt@kms_busy@ba...@modeset.html

  
 Warnings 

  * igt@i915_selftest@live@hangcheck:
- bat-dg1-6:  [DMESG-FAIL][18] ([i915#4494] / [i915#4957]) -> 
[DMESG-FAIL][19] ([i915#4957])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/bat-dg1-6/igt@i915_selftest@l...@hangcheck.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/bat-dg1-6/igt@i915_selftest@l...@hangcheck.html
- bat-dg1-5:  [DMESG-FAIL][20] ([i915#4957]) -> [DMESG-FAIL][21] 
([i915#4494] / [i915#4957])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/bat-dg1-5/igt@i915_selftest@l...@hangcheck.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22465/bat-dg1-5/igt@i915_selftest@l...@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#2403]: https://gitlab.freedesktop.org/drm/intel/issues/2403
  [i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426

[Intel-gfx] ✓ Fi.CI.IGT: success for vm- and vma cleanups

2022-03-02 Thread Patchwork
== Series Details ==

Series: vm- and vma cleanups
URL   : https://patchwork.freedesktop.org/series/100945/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22462_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_22462_full:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_cursor_legacy@pipe-a-single-move:
- {shard-rkl}:([PASS][1], [PASS][2]) -> [INCOMPLETE][3]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-rkl-4/igt@kms_cursor_leg...@pipe-a-single-move.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-rkl-2/igt@kms_cursor_leg...@pipe-a-single-move.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/shard-rkl-5/igt@kms_cursor_leg...@pipe-a-single-move.html

  * 
{igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25@pipe-a-edp-1-downscale-with-pixel-format}:
- shard-iclb: NOTRUN -> [SKIP][4] +2 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/shard-iclb6/igt@kms_plane_scaling@downscale-with-pixel-format-factor-0...@pipe-a-edp-1-downscale-with-pixel-format.html

  * 
{igt@kms_plane_scaling@downscale-with-rotation-factor-0-75@pipe-b-hdmi-a-3-downscale-with-rotation}:
- {shard-dg1}:NOTRUN -> [SKIP][5] +3 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/shard-dg1-18/igt@kms_plane_scaling@downscale-with-rotation-factor-0...@pipe-b-hdmi-a-3-downscale-with-rotation.html

  * 
{igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-5@pipe-a-edp-1-planes-upscale-downscale}:
- shard-iclb: [PASS][6] -> [SKIP][7] +5 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-iclb6/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-...@pipe-a-edp-1-planes-upscale-downscale.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/shard-iclb2/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-...@pipe-a-edp-1-planes-upscale-downscale.html

  * {igt@kms_plane_scaling@scaler-with-rotation-unity-scaling}:
- {shard-rkl}:NOTRUN -> [SKIP][8] +1 similar issue
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/shard-rkl-1/igt@kms_plane_scal...@scaler-with-rotation-unity-scaling.html

  
New tests
-

  New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22462_full:

### New IGT tests (1) ###

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  

Known issues


  Here are the changes found in Patchwork_22462_full that come from known 
issues:

### CI changes ###

 Issues hit 

  * boot:
- shard-glk:  ([PASS][9], [PASS][10], [PASS][11], [PASS][12], 
[PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], 
[PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], 
[PASS][25], [PASS][26], [PASS][27], [PASS][28], [PASS][29], [PASS][30], 
[PASS][31], [PASS][32], [PASS][33]) -> ([PASS][34], [PASS][35], [PASS][36], 
[PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], 
[PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], 
[PASS][49], [FAIL][50], [PASS][51], [PASS][52], [PASS][53], [PASS][54], 
[PASS][55], [PASS][56], [PASS][57], [PASS][58]) ([i915#4392])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
   [22]: 

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: avoid concurrent writes to aux_inv (rev3)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915: avoid concurrent writes to aux_inv (rev3)
URL   : https://patchwork.freedesktop.org/series/100772/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.




[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: avoid concurrent writes to aux_inv (rev3)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915: avoid concurrent writes to aux_inv (rev3)
URL   : https://patchwork.freedesktop.org/series/100772/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
48f4099e5fa7 drm/i915: avoid concurrent writes to aux_inv
-:82: CHECK:BRACES: Unbalanced braces around else statement
#82: FILE: drivers/gpu/drm/i915/gt/gen8_engine_cs.c:303:
+   } else

total: 0 errors, 0 warnings, 1 checks, 118 lines checked




[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gt: Handle errors for i915_gem_object_trylock

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/gt: Handle errors for i915_gem_object_trylock
URL   : https://patchwork.freedesktop.org/series/100951/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11313 -> Patchwork_22464


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/index.html

Participating hosts (48 -> 43)
--

  Additional (1): fi-pnv-d510 
  Missing(6): shard-tglu fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 bat-jsl-2 
fi-bdw-samus 

Known issues


  Here are the changes found in Patchwork_22464 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
- fi-snb-2600:NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-snb-2600/igt@amdgpu/amd_cs_...@sync-fork-compute0.html

  * igt@gem_exec_suspend@basic-s3:
- fi-skl-6600u:   NOTRUN -> [INCOMPLETE][2] ([i915#4547])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-skl-6600u/igt@gem_exec_susp...@basic-s3.html

  * igt@gem_huc_copy@huc-copy:
- fi-pnv-d510:NOTRUN -> [SKIP][3] ([fdo#109271]) +57 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-pnv-d510/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@live@execlists:
- fi-bsw-n3050:   [PASS][4] -> [INCOMPLETE][5] ([i915#2940])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-bsw-n3050/igt@i915_selftest@l...@execlists.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-bsw-n3050/igt@i915_selftest@l...@execlists.html

  * igt@runner@aborted:
- fi-bsw-n3050:   NOTRUN -> [FAIL][6] ([fdo#109271] / [i915#1436] / 
[i915#2722] / [i915#3428] / [i915#4312])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-bsw-n3050/igt@run...@aborted.html
- fi-bdw-5557u:   NOTRUN -> [FAIL][7] ([i915#2426] / [i915#4312])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-bdw-5557u/igt@run...@aborted.html

  
 Possible fixes 

  * igt@i915_selftest@live@hangcheck:
- bat-dg1-6:  [DMESG-FAIL][8] ([i915#4494] / [i915#4957]) -> 
[PASS][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/bat-dg1-6/igt@i915_selftest@l...@hangcheck.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/bat-dg1-6/igt@i915_selftest@l...@hangcheck.html
- {fi-jsl-1}: [INCOMPLETE][10] -> [PASS][11]
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-jsl-1/igt@i915_selftest@l...@hangcheck.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-jsl-1/igt@i915_selftest@l...@hangcheck.html
- fi-snb-2600:[INCOMPLETE][12] ([i915#3921]) -> [PASS][13]
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_busy@basic@modeset:
- {bat-adlp-6}:   [DMESG-WARN][14] ([i915#3576]) -> [PASS][15]
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/bat-adlp-6/igt@kms_busy@ba...@modeset.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/bat-adlp-6/igt@kms_busy@ba...@modeset.html

  
 Warnings 

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [INCOMPLETE][16] -> [INCOMPLETE][17] ([i915#4983])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11313/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22464/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3428]: https://gitlab.freedesktop.org/drm/intel/issues/3428
  [i915#3576]: https://gitlab.freedesktop.org/drm/intel/issues/3576
  [i915#3921]: https://gitlab.freedesktop.org/drm/intel/issues/3921
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4494]: https://gitlab.freedesktop.org/drm/intel/issues/4494
  [i915#4547]: https://gitlab.freedesktop.org/drm/intel/issues/4547
  [i915#4957]: https://gitlab.freedesktop.org/drm/intel/issues/4957
  [i915#4983]: 

[Intel-gfx] [PATCH] drm/i915: avoid concurrent writes to aux_inv

2022-03-02 Thread fei . yang
From: Fei Yang 

GPU hangs have been observed when multiple engines write to the
same aux_inv register at the same time. To avoid this each engine
should only invalidate its own auxiliary table. The function
gen12_emit_flush_xcs() currently invalidates the auxiliary table for
all engines because the rq->engine is not necessarily the engine
eventually carrying out the request, and potentially the engine
could even be a virtual one (with engine->instance being -1).
With this patch, auxiliary table invalidation is done only for the
engine executing the request. And the mmio address for the aux_inv
register is set after the engine instance becomes certain.

Signed-off-by: Chris Wilson 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c  | 41 ---
 .../drm/i915/gt/intel_execlists_submission.c  | 38 +
 drivers/gpu/drm/i915/i915_request.h   |  2 +
 3 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index b1b9c3fd7bf9..af62e2bc2c9b 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -165,30 +165,6 @@ static u32 preparser_disable(bool state)
return MI_ARB_CHECK | 1 << 8 | state;
 }
 
-static i915_reg_t aux_inv_reg(const struct intel_engine_cs *engine)
-{
-   static const i915_reg_t vd[] = {
-   GEN12_VD0_AUX_NV,
-   GEN12_VD1_AUX_NV,
-   GEN12_VD2_AUX_NV,
-   GEN12_VD3_AUX_NV,
-   };
-
-   static const i915_reg_t ve[] = {
-   GEN12_VE0_AUX_NV,
-   GEN12_VE1_AUX_NV,
-   };
-
-   if (engine->class == VIDEO_DECODE_CLASS)
-   return vd[engine->instance];
-
-   if (engine->class == VIDEO_ENHANCEMENT_CLASS)
-   return ve[engine->instance];
-
-   GEM_BUG_ON("unknown aux_inv reg\n");
-   return INVALID_MMIO_REG;
-}
-
 static u32 *gen12_emit_aux_table_inv(const i915_reg_t inv_reg, u32 *cs)
 {
*cs++ = MI_LOAD_REGISTER_IMM(1);
@@ -288,7 +264,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE)
aux_inv = rq->engine->mask & ~BIT(BCS0);
if (aux_inv)
-   cmd += 2 * hweight32(aux_inv) + 2;
+   cmd += 4;
 
cs = intel_ring_begin(rq, cmd);
if (IS_ERR(cs))
@@ -319,16 +295,13 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
mode)
*cs++ = 0; /* value */
 
if (aux_inv) { /* hsdes: 1809175790 */
-   struct intel_engine_cs *engine;
-   unsigned int tmp;
-
-   *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
-   for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) {
-   *cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
-   *cs++ = AUX_INV;
-   }
+   *cs++ = MI_LOAD_REGISTER_IMM(1);
+   rq->vd_ve_aux_inv = cs;
+   *cs++ = 0; /* address to be set at submission to HW */
+   *cs++ = AUX_INV;
*cs++ = MI_NOOP;
-   }
+   } else
+   rq->vd_ve_aux_inv = NULL;
 
if (mode & EMIT_INVALIDATE)
*cs++ = preparser_disable(false);
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1c602d4ae297..a018de6dcac5 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1258,6 +1258,34 @@ static bool completed(const struct i915_request *rq)
return __i915_request_is_complete(rq);
 }
 
+static i915_reg_t aux_inv_reg(const struct intel_engine_cs *engine)
+{
+   static const i915_reg_t vd[] = {
+   GEN12_VD0_AUX_NV,
+   GEN12_VD1_AUX_NV,
+   GEN12_VD2_AUX_NV,
+   GEN12_VD3_AUX_NV,
+   };
+
+   static const i915_reg_t ve[] = {
+   GEN12_VE0_AUX_NV,
+   GEN12_VE1_AUX_NV,
+   };
+
+   if (engine->class == VIDEO_DECODE_CLASS) {
+   GEM_BUG_ON(engine->instance >= ARRAY_SIZE(vd));
+   return vd[engine->instance];
+   }
+
+   if (engine->class == VIDEO_ENHANCEMENT_CLASS) {
+   GEM_BUG_ON(engine->instance >= ARRAY_SIZE(ve));
+   return ve[engine->instance];
+   }
+
+   GEM_BUG_ON("unknown aux_inv reg\n");
+   return INVALID_MMIO_REG;
+}
+
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -1538,6 +1566,16 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
}
 
if (__i915_request_submit(rq)) {
+   /* hsdes: 1809175790 */
+   if ((GRAPHICS_VER(engine->i915) == 12) &&

Re: [Intel-gfx] [PATCH 5/6] drm/rcar_du: changes to rcar-du driver resulting from drm_writeback_connector structure changes

2022-03-02 Thread Laurent Pinchart
Hi Abhinav,

On Wed, Mar 02, 2022 at 10:28:03AM -0800, Abhinav Kumar wrote:
> On 2/28/2022 5:42 AM, Laurent Pinchart wrote:
> > On Mon, Feb 28, 2022 at 02:28:27PM +0200, Laurent Pinchart wrote:
> >> On Mon, Feb 28, 2022 at 02:09:15PM +0200, Jani Nikula wrote:
> >>> On Mon, 28 Feb 2022, Laurent Pinchart wrote:
>  On Sat, Feb 26, 2022 at 10:27:59AM -0800, Rob Clark wrote:
> > On Wed, Feb 2, 2022 at 7:41 AM Jani Nikula wrote:
> >> On Wed, 02 Feb 2022, Laurent Pinchart wrote:
> >>> On Wed, Feb 02, 2022 at 03:15:03PM +0200, Jani Nikula wrote:
>  On Wed, 02 Feb 2022, Laurent Pinchart wrote:
> > On Wed, Feb 02, 2022 at 02:24:28PM +0530, Kandpal Suraj wrote:
> >> Changing rcar_du driver to accommodate the change of
> >> drm_writeback_connector.base and drm_writeback_connector.encoder
> >> to a pointer the reason for which is explained in the
> >> Patch(drm: add writeback pointers to drm_connector).
> >>
> >> Signed-off-by: Kandpal Suraj 
> >> ---
> >>   drivers/gpu/drm/rcar-du/rcar_du_crtc.h  | 2 ++
> >>   drivers/gpu/drm/rcar-du/rcar_du_writeback.c | 8 +---
> >>   2 files changed, 7 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h 
> >> b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> >> index 66e8839db708..68f387a04502 100644
> >> --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> >> +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
> >> @@ -72,6 +72,8 @@ struct rcar_du_crtc {
> >>const char *const *sources;
> >>unsigned int sources_count;
> >>
> >> + struct drm_connector connector;
> >> + struct drm_encoder encoder;
> >
> > Those fields are, at best, poorly named. Furthermore, there's no 
> > need in
> > this driver or in other drivers using drm_writeback_connector to 
> > create
> > an encoder or connector manually. Let's not pollute all drivers 
> > because
> > i915 doesn't have its abstractions right.
> 
>  i915 uses the quite common model for struct inheritance:
> 
>    struct intel_connector {
>    struct drm_connector base;
>    /* ... */
>    }
> 
>  Same with at least amd, ast, fsl-dcu, hisilicon, mga200, msm, 
>  nouveau,
>  radeon, tilcdc, and vboxvideo.
> 
>  We could argue about the relative merits of that abstraction, but I
>  think the bottom line is that it's popular and the drivers using it 
>  are
>  not going to be persuaded to move away from it.
> >>>
> >>> Nobody said inheritance is bad.
> >>>
>  It's no coincidence that the drivers who've implemented writeback so 
>  far
>  (komeda, mali, rcar-du, vc4, and vkms) do not use the abstraction,
>  because the drm_writeback_connector midlayer does, forcing the issue.
> >>>
> >>> Are you sure it's not a coincidence ? :-)
> >>>
> >>> The encoder and especially connector created by 
> >>> drm_writeback_connector
> >>> are there only because KMS requires a drm_encoder and a drm_connector 
> >>> to
> >>> be exposed to userspace (and I could argue that using a connector for
> >>> writeback is a hack, but that won't change). The connector is 
> >>> "virtual",
> >>> I still fail to see why i915 or any other driver would need to wrap it
> >>> into something else. The whole point of the drm_writeback_connector
> >>> abstraction is that drivers do not have to manage the writeback
> >>> drm_connector manually, they shouldn't touch it at all.
> >>
> >> The thing is, drm_writeback_connector_init() calling
> >> drm_connector_init() on the drm_connector embedded in
> >> drm_writeback_connector leads to that connector being added to the
> >> drm_device's list of connectors. Ditto for the encoder.
> >>
> >> All the driver code that handles drm_connectors would need to take into
> >> account they might not be embedded in intel_connector. Throughout the
> >> driver. Ditto for the encoders.
> >
> > The assumption that a connector is embedded in intel_connector doesn't
> > really play that well with how bridge and panel connectors work.. so
> > in general this seems like a good thing to unwind.
> >
> > But as a point of practicality, i915 is a large driver covering a lot
> > of generations of hw with a lot of users.  So I can understand
> > changing this design isn't something that can happen quickly or
> > easily.  IMO we should allow i915 to create its own connector for
> > writeback, and just document clearly that this isn't the approach new
> > drivers should take.  I mean, I understand idealism, but sometimes a
> > 

Re: [Intel-gfx] [PATCH 5/6] drm/rcar_du: changes to rcar-du driver resulting from drm_writeback_connector structure changes

2022-03-02 Thread Abhinav Kumar




On 2/28/2022 5:42 AM, Laurent Pinchart wrote:

Hello,

On Mon, Feb 28, 2022 at 02:28:27PM +0200, Laurent Pinchart wrote:

On Mon, Feb 28, 2022 at 02:09:15PM +0200, Jani Nikula wrote:

On Mon, 28 Feb 2022, Laurent Pinchart wrote:

On Sat, Feb 26, 2022 at 10:27:59AM -0800, Rob Clark wrote:

On Wed, Feb 2, 2022 at 7:41 AM Jani Nikula wrote:

On Wed, 02 Feb 2022, Laurent Pinchart wrote:

On Wed, Feb 02, 2022 at 03:15:03PM +0200, Jani Nikula wrote:

On Wed, 02 Feb 2022, Laurent Pinchart wrote:

On Wed, Feb 02, 2022 at 02:24:28PM +0530, Kandpal Suraj wrote:

Changing rcar_du driver to accommodate the change of
drm_writeback_connector.base and drm_writeback_connector.encoder
to a pointer the reason for which is explained in the
Patch(drm: add writeback pointers to drm_connector).

Signed-off-by: Kandpal Suraj 
---
  drivers/gpu/drm/rcar-du/rcar_du_crtc.h  | 2 ++
  drivers/gpu/drm/rcar-du/rcar_du_writeback.c | 8 +---
  2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h 
b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
index 66e8839db708..68f387a04502 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -72,6 +72,8 @@ struct rcar_du_crtc {
   const char *const *sources;
   unsigned int sources_count;

+ struct drm_connector connector;
+ struct drm_encoder encoder;


Those fields are, at best, poorly named. Furthermore, there's no need in
this driver or in other drivers using drm_writeback_connector to create
an encoder or connector manually. Let's not pollute all drivers because
i915 doesn't have its abstractions right.


i915 uses the quite common model for struct inheritance:

  struct intel_connector {
  struct drm_connector base;
  /* ... */
  }

Same with at least amd, ast, fsl-dcu, hisilicon, mga200, msm, nouveau,
radeon, tilcdc, and vboxvideo.

We could argue about the relative merits of that abstraction, but I
think the bottom line is that it's popular and the drivers using it are
not going to be persuaded to move away from it.


Nobody said inheritance is bad.


It's no coincidence that the drivers who've implemented writeback so far
(komeda, mali, rcar-du, vc4, and vkms) do not use the abstraction,
because the drm_writeback_connector midlayer does, forcing the issue.


Are you sure it's not a coincidence ? :-)

The encoder and especially connector created by drm_writeback_connector
are there only because KMS requires a drm_encoder and a drm_connector to
be exposed to userspace (and I could argue that using a connector for
writeback is a hack, but that won't change). The connector is "virtual",
I still fail to see why i915 or any other driver would need to wrap it
into something else. The whole point of the drm_writeback_connector
abstraction is that drivers do not have to manage the writeback
drm_connector manually, they shouldn't touch it at all.


The thing is, drm_writeback_connector_init() calling
drm_connector_init() on the drm_connector embedded in
drm_writeback_connector leads to that connector being added to the
drm_device's list of connectors. Ditto for the encoder.

All the driver code that handles drm_connectors would need to take into
account they might not be embedded in intel_connector. Throughout the
driver. Ditto for the encoders.
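
To illustrate why this matters, here is a minimal userspace sketch of the
embedding pattern. The struct and helper names (base_connector,
wrapped_connector, to_wrapped) are hypothetical stand-ins, not the actual DRM
or i915 types, and container_of is re-defined locally rather than taken from
the kernel headers. The downcast is only valid when the base object really is
embedded in the wrapper, which is the assumption a connector created by the
midlayer would not satisfy.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's container_of(); assumes a userspace build. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical core object, playing the role of struct drm_connector. */
struct base_connector {
	int id;
};

/* Hypothetical driver wrapper, playing the role of struct intel_connector. */
struct wrapped_connector {
	struct base_connector base;
	int private_state;
};

/* Valid only if 'base' is embedded in a struct wrapped_connector. */
static struct wrapped_connector *to_wrapped(struct base_connector *base)
{
	return container_of(base, struct wrapped_connector, base);
}

/* Round trip: embed, hand out the base pointer, recover the wrapper. */
void roundtrip_demo(void)
{
	struct wrapped_connector w = { .base = { .id = 7 }, .private_state = 42 };
	struct base_connector *seen_by_core = &w.base;

	assert(to_wrapped(seen_by_core) == &w);
	assert(to_wrapped(seen_by_core)->private_state == 42);
}
```

A base_connector that was not allocated inside a wrapped_connector would make
to_wrapped() return a pointer to memory that is not a wrapper at all, so every
walk over the device's connector list would need a type check before the
downcast.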


The assumption that a connector is embedded in intel_connector doesn't
really play that well with how bridge and panel connectors work.. so
in general this seems like a good thing to unwind.

But as a point of practicality, i915 is a large driver covering a lot
of generations of hw with a lot of users.  So I can understand
changing this design isn't something that can happen quickly or
easily.  IMO we should allow i915 to create its own connector for
writeback, and just document clearly that this isn't the approach new
drivers should take.  I mean, I understand idealism, but sometimes a
dose of pragmatism is needed. :-)


i915 is big, but so is Intel. It's not fair to treat everybody else as a
second class citizen and let Intel get away without doing its homework.


Laurent, as you accuse us of not doing our homework, I'll point out that
we've been embedding drm crtc, encoder and connector ever since
modesetting support was added to i915 in 2008, since before *any* of the
things you now use as a rationale for asking us to do a massive rewrite
of the driver existed.

It's been ok to embed those structures for well over ten years. It's a
common pattern, basically throughout the kernel. Other drivers do it
too, not just i915. There hasn't been the slightest hint this should not
be done until this very conversation.


I want to see this refactoring effort moving forward in i915 (and moving
to drm_bridge would then be a good idea too). If writeback support in
i915 is urgent, then we can discuss *temporary* pragmatic stopgap measures,
but not without a real effort to fix the core issue.


I think the onus is on you to first convince 

Re: [Intel-gfx] [PATCH v2 1/3] drm/i915/guc: Limit scheduling properties to avoid overflow

2022-03-02 Thread John Harrison

On 3/2/2022 01:49, Tvrtko Ursulin wrote:

On 25/02/2022 20:41, john.c.harri...@intel.com wrote:

From: John Harrison 

GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That significantly reduces the point of 32bit
overflow. On current platforms, worst case scenario is approximately
110 seconds. Rather than allowing the user to set higher values and
then get confused by early timeouts, add limits when setting these
values.

v2: Add helper functions for clamping (review feedback from Tvrtko).

Signed-off-by: John Harrison 
Reviewed-by: Daniele Ceraolo Spurio (v1)

---
  drivers/gpu/drm/i915/gt/intel_engine.h  |  6 ++
  drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 69 +
  drivers/gpu/drm/i915/gt/sysfs_engines.c | 25 +---
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
  4 files changed, 99 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h

index be4b1e65442f..5a9186f784c4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -349,4 +349,10 @@ intel_engine_get_hung_context(struct 
intel_engine_cs *engine)

  return engine->hung_ce;
  }
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);

+
  #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c

index e855c801ba28..7ad9e6006656 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -399,6 +399,26 @@ static int intel_engine_setup(struct intel_gt 
*gt, enum intel_engine_id id,

  if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
  engine->props.preempt_timeout_ms = 0;
  +    /* Cap properties according to any system limits */
+#define CLAMP_PROP(field) \
+    do { \
+    u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+    if (clamp != engine->props.field) { \
+    drm_notice(&engine->i915->drm, \
+   "Warning, clamping %s to %lld to prevent overflow\n", \
+   #field, clamp); \
+    engine->props.field = clamp; \
+    } \
+    } while (0)
+
+    CLAMP_PROP(heartbeat_interval_ms);
+    CLAMP_PROP(max_busywait_duration_ns);
+    CLAMP_PROP(preempt_timeout_ms);
+    CLAMP_PROP(stop_timeout_ms);
+    CLAMP_PROP(timeslice_duration_ms);
+
+#undef CLAMP_PROP
+
  engine->defaults = engine->props; /* never to change again */
    engine->context_size = intel_engine_context_size(gt, 
engine->class);
@@ -421,6 +441,55 @@ static int intel_engine_setup(struct intel_gt 
*gt, enum intel_engine_id id,

  return 0;
  }
  +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs 
*engine, u64 value)

+{
+    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+    return value;
+}
+
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs 
*engine, u64 value)

+{
+    value = min(value, jiffies_to_nsecs(2));
+
+    return value;
+}
+
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, 
u64 value)

+{
+    /*
+ * NB: The GuC API only supports 32bit values. However, the 
limit is further
+ * reduced due to internal calculations which would otherwise 
overflow.

+ */
+    if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+    value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
+
+    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+    return value;
+}
+
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 
value)

+{
+    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+    return value;
+}
+
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs 
*engine, u64 value)

+{
+    /*
+ * NB: The GuC API only supports 32bit values. However, the 
limit is further
+ * reduced due to internal calculations which would otherwise 
overflow.

+ */
+    if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+    value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
+
+    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+    return value;
+}
+
  static void __setup_engine_capabilities(struct intel_engine_cs 
*engine)

  {
  struct drm_i915_private *i915 = engine->i915;
diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c 
b/drivers/gpu/drm/i915/gt/sysfs_engines.c

index 967031056202..f2d9858d827c 100644
--- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
+++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
@@ -144,7 +144,7 @@ 

Re: [Intel-gfx] [PATCH 1/3] drm/i915/guc: Limit scheduling properties to avoid overflow

2022-03-02 Thread John Harrison

On 3/2/2022 01:20, Tvrtko Ursulin wrote:

On 01/03/2022 19:57, John Harrison wrote:

On 3/1/2022 02:50, Tvrtko Ursulin wrote:

On 28/02/2022 18:32, John Harrison wrote:

On 2/28/2022 08:11, Tvrtko Ursulin wrote:

On 25/02/2022 17:39, John Harrison wrote:

On 2/25/2022 09:06, Tvrtko Ursulin wrote:


On 24/02/2022 19:19, John Harrison wrote:

[snip]


./gt/uc/intel_guc_fwif.h: u32 execution_quantum;

./gt/uc/intel_guc_submission.c: desc->execution_quantum = 
engine->props.timeslice_duration_ms * 1000;


./gt/intel_engine_types.h: unsigned long 
timeslice_duration_ms;


timeslice_store/preempt_timeout_store:
err = kstrtoull(buf, 0, );

So both kconfig and sysfs can already overflow GuC, not 
only because of tick conversion internally but because at 
backend level nothing was done for assigning 64-bit into 
32-bit. Or I failed to find where it is handled.
That's why I'm adding this range check to make sure we 
don't allow overflows.


Yes and no, this fixes it, but the first bug was not only 
due to GuC internal tick conversion. It was present ever since 
the u64 from i915 was shoved into the u32 sent to GuC. So even 
if GuC used the value without additional multiplication, the bug 
would still be there. My point being that when the GuC backend was 
added, timeout_ms values should have been limited/clamped to 
U32_MAX. The tick discovery is an additional limit on top.
I'm not disagreeing. I'm just saying that the truncation 
wasn't noticed until I actually tried using very long 
timeouts to debug a particular problem. Now that it is 
noticed, we need some method of range checking and this 
simple clamp solves all the truncation problems.


Agreed in principle, just please mention in the commit message 
all aspects of the problem.


I think we can get away without a Fixes: tag since it requires 
user fiddling to break things in unexpected ways.


I would, though, put a clamp in the code which expresses both, 
something like min(u32, ..GUC LIMIT..). So the full story is 
documented forever. Or "if (> u32 || > ..GUC LIMIT..) return 
-EINVAL". Just in case the GuC limit one day changes but the u32 
stays. Perhaps internal ticks go away or anything and we are 
left with a plain 1:1 millisecond relationship.
Can certainly add a comment along the lines of "GuC API only 
takes a 32bit field but that is further reduced to GUC_LIMIT 
due to internal calculations which would otherwise overflow".
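
As a rough sketch of the two limits being discussed: the ms to us scaling
quoted earlier in this thread can wrap a 32bit field, and a clamp can express
both the field width and a firmware limit at once. The helper names and the
100000 ms firmware limit used in the demo are placeholders for illustration,
not values taken from the GuC interface.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the quoted "* 1000" assignment into a u32: silently truncates. */
static uint32_t to_guc_us_unchecked(uint64_t timeout_ms)
{
	return (uint32_t)(timeout_ms * 1000);
}

/* Clamp to the smaller of the firmware limit and what fits in a u32 of us. */
static uint64_t clamp_timeout_ms(uint64_t timeout_ms, uint64_t guc_limit_ms)
{
	uint64_t u32_limit_ms = UINT32_MAX / 1000;
	uint64_t limit = guc_limit_ms < u32_limit_ms ? guc_limit_ms : u32_limit_ms;

	return timeout_ms < limit ? timeout_ms : limit;
}

void clamp_demo(void)
{
	/* 5,000,000 ms * 1000 does not fit in 32 bits and wraps around. */
	assert(to_guc_us_unchecked(5000000) == 705032704u);

	/* Placeholder firmware limit (100000 ms) is the tighter bound. */
	assert(clamp_timeout_ms(5000000, 100000) == 100000);

	/* With no firmware limit, the u32 field width still applies. */
	assert(clamp_timeout_ms(5000000, UINT64_MAX) == UINT32_MAX / 1000);

	/* Values already in range pass through unchanged. */
	assert(clamp_timeout_ms(640, 100000) == 640);
}
```

Taking the minimum of both bounds keeps the field-width limit documented in
the code even if the firmware limit is later raised.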


But if the GuC limit is > u32 then, by definition, that means 
the GuC API has changed to take a u64 instead of a u32. So 
there will be no u32 truncation any more. So I'm not seeing a need 
to explicitly test the integer size when the value check covers 
that.


Hmm I was thinking if the internal conversion in the GuC fw 
changes so that GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS goes above 
u32, then to be extra safe by documenting in code there is the 
additional limit of the data structure field. Say the field was 
changed to take some unit larger than a millisecond. Then the 
check against the GuC MAX limit define would not be enough, 
unless that would account both for internal implementation and 
u32 in the protocol. Maybe that is overdefensive but I don't see 
that it harms. 50-50, but it's do it once and forget so I'd do it.

Huh?

How can the limit be greater than a u32 if the interface only 
takes a u32? By definition the limit would be clamped to u32 size.


If you mean that the GuC policy is in different units and those 
units might not overflow but ms units do, then actually that is 
already the case. The GuC works in us not ms. That's part of why 
the wrap around is so low, we have to multiply by 1000 before 
sending to GuC. However, that is actually irrelevant because the 
comparison is being done on the i915 side in i915's units. We 
have to scale the GuC limit to match what i915 is using. And the 
i915 side is u64 so if the scaling to i915 numbers overflows a 
u32 then who cares because that comparison can be done at 64 bits 
wide.


If the units change then that is a backwards breaking API change 
that will require a manual driver code update. You can't just 
recompile with a new header and magically get an ms to us or ms 
to s conversion in your a = b assignment. The code will need to 
be changed to do the new unit conversion (note we already convert 
from ms to us, the GuC API is all expressed in us). And that code 
change will mean having to revisit any and all scaling, type 
conversions, etc. I.e. any pre-existing checks will not 
necessarily be valid and will need to be revisited anyway. But as 
above, any scaling to GuC units has to be incorporated into the 
limit already because otherwise the limit would not fit in the 
GuC's own API.


Yes I get that, I was just worried that u32 field in the protocol 
and GUC_POLICY_MAX_EXEC_QUANTUM_MS defines are separate in the 
source code and then how to protect against forgetting to update 
both in sync.


Like if the protocol was changed to take nanoseconds, and firmware 
implementation changed to support the full 

Re: [Intel-gfx] [PATCH 2/3] drm/i915/gt: Make the heartbeat play nice with long pre-emption timeouts

2022-03-02 Thread John Harrison

On 3/2/2022 03:07, Tvrtko Ursulin wrote:

On 01/03/2022 20:59, John Harrison wrote:

On 3/1/2022 04:09, Tvrtko Ursulin wrote:


I'll trim it a bit again..

On 28/02/2022 18:55, John Harrison wrote:

On 2/28/2022 09:12, Tvrtko Ursulin wrote:

On 25/02/2022 18:48, John Harrison wrote:

On 2/25/2022 10:14, Tvrtko Ursulin wrote:


[snip]

Your only objection is that it ends up with too long a total time 
before reset? Or something else as well?
An unnecessarily long total heartbeat timeout is the main 
objection. (2.5 + 12) * 5 = 72.5 seconds. That is a massive 
change from the current 12.5s.


If we are happy with that huge increase then fine. But I'm pretty 
sure you are going to get a lot more bug reports about hung 
systems not recovering. 10-20s is just about long enough for 
someone to wait before leaning on the power button of their 
machine. Over a minute is not. That kind of delay is going to 
cause support issues.


Sorry I wrote 12s, while you actually said tP * 12, so 7.68s, 
chosen just so it is longer than tH * 3?


And how do you keep coming up with factor of five? Isn't it four 
periods before "heartbeat stopped"? (Prio normal, heartbeat, 
barrier and then reset.)

Prio starts at low not normal.


Right, slipped my mind since I only keep seeing that one priority 
ladder block in intel_engine_heartbeat.c/heartbeat()..


From the point of view of user experience I agree reasonable 
responsiveness is needed before user "reaches for the power button".


In your proposal we are talking about 3 * 2.5s + 2 * 7.5s, so 22.5s.

Question of workloads.. what is the actual preempt timeout compute 
is happy with? And I don't mean compute setups with disabled 
hangcheck, which you say they want anyway, but if we target 
defaults for end users. Do we have some numbers on what they are 
likely to run?
Not that I have ever seen. This is all just finger in the air 
stuff. I don't recall if we invented the number and the compute 
people agreed with it or if they proposed the number to us.


Yeah me neither. And found nothing in my email archives. :(

Thinking about it today I don't see that disabled timeout is a 
practical default.


With it, if users have something un-preemptable to run (assuming 
prio normal), it would get killed after ~13s (5 * 2.5).


If we go for my scheme it gets killed in ~17.5s (3 * (2.5 + 2.5) + 
2.5 (third pulse triggers preempt timeout)).


And if we go for your scheme it gets killed in ~22.5s (4 * 2.5 + 2 * 
3 * 2.5).
Erm, that is not an apples to apples comparison. Your 17.5 is for an 
engine reset tripped by the pre-emption timeout, but your 22.5s is 
for a GT reset tripped by the heartbeat reaching the end and nuking 
the universe.


Right, in your scheme I did get it wrong. It would wait for GuC to 
reset the engine at the end, rather than hit the fake "heartbeat 
stopped" in that case, full reset path.


4 * 2.5 to trigger a max prio pulse, then 3 * 2.5 preempt timeout for 
GuC to reset (last heartbeat delay extended so it does not trigger). So 
17.5 as well.


Again, apples or oranges? I was using your tP(RCS) == 2.5s assumption in 
all the above calculations given that the discussion was about the 
heartbeat algorithm, not the choice of pre-emption timeout. In which 
case the last heartbeat is max(tP * 2, tH) == 2 * 2.5s.




If you are saying that the first pulse at sufficient priority (third 
being normal prio) is what causes the reset because the system is 
working as expected and the pre-emption timeout trips the reset. In 
that case, you have two periods to get to normal prio plus one 
pre-emption timeout to trip the reset. I.e. (tH * 2) + tP.


Your scheme is then tH(actual) = tH(user) + tP, yes?
So pre-emption based reset is after ((tH(user) + tP) * 2) + tP => (3 
* tP) + (2 * tH)

And GT based reset is after (tH(user) + tP) * 5 => (5 * tP) + (5 * tH)

My scheme is tH(actual) = tH(user) for first four, then max(tH(user), 
tP) for fifth.

So pre-emption based reset is after tH(user) * 2 + tP => tP + (2 * tH);
And GT based reset is after (tH(user) * 4) + (max(tH(user), tP) * 1) 
=> greater of ((4 * tH) + tP) or (5 * tH)


Either way your scheme is longer. With tH(user) = 2.5s, tP(RCS) = 
7.5s, we get 27.5s for engine and 50s for GT versus my 12.5s for 
engine and 17.5s for GT. With tP(RCS) = 2.5s, yours is 12.5s for 
engine and 25s for GT versus my 7.5s for engine and 12.5s for GT.
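
The totals above can be reproduced with a small sketch of the formulas from
this mail. Times are in milliseconds to stay in integer arithmetic, and the
function names are made up for illustration; they are not i915 functions.

```c
#include <assert.h>
#include <stdint.h>

/* Scheme where every period is extended: tH(actual) = tH(user) + tP. */
static uint64_t every_period_engine_reset_ms(uint64_t th, uint64_t tp)
{
	return 2 * (th + tp) + tp;	/* (3 * tP) + (2 * tH) */
}

static uint64_t every_period_gt_reset_ms(uint64_t th, uint64_t tp)
{
	return 5 * (th + tp);		/* (5 * tP) + (5 * tH) */
}

/* Scheme where only the fifth period is stretched to max(tH, tP). */
static uint64_t last_period_engine_reset_ms(uint64_t th, uint64_t tp)
{
	return 2 * th + tp;
}

static uint64_t last_period_gt_reset_ms(uint64_t th, uint64_t tp)
{
	return 4 * th + (th > tp ? th : tp);
}

/* Checks the figures quoted in the paragraph above. */
void check_quoted_totals(void)
{
	/* tH(user) = 2.5s, tP(RCS) = 7.5s */
	assert(every_period_engine_reset_ms(2500, 7500) == 27500);
	assert(every_period_gt_reset_ms(2500, 7500) == 50000);
	assert(last_period_engine_reset_ms(2500, 7500) == 12500);
	assert(last_period_gt_reset_ms(2500, 7500) == 17500);

	/* tH(user) = 2.5s, tP(RCS) = 2.5s */
	assert(every_period_engine_reset_ms(2500, 2500) == 12500);
	assert(every_period_gt_reset_ms(2500, 2500) == 25000);
	assert(last_period_engine_reset_ms(2500, 2500) == 7500);
	assert(last_period_gt_reset_ms(2500, 2500) == 12500);
}
```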


Plus, not sure why your calculations above are using 2.5 for tP? Are 
you still arguing that 7.5s is too long? That is a separate issue and 
not related to the heartbeat algorithms. tP must be long enough to 
allow 'out of box OpenCL workloads to complete'. That doesn't just 
mean not being killed by the heartbeat, it also means not being 
killed by running two of them concurrently (or one plus desktop 
OpenGL rendering) and not having it killed by basic time slicing 
between the two contexts. The heartbeat is not involved in that 
process. That is purely the pre-emption timeout. And that is the 

Re: [Intel-gfx] [PATCH 0/3] Improve anti-pre-emption w/a for compute workloads

2022-03-02 Thread John Harrison

On 3/2/2022 03:21, Tvrtko Ursulin wrote:

On 28/02/2022 19:17, John Harrison wrote:

On 2/28/2022 07:32, Tvrtko Ursulin wrote:

On 25/02/2022 19:03, John Harrison wrote:

On 2/25/2022 10:29, Tvrtko Ursulin wrote:

On 25/02/2022 18:01, John Harrison wrote:

On 2/25/2022 09:39, Tvrtko Ursulin wrote:

On 25/02/2022 17:11, John Harrison wrote:

On 2/25/2022 08:36, Tvrtko Ursulin wrote:

On 24/02/2022 20:02, John Harrison wrote:

On 2/23/2022 04:00, Tvrtko Ursulin wrote:

On 23/02/2022 02:22, John Harrison wrote:

On 2/22/2022 01:53, Tvrtko Ursulin wrote:

On 18/02/2022 21:33, john.c.harri...@intel.com wrote:

From: John Harrison 

Compute workloads are inherently not pre-emptible on 
current hardware.
Thus the pre-emption timeout was disabled as a workaround 
to prevent
unwanted resets. Instead, the hang detection was left to 
the heartbeat
and its (longer) timeout. This is undesirable with GuC 
submission as
the heartbeat is a full GT reset rather than a per engine 
reset and so
is much more destructive. Instead, just bump the 
pre-emption timeout


Can we have a feature request to allow asking GuC for an 
engine reset?

For what purpose?


To allow "stopped heartbeat" to reset the engine, however..

GuC manages the scheduling of contexts across engines. With 
virtual engines, the KMD has no knowledge of which engine a 
context might be executing on. Even without virtual 
engines, the KMD still has no knowledge of which context is 
currently executing on any given engine at any given time.


There is a reason why hang detection should be left to the 
entity that is doing the scheduling. Any other entity is 
second guessing at best.


The reason for keeping the heartbeat around even when GuC 
submission is enabled is for the case where the KMD/GuC 
have got out of sync with each other somehow or GuC 
itself has just crashed. I.e. when no submission at all is 
working and we need to reset the GuC itself and start over.


.. I wasn't really up to speed to know/remember heartbeats 
are nerfed already in GuC mode.
Not sure what you mean by that claim. Engine resets are 
handled by GuC because GuC handles the scheduling. You can't 
do the former if you aren't doing the latter. However, the 
heartbeat is still present and is still the watchdog by which 
engine resets are triggered. As per the rest of the 
submission process, the hang detection and recovery is split 
between i915 and GuC.


I meant that "stopped heartbeat on engine XXX" can only do a 
full GPU reset on GuC.
I mean that there is no 'stopped heartbeat on engine XXX' when 
i915 is not handling the recovery part of the process.


H?

static void
reset_engine(struct intel_engine_cs *engine, struct i915_request *rq)
{
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
    show_heartbeat(rq, engine);

if (intel_engine_uses_guc(engine))
    /*
 * GuC itself is toast or GuC's hang detection
 * is disabled. Either way, need to find the
 * hang culprit manually.
 */
    intel_guc_find_hung_context(engine);

intel_gt_handle_error(engine->gt, engine->mask,
  I915_ERROR_CAPTURE,
  "stopped heartbeat on %s",
  engine->name);
}

How is there no "stopped heartbeat" in GuC mode? From this code 
it certainly looks like there is.
Only when the GuC is toast and it is no longer an engine reset 
but a full GT reset that is required. So technically, it is not a 
'stopped heartbeat on engine XXX' it is 'stopped heartbeat on GT#'.




You say below heartbeats are going in GuC mode. Now I totally 
don't understand how they are going but there is allegedly no 
"stopped heartbeat".
Because if GuC is handling the detection and recovery then i915 
will not reach that point. GuC will do the engine reset and start 
scheduling the next context before the heartbeat period expires. 
So the notification will be a G2H about a specific context being 
reset rather than the i915 notification about a stopped heartbeat.






intel_gt_handle_error(engine->gt, engine->mask,
  I915_ERROR_CAPTURE,
  "stopped heartbeat on %s",
  engine->name);

intel_gt_handle_error:

/*
 * Try engine reset when available. We fall back to full 
reset if

 * single reset fails.
 */
if (!intel_uc_uses_guc_submission(&gt->uc) &&
    intel_has_reset_engine(gt) && !intel_gt_is_wedged(gt)) {
    local_bh_disable();
    for_each_engine_masked(engine, gt, engine_mask, tmp) {

You said "However, the heartbeat is still present and is still 
the watchdog by which engine resets are triggered", now I 
don't know what you meant by this. It actually triggers a 
single engine reset in GuC mode? Where in code does that 
happen if this block above shows it not taking the engine 
reset path?

i915 sends down the per engine pulse.
GuC schedules the pulse
GuC attempts to pre-empt the currently active context
GuC detects the pre-emption timeout
GuC resets the 

Re: [Intel-gfx] [PATCH 6/6] treewide: remove check of list iterator against head past the loop body

2022-03-02 Thread Tvrtko Ursulin



On 28/02/2022 11:08, Jakob Koschel wrote:

When list_for_each_entry() completes the iteration over the whole list
without breaking the loop, the iterator value will be a bogus pointer
computed based on the head element.

While it is safe to use the pointer to determine if it was computed
based on the head element, either with list_entry_is_head() or
&pos->member == head, using the iterator variable after the loop should
be avoided.

In preparation to limiting the scope of a list iterator to the list
traversal loop, use a dedicated pointer to point to the found element.

Signed-off-by: Jakob Koschel 


[snip until i915 parts]


  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 14 +++---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 15 ---
  drivers/gpu/drm/i915/gt/intel_ring.c  | 15 ---


[snip]


diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 00327b750fbb..80c79028901a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -107,25 +107,27 @@ static void lut_close(struct i915_gem_context *ctx)
radix_tree_for_each_slot(slot, &ctx->handles_vma, &iter, 0) {
struct i915_vma *vma = rcu_dereference_raw(*slot);
struct drm_i915_gem_object *obj = vma->obj;
-   struct i915_lut_handle *lut;
+   struct i915_lut_handle *lut = NULL;
+   struct i915_lut_handle *tmp;

if (!kref_get_unless_zero(&obj->base.refcount))
continue;

spin_lock(&obj->lut_lock);
-   list_for_each_entry(lut, &obj->lut_list, obj_link) {
-   if (lut->ctx != ctx)
+   list_for_each_entry(tmp, &obj->lut_list, obj_link) {
+   if (tmp->ctx != ctx)
continue;

-   if (lut->handle != iter.index)
+   if (tmp->handle != iter.index)
continue;

-   list_del(&lut->obj_link);
+   list_del(&tmp->obj_link);
+   lut = tmp;
break;
}
spin_unlock(&obj->lut_lock);

-   if (&lut->obj_link != &obj->lut_list) {
+   if (lut) {
i915_lut_handle_free(lut);
radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);


Looks okay although personally I would have left lut as is for a smaller 
diff and introduced a new local like 'found' or 'unlinked'.
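
As an illustration of the pattern under discussion, here is a minimal userspace sketch. It assumes a hand-rolled circular list rather than the kernel's <linux/list.h>; `struct item`, `find_item()` and the local `container_of` macro are hypothetical stand-ins, not code from the patch. The loop cursor stays inside the loop and only a dedicated `found` pointer, which is NULL when nothing matched, is used afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* Userspace approximation of the kernel macro of the same name. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical list element, only for this sketch. */
struct item {
	int key;
	struct list_head link;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Return the first item whose key matches, or NULL if there is none.
 * The cursor never escapes the loop; only "found" does, so the caller
 * never sees a pointer that was computed from the list head.
 */
static struct item *find_item(struct list_head *head, int key)
{
	struct item *found = NULL;
	struct list_head *pos;

	assert(head != NULL);

	for (pos = head->next; pos != head; pos = pos->next) {
		struct item *tmp = container_of(pos, struct item, link);

		if (tmp->key != key)
			continue;

		found = tmp;
		break;
	}

	return found;
}
```

After the loop, the caller only needs a plain NULL check on the returned pointer, which mirrors the `if (lut)` test in the hunk above.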



i915_vma_close(vma);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1736efa43339..fda9e3685ad2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2444,7 +2444,8 @@ static struct i915_request *eb_throttle(struct 
i915_execbuffer *eb, struct intel
  {
struct intel_ring *ring = ce->ring;
struct intel_timeline *tl = ce->timeline;
-   struct i915_request *rq;
+   struct i915_request *rq = NULL;
+   struct i915_request *tmp;

/*
 * Completely unscientific finger-in-the-air estimates for suitable
@@ -2460,15 +2461,17 @@ static struct i915_request *eb_throttle(struct 
i915_execbuffer *eb, struct intel
 * claiming our resources, but not so long that the ring completely
 * drains before we can submit our next request.
 */
-   list_for_each_entry(rq, &tl->requests, link) {
-   if (rq->ring != ring)
+   list_for_each_entry(tmp, &tl->requests, link) {
+   if (tmp->ring != ring)
continue;

-   if (__intel_ring_space(rq->postfix,
-  ring->emit, ring->size) > ring->size / 2)
+   if (__intel_ring_space(tmp->postfix,
+  ring->emit, ring->size) > ring->size / 2) {
+   rq = tmp;
break;
+   }
}
-   if (&rq->link == &tl->requests)
+   if (!rq)
return NULL; /* weird, we will check again later for real */


Alternatively, instead of break could simply do "return 
i915_request_get(rq);" and replace the end of the function after the 
loop with "return NULL;". A bit smaller diff, or at least less "spread 
out" over the function, so might be easier to backport stuff touching 
this area in the future. But looks correct as is.




return i915_request_get(rq);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
b/drivers/gpu/drm/i915/gt/intel_ring.c
index 2fdd52b62092..4881c4e0c407 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -191,24 +191,27 @@ wait_for_space(struct intel_ring *ring,
   struct intel_timeline *tl,
   unsigned int bytes)
  {
-   struct i915_request *target;
+   struct 

Re: [Intel-gfx] [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-02 Thread Alex Deucher
On Wed, Mar 2, 2022 at 10:55 AM Michael Cheng  wrote:
>
> Thanks for the feedback Robin!
>
> Sorry, my choice of words wasn't great; what I meant was to
> understand how ARM flushes a range of dcache for device drivers, not
> to find an equivalent of x86 clflush.
>
> I believe the concern is if the CPU writes an update, that update might
> only be sitting in the CPU cache and never make it to device memory
> where the device can see it; there are specific places that we are
> supposed to flush the CPU caches to make sure our updates are visible to
> the hardware.
>
> +Matt Roper
>
> Matt, Lucas, any feedback here?

MMIO (e.g., PCI BARs, etc.) should be mapped uncached.  If it's not
you'll have a lot of problems using a GPU on that architecture.  One
thing that you may want to check is if your device has its own caches
or write queues on the BAR aperture.  You may have to flush them after
CPU access to the BAR to make sure CPU updates land in device memory.
For system memory, PCI, per the spec, should be cache coherent with
the CPU.  If it's not, you'll have a lot of trouble using a GPU on
that platform.

Alex

>
> On 2022-03-02 4:49 a.m., Robin Murphy wrote:
> > On 2022-02-25 19:27, Michael Cheng wrote:
> >> Hi Robin,
> >>
> >> [ +arm64 maintainers for their awareness, which would have been a
> >> good thing to do from the start ]
> >>
> >>   * Thanks for adding the arm64 maintainer and sorry I didn't rope them
> >> in sooner.
> >>
> >> Why does i915 need to ensure the CPU's instruction cache is coherent
> >> with its data cache? Is it a self-modifying driver?
> >>
> >>   * Also thanks for pointing this out. Initially I was using
> >> dcache_clean_inval_poc, which seems to be the equivalent of what
> >> x86 is doing for dcache flushing, but it was giving me build errors
> >> since it's not on the global list of kernel symbols. And after
> >> revisiting the documentation for caches_clean_inval_pou, it won't
> >> fly for what we are trying to do. Moving forward, what would you (or
> >> someone in the ARM community) suggest we do? Could it be possible to
> >> export dcache_clean_inval_poc as a global symbol?
> >
> > Unlikely, unless something with a legitimate need for CPU-centric
> > cache maintenance like kexec or CPU hotplug ever becomes modular.
> >
> > In the case of a device driver, it's not even the basic issues of
> > assuming to find direct equivalents to x86 semantics in other CPU
> > architectures, or effectively reinventing parts of the DMA API, it's
> > even bigger than that. Once you move from being integrated in a single
> > vendor's system architecture to being on a discrete card, you
> > fundamentally *no longer have any control over cache coherency*.
> > Whether the host CPU architecture happens to be AArch64, RISC-V, or
> > whatever doesn't really matter, you're at the mercy of 3rd-party PCIe
> > and interconnect IP vendors, and SoC integrators. You'll find yourself
> > in systems where PCIe simply cannot snoop any caches, where you'd
> > better have the correct DMA API calls in place to have any hope of
> > even the most basic functionality working properly; you'll find
> > yourself in systems where even if the PCIe root complex claims to
> > support No Snoop, your uncached traffic will still end up snooping
> > stale data that got prefetched back into caches you thought you'd
> > invalidated; you'll find yourself in systems where your memory
> > attributes may or may not get forcibly rewritten by an IOMMU depending
> > on the kernel config and/or command line.
> >
> > It's not about simply finding a substitute for clflush, it's that the
> > reasons you have for using clflush in the first place can no longer be
> > assumed to be valid.
> >
> > Robin.
> >
> >> On 2022-02-25 10:24 a.m., Robin Murphy wrote:
> >>> [ +arm64 maintainers for their awareness, which would have been a
> >>> good thing to do from the start ]
> >>>
> >>> On 2022-02-25 03:24, Michael Cheng wrote:
>  Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
>  performs a flush by first performing a clean, followed by an
>  invalidation operation.
> 
>  v2 (Michael Cheng): Use correct macro for cleaning and invalidating
>  the dcache. Thanks Tvrtko for the suggestion.
> 
>  v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h
> 
>  v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
>  symbol that could be used by other modules, thus use
>  caches_clean_inval_pou instead. Also this version
>  removes the include for cacheflush, since it's already
>  included based on architecture type.
> 
>  Signed-off-by: Michael Cheng 
>  Reviewed-by: Matt Roper 
>  ---
>    drivers/gpu/drm/drm_cache.c | 5 +
>    1 file changed, 5 insertions(+)
> 
>  diff --git a/drivers/gpu/drm/drm_cache.c 

Re: [Intel-gfx] [PATCH v5 4/7] drm/i915/gt: create per-tile sysfs interface

2022-03-02 Thread Andrzej Hajda




On 17.02.2022 15:41, Andi Shyti wrote:

Now that we have tiles we want each of them to have its own
interface. A directory "gt/" is created under "cardN/" that will
contain as many directories as there are tiles.

In the coming patches tile related interfaces will be added. For
now the sysfs gt structure simply has an id interface related
to the current tile count.

The directory structure will follow this scheme:

 /sys/.../card0
          └── gt
              ├── gt0
              │   └── id
              :
              :
              └── gtN
                  └── id

This new set of interfaces will be a basic tool for system
managers and administrators when using i915.

Signed-off-by: Andi Shyti 
Cc: Matt Roper 
Cc: Sujaritha Sundaresan 
Cc: Tvrtko Ursulin 
Reviewed-by: Sujaritha Sundaresan 
---
  drivers/gpu/drm/i915/Makefile|   1 +
  drivers/gpu/drm/i915/gt/intel_gt.c   |   2 +
  drivers/gpu/drm/i915/gt/intel_gt_sysfs.c | 118 +++
  drivers/gpu/drm/i915/gt/intel_gt_sysfs.h |  34 +++
  drivers/gpu/drm/i915/i915_drv.h  |   2 +
  drivers/gpu/drm/i915/i915_sysfs.c|  12 ++-
  drivers/gpu/drm/i915/i915_sysfs.h|   3 +
  7 files changed, 171 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_sysfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9d588d936e3d..277064b00afd 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -105,6 +105,7 @@ gt-y += \
gt/intel_gt_pm_debugfs.o \
gt/intel_gt_pm_irq.o \
gt/intel_gt_requests.o \
+   gt/intel_gt_sysfs.o \
gt/intel_gtt.o \
gt/intel_llc.o \
gt/intel_lrc.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 8c64b81e9ec9..0f080bbad043 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -26,6 +26,7 @@
  #include "intel_rc6.h"
  #include "intel_renderstate.h"
  #include "intel_rps.h"
+#include "intel_gt_sysfs.h"
  #include "intel_uncore.h"
  #include "shmem_utils.h"
  
@@ -458,6 +459,7 @@ void intel_gt_driver_register(struct intel_gt *gt)

intel_rps_driver_register(&gt->rps);
  
  	intel_gt_debugfs_register(gt);

+   intel_gt_sysfs_register(gt);
  }
  
  static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
new file mode 100644
index ..0206e9aa4867
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "i915_drv.h"
+#include "i915_sysfs.h"
+#include "intel_gt.h"
+#include "intel_gt_sysfs.h"
+#include "intel_gt_types.h"
+#include "intel_rc6.h"
+
+bool is_object_gt(struct kobject *kobj)
+{
+   return !strncmp(kobj->name, "gt", 2);
+}


It looks quite fragile, but at the moment I do not have a better idea :) 
Maybe after reviewing the rest of the patches.
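
To make the fragility concern concrete, here is a small userspace sketch. The prefix check is the same comparison as in the patch, applied to a plain string; `is_object_gt_strict()` is a hypothetical alternative invented for illustration only, as are the example names used with it. A two-character prefix match accepts any name that merely starts with "gt".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same comparison as the patch: true for any name starting with "gt". */
static bool is_object_gt_prefix(const char *name)
{
	assert(name != NULL);

	return !strncmp(name, "gt", 2);
}

/*
 * Hypothetical stricter check: "gt" followed by one or more decimal
 * digits and nothing else, e.g. "gt0" or "gt12".
 */
static bool is_object_gt_strict(const char *name)
{
	const char *p;

	assert(name != NULL);

	if (strncmp(name, "gt", 2))
		return false;

	p = name + 2;
	if (*p == '\0')
		return false;

	for (; *p; p++) {
		if (*p < '0' || *p > '9')
			return false;
	}

	return true;
}
```

With these helpers, a made-up name such as "gtt" passes the prefix check but fails the strict one, while "gt0" passes both. Whether a stricter name match is the right fix, as opposed to not keying on the name at all, is a separate question.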



+
+static struct intel_gt *kobj_to_gt(struct kobject *kobj)
+{
+   return container_of(kobj, struct kobj_gt, base)->gt;
+}
+
+struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
+   const char *name)
+{
+   struct kobject *kobj = &dev->kobj;
+
+   /*
+* We are interested at knowing from where the interface
+* has been called, whether it's called from gt/ or from
+* the parent directory.
+* From the interface position it depends also the value of
+* the private data.
+* If the interface is called from gt/ then private data is
+* of the "struct intel_gt *" type, otherwise it's * a
+* "struct drm_i915_private *" type.
+*/
+   if (!is_object_gt(kobj)) {
+   struct drm_i915_private *i915 = kdev_minor_to_i915(dev);
+
+   pr_devel_ratelimited(DEPRECATED
+   "%s (pid %d) is accessing deprecated %s "
+   "sysfs control, please use gt/gt<n>/%s instead\n",
+   current->comm, task_pid_nr(current), name, name);
+   return to_gt(i915);
+   }
+
+   return kobj_to_gt(kobj);


It took some time for me to understand what is going on here.
We have a dev argument which can sometimes point to a "struct device" and 
sometimes to a "struct kobj_gt", but its type suggests otherwise, which is 
quite ugly.
I wonder if it wouldn't be better to use __ATTR instead of DEVICE_ATTR*, as 
in the case of intel_engines_add_sysfs. This way the abstractions would 
hopefully look better.
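
For readers less familiar with the idiom behind kobj_to_gt(), the following userspace sketch shows why the lookup is only valid when the kobject really is embedded in the wrapper. All `fake_*` types and the local `container_of` macro are hypothetical stand-ins for the kernel structures, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace approximation of the kernel macro of the same name. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct kobject and struct intel_gt. */
struct fake_kobject {
	const char *name;
};

struct fake_gt {
	int id;
};

/* Wrapper that embeds the kobject, in the spirit of struct kobj_gt. */
struct fake_kobj_gt {
	struct fake_kobject base;
	struct fake_gt *gt;
};

/*
 * Recover the owning wrapper from a pointer to its embedded kobject.
 * This is only meaningful if kobj really points at the "base" member
 * of a struct fake_kobj_gt; for any other kobject the pointer
 * arithmetic yields an unrelated address.
 */
static struct fake_gt *fake_kobj_to_gt(struct fake_kobject *kobj)
{
	assert(kobj != NULL);

	return container_of(kobj, struct fake_kobj_gt, base)->gt;
}
```

This is why the patch has to decide which kind of object it was handed before converting, and why a type that states what the callback receives would make that decision unnecessary.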



+}
+
+static ssize_t id_show(struct device *dev,
+  struct device_attribute *attr,
+  char *buf)
+{
+   struct intel_gt *gt = 

[Intel-gfx] ✓ Fi.CI.IGT: success for iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)

2022-03-02 Thread Patchwork
== Series Details ==

Series: iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)
URL   : https://patchwork.freedesktop.org/series/100165/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22458_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_22458_full:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5}:
- {shard-rkl}:NOTRUN -> [SKIP][1] +1 similar issue
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-rkl-1/igt@kms_plane_scal...@planes-upscale-factor-0-25-downscale-factor-0-5.html

  * 
{igt@kms_plane_scaling@scaler-with-rotation-unity-scaling@pipe-d-hdmi-a-3-scaler-with-rotation}:
- {shard-dg1}:NOTRUN -> [SKIP][2] +3 similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-dg1-18/igt@kms_plane_scaling@scaler-with-rotation-unity-scal...@pipe-d-hdmi-a-3-scaler-with-rotation.html

  
New tests
-

  New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22458_full:

### New IGT tests (1) ###

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  

Known issues


  Here are the changes found in Patchwork_22458_full that come from known 
issues:

### CI changes ###

 Issues hit 

  * boot:
- shard-skl:  ([PASS][3], [PASS][4], [PASS][5], [PASS][6], 
[PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], 
[PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], 
[PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], 
[PASS][25]) -> ([PASS][26], [PASS][27], [PASS][28], [PASS][29], [PASS][30], 
[PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], 
[PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [FAIL][42], 
[PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], 
[PASS][49]) ([i915#5032])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl1/boot.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl9/boot.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl9/boot.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl6/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl6/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl6/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl4/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl4/boot.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl4/boot.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl1/boot.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl1/boot.html
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl10/boot.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl10/boot.html
   [25]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl10/boot.html
   [26]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-skl9/boot.html
   [27]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-skl9/boot.html
   [28]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-skl9/boot.html
   [29]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-skl8/boot.html
   [30]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-skl8/boot.html
   [31]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-skl8/boot.html
   [32]: 

Re: [Intel-gfx] [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-02 Thread Michael Cheng

Thanks for the feedback Robin!

Sorry, my choice of words wasn't great; what I meant was to 
understand how ARM flushes a range of dcache for device drivers, not 
to find an equivalent of x86 clflush.


I believe the concern is if the CPU writes an update, that update might 
only be sitting in the CPU cache and never make it to device memory 
where the device can see it; there are specific places that we are 
supposed to flush the CPU caches to make sure our updates are visible to 
the hardware.


+Matt Roper

Matt, Lucas, any feedback here?

On 2022-03-02 4:49 a.m., Robin Murphy wrote:

On 2022-02-25 19:27, Michael Cheng wrote:

Hi Robin,

[ +arm64 maintainers for their awareness, which would have been a 
good thing to do from the start ]


  * Thanks for adding the arm64 maintainer and sorry I didn't rope them
    in sooner.

Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


  * Also thanks for pointing this out. Initially I was using
    dcache_clean_inval_poc, which seems to be the equivalent of what
    x86 is doing for dcache flushing, but it was giving me build errors
    since it's not on the global list of kernel symbols. And after
    revisiting the documentation for caches_clean_inval_pou, it won't
    fly for what we are trying to do. Moving forward, what would you (or
    someone in the ARM community) suggest we do? Could it be possible to
    export dcache_clean_inval_poc as a global symbol?


Unlikely, unless something with a legitimate need for CPU-centric 
cache maintenance like kexec or CPU hotplug ever becomes modular.


In the case of a device driver, it's not even the basic issues of 
assuming to find direct equivalents to x86 semantics in other CPU 
architectures, or effectively reinventing parts of the DMA API, it's 
even bigger than that. Once you move from being integrated in a single 
vendor's system architecture to being on a discrete card, you 
fundamentally *no longer have any control over cache coherency*. 
Whether the host CPU architecture happens to be AArch64, RISC-V, or 
whatever doesn't really matter, you're at the mercy of 3rd-party PCIe 
and interconnect IP vendors, and SoC integrators. You'll find yourself 
in systems where PCIe simply cannot snoop any caches, where you'd 
better have the correct DMA API calls in place to have any hope of 
even the most basic functionality working properly; you'll find 
yourself in systems where even if the PCIe root complex claims to 
support No Snoop, your uncached traffic will still end up snooping 
stale data that got prefetched back into caches you thought you'd 
invalidated; you'll find yourself in systems where your memory 
attributes may or may not get forcibly rewritten by an IOMMU depending 
on the kernel config and/or command line.


It's not about simply finding a substitute for clflush, it's that the 
reasons you have for using clflush in the first place can no longer be 
assumed to be valid.


Robin.


On 2022-02-25 10:24 a.m., Robin Murphy wrote:
[ +arm64 maintainers for their awareness, which would have been a 
good thing to do from the start ]


On 2022-02-25 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, followed by an
invalidation operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidating the
    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be used by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes the include for cacheflush, since it's already
    included based on architecture type.

Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned 
long length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);


Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


Robin.

(Note that the above is somewhat of a loaded question, and I do 
actually have half an idea of what you're trying to do here and why 
it won't fly, but I'd like to at least assume you've read the 
documentation of the function you decided was OK to use)



+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [Intel-gfx] [PATCH v2 0/4] drm/i915/ttm: Evict and store of compressed object

2022-03-02 Thread Das, Nirmoy

Reviewed-by: Nirmoy Das  for the series as well.

On 01/03/2022 22:53, Ramalingam C wrote:

On Xe-HP and later devices, we use dedicated compression control
state (CCS) stored in local memory for each surface, to support
the 3D and media compression formats.

The memory required for the CCS of the entire local memory is
1/256 of the local memory size. So before the kernel
boots, the required memory is reserved for the CCS data and a
secure register is programmed with the CCS base address.

So when we allocate an object in local memory we don't need to explicitly
allocate the space for CCS data. But when we evict the object into smem,
holding the compression-related data along with the object requires smem
space of obj_size + (obj_size/256).

Hence when we create smem for an obj with lmem placement possibility we
create with the extra space.
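
A rough sketch of that sizing arithmetic follows. The 1/256 ratio comes from the text above; the 4 KiB page size, the round-up behaviour and all names (`SKETCH_PAGE_SIZE`, `ccs_bytes()`, `smem_pages_with_ccs()`) are assumptions made for illustration and are not taken from the series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE          4096u /* assumed page size */
#define SKETCH_BYTES_PER_CCS_BYTE 256u  /* 1/256 ratio from the text */

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	assert(d != 0);

	return (n + d - 1) / d;
}

/* Bytes of CCS data covering an object of obj_size bytes. */
static uint64_t ccs_bytes(uint64_t obj_size)
{
	return div_round_up(obj_size, SKETCH_BYTES_PER_CCS_BYTE);
}

/*
 * Pages needed in smem to hold both the object and its CCS data,
 * with the CCS data starting on its own page after the main data.
 */
static uint64_t smem_pages_with_ccs(uint64_t obj_size)
{
	uint64_t main_pages = div_round_up(obj_size, SKETCH_PAGE_SIZE);
	uint64_t ccs_pages = div_round_up(ccs_bytes(obj_size),
					  SKETCH_PAGE_SIZE);

	return main_pages + ccs_pages;
}
```

Under these assumptions a 1 MiB object needs 256 pages of main data plus one page for its 4096 bytes of CCS data, so 257 pages in total.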

When we are swapping out a local memory object on a flat-CCS capable platform,
we need to capture the CCS data too along with main memory, and we need to
restore it when we are swapping the content back in.

When lmem object is swapped into a smem obj, smem obj will
have the extra pages required to hold the ccs data corresponding to the
lmem main memory. So main memory of lmem will be copied into the initial
pages of the smem and then ccs data corresponding to the main memory
will be copied to the subsequent pages of smem.

Swapin happens exactly in reverse order. First main memory of lmem is
restored from the smem's initial pages and the ccs data will be restored
from the subsequent pages of smem.
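
The layout described above can be modelled in userspace as follows. This is only a sketch: plain memcpy() stands in for the blitter copies (including XY_CTRL_SURF_COPY_BLT), the buffers are ordinary arrays, main_size is assumed to be page aligned so that the CCS data starts right after it, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BYTES_PER_CCS_BYTE 256u /* 1/256 ratio from the text */

/* Size of the smem backing store for an object of main_size bytes. */
static size_t smem_size_with_ccs(size_t main_size)
{
	assert(main_size % SKETCH_BYTES_PER_CCS_BYTE == 0);

	return main_size + main_size / SKETCH_BYTES_PER_CCS_BYTE;
}

/* Swap out: main data to the start of smem, CCS data right after it. */
static void swap_out(unsigned char *smem,
		     const unsigned char *lmem_main,
		     const unsigned char *lmem_ccs,
		     size_t main_size)
{
	size_t ccs_size = main_size / SKETCH_BYTES_PER_CCS_BYTE;

	memcpy(smem, lmem_main, main_size);
	memcpy(smem + main_size, lmem_ccs, ccs_size);
}

/* Swap in: restore main data first, then the CCS data. */
static void swap_in(unsigned char *lmem_main,
		    unsigned char *lmem_ccs,
		    const unsigned char *smem,
		    size_t main_size)
{
	size_t ccs_size = main_size / SKETCH_BYTES_PER_CCS_BYTE;

	memcpy(lmem_main, smem, main_size);
	memcpy(lmem_ccs, smem + main_size, ccs_size);
}
```

A round trip through swap_out() and swap_in() should return both the main data and the CCS data unchanged, which is the property the series needs from the real copy commands.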

Extracting and restoring the CCS data is done through a special command called
XY_CTRL_SURF_COPY_BLT.

Test-with: 20220301212513.30772-1-ramalinga...@intel.com

Ayaz A Siddiqui (1):
   drm/i915/gt: Clear compress metadata for Xe_HP platforms

Ramalingam C (3):
   drm/ttm: parameter to add extra pages into ttm_tt
   drm/i915/gem: Extra pages in ttm_tt for ccs data
   drm/i915/migrate: Evict and restore the flatccs capable lmem obj

  drivers/gpu/drm/drm_gem_vram_helper.c|   2 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c  |  23 +-
  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  15 +
  drivers/gpu/drm/i915/gt/intel_migrate.c  | 327 +--
  drivers/gpu/drm/qxl/qxl_ttm.c|   2 +-
  drivers/gpu/drm/ttm/ttm_agp_backend.c|   2 +-
  drivers/gpu/drm/ttm/ttm_tt.c |  12 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c   |   2 +-
  include/drm/ttm/ttm_tt.h |   4 +-
  9 files changed, 357 insertions(+), 32 deletions(-)



[Intel-gfx] ✗ Fi.CI.BAT: failure for Remove frontbuffer tracking from the gem code

2022-03-02 Thread Patchwork
== Series Details ==

Series: Remove frontbuffer tracking from the gem code
URL   : https://patchwork.freedesktop.org/series/100950/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11311 -> Patchwork_22463


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22463 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22463, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/index.html

Participating hosts (48 -> 43)
--

  Additional (1): fi-kbl-8809g 
  Missing(6): shard-tglu fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-pnv-d510 
fi-bdw-samus 

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_22463:

### IGT changes ###

 Possible regressions 

  * igt@i915_pm_rpm@basic-pci-d3-state:
- fi-skl-6600u:   NOTRUN -> [FAIL][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-skl-6600u/igt@i915_pm_...@basic-pci-d3-state.html

  
Known issues


  Here are the changes found in Patchwork_22463 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_suspend@basic-s0@smem:
- fi-kbl-8809g:   NOTRUN -> [DMESG-WARN][2] ([i915#4962]) +1 similar 
issue
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-kbl-8809g/igt@gem_exec_suspend@basic...@smem.html

  * igt@gem_huc_copy@huc-copy:
- fi-skl-6600u:   NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-skl-6600u/igt@gem_huc_c...@huc-copy.html
- fi-kbl-8809g:   NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#2190])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-kbl-8809g/igt@gem_huc_c...@huc-copy.html

  * igt@gem_lmem_swapping@random-engines:
- fi-kbl-8809g:   NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#4613]) +3 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-kbl-8809g/igt@gem_lmem_swapp...@random-engines.html

  * igt@gem_lmem_swapping@verify-random:
- fi-skl-6600u:   NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#4613]) +3 
similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-skl-6600u/igt@gem_lmem_swapp...@verify-random.html

  * igt@i915_pm_rpm@basic-rte:
- fi-kbl-8809g:   NOTRUN -> [SKIP][7] ([fdo#109271]) +54 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-kbl-8809g/igt@i915_pm_...@basic-rte.html

  * igt@i915_selftest@live:
- fi-skl-6600u:   NOTRUN -> [FAIL][8] ([i915#4547])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-skl-6600u/igt@i915_selft...@live.html

  * igt@i915_selftest@live@hangcheck:
- fi-hsw-4770:[PASS][9] -> [INCOMPLETE][10] ([i915#4785])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11311/fi-hsw-4770/igt@i915_selftest@l...@hangcheck.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-hsw-4770/igt@i915_selftest@l...@hangcheck.html
- fi-bdw-5557u:   NOTRUN -> [INCOMPLETE][11] ([i915#3921])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-bdw-5557u/igt@i915_selftest@l...@hangcheck.html
- fi-snb-2600:[PASS][12] -> [INCOMPLETE][13] ([i915#3921])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11311/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [PASS][14] -> [INCOMPLETE][15] ([i915#4983])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11311/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  * igt@kms_chamelium@hdmi-edid-read:
- fi-kbl-8809g:   NOTRUN -> [SKIP][16] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-kbl-8809g/igt@kms_chamel...@hdmi-edid-read.html

  * igt@kms_chamelium@vga-edid-read:
- fi-skl-6600u:   NOTRUN -> [SKIP][17] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-skl-6600u/igt@kms_chamel...@vga-edid-read.html
- fi-bdw-5557u:   NOTRUN -> [SKIP][18] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22463/fi-bdw-5557u/igt@kms_chamel...@vga-edid-read.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
- fi-skl-6600u:  

Re: [Intel-gfx] ✓ Fi.CI.IGT: success for i915: Prepare for Xe_HP compute engines (rev4)

2022-03-02 Thread Matt Roper
On Wed, Mar 02, 2022 at 01:23:49PM +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: i915: Prepare for Xe_HP compute engines (rev4)
> URL   : https://patchwork.freedesktop.org/series/100833/
> State : success
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22459_full
> 
> 
> Summary
> ---
> 
>   **SUCCESS**
> 
>   No regressions found.
> 

Series applied to drm-intel-gt-next.  Thanks for the reviews.


Matt

>   
> 
> Participating hosts (13 -> 13)
> --
> 
>   No changes in participating hosts
> 
> Possible new issues
> ---
> 
>   Here are the unknown changes that may have been introduced in 
> Patchwork_22459_full:
> 
> ### IGT changes ###
> 
>  Suppressed 
> 
>   The following results come from untrusted machines, tests, or statuses.
>   They do not affect the overall result.
> 
>   * 
> {igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25@pipe-a-edp-1-downscale-with-pixel-format}:
> - shard-iclb: NOTRUN -> [SKIP][1] +2 similar issues
>[1]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22459/shard-iclb7/igt@kms_plane_scaling@downscale-with-pixel-format-factor-0...@pipe-a-edp-1-downscale-with-pixel-format.html
> 
>   * {igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5}:
> - {shard-rkl}:NOTRUN -> [SKIP][2] +1 similar issue
>[2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22459/shard-rkl-1/igt@kms_plane_scal...@planes-upscale-factor-0-25-downscale-factor-0-5.html
> 
>   * 
> {igt@kms_plane_scaling@upscale-with-rotation-20x20@pipe-c-hdmi-a-3-upscale-with-rotation}:
> - {shard-dg1}:NOTRUN -> [SKIP][3] +3 similar issues
>[3]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22459/shard-dg1-18/igt@kms_plane_scaling@upscale-with-rotation-20...@pipe-c-hdmi-a-3-upscale-with-rotation.html
> 
>   
> New tests
> -
> 
>   New tests have been introduced between CI_DRM_11308_full and 
> Patchwork_22459_full:
> 
> ### New IGT tests (1) ###
> 
>   * 
> igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
> - Statuses : 1 pass(s)
> - Exec time: [1.29] s
> 
>   
> 
> Known issues
> 
> 
>   Here are the changes found in Patchwork_22459_full that come from known 
> issues:
> 
> ### CI changes ###
> 
>  Issues hit 
> 
>   * boot:
> - shard-glk:  ([PASS][4], [PASS][5], [PASS][6], [PASS][7], 
> [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], 
> [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], 
> [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25], 
> [PASS][26], [PASS][27], [PASS][28]) -> ([PASS][29], [PASS][30], [PASS][31], 
> [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], 
> [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], 
> [PASS][44], [PASS][45], [FAIL][46], [PASS][47], [PASS][48], [PASS][49], 
> [PASS][50], [PASS][51], [PASS][52]) ([i915#4392])
>[4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
>[5]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
>[6]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
>[7]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
>[8]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
>[9]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
>[10]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
>[11]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
>[12]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
>[13]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
>[14]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
>[15]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
>[16]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
>[17]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
>[18]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk4/boot.html
>[19]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk4/boot.html
>[20]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk4/boot.html
>[21]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk3/boot.html
>[22]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk3/boot.html
>[23]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk2/boot.html
>[24]: 
> 

Re: [Intel-gfx] [PATCH] drm/i915: add more TMDS clock rate supported by HDMI driver

2022-03-02 Thread Ville Syrjälä
On Tue, Mar 01, 2022 at 01:01:41PM +0800, Lee Shawn C wrote:
> VBT version 249 adds support for more TMDS clock rates: 3.00G, 3.40G
> and 5.94G. Use these new definitions to configure the max
> TMDS clock rate for the HDMI driver.

The patch itself looks fine. The patch subject is pretty much
incomprehensible to me.

> 
> BSpec: 20124
> 
> Cc: Jani Nikula 
> Cc: Ville Syrjala 
> Cc: Ankit Nautiyal 
> Signed-off-by: Lee Shawn C 
> ---
>  drivers/gpu/drm/i915/display/intel_bios.c | 6 ++
>  drivers/gpu/drm/i915/display/intel_vbt_defs.h | 3 +++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_bios.c 
> b/drivers/gpu/drm/i915/display/intel_bios.c
> index 40b5e7ed12c2..a559a1914588 100644
> --- a/drivers/gpu/drm/i915/display/intel_bios.c
> +++ b/drivers/gpu/drm/i915/display/intel_bios.c
> @@ -1955,6 +1955,12 @@ static int _intel_bios_max_tmds_clock(const struct 
> intel_bios_encoder_data *devd
>   fallthrough;
>   case HDMI_MAX_DATA_RATE_PLATFORM:
>   return 0;
> + case HDMI_MAX_DATA_RATE_594:
> + return 594000;
> + case HDMI_MAX_DATA_RATE_340:
> + return 340000;
> + case HDMI_MAX_DATA_RATE_300:
> + return 300000;
>   case HDMI_MAX_DATA_RATE_297:
>   return 297000;
>   case HDMI_MAX_DATA_RATE_165:
> diff --git a/drivers/gpu/drm/i915/display/intel_vbt_defs.h 
> b/drivers/gpu/drm/i915/display/intel_vbt_defs.h
> index b9397d9363c5..e0508990df48 100644
> --- a/drivers/gpu/drm/i915/display/intel_vbt_defs.h
> +++ b/drivers/gpu/drm/i915/display/intel_vbt_defs.h
> @@ -289,6 +289,9 @@ struct bdb_general_features {
>  #define HDMI_MAX_DATA_RATE_PLATFORM  0   /* 204 */
>  #define HDMI_MAX_DATA_RATE_297   1   /* 204 */
>  #define HDMI_MAX_DATA_RATE_165   2   /* 204 */
> +#define HDMI_MAX_DATA_RATE_594   3   /* 249 */
> +#define HDMI_MAX_DATA_RATE_340   4   /* 249 */
> +#define HDMI_MAX_DATA_RATE_300   5   /* 249 */
>  
>  #define LEGACY_CHILD_DEVICE_CONFIG_SIZE  33
>  
> -- 
> 2.17.1

-- 
Ville Syrjälä
Intel
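
For readers following the thread, the mapping the patch extends can be sketched as a standalone function. This is only an illustration, not the driver code: the enum values follow the defines quoted above, the return values are assumed to be in kHz, the 165000 value and the helper name max_tmds_clock_khz() are assumptions, and returning 0 is taken to mean "no VBT limit, use the platform default".

```c
#include <assert.h>

/* VBT "HDMI max data rate" field values, as listed in the quoted defines. */
enum hdmi_max_data_rate {
	HDMI_MAX_DATA_RATE_PLATFORM = 0,	/* VBT 204 */
	HDMI_MAX_DATA_RATE_297 = 1,		/* VBT 204 */
	HDMI_MAX_DATA_RATE_165 = 2,		/* VBT 204 */
	HDMI_MAX_DATA_RATE_594 = 3,		/* VBT 249 */
	HDMI_MAX_DATA_RATE_340 = 4,		/* VBT 249 */
	HDMI_MAX_DATA_RATE_300 = 5,		/* VBT 249 */
};

/*
 * Hypothetical helper: translate the VBT field into a max TMDS clock
 * in kHz. 0 means "no limit from the VBT" (assumption for this sketch).
 */
static int max_tmds_clock_khz(unsigned int vbt_value)
{
	switch (vbt_value) {
	case HDMI_MAX_DATA_RATE_594:
		return 594000;
	case HDMI_MAX_DATA_RATE_340:
		return 340000;
	case HDMI_MAX_DATA_RATE_300:
		return 300000;
	case HDMI_MAX_DATA_RATE_297:
		return 297000;
	case HDMI_MAX_DATA_RATE_165:
		return 165000;	/* assumed; not shown in the quoted hunk */
	case HDMI_MAX_DATA_RATE_PLATFORM:
	default:
		return 0;	/* unknown values are treated as "no limit" */
	}
}
```

A caller would compare a mode's TMDS clock against the returned value only when it is non-zero.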


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Remove frontbuffer tracking from the gem code

2022-03-02 Thread Patchwork
== Series Details ==

Series: Remove frontbuffer tracking from the gem code
URL   : https://patchwork.freedesktop.org/series/100950/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.




[Intel-gfx] [PATCH 2/2] drm/i915/mtl: Don't use PIN_MAPPABLE for dpt

2022-03-02 Thread Stanislav Lisovskiy
Cannot use PIN_MAPPABLE pin on MTL because there's no mappable window.
Change dpt allocation as per suggestion from Chris.

v2: - Added forgotten/dropped include

Signed-off-by: Stanislav Lisovskiy 
Signed-off-by: Juha-Pekka Heikkilä 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/display/intel_dpt.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index 15b2716172f7..11f328a42e19 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -4,6 +4,7 @@
  */
 
 #include "gem/i915_gem_domain.h"
+#include "gem/i915_gem_internal.h"
 #include "gt/gen8_ppgtt.h"
 
 #include "i915_drv.h"
@@ -128,6 +129,10 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space 
*vm)
void __iomem *iomem;
struct i915_gem_ww_ctx ww;
int err;
+   u64 pin_flags = 0;
+
+   if (i915_gem_object_is_stolen(dpt->obj))
+   pin_flags |= PIN_MAPPABLE; /* for i915_vma_pin_iomap(stolen) */
 
wakeref = intel_runtime_pm_get(&i915->runtime_pm);
atomic_inc(&i915->gpu_error.pending_fb_pin);
@@ -138,7 +143,7 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space 
*vm)
continue;
 
vma = i915_gem_object_ggtt_pin_ww(dpt->obj, &ww, NULL, 0, 4096,
- HAS_LMEM(i915) ? 0 : PIN_MAPPABLE);
+ pin_flags);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
continue;
@@ -248,10 +253,11 @@ intel_dpt_create(struct intel_framebuffer *fb)
 
size = round_up(size * sizeof(gen8_pte_t), I915_GTT_PAGE_SIZE);
 
-   if (HAS_LMEM(i915))
-   dpt_obj = i915_gem_object_create_lmem(i915, size, 
I915_BO_ALLOC_CONTIGUOUS);
-   else
+   dpt_obj = i915_gem_object_create_lmem(i915, size, 
I915_BO_ALLOC_CONTIGUOUS);
+   if (IS_ERR(dpt_obj) && i915_ggtt_has_aperture(to_gt(i915)->ggtt))
dpt_obj = i915_gem_object_create_stolen(i915, size);
+   if (IS_ERR(dpt_obj))
+   dpt_obj = i915_gem_object_create_internal(i915, size);
if (IS_ERR(dpt_obj))
return ERR_CAST(dpt_obj);
 
-- 
2.24.1.485.gad05a3d8e5
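
The allocation order introduced in intel_dpt_create() (local memory first, then stolen memory if the GGTT has an aperture, then an internal object) can be sketched in userspace as below. This is a hedged illustration only: ERR_PTR()/IS_ERR() are re-implemented here, and the create_*() helpers, the struct dpt_obj type and the boolean parameters are hypothetical stand-ins for the driver's allocators and platform checks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace re-implementation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

enum backing { BACKING_LMEM, BACKING_STOLEN, BACKING_INTERNAL };

struct dpt_obj {
	enum backing backing;
};

static struct dpt_obj lmem_obj = { BACKING_LMEM };
static struct dpt_obj stolen_obj = { BACKING_STOLEN };
static struct dpt_obj internal_obj = { BACKING_INTERNAL };

/* Hypothetical allocators: each one either succeeds or returns ERR_PTR. */
static struct dpt_obj *create_lmem(bool has_lmem)
{
	return has_lmem ? &lmem_obj : ERR_PTR(-ENODEV);
}

static struct dpt_obj *create_stolen(bool stolen_ok)
{
	return stolen_ok ? &stolen_obj : ERR_PTR(-ENOSPC);
}

static struct dpt_obj *create_internal(void)
{
	return &internal_obj;
}

/* Try each backing store in turn, keeping the first one that succeeds. */
static struct dpt_obj *dpt_create(bool has_lmem, bool has_aperture,
				  bool stolen_ok)
{
	struct dpt_obj *obj = create_lmem(has_lmem);

	if (IS_ERR(obj) && has_aperture)
		obj = create_stolen(stolen_ok);
	if (IS_ERR(obj))
		obj = create_internal();

	return obj;
}
```

In this sketch a platform without a mappable aperture never reaches the stolen path, which mirrors why the patch then sets PIN_MAPPABLE only for stolen objects.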



[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/gem: missing boundary check in vm_access leads to OOB read/write

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/gem: missing boundary check in vm_access leads to OOB 
read/write
URL   : https://patchwork.freedesktop.org/series/100932/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22460_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22460_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22460_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_22460_full:

### IGT changes ###

 Possible regressions 

  * igt@core_setmaster@master-drop-set-user:
- shard-iclb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-iclb3/igt@core_setmas...@master-drop-set-user.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-iclb4/igt@core_setmas...@master-drop-set-user.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_blits@basic:
- {shard-rkl}:[PASS][3] -> [INCOMPLETE][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-rkl-2/igt@gem_bl...@basic.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-rkl-5/igt@gem_bl...@basic.html

  * 
{igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25@pipe-a-edp-1-downscale-with-pixel-format}:
- shard-iclb: NOTRUN -> [SKIP][5] +2 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-iclb8/igt@kms_plane_scaling@downscale-with-pixel-format-factor-0...@pipe-a-edp-1-downscale-with-pixel-format.html

  
New tests
-

  New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22460_full:

### New IGT tests (1) ###

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  

Known issues


  Here are the changes found in Patchwork_22460_full that come from known 
issues:

### CI changes ###


### IGT changes ###

 Issues hit 

  * igt@gem_ctx_isolation@preservation-s3@bcs0:
- shard-apl:  [PASS][6] -> [DMESG-WARN][7] ([i915#180]) +1 similar 
issue
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-apl3/igt@gem_ctx_isolation@preservation...@bcs0.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-apl6/igt@gem_ctx_isolation@preservation...@bcs0.html

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
- shard-kbl:  [PASS][8] -> [DMESG-WARN][9] ([i915#180]) +4 similar 
issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-kbl1/igt@gem_ctx_isolation@preservation...@vcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-kbl4/igt@gem_ctx_isolation@preservation...@vcs0.html

  * igt@gem_ctx_param@set-priority-not-supported:
- shard-iclb: NOTRUN -> [SKIP][10] ([fdo#109314])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-iclb8/igt@gem_ctx_pa...@set-priority-not-supported.html

  * igt@gem_exec_balancer@parallel-bb-first:
- shard-tglb: NOTRUN -> [DMESG-WARN][11] ([i915#5076])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-tglb7/igt@gem_exec_balan...@parallel-bb-first.html

  * igt@gem_exec_capture@pi@rcs0:
- shard-skl:  [PASS][12] -> [INCOMPLETE][13] ([i915#4547])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/igt@gem_exec_capture@p...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-skl3/igt@gem_exec_capture@p...@rcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-iclb4/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-iclb7/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none@vecs0:
- shard-iclb: NOTRUN -> [FAIL][16] ([i915#2842]) +3 similar issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22460/shard-iclb8/igt@gem_exec_fair@basic-n...@vecs0.html
- shard-apl:  [PASS][17] -> [FAIL][18] ([i915#2842])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-apl7/igt@gem_exec_fair@basic-n...@vecs0.html
   [18]: 

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for Fix prime_mmap to work when using LMEM (rev2)

2022-03-02 Thread Gwan-gyeong Mun

The regression reported below is not related to this patch.
On TGL platforms that do not use local memory, this patch has no logic 
changes.
This patch relates to the results of the prime_mmap test on dg1 using 
local memory.

(prime_mmap is a different test from prime_mmap_coherency.)


> Possible regressions
>
>   * igt@kms_vblank@pipe-b-query-forked:
>       o shard-tglb: PASS -> INCOMPLETE
>
This issue is not related to this patch.
>
> Suppressed
>
> The following results come from untrusted machines, tests, or statuses.
> They do not affect the overall result.
>
>   * igt@prime_mmap_coherency@ioctl-errors:
>       o {shard-dg1}: FAIL ([i915#4899]) -> FAIL
>
This problem in dg1 will be fixed when this patch series of igt 
(https://patchwork.freedesktop.org/series/100819/#rev3) is applied.


Br,
G.G.

On 2/28/22 8:25 PM, Patchwork wrote:

*Patch Details*
*Series:*   Fix prime_mmap to work when using LMEM (rev2)
*URL:*      https://patchwork.freedesktop.org/series/100737/
*State:*    failure
*Details:*  https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22432/index.html


  CI Bug Log - changes from CI_DRM_11297_full -> Patchwork_22432_full


Summary

*FAILURE*

Serious unknown changes coming with Patchwork_22432_full absolutely need to be
verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_22432_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.


Participating hosts (13 -> 13)

No changes in participating hosts


Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_22432_full:



  IGT changes


Possible regressions

  * igt@kms_vblank@pipe-b-query-forked:
      o shard-tglb: PASS -> INCOMPLETE


Suppressed

The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.

  * igt@prime_mmap_coherency@ioctl-errors:
      o {shard-dg1}: FAIL ([i915#4899]) -> FAIL




Known issues

Here are the changes found in Patchwork_22432_full that come from known 
issues:



  IGT changes


Issues hit

  * igt@feature_discovery@display-3x:
      o shard-iclb: NOTRUN -> SKIP ([i915#1839])

  * igt@gem_create@create-massive:
      o shard-iclb: NOTRUN -> DMESG-WARN ([i915#4991])

  * igt@gem_eio@unwedge-stress:
      o shard-tglb: PASS -> FAIL ([i915#232])

  * igt@gem_exec_capture@pi@rcs0:
      o shard-skl: PASS -> INCOMPLETE ([i915#4547])

  * igt@gem_exec_fair@basic-deadline:
      o shard-apl: NOTRUN -> FAIL ([i915#2846])

  * igt@gem_exec_fair@basic-none-solo@rcs0:
      o shard-iclb: NOTRUN -> FAIL ([i915#2842])

  * igt@gem_exec_fair@basic-pace-share@rcs0:
      o shard-glk: PASS


Re: [Intel-gfx] linux-next: build warning after merge of the drm-misc tree

2022-03-02 Thread Stephen Rothwell
Hi Andrey,

On Tue, 1 Mar 2022 22:26:12 -0500 Andrey Grodzovsky  
wrote:
>
> Please check you have commit c7703ce38c1e Andrey Grodzovsky   3 weeks ago    
> drm/amdgpu: Fix htmldoc warning

That has arrived in linux-next today for the first time.  It is in the
drm tree, but that tree has had build problems for some time and so has
not been included completely until today.

Thanks.

-- 
Cheers,
Stephen Rothwell


pgputgc_q_ZeX.pgp
Description: OpenPGP digital signature


Re: [Intel-gfx] [PATCH v12 1/6] drm: Add arch arm64 for drm_clflush_virt_range

2022-03-02 Thread Robin Murphy

On 2022-02-25 19:27, Michael Cheng wrote:

Hi Robin,

[ +arm64 maintainers for their awareness, which would have been a good 
thing to do from the start ]


  * Thanks for adding the arm64 maintainer and sorry I didn't rope them
    in sooner.

Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


  * Also thanks for pointing this out. Initially I was using
    dcache_clean_inval_poc, which seems to be the equivalent of what
    x86 is doing for dcache flushing, but it was giving me build errors
    since it's not on the global list of kernel symbols. And after
    revisiting the documentation for caches_clean_inval_pou, it won't
    fly for what we are trying to do. Moving forward, what would you (or
    someone in the ARM community) suggest we do? Could it be possible to
    export dcache_clean_inval_poc as a global symbol?


Unlikely, unless something with a legitimate need for CPU-centric cache 
maintenance like kexec or CPU hotplug ever becomes modular.


In the case of a device driver, it's not even the basic issues of 
assuming to find direct equivalents to x86 semantics in other CPU 
architectures, or effectively reinventing parts of the DMA API, it's 
even bigger than that. Once you move from being integrated in a single 
vendor's system architecture to being on a discrete card, you 
fundamentally *no longer have any control over cache coherency*. Whether 
the host CPU architecture happens to be AArch64, RISC-V, or whatever 
doesn't really matter, you're at the mercy of 3rd-party PCIe and 
interconnect IP vendors, and SoC integrators. You'll find yourself in 
systems where PCIe simply cannot snoop any caches, where you'd better 
have the correct DMA API calls in place to have any hope of even the 
most basic functionality working properly; you'll find yourself in 
systems where even if the PCIe root complex claims to support No Snoop, 
your uncached traffic will still end up snooping stale data that got 
prefetched back into caches you thought you'd invalidated; you'll find 
yourself in systems where your memory attributes may or may not get 
forcibly rewritten by an IOMMU depending on the kernel config and/or 
command line.


It's not about simply finding a substitute for clflush, it's that the 
reasons you have for using clflush in the first place can no longer be 
assumed to be valid.


Robin.


On 2022-02-25 10:24 a.m., Robin Murphy wrote:
[ +arm64 maintainers for their awareness, which would have been a good 
thing to do from the start ]


On 2022-02-25 03:24, Michael Cheng wrote:

Add arm64 support for drm_clflush_virt_range. caches_clean_inval_pou
performs a flush by first performing a clean, followed by an invalidation
operation.

v2 (Michael Cheng): Use correct macro for cleaning and invalidating the
    dcache. Thanks Tvrtko for the suggestion.

v3 (Michael Cheng): Replace asm/cacheflush.h with linux/cacheflush.h

v4 (Michael Cheng): Arm64 does not export dcache_clean_inval_poc as a
    symbol that could be used by other modules, thus use
    caches_clean_inval_pou instead. Also this version
    removes the include for cacheflush, since it's already
    included based on architecture type.

Signed-off-by: Michael Cheng 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/drm_cache.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..81c28714f930 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -174,6 +174,11 @@ drm_clflush_virt_range(void *addr, unsigned long 
length)

    if (wbinvd_on_all_cpus())
  pr_err("Timed out waiting for cache flush\n");
+
+#elif defined(CONFIG_ARM64)
+    void *end = addr + length;
+    caches_clean_inval_pou((unsigned long)addr, (unsigned long)end);


Why does i915 need to ensure the CPU's instruction cache is coherent 
with its data cache? Is it a self-modifying driver?


Robin.

(Note that the above is somewhat of a loaded question, and I do 
actually have half an idea of what you're trying to do here and why it 
won't fly, but I'd like to at least assume you've read the 
documentation of the function you decided was OK to use)



+
  #else
  WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
  #endif


Re: [Intel-gfx] [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Xiaomeng Tong
On Mon, 28 Feb 2022 16:41:04 -0800, Linus Torvalds
 wrote:
>
> But basically to _me_, the important part is that the end result is
> maintainable longer-term.

I couldn't agree more. Because of that, I am sticking with the following
approach, which is more maintainable longer-term than the "type(pos) pos" one:
 Implement a new macro for each list_for_each_entry* with an _inside suffix.
  #define list_for_each_entry_inside(pos, type, head, member)

I have posted a patch series here to demonstrate this approach:
https://lore.kernel.org/lkml/20220301075839.4156-3-xiam0nd.t...@gmail.com/

Although we need to replace all uses of list_for_each_entry* (15000+)
with list_for_each_entry*_inside, the work can be done gradually rather
than all at once. We can incrementally convert the callers until
all of them in the kernel use the *_inside* variants. At
that point, we can remove the implementations of the original macros and rename
the *_inside* macros back to the original names in one single patch.

And the "type(pos) pos" approach requires teaching developers to "not initialize
the iterator variable, otherwise the use-after-loop will not be reported by
the compiler", which is unreasonable and impossible to enforce for all developers.

It will also silently break the following code logic, with no warning reported
by the compiler, even without initializing "ext" at the beginning:
void foo(struct mem_extent *arg) {
  struct mem_extent *ext;  // used both for iterator and normal ptr
  ...
  ext = arg;  // this assignment can also be done in another bar() func
  ...
  list_for_each_entry(ext, head, member) {
    if (found(ext))
      break;
  }
  ...
  // use ext after the loop
  ret = ext;
}
If the loop hits the break, the final "ret" will be the found ext iterator.
However, if the "type(pos) pos" approach is applied, the final "ret" will be
"arg", which is not what the developer intended, because "ext" is then
two different variables inside and outside the loop.

Thus, my idea is *better a finger off than always aching*: let's choose
the "list_for_each_entry_inside(pos, type, head, member)" approach.
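
To make the proposal concrete, here is a minimal userspace sketch of what such a macro could look like. It is an assumption-laden illustration, not the posted patch: struct list_head, container_of(), list_init() and list_add_tail() are simplified stand-ins for the kernel helpers, the macro body is a guess at the shape of list_for_each_entry_inside(), and find_extent()/demo_find() are hypothetical. It needs C99 or later for the declaration inside for().

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/*
 * Sketch of the proposed macro: the iterator is declared in the for()
 * statement, so it goes out of scope when the loop ends.
 */
#define list_for_each_entry_inside(pos, type, head, member)		\
	for (type *pos = container_of((head)->next, type, member);	\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, type, member))

struct mem_extent {
	int id;
	struct list_head node;
};

/* Return the extent with the given id, or NULL if there is none. */
static struct mem_extent *find_extent(struct list_head *head, int id)
{
	struct mem_extent *found = NULL;

	list_for_each_entry_inside(ext, struct mem_extent, head, node) {
		if (ext->id == id) {
			found = ext;	/* copy out before the scope ends */
			break;
		}
	}
	/* 'ext' is not visible here; only 'found' can be used. */
	return found;
}

/* Build a list with ids 1..3 and look one up; -1 means "not found". */
static int demo_find(int id)
{
	struct list_head head;
	struct mem_extent e[3];
	struct mem_extent *found;
	int i;

	list_init(&head);
	for (i = 0; i < 3; i++) {
		e[i].id = i + 1;
		list_add_tail(&e[i].node, &head);
	}
	found = find_extent(&head, id);
	return found ? found->id : -1;
}
```

With this shape, any attempt to use "ext" after the loop fails to compile, so the result has to be carried out through a separate variable such as "found".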

> It turns out that just syntactically, it's really nice to give the
> type of the iterator from outside the way we do now. Yeah, it may be a
> bit odd, and maybe it's partly because I'm so used to the
> "list_for_each_list_entry()" syntax, but moving the type into the loop
> construct really made it nasty - either one very complex line, or
> having to split it over two lines which was even worse.
>
> Maybe the place I looked at just happened to have a long typename, but
> it's basically always going to be a struct, so it's never a _simple_
> type. And it just looked very odd and unnatural to have the type as
> one of the "arguments" to that list_for_each_entry() macro.

We can pass a shorter type name to list_for_each_entry_inside, so there is no
need to split it over two lines. It is not really a big problem.
+ #define t struct sram_bank_info
- list_for_each_entry(pos, head, member) {
+ list_for_each_entry_inside(pos, t, head, member) {

I put the type as the second argument rather than the first to avoid breaking
the pattern matching in some coccinelle scripts.

>  (b) gives us a nice warning for any normal use-after-loop case
> (unless you explicitly initialized it like that
> sgx_mmu_notifier_release() function did for no good reason

Sometimes developers can be confused by the reported warning
"used without having been initialized", and may not realize immediately
that "oh, I am now using a different variable that has the same name
as the loop iterator variable". This changes the programming habits
of developers.

>  (c) also guarantees that even if you don't get a warning,
> non-converted (or newly written) bad code won't actually _work_
>
> so you end up getting the new rules without any ambiguity or mistaken

It will lead to a wrong or NULL pointer dereference if the pointer is used
anywhere else, depending on the value it was initialized with.

Best regards,
--
Xiaomeng Tong


[Intel-gfx] [PATCH] drm/i915/gt: Handle errors for i915_gem_object_trylock

2022-03-02 Thread Jiasheng Jiang
i915_gem_object_trylock() can fail, so check its return value and
return an error if it does.

Fixes: 94ce0d65076c ("drm/i915/gt: Setup a default migration context on the GT")
Signed-off-by: Jiasheng Jiang 
---
 drivers/gpu/drm/i915/gt/selftest_migrate.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c 
b/drivers/gpu/drm/i915/gt/selftest_migrate.c
index fa4293d2944f..79c6c68f7316 100644
--- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -465,7 +465,11 @@ create_init_lmem_internal(struct intel_gt *gt, size_t sz, 
bool try_lmem)
return obj;
}
 
-   i915_gem_object_trylock(obj, NULL);
+   if (!i915_gem_object_trylock(obj, NULL)) {
+   i915_gem_object_put(obj);
+   return ERR_PTR(-EBUSY);
+   }
+
err = i915_gem_object_pin_pages(obj);
if (err) {
i915_gem_object_unlock(obj);
-- 
2.25.1
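
The pattern applied by the patch (check the trylock result, drop the reference and return -EBUSY on failure) can be sketched in userspace as follows. This is only an illustration under stated assumptions: the lock is modelled with a C11 atomic_flag rather than the driver's object lock, and struct object, object_trylock(), object_put(), create_and_pin() and demo_pin() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with a trivial lock and reference count. */
struct object {
	atomic_flag locked;
	int refcount;
	bool pinned;
};

/* Returns true if the lock was taken, false if it was already held. */
static bool object_trylock(struct object *obj)
{
	return !atomic_flag_test_and_set(&obj->locked);
}

static void object_unlock(struct object *obj)
{
	atomic_flag_clear(&obj->locked);
}

static void object_put(struct object *obj)
{
	obj->refcount--;
}

/* Take a reference and pin the object; undo the reference on failure. */
static int create_and_pin(struct object *obj)
{
	obj->refcount++;

	if (!object_trylock(obj)) {
		object_put(obj);	/* do not leak the reference */
		return -EBUSY;
	}

	obj->pinned = true;
	object_unlock(obj);
	return 0;
}

/* Run create_and_pin() against a free or an already locked object. */
static int demo_pin(bool already_locked)
{
	struct object obj = { .locked = ATOMIC_FLAG_INIT };

	if (already_locked)
		atomic_flag_test_and_set(&obj.locked);

	return create_and_pin(&obj);
}
```

Ignoring the return value, as the original selftest code did, would continue as if the lock were held even when it is not.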



Re: [Intel-gfx] [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread David Laight
From: Xiaomeng Tong
> Sent: 02 March 2022 09:31
> 
> On Mon, 28 Feb 2022 16:41:04 -0800, Linus Torvalds
>  wrote:
> >
> > But basically to _me_, the important part is that the end result is
> > maintainable longer-term.
> 
> I couldn't agree more. And because of that, I stick with the following
> approach because it's maintainable longer-term than "type(pos) pos" one:
>  Implements a new macro for each list_for_each_entry* with _inside suffix.
>   #define list_for_each_entry_inside(pos, type, head, member)

I think that it would be better to make any alternate loop macro
just set the variable to NULL on the loop exit.
That is easier to code for and the compiler might be persuaded to
not redo the test.

It also doesn't need an extra variable defined in the for() statement,
so it can be back-ported to older kernels without requiring declarations
in the middle of blocks.
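
A rough userspace sketch of such a macro is shown below. It is an illustration of the idea only: the name list_for_each_entry_or_null() is hypothetical, the list helpers are simplified stand-ins for the kernel ones, and __typeof__ is a GCC/Clang extension assumed to be available.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/*
 * Hypothetical macro: when the loop runs off the end of the list, the
 * right-hand side of || sets pos to NULL and yields 0, ending the loop.
 * After a break, pos still points at the current entry.
 */
#define list_for_each_entry_or_null(pos, head, member)			\
	for (pos = container_of((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head) || ((pos) = NULL, 0);		\
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

struct item {
	int id;
	struct list_head node;
};

/* Return the matching id, or -1 when the iterator ended up NULL. */
static int demo_find_or_null(int id)
{
	struct list_head head;
	struct item items[3];
	struct item *it;
	int i;

	list_init(&head);
	for (i = 0; i < 3; i++) {
		items[i].id = i + 1;
		list_add_tail(&items[i].node, &head);
	}

	list_for_each_entry_or_null(it, &head, node) {
		if (it->id == id)
			break;
	}

	/* 'it' is the match after a break, NULL after a full traversal. */
	return it ? it->id : -1;
}
```

Callers can then test the iterator against NULL after the loop instead of comparing it with the list head.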

OTOH there may be alternative definitions that can be used to get
the compiler (or other compiler-like tools) to detect broken code.
Even if the definition can't possibly generate a working kernel.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[Intel-gfx] [RFC PATCH 2/2] drm/i915: Remove all frontbuffer tracking calls from the gem code

2022-03-02 Thread Jouni Högander
We should now rely on userspace doing dirtyfb. There is no
need to have separate frontbuffer tracking hooks in gem code.

This patch removes all frontbuffer tracking calls from the gem
code.

Signed-off-by: Jouni Högander 
Reviewed-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/display/intel_overlay.c |  2 --
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c  |  2 --
 drivers/gpu/drm/i915/gem/i915_gem_domain.c   |  5 
 drivers/gpu/drm/i915/gem/i915_gem_object.c   | 24 
 drivers/gpu/drm/i915/gem/i915_gem_object.h   | 16 -
 drivers/gpu/drm/i915/gem/i915_gem_phys.c |  7 --
 drivers/gpu/drm/i915/i915_gem.c  |  5 
 7 files changed, 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c 
b/drivers/gpu/drm/i915/display/intel_overlay.c
index 76845d34ad0c..98342dec36b5 100644
--- a/drivers/gpu/drm/i915/display/intel_overlay.c
+++ b/drivers/gpu/drm/i915/display/intel_overlay.c
@@ -810,8 +810,6 @@ static int intel_overlay_do_put_image(struct intel_overlay 
*overlay,
goto out_pin_section;
}
 
-   i915_gem_object_flush_frontbuffer(new_bo, ORIGIN_DIRTYFB);
-
if (!overlay->active) {
const struct intel_crtc_state *crtc_state =
overlay->crtc->config;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index ce91b23385cf..96a6b79fb44e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -22,8 +22,6 @@ static void __do_clflush(struct drm_i915_gem_object *obj)
 {
GEM_BUG_ON(!i915_gem_object_has_pages(obj));
drm_clflush_sg(obj->mm.pages);
-
-   i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
 }
 
 static void clflush_work(struct dma_fence_work *base)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 3e5d6057b3ef..f467d7548e83 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -64,7 +64,6 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned 
int flush_domains)
}
spin_unlock(&obj->vma.lock);
 
-   i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
break;
 
case I915_GEM_DOMAIN_WC:
@@ -616,9 +615,6 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
*data,
 out_unlock:
i915_gem_object_unlock(obj);
 
-   if (!err && write_domain)
-   i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU);
-
 out:
i915_gem_object_put(obj);
return err;
@@ -729,7 +725,6 @@ int i915_gem_object_prepare_write(struct 
drm_i915_gem_object *obj,
}
 
 out:
-   i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU);
obj->mm.dirty = true;
/* return with the pages pinned */
return 0;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 372bc220faeb..c163ee69608f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -393,30 +393,6 @@ static void i915_gem_free_object(struct drm_gem_object 
*gem_obj)
queue_delayed_work(i915->wq, &i915->mm.free_work, 0);
 }
 
-void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
-enum fb_op_origin origin)
-{
-   struct intel_frontbuffer *front;
-
-   front = __intel_frontbuffer_get(obj);
-   if (front) {
-   intel_frontbuffer_flush(front, origin);
-   intel_frontbuffer_put(front);
-   }
-}
-
-void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
- enum fb_op_origin origin)
-{
-   struct intel_frontbuffer *front;
-
-   front = __intel_frontbuffer_get(obj);
-   if (front) {
-   intel_frontbuffer_invalidate(front, origin);
-   intel_frontbuffer_put(front);
-   }
-}
-
 static void
 i915_gem_object_read_from_page_kmap(struct drm_i915_gem_object *obj, u64 
offset, void *dst, int size)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 02c37fe4a535..d7a08172b239 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -578,22 +578,6 @@ void __i915_gem_object_flush_frontbuffer(struct 
drm_i915_gem_object *obj,
 void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
  enum fb_op_origin origin);
 
-static inline void
-i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
- enum fb_op_origin origin)
-{
-   if (unlikely(rcu_access_pointer(obj->frontbuffer)))
-   __i915_gem_object_flush_frontbuffer(obj, origin);
-}
-
-static 

[Intel-gfx] [RFC PATCH 1/2] drm/i915/fbdev: Remove frontbuffer tracking calls

2022-03-02 Thread Jouni Högander
intel_fbdev can use the DRM helper functions, which call the dirtyfb
callback.

Signed-off-by: Jouni Högander 
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 62 ++
 1 file changed, 4 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c 
b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 2cd62a187df3..177c0c20c11e 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -67,68 +67,15 @@ struct intel_fbdev {
struct mutex hpd_lock;
 };
 
-static struct intel_frontbuffer *to_frontbuffer(struct intel_fbdev *ifbdev)
-{
-   return ifbdev->fb->frontbuffer;
-}
-
-static void intel_fbdev_invalidate(struct intel_fbdev *ifbdev)
-{
-   intel_frontbuffer_invalidate(to_frontbuffer(ifbdev), ORIGIN_CPU);
-}
-
-static int intel_fbdev_set_par(struct fb_info *info)
-{
-   struct drm_fb_helper *fb_helper = info->par;
-   struct intel_fbdev *ifbdev =
-   container_of(fb_helper, struct intel_fbdev, helper);
-   int ret;
-
-   ret = drm_fb_helper_set_par(info);
-   if (ret == 0)
-   intel_fbdev_invalidate(ifbdev);
-
-   return ret;
-}
-
-static int intel_fbdev_blank(int blank, struct fb_info *info)
-{
-   struct drm_fb_helper *fb_helper = info->par;
-   struct intel_fbdev *ifbdev =
-   container_of(fb_helper, struct intel_fbdev, helper);
-   int ret;
-
-   ret = drm_fb_helper_blank(blank, info);
-   if (ret == 0)
-   intel_fbdev_invalidate(ifbdev);
-
-   return ret;
-}
-
-static int intel_fbdev_pan_display(struct fb_var_screeninfo *var,
-  struct fb_info *info)
-{
-   struct drm_fb_helper *fb_helper = info->par;
-   struct intel_fbdev *ifbdev =
-   container_of(fb_helper, struct intel_fbdev, helper);
-   int ret;
-
-   ret = drm_fb_helper_pan_display(var, info);
-   if (ret == 0)
-   intel_fbdev_invalidate(ifbdev);
-
-   return ret;
-}
-
 static const struct fb_ops intelfb_ops = {
.owner = THIS_MODULE,
DRM_FB_HELPER_DEFAULT_OPS,
-   .fb_set_par = intel_fbdev_set_par,
+   .fb_set_par = drm_fb_helper_set_par,
.fb_fillrect = drm_fb_helper_cfb_fillrect,
.fb_copyarea = drm_fb_helper_cfb_copyarea,
.fb_imageblit = drm_fb_helper_cfb_imageblit,
-   .fb_pan_display = intel_fbdev_pan_display,
-   .fb_blank = intel_fbdev_blank,
+   .fb_pan_display = drm_fb_helper_pan_display,
+   .fb_blank = drm_fb_helper_blank,
 };
 
 static int intelfb_alloc(struct drm_fb_helper *helper,
@@ -694,8 +641,7 @@ void intel_fbdev_restore_mode(struct drm_device *dev)
if (!ifbdev->vma)
return;
 
-   if (drm_fb_helper_restore_fbdev_mode_unlocked(&ifbdev->helper) == 0)
-   intel_fbdev_invalidate(ifbdev);
+   drm_fb_helper_restore_fbdev_mode_unlocked(&ifbdev->helper);
 }
 
 struct intel_framebuffer *intel_fbdev_framebuffer(struct intel_fbdev *fbdev)
-- 
2.25.1



[Intel-gfx] [RFC PATCH 0/2] Remove frontbuffer tracking from the gem code

2022-03-02 Thread Jouni Högander
We should now rely on userspace doing dirtyfb. There is no need to
have separate frontbuffer tracking hooks in gem code. 

It was found that the fbdev code calling intel_frontbuffer_invalidate()
left PSR disabled. Fix this by removing the
intel_frontbuffer_invalidate() calls from the intel_fbdev code.

Cc: Ville Syrjälä 
Cc: Jani Nikula 
Cc: Daniel Vetter 
Cc: José Roberto de Souza 

Jouni Högander (2):
  drm/i915/fbdev: Remove frontbuffer tracking calls
  drm/i915: Remove all frontbuffer tracking calls from the gem code

 drivers/gpu/drm/i915/display/intel_fbdev.c   | 62 ++--
 drivers/gpu/drm/i915/display/intel_overlay.c |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_domain.c   |  5 --
 drivers/gpu/drm/i915/gem/i915_gem_object.c   | 24 
 drivers/gpu/drm/i915/gem/i915_gem_object.h   | 16 -
 drivers/gpu/drm/i915/gem/i915_gem_phys.c |  7 ---
 drivers/gpu/drm/i915/i915_gem.c  |  5 --
 8 files changed, 4 insertions(+), 119 deletions(-)

-- 
2.25.1



[Intel-gfx] ✓ Fi.CI.IGT: success for i915: Prepare for Xe_HP compute engines (rev4)

2022-03-02 Thread Patchwork
== Series Details ==

Series: i915: Prepare for Xe_HP compute engines (rev4)
URL   : https://patchwork.freedesktop.org/series/100833/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22459_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_22459_full:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * 
{igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25@pipe-a-edp-1-downscale-with-pixel-format}:
- shard-iclb: NOTRUN -> [SKIP][1] +2 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22459/shard-iclb7/igt@kms_plane_scaling@downscale-with-pixel-format-factor-0...@pipe-a-edp-1-downscale-with-pixel-format.html

  * {igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5}:
- {shard-rkl}:NOTRUN -> [SKIP][2] +1 similar issue
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22459/shard-rkl-1/igt@kms_plane_scal...@planes-upscale-factor-0-25-downscale-factor-0-5.html

  * 
{igt@kms_plane_scaling@upscale-with-rotation-20x20@pipe-c-hdmi-a-3-upscale-with-rotation}:
- {shard-dg1}:NOTRUN -> [SKIP][3] +3 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22459/shard-dg1-18/igt@kms_plane_scaling@upscale-with-rotation-20...@pipe-c-hdmi-a-3-upscale-with-rotation.html

  
New tests
-

  New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22459_full:

### New IGT tests (1) ###

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.29] s

  

Known issues


  Here are the changes found in Patchwork_22459_full that come from known 
issues:

### CI changes ###

#### Issues hit ####

  * boot:
- shard-glk:  ([PASS][4], [PASS][5], [PASS][6], [PASS][7], 
[PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], 
[PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], 
[PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25], 
[PASS][26], [PASS][27], [PASS][28]) -> ([PASS][29], [PASS][30], [PASS][31], 
[PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], 
[PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], 
[PASS][44], [PASS][45], [FAIL][46], [PASS][47], [PASS][48], [PASS][49], 
[PASS][50], [PASS][51], [PASS][52]) ([i915#4392])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk4/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk4/boot.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk4/boot.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk3/boot.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk3/boot.html
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk2/boot.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk2/boot.html
   [25]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk2/boot.html
   [26]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk1/boot.html
   [27]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk1/boot.html
   [28]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk1/boot.html
   [29]: 

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)

2022-03-02 Thread Surendrakumar Upadhyay, TejaskumarX
The regression is not related to the patch. Please mark it as passed; I am
requesting a merge.

Thanks,
Tejas

From: Patchwork 
Sent: 02 March 2022 17:56
To: Surendrakumar Upadhyay, TejaskumarX 

Cc: intel-gfx@lists.freedesktop.org
Subject: ✗ Fi.CI.IGT: failure for iommu/vt-d: Add RPLS to quirk list to skip TE 
disabling (rev3)

Patch Details
Series:
iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)
URL:
https://patchwork.freedesktop.org/series/100165/
State:
failure
Details:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/index.html
CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22458_full
Summary

FAILURE

Serious unknown changes coming with Patchwork_22458_full absolutely need to be
verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_22458_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.

Participating hosts (13 -> 13)

No changes in participating hosts

Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_22458_full:

IGT changes
Possible regressions

  *   igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic:

 *   shard-tglb: PASS -> INCOMPLETE

Suppressed

The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.

  *   {igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5}:

 *   {shard-rkl}: NOTRUN -> SKIP +1 similar issue

  *   
{igt@kms_plane_scaling@scaler-with-rotation-unity-scaling@pipe-d-hdmi-a-3-scaler-with-rotation}:

 *   {shard-dg1}: NOTRUN -> SKIP +3 similar issues

New tests

New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22458_full:

New IGT tests (1)

  *   
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:

 *   Statuses : 1 pass(s)
 *   Exec time: [1.28] s

Known issues

Here are the changes found in Patchwork_22458_full that come from known issues:

CI changes
Issues hit

  *   boot:

 *   shard-skl: (PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS,
PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS, PASS)
 -> (PASS, PASS, PASS,

Re: [Intel-gfx] [PATCH v2 3/4] drm/i915/gem: Extra pages in ttm_tt for ccs data

2022-03-02 Thread Thomas Hellström
On Wed, 2022-03-02 at 03:23 +0530, Ramalingam C wrote:
> On Xe-HP and later devices, we use dedicated compression control
> state (CCS) stored in local memory for each surface, to support the
> 3D and media compression formats.
> 
> The memory required for the CCS of the entire local memory is 1/256
> of
> the local memory size. So before the kernel boot, the required memory
> is reserved for the CCS data and a secure register will be programmed
> with the CCS base address
> 
> So when we allocate a object in local memory we dont need to
> explicitly
> allocate the space for ccs data. But when we evict the obj into the
> smem to hold the compression related data along with the obj we need
> smem space of obj_size + (obj_size/256).
> 
> Hence when we create smem for an obj with lmem placement possibility
> we
> create with the extra space.

Nit: Again, please use imperative wording.


> 
> Signed-off-by: Ramalingam C 
> cc: Christian Koenig 
> cc: Hellstrom Thomas 

Reviewed-by: Thomas Hellström


> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 1a8262f5f692..c7a36861c38d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -20,6 +20,7 @@
>  #include "gem/i915_gem_ttm.h"
>  #include "gem/i915_gem_ttm_move.h"
>  #include "gem/i915_gem_ttm_pm.h"
> +#include "gt/intel_gpu_commands.h"
>  
>  #define I915_TTM_PRIO_PURGE 0
>  #define I915_TTM_PRIO_NO_PAGES  1
> @@ -255,12 +256,27 @@ static const struct i915_refct_sgt_ops
> tt_rsgt_ops = {
> .release = i915_ttm_tt_release
>  };
>  
> +static inline bool
> +i915_gem_object_has_lmem_placement(struct drm_i915_gem_object *obj)
> +{
> +   int i;
> +
> +   for (i = 0; i < obj->mm.n_placements; i++)
> +   if (obj->mm.placements[i]->type ==
> INTEL_MEMORY_LOCAL)
> +   return true;
> +
> +   return false;
> +}
> +
>  static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object
> *bo,
>  uint32_t page_flags)
>  {
> +   struct drm_i915_private *i915 = container_of(bo->bdev,
> typeof(*i915),
> +    bdev);
> struct ttm_resource_manager *man =
> ttm_manager_type(bo->bdev, bo->resource->mem_type);
> struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> +   unsigned long ccs_pages = 0;
> enum ttm_caching caching;
> struct i915_ttm_tt *i915_tt;
> int ret;
> @@ -283,7 +299,12 @@ static struct ttm_tt *i915_ttm_tt_create(struct
> ttm_buffer_object *bo,
> i915_tt->is_shmem = true;
> }
>  
> -   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, 0);
> +   if (HAS_FLAT_CCS(i915) &&
> i915_gem_object_has_lmem_placement(obj))
> +   ccs_pages = DIV_ROUND_UP(DIV_ROUND_UP(bo->base.size,
> +
> NUM_BYTES_PER_CCS_BYTE),
> +    PAGE_SIZE);
> +
> +   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching,
> ccs_pages);
> if (ret)
> goto err_free;
>  
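
For reference, the CCS page arithmetic in the hunk above can be sketched in
standalone userspace C. This is only an illustration under stated
assumptions: EXAMPLE_PAGE_SIZE (4096) and EXAMPLE_BYTES_PER_CCS_BYTE (256,
taken from the 1/256 ratio in the commit message) are assumed values, and
div_round_up() and example_ccs_pages() are hypothetical helpers, not kernel
functions.

```c
#include <assert.h>

/* Assumed values for illustration only; the patch itself uses the
 * kernel's PAGE_SIZE and NUM_BYTES_PER_CCS_BYTE. */
#define EXAMPLE_PAGE_SIZE          4096UL
#define EXAMPLE_BYTES_PER_CCS_BYTE 256UL

/* Hypothetical stand-in for the kernel's DIV_ROUND_UP(). */
static unsigned long div_round_up(unsigned long n, unsigned long d)
{
	assert(d != 0);
	return n / d + (n % d != 0);
}

/* Number of extra system-memory pages needed to hold the CCS data of
 * an object of obj_size bytes: one CCS byte per 256 bytes of object,
 * rounded up to whole pages. */
static unsigned long example_ccs_pages(unsigned long obj_size)
{
	unsigned long ccs_bytes =
		div_round_up(obj_size, EXAMPLE_BYTES_PER_CCS_BYTE);

	return div_round_up(ccs_bytes, EXAMPLE_PAGE_SIZE);
}
```

With these assumed values a 1 MiB object needs 4096 bytes of CCS data, which
is one extra page, and a 4 KiB object still rounds up to one extra page.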




Re: [Intel-gfx] [PATCH v2 2/4] drm/ttm: parameter to add extra pages into ttm_tt

2022-03-02 Thread Thomas Hellström
On Wed, 2022-03-02 at 03:23 +0530, Ramalingam C wrote:
> When a driver needs extra pages in ttm_tt, to facilitate such
> requirement, parameter called "extra_pages" is added for
> ttm_tt_init

nit: Please use imperative wording in commit title and description,
"Add a parameter to add extra pages.."

> 
> Signed-off-by: Ramalingam C 
> cc: Christian Koenig 
> cc: Hellstrom Thomas 

Otherwise LGTM.
Reviewed-by: Thomas Hellström 


> ---
>  drivers/gpu/drm/drm_gem_vram_helper.c  |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c    |  2 +-
>  drivers/gpu/drm/qxl/qxl_ttm.c  |  2 +-
>  drivers/gpu/drm/ttm/ttm_agp_backend.c  |  2 +-
>  drivers/gpu/drm/ttm/ttm_tt.c   | 12 +++-
>  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |  2 +-
>  include/drm/ttm/ttm_tt.h   |  4 +++-
>  7 files changed, 15 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c
> b/drivers/gpu/drm/drm_gem_vram_helper.c
> index dc7f938bfff2..123045b58fec 100644
> --- a/drivers/gpu/drm/drm_gem_vram_helper.c
> +++ b/drivers/gpu/drm/drm_gem_vram_helper.c
> @@ -867,7 +867,7 @@ static struct ttm_tt
> *bo_driver_ttm_tt_create(struct ttm_buffer_object *bo,
> if (!tt)
> return NULL;
>  
> -   ret = ttm_tt_init(tt, bo, page_flags, ttm_cached);
> +   ret = ttm_tt_init(tt, bo, page_flags, ttm_cached, 0);
> if (ret < 0)
> goto err_ttm_tt_init;
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 45cc5837ce00..1a8262f5f692 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -283,7 +283,7 @@ static struct ttm_tt *i915_ttm_tt_create(struct
> ttm_buffer_object *bo,
> i915_tt->is_shmem = true;
> }
>  
> -   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
> +   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, 0);
> if (ret)
> goto err_free;
>  
> diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c
> b/drivers/gpu/drm/qxl/qxl_ttm.c
> index b2e33d5ba5d0..52156b54498f 100644
> --- a/drivers/gpu/drm/qxl/qxl_ttm.c
> +++ b/drivers/gpu/drm/qxl/qxl_ttm.c
> @@ -113,7 +113,7 @@ static struct ttm_tt *qxl_ttm_tt_create(struct
> ttm_buffer_object *bo,
> ttm = kzalloc(sizeof(struct ttm_tt), GFP_KERNEL);
> if (ttm == NULL)
> return NULL;
> -   if (ttm_tt_init(ttm, bo, page_flags, ttm_cached)) {
> +   if (ttm_tt_init(ttm, bo, page_flags, ttm_cached, 0)) {
> kfree(ttm);
> return NULL;
> }
> diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c
> b/drivers/gpu/drm/ttm/ttm_agp_backend.c
> index 6ddc16f0fe2b..d27691f2e451 100644
> --- a/drivers/gpu/drm/ttm/ttm_agp_backend.c
> +++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c
> @@ -134,7 +134,7 @@ struct ttm_tt *ttm_agp_tt_create(struct
> ttm_buffer_object *bo,
> agp_be->mem = NULL;
> agp_be->bridge = bridge;
>  
> -   if (ttm_tt_init(&agp_be->ttm, bo, page_flags,
> ttm_write_combined)) {
> +   if (ttm_tt_init(&agp_be->ttm, bo, page_flags,
> ttm_write_combined, 0)) {
> kfree(agp_be);
> return NULL;
> }
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c
> b/drivers/gpu/drm/ttm/ttm_tt.c
> index d234aab800a0..1a66d9fc589a 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -134,9 +134,10 @@ void ttm_tt_destroy(struct ttm_device *bdev,
> struct ttm_tt *ttm)
>  static void ttm_tt_init_fields(struct ttm_tt *ttm,
>    struct ttm_buffer_object *bo,
>    uint32_t page_flags,
> -  enum ttm_caching caching)
> +  enum ttm_caching caching,
> +  unsigned long extra_pages)
>  {
> -   ttm->num_pages = PAGE_ALIGN(bo->base.size) >> PAGE_SHIFT;
> +   ttm->num_pages = (PAGE_ALIGN(bo->base.size) >> PAGE_SHIFT) +
> extra_pages;
> ttm->caching = ttm_cached;
> ttm->page_flags = page_flags;
> ttm->dma_address = NULL;
> @@ -146,9 +147,10 @@ static void ttm_tt_init_fields(struct ttm_tt
> *ttm,
>  }
>  
>  int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
> -   uint32_t page_flags, enum ttm_caching caching)
> +   uint32_t page_flags, enum ttm_caching caching,
> +   unsigned long extra_pages)
>  {
> -   ttm_tt_init_fields(ttm, bo, page_flags, caching);
> +   ttm_tt_init_fields(ttm, bo, page_flags, caching,
> extra_pages);
>  
> if (ttm_tt_alloc_page_directory(ttm)) {
> pr_err("Failed allocating page table\n");
> @@ -180,7 +182,7 @@ int ttm_sg_tt_init(struct ttm_tt *ttm, struct
> ttm_buffer_object *bo,
>  {
> int ret;
>  
> -   ttm_tt_init_fields(ttm, bo, page_flags, caching);
> +   ttm_tt_init_fields(ttm, bo, page_flags, caching, 0);
> 
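
As a rough illustration of the num_pages change in ttm_tt_init_fields()
quoted above, the following standalone C sketch reproduces the arithmetic.
EX_TT_PAGE_SHIFT (12) is an assumed value, and example_page_align() and
example_tt_num_pages() are hypothetical names standing in for the kernel's
PAGE_ALIGN() and the patched assignment; this is not the TTM code itself.

```c
#include <assert.h>

/* Assumed page geometry for illustration (4 KiB pages). */
#define EX_TT_PAGE_SHIFT 12
#define EX_TT_PAGE_SIZE  (1UL << EX_TT_PAGE_SHIFT)

/* Hypothetical stand-in for PAGE_ALIGN(): round up to a page multiple. */
static unsigned long example_page_align(unsigned long size)
{
	return (size + EX_TT_PAGE_SIZE - 1) & ~(EX_TT_PAGE_SIZE - 1);
}

/* Pages backing the buffer object plus driver-requested extra pages,
 * mirroring "(PAGE_ALIGN(size) >> PAGE_SHIFT) + extra_pages". */
static unsigned long example_tt_num_pages(unsigned long bo_size,
					  unsigned long extra_pages)
{
	/* Guard against overflow in the alignment step. */
	assert(bo_size <= ~0UL - (EX_TT_PAGE_SIZE - 1));

	return (example_page_align(bo_size) >> EX_TT_PAGE_SHIFT) + extra_pages;
}
```

Callers that do not need extra pages pass 0, which leaves the page count
unchanged from the pre-patch behaviour.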

Re: [Intel-gfx] [PATCH] drm/i915: Depend on !PREEMPT_RT.

2022-03-02 Thread Sebastian Andrzej Siewior
On 2022-03-02 11:42:35 [+], Tvrtko Ursulin wrote:
> > > >  0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch
> > > 
> > > What do preempt_disable/enable do on PREEMPT_RT? Thinking if instead the
> > > solution could be to always force the !ATOMIC path (for the whole
> > > _wait_for_atomic macro) on PREEMPT_RT.
> > 
> > Could be one way to handle it. But please don't disable preemption and
> > or interrupts for longer period of time as all of it increases the
> > overall latency.
> 
> I am looking for your guidance of what is the correct thing here.
> 
> Main purpose of this macro on the i915 side is to do short waits on GPU
> registers changing post write from spin-locked sections. But there were rare
> cases when very short waits were needed from unlocked sections, shorter than
> 10us (which is AFAIR what usleep_range documents should be a lower limit).
> Which is why non-atomic path was added to the macro. That path uses
> preempt_disable/enable so it can use local_clock().
>
> All this may, or may not be, compatible with PREEMPT_RT to start with?

Your assumption about atomic is not correct, and that is why I aim to
ignore it for RT. Or maybe alter it so it fits.
It is assumed that in_atomic() is true in an interrupt handler or with
an acquired spinlock_t, right? On RT, both conditions keep the context
preemptible, so the atomic check triggers. However, both (the force
threaded interrupt handler and the spinlock_t) ensure that the task is
stuck on the CPU.

So maybe your _WAIT_FOR_ATOMIC_CHECK() could point to cant_migrate().
It looks like you are trying to ensure that local_clock() is read on the same CPU.

> Or question phrased differently, how we should implement the <10us waits
> from non-atomic sections under PREEMPT_RT?

I think if you swap the check in _WAIT_FOR_ATOMIC_CHECK() it should be good.
After all, the task remains preemptible while the condition is polled, so it
should work.
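
To make the pattern under discussion concrete, here is a userspace analogy
of a bounded busy-wait on a condition. It is a sketch only and not the i915
macro: example_now_ns(), example_flag_is_set() and example_wait_for_us() are
hypothetical names, and it reads CLOCK_MONOTONIC, so the same-CPU concern
that applies to local_clock() in the kernel does not arise here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic time in nanoseconds. */
static uint64_t example_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical condition callback: reports whether a flag is set. */
static bool example_flag_is_set(void *data)
{
	return *(volatile bool *)data;
}

/* Poll cond() until it returns true or timeout_us microseconds pass.
 * The deadline is sampled before the condition so that the condition
 * is checked one final time after the timeout has expired. */
static int example_wait_for_us(bool (*cond)(void *), void *data,
			       uint64_t timeout_us)
{
	uint64_t deadline;

	assert(cond != NULL);
	deadline = example_now_ns() + timeout_us * 1000ull;

	for (;;) {
		bool expired = example_now_ns() >= deadline;

		if (cond(data))
			return 0;
		if (expired)
			return -ETIMEDOUT;
	}
}
```

The loop never sleeps, which is what makes it suitable for waits shorter than
a scheduler-based sleep can provide, and also why long timeouts in such a
loop hurt latency.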

> > The problem is that you can't acquire that lock from within that
> > trace-point on PREEMPT_RT. On !RT it is possible but it is also
> > problematic because LOCKDEP does not see possible dead locks unless that
> > trace-point is enabled.
> 
> Oh I meant could the include ordering problem be fixed differently?
> 
> """
> [PATCH 07/10] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with
>  NOTRACE
> 
> The order of the header files is important. If this header file is
> included after tracepoint.h was included then the NOTRACE here becomes a
> nop. Currently this happens for two .c files which use the tracepoitns
> behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
> """
> 
> Like these two .c files - can order of includes just be changed in them?

Maybe. Let me check and get back to you.

> > I've been talking to Steven (after
> > https://lkml.kernel.org/r/20211214115837.6f33a...@gandalf.local.home)
> > and he wants to come up with something where you can pass a lock as
> > argument to the tracing-API. That way the lock can be acquired before
> > the trace event is invoked and lockdep will see it even if the trace
> > event is disabled.
> > So there is an idea how to get it to work eventually without disabling
> > it in the long term.
> > 
> > Making the register a raw_spinlock_t would solve problem immediately but
> > I am a little worried given the increased latency in a quick test:
> > 
> > https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfy...@linutronix.de/
> > 
> > also, this one single hardware but the upper limit atomic-polls is high.
> > 
> > > >  0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch
> > > 
> > > Not sure about why cond_resched was put between irq_work_queue and
> > > irq_work_sync - would it not be like-for-like change to have the two
> > > together?
> > 
> > maybe it loops for a while and an additional scheduling would be nice.
> > 
> > > Commit message makes me think _queue already starts the handler on
> > > x86 at least.
> > 
> > Yes, irq_work_queue() triggers the IRQ right away on x86,
> > irq_work_sync() would wait for it to happen in case it did not happen.
> > On architectures which don't provide an IRQ-work interrupt, it is
> > delayed to the HZ tick timer interrupt. So this serves also as an
> > example in case someone want to copy the code ;)
> 
> My question wasn't why is there a need_resched() in there, but why is the
> patch:
> 
> + irq_work_queue(>irq_work);
>   cond_resched();
> + irq_work_sync(>irq_work);
> 
> And not:
> 
> + irq_work_queue(>irq_work);
> + irq_work_sync(>irq_work);
>   cond_resched();
> 
> To preserve like for like, if my understanding of the commit message was
> correct.

No strong need, it can be put as you suggest.
Should someone else schedule >irq_work from another CPU then you
could first attempt to cond_resched() and then wait for >irq_work's
completion. Assuming that this does not happen (because the irq_work was
previously queued and invoked immediately) irq_work_sync() will just 

[Intel-gfx] ✗ Fi.CI.IGT: failure for iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)

2022-03-02 Thread Patchwork
== Series Details ==

Series: iommu/vt-d: Add RPLS to quirk list to skip TE disabling (rev3)
URL   : https://patchwork.freedesktop.org/series/100165/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22458_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22458_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22458_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_22458_full:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic:
- shard-tglb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-tglb7/igt@kms_cursor_leg...@long-nonblocking-modeset-vs-cursor-atomic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-tglb8/igt@kms_cursor_leg...@long-nonblocking-modeset-vs-cursor-atomic.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5}:
- {shard-rkl}:NOTRUN -> [SKIP][3] +1 similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-rkl-1/igt@kms_plane_scal...@planes-upscale-factor-0-25-downscale-factor-0-5.html

  * 
{igt@kms_plane_scaling@scaler-with-rotation-unity-scaling@pipe-d-hdmi-a-3-scaler-with-rotation}:
- {shard-dg1}:NOTRUN -> [SKIP][4] +3 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22458/shard-dg1-18/igt@kms_plane_scaling@scaler-with-rotation-unity-scal...@pipe-d-hdmi-a-3-scaler-with-rotation.html

  
New tests
-

  New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22458_full:

### New IGT tests (1) ###

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  

Known issues


  Here are the changes found in Patchwork_22458_full that come from known 
issues:

### CI changes ###

#### Issues hit ####

  * boot:
- shard-skl:  ([PASS][5], [PASS][6], [PASS][7], [PASS][8], 
[PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], 
[PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], 
[PASS][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25], [PASS][26], 
[PASS][27]) -> ([PASS][28], [PASS][29], [PASS][30], [PASS][31], [PASS][32], 
[PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], 
[PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], [FAIL][44], 
[PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49], [PASS][50], 
[PASS][51]) ([i915#5032])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl10/boot.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl10/boot.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl10/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl1/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl1/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl1/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl9/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl9/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl4/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl8/boot.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl7/boot.html
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl4/boot.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-skl4/boot.html
   [25]: 

Re: [Intel-gfx] [PATCH 2/2] drm/i915/dmabuf: Fix prime_mmap to work when using LMEM

2022-03-02 Thread Das, Nirmoy

LGTM Reviewed-by: Nirmoy Das 

On 25/02/2022 14:13, Gwan-gyeong Mun wrote:

The current implementation of i915 prime mmap only works when initializing
drm_i915_gem_object with shmem_region.
When using LMEM, drm_i915_gem_object is initialized with ttm_system_region.
In order to make prime mmap work even in this case, when using LMEM
(when using ttm in i915), dma_buf_ops.mmap callback function calls
drm_gem_prime_mmap(). drm_gem_prime_mmap() of drm core calls internally
i915_gem_mmap() so that prime mmap can perform normally.
The fake offset is processed inside drm_gem_prime_mmap().

Testcase: igt/prime_mmap

Cc: Thomas Hellström 
Cc: Matthew Auld 
Signed-off-by: Gwan-gyeong Mun 
---
  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index af899ae1f3c7..f5062d0c6333 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -93,11 +93,15 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf,
 static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	int ret;
 
 	if (obj->base.size < vma->vm_end - vma->vm_start)
 		return -EINVAL;
 
+	if (HAS_LMEM(i915))
+		return drm_gem_prime_mmap(&obj->base, vma);
+
 	if (!obj->base.filp)
 		return -ENODEV;
 


[Intel-gfx] ✓ Fi.CI.BAT: success for vm- and vma cleanups

2022-03-02 Thread Patchwork
== Series Details ==

Series: vm- and vma cleanups
URL   : https://patchwork.freedesktop.org/series/100945/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11308 -> Patchwork_22462


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/index.html

Participating hosts (51 -> 43)
--

  Additional (1): bat-adlp-4 
  Missing(9): fi-cml-u2 fi-bdw-samus shard-tglu fi-hsw-4200u fi-bsw-cyan 
fi-ctg-p8600 shard-rkl shard-dg1 bat-jsl-1 

Known issues


  Here are the changes found in Patchwork_22462 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_flink_basic@bad-flink:
- fi-skl-6600u:   [PASS][1] -> [INCOMPLETE][2] ([i915#4547])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/fi-skl-6600u/igt@gem_flink_ba...@bad-flink.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/fi-skl-6600u/igt@gem_flink_ba...@bad-flink.html

  * igt@gem_lmem_swapping@basic:
- bat-adlp-4: NOTRUN -> [SKIP][3] ([i915#4613]) +3 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-4/igt@gem_lmem_swapp...@basic.html

  * igt@gem_tiled_pread_basic:
- bat-adlp-4: NOTRUN -> [SKIP][4] ([i915#3282])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-4/igt@gem_tiled_pread_basic.html

  * igt@i915_selftest@live@gem:
- fi-blb-e6850:   [PASS][5] -> [DMESG-FAIL][6] ([i915#4528])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/fi-blb-e6850/igt@i915_selftest@l...@gem.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/fi-blb-e6850/igt@i915_selftest@l...@gem.html

  * igt@i915_selftest@live@hangcheck:
- fi-bdw-5557u:   NOTRUN -> [INCOMPLETE][7] ([i915#3921])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/fi-bdw-5557u/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_chamelium@vga-edid-read:
- fi-bdw-5557u:   NOTRUN -> [SKIP][8] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/fi-bdw-5557u/igt@kms_chamel...@vga-edid-read.html

  * igt@kms_chamelium@vga-hpd-fast:
- bat-adlp-4: NOTRUN -> [SKIP][9] ([fdo#111827]) +8 similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-4/igt@kms_chamel...@vga-hpd-fast.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
- bat-adlp-4: NOTRUN -> [SKIP][10] ([i915#4103]) +1 similar issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-4/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_force_connector_basic@force-load-detect:
- bat-adlp-4: NOTRUN -> [SKIP][11] ([fdo#109285])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-4/igt@kms_force_connector_ba...@force-load-detect.html

  * igt@kms_psr@cursor_plane_move:
- fi-bdw-5557u:   NOTRUN -> [SKIP][12] ([fdo#109271]) +13 similar issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/fi-bdw-5557u/igt@kms_psr@cursor_plane_move.html

  * igt@prime_vgem@basic-fence-read:
- bat-adlp-4: NOTRUN -> [SKIP][13] ([i915#3291] / [i915#3708]) +2 
similar issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-4/igt@prime_v...@basic-fence-read.html

  * igt@prime_vgem@basic-userptr:
- bat-adlp-4: NOTRUN -> [SKIP][14] ([i915#3301] / [i915#3708])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-4/igt@prime_v...@basic-userptr.html

  * igt@runner@aborted:
- fi-blb-e6850:   NOTRUN -> [FAIL][15] ([fdo#109271] / [i915#2403] / 
[i915#4312])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/fi-blb-e6850/igt@run...@aborted.html

  
 Possible fixes 

  * igt@kms_flip@basic-flip-vs-modeset@a-edp1:
- {bat-adlp-6}:   [DMESG-WARN][16] ([i915#3576]) -> [PASS][17]
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/bat-adlp-6/igt@kms_flip@basic-flip-vs-mode...@a-edp1.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/bat-adlp-6/igt@kms_flip@basic-flip-vs-mode...@a-edp1.html

  
 Warnings 

  * igt@runner@aborted:
- fi-skl-6600u:   [FAIL][18] ([i915#4312]) -> [FAIL][19] ([i915#2722] / 
[i915#4312])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/fi-skl-6600u/igt@run...@aborted.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22462/fi-skl-6600u/igt@run...@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: 

Re: [Intel-gfx] [PATCH] drm/i915: Depend on !PREEMPT_RT.

2022-03-02 Thread Tvrtko Ursulin



On 01/03/2022 15:13, Sebastian Andrzej Siewior wrote:

On 2022-03-01 14:27:18 [+], Tvrtko Ursulin wrote:

you see:
 0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch
 0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch


Two for the display folks.


 0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch


What do preempt_disable/enable do on PREEMPT_RT? Thinking if instead the
solution could be to always force the !ATOMIC path (for the whole
_wait_for_atomic macro) on PREEMPT_RT.


Could be one way to handle it. But please don't disable preemption and/or
interrupts for a longer period of time as all of it increases the
overall latency.


I am looking for your guidance of what is the correct thing here.

Main purpose of this macro on the i915 side is to do short waits on GPU 
registers changing post write from spin-locked sections. But there were 
rare cases when very short waits were needed from unlocked sections, 
shorter than 10us (which is AFAIR what usleep_range documents should be 
a lower limit). Which is why non-atomic path was added to the macro. 
That path uses preempt_disable/enable so it can use local_clock().
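
To make the short-wait pattern described above concrete, here is a minimal user-space sketch. It is not the i915 macro: `poll_until()`, `now_ns()` and the `countdown` condition are hypothetical names, and `clock_gettime(CLOCK_MONOTONIC)` merely stands in for the kernel's `local_clock()`. It only illustrates the general idiom of busy-polling a condition against a deadline, with one final check after the deadline has passed.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() under strict -std= modes */

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds; stands in for the kernel's local_clock(). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: busy-poll cond(arg) for at most timeout_ns.
 * Returns true if the condition was observed, false on timeout.
 */
static bool poll_until(bool (*cond)(void *), void *arg, uint64_t timeout_ns)
{
	const uint64_t deadline = now_ns() + timeout_ns;

	assert(cond != NULL);

	for (;;) {
		/* Sample the clock first so the condition gets one last check. */
		const bool expired = now_ns() >= deadline;

		if (cond(arg))
			return true;
		if (expired)
			return false;
	}
}

/* Example condition: becomes true after 'remaining' further polls. */
struct countdown {
	unsigned int remaining;
};

static bool countdown_done(void *arg)
{
	struct countdown *c = arg;

	if (c->remaining == 0)
		return true;
	c->remaining--;
	return false;
}
```

A caller would pass a condition that reads the register of interest plus a timeout in the microsecond range. Whether such a loop may run with preemption disabled is exactly the PREEMPT_RT question raised in this thread, and the sketch does not answer it.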


All this may, or may not be, compatible with PREEMPT_RT to start with?

Or, to phrase the question differently: how should we implement the <10us waits 
from non-atomic sections under PREEMPT_RT?



Side note: All of these patches are a collection over time. I personally
have only a single i7-sandybridge with i915 and I don't really
enter all the possible paths here. People report, I patch and look
around and then they are quiet so I assume that it is working.


 0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch


If the issue is only with certain trace points why disable all?


It is a class and it is easier that way.


 0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch


Didn't quite fully understand, why is this not fixable? Especially thinking
of the option of not blanket disabling all tracepoints in the previous
patch.


The problem is that you can't acquire that lock from within that
trace-point on PREEMPT_RT. On !RT it is possible but it is also
problematic because LOCKDEP does not see possible dead locks unless that
trace-point is enabled.


Oh I meant could the include ordering problem be fixed differently?

"""
[PATCH 07/10] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with
 NOTRACE

The order of the header files is important. If this header file is
included after tracepoint.h was included then the NOTRACE here becomes a
nop. Currently this happens for two .c files which use the tracepoints
behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
"""

Like these two .c files - can order of includes just be changed in them?



I've been talking to Steven (after
https://lkml.kernel.org/r/20211214115837.6f33a...@gandalf.local.home)
and he wants to come up with something where you can pass a lock as
argument to the tracing-API. That way the lock can be acquired before
the trace event is invoked and lockdep will see it even if the trace
event is disabled.
So there is an idea how to get it to work eventually without disabling
it in the long term.

Making the register a raw_spinlock_t would solve problem immediately but
I am a little worried given the increased latency in a quick test:
https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfy...@linutronix.de/

Also, this is one single piece of hardware, but the upper limit of the atomic polls is high.


 0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch


Not sure about why cond_resched was put between irq_work_queue and
irq_work_sync - would it not be like-for-like change to have the two
together?


maybe it loops for a while and an additional scheduling would be nice.


Commit message makes me think _queue already starts the handler on
x86 at least.


Yes, irq_work_queue() triggers the IRQ right away on x86,
irq_work_sync() would wait for it to happen in case it did not happen.
On architectures which don't provide an IRQ-work interrupt, it is
delayed to the HZ tick timer interrupt. So this serves also as an
example in case someone wants to copy the code ;)


My question wasn't why is there a need_resched() in there, but why is 
the patch:


+   irq_work_queue(&b->irq_work);
cond_resched();
+   irq_work_sync(&b->irq_work);

And not:

+   irq_work_queue(&b->irq_work);
+   irq_work_sync(&b->irq_work);
cond_resched();

To preserve like for like, if my understanding of the commit message was 
correct.





 0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch


I think this is okay. The part after the unlock is serialized by the tasklet
already.

Slight doubt due to the comment:

   local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */

Makes me want to think about it harder but not now.


Clark reported it and confirmed that the warning is gone on RT and
everything appears to work ;)



[Intel-gfx] ✗ Fi.CI.SPARSE: warning for vm- and vma cleanups

2022-03-02 Thread Patchwork
== Series Details ==

Series: vm- and vma cleanups
URL   : https://patchwork.freedesktop.org/series/100945/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.




[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for vm- and vma cleanups

2022-03-02 Thread Patchwork
== Series Details ==

Series: vm- and vma cleanups
URL   : https://patchwork.freedesktop.org/series/100945/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
67a198829865 drm/i915: Remove the vm open count
-:26: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#26: 
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack and

total: 0 errors, 1 warnings, 0 checks, 901 lines checked
0a77c40b2518 drm/i915: Remove the vma refcount
dc594b31f5d2 drm/i915/gem: Remove some unnecessary code




Re: [Intel-gfx] [PATCH 0/3] Improve anti-pre-emption w/a for compute workloads

2022-03-02 Thread Tvrtko Ursulin



On 28/02/2022 19:17, John Harrison wrote:

On 2/28/2022 07:32, Tvrtko Ursulin wrote:

On 25/02/2022 19:03, John Harrison wrote:

On 2/25/2022 10:29, Tvrtko Ursulin wrote:

On 25/02/2022 18:01, John Harrison wrote:

On 2/25/2022 09:39, Tvrtko Ursulin wrote:

On 25/02/2022 17:11, John Harrison wrote:

On 2/25/2022 08:36, Tvrtko Ursulin wrote:

On 24/02/2022 20:02, John Harrison wrote:

On 2/23/2022 04:00, Tvrtko Ursulin wrote:

On 23/02/2022 02:22, John Harrison wrote:

On 2/22/2022 01:53, Tvrtko Ursulin wrote:

On 18/02/2022 21:33, john.c.harri...@intel.com wrote:

From: John Harrison 

Compute workloads are inherently not pre-emptible on current hardware.
Thus the pre-emption timeout was disabled as a workaround to prevent
unwanted resets. Instead, the hang detection was left to the heartbeat
and its (longer) timeout. This is undesirable with GuC submission as
the heartbeat is a full GT reset rather than a per engine reset and so
is much more destructive. Instead, just bump the pre-emption timeout


Can we have a feature request to allow asking GuC for an 
engine reset?

For what purpose?


To allow "stopped heartbeat" to reset the engine, however..

GuC manages the scheduling of contexts across engines. With 
virtual engines, the KMD has no knowledge of which engine a 
context might be executing on. Even without virtual engines, 
the KMD still has no knowledge of which context is currently 
executing on any given engine at any given time.


There is a reason why hang detection should be left to the 
entity that is doing the scheduling. Any other entity is 
second guessing at best.


The reason for keeping the heartbeat around even when GuC 
submission is enabled is for the case where the KMD/GuC have 
got out of sync with each other somehow or GuC itself has 
just crashed. I.e. when no submission at all is working and 
we need to reset the GuC itself and start over.


.. I wasn't really up to speed to know/remember heartbeats are 
nerfed already in GuC mode.
Not sure what you mean by that claim. Engine resets are handled 
by GuC because GuC handles the scheduling. You can't do the 
former if you aren't doing the latter. However, the heartbeat 
is still present and is still the watchdog by which engine 
resets are triggered. As per the rest of the submission 
process, the hang detection and recovery is split between i915 
and GuC.


I meant that "stopped heartbeat on engine XXX" can only do a 
full GPU reset on GuC.
I mean that there is no 'stopped heartbeat on engine XXX' when 
i915 is not handling the recovery part of the process.


H?

static void
reset_engine(struct intel_engine_cs *engine, struct i915_request *rq)
{
	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
		show_heartbeat(rq, engine);

	if (intel_engine_uses_guc(engine))
		/*
		 * GuC itself is toast or GuC's hang detection
		 * is disabled. Either way, need to find the
		 * hang culprit manually.
		 */
		intel_guc_find_hung_context(engine);

	intel_gt_handle_error(engine->gt, engine->mask,
			      I915_ERROR_CAPTURE,
			      "stopped heartbeat on %s",
			      engine->name);
}

How is there no "stopped heartbeat" in GuC mode? From this code it 
certainly looks like there is.
Only when the GuC is toast and it is no longer an engine reset but 
a full GT reset that is required. So technically, it is not a 
'stopped heartbeat on engine XXX' it is 'stopped heartbeat on GT#'.




You say below heartbeats are going in GuC mode. Now I totally 
don't understand how they are going but there is allegedly no 
"stopped heartbeat".
Because if GuC is handling the detection and recovery then i915 
will not reach that point. GuC will do the engine reset and start 
scheduling the next context before the heartbeat period expires. So 
the notification will be a G2H about a specific context being reset 
rather than the i915 notification about a stopped heartbeat.






intel_gt_handle_error(engine->gt, engine->mask,
  I915_ERROR_CAPTURE,
  "stopped heartbeat on %s",
  engine->name);

intel_gt_handle_error:

	/*
	 * Try engine reset when available. We fall back to full reset if
	 * single reset fails.
	 */
	if (!intel_uc_uses_guc_submission(&gt->uc) &&
	    intel_has_reset_engine(gt) && !intel_gt_is_wedged(gt)) {
		local_bh_disable();
		for_each_engine_masked(engine, gt, engine_mask, tmp) {

You said "However, the heartbeat is still present and is still 
the watchdog by which engine resets are triggered", now I don't 
know what you meant by this. It actually triggers a single 
engine reset in GuC mode? Where in code does that happen if this 
block above shows it not taking the engine reset path?

i915 sends down the per engine pulse.
GuC schedules the pulse
GuC attempts to pre-empt the currently active context
GuC detects the pre-emption timeout
GuC resets the engine

The fundamental process is exactly 

[Intel-gfx] ✗ Fi.CI.BUILD: failure for Improve anti-pre-emption w/a for compute workloads (rev3)

2022-03-02 Thread Patchwork
== Series Details ==

Series: Improve anti-pre-emption w/a for compute workloads (rev3)
URL   : https://patchwork.freedesktop.org/series/100428/
State : failure

== Summary ==

Applying: drm/i915/guc: Limit scheduling properties to avoid overflow
error: patch failed: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2218
error: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c: patch does not apply
error: Did you hand edit your patch?
It does not apply to blobs recorded in its index.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Using index info to reconstruct a base tree...
M   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
Patch failed at 0001 drm/i915/guc: Limit scheduling properties to avoid overflow
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




Re: [Intel-gfx] [PATCH 2/3] drm/i915/gt: Make the heartbeat play nice with long pre-emption timeouts

2022-03-02 Thread Tvrtko Ursulin



On 01/03/2022 20:59, John Harrison wrote:

On 3/1/2022 04:09, Tvrtko Ursulin wrote:


I'll trim it a bit again..

On 28/02/2022 18:55, John Harrison wrote:

On 2/28/2022 09:12, Tvrtko Ursulin wrote:

On 25/02/2022 18:48, John Harrison wrote:

On 2/25/2022 10:14, Tvrtko Ursulin wrote:


[snip]

Your only objection is that ends up with too long total time 
before reset? Or something else as well?
An unnecessarily long total heartbeat timeout is the main 
objection. (2.5 + 12) * 5 = 72.5 seconds. That is a massive change 
from the current 12.5s.


If we are happy with that huge increase then fine. But I'm pretty 
sure you are going to get a lot more bug reports about hung systems 
not recovering. 10-20s is just about long enough for someone to 
wait before leaning on the power button of their machine. Over a 
minute is not. That kind of delay is going to cause support issues.


Sorry I wrote 12s, while you actually said tP * 12, so 7.68s, chosen 
just so it is longer than tH * 3?


And how do you keep coming up with factor of five? Isn't it four 
periods before "heartbeat stopped"? (Prio normal, heartbeat, barrier 
and then reset.)

Prio starts at low not normal.


Right, slipped my mind since I only keep seeing that one priority 
ladder block in intel_engine_heartbeat.c/heartbeat()..


From the point of view of user experience I agree reasonable 
responsiveness is needed before user "reaches for the power button".


In your proposal we are talking about 3 * 2.5s + 2 * 7.5s, so 22.5s.

Question of workloads.. what is the actual preempt timeout compute 
is happy with? And I don't mean compute setups with disabled 
hangcheck, which you say they want anyway, but if we target defaults 
for end users. Do we have some numbers on what they are likely to run?
Not that I have ever seen. This is all just finger in the air stuff. 
I don't recall if we invented the number and the compute people 
agreed with it or if they proposed the number to us.


Yeah me neither. And found nothing in my email archives. :(

Thinking about it today I don't see that disabled timeout is a 
practical default.


With it, if users have something un-preemptable to run (assuming prio 
normal), it would get killed after ~13s (5 * 2.5).


If we go for my scheme it gets killed in ~17.5s (3 * (2.5 + 2.5) + 2.5 
(third pulse triggers preempt timeout)).


And if we go for your scheme it gets killed in ~22.5s (4 * 2.5 + 2 * 3 
* 2.5).
Erm, that is not an apples to apples comparison. Your 17.5 is for an 
engine reset tripped by the pre-emption timeout, but your 22.5s is for a 
GT reset tripped by the heartbeat reaching the end and nuking the universe.


Right, in your scheme I did get it wrong. It would wait for GuC to reset 
the engine at the end, rather than hit the fake "heartbeat stopped" in 
that case, full reset path.


4 * 2.5 to trigger a max prio pulse, then 3 * 2.5 preempt timeout for 
GuC to reset (last heartbeat delay extended so it does not trigger). So 
17.5 as well.


If you are saying that the first pulse at sufficient priority (third 
being normal prio) is what causes the reset because the system is 
working as expected and the pre-emption timeout trips the reset. In that 
case, you have two periods to get to normal prio plus one pre-emption 
timeout to trip the reset. I.e. (tH * 2) + tP.


Your scheme is then tH(actual) = tH(user) + tP, yes?
So pre-emption based reset is after ((tH(user) + tP) * 2) + tP => (3 * 
tP) + (2 * tH)

And GT based reset is after (tH(user) + tP) * 5 => (5 * tP) + (5 * tH)

My scheme is tH(actual) = tH(user) for first four, then max(tH(user), 
tP) for fifth.

So pre-emption based reset is after tH(user) * 2 + tP => tP + (2 * tH);
And GT based reset is after (tH(user) * 4) + (max(tH(user), tP) * 1) => 
greater of ((4 * tH) + tP) or (5 * tH)


Either way your scheme is longer. With tH(user) = 2.5s, tP(RCS) = 7.5s, 
we get 27.5s for engine and 50s for GT versus my 12.5s for engine and 
17.5s for GT. With tP(RCS) = 2.5s, yours is 12.5s for engine and 25s for 
GT versus my 7.5s for engine and 12.5s for GT.
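
The arithmetic in the comparison above can be checked mechanically. The following is only a sketch of the two formulas as stated in this mail, working in milliseconds. The struct and function names are made up for illustration and do not exist in i915.

```c
#include <assert.h>

/* All times are in milliseconds; names are illustrative, not i915 API. */
struct reset_times {
	unsigned int engine_ms; /* pre-emption timeout trips an engine reset */
	unsigned int gt_ms;     /* heartbeat runs out and forces a full GT reset */
};

/* Scheme with every heartbeat period stretched to tH(user) + tP. */
static struct reset_times scheme_extend_every_period(unsigned int th,
						      unsigned int tp)
{
	struct reset_times t;

	assert(th > 0 && tp > 0);

	t.engine_ms = 2 * (th + tp) + tp; /* (3 * tP) + (2 * tH) */
	t.gt_ms = 5 * (th + tp);          /* (5 * tP) + (5 * tH) */
	return t;
}

/* Scheme with four periods of tH(user) and a fifth of max(tH(user), tP). */
static struct reset_times scheme_extend_last_period(unsigned int th,
						     unsigned int tp)
{
	struct reset_times t;

	assert(th > 0 && tp > 0);

	t.engine_ms = 2 * th + tp;              /* tP + (2 * tH) */
	t.gt_ms = 4 * th + (th > tp ? th : tp); /* (4 * tH) + max(tH, tP) */
	return t;
}
```

With tH(user) = 2500 and tP = 7500 the first function gives 27500 and 50000, and the second gives 12500 and 17500. These match the 27.5s/50s and 12.5s/17.5s figures quoted above.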


Plus, not sure why your calculations above are using 2.5 for tP? Are you 
still arguing that 7.5s is too long? That is a separate issue and not 
related to the heartbeat algorithms. tP must be long enough to allow 
'out of box OpenCL workloads to complete'. That doesn't just mean not 
being killed by the heartbeat, it also means not being killed by running 
two of them concurrently (or one plus desktop OpenGL rendering) and not 
having it killed by basic time slicing between the two contexts. The 
heartbeat is not involved in that process. That is purely the 
pre-emption timeout. And that is the fundamental reason why tP needs to 
be much larger on RCS/CCS.


I was assuming 2.5s tP is enough and basing all calculation on that. 
Heartbeat or timeslicing regardless. I thought we established neither of 
us knows how long is enough.


Are you now saying 2.5s is definitely not enough? How is that usable for 
a 

Re: [Intel-gfx] [PATCH] drm/i915/gt: Handle errors for i915_gem_object_trylock

2022-03-02 Thread Tvrtko Ursulin



+ Thomas, Matt

On 02/03/2022 06:19, Jiasheng Jiang wrote:

Since i915_gem_object_trylock() can fail, it is better to check its
return value and return an error if it fails.

Fixes: 94ce0d65076c ("drm/i915/gt: Setup a default migration context on the GT")
Signed-off-by: Jiasheng Jiang 
---
  drivers/gpu/drm/i915/gt/selftest_migrate.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c 
b/drivers/gpu/drm/i915/gt/selftest_migrate.c
index fa4293d2944f..79c6c68f7316 100644
--- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -465,7 +465,11 @@ create_init_lmem_internal(struct intel_gt *gt, size_t sz, 
bool try_lmem)
return obj;
}
  
-	i915_gem_object_trylock(obj, NULL);


Guys why is this a trylock to start with? (Since being added in 
94ce0d65076c ("drm/i915/gt: Setup a default migration context on the GT").


Surely it can't ever fail since the object has just been created.

Regards,

Tvrtko


+   if (!i915_gem_object_trylock(obj, NULL)) {
+   i915_gem_object_put(obj);
+   return ERR_PTR(-EBUSY);
+   }
+
err = i915_gem_object_pin_pages(obj);
if (err) {
i915_gem_object_unlock(obj);


[Intel-gfx] [PATCH v2 3/3] drm/i915/gem: Remove some unnecessary code

2022-03-02 Thread Thomas Hellström
The test for vma should always return true, and when assigning -EBUSY
to ret, the variable should already have that value.

Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/i915_gem.c | 32 ++--
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c26110abcc0b..9747924cc57b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -118,6 +118,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
   unsigned long flags)
 {
struct intel_runtime_pm *rpm = &to_i915(obj->base.dev)->runtime_pm;
+   bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
LIST_HEAD(still_in_list);
intel_wakeref_t wakeref;
struct i915_vma *vma;
@@ -170,26 +171,21 @@ int i915_gem_object_unbind(struct drm_i915_gem_object 
*obj,
 * and destroy the vma from under us.
 */
 
-   if (vma) {
-   bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
-   ret = -EBUSY;
-   if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) {
-   assert_object_held(vma->obj);
-   ret = i915_vma_unbind_async(vma, vm_trylock);
-   }
+   ret = -EBUSY;
+   if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) {
+   assert_object_held(vma->obj);
+   ret = i915_vma_unbind_async(vma, vm_trylock);
+   }
 
-   if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
- !i915_vma_is_active(vma))) {
-   if (vm_trylock) {
-   if (mutex_trylock(&vma->vm->mutex)) {
-   ret = __i915_vma_unbind(vma);
-   mutex_unlock(&vma->vm->mutex);
-   } else {
-   ret = -EBUSY;
-   }
-   } else {
-   ret = i915_vma_unbind(vma);
+   if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
+ !i915_vma_is_active(vma))) {
+   if (vm_trylock) {
+   if (mutex_trylock(&vma->vm->mutex)) {
+   ret = __i915_vma_unbind(vma);
+   mutex_unlock(&vma->vm->mutex);
}
+   } else {
+   ret = i915_vma_unbind(vma);
}
}
 
-- 
2.34.1



[Intel-gfx] [PATCH v2 2/3] drm/i915: Remove the vma refcount

2022-03-02 Thread Thomas Hellström
Now that i915_vma_parked() is taking the object lock on vma destruction,
and the only user of the vma refcount, i915_gem_object_unbind()
also takes the object lock, remove the vma refcount.

Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/i915_gem.c   | 17 +
 drivers/gpu/drm/i915/i915_vma.c   | 14 +++---
 drivers/gpu/drm/i915/i915_vma.h   | 14 --
 drivers/gpu/drm/i915/i915_vma_types.h |  1 -
 4 files changed, 16 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dd84ebabb50f..c26110abcc0b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -151,14 +151,25 @@ int i915_gem_object_unbind(struct drm_i915_gem_object 
*obj,
break;
}
 
+   /*
+* Requiring the vm destructor to take the object lock
+* before destroying a vma would help us eliminate the
+* i915_vm_tryget() here, AND thus also the barrier stuff
+* at the end. That's an easy fix, but sleeping locks in
+* a kthread should generally be avoided.
+*/
ret = -EAGAIN;
if (!i915_vm_tryget(vma->vm))
break;
 
-   /* Prevent vma being freed by i915_vma_parked as we unbind */
-   vma = __i915_vma_get(vma);
spin_unlock(&obj->vma.lock);
 
+   /*
+* Since i915_vma_parked() takes the object lock
+* before vma destruction, it won't race us here,
+* and destroy the vma from under us.
+*/
+
if (vma) {
bool vm_trylock = !!(flags & 
I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
ret = -EBUSY;
@@ -180,8 +191,6 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
ret = i915_vma_unbind(vma);
}
}
-
-   __i915_vma_put(vma);
}
 
i915_vm_put(vma->vm);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 91538bc38110..6fd25b39748f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -122,7 +122,6 @@ vma_create(struct drm_i915_gem_object *obj,
if (vma == NULL)
return ERR_PTR(-ENOMEM);
 
-   kref_init(&vma->ref);
vma->ops = &vm->vma_ops;
vma->obj = obj;
vma->size = obj->base.size;
@@ -1628,15 +1627,6 @@ void i915_vma_reopen(struct i915_vma *vma)
__i915_vma_remove_closed(vma);
 }
 
-void i915_vma_release(struct kref *ref)
-{
-   struct i915_vma *vma = container_of(ref, typeof(*vma), ref);
-
-   i915_active_fini(&vma->active);
-   GEM_WARN_ON(vma->resource);
-   i915_vma_free(vma);
-}
-
 static void force_unbind(struct i915_vma *vma)
 {
if (!drm_mm_node_allocated(&vma->node))
@@ -1665,7 +1655,9 @@ static void release_references(struct i915_vma *vma, bool 
vm_ddestroy)
if (vm_ddestroy)
i915_vm_resv_put(vma->vm);
 
-   __i915_vma_put(vma);
+   i915_active_fini(&vma->active);
+   GEM_WARN_ON(vma->resource);
+   i915_vma_free(vma);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 67ae7341c7e0..6034991d89fe 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -222,20 +222,6 @@ void i915_vma_unlink_ctx(struct i915_vma *vma);
 void i915_vma_close(struct i915_vma *vma);
 void i915_vma_reopen(struct i915_vma *vma);
 
-static inline struct i915_vma *__i915_vma_get(struct i915_vma *vma)
-{
-   if (kref_get_unless_zero(&vma->ref))
-   return vma;
-
-   return NULL;
-}
-
-void i915_vma_release(struct kref *ref);
-static inline void __i915_vma_put(struct i915_vma *vma)
-{
-   kref_put(&vma->ref, i915_vma_release);
-}
-
 void i915_vma_destroy_locked(struct i915_vma *vma);
 void i915_vma_destroy(struct i915_vma *vma);
 
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h 
b/drivers/gpu/drm/i915/i915_vma_types.h
index eac36be184e5..be6e028c3b57 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -211,7 +211,6 @@ struct i915_vma {
 * handles (but same file) for execbuf, i.e. the number of aliases
 * that exist in the ctx->handle_vmas LUT for this vma.
 */
-   struct kref ref;
atomic_t open_count;
atomic_t flags;
/**
-- 
2.34.1



[Intel-gfx] [PATCH v2 1/3] drm/i915: Remove the vm open count

2022-03-02 Thread Thomas Hellström
vms are not getting properly closed. Rather than fixing that,
remove the vm open count and instead rely on the vm refcount.

The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.

Unfortunately if the vm destructor and the object destructor both
want to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.

v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack and
  should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.

Co-developed-by: Niranjana Vishwanathapura 
Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/display/intel_dpt.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 29 ++-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  6 ++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  5 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 30 +++
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 54 
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 56 
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 86 +--
 drivers/gpu/drm/i915/i915_gem.c   |  6 +-
 drivers/gpu/drm/i915/i915_vma.c   | 55 
 drivers/gpu/drm/i915/i915_vma_resource.c  |  2 +-
 drivers/gpu/drm/i915/i915_vma_resource.h  |  6 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  4 +-
 15 files changed, 186 insertions(+), 164 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index 05dd7dba3a5c..3af4930c1095 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -300,5 +300,5 @@ void intel_dpt_destroy(struct i915_address_space *vm)
 {
 	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
 
-	i915_vm_close(&dpt->vm);
+	i915_vm_put(&dpt->vm);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index bc6d59df064d..fe872e02b395 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1456,8 +1456,6 @@ static void set_closed_name(struct i915_gem_context *ctx)
 
 static void context_close(struct i915_gem_context *ctx)
 {
-   struct i915_address_space *vm;
-
/* Flush any concurrent set_engines() */
mutex_lock(>engines_mutex);
unpin_engines(__context_engines_static(ctx));
@@ -1469,19 +1467,6 @@ static void context_close(struct i915_gem_context *ctx)
 
set_closed_name(ctx);
 
-   vm = ctx->vm;
-   if (vm) {
-   /* i915_vm_close drops the final reference, which is a bit too
-* early and could result in surprises with concurrent
-* operations racing with thist ctx close. Keep a full reference
-* until the end.
-*/
-   i915_vm_get(vm);
-   i915_vm_close(vm);
-   }
-
-   ctx->file_priv = ERR_PTR(-EBADF);
-
/*
 * The LUT uses the VMA as a backpointer to unref the object,
 * so we need to clear the LUT before we close all the VMA (inside
@@ -1489,6 +1474,8 @@ static void context_close(struct i915_gem_context *ctx)
 */
lut_close(ctx);
 
+   ctx->file_priv = ERR_PTR(-EBADF);
+
spin_lock(>i915->gem.contexts.lock);
list_del(>link);
spin_unlock(>i915->gem.contexts.lock);
@@ -1587,12 +1574,8 @@ i915_gem_create_context(struct drm_i915_private *i915,
}
vm = >vm;
}
-   if (vm) {
-   ctx->vm = i915_vm_open(vm);
-
-   /* i915_vm_open() takes a reference */
-   i915_vm_put(vm);
-   }
+   if (vm)
+   ctx->vm = vm;
 
mutex_init(>engines_mutex);
if (pc->num_user_engines >= 0) {
@@ -1642,7 +1625,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
free_engines(e);
 err_vm:
if (ctx->vm)
-   i915_vm_close(ctx->vm);
+   i915_vm_put(ctx->vm);
 err_ctx:
kfree(ctx);
return ERR_PTR(err);
@@ -1826,7 +1809,7 @@ static int get_ppgtt(struct drm_i915_file_private 
*file_priv,
if (err)
return err;
 
-   i915_vm_open(vm);
+   

[Intel-gfx] [PATCH v2 0/3] vm- and vma cleanups

2022-03-02 Thread Thomas Hellström
The first patch of the series addresses a vm open count bug by
removing the vm open count.

The second patch removes the vma refcount that is no longer needed;
the vma is kept alive by taking the vm refcount and object lock.

Finally the last patch removes some unnecessary code. There should be
no functional changes.

Thomas Hellström (3):
  drm/i915: Remove the vm open count
  drm/i915: Remove the vma refcount
  drm/i915/gem: Remove some unnecessary code

 drivers/gpu/drm/i915/display/intel_dpt.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 29 ++-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  6 ++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  5 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 30 +++
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 54 
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 56 
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 86 +--
 drivers/gpu/drm/i915/i915_gem.c   | 55 ++--
 drivers/gpu/drm/i915/i915_vma.c   | 69 +--
 drivers/gpu/drm/i915/i915_vma.h   | 14 ---
 drivers/gpu/drm/i915/i915_vma_resource.c  |  2 +-
 drivers/gpu/drm/i915/i915_vma_resource.h  |  6 ++
 drivers/gpu/drm/i915/i915_vma_types.h |  8 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  4 +-
 16 files changed, 216 insertions(+), 212 deletions(-)

-- 
2.34.1



Re: [Intel-gfx] [PATCH v2 1/3] drm/i915/guc: Limit scheduling properties to avoid overflow

2022-03-02 Thread Tvrtko Ursulin



On 25/02/2022 20:41, john.c.harri...@intel.com wrote:

From: John Harrison 

GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That significantly reduces the point of 32bit
overflow. On current platforms, worst case scenario is approximately
110 seconds. Rather than allowing the user to set higher values and
then get confused by early timeouts, add limits when setting these
values.

v2: Add helper functions for clamping (review feedback from Tvrtko).

Signed-off-by: John Harrison 
Reviewed-by: Daniele Ceraolo Spurio  (v1)
---
  drivers/gpu/drm/i915/gt/intel_engine.h  |  6 ++
  drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 69 +
  drivers/gpu/drm/i915/gt/sysfs_engines.c | 25 +---
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
  4 files changed, 99 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index be4b1e65442f..5a9186f784c4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -349,4 +349,10 @@ intel_engine_get_hung_context(struct intel_engine_cs 
*engine)
return engine->hung_ce;
  }
  
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);
+
  #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index e855c801ba28..7ad9e6006656 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -399,6 +399,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id,
if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
engine->props.preempt_timeout_ms = 0;
  
+	/* Cap properties according to any system limits */

+#define CLAMP_PROP(field) \
+	do { \
+		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+		if (clamp != engine->props.field) { \
+			drm_notice(&engine->i915->drm, \
+				   "Warning, clamping %s to %lld to prevent overflow\n", \
+				   #field, clamp); \
+			engine->props.field = clamp; \
+		} \
+	} while (0)
+
+   CLAMP_PROP(heartbeat_interval_ms);
+   CLAMP_PROP(max_busywait_duration_ns);
+   CLAMP_PROP(preempt_timeout_ms);
+   CLAMP_PROP(stop_timeout_ms);
+   CLAMP_PROP(timeslice_duration_ms);
+
+#undef CLAMP_PROP
+
engine->defaults = engine->props; /* never to change again */
  
  	engine->context_size = intel_engine_context_size(gt, engine->class);

@@ -421,6 +441,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id,
return 0;
  }
  
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value)

+{
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 
value)
+{
+   value = min(value, jiffies_to_nsecs(2));
+
+   return value;
+}
+
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+   /*
+* NB: The GuC API only supports 32bit values. However, the limit is 
further
+* reduced due to internal calculations which would otherwise overflow.
+*/
+   if (intel_guc_submission_is_wanted(>gt->uc.guc))
+   value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
+
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 
value)
+{
+   /*
+* NB: The GuC API only supports 32bit values. However, the limit is 
further
+* reduced due to internal calculations which would otherwise overflow.
+*/
+   if (intel_guc_submission_is_wanted(>gt->uc.guc))
+   value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
+
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
  static void __setup_engine_capabilities(struct intel_engine_cs *engine)
  {
struct drm_i915_private *i915 = engine->i915;
diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c 
b/drivers/gpu/drm/i915/gt/sysfs_engines.c
index 967031056202..f2d9858d827c 100644
--- 

Re: [Intel-gfx] [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Rasmus Villemoes
On 02/03/2022 00.55, Linus Torvalds wrote:
> On Tue, Mar 1, 2022 at 3:19 PM David Laight  wrote:
>>

> With the "don't use iterator outside the loop" approach, the exact
> same code works in both the old world order and the new world order,
> and you don't have the semantic confusion. And *if* you try to use the
> iterator outside the loop, you'll _mostly_ (*) get a compiler warning
> about it not being initialized.
> 
>  Linus
> 
> (*) Unless somebody initializes the iterator pointer pointlessly.
> Which clearly does happen. Thus the "mostly". It's not perfect, and
> that's most definitely not nice - but it should at least hopefully
> make it that much harder to mess up.

This won't help the current issue (because it doesn't exist and might
never), but just in case some compiler people are listening, I'd like to
have some sort of way to tell the compiler "treat this variable as
uninitialized from here on". So one could do

#define kfree(p) do { __kfree(p); __magic_uninit(p); } while (0)

with __magic_uninit being a magic no-op that doesn't affect the
semantics of the code, but could be used by the compiler's "[is/may be]
used uninitialized" machinery to flag e.g. double frees on some odd
error path etc. It would probably only work for local automatic
variables, but it should be possible to just ignore the hint if p is
some expression like foo->bar or has side effects. If we had that, the
end-of-loop test could include that to "uninitialize" the iterator.

Maybe sparse/smatch or some other static analyzer could implement such a
magic thing? Maybe it's better as a function attribute
[__attribute__((uninitializes(1)))] to avoid having to macrofy all
functions that release resources.
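
For illustration, the "don't use the iterator outside the loop" approach quoted above can be sketched in plain C. This is only a minimal sketch: `struct item` and `find_item()` are hypothetical names invented for the example, and a hand-rolled singly linked list stands in for the kernel's list helpers. The result is carried out of the loop in a separate pointer, while the iterator itself is scoped to the loop and cannot be touched afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, used only for this illustration. */
struct item {
	int key;
	struct item *next;
};

/* Return the first node whose key matches, or NULL if there is none. */
struct item *find_item(struct item *head, int key)
{
	struct item *found = NULL;	/* the only pointer that outlives the loop */

	for (struct item *it = head; it; it = it->next) {
		if (it->key == key) {
			found = it;
			break;
		}
	}

	/* 'it' is out of scope here, so it cannot be misused after the loop. */
	return found;
}
```

Because `found` starts out as NULL and is only assigned on a match, a caller can tell "not found" apart from a hit without inspecting the iterator at all.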

Rasmus


[Intel-gfx] ✗ Fi.CI.IGT: failure for Prep work for next GuC release (rev4)

2022-03-02 Thread Patchwork
== Series Details ==

Series: Prep work for next GuC release (rev4)
URL   : https://patchwork.freedesktop.org/series/99805/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22457_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22457_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22457_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_22457_full:

### IGT changes ###

 Possible regressions 

  * 
igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-b-edp-1-scaler-with-clipping-clamping:
- shard-iclb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-iclb1/igt@kms_plane_scaling@scaler-with-clipping-clamp...@pipe-b-edp-1-scaler-with-clipping-clamping.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22457/shard-iclb2/igt@kms_plane_scaling@scaler-with-clipping-clamp...@pipe-b-edp-1-scaler-with-clipping-clamping.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * 
{igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25@pipe-a-edp-1-downscale-with-pixel-format}:
- shard-iclb: NOTRUN -> [SKIP][3] +2 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22457/shard-iclb3/igt@kms_plane_scaling@downscale-with-pixel-format-factor-0...@pipe-a-edp-1-downscale-with-pixel-format.html

  * {igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-25}:
- {shard-rkl}:NOTRUN -> [SKIP][4] +6 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22457/shard-rkl-5/igt@kms_plane_scal...@planes-unity-scaling-downscale-factor-0-25.html

  * 
{igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-5@pipe-a-edp-1-planes-upscale-downscale}:
- shard-iclb: [PASS][5] -> [SKIP][6] +5 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-iclb6/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-...@pipe-a-edp-1-planes-upscale-downscale.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22457/shard-iclb2/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-...@pipe-a-edp-1-planes-upscale-downscale.html

  
New tests
-

  New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22457_full:

### New IGT tests (1) ###

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  

Known issues


  Here are the changes found in Patchwork_22457_full that come from known 
issues:

### CI changes ###

 Issues hit 

  * boot:
- shard-glk:  ([PASS][7], [PASS][8], [PASS][9], [PASS][10], 
[PASS][11], [PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], 
[PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], 
[PASS][23], [PASS][24], [PASS][25], [PASS][26], [PASS][27], [PASS][28], 
[PASS][29], [PASS][30], [PASS][31]) -> ([PASS][32], [FAIL][33], [PASS][34], 
[PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], 
[PASS][41], [PASS][42], [PASS][43], [PASS][44], [PASS][45], [PASS][46], 
[PASS][47], [PASS][48], [PASS][49], [PASS][50], [PASS][51], [PASS][52], 
[PASS][53], [PASS][54], [PASS][55], [PASS][56]) ([i915#4392])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk9/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk8/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk7/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk6/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-glk5/boot.html
   

Re: [Intel-gfx] [PATCH 1/3] drm/i915/guc: Limit scheduling properties to avoid overflow

2022-03-02 Thread Tvrtko Ursulin



On 01/03/2022 19:57, John Harrison wrote:

On 3/1/2022 02:50, Tvrtko Ursulin wrote:

On 28/02/2022 18:32, John Harrison wrote:

On 2/28/2022 08:11, Tvrtko Ursulin wrote:

On 25/02/2022 17:39, John Harrison wrote:

On 2/25/2022 09:06, Tvrtko Ursulin wrote:


On 24/02/2022 19:19, John Harrison wrote:

[snip]


./gt/uc/intel_guc_fwif.h: u32 execution_quantum;

./gt/uc/intel_guc_submission.c: desc->execution_quantum = 
engine->props.timeslice_duration_ms * 1000;


./gt/intel_engine_types.h: unsigned long timeslice_duration_ms;

timeslice_store/preempt_timeout_store:
err = kstrtoull(buf, 0, );

So both kconfig and sysfs can already overflow GuC, not only 
because of tick conversion internally but because at backend 
level nothing was done for assigning 64-bit into 32-bit. Or 
I failed to find where it is handled.
That's why I'm adding this range check to make sure we don't 
allow overflows.


Yes and no, this fixes it, but the first bug was not only due to 
the GuC internal tick conversion. It was present ever since the 
u64 from i915 was shoved into the u32 sent to GuC. So even if GuC 
used the value without additional multiplication, the bug would be 
there. My point being that when the GuC backend was added, timeout_ms 
values should have been limited/clamped to U32_MAX. The tick 
discovery is an additional limit on top.
I'm not disagreeing. I'm just saying that the truncation wasn't 
noticed until I actually tried using very long timeouts to 
debug a particular problem. Now that it is noticed, we need 
some method of range checking and this simple clamp solves all 
the truncation problems.


Agreed in principle, just please mention in the commit message 
all aspects of the problem.


I think we can get away without a Fixes: tag since it requires 
user fiddling to break things in unexpected ways.


I would though put a clamp in the code which expresses both, 
something like min(u32, ..GUC LIMIT..). So the full story is 
documented forever. Or "if (> u32 || > ..GUC LIMIT..) return 
-EINVAL". Just in case the GuC limit one day changes but u32 stays. 
Perhaps internal ticks go away or anything and we are left with 
plain 1:1 millisecond relationship.
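
A clamp that expresses both limits, as suggested above, might look roughly like the following. This is a sketch under stated assumptions rather than the driver's code: `clamp_timeout_ms()`, `to_field_us()`, `US_PER_MS` and `FIELD_MAX_MS` are hypothetical names, the firmware limit is passed in as a parameter because its real value is not given here, and the ms to us scaling by 1000 is taken from the discussion in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the thread: i915 works in ms, the GuC field is in us. */
#define US_PER_MS 1000u

/* Largest ms value whose microsecond equivalent still fits in a u32 field. */
#define FIELD_MAX_MS (UINT32_MAX / US_PER_MS)

/*
 * Clamp a 64-bit millisecond value against both limits:
 *  - fw_limit_ms: hypothetical firmware-internal limit, in i915 units (ms)
 *  - FIELD_MAX_MS: limit implied by the 32-bit protocol field
 * The comparison is done at 64 bits wide, so it cannot itself overflow.
 */
uint64_t clamp_timeout_ms(uint64_t value_ms, uint64_t fw_limit_ms)
{
	uint64_t limit = fw_limit_ms < FIELD_MAX_MS ? fw_limit_ms : FIELD_MAX_MS;

	return value_ms < limit ? value_ms : limit;
}

/* Convert to the 32-bit microsecond field only after clamping. */
uint32_t to_field_us(uint64_t value_ms, uint64_t fw_limit_ms)
{
	return (uint32_t)(clamp_timeout_ms(value_ms, fw_limit_ms) * US_PER_MS);
}
```

With both limits in one place, the field-width limit keeps protecting the assignment even if the firmware limit is later raised or forgotten.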
Can certainly add a comment along the lines of "GuC API only 
takes a 32bit field but that is further reduced to GUC_LIMIT due 
to internal calculations which would otherwise overflow".


But if the GuC limit is > u32 then, by definition, that means the 
GuC API has changed to take a u64 instead of a u32. So there will 
be no u32 truncation any more. So I'm not seeing a need to 
explicitly test the integer size when the value check covers that.


Hmm I was thinking if the internal conversion in the GuC fw 
changes so that GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS goes above u32, 
then to be extra safe by documenting in code there is the 
additional limit of the data structure field. Say the field was 
changed to take some unit larger than a millisecond. Then the 
check against the GuC MAX limit define would not be enough, unless 
that would account both for internal implementation and u32 in the 
protocol. Maybe that is overdefensive but I don't see that it 
harms. 50-50, but it's do it once and forget so I'd do it.

Huh?

How can the limit be greater than a u32 if the interface only takes 
a u32? By definition the limit would be clamped to u32 size.


If you mean that the GuC policy is in different units and those 
units might not overflow but ms units do, then actually that is 
already the case. The GuC works in us not ms. That's part of why 
the wrap around is so low, we have to multiply by 1000 before 
sending to GuC. However, that is actually irrelevant because the 
comparison is being done on the i915 side in i915's units. We have 
to scale the GuC limit to match what i915 is using. And the i915 
side is u64 so if the scaling to i915 numbers overflows a u32 then 
who cares because that comparison can be done at 64 bits wide.


If the units change then that is a backwards breaking API change 
that will require a manual driver code update. You can't just 
recompile with a new header and magically get an ms to us or ms to 
s conversion in your a = b assignment. The code will need to be 
changed to do the new unit conversion (note we already convert from 
ms to us, the GuC API is all expressed in us). And that code change 
will mean having to revisit any and all scaling, type conversions, 
etc. I.e. any pre-existing checks will not necessarily be valid and 
will need to be revisited anyway. But as above, any scaling to GuC 
units has to be incorporated into the limit already because 
otherwise the limit would not fit in the GuC's own API.


Yes I get that, I was just worried that u32 field in the protocol 
and GUC_POLICY_MAX_EXEC_QUANTUM_MS defines are separate in the 
source code and then how to protect against forgetting to update 
both in sync.


Like if the protocol was changed to take nanoseconds, and firmware 
implementation changed to support the full range, but define 
left/forgotten at 100s. 

Re: [Intel-gfx] [PATCH 7/8] drm/i915: Count engine instances per uabi class

2022-03-02 Thread Tvrtko Ursulin



On 01/03/2022 19:34, Umesh Nerlige Ramappa wrote:

On Tue, Feb 22, 2022 at 02:04:21PM +, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

This will be useful to have at hand in a following patch.

Signed-off-by: Tvrtko Ursulin 
---
drivers/gpu/drm/i915/gt/intel_engine_user.c | 11 ++-
drivers/gpu/drm/i915/i915_drv.h |  1 +
2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c 
b/drivers/gpu/drm/i915/gt/intel_engine_user.c

index 9ce85a845105..5dd559253078 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
@@ -190,7 +190,6 @@ static void add_legacy_ring(struct legacy_ring *ring,
void intel_engines_driver_register(struct drm_i915_private *i915)
{
struct legacy_ring ring = {};
-    u8 uabi_instances[4] = {};
struct list_head *it, *next;
struct rb_node **p, *prev;
LIST_HEAD(engines);
@@ -211,8 +210,10 @@ void intel_engines_driver_register(struct 
drm_i915_private *i915)

    GEM_BUG_ON(engine->class >= ARRAY_SIZE(uabi_classes));
    engine->uabi_class = uabi_classes[engine->class];

-    GEM_BUG_ON(engine->uabi_class >= ARRAY_SIZE(uabi_instances));
-    engine->uabi_instance = uabi_instances[engine->uabi_class]++;
+    GEM_BUG_ON(engine->uabi_class >=
+   ARRAY_SIZE(i915->engine_uabi_class_count));
+    engine->uabi_instance =
+    i915->engine_uabi_class_count[engine->uabi_class]++;

    /* Replace the internal name with the final user facing name */
    memcpy(old, engine->name, sizeof(engine->name));
@@ -242,8 +243,8 @@ void intel_engines_driver_register(struct 
drm_i915_private *i915)

    int class, inst;
    int errors = 0;

-    for (class = 0; class < ARRAY_SIZE(uabi_instances); class++) {
-    for (inst = 0; inst < uabi_instances[class]; inst++) {
+    for (class = 0; class < 
ARRAY_SIZE(i915->engine_uabi_class_count); class++) {
+    for (inst = 0; inst < 
i915->engine_uabi_class_count[class]; inst++) {

    engine = intel_engine_lookup_user(i915,
  class, inst);
    if (!engine) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h 
b/drivers/gpu/drm/i915/i915_drv.h

index b9d38276801d..68d8a751008b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -533,6 +533,7 @@ struct drm_i915_private {
struct pci_dev *bridge_dev;

struct rb_root uabi_engines;
+    unsigned int engine_uabi_class_count[I915_LAST_UABI_ENGINE_CLASS 
+ 1];


lgtm,
Reviewed-by: Umesh Nerlige Ramappa 


Thanks Umesh - for the series or just this patch? I'd need to update 
your r-b's on patches 3, 6 and 8 to latest as well.


Regards,

Tvrtko


Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/mm: Add an iterator to optimally walk over holes suitable for an allocation (rev2)

2022-03-02 Thread Tvrtko Ursulin



Hi Vivek,

On 01/03/2022 19:23, Patchwork wrote:

*Patch Details*
*Series:*	drm/mm: Add an iterator to optimally walk over holes suitable 
for an allocation (rev2)
*URL:*	https://patchwork.freedesktop.org/series/100847/ 


*State:*failure
*Details:* 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22447/index.html 




  CI Bug Log - changes from CI_DRM_11302_full -> Patchwork_22447_full


Summary

*FAILURE*

Serious unknown changes coming with Patchwork_22447_full absolutely need 
to be

verified manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_22447_full, please notify your bug team to allow 
them

to document this new failure mode, which will reduce false positives in CI.


Participating hosts (13 -> 13)

No changes in participating hosts


Possible new issues

Here are the unknown changes that may have been introduced in 
Patchwork_22447_full:



  IGT changes


Possible regressions

  *

igt@drm_mm@all@replace:

  o shard-skl: PASS


-> INCOMPLETE




This looked like a suspicious failure. But the time from the test starting until it is 
terminated for exceeding the timeout is less than a second.

<6> [539.988360] [IGT] drm_mm: starting dynamic subtest replace
<6> [540.010576] drm_mm: Testing DRM range manager (struct drm_mm), with 
random_seed=0x19e51e55 max_iterations=8192 max_prime=128
<4> [540.051192] [IGT] Per-test timeout exceeded. Killing the current test with 
SIGQUIT.
...
<6> [540.427965] task:drm_mm  state:R  running task stack:13128 
pid: 6896 ppid:  1037 flags:0x4000
<6> [540.428142] Call Trace:
<6> [540.428164]  
<6> [540.428300]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
<6> [540.428355]  ? lockdep_hardirqs_on+0xbf/0x130
<6> [540.428409]  ? rm_hole+0x7e/0x310
<6> [540.428476]  ? drm_mm_insert_node_in_range+0x2b3/0x3a0
<6> [540.428649]  ? expect_insert.isra.10+0x2f/0x80 [test_drm_mm]
<6> [540.428712]  ? assert_continuous+0x83/0x120 [test_drm_mm]
<6> [540.428842]  ? __igt_insert+0x2b5/0x560 [test_drm_mm]
<6> [540.429338]  ? tick_nohz_tick_stopped+0xd/0x30
<6> [540.429431]  ? wake_up_klogd.part.31+0x4a/0x60
<6> [540.429613]  ? igt_replace+0x46/0xb0 [test_drm_mm]
<6> [540.429678]  ? 0xa004b000
<6> [540.429736]  ? test_drm_mm_init+0xab/0x1000 [test_drm_mm]
<6> [540.429800]  ? 0xa004b000
<6> [540.429835]  ? do_one_initcall+0x56/0x2e0
<6> [540.429869]  ? do_init_module+0x1d/0x1e0
<6> [540.429919]  ? rcu_read_lock_sched_held+0x4d/0x80
<6> [540.429973]  ? kmem_cache_alloc_trace+0x1de/0x250
<6> [540.430202]  ? do_init_module+0x45/0x1e0
<6> [540.430259]  ? load_module+0x2740/0x29d0
<6> [540.430578]  ? __do_sys_finit_module+0xaf/0x120
<6> [540.430617]  ? __do_sys_finit_module+0xaf/0x120
<6> [540.430833]  ? do_syscall_64+0x3a/0xb0
<6> [540.430883]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
<6> [540.431171]  

Looking at the test runner log we have this:

Starting dynamic subtest: insert
Dynamic subtest insert: SUCCESS (120.029s)
[540.455651] Per-test timeout exceeded. Killing the current test with SIGQUIT.
Starting dynamic subtest: replace

So actually insert test is the one which took long, but not that much longer than 
it can take in other CI runs as far as I could see. At least I randomly found one 
instance where it took >110s in the past.

CI history does not show the test as failing in the (visible) past though. 
Nor can I find anything in the issue tracker. But I don't think this 
patch is at fault. No idea..

One thing you could still tweak would be to put the mode macro argument in 
drm_mm_for_each_suitable_hole into parentheses, as per checkpatch suggestion:

-:157: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'mode' may be better as 
'(mode)' to avoid precedence issues
#157: FILE: include/drm/drm_mm.h:430:
+#define drm_mm_for_each_suitable_hole(pos, mm, range_start, range_end, \
+ size, mode) \
+   for (pos = __drm_mm_first_hole(mm, range_start, range_end, size, \
+  mode & ~DRM_MM_INSERT_ONCE); \
+pos; \
+pos = mode & DRM_MM_INSERT_ONCE ? \
+NULL : __drm_mm_next_hole(mm, pos, size, \
+  mode & ~DRM_MM_INSERT_ONCE))

Ending up with two instances of "(mode) & ~DRM_MM_INSERT_ONCE" and one "(mode) & 
DRM_MM_INSERT_ONCE".
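
To illustrate why checkpatch flags this, here is a small stand-alone sketch. The flag values and macro names below are hypothetical and chosen only for the example (they are not the drm_mm definitions). When the caller passes an expression with lower precedence than '&', the unparenthesized macro binds the mask to only part of the argument.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only. */
#define INSERT_LOW	0x1u
#define INSERT_ONCE	0x80u

/* Argument pasted as-is: precedence depends on what the caller passes. */
#define STRIP_ONCE_UNSAFE(mode)	(mode & ~INSERT_ONCE)

/* Argument parenthesized, as checkpatch suggests. */
#define STRIP_ONCE_SAFE(mode)	((mode) & ~INSERT_ONCE)

unsigned int strip_unsafe(int once, unsigned int base)
{
	/*
	 * Expands to (once ? (base | INSERT_ONCE) : base & ~INSERT_ONCE),
	 * so the mask is applied to the 'else' branch only.
	 */
	return STRIP_ONCE_UNSAFE(once ? (base | INSERT_ONCE) : base);
}

unsigned int strip_safe(int once, unsigned int base)
{
	/* The mask is applied to the whole conditional expression. */
	return STRIP_ONCE_SAFE(once ? (base | INSERT_ONCE) : base);
}
```

With a plain variable as the argument both forms behave identically, which is why this kind of issue tends to stay hidden until a caller passes a compound expression.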

Regards,

Tvrtko


  *

igt@gem_exec_schedule@preemptive-hang@vecs0:

  o shard-glk: NOTRUN -> INCOMPLETE


  *


[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/ttm: Evict and store of compressed object (rev2)

2022-03-02 Thread Patchwork
== Series Details ==

Series: drm/i915/ttm: Evict and store of compressed object (rev2)
URL   : https://patchwork.freedesktop.org/series/99759/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11308_full -> Patchwork_22455_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_22455_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22455_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
--

  No changes in participating hosts

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_22455_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_eio@reset-stress:
- shard-snb:  [PASS][1] -> [TIMEOUT][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11308/shard-snb2/igt@gem_...@reset-stress.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22455/shard-snb6/igt@gem_...@reset-stress.html

  * {igt@gem_lmem_swapping@heavy-verify-multi-ccs} (NEW):
- shard-iclb: NOTRUN -> [SKIP][3] +4 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22455/shard-iclb1/igt@gem_lmem_swapp...@heavy-verify-multi-ccs.html

  * {igt@gem_lmem_swapping@parallel-random-verify-ccs} (NEW):
- shard-tglb: NOTRUN -> [SKIP][4] +4 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22455/shard-tglb5/igt@gem_lmem_swapp...@parallel-random-verify-ccs.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * 
{igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-5@pipe-a-edp-1-downscale-with-pixel-format}:
- {shard-rkl}:NOTRUN -> [SKIP][5] +1 similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22455/shard-rkl-6/igt@kms_plane_scaling@downscale-with-pixel-format-factor-...@pipe-a-edp-1-downscale-with-pixel-format.html

  
New tests
-

  New tests have been introduced between CI_DRM_11308_full and 
Patchwork_22455_full:

### New IGT tests (22) ###

  * igt@gem_lmem_swapping@heavy-verify-multi-ccs:
- Statuses : 7 skip(s)
- Exec time: [0.0] s

  * igt@gem_lmem_swapping@heavy-verify-random-ccs:
- Statuses : 7 skip(s)
- Exec time: [0.0] s

  * igt@gem_lmem_swapping@parallel-random-verify-ccs:
- Statuses : 7 skip(s)
- Exec time: [0.0] s

  * igt@gem_lmem_swapping@verify-ccs:
- Statuses : 7 skip(s)
- Exec time: [0.0] s

  * igt@gem_lmem_swapping@verify-random-ccs:
- Statuses : 7 skip(s)
- Exec time: [0.0] s

  * igt@kms_flip@absolute-wf_vblank@d-hdmi-a3:
- Statuses : 1 pass(s)
- Exec time: [7.80] s

  * igt@kms_flip@nonexisting-fb-interruptible@d-hdmi-a3:
- Statuses : 1 pass(s)
- Exec time: [0.62] s

  * igt@kms_flip@wf_vblank-ts-check-interruptible@d-hdmi-a3:
- Statuses : 1 pass(s)
- Exec time: [8.07] s

  * igt@kms_plane_scaling@invalid-num-scalers@pipe-d-edp-1-invalid-num-scalers:
- Statuses : 1 pass(s)
- Exec time: [0.02] s

  * 
igt@kms_plane_scaling@planes-downscale-factor-0-75@pipe-d-edp-1-planes-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  * 
igt@kms_plane_scaling@planes-scaling-unity-scaling@pipe-d-edp-1-planes-unity-scaling:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  * 
igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-75@pipe-d-edp-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [1.28] s

  * igt@kms_plane_scaling@planes-upscale-20x20@pipe-d-edp-1-planes-upscale:
- Statuses : 1 pass(s)
- Exec time: [1.22] s

  * 
igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-75@pipe-a-edp-1-planes-upscale-downscale:
- Statuses : 3 pass(s)
- Exec time: [0.13, 2.04] s

  * 
igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-75@pipe-a-hdmi-a-1-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [0.34] s

  * 
igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-75@pipe-a-vga-1-planes-upscale-downscale:
- Statuses : 1 skip(s)
- Exec time: [0.04] s

  * 
igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-75@pipe-b-edp-1-planes-upscale-downscale:
- Statuses : 3 pass(s)
- Exec time: [1.23, 1.85] s

  * 
igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-75@pipe-b-hdmi-a-2-planes-upscale-downscale:
- Statuses : 1 pass(s)
- Exec time: [0.34] s

  * 
igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-75@pipe-b-vga-1-planes-upscale-downscale:
- Statuses : 1 skip(s)
- Exec time: [0.03] s


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Remove the vm open count

2022-03-02 Thread Thomas Hellström
On Tue, 2022-03-01 at 19:45 -0800, Niranjana Vishwanathapura wrote:
> On Tue, Feb 22, 2022 at 06:10:30PM +0100, Thomas Hellström wrote:
> > vms are not getting properly closed. Rather than fixing that,
> > remove the vm open count and instead rely on the vm refcount.
> > 
> > The vm open count existed solely to break the strong references the
> > vmas had on the vms. Now instead make those references weak and
> > ensure vmas are destroyed when the vm is destroyed.
> > 
> > Unfortunately if the vm destructor and the object destructor both
> > want to destroy a vma, that may lead to a race in that the vm
> > destructor just unbinds the vma and leaves the actual vma destruction
> > to the object destructor. However in order for the object destructor
> > to ensure the vma is unbound it needs to grab the vm mutex. In order
> > to keep the vm mutex alive until the object destructor is done with
> > it, somewhat hackishly grab a vm_resv refcount that is released late
> > in the vma destruction process, when the vm mutex is no longer
> > needed.
> > 
> > Cc: 
> > Co-developed-by: Niranjana Vishwanathapura
> > 
> > Signed-off-by: Niranjana Vishwanathapura
> > 
> > Signed-off-by: Thomas Hellström 
> > ---
> > drivers/gpu/drm/i915/display/intel_dpt.c  |  2 +-
> > drivers/gpu/drm/i915/gem/i915_gem_context.c   | 29 ++-
> > .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  6 ++
> > .../gpu/drm/i915/gem/selftests/mock_context.c |  5 +-
> > drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |  2 +-
> > drivers/gpu/drm/i915/gt/intel_ggtt.c  | 25 ++
> > drivers/gpu/drm/i915/gt/intel_gtt.c   | 48 ---
> > drivers/gpu/drm/i915/gt/intel_gtt.h   | 56 
> > drivers/gpu/drm/i915/gt/selftest_execlists.c  | 86 +---
> > ---
> > drivers/gpu/drm/i915/i915_gem.c   |  6 +-
> > drivers/gpu/drm/i915/i915_vma.c   | 55 
> > drivers/gpu/drm/i915/i915_vma_resource.c  |  2 +-
> > drivers/gpu/drm/i915/i915_vma_resource.h  |  6 ++
> > drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
> > drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  4 +-
> > 15 files changed, 179 insertions(+), 160 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c
> > b/drivers/gpu/drm/i915/display/intel_dpt.c
> > index c2f8f853db90..6920669bc571 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> > @@ -298,5 +298,5 @@ void intel_dpt_destroy(struct
> > i915_address_space *vm)
> > {
> > struct i915_dpt *dpt = i915_vm_to_dpt(vm);
> > 
> > -   i915_vm_close(&dpt->vm);
> > +   i915_vm_put(&dpt->vm);
> > }
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index ebbac2ea0833..41404f043741 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1440,8 +1440,6 @@ static void set_closed_name(struct
> > i915_gem_context *ctx)
> > 
> > static void context_close(struct i915_gem_context *ctx)
> > {
> > -   struct i915_address_space *vm;
> > -
> > /* Flush any concurrent set_engines() */
> > mutex_lock(&ctx->engines_mutex);
> > unpin_engines(__context_engines_static(ctx));
> > @@ -1453,19 +1451,6 @@ static void context_close(struct
> > i915_gem_context *ctx)
> > 
> > set_closed_name(ctx);
> > 
> > -   vm = ctx->vm;
> > -   if (vm) {
> > -   /* i915_vm_close drops the final reference, which
> > is a bit too
> > -    * early and could result in surprises with
> > concurrent
> > -    * operations racing with thist ctx close. Keep a
> > full reference
> > -    * until the end.
> > -    */
> > -   i915_vm_get(vm);
> > -   i915_vm_close(vm);
> > -   }
> > -
> > -   ctx->file_priv = ERR_PTR(-EBADF);
> > -
> > /*
> >  * The LUT uses the VMA as a backpointer to unref the
> > object,
> >  * so we need to clear the LUT before we close all the VMA
> > (inside
> > @@ -1473,6 +1458,8 @@ static void context_close(struct
> > i915_gem_context *ctx)
> >  */
> > lut_close(ctx);
> > 
> > +   ctx->file_priv = ERR_PTR(-EBADF);
> > +
> > spin_lock(&ctx->i915->gem.contexts.lock);
> > list_del(&ctx->link);
> > spin_unlock(&ctx->i915->gem.contexts.lock);
> > @@ -1571,12 +1558,8 @@ i915_gem_create_context(struct
> > drm_i915_private *i915,
> > }
> > vm = >vm;
> > }
> > -   if (vm) {
> > -   ctx->vm = i915_vm_open(vm);
> > -
> > -   /* i915_vm_open() takes a reference */
> > -   i915_vm_put(vm);
> > -   }
> > +   if (vm)
> > +   ctx->vm = vm;
> > 
> > mutex_init(&ctx->engines_mutex);
> > if (pc->num_user_engines >= 0) {
> > @@ -1626,7 +1609,7 @@ i915_gem_create_context(struct
> >