Re: [Intel-gfx] [PATCH v3] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9

2021-03-04 Thread Chris Wilson
Quoting Chris Wilson (2021-03-04 11:56:16)
> Quoting Chris Wilson (2021-03-04 09:19:24)
> > Quoting Tvrtko Ursulin (2021-03-04 09:12:26)
> > > 
> > > On 02/03/2021 06:27, Cooper Chiou wrote:
> > > > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to
> > > > resolve VP8 hardware encoding system hang up on GT1 sku for
> > > > ChromiumOS projects
> > > > 
> > > > Slice specific MMIO read inaccurate so MGSR needs to be programmed
> > > > appropriately to get correct reads from these slicet-related MMIOs.
> > > > 
> > > > It dictates that before any MMIO read into Slice/Subslice specific
> > > > registers, MCR packet control register(0xFDC) needs to be programmed
> > > > to point to any enabled slice/subslice pair, especially GT1 fused sku
> > > > since this issue can be reproduced on VP8 hardware encoding via ffmpeg
> > > > on ChromiumOS devices.
> > > > When exit PC7, MGSR will reset so that we have to skip fused subslice 
> > > > ID.
> > > > 
> > > > Reference: HSD#1508045018,1405586840, BSID#0575
> > > > 
> > > > Cc: Ville Syrjälä 
> > > > Cc: Rodrigo Vivi 
> > > > Cc: Jani Nikula 
> > > > Cc: Chris Wilson 
> > > > Cc: Tvrtko Ursulin 
> > > > Cc: William Tseng 
> > > > Cc: Lee Shawn C 
> > > > 
> > > > Signed-off-by: Cooper Chiou 
> > > > ---
> > > >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +
> > > >   1 file changed, 38 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
> > > > b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > > > index 3b4a7da60f0b..4ad598a727a6 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > > > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private 
> > > > *i915, struct i915_wa_list *wal)
> > > >   wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
> > > >   }
> > > >   
> > > > +static void
> > > > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list 
> > > > *wal)
> > > > +{
> > > > + const struct sseu_dev_info *sseu = >gt.info.sseu;
> > > > + unsigned int slice, subslice;
> > > > + u32 mcr, mcr_mask;
> > > > +
> > > > + GEM_BUG_ON(INTEL_GEN(i915) < 9);
> > > > +
> > > > + /*
> > > > +  * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml
> > > > +  * Before any MMIO read into slice/subslice specific registers, 
> > > > MCR
> > > > +  * packet control register needs to be programmed to point to any
> > > > +  * enabled s/ss pair. Otherwise, incorrect values will be 
> > > > returned.
> > > > +  * This means each subsequent MMIO read will be forwarded to an
> > > > +  * specific s/ss combination, but this is OK since these registers
> > > > +  * are consistent across s/ss in almost all cases. In the rare
> > > > +  * occasions, such as INSTDONE, where this value is dependent
> > > > +  * on s/ss combo, the read should be done with read_subslice_reg.
> > > > +  */
> > > > + slice = fls(sseu->slice_mask) - 1;
> > > > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
> > > > + subslice = fls(intel_sseu_get_subslices(sseu, slice));
> > > > + GEM_BUG_ON(!subslice);
> > > > + subslice--;
> > > > +
> > > > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
> > > > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
> > > > +
> > > > + drm_dbg(>drm, "MCR slice:%d/subslice:%d = %x\n", slice, 
> > > > subslice, mcr);
> > > > +
> > > > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> > > > +}
> > > 
> > > Have you considered reusing existing wa_init_mcr? Just needs the 
> > > top-level assert changed and otherwise it looks it would do the right 
> > > thing for Gen9. Advantage being a smaller patch and less code to carry.
> > 
> > That was the first patch, and fails for the same reason. The problem
> > would appear to be in the mcr_mask for gt3.
> 
> For the record, it appears to be an issue with fls vs ffs. Switching to
> ffs also removes the warnings for workaround failures on ehl/jsl.

Of course the icl in farm2 started spewing warnigns, but the other icl
in farm1 were happy.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Silence pipe tracepoint WARNs

2021-03-04 Thread Patchwork
== Series Details ==

Series: drm/i915: Silence pipe tracepoint WARNs
URL   : https://patchwork.freedesktop.org/series/87637/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_9831_full -> Patchwork_19757_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_19757_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_19757_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_19757_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_mmap_gtt@big-copy-xy:
- shard-glk:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-glk9/igt@gem_mmap_...@big-copy-xy.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-glk9/igt@gem_mmap_...@big-copy-xy.html

  * igt@gem_mmap_offset@clear:
- shard-iclb: [PASS][3] -> [FAIL][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-iclb3/igt@gem_mmap_off...@clear.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-iclb8/igt@gem_mmap_off...@clear.html

  
Known issues


  Here are the changes found in Patchwork_19757_full that come from known 
issues:

### CI changes ###

 Possible fixes 

  * boot:
- shard-skl:  ([PASS][5], [PASS][6], [PASS][7], [PASS][8], 
[PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], 
[PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], 
[FAIL][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25], [PASS][26], 
[PASS][27], [PASS][28]) ([i915#2813]) -> ([PASS][29], [PASS][30], [PASS][31], 
[PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], 
[PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], 
[PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49], 
[PASS][50], [PASS][51], [PASS][52])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl9/boot.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl9/boot.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl9/boot.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl8/boot.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl8/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl8/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl7/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl7/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl6/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl6/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl6/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl6/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl5/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl4/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl4/boot.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl3/boot.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl3/boot.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl2/boot.html
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl2/boot.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl2/boot.html
   [25]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl10/boot.html
   [26]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl10/boot.html
   [27]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl1/boot.html
   [28]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/shard-skl1/boot.html
   [29]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl9/boot.html
   [30]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl9/boot.html
   [31]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl8/boot.html
   [32]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl8/boot.html
   [33]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl8/boot.html
   [34]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl7/boot.html
   [35]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl7/boot.html
   [36]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/shard-skl6/boot.html
   [37]: 

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 (rev2)

2021-03-04 Thread Matt Roper
On Thu, Mar 04, 2021 at 10:37:28AM -0800, Chiou, Cooper wrote:
> > <3> [198.221812] [drm:wa_verify [i915]] *ERROR* engine workaround lost
> > on application! (reg[b004]=0x0, relevant bits were 0x0 vs expected 0x80) <3>
> > [198.222751] [drm:wa_verify [i915]] *ERROR* engine workaround lost on
> > application! (reg[b118]=0x0, relevant bits were 0x0 vs expected 0x20)
> > <3> [198.223076] [drm:wa_verify [i915]] *ERROR* engine workaround lost
> > on application! (reg[b11c]=0x0, relevant bits were 0x0 vs expected 0x4)
> >
> > ?
> >
> > CI does not think they are old warnings and registers are the MCR affected
> > range. So more digging would be needed to be sure. You are saying those
> > happen in our CI without the patch?
> 
> Hi Tvrtko,
> This patch only programmed 0xfdc register in reg[fdc]=0x1200, no touch
> reg[b004]=0x0 & reg[b118]=0x0 & reg[b11c]=0x0, so I don't think this error
> is caused by this change.

0xFDC is the multicast steering register --- it controls how accesses to
other multicast registers operate.  According to bspec page 66673, range
0xB000-0xB0FF is a multicast range that uses slice steering and
0xB100-0xB3FF is a multicast range that uses L3BANK steering.  So the
regressions here are likely due to your patch introducing invalid
steering (i.e., making register accesses target fused-off or
non-existent instances of those registers).

> This error might be due to wa_write_masked_or()
> 
> Meanwhile, as you can see this 2 kbl devices has different CI result.
> 1. fi-kbl-7500u - no any error log -
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19752/fi-kbl-7500u/igt@gem_exec_susp...@basic-s0.html
> 
> 2. fi-kbl-7567u- has register read/write error log:
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19752/fi-kbl-7567u/igt@gem_exec_susp...@basic-s0.html

Multi-cast fusing depends on the fusing of the specific part you're
running on.  When you see these kind of failures on one KBL and not on
another, it's an indiction that you probably need to take a look at the
steering logic being used (i.e., the programming of 0xFDC) for mistakes.
Incorrect steering logic can result in things working fine on platforms
with certain fusing configs, but still cause major regressions on
platforms with different fusing.


Matt

> 
> Cooper
> >
> > Then with regards to the reported perf drop - something to check would be if
> > the CML system you tested on has the same slice/subslice config as the one
> > from which the original report originated. Might be hard if the test farm 
> > has
> > been re-configured. But essentially running the benchmark on a few Gen9
> > machine with fused ss would be needed I think.
> >
> > And finally I couldn't find the WA entry in bspec, but maybe I just don't 
> > know
> > where to look. Someone better versed to finding WA. Maybe Matt you would
> > have time for a quick check if
> > WaProgramMgsrForCorrectSliceSpecificMmioReads is documented as
> > applicable to Gen9?
> >
> > Regards,
> >
> > Tvrtko

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/4] drm/i915: Silence pipe tracepoint WARNs

2021-03-04 Thread Steven Rostedt
On Thu,  4 Mar 2021 19:04:17 +0200
Ville Syrjala  wrote:

> From: Ville Syrjälä 
> 
> I believe this should silence the WARN spew from the
> pipe disable tracepoint Steve reported. And I tossed in
> a few other minor improvements as well.
> 
> Cc: Steven Rostedt 

It seemed to have stopped the general protection faults when tracing all
events on my machine.

Reported-by: Steven Rostedt (VMware) 
Tested-by: Steven Rostedt (VMware) 

-- Steve

> 
> Ville Syrjälä (4):
>   drm/i915: Move pipe enable/disable tracepoints to
> intel_crtc_vblank_{on,off}()
>   drm/i915: Don't try to query the frame counter for disabled pipes
>   drm/i915: Return zero as the scanline counter for disabled pipes
>   drm/i915: Fix DSI TE max_vblank_count handling
> 
>  drivers/gpu/drm/i915/display/intel_crtc.c| 24 +---
>  drivers/gpu/drm/i915/display/intel_display.c |  8 +--
>  drivers/gpu/drm/i915/i915_irq.c  |  2 +-
>  3 files changed, 23 insertions(+), 11 deletions(-)
> 

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [patch 2/7] drm/vmgfx: Replace kmap_atomic()

2021-03-04 Thread Zack Rusin


> On Mar 3, 2021, at 08:20, Thomas Gleixner  wrote:
> 
> From: Thomas Gleixner 
> 
> There is no reason to disable pagefaults and preemption as a side effect of
> kmap_atomic_prot().
> 
> Use kmap_local_page_prot() instead and document the reasoning for the
> mapping usage with the given pgprot.
> 
> Remove the NULL pointer check for the map. These functions return a valid
> address for valid pages and the return was bogus anyway as it would have
> left preemption and pagefaults disabled.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: VMware Graphics 
> Cc: Roland Scheidegger 
> Cc: Zack Rusin 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> ---
> drivers/gpu/drm/vmwgfx/vmwgfx_blit.c |   30 --
> 1 file changed, 12 insertions(+), 18 deletions(-)
> 
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
> @@ -375,12 +375,12 @@ static int vmw_bo_cpu_blit_line(struct v
>   copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
> 
>   if (unmap_src) {
> - kunmap_atomic(d->src_addr);
> + kunmap_local(d->src_addr);
>   d->src_addr = NULL;
>   }
> 
>   if (unmap_dst) {
> - kunmap_atomic(d->dst_addr);
> + kunmap_local(d->dst_addr);
>   d->dst_addr = NULL;
>   }
> 
> @@ -388,12 +388,8 @@ static int vmw_bo_cpu_blit_line(struct v
>   if (WARN_ON_ONCE(dst_page >= d->dst_num_pages))
>   return -EINVAL;
> 
> - d->dst_addr =
> - kmap_atomic_prot(d->dst_pages[dst_page],
> -  d->dst_prot);
> - if (!d->dst_addr)
> - return -ENOMEM;
> -
> + d->dst_addr = 
> kmap_local_page_prot(d->dst_pages[dst_page],
> +d->dst_prot);
>   d->mapped_dst = dst_page;
>   }
> 
> @@ -401,12 +397,8 @@ static int vmw_bo_cpu_blit_line(struct v
>   if (WARN_ON_ONCE(src_page >= d->src_num_pages))
>   return -EINVAL;
> 
> - d->src_addr =
> - kmap_atomic_prot(d->src_pages[src_page],
> -  d->src_prot);
> - if (!d->src_addr)
> - return -ENOMEM;
> -
> + d->src_addr = 
> kmap_local_page_prot(d->src_pages[src_page],
> +d->src_prot);
>   d->mapped_src = src_page;
>   }
>   diff->do_cpy(diff, d->dst_addr + dst_page_offset,
> @@ -436,8 +428,10 @@ static int vmw_bo_cpu_blit_line(struct v
>  *
>  * Performs a CPU blit from one buffer object to another avoiding a full
>  * bo vmap which may exhaust- or fragment vmalloc space.
> - * On supported architectures (x86), we're using kmap_atomic which avoids
> - * cross-processor TLB- and cache flushes and may, on non-HIGHMEM systems
> + *
> + * On supported architectures (x86), we're using kmap_local_prot() which
> + * avoids cross-processor TLB- and cache flushes. kmap_local_prot() will
> + * either map a highmem page with the proper pgprot on HIGHMEM=y systems or
>  * reference already set-up mappings.
>  *
>  * Neither of the buffer objects may be placed in PCI memory
> @@ -500,9 +494,9 @@ int vmw_bo_cpu_blit(struct ttm_buffer_ob
>   }
> out:
>   if (d.src_addr)
> - kunmap_atomic(d.src_addr);
> + kunmap_local(d.src_addr);
>   if (d.dst_addr)
> - kunmap_atomic(d.dst_addr);
> + kunmap_local(d.dst_addr);
> 
>   return ret;
> }


Looks good. Thanks.

Reviewed-by: Zack Rusin 

z

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 (rev2)

2021-03-04 Thread Chiou, Cooper
> <3> [198.221812] [drm:wa_verify [i915]] *ERROR* engine workaround lost
> on application! (reg[b004]=0x0, relevant bits were 0x0 vs expected 0x80) <3>
> [198.222751] [drm:wa_verify [i915]] *ERROR* engine workaround lost on
> application! (reg[b118]=0x0, relevant bits were 0x0 vs expected 0x20)
> <3> [198.223076] [drm:wa_verify [i915]] *ERROR* engine workaround lost
> on application! (reg[b11c]=0x0, relevant bits were 0x0 vs expected 0x4)
> 
> ?
> 
> CI does not think they are old warnings and registers are the MCR affected
> range. So more digging would be needed to be sure. You are saying those
> happen in our CI without the patch?

Hi Tvrtko,
This patch only programmed 0xfdc register in reg[fdc]=0x1200, no touch 
reg[b004]=0x0 & reg[b118]=0x0 & reg[b11c]=0x0, so I don't think this error 
is caused by this change.
This error might be due to wa_write_masked_or() 

Meanwhile, as you can see this 2 kbl devices has different CI result.
1. fi-kbl-7500u - no any error log - 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19752/fi-kbl-7500u/igt@gem_exec_susp...@basic-s0.html

2. fi-kbl-7567u- has register read/write error log: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19752/fi-kbl-7567u/igt@gem_exec_susp...@basic-s0.html

Cooper
> 
> Then with regards to the reported perf drop - something to check would be if
> the CML system you tested on has the same slice/subslice config as the one
> from which the original report originated. Might be hard if the test farm has
> been re-configured. But essentially running the benchmark on a few Gen9
> machine with fused ss would be needed I think.
> 
> And finally I couldn't find the WA entry in bspec, but maybe I just don't know
> where to look. Someone better versed to finding WA. Maybe Matt you would
> have time for a quick check if
> WaProgramMgsrForCorrectSliceSpecificMmioReads is documented as
> applicable to Gen9?
> 
> Regards,
> 
> Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Silence pipe tracepoint WARNs

2021-03-04 Thread Patchwork
== Series Details ==

Series: drm/i915: Silence pipe tracepoint WARNs
URL   : https://patchwork.freedesktop.org/series/87637/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_9831 -> Patchwork_19757


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/index.html

Known issues


  Here are the changes found in Patchwork_19757 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_gttfill@basic:
- fi-kbl-8809g:   [PASS][1] -> [TIMEOUT][2] ([i915#3145])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/fi-kbl-8809g/igt@gem_exec_gttf...@basic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-kbl-8809g/igt@gem_exec_gttf...@basic.html

  * igt@gem_huc_copy@huc-copy:
- fi-byt-j1900:   NOTRUN -> [SKIP][3] ([fdo#109271]) +27 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-byt-j1900/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@live@hangcheck:
- fi-snb-2600:[PASS][4] -> [INCOMPLETE][5] ([i915#2782])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_chamelium@hdmi-crc-fast:
- fi-byt-j1900:   NOTRUN -> [SKIP][6] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-byt-j1900/igt@kms_chamel...@hdmi-crc-fast.html

  
 Possible fixes 

  * igt@gem_linear_blits@basic:
- fi-kbl-8809g:   [TIMEOUT][7] ([i915#2502]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/fi-kbl-8809g/igt@gem_linear_bl...@basic.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-kbl-8809g/igt@gem_linear_bl...@basic.html

  * igt@gem_tiled_fence_blits@basic:
- fi-kbl-8809g:   [TIMEOUT][9] ([i915#3145]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/fi-kbl-8809g/igt@gem_tiled_fence_bl...@basic.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-kbl-8809g/igt@gem_tiled_fence_bl...@basic.html

  * igt@i915_module_load@reload:
- fi-tgl-u2:  [DMESG-WARN][11] ([i915#1982] / [k.org#205379]) -> 
[PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/fi-tgl-u2/igt@i915_module_l...@reload.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-tgl-u2/igt@i915_module_l...@reload.html

  
 Warnings 

  * igt@i915_pm_rpm@module-reload:
- fi-glk-dsi: [DMESG-WARN][13] ([i915#1982]) -> [DMESG-WARN][14] 
([i915#1982] / [i915#3143])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9831/fi-glk-dsi/igt@i915_pm_...@module-reload.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/fi-glk-dsi/igt@i915_pm_...@module-reload.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2502]: https://gitlab.freedesktop.org/drm/intel/issues/2502
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#3143]: https://gitlab.freedesktop.org/drm/intel/issues/3143
  [i915#3145]: https://gitlab.freedesktop.org/drm/intel/issues/3145
  [k.org#205379]: https://bugzilla.kernel.org/show_bug.cgi?id=205379


Participating hosts (45 -> 41)
--

  Additional (1): fi-byt-j1900 
  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-n3050 fi-bsw-cyan 
fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_9831 -> Patchwork_19757

  CI-20190529: 20190529
  CI_DRM_9831: cd9d321c49fde8049da5a9569839aeffcb6fc964 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6022: 3c3d08ad629c404ace39256da334e4317b550de6 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_19757: d46ca721a906163531d0851c4fd201ca2bc2fd0f @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d46ca721a906 drm/i915: Fix DSI TE max_vblank_count handling
929c4e1755da drm/i915: Return zero as the scanline counter for disabled pipes
620967eed766 drm/i915: Don't try to query the frame counter for disabled pipes
0753358edc3d drm/i915: Move pipe enable/disable tracepoints to 
intel_crtc_vblank_{on, off}()

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19757/index.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [patch 1/7] drm/ttm: Replace kmap_atomic() usage

2021-03-04 Thread Christian König



Am 03.03.21 um 14:20 schrieb Thomas Gleixner:

From: Thomas Gleixner 

There is no reason to disable pagefaults and preemption as a side effect of
kmap_atomic_prot().

Use kmap_local_page_prot() instead and document the reasoning for the
mapping usage with the given pgprot.

Remove the NULL pointer check for the map. These functions return a valid
address for valid pages and the return was bogus anyway as it would have
left preemption and pagefaults disabled.

Signed-off-by: Thomas Gleixner 
Cc: Christian Koenig 
Cc: Huang Rui 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-de...@lists.freedesktop.org
---
  drivers/gpu/drm/ttm/ttm_bo_util.c |   20 
  1 file changed, 12 insertions(+), 8 deletions(-)

--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -181,13 +181,15 @@ static int ttm_copy_io_ttm_page(struct t
return -ENOMEM;
  
  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));

-   dst = kmap_atomic_prot(d, prot);
-   if (!dst)
-   return -ENOMEM;
+   /*
+* Ensure that a highmem page is mapped with the correct
+* pgprot. For non highmem the mapping is already there.
+*/


I find the comment a bit misleading. Maybe write:

/*
 * Locally map highmem pages with the correct pgprot.
 * Normal memory should already have the correct pgprot in the linear 
mapping.

 */

Apart from that looks good to me.

Regards,
Christian.


+   dst = kmap_local_page_prot(d, prot);
  
  	memcpy_fromio(dst, src, PAGE_SIZE);
  
-	kunmap_atomic(dst);

+   kunmap_local(dst);
  
  	return 0;

  }
@@ -203,13 +205,15 @@ static int ttm_copy_ttm_io_page(struct t
return -ENOMEM;
  
  	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));

-   src = kmap_atomic_prot(s, prot);
-   if (!src)
-   return -ENOMEM;
+   /*
+* Ensure that a highmem page is mapped with the correct
+* pgprot. For non highmem the mapping is already there.
+*/
+   src = kmap_local_page_prot(s, prot);
  
  	memcpy_toio(dst, src, PAGE_SIZE);
  
-	kunmap_atomic(src);

+   kunmap_local(src);
  
  	return 0;

  }




___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 4/4] drm/i915: Fix DSI TE max_vblank_count handling

2021-03-04 Thread Ville Syrjala
From: Ville Syrjälä 

commit 33267703df15 ("drm/i915/dsi: Enable software vblank counter")
claims to get the mode_flags from the crtc_state, but in fact does
not. Fix it to do it right.

Cc: Vandita Kulkarni 
Cc: Jani Nikula 
Cc: Steven Rostedt 
Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/display/intel_crtc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c 
b/drivers/gpu/drm/i915/display/intel_crtc.c
index 34ff40852a37..7dcbec3a3ca5 100644
--- a/drivers/gpu/drm/i915/display/intel_crtc.c
+++ b/drivers/gpu/drm/i915/display/intel_crtc.c
@@ -53,8 +53,6 @@ u32 intel_crtc_get_vblank_counter(struct intel_crtc *crtc)
 u32 intel_crtc_max_vblank_count(const struct intel_crtc_state *crtc_state)
 {
struct drm_i915_private *dev_priv = to_i915(crtc_state->uapi.crtc->dev);
-   struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
-   u32 mode_flags = crtc->mode_flags;
 
/*
 * From Gen 11, In case of dsi cmd mode, frame counter wouldnt
@@ -62,7 +60,8 @@ u32 intel_crtc_max_vblank_count(const struct intel_crtc_state 
*crtc_state)
 * the hw counter, then we would find it updated in only
 * the next TE, hence switching to sw counter.
 */
-   if (mode_flags & (I915_MODE_FLAG_DSI_USE_TE0 | 
I915_MODE_FLAG_DSI_USE_TE1))
+   if (crtc_state->mode_flags & (I915_MODE_FLAG_DSI_USE_TE0 |
+ I915_MODE_FLAG_DSI_USE_TE1))
return 0;
 
/*
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 0/4] drm/i915: Silence pipe tracepoint WARNs

2021-03-04 Thread Ville Syrjala
From: Ville Syrjälä 

I believe this should silence the WARN spew from the
pipe disable tracepoint Steve reported. And I tossed in
a few other minor improvements as well.

Cc: Steven Rostedt 

Ville Syrjälä (4):
  drm/i915: Move pipe enable/disable tracepoints to
intel_crtc_vblank_{on,off}()
  drm/i915: Don't try to query the frame counter for disabled pipes
  drm/i915: Return zero as the scanline counter for disabled pipes
  drm/i915: Fix DSI TE max_vblank_count handling

 drivers/gpu/drm/i915/display/intel_crtc.c| 24 +---
 drivers/gpu/drm/i915/display/intel_display.c |  8 +--
 drivers/gpu/drm/i915/i915_irq.c  |  2 +-
 3 files changed, 23 insertions(+), 11 deletions(-)

-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 3/4] drm/i915: Return zero as the scanline counter for disabled pipes

2021-03-04 Thread Ville Syrjala
From: Ville Syrjälä 

We print the scanline counters as unsigned integers so the -1
here just makes the debugs/traces look a bit messy. Zero seems
equally valid for this usecase.

Cc: Steven Rostedt 
Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/i915_irq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 01c409005f6f..44aed4cbf894 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -794,7 +794,7 @@ static int __intel_get_crtc_scanline(struct intel_crtc 
*crtc)
int position, vtotal;
 
if (!crtc->active)
-   return -1;
+   return 0;
 
vblank = >base.dev->vblank[drm_crtc_index(>base)];
mode = >hwmode;
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/4] drm/i915: Don't try to query the frame counter for disabled pipes

2021-03-04 Thread Ville Syrjala
From: Ville Syrjälä 

For platforms/outputs without hardware frame counters we can't
call drm_crtc_accurate_vblank_count() when the vblank support is
disabled or we just get a WARN due to the crtc timings
(vblank->hwmode) being considered invalid. Note that until the
pipe in question has been enabled and drm_crtc_set_max_vblank_count()
has been called on it we would also take this path on platforms
which have a working frame counter. So getting the WARN is rather
likely on any platform unless you always boot with lots of displays
plugged in.

Also even on hardware with a working frame counter we may not be
able to read the actual frame counter register on disabled pipes
due the relevant power well being disabled. Ie. would just result
in the unclaimed reg spew.

So let's just avoid all this an directly report zero in case
the pipe is disabled.

Reported-by: Steven Rostedt 
Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/display/intel_crtc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c 
b/drivers/gpu/drm/i915/display/intel_crtc.c
index fd8a66cece80..34ff40852a37 100644
--- a/drivers/gpu/drm/i915/display/intel_crtc.c
+++ b/drivers/gpu/drm/i915/display/intel_crtc.c
@@ -41,6 +41,9 @@ u32 intel_crtc_get_vblank_counter(struct intel_crtc *crtc)
struct drm_device *dev = crtc->base.dev;
struct drm_vblank_crtc *vblank = 
>vblank[drm_crtc_index(>base)];
 
+   if (!crtc->active)
+   return 0;
+
if (!vblank->max_vblank_count)
return (u32)drm_crtc_accurate_vblank_count(>base);
 
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/4] drm/i915: Move pipe enable/disable tracepoints to intel_crtc_vblank_{on, off}()

2021-03-04 Thread Ville Syrjala
From: Ville Syrjälä 

On platforms/outputs without a working frame counter we rely
on the vblank code to cook up the frame counter from the timestamps.
That requires that vblank support is enabled. Thus we need to
move the pipe enable/disable tracepoints to the other side
of the drm_vblank_{on,off}() calls. There shouldn't really be
much happening between these old and new call sites so the
tracepoints should still provide reasonable data.

The alternative would be to give up on having the frame counter
values in the trace which would render the tracepoints more or
less pointless.

Reported-by: Steven Rostedt 
Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/display/intel_crtc.c| 16 
 drivers/gpu/drm/i915/display/intel_display.c |  8 +---
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c 
b/drivers/gpu/drm/i915/display/intel_crtc.c
index 88b44ac50aae..fd8a66cece80 100644
--- a/drivers/gpu/drm/i915/display/intel_crtc.c
+++ b/drivers/gpu/drm/i915/display/intel_crtc.c
@@ -28,6 +28,8 @@
 #include "i9xx_plane.h"
 #include "skl_universal_plane.h"
 
+#include "i915_trace.h"
+
 static void assert_vblank_disabled(struct drm_crtc *crtc)
 {
if (I915_STATE_WARN_ON(drm_crtc_vblank_get(crtc) == 0))
@@ -84,12 +86,26 @@ void intel_crtc_vblank_on(const struct intel_crtc_state 
*crtc_state)
drm_crtc_set_max_vblank_count(>base,
  intel_crtc_max_vblank_count(crtc_state));
drm_crtc_vblank_on(>base);
+
+   /*
+* Should really happen exactly when we enable the pipe
+* but we want the frame counters in the trace, and that
+* requires vblank support on some platforms/outputs.
+*/
+   trace_intel_pipe_enable(crtc);
 }
 
 void intel_crtc_vblank_off(const struct intel_crtc_state *crtc_state)
 {
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
 
+   /*
+* Should really happen exactly when we disable the pipe
+* but we want the frame counters in the trace, and that
+* requires vblank support on some platforms/outputs.
+*/
+   trace_intel_pipe_disable(crtc);
+
drm_crtc_vblank_off(>base);
assert_vblank_disabled(>base);
 }
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index b346f6ceb4a2..16b5cb73ddac 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -794,8 +794,6 @@ void intel_enable_pipe(const struct intel_crtc_state 
*new_crtc_state)
/* FIXME: assert CPU port conditions for SNB+ */
}
 
-   trace_intel_pipe_enable(crtc);
-
reg = PIPECONF(cpu_transcoder);
val = intel_de_read(dev_priv, reg);
if (val & PIPECONF_ENABLE) {
@@ -835,8 +833,6 @@ void intel_disable_pipe(const struct intel_crtc_state 
*old_crtc_state)
 */
assert_planes_disabled(crtc);
 
-   trace_intel_pipe_disable(crtc);
-
reg = PIPECONF(cpu_transcoder);
val = intel_de_read(dev_priv, reg);
if ((val & PIPECONF_ENABLE) == 0)
@@ -4023,10 +4019,8 @@ static void hsw_crtc_enable(struct intel_atomic_state 
*state,
if (INTEL_GEN(dev_priv) >= 11)
icl_pipe_mbus_enable(crtc);
 
-   if (new_crtc_state->bigjoiner_slave) {
-   trace_intel_pipe_enable(crtc);
+   if (new_crtc_state->bigjoiner_slave)
intel_crtc_vblank_on(new_crtc_state);
-   }
 
intel_encoders_enable(state, crtc);
 
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [WARNING] v5.12-rc1 in intel_pipe_disable tracepoint

2021-03-04 Thread Ville Syrjälä
On Mon, Mar 01, 2021 at 07:20:59PM +0200, Ville Syrjälä wrote:
> On Mon, Mar 01, 2021 at 11:59:46AM -0500, Steven Rostedt wrote:
> > 
> > On my test box with latest v5.12-rc1, running with all trace events
> > enabled, I consistently trigger this warning.
> > 
> > It appears to get triggered by the trace_intel_pipe_disable() code.
> > 
> > -- Steve
> > 
> >  [ cut here ]
> >  i915 :00:02.0: drm_WARN_ON_ONCE(drm_drv_uses_atomic_modeset(dev))
> >  WARNING: CPU: 7 PID: 1258 at drivers/gpu/drm/drm_vblank.c:728 
> > drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x319/0x330 [drm]
> >  Modules linked in: ebtable_filter ebtables bridge stp llc vsock vmw_vmci 
> > ipt_REJECT nf_reject_ipv4 iptable_filter ip6t_REJECT nf_reject_ipv6 
> > xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
> > ip6table_filter ip6_tables snd_hda_codec_hdmi snd_h
> > ek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg 
> > snd_hda_codec joydev snd_hwdep intel_rapl_msr snd_hda_core hp_wmi i915 
> > iTCO_wdt snd_seq intel_rapl_common iTCO_vendor_support wmi_bmof 
> > sparse_keymap snd_seq_device rfkill snd_pcm x86_pkg_t
> > d_timer i2c_algo_bit drm_kms_helper mei_me intel_powerclamp snd mei 
> > soundcore i2c_i801 drm coretemp lpc_ich e1000e kvm_intel i2c_smbus kvm 
> > irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel serio_raw 
> > ghash_clmulni_intel video tpm_infineon wmi ip_tables
> >  CPU: 7 PID: 1258 Comm: Xorg Tainted: GW 5.12.0-rc1-test+ 
> > #12
> >  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 
> > v03.03 07/14/2016
> >  RIP: 0010:drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x319/0x330 
> > [drm]
> >  Code: 4c 8b 6f 50 4d 85 ed 75 03 4c 8b 2f e8 60 92 45 c2 48 c7 c1 28 a5 3c 
> > c0 4c 89 ea 48 c7 c7 15 5a 3c c0 48 89 c6 e8 1f e7 7b c2 <0f> 0b e9 e2 fe 
> > ff ff e8 fb 6c 81 c2 66 66 2e 0f 1f 84 00 00 00 00
> >  RSP: 0018:b77580ea7920 EFLAGS: 00010082
> >  RAX:  RBX: 8afe500c RCX: 
> >  RDX: 0004 RSI: 833c86b8 RDI: 0001
> >  RBP: b77580ea7990 R08: 00700c782173 R09: 
> >  R10: 0001 R11: 0001 R12: 
> >  R13: 8afe41c7eff0 R14: c05e0410 R15: 8afe456a2bf8
> >  FS:  7f8f91869f00() GS:8afe5aa0() 
> > knlGS:
> >  CS:  0010 DS:  ES:  CR0: 80050033
> >  CR2: 7f9523a6cad0 CR3: 01b78002 CR4: 001706e0
> >  Call Trace:
> >   drm_get_last_vbltimestamp+0xaa/0xc0 [drm]
> >   drm_update_vblank_count+0x90/0x2d0 [drm]
> >   drm_crtc_accurate_vblank_count+0x3e/0xc0 [drm]
> >   intel_crtc_get_vblank_counter+0x43/0x50 [i915]
> >   trace_event_raw_event_intel_pipe_disable+0x87/0x110 [i915]
> >   intel_disable_pipe+0x1a8/0x210 [i915]
> 
> Hmm. Yeah we do vblank_off() before pipe_disable() which wants
> to still grab the frame counter in the tracepoint. I think we
> could reorder those two without causing any problems. Either
> that or we put the tracepoint before vblank_off().
> 
> >   ilk_crtc_disable+0x85/0x390 [i915]
> 
> But this part is confusing me. intel_crtc_get_vblank_counter() is
> only supposed to do drm_crtc_accurate_vblank_count() fallback when
> the hardware lacks a working frame counter, and that should only
> be the case for ancient gen2 or semi-ancient i965gm TV output,
> ilk_crtc_disable() is not the function we should be calling in
> either of those cases.

OK, figured this out. It can happen on any platform due to
never initializing .max_vblank_count for pipes that were
disabled at boot. Also spotted some other issues in this
code that needs fixing. I'll mail out some patches...

-- 
Ville Syrjälä
Intel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.BUILD: failure for drm, highmem: Cleanup io/kmap_atomic*() usage

2021-03-04 Thread Patchwork
== Series Details ==

Series: drm, highmem: Cleanup io/kmap_atomic*() usage
URL   : https://patchwork.freedesktop.org/series/87631/
State : failure

== Summary ==

Applying: drm/ttm: Replace kmap_atomic() usage
Applying: drm/vmgfx: Replace kmap_atomic()
Applying: highmem: Remove kmap_atomic_prot()
Applying: drm/qxl: Replace io_mapping_map_atomic_wc()
error: sha1 information is lacking or useless (drivers/gpu/drm/qxl/qxl_image.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0004 drm/qxl: Replace io_mapping_map_atomic_wc()
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [patch 1/7] drm/ttm: Replace kmap_atomic() usage

2021-03-04 Thread Thomas Gleixner
From: Thomas Gleixner 

There is no reason to disable pagefaults and preemption as a side effect of
kmap_atomic_prot().

Use kmap_local_page_prot() instead and document the reasoning for the
mapping usage with the given pgprot.

Remove the NULL pointer check for the map. These functions return a valid
address for valid pages and the return was bogus anyway as it would have
left preemption and pagefaults disabled.

Signed-off-by: Thomas Gleixner 
Cc: Christian Koenig 
Cc: Huang Rui 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-de...@lists.freedesktop.org
---
 drivers/gpu/drm/ttm/ttm_bo_util.c |   20 
 1 file changed, 12 insertions(+), 8 deletions(-)

--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -181,13 +181,15 @@ static int ttm_copy_io_ttm_page(struct t
return -ENOMEM;
 
src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
-   dst = kmap_atomic_prot(d, prot);
-   if (!dst)
-   return -ENOMEM;
+   /*
+* Ensure that a highmem page is mapped with the correct
+* pgprot. For non highmem the mapping is already there.
+*/
+   dst = kmap_local_page_prot(d, prot);
 
memcpy_fromio(dst, src, PAGE_SIZE);
 
-   kunmap_atomic(dst);
+   kunmap_local(dst);
 
return 0;
 }
@@ -203,13 +205,15 @@ static int ttm_copy_ttm_io_page(struct t
return -ENOMEM;
 
dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));
-   src = kmap_atomic_prot(s, prot);
-   if (!src)
-   return -ENOMEM;
+   /*
+* Ensure that a highmem page is mapped with the correct
+* pgprot. For non highmem the mapping is already there.
+*/
+   src = kmap_local_page_prot(s, prot);
 
memcpy_toio(dst, src, PAGE_SIZE);
 
-   kunmap_atomic(src);
+   kunmap_local(src);
 
return 0;
 }


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [patch 0/7] drm, highmem: Cleanup io/kmap_atomic*() usage

2021-03-04 Thread Thomas Gleixner
None of the DRM usage sites of temporary mappings requires the side
effects of io/kmap_atomic(), i.e. preemption and pagefault disable.

Replace them with the io/kmap_local() variants, simplify the
copy_to/from_user() error handling and remove the atomic variants.

Thanks,

tglx
---
 Documentation/driver-api/io-mapping.rst |   22 +++---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c  |7 +--
 drivers/gpu/drm/i915/i915_gem.c |   40 ++-
 drivers/gpu/drm/i915/selftests/i915_gem.c   |4 -
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c   |8 +--
 drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h |8 +--
 drivers/gpu/drm/qxl/qxl_image.c |   18 
 drivers/gpu/drm/qxl/qxl_ioctl.c |   27 ++--
 drivers/gpu/drm/qxl/qxl_object.c|   12 ++---
 drivers/gpu/drm/qxl/qxl_object.h|4 -
 drivers/gpu/drm/qxl/qxl_release.c   |4 -
 drivers/gpu/drm/ttm/ttm_bo_util.c   |   20 +
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c|   30 +-
 include/linux/highmem-internal.h|   14 --
 include/linux/io-mapping.h  |   42 
 15 files changed, 93 insertions(+), 167 deletions(-)
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [patch 3/7] highmem: Remove kmap_atomic_prot()

2021-03-04 Thread Thomas Gleixner
From: Thomas Gleixner 

No more users.

Signed-off-by: Thomas Gleixner 
Cc: Andrew Morton 
Cc: linux...@kvack.org
---
 include/linux/highmem-internal.h |   14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

--- a/include/linux/highmem-internal.h
+++ b/include/linux/highmem-internal.h
@@ -88,16 +88,11 @@ static inline void __kunmap_local(void *
kunmap_local_indexed(vaddr);
 }
 
-static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
+static inline void *kmap_atomic(struct page *page)
 {
preempt_disable();
pagefault_disable();
-   return __kmap_local_page_prot(page, prot);
-}
-
-static inline void *kmap_atomic(struct page *page)
-{
-   return kmap_atomic_prot(page, kmap_prot);
+   return __kmap_local_page_prot(page, kmap_prot);
 }
 
 static inline void *kmap_atomic_pfn(unsigned long pfn)
@@ -184,11 +179,6 @@ static inline void *kmap_atomic(struct p
return page_address(page);
 }
 
-static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
-{
-   return kmap_atomic(page);
-}
-
 static inline void *kmap_atomic_pfn(unsigned long pfn)
 {
return kmap_atomic(pfn_to_page(pfn));

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [patch 4/7] drm/qxl: Replace io_mapping_map_atomic_wc()

2021-03-04 Thread Thomas Gleixner
From: Thomas Gleixner 

None of these mapping requires the side effect of disabling pagefaults and
preemption.

Use io_mapping_map_local_wc() instead, rename the related functions
accordingly and clean up qxl_process_single_command() to use a plain
copy_from_user() as the local maps are not disabling pagefaults.

Signed-off-by: Thomas Gleixner 
Cc: David Airlie 
Cc: Gerd Hoffmann 
Cc: Daniel Vetter 
Cc: virtualizat...@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
---
 drivers/gpu/drm/qxl/qxl_image.c   |   18 +-
 drivers/gpu/drm/qxl/qxl_ioctl.c   |   27 +--
 drivers/gpu/drm/qxl/qxl_object.c  |   12 ++--
 drivers/gpu/drm/qxl/qxl_object.h  |4 ++--
 drivers/gpu/drm/qxl/qxl_release.c |4 ++--
 5 files changed, 32 insertions(+), 33 deletions(-)

--- a/drivers/gpu/drm/qxl/qxl_image.c
+++ b/drivers/gpu/drm/qxl/qxl_image.c
@@ -124,12 +124,12 @@ qxl_image_init_helper(struct qxl_device
  wrong (check the bitmaps are sent correctly
  first) */
 
-   ptr = qxl_bo_kmap_atomic_page(qdev, chunk_bo, 0);
+   ptr = qxl_bo_kmap_local_page(qdev, chunk_bo, 0);
chunk = ptr;
chunk->data_size = height * chunk_stride;
chunk->prev_chunk = 0;
chunk->next_chunk = 0;
-   qxl_bo_kunmap_atomic_page(qdev, chunk_bo, ptr);
+   qxl_bo_kunmap_local_page(qdev, chunk_bo, ptr);
 
{
void *k_data, *i_data;
@@ -143,7 +143,7 @@ qxl_image_init_helper(struct qxl_device
i_data = (void *)data;
 
while (remain > 0) {
-   ptr = qxl_bo_kmap_atomic_page(qdev, chunk_bo, 
page << PAGE_SHIFT);
+   ptr = qxl_bo_kmap_local_page(qdev, chunk_bo, 
page << PAGE_SHIFT);
 
if (page == 0) {
chunk = ptr;
@@ -157,7 +157,7 @@ qxl_image_init_helper(struct qxl_device
 
memcpy(k_data, i_data, size);
 
-   qxl_bo_kunmap_atomic_page(qdev, chunk_bo, ptr);
+   qxl_bo_kunmap_local_page(qdev, chunk_bo, ptr);
i_data += size;
remain -= size;
page++;
@@ -175,10 +175,10 @@ qxl_image_init_helper(struct qxl_device
page_offset = 
offset_in_page(out_offset);
size = min((int)(PAGE_SIZE - 
page_offset), remain);
 
-   ptr = qxl_bo_kmap_atomic_page(qdev, 
chunk_bo, page_base);
+   ptr = qxl_bo_kmap_local_page(qdev, 
chunk_bo, page_base);
k_data = ptr + page_offset;
memcpy(k_data, i_data, size);
-   qxl_bo_kunmap_atomic_page(qdev, 
chunk_bo, ptr);
+   qxl_bo_kunmap_local_page(qdev, 
chunk_bo, ptr);
remain -= size;
i_data += size;
out_offset += size;
@@ -189,7 +189,7 @@ qxl_image_init_helper(struct qxl_device
qxl_bo_kunmap(chunk_bo);
 
image_bo = dimage->bo;
-   ptr = qxl_bo_kmap_atomic_page(qdev, image_bo, 0);
+   ptr = qxl_bo_kmap_local_page(qdev, image_bo, 0);
image = ptr;
 
image->descriptor.id = 0;
@@ -212,7 +212,7 @@ qxl_image_init_helper(struct qxl_device
break;
default:
DRM_ERROR("unsupported image bit depth\n");
-   qxl_bo_kunmap_atomic_page(qdev, image_bo, ptr);
+   qxl_bo_kunmap_local_page(qdev, image_bo, ptr);
return -EINVAL;
}
image->u.bitmap.flags = QXL_BITMAP_TOP_DOWN;
@@ -222,7 +222,7 @@ qxl_image_init_helper(struct qxl_device
image->u.bitmap.palette = 0;
image->u.bitmap.data = qxl_bo_physical_address(qdev, chunk_bo, 0);
 
-   qxl_bo_kunmap_atomic_page(qdev, image_bo, ptr);
+   qxl_bo_kunmap_local_page(qdev, image_bo, ptr);
 
return 0;
 }
--- a/drivers/gpu/drm/qxl/qxl_ioctl.c
+++ b/drivers/gpu/drm/qxl/qxl_ioctl.c
@@ -89,11 +89,11 @@ apply_reloc(struct qxl_device *qdev, str
 {
void *reloc_page;
 
-   reloc_page = qxl_bo_kmap_atomic_page(qdev, info->dst_bo, 
info->dst_offset & PAGE_MASK);
+   reloc_page = qxl_bo_kmap_local_page(qdev, info->dst_bo, 
info->dst_offset & PAGE_MASK);
*(uint64_t *)(reloc_page + (info->dst_offset & ~PAGE_MASK)) = 
qxl_bo_physical_address(qdev,

  info->src_bo,

  

[Intel-gfx] [patch 5/7] drm/nouveau/device: Replace io_mapping_map_atomic_wc()

2021-03-04 Thread Thomas Gleixner
From: Thomas Gleixner 

Neither fbmem_peek() nor fbmem_poke() require to disable pagefaults and
preemption as a side effect of io_mapping_map_atomic_wc().

Use io_mapping_map_local_wc() instead.

Signed-off-by: Thomas Gleixner 
Cc: Ben Skeggs 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-de...@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
---
 drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h
@@ -60,19 +60,19 @@ fbmem_fini(struct io_mapping *fb)
 static inline u32
 fbmem_peek(struct io_mapping *fb, u32 off)
 {
-   u8 __iomem *p = io_mapping_map_atomic_wc(fb, off & PAGE_MASK);
+   u8 __iomem *p = io_mapping_map_local_wc(fb, off & PAGE_MASK);
u32 val = ioread32(p + (off & ~PAGE_MASK));
-   io_mapping_unmap_atomic(p);
+   io_mapping_unmap_local(p);
return val;
 }
 
 static inline void
 fbmem_poke(struct io_mapping *fb, u32 off, u32 val)
 {
-   u8 __iomem *p = io_mapping_map_atomic_wc(fb, off & PAGE_MASK);
+   u8 __iomem *p = io_mapping_map_local_wc(fb, off & PAGE_MASK);
iowrite32(val, p + (off & ~PAGE_MASK));
wmb();
-   io_mapping_unmap_atomic(p);
+   io_mapping_unmap_local(p);
 }
 
 static inline bool


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [patch 6/7] drm/i915: Replace io_mapping_map_atomic_wc()

2021-03-04 Thread Thomas Gleixner
From: Thomas Gleixner 

None of these mapping requires the side effect of disabling pagefaults and
preemption.

Use io_mapping_map_local_wc() instead, and clean up gtt_user_read() and
gtt_user_write() to use a plain copy_from_user() as the local maps are not
disabling pagefaults.

Signed-off-by: Thomas Gleixner 
Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Chris Wilson 
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |7 +---
 drivers/gpu/drm/i915/i915_gem.c|   40 -
 drivers/gpu/drm/i915/selftests/i915_gem.c  |4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c  |8 ++---
 4 files changed, 22 insertions(+), 37 deletions(-)

--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1080,7 +1080,7 @@ static void reloc_cache_reset(struct rel
struct i915_ggtt *ggtt = cache_to_ggtt(cache);
 
intel_gt_flush_ggtt_writes(ggtt->vm.gt);
-   io_mapping_unmap_atomic((void __iomem *)vaddr);
+   io_mapping_unmap_local((void __iomem *)vaddr);
 
if (drm_mm_node_allocated(>node)) {
ggtt->vm.clear_range(>vm,
@@ -1146,7 +1146,7 @@ static void *reloc_iomap(struct drm_i915
 
if (cache->vaddr) {
intel_gt_flush_ggtt_writes(ggtt->vm.gt);
-   io_mapping_unmap_atomic((void __force __iomem *) 
unmask_page(cache->vaddr));
+   io_mapping_unmap_local((void __force __iomem *) 
unmask_page(cache->vaddr));
} else {
struct i915_vma *vma;
int err;
@@ -1194,8 +1194,7 @@ static void *reloc_iomap(struct drm_i915
offset += page << PAGE_SHIFT;
}
 
-   vaddr = (void __force *)io_mapping_map_atomic_wc(>iomap,
-offset);
+   vaddr = (void __force *)io_mapping_map_local_wc(>iomap, offset);
cache->page = page;
cache->vaddr = (unsigned long)vaddr;
 
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -253,22 +253,15 @@ gtt_user_read(struct io_mapping *mapping
  char __user *user_data, int length)
 {
void __iomem *vaddr;
-   unsigned long unwritten;
+   bool fail = false;
 
/* We can use the cpu mem copy function because this is X86. */
-   vaddr = io_mapping_map_atomic_wc(mapping, base);
-   unwritten = __copy_to_user_inatomic(user_data,
-   (void __force *)vaddr + offset,
-   length);
-   io_mapping_unmap_atomic(vaddr);
-   if (unwritten) {
-   vaddr = io_mapping_map_wc(mapping, base, PAGE_SIZE);
-   unwritten = copy_to_user(user_data,
-(void __force *)vaddr + offset,
-length);
-   io_mapping_unmap(vaddr);
-   }
-   return unwritten;
+   vaddr = io_mapping_map_local_wc(mapping, base);
+   if (copy_to_user(user_data, (void __force *)vaddr + offset, length))
+   fail = true;
+   io_mapping_unmap_local(vaddr);
+
+   return fail;
 }
 
 static int
@@ -437,21 +430,14 @@ ggtt_write(struct io_mapping *mapping,
   char __user *user_data, int length)
 {
void __iomem *vaddr;
-   unsigned long unwritten;
+   bool fail = false;
 
/* We can use the cpu mem copy function because this is X86. */
-   vaddr = io_mapping_map_atomic_wc(mapping, base);
-   unwritten = __copy_from_user_inatomic_nocache((void __force *)vaddr + 
offset,
- user_data, length);
-   io_mapping_unmap_atomic(vaddr);
-   if (unwritten) {
-   vaddr = io_mapping_map_wc(mapping, base, PAGE_SIZE);
-   unwritten = copy_from_user((void __force *)vaddr + offset,
-  user_data, length);
-   io_mapping_unmap(vaddr);
-   }
-
-   return unwritten;
+   vaddr = io_mapping_map_local_wc(mapping, base);
+   if (copy_from_user((void __force *)vaddr + offset, user_data, length))
+   fail = true;
+   io_mapping_unmap_local(vaddr);
+   return fail;
 }
 
 /**
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -58,12 +58,12 @@ static void trash_stolen(struct drm_i915
 
ggtt->vm.insert_page(>vm, dma, slot, I915_CACHE_NONE, 0);
 
-   s = io_mapping_map_atomic_wc(>iomap, slot);
+   s = io_mapping_map_local_wc(>iomap, slot);
for (x = 0; x < PAGE_SIZE / sizeof(u32); x++) {
prng = next_pseudo_random32(prng);
iowrite32(prng, [x]);
}
-

[Intel-gfx] [patch 7/7] io-mapping: Remove io_mapping_map_atomic_wc()

2021-03-04 Thread Thomas Gleixner
From: Thomas Gleixner 

No more users. Get rid of it and remove the traces in documentation.

Signed-off-by: Thomas Gleixner 
Cc: Andrew Morton 
Cc: linux...@kvack.org
---
 Documentation/driver-api/io-mapping.rst |   22 +---
 include/linux/io-mapping.h  |   42 +---
 2 files changed, 9 insertions(+), 55 deletions(-)

--- a/Documentation/driver-api/io-mapping.rst
+++ b/Documentation/driver-api/io-mapping.rst
@@ -21,19 +21,15 @@ mappable, while 'size' indicates how lar
 enable. Both are in bytes.
 
 This _wc variant provides a mapping which may only be used with
-io_mapping_map_atomic_wc(), io_mapping_map_local_wc() or
-io_mapping_map_wc().
+io_mapping_map_local_wc() or io_mapping_map_wc().
 
 With this mapping object, individual pages can be mapped either temporarily
 or long term, depending on the requirements. Of course, temporary maps are
-more efficient. They come in two flavours::
+more efficient.
 
void *io_mapping_map_local_wc(struct io_mapping *mapping,
  unsigned long offset)
 
-   void *io_mapping_map_atomic_wc(struct io_mapping *mapping,
-  unsigned long offset)
-
 'offset' is the offset within the defined mapping region.  Accessing
 addresses beyond the region specified in the creation function yields
 undefined results. Using an offset which is not page aligned yields an
@@ -50,9 +46,6 @@ io_mapping_map_local_wc() has a side eff
 migration to make the mapping code work. No caller can rely on this side
 effect.
 
-io_mapping_map_atomic_wc() has the side effect of disabling preemption and
-pagefaults. Don't use in new code. Use io_mapping_map_local_wc() instead.
-
 Nested mappings need to be undone in reverse order because the mapping
 code uses a stack for keeping track of them::
 
@@ -65,11 +58,10 @@ Nested mappings need to be undone in rev
 The mappings are released with::
 
void io_mapping_unmap_local(void *vaddr)
-   void io_mapping_unmap_atomic(void *vaddr)
 
-'vaddr' must be the value returned by the last io_mapping_map_local_wc() or
-io_mapping_map_atomic_wc() call. This unmaps the specified mapping and
-undoes the side effects of the mapping functions.
+'vaddr' must be the value returned by the last io_mapping_map_local_wc()
+call. This unmaps the specified mapping and undoes eventual side effects of
+the mapping function.
 
 If you need to sleep while holding a mapping, you can use the regular
 variant, although this may be significantly slower::
@@ -77,8 +69,8 @@ If you need to sleep while holding a map
void *io_mapping_map_wc(struct io_mapping *mapping,
unsigned long offset)
 
-This works like io_mapping_map_atomic/local_wc() except it has no side
-effects and the pointer is globaly visible.
+This works like io_mapping_map_local_wc() except it has no side effects and
+the pointer is globaly visible.
 
 The mappings are released with::
 
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -60,28 +60,7 @@ io_mapping_fini(struct io_mapping *mappi
iomap_free(mapping->base, mapping->size);
 }
 
-/* Atomic map/unmap */
-static inline void __iomem *
-io_mapping_map_atomic_wc(struct io_mapping *mapping,
-unsigned long offset)
-{
-   resource_size_t phys_addr;
-
-   BUG_ON(offset >= mapping->size);
-   phys_addr = mapping->base + offset;
-   preempt_disable();
-   pagefault_disable();
-   return __iomap_local_pfn_prot(PHYS_PFN(phys_addr), mapping->prot);
-}
-
-static inline void
-io_mapping_unmap_atomic(void __iomem *vaddr)
-{
-   kunmap_local_indexed((void __force *)vaddr);
-   pagefault_enable();
-   preempt_enable();
-}
-
+/* Temporary mappings which are only valid in the current context */
 static inline void __iomem *
 io_mapping_map_local_wc(struct io_mapping *mapping, unsigned long offset)
 {
@@ -163,24 +142,7 @@ io_mapping_unmap(void __iomem *vaddr)
 {
 }
 
-/* Atomic map/unmap */
-static inline void __iomem *
-io_mapping_map_atomic_wc(struct io_mapping *mapping,
-unsigned long offset)
-{
-   preempt_disable();
-   pagefault_disable();
-   return io_mapping_map_wc(mapping, offset, PAGE_SIZE);
-}
-
-static inline void
-io_mapping_unmap_atomic(void __iomem *vaddr)
-{
-   io_mapping_unmap(vaddr);
-   pagefault_enable();
-   preempt_enable();
-}
-
+/* Temporary mappings which are only valid in the current context */
 static inline void __iomem *
 io_mapping_map_local_wc(struct io_mapping *mapping, unsigned long offset)
 {

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [patch 2/7] drm/vmgfx: Replace kmap_atomic()

2021-03-04 Thread Thomas Gleixner
From: Thomas Gleixner 

There is no reason to disable pagefaults and preemption as a side effect of
kmap_atomic_prot().

Use kmap_local_page_prot() instead and document the reasoning for the
mapping usage with the given pgprot.

Remove the NULL pointer check for the map. These functions return a valid
address for valid pages and the return was bogus anyway as it would have
left preemption and pagefaults disabled.

Signed-off-by: Thomas Gleixner 
Cc: VMware Graphics 
Cc: Roland Scheidegger 
Cc: Zack Rusin 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-de...@lists.freedesktop.org
---
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c |   30 --
 1 file changed, 12 insertions(+), 18 deletions(-)

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -375,12 +375,12 @@ static int vmw_bo_cpu_blit_line(struct v
copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
 
if (unmap_src) {
-   kunmap_atomic(d->src_addr);
+   kunmap_local(d->src_addr);
d->src_addr = NULL;
}
 
if (unmap_dst) {
-   kunmap_atomic(d->dst_addr);
+   kunmap_local(d->dst_addr);
d->dst_addr = NULL;
}
 
@@ -388,12 +388,8 @@ static int vmw_bo_cpu_blit_line(struct v
if (WARN_ON_ONCE(dst_page >= d->dst_num_pages))
return -EINVAL;
 
-   d->dst_addr =
-   kmap_atomic_prot(d->dst_pages[dst_page],
-d->dst_prot);
-   if (!d->dst_addr)
-   return -ENOMEM;
-
+   d->dst_addr = 
kmap_local_page_prot(d->dst_pages[dst_page],
+  d->dst_prot);
d->mapped_dst = dst_page;
}
 
@@ -401,12 +397,8 @@ static int vmw_bo_cpu_blit_line(struct v
if (WARN_ON_ONCE(src_page >= d->src_num_pages))
return -EINVAL;
 
-   d->src_addr =
-   kmap_atomic_prot(d->src_pages[src_page],
-d->src_prot);
-   if (!d->src_addr)
-   return -ENOMEM;
-
+   d->src_addr = 
kmap_local_page_prot(d->src_pages[src_page],
+  d->src_prot);
d->mapped_src = src_page;
}
diff->do_cpy(diff, d->dst_addr + dst_page_offset,
@@ -436,8 +428,10 @@ static int vmw_bo_cpu_blit_line(struct v
  *
  * Performs a CPU blit from one buffer object to another avoiding a full
  * bo vmap which may exhaust- or fragment vmalloc space.
- * On supported architectures (x86), we're using kmap_atomic which avoids
- * cross-processor TLB- and cache flushes and may, on non-HIGHMEM systems
+ *
+ * On supported architectures (x86), we're using kmap_local_prot() which
+ * avoids cross-processor TLB- and cache flushes. kmap_local_prot() will
+ * either map a highmem page with the proper pgprot on HIGHMEM=y systems or
  * reference already set-up mappings.
  *
  * Neither of the buffer objects may be placed in PCI memory
@@ -500,9 +494,9 @@ int vmw_bo_cpu_blit(struct ttm_buffer_ob
}
 out:
if (d.src_addr)
-   kunmap_atomic(d.src_addr);
+   kunmap_local(d.src_addr);
if (d.dst_addr)
-   kunmap_atomic(d.dst_addr);
+   kunmap_local(d.dst_addr);
 
return ret;
 }


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy

2021-03-04 Thread Lionel Landwerlin

On 03/03/2021 23:28, Umesh Nerlige Ramappa wrote:

Perf measurements rely on CPU and engine timestamps to correlate
events of interest across these time domains. Current mechanisms get
these timestamps separately and the calculated delta between these
timestamps lack enough accuracy.

To improve the accuracy of these time measurements to within a few us,
add a query that returns the engine and cpu timestamps captured as
close to each other as possible.

v2: (Tvrtko)
- document clock reference used
- return cpu timestamp always
- capture cpu time just before lower dword of cs timestamp

v3: (Chris)
- use uncore-rpm
- use __query_cs_timestamp helper

v4: (Lionel)
- Kernel perf subsytem allows users to specify the clock id to be used
   in perf_event_open. This clock id is used by the perf subsystem to
   return the appropriate cpu timestamp in perf events. Similarly, let
   the user pass the clockid to this query so that cpu timestamp
   corresponds to the clock id requested.

v5: (Tvrtko)
- Use normal ktime accessors instead of fast versions
- Add more uApi documentation

v6: (Lionel)
- Move switch out of spinlock

v7: (Chris)
- cs_timestamp is a misnomer, use cs_cycles instead
- return the cs cycle frequency as well in the query

v8:
- Add platform and engine specific checks

v9: (Lionel)
- Return 2 cpu timestamps in the query - captured before and after the
   register read

Signed-off-by: Umesh Nerlige Ramappa
---
  drivers/gpu/drm/i915/i915_query.c | 144 ++


FYI, the MR for Mesa : 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9407



-Lionel

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9

2021-03-04 Thread Chris Wilson
Quoting Chris Wilson (2021-03-04 09:19:24)
> Quoting Tvrtko Ursulin (2021-03-04 09:12:26)
> > 
> > On 02/03/2021 06:27, Cooper Chiou wrote:
> > > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to
> > > resolve VP8 hardware encoding system hang up on GT1 sku for
> > > ChromiumOS projects
> > > 
> > > Slice specific MMIO read inaccurate so MGSR needs to be programmed
> > > appropriately to get correct reads from these slicet-related MMIOs.
> > > 
> > > It dictates that before any MMIO read into Slice/Subslice specific
> > > registers, MCR packet control register(0xFDC) needs to be programmed
> > > to point to any enabled slice/subslice pair, especially GT1 fused sku
> > > since this issue can be reproduced on VP8 hardware encoding via ffmpeg
> > > on ChromiumOS devices.
> > > When exit PC7, MGSR will reset so that we have to skip fused subslice ID.
> > > 
> > > Reference: HSD#1508045018,1405586840, BSID#0575
> > > 
> > > Cc: Ville Syrjälä 
> > > Cc: Rodrigo Vivi 
> > > Cc: Jani Nikula 
> > > Cc: Chris Wilson 
> > > Cc: Tvrtko Ursulin 
> > > Cc: William Tseng 
> > > Cc: Lee Shawn C 
> > > 
> > > Signed-off-by: Cooper Chiou 
> > > ---
> > >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +
> > >   1 file changed, 38 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
> > > b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > > index 3b4a7da60f0b..4ad598a727a6 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private 
> > > *i915, struct i915_wa_list *wal)
> > >   wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
> > >   }
> > >   
> > > +static void
> > > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
> > > +{
> > > + const struct sseu_dev_info *sseu = >gt.info.sseu;
> > > + unsigned int slice, subslice;
> > > + u32 mcr, mcr_mask;
> > > +
> > > + GEM_BUG_ON(INTEL_GEN(i915) < 9);
> > > +
> > > + /*
> > > +  * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml
> > > +  * Before any MMIO read into slice/subslice specific registers, MCR
> > > +  * packet control register needs to be programmed to point to any
> > > +  * enabled s/ss pair. Otherwise, incorrect values will be returned.
> > > +  * This means each subsequent MMIO read will be forwarded to an
> > > +  * specific s/ss combination, but this is OK since these registers
> > > +  * are consistent across s/ss in almost all cases. In the rare
> > > +  * occasions, such as INSTDONE, where this value is dependent
> > > +  * on s/ss combo, the read should be done with read_subslice_reg.
> > > +  */
> > > + slice = fls(sseu->slice_mask) - 1;
> > > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
> > > + subslice = fls(intel_sseu_get_subslices(sseu, slice));
> > > + GEM_BUG_ON(!subslice);
> > > + subslice--;
> > > +
> > > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
> > > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
> > > +
> > > + drm_dbg(>drm, "MCR slice:%d/subslice:%d = %x\n", slice, 
> > > subslice, mcr);
> > > +
> > > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> > > +}
> > 
> > Have you considered reusing existing wa_init_mcr? Just needs the 
> > top-level assert changed and otherwise it looks it would do the right 
> > thing for Gen9. Advantage being a smaller patch and less code to carry.
> 
> That was the first patch, and fails for the same reason. The problem
> would appear to be in the mcr_mask for gt3.

For the record, it appears to be an issue with fls vs ffs. Switching to
ffs also removes the warnings for workaround failures on ehl/jsl.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.IGT: failure for HDCP 2.2 MST fixes (rev2)

2021-03-04 Thread Patchwork
== Series Details ==

Series: HDCP 2.2 MST fixes (rev2)
URL   : https://patchwork.freedesktop.org/series/87475/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_9827_full -> Patchwork_19755_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_19755_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_19755_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_19755_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_mmap_gtt@cpuset-big-copy:
- shard-iclb: [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/shard-iclb6/igt@gem_mmap_...@cpuset-big-copy.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-iclb5/igt@gem_mmap_...@cpuset-big-copy.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
- shard-iclb: NOTRUN -> [FAIL][3]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-iclb6/igt@kms_cursor_leg...@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend:
- shard-hsw:  NOTRUN -> [INCOMPLETE][4]
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-hsw2/igt@kms_vbl...@pipe-b-ts-continuation-dpms-suspend.html

  
 Warnings 

  * igt@kms_content_protection@srm:
- shard-iclb: [SKIP][5] ([fdo#109300] / [fdo#111066]) -> [FAIL][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/shard-iclb8/igt@kms_content_protect...@srm.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-iclb1/igt@kms_content_protect...@srm.html

  
Known issues


  Here are the changes found in Patchwork_19755_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][7] ([i915#3002])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-snb2/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@smoketest:
- shard-snb:  NOTRUN -> [SKIP][8] ([fdo#109271] / [i915#1099]) +3 
similar issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-snb2/igt@gem_ctx_persiste...@smoketest.html

  * igt@gem_eio@unwedge-stress:
- shard-skl:  NOTRUN -> [TIMEOUT][9] ([i915#2771])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-skl2/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-kbl:  [PASS][10] -> [FAIL][11] ([i915#2846])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/shard-kbl1/igt@gem_exec_f...@basic-deadline.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-kbl6/igt@gem_exec_f...@basic-deadline.html
- shard-glk:  [PASS][12] -> [FAIL][13] ([i915#2846])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/shard-glk1/igt@gem_exec_f...@basic-deadline.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-glk5/igt@gem_exec_f...@basic-deadline.html
- shard-apl:  NOTRUN -> [FAIL][14] ([i915#2846])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-apl2/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][15] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-iclb4/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace@rcs0:
- shard-kbl:  [PASS][16] -> [FAIL][17] ([i915#2842])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/shard-kbl6/igt@gem_exec_fair@basic-p...@rcs0.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-kbl7/igt@gem_exec_fair@basic-p...@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][18] -> [FAIL][19] ([i915#2842])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/shard-glk7/igt@gem_exec_fair@basic-throt...@rcs0.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-glk1/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_reloc@basic-many-active@rcs0:
- shard-snb:  NOTRUN -> [FAIL][20] ([i915#2389]) +2 similar issues
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/shard-snb7/igt@gem_exec_reloc@basic-many-act...@rcs0.html

  * igt@gem_exec_reloc@basic-wide-active@bcs0:
- shard-apl:  NOTRUN -> [FAIL][21] ([i915#2389]) +3 similar issues
   [21]: 

Re: [Intel-gfx] [PATCH 3/6] drm/i915: Use pipes instead crtc indices in PLL state tracking

2021-03-04 Thread Kahola, Mika
> -Original Message-
> From: Intel-gfx  On Behalf Of Ville
> Syrjala
> Sent: Wednesday, February 24, 2021 4:42 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 3/6] drm/i915: Use pipes instead crtc indices in 
> PLL
> state tracking
> 
> From: Ville Syrjälä 
> 
> All the other places we have use pipes instead of crtc indices when tracking
> resource usage. Life is easier when we do it the same way always, so switch
> the dpll mgr to using pipes as well. Looks like it was actually mixing these 
> up
> in some cases so it would not even have worked correctly except when the
> device has a contiguous set of pipes starting from pipe A.
> Granted, that is the typical case but supposedly it may not always hold on
> modern hw.
> 
> Signed-off-by: Ville Syrjälä 

Reviewed-by: Mika Kahola 

> ---
>  drivers/gpu/drm/i915/display/intel_display.c  | 40 
> .../drm/i915/display/intel_display_debugfs.c  |  4 +-
> drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 48 ++-
> drivers/gpu/drm/i915/display/intel_dpll_mgr.h |  8 ++--
>  4 files changed, 51 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> b/drivers/gpu/drm/i915/display/intel_display.c
> index b34620545d3b..958c2a796bae 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -9653,7 +9653,7 @@ verify_single_dpll_state(struct drm_i915_private
> *dev_priv,
>struct intel_crtc_state *new_crtc_state)  {
>   struct intel_dpll_hw_state dpll_hw_state;
> - unsigned int crtc_mask;
> + u8 pipe_mask;
>   bool active;
> 
>   memset(_hw_state, 0, sizeof(dpll_hw_state)); @@ -9666,34
> +9666,34 @@ verify_single_dpll_state(struct drm_i915_private *dev_priv,
>   I915_STATE_WARN(!pll->on && pll->active_mask,
>"pll in active use but not on in sw tracking\n");
>   I915_STATE_WARN(pll->on && !pll->active_mask,
> -  "pll is on but not used by any active crtc\n");
> +  "pll is on but not used by any active pipe\n");
>   I915_STATE_WARN(pll->on != active,
>"pll on state mismatch (expected %i, found %i)\n",
>pll->on, active);
>   }
> 
>   if (!crtc) {
> - I915_STATE_WARN(pll->active_mask & ~pll->state.crtc_mask,
> - "more active pll users than references: %x vs
> %x\n",
> - pll->active_mask, pll->state.crtc_mask);
> + I915_STATE_WARN(pll->active_mask & ~pll-
> >state.pipe_mask,
> + "more active pll users than references: 0x%x
> vs 0x%x\n",
> + pll->active_mask, pll->state.pipe_mask);
> 
>   return;
>   }
> 
> - crtc_mask = drm_crtc_mask(>base);
> + pipe_mask = BIT(crtc->pipe);
> 
>   if (new_crtc_state->hw.active)
> - I915_STATE_WARN(!(pll->active_mask & crtc_mask),
> - "pll active mismatch (expected pipe %c in
> active mask 0x%02x)\n",
> + I915_STATE_WARN(!(pll->active_mask & pipe_mask),
> + "pll active mismatch (expected pipe %c in
> active mask 0x%x)\n",
>   pipe_name(crtc->pipe), pll->active_mask);
>   else
> - I915_STATE_WARN(pll->active_mask & crtc_mask,
> - "pll active mismatch (didn't expect pipe %c in
> active mask 0x%02x)\n",
> + I915_STATE_WARN(pll->active_mask & pipe_mask,
> + "pll active mismatch (didn't expect pipe %c in
> active mask
> +0x%x)\n",
>   pipe_name(crtc->pipe), pll->active_mask);
> 
> - I915_STATE_WARN(!(pll->state.crtc_mask & crtc_mask),
> - "pll enabled crtcs mismatch (expected 0x%x in
> 0x%02x)\n",
> - crtc_mask, pll->state.crtc_mask);
> + I915_STATE_WARN(!(pll->state.pipe_mask & pipe_mask),
> + "pll enabled crtcs mismatch (expected 0x%x in
> 0x%x)\n",
> + pipe_mask, pll->state.pipe_mask);
> 
>   I915_STATE_WARN(pll->on && memcmp(>state.hw_state,
> _hw_state,
> @@ -9713,15 +9713,15 @@ verify_shared_dpll_state(struct intel_crtc *crtc,
> 
>   if (old_crtc_state->shared_dpll &&
>   old_crtc_state->shared_dpll != new_crtc_state->shared_dpll) {
> - unsigned int crtc_mask = drm_crtc_mask(>base);
> + u8 pipe_mask = BIT(crtc->pipe);
>   struct intel_shared_dpll *pll = old_crtc_state->shared_dpll;
> 
> - I915_STATE_WARN(pll->active_mask & crtc_mask,
> - "pll active mismatch (didn't expect pipe %c in
> active mask)\n",
> - pipe_name(crtc->pipe));
> - I915_STATE_WARN(pll->state.crtc_mask & 

Re: [Intel-gfx] [PATCH 2/6] drm/i915: Do intel_dpll_readout_hw_state() after encoder readout

2021-03-04 Thread Kahola, Mika
> -Original Message-
> From: Intel-gfx  On Behalf Of Ville
> Syrjala
> Sent: Wednesday, February 24, 2021 4:42 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 2/6] drm/i915: Do intel_dpll_readout_hw_state()
> after encoder readout
> 
> From: Ville Syrjälä 
> 
> The clock readout for DDI encoders needs to moved into the encoders.
> To that end intel_dpll_readout_hw_state() needs to happen after the
> encoder readout as otherwise it can't correctly populate the PLL
> crtc_mask/active_mask bitmasks.
> 
> Signed-off-by: Ville Syrjälä 

Reviewed-by: Mika Kahola 

> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> b/drivers/gpu/drm/i915/display/intel_display.c
> index d0da88751c72..b34620545d3b 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13444,8 +13444,6 @@ static void
> intel_modeset_readout_hw_state(struct drm_device *dev)
> 
>   readout_plane_state(dev_priv);
> 
> - intel_dpll_readout_hw_state(dev_priv);
> -
>   for_each_intel_encoder(dev, encoder) {
>   pipe = 0;
> 
> @@ -13480,6 +13478,8 @@ static void
> intel_modeset_readout_hw_state(struct drm_device *dev)
>   pipe_name(pipe));
>   }
> 
> + intel_dpll_readout_hw_state(dev_priv);
> +
>   drm_connector_list_iter_begin(dev, _iter);
>   for_each_intel_connector_iter(connector, _iter) {
>   if (connector->get_hw_state(connector)) {
> --
> 2.26.2
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] drm/i915: Call primary encoder's .get_config() from MST .get_config()

2021-03-04 Thread Kahola, Mika
> -Original Message-
> From: Intel-gfx  On Behalf Of Ville
> Syrjala
> Sent: Wednesday, February 24, 2021 4:42 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 1/6] drm/i915: Call primary encoder's
> .get_config() from MST .get_config()
> 
> From: Ville Syrjälä 
> 
> Stop assuming intel_ddi_get_config() is all we need from the primary
> encoder, and instead call it via the .get_config() vfunc. This will allow
> customized .get_config() for the primary, which I plan to use to handle the
> differences in the clock readout between various platforms.
> 
> Signed-off-by: Ville Syrjälä 

Reviewed-by: Mika Kahola 

> ---
>  drivers/gpu/drm/i915/display/intel_dp_mst.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_mst.c
> b/drivers/gpu/drm/i915/display/intel_dp_mst.c
> index 8e316146b6d1..906860ad8eb8 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_mst.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c
> @@ -591,7 +591,7 @@ static void intel_dp_mst_enc_get_config(struct
> intel_encoder *encoder,
>   struct intel_dp_mst_encoder *intel_mst = enc_to_mst(encoder);
>   struct intel_digital_port *dig_port = intel_mst->primary;
> 
> - intel_ddi_get_config(_port->base, pipe_config);
> + dig_port->base.get_config(_port->base, pipe_config);
>  }
> 
>  static bool intel_dp_mst_initial_fastset_check(struct intel_encoder
> *encoder,
> --
> 2.26.2
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy

2021-03-04 Thread Lionel Landwerlin

On 04/03/2021 11:54, Chris Wilson wrote:

Actually if we want the best accuracy we can just deal with the lower dword.

Accuracy of what? The lower dword read perhaps, or the accuracy of the
sample point for the combined reads for the timestamp, which is closer
to an external observer (cpu_clock() implies reference to an external
observer).

The two clock samples are not even necessarily closely related due to the
nmi adjustments. If you wanted an unadjusted elapsed time for the read
you can use local_clock() then return the chosen cpu_clock() before plus
the elapsed delta from around the read as the estimated error.

cpu_ts[1] = local_clock();
cpu_ts[0] = cpu_clock();
lower = intel_uncore_read_fw(uncore, lower_reg);
cpu_ts[1] = local_clock() - cpu_ts[1];
-Chris

Thanks,


I meant the accuracy of having 2 samples GPU/CPU as close as possible.

Avoiding to account another register read in there is nice.


My testing was also mostly done with CLOCK_MONOTONIC_RAW which doesn't
seem to be adjusted like CLOCK_MONOTONIC so maybe that why I didn't see
the issue.

_RAW is still adjusted for skews, just not coupled into the ntp feedback.
That is less obvious than the other clocks, and why it's preferred for
comparing against other HW sources. But two reads of _RAW are only
monotonic, not necessarily on the same time base. local_clock() is
tsc/arat, so counting the CPU cycles between the two reads with the
frequency (at least on x86) held constant (and arat should be frequency
invariant).

If we want much better accuracy, we are supposed to use cyclecounter_t
and the system_device_crosststamp.
-Chris


Thanks for the pointers.

I think people are mostly trying to map what's coming out of OA or 
queries from the various command streamers back to perf/ftrace.


As far I know perf will only let you select a clockid.


So maybe cyclecounter_t is not that useful atm.


-Lionel

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✓ Fi.CI.BAT: success for HDCP 2.2 MST fixes (rev2)

2021-03-04 Thread Patchwork
== Series Details ==

Series: HDCP 2.2 MST fixes (rev2)
URL   : https://patchwork.freedesktop.org/series/87475/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_9827 -> Patchwork_19755


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/index.html

Known issues


  Here are the changes found in Patchwork_19755 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@semaphore:
- fi-bdw-5557u:   NOTRUN -> [SKIP][1] ([fdo#109271]) +26 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@core_hotunplug@unbind-rebind:
- fi-bdw-5557u:   NOTRUN -> [WARN][2] ([i915#2283])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html

  * igt@fbdev@read:
- fi-tgl-y:   [PASS][3] -> [DMESG-WARN][4] ([i915#402]) +1 similar 
issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/fi-tgl-y/igt@fb...@read.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/fi-tgl-y/igt@fb...@read.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bdw-5557u:   NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html

  
 Possible fixes 

  * igt@gem_basic@create-close:
- fi-tgl-y:   [DMESG-WARN][6] ([i915#402]) -> [PASS][7] +1 similar 
issue
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/fi-tgl-y/igt@gem_ba...@create-close.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/fi-tgl-y/igt@gem_ba...@create-close.html

  * igt@gem_exec_gttfill@basic:
- fi-kbl-8809g:   [TIMEOUT][8] ([i915#3145]) -> [PASS][9] +1 similar 
issue
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/fi-kbl-8809g/igt@gem_exec_gttf...@basic.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/fi-kbl-8809g/igt@gem_exec_gttf...@basic.html

  
 Warnings 

  * igt@i915_pm_rpm@module-reload:
- fi-glk-dsi: [DMESG-WARN][10] -> [DMESG-WARN][11] ([i915#1982])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9827/fi-glk-dsi/igt@i915_pm_...@module-reload.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/fi-glk-dsi/igt@i915_pm_...@module-reload.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2283]: https://gitlab.freedesktop.org/drm/intel/issues/2283
  [i915#3145]: https://gitlab.freedesktop.org/drm/intel/issues/3145
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402


Participating hosts (43 -> 39)
--

  Missing(4): fi-ilk-m540 fi-bsw-cyan fi-bdw-samus fi-hsw-4200u 


Build changes
-

  * Linux: CI_DRM_9827 -> Patchwork_19755

  CI-20190529: 20190529
  CI_DRM_9827: 2659a5b229f34506196153fee450224aec2a9b19 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6022: 3c3d08ad629c404ace39256da334e4317b550de6 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_19755: fcecb6aa821cd67081b1de7a90df6e350552a771 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

fcecb6aa821c drm/i915/hdcp: return correct error code
0bba791af341 drm/i915/hdcp: link hdcp2 recovery on link enc stopped
1614758cd74f drm/i915/hdcp: HDCP2.2 MST Link failure recovery

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19755/index.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy

2021-03-04 Thread Chris Wilson
Quoting Lionel Landwerlin (2021-03-04 09:45:47)
> On 04/03/2021 10:58, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2021-03-04 08:28:59)
> >> On 04/03/2021 02:09, Chris Wilson wrote:
> >>> Quoting Umesh Nerlige Ramappa (2021-03-03 21:28:00)
>  Perf measurements rely on CPU and engine timestamps to correlate
>  events of interest across these time domains. Current mechanisms get
>  these timestamps separately and the calculated delta between these
>  timestamps lack enough accuracy.
> 
>  To improve the accuracy of these time measurements to within a few us,
>  add a query that returns the engine and cpu timestamps captured as
>  close to each other as possible.
> 
>  v2: (Tvrtko)
>  - document clock reference used
>  - return cpu timestamp always
>  - capture cpu time just before lower dword of cs timestamp
> 
>  v3: (Chris)
>  - use uncore-rpm
>  - use __query_cs_timestamp helper
> 
>  v4: (Lionel)
>  - Kernel perf subsytem allows users to specify the clock id to be used
>  in perf_event_open. This clock id is used by the perf subsystem to
>  return the appropriate cpu timestamp in perf events. Similarly, let
>  the user pass the clockid to this query so that cpu timestamp
>  corresponds to the clock id requested.
> 
>  v5: (Tvrtko)
>  - Use normal ktime accessors instead of fast versions
>  - Add more uApi documentation
> 
>  v6: (Lionel)
>  - Move switch out of spinlock
> 
>  v7: (Chris)
>  - cs_timestamp is a misnomer, use cs_cycles instead
>  - return the cs cycle frequency as well in the query
> 
>  v8:
>  - Add platform and engine specific checks
> 
>  v9: (Lionel)
>  - Return 2 cpu timestamps in the query - captured before and after the
>  register read
> 
>  Signed-off-by: Umesh Nerlige Ramappa 
>  ---
> drivers/gpu/drm/i915/i915_query.c | 144 ++
> include/uapi/drm/i915_drm.h   |  47 ++
> 2 files changed, 191 insertions(+)
> 
>  diff --git a/drivers/gpu/drm/i915/i915_query.c 
>  b/drivers/gpu/drm/i915/i915_query.c
>  index fed337ad7b68..acca22ee6014 100644
>  --- a/drivers/gpu/drm/i915/i915_query.c
>  +++ b/drivers/gpu/drm/i915/i915_query.c
>  @@ -6,6 +6,8 @@
> 
> #include 
> 
>  +#include "gt/intel_engine_pm.h"
>  +#include "gt/intel_engine_user.h"
> #include "i915_drv.h"
> #include "i915_perf.h"
> #include "i915_query.h"
>  @@ -90,6 +92,147 @@ static int query_topology_info(struct 
>  drm_i915_private *dev_priv,
>    return total_length;
> }
> 
>  +typedef u64 (*__ktime_func_t)(void);
>  +static __ktime_func_t __clock_id_to_func(clockid_t clk_id)
>  +{
>  +   /*
>  +* Use logic same as the perf subsystem to allow user to select 
>  the
>  +* reference clock id to be used for timestamps.
>  +*/
>  +   switch (clk_id) {
>  +   case CLOCK_MONOTONIC:
>  +   return _get_ns;
>  +   case CLOCK_MONOTONIC_RAW:
>  +   return _get_raw_ns;
>  +   case CLOCK_REALTIME:
>  +   return _get_real_ns;
>  +   case CLOCK_BOOTTIME:
>  +   return _get_boottime_ns;
>  +   case CLOCK_TAI:
>  +   return _get_clocktai_ns;
>  +   default:
>  +   return NULL;
>  +   }
>  +}
>  +
>  +static inline int
>  +__read_timestamps(struct intel_uncore *uncore,
>  + i915_reg_t lower_reg,
>  + i915_reg_t upper_reg,
>  + u64 *cs_ts,
>  + u64 *cpu_ts,
>  + __ktime_func_t cpu_clock)
>  +{
>  +   u32 upper, lower, old_upper, loop = 0;
>  +
>  +   upper = intel_uncore_read_fw(uncore, upper_reg);
>  +   do {
>  +   cpu_ts[0] = cpu_clock();
>  +   lower = intel_uncore_read_fw(uncore, lower_reg);
>  +   cpu_ts[1] = cpu_clock();
>  +   old_upper = upper;
>  +   upper = intel_uncore_read_fw(uncore, upper_reg);
> >>> Both register reads comprise the timestamp returned to userspace, so
> >>> presumably you want cpu_ts[] to wrap both.
> >>>
> >>>  do {
> >>>  old_upper = upper;
> >>>
> >>>  cpu_ts[0] = cpu_clock();
> >>>  lower = intel_uncore_read_fw(uncore, lower_reg);
> >>>  upper = intel_uncore_read_fw(uncore, upper_reg);
> >>>  cpu_ts[1] = cpu_clock();
> >>>  } while (upper != old_upper && loop++ < 2);
> >> Actually if we want the best accuracy we can just deal with the lower 
> >> dword.
> > Accuracy of what? The lower dword 

Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy

2021-03-04 Thread Lionel Landwerlin

On 04/03/2021 10:58, Chris Wilson wrote:

Quoting Lionel Landwerlin (2021-03-04 08:28:59)

On 04/03/2021 02:09, Chris Wilson wrote:

Quoting Umesh Nerlige Ramappa (2021-03-03 21:28:00)

Perf measurements rely on CPU and engine timestamps to correlate
events of interest across these time domains. Current mechanisms get
these timestamps separately and the calculated delta between these
timestamps lack enough accuracy.

To improve the accuracy of these time measurements to within a few us,
add a query that returns the engine and cpu timestamps captured as
close to each other as possible.

v2: (Tvrtko)
- document clock reference used
- return cpu timestamp always
- capture cpu time just before lower dword of cs timestamp

v3: (Chris)
- use uncore-rpm
- use __query_cs_timestamp helper

v4: (Lionel)
- Kernel perf subsytem allows users to specify the clock id to be used
in perf_event_open. This clock id is used by the perf subsystem to
return the appropriate cpu timestamp in perf events. Similarly, let
the user pass the clockid to this query so that cpu timestamp
corresponds to the clock id requested.

v5: (Tvrtko)
- Use normal ktime accessors instead of fast versions
- Add more uApi documentation

v6: (Lionel)
- Move switch out of spinlock

v7: (Chris)
- cs_timestamp is a misnomer, use cs_cycles instead
- return the cs cycle frequency as well in the query

v8:
- Add platform and engine specific checks

v9: (Lionel)
- Return 2 cpu timestamps in the query - captured before and after the
register read

Signed-off-by: Umesh Nerlige Ramappa 
---
   drivers/gpu/drm/i915/i915_query.c | 144 ++
   include/uapi/drm/i915_drm.h   |  47 ++
   2 files changed, 191 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index fed337ad7b68..acca22ee6014 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -6,6 +6,8 @@
   
   #include 
   
+#include "gt/intel_engine_pm.h"

+#include "gt/intel_engine_user.h"
   #include "i915_drv.h"
   #include "i915_perf.h"
   #include "i915_query.h"
@@ -90,6 +92,147 @@ static int query_topology_info(struct drm_i915_private 
*dev_priv,
  return total_length;
   }
   
+typedef u64 (*__ktime_func_t)(void);

+static __ktime_func_t __clock_id_to_func(clockid_t clk_id)
+{
+   /*
+* Use logic same as the perf subsystem to allow user to select the
+* reference clock id to be used for timestamps.
+*/
+   switch (clk_id) {
+   case CLOCK_MONOTONIC:
+   return _get_ns;
+   case CLOCK_MONOTONIC_RAW:
+   return _get_raw_ns;
+   case CLOCK_REALTIME:
+   return _get_real_ns;
+   case CLOCK_BOOTTIME:
+   return _get_boottime_ns;
+   case CLOCK_TAI:
+   return _get_clocktai_ns;
+   default:
+   return NULL;
+   }
+}
+
+static inline int
+__read_timestamps(struct intel_uncore *uncore,
+ i915_reg_t lower_reg,
+ i915_reg_t upper_reg,
+ u64 *cs_ts,
+ u64 *cpu_ts,
+ __ktime_func_t cpu_clock)
+{
+   u32 upper, lower, old_upper, loop = 0;
+
+   upper = intel_uncore_read_fw(uncore, upper_reg);
+   do {
+   cpu_ts[0] = cpu_clock();
+   lower = intel_uncore_read_fw(uncore, lower_reg);
+   cpu_ts[1] = cpu_clock();
+   old_upper = upper;
+   upper = intel_uncore_read_fw(uncore, upper_reg);

Both register reads comprise the timestamp returned to userspace, so
presumably you want cpu_ts[] to wrap both.

 do {
 old_upper = upper;

 cpu_ts[0] = cpu_clock();
 lower = intel_uncore_read_fw(uncore, lower_reg);
 upper = intel_uncore_read_fw(uncore, upper_reg);
 cpu_ts[1] = cpu_clock();
 } while (upper != old_upper && loop++ < 2);

Actually if we want the best accuracy we can just deal with the lower dword.

Accuracy of what? The lower dword read perhaps, or the accuracy of the
sample point for the combined reads for the timestamp, which is closer
to an external observer (cpu_clock() implies reference to an external
observer).

The two clock samples are not even necessarily closely related due to the
nmi adjustments. If you wanted an unadjusted elapsed time for the read
you can use local_clock() then return the chosen cpu_clock() before plus
the elapsed delta from around the read as the estimated error.

cpu_ts[1] = local_clock();
cpu_ts[0] = cpu_clock();
lower = intel_uncore_read_fw(uncore, lower_reg);
cpu_ts[1] = local_clock() - cpu_ts[1];
-Chris


Thanks,


I meant the accuracy of having 2 samples GPU/CPU as close as possible.

Avoiding to account another register read in there is nice.


My testing was also mostly done with CLOCK_MONOTONIC_RAW which doesn't 
seem to be adjusted like CLOCK_MONOTONIC so maybe 

Re: [Intel-gfx] [PATCH v3] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9

2021-03-04 Thread Chris Wilson
Quoting Tvrtko Ursulin (2021-03-04 09:12:26)
> 
> On 02/03/2021 06:27, Cooper Chiou wrote:
> > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to
> > resolve VP8 hardware encoding system hang up on GT1 sku for
> > ChromiumOS projects
> > 
> > Slice specific MMIO read inaccurate so MGSR needs to be programmed
> > appropriately to get correct reads from these slicet-related MMIOs.
> > 
> > It dictates that before any MMIO read into Slice/Subslice specific
> > registers, MCR packet control register(0xFDC) needs to be programmed
> > to point to any enabled slice/subslice pair, especially GT1 fused sku
> > since this issue can be reproduced on VP8 hardware encoding via ffmpeg
> > on ChromiumOS devices.
> > When exit PC7, MGSR will reset so that we have to skip fused subslice ID.
> > 
> > Reference: HSD#1508045018,1405586840, BSID#0575
> > 
> > Cc: Ville Syrjälä 
> > Cc: Rodrigo Vivi 
> > Cc: Jani Nikula 
> > Cc: Chris Wilson 
> > Cc: Tvrtko Ursulin 
> > Cc: William Tseng 
> > Cc: Lee Shawn C 
> > 
> > Signed-off-by: Cooper Chiou 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +
> >   1 file changed, 38 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
> > b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 3b4a7da60f0b..4ad598a727a6 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, 
> > struct i915_wa_list *wal)
> >   wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
> >   }
> >   
> > +static void
> > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
> > +{
> > + const struct sseu_dev_info *sseu = >gt.info.sseu;
> > + unsigned int slice, subslice;
> > + u32 mcr, mcr_mask;
> > +
> > + GEM_BUG_ON(INTEL_GEN(i915) < 9);
> > +
> > + /*
> > +  * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml
> > +  * Before any MMIO read into slice/subslice specific registers, MCR
> > +  * packet control register needs to be programmed to point to any
> > +  * enabled s/ss pair. Otherwise, incorrect values will be returned.
> > +  * This means each subsequent MMIO read will be forwarded to an
> > +  * specific s/ss combination, but this is OK since these registers
> > +  * are consistent across s/ss in almost all cases. In the rare
> > +  * occasions, such as INSTDONE, where this value is dependent
> > +  * on s/ss combo, the read should be done with read_subslice_reg.
> > +  */
> > + slice = fls(sseu->slice_mask) - 1;
> > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
> > + subslice = fls(intel_sseu_get_subslices(sseu, slice));
> > + GEM_BUG_ON(!subslice);
> > + subslice--;
> > +
> > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
> > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
> > +
> > + drm_dbg(>drm, "MCR slice:%d/subslice:%d = %x\n", slice, 
> > subslice, mcr);
> > +
> > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> > +}
> 
> Have you considered reusing existing wa_init_mcr? Just needs the 
> top-level assert changed and otherwise it looks it would do the right 
> thing for Gen9. Advantage being a smaller patch and less code to carry.

That was the first patch, and fails for the same reason. The problem
would appear to be in the mcr_mask for gt3.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9

2021-03-04 Thread Tvrtko Ursulin


On 02/03/2021 06:27, Cooper Chiou wrote:

WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to
resolve VP8 hardware encoding system hang up on GT1 sku for
ChromiumOS projects

Slice specific MMIO read inaccurate so MGSR needs to be programmed
appropriately to get correct reads from these slicet-related MMIOs.

It dictates that before any MMIO read into Slice/Subslice specific
registers, MCR packet control register(0xFDC) needs to be programmed
to point to any enabled slice/subslice pair, especially GT1 fused sku
since this issue can be reproduced on VP8 hardware encoding via ffmpeg
on ChromiumOS devices.
When exit PC7, MGSR will reset so that we have to skip fused subslice ID.

Reference: HSD#1508045018,1405586840, BSID#0575

Cc: Ville Syrjälä 
Cc: Rodrigo Vivi 
Cc: Jani Nikula 
Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: William Tseng 
Cc: Lee Shawn C 

Signed-off-by: Cooper Chiou 
---
  drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +
  1 file changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 3b4a7da60f0b..4ad598a727a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
  }
  
+static void

+gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
+{
+   const struct sseu_dev_info *sseu = >gt.info.sseu;
+   unsigned int slice, subslice;
+   u32 mcr, mcr_mask;
+
+   GEM_BUG_ON(INTEL_GEN(i915) < 9);
+
+   /*
+* WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml
+* Before any MMIO read into slice/subslice specific registers, MCR
+* packet control register needs to be programmed to point to any
+* enabled s/ss pair. Otherwise, incorrect values will be returned.
+* This means each subsequent MMIO read will be forwarded to an
+* specific s/ss combination, but this is OK since these registers
+* are consistent across s/ss in almost all cases. In the rare
+* occasions, such as INSTDONE, where this value is dependent
+* on s/ss combo, the read should be done with read_subslice_reg.
+*/
+   slice = fls(sseu->slice_mask) - 1;
+   GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
+   subslice = fls(intel_sseu_get_subslices(sseu, slice));
+   GEM_BUG_ON(!subslice);
+   subslice--;
+
+   mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+   mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
+
+   drm_dbg(>drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, 
mcr);
+
+   wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
+}


Have you considered reusing existing wa_init_mcr? Just needs the 
top-level assert changed and otherwise it looks it would do the right 
thing for Gen9. Advantage being a smaller patch and less code to carry.


Regards,

Tvrtko


+
  static void
  gen9_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list 
*wal)
  {
+   /* WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml */
+   if (!IS_COFFEELAKE(i915))
+   gen9_wa_init_mcr(i915, wal);
+
/* WaDisableKillLogic:bxt,skl,kbl */
if (!IS_COFFEELAKE(i915) && !IS_COMETLAKE(i915))
wa_write_or(wal,


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 3/3] drm/i915/hdcp: return correct error code

2021-03-04 Thread Anshuman Gupta
hdcp2_enable_stream_encryption shouldn't get called in case
of any port authentication or encryption error, though
hdcp2_enable_stream_encryption checks for link encryption
before enabling stream encryption and returns error but
this return error code won't be correct in case of any error
due to port authentication and encryption.

Cc: Ramalingam C 
Signed-off-by: Anshuman Gupta 
---
 drivers/gpu/drm/i915/display/intel_hdcp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c 
b/drivers/gpu/drm/i915/display/intel_hdcp.c
index 9a70c164c377..21d6c73784b3 100644
--- a/drivers/gpu/drm/i915/display/intel_hdcp.c
+++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
@@ -1896,7 +1896,8 @@ static int hdcp2_authenticate_and_encrypt(struct 
intel_connector *connector)
}
}
 
-   ret = hdcp2_enable_stream_encryption(connector);
+   if (!ret)
+   ret = hdcp2_enable_stream_encryption(connector);
 
return ret;
 }
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 2/3] drm/i915/hdcp: link hdcp2 recovery on link enc stopped

2021-03-04 Thread Anshuman Gupta
When stream encryption enabling fails due to Link encryption status
has stopped, prepare HDCP2 for recovery by disabling port authentication
and encryption such that it can re-attempt port authentication
and encryption.

Cc: Ramalingam C 
Signed-off-by: Anshuman Gupta 
---
 drivers/gpu/drm/i915/display/intel_hdcp.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c 
b/drivers/gpu/drm/i915/display/intel_hdcp.c
index 08dd6b46749d..9a70c164c377 100644
--- a/drivers/gpu/drm/i915/display/intel_hdcp.c
+++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
@@ -1706,6 +1706,7 @@ static int hdcp2_enable_stream_encryption(struct 
intel_connector *connector)
 {
struct intel_digital_port *dig_port = 
intel_attached_dig_port(connector);
struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
+   struct hdcp_port_data *data = _port->hdcp_port_data;
struct intel_hdcp *hdcp = >hdcp;
enum transcoder cpu_transcoder = hdcp->cpu_transcoder;
enum port port = dig_port->base.port;
@@ -1715,7 +1716,8 @@ static int hdcp2_enable_stream_encryption(struct 
intel_connector *connector)
LINK_ENCRYPTION_STATUS)) {
drm_err(_priv->drm, "[%s:%d] HDCP 2.2 Link is not 
encrypted\n",
connector->base.name, connector->base.base.id);
-   return -EPERM;
+   ret = -EPERM;
+   goto link_recover;
}
 
if (hdcp->shim->stream_2_2_encryption) {
@@ -1729,6 +1731,15 @@ static int hdcp2_enable_stream_encryption(struct 
intel_connector *connector)
transcoder_name(hdcp->stream_transcoder));
}
 
+   return 0;
+
+link_recover:
+   if (hdcp2_deauthenticate_port(connector) < 0)
+   drm_dbg_kms(_priv->drm, "Port deauth failed.\n");
+
+   dig_port->hdcp_auth_status = false;
+   data->k = 0;
+
return ret;
 }
 
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 0/3] HDCP 2.2 MST fixes

2021-03-04 Thread Anshuman Gupta
Misc HDCP 2.2 MST fixes.

Anshuman Gupta (3):
  drm/i915/hdcp: HDCP2.2 MST Link failure recovery
  drm/i915/hdcp: link hdcp2 recovery on link enc stopped
  drm/i915/hdcp: return correct error code

 drivers/gpu/drm/i915/display/intel_hdcp.c | 26 +--
 1 file changed, 20 insertions(+), 6 deletions(-)

-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 1/3] drm/i915/hdcp: HDCP2.2 MST Link failure recovery

2021-03-04 Thread Anshuman Gupta
DP MST Link Check performed only for the connector involved with
HDCP port authentication and encryption, for other connector it
simply returns link check with true and update the uevent.
Therefore in case of HDCP 2.2 link failure, disable HDCP encryption
and de-authenticate the port so next time it can enable port
authentication and encryption.

Cc: Ramalingam C 
Signed-off-by: Anshuman Gupta 
---
 drivers/gpu/drm/i915/display/intel_hdcp.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c 
b/drivers/gpu/drm/i915/display/intel_hdcp.c
index ae1371c36a32..08dd6b46749d 100644
--- a/drivers/gpu/drm/i915/display/intel_hdcp.c
+++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
@@ -1927,7 +1927,8 @@ static int _intel_hdcp2_enable(struct intel_connector 
*connector)
return 0;
 }
 
-static int _intel_hdcp2_disable(struct intel_connector *connector)
+static int
+_intel_hdcp2_disable(struct intel_connector *connector, bool 
hdcp2_link_recovery)
 {
struct intel_digital_port *dig_port = 
intel_attached_dig_port(connector);
struct drm_i915_private *i915 = to_i915(connector->base.dev);
@@ -1948,7 +1949,7 @@ static int _intel_hdcp2_disable(struct intel_connector 
*connector)
drm_dbg_kms(>drm, "HDCP 2.2 transcoder: %s stream 
encryption disabled\n",
transcoder_name(hdcp->stream_transcoder));
 
-   if (dig_port->num_hdcp_streams > 0)
+   if (dig_port->num_hdcp_streams > 0 && !hdcp2_link_recovery)
return 0;
}
 
@@ -1991,6 +1992,7 @@ static int intel_hdcp2_check_link(struct intel_connector 
*connector)
"HDCP2.2 link stopped the encryption, %x\n",
intel_de_read(dev_priv, HDCP2_STATUS(dev_priv, 
cpu_transcoder, port)));
ret = -ENXIO;
+   _intel_hdcp2_disable(connector, true);
intel_hdcp_update_value(connector,
DRM_MODE_CONTENT_PROTECTION_DESIRED,
true);
@@ -2030,7 +2032,7 @@ static int intel_hdcp2_check_link(struct intel_connector 
*connector)
connector->base.name, connector->base.base.id);
}
 
-   ret = _intel_hdcp2_disable(connector);
+   ret = _intel_hdcp2_disable(connector, true);
if (ret) {
drm_err(_priv->drm,
"[%s:%d] Failed to disable hdcp2.2 (%d)\n",
@@ -2340,7 +2342,7 @@ int intel_hdcp_disable(struct intel_connector *connector)
intel_hdcp_update_value(connector,
DRM_MODE_CONTENT_PROTECTION_UNDESIRED, false);
if (hdcp->hdcp2_encrypted)
-   ret = _intel_hdcp2_disable(connector);
+   ret = _intel_hdcp2_disable(connector, false);
else if (hdcp->hdcp_encrypted)
ret = _intel_hdcp_disable(connector);
 
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy

2021-03-04 Thread Chris Wilson
Quoting Lionel Landwerlin (2021-03-04 08:28:59)
> On 04/03/2021 02:09, Chris Wilson wrote:
> > Quoting Umesh Nerlige Ramappa (2021-03-03 21:28:00)
> >> Perf measurements rely on CPU and engine timestamps to correlate
> >> events of interest across these time domains. Current mechanisms get
> >> these timestamps separately and the calculated delta between these
> >> timestamps lack enough accuracy.
> >>
> >> To improve the accuracy of these time measurements to within a few us,
> >> add a query that returns the engine and cpu timestamps captured as
> >> close to each other as possible.
> >>
> >> v2: (Tvrtko)
> >> - document clock reference used
> >> - return cpu timestamp always
> >> - capture cpu time just before lower dword of cs timestamp
> >>
> >> v3: (Chris)
> >> - use uncore-rpm
> >> - use __query_cs_timestamp helper
> >>
> >> v4: (Lionel)
> >> - Kernel perf subsytem allows users to specify the clock id to be used
> >>in perf_event_open. This clock id is used by the perf subsystem to
> >>return the appropriate cpu timestamp in perf events. Similarly, let
> >>the user pass the clockid to this query so that cpu timestamp
> >>corresponds to the clock id requested.
> >>
> >> v5: (Tvrtko)
> >> - Use normal ktime accessors instead of fast versions
> >> - Add more uApi documentation
> >>
> >> v6: (Lionel)
> >> - Move switch out of spinlock
> >>
> >> v7: (Chris)
> >> - cs_timestamp is a misnomer, use cs_cycles instead
> >> - return the cs cycle frequency as well in the query
> >>
> >> v8:
> >> - Add platform and engine specific checks
> >>
> >> v9: (Lionel)
> >> - Return 2 cpu timestamps in the query - captured before and after the
> >>register read
> >>
> >> Signed-off-by: Umesh Nerlige Ramappa 
> >> ---
> >>   drivers/gpu/drm/i915/i915_query.c | 144 ++
> >>   include/uapi/drm/i915_drm.h   |  47 ++
> >>   2 files changed, 191 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_query.c 
> >> b/drivers/gpu/drm/i915/i915_query.c
> >> index fed337ad7b68..acca22ee6014 100644
> >> --- a/drivers/gpu/drm/i915/i915_query.c
> >> +++ b/drivers/gpu/drm/i915/i915_query.c
> >> @@ -6,6 +6,8 @@
> >>   
> >>   #include 
> >>   
> >> +#include "gt/intel_engine_pm.h"
> >> +#include "gt/intel_engine_user.h"
> >>   #include "i915_drv.h"
> >>   #include "i915_perf.h"
> >>   #include "i915_query.h"
> >> @@ -90,6 +92,147 @@ static int query_topology_info(struct drm_i915_private 
> >> *dev_priv,
> >>  return total_length;
> >>   }
> >>   
> >> +typedef u64 (*__ktime_func_t)(void);
> >> +static __ktime_func_t __clock_id_to_func(clockid_t clk_id)
> >> +{
> >> +   /*
> >> +* Use logic same as the perf subsystem to allow user to select the
> >> +* reference clock id to be used for timestamps.
> >> +*/
> >> +   switch (clk_id) {
> >> +   case CLOCK_MONOTONIC:
> >> +   return _get_ns;
> >> +   case CLOCK_MONOTONIC_RAW:
> >> +   return _get_raw_ns;
> >> +   case CLOCK_REALTIME:
> >> +   return _get_real_ns;
> >> +   case CLOCK_BOOTTIME:
> >> +   return _get_boottime_ns;
> >> +   case CLOCK_TAI:
> >> +   return _get_clocktai_ns;
> >> +   default:
> >> +   return NULL;
> >> +   }
> >> +}
> >> +
> >> +static inline int
> >> +__read_timestamps(struct intel_uncore *uncore,
> >> + i915_reg_t lower_reg,
> >> + i915_reg_t upper_reg,
> >> + u64 *cs_ts,
> >> + u64 *cpu_ts,
> >> + __ktime_func_t cpu_clock)
> >> +{
> >> +   u32 upper, lower, old_upper, loop = 0;
> >> +
> >> +   upper = intel_uncore_read_fw(uncore, upper_reg);
> >> +   do {
> >> +   cpu_ts[0] = cpu_clock();
> >> +   lower = intel_uncore_read_fw(uncore, lower_reg);
> >> +   cpu_ts[1] = cpu_clock();
> >> +   old_upper = upper;
> >> +   upper = intel_uncore_read_fw(uncore, upper_reg);
> > Both register reads comprise the timestamp returned to userspace, so
> > presumably you want cpu_ts[] to wrap both.
> >
> > do {
> > old_upper = upper;
> >
> > cpu_ts[0] = cpu_clock();
> > lower = intel_uncore_read_fw(uncore, lower_reg);
> > upper = intel_uncore_read_fw(uncore, upper_reg);
> > cpu_ts[1] = cpu_clock();
> > } while (upper != old_upper && loop++ < 2);
> 
> Actually if we want the best accuracy we can just deal with the lower dword.

Accuracy of what? The lower dword read perhaps, or the accuracy of the
sample point for the combined reads for the timestamp, which is closer
to an external observer (cpu_clock() implies reference to an external
observer).

The two clock samples are not even necessarily closely related due to the
nmi adjustments. If you wanted an unadjusted elapsed time for the read
you can use local_clock() then return the 

Re: [Intel-gfx] [PATCH] drm/atomic: Add the crtc to affected crtc only if uapi.enable = true

2021-03-04 Thread Pekka Paalanen
On Wed, 3 Mar 2021 12:44:33 -0800
"Navare, Manasi"  wrote:

> On Wed, Mar 03, 2021 at 10:47:44AM +0200, Pekka Paalanen wrote:
> > On Tue,  2 Mar 2021 12:41:32 -0800
> > Manasi Navare  wrote:
> >   
> > > In case of a modeset where a mode gets split across mutiple CRTCs
> > > in the driver specific implementation (bigjoiner in i915) we wrongly count
> > > the affected CRTCs based on the drm_crtc_mask and indicate the stolen 
> > > CRTC as
> > > an affected CRTC in atomic_check_only().
> > > This triggers a warning since affected CRTCs doent match requested CRTC.
> > > 
> > > To fix this in such bigjoiner configurations, we should only
> > > increment affected crtcs if that CRTC is enabled in UAPI not
> > > if it is just used internally in the driver to split the mode.  
> > 
> > Hi,
> > 
> > I think that makes sense to me. Stealing CRTCs that are not currently
> > used by the userspace (display server) should be ok, as long as that
> > is completely invisible to userspace: meaning that it does not cause
> > userspace to unexpectedly e.g. receive or miss per-crtc atomic
> > completion events.  
> 
> Yes since we are only doing atomic_check_only() here, the stolen

But the real not-test-only commit will follow if this test-only commit
succeeds, and keeping the guarantees for the real commit are important.

> crtc is completely invisible to the userspace and hence that is 
> indicated by uapi.enable which is not true for this stolen
> crtc. However if allow modeset flag set, then it will do a full
> modeset and indicate the uapi.enable for this stolen crtc as well
> since that cannot be used for other modeset requested by userspace.
> 
> > 
> > Can that also be asserted somehow, or does this already do that?  
> 
> Not clear what you want the assertion for? Could you elaborate

As assertion that when the real atomic commit happens and then
completion events are fired, they match exactly the affected crtcs mask.

I understand this may be off-topic for this particular patch, but since
we are discussing the topic, such checks would be really nice. I'm
curious if such checks already exist.


Thanks,
pq

> 
> Manasi
> 
> > 
> > 
> > Thanks,
> > pq
> >   
> > > Cc: Ville Syrjälä 
> > > Cc: Simon Ser 
> > > Cc: Pekka Paalanen 
> > > Cc: Daniel Stone 
> > > Cc: Daniel Vetter 
> > > Cc: dri-de...@lists.freedesktop.org
> > > Signed-off-by: Manasi Navare 
> > > ---
> > >  drivers/gpu/drm/drm_atomic.c | 6 --
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> > > index 5b4547e0f775..d7acd6bbd97e 100644
> > > --- a/drivers/gpu/drm/drm_atomic.c
> > > +++ b/drivers/gpu/drm/drm_atomic.c
> > > @@ -1358,8 +1358,10 @@ int drm_atomic_check_only(struct drm_atomic_state 
> > > *state)
> > >   }
> > >   }
> > >  
> > > - for_each_new_crtc_in_state(state, crtc, new_crtc_state, i)
> > > - affected_crtc |= drm_crtc_mask(crtc);
> > > + for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
> > > + if (new_crtc_state->enable)
> > > + affected_crtc |= drm_crtc_mask(crtc);
> > > + }
> > >  
> > >   /*
> > >* For commits that allow modesets drivers can add other CRTCs to the  
> >   
> 
> 



pgpUXv5P7xwPu.pgp
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 1/4] drm/i915/mso: add splitter state readout for platforms that support it

2021-03-04 Thread Jani Nikula
On Wed, 03 Mar 2021, "Shankar, Uma"  wrote:
> Looks Good to me.
> Reviewed-by: Uma Shankar 

Thanks for the review, pushed the series.

>> diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h
>> b/drivers/gpu/drm/i915/display/intel_display_types.h
>> index 1a76e1d9de7a..9f1945018712 100644
>> --- a/drivers/gpu/drm/i915/display/intel_display_types.h
>> +++ b/drivers/gpu/drm/i915/display/intel_display_types.h
>> @@ -1166,6 +1166,13 @@ struct intel_crtc_state {
>>  u8 pipeline_full;
>>  u16 flipline, vmin, vmax;
>>  } vrr;
>> +
>> +/* Stream Splitter for eDP MSO */
>> +struct {
>> +bool enable;
>> +u8 link_count;
>> +u8 pixel_overlap;
>> +} splitter;
>>  };

Due to certain amount of urgency, I decided to push the series with this
naming in crtc state. I think it's accurate now, and workable for legacy
DSI as well. However, I'm not hung up on it, and I'm open to renaming
when the time comes to convert legacy DSI to using crtc state for this
stuff.

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy

2021-03-04 Thread Lionel Landwerlin

On 03/03/2021 23:28, Umesh Nerlige Ramappa wrote:

Perf measurements rely on CPU and engine timestamps to correlate
events of interest across these time domains. Current mechanisms get
these timestamps separately and the calculated delta between these
timestamps lack enough accuracy.

To improve the accuracy of these time measurements to within a few us,
add a query that returns the engine and cpu timestamps captured as
close to each other as possible.

v2: (Tvrtko)
- document clock reference used
- return cpu timestamp always
- capture cpu time just before lower dword of cs timestamp

v3: (Chris)
- use uncore-rpm
- use __query_cs_timestamp helper

v4: (Lionel)
- Kernel perf subsytem allows users to specify the clock id to be used
   in perf_event_open. This clock id is used by the perf subsystem to
   return the appropriate cpu timestamp in perf events. Similarly, let
   the user pass the clockid to this query so that cpu timestamp
   corresponds to the clock id requested.

v5: (Tvrtko)
- Use normal ktime accessors instead of fast versions
- Add more uApi documentation

v6: (Lionel)
- Move switch out of spinlock

v7: (Chris)
- cs_timestamp is a misnomer, use cs_cycles instead
- return the cs cycle frequency as well in the query

v8:
- Add platform and engine specific checks

v9: (Lionel)
- Return 2 cpu timestamps in the query - captured before and after the
   register read

Signed-off-by: Umesh Nerlige Ramappa 



Looks good to me, thanks!


Reviewed-by: Lionel Landwerlin 


Let me prepare a Mesa MR to use this with VK_EXT_calibrated_timestamps.


-Lionel



---
  drivers/gpu/drm/i915/i915_query.c | 144 ++
  include/uapi/drm/i915_drm.h   |  47 ++
  2 files changed, 191 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index fed337ad7b68..acca22ee6014 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -6,6 +6,8 @@
  
  #include 
  
+#include "gt/intel_engine_pm.h"

+#include "gt/intel_engine_user.h"
  #include "i915_drv.h"
  #include "i915_perf.h"
  #include "i915_query.h"
@@ -90,6 +92,147 @@ static int query_topology_info(struct drm_i915_private 
*dev_priv,
return total_length;
  }
  
+typedef u64 (*__ktime_func_t)(void);

+static __ktime_func_t __clock_id_to_func(clockid_t clk_id)
+{
+   /*
+* Use logic same as the perf subsystem to allow user to select the
+* reference clock id to be used for timestamps.
+*/
+   switch (clk_id) {
+   case CLOCK_MONOTONIC:
+   return _get_ns;
+   case CLOCK_MONOTONIC_RAW:
+   return _get_raw_ns;
+   case CLOCK_REALTIME:
+   return _get_real_ns;
+   case CLOCK_BOOTTIME:
+   return _get_boottime_ns;
+   case CLOCK_TAI:
+   return _get_clocktai_ns;
+   default:
+   return NULL;
+   }
+}
+
+static inline int
+__read_timestamps(struct intel_uncore *uncore,
+ i915_reg_t lower_reg,
+ i915_reg_t upper_reg,
+ u64 *cs_ts,
+ u64 *cpu_ts,
+ __ktime_func_t cpu_clock)
+{
+   u32 upper, lower, old_upper, loop = 0;
+
+   upper = intel_uncore_read_fw(uncore, upper_reg);
+   do {
+   cpu_ts[0] = cpu_clock();
+   lower = intel_uncore_read_fw(uncore, lower_reg);
+   cpu_ts[1] = cpu_clock();
+   old_upper = upper;
+   upper = intel_uncore_read_fw(uncore, upper_reg);
+   } while (upper != old_upper && loop++ < 2);
+
+   *cs_ts = (u64)upper << 32 | lower;
+
+   return 0;
+}
+
+static int
+__query_cs_cycles(struct intel_engine_cs *engine,
+ u64 *cs_ts, u64 *cpu_ts,
+ __ktime_func_t cpu_clock)
+{
+   struct intel_uncore *uncore = engine->uncore;
+   enum forcewake_domains fw_domains;
+   u32 base = engine->mmio_base;
+   intel_wakeref_t wakeref;
+   int ret;
+
+   fw_domains = intel_uncore_forcewake_for_reg(uncore,
+   RING_TIMESTAMP(base),
+   FW_REG_READ);
+
+   with_intel_runtime_pm(uncore->rpm, wakeref) {
+   spin_lock_irq(>lock);
+   intel_uncore_forcewake_get__locked(uncore, fw_domains);
+
+   ret = __read_timestamps(uncore,
+   RING_TIMESTAMP(base),
+   RING_TIMESTAMP_UDW(base),
+   cs_ts,
+   cpu_ts,
+   cpu_clock);
+
+   intel_uncore_forcewake_put__locked(uncore, fw_domains);
+   spin_unlock_irq(>lock);
+   }
+
+   return ret;
+}
+
+static int
+query_cs_cycles(struct drm_i915_private *i915,
+   struct drm_i915_query_item *query_item)
+{
+   struct 

Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy

2021-03-04 Thread Lionel Landwerlin

On 04/03/2021 02:09, Chris Wilson wrote:

Quoting Umesh Nerlige Ramappa (2021-03-03 21:28:00)

Perf measurements rely on CPU and engine timestamps to correlate
events of interest across these time domains. Current mechanisms get
these timestamps separately and the calculated delta between these
timestamps lack enough accuracy.

To improve the accuracy of these time measurements to within a few us,
add a query that returns the engine and cpu timestamps captured as
close to each other as possible.

v2: (Tvrtko)
- document clock reference used
- return cpu timestamp always
- capture cpu time just before lower dword of cs timestamp

v3: (Chris)
- use uncore-rpm
- use __query_cs_timestamp helper

v4: (Lionel)
- Kernel perf subsytem allows users to specify the clock id to be used
   in perf_event_open. This clock id is used by the perf subsystem to
   return the appropriate cpu timestamp in perf events. Similarly, let
   the user pass the clockid to this query so that cpu timestamp
   corresponds to the clock id requested.

v5: (Tvrtko)
- Use normal ktime accessors instead of fast versions
- Add more uApi documentation

v6: (Lionel)
- Move switch out of spinlock

v7: (Chris)
- cs_timestamp is a misnomer, use cs_cycles instead
- return the cs cycle frequency as well in the query

v8:
- Add platform and engine specific checks

v9: (Lionel)
- Return 2 cpu timestamps in the query - captured before and after the
   register read

Signed-off-by: Umesh Nerlige Ramappa 
---
  drivers/gpu/drm/i915/i915_query.c | 144 ++
  include/uapi/drm/i915_drm.h   |  47 ++
  2 files changed, 191 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index fed337ad7b68..acca22ee6014 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -6,6 +6,8 @@
  
  #include 
  
+#include "gt/intel_engine_pm.h"

+#include "gt/intel_engine_user.h"
  #include "i915_drv.h"
  #include "i915_perf.h"
  #include "i915_query.h"
@@ -90,6 +92,147 @@ static int query_topology_info(struct drm_i915_private 
*dev_priv,
 return total_length;
  }
  
+typedef u64 (*__ktime_func_t)(void);

+static __ktime_func_t __clock_id_to_func(clockid_t clk_id)
+{
+   /*
+* Use logic same as the perf subsystem to allow user to select the
+* reference clock id to be used for timestamps.
+*/
+   switch (clk_id) {
+   case CLOCK_MONOTONIC:
+   return _get_ns;
+   case CLOCK_MONOTONIC_RAW:
+   return _get_raw_ns;
+   case CLOCK_REALTIME:
+   return _get_real_ns;
+   case CLOCK_BOOTTIME:
+   return _get_boottime_ns;
+   case CLOCK_TAI:
+   return _get_clocktai_ns;
+   default:
+   return NULL;
+   }
+}
+
+static inline int
+__read_timestamps(struct intel_uncore *uncore,
+ i915_reg_t lower_reg,
+ i915_reg_t upper_reg,
+ u64 *cs_ts,
+ u64 *cpu_ts,
+ __ktime_func_t cpu_clock)
+{
+   u32 upper, lower, old_upper, loop = 0;
+
+   upper = intel_uncore_read_fw(uncore, upper_reg);
+   do {
+   cpu_ts[0] = cpu_clock();
+   lower = intel_uncore_read_fw(uncore, lower_reg);
+   cpu_ts[1] = cpu_clock();
+   old_upper = upper;
+   upper = intel_uncore_read_fw(uncore, upper_reg);

Both register reads comprise the timestamp returned to userspace, so
presumably you want cpu_ts[] to wrap both.

do {
old_upper = upper;

cpu_ts[0] = cpu_clock();
lower = intel_uncore_read_fw(uncore, lower_reg);
upper = intel_uncore_read_fw(uncore, upper_reg);
cpu_ts[1] = cpu_clock();
} while (upper != old_upper && loop++ < 2);


Actually if we want the best accuracy we can just deal with the lower dword.

We can check the upper one hasn't changed outside of the 2 cpu_clock() 
calls.



-Lionel


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx