date:20211203

Re: [PATCH] drm/i915: Rollback seqno when request creation fails

2021-12-03 Thread kernel test robot

Hi Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip drm-exynos/exynos-drm-next 
drm/drm-next tegra-drm/drm/tegra/for-next v5.16-rc3 next-20211203]
[cannot apply to airlied/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Matthew-Brost/drm-i915-Rollback-seqno-when-request-creation-fails/20211204-020638
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-m021-20211203 
(https://download.01.org/0day-ci/archive/20211204/202112040802.qqhqie5v-...@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/af590220a3160c7a680487eac25eb2dc24baf42d
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Matthew-Brost/drm-i915-Rollback-seqno-when-request-creation-fails/20211204-020638
git checkout af590220a3160c7a680487eac25eb2dc24baf42d
# save the config file to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/gt/intel_timeline.c: In function 
'intel_timeline_rollback_seqno':
>> drivers/gpu/drm/i915/gt/intel_timeline.c:306:2: error: implicit declaration 
>> of function 'timeline_rollback' [-Werror=implicit-function-declaration]
 306 |  timeline_rollback(tl);
 |  ^
   cc1: some warnings being treated as errors


vim +/timeline_rollback +306 drivers/gpu/drm/i915/gt/intel_timeline.c

   303  
   304  void intel_timeline_rollback_seqno(struct intel_timeline *tl)
   305  {
 > 306  timeline_rollback(tl);
   307  }
   308  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

Re: [PATCH 10/11] drm/vc4: hdmi: Support HDMI YUV output

2021-12-03 Thread kernel test robot

Hi Maxime,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm/drm-next]
[also build test WARNING on next-20211203]
[cannot apply to drm-intel/for-linux-next anholt/for-next v5.16-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Maxime-Ripard/drm-vc4-hdmi-Yet-Another-Approach-to-HDMI-YUV-output/20211203-185623
base:   git://anongit.freedesktop.org/drm/drm drm-next
config: powerpc-randconfig-s031-20211203 
(https://download.01.org/0day-ci/archive/20211204/202112040851.zyd2j0lk-...@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 11.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.4-dirty
# 
https://github.com/0day-ci/linux/commit/70ee339decc66c157b6c9c983e8cce172c323218
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Maxime-Ripard/drm-vc4-hdmi-Yet-Another-Approach-to-HDMI-YUV-output/20211203-185623
git checkout 70ee339decc66c157b6c9c983e8cce172c323218
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=powerpc 
SHELL=/bin/bash drivers/gpu/drm/vc4/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/vc4/vc4_hdmi.c:503:37: sparse: sparse: mixing different enum 
>> types:
>> drivers/gpu/drm/vc4/vc4_hdmi.c:503:37: sparse:unsigned int enum 
>> vc4_hdmi_output_format
>> drivers/gpu/drm/vc4/vc4_hdmi.c:503:37: sparse:unsigned int enum 
>> hdmi_colorspace

vim +503 drivers/gpu/drm/vc4/vc4_hdmi.c

   491  
   492  static void vc4_hdmi_avi_infoframe_colorspace(struct hdmi_avi_infoframe 
*frame,
   493enum 
vc4_hdmi_output_format fmt)
   494  {
   495  switch (fmt) {
   496  case VC4_HDMI_OUTPUT_RGB:
   497  fallthrough;
   498  case VC4_HDMI_OUTPUT_YUV420:
   499  fallthrough;
   500  case VC4_HDMI_OUTPUT_YUV422:
   501  fallthrough;
   502  case VC4_HDMI_OUTPUT_YUV444:
 > 503  frame->colorspace = fmt;
   504  break;
   505  
   506  default:
   507  break;
   508  }
   509  }
   510  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

Re: [PATCH v2] drm/dp: Actually read Adjust Request Post Cursor2 register

2021-12-03 Thread Kees Cook

On Fri, Dec 03, 2021 at 04:28:56PM +0100, Thierry Reding wrote:
> On Fri, Dec 03, 2021 at 01:25:17AM -0800, Kees Cook wrote:
> > The link_status array was not large enough to read the Adjust Request
> > Post Cursor2 register. Adjust the size to include it. Found with a
> > -Warray-bounds build:
> > 
> > drivers/gpu/drm/drm_dp_helper.c: In function 
> > 'drm_dp_get_adjust_request_post_cursor':
> > drivers/gpu/drm/drm_dp_helper.c:59:27: error: array subscript 10 is outside 
> > array bounds of 'const u8[6]' {aka 'const unsigned char[6]'} 
> > [-Werror=array-bounds]
> >59 | return link_status[r - DP_LANE0_1_STATUS];
> >   |~~~^~~
> > drivers/gpu/drm/drm_dp_helper.c:147:51: note: while referencing 
> > 'link_status'
> >   147 | u8 drm_dp_get_adjust_request_post_cursor(const u8 
> > link_status[DP_LINK_STATUS_SIZE],
> >   |  
> > ~^~~~
> > 
> > Fixes: 79465e0ffeb9 ("drm/dp: Add helper to get post-cursor adjustments")
> > Signed-off-by: Kees Cook 
> > ---
> > v2: Fix missed array size change in intel_dp_check_mst_status()
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c |  8 
> >  include/drm/drm_dp_helper.h | 10 +-
> >  2 files changed, 13 insertions(+), 5 deletions(-)
> 
> This sounds very familiar and I vaguely recall typing up a patch like
> that a long time ago. But I obviously failed because that never seems
> to have made it upstream.
> 
> Or perhaps I'm misremembering and was thinking about this instead:
> 
>   
> https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/338590/

Oh! Yeah, that's the same thing. Looks like that never made its way
upstream. :(

> 
> Bonus points for adding that comment with background information on why
> we need this.

Thanks! Yeah, I needed to really convince myself everything added up and
made sense, and figured I should try to capture that research. ;)

> Reviewed-by: Thierry Reding 

Thanks!

-Kees

-- 
Kees Cook

Re: [Intel-gfx] [PATCH 4/5] drm/i915/guc: Update to GuC version 69.0.0

2021-12-03 Thread Matthew Brost

On Fri, Dec 03, 2021 at 11:28:00PM +0100, Michal Wajdeczko wrote:
> 
> 
> On 03.12.2021 19:33, john.c.harri...@intel.com wrote:
> > From: John Harrison 
> > 
> > Update to the latest GuC release.
> > 
> > The latest GuC firmware introduces a number of interface changes:
> 
> Why can't we review all these changes in smaller patches and squash them
> in separate CI series *after* collecting all required r-b ?
> 
> Michal
> 

I reviewed this and it seems pretty straight forward to me.

I'm giving a RB, but we can hold up merging if you have an objection.
Likely targeting an early next week merge so please raise any concerns
before then.

With that:
Reviewed-by: Matthew Brost 

> > 
> > GuC may return NO_RESPONSE_RETRY message for requests sent over CTB.
> > Add support for this reply and try resending the request again as a
> > new CTB message.
> > 
> > A KLV (key-length-value) mechanism is now used for passing
> > configuration data such as CTB management.
> > 
> > With the new KLV scheme, the old CTB management actions are no longer
> > used and are removed.
> > 
> > Register capture on hang is now supported by GuC. Full i915 support
> > for this will be added by a later patch. A minimum support of
> > providing capture memory and register lists is required though, so add
> > that in.
> > 
> > The device id of the current platform needs to be provided at init time.
> > 
> > The 'poll CS' w/a (Wa_22012773006) was blanket enabled by previous
> > versions of GuC. It must now be explicitly requested by the KMD. So,
> > add in the code to turn it on when relevant.
> > 
> > The GuC log entry format has changed. This requires adding a new field
> > to the log header structure to mark the wrap point at the end of the
> > buffer (as the buffer size is no longer a multiple of the log entry
> > size).
> > 
> > New CTB notification messages are now sent for some things that were
> > previously only sent via MMIO notifications.
> > 
> > Of these, the crash dump notification was not really being handled by
> > i915. It called the log flush code but that only flushed the regular
> > debug log and then only if relay logging was enabled. So just report
> > an error message instead.
> > 
> > The 'exception' notification was just being ignored completely. So add
> > an error message for that as well.
> > 
> > Note that in either the crash dump or the exception case, the GuC is
> > basically dead. The KMD will detect this via the heartbeat and trigger
> > both an error log (which will include the crash dump as part of the
> > GuC log) and a GT reset. So no other processing is really required.
> > 
> > Signed-off-by: John Harrison 
> > Signed-off-by: Michal Wajdeczko 
> > ---
> >  Documentation/gpu/i915.rst|   1 +
> >  .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  80 +-
> >  drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h |  82 ++
> >  drivers/gpu/drm/i915/gt/uc/intel_guc.c| 126 +---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc.h|   4 +
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  45 +-
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 141 ++
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  37 -
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  31 ++--
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_log.h|   3 +
> >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  18 +++
> >  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  30 ++--
> >  12 files changed, 434 insertions(+), 164 deletions(-)
> >  create mode 100644 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
> > 
> > diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> > index b7d801993bfa..bcaefc952764 100644
> > --- a/Documentation/gpu/i915.rst
> > +++ b/Documentation/gpu/i915.rst
> > @@ -539,6 +539,7 @@ GuC ABI
> >  .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
> >  .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> >  .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> > +.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
> >  
> >  HuC
> >  ---
> > diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h 
> > b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> > index fe5d7d261797..7afdadc7656f 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> > @@ -7,9 +7,9 @@
> >  #define _ABI_GUC_ACTIONS_ABI_H
> >  
> >  /**
> > - * DOC: HOST2GUC_REGISTER_CTB
> > + * DOC: HOST2GUC_SELF_CFG
> >   *
> > - * This message is used as part of the `CTB based communication`_ setup.
> > + * This message is used by Host KMD to setup of the `GuC Self Config 
> > KLVs`_.
> >   *
> >   * This message must be sent as `MMIO HXG Message`_.
> >   *
> > @@ -22,20 +22,18 @@
> >   *  |   
> > +---+--+
> >   *  |   | 27:16 | DATA0 = MBZ

[PATCH] drm/i915: Rollback seqno when request creation fails

2021-12-03 Thread Matthew Brost

gem_ctx_create.basic-files can slam on kernel contexts to the extent
where request creation fails because the ring is full. When this happens
seqno numbers are skipped which can result the below GEM_BUG_ON blowing
in gt/intel_engine_pm.c:__engine_unpark:

GEM_BUG_ON(ce->timeline->seqno !=
   READ_ONCE(*ce->timeline->hwsp_seqno));

Fixup request creation code to roll back seqno when request creation
fails.

v2:
 (CI)
  - Fix build error

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 8 
 drivers/gpu/drm/i915/gt/intel_timeline.h | 1 +
 drivers/gpu/drm/i915/i915_request.c  | 1 +
 3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c 
b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 438bbc7b8147..3785e5605549 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -301,6 +301,14 @@ static u32 timeline_advance(struct intel_timeline *tl)
return tl->seqno += 1 + tl->has_initial_breadcrumb;
 }
 
+void intel_timeline_rollback_seqno(struct intel_timeline *tl)
+{
+   GEM_BUG_ON(!atomic_read(>pin_count));
+   GEM_BUG_ON(tl->seqno & tl->has_initial_breadcrumb);
+
+   tl->seqno -= 1 + tl->has_initial_breadcrumb;
+}
+
 static noinline int
 __intel_timeline_get_seqno(struct intel_timeline *tl,
   u32 *seqno)
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h 
b/drivers/gpu/drm/i915/gt/intel_timeline.h
index 57308c4d664a..a2f2e0ea186f 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.h
@@ -72,6 +72,7 @@ void intel_timeline_enter(struct intel_timeline *tl);
 int intel_timeline_get_seqno(struct intel_timeline *tl,
 struct i915_request *rq,
 u32 *seqno);
+void intel_timeline_rollback_seqno(struct intel_timeline *tl);
 void intel_timeline_exit(struct intel_timeline *tl);
 void intel_timeline_unpin(struct intel_timeline *tl);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index a72c8f0346a0..86f32ee082f7 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -966,6 +966,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 
 err_unwind:
ce->ring->emit = rq->head;
+   intel_timeline_rollback_seqno(tl);
 
/* Make sure we didn't add ourselves to external state before freeing */
GEM_BUG_ON(!list_empty(>sched.signalers_list));
-- 
2.33.1

Re: [PATCH v4] drm/msm/dp: do not initialize phy until plugin interrupt received

2021-12-03 Thread Kuogee Hsieh




On 12/2/2021 6:41 PM, Stephen Boyd wrote:

Quoting Kuogee Hsieh (2021-11-09 13:38:13)

From: Kuogee Hsieh 

Current DP drivers have regulators, clocks, irq and phy are grouped
together within a function and executed not in a symmetric manner.
This increase difficulty of code maintenance and limited code scalability.
This patch divided the driver life cycle of operation into four states,
resume (including booting up), dongle plugin, dongle unplugged and suspend.
Regulators, core clocks and irq are grouped together and enabled at resume
(or booting up) so that the DP controller is armed and ready to receive HPD
plugin interrupts. HPD plugin interrupt is generated when a dongle plugs
into DUT (device under test). Once HPD plugin interrupt is received, DP
controller will initialize phy so that dpcd read/write will function and
following link training can be proceeded successfully. DP phy will be
disabled after main link is teared down at end of unplugged HPD interrupt
handle triggered by dongle unplugged out of DUT. Finally regulators, code
clocks and irq are disabled at corresponding suspension.

Changes in V2:
-- removed unnecessary dp_ctrl NULL check
-- removed unnecessary phy init_count and power_count DRM_DEBUG_DP logs
-- remove flip parameter out of dp_ctrl_irq_enable()
-- add fixes tag

Changes in V3:
-- call dp_display_host_phy_init() instead of dp_ctrl_phy_init() at
 dp_display_host_init() for eDP

Changes in V4:
-- rewording commit text to match this commit changes

Fixes: e91e3065a806 ("drm/msm/dp: Add DP compliance tests on Snapdragon 
Chipsets")

Signed-off-by: Kuogee Hsieh 
---

Can you please resend this based on the msm-next branch[1]? It doesn't
apply now that other patches have been applied.

[1] https://gitlab.freedesktop.org/drm/msm.git msm-next


Stephen, below patches  missed at msm-next branch. Are there merged soon?

1) WIP: drm/msm/dp: Detect the connector type based on reg property

2) WIP: handle no-hpd property in DP driver



  drivers/gpu/drm/msm/dp/dp_ctrl.c| 87 -
  drivers/gpu/drm/msm/dp/dp_ctrl.h|  9 ++--
  drivers/gpu/drm/msm/dp/dp_display.c | 83 ++-
  3 files changed, 105 insertions(+), 74 deletions(-)

Re: [PATCH 4/5] drm/i915/guc: Update to GuC version 69.0.0

2021-12-03 Thread Michal Wajdeczko




On 03.12.2021 19:33, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Update to the latest GuC release.
> 
> The latest GuC firmware introduces a number of interface changes:

Why can't we review all these changes in smaller patches and squash them
in separate CI series *after* collecting all required r-b ?

Michal

> 
> GuC may return NO_RESPONSE_RETRY message for requests sent over CTB.
> Add support for this reply and try resending the request again as a
> new CTB message.
> 
> A KLV (key-length-value) mechanism is now used for passing
> configuration data such as CTB management.
> 
> With the new KLV scheme, the old CTB management actions are no longer
> used and are removed.
> 
> Register capture on hang is now supported by GuC. Full i915 support
> for this will be added by a later patch. A minimum support of
> providing capture memory and register lists is required though, so add
> that in.
> 
> The device id of the current platform needs to be provided at init time.
> 
> The 'poll CS' w/a (Wa_22012773006) was blanket enabled by previous
> versions of GuC. It must now be explicitly requested by the KMD. So,
> add in the code to turn it on when relevant.
> 
> The GuC log entry format has changed. This requires adding a new field
> to the log header structure to mark the wrap point at the end of the
> buffer (as the buffer size is no longer a multiple of the log entry
> size).
> 
> New CTB notification messages are now sent for some things that were
> previously only sent via MMIO notifications.
> 
> Of these, the crash dump notification was not really being handled by
> i915. It called the log flush code but that only flushed the regular
> debug log and then only if relay logging was enabled. So just report
> an error message instead.
> 
> The 'exception' notification was just being ignored completely. So add
> an error message for that as well.
> 
> Note that in either the crash dump or the exception case, the GuC is
> basically dead. The KMD will detect this via the heartbeat and trigger
> both an error log (which will include the crash dump as part of the
> GuC log) and a GT reset. So no other processing is really required.
> 
> Signed-off-by: John Harrison 
> Signed-off-by: Michal Wajdeczko 
> ---
>  Documentation/gpu/i915.rst|   1 +
>  .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  80 +-
>  drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h |  82 ++
>  drivers/gpu/drm/i915/gt/uc/intel_guc.c| 126 +---
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|   4 +
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  45 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 141 ++
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  37 -
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  31 ++--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.h|   3 +
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  18 +++
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  30 ++--
>  12 files changed, 434 insertions(+), 164 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
> 
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index b7d801993bfa..bcaefc952764 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
> @@ -539,6 +539,7 @@ GuC ABI
>  .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
>  .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>  .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> +.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
>  
>  HuC
>  ---
> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h 
> b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> index fe5d7d261797..7afdadc7656f 100644
> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> @@ -7,9 +7,9 @@
>  #define _ABI_GUC_ACTIONS_ABI_H
>  
>  /**
> - * DOC: HOST2GUC_REGISTER_CTB
> + * DOC: HOST2GUC_SELF_CFG
>   *
> - * This message is used as part of the `CTB based communication`_ setup.
> + * This message is used by Host KMD to setup of the `GuC Self Config KLVs`_.
>   *
>   * This message must be sent as `MMIO HXG Message`_.
>   *
> @@ -22,20 +22,18 @@
>   *  |   
> +---+--+
>   *  |   | 27:16 | DATA0 = MBZ
>   |
>   *  |   
> +---+--+
> - *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_REGISTER_CTB` = 0x4505  
>   |
> + *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_SELF_CFG` = 0x0508  
>   |
>   *  
> +---+---+--+
> - *  | 1 | 31:12 | RESERVED = MBZ 
>   |
> + *  | 1 | 31:16 | **KLV_KEY** - KLV key, see `GuC Self Config KLVs`_

[PATCH] drm/msm/dp: Add "qcom,sc7280-dp" to support display port.

2021-12-03 Thread Kuogee Hsieh

Signed-off-by: Kuogee Hsieh 
---
 drivers/gpu/drm/msm/dp/dp_display.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
b/drivers/gpu/drm/msm/dp/dp_display.c
index d44f18b..91582d3 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -146,6 +146,7 @@ static const struct msm_dp_config sc7280_dp_cfg = {
 static const struct of_device_id dp_dt_match[] = {
{ .compatible = "qcom,sc7180-dp", .data = _dp_cfg },
{ .compatible = "qcom,sc7280-edp", .data = _dp_cfg },
+   { .compatible = "qcom,sc7280-dp", .data = _dp_cfg },
{}
 };
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH v2 5/6] Documentation/gpu: Add basic overview of DC pipeline

2021-12-03 Thread Yann Dirson

> De: "Rodrigo Siqueira" 
> Objet: [PATCH v2 5/6] Documentation/gpu: Add basic overview of DC pipeline
> 
> This commit describes how DCN works by providing high-level diagrams
> with an explanation of each component. In particular, it details the
> Global Sync signals.
> 
> Signed-off-by: Rodrigo Siqueira 
> ---
>  .../gpu/amdgpu/display/config_example.svg |  414 ++
>  .../amdgpu/display/dc_pipeline_overview.svg   | 1125
>  +
>  .../gpu/amdgpu/display/dcn-overview.rst   |  168 +++
>  .../gpu/amdgpu/display/global_sync_vblank.svg |  485 +++
>  Documentation/gpu/amdgpu/display/index.rst|   23 +-
>  5 files changed, 2203 insertions(+), 12 deletions(-)
>  create mode 100644
>  Documentation/gpu/amdgpu/display/config_example.svg
>  create mode 100644
>  Documentation/gpu/amdgpu/display/dc_pipeline_overview.svg
>  create mode 100644 Documentation/gpu/amdgpu/display/dcn-overview.rst
>  create mode 100644
>  Documentation/gpu/amdgpu/display/global_sync_vblank.svg
> 
...
> diff --git a/Documentation/gpu/amdgpu/display/dcn-overview.rst
> b/Documentation/gpu/amdgpu/display/dcn-overview.rst
> new file mode 100644
> index ..47e9a70de8ae
> --- /dev/null
> +++ b/Documentation/gpu/amdgpu/display/dcn-overview.rst
> @@ -0,0 +1,168 @@
> +===
> +Display Core Next (DCN)
> +===
> +
> +To equip our readers with the basic knowledge of how AMD Display
> Core Next
> +(DCN) works, we need to start with an overview of the hardware
> pipeline. Below
> +you can see a picture that provides a DCN overview, keep in mind
> that this is a
> +generic diagram, and we have variations per ASIC.
> +
> +.. kernel-figure:: dc_pipeline_overview.svg
> +
> +Based on this diagram, we can pass through each block and briefly
> describe
> +them:

Maybe a note on MMHUBBUB is missing ?

> +
> +* **Display Controller Hub (DCHUB)**: This is the gateway between
> the Scalable
> +  Data Port (SDP) and DCN. This component has multiple features,
> such as memory
> +  arbitration, rotation, and cursor manipulation.
> +
> +* **Display Pipe and Plane (DPP)**: This block provides pre-blend
> pixel
> +  processing such as color space conversion, linearization of pixel
> data, tone
> +  mapping, and gamut mapping.
> +
> +* **Multiple Pipe/Plane Combined (MPC)**: This component performs
> blending of
> +  multiple planes, using global or per-pixel alpha.
> +
> +* **Output Pixel Processing (OPP)**: Process and format pixels to be
> sent to
> +  the display.
> +
> +* **Output Pipe Timing Combiner (OPTC)**: It generates time output
> to combine
> +  streams or divide capabilities. CRC values are generated in this
> block.
> +
> +* **Display Output (DIO)**: Codify the output to the display
> connected to our
> +  GPU.
> +
> +* **Display Writeback (DWB)**: It provides the ability to write the
> output of
> +  the display pipe back to memory as video frames.
> +
> +* **DCN Management Unit (DMU)**: It provides registers with access
> control and
> +  interrupts the controller to the SOC host interrupt unit. This
> block includes
> +  the Display Micro-Controller Unit - version B (DMCUB), which is
> handled via
> +  firmware.
> +
> +* **DCN Clock Generator Block (DCCG)**: It provides the clocks and
> resets
> +  for all of the display controller clock domains.
> +
> +* **Azalia (AZ)**: Audio engine.
> +
> +The above diagram is an architecture generalization of DCN, which
> means that
> +every ASIC has variations around this base model. Notice that the
> display
> +pipeline is connected to the Scalable Data Port (SDP) via DCHUB; you
> can see
> +the SDP as the element from our Data Fabric that feeds the display
> pipe.
> +
> +Always approach the DCN architecture as something flexible that can
> be
> +configured and reconfigured in multiple ways; in other words, each
> block can be
> +setup or ignored accordingly with userspace demands. For example, if
> we
> +want to drive an 8k@60Hz with a DSC enabled, our DCN may require 4
> DPP and 2
> +OPP. It is DC's responsibility to drive the best configuration for
> each
> +specific scenario. Orchestrate all of these components together
> requires a
> +sophisticated communication interface which is highlighted in the
> diagram by
> +the edges that connect each block; from the chart, each connection
> between
> +these blocks represents:
> +
> +1. Pixel data interface (red): Represents the pixel data flow;
> +2. Global sync signals (green): It is a set of synchronization
> signals composed
> +   by VStartup, VUpdate, and VReady;
> +3. Config interface: Responsible to configure blocks;
> +4. Sideband signals: All other signals that do not fit the previous
> one.
> +
> +These signals are essential and play an important role in DCN.
> Nevertheless,
> +the Global Sync deserves an extra level of detail described in the
> next
> +section.
> +
> +All of these components are represented by a data structure named
> dc_state.
> +From DCHUB to MPC, we have a

Re: [Intel-gfx] [PATCH 2/5] drm/i915/guc: Increase GuC log size for CONFIG_DEBUG_GEM

2021-12-03 Thread Matthew Brost

On Fri, Dec 03, 2021 at 10:33:36AM -0800, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Lots of testing is done with the DEBUG_GEM config option enabled but
> not the DEBUG_GUC option. That means we only get teeny-tiny GuC logs
> which are not hugely useful. Enabling full DEBUG_GUC also spews lots
> of other detailed output that is not generally desired. However,
> bigger GuC logs are extremely useful for almost any regression debug.
> So enable bigger logs for DEBUG_GEM builds as well.
> 
> Signed-off-by: John Harrison 

Reviewed-by: Matthew Brost 

> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
> index ac1ee1d5ce10..fe6ab7550a14 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
> @@ -15,9 +15,12 @@
>  
>  struct intel_guc;
>  
> -#ifdef CONFIG_DRM_I915_DEBUG_GUC
> +#if defined(CONFIG_DRM_I915_DEBUG_GUC)
>  #define CRASH_BUFFER_SIZESZ_2M
>  #define DEBUG_BUFFER_SIZESZ_16M
> +#elif defined(CONFIG_DRM_I915_DEBUG_GEM)
> +#define CRASH_BUFFER_SIZESZ_1M
> +#define DEBUG_BUFFER_SIZESZ_2M
>  #else
>  #define CRASH_BUFFER_SIZESZ_8K
>  #define DEBUG_BUFFER_SIZESZ_64K
> -- 
> 2.25.1
>

Re: [PATCH 5/5] drm/i915/guc: Improve GuC loading status check/error reports

2021-12-03 Thread Matthew Brost

On Fri, Dec 03, 2021 at 10:33:39AM -0800, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> If the GuC fails to load, it is useful to know what firmware file /
> version was attempted. So move the version info report to before the
> load attempt rather than only after a successful load.
> 
> If the GuC does fail to load, then make the error messages visible
> rather than being 'debug' prints that do not appears in dmesg output
> by default.
> 
> When waiting for the GuC to load, it used to be necessary to check for
> two different states - READY and (LAPIC_DONE | MIA_CORE). Apparently
> the second signified init complete on RC6 exit. However, in more
> recent GuC versions the RC6 exit sequence now finishes with status
> READY as well. So the test can be simplified.
> 
> Also, add an enum giving all the current status codes that GuC loading
> can report as a reference without having to pull and search through
> the GuC source files.
> 
> Signed-off-by: John Harrison 

Reviewed-by: Matthew Brost 

> ---
>  .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   | 23 ++
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 17 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h|  4 ---
>  drivers/gpu/drm/i915/gt/uc/intel_huc.c|  1 +
>  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 31 ++-
>  5 files changed, 48 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h 
> b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
> index 488b6061ee89..c20658ee85a5 100644
> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
> @@ -11,4 +11,27 @@ enum intel_guc_response_status {
>   INTEL_GUC_RESPONSE_STATUS_GENERIC_FAIL = 0xF000,
>  };
>  
> +enum intel_guc_load_status {
> + INTEL_GUC_LOAD_STATUS_DEFAULT  = 0x00,
> + INTEL_GUC_LOAD_STATUS_START= 0x01,
> + INTEL_GUC_LOAD_STATUS_ERROR_DEVID_BUILD_MISMATCH   = 0x02,
> + INTEL_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH   = 0x03,
> + INTEL_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE  = 0x04,
> + INTEL_GUC_LOAD_STATUS_GDT_DONE = 0x10,
> + INTEL_GUC_LOAD_STATUS_IDT_DONE = 0x20,
> + INTEL_GUC_LOAD_STATUS_LAPIC_DONE   = 0x30,
> + INTEL_GUC_LOAD_STATUS_GUCINT_DONE  = 0x40,
> + INTEL_GUC_LOAD_STATUS_DPC_READY= 0x50,
> + INTEL_GUC_LOAD_STATUS_DPC_ERROR= 0x60,
> + INTEL_GUC_LOAD_STATUS_EXCEPTION= 0x70,
> + INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID= 0x71,
> + INTEL_GUC_LOAD_STATUS_PXP_TEARDOWN_CTRL_ENABLED= 0x72,
> + INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
> + INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
> + INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID   = 0x74,
> + INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
> +
> + INTEL_GUC_LOAD_STATUS_READY= 0xF0,
> +};
> +
>  #endif /* _ABI_GUC_ERRORS_ABI_H */
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> index 196424be0998..d3cee01d07e0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> @@ -70,11 +70,10 @@ static int guc_xfer_rsa(struct intel_uc_fw *guc_fw,
>  static inline bool guc_ready(struct intel_uncore *uncore, u32 *status)
>  {
>   u32 val = intel_uncore_read(uncore, GUC_STATUS);
> - u32 uk_val = val & GS_UKERNEL_MASK;
> + u32 uk_val = REG_FIELD_GET(GS_UKERNEL_MASK, val);
>  
>   *status = val;
> - return (uk_val == GS_UKERNEL_READY) ||
> - ((val & GS_MIA_CORE_STATE) && (uk_val == 
> GS_UKERNEL_LAPIC_DONE));
> + return uk_val == INTEL_GUC_LOAD_STATUS_READY;
>  }
>  
>  static int guc_wait_ucode(struct intel_uncore *uncore)
> @@ -94,8 +93,8 @@ static int guc_wait_ucode(struct intel_uncore *uncore)
>   if (ret) {
>   struct drm_device *drm = >i915->drm;
>  
> - drm_dbg(drm, "GuC load failed: status = 0x%08X\n", status);
> - drm_dbg(drm, "GuC load failed: status: Reset = %d, "
> + drm_info(drm, "GuC load failed: status = 0x%08X\n", status);
> + drm_info(drm, "GuC load failed: status: Reset = %d, "
>   "BootROM = 0x%02X, UKernel = 0x%02X, "
>   "MIA = 0x%02X, Auth = 0x%02X\n",
>   REG_FIELD_GET(GS_MIA_IN_RESET, status),
> @@ -105,13 +104,13 @@ static int guc_wait_ucode(struct intel_uncore *uncore)
>   REG_FIELD_GET(GS_AUTH_STATUS_MASK, status));
>  
>   if ((status & GS_BOOTROM_MASK) == GS_BOOTROM_RSA_FAILED) {
> - drm_dbg(drm, "GuC firmware signature verification 
> failed\n");
> +

Re: [Intel-gfx] [PATCH 4/4] drm/i915/guc: Don't go bang in GuC log if no GuC

2021-12-03 Thread Daniele Ceraolo Spurio





On 12/2/2021 4:33 PM, Lucas De Marchi wrote:
On Thu, Dec 02, 2021 at 04:06:23PM -0800, john.c.harri...@intel.com 
wrote:

From: John Harrison 

If the GuC has failed to load for any reason and then the user pokes
the debugfs GuC log interface, a BUG and/or null pointer deref can
occur. Don't let that happen.

Signed-off-by: John Harrison 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi



Do we need a fixes tag? or is it ok to not have it for debugfs bugs?

Daniele


---
drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c

index 46026c2c1722..8fd068049376 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c
@@ -31,7 +31,7 @@ static int guc_log_level_get(void *data, u64 *val)
{
struct intel_guc_log *log = data;

-    if (!intel_guc_is_used(log_to_guc(log)))
+    if (!log->vma)
    return -ENODEV;

*val = intel_guc_log_get_level(log);
@@ -43,7 +43,7 @@ static int guc_log_level_set(void *data, u64 val)
{
struct intel_guc_log *log = data;

-    if (!intel_guc_is_used(log_to_guc(log)))
+    if (!log->vma)
    return -ENODEV;

return intel_guc_log_set_level(log, val);
--
2.25.1

Re: [PATCH v2] drm/msm/dpu: removed logically dead code

2021-12-03 Thread Dmitry Baryshkov


On 03/12/2021 22:32, Ameer Hamza wrote:

Fixed coverity warning by removing the dead code

Addresses-Coverity: 1494147 ("Logically dead code")

Signed-off-by: Ameer Hamza 


Reviewed-by: Dmitry Baryshkov 



---
Changes in v2:
removed the 'fail' part completely by moving DPU_ERROR and return statement in 
place of corresponding goto statements.
---
  .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c| 17 +
  1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index 185379b18572..ddd9d89cd456 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -698,17 +698,17 @@ struct dpu_encoder_phys *dpu_encoder_phys_vid_init(
  {
struct dpu_encoder_phys *phys_enc = NULL;
struct dpu_encoder_irq *irq;
-   int i, ret = 0;
+   int i;
  
  	if (!p) {

-   ret = -EINVAL;
-   goto fail;
+   DPU_ERROR("failed to create encoder due to invalid 
parameter\n");
+   return ERR_PTR(-EINVAL);
}
  
  	phys_enc = kzalloc(sizeof(*phys_enc), GFP_KERNEL);

if (!phys_enc) {
-   ret = -ENOMEM;
-   goto fail;
+   DPU_ERROR("failed to create encoder due to memory allocation 
error\n");
+   return ERR_PTR(-ENOMEM);
}
  
  	phys_enc->hw_mdptop = p->dpu_kms->hw_mdp;

@@ -748,11 +748,4 @@ struct dpu_encoder_phys *dpu_encoder_phys_vid_init(
DPU_DEBUG_VIDENC(phys_enc, "created intf idx:%d\n", p->intf_idx);
  
  	return phys_enc;

-
-fail:
-   DPU_ERROR("failed to create encoder\n");
-   if (phys_enc)
-   dpu_encoder_phys_vid_destroy(phys_enc);
-
-   return ERR_PTR(ret);
  }




--
With best wishes
Dmitry

Re: [Intel-gfx] [PATCH] drm/i915: Fix error pointer dereference in i915_gem_do_execbuffer()

2021-12-03 Thread Matthew Brost

On Wed, Dec 01, 2021 at 08:48:31PM -0800, Matthew Brost wrote:
> From: Dan Carpenter 
> 
> Originally "out_fence" was set using out_fence = sync_file_create() but
> which returns NULL, but now it is set with out_fence = eb_requests_create()
> which returns error pointers.  The error path needs to be modified to
> avoid an Oops in the "goto err_request;" path.
> 
> Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf")
> Signed-off-by: Dan Carpenter 
> Signed-off-by: Matthew Brost 

Reviewed-by: Matthew Brost 

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 9f7c6ecadb90..6db588b9a30e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3288,6 +3288,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   out_fence = eb_requests_create(, in_fence, out_fence_fd);
>   if (IS_ERR(out_fence)) {
>   err = PTR_ERR(out_fence);
> + out_fence = NULL;
>   if (eb.requests[0])
>   goto err_request;
>   else
> -- 
> 2.33.1
>

Re: [PATCH bpf v2] treewide: add missing includes masked by cgroup -> bpf dependency

2021-12-03 Thread patchwork-bot+netdevbpf

Hello:

This patch was applied to bpf/bpf.git (master)
by Alexei Starovoitov :

On Thu,  2 Dec 2021 12:34:00 -0800 you wrote:
> cgroup.h (therefore swap.h, therefore half of the universe)
> includes bpf.h which in turn includes module.h and slab.h.
> Since we're about to get rid of that dependency we need
> to clean things up.
> 
> v2: drop the cpu.h include from cacheinfo.h, it's not necessary
> and it makes riscv sensitive to ordering of include files.
> 
> [...]

Here is the summary with links:
  - [bpf,v2] treewide: add missing includes masked by cgroup -> bpf dependency
https://git.kernel.org/bpf/bpf/c/8581fd402a0c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH 1/5] drm/i915/uc: Allow platforms to have GuC but not HuC

2021-12-03 Thread Lucas De Marchi


On Fri, Dec 03, 2021 at 10:33:35AM -0800, john.c.harri...@intel.com wrote:

From: John Harrison 

It is possible for platforms to require GuC but not HuC firmware.
Also, the firmware versions for GuC and HuC advance independently. So
split the macros up to allow the lists to be maintained separately.

Signed-off-by: John Harrison 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi


---
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 93 
1 file changed, 63 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 3aa87be4f2e4..a7788ce50736 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -48,22 +48,39 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
 * Note that RKL and ADL-S have the same GuC/HuC device ID's and use the same
 * firmware as TGL.
 */
-#define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
-   fw_def(ALDERLAKE_P, 0, guc_def(adlp, 62, 0, 3), huc_def(tgl, 7, 9, 3)) \
-   fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
-   fw_def(DG1, 0, guc_def(dg1, 62, 0, 0), huc_def(dg1,  7, 9, 3)) \
-   fw_def(ROCKETLAKE,  0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
-   fw_def(TIGERLAKE,   0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
-   fw_def(JASPERLAKE,  0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
-   fw_def(ELKHARTLAKE, 0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
-   fw_def(ICELAKE, 0, guc_def(icl, 62, 0, 0), huc_def(icl,  9, 0, 0)) \
-   fw_def(COMETLAKE,   5, guc_def(cml, 62, 0, 0), huc_def(cml,  4, 0, 0)) \
-   fw_def(COMETLAKE,   0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
-   fw_def(COFFEELAKE,  0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
-   fw_def(GEMINILAKE,  0, guc_def(glk, 62, 0, 0), huc_def(glk,  4, 0, 0)) \
-   fw_def(KABYLAKE,0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
-   fw_def(BROXTON, 0, guc_def(bxt, 62, 0, 0), huc_def(bxt,  2, 0, 0)) \
-   fw_def(SKYLAKE, 0, guc_def(skl, 62, 0, 0), huc_def(skl,  2, 0, 0))
+#define INTEL_GUC_FIRMWARE_DEFS(fw_def, guc_def) \
+   fw_def(ALDERLAKE_P,  0, guc_def(adlp, 62, 0, 3)) \
+   fw_def(ALDERLAKE_S,  0, guc_def(tgl,  62, 0, 0)) \
+   fw_def(DG1,  0, guc_def(dg1,  62, 0, 0)) \
+   fw_def(ROCKETLAKE,   0, guc_def(tgl,  62, 0, 0)) \
+   fw_def(TIGERLAKE,0, guc_def(tgl,  62, 0, 0)) \
+   fw_def(JASPERLAKE,   0, guc_def(ehl,  62, 0, 0)) \
+   fw_def(ELKHARTLAKE,  0, guc_def(ehl,  62, 0, 0)) \
+   fw_def(ICELAKE,  0, guc_def(icl,  62, 0, 0)) \
+   fw_def(COMETLAKE,5, guc_def(cml,  62, 0, 0)) \
+   fw_def(COMETLAKE,0, guc_def(kbl,  62, 0, 0)) \
+   fw_def(COFFEELAKE,   0, guc_def(kbl,  62, 0, 0)) \
+   fw_def(GEMINILAKE,   0, guc_def(glk,  62, 0, 0)) \
+   fw_def(KABYLAKE, 0, guc_def(kbl,  62, 0, 0)) \
+   fw_def(BROXTON,  0, guc_def(bxt,  62, 0, 0)) \
+   fw_def(SKYLAKE,  0, guc_def(skl,  62, 0, 0))
+
+#define INTEL_HUC_FIRMWARE_DEFS(fw_def, huc_def) \
+   fw_def(ALDERLAKE_P,  0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(ALDERLAKE_S,  0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(DG1,  0, huc_def(dg1,  7, 9, 3)) \
+   fw_def(ROCKETLAKE,   0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(TIGERLAKE,0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(JASPERLAKE,   0, huc_def(ehl,  9, 0, 0)) \
+   fw_def(ELKHARTLAKE,  0, huc_def(ehl,  9, 0, 0)) \
+   fw_def(ICELAKE,  0, huc_def(icl,  9, 0, 0)) \
+   fw_def(COMETLAKE,5, huc_def(cml,  4, 0, 0)) \
+   fw_def(COMETLAKE,0, huc_def(kbl,  4, 0, 0)) \
+   fw_def(COFFEELAKE,   0, huc_def(kbl,  4, 0, 0)) \
+   fw_def(GEMINILAKE,   0, huc_def(glk,  4, 0, 0)) \
+   fw_def(KABYLAKE, 0, huc_def(kbl,  4, 0, 0)) \
+   fw_def(BROXTON,  0, huc_def(bxt,  2, 0, 0)) \
+   fw_def(SKYLAKE,  0, huc_def(skl,  2, 0, 0))

#define __MAKE_UC_FW_PATH(prefix_, name_, major_, minor_, patch_) \
"i915/" \
@@ -79,11 +96,11 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
__MAKE_UC_FW_PATH(prefix_, "_huc_", major_, minor_, bld_num_)

/* All blobs need to be declared via MODULE_FIRMWARE() */
-#define INTEL_UC_MODULE_FW(platform_, revid_, guc_, huc_) \
-   MODULE_FIRMWARE(guc_); \
-   MODULE_FIRMWARE(huc_);
+#define INTEL_UC_MODULE_FW(platform_, revid_, uc_) \
+   MODULE_FIRMWARE(uc_);

-INTEL_UC_FIRMWARE_DEFS(INTEL_UC_MODULE_FW, MAKE_GUC_FW_PATH, MAKE_HUC_FW_PATH)
+INTEL_GUC_FIRMWARE_DEFS(INTEL_UC_MODULE_FW, MAKE_GUC_FW_PATH)
+INTEL_HUC_FIRMWARE_DEFS(INTEL_UC_MODULE_FW, MAKE_HUC_FW_PATH)

/* The below structs and macros are used to iterate across the list of blobs */
struct __packed uc_fw_blob {
@@ -106,31 +123,47 @@ struct __packed uc_fw_blob {
struct __packed uc_fw_platform_requirement {
enum intel_platform

Re: [PATCH] drm/msm/a5xx: Add support for Adreno 506 GPU

2021-12-03 Thread Dmitry Baryshkov


On 22/10/2021 20:33, Bjorn Andersson wrote:

On Fri 22 Oct 04:43 PDT 2021, Vladimir Lypak wrote:


This GPU is found on SoCs such as MSM8953(650MHz), SDM450(600MHz),
SDM632(725MHz).

Signed-off-by: Vladimir Lypak 
---
  drivers/gpu/drm/msm/adreno/a5xx_gpu.c  | 34 ++
  drivers/gpu/drm/msm/adreno/adreno_device.c | 18 
  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  5 
  3 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 5e2750eb3810..249a0d8bc673 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -441,7 +441,7 @@ void a5xx_set_hwcg(struct msm_gpu *gpu, bool state)
const struct adreno_five_hwcg_regs *regs;
unsigned int i, sz;
  
-	if (adreno_is_a508(adreno_gpu)) {

+   if (adreno_is_a506(adreno_gpu) || adreno_is_a508(adreno_gpu)) {
regs = a50x_hwcg;
sz = ARRAY_SIZE(a50x_hwcg);
} else if (adreno_is_a509(adreno_gpu) || adreno_is_a512(adreno_gpu)) {
@@ -485,7 +485,7 @@ static int a5xx_me_init(struct msm_gpu *gpu)
OUT_RING(ring, 0x);
  
  	/* Specify workarounds for various microcode issues */

-   if (adreno_is_a530(adreno_gpu)) {
+   if (adreno_is_a506(adreno_gpu) || adreno_is_a530(adreno_gpu)) {
/* Workaround for token end syncs
 * Force a WFI after every direct-render 3D mode draw and every
 * 2D mode 3 draw
@@ -620,8 +620,17 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
  
  static int a5xx_zap_shader_resume(struct msm_gpu *gpu)

  {
+   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
int ret;
  
+	/*

+* Adreno 506,508,512 have CPZ Retention feature and
+* don't need to resume zap shader
+*/
+   if (adreno_is_a506(adreno_gpu) || adreno_is_a508(adreno_gpu) ||
+   adreno_is_a512(adreno_gpu))
+   return 0;


Afaict all other changes in the patch adds a506 support, but this hunk
changes a508 and a512 behavior.

I'm not saying that the change is wrong, but this hunk deserves to be in
it's own patch - so that if there's any impact on those other versions
it can be tracked down to that specific patch.


Vladimir, any plans to submit v2? This comment requests splitting the 
patch in two.



--
With best wishes
Dmitry

[PATCH 3/5] drm/i915/guc: Don't go bang in GuC log if no GuC

2021-12-03 Thread John . C . Harrison

From: John Harrison 

If the GuC has failed to load for any reason and then the user pokes
the debugfs GuC log interface, a BUG and/or null pointer deref can
occur. Don't let that happen.

Signed-off-by: John Harrison 
Reviewed-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c
index 46026c2c1722..8fd068049376 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c
@@ -31,7 +31,7 @@ static int guc_log_level_get(void *data, u64 *val)
 {
struct intel_guc_log *log = data;
 
-   if (!intel_guc_is_used(log_to_guc(log)))
+   if (!log->vma)
return -ENODEV;
 
*val = intel_guc_log_get_level(log);
@@ -43,7 +43,7 @@ static int guc_log_level_set(void *data, u64 val)
 {
struct intel_guc_log *log = data;
 
-   if (!intel_guc_is_used(log_to_guc(log)))
+   if (!log->vma)
return -ENODEV;
 
return intel_guc_log_set_level(log, val);
-- 
2.25.1

[PATCH 5/5] drm/i915/guc: Improve GuC loading status check/error reports

2021-12-03 Thread John . C . Harrison

From: John Harrison 

If the GuC fails to load, it is useful to know what firmware file /
version was attempted. So move the version info report to before the
load attempt rather than only after a successful load.

If the GuC does fail to load, then make the error messages visible
rather than being 'debug' prints that do not appears in dmesg output
by default.

When waiting for the GuC to load, it used to be necessary to check for
two different states - READY and (LAPIC_DONE | MIA_CORE). Apparently
the second signified init complete on RC6 exit. However, in more
recent GuC versions the RC6 exit sequence now finishes with status
READY as well. So the test can be simplified.

Also, add an enum giving all the current status codes that GuC loading
can report as a reference without having to pull and search through
the GuC source files.

Signed-off-by: John Harrison 
---
 .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   | 23 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 17 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h|  4 ---
 drivers/gpu/drm/i915/gt/uc/intel_huc.c|  1 +
 drivers/gpu/drm/i915/gt/uc/intel_uc.c | 31 ++-
 5 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
index 488b6061ee89..c20658ee85a5 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
@@ -11,4 +11,27 @@ enum intel_guc_response_status {
INTEL_GUC_RESPONSE_STATUS_GENERIC_FAIL = 0xF000,
 };
 
+enum intel_guc_load_status {
+   INTEL_GUC_LOAD_STATUS_DEFAULT  = 0x00,
+   INTEL_GUC_LOAD_STATUS_START= 0x01,
+   INTEL_GUC_LOAD_STATUS_ERROR_DEVID_BUILD_MISMATCH   = 0x02,
+   INTEL_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH   = 0x03,
+   INTEL_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE  = 0x04,
+   INTEL_GUC_LOAD_STATUS_GDT_DONE = 0x10,
+   INTEL_GUC_LOAD_STATUS_IDT_DONE = 0x20,
+   INTEL_GUC_LOAD_STATUS_LAPIC_DONE   = 0x30,
+   INTEL_GUC_LOAD_STATUS_GUCINT_DONE  = 0x40,
+   INTEL_GUC_LOAD_STATUS_DPC_READY= 0x50,
+   INTEL_GUC_LOAD_STATUS_DPC_ERROR= 0x60,
+   INTEL_GUC_LOAD_STATUS_EXCEPTION= 0x70,
+   INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID= 0x71,
+   INTEL_GUC_LOAD_STATUS_PXP_TEARDOWN_CTRL_ENABLED= 0x72,
+   INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
+   INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
+   INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID   = 0x74,
+   INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
+
+   INTEL_GUC_LOAD_STATUS_READY= 0xF0,
+};
+
 #endif /* _ABI_GUC_ERRORS_ABI_H */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 196424be0998..d3cee01d07e0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -70,11 +70,10 @@ static int guc_xfer_rsa(struct intel_uc_fw *guc_fw,
 static inline bool guc_ready(struct intel_uncore *uncore, u32 *status)
 {
u32 val = intel_uncore_read(uncore, GUC_STATUS);
-   u32 uk_val = val & GS_UKERNEL_MASK;
+   u32 uk_val = REG_FIELD_GET(GS_UKERNEL_MASK, val);
 
*status = val;
-   return (uk_val == GS_UKERNEL_READY) ||
-   ((val & GS_MIA_CORE_STATE) && (uk_val == 
GS_UKERNEL_LAPIC_DONE));
+   return uk_val == INTEL_GUC_LOAD_STATUS_READY;
 }
 
 static int guc_wait_ucode(struct intel_uncore *uncore)
@@ -94,8 +93,8 @@ static int guc_wait_ucode(struct intel_uncore *uncore)
if (ret) {
struct drm_device *drm = >i915->drm;
 
-   drm_dbg(drm, "GuC load failed: status = 0x%08X\n", status);
-   drm_dbg(drm, "GuC load failed: status: Reset = %d, "
+   drm_info(drm, "GuC load failed: status = 0x%08X\n", status);
+   drm_info(drm, "GuC load failed: status: Reset = %d, "
"BootROM = 0x%02X, UKernel = 0x%02X, "
"MIA = 0x%02X, Auth = 0x%02X\n",
REG_FIELD_GET(GS_MIA_IN_RESET, status),
@@ -105,13 +104,13 @@ static int guc_wait_ucode(struct intel_uncore *uncore)
REG_FIELD_GET(GS_AUTH_STATUS_MASK, status));
 
if ((status & GS_BOOTROM_MASK) == GS_BOOTROM_RSA_FAILED) {
-   drm_dbg(drm, "GuC firmware signature verification 
failed\n");
+   drm_info(drm, "GuC firmware signature verification 
failed\n");
ret = -ENOEXEC;
}
 
-   if ((status & GS_UKERNEL_MASK) == GS_UKERNEL_EXCEPTION) {
-

[PATCH 4/5] drm/i915/guc: Update to GuC version 69.0.0

2021-12-03 Thread John . C . Harrison

From: John Harrison 

Update to the latest GuC release.

The latest GuC firmware introduces a number of interface changes:

GuC may return NO_RESPONSE_RETRY message for requests sent over CTB.
Add support for this reply and try resending the request again as a
new CTB message.

A KLV (key-length-value) mechanism is now used for passing
configuration data such as CTB management.

With the new KLV scheme, the old CTB management actions are no longer
used and are removed.

Register capture on hang is now supported by GuC. Full i915 support
for this will be added by a later patch. A minimum support of
providing capture memory and register lists is required though, so add
that in.

The device id of the current platform needs to be provided at init time.

The 'poll CS' w/a (Wa_22012773006) was blanket enabled by previous
versions of GuC. It must now be explicitly requested by the KMD. So,
add in the code to turn it on when relevant.

The GuC log entry format has changed. This requires adding a new field
to the log header structure to mark the wrap point at the end of the
buffer (as the buffer size is no longer a multiple of the log entry
size).

New CTB notification messages are now sent for some things that were
previously only sent via MMIO notifications.

Of these, the crash dump notification was not really being handled by
i915. It called the log flush code but that only flushed the regular
debug log and then only if relay logging was enabled. So just report
an error message instead.

The 'exception' notification was just being ignored completely. So add
an error message for that as well.

Note that in either the crash dump or the exception case, the GuC is
basically dead. The KMD will detect this via the heartbeat and trigger
both an error log (which will include the crash dump as part of the
GuC log) and a GT reset. So no other processing is really required.

Signed-off-by: John Harrison 
Signed-off-by: Michal Wajdeczko 
---
 Documentation/gpu/i915.rst|   1 +
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  80 +-
 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h |  82 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c| 126 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|   4 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  45 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 141 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  37 -
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  31 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.h|   3 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  18 +++
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  30 ++--
 12 files changed, 434 insertions(+), 164 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index b7d801993bfa..bcaefc952764 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -539,6 +539,7 @@ GuC ABI
 .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
 .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
 .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
+.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
 
 HuC
 ---
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
index fe5d7d261797..7afdadc7656f 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
@@ -7,9 +7,9 @@
 #define _ABI_GUC_ACTIONS_ABI_H
 
 /**
- * DOC: HOST2GUC_REGISTER_CTB
+ * DOC: HOST2GUC_SELF_CFG
  *
- * This message is used as part of the `CTB based communication`_ setup.
+ * This message is used by Host KMD to setup of the `GuC Self Config KLVs`_.
  *
  * This message must be sent as `MMIO HXG Message`_.
  *
@@ -22,20 +22,18 @@
  *  |   
+---+--+
  *  |   | 27:16 | DATA0 = MBZ  
|
  *  |   
+---+--+
- *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_REGISTER_CTB` = 0x4505
|
+ *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_SELF_CFG` = 0x0508
|
  *  
+---+---+--+
- *  | 1 | 31:12 | RESERVED = MBZ   
|
+ *  | 1 | 31:16 | **KLV_KEY** - KLV key, see `GuC Self Config KLVs`_   
|
  *  |   
+---+--+
- *  |   |  11:8 | **TYPE** - type for the `CT Buffer`_ 
|
+ *  |   |  15:0 | **KLV_LEN** - KLV length 
|
  *  |   |   |  
|
- *  |   |   |   - _`GUC_CTB_TYPE_HOST2GUC` = 0 
|

[PATCH 2/5] drm/i915/guc: Increase GuC log size for CONFIG_DEBUG_GEM

2021-12-03 Thread John . C . Harrison

From: John Harrison 

Lots of testing is done with the DEBUG_GEM config option enabled but
not the DEBUG_GUC option. That means we only get teeny-tiny GuC logs
which are not hugely useful. Enabling full DEBUG_GUC also spews lots
of other detailed output that is not generally desired. However,
bigger GuC logs are extremely useful for almost any regression debug.
So enable bigger logs for DEBUG_GEM builds as well.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
index ac1ee1d5ce10..fe6ab7550a14 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
@@ -15,9 +15,12 @@
 
 struct intel_guc;
 
-#ifdef CONFIG_DRM_I915_DEBUG_GUC
+#if defined(CONFIG_DRM_I915_DEBUG_GUC)
 #define CRASH_BUFFER_SIZE  SZ_2M
 #define DEBUG_BUFFER_SIZE  SZ_16M
+#elif defined(CONFIG_DRM_I915_DEBUG_GEM)
+#define CRASH_BUFFER_SIZE  SZ_1M
+#define DEBUG_BUFFER_SIZE  SZ_2M
 #else
 #define CRASH_BUFFER_SIZE  SZ_8K
 #define DEBUG_BUFFER_SIZE  SZ_64K
-- 
2.25.1

[PATCH 0/5] Update to GuC version 69.0.0

2021-12-03 Thread John . C . Harrison

From: John Harrison 

Update to the latest GuC version. This includes a suite of interface
changes and new features with corresponding i915 side changes.

Also, fix/improve a bunch of other things while at it.

Signed-off-by: John Harrison 



John Harrison (5):
  drm/i915/uc: Allow platforms to have GuC but not HuC
  drm/i915/guc: Increase GuC log size for CONFIG_DEBUG_GEM
  drm/i915/guc: Don't go bang in GuC log if no GuC
  drm/i915/guc: Update to GuC version 69.0.0
  drm/i915/guc: Improve GuC loading status check/error reports

 Documentation/gpu/i915.rst|   1 +
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |  80 +-
 .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   |  23 +++
 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h |  82 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c| 126 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|   4 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  45 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 141 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  17 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  37 -
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  31 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.h|   8 +-
 .../drm/i915/gt/uc/intel_guc_log_debugfs.c|   4 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h|   4 -
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  18 +++
 drivers/gpu/drm/i915/gt/uc/intel_huc.c|   1 +
 drivers/gpu/drm/i915/gt/uc/intel_uc.c |  31 ++--
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  93 
 18 files changed, 536 insertions(+), 210 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h

-- 
2.25.1

[PATCH 1/5] drm/i915/uc: Allow platforms to have GuC but not HuC

2021-12-03 Thread John . C . Harrison

From: John Harrison 

It is possible for platforms to require GuC but not HuC firmware.
Also, the firmware versions for GuC and HuC advance independently. So
split the macros up to allow the lists to be maintained separately.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 93 
 1 file changed, 63 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 3aa87be4f2e4..a7788ce50736 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -48,22 +48,39 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
  * Note that RKL and ADL-S have the same GuC/HuC device ID's and use the same
  * firmware as TGL.
  */
-#define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
-   fw_def(ALDERLAKE_P, 0, guc_def(adlp, 62, 0, 3), huc_def(tgl, 7, 9, 3)) \
-   fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
-   fw_def(DG1, 0, guc_def(dg1, 62, 0, 0), huc_def(dg1,  7, 9, 3)) \
-   fw_def(ROCKETLAKE,  0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
-   fw_def(TIGERLAKE,   0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
-   fw_def(JASPERLAKE,  0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
-   fw_def(ELKHARTLAKE, 0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
-   fw_def(ICELAKE, 0, guc_def(icl, 62, 0, 0), huc_def(icl,  9, 0, 0)) \
-   fw_def(COMETLAKE,   5, guc_def(cml, 62, 0, 0), huc_def(cml,  4, 0, 0)) \
-   fw_def(COMETLAKE,   0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
-   fw_def(COFFEELAKE,  0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
-   fw_def(GEMINILAKE,  0, guc_def(glk, 62, 0, 0), huc_def(glk,  4, 0, 0)) \
-   fw_def(KABYLAKE,0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
-   fw_def(BROXTON, 0, guc_def(bxt, 62, 0, 0), huc_def(bxt,  2, 0, 0)) \
-   fw_def(SKYLAKE, 0, guc_def(skl, 62, 0, 0), huc_def(skl,  2, 0, 0))
+#define INTEL_GUC_FIRMWARE_DEFS(fw_def, guc_def) \
+   fw_def(ALDERLAKE_P,  0, guc_def(adlp, 62, 0, 3)) \
+   fw_def(ALDERLAKE_S,  0, guc_def(tgl,  62, 0, 0)) \
+   fw_def(DG1,  0, guc_def(dg1,  62, 0, 0)) \
+   fw_def(ROCKETLAKE,   0, guc_def(tgl,  62, 0, 0)) \
+   fw_def(TIGERLAKE,0, guc_def(tgl,  62, 0, 0)) \
+   fw_def(JASPERLAKE,   0, guc_def(ehl,  62, 0, 0)) \
+   fw_def(ELKHARTLAKE,  0, guc_def(ehl,  62, 0, 0)) \
+   fw_def(ICELAKE,  0, guc_def(icl,  62, 0, 0)) \
+   fw_def(COMETLAKE,5, guc_def(cml,  62, 0, 0)) \
+   fw_def(COMETLAKE,0, guc_def(kbl,  62, 0, 0)) \
+   fw_def(COFFEELAKE,   0, guc_def(kbl,  62, 0, 0)) \
+   fw_def(GEMINILAKE,   0, guc_def(glk,  62, 0, 0)) \
+   fw_def(KABYLAKE, 0, guc_def(kbl,  62, 0, 0)) \
+   fw_def(BROXTON,  0, guc_def(bxt,  62, 0, 0)) \
+   fw_def(SKYLAKE,  0, guc_def(skl,  62, 0, 0))
+
+#define INTEL_HUC_FIRMWARE_DEFS(fw_def, huc_def) \
+   fw_def(ALDERLAKE_P,  0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(ALDERLAKE_S,  0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(DG1,  0, huc_def(dg1,  7, 9, 3)) \
+   fw_def(ROCKETLAKE,   0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(TIGERLAKE,0, huc_def(tgl,  7, 9, 3)) \
+   fw_def(JASPERLAKE,   0, huc_def(ehl,  9, 0, 0)) \
+   fw_def(ELKHARTLAKE,  0, huc_def(ehl,  9, 0, 0)) \
+   fw_def(ICELAKE,  0, huc_def(icl,  9, 0, 0)) \
+   fw_def(COMETLAKE,5, huc_def(cml,  4, 0, 0)) \
+   fw_def(COMETLAKE,0, huc_def(kbl,  4, 0, 0)) \
+   fw_def(COFFEELAKE,   0, huc_def(kbl,  4, 0, 0)) \
+   fw_def(GEMINILAKE,   0, huc_def(glk,  4, 0, 0)) \
+   fw_def(KABYLAKE, 0, huc_def(kbl,  4, 0, 0)) \
+   fw_def(BROXTON,  0, huc_def(bxt,  2, 0, 0)) \
+   fw_def(SKYLAKE,  0, huc_def(skl,  2, 0, 0))
 
 #define __MAKE_UC_FW_PATH(prefix_, name_, major_, minor_, patch_) \
"i915/" \
@@ -79,11 +96,11 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
__MAKE_UC_FW_PATH(prefix_, "_huc_", major_, minor_, bld_num_)
 
 /* All blobs need to be declared via MODULE_FIRMWARE() */
-#define INTEL_UC_MODULE_FW(platform_, revid_, guc_, huc_) \
-   MODULE_FIRMWARE(guc_); \
-   MODULE_FIRMWARE(huc_);
+#define INTEL_UC_MODULE_FW(platform_, revid_, uc_) \
+   MODULE_FIRMWARE(uc_);
 
-INTEL_UC_FIRMWARE_DEFS(INTEL_UC_MODULE_FW, MAKE_GUC_FW_PATH, MAKE_HUC_FW_PATH)
+INTEL_GUC_FIRMWARE_DEFS(INTEL_UC_MODULE_FW, MAKE_GUC_FW_PATH)
+INTEL_HUC_FIRMWARE_DEFS(INTEL_UC_MODULE_FW, MAKE_HUC_FW_PATH)
 
 /* The below structs and macros are used to iterate across the list of blobs */
 struct __packed uc_fw_blob {
@@ -106,31 +123,47 @@ struct __packed uc_fw_blob {
 struct __packed uc_fw_platform_requirement {
enum intel_platform p;
u8 rev; /* first platform rev using this FW */
-   const struct uc_fw_blob

Re: [PATCH -next] drm: remove node from list before freeing the node

2021-12-03 Thread Dmitry Baryshkov


On 03/12/2021 06:36, Yang Li wrote:

fix the following smatch warning:
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c:1675 dpu_plane_init() warn:
'>mplane_list' not removed from list

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 


Reviewed-by: Dmitry Baryshkov 


---
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index ca190d9..aad238b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -1672,6 +1672,7 @@ struct drm_plane *dpu_plane_init(struct drm_device *dev,
if (pdpu && pdpu->pipe_hw)
dpu_hw_sspp_destroy(pdpu->pipe_hw);
  clean_plane:
+   list_del(>mplane_list);
kfree(pdpu);
return ERR_PTR(ret);
  }




--
With best wishes
Dmitry

Re: [PATCH] drm/msm/dpu: removed logically dead code

2021-12-03 Thread Dmitry Baryshkov


On 03/12/2021 19:18, Ameer Hamza wrote:

Fixed coverity warning by removing the dead code

Addresses-Coverity: 1494147 ("Logically dead code")

Signed-off-by: Ameer Hamza 


While the patch is correct, remove the 'fail' part completely by moving 
DPU_ERROR and return statement in place of corresponding goto statements.



---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 2 --
  1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index 185379b18572..75f0c0cee661 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -751,8 +751,6 @@ struct dpu_encoder_phys *dpu_encoder_phys_vid_init(
  
  fail:

DPU_ERROR("failed to create encoder\n");
-   if (phys_enc)
-   dpu_encoder_phys_vid_destroy(phys_enc);
  
  	return ERR_PTR(ret);

  }




--
With best wishes
Dmitry

Re: [PATCH] dt-bindings: panel: Include SPI peripheral properties

2021-12-03 Thread Rob Herring

On Fri, Dec 3, 2021 at 9:43 AM Thierry Reding  wrote:
>
> From: Thierry Reding 
>
> SPI panels need to reference the SPI peripheral properties so that they
> can be properly validated.

Thanks for doing this.

>
> Signed-off-by: Thierry Reding 
> ---
>  .../devicetree/bindings/display/panel/lgphilips,lb035q02.yaml| 1 +
>  .../devicetree/bindings/display/panel/sony,acx565akm.yaml| 1 +
>  2 files changed, 2 insertions(+)

I'm seeing a few more than this. If there's some logic to the ones you
fixed, I'm not seeing it.

Documentation/devicetree/bindings/display/st,stm32-dsi.example.dt.yaml:
dsi@5a00: Unevaluated properties are not allowed ('panel-dsi@0'
was unexpected)
Documentation/devicetree/bindings/display/panel/samsung,ld9040.example.dt.yaml:
lcd@0: Unevaluated properties are not allowed ('#address-cells',
'#size-cells', 'spi-max-frequency', 'spi-cpol', 'spi-cpha' were
unexpected)
Documentation/devicetree/bindings/display/panel/kingdisplay,kd035g6-54nt.example.dt.yaml:
panel@0: Unevaluated properties are not allowed ('spi-max-frequency',
'spi-3wire' were unexpected)
Documentation/devicetree/bindings/display/panel/ilitek,ili9322.example.dt.yaml:
display@0: Unevaluated properties are not allowed ('reg' was
unexpected)
Documentation/devicetree/bindings/display/panel/samsung,s6e63m0.example.dt.yaml:
display@0: Unevaluated properties are not allowed ('spi-max-frequency'
was unexpected)
Documentation/devicetree/bindings/display/panel/abt,y030xx067a.example.dt.yaml:
panel@0: Unevaluated properties are not allowed ('spi-max-frequency'
was unexpected)
Documentation/devicetree/bindings/display/panel/sony,acx565akm.example.dt.yaml:
panel@2: Unevaluated properties are not allowed ('spi-max-frequency',
'reg' were unexpected)
Documentation/devicetree/bindings/display/panel/novatek,nt36672a.example.dt.yaml:
panel@0: Unevaluated properties are not allowed ('vddi0-supply',
'#address-cells', '#size-cells' were unexpected)
Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.example.dt.yaml:
panel@0: Unevaluated properties are not allowed ('reg',
'spi-max-frequency', 'spi-cpol', 'spi-cpha' were unexpected)
Documentation/devicetree/bindings/display/panel/tpo,td.example.dt.yaml:
panel@0: Unevaluated properties are not allowed ('spi-max-frequency',
'spi-cpol', 'spi-cpha' were unexpected)
Documentation/devicetree/bindings/display/panel/innolux,ej030na.example.dt.yaml:
panel@0: Unevaluated properties are not allowed ('spi-max-frequency'
was unexpected)
Documentation/devicetree/bindings/display/panel/sitronix,st7789v.example.dt.yaml:
panel@0: Unevaluated properties are not allowed ('spi-max-frequency',
'spi-cpol', 'spi-cpha' were unexpected)

>
> diff --git 
> a/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml 
> b/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml
> index 830e335ddb53..240a884b7fa7 100644
> --- a/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml
> +++ b/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml
> @@ -14,6 +14,7 @@ maintainers:
>- Tomi Valkeinen 
>
>  allOf:
> +  - $ref: ../../spi/spi-peripheral-props.yaml

Please use /schemas/spi/... instead.

This needs Mark's tree if you want to avoid errors. I'm fine though as
long as linux-next is okay. Or I can pull in Mark's tree and take this
and others.

Rob

Re: [PATCH] drm: send vblank event with the attached sequence rather than current

2021-12-03 Thread Mark Yacoub

On Fri, Dec 3, 2021 at 1:03 PM Michel Dänzer  wrote:
>
> On 2021-12-03 16:58, Matthias Brugger wrote:
> > Hi Mark,
> >
> > On 02/12/2021 16:11, Mark Yacoub wrote:
> >> From: Mark Yacoub 
> >>
> >
> > please make sure to add the linux-mediatek mailinglist in any follow-up 
> > communication.
> >
> > Regards,
> > Matthias
> >
> >> [Why]
> >> drm_handle_vblank_events loops over vblank_event_list to send any event
> >> that is current or has passed.
> >> More than 1 event could be pending with past sequence time that need to
> >> be send. This can be a side effect of drivers without hardware vblank
> >> counter and they depend on the difference in the timestamps and the
> >> frame/field duration calculated in drm_update_vblank_count. This can
> >> lead to 1 vblirq being ignored due to very small diff,
>
> That sounds like the real issue which needs to be addressed. As Ville wrote, 
> the sequence value in the event is supposed to be the current sequence, not 
> the one which was specified originally by user space. So this change would 
> basically break the universe. ;)
>
I don't wanna be the reason to break the universe :)) I guess I
misunderstood what the call does, will look into the real issue more.
Thanks everyone!
>
> >> resulting in a subsequent vblank with 2 pending vblank events to be sent, 
> >> each with a
> >> unique sequence expected by user space.
> >>
> >> [How]
> >> Send each pending vblank event with the sequence it's waiting on instead
> >> of assigning the current sequence to all of them.
> >>
> >> Fixes igt@kms_flip "Unexpected frame sequence"
> >> Tested on Jacuzzi (MT8183)
> >>
> >> Signed-off-by: Mark Yacoub 
> >> ---
> >>   drivers/gpu/drm/drm_vblank.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> >> index 3417e1ac79185..47da8056abc14 100644
> >> --- a/drivers/gpu/drm/drm_vblank.c
> >> +++ b/drivers/gpu/drm/drm_vblank.c
> >> @@ -1902,7 +1902,7 @@ static void drm_handle_vblank_events(struct 
> >> drm_device *dev, unsigned int pipe)
> >> list_del(>base.link);
> >>   drm_vblank_put(dev, pipe);
> >> -send_vblank_event(dev, e, seq, now);
> >> +send_vblank_event(dev, e, e->sequence, now);
> >>   }
> >> if (crtc && crtc->funcs->get_vblank_timestamp)
> >>
>
>
>
> --
> Earthling Michel Dänzer|  https://redhat.com
> Libre software enthusiast  | Mesa and Xwayland developer

[PATCH] drm/i915: Rollback seqno when request creation fails

2021-12-03 Thread Matthew Brost

gem_ctx_create.basic-files can slam on kernel contexts to the extent
where request creation fails because the ring is full. When this happens
seqno numbers are skipped which can result the below GEM_BUG_ON blowing
in gt/intel_engine_pm.c:__engine_unpark:

GEM_BUG_ON(ce->timeline->seqno !=
   READ_ONCE(*ce->timeline->hwsp_seqno));

Fixup request creation code to roll back seqno when request creation
fails.

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 5 +
 drivers/gpu/drm/i915/gt/intel_timeline.h | 1 +
 drivers/gpu/drm/i915/i915_request.c  | 1 +
 3 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c 
b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 438bbc7b8147..64ea9a90c7a0 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -301,6 +301,11 @@ static u32 timeline_advance(struct intel_timeline *tl)
return tl->seqno += 1 + tl->has_initial_breadcrumb;
 }
 
+void intel_timeline_rollback_seqno(struct intel_timeline *tl)
+{
+   timeline_rollback(tl);
+}
+
 static noinline int
 __intel_timeline_get_seqno(struct intel_timeline *tl,
   u32 *seqno)
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h 
b/drivers/gpu/drm/i915/gt/intel_timeline.h
index 57308c4d664a..a2f2e0ea186f 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.h
@@ -72,6 +72,7 @@ void intel_timeline_enter(struct intel_timeline *tl);
 int intel_timeline_get_seqno(struct intel_timeline *tl,
 struct i915_request *rq,
 u32 *seqno);
+void intel_timeline_rollback_seqno(struct intel_timeline *tl);
 void intel_timeline_exit(struct intel_timeline *tl);
 void intel_timeline_unpin(struct intel_timeline *tl);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index a72c8f0346a0..86f32ee082f7 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -966,6 +966,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 
 err_unwind:
ce->ring->emit = rq->head;
+   intel_timeline_rollback_seqno(tl);
 
/* Make sure we didn't add ourselves to external state before freeing */
GEM_BUG_ON(!list_empty(>sched.signalers_list));
-- 
2.33.1

Re: [PATCH v2 2/8] drm/i915/gtt: add xehpsdv_ppgtt_insert_entry

2021-12-03 Thread Ramalingam C

On 2021-12-03 at 17:31:11 +, Matthew Auld wrote:
> On 03/12/2021 16:59, Ramalingam C wrote:
> > On 2021-12-03 at 12:24:20 +, Matthew Auld wrote:
> > > If this is LMEM then we get a 32 entry PT, with each PTE pointing to
> > > some 64K block of memory, otherwise it's just the usual 512 entry PT.
> > > This very much assumes the caller knows what they are doing.
> > > 
> > > Signed-off-by: Matthew Auld 
> > > Cc: Thomas Hellström 
> > > Cc: Ramalingam C 
> > > ---
> > >   drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 50 ++--
> > >   1 file changed, 48 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> > > b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > > index bd3ca0996a23..312b2267bf87 100644
> > > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > > @@ -728,13 +728,56 @@ static void gen8_ppgtt_insert_entry(struct 
> > > i915_address_space *vm,
> > >   gen8_pdp_for_page_index(vm, idx);
> > >   struct i915_page_directory *pd =
> > >   i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> > > + struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
> > >   gen8_pte_t *vaddr;
> > > - vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> > > + GEM_BUG_ON(pt->is_compact);
> > 
> > Do we have compact PT for smem with 64k pages?
> 
> It's technically possible but we don't bother trying to support it in the
> driver.
Ok.

Reviewed-by: Ramalingam C 
> 
> > 
> > > +
> > > + vaddr = px_vaddr(pt);
> > >   vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, 
> > > flags);
> > >   clflush_cache_range([gen8_pd_index(idx, 0)], 
> > > sizeof(*vaddr));
> > >   }
> > > +static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space 
> > > *vm,
> > > + dma_addr_t addr,
> > > + u64 offset,
> > > + enum i915_cache_level level,
> > > + u32 flags)
> > > +{
> > > + u64 idx = offset >> GEN8_PTE_SHIFT;
> > > + struct i915_page_directory * const pdp =
> > > + gen8_pdp_for_page_index(vm, idx);
> > > + struct i915_page_directory *pd =
> > > + i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> > > + struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
> > > + gen8_pte_t *vaddr;
> > > +
> > > + GEM_BUG_ON(!IS_ALIGNED(addr, SZ_64K));
> > > + GEM_BUG_ON(!IS_ALIGNED(offset, SZ_64K));
> > > +
> > > + if (!pt->is_compact) {
> > > + vaddr = px_vaddr(pd);
> > > + vaddr[gen8_pd_index(idx, 1)] |= GEN12_PDE_64K;
> > > + pt->is_compact = true;
> > > + }
> > > +
> > > + vaddr = px_vaddr(pt);
> > > + vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
> > > +}
> > > +
> > > +static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
> > > +dma_addr_t addr,
> > > +u64 offset,
> > > +enum i915_cache_level level,
> > > +u32 flags)
> > > +{
> > > + if (flags & PTE_LM)
> > > + return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
> > > +level, flags);
> > > +
> > > + return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);
> > Matt,
> > 
> > Is this call for gen8_*** is for insertion of smem PTE entries on the
> > 64K capable platforms like DG2?
> 
> Yeah, this just falls back to the generic 512 entry layout for the PT.
> 
> > 
> > Ram
> > 
> > > +}
> > > +
> > >   static int gen8_init_scratch(struct i915_address_space *vm)
> > >   {
> > >   u32 pte_flags;
> > > @@ -937,7 +980,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt 
> > > *gt,
> > >   ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
> > >   ppgtt->vm.insert_entries = gen8_ppgtt_insert;
> > > - ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
> > > + if (HAS_64K_PAGES(gt->i915))
> > > + ppgtt->vm.insert_page = xehpsdv_ppgtt_insert_entry;
> > > + else
> > > + ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
> > >   ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
> > >   ppgtt->vm.clear_range = gen8_ppgtt_clear;
> > >   ppgtt->vm.foreach = gen8_ppgtt_foreach;
> > > -- 
> > > 2.31.1
> > >

Re: [PATCH v2 4/8] drm/i915/migrate: fix offset calculation

2021-12-03 Thread Matthew Auld


On 03/12/2021 17:30, Ramalingam C wrote:

On 2021-12-03 at 12:24:22 +, Matthew Auld wrote:

Ensure we add the engine base only after we calculate the qword offset
into the PTE window.


So we didn't hit this issue because we were always using the
engine->instance 0!?


Yes, AFAIK.



Looks good to me

Reviewed-by: Ramalingam C 



Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
  drivers/gpu/drm/i915/gt/intel_migrate.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index d553b76b1168..cb0bb3b94644 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -284,10 +284,10 @@ static int emit_pte(struct i915_request *rq,
GEM_BUG_ON(GRAPHICS_VER(rq->engine->i915) < 8);
  
  	/* Compute the page directory offset for the target address range */

-   offset += (u64)rq->engine->instance << 32;
offset >>= 12;
offset *= sizeof(u64);
offset += 2 * CHUNK_SZ;
+   offset += (u64)rq->engine->instance << 32;
  
  	cs = intel_ring_begin(rq, 6);

if (IS_ERR(cs))
--
2.31.1

Re: [PATCH v2 3/8] drm/i915/gtt: add gtt mappable plumbing

2021-12-03 Thread Matthew Auld


On 03/12/2021 17:25, Ramalingam C wrote:

On 2021-12-03 at 12:24:21 +, Matthew Auld wrote:

With object clearing/copying we need to be able to modify the PTEs on
the fly via some batch buffer, which means we need to be able to map the
paging structures(or at the very least the PT, but being able to also
map the PD might also be useful at some point) into the GTT. And since
the paging structures must reside in LMEM on discrete, we need to ensure
that these objects have correct physical alignment, as per any min page
restrictions, like on DG2. This is potentially costly, but this should
be limited to the special migrate_vm, which only needs to a few fixed
sized windows.


Matt,

Just a thought. instead of classifying whole ppgtt as VM_GTT_MAPPABLE
and rounding up the pt size to min_page_size,
could we just add size of pt as parameter into i915_vm_alloc_pt_stash
and alloc_pt, which can be used for vm->alloc_pt_dma() instead of
I915_GTT_PAGE_SIZE_4K.

But PT for a smem entries also needs to be 64k aligned to be mapped into
the GTT right? So no advantage of having the pt_stash level physical
alignment..

Any thoughts on this line?


Yes, this sounds like a good idea. Initially I was worried about stuff 
like gen8_alloc_top_pd() which would skip this, but it looks like we 
only really care about the PT and maybe also the PD having correct 
alignment. Will change.




Ram



Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c |  4 ++--
  drivers/gpu/drm/i915/gem/selftests/huge_pages.c |  2 +-
  drivers/gpu/drm/i915/gt/gen6_ppgtt.c|  2 +-
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c|  3 ++-
  drivers/gpu/drm/i915/gt/gen8_ppgtt.h|  1 +
  drivers/gpu/drm/i915/gt/intel_ggtt.c|  2 +-
  drivers/gpu/drm/i915/gt/intel_gt.c  |  2 +-
  drivers/gpu/drm/i915/gt/intel_gtt.c |  7 +++
  drivers/gpu/drm/i915/gt/intel_gtt.h |  9 +
  drivers/gpu/drm/i915/gt/intel_migrate.c |  4 +++-
  drivers/gpu/drm/i915/gt/intel_ppgtt.c   | 17 -
  drivers/gpu/drm/i915/gt/selftest_hangcheck.c|  2 +-
  drivers/gpu/drm/i915/gvt/scheduler.c|  2 +-
  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c   |  4 ++--
  14 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index ebd775cb1661..b394954726b0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1559,7 +1559,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
} else if (HAS_FULL_PPGTT(i915)) {
struct i915_ppgtt *ppgtt;
  
-		ppgtt = i915_ppgtt_create(>gt, 0);

+   ppgtt = i915_ppgtt_create(>gt, 0, 0);
if (IS_ERR(ppgtt)) {
drm_dbg(>drm, "PPGTT setup failed (%ld)\n",
PTR_ERR(ppgtt));
@@ -1742,7 +1742,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (args->flags)
return -EINVAL;
  
-	ppgtt = i915_ppgtt_create(>gt, 0);

+   ppgtt = i915_ppgtt_create(>gt, 0, 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
  
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c

index bd8dc1a28022..c1b86c7a4754 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1764,7 +1764,7 @@ int i915_gem_huge_page_mock_selftests(void)
mkwrite_device_info(dev_priv)->ppgtt_type = INTEL_PPGTT_FULL;
mkwrite_device_info(dev_priv)->ppgtt_size = 48;
  
-	ppgtt = i915_ppgtt_create(_priv->gt, 0);

+   ppgtt = i915_ppgtt_create(_priv->gt, 0, 0);
if (IS_ERR(ppgtt)) {
err = PTR_ERR(ppgtt);
goto out_unlock;
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index c0d149f04949..778472e563aa 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -443,7 +443,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
  
  	mutex_init(>flush);
  
-	ppgtt_init(>base, gt, 0);

+   ppgtt_init(>base, gt, 0, 0);
ppgtt->base.vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen6_pte_t));
ppgtt->base.vm.top = 1;
  
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c

index 312b2267bf87..dfca803b4ff1 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -912,6 +912,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
   *
   */
  struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
+unsigned long vm_flags,
 unsigned long lmem_pt_obj_flags)
  {
struct i915_ppgtt

Re: [PATCH v2 6/8] drm/i915/selftests: handle object rounding

2021-12-03 Thread Ramalingam C

On 2021-12-03 at 12:24:24 +, Matthew Auld wrote:
> Ensure we account for any object rounding due to min_page_size
> restrictions.
> 
> Signed-off-by: Matthew Auld 

Reviewed-by: Ramalingam C 

> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/selftest_migrate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c 
> b/drivers/gpu/drm/i915/gt/selftest_migrate.c
> index 12ef2837c89b..e21787301bbd 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
> @@ -49,6 +49,7 @@ static int copy(struct intel_migrate *migrate,
>   if (IS_ERR(src))
>   return 0;
>  
> + sz = src->base.size;
>   dst = i915_gem_object_create_internal(i915, sz);
>   if (IS_ERR(dst))
>   goto err_free_src;
> -- 
> 2.31.1
>

Re: [PATCH v2 5/8] drm/i915/migrate: fix length calculation

2021-12-03 Thread Ramalingam C

On 2021-12-03 at 12:24:23 +, Matthew Auld wrote:
> No need to insert PTEs for the PTE window itself, also foreach expects a
> length not an end offset, which could be gigantic here with a second
> engine.
> 
Looks good to me

Reviewed-by: Ramalingam C 

> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/intel_migrate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
> b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index cb0bb3b94644..2076e24e0489 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -136,7 +136,7 @@ static struct i915_address_space *migrate_vm(struct 
> intel_gt *gt)
>   goto err_vm;
>  
>   /* Now allow the GPU to rewrite the PTE via its own ppGTT */
> - vm->vm.foreach(>vm, base, base + sz, insert_pte, );
> + vm->vm.foreach(>vm, base, d.offset - base, insert_pte, );
>   }
>  
>   return >vm;
> -- 
> 2.31.1
>

Re: [PATCH v2 2/8] drm/i915/gtt: add xehpsdv_ppgtt_insert_entry

2021-12-03 Thread Matthew Auld


On 03/12/2021 16:59, Ramalingam C wrote:

On 2021-12-03 at 12:24:20 +, Matthew Auld wrote:

If this is LMEM then we get a 32 entry PT, with each PTE pointing to
some 64K block of memory, otherwise it's just the usual 512 entry PT.
This very much assumes the caller knows what they are doing.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 50 ++--
  1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index bd3ca0996a23..312b2267bf87 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -728,13 +728,56 @@ static void gen8_ppgtt_insert_entry(struct 
i915_address_space *vm,
gen8_pdp_for_page_index(vm, idx);
struct i915_page_directory *pd =
i915_pd_entry(pdp, gen8_pd_index(idx, 2));
+   struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
gen8_pte_t *vaddr;
  
-	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));

+   GEM_BUG_ON(pt->is_compact);


Do we have compact PT for smem with 64k pages?


It's technically possible but we don't bother trying to support it in 
the driver.





+
+   vaddr = px_vaddr(pt);
vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
clflush_cache_range([gen8_pd_index(idx, 0)], sizeof(*vaddr));
  }
  
+static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,

+   dma_addr_t addr,
+   u64 offset,
+   enum i915_cache_level level,
+   u32 flags)
+{
+   u64 idx = offset >> GEN8_PTE_SHIFT;
+   struct i915_page_directory * const pdp =
+   gen8_pdp_for_page_index(vm, idx);
+   struct i915_page_directory *pd =
+   i915_pd_entry(pdp, gen8_pd_index(idx, 2));
+   struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
+   gen8_pte_t *vaddr;
+
+   GEM_BUG_ON(!IS_ALIGNED(addr, SZ_64K));
+   GEM_BUG_ON(!IS_ALIGNED(offset, SZ_64K));
+
+   if (!pt->is_compact) {
+   vaddr = px_vaddr(pd);
+   vaddr[gen8_pd_index(idx, 1)] |= GEN12_PDE_64K;
+   pt->is_compact = true;
+   }
+
+   vaddr = px_vaddr(pt);
+   vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+}
+
+static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
+  dma_addr_t addr,
+  u64 offset,
+  enum i915_cache_level level,
+  u32 flags)
+{
+   if (flags & PTE_LM)
+   return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
+  level, flags);
+
+   return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);

Matt,

Is this call for gen8_*** is for insertion of smem PTE entries on the
64K capable platforms like DG2?


Yeah, this just falls back to the generic 512 entry layout for the PT.



Ram


+}
+
  static int gen8_init_scratch(struct i915_address_space *vm)
  {
u32 pte_flags;
@@ -937,7 +980,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
  
  	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;

ppgtt->vm.insert_entries = gen8_ppgtt_insert;
-   ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
+   if (HAS_64K_PAGES(gt->i915))
+   ppgtt->vm.insert_page = xehpsdv_ppgtt_insert_entry;
+   else
+   ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
ppgtt->vm.clear_range = gen8_ppgtt_clear;
ppgtt->vm.foreach = gen8_ppgtt_foreach;
--
2.31.1

Re: [PATCH v2 4/8] drm/i915/migrate: fix offset calculation

2021-12-03 Thread Ramalingam C

On 2021-12-03 at 12:24:22 +, Matthew Auld wrote:
> Ensure we add the engine base only after we calculate the qword offset
> into the PTE window.

So we didn't hit this issue because we were always using the
engine->instance 0!?

Looks good to me

Reviewed-by: Ramalingam C 

> 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/intel_migrate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
> b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index d553b76b1168..cb0bb3b94644 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -284,10 +284,10 @@ static int emit_pte(struct i915_request *rq,
>   GEM_BUG_ON(GRAPHICS_VER(rq->engine->i915) < 8);
>  
>   /* Compute the page directory offset for the target address range */
> - offset += (u64)rq->engine->instance << 32;
>   offset >>= 12;
>   offset *= sizeof(u64);
>   offset += 2 * CHUNK_SZ;
> + offset += (u64)rq->engine->instance << 32;
>  
>   cs = intel_ring_begin(rq, 6);
>   if (IS_ERR(cs))
> -- 
> 2.31.1
>

Re: [PATCH v2 3/8] drm/i915/gtt: add gtt mappable plumbing

2021-12-03 Thread Ramalingam C

On 2021-12-03 at 12:24:21 +, Matthew Auld wrote:
> With object clearing/copying we need to be able to modify the PTEs on
> the fly via some batch buffer, which means we need to be able to map the
> paging structures(or at the very least the PT, but being able to also
> map the PD might also be useful at some point) into the GTT. And since
> the paging structures must reside in LMEM on discrete, we need to ensure
> that these objects have correct physical alignment, as per any min page
> restrictions, like on DG2. This is potentially costly, but this should
> be limited to the special migrate_vm, which only needs to a few fixed
> sized windows.

Matt,

Just a thought. instead of classifying whole ppgtt as VM_GTT_MAPPABLE
and rounding up the pt size to min_page_size,
could we just add size of pt as parameter into i915_vm_alloc_pt_stash
and alloc_pt, which can be used for vm->alloc_pt_dma() instead of
I915_GTT_PAGE_SIZE_4K.

But PT for a smem entries also needs to be 64k aligned to be mapped into
the GTT right? So no advantage of having the pt_stash level physical
alignment..

Any thoughts on this line?

Ram

> 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c |  4 ++--
>  drivers/gpu/drm/i915/gem/selftests/huge_pages.c |  2 +-
>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c|  2 +-
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c|  3 ++-
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h|  1 +
>  drivers/gpu/drm/i915/gt/intel_ggtt.c|  2 +-
>  drivers/gpu/drm/i915/gt/intel_gt.c  |  2 +-
>  drivers/gpu/drm/i915/gt/intel_gtt.c |  7 +++
>  drivers/gpu/drm/i915/gt/intel_gtt.h |  9 +
>  drivers/gpu/drm/i915/gt/intel_migrate.c |  4 +++-
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c   | 17 -
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c|  2 +-
>  drivers/gpu/drm/i915/gvt/scheduler.c|  2 +-
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c   |  4 ++--
>  14 files changed, 44 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index ebd775cb1661..b394954726b0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1559,7 +1559,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
>   } else if (HAS_FULL_PPGTT(i915)) {
>   struct i915_ppgtt *ppgtt;
>  
> - ppgtt = i915_ppgtt_create(>gt, 0);
> + ppgtt = i915_ppgtt_create(>gt, 0, 0);
>   if (IS_ERR(ppgtt)) {
>   drm_dbg(>drm, "PPGTT setup failed (%ld)\n",
>   PTR_ERR(ppgtt));
> @@ -1742,7 +1742,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, 
> void *data,
>   if (args->flags)
>   return -EINVAL;
>  
> - ppgtt = i915_ppgtt_create(>gt, 0);
> + ppgtt = i915_ppgtt_create(>gt, 0, 0);
>   if (IS_ERR(ppgtt))
>   return PTR_ERR(ppgtt);
>  
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
> b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index bd8dc1a28022..c1b86c7a4754 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -1764,7 +1764,7 @@ int i915_gem_huge_page_mock_selftests(void)
>   mkwrite_device_info(dev_priv)->ppgtt_type = INTEL_PPGTT_FULL;
>   mkwrite_device_info(dev_priv)->ppgtt_size = 48;
>  
> - ppgtt = i915_ppgtt_create(_priv->gt, 0);
> + ppgtt = i915_ppgtt_create(_priv->gt, 0, 0);
>   if (IS_ERR(ppgtt)) {
>   err = PTR_ERR(ppgtt);
>   goto out_unlock;
> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> index c0d149f04949..778472e563aa 100644
> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> @@ -443,7 +443,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
>  
>   mutex_init(>flush);
>  
> - ppgtt_init(>base, gt, 0);
> + ppgtt_init(>base, gt, 0, 0);
>   ppgtt->base.vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen6_pte_t));
>   ppgtt->base.vm.top = 1;
>  
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 312b2267bf87..dfca803b4ff1 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -912,6 +912,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
>   *
>   */
>  struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
> +  unsigned long vm_flags,
>unsigned long lmem_pt_obj_flags)
>  {
>   struct i915_ppgtt *ppgtt;
> @@ -921,7 +922,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>   if (!ppgtt)
>   return ERR_PTR(-ENOMEM);

Re: [PATCH v2 2/8] drm/i915/gtt: add xehpsdv_ppgtt_insert_entry

2021-12-03 Thread Ramalingam C

On 2021-12-03 at 12:24:20 +, Matthew Auld wrote:
> If this is LMEM then we get a 32 entry PT, with each PTE pointing to
> some 64K block of memory, otherwise it's just the usual 512 entry PT.
> This very much assumes the caller knows what they are doing.
> 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 50 ++--
>  1 file changed, 48 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index bd3ca0996a23..312b2267bf87 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -728,13 +728,56 @@ static void gen8_ppgtt_insert_entry(struct 
> i915_address_space *vm,
>   gen8_pdp_for_page_index(vm, idx);
>   struct i915_page_directory *pd =
>   i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> + struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
>   gen8_pte_t *vaddr;
>  
> - vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
> + GEM_BUG_ON(pt->is_compact);

Do we have compact PT for smem with 64k pages?

> +
> + vaddr = px_vaddr(pt);
>   vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
>   clflush_cache_range([gen8_pd_index(idx, 0)], sizeof(*vaddr));
>  }
>  
> +static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
> + dma_addr_t addr,
> + u64 offset,
> + enum i915_cache_level level,
> + u32 flags)
> +{
> + u64 idx = offset >> GEN8_PTE_SHIFT;
> + struct i915_page_directory * const pdp =
> + gen8_pdp_for_page_index(vm, idx);
> + struct i915_page_directory *pd =
> + i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> + struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
> + gen8_pte_t *vaddr;
> +
> + GEM_BUG_ON(!IS_ALIGNED(addr, SZ_64K));
> + GEM_BUG_ON(!IS_ALIGNED(offset, SZ_64K));
> +
> + if (!pt->is_compact) {
> + vaddr = px_vaddr(pd);
> + vaddr[gen8_pd_index(idx, 1)] |= GEN12_PDE_64K;
> + pt->is_compact = true;
> + }
> +
> + vaddr = px_vaddr(pt);
> + vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
> +}
> +
> +static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
> +dma_addr_t addr,
> +u64 offset,
> +enum i915_cache_level level,
> +u32 flags)
> +{
> + if (flags & PTE_LM)
> + return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
> +level, flags);
> +
> + return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);
Matt,

Is this call for gen8_*** is for insertion of smem PTE entries on the
64K capable platforms like DG2?

Ram

> +}
> +
>  static int gen8_init_scratch(struct i915_address_space *vm)
>  {
>   u32 pte_flags;
> @@ -937,7 +980,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>  
>   ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
>   ppgtt->vm.insert_entries = gen8_ppgtt_insert;
> - ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
> + if (HAS_64K_PAGES(gt->i915))
> + ppgtt->vm.insert_page = xehpsdv_ppgtt_insert_entry;
> + else
> + ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
>   ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
>   ppgtt->vm.clear_range = gen8_ppgtt_clear;
>   ppgtt->vm.foreach = gen8_ppgtt_foreach;
> -- 
> 2.31.1
>

Re: [PATCH v2 1/8] drm/i915/migrate: don't check the scratch page

2021-12-03 Thread Ramalingam C

On 2021-12-03 at 12:24:19 +, Matthew Auld wrote:
> The scratch page might not be allocated in LMEM(like on DG2), so instead
> of using that as the deciding factor for where the paging structures
> live, let's just query the pt before mapping it.
> 
Looks good to me.

Reviewed-by: Ramalingam C 

> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/intel_migrate.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
> b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 765c6d48fe52..2d3188a398dd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -13,7 +13,6 @@
>  
>  struct insert_pte_data {
>   u64 offset;
> - bool is_lmem;
>  };
>  
>  #define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
> @@ -41,7 +40,7 @@ static void insert_pte(struct i915_address_space *vm,
>   struct insert_pte_data *d = data;
>  
>   vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
> - d->is_lmem ? PTE_LM : 0);
> + i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0);
>   d->offset += PAGE_SIZE;
>  }
>  
> @@ -135,7 +134,6 @@ static struct i915_address_space *migrate_vm(struct 
> intel_gt *gt)
>   goto err_vm;
>  
>   /* Now allow the GPU to rewrite the PTE via its own ppGTT */
> - d.is_lmem = i915_gem_object_is_lmem(vm->vm.scratch[0]);
>   vm->vm.foreach(>vm, base, base + sz, insert_pte, );
>   }
>  
> -- 
> 2.31.1
>

Re: [PATCH] drm/plane: Move range check for format_count earlier

2021-12-03 Thread Carsten Haitzler


On 12/3/21 13:08, Liviu Dudau wrote:

On Fri, Dec 03, 2021 at 10:28:15AM +, Steven Price wrote:

While the check for format_count > 64 in __drm_universal_plane_init()
shouldn't be hit (it's a WARN_ON), in its current position it will then
leak the plane->format_types array and fail to call
drm_mode_object_unregister() leaking the modeset identifier. Move it to
the start of the function to avoid allocating those resources in the
first place.

Signed-off-by: Steven Price 


Well spotted!

Reviewed-by: Liviu Dudau 

I'm going to wait to see if anyone else has any comments before I'll merge this 
into
drm-misc-fixes (or should it be drm-misc-next-fixes?)


This definitely looks to fix an ugly I spotted too while looking at your 
prior patch thinking it might be wrong so a +1 from me.



Best regards,
Liviu


---
  drivers/gpu/drm/drm_plane.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 82afb854141b..fd0bf90fb4c2 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -249,6 +249,13 @@ static int __drm_universal_plane_init(struct drm_device 
*dev,
if (WARN_ON(config->num_total_plane >= 32))
return -EINVAL;
  
+	/*

+* First driver to need more than 64 formats needs to fix this. Each
+* format is encoded as a bit and the current code only supports a u64.
+*/
+   if (WARN_ON(format_count > 64))
+   return -EINVAL;
+
WARN_ON(drm_drv_uses_atomic_modeset(dev) &&
(!funcs->atomic_destroy_state ||
 !funcs->atomic_duplicate_state));
@@ -270,13 +277,6 @@ static int __drm_universal_plane_init(struct drm_device 
*dev,
return -ENOMEM;
}
  
-	/*

-* First driver to need more than 64 formats needs to fix this. Each
-* format is encoded as a bit and the current code only supports a u64.
-*/
-   if (WARN_ON(format_count > 64))
-   return -EINVAL;
-
if (format_modifiers) {
const uint64_t *temp_modifiers = format_modifiers;
  
--

2.25.1

Re: [PATCH] drm: send vblank event with the attached sequence rather than current

2021-12-03 Thread Matthias Brugger


Hi Mark,

On 02/12/2021 16:11, Mark Yacoub wrote:

From: Mark Yacoub 



please make sure to add the linux-mediatek mailinglist in any follow-up 
communication.


Regards,
Matthias


[Why]
drm_handle_vblank_events loops over vblank_event_list to send any event
that is current or has passed.
More than 1 event could be pending with past sequence time that need to
be send. This can be a side effect of drivers without hardware vblank
counter and they depend on the difference in the timestamps and the
frame/field duration calculated in drm_update_vblank_count. This can
lead to 1 vblirq being ignored due to very small diff, resulting in a
subsequent vblank with 2 pending vblank events to be sent, each with a
unique sequence expected by user space.

[How]
Send each pending vblank event with the sequence it's waiting on instead
of assigning the current sequence to all of them.

Fixes igt@kms_flip "Unexpected frame sequence"
Tested on Jacuzzi (MT8183)

Signed-off-by: Mark Yacoub 
---
  drivers/gpu/drm/drm_vblank.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 3417e1ac79185..47da8056abc14 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -1902,7 +1902,7 @@ static void drm_handle_vblank_events(struct drm_device 
*dev, unsigned int pipe)
  
  		list_del(>base.link);

drm_vblank_put(dev, pipe);
-   send_vblank_event(dev, e, seq, now);
+   send_vblank_event(dev, e, e->sequence, now);
}
  
  	if (crtc && crtc->funcs->get_vblank_timestamp)

[PATCH 7/7] drm/i915: Expose client engine utilisation via fdinfo

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Similar to AMD commit
874442541133 ("drm/amdgpu: Add show_fdinfo() interface"), using the
infrastructure added in previous patches, we add basic client info
and GPU engine utilisation for i915.

Example of the output:

  pos:0
  flags:  012
  mnt_id: 21
  drm-driver: i915
  drm-pdev:   :00:02.0
  drm-client-id:  7
  drm-engine-render:  9288864723 ns
  drm-engine-copy:2035071108 ns
  drm-engine-video:   0 ns
  drm-engine-video-enhance:   0 ns

v2:
 * Update for removal of name and pid.

v3:
 * Use drm_driver.name.

Signed-off-by: Tvrtko Ursulin 
Cc: David M Nieto 
Cc: Christian König 
Cc: Daniel Vetter 
Acked-by: Christian König 
---
 Documentation/gpu/drm-usage-stats.rst  |  6 +++
 Documentation/gpu/i915.rst | 27 ++
 drivers/gpu/drm/i915/i915_driver.c |  3 ++
 drivers/gpu/drm/i915/i915_drm_client.c | 73 ++
 drivers/gpu/drm/i915/i915_drm_client.h |  4 ++
 5 files changed, 113 insertions(+)

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index c669026be244..6952f8389d07 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -95,3 +95,9 @@ object belong to this client, in the respective memory region.
 
 Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
 indicating kibi- or mebi-bytes.
+
+===
+Driver specific implementations
+===
+
+:ref:`i915-usage-stats`
diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index b7d801993bfa..29f412a0c3dc 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -708,3 +708,30 @@ The style guide for ``i915_reg.h``.
 
 .. kernel-doc:: drivers/gpu/drm/i915/i915_reg.h
:doc: The i915 register macro definition style guide
+
+.. _i915-usage-stats:
+
+i915 DRM client usage stats implementation
+==
+
+The drm/i915 driver implements the DRM client usage stats specification as
+documented in :ref:`drm-client-usage-stats`.
+
+Example of the output showing the implemented key value pairs and entirety of
+the currenly possible format options:
+
+::
+
+  pos:0
+  flags:  012
+  mnt_id: 21
+  drm-driver: i915
+  drm-pdev:   :00:02.0
+  drm-client-id:  7
+  drm-engine-render:  9288864723 ns
+  drm-engine-copy:2035071108 ns
+  drm-engine-video:   0 ns
+  drm-engine-video-enhance:   0 ns
+
+Possible `drm-engine-` key names are: `render`, `copy`, `video` and
+`video-enhance`.
diff --git a/drivers/gpu/drm/i915/i915_driver.c 
b/drivers/gpu/drm/i915/i915_driver.c
index a4f8031602b3..3b75e88de1b8 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1753,6 +1753,9 @@ static const struct file_operations i915_driver_fops = {
.read = drm_read,
.compat_ioctl = i915_ioc32_compat_ioctl,
.llseek = noop_llseek,
+#ifdef CONFIG_PROC_FS
+   .show_fdinfo = i915_drm_client_fdinfo,
+#endif
 };
 
 static int
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c 
b/drivers/gpu/drm/i915/i915_drm_client.c
index 91a8559bebf7..06dbd20ce763 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -7,6 +7,11 @@
 #include 
 #include 
 
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
 #include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_utils.h"
@@ -68,3 +73,71 @@ void i915_drm_clients_fini(struct i915_drm_clients *clients)
GEM_BUG_ON(!xa_empty(>xarray));
xa_destroy(>xarray);
 }
+
+#ifdef CONFIG_PROC_FS
+static const char * const uabi_class_names[] = {
+   [I915_ENGINE_CLASS_RENDER] = "render",
+   [I915_ENGINE_CLASS_COPY] = "copy",
+   [I915_ENGINE_CLASS_VIDEO] = "video",
+   [I915_ENGINE_CLASS_VIDEO_ENHANCE] = "video-enhance",
+};
+
+static u64 busy_add(struct i915_gem_context *ctx, unsigned int class)
+{
+   struct i915_gem_engines_iter it;
+   struct intel_context *ce;
+   u64 total = 0;
+
+   for_each_gem_engine(ce, rcu_dereference(ctx->engines), it) {
+   if (ce->engine->uabi_class != class)
+   continue;
+
+   total += intel_context_get_total_runtime_ns(ce);
+   }
+
+   return total;
+}
+
+static void
+show_client_class(struct seq_file *m,
+ struct i915_drm_client *client,
+ unsigned int class)
+{
+   const struct list_head *list = >ctx_list;
+   u64 total = atomic64_read(>past_runtime[class]);
+   struct i915_gem_context *ctx;
+
+   rcu_read_lock();
+   list_for_each_entry_rcu(ctx, list, client_link)
+   total += busy_add(ctx, class);
+   rcu_read_unlock();
+
+   return seq_printf(m, "drm-engine-%s:\t%llu ns\n",
+ uabi_class_names[class], total);
+}
+
+void i915_drm_client_fdinfo(struct

[PATCH 6/7] drm: Document fdinfo format specification

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Proposal to standardise the fdinfo text format as optionally output by DRM
drivers.

Idea is that a simple but, well defined, spec will enable generic
userspace tools to be written while at the same time avoiding a more heavy
handed approach of adding a mid-layer to DRM.

i915 implements a subset of the spec, everything apart from the memory
stats currently, and a matching intel_gpu_top tool exists.

Open is to see if AMD can migrate to using the proposed GPU utilisation
key-value pairs, or if they are not workable to see whether to go
vendor specific, or if a standardised  alternative can be found which is
workable for both drivers.

Same for the memory utilisation key-value pairs proposal.

v2:
 * Update for removal of name and pid.

v3:
 * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)

Signed-off-by: Tvrtko Ursulin 
Cc: David M Nieto 
Cc: Christian König 
Cc: Daniel Vetter 
Cc: Daniel Stone 
Acked-by: Christian König 
---
 Documentation/gpu/drm-usage-stats.rst | 97 +++
 Documentation/gpu/index.rst   |  1 +
 2 files changed, 98 insertions(+)
 create mode 100644 Documentation/gpu/drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
new file mode 100644
index ..c669026be244
--- /dev/null
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -0,0 +1,97 @@
+.. _drm-client-usage-stats:
+
+==
+DRM client usage stats
+==
+
+DRM drivers can choose to export partly standardised text output via the
+`fops->show_fdinfo()` as part of the driver specific file operations registered
+in the `struct drm_driver` object registered with the DRM core.
+
+One purpose of this output is to enable writing as generic as practicaly
+feasible `top(1)` like userspace monitoring tools.
+
+Given the differences between various DRM drivers the specification of the
+output is split between common and driver specific parts. Having said that,
+wherever possible effort should still be made to standardise as much as
+possible.
+
+File format specification
+=
+
+- File shall contain one key value pair per one line of text.
+- Colon character (`:`) must be used to delimit keys and values.
+- All keys shall be prefixed with `drm-`.
+- Whitespace between the delimiter and first non-whitespace character shall be
+  ignored when parsing.
+- Neither keys or values are allowed to contain whitespace characters.
+- Numerical key value pairs can end with optional unit string.
+- Data type of the value is fixed as defined in the specification.
+
+Key types
+-
+
+1. Mandatory, fully standardised.
+2. Optional, fully standardised.
+3. Driver specific.
+
+Data types
+--
+
+-  - Unsigned integer without defining the maximum value.
+-  - String excluding any above defined reserved characters or whitespace.
+
+Mandatory fully standardised keys
+-
+
+- drm-driver: 
+
+String shall contain the name this driver registered as via the respective
+`struct drm_driver` data structure.
+
+Optional fully standardised keys
+
+
+- drm-pdev: 
+
+For PCI devices this should contain the PCI slot address of the device in
+question.
+
+- drm-client-id: 
+
+Unique value relating to the open DRM file descriptor used to distinguish
+duplicated and shared file descriptors. Conceptually the value should map 1:1
+to the in kernel representation of `struct drm_file` instances.
+
+Uniqueness of the value shall be either globally unique, or unique within the
+scope of each device, in which case `drm-pdev` shall be present as well.
+
+Userspace should make sure to not double account any usage statistics by using
+the above described criteria in order to associate data to individual clients.
+
+- drm-engine-:  ns
+
+GPUs usually contain multiple execution engines. Each shall be given a stable
+and unique name (str), with possible values documented in the driver specific
+documentation.
+
+Value shall be in specified time units which the respective GPU engine spent
+busy executing workloads belonging to this client.
+
+Values are not required to be constantly monotonic if it makes the driver
+implementation easier, but are required to catch up with the previously 
reported
+larger value within a reasonable period. Upon observing a value lower than what
+was previously read, userspace is expected to stay with that larger previous
+value until a monotonic update is seen.
+
+- drm-memory-:  [KiB|MiB]
+
+Each possible memory type which can be used to store buffer objects by the
+GPU in question shall be given a stable and unique name to be returned as the
+string here.
+
+Value shall reflect the amount of storage currently consumed by the buffer
+object belong to this client, in the respective memory region.
+
+Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
+indicating kibi-

[PATCH 5/7] drm/i915: Track context current active time

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Track context active (on hardware) status together with the start
timestamp.

This will be used to provide better granularity of context
runtime reporting in conjunction with already tracked pphwsp accumulated
runtime.

The latter is only updated on context save so does not give us visibility
to any currently executing work.

As part of the patch the existing runtime tracking data is moved under the
new ce->stats member and updated under the seqlock. This provides the
ability to atomically read out accumulated plus active runtime.

v2:
 * Rename and make __intel_context_get_active_time unlocked.

v3:
 * Use GRAPHICS_VER.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Aravind Iddamsetty  #  v1
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context.c   | 27 ++-
 drivers/gpu/drm/i915/gt/intel_context.h   | 15 ---
 drivers/gpu/drm/i915/gt/intel_context_types.h | 24 +++--
 .../drm/i915/gt/intel_execlists_submission.c  | 23 
 .../gpu/drm/i915/gt/intel_gt_clock_utils.c|  4 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 27 ++-
 drivers/gpu/drm/i915/gt/intel_lrc.h   | 24 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c| 10 +++
 drivers/gpu/drm/i915/i915_gpu_error.c |  9 +++
 drivers/gpu/drm/i915/i915_gpu_error.h |  2 +-
 10 files changed, 116 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index ba083d800a08..2cabba7c1fc0 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -385,7 +385,7 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
ce->ring = NULL;
ce->ring_size = SZ_4K;
 
-   ewma_runtime_init(>runtime.avg);
+   ewma_runtime_init(>stats.runtime.avg);
 
ce->vm = i915_vm_get(engine->gt->vm);
 
@@ -576,6 +576,31 @@ void intel_context_bind_parent_child(struct intel_context 
*parent,
child->parallel.parent = parent;
 }
 
+u64 intel_context_get_total_runtime_ns(const struct intel_context *ce)
+{
+   u64 total, active;
+
+   total = ce->stats.runtime.total;
+   if (ce->ops->flags & COPS_RUNTIME_CYCLES)
+   total *= ce->engine->gt->clock_period_ns;
+
+   active = READ_ONCE(ce->stats.active);
+   if (active)
+   active = intel_context_clock() - active;
+
+   return total + active;
+}
+
+u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
+{
+   u64 avg = ewma_runtime_read(>stats.runtime.avg);
+
+   if (ce->ops->flags & COPS_RUNTIME_CYCLES)
+   avg *= ce->engine->gt->clock_period_ns;
+
+   return avg;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_context.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 246c37d72cd7..1a60193fa58e 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -350,18 +350,13 @@ intel_context_clear_nopreempt(struct intel_context *ce)
clear_bit(CONTEXT_NOPREEMPT, >flags);
 }
 
-static inline u64 intel_context_get_total_runtime_ns(struct intel_context *ce)
-{
-   const u32 period = ce->engine->gt->clock_period_ns;
-
-   return READ_ONCE(ce->runtime.total) * period;
-}
+u64 intel_context_get_total_runtime_ns(const struct intel_context *ce);
+u64 intel_context_get_avg_runtime_ns(struct intel_context *ce);
 
-static inline u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
+static inline u64 intel_context_clock(void)
 {
-   const u32 period = ce->engine->gt->clock_period_ns;
-
-   return mul_u32_u32(ewma_runtime_read(>runtime.avg), period);
+   /* As we mix CS cycles with CPU clocks, use the raw monotonic clock. */
+   return ktime_get_raw_fast_ns();
 }
 
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 9e0177dc5484..bf26349358f2 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -35,6 +35,9 @@ struct intel_context_ops {
 #define COPS_HAS_INFLIGHT_BIT 0
 #define COPS_HAS_INFLIGHT BIT(COPS_HAS_INFLIGHT_BIT)
 
+#define COPS_RUNTIME_CYCLES_BIT 1
+#define COPS_RUNTIME_CYCLES BIT(COPS_RUNTIME_CYCLES_BIT)
+
int (*alloc)(struct intel_context *ce);
 
void (*ban)(struct intel_context *ce, struct i915_request *rq);
@@ -133,14 +136,19 @@ struct intel_context {
} lrc;
u32 tag; /* cookie passed to HW to track this context on submission */
 
-   /* Time on GPU as tracked by the hw. */
-   struct {
-   struct ewma_runtime avg;
-   u64 total;
-   u32 last;
-   I915_SELFTEST_DECLARE(u32 num_underflow);
-

[PATCH 4/7] drm/i915: Track all user contexts per client

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

We soon want to start answering questions like how much GPU time is the
context belonging to a client which exited still using.

To enable this we start tracking all context belonging to a client on a
separate list.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Aravind Iddamsetty 
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 12 
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h |  3 +++
 drivers/gpu/drm/i915/i915_drm_client.c|  2 ++
 drivers/gpu/drm/i915/i915_drm_client.h|  5 +
 4 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index ae9de81b0e6b..b37b0b45ea68 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1461,6 +1461,7 @@ static void set_closed_name(struct i915_gem_context *ctx)
 
 static void context_close(struct i915_gem_context *ctx)
 {
+   struct i915_drm_client *client;
struct i915_address_space *vm;
 
/* Flush any concurrent set_engines() */
@@ -1498,6 +1499,13 @@ static void context_close(struct i915_gem_context *ctx)
list_del(>link);
spin_unlock(>i915->gem.contexts.lock);
 
+   client = ctx->client;
+   if (client) {
+   spin_lock(>ctx_lock);
+   list_del_rcu(>client_link);
+   spin_unlock(>ctx_lock);
+   }
+
mutex_unlock(>mutex);
 
/*
@@ -1683,6 +1691,10 @@ static void gem_context_register(struct i915_gem_context 
*ctx,
old = xa_store(>context_xa, id, ctx, GFP_KERNEL);
WARN_ON(old);
 
+   spin_lock(>client->ctx_lock);
+   list_add_tail_rcu(>client_link, >client->ctx_list);
+   spin_unlock(>client->ctx_lock);
+
spin_lock(>gem.contexts.lock);
list_add_tail(>link, >gem.contexts.list);
spin_unlock(>gem.contexts.lock);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 93d24f189ba9..5946dcb11cf5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -296,6 +296,9 @@ struct i915_gem_context {
/** @client: struct i915_drm_client */
struct i915_drm_client *client;
 
+   /** link: _client.context_list */
+   struct list_head client_link;
+
/**
 * @ref: reference count
 *
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c 
b/drivers/gpu/drm/i915/i915_drm_client.c
index e61e9ba15256..91a8559bebf7 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -38,6 +38,8 @@ struct i915_drm_client *i915_drm_client_add(struct 
i915_drm_clients *clients)
goto err;
 
kref_init(>kref);
+   spin_lock_init(>ctx_lock);
+   INIT_LIST_HEAD(>ctx_list);
client->clients = clients;
 
return client;
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h 
b/drivers/gpu/drm/i915/i915_drm_client.h
index 9d80d9f715ee..7416e18aa33c 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -7,6 +7,8 @@
 #define __I915_DRM_CLIENT_H__
 
 #include 
+#include 
+#include 
 #include 
 
 #include "gt/intel_engine_types.h"
@@ -25,6 +27,9 @@ struct i915_drm_client {
 
unsigned int id;
 
+   spinlock_t ctx_lock; /* For add/remove from ctx_list. */
+   struct list_head ctx_list; /* List of contexts belonging to client. */
+
struct i915_drm_clients *clients;
 
/**
-- 
2.32.0

[PATCH 3/7] drm/i915: Track runtime spent in closed and unreachable GEM contexts

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

As contexts are abandoned we want to remember how much GPU time they used
(per class) so later we can used it for smarter purposes.

As GEM contexts are closed we want to have the DRM client remember how
much GPU time they used (per class) so later we can used it for smarter
purposes.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Aravind Iddamsetty 
Reviewed-by: Chris Wilson 
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 25 +++--
 drivers/gpu/drm/i915/i915_drm_client.h  |  7 ++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 78c357baa65d..ae9de81b0e6b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1001,23 +1001,44 @@ static void free_engines_rcu(struct rcu_head *rcu)
free_engines(engines);
 }
 
+static void accumulate_runtime(struct i915_drm_client *client,
+  struct i915_gem_engines *engines)
+{
+   struct i915_gem_engines_iter it;
+   struct intel_context *ce;
+
+   if (!client)
+   return;
+
+   /* Transfer accumulated runtime to the parent GEM context. */
+   for_each_gem_engine(ce, engines, it) {
+   unsigned int class = ce->engine->uabi_class;
+
+   GEM_BUG_ON(class >= ARRAY_SIZE(client->past_runtime));
+   atomic64_add(intel_context_get_total_runtime_ns(ce),
+>past_runtime[class]);
+   }
+}
+
 static int
 engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 {
struct i915_gem_engines *engines =
container_of(fence, typeof(*engines), fence);
+   struct i915_gem_context *ctx = engines->ctx;
 
switch (state) {
case FENCE_COMPLETE:
if (!list_empty(>link)) {
-   struct i915_gem_context *ctx = engines->ctx;
unsigned long flags;
 
spin_lock_irqsave(>stale.lock, flags);
list_del(>link);
spin_unlock_irqrestore(>stale.lock, flags);
}
-   i915_gem_context_put(engines->ctx);
+   accumulate_runtime(ctx->client, engines);
+   i915_gem_context_put(ctx);
+
break;
 
case FENCE_FREE:
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h 
b/drivers/gpu/drm/i915/i915_drm_client.h
index e8986ad51176..9d80d9f715ee 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -9,6 +9,8 @@
 #include 
 #include 
 
+#include "gt/intel_engine_types.h"
+
 struct drm_i915_private;
 
 struct i915_drm_clients {
@@ -24,6 +26,11 @@ struct i915_drm_client {
unsigned int id;
 
struct i915_drm_clients *clients;
+
+   /**
+* @past_runtime: Accumulation of pphwsp runtimes from closed contexts.
+*/
+   atomic64_t past_runtime[MAX_ENGINE_CLASS + 1];
 };
 
 void i915_drm_clients_init(struct i915_drm_clients *clients,
-- 
2.32.0

[PATCH 2/7] drm/i915: Make GEM contexts track DRM clients

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Make GEM contexts keep a reference to i915_drm_client for the whole of
of their lifetime which will come handy in following patches.

v2: Don't bother supporting selftests contexts from debugfs. (Chris)
v3 (Lucas): Finish constructing ctx before adding it to the list
v4 (Ram): Rebase.
v5: Trivial rebase for proto ctx changes.
v6: Rebase after clients no longer track name and pid.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Chris Wilson  # v5
Reviewed-by: Aravind Iddamsetty  # v5
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 5 +
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 347dab952e90..78c357baa65d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1227,6 +1227,9 @@ static void i915_gem_context_release_work(struct 
work_struct *work)
if (ctx->pxp_wakeref)
intel_runtime_pm_put(>i915->runtime_pm, ctx->pxp_wakeref);
 
+   if (ctx->client)
+   i915_drm_client_put(ctx->client);
+
mutex_destroy(>engines_mutex);
mutex_destroy(>lut_mutex);
 
@@ -1650,6 +1653,8 @@ static void gem_context_register(struct i915_gem_context 
*ctx,
ctx->file_priv = fpriv;
 
ctx->pid = get_task_pid(current, PIDTYPE_PID);
+   ctx->client = i915_drm_client_get(fpriv->client);
+
snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
 current->comm, pid_nr(ctx->pid));
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 282cdb8a5c5a..93d24f189ba9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -293,6 +293,9 @@ struct i915_gem_context {
/** @link: place with _i915_private.context_list */
struct list_head link;
 
+   /** @client: struct i915_drm_client */
+   struct i915_drm_client *client;
+
/**
 * @ref: reference count
 *
-- 
2.32.0

[PATCH 1/7] drm/i915: Explicitly track DRM clients

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Tracking DRM clients more explicitly will allow later patches to
accumulate past and current GPU usage in a centralised place and also
consolidate access to owning task pid/name.

Unique client id is also assigned for the purpose of distinguishing/
consolidating between multiple file descriptors owned by the same process.

v2:
 Chris Wilson:
 * Enclose new members into dedicated structs.
 * Protect against failed sysfs registration.

v3:
 * sysfs_attr_init.

v4:
 * Fix for internal clients.

v5:
 * Use cyclic ida for client id. (Chris)
 * Do not leak pid reference. (Chris)
 * Tidy code with some locals.

v6:
 * Use xa_alloc_cyclic to simplify locking. (Chris)
 * No need to unregister individial sysfs files. (Chris)
 * Rebase on top of fpriv kref.
 * Track client closed status and reflect in sysfs.

v7:
 * Make drm_client more standalone concept.

v8:
 * Simplify sysfs show. (Chris)
 * Always track name and pid.

v9:
 * Fix cyclic id assignment.

v10:
 * No need for a mutex around xa_alloc_cyclic.
 * Refactor sysfs into own function.
 * Unregister sysfs before freeing pid and name.
 * Move clients setup into own function.

v11:
 * Call clients init directly from driver init. (Chris)

v12:
 * Do not fail client add on id wrap. (Maciej)

v13 (Lucas): Rebase.

v14:
 * Dropped sysfs bits.

v15:
 * Dropped tracking of pid/ and name.
 * Dropped RCU freeing of the client object.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Chris Wilson  # v11
Reviewed-by: Aravind Iddamsetty  # v11
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile  |  1 +
 drivers/gpu/drm/i915/i915_driver.c |  6 +++
 drivers/gpu/drm/i915/i915_drm_client.c | 68 ++
 drivers/gpu/drm/i915/i915_drm_client.h | 50 +++
 drivers/gpu/drm/i915/i915_drv.h|  5 ++
 drivers/gpu/drm/i915/i915_gem.c| 21 ++--
 6 files changed, 148 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3b5857da4123..2a94e7c441a0 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -31,6 +31,7 @@ subdir-ccflags-y += -I$(srctree)/$(src)
 
 # core driver code
 i915-y += i915_driver.o \
+ i915_drm_client.o \
  i915_config.o \
  i915_irq.o \
  i915_getparam.o \
diff --git a/drivers/gpu/drm/i915/i915_driver.c 
b/drivers/gpu/drm/i915/i915_driver.c
index bbc99fc5888f..a4f8031602b3 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -73,6 +73,7 @@
 
 #include "i915_debugfs.h"
 #include "i915_driver.h"
+#include "i915_drm_client.h"
 #include "i915_drv.h"
 #include "i915_ioc32.h"
 #include "i915_irq.h"
@@ -349,6 +350,8 @@ static int i915_driver_early_probe(struct drm_i915_private 
*dev_priv)
 
intel_gt_init_early(_priv->gt, dev_priv);
 
+   i915_drm_clients_init(_priv->clients, dev_priv);
+
i915_gem_init_early(dev_priv);
 
/* This must be called before any calls to HAS_PCH_* */
@@ -368,6 +371,7 @@ static int i915_driver_early_probe(struct drm_i915_private 
*dev_priv)
 
 err_gem:
i915_gem_cleanup_early(dev_priv);
+   i915_drm_clients_fini(_priv->clients);
intel_gt_driver_late_release(_priv->gt);
intel_region_ttm_device_fini(dev_priv);
 err_ttm:
@@ -387,6 +391,7 @@ static void i915_driver_late_release(struct 
drm_i915_private *dev_priv)
intel_irq_fini(dev_priv);
intel_power_domains_cleanup(dev_priv);
i915_gem_cleanup_early(dev_priv);
+   i915_drm_clients_fini(_priv->clients);
intel_gt_driver_late_release(_priv->gt);
intel_region_ttm_device_fini(dev_priv);
vlv_suspend_cleanup(dev_priv);
@@ -1017,6 +1022,7 @@ static void i915_driver_postclose(struct drm_device *dev, 
struct drm_file *file)
struct drm_i915_file_private *file_priv = file->driver_priv;
 
i915_gem_context_close(file);
+   i915_drm_client_put(file_priv->client);
 
kfree_rcu(file_priv, rcu);
 
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c 
b/drivers/gpu/drm/i915/i915_drm_client.c
new file mode 100644
index ..e61e9ba15256
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include "i915_drm_client.h"
+#include "i915_gem.h"
+#include "i915_utils.h"
+
+void i915_drm_clients_init(struct i915_drm_clients *clients,
+  struct drm_i915_private *i915)
+{
+   clients->i915 = i915;
+   clients->next_id = 0;
+
+   xa_init_flags(>xarray, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
+}
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
+{
+   struct i915_drm_client *client;
+   struct xarray *xa = >xarray;
+

[PATCH 0/7] Per client GPU stats

2021-12-03 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Same old work but now rebased and series ending with some DRM docs proposing
the common specification which should enable nice common userspace tools to be
written.

For the moment I only have intel_gpu_top converted to use this and that seems to
work okay.

v2:
 * Added prototype of possible amdgpu changes and spec updates to align with the
   common spec.

v3:
 * Documented that 'drm-driver' tag shall correspond with
   struct drm_driver.name.

v4:
 * Dropped amdgpu conversion from the series for now until AMD folks can find
   some time to finish that patch.

Tvrtko Ursulin (7):
  drm/i915: Explicitly track DRM clients
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track runtime spent in closed and unreachable GEM contexts
  drm/i915: Track all user contexts per client
  drm/i915: Track context current active time
  drm: Document fdinfo format specification
  drm/i915: Expose client engine utilisation via fdinfo

 Documentation/gpu/drm-usage-stats.rst | 103 +
 Documentation/gpu/i915.rst|  27 
 Documentation/gpu/index.rst   |   1 +
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  42 -
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   6 +
 drivers/gpu/drm/i915/gt/intel_context.c   |  27 +++-
 drivers/gpu/drm/i915/gt/intel_context.h   |  15 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  24 ++-
 .../drm/i915/gt/intel_execlists_submission.c  |  23 ++-
 .../gpu/drm/i915/gt/intel_gt_clock_utils.c|   4 +
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  27 ++--
 drivers/gpu/drm/i915/gt/intel_lrc.h   |  24 +++
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  10 +-
 drivers/gpu/drm/i915/i915_driver.c|   9 ++
 drivers/gpu/drm/i915/i915_drm_client.c| 143 ++
 drivers/gpu/drm/i915/i915_drm_client.h|  66 
 drivers/gpu/drm/i915/i915_drv.h   |   5 +
 drivers/gpu/drm/i915/i915_gem.c   |  21 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.h |   2 +-
 21 files changed, 535 insertions(+), 54 deletions(-)
 create mode 100644 Documentation/gpu/drm-usage-stats.rst
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

-- 
2.32.0

[PATCH] dt-bindings: panel: Include SPI peripheral properties

2021-12-03 Thread Thierry Reding

From: Thierry Reding 

SPI panels need to reference the SPI peripheral properties so that they
can be properly validated.

Signed-off-by: Thierry Reding 
---
 .../devicetree/bindings/display/panel/lgphilips,lb035q02.yaml| 1 +
 .../devicetree/bindings/display/panel/sony,acx565akm.yaml| 1 +
 2 files changed, 2 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml 
b/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml
index 830e335ddb53..240a884b7fa7 100644
--- a/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml
+++ b/Documentation/devicetree/bindings/display/panel/lgphilips,lb035q02.yaml
@@ -14,6 +14,7 @@ maintainers:
   - Tomi Valkeinen 
 
 allOf:
+  - $ref: ../../spi/spi-peripheral-props.yaml
   - $ref: panel-common.yaml#
 
 properties:
diff --git 
a/Documentation/devicetree/bindings/display/panel/sony,acx565akm.yaml 
b/Documentation/devicetree/bindings/display/panel/sony,acx565akm.yaml
index 95d053c548ab..4459d746592f 100644
--- a/Documentation/devicetree/bindings/display/panel/sony,acx565akm.yaml
+++ b/Documentation/devicetree/bindings/display/panel/sony,acx565akm.yaml
@@ -14,6 +14,7 @@ maintainers:
   - Tomi Valkeinen 
 
 allOf:
+  - $ref: ../../spi/spi-peripheral-props.yaml
   - $ref: panel-common.yaml#
 
 properties:
-- 
2.33.1

Re: [PATCH v2] drm/dp: Actually read Adjust Request Post Cursor2 register

2021-12-03 Thread Thierry Reding

On Fri, Dec 03, 2021 at 01:25:17AM -0800, Kees Cook wrote:
> The link_status array was not large enough to read the Adjust Request
> Post Cursor2 register. Adjust the size to include it. Found with a
> -Warray-bounds build:
> 
> drivers/gpu/drm/drm_dp_helper.c: In function 
> 'drm_dp_get_adjust_request_post_cursor':
> drivers/gpu/drm/drm_dp_helper.c:59:27: error: array subscript 10 is outside 
> array bounds of 'const u8[6]' {aka 'const unsigned char[6]'} 
> [-Werror=array-bounds]
>59 | return link_status[r - DP_LANE0_1_STATUS];
>   |~~~^~~
> drivers/gpu/drm/drm_dp_helper.c:147:51: note: while referencing 'link_status'
>   147 | u8 drm_dp_get_adjust_request_post_cursor(const u8 
> link_status[DP_LINK_STATUS_SIZE],
>   |  
> ~^~~~
> 
> Fixes: 79465e0ffeb9 ("drm/dp: Add helper to get post-cursor adjustments")
> Signed-off-by: Kees Cook 
> ---
> v2: Fix missed array size change in intel_dp_check_mst_status()
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c |  8 
>  include/drm/drm_dp_helper.h | 10 +-
>  2 files changed, 13 insertions(+), 5 deletions(-)

This sounds very familiar and I vaguely recall typing up a patch like
that a long time ago. But I obviously failed because that never seems
to have made it upstream.

Or perhaps I'm misremembering and was thinking about this instead:


https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/338590/

Bonus points for adding that comment with background information on why
we need this.

Reviewed-by: Thierry Reding 


signature.asc
Description: PGP signature

Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

2021-12-03 Thread Intel




On 12/3/21 16:00, Christian König wrote:

Am 03.12.21 um 15:50 schrieb Thomas Hellström:


On 12/3/21 15:26, Christian König wrote:

[Adding Daniel here as well]

Am 03.12.21 um 15:18 schrieb Thomas Hellström:

[SNIP]

Well that's ok as well. My question is why does this single dma_fence
then shows up in the dma_fence_chain representing the whole
migration?

What we'd like to happen during eviction is that we

1) await any exclusive- or moving fences, then schedule the migration
blit. The blit manages its own GPU ptes. Results in a single fence.
2) Schedule unbind of any gpu vmas, resulting possibly in multiple
fences.
3) Most but not all of the remaining resv shared fences will have been
finished in 2) We can't easily tell which so we have a couple of 
shared

fences left.


Stop, wait a second here. We are going a bit in circles.

Before you migrate a buffer, you *MUST* wait for all shared fences 
to complete. This is documented mandatory DMA-buf behavior.


Daniel and I have discussed that quite extensively in the last few 
month.


So how does it come that you do the blit before all shared fences 
are completed?


Well we don't currently but wanted to... (I haven't consulted Daniel 
in the matter, tbh).


I was under the impression that all writes would add an exclusive 
fence to the dma_resv.


Yes that's correct. I'm working on to have more than one write fence, 
but that is currently under review.


If that's not the case or this is otherwise against the mandatory 
DMA-buf bevhavior, we can certainly keep that part as is and that 
would eliminate 3).


Ah, now that somewhat starts to make sense.

So your blit only waits for the writes to finish before starting the 
blit. Yes that's legal as long as you don't change the original 
content with the blit.


But don't you then need to wait for both reads and writes before you 
unmap the VMAs?


Yes, but that's planned to be done all async, and those unbind jobs are 
scheduled simultaneosly with the blit, and the blit itself manages its 
own page-table-entries, so no need to unbind any blit vmas.




Anyway the good news is your problem totally goes away with the 
DMA-resv rework I've already send out. Basically it is now possible to 
have more than one fence in the DMA-resv object for migrations and all 
existing fences are kept around until they are finished.


Sounds good.

Thanks,

Thomas

Re: [PATCH v7 0/9] drm/omap: Add virtual-planes support

2021-12-03 Thread Neil Armstrong

Hi,

On 03/12/2021 12:31, Tomi Valkeinen wrote:
> On 17/11/2021 16:19, Neil Armstrong wrote:
>> This patchset is the follow-up the v4 patchset from Benoit Parrot at [1].
>>
>> This patch series adds virtual-plane support to omapdrm driver to allow the 
>> use
>> of display wider than 2048 pixels.
>>
>> In order to do so we introduce the concept of hw_overlay which can then be
>> dynamically allocated to a plane. When the requested output width exceed what
>> be supported by one overlay a second is then allocated if possible to handle
>> display wider then 2048.
>>
>> This series replaces an earlier series which was DT based and using 
>> statically
>> allocated resources.
>>
>> This implementation is inspired from the work done in msm/disp/mdp5
>> driver.
> 
> I think this looks good. I'll apply this to my work tree to see if I see any 
> issues during my daily work, and if not, I'll push to drm-misc-next.
> 
> Have you tested this with other platforms than x15? I'm mostly thinking about 
> omap3/4, as the DSS is somewhat different on those platforms.

I haven't tested the last version non non-x15, but I haven't changed the logic 
from the original patchset which was in the TI BSP
tree for multiple years.

Neil

> 
>  Tomi

Re: [PATCH] drm/komeda: return early if drm_universal_plane_init() fails.

2021-12-03 Thread Carsten Haitzler


On 12/3/21 14:15, Carsten Haitzler wrote:

On 12/3/21 10:09, Liviu Dudau wrote:
If drm_universal_plane_init() fails early we jump to the common 
cleanup code
that calls komeda_plane_destroy() which in turn could access the 
uninitalised
drm_plane and crash. Return early if an error is detected without 
going through

the common code.

Reported-by: Steven Price 
Signed-off-by: Liviu Dudau 
---
  drivers/gpu/drm/arm/display/komeda/komeda_plane.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c 
b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c

index aa193c58f4bf6d9..517b94c3bcaf966 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
@@ -279,8 +279,10 @@ static int komeda_plane_add(struct komeda_kms_dev 
*kms,

  komeda_put_fourcc_list(formats);
-    if (err)
-    goto cleanup;
+    if (err) {
+    kfree(kplane);
+    return err;
+    }
  drm_plane_helper_add(plane, _plane_helper_funcs);



Ummm... can I disagree here? this goto cleanup I think is just fine 
because plane has been set before drm_universal_plane_init() is called 
which is before the if (err) here. komeda_plane_destroy() in cleanup: 
does all the right things, so this patch isn't needed. I think it's less 
clean as it adds a new "handle error" path special-case instance where a 
special case is not needed. The fix to Zhou's original patch was needed 
for exactly the reason Liviu said - but not this one... ?


Let me take that back - it seems an init fail shouldn't call cleanup but 
the init fail doesn't quite cleanup properly. Steven found this and 
already sent a patch.

Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

2021-12-03 Thread Christian König


Am 03.12.21 um 15:50 schrieb Thomas Hellström:


On 12/3/21 15:26, Christian König wrote:

[Adding Daniel here as well]

Am 03.12.21 um 15:18 schrieb Thomas Hellström:

[SNIP]

Well that's ok as well. My question is why does this single dma_fence
then shows up in the dma_fence_chain representing the whole
migration?

What we'd like to happen during eviction is that we

1) await any exclusive- or moving fences, then schedule the migration
blit. The blit manages its own GPU ptes. Results in a single fence.
2) Schedule unbind of any gpu vmas, resulting possibly in multiple
fences.
3) Most but not all of the remaining resv shared fences will have been
finished in 2) We can't easily tell which so we have a couple of shared
fences left.


Stop, wait a second here. We are going a bit in circles.

Before you migrate a buffer, you *MUST* wait for all shared fences to 
complete. This is documented mandatory DMA-buf behavior.


Daniel and I have discussed that quite extensively in the last few 
month.


So how does it come that you do the blit before all shared fences are 
completed?


Well we don't currently but wanted to... (I haven't consulted Daniel 
in the matter, tbh).


I was under the impression that all writes would add an exclusive 
fence to the dma_resv.


Yes that's correct. I'm working on to have more than one write fence, 
but that is currently under review.


If that's not the case or this is otherwise against the mandatory 
DMA-buf bevhavior, we can certainly keep that part as is and that 
would eliminate 3).


Ah, now that somewhat starts to make sense.

So your blit only waits for the writes to finish before starting the 
blit. Yes that's legal as long as you don't change the original content 
with the blit.


But don't you then need to wait for both reads and writes before you 
unmap the VMAs?


Anyway the good news is your problem totally goes away with the DMA-resv 
rework I've already send out. Basically it is now possible to have more 
than one fence in the DMA-resv object for migrations and all existing 
fences are kept around until they are finished.


Regards,
Christian.



/Thomas

Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

2021-12-03 Thread Thomas Hellström




On 12/3/21 15:26, Christian König wrote:

[Adding Daniel here as well]

Am 03.12.21 um 15:18 schrieb Thomas Hellström:

[SNIP]

Well that's ok as well. My question is why does this single dma_fence
then shows up in the dma_fence_chain representing the whole
migration?

What we'd like to happen during eviction is that we

1) await any exclusive- or moving fences, then schedule the migration
blit. The blit manages its own GPU ptes. Results in a single fence.
2) Schedule unbind of any gpu vmas, resulting possibly in multiple
fences.
3) Most but not all of the remaining resv shared fences will have been
finished in 2) We can't easily tell which so we have a couple of shared
fences left.


Stop, wait a second here. We are going a bit in circles.

Before you migrate a buffer, you *MUST* wait for all shared fences to 
complete. This is documented mandatory DMA-buf behavior.


Daniel and I have discussed that quite extensively in the last few month.

So how does it come that you do the blit before all shared fences are 
completed?


Well we don't currently but wanted to... (I haven't consulted Daniel in 
the matter, tbh).


I was under the impression that all writes would add an exclusive fence 
to the dma_resv. If that's not the case or this is otherwise against the 
mandatory DMA-buf bevhavior, we can certainly keep that part as is and 
that would eliminate 3).


/Thomas

Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

2021-12-03 Thread Christian König


[Adding Daniel here as well]

Am 03.12.21 um 15:18 schrieb Thomas Hellström:

[SNIP]

Well that's ok as well. My question is why does this single dma_fence
then shows up in the dma_fence_chain representing the whole
migration?

What we'd like to happen during eviction is that we

1) await any exclusive- or moving fences, then schedule the migration
blit. The blit manages its own GPU ptes. Results in a single fence.
2) Schedule unbind of any gpu vmas, resulting possibly in multiple
fences.
3) Most but not all of the remaining resv shared fences will have been
finished in 2) We can't easily tell which so we have a couple of shared
fences left.


Stop, wait a second here. We are going a bit in circles.

Before you migrate a buffer, you *MUST* wait for all shared fences to 
complete. This is documented mandatory DMA-buf behavior.


Daniel and I have discussed that quite extensively in the last few month.

So how does it come that you do the blit before all shared fences are 
completed?



4) Add all fences resulting from 1) 2) and 3) into the per-memory-type
dma-fence-chain.
5) hand the resulting dma-fence-chain representing the end of migration
over to ttm's resource manager.

Now this means we have a dma-fence-chain disguised as a dma-fence out
in the wild, and it could in theory reappear as a 3) fence for another
migration unless a very careful audit is done, or as an input to the
dma-fence-array used for that single dependency.


That somehow doesn't seem to make sense because each individual step
of
the migration needs to wait for those dependencies as well even when
it
runs in parallel.


But that's not really the point, the point was that an (at least to
me) seemingly harmless usage pattern, be it real or fictious, ends
up
giving you severe internal- or cross-driver headaches.

Yeah, we probably should document that better. But in general I don't
see much reason to allow mixing containers. The dma_fence_array and
dma_fence_chain objects have some distinct use cases and and using
them
to build up larger dependency structures sounds really questionable.

Yes, I tend to agree to some extent here. Perhaps add warnings when
adding a chain or array as an input to array and when accidently
joining chains, and provide helpers for flattening if needed.


Yeah, that's probably a really good idea. Going to put it on my todo list.

Thanks,
Christian.



/Thomas



Christian.


/Thomas



Regards,
Christian.

Re: [PATCH v7 6/6] drm/sprd: add Unisoc's drm mipi dsi driver

2021-12-03 Thread Maxime Ripard

On Fri, Dec 03, 2021 at 08:34:50PM +0800, Kevin Tang wrote:
> Maxime Ripard  于2021年12月3日周五 18:38写道：
> >
> > On Mon, Oct 25, 2021 at 05:34:18PM +0800, Kevin Tang wrote:
> > > @@ -618,9 +619,25 @@ static void sprd_crtc_mode_set_nofb(struct drm_crtc 
> > > *crtc)
> > >  {
> > >   struct sprd_dpu *dpu = to_sprd_crtc(crtc);
> > >   struct drm_display_mode *mode = >state->adjusted_mode;
> > > + struct drm_encoder *encoder;
> > > + struct mipi_dsi_device *slave;
> > > + struct sprd_dsi *dsi;
> > >
> > >   drm_display_mode_to_videomode(mode, >ctx.vm);
> > >
> > > + drm_for_each_encoder(encoder, crtc->dev) {
> > > + if (encoder->crtc != crtc)
> > > + continue;
> >
> > encoder->crtc is deprecated. You should be using
> > encoder->drm_for_each_encoder_mask, using the encoder_mask in
> > encoder->drm_crtc_state.
> 
> Use drm_for_each_encoder_mask to replace drm_for_each_encoder? like this:
> drm_for_each_encoder_mask(encoder, crtc->dev, crtc->state->encoder_mask) {
> dsi = encoder_to_dsi(encoder);
> slave = dsi->slave;
> 
> if (slave->mode_flags & MIPI_DSI_MODE_VIDEO)
> dpu->ctx.if_type = SPRD_DPU_IF_DPI;
> else
>  dpu->ctx.if_type = SPRD_DPU_IF_EDPI;
> }

Yes

Maxime


signature.asc
Description: PGP signature

[PATCH/RFC v2] dt-bindings: display: bridge: sil, sii9022: Convert to json-schema

2021-12-03 Thread Geert Uytterhoeven

Convert the Silicon Image sii902x HDMI bridge Device Tree binding
documentation to json-schema.

Add missing sil,sii9022-cpi and sil,sii9022-tpi compatible values.

Signed-off-by: Geert Uytterhoeven 
---
RFC as I do not know the meaning of the various ports subnodes.

v2:
  - Rework sil,i2s-data-lanes,
  - Add schema reference to ports.
---
 .../bindings/display/bridge/sii902x.txt   |  78 --
 .../bindings/display/bridge/sil,sii9022.yaml  | 133 ++
 2 files changed, 133 insertions(+), 78 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/display/bridge/sii902x.txt
 create mode 100644 
Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml

diff --git a/Documentation/devicetree/bindings/display/bridge/sii902x.txt 
b/Documentation/devicetree/bindings/display/bridge/sii902x.txt
deleted file mode 100644
index 3bc760cc31cbbeee..
--- a/Documentation/devicetree/bindings/display/bridge/sii902x.txt
+++ /dev/null
@@ -1,78 +0,0 @@
-sii902x HDMI bridge bindings
-
-Required properties:
-   - compatible: "sil,sii9022"
-   - reg: i2c address of the bridge
-
-Optional properties:
-   - interrupts: describe the interrupt line used to inform the host
- about hotplug events.
-   - reset-gpios: OF device-tree gpio specification for RST_N pin.
-   - iovcc-supply: I/O Supply Voltage (1.8V or 3.3V)
-   - cvcc12-supply: Digital Core Supply Voltage (1.2V)
-
-   HDMI audio properties:
-   - #sound-dai-cells: <0> or <1>. <0> if only i2s or spdif pin
-  is wired, <1> if the both are wired. HDMI audio is
-  configured only if this property is found.
-   - sil,i2s-data-lanes: Array of up to 4 integers with values of 0-3
-  Each integer indicates which i2s pin is connected to which
-  audio fifo. The first integer selects i2s audio pin for the
-  first audio fifo#0 (HDMI channels 1&2), second for fifo#1
-  (HDMI channels 3&4), and so on. There is 4 fifos and 4 i2s
-  pins (SD0 - SD3). Any i2s pin can be connected to any fifo,
-  but there can be no gaps. E.g. an i2s pin must be mapped to
-  fifo#0 and fifo#1 before mapping a channel to fifo#2. Default
-  value is <0>, describing SD0 pin beiging routed to hdmi audio
-  fifo #0.
-   - clocks: phandle and clock specifier for each clock listed in
-   the clock-names property
-   - clock-names: "mclk"
-  Describes SII902x MCLK input. MCLK can be used to produce
-  HDMI audio CTS values. This property follows
-  Documentation/devicetree/bindings/clock/clock-bindings.txt
-  consumer binding.
-
-   If HDMI audio is configured the sii902x device becomes an I2S
-   and/or spdif audio codec component (e.g a digital audio sink),
-   that can be used in configuring a full audio devices with
-   simple-card or audio-graph-card binding. See their binding
-   documents on how to describe the way the sii902x device is
-   connected to the rest of the audio system:
-   Documentation/devicetree/bindings/sound/simple-card.yaml
-   Documentation/devicetree/bindings/sound/audio-graph-card.yaml
-   Note: In case of the audio-graph-card binding the used port
-   index should be 3.
-
-Optional subnodes:
-   - video input: this subnode can contain a video input port node
- to connect the bridge to a display controller output (See this
- documentation [1]).
-
-[1]: Documentation/devicetree/bindings/media/video-interfaces.txt
-
-Example:
-   hdmi-bridge@39 {
-   compatible = "sil,sii9022";
-   reg = <0x39>;
-   reset-gpios = < 1 0>;
-   iovcc-supply = <_hdmi>;
-   cvcc12-supply = <_hdmi>;
-
-   #sound-dai-cells = <0>;
-   sil,i2s-data-lanes = < 0 1 2 >;
-   clocks = <>;
-   clock-names = "mclk";
-
-   ports {
-   #address-cells = <1>;
-   #size-cells = <0>;
-
-   port@0 {
-   reg = <0>;
-   bridge_in: endpoint {
-   remote-endpoint = <_out>;
-   };
-   };
-   };
-   };
diff --git a/Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml 
b/Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml
new file mode 100644
index ..b39537f4fe8694ef
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml
@@ -0,0 +1,133 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/bridge/sil,sii9022.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Silicon Image sii902x HDMI bridge
+
+maintainers:
+  - Boris Brezillon 
+
+properties:
+

Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

2021-12-03 Thread Thomas Hellström

On Fri, 2021-12-03 at 14:08 +0100, Christian König wrote:
> Am 01.12.21 um 13:16 schrieb Thomas Hellström (Intel):
> > 
> > On 12/1/21 12:25, Christian König wrote:
> > > Am 01.12.21 um 12:04 schrieb Thomas Hellström (Intel):
> > > > 
> > > > On 12/1/21 11:32, Christian König wrote:
> > > > > Am 01.12.21 um 11:15 schrieb Thomas Hellström (Intel):
> > > > > > [SNIP]
> > > > > > > 
> > > > > > > What we could do is to avoid all this by not calling the
> > > > > > > callback 
> > > > > > > with the lock held in the first place.
> > > > > > 
> > > > > > If that's possible that might be a good idea, pls also see
> > > > > > below.
> > > > > 
> > > > > The problem with that is 
> > > > > dma_fence_signal_locked()/dma_fence_signal_timestamp_locked()
> > > > > . If 
> > > > > we could avoid using that or at least allow it to drop the
> > > > > lock 
> > > > > then we could call the callback without holding it.
> > > > > 
> > > > > Somebody would need to audit the drivers and see if holding
> > > > > the 
> > > > > lock is really necessary anywhere.
> > > > > 
> > > > > > > 
> > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > /Thomas
> > > > > > > > > > 
> > > > > > > > > > Oh, and a follow up question:
> > > > > > > > > > 
> > > > > > > > > > If there was a way to break the recursion on final
> > > > > > > > > > put() 
> > > > > > > > > > (using the same basic approach as patch 2 in this
> > > > > > > > > > series uses 
> > > > > > > > > > to break recursion in enable_signaling()), so that
> > > > > > > > > > none of 
> > > > > > > > > > these containers did require any special treatment,
> > > > > > > > > > would it 
> > > > > > > > > > be worth pursuing? I guess it might be possible by
> > > > > > > > > > having the 
> > > > > > > > > > callbacks drop the references rather than the loop
> > > > > > > > > > in the 
> > > > > > > > > > final put. + a couple of changes in code iterating
> > > > > > > > > > over the 
> > > > > > > > > > fence pointers.
> > > > > > > > > 
> > > > > > > > > That won't really help, you just move the recursion
> > > > > > > > > from the 
> > > > > > > > > final put into the callback.
> > > > > > > > 
> > > > > > > > How do we recurse from the callback? The introduced
> > > > > > > > fence_put() 
> > > > > > > > of individual fence pointers
> > > > > > > > doesn't recurse anymore (at most 1 level), and any
> > > > > > > > callback 
> > > > > > > > recursion is broken by the irq_work?
> > > > > > > 
> > > > > > > Yeah, but then you would need to take another lock to
> > > > > > > avoid 
> > > > > > > racing with dma_fence_array_signaled().
> > > > > > > 
> > > > > > > > 
> > > > > > > > I figure the big amount of work would be to adjust code
> > > > > > > > that 
> > > > > > > > iterates over the individual fence pointers to
> > > > > > > > recognize that 
> > > > > > > > they are rcu protected.
> > > > > > > 
> > > > > > > Could be that we could solve this with RCU, but that
> > > > > > > sounds like 
> > > > > > > a lot of churn for no gain at all.
> > > > > > > 
> > > > > > > In other words even with the problems solved I think it
> > > > > > > would be 
> > > > > > > a really bad idea to allow chaining of dma_fence_array
> > > > > > > objects.
> > > > > > 
> > > > > > Yes, that was really the question, Is it worth pursuing
> > > > > > this? I'm 
> > > > > > not really suggesting we should allow this as an
> > > > > > intentional 
> > > > > > feature. I'm worried, however, that if we allow these
> > > > > > containers 
> > > > > > to start floating around cross-driver (or even internally) 
> > > > > > disguised as ordinary dma_fences, they would require a lot
> > > > > > of 
> > > > > > driver special casing, or else completely unexpeced
> > > > > > WARN_ON()s and 
> > > > > > lockdep splats would start to turn up, scaring people off
> > > > > > from 
> > > > > > using them. And that would be a breeding ground for hairy 
> > > > > > driver-private constructs.
> > > > > 
> > > > > Well the question is why we would want to do it?
> > > > > 
> > > > > If it's to avoid inter driver lock dependencies by avoiding
> > > > > to call 
> > > > > the callback with the spinlock held, then yes please. We had
> > > > > tons 
> > > > > of problems with that, resulting in irq_work and work_item 
> > > > > delegation all over the place.
> > > > 
> > > > Yes, that sounds like something desirable, but in these
> > > > containers, 
> > > > what's causing the lock dependencies is the enable_signaling() 
> > > > callback that is typically called locked.
> > > > 
> > > > 
> > > > > 
> > > > > If it's to allow nesting of dma_fence_array instances, then
> > > > > it's 
> > > > > most likely a really bad idea even if we fix all the locking
> > > > > order 
> > > > > problems.
> > > > 
> > > > Well I think my use-case where I hit a dead end may illustrate
> > > > what 
> > > > worries me here:
> > > > 
> > > > 1) We use a dma-fence-array to coalesce all dependencies for
> > > > ttm 
> > > > object

Re: [PATCH] drm/komeda: return early if drm_universal_plane_init() fails.

2021-12-03 Thread Carsten Haitzler


On 12/3/21 10:09, Liviu Dudau wrote:

If drm_universal_plane_init() fails early we jump to the common cleanup code
that calls komeda_plane_destroy() which in turn could access the uninitalised
drm_plane and crash. Return early if an error is detected without going through
the common code.

Reported-by: Steven Price 
Signed-off-by: Liviu Dudau 
---
  drivers/gpu/drm/arm/display/komeda/komeda_plane.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c 
b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
index aa193c58f4bf6d9..517b94c3bcaf966 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
@@ -279,8 +279,10 @@ static int komeda_plane_add(struct komeda_kms_dev *kms,
  
  	komeda_put_fourcc_list(formats);
  
-	if (err)

-   goto cleanup;
+   if (err) {
+   kfree(kplane);
+   return err;
+   }
  
  	drm_plane_helper_add(plane, _plane_helper_funcs);
  



Ummm... can I disagree here? this goto cleanup I think is just fine 
because plane has been set before drm_universal_plane_init() is called 
which is before the if (err) here. komeda_plane_destroy() in cleanup: 
does all the right things, so this patch isn't needed. I think it's less 
clean as it adds a new "handle error" path special-case instance where a 
special case is not needed. The fix to Zhou's original patch was needed 
for exactly the reason Liviu said - but not this one... ?

Re: [PATCH v2 0/6] drm/vc4: kms: Misc fixes for HVS commits

2021-12-03 Thread Maxime Ripard

On Mon, Nov 29, 2021 at 04:31:57PM +0800, Jian-Hong Pan wrote:
> Maxime Ripard  於 2021年11月26日 週五 下午11:45寫道：
> >
> > On Fri, Nov 19, 2021 at 06:24:34PM +0800, Jian-Hong Pan wrote:
> > > Maxime Ripard  於 2021年11月18日 週四 下午6:40寫道：
> > > >
> > > > On Thu, Nov 18, 2021 at 02:42:58PM +0800, Jian-Hong Pan wrote:
> > > > > Maxime Ripard  於 2021年11月17日 週三 下午5:45寫道：
> > > > > > The conversion to DRM commit helpers (f3c420fe19f8, "drm/vc4: kms: 
> > > > > > Convert to
> > > > > > atomic helpers") introduced a number of issues in corner cases, 
> > > > > > most of them
> > > > > > showing themselves in the form of either a vblank timeout or 
> > > > > > use-after-free
> > > > > > error.
> > > > > >
> > > > > > These patches should fix most of them, some of them still being 
> > > > > > debugged.
> > > > > >
> > > > > > Maxime
> > > > > >
> > > > > > Changes from v1:
> > > > > >   - Prevent a null pointer dereference
> > > > > >
> > > > > > Maxime Ripard (6):
> > > > > >   drm/vc4: kms: Wait for the commit before increasing our clock rate
> > > > > >   drm/vc4: kms: Fix return code check
> > > > > >   drm/vc4: kms: Add missing drm_crtc_commit_put
> > > > > >   drm/vc4: kms: Clear the HVS FIFO commit pointer once done
> > > > > >   drm/vc4: kms: Don't duplicate pending commit
> > > > > >   drm/vc4: kms: Fix previous HVS commit wait
> > > > > >
> > > > > >  drivers/gpu/drm/vc4/vc4_kms.c | 42 
> > > > > > ---
> > > > > >  1 file changed, 19 insertions(+), 23 deletions(-)
> > > > >
> > > > > I tested the v2 patches based on latest mainline kernel with RPi 4B.
> > > > > System can boot up into desktop environment.
> > > >
> > > > So the thing that was broken initially isn't anymore?
> > >
> > > No.  It does not appear anymore.
> > >
> > > > > Although it still hit the bug [1], which might be under debugging, I
> > > > > think these patches LGTM.
> > > >
> > > > The vblank timeout stuff is a symptom of various different bugs. Can you
> > > > share your setup, your config.txt, and what you're doing to trigger it?
> > >
> > > I get the RPi boot firmware files from raspberrypi's repository at tag
> > > 1.20211029 [1]
> > > And, make system boots:
> > > RPi firmware -> U-Boot -> Linux kernel (aarch64) with corresponding DTB
> > >
> > > The config.txt only has:
> > > enable_uart=1
> > > arm_64bit=1
> > > kernel=uboot.bin
> > >
> > > This bug can be reproduced with es2gears command easily.  May need to
> > > wait it running a while.
> > >
> > > Mesa: 21.2.2
> > > libdrm: 2.4.107
> > > xserver/wayland: 1.20.11  Using wayland
> > >
> > > This bug happens with direct boot path as well:
> > > RPi firmware -> Linux kernel (aarch64) with corresponding DTB
> > >
> > > I tried with Endless OS and Ubuntu's RPi 4B images.
> > > An easy setup is using Ubuntu 21.10 RPi 4B image [2].  Then, replace
> > > the kernel & device tree blob and edit the config.txt.
> >
> > Does it still appear if you raise the core clock in the config.txt file
> > using: core_freq_min=600 ?
> 
> It still appears when I raise the core clock in the config.txt file:
> core_freq_min=600

That's weird, we had some issues like that already but could never point
exactly what was going on.

Is testing the official raspberrypi kernel an option for you? If so,
trying the same workload with fkms would certainly help

Maxime


signature.asc
Description: PGP signature

[PATCH v2 3/3] drm/vc4: Notify the firmware when DRM is in charge

2021-12-03 Thread Maxime Ripard

Once the call to drm_fb_helper_remove_conflicting_framebuffers() has
been made, simplefb has been unregistered and the KMS driver is entirely
in charge of the display.

Thus, we can notify the firmware it can free whatever resource it was
using to maintain simplefb functional.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_drv.c | 19 +++
 drivers/gpu/drm/vc4/vc4_drv.h |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 8ab89f805826..38d55a47c831 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -37,6 +37,8 @@
 #include 
 #include 
 
+#include 
+
 #include "uapi/drm/vc4_drm.h"
 
 #include "vc4_drv.h"
@@ -251,10 +253,27 @@ static int vc4_drm_bind(struct device *dev)
if (ret)
return ret;
 
+   node = of_find_compatible_node(NULL, NULL, 
"raspberrypi,bcm2835-firmware");
+   if (node) {
+   vc4->firmware = devm_rpi_firmware_get(dev, node);
+   of_node_put(node);
+
+   if (!vc4->firmware)
+   return -EPROBE_DEFER;
+   }
+
ret = drm_aperture_remove_framebuffers(false, _drm_driver);
if (ret)
return ret;
 
+   if (vc4->firmware) {
+   ret = rpi_firmware_property(vc4->firmware,
+   RPI_FIRMWARE_NOTIFY_DISPLAY_DONE,
+   NULL, 0);
+   if (ret)
+   drm_warn(drm, "Couldn't stop firmware display driver: 
%d\n", ret);
+   }
+
ret = component_bind_all(dev, drm);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 4329e09d357c..b840654c53a9 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -76,6 +76,8 @@ struct vc4_dev {
 
unsigned int irq;
 
+   struct rpi_firmware *firmware;
+
struct vc4_hvs *hvs;
struct vc4_v3d *v3d;
struct vc4_dpi *dpi;
-- 
2.33.1

[PATCH v2 2/3] drm/vc4: Remove conflicting framebuffers before callind bind_all

2021-12-03 Thread Maxime Ripard

The bind hooks will modify their controller registers, so simplefb is
going to be unusable anyway. Let's avoid any transient state where it
could still be in the system but no longer functionnal.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_drv.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 16abc3a3d601..8ab89f805826 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -251,6 +251,10 @@ static int vc4_drm_bind(struct device *dev)
if (ret)
return ret;
 
+   ret = drm_aperture_remove_framebuffers(false, _drm_driver);
+   if (ret)
+   return ret;
+
ret = component_bind_all(dev, drm);
if (ret)
return ret;
@@ -259,10 +263,6 @@ static int vc4_drm_bind(struct device *dev)
if (ret)
goto unbind_all;
 
-   ret = drm_aperture_remove_framebuffers(false, _drm_driver);
-   if (ret)
-   goto unbind_all;
-
ret = vc4_kms_load(drm);
if (ret < 0)
goto unbind_all;
-- 
2.33.1

[PATCH v2 1/3] firmware: raspberrypi: Add RPI_FIRMWARE_NOTIFY_DISPLAY_DONE

2021-12-03 Thread Maxime Ripard

The RPI_FIRMWARE_NOTIFY_DISPLAY_DONE firmware call allows to tell the
firmware the kernel is in charge of the display now and the firmware can
free whatever resources it was using.

Signed-off-by: Maxime Ripard 
---
 include/soc/bcm2835/raspberrypi-firmware.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/soc/bcm2835/raspberrypi-firmware.h 
b/include/soc/bcm2835/raspberrypi-firmware.h
index 73ad784fca96..811ea668c4a1 100644
--- a/include/soc/bcm2835/raspberrypi-firmware.h
+++ b/include/soc/bcm2835/raspberrypi-firmware.h
@@ -91,6 +91,7 @@ enum rpi_firmware_property_tag {
RPI_FIRMWARE_GET_POE_HAT_VAL =0x00030049,
RPI_FIRMWARE_SET_POE_HAT_VAL =0x00030050,
RPI_FIRMWARE_NOTIFY_XHCI_RESET =  0x00030058,
+   RPI_FIRMWARE_NOTIFY_DISPLAY_DONE =0x00030066,
 
/* Dispmanx TAGS */
RPI_FIRMWARE_FRAMEBUFFER_ALLOCATE =   0x00040001,
-- 
2.33.1

[PATCH v2 0/3] drm/vc4: Use the firmware to stop the display pipeline

2021-12-03 Thread Maxime Ripard

Hi,

The VC4 driver has had limited support to disable the HDMI controllers and
pixelvalves at boot if the firmware has enabled them.

However, this proved to be limited, and a bit unreliable so a new firmware
command has been introduced some time ago to make it free all its resources and
disable any display output it might have enabled.

This series takes advantage of that command to call it once the transition from
simplefb to the KMS driver has been done.

Let me know what you think,
Maxime

---

Changes from v2:
  - Use of_find_compatible_node instead of a phandle
  - Use devm_rpi_firmware_get

Maxime Ripard (3):
  firmware: raspberrypi: Add RPI_FIRMWARE_NOTIFY_DISPLAY_DONE
  drm/vc4: Remove conflicting framebuffers before callind bind_all
  drm/vc4: Notify the firmware when DRM is in charge

 drivers/gpu/drm/vc4/vc4_drv.c  | 27 ++
 drivers/gpu/drm/vc4/vc4_drv.h  |  2 ++
 include/soc/bcm2835/raspberrypi-firmware.h |  1 +
 3 files changed, 26 insertions(+), 4 deletions(-)

-- 
2.33.1

Re: [PATCH 0/5] drm/vc4: Use the firmware to stop the display pipeline

2021-12-03 Thread Maxime Ripard

Hi Nicolas,

On Tue, Nov 30, 2021 at 12:45:49PM +0100, nicolas saenz julienne wrote:
> On Wed, 2021-11-17 at 15:50 +0100, Maxime Ripard wrote:
> > The VC4 driver has had limited support to disable the HDMI controllers and
> > pixelvalves at boot if the firmware has enabled them.
> > 
> > However, this proved to be limited, and a bit unreliable so a new firmware
> > command has been introduced some time ago to make it free all its resources 
> > and
> > disable any display output it might have enabled.
> > 
> > This series takes advantage of that command to call it once the transition 
> > from
> > simplefb to the KMS driver has been done.
> 
> I think it would make sense to integrate this funtionality into
> 'reset/reset-raspberrypi.c' and pass it to VC4 as a reset controller. It fits
> the same startup situation as the one we have with the USB controller. Also, 
> it
> would contain the firmware weirdness in a single spot.

I don't really think it makes sense. It's not really made for that
purpose, affects multiple devices (basically, all of the devices from
the display pipeline), and can even have some side effects on clocks and
memory. Plus, it can only be done once iirc, so a later call to reset
the pipeline will be ineffective, even if we changed its state.

So yeah, it doesn't really fit into the reset abstraction.

> Otherwise, please use 'devm_rpi_firmware_get()'.

Will do, thanks!
Maxime


signature.asc
Description: PGP signature

Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

2021-12-03 Thread Christian König


Am 01.12.21 um 13:16 schrieb Thomas Hellström (Intel):


On 12/1/21 12:25, Christian König wrote:

Am 01.12.21 um 12:04 schrieb Thomas Hellström (Intel):


On 12/1/21 11:32, Christian König wrote:

Am 01.12.21 um 11:15 schrieb Thomas Hellström (Intel):

[SNIP]


What we could do is to avoid all this by not calling the callback 
with the lock held in the first place.


If that's possible that might be a good idea, pls also see below.


The problem with that is 
dma_fence_signal_locked()/dma_fence_signal_timestamp_locked(). If 
we could avoid using that or at least allow it to drop the lock 
then we could call the callback without holding it.


Somebody would need to audit the drivers and see if holding the 
lock is really necessary anywhere.








/Thomas


Oh, and a follow up question:

If there was a way to break the recursion on final put() 
(using the same basic approach as patch 2 in this series uses 
to break recursion in enable_signaling()), so that none of 
these containers did require any special treatment, would it 
be worth pursuing? I guess it might be possible by having the 
callbacks drop the references rather than the loop in the 
final put. + a couple of changes in code iterating over the 
fence pointers.


That won't really help, you just move the recursion from the 
final put into the callback.


How do we recurse from the callback? The introduced fence_put() 
of individual fence pointers
doesn't recurse anymore (at most 1 level), and any callback 
recursion is broken by the irq_work?


Yeah, but then you would need to take another lock to avoid 
racing with dma_fence_array_signaled().




I figure the big amount of work would be to adjust code that 
iterates over the individual fence pointers to recognize that 
they are rcu protected.


Could be that we could solve this with RCU, but that sounds like 
a lot of churn for no gain at all.


In other words even with the problems solved I think it would be 
a really bad idea to allow chaining of dma_fence_array objects.


Yes, that was really the question, Is it worth pursuing this? I'm 
not really suggesting we should allow this as an intentional 
feature. I'm worried, however, that if we allow these containers 
to start floating around cross-driver (or even internally) 
disguised as ordinary dma_fences, they would require a lot of 
driver special casing, or else completely unexpeced WARN_ON()s and 
lockdep splats would start to turn up, scaring people off from 
using them. And that would be a breeding ground for hairy 
driver-private constructs.


Well the question is why we would want to do it?

If it's to avoid inter driver lock dependencies by avoiding to call 
the callback with the spinlock held, then yes please. We had tons 
of problems with that, resulting in irq_work and work_item 
delegation all over the place.


Yes, that sounds like something desirable, but in these containers, 
what's causing the lock dependencies is the enable_signaling() 
callback that is typically called locked.





If it's to allow nesting of dma_fence_array instances, then it's 
most likely a really bad idea even if we fix all the locking order 
problems.


Well I think my use-case where I hit a dead end may illustrate what 
worries me here:


1) We use a dma-fence-array to coalesce all dependencies for ttm 
object migration.
2) We use a dma-fence-chain to order the resulting dm_fence into a 
timeline because the TTM resource manager code requires that.


Initially seemingly harmless to me.

But after a sequence evict->alloc->clear, the dma-fence-chain feeds 
into the dma-fence-array for the clearing operation. Code still 
works fine, and no deep recursion, no warnings. But if I were to add 
another driver to the system that instead feeds a dma-fence-array 
into a dma-fence-chain, this would give me a lockdep splat.


So then if somebody were to come up with the splendid idea of using 
a dma-fence-chain to initially coalesce fences, I'd hit the same 
problem or risk illegaly joining two dma-fence-chains together.


To fix this, I would need to look at the incoming fences and iterate 
over any dma-fence-array or dma-fence-chain that is fed into the 
dma-fence-array to flatten out the input. In fact all 
dma-fence-array users would need to do that, and even 
dma-fence-chain users watching out for not joining chains together 
or accidently add an array that perhaps came as a disguised 
dma-fence from antother driver.


So the purpose to me would be to allow these containers as input to 
eachother without a lot of in-driver special-casing, be it by 
breaking recursion on built-in flattening to avoid


a) Hitting issues in the future or with existing interoperating 
drivers.
b) Avoid driver-private containers that also might break the 
interoperability. (For example the i915 currently driver-private 
dma_fence_work avoid all these problems, but we're attempting to 
address issues in common code rather than re-inventing stuff 
internally).


I don't think that

Re: [PATCH] drm/msm: Initialize MDSS irq domain at probe time

2021-12-03 Thread AngeloGioacchino Del Regno


Il 03/12/21 14:14, Dmitry Baryshkov ha scritto:

On 03/12/2021 13:43, AngeloGioacchino Del Regno wrote:

Il 01/12/21 21:20, Dmitry Baryshkov ha scritto:

Since commit 8f59ee9a570c ("drm/msm/dsi: Adjust probe order"), the
DSI host gets initialized earlier, but this caused unability to probe
the entire stack of components because they all depend on interrupts
coming from the main `mdss` node (mdp5, or dpu1).

To fix this issue, move mdss device initialization (which include irq
domain setup) to msm_mdev_probe() time, as to make sure that the
interrupt controller is available before dsi and/or other components try
to initialize, finally satisfying the dependency.

Fixes: 8f59ee9a570c ("drm/msm/dsi: Adjust probe order")
Co-Developed-By: AngeloGioacchino Del Regno 


Signed-off-by: Dmitry Baryshkov 
---

When checking your patch, I noticed that IRQ domain is created before
respective MDSS clocks are enabled. This does not look like causing any
issues at this time, but it did not look good. So I started moving
clocks parsing to early_init() callbacks. And at some point it looked
like we can drop the init()/destroy() callbacks in favour of
early_init() and remove(). Which promted me to move init()/destroy() in
place of early_init()/remove() with few minor fixes here and there.




Hey Dmitry,
I wanted to make the least amount of changes to Rob's logic... I know that
the clocks aren't up before registering the domain, but my logic was implying
that if the handlers aren't registered, then there's no interrupt coming, hence
no risk of getting issues. Same if the hardware is down, you can't get any
interrupt, because it can't generate any (but if bootloader leaves it up.. eh.)


We can get spurious interrupts for any reason, which puts us at risk of peeking 
into unpowered registers. So, while your approach was working, it did not seem 
fully correct.




Yeah, that's right and I totally agree.



I recognize that such approach is "fragile enough", lastly, I agree with this
patch which is, in the end, something that is in the middle between my first
and last approach.

I've tested this one on trogdor-lazor-limozeen and seems to be working in an
analogous way to my v2/v3, so on my side it's validated.


Let's go for this one!
How do we proceeed now? Are you sending a new series with the new patches, or
should I?


I'll run a few more tests and then I'll probably include both patches into the next 
series to be sent to Rob.




That's perfect!



Also, I don't think this is relevant, since I'm in co-dev, but in case it is:
Tested-by: AngeloGioacchino Del Regno 

P.S.: Sorry for the 1-day delay, got busy with other tasks!


No problem, it was just single day, no worries.



Alright, thank you! :D

Re: [PATCH] drm/msm: Initialize MDSS irq domain at probe time

2021-12-03 Thread Dmitry Baryshkov


On 03/12/2021 13:43, AngeloGioacchino Del Regno wrote:

Il 01/12/21 21:20, Dmitry Baryshkov ha scritto:

Since commit 8f59ee9a570c ("drm/msm/dsi: Adjust probe order"), the
DSI host gets initialized earlier, but this caused unability to probe
the entire stack of components because they all depend on interrupts
coming from the main `mdss` node (mdp5, or dpu1).

To fix this issue, move mdss device initialization (which include irq
domain setup) to msm_mdev_probe() time, as to make sure that the
interrupt controller is available before dsi and/or other components try
to initialize, finally satisfying the dependency.

Fixes: 8f59ee9a570c ("drm/msm/dsi: Adjust probe order")
Co-Developed-By: AngeloGioacchino Del Regno 


Signed-off-by: Dmitry Baryshkov 
---

When checking your patch, I noticed that IRQ domain is created before
respective MDSS clocks are enabled. This does not look like causing any
issues at this time, but it did not look good. So I started moving
clocks parsing to early_init() callbacks. And at some point it looked
like we can drop the init()/destroy() callbacks in favour of
early_init() and remove(). Which promted me to move init()/destroy() in
place of early_init()/remove() with few minor fixes here and there.




Hey Dmitry,
I wanted to make the least amount of changes to Rob's logic... I know that
the clocks aren't up before registering the domain, but my logic was 
implying
that if the handlers aren't registered, then there's no interrupt 
coming, hence

no risk of getting issues. Same if the hardware is down, you can't get any
interrupt, because it can't generate any (but if bootloader leaves it 
up.. eh.)


We can get spurious interrupts for any reason, which puts us at risk of 
peeking into unpowered registers. So, while your approach was working, 
it did not seem fully correct.




I recognize that such approach is "fragile enough", lastly, I agree with 
this
patch which is, in the end, something that is in the middle between my 
first

and last approach.

I've tested this one on trogdor-lazor-limozeen and seems to be working 
in an

analogous way to my v2/v3, so on my side it's validated.


Let's go for this one!
How do we proceeed now? Are you sending a new series with the new 
patches, or

should I?


I'll run a few more tests and then I'll probably include both patches 
into the next series to be sent to Rob.




Also, I don't think this is relevant, since I'm in co-dev, but in case 
it is:
Tested-by: AngeloGioacchino Del Regno 



P.S.: Sorry for the 1-day delay, got busy with other tasks!


No problem, it was just single day, no worries.

--
With best wishes
Dmitry

Re: [PATCH] drm/plane: Move range check for format_count earlier

2021-12-03 Thread Liviu Dudau

On Fri, Dec 03, 2021 at 10:28:15AM +, Steven Price wrote:
> While the check for format_count > 64 in __drm_universal_plane_init()
> shouldn't be hit (it's a WARN_ON), in its current position it will then
> leak the plane->format_types array and fail to call
> drm_mode_object_unregister() leaking the modeset identifier. Move it to
> the start of the function to avoid allocating those resources in the
> first place.
> 
> Signed-off-by: Steven Price 

Well spotted!

Reviewed-by: Liviu Dudau 

I'm going to wait to see if anyone else has any comments before I'll merge this 
into
drm-misc-fixes (or should it be drm-misc-next-fixes?)

Best regards,
Liviu

> ---
>  drivers/gpu/drm/drm_plane.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
> index 82afb854141b..fd0bf90fb4c2 100644
> --- a/drivers/gpu/drm/drm_plane.c
> +++ b/drivers/gpu/drm/drm_plane.c
> @@ -249,6 +249,13 @@ static int __drm_universal_plane_init(struct drm_device 
> *dev,
>   if (WARN_ON(config->num_total_plane >= 32))
>   return -EINVAL;
>  
> + /*
> +  * First driver to need more than 64 formats needs to fix this. Each
> +  * format is encoded as a bit and the current code only supports a u64.
> +  */
> + if (WARN_ON(format_count > 64))
> + return -EINVAL;
> +
>   WARN_ON(drm_drv_uses_atomic_modeset(dev) &&
>   (!funcs->atomic_destroy_state ||
>!funcs->atomic_duplicate_state));
> @@ -270,13 +277,6 @@ static int __drm_universal_plane_init(struct drm_device 
> *dev,
>   return -ENOMEM;
>   }
>  
> - /*
> -  * First driver to need more than 64 formats needs to fix this. Each
> -  * format is encoded as a bit and the current code only supports a u64.
> -  */
> - if (WARN_ON(format_count > 64))
> - return -EINVAL;
> -
>   if (format_modifiers) {
>   const uint64_t *temp_modifiers = format_modifiers;
>  
> -- 
> 2.25.1
> 

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯

Re: [PATCH 10/11] drm/vc4: hdmi: Support HDMI YUV output

2021-12-03 Thread kernel test robot

Hi Maxime,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm/drm-next]
[also build test WARNING on next-20211203]
[cannot apply to drm-intel/for-linux-next anholt/for-next v5.16-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Maxime-Ripard/drm-vc4-hdmi-Yet-Another-Approach-to-HDMI-YUV-output/20211203-185623
base:   git://anongit.freedesktop.org/drm/drm drm-next
config: nds32-allyesconfig 
(https://download.01.org/0day-ci/archive/20211203/202112032036.vcxjzxkg-...@intel.com/config)
compiler: nds32le-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/70ee339decc66c157b6c9c983e8cce172c323218
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Maxime-Ripard/drm-vc4-hdmi-Yet-Another-Approach-to-HDMI-YUV-output/20211203-185623
git checkout 70ee339decc66c157b6c9c983e8cce172c323218
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross 
O=build_dir ARCH=nds32 SHELL=/bin/bash drivers/gpu/drm/vc4/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/vc4/vc4_hdmi.c: In function 
'vc4_hdmi_avi_infoframe_colorspace':
>> drivers/gpu/drm/vc4/vc4_hdmi.c:503:35: warning: implicit conversion from 
>> 'enum vc4_hdmi_output_format' to 'enum hdmi_colorspace' [-Wenum-conversion]
 503 | frame->colorspace = fmt;
 |   ^


vim +503 drivers/gpu/drm/vc4/vc4_hdmi.c

   491  
   492  static void vc4_hdmi_avi_infoframe_colorspace(struct hdmi_avi_infoframe 
*frame,
   493enum 
vc4_hdmi_output_format fmt)
   494  {
   495  switch (fmt) {
   496  case VC4_HDMI_OUTPUT_RGB:
   497  fallthrough;
   498  case VC4_HDMI_OUTPUT_YUV420:
   499  fallthrough;
   500  case VC4_HDMI_OUTPUT_YUV422:
   501  fallthrough;
   502  case VC4_HDMI_OUTPUT_YUV444:
 > 503  frame->colorspace = fmt;
   504  break;
   505  
   506  default:
   507  break;
   508  }
   509  }
   510  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

Re: [PATCH v7 6/6] drm/sprd: add Unisoc's drm mipi dsi driver

2021-12-03 Thread Kevin Tang

Maxime Ripard  于2021年12月3日周五 18:38写道：
>
> On Mon, Oct 25, 2021 at 05:34:18PM +0800, Kevin Tang wrote:
> > @@ -618,9 +619,25 @@ static void sprd_crtc_mode_set_nofb(struct drm_crtc 
> > *crtc)
> >  {
> >   struct sprd_dpu *dpu = to_sprd_crtc(crtc);
> >   struct drm_display_mode *mode = >state->adjusted_mode;
> > + struct drm_encoder *encoder;
> > + struct mipi_dsi_device *slave;
> > + struct sprd_dsi *dsi;
> >
> >   drm_display_mode_to_videomode(mode, >ctx.vm);
> >
> > + drm_for_each_encoder(encoder, crtc->dev) {
> > + if (encoder->crtc != crtc)
> > + continue;
>
> encoder->crtc is deprecated. You should be using
> encoder->drm_for_each_encoder_mask, using the encoder_mask in
> encoder->drm_crtc_state.

Use drm_for_each_encoder_mask to replace drm_for_each_encoder? like this:
drm_for_each_encoder_mask(encoder, crtc->dev, crtc->state->encoder_mask) {
dsi = encoder_to_dsi(encoder);
slave = dsi->slave;

if (slave->mode_flags & MIPI_DSI_MODE_VIDEO)
dpu->ctx.if_type = SPRD_DPU_IF_DPI;
else
 dpu->ctx.if_type = SPRD_DPU_IF_EDPI;
}

>
> > +static int sprd_dsi_probe(struct platform_device *pdev)
> > +{
> > + struct device *dev = >dev;
> > + struct sprd_dsi *dsi;
> > +
> > + dsi = devm_kzalloc(dev, sizeof(*dsi), GFP_KERNEL);
> > + if (!dsi)
> > + return -ENOMEM;
> > +
> > + dev_set_drvdata(dev, dsi);
> > +
> > + dsi->host.ops = _dsi_host_ops;
> > + dsi->host.dev = dev;
> > + mipi_dsi_host_register(>host);
> > +
> > + return component_add(>dev, _component_ops);
>
> component_add must be run in the mipi_dsi_host.attach hook.
Got it, will be fixed on patch v8.
>
> Maxime

[PATCH v2 8/8] drm/i915/migrate: turn on acceleration for DG2

2021-12-03 Thread Matthew Auld

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index a804c57b61df..0da27ec808dc 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -242,8 +242,6 @@ int intel_migrate_init(struct intel_migrate *m, struct 
intel_gt *gt)
 
memset(m, 0, sizeof(*m));
 
-   return 0;
-
ce = pinned_context(gt);
if (IS_ERR(ce))
return PTR_ERR(ce);
-- 
2.31.1

[PATCH v2 7/8] drm/i915/migrate: add acceleration support for DG2

2021-12-03 Thread Matthew Auld

This is all kinds of awkward since we now have to contend with using 64K
GTT pages when mapping anything in LMEM(including the page-tables
themselves).

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 186 +++-
 1 file changed, 147 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 2076e24e0489..a804c57b61df 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -33,6 +33,38 @@ static bool engine_supports_migration(struct intel_engine_cs 
*engine)
return true;
 }
 
+static void xehpsdv_toggle_pdes(struct i915_address_space *vm,
+   struct i915_page_table *pt,
+   void *data)
+{
+   struct insert_pte_data *d = data;
+
+   /*
+* Insert a dummy PTE into every PT that will map to LMEM to ensure
+* we have a correctly setup PDE structure for later use.
+*/
+   vm->insert_page(vm, 0, d->offset, I915_CACHE_NONE, PTE_LM);
+   GEM_BUG_ON(!pt->is_compact);
+   d->offset += SZ_2M;
+}
+
+static void xehpsdv_insert_pte(struct i915_address_space *vm,
+  struct i915_page_table *pt,
+  void *data)
+{
+   struct insert_pte_data *d = data;
+
+   /*
+* We are playing tricks here, since the actual pt, from the hw
+* pov, is only 256bytes with 32 entries, or 4096bytes with 512
+* entries, but we are still guaranteed that the physical
+* alignment is 64K underneath for the pt, and we are careful
+* not to access the space in the void.
+*/
+   vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE, PTE_LM);
+   d->offset += SZ_64K;
+}
+
 static void insert_pte(struct i915_address_space *vm,
   struct i915_page_table *pt,
   void *data)
@@ -75,7 +107,12 @@ static struct i915_address_space *migrate_vm(struct 
intel_gt *gt)
 * i.e. within the same non-preemptible window so that we do not switch
 * to another migration context that overwrites the PTE.
 *
-* TODO: Add support for huge LMEM PTEs
+* On platforms with HAS_64K_PAGES support we have three windows, and
+* dedicate two windows just for mapping lmem pages(smem <-> smem is not
+* a thing), since we are forced to use 64K GTT pages underneath which
+* requires also modifying the PDE. An alternative might be to instead
+* map the PD into the GTT, and then on the fly toggle the 4K/64K mode
+* in the PDE from the same batch that also modifies the PTEs.
 */
 
vm = i915_ppgtt_create(gt,
@@ -108,14 +145,20 @@ static struct i915_address_space *migrate_vm(struct 
intel_gt *gt)
 * We copy in 8MiB chunks. Each PDE covers 2MiB, so we need
 * 4x2 page directories for source/destination.
 */
-   sz = 2 * CHUNK_SZ;
+   if (HAS_64K_PAGES(gt->i915))
+   sz = 3 * CHUNK_SZ;
+   else
+   sz = 2 * CHUNK_SZ;
d.offset = base + sz;
 
/*
 * We need another page directory setup so that we can write
 * the 8x512 PTE in each chunk.
 */
-   sz += (sz >> 12) * sizeof(u64);
+   if (HAS_64K_PAGES(gt->i915))
+   sz += (sz / SZ_2M) * SZ_64K;
+   else
+   sz += (sz >> 12) * sizeof(u64);
 
err = i915_vm_alloc_pt_stash(>vm, , sz);
if (err)
@@ -136,7 +179,18 @@ static struct i915_address_space *migrate_vm(struct 
intel_gt *gt)
goto err_vm;
 
/* Now allow the GPU to rewrite the PTE via its own ppGTT */
-   vm->vm.foreach(>vm, base, d.offset - base, insert_pte, );
+   if (HAS_64K_PAGES(gt->i915)) {
+   vm->vm.foreach(>vm, base, d.offset - base,
+  xehpsdv_insert_pte, );
+   d.offset = base + CHUNK_SZ;
+   vm->vm.foreach(>vm,
+  d.offset,
+  2 * CHUNK_SZ,
+  xehpsdv_toggle_pdes, );
+   } else {
+   vm->vm.foreach(>vm, base, d.offset - base,
+  insert_pte, );
+   }
}
 
return >vm;
@@ -274,19 +328,38 @@ static int emit_pte(struct i915_request *rq,
u64 offset,
int length)
 {
+   bool has_64K_pages = HAS_64K_PAGES(rq->engine->i915);
const u64 encode = rq->context->vm->pte_encode(0, cache_level,

[PATCH v2 5/8] drm/i915/migrate: fix length calculation

2021-12-03 Thread Matthew Auld

No need to insert PTEs for the PTE window itself, also foreach expects a
length not an end offset, which could be gigantic here with a second
engine.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index cb0bb3b94644..2076e24e0489 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -136,7 +136,7 @@ static struct i915_address_space *migrate_vm(struct 
intel_gt *gt)
goto err_vm;
 
/* Now allow the GPU to rewrite the PTE via its own ppGTT */
-   vm->vm.foreach(>vm, base, base + sz, insert_pte, );
+   vm->vm.foreach(>vm, base, d.offset - base, insert_pte, );
}
 
return >vm;
-- 
2.31.1

[PATCH v2 6/8] drm/i915/selftests: handle object rounding

2021-12-03 Thread Matthew Auld

Ensure we account for any object rounding due to min_page_size
restrictions.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/selftest_migrate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c 
b/drivers/gpu/drm/i915/gt/selftest_migrate.c
index 12ef2837c89b..e21787301bbd 100644
--- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -49,6 +49,7 @@ static int copy(struct intel_migrate *migrate,
if (IS_ERR(src))
return 0;
 
+   sz = src->base.size;
dst = i915_gem_object_create_internal(i915, sz);
if (IS_ERR(dst))
goto err_free_src;
-- 
2.31.1

[PATCH v2 4/8] drm/i915/migrate: fix offset calculation

2021-12-03 Thread Matthew Auld

Ensure we add the engine base only after we calculate the qword offset
into the PTE window.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index d553b76b1168..cb0bb3b94644 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -284,10 +284,10 @@ static int emit_pte(struct i915_request *rq,
GEM_BUG_ON(GRAPHICS_VER(rq->engine->i915) < 8);
 
/* Compute the page directory offset for the target address range */
-   offset += (u64)rq->engine->instance << 32;
offset >>= 12;
offset *= sizeof(u64);
offset += 2 * CHUNK_SZ;
+   offset += (u64)rq->engine->instance << 32;
 
cs = intel_ring_begin(rq, 6);
if (IS_ERR(cs))
-- 
2.31.1

[PATCH v2 3/8] drm/i915/gtt: add gtt mappable plumbing

2021-12-03 Thread Matthew Auld

With object clearing/copying we need to be able to modify the PTEs on
the fly via some batch buffer, which means we need to be able to map the
paging structures(or at the very least the PT, but being able to also
map the PD might also be useful at some point) into the GTT. And since
the paging structures must reside in LMEM on discrete, we need to ensure
that these objects have correct physical alignment, as per any min page
restrictions, like on DG2. This is potentially costly, but this should
be limited to the special migrate_vm, which only needs to a few fixed
sized windows.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  4 ++--
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c |  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c|  2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c|  3 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h|  1 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_gt.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c |  7 +++
 drivers/gpu/drm/i915/gt/intel_gtt.h |  9 +
 drivers/gpu/drm/i915/gt/intel_migrate.c |  4 +++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c   | 17 -
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c|  2 +-
 drivers/gpu/drm/i915/gvt/scheduler.c|  2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c   |  4 ++--
 14 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index ebd775cb1661..b394954726b0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1559,7 +1559,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
} else if (HAS_FULL_PPGTT(i915)) {
struct i915_ppgtt *ppgtt;
 
-   ppgtt = i915_ppgtt_create(>gt, 0);
+   ppgtt = i915_ppgtt_create(>gt, 0, 0);
if (IS_ERR(ppgtt)) {
drm_dbg(>drm, "PPGTT setup failed (%ld)\n",
PTR_ERR(ppgtt));
@@ -1742,7 +1742,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (args->flags)
return -EINVAL;
 
-   ppgtt = i915_ppgtt_create(>gt, 0);
+   ppgtt = i915_ppgtt_create(>gt, 0, 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index bd8dc1a28022..c1b86c7a4754 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1764,7 +1764,7 @@ int i915_gem_huge_page_mock_selftests(void)
mkwrite_device_info(dev_priv)->ppgtt_type = INTEL_PPGTT_FULL;
mkwrite_device_info(dev_priv)->ppgtt_size = 48;
 
-   ppgtt = i915_ppgtt_create(_priv->gt, 0);
+   ppgtt = i915_ppgtt_create(_priv->gt, 0, 0);
if (IS_ERR(ppgtt)) {
err = PTR_ERR(ppgtt);
goto out_unlock;
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index c0d149f04949..778472e563aa 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -443,7 +443,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
 
mutex_init(>flush);
 
-   ppgtt_init(>base, gt, 0);
+   ppgtt_init(>base, gt, 0, 0);
ppgtt->base.vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen6_pte_t));
ppgtt->base.vm.top = 1;
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 312b2267bf87..dfca803b4ff1 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -912,6 +912,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
  *
  */
 struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
+unsigned long vm_flags,
 unsigned long lmem_pt_obj_flags)
 {
struct i915_ppgtt *ppgtt;
@@ -921,7 +922,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
if (!ppgtt)
return ERR_PTR(-ENOMEM);
 
-   ppgtt_init(ppgtt, gt, lmem_pt_obj_flags);
+   ppgtt_init(ppgtt, gt, vm_flags, lmem_pt_obj_flags);
ppgtt->vm.top = i915_vm_is_4lvl(>vm) ? 3 : 2;
ppgtt->vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen8_pte_t));
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
index f541d19264b4..c0af12593576 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
@@ -13,6 +13,7 @@ struct intel_gt;
 enum i915_cache_level;
 
 struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
+unsigned long vm_flags,

[PATCH v2 2/8] drm/i915/gtt: add xehpsdv_ppgtt_insert_entry

2021-12-03 Thread Matthew Auld

If this is LMEM then we get a 32 entry PT, with each PTE pointing to
some 64K block of memory, otherwise it's just the usual 512 entry PT.
This very much assumes the caller knows what they are doing.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 50 ++--
 1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index bd3ca0996a23..312b2267bf87 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -728,13 +728,56 @@ static void gen8_ppgtt_insert_entry(struct 
i915_address_space *vm,
gen8_pdp_for_page_index(vm, idx);
struct i915_page_directory *pd =
i915_pd_entry(pdp, gen8_pd_index(idx, 2));
+   struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
gen8_pte_t *vaddr;
 
-   vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+   GEM_BUG_ON(pt->is_compact);
+
+   vaddr = px_vaddr(pt);
vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
clflush_cache_range([gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
+static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
+   dma_addr_t addr,
+   u64 offset,
+   enum i915_cache_level level,
+   u32 flags)
+{
+   u64 idx = offset >> GEN8_PTE_SHIFT;
+   struct i915_page_directory * const pdp =
+   gen8_pdp_for_page_index(vm, idx);
+   struct i915_page_directory *pd =
+   i915_pd_entry(pdp, gen8_pd_index(idx, 2));
+   struct i915_page_table *pt = i915_pt_entry(pd, gen8_pd_index(idx, 1));
+   gen8_pte_t *vaddr;
+
+   GEM_BUG_ON(!IS_ALIGNED(addr, SZ_64K));
+   GEM_BUG_ON(!IS_ALIGNED(offset, SZ_64K));
+
+   if (!pt->is_compact) {
+   vaddr = px_vaddr(pd);
+   vaddr[gen8_pd_index(idx, 1)] |= GEN12_PDE_64K;
+   pt->is_compact = true;
+   }
+
+   vaddr = px_vaddr(pt);
+   vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+}
+
+static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
+  dma_addr_t addr,
+  u64 offset,
+  enum i915_cache_level level,
+  u32 flags)
+{
+   if (flags & PTE_LM)
+   return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
+  level, flags);
+
+   return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);
+}
+
 static int gen8_init_scratch(struct i915_address_space *vm)
 {
u32 pte_flags;
@@ -937,7 +980,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 
ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
ppgtt->vm.insert_entries = gen8_ppgtt_insert;
-   ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
+   if (HAS_64K_PAGES(gt->i915))
+   ppgtt->vm.insert_page = xehpsdv_ppgtt_insert_entry;
+   else
+   ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
ppgtt->vm.clear_range = gen8_ppgtt_clear;
ppgtt->vm.foreach = gen8_ppgtt_foreach;
-- 
2.31.1

[PATCH v2 1/8] drm/i915/migrate: don't check the scratch page

2021-12-03 Thread Matthew Auld

The scratch page might not be allocated in LMEM(like on DG2), so instead
of using that as the deciding factor for where the paging structures
live, let's just query the pt before mapping it.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 765c6d48fe52..2d3188a398dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -13,7 +13,6 @@
 
 struct insert_pte_data {
u64 offset;
-   bool is_lmem;
 };
 
 #define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
@@ -41,7 +40,7 @@ static void insert_pte(struct i915_address_space *vm,
struct insert_pte_data *d = data;
 
vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
-   d->is_lmem ? PTE_LM : 0);
+   i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0);
d->offset += PAGE_SIZE;
 }
 
@@ -135,7 +134,6 @@ static struct i915_address_space *migrate_vm(struct 
intel_gt *gt)
goto err_vm;
 
/* Now allow the GPU to rewrite the PTE via its own ppGTT */
-   d.is_lmem = i915_gem_object_is_lmem(vm->vm.scratch[0]);
vm->vm.foreach(>vm, base, base + sz, insert_pte, );
}
 
-- 
2.31.1

[PATCH v2 0/8] DG2 accelerated migration/clearing support

2021-12-03 Thread Matthew Auld

Enable accelerated moves and clearing on DG2. On such HW we have minimum page
size restrictions when accessing LMEM from the GTT, where we now have to use 64K
GTT pages or larger. With the ppGTT the page-table also has a slightly different
layout from past generations when using the 64K GTT mode(which is still enabled
on via some PDE bit), where it is now compacted down to 32 qword entries. Note
that on discrete the paging structures must also be placed in LMEM, and we need
to able to modify them via the GTT itself(see patch 7), which is one of the
complications here.

The series needs to be applied on top of the DG2 enabling branch:
https://cgit.freedesktop.org/~ramaling/linux/log/?h=dg2_enabling_ww49.3

Patches 2, 7 and 8 have a dependency on patches in that branch, but the rest can
likely already land if the direction makes sense.

Matthew Auld (8):
  drm/i915/migrate: don't check the scratch page
  drm/i915/gtt: add xehpsdv_ppgtt_insert_entry
  drm/i915/gtt: add gtt mappable plumbing
  drm/i915/migrate: fix offset calculation
  drm/i915/migrate: fix length calculation
  drm/i915/selftests: handle object rounding
  drm/i915/migrate: add acceleration support for DG2
  drm/i915/migrate: turn on acceleration for DG2

 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   4 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |   2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |   2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  53 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |   1 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   |   7 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   9 +
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 196 ++
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  17 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   2 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c|   1 +
 drivers/gpu/drm/i915/gvt/scheduler.c  |   2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   4 +-
 15 files changed, 241 insertions(+), 63 deletions(-)

-- 
2.31.1

[PATCH 4/6] dt-bindings: gpu: mali-bifrost: Document RZ/G2L support

2021-12-03 Thread Biju Das

The Renesas RZ/G2{L, LC} SoC (a.k.a R9A07G044) has a Bifrost Mali-G31 GPU,
add a compatible string for it.

Signed-off-by: Biju Das 
Reviewed-by: Lad Prabhakar 
---
 .../bindings/gpu/arm,mali-bifrost.yaml| 32 +--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml 
b/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml
index 6f98dd55fb4c..c9fac2498f5e 100644
--- a/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml
+++ b/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml
@@ -19,6 +19,7 @@ properties:
   - amlogic,meson-g12a-mali
   - mediatek,mt8183-mali
   - realtek,rtd1619-mali
+  - renesas,r9a07g044-mali
   - rockchip,px30-mali
   - rockchip,rk3568-mali
   - const: arm,mali-bifrost # Mali Bifrost GPU model/revision is fully 
discoverable
@@ -27,19 +28,30 @@ properties:
 maxItems: 1
 
   interrupts:
+minItems: 3
 items:
   - description: Job interrupt
   - description: MMU interrupt
   - description: GPU interrupt
+  - description: EVENT interrupt
 
   interrupt-names:
+minItems: 3
 items:
   - const: job
   - const: mmu
   - const: gpu
+  - const: event
 
   clocks:
-maxItems: 1
+minItems: 1
+maxItems: 3
+
+  clock-names:
+items:
+  - const: gpu
+  - const: bus
+  - const: bus_ace
 
   mali-supply: true
 
@@ -52,7 +64,8 @@ properties:
 maxItems: 3
 
   resets:
-maxItems: 2
+minItems: 1
+maxItems: 3
 
   "#cooling-cells":
 const: 2
@@ -113,6 +126,21 @@ allOf:
 - sram-supply
 - power-domains
 - power-domain-names
+  - if:
+  properties:
+compatible:
+  contains:
+const: renesas,r9a07g044-mali
+then:
+  properties:
+interrupt-names:
+  minItems: 4
+clock-names:
+  minItems: 3
+  required:
+- clock-names
+- power-domains
+- resets
 else:
   properties:
 power-domains:
-- 
2.17.1

[PATCH 0/6] Add Mali-G31 GPU support for RZ/G2L SoC

2021-12-03 Thread Biju Das

RZ/G2L SoC embeds Mali-G31 bifrost GPU.
This patch series aims to add support for the same

It is tested with latest drm-misc-next + mesa21.3.0 + 
out of tree patch for (du + DSI) + mesa configurtion for RZ/G2L.

Tested the kmscube application.

test logs:-
root@smarc-rzg2l:~# kmscube
Using display 0xdb6e7d30 with EGL version 1.4
===
EGL information:
  version: "1.4"
  vendor: "Mesa Project"
.
===
OpenGL ES 2.x information:
  version: "OpenGL ES 3.1 Mesa 21.3.0"
  shading language version: "OpenGL ES GLSL ES 3.10"
  vendor: "Panfrost"
  renderer: "Mali-G31 (Panfrost)"
  
===
^C
root@smarc-rzg2l:~# cat /proc/interrupts | grep gpu
 84:  8  0 GICv3 185 Level panfrost-gpu
root@smarc-rzg2l:~# cat /proc/interrupts | grep panfrost
 82: 587287  0 GICv3 186 Level panfrost-job
 83:  2  0 GICv3 187 Level panfrost-mmu
 84:  8  0 GICv3 185 Level panfrost-gpu

root@smarc-rzg2l:~# cat /sys/class/devfreq/1184.gpu/trans_stat
 From  :   To
   :  5000  6250 1 12500 2 25000 
4 5   time(ms)
*  5000: 0 0 0 0 0 0
 0 072
   6250: 0 0 0 0 0 0
 0 0 0
  1: 0 0 0 0 0 0
 0 0 0
  12500: 0 0 0 0 0 0
 0 168
  2: 0 0 0 0 0 0
 0 168
  25000: 1 0 0 0 0 0
 0 084
  4: 0 0 0 0 0 0
 0 0 0
  5: 0 0 0 1 1 1
 0 0   736
Total transition : 6
root@smarc-rzg2l:~# kmscube
Using display 0xf7a421b0 with EGL version 1.4
===
EGL information:
  version: "1.4"
  vendor: "Mesa Project"
  .
===
OpenGL ES 2.x information:
  version: "OpenGL ES 3.1 Mesa 21.3.0"
  shading language version: "OpenGL ES GLSL ES 3.10"
  vendor: "Panfrost"
  renderer: "Mali-G31 (Panfrost)"
  ..
===

root@smarc-rzg2l:~#
root@smarc-rzg2l:~#
root@smarc-rzg2l:~# cat /sys/class/devfreq/1184.gpu/trans_stat
 From  :   To
   :  5000  6250 1 12500 2 25000 
4 5   time(ms)
*  5000: 0 0 0 0 0 0
 0 1   144
   6250: 0 0 0 0 0 0
 0 0 0
  1: 0 0 0 0 0 0
 0 9   524
  12500: 0 0 9 0 0 0
 0 3  2544
  2: 0 0 011 0 0
 046  3304
  25000: 1 0 0 033 0
 0 0  7496
  4: 0 0 0 01619
 0 0  2024
  5: 1 0 0 1 815
35 0  4032
Total transition : 208

Mesa patch for RZ/G2L
-
src/gallium/targets/dri/meson.build
+   'rcar-du_dri.so',
src/gallium/targets/dri/target.c
+DEFINE_LOADER_DRM_ENTRYPOINT(rcar_du)

Biju Das (6):
  clk: renesas: r9a07g044: Rename CLK_PLL3_DIV4 macro
  clk: renesas: r9a07g044: Add mux and divider for G clock
  clk: renesas: r9a07g044: Add GPU clock and reset entries
  dt-bindings: gpu: mali-bifrost: Document RZ/G2L support
  arm64: dts: renesas: r9a07g044: Add Mali-G31 GPU node
  arm64: dts: renesas: rzg2l-smarc-som: Add vdd core regulator

 .../bindings/gpu/arm,mali-bifrost.yaml| 32 +-
 arch/arm64/boot/dts/renesas/r9a07g044.dtsi| 64 +++
 .../boot/dts/renesas/rzg2l-smarc-som.dtsi | 13 
 drivers/clk/renesas/r9a07g044-cpg.c   | 19 +-
 drivers/clk/renesas/rzg2l-cpg.h   |  4 ++
 5 files changed, 128 insertions(+), 4 deletions(-)

-- 
2.17.1

Re: [PATCH v7 0/9] drm/omap: Add virtual-planes support

2021-12-03 Thread Tomi Valkeinen


On 17/11/2021 16:19, Neil Armstrong wrote:

This patchset is the follow-up the v4 patchset from Benoit Parrot at [1].

This patch series adds virtual-plane support to omapdrm driver to allow the use
of display wider than 2048 pixels.

In order to do so we introduce the concept of hw_overlay which can then be
dynamically allocated to a plane. When the requested output width exceed what
be supported by one overlay a second is then allocated if possible to handle
display wider then 2048.

This series replaces an earlier series which was DT based and using statically
allocated resources.

This implementation is inspired from the work done in msm/disp/mdp5
driver.


I think this looks good. I'll apply this to my work tree to see if I see 
any issues during my daily work, and if not, I'll push to drm-misc-next.


Have you tested this with other platforms than x15? I'm mostly thinking 
about omap3/4, as the DSS is somewhat different on those platforms.


 Tomi

Re: 5.15 regression: CONFIG_SYSFB_SIMPLEFB breaks console scrolling

2021-12-03 Thread Javier Martinez Canillas

Sorry for the late reply.

On 11/21/21 12:47, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker speaking.
> 
> On 16.11.21 05:52, Harald Dunkel wrote:
>>
>> if I enable CONFIG_SYSFB_SIMPLEFB in 5.15.2 and use grub's default
>> configuration
>> (Debian sid amd64), then a few lines at the bottom of /dev/tty1 including
>> login prompt are off-screen. Scrolling is broken. I can login, though.
>>
>> Enabling GRUB_TERMINAL=console in grub doesn't make a difference. Using
>> the same kernel except for CONFIG_SYSFB_SIMPLEFB the problem is gone.
>>
>> Graphics card is a GeForce GTX 1650. I tried with both CONFIG_DRM_NOUVEAU
>> and proprietary graphics drivers disabled.
>>
>> Attached you can find the config file. Please mail if I can help to track
>> this problem down.
> 
> Thx for the report. I'm not totally sure if this is a regression, as
> that's a new config option. But it might be one considered a successor
> to an older one, hence it might count as regression. Adding two
> developers and a mailing list to the CC, hopefully someone can clarify.
>

I don't think this is a regression since enabling CONFIG_SYSFB_SIMPLEFB will
make the simpledrm driver to be bound while disabling the option makes the
efifb driver to be bound instead.
 Yes, it seems to be a bug in the simpledrm driver but the solution if you
have issues with the simpledrm is to not enable CONFIG_SYSFB_SIMPLEFB and
keep using the old fbdev driver.

Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat

[PATCH 11/11] drm/vc4: hdmi: Force YUV422 if the rate is too high

2021-12-03 Thread Maxime Ripard

When using the modes that need the highest pixel rate we support (such
as 4k at 60Hz), using a 10 or 12 bpc output will put us over the limit
of what we can achieve.

In such a case, let's force our output to be YUV422 so that we can go
back down under the required clock rate.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 04eb1ab9dad3..5b8b2688a3f4 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1411,8 +1411,15 @@ vc4_hdmi_encoder_compute_config(struct vc4_hdmi 
*vc4_hdmi,
 vc4_state, mode,
 conn_state->max_bpc,
 format);
-   if (ret)
-   return ret;
+   if (ret) {
+   format = VC4_HDMI_OUTPUT_YUV422;
+   ret = vc4_hdmi_encoder_compute_clock(vc4_hdmi,
+vc4_state, mode,
+conn_state->max_bpc,
+format);
+   if (ret)
+   return ret;
+   }
 
vc4_state->output_format = format;
return ret;
-- 
2.33.1

[PATCH 10/11] drm/vc4: hdmi: Support HDMI YUV output

2021-12-03 Thread Maxime Ripard

In addition to the RGB444 output, the BCM2711 HDMI controller supports
the YUV444 and YUV422 output formats.

Let's add support for them in the driver, but still use RGB as the
preferred format.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c  | 133 +---
 drivers/gpu/drm/vc4/vc4_hdmi.h  |   8 ++
 drivers/gpu/drm/vc4/vc4_hdmi_regs.h |   6 ++
 drivers/gpu/drm/vc4/vc4_regs.h  |  16 
 4 files changed, 153 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 3d649fbc480a..04eb1ab9dad3 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -329,6 +329,7 @@ static void vc4_hdmi_connector_reset(struct drm_connector 
*connector)
 
new_state->base.max_bpc = 8;
new_state->base.max_requested_bpc = 8;
+   new_state->output_format = VC4_HDMI_OUTPUT_RGB;
drm_atomic_helper_connector_tv_reset(connector);
 }
 
@@ -344,6 +345,7 @@ vc4_hdmi_connector_duplicate_state(struct drm_connector 
*connector)
return NULL;
 
new_state->pixel_rate = vc4_state->pixel_rate;
+   new_state->output_format = vc4_state->output_format;
__drm_atomic_helper_connector_duplicate_state(connector, 
_state->base);
 
return _state->base;
@@ -487,11 +489,32 @@ static void vc4_hdmi_write_infoframe(struct drm_encoder 
*encoder,
DRM_ERROR("Failed to wait for infoframe to start: %d\n", ret);
 }
 
+static void vc4_hdmi_avi_infoframe_colorspace(struct hdmi_avi_infoframe *frame,
+ enum vc4_hdmi_output_format fmt)
+{
+   switch (fmt) {
+   case VC4_HDMI_OUTPUT_RGB:
+   fallthrough;
+   case VC4_HDMI_OUTPUT_YUV420:
+   fallthrough;
+   case VC4_HDMI_OUTPUT_YUV422:
+   fallthrough;
+   case VC4_HDMI_OUTPUT_YUV444:
+   frame->colorspace = fmt;
+   break;
+
+   default:
+   break;
+   }
+}
+
 static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
 {
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
struct drm_connector *connector = _hdmi->connector;
struct drm_connector_state *cstate = connector->state;
+   struct vc4_hdmi_connector_state *vc4_state =
+   conn_state_to_vc4_hdmi_conn_state(cstate);
const struct drm_display_mode *mode = _hdmi->saved_adjusted_mode;
union hdmi_infoframe frame;
int ret;
@@ -511,6 +534,7 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder 
*encoder)
   HDMI_QUANTIZATION_RANGE_FULL :
   HDMI_QUANTIZATION_RANGE_LIMITED);
drm_hdmi_avi_infoframe_colorimetry(, cstate);
+   vc4_hdmi_avi_infoframe_colorspace(, vc4_state->output_format);
drm_hdmi_avi_infoframe_bars(, cstate);
 
vc4_hdmi_write_infoframe(encoder, );
@@ -809,6 +833,38 @@ static const u16 
vc5_hdmi_csc_full_rgb_to_limited_rgb[3][4] = {
{ 0x, 0x, 0x1b80, 0x0400 },
 };
 
+/*
+ * Conversion between Full Range RGB and Full Range YUV422 using the
+ * BT.709 Colorspace
+ *
+ * [  0.212639  0.715169  0.072192  0   ]
+ * [ -0.117208 -0.394207  0.511416  128 ]
+ * [  0.511416 -0.464524 -0.046891  128 ]
+ *
+ * Matrix is signed 2p13 fixed point, with signed 9p6 offsets
+ */
+static const u16 vc5_hdmi_csc_full_rgb_to_full_yuv422_bt709[3][4] = {
+   { 0x06ce, 0x16e3, 0x024f, 0x },
+   { 0xfc41, 0xf364, 0x105e, 0x2000 },
+   { 0x105e, 0xf124, 0xfe81, 0x2000 },
+};
+
+/*
+ * Conversion between Full Range RGB and Full Range YUV444 using the
+ * BT.709 Colorspace
+ *
+ * [ -0.117208 -0.394207  0.511416  128 ]
+ * [  0.511416 -0.464524 -0.046891  128 ]
+ * [  0.212639  0.715169  0.072192  0   ]
+ *
+ * Matrix is signed 2p13 fixed point, with signed 9p6 offsets
+ */
+static const u16 vc5_hdmi_csc_full_rgb_to_full_yuv444_bt709[3][4] = {
+   { 0xfc41, 0xf364, 0x105e, 0x2000 },
+   { 0x105e, 0xf124, 0xfe81, 0x2000 },
+   { 0x06ce, 0x16e3, 0x024f, 0x },
+};
+
 static void vc5_hdmi_set_csc_coeffs(struct vc4_hdmi *vc4_hdmi,
const u16 coeffs[3][4])
 {
@@ -826,19 +882,53 @@ static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
   struct drm_connector_state *state,
   const struct drm_display_mode *mode)
 {
+   struct vc4_hdmi_connector_state *vc4_state =
+   conn_state_to_vc4_hdmi_conn_state(state);
unsigned long flags;
+   u32 if_cfg = 0;
+   u32 if_xbar = 0x543210;
+   u32 csc_chan_ctl = 0;
u32 csc_ctl = VC5_MT_CP_CSC_CTL_ENABLE | 
VC4_SET_FIELD(VC4_HD_CSC_CTL_MODE_CUSTOM,
   
VC5_MT_CP_CSC_CTL_MODE);
 
spin_lock_irqsave(_hdmi->hw_lock, flags);
 
-

[PATCH 09/11] drm/vc4: hdmi: Move clock calculation into its own function

2021-12-03 Thread Maxime Ripard

The code to compute our clock rate for a given setup will be called in
multiple places in the next patches, so let's create a separate function
for it.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 49 +++---
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 9952b13eeb6e..3d649fbc480a 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1275,6 +1275,35 @@ vc4_hdmi_encoder_clock_valid(struct vc4_hdmi *vc4_hdmi,
return MODE_OK;
 }
 
+static unsigned long
+vc4_hdmi_encoder_compute_mode_clock(const struct drm_display_mode *mode,
+   unsigned int bpc)
+{
+   unsigned long clock = mode->crtc_clock * 1000;
+
+   if (mode->flags & DRM_MODE_FLAG_DBLCLK)
+   clock = clock * 2;
+
+   return clock * bpc / 8;
+}
+
+static int
+vc4_hdmi_encoder_compute_clock(struct vc4_hdmi *vc4_hdmi,
+  struct vc4_hdmi_connector_state *vc4_state,
+  const struct drm_display_mode *mode,
+  unsigned int bpc)
+{
+   unsigned long clock;
+
+   clock = vc4_hdmi_encoder_compute_mode_clock(mode, bpc);
+   if (vc4_hdmi_encoder_clock_valid(vc4_hdmi, clock) != MODE_OK)
+   return -EINVAL;
+
+   vc4_state->pixel_rate = clock;
+
+   return 0;
+}
+
 #define WIFI_2_4GHz_CH1_MIN_FREQ   24ULL
 #define WIFI_2_4GHz_CH1_MAX_FREQ   242200ULL
 
@@ -1287,6 +1316,7 @@ static int vc4_hdmi_encoder_atomic_check(struct 
drm_encoder *encoder,
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
unsigned long long pixel_rate = mode->clock * 1000;
unsigned long long tmds_rate;
+   int ret;
 
if (vc4_hdmi->variant->unsupported_odd_h_timings &&
((mode->hdisplay % 2) || (mode->hsync_start % 2) ||
@@ -1307,21 +1337,10 @@ static int vc4_hdmi_encoder_atomic_check(struct 
drm_encoder *encoder,
pixel_rate = mode->clock * 1000;
}
 
-   if (conn_state->max_bpc == 12) {
-   pixel_rate = pixel_rate * 150;
-   do_div(pixel_rate, 100);
-   } else if (conn_state->max_bpc == 10) {
-   pixel_rate = pixel_rate * 125;
-   do_div(pixel_rate, 100);
-   }
-
-   if (mode->flags & DRM_MODE_FLAG_DBLCLK)
-   pixel_rate = pixel_rate * 2;
-
-   if (vc4_hdmi_encoder_clock_valid(vc4_hdmi, pixel_rate) != MODE_OK)
-   return -EINVAL;
-
-   vc4_state->pixel_rate = pixel_rate;
+   ret = vc4_hdmi_encoder_compute_clock(vc4_hdmi, vc4_state, mode,
+conn_state->max_bpc);
+   if (ret)
+   return ret;
 
return 0;
 }
-- 
2.33.1

[PATCH 08/11] drm/vc4: hdmi: Move clock validation to its own function

2021-12-03 Thread Maxime Ripard

Our code is doing the same clock rate validation in multiple instances.
Let's create a helper to share the rate validation.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index d79a70bae7f2..9952b13eeb6e 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -1262,6 +1262,19 @@ static void vc4_hdmi_encoder_atomic_mode_set(struct 
drm_encoder *encoder,
mutex_unlock(_hdmi->mutex);
 }
 
+static enum drm_mode_status
+vc4_hdmi_encoder_clock_valid(struct vc4_hdmi *vc4_hdmi,
+unsigned long clock)
+{
+   if (clock > vc4_hdmi->variant->max_pixel_clock)
+   return MODE_CLOCK_HIGH;
+
+   if (vc4_hdmi->disable_4kp60 && clock > HDMI_14_MAX_TMDS_CLK)
+   return MODE_CLOCK_HIGH;
+
+   return MODE_OK;
+}
+
 #define WIFI_2_4GHz_CH1_MIN_FREQ   24ULL
 #define WIFI_2_4GHz_CH1_MAX_FREQ   242200ULL
 
@@ -1305,10 +1318,7 @@ static int vc4_hdmi_encoder_atomic_check(struct 
drm_encoder *encoder,
if (mode->flags & DRM_MODE_FLAG_DBLCLK)
pixel_rate = pixel_rate * 2;
 
-   if (pixel_rate > vc4_hdmi->variant->max_pixel_clock)
-   return -EINVAL;
-
-   if (vc4_hdmi->disable_4kp60 && (pixel_rate > HDMI_14_MAX_TMDS_CLK))
+   if (vc4_hdmi_encoder_clock_valid(vc4_hdmi, pixel_rate) != MODE_OK)
return -EINVAL;
 
vc4_state->pixel_rate = pixel_rate;
@@ -1327,13 +1337,7 @@ vc4_hdmi_encoder_mode_valid(struct drm_encoder *encoder,
 (mode->hsync_end % 2) || (mode->htotal % 2)))
return MODE_H_ILLEGAL;
 
-   if ((mode->clock * 1000) > vc4_hdmi->variant->max_pixel_clock)
-   return MODE_CLOCK_HIGH;
-
-   if (vc4_hdmi->disable_4kp60 && vc4_hdmi_mode_needs_scrambling(mode))
-   return MODE_CLOCK_HIGH;
-
-   return MODE_OK;
+   return vc4_hdmi_encoder_clock_valid(vc4_hdmi, mode->crtc_clock * 1000);
 }
 
 static const struct drm_encoder_helper_funcs vc4_hdmi_encoder_helper_funcs = {
-- 
2.33.1

[PATCH 06/11] drm/vc4: hdmi: Define colorspace matrices

2021-12-03 Thread Maxime Ripard

The current CSC setup code for the BCM2711 uses a sequence of register
writes to configure the CSC depending on whether we output using a full
or limited range.

However, with the upcoming introduction of the YUV output, we're going
to add new matrices to perform the conversions, so we should switch to
something a bit more flexible that takes the matrix as an argument and
programs the CSC accordingly.

Acked-by: Thomas Zimmermann 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 79 +-
 1 file changed, 50 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 682c3c907cbe..7fdb49e790f3 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -775,6 +775,52 @@ static void vc4_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
spin_unlock_irqrestore(_hdmi->hw_lock, flags);
 }
 
+
+/*
+ * If we need to output Full Range RGB, then use the unity matrix
+ *
+ * [ 1  0  0  0]
+ * [ 0  1  0  0]
+ * [ 0  0  1  0]
+ *
+ * Matrix is signed 2p13 fixed point, with signed 9p6 offsets
+ */
+static const u16 vc5_hdmi_csc_full_rgb_unity[3][4] = {
+   { 0x2000, 0x, 0x, 0x },
+   { 0x, 0x2000, 0x, 0x },
+   { 0x, 0x, 0x2000, 0x },
+};
+
+/*
+ * CEA VICs other than #1 require limited range RGB output unless
+ * overridden by an AVI infoframe. Apply a colorspace conversion to
+ * squash 0-255 down to 16-235. The matrix here is:
+ *
+ * [ 0.8594 0  0  16]
+ * [ 0  0.8594 0  16]
+ * [ 0  0  0.8594 16]
+ *
+ * Matrix is signed 2p13 fixed point, with signed 9p6 offsets
+ */
+static const u16 vc5_hdmi_csc_full_rgb_to_limited_rgb[3][4] = {
+   { 0x1b80, 0x, 0x, 0x0400 },
+   { 0x, 0x1b80, 0x, 0x0400 },
+   { 0x, 0x, 0x1b80, 0x0400 },
+};
+
+static void vc5_hdmi_set_csc_coeffs(struct vc4_hdmi *vc4_hdmi,
+   const u16 coeffs[3][4])
+{
+   lockdep_assert_held(_hdmi->hw_lock);
+
+   HDMI_WRITE(HDMI_CSC_12_11, (coeffs[0][1] << 16) | coeffs[0][0]);
+   HDMI_WRITE(HDMI_CSC_14_13, (coeffs[0][3] << 16) | coeffs[0][2]);
+   HDMI_WRITE(HDMI_CSC_22_21, (coeffs[1][1] << 16) | coeffs[1][0]);
+   HDMI_WRITE(HDMI_CSC_24_23, (coeffs[1][3] << 16) | coeffs[1][2]);
+   HDMI_WRITE(HDMI_CSC_32_31, (coeffs[2][1] << 16) | coeffs[2][0]);
+   HDMI_WRITE(HDMI_CSC_34_33, (coeffs[2][3] << 16) | coeffs[2][2]);
+}
+
 static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
   const struct drm_display_mode *mode)
 {
@@ -786,35 +832,10 @@ static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
 
HDMI_WRITE(HDMI_VEC_INTERFACE_XBAR, 0x354021);
 
-   if (!vc4_hdmi_is_full_range_rgb(vc4_hdmi, mode)) {
-   /* CEA VICs other than #1 requre limited range RGB
-* output unless overridden by an AVI infoframe.
-* Apply a colorspace conversion to squash 0-255 down
-* to 16-235.  The matrix here is:
-*
-* [ 0.8594 0  0  16]
-* [ 0  0.8594 0  16]
-* [ 0  0  0.8594 16]
-* [ 0  0  0   1]
-* Matrix is signed 2p13 fixed point, with signed 9p6 offsets
-*/
-   HDMI_WRITE(HDMI_CSC_12_11, (0x << 16) | 0x1b80);
-   HDMI_WRITE(HDMI_CSC_14_13, (0x0400 << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_22_21, (0x1b80 << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_24_23, (0x0400 << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_32_31, (0x << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_34_33, (0x0400 << 16) | 0x1b80);
-   } else {
-   /* Still use the matrix for full range, but make it unity.
-* Matrix is signed 2p13 fixed point, with signed 9p6 offsets
-*/
-   HDMI_WRITE(HDMI_CSC_12_11, (0x << 16) | 0x2000);
-   HDMI_WRITE(HDMI_CSC_14_13, (0x << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_22_21, (0x2000 << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_24_23, (0x << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_32_31, (0x << 16) | 0x);
-   HDMI_WRITE(HDMI_CSC_34_33, (0x << 16) | 0x2000);
-   }
+   if (!vc4_hdmi_is_full_range_rgb(vc4_hdmi, mode))
+   vc5_hdmi_set_csc_coeffs(vc4_hdmi, 
vc5_hdmi_csc_full_rgb_to_limited_rgb);
+   else
+   vc5_hdmi_set_csc_coeffs(vc4_hdmi, vc5_hdmi_csc_full_rgb_unity);
 
HDMI_WRITE(HDMI_CSC_CTL, csc_ctl);
 
-- 
2.33.1

[PATCH 07/11] drm/vc4: hdmi: Change CSC callback prototype

2021-12-03 Thread Maxime Ripard

In order to support the YUV output, we'll need the atomic state to know
what is the state of the associated property in the CSC setup callback.

Let's change the prototype of that callback to allow us to access it.

Acked-by: Thomas Zimmermann 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 7 ++-
 drivers/gpu/drm/vc4/vc4_hdmi.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 7fdb49e790f3..d79a70bae7f2 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -735,6 +735,7 @@ static void vc4_hdmi_encoder_disable(struct drm_encoder 
*encoder)
 }
 
 static void vc4_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
+  struct drm_connector_state *state,
   const struct drm_display_mode *mode)
 {
unsigned long flags;
@@ -822,6 +823,7 @@ static void vc5_hdmi_set_csc_coeffs(struct vc4_hdmi 
*vc4_hdmi,
 }
 
 static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
+  struct drm_connector_state *state,
   const struct drm_display_mode *mode)
 {
unsigned long flags;
@@ -1144,13 +1146,16 @@ static void vc4_hdmi_encoder_pre_crtc_enable(struct 
drm_encoder *encoder,
 struct drm_atomic_state *state)
 {
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
+   struct drm_connector *connector = _hdmi->connector;
struct drm_display_mode *mode = _hdmi->saved_adjusted_mode;
+   struct drm_connector_state *conn_state =
+   drm_atomic_get_new_connector_state(state, connector);
unsigned long flags;
 
mutex_lock(_hdmi->mutex);
 
if (vc4_hdmi->variant->csc_setup)
-   vc4_hdmi->variant->csc_setup(vc4_hdmi, mode);
+   vc4_hdmi->variant->csc_setup(vc4_hdmi, conn_state, mode);
 
spin_lock_irqsave(_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_FIFO_CTL, VC4_HDMI_FIFO_CTL_MASTER_SLAVE_N);
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
index 4a5536975bf6..2b6aaafc020a 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.h
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
@@ -77,6 +77,7 @@ struct vc4_hdmi_variant {
 
/* Callback to enable / disable the CSC */
void (*csc_setup)(struct vc4_hdmi *vc4_hdmi,
+ struct drm_connector_state *state,
  const struct drm_display_mode *mode);
 
/* Callback to configure the video timings in the HDMI block */
-- 
2.33.1

[PATCH 05/11] drm/vc4: hdmi: Replace CSC_CTL hardcoded value by defines

2021-12-03 Thread Maxime Ripard

On BCM2711, the HDMI_CSC_CTL register value has been hardcoded to an
opaque value. Let's replace it with properly defined values.

Acked-by: Thomas Zimmermann 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 5 ++---
 drivers/gpu/drm/vc4/vc4_regs.h | 3 +++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 0f8b1e907fae..682c3c907cbe 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -779,9 +779,8 @@ static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
   const struct drm_display_mode *mode)
 {
unsigned long flags;
-   u32 csc_ctl;
-
-   csc_ctl = 0x07; /* RGB_CONVERT_MODE = custom matrix, || 
USE_RGB_TO_YCBCR */
+   u32 csc_ctl = VC5_MT_CP_CSC_CTL_ENABLE | 
VC4_SET_FIELD(VC4_HD_CSC_CTL_MODE_CUSTOM,
+  
VC5_MT_CP_CSC_CTL_MODE);
 
spin_lock_irqsave(_hdmi->hw_lock, flags);
 
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index 489f921ef44d..952f2aad0785 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -774,6 +774,9 @@ enum {
 # define VC4_HD_CSC_CTL_RGB2YCCBIT(1)
 # define VC4_HD_CSC_CTL_ENABLE BIT(0)
 
+# define VC5_MT_CP_CSC_CTL_ENABLE  BIT(2)
+# define VC5_MT_CP_CSC_CTL_MODE_MASK   VC4_MASK(1, 0)
+
 # define VC4_DVP_HT_CLOCK_STOP_PIXEL   BIT(1)
 
 /* HVS display list information. */
-- 
2.33.1

[PATCH 04/11] drm/vc4: hdmi: Move XBAR setup to csc_setup

2021-12-03 Thread Maxime Ripard

On the BCM2711, the HDMI_VEC_INTERFACE_XBAR register configuration
depends on whether we're using an RGB or YUV output. Let's move that
configuration to the CSC setup.

Acked-by: Thomas Zimmermann 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 47ff4507f017..0f8b1e907fae 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -785,6 +785,8 @@ static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
 
spin_lock_irqsave(_hdmi->hw_lock, flags);
 
+   HDMI_WRITE(HDMI_VEC_INTERFACE_XBAR, 0x354021);
+
if (!vc4_hdmi_is_full_range_rgb(vc4_hdmi, mode)) {
/* CEA VICs other than #1 requre limited range RGB
 * output unless overridden by an AVI infoframe.
@@ -899,7 +901,6 @@ static void vc5_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
 
spin_lock_irqsave(_hdmi->hw_lock, flags);
 
-   HDMI_WRITE(HDMI_VEC_INTERFACE_XBAR, 0x354021);
HDMI_WRITE(HDMI_HORZA,
   (vsync_pos ? VC5_HDMI_HORZA_VPOS : 0) |
   (hsync_pos ? VC5_HDMI_HORZA_HPOS : 0) |
-- 
2.33.1

[PATCH 03/11] drm/vc4: hdmi: Use full range helper in csc functions

2021-12-03 Thread Maxime Ripard

The CSC callbacks takes a boolean as an argument to tell whether we're
using the full range or limited range RGB.

However, with the upcoming YUV support, the logic will be a bit more
complex. In order to address this, let's make the callbacks take the
entire mode, and call our new helper to tell whether the full or limited
range RGB should be used.

Acked-by: Thomas Zimmermann 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 31 +++
 drivers/gpu/drm/vc4/vc4_hdmi.h |  4 ++--
 2 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 7966e3b00332..47ff4507f017 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -490,7 +490,6 @@ static void vc4_hdmi_write_infoframe(struct drm_encoder 
*encoder,
 static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
 {
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
-   struct vc4_hdmi_encoder *vc4_encoder = to_vc4_hdmi_encoder(encoder);
struct drm_connector *connector = _hdmi->connector;
struct drm_connector_state *cstate = connector->state;
const struct drm_display_mode *mode = _hdmi->saved_adjusted_mode;
@@ -508,9 +507,9 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder 
*encoder)
 
drm_hdmi_avi_infoframe_quant_range(,
   connector, mode,
-  vc4_encoder->limited_rgb_range ?
-  HDMI_QUANTIZATION_RANGE_LIMITED :
-  HDMI_QUANTIZATION_RANGE_FULL);
+  vc4_hdmi_is_full_range_rgb(vc4_hdmi, 
mode) ?
+  HDMI_QUANTIZATION_RANGE_FULL :
+  HDMI_QUANTIZATION_RANGE_LIMITED);
drm_hdmi_avi_infoframe_colorimetry(, cstate);
drm_hdmi_avi_infoframe_bars(, cstate);
 
@@ -735,7 +734,8 @@ static void vc4_hdmi_encoder_disable(struct drm_encoder 
*encoder)
mutex_unlock(_hdmi->mutex);
 }
 
-static void vc4_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi, bool enable)
+static void vc4_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
+  const struct drm_display_mode *mode)
 {
unsigned long flags;
u32 csc_ctl;
@@ -745,7 +745,7 @@ static void vc4_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi, 
bool enable)
csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR,
VC4_HD_CSC_CTL_ORDER);
 
-   if (enable) {
+   if (!vc4_hdmi_is_full_range_rgb(vc4_hdmi, mode)) {
/* CEA VICs other than #1 requre limited range RGB
 * output unless overridden by an AVI infoframe.
 * Apply a colorspace conversion to squash 0-255 down
@@ -775,7 +775,8 @@ static void vc4_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi, 
bool enable)
spin_unlock_irqrestore(_hdmi->hw_lock, flags);
 }
 
-static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi, bool enable)
+static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi,
+  const struct drm_display_mode *mode)
 {
unsigned long flags;
u32 csc_ctl;
@@ -784,7 +785,7 @@ static void vc5_hdmi_csc_setup(struct vc4_hdmi *vc4_hdmi, 
bool enable)
 
spin_lock_irqsave(_hdmi->hw_lock, flags);
 
-   if (enable) {
+   if (!vc4_hdmi_is_full_range_rgb(vc4_hdmi, mode)) {
/* CEA VICs other than #1 requre limited range RGB
 * output unless overridden by an AVI infoframe.
 * Apply a colorspace conversion to squash 0-255 down
@@ -1123,22 +1124,12 @@ static void vc4_hdmi_encoder_pre_crtc_enable(struct 
drm_encoder *encoder,
 {
struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
struct drm_display_mode *mode = _hdmi->saved_adjusted_mode;
-   struct vc4_hdmi_encoder *vc4_encoder = to_vc4_hdmi_encoder(encoder);
unsigned long flags;
 
mutex_lock(_hdmi->mutex);
 
-   if (!vc4_hdmi_is_full_range_rgb(vc4_hdmi, mode)) {
-   if (vc4_hdmi->variant->csc_setup)
-   vc4_hdmi->variant->csc_setup(vc4_hdmi, true);
-
-   vc4_encoder->limited_rgb_range = true;
-   } else {
-   if (vc4_hdmi->variant->csc_setup)
-   vc4_hdmi->variant->csc_setup(vc4_hdmi, false);
-
-   vc4_encoder->limited_rgb_range = false;
-   }
+   if (vc4_hdmi->variant->csc_setup)
+   vc4_hdmi->variant->csc_setup(vc4_hdmi, mode);
 
spin_lock_irqsave(_hdmi->hw_lock, flags);
HDMI_WRITE(HDMI_FIFO_CTL, VC4_HDMI_FIFO_CTL_MASTER_SLAVE_N);
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
index 36c0b082a43b..4a5536975bf6 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.h
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
@@ -12,7 +12,6 @@

[PATCH 02/11] drm/vc4: hdmi: Add full range RGB helper

2021-12-03 Thread Maxime Ripard

We're going to need to tell whether we want to run with a full or
limited range RGB output in multiple places in the code, so let's create
a helper that will return whether we need with full range or not.

Acked-by: Thomas Zimmermann 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index be39e55ae113..7966e3b00332 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -104,6 +104,15 @@ static bool vc4_hdmi_mode_needs_scrambling(const struct 
drm_display_mode *mode)
return (mode->clock * 1000) > HDMI_14_MAX_TMDS_CLK;
 }
 
+static bool vc4_hdmi_is_full_range_rgb(struct vc4_hdmi *vc4_hdmi,
+  const struct drm_display_mode *mode)
+{
+   struct vc4_hdmi_encoder *vc4_encoder = _hdmi->encoder;
+
+   return !vc4_encoder->hdmi_monitor ||
+   drm_default_rgb_quant_range(mode) == 
HDMI_QUANTIZATION_RANGE_FULL;
+}
+
 static int vc4_hdmi_debugfs_regs(struct seq_file *m, void *unused)
 {
struct drm_info_node *node = (struct drm_info_node *)m->private;
@@ -1119,8 +1128,7 @@ static void vc4_hdmi_encoder_pre_crtc_enable(struct 
drm_encoder *encoder,
 
mutex_lock(_hdmi->mutex);
 
-   if (vc4_encoder->hdmi_monitor &&
-   drm_default_rgb_quant_range(mode) == 
HDMI_QUANTIZATION_RANGE_LIMITED) {
+   if (!vc4_hdmi_is_full_range_rgb(vc4_hdmi, mode)) {
if (vc4_hdmi->variant->csc_setup)
vc4_hdmi->variant->csc_setup(vc4_hdmi, true);
 
-- 
2.33.1

[PATCH 00/11] drm/vc4: hdmi: Yet Another Approach to HDMI YUV output

2021-12-03 Thread Maxime Ripard

Hi,

This is another attempt at supporting the HDMI YUV output in the vc4 HDMI
driver.

This is a follow-up of
https://lore.kernel.org/dri-devel/20210317154352.732095-1-max...@cerno.tech/

And the discussions that occured recently on the mailing lists and IRC about
this.

The series mentioned above had multiple issues, the main one being that it was
a bit too much complicated for what we wanted to achieve. This series is taking
a much simpler approach with an ad-hoc solution.

I think some parts of it could still be moved to KMS helpers (notably, the
output format enum, and the helper to set the infoframe for it) and structures
(the output format stored in drm_connector_state). This would also interact
nicely with the work done here:

https://lore.kernel.org/dri-devel/2028103814.524670-1-max...@cerno.tech/

This can come as a second step though.

The other issues with the first attempt was that nothing was reported to
userspace about the decision we made about the format, and that this decision
was essentially policy, without any way for the userspace to influence it.

Those two points however are being worked on by Werner in a cross-driver
effort:

https://lore.kernel.org/dri-devel/e452775c-5b95-bbfd-e818-f1480f556...@tuxedocomputers.com/

Since it's a KMS decision, I don't think we should hold off any driver as long
as it's consistent with what the other drivers are doing.

Let me know what you think,
Maxime

Maxime Ripard (11):
  drm/edid: Rename drm_hdmi_avi_infoframe_colorspace to _colorimetry
  drm/vc4: hdmi: Add full range RGB helper
  drm/vc4: hdmi: Use full range helper in csc functions
  drm/vc4: hdmi: Move XBAR setup to csc_setup
  drm/vc4: hdmi: Replace CSC_CTL hardcoded value by defines
  drm/vc4: hdmi: Define colorspace matrices
  drm/vc4: hdmi: Change CSC callback prototype
  drm/vc4: hdmi: Move clock validation to its own function
  drm/vc4: hdmi: Move clock calculation into its own function
  drm/vc4: hdmi: Support HDMI YUV output
  drm/vc4: hdmi: Force YUV422 if the rate is too high

 drivers/gpu/drm/drm_edid.c  |   8 +-
 drivers/gpu/drm/i915/display/intel_hdmi.c   |   2 +-
 drivers/gpu/drm/i915/display/intel_lspcon.c |   2 +-
 drivers/gpu/drm/vc4/vc4_hdmi.c  | 324 +++-
 drivers/gpu/drm/vc4/vc4_hdmi.h  |  13 +-
 drivers/gpu/drm/vc4/vc4_hdmi_regs.h |   6 +
 drivers/gpu/drm/vc4/vc4_regs.h  |  19 ++
 include/drm/drm_edid.h  |   4 +-
 8 files changed, 290 insertions(+), 88 deletions(-)

-- 
2.33.1

[PATCH 01/11] drm/edid: Rename drm_hdmi_avi_infoframe_colorspace to _colorimetry

2021-12-03 Thread Maxime Ripard

The drm_hdmi_avi_infoframe_colorspace() function actually sets the
colorimetry and extended_colorimetry fields in the hdmi_avi_infoframe
structure with DRM_MODE_COLORIMETRY_* values.

To make things worse, the hdmi_avi_infoframe structure also has a
colorspace field used to signal whether an RGB or YUV output is being
used.

Let's remove the inconsistency and allow for the colorspace usage by
renaming the function.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/drm_edid.c  | 8 
 drivers/gpu/drm/i915/display/intel_hdmi.c   | 2 +-
 drivers/gpu/drm/i915/display/intel_lspcon.c | 2 +-
 drivers/gpu/drm/vc4/vc4_hdmi.c  | 2 +-
 include/drm/drm_edid.h  | 4 ++--
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 12893e7be89b..13644dd579b4 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -5891,13 +5891,13 @@ static const u32 hdmi_colorimetry_val[] = {
 #undef ACE
 
 /**
- * drm_hdmi_avi_infoframe_colorspace() - fill the HDMI AVI infoframe
- *   colorspace information
+ * drm_hdmi_avi_infoframe_colorimetry() - fill the HDMI AVI infoframe
+ *   colorimetry information
  * @frame: HDMI AVI infoframe
  * @conn_state: connector state
  */
 void
-drm_hdmi_avi_infoframe_colorspace(struct hdmi_avi_infoframe *frame,
+drm_hdmi_avi_infoframe_colorimetry(struct hdmi_avi_infoframe *frame,
  const struct drm_connector_state *conn_state)
 {
u32 colorimetry_val;
@@ -5916,7 +5916,7 @@ drm_hdmi_avi_infoframe_colorspace(struct 
hdmi_avi_infoframe *frame,
frame->extended_colorimetry = (colorimetry_val >> 2) &
EXTENDED_COLORIMETRY_MASK;
 }
-EXPORT_SYMBOL(drm_hdmi_avi_infoframe_colorspace);
+EXPORT_SYMBOL(drm_hdmi_avi_infoframe_colorimetry);
 
 /**
  * drm_hdmi_avi_infoframe_quant_range() - fill the HDMI AVI infoframe
diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c 
b/drivers/gpu/drm/i915/display/intel_hdmi.c
index 3b5b9e7b05b7..96e508ddc4af 100644
--- a/drivers/gpu/drm/i915/display/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/display/intel_hdmi.c
@@ -730,7 +730,7 @@ intel_hdmi_compute_avi_infoframe(struct intel_encoder 
*encoder,
else
frame->colorspace = HDMI_COLORSPACE_RGB;
 
-   drm_hdmi_avi_infoframe_colorspace(frame, conn_state);
+   drm_hdmi_avi_infoframe_colorimetry(frame, conn_state);
 
/* nonsense combination */
drm_WARN_ON(encoder->base.dev, crtc_state->limited_color_range &&
diff --git a/drivers/gpu/drm/i915/display/intel_lspcon.c 
b/drivers/gpu/drm/i915/display/intel_lspcon.c
index 05d2d750fa53..092a925c6cf5 100644
--- a/drivers/gpu/drm/i915/display/intel_lspcon.c
+++ b/drivers/gpu/drm/i915/display/intel_lspcon.c
@@ -537,7 +537,7 @@ void lspcon_set_infoframes(struct intel_encoder *encoder,
frame.avi.colorspace = HDMI_COLORSPACE_RGB;
 
/* Set the Colorspace as per the HDMI spec */
-   drm_hdmi_avi_infoframe_colorspace(, conn_state);
+   drm_hdmi_avi_infoframe_colorimetry(, conn_state);
 
/* nonsense combination */
drm_WARN_ON(encoder->base.dev, crtc_state->limited_color_range &&
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 053fbaf765ca..be39e55ae113 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -502,7 +502,7 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder 
*encoder)
   vc4_encoder->limited_rgb_range ?
   HDMI_QUANTIZATION_RANGE_LIMITED :
   HDMI_QUANTIZATION_RANGE_FULL);
-   drm_hdmi_avi_infoframe_colorspace(, cstate);
+   drm_hdmi_avi_infoframe_colorimetry(, cstate);
drm_hdmi_avi_infoframe_bars(, cstate);
 
vc4_hdmi_write_infoframe(encoder, );
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 18f6c700f6d0..144c495b99c4 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -401,8 +401,8 @@ drm_hdmi_vendor_infoframe_from_display_mode(struct 
hdmi_vendor_infoframe *frame,
const struct drm_display_mode 
*mode);
 
 void
-drm_hdmi_avi_infoframe_colorspace(struct hdmi_avi_infoframe *frame,
- const struct drm_connector_state *conn_state);
+drm_hdmi_avi_infoframe_colorimetry(struct hdmi_avi_infoframe *frame,
+  const struct drm_connector_state 
*conn_state);
 
 void
 drm_hdmi_avi_infoframe_bars(struct hdmi_avi_infoframe *frame,
-- 
2.33.1

Re: [PATCH] drm/msm: Initialize MDSS irq domain at probe time

2021-12-03 Thread AngeloGioacchino Del Regno


Il 01/12/21 21:20, Dmitry Baryshkov ha scritto:

Since commit 8f59ee9a570c ("drm/msm/dsi: Adjust probe order"), the
DSI host gets initialized earlier, but this caused unability to probe
the entire stack of components because they all depend on interrupts
coming from the main `mdss` node (mdp5, or dpu1).

To fix this issue, move mdss device initialization (which include irq
domain setup) to msm_mdev_probe() time, as to make sure that the
interrupt controller is available before dsi and/or other components try
to initialize, finally satisfying the dependency.

Fixes: 8f59ee9a570c ("drm/msm/dsi: Adjust probe order")
Co-Developed-By: AngeloGioacchino Del Regno 

Signed-off-by: Dmitry Baryshkov 
---

When checking your patch, I noticed that IRQ domain is created before
respective MDSS clocks are enabled. This does not look like causing any
issues at this time, but it did not look good. So I started moving
clocks parsing to early_init() callbacks. And at some point it looked
like we can drop the init()/destroy() callbacks in favour of
early_init() and remove(). Which promted me to move init()/destroy() in
place of early_init()/remove() with few minor fixes here and there.




Hey Dmitry,
I wanted to make the least amount of changes to Rob's logic... I know that
the clocks aren't up before registering the domain, but my logic was implying
that if the handlers aren't registered, then there's no interrupt coming, hence
no risk of getting issues. Same if the hardware is down, you can't get any
interrupt, because it can't generate any (but if bootloader leaves it up.. eh.)

I recognize that such approach is "fragile enough", lastly, I agree with this
patch which is, in the end, something that is in the middle between my first
and last approach.

I've tested this one on trogdor-lazor-limozeen and seems to be working in an
analogous way to my v2/v3, so on my side it's validated.


Let's go for this one!
How do we proceeed now? Are you sending a new series with the new patches, or
should I?

Also, I don't think this is relevant, since I'm in co-dev, but in case it is:
Tested-by: AngeloGioacchino Del Regno 

P.S.: Sorry for the 1-day delay, got busy with other tasks!

Re: 5.15 regression: CONFIG_SYSFB_SIMPLEFB breaks console scrolling

2021-12-03 Thread Thorsten Leemhuis

On 25.11.21 13:00, Thorsten Leemhuis wrote:
> On 21.11.21 12:47, Thorsten Leemhuis wrote:
>> Hi, this is your Linux kernel regression tracker speaking.
> /me again

and again :-D

>> On 16.11.21 05:52, Harald Dunkel wrote:
>>>
>>> if I enable CONFIG_SYSFB_SIMPLEFB in 5.15.2 and use grub's default
>>> configuration
>>> (Debian sid amd64), then a few lines at the bottom of /dev/tty1 including
>>> login prompt are off-screen. Scrolling is broken. I can login, though.
>>>
>>> Enabling GRUB_TERMINAL=console in grub doesn't make a difference. Using
>>> the same kernel except for CONFIG_SYSFB_SIMPLEFB the problem is gone.
>>>
>>> Graphics card is a GeForce GTX 1650. I tried with both CONFIG_DRM_NOUVEAU
>>> and proprietary graphics drivers disabled.
>>>
>>> Attached you can find the config file. Please mail if I can help to track
>>> this problem down.
>>
>> Thx for the report. I'm not totally sure if this is a regression, as
>> that's a new config option. But it might be one considered a successor
>> to an older one, hence it might count as regression. Adding two
>> developers and a mailing list to the CC, hopefully someone can clarify.
> 
> Javier, I'd be interested in your option on this.
> 
> Harald, did you have CONFIG_X86_SYSFB enabled in earlier kernel versions
> (and did console scrolling work then)? The answer would help me to
> decide if this a regression, as those ideally should be fixed as quickly
> as possible.

Harald, a reply would have been very helpful; I wonder if Javier is
waiting for the answer, too. If I get none in the next few days I guess
I'll drop this from the list of tracked regressions.

Ciao, Thorsten

#regzbot ignore-activity

>> TWIMC: To be sure this issue doesn't fall through the cracks unnoticed,
>> I'm adding it to regzbot, my Linux kernel regression tracking bot:
>>
>> #regzbot ^introduced 8633ef82f101c040427b57d4df7b706261420b94
>> #regzbot title CONFIG_SYSFB_SIMPLEFB breaks console scrolling
>> #regzbot ignore-activity
>>
>> Ciao, Thorsten, your Linux kernel regression tracker.
>>
>> P.S.: If you want to know more about regzbot, check out its
>> web-interface, the getting start guide, and/or the references documentation:
>>
>> https://linux-regtracking.leemhuis.info/regzbot/
>> https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
>> https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md
>>
>> The last two documents will explain how you can interact with regzbot
>> yourself if your want to.
>>
>> Hint for the reporter: when reporting a regression it's in your interest
>> to tell #regzbot about it in the report, as that will ensure the
>> regression gets on the radar of regzbot and the regression tracker.
>> That's in your interest, as they will make sure the report won't fall
>> through the cracks unnoticed.
>>
>> Hint for developers: you normally don't need to care about regzbot, just
>> fix the issue as you normally would. Just remember to include a 'Link:'
>> tag to the report in the commit message, as explained in
>> Documentation/process/submitting-patches.rst
>> That aspect was recently was made more explicit in commit 1f57bd42b77c:
>> https://git.kernel.org/linus/1f57bd42b77c
> 
> #regzbot title drm: CONFIG_SYSFB_SIMPLEFB breaks console scrolling
> 
> 
> 
>

Re: [PATCH v7 6/6] drm/sprd: add Unisoc's drm mipi dsi driver

2021-12-03 Thread Maxime Ripard

On Mon, Oct 25, 2021 at 05:34:18PM +0800, Kevin Tang wrote:
> @@ -618,9 +619,25 @@ static void sprd_crtc_mode_set_nofb(struct drm_crtc 
> *crtc)
>  {
>   struct sprd_dpu *dpu = to_sprd_crtc(crtc);
>   struct drm_display_mode *mode = >state->adjusted_mode;
> + struct drm_encoder *encoder;
> + struct mipi_dsi_device *slave;
> + struct sprd_dsi *dsi;
>  
>   drm_display_mode_to_videomode(mode, >ctx.vm);
>  
> + drm_for_each_encoder(encoder, crtc->dev) {
> + if (encoder->crtc != crtc)
> + continue;

encoder->crtc is deprecated. You should be using
encoder->drm_for_each_encoder_mask, using the encoder_mask in
encoder->drm_crtc_state.

> +static int sprd_dsi_probe(struct platform_device *pdev)
> +{
> + struct device *dev = >dev;
> + struct sprd_dsi *dsi;
> +
> + dsi = devm_kzalloc(dev, sizeof(*dsi), GFP_KERNEL);
> + if (!dsi)
> + return -ENOMEM;
> +
> + dev_set_drvdata(dev, dsi);
> +
> + dsi->host.ops = _dsi_host_ops;
> + dsi->host.dev = dev;
> + mipi_dsi_host_register(>host);
> +
> + return component_add(>dev, _component_ops);

component_add must be run in the mipi_dsi_host.attach hook.

Maxime


signature.asc
Description: PGP signature

1 2 >

1 - 100 of 121 matches

Mail list logo