Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Thomas Hellström
On Fri, 2021-09-10 at 19:03 +0200, Christian König wrote:
> Am 10.09.21 um 17:30 schrieb Thomas Hellström:
> > On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:
> > > 
> > > Am 10.09.21 um 15:15 schrieb Thomas Hellström:
> > > > Both the provider (resource manager) and the consumer (the TTM
> > > > driver)
> > > > want to subclass struct ttm_resource. Since this is left for
> > > > the
> > > > resource
> > > > manager, we need to provide a private pointer for the TTM
> > > > driver.
> > > > 
> > > > Provide a struct ttm_resource_private for the driver to
> > > > subclass
> > > > for
> > > > data with the same lifetime as the struct ttm_resource: In the
> > > > i915
> > > > case
> > > > it will, for example, be an sg-table and radix tree into the
> > > > LMEM
> > > > /VRAM pages that currently are awkwardly attached to the GEM
> > > > object.
> > > > 
> > > > Provide an ops structure for associated ops (Which is only
> > > > destroy() ATM)
> > > > It might seem pointless to provide a separate ops structure,
> > > > but
> > > > Linus
> > > > has previously made it clear that that's the norm.
> > > > 
> > > > After careful audit one could perhaps also on a per-driver
> > > > basis
> > > > replace the delete_mem_notify() TTM driver callback with the
> > > > above
> > > > destroy function.
> > > Well this is a really big NAK to this approach.
> > > 
> > > If you need to attach some additional information to the resource
> > > then
> > > implement your own resource manager like everybody else does.
> > Well this was the long discussion we had back then when the
> > resource
> > managers started to derive from struct resource and I was under
> > the
> > impression that we had come to an agreement about the different
> > use-
> > cases here, and this was my main concern.
> 
> Ok, then we somehow didn't understand each other.
> 
> > I mean, it's a pretty big layer violation to do that for this use-
> > case.
> 
> Well exactly that's the point. TTM should not have a layer design in
> the 
> first place.
> 
> Devices, BOs, resources etc.. are base classes which should implement
> a 
> base functionality which is then extended by the drivers to implement
> the driver specific functionality.
> 
> That is a component based approach, and not layered at all.
> 
> > The TTM resource manager doesn't want to know about this data at
> > all,
> > it's private to the ttm resource user layer and the resource
> > manager
> > works perfectly well without it. (I assume the other drivers that
> > implement their own resource managers need the data that the
> > subclassing provides?)
> 
> Yes, that's exactly why we have the subclassing.
> 
> > The fundamental problem here is that there are two layers wanting
> > to
> > subclass struct ttm_resource. That means one layer gets to do that,
> > the
> > second gets to use a private pointer, (which in turn can provide
> > yet
> > another private pointer to a potential third layer). With your
> > suggestion, the second layer instead is forced to subclass each
> > subclassed instance that the first layer provides?
> 
> Well completely drop the layer approach/thinking here.
> 
> The resource is an object with a base class. The base class
> implements 
> the interface TTM needs to handle the object, e.g.
> create/destroy/debug 
> etc...
> 
> Then we need to subclass this object because without any additional 
> information the object is pretty pointless.
> 
> One possibility for this is to use the range manager to implement 
> something drm_mm based. BTW: We should probably rename that to
> something 
> like ttm_res_drm_mm or similar.

Sure I'm all in on that, but my point is this becomes pretty awkward
because the reusable code already subclasses struct ttm_resource. Let
me give you an example:

Prereqs:
1) We want to be able to re-use resource manager implementations among
drivers.
2) A driver might want to re-use multiple implementations and have
identical data "struct i915_data" attached to both

With your suggestion that combination of prereqs would look like:

struct i915_resource {
	/* Reason why we subclass */
	struct i915_data my_data;

	/*
	 * Uh this is awkward. We need to do this because these
	 * already subclassed struct ttm_resource.
	 */
	struct ttm_resource *resource;
	union {
		struct ttm_range_mgr_node range;
		struct i915_ttm_buddy_resource buddy;
	};
};

And I can't make it look like

struct i915_resource {
	struct i915_data my_data;
	struct ttm_resource *resource;
};

Without that private back pointer.
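
(For reference, the private back pointer from the RFC would roughly look like
the sketch below; only struct ttm_resource_private and the destroy() op are
named in the patch description, the rest of the naming is illustrative.)

struct ttm_resource_private;

struct ttm_resource_private_ops {
	void (*destroy)(struct ttm_resource_private *priv);
};

struct ttm_resource_private {
	const struct ttm_resource_private_ops *ops;
};

/* struct ttm_resource would then carry a driver-owned priv pointer, so
 * i915 can subclass the private data instead of the resource itself:
 */
struct i915_resource_private {
	struct ttm_resource_private base;
	struct i915_data my_data; /* sg-table, radix tree into LMEM/VRAM pages */
};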
 
But what I'd *really* want is:

struct i915_resource {
	struct i915_data my_data;
	struct ttm_resource resource;
};

This would be identical to how we subclass a struct ttm_buffer_object
or a struct ttm_tt. But it can't look like this because then we can't
reuse existing implementations that *already subclass* struct
ttm_resource.

[PATCH v2] drm/rockchip: cdn-dp-core: Fix cdn_dp_resume unused warning

2021-09-10 Thread Palmer Dabbelt
From: Palmer Dabbelt 

cdn_dp_resume is only used under PM_SLEEP, and now that it's static an
unused function warning is triggered under !PM_SLEEP.  This marks the
function as possibly unused, to avoid triggering compiler warnings.

Fixes: 7c49abb4c2f8 ("drm/rockchip: cdn-dp-core: Make 
cdn_dp_core_suspend/resume static")
Reviewed-by: Geert Uytterhoeven 
Signed-off-by: Palmer Dabbelt 
---
This is breaking my builds and looks like it'll land after -rc1, so I've put it
on a shared tag for-rockchip-cdn_dp_resume-v2 which will let me pull it in to
my fixes.  LMK if you guys want me to send this up on my own, but I'm assuming
that the drm/rockchip folks will handle it.
---
 drivers/gpu/drm/rockchip/cdn-dp-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c 
b/drivers/gpu/drm/rockchip/cdn-dp-core.c
index 8ab3247dbc4a..13c6b857158f 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -1123,7 +1123,7 @@ static int cdn_dp_suspend(struct device *dev)
return ret;
 }
 
-static int cdn_dp_resume(struct device *dev)
+static __maybe_unused int cdn_dp_resume(struct device *dev)
 {
struct cdn_dp_device *dp = dev_get_drvdata(dev);
 
-- 
2.33.0.309.g3052b89438-goog



Intel UHD resolutions

2021-09-10 Thread Randy Dunlap

Hi,

I would like to use QHD resolution (2560x1440) with my shiny new
computer and display. That resolution works if I boot Windows 10
(cough).

What do I need to do to use that resolution in Linux?

I first tried openSUSE 15.3 (kernel 5.3.18-59.19-default)
then I built a v5.14 kernel and tried that.
Both of them max out at FHD (1920x1080).

I am booting with "i915.force_probe=4c8a" on the kernel command line.

My desktop is XFCE4.

CPU is:
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 167
model name  : 11th Gen Intel(R) Core(TM) i9-11900 @ 2.50GHz
stepping: 1
microcode   : 0x40
cpu MHz : 1021.742
cache size  : 16384 KB
physical id : 0
siblings: 16

with an H470 chipset. (ASRock DeskMini H470)

00:02.0 VGA compatible controller: Intel Corporation RocketLake-S GT1 [UHD 
Graphics 750] (rev 04)

or verbose:

00:02.0 VGA compatible controller: Intel Corporation RocketLake-S GT1 [UHD 
Graphics 750] (rev 04) (prog-if 00 [VGA controller])
Subsystem: ASRock Incorporation Device 4c8a
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- 
Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- 
TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, O
BFF Not Supported
 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, 
OBFF Disabled
 AtomicOpsCtl: ReqEn-
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
Address: fee00018  Data: 
Masking:   Pending: 
Capabilities: [d0] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot
-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [100 v1] Process Address Space ID (PASID)
PASIDCap: Exec- Priv-, Max PASID Width: 14
PASIDCtl: Enable- Exec- Priv-
Capabilities: [200 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable-, Smallest Translation Unit: 00
Capabilities: [300 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 8000, Page Request Allocation: 

Kernel driver in use: i915
Kernel modules: i915


thanks.
--
~Randy



[PATCH] drm/i915: fix odd_ptr_err.cocci warnings

2021-09-10 Thread kernel test robot
From: kernel test robot 

drivers/gpu/drm/i915/display/intel_dpt.c:145:6-12: inconsistent IS_ERR and 
PTR_ERR on line 146.

 PTR_ERR should access the value just tested by IS_ERR

Semantic patch information:
 There can be false positives in the patch case, where it is the call to
 IS_ERR that is wrong.

Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci

CC: Maarten Lankhorst 
Reported-by: kernel test robot 
Signed-off-by: kernel test robot 
---

url:
https://github.com/0day-ci/linux/commits/Maarten-Lankhorst/drm-i915-Add-ww-context-to-intel_dpt_pin/20210910-162231
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
:: branch date: 17 hours ago
:: commit date: 17 hours ago

 intel_dpt.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -143,7 +143,7 @@ struct i915_vma *intel_dpt_pin(struct i9
i915_vma_unpin(vma);
 
if (IS_ERR(iomem)) {
-   err = PTR_ERR(vma);
+   err = PTR_ERR(iomem);
continue;
}
 


Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Yokoyama, Caz
On Fri, 2021-09-10 at 14:52 -0700, Lucas De Marchi wrote:
> On Fri, Sep 10, 2021 at 09:14:37PM +, Yokoyama, Caz wrote:
> > On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote:
> > > On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:
> > > > We shouldn't be using debugfs_ namespace for this
> > > > functionality.
> > > > Rename
> > > > debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
> > > > functions, defines and structs follow suit.
> > > > 
> > > > Signed-off-by: Lucas De Marchi 
> > > > ---
> > > > drivers/gpu/drm/i915/Makefile  |  2 +-
> > > > drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -
> > > > 
> > > > -
> > > > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
> > > > .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
> > > > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14
> > > > ++
> > > > 5 files changed, 19 insertions(+), 19 deletions(-)
> > > > delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c =>
> > > > intel_gt_pm_debugfs.c} (99%)
> > > > create mode 100644
> > > > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/Makefile
> > > > b/drivers/gpu/drm/i915/Makefile
> > > > index 232c9673a2e5..dd656f2d7721 100644
> > > > --- a/drivers/gpu/drm/i915/Makefile
> > > > +++ b/drivers/gpu/drm/i915/Makefile
> > > > @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
> > > > 
> > > > # "Graphics Technology" (aka we talk to the gpu)
> > > > gt-y += \
> > > > -   gt/debugfs_gt_pm.o \
> > > > gt/gen2_engine_cs.o \
> > > > gt/gen6_engine_cs.o \
> > > > gt/gen6_ppgtt.o \
> > > > @@ -103,6 +102,7 @@ gt-y += \
> > > > gt/intel_gt_engines_debugfs.o \
> > > > gt/intel_gt_irq.o \
> > > > gt/intel_gt_pm.o \
> > > > +   gt/intel_gt_pm_debugfs.o \
> > > > gt/intel_gt_pm_irq.o \
> > > > gt/intel_gt_requests.o \
> > > > gt/intel_gtt.o \
> > > > diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > deleted file mode 100644
> > > > index 4cf5f5c9da7d..
> > > > --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > +++ /dev/null
> > > > @@ -1,14 +0,0 @@
> > > > -/* SPDX-License-Identifier: MIT */
> > > > -/*
> > > > - * Copyright © 2019 Intel Corporation
> > > > - */
> > > > -
> > > > -#ifndef DEBUGFS_GT_PM_H
> > > > -#define DEBUGFS_GT_PM_H
> > > > -
> > > > -struct intel_gt;
> > > > -struct dentry;
> > > > -
> > > > -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry
> > > > *root);
> > > > -
> > > > -#endif /* DEBUGFS_GT_PM_H */
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > index e5d173c235a3..4096ee893b69 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > @@ -5,10 +5,10 @@
> > > > 
> > > > #include 
> > > > 
> > > > -#include "debugfs_gt_pm.h"
> > > > #include "i915_drv.h"
> > > > #include "intel_gt_debugfs.h"
> > > > #include "intel_gt_engines_debugfs.h"
> > > > +#include "intel_gt_pm_debugfs.h"
> > Why locate here? Why not just replace debugfs_gt_pm.h? Compile
> > error?
> 
> are you asking why I moved the include? Because sorting them
> alphabetically avoids big messes in these includes
As a patch, it is easier to review when the - and + lines are side by side.
Anyway, I honor and respect your decision.
-caz

> 
> Lucas De Marchi
> 
> > -caz
> > 
> > > > #include "intel_sseu_debugfs.h"
> > > > #include "uc/intel_uc_debugfs.h"
> > > > 
> > > > @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct
> > > > intel_gt
> > > > *gt)
> > > > return;
> > > > 
> > > > intel_gt_engines_register_debugfs(gt, root);
> > > > -   debugfs_gt_pm_register(gt, root);
> > > > +   intel_gt_pm_register_debugfs(gt, root);
> > > 
> > > This is one case I usually don't know what convention to follow
> > > since
> > > it
> > > changes in different places.
> > > 
> > > I did it like _register_debugfs because of calls like
> > > intel_gt_init_scratch(), xxx_init_hw, etc. However here I see
> > > that
> > > just
> > > below we have intel_sseu_debugfs_register(), so maybe I should
> > > consider
> > > debugfs as part of the namespace?
> > > 
> > > Lucas De Marchi


Re: [Intel-gfx] [PATCH v9 15/17] drm/i915/pxp: add pxp debugfs

2021-09-10 Thread Teres Alexis, Alan Previn
Reviewed-by: Alan Previn  

..alan

On Fri, 2021-09-10 at 08:36 -0700, Daniele Ceraolo Spurio wrote:
> 2 debugfs files, one to query the current status of the pxp session and one
> to trigger an invalidation for testing.
> 
> v2: rename debugfs, fix date (Alan)
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Reviewed-by : Alan Previn 
> ---
>  drivers/gpu/drm/i915/Makefile|  1 +
>  drivers/gpu/drm/i915/gt/debugfs_gt.c |  2 +
>  drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 78 
>  drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h | 21 ++
>  4 files changed, 102 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 366e82cec44d..b46474ee1a1f 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -285,6 +285,7 @@ i915-y += i915_perf.o
>  i915-$(CONFIG_DRM_I915_PXP) += \
>   pxp/intel_pxp.o \
>   pxp/intel_pxp_cmd.o \
> + pxp/intel_pxp_debugfs.o \
>   pxp/intel_pxp_irq.o \
>   pxp/intel_pxp_pm.o \
>   pxp/intel_pxp_session.o \
> diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt.c 
> b/drivers/gpu/drm/i915/gt/debugfs_gt.c
> index 591eb60785db..c27847ddb796 100644
> --- a/drivers/gpu/drm/i915/gt/debugfs_gt.c
> +++ b/drivers/gpu/drm/i915/gt/debugfs_gt.c
> @@ -9,6 +9,7 @@
>  #include "debugfs_gt.h"
>  #include "debugfs_gt_pm.h"
>  #include "intel_sseu_debugfs.h"
> +#include "pxp/intel_pxp_debugfs.h"
>  #include "uc/intel_uc_debugfs.h"
>  #include "i915_drv.h"
>  
> @@ -28,6 +29,7 @@ void debugfs_gt_register(struct intel_gt *gt)
>   intel_sseu_debugfs_register(gt, root);
>  
>   intel_uc_debugfs_register(>->uc, root);
> + intel_pxp_debugfs_register(>->pxp, root);
>  }
>  
>  void intel_gt_debugfs_register_files(struct dentry *root,
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
> new file mode 100644
> index ..cbb1853676cc
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
> @@ -0,0 +1,78 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#include 
> +#include 
> +
> +#include "gt/debugfs_gt.h"
> +#include "pxp/intel_pxp.h"
> +#include "pxp/intel_pxp_irq.h"
> +#include "i915_drv.h"
> +
> +static int pxp_info_show(struct seq_file *m, void *data)
> +{
> + struct intel_pxp *pxp = m->private;
> + struct drm_printer p = drm_seq_file_printer(m);
> + bool enabled = intel_pxp_is_enabled(pxp);
> +
> + if (!enabled) {
> + drm_printf(&p, "pxp disabled\n");
> + return 0;
> + }
> +
> + drm_printf(&p, "active: %s\n", yesno(intel_pxp_is_active(pxp)));
> + drm_printf(&p, "instance counter: %u\n", pxp->key_instance);
> +
> + return 0;
> +}
> +DEFINE_GT_DEBUGFS_ATTRIBUTE(pxp_info);
> +
> +static int pxp_terminate_get(void *data, u64 *val)
> +{
> + /* nothing to read */
> + return -EPERM;
> +}
> +
> +static int pxp_terminate_set(void *data, u64 val)
> +{
> + struct intel_pxp *pxp = data;
> + struct intel_gt *gt = pxp_to_gt(pxp);
> +
> + if (!intel_pxp_is_active(pxp))
> + return -ENODEV;
> +
> + /* simulate a termination interrupt */
> + spin_lock_irq(>->irq_lock);
> + intel_pxp_irq_handler(pxp, 
> GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT);
> + spin_unlock_irq(>->irq_lock);
> +
> + if (!wait_for_completion_timeout(&pxp->termination,
> +  msecs_to_jiffies(100)))
> + return -ETIMEDOUT;
> +
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(pxp_terminate_fops, pxp_terminate_get, 
> pxp_terminate_set, "%llx\n");
> +void intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry 
> *gt_root)
> +{
> + static const struct debugfs_gt_file files[] = {
> + { "info", &pxp_info_fops, NULL },
> + { "terminate_state", &pxp_terminate_fops, NULL },
> + };
> + struct dentry *root;
> +
> + if (!gt_root)
> + return;
> +
> + if (!HAS_PXP((pxp_to_gt(pxp)->i915)))
> + return;
> +
> + root = debugfs_create_dir("pxp", gt_root);
> + if (IS_ERR(root))
> + return;
> +
> + intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), pxp);
> +}
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> new file mode 100644
> index ..7e0c3d2f5d7e
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#ifndef __INTEL_PXP_DEBUGFS_H__
> +#define __INTEL_PXP_DEBUGFS_H__
> +
> +struct intel_pxp;
> +struct dentry;
> +
> +#ifdef CONFIG_DRM_I915_PXP
> +void intel_pxp_debugfs_register(struct intel_

Re: [PATCH] drm/rockchip: Update crtc fixup to account for fractional clk change

2021-09-10 Thread Chris Morgan
On Wed, Sep 08, 2021 at 09:05:52PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 08, 2021 at 08:53:56AM -0500, Chris Morgan wrote:
> > From: Chris Morgan 
> > 
> > After commit 928f9e268611 ("clk: fractional-divider: Hide
> > clk_fractional_divider_ops from wide audience") was merged it appears
> > that the DSI panel on my Odroid Go Advance stopped working. Upon closer
> > examination of the problem, it looks like it was the fixup in the
> > rockchip_drm_vop.c file was causing the issue. The changes made to the
> > clk driver appear to change some assumptions made in the fixup.
> > 
> > After debugging the working 5.14 kernel and the no-longer working
> > 5.15 kernel, it looks like this was broken all along but still
> > worked, whereas after the fractional clock change it stopped
> > working despite the issue (it went from sort-of broken to very broken).
> > 
> > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to
> > 17000999 on my board. The clock driver was taking the value of the
> > parent clock and attempting to divide the requested value from it
> > (17000000/17000999 = 0), then subtracting 1 from it (making it -1),
> > and running it through fls_long to get 64. It would then subtract
> > the value of fd->mwidth from it to get 48, and then bit shift
> > 17000999 to the left by 48, coming up with a very large number of
> > 7649082492112076800. This resulted in a numerator of 65535 and a
> > denominator of 1 from the clk driver. The driver seemingly would
> > try again and get a correct 1:1 value later, and then move on.
> > 
> > Output from my 5.14 kernel (with some printfs for good measure):
> > [2.830066] rockchip-drm display-subsystem: bound ff46.vop (ops 
> > vop_component_ops)
> > [2.839431] rockchip-drm display-subsystem: bound ff45.dsi (ops 
> > dw_mipi_dsi_rockchip_ops)
> > [2.855980] Clock is dclk_vopb_frac
> > Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent 
> > Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16
> > [2.903529] Clock is dclk_vopb_frac
> > [2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 
> > 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16
> > [2.903579] Clock is dclk_vopb_frac
> > [2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 
> > 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16
> > 
> > Contrast this with 5.15 after the clk change where the rate of 17000999
> > was getting passed and resulted in numerators/denominators of 17001/
> > 17000.
> > 
> > Output from my 5.15 kernel (with some printfs added for good measure):
> > [2.817571] rockchip-drm display-subsystem: bound ff46.vop (ops 
> > vop_component_ops)
> > [2.826975] rockchip-drm display-subsystem: bound ff45.dsi (ops 
> > dw_mipi_dsi_rockchip_ops)
> > [2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, 
> > Best Denominator 17017
> > [2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, 
> > Best Denominator 17000
> > [2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, 
> > Best Denominator 17000
> > [2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, 
> > Best Denominator 17000
> > 
> > After tracing through the code it appeared that this function here was
> > adding a 999 to the requested frequency because of how the clk driver
> > was rounding/accepting those frequencies. I believe after the changes
> > made in the commit listed above the assumptions listed in this driver
> > are no longer true. When I remove the + 999 from the driver the DSI
> > panel begins to work again.
> > 
> > Output from my 5.15 kernel with 999 removed (printfs added):
> > [2.852054] rockchip-drm display-subsystem: bound ff46.vop (ops 
> > vop_component_ops)
> > [2.864483] rockchip-drm display-subsystem: bound ff45.dsi (ops 
> > dw_mipi_dsi_rockchip_ops)
> > [2.880869] Clock is dclk_vopb_frac
> > [2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best 
> > Denominator 1
> > [2.928521] Clock is dclk_vopb_frac
> > [2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best 
> > Denominator 1
> > [2.928570] Clock is dclk_vopb_frac
> > [2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best 
> > Denominator 1
> > 
> > I have tested the change extensively on my Odroid Go Advance (Rockchip
> > RK3326) and it appears to work well. However, this change will affect
> > all Rockchip SoCs that use this driver so I believe further testing
> > is warranted. Please note that without this change I can confirm
> > at least all PX30s with DSI panels will stop working with the 5.15
> > kernel.
> 
> To me it all makes a lot of sense, thank you for deep analysis of the issue!
> In any case I think we will need a Fixes tag to something (either one of
> clk-fractional-divider.c series or preexisted).

Would this work for a 

Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Lucas De Marchi

On Fri, Sep 10, 2021 at 09:14:37PM +, Yokoyama, Caz wrote:

On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote:

On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:
> We shouldn't be using debugfs_ namespace for this functionality.
> Rename
> debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
> functions, defines and structs follow suit.
>
> Signed-off-by: Lucas De Marchi 
> ---
> drivers/gpu/drm/i915/Makefile  |  2 +-
> drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -
> -
> drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
> .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
> drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14
> ++
> 5 files changed, 19 insertions(+), 19 deletions(-)
> delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c =>
> intel_gt_pm_debugfs.c} (99%)
> create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile
> b/drivers/gpu/drm/i915/Makefile
> index 232c9673a2e5..dd656f2d7721 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>
> # "Graphics Technology" (aka we talk to the gpu)
> gt-y += \
> -  gt/debugfs_gt_pm.o \
>gt/gen2_engine_cs.o \
>gt/gen6_engine_cs.o \
>gt/gen6_ppgtt.o \
> @@ -103,6 +102,7 @@ gt-y += \
>gt/intel_gt_engines_debugfs.o \
>gt/intel_gt_irq.o \
>gt/intel_gt_pm.o \
> +  gt/intel_gt_pm_debugfs.o \
>gt/intel_gt_pm_irq.o \
>gt/intel_gt_requests.o \
>gt/intel_gtt.o \
> diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> deleted file mode 100644
> index 4cf5f5c9da7d..
> --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> +++ /dev/null
> @@ -1,14 +0,0 @@
> -/* SPDX-License-Identifier: MIT */
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#ifndef DEBUGFS_GT_PM_H
> -#define DEBUGFS_GT_PM_H
> -
> -struct intel_gt;
> -struct dentry;
> -
> -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry
> *root);
> -
> -#endif /* DEBUGFS_GT_PM_H */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> index e5d173c235a3..4096ee893b69 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> @@ -5,10 +5,10 @@
>
> #include 
>
> -#include "debugfs_gt_pm.h"
> #include "i915_drv.h"
> #include "intel_gt_debugfs.h"
> #include "intel_gt_engines_debugfs.h"
> +#include "intel_gt_pm_debugfs.h"

Why locate here? Why not just replace debugfs_gt_pm.h? Compile error?


are you asking why I moved the include? Because sorting them
alphabetically avoids big messes in these includes

Lucas De Marchi


-caz


> #include "intel_sseu_debugfs.h"
> #include "uc/intel_uc_debugfs.h"
>
> @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt
> *gt)
>return;
>
>intel_gt_engines_register_debugfs(gt, root);
> -  debugfs_gt_pm_register(gt, root);
> +  intel_gt_pm_register_debugfs(gt, root);

This is one case I usually don't know what convention to follow since
it
changes in different places.

I did it like _register_debugfs because of calls like
intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that
just
below we have intel_sseu_debugfs_register(), so maybe I should
consider
debugfs as part of the namespace?

Lucas De Marchi


Re: [PATCH 2/2] drm/msm/dpu: Fix timeout issues on command mode panels

2021-09-10 Thread Marijn Suijten
Hi Angelo!

On 2021-09-01 19:43:47, AngeloGioacchino Del Regno wrote:
> In function dpu_encoder_phys_cmd_wait_for_commit_done we are always
> checking if the relative CTL is started by waiting for an interrupt
> to fire: it is fine to do that, but then sometimes we call this
> function while the CTL is up and has never been put down, but that
> interrupt gets raised only when the CTL gets a state change from
> 0 to 1 (disabled to enabled), so we're going to wait for something
> that will never happen on its own.
> 
> Solving this while avoiding to restart the CTL is actually possible
> and can be done by just checking if it is already up and running
> when the wait_for_commit_done function is called: in this case, so,
> if the CTL was already running, we can say that the commit is done
> if the command transmission is complete (in other terms, if the
> interface has been flushed).
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> index aa01698d6b25..b5b1b555ac4e 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> @@ -682,6 +682,9 @@ static int dpu_encoder_phys_cmd_wait_for_commit_done(
>   if (!dpu_encoder_phys_cmd_is_master(phys_enc))
>   return 0;
>  
> + if (phys_enc->hw_ctl->ops.is_started)
> + return dpu_encoder_phys_cmd_wait_for_tx_complete(phys_enc);

In the previous commit you introduced is_started to the ops struct as
a function pointer, and you probably intend to call it here instead of
just checking whether it might be NULL.
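
Something along these lines is presumably what was meant (untested sketch,
and it assumes is_started() takes the hw_ctl pointer like the neighbouring
ctl ops do):

	if (phys_enc->hw_ctl->ops.is_started &&
	    phys_enc->hw_ctl->ops.is_started(phys_enc->hw_ctl))
		return dpu_encoder_phys_cmd_wait_for_tx_complete(phys_enc);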

As far as I remember this was also the reason for previously mentioning
that it was faulty and required a v2 in:
https://lore.kernel.org/linux-arm-msm/bdc67afc-3736-5497-c43f-5165c55e0...@somainline.org/

Thanks!

- Marijn

> +
>   return _dpu_encoder_phys_cmd_wait_for_ctl_start(phys_enc);
>  }
>  
> -- 
> 2.32.0
> 


Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Yokoyama, Caz
On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote:
> On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:
> > We shouldn't be using debugfs_ namespace for this functionality.
> > Rename
> > debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
> > functions, defines and structs follow suit.
> > 
> > Signed-off-by: Lucas De Marchi 
> > ---
> > drivers/gpu/drm/i915/Makefile  |  2 +-
> > drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -
> > -
> > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
> > .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
> > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14
> > ++
> > 5 files changed, 19 insertions(+), 19 deletions(-)
> > delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c =>
> > intel_gt_pm_debugfs.c} (99%)
> > create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
> > 
> > diff --git a/drivers/gpu/drm/i915/Makefile
> > b/drivers/gpu/drm/i915/Makefile
> > index 232c9673a2e5..dd656f2d7721 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
> > 
> > # "Graphics Technology" (aka we talk to the gpu)
> > gt-y += \
> > -   gt/debugfs_gt_pm.o \
> > gt/gen2_engine_cs.o \
> > gt/gen6_engine_cs.o \
> > gt/gen6_ppgtt.o \
> > @@ -103,6 +102,7 @@ gt-y += \
> > gt/intel_gt_engines_debugfs.o \
> > gt/intel_gt_irq.o \
> > gt/intel_gt_pm.o \
> > +   gt/intel_gt_pm_debugfs.o \
> > gt/intel_gt_pm_irq.o \
> > gt/intel_gt_requests.o \
> > gt/intel_gtt.o \
> > diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > deleted file mode 100644
> > index 4cf5f5c9da7d..
> > --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > +++ /dev/null
> > @@ -1,14 +0,0 @@
> > -/* SPDX-License-Identifier: MIT */
> > -/*
> > - * Copyright © 2019 Intel Corporation
> > - */
> > -
> > -#ifndef DEBUGFS_GT_PM_H
> > -#define DEBUGFS_GT_PM_H
> > -
> > -struct intel_gt;
> > -struct dentry;
> > -
> > -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry
> > *root);
> > -
> > -#endif /* DEBUGFS_GT_PM_H */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > index e5d173c235a3..4096ee893b69 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > @@ -5,10 +5,10 @@
> > 
> > #include 
> > 
> > -#include "debugfs_gt_pm.h"
> > #include "i915_drv.h"
> > #include "intel_gt_debugfs.h"
> > #include "intel_gt_engines_debugfs.h"
> > +#include "intel_gt_pm_debugfs.h"
Why locate here? Why not just replace debugfs_gt_pm.h? Compile error?
-caz

> > #include "intel_sseu_debugfs.h"
> > #include "uc/intel_uc_debugfs.h"
> > 
> > @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt
> > *gt)
> > return;
> > 
> > intel_gt_engines_register_debugfs(gt, root);
> > -   debugfs_gt_pm_register(gt, root);
> > +   intel_gt_pm_register_debugfs(gt, root);
> 
> This is one case I usually don't know what convention to follow since
> it
> changes in different places.
> 
> I did it like _register_debugfs because of calls like
> intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that
> just
> below we have intel_sseu_debugfs_register(), so maybe I should
> consider
> debugfs as part of the namespace?
> 
> Lucas De Marchi


Re: [Intel-gfx] [PATCH 23/27] drm/i915/guc: Implement no mid batch preemption for multi-lrc

2021-09-10 Thread Matthew Brost
On Fri, Sep 10, 2021 at 12:25:43PM +0100, Tvrtko Ursulin wrote:
> 
> On 20/08/2021 23:44, Matthew Brost wrote:
> > For some users of multi-lrc, e.g. split frame, it isn't safe to preempt
> > mid BB. To safely enable preemption at the BB boundary, a handshake
> > between the parent and child is needed. This is implemented via custom
> > emit_bb_start & emit_fini_breadcrumb functions and enabled by
> > default if a context is configured by set parallel extension.
> 
> FWIW I think it's wrong to hardcode the requirements of a particular
> hardware generation's fixed media pipeline into the uapi. IMO a better solution
> was when the concept of parallel submission was decoupled from the no-preemption
> mid-batch preambles. Otherwise we might as well call the extension
> I915_CONTEXT_ENGINES_EXT_MEDIA_SPLIT_FRAME_SUBMIT or something.
> 

I don't disagree, but this is where we landed per Daniel Vetter's feedback -
default to what our current hardware supports and extend it later to
newer hardware / requirements as needed.

Matt

> Regards,
> 
> Tvrtko
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_context.c   |   2 +-
> >   drivers/gpu/drm/i915/gt/intel_context_types.h |   3 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   2 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 283 +-
> >   4 files changed, 287 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> > b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 5615be32879c..2de62649e275 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -561,7 +561,7 @@ void intel_context_bind_parent_child(struct 
> > intel_context *parent,
> > GEM_BUG_ON(intel_context_is_child(child));
> > GEM_BUG_ON(intel_context_is_parent(child));
> > -   parent->guc_number_children++;
> > +   child->guc_child_index = parent->guc_number_children++;
> > list_add_tail(&child->guc_child_link,
> >   &parent->guc_child_list);
> > child->parent = parent;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > index 713d85b0b364..727f91e7f7c2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > @@ -246,6 +246,9 @@ struct intel_context {
> > /** @guc_number_children: number of children if parent */
> > u8 guc_number_children;
> > +   /** @guc_child_index: index into guc_child_list if child */
> > +   u8 guc_child_index;
> > +
> > /**
> >  * @parent_page: page in context used by parent for work queue,
> >  * work queue descriptor
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > index 6cd26dc060d1..9f61cfa5566a 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > @@ -188,7 +188,7 @@ struct guc_process_desc {
> > u32 wq_status;
> > u32 engine_presence;
> > u32 priority;
> > -   u32 reserved[30];
> > +   u32 reserved[36];
> >   } __packed;
> >   #define CONTEXT_REGISTRATION_FLAG_KMD BIT(0)
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 91330525330d..1a18f99bf12a 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -11,6 +11,7 @@
> >   #include "gt/intel_context.h"
> >   #include "gt/intel_engine_pm.h"
> >   #include "gt/intel_engine_heartbeat.h"
> > +#include "gt/intel_gpu_commands.h"
> >   #include "gt/intel_gt.h"
> >   #include "gt/intel_gt_irq.h"
> >   #include "gt/intel_gt_pm.h"
> > @@ -366,10 +367,14 @@ static struct i915_priolist *to_priolist(struct 
> > rb_node *rb)
> >   /*
> >* When using multi-lrc submission an extra page in the context state is
> > - * reserved for the process descriptor and work queue.
> > + * reserved for the process descriptor, work queue, and preempt BB boundary
> > + * handshake between the parent + childlren contexts.
> >*
> >* The layout of this page is below:
> >* 0  guc_process_desc
> > + * + sizeof(struct guc_process_desc)   child go
> > + * + CACHELINE_BYTES   child join ...
> > + * + CACHELINE_BYTES ...
> >* ...unused
> >* PAGE_SIZE / 2  work queue start
> >* ...work queue
> > @@ -1785,6 +1790,30 @@ static int deregister_context(struct intel_context 
> > *ce, u32 guc_id, bool loop)
> > return __guc_action_deregister_context(guc, guc_id, loop);
> >   }
> > +static inline void clear_children_join_go_memory(struct intel_context *ce)
> > +

Re: [Intel-gfx] [PATCH 05/27] drm/i915: Add GT PM unpark worker

2021-09-10 Thread Matthew Brost
On Fri, Sep 10, 2021 at 09:36:17AM +0100, Tvrtko Ursulin wrote:
> 
> On 20/08/2021 23:44, Matthew Brost wrote:
> > Sometimes it is desirable to queue work up for later if the GT PM isn't
> > held and run that work on next GT PM unpark.
> 
> Sounds maybe plausible, but it depends how much work can happen on unpark
> and whether it can have too much of a negative impact on latency for
> interactive loads? Or from a reverse angle, why the work wouldn't be done on

All it does is add an interface to kick a work queue on unpark, i.e.
all the actual work is done async in the work queue, so it shouldn't
add any latency.

> parking?
> 
> Also what kind of mechanism for dealing with too much stuff being put on
> this list you have? Can there be pressure which triggers (or would need to

No limits on pressure. See above, I don't think this is a concern.

> trigger) these deregistrations to happen at runtime (no park/unpark
> transitions)?
>
> > Implemented with a list in the GT of all pending work, workqueues in
> > the list, a callback to add a workqueue to the list, and finally a
> > wakeref post_get callback that iterates / drains the list + queues the
> > workqueues.
> > 
> > First user of this is deregistration of GuC contexts.
> 
> Does first imply there are more incoming?
>

Haven't found another user yet, but this is a generic mechanism so we can
add more in the future if other use cases arise.
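
To make that concrete, a rough sketch of how a caller could defer work to the
next unpark (intel_gt_pm_unpark_work_add() and deregister_work are my guesses
at the new interface from this series, not taken verbatim from the patch):

	/* deregister_work is an intel_gt_pm_unpark_work wrapping a work_struct */
	if (intel_gt_pm_get_if_awake(gt)) {
		/* GT already unparked: queue the work right away */
		queue_work(system_unbound_wq, &deregister_work.worker);
		intel_gt_pm_put_async(gt);
	} else {
		/* GT parked: stash it, the post_get callback queues it on unpark */
		intel_gt_pm_unpark_work_add(gt, &deregister_work);
	}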
 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/Makefile |  1 +
> >   drivers/gpu/drm/i915/gt/intel_gt.c|  3 ++
> >   drivers/gpu/drm/i915/gt/intel_gt_pm.c |  8 
> >   .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.c | 35 
> >   .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.h | 40 +++
> >   drivers/gpu/drm/i915/gt/intel_gt_types.h  | 10 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  8 ++--
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +--
> >   drivers/gpu/drm/i915/intel_wakeref.c  |  5 +++
> >   drivers/gpu/drm/i915/intel_wakeref.h  |  1 +
> >   10 files changed, 119 insertions(+), 7 deletions(-)
> >   create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
> >   create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.h
> > 
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index 642a5b5a1b81..579bdc069f25 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -103,6 +103,7 @@ gt-y += \
> > gt/intel_gt_clock_utils.o \
> > gt/intel_gt_irq.o \
> > gt/intel_gt_pm.o \
> > +   gt/intel_gt_pm_unpark_work.o \
> > gt/intel_gt_pm_irq.o \
> > gt/intel_gt_requests.o \
> > gt/intel_gtt.o \
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt.c
> > index 62d40c986642..7e690e74baa2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > @@ -29,6 +29,9 @@ void intel_gt_init_early(struct intel_gt *gt, struct 
> > drm_i915_private *i915)
> > spin_lock_init(>->irq_lock);
> > +   spin_lock_init(>->pm_unpark_work_lock);
> > +   INIT_LIST_HEAD(>->pm_unpark_work_list);
> > +
> > INIT_LIST_HEAD(>->closed_vma);
> > spin_lock_init(>->closed_lock);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > index dea8e2479897..564c11a3748b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > @@ -90,6 +90,13 @@ static int __gt_unpark(struct intel_wakeref *wf)
> > return 0;
> >   }
> > +static void __gt_unpark_work_queue(struct intel_wakeref *wf)
> > +{
> > +   struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref);
> > +
> > +   intel_gt_pm_unpark_work_queue(gt);
> > +}
> > +
> >   static int __gt_park(struct intel_wakeref *wf)
> >   {
> > struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref);
> > @@ -118,6 +125,7 @@ static int __gt_park(struct intel_wakeref *wf)
> >   static const struct intel_wakeref_ops wf_ops = {
> > .get = __gt_unpark,
> > +   .post_get = __gt_unpark_work_queue,
> > .put = __gt_park,
> >   };
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
> > new file mode 100644
> > index ..23162dbd0c35
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
> > @@ -0,0 +1,35 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2021 Intel Corporation
> > + */
> > +
> > +#include "i915_drv.h"
> > +#include "intel_runtime_pm.h"
> > +#include "intel_gt_pm.h"
> > +
> > +void intel_gt_pm_unpark_work_queue(struct intel_gt *gt)
> > +{
> > +   struct intel_gt_pm_unpark_work *work, *next;
> > +   unsigned long flags;
> > +
> > +   spin_lock_irqsave(>->pm_unpark_work_lock, flags);
> > +   list_for_each_entry_safe(work, next,
> > +   

[PATCH v2 3/6] drm/i915/uncore: Replace gen8 write functions with general fwtable

2021-09-10 Thread Matt Roper
Now that we have both a standard forcewake table (albeit a single-entry
table) and the shadow table stored in the uncore, we can drop the
gen8-specific write handlers in favor of the general fwtable version.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 5fa2bf26a948..4c6898746d10 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1046,16 +1046,6 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
return FORCEWAKE_RENDER;
 }
 
-#define __gen8_reg_write_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd; \
-   if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \
-   __fwd = FORCEWAKE_RENDER; \
-   else \
-   __fwd = 0; \
-   __fwd; \
-})
-
 static const struct intel_forcewake_range __gen6_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0x3, FORCEWAKE_RENDER),
 };
@@ -1711,7 +1701,6 @@ __gen_write(func, 32)
 __gen_reg_write_funcs(gen12_fwtable);
 __gen_reg_write_funcs(gen11_fwtable);
 __gen_reg_write_funcs(fwtable);
-__gen_reg_write_funcs(gen8);
 
 #undef __gen_reg_write_funcs
 #undef GEN6_WRITE_FOOTER
@@ -2121,7 +2110,7 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
} else if (GRAPHICS_VER(i915) == 8) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
-   ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
+   ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_VALLEYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
-- 
2.25.4



[PATCH v2 6/6] drm/i915/dg2: Add DG2-specific shadow register table

2021-09-10 Thread Matt Roper
We thought the DG2 table of shadowed registers would be the same as the
gen12/xehp table, but it turns out that there are a few minor
differences that require us to define a new DG2-specific table:
 * One register is removed (0xC4D4)
 * One register is added (0xC4E0)

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c   | 41 ++-
 drivers/gpu/drm/i915/selftests/intel_uncore.c |  1 +
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 10f124297e7c..b3ba710d4310 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1016,6 +1016,45 @@ static const struct i915_range gen12_shadowed_regs[] = {
{ .start = 0x1F8510, .end = 0x1F8550 },
 };
 
+static const struct i915_range dg2_shadowed_regs[] = {
+   { .start =   0x2030, .end =   0x2030 },
+   { .start =   0x2510, .end =   0x2550 },
+   { .start =   0xA008, .end =   0xA00C },
+   { .start =   0xA188, .end =   0xA188 },
+   { .start =   0xA278, .end =   0xA278 },
+   { .start =   0xA540, .end =   0xA56C },
+   { .start =   0xC4C8, .end =   0xC4C8 },
+   { .start =   0xC4E0, .end =   0xC4E0 },
+   { .start =   0xC600, .end =   0xC600 },
+   { .start =   0xC658, .end =   0xC658 },
+   { .start =  0x22030, .end =  0x22030 },
+   { .start =  0x22510, .end =  0x22550 },
+   { .start = 0x1C0030, .end = 0x1C0030 },
+   { .start = 0x1C0510, .end = 0x1C0550 },
+   { .start = 0x1C4030, .end = 0x1C4030 },
+   { .start = 0x1C4510, .end = 0x1C4550 },
+   { .start = 0x1C8030, .end = 0x1C8030 },
+   { .start = 0x1C8510, .end = 0x1C8550 },
+   { .start = 0x1D0030, .end = 0x1D0030 },
+   { .start = 0x1D0510, .end = 0x1D0550 },
+   { .start = 0x1D4030, .end = 0x1D4030 },
+   { .start = 0x1D4510, .end = 0x1D4550 },
+   { .start = 0x1D8030, .end = 0x1D8030 },
+   { .start = 0x1D8510, .end = 0x1D8550 },
+   { .start = 0x1E0030, .end = 0x1E0030 },
+   { .start = 0x1E0510, .end = 0x1E0550 },
+   { .start = 0x1E4030, .end = 0x1E4030 },
+   { .start = 0x1E4510, .end = 0x1E4550 },
+   { .start = 0x1E8030, .end = 0x1E8030 },
+   { .start = 0x1E8510, .end = 0x1E8550 },
+   { .start = 0x1F0030, .end = 0x1F0030 },
+   { .start = 0x1F0510, .end = 0x1F0550 },
+   { .start = 0x1F4030, .end = 0x1F4030 },
+   { .start = 0x1F4510, .end = 0x1F4550 },
+   { .start = 0x1F8030, .end = 0x1F8030 },
+   { .start = 0x1F8510, .end = 0x1F8550 },
+};
+
 static int mmio_range_cmp(u32 key, const struct i915_range *range)
 {
if (key < range->start)
@@ -2054,7 +2093,7 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
 
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
-   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
+   ASSIGN_SHADOW_TABLE(uncore, dg2_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c 
b/drivers/gpu/drm/i915/selftests/intel_uncore.c
index 22ef2c87df1a..bc8128170a99 100644
--- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
+++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
@@ -68,6 +68,7 @@ static int intel_shadow_table_check(void)
{ gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) },
{ gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
{ gen12_shadowed_regs, ARRAY_SIZE(gen12_shadowed_regs) },
+   { dg2_shadowed_regs, ARRAY_SIZE(dg2_shadowed_regs) },
};
const struct i915_range *range;
unsigned int i, j;
-- 
2.25.4



[PATCH v2 5/6] drm/i915/uncore: Drop gen11 mmio read handlers

2021-09-10 Thread Matt Roper
Consolidate down to just a single 'fwtable' implementation.  For reads
we don't need to worry about shadow tables.  Also, the
NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation
can be dropped --- if a register is outside that range on one of the old
platforms, then it won't belong to any forcewake range and 0 will be
returned anyway.

v2:
 - Restore NEEDS_FORCE_WAKE() check.  (Chris, Tvrtko)

Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 40 -
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index bfb2a6337f9d..10f124297e7c 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -935,9 +935,6 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] 
= {
__fwd; \
 })
 
-#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \
-   find_fw_domain(uncore, offset)
-
 /* *Must* be sorted by offset! See intel_shadow_table_check(). */
 static const struct i915_range gen8_shadowed_regs[] = {
{ .start =  0x2030, .end =  0x2030 },
@@ -1570,33 +1567,30 @@ static inline void __force_wake_auto(struct 
intel_uncore *uncore,
___force_wake_auto(uncore, fw_domains);
 }
 
-#define __gen_read(func, x) \
+#define __gen_fwtable_read(x) \
 static u##x \
-func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \
+fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \
+{ \
enum forcewake_domains fw_engine; \
GEN6_READ_HEADER(x); \
-   fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \
+   fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \
if (fw_engine) \
__force_wake_auto(uncore, fw_engine); \
val = __raw_uncore_read##x(uncore, reg); \
GEN6_READ_FOOTER; \
 }
 
-#define __gen_reg_read_funcs(func) \
-static enum forcewake_domains \
-func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
-   return __##func##_reg_read_fw_domains(uncore, 
i915_mmio_reg_offset(reg)); \
-} \
-\
-__gen_read(func, 8) \
-__gen_read(func, 16) \
-__gen_read(func, 32) \
-__gen_read(func, 64)
+static enum forcewake_domains
+fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) {
+   return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg));
+}
 
-__gen_reg_read_funcs(gen11_fwtable);
-__gen_reg_read_funcs(fwtable);
+__gen_fwtable_read(8)
+__gen_fwtable_read(16)
+__gen_fwtable_read(32)
+__gen_fwtable_read(64)
 
-#undef __gen_reg_read_funcs
+#undef __gen_fwtable_read
 #undef GEN6_READ_FOOTER
 #undef GEN6_READ_HEADER
 
@@ -2062,22 +2056,22 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) >= 12) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 11) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
-- 
2.25.4



[PATCH v2 2/6] drm/i915/uncore: Associate shadow table with uncore

2021-09-10 Thread Matt Roper
Store a reference to a platform's shadow table inside the uncore, the
same as we do with the forcewake table.  This will allow us to use a
single set of functions that operate on the shadow table reference
rather than generating lots of nearly-identical functions via macros
that differ only in terms of the table that they reference.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 40 -
 drivers/gpu/drm/i915/intel_uncore.h |  7 +
 2 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 8c09af1e9f7a..5fa2bf26a948 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1026,17 +1026,19 @@ static int mmio_range_cmp(u32 key, const struct 
i915_range *range)
return 0;
 }
 
-#define __is_X_shadowed(x) \
-static bool is_##x##_shadowed(u32 offset) \
-{ \
-   const struct i915_range *regs = x##_shadowed_regs; \
-   return BSEARCH(offset, regs, ARRAY_SIZE(x##_shadowed_regs), \
+static bool
+is_shadowed(struct intel_uncore *uncore, u32 offset)
+{
+   if (drm_WARN_ON(&uncore->i915->drm, !uncore->shadowed_reg_table))
+   return false;
+
+   return BSEARCH(offset,
+  uncore->shadowed_reg_table,
+  uncore->shadowed_reg_table_entries,
   mmio_range_cmp); \
 }
 
-__is_X_shadowed(gen8)
-__is_X_shadowed(gen11)
-__is_X_shadowed(gen12)
+
 
 static enum forcewake_domains
 gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg)
@@ -1047,7 +1049,7 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
 #define __gen8_reg_write_fw_domains(uncore, offset) \
 ({ \
enum forcewake_domains __fwd; \
-   if (NEEDS_FORCE_WAKE(offset) && !is_gen8_shadowed(offset)) \
+   if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \
__fwd = FORCEWAKE_RENDER; \
else \
__fwd = 0; \
@@ -1081,7 +1083,7 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 #define __fwtable_reg_write_fw_domains(uncore, offset) \
 ({ \
enum forcewake_domains __fwd = 0; \
-   if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \
+   if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \
__fwd = find_fw_domain(uncore, offset); \
__fwd; \
 })
@@ -1090,7 +1092,7 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 ({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
-   if (!is_gen11_shadowed(__offset)) \
+   if (!is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
 })
@@ -1099,7 +1101,7 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 ({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
-   if (!is_gen12_shadowed(__offset)) \
+   if (!is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
 })
@@ -1705,6 +1707,7 @@ __gen_write(func, 8) \
 __gen_write(func, 16) \
 __gen_write(func, 32)
 
+
 __gen_reg_write_funcs(gen12_fwtable);
 __gen_reg_write_funcs(gen11_fwtable);
 __gen_reg_write_funcs(fwtable);
@@ -1969,6 +1972,12 @@ static int intel_uncore_fw_domains_init(struct 
intel_uncore *uncore)
(uncore)->fw_domains_table_entries = ARRAY_SIZE((d)); \
 }
 
+#define ASSIGN_SHADOW_TABLE(uncore, d) \
+{ \
+   (uncore)->shadowed_reg_table = d; \
+   (uncore)->shadowed_reg_table_entries = ARRAY_SIZE((d)); \
+}
+
 static int i915_pmic_bus_access_notifier(struct notifier_block *nb,
 unsigned long action, void *data)
 {
@@ -2081,30 +2090,37 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
 
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
+   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
+   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) >= 12) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);
+   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) == 11) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ra

[PATCH v2 4/6] drm/i915/uncore: Drop gen11/gen12 mmio write handlers

2021-09-10 Thread Matt Roper
Now that the reference to the shadow table is stored within the uncore,
we don't need to generate separate fwtable, gen11_fwtable, and
gen12_fwtable variants of the register write functions; a single
'fwtable' implementation will work for all of those platforms now.

While consolidating the functions, gen11/gen12 pick up a
NEEDS_FORCE_WAKE() check that they didn't have before, allowing them to
bypass a lot of forcewake/shadow checking for non-GT registers (e.g.,
display).  However since these later platforms also introduce media
engines at higher MMIO offsets, the definition of NEEDS_FORCE_WAKE() is
extended to also consider register offsets above GEN11_BSD_RING_BASE.

v2:
 - Restore NEEDS_FORCE_WAKE(), but extend it for compatibility with the
   gen11+ platforms by also passing offsets above GEN11_BSD_RING_BASE.
   (Chris, Tvrtko)

Cc: Tvrtko Ursulin 
Cc: Chris Wilson 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 61 ++---
 1 file changed, 21 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 4c6898746d10..bfb2a6337f9d 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -851,7 +851,10 @@ void assert_forcewakes_active(struct intel_uncore *uncore,
 }
 
 /* We give fast paths for the really cool registers */
-#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
+#define NEEDS_FORCE_WAKE(reg) ({ \
+   u32 __reg = (reg); \
+   __reg < 0x40000 || __reg >= GEN11_BSD_RING_BASE; \
+})
 
 static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry)
 {
@@ -1071,27 +1074,10 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 };
 
 #define __fwtable_reg_write_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd = 0; \
-   if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \
-   __fwd = find_fw_domain(uncore, offset); \
-   __fwd; \
-})
-
-#define __gen11_fwtable_reg_write_fw_domains(uncore, offset) \
 ({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
-   if (!is_shadowed(uncore, __offset)) \
-   __fwd = find_fw_domain(uncore, __offset); \
-   __fwd; \
-})
-
-#define __gen12_fwtable_reg_write_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd = 0; \
-   const u32 __offset = (offset); \
-   if (!is_shadowed(uncore, __offset)) \
+   if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
 })
@@ -1675,34 +1661,29 @@ __gen6_write(8)
 __gen6_write(16)
 __gen6_write(32)
 
-#define __gen_write(func, x) \
+#define __gen_fwtable_write(x) \
 static void \
-func##_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool 
trace) { \
+fwtable_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool 
trace) { \
enum forcewake_domains fw_engine; \
GEN6_WRITE_HEADER; \
-   fw_engine = __##func##_reg_write_fw_domains(uncore, offset); \
+   fw_engine = __fwtable_reg_write_fw_domains(uncore, offset); \
if (fw_engine) \
__force_wake_auto(uncore, fw_engine); \
__raw_uncore_write##x(uncore, reg, val); \
GEN6_WRITE_FOOTER; \
 }
 
-#define __gen_reg_write_funcs(func) \
-static enum forcewake_domains \
-func##_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
-   return __##func##_reg_write_fw_domains(uncore, 
i915_mmio_reg_offset(reg)); \
-} \
-\
-__gen_write(func, 8) \
-__gen_write(func, 16) \
-__gen_write(func, 32)
-
+static enum forcewake_domains
+fwtable_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg)
+{
+   return __fwtable_reg_write_fw_domains(uncore, 
i915_mmio_reg_offset(reg));
+}
 
-__gen_reg_write_funcs(gen12_fwtable);
-__gen_reg_write_funcs(gen11_fwtable);
-__gen_reg_write_funcs(fwtable);
+__gen_fwtable_write(8)
+__gen_fwtable_write(16)
+__gen_fwtable_write(32)
 
-#undef __gen_reg_write_funcs
+#undef __gen_fwtable_write
 #undef GEN6_WRITE_FOOTER
 #undef GEN6_WRITE_HEADER
 
@@ -2080,22 +2061,22 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
-   ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
+   ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
-   ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
+   ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ

[PATCH v2 1/6] drm/i915/uncore: Convert gen6/gen7 read operations to fwtable

2021-09-10 Thread Matt Roper
On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we
simply check whether the register offset is < 0x40000, and return
FORCEWAKE_RENDER if it is.  To prepare for upcoming refactoring, let's
define a single-entry forcewake table from [0x0, 0x3ffff] and switch
these platforms over to use the fwtable reader functions.

v2:
 - Drop __gen6_reg_read_fw_domains which is no longer used.  (Tvrtko)

Cc: Tvrtko Ursulin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index f9767054dbdf..8c09af1e9f7a 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -853,16 +853,6 @@ void assert_forcewakes_active(struct intel_uncore *uncore,
 /* We give fast paths for the really cool registers */
 #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
 
-#define __gen6_reg_read_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd; \
-   if (NEEDS_FORCE_WAKE(offset)) \
-   __fwd = FORCEWAKE_RENDER; \
-   else \
-   __fwd = 0; \
-   __fwd; \
-})
-
 static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry)
 {
if (offset < entry->start)
@@ -1064,6 +1054,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
__fwd; \
 })
 
+static const struct intel_forcewake_range __gen6_fw_ranges[] = {
+   GEN_FW_RANGE(0x0, 0x3ffff, FORCEWAKE_RENDER),
+};
+
 /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __chv_fw_ranges[] = {
GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
@@ -1623,7 +1617,6 @@ __gen_read(func, 64)
 
 __gen_reg_read_funcs(gen11_fwtable);
 __gen_reg_read_funcs(fwtable);
-__gen_reg_read_funcs(gen6);
 
 #undef __gen_reg_read_funcs
 #undef GEN6_READ_FOOTER
@@ -2111,15 +2104,17 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 8) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_VALLEYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 6, 7)) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
}
 
uncore->pmic_bus_access_nb.notifier_call = 
i915_pmic_bus_access_notifier;
-- 
2.25.4



[PATCH v2 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Matt Roper
Our uncore MMIO functions for reading/writing registers have become very
complicated over time.  There's significant macro magic used to generate
several nearly-identical functions that only really differ in terms of
which platform-specific shadow register table they should check on write
operations.  We can significantly simplify our MMIO handlers by storing
a reference to the current platform's shadow table within the 'struct
intel_uncore' the same way we already do for forcewake; this allows us
to consolidate the multiple variants of each 'write' function down to
just a single 'fwtable' version that gets the shadow table out of the
uncore struct rather than hardcoding the name of a specific platform's
table.  We can do similar consolidation on the MMIO read side by
creating a single-entry forcewake table to replace the open-coded range
check they had been using previously.
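
To make the new shape of the code concrete, here is a minimal, userspace-style
sketch of the write-side check after the refactor.  The field and helper names
(shadowed_reg_table, shadowed_reg_table_entries, is_shadowed) are taken from the
patches in this series; the struct is heavily trimmed and the code is only an
illustration, not the kernel implementation:

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t u32;

struct i915_range { u32 start, end; };

/* Only the fields the sketch needs; the real struct intel_uncore holds far more. */
struct intel_uncore_sketch {
	const struct i915_range *shadowed_reg_table;
	unsigned int shadowed_reg_table_entries;
};

static int mmio_range_cmp(const void *key, const void *elt)
{
	u32 offset = *(const u32 *)key;
	const struct i915_range *range = elt;

	if (offset < range->start)
		return -1;
	if (offset > range->end)
		return 1;
	return 0;
}

/* One lookup replaces the per-platform is_gen{8,11,12}_shadowed() variants. */
static bool is_shadowed(const struct intel_uncore_sketch *uncore, u32 offset)
{
	if (!uncore->shadowed_reg_table)
		return false;

	return bsearch(&offset, uncore->shadowed_reg_table,
		       uncore->shadowed_reg_table_entries,
		       sizeof(*uncore->shadowed_reg_table),
		       mmio_range_cmp) != NULL;
}

Each platform then just assigns its table once at init time (the
ASSIGN_SHADOW_TABLE() step) instead of growing another set of write handlers.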

The final patch of the series adds a new shadow table for DG2; this
becomes quite clean and simple now, given the refactoring in the first
five patches.

Aside from simplifying the code significantly, this series reduces the
size of the generated .ko in exchange for adding an extra pointer
indirection to access the tables.  The size deltas (for just the first
five patches, before we add an additional table in the final patch) are:

Old:
$ size drivers/gpu/drm/i915/i915.ko
   text    data   bss      dec     hex filename
2865921   88972  2912  2957805  2d21ed drivers/gpu/drm/i915/i915.ko

New:
$ size drivers/gpu/drm/i915/i915.ko
   text    data   bss      dec     hex filename
2854181   88236  2912  2945329  2cf131 drivers/gpu/drm/i915/i915.ko

The code size deltas will become larger as we add more platforms; we
already add one new platform table in the final patch of this series and
our next few platforms are all expected to bring new shadow tables as
well.

I don't think the impact of the indirect table reference for shadow
tables should be a concern for a few reasons:
 * The stored table + indirect lookup design is already deemed good
   enough for forcewake, which is used more frequently (both reads and
   writes, compared to shadow tables which are only used for writes) and
   operates on much larger tables.
 * Performance-critical sections of the code or those read/writing lots
   of registers in a batch usually do an explicit grab of the relevant
   forcewake domains and then perform their MMIO operations via *_fw()
   functions without considering shadowed registers and bypassing all of
   the table lookups.
 * In v2 of the series, we still apply NEEDS_FORCE_WAKE() checks that
   will bypass all of the forcewake and shadow logic for display
   register writes.

v2:
 - Drop orphaned definition of __gen6_reg_read_fw_domains. (Tvrtko)
 - Restore NEEDS_FORCE_WAKE() check to
   __fwtable_reg_{read,write}_fw_domains, but update the definition of
   NEEDS_FORCE_WAKE to also return 'true' on offsets above
   GEN11_BSD_RING_BASE for compatibility with gen11+ platforms. (Chris,
   Tvrtko).

Cc: Tvrtko Ursulin 
Cc: Chris Wilson 

Matt Roper (6):
  drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
  drm/i915/uncore: Associate shadow table with uncore
  drm/i915/uncore: Replace gen8 write functions with general fwtable
  drm/i915/uncore: Drop gen11/gen12 mmio write handlers
  drm/i915/uncore: Drop gen11 mmio read handlers
  drm/i915/dg2: Add DG2-specific shadow register table

 drivers/gpu/drm/i915/intel_uncore.c   | 200 ++
 drivers/gpu/drm/i915/intel_uncore.h   |   7 +
 drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
 3 files changed, 115 insertions(+), 93 deletions(-)

-- 
2.25.4



Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-10 Thread Matthew Brost
On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:
> 
> On 20/08/2021 23:44, Matthew Brost wrote:
> > Add logical engine mapping. This is required for split-frame, as
> > workloads need to be placed on engines in a logically contiguous manner.
> > 
> > v2:
> >   (Daniel Vetter)
> >- Add kernel doc for new fields
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
> >   .../drm/i915/gt/intel_execlists_submission.c  |  1 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
> >   5 files changed, 60 insertions(+), 29 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 0d9105a31d84..4d790f9a65dd 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs 
> > *engine, u16 iir)
> > GEM_DEBUG_WARN_ON(iir);
> >   }
> > -static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
> > +static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
> > + u8 logical_instance)
> >   {
> > const struct engine_info *info = &intel_engines[id];
> > struct drm_i915_private *i915 = gt->i915;
> > @@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> > intel_engine_id id)
> > engine->class = info->class;
> > engine->instance = info->instance;
> > +   engine->logical_mask = BIT(logical_instance);
> > __sprint_engine_name(engine);
> > engine->props.heartbeat_interval_ms =
> > @@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
> > intel_gt *gt)
> > return info->engine_mask;
> >   }
> > +static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
> > +u8 class, const u8 *map, u8 num_instances)
> > +{
> > +   int i, j;
> > +   u8 current_logical_id = 0;
> > +
> > +   for (j = 0; j < num_instances; ++j) {
> > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +   if (!HAS_ENGINE(gt, i) ||
> > +   intel_engines[i].class != class)
> > +   continue;
> > +
> > +   if (intel_engines[i].instance == map[j]) {
> > +   logical_ids[intel_engines[i].instance] =
> > +   current_logical_id++;
> > +   break;
> > +   }
> > +   }
> > +   }
> > +}
> > +
> > +static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 
> > class)
> > +{
> > +   int i;
> > +   u8 map[MAX_ENGINE_INSTANCE + 1];
> > +
> > +   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> > +   map[i] = i;
> 
> What's the point of the map array since it is 1:1 with instance?
> 

Future products do not have a 1 to 1 mapping and that mapping can change
based on fusing, e.g. XeHP SDV.

Also technically ICL / TGL / ADL physical instance 2 maps to logical
instance 1.
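
As a made-up, self-contained example of what that remapping means in practice
(illustrative only, not the driver code): if a part has physical VCS instances
0 and 2 with instance 1 fused off, walking the present instances in order hands
out contiguous logical ids, so physical 2 becomes logical 1:

#include <stdio.h>

int main(void)
{
	/* hypothetical fusing: physical VCS instances 0 and 2 present, 1 fused off */
	const int present[] = { 0, 2 };
	const unsigned int num = sizeof(present) / sizeof(present[0]);
	int logical_id[8] = { 0 };
	unsigned int i;
	int next = 0;

	/* hand out contiguous logical ids in physical order */
	for (i = 0; i < num; i++)
		logical_id[present[i]] = next++;

	for (i = 0; i < num; i++)
		printf("physical vcs%d -> logical %d\n",
		       present[i], logical_id[present[i]]);

	return 0;
}

The driver-side effect is that engine->logical_mask ends up being BIT(0) and
BIT(1) rather than BIT(0) and BIT(2), which is what keeps placement logically
contiguous.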

> > +   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
> > +}
> > +
> >   /**
> >* intel_engines_init_mmio() - allocate and prepare the Engine Command 
> > Streamers
> >* @gt: pointer to struct intel_gt
> > @@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> > struct drm_i915_private *i915 = gt->i915;
> > const unsigned int engine_mask = init_engine_mask(gt);
> > unsigned int mask = 0;
> > -   unsigned int i;
> > +   unsigned int i, class;
> > +   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
> > int err;
> > drm_WARN_ON(&i915->drm, engine_mask == 0);
> > @@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> > if (i915_inject_probe_failure(i915))
> > return -ENODEV;
> > -   for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
> > -   if (!HAS_ENGINE(gt, i))
> > -   continue;
> > +   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
> > +   setup_logical_ids(gt, logical_ids, class);
> > -   err = intel_engine_setup(gt, i);
> > -   if (err)
> > -   goto cleanup;
> > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +   u8 instance = intel_engines[i].instance;
> > +
> > +   if (intel_engines[i].class != class ||
> > +   !HAS_ENGINE(gt, i))
> > +   continue;
> > -   mask |= BIT(i);
> > +   err = intel_engine_setup(gt, i,
> > +logical_ids[instance]);
> > +   if (err)
> > +   goto cleanup;
> > +
> > +   mask |= BIT(i);
> 
> I still this there is a less clu

[PATCH 1/1] drm/amdkfd: Add sysfs bitfields and enums to uAPI

2021-09-10 Thread Felix Kuehling
These bits are de-facto part of the uAPI, so declare them in a uAPI header.

Signed-off-by: Felix Kuehling 
---
 MAINTAINERS   |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  46 +
 include/uapi/linux/kfd_sysfs.h| 108 ++
 3 files changed, 110 insertions(+), 45 deletions(-)
 create mode 100644 include/uapi/linux/kfd_sysfs.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 84cd16694640..7554ec928ee2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -930,6 +930,7 @@ F:  drivers/gpu/drm/amd/include/kgd_kfd_interface.h
 F: drivers/gpu/drm/amd/include/v9_structs.h
 F: drivers/gpu/drm/amd/include/vi_structs.h
 F: include/uapi/linux/kfd_ioctl.h
+F: include/uapi/linux/kfd_sysfs.h
 
 AMD SPI DRIVER
 M: Sanjay R Mehta 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
index a8db017c9b8e..f0cc59d2fd5d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
@@ -25,38 +25,11 @@
 
 #include 
 #include 
+#include 
 #include "kfd_crat.h"
 
 #define KFD_TOPOLOGY_PUBLIC_NAME_SIZE 32
 
-#define HSA_CAP_HOT_PLUGGABLE  0x0001
-#define HSA_CAP_ATS_PRESENT0x0002
-#define HSA_CAP_SHARED_WITH_GRAPHICS   0x0004
-#define HSA_CAP_QUEUE_SIZE_POW20x0008
-#define HSA_CAP_QUEUE_SIZE_32BIT   0x0010
-#define HSA_CAP_QUEUE_IDLE_EVENT   0x0020
-#define HSA_CAP_VA_LIMIT   0x0040
-#define HSA_CAP_WATCH_POINTS_SUPPORTED 0x0080
-#define HSA_CAP_WATCH_POINTS_TOTALBITS_MASK0x0f00
-#define HSA_CAP_WATCH_POINTS_TOTALBITS_SHIFT   8
-#define HSA_CAP_DOORBELL_TYPE_TOTALBITS_MASK   0x3000
-#define HSA_CAP_DOORBELL_TYPE_TOTALBITS_SHIFT  12
-
-#define HSA_CAP_DOORBELL_TYPE_PRE_1_0  0x0
-#define HSA_CAP_DOORBELL_TYPE_1_0  0x1
-#define HSA_CAP_DOORBELL_TYPE_2_0  0x2
-#define HSA_CAP_AQL_QUEUE_DOUBLE_MAP   0x4000
-
-#define HSA_CAP_RESERVED_WAS_SRAM_EDCSUPPORTED 0x0008 /* Old buggy user 
mode depends on this being 0 */
-#define HSA_CAP_MEM_EDCSUPPORTED   0x0010
-#define HSA_CAP_RASEVENTNOTIFY 0x0020
-#define HSA_CAP_ASIC_REVISION_MASK 0x03c0
-#define HSA_CAP_ASIC_REVISION_SHIFT22
-#define HSA_CAP_SRAM_EDCSUPPORTED  0x0400
-#define HSA_CAP_SVMAPI_SUPPORTED   0x0800
-#define HSA_CAP_FLAGS_COHERENTHOSTACCESS   0x1000
-#define HSA_CAP_RESERVED   0xe00f8000
-
 struct kfd_node_properties {
uint64_t hive_id;
uint32_t cpu_cores_count;
@@ -93,17 +66,6 @@ struct kfd_node_properties {
char name[KFD_TOPOLOGY_PUBLIC_NAME_SIZE];
 };
 
-#define HSA_MEM_HEAP_TYPE_SYSTEM   0
-#define HSA_MEM_HEAP_TYPE_FB_PUBLIC1
-#define HSA_MEM_HEAP_TYPE_FB_PRIVATE   2
-#define HSA_MEM_HEAP_TYPE_GPU_GDS  3
-#define HSA_MEM_HEAP_TYPE_GPU_LDS  4
-#define HSA_MEM_HEAP_TYPE_GPU_SCRATCH  5
-
-#define HSA_MEM_FLAGS_HOT_PLUGGABLE0x0001
-#define HSA_MEM_FLAGS_NON_VOLATILE 0x0002
-#define HSA_MEM_FLAGS_RESERVED 0xfffc
-
 struct kfd_mem_properties {
struct list_headlist;
uint32_theap_type;
@@ -116,12 +78,6 @@ struct kfd_mem_properties {
struct attributeattr;
 };
 
-#define HSA_CACHE_TYPE_DATA0x0001
-#define HSA_CACHE_TYPE_INSTRUCTION 0x0002
-#define HSA_CACHE_TYPE_CPU 0x0004
-#define HSA_CACHE_TYPE_HSACU   0x0008
-#define HSA_CACHE_TYPE_RESERVED0xfff0
-
 struct kfd_cache_properties {
struct list_headlist;
uint32_tprocessor_id_low;
diff --git a/include/uapi/linux/kfd_sysfs.h b/include/uapi/linux/kfd_sysfs.h
new file mode 100644
index ..e1fb78b4bf09
--- /dev/null
+++ b/include/uapi/linux/kfd_sysfs.h
@@ -0,0 +1,108 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT WITH Linux-syscall-note */
+/*
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONI

Re: [virtio-dev] [PATCH v1 09/12] drm/virtio: implement context init: allocate an array of fence contexts

2021-09-10 Thread Chia-I Wu
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh
 wrote:
>
> We don't want fences from different 3D contexts (virgl, gfxstream,
> venus) to be on the same timeline.  With explicit context creation,
> we can specify the number of rings each context wants.
>
> Execbuffer can specify which ring to use.
>
> Signed-off-by: Gurchetan Singh 
> Acked-by: Lingfeng Yang 
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.h   |  3 +++
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 --
>  2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
> b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index a5142d60c2fa..cca9ab505deb 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -56,6 +56,7 @@
>  #define STATE_ERR 2
>
>  #define MAX_CAPSET_ID 63
> +#define MAX_RINGS 64
>
>  struct virtio_gpu_object_params {
> unsigned long size;
> @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv {
> uint32_t ctx_id;
> uint32_t context_init;
> bool context_created;
> +   uint32_t num_rings;
> +   uint64_t base_fence_ctx;
> struct mutex context_lock;
>  };
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index f51f3393a194..262f79210283 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
> int in_fence_fd = exbuf->fence_fd;
> int out_fence_fd = -1;
> void *buf;
> +   uint64_t fence_ctx;
> +   uint32_t ring_idx;
> +
> +   fence_ctx = vgdev->fence_drv.context;
> +   ring_idx = 0;
>
> if (vgdev->has_virgl_3d == false)
> return -ENOSYS;
> @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
> if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
> return -EINVAL;
>
> +   if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
> +   if (exbuf->ring_idx >= vfpriv->num_rings)
> +   return -EINVAL;
> +
> +   if (!vfpriv->base_fence_ctx)
> +   return -EINVAL;
> +
> +   fence_ctx = vfpriv->base_fence_ctx;
> +   ring_idx = exbuf->ring_idx;
> +   }
> +
> exbuf->fence_fd = -1;
>
> virtio_gpu_create_context(dev, file);
> @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
> goto out_memdup;
> }
>
> -   out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 
> 0);
> +   out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
> if(!out_fence) {
> ret = -ENOMEM;
> goto out_unresv;
> @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct 
> drm_device *dev,
> return -EINVAL;
>
> /* Number of unique parameters supported at this time. */
> -   if (num_params > 1)
> +   if (num_params > 2)
> return -EINVAL;
>
> ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
> @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct 
> drm_device *dev,
>
> vfpriv->context_init |= value;
> break;
> +   case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
> +   if (vfpriv->base_fence_ctx) {
> +   ret = -EINVAL;
> +   goto out_unlock;
> +   }
> +
> +   if (value > MAX_RINGS) {
> +   ret = -EINVAL;
> +   goto out_unlock;
> +   }
> +
> +   vfpriv->base_fence_ctx = 
> dma_fence_context_alloc(value);
With multiple fence contexts, we should do something about implicit fencing.

The classic example is Mesa and X server.  When both use virgl and the
global fence context, no dma_fence_wait is fine.  But when Mesa uses
venus and the ring fence context, dma_fence_wait should be inserted.


> +   vfpriv->num_rings = value;
> +   break;
> default:
> ret = -EINVAL;
> goto out_unlock;
> --
> 2.33.0.153.gba50c8fa24-goog
>
>
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>


Re: [git pull] drm fixes for 5.15-rc1

2021-09-10 Thread pr-tracker-bot
The pull request you sent on Fri, 10 Sep 2021 16:35:59 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-next-2021-09-10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a668acb8f01fc0d1e3877cddecbe319ef2ef651c

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [Intel-gfx] [PATCH v9 05/17] drm/i915/pxp: Implement funcs to create the TEE channel

2021-09-10 Thread Rodrigo Vivi
On Fri, Sep 10, 2021 at 08:36:15AM -0700, Daniele Ceraolo Spurio wrote:
> From: "Huang, Sean Z" 
> 
> Implement the funcs to create the TEE channel, so kernel can
> send the TEE commands directly to TEE for creating the arbitrary
> (default) session.
> 
> v2: fix locking, don't pollute dev_priv (Chris)
> 
> v3: wait for mei PXP component to be bound.
> 
> v4: drop the wait, as the component might be bound after i915 load
> completes. We'll instead check when sending a tee message.
> 
> v5: fix an issue with mei_pxp module removal
> 
> v6: don't use fetch_and_zero in fini (Rodrigo)
> 
> Signed-off-by: Huang, Sean Z 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Chris Wilson 

Reviewed-by: Rodrigo Vivi 

> ---
>  drivers/gpu/drm/i915/Makefile  |  3 +-
>  drivers/gpu/drm/i915/pxp/intel_pxp.c   | 13 
>  drivers/gpu/drm/i915/pxp/intel_pxp_tee.c   | 79 ++
>  drivers/gpu/drm/i915/pxp/intel_pxp_tee.h   | 14 
>  drivers/gpu/drm/i915/pxp/intel_pxp_types.h |  6 ++
>  5 files changed, 114 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 23f5bc268962..d39bd0cefc64 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -283,7 +283,8 @@ i915-y += i915_perf.o
>  
>  # Protected execution platform (PXP) support
>  i915-$(CONFIG_DRM_I915_PXP) += \
> - pxp/intel_pxp.o
> + pxp/intel_pxp.o \
> + pxp/intel_pxp_tee.o
>  
>  # Post-mortem debug and GPU hang state capture
>  i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> index 7b2053902146..400deaea2d8a 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> @@ -3,6 +3,7 @@
>   * Copyright(c) 2020 Intel Corporation.
>   */
>  #include "intel_pxp.h"
> +#include "intel_pxp_tee.h"
>  #include "gt/intel_context.h"
>  #include "i915_drv.h"
>  
> @@ -50,7 +51,16 @@ void intel_pxp_init(struct intel_pxp *pxp)
>   if (ret)
>   return;
>  
> + ret = intel_pxp_tee_component_init(pxp);
> + if (ret)
> + goto out_context;
> +
>   drm_info(>->i915->drm, "Protected Xe Path (PXP) protected content 
> support initialized\n");
> +
> + return;
> +
> +out_context:
> + destroy_vcs_context(pxp);
>  }
>  
>  void intel_pxp_fini(struct intel_pxp *pxp)
> @@ -58,5 +68,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
>   if (!intel_pxp_is_enabled(pxp))
>   return;
>  
> + intel_pxp_tee_component_fini(pxp);
> +
>   destroy_vcs_context(pxp);
> +
>  }
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
> new file mode 100644
> index ..f1d8de832653
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright(c) 2020 Intel Corporation.
> + */
> +
> +#include 
> +#include "drm/i915_pxp_tee_interface.h"
> +#include "drm/i915_component.h"
> +#include "i915_drv.h"
> +#include "intel_pxp.h"
> +#include "intel_pxp_tee.h"
> +
> +static inline struct intel_pxp *i915_dev_to_pxp(struct device *i915_kdev)
> +{
> + return &kdev_to_i915(i915_kdev)->gt.pxp;
> +}
> +
> +/**
> + * i915_pxp_tee_component_bind - bind function to pass the function pointers 
> to pxp_tee
> + * @i915_kdev: pointer to i915 kernel device
> + * @tee_kdev: pointer to tee kernel device
> + * @data: pointer to pxp_tee_master containing the function pointers
> + *
> + * This bind function is called during the system boot or resume from system 
> sleep.
> + *
> + * Return: return 0 if successful.
> + */
> +static int i915_pxp_tee_component_bind(struct device *i915_kdev,
> +struct device *tee_kdev, void *data)
> +{
> + struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
> +
> + pxp->pxp_component = data;
> + pxp->pxp_component->tee_dev = tee_kdev;
> +
> + return 0;
> +}
> +
> +static void i915_pxp_tee_component_unbind(struct device *i915_kdev,
> +   struct device *tee_kdev, void *data)
> +{
> + struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
> +
> + pxp->pxp_component = NULL;
> +}
> +
> +static const struct component_ops i915_pxp_tee_component_ops = {
> + .bind   = i915_pxp_tee_component_bind,
> + .unbind = i915_pxp_tee_component_unbind,
> +};
> +
> +int intel_pxp_tee_component_init(struct intel_pxp *pxp)
> +{
> + int ret;
> + struct intel_gt *gt = pxp_to_gt(pxp);
> + struct drm_i915_private *i915 = gt->i915;
> +
> + ret = component_add_typed(i915->drm.dev, &i915_pxp_tee_component_ops,
> +   I915_COMPONENT_PXP);
> + if (ret < 0) {
> + drm_err(&i915->drm, "

Re: [PATCH v9 10/17] drm/i915/pxp: interfaces for using protected objects

2021-09-10 Thread Rodrigo Vivi
On Fri, Sep 10, 2021 at 08:36:20AM -0700, Daniele Ceraolo Spurio wrote:
> This api allow user mode to create protected buffers and to mark
> contexts as making use of such objects. Only when using contexts
> marked in such a way is the execution guaranteed to work as expected.
> 
> Contexts can only be marked as using protected content at creation time
> (i.e. the parameter is immutable) and they must be both bannable and not
> recoverable. Given that the protected session gets invalidated on
> suspend, contexts created this way hold a runtime pm wakeref until
> they're either destroyed or invalidated.
> 
> All protected objects and contexts will be considered invalid when the
> PXP session is destroyed and all new submissions using them will be
> rejected. All intel contexts within the invalidated gem contexts will be
> marked banned. Userspace can detect that an invalidation has occurred via
> the RESET_STATS ioctl, where we report it the same way as a ban due to a
> hang.
> 
> v5: squash patches, rebase on proto_ctx, update kerneldoc
> 
> v6: rebase on obj create_ext changes
> 
> v7: Use session counter to check if an object it valid, hold wakeref in
> context, don't add a new flag to RESET_STATS (Daniel)
> 
> v8: don't increase guilty count for contexts banned during pxp
> invalidation (Rodrigo)
> 
> v9: better comments, avoid wakeref put race between pxp_inval and
> context_close, add usage examples (Rodrigo)
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: Bommu Krishnaiah 
> Cc: Rodrigo Vivi 
> Cc: Chris Wilson 
> Cc: Lionel Landwerlin 
> Cc: Jason Ekstrand 
> Cc: Daniel Vetter 

Reviewed-by: Rodrigo Vivi 


> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 98 ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  6 ++
>  .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++
>  drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 ++
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 
>  drivers/gpu/drm/i915/gem/i915_gem_object.c|  1 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.h|  6 ++
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  8 ++
>  .../gpu/drm/i915/gem/selftests/mock_context.c |  4 +-
>  drivers/gpu/drm/i915/pxp/intel_pxp.c  | 78 +++
>  drivers/gpu/drm/i915/pxp/intel_pxp.h  | 12 +++
>  drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  6 ++
>  drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  9 ++
>  include/uapi/drm/i915_drm.h   | 96 +-
>  14 files changed, 407 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index c2ab0e22db0a..3418be4f727f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -77,6 +77,8 @@
>  #include "gt/intel_gpu_commands.h"
>  #include "gt/intel_ring.h"
>  
> +#include "pxp/intel_pxp.h"
> +
>  #include "i915_gem_context.h"
>  #include "i915_trace.h"
>  #include "i915_user_extensions.h"
> @@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private 
> *i915,
>   return 0;
>  }
>  
> -static void proto_context_close(struct i915_gem_proto_context *pc)
> +static void proto_context_close(struct drm_i915_private *i915,
> + struct i915_gem_proto_context *pc)
>  {
>   int i;
>  
> + if (pc->pxp_wakeref)
> + intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref);
>   if (pc->vm)
>   i915_vm_put(pc->vm);
>   if (pc->user_engines) {
> @@ -241,6 +246,33 @@ static int proto_context_set_persistence(struct 
> drm_i915_private *i915,
>   return 0;
>  }
>  
> +static int proto_context_set_protected(struct drm_i915_private *i915,
> +struct i915_gem_proto_context *pc,
> +bool protected)
> +{
> + int ret = 0;
> +
> + if (!intel_pxp_is_enabled(&i915->gt.pxp)) {
> + ret = -ENODEV;
> + } else if (!protected) {
> + pc->uses_protected_content = false;
> + } else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
> +!(pc->user_flags & BIT(UCONTEXT_BANNABLE))) {
> + ret = -EPERM;
> + } else {
> + pc->uses_protected_content = true;
> +
> + /*
> +  * protected context usage requires the PXP session to be up,
> +  * which in turn requires the device to be active.
> +  */
> + pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> + ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
> + }
> +
> + return ret;
> +}
> +
>  static struct i915_gem_proto_context *
>  proto_context_create(struct drm_i915_private *i915, unsigned int flags)
>  {
> @@ -269,7 +301,7 @@ proto_context_create(struct drm_i915_private *i915, 
> unsigned int flags)
>   return pc;
>  
>  proto_close:
> - p

Re: [Intel-gfx] [PATCH v9 16/17] drm/i915/pxp: add PXP documentation

2021-09-10 Thread Rodrigo Vivi
On Fri, Sep 10, 2021 at 08:36:26AM -0700, Daniele Ceraolo Spurio wrote:
> Now that all the pieces are in place we can add a description of how the
> feature works. Also modify the comments in struct intel_pxp into
> kerneldoc.
> 
> v2: improve doc (Rodrigo)
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Daniel Vetter 
> Cc: Rodrigo Vivi 

Reviewed-by: Rodrigo Vivi 

> ---
>  Documentation/gpu/i915.rst |  8 
>  drivers/gpu/drm/i915/pxp/intel_pxp.c   | 28 +
>  drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 47 --
>  3 files changed, 71 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index 101dde3eb1ea..78ecb9d5ec20 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
> @@ -471,6 +471,14 @@ Object Tiling IOCTLs
>  .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
> :doc: buffer object tiling
>  
> +Protected Objects
> +-
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp.c
> +   :doc: PXP
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> +
>  Microcontrollers
>  
>  
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> index 97c6368fddc3..5610634f8929 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> @@ -11,6 +11,34 @@
>  #include "gt/intel_context.h"
>  #include "i915_drv.h"
>  
> +/**
> + * DOC: PXP
> + *
> + * PXP (Protected Xe Path) is a feature available in Gen12 and newer 
> platforms.
> + * It allows execution and flip to display of protected (i.e. encrypted)
> + * objects. The SW support is enabled via the CONFIG_DRM_I915_PXP kconfig.
> + *
> + * Objects can opt-in to PXP encryption at creation time via the
> + * I915_GEM_CREATE_EXT_PROTECTED_CONTENT create_ext flag. For objects to be
> + * correctly protected they must be used in conjunction with a context 
> created
> + * with the I915_CONTEXT_PARAM_PROTECTED_CONTENT flag. See the documentation
> + * of those two uapi flags for details and restrictions.
> + *
> + * Protected objects are tied to a pxp session; currently we only support one
> + * session, which i915 manages and whose index is available in the uapi
> + * (I915_PROTECTED_CONTENT_DEFAULT_SESSION) for use in instructions targeting
> + * protected objects.
> + * The session is invalidated by the HW when certain events occur (e.g.
> + * suspend/resume). When this happens, all the objects that were used with 
> the
> + * session are marked as invalid and all contexts marked as using protected
> + * content are banned. Any further attempt at using them in an execbuf call 
> is
> + * rejected, while flips are converted to black frames.
> + *
> + * Some of the PXP setup operations are performed by the Management Engine,
> + * which is handled by the mei driver; communication between i915 and mei is
> + * performed via the mei_pxp component module.
> + */
> +
>  /* KCR register definitions */
>  #define KCR_INIT _MMIO(0x320f0)
>  
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> index ae24064bb57e..73ef7d1754e1 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> @@ -16,42 +16,65 @@
>  struct intel_context;
>  struct i915_pxp_component;
>  
> +/**
> + * struct intel_pxp - pxp state
> + */
>  struct intel_pxp {
> + /**
> +  * @pxp_component: i915_pxp_component struct of the bound mei_pxp
> +  * module. Only set and cleared inside component bind/unbind functions,
> +  * which are protected by &tee_mutex.
> +  */
>   struct i915_pxp_component *pxp_component;
> + /**
> +  * @pxp_component_added: track if the pxp component has been added.
> +  * Set and cleared in tee init and fini functions respectively.
> +  */
>   bool pxp_component_added;
>  
> + /** @ce: kernel-owned context used for PXP operations */
>   struct intel_context *ce;
>  
> - /*
> + /** @arb_mutex: protects arb session start */
> + struct mutex arb_mutex;
> + /**
> +  * @arb_is_valid: tracks arb session status.
>* After a teardown, the arb session can still be in play on the HW
>* even if the keys are gone, so we can't rely on the HW state of the
>* session to know if it's valid and need to track the status in SW.
>*/
> - struct mutex arb_mutex; /* protects arb session start */
>   bool arb_is_valid;
>  
> - /*
> -  * Keep track of which key instance we're on, so we can use it to
> -  * determine if an object was created using the current key or a
> + /**
> +  * @key_instance: tracks which key instance we're on, so we can use it
> +  * to determine if an object was created using the current key or a
>* previous one.
>*/
>   u32 key_instance;
>  
> - struct mutex tee

Re: [PATCH] drm/ttm: add a WARN_ON in ttm_set_driver_manager when array bounds (v2)

2021-09-10 Thread Robin Murphy

On 2021-09-10 11:09, Guchun Chen wrote:

Vendors may define their own memory types on top of TTM_PL_PRIV,
but they call ttm_set_driver_manager() directly without checking the
mem_type value when setting up a memory manager. Add such a check to
warn when the type index runs past the array bounds.

v2: lower check level to WARN_ON

Signed-off-by: Leslie Shi 
Signed-off-by: Guchun Chen 
---
  include/drm/ttm/ttm_device.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 07d722950d5b..aa79953c807c 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -291,6 +291,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type)
  static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
  struct ttm_resource_manager *manager)
  {
+   WARN_ON(type >= TTM_NUM_MEM_TYPES);


Nit: I know nothing about this code, but from the context alone it would 
seem sensible to do


if (WARN_ON(type >= TTM_NUM_MEM_TYPES))
return;

to avoid making the subsequent assignment when we *know* it's invalid 
and likely to corrupt memory.


Robin.


bdev->man_drv[type] = manager;
  }
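
For reference, a sketch of how the helper could look with that suggestion folded
in (illustrative only; assumes the surrounding ttm_device.h definitions):

static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
					  struct ttm_resource_manager *manager)
{
	/* refuse to store through an out-of-range index rather than only warning */
	if (WARN_ON(type >= TTM_NUM_MEM_TYPES))
		return;

	bdev->man_drv[type] = manager;
}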
  



Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.

2021-09-10 Thread Mark Brown
On Thu, Sep 09, 2021 at 11:32:18AM +0200, Maarten Lankhorst wrote:

> This is also useful in regulator_lock_nested, which may avoid dropping
> regulator_nesting_mutex in the uncontended path, so use it there.

Acked-by: Mark Brown 


signature.asc
Description: PGP signature


Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Lucas De Marchi

On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:

We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
functions, defines and structs follow suit.

Signed-off-by: Lucas De Marchi 
---
drivers/gpu/drm/i915/Makefile  |  2 +-
drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 --
drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
.../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14 ++
5 files changed, 19 insertions(+), 19 deletions(-)
delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} (99%)
create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 232c9673a2e5..dd656f2d7721 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o

# "Graphics Technology" (aka we talk to the gpu)
gt-y += \
-   gt/debugfs_gt_pm.o \
gt/gen2_engine_cs.o \
gt/gen6_engine_cs.o \
gt/gen6_ppgtt.o \
@@ -103,6 +102,7 @@ gt-y += \
gt/intel_gt_engines_debugfs.o \
gt/intel_gt_irq.o \
gt/intel_gt_pm.o \
+   gt/intel_gt_pm_debugfs.o \
gt/intel_gt_pm_irq.o \
gt/intel_gt_requests.o \
gt/intel_gtt.o \
diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h 
b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
deleted file mode 100644
index 4cf5f5c9da7d..
--- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
+++ /dev/null
@@ -1,14 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#ifndef DEBUGFS_GT_PM_H
-#define DEBUGFS_GT_PM_H
-
-struct intel_gt;
-struct dentry;
-
-void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root);
-
-#endif /* DEBUGFS_GT_PM_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
index e5d173c235a3..4096ee893b69 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
@@ -5,10 +5,10 @@

#include 

-#include "debugfs_gt_pm.h"
#include "i915_drv.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_engines_debugfs.h"
+#include "intel_gt_pm_debugfs.h"
#include "intel_sseu_debugfs.h"
#include "uc/intel_uc_debugfs.h"

@@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt *gt)
return;

intel_gt_engines_register_debugfs(gt, root);
-   debugfs_gt_pm_register(gt, root);
+   intel_gt_pm_register_debugfs(gt, root);


This is one case I usually don't know what convention to follow since it
changes in different places.

I did it like _register_debugfs because of calls like
intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that just
below we have intel_sseu_debugfs_register(), so maybe I should consider
debugfs as part of the namespace?

Lucas De Marchi


Re: [PATCH] drm/msm: Disable frequency clamping on a630

2021-09-10 Thread Caleb Connolly




On 10/09/2021 18:18, Rob Clark wrote:

On Tue, Sep 7, 2021 at 7:20 PM Bjorn Andersson
 wrote:


On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:


On 8/9/2021 9:48 PM, Caleb Connolly wrote:



On 09/08/2021 17:12, Rob Clark wrote:

On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
 wrote:

[..]

I am a bit confused. We don't define a power domain for gpu in dt,
correct? Then what exactly set_opp do here? Do you think this usleep is
what is helping here somehow to mask the issue?

The power domains (for cx and gx) are defined in the GMU DT, the OPPs in
the GPU DT. For the sake of simplicity I'll refer to the lowest
frequency (257000000) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
the "min" state, and the highest frequency (710000000) and OPP level
(RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in
sdm845.dtsi under the gpu node.

The new devfreq behaviour unmasks what I think is a driver bug: it
inadvertently puts much more strain on the GPU regulators than they
usually get. With the new behaviour the GPU jumps from its min state to
the max state and back again extremely rapidly under workloads as small
as refreshing UI. Where previously the GPU would rarely if ever go above
342MHz when interacting with the device, it now jumps between min and
max many times per second.

If my understanding is correct, the current implementation of the GMU
set freq is the following:
   - Get OPP for frequency to set
   - Push the frequency to the GMU - immediately updating the core clock
   - Call dev_pm_opp_set_opp() which triggers a notify chain, this winds
up somewhere in power management code and causes the gx regulator level
to be updated


Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else. We
were using a different api earlier which got deprecated -
dev_pm_opp_set_bw().



On the Lenovo Yoga C630 this is reproduced by starting alacritty and if
I'm lucky I managed to hit a few keys before it crashes, so I spent a
few hours looking into this as well...

As you say, the dev_pm_opp_set_opp() will only cast a interconnect vote.
The opp-level is just there for show and isn't used by anything, at
least not on 845.

Further more, I'm missing something in my tree, so the interconnect
doesn't hit sync_state, and as such we're not actually scaling the
buses. So the problem is not that Linux doesn't turn on the buses in
time.

So I suspect that the "AHB bus error" isn't saying that we turned off
the bus, but rather that the GPU becomes unstable or something of that
sort.


Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
Aquarium for 20 minutes without a problem. I then switched the gpu
devfreq governor to "userspace" and ran the following:

while true; do
   echo 257000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
   echo 710000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
done

It took 19 iterations of this loop to crash the GPU.


I assume you still had aquarium running, to keep the gpu awake while
you ran that loop?

Fwiw, I modified this slightly to match sc7180's min/max gpu freq and
could not trigger any issue.. interestingly sc7180 has a lower min
freq (180) and higher max freq (800) so it was toggling over a wider
freq range.  I also tried on a device that  had the higher 825MHz opp
(since I noticed that was the only opp that used
RPMH_REGULATOR_LEVEL_TURBO_L1 and wanted to rule that out), but could
not reproduce.

I guess a630 (sdm845) should have higher power draw (it is 2x # of
shader cores and 2x GMEM size, but lower max freq).. the question is,
is this the reason we see this on sdm845 and not sc7180?  Or is there
some other difference.  On the gpu side of this, they are both closely
related (ie. the same "sub-generation" of a6xx, same gmu fw, etc)..
I'm less sure about the other parts (icc, rpmh, etc)


My guess would be power draw. Nobody has mentioned this yet, but I've realised that the vdd_gfx rail is powered by a buck 
converter, which could explain a lot of the symptoms.


Buck converters depend on high frequency switching and inductors to work; this inherently leads to some lag time when 
changing voltages, and also means that the behaviour of the regulator is defined in part by how much current is being 
drawn. Wikipedia has a pretty good explanation: https://en.wikipedia.org/wiki/Buck_converter


At the best of times these regulators have a known voltage ripple; when under load and when rapidly switching voltages 
this will get a lot worse.
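
As a rough, textbook first-order reference (not measurements from this board), the steady-state output ripple of a buck 
converter is approximately

	\Delta V_{out} \approx \frac{\Delta I_L}{8 f_{sw} C_{out}}, \qquad \Delta I_L = \frac{(V_{in} - V_{out}) D}{f_{sw} L}

so ripple grows with the inductor current swing and shrinks with switching frequency and output capacitance, and a sudden 
load step or voltage-target change adds a transient on top of that which the control loop needs time to settle.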


Someone with an oscilloscope and schematics could probe the rail and probably see exactly what's going on when the GPU 
crashes. Because of the lag time in the regulator changing voltage, it might be undershooting whilst the GPU is trying 
to clock up and draw more current - causing instability and crashes.


BR,
-R


So the problem doesn't seem to be Rob's change, it's just that prior to
it the chance of hitting it is way lower. The question is still what it is
that we're triggering.

Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.

2021-09-10 Thread Peter Zijlstra
On Fri, Sep 10, 2021 at 05:02:54PM +0200, Peter Zijlstra wrote:

> That doesn't look right, how's this for you?

Full patch for the robots here:

https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=locking/core&id=826e7b8826f0af185bb93249600533c33fd69a95


Re: [PATCH] drm/msm: Disable frequency clamping on a630

2021-09-10 Thread Rob Clark
On Thu, Sep 9, 2021 at 1:54 PM Rob Clark  wrote:
>
> On Thu, Sep 9, 2021 at 12:50 PM Akhil P Oommen  wrote:
> >
> > On 9/9/2021 9:42 PM, Amit Pundir wrote:
> > > On Thu, 9 Sept 2021 at 17:47, Amit Pundir  wrote:
> > >>
> > >> On Wed, 8 Sept 2021 at 07:50, Bjorn Andersson
> > >>  wrote:
> > >>>
> > >>> On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:
> > >>>
> >  On 8/9/2021 9:48 PM, Caleb Connolly wrote:
> > >
> > >
> > > On 09/08/2021 17:12, Rob Clark wrote:
> > >> On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
> > >>  wrote:
> > >>> [..]
> > >>> I am a bit confused. We don't define a power domain for gpu in dt,
> > >>> correct? Then what exactly set_opp do here? Do you think this 
> > >>> usleep is
> > >>> what is helping here somehow to mask the issue?
> > > The power domains (for cx and gx) are defined in the GMU DT, the OPPs 
> > > in
> > > the GPU DT. For the sake of simplicity I'll refer to the lowest
> > > frequency (257000000) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
> > > the "min" state, and the highest frequency (710000000) and OPP level
> > > (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined 
> > > in
> > > sdm845.dtsi under the gpu node.
> > >
> > > The new devfreq behaviour unmasks what I think is a driver bug, it
> > > inadvertently puts much more strain on the GPU regulators than they
> > > usually get. With the new behaviour the GPU jumps from it's min state 
> > > to
> > > the max state and back again extremely rapidly under workloads as 
> > > small
> > > as refreshing UI. Where previously the GPU would rarely if ever go 
> > > above
> > > 342MHz when interacting with the device, it now jumps between min and
> > > max many times per second.
> > >
> > > If my understanding is correct, the current implementation of the GMU
> > > set freq is the following:
> > >- Get OPP for frequency to set
> > >- Push the frequency to the GMU - immediately updating the core 
> > > clock
> > >- Call dev_pm_opp_set_opp() which triggers a notify chain, this 
> > > winds
> > > up somewhere in power management code and causes the gx regulator 
> > > level
> > > to be updated
> > 
> >  Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing 
> >  else. We
> >  were using a different api earlier which got deprecated -
> >  dev_pm_opp_set_bw().
> > 
> > >>>
> > >>> On the Lenovo Yoga C630 this is reproduced by starting alacritty and if
> > >>> I'm lucky I managed to hit a few keys before it crashes, so I spent a
> > >>> few hours looking into this as well...
> > >>>
> > >>> As you say, the dev_pm_opp_set_opp() will only cast a interconnect vote.
> > >>> The opp-level is just there for show and isn't used by anything, at
> > >>> least not on 845.
> > >>>
> > >>> Further more, I'm missing something in my tree, so the interconnect
> > >>> doesn't hit sync_state, and as such we're not actually scaling the
> > >>> buses. So the problem is not that Linux doesn't turn on the buses in
> > >>> time.
> > >>>
> > >>> So I suspect that the "AHB bus error" isn't saying that we turned off
> > >>> the bus, but rather that the GPU becomes unstable or something of that
> > >>> sort.
> > >>>
> > >>>
> > >>> Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
> > >>> Aquarium for 20 minutes without a problem. I then switched the gpu
> > >>> devfreq governor to "userspace" and ran the following:
> > >>>
> > >>> while true; do
> > >>>echo 257000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
> > >>>echo 710000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
> > >>> done
> > >>>
> > >>> It took 19 iterations of this loop to crash the GPU.
> > >>
> > >> Ack. With your above script, I can reproduce a crash too on db845c
> > >> (A630) running v5.14. I didn't get any crash log though and device
> > >> just rebooted to USB crash mode.
> > >>
> > >> And same crash on RB5 (A650) too https://hastebin.com/raw/ejutetuwun
> >
> > Are we sure this is the same issue? It could be, but I thought we were
> > seeing a bunch of random gpu errors (which may eventually hit device crash).
>
> In the sense that async-serror often seems to be a clk issue, it
> *could* be related.. but this would have to be triggered by CPU
> access.  The symptom does seem very different.
>

The more I think about it, the more I think this is a different
issue.. a650 is somewhat different wrt gmu (ie. hfi vs legacy code
paths).

Amit, could you try the same experiment (with 9bc95570175a ("drm/msm:
Devfreq tuning") revert) while running something like webgl aquarium
to prevent the GPU from suspending?  I'm kinda suspecting the issue
you hit is more likely some suspend/resume issue.

BR,
-R


Re: [PATCH] drm/msm: Disable frequency clamping on a630

2021-09-10 Thread Rob Clark
On Tue, Sep 7, 2021 at 7:20 PM Bjorn Andersson
 wrote:
>
> On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:
>
> > On 8/9/2021 9:48 PM, Caleb Connolly wrote:
> > >
> > >
> > > On 09/08/2021 17:12, Rob Clark wrote:
> > > > On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
> > > >  wrote:
> [..]
> > > > > I am a bit confused. We don't define a power domain for gpu in dt,
> > > > > correct? Then what exactly set_opp do here? Do you think this usleep 
> > > > > is
> > > > > what is helping here somehow to mask the issue?
> > > The power domains (for cx and gx) are defined in the GMU DT, the OPPs in
> > > the GPU DT. For the sake of simplicity I'll refer to the lowest
> > > frequency (257000000) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
> > > the "min" state, and the highest frequency (710000000) and OPP level
> > > (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in
> > > sdm845.dtsi under the gpu node.
> > >
> > > The new devfreq behaviour unmasks what I think is a driver bug, it
> > > inadvertently puts much more strain on the GPU regulators than they
> > > usually get. With the new behaviour the GPU jumps from it's min state to
> > > the max state and back again extremely rapidly under workloads as small
> > > as refreshing UI. Where previously the GPU would rarely if ever go above
> > > 342MHz when interacting with the device, it now jumps between min and
> > > max many times per second.
> > >
> > > If my understanding is correct, the current implementation of the GMU
> > > set freq is the following:
> > >   - Get OPP for frequency to set
> > >   - Push the frequency to the GMU - immediately updating the core clock
> > >   - Call dev_pm_opp_set_opp() which triggers a notify chain, this winds
> > > up somewhere in power management code and causes the gx regulator level
> > > to be updated
> >
> > Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else. We
> > were using a different api earlier which got deprecated -
> > dev_pm_opp_set_bw().
> >
>
> On the Lenovo Yoga C630 this is reproduced by starting alacritty and if
> I'm lucky I managed to hit a few keys before it crashes, so I spent a
> few hours looking into this as well...
>
> As you say, the dev_pm_opp_set_opp() will only cast a interconnect vote.
> The opp-level is just there for show and isn't used by anything, at
> least not on 845.
>
> Further more, I'm missing something in my tree, so the interconnect
> doesn't hit sync_state, and as such we're not actually scaling the
> buses. So the problem is not that Linux doesn't turn on the buses in
> time.
>
> So I suspect that the "AHB bus error" isn't saying that we turned off
> the bus, but rather that the GPU becomes unstable or something of that
> sort.
>
>
> Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
> Aquarium for 20 minutes without a problem. I then switched the gpu
> devfreq governor to "userspace" and ran the following:
>
> while true; do
>   echo 25700 > /sys/class/devfreq/500.gpu/userspace/set_freq
>   echo 71000 > /sys/class/devfreq/500.gpu/userspace/set_freq
> done
>
> It took 19 iterations of this loop to crash the GPU.

I assume you still had aquarium running, to keep the gpu awake while
you ran that loop?

Fwiw, I modified this slightly to match sc7180's min/max gpu freq and
could not trigger any issue.. interestingly sc7180 has a lower min
freq (180MHz) and a higher max freq (800MHz), so it was toggling over a
wider freq range.  I also tried on a device that had the higher 825MHz opp
(since I noticed that was the only opp that used
RPMH_REGULATOR_LEVEL_TURBO_L1 and wanted to rule that out), but could
not reproduce.

I guess a630 (sdm845) should have higher power draw (it has 2x the # of
shader cores and 2x the GMEM size, but a lower max freq).. the question is,
is this the reason we see this on sdm845 and not sc7180?  Or is there
some other difference?  On the gpu side of this, they are both closely
related (ie. the same "sub-generation" of a6xx, same gmu fw, etc)..
I'm less sure about the other parts (icc, rpmh, etc)

BR,
-R

> So the problem doesn't seem to be Rob's change, it's just that prior to
> it the chance to hitting it is way lower. Question is still what it is
> that we're triggering.
>
> Regards,
> Bjorn


Re: [PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries

2021-09-10 Thread Guenter Roeck

On 9/10/21 9:51 AM, Rob Herring wrote:

'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
is more concise and yields better error messages.

Fix a couple more cases which have appeared.

Cc: Rob Clark 
Cc: Sean Paul 
Cc: Mark Brown 
Cc: Wim Van Sebroeck 
Cc: Guenter Roeck 
Cc: Jonathan Marek 
Cc: Aswath Govindraju 
Cc: Marc Zyngier 
Cc: Linus Walleij 
Cc: dri-devel@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-...@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Signed-off-by: Rob Herring 
---
  .../bindings/display/msm/dsi-phy-7nm.yaml  |  8 
  .../devicetree/bindings/spi/omap-spi.yaml  |  6 +++---
  .../bindings/watchdog/maxim,max63xx.yaml   | 14 +++---


For watchdog:

Acked-by: Guenter Roeck 


  3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index 4265399bb154..c851770bbdf2 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -14,10 +14,10 @@ allOf:
  
  properties:

compatible:
-oneOf:
-  - const: qcom,dsi-phy-7nm
-  - const: qcom,dsi-phy-7nm-8150
-  - const: qcom,sc7280-dsi-phy-7nm
+enum:
+  - qcom,dsi-phy-7nm
+  - qcom,dsi-phy-7nm-8150
+  - qcom,sc7280-dsi-phy-7nm
  
reg:

  items:
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.yaml 
b/Documentation/devicetree/bindings/spi/omap-spi.yaml
index e55538186cf6..9952199cae11 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.yaml
+++ b/Documentation/devicetree/bindings/spi/omap-spi.yaml
@@ -84,9 +84,9 @@ unevaluatedProperties: false
  if:
properties:
  compatible:
-  oneOf:
-- const: ti,omap2-mcspi
-- const: ti,omap4-mcspi
+  enum:
+- ti,omap2-mcspi
+- ti,omap4-mcspi
  
  then:

properties:
diff --git a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml 
b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
index f2105eedac2c..ab9641e845db 100644
--- a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
+++ b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
@@ -15,13 +15,13 @@ maintainers:
  
  properties:

compatible:
-oneOf:
-  - const: maxim,max6369
-  - const: maxim,max6370
-  - const: maxim,max6371
-  - const: maxim,max6372
-  - const: maxim,max6373
-  - const: maxim,max6374
+enum:
+  - maxim,max6369
+  - maxim,max6370
+  - maxim,max6371
+  - maxim,max6372
+  - maxim,max6373
+  - maxim,max6374
  
reg:

  description: This is a 1-byte memory-mapped address





Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Christian König

Am 10.09.21 um 17:30 schrieb Thomas Hellström:

On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:


Am 10.09.21 um 15:15 schrieb Thomas Hellström:

Both the provider (resource manager) and the consumer (the TTM
driver)
want to subclass struct ttm_resource. Since this is left for the
resource
manager, we need to provide a private pointer for the TTM driver.

Provide a struct ttm_resource_private for the driver to subclass
for
data with the same lifetime as the struct ttm_resource: In the i915
case
it will, for example, be an sg-table and radix tree into the LMEM
/VRAM pages that currently are awkwardly attached to the GEM
object.

Provide an ops structure for associated ops (Which is only
destroy() ATM)
It might seem pointless to provide a separate ops structure, but
Linus
has previously made it clear that that's the norm.

After careful audit one could perhaps also on a per-driver basis
replace the delete_mem_notify() TTM driver callback with the above
destroy function.

Well this is a really big NAK to this approach.

If you need to attach some additional information to the resource
then
implement your own resource manager like everybody else does.

Well this was the long discussion we had back then when the resource
mangagers started to derive from struct resource and I was under the
impression that we had come to an agreement about the different use-
cases here, and this was my main concern.


Ok, then we somehow didn't understood each other.


I mean, it's a pretty big layer violation to do that for this use-case.


Well exactly that's the point. TTM should not have a layer design in the 
first place.


Devices, BOs, resources etc.. are base classes which should implement a 
base functionality which is then extended by the drivers to implement 
the driver specific functionality.


That is a component based approach, and not layered at all.


The TTM resource manager doesn't want to know about this data at all,
it's private to the ttm resource user layer and the resource manager
works perfectly well without it. (I assume the other drivers that
implement their own resource managers need the data that the
subclassing provides?)


Yes, that's exactly why we have the subclassing.


The fundamental problem here is that there are two layers wanting to
subclass struct ttm_resource. That means one layer gets to do that, the
second gets to use a private pointer, (which in turn can provide yet
another private pointer to a potential third layer). With your
suggestion, the second layer instead is forced to subclass each
subclassed instance it uses from  the first layer provides?


Well completely drop the layer approach/thinking here.

The resource is an object with a base class. The base class implements 
the interface TTM needs to handle the object, e.g. create/destroy/debug 
etc...


Then we need to subclass this object because without any additional 
information the object is pretty pointless.


One possibility for this is to use the range manager to implement 
something drm_mm based. BTW: We should probably rename that to something 
like ttm_res_drm_mm or similar.


What we should avoid is to abuse TTM resource interfaces in the driver, 
e.g. what i915 is currently doing. This is a TTM->resource mgr interface 
and should not be used by drivers at all.



Ofc we can do that, but it does indeed feel pretty awkward.

In any case, if you still think that's the approach we should go for,
I'd need to add init() and fini() members to the ttm_range_manager_func
struct to allow subclassing without having to unnecessarily copy the
full code?


Yes, exporting the ttm_range_manager functions as needed is one thing I 
wanted to do for the amdgpu_gtt_mgr.c code as well.


Just don't extend the function table but rather directly export the 
necessary functions.


Regards,
Christian.



Thanks,
Thomas











Regards,
Christian.


Cc: Matthew Auld 
Cc: König Christian 
Signed-off-by: Thomas Hellström 
---
   drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
   include/drm/ttm/ttm_resource.h | 28

   2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c
b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..973e7c50bfed 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object
*bo,
   void ttm_resource_free(struct ttm_buffer_object *bo, struct
ttm_resource **res)
   {
 struct ttm_resource_manager *man;
+   struct ttm_resource *resource = *res;
   
-   if (!*res)

+   if (!resource)
 return;
   
-   man = ttm_manager_type(bo->bdev, (*res)->mem_type);

-   man->func->free(man, *res);
 *res = NULL;
+   if (resource->priv)
+   resource->priv->ops.destroy(resource->priv);
+
+   man = ttm_manager_type(bo->bdev, resource->mem_type);
+   man->func->free(man, resource);
   

[PATCH] video: fbdev: atyfb: Remove assigned but never used variable statements

2021-09-10 Thread Colin King
From: Colin Ian King 

There are a couple of statements where local variables are being assigned
values that are never read because the function returns immediately after
the assignment. Clean up the code by removing them.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King 
---
 drivers/video/fbdev/aty/mach64_gx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/video/fbdev/aty/mach64_gx.c 
b/drivers/video/fbdev/aty/mach64_gx.c
index 9c37e28fb78b..d06d24830080 100644
--- a/drivers/video/fbdev/aty/mach64_gx.c
+++ b/drivers/video/fbdev/aty/mach64_gx.c
@@ -352,10 +352,8 @@ static int aty_var_to_pll_18818(const struct fb_info 
*info, u32 vclk_per,
post_divider = 1;
 
if (MHz100 > MAX_FREQ_2595) {
-   MHz100 = MAX_FREQ_2595;
return -EINVAL;
} else if (MHz100 < ABS_MIN_FREQ_2595) {
-   program_bits = 0;   /* MHz100 = 257 */
return -EINVAL;
} else {
while (MHz100 < MIN_FREQ_2595) {
-- 
2.32.0



Re: [PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries

2021-09-10 Thread Mark Brown
On Fri, Sep 10, 2021 at 11:51:53AM -0500, Rob Herring wrote:

> 'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
> is more concise and yields better error messages.

Acked-by: Mark Brown 




[PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries

2021-09-10 Thread Rob Herring
'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
is more concise and yields better error messages.

Fix a couple more cases which have appeared.

Cc: Rob Clark 
Cc: Sean Paul 
Cc: Mark Brown 
Cc: Wim Van Sebroeck 
Cc: Guenter Roeck 
Cc: Jonathan Marek 
Cc: Aswath Govindraju 
Cc: Marc Zyngier 
Cc: Linus Walleij 
Cc: dri-devel@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-...@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Signed-off-by: Rob Herring 
---
 .../bindings/display/msm/dsi-phy-7nm.yaml  |  8 
 .../devicetree/bindings/spi/omap-spi.yaml  |  6 +++---
 .../bindings/watchdog/maxim,max63xx.yaml   | 14 +++---
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index 4265399bb154..c851770bbdf2 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -14,10 +14,10 @@ allOf:
 
 properties:
   compatible:
-oneOf:
-  - const: qcom,dsi-phy-7nm
-  - const: qcom,dsi-phy-7nm-8150
-  - const: qcom,sc7280-dsi-phy-7nm
+enum:
+  - qcom,dsi-phy-7nm
+  - qcom,dsi-phy-7nm-8150
+  - qcom,sc7280-dsi-phy-7nm
 
   reg:
 items:
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.yaml 
b/Documentation/devicetree/bindings/spi/omap-spi.yaml
index e55538186cf6..9952199cae11 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.yaml
+++ b/Documentation/devicetree/bindings/spi/omap-spi.yaml
@@ -84,9 +84,9 @@ unevaluatedProperties: false
 if:
   properties:
 compatible:
-  oneOf:
-- const: ti,omap2-mcspi
-- const: ti,omap4-mcspi
+  enum:
+- ti,omap2-mcspi
+- ti,omap4-mcspi
 
 then:
   properties:
diff --git a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml 
b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
index f2105eedac2c..ab9641e845db 100644
--- a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
+++ b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
@@ -15,13 +15,13 @@ maintainers:
 
 properties:
   compatible:
-oneOf:
-  - const: maxim,max6369
-  - const: maxim,max6370
-  - const: maxim,max6371
-  - const: maxim,max6372
-  - const: maxim,max6373
-  - const: maxim,max6374
+enum:
+  - maxim,max6369
+  - maxim,max6370
+  - maxim,max6371
+  - maxim,max6372
+  - maxim,max6373
+  - maxim,max6374
 
   reg:
 description: This is a 1-byte memory-mapped address
-- 
2.30.2



Re: [RFC PATCH 0/4] Allow to use DRM fbdev emulation layer with CONFIG_FB disabled

2021-09-10 Thread Sam Ravnborg
Hi Noralf,

On Thu, Sep 09, 2021 at 06:27:02PM +0200, Noralf Trønnes wrote:
> 
> > > Hi Daniel,
> > >
> > > >
> > > > I think for a substantial improvement here in robustness what you
> really
> > > > want is
> > > > - kmscon in userspace
> > > > - disable FB layer
> > > > - ideally also disable console/vt layer in the kernel
> > > > - have a minimal emergency/boot-up log thing in drm, patches for that
> > > >   floated around a few times
> > >
> > > I assume you refer to this work by David Herrmann:
> > > "[RFC] drm: add kernel-log renderer"
> > > https://lists.freedesktop.org/archives/dri-devel/2014-March/055136.html
> > >
> >
> > There's also this:
> >
> > [PATCH v2 0/3] drm: Add panic handling
> >
> https://lore.kernel.org/dri-devel/20190311174218.51899-1-nor...@tronnes.org/
> 
> And here's a DRM console example that was part of the early drm_client work:
> 
> [RFC v4 25/25] drm/client: Hack: Add DRM VT console client
> https://lore.kernel.org/dri-devel/20180414115318.14500-26-nor...@tronnes.org/

Thanks for providing these pointers. Looking forward to finding time to
play with all this. Having an embedded board without any fbdev stuff
seems like a nice goal.

Sam


Re: Habanalabs Open-Source TPC LLVM compiler and SynapseAI Core library

2021-09-10 Thread Daniel Vetter
Forgot to add dri-devel.

On Fri, Sep 10, 2021 at 6:09 PM Daniel Vetter  wrote:
>
> On Fri, Sep 10, 2021 at 9:58 AM Greg Kroah-Hartman
>  wrote:
> > On Fri, Sep 10, 2021 at 10:26:56AM +0300, Oded Gabbay wrote:
> > > Hi Greg,
> > >
> > > Following our conversations a couple of months ago, I'm happy to tell you 
> > > that
> > > Habanalabs has open-sourced its TPC (Tensor Processing Core) LLVM 
> > > compiler,
> > > which is a fork of the LLVM open-source project.
> > >
> > > The project can be found on Habanalabs GitHub website at:
> > > https://github.com/HabanaAI/tpc_llvm
> > >
> > > There is a companion guide on how to write TPC kernels at:
> > > https://docs.habana.ai/en/latest/TPC_User_Guide/TPC_User_Guide.html
> >
> > That's great news, thanks for pushing for this and releasing it all!
>
> Yeah this is neat.
>
> There's still the problem that we spent the past 2.5 years pissing off
> a lot of people for an imo questionable political project, bypassing
> all the technical review and expertise. Now that the political
> nonsense is resolved I think we need to look at at least the technical
> cleanup. The angered people are much harder to fix, so let's maybe
> ignore that (or perhaps a ks topic, no idea, I'm honestly not super
> motivated to rehash this entire story again). Here's what I think we
> should do:
>
> - move drivers/misc/habanalabs under drivers/gpu/habanalabs and
> review/discussions on dri-devel
> - grandfather the entire current situation in as-is, it's not the only
> driver we have with a funny uapi of its own (but the other driver did
> manage to get their compiler into upstream llvm even, and not like 2
> years late)
> - review the dma-buf stuff on dri-devel and then land it through
> standard flows, not the gregk-misc bypass
> - close drivers/misc backdoor for further accel driver submissions,
> I'd like to focus on technical stuff in this area going forward and
> not pointless exercises in bypassing due process and all that
>
> I expect we'll have a proper discussion what the stack should look
> like with the next submission (from a different vendor maybe), that
> ship kinda sailed with habanalabs.
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code

2021-09-10 Thread Alex Williamson
On Fri, 10 Sep 2021 10:38:50 -0300
Jason Gunthorpe  wrote:

> On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote:
> > On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:  
> > > Every driver just emits a static string, simply feed it through the ops
> > > and provide a standard sysfs show function.  
> > 
> > Looks sensible.  But can you make the attribute optional and add a
> > comment marking it deprecated?  Because it really is completely useless.
> > We don't version userspace APIs, userspae has to discover new features
> > individually by e.g. finding new sysfs files or just trying new ioctls.  
> 
> To be honest I have no idea what side effects that would have..
> 
> device code search tells me libvirt reads it and stuffs it into some
> XML
> 
> Something called mdevctl touches it, feeds it into some JSON and
> other stuff..
> 
> qemu has some VFIO_DEVICE_API_* constants but it is all dead code
> 
> I agree it shouldn't have been there in the first place
> 
> Cornelia? Alex? Any thoughts?

It's not a version, it's a means for userspace to determine the basic
API for an mdev device without needing to go through the process of
creating a container, adding the group, setting an IOMMU type, opening
the device before being able to call VFIO_DEVICE_GET_INFO to determine
the API.  For example, it wouldn't make sense for libvirt to attach a
vfio-ccw device to a PCIe root port in a VM.  It's a means to say this
mdev device is a vfio-pci or that mdev device is a vfio-ccw.  If it were
optional, then management tools would have no basic idea how to attach
the device to a VM without gaining access to the device themselves.
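
As a rough illustration, a management tool only needs to read that one
attribute to pick the right VM wiring; here is a minimal C sketch (the
parent device path and type id below are hypothetical, only the standard
mdev_supported_types/<type>/device_api sysfs layout is assumed):

/* read the mdev type's device_api, e.g. "vfio-pci" or "vfio-ccw" */
#include <stdio.h>
#include <string.h>

int main(void)
{
	/* hypothetical parent device and type id */
	const char *path = "/sys/devices/pci0000:00/0000:00:02.0/"
			   "mdev_supported_types/i915-GVTg_V5_4/device_api";
	char api[64] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return 1;
	if (fgets(api, sizeof(api), f))
		printf("device_api: %s", api);
	fclose(f);

	/* e.g. only treat vfio-pci types as attachable to a PCIe port */
	return strncmp(api, "vfio-pci", 8) == 0 ? 0 : 2;
}
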
Thanks,

Alex



[PATCH v9 15/17] drm/i915/pxp: add pxp debugfs

2021-09-10 Thread Daniele Ceraolo Spurio
2 debugfs files, one to query the current status of the pxp session and one
to trigger an invalidation for testing.
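
For quick testing, a small userspace sketch of how the two files could be
exercised (paths assume the i915 debugfs root for card 0 and the gt/pxp
directory added here; adjust for the actual device):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	char buf[256];
	ssize_t n;
	int fd;

	/* query the current PXP session status */
	fd = open("/sys/kernel/debug/dri/0/gt/pxp/info", O_RDONLY);
	if (fd >= 0) {
		n = read(fd, buf, sizeof(buf) - 1);
		if (n > 0)
			fwrite(buf, 1, n, stdout);
		close(fd);
	}

	/* write any value to simulate a termination interrupt */
	fd = open("/sys/kernel/debug/dri/0/gt/pxp/terminate_state", O_WRONLY);
	if (fd >= 0) {
		if (write(fd, "1", 1) < 0)
			perror("terminate_state");
		close(fd);
	}

	return 0;
}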

v2: rename debugfs, fix date (Alan)

Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Alan Previn 
---
 drivers/gpu/drm/i915/Makefile|  1 +
 drivers/gpu/drm/i915/gt/debugfs_gt.c |  2 +
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 78 
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h | 21 ++
 4 files changed, 102 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 366e82cec44d..b46474ee1a1f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -285,6 +285,7 @@ i915-y += i915_perf.o
 i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
pxp/intel_pxp_cmd.o \
+   pxp/intel_pxp_debugfs.o \
pxp/intel_pxp_irq.o \
pxp/intel_pxp_pm.o \
pxp/intel_pxp_session.o \
diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt.c 
b/drivers/gpu/drm/i915/gt/debugfs_gt.c
index 591eb60785db..c27847ddb796 100644
--- a/drivers/gpu/drm/i915/gt/debugfs_gt.c
+++ b/drivers/gpu/drm/i915/gt/debugfs_gt.c
@@ -9,6 +9,7 @@
 #include "debugfs_gt.h"
 #include "debugfs_gt_pm.h"
 #include "intel_sseu_debugfs.h"
+#include "pxp/intel_pxp_debugfs.h"
 #include "uc/intel_uc_debugfs.h"
 #include "i915_drv.h"
 
@@ -28,6 +29,7 @@ void debugfs_gt_register(struct intel_gt *gt)
intel_sseu_debugfs_register(gt, root);
 
intel_uc_debugfs_register(>->uc, root);
+   intel_pxp_debugfs_register(>->pxp, root);
 }
 
 void intel_gt_debugfs_register_files(struct dentry *root,
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
new file mode 100644
index ..cbb1853676cc
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include "gt/debugfs_gt.h"
+#include "pxp/intel_pxp.h"
+#include "pxp/intel_pxp_irq.h"
+#include "i915_drv.h"
+
+static int pxp_info_show(struct seq_file *m, void *data)
+{
+   struct intel_pxp *pxp = m->private;
+   struct drm_printer p = drm_seq_file_printer(m);
+   bool enabled = intel_pxp_is_enabled(pxp);
+
+   if (!enabled) {
+   drm_printf(&p, "pxp disabled\n");
+   return 0;
+   }
+
+   drm_printf(&p, "active: %s\n", yesno(intel_pxp_is_active(pxp)));
+   drm_printf(&p, "instance counter: %u\n", pxp->key_instance);
+
+   return 0;
+}
+DEFINE_GT_DEBUGFS_ATTRIBUTE(pxp_info);
+
+static int pxp_terminate_get(void *data, u64 *val)
+{
+   /* nothing to read */
+   return -EPERM;
+}
+
+static int pxp_terminate_set(void *data, u64 val)
+{
+   struct intel_pxp *pxp = data;
+   struct intel_gt *gt = pxp_to_gt(pxp);
+
+   if (!intel_pxp_is_active(pxp))
+   return -ENODEV;
+
+   /* simulate a termination interrupt */
+   spin_lock_irq(>->irq_lock);
+   intel_pxp_irq_handler(pxp, 
GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT);
+   spin_unlock_irq(>->irq_lock);
+
+   if (!wait_for_completion_timeout(&pxp->termination,
+msecs_to_jiffies(100)))
+   return -ETIMEDOUT;
+
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(pxp_terminate_fops, pxp_terminate_get, 
pxp_terminate_set, "%llx\n");
+void intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry *gt_root)
+{
+   static const struct debugfs_gt_file files[] = {
+   { "info", &pxp_info_fops, NULL },
+   { "terminate_state", &pxp_terminate_fops, NULL },
+   };
+   struct dentry *root;
+
+   if (!gt_root)
+   return;
+
+   if (!HAS_PXP((pxp_to_gt(pxp)->i915)))
+   return;
+
+   root = debugfs_create_dir("pxp", gt_root);
+   if (IS_ERR(root))
+   return;
+
+   intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), pxp);
+}
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
new file mode 100644
index ..7e0c3d2f5d7e
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#ifndef __INTEL_PXP_DEBUGFS_H__
+#define __INTEL_PXP_DEBUGFS_H__
+
+struct intel_pxp;
+struct dentry;
+
+#ifdef CONFIG_DRM_I915_PXP
+void intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry *root);
+#else
+static inline void
+intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry *root)
+{
+}
+#endif
+
+#endif /* __INTEL_PXP_DEBUGFS_H__ */
-- 
2.25.1



[PATCH v9 16/17] drm/i915/pxp: add PXP documentation

2021-09-10 Thread Daniele Ceraolo Spurio
Now that all the pieces are in place we can add a description of how the
feature works. Also modify the comments in struct intel_pxp into
kerneldoc.

v2: improve doc (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Cc: Daniel Vetter 
Cc: Rodrigo Vivi 
---
 Documentation/gpu/i915.rst |  8 
 drivers/gpu/drm/i915/pxp/intel_pxp.c   | 28 +
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 47 --
 3 files changed, 71 insertions(+), 12 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 101dde3eb1ea..78ecb9d5ec20 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -471,6 +471,14 @@ Object Tiling IOCTLs
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
:doc: buffer object tiling
 
+Protected Objects
+-
+
+.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp.c
+   :doc: PXP
+
+.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp_types.h
+
 Microcontrollers
 
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 97c6368fddc3..5610634f8929 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -11,6 +11,34 @@
 #include "gt/intel_context.h"
 #include "i915_drv.h"
 
+/**
+ * DOC: PXP
+ *
+ * PXP (Protected Xe Path) is a feature available in Gen12 and newer platforms.
+ * It allows execution and flip to display of protected (i.e. encrypted)
+ * objects. The SW support is enabled via the CONFIG_DRM_I915_PXP kconfig.
+ *
+ * Objects can opt-in to PXP encryption at creation time via the
+ * I915_GEM_CREATE_EXT_PROTECTED_CONTENT create_ext flag. For objects to be
+ * correctly protected they must be used in conjunction with a context created
+ * with the I915_CONTEXT_PARAM_PROTECTED_CONTENT flag. See the documentation
+ * of those two uapi flags for details and restrictions.
+ *
+ * Protected objects are tied to a pxp session; currently we only support one
+ * session, which i915 manages and whose index is available in the uapi
+ * (I915_PROTECTED_CONTENT_DEFAULT_SESSION) for use in instructions targeting
+ * protected objects.
+ * The session is invalidated by the HW when certain events occur (e.g.
+ * suspend/resume). When this happens, all the objects that were used with the
+ * session are marked as invalid and all contexts marked as using protected
+ * content are banned. Any further attempt at using them in an execbuf call is
+ * rejected, while flips are converted to black frames.
+ *
+ * Some of the PXP setup operations are performed by the Management Engine,
+ * which is handled by the mei driver; communication between i915 and mei is
+ * performed via the mei_pxp component module.
+ */
+
 /* KCR register definitions */
 #define KCR_INIT _MMIO(0x320f0)
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
index ae24064bb57e..73ef7d1754e1 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
@@ -16,42 +16,65 @@
 struct intel_context;
 struct i915_pxp_component;
 
+/**
+ * struct intel_pxp - pxp state
+ */
 struct intel_pxp {
+   /**
+* @pxp_component: i915_pxp_component struct of the bound mei_pxp
+* module. Only set and cleared inside component bind/unbind functions,
+* which are protected by &tee_mutex.
+*/
struct i915_pxp_component *pxp_component;
+   /**
+* @pxp_component_added: track if the pxp component has been added.
+* Set and cleared in tee init and fini functions respectively.
+*/
bool pxp_component_added;
 
+   /** @ce: kernel-owned context used for PXP operations */
struct intel_context *ce;
 
-   /*
+   /** @arb_mutex: protects arb session start */
+   struct mutex arb_mutex;
+   /**
+* @arb_is_valid: tracks arb session status.
 * After a teardown, the arb session can still be in play on the HW
 * even if the keys are gone, so we can't rely on the HW state of the
 * session to know if it's valid and need to track the status in SW.
 */
-   struct mutex arb_mutex; /* protects arb session start */
bool arb_is_valid;
 
-   /*
-* Keep track of which key instance we're on, so we can use it to
-* determine if an object was created using the current key or a
+   /**
+* @key_instance: tracks which key instance we're on, so we can use it
+* to determine if an object was created using the current key or a
 * previous one.
 */
u32 key_instance;
 
-   struct mutex tee_mutex; /* protects the tee channel binding */
+   /** @tee_mutex: protects the tee channel binding and messaging. */
+   struct mutex tee_mutex;
 
-   /*
-* If the HW perceives an attack on the integrity of the encryption it
-* will invalidate the keys and expect SW

[PATCH v9 12/17] drm/i915/pxp: Enable PXP power management

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

During the power event S3+ sleep/resume, hardware will lose all the
encryption keys for every hardware session, even though the
session state might still be marked as alive after resume. Therefore,
we should consider the session as dead on suspend and invalidate all the
objects. The session will be automatically restarted on the first
protected submission on resume.

v2: runtime suspend also invalidates the keys
v3: fix return codes, simplify rpm ops (Chris), use the new worker func
v4: invalidate the objects on suspend, don't re-create the arb sesson on
resume (delayed to first submission).
v5: move irq changes back to irq patch (Rodrigo)
v6: drop invalidation in runtime suspend (Rodrigo)

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile|  1 +
 drivers/gpu/drm/i915/gt/intel_gt_pm.c| 15 ++-
 drivers/gpu/drm/i915/i915_drv.c  |  2 +
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c |  1 +
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.c  | 46 
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.h  | 23 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 38 +++-
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c |  9 
 8 files changed, 124 insertions(+), 11 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_pm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b22b8c195bb8..366e82cec44d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -286,6 +286,7 @@ i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
pxp/intel_pxp_cmd.o \
pxp/intel_pxp_irq.o \
+   pxp/intel_pxp_pm.o \
pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index dea8e2479897..b47a8d8f1bb5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -18,6 +18,7 @@
 #include "intel_rc6.h"
 #include "intel_rps.h"
 #include "intel_wakeref.h"
+#include "pxp/intel_pxp_pm.h"
 
 static void user_forcewake(struct intel_gt *gt, bool suspend)
 {
@@ -262,6 +263,8 @@ int intel_gt_resume(struct intel_gt *gt)
 
intel_uc_resume(>->uc);
 
+   intel_pxp_resume(>->pxp);
+
user_forcewake(gt, false);
 
 out_fw:
@@ -296,6 +299,7 @@ void intel_gt_suspend_prepare(struct intel_gt *gt)
user_forcewake(gt, true);
wait_for_suspend(gt);
 
+   intel_pxp_suspend(>->pxp, false);
intel_uc_suspend(>->uc);
 }
 
@@ -346,6 +350,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
 
 void intel_gt_runtime_suspend(struct intel_gt *gt)
 {
+   intel_pxp_suspend(>->pxp, true);
intel_uc_runtime_suspend(>->uc);
 
GT_TRACE(gt, "\n");
@@ -353,11 +358,19 @@ void intel_gt_runtime_suspend(struct intel_gt *gt)
 
 int intel_gt_runtime_resume(struct intel_gt *gt)
 {
+   int ret;
+
GT_TRACE(gt, "\n");
intel_gt_init_swizzling(gt);
intel_ggtt_restore_fences(gt->ggtt);
 
-   return intel_uc_runtime_resume(>->uc);
+   ret = intel_uc_runtime_resume(>->uc);
+   if (ret)
+   return ret;
+
+   intel_pxp_resume(>->pxp);
+
+   return 0;
 }
 
 static ktime_t __intel_gt_get_awake_time(const struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 59fb4c710c8c..d5bcc70a22d4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -67,6 +67,8 @@
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"
 
+#include "pxp/intel_pxp_pm.h"
+
 #include "i915_debugfs.h"
 #include "i915_drv.h"
 #include "i915_ioc32.h"
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
index 340f20d130a8..9e5847c653f2 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
@@ -9,6 +9,7 @@
 #include "gt/intel_gt_irq.h"
 #include "i915_irq.h"
 #include "i915_reg.h"
+#include "intel_runtime_pm.h"
 
 /**
  * intel_pxp_irq_handler - Handles PXP interrupts.
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
new file mode 100644
index ..23fd86de5a24
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020 Intel Corporation.
+ */
+
+#include "intel_pxp.h"
+#include "intel_pxp_irq.h"
+#include "intel_pxp_pm.h"
+#include "intel_pxp_session.h"
+
+void intel_pxp_suspend(struct intel_pxp *pxp, bool runtime)
+{
+   if (!intel_pxp_is_enabled(pxp))
+   return;
+
+   pxp->arb_is_valid = false;
+
+   /*
+* Contexts using protected objects keep a runtime PM reference, so we
+* can only runtime suspend when all of t

[PATCH v9 13/17] drm/i915/pxp: Add plane decryption support

2021-09-10 Thread Daniele Ceraolo Spurio
From: Anshuman Gupta 

Add support to enable/disable PLANE_SURF Decryption Request bit.
Plane decryption is only enabled when the following conditions are met:
1. The PXP session is enabled.
2. The buffer object is protected.

v2:
- Used gen fb obj user_flags instead gem_object_metadata. [Krishna]

v3:
- intel_pxp_gem_object_status() API changes.

v4: use intel_pxp_is_active (Daniele)

v5: rebase and use the new protected object status checker (Daniele)

v6: used plane state for plane_decryption to handle async flip
as suggested by Ville.

v7: check pxp session while plane decrypt state computation. [Ville]
removed pointless code. [Ville]

v8 (Daniele): update PXP check

v9: move decrypt check after icl_check_nv12_planes() when overlays
have fb set (Juston)

v10 (Daniele): update PXP check again to match rework in earlier patches and
don't consider protection valid if the object has not been used in an
execbuf beforehand.

Cc: Bommu Krishnaiah 
Cc: Huang Sean Z 
Cc: Gaurav Kumar 
Cc: Ville Syrjälä 
Signed-off-by: Anshuman Gupta 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Juston Li 
Reviewed-by: Rodrigo Vivi 
Reviewed-by: Uma Shankar  #v9
---
 drivers/gpu/drm/i915/display/intel_display.c  | 26 +++
 .../drm/i915/display/intel_display_types.h|  3 +++
 .../drm/i915/display/skl_universal_plane.c| 15 ---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  2 +-
 drivers/gpu/drm/i915/i915_reg.h   |  1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c  |  9 ---
 drivers/gpu/drm/i915/pxp/intel_pxp.h  |  7 +++--
 7 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index a7ca38613f89..7c19a7b0676a 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -71,6 +71,8 @@
 #include "gt/intel_rps.h"
 #include "gt/gen8_ppgtt.h"
 
+#include "pxp/intel_pxp.h"
+
 #include "g4x_dp.h"
 #include "g4x_hdmi.h"
 #include "i915_drv.h"
@@ -8994,13 +8996,23 @@ static int intel_bigjoiner_add_affected_planes(struct 
intel_atomic_state *state)
return 0;
 }
 
+static bool bo_has_valid_encryption(struct drm_i915_gem_object *obj)
+{
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+   return intel_pxp_key_check(&i915->gt.pxp, obj, false) == 0;
+}
+
 static int intel_atomic_check_planes(struct intel_atomic_state *state)
 {
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc_state *old_crtc_state, *new_crtc_state;
struct intel_plane_state *plane_state;
struct intel_plane *plane;
+   struct intel_plane_state *new_plane_state;
+   struct intel_plane_state *old_plane_state;
struct intel_crtc *crtc;
+   const struct drm_framebuffer *fb;
int i, ret;
 
ret = icl_add_linked_planes(state);
@@ -9048,6 +9060,16 @@ static int intel_atomic_check_planes(struct 
intel_atomic_state *state)
return ret;
}
 
+   for_each_new_intel_plane_in_state(state, plane, plane_state, i) {
+   new_plane_state = intel_atomic_get_new_plane_state(state, 
plane);
+   old_plane_state = intel_atomic_get_old_plane_state(state, 
plane);
+   fb = new_plane_state->hw.fb;
+   if (fb)
+   new_plane_state->decrypt = 
bo_has_valid_encryption(intel_fb_obj(fb));
+   else
+   new_plane_state->decrypt = old_plane_state->decrypt;
+   }
+
return 0;
 }
 
@@ -9334,6 +9356,10 @@ static int intel_atomic_check_async(struct 
intel_atomic_state *state)
drm_dbg_kms(&i915->drm, "Color range cannot be changed 
in async flip\n");
return -EINVAL;
}
+
+   /* plane decryption is allow to change only in synchronous 
flips */
+   if (old_plane_state->decrypt != new_plane_state->decrypt)
+   return -EINVAL;
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index e9e806d90eec..d75c8bd39abc 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -625,6 +625,9 @@ struct intel_plane_state {
 
struct intel_fb_view view;
 
+   /* Plane pxp decryption state */
+   bool decrypt;
+
/* plane control register */
u32 ctl;
 
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 
b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 724e7b04f3b6..55e3f093b951 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -18,6 +18,7 @@
 #include "intel_sprite.h"
 #include "skl_scaler.h"
 #include "skl_universal_plane.h"
+#include "pxp/intel_pxp.h"
 

[PATCH v9 09/17] drm/i915/pxp: Implement PXP irq handler

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

The HW will generate a teardown interrupt when session termination is
required, which requires i915 to submit a terminating batch. Once the HW
is done with the termination it will generate another interrupt, at
which point it is safe to re-create the session.

Since the termination and re-creation flow is something we want to
trigger from the driver as well, use a common work function that can be
called both from the irq handler and from the driver set-up flows, which
has the addded benefit of allowing us to skip any extra locks because
the work itself serializes the operations.

v2: use struct completion instead of bool (Chris)
v3: drop locks, clean up functions and improve comments (Chris),
move to common work function.
v4: improve comments, simplify wait logic (Rodrigo)
v5: unconditionally set interrupts, rename state_attacked var (Rodrigo)

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Cc: Rodrigo Vivi 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile|  1 +
 drivers/gpu/drm/i915/gt/intel_gt_irq.c   |  7 ++
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c | 66 +++--
 drivers/gpu/drm/i915/pxp/intel_pxp.h |  8 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c | 99 
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h | 32 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 54 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h |  5 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c |  8 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h   | 18 
 11 files changed, 283 insertions(+), 16 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 4fb663de344d..b22b8c195bb8 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -285,6 +285,7 @@ i915-y += i915_perf.o
 i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
pxp/intel_pxp_cmd.o \
+   pxp/intel_pxp_irq.o \
pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c 
b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index b2de83be4d97..699a74582d32 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -13,6 +13,7 @@
 #include "intel_lrc_reg.h"
 #include "intel_uncore.h"
 #include "intel_rps.h"
+#include "pxp/intel_pxp_irq.h"
 
 static void guc_irq_handler(struct intel_guc *guc, u16 iir)
 {
@@ -64,6 +65,9 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 
instance,
if (instance == OTHER_GTPM_INSTANCE)
return gen11_rps_irq_handler(>->rps, iir);
 
+   if (instance == OTHER_KCR_INSTANCE)
+   return intel_pxp_irq_handler(>->pxp, iir);
+
WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
  instance, iir);
 }
@@ -196,6 +200,9 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK,  ~0);
intel_uncore_write(uncore, GEN11_GUC_SG_INTR_ENABLE, 0);
intel_uncore_write(uncore, GEN11_GUC_SG_INTR_MASK,  ~0);
+
+   intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_ENABLE, 0);
+   intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_MASK,  ~0);
 }
 
 void gen11_gt_irq_postinstall(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c2853cc005ee..84bc884bd474 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8117,6 +8117,7 @@ enum {
 /* irq instances for OTHER_CLASS */
 #define OTHER_GUC_INSTANCE 0
 #define OTHER_GTPM_INSTANCE1
+#define OTHER_KCR_INSTANCE 4
 
 #define GEN11_INTR_IDENTITY_REG(x) _MMIO(0x190060 + ((x) * 4))
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 26176d43a02d..b0c7edc10cc3 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -2,7 +2,9 @@
 /*
  * Copyright(c) 2020 Intel Corporation.
  */
+#include 
 #include "intel_pxp.h"
+#include "intel_pxp_irq.h"
 #include "intel_pxp_session.h"
 #include "intel_pxp_tee.h"
 #include "gt/intel_context.h"
@@ -68,6 +70,16 @@ void intel_pxp_init(struct intel_pxp *pxp)
 
mutex_init(&pxp->tee_mutex);
 
+   /*
+* we'll use the completion to check if there is a termination pending,
+* so we start it as completed and we reinit it when a termination
+* is triggered.
+*/
+   init_completion(&pxp->termination);
+   complete_all(&pxp->termination);
+
+   INIT_WORK(&pxp->session_work, intel_pxp_session_work);
+
ret = create_vcs_context(pxp);
if (ret)
return;
@@ -96,19 +108,61 @@ void intel_pxp_fini(struct in

[PATCH v9 11/17] drm/i915/pxp: start the arb session on demand

2021-09-10 Thread Daniele Ceraolo Spurio
Now that we can handle destruction and re-creation of the arb session,
we can postpone the start of the session to the first submission that
requires it, to avoid keeping it running with no user.

Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c  |  4 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp.c | 37 +---
 drivers/gpu/drm/i915/pxp/intel_pxp.h |  5 +--
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c |  2 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c |  6 ++--
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 10 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h   |  2 ++
 7 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 3418be4f727f..f1a6cfc33148 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -267,7 +267,9 @@ static int proto_context_set_protected(struct 
drm_i915_private *i915,
 * which in turn requires the device to be active.
 */
pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
-   ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
+
+   if (!intel_pxp_is_active(&i915->gt.pxp))
+   ret = intel_pxp_start(&i915->gt.pxp);
}
 
return ret;
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index e49e60567a56..e183ac479e8b 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -79,6 +79,7 @@ void intel_pxp_init(struct intel_pxp *pxp)
init_completion(&pxp->termination);
complete_all(&pxp->termination);
 
+   mutex_init(&pxp->arb_mutex);
INIT_WORK(&pxp->session_work, intel_pxp_session_work);
 
ret = create_vcs_context(pxp);
@@ -115,7 +116,7 @@ void intel_pxp_mark_termination_in_progress(struct 
intel_pxp *pxp)
reinit_completion(&pxp->termination);
 }
 
-static void intel_pxp_queue_termination(struct intel_pxp *pxp)
+static void pxp_queue_termination(struct intel_pxp *pxp)
 {
struct intel_gt *gt = pxp_to_gt(pxp);
 
@@ -134,31 +135,41 @@ static void intel_pxp_queue_termination(struct intel_pxp 
*pxp)
  * the arb session is restarted from the irq work when we receive the
  * termination completion interrupt
  */
-int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp)
+int intel_pxp_start(struct intel_pxp *pxp)
 {
+   int ret = 0;
+
if (!intel_pxp_is_enabled(pxp))
-   return 0;
+   return -ENODEV;
+
+   mutex_lock(&pxp->arb_mutex);
+
+   if (pxp->arb_is_valid)
+   goto unlock;
+
+   pxp_queue_termination(pxp);
 
if (!wait_for_completion_timeout(&pxp->termination,
-msecs_to_jiffies(100)))
-   return -ETIMEDOUT;
+   msecs_to_jiffies(100))) {
+   ret = -ETIMEDOUT;
+   goto unlock;
+   }
+
+   /* make sure the compiler doesn't optimize the double access */
+   barrier();
 
if (!pxp->arb_is_valid)
-   return -EIO;
+   ret = -EIO;
 
-   return 0;
+unlock:
+   mutex_unlock(&pxp->arb_mutex);
+   return ret;
 }
 
 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
kcr_pxp_enable(pxp_to_gt(pxp));
intel_pxp_irq_enable(pxp);
-
-   /*
-* the session could've been attacked while we weren't loaded, so
-* handle it as if it was and re-create it.
-*/
-   intel_pxp_queue_termination(pxp);
 }
 
 void intel_pxp_fini_hw(struct intel_pxp *pxp)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index f942bdd2af0c..424fe00a91fb 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -34,7 +34,8 @@ void intel_pxp_init_hw(struct intel_pxp *pxp);
 void intel_pxp_fini_hw(struct intel_pxp *pxp);
 
 void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp);
-int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp);
+
+int intel_pxp_start(struct intel_pxp *pxp);
 
 int intel_pxp_key_check(struct intel_pxp *pxp, struct drm_i915_gem_object 
*obj);
 
@@ -48,7 +49,7 @@ static inline void intel_pxp_fini(struct intel_pxp *pxp)
 {
 }
 
-static inline int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp)
+static inline int intel_pxp_start(struct intel_pxp *pxp)
 {
return -ENODEV;
 }
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
index 46eca1e81b9b..340f20d130a8 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
@@ -31,7 +31,7 @@ void intel_pxp_irq_handler(struct intel_pxp *pxp, u16 iir)
   GEN12_DISPLAY_APP_TERMINATED_PER_FW_REQ_INTERRUPT)) {
  

[PATCH v9 17/17] drm/i915/pxp: enable PXP for integrated Gen12

2021-09-10 Thread Daniele Ceraolo Spurio
Note that discrete cards can support PXP as well, but we haven't tested
on those yet so keeping it disabled for now.

Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/i915_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index d4a6a9dcf182..169837de395d 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -865,6 +865,7 @@ static const struct intel_device_info jsl_info = {
}, \
TGL_CURSOR_OFFSETS, \
.has_global_mocs = 1, \
+   .has_pxp = 1, \
.display.has_dsb = 1
 
 static const struct intel_device_info tgl_info = {
@@ -891,6 +892,7 @@ static const struct intel_device_info rkl_info = {
 #define DGFX_FEATURES \
.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
.has_llc = 0, \
+   .has_pxp = 0, \
.has_snoop = 1, \
.is_dgfx = 1
 
-- 
2.25.1



[PATCH v9 14/17] drm/i915/pxp: black pixels on pxp disabled

2021-09-10 Thread Daniele Ceraolo Spurio
From: Anshuman Gupta 

When a protected surface has been flipped and the pxp session is disabled,
display black pixels by using plane color CTM correction.
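
(With an all-zero plane CSC coefficient matrix and zero pre/post offsets,
every input pixel maps to output = 0 * input + 0, i.e. black, regardless
of the framebuffer contents.)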

v2:
- Display black pixels in async flip too.

v3:
- Removed the black pixels logic for async flip. [Ville]
- Used plane state to force black pixels. [Ville]

v4 (Daniele): update pxp_is_borked check.

v5: rebase on top of v9 plane decryption moving the decrypt check
(Juston)

Cc: Ville Syrjälä 
Cc: Gaurav Kumar 
Cc: Shankar Uma 
Signed-off-by: Anshuman Gupta 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Juston Li 
Reviewed-by: Rodrigo Vivi 
Reviewed-by: Uma Shankar 
---
 drivers/gpu/drm/i915/display/intel_display.c  | 12 -
 .../drm/i915/display/intel_display_types.h|  3 ++
 .../drm/i915/display/skl_universal_plane.c| 36 ++-
 drivers/gpu/drm/i915/i915_reg.h   | 46 +++
 4 files changed, 94 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 7c19a7b0676a..755f3e32516d 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -9003,6 +9003,11 @@ static bool bo_has_valid_encryption(struct 
drm_i915_gem_object *obj)
return intel_pxp_key_check(&i915->gt.pxp, obj, false) == 0;
 }
 
+static bool pxp_is_borked(struct drm_i915_gem_object *obj)
+{
+   return i915_gem_object_is_protected(obj) && 
!bo_has_valid_encryption(obj);
+}
+
 static int intel_atomic_check_planes(struct intel_atomic_state *state)
 {
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
@@ -9064,10 +9069,13 @@ static int intel_atomic_check_planes(struct 
intel_atomic_state *state)
new_plane_state = intel_atomic_get_new_plane_state(state, 
plane);
old_plane_state = intel_atomic_get_old_plane_state(state, 
plane);
fb = new_plane_state->hw.fb;
-   if (fb)
+   if (fb) {
new_plane_state->decrypt = 
bo_has_valid_encryption(intel_fb_obj(fb));
-   else
+   new_plane_state->force_black = 
pxp_is_borked(intel_fb_obj(fb));
+   } else {
new_plane_state->decrypt = old_plane_state->decrypt;
+   new_plane_state->force_black = 
old_plane_state->force_black;
+   }
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index d75c8bd39abc..9fa4ef06e377 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -628,6 +628,9 @@ struct intel_plane_state {
/* Plane pxp decryption state */
bool decrypt;
 
+   /* Plane state to display black pixels when pxp is borked */
+   bool force_black;
+
/* plane control register */
u32 ctl;
 
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 
b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 55e3f093b951..c4adcb3e12b3 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -1002,6 +1002,33 @@ static u32 skl_surf_address(const struct 
intel_plane_state *plane_state,
}
 }
 
+static void intel_load_plane_csc_black(struct intel_plane *intel_plane)
+{
+   struct drm_i915_private *dev_priv = to_i915(intel_plane->base.dev);
+   enum pipe pipe = intel_plane->pipe;
+   enum plane_id plane = intel_plane->id;
+   u16 postoff = 0;
+
+   drm_dbg_kms(&dev_priv->drm, "plane color CTM to black  %s:%d\n",
+   intel_plane->base.name, plane);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 0), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 1), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 2), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 3), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 4), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 5), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 0), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 1), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 2), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 0), postoff);
+   intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 1), postoff);
+   intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 2), postoff);
+}
+
 static void
 skl_program_plane(struct intel_plane *plane,
  const struct intel_crtc_state *crtc_state,
@@ -1115,14 +1142,21 @@ skl_program_plane(struct intel_plane *plane,
 */
intel_de_write_fw(dev_priv, PLANE_CTL(pipe, plane_id), plane_ctl);
plane_surf = intel_plane_ggtt_offs

[PATCH v9 10/17] drm/i915/pxp: interfaces for using protected objects

2021-09-10 Thread Daniele Ceraolo Spurio
This API allows user mode to create protected buffers and to mark
contexts as making use of such objects. Only when using contexts
marked in such a way is the execution guaranteed to work as expected.

Contexts can only be marked as using protected content at creation time
(i.e. the parameter is immutable) and they must be both bannable and not
recoverable. Given that the protected session gets invalidated on
suspend, contexts created this way hold a runtime pm wakeref until
they're either destroyed or invalidated.

All protected objects and contexts will be considered invalid when the
PXP session is destroyed and all new submissions using them will be
rejected. All intel contexts within the invalidated gem contexts will be
marked banned. Userspace can detect that an invalidation has occurred via
the RESET_STATS ioctl, where we report it the same way as a ban due to a
hang.
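
For reference, a minimal userspace sketch of the intended flow. The names
match the uapi additions in this patch (see the i915_drm.h hunk below),
but this is illustrative only and error handling is trimmed:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static uint64_t to_user_ptr(void *p) { return (uintptr_t)p; }

/* create a protected (bannable, non-recoverable) context and a protected BO */
static int create_protected(int fd, uint32_t *ctx_id, uint32_t *handle)
{
	struct drm_i915_gem_context_create_ext_setparam p_prot = {
		.base = { .name = I915_CONTEXT_CREATE_EXT_SETPARAM },
		.param = {
			.param = I915_CONTEXT_PARAM_PROTECTED_CONTENT,
			.value = 1,
		},
	};
	struct drm_i915_gem_context_create_ext_setparam p_norecover = {
		.base = {
			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
			.next_extension = to_user_ptr(&p_prot),
		},
		.param = { .param = I915_CONTEXT_PARAM_RECOVERABLE, .value = 0 },
	};
	struct drm_i915_gem_context_create_ext ctx = {
		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
		.extensions = to_user_ptr(&p_norecover),
	};
	struct drm_i915_gem_create_ext_protected_content prot_ext = {
		.base = { .name = I915_GEM_CREATE_EXT_PROTECTED_CONTENT },
	};
	struct drm_i915_gem_create_ext create = {
		.size = 4096,
		.extensions = to_user_ptr(&prot_ext),
	};

	if (ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &ctx))
		return -1;
	if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create))
		return -1;

	*ctx_id = ctx.ctx_id;
	*handle = create.handle;
	return 0;
}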

v5: squash patches, rebase on proto_ctx, update kerneldoc

v6: rebase on obj create_ext changes

v7: Use session counter to check if an object it valid, hold wakeref in
context, don't add a new flag to RESET_STATS (Daniel)

v8: don't increase guilty count for contexts banned during pxp
invalidation (Rodrigo)

v9: better comments, avoid wakeref put race between pxp_inval and
context_close, add usage examples (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Bommu Krishnaiah 
Cc: Rodrigo Vivi 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 98 ---
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  6 ++
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  6 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  8 ++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  4 +-
 drivers/gpu/drm/i915/pxp/intel_pxp.c  | 78 +++
 drivers/gpu/drm/i915/pxp/intel_pxp.h  | 12 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  6 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  9 ++
 include/uapi/drm/i915_drm.h   | 96 +-
 14 files changed, 407 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..3418be4f727f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -77,6 +77,8 @@
 #include "gt/intel_gpu_commands.h"
 #include "gt/intel_ring.h"
 
+#include "pxp/intel_pxp.h"
+
 #include "i915_gem_context.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private 
*i915,
return 0;
 }
 
-static void proto_context_close(struct i915_gem_proto_context *pc)
+static void proto_context_close(struct drm_i915_private *i915,
+   struct i915_gem_proto_context *pc)
 {
int i;
 
+   if (pc->pxp_wakeref)
+   intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref);
if (pc->vm)
i915_vm_put(pc->vm);
if (pc->user_engines) {
@@ -241,6 +246,33 @@ static int proto_context_set_persistence(struct 
drm_i915_private *i915,
return 0;
 }
 
+static int proto_context_set_protected(struct drm_i915_private *i915,
+  struct i915_gem_proto_context *pc,
+  bool protected)
+{
+   int ret = 0;
+
+   if (!intel_pxp_is_enabled(&i915->gt.pxp)) {
+   ret = -ENODEV;
+   } else if (!protected) {
+   pc->uses_protected_content = false;
+   } else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
+  !(pc->user_flags & BIT(UCONTEXT_BANNABLE))) {
+   ret = -EPERM;
+   } else {
+   pc->uses_protected_content = true;
+
+   /*
+* protected context usage requires the PXP session to be up,
+* which in turn requires the device to be active.
+*/
+   pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
+   ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
+   }
+
+   return ret;
+}
+
 static struct i915_gem_proto_context *
 proto_context_create(struct drm_i915_private *i915, unsigned int flags)
 {
@@ -269,7 +301,7 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
return pc;
 
 proto_close:
-   proto_context_close(pc);
+   proto_context_close(i915, pc);
return err;
 }
 
@@ -693,6 +725,8 @@ static int set_proto_ctx_param(struct drm_i915_file_private 
*fpriv,
ret = -EPERM;
else if (args->value)
pc->user_flag

[PATCH v9 08/17] drm/i915/pxp: Implement arb session teardown

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

Teardown is triggered when the display topology changes and no longer
meets the secure playback requirement, at which point the hardware trashes
all the encryption keys for display. Additionally, we want to emit a
teardown operation to make sure we're clean on boot and resume.

v2: emit in the ring, use high prio request (Chris)
v3: better defines, stalling flush, cleaned up and renamed submission
funcs (Chris)

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile|   1 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  22 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp.c |   7 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c | 141 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h |  15 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c |  29 
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h |   1 +
 7 files changed, 212 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 405e04f4dd59..4fb663de344d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -284,6 +284,7 @@ i915-y += i915_perf.o
 # Protected execution platform (PXP) support
 i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
+   pxp/intel_pxp_cmd.o \
pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 1c3af0fc0456..ec2a0a566c40 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -28,10 +28,13 @@
 #define INSTR_26_TO_24_MASK0x700
 #define   INSTR_26_TO_24_SHIFT 24
 
+#define __INSTR(client) ((client) << INSTR_CLIENT_SHIFT)
+
 /*
  * Memory interface instructions used by the kernel
  */
-#define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags))
+#define MI_INSTR(opcode, flags) \
+   (__INSTR(INSTR_MI_CLIENT) | (opcode) << 23 | (flags))
 /* Many MI commands use bit 22 of the header dword for GGTT vs PPGTT */
 #define  MI_GLOBAL_GTT(1<<22)
 
@@ -57,6 +60,7 @@
 #define MI_SUSPEND_FLUSH   MI_INSTR(0x0b, 0)
 #define   MI_SUSPEND_FLUSH_EN  (1<<0)
 #define MI_SET_APPID   MI_INSTR(0x0e, 0)
+#define   MI_SET_APPID_SESSION_ID(x)   ((x) << 0)
 #define MI_OVERLAY_FLIPMI_INSTR(0x11, 0)
 #define   MI_OVERLAY_CONTINUE  (0x0<<21)
 #define   MI_OVERLAY_ON(0x1<<21)
@@ -146,6 +150,7 @@
 #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
 #define   MI_SRM_LRM_GLOBAL_GTT(1<<22)
 #define MI_FLUSH_DWMI_INSTR(0x26, 1) /* for GEN6 */
+#define   MI_FLUSH_DW_PROTECTED_MEM_EN (1<<22)
 #define   MI_FLUSH_DW_STORE_INDEX  (1<<21)
 #define   MI_INVALIDATE_TLB(1<<18)
 #define   MI_FLUSH_DW_OP_STOREDW   (1<<14)
@@ -272,6 +277,19 @@
 #define   MI_MATH_REG_ZF   0x32
 #define   MI_MATH_REG_CF   0x33
 
+/*
+ * Media instructions used by the kernel
+ */
+#define MEDIA_INSTR(pipe, op, sub_op, flags) \
+   (__INSTR(INSTR_RC_CLIENT) | (pipe) << INSTR_SUBCLIENT_SHIFT | \
+   (op) << INSTR_26_TO_24_SHIFT | (sub_op) << 16 | (flags))
+
+#define MFX_WAIT   MEDIA_INSTR(1, 0, 0, 0)
+#define  MFX_WAIT_DW0_MFX_SYNC_CONTROL_FLAGREG_BIT(8)
+#define  MFX_WAIT_DW0_PXP_SYNC_CONTROL_FLAGREG_BIT(9)
+
+#define CRYPTO_KEY_EXCHANGEMEDIA_INSTR(2, 6, 9, 0)
+
 /*
  * Commands used only by the command parser
  */
@@ -328,8 +346,6 @@
 #define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \
((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16))
 
-#define MFX_WAIT  ((0x3<<29)|(0x1<<27)|(0x0<<16))
-
 #define COLOR_BLT ((0x2<<29)|(0x40<<22))
 #define SRC_COPY_BLT  ((0x2<<29)|(0x43<<22))
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index e1370f323126..26176d43a02d 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -98,9 +98,14 @@ void intel_pxp_fini(struct intel_pxp *pxp)
 
 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
+   int ret;
+
kcr_pxp_enable(pxp_to_gt(pxp));
 
-   intel_pxp_create_arb_session(pxp);
+   /* always emit a full termination to clean the state */
+   ret = intel_pxp_terminate_arb_session_and_global(pxp);
+   if (!ret)
+   intel_pxp_create_arb_session(pxp);
 }
 
 void intel_pxp_fini_hw(struct intel_pxp *pxp)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
new file mode 100644
index ..80678dafde15
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020, Intel Corporation. All rights reserved.
+ */
+
+#in

[PATCH v9 06/17] drm/i915/pxp: set KCR reg init

2021-09-10 Thread Daniele Ceraolo Spurio
The setting is required by hardware to allow us to perform further protection
operations such as sending commands to the GPU or TEE. The register needs to
be re-programmed on resume, so for simplicity we bundle the programming
with the component binding, which is automatically called on resume.

Further HW set-up operations will be added in the same location in
follow-up patches, so get ready for them by using a couple of
init/fini_hw wrappers instead of calling the KCR funcs directly.

v3: move programming to component binding function, rework commit msg

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/pxp/intel_pxp.c | 27 
 drivers/gpu/drm/i915/pxp/intel_pxp.h |  3 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c |  5 +
 3 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 400deaea2d8a..66a98feb33ab 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -7,6 +7,24 @@
 #include "gt/intel_context.h"
 #include "i915_drv.h"
 
+/* KCR register definitions */
+#define KCR_INIT _MMIO(0x320f0)
+
+/* Setting KCR Init bit is required after system boot */
+#define KCR_INIT_ALLOW_DISPLAY_ME_WRITES REG_BIT(14)
+
+static void kcr_pxp_enable(struct intel_gt *gt)
+{
+   intel_uncore_write(gt->uncore, KCR_INIT,
+  
_MASKED_BIT_ENABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES));
+}
+
+static void kcr_pxp_disable(struct intel_gt *gt)
+{
+   intel_uncore_write(gt->uncore, KCR_INIT,
+  
_MASKED_BIT_DISABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES));
+}
+
 static int create_vcs_context(struct intel_pxp *pxp)
 {
static struct lock_class_key pxp_lock;
@@ -71,5 +89,14 @@ void intel_pxp_fini(struct intel_pxp *pxp)
intel_pxp_tee_component_fini(pxp);
 
destroy_vcs_context(pxp);
+}
+
+void intel_pxp_init_hw(struct intel_pxp *pxp)
+{
+   kcr_pxp_enable(pxp_to_gt(pxp));
+}
 
+void intel_pxp_fini_hw(struct intel_pxp *pxp)
+{
+   kcr_pxp_disable(pxp_to_gt(pxp));
 }
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index e87550fb9821..5427c3b28aa9 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -22,6 +22,9 @@ static inline bool intel_pxp_is_enabled(const struct 
intel_pxp *pxp)
 #ifdef CONFIG_DRM_I915_PXP
 void intel_pxp_init(struct intel_pxp *pxp);
 void intel_pxp_fini(struct intel_pxp *pxp);
+
+void intel_pxp_init_hw(struct intel_pxp *pxp);
+void intel_pxp_fini_hw(struct intel_pxp *pxp);
 #else
 static inline void intel_pxp_init(struct intel_pxp *pxp)
 {
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
index f1d8de832653..0c0c7946e6a0 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
@@ -33,6 +33,9 @@ static int i915_pxp_tee_component_bind(struct device 
*i915_kdev,
pxp->pxp_component = data;
pxp->pxp_component->tee_dev = tee_kdev;
 
+   /* the component is required to fully start the PXP HW */
+   intel_pxp_init_hw(pxp);
+
return 0;
 }
 
@@ -41,6 +44,8 @@ static void i915_pxp_tee_component_unbind(struct device 
*i915_kdev,
 {
struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
 
+   intel_pxp_fini_hw(pxp);
+
pxp->pxp_component = NULL;
 }
 
-- 
2.25.1



[PATCH v9 07/17] drm/i915/pxp: Create the arbitrary session after boot

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

Create the arbitrary session, with the fixed session id 0xf, after
system boot, for the case where an application allocates a protected
buffer without establishing any protection session, because the
hardware requires at least one alive session for protected buffer
creation. This arbitrary session will need to be re-created after a
teardown or power event, because the hardware encryption key won't be
valid after such events.

The session ID is exposed as part of the uapi so it can be used as part
of userspace commands.
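
Purely as an illustration (not part of this patch), a minimal sketch of how a
command stream could reference that default session, using the MI_SET_APPID
defines added elsewhere in this series; where exactly this gets emitted
(kernel vs. the userspace media driver) is outside the scope of the sketch:

/* Illustrative only: tag subsequent work with the default (arb) session. */
static u32 *emit_default_pxp_session(u32 *cs)
{
	*cs++ = MI_SET_APPID |
		MI_SET_APPID_SESSION_ID(I915_PROTECTED_CONTENT_DEFAULT_SESSION);
	return cs;
}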

v2: use gt->uncore->rpm (Chris)
v3: s/arb_is_in_play/arb_is_valid (Chris), move set-up to the new
init_hw function
v4: move interface defs to separate header, set arb_is valid to false
on fini (Rodrigo)
v5: handle async component binding

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Cc: Rodrigo Vivi 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile |  1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c  |  7 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.h  |  5 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c  | 74 
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h  | 15 
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c  | 87 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h  |  3 +
 .../drm/i915/pxp/intel_pxp_tee_interface.h| 37 
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 10 +++
 include/uapi/drm/i915_drm.h   |  3 +
 10 files changed, 242 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_session.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_session.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee_interface.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index d39bd0cefc64..405e04f4dd59 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -284,6 +284,7 @@ i915-y += i915_perf.o
 # Protected execution platform (PXP) support
 i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
+   pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
 
 # Post-mortem debug and GPU hang state capture
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 66a98feb33ab..e1370f323126 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2020 Intel Corporation.
  */
 #include "intel_pxp.h"
+#include "intel_pxp_session.h"
 #include "intel_pxp_tee.h"
 #include "gt/intel_context.h"
 #include "i915_drv.h"
@@ -65,6 +66,8 @@ void intel_pxp_init(struct intel_pxp *pxp)
if (!HAS_PXP(gt->i915))
return;
 
+   mutex_init(&pxp->tee_mutex);
+
ret = create_vcs_context(pxp);
if (ret)
return;
@@ -86,6 +89,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
if (!intel_pxp_is_enabled(pxp))
return;
 
+   pxp->arb_is_valid = false;
+
intel_pxp_tee_component_fini(pxp);
 
destroy_vcs_context(pxp);
@@ -94,6 +99,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
kcr_pxp_enable(pxp_to_gt(pxp));
+
+   intel_pxp_create_arb_session(pxp);
 }
 
 void intel_pxp_fini_hw(struct intel_pxp *pxp)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index 5427c3b28aa9..8eeb65af78b1 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -19,6 +19,11 @@ static inline bool intel_pxp_is_enabled(const struct 
intel_pxp *pxp)
return pxp->ce;
 }
 
+static inline bool intel_pxp_is_active(const struct intel_pxp *pxp)
+{
+   return pxp->arb_is_valid;
+}
+
 #ifdef CONFIG_DRM_I915_PXP
 void intel_pxp_init(struct intel_pxp *pxp);
 void intel_pxp_fini(struct intel_pxp *pxp);
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
new file mode 100644
index ..3331868f354c
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020, Intel Corporation. All rights reserved.
+ */
+
+#include "drm/i915_drm.h"
+#include "i915_drv.h"
+
+#include "intel_pxp.h"
+#include "intel_pxp_session.h"
+#include "intel_pxp_tee.h"
+#include "intel_pxp_types.h"
+
+#define ARB_SESSION I915_PROTECTED_CONTENT_DEFAULT_SESSION /* shorter define */
+
+#define GEN12_KCR_SIP _MMIO(0x32260) /* KCR hwdrm session in play 0-31 */
+
+static bool intel_pxp_session_is_in_play(struct intel_pxp *pxp, u32 id)
+{
+   struct intel_gt *gt = pxp_to_gt(pxp);
+   intel_wakeref_t wakeref;
+   u32 sip = 0;
+
+   with_intel_runtime_pm(gt->uncore->rpm, wakeref)
+   sip = intel_uncore_read(gt->uncore, GEN12_KCR_SIP);
+
+   return sip & BIT(id);
+}
+
+static int pxp_wait_for_session_state(struct intel_pxp *pxp, u32 id, bool 
in_

[PATCH v9 05/17] drm/i915/pxp: Implement funcs to create the TEE channel

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

Implement the funcs to create the TEE channel, so the kernel can
send TEE commands directly to the TEE for creating the arbitrary
(default) session.

v2: fix locking, don't pollute dev_priv (Chris)

v3: wait for mei PXP component to be bound.

v4: drop the wait, as the component might be bound after i915 load
completes. We'll instead check when sending a tee message.

v5: fix an issue with mei_pxp module removal

v6: don't use fetch_and_zero in fini (Rodrigo)

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile  |  3 +-
 drivers/gpu/drm/i915/pxp/intel_pxp.c   | 13 
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c   | 79 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h   | 14 
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h |  6 ++
 5 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 23f5bc268962..d39bd0cefc64 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -283,7 +283,8 @@ i915-y += i915_perf.o
 
 # Protected execution platform (PXP) support
 i915-$(CONFIG_DRM_I915_PXP) += \
-   pxp/intel_pxp.o
+   pxp/intel_pxp.o \
+   pxp/intel_pxp_tee.o
 
 # Post-mortem debug and GPU hang state capture
 i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 7b2053902146..400deaea2d8a 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2020 Intel Corporation.
  */
 #include "intel_pxp.h"
+#include "intel_pxp_tee.h"
 #include "gt/intel_context.h"
 #include "i915_drv.h"
 
@@ -50,7 +51,16 @@ void intel_pxp_init(struct intel_pxp *pxp)
if (ret)
return;
 
+   ret = intel_pxp_tee_component_init(pxp);
+   if (ret)
+   goto out_context;
+
	drm_info(&gt->i915->drm, "Protected Xe Path (PXP) protected content 
support initialized\n");
+
+   return;
+
+out_context:
+   destroy_vcs_context(pxp);
 }
 
 void intel_pxp_fini(struct intel_pxp *pxp)
@@ -58,5 +68,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
if (!intel_pxp_is_enabled(pxp))
return;
 
+   intel_pxp_tee_component_fini(pxp);
+
destroy_vcs_context(pxp);
+
 }
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
new file mode 100644
index ..f1d8de832653
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020 Intel Corporation.
+ */
+
+#include 
+#include "drm/i915_pxp_tee_interface.h"
+#include "drm/i915_component.h"
+#include "i915_drv.h"
+#include "intel_pxp.h"
+#include "intel_pxp_tee.h"
+
+static inline struct intel_pxp *i915_dev_to_pxp(struct device *i915_kdev)
+{
+   return &kdev_to_i915(i915_kdev)->gt.pxp;
+}
+
+/**
+ * i915_pxp_tee_component_bind - bind function to pass the function pointers 
to pxp_tee
+ * @i915_kdev: pointer to i915 kernel device
+ * @tee_kdev: pointer to tee kernel device
+ * @data: pointer to pxp_tee_master containing the function pointers
+ *
+ * This bind function is called during the system boot or resume from system 
sleep.
+ *
+ * Return: return 0 if successful.
+ */
+static int i915_pxp_tee_component_bind(struct device *i915_kdev,
+  struct device *tee_kdev, void *data)
+{
+   struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
+
+   pxp->pxp_component = data;
+   pxp->pxp_component->tee_dev = tee_kdev;
+
+   return 0;
+}
+
+static void i915_pxp_tee_component_unbind(struct device *i915_kdev,
+ struct device *tee_kdev, void *data)
+{
+   struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
+
+   pxp->pxp_component = NULL;
+}
+
+static const struct component_ops i915_pxp_tee_component_ops = {
+   .bind   = i915_pxp_tee_component_bind,
+   .unbind = i915_pxp_tee_component_unbind,
+};
+
+int intel_pxp_tee_component_init(struct intel_pxp *pxp)
+{
+   int ret;
+   struct intel_gt *gt = pxp_to_gt(pxp);
+   struct drm_i915_private *i915 = gt->i915;
+
+   ret = component_add_typed(i915->drm.dev, &i915_pxp_tee_component_ops,
+ I915_COMPONENT_PXP);
+   if (ret < 0) {
+   drm_err(&i915->drm, "Failed to add PXP component (%d)\n", ret);
+   return ret;
+   }
+
+   pxp->pxp_component_added = true;
+
+   return 0;
+}
+
+void intel_pxp_tee_component_fini(struct intel_pxp *pxp)
+{
+   struct drm_i915_private *i915 = pxp_to_gt(pxp)->i915;
+
+   if (!pxp->pxp_component_added)
+   return;
+
+   

[PATCH v9 04/17] drm/i915/pxp: allocate a vcs context for pxp usage

2021-09-10 Thread Daniele Ceraolo Spurio
The context is required to send the session termination commands to the
VCS, which will be implemented in a follow-up patch. We can also use the
presence of the context as a check of pxp initialization completion.

v2: use perma-pinned context (Chris)
v3: rename pinned_context functions (Chris)
v4: split export of pinned_context functions to a separate patch (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile  |  4 ++
 drivers/gpu/drm/i915/gt/intel_engine.h |  2 +
 drivers/gpu/drm/i915/gt/intel_gt.c |  5 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h   |  3 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.c   | 62 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.h   | 35 
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 15 ++
 7 files changed, 126 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_types.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index c36c8a4f0716..23f5bc268962 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -281,6 +281,10 @@ i915-y += \
 
 i915-y += i915_perf.o
 
+# Protected execution platform (PXP) support
+i915-$(CONFIG_DRM_I915_PXP) += \
+   pxp/intel_pxp.o
+
 # Post-mortem debug and GPU hang state capture
 i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
 i915-$(CONFIG_DRM_I915_SELFTEST) += \
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 87579affb952..eed4634c08cd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -175,6 +175,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int 
reg, u32 value)
 #define I915_GEM_HWS_SEQNO 0x40
 #define I915_GEM_HWS_SEQNO_ADDR(I915_GEM_HWS_SEQNO * 
sizeof(u32))
 #define I915_GEM_HWS_MIGRATE   (0x42 * sizeof(u32))
+#define I915_GEM_HWS_PXP   0x60
+#define I915_GEM_HWS_PXP_ADDR  (I915_GEM_HWS_PXP * sizeof(u32))
 #define I915_GEM_HWS_SCRATCH   0x80
 
 #define I915_HWS_CSB_BUF0_INDEX0x10
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 2aeaae036a6f..da30919b7e99 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -21,6 +21,7 @@
 #include "intel_uncore.h"
 #include "intel_pm.h"
 #include "shmem_utils.h"
+#include "pxp/intel_pxp.h"
 
 void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
 {
@@ -712,6 +713,8 @@ int intel_gt_init(struct intel_gt *gt)
 
	intel_migrate_init(&gt->migrate, gt);
 
+   intel_pxp_init(&gt->pxp);
+
goto out_fw;
 err_gt:
__intel_gt_disable(gt);
@@ -747,6 +750,8 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 
	intel_rps_driver_unregister(&gt->rps);
 
+   intel_pxp_fini(&gt->pxp);
+
/*
 * Upon unregistering the device to prevent any new users, cancel
 * all in-flight requests so that we can quickly unbind the active
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 6fdcde64c180..8001a61f42e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -26,6 +26,7 @@
 #include "intel_rps_types.h"
 #include "intel_migrate_types.h"
 #include "intel_wakeref.h"
+#include "pxp/intel_pxp_types.h"
 
 struct drm_i915_private;
 struct i915_ggtt;
@@ -196,6 +197,8 @@ struct intel_gt {
struct {
u8 uc_index;
} mocs;
+
+   struct intel_pxp pxp;
 };
 
 enum intel_gt_scratch_field {
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
new file mode 100644
index ..7b2053902146
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020 Intel Corporation.
+ */
+#include "intel_pxp.h"
+#include "gt/intel_context.h"
+#include "i915_drv.h"
+
+static int create_vcs_context(struct intel_pxp *pxp)
+{
+   static struct lock_class_key pxp_lock;
+   struct intel_gt *gt = pxp_to_gt(pxp);
+   struct intel_engine_cs *engine;
+   struct intel_context *ce;
+
+   /*
+* Find the first VCS engine present. We're guaranteed there is one
+* if we're in this function due to the check in has_pxp
+*/
+   for (engine = gt->engine_class[VIDEO_DECODE_CLASS][0]; !engine; 
engine++);
+   GEM_BUG_ON(!engine || engine->class != VIDEO_DECODE_CLASS);
+
+   ce = intel_engine_create_pinned_context(engine, engine->gt->vm, SZ_4K,
+   I915_GEM_HWS_PXP_ADDR,
+   &pxp_lock, "pxp_context");
+   if (IS_ERR(ce)) {

[PATCH v9 03/17] drm/i915/pxp: define PXP device flag and kconfig

2021-09-10 Thread Daniele Ceraolo Spurio
Ahead of the PXP implementation, define the relevant define flag and
kconfig option.

v2: flip kconfig default to N. Some machines have IFWIs that do not
support PXP, so we need it to be an opt-in until we add support to query
the caps from the mei device.

Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Kconfig | 11 +++
 drivers/gpu/drm/i915/i915_drv.h  |  3 +++
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 3 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index f960f5d7664e..5987c3d5d9fb 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -131,6 +131,17 @@ config DRM_I915_GVT_KVMGT
  Choose this option if you want to enable KVMGT support for
  Intel GVT-g.
 
+config DRM_I915_PXP
+   bool "Enable Intel PXP support for Intel Gen12+ platform"
+   depends on DRM_I915
+   depends on INTEL_MEI && INTEL_MEI_PXP
+   default n
+   help
+ PXP (Protected Xe Path) is an i915 component, available on GEN12+
+ GPUs, that helps to establish the hardware protected session and
+ manage the status of the alive software session, as well as its life
+ cycle.
+
 menu "drm/i915 Debugging"
 depends on DRM_I915
 depends on EXPERT
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 37c1ca266bcd..447a248f14aa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1678,6 +1678,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 
 #define HAS_GLOBAL_MOCS_REGISTERS(dev_priv)
(INTEL_INFO(dev_priv)->has_global_mocs)
 
+#define HAS_PXP(dev_priv) (IS_ENABLED(CONFIG_DRM_I915_PXP) && \
+  INTEL_INFO(dev_priv)->has_pxp) && \
+  VDBOX_MASK(&dev_priv->gt)
 
 #define HAS_GMCH(dev_priv) (INTEL_INFO(dev_priv)->display.has_gmch)
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index d328bb95c49b..8e6f48d1eb7b 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -133,6 +133,7 @@ enum intel_ppgtt_type {
func(has_logical_ring_elsq); \
func(has_mslices); \
func(has_pooled_eu); \
+   func(has_pxp); \
func(has_rc6); \
func(has_rc6p); \
func(has_rps); \
-- 
2.25.1



[PATCH v9 02/17] mei: pxp: export pavp client to me client bus

2021-09-10 Thread Daniele Ceraolo Spurio
From: Vitaly Lubart 

Export the PAVP client to work with the i915 driver;
for binding it uses the kernel component framework.

v2:drop debug prints, refactor match code to match mei_hdcp (Tomas)

Signed-off-by: Vitaly Lubart 
Signed-off-by: Tomas Winkler 
Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/misc/mei/Kconfig   |   2 +
 drivers/misc/mei/Makefile  |   1 +
 drivers/misc/mei/pxp/Kconfig   |  13 ++
 drivers/misc/mei/pxp/Makefile  |   7 +
 drivers/misc/mei/pxp/mei_pxp.c | 229 +
 drivers/misc/mei/pxp/mei_pxp.h |  18 +++
 6 files changed, 270 insertions(+)
 create mode 100644 drivers/misc/mei/pxp/Kconfig
 create mode 100644 drivers/misc/mei/pxp/Makefile
 create mode 100644 drivers/misc/mei/pxp/mei_pxp.c
 create mode 100644 drivers/misc/mei/pxp/mei_pxp.h

diff --git a/drivers/misc/mei/Kconfig b/drivers/misc/mei/Kconfig
index f5fd5b786607..0e0bcd0da852 100644
--- a/drivers/misc/mei/Kconfig
+++ b/drivers/misc/mei/Kconfig
@@ -47,3 +47,5 @@ config INTEL_MEI_TXE
  Intel Bay Trail
 
 source "drivers/misc/mei/hdcp/Kconfig"
+source "drivers/misc/mei/pxp/Kconfig"
+
diff --git a/drivers/misc/mei/Makefile b/drivers/misc/mei/Makefile
index f1c76f7ee804..d8e5165917f2 100644
--- a/drivers/misc/mei/Makefile
+++ b/drivers/misc/mei/Makefile
@@ -26,3 +26,4 @@ mei-$(CONFIG_EVENT_TRACING) += mei-trace.o
 CFLAGS_mei-trace.o = -I$(src)
 
 obj-$(CONFIG_INTEL_MEI_HDCP) += hdcp/
+obj-$(CONFIG_INTEL_MEI_PXP) += pxp/
diff --git a/drivers/misc/mei/pxp/Kconfig b/drivers/misc/mei/pxp/Kconfig
new file mode 100644
index ..4029b96afc04
--- /dev/null
+++ b/drivers/misc/mei/pxp/Kconfig
@@ -0,0 +1,13 @@
+
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+#
+config INTEL_MEI_PXP
+   tristate "Intel PXP services of ME Interface"
+   select INTEL_MEI_ME
+   depends on DRM_I915
+   help
+ MEI Support for PXP Services on Intel platforms.
+
+ Enables the ME FW services required for PXP support through
+ I915 display driver of Intel.
diff --git a/drivers/misc/mei/pxp/Makefile b/drivers/misc/mei/pxp/Makefile
new file mode 100644
index ..0329950d5794
--- /dev/null
+++ b/drivers/misc/mei/pxp/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+#
+# Makefile - PXP client driver for Intel MEI Bus Driver.
+
+obj-$(CONFIG_INTEL_MEI_PXP) += mei_pxp.o
diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
new file mode 100644
index ..f7380d387bab
--- /dev/null
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright © 2020 - 2021 Intel Corporation
+ */
+
+/**
+ * DOC: MEI_PXP Client Driver
+ *
+ * The mei_pxp driver acts as a translation layer between PXP
+ * protocol  implementer (I915) and ME FW by translating PXP
+ * negotiation messages to ME FW command payloads and vice versa.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mei_pxp.h"
+
+/**
+ * mei_pxp_send_message() - Sends a PXP message to ME FW.
+ * @dev: device corresponding to the mei_cl_device
+ * @message: a message buffer to send
+ * @size: size of the message
+ * Return: 0 on Success, <0 on Failure
+ */
+static int
+mei_pxp_send_message(struct device *dev, const void *message, size_t size)
+{
+   struct mei_cl_device *cldev;
+   ssize_t byte;
+
+   if (!dev || !message)
+   return -EINVAL;
+
+   cldev = to_mei_cl_device(dev);
+
+   /* temporary drop const qualifier till the API is fixed */
+   byte = mei_cldev_send(cldev, (u8 *)message, size);
+   if (byte < 0) {
+   dev_dbg(dev, "mei_cldev_send failed. %zd\n", byte);
+   return byte;
+   }
+
+   return 0;
+}
+
+/**
+ * mei_pxp_receive_message() - Receives a PXP message from ME FW.
+ * @dev: device corresponding to the mei_cl_device
+ * @buffer: a message buffer to contain the received message
+ * @size: size of the buffer
+ * Return: bytes sent on Success, <0 on Failure
+ */
+static int
+mei_pxp_receive_message(struct device *dev, void *buffer, size_t size)
+{
+   struct mei_cl_device *cldev;
+   ssize_t byte;
+
+   if (!dev || !buffer)
+   return -EINVAL;
+
+   cldev = to_mei_cl_device(dev);
+
+   byte = mei_cldev_recv(cldev, buffer, size);
+   if (byte < 0) {
+   dev_dbg(dev, "mei_cldev_recv failed. %zd\n", byte);
+   return byte;
+   }
+
+   return byte;
+}
+
+static const struct i915_pxp_component_ops mei_pxp_ops = {
+   .owner = THIS_MODULE,
+   .send = mei_pxp_send_message,
+   .recv = mei_pxp_receive_message,
+};
+
+static int mei_component_master_bind(struct device *dev)
+{
+   struct mei_cl_device *cldev = to_mei_cl_device(dev);
+   struct i915_pxp_component *comp_master = mei_

[PATCH v9 01/17] drm/i915/pxp: Define PXP component interface

2021-09-10 Thread Daniele Ceraolo Spurio
This will be used for communication between the i915 driver and the mei
one. Defining it in a stand-alone patch to avoid circular dependencies
between the patches modifying the 2 drivers.

Split out from an original patch from  Huang, Sean Z

v2: rename the component struct (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Cc: Rodrigo Vivi 
Reviewed-by: Rodrigo Vivi 
---
 include/drm/i915_component.h |  1 +
 include/drm/i915_pxp_tee_interface.h | 42 
 2 files changed, 43 insertions(+)
 create mode 100644 include/drm/i915_pxp_tee_interface.h

diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h
index 55c3b123581b..c1e2a43d2d1e 100644
--- a/include/drm/i915_component.h
+++ b/include/drm/i915_component.h
@@ -29,6 +29,7 @@
 enum i915_component_type {
I915_COMPONENT_AUDIO = 1,
I915_COMPONENT_HDCP,
+   I915_COMPONENT_PXP
 };
 
 /* MAX_PORT is the number of port
diff --git a/include/drm/i915_pxp_tee_interface.h 
b/include/drm/i915_pxp_tee_interface.h
new file mode 100644
index ..af593ec64469
--- /dev/null
+++ b/include/drm/i915_pxp_tee_interface.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef _I915_PXP_TEE_INTERFACE_H_
+#define _I915_PXP_TEE_INTERFACE_H_
+
+#include 
+#include 
+
+/**
+ * struct i915_pxp_component_ops - ops for PXP services.
+ * @owner: Module providing the ops
+ * @send: sends data to PXP
+ * @receive: receives data from PXP
+ */
+struct i915_pxp_component_ops {
+   /**
+* @owner: owner of the module provding the ops
+*/
+   struct module *owner;
+
+   int (*send)(struct device *dev, const void *message, size_t size);
+   int (*recv)(struct device *dev, void *buffer, size_t size);
+};
+
+/**
+ * struct i915_pxp_component - Used for communication between i915 and TEE
+ * drivers for the PXP services
+ * @tee_dev: device that provide the PXP service from TEE Bus.
+ * @pxp_ops: Ops implemented by TEE driver, used by i915 driver.
+ */
+struct i915_pxp_component {
+   struct device *tee_dev;
+   const struct i915_pxp_component_ops *ops;
+
+   /* To protect the above members. */
+   struct mutex mutex;
+};
+
+#endif /* _I915_TEE_PXP_INTERFACE_H_ */
-- 
2.25.1



[PATCH v9 00/17] drm/i915: Introduce Intel PXP

2021-09-10 Thread Daniele Ceraolo Spurio
PXP (Protected Xe Path) is an i915 component, available on
GEN12 and newer platforms, that helps to establish the hardware
protected session and manage the status of the alive software session,
as well as its life cycle.

changes from v8:
- comments/docs improvements
- remove rpm put race (pxp_inval vs context_close)
- don't call pxp_invalidate on rpm suspend because it's redundant

Tested with: https://patchwork.freedesktop.org/series/87570/

Cc: Gaurav Kumar 
Cc: Chris Wilson 
Cc: Rodrigo Vivi 
Cc: Joonas Lahtinen 
Cc: Juston Li 
Cc: Alan Previn 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 

Anshuman Gupta (2):
  drm/i915/pxp: Add plane decryption support
  drm/i915/pxp: black pixels on pxp disabled

Daniele Ceraolo Spurio (9):
  drm/i915/pxp: Define PXP component interface
  drm/i915/pxp: define PXP device flag and kconfig
  drm/i915/pxp: allocate a vcs context for pxp usage
  drm/i915/pxp: set KCR reg init
  drm/i915/pxp: interfaces for using protected objects
  drm/i915/pxp: start the arb session on demand
  drm/i915/pxp: add pxp debugfs
  drm/i915/pxp: add PXP documentation
  drm/i915/pxp: enable PXP for integrated Gen12

Huang, Sean Z (5):
  drm/i915/pxp: Implement funcs to create the TEE channel
  drm/i915/pxp: Create the arbitrary session after boot
  drm/i915/pxp: Implement arb session teardown
  drm/i915/pxp: Implement PXP irq handler
  drm/i915/pxp: Enable PXP power management

Vitaly Lubart (1):
  mei: pxp: export pavp client to me client bus

 Documentation/gpu/i915.rst|   8 +
 drivers/gpu/drm/i915/Kconfig  |  11 +
 drivers/gpu/drm/i915/Makefile |  10 +
 drivers/gpu/drm/i915/display/intel_display.c  |  34 +++
 .../drm/i915/display/intel_display_types.h|   6 +
 .../drm/i915/display/skl_universal_plane.c|  49 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 100 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   6 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  28 ++
 drivers/gpu/drm/i915/gem/i915_gem_create.c|  72 +++--
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  18 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   6 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   8 +
 .../gpu/drm/i915/gem/selftests/mock_context.c |   4 +-
 drivers/gpu/drm/i915/gt/debugfs_gt.c  |   2 +
 drivers/gpu/drm/i915/gt/intel_engine.h|   2 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  22 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|   5 +
 drivers/gpu/drm/i915/gt/intel_gt_irq.c|   7 +
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |  15 +-
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |   3 +
 drivers/gpu/drm/i915/i915_drv.c   |   2 +
 drivers/gpu/drm/i915/i915_drv.h   |   3 +
 drivers/gpu/drm/i915/i915_pci.c   |   2 +
 drivers/gpu/drm/i915/i915_reg.h   |  48 +++
 drivers/gpu/drm/i915/intel_device_info.h  |   1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c  | 288 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.h  |  67 
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c  | 141 +
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h  |  15 +
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c  |  78 +
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h  |  21 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c  | 100 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h  |  32 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.c   |  46 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.h   |  23 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c  | 175 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h  |  15 +
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c  | 172 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h  |  17 ++
 .../drm/i915/pxp/intel_pxp_tee_interface.h|  37 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  83 +
 drivers/misc/mei/Kconfig  |   2 +
 drivers/misc/mei/Makefile |   1 +
 drivers/misc/mei/pxp/Kconfig  |  13 +
 drivers/misc/mei/pxp/Makefile |   7 +
 drivers/misc/mei/pxp/mei_pxp.c| 229 ++
 drivers/misc/mei/pxp/mei_pxp.h|  18 ++
 include/drm/i915_component.h  |   1 +
 include/drm/i915_pxp_tee_interface.h  |  42 +++
 include/uapi/drm/i915_drm.h   |  99 +-
 52 files changed, 2153 insertions(+), 42 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
 create mode 100644 drivers/gpu/drm/i915/pxp/int

Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Thomas Hellström
On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:
> 
> 
> Am 10.09.21 um 15:15 schrieb Thomas Hellström:
> > Both the provider (resource manager) and the consumer (the TTM
> > driver)
> > want to subclass struct ttm_resource. Since this is left for the
> > resource
> > manager, we need to provide a private pointer for the TTM driver.
> > 
> > Provide a struct ttm_resource_private for the driver to subclass
> > for
> > data with the same lifetime as the struct ttm_resource: In the i915
> > case
> > it will, for example, be an sg-table and radix tree into the LMEM
> > /VRAM pages that currently are awkwardly attached to the GEM
> > object.
> > 
> > Provide an ops structure for associated ops (Which is only
> > destroy() ATM)
> > It might seem pointless to provide a separate ops structure, but
> > Linus
> > has previously made it clear that that's the norm.
> > 
> > After careful audit one could perhaps also on a per-driver basis
> > replace the delete_mem_notify() TTM driver callback with the above
> > destroy function.
> 
> Well this is a really big NAK to this approach.
> 
> If you need to attach some additional information to the resource
> then 
> implement your own resource manager like everybody else does.

Well this was the long discussion we had back then when the resource
managers started to derive from struct resource and I was under the
impression that we had come to an agreement about the different use-
cases here, and this was my main concern.

I mean, it's a pretty big layer violation to do that for this use-case.
The TTM resource manager doesn't want to know about this data at all,
it's private to the ttm resource user layer and the resource manager
works perfectly well without it. (I assume the other drivers that
implement their own resource managers need the data that the
subclassing provides?)

The fundamental problem here is that there are two layers wanting to
subclass struct ttm_resource. That means one layer gets to do that, the
second gets to use a private pointer, (which in turn can provide yet
another private pointer to a potential third layer). With your
suggestion, the second layer is instead forced to subclass each
instance it uses that the first layer provides?

Ofc we can do that, but it does indeed feel pretty awkward.
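
For concreteness, a rough sketch of what the driver-side use of the proposed
private pointer could look like (assuming struct ttm_resource_private embeds
its ops, as the ttm_resource_free() hunk below implies; the i915 payload here
is just an example):

struct i915_ttm_resource_private {
	struct ttm_resource_private base;
	struct sg_table *st;	/* e.g. the LMEM/VRAM sg-table mentioned above */
};

static void i915_ttm_resource_private_destroy(struct ttm_resource_private *priv)
{
	struct i915_ttm_resource_private *p =
		container_of(priv, struct i915_ttm_resource_private, base);

	sg_free_table(p->st);
	kfree(p->st);
	kfree(p);
}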

In any case, if you still think that's the approach we should go for,
I'd need to add init() and fini() members to the ttm_range_manager_func
struct to allow subclassing without having to unnecessarily copy the
full code? 

Thanks,
Thomas










> 
> Regards,
> Christian.
> 
> > 
> > Cc: Matthew Auld 
> > Cc: König Christian 
> > Signed-off-by: Thomas Hellström 
> > ---
> >   drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
> >   include/drm/ttm/ttm_resource.h | 28
> > 
> >   2 files changed, 35 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_resource.c
> > b/drivers/gpu/drm/ttm/ttm_resource.c
> > index 2431717376e7..973e7c50bfed 100644
> > --- a/drivers/gpu/drm/ttm/ttm_resource.c
> > +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> > @@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object
> > *bo,
> >   void ttm_resource_free(struct ttm_buffer_object *bo, struct
> > ttm_resource **res)
> >   {
> > struct ttm_resource_manager *man;
> > +   struct ttm_resource *resource = *res;
> >   
> > -   if (!*res)
> > +   if (!resource)
> > return;
> >   
> > -   man = ttm_manager_type(bo->bdev, (*res)->mem_type);
> > -   man->func->free(man, *res);
> > *res = NULL;
> > +   if (resource->priv)
> > +   resource->priv->ops.destroy(resource->priv);
> > +
> > +   man = ttm_manager_type(bo->bdev, resource->mem_type);
> > +   man->func->free(man, resource);
> >   }
> >   EXPORT_SYMBOL(ttm_resource_free);
> >   
> > diff --git a/include/drm/ttm/ttm_resource.h
> > b/include/drm/ttm/ttm_resource.h
> > index 140b6b9a8bbe..5a22c9a29c05 100644
> > --- a/include/drm/ttm/ttm_resource.h
> > +++ b/include/drm/ttm/ttm_resource.h
> > @@ -44,6 +44,7 @@ struct dma_buf_map;
> >   struct io_mapping;
> >   struct sg_table;
> >   struct scatterlist;
> > +struct ttm_resource_private;
> >   
> >   struct ttm_resource_manager_func {
> > /**
> > @@ -153,6 +154,32 @@ struct ttm_bus_placement {
> > enum ttm_cachingcaching;
> >   };
> >   
> > +/**
> > + * struct ttm_resource_private_ops - Operations for a struct
> > + * ttm_resource_private
> > + *
> > + * Not much benefit to keep this as a separate struct with only a
> > single member,
> > + * but keeping a separate ops struct is the norm.
> > + */
> > +struct ttm_resource_private_ops {
> > +   /**
> > +    * destroy() - Callback to destroy the private data
> > +    * @priv - The private data to destroy
> > +    */
> > +   void (*destroy) (struct ttm_resource_private *priv);
> > +};
> > +
> > +/**
> > + * struct ttm_resource_private - TTM drive

Re: [PATCH v2 1/3] video: fbdev: asiliantfb: Error out if 'pixclock' equals zero

2021-09-10 Thread Geert Uytterhoeven
Hi Zheyu,

On Mon, Jul 26, 2021 at 12:04 PM Zheyu Ma  wrote:
> The userspace program could pass any values to the driver through
> ioctl() interface. If the driver doesn't check the value of 'pixclock',
> it may cause divide error.
>
> Fix this by checking whether 'pixclock' is zero first.
>
> The following log reveals it:
>
> [   43.861711] divide error:  [#1] PREEMPT SMP KASAN PTI
> [   43.861737] CPU: 2 PID: 11764 Comm: i740 Not tainted 
> 5.14.0-rc2-00513-gac532c9bbcfb-dirty #224
> [   43.861756] RIP: 0010:asiliantfb_check_var+0x4e/0x730
> [   43.861843] Call Trace:
> [   43.861848]  ? asiliantfb_remove+0x190/0x190
> [   43.861858]  fb_set_var+0x2e4/0xeb0
> [   43.861866]  ? fb_blank+0x1a0/0x1a0
> [   43.861873]  ? lock_acquire+0x1ef/0x530
> [   43.861884]  ? lock_release+0x810/0x810
> [   43.861892]  ? lock_is_held_type+0x100/0x140
> [   43.861903]  ? ___might_sleep+0x1ee/0x2d0
> [   43.861914]  ? __mutex_lock+0x620/0x1190
> [   43.861921]  ? do_fb_ioctl+0x313/0x700
> [   43.861929]  ? mutex_lock_io_nested+0xfa0/0xfa0
> [   43.861936]  ? __this_cpu_preempt_check+0x1d/0x30
> [   43.861944]  ? _raw_spin_unlock_irqrestore+0x46/0x60
> [   43.861952]  ? lockdep_hardirqs_on+0x59/0x100
> [   43.861959]  ? _raw_spin_unlock_irqrestore+0x46/0x60
> [   43.861967]  ? trace_hardirqs_on+0x6a/0x1c0
> [   43.861978]  do_fb_ioctl+0x31e/0x700
>
> Signed-off-by: Zheyu Ma 

Thanks for your patch!

> ---
> Changes in v2:
> - Make commit log more descriptive
> ---
>  drivers/video/fbdev/asiliantfb.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/video/fbdev/asiliantfb.c 
> b/drivers/video/fbdev/asiliantfb.c
> index 3e006da47752..84c56f525889 100644
> --- a/drivers/video/fbdev/asiliantfb.c
> +++ b/drivers/video/fbdev/asiliantfb.c
> @@ -227,6 +227,9 @@ static int asiliantfb_check_var(struct fb_var_screeninfo 
> *var,
>  {
> unsigned long Ftarget, ratio, remainder;
>
> +   if (!var->pixclock)
> +   return -EINVAL;

While this fixes the crash, it is not correct: according to the
fbdev API, invalid values must be rounded up to a supported value,
if possible.  -EINVAL should only be returned if rounding up values
in fb_var_screeninfo cannot give a valid mode.
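
For example, a minimal sketch of the rounding approach in
asiliantfb_check_var() might look like this (ASILIANT_MIN_PIXCLOCK is a
made-up placeholder; the real lower bound would come from the chip's maximum
dot clock):

	if (!var->pixclock || var->pixclock < ASILIANT_MIN_PIXCLOCK)
		var->pixclock = ASILIANT_MIN_PIXCLOCK;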

The same comment applies to the other patches in this series:
[PATCH v2 2/3] video: fbdev: kyro: Error out if 'pixclock' equals zero
[PATCH v2 3/3] video: fbdev: riva: Error out if 'pixclock' equals zero

> +
> ratio = 100 / var->pixclock;
> remainder = 100 % var->pixclock;
> Ftarget = 100 * ratio + (100 * remainder) / var->pixclock;

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Matt Roper
On Fri, Sep 10, 2021 at 04:03:50PM +0100, Tvrtko Ursulin wrote:
> 
> On 10/09/2021 15:24, Matt Roper wrote:
> > On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 10/09/2021 06:33, Matt Roper wrote:
> > > > Our uncore MMIO functions for reading/writing registers have become very
> > > > complicated over time.  There's significant macro magic used to generate
> > > > several nearly-identical functions that only really differ in terms of
> > > > which platform-specific shadow register table they should check on write
> > > > operations.  We can significantly simplify our MMIO handlers by storing
> > > > a reference to the current platform's shadow table within the 'struct
> > > > intel_uncore' the same way we already do for forcewake; this allows us
> > > > to consolidate the multiple variants of each 'write' function down to
> > > > just a single 'fwtable' version that gets the shadow table out of the
> > > > uncore struct rather than hardcoding the name of a specific platform's
> > > > table.  We can do similar consolidation on the MMIO read side by
> > > > creating a single-entry forcewake table to replace the open-coded range
> > > > check they had been using previously.
> > > > 
> > > > The final patch of the series adds a new shadow table for DG2; this
> > > > becomes quite clean and simple now, given the refactoring in the first
> > > > five patches.
> > > 
> > > Tidy and it ends up saving kernel binary size.
> > > 
> > > However I am undecided yet, because one thing to note is that the trade 
> > > off
> > > is source code and kernel text consolidation at the expense of more 
> > > indirect
> > > calls at runtime and larger common read/write functions.
> > > 
> > > To expand, current code generates a bunch of per gen functions but in 
> > > doing
> > > so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and 
> > > BSEARCH
> > > (from find_fw_domain) so at runtime each platform mmio read/write does not
> > > have to do indirect calls to do lookups.
> > > 
> > > It may matter a lot in the grand scheme of things but this trade off is
> > > something to note in the cover letter I think.
> > 
> > That's true.  However it seems like if the extra indirect calls are good
> > enough for our forcewake lookups (which are called more frequently and
> > have to search through much larger tables) then using the same strategy
> > for shadow registers should be less of a concern.  Plus most of
> > timing-critical parts of the code don't call through this at all; they
> > just grab an explicit forcewake and then issue a bunch of *_fw()
> > operations that skip all the per-register forcewake and shadow handling.
> 
> With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have a
> good idea of how many of those followed by "_fw" accessors we have vs
> "un-optimized" access. But it's a good point.
> 
> I was mostly coming from the point of view of old platforms like gen6, where
> with this series reads go from inlined checks (NEEDS_FORCE_WAKE) to always
> calling find_fw_domain. Just because it is a bit unfortunate to burden old
> CPUs (they are not getting any faster) with executing more code. It's not
> nice when old hardware gets slower and slower with software updates. :) But
> whether or not this case would at all be measurable.. probably not. Unless
> some compounding effects, like "death by thousand cuts", would come into
> play.

Chris pointed out in an offline mail that NEEDS_FORCE_WAKE does cut
out a lot of display MMIO lookups.  So I think it might be worth adding
that back, but also adding an "|| GEN11_BSD_RING_BASE" so that it will
still be accurate for the newer platforms too.

But I think another thing to consider here would be that we might want
to switch our intel_de_{read,write} wrappers to call raw mmio directly,
to completely bypass forcewake and shadow logic.
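
Roughly, such wrappers could end up looking like this (sketch only, assuming
display registers need neither forcewake nor shadow handling):

static inline u32 intel_de_read(struct drm_i915_private *i915, i915_reg_t reg)
{
	return intel_uncore_read_fw(&i915->uncore, reg);
}

static inline void intel_de_write(struct drm_i915_private *i915,
				  i915_reg_t reg, u32 val)
{
	intel_uncore_write_fw(&i915->uncore, reg, val);
}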


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > But you're right that this is something I should mention more clearly in
> > the cover letter.
> > 
> > 
> > Matt
> > 
> > > 
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > Matt Roper (6):
> > > > drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
> > > > drm/i915/uncore: Associate shadow table with uncore
> > > > drm/i915/uncore: Replace gen8 write functions with general fwtable
> > > > drm/i915/uncore: Drop gen11/gen12 mmio write handlers
> > > > drm/i915/uncore: Drop gen11 mmio read handlers
> > > > drm/i915/dg2: Add DG2-specific shadow register table
> > > > 
> > > >drivers/gpu/drm/i915/intel_uncore.c   | 190 
> > > > ++
> > > >drivers/gpu/drm/i915/intel_uncore.h   |   7 +
> > > >drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
> > > >3 files changed, 110 insertions(+), 88 deletions(-)
> > > > 
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.

2021-09-10 Thread Peter Zijlstra
On Thu, Sep 09, 2021 at 11:32:18AM +0200, Maarten Lankhorst wrote:
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index d456579d0952..791c28005eef 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -736,6 +736,44 @@ __ww_mutex_lock(struct mutex *lock, unsigned int state, 
> unsigned int subclass,
>   return __mutex_lock_common(lock, state, subclass, NULL, ip, ww_ctx, 
> true);
>  }
>  
> +/**
> + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire 
> context
> + * @lock: mutex to lock
> + * @ctx: optional w/w acquire context
> + *
> + * Trylocks a mutex with the optional acquire context; no deadlock detection 
> is
> + * possible. Returns 1 if the mutex has been acquired successfully, 0 
> otherwise.
> + *
> + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a 
> @ctx is
> + * specified, -EALREADY and -EDEADLK handling may happen in calls to 
> ww_mutex_lock.
> + *
> + * A mutex acquired with this function must be released with ww_mutex_unlock.
> + */
> +int __sched
> +ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ctx)
> +{
> + bool locked;
> +
> + if (!ctx)
> + return mutex_trylock(&ww->base);
> +
> +#ifdef CONFIG_DEBUG_MUTEXES
> + DEBUG_LOCKS_WARN_ON(ww->base.magic != &ww->base);
> +#endif
> +
> + preempt_disable();
> + locked = __mutex_trylock(&ww->base);
> +
> + if (locked) {
> + ww_mutex_set_context_fastpath(ww, ctx);
> + mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ctx->dep_map, 
> _RET_IP_);
> + }
> + preempt_enable();
> +
> + return locked;
> +}
> +EXPORT_SYMBOL(ww_mutex_trylock);
> +
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  void __sched
>  mutex_lock_nested(struct mutex *lock, unsigned int subclass)

> diff --git a/kernel/locking/ww_rt_mutex.c b/kernel/locking/ww_rt_mutex.c
> index 3f1fff7d2780..c4cb863edb4c 100644
> --- a/kernel/locking/ww_rt_mutex.c
> +++ b/kernel/locking/ww_rt_mutex.c
> @@ -50,6 +50,18 @@ __ww_rt_mutex_lock(struct ww_mutex *lock, struct 
> ww_acquire_ctx *ww_ctx,
>   return ret;
>  }
>  
> +int __sched
> +ww_mutex_trylock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
> +{
> + int locked = rt_mutex_trylock(&lock->base);
> +
> + if (locked && ctx)
> + ww_mutex_set_context_fastpath(lock, ctx);
> +
> + return locked;
> +}
> +EXPORT_SYMBOL(ww_mutex_trylock);
> +
>  int __sched
>  ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>  {

That doesn't look right, how's this for you?

---
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -94,6 +94,9 @@ static inline unsigned long __owner_flag
return owner & MUTEX_FLAGS;
 }
 
+/*
+ * Returns: __mutex_owner(lock) on failure or NULL on success.
+ */
 static inline struct task_struct *__mutex_trylock_common(struct mutex *lock, 
bool handoff)
 {
unsigned long owner, curr = (unsigned long)current;
@@ -736,6 +739,47 @@ __ww_mutex_lock(struct mutex *lock, unsi
return __mutex_lock_common(lock, state, subclass, NULL, ip, ww_ctx, 
true);
 }
 
+/**
+ * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire 
context
+ * @ww: mutex to lock
+ * @ww_ctx: optional w/w acquire context
+ *
+ * Trylocks a mutex with the optional acquire context; no deadlock detection is
+ * possible. Returns 1 if the mutex has been acquired successfully, 0 
otherwise.
+ *
+ * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a @ctx 
is
+ * specified, -EALREADY handling may happen in calls to ww_mutex_trylock.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+int ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx)
+{
+   if (!ww_ctx)
+   return mutex_trylock(&ww->base);
+
+   MUTEX_WARN_ON(ww->base.magic != &ww->base);
+
+   if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
+   return -EALREADY;
+
+   /*
+* Reset the wounded flag after a kill. No other process can
+* race and wound us here, since they can't have a valid owner
+* pointer if we don't have any locks held.
+*/
+   if (ww_ctx->acquired == 0)
+   ww_ctx->wounded = 0;
+
+   if (__mutex_trylock(&ww->base)) {
+   ww_mutex_set_context_fastpath(ww, ww_ctx);
+   mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ww_ctx->dep_map, 
_RET_IP_);
+   return 1;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(ww_mutex_trylock);
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 void __sched
 mutex_lock_nested(struct mutex *lock, unsigned int subclass)
--- a/kernel/locking/ww_rt_mutex.c
+++ b/kernel/locking/ww_rt_mutex.c
@@ -9,6 +9,34 @@
 #define WW_RT
 #include "rtmutex.c"
 
+int ww_mutex_trylock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx)
+{
+   struct rt_mutex *rtm = &lock->base;
+
+   if (!ww_ctx)
+   return rt_mutex_trylock(rtm);
+
+   if (unlikely(ww_

Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Tvrtko Ursulin



On 10/09/2021 15:24, Matt Roper wrote:

On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:


On 10/09/2021 06:33, Matt Roper wrote:

Our uncore MMIO functions for reading/writing registers have become very
complicated over time.  There's significant macro magic used to generate
several nearly-identical functions that only really differ in terms of
which platform-specific shadow register table they should check on write
operations.  We can significantly simplify our MMIO handlers by storing
a reference to the current platform's shadow table within the 'struct
intel_uncore' the same way we already do for forcewake; this allows us
to consolidate the multiple variants of each 'write' function down to
just a single 'fwtable' version that gets the shadow table out of the
uncore struct rather than hardcoding the name of a specific platform's
table.  We can do similar consolidation on the MMIO read side by
creating a single-entry forcewake table to replace the open-coded range
check they had been using previously.

The final patch of the series adds a new shadow table for DG2; this
becomes quite clean and simple now, given the refactoring in the first
five patches.


Tidy and it ends up saving kernel binary size.

However I am undecided yet, because one thing to note is that the trade off
is source code and kernel text consolidation at the expense of more indirect
calls at runtime and larger common read/write functions.

To expand, current code generates a bunch of per gen functions but in doing
so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH
(from find_fw_domain) so at runtime each platform mmio read/write does not
have to do indirect calls to do lookups.

It may matter a lot in the grand scheme of things but this trade off is
something to note in the cover letter I think.


That's true.  However it seems like if the extra indirect calls are good
enough for our forcewake lookups (which are called more frequently and
have to search through much larger tables) then using the same strategy
for shadow registers should be less of a concern.  Plus most of
timing-critical parts of the code don't call through this at all; they
just grab an explicit forcewake and then issue a bunch of *_fw()
operations that skip all the per-register forcewake and shadow handling.


With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have 
a good idea of how many of those followed by "_fw" accessors we have vs 
"un-optimized" access. But it's a good point.


I was mostly coming from the point of view of old platforms like gen6, 
where with this series reads go from inlined checks (NEEDS_FORCE_WAKE) 
to always calling find_fw_domain. Just because it is a bit unfortunate 
to burden old CPUs (they are not getting any faster) with executing more 
code. It's not nice when old hardware gets slower and slower with 
software updates. :) But whether or not this case would be at all
measurable... probably not, unless some compounding effects, like "death
by a thousand cuts", come into play.


Regards,

Tvrtko


But you're right that this is something I should mention more clearly in
the cover letter.


Matt



Regards,

Tvrtko


Matt Roper (6):
drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
drm/i915/uncore: Associate shadow table with uncore
drm/i915/uncore: Replace gen8 write functions with general fwtable
drm/i915/uncore: Drop gen11/gen12 mmio write handlers
drm/i915/uncore: Drop gen11 mmio read handlers
drm/i915/dg2: Add DG2-specific shadow register table

   drivers/gpu/drm/i915/intel_uncore.c   | 190 ++
   drivers/gpu/drm/i915/intel_uncore.h   |   7 +
   drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
   3 files changed, 110 insertions(+), 88 deletions(-)





[PATCH 0/1] lib, stackdepot: Add helper to print stack entries into buffer.

2021-09-10 Thread Imran Khan
This change is in response to discussion at [1].
The patch has been created on top of my earlier changes [2] and [3].
If needed I can resend all of these patches together, though my
earlier patches have been Acked.

[1] https://lore.kernel.org/lkml/e6f6fb85-1d83-425b-9e36-b5784cc9e...@suse.cz/
[2] https://lore.kernel.org/lkml/fe94ffd8-d235-87d8-9c3d-80f7f73e0...@suse.cz/
[3] https://lore.kernel.org/lkml/85f4f073-0b5a-9052-0ba9-74d450608...@suse.cz/

Imran Khan (1):
  lib, stackdepot: Add helper to print stack entries into buffer.

 drivers/gpu/drm/drm_dp_mst_topology.c   |  5 +
 drivers/gpu/drm/drm_mm.c|  5 +
 drivers/gpu/drm/i915/i915_vma.c |  5 +
 drivers/gpu/drm/i915/intel_runtime_pm.c | 20 +---
 include/linux/stackdepot.h  |  3 +++
 lib/stackdepot.c| 23 +++
 mm/page_owner.c |  5 +
 7 files changed, 35 insertions(+), 31 deletions(-)

-- 
2.30.2



[PATCH 1/1] lib, stackdepot: Add helper to print stack entries into buffer.

2021-09-10 Thread Imran Khan
To print stack entries into a buffer, users of stackdepot
first get a list of stack entries using stack_depot_fetch
and then print this list into a buffer using stack_trace_snprint.
Provide a helper in stackdepot for this purpose.
Also change the above-mentioned users to use this helper.
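(The lib/stackdepot.c hunk essentially wraps the two calls named above; a
sketch, with the signature inferred from the converted call sites, so treat
the details as assumptions:)

#include <linux/stackdepot.h>
#include <linux/stacktrace.h>

/* Fetch the entries saved under @handle and print them into @buf,
 * indented by @spaces; returns the number of characters written. */
int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
			int spaces)
{
	unsigned long *entries;
	unsigned int nr_entries;

	nr_entries = stack_depot_fetch(handle, &entries);

	return nr_entries ? stack_trace_snprint(buf, size, entries,
						nr_entries, spaces) : 0;
}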

Signed-off-by: Imran Khan 
Suggested-by: Vlastimil Babka 
---
 drivers/gpu/drm/drm_dp_mst_topology.c   |  5 +
 drivers/gpu/drm/drm_mm.c|  5 +
 drivers/gpu/drm/i915/i915_vma.c |  5 +
 drivers/gpu/drm/i915/intel_runtime_pm.c | 20 +---
 include/linux/stackdepot.h  |  3 +++
 lib/stackdepot.c| 23 +++
 mm/page_owner.c |  5 +
 7 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
b/drivers/gpu/drm/drm_dp_mst_topology.c
index 86d13d6bc463..2d1adab9e360 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1668,13 +1668,10 @@ __dump_topology_ref_history(struct 
drm_dp_mst_topology_ref_history *history,
for (i = 0; i < history->len; i++) {
const struct drm_dp_mst_topology_ref_entry *entry =
&history->entries[i];
-   ulong *entries;
-   uint nr_entries;
u64 ts_nsec = entry->ts_nsec;
u32 rem_nsec = do_div(ts_nsec, 1000000000);
 
-   nr_entries = stack_depot_fetch(entry->backtrace, &entries);
-   stack_trace_snprint(buf, PAGE_SIZE, entries, nr_entries, 4);
+   stack_depot_snprint(entry->backtrace, buf, PAGE_SIZE, 4);
 
drm_printf(&p, "  %d %ss (last at %5llu.%06u):\n%s",
   entry->count,
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 93d48a6f04ab..ca04d7f6f7b5 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -118,8 +118,6 @@ static noinline void save_stack(struct drm_mm_node *node)
 static void show_leaks(struct drm_mm *mm)
 {
struct drm_mm_node *node;
-   unsigned long *entries;
-   unsigned int nr_entries;
char *buf;
 
buf = kmalloc(BUFSZ, GFP_KERNEL);
@@ -133,8 +131,7 @@ static void show_leaks(struct drm_mm *mm)
continue;
}
 
-   nr_entries = stack_depot_fetch(node->stack, &entries);
-   stack_trace_snprint(buf, BUFSZ, entries, nr_entries, 0);
+   stack_depot_snprint(node->stack, buf, BUFSZ, 0);
DRM_ERROR("node [%08llx + %08llx]: inserted at\n%s",
  node->start, node->size, buf);
}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4b7fc4647e46..f2d9ed375109 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -56,8 +56,6 @@ void i915_vma_free(struct i915_vma *vma)
 
 static void vma_print_allocator(struct i915_vma *vma, const char *reason)
 {
-   unsigned long *entries;
-   unsigned int nr_entries;
char buf[512];
 
if (!vma->node.stack) {
@@ -66,8 +64,7 @@ static void vma_print_allocator(struct i915_vma *vma, const 
char *reason)
return;
}
 
-   nr_entries = stack_depot_fetch(vma->node.stack, &entries);
-   stack_trace_snprint(buf, sizeof(buf), entries, nr_entries, 0);
+   stack_depot_snprint(vma->node.stack, buf, sizeof(buf), 0);
DRM_DEBUG_DRIVER("vma.node [%08llx + %08llx] %s: inserted at %s\n",
 vma->node.start, vma->node.size, reason, buf);
 }
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
b/drivers/gpu/drm/i915/intel_runtime_pm.c
index eaf7688f517d..cc312f0a05eb 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -65,16 +65,6 @@ static noinline depot_stack_handle_t __save_depot_stack(void)
return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
 }
 
-static void __print_depot_stack(depot_stack_handle_t stack,
-   char *buf, int sz, int indent)
-{
-   unsigned long *entries;
-   unsigned int nr_entries;
-
-   nr_entries = stack_depot_fetch(stack, &entries);
-   stack_trace_snprint(buf, sz, entries, nr_entries, indent);
-}
-
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
spin_lock_init(&rpm->debug.lock);
@@ -146,12 +136,12 @@ static void untrack_intel_runtime_pm_wakeref(struct 
intel_runtime_pm *rpm,
if (!buf)
return;
 
-   __print_depot_stack(stack, buf, PAGE_SIZE, 2);
+   stack_depot_snprint(stack, buf, PAGE_SIZE, 2);
DRM_DEBUG_DRIVER("wakeref %x from\n%s", stack, buf);
 
stack = READ_ONCE(rpm->debug.last_release);
if (stack) {
-   __print_depot_stack(stack, buf, PAGE_SIZE, 2);
+  

Re: [PATCH v3 2/8] mm: Introduce a function to check for confidential computing features

2021-09-10 Thread Borislav Petkov
On Wed, Sep 08, 2021 at 05:58:33PM -0500, Tom Lendacky wrote:
> In prep for other confidential computing technologies, introduce a generic

preparation

> helper function, cc_platform_has(), that can be used to check for specific
> active confidential computing attributes, like memory encryption. This is
> intended to eliminate having to add multiple technology-specific checks to
> the code (e.g. if (sev_active() || tdx_active())).
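(For readers following along, the intended call-site shape is roughly the
following; the attribute name is an assumption based on the memory-encryption
example in the quoted text, and the wrapper function is purely hypothetical:)

#include <linux/cc_platform.h>

/* One generic attribute check replaces the chain of technology-specific
 * helpers such as sev_active() || tdx_active(). */
static bool need_mem_encrypt_setup(void)
{
	return cc_platform_has(CC_ATTR_MEM_ENCRYPT);
}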

...

> diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
> new file mode 100644
> index ..253f3ea66cd8
> --- /dev/null
> +++ b/include/linux/cc_platform.h
> @@ -0,0 +1,88 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Confidential Computing Platform Capability checks
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Tom Lendacky 
> + */
> +
> +#ifndef _CC_PLATFORM_H

_LINUX_CC_PLATFORM_H

> +#define _CC_PLATFORM_H

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH v5 1/3] dt-bindings: Add YAML bindings for NVDEC

2021-09-10 Thread Mikko Perttunen
Add YAML device tree bindings for NVDEC, now in a more appropriate
place compared to the old textual Host1x bindings.

Signed-off-by: Mikko Perttunen 
---
v5:
* Changed from nvidia,instance to nvidia,host1x-class optional
  property.
* Added dma-coherent
v4:
* Fix incorrect compatibility string in 'if' condition
v3:
* Drop host1x bindings
* Change read2 to read-1 in interconnect names
v2:
* Fix issues pointed out in v1
* Add T194 nvidia,instance property
---
 .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 ++
 MAINTAINERS   |   1 +
 2 files changed, 105 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml

diff --git 
a/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml 
b/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
new file mode 100644
index ..f1f8d083d736
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
@@ -0,0 +1,104 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/gpu/host1x/nvidia,tegra210-nvdec.yaml#";
+$schema: "http://devicetree.org/meta-schemas/core.yaml#";
+
+title: Device tree binding for NVIDIA Tegra NVDEC
+
+description: |
+  NVDEC is the hardware video decoder present on NVIDIA Tegra210
+  and newer chips. It is located on the Host1x bus and typically
+  programmed through Host1x channels.
+
+maintainers:
+  - Thierry Reding 
+  - Mikko Perttunen 
+
+properties:
+  $nodename:
+pattern: "^nvdec@[0-9a-f]*$"
+
+  compatible:
+enum:
+  - nvidia,tegra210-nvdec
+  - nvidia,tegra186-nvdec
+  - nvidia,tegra194-nvdec
+
+  reg:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+
+  clock-names:
+items:
+  - const: nvdec
+
+  resets:
+maxItems: 1
+
+  reset-names:
+items:
+  - const: nvdec
+
+  power-domains:
+maxItems: 1
+
+  iommus:
+maxItems: 1
+
+  dma-coherent: true
+
+  interconnects:
+items:
+  - description: DMA read memory client
+  - description: DMA read 2 memory client
+  - description: DMA write memory client
+
+  interconnect-names:
+items:
+  - const: dma-mem
+  - const: read-1
+  - const: write
+
+  nvidia,host1x-class:
+description: Host1x class of the engine. If not specified, defaults to 
0xf0.
+$ref: /schemas/types.yaml#/definitions/uint32
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - resets
+  - reset-names
+  - power-domains
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+#include 
+#include 
+#include 
+
+nvdec@15480000 {
+compatible = "nvidia,tegra186-nvdec";
+reg = <0x15480000 0x40000>;
+clocks = <&bpmp TEGRA186_CLK_NVDEC>;
+clock-names = "nvdec";
+resets = <&bpmp TEGRA186_RESET_NVDEC>;
+reset-names = "nvdec";
+
+power-domains = <&bpmp TEGRA186_POWER_DOMAIN_NVDEC>;
+interconnects = <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD &emc>,
+<&mc TEGRA186_MEMORY_CLIENT_NVDECSRD1 &emc>,
+<&mc TEGRA186_MEMORY_CLIENT_NVDECSWR &emc>;
+interconnect-names = "dma-mem", "read-1", "write";
+iommus = <&smmu TEGRA186_SID_NVDEC>;
+};
+
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 69932194e1ba..ce9e360639d5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6230,6 +6230,7 @@ L:linux-te...@vger.kernel.org
 S: Supported
 T: git git://anongit.freedesktop.org/tegra/linux.git
 F: 
Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
+F: Documentation/devicetree/bindings/gpu/host1x/
 F: drivers/gpu/drm/tegra/
 F: drivers/gpu/host1x/
 F: include/linux/host1x.h
-- 
2.32.0



[PATCH v5 2/3] arm64: tegra: Add NVDEC to Tegra186/194 device trees

2021-09-10 Thread Mikko Perttunen
Add a device tree node for NVDEC on Tegra186, and
device tree nodes for NVDEC and NVDEC1 on Tegra194.

Signed-off-by: Mikko Perttunen 
---
v5:
* Change from nvidia,instance to nvidia,host1x-class
v4:
* Add dma-coherent markers
v3:
* Change read2 to read-1
v2:
* Add NVDECSRD1 memory client
* Add also to T194 (both NVDEC0/1)
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 16 ++
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 38 
 2 files changed, 54 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index d02f6bf3e2ca..4f2f21242b2c 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1342,6 +1342,22 @@ dsib: dsi@1540 {
power-domains = <&bpmp TEGRA186_POWER_DOMAIN_DISP>;
};
 
+   nvdec@15480000 {
+   compatible = "nvidia,tegra186-nvdec";
+   reg = <0x15480000 0x40000>;
+   clocks = <&bpmp TEGRA186_CLK_NVDEC>;
+   clock-names = "nvdec";
+   resets = <&bpmp TEGRA186_RESET_NVDEC>;
+   reset-names = "nvdec";
+
+   power-domains = <&bpmp TEGRA186_POWER_DOMAIN_NVDEC>;
+   interconnects = <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD 
&emc>,
+   <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD1 
&emc>,
+   <&mc TEGRA186_MEMORY_CLIENT_NVDECSWR 
&emc>;
+   interconnect-names = "dma-mem", "read-1", "write";
+   iommus = <&smmu TEGRA186_SID_NVDEC>;
+   };
+
sor0: sor@1554 {
compatible = "nvidia,tegra186-sor";
reg = <0x1554 0x1>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 5ba7a4519b95..04e883aa7aa2 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1412,6 +1412,25 @@ host1x@13e0 {
interconnect-names = "dma-mem";
iommus = <&smmu TEGRA194_SID_HOST1X>;
 
+   nvdec@15140000 {
+   compatible = "nvidia,tegra194-nvdec";
+   reg = <0x15140000 0x00040000>;
+   clocks = <&bpmp TEGRA194_CLK_NVDEC1>;
+   clock-names = "nvdec";
+   resets = <&bpmp TEGRA194_RESET_NVDEC1>;
+   reset-names = "nvdec";
+
+   power-domains = <&bpmp 
TEGRA194_POWER_DOMAIN_NVDECB>;
+   interconnects = <&mc 
TEGRA194_MEMORY_CLIENT_NVDEC1SRD &emc>,
+   <&mc 
TEGRA194_MEMORY_CLIENT_NVDEC1SRD1 &emc>,
+   <&mc 
TEGRA194_MEMORY_CLIENT_NVDEC1SWR &emc>;
+   interconnect-names = "dma-mem", "read-1", 
"write";
+   iommus = <&smmu TEGRA194_SID_NVDEC1>;
+   dma-coherent;
+
+   nvidia,host1x-class = <0xf5>;
+   };
+
display-hub@1520 {
compatible = "nvidia,tegra194-display";
reg = <0x1520 0x0004>;
@@ -1525,6 +1544,25 @@ vic@1534 {
iommus = <&smmu TEGRA194_SID_VIC>;
};
 
+   nvdec@15480000 {
+   compatible = "nvidia,tegra194-nvdec";
+   reg = <0x15480000 0x00040000>;
+   clocks = <&bpmp TEGRA194_CLK_NVDEC>;
+   clock-names = "nvdec";
+   resets = <&bpmp TEGRA194_RESET_NVDEC>;
+   reset-names = "nvdec";
+
+   power-domains = <&bpmp 
TEGRA194_POWER_DOMAIN_NVDECA>;
+   interconnects = <&mc 
TEGRA194_MEMORY_CLIENT_NVDECSRD &emc>,
+   <&mc 
TEGRA194_MEMORY_CLIENT_NVDECSRD1 &emc>,
+   <&mc 
TEGRA194_MEMORY_CLIENT_NVDECSWR &emc>;
+   interconnect-names = "dma-mem", "read-1", 
"write";
+   iommus = <&smmu TEGRA194_SID_NVDEC>;
+   dma-coherent;
+
+   nvidia,host1x-class = <0xf0>;
+   };
+
dpaux0: dpaux@155c {
compatible = "nvidia,tegra194-dpaux";
reg = <0x155c 0x1>;
-- 
2.32.0



[PATCH v5 3/3] drm/tegra: Add NVDEC driver

2021-09-10 Thread Mikko Perttunen
Add support for booting and using NVDEC on Tegra210, Tegra186
and Tegra194 to the Host1x and TegraDRM drivers. Booting in
secure mode is not currently supported.

Signed-off-by: Mikko Perttunen 
---
v5:
* Remove num_instances
* Change from nvidia,instance to nvidia,host1x-class
v3:
* Change num_instances to unsigned int
* Remove unnecessary '= 0' initializer
* Populate num_instances data
* Fix instance number check
v2:
* Use devm_platform_get_and_ioremap_resource
* Remove reset handling, done by power domain code
* Assume runtime PM is enabled
---
 drivers/gpu/drm/tegra/Makefile |   3 +-
 drivers/gpu/drm/tegra/drm.c|   4 +
 drivers/gpu/drm/tegra/drm.h|   1 +
 drivers/gpu/drm/tegra/nvdec.c  | 464 +
 drivers/gpu/host1x/dev.c   |  18 ++
 include/linux/host1x.h |   2 +
 6 files changed, 491 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/tegra/nvdec.c

diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index 5d2039f0c734..b248c631f790 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -24,7 +24,8 @@ tegra-drm-y := \
gr2d.o \
gr3d.o \
falcon.o \
-   vic.o
+   vic.o \
+   nvdec.o
 
 tegra-drm-y += trace.o
 
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index b20fd0833661..5f5afd7ba37e 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -1337,15 +1337,18 @@ static const struct of_device_id host1x_drm_subdevs[] = 
{
{ .compatible = "nvidia,tegra210-sor", },
{ .compatible = "nvidia,tegra210-sor1", },
{ .compatible = "nvidia,tegra210-vic", },
+   { .compatible = "nvidia,tegra210-nvdec", },
{ .compatible = "nvidia,tegra186-display", },
{ .compatible = "nvidia,tegra186-dc", },
{ .compatible = "nvidia,tegra186-sor", },
{ .compatible = "nvidia,tegra186-sor1", },
{ .compatible = "nvidia,tegra186-vic", },
+   { .compatible = "nvidia,tegra186-nvdec", },
{ .compatible = "nvidia,tegra194-display", },
{ .compatible = "nvidia,tegra194-dc", },
{ .compatible = "nvidia,tegra194-sor", },
{ .compatible = "nvidia,tegra194-vic", },
+   { .compatible = "nvidia,tegra194-nvdec", },
{ /* sentinel */ }
 };
 
@@ -1369,6 +1372,7 @@ static struct platform_driver * const drivers[] = {
&tegra_gr2d_driver,
&tegra_gr3d_driver,
&tegra_vic_driver,
+   &tegra_nvdec_driver,
 };
 
 static int __init host1x_drm_init(void)
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 8b28327c931c..fc0a19554eac 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -202,5 +202,6 @@ extern struct platform_driver tegra_sor_driver;
 extern struct platform_driver tegra_gr2d_driver;
 extern struct platform_driver tegra_gr3d_driver;
 extern struct platform_driver tegra_vic_driver;
+extern struct platform_driver tegra_nvdec_driver;
 
 #endif /* HOST1X_DRM_H */
diff --git a/drivers/gpu/drm/tegra/nvdec.c b/drivers/gpu/drm/tegra/nvdec.c
new file mode 100644
index ..c3b6fe7fb454
--- /dev/null
+++ b/drivers/gpu/drm/tegra/nvdec.c
@@ -0,0 +1,464 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2015-2021, NVIDIA Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "drm.h"
+#include "falcon.h"
+#include "vic.h"
+
+struct nvdec_config {
+   const char *firmware;
+   unsigned int version;
+   bool supports_sid;
+};
+
+struct nvdec {
+   struct falcon falcon;
+
+   void __iomem *regs;
+   struct tegra_drm_client client;
+   struct host1x_channel *channel;
+   struct device *dev;
+   struct clk *clk;
+
+   /* Platform configuration */
+   const struct nvdec_config *config;
+};
+
+static inline struct nvdec *to_nvdec(struct tegra_drm_client *client)
+{
+   return container_of(client, struct nvdec, client);
+}
+
+static void nvdec_writel(struct nvdec *nvdec, u32 value, unsigned int offset)
+{
+   writel(value, nvdec->regs + offset);
+}
+
+static int nvdec_boot(struct nvdec *nvdec)
+{
+#ifdef CONFIG_IOMMU_API
+   struct iommu_fwspec *spec = dev_iommu_fwspec_get(nvdec->dev);
+#endif
+   int err;
+
+#ifdef CONFIG_IOMMU_API
+   if (nvdec->config->supports_sid && spec) {
+   u32 value;
+
+   value = TRANSCFG_ATT(1, TRANSCFG_SID_FALCON) | TRANSCFG_ATT(0, 
TRANSCFG_SID_HW);
+   nvdec_writel(nvdec, value, VIC_TFBIF_TRANSCFG);
+
+   if (spec->num_ids > 0) {
+   value = spec->ids[0] & 0xffff;
+
+   nvdec_writel(nvdec, value, VIC_THI_STREAMID0);
+   nvdec_writel(nvdec, value, VIC_THI_STREAMID1);
+   }
+   }
+#endif
+
+   err = falcon_boot(&nvdec->falcon);
+   if (

[PATCH v5 0/3] NVIDIA Tegra NVDEC support

2021-09-10 Thread Mikko Perttunen
Here's the v5 of the NVDEC support series, containing the
following changes:

* Changed from nvidia,instance property to nvidia,host1x-class
  property.
* Set additionalProperties to false in DT bindings.
* Added dma-coherent property to DT bindings.

NVDEC hardware documentation can be found at
https://github.com/NVIDIA/open-gpu-doc/tree/master/classes/video

and example userspace can be found at
https://github.com/cyndis/vaapi-tegra-driver

Thanks,
Mikko

Mikko Perttunen (3):
  dt-bindings: Add YAML bindings for NVDEC
  arm64: tegra: Add NVDEC to Tegra186/194 device trees
  drm/tegra: Add NVDEC driver

 .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 
 MAINTAINERS   |   1 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi  |  16 +
 arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  38 ++
 drivers/gpu/drm/tegra/Makefile|   3 +-
 drivers/gpu/drm/tegra/drm.c   |   4 +
 drivers/gpu/drm/tegra/drm.h   |   1 +
 drivers/gpu/drm/tegra/nvdec.c | 464 ++
 drivers/gpu/host1x/dev.c  |  18 +
 include/linux/host1x.h|   2 +
 10 files changed, 650 insertions(+), 1 deletion(-)
 create mode 100644 
Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
 create mode 100644 drivers/gpu/drm/tegra/nvdec.c

-- 
2.32.0



Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Christian König




Am 10.09.21 um 15:15 schrieb Thomas Hellström:

Both the provider (resource manager) and the consumer (the TTM driver)
want to subclass struct ttm_resource. Since this is left for the resource
manager, we need to provide a private pointer for the TTM driver.

Provide a struct ttm_resource_private for the driver to subclass for
data with the same lifetime as the struct ttm_resource: In the i915 case
it will, for example, be an sg-table and radix tree into the LMEM
/VRAM pages that currently are awkwardly attached to the GEM object.

Provide an ops structure for associated ops (Which is only destroy() ATM)
It might seem pointless to provide a separate ops structure, but Linus
has previously made it clear that that's the norm.

After careful audit one could perhaps also on a per-driver basis
replace the delete_mem_notify() TTM driver callback with the above
destroy function.


Well this is a really big NAK to this approach.

If you need to attach some additional information to the resource then 
implement your own resource manager like everybody else does.
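(For illustration, the pattern being suggested here is for the driver's own
resource manager to embed the base struct directly; the names below are made
up, but the container_of() idiom is the standard one:)

/* The driver's manager allocates this; TTM core only ever sees @base. */
struct my_driver_resource {
	struct ttm_resource base;	/* handed to and owned by TTM */
	struct sg_table *sgt;		/* driver-private, same lifetime */
};

static inline struct my_driver_resource *
to_my_resource(struct ttm_resource *res)
{
	return container_of(res, struct my_driver_resource, base);
}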


Regards,
Christian.



Cc: Matthew Auld 
Cc: König Christian 
Signed-off-by: Thomas Hellström 
---
  drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
  include/drm/ttm/ttm_resource.h | 28 
  2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..973e7c50bfed 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo,
  void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource 
**res)
  {
struct ttm_resource_manager *man;
+   struct ttm_resource *resource = *res;
  
-	if (!*res)

+   if (!resource)
return;
  
-	man = ttm_manager_type(bo->bdev, (*res)->mem_type);

-   man->func->free(man, *res);
*res = NULL;
+   if (resource->priv)
+   resource->priv->ops.destroy(resource->priv);
+
+   man = ttm_manager_type(bo->bdev, resource->mem_type);
+   man->func->free(man, resource);
  }
  EXPORT_SYMBOL(ttm_resource_free);
  
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h

index 140b6b9a8bbe..5a22c9a29c05 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -44,6 +44,7 @@ struct dma_buf_map;
  struct io_mapping;
  struct sg_table;
  struct scatterlist;
+struct ttm_resource_private;
  
  struct ttm_resource_manager_func {

/**
@@ -153,6 +154,32 @@ struct ttm_bus_placement {
enum ttm_caching caching;
  };
  
+/**

+ * struct ttm_resource_private_ops - Operations for a struct
+ * ttm_resource_private
+ *
+ * Not much benefit to keep this as a separate struct with only a single 
member,
+ * but keeping a separate ops struct is the norm.
+ */
+struct ttm_resource_private_ops {
+   /**
+* destroy() - Callback to destroy the private data
+* @priv - The private data to destroy
+*/
+   void (*destroy) (struct ttm_resource_private *priv);
+};
+
+/**
+ * struct ttm_resource_private - TTM driver private data
+ * @ops: Pointer to struct ttm_resource_private_ops with associated operations
+ *
+ * Intended to be subclassed to hold, for example cached data sharing the
+ * lifetime with a struct ttm_resource.
+ */
+struct ttm_resource_private {
+   const struct ttm_resource_private_ops ops;
+};
+
  /**
   * struct ttm_resource
   *
@@ -171,6 +198,7 @@ struct ttm_resource {
uint32_t mem_type;
uint32_t placement;
struct ttm_bus_placement bus;
+   struct ttm_resource_private *priv;
  };
  
  /**




Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Matt Roper
On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:
> 
> On 10/09/2021 06:33, Matt Roper wrote:
> > Our uncore MMIO functions for reading/writing registers have become very
> > complicated over time.  There's significant macro magic used to generate
> > several nearly-identical functions that only really differ in terms of
> > which platform-specific shadow register table they should check on write
> > operations.  We can significantly simplify our MMIO handlers by storing
> > a reference to the current platform's shadow table within the 'struct
> > intel_uncore' the same way we already do for forcewake; this allows us
> > to consolidate the multiple variants of each 'write' function down to
> > just a single 'fwtable' version that gets the shadow table out of the
> > uncore struct rather than hardcoding the name of a specific platform's
> > table.  We can do similar consolidation on the MMIO read side by
> > creating a single-entry forcewake table to replace the open-coded range
> > check they had been using previously.
> > 
> > The final patch of the series adds a new shadow table for DG2; this
> > becomes quite clean and simple now, given the refactoring in the first
> > five patches.
> 
> Tidy and it ends up saving kernel binary size.
> 
> However I am undecided yet, because one thing to note is that the trade off
> is source code and kernel text consolidation at the expense of more indirect
> calls at runtime and larger common read/write functions.
> 
> To expand, current code generates a bunch of per gen functions but in doing
> so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH
> (from find_fw_domain) so at runtime each platform mmio read/write does not
> have to do indirect calls to do lookups.
> 
> It may matter a lot in the grand scheme of things but this trade off is
> something to note in the cover letter I think.

That's true.  However it seems like if the extra indirect calls are good
enough for our forcewake lookups (which are called more frequently and
have to search through much larger tables) then using the same strategy
for shadow registers should be less of a concern.  Plus most of the
timing-critical parts of the code don't call through this at all; they
just grab an explicit forcewake and then issue a bunch of *_fw()
operations that skip all the per-register forcewake and shadow handling.
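(A rough sketch of that pattern; the helper names follow i915 convention, i.e.
intel_uncore_forcewake_get/put and the *_fw accessors, but treat the exact
signatures as assumptions here:)

/* Timing-critical path: take forcewake once for the whole batch, then
 * use raw writes that skip per-register forcewake and shadow handling. */
static void write_batch_fw(struct intel_uncore *uncore,
			   const i915_reg_t *regs, const u32 *vals, int n)
{
	int i;

	intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);

	for (i = 0; i < n; i++)
		intel_uncore_write_fw(uncore, regs[i], vals[i]);

	intel_uncore_forcewake_put(uncore, FORCEWAKE_ALL);
}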

But you're right that this is something I should mention more clearly in
the cover letter.


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > Matt Roper (6):
> >drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
> >drm/i915/uncore: Associate shadow table with uncore
> >drm/i915/uncore: Replace gen8 write functions with general fwtable
> >drm/i915/uncore: Drop gen11/gen12 mmio write handlers
> >drm/i915/uncore: Drop gen11 mmio read handlers
> >drm/i915/dg2: Add DG2-specific shadow register table
> > 
> >   drivers/gpu/drm/i915/intel_uncore.c   | 190 ++
> >   drivers/gpu/drm/i915/intel_uncore.h   |   7 +
> >   drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
> >   3 files changed, 110 insertions(+), 88 deletions(-)
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [PATCH] drm/vc4: hdmi: Remove unused struct

2021-09-10 Thread Dave Stevenson
On Thu, 19 Aug 2021 at 15:08, Maxime Ripard  wrote:
>
> Commit c7d30623540b ("drm/vc4: hdmi: Remove unused struct") removed the
> references to the vc4_hdmi_audio_widgets and vc4_hdmi_audio_routes
> structures, but not the structures themselves, resulting in two warnings.
> Remove them.
>
> Fixes: c7d30623540b ("drm/vc4: hdmi: Remove unused struct")
> Reported-by: kernel test robot 
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_hdmi.c | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index b7dc32a0c9bb..1e2d976e8736 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -1403,14 +1403,6 @@ static int vc4_hdmi_audio_prepare(struct device *dev, 
> void *data,
> return 0;
>  }
>
> -static const struct snd_soc_dapm_widget vc4_hdmi_audio_widgets[] = {
> -   SND_SOC_DAPM_OUTPUT("TX"),
> -};
> -
> -static const struct snd_soc_dapm_route vc4_hdmi_audio_routes[] = {
> -   { "TX", NULL, "Playback" },
> -};
> -
>  static const struct snd_soc_component_driver vc4_hdmi_audio_cpu_dai_comp = {
> .name = "vc4-hdmi-cpu-dai-component",
>  };
> --
> 2.31.1
>


Re: [PATCH v3 1/6] drm/vc4: select PM

2021-09-10 Thread Dave Stevenson
On Thu, 19 Aug 2021 at 14:59, Maxime Ripard  wrote:
>
> We already depend on runtime PM to get the power domains and clocks for
> most of the devices supported by the vc4 driver, so let's just select it
> to make sure it's there, and remove the ifdef.
>
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/Kconfig| 1 +
>  drivers/gpu/drm/vc4/vc4_hdmi.c | 2 --
>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/Kconfig b/drivers/gpu/drm/vc4/Kconfig
> index 118e8a426b1a..f774ab340863 100644
> --- a/drivers/gpu/drm/vc4/Kconfig
> +++ b/drivers/gpu/drm/vc4/Kconfig
> @@ -9,6 +9,7 @@ config DRM_VC4
> select DRM_KMS_CMA_HELPER
> select DRM_GEM_CMA_HELPER
> select DRM_PANEL_BRIDGE
> +   select PM
> select SND_PCM
> select SND_PCM_ELD
> select SND_SOC_GENERIC_DMAENGINE_PCM
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index c2876731ee2d..602203b2d8e1 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -2107,7 +2107,6 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi 
> *vc4_hdmi)
> return 0;
>  }
>
> -#ifdef CONFIG_PM
>  static int vc4_hdmi_runtime_suspend(struct device *dev)
>  {
> struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
> @@ -2128,7 +2127,6 @@ static int vc4_hdmi_runtime_resume(struct device *dev)
>
> return 0;
>  }
> -#endif
>
>  static int vc4_hdmi_bind(struct device *dev, struct device *master, void 
> *data)
>  {
> --
> 2.31.1
>


Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code

2021-09-10 Thread Jason Gunthorpe
On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:
> > Every driver just emits a static string, simply feed it through the ops
> > and provide a standard sysfs show function.
> 
> Looks sensible.  But can you make the attribute optional and add a
> comment marking it deprecated?  Because it really is completely useless.
> We don't version userspace APIs; userspace has to discover new features
> individually by e.g. finding new sysfs files or just trying new ioctls.

To be honest I have no idea what side effects that would have..

device code search tells me libvirt reads it and stuffs it into some
XML

Something called mdevctl touches it, feeds it into some JSON and
other stuff..

qemu has some VFIO_DEVICE_API_* constants but it is all dead code

I agree it shouldn't have been there in the first place

Cornelia? Alex? Any thoughts?

Jason


Re: [Intel-gfx] [PATCH v5] drm/i915: Use Transparent Hugepages when IOMMU is enabled

2021-09-10 Thread Tvrtko Ursulin



On 09/09/2021 17:17, Rodrigo Vivi wrote:

On Thu, Sep 09, 2021 at 12:44:48PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Usage of Transparent Hugepages was disabled in 9987da4b5dcf
("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it
appears the majority of performance regressions reported with an enabled IOMMU
can be almost eliminated by turning them on, let's just do that.

To err on the side of safety we keep the current default in cases where the
IOMMU is not active, and only when it is active do we default to the
"huge=within_size" mode. Although there probably would be wins to enable them
throughout,
more extensive testing across benchmarks and platforms would need to be
done.

With the patch and IOMMU enabled my local testing on a small Skylake part
shows OglVSTangent regression being reduced from ~14% (IOMMU on versus
IOMMU off) to ~2% (same comparison but with THP on).

More detailed testing done in the below referenced Gitlab issue by Eero:

Skylake GT4e:

Performance drops from enabling IOMMU:

 30-35% SynMark CSDof
 20-25% Unigine Heaven, MemBW GPU write, SynMark VSTangent
 ~20% GLB Egypt  (1/2 screen window)
 10-15% GLB T-Rex (1/2 screen window)
 8-10% GfxBench T-Rex, MemBW GPU blit
 7-8% SynMark DeferredAA + TerrainFly* + ZBuffer
 6-7% GfxBench Manhattan 3.0 + 3.1, SynMark TexMem128 & CSCloth
 5-6% GfxBench CarChase, Unigine Valley
 3-5% GfxBench Vulkan & GL AztecRuins + ALU2, MemBW GPU texture,
  SynMark Fill*, Deferred, TerrainPan*
 1-2% Most of the other tests

With the patch drops become:

 20-25% SynMark TexMem*
 15-20% GLB Egypt (1/2 screen window)
 10-15% GLB T-Rex (1/2 screen window)
 4-7% GfxBench T-Rex, GpuTest Triangle
 1-8% GfxBench ALU2 (offscreen 1%, onscreen 8%)
 3% GfxBench Manhattan 3.0, SynMark CSDof
 2-3% Unigine Heaven + Valley, MemBW GPU texture
 1-3 GfxBench Manhattan 3.1 + CarChase + Vulkan & GL AztecRuins

Broxton:

Performance drops from IOMMU, without patch:

 30% MemBW GPU write
 25% SynMark ZBuffer + Fill*
 20% MemBW GPU blit
 15% MemBW GPU blend, GpuTest Triangle
 10-15% MemBW GPU texture
 10% GLB Egypt, Unigine Heaven (had hangs), SynMark TerrainFly*
 7-9% GLB T-Rex, GfxBench Manhattan 3.0 + T-Rex,
  SynMark Deferred* + TexMem*
 6-8% GfxBench CarChase, Unigine Valley,
  SynMark CSCloth + ShMapVsm + TerrainPan*
 5-6% GfxBench Manhattan 3.1 + GL AztecRuins,
  SynMark CSDof + TexFilterTri
 2-4% GfxBench ALU2, SynMark DrvRes + GSCloth + ShMapPcf + Batch[0-5] +
  TexFilterAniso, GpuTest GiMark + 32-bit Julia

And with patch:

 15-20% MemBW GPU texture
 10% SynMark TexMem*
 8-9% GLB Egypt (1/2 screen window)
 4-5% GLB T-Rex (1/2 screen window)
 3-6% GfxBench Manhattan 3.0, GpuTest FurMark,
  SynMark Deferred + TexFilterTri
 3-4% GfxBench Manhattan 3.1 + T-Rex, SynMark VSInstancing
 2-4% GpuTest Triangle, SynMark DeferredAA
 2-3% Unigine Heaven + Valley
 1-3% SynMark Terrain*
 1-2% GfxBench CarChase, SynMark TexFilterAniso + ZBuffer

Tigerlake-H:

 20-25% MemBW GPU texture
 15-20% GpuTest Triangle
 13-15% SynMark TerrainFly* + DeferredAA + HdrBloom
 8-10% GfxBench Manhattan 3.1, SynMark TerrainPan* + DrvRes
 6-7% GfxBench Manhattan 3.0, SynMark TexMem*
 4-8% GLB onscreen Fill + T-Rex + Egypt (more in onscreen than
  offscreen versions of T-Rex/Egypt)
 4-6% GfxBench CarChase + GLES AztecRuins + ALU2, GpuTest 32-bit Julia,
  SynMark CSDof + DrvState
 3-5% GfxBench T-Rex + Egypt, Unigine Heaven + Valley, GpuTest Plot3D
 1-7% Media tests
 2-3% MemBW GPU blit
 1-3% Most of the rest of 3D tests

With the patch:

 6-8% MemBW GPU blend => the only regression in these tests (compared
  to IOMMU without THP)
 4-6% SynMark DrvState (not impacted) + HdrBloom (improved)
 3-4% GLB T-Rex
 ~3% GLB Egypt, SynMark DrvRes
 1-3% GfxBench T-Rex + Egypt, SynMark TexFilterTri
 1-2% GfxBench CarChase + GLES AztecRuins, Unigine Valley,
 GpuTest Triangle
 ~1% GfxBench Manhattan 3.0/3.1, Unigine Heaven

Perf of several tests actually improved with IOMMU + THP, compared to no
IOMMU / no THP:

 10-15% SynMark Batch[0-3]
 5-10% MemBW GPU texture, SynMark ShMapVsm
 3-4% SynMark Fill* + Geom*
 2-3% SynMark TexMem512 + CSCloth
 1-2% SynMark TexMem128 + DeferredAA

As a summary across all platforms, these are the benchmarks where enabling
THP on top of IOMMU enabled brings regressions:

  * Skylake GT4e:
20-25% SynMark TexMem*
(whereas all MemBW GPU tests either improve or are not affected)

  * Broxton J4205:
7% MemBW GPU texture
2-3% SynMark TexMem*

  * Tigerlake-H:
7% MemBW GPU blend

Other benchmarks show either lowering of regressions or improvements.
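(A simplified sketch of the policy described above; the helper name and the way
the option string is consumed are illustrative only, not the actual i915 code,
which wires this choice into the gemfs mount options:)

/* Keep today's default when the IOMMU is off; otherwise ask shmem for
 * "within_size" transparent hugepages. */
static const char *gemfs_huge_option(bool iommu_active)
{
	return iommu_active ? "huge=within_size" : NULL;
}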

v2:
  * Add Kconfig dependency to transparent hugepages and some help text.
  * Move to helper for e

[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered

2021-09-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213391

--- Comment #37 from Michel Dänzer (mic...@daenzer.net) ---
(In reply to Lahfa Samy from comment #36)
> Did anyone test whether this has been fixed in newer firmware updates, or
> should we still stay on version 20210315.3568f96-3 ?

It's fixed in upstream linux-firmware 20210818.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v2 3/6] drm/i915 Implement LMEM backup and restore for suspend / resume

2021-09-10 Thread Thomas Hellström



On 9/6/21 6:55 PM, Thomas Hellström wrote:

Just evict unpinned objects to system. For pinned LMEM objects,
make a backup system object and blit the contents to that.

Backup is performed in three steps,
1: Opportunistically evict evictable objects using the gpu blitter.
2: After gt idle, evict evictable objects using the gpu blitter. This will
be modified in an upcoming patch to backup pinned objects that are not used
by the blitter itself.
3: Backup remaining pinned objects using memcpy.

Also move uC suspend to after 2) to make sure we have a functional GuC
during 2) if using GuC submission.

v2:
- Major refactor to make sure gem_exec_suspend@hang-SX subtests work, and
   suspend / resume works with a slightly modified GuC submission enabling
   patch series.

Signed-off-by: Thomas Hellström 
---
  drivers/gpu/drm/i915/Makefile |   1 +
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   1 +
  drivers/gpu/drm/i915/gem/i915_gem_pm.c|  92 +++-
  drivers/gpu/drm/i915/gem/i915_gem_pm.h|   3 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  29 ++-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.h   |  10 +
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c| 205 ++
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h|  24 ++
  drivers/gpu/drm/i915/gt/intel_gt_pm.c |   4 +-
  drivers/gpu/drm/i915/i915_drv.c   |  10 +-
  drivers/gpu/drm/i915/i915_drv.h   |   2 +-
  11 files changed, 364 insertions(+), 17 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index c36c8a4f0716..3379a0a6c91e 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -155,6 +155,7 @@ gem-y += \
gem/i915_gem_throttle.o \
gem/i915_gem_tiling.o \
gem/i915_gem_ttm.o \
+   gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 2471f36aaff3..734cc8e16481 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -534,6 +534,7 @@ struct drm_i915_gem_object {
struct {
struct sg_table *cached_io_st;
struct i915_gem_object_page_iter get_io_page;
+   struct drm_i915_gem_object *backup;
bool created:1;
} ttm;
  
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c

index 8b9d7d14c4bd..9746c255ddcc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -5,6 +5,7 @@
   */
  
  #include "gem/i915_gem_pm.h"

+#include "gem/i915_gem_ttm_pm.h"
  #include "gt/intel_gt.h"
  #include "gt/intel_gt_pm.h"
  #include "gt/intel_gt_requests.h"
@@ -39,7 +40,79 @@ void i915_gem_suspend(struct drm_i915_private *i915)
i915_gem_drain_freed_objects(i915);
  }
  
-void i915_gem_suspend_late(struct drm_i915_private *i915)

+static int lmem_restore(struct drm_i915_private *i915, bool allow_gpu)
+{
+   struct intel_memory_region *mr;
+   int ret = 0, id;
+
+   for_each_memory_region(mr, i915, id) {
+   if (mr->type == INTEL_MEMORY_LOCAL) {
+   ret = i915_ttm_restore_region(mr, allow_gpu);
+   if (ret)
+   break;
+   }
+   }
+
+   return ret;
+}
+
+static int lmem_suspend(struct drm_i915_private *i915, bool allow_gpu,
+   bool backup_pinned)
+{
+   struct intel_memory_region *mr;
+   int ret = 0, id;
+
+   for_each_memory_region(mr, i915, id) {
+   if (mr->type == INTEL_MEMORY_LOCAL) {
+   ret = i915_ttm_backup_region(mr, allow_gpu, 
backup_pinned);
+   if (ret)
+   break;
+   }
+   }
+
+   return ret;
+}
+
+static void lmem_recover(struct drm_i915_private *i915)
+{
+   struct intel_memory_region *mr;
+   int id;
+
+   for_each_memory_region(mr, i915, id)
+   if (mr->type == INTEL_MEMORY_LOCAL)
+   i915_ttm_recover_region(mr);
+}
+
+int i915_gem_backup_suspend(struct drm_i915_private *i915)
+{
+   int ret;
+
+   /* Opportunistically try to evict unpinned objects */
+   ret = lmem_suspend(i915, true, false);
+   if (ret)
+   goto out_recover;
+
+   i915_gem_suspend(i915);
+
+   /*
+* More objects may have become unpinned as requests were
+* retired. Now try to evict again. The gt may be wedged here
+* in which case we automatically fall back to memcpy.
+*/
+
+   ret = lmem_suspend(i915, true, false);
+   if (ret)
+  

[RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Thomas Hellström
Both the provider (resource manager) and the consumer (the TTM driver)
want to subclass struct ttm_resource. Since this is left for the resource
manager, we need to provide a private pointer for the TTM driver.

Provide a struct ttm_resource_private for the driver to subclass for
data with the same lifetime as the struct ttm_resource: In the i915 case
it will, for example, be an sg-table and radix tree into the LMEM
/VRAM pages that currently are awkwardly attached to the GEM object.

Provide an ops structure for associated ops (Which is only destroy() ATM)
It might seem pointless to provide a separate ops structure, but Linus
has previously made it clear that that's the norm.

After careful audit one could perhaps also on a per-driver basis
replace the delete_mem_notify() TTM driver callback with the above
destroy function.
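(A minimal sketch of how a driver might use this, based on the structs added
below; the driver-side names are hypothetical, and how the const ops member is
initialized is left to the driver and omitted here:)

/* Driver-private data sharing the lifetime of a ttm_resource. */
struct my_resource_priv {
	struct ttm_resource_private base;
	struct sg_table *sgt;		/* e.g. the cached sg-table */
};

static void my_resource_priv_destroy(struct ttm_resource_private *priv)
{
	struct my_resource_priv *p =
		container_of(priv, struct my_resource_priv, base);

	sg_free_table(p->sgt);
	kfree(p->sgt);
	kfree(p);
}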

Cc: Matthew Auld 
Cc: König Christian 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
 include/drm/ttm/ttm_resource.h | 28 
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..973e7c50bfed 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo,
 void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res)
 {
struct ttm_resource_manager *man;
+   struct ttm_resource *resource = *res;
 
-   if (!*res)
+   if (!resource)
return;
 
-   man = ttm_manager_type(bo->bdev, (*res)->mem_type);
-   man->func->free(man, *res);
*res = NULL;
+   if (resource->priv)
+   resource->priv->ops.destroy(resource->priv);
+
+   man = ttm_manager_type(bo->bdev, resource->mem_type);
+   man->func->free(man, resource);
 }
 EXPORT_SYMBOL(ttm_resource_free);
 
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 140b6b9a8bbe..5a22c9a29c05 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -44,6 +44,7 @@ struct dma_buf_map;
 struct io_mapping;
 struct sg_table;
 struct scatterlist;
+struct ttm_resource_private;
 
 struct ttm_resource_manager_func {
/**
@@ -153,6 +154,32 @@ struct ttm_bus_placement {
enum ttm_cachingcaching;
 };
 
+/**
+ * struct ttm_resource_private_ops - Operations for a struct
+ * ttm_resource_private
+ *
+ * Not much benefit to keep this as a separate struct with only a single 
member,
+ * but keeping a separate ops struct is the norm.
+ */
+struct ttm_resource_private_ops {
+   /**
+* destroy() - Callback to destroy the private data
+* @priv - The private data to destroy
+*/
+   void (*destroy) (struct ttm_resource_private *priv);
+};
+
+/**
+ * struct ttm_resource_private - TTM driver private data
+ * @ops: Pointer to struct ttm_resource_private_ops with associated operations
+ *
+ * Intended to be subclassed to hold, for example cached data sharing the
+ * lifetime with a struct ttm_resource.
+ */
+struct ttm_resource_private {
+   const struct ttm_resource_private_ops ops;
+};
+
 /**
  * struct ttm_resource
  *
@@ -171,6 +198,7 @@ struct ttm_resource {
uint32_t mem_type;
uint32_t placement;
struct ttm_bus_placement bus;
+   struct ttm_resource_private *priv;
 };
 
 /**
-- 
2.31.1



[PATCH 3/3] drm/vc4: dsi: Switch to devm_drm_of_get_bridge

2021-09-10 Thread Maxime Ripard
The new devm_drm_of_get_bridge removes most of the boilerplate we
have to deal with. Let's switch to it.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_drv.c |  2 ++
 drivers/gpu/drm/vc4/vc4_dsi.c | 28 
 2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 16abc3a3d601..96c526f1022e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -25,7 +25,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c
index a185027911ce..a229da58962a 100644
--- a/drivers/gpu/drm/vc4/vc4_dsi.c
+++ b/drivers/gpu/drm/vc4/vc4_dsi.c
@@ -1497,7 +1497,6 @@ static int vc4_dsi_bind(struct device *dev, struct device 
*master, void *data)
struct drm_device *drm = dev_get_drvdata(master);
struct vc4_dsi *dsi = dev_get_drvdata(dev);
struct vc4_dsi_encoder *vc4_dsi_encoder;
-   struct drm_panel *panel;
const struct of_device_id *match;
dma_cap_mask_t dma_mask;
int ret;
@@ -1609,27 +1608,9 @@ static int vc4_dsi_bind(struct device *dev, struct 
device *master, void *data)
return ret;
}
 
-   ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0,
- &panel, &dsi->bridge);
-   if (ret) {
-   /* If the bridge or panel pointed by dev->of_node is not
-* enabled, just return 0 here so that we don't prevent the DRM
-* dev from being registered. Of course that means the DSI
-* encoder won't be exposed, but that's not a problem since
-* nothing is connected to it.
-*/
-   if (ret == -ENODEV)
-   return 0;
-
-   return ret;
-   }
-
-   if (panel) {
-   dsi->bridge = devm_drm_panel_bridge_add_typed(dev, panel,
- 
DRM_MODE_CONNECTOR_DSI);
-   if (IS_ERR(dsi->bridge))
-   return PTR_ERR(dsi->bridge);
-   }
+   dsi->bridge = devm_drm_of_get_bridge(dev, dev->of_node, 0, 0);
+   if (IS_ERR(dsi->bridge))
+   return PTR_ERR(dsi->bridge);
 
/* The esc clock rate is supposed to always be 100Mhz. */
ret = clk_set_rate(dsi->escape_clock, 100 * 1000 * 1000);
@@ -1667,8 +1648,7 @@ static void vc4_dsi_unbind(struct device *dev, struct 
device *master,
 {
struct vc4_dsi *dsi = dev_get_drvdata(dev);
 
-   if (dsi->bridge)
-   pm_runtime_disable(dev);
+   pm_runtime_disable(dev);
 
/*
 * Restore the bridge_chain so the bridge detach procedure can happen
-- 
2.31.1



[PATCH 2/3] drm/vc4: dpi: Switch to devm_drm_of_get_bridge

2021-09-10 Thread Maxime Ripard
The new devm_drm_of_get_bridge removes most of the boilerplate we
have to deal with. Let's switch to it.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_dpi.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_dpi.c b/drivers/gpu/drm/vc4/vc4_dpi.c
index a90f2545baee..c180eb60bee8 100644
--- a/drivers/gpu/drm/vc4/vc4_dpi.c
+++ b/drivers/gpu/drm/vc4/vc4_dpi.c
@@ -229,26 +229,19 @@ static const struct of_device_id vc4_dpi_dt_match[] = {
 static int vc4_dpi_init_bridge(struct vc4_dpi *dpi)
 {
struct device *dev = &dpi->pdev->dev;
-   struct drm_panel *panel;
struct drm_bridge *bridge;
-   int ret;
 
-   ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0,
- &panel, &bridge);
-   if (ret) {
+   bridge = devm_drm_of_get_bridge(dev, dev->of_node, 0, 0);
+   if (IS_ERR(bridge)) {
/* If nothing was connected in the DT, that's not an
 * error.
 */
-   if (ret == -ENODEV)
+   if (PTR_ERR(bridge) == -ENODEV)
return 0;
else
-   return ret;
+   return PTR_ERR(bridge);
}
 
-   if (panel)
-   bridge = drm_panel_bridge_add_typed(panel,
-   DRM_MODE_CONNECTOR_DPI);
-
return drm_bridge_attach(dpi->encoder, bridge, NULL, 0);
 }
 
-- 
2.31.1



[PATCH 1/3] drm/bridge: Add a function to abstract away panels

2021-09-10 Thread Maxime Ripard
Display drivers so far need to have a lot of boilerplate to first
retrieve either the panel or bridge that they are connected to using
drm_of_find_panel_or_bridge(), and then either deal with each with ad-hoc
functions or create a drm panel bridge through drm_panel_bridge_add.

In order to reduce the boilerplate and hopefully create a path of least
resistance towards using the DRM panel bridge layer, let's create the
function devm_drm_of_get_bridge to reduce that boilerplate.
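(The typical call-site shape after conversion looks like the following; the
encoder/driver names are placeholders, mirroring what the vc4 patches later in
the series do:)

/* Fetch the downstream bridge, or a panel wrapped in a panel-bridge,
 * and attach it to our encoder. */
static int my_output_init(struct device *dev, struct drm_encoder *encoder)
{
	struct drm_bridge *bridge;

	bridge = devm_drm_of_get_bridge(dev, dev->of_node, 0, 0);
	if (IS_ERR(bridge))
		return PTR_ERR(bridge);

	return drm_bridge_attach(encoder, bridge, NULL, 0);
}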

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/drm_bridge.c | 42 
 drivers/gpu/drm/drm_of.c |  3 +++
 include/drm/drm_bridge.h |  2 ++
 3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index a8ed66751c2d..10ddca4638b0 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "drm_crtc_internal.h"
@@ -51,10 +52,8 @@
  *
  * Display drivers are responsible for linking encoders with the first bridge
  * in the chains. This is done by acquiring the appropriate bridge with
- * of_drm_find_bridge() or drm_of_find_panel_or_bridge(), or creating it for a
- * panel with drm_panel_bridge_add_typed() (or the managed version
- * devm_drm_panel_bridge_add_typed()). Once acquired, the bridge shall be
- * attached to the encoder with a call to drm_bridge_attach().
+ * devm_drm_of_get_bridge(). Once acquired, the bridge shall be attached to the
+ * encoder with a call to drm_bridge_attach().
  *
  * Bridges are responsible for linking themselves with the next bridge in the
  * chain, if any. This is done the same way as for encoders, with the call to
@@ -1233,6 +1232,41 @@ struct drm_bridge *of_drm_find_bridge(struct device_node 
*np)
return NULL;
 }
 EXPORT_SYMBOL(of_drm_find_bridge);
+
+/**
+ * devm_drm_of_get_bridge - Return next bridge in the chain
+ * @dev: device to tie the bridge lifetime to
+ * @np: device tree node containing encoder output ports
+ * @port: port in the device tree node
+ * @endpoint: endpoint in the device tree node
+ *
+ * Given a DT node's port and endpoint number, finds the connected node
+ * and returns the associated bridge if any, or creates and returns a
+ * drm panel bridge instance if a panel is connected.
+ *
+ * Returns a pointer to the bridge if successful, or an error pointer
+ * otherwise.
+ */
+struct drm_bridge *devm_drm_of_get_bridge(struct device *dev,
+ struct device_node *np,
+ unsigned int port,
+ unsigned int endpoint)
+{
+   struct drm_bridge *bridge;
+   struct drm_panel *panel;
+   int ret;
+
+   ret = drm_of_find_panel_or_bridge(np, port, endpoint,
+ &panel, &bridge);
+   if (ret)
+   return ERR_PTR(ret);
+
+   if (panel)
+   bridge = devm_drm_panel_bridge_add(dev, panel);
+
+   return bridge;
+}
+EXPORT_SYMBOL(devm_drm_of_get_bridge);
 #endif
 
 MODULE_AUTHOR("Ajay Kumar ");
diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
index 997b8827fed2..37c34146eea8 100644
--- a/drivers/gpu/drm/drm_of.c
+++ b/drivers/gpu/drm/drm_of.c
@@ -231,6 +231,9 @@ EXPORT_SYMBOL_GPL(drm_of_encoder_active_endpoint);
  * return either the associated struct drm_panel or drm_bridge device. Either
  * @panel or @bridge must not be NULL.
  *
+ * This function is deprecated and should not be used in new drivers. Use
+ * devm_drm_of_get_bridge() instead.
+ *
  * Returns zero if successful, or one of the standard error codes if it fails.
  */
 int drm_of_find_panel_or_bridge(const struct device_node *np,
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
index 46bdfa48c413..f70c88ca96ef 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -911,6 +911,8 @@ struct drm_bridge *devm_drm_panel_bridge_add(struct device 
*dev,
 struct drm_bridge *devm_drm_panel_bridge_add_typed(struct device *dev,
   struct drm_panel *panel,
   u32 connector_type);
+struct drm_bridge *devm_drm_of_get_bridge(struct device *dev, struct 
device_node *node,
+   unsigned int port, unsigned int 
endpoint);
 struct drm_connector *drm_panel_bridge_connector(struct drm_bridge *bridge);
 #endif
 
-- 
2.31.1



[PATCH 0/3] drm/bridge: Create a function to abstract panels away

2021-09-10 Thread Maxime Ripard
Hi,

This series used to be part of the DSI probe order series, but got removed
since it wasn't useful there anymore.

However, I still believe there is value in moving towards merging bridges and
panels by only making encoders (or upstream bridges) manipulate bridges.

The first patch creates a new helper that does just this by looking for a
bridge and a panel, and, if a panel is found, creating a panel_bridge and
returning that bridge instead.

The next two patches convert the vc4 encoders to use it.

If it's accepted, I plan on converting all the relevant users over time.

Let me know what you think,
Maxime

Maxime Ripard (3):
  drm/bridge: Add a function to abstract away panels
  drm/vc4: dpi: Switch to devm_drm_of_get_bridge
  drm/vc4: dsi: Switch to devm_drm_of_get_bridge

 drivers/gpu/drm/drm_bridge.c  | 42 +++
 drivers/gpu/drm/drm_of.c  |  3 +++
 drivers/gpu/drm/vc4/vc4_dpi.c | 15 -
 drivers/gpu/drm/vc4/vc4_drv.c |  2 ++
 drivers/gpu/drm/vc4/vc4_dsi.c | 28 ---
 include/drm/drm_bridge.h  |  2 ++
 6 files changed, 53 insertions(+), 39 deletions(-)

-- 
2.31.1



Re: [PATCH v5 1/3] dt-bindings: Add YAML bindings for NVDEC

2021-09-10 Thread Rob Herring
On Fri, 10 Sep 2021 13:42:45 +0300, Mikko Perttunen wrote:
> Add YAML device tree bindings for NVDEC, now in a more appropriate
> place compared to the old textual Host1x bindings.
> 
> Signed-off-by: Mikko Perttunen 
> ---
> v5:
> * Changed from nvidia,instance to nvidia,host1x-class optional
>   property.
> * Added dma-coherent
> v4:
> * Fix incorrect compatibility string in 'if' condition
> v3:
> * Drop host1x bindings
> * Change read2 to read-1 in interconnect names
> v2:
> * Fix issues pointed out in v1
> * Add T194 nvidia,instance property
> ---
>  .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 ++
>  MAINTAINERS   |   1 +
>  2 files changed, 105 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:
./Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml:104:1:
 [warning] too many blank lines (2 > 1) (empty-lines)

dtschema/dtc warnings/errors:

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/1526459

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.



Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Tvrtko Ursulin



On 10/09/2021 06:33, Matt Roper wrote:

Our uncore MMIO functions for reading/writing registers have become very
complicated over time.  There's significant macro magic used to generate
several nearly-identical functions that only really differ in terms of
which platform-specific shadow register table they should check on write
operations.  We can significantly simplify our MMIO handlers by storing
a reference to the current platform's shadow table within the 'struct
intel_uncore' the same way we already do for forcewake; this allows us
to consolidate the multiple variants of each 'write' function down to
just a single 'fwtable' version that gets the shadow table out of the
uncore struct rather than hardcoding the name of a specific platform's
table.  We can do similar consolidation on the MMIO read side by
creating a single-entry forcewake table to replace the open-coded range
check they had been using previously.

The final patch of the series adds a new shadow table for DG2; this
becomes quite clean and simple now, given the refactoring in the first
five patches.


Tidy and it ends up saving kernel binary size.

However I am still undecided, because one thing to note is that the trade-off 
is source code and kernel text consolidation at the expense of more 
indirect calls at runtime and larger common read/write functions.


To expand, current code generates a bunch of per gen functions but in 
doing so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE 
and BSEARCH (from find_fw_domain) so at runtime each platform mmio 
read/write does not have to do indirect calls to do lookups.


It may or may not matter much in the grand scheme of things, but this 
trade-off is something to note in the cover letter I think.


Regards,

Tvrtko


Matt Roper (6):
   drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
   drm/i915/uncore: Associate shadow table with uncore
   drm/i915/uncore: Replace gen8 write functions with general fwtable
   drm/i915/uncore: Drop gen11/gen12 mmio write handlers
   drm/i915/uncore: Drop gen11 mmio read handlers
   drm/i915/dg2: Add DG2-specific shadow register table

  drivers/gpu/drm/i915/intel_uncore.c   | 190 ++
  drivers/gpu/drm/i915/intel_uncore.h   |   7 +
  drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
  3 files changed, 110 insertions(+), 88 deletions(-)



Re: [Intel-gfx] [PATCH 1/6] drm/i915/uncore: Convert gen6/gen7 read operations to fwtable

2021-09-10 Thread Tvrtko Ursulin




On 10/09/2021 06:33, Matt Roper wrote:

On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we
simply check whether the register offset is < 0x40000, and return
FORCEWAKE_RENDER if it is.  To prepare for upcoming refactoring, let's
define a single-entry forcewake table from [0x0, 0x3ffff] and switch
these platforms over to use the fwtable reader functions.

Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/intel_uncore.c | 11 ---
  1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index f9767054dbdf..7f92f12d95f2 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1064,6 +1064,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
__fwd; \
  })
  


Is __gen6_reg_read_fw_domains left orphaned somewhere around here or in 
a later patch?


Regards,

Tvrtko


+static const struct intel_forcewake_range __gen6_fw_ranges[] = {
+   GEN_FW_RANGE(0x0, 0x3ffff, FORCEWAKE_RENDER),
+};
+
  /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
  static const struct intel_forcewake_range __chv_fw_ranges[] = {
GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
@@ -1623,7 +1627,6 @@ __gen_read(func, 64)
  
  __gen_reg_read_funcs(gen11_fwtable);

  __gen_reg_read_funcs(fwtable);
-__gen_reg_read_funcs(gen6);
  
  #undef __gen_reg_read_funcs

  #undef GEN6_READ_FOOTER
@@ -2111,15 +2114,17 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 8) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_VALLEYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 6, 7)) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
}
  
  	uncore->pmic_bus_access_nb.notifier_call = i915_pmic_bus_access_notifier;




Re: [Intel-gfx] [PATCH 5/6] drm/i915/uncore: Drop gen11 mmio read handlers

2021-09-10 Thread Tvrtko Ursulin



On 10/09/2021 06:33, Matt Roper wrote:

Consolidate down to just a single 'fwtable' implementation.  For reads
we don't need to worry about shadow tables.  Also, the
NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation
can be dropped --- if a register is outside that range on one of the old
platforms, then it won't belong to any forcewake range and 0 will be
returned anyway.
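
As a minimal illustration of that reasoning (illustrative names and values
only, not the driver code): an offset that falls outside every range in the
table simply yields 0, i.e. "no forcewake needed", so the extra pre-check
adds nothing.

#include <stdint.h>
#include <stdio.h>

struct fw_range { uint32_t start, end, domains; };

static uint32_t find_fw_domain_sketch(const struct fw_range *table,
				      unsigned int count, uint32_t offset)
{
	unsigned int i;

	for (i = 0; i < count; i++)
		if (offset >= table[i].start && offset <= table[i].end)
			return table[i].domains;
	return 0;	/* not in any range: nothing to wake */
}

int main(void)
{
	/* hypothetical gen6-style single-entry table */
	static const struct fw_range table[] = { { 0x0, 0x3ffff, 0x1 } };

	printf("0x02030 -> %u\n", find_fw_domain_sketch(table, 1, 0x02030));
	printf("0x40004 -> %u\n", find_fw_domain_sketch(table, 1, 0x40004));
	return 0;
}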

Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/intel_uncore.c | 45 +++--
  1 file changed, 17 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index c181e74fbf43..95398cb69722 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -935,14 +935,6 @@ static const struct intel_forcewake_range 
__vlv_fw_ranges[] = {
  };
  
  #define __fwtable_reg_read_fw_domains(uncore, offset) \

-({ \
-   enum forcewake_domains __fwd = 0; \
-   if (NEEDS_FORCE_WAKE((offset))) \
-   __fwd = find_fw_domain(uncore, offset); \
-   __fwd; \
-})
-
-#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \
find_fw_domain(uncore, offset)


Looks like you can drop this macro and just call find_fw_domain, or do you 
think there is value in keeping it?


Regards,

Tvrtko

  
  /* *Must* be sorted by offset! See intel_shadow_table_check(). */

@@ -1577,33 +1569,30 @@ static inline void __force_wake_auto(struct 
intel_uncore *uncore,
___force_wake_auto(uncore, fw_domains);
  }
  
-#define __gen_read(func, x) \

+#define __gen_fwtable_read(x) \
  static u##x \
-func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \
+fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \
+{ \
enum forcewake_domains fw_engine; \
GEN6_READ_HEADER(x); \
-   fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \
+   fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \
if (fw_engine) \
__force_wake_auto(uncore, fw_engine); \
val = __raw_uncore_read##x(uncore, reg); \
GEN6_READ_FOOTER; \
  }
  
-#define __gen_reg_read_funcs(func) \

-static enum forcewake_domains \
-func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
-   return __##func##_reg_read_fw_domains(uncore, 
i915_mmio_reg_offset(reg)); \
-} \
-\
-__gen_read(func, 8) \
-__gen_read(func, 16) \
-__gen_read(func, 32) \
-__gen_read(func, 64)
+static enum forcewake_domains
+fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) {
+   return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg));
+}
  
-__gen_reg_read_funcs(gen11_fwtable);

-__gen_reg_read_funcs(fwtable);
+__gen_fwtable_read(8)
+__gen_fwtable_read(16)
+__gen_fwtable_read(32)
+__gen_fwtable_read(64)
  
-#undef __gen_reg_read_funcs

+#undef __gen_fwtable_read
  #undef GEN6_READ_FOOTER
  #undef GEN6_READ_HEADER
  
@@ -2069,22 +2058,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)

ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) >= 12) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 11) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);



[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered

2021-09-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213391

--- Comment #36 from Lahfa Samy (s...@lahfa.xyz) ---
Did anyone test whether this has been fixed in newer firmware updates, or
should we still stay on version 20210315.3568f96-3 ?

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH RESEND] drm/i915: Mark GPU wedging on driver unregister unrecoverable

2021-09-10 Thread Michał Winiarski

On 03.09.2021 16:28, Janusz Krzysztofik wrote:

The GPU wedged flag, now set on driver unregister to prevent further use
of the GPU, can then be cleared unintentionally when
__intel_gt_unset_wedged() is called before the flag is finally marked
unrecoverable.  We need to have it marked unrecoverable earlier.
Implement that by replacing a call to intel_gt_set_wedged() in
intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().

With the above in place, intel_gt_set_wedged_on_fini() is now called
twice on driver remove, second time from __intel_gt_disable().  This
seems harmless, while dropping intel_gt_set_wedged_on_fini() from
__intel_gt_disable() proved to break some driver probe error unwind
paths as well as mock selftest exit path.

Signed-off-by: Janusz Krzysztofik 
Cc: Michał Winiarski 


Reviewed-by: Michał Winiarski 

-Michał


---
Resending with Cc: dri-devel@lists.freedesktop.org as requested.

Thanks,
Janusz

  drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..173b53cb2b47 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -750,7 +750,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 * all in-flight requests so that we can quickly unbind the active
 * resources.
 */
-   intel_gt_set_wedged(gt);
+   intel_gt_set_wedged_on_fini(gt);
  
  	/* Scrub all HW state upon release */

with_intel_runtime_pm(gt->uncore->rpm, wakeref)





Re: [Intel-gfx] [PATCH 23/27] drm/i915/guc: Implement no mid batch preemption for multi-lrc

2021-09-10 Thread Tvrtko Ursulin



On 20/08/2021 23:44, Matthew Brost wrote:

For some users of multi-lrc, e.g. split frame, it isn't safe to preempt
mid BB. To safely enable preemption at the BB boundary, a handshake
between the parent and child is needed. This is implemented via custom
emit_bb_start & emit_fini_breadcrumb functions and enabled by
default if a context is configured via the set parallel extension.
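
As a purely illustrative userspace sketch of the kind of go/join handshake
being described (this is not the GuC implementation; names are made up):
children signal "join" when they reach the boundary, the parent waits for
all of them and then writes "go", which the children spin on before
proceeding.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NUM_CHILDREN 3

static _Atomic int go;
static _Atomic int join_slot[NUM_CHILDREN];

static void *child(void *arg)
{
	int idx = (int)(long)arg;

	/* ... child's batch runs up to the preemptible boundary ... */
	atomic_store(&join_slot[idx], 1);	/* join: "I'm at the boundary" */
	while (!atomic_load(&go))		/* spin until the parent says go */
		;
	printf("child %d resumes past the boundary\n", idx);
	return NULL;
}

int main(void)
{
	pthread_t threads[NUM_CHILDREN];
	long i;

	for (i = 0; i < NUM_CHILDREN; i++)
		pthread_create(&threads[i], NULL, child, (void *)i);

	/* parent: wait for every child to join, then release them together,
	 * so nothing is preempted in the middle of its batch */
	for (i = 0; i < NUM_CHILDREN; i++)
		while (!atomic_load(&join_slot[i]))
			;
	atomic_store(&go, 1);

	for (i = 0; i < NUM_CHILDREN; i++)
		pthread_join(threads[i], NULL);
	return 0;
}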


FWIW I think it's wrong to hardcode the requirements of a particular 
hardware generation's fixed media pipeline into the uapi. IMO the better 
solution was when the concept of parallel submission was decoupled from the 
no-preemption mid-batch preambles. Otherwise we might as well call the 
extension I915_CONTEXT_ENGINES_EXT_MEDIA_SPLIT_FRAME_SUBMIT or something.


Regards,

Tvrtko

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/intel_context.c   |   2 +-
  drivers/gpu/drm/i915/gt/intel_context_types.h |   3 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   2 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 283 +-
  4 files changed, 287 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 5615be32879c..2de62649e275 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -561,7 +561,7 @@ void intel_context_bind_parent_child(struct intel_context 
*parent,
GEM_BUG_ON(intel_context_is_child(child));
GEM_BUG_ON(intel_context_is_parent(child));
  
-	parent->guc_number_children++;

+   child->guc_child_index = parent->guc_number_children++;
list_add_tail(&child->guc_child_link,
  &parent->guc_child_list);
child->parent = parent;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 713d85b0b364..727f91e7f7c2 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -246,6 +246,9 @@ struct intel_context {
/** @guc_number_children: number of children if parent */
u8 guc_number_children;
  
+		/** @guc_child_index: index into guc_child_list if child */

+   u8 guc_child_index;
+
/**
 * @parent_page: page in context used by parent for work queue,
 * work queue descriptor
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 6cd26dc060d1..9f61cfa5566a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -188,7 +188,7 @@ struct guc_process_desc {
u32 wq_status;
u32 engine_presence;
u32 priority;
-   u32 reserved[30];
+   u32 reserved[36];
  } __packed;
  
  #define CONTEXT_REGISTRATION_FLAG_KMD	BIT(0)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 91330525330d..1a18f99bf12a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -11,6 +11,7 @@
  #include "gt/intel_context.h"
  #include "gt/intel_engine_pm.h"
  #include "gt/intel_engine_heartbeat.h"
+#include "gt/intel_gpu_commands.h"
  #include "gt/intel_gt.h"
  #include "gt/intel_gt_irq.h"
  #include "gt/intel_gt_pm.h"
@@ -366,10 +367,14 @@ static struct i915_priolist *to_priolist(struct rb_node 
*rb)
  
  /*

   * When using multi-lrc submission an extra page in the context state is
- * reserved for the process descriptor and work queue.
+ * reserved for the process descriptor, work queue, and preempt BB boundary
+ * handshake between the parent + children contexts.
   *
   * The layout of this page is below:
   * 0  guc_process_desc
+ * + sizeof(struct guc_process_desc)   child go
+ * + CACHELINE_BYTES   child join ...
+ * + CACHELINE_BYTES ...
   * ...unused
   * PAGE_SIZE / 2  work queue start
   * ...work queue
@@ -1785,6 +1790,30 @@ static int deregister_context(struct intel_context *ce, 
u32 guc_id, bool loop)
return __guc_action_deregister_context(guc, guc_id, loop);
  }
  
+static inline void clear_children_join_go_memory(struct intel_context *ce)

+{
+   u32 *mem = (u32 *)(__get_process_desc(ce) + 1);
+   u8 i;
+
+   for (i = 0; i < ce->guc_number_children + 1; ++i)
+   mem[i * (CACHELINE_BYTES / sizeof(u32))] = 0;
+}
+
+static inline u32 get_children_go_value(struct intel_context *ce)
+{
+   u32 *mem = (u32 *)(__get_process_desc(ce) + 1);
+
+   return mem[0];
+}
+
+static inline u32 get_children_join_value(struct intel_context *ce,
+ u8 child_index)
+{
+   u32 *mem = (u32 *)(__get_process_desc(ce) + 1);
+
+

Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-10 Thread Tvrtko Ursulin



On 20/08/2021 23:44, Matthew Brost wrote:

Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
  (Daniel Vetter)
   - Add kernel doc for new fields

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
  .../drm/i915/gt/intel_execlists_submission.c  |  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
  5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..4d790f9a65dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, 
u16 iir)
GEM_DEBUG_WARN_ON(iir);
  }
  
-static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)

+static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
+ u8 logical_instance)
  {
const struct engine_info *info = &intel_engines[id];
struct drm_i915_private *i915 = gt->i915;
@@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
  
  	engine->class = info->class;

engine->instance = info->instance;
+   engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
  
  	engine->props.heartbeat_interval_ms =

@@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
return info->engine_mask;
  }
  
+static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,

+u8 class, const u8 *map, u8 num_instances)
+{
+   int i, j;
+   u8 current_logical_id = 0;
+
+   for (j = 0; j < num_instances; ++j) {
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   if (!HAS_ENGINE(gt, i) ||
+   intel_engines[i].class != class)
+   continue;
+
+   if (intel_engines[i].instance == map[j]) {
+   logical_ids[intel_engines[i].instance] =
+   current_logical_id++;
+   break;
+   }
+   }
+   }
+}
+
+static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
+{
+   int i;
+   u8 map[MAX_ENGINE_INSTANCE + 1];
+
+   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
+   map[i] = i;


What's the point of the map array since it is 1:1 with instance?


+   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
+}
+
  /**
   * intel_engines_init_mmio() - allocate and prepare the Engine Command 
Streamers
   * @gt: pointer to struct intel_gt
@@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
const unsigned int engine_mask = init_engine_mask(gt);
unsigned int mask = 0;
-   unsigned int i;
+   unsigned int i, class;
+   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
int err;
  
  	drm_WARN_ON(&i915->drm, engine_mask == 0);

@@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
if (i915_inject_probe_failure(i915))
return -ENODEV;
  
-	for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {

-   if (!HAS_ENGINE(gt, i))
-   continue;
+   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
+   setup_logical_ids(gt, logical_ids, class);
  
-		err = intel_engine_setup(gt, i);

-   if (err)
-   goto cleanup;
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   u8 instance = intel_engines[i].instance;
+
+   if (intel_engines[i].class != class ||
+   !HAS_ENGINE(gt, i))
+   continue;
  
-		mask |= BIT(i);

+   err = intel_engine_setup(gt, i,
+logical_ids[instance]);
+   if (err)
+   goto cleanup;
+
+   mask |= BIT(i);


I still think there is a less clunky way to set this up, in less code and 
more readable at the same time. Like doing it in two passes so you can 
iterate the gt->engine_class[] array instead of having to implement a skip 
condition (both on class and HAS_ENGINE in two places) and also avoid 
walking the flat intel_engines array repeatedly.



+   }
}
  
  	/*

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index ed91bcff20eb..fddf35546b58 10

[PATCH v4 24/24] drm/exynos: dsi: Adjust probe order

2021-09-10 Thread Maxime Ripard
Without proper care and an agreement on how DSI host and device
drivers register their MIPI-DSI entities and potential components, we can
end up in a situation where the drivers can never probe.

Most drivers were taking evasive maneuvers to try to work around this,
but not all of them were following the same conventions, resulting in
various incompatibilities between DSI hosts and devices.

Now that we have a sequence agreed upon and documented, let's convert
exynos to it.
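
For reference, a skeleton of that agreed ordering for a hypothetical DSI
host driver (made-up names, error handling and the component bind/unbind
hooks elided; this is not the exynos code): register the DSI host at probe
time, and only add the component once a downstream device has attached.

#include <linux/component.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
#include <drm/drm_mipi_dsi.h>

struct foo_dsi {
	struct mipi_dsi_host host;
};

static const struct component_ops foo_dsi_component_ops = {
	/* .bind / .unbind elided */
};

static int foo_dsi_host_attach(struct mipi_dsi_host *host,
			       struct mipi_dsi_device *device)
{
	/* The downstream panel/bridge now exists: join the component graph. */
	return component_add(host->dev, &foo_dsi_component_ops);
}

static int foo_dsi_host_detach(struct mipi_dsi_host *host,
			       struct mipi_dsi_device *device)
{
	component_del(host->dev, &foo_dsi_component_ops);
	return 0;
}

static const struct mipi_dsi_host_ops foo_dsi_host_ops = {
	.attach = foo_dsi_host_attach,
	.detach = foo_dsi_host_detach,
};

static int foo_dsi_probe(struct platform_device *pdev)
{
	struct foo_dsi *dsi;

	dsi = devm_kzalloc(&pdev->dev, sizeof(*dsi), GFP_KERNEL);
	if (!dsi)
		return -ENOMEM;

	dsi->host.dev = &pdev->dev;
	dsi->host.ops = &foo_dsi_host_ops;
	platform_set_drvdata(pdev, dsi);

	/* Make the DSI bus available first so the downstream device can
	 * probe and call our ->attach(); component_add() happens there. */
	return mipi_dsi_host_register(&dsi->host);
}

static int foo_dsi_remove(struct platform_device *pdev)
{
	struct foo_dsi *dsi = platform_get_drvdata(pdev);

	mipi_dsi_host_unregister(&dsi->host);
	return 0;
}

static struct platform_driver foo_dsi_driver = {
	.probe = foo_dsi_probe,
	.remove = foo_dsi_remove,
	.driver = { .name = "foo-dsi" },
};
module_platform_driver(foo_dsi_driver);
MODULE_LICENSE("GPL");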

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index e39fac889edc..dfda2b259c44 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -1529,6 +1529,7 @@ static const struct drm_encoder_helper_funcs 
exynos_dsi_encoder_helper_funcs = {
 
 MODULE_DEVICE_TABLE(of, exynos_dsi_of_match);
 
+static const struct component_ops exynos_dsi_component_ops;
 static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
  struct mipi_dsi_device *device)
 {
@@ -1536,6 +1537,7 @@ static int exynos_dsi_host_attach(struct mipi_dsi_host 
*host,
struct drm_encoder *encoder = &dsi->encoder;
struct drm_device *drm = encoder->dev;
struct drm_bridge *out_bridge;
+   struct device *dev = host->dev;
 
out_bridge  = of_drm_find_bridge(device->dev.of_node);
if (out_bridge) {
@@ -1585,7 +1587,7 @@ static int exynos_dsi_host_attach(struct mipi_dsi_host 
*host,
if (drm->mode_config.poll_enabled)
drm_kms_helper_hotplug_event(drm);
 
-   return 0;
+   return component_add(dev, &exynos_dsi_component_ops);
 }
 
 static int exynos_dsi_host_detach(struct mipi_dsi_host *host,
@@ -1593,6 +1595,9 @@ static int exynos_dsi_host_detach(struct mipi_dsi_host 
*host,
 {
struct exynos_dsi *dsi = host_to_dsi(host);
struct drm_device *drm = dsi->encoder.dev;
+   struct device *dev = host->dev;
+
+   component_del(dev, &exynos_dsi_component_ops);
 
if (dsi->panel) {
mutex_lock(&drm->mode_config.mutex);
@@ -1716,7 +1721,7 @@ static int exynos_dsi_bind(struct device *dev, struct 
device *master,
of_node_put(in_bridge_node);
}
 
-   return mipi_dsi_host_register(&dsi->dsi_host);
+   return 0;
 }
 
 static void exynos_dsi_unbind(struct device *dev, struct device *master,
@@ -1726,8 +1731,6 @@ static void exynos_dsi_unbind(struct device *dev, struct 
device *master,
struct drm_encoder *encoder = &dsi->encoder;
 
exynos_dsi_disable(encoder);
-
-   mipi_dsi_host_unregister(&dsi->dsi_host);
 }
 
 static const struct component_ops exynos_dsi_component_ops = {
@@ -1821,7 +1824,7 @@ static int exynos_dsi_probe(struct platform_device *pdev)
 
pm_runtime_enable(dev);
 
-   ret = component_add(dev, &exynos_dsi_component_ops);
+   ret = mipi_dsi_host_register(&dsi->dsi_host);
if (ret)
goto err_disable_runtime;
 
@@ -1835,10 +1838,12 @@ static int exynos_dsi_probe(struct platform_device 
*pdev)
 
 static int exynos_dsi_remove(struct platform_device *pdev)
 {
+   struct exynos_dsi *dsi = platform_get_drvdata(pdev);
+
+   mipi_dsi_host_unregister(&dsi->dsi_host);
+
pm_runtime_disable(&pdev->dev);
 
-   component_del(&pdev->dev, &exynos_dsi_component_ops);
-
return 0;
 }
 
-- 
2.31.1



[PATCH v4 23/24] drm/kirin: dsi: Adjust probe order

2021-09-10 Thread Maxime Ripard
Without proper care and an agreement on how DSI host and device
drivers register their MIPI-DSI entities and potential components, we can
end up in a situation where the drivers can never probe.

Most drivers were taking evasive maneuvers to try to work around this,
but not all of them were following the same conventions, resulting in
various incompatibilities between DSI hosts and devices.

Now that we have a sequence agreed upon and documented, let's convert
kirin to it.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c | 27 +++-
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c 
b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
index 952cfdb1961d..be20c2ffe798 100644
--- a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
+++ b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
@@ -720,10 +720,13 @@ static int dw_drm_encoder_init(struct device *dev,
return 0;
 }
 
+static const struct component_ops dsi_ops;
 static int dsi_host_attach(struct mipi_dsi_host *host,
   struct mipi_dsi_device *mdsi)
 {
struct dw_dsi *dsi = host_to_dsi(host);
+   struct device *dev = host->dev;
+   int ret;
 
if (mdsi->lanes < 1 || mdsi->lanes > 4) {
DRM_ERROR("dsi device params invalid\n");
@@ -734,13 +737,20 @@ static int dsi_host_attach(struct mipi_dsi_host *host,
dsi->format = mdsi->format;
dsi->mode_flags = mdsi->mode_flags;
 
+   ret = component_add(dev, &dsi_ops);
+   if (ret)
+   return ret;
+
return 0;
 }
 
 static int dsi_host_detach(struct mipi_dsi_host *host,
   struct mipi_dsi_device *mdsi)
 {
-   /* do nothing */
+   struct device *dev = host->dev;
+
+   component_del(dev, &dsi_ops);
+
return 0;
 }
 
@@ -785,10 +795,6 @@ static int dsi_bind(struct device *dev, struct device 
*master, void *data)
if (ret)
return ret;
 
-   ret = dsi_host_init(dev, dsi);
-   if (ret)
-   return ret;
-
ret = dsi_bridge_init(drm_dev, dsi);
if (ret)
return ret;
@@ -859,12 +865,19 @@ static int dsi_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, data);
 
-   return component_add(&pdev->dev, &dsi_ops);
+   ret = dsi_host_init(&pdev->dev, dsi);
+   if (ret)
+   return ret;
+
+   return 0;
 }
 
 static int dsi_remove(struct platform_device *pdev)
 {
-   component_del(&pdev->dev, &dsi_ops);
+   struct dsi_data *data = platform_get_drvdata(pdev);
+   struct dw_dsi *dsi = &data->dsi;
+
+   mipi_dsi_host_unregister(&dsi->host);
 
return 0;
 }
-- 
2.31.1



[PATCH v4 22/24] drm/bridge: tc358775: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.
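
A sketch of what that looks like for a hypothetical I2C-attached bridge
(made-up names, abbreviated error handling, driver registration and the
bridge funcs omitted; this is not the tc358775 code): the DSI device is
registered and attached right after drm_bridge_add(), inside probe.

#include <linux/err.h>
#include <linux/i2c.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <drm/drm_bridge.h>
#include <drm/drm_mipi_dsi.h>

struct foo_bridge {
	struct drm_bridge bridge;
	struct device_node *host_node;	/* parsed from DT in a real driver */
};

static const struct drm_bridge_funcs foo_bridge_funcs = {
	/* hooks elided */
};

static int foo_bridge_probe(struct i2c_client *client,
			    const struct i2c_device_id *id)
{
	struct device *dev = &client->dev;
	const struct mipi_dsi_device_info info = {
		.type = "foo-bridge",
		.channel = 0,
		.node = NULL,
	};
	struct mipi_dsi_host *host;
	struct mipi_dsi_device *dsi;
	struct foo_bridge *ctx;
	int ret;

	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
	if (!ctx)
		return -ENOMEM;

	ctx->bridge.funcs = &foo_bridge_funcs;
	ctx->bridge.of_node = dev->of_node;
	drm_bridge_add(&ctx->bridge);

	/* Register and attach the secondary DSI device here, at probe time,
	 * rather than from the bridge ->attach() hook. */
	host = of_find_mipi_dsi_host_by_node(ctx->host_node);
	if (!host) {
		ret = -EPROBE_DEFER;		/* DSI host not probed yet */
		goto err_remove_bridge;
	}

	dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
	if (IS_ERR(dsi)) {
		ret = PTR_ERR(dsi);
		goto err_remove_bridge;
	}

	ret = devm_mipi_dsi_attach(dev, dsi);
	if (ret)
		goto err_remove_bridge;

	return 0;

err_remove_bridge:
	drm_bridge_remove(&ctx->bridge);
	return ret;
}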

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/tc358775.c | 37 +--
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358775.c 
b/drivers/gpu/drm/bridge/tc358775.c
index 35e66d1b6456..2c76331b251d 100644
--- a/drivers/gpu/drm/bridge/tc358775.c
+++ b/drivers/gpu/drm/bridge/tc358775.c
@@ -594,11 +594,26 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
enum drm_bridge_attach_flags flags)
 {
struct tc_data *tc = bridge_to_tc(bridge);
+
+   /* Attach the panel-bridge to the dsi bridge */
+   return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
+&tc->bridge, flags);
+}
+
+static const struct drm_bridge_funcs tc_bridge_funcs = {
+   .attach = tc_bridge_attach,
+   .pre_enable = tc_bridge_pre_enable,
+   .enable = tc_bridge_enable,
+   .mode_valid = tc_mode_valid,
+   .post_disable = tc_bridge_post_disable,
+};
+
+static int tc_attach_host(struct tc_data *tc)
+{
struct device *dev = &tc->i2c->dev;
struct mipi_dsi_host *host;
struct mipi_dsi_device *dsi;
int ret;
-
const struct mipi_dsi_device_info info = { .type = "tc358775",
.channel = 0,
.node = NULL,
@@ -628,19 +643,9 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
return ret;
}
 
-   /* Attach the panel-bridge to the dsi bridge */
-   return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
-&tc->bridge, flags);
+   return 0;
 }
 
-static const struct drm_bridge_funcs tc_bridge_funcs = {
-   .attach = tc_bridge_attach,
-   .pre_enable = tc_bridge_pre_enable,
-   .enable = tc_bridge_enable,
-   .mode_valid = tc_mode_valid,
-   .post_disable = tc_bridge_post_disable,
-};
-
 static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
 {
struct device *dev = &client->dev;
@@ -704,7 +709,15 @@ static int tc_probe(struct i2c_client *client, const 
struct i2c_device_id *id)
 
i2c_set_clientdata(client, tc);
 
+   ret = tc_attach_host(tc);
+   if (ret)
+   goto err_bridge_remove;
+
return 0;
+
+err_bridge_remove:
+   drm_bridge_remove(&tc->bridge);
+   return ret;
 }
 
 static int tc_remove(struct i2c_client *client)
-- 
2.31.1



[PATCH v4 21/24] drm/bridge: tc358775: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/tc358775.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358775.c 
b/drivers/gpu/drm/bridge/tc358775.c
index 2272adcc5b4a..35e66d1b6456 100644
--- a/drivers/gpu/drm/bridge/tc358775.c
+++ b/drivers/gpu/drm/bridge/tc358775.c
@@ -610,11 +610,10 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
return -EPROBE_DEFER;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
dev_err(dev, "failed to create dsi device\n");
-   ret = PTR_ERR(dsi);
-   goto err_dsi_device;
+   return PTR_ERR(dsi);
}
 
tc->dsi = dsi;
@@ -623,19 +622,15 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
dsi->format = MIPI_DSI_FMT_RGB888;
dsi->mode_flags = MIPI_DSI_MODE_VIDEO;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
dev_err(dev, "failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   return ret;
}
 
/* Attach the panel-bridge to the dsi bridge */
return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
 &tc->bridge, flags);
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
-err_dsi_device:
-   return ret;
 }
 
 static const struct drm_bridge_funcs tc_bridge_funcs = {
-- 
2.31.1



[PATCH v4 20/24] drm/bridge: sn65dsi86: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 74 ++-
 1 file changed, 38 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index b5662269ff95..7f71329536a2 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -667,58 +667,27 @@ static struct ti_sn65dsi86 *bridge_to_ti_sn65dsi86(struct 
drm_bridge *bridge)
return container_of(bridge, struct ti_sn65dsi86, bridge);
 }
 
-static int ti_sn_bridge_attach(struct drm_bridge *bridge,
-  enum drm_bridge_attach_flags flags)
+static int ti_sn_attach_host(struct ti_sn65dsi86 *pdata)
 {
int ret, val;
-   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
struct mipi_dsi_host *host;
struct mipi_dsi_device *dsi;
struct device *dev = pdata->dev;
const struct mipi_dsi_device_info info = { .type = "ti_sn_bridge",
   .channel = 0,
   .node = NULL,
-};
+   };
 
-   if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
-   DRM_ERROR("Fix bridge driver to make connector optional!");
-   return -EINVAL;
-   }
-
-   pdata->aux.drm_dev = bridge->dev;
-   ret = drm_dp_aux_register(&pdata->aux);
-   if (ret < 0) {
-   drm_err(bridge->dev, "Failed to register DP AUX channel: %d\n", 
ret);
-   return ret;
-   }
-
-   ret = ti_sn_bridge_connector_init(pdata);
-   if (ret < 0)
-   goto err_conn_init;
-
-   /*
-* TODO: ideally finding host resource and dsi dev registration needs
-* to be done in bridge probe. But some existing DSI host drivers will
-* wait for any of the drm_bridge/drm_panel to get added to the global
-* bridge/panel list, before completing their probe. So if we do the
-* dsi dev registration part in bridge probe, before populating in
-* the global bridge list, then it will cause deadlock as dsi host probe
-* will never complete, neither our bridge probe. So keeping it here
-* will satisfy most of the existing host drivers. Once the host driver
-* is fixed we can move the below code to bridge probe safely.
-*/
host = of_find_mipi_dsi_host_by_node(pdata->host_node);
if (!host) {
DRM_ERROR("failed to find dsi host\n");
-   ret = -ENODEV;
-   goto err_dsi_host;
+   return -ENODEV;
}
 
dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
DRM_ERROR("failed to create dsi device\n");
-   ret = PTR_ERR(dsi);
-   goto err_dsi_host;
+   return PTR_ERR(dsi);
}
 
/* TODO: setting to 4 MIPI lanes always for now */
@@ -736,10 +705,35 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
DRM_ERROR("failed to attach dsi to host\n");
-   goto err_dsi_host;
+   return ret;
}
pdata->dsi = dsi;
 
+   return 0;
+}
+
+static int ti_sn_bridge_attach(struct drm_bridge *bridge,
+  enum drm_bridge_attach_flags flags)
+{
+   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
+   int ret;
+
+   if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
+   DRM_ERROR("Fix bridge driver to make connector optional!");
+   return -EINVAL;
+   }
+
+   pdata->aux.drm_dev = bridge->dev;
+   ret = drm_dp_aux_register(&pdata->aux);
+   if (ret < 0) {
+   drm_err(bridge->dev, "Failed to register DP AUX channel: %d\n", 
ret);
+   return ret;
+   }
+
+   ret = ti_sn_bridge_connector_init(pdata);
+   if (ret < 0)
+   goto err_conn_init;
+
/* We never want the next bridge to *also* create a connector: */
flags |= DRM_BRIDGE_ATTACH_NO_CONNECTOR;
 
@@ -1223,7 +1217,15 @@ static int ti_sn_bridge_probe(struct auxiliary_device 
*adev,
 
drm_bridge_add(&pdata->bridge);
 
+   ret = ti_sn_attach_host(pdata);
+   if (ret)
+   goto err_remove_bridge;
+
return 0;
+
+err_remove_bridge:
+   drm_bridge_remove(&pdata->bridge);
+   return ret;
 }
 
 static void ti_sn_bridge_remove(struct auxiliary_device *adev)
-- 
2.31.1



[PATCH v4 19/24] drm/bridge: sn65dsi86: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 41d48a393e7f..b5662269ff95 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -674,6 +674,7 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
struct mipi_dsi_host *host;
struct mipi_dsi_device *dsi;
+   struct device *dev = pdata->dev;
const struct mipi_dsi_device_info info = { .type = "ti_sn_bridge",
   .channel = 0,
   .node = NULL,
@@ -713,7 +714,7 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
goto err_dsi_host;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
DRM_ERROR("failed to create dsi device\n");
ret = PTR_ERR(dsi);
@@ -726,16 +727,16 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
dsi->mode_flags = MIPI_DSI_MODE_VIDEO;
 
/* check if continuous dsi clock is required or not */
-   pm_runtime_get_sync(pdata->dev);
+   pm_runtime_get_sync(dev);
regmap_read(pdata->regmap, SN_DPPLL_SRC_REG, &val);
-   pm_runtime_put_autosuspend(pdata->dev);
+   pm_runtime_put_autosuspend(dev);
if (!(val & DPPLL_CLK_SRC_DSICLK))
dsi->mode_flags |= MIPI_DSI_CLOCK_NON_CONTINUOUS;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
DRM_ERROR("failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   goto err_dsi_host;
}
pdata->dsi = dsi;
 
@@ -746,14 +747,10 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
ret = drm_bridge_attach(bridge->encoder, pdata->next_bridge,
&pdata->bridge, flags);
if (ret < 0)
-   goto err_dsi_detach;
+   goto err_dsi_host;
 
return 0;
 
-err_dsi_detach:
-   mipi_dsi_detach(dsi);
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
 err_dsi_host:
drm_connector_cleanup(&pdata->connector);
 err_conn_init:
@@ -1236,11 +1233,6 @@ static void ti_sn_bridge_remove(struct auxiliary_device 
*adev)
if (!pdata)
return;
 
-   if (pdata->dsi) {
-   mipi_dsi_detach(pdata->dsi);
-   mipi_dsi_device_unregister(pdata->dsi);
-   }
-
drm_bridge_remove(&pdata->bridge);
 
of_node_put(pdata->host_node);
-- 
2.31.1



[PATCH v4 18/24] drm/bridge: sn65dsi83: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 80 +++
 1 file changed, 46 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index db4d39082705..f951eb19767b 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -245,40 +245,6 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
enum drm_bridge_attach_flags flags)
 {
struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
-   struct device *dev = ctx->dev;
-   struct mipi_dsi_device *dsi;
-   struct mipi_dsi_host *host;
-   int ret = 0;
-
-   const struct mipi_dsi_device_info info = {
-   .type = "sn65dsi83",
-   .channel = 0,
-   .node = NULL,
-   };
-
-   host = of_find_mipi_dsi_host_by_node(ctx->host_node);
-   if (!host) {
-   dev_err(dev, "failed to find dsi host\n");
-   return -EPROBE_DEFER;
-   }
-
-   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
-   if (IS_ERR(dsi)) {
-   return dev_err_probe(dev, PTR_ERR(dsi),
-"failed to create dsi device\n");
-   }
-
-   ctx->dsi = dsi;
-
-   dsi->lanes = ctx->dsi_lanes;
-   dsi->format = MIPI_DSI_FMT_RGB888;
-   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
-
-   ret = devm_mipi_dsi_attach(dev, dsi);
-   if (ret < 0) {
-   dev_err(dev, "failed to attach dsi to host\n");
-   return ret;
-   }
 
return drm_bridge_attach(bridge->encoder, ctx->panel_bridge,
 &ctx->bridge, flags);
@@ -646,6 +612,44 @@ static int sn65dsi83_parse_dt(struct sn65dsi83 *ctx, enum 
sn65dsi83_model model)
return 0;
 }
 
+static int sn65dsi83_host_attach(struct sn65dsi83 *ctx)
+{
+   struct device *dev = ctx->dev;
+   struct mipi_dsi_device *dsi;
+   struct mipi_dsi_host *host;
+   const struct mipi_dsi_device_info info = {
+   .type = "sn65dsi83",
+   .channel = 0,
+   .node = NULL,
+   };
+   int ret;
+
+   host = of_find_mipi_dsi_host_by_node(ctx->host_node);
+   if (!host) {
+   dev_err(dev, "failed to find dsi host\n");
+   return -EPROBE_DEFER;
+   }
+
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
+   if (IS_ERR(dsi))
+   return dev_err_probe(dev, PTR_ERR(dsi),
+"failed to create dsi device\n");
+
+   ctx->dsi = dsi;
+
+   dsi->lanes = ctx->dsi_lanes;
+   dsi->format = MIPI_DSI_FMT_RGB888;
+   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
+
+   ret = devm_mipi_dsi_attach(dev, dsi);
+   if (ret < 0) {
+   dev_err(dev, "failed to attach dsi to host: %d\n", ret);
+   return ret;
+   }
+
+   return 0;
+}
+
 static int sn65dsi83_probe(struct i2c_client *client,
   const struct i2c_device_id *id)
 {
@@ -686,7 +690,15 @@ static int sn65dsi83_probe(struct i2c_client *client,
ctx->bridge.of_node = dev->of_node;
drm_bridge_add(&ctx->bridge);
 
+   ret = sn65dsi83_host_attach(ctx);
+   if (ret)
+   goto err_remove_bridge;
+
return 0;
+
+err_remove_bridge:
+   drm_bridge_remove(&ctx->bridge);
+   return ret;
 }
 
 static int sn65dsi83_remove(struct i2c_client *client)
-- 
2.31.1



[PATCH v4 17/24] drm/bridge: sn65dsi83: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge but don't remove its driver.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index a32f70bc68ea..db4d39082705 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -262,7 +262,7 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
return -EPROBE_DEFER;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
return dev_err_probe(dev, PTR_ERR(dsi),
 "failed to create dsi device\n");
@@ -274,18 +274,14 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
dsi->format = MIPI_DSI_FMT_RGB888;
dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
dev_err(dev, "failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   return ret;
}
 
return drm_bridge_attach(bridge->encoder, ctx->panel_bridge,
 &ctx->bridge, flags);
-
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
-   return ret;
 }
 
 static void sn65dsi83_atomic_pre_enable(struct drm_bridge *bridge,
@@ -697,8 +693,6 @@ static int sn65dsi83_remove(struct i2c_client *client)
 {
struct sn65dsi83 *ctx = i2c_get_clientdata(client);
 
-   mipi_dsi_detach(ctx->dsi);
-   mipi_dsi_device_unregister(ctx->dsi);
drm_bridge_remove(&ctx->bridge);
of_node_put(ctx->host_node);
 
-- 
2.31.1


