date:20210910

Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Thomas Hellström

On Fri, 2021-09-10 at 19:03 +0200, Christian König wrote:
> Am 10.09.21 um 17:30 schrieb Thomas Hellström:
> > On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:
> > > 
> > > Am 10.09.21 um 15:15 schrieb Thomas Hellström:
> > > > Both the provider (resource manager) and the consumer (the TTM
> > > > driver)
> > > > want to subclass struct ttm_resource. Since this is left for
> > > > the
> > > > resource
> > > > manager, we need to provide a private pointer for the TTM
> > > > driver.
> > > > 
> > > > Provide a struct ttm_resource_private for the driver to
> > > > subclass
> > > > for
> > > > data with the same lifetime as the struct ttm_resource: In the
> > > > i915
> > > > case
> > > > it will, for example, be an sg-table and radix tree into the
> > > > LMEM
> > > > /VRAM pages that currently are awkwardly attached to the GEM
> > > > object.
> > > > 
> > > > Provide an ops structure for associated ops (Which is only
> > > > destroy() ATM)
> > > > It might seem pointless to provide a separate ops structure,
> > > > but
> > > > Linus
> > > > has previously made it clear that that's the norm.
> > > > 
> > > > After careful audit one could perhaps also on a per-driver
> > > > basis
> > > > replace the delete_mem_notify() TTM driver callback with the
> > > > above
> > > > destroy function.
> > > Well this is a really big NAK to this approach.
> > > 
> > > If you need to attach some additional information to the resource
> > > then
> > > implement your own resource manager like everybody else does.
> > Well this was the long discussion we had back then when the
> > resource
> > managers started to derive from struct resource and I was under
> > the
> > impression that we had come to an agreement about the different
> > use-
> > cases here, and this was my main concern.
> 
> Ok, then we somehow didn't understand each other.
> 
> > I mean, it's a pretty big layer violation to do that for this use-
> > case.
> 
> Well exactly that's the point. TTM should not have a layer design in
> the 
> first place.
> 
> Devices, BOs, resources etc.. are base classes which should implement
> a 
> base functionality which is then extended by the drivers to implement
> the driver specific functionality.
> 
> That is a component based approach, and not layered at all.
> 
> > The TTM resource manager doesn't want to know about this data at
> > all,
> > it's private to the ttm resource user layer and the resource
> > manager
> > works perfectly well without it. (I assume the other drivers that
> > implement their own resource managers need the data that the
> > subclassing provides?)
> 
> Yes, that's exactly why we have the subclassing.
> 
> > The fundamental problem here is that there are two layers wanting
> > to
> > subclass struct ttm_resource. That means one layer gets to do that,
> > the
> > second gets to use a private pointer, (which in turn can provide
> > yet
> > another private pointer to a potential third layer). With your
> > suggestion, the second layer instead is forced to subclass each
> > subclassed instance that the first layer provides?
> 
> Well completely drop the layer approach/thinking here.
> 
> The resource is an object with a base class. The base class
> implements 
> the interface TTM needs to handle the object, e.g.
> create/destroy/debug 
> etc...
> 
> Then we need to subclass this object because without any additional 
> information the object is pretty pointless.
> 
> One possibility for this is to use the range manager to implement 
> something drm_mm based. BTW: We should probably rename that to
> something 
> like ttm_res_drm_mm or similar.

Sure I'm all in on that, but my point is this becomes pretty awkward
because the reusable code already subclasses struct ttm_resource. Let
me give you an example:

Prereqs:
1) We want to be able to re-use resource manager implementations among
drivers.
2) A driver might want to re-use multiple implementations and have
identical data "struct i915_data" attached to both

With your suggestion that combination of prereqs would look like:

struct i915_resource {
/* Reason why we subclass */
struct i915_data my_data;

/* 
 * Uh this is awkward. We need to do this because these   
 * already subclassed struct ttm_resource.
 */
struct ttm_resource *resource;
union {
struct ttm_range_mgr_node range;
struct i915_ttm_buddy_resource buddy;
};
};

And I can't make it look like

struct i915_resource {
struct i915_data my_data;
struct ttm_resource *resource;
};

Without that private back pointer.
 
But what I'd *really* want is:

struct i915_resource {
struct i915_data my_data;
struct ttm_resource resource;
};

This would be identical to how we subclass a struct ttm_buffer_object
or a struct ttm_tt. But it can't look like this because then we can't
reuse existing implementations that *already subclass* struct
ttm_resou
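
For illustration only, here is a minimal userspace sketch of the embedded-base subclassing that the last struct above asks for. All demo_* names are hypothetical and are not part of TTM or i915; container_of is re-implemented with offsetof as an assumption about how a driver would get from the base object back to its own data.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical base class, standing in for struct ttm_resource. */
struct demo_resource {
	unsigned long start;
	unsigned long num_pages;
};

/* Hypothetical driver data, standing in for "struct i915_data". */
struct demo_driver_data {
	int sg_entries;
};

/* The driver subclass embeds the base object instead of pointing at it. */
struct demo_driver_resource {
	struct demo_driver_data my_data;
	struct demo_resource resource;
};

/* Upcast from the base object to the driver subclass without a back pointer. */
static struct demo_driver_resource *
to_demo_driver_resource(struct demo_resource *res)
{
	return demo_container_of(res, struct demo_driver_resource, resource);
}
```

With the base object embedded, no private back pointer is needed. The cost, as the message argues, is that a reusable manager which already embeds the base object in its own node cannot be combined with this layout.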

[PATCH v2] drm/rockchip: cdn-dp-core: Fix cdn_dp_resume unused warning

2021-09-10 Thread Palmer Dabbelt

From: Palmer Dabbelt 

cdn_dp_resume is only used under PM_SLEEP, and now that it's static an
unused function warning is triggered under !PM_SLEEP.  This marks the
function as possibly unused, to avoid triggering compiler warnings.

Fixes: 7c49abb4c2f8 ("drm/rockchip: cdn-dp-core: Make cdn_dp_core_suspend/resume static")
Reviewed-by: Geert Uytterhoeven 
Signed-off-by: Palmer Dabbelt 
---
This is breaking my builds and looks like it'll land after -rc1, so I've put it
on a shared tag for-rockchip-cdn_dp_resume-v2 which will let me pull it in to
my fixes.  LMK if you guys want me to send this up on my own, but I'm assuming
that the drm/rockchip folks will handle it.
---
 drivers/gpu/drm/rockchip/cdn-dp-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c
index 8ab3247dbc4a..13c6b857158f 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -1123,7 +1123,7 @@ static int cdn_dp_suspend(struct device *dev)
return ret;
 }
 
-static int cdn_dp_resume(struct device *dev)
+static __maybe_unused int cdn_dp_resume(struct device *dev)
 {
struct cdn_dp_device *dp = dev_get_drvdata(dev);
 
-- 
2.33.0.309.g3052b89438-goog
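
As a rough userspace illustration of what the one-line change does: the kernel's __maybe_unused expands to a compiler attribute that suppresses the unused-function warning. The sketch below assumes GCC or Clang; DEMO_MAYBE_UNUSED, DEMO_PM_SLEEP and the demo_* functions are hypothetical stand-ins, not the driver's code.

```c
/* Userspace stand-in for the kernel's __maybe_unused (GCC/Clang attribute). */
#define DEMO_MAYBE_UNUSED __attribute__((__unused__))

/*
 * Leave DEMO_PM_SLEEP undefined to model a !PM_SLEEP build: the only caller
 * below disappears, and without the attribute -Wunused-function would fire.
 */
static DEMO_MAYBE_UNUSED int demo_resume(int state)
{
	return state + 1;
}

#ifdef DEMO_PM_SLEEP
/* The sole user of demo_resume(), compiled only when PM sleep is enabled. */
int demo_pm_resume(int state)
{
	return demo_resume(state);
}
#endif
```

Building this with -Wall and without -DDEMO_PM_SLEEP should stay quiet. Removing the attribute is expected to bring back the unused-function warning that the patch addresses.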

Intel UHD resolutions

2021-09-10 Thread Randy Dunlap


Hi,

I would like to use QHD resolution (2560x1440) with my shiny new
computer and display. That resolution works if I boot Windows 10
(cough).

What do I need to do to use that resolution in Linux?

I first tried openSUSE 15.3 (kernel 5.3.18-59.19-default)
then I build a v5.14 kernel and tried that.
Both of them max out at FHD (1920x1080).

I am booting with "i915.force_probe=4c8a" on the kernel command line.

My desktop is XFCE4.

CPU is:
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 167
model name  : 11th Gen Intel(R) Core(TM) i9-11900 @ 2.50GHz
stepping: 1
microcode   : 0x40
cpu MHz : 1021.742
cache size  : 16384 KB
physical id : 0
siblings: 16

with an H470 chipset. (ASRock DeskMini H470)

00:02.0 VGA compatible controller: Intel Corporation RocketLake-S GT1 [UHD Graphics 750] (rev 04)

or verbose:

00:02.0 VGA compatible controller: Intel Corporation RocketLake-S GT1 [UHD Graphics 750] (rev 04) (prog-if 00 [VGA controller])
Subsystem: ASRock Incorporation Device 4c8a
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- 
Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- 
TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, 
OBFF Disabled
 AtomicOpsCtl: ReqEn-
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
Address: fee00018  Data: 
Masking:   Pending: 
Capabilities: [d0] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [100 v1] Process Address Space ID (PASID)
PASIDCap: Exec- Priv-, Max PASID Width: 14
PASIDCtl: Enable- Exec- Priv-
Capabilities: [200 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable-, Smallest Translation Unit: 00
Capabilities: [300 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 8000, Page Request Allocation: 

Kernel driver in use: i915
Kernel modules: i915


thanks.
--
~Randy

[PATCH] drm/i915: fix odd_ptr_err.cocci warnings

2021-09-10 Thread kernel test robot

From: kernel test robot 

drivers/gpu/drm/i915/display/intel_dpt.c:145:6-12: inconsistent IS_ERR and 
PTR_ERR on line 146.

 PTR_ERR should access the value just tested by IS_ERR

Semantic patch information:
 There can be false positives in the patch case, where it is the call to
 IS_ERR that is wrong.

Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci

CC: Maarten Lankhorst 
Reported-by: kernel test robot 
Signed-off-by: kernel test robot 
---

url:
https://github.com/0day-ci/linux/commits/Maarten-Lankhorst/drm-i915-Add-ww-context-to-intel_dpt_pin/20210910-162231
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
:: branch date: 17 hours ago
:: commit date: 17 hours ago

 intel_dpt.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -143,7 +143,7 @@ struct i915_vma *intel_dpt_pin(struct i9
i915_vma_unpin(vma);
 
if (IS_ERR(iomem)) {
-   err = PTR_ERR(vma);
+   err = PTR_ERR(iomem);
continue;
}
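
To make the warning concrete, here is a small userspace sketch of the error-pointer convention. The demo_* helpers are simplified re-implementations written for this example (assuming the usual "top 4095 addresses encode an errno" scheme), not the kernel's err.h.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the range reserved for encoded errors. */
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Decode the errno value from an error pointer. */
static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/*
 * Mirrors the shape of the fixed hunk: the error code is decoded from the
 * pointer that was just tested (iomem), not from an unrelated valid one (vma).
 */
static long demo_check(void *vma, void *iomem)
{
	(void)vma; /* valid pointer; decoding it would yield a bogus "error" */

	if (demo_is_err(iomem))
		return demo_ptr_err(iomem);
	return 0;
}
```

Decoding the wrong pointer would turn a valid address into a meaningless error code, which is the inconsistency the semantic patch flags.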

Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Yokoyama, Caz

On Fri, 2021-09-10 at 14:52 -0700, Lucas De Marchi wrote:
> On Fri, Sep 10, 2021 at 09:14:37PM +, Yokoyama, Caz wrote:
> > On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote:
> > > On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:
> > > > We shouldn't be using debugfs_ namespace for this
> > > > functionality.
> > > > Rename
> > > > debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
> > > > functions, defines and structs follow suit.
> > > > 
> > > > Signed-off-by: Lucas De Marchi 
> > > > ---
> > > > drivers/gpu/drm/i915/Makefile  |  2 +-
> > > > drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -
> > > > 
> > > > -
> > > > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
> > > > .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
> > > > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14
> > > > ++
> > > > 5 files changed, 19 insertions(+), 19 deletions(-)
> > > > delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c =>
> > > > intel_gt_pm_debugfs.c} (99%)
> > > > create mode 100644
> > > > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/Makefile
> > > > b/drivers/gpu/drm/i915/Makefile
> > > > index 232c9673a2e5..dd656f2d7721 100644
> > > > --- a/drivers/gpu/drm/i915/Makefile
> > > > +++ b/drivers/gpu/drm/i915/Makefile
> > > > @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
> > > > 
> > > > # "Graphics Technology" (aka we talk to the gpu)
> > > > gt-y += \
> > > > -   gt/debugfs_gt_pm.o \
> > > > gt/gen2_engine_cs.o \
> > > > gt/gen6_engine_cs.o \
> > > > gt/gen6_ppgtt.o \
> > > > @@ -103,6 +102,7 @@ gt-y += \
> > > > gt/intel_gt_engines_debugfs.o \
> > > > gt/intel_gt_irq.o \
> > > > gt/intel_gt_pm.o \
> > > > +   gt/intel_gt_pm_debugfs.o \
> > > > gt/intel_gt_pm_irq.o \
> > > > gt/intel_gt_requests.o \
> > > > gt/intel_gtt.o \
> > > > diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > deleted file mode 100644
> > > > index 4cf5f5c9da7d..
> > > > --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > > > +++ /dev/null
> > > > @@ -1,14 +0,0 @@
> > > > -/* SPDX-License-Identifier: MIT */
> > > > -/*
> > > > - * Copyright © 2019 Intel Corporation
> > > > - */
> > > > -
> > > > -#ifndef DEBUGFS_GT_PM_H
> > > > -#define DEBUGFS_GT_PM_H
> > > > -
> > > > -struct intel_gt;
> > > > -struct dentry;
> > > > -
> > > > -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry
> > > > *root);
> > > > -
> > > > -#endif /* DEBUGFS_GT_PM_H */
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > index e5d173c235a3..4096ee893b69 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > > > @@ -5,10 +5,10 @@
> > > > 
> > > > #include 
> > > > 
> > > > -#include "debugfs_gt_pm.h"
> > > > #include "i915_drv.h"
> > > > #include "intel_gt_debugfs.h"
> > > > #include "intel_gt_engines_debugfs.h"
> > > > +#include "intel_gt_pm_debugfs.h"
> > Why locate here? Why not just replace debugfs_gt_pm.h? Compile
> > error?
> 
> are you asking why I moved the include? Because sorting them
> alphabetically avoids big messes in these includes
In a patch, it is easier to see the change if the - and + lines are side by side.
Anyway, I honor and respect your decision.
-caz

> 
> Lucas De Marchi
> 
> > -caz
> > 
> > > > #include "intel_sseu_debugfs.h"
> > > > #include "uc/intel_uc_debugfs.h"
> > > > 
> > > > @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct
> > > > intel_gt
> > > > *gt)
> > > > return;
> > > > 
> > > > intel_gt_engines_register_debugfs(gt, root);
> > > > -   debugfs_gt_pm_register(gt, root);
> > > > +   intel_gt_pm_register_debugfs(gt, root);
> > > 
> > > This is one case I usually don't know what convention to follow
> > > since
> > > it
> > > changes in different places.
> > > 
> > > I did it like _register_debugfs because of calls like
> > > intel_gt_init_scratch(), xxx_init_hw, etc. However here I see
> > > that
> > > just
> > > below we have intel_sseu_debugfs_register(), so maybe I should
> > > consider
> > > debugfs as part of the namespace?
> > > 
> > > Lucas De Marchi

Re: [Intel-gfx] [PATCH v9 15/17] drm/i915/pxp: add pxp debugfs

2021-09-10 Thread Teres Alexis, Alan Previn

Reviewed-by: Alan Previn  

..alan

On Fri, 2021-09-10 at 08:36 -0700, Daniele Ceraolo Spurio wrote:
> 2 debugfs files, one to query the current status of the pxp session and one
> to trigger an invalidation for testing.
> 
> v2: rename debugfs, fix date (Alan)
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Reviewed-by : Alan Previn 
> ---
>  drivers/gpu/drm/i915/Makefile|  1 +
>  drivers/gpu/drm/i915/gt/debugfs_gt.c |  2 +
>  drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 78 
>  drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h | 21 ++
>  4 files changed, 102 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 366e82cec44d..b46474ee1a1f 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -285,6 +285,7 @@ i915-y += i915_perf.o
>  i915-$(CONFIG_DRM_I915_PXP) += \
>   pxp/intel_pxp.o \
>   pxp/intel_pxp_cmd.o \
> + pxp/intel_pxp_debugfs.o \
>   pxp/intel_pxp_irq.o \
>   pxp/intel_pxp_pm.o \
>   pxp/intel_pxp_session.o \
> diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt.c 
> b/drivers/gpu/drm/i915/gt/debugfs_gt.c
> index 591eb60785db..c27847ddb796 100644
> --- a/drivers/gpu/drm/i915/gt/debugfs_gt.c
> +++ b/drivers/gpu/drm/i915/gt/debugfs_gt.c
> @@ -9,6 +9,7 @@
>  #include "debugfs_gt.h"
>  #include "debugfs_gt_pm.h"
>  #include "intel_sseu_debugfs.h"
> +#include "pxp/intel_pxp_debugfs.h"
>  #include "uc/intel_uc_debugfs.h"
>  #include "i915_drv.h"
>  
> @@ -28,6 +29,7 @@ void debugfs_gt_register(struct intel_gt *gt)
>   intel_sseu_debugfs_register(gt, root);
>  
>   intel_uc_debugfs_register(&gt->uc, root);
> + intel_pxp_debugfs_register(&gt->pxp, root);
>  }
>  
>  void intel_gt_debugfs_register_files(struct dentry *root,
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
> new file mode 100644
> index ..cbb1853676cc
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
> @@ -0,0 +1,78 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#include 
> +#include 
> +
> +#include "gt/debugfs_gt.h"
> +#include "pxp/intel_pxp.h"
> +#include "pxp/intel_pxp_irq.h"
> +#include "i915_drv.h"
> +
> +static int pxp_info_show(struct seq_file *m, void *data)
> +{
> + struct intel_pxp *pxp = m->private;
> + struct drm_printer p = drm_seq_file_printer(m);
> + bool enabled = intel_pxp_is_enabled(pxp);
> +
> + if (!enabled) {
> + drm_printf(&p, "pxp disabled\n");
> + return 0;
> + }
> +
> + drm_printf(&p, "active: %s\n", yesno(intel_pxp_is_active(pxp)));
> + drm_printf(&p, "instance counter: %u\n", pxp->key_instance);
> +
> + return 0;
> +}
> +DEFINE_GT_DEBUGFS_ATTRIBUTE(pxp_info);
> +
> +static int pxp_terminate_get(void *data, u64 *val)
> +{
> + /* nothing to read */
> + return -EPERM;
> +}
> +
> +static int pxp_terminate_set(void *data, u64 val)
> +{
> + struct intel_pxp *pxp = data;
> + struct intel_gt *gt = pxp_to_gt(pxp);
> +
> + if (!intel_pxp_is_active(pxp))
> + return -ENODEV;
> +
> + /* simulate a termination interrupt */
> + spin_lock_irq(&gt->irq_lock);
> + intel_pxp_irq_handler(pxp, GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT);
> + spin_unlock_irq(&gt->irq_lock);
> +
> + if (!wait_for_completion_timeout(&pxp->termination,
> +  msecs_to_jiffies(100)))
> + return -ETIMEDOUT;
> +
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(pxp_terminate_fops, pxp_terminate_get, 
> pxp_terminate_set, "%llx\n");
> +void intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry 
> *gt_root)
> +{
> + static const struct debugfs_gt_file files[] = {
> + { "info", &pxp_info_fops, NULL },
> + { "terminate_state", &pxp_terminate_fops, NULL },
> + };
> + struct dentry *root;
> +
> + if (!gt_root)
> + return;
> +
> + if (!HAS_PXP((pxp_to_gt(pxp)->i915)))
> + return;
> +
> + root = debugfs_create_dir("pxp", gt_root);
> + if (IS_ERR(root))
> + return;
> +
> + intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), pxp);
> +}
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> new file mode 100644
> index ..7e0c3d2f5d7e
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#ifndef __INTEL_PXP_DEBUGFS_H__
> +#define __INTEL_PXP_DEBUGFS_H__
> +
> +struct intel_pxp;
> +struct dentry;
> +
> +#ifdef CONFIG_DRM_I915_PXP
> +void intel_pxp_debugfs_register(struct intel_

Re: [PATCH] drm/rockchip: Update crtc fixup to account for fractional clk change

2021-09-10 Thread Chris Morgan

On Wed, Sep 08, 2021 at 09:05:52PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 08, 2021 at 08:53:56AM -0500, Chris Morgan wrote:
> > From: Chris Morgan 
> > 
> > After commit 928f9e268611 ("clk: fractional-divider: Hide
> > clk_fractional_divider_ops from wide audience") was merged it appears
> > that the DSI panel on my Odroid Go Advance stopped working. Upon closer
> > examination of the problem, it looks like the fixup in the
> > rockchip_drm_vop.c file was causing the issue. The changes made to the
> > clk driver appear to change some assumptions made in the fixup.
> > 
> > After debugging the working 5.14 kernel and the no-longer working
> > 5.15 kernel, it looks like this was broken all along but still
> > worked, whereas after the fractional clock change it stopped
> > working despite the issue (it went from sort-of broken to very broken).
> > 
> > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to
> > 17000999 on my board. The clock driver was taking the value of the
> > parent clock and attempting to divide the requested value from it
> > (1700/17000999 = 0), then subtracting 1 from it (making it -1),
> > and running it through fls_long to get 64. It would then subtract
> > the value of fd->mwidth from it to get 48, and then bit shift
> > 17000999 to the left by 48, coming up with a very large number of
> > 7649082492112076800. This resulted in a numerator of 65535 and a
> > denominator of 1 from the clk driver. The driver seemingly would
> > try again and get a correct 1:1 value later, and then move on.
> > 
> > Output from my 5.14 kernel (with some printfs for good measure):
> > [2.830066] rockchip-drm display-subsystem: bound ff46.vop (ops 
> > vop_component_ops)
> > [2.839431] rockchip-drm display-subsystem: bound ff45.dsi (ops 
> > dw_mipi_dsi_rockchip_ops)
> > [2.855980] Clock is dclk_vopb_frac
> > [2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent 
> > Rate 1700, Best Numerator 65535, Best Denominator 1, fd->mwidth 16
> > [2.903529] Clock is dclk_vopb_frac
> > [2.903556] Scale 0, Rate 1700, Oldrate 1700, Parent Rate 
> > 1700, Best Numerator 1, Best Denominator 1, fd->mwidth 16
> > [2.903579] Clock is dclk_vopb_frac
> > [2.903583] Scale 0, Rate 1700, Oldrate 1700, Parent Rate 
> > 1700, Best Numerator 1, Best Denominator 1, fd->mwidth 16
> > 
> > Contrast this with 5.15 after the clk change where the rate of 17000999
> > was getting passed and resulted in numerators/denominators of 17001/
> > 17000.
> > 
> > Output from my 5.15 kernel (with some printfs added for good measure):
> > [2.817571] rockchip-drm display-subsystem: bound ff46.vop (ops 
> > vop_component_ops)
> > [2.826975] rockchip-drm display-subsystem: bound ff45.dsi (ops 
> > dw_mipi_dsi_rockchip_ops)
> > [2.843430] Rate 17000999, Parent Rate 1700, Best Numerator 17018, 
> > Best Denominator 17017
> > [2.891073] Rate 17001000, Parent Rate 1700, Best Numerator 17001, 
> > Best Denominator 17000
> > [2.891269] Rate 17001000, Parent Rate 1700, Best Numerator 17001, 
> > Best Denominator 17000
> > [2.891281] Rate 17001000, Parent Rate 1700, Best Numerator 17001, 
> > Best Denominator 17000
> > 
> > After tracing through the code it appeared that this function here was
> > adding a 999 to the requested frequency because of how the clk driver
> > was rounding/accepting those frequencies. I believe after the changes
> > made in the commit listed above the assumptions listed in this driver
> > are no longer true. When I remove the + 999 from the driver the DSI
> > panel begins to work again.
> > 
> > Output from my 5.15 kernel with 999 removed (printfs added):
> > [2.852054] rockchip-drm display-subsystem: bound ff46.vop (ops 
> > vop_component_ops)
> > [2.864483] rockchip-drm display-subsystem: bound ff45.dsi (ops 
> > dw_mipi_dsi_rockchip_ops)
> > [2.880869] Clock is dclk_vopb_frac
> > [2.880892] Rate 1700, Parent Rate 1700, Best Numerator 1, Best 
> > Denominator 1
> > [2.928521] Clock is dclk_vopb_frac
> > [2.928551] Rate 1700, Parent Rate 1700, Best Numerator 1, Best 
> > Denominator 1
> > [2.928570] Clock is dclk_vopb_frac
> > [2.928574] Rate 1700, Parent Rate 1700, Best Numerator 1, Best 
> > Denominator 1
> > 
> > I have tested the change extensively on my Odroid Go Advance (Rockchip
> > RK3326) and it appears to work well. However, this change will affect
> > all Rockchip SoCs that use this driver so I believe further testing
> > is warranted. Please note that without this change I can confirm
> > at least all PX30s with DSI panels will stop working with the 5.15
> > kernel.
> 
> To me it all makes a lot of sense, thank you for deep analysis of the issue!
> In any case I think we will need a Fixes tag to something (either one of
> clk-fractional-divider.c series or preexisted).

Would this work for a
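
The 5.14 arithmetic described in the quoted commit message can be reproduced with a short userspace sketch. This follows the steps as the message words them, not the actual clk driver source; the demo_* names are hypothetical, and the parent rate of 17000000 used in the usage note is an assumption, since the logged values are truncated in this archive (any parent rate below the requested rate gives the same quotient of 0).

```c
#include <stdint.h>

/* Position of the highest set bit, 1-based; 0 for x == 0 (like fls_long). */
static int demo_fls64(uint64_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Scale the requested rate as the message describes:
 * scale = fls(parent / rate - 1), then shift left by (scale - mwidth).
 * With parent < rate the quotient is 0, and 0 - 1 wraps to all ones.
 */
static uint64_t demo_scaled_rate(uint64_t rate, uint64_t parent_rate,
				 int mwidth, int *scale_out)
{
	int scale = demo_fls64(parent_rate / rate - 1);

	if (scale_out)
		*scale_out = scale;
	if (scale > mwidth)
		rate <<= scale - mwidth; /* high bits are silently lost */
	return rate;
}
```

Calling demo_scaled_rate(17000999, 17000000, 16, &scale) gives a scale of 64 and a rate of 7649082492112076800, matching the "Scale 64" log line: the 48-bit shift discards everything above the low 16 bits of 17000999. When the requested rate equals the parent rate, the scale is 0 and the rate passes through unchanged.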

Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Lucas De Marchi


On Fri, Sep 10, 2021 at 09:14:37PM +, Yokoyama, Caz wrote:

On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote:

On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:
> We shouldn't be using debugfs_ namespace for this functionality.
> Rename
> debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
> functions, defines and structs follow suit.
>
> Signed-off-by: Lucas De Marchi 
> ---
> drivers/gpu/drm/i915/Makefile  |  2 +-
> drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -
> -
> drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
> .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
> drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14
> ++
> 5 files changed, 19 insertions(+), 19 deletions(-)
> delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c =>
> intel_gt_pm_debugfs.c} (99%)
> create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile
> b/drivers/gpu/drm/i915/Makefile
> index 232c9673a2e5..dd656f2d7721 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>
> # "Graphics Technology" (aka we talk to the gpu)
> gt-y += \
> -  gt/debugfs_gt_pm.o \
>gt/gen2_engine_cs.o \
>gt/gen6_engine_cs.o \
>gt/gen6_ppgtt.o \
> @@ -103,6 +102,7 @@ gt-y += \
>gt/intel_gt_engines_debugfs.o \
>gt/intel_gt_irq.o \
>gt/intel_gt_pm.o \
> +  gt/intel_gt_pm_debugfs.o \
>gt/intel_gt_pm_irq.o \
>gt/intel_gt_requests.o \
>gt/intel_gtt.o \
> diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> deleted file mode 100644
> index 4cf5f5c9da7d..
> --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> +++ /dev/null
> @@ -1,14 +0,0 @@
> -/* SPDX-License-Identifier: MIT */
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#ifndef DEBUGFS_GT_PM_H
> -#define DEBUGFS_GT_PM_H
> -
> -struct intel_gt;
> -struct dentry;
> -
> -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry
> *root);
> -
> -#endif /* DEBUGFS_GT_PM_H */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> index e5d173c235a3..4096ee893b69 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> @@ -5,10 +5,10 @@
>
> #include 
>
> -#include "debugfs_gt_pm.h"
> #include "i915_drv.h"
> #include "intel_gt_debugfs.h"
> #include "intel_gt_engines_debugfs.h"
> +#include "intel_gt_pm_debugfs.h"

Why locate here? Why not just replace debugfs_gt_pm.h? Compile error?


are you asking why I moved the include? Because sorting them
alphabetically avoids big messes in these includes

Lucas De Marchi


-caz


> #include "intel_sseu_debugfs.h"
> #include "uc/intel_uc_debugfs.h"
>
> @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt
> *gt)
>return;
>
>intel_gt_engines_register_debugfs(gt, root);
> -  debugfs_gt_pm_register(gt, root);
> +  intel_gt_pm_register_debugfs(gt, root);

This is one case I usually don't know what convention to follow since
it
changes in different places.

I did it like _register_debugfs because of calls like
intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that
just
below we have intel_sseu_debugfs_register(), so maybe I should
consider
debugfs as part of the namespace?

Lucas De Marchi

Re: [PATCH 2/2] drm/msm/dpu: Fix timeout issues on command mode panels

2021-09-10 Thread Marijn Suijten

Hi Angelo!

On 2021-09-01 19:43:47, AngeloGioacchino Del Regno wrote:
> In function dpu_encoder_phys_cmd_wait_for_commit_done we are always
> checking if the relative CTL is started by waiting for an interrupt
> to fire: it is fine to do that, but then sometimes we call this
> function while the CTL is up and has never been put down, but that
> interrupt gets raised only when the CTL gets a state change from
> 0 to 1 (disabled to enabled), so we're going to wait for something
> that will never happen on its own.
> 
> Solving this without restarting the CTL is actually possible
> and can be done by just checking if it is already up and running
> when the wait_for_commit_done function is called: in that case,
> if the CTL was already running, we can say that the commit is done
> if the command transmission is complete (in other words, if the
> interface has been flushed).
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> index aa01698d6b25..b5b1b555ac4e 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> @@ -682,6 +682,9 @@ static int dpu_encoder_phys_cmd_wait_for_commit_done(
>   if (!dpu_encoder_phys_cmd_is_master(phys_enc))
>   return 0;
>  
> + if (phys_enc->hw_ctl->ops.is_started)
> + return dpu_encoder_phys_cmd_wait_for_tx_complete(phys_enc);

In the previous commit you introduced is_started to the ops struct as
function pointer, and you probably intend to call it here instead of
just checking whether it might be NULL.

As far as I remember this was also the reason for previously mentioning
that it was faulty and required a v2 in:
https://lore.kernel.org/linux-arm-msm/bdc67afc-3736-5497-c43f-5165c55e0...@somainline.org/

Thanks!

- Marijn

> +
>   return _dpu_encoder_phys_cmd_wait_for_ctl_start(phys_enc);
>  }
>  
> -- 
> 2.32.0
>
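
The review comment above is about the difference between testing a function pointer and calling it. A minimal userspace sketch of that difference follows; the demo_* types and functions are hypothetical and only mimic the shape of an ops struct, they are not the DPU driver's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_ctl;

/* Ops table with an optional is_started callback. */
struct demo_ctl_ops {
	bool (*is_started)(struct demo_ctl *ctl);
};

struct demo_ctl {
	struct demo_ctl_ops ops;
	bool started;
};

static bool demo_ctl_is_started(struct demo_ctl *ctl)
{
	return ctl->started;
}

/* As posted: only asks whether the callback exists, never runs it. */
static bool demo_check_pointer_only(struct demo_ctl *ctl)
{
	return ctl->ops.is_started != NULL;
}

/* Presumably intended: guard against NULL, then call the callback. */
static bool demo_check_and_call(struct demo_ctl *ctl)
{
	return ctl->ops.is_started && ctl->ops.is_started(ctl);
}
```

With the callback installed and the CTL stopped, the pointer-only check still reports true, so the early return would be taken on every call regardless of hardware state.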

Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Yokoyama, Caz

On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote:
> On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:
> > We shouldn't be using debugfs_ namespace for this functionality.
> > Rename
> > debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
> > functions, defines and structs follow suit.
> > 
> > Signed-off-by: Lucas De Marchi 
> > ---
> > drivers/gpu/drm/i915/Makefile  |  2 +-
> > drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -
> > -
> > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
> > .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
> > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14
> > ++
> > 5 files changed, 19 insertions(+), 19 deletions(-)
> > delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c =>
> > intel_gt_pm_debugfs.c} (99%)
> > create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
> > 
> > diff --git a/drivers/gpu/drm/i915/Makefile
> > b/drivers/gpu/drm/i915/Makefile
> > index 232c9673a2e5..dd656f2d7721 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
> > 
> > # "Graphics Technology" (aka we talk to the gpu)
> > gt-y += \
> > -   gt/debugfs_gt_pm.o \
> > gt/gen2_engine_cs.o \
> > gt/gen6_engine_cs.o \
> > gt/gen6_ppgtt.o \
> > @@ -103,6 +102,7 @@ gt-y += \
> > gt/intel_gt_engines_debugfs.o \
> > gt/intel_gt_irq.o \
> > gt/intel_gt_pm.o \
> > +   gt/intel_gt_pm_debugfs.o \
> > gt/intel_gt_pm_irq.o \
> > gt/intel_gt_requests.o \
> > gt/intel_gtt.o \
> > diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > deleted file mode 100644
> > index 4cf5f5c9da7d..
> > --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> > +++ /dev/null
> > @@ -1,14 +0,0 @@
> > -/* SPDX-License-Identifier: MIT */
> > -/*
> > - * Copyright © 2019 Intel Corporation
> > - */
> > -
> > -#ifndef DEBUGFS_GT_PM_H
> > -#define DEBUGFS_GT_PM_H
> > -
> > -struct intel_gt;
> > -struct dentry;
> > -
> > -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry
> > *root);
> > -
> > -#endif /* DEBUGFS_GT_PM_H */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > index e5d173c235a3..4096ee893b69 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> > @@ -5,10 +5,10 @@
> > 
> > #include 
> > 
> > -#include "debugfs_gt_pm.h"
> > #include "i915_drv.h"
> > #include "intel_gt_debugfs.h"
> > #include "intel_gt_engines_debugfs.h"
> > +#include "intel_gt_pm_debugfs.h"
Why locate here? Why not just replace debugfs_gt_pm.h? Compile error?
-caz

> > #include "intel_sseu_debugfs.h"
> > #include "uc/intel_uc_debugfs.h"
> > 
> > @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt
> > *gt)
> > return;
> > 
> > intel_gt_engines_register_debugfs(gt, root);
> > -   debugfs_gt_pm_register(gt, root);
> > +   intel_gt_pm_register_debugfs(gt, root);
> 
> This is one case I usually don't know what convention to follow since
> it
> changes in different places.
> 
> I did it like _register_debugfs because of calls like
> intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that
> just
> below we have intel_sseu_debugfs_register(), so maybe I should
> consider
> debugfs as part of the namespace?
> 
> Lucas De Marchi

Re: [Intel-gfx] [PATCH 23/27] drm/i915/guc: Implement no mid batch preemption for multi-lrc

2021-09-10 Thread Matthew Brost

On Fri, Sep 10, 2021 at 12:25:43PM +0100, Tvrtko Ursulin wrote:
> 
> On 20/08/2021 23:44, Matthew Brost wrote:
> > For some users of multi-lrc, e.g. split frame, it isn't safe to preempt
> > mid BB. To safely enable preemption at the BB boundary, a handshake
> > > between the parent and child is needed. This is implemented via custom
> > > emit_bb_start & emit_fini_breadcrumb functions and enabled by
> > > default if a context is configured by the set parallel extension.
> 
> FWIW I think it's wrong to hardcode the requirements of a particular
> hardware generation fixed media pipeline into the uapi. IMO better solution
> was when concept of parallel submission was decoupled from the no preemption
> mid batch preambles. Otherwise might as well call the extension
> I915_CONTEXT_ENGINES_EXT_MEDIA_SPLIT_FRAME_SUBMIT or something.
> 

I don't disagree, but this is where we landed per Daniel Vetter's feedback -
default to what our current hardware supports and extend it later to
newer hardware / requirements as needed.

Matt

> Regards,
> 
> Tvrtko
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_context.c   |   2 +-
> >   drivers/gpu/drm/i915/gt/intel_context_types.h |   3 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   2 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 283 +-
> >   4 files changed, 287 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> > b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 5615be32879c..2de62649e275 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -561,7 +561,7 @@ void intel_context_bind_parent_child(struct 
> > intel_context *parent,
> > GEM_BUG_ON(intel_context_is_child(child));
> > GEM_BUG_ON(intel_context_is_parent(child));
> > -   parent->guc_number_children++;
> > +   child->guc_child_index = parent->guc_number_children++;
> > list_add_tail(&child->guc_child_link,
> >   &parent->guc_child_list);
> > child->parent = parent;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > index 713d85b0b364..727f91e7f7c2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > @@ -246,6 +246,9 @@ struct intel_context {
> > /** @guc_number_children: number of children if parent */
> > u8 guc_number_children;
> > +   /** @guc_child_index: index into guc_child_list if child */
> > +   u8 guc_child_index;
> > +
> > /**
> >  * @parent_page: page in context used by parent for work queue,
> >  * work queue descriptor
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > index 6cd26dc060d1..9f61cfa5566a 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> > @@ -188,7 +188,7 @@ struct guc_process_desc {
> > u32 wq_status;
> > u32 engine_presence;
> > u32 priority;
> > -   u32 reserved[30];
> > +   u32 reserved[36];
> >   } __packed;
> >   #define CONTEXT_REGISTRATION_FLAG_KMD BIT(0)
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 91330525330d..1a18f99bf12a 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -11,6 +11,7 @@
> >   #include "gt/intel_context.h"
> >   #include "gt/intel_engine_pm.h"
> >   #include "gt/intel_engine_heartbeat.h"
> > +#include "gt/intel_gpu_commands.h"
> >   #include "gt/intel_gt.h"
> >   #include "gt/intel_gt_irq.h"
> >   #include "gt/intel_gt_pm.h"
> > @@ -366,10 +367,14 @@ static struct i915_priolist *to_priolist(struct 
> > rb_node *rb)
> >   /*
> >* When using multi-lrc submission an extra page in the context state is
> > - * reserved for the process descriptor and work queue.
> > + * reserved for the process descriptor, work queue, and preempt BB boundary
> > + * handshake between the parent + childlren contexts.
> >*
> >* The layout of this page is below:
> >* 0  guc_process_desc
> > + * + sizeof(struct guc_process_desc)   child go
> > + * + CACHELINE_BYTES   child join ...
> > + * + CACHELINE_BYTES ...
> >* ...unused
> >* PAGE_SIZE / 2  work queue start
> >* ...work queue
> > @@ -1785,6 +1790,30 @@ static int deregister_context(struct intel_context 
> > *ce, u32 guc_id, bool loop)
> > return __guc_action_deregister_context(guc, guc_id, loop);
> >   }
> > +static inline void clear_children_join_go_memory(struct intel_context *ce)
> > +

Re: [Intel-gfx] [PATCH 05/27] drm/i915: Add GT PM unpark worker

2021-09-10 Thread Matthew Brost

On Fri, Sep 10, 2021 at 09:36:17AM +0100, Tvrtko Ursulin wrote:
> 
> On 20/08/2021 23:44, Matthew Brost wrote:
> > Sometimes it is desirable to queue work up for later if the GT PM isn't
> > held and run that work on next GT PM unpark.
> 
> Sounds maybe plausible, but it depends how much work can happen on unpark
> and whether it can have too much of a negative impact on latency for
> interactive loads? Or from a reverse angle, why the work wouldn't be done on

All it does is add an interface to kick a work queue on unpark, i.e.
all the actual work is done async in the work queue, so it shouldn't
add any latency.

> parking?
> 
> Also what kind of mechanism for dealing with too much stuff being put on
> this list you have? Can there be pressure which triggers (or would need to

No limits on pressure. See above, I don't think this is a concern.

> trigger) these deregistrations to happen at runtime (no park/unpark
> transitions)?
>
> > Implemented with a list in the GT of all pending work, workqueues in
> > the list, a callback to add a workqueue to the list, and finally a
> > wakeref post_get callback that iterates / drains the list + queues the
> > workqueues.
> > 
> > First user of this is deregistration of GuC contexts.
> 
> Does first imply there are more incoming?
>

Haven't found another user yet, but this is a generic mechanism so we can
add more in the future if other use cases arise.
 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/Makefile |  1 +
> >   drivers/gpu/drm/i915/gt/intel_gt.c|  3 ++
> >   drivers/gpu/drm/i915/gt/intel_gt_pm.c |  8 
> >   .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.c | 35 
> >   .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.h | 40 +++
> >   drivers/gpu/drm/i915/gt/intel_gt_types.h  | 10 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  8 ++--
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +--
> >   drivers/gpu/drm/i915/intel_wakeref.c  |  5 +++
> >   drivers/gpu/drm/i915/intel_wakeref.h  |  1 +
> >   10 files changed, 119 insertions(+), 7 deletions(-)
> >   create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
> >   create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.h
> > 
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index 642a5b5a1b81..579bdc069f25 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -103,6 +103,7 @@ gt-y += \
> > gt/intel_gt_clock_utils.o \
> > gt/intel_gt_irq.o \
> > gt/intel_gt_pm.o \
> > +   gt/intel_gt_pm_unpark_work.o \
> > gt/intel_gt_pm_irq.o \
> > gt/intel_gt_requests.o \
> > gt/intel_gtt.o \
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt.c
> > index 62d40c986642..7e690e74baa2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > @@ -29,6 +29,9 @@ void intel_gt_init_early(struct intel_gt *gt, struct 
> > drm_i915_private *i915)
> > > spin_lock_init(&gt->irq_lock);
> > > +   spin_lock_init(&gt->pm_unpark_work_lock);
> > > +   INIT_LIST_HEAD(&gt->pm_unpark_work_list);
> > > +
> > > INIT_LIST_HEAD(&gt->closed_vma);
> > > spin_lock_init(&gt->closed_lock);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > index dea8e2479897..564c11a3748b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > @@ -90,6 +90,13 @@ static int __gt_unpark(struct intel_wakeref *wf)
> > return 0;
> >   }
> > +static void __gt_unpark_work_queue(struct intel_wakeref *wf)
> > +{
> > +   struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref);
> > +
> > +   intel_gt_pm_unpark_work_queue(gt);
> > +}
> > +
> >   static int __gt_park(struct intel_wakeref *wf)
> >   {
> > struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref);
> > @@ -118,6 +125,7 @@ static int __gt_park(struct intel_wakeref *wf)
> >   static const struct intel_wakeref_ops wf_ops = {
> > .get = __gt_unpark,
> > +   .post_get = __gt_unpark_work_queue,
> > .put = __gt_park,
> >   };
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
> > new file mode 100644
> > index ..23162dbd0c35
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
> > @@ -0,0 +1,35 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2021 Intel Corporation
> > + */
> > +
> > +#include "i915_drv.h"
> > +#include "intel_runtime_pm.h"
> > +#include "intel_gt_pm.h"
> > +
> > +void intel_gt_pm_unpark_work_queue(struct intel_gt *gt)
> > +{
> > +   struct intel_gt_pm_unpark_work *work, *next;
> > +   unsigned long flags;
> > +
> > > +   spin_lock_irqsave(&gt->pm_unpark_work_lock, flags);
> > +   list_for_each_entry_safe(work, next,
> > +

[PATCH v2 3/6] drm/i915/uncore: Replace gen8 write functions with general fwtable

2021-09-10 Thread Matt Roper

Now that we have both a standard forcewake table (albeit a single-entry
table) and the shadow table stored in the uncore, we can drop the
gen8-specific write handlers in favor of the general fwtable version.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 5fa2bf26a948..4c6898746d10 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1046,16 +1046,6 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
return FORCEWAKE_RENDER;
 }
 
-#define __gen8_reg_write_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd; \
-   if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \
-   __fwd = FORCEWAKE_RENDER; \
-   else \
-   __fwd = 0; \
-   __fwd; \
-})
-
 static const struct intel_forcewake_range __gen6_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0x3, FORCEWAKE_RENDER),
 };
@@ -1711,7 +1701,6 @@ __gen_write(func, 32)
 __gen_reg_write_funcs(gen12_fwtable);
 __gen_reg_write_funcs(gen11_fwtable);
 __gen_reg_write_funcs(fwtable);
-__gen_reg_write_funcs(gen8);
 
 #undef __gen_reg_write_funcs
 #undef GEN6_WRITE_FOOTER
@@ -2121,7 +2110,7 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
} else if (GRAPHICS_VER(i915) == 8) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
-   ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
+   ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_VALLEYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
-- 
2.25.4

[PATCH v2 6/6] drm/i915/dg2: Add DG2-specific shadow register table

2021-09-10 Thread Matt Roper

We thought the DG2 table of shadowed registers would be the same as the
gen12/xehp table, but it turns out that there are a few minor
differences that require us to define a new DG2-specific table:
 * One register is removed (0xC4D4)
 * One register is added (0xC4E0)
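
Since the shadow tables are looked up with a binary search, a new table is
only usable if its ranges are sorted by offset and do not overlap. The
following is a minimal userspace sketch of that kind of check, not the
driver's selftest: struct reg_range and range_table_is_sorted() are
hypothetical names, and only the three sample offsets are taken from the
table in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's range type; end is inclusive. */
struct reg_range {
	uint32_t start;
	uint32_t end;
};

/*
 * Return true when every range is well-formed (start <= end) and each
 * range begins strictly after the previous one ends, i.e. the table is
 * sorted by offset and free of overlaps.
 */
static bool range_table_is_sorted(const struct reg_range *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].start > tbl[i].end)
			return false;
		if (i > 0 && tbl[i].start <= tbl[i - 1].end)
			return false;
	}
	return true;
}

/* A few single-register entries copied from the DG2 table in this patch. */
static const struct reg_range sample_regs[] = {
	{ .start = 0xC4C8, .end = 0xC4C8 },
	{ .start = 0xC4E0, .end = 0xC4E0 },
	{ .start = 0xC600, .end = 0xC600 },
};
```

Running such a check over every table at selftest time, rather than on each
lookup, keeps the cost out of the MMIO path.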

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c   | 41 ++-
 drivers/gpu/drm/i915/selftests/intel_uncore.c |  1 +
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 10f124297e7c..b3ba710d4310 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1016,6 +1016,45 @@ static const struct i915_range gen12_shadowed_regs[] = {
{ .start = 0x1F8510, .end = 0x1F8550 },
 };
 
+static const struct i915_range dg2_shadowed_regs[] = {
+   { .start =   0x2030, .end =   0x2030 },
+   { .start =   0x2510, .end =   0x2550 },
+   { .start =   0xA008, .end =   0xA00C },
+   { .start =   0xA188, .end =   0xA188 },
+   { .start =   0xA278, .end =   0xA278 },
+   { .start =   0xA540, .end =   0xA56C },
+   { .start =   0xC4C8, .end =   0xC4C8 },
+   { .start =   0xC4E0, .end =   0xC4E0 },
+   { .start =   0xC600, .end =   0xC600 },
+   { .start =   0xC658, .end =   0xC658 },
+   { .start =  0x22030, .end =  0x22030 },
+   { .start =  0x22510, .end =  0x22550 },
+   { .start = 0x1C0030, .end = 0x1C0030 },
+   { .start = 0x1C0510, .end = 0x1C0550 },
+   { .start = 0x1C4030, .end = 0x1C4030 },
+   { .start = 0x1C4510, .end = 0x1C4550 },
+   { .start = 0x1C8030, .end = 0x1C8030 },
+   { .start = 0x1C8510, .end = 0x1C8550 },
+   { .start = 0x1D0030, .end = 0x1D0030 },
+   { .start = 0x1D0510, .end = 0x1D0550 },
+   { .start = 0x1D4030, .end = 0x1D4030 },
+   { .start = 0x1D4510, .end = 0x1D4550 },
+   { .start = 0x1D8030, .end = 0x1D8030 },
+   { .start = 0x1D8510, .end = 0x1D8550 },
+   { .start = 0x1E0030, .end = 0x1E0030 },
+   { .start = 0x1E0510, .end = 0x1E0550 },
+   { .start = 0x1E4030, .end = 0x1E4030 },
+   { .start = 0x1E4510, .end = 0x1E4550 },
+   { .start = 0x1E8030, .end = 0x1E8030 },
+   { .start = 0x1E8510, .end = 0x1E8550 },
+   { .start = 0x1F0030, .end = 0x1F0030 },
+   { .start = 0x1F0510, .end = 0x1F0550 },
+   { .start = 0x1F4030, .end = 0x1F4030 },
+   { .start = 0x1F4510, .end = 0x1F4550 },
+   { .start = 0x1F8030, .end = 0x1F8030 },
+   { .start = 0x1F8510, .end = 0x1F8550 },
+};
+
 static int mmio_range_cmp(u32 key, const struct i915_range *range)
 {
if (key < range->start)
@@ -2054,7 +2093,7 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
 
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
-   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
+   ASSIGN_SHADOW_TABLE(uncore, dg2_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c 
b/drivers/gpu/drm/i915/selftests/intel_uncore.c
index 22ef2c87df1a..bc8128170a99 100644
--- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
+++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
@@ -68,6 +68,7 @@ static int intel_shadow_table_check(void)
{ gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) },
{ gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
{ gen12_shadowed_regs, ARRAY_SIZE(gen12_shadowed_regs) },
+   { dg2_shadowed_regs, ARRAY_SIZE(dg2_shadowed_regs) },
};
const struct i915_range *range;
unsigned int i, j;
-- 
2.25.4

[PATCH v2 5/6] drm/i915/uncore: Drop gen11 mmio read handlers

2021-09-10 Thread Matt Roper

Consolidate down to just a single 'fwtable' implementation.  For reads
we don't need to worry about shadow tables.  Also, the
NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation
can be dropped --- if a register is outside that range on one of the old
platforms, then it won't belong to any forcewake range and 0 will be
returned anyway.

v2:
 - Restore NEEDS_FORCE_WAKE() check.  (Chris, Tvrtko)

Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 40 -
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index bfb2a6337f9d..10f124297e7c 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -935,9 +935,6 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] 
= {
__fwd; \
 })
 
-#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \
-   find_fw_domain(uncore, offset)
-
 /* *Must* be sorted by offset! See intel_shadow_table_check(). */
 static const struct i915_range gen8_shadowed_regs[] = {
{ .start =  0x2030, .end =  0x2030 },
@@ -1570,33 +1567,30 @@ static inline void __force_wake_auto(struct 
intel_uncore *uncore,
___force_wake_auto(uncore, fw_domains);
 }
 
-#define __gen_read(func, x) \
+#define __gen_fwtable_read(x) \
 static u##x \
-func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \
+fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \
+{ \
enum forcewake_domains fw_engine; \
GEN6_READ_HEADER(x); \
-   fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \
+   fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \
if (fw_engine) \
__force_wake_auto(uncore, fw_engine); \
val = __raw_uncore_read##x(uncore, reg); \
GEN6_READ_FOOTER; \
 }
 
-#define __gen_reg_read_funcs(func) \
-static enum forcewake_domains \
-func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
-   return __##func##_reg_read_fw_domains(uncore, 
i915_mmio_reg_offset(reg)); \
-} \
-\
-__gen_read(func, 8) \
-__gen_read(func, 16) \
-__gen_read(func, 32) \
-__gen_read(func, 64)
+static enum forcewake_domains
+fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) {
+   return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg));
+}
 
-__gen_reg_read_funcs(gen11_fwtable);
-__gen_reg_read_funcs(fwtable);
+__gen_fwtable_read(8)
+__gen_fwtable_read(16)
+__gen_fwtable_read(32)
+__gen_fwtable_read(64)
 
-#undef __gen_reg_read_funcs
+#undef __gen_fwtable_read
 #undef GEN6_READ_FOOTER
 #undef GEN6_READ_HEADER
 
@@ -2062,22 +2056,22 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) >= 12) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 11) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
-- 
2.25.4

[PATCH v2 2/6] drm/i915/uncore: Associate shadow table with uncore

2021-09-10 Thread Matt Roper

Store a reference to a platform's shadow table inside the uncore, the
same as we do with the forcewake table.  This will allow us to use a
single set of functions that operate on the shadow table reference
rather than generating lots of nearly-identical functions via macros
that differ only in terms of the table that they reference.
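
As a rough illustration of the idea, here is a userspace sketch in which
the table pointer and entry count live in a per-device struct and one
lookup function serves every platform. It assumes standard C bsearch() in
place of the driver's BSEARCH macro; struct uncore_sketch, struct
reg_range, range_cmp() and both sample tables are hypothetical and exist
only for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical range type; end is inclusive. */
struct reg_range {
	uint32_t start;
	uint32_t end;
};

/* Hypothetical stand-in for the two fields this patch adds to the uncore. */
struct uncore_sketch {
	const struct reg_range *shadowed_reg_table;
	size_t shadowed_reg_table_entries;
};

/* bsearch() comparator: order an offset against an inclusive range. */
static int range_cmp(const void *key, const void *elem)
{
	uint32_t offset = *(const uint32_t *)key;
	const struct reg_range *range = elem;

	if (offset < range->start)
		return -1;
	if (offset > range->end)
		return 1;
	return 0;
}

/* One lookup for all platforms; the table comes from the struct. */
static bool is_shadowed(const struct uncore_sketch *uncore, uint32_t offset)
{
	if (!uncore->shadowed_reg_table)
		return false;

	return bsearch(&offset, uncore->shadowed_reg_table,
		       uncore->shadowed_reg_table_entries,
		       sizeof(*uncore->shadowed_reg_table),
		       range_cmp) != NULL;
}

/* Illustrative tables for two made-up platforms, sorted by offset. */
static const struct reg_range table_a[] = {
	{ .start = 0x2510, .end = 0x2550 },
	{ .start = 0xC4C8, .end = 0xC4C8 },
	{ .start = 0xC4D4, .end = 0xC4D4 },
};

static const struct reg_range table_b[] = {
	{ .start = 0x2510, .end = 0x2550 },
	{ .start = 0xC4C8, .end = 0xC4C8 },
	{ .start = 0xC4E0, .end = 0xC4E0 },
};
```

Pointing two uncore_sketch instances at table_a and table_b gives
different answers for the same offset without generating a second copy of
is_shadowed(), at the cost of one pointer dereference per lookup.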

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 40 -
 drivers/gpu/drm/i915/intel_uncore.h |  7 +
 2 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 8c09af1e9f7a..5fa2bf26a948 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1026,17 +1026,19 @@ static int mmio_range_cmp(u32 key, const struct 
i915_range *range)
return 0;
 }
 
-#define __is_X_shadowed(x) \
-static bool is_##x##_shadowed(u32 offset) \
-{ \
-   const struct i915_range *regs = x##_shadowed_regs; \
-   return BSEARCH(offset, regs, ARRAY_SIZE(x##_shadowed_regs), \
+static bool
+is_shadowed(struct intel_uncore *uncore, u32 offset)
+{
+   if (drm_WARN_ON(&uncore->i915->drm, !uncore->shadowed_reg_table))
+   return false;
+
+   return BSEARCH(offset,
+  uncore->shadowed_reg_table,
+  uncore->shadowed_reg_table_entries,
   mmio_range_cmp); \
 }
 
-__is_X_shadowed(gen8)
-__is_X_shadowed(gen11)
-__is_X_shadowed(gen12)
+
 
 static enum forcewake_domains
 gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg)
@@ -1047,7 +1049,7 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
 #define __gen8_reg_write_fw_domains(uncore, offset) \
 ({ \
enum forcewake_domains __fwd; \
-   if (NEEDS_FORCE_WAKE(offset) && !is_gen8_shadowed(offset)) \
+   if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \
__fwd = FORCEWAKE_RENDER; \
else \
__fwd = 0; \
@@ -1081,7 +1083,7 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 #define __fwtable_reg_write_fw_domains(uncore, offset) \
 ({ \
enum forcewake_domains __fwd = 0; \
-   if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \
+   if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \
__fwd = find_fw_domain(uncore, offset); \
__fwd; \
 })
@@ -1090,7 +1092,7 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 ({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
-   if (!is_gen11_shadowed(__offset)) \
+   if (!is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
 })
@@ -1099,7 +1101,7 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 ({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
-   if (!is_gen12_shadowed(__offset)) \
+   if (!is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
 })
@@ -1705,6 +1707,7 @@ __gen_write(func, 8) \
 __gen_write(func, 16) \
 __gen_write(func, 32)
 
+
 __gen_reg_write_funcs(gen12_fwtable);
 __gen_reg_write_funcs(gen11_fwtable);
 __gen_reg_write_funcs(fwtable);
@@ -1969,6 +1972,12 @@ static int intel_uncore_fw_domains_init(struct 
intel_uncore *uncore)
(uncore)->fw_domains_table_entries = ARRAY_SIZE((d)); \
 }
 
+#define ASSIGN_SHADOW_TABLE(uncore, d) \
+{ \
+   (uncore)->shadowed_reg_table = d; \
+   (uncore)->shadowed_reg_table_entries = ARRAY_SIZE((d)); \
+}
+
 static int i915_pmic_bus_access_notifier(struct notifier_block *nb,
 unsigned long action, void *data)
 {
@@ -2081,30 +2090,37 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
 
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
+   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
+   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) >= 12) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);
+   ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) == 11) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ra

[PATCH v2 4/6] drm/i915/uncore: Drop gen11/gen12 mmio write handlers

2021-09-10 Thread Matt Roper

Now that the reference to the shadow table is stored within the uncore,
we don't need to generate separate fwtable, gen11_fwtable, and
gen12_fwtable variants of the register write functions; a single
'fwtable' implementation will work for all of those platforms now.

While consolidating the functions, gen11/gen12 pick up a
NEEDS_FORCE_WAKE() check that they didn't have before, allowing them to
bypass a lot of forcewake/shadow checking for non-GT registers (e.g.,
display).  However since these later platforms also introduce media
engines at higher MMIO offsets, the definition of NEEDS_FORCE_WAKE() is
extended to also consider register offsets above GEN11_BSD_RING_BASE.
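
The shape of the extended check can be sketched as a plain predicate with
two windows: a low one and everything at or above a high base. The two
boundary values below are made-up placeholders for this example and are
not the driver's constants; LOW_LIMIT, HIGH_BASE and needs_force_wake()
are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder boundaries chosen for this sketch only. */
#define LOW_LIMIT	0x1000u		/* end of the low GT window (exclusive) */
#define HIGH_BASE	0x8000u		/* start of the high engine window */

/*
 * Registers between the two windows (display, for example) return false
 * and can skip the forcewake and shadow-table lookups entirely.
 * Taking the offset as a function argument evaluates it exactly once,
 * which is what the local variable in the macro version achieves.
 */
static inline bool needs_force_wake(uint32_t reg)
{
	return reg < LOW_LIMIT || reg >= HIGH_BASE;
}
```

A caller would test needs_force_wake(offset) first and only consult the
shadow and forcewake tables when it returns true.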

v2:
 - Restore NEEDS_FORCE_WAKE(), but extend it for compatibility with the
   gen11+ platforms by also passing offsets above GEN11_BSD_RING_BASE.
   (Chris, Tvrtko)

Cc: Tvrtko Ursulin 
Cc: Chris Wilson 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 61 ++---
 1 file changed, 21 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 4c6898746d10..bfb2a6337f9d 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -851,7 +851,10 @@ void assert_forcewakes_active(struct intel_uncore *uncore,
 }
 
 /* We give fast paths for the really cool registers */
-#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x4)
+#define NEEDS_FORCE_WAKE(reg) ({ \
+   u32 __reg = (reg); \
+   __reg < 0x4 || __reg >= GEN11_BSD_RING_BASE; \
+})
 
 static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry)
 {
@@ -1071,27 +1074,10 @@ static const struct intel_forcewake_range 
__chv_fw_ranges[] = {
 };
 
 #define __fwtable_reg_write_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd = 0; \
-   if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \
-   __fwd = find_fw_domain(uncore, offset); \
-   __fwd; \
-})
-
-#define __gen11_fwtable_reg_write_fw_domains(uncore, offset) \
 ({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
-   if (!is_shadowed(uncore, __offset)) \
-   __fwd = find_fw_domain(uncore, __offset); \
-   __fwd; \
-})
-
-#define __gen12_fwtable_reg_write_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd = 0; \
-   const u32 __offset = (offset); \
-   if (!is_shadowed(uncore, __offset)) \
+   if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
 })
@@ -1675,34 +1661,29 @@ __gen6_write(8)
 __gen6_write(16)
 __gen6_write(32)
 
-#define __gen_write(func, x) \
+#define __gen_fwtable_write(x) \
 static void \
-func##_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool 
trace) { \
+fwtable_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool 
trace) { \
enum forcewake_domains fw_engine; \
GEN6_WRITE_HEADER; \
-   fw_engine = __##func##_reg_write_fw_domains(uncore, offset); \
+   fw_engine = __fwtable_reg_write_fw_domains(uncore, offset); \
if (fw_engine) \
__force_wake_auto(uncore, fw_engine); \
__raw_uncore_write##x(uncore, reg, val); \
GEN6_WRITE_FOOTER; \
 }
 
-#define __gen_reg_write_funcs(func) \
-static enum forcewake_domains \
-func##_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
-   return __##func##_reg_write_fw_domains(uncore, 
i915_mmio_reg_offset(reg)); \
-} \
-\
-__gen_write(func, 8) \
-__gen_write(func, 16) \
-__gen_write(func, 32)
-
+static enum forcewake_domains
+fwtable_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg)
+{
+   return __fwtable_reg_write_fw_domains(uncore, 
i915_mmio_reg_offset(reg));
+}
 
-__gen_reg_write_funcs(gen12_fwtable);
-__gen_reg_write_funcs(gen11_fwtable);
-__gen_reg_write_funcs(fwtable);
+__gen_fwtable_write(8)
+__gen_fwtable_write(16)
+__gen_fwtable_write(32)
 
-#undef __gen_reg_write_funcs
+#undef __gen_fwtable_write
 #undef GEN6_WRITE_FOOTER
 #undef GEN6_WRITE_HEADER
 
@@ -2080,22 +2061,22 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
-   ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
+   ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
-   ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
+   ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ

[PATCH v2 1/6] drm/i915/uncore: Convert gen6/gen7 read operations to fwtable

2021-09-10 Thread Matt Roper

On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we
simply check whether the register offset is < 0x4, and return
FORCEWAKE_RENDER if it is.  To prepare for upcoming refactoring, let's
define a single-entry forcewake table from [0x0, 0x3] and switch
these platforms over to use the fwtable reader functions.
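
To see why the two forms are interchangeable, the sketch below compares an
open-coded range check with a lookup in a single-entry table. It is a
userspace illustration only: the bounds are copied from this message as
archived, the lookup is a linear scan rather than the driver's search, and
struct fw_range, FW_RENDER, open_coded_domains() and table_domains() are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define FW_NONE		0u
#define FW_RENDER	1u

/* Hypothetical forcewake range entry; end is inclusive. */
struct fw_range {
	uint32_t start;
	uint32_t end;
	unsigned int domains;
};

/* Single-entry table covering the same offsets as the range check. */
static const struct fw_range single_entry_table[] = {
	{ .start = 0x0, .end = 0x3, .domains = FW_RENDER },
};

/* Old style: the range is baked into the code. */
static unsigned int open_coded_domains(uint32_t offset)
{
	return offset < 0x4 ? FW_RENDER : FW_NONE;
}

/* New style: the range is data, so one reader serves every platform. */
static unsigned int table_domains(const struct fw_range *tbl, size_t n,
				  uint32_t offset)
{
	for (size_t i = 0; i < n; i++) {
		if (offset >= tbl[i].start && offset <= tbl[i].end)
			return tbl[i].domains;
	}
	return FW_NONE;
}
```

For every offset the two functions agree, so switching to the table form
changes where the range is stored, not which domains are returned.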

v2:
 - Drop __gen6_reg_read_fw_domains which is no longer used.  (Tvrtko)

Cc: Tvrtko Ursulin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index f9767054dbdf..8c09af1e9f7a 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -853,16 +853,6 @@ void assert_forcewakes_active(struct intel_uncore *uncore,
 /* We give fast paths for the really cool registers */
 #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x4)
 
-#define __gen6_reg_read_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd; \
-   if (NEEDS_FORCE_WAKE(offset)) \
-   __fwd = FORCEWAKE_RENDER; \
-   else \
-   __fwd = 0; \
-   __fwd; \
-})
-
 static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry)
 {
if (offset < entry->start)
@@ -1064,6 +1054,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
__fwd; \
 })
 
+static const struct intel_forcewake_range __gen6_fw_ranges[] = {
+   GEN_FW_RANGE(0x0, 0x3, FORCEWAKE_RENDER),
+};
+
 /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
 static const struct intel_forcewake_range __chv_fw_ranges[] = {
GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
@@ -1623,7 +1617,6 @@ __gen_read(func, 64)
 
 __gen_reg_read_funcs(gen11_fwtable);
 __gen_reg_read_funcs(fwtable);
-__gen_reg_read_funcs(gen6);
 
 #undef __gen_reg_read_funcs
 #undef GEN6_READ_FOOTER
@@ -2111,15 +2104,17 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 8) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_VALLEYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 6, 7)) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
}
 
uncore->pmic_bus_access_nb.notifier_call = 
i915_pmic_bus_access_notifier;
-- 
2.25.4

[PATCH v2 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Matt Roper

Our uncore MMIO functions for reading/writing registers have become very
complicated over time.  There's significant macro magic used to generate
several nearly-identical functions that only really differ in terms of
which platform-specific shadow register table they should check on write
operations.  We can significantly simplify our MMIO handlers by storing
a reference to the current platform's shadow table within the 'struct
intel_uncore' the same way we already do for forcewake; this allows us
to consolidate the multiple variants of each 'write' function down to
just a single 'fwtable' version that gets the shadow table out of the
uncore struct rather than hardcoding the name of a specific platform's
table.  We can do similar consolidation on the MMIO read side by
creating a single-entry forcewake table to replace the open-coded range
check they had been using previously.
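
To make the read-side consolidation concrete, here is a minimal user-space sketch of a range-table lookup reached through a pointer stored in an uncore-like object. All type and function names below (fw_range, uncore_like, find_fw_domains and so on) are hypothetical stand-ins rather than the driver's real definitions, and the [0x0, 0x3] range is simply copied from the text as it appears in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types; names are illustrative. */
enum fw_domains { FW_NONE = 0, FW_RENDER = 1 << 0 };

struct fw_range {
	uint32_t start;
	uint32_t end;		/* inclusive */
	enum fw_domains domains;
};

/* Single-entry table that replaces an open-coded "offset < 0x4" check. */
static const struct fw_range gen6_like_ranges[] = {
	{ 0x0, 0x3, FW_RENDER },
};

/* The lookup below relies on the table being sorted and non-overlapping. */
static void check_table_sorted(const struct fw_range *tbl, size_t n)
{
	for (size_t i = 1; i < n; i++)
		assert(tbl[i - 1].end < tbl[i].start);
}

/* Binary search over the table; returns FW_NONE if no range matches. */
static enum fw_domains find_fw_domains(const struct fw_range *tbl, size_t n,
				       uint32_t offset)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (offset < tbl[mid].start)
			hi = mid;
		else if (offset > tbl[mid].end)
			lo = mid + 1;
		else
			return tbl[mid].domains;
	}
	return FW_NONE;
}

/* The uncore-like object stores a reference to its platform's table. */
struct uncore_like {
	const struct fw_range *fw_ranges;
	size_t fw_ranges_count;
};

/* One generic reader helper serves every platform via the stored pointer. */
static enum fw_domains read_fw_domains(const struct uncore_like *u,
				       uint32_t offset)
{
	return find_fw_domains(u->fw_ranges, u->fw_ranges_count, offset);
}
```

With this shape, adding a platform means adding a table and pointing the object at it; the cost is the extra pointer dereference on each lookup.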

The final patch of the series adds a new shadow table for DG2; this
becomes quite clean and simple now, given the refactoring in the first
five patches.

Aside from simplifying the code significantly, this series reduces the
size of the generated .ko in exchange for adding an extra pointer
indirection to access the tables.  The size deltas (for just the first
five patches, before we add an additional table in the final patch) are:

Old:
$ size drivers/gpu/drm/i915/i915.ko
   text    data     bss     dec     hex filename
2865921   88972    2912 2957805  2d21ed drivers/gpu/drm/i915/i915.ko

New:
$ size drivers/gpu/drm/i915/i915.ko
   text    data     bss     dec     hex filename
2854181   88236    2912 2945329  2cf131 drivers/gpu/drm/i915/i915.ko

The code size deltas will become larger as we add more platforms; we
already add one new platform table in the final patch of this series and
our next few platforms are all expected to bring new shadow tables as
well.

I don't think the impact of the indirect table reference for shadow
tables should be a concern for a few reasons:
 * The stored table + indirect lookup design is already deemed good
   enough for forcewake, which is used more frequently (both reads and
   writes, compared to shadow tables which are only used for writes) and
   operates on much larger tables.
 * Performance-critical sections of the code or those reading/writing lots
   of registers in a batch usually do an explicit grab of the relevant
   forcewake domains and then perform their MMIO operations via *_fw()
   functions without considering shadowed registers and bypassing all of
   the table lookups.
 * In v2 of the series, we still apply NEEDS_FORCE_WAKE() checks that
   will bypass all of the forcewake and shadow logic for display
   register writes.

v2:
 - Drop orphaned definition of __gen6_reg_read_fw_domains. (Tvrtko)
 - Restore NEEDS_FORCE_WAKE() check to
   __fwtable_reg_{read,write}_fw_domains, but update the definition of
   NEEDS_FORCE_WAKE to also return 'true' on offsets above
   GEN11_BSD_RING_BASE for compatibility with gen11+ platforms. (Chris,
   Tvrtko).

Cc: Tvrtko Ursulin 
Cc: Chris Wilson 

Matt Roper (6):
  drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
  drm/i915/uncore: Associate shadow table with uncore
  drm/i915/uncore: Replace gen8 write functions with general fwtable
  drm/i915/uncore: Drop gen11/gen12 mmio write handlers
  drm/i915/uncore: Drop gen11 mmio read handlers
  drm/i915/dg2: Add DG2-specific shadow register table

 drivers/gpu/drm/i915/intel_uncore.c   | 200 ++
 drivers/gpu/drm/i915/intel_uncore.h   |   7 +
 drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
 3 files changed, 115 insertions(+), 93 deletions(-)

-- 
2.25.4

Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-10 Thread Matthew Brost

On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:
> 
> On 20/08/2021 23:44, Matthew Brost wrote:
> > Add logical engine mapping. This is required for split-frame, as
> > workloads need to be placed on engines in a logically contiguous manner.
> > 
> > v2:
> >   (Daniel Vetter)
> >- Add kernel doc for new fields
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
> >   .../drm/i915/gt/intel_execlists_submission.c  |  1 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
> >   5 files changed, 60 insertions(+), 29 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 0d9105a31d84..4d790f9a65dd 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs 
> > *engine, u16 iir)
> > GEM_DEBUG_WARN_ON(iir);
> >   }
> > -static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
> > +static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
> > + u8 logical_instance)
> >   {
> > const struct engine_info *info = &intel_engines[id];
> > struct drm_i915_private *i915 = gt->i915;
> > @@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> > intel_engine_id id)
> > engine->class = info->class;
> > engine->instance = info->instance;
> > +   engine->logical_mask = BIT(logical_instance);
> > __sprint_engine_name(engine);
> > engine->props.heartbeat_interval_ms =
> > @@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
> > intel_gt *gt)
> > return info->engine_mask;
> >   }
> > +static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
> > +u8 class, const u8 *map, u8 num_instances)
> > +{
> > +   int i, j;
> > +   u8 current_logical_id = 0;
> > +
> > +   for (j = 0; j < num_instances; ++j) {
> > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +   if (!HAS_ENGINE(gt, i) ||
> > +   intel_engines[i].class != class)
> > +   continue;
> > +
> > +   if (intel_engines[i].instance == map[j]) {
> > +   logical_ids[intel_engines[i].instance] =
> > +   current_logical_id++;
> > +   break;
> > +   }
> > +   }
> > +   }
> > +}
> > +
> > +static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 
> > class)
> > +{
> > +   int i;
> > +   u8 map[MAX_ENGINE_INSTANCE + 1];
> > +
> > +   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> > +   map[i] = i;
> 
> What's the point of the map array since it is 1:1 with instance?
> 

Future products do not have a 1 to 1 mapping and that mapping can change
based on fusing, e.g. XeHP SDV.

Also technically ICL / TGL / ADL physical instance 2 maps to logical
instance 1.
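
As an illustration of the point about physical instance 2 landing on logical instance 1, here is a minimal user-space sketch of the assignment. It assumes a bitmask of present physical instances in place of HAS_ENGINE(), and MAX_INSTANCE and assign_logical_ids() are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INSTANCE 7	/* assumed upper bound, for illustration only */

/*
 * Walk the map in order; every physical instance that is present gets the
 * next free logical id, so logical ids stay contiguous even when physical
 * instances are missing (fused off) or the map is not the identity.
 */
static void assign_logical_ids(uint32_t present_mask, const uint8_t *map,
			       uint8_t num, uint8_t *logical_ids)
{
	uint8_t next = 0;

	for (uint8_t j = 0; j < num; j++) {
		uint8_t phys = map[j];

		assert(phys <= MAX_INSTANCE);
		if (present_mask & (1u << phys))
			logical_ids[phys] = next++;
	}
}
```

With physical instances 0 and 2 present and an identity map, instance 2 receives logical id 1; a non-identity map only changes the order in which physical instances are visited.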

> > +   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
> > +}
> > +
> >   /**
> >* intel_engines_init_mmio() - allocate and prepare the Engine Command 
> > Streamers
> >* @gt: pointer to struct intel_gt
> > @@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> > struct drm_i915_private *i915 = gt->i915;
> > const unsigned int engine_mask = init_engine_mask(gt);
> > unsigned int mask = 0;
> > -   unsigned int i;
> > +   unsigned int i, class;
> > +   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
> > int err;
> > drm_WARN_ON(&i915->drm, engine_mask == 0);
> > @@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> > if (i915_inject_probe_failure(i915))
> > return -ENODEV;
> > -   for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
> > -   if (!HAS_ENGINE(gt, i))
> > -   continue;
> > +   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
> > +   setup_logical_ids(gt, logical_ids, class);
> > -   err = intel_engine_setup(gt, i);
> > -   if (err)
> > -   goto cleanup;
> > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +   u8 instance = intel_engines[i].instance;
> > +
> > +   if (intel_engines[i].class != class ||
> > +   !HAS_ENGINE(gt, i))
> > +   continue;
> > -   mask |= BIT(i);
> > +   err = intel_engine_setup(gt, i,
> > +logical_ids[instance]);
> > +   if (err)
> > +   goto cleanup;
> > +
> > +   mask |= BIT(i);
> 
> I still think there is a less clu

[PATCH 1/1] drm/amdkfd: Add sysfs bitfields and enums to uAPI

2021-09-10 Thread Felix Kuehling

These bits are de-facto part of the uAPI, so declare them in a uAPI header.

Signed-off-by: Felix Kuehling 
---
 MAINTAINERS   |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  46 +
 include/uapi/linux/kfd_sysfs.h| 108 ++
 3 files changed, 110 insertions(+), 45 deletions(-)
 create mode 100644 include/uapi/linux/kfd_sysfs.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 84cd16694640..7554ec928ee2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -930,6 +930,7 @@ F:  drivers/gpu/drm/amd/include/kgd_kfd_interface.h
 F: drivers/gpu/drm/amd/include/v9_structs.h
 F: drivers/gpu/drm/amd/include/vi_structs.h
 F: include/uapi/linux/kfd_ioctl.h
+F: include/uapi/linux/kfd_sysfs.h
 
 AMD SPI DRIVER
 M: Sanjay R Mehta 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
index a8db017c9b8e..f0cc59d2fd5d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
@@ -25,38 +25,11 @@
 
 #include 
 #include 
+#include 
 #include "kfd_crat.h"
 
 #define KFD_TOPOLOGY_PUBLIC_NAME_SIZE 32
 
-#define HSA_CAP_HOT_PLUGGABLE  0x0001
-#define HSA_CAP_ATS_PRESENT0x0002
-#define HSA_CAP_SHARED_WITH_GRAPHICS   0x0004
-#define HSA_CAP_QUEUE_SIZE_POW20x0008
-#define HSA_CAP_QUEUE_SIZE_32BIT   0x0010
-#define HSA_CAP_QUEUE_IDLE_EVENT   0x0020
-#define HSA_CAP_VA_LIMIT   0x0040
-#define HSA_CAP_WATCH_POINTS_SUPPORTED 0x0080
-#define HSA_CAP_WATCH_POINTS_TOTALBITS_MASK0x0f00
-#define HSA_CAP_WATCH_POINTS_TOTALBITS_SHIFT   8
-#define HSA_CAP_DOORBELL_TYPE_TOTALBITS_MASK   0x3000
-#define HSA_CAP_DOORBELL_TYPE_TOTALBITS_SHIFT  12
-
-#define HSA_CAP_DOORBELL_TYPE_PRE_1_0  0x0
-#define HSA_CAP_DOORBELL_TYPE_1_0  0x1
-#define HSA_CAP_DOORBELL_TYPE_2_0  0x2
-#define HSA_CAP_AQL_QUEUE_DOUBLE_MAP   0x4000
-
-#define HSA_CAP_RESERVED_WAS_SRAM_EDCSUPPORTED 0x0008 /* Old buggy user mode depends on this being 0 */
-#define HSA_CAP_MEM_EDCSUPPORTED   0x0010
-#define HSA_CAP_RASEVENTNOTIFY 0x0020
-#define HSA_CAP_ASIC_REVISION_MASK 0x03c0
-#define HSA_CAP_ASIC_REVISION_SHIFT22
-#define HSA_CAP_SRAM_EDCSUPPORTED  0x0400
-#define HSA_CAP_SVMAPI_SUPPORTED   0x0800
-#define HSA_CAP_FLAGS_COHERENTHOSTACCESS   0x1000
-#define HSA_CAP_RESERVED   0xe00f8000
-
 struct kfd_node_properties {
uint64_t hive_id;
uint32_t cpu_cores_count;
@@ -93,17 +66,6 @@ struct kfd_node_properties {
char name[KFD_TOPOLOGY_PUBLIC_NAME_SIZE];
 };
 
-#define HSA_MEM_HEAP_TYPE_SYSTEM   0
-#define HSA_MEM_HEAP_TYPE_FB_PUBLIC1
-#define HSA_MEM_HEAP_TYPE_FB_PRIVATE   2
-#define HSA_MEM_HEAP_TYPE_GPU_GDS  3
-#define HSA_MEM_HEAP_TYPE_GPU_LDS  4
-#define HSA_MEM_HEAP_TYPE_GPU_SCRATCH  5
-
-#define HSA_MEM_FLAGS_HOT_PLUGGABLE0x0001
-#define HSA_MEM_FLAGS_NON_VOLATILE 0x0002
-#define HSA_MEM_FLAGS_RESERVED 0xfffc
-
 struct kfd_mem_properties {
struct list_headlist;
uint32_theap_type;
@@ -116,12 +78,6 @@ struct kfd_mem_properties {
struct attributeattr;
 };
 
-#define HSA_CACHE_TYPE_DATA0x0001
-#define HSA_CACHE_TYPE_INSTRUCTION 0x0002
-#define HSA_CACHE_TYPE_CPU 0x0004
-#define HSA_CACHE_TYPE_HSACU   0x0008
-#define HSA_CACHE_TYPE_RESERVED0xfff0
-
 struct kfd_cache_properties {
struct list_headlist;
uint32_tprocessor_id_low;
diff --git a/include/uapi/linux/kfd_sysfs.h b/include/uapi/linux/kfd_sysfs.h
new file mode 100644
index ..e1fb78b4bf09
--- /dev/null
+++ b/include/uapi/linux/kfd_sysfs.h
@@ -0,0 +1,108 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT WITH Linux-syscall-note */
+/*
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONI

Re: [virtio-dev] [PATCH v1 09/12] drm/virtio: implement context init: allocate an array of fence contexts

2021-09-10 Thread Chia-I Wu

On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh
 wrote:
>
> We don't want fences from different 3D contexts (virgl, gfxstream,
> venus) to be on the same timeline.  With explicit context creation,
> we can specify the number of rings each context wants.
>
> Execbuffer can specify which ring to use.
>
> Signed-off-by: Gurchetan Singh 
> Acked-by: Lingfeng Yang 
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.h   |  3 +++
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 --
>  2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
> b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index a5142d60c2fa..cca9ab505deb 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -56,6 +56,7 @@
>  #define STATE_ERR 2
>
>  #define MAX_CAPSET_ID 63
> +#define MAX_RINGS 64
>
>  struct virtio_gpu_object_params {
> unsigned long size;
> @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv {
> uint32_t ctx_id;
> uint32_t context_init;
> bool context_created;
> +   uint32_t num_rings;
> +   uint64_t base_fence_ctx;
> struct mutex context_lock;
>  };
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index f51f3393a194..262f79210283 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
> int in_fence_fd = exbuf->fence_fd;
> int out_fence_fd = -1;
> void *buf;
> +   uint64_t fence_ctx;
> +   uint32_t ring_idx;
> +
> +   fence_ctx = vgdev->fence_drv.context;
> +   ring_idx = 0;
>
> if (vgdev->has_virgl_3d == false)
> return -ENOSYS;
> @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
> if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
> return -EINVAL;
>
> +   if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
> +   if (exbuf->ring_idx >= vfpriv->num_rings)
> +   return -EINVAL;
> +
> +   if (!vfpriv->base_fence_ctx)
> +   return -EINVAL;
> +
> +   fence_ctx = vfpriv->base_fence_ctx;
> +   ring_idx = exbuf->ring_idx;
> +   }
> +
> exbuf->fence_fd = -1;
>
> virtio_gpu_create_context(dev, file);
> @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
> goto out_memdup;
> }
>
> -   out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 
> 0);
> +   out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
> if(!out_fence) {
> ret = -ENOMEM;
> goto out_unresv;
> @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct 
> drm_device *dev,
> return -EINVAL;
>
> /* Number of unique parameters supported at this time. */
> -   if (num_params > 1)
> +   if (num_params > 2)
> return -EINVAL;
>
> ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
> @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct 
> drm_device *dev,
>
> vfpriv->context_init |= value;
> break;
> +   case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
> +   if (vfpriv->base_fence_ctx) {
> +   ret = -EINVAL;
> +   goto out_unlock;
> +   }
> +
> +   if (value > MAX_RINGS) {
> +   ret = -EINVAL;
> +   goto out_unlock;
> +   }
> +
> +   vfpriv->base_fence_ctx = 
> dma_fence_context_alloc(value);
With multiple fence contexts, we should do something about implicit fencing.

The classic example is Mesa and X server.  When both use virgl and the
global fence context, no dma_fence_wait is fine.  But when Mesa uses
venus and the ring fence context, dma_fence_wait should be inserted.
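
A minimal sketch of the decision being described, assuming that fences on the same context (timeline) are already ordered by sequence number while fences on different contexts are not. The types and helpers here (fence_like, ring_fence_context, needs_explicit_wait) are hypothetical, and the base-plus-ring-index layout is an assumption about how per-ring contexts would be derived from the allocated base.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a fence: a timeline id plus a seqno. */
struct fence_like {
	uint64_t context;
	uint64_t seqno;
};

/* Assumed layout: each ring uses base context + ring index as its timeline. */
static uint64_t ring_fence_context(uint64_t base_ctx, uint32_t ring_idx)
{
	return base_ctx + ring_idx;
}

/*
 * If the previous fence on a shared buffer lives on the same timeline as
 * the new submission, ordering is implied and no wait is needed. If the
 * timelines differ, the new submission has to wait on the previous fence.
 */
static bool needs_explicit_wait(const struct fence_like *prev,
				uint64_t next_ctx)
{
	return prev != NULL && prev->context != next_ctx;
}
```

In the Mesa and X server example, both using the global context yields no wait, while one side moving to a ring context makes the check return true.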


> +   vfpriv->num_rings = value;
> +   break;
> default:
> ret = -EINVAL;
> goto out_unlock;
> --
> 2.33.0.153.gba50c8fa24-goog
>

Re: [git pull] drm fixes for 5.15-rc1

2021-09-10 Thread pr-tracker-bot

The pull request you sent on Fri, 10 Sep 2021 16:35:59 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-next-2021-09-10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a668acb8f01fc0d1e3877cddecbe319ef2ef651c

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

Re: [Intel-gfx] [PATCH v9 05/17] drm/i915/pxp: Implement funcs to create the TEE channel

2021-09-10 Thread Rodrigo Vivi

On Fri, Sep 10, 2021 at 08:36:15AM -0700, Daniele Ceraolo Spurio wrote:
> From: "Huang, Sean Z" 
> 
> Implement the funcs to create the TEE channel, so kernel can
> send the TEE commands directly to TEE for creating the arbitrary
> (default) session.
> 
> v2: fix locking, don't pollute dev_priv (Chris)
> 
> v3: wait for mei PXP component to be bound.
> 
> v4: drop the wait, as the component might be bound after i915 load
> completes. We'll instead check when sending a tee message.
> 
> v5: fix an issue with mei_pxp module removal
> 
> v6: don't use fetch_and_zero in fini (Rodrigo)
> 
> Signed-off-by: Huang, Sean Z 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Chris Wilson 

Reviewed-by: Rodrigo Vivi 

> ---
>  drivers/gpu/drm/i915/Makefile  |  3 +-
>  drivers/gpu/drm/i915/pxp/intel_pxp.c   | 13 
>  drivers/gpu/drm/i915/pxp/intel_pxp_tee.c   | 79 ++
>  drivers/gpu/drm/i915/pxp/intel_pxp_tee.h   | 14 
>  drivers/gpu/drm/i915/pxp/intel_pxp_types.h |  6 ++
>  5 files changed, 114 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 23f5bc268962..d39bd0cefc64 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -283,7 +283,8 @@ i915-y += i915_perf.o
>  
>  # Protected execution platform (PXP) support
>  i915-$(CONFIG_DRM_I915_PXP) += \
> - pxp/intel_pxp.o
> + pxp/intel_pxp.o \
> + pxp/intel_pxp_tee.o
>  
>  # Post-mortem debug and GPU hang state capture
>  i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> index 7b2053902146..400deaea2d8a 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> @@ -3,6 +3,7 @@
>   * Copyright(c) 2020 Intel Corporation.
>   */
>  #include "intel_pxp.h"
> +#include "intel_pxp_tee.h"
>  #include "gt/intel_context.h"
>  #include "i915_drv.h"
>  
> @@ -50,7 +51,16 @@ void intel_pxp_init(struct intel_pxp *pxp)
>   if (ret)
>   return;
>  
> + ret = intel_pxp_tee_component_init(pxp);
> + if (ret)
> + goto out_context;
> +
>   drm_info(>->i915->drm, "Protected Xe Path (PXP) protected content 
> support initialized\n");
> +
> + return;
> +
> +out_context:
> + destroy_vcs_context(pxp);
>  }
>  
>  void intel_pxp_fini(struct intel_pxp *pxp)
> @@ -58,5 +68,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
>   if (!intel_pxp_is_enabled(pxp))
>   return;
>  
> + intel_pxp_tee_component_fini(pxp);
> +
>   destroy_vcs_context(pxp);
> +
>  }
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
> new file mode 100644
> index ..f1d8de832653
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright(c) 2020 Intel Corporation.
> + */
> +
> +#include 
> +#include "drm/i915_pxp_tee_interface.h"
> +#include "drm/i915_component.h"
> +#include "i915_drv.h"
> +#include "intel_pxp.h"
> +#include "intel_pxp_tee.h"
> +
> +static inline struct intel_pxp *i915_dev_to_pxp(struct device *i915_kdev)
> +{
> + return &kdev_to_i915(i915_kdev)->gt.pxp;
> +}
> +
> +/**
> + * i915_pxp_tee_component_bind - bind function to pass the function pointers 
> to pxp_tee
> + * @i915_kdev: pointer to i915 kernel device
> + * @tee_kdev: pointer to tee kernel device
> + * @data: pointer to pxp_tee_master containing the function pointers
> + *
> + * This bind function is called during the system boot or resume from system 
> sleep.
> + *
> + * Return: return 0 if successful.
> + */
> +static int i915_pxp_tee_component_bind(struct device *i915_kdev,
> +struct device *tee_kdev, void *data)
> +{
> + struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
> +
> + pxp->pxp_component = data;
> + pxp->pxp_component->tee_dev = tee_kdev;
> +
> + return 0;
> +}
> +
> +static void i915_pxp_tee_component_unbind(struct device *i915_kdev,
> +   struct device *tee_kdev, void *data)
> +{
> + struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
> +
> + pxp->pxp_component = NULL;
> +}
> +
> +static const struct component_ops i915_pxp_tee_component_ops = {
> + .bind   = i915_pxp_tee_component_bind,
> + .unbind = i915_pxp_tee_component_unbind,
> +};
> +
> +int intel_pxp_tee_component_init(struct intel_pxp *pxp)
> +{
> + int ret;
> + struct intel_gt *gt = pxp_to_gt(pxp);
> + struct drm_i915_private *i915 = gt->i915;
> +
> + ret = component_add_typed(i915->drm.dev, &i915_pxp_tee_component_ops,
> +   I915_COMPONENT_PXP);
> + if (ret < 0) {
> + drm_err(&i915->drm, "

Re: [PATCH v9 10/17] drm/i915/pxp: interfaces for using protected objects

2021-09-10 Thread Rodrigo Vivi

On Fri, Sep 10, 2021 at 08:36:20AM -0700, Daniele Ceraolo Spurio wrote:
> This API allows user mode to create protected buffers and to mark
> contexts as making use of such objects. Only when using contexts
> marked in such a way is the execution guaranteed to work as expected.
> 
> Contexts can only be marked as using protected content at creation time
> (i.e. the parameter is immutable) and they must be both bannable and not
> recoverable. Given that the protected session gets invalidated on
> suspend, contexts created this way hold a runtime pm wakeref until
> they're either destroyed or invalidated.
> 
> All protected objects and contexts will be considered invalid when the
> PXP session is destroyed and all new submissions using them will be
> rejected. All intel contexts within the invalidated gem contexts will be
> marked banned. Userspace can detect that an invalidation has occurred via
> the RESET_STATS ioctl, where we report it the same way as a ban due to a
> hang.
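
The invalidation rule above can be sketched with a session counter: each object remembers which key instance it was created under, and a submission is rejected once that instance is stale. This is a minimal user-space sketch; the struct and function names are hypothetical, and the specific error codes are assumptions chosen for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified PXP state. */
struct pxp_like {
	uint32_t key_instance;	/* bumped every time a new session starts */
	bool arb_is_valid;	/* software view of the session status */
};

struct obj_like {
	uint32_t pxp_key_instance;	/* instance the object was created under */
};

/* Starting a session moves to a fresh key instance. */
static void pxp_session_start(struct pxp_like *pxp)
{
	pxp->key_instance++;
	pxp->arb_is_valid = true;
}

/* An invalidation (e.g. on suspend) only flips the software status. */
static void pxp_session_invalidate(struct pxp_like *pxp)
{
	pxp->arb_is_valid = false;
}

/* New protected objects are tagged with the current key instance. */
static void obj_init_protected(const struct pxp_like *pxp, struct obj_like *obj)
{
	obj->pxp_key_instance = pxp->key_instance;
}

/* Submission-time check: reject if no session or if the key is stale. */
static int pxp_key_check(const struct pxp_like *pxp, const struct obj_like *obj)
{
	if (!pxp->arb_is_valid)
		return -ENODEV;		/* assumed error code */
	if (obj->pxp_key_instance != pxp->key_instance)
		return -ENOEXEC;	/* assumed error code */
	return 0;
}
```

An object created before an invalidation keeps failing the check even after a new session starts, because its stored instance no longer matches.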
> 
> v5: squash patches, rebase on proto_ctx, update kerneldoc
> 
> v6: rebase on obj create_ext changes
> 
> v7: Use session counter to check if an object is valid, hold wakeref in
> context, don't add a new flag to RESET_STATS (Daniel)
> 
> v8: don't increase guilty count for contexts banned during pxp
> invalidation (Rodrigo)
> 
> v9: better comments, avoid wakeref put race between pxp_inval and
> context_close, add usage examples (Rodrigo)
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: Bommu Krishnaiah 
> Cc: Rodrigo Vivi 
> Cc: Chris Wilson 
> Cc: Lionel Landwerlin 
> Cc: Jason Ekstrand 
> Cc: Daniel Vetter 

Reviewed-by: Rodrigo Vivi 


> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 98 ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  6 ++
>  .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++
>  drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 ++
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 
>  drivers/gpu/drm/i915/gem/i915_gem_object.c|  1 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.h|  6 ++
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  8 ++
>  .../gpu/drm/i915/gem/selftests/mock_context.c |  4 +-
>  drivers/gpu/drm/i915/pxp/intel_pxp.c  | 78 +++
>  drivers/gpu/drm/i915/pxp/intel_pxp.h  | 12 +++
>  drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  6 ++
>  drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  9 ++
>  include/uapi/drm/i915_drm.h   | 96 +-
>  14 files changed, 407 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index c2ab0e22db0a..3418be4f727f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -77,6 +77,8 @@
>  #include "gt/intel_gpu_commands.h"
>  #include "gt/intel_ring.h"
>  
> +#include "pxp/intel_pxp.h"
> +
>  #include "i915_gem_context.h"
>  #include "i915_trace.h"
>  #include "i915_user_extensions.h"
> @@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private 
> *i915,
>   return 0;
>  }
>  
> -static void proto_context_close(struct i915_gem_proto_context *pc)
> +static void proto_context_close(struct drm_i915_private *i915,
> + struct i915_gem_proto_context *pc)
>  {
>   int i;
>  
> + if (pc->pxp_wakeref)
> + intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref);
>   if (pc->vm)
>   i915_vm_put(pc->vm);
>   if (pc->user_engines) {
> @@ -241,6 +246,33 @@ static int proto_context_set_persistence(struct 
> drm_i915_private *i915,
>   return 0;
>  }
>  
> +static int proto_context_set_protected(struct drm_i915_private *i915,
> +struct i915_gem_proto_context *pc,
> +bool protected)
> +{
> + int ret = 0;
> +
> + if (!intel_pxp_is_enabled(&i915->gt.pxp)) {
> + ret = -ENODEV;
> + } else if (!protected) {
> + pc->uses_protected_content = false;
> + } else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
> +!(pc->user_flags & BIT(UCONTEXT_BANNABLE))) {
> + ret = -EPERM;
> + } else {
> + pc->uses_protected_content = true;
> +
> + /*
> +  * protected context usage requires the PXP session to be up,
> +  * which in turn requires the device to be active.
> +  */
> + pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> + ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
> + }
> +
> + return ret;
> +}
> +
>  static struct i915_gem_proto_context *
>  proto_context_create(struct drm_i915_private *i915, unsigned int flags)
>  {
> @@ -269,7 +301,7 @@ proto_context_create(struct drm_i915_private *i915, 
> unsigned int flags)
>   return pc;
>  
>  proto_close:
> - p

Re: [Intel-gfx] [PATCH v9 16/17] drm/i915/pxp: add PXP documentation

2021-09-10 Thread Rodrigo Vivi

On Fri, Sep 10, 2021 at 08:36:26AM -0700, Daniele Ceraolo Spurio wrote:
> Now that all the pieces are in place we can add a description of how the
> feature works. Also modify the comments in struct intel_pxp into
> kerneldoc.
> 
> v2: improve doc (Rodrigo)
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Daniel Vetter 
> Cc: Rodrigo Vivi 

Reviewed-by: Rodrigo Vivi 

> ---
>  Documentation/gpu/i915.rst |  8 
>  drivers/gpu/drm/i915/pxp/intel_pxp.c   | 28 +
>  drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 47 --
>  3 files changed, 71 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index 101dde3eb1ea..78ecb9d5ec20 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
> @@ -471,6 +471,14 @@ Object Tiling IOCTLs
>  .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
> :doc: buffer object tiling
>  
> +Protected Objects
> +-
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp.c
> +   :doc: PXP
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> +
>  Microcontrollers
>  
>  
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
> b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> index 97c6368fddc3..5610634f8929 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> @@ -11,6 +11,34 @@
>  #include "gt/intel_context.h"
>  #include "i915_drv.h"
>  
> +/**
> + * DOC: PXP
> + *
> + * PXP (Protected Xe Path) is a feature available in Gen12 and newer 
> platforms.
> + * It allows execution and flip to display of protected (i.e. encrypted)
> + * objects. The SW support is enabled via the CONFIG_DRM_I915_PXP kconfig.
> + *
> + * Objects can opt-in to PXP encryption at creation time via the
> + * I915_GEM_CREATE_EXT_PROTECTED_CONTENT create_ext flag. For objects to be
> + * correctly protected they must be used in conjunction with a context 
> created
> + * with the I915_CONTEXT_PARAM_PROTECTED_CONTENT flag. See the documentation
> + * of those two uapi flags for details and restrictions.
> + *
> + * Protected objects are tied to a pxp session; currently we only support one
> + * session, which i915 manages and whose index is available in the uapi
> + * (I915_PROTECTED_CONTENT_DEFAULT_SESSION) for use in instructions targeting
> + * protected objects.
> + * The session is invalidated by the HW when certain events occur (e.g.
> + * suspend/resume). When this happens, all the objects that were used with 
> the
> + * session are marked as invalid and all contexts marked as using protected
> + * content are banned. Any further attempt at using them in an execbuf call 
> is
> + * rejected, while flips are converted to black frames.
> + *
> + * Some of the PXP setup operations are performed by the Management Engine,
> + * which is handled by the mei driver; communication between i915 and mei is
> + * performed via the mei_pxp component module.
> + */
> +
>  /* KCR register definitions */
>  #define KCR_INIT _MMIO(0x320f0)
>  
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h 
> b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> index ae24064bb57e..73ef7d1754e1 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
> @@ -16,42 +16,65 @@
>  struct intel_context;
>  struct i915_pxp_component;
>  
> +/**
> + * struct intel_pxp - pxp state
> + */
>  struct intel_pxp {
> + /**
> +  * @pxp_component: i915_pxp_component struct of the bound mei_pxp
> +  * module. Only set and cleared inside component bind/unbind functions,
> +  * which are protected by &tee_mutex.
> +  */
>   struct i915_pxp_component *pxp_component;
> + /**
> +  * @pxp_component_added: track if the pxp component has been added.
> +  * Set and cleared in tee init and fini functions respectively.
> +  */
>   bool pxp_component_added;
>  
> + /** @ce: kernel-owned context used for PXP operations */
>   struct intel_context *ce;
>  
> - /*
> + /** @arb_mutex: protects arb session start */
> + struct mutex arb_mutex;
> + /**
> +  * @arb_is_valid: tracks arb session status.
>* After a teardown, the arb session can still be in play on the HW
>* even if the keys are gone, so we can't rely on the HW state of the
>* session to know if it's valid and need to track the status in SW.
>*/
> - struct mutex arb_mutex; /* protects arb session start */
>   bool arb_is_valid;
>  
> - /*
> -  * Keep track of which key instance we're on, so we can use it to
> -  * determine if an object was created using the current key or a
> + /**
> +  * @key_instance: tracks which key instance we're on, so we can use it
> +  * to determine if an object was created using the current key or a
>* previous one.
>*/
>   u32 key_instance;
>  
> - struct mutex tee

Re: [PATCH] drm/ttm: add a WARN_ON in ttm_set_driver_manager when array bounds (v2)

2021-09-10 Thread Robin Murphy


On 2021-09-10 11:09, Guchun Chen wrote:

Vendors define their own memory types on top of TTM_PL_PRIV,
but call ttm_set_driver_manager directly without checking the mem_type
value when setting up the memory manager. So add a check to catch
out-of-bounds array accesses.

v2: lower check level to WARN_ON

Signed-off-by: Leslie Shi 
Signed-off-by: Guchun Chen 
---
  include/drm/ttm/ttm_device.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 07d722950d5b..aa79953c807c 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -291,6 +291,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type)
  static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
  struct ttm_resource_manager *manager)
  {
+   WARN_ON(type >= TTM_NUM_MEM_TYPES);


Nit: I know nothing about this code, but from the context alone it would 
seem sensible to do


if (WARN_ON(type >= TTM_NUM_MEM_TYPES))
return;

to avoid making the subsequent assignment when we *know* it's invalid 
and likely to corrupt memory.


Robin.


bdev->man_drv[type] = manager;
  }
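
The early return suggested above can be sketched in plain userspace C. This is a minimal sketch under stated assumptions, not the TTM code: `NUM_MEM_TYPES`, `struct device`, `struct manager`, `set_driver_manager()` and the `warn_on()` helper are hypothetical stand-ins, the function returns an int only so the rejection is observable (the real helper returns void), and the negative-index check is an extra that the patch does not contain.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions; not the real TTM API. */
#define NUM_MEM_TYPES 8

/* Userspace approximation of WARN_ON(): log and yield the condition's truth value. */
static int warn_on(int cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return cond;
}
#define WARN_ON(cond) warn_on(!!(cond), #cond)

struct manager { int id; };

struct device {
	struct manager *man_drv[NUM_MEM_TYPES];
};

/* Returns 0 on success, -1 if the slot index was rejected. */
static int set_driver_manager(struct device *bdev, int type,
			      struct manager *manager)
{
	/* Bail out before the store: an out-of-range index would write outside man_drv[]. */
	if (WARN_ON(type < 0 || type >= NUM_MEM_TYPES))
		return -1;

	bdev->man_drv[type] = manager;
	return 0;
}
```

With the check placed before the assignment, an invalid index still produces the warning but no longer touches memory outside the array.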

Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.

2021-09-10 Thread Mark Brown

On Thu, Sep 09, 2021 at 11:32:18AM +0200, Maarten Lankhorst wrote:

> This is also useful in regulator_lock_nested, which may avoid dropping
> regulator_nesting_mutex in the uncontended path, so use it there.

Acked-by: Mark Brown 



Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files

2021-09-10 Thread Lucas De Marchi


On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:

We shouldn't be using debugfs_ namespace for this functionality. Rename
debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
functions, defines and structs follow suit.

Signed-off-by: Lucas De Marchi 
---
drivers/gpu/drm/i915/Makefile  |  2 +-
drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 --
drivers/gpu/drm/i915/gt/intel_gt_debugfs.c |  4 ++--
.../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h  | 14 ++
5 files changed, 19 insertions(+), 19 deletions(-)
delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} (99%)
create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 232c9673a2e5..dd656f2d7721 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o

# "Graphics Technology" (aka we talk to the gpu)
gt-y += \
-   gt/debugfs_gt_pm.o \
gt/gen2_engine_cs.o \
gt/gen6_engine_cs.o \
gt/gen6_ppgtt.o \
@@ -103,6 +102,7 @@ gt-y += \
gt/intel_gt_engines_debugfs.o \
gt/intel_gt_irq.o \
gt/intel_gt_pm.o \
+   gt/intel_gt_pm_debugfs.o \
gt/intel_gt_pm_irq.o \
gt/intel_gt_requests.o \
gt/intel_gtt.o \
diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h 
b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
deleted file mode 100644
index 4cf5f5c9da7d..
--- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
+++ /dev/null
@@ -1,14 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#ifndef DEBUGFS_GT_PM_H
-#define DEBUGFS_GT_PM_H
-
-struct intel_gt;
-struct dentry;
-
-void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root);
-
-#endif /* DEBUGFS_GT_PM_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
index e5d173c235a3..4096ee893b69 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
@@ -5,10 +5,10 @@

#include 

-#include "debugfs_gt_pm.h"
#include "i915_drv.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_engines_debugfs.h"
+#include "intel_gt_pm_debugfs.h"
#include "intel_sseu_debugfs.h"
#include "uc/intel_uc_debugfs.h"

@@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt *gt)
return;

intel_gt_engines_register_debugfs(gt, root);
-   debugfs_gt_pm_register(gt, root);
+   intel_gt_pm_register_debugfs(gt, root);


This is one case I usually don't know what convention to follow since it
changes in different places.

I did it like _register_debugfs because of calls like
intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that just
below we have intel_sseu_debugfs_register(), so maybe I should consider
debugfs as part of the namespace?

Lucas De Marchi

Re: [PATCH] drm/msm: Disable frequency clamping on a630

2021-09-10 Thread Caleb Connolly

On 10/09/2021 18:18, Rob Clark wrote:

On Tue, Sep 7, 2021 at 7:20 PM Bjorn Andersson
wrote:

On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:

On 8/9/2021 9:48 PM, Caleb Connolly wrote:

On 09/08/2021 17:12, Rob Clark wrote:

On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
wrote:

[..]

I am a bit confused. We don't define a power domain for gpu in dt,
correct? Then what exactly does set_opp do here? Do you think this usleep is
what is helping here somehow to mask the issue?

The power domains (for cx and gx) are defined in the GMU DT, the OPPs in
the GPU DT. For the sake of simplicity I'll refer to the lowest
frequency (25700) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
the "min" state, and the highest frequency (71000) and OPP level
(RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in
sdm845.dtsi under the gpu node.

The new devfreq behaviour unmasks what I think is a driver bug, it
inadvertently puts much more strain on the GPU regulators than they
usually get. With the new behaviour the GPU jumps from its min state to
the max state and back again extremely rapidly under workloads as small
as refreshing UI. Where previously the GPU would rarely if ever go above
342MHz when interacting with the device, it now jumps between min and
max many times per second.

If my understanding is correct, the current implementation of the GMU
set freq is the following:
- Get OPP for frequency to set
- Push the frequency to the GMU - immediately updating the core clock
- Call dev_pm_opp_set_opp() which triggers a notify chain, this winds
up somewhere in power management code and causes the gx regulator level
to be updated

Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else. We
were using a different api earlier which got deprecated -
dev_pm_opp_set_bw().

On the Lenovo Yoga C630 this is reproduced by starting alacritty and if
I'm lucky I managed to hit a few keys before it crashes, so I spent a
few hours looking into this as well...

As you say, the dev_pm_opp_set_opp() will only cast an interconnect vote.
The opp-level is just there for show and isn't used by anything, at
least not on 845.

Furthermore, I'm missing something in my tree, so the interconnect
doesn't hit sync_state, and as such we're not actually scaling the
buses. So the problem is not that Linux doesn't turn on the buses in
time.

So I suspect that the "AHB bus error" isn't saying that we turned off
the bus, but rather that the GPU becomes unstable or something of that
sort.

Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
Aquarium for 20 minutes without a problem. I then switched the gpu
devfreq governor to "userspace" and ran the following:

while true; do
echo 25700 > /sys/class/devfreq/500.gpu/userspace/set_freq
echo 71000 > /sys/class/devfreq/500.gpu/userspace/set_freq
done

It took 19 iterations of this loop to crash the GPU.

I assume you still had aquarium running, to keep the gpu awake while
you ran that loop?

Fwiw, I modified this slightly to match sc7180's min/max gpu freq and
could not trigger any issue.. interestingly sc7180 has a lower min
freq (180) and higher max freq (800) so it was toggling over a wider
freq range. I also tried on a device that had the higher 825MHz opp
(since I noticed that was the only opp that used
RPMH_REGULATOR_LEVEL_TURBO_L1 and wanted to rule that out), but could
not reproduce.

I guess a630 (sdm845) should have higher power draw (it is 2x # of
shader cores and 2x GMEM size, but lower max freq).. the question is,
is this the reason we see this on sdm845 and not sc7180? Or is there
some other difference. On the gpu side of this, they are both closely
related (ie. the same "sub-generation" of a6xx, same gmu fw, etc)..
I'm less sure about the other parts (icc, rpmh, etc)

My guess would be power draw, nobody has mentioned this yet but I've realised that the vdd_gfx rail is powered by a buck
converter, which could explain a lot of the symptoms.

Buck converters depend on high frequency switching and inductors to work, this inherently leads to some lag time when
changing voltages, and also means that the behaviour of the regulator is defined in part by how much current is being
drawn. Wikipedia has a pretty good explanation: https://en.wikipedia.org/wiki/Buck_converter

At the best of times these regulators have a known voltage ripple, when under load and when rapidly switching voltages
this will get a lot worse.

Someone with an oscilloscope and schematics could probe the rail and probably see exactly what's going on when the GPU
crashes. Because of the lag time in the regulator changing voltage, it might be undershooting whilst the GPU is trying
to clock up and draw more current - causing instability and crashes.
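
The lag argument can be illustrated with a toy model. This is only a sketch under loud assumptions: the rail is treated as a first-order system settling toward its new setpoint, the clock (and therefore the voltage requirement) is assumed to change instantly, and the function names and every number fed into them are hypothetical, not measurements of the vdd_gfx rail.

```c
#include <stddef.h>

/*
 * Toy first-order model of a regulator output settling toward a new
 * setpoint. Purely illustrative; real buck converter behaviour also
 * depends on load current, ripple and the control loop.
 */
static double rail_step(double v_now, double v_target, double dt_us,
			double tau_us)
{
	/* Explicit Euler step of dv/dt = (v_target - v) / tau. */
	return v_now + (v_target - v_now) * (dt_us / tau_us);
}

/*
 * Count how many microseconds the rail stays below v_needed after the
 * setpoint jumps from v_start to v_target. Gives up after max_us.
 */
static unsigned int undervolt_time_us(double v_start, double v_target,
				      double v_needed, double tau_us,
				      unsigned int max_us)
{
	double v = v_start;
	unsigned int t;

	for (t = 0; t < max_us && v < v_needed; t++)
		v = rail_step(v, v_target, 1.0, tau_us);
	return t;
}
```

In this model a slower regulator (larger `tau_us`) leaves the GPU running at the new clock on an under-voltage rail for longer, which is the kind of window an oscilloscope trace could confirm or rule out.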

BR,
-R

So the problem doesn't seem to be Rob's change, it's just that prior to
it the chance of hitting it is way lower. Question is still what it is
that we're triggering.

Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.

2021-09-10 Thread Peter Zijlstra

On Fri, Sep 10, 2021 at 05:02:54PM +0200, Peter Zijlstra wrote:

> That doesn't look right, how's this for you?

Full patch for the robots here:

https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=locking/core&id=826e7b8826f0af185bb93249600533c33fd69a95

Re: [PATCH] drm/msm: Disable frequency clamping on a630

2021-09-10 Thread Rob Clark

On Thu, Sep 9, 2021 at 1:54 PM Rob Clark  wrote:
>
> On Thu, Sep 9, 2021 at 12:50 PM Akhil P Oommen  wrote:
> >
> > On 9/9/2021 9:42 PM, Amit Pundir wrote:
> > > On Thu, 9 Sept 2021 at 17:47, Amit Pundir  wrote:
> > >>
> > >> On Wed, 8 Sept 2021 at 07:50, Bjorn Andersson
> > >>  wrote:
> > >>>
> > >>> On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:
> > >>>
> >  On 8/9/2021 9:48 PM, Caleb Connolly wrote:
> > >
> > >
> > > On 09/08/2021 17:12, Rob Clark wrote:
> > >> On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
> > >>  wrote:
> > >>> [..]
> > >>> I am a bit confused. We don't define a power domain for gpu in dt,
> > > >>> correct? Then what exactly does set_opp do here? Do you think this 
> > >>> usleep is
> > >>> what is helping here somehow to mask the issue?
> > > The power domains (for cx and gx) are defined in the GMU DT, the OPPs 
> > > in
> > > the GPU DT. For the sake of simplicity I'll refer to the lowest
> > > frequency (25700) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
> > > the "min" state, and the highest frequency (71000) and OPP level
> > > (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined 
> > > in
> > > sdm845.dtsi under the gpu node.
> > >
> > > The new devfreq behaviour unmasks what I think is a driver bug, it
> > > inadvertently puts much more strain on the GPU regulators than they
> > > > usually get. With the new behaviour the GPU jumps from its min state 
> > > to
> > > the max state and back again extremely rapidly under workloads as 
> > > small
> > > as refreshing UI. Where previously the GPU would rarely if ever go 
> > > above
> > > 342MHz when interacting with the device, it now jumps between min and
> > > max many times per second.
> > >
> > > If my understanding is correct, the current implementation of the GMU
> > > set freq is the following:
> > >- Get OPP for frequency to set
> > >- Push the frequency to the GMU - immediately updating the core 
> > > clock
> > >- Call dev_pm_opp_set_opp() which triggers a notify chain, this 
> > > winds
> > > up somewhere in power management code and causes the gx regulator 
> > > level
> > > to be updated
> > 
> >  Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing 
> >  else. We
> >  were using a different api earlier which got deprecated -
> >  dev_pm_opp_set_bw().
> > 
> > >>>
> > >>> On the Lenovo Yoga C630 this is reproduced by starting alacritty and if
> > >>> I'm lucky I managed to hit a few keys before it crashes, so I spent a
> > >>> few hours looking into this as well...
> > >>>
> > > >>> As you say, the dev_pm_opp_set_opp() will only cast an interconnect vote.
> > > >>> The opp-level is just there for show and isn't used by anything, at
> > > >>> least not on 845.
> > > >>>
> > > >>> Furthermore, I'm missing something in my tree, so the interconnect
> > >>> doesn't hit sync_state, and as such we're not actually scaling the
> > >>> buses. So the problem is not that Linux doesn't turn on the buses in
> > >>> time.
> > >>>
> > >>> So I suspect that the "AHB bus error" isn't saying that we turned off
> > >>> the bus, but rather that the GPU becomes unstable or something of that
> > >>> sort.
> > >>>
> > >>>
> > >>> Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
> > >>> Aquarium for 20 minutes without a problem. I then switched the gpu
> > >>> devfreq governor to "userspace" and ran the following:
> > >>>
> > >>> while true; do
> > >>>echo 25700 > /sys/class/devfreq/500.gpu/userspace/set_freq
> > >>>echo 71000 > /sys/class/devfreq/500.gpu/userspace/set_freq
> > >>> done
> > >>>
> > >>> It took 19 iterations of this loop to crash the GPU.
> > >>
> > >> Ack. With your above script, I can reproduce a crash too on db845c
> > >> (A630) running v5.14. I didn't get any crash log though and device
> > >> just rebooted to USB crash mode.
> > >>
> > >> And same crash on RB5 (A650) too https://hastebin.com/raw/ejutetuwun
> >
> > Are we sure this is the same issue? It could be, but I thought we were
> > seeing a bunch of random gpu errors (which may eventually hit device crash).
>
> In the sense that async-serror often seems to be a clk issue, it
> *could* be related.. but this would have to be triggered by CPU
> access.  The symptom does seem very different.
>

The more I think about it, the more I think this is a different
issue.. a650 is somewhat different wrt gmu (ie. hfi vs legacy code
paths).

Amit, could you try the same experiment (with 9bc95570175a ("drm/msm:
Devfreq tuning") revert) while running something like webgl aquarium
to prevent the GPU from suspending?  I'm kinda suspecting the issue
you hit is more likely some suspend/resume issue.

BR,
-R

Re: [PATCH] drm/msm: Disable frequency clamping on a630

2021-09-10 Thread Rob Clark

On Tue, Sep 7, 2021 at 7:20 PM Bjorn Andersson
 wrote:
>
> On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:
>
> > On 8/9/2021 9:48 PM, Caleb Connolly wrote:
> > >
> > >
> > > On 09/08/2021 17:12, Rob Clark wrote:
> > > > On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
> > > >  wrote:
> [..]
> > > > > I am a bit confused. We don't define a power domain for gpu in dt,
> > > > > > correct? Then what exactly does set_opp do here? Do you think this usleep 
> > > > > is
> > > > > what is helping here somehow to mask the issue?
> > > The power domains (for cx and gx) are defined in the GMU DT, the OPPs in
> > > the GPU DT. For the sake of simplicity I'll refer to the lowest
> > > frequency (25700) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
> > > the "min" state, and the highest frequency (71000) and OPP level
> > > (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in
> > > sdm845.dtsi under the gpu node.
> > >
> > > The new devfreq behaviour unmasks what I think is a driver bug, it
> > > inadvertently puts much more strain on the GPU regulators than they
> > > usually get. With the new behaviour the GPU jumps from its min state to
> > > the max state and back again extremely rapidly under workloads as small
> > > as refreshing UI. Where previously the GPU would rarely if ever go above
> > > 342MHz when interacting with the device, it now jumps between min and
> > > max many times per second.
> > >
> > > If my understanding is correct, the current implementation of the GMU
> > > set freq is the following:
> > >   - Get OPP for frequency to set
> > >   - Push the frequency to the GMU - immediately updating the core clock
> > >   - Call dev_pm_opp_set_opp() which triggers a notify chain, this winds
> > > up somewhere in power management code and causes the gx regulator level
> > > to be updated
> >
> > Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else. We
> > were using a different api earlier which got deprecated -
> > dev_pm_opp_set_bw().
> >
>
> On the Lenovo Yoga C630 this is reproduced by starting alacritty and if
> I'm lucky I managed to hit a few keys before it crashes, so I spent a
> few hours looking into this as well...
>
> As you say, the dev_pm_opp_set_opp() will only cast an interconnect vote.
> The opp-level is just there for show and isn't used by anything, at
> least not on 845.
>
> Furthermore, I'm missing something in my tree, so the interconnect
> doesn't hit sync_state, and as such we're not actually scaling the
> buses. So the problem is not that Linux doesn't turn on the buses in
> time.
>
> So I suspect that the "AHB bus error" isn't saying that we turned off
> the bus, but rather that the GPU becomes unstable or something of that
> sort.
>
>
> Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
> Aquarium for 20 minutes without a problem. I then switched the gpu
> devfreq governor to "userspace" and ran the following:
>
> while true; do
>   echo 25700 > /sys/class/devfreq/500.gpu/userspace/set_freq
>   echo 71000 > /sys/class/devfreq/500.gpu/userspace/set_freq
> done
>
> It took 19 iterations of this loop to crash the GPU.

I assume you still had aquarium running, to keep the gpu awake while
you ran that loop?

Fwiw, I modified this slightly to match sc7180's min/max gpu freq and
could not trigger any issue.. interestingly sc7180 has a lower min
freq (180) and higher max freq (800) so it was toggling over a wider
freq range.  I also tried on a device that  had the higher 825MHz opp
(since I noticed that was the only opp that used
RPMH_REGULATOR_LEVEL_TURBO_L1 and wanted to rule that out), but could
not reproduce.

I guess a630 (sdm845) should have higher power draw (it is 2x # of
shader cores and 2x GMEM size, but lower max freq).. the question is,
is this the reason we see this on sdm845 and not sc7180?  Or is there
some other difference.  On the gpu side of this, they are both closely
related (ie. the same "sub-generation" of a6xx, same gmu fw, etc)..
I'm less sure about the other parts (icc, rpmh, etc)

BR,
-R

> So the problem doesn't seem to be Rob's change, it's just that prior to
> it the chance of hitting it is way lower. Question is still what it is
> that we're triggering.
>
> Regards,
> Bjorn

Re: [PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries

2021-09-10 Thread Guenter Roeck


On 9/10/21 9:51 AM, Rob Herring wrote:

'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
is more concise and yields better error messages.

Fix a couple more cases which have appeared.

Cc: Rob Clark 
Cc: Sean Paul 
Cc: Mark Brown 
Cc: Wim Van Sebroeck 
Cc: Guenter Roeck 
Cc: Jonathan Marek 
Cc: Aswath Govindraju 
Cc: Marc Zyngier 
Cc: Linus Walleij 
Cc: dri-devel@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-...@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Signed-off-by: Rob Herring 
---
  .../bindings/display/msm/dsi-phy-7nm.yaml  |  8 
  .../devicetree/bindings/spi/omap-spi.yaml  |  6 +++---
  .../bindings/watchdog/maxim,max63xx.yaml   | 14 +++---


For watchdog:

Acked-by: Guenter Roeck 


  3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index 4265399bb154..c851770bbdf2 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -14,10 +14,10 @@ allOf:
  
  properties:

compatible:
-oneOf:
-  - const: qcom,dsi-phy-7nm
-  - const: qcom,dsi-phy-7nm-8150
-  - const: qcom,sc7280-dsi-phy-7nm
+enum:
+  - qcom,dsi-phy-7nm
+  - qcom,dsi-phy-7nm-8150
+  - qcom,sc7280-dsi-phy-7nm
  
reg:

  items:
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.yaml 
b/Documentation/devicetree/bindings/spi/omap-spi.yaml
index e55538186cf6..9952199cae11 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.yaml
+++ b/Documentation/devicetree/bindings/spi/omap-spi.yaml
@@ -84,9 +84,9 @@ unevaluatedProperties: false
  if:
properties:
  compatible:
-  oneOf:
-- const: ti,omap2-mcspi
-- const: ti,omap4-mcspi
+  enum:
+- ti,omap2-mcspi
+- ti,omap4-mcspi
  
  then:

properties:
diff --git a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml 
b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
index f2105eedac2c..ab9641e845db 100644
--- a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
+++ b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
@@ -15,13 +15,13 @@ maintainers:
  
  properties:

compatible:
-oneOf:
-  - const: maxim,max6369
-  - const: maxim,max6370
-  - const: maxim,max6371
-  - const: maxim,max6372
-  - const: maxim,max6373
-  - const: maxim,max6374
+enum:
+  - maxim,max6369
+  - maxim,max6370
+  - maxim,max6371
+  - maxim,max6372
+  - maxim,max6373
+  - maxim,max6374
  
reg:

  description: This is a 1-byte memory-mapped address

Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Christian König


Am 10.09.21 um 17:30 schrieb Thomas Hellström:

On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:


Am 10.09.21 um 15:15 schrieb Thomas Hellström:

Both the provider (resource manager) and the consumer (the TTM
driver)
want to subclass struct ttm_resource. Since this is left for the
resource
manager, we need to provide a private pointer for the TTM driver.

Provide a struct ttm_resource_private for the driver to subclass
for
data with the same lifetime as the struct ttm_resource: In the i915
case
it will, for example, be an sg-table and radix tree into the LMEM
/VRAM pages that currently are awkwardly attached to the GEM
object.

Provide an ops structure for associated ops (Which is only
destroy() ATM)
It might seem pointless to provide a separate ops structure, but
Linus
has previously made it clear that that's the norm.

After careful audit one could perhaps also on a per-driver basis
replace the delete_mem_notify() TTM driver callback with the above
destroy function.

Well this is a really big NAK to this approach.

If you need to attach some additional information to the resource
then
implement your own resource manager like everybody else does.

Well this was the long discussion we had back then when the resource
managers started to derive from struct resource and I was under the
impression that we had come to an agreement about the different use-
cases here, and this was my main concern.


Ok, then we somehow didn't understand each other.


I mean, it's a pretty big layer violation to do that for this use-case.


Well exactly that's the point. TTM should not have a layer design in the 
first place.


Devices, BOs, resources etc.. are base classes which should implement a 
base functionality which is then extended by the drivers to implement 
the driver specific functionality.


That is a component based approach, and not layered at all.


The TTM resource manager doesn't want to know about this data at all,
it's private to the ttm resource user layer and the resource manager
works perfectly well without it. (I assume the other drivers that
implement their own resource managers need the data that the
subclassing provides?)


Yes, that's exactly why we have the subclassing.


The fundamental problem here is that there are two layers wanting to
subclass struct ttm_resource. That means one layer gets to do that, the
second gets to use a private pointer, (which in turn can provide yet
another private pointer to a potential third layer). With your
suggestion, the second layer instead is forced to subclass each
subclassed instance that the first layer provides?


Well completely drop the layer approach/thinking here.

The resource is an object with a base class. The base class implements 
the interface TTM needs to handle the object, e.g. create/destroy/debug 
etc...


Then we need to subclass this object because without any additional 
information the object is pretty pointless.
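
The base-class idea can be sketched in userspace C with struct embedding. This is a minimal sketch, assuming simplified stand-ins: `struct resource`, `struct drv_resource`, `to_drv_resource()` and the alloc/free helpers are hypothetical names, not the TTM structures or callbacks, and `container_of()` is redefined locally because the kernel macro is not available outside the kernel.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical base class; a simplified stand-in, not struct ttm_resource. */
struct resource {
	unsigned long num_pages;
	int mem_type;
};

/* Hypothetical driver subclass: embeds the base and adds driver data. */
struct drv_resource {
	struct resource base;
	void *sg_table;	/* placeholder for data with the same lifetime */
};

/* Local userspace version of the kernel's container_of() idiom. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct drv_resource *to_drv_resource(struct resource *res)
{
	return container_of(res, struct drv_resource, base);
}

/* The manager's alloc hook allocates the subclass and hands back the base. */
static struct resource *drv_resource_alloc(unsigned long num_pages,
					   int mem_type)
{
	struct drv_resource *dres = calloc(1, sizeof(*dres));

	if (!dres)
		return NULL;
	dres->base.num_pages = num_pages;
	dres->base.mem_type = mem_type;
	return &dres->base;
}

/* The matching free hook recovers the subclass pointer before freeing. */
static void drv_resource_free(struct resource *res)
{
	free(to_drv_resource(res));
}
```

Generic code only ever sees `struct resource *`, while the code that allocated the object can get back to its own fields without a separate private pointer.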


One possibility for this is to use the range manager to implement 
something drm_mm based. BTW: We should probably rename that to something 
like ttm_res_drm_mm or similar.


What we should avoid is to abuse TTM resource interfaces in the driver, 
e.g. what i915 is currently doing. This is a TTM->resource mgr interface 
and should not be used by drivers at all.



Ofc we can do that, but it does indeed feel pretty awkward.

In any case, if you still think that's the approach we should go for,
I'd need to add init() and fini() members to the ttm_range_manager_func
struct to allow subclassing without having to unnecessarily copy the
full code?


Yes, exporting the ttm_range_manager functions as needed is one thing I 
wanted to do for the amdgpu_gtt_mgr.c code as well.


Just don't extend the function table but rather directly export the 
necessary functions.


Regards,
Christian.



Thanks,
Thomas

Regards,
Christian.


Cc: Matthew Auld 
Cc: König Christian 
Signed-off-by: Thomas Hellström 
---
   drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
   include/drm/ttm/ttm_resource.h | 28

   2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c
b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..973e7c50bfed 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object
*bo,
   void ttm_resource_free(struct ttm_buffer_object *bo, struct
ttm_resource **res)
   {
 struct ttm_resource_manager *man;
+   struct ttm_resource *resource = *res;
   
-   if (!*res)

+   if (!resource)
 return;
   
-   man = ttm_manager_type(bo->bdev, (*res)->mem_type);

-   man->func->free(man, *res);
 *res = NULL;
+   if (resource->priv)
+   resource->priv->ops.destroy(resource->priv);
+
+   man = ttm_manager_type(bo->bdev, resource->mem_type);
+   man->func->free(man, resource);

[PATCH] video: fbdev: atyfb: Remove assigned but never used variable statements

2021-09-10 Thread Colin King

From: Colin Ian King 

There are a couple of statements where local variables are being assigned
values that are never read because the function returns immediately after
the assignment. Clean up the code by removing them.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King 
---
 drivers/video/fbdev/aty/mach64_gx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/video/fbdev/aty/mach64_gx.c 
b/drivers/video/fbdev/aty/mach64_gx.c
index 9c37e28fb78b..d06d24830080 100644
--- a/drivers/video/fbdev/aty/mach64_gx.c
+++ b/drivers/video/fbdev/aty/mach64_gx.c
@@ -352,10 +352,8 @@ static int aty_var_to_pll_18818(const struct fb_info 
*info, u32 vclk_per,
post_divider = 1;
 
if (MHz100 > MAX_FREQ_2595) {
-   MHz100 = MAX_FREQ_2595;
return -EINVAL;
} else if (MHz100 < ABS_MIN_FREQ_2595) {
-   program_bits = 0;   /* MHz100 = 257 */
return -EINVAL;
} else {
while (MHz100 < MIN_FREQ_2595) {
-- 
2.32.0

Re: [PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries

2021-09-10 Thread Mark Brown

On Fri, Sep 10, 2021 at 11:51:53AM -0500, Rob Herring wrote:

> 'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
> is more concise and yields better error messages.

Acked-by: Mark Brown 



[PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries

2021-09-10 Thread Rob Herring

'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
is more concise and yields better error messages.

Fix a couple more cases which have appeared.

Cc: Rob Clark 
Cc: Sean Paul 
Cc: Mark Brown 
Cc: Wim Van Sebroeck 
Cc: Guenter Roeck 
Cc: Jonathan Marek 
Cc: Aswath Govindraju 
Cc: Marc Zyngier 
Cc: Linus Walleij 
Cc: dri-devel@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-...@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Signed-off-by: Rob Herring 
---
 .../bindings/display/msm/dsi-phy-7nm.yaml  |  8 
 .../devicetree/bindings/spi/omap-spi.yaml  |  6 +++---
 .../bindings/watchdog/maxim,max63xx.yaml   | 14 +++---
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index 4265399bb154..c851770bbdf2 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -14,10 +14,10 @@ allOf:
 
 properties:
   compatible:
-oneOf:
-  - const: qcom,dsi-phy-7nm
-  - const: qcom,dsi-phy-7nm-8150
-  - const: qcom,sc7280-dsi-phy-7nm
+enum:
+  - qcom,dsi-phy-7nm
+  - qcom,dsi-phy-7nm-8150
+  - qcom,sc7280-dsi-phy-7nm
 
   reg:
 items:
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.yaml 
b/Documentation/devicetree/bindings/spi/omap-spi.yaml
index e55538186cf6..9952199cae11 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.yaml
+++ b/Documentation/devicetree/bindings/spi/omap-spi.yaml
@@ -84,9 +84,9 @@ unevaluatedProperties: false
 if:
   properties:
 compatible:
-  oneOf:
-- const: ti,omap2-mcspi
-- const: ti,omap4-mcspi
+  enum:
+- ti,omap2-mcspi
+- ti,omap4-mcspi
 
 then:
   properties:
diff --git a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml 
b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
index f2105eedac2c..ab9641e845db 100644
--- a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
+++ b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
@@ -15,13 +15,13 @@ maintainers:
 
 properties:
   compatible:
-oneOf:
-  - const: maxim,max6369
-  - const: maxim,max6370
-  - const: maxim,max6371
-  - const: maxim,max6372
-  - const: maxim,max6373
-  - const: maxim,max6374
+enum:
+  - maxim,max6369
+  - maxim,max6370
+  - maxim,max6371
+  - maxim,max6372
+  - maxim,max6373
+  - maxim,max6374
 
   reg:
 description: This is a 1-byte memory-mapped address
-- 
2.30.2
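
The equivalence stated in the commit message can be sketched as two small matchers. This is an illustrative sketch only: `matches_enum()` and `matches_one_of_const()` are hypothetical helpers that mimic the JSON Schema rules for string values, not code from any schema validator.

```c
#include <stddef.h>
#include <string.h>

/* 'enum' semantics: the value is valid if it equals any listed entry. */
static int matches_enum(const char *value, const char *const *entries,
			size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(value, entries[i]) == 0)
			return 1;
	return 0;
}

/*
 * 'oneOf' with 'const' entries: the value is valid if exactly one
 * subschema accepts it, where each subschema accepts a single constant.
 */
static int matches_one_of_const(const char *value,
				const char *const *entries, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (strcmp(value, entries[i]) == 0)
			hits++;
	return hits == 1;
}
```

As long as the listed constants are distinct, as they are in the three bindings touched here, both matchers accept and reject the same strings; 'enum' just expresses it as a single list.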

Re: [RFC PATCH 0/4] Allow to use DRM fbdev emulation layer with CONFIG_FB disabled

2021-09-10 Thread Sam Ravnborg

Hi Noralf,

On Thu, Sep 09, 2021 at 06:27:02PM +0200, Noralf Trønnes wrote:
> 
> > > Hi Daniel,
> > >
> > > >
> > > > I think for a substantial improvement here in robustness what you
> really
> > > > want is
> > > > - kmscon in userspace
> > > > - disable FB layer
> > > > - ideally also disable console/vt layer in the kernel
> > > > - have a minimal emergency/boot-up log thing in drm, patches for that
> > > >   floated around a few times
> > >
> > > I assume you refer to this work by David Herrmann:
> > > "[RFC] drm: add kernel-log renderer"
> > > https://lists.freedesktop.org/archives/dri-devel/2014-March/055136.html
> > >
> >
> > There's also this:
> >
> > [PATCH v2 0/3] drm: Add panic handling
> >
> https://lore.kernel.org/dri-devel/20190311174218.51899-1-nor...@tronnes.org/
> 
> And here's a DRM console example that was part of the early drm_client work:
> 
> [RFC v4 25/25] drm/client: Hack: Add DRM VT console client
> https://lore.kernel.org/dri-devel/20180414115318.14500-26-nor...@tronnes.org/

Thanks for providing these pointers. Looking forward to finding time to play
with all this. Having an embedded board without any fbdev stuff seems
like a nice goal.

Sam

Re: Habanalabs Open-Source TPC LLVM compiler and SynapseAI Core library

2021-09-10 Thread Daniel Vetter

Forgot to add dri-devel.

On Fri, Sep 10, 2021 at 6:09 PM Daniel Vetter wrote:
>
> On Fri, Sep 10, 2021 at 9:58 AM Greg Kroah-Hartman wrote:
> > On Fri, Sep 10, 2021 at 10:26:56AM +0300, Oded Gabbay wrote:
> > > Hi Greg,
> > >
> > > > Following our conversations a couple of months ago, I'm happy to
> > > > tell you that Habanalabs has open-sourced its TPC (Tensor
> > > > Processing Core) LLVM compiler, which is a fork of the LLVM
> > > > open-source project.
> > >
> > > The project can be found on Habanalabs GitHub website at:
> > > https://github.com/HabanaAI/tpc_llvm
> > >
> > > There is a companion guide on how to write TPC kernels at:
> > > https://docs.habana.ai/en/latest/TPC_User_Guide/TPC_User_Guide.html
> >
> > That's great news, thanks for pushing for this and releasing it all!
>
> Yeah this is neat.
>
> There's still the problem that we spent the past 2.5 years pissing off
> a lot of people for an imo questionable political project, bypassing
> all the technical review and expertise. Now that the political
> nonsense is resolved I think we need to look at at least the technical
> cleanup. The angered people are much harder to fix, so let's maybe
> ignore that (or perhaps a ks topic, no idea, I'm honestly not super
> motivated to rehash this entire story again). Here's what I think we
> should do:
>
> - move drivers/misc/habanalabs under drivers/gpu/habanalabs and
> review/discussions on dri-devel
> - grandfather the entire current situation in as-is, it's not the only
> driver we have with a funny uapi of its own (but the other driver did
> manage to get their compiler into upstream llvm even, and not like 2
> years late)
> - review the dma-buf stuff on dri-devel and then land it through
> standard flows, not the gregk-misc bypass
> - close drivers/misc backdoor for further accel driver submissions,
> I'd like to focus on technical stuff in this area going forward and
> not pointless exercises in bypassing due process and all that
>
> I expect we'll have a proper discussion what the stack should look
> like with the next submission (from a different vendor maybe), that
> ship kinda sailed with habanalabs.
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code

2021-09-10 Thread Alex Williamson

On Fri, 10 Sep 2021 10:38:50 -0300
Jason Gunthorpe wrote:

> On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote:
> > On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:  
> > > Every driver just emits a static string, simply feed it through the ops
> > > and provide a standard sysfs show function.  
> > 
> > Looks sensible.  But can you make the attribute optional and add a
> > comment marking it deprecated?  Because it really is completely useless.
> > We don't version userspace APIs, userspace has to discover new features
> > individually by e.g. finding new sysfs files or just trying new ioctls.  
> 
> To be honest I have no idea what side effects that would have..
> 
> device code search tells me libvirt reads it and stuffs it into some
> XML
> 
> Something called mdevctl touches it, feeds it into some JSON and
> other stuff..
> 
> qemu has some VFIO_DEVICE_API_* constants but it is all dead code
> 
> I agree it shouldn't have been there in the first place
> 
> Cornelia? Alex? Any thoughts?

It's not a version, it's a means for userspace to determine the basic
API for an mdev device without needing to go through the process of
creating a container, adding the group, setting an IOMMU type, opening
the device before being able to call VFIO_DEVICE_GET_INFO to determine
the API.  For example, it wouldn't make sense for libvirt to attach a
vfio-ccw device to a PCIe root port in a VM.  It's a means to say this
mdev device is a vfio-pci or that mdev device is a vfio-ccw.  If it were
optional, then management tools would have no basic idea how to attach
the device to a VM without gaining access to the device themselves.
Thanks,

Alex
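
[Editorial illustration, not part of Alex's mail.] The lookup described above
can be sketched from the userspace side. This is a minimal sketch under stated
assumptions: it assumes the attribute is reachable through the mdev_type link
of an existing mdev device, and the helper names read_device_api() and
guest_bus_for() as well as the bus mapping are hypothetical, not code taken
from libvirt or mdevctl.

```python
from pathlib import Path

# Assumed default location of mdev devices in sysfs.
MDEV_SYSFS_ROOT = "/sys/bus/mdev/devices"

# Hypothetical mapping from the reported API string to the kind of guest
# bus a management tool might choose. The mapping is for illustration only.
API_TO_GUEST_BUS = {
    "vfio-pci": "pci",
    "vfio-ccw": "ccw",
    "vfio-ap": "ap",
}


def read_device_api(uuid, sysfs_root=MDEV_SYSFS_ROOT):
    """Return the device_api string of an mdev device, or None if absent.

    Only a sysfs file is read: no container, group, or device is opened.
    """
    path = Path(sysfs_root) / uuid / "mdev_type" / "device_api"
    try:
        return path.read_text().strip()
    except OSError:
        return None


def guest_bus_for(uuid, sysfs_root=MDEV_SYSFS_ROOT):
    """Pick a guest bus for the device, or None if the API is unknown."""
    api = read_device_api(uuid, sysfs_root)
    return API_TO_GUEST_BUS.get(api)
```

If the attribute were optional, read_device_api() would return None for some
devices and a tool written this way would have nothing to base the attachment
decision on, which is the concern raised in the reply.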
